* [RFC PATCH v2 00/12] crypto: Adiantum support
@ 2018-10-15 17:54 Eric Biggers
  2018-10-15 17:54 ` [RFC PATCH v2 01/12] crypto: chacha20-generic - add HChaCha20 library function Eric Biggers
                   ` (12 more replies)
  0 siblings, 13 replies; 54+ messages in thread
From: Eric Biggers @ 2018-10-15 17:54 UTC
  To: linux-crypto
  Cc: linux-fscrypt, linux-arm-kernel, linux-kernel, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

Hello,

We've been working to find a way to bring storage encryption to
entry-level Android devices like the inexpensive "Android Go" devices
sold in developing countries, and some smartwatches.  Unfortunately,
often these devices still ship with no encryption, since for cost
reasons they have to use older CPUs like ARM Cortex-A7; and these CPUs
lack the ARMv8 Cryptography Extensions, making AES-XTS much too slow.

We're trying to change this, since we believe encryption is for
everyone, not just those who can afford it.  And while it's unknown how
long CPUs without AES support will be around, there will likely always
be a "low end"; and in any case it's immensely valuable to provide a
software-optimized cipher that doesn't depend on hardware support.
Lack of hardware support should not be an excuse for no encryption.

But after an extensive search (e.g. see [1]) we were unable to find an
existing cipher that simultaneously meets the very strict performance
requirements on ARM processors, is secure (including having sufficient
security parameters as well as sufficient cryptanalysis of any
primitive(s) used), is suitable for practical use in dm-crypt and
fscrypt, *and* avoids any particularly controversial primitive.

Therefore, we (well, Paul Crowley did the real work) designed a new
encryption mode, Adiantum.  In essence, Adiantum makes it secure to use
the ChaCha stream cipher for disk encryption.  Adiantum is specified by
our paper here: https://eprint.iacr.org/2018/720.pdf ("Adiantum:
length-preserving encryption for entry-level processors").  Reference
code and test vectors are here: https://github.com/google/adiantum.
Most of the high-level concepts of Adiantum are not new; similar
existing modes include XCB, HCTR, and HCH.  Adiantum and these modes are
true wide-block modes (tweakable super-pseudorandom permutations), so
they actually provide a stronger notion of security than XTS.

Adiantum is an improved version of our previous algorithm, HPolyC [2].
Like HPolyC, Adiantum uses XChaCha12, two passes of an
ε-almost-∆-universal (εA∆U) hash function, and one AES-256 encryption of
a single 16-byte block.  On ARM Cortex-A7, on 4096-byte messages
Adiantum is about 4x faster than AES-256-XTS (about 5x for decryption),
and about 30% faster than Speck128/256-XTS.

Adiantum is a construction, not a primitive.  Its security is reducible
to that of XChaCha12 and AES-256, subject to a security bound; the proof
is in Section 5 of our paper.  Therefore, one need not "trust" Adiantum;
one need only trust XChaCha12 and AES-256.  Note that of these two
primitives, AES-256 currently has the lower security margin.

Adiantum is ~20% faster than HPolyC, with no loss of security; in fact,
Adiantum's security bound is slightly better than HPolyC's.  The speedup
comes from a faster εA∆U hash function: it still uses Poly1305's
εA∆U hash function, but now a hash function from the "NH" family of hash
functions is used to "compress" the message by 32x first.  NH is εAU (as
shown in the UMAC paper[3]) but is over twice as fast as Poly1305.  Key
agility is reduced, but that's acceptable for disk encryption.
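
To make the NH idea concrete, here is a minimal sketch of the basic NH
construction as described in the UMAC paper [3], with 32-bit words and a
64-bit sum.  It is illustrative only: the name nh_basic() is made up for
this example, and the exact word pairing, number of passes, and key
layout that Adiantum uses are defined in our paper and are not
reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Basic NH (UMAC-style), single pass.  Hypothetical helper, not the
 * kernel code.  Each pair of message words is added to the matching key
 * words mod 2^32, the two sums are multiplied to a 64-bit product, and
 * the products are accumulated mod 2^64.
 *
 * nwords must be even; key must hold at least nwords words.
 */
uint64_t nh_basic(const uint32_t *key, const uint32_t *msg, size_t nwords)
{
	uint64_t sum = 0;

	for (size_t i = 0; i + 1 < nwords; i += 2) {
		uint32_t a = msg[i] + key[i];         /* wraps mod 2^32 */
		uint32_t b = msg[i + 1] + key[i + 1]; /* wraps mod 2^32 */

		sum += (uint64_t)a * b;               /* wraps mod 2^64 */
	}
	return sum;
}
```

The inner loop is only additions and one 32x32->64 multiply per pair of
words, which is why it maps well onto SIMD units such as NEON.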

NH is also very simple, and it's easy to implement in SIMD assembly,
e.g. in ARM NEON.  Now, to get good performance only a SIMD
implementation of NH is required, not Poly1305.  Therefore, Adiantum can
be easier to port to new platforms than HPolyC, despite Adiantum's
slightly increased complexity.  For now this patchset only includes an
ARM32 NEON implementation of NH, but as a proof of concept I've also
written SSE2, AVX2, and ARM64 NEON implementations of NH; see
https://github.com/google/adiantum/tree/master/benchmark/src.

This patchset adds Adiantum to Linux's crypto API, focusing on generic
and ARM32 implementations.  Patches 1-7 add support for XChaCha20 and
XChaCha12.  Patches 8-10 add NHPoly1305 support, needed for Adiantum
hashing.  Patch 11 adds Adiantum support as a skcipher template.

Patch 12 adds Adiantum support to fscrypt ("file-based encryption").
In fscrypt, Adiantum is used for filename encryption as well as
contents encryption; since Adiantum is a SPRP, it fixes the information
leak when filenames share a common prefix.  We also take advantage of
Adiantum's support for long tweaks to include the per-inode nonce
directly in the tweak, which lets us offer an option to skip the
per-file key derivation, for even greater performance benefits.

As before, some of these patches conflict with the new "Zinc" crypto
library.  But I don't know when Zinc will be merged, so for now I've
continued to base this patchset on the current 'cryptodev'.

Again, for more details please read our paper:

    Adiantum: length-preserving encryption for entry-level processors
    (https://eprint.iacr.org/2018/720.pdf)

This patchset can also be found in git at
https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux.git
branch "adiantum-v2".

References:
  [1] https://www.spinics.net/lists/linux-crypto/msg33000.html
  [2] https://patchwork.kernel.org/cover/10558059/
  [3] https://fastcrypto.org/umac/umac_proc.pdf

Eric Biggers (12):
  crypto: chacha20-generic - add HChaCha20 library function
  crypto: chacha20-generic - add XChaCha20 support
  crypto: chacha20-generic - refactor to allow varying number of rounds
  crypto: chacha - add XChaCha12 support
  crypto: arm/chacha20 - add XChaCha20 support
  crypto: arm/chacha20 - refactor to allow varying number of rounds
  crypto: arm/chacha - add XChaCha12 support
  crypto: poly1305 - add Poly1305 core API
  crypto: nhpoly1305 - add NHPoly1305 support
  crypto: arm/nhpoly1305 - add NEON-accelerated NHPoly1305
  crypto: adiantum - add Adiantum support
  fscrypt: add Adiantum support

 Documentation/filesystems/fscrypt.rst         |  183 +-
 arch/arm/crypto/Kconfig                       |    7 +-
 arch/arm/crypto/Makefile                      |    6 +-
 ...hacha20-neon-core.S => chacha-neon-core.S} |   90 +-
 arch/arm/crypto/chacha-neon-glue.c            |  207 ++
 arch/arm/crypto/chacha20-neon-glue.c          |  127 -
 arch/arm/crypto/nh-neon-core.S                |  116 +
 arch/arm/crypto/nhpoly1305-neon-glue.c        |   78 +
 arch/arm64/crypto/chacha20-neon-glue.c        |   40 +-
 arch/x86/crypto/chacha20_glue.c               |   52 +-
 arch/x86/crypto/poly1305_glue.c               |   20 +-
 crypto/Kconfig                                |   46 +-
 crypto/Makefile                               |    4 +-
 crypto/adiantum.c                             |  648 ++++
 crypto/chacha20_generic.c                     |  137 -
 crypto/chacha20poly1305.c                     |   10 +-
 crypto/chacha_generic.c                       |  217 ++
 crypto/nhpoly1305.c                           |  288 ++
 crypto/poly1305_generic.c                     |  174 +-
 crypto/testmgr.c                              |   30 +
 crypto/testmgr.h                              | 2856 ++++++++++++++++-
 drivers/char/random.c                         |   51 +-
 fs/crypto/crypto.c                            |   35 +-
 fs/crypto/fname.c                             |   22 +-
 fs/crypto/fscrypt_private.h                   |   66 +-
 fs/crypto/keyinfo.c                           |  322 +-
 fs/crypto/policy.c                            |    5 +-
 include/crypto/chacha.h                       |   53 +
 include/crypto/chacha20.h                     |   27 -
 include/crypto/nhpoly1305.h                   |   74 +
 include/crypto/poly1305.h                     |   28 +-
 include/uapi/linux/fs.h                       |    4 +-
 lib/Makefile                                  |    2 +-
 lib/{chacha20.c => chacha.c}                  |   59 +-
 34 files changed, 5389 insertions(+), 695 deletions(-)
 rename arch/arm/crypto/{chacha20-neon-core.S => chacha-neon-core.S} (92%)
 create mode 100644 arch/arm/crypto/chacha-neon-glue.c
 delete mode 100644 arch/arm/crypto/chacha20-neon-glue.c
 create mode 100644 arch/arm/crypto/nh-neon-core.S
 create mode 100644 arch/arm/crypto/nhpoly1305-neon-glue.c
 create mode 100644 crypto/adiantum.c
 delete mode 100644 crypto/chacha20_generic.c
 create mode 100644 crypto/chacha_generic.c
 create mode 100644 crypto/nhpoly1305.c
 create mode 100644 include/crypto/chacha.h
 delete mode 100644 include/crypto/chacha20.h
 create mode 100644 include/crypto/nhpoly1305.h
 rename lib/{chacha20.c => chacha.c} (59%)

-- 
2.19.0.605.g01d371f741-goog



* [RFC PATCH v2 01/12] crypto: chacha20-generic - add HChaCha20 library function
  2018-10-15 17:54 [RFC PATCH v2 00/12] crypto: Adiantum support Eric Biggers
@ 2018-10-15 17:54 ` Eric Biggers
  2018-10-19 14:13   ` Ard Biesheuvel
  2018-10-15 17:54 ` [RFC PATCH v2 02/12] crypto: chacha20-generic - add XChaCha20 support Eric Biggers
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 54+ messages in thread
From: Eric Biggers @ 2018-10-15 17:54 UTC
  To: linux-crypto
  Cc: linux-fscrypt, linux-arm-kernel, linux-kernel, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

From: Eric Biggers <ebiggers@google.com>

Refactor the unkeyed permutation part of chacha20_block() into its own
function, then add hchacha20_block() which is the ChaCha equivalent of
HSalsa20 and is an intermediate step towards XChaCha20 (see
https://cr.yp.to/snuffle/xsalsa-20081128.pdf).  HChaCha20 skips the
final addition of the initial state, and outputs only certain words of
the state.  It should not be used for streaming directly.
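
As a rough userspace illustration of the relationship described above
(not the kernel code itself), the sketch below implements the shared
permutation once and derives both the full core and HChaCha20 from it.
The *_sketch names and the quarter_round() helper are made up for this
example; the block counter increment and the little-endian
serialization done by chacha20_block() are left out.

```c
#include <stdint.h>
#include <string.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static uint32_t rol32(uint32_t v, int n)
{
	return (v << n) | (v >> (32 - n));
}

/* One ChaCha quarter round on words a, b, c, d of the state x. */
static void quarter_round(uint32_t *x, int a, int b, int c, int d)
{
	x[a] += x[b]; x[d] = rol32(x[d] ^ x[a], 16);
	x[c] += x[d]; x[b] = rol32(x[b] ^ x[c], 12);
	x[a] += x[b]; x[d] = rol32(x[d] ^ x[a], 8);
	x[c] += x[d]; x[b] = rol32(x[b] ^ x[c], 7);
}

/* The unkeyed 20-round permutation shared by both functions below. */
static void permute20(uint32_t x[16])
{
	for (int i = 0; i < 20; i += 2) {
		/* column round */
		quarter_round(x, 0, 4, 8, 12);
		quarter_round(x, 1, 5, 9, 13);
		quarter_round(x, 2, 6, 10, 14);
		quarter_round(x, 3, 7, 11, 15);
		/* diagonal round */
		quarter_round(x, 0, 5, 10, 15);
		quarter_round(x, 1, 6, 11, 12);
		quarter_round(x, 2, 7, 8, 13);
		quarter_round(x, 3, 4, 9, 14);
	}
}

/* Full core: permute, then add the initial state back in. */
void chacha20_core_sketch(const uint32_t in[16], uint32_t out[16])
{
	uint32_t x[16];

	memcpy(x, in, sizeof(x));
	permute20(x);
	for (int i = 0; i < 16; i++)
		out[i] = x[i] + in[i];
}

/* HChaCha20: permute, skip the final addition, keep words 0-3, 12-15. */
void hchacha20_sketch(const uint32_t in[16], uint32_t out[8])
{
	uint32_t x[16];

	memcpy(x, in, sizeof(x));
	permute20(x);
	memcpy(&out[0], &x[0], 16);
	memcpy(&out[4], &x[12], 16);
}
```

For the same input, words 0-3 and 12-15 of the full core output minus
the corresponding input words equal the eight HChaCha20 output words.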

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 include/crypto/chacha20.h |  2 ++
 lib/chacha20.c            | 50 ++++++++++++++++++++++++++++++++++-----
 2 files changed, 46 insertions(+), 6 deletions(-)

diff --git a/include/crypto/chacha20.h b/include/crypto/chacha20.h
index f76302d99e2be..fbec4e6a87890 100644
--- a/include/crypto/chacha20.h
+++ b/include/crypto/chacha20.h
@@ -19,6 +19,8 @@ struct chacha20_ctx {
 };
 
 void chacha20_block(u32 *state, u8 *stream);
+void hchacha20_block(const u32 *in, u32 *out);
+
 void crypto_chacha20_init(u32 *state, struct chacha20_ctx *ctx, u8 *iv);
 int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
 			   unsigned int keysize);
diff --git a/lib/chacha20.c b/lib/chacha20.c
index d907fec6a9ed1..6a484e16171d1 100644
--- a/lib/chacha20.c
+++ b/lib/chacha20.c
@@ -1,5 +1,5 @@
 /*
- * ChaCha20 256-bit cipher algorithm, RFC7539
+ * The "hash function" used as the core of the ChaCha20 stream cipher (RFC7539)
  *
  * Copyright (C) 2015 Martin Willi
  *
@@ -16,14 +16,10 @@
 #include <asm/unaligned.h>
 #include <crypto/chacha20.h>
 
-void chacha20_block(u32 *state, u8 *stream)
+static void chacha20_permute(u32 *x)
 {
-	u32 x[16];
 	int i;
 
-	for (i = 0; i < ARRAY_SIZE(x); i++)
-		x[i] = state[i];
-
 	for (i = 0; i < 20; i += 2) {
 		x[0]  += x[4];    x[12] = rol32(x[12] ^ x[0],  16);
 		x[1]  += x[5];    x[13] = rol32(x[13] ^ x[1],  16);
@@ -65,6 +61,25 @@ void chacha20_block(u32 *state, u8 *stream)
 		x[8]  += x[13];   x[7]  = rol32(x[7]  ^ x[8],   7);
 		x[9]  += x[14];   x[4]  = rol32(x[4]  ^ x[9],   7);
 	}
+}
+
+/**
+ * chacha20_block - generate one keystream block and increment block counter
+ * @state: input state matrix (16 32-bit words)
+ * @stream: output keystream block (64 bytes)
+ *
+ * This is the ChaCha20 core, a function from 64-byte strings to 64-byte
+ * strings.  The caller has already converted the endianness of the input.  This
+ * function also handles incrementing the block counter in the input matrix.
+ */
+void chacha20_block(u32 *state, u8 *stream)
+{
+	u32 x[16];
+	int i;
+
+	memcpy(x, state, 64);
+
+	chacha20_permute(x);
 
 	for (i = 0; i < ARRAY_SIZE(x); i++)
 		put_unaligned_le32(x[i] + state[i], &stream[i * sizeof(u32)]);
@@ -72,3 +87,26 @@ void chacha20_block(u32 *state, u8 *stream)
 	state[12]++;
 }
 EXPORT_SYMBOL(chacha20_block);
+
+/**
+ * hchacha20_block - abbreviated ChaCha20 core, for XChaCha20
+ * @in: input state matrix (16 32-bit words)
+ * @out: output (8 32-bit words)
+ *
+ * HChaCha20 is the ChaCha equivalent of HSalsa20 and is an intermediate step
+ * towards XChaCha20 (see https://cr.yp.to/snuffle/xsalsa-20081128.pdf).
+ * HChaCha20 skips the final addition of the initial state, and outputs only
+ * certain words of the state.  It should not be used for streaming directly.
+ */
+void hchacha20_block(const u32 *in, u32 *out)
+{
+	u32 x[16];
+
+	memcpy(x, in, 64);
+
+	chacha20_permute(x);
+
+	memcpy(&out[0], &x[0], 16);
+	memcpy(&out[4], &x[12], 16);
+}
+EXPORT_SYMBOL(hchacha20_block);
-- 
2.19.1.331.ge82ca0e54c-goog



* [RFC PATCH v2 02/12] crypto: chacha20-generic - add XChaCha20 support
  2018-10-15 17:54 [RFC PATCH v2 00/12] crypto: Adiantum support Eric Biggers
  2018-10-15 17:54 ` [RFC PATCH v2 01/12] crypto: chacha20-generic - add HChaCha20 library function Eric Biggers
@ 2018-10-15 17:54 ` Eric Biggers
  2018-10-19 14:24   ` Ard Biesheuvel
  2018-10-15 17:54 ` [RFC PATCH v2 03/12] crypto: chacha20-generic - refactor to allow varying number of rounds Eric Biggers
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 54+ messages in thread
From: Eric Biggers @ 2018-10-15 17:54 UTC
  To: linux-crypto
  Cc: linux-fscrypt, linux-arm-kernel, linux-kernel, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

From: Eric Biggers <ebiggers@google.com>

Add support for the XChaCha20 stream cipher.  XChaCha20 is the
application of the XSalsa20 construction
(https://cr.yp.to/snuffle/xsalsa-20081128.pdf) to ChaCha20 rather than
to Salsa20.  XChaCha20 extends ChaCha20's nonce length from 64 bits (or
96 bits, depending on convention) to 192 bits, while provably retaining
ChaCha20's security.  XChaCha20 uses the ChaCha20 permutation to map the
key and first 128 nonce bits to a 256-bit subkey.  Then, it does the
ChaCha20 stream cipher with the subkey and remaining 64 bits of nonce.

We need XChaCha support in order to add support for the Adiantum
encryption mode.  Note that to meet our performance requirements, we
actually plan to primarily use the variant XChaCha12.  But we believe
it's wise to first add XChaCha20 as a baseline with a higher security
margin, in case there are any situations where it can be used.
Supporting both variants is straightforward.

Since XChaCha20's subkey differs for each request, XChaCha20 can't be a
template that wraps ChaCha20; that would require re-keying the
underlying ChaCha20 for every request, which wouldn't be thread-safe.
Instead, we make XChaCha20 its own top-level algorithm which calls the
ChaCha20 streaming implementation internally.

Similar to the existing ChaCha20 implementation, we define the IV to be
the nonce and stream position concatenated together.  This allows users
to seek to any position in the stream.
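
For illustration, the 32-byte IV layout implied by
crypto_xchacha20_crypt() in this patch can be sketched as below.  The
struct, the function, and the XCHACHA20_IV_LEN_SKETCH constant are
hypothetical names used only for this example; the byte offsets are the
ones used by the memcpy() calls in the patch.

```c
#include <stdint.h>
#include <string.h>

#define XCHACHA20_IV_LEN_SKETCH 32 /* hypothetical: 16 + 8 + 8 bytes */

struct xchacha20_iv_parts {
	uint8_t subkey_nonce[16]; /* first 128 nonce bits, for HChaCha20 */
	uint8_t real_iv[16];      /* IV for the inner ChaCha20 */
};

/*
 * Split the caller's IV:
 *   bytes  0..15  first 128 nonce bits (used to derive the subkey)
 *   bytes 16..23  remaining 64 nonce bits
 *   bytes 24..31  stream position
 * The inner ChaCha20 IV is the stream position followed by the
 * remaining nonce bits.
 */
void xchacha20_split_iv(const uint8_t iv[XCHACHA20_IV_LEN_SKETCH],
			struct xchacha20_iv_parts *p)
{
	memcpy(p->subkey_nonce, iv, 16);
	memcpy(&p->real_iv[0], iv + 24, 8); /* stream position */
	memcpy(&p->real_iv[8], iv + 16, 8); /* remaining 64 nonce bits */
}
```

Because the stream position occupies the last 8 bytes of the IV, a
caller can seek by changing only those bytes and leaving the 192-bit
nonce untouched.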

I considered splitting the code into separate chacha20-common, chacha20,
and xchacha20 modules, so that chacha20 and xchacha20 could be
enabled/disabled independently.  However, since nearly all the code is
shared anyway, I ultimately decided there would have been little benefit
to the added complexity of separate modules.

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 crypto/Kconfig            |  14 +-
 crypto/chacha20_generic.c | 120 +++++---
 crypto/testmgr.c          |   6 +
 crypto/testmgr.h          | 577 ++++++++++++++++++++++++++++++++++++++
 include/crypto/chacha20.h |  14 +-
 5 files changed, 689 insertions(+), 42 deletions(-)

diff --git a/crypto/Kconfig b/crypto/Kconfig
index f7a235db56aaa..d9acbce23d4d5 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -1387,18 +1387,22 @@ config CRYPTO_SALSA20
 	  Bernstein <djb@cr.yp.to>. See <http://cr.yp.to/snuffle.html>
 
 config CRYPTO_CHACHA20
-	tristate "ChaCha20 cipher algorithm"
+	tristate "ChaCha20 stream cipher algorithms"
 	select CRYPTO_BLKCIPHER
 	help
-	  ChaCha20 cipher algorithm, RFC7539.
+	  The ChaCha20 and XChaCha20 stream cipher algorithms.
 
 	  ChaCha20 is a 256-bit high-speed stream cipher designed by Daniel J.
 	  Bernstein and further specified in RFC7539 for use in IETF protocols.
-	  This is the portable C implementation of ChaCha20.
-
-	  See also:
+	  This is the portable C implementation of ChaCha20.  See also:
 	  <http://cr.yp.to/chacha/chacha-20080128.pdf>
 
+	  XChaCha20 is the application of the XSalsa20 construction to ChaCha20
+	  rather than to Salsa20.  XChaCha20 extends ChaCha20's nonce length
+	  from 64 bits (or 96 bits using the RFC7539 convention) to 192 bits,
+	  while provably retaining ChaCha20's security.  See also:
+	  <https://cr.yp.to/snuffle/xsalsa-20081128.pdf>
+
 config CRYPTO_CHACHA20_X86_64
 	tristate "ChaCha20 cipher algorithm (x86_64/SSSE3/AVX2)"
 	depends on X86 && 64BIT
diff --git a/crypto/chacha20_generic.c b/crypto/chacha20_generic.c
index 3ae96587caf9a..07902fe37aeb8 100644
--- a/crypto/chacha20_generic.c
+++ b/crypto/chacha20_generic.c
@@ -1,7 +1,8 @@
 /*
- * ChaCha20 256-bit cipher algorithm, RFC7539
+ * ChaCha20 (RFC7539) and XChaCha20 stream cipher algorithms
  *
  * Copyright (C) 2015 Martin Willi
+ * Copyright (C) 2018 Google LLC
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -36,6 +37,31 @@ static void chacha20_docrypt(u32 *state, u8 *dst, const u8 *src,
 	}
 }
 
+static int chacha20_stream_xor(struct skcipher_request *req,
+			       struct chacha20_ctx *ctx, u8 *iv)
+{
+	struct skcipher_walk walk;
+	u32 state[16];
+	int err;
+
+	err = skcipher_walk_virt(&walk, req, true);
+
+	crypto_chacha20_init(state, ctx, iv);
+
+	while (walk.nbytes > 0) {
+		unsigned int nbytes = walk.nbytes;
+
+		if (nbytes < walk.total)
+			nbytes = round_down(nbytes, walk.stride);
+
+		chacha20_docrypt(state, walk.dst.virt.addr, walk.src.virt.addr,
+				 nbytes);
+		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
+	}
+
+	return err;
+}
+
 void crypto_chacha20_init(u32 *state, struct chacha20_ctx *ctx, u8 *iv)
 {
 	state[0]  = 0x61707865; /* "expa" */
@@ -77,54 +103,74 @@ int crypto_chacha20_crypt(struct skcipher_request *req)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
 	struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
-	struct skcipher_walk walk;
-	u32 state[16];
-	int err;
-
-	err = skcipher_walk_virt(&walk, req, true);
 
-	crypto_chacha20_init(state, ctx, walk.iv);
+	return chacha20_stream_xor(req, ctx, req->iv);
+}
+EXPORT_SYMBOL_GPL(crypto_chacha20_crypt);
 
-	while (walk.nbytes > 0) {
-		unsigned int nbytes = walk.nbytes;
+int crypto_xchacha20_crypt(struct skcipher_request *req)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct chacha20_ctx subctx;
+	u32 state[16];
+	u8 real_iv[16];
 
-		if (nbytes < walk.total)
-			nbytes = round_down(nbytes, walk.stride);
+	/* Compute the subkey given the original key and first 128 nonce bits */
+	crypto_chacha20_init(state, ctx, req->iv);
+	hchacha20_block(state, subctx.key);
 
-		chacha20_docrypt(state, walk.dst.virt.addr, walk.src.virt.addr,
-				 nbytes);
-		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
-	}
+	/* Build the real IV */
+	memcpy(&real_iv[0], req->iv + 24, 8); /* stream position */
+	memcpy(&real_iv[8], req->iv + 16, 8); /* remaining 64 nonce bits */
 
-	return err;
+	/* Generate the stream and XOR it with the data */
+	return chacha20_stream_xor(req, &subctx, real_iv);
 }
-EXPORT_SYMBOL_GPL(crypto_chacha20_crypt);
-
-static struct skcipher_alg alg = {
-	.base.cra_name		= "chacha20",
-	.base.cra_driver_name	= "chacha20-generic",
-	.base.cra_priority	= 100,
-	.base.cra_blocksize	= 1,
-	.base.cra_ctxsize	= sizeof(struct chacha20_ctx),
-	.base.cra_module	= THIS_MODULE,
-
-	.min_keysize		= CHACHA20_KEY_SIZE,
-	.max_keysize		= CHACHA20_KEY_SIZE,
-	.ivsize			= CHACHA20_IV_SIZE,
-	.chunksize		= CHACHA20_BLOCK_SIZE,
-	.setkey			= crypto_chacha20_setkey,
-	.encrypt		= crypto_chacha20_crypt,
-	.decrypt		= crypto_chacha20_crypt,
+EXPORT_SYMBOL_GPL(crypto_xchacha20_crypt);
+
+static struct skcipher_alg algs[] = {
+	{
+		.base.cra_name		= "chacha20",
+		.base.cra_driver_name	= "chacha20-generic",
+		.base.cra_priority	= 100,
+		.base.cra_blocksize	= 1,
+		.base.cra_ctxsize	= sizeof(struct chacha20_ctx),
+		.base.cra_module	= THIS_MODULE,
+
+		.min_keysize		= CHACHA20_KEY_SIZE,
+		.max_keysize		= CHACHA20_KEY_SIZE,
+		.ivsize			= CHACHA20_IV_SIZE,
+		.chunksize		= CHACHA20_BLOCK_SIZE,
+		.setkey			= crypto_chacha20_setkey,
+		.encrypt		= crypto_chacha20_crypt,
+		.decrypt		= crypto_chacha20_crypt,
+	}, {
+		.base.cra_name		= "xchacha20",
+		.base.cra_driver_name	= "xchacha20-generic",
+		.base.cra_priority	= 100,
+		.base.cra_blocksize	= 1,
+		.base.cra_ctxsize	= sizeof(struct chacha20_ctx),
+		.base.cra_module	= THIS_MODULE,
+
+		.min_keysize		= CHACHA20_KEY_SIZE,
+		.max_keysize		= CHACHA20_KEY_SIZE,
+		.ivsize			= XCHACHA20_IV_SIZE,
+		.chunksize		= CHACHA20_BLOCK_SIZE,
+		.setkey			= crypto_chacha20_setkey,
+		.encrypt		= crypto_xchacha20_crypt,
+		.decrypt		= crypto_xchacha20_crypt,
+	}
 };
 
 static int __init chacha20_generic_mod_init(void)
 {
-	return crypto_register_skcipher(&alg);
+	return crypto_register_skciphers(algs, ARRAY_SIZE(algs));
 }
 
 static void __exit chacha20_generic_mod_fini(void)
 {
-	crypto_unregister_skcipher(&alg);
+	crypto_unregister_skciphers(algs, ARRAY_SIZE(algs));
 }
 
 module_init(chacha20_generic_mod_init);
@@ -132,6 +178,8 @@ module_exit(chacha20_generic_mod_fini);
 
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Martin Willi <martin@strongswan.org>");
-MODULE_DESCRIPTION("chacha20 cipher algorithm");
+MODULE_DESCRIPTION("ChaCha20 and XChaCha20 stream ciphers (generic)");
 MODULE_ALIAS_CRYPTO("chacha20");
 MODULE_ALIAS_CRYPTO("chacha20-generic");
+MODULE_ALIAS_CRYPTO("xchacha20");
+MODULE_ALIAS_CRYPTO("xchacha20-generic");
diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index b1f79c6bf4096..a5512e69c8f31 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -3544,6 +3544,12 @@ static const struct alg_test_desc alg_test_descs[] = {
 		.suite = {
 			.hash = __VECS(aes_xcbc128_tv_template)
 		}
+	}, {
+		.alg = "xchacha20",
+		.test = alg_test_skcipher,
+		.suite = {
+			.cipher = __VECS(xchacha20_tv_template)
+		},
 	}, {
 		.alg = "xts(aes)",
 		.test = alg_test_skcipher,
diff --git a/crypto/testmgr.h b/crypto/testmgr.h
index 1fe7b97ba03f9..371641c73cf8c 100644
--- a/crypto/testmgr.h
+++ b/crypto/testmgr.h
@@ -30802,6 +30802,583 @@ static const struct cipher_testvec chacha20_tv_template[] = {
 	},
 };
 
+static const struct cipher_testvec xchacha20_tv_template[] = {
+	{ /* from libsodium test/default/xchacha20.c */
+		.key	= "\x79\xc9\x97\x98\xac\x67\x30\x0b"
+			  "\xbb\x27\x04\xc9\x5c\x34\x1e\x32"
+			  "\x45\xf3\xdc\xb2\x17\x61\xb9\x8e"
+			  "\x52\xff\x45\xb2\x4f\x30\x4f\xc4",
+		.klen	= 32,
+		.iv	= "\xb3\x3f\xfd\x30\x96\x47\x9b\xcf"
+			  "\xbc\x9a\xee\x49\x41\x76\x88\xa0"
+			  "\xa2\x55\x4f\x8d\x95\x38\x94\x19"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00",
+		.ptext	= "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00",
+		.ctext	= "\xc6\xe9\x75\x81\x60\x08\x3a\xc6"
+			  "\x04\xef\x90\xe7\x12\xce\x6e\x75"
+			  "\xd7\x79\x75\x90\x74\x4e\x0c\xf0"
+			  "\x60\xf0\x13\x73\x9c",
+		.len	= 29,
+	}, { /* from libsodium test/default/xchacha20.c */
+		.key	= "\x9d\x23\xbd\x41\x49\xcb\x97\x9c"
+			  "\xcf\x3c\x5c\x94\xdd\x21\x7e\x98"
+			  "\x08\xcb\x0e\x50\xcd\x0f\x67\x81"
+			  "\x22\x35\xea\xaf\x60\x1d\x62\x32",
+		.klen	= 32,
+		.iv	= "\xc0\x47\x54\x82\x66\xb7\xc3\x70"
+			  "\xd3\x35\x66\xa2\x42\x5c\xbf\x30"
+			  "\xd8\x2d\x1e\xaf\x52\x94\x10\x9e"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00",
+		.ptext	= "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00",
+		.ctext	= "\xa2\x12\x09\x09\x65\x94\xde\x8c"
+			  "\x56\x67\xb1\xd1\x3a\xd9\x3f\x74"
+			  "\x41\x06\xd0\x54\xdf\x21\x0e\x47"
+			  "\x82\xcd\x39\x6f\xec\x69\x2d\x35"
+			  "\x15\xa2\x0b\xf3\x51\xee\xc0\x11"
+			  "\xa9\x2c\x36\x78\x88\xbc\x46\x4c"
+			  "\x32\xf0\x80\x7a\xcd\x6c\x20\x3a"
+			  "\x24\x7e\x0d\xb8\x54\x14\x84\x68"
+			  "\xe9\xf9\x6b\xee\x4c\xf7\x18\xd6"
+			  "\x8d\x5f\x63\x7c\xbd\x5a\x37\x64"
+			  "\x57\x78\x8e\x6f\xae\x90\xfc\x31"
+			  "\x09\x7c\xfc",
+		.len	= 91,
+	}, { /* Taken from the ChaCha20 test vectors, appended 16 random bytes
+		to nonce, and recomputed the ciphertext with libsodium */
+		.key	= "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00",
+		.klen	= 32,
+		.iv	= "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x67\xc6\x69\x73"
+			  "\x51\xff\x4a\xec\x29\xcd\xba\xab"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00",
+		.ptext	= "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00",
+		.ctext	= "\x9c\x49\x2a\xe7\x8a\x2f\x93\xc7"
+			  "\xb3\x33\x6f\x82\x17\xd8\xc4\x1e"
+			  "\xad\x80\x11\x11\x1d\x4c\x16\x18"
+			  "\x07\x73\x9b\x4f\xdb\x7c\xcb\x47"
+			  "\xfd\xef\x59\x74\xfa\x3f\xe5\x4c"
+			  "\x9b\xd0\xea\xbc\xba\x56\xad\x32"
+			  "\x03\xdc\xf8\x2b\xc1\xe1\x75\x67"
+			  "\x23\x7b\xe6\xfc\xd4\x03\x86\x54",
+		.len	= 64,
+	}, { /* Taken from the ChaCha20 test vectors, appended 16 random bytes
+		to nonce, and recomputed the ciphertext with libsodium */
+		.key	= "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x01",
+		.klen	= 32,
+		.iv	= "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x02\xf2\xfb\xe3\x46"
+			  "\x7c\xc2\x54\xf8\x1b\xe8\xe7\x8d"
+			  "\x01\x00\x00\x00\x00\x00\x00\x00",
+		.ptext	= "\x41\x6e\x79\x20\x73\x75\x62\x6d"
+			  "\x69\x73\x73\x69\x6f\x6e\x20\x74"
+			  "\x6f\x20\x74\x68\x65\x20\x49\x45"
+			  "\x54\x46\x20\x69\x6e\x74\x65\x6e"
+			  "\x64\x65\x64\x20\x62\x79\x20\x74"
+			  "\x68\x65\x20\x43\x6f\x6e\x74\x72"
+			  "\x69\x62\x75\x74\x6f\x72\x20\x66"
+			  "\x6f\x72\x20\x70\x75\x62\x6c\x69"
+			  "\x63\x61\x74\x69\x6f\x6e\x20\x61"
+			  "\x73\x20\x61\x6c\x6c\x20\x6f\x72"
+			  "\x20\x70\x61\x72\x74\x20\x6f\x66"
+			  "\x20\x61\x6e\x20\x49\x45\x54\x46"
+			  "\x20\x49\x6e\x74\x65\x72\x6e\x65"
+			  "\x74\x2d\x44\x72\x61\x66\x74\x20"
+			  "\x6f\x72\x20\x52\x46\x43\x20\x61"
+			  "\x6e\x64\x20\x61\x6e\x79\x20\x73"
+			  "\x74\x61\x74\x65\x6d\x65\x6e\x74"
+			  "\x20\x6d\x61\x64\x65\x20\x77\x69"
+			  "\x74\x68\x69\x6e\x20\x74\x68\x65"
+			  "\x20\x63\x6f\x6e\x74\x65\x78\x74"
+			  "\x20\x6f\x66\x20\x61\x6e\x20\x49"
+			  "\x45\x54\x46\x20\x61\x63\x74\x69"
+			  "\x76\x69\x74\x79\x20\x69\x73\x20"
+			  "\x63\x6f\x6e\x73\x69\x64\x65\x72"
+			  "\x65\x64\x20\x61\x6e\x20\x22\x49"
+			  "\x45\x54\x46\x20\x43\x6f\x6e\x74"
+			  "\x72\x69\x62\x75\x74\x69\x6f\x6e"
+			  "\x22\x2e\x20\x53\x75\x63\x68\x20"
+			  "\x73\x74\x61\x74\x65\x6d\x65\x6e"
+			  "\x74\x73\x20\x69\x6e\x63\x6c\x75"
+			  "\x64\x65\x20\x6f\x72\x61\x6c\x20"
+			  "\x73\x74\x61\x74\x65\x6d\x65\x6e"
+			  "\x74\x73\x20\x69\x6e\x20\x49\x45"
+			  "\x54\x46\x20\x73\x65\x73\x73\x69"
+			  "\x6f\x6e\x73\x2c\x20\x61\x73\x20"
+			  "\x77\x65\x6c\x6c\x20\x61\x73\x20"
+			  "\x77\x72\x69\x74\x74\x65\x6e\x20"
+			  "\x61\x6e\x64\x20\x65\x6c\x65\x63"
+			  "\x74\x72\x6f\x6e\x69\x63\x20\x63"
+			  "\x6f\x6d\x6d\x75\x6e\x69\x63\x61"
+			  "\x74\x69\x6f\x6e\x73\x20\x6d\x61"
+			  "\x64\x65\x20\x61\x74\x20\x61\x6e"
+			  "\x79\x20\x74\x69\x6d\x65\x20\x6f"
+			  "\x72\x20\x70\x6c\x61\x63\x65\x2c"
+			  "\x20\x77\x68\x69\x63\x68\x20\x61"
+			  "\x72\x65\x20\x61\x64\x64\x72\x65"
+			  "\x73\x73\x65\x64\x20\x74\x6f",
+		.ctext	= "\xf9\xab\x7a\x4a\x60\xb8\x5f\xa0"
+			  "\x50\xbb\x57\xce\xef\x8c\xc1\xd9"
+			  "\x24\x15\xb3\x67\x5e\x7f\x01\xf6"
+			  "\x1c\x22\xf6\xe5\x71\xb1\x43\x64"
+			  "\x63\x05\xd5\xfc\x5c\x3d\xc0\x0e"
+			  "\x23\xef\xd3\x3b\xd9\xdc\x7f\xa8"
+			  "\x58\x26\xb3\xd0\xc2\xd5\x04\x3f"
+			  "\x0a\x0e\x8f\x17\xe4\xcd\xf7\x2a"
+			  "\xb4\x2c\x09\xe4\x47\xec\x8b\xfb"
+			  "\x59\x37\x7a\xa1\xd0\x04\x7e\xaa"
+			  "\xf1\x98\x5f\x24\x3d\x72\x9a\x43"
+			  "\xa4\x36\x51\x92\x22\x87\xff\x26"
+			  "\xce\x9d\xeb\x59\x78\x84\x5e\x74"
+			  "\x97\x2e\x63\xc0\xef\x29\xf7\x8a"
+			  "\xb9\xee\x35\x08\x77\x6a\x35\x9a"
+			  "\x3e\xe6\x4f\x06\x03\x74\x1b\xc1"
+			  "\x5b\xb3\x0b\x89\x11\x07\xd3\xb7"
+			  "\x53\xd6\x25\x04\xd9\x35\xb4\x5d"
+			  "\x4c\x33\x5a\xc2\x42\x4c\xe6\xa4"
+			  "\x97\x6e\x0e\xd2\xb2\x8b\x2f\x7f"
+			  "\x28\xe5\x9f\xac\x4b\x2e\x02\xab"
+			  "\x85\xfa\xa9\x0d\x7c\x2d\x10\xe6"
+			  "\x91\xab\x55\x63\xf0\xde\x3a\x94"
+			  "\x25\x08\x10\x03\xc2\x68\xd1\xf4"
+			  "\xaf\x7d\x9c\x99\xf7\x86\x96\x30"
+			  "\x60\xfc\x0b\xe6\xa8\x80\x15\xb0"
+			  "\x81\xb1\x0c\xbe\xb9\x12\x18\x25"
+			  "\xe9\x0e\xb1\xe7\x23\xb2\xef\x4a"
+			  "\x22\x8f\xc5\x61\x89\xd4\xe7\x0c"
+			  "\x64\x36\x35\x61\xb6\x34\x60\xf7"
+			  "\x7b\x61\x37\x37\x12\x10\xa2\xf6"
+			  "\x7e\xdb\x7f\x39\x3f\xb6\x8e\x89"
+			  "\x9e\xf3\xfe\x13\x98\xbb\x66\x5a"
+			  "\xec\xea\xab\x3f\x9c\x87\xc4\x8c"
+			  "\x8a\x04\x18\x49\xfc\x77\x11\x50"
+			  "\x16\xe6\x71\x2b\xee\xc0\x9c\xb6"
+			  "\x87\xfd\x80\xff\x0b\x1d\x73\x38"
+			  "\xa4\x1d\x6f\xae\xe4\x12\xd7\x93"
+			  "\x9d\xcd\x38\x26\x09\x40\x52\xcd"
+			  "\x67\x01\x67\x26\xe0\x3e\x98\xa8"
+			  "\xe8\x1a\x13\x41\xbb\x90\x4d\x87"
+			  "\xbb\x42\x82\x39\xce\x3a\xd0\x18"
+			  "\x6d\x7b\x71\x8f\xbb\x2c\x6a\xd1"
+			  "\xbd\xf5\xc7\x8a\x7e\xe1\x1e\x0f"
+			  "\x0d\x0d\x13\x7c\xd9\xd8\x3c\x91"
+			  "\xab\xff\x1f\x12\xc3\xee\xe5\x65"
+			  "\x12\x8d\x7b\x61\xe5\x1f\x98",
+		.len	= 375,
+		.also_non_np = 1,
+		.np	= 3,
+		.tap	= { 375 - 20, 4, 16 },
+
+	}, { /* Taken from the ChaCha20 test vectors, appended 16 random bytes
+		to nonce, and recomputed the ciphertext with libsodium */
+		.key	= "\x1c\x92\x40\xa5\xeb\x55\xd3\x8a"
+			  "\xf3\x33\x88\x86\x04\xf6\xb5\xf0"
+			  "\x47\x39\x17\xc1\x40\x2b\x80\x09"
+			  "\x9d\xca\x5c\xbc\x20\x70\x75\xc0",
+		.klen	= 32,
+		.iv	= "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x02\x76\x5a\x2e\x63"
+			  "\x33\x9f\xc9\x9a\x66\x32\x0d\xb7"
+			  "\x2a\x00\x00\x00\x00\x00\x00\x00",
+		.ptext	= "\x27\x54\x77\x61\x73\x20\x62\x72"
+			  "\x69\x6c\x6c\x69\x67\x2c\x20\x61"
+			  "\x6e\x64\x20\x74\x68\x65\x20\x73"
+			  "\x6c\x69\x74\x68\x79\x20\x74\x6f"
+			  "\x76\x65\x73\x0a\x44\x69\x64\x20"
+			  "\x67\x79\x72\x65\x20\x61\x6e\x64"
+			  "\x20\x67\x69\x6d\x62\x6c\x65\x20"
+			  "\x69\x6e\x20\x74\x68\x65\x20\x77"
+			  "\x61\x62\x65\x3a\x0a\x41\x6c\x6c"
+			  "\x20\x6d\x69\x6d\x73\x79\x20\x77"
+			  "\x65\x72\x65\x20\x74\x68\x65\x20"
+			  "\x62\x6f\x72\x6f\x67\x6f\x76\x65"
+			  "\x73\x2c\x0a\x41\x6e\x64\x20\x74"
+			  "\x68\x65\x20\x6d\x6f\x6d\x65\x20"
+			  "\x72\x61\x74\x68\x73\x20\x6f\x75"
+			  "\x74\x67\x72\x61\x62\x65\x2e",
+		.ctext	= "\x95\xb9\x51\xe7\x8f\xb4\xa4\x03"
+			  "\xca\x37\xcc\xde\x60\x1d\x8c\xe2"
+			  "\xf1\xbb\x8a\x13\x7f\x61\x85\xcc"
+			  "\xad\xf4\xf0\xdc\x86\xa6\x1e\x10"
+			  "\xbc\x8e\xcb\x38\x2b\xa5\xc8\x8f"
+			  "\xaa\x03\x3d\x53\x4a\x42\xb1\x33"
+			  "\xfc\xd3\xef\xf0\x8e\x7e\x10\x9c"
+			  "\x6f\x12\x5e\xd4\x96\xfe\x5b\x08"
+			  "\xb6\x48\xf0\x14\x74\x51\x18\x7c"
+			  "\x07\x92\xfc\xac\x9d\xf1\x94\xc0"
+			  "\xc1\x9d\xc5\x19\x43\x1f\x1d\xbb"
+			  "\x07\xf0\x1b\x14\x25\x45\xbb\xcb"
+			  "\x5c\xe2\x8b\x28\xf3\xcf\x47\x29"
+			  "\x27\x79\x67\x24\xa6\x87\xc2\x11"
+			  "\x65\x03\xfa\x45\xf7\x9e\x53\x7a"
+			  "\x99\xf1\x82\x25\x4f\x8d\x07",
+		.len	= 127,
+	}, { /* Taken from the ChaCha20 test vectors, appended 16 random bytes
+		to nonce, and recomputed the ciphertext with libsodium */
+		.key	= "\x1c\x92\x40\xa5\xeb\x55\xd3\x8a"
+			  "\xf3\x33\x88\x86\x04\xf6\xb5\xf0"
+			  "\x47\x39\x17\xc1\x40\x2b\x80\x09"
+			  "\x9d\xca\x5c\xbc\x20\x70\x75\xc0",
+		.klen	= 32,
+		.iv	= "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x01\x31\x58\xa3\x5a"
+			  "\x25\x5d\x05\x17\x58\xe9\x5e\xd4"
+			  "\x1c\x00\x00\x00\x00\x00\x00\x00",
+		.ptext	= "\x49\xee\xe0\xdc\x24\x90\x40\xcd"
+			  "\xc5\x40\x8f\x47\x05\xbc\xdd\x81"
+			  "\x47\xc6\x8d\xe6\xb1\x8f\xd7\xcb"
+			  "\x09\x0e\x6e\x22\x48\x1f\xbf\xb8"
+			  "\x5c\xf7\x1e\x8a\xc1\x23\xf2\xd4"
+			  "\x19\x4b\x01\x0f\x4e\xa4\x43\xce"
+			  "\x01\xc6\x67\xda\x03\x91\x18\x90"
+			  "\xa5\xa4\x8e\x45\x03\xb3\x2d\xac"
+			  "\x74\x92\xd3\x53\x47\xc8\xdd\x25"
+			  "\x53\x6c\x02\x03\x87\x0d\x11\x0c"
+			  "\x58\xe3\x12\x18\xfd\x2a\x5b\x40"
+			  "\x0c\x30\xf0\xb8\x3f\x43\xce\xae"
+			  "\x65\x3a\x7d\x7c\xf4\x54\xaa\xcc"
+			  "\x33\x97\xc3\x77\xba\xc5\x70\xde"
+			  "\xd7\xd5\x13\xa5\x65\xc4\x5f\x0f"
+			  "\x46\x1a\x0d\x97\xb5\xf3\xbb\x3c"
+			  "\x84\x0f\x2b\xc5\xaa\xea\xf2\x6c"
+			  "\xc9\xb5\x0c\xee\x15\xf3\x7d\xbe"
+			  "\x9f\x7b\x5a\xa6\xae\x4f\x83\xb6"
+			  "\x79\x49\x41\xf4\x58\x18\xcb\x86"
+			  "\x7f\x30\x0e\xf8\x7d\x44\x36\xea"
+			  "\x75\xeb\x88\x84\x40\x3c\xad\x4f"
+			  "\x6f\x31\x6b\xaa\x5d\xe5\xa5\xc5"
+			  "\x21\x66\xe9\xa7\xe3\xb2\x15\x88"
+			  "\x78\xf6\x79\xa1\x59\x47\x12\x4e"
+			  "\x9f\x9f\x64\x1a\xa0\x22\x5b\x08"
+			  "\xbe\x7c\x36\xc2\x2b\x66\x33\x1b"
+			  "\xdd\x60\x71\xf7\x47\x8c\x61\xc3"
+			  "\xda\x8a\x78\x1e\x16\xfa\x1e\x86"
+			  "\x81\xa6\x17\x2a\xa7\xb5\xc2\xe7"
+			  "\xa4\xc7\x42\xf1\xcf\x6a\xca\xb4"
+			  "\x45\xcf\xf3\x93\xf0\xe7\xea\xf6"
+			  "\xf4\xe6\x33\x43\x84\x93\xa5\x67"
+			  "\x9b\x16\x58\x58\x80\x0f\x2b\x5c"
+			  "\x24\x74\x75\x7f\x95\x81\xb7\x30"
+			  "\x7a\x33\xa7\xf7\x94\x87\x32\x27"
+			  "\x10\x5d\x14\x4c\x43\x29\xdd\x26"
+			  "\xbd\x3e\x3c\x0e\xfe\x0e\xa5\x10"
+			  "\xea\x6b\x64\xfd\x73\xc6\xed\xec"
+			  "\xa8\xc9\xbf\xb3\xba\x0b\x4d\x07"
+			  "\x70\xfc\x16\xfd\x79\x1e\xd7\xc5"
+			  "\x49\x4e\x1c\x8b\x8d\x79\x1b\xb1"
+			  "\xec\xca\x60\x09\x4c\x6a\xd5\x09"
+			  "\x49\x46\x00\x88\x22\x8d\xce\xea"
+			  "\xb1\x17\x11\xde\x42\xd2\x23\xc1"
+			  "\x72\x11\xf5\x50\x73\x04\x40\x47"
+			  "\xf9\x5d\xe7\xa7\x26\xb1\x7e\xb0"
+			  "\x3f\x58\xc1\x52\xab\x12\x67\x9d"
+			  "\x3f\x43\x4b\x68\xd4\x9c\x68\x38"
+			  "\x07\x8a\x2d\x3e\xf3\xaf\x6a\x4b"
+			  "\xf9\xe5\x31\x69\x22\xf9\xa6\x69"
+			  "\xc6\x9c\x96\x9a\x12\x35\x95\x1d"
+			  "\x95\xd5\xdd\xbe\xbf\x93\x53\x24"
+			  "\xfd\xeb\xc2\x0a\x64\xb0\x77\x00"
+			  "\x6f\x88\xc4\x37\x18\x69\x7c\xd7"
+			  "\x41\x92\x55\x4c\x03\xa1\x9a\x4b"
+			  "\x15\xe5\xdf\x7f\x37\x33\x72\xc1"
+			  "\x8b\x10\x67\xa3\x01\x57\x94\x25"
+			  "\x7b\x38\x71\x7e\xdd\x1e\xcc\x73"
+			  "\x55\xd2\x8e\xeb\x07\xdd\xf1\xda"
+			  "\x58\xb1\x47\x90\xfe\x42\x21\x72"
+			  "\xa3\x54\x7a\xa0\x40\xec\x9f\xdd"
+			  "\xc6\x84\x6e\xca\xae\xe3\x68\xb4"
+			  "\x9d\xe4\x78\xff\x57\xf2\xf8\x1b"
+			  "\x03\xa1\x31\xd9\xde\x8d\xf5\x22"
+			  "\x9c\xdd\x20\xa4\x1e\x27\xb1\x76"
+			  "\x4f\x44\x55\xe2\x9b\xa1\x9c\xfe"
+			  "\x54\xf7\x27\x1b\xf4\xde\x02\xf5"
+			  "\x1b\x55\x48\x5c\xdc\x21\x4b\x9e"
+			  "\x4b\x6e\xed\x46\x23\xdc\x65\xb2"
+			  "\xcf\x79\x5f\x28\xe0\x9e\x8b\xe7"
+			  "\x4c\x9d\x8a\xff\xc1\xa6\x28\xb8"
+			  "\x65\x69\x8a\x45\x29\xef\x74\x85"
+			  "\xde\x79\xc7\x08\xae\x30\xb0\xf4"
+			  "\xa3\x1d\x51\x41\xab\xce\xcb\xf6"
+			  "\xb5\xd8\x6d\xe0\x85\xe1\x98\xb3"
+			  "\x43\xbb\x86\x83\x0a\xa0\xf5\xb7"
+			  "\x04\x0b\xfa\x71\x1f\xb0\xf6\xd9"
+			  "\x13\x00\x15\xf0\xc7\xeb\x0d\x5a"
+			  "\x9f\xd7\xb9\x6c\x65\x14\x22\x45"
+			  "\x6e\x45\x32\x3e\x7e\x60\x1a\x12"
+			  "\x97\x82\x14\xfb\xaa\x04\x22\xfa"
+			  "\xa0\xe5\x7e\x8c\x78\x02\x48\x5d"
+			  "\x78\x33\x5a\x7c\xad\xdb\x29\xce"
+			  "\xbb\x8b\x61\xa4\xb7\x42\xe2\xac"
+			  "\x8b\x1a\xd9\x2f\x0b\x8b\x62\x21"
+			  "\x83\x35\x7e\xad\x73\xc2\xb5\x6c"
+			  "\x10\x26\x38\x07\xe5\xc7\x36\x80"
+			  "\xe2\x23\x12\x61\xf5\x48\x4b\x2b"
+			  "\xc5\xdf\x15\xd9\x87\x01\xaa\xac"
+			  "\x1e\x7c\xad\x73\x78\x18\x63\xe0"
+			  "\x8b\x9f\x81\xd8\x12\x6a\x28\x10"
+			  "\xbe\x04\x68\x8a\x09\x7c\x1b\x1c"
+			  "\x83\x66\x80\x47\x80\xe8\xfd\x35"
+			  "\x1c\x97\x6f\xae\x49\x10\x66\xcc"
+			  "\xc6\xd8\xcc\x3a\x84\x91\x20\x77"
+			  "\x72\xe4\x24\xd2\x37\x9f\xc5\xc9"
+			  "\x25\x94\x10\x5f\x40\x00\x64\x99"
+			  "\xdc\xae\xd7\x21\x09\x78\x50\x15"
+			  "\xac\x5f\xc6\x2c\xa2\x0b\xa9\x39"
+			  "\x87\x6e\x6d\xab\xde\x08\x51\x16"
+			  "\xc7\x13\xe9\xea\xed\x06\x8e\x2c"
+			  "\xf8\x37\x8c\xf0\xa6\x96\x8d\x43"
+			  "\xb6\x98\x37\xb2\x43\xed\xde\xdf"
+			  "\x89\x1a\xe7\xeb\x9d\xa1\x7b\x0b"
+			  "\x77\xb0\xe2\x75\xc0\xf1\x98\xd9"
+			  "\x80\x55\xc9\x34\x91\xd1\x59\xe8"
+			  "\x4b\x0f\xc1\xa9\x4b\x7a\x84\x06"
+			  "\x20\xa8\x5d\xfa\xd1\xde\x70\x56"
+			  "\x2f\x9e\x91\x9c\x20\xb3\x24\xd8"
+			  "\x84\x3d\xe1\x8c\x7e\x62\x52\xe5"
+			  "\x44\x4b\x9f\xc2\x93\x03\xea\x2b"
+			  "\x59\xc5\xfa\x3f\x91\x2b\xbb\x23"
+			  "\xf5\xb2\x7b\xf5\x38\xaf\xb3\xee"
+			  "\x63\xdc\x7b\xd1\xff\xaa\x8b\xab"
+			  "\x82\x6b\x37\x04\xeb\x74\xbe\x79"
+			  "\xb9\x83\x90\xef\x20\x59\x46\xff"
+			  "\xe9\x97\x3e\x2f\xee\xb6\x64\x18"
+			  "\x38\x4c\x7a\x4a\xf9\x61\xe8\x9a"
+			  "\xa1\xb5\x01\xa6\x47\xd3\x11\xd4"
+			  "\xce\xd3\x91\x49\x88\xc7\xb8\x4d"
+			  "\xb1\xb9\x07\x6d\x16\x72\xae\x46"
+			  "\x5e\x03\xa1\x4b\xb6\x02\x30\xa8"
+			  "\x3d\xa9\x07\x2a\x7c\x19\xe7\x62"
+			  "\x87\xe3\x82\x2f\x6f\xe1\x09\xd9"
+			  "\x94\x97\xea\xdd\x58\x9e\xae\x76"
+			  "\x7e\x35\xe5\xb4\xda\x7e\xf4\xde"
+			  "\xf7\x32\x87\xcd\x93\xbf\x11\x56"
+			  "\x11\xbe\x08\x74\xe1\x69\xad\xe2"
+			  "\xd7\xf8\x86\x75\x8a\x3c\xa4\xbe"
+			  "\x70\xa7\x1b\xfc\x0b\x44\x2a\x76"
+			  "\x35\xea\x5d\x85\x81\xaf\x85\xeb"
+			  "\xa0\x1c\x61\xc2\xf7\x4f\xa5\xdc"
+			  "\x02\x7f\xf6\x95\x40\x6e\x8a\x9a"
+			  "\xf3\x5d\x25\x6e\x14\x3a\x22\xc9"
+			  "\x37\x1c\xeb\x46\x54\x3f\xa5\x91"
+			  "\xc2\xb5\x8c\xfe\x53\x08\x97\x32"
+			  "\x1b\xb2\x30\x27\xfe\x25\x5d\xdc"
+			  "\x08\x87\xd0\xe5\x94\x1a\xd4\xf1"
+			  "\xfe\xd6\xb4\xa3\xe6\x74\x81\x3c"
+			  "\x1b\xb7\x31\xa7\x22\xfd\xd4\xdd"
+			  "\x20\x4e\x7c\x51\xb0\x60\x73\xb8"
+			  "\x9c\xac\x91\x90\x7e\x01\xb0\xe1"
+			  "\x8a\x2f\x75\x1c\x53\x2a\x98\x2a"
+			  "\x06\x52\x95\x52\xb2\xe9\x25\x2e"
+			  "\x4c\xe2\x5a\x00\xb2\x13\x81\x03"
+			  "\x77\x66\x0d\xa5\x99\xda\x4e\x8c"
+			  "\xac\xf3\x13\x53\x27\x45\xaf\x64"
+			  "\x46\xdc\xea\x23\xda\x97\xd1\xab"
+			  "\x7d\x6c\x30\x96\x1f\xbc\x06\x34"
+			  "\x18\x0b\x5e\x21\x35\x11\x8d\x4c"
+			  "\xe0\x2d\xe9\x50\x16\x74\x81\xa8"
+			  "\xb4\x34\xb9\x72\x42\xa6\xcc\xbc"
+			  "\xca\x34\x83\x27\x10\x5b\x68\x45"
+			  "\x8f\x52\x22\x0c\x55\x3d\x29\x7c"
+			  "\xe3\xc0\x66\x05\x42\x91\x5f\x58"
+			  "\xfe\x4a\x62\xd9\x8c\xa9\x04\x19"
+			  "\x04\xa9\x08\x4b\x57\xfc\x67\x53"
+			  "\x08\x7c\xbc\x66\x8a\xb0\xb6\x9f"
+			  "\x92\xd6\x41\x7c\x5b\x2a\x00\x79"
+			  "\x72",
+		.ctext	= "\x3a\x92\xee\x53\x31\xaf\x2b\x60"
+			  "\x5f\x55\x8d\x00\x5d\xfc\x74\x97"
+			  "\x28\x54\xf4\xa5\x75\xf1\x9b\x25"
+			  "\x62\x1c\xc0\xe0\x13\xc8\x87\x53"
+			  "\xd0\xf3\xa7\x97\x1f\x3b\x1e\xea"
+			  "\xe0\xe5\x2a\xd1\xdd\xa4\x3b\x50"
+			  "\x45\xa3\x0d\x7e\x1b\xc9\xa0\xad"
+			  "\xb9\x2c\x54\xa6\xc7\x55\x16\xd0"
+			  "\xc5\x2e\x02\x44\x35\xd0\x7e\x67"
+			  "\xf2\xc4\x9b\xcd\x95\x10\xcc\x29"
+			  "\x4b\xfa\x86\x87\xbe\x40\x36\xbe"
+			  "\xe1\xa3\x52\x89\x55\x20\x9b\xc2"
+			  "\xab\xf2\x31\x34\x16\xad\xc8\x17"
+			  "\x65\x24\xc0\xff\x12\x37\xfe\x5a"
+			  "\x62\x3b\x59\x47\x6c\x5f\x3a\x8e"
+			  "\x3b\xd9\x30\xc8\x7f\x2f\x88\xda"
+			  "\x80\xfd\x02\xda\x7f\x9a\x7a\x73"
+			  "\x59\xc5\x34\x09\x9a\x11\xcb\xa7"
+			  "\xfc\xf6\xa1\xa0\x60\xfb\x43\xbb"
+			  "\xf1\xe9\xd7\xc6\x79\x27\x4e\xff"
+			  "\x22\xb4\x24\xbf\x76\xee\x47\xb9"
+			  "\x6d\x3f\x8b\xb0\x9c\x3c\x43\xdd"
+			  "\xff\x25\x2e\x6d\xa4\x2b\xfb\x5d"
+			  "\x1b\x97\x6c\x55\x0a\x82\x7a\x7b"
+			  "\x94\x34\xc2\xdb\x2f\x1f\xc1\xea"
+			  "\xd4\x4d\x17\x46\x3b\x51\x69\x09"
+			  "\xe4\x99\x32\x25\xfd\x94\xaf\xfb"
+			  "\x10\xf7\x4f\xdd\x0b\x3c\x8b\x41"
+			  "\xb3\x6a\xb7\xd1\x33\xa8\x0c\x2f"
+			  "\x62\x4c\x72\x11\xd7\x74\xe1\x3b"
+			  "\x38\x43\x66\x7b\x6c\x36\x48\xe7"
+			  "\xe3\xe7\x9d\xb9\x42\x73\x7a\x2a"
+			  "\x89\x20\x1a\x41\x80\x03\xf7\x8f"
+			  "\x61\x78\x13\xbf\xfe\x50\xf5\x04"
+			  "\x52\xf9\xac\x47\xf8\x62\x4b\xb2"
+			  "\x24\xa9\xbf\x64\xb0\x18\x69\xd2"
+			  "\xf5\xe4\xce\xc8\xb1\x87\x75\xd6"
+			  "\x2c\x24\x79\x00\x7d\x26\xfb\x44"
+			  "\xe7\x45\x7a\xee\x58\xa5\x83\xc1"
+			  "\xb4\x24\xab\x23\x2f\x4d\xd7\x4f"
+			  "\x1c\xc7\xaa\xa9\x50\xf4\xa3\x07"
+			  "\x12\x13\x89\x74\xdc\x31\x6a\xb2"
+			  "\xf5\x0f\x13\x8b\xb9\xdb\x85\x1f"
+			  "\xf5\xbc\x88\xd9\x95\xea\x31\x6c"
+			  "\x36\x60\xb6\x49\xdc\xc4\xf7\x55"
+			  "\x3f\x21\xc1\xb5\x92\x18\x5e\xbc"
+			  "\x9f\x87\x7f\xe7\x79\x25\x40\x33"
+			  "\xd6\xb9\x33\xd5\x50\xb3\xc7\x89"
+			  "\x1b\x12\xa0\x46\xdd\xa7\xd8\x3e"
+			  "\x71\xeb\x6f\x66\xa1\x26\x0c\x67"
+			  "\xab\xb2\x38\x58\x17\xd8\x44\x3b"
+			  "\x16\xf0\x8e\x62\x8d\x16\x10\x00"
+			  "\x32\x8b\xef\xb9\x28\xd3\xc5\xad"
+			  "\x0a\x19\xa2\xe4\x03\x27\x7d\x94"
+			  "\x06\x18\xcd\xd6\x27\x00\xf9\x1f"
+			  "\xb6\xb3\xfe\x96\x35\x5f\xc4\x1c"
+			  "\x07\x62\x10\x79\x68\x50\xf1\x7e"
+			  "\x29\xe7\xc4\xc4\xe7\xee\x54\xd6"
+			  "\x58\x76\x84\x6d\x8d\xe4\x59\x31"
+			  "\xe9\xf4\xdc\xa1\x1f\xe5\x1a\xd6"
+			  "\xe6\x64\x46\xf5\x77\x9c\x60\x7a"
+			  "\x5e\x62\xe3\x0a\xd4\x9f\x7a\x2d"
+			  "\x7a\xa5\x0a\x7b\x29\x86\x7a\x74"
+			  "\x74\x71\x6b\xca\x7d\x1d\xaa\xba"
+			  "\x39\x84\x43\x76\x35\xfe\x4f\x9b"
+			  "\xbb\xbb\xb5\x6a\x32\xb5\x5d\x41"
+			  "\x51\xf0\x5b\x68\x03\x47\x4b\x8a"
+			  "\xca\x88\xf6\x37\xbd\x73\x51\x70"
+			  "\x66\xfe\x9e\x5f\x21\x9c\xf3\xdd"
+			  "\xc3\xea\x27\xf9\x64\x94\xe1\x19"
+			  "\xa0\xa9\xab\x60\xe0\x0e\xf7\x78"
+			  "\x70\x86\xeb\xe0\xd1\x5c\x05\xd3"
+			  "\xd7\xca\xe0\xc0\x47\x47\x34\xee"
+			  "\x11\xa3\xa3\x54\x98\xb7\x49\x8e"
+			  "\x84\x28\x70\x2c\x9e\xfb\x55\x54"
+			  "\x4d\xf8\x86\xf7\x85\x7c\xbd\xf3"
+			  "\x17\xd8\x47\xcb\xac\xf4\x20\x85"
+			  "\x34\x66\xad\x37\x2d\x5e\x52\xda"
+			  "\x8a\xfe\x98\x55\x30\xe7\x2d\x2b"
+			  "\x19\x10\x8e\x7b\x66\x5e\xdc\xe0"
+			  "\x45\x1f\x7b\xb4\x08\xfb\x8f\xf6"
+			  "\x8c\x89\x21\x34\x55\x27\xb2\x76"
+			  "\xb2\x07\xd9\xd6\x68\x9b\xea\x6b"
+			  "\x2d\xb4\xc4\x35\xdd\xd2\x79\xae"
+			  "\xc7\xd6\x26\x7f\x12\x01\x8c\xa7"
+			  "\xe3\xdb\xa8\xf4\xf7\x2b\xec\x99"
+			  "\x11\x00\xf1\x35\x8c\xcf\xd5\xc9"
+			  "\xbd\x91\x36\x39\x70\xcf\x7d\x70"
+			  "\x47\x1a\xfc\x6b\x56\xe0\x3f\x9c"
+			  "\x60\x49\x01\x72\xa9\xaf\x2c\x9c"
+			  "\xe8\xab\xda\x8c\x14\x19\xf3\x75"
+			  "\x07\x17\x9d\x44\x67\x7a\x2e\xef"
+			  "\xb7\x83\x35\x4a\xd1\x3d\x1c\x84"
+			  "\x32\xdd\xaa\xea\xca\x1d\xdc\x72"
+			  "\x2c\xcc\x43\xcd\x5d\xe3\x21\xa4"
+			  "\xd0\x8a\x4b\x20\x12\xa3\xd5\x86"
+			  "\x76\x96\xff\x5f\x04\x57\x0f\xe6"
+			  "\xba\xe8\x76\x50\x0c\x64\x1d\x83"
+			  "\x9c\x9b\x9a\x9a\x58\x97\x9c\x5c"
+			  "\xb4\xa4\xa6\x3e\x19\xeb\x8f\x5a"
+			  "\x61\xb2\x03\x7b\x35\x19\xbe\xa7"
+			  "\x63\x0c\xfd\xdd\xf9\x90\x6c\x08"
+			  "\x19\x11\xd3\x65\x4a\xf5\x96\x92"
+			  "\x59\xaa\x9c\x61\x0c\x29\xa7\xf8"
+			  "\x14\x39\x37\xbf\x3c\xf2\x16\x72"
+			  "\x02\xfa\xa2\xf3\x18\x67\x5d\xcb"
+			  "\xdc\x4d\xbb\x96\xff\x70\x08\x2d"
+			  "\xc2\xa8\x52\xe1\x34\x5f\x72\xfe"
+			  "\x64\xbf\xca\xa7\x74\x38\xfb\x74"
+			  "\x55\x9c\xfa\x8a\xed\xfb\x98\xeb"
+			  "\x58\x2e\x6c\xe1\x52\x76\x86\xd7"
+			  "\xcf\xa1\xa4\xfc\xb2\x47\x41\x28"
+			  "\xa3\xc1\xe5\xfd\x53\x19\x28\x2b"
+			  "\x37\x04\x65\x96\x99\x7a\x28\x0f"
+			  "\x07\x68\x4b\xc7\x52\x0a\x55\x35"
+			  "\x40\x19\x95\x61\xe8\x59\x40\x1f"
+			  "\x9d\xbf\x78\x7d\x8f\x84\xff\x6f"
+			  "\xd0\xd5\x63\xd2\x22\xbd\xc8\x4e"
+			  "\xfb\xe7\x9f\x06\xe6\xe7\x39\x6d"
+			  "\x6a\x96\x9f\xf0\x74\x7e\xc9\x35"
+			  "\xb7\x26\xb8\x1c\x0a\xa6\x27\x2c"
+			  "\xa2\x2b\xfe\xbe\x0f\x07\x73\xae"
+			  "\x7f\x7f\x54\xf5\x7c\x6a\x0a\x56"
+			  "\x49\xd4\x81\xe5\x85\x53\x99\x1f"
+			  "\x95\x05\x13\x58\x8d\x0e\x1b\x90"
+			  "\xc3\x75\x48\x64\x58\x98\x67\x84"
+			  "\xae\xe2\x21\xa2\x8a\x04\x0a\x0b"
+			  "\x61\xaa\xb0\xd4\x28\x60\x7a\xf8"
+			  "\xbc\x52\xfb\x24\x7f\xed\x0d\x2a"
+			  "\x0a\xb2\xf9\xc6\x95\xb5\x11\xc9"
+			  "\xf4\x0f\x26\x11\xcf\x2a\x57\x87"
+			  "\x7a\xf3\xe7\x94\x65\xc2\xb5\xb3"
+			  "\xab\x98\xe3\xc1\x2b\x59\x19\x7c"
+			  "\xd6\xf3\xf9\xbf\xff\x6d\xc6\x82"
+			  "\x13\x2f\x4a\x2e\xcd\x26\xfe\x2d"
+			  "\x01\x70\xf4\xc2\x7f\x1f\x4c\xcb"
+			  "\x47\x77\x0c\xa0\xa3\x03\xec\xda"
+			  "\xa9\xbf\x0d\x2d\xae\xe4\xb8\x7b"
+			  "\xa9\xbc\x08\xb4\x68\x2e\xc5\x60"
+			  "\x8d\x87\x41\x2b\x0f\x69\xf0\xaf"
+			  "\x5f\xba\x72\x20\x0f\x33\xcd\x6d"
+			  "\x36\x7d\x7b\xd5\x05\xf1\x4b\x05"
+			  "\xc4\xfc\x7f\x80\xb9\x4d\xbd\xf7"
+			  "\x7c\x84\x07\x01\xc2\x40\x66\x5b"
+			  "\x98\xc7\x2c\xe3\x97\xfa\xdf\x87"
+			  "\xa0\x1f\xe9\x21\x42\x0f\x3b\xeb"
+			  "\x89\x1c\x3b\xca\x83\x61\x77\x68"
+			  "\x84\xbb\x60\x87\x38\x2e\x25\xd5"
+			  "\x9e\x04\x41\x70\xac\xda\xc0\x9c"
+			  "\x9c\x69\xea\x8d\x4e\x55\x2a\x29"
+			  "\xed\x05\x4b\x7b\x73\x71\x90\x59"
+			  "\x4d\xc8\xd8\x44\xf0\x4c\xe1\x5e"
+			  "\x84\x47\x55\xcc\x32\x3f\xe7\x97"
+			  "\x42\xc6\x32\xac\x40\xe5\xa5\xc7"
+			  "\x8b\xed\xdb\xf7\x83\xd6\xb1\xc2"
+			  "\x52\x5e\x34\xb7\xeb\x6e\xd9\xfc"
+			  "\xe5\x93\x9a\x97\x3e\xb0\xdc\xd9"
+			  "\xd7\x06\x10\xb6\x1d\x80\x59\xdd"
+			  "\x0d\xfe\x64\x35\xcd\x5d\xec\xf0"
+			  "\xba\xd0\x34\xc9\x2d\x91\xc5\x17"
+			  "\x11",
+		.len	= 1281,
+		.also_non_np = 1,
+		.np	= 3,
+		.tap	= { 1200, 1, 80 },
+	},
+};
+
 /*
  * CTS (Cipher Text Stealing) mode tests
  */
diff --git a/include/crypto/chacha20.h b/include/crypto/chacha20.h
index fbec4e6a87890..6290d997060ec 100644
--- a/include/crypto/chacha20.h
+++ b/include/crypto/chacha20.h
@@ -1,6 +1,10 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 /*
- * Common values for the ChaCha20 algorithm
+ * Common values and helper functions for the ChaCha20 and XChaCha20 algorithms.
+ *
+ * XChaCha20 extends ChaCha20's nonce to 192 bits, while provably retaining
+ * ChaCha20's security.  Here they share the same key size, tfm context, and
+ * setkey function; only their IV size and encrypt/decrypt function differ.
  */
 
 #ifndef _CRYPTO_CHACHA20_H
@@ -10,10 +14,15 @@
 #include <linux/types.h>
 #include <linux/crypto.h>
 
+/* 32-bit stream position, then 96-bit nonce (RFC7539 convention) */
 #define CHACHA20_IV_SIZE	16
+
 #define CHACHA20_KEY_SIZE	32
 #define CHACHA20_BLOCK_SIZE	64
 
+/* 192-bit nonce, then 64-bit stream position */
+#define XCHACHA20_IV_SIZE	32
+
 struct chacha20_ctx {
 	u32 key[8];
 };
@@ -22,8 +31,11 @@ void chacha20_block(u32 *state, u8 *stream);
 void hchacha20_block(const u32 *in, u32 *out);
 
 void crypto_chacha20_init(u32 *state, struct chacha20_ctx *ctx, u8 *iv);
+
 int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
 			   unsigned int keysize);
+
 int crypto_chacha20_crypt(struct skcipher_request *req);
+int crypto_xchacha20_crypt(struct skcipher_request *req);
 
 #endif
-- 
2.19.1.331.ge82ca0e54c-goog
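
The two IV layouts documented in the header above can be related with a
short sketch.  It assumes the usual XChaCha construction: the first 128
nonce bits are fed to HChaCha20 to derive a subkey, and the remaining 64
nonce bits plus the 64-bit stream position form the 16-byte IV for the
underlying ChaCha20 call.  The function name xchacha_split_iv_sketch()
is hypothetical and only illustrates the byte rearrangement; it is not
taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CHACHA20_IV_SIZE	16
#define XCHACHA20_IV_SIZE	32

/*
 * Split a 32-byte XChaCha20 IV (192-bit nonce, then 64-bit stream position)
 * into the 16 bytes fed to HChaCha20 and the 16-byte IV for ChaCha20.
 * Hypothetical helper, for illustration only.
 */
static void xchacha_split_iv_sketch(const uint8_t *xiv, uint8_t *hchacha_in,
				    uint8_t *chacha_iv)
{
	assert(xiv && hchacha_in && chacha_iv);

	/* First 128 nonce bits: input to HChaCha20 (subkey derivation) */
	memcpy(hchacha_in, xiv, 16);

	/* ChaCha20 IV: 64-bit stream position, then the last 64 nonce bits */
	memcpy(chacha_iv, xiv + 24, 8);
	memcpy(chacha_iv + 8, xiv + 16, 8);
}
```

Under this assumption the rearranged IV starts with the stream position,
so it occupies the same leading bytes as the RFC7539-style counter in a
16-byte ChaCha20 IV.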



* [RFC PATCH v2 03/12] crypto: chacha20-generic - refactor to allow varying number of rounds
  2018-10-15 17:54 [RFC PATCH v2 00/12] crypto: Adiantum support Eric Biggers
  2018-10-15 17:54 ` [RFC PATCH v2 01/12] crypto: chacha20-generic - add HChaCha20 library function Eric Biggers
  2018-10-15 17:54 ` [RFC PATCH v2 02/12] crypto: chacha20-generic - add XChaCha20 support Eric Biggers
@ 2018-10-15 17:54 ` Eric Biggers
  2018-10-19 14:25   ` Ard Biesheuvel
  2018-10-15 17:54 ` [RFC PATCH v2 04/12] crypto: chacha - add XChaCha12 support Eric Biggers
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 54+ messages in thread
From: Eric Biggers @ 2018-10-15 17:54 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-fscrypt, linux-arm-kernel, linux-kernel, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

From: Eric Biggers <ebiggers@google.com>

In preparation for adding XChaCha12 support, rename/refactor
chacha20-generic to support different numbers of rounds.  The
justification for needing XChaCha12 support is explained in more detail
in the patch "crypto: chacha - add XChaCha12 support".

The only difference between ChaCha{8,12,20} is the number of rounds; all
other parts of the algorithm are the same.  Therefore, remove the "20"
from all definitions, structures, functions, files, etc. that will be
shared by all ChaCha versions.

Also make ->setkey() store the round count in the chacha_ctx (previously
chacha20_ctx).  The generic code then passes the round count through to
chacha_block().  There will be a ->setkey() function for each explicitly
allowed round count; the encrypt/decrypt functions will be the same.  I
decided not to do it the opposite way (same ->setkey() function for all
round counts, with different encrypt/decrypt functions) because that
would have required more boilerplate code in architecture-specific
implementations of ChaCha and XChaCha.

To be as careful as possible, we whitelist the allowed round counts in
the low-level generic code.  Currently only 20 is allowed, i.e. no
actual use of other variants is introduced by this patch.

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
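
The design described in the commit message above can be sketched in
userspace C roughly as follows.  This is only an illustration under
stated assumptions: every name ending in _sketch is hypothetical, the
round-count check is modeled as a plain return code and an assert()
rather than whatever the kernel code uses, and only the state layout and
constants mirror crypto_chacha_init() as shown in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct chacha_ctx: key plus round count. */
struct chacha_ctx_sketch {
	uint32_t key[8];
	int nrounds;
};

/* Whitelist of round counts; only 20 is allowed at this point. */
static int chacha_rounds_allowed(int nrounds)
{
	return nrounds == 20;
}

static uint32_t rol32(uint32_t v, int n)
{
	return (v << n) | (v >> (32 - n));
}

static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Shared setkey helper: stores the round count next to the key. */
static int chacha_setkey_sketch(struct chacha_ctx_sketch *ctx,
				const uint8_t *key, size_t keysize,
				int nrounds)
{
	if (keysize != 32 || !chacha_rounds_allowed(nrounds))
		return -1;
	for (int i = 0; i < 8; i++)
		ctx->key[i] = get_le32(key + 4 * i);
	ctx->nrounds = nrounds;
	return 0;
}

/* One thin ->setkey() per explicitly allowed round count. */
static int chacha20_setkey_sketch(struct chacha_ctx_sketch *ctx,
				  const uint8_t *key, size_t keysize)
{
	return chacha_setkey_sketch(ctx, key, keysize, 20);
}

static void chacha_init_sketch(uint32_t state[16],
			       const struct chacha_ctx_sketch *ctx,
			       const uint8_t iv[16])
{
	state[0] = 0x61707865; /* "expa" */
	state[1] = 0x3320646e; /* "nd 3" */
	state[2] = 0x79622d32; /* "2-by" */
	state[3] = 0x6b206574; /* "te k" */
	for (int i = 0; i < 8; i++)
		state[4 + i] = ctx->key[i];
	for (int i = 0; i < 4; i++)
		state[12 + i] = get_le32(iv + 4 * i);
}

#define QR(a, b, c, d) do {			\
	a += b; d = rol32(d ^ a, 16);		\
	c += d; b = rol32(b ^ c, 12);		\
	a += b; d = rol32(d ^ a, 8);		\
	c += d; b = rol32(b ^ c, 7);		\
} while (0)

/* Emit one 64-byte keystream block, then advance the block counter. */
static void chacha_block_sketch(uint32_t state[16], uint8_t stream[64],
				int nrounds)
{
	uint32_t x[16];

	assert(chacha_rounds_allowed(nrounds));
	memcpy(x, state, sizeof(x));
	for (int i = 0; i < nrounds; i += 2) {
		/* column round */
		QR(x[0], x[4], x[8], x[12]);
		QR(x[1], x[5], x[9], x[13]);
		QR(x[2], x[6], x[10], x[14]);
		QR(x[3], x[7], x[11], x[15]);
		/* diagonal round */
		QR(x[0], x[5], x[10], x[15]);
		QR(x[1], x[6], x[11], x[12]);
		QR(x[2], x[7], x[8], x[13]);
		QR(x[3], x[4], x[9], x[14]);
	}
	for (int i = 0; i < 16; i++) {
		uint32_t v = x[i] + state[i];

		stream[4 * i + 0] = (uint8_t)v;
		stream[4 * i + 1] = (uint8_t)(v >> 8);
		stream[4 * i + 2] = (uint8_t)(v >> 16);
		stream[4 * i + 3] = (uint8_t)(v >> 24);
	}
	state[12]++;
}
```

A caller would set the key once through the per-variant wrapper and then
pass ctx->nrounds to the block function, so the encrypt/decrypt path is
identical for every round count.  Supporting another variant would then
only need a new wrapper and a whitelist entry.
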
 arch/arm/crypto/chacha20-neon-glue.c          |  40 +++----
 arch/arm64/crypto/chacha20-neon-glue.c        |  40 +++----
 arch/x86/crypto/chacha20_glue.c               |  52 ++++-----
 crypto/Makefile                               |   2 +-
 crypto/chacha20poly1305.c                     |  10 +-
 .../{chacha20_generic.c => chacha_generic.c}  | 110 ++++++++++--------
 drivers/char/random.c                         |  51 ++++----
 include/crypto/chacha.h                       |  46 ++++++++
 include/crypto/chacha20.h                     |  41 -------
 lib/Makefile                                  |   2 +-
 lib/{chacha20.c => chacha.c}                  |  43 ++++---
 11 files changed, 227 insertions(+), 210 deletions(-)
 rename crypto/{chacha20_generic.c => chacha_generic.c} (57%)
 create mode 100644 include/crypto/chacha.h
 delete mode 100644 include/crypto/chacha20.h
 rename lib/{chacha20.c => chacha.c} (67%)

diff --git a/arch/arm/crypto/chacha20-neon-glue.c b/arch/arm/crypto/chacha20-neon-glue.c
index 59a7be08e80ce..7386eb1c1889d 100644
--- a/arch/arm/crypto/chacha20-neon-glue.c
+++ b/arch/arm/crypto/chacha20-neon-glue.c
@@ -19,7 +19,7 @@
  */
 
 #include <crypto/algapi.h>
-#include <crypto/chacha20.h>
+#include <crypto/chacha.h>
 #include <crypto/internal/skcipher.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
@@ -34,20 +34,20 @@ asmlinkage void chacha20_4block_xor_neon(u32 *state, u8 *dst, const u8 *src);
 static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
 			    unsigned int bytes)
 {
-	u8 buf[CHACHA20_BLOCK_SIZE];
+	u8 buf[CHACHA_BLOCK_SIZE];
 
-	while (bytes >= CHACHA20_BLOCK_SIZE * 4) {
+	while (bytes >= CHACHA_BLOCK_SIZE * 4) {
 		chacha20_4block_xor_neon(state, dst, src);
-		bytes -= CHACHA20_BLOCK_SIZE * 4;
-		src += CHACHA20_BLOCK_SIZE * 4;
-		dst += CHACHA20_BLOCK_SIZE * 4;
+		bytes -= CHACHA_BLOCK_SIZE * 4;
+		src += CHACHA_BLOCK_SIZE * 4;
+		dst += CHACHA_BLOCK_SIZE * 4;
 		state[12] += 4;
 	}
-	while (bytes >= CHACHA20_BLOCK_SIZE) {
+	while (bytes >= CHACHA_BLOCK_SIZE) {
 		chacha20_block_xor_neon(state, dst, src);
-		bytes -= CHACHA20_BLOCK_SIZE;
-		src += CHACHA20_BLOCK_SIZE;
-		dst += CHACHA20_BLOCK_SIZE;
+		bytes -= CHACHA_BLOCK_SIZE;
+		src += CHACHA_BLOCK_SIZE;
+		dst += CHACHA_BLOCK_SIZE;
 		state[12]++;
 	}
 	if (bytes) {
@@ -60,17 +60,17 @@ static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
 static int chacha20_neon(struct skcipher_request *req)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
-	struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
 	struct skcipher_walk walk;
 	u32 state[16];
 	int err;
 
-	if (req->cryptlen <= CHACHA20_BLOCK_SIZE || !may_use_simd())
-		return crypto_chacha20_crypt(req);
+	if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd())
+		return crypto_chacha_crypt(req);
 
 	err = skcipher_walk_virt(&walk, req, true);
 
-	crypto_chacha20_init(state, ctx, walk.iv);
+	crypto_chacha_init(state, ctx, walk.iv);
 
 	kernel_neon_begin();
 	while (walk.nbytes > 0) {
@@ -93,14 +93,14 @@ static struct skcipher_alg alg = {
 	.base.cra_driver_name	= "chacha20-neon",
 	.base.cra_priority	= 300,
 	.base.cra_blocksize	= 1,
-	.base.cra_ctxsize	= sizeof(struct chacha20_ctx),
+	.base.cra_ctxsize	= sizeof(struct chacha_ctx),
 	.base.cra_module	= THIS_MODULE,
 
-	.min_keysize		= CHACHA20_KEY_SIZE,
-	.max_keysize		= CHACHA20_KEY_SIZE,
-	.ivsize			= CHACHA20_IV_SIZE,
-	.chunksize		= CHACHA20_BLOCK_SIZE,
-	.walksize		= 4 * CHACHA20_BLOCK_SIZE,
+	.min_keysize		= CHACHA_KEY_SIZE,
+	.max_keysize		= CHACHA_KEY_SIZE,
+	.ivsize			= CHACHA_IV_SIZE,
+	.chunksize		= CHACHA_BLOCK_SIZE,
+	.walksize		= 4 * CHACHA_BLOCK_SIZE,
 	.setkey			= crypto_chacha20_setkey,
 	.encrypt		= chacha20_neon,
 	.decrypt		= chacha20_neon,
diff --git a/arch/arm64/crypto/chacha20-neon-glue.c b/arch/arm64/crypto/chacha20-neon-glue.c
index 727579c93dedb..96e0cfb8c3f5b 100644
--- a/arch/arm64/crypto/chacha20-neon-glue.c
+++ b/arch/arm64/crypto/chacha20-neon-glue.c
@@ -19,7 +19,7 @@
  */
 
 #include <crypto/algapi.h>
-#include <crypto/chacha20.h>
+#include <crypto/chacha.h>
 #include <crypto/internal/skcipher.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
@@ -34,15 +34,15 @@ asmlinkage void chacha20_4block_xor_neon(u32 *state, u8 *dst, const u8 *src);
 static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
 			    unsigned int bytes)
 {
-	u8 buf[CHACHA20_BLOCK_SIZE];
+	u8 buf[CHACHA_BLOCK_SIZE];
 
-	while (bytes >= CHACHA20_BLOCK_SIZE * 4) {
+	while (bytes >= CHACHA_BLOCK_SIZE * 4) {
 		kernel_neon_begin();
 		chacha20_4block_xor_neon(state, dst, src);
 		kernel_neon_end();
-		bytes -= CHACHA20_BLOCK_SIZE * 4;
-		src += CHACHA20_BLOCK_SIZE * 4;
-		dst += CHACHA20_BLOCK_SIZE * 4;
+		bytes -= CHACHA_BLOCK_SIZE * 4;
+		src += CHACHA_BLOCK_SIZE * 4;
+		dst += CHACHA_BLOCK_SIZE * 4;
 		state[12] += 4;
 	}
 
@@ -50,11 +50,11 @@ static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
 		return;
 
 	kernel_neon_begin();
-	while (bytes >= CHACHA20_BLOCK_SIZE) {
+	while (bytes >= CHACHA_BLOCK_SIZE) {
 		chacha20_block_xor_neon(state, dst, src);
-		bytes -= CHACHA20_BLOCK_SIZE;
-		src += CHACHA20_BLOCK_SIZE;
-		dst += CHACHA20_BLOCK_SIZE;
+		bytes -= CHACHA_BLOCK_SIZE;
+		src += CHACHA_BLOCK_SIZE;
+		dst += CHACHA_BLOCK_SIZE;
 		state[12]++;
 	}
 	if (bytes) {
@@ -68,17 +68,17 @@ static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
 static int chacha20_neon(struct skcipher_request *req)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
-	struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
 	struct skcipher_walk walk;
 	u32 state[16];
 	int err;
 
-	if (!may_use_simd() || req->cryptlen <= CHACHA20_BLOCK_SIZE)
-		return crypto_chacha20_crypt(req);
+	if (!may_use_simd() || req->cryptlen <= CHACHA_BLOCK_SIZE)
+		return crypto_chacha_crypt(req);
 
 	err = skcipher_walk_virt(&walk, req, false);
 
-	crypto_chacha20_init(state, ctx, walk.iv);
+	crypto_chacha_init(state, ctx, walk.iv);
 
 	while (walk.nbytes > 0) {
 		unsigned int nbytes = walk.nbytes;
@@ -99,14 +99,14 @@ static struct skcipher_alg alg = {
 	.base.cra_driver_name	= "chacha20-neon",
 	.base.cra_priority	= 300,
 	.base.cra_blocksize	= 1,
-	.base.cra_ctxsize	= sizeof(struct chacha20_ctx),
+	.base.cra_ctxsize	= sizeof(struct chacha_ctx),
 	.base.cra_module	= THIS_MODULE,
 
-	.min_keysize		= CHACHA20_KEY_SIZE,
-	.max_keysize		= CHACHA20_KEY_SIZE,
-	.ivsize			= CHACHA20_IV_SIZE,
-	.chunksize		= CHACHA20_BLOCK_SIZE,
-	.walksize		= 4 * CHACHA20_BLOCK_SIZE,
+	.min_keysize		= CHACHA_KEY_SIZE,
+	.max_keysize		= CHACHA_KEY_SIZE,
+	.ivsize			= CHACHA_IV_SIZE,
+	.chunksize		= CHACHA_BLOCK_SIZE,
+	.walksize		= 4 * CHACHA_BLOCK_SIZE,
 	.setkey			= crypto_chacha20_setkey,
 	.encrypt		= chacha20_neon,
 	.decrypt		= chacha20_neon,
diff --git a/arch/x86/crypto/chacha20_glue.c b/arch/x86/crypto/chacha20_glue.c
index dce7c5d39c2f2..bd249f0b29dc2 100644
--- a/arch/x86/crypto/chacha20_glue.c
+++ b/arch/x86/crypto/chacha20_glue.c
@@ -10,7 +10,7 @@
  */
 
 #include <crypto/algapi.h>
-#include <crypto/chacha20.h>
+#include <crypto/chacha.h>
 #include <crypto/internal/skcipher.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
@@ -29,31 +29,31 @@ static bool chacha20_use_avx2;
 static void chacha20_dosimd(u32 *state, u8 *dst, const u8 *src,
 			    unsigned int bytes)
 {
-	u8 buf[CHACHA20_BLOCK_SIZE];
+	u8 buf[CHACHA_BLOCK_SIZE];
 
 #ifdef CONFIG_AS_AVX2
 	if (chacha20_use_avx2) {
-		while (bytes >= CHACHA20_BLOCK_SIZE * 8) {
+		while (bytes >= CHACHA_BLOCK_SIZE * 8) {
 			chacha20_8block_xor_avx2(state, dst, src);
-			bytes -= CHACHA20_BLOCK_SIZE * 8;
-			src += CHACHA20_BLOCK_SIZE * 8;
-			dst += CHACHA20_BLOCK_SIZE * 8;
+			bytes -= CHACHA_BLOCK_SIZE * 8;
+			src += CHACHA_BLOCK_SIZE * 8;
+			dst += CHACHA_BLOCK_SIZE * 8;
 			state[12] += 8;
 		}
 	}
 #endif
-	while (bytes >= CHACHA20_BLOCK_SIZE * 4) {
+	while (bytes >= CHACHA_BLOCK_SIZE * 4) {
 		chacha20_4block_xor_ssse3(state, dst, src);
-		bytes -= CHACHA20_BLOCK_SIZE * 4;
-		src += CHACHA20_BLOCK_SIZE * 4;
-		dst += CHACHA20_BLOCK_SIZE * 4;
+		bytes -= CHACHA_BLOCK_SIZE * 4;
+		src += CHACHA_BLOCK_SIZE * 4;
+		dst += CHACHA_BLOCK_SIZE * 4;
 		state[12] += 4;
 	}
-	while (bytes >= CHACHA20_BLOCK_SIZE) {
+	while (bytes >= CHACHA_BLOCK_SIZE) {
 		chacha20_block_xor_ssse3(state, dst, src);
-		bytes -= CHACHA20_BLOCK_SIZE;
-		src += CHACHA20_BLOCK_SIZE;
-		dst += CHACHA20_BLOCK_SIZE;
+		bytes -= CHACHA_BLOCK_SIZE;
+		src += CHACHA_BLOCK_SIZE;
+		dst += CHACHA_BLOCK_SIZE;
 		state[12]++;
 	}
 	if (bytes) {
@@ -66,7 +66,7 @@ static void chacha20_dosimd(u32 *state, u8 *dst, const u8 *src,
 static int chacha20_simd(struct skcipher_request *req)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
-	struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
 	u32 *state, state_buf[16 + 2] __aligned(8);
 	struct skcipher_walk walk;
 	int err;
@@ -74,20 +74,20 @@ static int chacha20_simd(struct skcipher_request *req)
 	BUILD_BUG_ON(CHACHA20_STATE_ALIGN != 16);
 	state = PTR_ALIGN(state_buf + 0, CHACHA20_STATE_ALIGN);
 
-	if (req->cryptlen <= CHACHA20_BLOCK_SIZE || !may_use_simd())
-		return crypto_chacha20_crypt(req);
+	if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd())
+		return crypto_chacha_crypt(req);
 
 	err = skcipher_walk_virt(&walk, req, true);
 
-	crypto_chacha20_init(state, ctx, walk.iv);
+	crypto_chacha_init(state, ctx, walk.iv);
 
 	kernel_fpu_begin();
 
-	while (walk.nbytes >= CHACHA20_BLOCK_SIZE) {
+	while (walk.nbytes >= CHACHA_BLOCK_SIZE) {
 		chacha20_dosimd(state, walk.dst.virt.addr, walk.src.virt.addr,
-				rounddown(walk.nbytes, CHACHA20_BLOCK_SIZE));
+				rounddown(walk.nbytes, CHACHA_BLOCK_SIZE));
 		err = skcipher_walk_done(&walk,
-					 walk.nbytes % CHACHA20_BLOCK_SIZE);
+					 walk.nbytes % CHACHA_BLOCK_SIZE);
 	}
 
 	if (walk.nbytes) {
@@ -106,13 +106,13 @@ static struct skcipher_alg alg = {
 	.base.cra_driver_name	= "chacha20-simd",
 	.base.cra_priority	= 300,
 	.base.cra_blocksize	= 1,
-	.base.cra_ctxsize	= sizeof(struct chacha20_ctx),
+	.base.cra_ctxsize	= sizeof(struct chacha_ctx),
 	.base.cra_module	= THIS_MODULE,
 
-	.min_keysize		= CHACHA20_KEY_SIZE,
-	.max_keysize		= CHACHA20_KEY_SIZE,
-	.ivsize			= CHACHA20_IV_SIZE,
-	.chunksize		= CHACHA20_BLOCK_SIZE,
+	.min_keysize		= CHACHA_KEY_SIZE,
+	.max_keysize		= CHACHA_KEY_SIZE,
+	.ivsize			= CHACHA_IV_SIZE,
+	.chunksize		= CHACHA_BLOCK_SIZE,
 	.setkey			= crypto_chacha20_setkey,
 	.encrypt		= chacha20_simd,
 	.decrypt		= chacha20_simd,
diff --git a/crypto/Makefile b/crypto/Makefile
index 5c207c76abf7e..7e673f7c71107 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -116,7 +116,7 @@ obj-$(CONFIG_CRYPTO_KHAZAD) += khazad.o
 obj-$(CONFIG_CRYPTO_ANUBIS) += anubis.o
 obj-$(CONFIG_CRYPTO_SEED) += seed.o
 obj-$(CONFIG_CRYPTO_SALSA20) += salsa20_generic.o
-obj-$(CONFIG_CRYPTO_CHACHA20) += chacha20_generic.o
+obj-$(CONFIG_CRYPTO_CHACHA20) += chacha_generic.o
 obj-$(CONFIG_CRYPTO_POLY1305) += poly1305_generic.o
 obj-$(CONFIG_CRYPTO_DEFLATE) += deflate.o
 obj-$(CONFIG_CRYPTO_MICHAEL_MIC) += michael_mic.o
diff --git a/crypto/chacha20poly1305.c b/crypto/chacha20poly1305.c
index 600afa99941fe..573c07e6f189e 100644
--- a/crypto/chacha20poly1305.c
+++ b/crypto/chacha20poly1305.c
@@ -13,7 +13,7 @@
 #include <crypto/internal/hash.h>
 #include <crypto/internal/skcipher.h>
 #include <crypto/scatterwalk.h>
-#include <crypto/chacha20.h>
+#include <crypto/chacha.h>
 #include <crypto/poly1305.h>
 #include <linux/err.h>
 #include <linux/init.h>
@@ -51,7 +51,7 @@ struct poly_req {
 };
 
 struct chacha_req {
-	u8 iv[CHACHA20_IV_SIZE];
+	u8 iv[CHACHA_IV_SIZE];
 	struct scatterlist src[1];
 	struct skcipher_request req; /* must be last member */
 };
@@ -91,7 +91,7 @@ static void chacha_iv(u8 *iv, struct aead_request *req, u32 icb)
 	memcpy(iv, &leicb, sizeof(leicb));
 	memcpy(iv + sizeof(leicb), ctx->salt, ctx->saltlen);
 	memcpy(iv + sizeof(leicb) + ctx->saltlen, req->iv,
-	       CHACHA20_IV_SIZE - sizeof(leicb) - ctx->saltlen);
+	       CHACHA_IV_SIZE - sizeof(leicb) - ctx->saltlen);
 }
 
 static int poly_verify_tag(struct aead_request *req)
@@ -494,7 +494,7 @@ static int chachapoly_setkey(struct crypto_aead *aead, const u8 *key,
 	struct chachapoly_ctx *ctx = crypto_aead_ctx(aead);
 	int err;
 
-	if (keylen != ctx->saltlen + CHACHA20_KEY_SIZE)
+	if (keylen != ctx->saltlen + CHACHA_KEY_SIZE)
 		return -EINVAL;
 
 	keylen -= ctx->saltlen;
@@ -639,7 +639,7 @@ static int chachapoly_create(struct crypto_template *tmpl, struct rtattr **tb,
 
 	err = -EINVAL;
 	/* Need 16-byte IV size, including Initial Block Counter value */
-	if (crypto_skcipher_alg_ivsize(chacha) != CHACHA20_IV_SIZE)
+	if (crypto_skcipher_alg_ivsize(chacha) != CHACHA_IV_SIZE)
 		goto out_drop_chacha;
 	/* Not a stream cipher? */
 	if (chacha->base.cra_blocksize != 1)
diff --git a/crypto/chacha20_generic.c b/crypto/chacha_generic.c
similarity index 57%
rename from crypto/chacha20_generic.c
rename to crypto/chacha_generic.c
index 07902fe37aeb8..8e25e9930c549 100644
--- a/crypto/chacha20_generic.c
+++ b/crypto/chacha_generic.c
@@ -12,33 +12,33 @@
 
 #include <asm/unaligned.h>
 #include <crypto/algapi.h>
-#include <crypto/chacha20.h>
+#include <crypto/chacha.h>
 #include <crypto/internal/skcipher.h>
 #include <linux/module.h>
 
-static void chacha20_docrypt(u32 *state, u8 *dst, const u8 *src,
-			     unsigned int bytes)
+static void chacha_docrypt(u32 *state, u8 *dst, const u8 *src,
+			   unsigned int bytes, int nrounds)
 {
 	/* aligned to potentially speed up crypto_xor() */
-	u8 stream[CHACHA20_BLOCK_SIZE] __aligned(sizeof(long));
+	u8 stream[CHACHA_BLOCK_SIZE] __aligned(sizeof(long));
 
 	if (dst != src)
 		memcpy(dst, src, bytes);
 
-	while (bytes >= CHACHA20_BLOCK_SIZE) {
-		chacha20_block(state, stream);
-		crypto_xor(dst, stream, CHACHA20_BLOCK_SIZE);
-		bytes -= CHACHA20_BLOCK_SIZE;
-		dst += CHACHA20_BLOCK_SIZE;
+	while (bytes >= CHACHA_BLOCK_SIZE) {
+		chacha_block(state, stream, nrounds);
+		crypto_xor(dst, stream, CHACHA_BLOCK_SIZE);
+		bytes -= CHACHA_BLOCK_SIZE;
+		dst += CHACHA_BLOCK_SIZE;
 	}
 	if (bytes) {
-		chacha20_block(state, stream);
+		chacha_block(state, stream, nrounds);
 		crypto_xor(dst, stream, bytes);
 	}
 }
 
-static int chacha20_stream_xor(struct skcipher_request *req,
-			       struct chacha20_ctx *ctx, u8 *iv)
+static int chacha_stream_xor(struct skcipher_request *req,
+			     struct chacha_ctx *ctx, u8 *iv)
 {
 	struct skcipher_walk walk;
 	u32 state[16];
@@ -46,7 +46,7 @@ static int chacha20_stream_xor(struct skcipher_request *req,
 
 	err = skcipher_walk_virt(&walk, req, true);
 
-	crypto_chacha20_init(state, ctx, iv);
+	crypto_chacha_init(state, ctx, iv);
 
 	while (walk.nbytes > 0) {
 		unsigned int nbytes = walk.nbytes;
@@ -54,15 +54,15 @@ static int chacha20_stream_xor(struct skcipher_request *req,
 		if (nbytes < walk.total)
 			nbytes = round_down(nbytes, walk.stride);
 
-		chacha20_docrypt(state, walk.dst.virt.addr, walk.src.virt.addr,
-				 nbytes);
+		chacha_docrypt(state, walk.dst.virt.addr, walk.src.virt.addr,
+			       nbytes, ctx->nrounds);
 		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
 	}
 
 	return err;
 }
 
-void crypto_chacha20_init(u32 *state, struct chacha20_ctx *ctx, u8 *iv)
+void crypto_chacha_init(u32 *state, struct chacha_ctx *ctx, u8 *iv)
 {
 	state[0]  = 0x61707865; /* "expa" */
 	state[1]  = 0x3320646e; /* "nd 3" */
@@ -81,53 +81,61 @@ void crypto_chacha20_init(u32 *state, struct chacha20_ctx *ctx, u8 *iv)
 	state[14] = get_unaligned_le32(iv +  8);
 	state[15] = get_unaligned_le32(iv + 12);
 }
-EXPORT_SYMBOL_GPL(crypto_chacha20_init);
+EXPORT_SYMBOL_GPL(crypto_chacha_init);
 
-int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
-			   unsigned int keysize)
+static int chacha_setkey(struct crypto_skcipher *tfm, const u8 *key,
+			 unsigned int keysize, int nrounds)
 {
-	struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
 	int i;
 
-	if (keysize != CHACHA20_KEY_SIZE)
+	if (keysize != CHACHA_KEY_SIZE)
 		return -EINVAL;
 
 	for (i = 0; i < ARRAY_SIZE(ctx->key); i++)
 		ctx->key[i] = get_unaligned_le32(key + i * sizeof(u32));
 
+	ctx->nrounds = nrounds;
 	return 0;
 }
+
+int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
+			   unsigned int keysize)
+{
+	return chacha_setkey(tfm, key, keysize, 20);
+}
 EXPORT_SYMBOL_GPL(crypto_chacha20_setkey);
 
-int crypto_chacha20_crypt(struct skcipher_request *req)
+int crypto_chacha_crypt(struct skcipher_request *req)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
-	struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
 
-	return chacha20_stream_xor(req, ctx, req->iv);
+	return chacha_stream_xor(req, ctx, req->iv);
 }
-EXPORT_SYMBOL_GPL(crypto_chacha20_crypt);
+EXPORT_SYMBOL_GPL(crypto_chacha_crypt);
 
-int crypto_xchacha20_crypt(struct skcipher_request *req)
+int crypto_xchacha_crypt(struct skcipher_request *req)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
-	struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
-	struct chacha20_ctx subctx;
+	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct chacha_ctx subctx;
 	u32 state[16];
 	u8 real_iv[16];
 
 	/* Compute the subkey given the original key and first 128 nonce bits */
-	crypto_chacha20_init(state, ctx, req->iv);
-	hchacha20_block(state, subctx.key);
+	crypto_chacha_init(state, ctx, req->iv);
+	hchacha_block(state, subctx.key, ctx->nrounds);
+	subctx.nrounds = ctx->nrounds;
 
 	/* Build the real IV */
 	memcpy(&real_iv[0], req->iv + 24, 8); /* stream position */
 	memcpy(&real_iv[8], req->iv + 16, 8); /* remaining 64 nonce bits */
 
 	/* Generate the stream and XOR it with the data */
-	return chacha20_stream_xor(req, &subctx, real_iv);
+	return chacha_stream_xor(req, &subctx, real_iv);
 }
-EXPORT_SYMBOL_GPL(crypto_xchacha20_crypt);
+EXPORT_SYMBOL_GPL(crypto_xchacha_crypt);
 
 static struct skcipher_alg algs[] = {
 	{
@@ -135,50 +143,50 @@ static struct skcipher_alg algs[] = {
 		.base.cra_driver_name	= "chacha20-generic",
 		.base.cra_priority	= 100,
 		.base.cra_blocksize	= 1,
-		.base.cra_ctxsize	= sizeof(struct chacha20_ctx),
+		.base.cra_ctxsize	= sizeof(struct chacha_ctx),
 		.base.cra_module	= THIS_MODULE,
 
-		.min_keysize		= CHACHA20_KEY_SIZE,
-		.max_keysize		= CHACHA20_KEY_SIZE,
-		.ivsize			= CHACHA20_IV_SIZE,
-		.chunksize		= CHACHA20_BLOCK_SIZE,
+		.min_keysize		= CHACHA_KEY_SIZE,
+		.max_keysize		= CHACHA_KEY_SIZE,
+		.ivsize			= CHACHA_IV_SIZE,
+		.chunksize		= CHACHA_BLOCK_SIZE,
 		.setkey			= crypto_chacha20_setkey,
-		.encrypt		= crypto_chacha20_crypt,
-		.decrypt		= crypto_chacha20_crypt,
+		.encrypt		= crypto_chacha_crypt,
+		.decrypt		= crypto_chacha_crypt,
 	}, {
 		.base.cra_name		= "xchacha20",
 		.base.cra_driver_name	= "xchacha20-generic",
 		.base.cra_priority	= 100,
 		.base.cra_blocksize	= 1,
-		.base.cra_ctxsize	= sizeof(struct chacha20_ctx),
+		.base.cra_ctxsize	= sizeof(struct chacha_ctx),
 		.base.cra_module	= THIS_MODULE,
 
-		.min_keysize		= CHACHA20_KEY_SIZE,
-		.max_keysize		= CHACHA20_KEY_SIZE,
-		.ivsize			= XCHACHA20_IV_SIZE,
-		.chunksize		= CHACHA20_BLOCK_SIZE,
+		.min_keysize		= CHACHA_KEY_SIZE,
+		.max_keysize		= CHACHA_KEY_SIZE,
+		.ivsize			= XCHACHA_IV_SIZE,
+		.chunksize		= CHACHA_BLOCK_SIZE,
 		.setkey			= crypto_chacha20_setkey,
-		.encrypt		= crypto_xchacha20_crypt,
-		.decrypt		= crypto_xchacha20_crypt,
+		.encrypt		= crypto_xchacha_crypt,
+		.decrypt		= crypto_xchacha_crypt,
 	}
 };
 
-static int __init chacha20_generic_mod_init(void)
+static int __init chacha_generic_mod_init(void)
 {
 	return crypto_register_skciphers(algs, ARRAY_SIZE(algs));
 }
 
-static void __exit chacha20_generic_mod_fini(void)
+static void __exit chacha_generic_mod_fini(void)
 {
 	crypto_unregister_skciphers(algs, ARRAY_SIZE(algs));
 }
 
-module_init(chacha20_generic_mod_init);
-module_exit(chacha20_generic_mod_fini);
+module_init(chacha_generic_mod_init);
+module_exit(chacha_generic_mod_fini);
 
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Martin Willi <martin@strongswan.org>");
-MODULE_DESCRIPTION("ChaCha20 and XChaCha20 stream ciphers (generic)");
+MODULE_DESCRIPTION("ChaCha and XChaCha stream ciphers (generic)");
 MODULE_ALIAS_CRYPTO("chacha20");
 MODULE_ALIAS_CRYPTO("chacha20-generic");
 MODULE_ALIAS_CRYPTO("xchacha20");
diff --git a/drivers/char/random.c b/drivers/char/random.c
index d22d967c50f0a..5f47c4c8b9b15 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -265,7 +265,7 @@
 #include <linux/syscalls.h>
 #include <linux/completion.h>
 #include <linux/uuid.h>
-#include <crypto/chacha20.h>
+#include <crypto/chacha.h>
 
 #include <asm/processor.h>
 #include <linux/uaccess.h>
@@ -431,11 +431,10 @@ static int crng_init = 0;
 #define crng_ready() (likely(crng_init > 1))
 static int crng_init_cnt = 0;
 static unsigned long crng_global_init_time = 0;
-#define CRNG_INIT_CNT_THRESH (2*CHACHA20_KEY_SIZE)
-static void _extract_crng(struct crng_state *crng,
-			  __u8 out[CHACHA20_BLOCK_SIZE]);
+#define CRNG_INIT_CNT_THRESH (2*CHACHA_KEY_SIZE)
+static void _extract_crng(struct crng_state *crng, __u8 out[CHACHA_BLOCK_SIZE]);
 static void _crng_backtrack_protect(struct crng_state *crng,
-				    __u8 tmp[CHACHA20_BLOCK_SIZE], int used);
+				    __u8 tmp[CHACHA_BLOCK_SIZE], int used);
 static void process_random_ready_list(void);
 static void _get_random_bytes(void *buf, int nbytes);
 
@@ -858,7 +857,7 @@ static int crng_fast_load(const char *cp, size_t len)
 	}
 	p = (unsigned char *) &primary_crng.state[4];
 	while (len > 0 && crng_init_cnt < CRNG_INIT_CNT_THRESH) {
-		p[crng_init_cnt % CHACHA20_KEY_SIZE] ^= *cp;
+		p[crng_init_cnt % CHACHA_KEY_SIZE] ^= *cp;
 		cp++; crng_init_cnt++; len--;
 	}
 	spin_unlock_irqrestore(&primary_crng.lock, flags);
@@ -890,7 +889,7 @@ static int crng_slow_load(const char *cp, size_t len)
 	unsigned long		flags;
 	static unsigned char	lfsr = 1;
 	unsigned char		tmp;
-	unsigned		i, max = CHACHA20_KEY_SIZE;
+	unsigned		i, max = CHACHA_KEY_SIZE;
 	const char *		src_buf = cp;
 	char *			dest_buf = (char *) &primary_crng.state[4];
 
@@ -908,8 +907,8 @@ static int crng_slow_load(const char *cp, size_t len)
 		lfsr >>= 1;
 		if (tmp & 1)
 			lfsr ^= 0xE1;
-		tmp = dest_buf[i % CHACHA20_KEY_SIZE];
-		dest_buf[i % CHACHA20_KEY_SIZE] ^= src_buf[i % len] ^ lfsr;
+		tmp = dest_buf[i % CHACHA_KEY_SIZE];
+		dest_buf[i % CHACHA_KEY_SIZE] ^= src_buf[i % len] ^ lfsr;
 		lfsr += (tmp << 3) | (tmp >> 5);
 	}
 	spin_unlock_irqrestore(&primary_crng.lock, flags);
@@ -921,7 +920,7 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r)
 	unsigned long	flags;
 	int		i, num;
 	union {
-		__u8	block[CHACHA20_BLOCK_SIZE];
+		__u8	block[CHACHA_BLOCK_SIZE];
 		__u32	key[8];
 	} buf;
 
@@ -932,7 +931,7 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r)
 	} else {
 		_extract_crng(&primary_crng, buf.block);
 		_crng_backtrack_protect(&primary_crng, buf.block,
-					CHACHA20_KEY_SIZE);
+					CHACHA_KEY_SIZE);
 	}
 	spin_lock_irqsave(&crng->lock, flags);
 	for (i = 0; i < 8; i++) {
@@ -968,7 +967,7 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r)
 }
 
 static void _extract_crng(struct crng_state *crng,
-			  __u8 out[CHACHA20_BLOCK_SIZE])
+			  __u8 out[CHACHA_BLOCK_SIZE])
 {
 	unsigned long v, flags;
 
@@ -985,7 +984,7 @@ static void _extract_crng(struct crng_state *crng,
 	spin_unlock_irqrestore(&crng->lock, flags);
 }
 
-static void extract_crng(__u8 out[CHACHA20_BLOCK_SIZE])
+static void extract_crng(__u8 out[CHACHA_BLOCK_SIZE])
 {
 	struct crng_state *crng = NULL;
 
@@ -1003,14 +1002,14 @@ static void extract_crng(__u8 out[CHACHA20_BLOCK_SIZE])
  * enough) to mutate the CRNG key to provide backtracking protection.
  */
 static void _crng_backtrack_protect(struct crng_state *crng,
-				    __u8 tmp[CHACHA20_BLOCK_SIZE], int used)
+				    __u8 tmp[CHACHA_BLOCK_SIZE], int used)
 {
 	unsigned long	flags;
 	__u32		*s, *d;
 	int		i;
 
 	used = round_up(used, sizeof(__u32));
-	if (used + CHACHA20_KEY_SIZE > CHACHA20_BLOCK_SIZE) {
+	if (used + CHACHA_KEY_SIZE > CHACHA_BLOCK_SIZE) {
 		extract_crng(tmp);
 		used = 0;
 	}
@@ -1022,7 +1021,7 @@ static void _crng_backtrack_protect(struct crng_state *crng,
 	spin_unlock_irqrestore(&crng->lock, flags);
 }
 
-static void crng_backtrack_protect(__u8 tmp[CHACHA20_BLOCK_SIZE], int used)
+static void crng_backtrack_protect(__u8 tmp[CHACHA_BLOCK_SIZE], int used)
 {
 	struct crng_state *crng = NULL;
 
@@ -1037,8 +1036,8 @@ static void crng_backtrack_protect(__u8 tmp[CHACHA20_BLOCK_SIZE], int used)
 
 static ssize_t extract_crng_user(void __user *buf, size_t nbytes)
 {
-	ssize_t ret = 0, i = CHACHA20_BLOCK_SIZE;
-	__u8 tmp[CHACHA20_BLOCK_SIZE] __aligned(4);
+	ssize_t ret = 0, i = CHACHA_BLOCK_SIZE;
+	__u8 tmp[CHACHA_BLOCK_SIZE] __aligned(4);
 	int large_request = (nbytes > 256);
 
 	while (nbytes) {
@@ -1052,7 +1051,7 @@ static ssize_t extract_crng_user(void __user *buf, size_t nbytes)
 		}
 
 		extract_crng(tmp);
-		i = min_t(int, nbytes, CHACHA20_BLOCK_SIZE);
+		i = min_t(int, nbytes, CHACHA_BLOCK_SIZE);
 		if (copy_to_user(buf, tmp, i)) {
 			ret = -EFAULT;
 			break;
@@ -1617,14 +1616,14 @@ static void _warn_unseeded_randomness(const char *func_name, void *caller,
  */
 static void _get_random_bytes(void *buf, int nbytes)
 {
-	__u8 tmp[CHACHA20_BLOCK_SIZE] __aligned(4);
+	__u8 tmp[CHACHA_BLOCK_SIZE] __aligned(4);
 
 	trace_get_random_bytes(nbytes, _RET_IP_);
 
-	while (nbytes >= CHACHA20_BLOCK_SIZE) {
+	while (nbytes >= CHACHA_BLOCK_SIZE) {
 		extract_crng(buf);
-		buf += CHACHA20_BLOCK_SIZE;
-		nbytes -= CHACHA20_BLOCK_SIZE;
+		buf += CHACHA_BLOCK_SIZE;
+		nbytes -= CHACHA_BLOCK_SIZE;
 	}
 
 	if (nbytes > 0) {
@@ -1632,7 +1631,7 @@ static void _get_random_bytes(void *buf, int nbytes)
 		memcpy(buf, tmp, nbytes);
 		crng_backtrack_protect(tmp, nbytes);
 	} else
-		crng_backtrack_protect(tmp, CHACHA20_BLOCK_SIZE);
+		crng_backtrack_protect(tmp, CHACHA_BLOCK_SIZE);
 	memzero_explicit(tmp, sizeof(tmp));
 }
 
@@ -2203,8 +2202,8 @@ struct ctl_table random_table[] = {
 
 struct batched_entropy {
 	union {
-		u64 entropy_u64[CHACHA20_BLOCK_SIZE / sizeof(u64)];
-		u32 entropy_u32[CHACHA20_BLOCK_SIZE / sizeof(u32)];
+		u64 entropy_u64[CHACHA_BLOCK_SIZE / sizeof(u64)];
+		u32 entropy_u32[CHACHA_BLOCK_SIZE / sizeof(u32)];
 	};
 	unsigned int position;
 };
diff --git a/include/crypto/chacha.h b/include/crypto/chacha.h
new file mode 100644
index 0000000000000..ae79e9983c72f
--- /dev/null
+++ b/include/crypto/chacha.h
@@ -0,0 +1,46 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Common values and helper functions for the ChaCha and XChaCha stream ciphers.
+ *
+ * XChaCha extends ChaCha's nonce to 192 bits, while provably retaining ChaCha's
+ * security.  Here they share the same key size, tfm context, and setkey
+ * function; only their IV size and encrypt/decrypt function differ.
+ */
+
+#ifndef _CRYPTO_CHACHA_H
+#define _CRYPTO_CHACHA_H
+
+#include <crypto/skcipher.h>
+#include <linux/types.h>
+#include <linux/crypto.h>
+
+/* 32-bit stream position, then 96-bit nonce (RFC7539 convention) */
+#define CHACHA_IV_SIZE		16
+
+#define CHACHA_KEY_SIZE		32
+#define CHACHA_BLOCK_SIZE	64
+
+/* 192-bit nonce, then 64-bit stream position */
+#define XCHACHA_IV_SIZE		32
+
+struct chacha_ctx {
+	u32 key[8];
+	int nrounds;
+};
+
+void chacha_block(u32 *state, u8 *stream, int nrounds);
+static inline void chacha20_block(u32 *state, u8 *stream)
+{
+	chacha_block(state, stream, 20);
+}
+void hchacha_block(const u32 *in, u32 *out, int nrounds);
+
+void crypto_chacha_init(u32 *state, struct chacha_ctx *ctx, u8 *iv);
+
+int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
+			   unsigned int keysize);
+
+int crypto_chacha_crypt(struct skcipher_request *req);
+int crypto_xchacha_crypt(struct skcipher_request *req);
+
+#endif /* _CRYPTO_CHACHA_H */
diff --git a/include/crypto/chacha20.h b/include/crypto/chacha20.h
deleted file mode 100644
index 6290d997060ec..0000000000000
--- a/include/crypto/chacha20.h
+++ /dev/null
@@ -1,41 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * Common values and helper functions for the ChaCha20 and XChaCha20 algorithms.
- *
- * XChaCha20 extends ChaCha20's nonce to 192 bits, while provably retaining
- * ChaCha20's security.  Here they share the same key size, tfm context, and
- * setkey function; only their IV size and encrypt/decrypt function differ.
- */
-
-#ifndef _CRYPTO_CHACHA20_H
-#define _CRYPTO_CHACHA20_H
-
-#include <crypto/skcipher.h>
-#include <linux/types.h>
-#include <linux/crypto.h>
-
-/* 32-bit stream position, then 96-bit nonce (RFC7539 convention) */
-#define CHACHA20_IV_SIZE	16
-
-#define CHACHA20_KEY_SIZE	32
-#define CHACHA20_BLOCK_SIZE	64
-
-/* 192-bit nonce, then 64-bit stream position */
-#define XCHACHA20_IV_SIZE	32
-
-struct chacha20_ctx {
-	u32 key[8];
-};
-
-void chacha20_block(u32 *state, u8 *stream);
-void hchacha20_block(const u32 *in, u32 *out);
-
-void crypto_chacha20_init(u32 *state, struct chacha20_ctx *ctx, u8 *iv);
-
-int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
-			   unsigned int keysize);
-
-int crypto_chacha20_crypt(struct skcipher_request *req);
-int crypto_xchacha20_crypt(struct skcipher_request *req);
-
-#endif
diff --git a/lib/Makefile b/lib/Makefile
index ca3f7ebb900d8..9a5f0b7a48891 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -20,7 +20,7 @@ KCOV_INSTRUMENT_dynamic_debug.o := n
 lib-y := ctype.o string.o vsprintf.o cmdline.o \
 	 rbtree.o radix-tree.o timerqueue.o\
 	 idr.o int_sqrt.o extable.o \
-	 sha1.o chacha20.o irq_regs.o argv_split.o \
+	 sha1.o chacha.o irq_regs.o argv_split.o \
 	 flex_proportions.o ratelimit.o show_mem.o \
 	 is_single_threaded.o plist.o decompress.o kobject_uevent.o \
 	 earlycpio.o seq_buf.o siphash.o dec_and_lock.o \
diff --git a/lib/chacha20.c b/lib/chacha.c
similarity index 67%
rename from lib/chacha20.c
rename to lib/chacha.c
index 6a484e16171d1..0a2c2e5b7b84d 100644
--- a/lib/chacha20.c
+++ b/lib/chacha.c
@@ -1,5 +1,5 @@
 /*
- * The "hash function" used as the core of the ChaCha20 stream cipher (RFC7539)
+ * The "hash function" used as the core of the ChaCha stream cipher (RFC7539)
  *
  * Copyright (C) 2015 Martin Willi
  *
@@ -14,13 +14,16 @@
 #include <linux/bitops.h>
 #include <linux/cryptohash.h>
 #include <asm/unaligned.h>
-#include <crypto/chacha20.h>
+#include <crypto/chacha.h>
 
-static void chacha20_permute(u32 *x)
+static void chacha_permute(u32 *x, int nrounds)
 {
 	int i;
 
-	for (i = 0; i < 20; i += 2) {
+	/* whitelist the allowed round counts */
+	BUG_ON(nrounds != 20);
+
+	for (i = 0; i < nrounds; i += 2) {
 		x[0]  += x[4];    x[12] = rol32(x[12] ^ x[0],  16);
 		x[1]  += x[5];    x[13] = rol32(x[13] ^ x[1],  16);
 		x[2]  += x[6];    x[14] = rol32(x[14] ^ x[2],  16);
@@ -64,49 +67,51 @@ static void chacha20_permute(u32 *x)
 }
 
 /**
- * chacha20_block - generate one keystream block and increment block counter
+ * chacha_block - generate one keystream block and increment block counter
  * @state: input state matrix (16 32-bit words)
  * @stream: output keystream block (64 bytes)
+ * @nrounds: number of rounds (currently must be 20)
  *
- * This is the ChaCha20 core, a function from 64-byte strings to 64-byte
- * strings.  The caller has already converted the endianness of the input.  This
- * function also handles incrementing the block counter in the input matrix.
+ * This is the ChaCha core, a function from 64-byte strings to 64-byte strings.
+ * The caller has already converted the endianness of the input.  This function
+ * also handles incrementing the block counter in the input matrix.
  */
-void chacha20_block(u32 *state, u8 *stream)
+void chacha_block(u32 *state, u8 *stream, int nrounds)
 {
 	u32 x[16];
 	int i;
 
 	memcpy(x, state, 64);
 
-	chacha20_permute(x);
+	chacha_permute(x, nrounds);
 
 	for (i = 0; i < ARRAY_SIZE(x); i++)
 		put_unaligned_le32(x[i] + state[i], &stream[i * sizeof(u32)]);
 
 	state[12]++;
 }
-EXPORT_SYMBOL(chacha20_block);
+EXPORT_SYMBOL(chacha_block);
 
 /**
- * hchacha20_block - abbreviated ChaCha20 core, for XChaCha20
+ * hchacha_block - abbreviated ChaCha core, for XChaCha
  * @in: input state matrix (16 32-bit words)
  * @out: output (8 32-bit words)
+ * @nrounds: number of rounds (currently must be 20)
  *
- * HChaCha20 is the ChaCha equivalent of HSalsa20 and is an intermediate step
- * towards XChaCha20 (see https://cr.yp.to/snuffle/xsalsa-20081128.pdf).
- * HChaCha20 skips the final addition of the initial state, and outputs only
- * certain words of the state.  It should not be used for streaming directly.
+ * HChaCha is the ChaCha equivalent of HSalsa and is an intermediate step
+ * towards XChaCha (see https://cr.yp.to/snuffle/xsalsa-20081128.pdf).  HChaCha
+ * skips the final addition of the initial state, and outputs only certain words
+ * of the state.  It should not be used for streaming directly.
  */
-void hchacha20_block(const u32 *in, u32 *out)
+void hchacha_block(const u32 *in, u32 *out, int nrounds)
 {
 	u32 x[16];
 
 	memcpy(x, in, 64);
 
-	chacha20_permute(x);
+	chacha_permute(x, nrounds);
 
 	memcpy(&out[0], &x[0], 16);
 	memcpy(&out[4], &x[12], 16);
 }
-EXPORT_SYMBOL(hchacha20_block);
+EXPORT_SYMBOL(hchacha_block);
-- 
2.19.1.331.ge82ca0e54c-goog


* [RFC PATCH v2 04/12] crypto: chacha - add XChaCha12 support
  2018-10-15 17:54 [RFC PATCH v2 00/12] crypto: Adiantum support Eric Biggers
                   ` (2 preceding siblings ...)
  2018-10-15 17:54 ` [RFC PATCH v2 03/12] crypto: chacha20-generic - refactor to allow varying number of rounds Eric Biggers
@ 2018-10-15 17:54 ` Eric Biggers
  2018-10-19 14:34   ` Ard Biesheuvel
  2018-10-15 17:54 ` [RFC PATCH v2 05/12] crypto: arm/chacha20 - add XChaCha20 support Eric Biggers
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 54+ messages in thread
From: Eric Biggers @ 2018-10-15 17:54 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-fscrypt, linux-arm-kernel, linux-kernel, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

From: Eric Biggers <ebiggers@google.com>

Now that the generic implementation of ChaCha20 has been refactored to
allow varying the number of rounds, add support for XChaCha12, which is
the XSalsa construction applied to ChaCha12.  ChaCha12 is one of the
three ciphers specified by the original ChaCha paper
(https://cr.yp.to/chacha/chacha-20080128.pdf: "ChaCha, a variant of
Salsa20"), alongside ChaCha8 and ChaCha20.  ChaCha12 is faster than
ChaCha20 and has a lower, though still large, security margin.
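
For readers who want to see what "varying the number of rounds" amounts
to, here is a minimal userspace sketch.  It is not the kernel code: the
sketch_* names and the quarter_round() helper are made up for
illustration, and it assumes nrounds is even (12 or 20).  It mirrors the
shape of chacha_permute(), chacha_block() and hchacha_block() from this
series, with the round count as the only thing that differs between the
variants.

```c
#include <stdint.h>
#include <string.h>

static uint32_t rotl32(uint32_t v, int n)
{
	return (v << n) | (v >> (32 - n));
}

/* One ChaCha quarter round on four words of the 16-word state. */
static void quarter_round(uint32_t *x, int a, int b, int c, int d)
{
	x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 16);
	x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 12);
	x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 8);
	x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 7);
}

/*
 * Hypothetical stand-in for chacha_permute().  Each loop iteration is one
 * "double round" (a column round, then a diagonal round), so nrounds = 12
 * runs 6 iterations and nrounds = 20 runs 10.
 */
static void sketch_chacha_permute(uint32_t *x, int nrounds)
{
	int i;

	for (i = 0; i < nrounds; i += 2) {
		quarter_round(x, 0, 4,  8, 12);
		quarter_round(x, 1, 5,  9, 13);
		quarter_round(x, 2, 6, 10, 14);
		quarter_round(x, 3, 7, 11, 15);
		quarter_round(x, 0, 5, 10, 15);
		quarter_round(x, 1, 6, 11, 12);
		quarter_round(x, 2, 7,  8, 13);
		quarter_round(x, 3, 4,  9, 14);
	}
}

/* Generate one 64-byte keystream block and increment the block counter. */
void sketch_chacha_block(uint32_t *state, uint8_t *stream, int nrounds)
{
	uint32_t x[16];
	int i;

	memcpy(x, state, sizeof(x));
	sketch_chacha_permute(x, nrounds);

	for (i = 0; i < 16; i++) {
		uint32_t v = x[i] + state[i];

		/* Serialize each word in little-endian byte order. */
		stream[4 * i + 0] = (uint8_t)(v);
		stream[4 * i + 1] = (uint8_t)(v >> 8);
		stream[4 * i + 2] = (uint8_t)(v >> 16);
		stream[4 * i + 3] = (uint8_t)(v >> 24);
	}

	state[12]++;
}

/*
 * HChaCha: same permutation, but no final addition of the input state,
 * and only words 0..3 and 12..15 are output (the XChaCha subkey).
 */
void sketch_hchacha_block(const uint32_t *in, uint32_t *out, int nrounds)
{
	uint32_t x[16];

	memcpy(x, in, sizeof(x));
	sketch_chacha_permute(x, nrounds);

	memcpy(&out[0], &x[0], 16);
	memcpy(&out[4], &x[12], 16);
}
```

Under these assumptions, calling sketch_chacha_block(state, buf, 12)
instead of sketch_chacha_block(state, buf, 20) is the whole difference
between a ChaCha12 and a ChaCha20 keystream block.  An XChaCha12 user
would pass 12 to both the HChaCha step and the block function.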

We need XChaCha12 support so that it can be used in the Adiantum
encryption mode, which enables disk/file encryption on low-end mobile
devices where AES-XTS is too slow as the CPUs lack AES instructions.

We'd prefer XChaCha20 (the more popular variant), but it's too slow on
some of our target devices, so at least in some cases we do need the
XChaCha12-based version.  In more detail, the problem is that Adiantum
is still much slower than we're happy with, and encryption still has a
quite noticeable effect on the feel of low-end devices.  Users and
vendors push back hard against encryption that degrades the user
experience, and that pushback always risks encryption being disabled
entirely.  So we need to choose the fastest option that gives us a solid
margin of security, and here that's XChaCha12.  The best known attack on
ChaCha breaks only 7 rounds and has 2^235 time complexity, so ChaCha12's
security margin is still better than AES-256's.  Much has been learned
about cryptanalysis of ARX ciphers since Salsa20 was originally designed
in 2005, and it now seems we can be comfortable with a smaller number of
rounds.  The eSTREAM project also suggests the 12-round version of
Salsa20 as providing the best balance among the different variants:
combining very good performance with a "comfortable margin of security".

Note that it would be trivial to add vanilla ChaCha12 in addition to
XChaCha12.  However, it's unneeded for now and therefore is omitted.

As discussed in the patch that introduced XChaCha20 support, I
considered splitting the code into separate chacha-common, chacha20,
xchacha20, and xchacha12 modules, so that these algorithms could be
enabled/disabled independently.  However, since nearly all the code is
shared anyway, I ultimately decided that the added complexity would
bring little benefit.
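
To make the "nearly all the code is shared" point concrete, here is a
rough userspace sketch of how the per-variant setkey entry points reduce
to thin wrappers around one helper.  The sketch_* names, the
SKETCH_CHACHA_KEY_SIZE constant and the load_le32() helper are
illustrative assumptions, not kernel symbols.  The real functions take a
struct crypto_skcipher and are the ones shown in the diff below.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_CHACHA_KEY_SIZE	32	/* assumed to match CHACHA_KEY_SIZE */

/* Hypothetical stand-in for struct chacha_ctx. */
struct sketch_chacha_ctx {
	uint32_t key[8];
	int nrounds;
};

/* Load a 32-bit little-endian word from a possibly unaligned buffer. */
static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Shared helper: validates the key size and records the round count. */
static int sketch_setkey(struct sketch_chacha_ctx *ctx, const uint8_t *key,
			 unsigned int keysize, int nrounds)
{
	int i;

	if (keysize != SKETCH_CHACHA_KEY_SIZE)
		return -EINVAL;

	for (i = 0; i < 8; i++)
		ctx->key[i] = load_le32(key + i * 4);

	ctx->nrounds = nrounds;
	return 0;
}

/* The variants differ only in the constant they pass down. */
int sketch_chacha20_setkey(struct sketch_chacha_ctx *ctx, const uint8_t *key,
			   unsigned int keysize)
{
	return sketch_setkey(ctx, key, keysize, 20);
}

int sketch_chacha12_setkey(struct sketch_chacha_ctx *ctx, const uint8_t *key,
			   unsigned int keysize)
{
	return sketch_setkey(ctx, key, keysize, 12);
}
```

Because the round count lives in the context, the encrypt/decrypt path
does not need to know which variant it is running, which is why
splitting the variants into separate modules would mostly duplicate
boilerplate.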

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 crypto/Kconfig          |   8 +-
 crypto/chacha_generic.c |  26 +-
 crypto/testmgr.c        |   6 +
 crypto/testmgr.h        | 578 ++++++++++++++++++++++++++++++++++++++++
 include/crypto/chacha.h |   7 +
 lib/chacha.c            |   6 +-
 6 files changed, 625 insertions(+), 6 deletions(-)

diff --git a/crypto/Kconfig b/crypto/Kconfig
index d9acbce23d4d5..4fa0a4a0e8615 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -1387,10 +1387,10 @@ config CRYPTO_SALSA20
 	  Bernstein <djb@cr.yp.to>. See <http://cr.yp.to/snuffle.html>
 
 config CRYPTO_CHACHA20
-	tristate "ChaCha20 stream cipher algorithms"
+	tristate "ChaCha stream cipher algorithms"
 	select CRYPTO_BLKCIPHER
 	help
-	  The ChaCha20 and XChaCha20 stream cipher algorithms.
+	  The ChaCha20, XChaCha20, and XChaCha12 stream cipher algorithms.
 
 	  ChaCha20 is a 256-bit high-speed stream cipher designed by Daniel J.
 	  Bernstein and further specified in RFC7539 for use in IETF protocols.
@@ -1403,6 +1403,10 @@ config CRYPTO_CHACHA20
 	  while provably retaining ChaCha20's security.  See also:
 	  <https://cr.yp.to/snuffle/xsalsa-20081128.pdf>
 
+	  XChaCha12 is XChaCha20 reduced to 12 rounds, with correspondingly
+	  reduced security margin but increased performance.  It can be needed
+	  in some performance-sensitive scenarios.
+
 config CRYPTO_CHACHA20_X86_64
 	tristate "ChaCha20 cipher algorithm (x86_64/SSSE3/AVX2)"
 	depends on X86 && 64BIT
diff --git a/crypto/chacha_generic.c b/crypto/chacha_generic.c
index 8e25e9930c549..8f8f84e51f334 100644
--- a/crypto/chacha_generic.c
+++ b/crypto/chacha_generic.c
@@ -1,5 +1,5 @@
 /*
- * ChaCha20 (RFC7539) and XChaCha20 stream cipher algorithms
+ * ChaCha and XChaCha stream ciphers, including ChaCha20 (RFC7539)
  *
  * Copyright (C) 2015 Martin Willi
  * Copyright (C) 2018 Google LLC
@@ -106,6 +106,13 @@ int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
 }
 EXPORT_SYMBOL_GPL(crypto_chacha20_setkey);
 
+int crypto_chacha12_setkey(struct crypto_skcipher *tfm, const u8 *key,
+			   unsigned int keysize)
+{
+	return chacha_setkey(tfm, key, keysize, 12);
+}
+EXPORT_SYMBOL_GPL(crypto_chacha12_setkey);
+
 int crypto_chacha_crypt(struct skcipher_request *req)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
@@ -168,6 +175,21 @@ static struct skcipher_alg algs[] = {
 		.setkey			= crypto_chacha20_setkey,
 		.encrypt		= crypto_xchacha_crypt,
 		.decrypt		= crypto_xchacha_crypt,
+	}, {
+		.base.cra_name		= "xchacha12",
+		.base.cra_driver_name	= "xchacha12-generic",
+		.base.cra_priority	= 100,
+		.base.cra_blocksize	= 1,
+		.base.cra_ctxsize	= sizeof(struct chacha_ctx),
+		.base.cra_module	= THIS_MODULE,
+
+		.min_keysize		= CHACHA_KEY_SIZE,
+		.max_keysize		= CHACHA_KEY_SIZE,
+		.ivsize			= XCHACHA_IV_SIZE,
+		.chunksize		= CHACHA_BLOCK_SIZE,
+		.setkey			= crypto_chacha12_setkey,
+		.encrypt		= crypto_xchacha_crypt,
+		.decrypt		= crypto_xchacha_crypt,
 	}
 };
 
@@ -191,3 +213,5 @@ MODULE_ALIAS_CRYPTO("chacha20");
 MODULE_ALIAS_CRYPTO("chacha20-generic");
 MODULE_ALIAS_CRYPTO("xchacha20");
 MODULE_ALIAS_CRYPTO("xchacha20-generic");
+MODULE_ALIAS_CRYPTO("xchacha12");
+MODULE_ALIAS_CRYPTO("xchacha12-generic");
diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index a5512e69c8f31..3ff70ebc745cb 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -3544,6 +3544,12 @@ static const struct alg_test_desc alg_test_descs[] = {
 		.suite = {
 			.hash = __VECS(aes_xcbc128_tv_template)
 		}
+	}, {
+		.alg = "xchacha12",
+		.test = alg_test_skcipher,
+		.suite = {
+			.cipher = __VECS(xchacha12_tv_template)
+		},
 	}, {
 		.alg = "xchacha20",
 		.test = alg_test_skcipher,
diff --git a/crypto/testmgr.h b/crypto/testmgr.h
index 371641c73cf8c..3b57b2701fcb2 100644
--- a/crypto/testmgr.h
+++ b/crypto/testmgr.h
@@ -31379,6 +31379,584 @@ static const struct cipher_testvec xchacha20_tv_template[] = {
 	},
 };
 
+/*
+ * Same as the XChaCha20 test vectors above, but with the ciphertext recomputed
+ * using XChaCha12 via a modified libsodium.
+ */
+static const struct cipher_testvec xchacha12_tv_template[] = {
+	{
+		.key	= "\x79\xc9\x97\x98\xac\x67\x30\x0b"
+			  "\xbb\x27\x04\xc9\x5c\x34\x1e\x32"
+			  "\x45\xf3\xdc\xb2\x17\x61\xb9\x8e"
+			  "\x52\xff\x45\xb2\x4f\x30\x4f\xc4",
+		.klen	= 32,
+		.iv	= "\xb3\x3f\xfd\x30\x96\x47\x9b\xcf"
+			  "\xbc\x9a\xee\x49\x41\x76\x88\xa0"
+			  "\xa2\x55\x4f\x8d\x95\x38\x94\x19"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00",
+		.ptext	= "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00",
+		.ctext	= "\x1b\x78\x7f\xd7\xa1\x41\x68\xab"
+			  "\x3d\x3f\xd1\x7b\x69\x56\xb2\xd5"
+			  "\x43\xce\xeb\xaf\x36\xf0\x29\x9d"
+			  "\x3a\xfb\x18\xae\x1b",
+		.len	= 29,
+	}, {
+		.key	= "\x9d\x23\xbd\x41\x49\xcb\x97\x9c"
+			  "\xcf\x3c\x5c\x94\xdd\x21\x7e\x98"
+			  "\x08\xcb\x0e\x50\xcd\x0f\x67\x81"
+			  "\x22\x35\xea\xaf\x60\x1d\x62\x32",
+		.klen	= 32,
+		.iv	= "\xc0\x47\x54\x82\x66\xb7\xc3\x70"
+			  "\xd3\x35\x66\xa2\x42\x5c\xbf\x30"
+			  "\xd8\x2d\x1e\xaf\x52\x94\x10\x9e"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00",
+		.ptext	= "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00",
+		.ctext	= "\xfb\x32\x09\x1d\x83\x05\xae\x4c"
+			  "\x13\x1f\x12\x71\xf2\xca\xb2\xeb"
+			  "\x5b\x83\x14\x7d\x83\xf6\x57\x77"
+			  "\x2e\x40\x1f\x92\x2c\xf9\xec\x35"
+			  "\x34\x1f\x93\xdf\xfb\x30\xd7\x35"
+			  "\x03\x05\x78\xc1\x20\x3b\x7a\xe3"
+			  "\x62\xa3\x89\xdc\x11\x11\x45\xa8"
+			  "\x82\x89\xa0\xf1\x4e\xc7\x0f\x11"
+			  "\x69\xdd\x0c\x84\x2b\x89\x5c\xdc"
+			  "\xf0\xde\x01\xef\xc5\x65\x79\x23"
+			  "\x87\x67\xd6\x50\xd9\x8d\xd9\x92"
+			  "\x54\x5b\x0e",
+		.len	= 91,
+	}, {
+		.key	= "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00",
+		.klen	= 32,
+		.iv	= "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x67\xc6\x69\x73"
+			  "\x51\xff\x4a\xec\x29\xcd\xba\xab"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00",
+		.ptext	= "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00",
+		.ctext	= "\xdf\x2d\xc6\x21\x2a\x9d\xa1\xbb"
+			  "\xc2\x77\x66\x0c\x5c\x46\xef\xa7"
+			  "\x79\x1b\xb9\xdf\x55\xe2\xf9\x61"
+			  "\x4c\x7b\xa4\x52\x24\xaf\xa2\xda"
+			  "\xd1\x8f\x8f\xa2\x9e\x53\x4d\xc4"
+			  "\xb8\x55\x98\x08\x7c\x08\xd4\x18"
+			  "\x67\x8f\xef\x50\xb1\x5f\xa5\x77"
+			  "\x4c\x25\xe7\x86\x26\x42\xca\x44",
+		.len	= 64,
+	}, {
+		.key	= "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x01",
+		.klen	= 32,
+		.iv	= "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x02\xf2\xfb\xe3\x46"
+			  "\x7c\xc2\x54\xf8\x1b\xe8\xe7\x8d"
+			  "\x01\x00\x00\x00\x00\x00\x00\x00",
+		.ptext	= "\x41\x6e\x79\x20\x73\x75\x62\x6d"
+			  "\x69\x73\x73\x69\x6f\x6e\x20\x74"
+			  "\x6f\x20\x74\x68\x65\x20\x49\x45"
+			  "\x54\x46\x20\x69\x6e\x74\x65\x6e"
+			  "\x64\x65\x64\x20\x62\x79\x20\x74"
+			  "\x68\x65\x20\x43\x6f\x6e\x74\x72"
+			  "\x69\x62\x75\x74\x6f\x72\x20\x66"
+			  "\x6f\x72\x20\x70\x75\x62\x6c\x69"
+			  "\x63\x61\x74\x69\x6f\x6e\x20\x61"
+			  "\x73\x20\x61\x6c\x6c\x20\x6f\x72"
+			  "\x20\x70\x61\x72\x74\x20\x6f\x66"
+			  "\x20\x61\x6e\x20\x49\x45\x54\x46"
+			  "\x20\x49\x6e\x74\x65\x72\x6e\x65"
+			  "\x74\x2d\x44\x72\x61\x66\x74\x20"
+			  "\x6f\x72\x20\x52\x46\x43\x20\x61"
+			  "\x6e\x64\x20\x61\x6e\x79\x20\x73"
+			  "\x74\x61\x74\x65\x6d\x65\x6e\x74"
+			  "\x20\x6d\x61\x64\x65\x20\x77\x69"
+			  "\x74\x68\x69\x6e\x20\x74\x68\x65"
+			  "\x20\x63\x6f\x6e\x74\x65\x78\x74"
+			  "\x20\x6f\x66\x20\x61\x6e\x20\x49"
+			  "\x45\x54\x46\x20\x61\x63\x74\x69"
+			  "\x76\x69\x74\x79\x20\x69\x73\x20"
+			  "\x63\x6f\x6e\x73\x69\x64\x65\x72"
+			  "\x65\x64\x20\x61\x6e\x20\x22\x49"
+			  "\x45\x54\x46\x20\x43\x6f\x6e\x74"
+			  "\x72\x69\x62\x75\x74\x69\x6f\x6e"
+			  "\x22\x2e\x20\x53\x75\x63\x68\x20"
+			  "\x73\x74\x61\x74\x65\x6d\x65\x6e"
+			  "\x74\x73\x20\x69\x6e\x63\x6c\x75"
+			  "\x64\x65\x20\x6f\x72\x61\x6c\x20"
+			  "\x73\x74\x61\x74\x65\x6d\x65\x6e"
+			  "\x74\x73\x20\x69\x6e\x20\x49\x45"
+			  "\x54\x46\x20\x73\x65\x73\x73\x69"
+			  "\x6f\x6e\x73\x2c\x20\x61\x73\x20"
+			  "\x77\x65\x6c\x6c\x20\x61\x73\x20"
+			  "\x77\x72\x69\x74\x74\x65\x6e\x20"
+			  "\x61\x6e\x64\x20\x65\x6c\x65\x63"
+			  "\x74\x72\x6f\x6e\x69\x63\x20\x63"
+			  "\x6f\x6d\x6d\x75\x6e\x69\x63\x61"
+			  "\x74\x69\x6f\x6e\x73\x20\x6d\x61"
+			  "\x64\x65\x20\x61\x74\x20\x61\x6e"
+			  "\x79\x20\x74\x69\x6d\x65\x20\x6f"
+			  "\x72\x20\x70\x6c\x61\x63\x65\x2c"
+			  "\x20\x77\x68\x69\x63\x68\x20\x61"
+			  "\x72\x65\x20\x61\x64\x64\x72\x65"
+			  "\x73\x73\x65\x64\x20\x74\x6f",
+		.ctext	= "\xe4\xa6\xc8\x30\xc4\x23\x13\xd6"
+			  "\x08\x4d\xc9\xb7\xa5\x64\x7c\xb9"
+			  "\x71\xe2\xab\x3e\xa8\x30\x8a\x1c"
+			  "\x4a\x94\x6d\x9b\xe0\xb3\x6f\xf1"
+			  "\xdc\xe3\x1b\xb3\xa9\x6d\x0d\xd6"
+			  "\xd0\xca\x12\xef\xe7\x5f\xd8\x61"
+			  "\x3c\x82\xd3\x99\x86\x3c\x6f\x66"
+			  "\x02\x06\xdc\x55\xf9\xed\xdf\x38"
+			  "\xb4\xa6\x17\x00\x7f\xef\xbf\x4f"
+			  "\xf8\x36\xf1\x60\x7e\x47\xaf\xdb"
+			  "\x55\x9b\x12\xcb\x56\x44\xa7\x1f"
+			  "\xd3\x1a\x07\x3b\x00\xec\xe6\x4c"
+			  "\xa2\x43\x27\xdf\x86\x19\x4f\x16"
+			  "\xed\xf9\x4a\xf3\x63\x6f\xfa\x7f"
+			  "\x78\x11\xf6\x7d\x97\x6f\xec\x6f"
+			  "\x85\x0f\x5c\x36\x13\x8d\x87\xe0"
+			  "\x80\xb1\x69\x0b\x98\x89\x9c\x4e"
+			  "\xf8\xdd\xee\x5c\x0a\x85\xce\xd4"
+			  "\xea\x1b\x48\xbe\x08\xf8\xe2\xa8"
+			  "\xa5\xb0\x3c\x79\xb1\x15\xb4\xb9"
+			  "\x75\x10\x95\x35\x81\x7e\x26\xe6"
+			  "\x78\xa4\x88\xcf\xdb\x91\x34\x18"
+			  "\xad\xd7\x8e\x07\x7d\xab\x39\xf9"
+			  "\xa3\x9e\xa5\x1d\xbb\xed\x61\xfd"
+			  "\xdc\xb7\x5a\x27\xfc\xb5\xc9\x10"
+			  "\xa8\xcc\x52\x7f\x14\x76\x90\xe7"
+			  "\x1b\x29\x60\x74\xc0\x98\x77\xbb"
+			  "\xe0\x54\xbb\x27\x49\x59\x1e\x62"
+			  "\x3d\xaf\x74\x06\xa4\x42\x6f\xc6"
+			  "\x52\x97\xc4\x1d\xc4\x9f\xe2\xe5"
+			  "\x38\x57\x91\xd1\xa2\x28\xcc\x40"
+			  "\xcc\x70\x59\x37\xfc\x9f\x4b\xda"
+			  "\xa0\xeb\x97\x9a\x7d\xed\x14\x5c"
+			  "\x9c\xb7\x93\x26\x41\xa8\x66\xdd"
+			  "\x87\x6a\xc0\xd3\xc2\xa9\x3e\xae"
+			  "\xe9\x72\xfe\xd1\xb3\xac\x38\xea"
+			  "\x4d\x15\xa9\xd5\x36\x61\xe9\x96"
+			  "\x6c\x23\xf8\x43\xe4\x92\x29\xd9"
+			  "\x8b\x78\xf7\x0a\x52\xe0\x19\x5b"
+			  "\x59\x69\x5b\x5d\xa1\x53\xc4\x68"
+			  "\xe1\xbb\xac\x89\x14\xe2\xe2\x85"
+			  "\x41\x18\xf5\xb3\xd1\xfa\x68\x19"
+			  "\x44\x78\xdc\xcf\xe7\x88\x2d\x52"
+			  "\x5f\x40\xb5\x7e\xf8\x88\xa2\xae"
+			  "\x4a\xb2\x07\x35\x9d\x9b\x07\x88"
+			  "\xb7\x00\xd0\x0c\xb6\xa0\x47\x59"
+			  "\xda\x4e\xc9\xab\x9b\x8a\x7b",
+
+		.len	= 375,
+		.also_non_np = 1,
+		.np	= 3,
+		.tap	= { 375 - 20, 4, 16 },
+
+	}, {
+		.key	= "\x1c\x92\x40\xa5\xeb\x55\xd3\x8a"
+			  "\xf3\x33\x88\x86\x04\xf6\xb5\xf0"
+			  "\x47\x39\x17\xc1\x40\x2b\x80\x09"
+			  "\x9d\xca\x5c\xbc\x20\x70\x75\xc0",
+		.klen	= 32,
+		.iv	= "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x02\x76\x5a\x2e\x63"
+			  "\x33\x9f\xc9\x9a\x66\x32\x0d\xb7"
+			  "\x2a\x00\x00\x00\x00\x00\x00\x00",
+		.ptext	= "\x27\x54\x77\x61\x73\x20\x62\x72"
+			  "\x69\x6c\x6c\x69\x67\x2c\x20\x61"
+			  "\x6e\x64\x20\x74\x68\x65\x20\x73"
+			  "\x6c\x69\x74\x68\x79\x20\x74\x6f"
+			  "\x76\x65\x73\x0a\x44\x69\x64\x20"
+			  "\x67\x79\x72\x65\x20\x61\x6e\x64"
+			  "\x20\x67\x69\x6d\x62\x6c\x65\x20"
+			  "\x69\x6e\x20\x74\x68\x65\x20\x77"
+			  "\x61\x62\x65\x3a\x0a\x41\x6c\x6c"
+			  "\x20\x6d\x69\x6d\x73\x79\x20\x77"
+			  "\x65\x72\x65\x20\x74\x68\x65\x20"
+			  "\x62\x6f\x72\x6f\x67\x6f\x76\x65"
+			  "\x73\x2c\x0a\x41\x6e\x64\x20\x74"
+			  "\x68\x65\x20\x6d\x6f\x6d\x65\x20"
+			  "\x72\x61\x74\x68\x73\x20\x6f\x75"
+			  "\x74\x67\x72\x61\x62\x65\x2e",
+		.ctext	= "\xb9\x68\xbc\x6a\x24\xbc\xcc\xd8"
+			  "\x9b\x2a\x8d\x5b\x96\xaf\x56\xe3"
+			  "\x11\x61\xe7\xa7\x9b\xce\x4e\x7d"
+			  "\x60\x02\x48\xac\xeb\xd5\x3a\x26"
+			  "\x9d\x77\x3b\xb5\x32\x13\x86\x8e"
+			  "\x20\x82\x26\x72\xae\x64\x1b\x7e"
+			  "\x2e\x01\x68\xb4\x87\x45\xa1\x24"
+			  "\xe4\x48\x40\xf0\xaa\xac\xee\xa9"
+			  "\xfc\x31\xad\x9d\x89\xa3\xbb\xd2"
+			  "\xe4\x25\x13\xad\x0f\x5e\xdf\x3c"
+			  "\x27\xab\xb8\x62\x46\x22\x30\x48"
+			  "\x55\x2c\x4e\x84\x78\x1d\x0d\x34"
+			  "\x8d\x3c\x91\x0a\x7f\x5b\x19\x9f"
+			  "\x97\x05\x4c\xa7\x62\x47\x8b\xc5"
+			  "\x44\x2e\x20\x33\xdd\xa0\x82\xa9"
+			  "\x25\x76\x37\xe6\x3c\x67\x5b",
+		.len	= 127,
+	}, {
+		.key	= "\x1c\x92\x40\xa5\xeb\x55\xd3\x8a"
+			  "\xf3\x33\x88\x86\x04\xf6\xb5\xf0"
+			  "\x47\x39\x17\xc1\x40\x2b\x80\x09"
+			  "\x9d\xca\x5c\xbc\x20\x70\x75\xc0",
+		.klen	= 32,
+		.iv	= "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x01\x31\x58\xa3\x5a"
+			  "\x25\x5d\x05\x17\x58\xe9\x5e\xd4"
+			  "\x1c\x00\x00\x00\x00\x00\x00\x00",
+		.ptext	= "\x49\xee\xe0\xdc\x24\x90\x40\xcd"
+			  "\xc5\x40\x8f\x47\x05\xbc\xdd\x81"
+			  "\x47\xc6\x8d\xe6\xb1\x8f\xd7\xcb"
+			  "\x09\x0e\x6e\x22\x48\x1f\xbf\xb8"
+			  "\x5c\xf7\x1e\x8a\xc1\x23\xf2\xd4"
+			  "\x19\x4b\x01\x0f\x4e\xa4\x43\xce"
+			  "\x01\xc6\x67\xda\x03\x91\x18\x90"
+			  "\xa5\xa4\x8e\x45\x03\xb3\x2d\xac"
+			  "\x74\x92\xd3\x53\x47\xc8\xdd\x25"
+			  "\x53\x6c\x02\x03\x87\x0d\x11\x0c"
+			  "\x58\xe3\x12\x18\xfd\x2a\x5b\x40"
+			  "\x0c\x30\xf0\xb8\x3f\x43\xce\xae"
+			  "\x65\x3a\x7d\x7c\xf4\x54\xaa\xcc"
+			  "\x33\x97\xc3\x77\xba\xc5\x70\xde"
+			  "\xd7\xd5\x13\xa5\x65\xc4\x5f\x0f"
+			  "\x46\x1a\x0d\x97\xb5\xf3\xbb\x3c"
+			  "\x84\x0f\x2b\xc5\xaa\xea\xf2\x6c"
+			  "\xc9\xb5\x0c\xee\x15\xf3\x7d\xbe"
+			  "\x9f\x7b\x5a\xa6\xae\x4f\x83\xb6"
+			  "\x79\x49\x41\xf4\x58\x18\xcb\x86"
+			  "\x7f\x30\x0e\xf8\x7d\x44\x36\xea"
+			  "\x75\xeb\x88\x84\x40\x3c\xad\x4f"
+			  "\x6f\x31\x6b\xaa\x5d\xe5\xa5\xc5"
+			  "\x21\x66\xe9\xa7\xe3\xb2\x15\x88"
+			  "\x78\xf6\x79\xa1\x59\x47\x12\x4e"
+			  "\x9f\x9f\x64\x1a\xa0\x22\x5b\x08"
+			  "\xbe\x7c\x36\xc2\x2b\x66\x33\x1b"
+			  "\xdd\x60\x71\xf7\x47\x8c\x61\xc3"
+			  "\xda\x8a\x78\x1e\x16\xfa\x1e\x86"
+			  "\x81\xa6\x17\x2a\xa7\xb5\xc2\xe7"
+			  "\xa4\xc7\x42\xf1\xcf\x6a\xca\xb4"
+			  "\x45\xcf\xf3\x93\xf0\xe7\xea\xf6"
+			  "\xf4\xe6\x33\x43\x84\x93\xa5\x67"
+			  "\x9b\x16\x58\x58\x80\x0f\x2b\x5c"
+			  "\x24\x74\x75\x7f\x95\x81\xb7\x30"
+			  "\x7a\x33\xa7\xf7\x94\x87\x32\x27"
+			  "\x10\x5d\x14\x4c\x43\x29\xdd\x26"
+			  "\xbd\x3e\x3c\x0e\xfe\x0e\xa5\x10"
+			  "\xea\x6b\x64\xfd\x73\xc6\xed\xec"
+			  "\xa8\xc9\xbf\xb3\xba\x0b\x4d\x07"
+			  "\x70\xfc\x16\xfd\x79\x1e\xd7\xc5"
+			  "\x49\x4e\x1c\x8b\x8d\x79\x1b\xb1"
+			  "\xec\xca\x60\x09\x4c\x6a\xd5\x09"
+			  "\x49\x46\x00\x88\x22\x8d\xce\xea"
+			  "\xb1\x17\x11\xde\x42\xd2\x23\xc1"
+			  "\x72\x11\xf5\x50\x73\x04\x40\x47"
+			  "\xf9\x5d\xe7\xa7\x26\xb1\x7e\xb0"
+			  "\x3f\x58\xc1\x52\xab\x12\x67\x9d"
+			  "\x3f\x43\x4b\x68\xd4\x9c\x68\x38"
+			  "\x07\x8a\x2d\x3e\xf3\xaf\x6a\x4b"
+			  "\xf9\xe5\x31\x69\x22\xf9\xa6\x69"
+			  "\xc6\x9c\x96\x9a\x12\x35\x95\x1d"
+			  "\x95\xd5\xdd\xbe\xbf\x93\x53\x24"
+			  "\xfd\xeb\xc2\x0a\x64\xb0\x77\x00"
+			  "\x6f\x88\xc4\x37\x18\x69\x7c\xd7"
+			  "\x41\x92\x55\x4c\x03\xa1\x9a\x4b"
+			  "\x15\xe5\xdf\x7f\x37\x33\x72\xc1"
+			  "\x8b\x10\x67\xa3\x01\x57\x94\x25"
+			  "\x7b\x38\x71\x7e\xdd\x1e\xcc\x73"
+			  "\x55\xd2\x8e\xeb\x07\xdd\xf1\xda"
+			  "\x58\xb1\x47\x90\xfe\x42\x21\x72"
+			  "\xa3\x54\x7a\xa0\x40\xec\x9f\xdd"
+			  "\xc6\x84\x6e\xca\xae\xe3\x68\xb4"
+			  "\x9d\xe4\x78\xff\x57\xf2\xf8\x1b"
+			  "\x03\xa1\x31\xd9\xde\x8d\xf5\x22"
+			  "\x9c\xdd\x20\xa4\x1e\x27\xb1\x76"
+			  "\x4f\x44\x55\xe2\x9b\xa1\x9c\xfe"
+			  "\x54\xf7\x27\x1b\xf4\xde\x02\xf5"
+			  "\x1b\x55\x48\x5c\xdc\x21\x4b\x9e"
+			  "\x4b\x6e\xed\x46\x23\xdc\x65\xb2"
+			  "\xcf\x79\x5f\x28\xe0\x9e\x8b\xe7"
+			  "\x4c\x9d\x8a\xff\xc1\xa6\x28\xb8"
+			  "\x65\x69\x8a\x45\x29\xef\x74\x85"
+			  "\xde\x79\xc7\x08\xae\x30\xb0\xf4"
+			  "\xa3\x1d\x51\x41\xab\xce\xcb\xf6"
+			  "\xb5\xd8\x6d\xe0\x85\xe1\x98\xb3"
+			  "\x43\xbb\x86\x83\x0a\xa0\xf5\xb7"
+			  "\x04\x0b\xfa\x71\x1f\xb0\xf6\xd9"
+			  "\x13\x00\x15\xf0\xc7\xeb\x0d\x5a"
+			  "\x9f\xd7\xb9\x6c\x65\x14\x22\x45"
+			  "\x6e\x45\x32\x3e\x7e\x60\x1a\x12"
+			  "\x97\x82\x14\xfb\xaa\x04\x22\xfa"
+			  "\xa0\xe5\x7e\x8c\x78\x02\x48\x5d"
+			  "\x78\x33\x5a\x7c\xad\xdb\x29\xce"
+			  "\xbb\x8b\x61\xa4\xb7\x42\xe2\xac"
+			  "\x8b\x1a\xd9\x2f\x0b\x8b\x62\x21"
+			  "\x83\x35\x7e\xad\x73\xc2\xb5\x6c"
+			  "\x10\x26\x38\x07\xe5\xc7\x36\x80"
+			  "\xe2\x23\x12\x61\xf5\x48\x4b\x2b"
+			  "\xc5\xdf\x15\xd9\x87\x01\xaa\xac"
+			  "\x1e\x7c\xad\x73\x78\x18\x63\xe0"
+			  "\x8b\x9f\x81\xd8\x12\x6a\x28\x10"
+			  "\xbe\x04\x68\x8a\x09\x7c\x1b\x1c"
+			  "\x83\x66\x80\x47\x80\xe8\xfd\x35"
+			  "\x1c\x97\x6f\xae\x49\x10\x66\xcc"
+			  "\xc6\xd8\xcc\x3a\x84\x91\x20\x77"
+			  "\x72\xe4\x24\xd2\x37\x9f\xc5\xc9"
+			  "\x25\x94\x10\x5f\x40\x00\x64\x99"
+			  "\xdc\xae\xd7\x21\x09\x78\x50\x15"
+			  "\xac\x5f\xc6\x2c\xa2\x0b\xa9\x39"
+			  "\x87\x6e\x6d\xab\xde\x08\x51\x16"
+			  "\xc7\x13\xe9\xea\xed\x06\x8e\x2c"
+			  "\xf8\x37\x8c\xf0\xa6\x96\x8d\x43"
+			  "\xb6\x98\x37\xb2\x43\xed\xde\xdf"
+			  "\x89\x1a\xe7\xeb\x9d\xa1\x7b\x0b"
+			  "\x77\xb0\xe2\x75\xc0\xf1\x98\xd9"
+			  "\x80\x55\xc9\x34\x91\xd1\x59\xe8"
+			  "\x4b\x0f\xc1\xa9\x4b\x7a\x84\x06"
+			  "\x20\xa8\x5d\xfa\xd1\xde\x70\x56"
+			  "\x2f\x9e\x91\x9c\x20\xb3\x24\xd8"
+			  "\x84\x3d\xe1\x8c\x7e\x62\x52\xe5"
+			  "\x44\x4b\x9f\xc2\x93\x03\xea\x2b"
+			  "\x59\xc5\xfa\x3f\x91\x2b\xbb\x23"
+			  "\xf5\xb2\x7b\xf5\x38\xaf\xb3\xee"
+			  "\x63\xdc\x7b\xd1\xff\xaa\x8b\xab"
+			  "\x82\x6b\x37\x04\xeb\x74\xbe\x79"
+			  "\xb9\x83\x90\xef\x20\x59\x46\xff"
+			  "\xe9\x97\x3e\x2f\xee\xb6\x64\x18"
+			  "\x38\x4c\x7a\x4a\xf9\x61\xe8\x9a"
+			  "\xa1\xb5\x01\xa6\x47\xd3\x11\xd4"
+			  "\xce\xd3\x91\x49\x88\xc7\xb8\x4d"
+			  "\xb1\xb9\x07\x6d\x16\x72\xae\x46"
+			  "\x5e\x03\xa1\x4b\xb6\x02\x30\xa8"
+			  "\x3d\xa9\x07\x2a\x7c\x19\xe7\x62"
+			  "\x87\xe3\x82\x2f\x6f\xe1\x09\xd9"
+			  "\x94\x97\xea\xdd\x58\x9e\xae\x76"
+			  "\x7e\x35\xe5\xb4\xda\x7e\xf4\xde"
+			  "\xf7\x32\x87\xcd\x93\xbf\x11\x56"
+			  "\x11\xbe\x08\x74\xe1\x69\xad\xe2"
+			  "\xd7\xf8\x86\x75\x8a\x3c\xa4\xbe"
+			  "\x70\xa7\x1b\xfc\x0b\x44\x2a\x76"
+			  "\x35\xea\x5d\x85\x81\xaf\x85\xeb"
+			  "\xa0\x1c\x61\xc2\xf7\x4f\xa5\xdc"
+			  "\x02\x7f\xf6\x95\x40\x6e\x8a\x9a"
+			  "\xf3\x5d\x25\x6e\x14\x3a\x22\xc9"
+			  "\x37\x1c\xeb\x46\x54\x3f\xa5\x91"
+			  "\xc2\xb5\x8c\xfe\x53\x08\x97\x32"
+			  "\x1b\xb2\x30\x27\xfe\x25\x5d\xdc"
+			  "\x08\x87\xd0\xe5\x94\x1a\xd4\xf1"
+			  "\xfe\xd6\xb4\xa3\xe6\x74\x81\x3c"
+			  "\x1b\xb7\x31\xa7\x22\xfd\xd4\xdd"
+			  "\x20\x4e\x7c\x51\xb0\x60\x73\xb8"
+			  "\x9c\xac\x91\x90\x7e\x01\xb0\xe1"
+			  "\x8a\x2f\x75\x1c\x53\x2a\x98\x2a"
+			  "\x06\x52\x95\x52\xb2\xe9\x25\x2e"
+			  "\x4c\xe2\x5a\x00\xb2\x13\x81\x03"
+			  "\x77\x66\x0d\xa5\x99\xda\x4e\x8c"
+			  "\xac\xf3\x13\x53\x27\x45\xaf\x64"
+			  "\x46\xdc\xea\x23\xda\x97\xd1\xab"
+			  "\x7d\x6c\x30\x96\x1f\xbc\x06\x34"
+			  "\x18\x0b\x5e\x21\x35\x11\x8d\x4c"
+			  "\xe0\x2d\xe9\x50\x16\x74\x81\xa8"
+			  "\xb4\x34\xb9\x72\x42\xa6\xcc\xbc"
+			  "\xca\x34\x83\x27\x10\x5b\x68\x45"
+			  "\x8f\x52\x22\x0c\x55\x3d\x29\x7c"
+			  "\xe3\xc0\x66\x05\x42\x91\x5f\x58"
+			  "\xfe\x4a\x62\xd9\x8c\xa9\x04\x19"
+			  "\x04\xa9\x08\x4b\x57\xfc\x67\x53"
+			  "\x08\x7c\xbc\x66\x8a\xb0\xb6\x9f"
+			  "\x92\xd6\x41\x7c\x5b\x2a\x00\x79"
+			  "\x72",
+		.ctext	= "\xe1\xb6\x8b\x5c\x80\xb8\xcc\x08"
+			  "\x1b\x84\xb2\xd1\xad\xa4\x70\xac"
+			  "\x67\xa9\x39\x27\xac\xb4\x5b\xb7"
+			  "\x4c\x26\x77\x23\x1d\xce\x0a\xbe"
+			  "\x18\x9e\x42\x8b\xbd\x7f\xd6\xf1"
+			  "\xf1\x6b\xe2\x6d\x7f\x92\x0e\xcb"
+			  "\xb8\x79\xba\xb4\xac\x7e\x2d\xc0"
+			  "\x9e\x83\x81\x91\xd5\xea\xc3\x12"
+			  "\x8d\xa4\x26\x70\xa4\xf9\x71\x0b"
+			  "\xbd\x2e\xe1\xb3\x80\x42\x25\xb3"
+			  "\x0b\x31\x99\xe1\x0d\xde\xa6\x90"
+			  "\xf2\xa3\x10\xf7\xe5\xf3\x83\x1e"
+			  "\x2c\xfb\x4d\xf0\x45\x3d\x28\x3c"
+			  "\xb8\xf1\xcb\xbf\x67\xd8\x43\x5a"
+			  "\x9d\x7b\x73\x29\x88\x0f\x13\x06"
+			  "\x37\x50\x0d\x7c\xe6\x9b\x07\xdd"
+			  "\x7e\x01\x1f\x81\x90\x10\x69\xdb"
+			  "\xa4\xad\x8a\x5e\xac\x30\x72\xf2"
+			  "\x36\xcd\xe3\x23\x49\x02\x93\xfa"
+			  "\x3d\xbb\xe2\x98\x83\xeb\xe9\x8d"
+			  "\xb3\x8f\x11\xaa\x53\xdb\xaf\x2e"
+			  "\x95\x13\x99\x3d\x71\xbd\x32\x92"
+			  "\xdd\xfc\x9d\x5e\x6f\x63\x2c\xee"
+			  "\x91\x1f\x4c\x64\x3d\x87\x55\x0f"
+			  "\xcc\x3d\x89\x61\x53\x02\x57\x8f"
+			  "\xe4\x77\x29\x32\xaf\xa6\x2f\x0a"
+			  "\xae\x3c\x3f\x3f\xf4\xfb\x65\x52"
+			  "\xc5\xc1\x78\x78\x53\x28\xad\xed"
+			  "\xd1\x67\x37\xc7\x59\x70\xcd\x0a"
+			  "\xb8\x0f\x80\x51\x9f\xc0\x12\x5e"
+			  "\x06\x0a\x7e\xec\x24\x5f\x73\x00"
+			  "\xb1\x0b\x31\x47\x4f\x73\x8d\xb4"
+			  "\xce\xf3\x55\x45\x6c\x84\x27\xba"
+			  "\xb9\x6f\x03\x4a\xeb\x98\x88\x6e"
+			  "\x53\xed\x25\x19\x0d\x8f\xfe\xca"
+			  "\x60\xe5\x00\x93\x6e\x3c\xff\x19"
+			  "\xae\x08\x3b\x8a\xa6\x84\x05\xfe"
+			  "\x9b\x59\xa0\x8c\xc8\x05\x45\xf5"
+			  "\x05\x37\xdc\x45\x6f\x8b\x95\x8c"
+			  "\x4e\x11\x45\x7a\xce\x21\xa5\xf7"
+			  "\x71\x67\xb9\xce\xd7\xf9\xe9\x5e"
+			  "\x60\xf5\x53\x7a\xa8\x85\x14\x03"
+			  "\xa0\x92\xec\xf3\x51\x80\x84\xc4"
+			  "\xdc\x11\x9e\x57\xce\x4b\x45\xcf"
+			  "\x90\x95\x85\x0b\x96\xe9\xee\x35"
+			  "\x10\xb8\x9b\xf2\x59\x4a\xc6\x7e"
+			  "\x85\xe5\x6f\x38\x51\x93\x40\x0c"
+			  "\x99\xd7\x7f\x32\xa8\x06\x27\xd1"
+			  "\x2b\xd5\xb5\x3a\x1a\xe1\x5e\xda"
+			  "\xcd\x5a\x50\x30\x3c\xc7\xe7\x65"
+			  "\xa6\x07\x0b\x98\x91\xc6\x20\x27"
+			  "\x2a\x03\x63\x1b\x1e\x3d\xaf\xc8"
+			  "\x71\x48\x46\x6a\x64\x28\xf9\x3d"
+			  "\xd1\x1d\xab\xc8\x40\x76\xc2\x39"
+			  "\x4e\x00\x75\xd2\x0e\x82\x58\x8c"
+			  "\xd3\x73\x5a\xea\x46\x89\xbe\xfd"
+			  "\x4e\x2c\x0d\x94\xaa\x9b\x68\xac"
+			  "\x86\x87\x30\x7e\xa9\x16\xcd\x59"
+			  "\xd2\xa6\xbe\x0a\xd8\xf5\xfd\x2d"
+			  "\x49\x69\xd2\x1a\x90\xd2\x1b\xed"
+			  "\xff\x71\x04\x87\x87\x21\xc4\xb8"
+			  "\x1f\x5b\x51\x33\xd0\xd6\x59\x9a"
+			  "\x03\x0e\xd3\x8b\xfb\x57\x73\xfd"
+			  "\x5a\x52\x63\x82\xc8\x85\x2f\xcb"
+			  "\x74\x6d\x4e\xd9\x68\x37\x85\x6a"
+			  "\xd4\xfb\x94\xed\x8d\xd1\x1a\xaf"
+			  "\x76\xa7\xb7\x88\xd0\x2b\x4e\xda"
+			  "\xec\x99\x94\x27\x6f\x87\x8c\xdf"
+			  "\x4b\x5e\xa6\x66\xdd\xcb\x33\x7b"
+			  "\x64\x94\x31\xa8\x37\xa6\x1d\xdb"
+			  "\x0d\x5c\x93\xa4\x40\xf9\x30\x53"
+			  "\x4b\x74\x8d\xdd\xf6\xde\x3c\xac"
+			  "\x5c\x80\x01\x3a\xef\xb1\x9a\x02"
+			  "\x0c\x22\x8e\xe7\x44\x09\x74\x4c"
+			  "\xf2\x9a\x27\x69\x7f\x12\x32\x36"
+			  "\xde\x92\xdf\xde\x8f\x5b\x31\xab"
+			  "\x4a\x01\x26\xe0\xb1\xda\xe8\x37"
+			  "\x21\x64\xe8\xff\x69\xfc\x9e\x41"
+			  "\xd2\x96\x2d\x18\x64\x98\x33\x78"
+			  "\x24\x61\x73\x9b\x47\x29\xf1\xa7"
+			  "\xcb\x27\x0f\xf0\x85\x6d\x8c\x9d"
+			  "\x2c\x95\x9e\xe5\xb2\x8e\x30\x29"
+			  "\x78\x8a\x9d\x65\xb4\x8e\xde\x7b"
+			  "\xd9\x00\x50\xf5\x7f\x81\xc3\x1b"
+			  "\x25\x85\xeb\xc2\x8c\x33\x22\x1e"
+			  "\x68\x38\x22\x30\xd8\x2e\x00\x98"
+			  "\x85\x16\x06\x56\xb4\x81\x74\x20"
+			  "\x95\xdb\x1c\x05\x19\xe8\x23\x4d"
+			  "\x65\x5d\xcc\xd8\x7f\xc4\x2d\x0f"
+			  "\x57\x26\x71\x07\xad\xaa\x71\x9f"
+			  "\x19\x76\x2f\x25\x51\x88\xe4\xc0"
+			  "\x82\x6e\x08\x05\x37\x04\xee\x25"
+			  "\x23\x90\xe9\x4e\xce\x9b\x16\xc1"
+			  "\x31\xe7\x6e\x2c\x1b\xe1\x85\x9a"
+			  "\x0c\x8c\xbb\x12\x1e\x68\x7b\x93"
+			  "\xa9\x3c\x39\x56\x23\x3e\x6e\xc7"
+			  "\x77\x84\xd3\xe0\x86\x59\xaa\xb9"
+			  "\xd5\x53\x58\xc9\x0a\x83\x5f\x85"
+			  "\xd8\x47\x14\x67\x8a\x3c\x17\xe0"
+			  "\xab\x02\x51\xea\xf1\xf0\x4f\x30"
+			  "\x7d\xe0\x92\xc2\x5f\xfb\x19\x5a"
+			  "\x3f\xbd\xf4\x39\xa4\x31\x0c\x39"
+			  "\xd1\xae\x4e\xf7\x65\x7f\x1f\xce"
+			  "\xc2\x39\xd1\x84\xd4\xe5\x02\xe0"
+			  "\x58\xaa\xf1\x5e\x81\xaf\x7f\x72"
+			  "\x0f\x08\x99\x43\xb9\xd8\xac\x41"
+			  "\x35\x55\xf2\xb2\xd4\x98\xb8\x3b"
+			  "\x2b\x3c\x3e\x16\x06\x31\xfc\x79"
+			  "\x47\x38\x63\x51\xc5\xd0\x26\xd7"
+			  "\x43\xb4\x2b\xd9\xc5\x05\xf2\x9d"
+			  "\x18\xc9\x26\x82\x56\xd2\x11\x05"
+			  "\xb6\x89\xb4\x43\x9c\xb5\x9d\x11"
+			  "\x6c\x83\x37\x71\x27\x1c\xae\xbf"
+			  "\xcd\x57\xd2\xee\x0d\x5a\x15\x26"
+			  "\x67\x88\x80\x80\x1b\xdc\xc1\x62"
+			  "\xdd\x4c\xff\x92\x5c\x6c\xe1\xa0"
+			  "\xe3\x79\xa9\x65\x8c\x8c\x14\x42"
+			  "\xe5\x11\xd2\x1a\xad\xa9\x56\x6f"
+			  "\x98\xfc\x8a\x7b\x56\x1f\xc6\xc1"
+			  "\x52\x12\x92\x9b\x41\x0f\x4b\xae"
+			  "\x1b\x4a\xbc\xfe\x23\xb6\x94\x70"
+			  "\x04\x30\x9e\x69\x47\xbe\xb8\x8f"
+			  "\xca\x45\xd7\x8a\xf4\x78\x3e\xaa"
+			  "\x71\x17\xd8\x1e\xb8\x11\x8f\xbc"
+			  "\xc8\x1a\x65\x7b\x41\x89\x72\xc7"
+			  "\x5f\xbe\xc5\x2a\xdb\x5c\x54\xf9"
+			  "\x25\xa3\x7a\x80\x56\x9c\x8c\xab"
+			  "\x26\x19\x10\x36\xa6\xf3\x14\x79"
+			  "\x40\x98\x70\x68\xb7\x35\xd9\xb9"
+			  "\x27\xd4\xe7\x74\x5b\x3d\x97\xb4"
+			  "\xd9\xaa\xd9\xf2\xb5\x14\x84\x1f"
+			  "\xa9\xde\x12\x44\x5b\x00\xc0\xbc"
+			  "\xc8\x11\x25\x1b\x67\x7a\x15\x72"
+			  "\xa6\x31\x6f\xf4\x68\x7a\x86\x9d"
+			  "\x43\x1c\x5f\x16\xd3\xad\x2e\x52"
+			  "\xf3\xb4\xc3\xfa\x27\x2e\x68\x6c"
+			  "\x06\xe7\x4c\x4f\xa2\xe0\xe4\x21"
+			  "\x5d\x9e\x33\x58\x8d\xbf\xd5\x70"
+			  "\xf8\x80\xa5\xdd\xe7\x18\x79\xfa"
+			  "\x7b\xfd\x09\x69\x2c\x37\x32\xa8"
+			  "\x65\xfa\x8d\x8b\x5c\xcc\xe8\xf3"
+			  "\x37\xf6\xa6\xc6\x5c\xa2\x66\x79"
+			  "\xfa\x8a\xa7\xd1\x0b\x2e\x1b\x5e"
+			  "\x95\x35\x00\x76\xae\x42\xf7\x50"
+			  "\x51\x78\xfb\xb4\x28\x24\xde\x1a"
+			  "\x70\x8b\xed\xca\x3c\x5e\xe4\xbd"
+			  "\x28\xb5\xf3\x76\x4f\x67\x5d\x81"
+			  "\xb2\x60\x87\xd9\x7b\x19\x1a\xa7"
+			  "\x79\xa2\xfa\x3f\x9e\xa9\xd7\x25"
+			  "\x61\xe1\x74\x31\xa2\x77\xa0\x1b"
+			  "\xf6\xf7\xcb\xc5\xaa\x9e\xce\xf9"
+			  "\x9b\x96\xef\x51\xc3\x1a\x44\x96"
+			  "\xae\x17\x50\xab\x29\x08\xda\xcc"
+			  "\x1a\xb3\x12\xd0\x24\xe4\xe2\xe0"
+			  "\xc6\xe3\xcc\x82\xd0\xba\x47\x4c"
+			  "\x3f\x49\xd7\xe8\xb6\x61\xaa\x65"
+			  "\x25\x18\x40\x2d\x62\x25\x02\x71"
+			  "\x61\xa2\xc1\xb2\x13\xd2\x71\x3f"
+			  "\x43\x1a\xc9\x09\x92\xff\xd5\x57"
+			  "\xf0\xfc\x5e\x1c\xf1\xf5\xf9\xf3"
+			  "\x5b",
+		.len	= 1281,
+		.also_non_np = 1,
+		.np	= 3,
+		.tap	= { 1200, 1, 80 },
+	},
+};
+
 /*
  * CTS (Cipher Text Stealing) mode tests
  */
diff --git a/include/crypto/chacha.h b/include/crypto/chacha.h
index ae79e9983c72f..3d261f5cd156d 100644
--- a/include/crypto/chacha.h
+++ b/include/crypto/chacha.h
@@ -5,6 +5,11 @@
  * XChaCha extends ChaCha's nonce to 192 bits, while provably retaining ChaCha's
  * security.  Here they share the same key size, tfm context, and setkey
  * function; only their IV size and encrypt/decrypt function differ.
+ *
+ * The ChaCha paper specifies 20, 12, and 8-round variants.  In general, it is
+ * recommended to use the 20-round variant ChaCha20.  However, the other
+ * variants can be needed in some performance-sensitive scenarios.  The generic
+ * ChaCha code currently allows only the 20 and 12-round variants.
  */
 
 #ifndef _CRYPTO_CHACHA_H
@@ -39,6 +44,8 @@ void crypto_chacha_init(u32 *state, struct chacha_ctx *ctx, u8 *iv);
 
 int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
 			   unsigned int keysize);
+int crypto_chacha12_setkey(struct crypto_skcipher *tfm, const u8 *key,
+			   unsigned int keysize);
 
 int crypto_chacha_crypt(struct skcipher_request *req);
 int crypto_xchacha_crypt(struct skcipher_request *req);
diff --git a/lib/chacha.c b/lib/chacha.c
index 0a2c2e5b7b84d..c4d69a83fcd2d 100644
--- a/lib/chacha.c
+++ b/lib/chacha.c
@@ -21,7 +21,7 @@ static void chacha_permute(u32 *x, int nrounds)
 	int i;
 
 	/* whitelist the allowed round counts */
-	BUG_ON(nrounds != 20);
+	BUG_ON(nrounds != 20 && nrounds != 12);
 
 	for (i = 0; i < nrounds; i += 2) {
 		x[0]  += x[4];    x[12] = rol32(x[12] ^ x[0],  16);
@@ -70,7 +70,7 @@ static void chacha_permute(u32 *x, int nrounds)
  * chacha_block - generate one keystream block and increment block counter
  * @state: input state matrix (16 32-bit words)
  * @stream: output keystream block (64 bytes)
- * @nrounds: number of rounds (currently must be 20)
+ * @nrounds: number of rounds (20 or 12; 20 is recommended)
  *
  * This is the ChaCha core, a function from 64-byte strings to 64-byte strings.
  * The caller has already converted the endianness of the input.  This function
@@ -96,7 +96,7 @@ EXPORT_SYMBOL(chacha_block);
  * hchacha_block - abbreviated ChaCha core, for XChaCha
  * @in: input state matrix (16 32-bit words)
  * @out: output (8 32-bit words)
- * @nrounds: number of rounds (currently must be 20)
+ * @nrounds: number of rounds (20 or 12; 20 is recommended)
  *
  * HChaCha is the ChaCha equivalent of HSalsa and is an intermediate step
  * towards XChaCha (see https://cr.yp.to/snuffle/xsalsa-20081128.pdf).  HChaCha
-- 
2.19.1.331.ge82ca0e54c-goog



* [RFC PATCH v2 05/12] crypto: arm/chacha20 - add XChaCha20 support
  2018-10-15 17:54 [RFC PATCH v2 00/12] crypto: Adiantum support Eric Biggers
                   ` (3 preceding siblings ...)
  2018-10-15 17:54 ` [RFC PATCH v2 04/12] crypto: chacha - add XChaCha12 support Eric Biggers
@ 2018-10-15 17:54 ` Eric Biggers
  2018-10-20  2:29   ` Ard Biesheuvel
  2018-10-15 17:54 ` [RFC PATCH v2 06/12] crypto: arm/chacha20 - refactor to allow varying number of rounds Eric Biggers
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 54+ messages in thread
From: Eric Biggers @ 2018-10-15 17:54 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-fscrypt, linux-arm-kernel, linux-kernel, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

From: Eric Biggers <ebiggers@google.com>

Add an XChaCha20 implementation that is hooked up to the ARM NEON
implementation of ChaCha20.  This is needed for use in the Adiantum
encryption mode; see the generic code patch,
"crypto: chacha20-generic - add XChaCha20 support", for more details.

We also update the NEON code to support HChaCha20 on one block, so we
can use that in XChaCha20 rather than calling the generic HChaCha20.
This required factoring the permutation out into its own macro.
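
For readers unfamiliar with the construction, here is a rough portable C
sketch of what "HChaCha20 on one block" computes, and of how the 32-byte
XChaCha IV is rearranged into the 16-byte ChaCha IV (mirroring the two
memcpy() calls in the glue code below).  This is only an illustration under
stated assumptions: every name in it (sketch_hchacha, sketch_permute,
sketch_xchacha_real_iv, ...) is hypothetical, it uses plain userspace C
rather than kernel helpers, and it is not the NEON code from this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* All names below are hypothetical; this is not the kernel code. */

static uint32_t rol32(uint32_t v, int n)
{
	return (v << n) | (v >> (32 - n));
}

static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* One ChaCha quarter round on four state words. */
#define QR(a, b, c, d) do {			\
	a += b; d = rol32(d ^ a, 16);		\
	c += d; b = rol32(b ^ c, 12);		\
	a += b; d = rol32(d ^ a, 8);		\
	c += d; b = rol32(b ^ c, 7);		\
} while (0)

/* The permutation: nrounds / 2 double rounds (columns, then diagonals). */
static void sketch_permute(uint32_t x[16], int nrounds)
{
	int i;

	assert(nrounds > 0 && nrounds % 2 == 0);
	for (i = 0; i < nrounds; i += 2) {
		QR(x[0], x[4], x[8],  x[12]);
		QR(x[1], x[5], x[9],  x[13]);
		QR(x[2], x[6], x[10], x[14]);
		QR(x[3], x[7], x[11], x[15]);
		QR(x[0], x[5], x[10], x[15]);
		QR(x[1], x[6], x[11], x[12]);
		QR(x[2], x[7], x[8],  x[13]);
		QR(x[3], x[4], x[9],  x[14]);
	}
}

/*
 * HChaCha: permute the state built from the key and the first 16 nonce
 * bytes, then output words 0..3 and 12..15.  Unlike the ChaCha block
 * function, the original state is not added back in.
 */
void sketch_hchacha(const uint8_t key[32], const uint8_t nonce16[16],
		    int nrounds, uint32_t out[8])
{
	uint32_t x[16] = { 0x61707865, 0x3320646e, 0x79622d32, 0x6b206574 };
	int i;

	for (i = 0; i < 8; i++)
		x[4 + i] = load_le32(key + 4 * i);
	for (i = 0; i < 4; i++)
		x[12 + i] = load_le32(nonce16 + 4 * i);

	sketch_permute(x, nrounds);

	memcpy(&out[0], &x[0], 16);
	memcpy(&out[4], &x[12], 16);
}

/* Rearrange the 32-byte XChaCha IV into the 16-byte IV for the subkey. */
void sketch_xchacha_real_iv(const uint8_t iv[32], uint8_t real_iv[16])
{
	memcpy(&real_iv[0], iv + 24, 8);	/* last 8 IV bytes go first */
	memcpy(&real_iv[8], iv + 16, 8);	/* then IV bytes 16..23 */
}
```

The eight output words of sketch_hchacha() play the role of subctx.key in
xchacha20_neon(): they become the key for an ordinary ChaCha20 pass that uses
the rearranged IV.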

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 arch/arm/crypto/Kconfig              |   2 +-
 arch/arm/crypto/chacha20-neon-core.S |  68 ++++++++++------
 arch/arm/crypto/chacha20-neon-glue.c | 111 ++++++++++++++++++++-------
 3 files changed, 130 insertions(+), 51 deletions(-)

diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index ef0c7feea6e29..0aa1471f27d2e 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -117,7 +117,7 @@ config CRYPTO_CRC32_ARM_CE
 	select CRYPTO_HASH
 
 config CRYPTO_CHACHA20_NEON
-	tristate "NEON accelerated ChaCha20 symmetric cipher"
+	tristate "NEON accelerated ChaCha20 stream cipher algorithms"
 	depends on KERNEL_MODE_NEON
 	select CRYPTO_BLKCIPHER
 	select CRYPTO_CHACHA20
diff --git a/arch/arm/crypto/chacha20-neon-core.S b/arch/arm/crypto/chacha20-neon-core.S
index 50e7b98968189..db59f1fbc728b 100644
--- a/arch/arm/crypto/chacha20-neon-core.S
+++ b/arch/arm/crypto/chacha20-neon-core.S
@@ -52,33 +52,22 @@
 	.fpu		neon
 	.align		5
 
-ENTRY(chacha20_block_xor_neon)
-	// r0: Input state matrix, s
-	// r1: 1 data block output, o
-	// r2: 1 data block input, i
-
-	//
-	// This function encrypts one ChaCha20 block by loading the state matrix
-	// in four NEON registers. It performs matrix operation on four words in
-	// parallel, but requireds shuffling to rearrange the words after each
-	// round.
-	//
-
-	// x0..3 = s0..3
-	add		ip, r0, #0x20
-	vld1.32		{q0-q1}, [r0]
-	vld1.32		{q2-q3}, [ip]
-
-	vmov		q8, q0
-	vmov		q9, q1
-	vmov		q10, q2
-	vmov		q11, q3
+/*
+ * _chacha20_permute - permute one block
+ *
+ * Permute one 64-byte block where the state matrix is stored in the four NEON
+ * registers q0-q3.  It performs matrix operation on four words in parallel, but
+ * requires shuffling to rearrange the words after each round.
+ *
+ * Clobbers: r3, q4-q5
+ */
+.macro	_chacha20_permute
 
 	adr		ip, .Lrol8_table
 	mov		r3, #10
 	vld1.8		{d10}, [ip, :64]
 
-.Ldoubleround:
+.Ldoubleround_\@:
 	// x0 += x1, x3 = rotl32(x3 ^ x0, 16)
 	vadd.i32	q0, q0, q1
 	veor		q3, q3, q0
@@ -140,7 +129,25 @@ ENTRY(chacha20_block_xor_neon)
 	vext.8		q3, q3, q3, #4
 
 	subs		r3, r3, #1
-	bne		.Ldoubleround
+	bne		.Ldoubleround_\@
+.endm
+
+ENTRY(chacha20_block_xor_neon)
+	// r0: Input state matrix, s
+	// r1: 1 data block output, o
+	// r2: 1 data block input, i
+
+	// x0..3 = s0..3
+	add		ip, r0, #0x20
+	vld1.32		{q0-q1}, [r0]
+	vld1.32		{q2-q3}, [ip]
+
+	vmov		q8, q0
+	vmov		q9, q1
+	vmov		q10, q2
+	vmov		q11, q3
+
+	_chacha20_permute
 
 	add		ip, r2, #0x20
 	vld1.8		{q4-q5}, [r2]
@@ -169,6 +176,21 @@ ENTRY(chacha20_block_xor_neon)
 	bx		lr
 ENDPROC(chacha20_block_xor_neon)
 
+ENTRY(hchacha20_block_neon)
+	// r0: Input state matrix, s
+	// r1: output (8 32-bit words)
+
+	vld1.32		{q0-q1}, [r0]!
+	vld1.32		{q2-q3}, [r0]
+
+	_chacha20_permute
+
+	vst1.32		{q0}, [r1]!
+	vst1.32		{q3}, [r1]
+
+	bx		lr
+ENDPROC(hchacha20_block_neon)
+
 	.align		4
 .Lctrinc:	.word	0, 1, 2, 3
 .Lrol8_table:	.byte	3, 0, 1, 2, 7, 4, 5, 6
diff --git a/arch/arm/crypto/chacha20-neon-glue.c b/arch/arm/crypto/chacha20-neon-glue.c
index 7386eb1c1889d..becc7990b1d39 100644
--- a/arch/arm/crypto/chacha20-neon-glue.c
+++ b/arch/arm/crypto/chacha20-neon-glue.c
@@ -1,5 +1,5 @@
 /*
- * ChaCha20 256-bit cipher algorithm, RFC7539, ARM NEON functions
+ * ChaCha20 (RFC7539) and XChaCha20 stream ciphers, NEON accelerated
  *
  * Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
  *
@@ -30,6 +30,7 @@
 
 asmlinkage void chacha20_block_xor_neon(u32 *state, u8 *dst, const u8 *src);
 asmlinkage void chacha20_4block_xor_neon(u32 *state, u8 *dst, const u8 *src);
+asmlinkage void hchacha20_block_neon(const u32 *state, u32 *out);
 
 static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
 			    unsigned int bytes)
@@ -57,22 +58,17 @@ static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
 	}
 }
 
-static int chacha20_neon(struct skcipher_request *req)
+static int chacha20_neon_stream_xor(struct skcipher_request *req,
+				    struct chacha_ctx *ctx, u8 *iv)
 {
-	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
-	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
 	struct skcipher_walk walk;
 	u32 state[16];
 	int err;
 
-	if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd())
-		return crypto_chacha_crypt(req);
-
 	err = skcipher_walk_virt(&walk, req, true);
 
-	crypto_chacha_init(state, ctx, walk.iv);
+	crypto_chacha_init(state, ctx, iv);
 
-	kernel_neon_begin();
 	while (walk.nbytes > 0) {
 		unsigned int nbytes = walk.nbytes;
 
@@ -83,27 +79,85 @@ static int chacha20_neon(struct skcipher_request *req)
 				nbytes);
 		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
 	}
+
+	return err;
+}
+
+static int chacha20_neon(struct skcipher_request *req)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
+	int err;
+
+	if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd())
+		return crypto_chacha_crypt(req);
+
+	kernel_neon_begin();
+	err = chacha20_neon_stream_xor(req, ctx, req->iv);
+	kernel_neon_end();
+	return err;
+}
+
+static int xchacha20_neon(struct skcipher_request *req)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct chacha_ctx subctx;
+	u32 state[16];
+	u8 real_iv[16];
+	int err;
+
+	if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd())
+		return crypto_xchacha_crypt(req);
+
+	crypto_chacha_init(state, ctx, req->iv);
+
+	kernel_neon_begin();
+
+	hchacha20_block_neon(state, subctx.key);
+	memcpy(&real_iv[0], req->iv + 24, 8);
+	memcpy(&real_iv[8], req->iv + 16, 8);
+	err = chacha20_neon_stream_xor(req, &subctx, real_iv);
+
 	kernel_neon_end();
 
 	return err;
 }
 
-static struct skcipher_alg alg = {
-	.base.cra_name		= "chacha20",
-	.base.cra_driver_name	= "chacha20-neon",
-	.base.cra_priority	= 300,
-	.base.cra_blocksize	= 1,
-	.base.cra_ctxsize	= sizeof(struct chacha_ctx),
-	.base.cra_module	= THIS_MODULE,
-
-	.min_keysize		= CHACHA_KEY_SIZE,
-	.max_keysize		= CHACHA_KEY_SIZE,
-	.ivsize			= CHACHA_IV_SIZE,
-	.chunksize		= CHACHA_BLOCK_SIZE,
-	.walksize		= 4 * CHACHA_BLOCK_SIZE,
-	.setkey			= crypto_chacha20_setkey,
-	.encrypt		= chacha20_neon,
-	.decrypt		= chacha20_neon,
+static struct skcipher_alg algs[] = {
+	{
+		.base.cra_name		= "chacha20",
+		.base.cra_driver_name	= "chacha20-neon",
+		.base.cra_priority	= 300,
+		.base.cra_blocksize	= 1,
+		.base.cra_ctxsize	= sizeof(struct chacha_ctx),
+		.base.cra_module	= THIS_MODULE,
+
+		.min_keysize		= CHACHA_KEY_SIZE,
+		.max_keysize		= CHACHA_KEY_SIZE,
+		.ivsize			= CHACHA_IV_SIZE,
+		.chunksize		= CHACHA_BLOCK_SIZE,
+		.walksize		= 4 * CHACHA_BLOCK_SIZE,
+		.setkey			= crypto_chacha20_setkey,
+		.encrypt		= chacha20_neon,
+		.decrypt		= chacha20_neon,
+	}, {
+		.base.cra_name		= "xchacha20",
+		.base.cra_driver_name	= "xchacha20-neon",
+		.base.cra_priority	= 300,
+		.base.cra_blocksize	= 1,
+		.base.cra_ctxsize	= sizeof(struct chacha_ctx),
+		.base.cra_module	= THIS_MODULE,
+
+		.min_keysize		= CHACHA_KEY_SIZE,
+		.max_keysize		= CHACHA_KEY_SIZE,
+		.ivsize			= XCHACHA_IV_SIZE,
+		.chunksize		= CHACHA_BLOCK_SIZE,
+		.walksize		= 4 * CHACHA_BLOCK_SIZE,
+		.setkey			= crypto_chacha20_setkey,
+		.encrypt		= xchacha20_neon,
+		.decrypt		= xchacha20_neon,
+	}
 };
 
 static int __init chacha20_simd_mod_init(void)
@@ -111,12 +165,12 @@ static int __init chacha20_simd_mod_init(void)
 	if (!(elf_hwcap & HWCAP_NEON))
 		return -ENODEV;
 
-	return crypto_register_skcipher(&alg);
+	return crypto_register_skciphers(algs, ARRAY_SIZE(algs));
 }
 
 static void __exit chacha20_simd_mod_fini(void)
 {
-	crypto_unregister_skcipher(&alg);
+	crypto_unregister_skciphers(algs, ARRAY_SIZE(algs));
 }
 
 module_init(chacha20_simd_mod_init);
@@ -125,3 +179,6 @@ module_exit(chacha20_simd_mod_fini);
 MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
 MODULE_LICENSE("GPL v2");
 MODULE_ALIAS_CRYPTO("chacha20");
+MODULE_ALIAS_CRYPTO("chacha20-neon");
+MODULE_ALIAS_CRYPTO("xchacha20");
+MODULE_ALIAS_CRYPTO("xchacha20-neon");
-- 
2.19.1.331.ge82ca0e54c-goog



* [RFC PATCH v2 06/12] crypto: arm/chacha20 - refactor to allow varying number of rounds
  2018-10-15 17:54 [RFC PATCH v2 00/12] crypto: Adiantum support Eric Biggers
                   ` (4 preceding siblings ...)
  2018-10-15 17:54 ` [RFC PATCH v2 05/12] crypto: arm/chacha20 - add XChaCha20 support Eric Biggers
@ 2018-10-15 17:54 ` Eric Biggers
  2018-10-20  3:35   ` Ard Biesheuvel
  2018-10-15 17:54 ` [RFC PATCH v2 07/12] crypto: arm/chacha - add XChaCha12 support Eric Biggers
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 54+ messages in thread
From: Eric Biggers @ 2018-10-15 17:54 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-fscrypt, linux-arm-kernel, linux-kernel, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

From: Eric Biggers <ebiggers@google.com>

In preparation for adding XChaCha12 support, rename/refactor the NEON
implementation of ChaCha20 to support different numbers of rounds.
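
As a rough illustration of the loop-counter change in the permutation macro
(the counter was preloaded with 10 and decremented by 1 per double round; now
the caller supplies the round count in r3 and it is decremented by 2), here
is a small C model.  It models only the counter, not the cipher, and the
function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical model of the loop counter only; not the kernel code. */

/* Before: counter preloaded with 10, decremented once per double round. */
int sketch_iterations_fixed(void)
{
	int r3 = 10, iterations = 0;

	do {
		iterations++;		/* one double round = 2 rounds */
		r3 -= 1;
	} while (r3 != 0);

	return iterations;
}

/* After: the caller passes nrounds; decrement by 2 per double round. */
int sketch_iterations_param(int nrounds)
{
	int r3 = nrounds, iterations = 0;

	/* An odd count would step past zero instead of stopping at it. */
	assert(nrounds > 0 && nrounds % 2 == 0);

	do {
		iterations++;		/* one double round = 2 rounds */
		r3 -= 2;
	} while (r3 != 0);

	return iterations;
}
```

With nrounds = 20 the parameterized loop runs the same number of double
rounds as the old fixed loop, and with nrounds = 12 it runs fewer, which is
what the 12-round variant needs.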

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 arch/arm/crypto/Makefile                      |  4 +-
 ...hacha20-neon-core.S => chacha-neon-core.S} | 36 ++++++------
 ...hacha20-neon-glue.c => chacha-neon-glue.c} | 56 ++++++++++---------
 3 files changed, 52 insertions(+), 44 deletions(-)
 rename arch/arm/crypto/{chacha20-neon-core.S => chacha-neon-core.S} (96%)
 rename arch/arm/crypto/{chacha20-neon-glue.c => chacha-neon-glue.c} (73%)

diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile
index bd5bceef0605f..005482ff95047 100644
--- a/arch/arm/crypto/Makefile
+++ b/arch/arm/crypto/Makefile
@@ -9,7 +9,7 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM) += sha1-arm.o
 obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o
 obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o
 obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o
-obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o
+obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha-neon.o
 
 ce-obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o
 ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += sha1-arm-ce.o
@@ -52,7 +52,7 @@ aes-arm-ce-y	:= aes-ce-core.o aes-ce-glue.o
 ghash-arm-ce-y	:= ghash-ce-core.o ghash-ce-glue.o
 crct10dif-arm-ce-y	:= crct10dif-ce-core.o crct10dif-ce-glue.o
 crc32-arm-ce-y:= crc32-ce-core.o crc32-ce-glue.o
-chacha20-neon-y := chacha20-neon-core.o chacha20-neon-glue.o
+chacha-neon-y := chacha-neon-core.o chacha-neon-glue.o
 
 ifdef REGENERATE_ARM_CRYPTO
 quiet_cmd_perl = PERL    $@
diff --git a/arch/arm/crypto/chacha20-neon-core.S b/arch/arm/crypto/chacha-neon-core.S
similarity index 96%
rename from arch/arm/crypto/chacha20-neon-core.S
rename to arch/arm/crypto/chacha-neon-core.S
index db59f1fbc728b..4b12064449f78 100644
--- a/arch/arm/crypto/chacha20-neon-core.S
+++ b/arch/arm/crypto/chacha-neon-core.S
@@ -1,5 +1,5 @@
 /*
- * ChaCha20 256-bit cipher algorithm, RFC7539, ARM NEON functions
+ * ChaCha/XChaCha NEON helper functions
  *
  * Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
  *
@@ -53,18 +53,19 @@
 	.align		5
 
 /*
- * _chacha20_permute - permute one block
+ * _chacha_permute - permute one block
  *
  * Permute one 64-byte block where the state matrix is stored in the four NEON
  * registers q0-q3.  It performs matrix operation on four words in parallel, but
  * requires shuffling to rearrange the words after each round.
  *
+ * The round count is given in r3.
+ *
  * Clobbers: r3, q4-q5
  */
-.macro	_chacha20_permute
+.macro	_chacha_permute
 
 	adr		ip, .Lrol8_table
-	mov		r3, #10
 	vld1.8		{d10}, [ip, :64]
 
 .Ldoubleround_\@:
@@ -128,14 +129,15 @@
 	// x3 = shuffle32(x3, MASK(0, 3, 2, 1))
 	vext.8		q3, q3, q3, #4
 
-	subs		r3, r3, #1
+	subs		r3, r3, #2
 	bne		.Ldoubleround_\@
 .endm
 
-ENTRY(chacha20_block_xor_neon)
+ENTRY(chacha_block_xor_neon)
 	// r0: Input state matrix, s
 	// r1: 1 data block output, o
 	// r2: 1 data block input, i
+	// r3: nrounds
 
 	// x0..3 = s0..3
 	add		ip, r0, #0x20
@@ -147,7 +149,7 @@ ENTRY(chacha20_block_xor_neon)
 	vmov		q10, q2
 	vmov		q11, q3
 
-	_chacha20_permute
+	_chacha_permute
 
 	add		ip, r2, #0x20
 	vld1.8		{q4-q5}, [r2]
@@ -174,29 +176,31 @@ ENTRY(chacha20_block_xor_neon)
 	vst1.8		{q2-q3}, [ip]
 
 	bx		lr
-ENDPROC(chacha20_block_xor_neon)
+ENDPROC(chacha_block_xor_neon)
 
-ENTRY(hchacha20_block_neon)
+ENTRY(hchacha_block_neon)
 	// r0: Input state matrix, s
 	// r1: output (8 32-bit words)
+	// r2: nrounds
 
 	vld1.32		{q0-q1}, [r0]!
 	vld1.32		{q2-q3}, [r0]
 
-	_chacha20_permute
+	mov		r3, r2
+	_chacha_permute
 
 	vst1.32		{q0}, [r1]!
 	vst1.32		{q3}, [r1]
 
 	bx		lr
-ENDPROC(hchacha20_block_neon)
+ENDPROC(hchacha_block_neon)
 
 	.align		4
 .Lctrinc:	.word	0, 1, 2, 3
 .Lrol8_table:	.byte	3, 0, 1, 2, 7, 4, 5, 6
 
 	.align		5
-ENTRY(chacha20_4block_xor_neon)
+ENTRY(chacha_4block_xor_neon)
 	push		{r4-r5}
 	mov		r4, sp			// preserve the stack pointer
 	sub		ip, sp, #0x20		// allocate a 32 byte buffer
@@ -206,9 +210,10 @@ ENTRY(chacha20_4block_xor_neon)
 	// r0: Input state matrix, s
 	// r1: 4 data blocks output, o
 	// r2: 4 data blocks input, i
+	// r3: nrounds
 
 	//
-	// This function encrypts four consecutive ChaCha20 blocks by loading
+	// This function encrypts four consecutive ChaCha blocks by loading
 	// the state matrix in NEON registers four times. The algorithm performs
 	// each operation on the corresponding word of each state matrix, hence
 	// requires no word shuffling. The words are re-interleaved before the
@@ -241,7 +246,6 @@ ENTRY(chacha20_4block_xor_neon)
 	vdup.32		q0, d0[0]
 
 	adr		ip, .Lrol8_table
-	mov		r3, #10
 	b		1f
 
 .Ldoubleround4:
@@ -439,7 +443,7 @@ ENTRY(chacha20_4block_xor_neon)
 	vsri.u32	q5, q8, #25
 	vsri.u32	q6, q9, #25
 
-	subs		r3, r3, #1
+	subs		r3, r3, #2
 	bne		.Ldoubleround4
 
 	// x0..7[0-3] are in q0-q7, x10..15[0-3] are in q10-q15.
@@ -549,4 +553,4 @@ ENTRY(chacha20_4block_xor_neon)
 
 	pop		{r4-r5}
 	bx		lr
-ENDPROC(chacha20_4block_xor_neon)
+ENDPROC(chacha_4block_xor_neon)
diff --git a/arch/arm/crypto/chacha20-neon-glue.c b/arch/arm/crypto/chacha-neon-glue.c
similarity index 73%
rename from arch/arm/crypto/chacha20-neon-glue.c
rename to arch/arm/crypto/chacha-neon-glue.c
index becc7990b1d39..b236af4889c61 100644
--- a/arch/arm/crypto/chacha20-neon-glue.c
+++ b/arch/arm/crypto/chacha-neon-glue.c
@@ -28,24 +28,26 @@
 #include <asm/neon.h>
 #include <asm/simd.h>
 
-asmlinkage void chacha20_block_xor_neon(u32 *state, u8 *dst, const u8 *src);
-asmlinkage void chacha20_4block_xor_neon(u32 *state, u8 *dst, const u8 *src);
-asmlinkage void hchacha20_block_neon(const u32 *state, u32 *out);
-
-static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
-			    unsigned int bytes)
+asmlinkage void chacha_block_xor_neon(const u32 *state, u8 *dst, const u8 *src,
+				      int nrounds);
+asmlinkage void chacha_4block_xor_neon(const u32 *state, u8 *dst, const u8 *src,
+				       int nrounds);
+asmlinkage void hchacha_block_neon(const u32 *state, u32 *out, int nrounds);
+
+static void chacha_doneon(u32 *state, u8 *dst, const u8 *src,
+			  unsigned int bytes, int nrounds)
 {
 	u8 buf[CHACHA_BLOCK_SIZE];
 
 	while (bytes >= CHACHA_BLOCK_SIZE * 4) {
-		chacha20_4block_xor_neon(state, dst, src);
+		chacha_4block_xor_neon(state, dst, src, nrounds);
 		bytes -= CHACHA_BLOCK_SIZE * 4;
 		src += CHACHA_BLOCK_SIZE * 4;
 		dst += CHACHA_BLOCK_SIZE * 4;
 		state[12] += 4;
 	}
 	while (bytes >= CHACHA_BLOCK_SIZE) {
-		chacha20_block_xor_neon(state, dst, src);
+		chacha_block_xor_neon(state, dst, src, nrounds);
 		bytes -= CHACHA_BLOCK_SIZE;
 		src += CHACHA_BLOCK_SIZE;
 		dst += CHACHA_BLOCK_SIZE;
@@ -53,13 +55,13 @@ static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
 	}
 	if (bytes) {
 		memcpy(buf, src, bytes);
-		chacha20_block_xor_neon(state, buf, buf);
+		chacha_block_xor_neon(state, buf, buf, nrounds);
 		memcpy(dst, buf, bytes);
 	}
 }
 
-static int chacha20_neon_stream_xor(struct skcipher_request *req,
-				    struct chacha_ctx *ctx, u8 *iv)
+static int chacha_neon_stream_xor(struct skcipher_request *req,
+				  struct chacha_ctx *ctx, u8 *iv)
 {
 	struct skcipher_walk walk;
 	u32 state[16];
@@ -75,15 +77,15 @@ static int chacha20_neon_stream_xor(struct skcipher_request *req,
 		if (nbytes < walk.total)
 			nbytes = round_down(nbytes, walk.stride);
 
-		chacha20_doneon(state, walk.dst.virt.addr, walk.src.virt.addr,
-				nbytes);
+		chacha_doneon(state, walk.dst.virt.addr, walk.src.virt.addr,
+			      nbytes, ctx->nrounds);
 		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
 	}
 
 	return err;
 }
 
-static int chacha20_neon(struct skcipher_request *req)
+static int chacha_neon(struct skcipher_request *req)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
 	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
@@ -93,12 +95,12 @@ static int chacha20_neon(struct skcipher_request *req)
 		return crypto_chacha_crypt(req);
 
 	kernel_neon_begin();
-	err = chacha20_neon_stream_xor(req, ctx, req->iv);
+	err = chacha_neon_stream_xor(req, ctx, req->iv);
 	kernel_neon_end();
 	return err;
 }
 
-static int xchacha20_neon(struct skcipher_request *req)
+static int xchacha_neon(struct skcipher_request *req)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
 	struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
@@ -114,10 +116,11 @@ static int xchacha20_neon(struct skcipher_request *req)
 
 	kernel_neon_begin();
 
-	hchacha20_block_neon(state, subctx.key);
+	hchacha_block_neon(state, subctx.key, ctx->nrounds);
+	subctx.nrounds = ctx->nrounds;
 	memcpy(&real_iv[0], req->iv + 24, 8);
 	memcpy(&real_iv[8], req->iv + 16, 8);
-	err = chacha20_neon_stream_xor(req, &subctx, real_iv);
+	err = chacha_neon_stream_xor(req, &subctx, real_iv);
 
 	kernel_neon_end();
 
@@ -139,8 +142,8 @@ static struct skcipher_alg algs[] = {
 		.chunksize		= CHACHA_BLOCK_SIZE,
 		.walksize		= 4 * CHACHA_BLOCK_SIZE,
 		.setkey			= crypto_chacha20_setkey,
-		.encrypt		= chacha20_neon,
-		.decrypt		= chacha20_neon,
+		.encrypt		= chacha_neon,
+		.decrypt		= chacha_neon,
 	}, {
 		.base.cra_name		= "xchacha20",
 		.base.cra_driver_name	= "xchacha20-neon",
@@ -155,12 +158,12 @@ static struct skcipher_alg algs[] = {
 		.chunksize		= CHACHA_BLOCK_SIZE,
 		.walksize		= 4 * CHACHA_BLOCK_SIZE,
 		.setkey			= crypto_chacha20_setkey,
-		.encrypt		= xchacha20_neon,
-		.decrypt		= xchacha20_neon,
+		.encrypt		= xchacha_neon,
+		.decrypt		= xchacha_neon,
 	}
 };
 
-static int __init chacha20_simd_mod_init(void)
+static int __init chacha_simd_mod_init(void)
 {
 	if (!(elf_hwcap & HWCAP_NEON))
 		return -ENODEV;
@@ -168,14 +171,15 @@ static int __init chacha20_simd_mod_init(void)
 	return crypto_register_skciphers(algs, ARRAY_SIZE(algs));
 }
 
-static void __exit chacha20_simd_mod_fini(void)
+static void __exit chacha_simd_mod_fini(void)
 {
 	crypto_unregister_skciphers(algs, ARRAY_SIZE(algs));
 }
 
-module_init(chacha20_simd_mod_init);
-module_exit(chacha20_simd_mod_fini);
+module_init(chacha_simd_mod_init);
+module_exit(chacha_simd_mod_fini);
 
+MODULE_DESCRIPTION("ChaCha and XChaCha stream ciphers (NEON accelerated)");
 MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
 MODULE_LICENSE("GPL v2");
 MODULE_ALIAS_CRYPTO("chacha20");
-- 
2.19.1.331.ge82ca0e54c-goog



* [RFC PATCH v2 07/12] crypto: arm/chacha - add XChaCha12 support
  2018-10-15 17:54 [RFC PATCH v2 00/12] crypto: Adiantum support Eric Biggers
                   ` (5 preceding siblings ...)
  2018-10-15 17:54 ` [RFC PATCH v2 06/12] crypto: arm/chacha20 - refactor to allow varying number of rounds Eric Biggers
@ 2018-10-15 17:54 ` Eric Biggers
  2018-10-20  3:36   ` Ard Biesheuvel
  2018-10-15 17:54 ` [RFC PATCH v2 08/12] crypto: poly1305 - add Poly1305 core API Eric Biggers
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 54+ messages in thread
From: Eric Biggers @ 2018-10-15 17:54 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-fscrypt, linux-arm-kernel, linux-kernel, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

From: Eric Biggers <ebiggers@google.com>

Now that the 32-bit ARM NEON implementation of ChaCha20 and XChaCha20
has been refactored to support varying the number of rounds, add support
for XChaCha12.  This is identical to XChaCha20 except for the number of
rounds, which is 12 instead of 20.

XChaCha12 is faster than XChaCha20 but has a lower security margin,
though still greater than AES-256's since the best known attacks make it
through only 7 rounds.  See the patch "crypto: chacha - add XChaCha12
support" for more details about why we need XChaCha12 support.

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
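The XChaCha construction that the new "xchacha12" entry reuses can be sketched in portable C as follows. This is a hedged illustration, not the glue code itself: all function names are hypothetical. The IV split (stream position at bytes 24..31, nonce tail at bytes 16..23) follows the two memcpy() calls in xchacha_neon(), and the initial state layout (constants, key, then the first 16 IV bytes) is assumed from the generic ChaCha code. Only nrounds differs between XChaCha20 and XChaCha12.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t rol32(uint32_t v, int n)
{
	return (v << n) | (v >> (32 - n));
}

static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

#define QR(a, b, c, d) do {			\
	a += b; d = rol32(d ^ a, 16);		\
	c += d; b = rol32(b ^ c, 12);		\
	a += b; d = rol32(d ^ a, 8);		\
	c += d; b = rol32(b ^ c, 7);		\
} while (0)

static void chacha_permute_sketch(uint32_t x[16], int nrounds)
{
	int i;

	assert(nrounds > 0 && nrounds % 2 == 0);
	for (i = nrounds; i != 0; i -= 2) {
		QR(x[0], x[4], x[8],  x[12]);	/* column round */
		QR(x[1], x[5], x[9],  x[13]);
		QR(x[2], x[6], x[10], x[14]);
		QR(x[3], x[7], x[11], x[15]);
		QR(x[0], x[5], x[10], x[15]);	/* diagonal round */
		QR(x[1], x[6], x[11], x[12]);
		QR(x[2], x[7], x[8],  x[13]);
		QR(x[3], x[4], x[9],  x[14]);
	}
}

/* "expand 32-byte k" as four little-endian words */
static const uint32_t chacha_consts[4] = {
	0x61707865, 0x3320646e, 0x79622d32, 0x6b206574
};

/* HChaCha: permute without feed-forward, keep words 0-3 and 12-15. */
void hchacha_sketch(const uint32_t state[16], uint32_t out[8], int nrounds)
{
	uint32_t x[16];
	int i;

	for (i = 0; i < 16; i++)
		x[i] = state[i];
	chacha_permute_sketch(x, nrounds);
	for (i = 0; i < 4; i++) {
		out[i] = x[i];
		out[4 + i] = x[12 + i];
	}
}

/* Build the ChaCha state that generates the XChaCha keystream. */
void xchacha_setup_sketch(const uint8_t key[32], const uint8_t iv[32],
			  int nrounds, uint32_t stream_state[16])
{
	uint32_t state[16];
	int i;

	for (i = 0; i < 4; i++)
		state[i] = chacha_consts[i];
	for (i = 0; i < 8; i++)
		state[4 + i] = load_le32(key + 4 * i);
	for (i = 0; i < 4; i++)
		state[12 + i] = load_le32(iv + 4 * i);

	for (i = 0; i < 4; i++)
		stream_state[i] = chacha_consts[i];
	hchacha_sketch(state, &stream_state[4], nrounds);	/* subkey */
	stream_state[12] = load_le32(iv + 24);	/* stream position, low word */
	stream_state[13] = load_le32(iv + 28);	/* stream position, high word */
	stream_state[14] = load_le32(iv + 16);	/* remaining nonce bytes */
	stream_state[15] = load_le32(iv + 20);
}
```

With nrounds = 12, both the subkey derivation and the keystream generation that follows use 12 rounds, which matches how xchacha_neon() copies ctx->nrounds into subctx.nrounds.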
 arch/arm/crypto/Kconfig            |  2 +-
 arch/arm/crypto/chacha-neon-glue.c | 21 ++++++++++++++++++++-
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index 0aa1471f27d2e..cc932d9bba561 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -117,7 +117,7 @@ config CRYPTO_CRC32_ARM_CE
 	select CRYPTO_HASH
 
 config CRYPTO_CHACHA20_NEON
-	tristate "NEON accelerated ChaCha20 stream cipher algorithms"
+	tristate "NEON accelerated ChaCha stream cipher algorithms"
 	depends on KERNEL_MODE_NEON
 	select CRYPTO_BLKCIPHER
 	select CRYPTO_CHACHA20
diff --git a/arch/arm/crypto/chacha-neon-glue.c b/arch/arm/crypto/chacha-neon-glue.c
index b236af4889c61..0b1b238227707 100644
--- a/arch/arm/crypto/chacha-neon-glue.c
+++ b/arch/arm/crypto/chacha-neon-glue.c
@@ -1,5 +1,6 @@
 /*
- * ChaCha20 (RFC7539) and XChaCha20 stream ciphers, NEON accelerated
+ * ARM NEON accelerated ChaCha and XChaCha stream ciphers,
+ * including ChaCha20 (RFC7539)
  *
  * Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
  *
@@ -160,6 +161,22 @@ static struct skcipher_alg algs[] = {
 		.setkey			= crypto_chacha20_setkey,
 		.encrypt		= xchacha_neon,
 		.decrypt		= xchacha_neon,
+	}, {
+		.base.cra_name		= "xchacha12",
+		.base.cra_driver_name	= "xchacha12-neon",
+		.base.cra_priority	= 300,
+		.base.cra_blocksize	= 1,
+		.base.cra_ctxsize	= sizeof(struct chacha_ctx),
+		.base.cra_module	= THIS_MODULE,
+
+		.min_keysize		= CHACHA_KEY_SIZE,
+		.max_keysize		= CHACHA_KEY_SIZE,
+		.ivsize			= XCHACHA_IV_SIZE,
+		.chunksize		= CHACHA_BLOCK_SIZE,
+		.walksize		= 4 * CHACHA_BLOCK_SIZE,
+		.setkey			= crypto_chacha12_setkey,
+		.encrypt		= xchacha_neon,
+		.decrypt		= xchacha_neon,
 	}
 };
 
@@ -186,3 +203,5 @@ MODULE_ALIAS_CRYPTO("chacha20");
 MODULE_ALIAS_CRYPTO("chacha20-neon");
 MODULE_ALIAS_CRYPTO("xchacha20");
 MODULE_ALIAS_CRYPTO("xchacha20-neon");
+MODULE_ALIAS_CRYPTO("xchacha12");
+MODULE_ALIAS_CRYPTO("xchacha12-neon");
-- 
2.19.1.331.ge82ca0e54c-goog



* [RFC PATCH v2 08/12] crypto: poly1305 - add Poly1305 core API
  2018-10-15 17:54 [RFC PATCH v2 00/12] crypto: Adiantum support Eric Biggers
                   ` (6 preceding siblings ...)
  2018-10-15 17:54 ` [RFC PATCH v2 07/12] crypto: arm/chacha - add XChaCha12 support Eric Biggers
@ 2018-10-15 17:54 ` Eric Biggers
  2018-10-20  3:45   ` Ard Biesheuvel
  2018-10-15 17:54 ` [RFC PATCH v2 09/12] crypto: nhpoly1305 - add NHPoly1305 support Eric Biggers
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 54+ messages in thread
From: Eric Biggers @ 2018-10-15 17:54 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-fscrypt, linux-arm-kernel, linux-kernel, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

From: Eric Biggers <ebiggers@google.com>

Expose a low-level Poly1305 API which implements the
ε-almost-∆-universal (εA∆U) hash function underlying the Poly1305 MAC
and supports block-aligned inputs only.

This is needed for Adiantum hashing, which builds an εA∆U hash function
from NH and a polynomial evaluation in GF(2^{130}-5); this polynomial
evaluation is identical to the one the Poly1305 MAC does.  However, the
crypto_shash Poly1305 API isn't very appropriate for this because its
calling convention assumes it is used as a MAC, with a 32-byte
"one-time key" provided for every digest.

By design, the polynomial evaluation in Adiantum hashing is far less
performance-critical than NH, so plain C helper functions suffice.
This patch adds them.

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
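To make the split between the core hash and the MAC concrete, here is a minimal C sketch of the one step that stays outside the new core API. After this patch, poly1305_core_emit() writes h mod 2^128 as 16 little-endian bytes, and crypto_poly1305_final() then adds the "s" key mod 2^128. The function name poly1305_add_s_sketch and the helpers are hypothetical. The carry chain follows the rewritten crypto_poly1305_final() in the diff below.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void store_le32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 0);
	p[1] = (uint8_t)(v >> 8);
	p[2] = (uint8_t)(v >> 16);
	p[3] = (uint8_t)(v >> 24);
}

/*
 * mac = (digest + s) mod 2^128, where digest is the 16-byte core hash output
 * and s holds the four 32-bit words of the MAC's "s" key.
 */
void poly1305_add_s_sketch(const uint8_t digest[16], const uint32_t s[4],
			   uint8_t mac[16])
{
	uint64_t f = 0;
	int i;

	assert(digest && s && mac);

	for (i = 0; i < 4; i++) {
		/* bits 32 and up of the previous sum carry into this word */
		f = (f >> 32) + load_le32(digest + 4 * i) + s[i];
		store_le32(mac + 4 * i, (uint32_t)f);
	}
	/* the carry out of the top word is dropped: reduction mod 2^128 */
}
```

A caller that only needs the universal hash, as the commit message says Adiantum hashing does, would stop at the core digest and skip this addition entirely.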
 arch/x86/crypto/poly1305_glue.c |  20 ++--
 crypto/poly1305_generic.c       | 174 ++++++++++++++++++--------------
 include/crypto/poly1305.h       |  28 ++++-
 3 files changed, 136 insertions(+), 86 deletions(-)

diff --git a/arch/x86/crypto/poly1305_glue.c b/arch/x86/crypto/poly1305_glue.c
index f012b7e28ad1d..88cc01506c84a 100644
--- a/arch/x86/crypto/poly1305_glue.c
+++ b/arch/x86/crypto/poly1305_glue.c
@@ -83,35 +83,37 @@ static unsigned int poly1305_simd_blocks(struct poly1305_desc_ctx *dctx,
 	if (poly1305_use_avx2 && srclen >= POLY1305_BLOCK_SIZE * 4) {
 		if (unlikely(!sctx->wset)) {
 			if (!sctx->uset) {
-				memcpy(sctx->u, dctx->r, sizeof(sctx->u));
-				poly1305_simd_mult(sctx->u, dctx->r);
+				memcpy(sctx->u, dctx->r.r, sizeof(sctx->u));
+				poly1305_simd_mult(sctx->u, dctx->r.r);
 				sctx->uset = true;
 			}
 			memcpy(sctx->u + 5, sctx->u, sizeof(sctx->u));
-			poly1305_simd_mult(sctx->u + 5, dctx->r);
+			poly1305_simd_mult(sctx->u + 5, dctx->r.r);
 			memcpy(sctx->u + 10, sctx->u + 5, sizeof(sctx->u));
-			poly1305_simd_mult(sctx->u + 10, dctx->r);
+			poly1305_simd_mult(sctx->u + 10, dctx->r.r);
 			sctx->wset = true;
 		}
 		blocks = srclen / (POLY1305_BLOCK_SIZE * 4);
-		poly1305_4block_avx2(dctx->h, src, dctx->r, blocks, sctx->u);
+		poly1305_4block_avx2(dctx->h.h, src, dctx->r.r, blocks,
+				     sctx->u);
 		src += POLY1305_BLOCK_SIZE * 4 * blocks;
 		srclen -= POLY1305_BLOCK_SIZE * 4 * blocks;
 	}
 #endif
 	if (likely(srclen >= POLY1305_BLOCK_SIZE * 2)) {
 		if (unlikely(!sctx->uset)) {
-			memcpy(sctx->u, dctx->r, sizeof(sctx->u));
-			poly1305_simd_mult(sctx->u, dctx->r);
+			memcpy(sctx->u, dctx->r.r, sizeof(sctx->u));
+			poly1305_simd_mult(sctx->u, dctx->r.r);
 			sctx->uset = true;
 		}
 		blocks = srclen / (POLY1305_BLOCK_SIZE * 2);
-		poly1305_2block_sse2(dctx->h, src, dctx->r, blocks, sctx->u);
+		poly1305_2block_sse2(dctx->h.h, src, dctx->r.r, blocks,
+				     sctx->u);
 		src += POLY1305_BLOCK_SIZE * 2 * blocks;
 		srclen -= POLY1305_BLOCK_SIZE * 2 * blocks;
 	}
 	if (srclen >= POLY1305_BLOCK_SIZE) {
-		poly1305_block_sse2(dctx->h, src, dctx->r, 1);
+		poly1305_block_sse2(dctx->h.h, src, dctx->r.r, 1);
 		srclen -= POLY1305_BLOCK_SIZE;
 	}
 	return srclen;
diff --git a/crypto/poly1305_generic.c b/crypto/poly1305_generic.c
index 47d3a6b83931e..2a06874204e87 100644
--- a/crypto/poly1305_generic.c
+++ b/crypto/poly1305_generic.c
@@ -38,7 +38,7 @@ int crypto_poly1305_init(struct shash_desc *desc)
 {
 	struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc);
 
-	memset(dctx->h, 0, sizeof(dctx->h));
+	poly1305_core_init(&dctx->h);
 	dctx->buflen = 0;
 	dctx->rset = false;
 	dctx->sset = false;
@@ -47,23 +47,16 @@ int crypto_poly1305_init(struct shash_desc *desc)
 }
 EXPORT_SYMBOL_GPL(crypto_poly1305_init);
 
-static void poly1305_setrkey(struct poly1305_desc_ctx *dctx, const u8 *key)
+void poly1305_core_setkey(struct poly1305_key *key, const u8 *raw_key)
 {
 	/* r &= 0xffffffc0ffffffc0ffffffc0fffffff */
-	dctx->r[0] = (get_unaligned_le32(key +  0) >> 0) & 0x3ffffff;
-	dctx->r[1] = (get_unaligned_le32(key +  3) >> 2) & 0x3ffff03;
-	dctx->r[2] = (get_unaligned_le32(key +  6) >> 4) & 0x3ffc0ff;
-	dctx->r[3] = (get_unaligned_le32(key +  9) >> 6) & 0x3f03fff;
-	dctx->r[4] = (get_unaligned_le32(key + 12) >> 8) & 0x00fffff;
-}
-
-static void poly1305_setskey(struct poly1305_desc_ctx *dctx, const u8 *key)
-{
-	dctx->s[0] = get_unaligned_le32(key +  0);
-	dctx->s[1] = get_unaligned_le32(key +  4);
-	dctx->s[2] = get_unaligned_le32(key +  8);
-	dctx->s[3] = get_unaligned_le32(key + 12);
+	key->r[0] = (get_unaligned_le32(raw_key +  0) >> 0) & 0x3ffffff;
+	key->r[1] = (get_unaligned_le32(raw_key +  3) >> 2) & 0x3ffff03;
+	key->r[2] = (get_unaligned_le32(raw_key +  6) >> 4) & 0x3ffc0ff;
+	key->r[3] = (get_unaligned_le32(raw_key +  9) >> 6) & 0x3f03fff;
+	key->r[4] = (get_unaligned_le32(raw_key + 12) >> 8) & 0x00fffff;
 }
+EXPORT_SYMBOL_GPL(poly1305_core_setkey);
 
 /*
  * Poly1305 requires a unique key for each tag, which implies that we can't set
@@ -75,13 +68,16 @@ unsigned int crypto_poly1305_setdesckey(struct poly1305_desc_ctx *dctx,
 {
 	if (!dctx->sset) {
 		if (!dctx->rset && srclen >= POLY1305_BLOCK_SIZE) {
-			poly1305_setrkey(dctx, src);
+			poly1305_core_setkey(&dctx->r, src);
 			src += POLY1305_BLOCK_SIZE;
 			srclen -= POLY1305_BLOCK_SIZE;
 			dctx->rset = true;
 		}
 		if (srclen >= POLY1305_BLOCK_SIZE) {
-			poly1305_setskey(dctx, src);
+			dctx->s[0] = get_unaligned_le32(src +  0);
+			dctx->s[1] = get_unaligned_le32(src +  4);
+			dctx->s[2] = get_unaligned_le32(src +  8);
+			dctx->s[3] = get_unaligned_le32(src + 12);
 			src += POLY1305_BLOCK_SIZE;
 			srclen -= POLY1305_BLOCK_SIZE;
 			dctx->sset = true;
@@ -91,41 +87,37 @@ unsigned int crypto_poly1305_setdesckey(struct poly1305_desc_ctx *dctx,
 }
 EXPORT_SYMBOL_GPL(crypto_poly1305_setdesckey);
 
-static unsigned int poly1305_blocks(struct poly1305_desc_ctx *dctx,
-				    const u8 *src, unsigned int srclen,
-				    u32 hibit)
+static void poly1305_blocks_internal(struct poly1305_state *state,
+				     const struct poly1305_key *key,
+				     const void *src, unsigned int nblocks,
+				     u32 hibit)
 {
 	u32 r0, r1, r2, r3, r4;
 	u32 s1, s2, s3, s4;
 	u32 h0, h1, h2, h3, h4;
 	u64 d0, d1, d2, d3, d4;
-	unsigned int datalen;
 
-	if (unlikely(!dctx->sset)) {
-		datalen = crypto_poly1305_setdesckey(dctx, src, srclen);
-		src += srclen - datalen;
-		srclen = datalen;
-	}
+	if (!nblocks)
+		return;
 
-	r0 = dctx->r[0];
-	r1 = dctx->r[1];
-	r2 = dctx->r[2];
-	r3 = dctx->r[3];
-	r4 = dctx->r[4];
+	r0 = key->r[0];
+	r1 = key->r[1];
+	r2 = key->r[2];
+	r3 = key->r[3];
+	r4 = key->r[4];
 
 	s1 = r1 * 5;
 	s2 = r2 * 5;
 	s3 = r3 * 5;
 	s4 = r4 * 5;
 
-	h0 = dctx->h[0];
-	h1 = dctx->h[1];
-	h2 = dctx->h[2];
-	h3 = dctx->h[3];
-	h4 = dctx->h[4];
-
-	while (likely(srclen >= POLY1305_BLOCK_SIZE)) {
+	h0 = state->h[0];
+	h1 = state->h[1];
+	h2 = state->h[2];
+	h3 = state->h[3];
+	h4 = state->h[4];
 
+	do {
 		/* h += m[i] */
 		h0 += (get_unaligned_le32(src +  0) >> 0) & 0x3ffffff;
 		h1 += (get_unaligned_le32(src +  3) >> 2) & 0x3ffffff;
@@ -154,16 +146,36 @@ static unsigned int poly1305_blocks(struct poly1305_desc_ctx *dctx,
 		h1 += h0 >> 26;       h0 = h0 & 0x3ffffff;
 
 		src += POLY1305_BLOCK_SIZE;
-		srclen -= POLY1305_BLOCK_SIZE;
-	}
+	} while (--nblocks);
 
-	dctx->h[0] = h0;
-	dctx->h[1] = h1;
-	dctx->h[2] = h2;
-	dctx->h[3] = h3;
-	dctx->h[4] = h4;
+	state->h[0] = h0;
+	state->h[1] = h1;
+	state->h[2] = h2;
+	state->h[3] = h3;
+	state->h[4] = h4;
+}
 
-	return srclen;
+void poly1305_core_blocks(struct poly1305_state *state,
+			  const struct poly1305_key *key,
+			  const void *src, unsigned int nblocks)
+{
+	poly1305_blocks_internal(state, key, src, nblocks, 1 << 24);
+}
+EXPORT_SYMBOL_GPL(poly1305_core_blocks);
+
+static void poly1305_blocks(struct poly1305_desc_ctx *dctx,
+			    const u8 *src, unsigned int srclen, u32 hibit)
+{
+	unsigned int datalen;
+
+	if (unlikely(!dctx->sset)) {
+		datalen = crypto_poly1305_setdesckey(dctx, src, srclen);
+		src += srclen - datalen;
+		srclen = datalen;
+	}
+
+	poly1305_blocks_internal(&dctx->h, &dctx->r,
+				 src, srclen / POLY1305_BLOCK_SIZE, hibit);
 }
 
 int crypto_poly1305_update(struct shash_desc *desc,
@@ -187,9 +199,9 @@ int crypto_poly1305_update(struct shash_desc *desc,
 	}
 
 	if (likely(srclen >= POLY1305_BLOCK_SIZE)) {
-		bytes = poly1305_blocks(dctx, src, srclen, 1 << 24);
-		src += srclen - bytes;
-		srclen = bytes;
+		poly1305_blocks(dctx, src, srclen, 1 << 24);
+		src += srclen - (srclen % POLY1305_BLOCK_SIZE);
+		srclen %= POLY1305_BLOCK_SIZE;
 	}
 
 	if (unlikely(srclen)) {
@@ -201,30 +213,18 @@ int crypto_poly1305_update(struct shash_desc *desc,
 }
 EXPORT_SYMBOL_GPL(crypto_poly1305_update);
 
-int crypto_poly1305_final(struct shash_desc *desc, u8 *dst)
+void poly1305_core_emit(const struct poly1305_state *state, void *dst)
 {
-	struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc);
 	u32 h0, h1, h2, h3, h4;
 	u32 g0, g1, g2, g3, g4;
 	u32 mask;
-	u64 f = 0;
-
-	if (unlikely(!dctx->sset))
-		return -ENOKEY;
-
-	if (unlikely(dctx->buflen)) {
-		dctx->buf[dctx->buflen++] = 1;
-		memset(dctx->buf + dctx->buflen, 0,
-		       POLY1305_BLOCK_SIZE - dctx->buflen);
-		poly1305_blocks(dctx, dctx->buf, POLY1305_BLOCK_SIZE, 0);
-	}
 
 	/* fully carry h */
-	h0 = dctx->h[0];
-	h1 = dctx->h[1];
-	h2 = dctx->h[2];
-	h3 = dctx->h[3];
-	h4 = dctx->h[4];
+	h0 = state->h[0];
+	h1 = state->h[1];
+	h2 = state->h[2];
+	h3 = state->h[3];
+	h4 = state->h[4];
 
 	h2 += (h1 >> 26);     h1 = h1 & 0x3ffffff;
 	h3 += (h2 >> 26);     h2 = h2 & 0x3ffffff;
@@ -254,16 +254,40 @@ int crypto_poly1305_final(struct shash_desc *desc, u8 *dst)
 	h4 = (h4 & mask) | g4;
 
 	/* h = h % (2^128) */
-	h0 = (h0 >>  0) | (h1 << 26);
-	h1 = (h1 >>  6) | (h2 << 20);
-	h2 = (h2 >> 12) | (h3 << 14);
-	h3 = (h3 >> 18) | (h4 <<  8);
+	put_unaligned_le32((h0 >>  0) | (h1 << 26), dst +  0);
+	put_unaligned_le32((h1 >>  6) | (h2 << 20), dst +  4);
+	put_unaligned_le32((h2 >> 12) | (h3 << 14), dst +  8);
+	put_unaligned_le32((h3 >> 18) | (h4 <<  8), dst + 12);
+}
+EXPORT_SYMBOL_GPL(poly1305_core_emit);
+
+int crypto_poly1305_final(struct shash_desc *desc, u8 *dst)
+{
+	struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc);
+	__le32 digest[4];
+	u64 f = 0;
+
+	if (unlikely(!dctx->sset))
+		return -ENOKEY;
+
+	if (unlikely(dctx->buflen)) {
+		dctx->buf[dctx->buflen++] = 1;
+		memset(dctx->buf + dctx->buflen, 0,
+		       POLY1305_BLOCK_SIZE - dctx->buflen);
+		poly1305_blocks(dctx, dctx->buf, POLY1305_BLOCK_SIZE, 0);
+	}
+
+	poly1305_core_emit(&dctx->h, digest);
 
 	/* mac = (h + s) % (2^128) */
-	f = (f >> 32) + h0 + dctx->s[0]; put_unaligned_le32(f, dst +  0);
-	f = (f >> 32) + h1 + dctx->s[1]; put_unaligned_le32(f, dst +  4);
-	f = (f >> 32) + h2 + dctx->s[2]; put_unaligned_le32(f, dst +  8);
-	f = (f >> 32) + h3 + dctx->s[3]; put_unaligned_le32(f, dst + 12);
+	f = (f >> 32) + le32_to_cpu(digest[0]) + dctx->s[0];
+	put_unaligned_le32(f, dst + 0);
+	f = (f >> 32) + le32_to_cpu(digest[1]) + dctx->s[1];
+	put_unaligned_le32(f, dst + 4);
+	f = (f >> 32) + le32_to_cpu(digest[2]) + dctx->s[2];
+	put_unaligned_le32(f, dst + 8);
+	f = (f >> 32) + le32_to_cpu(digest[3]) + dctx->s[3];
+	put_unaligned_le32(f, dst + 12);
 
 	return 0;
 }
diff --git a/include/crypto/poly1305.h b/include/crypto/poly1305.h
index f718a19da82f7..34317ed2071e6 100644
--- a/include/crypto/poly1305.h
+++ b/include/crypto/poly1305.h
@@ -13,13 +13,21 @@
 #define POLY1305_KEY_SIZE	32
 #define POLY1305_DIGEST_SIZE	16
 
+struct poly1305_key {
+	u32 r[5];	/* key, base 2^26 */
+};
+
+struct poly1305_state {
+	u32 h[5];	/* accumulator, base 2^26 */
+};
+
 struct poly1305_desc_ctx {
 	/* key */
-	u32 r[5];
+	struct poly1305_key r;
 	/* finalize key */
 	u32 s[4];
 	/* accumulator */
-	u32 h[5];
+	struct poly1305_state h;
 	/* partial buffer */
 	u8 buf[POLY1305_BLOCK_SIZE];
 	/* bytes used in partial buffer */
@@ -30,6 +38,22 @@ struct poly1305_desc_ctx {
 	bool sset;
 };
 
+/*
+ * Poly1305 core functions.  These implement the ε-almost-∆-universal hash
+ * function underlying the Poly1305 MAC, i.e. they don't add an encrypted nonce
+ * ("s key") at the end.  They also only support block-aligned inputs.
+ */
+void poly1305_core_setkey(struct poly1305_key *key, const u8 *raw_key);
+static inline void poly1305_core_init(struct poly1305_state *state)
+{
+	memset(state->h, 0, sizeof(state->h));
+}
+void poly1305_core_blocks(struct poly1305_state *state,
+			  const struct poly1305_key *key,
+			  const void *src, unsigned int nblocks);
+void poly1305_core_emit(const struct poly1305_state *state, void *dst);
+
+/* Crypto API helper functions for the Poly1305 MAC */
 int crypto_poly1305_init(struct shash_desc *desc);
 unsigned int crypto_poly1305_setdesckey(struct poly1305_desc_ctx *dctx,
 					const u8 *src, unsigned int srclen);
-- 
2.19.1.331.ge82ca0e54c-goog



* [RFC PATCH v2 09/12] crypto: nhpoly1305 - add NHPoly1305 support
  2018-10-15 17:54 [RFC PATCH v2 00/12] crypto: Adiantum support Eric Biggers
                   ` (7 preceding siblings ...)
  2018-10-15 17:54 ` [RFC PATCH v2 08/12] crypto: poly1305 - add Poly1305 core API Eric Biggers
@ 2018-10-15 17:54 ` Eric Biggers
  2018-10-20  4:00   ` Ard Biesheuvel
  2018-10-15 17:54 ` [RFC PATCH v2 10/12] crypto: arm/nhpoly1305 - add NEON-accelerated NHPoly1305 Eric Biggers
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 54+ messages in thread
From: Eric Biggers @ 2018-10-15 17:54 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-fscrypt, linux-arm-kernel, linux-kernel, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

From: Eric Biggers <ebiggers@google.com>

Add a generic implementation of NHPoly1305, an ε-almost-∆-universal hash
function used in the Adiantum encryption mode.

CONFIG_CRYPTO_NHPOLY1305 is not user-selectable on its own, since there
is no real reason to enable it without also enabling Adiantum support.

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
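As a reading aid for nh_generic() below, here is a straightforward, non-unrolled C sketch of the same NH computation. The function name nh_sketch and the NH_SKETCH_* constants are hypothetical, and it returns host-order sums rather than __le64 values. It assumes what the unrolled macro implies: for 16-byte message unit j, pass i uses the four key words starting at key[4 * (j + i)], so the key must hold 4 * (units + 3) words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NH_SKETCH_UNIT_BYTES	16	/* one message unit: four 32-bit words */
#define NH_SKETCH_PASSES	4	/* four 64-bit sums, 32 bytes of output */

static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

void nh_sketch(const uint32_t *key, const uint8_t *src, size_t srclen,
	       uint64_t hash[NH_SKETCH_PASSES])
{
	size_t nunits = srclen / NH_SKETCH_UNIT_BYTES;
	size_t j;
	int i;

	assert(srclen % NH_SKETCH_UNIT_BYTES == 0);

	for (i = 0; i < NH_SKETCH_PASSES; i++)
		hash[i] = 0;

	for (j = 0; j < nunits; j++) {
		uint32_t m[4];

		for (i = 0; i < 4; i++)
			m[i] = load_le32(src + NH_SKETCH_UNIT_BYTES * j + 4 * i);

		for (i = 0; i < NH_SKETCH_PASSES; i++) {
			/* each pass sees the key shifted by one more unit */
			const uint32_t *k = &key[4 * (j + i)];

			/* additions wrap mod 2^32, products accumulate mod 2^64 */
			hash[i] += (uint64_t)(uint32_t)(m[0] + k[0]) *
				   (uint32_t)(m[2] + k[2]);
			hash[i] += (uint64_t)(uint32_t)(m[1] + k[1]) *
				   (uint32_t)(m[3] + k[3]);
		}
	}
}
```

For a full 1024-byte NH message this consumes 64 units and produces 32 bytes, which is the 32x length reduction described in the nhpoly1305.c header comment. The unrolled version in the patch computes the same sums while keeping the sliding key window in local variables.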
 crypto/Kconfig              |    5 +
 crypto/Makefile             |    1 +
 crypto/nhpoly1305.c         |  288 ++++++++
 crypto/testmgr.c            |    6 +
 crypto/testmgr.h            | 1240 ++++++++++++++++++++++++++++++++++-
 include/crypto/nhpoly1305.h |   74 +++
 6 files changed, 1610 insertions(+), 4 deletions(-)
 create mode 100644 crypto/nhpoly1305.c
 create mode 100644 include/crypto/nhpoly1305.h

diff --git a/crypto/Kconfig b/crypto/Kconfig
index 4fa0a4a0e8615..431beca903623 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -493,6 +493,11 @@ config CRYPTO_KEYWRAP
 	  Support for key wrapping (NIST SP800-38F / RFC3394) without
 	  padding.
 
+config CRYPTO_NHPOLY1305
+	tristate
+	select CRYPTO_HASH
+	select CRYPTO_POLY1305
+
 comment "Hash modes"
 
 config CRYPTO_CMAC
diff --git a/crypto/Makefile b/crypto/Makefile
index 7e673f7c71107..87b86f221a2a2 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -84,6 +84,7 @@ obj-$(CONFIG_CRYPTO_LRW) += lrw.o
 obj-$(CONFIG_CRYPTO_XTS) += xts.o
 obj-$(CONFIG_CRYPTO_CTR) += ctr.o
 obj-$(CONFIG_CRYPTO_KEYWRAP) += keywrap.o
+obj-$(CONFIG_CRYPTO_NHPOLY1305) += nhpoly1305.o
 obj-$(CONFIG_CRYPTO_GCM) += gcm.o
 obj-$(CONFIG_CRYPTO_CCM) += ccm.o
 obj-$(CONFIG_CRYPTO_CHACHA20POLY1305) += chacha20poly1305.o
diff --git a/crypto/nhpoly1305.c b/crypto/nhpoly1305.c
new file mode 100644
index 0000000000000..087ad7680dd62
--- /dev/null
+++ b/crypto/nhpoly1305.c
@@ -0,0 +1,288 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * NHPoly1305 - ε-almost-∆-universal hash function for Adiantum
+ *
+ * Copyright 2018 Google LLC
+ */
+
+/*
+ * "NHPoly1305" is the main component of Adiantum hashing.
+ * Specifically, it is the calculation
+ *
+ *	H_M ← Poly1305_{K_M}(NH_{K_N}(pad_{128}(M)))
+ *
+ * from the procedure in section A.5 of the Adiantum paper [1].  It is an
+ * ε-almost-∆-universal (εA∆U) hash function for equal-length inputs over
+ * Z/(2^{128}Z), where the "∆" operation is addition.  It hashes 1024-byte
+ * chunks of the input with the NH hash function [2], reducing the input length
+ * by 32x.  The resulting NH digests are evaluated as a polynomial in
+ * GF(2^{130}-5), like in the Poly1305 MAC [3].  Note that the polynomial
+ * evaluation by itself would suffice to achieve the εA∆U property; NH is used
+ * for performance since it's over twice as fast as Poly1305.
+ *
+ * This is *not* a cryptographic hash function; do not use it as such!
+ *
+ * [1] Adiantum: length-preserving encryption for entry-level processors
+ *     (https://eprint.iacr.org/2018/720.pdf)
+ * [2] UMAC: Fast and Secure Message Authentication
+ *     (https://fastcrypto.org/umac/umac_proc.pdf)
+ * [3] The Poly1305-AES message-authentication code
+ *     (https://cr.yp.to/mac/poly1305-20050329.pdf)
+ */
+
+#include <asm/unaligned.h>
+#include <crypto/algapi.h>
+#include <crypto/internal/hash.h>
+#include <crypto/nhpoly1305.h>
+#include <linux/crypto.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+
+#define NH_STRIDE(K0, K1, K2, K3)				\
+({								\
+	m_A = get_unaligned_le32(src); src += 4;		\
+	m_B = get_unaligned_le32(src); src += 4;		\
+	m_C = get_unaligned_le32(src); src += 4;		\
+	m_D = get_unaligned_le32(src); src += 4;		\
+	K3##_A = *key++;					\
+	K3##_B = *key++;					\
+	K3##_C = *key++;					\
+	K3##_D = *key++;					\
+	sum0 += (u64)(u32)(m_A + K0##_A) * (u32)(m_C + K0##_C);	\
+	sum1 += (u64)(u32)(m_A + K1##_A) * (u32)(m_C + K1##_C);	\
+	sum2 += (u64)(u32)(m_A + K2##_A) * (u32)(m_C + K2##_C);	\
+	sum3 += (u64)(u32)(m_A + K3##_A) * (u32)(m_C + K3##_C);	\
+	sum0 += (u64)(u32)(m_B + K0##_B) * (u32)(m_D + K0##_D);	\
+	sum1 += (u64)(u32)(m_B + K1##_B) * (u32)(m_D + K1##_D);	\
+	sum2 += (u64)(u32)(m_B + K2##_B) * (u32)(m_D + K2##_D);	\
+	sum3 += (u64)(u32)(m_B + K3##_B) * (u32)(m_D + K3##_D);	\
+})
+
+static void nh_generic(const u32 *key, const u8 *src, size_t srclen,
+		       __le64 hash[NH_NUM_PASSES])
+{
+	u64 sum0 = 0, sum1 = 0, sum2 = 0, sum3 = 0;
+	u32 k0_A = *key++;
+	u32 k0_B = *key++;
+	u32 k0_C = *key++;
+	u32 k0_D = *key++;
+	u32 k1_A = *key++;
+	u32 k1_B = *key++;
+	u32 k1_C = *key++;
+	u32 k1_D = *key++;
+	u32 k2_A = *key++;
+	u32 k2_B = *key++;
+	u32 k2_C = *key++;
+	u32 k2_D = *key++;
+	u32 k3_A, k3_B, k3_C, k3_D;
+	u32 m_A, m_B, m_C, m_D;
+	size_t n = srclen / NH_MESSAGE_UNIT;
+
+	BUILD_BUG_ON(NH_PAIR_STRIDE != 2);
+	BUILD_BUG_ON(NH_NUM_PASSES != 4);
+
+	while (n >= 4) {
+		NH_STRIDE(k0, k1, k2, k3);
+		NH_STRIDE(k1, k2, k3, k0);
+		NH_STRIDE(k2, k3, k0, k1);
+		NH_STRIDE(k3, k0, k1, k2);
+		n -= 4;
+	}
+	if (n) {
+		NH_STRIDE(k0, k1, k2, k3);
+		if (--n) {
+			NH_STRIDE(k1, k2, k3, k0);
+			if (--n)
+				NH_STRIDE(k2, k3, k0, k1);
+		}
+	}
+
+	hash[0] = cpu_to_le64(sum0);
+	hash[1] = cpu_to_le64(sum1);
+	hash[2] = cpu_to_le64(sum2);
+	hash[3] = cpu_to_le64(sum3);
+}
+
+/* Pass the next NH hash value through Poly1305 */
+static void process_nh_hash_value(struct nhpoly1305_state *state,
+				  const struct nhpoly1305_key *key)
+{
+	BUILD_BUG_ON(NH_HASH_BYTES % POLY1305_BLOCK_SIZE != 0);
+
+	poly1305_core_blocks(&state->poly_state, &key->poly_key, state->nh_hash,
+			     NH_HASH_BYTES / POLY1305_BLOCK_SIZE);
+}
+
+/*
+ * Feed the next portion of the source data, as a whole number of 16-byte
+ * "NH message units", through NH and Poly1305.  Each NH hash is taken over
+ * 1024 bytes, except possibly the final one which is taken over a multiple of
+ * 16 bytes up to 1024.  Also, in the case where data is passed in misaligned
+ * chunks, we combine partial hashes; the end result is the same either way.
+ */
+static void nhpoly1305_units(struct nhpoly1305_state *state,
+			     const struct nhpoly1305_key *key,
+			     const u8 *src, unsigned int srclen, nh_t nh_fn)
+{
+	do {
+		unsigned int bytes;
+
+		if (state->nh_remaining == 0) {
+			/* Starting a new NH message */
+			bytes = min_t(unsigned int, srclen, NH_MESSAGE_BYTES);
+			nh_fn(key->nh_key, src, bytes, state->nh_hash);
+			state->nh_remaining = NH_MESSAGE_BYTES - bytes;
+		} else {
+			/* Continuing a previous NH message */
+			__le64 tmp_hash[NH_NUM_PASSES];
+			unsigned int pos;
+			int i;
+
+			pos = NH_MESSAGE_BYTES - state->nh_remaining;
+			bytes = min(srclen, state->nh_remaining);
+			nh_fn(&key->nh_key[pos / 4], src, bytes, tmp_hash);
+			for (i = 0; i < NH_NUM_PASSES; i++)
+				le64_add_cpu(&state->nh_hash[i],
+					     le64_to_cpu(tmp_hash[i]));
+			state->nh_remaining -= bytes;
+		}
+		if (state->nh_remaining == 0)
+			process_nh_hash_value(state, key);
+		src += bytes;
+		srclen -= bytes;
+	} while (srclen);
+}
+
+int crypto_nhpoly1305_setkey(struct crypto_shash *tfm,
+			     const u8 *key, unsigned int keylen)
+{
+	struct nhpoly1305_key *ctx = crypto_shash_ctx(tfm);
+	int i;
+
+	if (keylen != NHPOLY1305_KEY_SIZE)
+		return -EINVAL;
+
+	poly1305_core_setkey(&ctx->poly_key, key);
+	key += POLY1305_BLOCK_SIZE;
+
+	for (i = 0; i < NH_KEY_WORDS; i++)
+		ctx->nh_key[i] = get_unaligned_le32(key + i * sizeof(u32));
+
+	return 0;
+}
+EXPORT_SYMBOL(crypto_nhpoly1305_setkey);
+
+int crypto_nhpoly1305_init(struct shash_desc *desc)
+{
+	struct nhpoly1305_state *state = shash_desc_ctx(desc);
+
+	poly1305_core_init(&state->poly_state);
+	state->buflen = 0;
+	state->nh_remaining = 0;
+	return 0;
+}
+EXPORT_SYMBOL(crypto_nhpoly1305_init);
+
+int crypto_nhpoly1305_update_helper(struct shash_desc *desc,
+				    const u8 *src, unsigned int srclen,
+				    nh_t nh_fn)
+{
+	struct nhpoly1305_state *state = shash_desc_ctx(desc);
+	const struct nhpoly1305_key *key = crypto_shash_ctx(desc->tfm);
+	unsigned int bytes;
+
+	if (state->buflen) {
+		bytes = min(srclen, (int)NH_MESSAGE_UNIT - state->buflen);
+		memcpy(&state->buffer[state->buflen], src, bytes);
+		state->buflen += bytes;
+		if (state->buflen < NH_MESSAGE_UNIT)
+			return 0;
+		nhpoly1305_units(state, key, state->buffer, NH_MESSAGE_UNIT,
+				 nh_fn);
+		state->buflen = 0;
+		src += bytes;
+		srclen -= bytes;
+	}
+
+	if (srclen >= NH_MESSAGE_UNIT) {
+		bytes = round_down(srclen, NH_MESSAGE_UNIT);
+		nhpoly1305_units(state, key, src, bytes, nh_fn);
+		src += bytes;
+		srclen -= bytes;
+	}
+
+	if (srclen) {
+		memcpy(state->buffer, src, srclen);
+		state->buflen = srclen;
+	}
+	return 0;
+}
+EXPORT_SYMBOL(crypto_nhpoly1305_update_helper);
+
+int crypto_nhpoly1305_update(struct shash_desc *desc,
+			     const u8 *src, unsigned int srclen)
+{
+	return crypto_nhpoly1305_update_helper(desc, src, srclen, nh_generic);
+}
+EXPORT_SYMBOL(crypto_nhpoly1305_update);
+
+int crypto_nhpoly1305_final_helper(struct shash_desc *desc, u8 *dst, nh_t nh_fn)
+{
+	struct nhpoly1305_state *state = shash_desc_ctx(desc);
+	const struct nhpoly1305_key *key = crypto_shash_ctx(desc->tfm);
+
+	if (state->buflen) {
+		memset(&state->buffer[state->buflen], 0,
+		       NH_MESSAGE_UNIT - state->buflen);
+		nhpoly1305_units(state, key, state->buffer, NH_MESSAGE_UNIT,
+				 nh_fn);
+	}
+
+	if (state->nh_remaining)
+		process_nh_hash_value(state, key);
+
+	poly1305_core_emit(&state->poly_state, dst);
+	return 0;
+}
+EXPORT_SYMBOL(crypto_nhpoly1305_final_helper);
+
+int crypto_nhpoly1305_final(struct shash_desc *desc, u8 *dst)
+{
+	return crypto_nhpoly1305_final_helper(desc, dst, nh_generic);
+}
+EXPORT_SYMBOL(crypto_nhpoly1305_final);
+
+static struct shash_alg nhpoly1305_alg = {
+	.digestsize	= POLY1305_DIGEST_SIZE,
+	.init		= crypto_nhpoly1305_init,
+	.update		= crypto_nhpoly1305_update,
+	.final		= crypto_nhpoly1305_final,
+	.setkey		= crypto_nhpoly1305_setkey,
+	.descsize	= sizeof(struct nhpoly1305_state),
+	.base		= {
+		.cra_name		= "nhpoly1305",
+		.cra_driver_name	= "nhpoly1305-generic",
+		.cra_priority		= 100,
+		.cra_ctxsize		= sizeof(struct nhpoly1305_key),
+		.cra_module		= THIS_MODULE,
+	},
+};
+
+static int __init nhpoly1305_mod_init(void)
+{
+	return crypto_register_shash(&nhpoly1305_alg);
+}
+
+static void __exit nhpoly1305_mod_exit(void)
+{
+	crypto_unregister_shash(&nhpoly1305_alg);
+}
+
+module_init(nhpoly1305_mod_init);
+module_exit(nhpoly1305_mod_exit);
+
+MODULE_DESCRIPTION("NHPoly1305 ε-almost-∆-universal hash function");
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
+MODULE_ALIAS_CRYPTO("nhpoly1305");
+MODULE_ALIAS_CRYPTO("nhpoly1305-generic");
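
For readers following the macro-unrolled nh_generic() above, here is a
minimal userspace sketch of the same arithmetic written with plain
loops.  It is not part of the patch.  The names nh_sketch(),
nh_sketch_split() and load_le32_sketch() are hypothetical, and the
sketch assumes only what the code above shows: 16-byte message units,
4 passes, and pass p of unit u reading the four key words that start at
index 4 * (u + p).  The second helper mirrors the "continuing a previous
NH message" branch of nhpoly1305_units(): hashing the tail with the key
advanced by pos / 4 words and adding the per-pass sums should give the
same result as a single call over the whole buffer.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_UNIT_BYTES 16	/* one NH message unit: four 32-bit words */
#define SKETCH_PASSES      4	/* one 64-bit sum per pass */

/* Read a little-endian 32-bit word from a possibly unaligned pointer. */
static uint32_t load_le32_sketch(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Loop-based NH over srclen bytes (a multiple of 16).  The caller must
 * supply at least 4 * (srclen / 16 + 3) key words.
 */
static void nh_sketch(const uint32_t *key, const uint8_t *src, size_t srclen,
		      uint64_t sums[SKETCH_PASSES])
{
	size_t units = srclen / SKETCH_UNIT_BYTES;
	size_t u;
	int p, j;

	for (p = 0; p < SKETCH_PASSES; p++)
		sums[p] = 0;

	for (u = 0; u < units; u++) {
		uint32_t m[4];

		for (j = 0; j < 4; j++)
			m[j] = load_le32_sketch(src + u * SKETCH_UNIT_BYTES + 4 * j);

		for (p = 0; p < SKETCH_PASSES; p++) {
			const uint32_t *k = key + 4 * (u + p);

			/* Additions wrap mod 2^32; products accumulate mod 2^64. */
			sums[p] += (uint64_t)(uint32_t)(m[0] + k[0]) *
				   (uint32_t)(m[2] + k[2]);
			sums[p] += (uint64_t)(uint32_t)(m[1] + k[1]) *
				   (uint32_t)(m[3] + k[3]);
		}
	}
}

/*
 * Hash a message in two pieces, split after pos bytes (a multiple of 16),
 * the way the continuation branch of nhpoly1305_units() does.
 */
static void nh_sketch_split(const uint32_t *key, const uint8_t *src,
			    size_t srclen, size_t pos,
			    uint64_t sums[SKETCH_PASSES])
{
	uint64_t tail[SKETCH_PASSES];
	int p;

	nh_sketch(key, src, pos, sums);
	nh_sketch(key + pos / 4, src + pos, srclen - pos, tail);
	for (p = 0; p < SKETCH_PASSES; p++)
		sums[p] += tail[p];
}
```

The loop form trades the register rotation of NH_STRIDE for clarity; it
is meant as a reading aid and a cross-check, not as a replacement for
the unrolled version.
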
diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index 3ff70ebc745cb..039a5d850a29c 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -3291,6 +3291,12 @@ static const struct alg_test_desc alg_test_descs[] = {
 				.dec = __VECS(morus640_dec_tv_template),
 			}
 		}
+	}, {
+		.alg = "nhpoly1305",
+		.test = alg_test_hash,
+		.suite = {
+			.hash = __VECS(nhpoly1305_tv_template)
+		}
 	}, {
 		.alg = "ofb(aes)",
 		.test = alg_test_skcipher,
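
The nhpoly1305 test vectors registered here use 1088-byte keys, which
is why MAX_KEYLEN grows in testmgr.h.  The arithmetic behind that number
can be sketched as follows.  This is not part of the patch: the
identifiers are hypothetical, and the sketch assumes that the NH key
holds exactly the words nh_generic() can read for a full 1024-byte
message (pass p of unit u reads four words starting at index
4 * (u + p)), preceded by the 16 bytes that crypto_nhpoly1305_setkey()
hands to Poly1305.

```c
#include <stddef.h>

enum {
	SKETCH_POLY_KEY_BYTES   = 16,	/* consumed first by setkey */
	SKETCH_NH_MESSAGE_BYTES = 1024,	/* bytes covered by one NH hash */
	SKETCH_NH_UNIT_BYTES    = 16,	/* one NH message unit */
	SKETCH_NH_PASSES        = 4,	/* one 64-bit sum per pass */
};

/* Message words plus one extra unit of key words per additional pass. */
static size_t sketch_nh_key_words(void)
{
	return SKETCH_NH_MESSAGE_BYTES / 4 +
	       (SKETCH_NH_UNIT_BYTES / 4) * (SKETCH_NH_PASSES - 1);
}

/* Poly1305 key bytes followed by the NH key as 32-bit words. */
static size_t sketch_nhpoly1305_key_bytes(void)
{
	return SKETCH_POLY_KEY_BYTES + 4 * sketch_nh_key_words();
}
```

Under these assumptions the NH key is 256 + 12 = 268 words (1072 bytes),
and adding the 16-byte Poly1305 part gives the 1088 bytes seen in the
.ksize fields of the test vectors.
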
diff --git a/crypto/testmgr.h b/crypto/testmgr.h
index 3b57b2701fcb2..40197d74b3d56 100644
--- a/crypto/testmgr.h
+++ b/crypto/testmgr.h
@@ -27,7 +27,7 @@
 #define MAX_DIGEST_SIZE		64
 #define MAX_TAP			8
 
-#define MAX_KEYLEN		160
+#define MAX_KEYLEN		1088
 #define MAX_IVLEN		32
 
 struct hash_testvec {
@@ -35,10 +35,10 @@ struct hash_testvec {
 	const char *key;
 	const char *plaintext;
 	const char *digest;
-	unsigned char tap[MAX_TAP];
+	unsigned short tap[MAX_TAP];
+	unsigned short np;
 	unsigned short psize;
-	unsigned char np;
-	unsigned char ksize;
+	unsigned short ksize;
 };
 
 /*
@@ -5593,6 +5593,1238 @@ static const struct hash_testvec poly1305_tv_template[] = {
 	},
 };
 
+/* NHPoly1305 test vectors from https://github.com/google/adiantum */
+static const struct hash_testvec nhpoly1305_tv_template[] = {
+	{
+		.key	= "\xd2\x5d\x4c\xdd\x8d\x2b\x7f\x7a"
+			  "\xd9\xbe\x71\xec\xd1\x83\x52\xe3"
+			  "\xe1\xad\xd7\x5c\x0a\x75\x9d\xec"
+			  "\x1d\x13\x7e\x5d\x71\x07\xc9\xe4"
+			  "\x57\x2d\x44\x68\xcf\xd8\xd6\xc5"
+			  "\x39\x69\x7d\x32\x75\x51\x4f\x7e"
+			  "\xb2\x4c\xc6\x90\x51\x6e\xd9\xd6"
+			  "\xa5\x8b\x2d\xf1\x94\xf9\xf7\x5e"
+			  "\x2c\x84\x7b\x41\x0f\x88\x50\x89"
+			  "\x30\xd9\xa1\x38\x46\x6c\xc0\x4f"
+			  "\xe8\xdf\xdc\x66\xab\x24\x43\x41"
+			  "\x91\x55\x29\x65\x86\x28\x5e\x45"
+			  "\xd5\x2d\xb7\x80\x08\x9a\xc3\xd4"
+			  "\x9a\x77\x0a\xd4\xef\x3e\xe6\x3f"
+			  "\x6f\x2f\x9b\x3a\x7d\x12\x1e\x80"
+			  "\x6c\x44\xa2\x25\xe1\xf6\x60\xe9"
+			  "\x0d\xaf\xc5\x3c\xa5\x79\xae\x64"
+			  "\xbc\xa0\x39\xa3\x4d\x10\xe5\x4d"
+			  "\xd5\xe7\x89\x7a\x13\xee\x06\x78"
+			  "\xdc\xa4\xdc\x14\x27\xe6\x49\x38"
+			  "\xd0\xe0\x45\x25\x36\xc5\xf4\x79"
+			  "\x2e\x9a\x98\x04\xe4\x2b\x46\x52"
+			  "\x7c\x33\xca\xe2\x56\x51\x50\xe2"
+			  "\xa5\x9a\xae\x18\x6a\x13\xf8\xd2"
+			  "\x21\x31\x66\x02\xe2\xda\x8d\x7e"
+			  "\x41\x19\xb2\x61\xee\x48\x8f\xf1"
+			  "\x65\x24\x2e\x1e\x68\xce\x05\xd9"
+			  "\x2a\xcf\xa5\x3a\x57\xdd\x35\x91"
+			  "\x93\x01\xca\x95\xfc\x2b\x36\x04"
+			  "\xe6\x96\x97\x28\xf6\x31\xfe\xa3"
+			  "\x9d\xf6\x6a\x1e\x80\x8d\xdc\xec"
+			  "\xaf\x66\x11\x13\x02\x88\xd5\x27"
+			  "\x33\xb4\x1a\xcd\xa3\xf6\xde\x31"
+			  "\x8e\xc0\x0e\x6c\xd8\x5a\x97\x5e"
+			  "\xdd\xfd\x60\x69\x38\x46\x3f\x90"
+			  "\x5e\x97\xd3\x32\x76\xc7\x82\x49"
+			  "\xfe\xba\x06\x5f\x2f\xa2\xfd\xff"
+			  "\x80\x05\x40\xe4\x33\x03\xfb\x10"
+			  "\xc0\xde\x65\x8c\xc9\x8d\x3a\x9d"
+			  "\xb5\x7b\x36\x4b\xb5\x0c\xcf\x00"
+			  "\x9c\x87\xe4\x49\xad\x90\xda\x4a"
+			  "\xdd\xbd\xff\xe2\x32\x57\xd6\x78"
+			  "\x36\x39\x6c\xd3\x5b\x9b\x88\x59"
+			  "\x2d\xf0\x46\xe4\x13\x0e\x2b\x35"
+			  "\x0d\x0f\x73\x8a\x4f\x26\x84\x75"
+			  "\x88\x3c\xc5\x58\x66\x18\x1a\xb4"
+			  "\x64\x51\x34\x27\x1b\xa4\x11\xc9"
+			  "\x6d\x91\x8a\xfa\x32\x60\x9d\xd7"
+			  "\x87\xe5\xaa\x43\x72\xf8\xda\xd1"
+			  "\x48\x44\x13\x61\xdc\x8c\x76\x17"
+			  "\x0c\x85\x4e\xf3\xdd\xa2\x42\xd2"
+			  "\x74\xc1\x30\x1b\xeb\x35\x31\x29"
+			  "\x5b\xd7\x4c\x94\x46\x35\xa1\x23"
+			  "\x50\xf2\xa2\x8e\x7e\x4f\x23\x4f"
+			  "\x51\xff\xe2\xc9\xa3\x7d\x56\x8b"
+			  "\x41\xf2\xd0\xc5\x57\x7e\x59\xac"
+			  "\xbb\x65\xf3\xfe\xf7\x17\xef\x63"
+			  "\x7c\x6f\x23\xdd\x22\x8e\xed\x84"
+			  "\x0e\x3b\x09\xb3\xf3\xf4\x8f\xcd"
+			  "\x37\xa8\xe1\xa7\x30\xdb\xb1\xa2"
+			  "\x9c\xa2\xdf\x34\x17\x3e\x68\x44"
+			  "\xd0\xde\x03\x50\xd1\x48\x6b\x20"
+			  "\xe2\x63\x45\xa5\xea\x87\xc2\x42"
+			  "\x95\x03\x49\x05\xed\xe0\x90\x29"
+			  "\x1a\xb8\xcf\x9b\x43\xcf\x29\x7a"
+			  "\x63\x17\x41\x9f\xe0\xc9\x10\xfd"
+			  "\x2c\x56\x8c\x08\x55\xb4\xa9\x27"
+			  "\x0f\x23\xb1\x05\x6a\x12\x46\xc7"
+			  "\xe1\xfe\x28\x93\x93\xd7\x2f\xdc"
+			  "\x98\x30\xdb\x75\x8a\xbe\x97\x7a"
+			  "\x02\xfb\x8c\xba\xbe\x25\x09\xbe"
+			  "\xce\xcb\xa2\xef\x79\x4d\x0e\x9d"
+			  "\x1b\x9d\xb6\x39\x34\x38\xfa\x07"
+			  "\xec\xe8\xfc\x32\x85\x1d\xf7\x85"
+			  "\x63\xc3\x3c\xc0\x02\x75\xd7\x3f"
+			  "\xb2\x68\x60\x66\x65\x81\xc6\xb1"
+			  "\x42\x65\x4b\x4b\x28\xd7\xc7\xaa"
+			  "\x9b\xd2\xdc\x1b\x01\xe0\x26\x39"
+			  "\x01\xc1\x52\x14\xd1\x3f\xb7\xe6"
+			  "\x61\x41\xc7\x93\xd2\xa2\x67\xc6"
+			  "\xf7\x11\xb5\xf5\xea\xdd\x19\xfb"
+			  "\x4d\x21\x12\xd6\x7d\xf1\x10\xb0"
+			  "\x89\x07\xc7\x5a\x52\x73\x70\x2f"
+			  "\x32\xef\x65\x2b\x12\xb2\xf0\xf5"
+			  "\x20\xe0\x90\x59\x7e\x64\xf1\x4c"
+			  "\x41\xb3\xa5\x91\x08\xe6\x5e\x5f"
+			  "\x05\x56\x76\xb4\xb0\xcd\x70\x53"
+			  "\x10\x48\x9c\xff\xc2\x69\x55\x24"
+			  "\x87\xef\x84\xea\xfb\xa7\xbf\xa0"
+			  "\x91\x04\xad\x4f\x8b\x57\x54\x4b"
+			  "\xb6\xe9\xd1\xac\x37\x2f\x1d\x2e"
+			  "\xab\xa5\xa4\xe8\xff\xfb\xd9\x39"
+			  "\x2f\xb7\xac\xd1\xfe\x0b\x9a\x80"
+			  "\x0f\xb6\xf4\x36\x39\x90\x51\xe3"
+			  "\x0a\x2f\xb6\x45\x76\x89\xcd\x61"
+			  "\xfe\x48\x5f\x75\x1d\x13\x00\x62"
+			  "\x80\x24\x47\xe7\xbc\x37\xd7\xe3"
+			  "\x15\xe8\x68\x22\xaf\x80\x6f\x4b"
+			  "\xa8\x9f\x01\x10\x48\x14\xc3\x02"
+			  "\x52\xd2\xc7\x75\x9b\x52\x6d\x30"
+			  "\xac\x13\x85\xc8\xf7\xa3\x58\x4b"
+			  "\x49\xf7\x1c\x45\x55\x8c\x39\x9a"
+			  "\x99\x6d\x97\x27\x27\xe6\xab\xdd"
+			  "\x2c\x42\x1b\x35\xdd\x9d\x73\xbb"
+			  "\x6c\xf3\x64\xf1\xfb\xb9\xf7\xe6"
+			  "\x4a\x3c\xc0\x92\xc0\x2e\xb7\x1a"
+			  "\xbe\xab\xb3\x5a\xe5\xea\xb1\x48"
+			  "\x58\x13\x53\x90\xfd\xc3\x8e\x54"
+			  "\xf9\x18\x16\x73\xe8\xcb\x6d\x39"
+			  "\x0e\xd7\xe0\xfe\xb6\x9f\x43\x97"
+			  "\xe8\xd0\x85\x56\x83\x3e\x98\x68"
+			  "\x7f\xbd\x95\xa8\x9a\x61\x21\x8f"
+			  "\x06\x98\x34\xa6\xc8\xd6\x1d\xf3"
+			  "\x3d\x43\xa4\x9a\x8c\xe5\xd3\x5a"
+			  "\x32\xa2\x04\x22\xa4\x19\x1a\x46"
+			  "\x42\x7e\x4d\xe5\xe0\xe6\x0e\xca"
+			  "\xd5\x58\x9d\x2c\xaf\xda\x33\x5c"
+			  "\xb0\x79\x9e\xc9\xfc\xca\xf0\x2f"
+			  "\xa8\xb2\x77\xeb\x7a\xa2\xdd\x37"
+			  "\x35\x83\x07\xd6\x02\x1a\xb6\x6c"
+			  "\x24\xe2\x59\x08\x0e\xfd\x3e\x46"
+			  "\xec\x40\x93\xf4\x00\x26\x4f\x2a"
+			  "\xff\x47\x2f\xeb\x02\x92\x26\x5b"
+			  "\x53\x17\xc2\x8d\x2a\xc7\xa3\x1b"
+			  "\xcd\xbc\xa7\xe8\xd1\x76\xe3\x80"
+			  "\x21\xca\x5d\x3b\xe4\x9c\x8f\xa9"
+			  "\x5b\x7f\x29\x7f\x7c\xd8\xed\x6d"
+			  "\x8c\xb2\x86\x85\xe7\x77\xf2\x85"
+			  "\xab\x38\xa9\x9d\xc1\x4e\xc5\x64"
+			  "\x33\x73\x8b\x59\x03\xad\x05\xdf"
+			  "\x25\x98\x31\xde\xef\x13\xf1\x9b"
+			  "\x3c\x91\x9d\x7b\xb1\xfa\xe6\xbf"
+			  "\x5b\xed\xa5\x55\xe6\xea\x6c\x74"
+			  "\xf4\xb9\xe4\x45\x64\x72\x81\xc2"
+			  "\x4c\x28\xd4\xcd\xac\xe2\xde\xf9"
+			  "\xeb\x5c\xeb\x61\x60\x5a\xe5\x28",
+		.ksize	= 1088,
+		.plaintext	= "",
+		.psize	= 0,
+		.digest	= "\x00\x00\x00\x00\x00\x00\x00\x00"
+			  "\x00\x00\x00\x00\x00\x00\x00\x00",
+	}, {
+		.key	= "\x29\x21\x43\xcb\xcb\x13\x07\xde"
+			  "\xbf\x48\xdf\x8a\x7f\xa2\x84\xde"
+			  "\x72\x23\x9d\xf5\xf0\x07\xf2\x4c"
+			  "\x20\x3a\x93\xb9\xcd\x5d\xfe\xcb"
+			  "\x99\x2c\x2b\x58\xc6\x50\x5f\x94"
+			  "\x56\xc3\x7c\x0d\x02\x3f\xb8\x5e"
+			  "\x7b\xc0\x6c\x51\x34\x76\xc0\x0e"
+			  "\xc6\x22\xc8\x9e\x92\xa0\x21\xc9"
+			  "\x85\x5c\x7c\xf8\xe2\x64\x47\xc9"
+			  "\xe4\xa2\x57\x93\xf8\xa2\x69\xcd"
+			  "\x62\x98\x99\xf4\xd7\x7b\x14\xb1"
+			  "\xd8\x05\xff\x04\x15\xc9\xe1\x6e"
+			  "\x9b\xe6\x50\x6b\x0b\x3f\x22\x1f"
+			  "\x08\xde\x0c\x5b\x08\x7e\xc6\x2f"
+			  "\x6c\xed\xd6\xb2\x15\xa4\xb3\xf9"
+			  "\xa7\x46\x38\x2a\xea\x69\xa5\xde"
+			  "\x02\xc3\x96\x89\x4d\x55\x3b\xed"
+			  "\x3d\x3a\x85\x77\xbf\x97\x45\x5c"
+			  "\x9e\x02\x69\xe2\x1b\x68\xbe\x96"
+			  "\xfb\x64\x6f\x0f\xf6\x06\x40\x67"
+			  "\xfa\x04\xe3\x55\xfa\xbe\xa4\x60"
+			  "\xef\x21\x66\x97\xe6\x9d\x5c\x1f"
+			  "\x62\x37\xaa\x31\xde\xe4\x9c\x28"
+			  "\x95\xe0\x22\x86\xf4\x4d\xf3\x07"
+			  "\xfd\x5f\x3a\x54\x2c\x51\x80\x71"
+			  "\xba\x78\x69\x5b\x65\xab\x1f\x81"
+			  "\xed\x3b\xff\x34\xa3\xfb\xbc\x73"
+			  "\x66\x7d\x13\x7f\xdf\x6e\xe2\xe2"
+			  "\xeb\x4f\x6c\xda\x7d\x33\x57\xd0"
+			  "\xd3\x7c\x95\x4f\x33\x58\x21\xc7"
+			  "\xc0\xe5\x6f\x42\x26\xc6\x1f\x5e"
+			  "\x85\x1b\x98\x9a\xa2\x1e\x55\x77"
+			  "\x23\xdf\x81\x5e\x79\x55\x05\xfc"
+			  "\xfb\xda\xee\xba\x5a\xba\xf7\x77"
+			  "\x7f\x0e\xd3\xe1\x37\xfe\x8d\x2b"
+			  "\xd5\x3f\xfb\xd0\xc0\x3c\x0b\x3f"
+			  "\xcf\x3c\x14\xcf\xfb\x46\x72\x4c"
+			  "\x1f\x39\xe2\xda\x03\x71\x6d\x23"
+			  "\xef\x93\xcd\x39\xd9\x37\x80\x4d"
+			  "\x65\x61\xd1\x2c\x03\xa9\x47\x72"
+			  "\x4d\x1e\x0e\x16\x33\x0f\x21\x17"
+			  "\xec\x92\xea\x6f\x37\x22\xa4\xd8"
+			  "\x03\x33\x9e\xd8\x03\x69\x9a\xe8"
+			  "\xb2\x57\xaf\x78\x99\x05\x12\xab"
+			  "\x48\x90\x80\xf0\x12\x9b\x20\x64"
+			  "\x7a\x1d\x47\x5f\xba\x3c\xf9\xc3"
+			  "\x0a\x0d\x8d\xa1\xf9\x1b\x82\x13"
+			  "\x3e\x0d\xec\x0a\x83\xc0\x65\xe1"
+			  "\xe9\x95\xff\x97\xd6\xf2\xe4\xd5"
+			  "\x86\xc0\x1f\x29\x27\x63\xd7\xde"
+			  "\xb7\x0a\x07\x99\x04\x2d\xa3\x89"
+			  "\xa2\x43\xcf\xf3\xe1\x43\xac\x4a"
+			  "\x06\x97\xd0\x05\x4f\x87\xfa\xf9"
+			  "\x9b\xbf\x52\x70\xbd\xbc\x6c\xf3"
+			  "\x03\x13\x60\x41\x28\x09\xec\xcc"
+			  "\xb1\x1a\xec\xd6\xfb\x6f\x2a\x89"
+			  "\x5d\x0b\x53\x9c\x59\xc1\x84\x21"
+			  "\x33\x51\x47\x19\x31\x9c\xd4\x0a"
+			  "\x4d\x04\xec\x50\x90\x61\xbd\xbc"
+			  "\x7e\xc8\xd9\x6c\x98\x1d\x45\x41"
+			  "\x17\x5e\x97\x1c\xc5\xa8\xe8\xea"
+			  "\x46\x58\x53\xf7\x17\xd5\xad\x11"
+			  "\xc8\x54\xf5\x7a\x33\x90\xf5\x19"
+			  "\xba\x36\xb4\xfc\x52\xa5\x72\x3d"
+			  "\x14\xbb\x55\xa7\xe9\xe3\x12\xf7"
+			  "\x1c\x30\xa2\x82\x03\xbf\x53\x91"
+			  "\x2e\x60\x41\x9f\x5b\x69\x39\xf6"
+			  "\x4d\xc8\xf8\x46\x7a\x7f\xa4\x98"
+			  "\x36\xff\x06\xcb\xca\xe7\x33\xf2"
+			  "\xc0\x4a\xf4\x3c\x14\x44\x5f\x6b"
+			  "\x75\xef\x02\x36\x75\x08\x14\xfd"
+			  "\x10\x8e\xa5\x58\xd0\x30\x46\x49"
+			  "\xaf\x3a\xf8\x40\x3d\x35\xdb\x84"
+			  "\x11\x2e\x97\x6a\xb7\x87\x7f\xad"
+			  "\xf1\xfa\xa5\x63\x60\xd8\x5e\xbf"
+			  "\x41\x78\x49\xcf\x77\xbb\x56\xbb"
+			  "\x7d\x01\x67\x05\x22\xc8\x8f\x41"
+			  "\xba\x81\xd2\xca\x2c\x38\xac\x76"
+			  "\x06\xc1\x1a\xc2\xce\xac\x90\x67"
+			  "\x57\x3e\x20\x12\x5b\xd9\x97\x58"
+			  "\x65\x05\xb7\x04\x61\x7e\xd8\x3a"
+			  "\xbf\x55\x3b\x13\xe9\x34\x5a\x37"
+			  "\x36\xcb\x94\x45\xc5\x32\xb3\xa0"
+			  "\x0c\x3e\x49\xc5\xd3\xed\xa7\xf0"
+			  "\x1c\x69\xcc\xea\xcc\x83\xc9\x16"
+			  "\x95\x72\x4b\xf4\x89\xd5\xb9\x10"
+			  "\xf6\x2d\x60\x15\xea\x3c\x06\x66"
+			  "\x9f\x82\xad\x17\xce\xd2\xa4\x48"
+			  "\x7c\x65\xd9\xf8\x02\x4d\x9b\x4c"
+			  "\x89\x06\x3a\x34\x85\x48\x89\x86"
+			  "\xf9\x24\xa9\x54\x72\xdb\x44\x95"
+			  "\xc7\x44\x1c\x19\x11\x4c\x04\xdc"
+			  "\x13\xb9\x67\xc8\xc3\x3a\x6a\x50"
+			  "\xfa\xd1\xfb\xe1\x88\xb6\xf1\xa3"
+			  "\xc5\x3b\xdc\x38\x45\x16\x26\x02"
+			  "\x3b\xb8\x8f\x8b\x58\x7d\x23\x04"
+			  "\x50\x6b\x81\x9f\xae\x66\xac\x6f"
+			  "\xcf\x2a\x9d\xf1\xfd\x1d\x57\x07"
+			  "\xbe\x58\xeb\x77\x0c\xe3\xc2\x19"
+			  "\x14\x74\x1b\x51\x1c\x4f\x41\xf3"
+			  "\x32\x89\xb3\xe7\xde\x62\xf6\x5f"
+			  "\xc7\x6a\x4a\x2a\x5b\x0f\x5f\x87"
+			  "\x9c\x08\xb9\x02\x88\xc8\x29\xb7"
+			  "\x94\x52\xfa\x52\xfe\xaa\x50\x10"
+			  "\xba\x48\x75\x5e\x11\x1b\xe6\x39"
+			  "\xd7\x82\x2c\x87\xf1\x1e\xa4\x38"
+			  "\x72\x3e\x51\xe7\xd8\x3e\x5b\x7b"
+			  "\x31\x16\x89\xba\xd6\xad\x18\x5e"
+			  "\xba\xf8\x12\xb3\xf4\x6c\x47\x30"
+			  "\xc0\x38\x58\xb3\x10\x8d\x58\x5d"
+			  "\xb4\xfb\x19\x7e\x41\xc3\x66\xb8"
+			  "\xd6\x72\x84\xe1\x1a\xc2\x71\x4c"
+			  "\x0d\x4a\x21\x7a\xab\xa2\xc0\x36"
+			  "\x15\xc5\xe9\x46\xd7\x29\x17\x76"
+			  "\x5e\x47\x36\x7f\x72\x05\xa7\xcc"
+			  "\x36\x63\xf9\x47\x7d\xe6\x07\x3c"
+			  "\x8b\x79\x1d\x96\x61\x8d\x90\x65"
+			  "\x7c\xf5\xeb\x4e\x6e\x09\x59\x6d"
+			  "\x62\x50\x1b\x0f\xe0\xdc\x78\xf2"
+			  "\x5b\x83\x1a\xa1\x11\x75\xfd\x18"
+			  "\xd7\xe2\x8d\x65\x14\x21\xce\xbe"
+			  "\xb5\x87\xe3\x0a\xda\x24\x0a\x64"
+			  "\xa9\x9f\x03\x8d\x46\x5d\x24\x1a"
+			  "\x8a\x0c\x42\x01\xca\xb1\x5f\x7c"
+			  "\xa5\xac\x32\x4a\xb8\x07\x91\x18"
+			  "\x6f\xb0\x71\x3c\xc9\xb1\xa8\xf8"
+			  "\x5f\x69\xa5\xa1\xca\x9e\x7a\xaa"
+			  "\xac\xe9\xc7\x47\x41\x75\x25\xc3"
+			  "\x73\xe2\x0b\xdd\x6d\x52\x71\xbe"
+			  "\xc5\xdc\xb4\xe7\x01\x26\x53\x77"
+			  "\x86\x90\x85\x68\x6b\x7b\x03\x53"
+			  "\xda\x52\x52\x51\x68\xc8\xf3\xec"
+			  "\x6c\xd5\x03\x7a\xa3\x0e\xb4\x02"
+			  "\x5f\x1a\xab\xee\xca\x67\x29\x7b"
+			  "\xbd\x96\x59\xb3\x8b\x32\x7a\x92"
+			  "\x9f\xd8\x25\x2b\xdf\xc0\x4c\xda",
+		.ksize	= 1088,
+		.plaintext	= "\xbc\xda\x81\xa8\x78\x79\x1c\xbf"
+			  "\x77\x53\xba\x4c\x30\x5b\xb8\x33",
+		.psize	= 16,
+		.digest	= "\x04\xbf\x7f\x6a\xce\x72\xea\x6a"
+			  "\x79\xdb\xb0\xc9\x60\xf6\x12\xcc",
+		.np	= 6,
+		.tap	= { 4, 4, 1, 1, 1, 5 },
+	}, {
+		.key	= "\x65\x4d\xe3\xf8\xd2\x4c\xac\x28"
+			  "\x68\xf5\xb3\x81\x71\x4b\xa1\xfa"
+			  "\x04\x0e\xd3\x81\x36\xbe\x0c\x81"
+			  "\x5e\xaf\xbc\x3a\xa4\xc0\x8e\x8b"
+			  "\x55\x63\xd3\x52\x97\x88\xd6\x19"
+			  "\xbc\x96\xdf\x49\xff\x04\x63\xf5"
+			  "\x0c\x11\x13\xaa\x9e\x1f\x5a\xf7"
+			  "\xdd\xbd\x37\x80\xc3\xd0\xbe\xa7"
+			  "\x05\xc8\x3c\x98\x1e\x05\x3c\x84"
+			  "\x39\x61\xc4\xed\xed\x71\x1b\xc4"
+			  "\x74\x45\x2c\xa1\x56\x70\x97\xfd"
+			  "\x44\x18\x07\x7d\xca\x60\x1f\x73"
+			  "\x3b\x6d\x21\xcb\x61\x87\x70\x25"
+			  "\x46\x21\xf1\x1f\x21\x91\x31\x2d"
+			  "\x5d\xcc\xb7\xd1\x84\x3e\x3d\xdb"
+			  "\x03\x53\x2a\x82\xa6\x9a\x95\xbc"
+			  "\x1a\x1e\x0a\x5e\x07\x43\xab\x43"
+			  "\xaf\x92\x82\x06\x91\x04\x09\xf4"
+			  "\x17\x0a\x9a\x2c\x54\xdb\xb8\xf4"
+			  "\xd0\xf0\x10\x66\x24\x8d\xcd\xda"
+			  "\xfe\x0e\x45\x9d\x6f\xc4\x4e\xf4"
+			  "\x96\xaf\x13\xdc\xa9\xd4\x8c\xc4"
+			  "\xc8\x57\x39\x3c\xc2\xd3\x0a\x76"
+			  "\x4a\x1f\x75\x83\x44\xc7\xd1\x39"
+			  "\xd8\xb5\x41\xba\x73\x87\xfa\x96"
+			  "\xc7\x18\x53\xfb\x9b\xda\xa0\x97"
+			  "\x1d\xee\x60\x85\x9e\x14\xc3\xce"
+			  "\xc4\x05\x29\x3b\x95\x30\xa3\xd1"
+			  "\x9f\x82\x6a\x04\xf5\xa7\x75\x57"
+			  "\x82\x04\xfe\x71\x51\x71\xb1\x49"
+			  "\x50\xf8\xe0\x96\xf1\xfa\xa8\x88"
+			  "\x3f\xa0\x86\x20\xd4\x60\x79\x59"
+			  "\x17\x2d\xd1\x09\xf4\xec\x05\x57"
+			  "\xcf\x62\x7e\x0e\x7e\x60\x78\xe6"
+			  "\x08\x60\x29\xd8\xd5\x08\x1a\x24"
+			  "\xc4\x6c\x24\xe7\x92\x08\x3d\x8a"
+			  "\x98\x7a\xcf\x99\x0a\x65\x0e\xdc"
+			  "\x8c\x8a\xbe\x92\x82\x91\xcc\x62"
+			  "\x30\xb6\xf4\x3f\xc6\x8a\x7f\x12"
+			  "\x4a\x8a\x49\xfa\x3f\x5c\xd4\x5a"
+			  "\xa6\x82\xa3\xe6\xaa\x34\x76\xb2"
+			  "\xab\x0a\x30\xef\x6c\x77\x58\x3f"
+			  "\x05\x6b\xcc\x5c\xae\xdc\xd7\xb9"
+			  "\x51\x7e\x8d\x32\x5b\x24\x25\xbe"
+			  "\x2b\x24\x01\xcf\x80\xda\x16\xd8"
+			  "\x90\x72\x2c\xad\x34\x8d\x0c\x74"
+			  "\x02\xcb\xfd\xcf\x6e\xef\x97\xb5"
+			  "\x4c\xf2\x68\xca\xde\x43\x9e\x8a"
+			  "\xc5\x5f\x31\x7f\x14\x71\x38\xec"
+			  "\xbd\x98\xe5\x71\xc4\xb5\xdb\xef"
+			  "\x59\xd2\xca\xc0\xc1\x86\x75\x01"
+			  "\xd4\x15\x0d\x6f\xa4\xf7\x7b\x37"
+			  "\x47\xda\x18\x93\x63\xda\xbe\x9e"
+			  "\x07\xfb\xb2\x83\xd5\xc4\x34\x55"
+			  "\xee\x73\xa1\x42\x96\xf9\x66\x41"
+			  "\xa4\xcc\xd2\x93\x6e\xe1\x0a\xbb"
+			  "\xd2\xdd\x18\x23\xe6\x6b\x98\x0b"
+			  "\x8a\x83\x59\x2c\xc3\xa6\x59\x5b"
+			  "\x01\x22\x59\xf7\xdc\xb0\x87\x7e"
+			  "\xdb\x7d\xf4\x71\x41\xab\xbd\xee"
+			  "\x79\xbe\x3c\x01\x76\x0b\x2d\x0a"
+			  "\x42\xc9\x77\x8c\xbb\x54\x95\x60"
+			  "\x43\x2e\xe0\x17\x52\xbd\x90\xc9"
+			  "\xc2\x2c\xdd\x90\x24\x22\x76\x40"
+			  "\x5c\xb9\x41\xc9\xa1\xd5\xbd\xe3"
+			  "\x44\xe0\xa4\xab\xcc\xb8\xe2\x32"
+			  "\x02\x15\x04\x1f\x8c\xec\x5d\x14"
+			  "\xac\x18\xaa\xef\x6e\x33\x19\x6e"
+			  "\xde\xfe\x19\xdb\xeb\x61\xca\x18"
+			  "\xad\xd8\x3d\xbf\x09\x11\xc7\xa5"
+			  "\x86\x0b\x0f\xe5\x3e\xde\xe8\xd9"
+			  "\x0a\x69\x9e\x4c\x20\xff\xf9\xc5"
+			  "\xfa\xf8\xf3\x7f\xa5\x01\x4b\x5e"
+			  "\x0f\xf0\x3b\x68\xf0\x46\x8c\x2a"
+			  "\x7a\xc1\x8f\xa0\xfe\x6a\x5b\x44"
+			  "\x70\x5c\xcc\x92\x2c\x6f\x0f\xbd"
+			  "\x25\x3e\xb7\x8e\x73\x58\xda\xc9"
+			  "\xa5\xaa\x9e\xf3\x9b\xfd\x37\x3e"
+			  "\xe2\x88\xa4\x7b\xc8\x5c\xa8\x93"
+			  "\x0e\xe7\x9a\x9c\x2e\x95\x18\x9f"
+			  "\xc8\x45\x0c\x88\x9e\x53\x4f\x3a"
+			  "\x76\xc1\x35\xfa\x17\xd8\xac\xa0"
+			  "\x0c\x2d\x47\x2e\x4f\x69\x9b\xf7"
+			  "\xd0\xb6\x96\x0c\x19\xb3\x08\x01"
+			  "\x65\x7a\x1f\xc7\x31\x86\xdb\xc8"
+			  "\xc1\x99\x8f\xf8\x08\x4a\x9d\x23"
+			  "\x22\xa8\xcf\x27\x01\x01\x88\x93"
+			  "\x9c\x86\x45\xbd\xe0\x51\xca\x52"
+			  "\x84\xba\xfe\x03\xf7\xda\xc5\xce"
+			  "\x3e\x77\x75\x86\xaf\x84\xc8\x05"
+			  "\x44\x01\x0f\x02\xf3\x58\xb0\x06"
+			  "\x5a\xd7\x12\x30\x8d\xdf\x1f\x1f"
+			  "\x0a\xe6\xd2\xea\xf6\x3a\x7a\x99"
+			  "\x63\xe8\xd2\xc1\x4a\x45\x8b\x40"
+			  "\x4d\x0a\xa9\x76\x92\xb3\xda\x87"
+			  "\x36\x33\xf0\x78\xc3\x2f\x5f\x02"
+			  "\x1a\x6a\x2c\x32\xcd\x76\xbf\xbd"
+			  "\x5a\x26\x20\x28\x8c\x8c\xbc\x52"
+			  "\x3d\x0a\xc9\xcb\xab\xa4\x21\xb0"
+			  "\x54\x40\x81\x44\xc7\xd6\x1c\x11"
+			  "\x44\xc6\x02\x92\x14\x5a\xbf\x1a"
+			  "\x09\x8a\x18\xad\xcd\x64\x3d\x53"
+			  "\x4a\xb6\xa5\x1b\x57\x0e\xef\xe0"
+			  "\x8c\x44\x5f\x7d\xbd\x6c\xfd\x60"
+			  "\xae\x02\x24\xb6\x99\xdd\x8c\xaf"
+			  "\x59\x39\x75\x3c\xd1\x54\x7b\x86"
+			  "\xcc\x99\xd9\x28\x0c\xb0\x94\x62"
+			  "\xf9\x51\xd1\x19\x96\x2d\x66\xf5"
+			  "\x55\xcf\x9e\x59\xe2\x6b\x2c\x08"
+			  "\xc0\x54\x48\x24\x45\xc3\x8c\x73"
+			  "\xea\x27\x6e\x66\x7d\x1d\x0e\x6e"
+			  "\x13\xe8\x56\x65\x3a\xb0\x81\x5c"
+			  "\xf0\xe8\xd8\x00\x6b\xcd\x8f\xad"
+			  "\xdd\x53\xf3\xa4\x6c\x43\xd6\x31"
+			  "\xaf\xd2\x76\x1e\x91\x12\xdb\x3c"
+			  "\x8c\xc2\x81\xf0\x49\xdb\xe2\x6b"
+			  "\x76\x62\x0a\x04\xe4\xaa\x8a\x7c"
+			  "\x08\x0b\x5d\xd0\xee\x1d\xfb\xc4"
+			  "\x02\x75\x42\xd6\xba\xa7\x22\xa8"
+			  "\x47\x29\xb7\x85\x6d\x93\x3a\xdb"
+			  "\x00\x53\x0b\xa2\xeb\xf8\xfe\x01"
+			  "\x6f\x8a\x31\xd6\x17\x05\x6f\x67"
+			  "\x88\x95\x32\xfe\x4f\xa6\x4b\xf8"
+			  "\x03\xe4\xcd\x9a\x18\xe8\x4e\x2d"
+			  "\xf7\x97\x9a\x0c\x7d\x9f\x7e\x44"
+			  "\x69\x51\xe0\x32\x6b\x62\x86\x8f"
+			  "\xa6\x8e\x0b\x21\x96\xe5\xaf\x77"
+			  "\xc0\x83\xdf\xa5\x0e\xd0\xa1\x04"
+			  "\xaf\xc1\x10\xcb\x5a\x40\xe4\xe3"
+			  "\x38\x7e\x07\xe8\x4d\xfa\xed\xc5"
+			  "\xf0\x37\xdf\xbb\x8a\xcf\x3d\xdc"
+			  "\x61\xd2\xc6\x2b\xff\x07\xc9\x2f"
+			  "\x0c\x2d\x5c\x07\xa8\x35\x6a\xfc"
+			  "\xae\x09\x03\x45\x74\x51\x4d\xc4"
+			  "\xb8\x23\x87\x4a\x99\x27\x20\x87"
+			  "\x62\x44\x0a\x4a\xce\x78\x47\x22",
+		.ksize	= 1088,
+		.plaintext	= "\x8e\xb0\x4c\xde\x9c\x4a\x04\x5a"
+			  "\xf6\xa9\x7f\x45\x25\xa5\x7b\x3a"
+			  "\xbc\x4d\x73\x39\x81\xb5\xbd\x3d"
+			  "\x21\x6f\xd7\x37\x50\x3c\x7b\x28"
+			  "\xd1\x03\x3a\x17\xed\x7b\x7c\x2a"
+			  "\x16\xbc\xdf\x19\x89\x52\x71\x31"
+			  "\xb6\xc0\xfd\xb5\xd3\xba\x96\x99"
+			  "\xb6\x34\x0b\xd0\x99\x93\xfc\x1a"
+			  "\x01\x3c\x85\xc6\x9b\x78\x5c\x8b"
+			  "\xfe\xae\xd2\xbf\xb2\x6f\xf9\xed"
+			  "\xc8\x25\x17\xfe\x10\x3b\x7d\xda"
+			  "\xf4\x8d\x35\x4b\x7c\x7b\x82\xe7"
+			  "\xc2\xb3\xee\x60\x4a\x03\x86\xc9"
+			  "\x4e\xb5\xc4\xbe\xd2\xbd\x66\xf1"
+			  "\x13\xf1\x09\xab\x5d\xca\x63\x1f"
+			  "\xfc\xfb\x57\x2a\xfc\xca\x66\xd8"
+			  "\x77\x84\x38\x23\x1d\xac\xd3\xb3"
+			  "\x7a\xad\x4c\x70\xfa\x9c\xc9\x61"
+			  "\xa6\x1b\xba\x33\x4b\x4e\x33\xec"
+			  "\xa0\xa1\x64\x39\x40\x05\x1c\xc2"
+			  "\x3f\x49\x9d\xae\xf2\xc5\xf2\xc5"
+			  "\xfe\xe8\xf4\xc2\xf9\x96\x2d\x28"
+			  "\x92\x30\x44\xbc\xd2\x7f\xe1\x6e"
+			  "\x62\x02\x8f\x3d\x1c\x80\xda\x0e"
+			  "\x6a\x90\x7e\x75\xff\xec\x3e\xc4"
+			  "\xcd\x16\x34\x3b\x05\x6d\x4d\x20"
+			  "\x1c\x7b\xf5\x57\x4f\xfa\x3d\xac"
+			  "\xd0\x13\x55\xe8\xb3\xe1\x1b\x78"
+			  "\x30\xe6\x9f\x84\xd4\x69\xd1\x08"
+			  "\x12\x77\xa7\x4a\xbd\xc0\xf2\xd2"
+			  "\x78\xdd\xa3\x81\x12\xcb\x6c\x14"
+			  "\x90\x61\xe2\x84\xc6\x2b\x16\xcc"
+			  "\x40\x99\x50\x88\x01\x09\x64\x4f"
+			  "\x0a\x80\xbe\x61\xae\x46\xc9\x0a"
+			  "\x5d\xe0\xfb\x72\x7a\x1a\xdd\x61"
+			  "\x63\x20\x05\xa0\x4a\xf0\x60\x69"
+			  "\x7f\x92\xbc\xbf\x4e\x39\x4d\xdd"
+			  "\x74\xd1\xb7\xc0\x5a\x34\xb7\xae"
+			  "\x76\x65\x2e\xbc\x36\xb9\x04\x95"
+			  "\x42\xe9\x6f\xca\x78\xb3\x72\x07"
+			  "\xa3\xba\x02\x94\x67\x4c\xb1\xd7"
+			  "\xe9\x30\x0d\xf0\x3b\xb8\x10\x6d"
+			  "\xea\x2b\x21\xbf\x74\x59\x82\x97"
+			  "\x85\xaa\xf1\xd7\x54\x39\xeb\x05"
+			  "\xbd\xf3\x40\xa0\x97\xe6\x74\xfe"
+			  "\xb4\x82\x5b\xb1\x36\xcb\xe8\x0d"
+			  "\xce\x14\xd9\xdf\xf1\x94\x22\xcd"
+			  "\xd6\x00\xba\x04\x4c\x05\x0c\xc0"
+			  "\xd1\x5a\xeb\x52\xd5\xa8\x8e\xc8"
+			  "\x97\xa1\xaa\xc1\xea\xc1\xbe\x7c"
+			  "\x36\xb3\x36\xa0\xc6\x76\x66\xc5"
+			  "\xe2\xaf\xd6\x5c\xe2\xdb\x2c\xb3"
+			  "\x6c\xb9\x99\x7f\xff\x9f\x03\x24"
+			  "\xe1\x51\x44\x66\xd8\x0c\x5d\x7f"
+			  "\x5c\x85\x22\x2a\xcf\x6d\x79\x28"
+			  "\xab\x98\x01\x72\xfe\x80\x87\x5f"
+			  "\x46\xba\xef\x81\x24\xee\xbf\xb0"
+			  "\x24\x74\xa3\x65\x97\x12\xc4\xaf"
+			  "\x8b\xa0\x39\xda\x8a\x7e\x74\x6e"
+			  "\x1b\x42\xb4\x44\x37\xfc\x59\xfd"
+			  "\x86\xed\xfb\x8c\x66\x33\xda\x63"
+			  "\x75\xeb\xe1\xa4\x85\x4f\x50\x8f"
+			  "\x83\x66\x0d\xd3\x37\xfa\xe6\x9c"
+			  "\x4f\x30\x87\x35\x18\xe3\x0b\xb7"
+			  "\x6e\x64\x54\xcd\x70\xb3\xde\x54"
+			  "\xb7\x1d\xe6\x4c\x4d\x55\x12\x12"
+			  "\xaf\x5f\x7f\x5e\xee\x9d\xe8\x8e"
+			  "\x32\x9d\x4e\x75\xeb\xc6\xdd\xaa"
+			  "\x48\x82\xa4\x3f\x3c\xd7\xd3\xa8"
+			  "\x63\x9e\x64\xfe\xe3\x97\x00\x62"
+			  "\xe5\x40\x5d\xc3\xad\x72\xe1\x28"
+			  "\x18\x50\xb7\x75\xef\xcd\x23\xbf"
+			  "\x3f\xc0\x51\x36\xf8\x41\xc3\x08"
+			  "\xcb\xf1\x8d\x38\x34\xbd\x48\x45"
+			  "\x75\xed\xbc\x65\x7b\xb5\x0c\x9b"
+			  "\xd7\x67\x7d\x27\xb4\xc4\x80\xd7"
+			  "\xa9\xb9\xc7\x4a\x97\xaa\xda\xc8"
+			  "\x3c\x74\xcf\x36\x8f\xe4\x41\xe3"
+			  "\xd4\xd3\x26\xa7\xf3\x23\x9d\x8f"
+			  "\x6c\x20\x05\x32\x3e\xe0\xc3\xc8"
+			  "\x56\x3f\xa7\x09\xb7\xfb\xc7\xf7"
+			  "\xbe\x2a\xdd\x0f\x06\x7b\x0d\xdd"
+			  "\xb0\xb4\x86\x17\xfd\xb9\x04\xe5"
+			  "\xc0\x64\x5d\xad\x2a\x36\x38\xdb"
+			  "\x24\xaf\x5b\xff\xca\xf9\x41\xe8"
+			  "\xf9\x2f\x1e\x5e\xf9\xf5\xd5\xf2"
+			  "\xb2\x88\xca\xc9\xa1\x31\xe2\xe8"
+			  "\x10\x95\x65\xbf\xf1\x11\x61\x7a"
+			  "\x30\x1a\x54\x90\xea\xd2\x30\xf6"
+			  "\xa5\xad\x60\xf9\x4d\x84\x21\x1b"
+			  "\xe4\x42\x22\xc8\x12\x4b\xb0\x58"
+			  "\x3e\x9c\x2d\x32\x95\x0a\x8e\xb0"
+			  "\x0a\x7e\x77\x2f\xe8\x97\x31\x6a"
+			  "\xf5\x59\xb4\x26\xe6\x37\x12\xc9"
+			  "\xcb\xa0\x58\x33\x6f\xd5\x55\x55"
+			  "\x3c\xa1\x33\xb1\x0b\x7e\x2e\xb4"
+			  "\x43\x2a\x84\x39\xf0\x9c\xf4\x69"
+			  "\x4f\x1e\x79\xa6\x15\x1b\x87\xbb"
+			  "\xdb\x9b\xe0\xf1\x0b\xba\xe3\x6e"
+			  "\xcc\x2f\x49\x19\x22\x29\xfc\x71"
+			  "\xbb\x77\x38\x18\x61\xaf\x85\x76"
+			  "\xeb\xd1\x09\xcc\x86\x04\x20\x9a"
+			  "\x66\x53\x2f\x44\x8b\xc6\xa3\xd2"
+			  "\x5f\xc7\x79\x82\x66\xa8\x6e\x75"
+			  "\x7d\x94\xd1\x86\x75\x0f\xa5\x4f"
+			  "\x3c\x7a\x33\xce\xd1\x6e\x9d\x7b"
+			  "\x1f\x91\x37\xb8\x37\x80\xfb\xe0"
+			  "\x52\x26\xd0\x9a\xd4\x48\x02\x41"
+			  "\x05\xe3\x5a\x94\xf1\x65\x61\x19"
+			  "\xb8\x88\x4e\x2b\xea\xba\x8b\x58"
+			  "\x8b\x42\x01\x00\xa8\xfe\x00\x5c"
+			  "\xfe\x1c\xee\x31\x15\x69\xfa\xb3"
+			  "\x9b\x5f\x22\x8e\x0d\x2c\xe3\xa5"
+			  "\x21\xb9\x99\x8a\x8e\x94\x5a\xef"
+			  "\x13\x3e\x99\x96\x79\x6e\xd5\x42"
+			  "\x36\x03\xa9\xe2\xca\x65\x4e\x8a"
+			  "\x8a\x30\xd2\x7d\x74\xe7\xf0\xaa"
+			  "\x23\x26\xdd\xcb\x82\x39\xfc\x9d"
+			  "\x51\x76\x21\x80\xa2\xbe\x93\x03"
+			  "\x47\xb0\xc1\xb6\xdc\x63\xfd\x9f"
+			  "\xca\x9d\xa5\xca\x27\x85\xe2\xd8"
+			  "\x15\x5b\x7e\x14\x7a\xc4\x89\xcc"
+			  "\x74\x14\x4b\x46\xd2\xce\xac\x39"
+			  "\x6b\x6a\x5a\xa4\x0e\xe3\x7b\x15"
+			  "\x94\x4b\x0f\x74\xcb\x0c\x7f\xa9"
+			  "\xbe\x09\x39\xa3\xdd\x56\x5c\xc7"
+			  "\x99\x56\x65\x39\xf4\x0b\x7d\x87"
+			  "\xec\xaa\xe3\x4d\x22\x65\x39\x4e",
+		.psize	= 1024,
+		.digest	= "\x64\x3a\xbc\xc3\x3f\x74\x40\x51"
+			  "\x6e\x56\x01\x1a\x51\xec\x36\xde",
+		.np	= 8,
+		.tap	= { 64, 203, 267, 28, 263, 62, 54, 83 },
+	}, {
+		.key	= "\x1b\x82\x2e\x1b\x17\x23\xb9\x6d"
+			  "\xdc\x9c\xda\x99\x07\xe3\x5f\xd8"
+			  "\xd2\xf8\x43\x80\x8d\x86\x7d\x80"
+			  "\x1a\xd0\xcc\x13\xb9\x11\x05\x3f"
+			  "\x7e\xcf\x7e\x80\x0e\xd8\x25\x48"
+			  "\x8b\xaa\x63\x83\x92\xd0\x72\xf5"
+			  "\x4f\x67\x7e\x50\x18\x25\xa4\xd1"
+			  "\xe0\x7e\x1e\xba\xd8\xa7\x6e\xdb"
+			  "\x1a\xcc\x0d\xfe\x9f\x6d\x22\x35"
+			  "\xe1\xe6\xe0\xa8\x7b\x9c\xb1\x66"
+			  "\xa3\xf8\xff\x4d\x90\x84\x28\xbc"
+			  "\xdc\x19\xc7\x91\x49\xfc\xf6\x33"
+			  "\xc9\x6e\x65\x7f\x28\x6f\x68\x2e"
+			  "\xdf\x1a\x75\xe9\xc2\x0c\x96\xb9"
+			  "\x31\x22\xc4\x07\xc6\x0a\x2f\xfd"
+			  "\x36\x06\x5f\x5c\xc5\xb1\x3a\xf4"
+			  "\x5e\x48\xa4\x45\x2b\x88\xa7\xee"
+			  "\xa9\x8b\x52\xcc\x99\xd9\x2f\xb8"
+			  "\xa4\x58\x0a\x13\xeb\x71\x5a\xfa"
+			  "\xe5\x5e\xbe\xf2\x64\xad\x75\xbc"
+			  "\x0b\x5b\x34\x13\x3b\x23\x13\x9a"
+			  "\x69\x30\x1e\x9a\xb8\x03\xb8\x8b"
+			  "\x3e\x46\x18\x6d\x38\xd9\xb3\xd8"
+			  "\xbf\xf1\xd0\x28\xe6\x51\x57\x80"
+			  "\x5e\x99\xfb\xd0\xce\x1e\x83\xf7"
+			  "\xe9\x07\x5a\x63\xa9\xef\xce\xa5"
+			  "\xfb\x3f\x37\x17\xfc\x0b\x37\x0e"
+			  "\xbb\x4b\x21\x62\xb7\x83\x0e\xa9"
+			  "\x9e\xb0\xc4\xad\x47\xbe\x35\xe7"
+			  "\x51\xb2\xf2\xac\x2b\x65\x7b\x48"
+			  "\xe3\x3f\x5f\xb6\x09\x04\x0c\x58"
+			  "\xce\x99\xa9\x15\x2f\x4e\xc1\xf2"
+			  "\x24\x48\xc0\xd8\x6c\xd3\x76\x17"
+			  "\x83\x5d\xe6\xe3\xfd\x01\x8e\xf7"
+			  "\x42\xa5\x04\x29\x30\xdf\xf9\x00"
+			  "\x4a\xdc\x71\x22\x1a\x33\x15\xb6"
+			  "\xd7\x72\xfb\x9a\xb8\xeb\x2b\x38"
+			  "\xea\xa8\x61\xa8\x90\x11\x9d\x73"
+			  "\x2e\x6c\xce\x81\x54\x5a\x9f\xcd"
+			  "\xcf\xd5\xbd\x26\x5d\x66\xdb\xfb"
+			  "\xdc\x1e\x7c\x10\xfe\x58\x82\x10"
+			  "\x16\x24\x01\xce\x67\x55\x51\xd1"
+			  "\xdd\x6b\x44\xa3\x20\x8e\xa9\xa6"
+			  "\x06\xa8\x29\x77\x6e\x00\x38\x5b"
+			  "\xde\x4d\x58\xd8\x1f\x34\xdf\xf9"
+			  "\x2c\xac\x3e\xad\xfb\x92\x0d\x72"
+			  "\x39\xa4\xac\x44\x10\xc0\x43\xc4"
+			  "\xa4\x77\x3b\xfc\xc4\x0d\x37\xd3"
+			  "\x05\x84\xda\x53\x71\xf8\x80\xd3"
+			  "\x34\x44\xdb\x09\xb4\x2b\x8e\xe3"
+			  "\x00\x75\x50\x9e\x43\x22\x00\x0b"
+			  "\x7c\x70\xab\xd4\x41\xf1\x93\xcd"
+			  "\x25\x2d\x84\x74\xb5\xf2\x92\xcd"
+			  "\x0a\x28\xea\x9a\x49\x02\x96\xcb"
+			  "\x85\x9e\x2f\x33\x03\x86\x1d\xdc"
+			  "\x1d\x31\xd5\xfc\x9d\xaa\xc5\xe9"
+			  "\x9a\xc4\x57\xf5\x35\xed\xf4\x4b"
+			  "\x3d\x34\xc2\x29\x13\x86\x36\x42"
+			  "\x5d\xbf\x90\x86\x13\x77\xe5\xc3"
+			  "\x62\xb4\xfe\x0b\x70\x39\x35\x65"
+			  "\x02\xea\xf6\xce\x57\x0c\xbb\x74"
+			  "\x29\xe3\xfd\x60\x90\xfd\x10\x38"
+			  "\xd5\x4e\x86\xbd\x37\x70\xf0\x97"
+			  "\xa6\xab\x3b\x83\x64\x52\xca\x66"
+			  "\x2f\xf9\xa4\xca\x3a\x55\x6b\xb0"
+			  "\xe8\x3a\x34\xdb\x9e\x48\x50\x2f"
+			  "\x3b\xef\xfd\x08\x2d\x5f\xc1\x37"
+			  "\x5d\xbe\x73\xe4\xd8\xe9\xac\xca"
+			  "\x8a\xaa\x48\x7c\x5c\xf4\xa6\x96"
+			  "\x5f\xfa\x70\xa6\xb7\x8b\x50\xcb"
+			  "\xa6\xf5\xa9\xbd\x7b\x75\x4c\x22"
+			  "\x0b\x19\x40\x2e\xc9\x39\x39\x32"
+			  "\x83\x03\xa8\xa4\x98\xe6\x8e\x16"
+			  "\xb9\xde\x08\xc5\xfc\xbf\xad\x39"
+			  "\xa8\xc7\x93\x6c\x6f\x23\xaf\xc1"
+			  "\xab\xe1\xdf\xbb\x39\xae\x93\x29"
+			  "\x0e\x7d\x80\x8d\x3e\x65\xf3\xfd"
+			  "\x96\x06\x65\x90\xa1\x28\x64\x4b"
+			  "\x69\xf9\xa8\x84\x27\x50\xfc\x87"
+			  "\xf7\xbf\x55\x8e\x56\x13\x58\x7b"
+			  "\x85\xb4\x6a\x72\x0f\x40\xf1\x4f"
+			  "\x83\x81\x1f\x76\xde\x15\x64\x7a"
+			  "\x7a\x80\xe4\xc7\x5e\x63\x01\x91"
+			  "\xd7\x6b\xea\x0b\x9b\xa2\x99\x3b"
+			  "\x6c\x88\xd8\xfd\x59\x3c\x8d\x22"
+			  "\x86\x56\xbe\xab\xa1\x37\x08\x01"
+			  "\x50\x85\x69\x29\xee\x9f\xdf\x21"
+			  "\x3e\x20\x20\xf5\xb0\xbb\x6b\xd0"
+			  "\x9c\x41\x38\xec\x54\x6f\x2d\xbd"
+			  "\x0f\xe1\xbd\xf1\x2b\x6e\x60\x56"
+			  "\x29\xe5\x7a\x70\x1c\xe2\xfc\x97"
+			  "\x82\x68\x67\xd9\x3d\x1f\xfb\xd8"
+			  "\x07\x9f\xbf\x96\x74\xba\x6a\x0e"
+			  "\x10\x48\x20\xd8\x13\x1e\xb5\x44"
+			  "\xf2\xcc\xb1\x8b\xfb\xbb\xec\xd7"
+			  "\x37\x70\x1f\x7c\x55\xd2\x4b\xb9"
+			  "\xfd\x70\x5e\xa3\x91\x73\x63\x52"
+			  "\x13\x47\x5a\x06\xfb\x01\x67\xa5"
+			  "\xc0\xd0\x49\x19\x56\x66\x9a\x77"
+			  "\x64\xaf\x8c\x25\x91\x52\x87\x0e"
+			  "\x18\xf3\x5f\x97\xfd\x71\x13\xf8"
+			  "\x05\xa5\x39\xcc\x65\xd3\xcc\x63"
+			  "\x5b\xdb\x5f\x7e\x5f\x6e\xad\xc4"
+			  "\xf4\xa0\xc5\xc2\x2b\x4d\x97\x38"
+			  "\x4f\xbc\xfa\x33\x17\xb4\x47\xb9"
+			  "\x43\x24\x15\x8d\xd2\xed\x80\x68"
+			  "\x84\xdb\x04\x80\xca\x5e\x6a\x35"
+			  "\x2c\x2c\xe7\xc5\x03\x5f\x54\xb0"
+			  "\x5e\x4f\x1d\x40\x54\x3d\x78\x9a"
+			  "\xac\xda\x80\x27\x4d\x15\x4c\x1a"
+			  "\x6e\x80\xc9\xc4\x3b\x84\x0e\xd9"
+			  "\x2e\x93\x01\x8c\xc3\xc8\x91\x4b"
+			  "\xb3\xaa\x07\x04\x68\x5b\x93\xa5"
+			  "\xe7\xc4\x9d\xe7\x07\xee\xf5\x3b"
+			  "\x40\x89\xcc\x60\x34\x9d\xb4\x06"
+			  "\x1b\xef\x92\xe6\xc1\x2a\x7d\x0f"
+			  "\x81\xaa\x56\xe3\xd7\xed\xa7\xd4"
+			  "\xa7\x3a\x49\xc4\xad\x81\x5c\x83"
+			  "\x55\x8e\x91\x54\xb7\x7d\x65\xa5"
+			  "\x06\x16\xd5\x9a\x16\xc1\xb0\xa2"
+			  "\x06\xd8\x98\x47\x73\x7e\x73\xa0"
+			  "\xb8\x23\xb1\x52\xbf\x68\x74\x5d"
+			  "\x0b\xcb\xfa\x8c\x46\xe3\x24\xe6"
+			  "\xab\xd4\x69\x8d\x8c\xf2\x8a\x59"
+			  "\xbe\x48\x46\x50\x8c\x9a\xe8\xe3"
+			  "\x31\x55\x0a\x06\xed\x4f\xf8\xb7"
+			  "\x4f\xe3\x85\x17\x30\xbd\xd5\x20"
+			  "\xe7\x5b\xb2\x32\xcf\x6b\x16\x44"
+			  "\xd2\xf5\x7e\xd7\xd1\x2f\xee\x64"
+			  "\x3e\x9d\x10\xef\x27\x35\x43\x64"
+			  "\x67\xfb\x7a\x7b\xe0\x62\x31\x9a"
+			  "\x4d\xdf\xa5\xab\xc0\x20\xbb\x01"
+			  "\xe9\x7b\x54\xf1\xde\xb2\x79\x50"
+			  "\x6c\x4b\x91\xdb\x7f\xbb\x50\xc1"
+			  "\x55\x44\x38\x9a\xe0\x9f\xe8\x29"
+			  "\x6f\x15\xf8\x4e\xa6\xec\xa0\x60",
+		.ksize	= 1088,
+		.plaintext	= "\x15\x68\x9e\x2f\xad\x15\x52\xdf"
+			  "\xf0\x42\x62\x24\x2a\x2d\xea\xbf"
+			  "\xc7\xf3\xb4\x1a\xf5\xed\xb2\x08"
+			  "\x15\x60\x1c\x00\x77\xbf\x0b\x0e"
+			  "\xb7\x2c\xcf\x32\x3a\xc7\x01\x77"
+			  "\xef\xa6\x75\xd0\x29\xc7\x68\x20"
+			  "\xb2\x92\x25\xbf\x12\x34\xe9\xa4"
+			  "\xfd\x32\x7b\x3f\x7c\xbd\xa5\x02"
+			  "\x38\x41\xde\xc9\xc1\x09\xd9\xfc"
+			  "\x6e\x78\x22\x83\x18\xf7\x50\x8d"
+			  "\x8f\x9c\x2d\x02\xa5\x30\xac\xff"
+			  "\xea\x63\x2e\x80\x37\x83\xb0\x58"
+			  "\xda\x2f\xef\x21\x55\xba\x7b\xb1"
+			  "\xb6\xed\xf5\xd2\x4d\xaa\x8c\xa9"
+			  "\xdd\xdb\x0f\xb4\xce\xc1\x9a\xb1"
+			  "\xc1\xdc\xbd\xab\x86\xc2\xdf\x0b"
+			  "\xe1\x2c\xf9\xbe\xf6\xd8\xda\x62"
+			  "\x72\xdd\x98\x09\x52\xc0\xc4\xb6"
+			  "\x7b\x17\x5c\xf5\xd8\x4b\x88\xd6"
+			  "\x6b\xbf\x84\x4a\x3f\xf5\x4d\xd2"
+			  "\x94\xe2\x9c\xff\xc7\x3c\xd9\xc8"
+			  "\x37\x38\xbc\x8c\xf3\xe7\xb7\xd0"
+			  "\x1d\x78\xc4\x39\x07\xc8\x5e\x79"
+			  "\xb6\x5a\x90\x5b\x6e\x97\xc9\xd4"
+			  "\x82\x9c\xf3\x83\x7a\xe7\x97\xfc"
+			  "\x1d\xbb\xef\xdb\xce\xe0\x82\xad"
+			  "\xca\x07\x6c\x54\x62\x6f\x81\xe6"
+			  "\x7a\x5a\x96\x6e\x80\x3a\xa2\x37"
+			  "\x6f\xc6\xa4\x29\xc3\x9e\x19\x94"
+			  "\x9f\xb0\x3e\x38\xfb\x3c\x2b\x7d"
+			  "\xaa\xb8\x74\xda\x54\x23\x51\x12"
+			  "\x4b\x96\x36\x8f\x91\x4f\x19\x37"
+			  "\x83\xc9\xdd\xc7\x1a\x32\x2d\xab"
+			  "\xc7\x89\xe2\x07\x47\x6c\xe8\xa6"
+			  "\x70\x6b\x8e\x0c\xda\x5c\x6a\x59"
+			  "\x27\x33\x0e\xe1\xe1\x20\xe8\xc8"
+			  "\xae\xdc\xd0\xe3\x6d\xa8\xa6\x06"
+			  "\x41\xb4\xd4\xd4\xcf\x91\x3e\x06"
+			  "\xb0\x9a\xf7\xf1\xaa\xa6\x23\x92"
+			  "\x10\x86\xf0\x94\xd1\x7c\x2e\x07"
+			  "\x30\xfb\xc5\xd8\xf3\x12\xa9\xe8"
+			  "\x22\x1c\x97\x1a\xad\x96\xb0\xa1"
+			  "\x72\x6a\x6b\xb4\xfd\xf7\xe8\xfa"
+			  "\xe2\x74\xd8\x65\x8d\x35\x17\x4b"
+			  "\x00\x23\x5c\x8c\x70\xad\x71\xa2"
+			  "\xca\xc5\x6c\x59\xbf\xb4\xc0\x6d"
+			  "\x86\x98\x3e\x19\x5a\x90\x92\xb1"
+			  "\x66\x57\x6a\x91\x68\x7c\xbc\xf3"
+			  "\xf1\xdb\x94\xf8\x48\xf1\x36\xd8"
+			  "\x78\xac\x1c\xa9\xcc\xd6\x27\xba"
+			  "\x91\x54\x22\xf5\xe6\x05\x3f\xcc"
+			  "\xc2\x8f\x2c\x3b\x2b\xc3\x2b\x2b"
+			  "\x3b\xb8\xb6\x29\xb7\x2f\x94\xb6"
+			  "\x7b\xfc\x94\x3e\xd0\x7a\x41\x59"
+			  "\x7b\x1f\x9a\x09\xa6\xed\x4a\x82"
+			  "\x9d\x34\x1c\xbd\x4e\x1c\x3a\x66"
+			  "\x80\x74\x0e\x9a\x4f\x55\x54\x47"
+			  "\x16\xba\x2a\x0a\x03\x35\x99\xa3"
+			  "\x5c\x63\x8d\xa2\x72\x8b\x17\x15"
+			  "\x68\x39\x73\xeb\xec\xf2\xe8\xf5"
+			  "\x95\x32\x27\xd6\xc4\xfe\xb0\x51"
+			  "\xd5\x0c\x50\xc5\xcd\x6d\x16\xb3"
+			  "\xa3\x1e\x95\x69\xad\x78\x95\x06"
+			  "\xb9\x46\xf2\x6d\x24\x5a\x99\x76"
+			  "\x73\x6a\x91\xa6\xac\x12\xe1\x28"
+			  "\x79\xbc\x08\x4e\x97\x00\x98\x63"
+			  "\x07\x1c\x4e\xd1\x68\xf3\xb3\x81"
+			  "\xa8\xa6\x5f\xf1\x01\xc9\xc1\xaf"
+			  "\x3a\x96\xf9\x9d\xb5\x5a\x5f\x8f"
+			  "\x7e\xc1\x7e\x77\x0a\x40\xc8\x8e"
+			  "\xfc\x0e\xed\xe1\x0d\xb0\xe5\x5e"
+			  "\x5e\x6f\xf5\x7f\xab\x33\x7d\xcd"
+			  "\xf0\x09\x4b\xb2\x11\x37\xdc\x65"
+			  "\x97\x32\x62\x71\x3a\x29\x54\xb9"
+			  "\xc7\xa4\xbf\x75\x0f\xf9\x40\xa9"
+			  "\x8d\xd7\x8b\xa7\xe0\x9a\xbe\x15"
+			  "\xc6\xda\xd8\x00\x14\x69\x1a\xaf"
+			  "\x5f\x79\xc3\xf5\xbb\x6c\x2a\x9d"
+			  "\xdd\x3c\x5f\x97\x21\xe1\x3a\x03"
+			  "\x84\x6a\xe9\x76\x11\x1f\xd3\xd5"
+			  "\xf0\x54\x20\x4d\xc2\x91\xc3\xa4"
+			  "\x36\x25\xbe\x1b\x2a\x06\xb7\xf3"
+			  "\xd1\xd0\x55\x29\x81\x4c\x83\xa3"
+			  "\xa6\x84\x1e\x5c\xd1\xd0\x6c\x90"
+			  "\xa4\x11\xf0\xd7\x63\x6a\x48\x05"
+			  "\xbc\x48\x18\x53\xcd\xb0\x8d\xdb"
+			  "\xdc\xfe\x55\x11\x5c\x51\xb3\xab"
+			  "\xab\x63\x3e\x31\x5a\x8b\x93\x63"
+			  "\x34\xa9\xba\x2b\x69\x1a\xc0\xe3"
+			  "\xcb\x41\xbc\xd7\xf5\x7f\x82\x3e"
+			  "\x01\xa3\x3c\x72\xf4\xfe\xdf\xbe"
+			  "\xb1\x67\x17\x2b\x37\x60\x0d\xca"
+			  "\x6f\xc3\x94\x2c\xd2\x92\x6d\x9d"
+			  "\x75\x18\x77\xaa\x29\x38\x96\xed"
+			  "\x0e\x20\x70\x92\xd5\xd0\xb4\x00"
+			  "\xc0\x31\xf2\xc9\x43\x0e\x75\x1d"
+			  "\x4b\x64\xf2\x1f\xf2\x29\x6c\x7b"
+			  "\x7f\xec\x59\x7d\x8c\x0d\xd4\xd3"
+			  "\xac\x53\x4c\xa3\xde\x42\x92\x95"
+			  "\x6d\xa3\x4f\xd0\xe6\x3d\xe7\xec"
+			  "\x7a\x4d\x68\xf1\xfe\x67\x66\x09"
+			  "\x83\x22\xb1\x98\x43\x8c\xab\xb8"
+			  "\x45\xe6\x6d\xdf\x5e\x50\x71\xce"
+			  "\xf5\x4e\x40\x93\x2b\xfa\x86\x0e"
+			  "\xe8\x30\xbd\x82\xcc\x1c\x9c\x5f"
+			  "\xad\xfd\x08\x31\xbe\x52\xe7\xe6"
+			  "\xf2\x06\x01\x62\x25\x15\x99\x74"
+			  "\x33\x51\x52\x57\x3f\x57\x87\x61"
+			  "\xb9\x7f\x29\x3d\xcd\x92\x5e\xa6"
+			  "\x5c\x3b\xf1\xed\x5f\xeb\x82\xed"
+			  "\x56\x7b\x61\xe7\xfd\x02\x47\x0e"
+			  "\x2a\x15\xa4\xce\x43\x86\x9b\xe1"
+			  "\x2b\x4c\x2a\xd9\x42\x97\xf7\x9a"
+			  "\xe5\x47\x46\x48\xd3\x55\x6f\x4d"
+			  "\xd9\xeb\x4b\xdd\x7b\x21\x2f\xb3"
+			  "\xa8\x36\x28\xdf\xca\xf1\xf6\xd9"
+			  "\x10\xf6\x1c\xfd\x2e\x0c\x27\xe0"
+			  "\x01\xb3\xff\x6d\x47\x08\x4d\xd4"
+			  "\x00\x25\xee\x55\x4a\xe9\xe8\x5b"
+			  "\xd8\xf7\x56\x12\xd4\x50\xb2\xe5"
+			  "\x51\x6f\x34\x63\x69\xd2\x4e\x96"
+			  "\x4e\xbc\x79\xbf\x18\xae\xc6\x13"
+			  "\x80\x92\x77\xb0\xb4\x0f\x29\x94"
+			  "\x6f\x4c\xbb\x53\x11\x36\xc3\x9f"
+			  "\x42\x8e\x96\x8a\x91\xc8\xe9\xfc"
+			  "\xfe\xbf\x7c\x2d\x6f\xf9\xb8\x44"
+			  "\x89\x1b\x09\x53\x0a\x2a\x92\xc3"
+			  "\x54\x7a\x3a\xf9\xe2\xe4\x75\x87"
+			  "\xa0\x5e\x4b\x03\x7a\x0d\x8a\xf4"
+			  "\x55\x59\x94\x2b\x63\x96\x0e\xf5",
+		.psize	= 1040,
+		.digest	= "\xb5\xb9\x08\xb3\x24\x3e\x03\xf0"
+			  "\xd6\x0b\x57\xbc\x0a\x6d\x89\x59",
+	}, {
+		.key	= "\xf6\x34\x42\x71\x35\x52\x8b\x58"
+			  "\x02\x3a\x8e\x4a\x8d\x41\x13\xe9"
+			  "\x7f\xba\xb9\x55\x9d\x73\x4d\xf8"
+			  "\x3f\x5d\x73\x15\xff\xd3\x9e\x7f"
+			  "\x20\x2a\x6a\xa8\xd1\xf0\x8f\x12"
+			  "\x6b\x02\xd8\x6c\xde\xba\x80\x22"
+			  "\x19\x37\xc8\xd0\x4e\x89\x17\x7c"
+			  "\x7c\xdd\x88\xfd\x41\xc0\x04\xb7"
+			  "\x1d\xac\x19\xe3\x20\xc7\x16\xcf"
+			  "\x58\xee\x1d\x7a\x61\x69\xa9\x12"
+			  "\x4b\xef\x4f\xb6\x38\xdd\x78\xf8"
+			  "\x28\xee\x70\x08\xc7\x7c\xcc\xc8"
+			  "\x1e\x41\xf5\x80\x86\x70\xd0\xf0"
+			  "\xa3\x87\x6b\x0a\x00\xd2\x41\x28"
+			  "\x74\x26\xf1\x24\xf3\xd0\x28\x77"
+			  "\xd7\xcd\xf6\x2d\x61\xf4\xa2\x13"
+			  "\x77\xb4\x6f\xa0\xf4\xfb\xd6\xb5"
+			  "\x38\x9d\x5a\x0c\x51\xaf\xad\x63"
+			  "\x27\x67\x8c\x01\xea\x42\x1a\x66"
+			  "\xda\x16\x7c\x3c\x30\x0c\x66\x53"
+			  "\x1c\x88\xa4\x5c\xb2\xe3\x78\x0a"
+			  "\x13\x05\x6d\xe2\xaf\xb3\xe4\x75"
+			  "\x00\x99\x58\xee\x76\x09\x64\xaa"
+			  "\xbb\x2e\xb1\x81\xec\xd8\x0e\xd3"
+			  "\x0c\x33\x5d\xb7\x98\xef\x36\xb6"
+			  "\xd2\x65\x69\x41\x70\x12\xdc\x25"
+			  "\x41\x03\x99\x81\x41\x19\x62\x13"
+			  "\xd1\x0a\x29\xc5\x8c\xe0\x4c\xf3"
+			  "\xd6\xef\x4c\xf4\x1d\x83\x2e\x6d"
+			  "\x8e\x14\x87\xed\x80\xe0\xaa\xd3"
+			  "\x08\x04\x73\x1a\x84\x40\xf5\x64"
+			  "\xbd\x61\x32\x65\x40\x42\xfb\xb0"
+			  "\x40\xf6\x40\x8d\xc7\x7f\x14\xd0"
+			  "\x83\x99\xaa\x36\x7e\x60\xc6\xbf"
+			  "\x13\x8a\xf9\x21\xe4\x7e\x68\x87"
+			  "\xf3\x33\x86\xb4\xe0\x23\x7e\x0a"
+			  "\x21\xb1\xf5\xad\x67\x3c\x9c\x9d"
+			  "\x09\xab\xaf\x5f\xba\xe0\xd0\x82"
+			  "\x48\x22\x70\xb5\x6d\x53\xd6\x0e"
+			  "\xde\x64\x92\x41\xb0\xd3\xfb\xda"
+			  "\x21\xfe\xab\xea\x20\xc4\x03\x58"
+			  "\x18\x2e\x7d\x2f\x03\xa9\x47\x66"
+			  "\xdf\x7b\xa4\x6b\x34\x6b\x55\x9c"
+			  "\x4f\xd7\x9c\x47\xfb\xa9\x42\xec"
+			  "\x5a\x12\xfd\xfe\x76\xa0\x92\x9d"
+			  "\xfe\x1e\x16\xdd\x24\x2a\xe4\x27"
+			  "\xd5\xa9\xf2\x05\x4f\x83\xa2\xaf"
+			  "\xfe\xee\x83\x7a\xad\xde\xdf\x9a"
+			  "\x80\xd5\x81\x14\x93\x16\x7e\x46"
+			  "\x47\xc2\x14\xef\x49\x6e\xb9\xdb"
+			  "\x40\xe8\x06\x6f\x9c\x2a\xfd\x62"
+			  "\x06\x46\xfd\x15\x1d\x36\x61\x6f"
+			  "\x77\x77\x5e\x64\xce\x78\x1b\x85"
+			  "\xbf\x50\x9a\xfd\x67\xa6\x1a\x65"
+			  "\xad\x5b\x33\x30\xf1\x71\xaa\xd9"
+			  "\x23\x0d\x92\x24\x5f\xae\x57\xb0"
+			  "\x24\x37\x0a\x94\x12\xfb\xb5\xb1"
+			  "\xd3\xb8\x1d\x12\x29\xb0\x80\x24"
+			  "\x2d\x47\x9f\x96\x1f\x95\xf1\xb1"
+			  "\xda\x35\xf6\x29\xe0\xe1\x23\x96"
+			  "\xc7\xe8\x22\x9b\x7c\xac\xf9\x41"
+			  "\x39\x01\xe5\x73\x15\x5e\x99\xec"
+			  "\xb4\xc1\xf4\xe7\xa7\x97\x6a\xd5"
+			  "\x90\x9a\xa0\x1d\xf3\x5a\x8b\x5f"
+			  "\xdf\x01\x52\xa4\x93\x31\x97\xb0"
+			  "\x93\x24\xb5\xbc\xb2\x14\x24\x98"
+			  "\x4a\x8f\x19\x85\xc3\x2d\x0f\x74"
+			  "\x9d\x16\x13\x80\x5e\x59\x62\x62"
+			  "\x25\xe0\xd1\x2f\x64\xef\xba\xac"
+			  "\xcd\x09\x07\x15\x8a\xcf\x73\xb5"
+			  "\x8b\xc9\xd8\x24\xb0\x53\xd5\x6f"
+			  "\xe1\x2b\x77\xb1\xc5\xe4\xa7\x0e"
+			  "\x18\x45\xab\x36\x03\x59\xa8\xbd"
+			  "\x43\xf0\xd8\x2c\x1a\x69\x96\xbb"
+			  "\x13\xdf\x6c\x33\x77\xdf\x25\x34"
+			  "\x5b\xa5\x5b\x8c\xf9\x51\x05\xd4"
+			  "\x8b\x8b\x44\x87\x49\xfc\xa0\x8f"
+			  "\x45\x15\x5b\x40\x42\xc4\x09\x92"
+			  "\x98\x0c\x4d\xf4\x26\x37\x1b\x13"
+			  "\x76\x01\x93\x8d\x4f\xe6\xed\x18"
+			  "\xd0\x79\x7b\x3f\x44\x50\xcb\xee"
+			  "\xf7\x4a\xc9\x9e\xe0\x96\x74\xa7"
+			  "\xe6\x93\xb2\x53\xca\x55\xa8\xdc"
+			  "\x1e\x68\x07\x87\xb7\x2e\xc1\x08"
+			  "\xb2\xa4\x5b\xaf\xc6\xdb\x5c\x66"
+			  "\x41\x1c\x51\xd9\xb0\x07\x00\x0d"
+			  "\xf0\x4c\xdc\x93\xde\xa9\x1e\x8e"
+			  "\xd3\x22\x62\xd8\x8b\x88\x2c\xea"
+			  "\x5e\xf1\x6e\x14\x40\xc7\xbe\xaa"
+			  "\x42\x28\xd0\x26\x30\x78\x01\x9b"
+			  "\x83\x07\xbc\x94\xc7\x57\xa2\x9f"
+			  "\x03\x07\xff\x16\xff\x3c\x6e\x48"
+			  "\x0a\xd0\xdd\x4c\xf6\x64\x9a\xf1"
+			  "\xcd\x30\x12\x82\x2c\x38\xd3\x26"
+			  "\x83\xdb\xab\x3e\xc6\xf8\xe6\xfa"
+			  "\x77\x0a\x78\x82\x75\xf8\x63\x51"
+			  "\x59\xd0\x8d\x24\x9f\x25\xe6\xa3"
+			  "\x4c\xbc\x34\xfc\xe3\x10\xc7\x62"
+			  "\xd4\x23\xc8\x3d\xa7\xc6\xa6\x0a"
+			  "\x4f\x7e\x29\x9d\x6d\xbe\xb5\xf1"
+			  "\xdf\xa4\x53\xfa\xc0\x23\x0f\x37"
+			  "\x84\x68\xd0\xb5\xc8\xc6\xae\xf8"
+			  "\xb7\x8d\xb3\x16\xfe\x8f\x87\xad"
+			  "\xd0\xc1\x08\xee\x12\x1c\x9b\x1d"
+			  "\x90\xf8\xd1\x63\xa4\x92\x3c\xf0"
+			  "\xc7\x34\xd8\xf1\x14\xed\xa3\xbc"
+			  "\x17\x7e\xd4\x62\x42\x54\x57\x2c"
+			  "\x3e\x7a\x35\x35\x17\x0f\x0b\x7f"
+			  "\x81\xa1\x3f\xd0\xcd\xc8\x3b\x96"
+			  "\xe9\xe0\x4a\x04\xe1\xb6\x3c\xa1"
+			  "\xd6\xca\xc4\xbd\xb6\xb5\x95\x34"
+			  "\x12\x9d\xc5\x96\xf2\xdf\xba\x54"
+			  "\x76\xd1\xb2\x6b\x3b\x39\xe0\xb9"
+			  "\x18\x62\xfb\xf7\xfc\x12\xf1\x5f"
+			  "\x7e\xc7\xe3\x59\x4c\xa6\xc2\x3d"
+			  "\x40\x15\xf9\xa3\x95\x64\x4c\x74"
+			  "\x8b\x73\x77\x33\x07\xa7\x04\x1d"
+			  "\x33\x5a\x7e\x8f\xbd\x86\x01\x4f"
+			  "\x3e\xb9\x27\x6f\xe2\x41\xf7\x09"
+			  "\x67\xfd\x29\x28\xc5\xe4\xf6\x18"
+			  "\x4c\x1b\x49\xb2\x9c\x5b\xf6\x81"
+			  "\x4f\xbb\x5c\xcc\x0b\xdf\x84\x23"
+			  "\x58\xd6\x28\x34\x93\x3a\x25\x97"
+			  "\xdf\xb2\xc3\x9e\x97\x38\x0b\x7d"
+			  "\x10\xb3\x54\x35\x23\x8c\x64\xee"
+			  "\xf0\xd8\x66\xff\x8b\x22\xd2\x5b"
+			  "\x05\x16\x3c\x89\xf7\xb1\x75\xaf"
+			  "\xc0\xae\x6a\x4f\x3f\xaf\x9a\xf4"
+			  "\xf4\x9a\x24\xd9\x80\x82\xc0\x12"
+			  "\xde\x96\xd1\xbe\x15\x0b\x8d\x6a"
+			  "\xd7\x12\xe4\x85\x9f\x83\xc9\xc3"
+			  "\xff\x0b\xb5\xaf\x3b\xd8\x6d\x67"
+			  "\x81\x45\xe6\xac\xec\xc1\x7b\x16"
+			  "\x18\x0a\xce\x4b\xc0\x2e\x76\xbc"
+			  "\x1b\xfa\xb4\x34\xb8\xfc\x3e\xc8"
+			  "\x5d\x90\x71\x6d\x7a\x79\xef\x06",
+		.ksize	= 1088,
+		.plaintext	= "\xaa\x5d\x54\xcb\xea\x1e\x46\x0f"
+			  "\x45\x87\x70\x51\x8a\x66\x7a\x33"
+			  "\xb4\x18\xff\xa9\x82\xf9\x45\x4b"
+			  "\x93\xae\x2e\x7f\xab\x98\xfe\xbf"
+			  "\x01\xee\xe5\xa0\x37\x8f\x57\xa6"
+			  "\xb0\x76\x0d\xa4\xd6\x28\x2b\x5d"
+			  "\xe1\x03\xd6\x1c\x6f\x34\x0d\xe7"
+			  "\x61\x2d\x2e\xe5\xae\x5d\x47\xc7"
+			  "\x80\x4b\x18\x8f\xa8\x99\xbc\x28"
+			  "\xed\x1d\x9d\x86\x7d\xd7\x41\xd1"
+			  "\xe0\x2b\xe1\x8c\x93\x2a\xa7\x80"
+			  "\xe1\x07\xa0\xa9\x9f\x8c\x8d\x1a"
+			  "\x55\xfc\x6b\x24\x7a\xbd\x3e\x51"
+			  "\x68\x4b\x26\x59\xc8\xa7\x16\xd9"
+			  "\xb9\x61\x13\xde\x8b\x63\x1c\xf6"
+			  "\x60\x01\xfb\x08\xb3\x5b\x0a\xbf"
+			  "\x34\x73\xda\x87\x87\x3d\x6f\x97"
+			  "\x4a\x0c\xa3\x58\x20\xa2\xc0\x81"
+			  "\x5b\x8c\xef\xa9\xc2\x01\x1e\x64"
+			  "\x83\x8c\xbc\x03\xb6\xd0\x29\x9f"
+			  "\x54\xe2\xce\x8b\xc2\x07\x85\x78"
+			  "\x25\x38\x96\x4c\xb4\xbe\x17\x4a"
+			  "\x65\xa6\xfa\x52\x9d\x66\x9d\x65"
+			  "\x4a\xd1\x01\x01\xf0\xcb\x13\xcc"
+			  "\xa5\x82\xf3\xf2\x66\xcd\x3f\x9d"
+			  "\xd1\xaa\xe4\x67\xea\xf2\xad\x88"
+			  "\x56\x76\xa7\x9b\x59\x3c\xb1\x5d"
+			  "\x78\xfd\x69\x79\x74\x78\x43\x26"
+			  "\x7b\xde\x3f\xf1\xf5\x4e\x14\xd9"
+			  "\x15\xf5\x75\xb5\x2e\x19\xf3\x0c"
+			  "\x48\x72\xd6\x71\x6d\x03\x6e\xaa"
+			  "\xa7\x08\xf9\xaa\x70\xa3\x0f\x4d"
+			  "\x12\x8a\xdd\xe3\x39\x73\x7e\xa7"
+			  "\xea\x1f\x6d\x06\x26\x2a\xf2\xc5"
+			  "\x52\xb4\xbf\xfd\x52\x0c\x06\x60"
+			  "\x90\xd1\xb2\x7b\x56\xae\xac\x58"
+			  "\x5a\x6b\x50\x2a\xf5\xe0\x30\x3c"
+			  "\x2a\x98\x0f\x1b\x5b\x0a\x84\x6c"
+			  "\x31\xae\x92\xe2\xd4\xbb\x7f\x59"
+			  "\x26\x10\xb9\x89\x37\x68\x26\xbf"
+			  "\x41\xc8\x49\xc4\x70\x35\x7d\xff"
+			  "\x2d\x7f\xf6\x8a\x93\x68\x8c\x78"
+			  "\x0d\x53\xce\x7d\xff\x7d\xfb\xae"
+			  "\x13\x1b\x75\xc4\x78\xd7\x71\xd8"
+			  "\xea\xd3\xf4\x9d\x95\x64\x8e\xb4"
+			  "\xde\xb8\xe4\xa6\x68\xc8\xae\x73"
+			  "\x58\xaf\xa8\xb0\x5a\x20\xde\x87"
+			  "\x43\xb9\x0f\xe3\xad\x41\x4b\xd5"
+			  "\xb7\xad\x16\x00\xa6\xff\xf6\x74"
+			  "\xbf\x8c\x9f\xb3\x58\x1b\xb6\x55"
+			  "\xa9\x90\x56\x28\xf0\xb5\x13\x4e"
+			  "\x9e\xf7\x25\x86\xe0\x07\x7b\x98"
+			  "\xd8\x60\x5d\x38\x95\x3c\xe4\x22"
+			  "\x16\x2f\xb2\xa2\xaf\xe8\x90\x17"
+			  "\xec\x11\x83\x1a\xf4\xa9\x26\xda"
+			  "\x39\x72\xf5\x94\x61\x05\x51\xec"
+			  "\xa8\x30\x8b\x2c\x13\xd0\x72\xac"
+			  "\xb9\xd2\xa0\x4c\x4b\x78\xe8\x6e"
+			  "\x04\x85\xe9\x04\x49\x82\x91\xff"
+			  "\x89\xe5\xab\x4c\xaa\x37\x03\x12"
+			  "\xca\x8b\x74\x10\xfd\x9e\xd9\x7b"
+			  "\xcb\xdb\x82\x6e\xce\x2e\x33\x39"
+			  "\xce\xd2\x84\x6e\x34\x71\x51\x6e"
+			  "\x0d\xd6\x01\x87\xc7\xfa\x0a\xd3"
+			  "\xad\x36\xf3\x4c\x9f\x96\x5e\x62"
+			  "\x62\x54\xc3\x03\x78\xd6\xab\xdd"
+			  "\x89\x73\x55\x25\x30\xf8\xa7\xe6"
+			  "\x4f\x11\x0c\x7c\x0a\xa1\x2b\x7b"
+			  "\x3d\x0d\xde\x81\xd4\x9d\x0b\xae"
+			  "\xdf\x00\xf9\x4c\xb6\x90\x8e\x16"
+			  "\xcb\x11\xc8\xd1\x2e\x73\x13\x75"
+			  "\x75\x3e\xaa\xf5\xee\x02\xb3\x18"
+			  "\xa6\x2d\xf5\x3b\x51\xd1\x1f\x47"
+			  "\x6b\x2c\xdb\xc4\x10\xe0\xc8\xba"
+			  "\x9d\xac\xb1\x9d\x75\xd5\x41\x0e"
+			  "\x7e\xbe\x18\x5b\xa4\x1f\xf8\x22"
+			  "\x4c\xc1\x68\xda\x6d\x51\x34\x6c"
+			  "\x19\x59\xec\xb5\xb1\xec\xa7\x03"
+			  "\xca\x54\x99\x63\x05\x6c\xb1\xac"
+			  "\x9c\x31\xd6\xdb\xba\x7b\x14\x12"
+			  "\x7a\xc3\x2f\xbf\x8d\xdc\x37\x46"
+			  "\xdb\xd2\xbc\xd4\x2f\xab\x30\xd5"
+			  "\xed\x34\x99\x8e\x83\x3e\xbe\x4c"
+			  "\x86\x79\x58\xe0\x33\x8d\x9a\xb8"
+			  "\xa9\xa6\x90\x46\xa2\x02\xb8\xdd"
+			  "\xf5\xf9\x1a\x5c\x8c\x01\xaa\x6e"
+			  "\xb4\x22\x12\xf5\x0c\x1b\x9b\x7a"
+			  "\xc3\x80\xf3\x06\x00\x5f\x30\xd5"
+			  "\x06\xdb\x7d\x82\xc2\xd4\x0b\x4c"
+			  "\x5f\xe9\xc5\xf5\xdf\x97\x12\xbf"
+			  "\x56\xaf\x9b\x69\xcd\xee\x30\xb4"
+			  "\xa8\x71\xff\x3e\x7d\x73\x7a\xb4"
+			  "\x0d\xa5\x46\x7a\xf3\xf4\x15\x87"
+			  "\x5d\x93\x2b\x8c\x37\x64\xb5\xdd"
+			  "\x48\xd1\xe5\x8c\xae\xd4\xf1\x76"
+			  "\xda\xf4\xba\x9e\x25\x0e\xad\xa3"
+			  "\x0d\x08\x7c\xa8\x82\x16\x8d\x90"
+			  "\x56\x40\x16\x84\xe7\x22\x53\x3a"
+			  "\x58\xbc\xb9\x8f\x33\xc8\xc2\x84"
+			  "\x22\xe6\x0d\xe7\xb3\xdc\x5d\xdf"
+			  "\xd7\x2a\x36\xe4\x16\x06\x07\xd2"
+			  "\x97\x60\xb2\xf5\x5e\x14\xc9\xfd"
+			  "\x8b\x05\xd1\xce\xee\x9a\x65\x99"
+			  "\xb7\xae\x19\xb7\xc8\xbc\xd5\xa2"
+			  "\x7b\x95\xe1\xcc\xba\x0d\xdc\x8a"
+			  "\x1d\x59\x52\x50\xaa\x16\x02\x82"
+			  "\xdf\x61\x33\x2e\x44\xce\x49\xc7"
+			  "\xe5\xc6\x2e\x76\xcf\x80\x52\xf0"
+			  "\x3d\x17\x34\x47\x3f\xd3\x80\x48"
+			  "\xa2\xba\xd5\xc7\x7b\x02\x28\xdb"
+			  "\xac\x44\xc7\x6e\x05\x5c\xc2\x79"
+			  "\xb3\x7d\x6a\x47\x77\x66\xf1\x38"
+			  "\xf0\xf5\x4f\x27\x1a\x31\xca\x6c"
+			  "\x72\x95\x92\x8e\x3f\xb0\xec\x1d"
+			  "\xc7\x2a\xff\x73\xee\xdf\x55\x80"
+			  "\x93\xd2\xbd\x34\xd3\x9f\x00\x51"
+			  "\xfb\x2e\x41\xba\x6c\x5a\x7c\x17"
+			  "\x7f\xe6\x70\xac\x8d\x39\x3f\x77"
+			  "\xe2\x23\xac\x8f\x72\x4e\xe4\x53"
+			  "\xcc\xf1\x1b\xf1\x35\xfe\x52\xa4"
+			  "\xd6\xb8\x40\x6b\xc1\xfd\xa0\xa1"
+			  "\xf5\x46\x65\xc2\x50\xbb\x43\xe2"
+			  "\xd1\x43\x28\x34\x74\xf5\x87\xa0"
+			  "\xf2\x5e\x27\x3b\x59\x2b\x3e\x49"
+			  "\xdf\x46\xee\xaf\x71\xd7\x32\x36"
+			  "\xc7\x14\x0b\x58\x6e\x3e\x2d\x41"
+			  "\xfa\x75\x66\x3a\x54\xe0\xb2\xb9"
+			  "\xaf\xdd\x04\x80\x15\x19\x3f\x6f"
+			  "\xce\x12\xb4\xd8\xe8\x89\x3c\x05"
+			  "\x30\xeb\xf3\x3d\xcd\x27\xec\xdc"
+			  "\x56\x70\x12\xcf\x78\x2b\x77\xbf"
+			  "\x22\xf0\x1b\x17\x9c\xcc\xd6\x1b"
+			  "\x2d\x3d\xa0\x3b\xd8\xc9\x70\xa4"
+			  "\x7a\x3e\x07\xb9\x06\xc3\xfa\xb0"
+			  "\x33\xee\xc1\xd8\xf6\xe0\xf0\xb2"
+			  "\x61\x12\x69\xb0\x5f\x28\x99\xda"
+			  "\xc3\x61\x48\xfa\x07\x16\x03\xc4"
+			  "\xa8\xe1\x3c\xe8\x0e\x64\x15\x30"
+			  "\xc1\x9d\x84\x2f\x73\x98\x0e\x3a"
+			  "\xf2\x86\x21\xa4\x9e\x1d\xb5\x86"
+			  "\x16\xdb\x2b\x9a\x06\x64\x8e\x79"
+			  "\x8d\x76\x3e\xc3\xc2\x64\x44\xe3"
+			  "\xda\xbc\x1a\x52\xd7\x61\x03\x65"
+			  "\x54\x32\x77\x01\xed\x9d\x8a\x43"
+			  "\x25\x24\xe3\xc1\xbe\xb8\x2f\xcb"
+			  "\x89\x14\x64\xab\xf6\xa0\x6e\x02"
+			  "\x57\xe4\x7d\xa9\x4e\x9a\x03\x36"
+			  "\xad\xf1\xb1\xfc\x0b\xe6\x79\x51"
+			  "\x9f\x81\x77\xc4\x14\x78\x9d\xbf"
+			  "\xb6\xd6\xa3\x8c\xba\x0b\x26\xe7"
+			  "\xc8\xb9\x5c\xcc\xe1\x5f\xd5\xc6"
+			  "\xc4\xca\xc2\xa3\x45\xba\x94\x13"
+			  "\xb2\x8f\xc3\x54\x01\x09\xe7\x8b"
+			  "\xda\x2a\x0a\x11\x02\x43\xcb\x57"
+			  "\xc9\xcc\xb5\x5c\xab\xc4\xec\x54"
+			  "\x00\x06\x34\xe1\x6e\x03\x89\x7c"
+			  "\xc6\xfb\x6a\xc7\x60\x43\xd6\xc5"
+			  "\xb5\x68\x72\x89\x8f\x42\xc3\x74"
+			  "\xbd\x25\xaa\x9f\x67\xb5\xdf\x26"
+			  "\x20\xe8\xb7\x01\x3c\xe4\x77\xce"
+			  "\xc4\x65\xa7\x23\x79\xea\x33\xc7"
+			  "\x82\x14\x5c\x82\xf2\x4e\x3d\xf6"
+			  "\xc6\x4a\x0e\x29\xbb\xec\x44\xcd"
+			  "\x2f\xd1\x4f\x21\x71\xa9\xce\x0f"
+			  "\x5c\xf2\x72\x5c\x08\x2e\x21\xd2"
+			  "\xc3\x29\x13\xd8\xac\xc3\xda\x13"
+			  "\x1a\x9d\xa7\x71\x1d\x27\x1d\x27"
+			  "\x1d\xea\xab\x44\x79\xad\xe5\xeb"
+			  "\xef\x1f\x22\x0a\x44\x4f\xcb\x87"
+			  "\xa7\x58\x71\x0e\x66\xf8\x60\xbf"
+			  "\x60\x74\x4a\xb4\xec\x2e\xfe\xd3"
+			  "\xf5\xb8\xfe\x46\x08\x50\x99\x6c"
+			  "\x66\xa5\xa8\x34\x44\xb5\xe5\xf0"
+			  "\xdd\x2c\x67\x4e\x35\x96\x8e\x67"
+			  "\x48\x3f\x5f\x37\x44\x60\x51\x2e"
+			  "\x14\x91\x5e\x57\xc3\x0e\x79\x77"
+			  "\x2f\x03\xf4\xe2\x1c\x72\xbf\x85"
+			  "\x5d\xd3\x17\xdf\x6c\xc5\x70\x24"
+			  "\x42\xdf\x51\x4e\x2a\xb2\xd2\x5b"
+			  "\x9e\x69\x83\x41\x11\xfe\x73\x22"
+			  "\xde\x8a\x9e\xd8\x8a\xfb\x20\x38"
+			  "\xd8\x47\x6f\xd5\xed\x8f\x41\xfd"
+			  "\x13\x7a\x18\x03\x7d\x0f\xcd\x7d"
+			  "\xa6\x7d\x31\x9e\xf1\x8f\x30\xa3"
+			  "\x8b\x4c\x24\xb7\xf5\x48\xd7\xd9"
+			  "\x12\xe7\x84\x97\x5c\x31\x6d\xfb"
+			  "\xdf\xf3\xd3\xd1\xd5\x0c\x30\x06"
+			  "\x01\x6a\xbc\x6c\x78\x7b\xa6\x50"
+			  "\xfa\x0f\x3c\x42\x2d\xa5\xa3\x3b"
+			  "\xcf\x62\x50\xff\x71\x6d\xe7\xda"
+			  "\x27\xab\xc6\x67\x16\x65\x68\x64"
+			  "\xc7\xd5\x5f\x81\xa9\xf6\x65\xb3"
+			  "\x5e\x43\x91\x16\xcd\x3d\x55\x37"
+			  "\x55\xb3\xf0\x28\xc5\x54\x19\xc0"
+			  "\xe0\xd6\x2a\x61\xd4\xc8\x72\x51"
+			  "\xe9\xa1\x7b\x48\x21\xad\x44\x09"
+			  "\xe4\x01\x61\x3c\x8a\x5b\xf9\xa1"
+			  "\x6e\x1b\xdf\xc0\x04\xa8\x8b\xf2"
+			  "\x21\xbe\x34\x7b\xfc\xa1\xcd\xc9"
+			  "\xa9\x96\xf4\xa4\x4c\xf7\x4e\x8f"
+			  "\x84\xcc\xd3\xa8\x92\x77\x8f\x36"
+			  "\xe2\x2e\x8c\x33\xe8\x84\xa6\x0c"
+			  "\x6c\x8a\xda\x14\x32\xc2\x96\xff"
+			  "\xc6\x4a\xc2\x9b\x30\x7f\xd1\x29"
+			  "\xc0\xd5\x78\x41\x00\x80\x80\x03"
+			  "\x2a\xb1\xde\x26\x03\x48\x49\xee"
+			  "\x57\x14\x76\x51\x3c\x36\x5d\x0a"
+			  "\x5c\x9f\xe8\xd8\x53\xdb\x4f\xd4"
+			  "\x38\xbf\x66\xc9\x75\x12\x18\x75"
+			  "\x34\x2d\x93\x22\x96\x51\x24\x6e"
+			  "\x4e\xd9\x30\xea\x67\xff\x92\x1c"
+			  "\x16\x26\xe9\xb5\x33\xab\x8c\x22"
+			  "\x47\xdb\xa0\x2c\x08\xf0\x12\x69"
+			  "\x7e\x93\x52\xda\xa5\xe5\xca\xc1"
+			  "\x0f\x55\x2a\xbd\x09\x30\x88\x1b"
+			  "\x9c\xc6\x9f\xe6\xdb\xa6\x92\xeb"
+			  "\xf4\xbd\x5c\xc4\xdb\xc6\x71\x09"
+			  "\xab\x5e\x48\x0c\xed\x6f\xda\x8e"
+			  "\x8d\x0c\x98\x71\x7d\x10\xd0\x9c"
+			  "\x20\x9b\x79\x53\x26\x5d\xb9\x85"
+			  "\x8a\x31\xb8\xc5\x1c\x97\xde\x88"
+			  "\x61\x55\x7f\x7c\x21\x06\xea\xc4"
+			  "\x5f\xaf\xf2\xf0\xd5\x5e\x7d\xb4"
+			  "\x6e\xcf\xe9\xae\x1b\x0e\x11\x80"
+			  "\xc1\x9a\x74\x7e\x52\x6f\xa0\xb7"
+			  "\x24\xcd\x8d\x0a\x11\x40\x63\x72"
+			  "\xfa\xe2\xc5\xb3\x94\xef\x29\xa2"
+			  "\x1a\x23\x43\x04\x37\x55\x0d\xe9"
+			  "\x83\xb2\x29\x51\x49\x64\xa0\xbd"
+			  "\xde\x73\xfd\xa5\x7c\x95\x70\x62"
+			  "\x58\xdc\xe2\xd0\xbf\x98\xf5\x8a"
+			  "\x6a\xfd\xce\xa8\x0e\x42\x2a\xeb"
+			  "\xd2\xff\x83\x27\x53\x5c\xa0\x6e"
+			  "\x93\xef\xe2\xb9\x5d\x35\xd6\x98"
+			  "\xf6\x71\x19\x7a\x54\xa1\xa7\xe8"
+			  "\x09\xfe\xf6\x9e\xc7\xbd\x3e\x29"
+			  "\xbd\x6b\x17\xf4\xe7\x3e\x10\x5c"
+			  "\xc1\xd2\x59\x4f\x4b\x12\x1a\x5b"
+			  "\x50\x80\x59\xb9\xec\x13\x66\xa8"
+			  "\xd2\x31\x7b\x6a\x61\x22\xdd\x7d"
+			  "\x61\xee\x87\x16\x46\x9f\xf9\xc7"
+			  "\x41\xee\x74\xf8\xd0\x96\x2c\x76"
+			  "\x2a\xac\x7d\x6e\x9f\x0e\x7f\x95"
+			  "\xfe\x50\x16\xb2\x23\xca\x62\xd5"
+			  "\x68\xcf\x07\x3f\x3f\x97\x85\x2a"
+			  "\x0c\x25\x45\xba\xdb\x32\xcb\x83"
+			  "\x8c\x4f\xe0\x6d\x9a\x99\xf9\xc9"
+			  "\xda\xd4\x19\x31\xc1\x7c\x6d\xd9"
+			  "\x9c\x56\xd3\xec\xc1\x81\x4c\xed"
+			  "\x28\x9d\x87\xeb\x19\xd7\x1a\x4f"
+			  "\x04\x6a\xcb\x1f\xcf\x1f\xa2\x16"
+			  "\xfc\x2a\x0d\xa1\x14\x2d\xfa\xc5"
+			  "\x5a\xd2\xc5\xf9\x19\x7c\x20\x1f"
+			  "\x2d\x10\xc0\x66\x7c\xd9\x2d\xe5"
+			  "\x88\x70\x59\xa7\x85\xd5\x2e\x7c"
+			  "\x5c\xe3\xb7\x12\xd6\x97\x3f\x29",
+		.psize	= 2048,
+		.digest	= "\x37\x90\x92\xc2\xeb\x01\x87\xd9"
+			  "\x95\xc7\x91\xc3\x17\x8b\x38\x52",
+	}
+};
+
+
 /*
  * DES test vectors.
  */
diff --git a/include/crypto/nhpoly1305.h b/include/crypto/nhpoly1305.h
new file mode 100644
index 0000000000000..06bfb876a1563
--- /dev/null
+++ b/include/crypto/nhpoly1305.h
@@ -0,0 +1,74 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Common values and helper functions for the NHPoly1305 hash function.
+ */
+
+#ifndef _NHPOLY1305_H
+#define _NHPOLY1305_H
+
+#include <crypto/hash.h>
+#include <crypto/poly1305.h>
+
+/* NH parameterization: */
+
+/* Endianness: little */
+/* Word size: 32 bits (works well on NEON, SSE2, AVX2) */
+
+/* Stride: 2 words (optimal on ARM32 NEON; works okay on other CPUs too) */
+#define NH_PAIR_STRIDE		2
+#define NH_MESSAGE_UNIT		(NH_PAIR_STRIDE * 2 * sizeof(u32))
+
+/* Num passes (Toeplitz iteration count): 4, to give ε = 2^{-128} */
+#define NH_NUM_PASSES		4
+#define NH_HASH_BYTES		(NH_NUM_PASSES * sizeof(u64))
+
+/* Max message size: 1024 bytes (32x compression factor) */
+#define NH_NUM_STRIDES		64
+#define NH_MESSAGE_WORDS	(NH_PAIR_STRIDE * 2 * NH_NUM_STRIDES)
+#define NH_MESSAGE_BYTES	(NH_MESSAGE_WORDS * sizeof(u32))
+#define NH_KEY_WORDS		(NH_MESSAGE_WORDS + \
+				 NH_PAIR_STRIDE * 2 * (NH_NUM_PASSES - 1))
+#define NH_KEY_BYTES		(NH_KEY_WORDS * sizeof(u32))
+
+#define NHPOLY1305_KEY_SIZE	(POLY1305_BLOCK_SIZE + NH_KEY_BYTES)
+
+struct nhpoly1305_key {
+	struct poly1305_key poly_key;
+	u32 nh_key[NH_KEY_WORDS];
+};
+
+struct nhpoly1305_state {
+
+	/* Running total of polynomial evaluation */
+	struct poly1305_state poly_state;
+
+	/* Partial block buffer */
+	u8 buffer[NH_MESSAGE_UNIT];
+	unsigned int buflen;
+
+	/*
+	 * Number of bytes remaining until the current NH message reaches
+	 * NH_MESSAGE_BYTES.  When nonzero, 'nh_hash' holds the partial NH hash.
+	 */
+	unsigned int nh_remaining;
+
+	__le64 nh_hash[NH_NUM_PASSES];
+};
+
+typedef void (*nh_t)(const u32 *key, const u8 *src, size_t srclen,
+		     __le64 hash[NH_NUM_PASSES]);
+
+int crypto_nhpoly1305_setkey(struct crypto_shash *tfm,
+			     const u8 *key, unsigned int keylen);
+
+int crypto_nhpoly1305_init(struct shash_desc *desc);
+int crypto_nhpoly1305_update(struct shash_desc *desc,
+			     const u8 *src, unsigned int srclen);
+int crypto_nhpoly1305_update_helper(struct shash_desc *desc,
+				    const u8 *src, unsigned int srclen,
+				    nh_t nh_fn);
+int crypto_nhpoly1305_final(struct shash_desc *desc, u8 *dst);
+int crypto_nhpoly1305_final_helper(struct shash_desc *desc, u8 *dst,
+				   nh_t nh_fn);
+
+#endif /* _NHPOLY1305_H */
-- 
2.19.1.331.ge82ca0e54c-goog



* [RFC PATCH v2 10/12] crypto: arm/nhpoly1305 - add NEON-accelerated NHPoly1305
  2018-10-15 17:54 [RFC PATCH v2 00/12] crypto: Adiantum support Eric Biggers
                   ` (8 preceding siblings ...)
  2018-10-15 17:54 ` [RFC PATCH v2 09/12] crypto: nhpoly1305 - add NHPoly1305 support Eric Biggers
@ 2018-10-15 17:54 ` Eric Biggers
  2018-10-20  4:12   ` Ard Biesheuvel
  2018-10-15 17:54 ` [RFC PATCH v2 11/12] crypto: adiantum - add Adiantum support Eric Biggers
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 54+ messages in thread
From: Eric Biggers @ 2018-10-15 17:54 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-fscrypt, linux-arm-kernel, linux-kernel, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

From: Eric Biggers <ebiggers@google.com>

Add an ARM NEON implementation of NHPoly1305, an ε-almost-∆-universal
hash function used in the Adiantum encryption mode.  For now, only the
NH portion is NEON-accelerated; the Poly1305 part is less
performance-critical, so it is implemented in C.

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 arch/arm/crypto/Kconfig                |   5 ++
 arch/arm/crypto/Makefile               |   2 +
 arch/arm/crypto/nh-neon-core.S         | 116 +++++++++++++++++++++++++
 arch/arm/crypto/nhpoly1305-neon-glue.c |  78 +++++++++++++++++
 4 files changed, 201 insertions(+)
 create mode 100644 arch/arm/crypto/nh-neon-core.S
 create mode 100644 arch/arm/crypto/nhpoly1305-neon-glue.c
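
For readers following the assembly below, here is a minimal scalar sketch
of what each _nh_stride invocation appears to compute, written as portable
userspace C.  It is an illustration derived from reading the NEON code and
the parameters in include/crypto/nhpoly1305.h, not the generic kernel
implementation.  The names nh_ref() and load_le32() are hypothetical, the
value 16 for POLY1305_BLOCK_SIZE is an assumption (it is consistent with
the .ksize = 1088 test vectors), and the sums are returned in host byte
order rather than stored as little-endian bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Parameters copied from include/crypto/nhpoly1305.h (patch 09/12). */
#define NH_PAIR_STRIDE		2
#define NH_MESSAGE_UNIT		(NH_PAIR_STRIDE * 2 * sizeof(uint32_t))
#define NH_NUM_PASSES		4
#define NH_NUM_STRIDES		64
#define NH_MESSAGE_WORDS	(NH_PAIR_STRIDE * 2 * NH_NUM_STRIDES)
#define NH_MESSAGE_BYTES	(NH_MESSAGE_WORDS * sizeof(uint32_t))
#define NH_KEY_WORDS		(NH_MESSAGE_WORDS + \
				 NH_PAIR_STRIDE * 2 * (NH_NUM_PASSES - 1))
#define NH_KEY_BYTES		(NH_KEY_WORDS * sizeof(uint32_t))

/* Sanity-check the derived sizes at compile time. */
_Static_assert(NH_MESSAGE_UNIT == 16, "one stride covers 16 message bytes");
_Static_assert(NH_MESSAGE_BYTES == 1024, "max NH message is 1024 bytes");
_Static_assert(NH_KEY_BYTES == 1072, "268 key words of 4 bytes each");
/* Assumption: POLY1305_BLOCK_SIZE is 16, giving the 1088-byte ksize. */
_Static_assert(16 + NH_KEY_BYTES == 1088, "NHPoly1305 key size");

/* Hypothetical helper: read a 32-bit little-endian word. */
static inline uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical scalar NH.  For every 16-byte message stride, pass 'p'
 * uses the four key words starting 4*p words past the current key
 * position (the Toeplitz shift), adds them to the message words
 * modulo 2^32, and accumulates two 32x32 => 64 products.
 */
void nh_ref(const uint32_t *key, const uint8_t *msg, size_t len,
	    uint64_t sums[NH_NUM_PASSES])
{
	int pass, i;

	/* Mirrors the guarantee documented above nh_neon(). */
	assert(len % NH_MESSAGE_UNIT == 0);

	for (pass = 0; pass < NH_NUM_PASSES; pass++)
		sums[pass] = 0;

	while (len) {
		uint32_t m[4];

		for (i = 0; i < 4; i++)
			m[i] = load_le32(msg + 4 * i);

		for (pass = 0; pass < NH_NUM_PASSES; pass++) {
			const uint32_t *k = key + 4 * pass;

			/* Word 0 pairs with word 2, word 1 with word 3. */
			sums[pass] += (uint64_t)(uint32_t)(m[0] + k[0]) *
				      (uint32_t)(m[2] + k[2]);
			sums[pass] += (uint64_t)(uint32_t)(m[1] + k[1]) *
				      (uint32_t)(m[3] + k[3]);
		}
		key += NH_MESSAGE_UNIT / sizeof(key[0]);
		msg += NH_MESSAGE_UNIT;
		len -= NH_MESSAGE_UNIT;
	}
}
```

A caller would pass a key array of at least NH_KEY_WORDS words and a
message of at most NH_MESSAGE_BYTES bytes.  The pairing of word 0 with
word 2 (and 1 with 3) corresponds to multiplying the low half of each
q register by its high half with vmlal.u32, which is presumably why the
header calls a stride of 2 words a good fit for ARM32 NEON.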

diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index cc932d9bba561..458562a34aabe 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -122,4 +122,9 @@ config CRYPTO_CHACHA20_NEON
 	select CRYPTO_BLKCIPHER
 	select CRYPTO_CHACHA20
 
+config CRYPTO_NHPOLY1305_NEON
+	tristate "NEON accelerated NHPoly1305 hash function (for Adiantum)"
+	depends on KERNEL_MODE_NEON
+	select CRYPTO_NHPOLY1305
+
 endif
diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile
index 005482ff95047..b65d6bfab8e6b 100644
--- a/arch/arm/crypto/Makefile
+++ b/arch/arm/crypto/Makefile
@@ -10,6 +10,7 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o
 obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o
 obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o
 obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha-neon.o
+obj-$(CONFIG_CRYPTO_NHPOLY1305_NEON) += nhpoly1305-neon.o
 
 ce-obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o
 ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += sha1-arm-ce.o
@@ -53,6 +54,7 @@ ghash-arm-ce-y	:= ghash-ce-core.o ghash-ce-glue.o
 crct10dif-arm-ce-y	:= crct10dif-ce-core.o crct10dif-ce-glue.o
 crc32-arm-ce-y:= crc32-ce-core.o crc32-ce-glue.o
 chacha-neon-y := chacha-neon-core.o chacha-neon-glue.o
+nhpoly1305-neon-y := nh-neon-core.o nhpoly1305-neon-glue.o
 
 ifdef REGENERATE_ARM_CRYPTO
 quiet_cmd_perl = PERL    $@
diff --git a/arch/arm/crypto/nh-neon-core.S b/arch/arm/crypto/nh-neon-core.S
new file mode 100644
index 0000000000000..434d80ab531c2
--- /dev/null
+++ b/arch/arm/crypto/nh-neon-core.S
@@ -0,0 +1,116 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * NH - ε-almost-universal hash function, NEON accelerated version
+ *
+ * Copyright 2018 Google LLC
+ *
+ * Author: Eric Biggers <ebiggers@google.com>
+ */
+
+#include <linux/linkage.h>
+
+	.text
+	.fpu		neon
+
+	KEY		.req	r0
+	MESSAGE		.req	r1
+	MESSAGE_LEN	.req	r2
+	HASH		.req	r3
+
+	PASS0_SUMS	.req	q0
+	PASS0_SUM_A	.req	d0
+	PASS0_SUM_B	.req	d1
+	PASS1_SUMS	.req	q1
+	PASS1_SUM_A	.req	d2
+	PASS1_SUM_B	.req	d3
+	PASS2_SUMS	.req	q2
+	PASS2_SUM_A	.req	d4
+	PASS2_SUM_B	.req	d5
+	PASS3_SUMS	.req	q3
+	PASS3_SUM_A	.req	d6
+	PASS3_SUM_B	.req	d7
+	K0		.req	q4
+	K1		.req	q5
+	K2		.req	q6
+	K3		.req	q7
+	T0		.req	q8
+	T0_L		.req	d16
+	T0_H		.req	d17
+	T1		.req	q9
+	T1_L		.req	d18
+	T1_H		.req	d19
+	T2		.req	q10
+	T2_L		.req	d20
+	T2_H		.req	d21
+	T3		.req	q11
+	T3_L		.req	d22
+	T3_H		.req	d23
+
+.macro _nh_stride	k0, k1, k2, k3
+
+	// Load next message stride
+	vld1.8		{T3}, [MESSAGE]!
+
+	// Load next key stride
+	vld1.32		{\k3}, [KEY]!
+
+	// Add message words to key words
+	vadd.u32	T0, T3, \k0
+	vadd.u32	T1, T3, \k1
+	vadd.u32	T2, T3, \k2
+	vadd.u32	T3, T3, \k3
+
+	// Multiply 32x32 => 64 and accumulate
+	vmlal.u32	PASS0_SUMS, T0_L, T0_H
+	vmlal.u32	PASS1_SUMS, T1_L, T1_H
+	vmlal.u32	PASS2_SUMS, T2_L, T2_H
+	vmlal.u32	PASS3_SUMS, T3_L, T3_H
+.endm
+
+/*
+ * void nh_neon(const u32 *key, const u8 *message, size_t message_len,
+ *		u8 hash[NH_HASH_BYTES])
+ *
+ * It's guaranteed that message_len % 16 == 0.
+ */
+ENTRY(nh_neon)
+
+	vld1.32		{K0,K1}, [KEY]!
+	  vmov.u64	PASS0_SUMS, #0
+	  vmov.u64	PASS1_SUMS, #0
+	vld1.32		{K2}, [KEY]!
+	  vmov.u64	PASS2_SUMS, #0
+	  vmov.u64	PASS3_SUMS, #0
+
+	subs		MESSAGE_LEN, MESSAGE_LEN, #64
+	blt		.Lloop4_done
+.Lloop4:
+	_nh_stride	K0, K1, K2, K3
+	_nh_stride	K1, K2, K3, K0
+	_nh_stride	K2, K3, K0, K1
+	_nh_stride	K3, K0, K1, K2
+	subs		MESSAGE_LEN, MESSAGE_LEN, #64
+	bge		.Lloop4
+
+.Lloop4_done:
+	ands		MESSAGE_LEN, MESSAGE_LEN, #63
+	beq		.Ldone
+	_nh_stride	K0, K1, K2, K3
+
+	subs		MESSAGE_LEN, MESSAGE_LEN, #16
+	beq		.Ldone
+	_nh_stride	K1, K2, K3, K0
+
+	subs		MESSAGE_LEN, MESSAGE_LEN, #16
+	beq		.Ldone
+	_nh_stride	K2, K3, K0, K1
+
+.Ldone:
+	// Sum the accumulators for each pass, then store the sums to 'hash'
+	vadd.u64	T0_L, PASS0_SUM_A, PASS0_SUM_B
+	vadd.u64	T0_H, PASS1_SUM_A, PASS1_SUM_B
+	vadd.u64	T1_L, PASS2_SUM_A, PASS2_SUM_B
+	vadd.u64	T1_H, PASS3_SUM_A, PASS3_SUM_B
+	vst1.8		{T0-T1}, [HASH]
+	bx		lr
+ENDPROC(nh_neon)
diff --git a/arch/arm/crypto/nhpoly1305-neon-glue.c b/arch/arm/crypto/nhpoly1305-neon-glue.c
new file mode 100644
index 0000000000000..df48a00f4c50f
--- /dev/null
+++ b/arch/arm/crypto/nhpoly1305-neon-glue.c
@@ -0,0 +1,78 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * NHPoly1305 - ε-almost-∆-universal hash function for Adiantum
+ * (NEON accelerated version)
+ *
+ * Copyright 2018 Google LLC
+ */
+
+#include <asm/neon.h>
+#include <asm/simd.h>
+#include <crypto/internal/hash.h>
+#include <crypto/nhpoly1305.h>
+#include <linux/module.h>
+
+asmlinkage void nh_neon(const u32 *key, const u8 *message, size_t message_len,
+			u8 hash[NH_HASH_BYTES]);
+
+static void _nh_neon(const u32 *key, const u8 *message, size_t message_len,
+		     __le64 hash[NH_NUM_PASSES])
+{
+	nh_neon(key, message, message_len, (u8 *)hash);
+}
+
+static int nhpoly1305_neon_update(struct shash_desc *desc,
+				  const u8 *src, unsigned int srclen)
+{
+	if (srclen < 64 || !may_use_simd())
+		return crypto_nhpoly1305_update(desc, src, srclen);
+
+	do {
+		unsigned int n = min_t(unsigned int, srclen, PAGE_SIZE);
+
+		kernel_neon_begin();
+		crypto_nhpoly1305_update_helper(desc, src, n, _nh_neon);
+		kernel_neon_end();
+		src += n;
+		srclen -= n;
+	} while (srclen);
+	return 0;
+}
+
+static struct shash_alg nhpoly1305_alg = {
+	.digestsize	= POLY1305_DIGEST_SIZE,
+	.init		= crypto_nhpoly1305_init,
+	.update		= nhpoly1305_neon_update,
+	.final		= crypto_nhpoly1305_final,
+	.setkey		= crypto_nhpoly1305_setkey,
+	.descsize	= sizeof(struct nhpoly1305_state),
+	.base		= {
+		.cra_name		= "nhpoly1305",
+		.cra_driver_name	= "nhpoly1305-neon",
+		.cra_priority		= 200,
+		.cra_ctxsize		= sizeof(struct nhpoly1305_key),
+		.cra_module		= THIS_MODULE,
+	},
+};
+
+static int __init nhpoly1305_mod_init(void)
+{
+	if (!(elf_hwcap & HWCAP_NEON))
+		return -ENODEV;
+
+	return crypto_register_shash(&nhpoly1305_alg);
+}
+
+static void __exit nhpoly1305_mod_exit(void)
+{
+	crypto_unregister_shash(&nhpoly1305_alg);
+}
+
+module_init(nhpoly1305_mod_init);
+module_exit(nhpoly1305_mod_exit);
+
+MODULE_DESCRIPTION("NHPoly1305 ε-almost-∆-universal hash function (NEON-accelerated)");
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
+MODULE_ALIAS_CRYPTO("nhpoly1305");
+MODULE_ALIAS_CRYPTO("nhpoly1305-neon");
-- 
2.19.1.331.ge82ca0e54c-goog



* [RFC PATCH v2 11/12] crypto: adiantum - add Adiantum support
  2018-10-15 17:54 [RFC PATCH v2 00/12] crypto: Adiantum support Eric Biggers
                   ` (9 preceding siblings ...)
  2018-10-15 17:54 ` [RFC PATCH v2 10/12] crypto: arm/nhpoly1305 - add NEON-accelerated NHPoly1305 Eric Biggers
@ 2018-10-15 17:54 ` Eric Biggers
  2018-10-20  4:17   ` Ard Biesheuvel
  2018-10-15 17:54 ` [RFC PATCH v2 12/12] fscrypt: " Eric Biggers
  2018-10-19 15:58 ` [RFC PATCH v2 00/12] crypto: " Jason A. Donenfeld
  12 siblings, 1 reply; 54+ messages in thread
From: Eric Biggers @ 2018-10-15 17:54 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-fscrypt, linux-arm-kernel, linux-kernel, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

From: Eric Biggers <ebiggers@google.com>

Add support for the Adiantum encryption mode.  Adiantum was designed by
Paul Crowley and is specified by our paper:

    Adiantum: length-preserving encryption for entry-level processors
    (https://eprint.iacr.org/2018/720.pdf)

See our paper for full details; this patch only provides an overview.

Adiantum is a tweakable, length-preserving encryption mode designed for
fast and secure disk encryption, especially on CPUs without dedicated
crypto instructions.  Adiantum encrypts each sector using the XChaCha12
stream cipher, two passes of an ε-almost-∆-universal (εA∆U) hash
function, and an invocation of the AES-256 block cipher on a single
16-byte block.  On CPUs without AES instructions, Adiantum is much
faster than AES-XTS; for example, on ARM Cortex-A7, on 4096-byte sectors
Adiantum encryption is about 4 times faster than AES-256-XTS encryption,
and decryption about 5 times faster.
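
The two hash passes are combined with the right-hand 16-byte block by
addition and subtraction in Z/(2^128 Z); see the "First hash step" and
"Second hash step" comments in crypto/adiantum.c.  The following is
only a userspace sketch of that arithmetic, assuming host byte order
rather than the kernel's le128 type; the names u128, u128_add and
u128_sub are hypothetical and are not part of this patch:

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's le128, in host byte order. */
struct u128 {
	uint64_t lo;	/* low 64 bits */
	uint64_t hi;	/* high 64 bits */
};

/* Return (a + b) mod 2^128. */
static struct u128 u128_add(struct u128 a, struct u128 b)
{
	struct u128 r;

	r.lo = a.lo + b.lo;
	r.hi = a.hi + b.hi + (r.lo < a.lo);	/* carry out of the low half */
	return r;
}

/* Return (a - b) mod 2^128. */
static struct u128 u128_sub(struct u128 a, struct u128 b)
{
	struct u128 r;

	r.lo = a.lo - b.lo;
	r.hi = a.hi - b.hi - (r.lo > a.lo);	/* borrow from the high half */
	return r;
}
```

Because subtraction undoes addition modulo 2^128, adding the hash in
the first step and subtracting it in the second is invertible, which
is what lets decryption retrace the encryption steps in reverse.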

Adiantum is a specialization of the more general HBSH construction.  Our
earlier proposal, HPolyC, was also an HBSH specialization, but it used a
different εA∆U hash function, one based on Poly1305 only.  Adiantum's
εA∆U hash function, which is based primarily on the "NH" hash function
like that used in UMAC (RFC4418), is about twice as fast as HPolyC's;
consequently, Adiantum is about 20% faster than HPolyC.
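
For readers who want to see what NH computes, here is a portable C
sketch written to mirror the NEON code in arch/arm/crypto/nh-neon-core.S
from this series: four passes, 16-byte message strides, and each pass
reading its key four words further along.  It is an illustration under
those assumptions, not the kernel's generic implementation; nh_ref and
load_le32 are hypothetical names, and the sums are returned in host
byte order rather than stored as little-endian bytes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NH_NUM_PASSES	4
#define NH_UNIT_BYTES	16	/* message bytes consumed per stride */

/* Read a 32-bit little-endian word from a possibly unaligned pointer. */
static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Portable NH sketch.  'key' must hold message_len / 4 + 12 words,
 * because pass i reads its key words 4 * i words ahead of pass 0.
 */
static void nh_ref(const uint32_t *key, const uint8_t *message,
		   size_t message_len, uint64_t sums[NH_NUM_PASSES])
{
	int i;

	assert(message_len % NH_UNIT_BYTES == 0);

	for (i = 0; i < NH_NUM_PASSES; i++)
		sums[i] = 0;

	for (; message_len; message_len -= NH_UNIT_BYTES) {
		uint32_t m0 = load_le32(message + 0);
		uint32_t m1 = load_le32(message + 4);
		uint32_t m2 = load_le32(message + 8);
		uint32_t m3 = load_le32(message + 12);

		for (i = 0; i < NH_NUM_PASSES; i++) {
			const uint32_t *k = key + 4 * i;

			/* Adds wrap mod 2^32; products are full 64-bit. */
			sums[i] += (uint64_t)(uint32_t)(m0 + k[0]) *
				   (uint32_t)(m2 + k[2]);
			sums[i] += (uint64_t)(uint32_t)(m1 + k[1]) *
				   (uint32_t)(m3 + k[3]);
		}
		key += 4;	/* advance one stride's worth of key words */
		message += NH_UNIT_BYTES;
	}
}
```

The only operations are 32-bit additions and 32x32 => 64-bit
multiply-accumulates, which is why NH maps well onto NEON's vadd.u32
and vmlal.u32 instructions.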

This speed comes with no loss of security: Adiantum is provably just as
secure as HPolyC, in fact slightly *more* secure.  Like HPolyC,
Adiantum's security is reducible to that of XChaCha12 and AES-256,
subject to a security bound.  XChaCha12 itself has a security reduction
to ChaCha12.  Therefore, one need not "trust" Adiantum; one need only
trust ChaCha12 and AES-256.  Note that the εA∆U hash function is only
used for its proven combinatorial properties, so it cannot be "broken".

Adiantum is also a true wide-block encryption mode, so flipping any
plaintext bit in the sector scrambles the entire ciphertext, and vice
versa.  No other such mode is available in the kernel currently; doing
the same with XTS scrambles only 16 bytes.  Adiantum also supports
arbitrary-length tweaks and naturally supports any length input >= 16
bytes without needing "ciphertext stealing".

For the stream cipher, Adiantum uses XChaCha12 rather than XChaCha20 in
order to make encryption feasible on the widest range of devices.
Although the 20-round variant is quite popular, the best known attacks
on ChaCha are on only 7 rounds, so ChaCha12 still has a substantial
security margin; in fact, larger than AES-256's.  12-round Salsa20 is
also the eSTREAM recommendation.  For the block cipher, Adiantum uses
AES-256, despite it having a lower security margin than XChaCha12 and
needing table lookups, due to AES's extensive adoption and analysis
making it the obvious first choice.  Nevertheless, for flexibility this
patch also permits the "adiantum" template to be instantiated with
XChaCha20 and/or with an alternate block cipher.

We need Adiantum support in the kernel for use in dm-crypt and fscrypt,
where currently the only other suitable options are block cipher modes
such as AES-XTS.  A big problem with this is that many low-end mobile
devices (e.g. Android Go phones sold primarily in developing countries,
as well as some smartwatches) still have CPUs that lack AES
instructions, e.g. ARM Cortex-A7.  Sadly, AES-XTS encryption is much too
slow to be viable on these devices.  We did find that some "lightweight"
block ciphers are fast enough, but these suffer from problems such as
not having much cryptanalysis or being too controversial.

The ChaCha stream cipher has excellent performance but is insecure to
use directly for disk encryption, since each sector's IV is reused each
time it is overwritten.  Even restricting the threat model to offline
attacks only isn't enough, since modern flash storage devices don't
guarantee that "overwrites" are really overwrites, due to wear-leveling.
Adiantum avoids this problem by constructing a
"tweakable super-pseudorandom permutation"; this is the strongest
possible security model for length-preserving encryption.
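
To make the IV-reuse problem concrete, the sketch below encrypts two
versions of the same "sector" with the same key and IV.  The keystream
generator is a deliberately trivial toy (not ChaCha, and not secure);
it stands in for any stream cipher whose keystream depends only on
(key, IV).  All names here are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toy keystream generator (NOT ChaCha, NOT secure): a stand-in for any
 * stream cipher whose keystream is a function of (key, iv) only.
 */
static void toy_keystream(uint32_t key, uint32_t iv, uint8_t *out,
			  size_t len)
{
	uint32_t x = key ^ (iv * 0x9e3779b9u);
	size_t i;

	for (i = 0; i < len; i++) {
		x = x * 1664525u + 1013904223u;
		out[i] = (uint8_t)(x >> 24);
	}
}

/* Encrypt (or decrypt) by XORing the data with the keystream. */
static void toy_stream_crypt(uint32_t key, uint32_t iv, const uint8_t *in,
			     uint8_t *out, size_t len)
{
	uint8_t ks[64];
	size_t i;

	assert(len <= sizeof(ks));
	toy_keystream(key, iv, ks, len);
	for (i = 0; i < len; i++)
		out[i] = in[i] ^ ks[i];
}
```

If a sector is written twice under the same (key, IV), the keystream
cancels out: the XOR of the two ciphertexts equals the XOR of the two
plaintexts, so an attacker who sees both versions learns that
difference without knowing the key.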

Of course, storing random nonces along with the ciphertext would be the
ideal solution.  But doing that with existing hardware and filesystems
runs into major practical problems; in most cases it would require data
journaling (like dm-integrity) which severely degrades performance.
Thus, for now length-preserving encryption is still needed.

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 crypto/Kconfig    |  23 ++
 crypto/Makefile   |   1 +
 crypto/adiantum.c | 648 ++++++++++++++++++++++++++++++++++++++++++++++
 crypto/testmgr.c  |  12 +
 crypto/testmgr.h  | 461 +++++++++++++++++++++++++++++++++
 5 files changed, 1145 insertions(+)
 create mode 100644 crypto/adiantum.c

diff --git a/crypto/Kconfig b/crypto/Kconfig
index 431beca903623..d60a8575049c0 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -498,6 +498,29 @@ config CRYPTO_NHPOLY1305
 	select CRYPTO_HASH
 	select CRYPTO_POLY1305
 
+config CRYPTO_ADIANTUM
+	tristate "Adiantum support"
+	select CRYPTO_CHACHA20
+	select CRYPTO_POLY1305
+	select CRYPTO_NHPOLY1305
+	help
+	  Adiantum is a tweakable, length-preserving encryption mode
+	  designed for fast and secure disk encryption, especially on
+	  CPUs without dedicated crypto instructions.  It encrypts
+	  each sector using the XChaCha12 stream cipher, two passes of
+	  an ε-almost-∆-universal hash function, and an invocation of
+	  the AES-256 block cipher on a single 16-byte block.  On CPUs
+	  without AES instructions, Adiantum is much faster than
+	  AES-XTS.
+
+	  Adiantum's security is provably reducible to that of its
+	  underlying stream and block ciphers, subject to a security
+	  bound.  Unlike XTS, Adiantum is a true wide-block encryption
+	  mode, so it actually provides an even stronger notion of
+	  security than XTS, subject to the security bound.
+
+	  If unsure, say N.
+
 comment "Hash modes"
 
 config CRYPTO_CMAC
diff --git a/crypto/Makefile b/crypto/Makefile
index 87b86f221a2a2..1c66475593af3 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -84,6 +84,7 @@ obj-$(CONFIG_CRYPTO_LRW) += lrw.o
 obj-$(CONFIG_CRYPTO_XTS) += xts.o
 obj-$(CONFIG_CRYPTO_CTR) += ctr.o
 obj-$(CONFIG_CRYPTO_KEYWRAP) += keywrap.o
+obj-$(CONFIG_CRYPTO_ADIANTUM) += adiantum.o
 obj-$(CONFIG_CRYPTO_NHPOLY1305) += nhpoly1305.o
 obj-$(CONFIG_CRYPTO_GCM) += gcm.o
 obj-$(CONFIG_CRYPTO_CCM) += ccm.o
diff --git a/crypto/adiantum.c b/crypto/adiantum.c
new file mode 100644
index 0000000000000..b5738ea2f98f5
--- /dev/null
+++ b/crypto/adiantum.c
@@ -0,0 +1,648 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Adiantum length-preserving encryption mode
+ *
+ * Copyright 2018 Google LLC
+ */
+
+/*
+ * Adiantum is a tweakable, length-preserving encryption mode designed for fast
+ * and secure disk encryption, especially on CPUs without dedicated crypto
+ * instructions.  Adiantum encrypts each sector using the XChaCha12 stream
+ * cipher, two passes of an ε-almost-∆-universal (εA∆U) hash function based on
+ * NH and Poly1305, and an invocation of the AES-256 block cipher on a single
+ * 16-byte block.  See the paper for details:
+ *
+ *	Adiantum: length-preserving encryption for entry-level processors
+ *      (https://eprint.iacr.org/2018/720.pdf)
+ *
+ * For flexibility, this implementation also allows other ciphers:
+ *
+ *	- Stream cipher: XChaCha12 or XChaCha20
+ *	- Block cipher: any with a 128-bit block size and 256-bit key
+ *
+ * This implementation doesn't currently allow other εA∆U hash functions, i.e.
+ * HPolyC is not supported.  This is because Adiantum is ~20% faster than HPolyC
+ * but still provably as secure, and also the εA∆U hash function of HBSH is
+ * formally defined to take two inputs (tweak, message) which makes it difficult
+ * to wrap with the crypto_shash API.  Rather, some details need to be handled
+ * here.  Nevertheless, if needed in the future, support for other εA∆U hash
+ * functions could be added here.
+ */
+
+#include <crypto/b128ops.h>
+#include <crypto/chacha.h>
+#include <crypto/internal/hash.h>
+#include <crypto/internal/skcipher.h>
+#include <crypto/nhpoly1305.h>
+#include <crypto/scatterwalk.h>
+#include <linux/module.h>
+
+#include "internal.h"
+
+/*
+ * Size of the right-hand block of input data, in bytes; also the block
+ * cipher's block size and the size of the hash function's output.
+ */
+#define BLOCKCIPHER_BLOCK_SIZE		16
+
+/* Size of the block cipher key (K_E) in bytes */
+#define BLOCKCIPHER_KEY_SIZE		32
+
+/* Size of the hash key (K_H) in bytes */
+#define HASH_KEY_SIZE		(POLY1305_BLOCK_SIZE + NHPOLY1305_KEY_SIZE)
+
+/*
+ * The specification allows variable-length tweaks, but Linux's crypto API
+ * currently only allows algorithms to support a single length.  The "natural"
+ * tweak length for Adiantum is 16, since that fits into one Poly1305 block for
+ * the best performance.  But longer tweaks are useful for fscrypt, to avoid
+ * needing to derive per-file keys.  So instead we use two blocks, or 32 bytes.
+ */
+#define TWEAK_SIZE		32
+
+struct adiantum_instance_ctx {
+	struct crypto_skcipher_spawn streamcipher_spawn;
+	struct crypto_spawn blockcipher_spawn;
+	struct crypto_shash_spawn hash_spawn;
+};
+
+struct adiantum_tfm_ctx {
+	struct crypto_skcipher *streamcipher;
+	struct crypto_cipher *blockcipher;
+	struct crypto_shash *hash;
+	struct poly1305_key header_hash_key;
+};
+
+struct adiantum_request_ctx {
+
+	/*
+	 * Buffer for right-hand block of data, i.e.
+	 *
+	 *    P_R => P_M => C_M => C_R when encrypting, or
+	 *    C_R => C_M => P_M => P_R when decrypting.
+	 *
+	 * Also used to build the IV for the stream cipher.
+	 */
+	union {
+		u8 bytes[XCHACHA_IV_SIZE];
+		__le32 words[XCHACHA_IV_SIZE / sizeof(__le32)];
+		le128 bignum;	/* interpret as element of Z/(2^{128}Z) */
+	} rbuf;
+
+	bool enc; /* true if encrypting, false if decrypting */
+
+	/*
+	 * The result of the Poly1305 εA∆U hash function applied to
+	 * (message length, tweak).
+	 */
+	le128 header_hash;
+
+	/* Sub-requests, must be last */
+	union {
+		struct shash_desc hash_desc;
+		struct skcipher_request streamcipher_req;
+	} u;
+};
+
+/*
+ * Given the XChaCha stream key K_S, derive the block cipher key K_E and the
+ * hash key K_H as follows:
+ *
+ *     K_E || K_H || ... = XChaCha(key=K_S, nonce=1||0^191)
+ *
+ * Note that this denotes using bits from the XChaCha keystream, which here we
+ * get indirectly by encrypting a buffer containing all 0's.
+ */
+static int adiantum_setkey(struct crypto_skcipher *tfm, const u8 *key,
+			   unsigned int keylen)
+{
+	struct adiantum_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
+	struct {
+		u8 iv[XCHACHA_IV_SIZE];
+		u8 derived_keys[BLOCKCIPHER_KEY_SIZE + HASH_KEY_SIZE];
+		struct scatterlist sg;
+		struct crypto_wait wait;
+		struct skcipher_request req; /* must be last */
+	} *data;
+	u8 *keyp;
+	int err;
+
+	/* Set the stream cipher key (K_S) */
+	crypto_skcipher_clear_flags(tctx->streamcipher, CRYPTO_TFM_REQ_MASK);
+	crypto_skcipher_set_flags(tctx->streamcipher,
+				  crypto_skcipher_get_flags(tfm) &
+				  CRYPTO_TFM_REQ_MASK);
+	err = crypto_skcipher_setkey(tctx->streamcipher, key, keylen);
+	crypto_skcipher_set_flags(tfm,
+				crypto_skcipher_get_flags(tctx->streamcipher) &
+				CRYPTO_TFM_RES_MASK);
+	if (err)
+		return err;
+
+	/* Derive the subkeys */
+	data = kzalloc(sizeof(*data) +
+		       crypto_skcipher_reqsize(tctx->streamcipher), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+	data->iv[0] = 1;
+	sg_init_one(&data->sg, data->derived_keys, sizeof(data->derived_keys));
+	crypto_init_wait(&data->wait);
+	skcipher_request_set_tfm(&data->req, tctx->streamcipher);
+	skcipher_request_set_callback(&data->req, CRYPTO_TFM_REQ_MAY_SLEEP |
+						  CRYPTO_TFM_REQ_MAY_BACKLOG,
+				      crypto_req_done, &data->wait);
+	skcipher_request_set_crypt(&data->req, &data->sg, &data->sg,
+				   sizeof(data->derived_keys), data->iv);
+	err = crypto_wait_req(crypto_skcipher_encrypt(&data->req), &data->wait);
+	if (err)
+		goto out;
+	keyp = data->derived_keys;
+
+	/* Set the block cipher key (K_E) */
+	crypto_cipher_clear_flags(tctx->blockcipher, CRYPTO_TFM_REQ_MASK);
+	crypto_cipher_set_flags(tctx->blockcipher,
+				crypto_skcipher_get_flags(tfm) &
+				CRYPTO_TFM_REQ_MASK);
+	err = crypto_cipher_setkey(tctx->blockcipher, keyp,
+				   BLOCKCIPHER_KEY_SIZE);
+	crypto_skcipher_set_flags(tfm,
+				  crypto_cipher_get_flags(tctx->blockcipher) &
+				  CRYPTO_TFM_RES_MASK);
+	if (err)
+		goto out;
+	keyp += BLOCKCIPHER_KEY_SIZE;
+
+	/* Set the hash key (K_H) */
+	poly1305_core_setkey(&tctx->header_hash_key, keyp);
+	keyp += POLY1305_BLOCK_SIZE;
+
+	crypto_shash_clear_flags(tctx->hash, CRYPTO_TFM_REQ_MASK);
+	crypto_shash_set_flags(tctx->hash, crypto_skcipher_get_flags(tfm) &
+					   CRYPTO_TFM_REQ_MASK);
+	err = crypto_shash_setkey(tctx->hash, keyp, NHPOLY1305_KEY_SIZE);
+	crypto_skcipher_set_flags(tfm, crypto_shash_get_flags(tctx->hash) &
+				       CRYPTO_TFM_RES_MASK);
+	keyp += NHPOLY1305_KEY_SIZE;
+	WARN_ON(keyp != &data->derived_keys[ARRAY_SIZE(data->derived_keys)]);
+out:
+	kzfree(data);
+	return err;
+}
+
+/* Addition in Z/(2^{128}Z) */
+static inline void le128_add(le128 *r, const le128 *v1, const le128 *v2)
+{
+	u64 x = le64_to_cpu(v1->b);
+	u64 y = le64_to_cpu(v2->b);
+
+	r->b = cpu_to_le64(x + y);
+	r->a = cpu_to_le64(le64_to_cpu(v1->a) + le64_to_cpu(v2->a) +
+			   (x + y < x));
+}
+
+/* Subtraction in Z/(2^{128}Z) */
+static inline void le128_sub(le128 *r, const le128 *v1, const le128 *v2)
+{
+	u64 x = le64_to_cpu(v1->b);
+	u64 y = le64_to_cpu(v2->b);
+
+	r->b = cpu_to_le64(x - y);
+	r->a = cpu_to_le64(le64_to_cpu(v1->a) - le64_to_cpu(v2->a) -
+			   (x - y > x));
+}
+
+/*
+ * Apply the Poly1305 εA∆U hash function to (message length, tweak) and save the
+ * result to rctx->header_hash.
+ *
+ * This value is reused in both the first and second hash steps.  Specifically,
+ * it's added to the result of an independently keyed εA∆U hash function (for
+ * equal length inputs only) taken over the message.  This gives the overall
+ * Adiantum hash of the (tweak, message) pair.
+ */
+static void adiantum_hash_header(struct skcipher_request *req)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	const struct adiantum_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
+	struct adiantum_request_ctx *rctx = skcipher_request_ctx(req);
+	const unsigned int bulk_len = req->cryptlen - BLOCKCIPHER_BLOCK_SIZE;
+	struct {
+		__le64 message_bits;
+		__le64 padding;
+	} header = {
+		.message_bits = cpu_to_le64((u64)bulk_len * 8)
+	};
+	struct poly1305_state state;
+
+	poly1305_core_init(&state);
+
+	BUILD_BUG_ON(sizeof(header) % POLY1305_BLOCK_SIZE != 0);
+	poly1305_core_blocks(&state, &tctx->header_hash_key,
+			     &header, sizeof(header) / POLY1305_BLOCK_SIZE);
+
+	BUILD_BUG_ON(TWEAK_SIZE % POLY1305_BLOCK_SIZE != 0);
+	poly1305_core_blocks(&state, &tctx->header_hash_key, req->iv,
+			     TWEAK_SIZE / POLY1305_BLOCK_SIZE);
+
+	poly1305_core_emit(&state, &rctx->header_hash);
+}
+
+/* Hash the left-hand block (the "bulk") of the message using NHPoly1305 */
+static int adiantum_hash_message(struct skcipher_request *req,
+				 struct scatterlist *sgl, le128 *digest)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	const struct adiantum_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
+	struct adiantum_request_ctx *rctx = skcipher_request_ctx(req);
+	const unsigned int bulk_len = req->cryptlen - BLOCKCIPHER_BLOCK_SIZE;
+	struct shash_desc *hash_desc = &rctx->u.hash_desc;
+	struct sg_mapping_iter miter;
+	unsigned int i, n;
+	int err;
+
+	hash_desc->tfm = tctx->hash;
+	hash_desc->flags = 0;
+
+	err = crypto_shash_init(hash_desc);
+	if (err)
+		return err;
+
+	sg_miter_start(&miter, sgl, sg_nents(sgl),
+		       SG_MITER_FROM_SG | SG_MITER_ATOMIC);
+	for (i = 0; i < bulk_len; i += n) {
+		sg_miter_next(&miter);
+		n = min_t(unsigned int, miter.length, bulk_len - i);
+		err = crypto_shash_update(hash_desc, miter.addr, n);
+		if (err)
+			break;
+	}
+	sg_miter_stop(&miter);
+	if (err)
+		return err;
+
+	return crypto_shash_final(hash_desc, (u8 *)digest);
+}
+
+/* Continue Adiantum encryption/decryption after the stream cipher step */
+static int adiantum_finish(struct skcipher_request *req)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	const struct adiantum_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
+	struct adiantum_request_ctx *rctx = skcipher_request_ctx(req);
+	const unsigned int bulk_len = req->cryptlen - BLOCKCIPHER_BLOCK_SIZE;
+	le128 digest;
+	int err;
+
+	/* If decrypting, decrypt C_M with the block cipher to get P_M */
+	if (!rctx->enc)
+		crypto_cipher_decrypt_one(tctx->blockcipher, rctx->rbuf.bytes,
+					  rctx->rbuf.bytes);
+
+	/*
+	 * Second hash step
+	 *	enc: C_R = C_M - H_{K_H}(T, C_L)
+	 *	dec: P_R = P_M - H_{K_H}(T, P_L)
+	 */
+	err = adiantum_hash_message(req, req->dst, &digest);
+	if (err)
+		return err;
+	le128_add(&digest, &digest, &rctx->header_hash);
+	le128_sub(&rctx->rbuf.bignum, &rctx->rbuf.bignum, &digest);
+	scatterwalk_map_and_copy(&rctx->rbuf.bignum, req->dst,
+				 bulk_len, BLOCKCIPHER_BLOCK_SIZE, 1);
+	return 0;
+}
+
+static void adiantum_streamcipher_done(struct crypto_async_request *areq, int err)
+{
+	struct skcipher_request *req = areq->data;
+
+	if (!err)
+		err = adiantum_finish(req);
+
+	skcipher_request_complete(req, err);
+}
+
+static int adiantum_crypt(struct skcipher_request *req, bool enc)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	const struct adiantum_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
+	struct adiantum_request_ctx *rctx = skcipher_request_ctx(req);
+	const unsigned int bulk_len = req->cryptlen - BLOCKCIPHER_BLOCK_SIZE;
+	unsigned int stream_len;
+	le128 digest;
+	int err;
+
+	if (req->cryptlen < BLOCKCIPHER_BLOCK_SIZE)
+		return -EINVAL;
+
+	rctx->enc = enc;
+
+	/*
+	 * First hash step
+	 *	enc: P_M = P_R + H_{K_H}(T, P_L)
+	 *	dec: C_M = C_R + H_{K_H}(T, C_L)
+	 */
+	adiantum_hash_header(req);
+	err = adiantum_hash_message(req, req->src, &digest);
+	if (err)
+		return err;
+	le128_add(&digest, &digest, &rctx->header_hash);
+	scatterwalk_map_and_copy(&rctx->rbuf.bignum, req->src,
+				 bulk_len, BLOCKCIPHER_BLOCK_SIZE, 0);
+	le128_add(&rctx->rbuf.bignum, &rctx->rbuf.bignum, &digest);
+
+	/* If encrypting, encrypt P_M with the block cipher to get C_M */
+	if (enc)
+		crypto_cipher_encrypt_one(tctx->blockcipher, rctx->rbuf.bytes,
+					  rctx->rbuf.bytes);
+
+	/* Initialize the rest of the XChaCha IV (first part is C_M) */
+	BUILD_BUG_ON(BLOCKCIPHER_BLOCK_SIZE != 16);
+	BUILD_BUG_ON(XCHACHA_IV_SIZE != 32);	/* nonce || stream position */
+	rctx->rbuf.words[4] = cpu_to_le32(1);
+	rctx->rbuf.words[5] = 0;
+	rctx->rbuf.words[6] = 0;
+	rctx->rbuf.words[7] = 0;
+
+	/*
+	 * XChaCha needs to be done on all the data except the last 16 bytes;
+	 * for disk encryption that usually means 4080 or 496 bytes.  But ChaCha
+	 * implementations tend to be most efficient when passed a whole number
+	 * of 64-byte ChaCha blocks, or sometimes even a multiple of 256 bytes.
+	 * And here it doesn't matter whether the last 16 bytes are written to,
+	 * as the second hash step will overwrite them.  Thus, round the XChaCha
+	 * length up to the next 64-byte boundary if possible.
+	 */
+	stream_len = bulk_len;
+	if (round_up(stream_len, CHACHA_BLOCK_SIZE) <= req->cryptlen)
+		stream_len = round_up(stream_len, CHACHA_BLOCK_SIZE);
+
+	skcipher_request_set_tfm(&rctx->u.streamcipher_req, tctx->streamcipher);
+	skcipher_request_set_crypt(&rctx->u.streamcipher_req, req->src,
+				   req->dst, stream_len, &rctx->rbuf);
+	skcipher_request_set_callback(&rctx->u.streamcipher_req,
+				      req->base.flags,
+				      adiantum_streamcipher_done, req);
+	return crypto_skcipher_encrypt(&rctx->u.streamcipher_req) ?:
+		adiantum_finish(req);
+}
+
+static int adiantum_encrypt(struct skcipher_request *req)
+{
+	return adiantum_crypt(req, true);
+}
+
+static int adiantum_decrypt(struct skcipher_request *req)
+{
+	return adiantum_crypt(req, false);
+}
+
+static int adiantum_init_tfm(struct crypto_skcipher *tfm)
+{
+	struct skcipher_instance *inst = skcipher_alg_instance(tfm);
+	struct adiantum_instance_ctx *ictx = skcipher_instance_ctx(inst);
+	struct adiantum_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
+	struct crypto_skcipher *streamcipher;
+	struct crypto_cipher *blockcipher;
+	struct crypto_shash *hash;
+	unsigned int subreq_size;
+	int err;
+
+	streamcipher = crypto_spawn_skcipher(&ictx->streamcipher_spawn);
+	if (IS_ERR(streamcipher))
+		return PTR_ERR(streamcipher);
+
+	blockcipher = crypto_spawn_cipher(&ictx->blockcipher_spawn);
+	if (IS_ERR(blockcipher)) {
+		err = PTR_ERR(blockcipher);
+		goto err_free_streamcipher;
+	}
+
+	hash = crypto_spawn_shash(&ictx->hash_spawn);
+	if (IS_ERR(hash)) {
+		err = PTR_ERR(hash);
+		goto err_free_blockcipher;
+	}
+
+	tctx->streamcipher = streamcipher;
+	tctx->blockcipher = blockcipher;
+	tctx->hash = hash;
+
+	BUILD_BUG_ON(offsetofend(struct adiantum_request_ctx, u) !=
+		     sizeof(struct adiantum_request_ctx));
+	subreq_size = max(FIELD_SIZEOF(struct adiantum_request_ctx,
+				       u.hash_desc) +
+			  crypto_shash_descsize(hash),
+			  FIELD_SIZEOF(struct adiantum_request_ctx,
+				       u.streamcipher_req) +
+			  crypto_skcipher_reqsize(streamcipher));
+
+	crypto_skcipher_set_reqsize(tfm,
+				    offsetof(struct adiantum_request_ctx, u) +
+				    subreq_size);
+	return 0;
+
+err_free_blockcipher:
+	crypto_free_cipher(blockcipher);
+err_free_streamcipher:
+	crypto_free_skcipher(streamcipher);
+	return err;
+}
+
+static void adiantum_exit_tfm(struct crypto_skcipher *tfm)
+{
+	struct adiantum_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
+
+	crypto_free_skcipher(tctx->streamcipher);
+	crypto_free_cipher(tctx->blockcipher);
+	crypto_free_shash(tctx->hash);
+}
+
+static void adiantum_free_instance(struct skcipher_instance *inst)
+{
+	struct adiantum_instance_ctx *ictx = skcipher_instance_ctx(inst);
+
+	crypto_drop_skcipher(&ictx->streamcipher_spawn);
+	crypto_drop_spawn(&ictx->blockcipher_spawn);
+	crypto_drop_shash(&ictx->hash_spawn);
+	kfree(inst);
+}
+
+/*
+ * Check for a supported set of inner algorithms.
+ * See the comment at the beginning of this file.
+ */
+static bool adiantum_supported_algorithms(struct skcipher_alg *streamcipher_alg,
+					  struct crypto_alg *blockcipher_alg,
+					  struct shash_alg *hash_alg)
+{
+	if (strcmp(streamcipher_alg->base.cra_name, "xchacha12") != 0 &&
+	    strcmp(streamcipher_alg->base.cra_name, "xchacha20") != 0)
+		return false;
+
+	if (blockcipher_alg->cra_cipher.cia_min_keysize > BLOCKCIPHER_KEY_SIZE ||
+	    blockcipher_alg->cra_cipher.cia_max_keysize < BLOCKCIPHER_KEY_SIZE)
+		return false;
+	if (blockcipher_alg->cra_blocksize != BLOCKCIPHER_BLOCK_SIZE)
+		return false;
+
+	if (strcmp(hash_alg->base.cra_name, "nhpoly1305") != 0)
+		return false;
+
+	return true;
+}
+
+static int adiantum_create(struct crypto_template *tmpl, struct rtattr **tb)
+{
+	struct crypto_attr_type *algt;
+	const char *streamcipher_name;
+	const char *blockcipher_name;
+	struct skcipher_instance *inst;
+	struct adiantum_instance_ctx *ictx;
+	struct skcipher_alg *streamcipher_alg;
+	struct crypto_alg *blockcipher_alg;
+	struct crypto_alg *_hash_alg;
+	struct shash_alg *hash_alg;
+	int err;
+
+	algt = crypto_get_attr_type(tb);
+	if (IS_ERR(algt))
+		return PTR_ERR(algt);
+
+	if ((algt->type ^ CRYPTO_ALG_TYPE_SKCIPHER) & algt->mask)
+		return -EINVAL;
+
+	streamcipher_name = crypto_attr_alg_name(tb[1]);
+	if (IS_ERR(streamcipher_name))
+		return PTR_ERR(streamcipher_name);
+
+	blockcipher_name = crypto_attr_alg_name(tb[2]);
+	if (IS_ERR(blockcipher_name))
+		return PTR_ERR(blockcipher_name);
+
+	inst = kzalloc(sizeof(*inst) + sizeof(*ictx), GFP_KERNEL);
+	if (!inst)
+		return -ENOMEM;
+	ictx = skcipher_instance_ctx(inst);
+
+	/* Stream cipher, e.g. "xchacha12" */
+	err = crypto_grab_skcipher(&ictx->streamcipher_spawn, streamcipher_name,
+				   0, crypto_requires_sync(algt->type,
+							   algt->mask));
+	if (err)
+		goto out_free_inst;
+	streamcipher_alg = crypto_spawn_skcipher_alg(&ictx->streamcipher_spawn);
+
+	/* Block cipher, e.g. "aes" */
+	err = crypto_grab_spawn(&ictx->blockcipher_spawn, blockcipher_name,
+				CRYPTO_ALG_TYPE_CIPHER, CRYPTO_ALG_TYPE_MASK);
+	if (err)
+		goto out_drop_streamcipher;
+	blockcipher_alg = ictx->blockcipher_spawn.alg;
+
+	/* NHPoly1305 εA∆U hash function */
+	_hash_alg = crypto_alg_mod_lookup("nhpoly1305", CRYPTO_ALG_TYPE_SHASH,
+					  CRYPTO_ALG_TYPE_MASK);
+	if (IS_ERR(_hash_alg)) {
+		err = PTR_ERR(_hash_alg);
+		goto out_drop_blockcipher;
+	}
+	hash_alg = __crypto_shash_alg(_hash_alg);
+	err = crypto_init_shash_spawn(&ictx->hash_spawn, hash_alg,
+				      skcipher_crypto_instance(inst));
+	if (err) {
+		crypto_mod_put(_hash_alg);
+		goto out_drop_blockcipher;
+	}
+
+	/* Check the set of algorithms */
+	err = -EINVAL;
+	if (!adiantum_supported_algorithms(streamcipher_alg, blockcipher_alg,
+					   hash_alg)) {
+		pr_warn("Unsupported Adiantum instantiation: (%s,%s,%s)\n",
+			streamcipher_alg->base.cra_name,
+			blockcipher_alg->cra_name, hash_alg->base.cra_name);
+		goto out_drop_hash;
+	}
+
+	/* Instance fields */
+
+	err = -ENAMETOOLONG;
+	if (snprintf(inst->alg.base.cra_name, CRYPTO_MAX_ALG_NAME,
+		     "adiantum(%s,%s)", streamcipher_alg->base.cra_name,
+		     blockcipher_alg->cra_name) >= CRYPTO_MAX_ALG_NAME)
+		goto out_drop_hash;
+	if (snprintf(inst->alg.base.cra_driver_name, CRYPTO_MAX_ALG_NAME,
+		     "adiantum_base(%s,%s,%s)",
+		     streamcipher_alg->base.cra_driver_name,
+		     blockcipher_alg->cra_driver_name,
+		     hash_alg->base.cra_driver_name) >= CRYPTO_MAX_ALG_NAME)
+		goto out_drop_hash;
+
+	inst->alg.base.cra_blocksize = BLOCKCIPHER_BLOCK_SIZE;
+	inst->alg.base.cra_ctxsize = sizeof(struct adiantum_tfm_ctx);
+	inst->alg.base.cra_alignmask = streamcipher_alg->base.cra_alignmask;
+	/*
+	 * The block cipher is only invoked once per message, so for long
+	 * messages (e.g. sectors for disk encryption) its performance doesn't
+	 * matter as much as that of the stream cipher and hash function.  Thus,
+	 * weigh the block cipher's ->cra_priority less.
+	 */
+	inst->alg.base.cra_priority = (4 * streamcipher_alg->base.cra_priority +
+				       2 * hash_alg->base.cra_priority +
+				       blockcipher_alg->cra_priority) / 7;
+
+	inst->alg.setkey = adiantum_setkey;
+	inst->alg.encrypt = adiantum_encrypt;
+	inst->alg.decrypt = adiantum_decrypt;
+	inst->alg.init = adiantum_init_tfm;
+	inst->alg.exit = adiantum_exit_tfm;
+	inst->alg.min_keysize = streamcipher_alg->min_keysize;
+	inst->alg.max_keysize = streamcipher_alg->max_keysize;
+	inst->alg.ivsize = TWEAK_SIZE;
+
+	inst->free = adiantum_free_instance;
+
+	err = skcipher_register_instance(tmpl, inst);
+	if (err)
+		goto out_drop_hash;
+
+	return 0;
+
+out_drop_hash:
+	crypto_drop_shash(&ictx->hash_spawn);
+out_drop_blockcipher:
+	crypto_drop_spawn(&ictx->blockcipher_spawn);
+out_drop_streamcipher:
+	crypto_drop_skcipher(&ictx->streamcipher_spawn);
+out_free_inst:
+	kfree(inst);
+	return err;
+}
+
+/* adiantum(streamcipher_name, blockcipher_name) */
+static struct crypto_template adiantum_tmpl = {
+	.name = "adiantum",
+	.create = adiantum_create,
+	.module = THIS_MODULE,
+};
+
+static int __init adiantum_module_init(void)
+{
+	return crypto_register_template(&adiantum_tmpl);
+}
+
+static void __exit adiantum_module_exit(void)
+{
+	crypto_unregister_template(&adiantum_tmpl);
+}
+
+module_init(adiantum_module_init);
+module_exit(adiantum_module_exit);
+
+MODULE_DESCRIPTION("Adiantum length-preserving encryption mode");
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
+MODULE_ALIAS_CRYPTO("adiantum");
diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index 039a5d850a29c..4ce255a4509de 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -2404,6 +2404,18 @@ static int alg_test_null(const struct alg_test_desc *desc,
 /* Please keep this list sorted by algorithm name. */
 static const struct alg_test_desc alg_test_descs[] = {
 	{
+		.alg = "adiantum(xchacha12,aes)",
+		.test = alg_test_skcipher,
+		.suite = {
+			.cipher = __VECS(adiantum_xchacha12_aes_tv_template)
+		},
+	}, {
+		.alg = "adiantum(xchacha20,aes)",
+		.test = alg_test_skcipher,
+		.suite = {
+			.cipher = __VECS(adiantum_xchacha20_aes_tv_template)
+		},
+	}, {
 		.alg = "aegis128",
 		.test = alg_test_aead,
 		.suite = {
diff --git a/crypto/testmgr.h b/crypto/testmgr.h
index 40197d74b3d56..6b2fb444f6877 100644
--- a/crypto/testmgr.h
+++ b/crypto/testmgr.h
@@ -33189,6 +33189,467 @@ static const struct cipher_testvec xchacha12_tv_template[] = {
 	},
 };
 
+/* Adiantum test vectors from https://github.com/google/adiantum */
+static const struct cipher_testvec adiantum_xchacha12_aes_tv_template[] = {
+	{
+		.key	= "\x9e\xeb\xb2\x49\x3c\x1c\xf5\xf4"
+			  "\x6a\x99\xc2\xc4\xdf\xb1\xf4\xdd"
+			  "\x75\x20\x57\xea\x2c\x4f\xcd\xb2"
+			  "\xa5\x3d\x7b\x49\x1e\xab\xfd\x0f",
+		.klen	= 32,
+		.iv	= "\xdf\x63\xd4\xab\xd2\x49\xf3\xd8"
+			  "\x33\x81\x37\x60\x7d\xfa\x73\x08"
+			  "\xd8\x49\x6d\x80\xe8\x2f\x62\x54"
+			  "\xeb\x0e\xa9\x39\x5b\x45\x7f\x8a",
+		.ptext	= "\x67\xc9\xf2\x30\x84\x41\x8e\x43"
+			  "\xfb\xf3\xb3\x3e\x79\x36\x7f\xe8",
+		.ctext	= "\x6d\x32\x86\x18\x67\x86\x0f\x3f"
+			  "\x96\x7c\x9d\x28\x0d\x53\xec\x9f",
+		.len	= 16,
+		.also_non_np = 1,
+		.np	= 2,
+		.tap	= { 14, 2 },
+	}, {
+		.key	= "\x36\x2b\x57\x97\xf8\x5d\xcd\x99"
+			  "\x5f\x1a\x5a\x44\x1d\x92\x0f\x27"
+			  "\xcc\x16\xd7\x2b\x85\x63\x99\xd3"
+			  "\xba\x96\xa1\xdb\xd2\x60\x68\xda",
+		.klen	= 32,
+		.iv	= "\xef\x58\x69\xb1\x2c\x5e\x9a\x47"
+			  "\x24\xc1\xb1\x69\xe1\x12\x93\x8f"
+			  "\x43\x3d\x6d\x00\xdb\x5e\xd8\xd9"
+			  "\x12\x9a\xfe\xd9\xff\x2d\xaa\xc4",
+		.ptext	= "\x5e\xa8\x68\x19\x85\x98\x12\x23"
+			  "\x26\x0a\xcc\xdb\x0a\x04\xb9\xdf"
+			  "\x4d\xb3\x48\x7b\xb0\xe3\xc8\x19"
+			  "\x43\x5a\x46\x06\x94\x2d\xf2",
+		.ctext	= "\xc7\xc6\xf1\x73\x8f\xc4\xff\x4a"
+			  "\x39\xbe\x78\xbe\x8d\x28\xc8\x89"
+			  "\x46\x63\xe7\x0c\x7d\x87\xe8\x4e"
+			  "\xc9\x18\x7b\xbe\x18\x60\x50",
+		.len	= 31,
+	}, {
+		.key	= "\xa5\x28\x24\x34\x1a\x3c\xd8\xf7"
+			  "\x05\x91\x8f\xee\x85\x1f\x35\x7f"
+			  "\x80\x3d\xfc\x9b\x94\xf6\xfc\x9e"
+			  "\x19\x09\x00\xa9\x04\x31\x4f\x11",
+		.klen	= 32,
+		.iv	= "\xa1\xba\x49\x95\xff\x34\x6d\xb8"
+			  "\xcd\x87\x5d\x5e\xfd\xea\x85\xdb"
+			  "\x8a\x7b\x5e\xb2\x5d\x57\xdd\x62"
+			  "\xac\xa9\x8c\x41\x42\x94\x75\xb7",
+		.ptext	= "\x69\xb4\xe8\x8c\x37\xe8\x67\x82"
+			  "\xf1\xec\x5d\x04\xe5\x14\x91\x13"
+			  "\xdf\xf2\x87\x1b\x69\x81\x1d\x71"
+			  "\x70\x9e\x9c\x3b\xde\x49\x70\x11"
+			  "\xa0\xa3\xdb\x0d\x54\x4f\x66\x69"
+			  "\xd7\xdb\x80\xa7\x70\x92\x68\xce"
+			  "\x81\x04\x2c\xc6\xab\xae\xe5\x60"
+			  "\x15\xe9\x6f\xef\xaa\x8f\xa7\xa7"
+			  "\x63\x8f\xf2\xf0\x77\xf1\xa8\xea"
+			  "\xe1\xb7\x1f\x9e\xab\x9e\x4b\x3f"
+			  "\x07\x87\x5b\x6f\xcd\xa8\xaf\xb9"
+			  "\xfa\x70\x0b\x52\xb8\xa8\xa7\x9e"
+			  "\x07\x5f\xa6\x0e\xb3\x9b\x79\x13"
+			  "\x79\xc3\x3e\x8d\x1c\x2c\x68\xc8"
+			  "\x51\x1d\x3c\x7b\x7d\x79\x77\x2a"
+			  "\x56\x65\xc5\x54\x23\x28\xb0\x03",
+		.ctext	= "\x9e\x16\xab\xed\x4b\xa7\x42\x5a"
+			  "\xc6\xfb\x4e\x76\xff\xbe\x03\xa0"
+			  "\x0f\xe3\xad\xba\xe4\x98\x2b\x0e"
+			  "\x21\x48\xa0\xb8\x65\x48\x27\x48"
+			  "\x84\x54\x54\xb2\x9a\x94\x7b\xe6"
+			  "\x4b\x29\xe9\xcf\x05\x91\x80\x1a"
+			  "\x3a\xf3\x41\x96\x85\x1d\x9f\x74"
+			  "\x51\x56\x63\xfa\x7c\x28\x85\x49"
+			  "\xf7\x2f\xf9\xf2\x18\x46\xf5\x33"
+			  "\x80\xa3\x3c\xce\xb2\x57\x93\xf5"
+			  "\xae\xbd\xa9\xf5\x7b\x30\xc4\x93"
+			  "\x66\xe0\x30\x77\x16\xe4\xa0\x31"
+			  "\xba\x70\xbc\x68\x13\xf5\xb0\x9a"
+			  "\xc1\xfc\x7e\xfe\x55\x80\x5c\x48"
+			  "\x74\xa6\xaa\xa3\xac\xdc\xc2\xf5"
+			  "\x8d\xde\x34\x86\x78\x60\x75\x8d",
+		.len	= 128,
+		.also_non_np = 1,
+		.np	= 4,
+		.tap	= { 104, 16, 4, 4 },
+	}, {
+		.key	= "\xd3\x81\x72\x18\x23\xff\x6f\x4a"
+			  "\x25\x74\x29\x0d\x51\x8a\x0e\x13"
+			  "\xc1\x53\x5d\x30\x8d\xee\x75\x0d"
+			  "\x14\xd6\x69\xc9\x15\xa9\x0c\x60",
+		.klen	= 32,
+		.iv	= "\x65\x9b\xd4\xa8\x7d\x29\x1d\xf4"
+			  "\xc4\xd6\x9b\x6a\x28\xab\x64\xe2"
+			  "\x62\x81\x97\xc5\x81\xaa\xf9\x44"
+			  "\xc1\x72\x59\x82\xaf\x16\xc8\x2c",
+		.ptext	= "\xc7\x6b\x52\x6a\x10\xf0\xcc\x09"
+			  "\xc1\x12\x1d\x6d\x21\xa6\x78\xf5"
+			  "\x05\xa3\x69\x60\x91\x36\x98\x57"
+			  "\xba\x0c\x14\xcc\xf3\x2d\x73\x03"
+			  "\xc6\xb2\x5f\xc8\x16\x27\x37\x5d"
+			  "\xd0\x0b\x87\xb2\x50\x94\x7b\x58"
+			  "\x04\xf4\xe0\x7f\x6e\x57\x8e\xc9"
+			  "\x41\x84\xc1\xb1\x7e\x4b\x91\x12"
+			  "\x3a\x8b\x5d\x50\x82\x7b\xcb\xd9"
+			  "\x9a\xd9\x4e\x18\x06\x23\x9e\xd4"
+			  "\xa5\x20\x98\xef\xb5\xda\xe5\xc0"
+			  "\x8a\x6a\x83\x77\x15\x84\x1e\xae"
+			  "\x78\x94\x9d\xdf\xb7\xd1\xea\x67"
+			  "\xaa\xb0\x14\x15\xfa\x67\x21\x84"
+			  "\xd3\x41\x2a\xce\xba\x4b\x4a\xe8"
+			  "\x95\x62\xa9\x55\xf0\x80\xad\xbd"
+			  "\xab\xaf\xdd\x4f\xa5\x7c\x13\x36"
+			  "\xed\x5e\x4f\x72\xad\x4b\xf1\xd0"
+			  "\x88\x4e\xec\x2c\x88\x10\x5e\xea"
+			  "\x12\xc0\x16\x01\x29\xa3\xa0\x55"
+			  "\xaa\x68\xf3\xe9\x9d\x3b\x0d\x3b"
+			  "\x6d\xec\xf8\xa0\x2d\xf0\x90\x8d"
+			  "\x1c\xe2\x88\xd4\x24\x71\xf9\xb3"
+			  "\xc1\x9f\xc5\xd6\x76\x70\xc5\x2e"
+			  "\x9c\xac\xdb\x90\xbd\x83\x72\xba"
+			  "\x6e\xb5\xa5\x53\x83\xa9\xa5\xbf"
+			  "\x7d\x06\x0e\x3c\x2a\xd2\x04\xb5"
+			  "\x1e\x19\x38\x09\x16\xd2\x82\x1f"
+			  "\x75\x18\x56\xb8\x96\x0b\xa6\xf9"
+			  "\xcf\x62\xd9\x32\x5d\xa9\xd7\x1d"
+			  "\xec\xe4\xdf\x1b\xbe\xf1\x36\xee"
+			  "\xe3\x7b\xb5\x2f\xee\xf8\x53\x3d"
+			  "\x6a\xb7\x70\xa9\xfc\x9c\x57\x25"
+			  "\xf2\x89\x10\xd3\xb8\xa8\x8c\x30"
+			  "\xae\x23\x4f\x0e\x13\x66\x4f\xe1"
+			  "\xb6\xc0\xe4\xf8\xef\x93\xbd\x6e"
+			  "\x15\x85\x6b\xe3\x60\x81\x1d\x68"
+			  "\xd7\x31\x87\x89\x09\xab\xd5\x96"
+			  "\x1d\xf3\x6d\x67\x80\xca\x07\x31"
+			  "\x5d\xa7\xe4\xfb\x3e\xf2\x9b\x33"
+			  "\x52\x18\xc8\x30\xfe\x2d\xca\x1e"
+			  "\x79\x92\x7a\x60\x5c\xb6\x58\x87"
+			  "\xa4\x36\xa2\x67\x92\x8b\xa4\xb7"
+			  "\xf1\x86\xdf\xdc\xc0\x7e\x8f\x63"
+			  "\xd2\xa2\xdc\x78\xeb\x4f\xd8\x96"
+			  "\x47\xca\xb8\x91\xf9\xf7\x94\x21"
+			  "\x5f\x9a\x9f\x5b\xb8\x40\x41\x4b"
+			  "\x66\x69\x6a\x72\xd0\xcb\x70\xb7"
+			  "\x93\xb5\x37\x96\x05\x37\x4f\xe5"
+			  "\x8c\xa7\x5a\x4e\x8b\xb7\x84\xea"
+			  "\xc7\xfc\x19\x6e\x1f\x5a\xa1\xac"
+			  "\x18\x7d\x52\x3b\xb3\x34\x62\x99"
+			  "\xe4\x9e\x31\x04\x3f\xc0\x8d\x84"
+			  "\x17\x7c\x25\x48\x52\x67\x11\x27"
+			  "\x67\xbb\x5a\x85\xca\x56\xb2\x5c"
+			  "\xe6\xec\xd5\x96\x3d\x15\xfc\xfb"
+			  "\x22\x25\xf4\x13\xe5\x93\x4b\x9a"
+			  "\x77\xf1\x52\x18\xfa\x16\x5e\x49"
+			  "\x03\x45\xa8\x08\xfa\xb3\x41\x92"
+			  "\x79\x50\x33\xca\xd0\xd7\x42\x55"
+			  "\xc3\x9a\x0c\x4e\xd9\xa4\x3c\x86"
+			  "\x80\x9f\x53\xd1\xa4\x2e\xd1\xbc"
+			  "\xf1\x54\x6e\x93\xa4\x65\x99\x8e"
+			  "\xdf\x29\xc0\x64\x63\x07\xbb\xea",
+		.ctext	= "\x15\x97\xd0\x86\x18\x03\x9c\x51"
+			  "\xc5\x11\x36\x62\x13\x92\xe6\x73"
+			  "\x29\x79\xde\xa1\x00\x3e\x08\x64"
+			  "\x17\x1a\xbc\xd5\xfe\x33\x0e\x0c"
+			  "\x7c\x94\xa7\xc6\x3c\xbe\xac\xa2"
+			  "\x89\xe6\xbc\xdf\x0c\x33\x27\x42"
+			  "\x46\x73\x2f\xba\x4e\xa6\x46\x8f"
+			  "\xe4\xee\x39\x63\x42\x65\xa3\x88"
+			  "\x7a\xad\x33\x23\xa9\xa7\x20\x7f"
+			  "\x0b\xe6\x6a\xc3\x60\xda\x9e\xb4"
+			  "\xd6\x07\x8a\x77\x26\xd1\xab\x44"
+			  "\x99\x55\x03\x5e\xed\x8d\x7b\xbd"
+			  "\xc8\x21\xb7\x21\x30\x3f\xc0\xb5"
+			  "\xc8\xec\x6c\x23\xa6\xa3\x6d\xf1"
+			  "\x30\x0a\xd0\xa6\xa9\x28\x69\xae"
+			  "\x2a\xe6\x54\xac\x82\x9d\x6a\x95"
+			  "\x6f\x06\x44\xc5\x5a\x77\x6e\xec"
+			  "\xf8\xf8\x63\xb2\xe6\xaa\xbd\x8e"
+			  "\x0e\x8a\x62\x00\x03\xc8\x84\xdd"
+			  "\x47\x4a\xc3\x55\xba\xb7\xe7\xdf"
+			  "\x08\xbf\x62\xf5\xe8\xbc\xb6\x11"
+			  "\xe4\xcb\xd0\x66\x74\x32\xcf\xd4"
+			  "\xf8\x51\x80\x39\x14\x05\x12\xdb"
+			  "\x87\x93\xe2\x26\x30\x9c\x3a\x21"
+			  "\xe5\xd0\x38\x57\x80\x15\xe4\x08"
+			  "\x58\x05\x49\x7d\xe6\x92\x77\x70"
+			  "\xfb\x1e\x2d\x6a\x84\x00\xc8\x68"
+			  "\xf7\x1a\xdd\xf0\x7b\x38\x1e\xd8"
+			  "\x2c\x78\x78\x61\xcf\xe3\xde\x69"
+			  "\x1f\xd5\x03\xd5\x1a\xb4\xcf\x03"
+			  "\xc8\x7a\x70\x68\x35\xb4\xf6\xbe"
+			  "\x90\x62\xb2\x28\x99\x86\xf5\x44"
+			  "\x99\xeb\x31\xcf\xca\xdf\xd0\x21"
+			  "\xd6\x60\xf7\x0f\x40\xb4\x80\xb7"
+			  "\xab\xe1\x9b\x45\xba\x66\xda\xee"
+			  "\xdd\x04\x12\x40\x98\xe1\x69\xe5"
+			  "\x2b\x9c\x59\x80\xe7\x7b\xcc\x63"
+			  "\xa6\xc0\x3a\xa9\xfe\x8a\xf9\x62"
+			  "\x11\x34\x61\x94\x35\xfe\xf2\x99"
+			  "\xfd\xee\x19\xea\x95\xb6\x12\xbf"
+			  "\x1b\xdf\x02\x1a\xcc\x3e\x7e\x65"
+			  "\x78\x74\x10\x50\x29\x63\x28\xea"
+			  "\x6b\xab\xd4\x06\x4d\x15\x24\x31"
+			  "\xc7\x0a\xc9\x16\xb6\x48\xf0\xbf"
+			  "\x49\xdb\x68\x71\x31\x8f\x87\xe2"
+			  "\x13\x05\x64\xd6\x22\x0c\xf8\x36"
+			  "\x84\x24\x3e\x69\x5e\xb8\x9e\x16"
+			  "\x73\x6c\x83\x1e\xe0\x9f\x9e\xba"
+			  "\xe5\x59\x21\x33\x1b\xa9\x26\xc2"
+			  "\xc7\xd9\x30\x73\xb6\xa6\x73\x82"
+			  "\x19\xfa\x44\x4d\x40\x8b\x69\x04"
+			  "\x94\x74\xea\x6e\xb3\x09\x47\x01"
+			  "\x2a\xb9\x78\x34\x43\x11\xed\xd6"
+			  "\x8c\x95\x65\x1b\x85\x67\xa5\x40"
+			  "\xac\x9c\x05\x4b\x57\x4a\xa9\x96"
+			  "\x0f\xdd\x4f\xa1\xe0\xcf\x6e\xc7"
+			  "\x1b\xed\xa2\xb4\x56\x8c\x09\x6e"
+			  "\xa6\x65\xd7\x55\x81\xb7\xed\x11"
+			  "\x9b\x40\x75\xa8\x6b\x56\xaf\x16"
+			  "\x8b\x3d\xf4\xcb\xfe\xd5\x1d\x3d"
+			  "\x85\xc2\xc0\xde\x43\x39\x4a\x96"
+			  "\xba\x88\x97\xc0\xd6\x00\x0e\x27"
+			  "\x21\xb0\x21\x52\xba\xa7\x37\xaa"
+			  "\xcc\xbf\x95\xa8\xf4\xd0\x91\xf6",
+		.len	= 512,
+		.also_non_np = 1,
+		.np	= 2,
+		.tap	= { 144, 368 },
+	}
+};
+
+/* Adiantum with XChaCha20 instead of XChaCha12 */
+/* Test vectors from https://github.com/google/adiantum */
+static const struct cipher_testvec adiantum_xchacha20_aes_tv_template[] = {
+	{
+		.key	= "\x9e\xeb\xb2\x49\x3c\x1c\xf5\xf4"
+			  "\x6a\x99\xc2\xc4\xdf\xb1\xf4\xdd"
+			  "\x75\x20\x57\xea\x2c\x4f\xcd\xb2"
+			  "\xa5\x3d\x7b\x49\x1e\xab\xfd\x0f",
+		.klen	= 32,
+		.iv	= "\xdf\x63\xd4\xab\xd2\x49\xf3\xd8"
+			  "\x33\x81\x37\x60\x7d\xfa\x73\x08"
+			  "\xd8\x49\x6d\x80\xe8\x2f\x62\x54"
+			  "\xeb\x0e\xa9\x39\x5b\x45\x7f\x8a",
+		.ptext	= "\x67\xc9\xf2\x30\x84\x41\x8e\x43"
+			  "\xfb\xf3\xb3\x3e\x79\x36\x7f\xe8",
+		.ctext	= "\xf6\x78\x97\xd6\xaa\x94\x01\x27"
+			  "\x2e\x4d\x83\xe0\x6e\x64\x9a\xdf",
+		.len	= 16,
+		.also_non_np = 1,
+		.np	= 3,
+		.tap	= { 5, 2, 9 },
+	}, {
+		.key	= "\x36\x2b\x57\x97\xf8\x5d\xcd\x99"
+			  "\x5f\x1a\x5a\x44\x1d\x92\x0f\x27"
+			  "\xcc\x16\xd7\x2b\x85\x63\x99\xd3"
+			  "\xba\x96\xa1\xdb\xd2\x60\x68\xda",
+		.klen	= 32,
+		.iv	= "\xef\x58\x69\xb1\x2c\x5e\x9a\x47"
+			  "\x24\xc1\xb1\x69\xe1\x12\x93\x8f"
+			  "\x43\x3d\x6d\x00\xdb\x5e\xd8\xd9"
+			  "\x12\x9a\xfe\xd9\xff\x2d\xaa\xc4",
+		.ptext	= "\x5e\xa8\x68\x19\x85\x98\x12\x23"
+			  "\x26\x0a\xcc\xdb\x0a\x04\xb9\xdf"
+			  "\x4d\xb3\x48\x7b\xb0\xe3\xc8\x19"
+			  "\x43\x5a\x46\x06\x94\x2d\xf2",
+		.ctext	= "\x4b\xb8\x90\x10\xdf\x7f\x64\x08"
+			  "\x0e\x14\x42\x5f\x00\x74\x09\x36"
+			  "\x57\x72\xb5\xfd\xb5\x5d\xb8\x28"
+			  "\x0c\x04\x91\x14\x91\xe9\x37",
+		.len	= 31,
+		.also_non_np = 1,
+		.np	= 2,
+		.tap	= { 16, 15 },
+	}, {
+		.key	= "\xa5\x28\x24\x34\x1a\x3c\xd8\xf7"
+			  "\x05\x91\x8f\xee\x85\x1f\x35\x7f"
+			  "\x80\x3d\xfc\x9b\x94\xf6\xfc\x9e"
+			  "\x19\x09\x00\xa9\x04\x31\x4f\x11",
+		.klen	= 32,
+		.iv	= "\xa1\xba\x49\x95\xff\x34\x6d\xb8"
+			  "\xcd\x87\x5d\x5e\xfd\xea\x85\xdb"
+			  "\x8a\x7b\x5e\xb2\x5d\x57\xdd\x62"
+			  "\xac\xa9\x8c\x41\x42\x94\x75\xb7",
+		.ptext	= "\x69\xb4\xe8\x8c\x37\xe8\x67\x82"
+			  "\xf1\xec\x5d\x04\xe5\x14\x91\x13"
+			  "\xdf\xf2\x87\x1b\x69\x81\x1d\x71"
+			  "\x70\x9e\x9c\x3b\xde\x49\x70\x11"
+			  "\xa0\xa3\xdb\x0d\x54\x4f\x66\x69"
+			  "\xd7\xdb\x80\xa7\x70\x92\x68\xce"
+			  "\x81\x04\x2c\xc6\xab\xae\xe5\x60"
+			  "\x15\xe9\x6f\xef\xaa\x8f\xa7\xa7"
+			  "\x63\x8f\xf2\xf0\x77\xf1\xa8\xea"
+			  "\xe1\xb7\x1f\x9e\xab\x9e\x4b\x3f"
+			  "\x07\x87\x5b\x6f\xcd\xa8\xaf\xb9"
+			  "\xfa\x70\x0b\x52\xb8\xa8\xa7\x9e"
+			  "\x07\x5f\xa6\x0e\xb3\x9b\x79\x13"
+			  "\x79\xc3\x3e\x8d\x1c\x2c\x68\xc8"
+			  "\x51\x1d\x3c\x7b\x7d\x79\x77\x2a"
+			  "\x56\x65\xc5\x54\x23\x28\xb0\x03",
+		.ctext	= "\xb1\x8b\xa0\x05\x77\xa8\x4d\x59"
+			  "\x1b\x8e\x21\xfc\x3a\x49\xfa\xd4"
+			  "\xeb\x36\xf3\xc4\xdf\xdc\xae\x67"
+			  "\x07\x3f\x70\x0e\xe9\x66\xf5\x0c"
+			  "\x30\x4d\x66\xc9\xa4\x2f\x73\x9c"
+			  "\x13\xc8\x49\x44\xcc\x0a\x90\x9d"
+			  "\x7c\xdd\x19\x3f\xea\x72\x8d\x58"
+			  "\xab\xe7\x09\x2c\xec\xb5\x44\xd2"
+			  "\xca\xa6\x2d\x7a\x5c\x9c\x2b\x15"
+			  "\xec\x2a\xa6\x69\x91\xf9\xf3\x13"
+			  "\xf7\x72\xc1\xc1\x40\xd5\xe1\x94"
+			  "\xf4\x29\xa1\x3e\x25\x02\xa8\x3e"
+			  "\x94\xc1\x91\x14\xa1\x14\xcb\xbe"
+			  "\x67\x4c\xb9\x38\xfe\xa7\xaa\x32"
+			  "\x29\x62\x0d\xb2\xf6\x3c\x58\x57"
+			  "\xc1\xd5\x5a\xbb\xd6\xa6\x2a\xe5",
+		.len	= 128,
+		.also_non_np = 1,
+		.np	= 4,
+		.tap	= { 112, 7, 8, 1 },
+	}, {
+		.key	= "\xd3\x81\x72\x18\x23\xff\x6f\x4a"
+			  "\x25\x74\x29\x0d\x51\x8a\x0e\x13"
+			  "\xc1\x53\x5d\x30\x8d\xee\x75\x0d"
+			  "\x14\xd6\x69\xc9\x15\xa9\x0c\x60",
+		.klen	= 32,
+		.iv	= "\x65\x9b\xd4\xa8\x7d\x29\x1d\xf4"
+			  "\xc4\xd6\x9b\x6a\x28\xab\x64\xe2"
+			  "\x62\x81\x97\xc5\x81\xaa\xf9\x44"
+			  "\xc1\x72\x59\x82\xaf\x16\xc8\x2c",
+		.ptext	= "\xc7\x6b\x52\x6a\x10\xf0\xcc\x09"
+			  "\xc1\x12\x1d\x6d\x21\xa6\x78\xf5"
+			  "\x05\xa3\x69\x60\x91\x36\x98\x57"
+			  "\xba\x0c\x14\xcc\xf3\x2d\x73\x03"
+			  "\xc6\xb2\x5f\xc8\x16\x27\x37\x5d"
+			  "\xd0\x0b\x87\xb2\x50\x94\x7b\x58"
+			  "\x04\xf4\xe0\x7f\x6e\x57\x8e\xc9"
+			  "\x41\x84\xc1\xb1\x7e\x4b\x91\x12"
+			  "\x3a\x8b\x5d\x50\x82\x7b\xcb\xd9"
+			  "\x9a\xd9\x4e\x18\x06\x23\x9e\xd4"
+			  "\xa5\x20\x98\xef\xb5\xda\xe5\xc0"
+			  "\x8a\x6a\x83\x77\x15\x84\x1e\xae"
+			  "\x78\x94\x9d\xdf\xb7\xd1\xea\x67"
+			  "\xaa\xb0\x14\x15\xfa\x67\x21\x84"
+			  "\xd3\x41\x2a\xce\xba\x4b\x4a\xe8"
+			  "\x95\x62\xa9\x55\xf0\x80\xad\xbd"
+			  "\xab\xaf\xdd\x4f\xa5\x7c\x13\x36"
+			  "\xed\x5e\x4f\x72\xad\x4b\xf1\xd0"
+			  "\x88\x4e\xec\x2c\x88\x10\x5e\xea"
+			  "\x12\xc0\x16\x01\x29\xa3\xa0\x55"
+			  "\xaa\x68\xf3\xe9\x9d\x3b\x0d\x3b"
+			  "\x6d\xec\xf8\xa0\x2d\xf0\x90\x8d"
+			  "\x1c\xe2\x88\xd4\x24\x71\xf9\xb3"
+			  "\xc1\x9f\xc5\xd6\x76\x70\xc5\x2e"
+			  "\x9c\xac\xdb\x90\xbd\x83\x72\xba"
+			  "\x6e\xb5\xa5\x53\x83\xa9\xa5\xbf"
+			  "\x7d\x06\x0e\x3c\x2a\xd2\x04\xb5"
+			  "\x1e\x19\x38\x09\x16\xd2\x82\x1f"
+			  "\x75\x18\x56\xb8\x96\x0b\xa6\xf9"
+			  "\xcf\x62\xd9\x32\x5d\xa9\xd7\x1d"
+			  "\xec\xe4\xdf\x1b\xbe\xf1\x36\xee"
+			  "\xe3\x7b\xb5\x2f\xee\xf8\x53\x3d"
+			  "\x6a\xb7\x70\xa9\xfc\x9c\x57\x25"
+			  "\xf2\x89\x10\xd3\xb8\xa8\x8c\x30"
+			  "\xae\x23\x4f\x0e\x13\x66\x4f\xe1"
+			  "\xb6\xc0\xe4\xf8\xef\x93\xbd\x6e"
+			  "\x15\x85\x6b\xe3\x60\x81\x1d\x68"
+			  "\xd7\x31\x87\x89\x09\xab\xd5\x96"
+			  "\x1d\xf3\x6d\x67\x80\xca\x07\x31"
+			  "\x5d\xa7\xe4\xfb\x3e\xf2\x9b\x33"
+			  "\x52\x18\xc8\x30\xfe\x2d\xca\x1e"
+			  "\x79\x92\x7a\x60\x5c\xb6\x58\x87"
+			  "\xa4\x36\xa2\x67\x92\x8b\xa4\xb7"
+			  "\xf1\x86\xdf\xdc\xc0\x7e\x8f\x63"
+			  "\xd2\xa2\xdc\x78\xeb\x4f\xd8\x96"
+			  "\x47\xca\xb8\x91\xf9\xf7\x94\x21"
+			  "\x5f\x9a\x9f\x5b\xb8\x40\x41\x4b"
+			  "\x66\x69\x6a\x72\xd0\xcb\x70\xb7"
+			  "\x93\xb5\x37\x96\x05\x37\x4f\xe5"
+			  "\x8c\xa7\x5a\x4e\x8b\xb7\x84\xea"
+			  "\xc7\xfc\x19\x6e\x1f\x5a\xa1\xac"
+			  "\x18\x7d\x52\x3b\xb3\x34\x62\x99"
+			  "\xe4\x9e\x31\x04\x3f\xc0\x8d\x84"
+			  "\x17\x7c\x25\x48\x52\x67\x11\x27"
+			  "\x67\xbb\x5a\x85\xca\x56\xb2\x5c"
+			  "\xe6\xec\xd5\x96\x3d\x15\xfc\xfb"
+			  "\x22\x25\xf4\x13\xe5\x93\x4b\x9a"
+			  "\x77\xf1\x52\x18\xfa\x16\x5e\x49"
+			  "\x03\x45\xa8\x08\xfa\xb3\x41\x92"
+			  "\x79\x50\x33\xca\xd0\xd7\x42\x55"
+			  "\xc3\x9a\x0c\x4e\xd9\xa4\x3c\x86"
+			  "\x80\x9f\x53\xd1\xa4\x2e\xd1\xbc"
+			  "\xf1\x54\x6e\x93\xa4\x65\x99\x8e"
+			  "\xdf\x29\xc0\x64\x63\x07\xbb\xea",
+		.ctext	= "\xe0\x33\xf6\xe0\xb4\xa5\xdd\x2b"
+			  "\xdd\xce\xfc\x12\x1e\xfc\x2d\xf2"
+			  "\x8b\xc7\xeb\xc1\xc4\x2a\xe8\x44"
+			  "\x0f\x3d\x97\x19\x2e\x6d\xa2\x38"
+			  "\x9d\xa6\xaa\xe1\x96\xb9\x08\xe8"
+			  "\x0b\x70\x48\x5c\xed\xb5\x9b\xcb"
+			  "\x8b\x40\x88\x7e\x69\x73\xf7\x16"
+			  "\x71\xbb\x5b\xfc\xa3\x47\x5d\xa6"
+			  "\xae\x3a\x64\xc4\xe7\xb8\xa8\xe7"
+			  "\xb1\x32\x19\xdb\xe3\x01\xb8\xf0"
+			  "\xa4\x86\xb4\x4c\xc2\xde\x5c\xd2"
+			  "\x6c\x77\xd2\xe8\x18\xb7\x0a\xc9"
+			  "\x3d\x53\xb5\xc4\x5c\xf0\x8c\x06"
+			  "\xdc\x90\xe0\x74\x47\x1b\x0b\xf6"
+			  "\xd2\x71\x6b\xc4\xf1\x97\x00\x2d"
+			  "\x63\x57\x44\x1f\x8c\xf4\xe6\x9b"
+			  "\xe0\x7a\xdd\xec\x32\x73\x42\x32"
+			  "\x7f\x35\x67\x60\x0d\xcf\x10\x52"
+			  "\x61\x22\x53\x8d\x8e\xbb\x33\x76"
+			  "\x59\xd9\x10\xce\xdf\xef\xc0\x41"
+			  "\xd5\x33\x29\x6a\xda\x46\xa4\x51"
+			  "\xf0\x99\x3d\x96\x31\xdd\xb5\xcb"
+			  "\x3e\x2a\x1f\xc7\x5c\x79\xd3\xc5"
+			  "\x20\xa1\xb1\x39\x1b\xc6\x0a\x70"
+			  "\x26\x39\x95\x07\xad\x7a\xc9\x69"
+			  "\xfe\x81\xc7\x88\x08\x38\xaf\xad"
+			  "\x9e\x8d\xfb\xe8\x24\x0d\x22\xb8"
+			  "\x0e\xed\xbe\x37\x53\x7c\xa6\xc6"
+			  "\x78\x62\xec\xa3\x59\xd9\xc6\x9d"
+			  "\xb8\x0e\x69\x77\x84\x2d\x6a\x4c"
+			  "\xc5\xd9\xb2\xa0\x2b\xa8\x80\xcc"
+			  "\xe9\x1e\x9c\x5a\xc4\xa1\xb2\x37"
+			  "\x06\x9b\x30\x32\x67\xf7\xe7\xd2"
+			  "\x42\xc7\xdf\x4e\xd4\xcb\xa0\x12"
+			  "\x94\xa1\x34\x85\x93\x50\x4b\x0a"
+			  "\x3c\x7d\x49\x25\x01\x41\x6b\x96"
+			  "\xa9\x12\xbb\x0b\xc0\xd7\xd0\x93"
+			  "\x1f\x70\x38\xb8\x21\xee\xf6\xa7"
+			  "\xee\xeb\xe7\x81\xa4\x13\xb4\x87"
+			  "\xfa\xc1\xb0\xb5\x37\x8b\x74\xa2"
+			  "\x4e\xc7\xc2\xad\x3d\x62\x3f\xf8"
+			  "\x34\x42\xe5\xae\x45\x13\x63\xfe"
+			  "\xfc\x2a\x17\x46\x61\xa9\xd3\x1c"
+			  "\x4c\xaf\xf0\x09\x62\x26\x66\x1e"
+			  "\x74\xcf\xd6\x68\x3d\x7d\xd8\xb7"
+			  "\xe7\xe6\xf8\xf0\x08\x20\xf7\x47"
+			  "\x1c\x52\xaa\x0f\x3e\x21\xa3\xf2"
+			  "\xbf\x2f\x95\x16\xa8\xc8\xc8\x8c"
+			  "\x99\x0f\x5d\xfb\xfa\x2b\x58\x8a"
+			  "\x7e\xd6\x74\x02\x60\xf0\xd0\x5b"
+			  "\x65\xa8\xac\xea\x8d\x68\x46\x34"
+			  "\x26\x9d\x4f\xb1\x9a\x8e\xc0\x1a"
+			  "\xf1\xed\xc6\x7a\x83\xfd\x8a\x57"
+			  "\xf2\xe6\xe4\xba\xfc\xc6\x3c\xad"
+			  "\x5b\x19\x50\x2f\x3a\xcc\x06\x46"
+			  "\x04\x51\x3f\x91\x97\xf0\xd2\x07"
+			  "\xe7\x93\x89\x7e\xb5\x32\x0f\x03"
+			  "\xe5\x58\x9e\x74\x72\xeb\xc2\x38"
+			  "\x00\x0c\x91\x72\x69\xed\x7d\x6d"
+			  "\xc8\x71\xf0\xec\xff\x80\xd9\x1c"
+			  "\x9e\xd2\xfa\x15\xfc\x6c\x4e\xbc"
+			  "\xb1\xa6\xbd\xbd\x70\x40\xca\x20"
+			  "\xb8\x78\xd2\xa3\xc6\xf3\x79\x9c"
+			  "\xc7\x27\xe1\x6a\x29\xad\xa4\x03",
+		.len	= 512,
+	}
+};
+
 /*
  * CTS (Cipher Text Stealing) mode tests
  */
-- 
2.19.1.331.ge82ca0e54c-goog
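
As a rough illustration of the ->cra_priority weighting in
adiantum_create() above, here is a minimal userspace sketch.  The
function name adiantum_weighted_priority() is hypothetical and not part
of the patch; only the 4:2:1 weights and the division by 7 are taken
from the code.

```c
#include <assert.h>

/*
 * Hypothetical helper mirroring the weighting in adiantum_create():
 * stream cipher x4, hash x2, block cipher x1, divided by the total
 * weight of 7 (integer division, as in the kernel code).
 */
int adiantum_weighted_priority(int stream_prio, int hash_prio,
			       int block_prio)
{
	assert(stream_prio >= 0 && hash_prio >= 0 && block_prio >= 0);
	return (4 * stream_prio + 2 * hash_prio + block_prio) / 7;
}
```

For example, priorities of 300 (stream cipher), 200 (hash) and 100
(block cipher) give 1700 / 7 = 242, so the block cipher contributes
only one seventh of the result.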



* [RFC PATCH v2 12/12] fscrypt: add Adiantum support
  2018-10-15 17:54 [RFC PATCH v2 00/12] crypto: Adiantum support Eric Biggers
                   ` (10 preceding siblings ...)
  2018-10-15 17:54 ` [RFC PATCH v2 11/12] crypto: adiantum - add Adiantum support Eric Biggers
@ 2018-10-15 17:54 ` Eric Biggers
  2018-10-19 15:58 ` [RFC PATCH v2 00/12] crypto: " Jason A. Donenfeld
  12 siblings, 0 replies; 54+ messages in thread
From: Eric Biggers @ 2018-10-15 17:54 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-fscrypt, linux-arm-kernel, linux-kernel, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

From: Eric Biggers <ebiggers@google.com>

Add support for the Adiantum encryption mode to fscrypt.  Adiantum is a
tweakable, length-preserving encryption mode with security provably
reducible to that of XChaCha12 and AES-256, subject to a security bound.
It's also a true wide-block mode, unlike XTS.  See the paper
"Adiantum: length-preserving encryption for entry-level processors"
(https://eprint.iacr.org/2018/720.pdf) for full details;
also see the crypto API patch which added Adiantum.

On sufficiently long inputs, Adiantum's performance-critical parts are
XChaCha12 and the NH hash function.  These algorithms are fast even on
processors without dedicated crypto instructions.  Adiantum makes it
feasible to enable storage encryption on low-end mobile devices that
lack AES instructions; currently such devices are unencrypted.  On ARM
Cortex-A7, on 4096-byte messages Adiantum encryption is about 4 times
faster than AES-256-XTS encryption; decryption is about 5 times faster.

Adiantum is also suitable to replace CTS-CBC for fscrypt's filenames
encryption, fixing the information leakage in encrypted filenames when
two filenames in a directory share a common prefix.

Adiantum accepts long IVs, so in fscrypt we include the 16-byte
per-inode nonce in the IVs too, and we allow userspace to choose to use
the master key directly rather than deriving per-file keys.  Direct use
is especially desirable because each Adiantum tfm (crypto_skcipher)
uses more memory than an AES-XTS tfm, since Adiantum has more sub-tfms as
well as a long subkey for NH.
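
To make the IV construction concrete, here is a minimal userspace
sketch of one possible layout.  The 32-byte IV size matches the
Adiantum test vectors and the nonce is 16 bytes as stated above; the
field order (little-endian block number first, then the nonce, then
zero padding) and the helper name build_adiantum_iv() are assumptions
made for illustration, not a statement of what the patch does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ADIANTUM_IV_SIZE	32	/* IV length used by the test vectors */
#define NONCE_SIZE		16	/* per-inode nonce length */

/*
 * Hypothetical helper: iv = le64(lblk_num) || nonce || zero padding.
 * The ordering and the padding are assumed for this sketch.
 */
void build_adiantum_iv(uint8_t iv[ADIANTUM_IV_SIZE], uint64_t lblk_num,
		       const uint8_t nonce[NONCE_SIZE])
{
	int i;

	assert(iv != NULL && nonce != NULL);
	memset(iv, 0, ADIANTUM_IV_SIZE);
	/* Store the logical block number in little-endian byte order. */
	for (i = 0; i < 8; i++)
		iv[i] = (uint8_t)(lblk_num >> (8 * i));
	memcpy(iv + 8, nonce, NONCE_SIZE);
}
```

Because the nonce differs per inode, two files encrypted under the same
key get different IVs even for the same logical block number.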

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 Documentation/filesystems/fscrypt.rst | 183 ++++++++-------
 fs/crypto/crypto.c                    |  35 +--
 fs/crypto/fname.c                     |  22 +-
 fs/crypto/fscrypt_private.h           |  66 +++++-
 fs/crypto/keyinfo.c                   | 322 ++++++++++++++++++++------
 fs/crypto/policy.c                    |   5 +-
 include/uapi/linux/fs.h               |   4 +-
 7 files changed, 450 insertions(+), 187 deletions(-)
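
The filename padding rule described in this patch's fscrypt.rst changes
(NUL-pad to at least 16 bytes, then to the next 4, 8, 16, or 32-byte
boundary) can be sketched in userspace as follows.  The helper name
padded_fname_len() and the FNAME_MIN_CRYPT_LEN macro are hypothetical,
and any clamping to a filesystem's maximum name length is deliberately
left out.

```c
#include <assert.h>
#include <stddef.h>

#define FNAME_MIN_CRYPT_LEN 16	/* shorter names are NUL-padded to this */

/*
 * Hypothetical helper: length of a filename after NUL-padding.
 * 'padding' is the policy's padding boundary: 4, 8, 16, or 32.
 */
size_t padded_fname_len(size_t name_len, size_t padding)
{
	size_t len = name_len < FNAME_MIN_CRYPT_LEN ?
		     FNAME_MIN_CRYPT_LEN : name_len;

	assert(padding == 4 || padding == 8 || padding == 16 || padding == 32);
	/* Round up to the next multiple of the (power-of-two) boundary. */
	return (len + padding - 1) & ~(padding - 1);
}
```

With 32-byte padding, a 17-byte and a 30-byte filename both encrypt to
32 bytes, which is why the larger boundary hides more about the name
length.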

diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
index cfbc18f0d9c98..63e949b1c2ee6 100644
--- a/Documentation/filesystems/fscrypt.rst
+++ b/Documentation/filesystems/fscrypt.rst
@@ -132,47 +132,31 @@ designed for this purpose be used, such as scrypt, PBKDF2, or Argon2.
 Per-file keys
 -------------
 
-Master keys are not used to encrypt file contents or names directly.
-Instead, a unique key is derived for each encrypted file, including
-each regular file, directory, and symbolic link.  This has several
-advantages:
-
-- In cryptosystems, the same key material should never be used for
-  different purposes.  Using the master key as both an XTS key for
-  contents encryption and as a CTS-CBC key for filenames encryption
-  would violate this rule.
-- Per-file keys simplify the choice of IVs (Initialization Vectors)
-  for contents encryption.  Without per-file keys, to ensure IV
-  uniqueness both the inode and logical block number would need to be
-  encoded in the IVs.  This would make it impossible to renumber
-  inodes, which e.g. ``resize2fs`` can do when resizing an ext4
-  filesystem.  With per-file keys, it is sufficient to encode just the
-  logical block number in the IVs.
-- Per-file keys strengthen the encryption of filenames, where IVs are
-  reused out of necessity.  With a unique key per directory, IV reuse
-  is limited to within a single directory.
-- Per-file keys allow individual files to be securely erased simply by
-  securely erasing their keys.  (Not yet implemented.)
-
-A KDF (Key Derivation Function) is used to derive per-file keys from
-the master key.  This is done instead of wrapping a randomly-generated
-key for each file because it reduces the size of the encryption xattr,
-which for some filesystems makes the xattr more likely to fit in-line
-in the filesystem's inode table.  With a KDF, only a 16-byte nonce is
-required --- long enough to make key reuse extremely unlikely.  A
-wrapped key, on the other hand, would need to be up to 64 bytes ---
-the length of an AES-256-XTS key.  Furthermore, currently there is no
-requirement to support unlocking a file with multiple alternative
-master keys or to support rotating master keys.  Instead, the master
-keys may be wrapped in userspace, e.g. as done by the `fscrypt
-<https://github.com/google/fscrypt>`_ tool.
-
-The current KDF encrypts the master key using the 16-byte nonce as an
-AES-128-ECB key.  The output is used as the derived key.  If the
-output is longer than needed, then it is truncated to the needed
-length.  Truncation is the norm for directories and symlinks, since
-those use the CTS-CBC encryption mode which requires a key half as
-long as that required by the XTS encryption mode.
+Since each master key can protect many files, it is necessary to
+"tweak" the encryption of each file, so that the same plaintext in two
+files doesn't map to the same ciphertext, or vice versa.
+
+In most cases, fscrypt solves this problem by deriving per-file keys.
+When a new encrypted inode (regular file, directory, or symlink) is
+created, fscrypt generates a 16-byte nonce uniformly at random and
+stores it in the inode's encryption xattr.  Then, a KDF (Key
+Derivation Function) is used to derive an inode-specific encryption
+key from the master key and nonce.
+
+The Adiantum encryption mode (see `Encryption modes and usage`_) is
+special, since it accepts long IVs and is suited for both contents and
+filenames encryption.  For it, fscrypt includes the 16-byte nonce in
+the IVs, and users can choose to use the master key for Adiantum
+encryption directly, for added efficiency.  In this configuration, no
+per-file keys are derived.  However, when doing this, users must take
+care not to reuse the same master key for any other modes.
+
+Below, the KDF and design considerations are described in more detail.
+
+The current KDF works by encrypting the master key with AES-128-ECB,
+using the 16-byte nonce as the AES key.  The output is used as the
+derived key.  If the output is longer than needed, then it is
+truncated to the needed length.
 
 Note: this KDF meets the primary security requirement, which is to
 produce unique derived keys that preserve the entropy of the master
@@ -181,6 +165,28 @@ However, it is nonstandard and has some problems such as being
 reversible, so it is generally considered to be a mistake!  It may be
 replaced with HKDF or another more standard KDF in the future.
 
+Key derivation was chosen over key wrapping because wrapped keys would
+require larger xattrs which would be less likely to fit in-line in the
+filesystem's inode table, and there didn't appear to be any
+significant advantages to key wrapping.  In particular, currently
+there is no requirement to support unlocking a file with multiple
+alternative master keys or to support rotating master keys.  Instead,
+the master keys may be wrapped in userspace, e.g. as done by the
+`fscrypt <https://github.com/google/fscrypt>`_ tool.
+
+Including the inode number in the IVs was considered.  However, it was
+rejected as it would have prevented ext4 filesystems from being
+resized, and by itself still wouldn't have been sufficient to prevent
+the same key from being directly reused for both XTS and CTS-CBC.
+
+Including the per-inode nonce in the IVs would allow filesystem
+resizing, but it wouldn't allow key reuse between two different modes,
+nor would it be compatible with XTS and CTS-CBC since a 16-byte IV
+isn't long enough to contain both a collision-resistant nonce and a
+block offset.  However, this method works well for Adiantum, which
+accepts longer IVs and is suited for both contents and filenames
+encryption.
+
 Encryption modes and usage
 ==========================
 
@@ -191,54 +197,72 @@ Currently, the following pairs of encryption modes are supported:
 
 - AES-256-XTS for contents and AES-256-CTS-CBC for filenames
 - AES-128-CBC for contents and AES-128-CTS-CBC for filenames
+- Adiantum for both contents and filenames
+
+If unsure, you should use the (AES-256-XTS, AES-256-CTS-CBC) pair.
 
-It is strongly recommended to use AES-256-XTS for contents encryption.
 AES-128-CBC was added only for low-powered embedded devices with
 crypto accelerators such as CAAM or CESA that do not support XTS.
 
+Adiantum is a (primarily) stream cipher-based mode that was designed
+for CPUs that don't support AES instructions.  Though Adiantum
+actually provides a stronger formal notion of security than XTS and
+CTS-CBC (since it's a true wide-block mode), it's also newer and
+depends on the security of XChaCha12 in addition to AES-256.
+
 New encryption modes can be added relatively easily, without changes
 to individual filesystems.  However, authenticated encryption (AE)
 modes are not currently supported because of the difficulty of dealing
 with ciphertext expansion.
 
+Contents encryption
+-------------------
+
 For file contents, each filesystem block is encrypted independently.
 Currently, only the case where the filesystem block size is equal to
-the system's page size (usually 4096 bytes) is supported.  With the
-XTS mode of operation (recommended), the logical block number within
-the file is used as the IV.  With the CBC mode of operation (not
-recommended), ESSIV is used; specifically, the IV for CBC is the
-logical block number encrypted with AES-256, where the AES-256 key is
-the SHA-256 hash of the inode's data encryption key.
-
-For filenames, the full filename is encrypted at once.  Because of the
-requirements to retain support for efficient directory lookups and
-filenames of up to 255 bytes, a constant initialization vector (IV) is
-used.  However, each encrypted directory uses a unique key, which
-limits IV reuse to within a single directory.  Note that IV reuse in
-the context of CTS-CBC encryption means that when the original
-filenames share a common prefix at least as long as the cipher block
-size (16 bytes for AES), the corresponding encrypted filenames will
-also share a common prefix.  This is undesirable; it may be fixed in
-the future by switching to an encryption mode that is a strong
-pseudorandom permutation on arbitrary-length messages, e.g. the HEH
-(Hash-Encrypt-Hash) mode.
-
-Since filenames are encrypted with the CTS-CBC mode of operation, the
-plaintext and ciphertext filenames need not be multiples of the AES
-block size, i.e. 16 bytes.  However, the minimum size that can be
-encrypted is 16 bytes, so shorter filenames are NUL-padded to 16 bytes
-before being encrypted.  In addition, to reduce leakage of filename
-lengths via their ciphertexts, all filenames are NUL-padded to the
-next 4, 8, 16, or 32-byte boundary (configurable).  32 is recommended
-since this provides the best confidentiality, at the cost of making
-directory entries consume slightly more space.  Note that since NUL
-(``\0``) is not otherwise a valid character in filenames, the padding
-will never produce duplicate plaintexts.
+the system's page size (usually 4096 bytes) is supported.  IVs are
+chosen as follows, depending on the contents encryption mode:
+
+- XTS: the logical block number within the file.
+- CBC: ESSIV, specifically the logical block number within the file
+  encrypted with AES-256, where the AES-256 key is the SHA-256 hash of
+  the inode's data encryption key.
+- Adiantum: the logical block number within the file concatenated with
+  the per-file nonce.  (As noted earlier, including the nonce makes it
+  safe to use the same key for many files.)
+
+Filenames encryption
+--------------------
+
+For filenames, each full filename is encrypted at once.  Because of
+the requirements to retain support for efficient directory lookups and
+filenames of up to 255 bytes, the same IV is used for every filename
+in a directory.
+
+However, each encrypted directory still uses a unique key, or
+alternatively (for Adiantum) has the inode's nonce included in the
+IV.  Thus, IV reuse is limited to within a single directory.
+
+With CTS-CBC, the IV reuse means that when the plaintext filenames
+share a common prefix at least as long as the cipher block size (16
+bytes for AES), the corresponding encrypted filenames will also share
+a common prefix.  This is undesirable.  Adiantum does not have this
+weakness, as it is a super-pseudorandom permutation.
+
+All supported filenames encryption modes accept any plaintext length
+>= 16 bytes; cipher block alignment is not required.  However,
+filenames shorter than 16 bytes are NUL-padded to 16 bytes before
+being encrypted.  In addition, to reduce leakage of filename lengths
+via their ciphertexts, all filenames are NUL-padded to the next 4, 8,
+16, or 32-byte boundary (configurable).  32 is recommended since this
+provides the best confidentiality, at the cost of making directory
+entries consume slightly more space.  Note that since NUL (``\0``) is
+not otherwise a valid character in filenames, the padding will never
+produce duplicate plaintexts.
 
 Symbolic link targets are considered a type of filename and are
-encrypted in the same way as filenames in directory entries.  Each
-symlink also uses a unique key; hence, the hardcoded IV is not a
-problem for symlinks.
+encrypted in the same way as filenames in directory entries, except
+that IV reuse is not a problem as each symlink has its own inode.
 
 User API
 ========
@@ -272,9 +296,12 @@ This structure must be initialized as follows:
   and FS_ENCRYPTION_MODE_AES_256_CTS (4) for
   ``filenames_encryption_mode``.
 
-- ``flags`` must be set to a value from ``<linux/fs.h>`` which
+- ``flags`` must contain a value from ``<linux/fs.h>`` which
   identifies the amount of NUL-padding to use when encrypting
-  filenames.  If unsure, use FS_POLICY_FLAGS_PAD_32 (0x3).
+  filenames.  In addition, if the chosen encryption modes are both
+  FS_ENCRYPTION_MODE_ADIANTUM, this can contain FS_POLICY_FLAGS_DIRECT
+  to specify that the master key should be used directly, without key
+  derivation.  If unsure, use FS_POLICY_FLAGS_PAD_32 (0x3).
 
 - ``master_key_descriptor`` specifies how to find the master key in
   the keyring; see `Adding keys`_.  It is up to userspace to choose a
diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
index 0f46cf550907f..5dd59e790ad31 100644
--- a/fs/crypto/crypto.c
+++ b/fs/crypto/crypto.c
@@ -133,15 +133,32 @@ struct fscrypt_ctx *fscrypt_get_ctx(const struct inode *inode, gfp_t gfp_flags)
 }
 EXPORT_SYMBOL(fscrypt_get_ctx);
 
+void fscrypt_prepare_iv(union fscrypt_iv *iv, u64 lblk_num,
+			const struct fscrypt_info *ci)
+{
+	if (ci->ci_mode->uses_long_ivs) {
+		/* Using lblk_num and nonce as tweak */
+		iv->long_iv.index = cpu_to_le64(lblk_num);
+		memcpy(iv->long_iv.nonce, ci->ci_nonce,
+		       FS_KEY_DERIVATION_NONCE_SIZE);
+		iv->long_iv.unused = 0;
+	} else {
+		/* Using only lblk_num as tweak (key was derived from nonce) */
+		iv->short_iv.index = cpu_to_le64(lblk_num);
+		iv->short_iv.unused = 0;
+		if (ci->ci_essiv_tfm != NULL) {
+			crypto_cipher_encrypt_one(ci->ci_essiv_tfm, (u8 *)iv,
+						  (const u8 *)iv);
+		}
+	}
+}
+
 int fscrypt_do_page_crypto(const struct inode *inode, fscrypt_direction_t rw,
 			   u64 lblk_num, struct page *src_page,
 			   struct page *dest_page, unsigned int len,
 			   unsigned int offs, gfp_t gfp_flags)
 {
-	struct {
-		__le64 index;
-		u8 padding[FS_IV_SIZE - sizeof(__le64)];
-	} iv;
+	union fscrypt_iv iv;
 	struct skcipher_request *req = NULL;
 	DECLARE_CRYPTO_WAIT(wait);
 	struct scatterlist dst, src;
@@ -151,15 +168,7 @@ int fscrypt_do_page_crypto(const struct inode *inode, fscrypt_direction_t rw,
 
 	BUG_ON(len == 0);
 
-	BUILD_BUG_ON(sizeof(iv) != FS_IV_SIZE);
-	BUILD_BUG_ON(AES_BLOCK_SIZE != FS_IV_SIZE);
-	iv.index = cpu_to_le64(lblk_num);
-	memset(iv.padding, 0, sizeof(iv.padding));
-
-	if (ci->ci_essiv_tfm != NULL) {
-		crypto_cipher_encrypt_one(ci->ci_essiv_tfm, (u8 *)&iv,
-					  (u8 *)&iv);
-	}
+	fscrypt_prepare_iv(&iv, lblk_num, ci);
 
 	req = skcipher_request_alloc(tfm, gfp_flags);
 	if (!req)
diff --git a/fs/crypto/fname.c b/fs/crypto/fname.c
index d7a0f682ca122..c60ae8f70395f 100644
--- a/fs/crypto/fname.c
+++ b/fs/crypto/fname.c
@@ -40,10 +40,11 @@ int fname_encrypt(struct inode *inode, const struct qstr *iname,
 {
 	struct skcipher_request *req = NULL;
 	DECLARE_CRYPTO_WAIT(wait);
-	struct crypto_skcipher *tfm = inode->i_crypt_info->ci_ctfm;
-	int res = 0;
-	char iv[FS_CRYPTO_BLOCK_SIZE];
+	struct fscrypt_info *ci = inode->i_crypt_info;
+	struct crypto_skcipher *tfm = ci->ci_ctfm;
+	union fscrypt_iv iv;
 	struct scatterlist sg;
+	int res;
 
 	/*
 	 * Copy the filename to the output buffer for encrypting in-place and
@@ -55,7 +56,7 @@ int fname_encrypt(struct inode *inode, const struct qstr *iname,
 	memset(out + iname->len, 0, olen - iname->len);
 
 	/* Initialize the IV */
-	memset(iv, 0, FS_CRYPTO_BLOCK_SIZE);
+	fscrypt_prepare_iv(&iv, 0, ci);
 
 	/* Set up the encryption request */
 	req = skcipher_request_alloc(tfm, GFP_NOFS);
@@ -65,7 +66,7 @@ int fname_encrypt(struct inode *inode, const struct qstr *iname,
 			CRYPTO_TFM_REQ_MAY_BACKLOG | CRYPTO_TFM_REQ_MAY_SLEEP,
 			crypto_req_done, &wait);
 	sg_init_one(&sg, out, olen);
-	skcipher_request_set_crypt(req, &sg, &sg, olen, iv);
+	skcipher_request_set_crypt(req, &sg, &sg, olen, &iv);
 
 	/* Do the encryption */
 	res = crypto_wait_req(crypto_skcipher_encrypt(req), &wait);
@@ -94,9 +95,10 @@ static int fname_decrypt(struct inode *inode,
 	struct skcipher_request *req = NULL;
 	DECLARE_CRYPTO_WAIT(wait);
 	struct scatterlist src_sg, dst_sg;
-	struct crypto_skcipher *tfm = inode->i_crypt_info->ci_ctfm;
-	int res = 0;
-	char iv[FS_CRYPTO_BLOCK_SIZE];
+	struct fscrypt_info *ci = inode->i_crypt_info;
+	struct crypto_skcipher *tfm = ci->ci_ctfm;
+	union fscrypt_iv iv;
+	int res;
 
 	/* Allocate request */
 	req = skcipher_request_alloc(tfm, GFP_NOFS);
@@ -107,12 +109,12 @@ static int fname_decrypt(struct inode *inode,
 		crypto_req_done, &wait);
 
 	/* Initialize IV */
-	memset(iv, 0, FS_CRYPTO_BLOCK_SIZE);
+	fscrypt_prepare_iv(&iv, 0, ci);
 
 	/* Create decryption request */
 	sg_init_one(&src_sg, iname->name, iname->len);
 	sg_init_one(&dst_sg, oname->name, oname->len);
-	skcipher_request_set_crypt(req, &src_sg, &dst_sg, iname->len, iv);
+	skcipher_request_set_crypt(req, &src_sg, &dst_sg, iname->len, &iv);
 	res = crypto_wait_req(crypto_skcipher_decrypt(req), &wait);
 	skcipher_request_free(req);
 	if (res < 0) {
diff --git a/fs/crypto/fscrypt_private.h b/fs/crypto/fscrypt_private.h
index 79debfc9cef91..a3ec179051dc2 100644
--- a/fs/crypto/fscrypt_private.h
+++ b/fs/crypto/fscrypt_private.h
@@ -17,7 +17,6 @@
 #include <crypto/hash.h>
 
 /* Encryption parameters */
-#define FS_IV_SIZE			16
 #define FS_KEY_DERIVATION_NONCE_SIZE	16
 
 /**
@@ -52,16 +51,42 @@ struct fscrypt_symlink_data {
 } __packed;
 
 /*
- * A pointer to this structure is stored in the file system's in-core
- * representation of an inode.
+ * fscrypt_info - the "encryption key" for an inode
+ *
+ * When an encrypted file's key is made available, an instance of this struct is
+ * allocated and stored in ->i_crypt_info.  Once created, it remains until the
+ * inode is evicted.
  */
 struct fscrypt_info {
+
+	/* The actual crypto transform used for encryption and decryption */
+	struct crypto_skcipher *ci_ctfm;
+
+	/*
+	 * Cipher for ESSIV IV generation.  Only set for CBC contents
+	 * encryption, otherwise is NULL.
+	 */
+	struct crypto_cipher *ci_essiv_tfm;
+
+	/*
+	 * Encryption mode used for this inode.  It corresponds to either
+	 * ci_data_mode or ci_filename_mode, depending on the inode type.
+	 */
+	struct fscrypt_mode *ci_mode;
+
+	/*
+	 * If non-NULL, then this inode uses a master key directly rather than a
+	 * derived key, and ci_ctfm will equal ci_master_key->mk_ctfm.
+	 * Otherwise, this inode uses a derived key.
+	 */
+	struct fscrypt_master_key *ci_master_key;
+
+	/* fields from the fscrypt_context */
 	u8 ci_data_mode;
 	u8 ci_filename_mode;
 	u8 ci_flags;
-	struct crypto_skcipher *ci_ctfm;
-	struct crypto_cipher *ci_essiv_tfm;
-	u8 ci_master_key[FS_KEY_DESCRIPTOR_SIZE];
+	u8 ci_master_key_descriptor[FS_KEY_DESCRIPTOR_SIZE];
+	u8 ci_nonce[FS_KEY_DERIVATION_NONCE_SIZE];
 };
 
 typedef enum {
@@ -83,6 +108,10 @@ static inline bool fscrypt_valid_enc_modes(u32 contents_mode,
 	    filenames_mode == FS_ENCRYPTION_MODE_AES_256_CTS)
 		return true;
 
+	if (contents_mode == FS_ENCRYPTION_MODE_ADIANTUM &&
+	    filenames_mode == FS_ENCRYPTION_MODE_ADIANTUM)
+		return true;
+
 	return false;
 }
 
@@ -107,6 +136,21 @@ fscrypt_msg(struct super_block *sb, const char *level, const char *fmt, ...);
 #define fscrypt_err(sb, fmt, ...)		\
 	fscrypt_msg(sb, KERN_ERR, fmt, ##__VA_ARGS__)
 
+union fscrypt_iv {
+	struct {
+		__le64 index;
+		__le64 unused;
+	} short_iv;
+	struct {
+		__le64 index;
+		u8 nonce[FS_KEY_DERIVATION_NONCE_SIZE];
+		__le64 unused;
+	} long_iv;
+};
+
+void fscrypt_prepare_iv(union fscrypt_iv *iv, u64 lblk_num,
+			const struct fscrypt_info *ci);
+
 /* fname.c */
 extern int fname_encrypt(struct inode *inode, const struct qstr *iname,
 			 u8 *out, unsigned int olen);
@@ -115,6 +159,16 @@ extern bool fscrypt_fname_encrypted_size(const struct inode *inode,
 					 u32 *encrypted_len_ret);
 
 /* keyinfo.c */
+
+struct fscrypt_mode {
+	const char *friendly_name;
+	const char *cipher_str;
+	int keysize;
+	bool logged_impl_name;
+	bool uses_essiv;
+	bool uses_long_ivs;
+};
+
 extern void __exit fscrypt_essiv_cleanup(void);
 
 #endif /* _FSCRYPT_PRIVATE_H */
diff --git a/fs/crypto/keyinfo.c b/fs/crypto/keyinfo.c
index 7874c9bb2fc53..49bfbe8c15cd4 100644
--- a/fs/crypto/keyinfo.c
+++ b/fs/crypto/keyinfo.c
@@ -10,15 +10,20 @@
  */
 
 #include <keys/user-type.h>
+#include <linux/hashtable.h>
 #include <linux/scatterlist.h>
 #include <linux/ratelimit.h>
 #include <crypto/aes.h>
+#include <crypto/algapi.h>
 #include <crypto/sha.h>
 #include <crypto/skcipher.h>
 #include "fscrypt_private.h"
 
 static struct crypto_shash *essiv_hash_tfm;
 
+static DEFINE_HASHTABLE(fscrypt_master_keys, 6); /* 6 bits = 64 buckets */
+static DEFINE_SPINLOCK(fscrypt_master_keys_lock);
+
 /*
  * Key derivation function.  This generates the derived key by encrypting the
  * master key with AES-128-ECB using the inode's nonce as the AES key.
@@ -123,37 +128,7 @@ find_and_lock_process_key(const char *prefix,
 	return ERR_PTR(-ENOKEY);
 }
 
-/* Find the master key, then derive the inode's actual encryption key */
-static int find_and_derive_key(const struct inode *inode,
-			       const struct fscrypt_context *ctx,
-			       u8 *derived_key, unsigned int derived_keysize)
-{
-	struct key *key;
-	const struct fscrypt_key *payload;
-	int err;
-
-	key = find_and_lock_process_key(FS_KEY_DESC_PREFIX,
-					ctx->master_key_descriptor,
-					derived_keysize, &payload);
-	if (key == ERR_PTR(-ENOKEY) && inode->i_sb->s_cop->key_prefix) {
-		key = find_and_lock_process_key(inode->i_sb->s_cop->key_prefix,
-						ctx->master_key_descriptor,
-						derived_keysize, &payload);
-	}
-	if (IS_ERR(key))
-		return PTR_ERR(key);
-	err = derive_key_aes(payload->raw, ctx, derived_key, derived_keysize);
-	up_read(&key->sem);
-	key_put(key);
-	return err;
-}
-
-static struct fscrypt_mode {
-	const char *friendly_name;
-	const char *cipher_str;
-	int keysize;
-	bool logged_impl_name;
-} available_modes[] = {
+static struct fscrypt_mode available_modes[] = {
 	[FS_ENCRYPTION_MODE_AES_256_XTS] = {
 		.friendly_name = "AES-256-XTS",
 		.cipher_str = "xts(aes)",
@@ -168,12 +143,19 @@ static struct fscrypt_mode {
 		.friendly_name = "AES-128-CBC",
 		.cipher_str = "cbc(aes)",
 		.keysize = 16,
+		.uses_essiv = true,
 	},
 	[FS_ENCRYPTION_MODE_AES_128_CTS] = {
 		.friendly_name = "AES-128-CTS-CBC",
 		.cipher_str = "cts(cbc(aes))",
 		.keysize = 16,
 	},
+	[FS_ENCRYPTION_MODE_ADIANTUM] = {
+		.friendly_name = "Adiantum",
+		.cipher_str = "adiantum(xchacha12,aes)",
+		.keysize = 32,
+		.uses_long_ivs = true,
+	},
 };
 
 static struct fscrypt_mode *
@@ -198,14 +180,178 @@ select_encryption_mode(const struct fscrypt_info *ci, const struct inode *inode)
 	return ERR_PTR(-EINVAL);
 }
 
-static void put_crypt_info(struct fscrypt_info *ci)
+/* Find the master key, then derive the inode's actual encryption key */
+static int find_and_derive_key(const struct inode *inode,
+			       const struct fscrypt_context *ctx,
+			       u8 *derived_key, const struct fscrypt_mode *mode)
 {
-	if (!ci)
+	struct key *key;
+	const struct fscrypt_key *payload;
+	int err;
+
+	key = find_and_lock_process_key(FS_KEY_DESC_PREFIX,
+					ctx->master_key_descriptor,
+					mode->keysize, &payload);
+	if (key == ERR_PTR(-ENOKEY) && inode->i_sb->s_cop->key_prefix) {
+		key = find_and_lock_process_key(inode->i_sb->s_cop->key_prefix,
+						ctx->master_key_descriptor,
+						mode->keysize, &payload);
+	}
+	if (IS_ERR(key))
+		return PTR_ERR(key);
+
+	if (ctx->flags & FS_POLICY_FLAGS_DIRECT) {
+		if (mode->uses_long_ivs) {
+			memcpy(derived_key, payload->raw, mode->keysize);
+			err = 0;
+		} else {
+			fscrypt_warn(inode->i_sb,
+				     "direct key mode not allowed with %s",
+				     mode->friendly_name);
+			err = -EINVAL;
+		}
+	} else {
+		err = derive_key_aes(payload->raw, ctx, derived_key,
+				     mode->keysize);
+	}
+	up_read(&key->sem);
+	key_put(key);
+	return err;
+}
+
+/* Allocate and key a symmetric cipher object for the given encryption mode */
+static struct crypto_skcipher *
+allocate_skcipher_for_mode(struct fscrypt_mode *mode, const u8 *raw_key,
+			   const struct inode *inode)
+{
+	struct crypto_skcipher *tfm;
+	int err;
+
+	tfm = crypto_alloc_skcipher(mode->cipher_str, 0, 0);
+	if (IS_ERR(tfm)) {
+		fscrypt_warn(inode->i_sb,
+			     "error allocating '%s' transform for inode %lu: %ld",
+			     mode->cipher_str, inode->i_ino, PTR_ERR(tfm));
+		return tfm;
+	}
+	if (unlikely(!mode->logged_impl_name)) {
+		/*
+		 * fscrypt performance can vary greatly depending on which
+		 * crypto algorithm implementation is used.  Help people debug
+		 * performance problems by logging the ->cra_driver_name the
+		 * first time a mode is used.  Note that multiple threads can
+		 * race here, but it doesn't really matter.
+		 */
+		mode->logged_impl_name = true;
+		pr_info("fscrypt: %s using implementation \"%s\"\n",
+			mode->friendly_name,
+			crypto_skcipher_alg(tfm)->base.cra_driver_name);
+	}
+	crypto_skcipher_set_flags(tfm, CRYPTO_TFM_REQ_WEAK_KEY);
+	err = crypto_skcipher_setkey(tfm, raw_key, mode->keysize);
+	if (err)
+		goto err_free_tfm;
+
+	return tfm;
+
+err_free_tfm:
+	crypto_free_skcipher(tfm);
+	return ERR_PTR(err);
+}
+
+/* Master key referenced by FS_POLICY_FLAGS_DIRECT policy */
+struct fscrypt_master_key {
+	struct hlist_node mk_node;
+	refcount_t mk_refcount;
+	const struct fscrypt_mode *mk_mode;
+	struct crypto_skcipher *mk_ctfm;
+	u8 mk_descriptor[FS_KEY_DESCRIPTOR_SIZE];
+	u8 mk_raw[FS_MAX_KEY_SIZE];
+};
+
+static void free_master_key(struct fscrypt_master_key *mk)
+{
+	if (mk) {
+		crypto_free_skcipher(mk->mk_ctfm);
+		kzfree(mk);
+	}
+}
+
+static void put_master_key(struct fscrypt_master_key *mk)
+{
+	if (!refcount_dec_and_lock(&mk->mk_refcount, &fscrypt_master_keys_lock))
 		return;
+	hash_del(&mk->mk_node);
+	spin_unlock(&fscrypt_master_keys_lock);
 
-	crypto_free_skcipher(ci->ci_ctfm);
-	crypto_free_cipher(ci->ci_essiv_tfm);
-	kmem_cache_free(fscrypt_info_cachep, ci);
+	free_master_key(mk);
+}
+
+static struct fscrypt_master_key *
+find_or_insert_master_key(struct fscrypt_master_key *to_insert,
+			  const u8 *raw_key, const struct fscrypt_mode *mode,
+			  const struct fscrypt_info *ci)
+{
+	unsigned long hash_key;
+	struct fscrypt_master_key *mk;
+
+	BUILD_BUG_ON(sizeof(hash_key) > FS_KEY_DESCRIPTOR_SIZE);
+	memcpy(&hash_key, ci->ci_master_key_descriptor, sizeof(hash_key));
+
+	spin_lock(&fscrypt_master_keys_lock);
+	hash_for_each_possible(fscrypt_master_keys, mk, mk_node, hash_key) {
+		if (memcmp(mk->mk_descriptor, ci->ci_master_key_descriptor,
+			   FS_KEY_DESCRIPTOR_SIZE) != 0)
+			continue;
+		if (mode != mk->mk_mode ||
+		    crypto_memneq(raw_key, mk->mk_raw, mode->keysize))
+			continue;
+		/* using existing tfm with same (descriptor, mode, raw_key) */
+		refcount_inc(&mk->mk_refcount);
+		spin_unlock(&fscrypt_master_keys_lock);
+		free_master_key(to_insert);
+		return mk;
+	}
+	if (to_insert)
+		hash_add(fscrypt_master_keys, &to_insert->mk_node, hash_key);
+	spin_unlock(&fscrypt_master_keys_lock);
+	return to_insert;
+}
+
+/* Prepare to encrypt directly using the master key in the given mode */
+static struct fscrypt_master_key *
+fscrypt_get_master_key(const struct fscrypt_info *ci, struct fscrypt_mode *mode,
+		       const u8 *raw_key, const struct inode *inode)
+{
+	struct fscrypt_master_key *mk;
+	int err;
+
+	/* Is there already a tfm for this key? */
+	mk = find_or_insert_master_key(NULL, raw_key, mode, ci);
+	if (mk)
+		return mk;
+
+	/* Nope, allocate one. */
+	mk = kzalloc(sizeof(*mk), GFP_NOFS);
+	if (!mk)
+		return ERR_PTR(-ENOMEM);
+	refcount_set(&mk->mk_refcount, 1);
+	mk->mk_mode = mode;
+	mk->mk_ctfm = allocate_skcipher_for_mode(mode, raw_key, inode);
+	if (IS_ERR(mk->mk_ctfm)) {
+		err = PTR_ERR(mk->mk_ctfm);
+		mk->mk_ctfm = NULL;
+		goto err_free_mk;
+	}
+	memcpy(mk->mk_descriptor, ci->ci_master_key_descriptor,
+	       FS_KEY_DESCRIPTOR_SIZE);
+	memcpy(mk->mk_raw, raw_key, mode->keysize);
+
+	return find_or_insert_master_key(mk, raw_key, mode, ci);
+
+err_free_mk:
+	free_master_key(mk);
+	return ERR_PTR(err);
 }
 
 static int derive_essiv_salt(const u8 *key, int keysize, u8 *salt)
@@ -275,11 +421,66 @@ void __exit fscrypt_essiv_cleanup(void)
 	crypto_free_shash(essiv_hash_tfm);
 }
 
+/*
+ * Given the encryption mode and key (normally the derived key, but for
+ * FS_POLICY_FLAGS_DIRECT mode it's the master key), set up the inode's
+ * symmetric cipher transform object(s).
+ */
+static int setup_crypto_transform(struct fscrypt_info *ci,
+				  struct fscrypt_mode *mode,
+				  const u8 *raw_key, const struct inode *inode)
+{
+	struct fscrypt_master_key *mk;
+	struct crypto_skcipher *ctfm;
+	int err;
+
+	if (ci->ci_flags & FS_POLICY_FLAGS_DIRECT) {
+		mk = fscrypt_get_master_key(ci, mode, raw_key, inode);
+		if (IS_ERR(mk))
+			return PTR_ERR(mk);
+		ctfm = mk->mk_ctfm;
+	} else {
+		mk = NULL;
+		ctfm = allocate_skcipher_for_mode(mode, raw_key, inode);
+		if (IS_ERR(ctfm))
+			return PTR_ERR(ctfm);
+	}
+	ci->ci_master_key = mk;
+	ci->ci_ctfm = ctfm;
+
+	if (mode->uses_essiv && S_ISREG(inode->i_mode)) {
+		WARN_ON(mode->uses_long_ivs);
+		BUILD_BUG_ON(FIELD_SIZEOF(union fscrypt_iv, short_iv) !=
+			     AES_BLOCK_SIZE);
+		err = init_essiv_generator(ci, raw_key, mode->keysize);
+		if (err) {
+			fscrypt_warn(inode->i_sb,
+				     "error initializing ESSIV generator for inode %lu: %d",
+				     inode->i_ino, err);
+			return err;
+		}
+	}
+	return 0;
+}
+
+static void put_crypt_info(struct fscrypt_info *ci)
+{
+	if (!ci)
+		return;
+
+	if (ci->ci_master_key) {
+		put_master_key(ci->ci_master_key);
+	} else {
+		crypto_free_skcipher(ci->ci_ctfm);
+		crypto_free_cipher(ci->ci_essiv_tfm);
+	}
+	kmem_cache_free(fscrypt_info_cachep, ci);
+}
+
 int fscrypt_get_encryption_info(struct inode *inode)
 {
 	struct fscrypt_info *crypt_info;
 	struct fscrypt_context ctx;
-	struct crypto_skcipher *ctfm;
 	struct fscrypt_mode *mode;
 	u8 *raw_key = NULL;
 	int res;
@@ -312,23 +513,23 @@ int fscrypt_get_encryption_info(struct inode *inode)
 	if (ctx.flags & ~FS_POLICY_FLAGS_VALID)
 		return -EINVAL;
 
-	crypt_info = kmem_cache_alloc(fscrypt_info_cachep, GFP_NOFS);
+	crypt_info = kmem_cache_zalloc(fscrypt_info_cachep, GFP_NOFS);
 	if (!crypt_info)
 		return -ENOMEM;
 
 	crypt_info->ci_flags = ctx.flags;
 	crypt_info->ci_data_mode = ctx.contents_encryption_mode;
 	crypt_info->ci_filename_mode = ctx.filenames_encryption_mode;
-	crypt_info->ci_ctfm = NULL;
-	crypt_info->ci_essiv_tfm = NULL;
-	memcpy(crypt_info->ci_master_key, ctx.master_key_descriptor,
-				sizeof(crypt_info->ci_master_key));
+	memcpy(crypt_info->ci_master_key_descriptor, ctx.master_key_descriptor,
+	       FS_KEY_DESCRIPTOR_SIZE);
+	memcpy(crypt_info->ci_nonce, ctx.nonce, FS_KEY_DERIVATION_NONCE_SIZE);
 
 	mode = select_encryption_mode(crypt_info, inode);
 	if (IS_ERR(mode)) {
 		res = PTR_ERR(mode);
 		goto out;
 	}
+	crypt_info->ci_mode = mode;
 
 	/*
 	 * This cannot be a stack buffer because it is passed to the scatterlist
@@ -339,47 +540,14 @@ int fscrypt_get_encryption_info(struct inode *inode)
 	if (!raw_key)
 		goto out;
 
-	res = find_and_derive_key(inode, &ctx, raw_key, mode->keysize);
+	res = find_and_derive_key(inode, &ctx, raw_key, mode);
 	if (res)
 		goto out;
 
-	ctfm = crypto_alloc_skcipher(mode->cipher_str, 0, 0);
-	if (IS_ERR(ctfm)) {
-		res = PTR_ERR(ctfm);
-		fscrypt_warn(inode->i_sb,
-			     "error allocating '%s' transform for inode %lu: %d",
-			     mode->cipher_str, inode->i_ino, res);
-		goto out;
-	}
-	if (unlikely(!mode->logged_impl_name)) {
-		/*
-		 * fscrypt performance can vary greatly depending on which
-		 * crypto algorithm implementation is used.  Help people debug
-		 * performance problems by logging the ->cra_driver_name the
-		 * first time a mode is used.  Note that multiple threads can
-		 * race here, but it doesn't really matter.
-		 */
-		mode->logged_impl_name = true;
-		pr_info("fscrypt: %s using implementation \"%s\"\n",
-			mode->friendly_name,
-			crypto_skcipher_alg(ctfm)->base.cra_driver_name);
-	}
-	crypt_info->ci_ctfm = ctfm;
-	crypto_skcipher_set_flags(ctfm, CRYPTO_TFM_REQ_WEAK_KEY);
-	res = crypto_skcipher_setkey(ctfm, raw_key, mode->keysize);
+	res = setup_crypto_transform(crypt_info, mode, raw_key, inode);
 	if (res)
 		goto out;
 
-	if (S_ISREG(inode->i_mode) &&
-	    crypt_info->ci_data_mode == FS_ENCRYPTION_MODE_AES_128_CBC) {
-		res = init_essiv_generator(crypt_info, raw_key, mode->keysize);
-		if (res) {
-			fscrypt_warn(inode->i_sb,
-				     "error initializing ESSIV generator for inode %lu: %d",
-				     inode->i_ino, res);
-			goto out;
-		}
-	}
 	if (cmpxchg(&inode->i_crypt_info, NULL, crypt_info) == NULL)
 		crypt_info = NULL;
 out:
diff --git a/fs/crypto/policy.c b/fs/crypto/policy.c
index c6d431a5cce93..f490de921ce82 100644
--- a/fs/crypto/policy.c
+++ b/fs/crypto/policy.c
@@ -199,7 +199,8 @@ int fscrypt_has_permitted_context(struct inode *parent, struct inode *child)
 	child_ci = child->i_crypt_info;
 
 	if (parent_ci && child_ci) {
-		return memcmp(parent_ci->ci_master_key, child_ci->ci_master_key,
+		return memcmp(parent_ci->ci_master_key_descriptor,
+			      child_ci->ci_master_key_descriptor,
 			      FS_KEY_DESCRIPTOR_SIZE) == 0 &&
 			(parent_ci->ci_data_mode == child_ci->ci_data_mode) &&
 			(parent_ci->ci_filename_mode ==
@@ -254,7 +255,7 @@ int fscrypt_inherit_context(struct inode *parent, struct inode *child,
 	ctx.contents_encryption_mode = ci->ci_data_mode;
 	ctx.filenames_encryption_mode = ci->ci_filename_mode;
 	ctx.flags = ci->ci_flags;
-	memcpy(ctx.master_key_descriptor, ci->ci_master_key,
+	memcpy(ctx.master_key_descriptor, ci->ci_master_key_descriptor,
 	       FS_KEY_DESCRIPTOR_SIZE);
 	get_random_bytes(ctx.nonce, FS_KEY_DERIVATION_NONCE_SIZE);
 	BUILD_BUG_ON(sizeof(ctx) != FSCRYPT_SET_CONTEXT_MAX_SIZE);
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index a441ea1bfe6d9..8a49e971146d5 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -269,7 +269,8 @@ struct fsxattr {
 #define FS_POLICY_FLAGS_PAD_16		0x02
 #define FS_POLICY_FLAGS_PAD_32		0x03
 #define FS_POLICY_FLAGS_PAD_MASK	0x03
-#define FS_POLICY_FLAGS_VALID		0x03
+#define FS_POLICY_FLAGS_DIRECT		0x04	/* use master key directly */
+#define FS_POLICY_FLAGS_VALID		0x07
 
 /* Encryption algorithms */
 #define FS_ENCRYPTION_MODE_INVALID		0
@@ -281,6 +282,7 @@ struct fsxattr {
 #define FS_ENCRYPTION_MODE_AES_128_CTS		6
 #define FS_ENCRYPTION_MODE_SPECK128_256_XTS	7 /* Removed, do not use. */
 #define FS_ENCRYPTION_MODE_SPECK128_256_CTS	8 /* Removed, do not use. */
+#define FS_ENCRYPTION_MODE_ADIANTUM		9
 
 struct fscrypt_policy {
 	__u8 version;
-- 
2.19.1.331.ge82ca0e54c-goog
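
For readers following the IV handling in fscrypt_prepare_iv() above, the
two layouts of union fscrypt_iv can be sketched in userspace roughly as
follows.  This is only an illustration under stated assumptions: the
sketch_* names, put_le64() and NONCE_SIZE are hypothetical stand-ins, the
ESSIV encryption step of the short-IV path is left out, and the whole
buffer is zeroed up front for simplicity, which the patch itself does not
do.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NONCE_SIZE 16	/* mirrors FS_KEY_DERIVATION_NONCE_SIZE in the patch */

/* Userspace stand-in for union fscrypt_iv; fields are little-endian bytes. */
union sketch_iv {
	struct {
		uint8_t index[8];
		uint8_t unused[8];
	} short_iv;			/* 16 bytes: XTS, CBC, CTS-CBC */
	struct {
		uint8_t index[8];
		uint8_t nonce[NONCE_SIZE];
		uint8_t unused[8];
	} long_iv;			/* 32 bytes: Adiantum */
};

/* Store a 64-bit value in little-endian byte order on any host. */
static void put_le64(uint8_t out[8], uint64_t v)
{
	for (int i = 0; i < 8; i++)
		out[i] = (uint8_t)(v >> (8 * i));
}

/* Fill the IV from the logical block number and, for long IVs, the nonce. */
static void sketch_prepare_iv(union sketch_iv *iv, uint64_t lblk_num,
			      const uint8_t nonce[NONCE_SIZE],
			      int uses_long_ivs)
{
	memset(iv, 0, sizeof(*iv));
	if (uses_long_ivs) {
		/* lblk_num and nonce together form the tweak */
		put_le64(iv->long_iv.index, lblk_num);
		memcpy(iv->long_iv.nonce, nonce, NONCE_SIZE);
	} else {
		/* only lblk_num; the key itself was derived from the nonce */
		put_le64(iv->short_iv.index, lblk_num);
	}
}
```

For filenames the callers in the patch pass lblk_num = 0, so with long
IVs the per-directory nonce is the only varying part of the tweak.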

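The filename padding rule described in the documentation hunk above
(NUL-pad to at least 16 bytes, then to the next 4, 8, 16, or 32-byte
boundary) can be worked through with a small userspace sketch.  The
function name and SKETCH_MIN_LEN are hypothetical, the mapping of flag
values 0 and 1 to 4 and 8 bytes is an assumption inferred from
FS_POLICY_FLAGS_PAD_16 (0x02) and FS_POLICY_FLAGS_PAD_32 (0x03), and any
clamping to the filesystem's maximum name length is deliberately left
out.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MIN_LEN 16	/* minimum encryptable length per the doc text */

/*
 * pad_flag is the low two bits of the policy flags (assumed 0..3, meaning
 * pad to 4, 8, 16 or 32 bytes).  Returns the length of the padded plaintext.
 */
static size_t sketch_padded_fname_len(size_t len, unsigned int pad_flag)
{
	size_t boundary = (size_t)4 << (pad_flag & 0x3);

	/* names shorter than 16 bytes are first NUL-padded to 16 bytes */
	if (len < SKETCH_MIN_LEN)
		len = SKETCH_MIN_LEN;

	/* then round up to the next multiple of the padding boundary */
	return (len + boundary - 1) / boundary * boundary;
}
```

With 32-byte padding, a 5-byte name and a 30-byte name both come out at
32 bytes, which is the length-hiding effect the documentation refers to;
with 4-byte padding, a 17-byte name becomes 20 bytes.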

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 01/12] crypto: chacha20-generic - add HChaCha20 library function
  2018-10-15 17:54 ` [RFC PATCH v2 01/12] crypto: chacha20-generic - add HChaCha20 library function Eric Biggers
@ 2018-10-19 14:13   ` Ard Biesheuvel
  0 siblings, 0 replies; 54+ messages in thread
From: Ard Biesheuvel @ 2018-10-19 14:13 UTC (permalink / raw)
  To: Eric Biggers
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE,
	Jason A . Donenfeld, Greg Kaiser, Herbert Xu, Samuel Neves,
	Michael Halcrow, Linux Kernel Mailing List, linux-fscrypt,
	Tomer Ashur, linux-arm-kernel, Paul Crowley

On 16 October 2018 at 01:54, Eric Biggers <ebiggers@kernel.org> wrote:
> From: Eric Biggers <ebiggers@google.com>
>
> Refactor the unkeyed permutation part of chacha20_block() into its own
> function, then add hchacha20_block() which is the ChaCha equivalent of
> HSalsa20 and is an intermediate step towards XChaCha20 (see
> https://cr.yp.to/snuffle/xsalsa-20081128.pdf).  HChaCha20 skips the
> final addition of the initial state, and outputs only certain words of
> the state.  It should not be used for streaming directly.
>
> Signed-off-by: Eric Biggers <ebiggers@google.com>

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

> ---
>  include/crypto/chacha20.h |  2 ++
>  lib/chacha20.c            | 50 ++++++++++++++++++++++++++++++++++-----
>  2 files changed, 46 insertions(+), 6 deletions(-)
>
> diff --git a/include/crypto/chacha20.h b/include/crypto/chacha20.h
> index f76302d99e2be..fbec4e6a87890 100644
> --- a/include/crypto/chacha20.h
> +++ b/include/crypto/chacha20.h
> @@ -19,6 +19,8 @@ struct chacha20_ctx {
>  };
>
>  void chacha20_block(u32 *state, u8 *stream);
> +void hchacha20_block(const u32 *in, u32 *out);
> +
>  void crypto_chacha20_init(u32 *state, struct chacha20_ctx *ctx, u8 *iv);
>  int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
>                            unsigned int keysize);
> diff --git a/lib/chacha20.c b/lib/chacha20.c
> index d907fec6a9ed1..6a484e16171d1 100644
> --- a/lib/chacha20.c
> +++ b/lib/chacha20.c
> @@ -1,5 +1,5 @@
>  /*
> - * ChaCha20 256-bit cipher algorithm, RFC7539
> + * The "hash function" used as the core of the ChaCha20 stream cipher (RFC7539)
>   *
>   * Copyright (C) 2015 Martin Willi
>   *
> @@ -16,14 +16,10 @@
>  #include <asm/unaligned.h>
>  #include <crypto/chacha20.h>
>
> -void chacha20_block(u32 *state, u8 *stream)
> +static void chacha20_permute(u32 *x)
>  {
> -       u32 x[16];
>         int i;
>
> -       for (i = 0; i < ARRAY_SIZE(x); i++)
> -               x[i] = state[i];
> -
>         for (i = 0; i < 20; i += 2) {
>                 x[0]  += x[4];    x[12] = rol32(x[12] ^ x[0],  16);
>                 x[1]  += x[5];    x[13] = rol32(x[13] ^ x[1],  16);
> @@ -65,6 +61,25 @@ void chacha20_block(u32 *state, u8 *stream)
>                 x[8]  += x[13];   x[7]  = rol32(x[7]  ^ x[8],   7);
>                 x[9]  += x[14];   x[4]  = rol32(x[4]  ^ x[9],   7);
>         }
> +}
> +
> +/**
> + * chacha20_block - generate one keystream block and increment block counter
> + * @state: input state matrix (16 32-bit words)
> + * @stream: output keystream block (64 bytes)
> + *
> + * This is the ChaCha20 core, a function from 64-byte strings to 64-byte
> + * strings.  The caller has already converted the endianness of the input.  This
> + * function also handles incrementing the block counter in the input matrix.
> + */
> +void chacha20_block(u32 *state, u8 *stream)
> +{
> +       u32 x[16];
> +       int i;
> +
> +       memcpy(x, state, 64);
> +
> +       chacha20_permute(x);
>
>         for (i = 0; i < ARRAY_SIZE(x); i++)
>                 put_unaligned_le32(x[i] + state[i], &stream[i * sizeof(u32)]);
> @@ -72,3 +87,26 @@ void chacha20_block(u32 *state, u8 *stream)
>         state[12]++;
>  }
>  EXPORT_SYMBOL(chacha20_block);
> +
> +/**
> + * hchacha20_block - abbreviated ChaCha20 core, for XChaCha20
> + * @in: input state matrix (16 32-bit words)
> + * @out: output (8 32-bit words)
> + *
> + * HChaCha20 is the ChaCha equivalent of HSalsa20 and is an intermediate step
> + * towards XChaCha20 (see https://cr.yp.to/snuffle/xsalsa-20081128.pdf).
> + * HChaCha20 skips the final addition of the initial state, and outputs only
> + * certain words of the state.  It should not be used for streaming directly.
> + */
> +void hchacha20_block(const u32 *in, u32 *out)
> +{
> +       u32 x[16];
> +
> +       memcpy(x, in, 64);
> +
> +       chacha20_permute(x);
> +
> +       memcpy(&out[0], &x[0], 16);
> +       memcpy(&out[4], &x[12], 16);
> +}
> +EXPORT_SYMBOL(hchacha20_block);
> --
> 2.19.1.331.ge82ca0e54c-goog
>
>

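As a reading aid for the refactoring quoted above, the relationship
between the shared permutation, the full block function and HChaCha20
can be sketched in portable userspace C.  This is a sketch, not the
kernel code: the sketch_* names and the QR macro are hypothetical, the
rounds are written as conventional quarter-rounds rather than the
interleaved form in lib/chacha20.c, and the block function here returns
32-bit words without serializing to bytes or incrementing the counter.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static uint32_t rol32(uint32_t v, int n)
{
	return (v << n) | (v >> (32 - n));
}

/* One ChaCha quarter-round on words a, b, c, d of the local array x. */
#define QR(a, b, c, d) do {				\
	x[a] += x[b]; x[d] = rol32(x[d] ^ x[a], 16);	\
	x[c] += x[d]; x[b] = rol32(x[b] ^ x[c], 12);	\
	x[a] += x[b]; x[d] = rol32(x[d] ^ x[a], 8);	\
	x[c] += x[d]; x[b] = rol32(x[b] ^ x[c], 7);	\
} while (0)

/* The unkeyed 20-round permutation shared by both functions below. */
static void sketch_permute(uint32_t x[16])
{
	for (int i = 0; i < 20; i += 2) {
		/* column round */
		QR(0, 4, 8, 12); QR(1, 5, 9, 13);
		QR(2, 6, 10, 14); QR(3, 7, 11, 15);
		/* diagonal round */
		QR(0, 5, 10, 15); QR(1, 6, 11, 12);
		QR(2, 7, 8, 13); QR(3, 4, 9, 14);
	}
}

/* Full block function: permute, then add the initial state back in. */
static void sketch_chacha20_block(const uint32_t state[16], uint32_t out[16])
{
	uint32_t x[16];

	memcpy(x, state, sizeof(x));
	sketch_permute(x);
	for (int i = 0; i < 16; i++)
		out[i] = x[i] + state[i];
}

/* HChaCha20: permute, skip the final addition, keep words 0-3 and 12-15. */
static void sketch_hchacha20(const uint32_t in[16], uint32_t out[8])
{
	uint32_t x[16];

	memcpy(x, in, sizeof(x));
	sketch_permute(x);
	memcpy(&out[0], &x[0], 16);
	memcpy(&out[4], &x[12], 16);
}
```

One way to sanity-check such a sketch is to feed both functions the same
input state: each HChaCha20 output word should equal the corresponding
block output word minus the input word, and the block output can be
compared against the block-function test vector in RFC 7539, section
2.3.2.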
^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 02/12] crypto: chacha20-generic - add XChaCha20 support
  2018-10-15 17:54 ` [RFC PATCH v2 02/12] crypto: chacha20-generic - add XChaCha20 support Eric Biggers
@ 2018-10-19 14:24   ` Ard Biesheuvel
  0 siblings, 0 replies; 54+ messages in thread
From: Ard Biesheuvel @ 2018-10-19 14:24 UTC (permalink / raw)
  To: Eric Biggers
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fscrypt,
	linux-arm-kernel, Linux Kernel Mailing List, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

On 16 October 2018 at 01:54, Eric Biggers <ebiggers@kernel.org> wrote:
> From: Eric Biggers <ebiggers@google.com>
>
> Add support for the XChaCha20 stream cipher.  XChaCha20 is the
> application of the XSalsa20 construction
> (https://cr.yp.to/snuffle/xsalsa-20081128.pdf) to ChaCha20 rather than
> to Salsa20.  XChaCha20 extends ChaCha20's nonce length from 64 bits (or
> 96 bits, depending on convention) to 192 bits, while provably retaining
> ChaCha20's security.  XChaCha20 uses the ChaCha20 permutation to map the
> key and first 128 nonce bits to a 256-bit subkey.  Then, it does the
> ChaCha20 stream cipher with the subkey and remaining 64 bits of nonce.
>
> We need XChaCha support in order to add support for the Adiantum
> encryption mode.  Note that to meet our performance requirements, we
> actually plan to primarily use the variant XChaCha12.  But we believe
> it's wise to first add XChaCha20 as a baseline with a higher security
> margin, in case there are any situations where it can be used.
> Supporting both variants is straightforward.
>
> Since XChaCha20's subkey differs for each request, XChaCha20 can't be a
> template that wraps ChaCha20; that would require re-keying the
> underlying ChaCha20 for every request, which wouldn't be thread-safe.
> Instead, we make XChaCha20 its own top-level algorithm which calls the
> ChaCha20 streaming implementation internally.
>
> Similar to the existing ChaCha20 implementation, we define the IV to be
> the nonce and stream position concatenated together.  This allows users
> to seek to any position in the stream.
>
> I considered splitting the code into separate chacha20-common, chacha20,
> and xchacha20 modules, so that chacha20 and xchacha20 could be
> enabled/disabled independently.  However, since nearly all the code is
> shared anyway, I ultimately decided there would have been little benefit
> to the added complexity of separate modules.
>
> Signed-off-by: Eric Biggers <ebiggers@google.com>

One nit below, but that should be fixed separately, so:

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

> ---
>  crypto/Kconfig            |  14 +-
>  crypto/chacha20_generic.c | 120 +++++---
>  crypto/testmgr.c          |   6 +
>  crypto/testmgr.h          | 577 ++++++++++++++++++++++++++++++++++++++
>  include/crypto/chacha20.h |  14 +-
>  5 files changed, 689 insertions(+), 42 deletions(-)
>
> diff --git a/crypto/Kconfig b/crypto/Kconfig
> index f7a235db56aaa..d9acbce23d4d5 100644
> --- a/crypto/Kconfig
> +++ b/crypto/Kconfig
> @@ -1387,18 +1387,22 @@ config CRYPTO_SALSA20
>           Bernstein <djb@cr.yp.to>. See <http://cr.yp.to/snuffle.html>
>
>  config CRYPTO_CHACHA20
> -       tristate "ChaCha20 cipher algorithm"
> +       tristate "ChaCha20 stream cipher algorithms"
>         select CRYPTO_BLKCIPHER
>         help
> -         ChaCha20 cipher algorithm, RFC7539.
> +         The ChaCha20 and XChaCha20 stream cipher algorithms.
>
>           ChaCha20 is a 256-bit high-speed stream cipher designed by Daniel J.
>           Bernstein and further specified in RFC7539 for use in IETF protocols.
> -         This is the portable C implementation of ChaCha20.
> -
> -         See also:
> +         This is the portable C implementation of ChaCha20.  See also:
>           <http://cr.yp.to/chacha/chacha-20080128.pdf>
>
> +         XChaCha20 is the application of the XSalsa20 construction to ChaCha20
> +         rather than to Salsa20.  XChaCha20 extends ChaCha20's nonce length
> +         from 64 bits (or 96 bits using the RFC7539 convention) to 192 bits,
> +         while provably retaining ChaCha20's security.  See also:
> +         <https://cr.yp.to/snuffle/xsalsa-20081128.pdf>
> +
>  config CRYPTO_CHACHA20_X86_64
>         tristate "ChaCha20 cipher algorithm (x86_64/SSSE3/AVX2)"
>         depends on X86 && 64BIT
> diff --git a/crypto/chacha20_generic.c b/crypto/chacha20_generic.c
> index 3ae96587caf9a..07902fe37aeb8 100644
> --- a/crypto/chacha20_generic.c
> +++ b/crypto/chacha20_generic.c
> @@ -1,7 +1,8 @@
>  /*
> - * ChaCha20 256-bit cipher algorithm, RFC7539
> + * ChaCha20 (RFC7539) and XChaCha20 stream cipher algorithms
>   *
>   * Copyright (C) 2015 Martin Willi
> + * Copyright (C) 2018 Google LLC
>   *
>   * This program is free software; you can redistribute it and/or modify
>   * it under the terms of the GNU General Public License as published by
> @@ -36,6 +37,31 @@ static void chacha20_docrypt(u32 *state, u8 *dst, const u8 *src,
>         }
>  }
>
> +static int chacha20_stream_xor(struct skcipher_request *req,
> +                              struct chacha20_ctx *ctx, u8 *iv)
> +{
> +       struct skcipher_walk walk;
> +       u32 state[16];
> +       int err;
> +
> +       err = skcipher_walk_virt(&walk, req, true);
> +

We shouldn't be calling skcipher_walk_virt() here with atomic set to
true, but that is an existing issue so perhaps you could include a
separate patch to fix that?

> +       crypto_chacha20_init(state, ctx, iv);
> +
> +       while (walk.nbytes > 0) {
> +               unsigned int nbytes = walk.nbytes;
> +
> +               if (nbytes < walk.total)
> +                       nbytes = round_down(nbytes, walk.stride);
> +
> +               chacha20_docrypt(state, walk.dst.virt.addr, walk.src.virt.addr,
> +                                nbytes);
> +               err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
> +       }
> +
> +       return err;
> +}
> +
>  void crypto_chacha20_init(u32 *state, struct chacha20_ctx *ctx, u8 *iv)
>  {
>         state[0]  = 0x61707865; /* "expa" */
> @@ -77,54 +103,74 @@ int crypto_chacha20_crypt(struct skcipher_request *req)
>  {
>         struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
>         struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
> -       struct skcipher_walk walk;
> -       u32 state[16];
> -       int err;
> -
> -       err = skcipher_walk_virt(&walk, req, true);
>
> -       crypto_chacha20_init(state, ctx, walk.iv);
> +       return chacha20_stream_xor(req, ctx, req->iv);
> +}
> +EXPORT_SYMBOL_GPL(crypto_chacha20_crypt);
>
> -       while (walk.nbytes > 0) {
> -               unsigned int nbytes = walk.nbytes;
> +int crypto_xchacha20_crypt(struct skcipher_request *req)
> +{
> +       struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
> +       struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
> +       struct chacha20_ctx subctx;
> +       u32 state[16];
> +       u8 real_iv[16];
>
> -               if (nbytes < walk.total)
> -                       nbytes = round_down(nbytes, walk.stride);
> +       /* Compute the subkey given the original key and first 128 nonce bits */
> +       crypto_chacha20_init(state, ctx, req->iv);
> +       hchacha20_block(state, subctx.key);
>
> -               chacha20_docrypt(state, walk.dst.virt.addr, walk.src.virt.addr,
> -                                nbytes);
> -               err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
> -       }
> +       /* Build the real IV */
> +       memcpy(&real_iv[0], req->iv + 24, 8); /* stream position */
> +       memcpy(&real_iv[8], req->iv + 16, 8); /* remaining 64 nonce bits */
>
> -       return err;
> +       /* Generate the stream and XOR it with the data */
> +       return chacha20_stream_xor(req, &subctx, real_iv);
>  }
> -EXPORT_SYMBOL_GPL(crypto_chacha20_crypt);
> -
> -static struct skcipher_alg alg = {
> -       .base.cra_name          = "chacha20",
> -       .base.cra_driver_name   = "chacha20-generic",
> -       .base.cra_priority      = 100,
> -       .base.cra_blocksize     = 1,
> -       .base.cra_ctxsize       = sizeof(struct chacha20_ctx),
> -       .base.cra_module        = THIS_MODULE,
> -
> -       .min_keysize            = CHACHA20_KEY_SIZE,
> -       .max_keysize            = CHACHA20_KEY_SIZE,
> -       .ivsize                 = CHACHA20_IV_SIZE,
> -       .chunksize              = CHACHA20_BLOCK_SIZE,
> -       .setkey                 = crypto_chacha20_setkey,
> -       .encrypt                = crypto_chacha20_crypt,
> -       .decrypt                = crypto_chacha20_crypt,
> +EXPORT_SYMBOL_GPL(crypto_xchacha20_crypt);
> +
> +static struct skcipher_alg algs[] = {
> +       {
> +               .base.cra_name          = "chacha20",
> +               .base.cra_driver_name   = "chacha20-generic",
> +               .base.cra_priority      = 100,
> +               .base.cra_blocksize     = 1,
> +               .base.cra_ctxsize       = sizeof(struct chacha20_ctx),
> +               .base.cra_module        = THIS_MODULE,
> +
> +               .min_keysize            = CHACHA20_KEY_SIZE,
> +               .max_keysize            = CHACHA20_KEY_SIZE,
> +               .ivsize                 = CHACHA20_IV_SIZE,
> +               .chunksize              = CHACHA20_BLOCK_SIZE,
> +               .setkey                 = crypto_chacha20_setkey,
> +               .encrypt                = crypto_chacha20_crypt,
> +               .decrypt                = crypto_chacha20_crypt,
> +       }, {
> +               .base.cra_name          = "xchacha20",
> +               .base.cra_driver_name   = "xchacha20-generic",
> +               .base.cra_priority      = 100,
> +               .base.cra_blocksize     = 1,
> +               .base.cra_ctxsize       = sizeof(struct chacha20_ctx),
> +               .base.cra_module        = THIS_MODULE,
> +
> +               .min_keysize            = CHACHA20_KEY_SIZE,
> +               .max_keysize            = CHACHA20_KEY_SIZE,
> +               .ivsize                 = XCHACHA20_IV_SIZE,
> +               .chunksize              = CHACHA20_BLOCK_SIZE,
> +               .setkey                 = crypto_chacha20_setkey,
> +               .encrypt                = crypto_xchacha20_crypt,
> +               .decrypt                = crypto_xchacha20_crypt,
> +       }
>  };
>
>  static int __init chacha20_generic_mod_init(void)
>  {
> -       return crypto_register_skcipher(&alg);
> +       return crypto_register_skciphers(algs, ARRAY_SIZE(algs));
>  }
>
>  static void __exit chacha20_generic_mod_fini(void)
>  {
> -       crypto_unregister_skcipher(&alg);
> +       crypto_unregister_skciphers(algs, ARRAY_SIZE(algs));
>  }
>
>  module_init(chacha20_generic_mod_init);
> @@ -132,6 +178,8 @@ module_exit(chacha20_generic_mod_fini);
>
>  MODULE_LICENSE("GPL");
>  MODULE_AUTHOR("Martin Willi <martin@strongswan.org>");
> -MODULE_DESCRIPTION("chacha20 cipher algorithm");
> +MODULE_DESCRIPTION("ChaCha20 and XChaCha20 stream ciphers (generic)");
>  MODULE_ALIAS_CRYPTO("chacha20");
>  MODULE_ALIAS_CRYPTO("chacha20-generic");
> +MODULE_ALIAS_CRYPTO("xchacha20");
> +MODULE_ALIAS_CRYPTO("xchacha20-generic");
> diff --git a/crypto/testmgr.c b/crypto/testmgr.c
> index b1f79c6bf4096..a5512e69c8f31 100644
> --- a/crypto/testmgr.c
> +++ b/crypto/testmgr.c
> @@ -3544,6 +3544,12 @@ static const struct alg_test_desc alg_test_descs[] = {
>                 .suite = {
>                         .hash = __VECS(aes_xcbc128_tv_template)
>                 }
> +       }, {
> +               .alg = "xchacha20",
> +               .test = alg_test_skcipher,
> +               .suite = {
> +                       .cipher = __VECS(xchacha20_tv_template)
> +               },
>         }, {
>                 .alg = "xts(aes)",
>                 .test = alg_test_skcipher,
> diff --git a/crypto/testmgr.h b/crypto/testmgr.h
> index 1fe7b97ba03f9..371641c73cf8c 100644
> --- a/crypto/testmgr.h
> +++ b/crypto/testmgr.h
> @@ -30802,6 +30802,583 @@ static const struct cipher_testvec chacha20_tv_template[] = {
>         },
>  };
>
> +static const struct cipher_testvec xchacha20_tv_template[] = {
> +       { /* from libsodium test/default/xchacha20.c */
> +               .key    = "\x79\xc9\x97\x98\xac\x67\x30\x0b"
> +                         "\xbb\x27\x04\xc9\x5c\x34\x1e\x32"
> +                         "\x45\xf3\xdc\xb2\x17\x61\xb9\x8e"
> +                         "\x52\xff\x45\xb2\x4f\x30\x4f\xc4",
> +               .klen   = 32,
> +               .iv     = "\xb3\x3f\xfd\x30\x96\x47\x9b\xcf"
> +                         "\xbc\x9a\xee\x49\x41\x76\x88\xa0"
> +                         "\xa2\x55\x4f\x8d\x95\x38\x94\x19"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00",
> +               .ptext  = "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00",
> +               .ctext  = "\xc6\xe9\x75\x81\x60\x08\x3a\xc6"
> +                         "\x04\xef\x90\xe7\x12\xce\x6e\x75"
> +                         "\xd7\x79\x75\x90\x74\x4e\x0c\xf0"
> +                         "\x60\xf0\x13\x73\x9c",
> +               .len    = 29,
> +       }, { /* from libsodium test/default/xchacha20.c */
> +               .key    = "\x9d\x23\xbd\x41\x49\xcb\x97\x9c"
> +                         "\xcf\x3c\x5c\x94\xdd\x21\x7e\x98"
> +                         "\x08\xcb\x0e\x50\xcd\x0f\x67\x81"
> +                         "\x22\x35\xea\xaf\x60\x1d\x62\x32",
> +               .klen   = 32,
> +               .iv     = "\xc0\x47\x54\x82\x66\xb7\xc3\x70"
> +                         "\xd3\x35\x66\xa2\x42\x5c\xbf\x30"
> +                         "\xd8\x2d\x1e\xaf\x52\x94\x10\x9e"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00",
> +               .ptext  = "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00",
> +               .ctext  = "\xa2\x12\x09\x09\x65\x94\xde\x8c"
> +                         "\x56\x67\xb1\xd1\x3a\xd9\x3f\x74"
> +                         "\x41\x06\xd0\x54\xdf\x21\x0e\x47"
> +                         "\x82\xcd\x39\x6f\xec\x69\x2d\x35"
> +                         "\x15\xa2\x0b\xf3\x51\xee\xc0\x11"
> +                         "\xa9\x2c\x36\x78\x88\xbc\x46\x4c"
> +                         "\x32\xf0\x80\x7a\xcd\x6c\x20\x3a"
> +                         "\x24\x7e\x0d\xb8\x54\x14\x84\x68"
> +                         "\xe9\xf9\x6b\xee\x4c\xf7\x18\xd6"
> +                         "\x8d\x5f\x63\x7c\xbd\x5a\x37\x64"
> +                         "\x57\x78\x8e\x6f\xae\x90\xfc\x31"
> +                         "\x09\x7c\xfc",
> +               .len    = 91,
> +       }, { /* Taken from the ChaCha20 test vectors, appended 16 random bytes
> +               to nonce, and recomputed the ciphertext with libsodium */
> +               .key    = "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00",
> +               .klen   = 32,
> +               .iv     = "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x67\xc6\x69\x73"
> +                         "\x51\xff\x4a\xec\x29\xcd\xba\xab"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00",
> +               .ptext  = "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00",
> +               .ctext  = "\x9c\x49\x2a\xe7\x8a\x2f\x93\xc7"
> +                         "\xb3\x33\x6f\x82\x17\xd8\xc4\x1e"
> +                         "\xad\x80\x11\x11\x1d\x4c\x16\x18"
> +                         "\x07\x73\x9b\x4f\xdb\x7c\xcb\x47"
> +                         "\xfd\xef\x59\x74\xfa\x3f\xe5\x4c"
> +                         "\x9b\xd0\xea\xbc\xba\x56\xad\x32"
> +                         "\x03\xdc\xf8\x2b\xc1\xe1\x75\x67"
> +                         "\x23\x7b\xe6\xfc\xd4\x03\x86\x54",
> +               .len    = 64,
> +       }, { /* Taken from the ChaCha20 test vectors, appended 16 random bytes
> +               to nonce, and recomputed the ciphertext with libsodium */
> +               .key    = "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x01",
> +               .klen   = 32,
> +               .iv     = "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x02\xf2\xfb\xe3\x46"
> +                         "\x7c\xc2\x54\xf8\x1b\xe8\xe7\x8d"
> +                         "\x01\x00\x00\x00\x00\x00\x00\x00",
> +               .ptext  = "\x41\x6e\x79\x20\x73\x75\x62\x6d"
> +                         "\x69\x73\x73\x69\x6f\x6e\x20\x74"
> +                         "\x6f\x20\x74\x68\x65\x20\x49\x45"
> +                         "\x54\x46\x20\x69\x6e\x74\x65\x6e"
> +                         "\x64\x65\x64\x20\x62\x79\x20\x74"
> +                         "\x68\x65\x20\x43\x6f\x6e\x74\x72"
> +                         "\x69\x62\x75\x74\x6f\x72\x20\x66"
> +                         "\x6f\x72\x20\x70\x75\x62\x6c\x69"
> +                         "\x63\x61\x74\x69\x6f\x6e\x20\x61"
> +                         "\x73\x20\x61\x6c\x6c\x20\x6f\x72"
> +                         "\x20\x70\x61\x72\x74\x20\x6f\x66"
> +                         "\x20\x61\x6e\x20\x49\x45\x54\x46"
> +                         "\x20\x49\x6e\x74\x65\x72\x6e\x65"
> +                         "\x74\x2d\x44\x72\x61\x66\x74\x20"
> +                         "\x6f\x72\x20\x52\x46\x43\x20\x61"
> +                         "\x6e\x64\x20\x61\x6e\x79\x20\x73"
> +                         "\x74\x61\x74\x65\x6d\x65\x6e\x74"
> +                         "\x20\x6d\x61\x64\x65\x20\x77\x69"
> +                         "\x74\x68\x69\x6e\x20\x74\x68\x65"
> +                         "\x20\x63\x6f\x6e\x74\x65\x78\x74"
> +                         "\x20\x6f\x66\x20\x61\x6e\x20\x49"
> +                         "\x45\x54\x46\x20\x61\x63\x74\x69"
> +                         "\x76\x69\x74\x79\x20\x69\x73\x20"
> +                         "\x63\x6f\x6e\x73\x69\x64\x65\x72"
> +                         "\x65\x64\x20\x61\x6e\x20\x22\x49"
> +                         "\x45\x54\x46\x20\x43\x6f\x6e\x74"
> +                         "\x72\x69\x62\x75\x74\x69\x6f\x6e"
> +                         "\x22\x2e\x20\x53\x75\x63\x68\x20"
> +                         "\x73\x74\x61\x74\x65\x6d\x65\x6e"
> +                         "\x74\x73\x20\x69\x6e\x63\x6c\x75"
> +                         "\x64\x65\x20\x6f\x72\x61\x6c\x20"
> +                         "\x73\x74\x61\x74\x65\x6d\x65\x6e"
> +                         "\x74\x73\x20\x69\x6e\x20\x49\x45"
> +                         "\x54\x46\x20\x73\x65\x73\x73\x69"
> +                         "\x6f\x6e\x73\x2c\x20\x61\x73\x20"
> +                         "\x77\x65\x6c\x6c\x20\x61\x73\x20"
> +                         "\x77\x72\x69\x74\x74\x65\x6e\x20"
> +                         "\x61\x6e\x64\x20\x65\x6c\x65\x63"
> +                         "\x74\x72\x6f\x6e\x69\x63\x20\x63"
> +                         "\x6f\x6d\x6d\x75\x6e\x69\x63\x61"
> +                         "\x74\x69\x6f\x6e\x73\x20\x6d\x61"
> +                         "\x64\x65\x20\x61\x74\x20\x61\x6e"
> +                         "\x79\x20\x74\x69\x6d\x65\x20\x6f"
> +                         "\x72\x20\x70\x6c\x61\x63\x65\x2c"
> +                         "\x20\x77\x68\x69\x63\x68\x20\x61"
> +                         "\x72\x65\x20\x61\x64\x64\x72\x65"
> +                         "\x73\x73\x65\x64\x20\x74\x6f",
> +               .ctext  = "\xf9\xab\x7a\x4a\x60\xb8\x5f\xa0"
> +                         "\x50\xbb\x57\xce\xef\x8c\xc1\xd9"
> +                         "\x24\x15\xb3\x67\x5e\x7f\x01\xf6"
> +                         "\x1c\x22\xf6\xe5\x71\xb1\x43\x64"
> +                         "\x63\x05\xd5\xfc\x5c\x3d\xc0\x0e"
> +                         "\x23\xef\xd3\x3b\xd9\xdc\x7f\xa8"
> +                         "\x58\x26\xb3\xd0\xc2\xd5\x04\x3f"
> +                         "\x0a\x0e\x8f\x17\xe4\xcd\xf7\x2a"
> +                         "\xb4\x2c\x09\xe4\x47\xec\x8b\xfb"
> +                         "\x59\x37\x7a\xa1\xd0\x04\x7e\xaa"
> +                         "\xf1\x98\x5f\x24\x3d\x72\x9a\x43"
> +                         "\xa4\x36\x51\x92\x22\x87\xff\x26"
> +                         "\xce\x9d\xeb\x59\x78\x84\x5e\x74"
> +                         "\x97\x2e\x63\xc0\xef\x29\xf7\x8a"
> +                         "\xb9\xee\x35\x08\x77\x6a\x35\x9a"
> +                         "\x3e\xe6\x4f\x06\x03\x74\x1b\xc1"
> +                         "\x5b\xb3\x0b\x89\x11\x07\xd3\xb7"
> +                         "\x53\xd6\x25\x04\xd9\x35\xb4\x5d"
> +                         "\x4c\x33\x5a\xc2\x42\x4c\xe6\xa4"
> +                         "\x97\x6e\x0e\xd2\xb2\x8b\x2f\x7f"
> +                         "\x28\xe5\x9f\xac\x4b\x2e\x02\xab"
> +                         "\x85\xfa\xa9\x0d\x7c\x2d\x10\xe6"
> +                         "\x91\xab\x55\x63\xf0\xde\x3a\x94"
> +                         "\x25\x08\x10\x03\xc2\x68\xd1\xf4"
> +                         "\xaf\x7d\x9c\x99\xf7\x86\x96\x30"
> +                         "\x60\xfc\x0b\xe6\xa8\x80\x15\xb0"
> +                         "\x81\xb1\x0c\xbe\xb9\x12\x18\x25"
> +                         "\xe9\x0e\xb1\xe7\x23\xb2\xef\x4a"
> +                         "\x22\x8f\xc5\x61\x89\xd4\xe7\x0c"
> +                         "\x64\x36\x35\x61\xb6\x34\x60\xf7"
> +                         "\x7b\x61\x37\x37\x12\x10\xa2\xf6"
> +                         "\x7e\xdb\x7f\x39\x3f\xb6\x8e\x89"
> +                         "\x9e\xf3\xfe\x13\x98\xbb\x66\x5a"
> +                         "\xec\xea\xab\x3f\x9c\x87\xc4\x8c"
> +                         "\x8a\x04\x18\x49\xfc\x77\x11\x50"
> +                         "\x16\xe6\x71\x2b\xee\xc0\x9c\xb6"
> +                         "\x87\xfd\x80\xff\x0b\x1d\x73\x38"
> +                         "\xa4\x1d\x6f\xae\xe4\x12\xd7\x93"
> +                         "\x9d\xcd\x38\x26\x09\x40\x52\xcd"
> +                         "\x67\x01\x67\x26\xe0\x3e\x98\xa8"
> +                         "\xe8\x1a\x13\x41\xbb\x90\x4d\x87"
> +                         "\xbb\x42\x82\x39\xce\x3a\xd0\x18"
> +                         "\x6d\x7b\x71\x8f\xbb\x2c\x6a\xd1"
> +                         "\xbd\xf5\xc7\x8a\x7e\xe1\x1e\x0f"
> +                         "\x0d\x0d\x13\x7c\xd9\xd8\x3c\x91"
> +                         "\xab\xff\x1f\x12\xc3\xee\xe5\x65"
> +                         "\x12\x8d\x7b\x61\xe5\x1f\x98",
> +               .len    = 375,
> +               .also_non_np = 1,
> +               .np     = 3,
> +               .tap    = { 375 - 20, 4, 16 },
> +
> +       }, { /* Taken from the ChaCha20 test vectors, appended 16 random bytes
> +               to nonce, and recomputed the ciphertext with libsodium */
> +               .key    = "\x1c\x92\x40\xa5\xeb\x55\xd3\x8a"
> +                         "\xf3\x33\x88\x86\x04\xf6\xb5\xf0"
> +                         "\x47\x39\x17\xc1\x40\x2b\x80\x09"
> +                         "\x9d\xca\x5c\xbc\x20\x70\x75\xc0",
> +               .klen   = 32,
> +               .iv     = "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x02\x76\x5a\x2e\x63"
> +                         "\x33\x9f\xc9\x9a\x66\x32\x0d\xb7"
> +                         "\x2a\x00\x00\x00\x00\x00\x00\x00",
> +               .ptext  = "\x27\x54\x77\x61\x73\x20\x62\x72"
> +                         "\x69\x6c\x6c\x69\x67\x2c\x20\x61"
> +                         "\x6e\x64\x20\x74\x68\x65\x20\x73"
> +                         "\x6c\x69\x74\x68\x79\x20\x74\x6f"
> +                         "\x76\x65\x73\x0a\x44\x69\x64\x20"
> +                         "\x67\x79\x72\x65\x20\x61\x6e\x64"
> +                         "\x20\x67\x69\x6d\x62\x6c\x65\x20"
> +                         "\x69\x6e\x20\x74\x68\x65\x20\x77"
> +                         "\x61\x62\x65\x3a\x0a\x41\x6c\x6c"
> +                         "\x20\x6d\x69\x6d\x73\x79\x20\x77"
> +                         "\x65\x72\x65\x20\x74\x68\x65\x20"
> +                         "\x62\x6f\x72\x6f\x67\x6f\x76\x65"
> +                         "\x73\x2c\x0a\x41\x6e\x64\x20\x74"
> +                         "\x68\x65\x20\x6d\x6f\x6d\x65\x20"
> +                         "\x72\x61\x74\x68\x73\x20\x6f\x75"
> +                         "\x74\x67\x72\x61\x62\x65\x2e",
> +               .ctext  = "\x95\xb9\x51\xe7\x8f\xb4\xa4\x03"
> +                         "\xca\x37\xcc\xde\x60\x1d\x8c\xe2"
> +                         "\xf1\xbb\x8a\x13\x7f\x61\x85\xcc"
> +                         "\xad\xf4\xf0\xdc\x86\xa6\x1e\x10"
> +                         "\xbc\x8e\xcb\x38\x2b\xa5\xc8\x8f"
> +                         "\xaa\x03\x3d\x53\x4a\x42\xb1\x33"
> +                         "\xfc\xd3\xef\xf0\x8e\x7e\x10\x9c"
> +                         "\x6f\x12\x5e\xd4\x96\xfe\x5b\x08"
> +                         "\xb6\x48\xf0\x14\x74\x51\x18\x7c"
> +                         "\x07\x92\xfc\xac\x9d\xf1\x94\xc0"
> +                         "\xc1\x9d\xc5\x19\x43\x1f\x1d\xbb"
> +                         "\x07\xf0\x1b\x14\x25\x45\xbb\xcb"
> +                         "\x5c\xe2\x8b\x28\xf3\xcf\x47\x29"
> +                         "\x27\x79\x67\x24\xa6\x87\xc2\x11"
> +                         "\x65\x03\xfa\x45\xf7\x9e\x53\x7a"
> +                         "\x99\xf1\x82\x25\x4f\x8d\x07",
> +               .len    = 127,
> +       }, { /* Taken from the ChaCha20 test vectors, appended 16 random bytes
> +               to nonce, and recomputed the ciphertext with libsodium */
> +               .key    = "\x1c\x92\x40\xa5\xeb\x55\xd3\x8a"
> +                         "\xf3\x33\x88\x86\x04\xf6\xb5\xf0"
> +                         "\x47\x39\x17\xc1\x40\x2b\x80\x09"
> +                         "\x9d\xca\x5c\xbc\x20\x70\x75\xc0",
> +               .klen   = 32,
> +               .iv     = "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x01\x31\x58\xa3\x5a"
> +                         "\x25\x5d\x05\x17\x58\xe9\x5e\xd4"
> +                         "\x1c\x00\x00\x00\x00\x00\x00\x00",
> +               .ptext  = "\x49\xee\xe0\xdc\x24\x90\x40\xcd"
> +                         "\xc5\x40\x8f\x47\x05\xbc\xdd\x81"
> +                         "\x47\xc6\x8d\xe6\xb1\x8f\xd7\xcb"
> +                         "\x09\x0e\x6e\x22\x48\x1f\xbf\xb8"
> +                         "\x5c\xf7\x1e\x8a\xc1\x23\xf2\xd4"
> +                         "\x19\x4b\x01\x0f\x4e\xa4\x43\xce"
> +                         "\x01\xc6\x67\xda\x03\x91\x18\x90"
> +                         "\xa5\xa4\x8e\x45\x03\xb3\x2d\xac"
> +                         "\x74\x92\xd3\x53\x47\xc8\xdd\x25"
> +                         "\x53\x6c\x02\x03\x87\x0d\x11\x0c"
> +                         "\x58\xe3\x12\x18\xfd\x2a\x5b\x40"
> +                         "\x0c\x30\xf0\xb8\x3f\x43\xce\xae"
> +                         "\x65\x3a\x7d\x7c\xf4\x54\xaa\xcc"
> +                         "\x33\x97\xc3\x77\xba\xc5\x70\xde"
> +                         "\xd7\xd5\x13\xa5\x65\xc4\x5f\x0f"
> +                         "\x46\x1a\x0d\x97\xb5\xf3\xbb\x3c"
> +                         "\x84\x0f\x2b\xc5\xaa\xea\xf2\x6c"
> +                         "\xc9\xb5\x0c\xee\x15\xf3\x7d\xbe"
> +                         "\x9f\x7b\x5a\xa6\xae\x4f\x83\xb6"
> +                         "\x79\x49\x41\xf4\x58\x18\xcb\x86"
> +                         "\x7f\x30\x0e\xf8\x7d\x44\x36\xea"
> +                         "\x75\xeb\x88\x84\x40\x3c\xad\x4f"
> +                         "\x6f\x31\x6b\xaa\x5d\xe5\xa5\xc5"
> +                         "\x21\x66\xe9\xa7\xe3\xb2\x15\x88"
> +                         "\x78\xf6\x79\xa1\x59\x47\x12\x4e"
> +                         "\x9f\x9f\x64\x1a\xa0\x22\x5b\x08"
> +                         "\xbe\x7c\x36\xc2\x2b\x66\x33\x1b"
> +                         "\xdd\x60\x71\xf7\x47\x8c\x61\xc3"
> +                         "\xda\x8a\x78\x1e\x16\xfa\x1e\x86"
> +                         "\x81\xa6\x17\x2a\xa7\xb5\xc2\xe7"
> +                         "\xa4\xc7\x42\xf1\xcf\x6a\xca\xb4"
> +                         "\x45\xcf\xf3\x93\xf0\xe7\xea\xf6"
> +                         "\xf4\xe6\x33\x43\x84\x93\xa5\x67"
> +                         "\x9b\x16\x58\x58\x80\x0f\x2b\x5c"
> +                         "\x24\x74\x75\x7f\x95\x81\xb7\x30"
> +                         "\x7a\x33\xa7\xf7\x94\x87\x32\x27"
> +                         "\x10\x5d\x14\x4c\x43\x29\xdd\x26"
> +                         "\xbd\x3e\x3c\x0e\xfe\x0e\xa5\x10"
> +                         "\xea\x6b\x64\xfd\x73\xc6\xed\xec"
> +                         "\xa8\xc9\xbf\xb3\xba\x0b\x4d\x07"
> +                         "\x70\xfc\x16\xfd\x79\x1e\xd7\xc5"
> +                         "\x49\x4e\x1c\x8b\x8d\x79\x1b\xb1"
> +                         "\xec\xca\x60\x09\x4c\x6a\xd5\x09"
> +                         "\x49\x46\x00\x88\x22\x8d\xce\xea"
> +                         "\xb1\x17\x11\xde\x42\xd2\x23\xc1"
> +                         "\x72\x11\xf5\x50\x73\x04\x40\x47"
> +                         "\xf9\x5d\xe7\xa7\x26\xb1\x7e\xb0"
> +                         "\x3f\x58\xc1\x52\xab\x12\x67\x9d"
> +                         "\x3f\x43\x4b\x68\xd4\x9c\x68\x38"
> +                         "\x07\x8a\x2d\x3e\xf3\xaf\x6a\x4b"
> +                         "\xf9\xe5\x31\x69\x22\xf9\xa6\x69"
> +                         "\xc6\x9c\x96\x9a\x12\x35\x95\x1d"
> +                         "\x95\xd5\xdd\xbe\xbf\x93\x53\x24"
> +                         "\xfd\xeb\xc2\x0a\x64\xb0\x77\x00"
> +                         "\x6f\x88\xc4\x37\x18\x69\x7c\xd7"
> +                         "\x41\x92\x55\x4c\x03\xa1\x9a\x4b"
> +                         "\x15\xe5\xdf\x7f\x37\x33\x72\xc1"
> +                         "\x8b\x10\x67\xa3\x01\x57\x94\x25"
> +                         "\x7b\x38\x71\x7e\xdd\x1e\xcc\x73"
> +                         "\x55\xd2\x8e\xeb\x07\xdd\xf1\xda"
> +                         "\x58\xb1\x47\x90\xfe\x42\x21\x72"
> +                         "\xa3\x54\x7a\xa0\x40\xec\x9f\xdd"
> +                         "\xc6\x84\x6e\xca\xae\xe3\x68\xb4"
> +                         "\x9d\xe4\x78\xff\x57\xf2\xf8\x1b"
> +                         "\x03\xa1\x31\xd9\xde\x8d\xf5\x22"
> +                         "\x9c\xdd\x20\xa4\x1e\x27\xb1\x76"
> +                         "\x4f\x44\x55\xe2\x9b\xa1\x9c\xfe"
> +                         "\x54\xf7\x27\x1b\xf4\xde\x02\xf5"
> +                         "\x1b\x55\x48\x5c\xdc\x21\x4b\x9e"
> +                         "\x4b\x6e\xed\x46\x23\xdc\x65\xb2"
> +                         "\xcf\x79\x5f\x28\xe0\x9e\x8b\xe7"
> +                         "\x4c\x9d\x8a\xff\xc1\xa6\x28\xb8"
> +                         "\x65\x69\x8a\x45\x29\xef\x74\x85"
> +                         "\xde\x79\xc7\x08\xae\x30\xb0\xf4"
> +                         "\xa3\x1d\x51\x41\xab\xce\xcb\xf6"
> +                         "\xb5\xd8\x6d\xe0\x85\xe1\x98\xb3"
> +                         "\x43\xbb\x86\x83\x0a\xa0\xf5\xb7"
> +                         "\x04\x0b\xfa\x71\x1f\xb0\xf6\xd9"
> +                         "\x13\x00\x15\xf0\xc7\xeb\x0d\x5a"
> +                         "\x9f\xd7\xb9\x6c\x65\x14\x22\x45"
> +                         "\x6e\x45\x32\x3e\x7e\x60\x1a\x12"
> +                         "\x97\x82\x14\xfb\xaa\x04\x22\xfa"
> +                         "\xa0\xe5\x7e\x8c\x78\x02\x48\x5d"
> +                         "\x78\x33\x5a\x7c\xad\xdb\x29\xce"
> +                         "\xbb\x8b\x61\xa4\xb7\x42\xe2\xac"
> +                         "\x8b\x1a\xd9\x2f\x0b\x8b\x62\x21"
> +                         "\x83\x35\x7e\xad\x73\xc2\xb5\x6c"
> +                         "\x10\x26\x38\x07\xe5\xc7\x36\x80"
> +                         "\xe2\x23\x12\x61\xf5\x48\x4b\x2b"
> +                         "\xc5\xdf\x15\xd9\x87\x01\xaa\xac"
> +                         "\x1e\x7c\xad\x73\x78\x18\x63\xe0"
> +                         "\x8b\x9f\x81\xd8\x12\x6a\x28\x10"
> +                         "\xbe\x04\x68\x8a\x09\x7c\x1b\x1c"
> +                         "\x83\x66\x80\x47\x80\xe8\xfd\x35"
> +                         "\x1c\x97\x6f\xae\x49\x10\x66\xcc"
> +                         "\xc6\xd8\xcc\x3a\x84\x91\x20\x77"
> +                         "\x72\xe4\x24\xd2\x37\x9f\xc5\xc9"
> +                         "\x25\x94\x10\x5f\x40\x00\x64\x99"
> +                         "\xdc\xae\xd7\x21\x09\x78\x50\x15"
> +                         "\xac\x5f\xc6\x2c\xa2\x0b\xa9\x39"
> +                         "\x87\x6e\x6d\xab\xde\x08\x51\x16"
> +                         "\xc7\x13\xe9\xea\xed\x06\x8e\x2c"
> +                         "\xf8\x37\x8c\xf0\xa6\x96\x8d\x43"
> +                         "\xb6\x98\x37\xb2\x43\xed\xde\xdf"
> +                         "\x89\x1a\xe7\xeb\x9d\xa1\x7b\x0b"
> +                         "\x77\xb0\xe2\x75\xc0\xf1\x98\xd9"
> +                         "\x80\x55\xc9\x34\x91\xd1\x59\xe8"
> +                         "\x4b\x0f\xc1\xa9\x4b\x7a\x84\x06"
> +                         "\x20\xa8\x5d\xfa\xd1\xde\x70\x56"
> +                         "\x2f\x9e\x91\x9c\x20\xb3\x24\xd8"
> +                         "\x84\x3d\xe1\x8c\x7e\x62\x52\xe5"
> +                         "\x44\x4b\x9f\xc2\x93\x03\xea\x2b"
> +                         "\x59\xc5\xfa\x3f\x91\x2b\xbb\x23"
> +                         "\xf5\xb2\x7b\xf5\x38\xaf\xb3\xee"
> +                         "\x63\xdc\x7b\xd1\xff\xaa\x8b\xab"
> +                         "\x82\x6b\x37\x04\xeb\x74\xbe\x79"
> +                         "\xb9\x83\x90\xef\x20\x59\x46\xff"
> +                         "\xe9\x97\x3e\x2f\xee\xb6\x64\x18"
> +                         "\x38\x4c\x7a\x4a\xf9\x61\xe8\x9a"
> +                         "\xa1\xb5\x01\xa6\x47\xd3\x11\xd4"
> +                         "\xce\xd3\x91\x49\x88\xc7\xb8\x4d"
> +                         "\xb1\xb9\x07\x6d\x16\x72\xae\x46"
> +                         "\x5e\x03\xa1\x4b\xb6\x02\x30\xa8"
> +                         "\x3d\xa9\x07\x2a\x7c\x19\xe7\x62"
> +                         "\x87\xe3\x82\x2f\x6f\xe1\x09\xd9"
> +                         "\x94\x97\xea\xdd\x58\x9e\xae\x76"
> +                         "\x7e\x35\xe5\xb4\xda\x7e\xf4\xde"
> +                         "\xf7\x32\x87\xcd\x93\xbf\x11\x56"
> +                         "\x11\xbe\x08\x74\xe1\x69\xad\xe2"
> +                         "\xd7\xf8\x86\x75\x8a\x3c\xa4\xbe"
> +                         "\x70\xa7\x1b\xfc\x0b\x44\x2a\x76"
> +                         "\x35\xea\x5d\x85\x81\xaf\x85\xeb"
> +                         "\xa0\x1c\x61\xc2\xf7\x4f\xa5\xdc"
> +                         "\x02\x7f\xf6\x95\x40\x6e\x8a\x9a"
> +                         "\xf3\x5d\x25\x6e\x14\x3a\x22\xc9"
> +                         "\x37\x1c\xeb\x46\x54\x3f\xa5\x91"
> +                         "\xc2\xb5\x8c\xfe\x53\x08\x97\x32"
> +                         "\x1b\xb2\x30\x27\xfe\x25\x5d\xdc"
> +                         "\x08\x87\xd0\xe5\x94\x1a\xd4\xf1"
> +                         "\xfe\xd6\xb4\xa3\xe6\x74\x81\x3c"
> +                         "\x1b\xb7\x31\xa7\x22\xfd\xd4\xdd"
> +                         "\x20\x4e\x7c\x51\xb0\x60\x73\xb8"
> +                         "\x9c\xac\x91\x90\x7e\x01\xb0\xe1"
> +                         "\x8a\x2f\x75\x1c\x53\x2a\x98\x2a"
> +                         "\x06\x52\x95\x52\xb2\xe9\x25\x2e"
> +                         "\x4c\xe2\x5a\x00\xb2\x13\x81\x03"
> +                         "\x77\x66\x0d\xa5\x99\xda\x4e\x8c"
> +                         "\xac\xf3\x13\x53\x27\x45\xaf\x64"
> +                         "\x46\xdc\xea\x23\xda\x97\xd1\xab"
> +                         "\x7d\x6c\x30\x96\x1f\xbc\x06\x34"
> +                         "\x18\x0b\x5e\x21\x35\x11\x8d\x4c"
> +                         "\xe0\x2d\xe9\x50\x16\x74\x81\xa8"
> +                         "\xb4\x34\xb9\x72\x42\xa6\xcc\xbc"
> +                         "\xca\x34\x83\x27\x10\x5b\x68\x45"
> +                         "\x8f\x52\x22\x0c\x55\x3d\x29\x7c"
> +                         "\xe3\xc0\x66\x05\x42\x91\x5f\x58"
> +                         "\xfe\x4a\x62\xd9\x8c\xa9\x04\x19"
> +                         "\x04\xa9\x08\x4b\x57\xfc\x67\x53"
> +                         "\x08\x7c\xbc\x66\x8a\xb0\xb6\x9f"
> +                         "\x92\xd6\x41\x7c\x5b\x2a\x00\x79"
> +                         "\x72",
> +               .ctext  = "\x3a\x92\xee\x53\x31\xaf\x2b\x60"
> +                         "\x5f\x55\x8d\x00\x5d\xfc\x74\x97"
> +                         "\x28\x54\xf4\xa5\x75\xf1\x9b\x25"
> +                         "\x62\x1c\xc0\xe0\x13\xc8\x87\x53"
> +                         "\xd0\xf3\xa7\x97\x1f\x3b\x1e\xea"
> +                         "\xe0\xe5\x2a\xd1\xdd\xa4\x3b\x50"
> +                         "\x45\xa3\x0d\x7e\x1b\xc9\xa0\xad"
> +                         "\xb9\x2c\x54\xa6\xc7\x55\x16\xd0"
> +                         "\xc5\x2e\x02\x44\x35\xd0\x7e\x67"
> +                         "\xf2\xc4\x9b\xcd\x95\x10\xcc\x29"
> +                         "\x4b\xfa\x86\x87\xbe\x40\x36\xbe"
> +                         "\xe1\xa3\x52\x89\x55\x20\x9b\xc2"
> +                         "\xab\xf2\x31\x34\x16\xad\xc8\x17"
> +                         "\x65\x24\xc0\xff\x12\x37\xfe\x5a"
> +                         "\x62\x3b\x59\x47\x6c\x5f\x3a\x8e"
> +                         "\x3b\xd9\x30\xc8\x7f\x2f\x88\xda"
> +                         "\x80\xfd\x02\xda\x7f\x9a\x7a\x73"
> +                         "\x59\xc5\x34\x09\x9a\x11\xcb\xa7"
> +                         "\xfc\xf6\xa1\xa0\x60\xfb\x43\xbb"
> +                         "\xf1\xe9\xd7\xc6\x79\x27\x4e\xff"
> +                         "\x22\xb4\x24\xbf\x76\xee\x47\xb9"
> +                         "\x6d\x3f\x8b\xb0\x9c\x3c\x43\xdd"
> +                         "\xff\x25\x2e\x6d\xa4\x2b\xfb\x5d"
> +                         "\x1b\x97\x6c\x55\x0a\x82\x7a\x7b"
> +                         "\x94\x34\xc2\xdb\x2f\x1f\xc1\xea"
> +                         "\xd4\x4d\x17\x46\x3b\x51\x69\x09"
> +                         "\xe4\x99\x32\x25\xfd\x94\xaf\xfb"
> +                         "\x10\xf7\x4f\xdd\x0b\x3c\x8b\x41"
> +                         "\xb3\x6a\xb7\xd1\x33\xa8\x0c\x2f"
> +                         "\x62\x4c\x72\x11\xd7\x74\xe1\x3b"
> +                         "\x38\x43\x66\x7b\x6c\x36\x48\xe7"
> +                         "\xe3\xe7\x9d\xb9\x42\x73\x7a\x2a"
> +                         "\x89\x20\x1a\x41\x80\x03\xf7\x8f"
> +                         "\x61\x78\x13\xbf\xfe\x50\xf5\x04"
> +                         "\x52\xf9\xac\x47\xf8\x62\x4b\xb2"
> +                         "\x24\xa9\xbf\x64\xb0\x18\x69\xd2"
> +                         "\xf5\xe4\xce\xc8\xb1\x87\x75\xd6"
> +                         "\x2c\x24\x79\x00\x7d\x26\xfb\x44"
> +                         "\xe7\x45\x7a\xee\x58\xa5\x83\xc1"
> +                         "\xb4\x24\xab\x23\x2f\x4d\xd7\x4f"
> +                         "\x1c\xc7\xaa\xa9\x50\xf4\xa3\x07"
> +                         "\x12\x13\x89\x74\xdc\x31\x6a\xb2"
> +                         "\xf5\x0f\x13\x8b\xb9\xdb\x85\x1f"
> +                         "\xf5\xbc\x88\xd9\x95\xea\x31\x6c"
> +                         "\x36\x60\xb6\x49\xdc\xc4\xf7\x55"
> +                         "\x3f\x21\xc1\xb5\x92\x18\x5e\xbc"
> +                         "\x9f\x87\x7f\xe7\x79\x25\x40\x33"
> +                         "\xd6\xb9\x33\xd5\x50\xb3\xc7\x89"
> +                         "\x1b\x12\xa0\x46\xdd\xa7\xd8\x3e"
> +                         "\x71\xeb\x6f\x66\xa1\x26\x0c\x67"
> +                         "\xab\xb2\x38\x58\x17\xd8\x44\x3b"
> +                         "\x16\xf0\x8e\x62\x8d\x16\x10\x00"
> +                         "\x32\x8b\xef\xb9\x28\xd3\xc5\xad"
> +                         "\x0a\x19\xa2\xe4\x03\x27\x7d\x94"
> +                         "\x06\x18\xcd\xd6\x27\x00\xf9\x1f"
> +                         "\xb6\xb3\xfe\x96\x35\x5f\xc4\x1c"
> +                         "\x07\x62\x10\x79\x68\x50\xf1\x7e"
> +                         "\x29\xe7\xc4\xc4\xe7\xee\x54\xd6"
> +                         "\x58\x76\x84\x6d\x8d\xe4\x59\x31"
> +                         "\xe9\xf4\xdc\xa1\x1f\xe5\x1a\xd6"
> +                         "\xe6\x64\x46\xf5\x77\x9c\x60\x7a"
> +                         "\x5e\x62\xe3\x0a\xd4\x9f\x7a\x2d"
> +                         "\x7a\xa5\x0a\x7b\x29\x86\x7a\x74"
> +                         "\x74\x71\x6b\xca\x7d\x1d\xaa\xba"
> +                         "\x39\x84\x43\x76\x35\xfe\x4f\x9b"
> +                         "\xbb\xbb\xb5\x6a\x32\xb5\x5d\x41"
> +                         "\x51\xf0\x5b\x68\x03\x47\x4b\x8a"
> +                         "\xca\x88\xf6\x37\xbd\x73\x51\x70"
> +                         "\x66\xfe\x9e\x5f\x21\x9c\xf3\xdd"
> +                         "\xc3\xea\x27\xf9\x64\x94\xe1\x19"
> +                         "\xa0\xa9\xab\x60\xe0\x0e\xf7\x78"
> +                         "\x70\x86\xeb\xe0\xd1\x5c\x05\xd3"
> +                         "\xd7\xca\xe0\xc0\x47\x47\x34\xee"
> +                         "\x11\xa3\xa3\x54\x98\xb7\x49\x8e"
> +                         "\x84\x28\x70\x2c\x9e\xfb\x55\x54"
> +                         "\x4d\xf8\x86\xf7\x85\x7c\xbd\xf3"
> +                         "\x17\xd8\x47\xcb\xac\xf4\x20\x85"
> +                         "\x34\x66\xad\x37\x2d\x5e\x52\xda"
> +                         "\x8a\xfe\x98\x55\x30\xe7\x2d\x2b"
> +                         "\x19\x10\x8e\x7b\x66\x5e\xdc\xe0"
> +                         "\x45\x1f\x7b\xb4\x08\xfb\x8f\xf6"
> +                         "\x8c\x89\x21\x34\x55\x27\xb2\x76"
> +                         "\xb2\x07\xd9\xd6\x68\x9b\xea\x6b"
> +                         "\x2d\xb4\xc4\x35\xdd\xd2\x79\xae"
> +                         "\xc7\xd6\x26\x7f\x12\x01\x8c\xa7"
> +                         "\xe3\xdb\xa8\xf4\xf7\x2b\xec\x99"
> +                         "\x11\x00\xf1\x35\x8c\xcf\xd5\xc9"
> +                         "\xbd\x91\x36\x39\x70\xcf\x7d\x70"
> +                         "\x47\x1a\xfc\x6b\x56\xe0\x3f\x9c"
> +                         "\x60\x49\x01\x72\xa9\xaf\x2c\x9c"
> +                         "\xe8\xab\xda\x8c\x14\x19\xf3\x75"
> +                         "\x07\x17\x9d\x44\x67\x7a\x2e\xef"
> +                         "\xb7\x83\x35\x4a\xd1\x3d\x1c\x84"
> +                         "\x32\xdd\xaa\xea\xca\x1d\xdc\x72"
> +                         "\x2c\xcc\x43\xcd\x5d\xe3\x21\xa4"
> +                         "\xd0\x8a\x4b\x20\x12\xa3\xd5\x86"
> +                         "\x76\x96\xff\x5f\x04\x57\x0f\xe6"
> +                         "\xba\xe8\x76\x50\x0c\x64\x1d\x83"
> +                         "\x9c\x9b\x9a\x9a\x58\x97\x9c\x5c"
> +                         "\xb4\xa4\xa6\x3e\x19\xeb\x8f\x5a"
> +                         "\x61\xb2\x03\x7b\x35\x19\xbe\xa7"
> +                         "\x63\x0c\xfd\xdd\xf9\x90\x6c\x08"
> +                         "\x19\x11\xd3\x65\x4a\xf5\x96\x92"
> +                         "\x59\xaa\x9c\x61\x0c\x29\xa7\xf8"
> +                         "\x14\x39\x37\xbf\x3c\xf2\x16\x72"
> +                         "\x02\xfa\xa2\xf3\x18\x67\x5d\xcb"
> +                         "\xdc\x4d\xbb\x96\xff\x70\x08\x2d"
> +                         "\xc2\xa8\x52\xe1\x34\x5f\x72\xfe"
> +                         "\x64\xbf\xca\xa7\x74\x38\xfb\x74"
> +                         "\x55\x9c\xfa\x8a\xed\xfb\x98\xeb"
> +                         "\x58\x2e\x6c\xe1\x52\x76\x86\xd7"
> +                         "\xcf\xa1\xa4\xfc\xb2\x47\x41\x28"
> +                         "\xa3\xc1\xe5\xfd\x53\x19\x28\x2b"
> +                         "\x37\x04\x65\x96\x99\x7a\x28\x0f"
> +                         "\x07\x68\x4b\xc7\x52\x0a\x55\x35"
> +                         "\x40\x19\x95\x61\xe8\x59\x40\x1f"
> +                         "\x9d\xbf\x78\x7d\x8f\x84\xff\x6f"
> +                         "\xd0\xd5\x63\xd2\x22\xbd\xc8\x4e"
> +                         "\xfb\xe7\x9f\x06\xe6\xe7\x39\x6d"
> +                         "\x6a\x96\x9f\xf0\x74\x7e\xc9\x35"
> +                         "\xb7\x26\xb8\x1c\x0a\xa6\x27\x2c"
> +                         "\xa2\x2b\xfe\xbe\x0f\x07\x73\xae"
> +                         "\x7f\x7f\x54\xf5\x7c\x6a\x0a\x56"
> +                         "\x49\xd4\x81\xe5\x85\x53\x99\x1f"
> +                         "\x95\x05\x13\x58\x8d\x0e\x1b\x90"
> +                         "\xc3\x75\x48\x64\x58\x98\x67\x84"
> +                         "\xae\xe2\x21\xa2\x8a\x04\x0a\x0b"
> +                         "\x61\xaa\xb0\xd4\x28\x60\x7a\xf8"
> +                         "\xbc\x52\xfb\x24\x7f\xed\x0d\x2a"
> +                         "\x0a\xb2\xf9\xc6\x95\xb5\x11\xc9"
> +                         "\xf4\x0f\x26\x11\xcf\x2a\x57\x87"
> +                         "\x7a\xf3\xe7\x94\x65\xc2\xb5\xb3"
> +                         "\xab\x98\xe3\xc1\x2b\x59\x19\x7c"
> +                         "\xd6\xf3\xf9\xbf\xff\x6d\xc6\x82"
> +                         "\x13\x2f\x4a\x2e\xcd\x26\xfe\x2d"
> +                         "\x01\x70\xf4\xc2\x7f\x1f\x4c\xcb"
> +                         "\x47\x77\x0c\xa0\xa3\x03\xec\xda"
> +                         "\xa9\xbf\x0d\x2d\xae\xe4\xb8\x7b"
> +                         "\xa9\xbc\x08\xb4\x68\x2e\xc5\x60"
> +                         "\x8d\x87\x41\x2b\x0f\x69\xf0\xaf"
> +                         "\x5f\xba\x72\x20\x0f\x33\xcd\x6d"
> +                         "\x36\x7d\x7b\xd5\x05\xf1\x4b\x05"
> +                         "\xc4\xfc\x7f\x80\xb9\x4d\xbd\xf7"
> +                         "\x7c\x84\x07\x01\xc2\x40\x66\x5b"
> +                         "\x98\xc7\x2c\xe3\x97\xfa\xdf\x87"
> +                         "\xa0\x1f\xe9\x21\x42\x0f\x3b\xeb"
> +                         "\x89\x1c\x3b\xca\x83\x61\x77\x68"
> +                         "\x84\xbb\x60\x87\x38\x2e\x25\xd5"
> +                         "\x9e\x04\x41\x70\xac\xda\xc0\x9c"
> +                         "\x9c\x69\xea\x8d\x4e\x55\x2a\x29"
> +                         "\xed\x05\x4b\x7b\x73\x71\x90\x59"
> +                         "\x4d\xc8\xd8\x44\xf0\x4c\xe1\x5e"
> +                         "\x84\x47\x55\xcc\x32\x3f\xe7\x97"
> +                         "\x42\xc6\x32\xac\x40\xe5\xa5\xc7"
> +                         "\x8b\xed\xdb\xf7\x83\xd6\xb1\xc2"
> +                         "\x52\x5e\x34\xb7\xeb\x6e\xd9\xfc"
> +                         "\xe5\x93\x9a\x97\x3e\xb0\xdc\xd9"
> +                         "\xd7\x06\x10\xb6\x1d\x80\x59\xdd"
> +                         "\x0d\xfe\x64\x35\xcd\x5d\xec\xf0"
> +                         "\xba\xd0\x34\xc9\x2d\x91\xc5\x17"
> +                         "\x11",
> +               .len    = 1281,
> +               .also_non_np = 1,
> +               .np     = 3,
> +               .tap    = { 1200, 1, 80 },
> +       },
> +};
> +
>  /*
>   * CTS (Cipher Text Stealing) mode tests
>   */
> diff --git a/include/crypto/chacha20.h b/include/crypto/chacha20.h
> index fbec4e6a87890..6290d997060ec 100644
> --- a/include/crypto/chacha20.h
> +++ b/include/crypto/chacha20.h
> @@ -1,6 +1,10 @@
>  /* SPDX-License-Identifier: GPL-2.0 */
>  /*
> - * Common values for the ChaCha20 algorithm
> + * Common values and helper functions for the ChaCha20 and XChaCha20 algorithms.
> + *
> + * XChaCha20 extends ChaCha20's nonce to 192 bits, while provably retaining
> + * ChaCha20's security.  Here they share the same key size, tfm context, and
> + * setkey function; only their IV size and encrypt/decrypt function differ.
>   */
>
>  #ifndef _CRYPTO_CHACHA20_H
> @@ -10,10 +14,15 @@
>  #include <linux/types.h>
>  #include <linux/crypto.h>
>
> +/* 32-bit stream position, then 96-bit nonce (RFC7539 convention) */
>  #define CHACHA20_IV_SIZE       16
> +
>  #define CHACHA20_KEY_SIZE      32
>  #define CHACHA20_BLOCK_SIZE    64
>
> +/* 192-bit nonce, then 64-bit stream position */
> +#define XCHACHA20_IV_SIZE      32
> +
>  struct chacha20_ctx {
>         u32 key[8];
>  };
> @@ -22,8 +31,11 @@ void chacha20_block(u32 *state, u8 *stream);
>  void hchacha20_block(const u32 *in, u32 *out);
>
>  void crypto_chacha20_init(u32 *state, struct chacha20_ctx *ctx, u8 *iv);
> +
>  int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
>                            unsigned int keysize);
> +
>  int crypto_chacha20_crypt(struct skcipher_request *req);
> +int crypto_xchacha20_crypt(struct skcipher_request *req);
>
>  #endif
> --
> 2.19.1.331.ge82ca0e54c-goog
>

* Re: [RFC PATCH v2 03/12] crypto: chacha20-generic - refactor to allow varying number of rounds
  2018-10-15 17:54 ` [RFC PATCH v2 03/12] crypto: chacha20-generic - refactor to allow varying number of rounds Eric Biggers
@ 2018-10-19 14:25   ` Ard Biesheuvel
  0 siblings, 0 replies; 54+ messages in thread
From: Ard Biesheuvel @ 2018-10-19 14:25 UTC (permalink / raw)
  To: Eric Biggers
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fscrypt,
	linux-arm-kernel, Linux Kernel Mailing List, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

On 16 October 2018 at 01:54, Eric Biggers <ebiggers@kernel.org> wrote:
> From: Eric Biggers <ebiggers@google.com>
>
> In preparation for adding XChaCha12 support, rename/refactor
> chacha20-generic to support different numbers of rounds.  The
> justification for needing XChaCha12 support is explained in more detail
> in the patch "crypto: chacha - add XChaCha12 support".
>
> The only difference between ChaCha{8,12,20} is the number of rounds
> itself; all other parts of the algorithm are the same.  Therefore,
> remove the "20" from all definitions, structures, functions, files, etc.
> that will be shared by all ChaCha versions.
>
> Also make ->setkey() store the round count in the chacha_ctx (previously
> chacha20_ctx).  The generic code then passes the round count through to
> chacha_block().  There will be a ->setkey() function for each explicitly
> allowed round count; the encrypt/decrypt functions will be the same.  I
> decided not to do it the opposite way (same ->setkey() function for all
> round counts, with different encrypt/decrypt functions) because that
> would have required more boilerplate code in architecture-specific
> implementations of ChaCha and XChaCha.
>
> To be as careful as possible, we whitelist the allowed round counts in
> the low-level generic code.  Currently only 20 is allowed, i.e. no
> actual use of other variants is introduced by this patch.
>
> Signed-off-by: Eric Biggers <ebiggers@google.com>

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
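
To make the ->setkey() scheme from the commit message concrete, here is a
minimal userspace sketch in C.  It is an illustration only, not the patch's
code: every `_sketch` name, the word-based (rather than byte-stream) output,
and the plain -1 error return are assumptions made for this example.  The
setkey helper records the round count in the context and rejects anything
other than 20, and the block function takes the round count as a parameter.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the renamed context: key plus round count. */
struct chacha_ctx_sketch {
	uint32_t key[8];
	int nrounds;
};

static uint32_t rol32(uint32_t v, int n)
{
	return (v << n) | (v >> (32 - n));
}

static uint32_t le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

#define QR(x, a, b, c, d) do { \
	x[a] += x[b]; x[d] = rol32(x[d] ^ x[a], 16); \
	x[c] += x[d]; x[b] = rol32(x[b] ^ x[c], 12); \
	x[a] += x[b]; x[d] = rol32(x[d] ^ x[a], 8);  \
	x[c] += x[d]; x[b] = rol32(x[b] ^ x[c], 7);  \
} while (0)

/* One ChaCha block; the round count is a parameter, not a constant. */
static void chacha_block_sketch(const uint32_t state[16], uint32_t out[16],
				int nrounds)
{
	uint32_t x[16];
	int i;

	assert(nrounds > 0 && nrounds % 2 == 0); /* rounds come in pairs */
	memcpy(x, state, sizeof(x));
	for (i = 0; i < nrounds; i += 2) {
		/* column round */
		QR(x, 0, 4, 8, 12);  QR(x, 1, 5, 9, 13);
		QR(x, 2, 6, 10, 14); QR(x, 3, 7, 11, 15);
		/* diagonal round */
		QR(x, 0, 5, 10, 15); QR(x, 1, 6, 11, 12);
		QR(x, 2, 7, 8, 13);  QR(x, 3, 4, 9, 14);
	}
	for (i = 0; i < 16; i++)
		out[i] = x[i] + state[i];
}

/* Shared helper: whitelist the round count and store it with the key. */
static int chacha_setkey_sketch(struct chacha_ctx_sketch *ctx,
				const uint8_t *key, unsigned int keysize,
				int nrounds)
{
	int i;

	if (keysize != 32)
		return -1;
	if (nrounds != 20)	/* only 20 is allowed at this point */
		return -1;
	for (i = 0; i < 8; i++)
		ctx->key[i] = le32(key + 4 * i);
	ctx->nrounds = nrounds;
	return 0;
}

/* One thin ->setkey() per allowed round count. */
static int chacha20_setkey_sketch(struct chacha_ctx_sketch *ctx,
				  const uint8_t *key, unsigned int keysize)
{
	return chacha_setkey_sketch(ctx, key, keysize, 20);
}

/* 16-byte IV: 32-bit stream position, then 96-bit nonce. */
static void chacha_init_sketch(uint32_t state[16],
			       const struct chacha_ctx_sketch *ctx,
			       const uint8_t iv[16])
{
	int i;

	state[0] = 0x61707865; state[1] = 0x3320646e;
	state[2] = 0x79622d32; state[3] = 0x6b206574;
	for (i = 0; i < 8; i++)
		state[4 + i] = ctx->key[i];
	for (i = 0; i < 4; i++)
		state[12 + i] = le32(iv + 4 * i);
}
```

A caller would invoke chacha20_setkey_sketch() once, then pass ctx.nrounds to
chacha_block_sketch() for each block.  With the key, nonce, and counter from
RFC 7539 section 2.3.2, the first output word for 20 rounds should be
0xe4e7f110.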

> ---
>  arch/arm/crypto/chacha20-neon-glue.c          |  40 +++----
>  arch/arm64/crypto/chacha20-neon-glue.c        |  40 +++----
>  arch/x86/crypto/chacha20_glue.c               |  52 ++++-----
>  crypto/Makefile                               |   2 +-
>  crypto/chacha20poly1305.c                     |  10 +-
>  .../{chacha20_generic.c => chacha_generic.c}  | 110 ++++++++++--------
>  drivers/char/random.c                         |  51 ++++----
>  include/crypto/chacha.h                       |  46 ++++++++
>  include/crypto/chacha20.h                     |  41 -------
>  lib/Makefile                                  |   2 +-
>  lib/{chacha20.c => chacha.c}                  |  43 ++++---
>  11 files changed, 227 insertions(+), 210 deletions(-)
>  rename crypto/{chacha20_generic.c => chacha_generic.c} (57%)
>  create mode 100644 include/crypto/chacha.h
>  delete mode 100644 include/crypto/chacha20.h
>  rename lib/{chacha20.c => chacha.c} (67%)
>
> diff --git a/arch/arm/crypto/chacha20-neon-glue.c b/arch/arm/crypto/chacha20-neon-glue.c
> index 59a7be08e80ce..7386eb1c1889d 100644
> --- a/arch/arm/crypto/chacha20-neon-glue.c
> +++ b/arch/arm/crypto/chacha20-neon-glue.c
> @@ -19,7 +19,7 @@
>   */
>
>  #include <crypto/algapi.h>
> -#include <crypto/chacha20.h>
> +#include <crypto/chacha.h>
>  #include <crypto/internal/skcipher.h>
>  #include <linux/kernel.h>
>  #include <linux/module.h>
> @@ -34,20 +34,20 @@ asmlinkage void chacha20_4block_xor_neon(u32 *state, u8 *dst, const u8 *src);
>  static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
>                             unsigned int bytes)
>  {
> -       u8 buf[CHACHA20_BLOCK_SIZE];
> +       u8 buf[CHACHA_BLOCK_SIZE];
>
> -       while (bytes >= CHACHA20_BLOCK_SIZE * 4) {
> +       while (bytes >= CHACHA_BLOCK_SIZE * 4) {
>                 chacha20_4block_xor_neon(state, dst, src);
> -               bytes -= CHACHA20_BLOCK_SIZE * 4;
> -               src += CHACHA20_BLOCK_SIZE * 4;
> -               dst += CHACHA20_BLOCK_SIZE * 4;
> +               bytes -= CHACHA_BLOCK_SIZE * 4;
> +               src += CHACHA_BLOCK_SIZE * 4;
> +               dst += CHACHA_BLOCK_SIZE * 4;
>                 state[12] += 4;
>         }
> -       while (bytes >= CHACHA20_BLOCK_SIZE) {
> +       while (bytes >= CHACHA_BLOCK_SIZE) {
>                 chacha20_block_xor_neon(state, dst, src);
> -               bytes -= CHACHA20_BLOCK_SIZE;
> -               src += CHACHA20_BLOCK_SIZE;
> -               dst += CHACHA20_BLOCK_SIZE;
> +               bytes -= CHACHA_BLOCK_SIZE;
> +               src += CHACHA_BLOCK_SIZE;
> +               dst += CHACHA_BLOCK_SIZE;
>                 state[12]++;
>         }
>         if (bytes) {
> @@ -60,17 +60,17 @@ static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
>  static int chacha20_neon(struct skcipher_request *req)
>  {
>         struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
> -       struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
> +       struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
>         struct skcipher_walk walk;
>         u32 state[16];
>         int err;
>
> -       if (req->cryptlen <= CHACHA20_BLOCK_SIZE || !may_use_simd())
> -               return crypto_chacha20_crypt(req);
> +       if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd())
> +               return crypto_chacha_crypt(req);
>
>         err = skcipher_walk_virt(&walk, req, true);
>
> -       crypto_chacha20_init(state, ctx, walk.iv);
> +       crypto_chacha_init(state, ctx, walk.iv);
>
>         kernel_neon_begin();
>         while (walk.nbytes > 0) {
> @@ -93,14 +93,14 @@ static struct skcipher_alg alg = {
>         .base.cra_driver_name   = "chacha20-neon",
>         .base.cra_priority      = 300,
>         .base.cra_blocksize     = 1,
> -       .base.cra_ctxsize       = sizeof(struct chacha20_ctx),
> +       .base.cra_ctxsize       = sizeof(struct chacha_ctx),
>         .base.cra_module        = THIS_MODULE,
>
> -       .min_keysize            = CHACHA20_KEY_SIZE,
> -       .max_keysize            = CHACHA20_KEY_SIZE,
> -       .ivsize                 = CHACHA20_IV_SIZE,
> -       .chunksize              = CHACHA20_BLOCK_SIZE,
> -       .walksize               = 4 * CHACHA20_BLOCK_SIZE,
> +       .min_keysize            = CHACHA_KEY_SIZE,
> +       .max_keysize            = CHACHA_KEY_SIZE,
> +       .ivsize                 = CHACHA_IV_SIZE,
> +       .chunksize              = CHACHA_BLOCK_SIZE,
> +       .walksize               = 4 * CHACHA_BLOCK_SIZE,
>         .setkey                 = crypto_chacha20_setkey,
>         .encrypt                = chacha20_neon,
>         .decrypt                = chacha20_neon,
> diff --git a/arch/arm64/crypto/chacha20-neon-glue.c b/arch/arm64/crypto/chacha20-neon-glue.c
> index 727579c93dedb..96e0cfb8c3f5b 100644
> --- a/arch/arm64/crypto/chacha20-neon-glue.c
> +++ b/arch/arm64/crypto/chacha20-neon-glue.c
> @@ -19,7 +19,7 @@
>   */
>
>  #include <crypto/algapi.h>
> -#include <crypto/chacha20.h>
> +#include <crypto/chacha.h>
>  #include <crypto/internal/skcipher.h>
>  #include <linux/kernel.h>
>  #include <linux/module.h>
> @@ -34,15 +34,15 @@ asmlinkage void chacha20_4block_xor_neon(u32 *state, u8 *dst, const u8 *src);
>  static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
>                             unsigned int bytes)
>  {
> -       u8 buf[CHACHA20_BLOCK_SIZE];
> +       u8 buf[CHACHA_BLOCK_SIZE];
>
> -       while (bytes >= CHACHA20_BLOCK_SIZE * 4) {
> +       while (bytes >= CHACHA_BLOCK_SIZE * 4) {
>                 kernel_neon_begin();
>                 chacha20_4block_xor_neon(state, dst, src);
>                 kernel_neon_end();
> -               bytes -= CHACHA20_BLOCK_SIZE * 4;
> -               src += CHACHA20_BLOCK_SIZE * 4;
> -               dst += CHACHA20_BLOCK_SIZE * 4;
> +               bytes -= CHACHA_BLOCK_SIZE * 4;
> +               src += CHACHA_BLOCK_SIZE * 4;
> +               dst += CHACHA_BLOCK_SIZE * 4;
>                 state[12] += 4;
>         }
>
> @@ -50,11 +50,11 @@ static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
>                 return;
>
>         kernel_neon_begin();
> -       while (bytes >= CHACHA20_BLOCK_SIZE) {
> +       while (bytes >= CHACHA_BLOCK_SIZE) {
>                 chacha20_block_xor_neon(state, dst, src);
> -               bytes -= CHACHA20_BLOCK_SIZE;
> -               src += CHACHA20_BLOCK_SIZE;
> -               dst += CHACHA20_BLOCK_SIZE;
> +               bytes -= CHACHA_BLOCK_SIZE;
> +               src += CHACHA_BLOCK_SIZE;
> +               dst += CHACHA_BLOCK_SIZE;
>                 state[12]++;
>         }
>         if (bytes) {
> @@ -68,17 +68,17 @@ static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
>  static int chacha20_neon(struct skcipher_request *req)
>  {
>         struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
> -       struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
> +       struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
>         struct skcipher_walk walk;
>         u32 state[16];
>         int err;
>
> -       if (!may_use_simd() || req->cryptlen <= CHACHA20_BLOCK_SIZE)
> -               return crypto_chacha20_crypt(req);
> +       if (!may_use_simd() || req->cryptlen <= CHACHA_BLOCK_SIZE)
> +               return crypto_chacha_crypt(req);
>
>         err = skcipher_walk_virt(&walk, req, false);
>
> -       crypto_chacha20_init(state, ctx, walk.iv);
> +       crypto_chacha_init(state, ctx, walk.iv);
>
>         while (walk.nbytes > 0) {
>                 unsigned int nbytes = walk.nbytes;
> @@ -99,14 +99,14 @@ static struct skcipher_alg alg = {
>         .base.cra_driver_name   = "chacha20-neon",
>         .base.cra_priority      = 300,
>         .base.cra_blocksize     = 1,
> -       .base.cra_ctxsize       = sizeof(struct chacha20_ctx),
> +       .base.cra_ctxsize       = sizeof(struct chacha_ctx),
>         .base.cra_module        = THIS_MODULE,
>
> -       .min_keysize            = CHACHA20_KEY_SIZE,
> -       .max_keysize            = CHACHA20_KEY_SIZE,
> -       .ivsize                 = CHACHA20_IV_SIZE,
> -       .chunksize              = CHACHA20_BLOCK_SIZE,
> -       .walksize               = 4 * CHACHA20_BLOCK_SIZE,
> +       .min_keysize            = CHACHA_KEY_SIZE,
> +       .max_keysize            = CHACHA_KEY_SIZE,
> +       .ivsize                 = CHACHA_IV_SIZE,
> +       .chunksize              = CHACHA_BLOCK_SIZE,
> +       .walksize               = 4 * CHACHA_BLOCK_SIZE,
>         .setkey                 = crypto_chacha20_setkey,
>         .encrypt                = chacha20_neon,
>         .decrypt                = chacha20_neon,
> diff --git a/arch/x86/crypto/chacha20_glue.c b/arch/x86/crypto/chacha20_glue.c
> index dce7c5d39c2f2..bd249f0b29dc2 100644
> --- a/arch/x86/crypto/chacha20_glue.c
> +++ b/arch/x86/crypto/chacha20_glue.c
> @@ -10,7 +10,7 @@
>   */
>
>  #include <crypto/algapi.h>
> -#include <crypto/chacha20.h>
> +#include <crypto/chacha.h>
>  #include <crypto/internal/skcipher.h>
>  #include <linux/kernel.h>
>  #include <linux/module.h>
> @@ -29,31 +29,31 @@ static bool chacha20_use_avx2;
>  static void chacha20_dosimd(u32 *state, u8 *dst, const u8 *src,
>                             unsigned int bytes)
>  {
> -       u8 buf[CHACHA20_BLOCK_SIZE];
> +       u8 buf[CHACHA_BLOCK_SIZE];
>
>  #ifdef CONFIG_AS_AVX2
>         if (chacha20_use_avx2) {
> -               while (bytes >= CHACHA20_BLOCK_SIZE * 8) {
> +               while (bytes >= CHACHA_BLOCK_SIZE * 8) {
>                         chacha20_8block_xor_avx2(state, dst, src);
> -                       bytes -= CHACHA20_BLOCK_SIZE * 8;
> -                       src += CHACHA20_BLOCK_SIZE * 8;
> -                       dst += CHACHA20_BLOCK_SIZE * 8;
> +                       bytes -= CHACHA_BLOCK_SIZE * 8;
> +                       src += CHACHA_BLOCK_SIZE * 8;
> +                       dst += CHACHA_BLOCK_SIZE * 8;
>                         state[12] += 8;
>                 }
>         }
>  #endif
> -       while (bytes >= CHACHA20_BLOCK_SIZE * 4) {
> +       while (bytes >= CHACHA_BLOCK_SIZE * 4) {
>                 chacha20_4block_xor_ssse3(state, dst, src);
> -               bytes -= CHACHA20_BLOCK_SIZE * 4;
> -               src += CHACHA20_BLOCK_SIZE * 4;
> -               dst += CHACHA20_BLOCK_SIZE * 4;
> +               bytes -= CHACHA_BLOCK_SIZE * 4;
> +               src += CHACHA_BLOCK_SIZE * 4;
> +               dst += CHACHA_BLOCK_SIZE * 4;
>                 state[12] += 4;
>         }
> -       while (bytes >= CHACHA20_BLOCK_SIZE) {
> +       while (bytes >= CHACHA_BLOCK_SIZE) {
>                 chacha20_block_xor_ssse3(state, dst, src);
> -               bytes -= CHACHA20_BLOCK_SIZE;
> -               src += CHACHA20_BLOCK_SIZE;
> -               dst += CHACHA20_BLOCK_SIZE;
> +               bytes -= CHACHA_BLOCK_SIZE;
> +               src += CHACHA_BLOCK_SIZE;
> +               dst += CHACHA_BLOCK_SIZE;
>                 state[12]++;
>         }
>         if (bytes) {
> @@ -66,7 +66,7 @@ static void chacha20_dosimd(u32 *state, u8 *dst, const u8 *src,
>  static int chacha20_simd(struct skcipher_request *req)
>  {
>         struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
> -       struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
> +       struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
>         u32 *state, state_buf[16 + 2] __aligned(8);
>         struct skcipher_walk walk;
>         int err;
> @@ -74,20 +74,20 @@ static int chacha20_simd(struct skcipher_request *req)
>         BUILD_BUG_ON(CHACHA20_STATE_ALIGN != 16);
>         state = PTR_ALIGN(state_buf + 0, CHACHA20_STATE_ALIGN);
>
> -       if (req->cryptlen <= CHACHA20_BLOCK_SIZE || !may_use_simd())
> -               return crypto_chacha20_crypt(req);
> +       if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd())
> +               return crypto_chacha_crypt(req);
>
>         err = skcipher_walk_virt(&walk, req, true);
>
> -       crypto_chacha20_init(state, ctx, walk.iv);
> +       crypto_chacha_init(state, ctx, walk.iv);
>
>         kernel_fpu_begin();
>
> -       while (walk.nbytes >= CHACHA20_BLOCK_SIZE) {
> +       while (walk.nbytes >= CHACHA_BLOCK_SIZE) {
>                 chacha20_dosimd(state, walk.dst.virt.addr, walk.src.virt.addr,
> -                               rounddown(walk.nbytes, CHACHA20_BLOCK_SIZE));
> +                               rounddown(walk.nbytes, CHACHA_BLOCK_SIZE));
>                 err = skcipher_walk_done(&walk,
> -                                        walk.nbytes % CHACHA20_BLOCK_SIZE);
> +                                        walk.nbytes % CHACHA_BLOCK_SIZE);
>         }
>
>         if (walk.nbytes) {
> @@ -106,13 +106,13 @@ static struct skcipher_alg alg = {
>         .base.cra_driver_name   = "chacha20-simd",
>         .base.cra_priority      = 300,
>         .base.cra_blocksize     = 1,
> -       .base.cra_ctxsize       = sizeof(struct chacha20_ctx),
> +       .base.cra_ctxsize       = sizeof(struct chacha_ctx),
>         .base.cra_module        = THIS_MODULE,
>
> -       .min_keysize            = CHACHA20_KEY_SIZE,
> -       .max_keysize            = CHACHA20_KEY_SIZE,
> -       .ivsize                 = CHACHA20_IV_SIZE,
> -       .chunksize              = CHACHA20_BLOCK_SIZE,
> +       .min_keysize            = CHACHA_KEY_SIZE,
> +       .max_keysize            = CHACHA_KEY_SIZE,
> +       .ivsize                 = CHACHA_IV_SIZE,
> +       .chunksize              = CHACHA_BLOCK_SIZE,
>         .setkey                 = crypto_chacha20_setkey,
>         .encrypt                = chacha20_simd,
>         .decrypt                = chacha20_simd,
> diff --git a/crypto/Makefile b/crypto/Makefile
> index 5c207c76abf7e..7e673f7c71107 100644
> --- a/crypto/Makefile
> +++ b/crypto/Makefile
> @@ -116,7 +116,7 @@ obj-$(CONFIG_CRYPTO_KHAZAD) += khazad.o
>  obj-$(CONFIG_CRYPTO_ANUBIS) += anubis.o
>  obj-$(CONFIG_CRYPTO_SEED) += seed.o
>  obj-$(CONFIG_CRYPTO_SALSA20) += salsa20_generic.o
> -obj-$(CONFIG_CRYPTO_CHACHA20) += chacha20_generic.o
> +obj-$(CONFIG_CRYPTO_CHACHA20) += chacha_generic.o
>  obj-$(CONFIG_CRYPTO_POLY1305) += poly1305_generic.o
>  obj-$(CONFIG_CRYPTO_DEFLATE) += deflate.o
>  obj-$(CONFIG_CRYPTO_MICHAEL_MIC) += michael_mic.o
> diff --git a/crypto/chacha20poly1305.c b/crypto/chacha20poly1305.c
> index 600afa99941fe..573c07e6f189e 100644
> --- a/crypto/chacha20poly1305.c
> +++ b/crypto/chacha20poly1305.c
> @@ -13,7 +13,7 @@
>  #include <crypto/internal/hash.h>
>  #include <crypto/internal/skcipher.h>
>  #include <crypto/scatterwalk.h>
> -#include <crypto/chacha20.h>
> +#include <crypto/chacha.h>
>  #include <crypto/poly1305.h>
>  #include <linux/err.h>
>  #include <linux/init.h>
> @@ -51,7 +51,7 @@ struct poly_req {
>  };
>
>  struct chacha_req {
> -       u8 iv[CHACHA20_IV_SIZE];
> +       u8 iv[CHACHA_IV_SIZE];
>         struct scatterlist src[1];
>         struct skcipher_request req; /* must be last member */
>  };
> @@ -91,7 +91,7 @@ static void chacha_iv(u8 *iv, struct aead_request *req, u32 icb)
>         memcpy(iv, &leicb, sizeof(leicb));
>         memcpy(iv + sizeof(leicb), ctx->salt, ctx->saltlen);
>         memcpy(iv + sizeof(leicb) + ctx->saltlen, req->iv,
> -              CHACHA20_IV_SIZE - sizeof(leicb) - ctx->saltlen);
> +              CHACHA_IV_SIZE - sizeof(leicb) - ctx->saltlen);
>  }
>
>  static int poly_verify_tag(struct aead_request *req)
> @@ -494,7 +494,7 @@ static int chachapoly_setkey(struct crypto_aead *aead, const u8 *key,
>         struct chachapoly_ctx *ctx = crypto_aead_ctx(aead);
>         int err;
>
> -       if (keylen != ctx->saltlen + CHACHA20_KEY_SIZE)
> +       if (keylen != ctx->saltlen + CHACHA_KEY_SIZE)
>                 return -EINVAL;
>
>         keylen -= ctx->saltlen;
> @@ -639,7 +639,7 @@ static int chachapoly_create(struct crypto_template *tmpl, struct rtattr **tb,
>
>         err = -EINVAL;
>         /* Need 16-byte IV size, including Initial Block Counter value */
> -       if (crypto_skcipher_alg_ivsize(chacha) != CHACHA20_IV_SIZE)
> +       if (crypto_skcipher_alg_ivsize(chacha) != CHACHA_IV_SIZE)
>                 goto out_drop_chacha;
>         /* Not a stream cipher? */
>         if (chacha->base.cra_blocksize != 1)
> diff --git a/crypto/chacha20_generic.c b/crypto/chacha_generic.c
> similarity index 57%
> rename from crypto/chacha20_generic.c
> rename to crypto/chacha_generic.c
> index 07902fe37aeb8..8e25e9930c549 100644
> --- a/crypto/chacha20_generic.c
> +++ b/crypto/chacha_generic.c
> @@ -12,33 +12,33 @@
>
>  #include <asm/unaligned.h>
>  #include <crypto/algapi.h>
> -#include <crypto/chacha20.h>
> +#include <crypto/chacha.h>
>  #include <crypto/internal/skcipher.h>
>  #include <linux/module.h>
>
> -static void chacha20_docrypt(u32 *state, u8 *dst, const u8 *src,
> -                            unsigned int bytes)
> +static void chacha_docrypt(u32 *state, u8 *dst, const u8 *src,
> +                          unsigned int bytes, int nrounds)
>  {
>         /* aligned to potentially speed up crypto_xor() */
> -       u8 stream[CHACHA20_BLOCK_SIZE] __aligned(sizeof(long));
> +       u8 stream[CHACHA_BLOCK_SIZE] __aligned(sizeof(long));
>
>         if (dst != src)
>                 memcpy(dst, src, bytes);
>
> -       while (bytes >= CHACHA20_BLOCK_SIZE) {
> -               chacha20_block(state, stream);
> -               crypto_xor(dst, stream, CHACHA20_BLOCK_SIZE);
> -               bytes -= CHACHA20_BLOCK_SIZE;
> -               dst += CHACHA20_BLOCK_SIZE;
> +       while (bytes >= CHACHA_BLOCK_SIZE) {
> +               chacha_block(state, stream, nrounds);
> +               crypto_xor(dst, stream, CHACHA_BLOCK_SIZE);
> +               bytes -= CHACHA_BLOCK_SIZE;
> +               dst += CHACHA_BLOCK_SIZE;
>         }
>         if (bytes) {
> -               chacha20_block(state, stream);
> +               chacha_block(state, stream, nrounds);
>                 crypto_xor(dst, stream, bytes);
>         }
>  }
>
> -static int chacha20_stream_xor(struct skcipher_request *req,
> -                              struct chacha20_ctx *ctx, u8 *iv)
> +static int chacha_stream_xor(struct skcipher_request *req,
> +                            struct chacha_ctx *ctx, u8 *iv)
>  {
>         struct skcipher_walk walk;
>         u32 state[16];
> @@ -46,7 +46,7 @@ static int chacha20_stream_xor(struct skcipher_request *req,
>
>         err = skcipher_walk_virt(&walk, req, true);
>
> -       crypto_chacha20_init(state, ctx, iv);
> +       crypto_chacha_init(state, ctx, iv);
>
>         while (walk.nbytes > 0) {
>                 unsigned int nbytes = walk.nbytes;
> @@ -54,15 +54,15 @@ static int chacha20_stream_xor(struct skcipher_request *req,
>                 if (nbytes < walk.total)
>                         nbytes = round_down(nbytes, walk.stride);
>
> -               chacha20_docrypt(state, walk.dst.virt.addr, walk.src.virt.addr,
> -                                nbytes);
> +               chacha_docrypt(state, walk.dst.virt.addr, walk.src.virt.addr,
> +                              nbytes, ctx->nrounds);
>                 err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
>         }
>
>         return err;
>  }
>
> -void crypto_chacha20_init(u32 *state, struct chacha20_ctx *ctx, u8 *iv)
> +void crypto_chacha_init(u32 *state, struct chacha_ctx *ctx, u8 *iv)
>  {
>         state[0]  = 0x61707865; /* "expa" */
>         state[1]  = 0x3320646e; /* "nd 3" */
> @@ -81,53 +81,61 @@ void crypto_chacha20_init(u32 *state, struct chacha20_ctx *ctx, u8 *iv)
>         state[14] = get_unaligned_le32(iv +  8);
>         state[15] = get_unaligned_le32(iv + 12);
>  }
> -EXPORT_SYMBOL_GPL(crypto_chacha20_init);
> +EXPORT_SYMBOL_GPL(crypto_chacha_init);
>
> -int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
> -                          unsigned int keysize)
> +static int chacha_setkey(struct crypto_skcipher *tfm, const u8 *key,
> +                        unsigned int keysize, int nrounds)
>  {
> -       struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
> +       struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
>         int i;
>
> -       if (keysize != CHACHA20_KEY_SIZE)
> +       if (keysize != CHACHA_KEY_SIZE)
>                 return -EINVAL;
>
>         for (i = 0; i < ARRAY_SIZE(ctx->key); i++)
>                 ctx->key[i] = get_unaligned_le32(key + i * sizeof(u32));
>
> +       ctx->nrounds = nrounds;
>         return 0;
>  }
> +
> +int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
> +                          unsigned int keysize)
> +{
> +       return chacha_setkey(tfm, key, keysize, 20);
> +}
>  EXPORT_SYMBOL_GPL(crypto_chacha20_setkey);
>
> -int crypto_chacha20_crypt(struct skcipher_request *req)
> +int crypto_chacha_crypt(struct skcipher_request *req)
>  {
>         struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
> -       struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
> +       struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
>
> -       return chacha20_stream_xor(req, ctx, req->iv);
> +       return chacha_stream_xor(req, ctx, req->iv);
>  }
> -EXPORT_SYMBOL_GPL(crypto_chacha20_crypt);
> +EXPORT_SYMBOL_GPL(crypto_chacha_crypt);
>
> -int crypto_xchacha20_crypt(struct skcipher_request *req)
> +int crypto_xchacha_crypt(struct skcipher_request *req)
>  {
>         struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
> -       struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
> -       struct chacha20_ctx subctx;
> +       struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
> +       struct chacha_ctx subctx;
>         u32 state[16];
>         u8 real_iv[16];
>
>         /* Compute the subkey given the original key and first 128 nonce bits */
> -       crypto_chacha20_init(state, ctx, req->iv);
> -       hchacha20_block(state, subctx.key);
> +       crypto_chacha_init(state, ctx, req->iv);
> +       hchacha_block(state, subctx.key, ctx->nrounds);
> +       subctx.nrounds = ctx->nrounds;
>
>         /* Build the real IV */
>         memcpy(&real_iv[0], req->iv + 24, 8); /* stream position */
>         memcpy(&real_iv[8], req->iv + 16, 8); /* remaining 64 nonce bits */
>
>         /* Generate the stream and XOR it with the data */
> -       return chacha20_stream_xor(req, &subctx, real_iv);
> +       return chacha_stream_xor(req, &subctx, real_iv);
>  }
> -EXPORT_SYMBOL_GPL(crypto_xchacha20_crypt);
> +EXPORT_SYMBOL_GPL(crypto_xchacha_crypt);
>
>  static struct skcipher_alg algs[] = {
>         {
> @@ -135,50 +143,50 @@ static struct skcipher_alg algs[] = {
>                 .base.cra_driver_name   = "chacha20-generic",
>                 .base.cra_priority      = 100,
>                 .base.cra_blocksize     = 1,
> -               .base.cra_ctxsize       = sizeof(struct chacha20_ctx),
> +               .base.cra_ctxsize       = sizeof(struct chacha_ctx),
>                 .base.cra_module        = THIS_MODULE,
>
> -               .min_keysize            = CHACHA20_KEY_SIZE,
> -               .max_keysize            = CHACHA20_KEY_SIZE,
> -               .ivsize                 = CHACHA20_IV_SIZE,
> -               .chunksize              = CHACHA20_BLOCK_SIZE,
> +               .min_keysize            = CHACHA_KEY_SIZE,
> +               .max_keysize            = CHACHA_KEY_SIZE,
> +               .ivsize                 = CHACHA_IV_SIZE,
> +               .chunksize              = CHACHA_BLOCK_SIZE,
>                 .setkey                 = crypto_chacha20_setkey,
> -               .encrypt                = crypto_chacha20_crypt,
> -               .decrypt                = crypto_chacha20_crypt,
> +               .encrypt                = crypto_chacha_crypt,
> +               .decrypt                = crypto_chacha_crypt,
>         }, {
>                 .base.cra_name          = "xchacha20",
>                 .base.cra_driver_name   = "xchacha20-generic",
>                 .base.cra_priority      = 100,
>                 .base.cra_blocksize     = 1,
> -               .base.cra_ctxsize       = sizeof(struct chacha20_ctx),
> +               .base.cra_ctxsize       = sizeof(struct chacha_ctx),
>                 .base.cra_module        = THIS_MODULE,
>
> -               .min_keysize            = CHACHA20_KEY_SIZE,
> -               .max_keysize            = CHACHA20_KEY_SIZE,
> -               .ivsize                 = XCHACHA20_IV_SIZE,
> -               .chunksize              = CHACHA20_BLOCK_SIZE,
> +               .min_keysize            = CHACHA_KEY_SIZE,
> +               .max_keysize            = CHACHA_KEY_SIZE,
> +               .ivsize                 = XCHACHA_IV_SIZE,
> +               .chunksize              = CHACHA_BLOCK_SIZE,
>                 .setkey                 = crypto_chacha20_setkey,
> -               .encrypt                = crypto_xchacha20_crypt,
> -               .decrypt                = crypto_xchacha20_crypt,
> +               .encrypt                = crypto_xchacha_crypt,
> +               .decrypt                = crypto_xchacha_crypt,
>         }
>  };
>
> -static int __init chacha20_generic_mod_init(void)
> +static int __init chacha_generic_mod_init(void)
>  {
>         return crypto_register_skciphers(algs, ARRAY_SIZE(algs));
>  }
>
> -static void __exit chacha20_generic_mod_fini(void)
> +static void __exit chacha_generic_mod_fini(void)
>  {
>         crypto_unregister_skciphers(algs, ARRAY_SIZE(algs));
>  }
>
> -module_init(chacha20_generic_mod_init);
> -module_exit(chacha20_generic_mod_fini);
> +module_init(chacha_generic_mod_init);
> +module_exit(chacha_generic_mod_fini);
>
>  MODULE_LICENSE("GPL");
>  MODULE_AUTHOR("Martin Willi <martin@strongswan.org>");
> -MODULE_DESCRIPTION("ChaCha20 and XChaCha20 stream ciphers (generic)");
> +MODULE_DESCRIPTION("ChaCha and XChaCha stream ciphers (generic)");
>  MODULE_ALIAS_CRYPTO("chacha20");
>  MODULE_ALIAS_CRYPTO("chacha20-generic");
>  MODULE_ALIAS_CRYPTO("xchacha20");
> diff --git a/drivers/char/random.c b/drivers/char/random.c
> index d22d967c50f0a..5f47c4c8b9b15 100644
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -265,7 +265,7 @@
>  #include <linux/syscalls.h>
>  #include <linux/completion.h>
>  #include <linux/uuid.h>
> -#include <crypto/chacha20.h>
> +#include <crypto/chacha.h>
>
>  #include <asm/processor.h>
>  #include <linux/uaccess.h>
> @@ -431,11 +431,10 @@ static int crng_init = 0;
>  #define crng_ready() (likely(crng_init > 1))
>  static int crng_init_cnt = 0;
>  static unsigned long crng_global_init_time = 0;
> -#define CRNG_INIT_CNT_THRESH (2*CHACHA20_KEY_SIZE)
> -static void _extract_crng(struct crng_state *crng,
> -                         __u8 out[CHACHA20_BLOCK_SIZE]);
> +#define CRNG_INIT_CNT_THRESH (2*CHACHA_KEY_SIZE)
> +static void _extract_crng(struct crng_state *crng, __u8 out[CHACHA_BLOCK_SIZE]);
>  static void _crng_backtrack_protect(struct crng_state *crng,
> -                                   __u8 tmp[CHACHA20_BLOCK_SIZE], int used);
> +                                   __u8 tmp[CHACHA_BLOCK_SIZE], int used);
>  static void process_random_ready_list(void);
>  static void _get_random_bytes(void *buf, int nbytes);
>
> @@ -858,7 +857,7 @@ static int crng_fast_load(const char *cp, size_t len)
>         }
>         p = (unsigned char *) &primary_crng.state[4];
>         while (len > 0 && crng_init_cnt < CRNG_INIT_CNT_THRESH) {
> -               p[crng_init_cnt % CHACHA20_KEY_SIZE] ^= *cp;
> +               p[crng_init_cnt % CHACHA_KEY_SIZE] ^= *cp;
>                 cp++; crng_init_cnt++; len--;
>         }
>         spin_unlock_irqrestore(&primary_crng.lock, flags);
> @@ -890,7 +889,7 @@ static int crng_slow_load(const char *cp, size_t len)
>         unsigned long           flags;
>         static unsigned char    lfsr = 1;
>         unsigned char           tmp;
> -       unsigned                i, max = CHACHA20_KEY_SIZE;
> +       unsigned                i, max = CHACHA_KEY_SIZE;
>         const char *            src_buf = cp;
>         char *                  dest_buf = (char *) &primary_crng.state[4];
>
> @@ -908,8 +907,8 @@ static int crng_slow_load(const char *cp, size_t len)
>                 lfsr >>= 1;
>                 if (tmp & 1)
>                         lfsr ^= 0xE1;
> -               tmp = dest_buf[i % CHACHA20_KEY_SIZE];
> -               dest_buf[i % CHACHA20_KEY_SIZE] ^= src_buf[i % len] ^ lfsr;
> +               tmp = dest_buf[i % CHACHA_KEY_SIZE];
> +               dest_buf[i % CHACHA_KEY_SIZE] ^= src_buf[i % len] ^ lfsr;
>                 lfsr += (tmp << 3) | (tmp >> 5);
>         }
>         spin_unlock_irqrestore(&primary_crng.lock, flags);
> @@ -921,7 +920,7 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r)
>         unsigned long   flags;
>         int             i, num;
>         union {
> -               __u8    block[CHACHA20_BLOCK_SIZE];
> +               __u8    block[CHACHA_BLOCK_SIZE];
>                 __u32   key[8];
>         } buf;
>
> @@ -932,7 +931,7 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r)
>         } else {
>                 _extract_crng(&primary_crng, buf.block);
>                 _crng_backtrack_protect(&primary_crng, buf.block,
> -                                       CHACHA20_KEY_SIZE);
> +                                       CHACHA_KEY_SIZE);
>         }
>         spin_lock_irqsave(&crng->lock, flags);
>         for (i = 0; i < 8; i++) {
> @@ -968,7 +967,7 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r)
>  }
>
>  static void _extract_crng(struct crng_state *crng,
> -                         __u8 out[CHACHA20_BLOCK_SIZE])
> +                         __u8 out[CHACHA_BLOCK_SIZE])
>  {
>         unsigned long v, flags;
>
> @@ -985,7 +984,7 @@ static void _extract_crng(struct crng_state *crng,
>         spin_unlock_irqrestore(&crng->lock, flags);
>  }
>
> -static void extract_crng(__u8 out[CHACHA20_BLOCK_SIZE])
> +static void extract_crng(__u8 out[CHACHA_BLOCK_SIZE])
>  {
>         struct crng_state *crng = NULL;
>
> @@ -1003,14 +1002,14 @@ static void extract_crng(__u8 out[CHACHA20_BLOCK_SIZE])
>   * enough) to mutate the CRNG key to provide backtracking protection.
>   */
>  static void _crng_backtrack_protect(struct crng_state *crng,
> -                                   __u8 tmp[CHACHA20_BLOCK_SIZE], int used)
> +                                   __u8 tmp[CHACHA_BLOCK_SIZE], int used)
>  {
>         unsigned long   flags;
>         __u32           *s, *d;
>         int             i;
>
>         used = round_up(used, sizeof(__u32));
> -       if (used + CHACHA20_KEY_SIZE > CHACHA20_BLOCK_SIZE) {
> +       if (used + CHACHA_KEY_SIZE > CHACHA_BLOCK_SIZE) {
>                 extract_crng(tmp);
>                 used = 0;
>         }
> @@ -1022,7 +1021,7 @@ static void _crng_backtrack_protect(struct crng_state *crng,
>         spin_unlock_irqrestore(&crng->lock, flags);
>  }
>
> -static void crng_backtrack_protect(__u8 tmp[CHACHA20_BLOCK_SIZE], int used)
> +static void crng_backtrack_protect(__u8 tmp[CHACHA_BLOCK_SIZE], int used)
>  {
>         struct crng_state *crng = NULL;
>
> @@ -1037,8 +1036,8 @@ static void crng_backtrack_protect(__u8 tmp[CHACHA20_BLOCK_SIZE], int used)
>
>  static ssize_t extract_crng_user(void __user *buf, size_t nbytes)
>  {
> -       ssize_t ret = 0, i = CHACHA20_BLOCK_SIZE;
> -       __u8 tmp[CHACHA20_BLOCK_SIZE] __aligned(4);
> +       ssize_t ret = 0, i = CHACHA_BLOCK_SIZE;
> +       __u8 tmp[CHACHA_BLOCK_SIZE] __aligned(4);
>         int large_request = (nbytes > 256);
>
>         while (nbytes) {
> @@ -1052,7 +1051,7 @@ static ssize_t extract_crng_user(void __user *buf, size_t nbytes)
>                 }
>
>                 extract_crng(tmp);
> -               i = min_t(int, nbytes, CHACHA20_BLOCK_SIZE);
> +               i = min_t(int, nbytes, CHACHA_BLOCK_SIZE);
>                 if (copy_to_user(buf, tmp, i)) {
>                         ret = -EFAULT;
>                         break;
> @@ -1617,14 +1616,14 @@ static void _warn_unseeded_randomness(const char *func_name, void *caller,
>   */
>  static void _get_random_bytes(void *buf, int nbytes)
>  {
> -       __u8 tmp[CHACHA20_BLOCK_SIZE] __aligned(4);
> +       __u8 tmp[CHACHA_BLOCK_SIZE] __aligned(4);
>
>         trace_get_random_bytes(nbytes, _RET_IP_);
>
> -       while (nbytes >= CHACHA20_BLOCK_SIZE) {
> +       while (nbytes >= CHACHA_BLOCK_SIZE) {
>                 extract_crng(buf);
> -               buf += CHACHA20_BLOCK_SIZE;
> -               nbytes -= CHACHA20_BLOCK_SIZE;
> +               buf += CHACHA_BLOCK_SIZE;
> +               nbytes -= CHACHA_BLOCK_SIZE;
>         }
>
>         if (nbytes > 0) {
> @@ -1632,7 +1631,7 @@ static void _get_random_bytes(void *buf, int nbytes)
>                 memcpy(buf, tmp, nbytes);
>                 crng_backtrack_protect(tmp, nbytes);
>         } else
> -               crng_backtrack_protect(tmp, CHACHA20_BLOCK_SIZE);
> +               crng_backtrack_protect(tmp, CHACHA_BLOCK_SIZE);
>         memzero_explicit(tmp, sizeof(tmp));
>  }
>
> @@ -2203,8 +2202,8 @@ struct ctl_table random_table[] = {
>
>  struct batched_entropy {
>         union {
> -               u64 entropy_u64[CHACHA20_BLOCK_SIZE / sizeof(u64)];
> -               u32 entropy_u32[CHACHA20_BLOCK_SIZE / sizeof(u32)];
> +               u64 entropy_u64[CHACHA_BLOCK_SIZE / sizeof(u64)];
> +               u32 entropy_u32[CHACHA_BLOCK_SIZE / sizeof(u32)];
>         };
>         unsigned int position;
>  };
> diff --git a/include/crypto/chacha.h b/include/crypto/chacha.h
> new file mode 100644
> index 0000000000000..ae79e9983c72f
> --- /dev/null
> +++ b/include/crypto/chacha.h
> @@ -0,0 +1,46 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Common values and helper functions for the ChaCha and XChaCha stream ciphers.
> + *
> + * XChaCha extends ChaCha's nonce to 192 bits, while provably retaining ChaCha's
> + * security.  Here they share the same key size, tfm context, and setkey
> + * function; only their IV size and encrypt/decrypt function differ.
> + */
> +
> +#ifndef _CRYPTO_CHACHA_H
> +#define _CRYPTO_CHACHA_H
> +
> +#include <crypto/skcipher.h>
> +#include <linux/types.h>
> +#include <linux/crypto.h>
> +
> +/* 32-bit stream position, then 96-bit nonce (RFC7539 convention) */
> +#define CHACHA_IV_SIZE         16
> +
> +#define CHACHA_KEY_SIZE                32
> +#define CHACHA_BLOCK_SIZE      64
> +
> +/* 192-bit nonce, then 64-bit stream position */
> +#define XCHACHA_IV_SIZE                32
> +
> +struct chacha_ctx {
> +       u32 key[8];
> +       int nrounds;
> +};
> +
> +void chacha_block(u32 *state, u8 *stream, int nrounds);
> +static inline void chacha20_block(u32 *state, u8 *stream)
> +{
> +       chacha_block(state, stream, 20);
> +}
> +void hchacha_block(const u32 *in, u32 *out, int nrounds);
> +
> +void crypto_chacha_init(u32 *state, struct chacha_ctx *ctx, u8 *iv);
> +
> +int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
> +                          unsigned int keysize);
> +
> +int crypto_chacha_crypt(struct skcipher_request *req);
> +int crypto_xchacha_crypt(struct skcipher_request *req);
> +
> +#endif /* _CRYPTO_CHACHA_H */
> diff --git a/include/crypto/chacha20.h b/include/crypto/chacha20.h
> deleted file mode 100644
> index 6290d997060ec..0000000000000
> --- a/include/crypto/chacha20.h
> +++ /dev/null
> @@ -1,41 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -/*
> - * Common values and helper functions for the ChaCha20 and XChaCha20 algorithms.
> - *
> - * XChaCha20 extends ChaCha20's nonce to 192 bits, while provably retaining
> - * ChaCha20's security.  Here they share the same key size, tfm context, and
> - * setkey function; only their IV size and encrypt/decrypt function differ.
> - */
> -
> -#ifndef _CRYPTO_CHACHA20_H
> -#define _CRYPTO_CHACHA20_H
> -
> -#include <crypto/skcipher.h>
> -#include <linux/types.h>
> -#include <linux/crypto.h>
> -
> -/* 32-bit stream position, then 96-bit nonce (RFC7539 convention) */
> -#define CHACHA20_IV_SIZE       16
> -
> -#define CHACHA20_KEY_SIZE      32
> -#define CHACHA20_BLOCK_SIZE    64
> -
> -/* 192-bit nonce, then 64-bit stream position */
> -#define XCHACHA20_IV_SIZE      32
> -
> -struct chacha20_ctx {
> -       u32 key[8];
> -};
> -
> -void chacha20_block(u32 *state, u8 *stream);
> -void hchacha20_block(const u32 *in, u32 *out);
> -
> -void crypto_chacha20_init(u32 *state, struct chacha20_ctx *ctx, u8 *iv);
> -
> -int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
> -                          unsigned int keysize);
> -
> -int crypto_chacha20_crypt(struct skcipher_request *req);
> -int crypto_xchacha20_crypt(struct skcipher_request *req);
> -
> -#endif
> diff --git a/lib/Makefile b/lib/Makefile
> index ca3f7ebb900d8..9a5f0b7a48891 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -20,7 +20,7 @@ KCOV_INSTRUMENT_dynamic_debug.o := n
>  lib-y := ctype.o string.o vsprintf.o cmdline.o \
>          rbtree.o radix-tree.o timerqueue.o\
>          idr.o int_sqrt.o extable.o \
> -        sha1.o chacha20.o irq_regs.o argv_split.o \
> +        sha1.o chacha.o irq_regs.o argv_split.o \
>          flex_proportions.o ratelimit.o show_mem.o \
>          is_single_threaded.o plist.o decompress.o kobject_uevent.o \
>          earlycpio.o seq_buf.o siphash.o dec_and_lock.o \
> diff --git a/lib/chacha20.c b/lib/chacha.c
> similarity index 67%
> rename from lib/chacha20.c
> rename to lib/chacha.c
> index 6a484e16171d1..0a2c2e5b7b84d 100644
> --- a/lib/chacha20.c
> +++ b/lib/chacha.c
> @@ -1,5 +1,5 @@
>  /*
> - * The "hash function" used as the core of the ChaCha20 stream cipher (RFC7539)
> + * The "hash function" used as the core of the ChaCha stream cipher (RFC7539)
>   *
>   * Copyright (C) 2015 Martin Willi
>   *
> @@ -14,13 +14,16 @@
>  #include <linux/bitops.h>
>  #include <linux/cryptohash.h>
>  #include <asm/unaligned.h>
> -#include <crypto/chacha20.h>
> +#include <crypto/chacha.h>
>
> -static void chacha20_permute(u32 *x)
> +static void chacha_permute(u32 *x, int nrounds)
>  {
>         int i;
>
> -       for (i = 0; i < 20; i += 2) {
> +       /* whitelist the allowed round counts */
> +       BUG_ON(nrounds != 20);
> +
> +       for (i = 0; i < nrounds; i += 2) {
>                 x[0]  += x[4];    x[12] = rol32(x[12] ^ x[0],  16);
>                 x[1]  += x[5];    x[13] = rol32(x[13] ^ x[1],  16);
>                 x[2]  += x[6];    x[14] = rol32(x[14] ^ x[2],  16);
> @@ -64,49 +67,51 @@ static void chacha20_permute(u32 *x)
>  }
>
>  /**
> - * chacha20_block - generate one keystream block and increment block counter
> + * chacha_block - generate one keystream block and increment block counter
>   * @state: input state matrix (16 32-bit words)
>   * @stream: output keystream block (64 bytes)
> + * @nrounds: number of rounds (currently must be 20)
>   *
> - * This is the ChaCha20 core, a function from 64-byte strings to 64-byte
> - * strings.  The caller has already converted the endianness of the input.  This
> - * function also handles incrementing the block counter in the input matrix.
> + * This is the ChaCha core, a function from 64-byte strings to 64-byte strings.
> + * The caller has already converted the endianness of the input.  This function
> + * also handles incrementing the block counter in the input matrix.
>   */
> -void chacha20_block(u32 *state, u8 *stream)
> +void chacha_block(u32 *state, u8 *stream, int nrounds)
>  {
>         u32 x[16];
>         int i;
>
>         memcpy(x, state, 64);
>
> -       chacha20_permute(x);
> +       chacha_permute(x, nrounds);
>
>         for (i = 0; i < ARRAY_SIZE(x); i++)
>                 put_unaligned_le32(x[i] + state[i], &stream[i * sizeof(u32)]);
>
>         state[12]++;
>  }
> -EXPORT_SYMBOL(chacha20_block);
> +EXPORT_SYMBOL(chacha_block);
>
>  /**
> - * hchacha20_block - abbreviated ChaCha20 core, for XChaCha20
> + * hchacha_block - abbreviated ChaCha core, for XChaCha
>   * @in: input state matrix (16 32-bit words)
>   * @out: output (8 32-bit words)
> + * @nrounds: number of rounds (currently must be 20)
>   *
> - * HChaCha20 is the ChaCha equivalent of HSalsa20 and is an intermediate step
> - * towards XChaCha20 (see https://cr.yp.to/snuffle/xsalsa-20081128.pdf).
> - * HChaCha20 skips the final addition of the initial state, and outputs only
> - * certain words of the state.  It should not be used for streaming directly.
> + * HChaCha is the ChaCha equivalent of HSalsa and is an intermediate step
> + * towards XChaCha (see https://cr.yp.to/snuffle/xsalsa-20081128.pdf).  HChaCha
> + * skips the final addition of the initial state, and outputs only certain words
> + * of the state.  It should not be used for streaming directly.
>   */
> -void hchacha20_block(const u32 *in, u32 *out)
> +void hchacha_block(const u32 *in, u32 *out, int nrounds)
>  {
>         u32 x[16];
>
>         memcpy(x, in, 64);
>
> -       chacha20_permute(x);
> +       chacha_permute(x, nrounds);
>
>         memcpy(&out[0], &x[0], 16);
>         memcpy(&out[4], &x[12], 16);
>  }
> -EXPORT_SYMBOL(hchacha20_block);
> +EXPORT_SYMBOL(hchacha_block);
> --
> 2.19.1.331.ge82ca0e54c-goog
>
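For readers following the refactoring quoted above, here is a minimal userspace sketch of a ChaCha block function parameterized by round count. It is not the kernel code: the names `sketch_chacha_permute` and `sketch_chacha_block` are hypothetical, `assert()` stands in for the patch's `BUG_ON()`, and accepting 12 rounds as well as 20 is an assumption that anticipates the follow-up XChaCha12 patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static uint32_t rol32(uint32_t v, int n)
{
	return (v << n) | (v >> (32 - n));
}

/* One ChaCha quarter-round on state words a, b, c, d. */
static void quarter_round(uint32_t *x, int a, int b, int c, int d)
{
	x[a] += x[b]; x[d] = rol32(x[d] ^ x[a], 16);
	x[c] += x[d]; x[b] = rol32(x[b] ^ x[c], 12);
	x[a] += x[b]; x[d] = rol32(x[d] ^ x[a], 8);
	x[c] += x[d]; x[b] = rol32(x[b] ^ x[c], 7);
}

/* Each loop iteration is a double round: four column, then four diagonal. */
static void sketch_chacha_permute(uint32_t x[16], int nrounds)
{
	int i;

	/* Whitelist the allowed round counts (assumption: 20 and 12). */
	assert(nrounds == 20 || nrounds == 12);

	for (i = 0; i < nrounds; i += 2) {
		quarter_round(x, 0, 4, 8, 12);
		quarter_round(x, 1, 5, 9, 13);
		quarter_round(x, 2, 6, 10, 14);
		quarter_round(x, 3, 7, 11, 15);
		quarter_round(x, 0, 5, 10, 15);
		quarter_round(x, 1, 6, 11, 12);
		quarter_round(x, 2, 7, 8, 13);
		quarter_round(x, 3, 4, 9, 14);
	}
}

/* Generate one 64-byte keystream block and advance the block counter. */
void sketch_chacha_block(uint32_t state[16], uint8_t stream[64], int nrounds)
{
	uint32_t x[16];
	int i;

	memcpy(x, state, sizeof(x));
	sketch_chacha_permute(x, nrounds);

	/* Feed-forward addition, then little-endian serialization. */
	for (i = 0; i < 16; i++) {
		uint32_t v = x[i] + state[i];

		stream[4 * i + 0] = (uint8_t)v;
		stream[4 * i + 1] = (uint8_t)(v >> 8);
		stream[4 * i + 2] = (uint8_t)(v >> 16);
		stream[4 * i + 3] = (uint8_t)(v >> 24);
	}
	state[12]++;
}
```

Passing the round count as an argument keeps a single permutation routine for all variants, and the whitelist check rejects round counts that nobody has reviewed.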

* Re: [RFC PATCH v2 04/12] crypto: chacha - add XChaCha12 support
  2018-10-15 17:54 ` [RFC PATCH v2 04/12] crypto: chacha - add XChaCha12 support Eric Biggers
@ 2018-10-19 14:34   ` Ard Biesheuvel
  2018-10-19 18:28     ` Eric Biggers
  0 siblings, 1 reply; 54+ messages in thread
From: Ard Biesheuvel @ 2018-10-19 14:34 UTC (permalink / raw)
  To: Eric Biggers
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE,
	Jason A . Donenfeld, Greg Kaiser, Herbert Xu, Samuel Neves,
	Michael Halcrow, Linux Kernel Mailing List, linux-fscrypt,
	Tomer Ashur, linux-arm-kernel, Paul Crowley

On 16 October 2018 at 01:54, Eric Biggers <ebiggers@kernel.org> wrote:
> From: Eric Biggers <ebiggers@google.com>
>
> Now that the generic implementation of ChaCha20 has been refactored to
> allow varying the number of rounds, add support for XChaCha12, which is
> the XSalsa construction applied to ChaCha12.  ChaCha12 is one of the
> three ciphers specified by the original ChaCha paper
> (https://cr.yp.to/chacha/chacha-20080128.pdf: "ChaCha, a variant of
> Salsa20"), alongside ChaCha8 and ChaCha20.  ChaCha12 is faster than
> ChaCha20 but has a lower, but still large, security margin.
>
> We need XChaCha12 support so that it can be used in the Adiantum
> encryption mode, which enables disk/file encryption on low-end mobile
> devices where AES-XTS is too slow as the CPUs lack AES instructions.
>
> We'd prefer XChaCha20 (the more popular variant), but it's too slow on
> some of our target devices, so at least in some cases we do need the
> XChaCha12-based version.  In more detail, the problem is that Adiantum
> is still much slower than we're happy with, and encryption still has a
> quite noticeable effect on the feel of low-end devices.  Users and
> vendors push back hard against encryption that degrades the user
> experience, which always risks encryption being disabled entirely.  So
> we need to choose the fastest option that gives us a solid margin of
> security, and here that's XChaCha12.  The best known attack on ChaCha
> breaks only 7 rounds and has 2^235 time complexity, so ChaCha12's
> security margin is still better than AES-256's.  Much has been learned
> about cryptanalysis of ARX ciphers since Salsa20 was originally designed
> in 2005, and it now seems we can be comfortable with a smaller number of
> rounds.  The eSTREAM project also suggests the 12-round version of
> Salsa20 as providing the best balance among the different variants:
> combining very good performance with a "comfortable margin of security".
>
> Note that it would be trivial to add vanilla ChaCha12 in addition to
> XChaCha12.  However, it's unneeded for now and therefore is omitted.
>
> As discussed in the patch that introduced XChaCha20 support, I
> considered splitting the code into separate chacha-common, chacha20,
> xchacha20, and xchacha12 modules, so that these algorithms could be
> enabled/disabled independently.  However, since nearly all the code is
> shared anyway, I ultimately decided there would have been little benefit
> to the added complexity.
>
> Signed-off-by: Eric Biggers <ebiggers@google.com>

One nit below but

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
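
As an illustration of the construction the commit message describes (the XSalsa-style nonce extension applied to a reduced-round ChaCha), here is a rough userspace sketch. It follows the IV layout used by crypto_xchacha_crypt() in this series (16 nonce bytes for the subkey, 8 more nonce bytes, then an 8-byte stream position), but `sketch_xchacha_crypt` and its helpers are hypothetical names, and the code has not been checked against the test vectors in this patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t rol32(uint32_t v, int n)
{
	return (v << n) | (v >> (32 - n));
}

static uint32_t le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

#define QR(a, b, c, d) do {				\
	a += b; d = rol32(d ^ a, 16);			\
	c += d; b = rol32(b ^ c, 12);			\
	a += b; d = rol32(d ^ a, 8);			\
	c += d; b = rol32(b ^ c, 7);			\
} while (0)

/* nrounds is assumed to be even (e.g. 12 or 20). */
static void permute(uint32_t x[16], int nrounds)
{
	int i;

	for (i = 0; i < nrounds; i += 2) {
		QR(x[0], x[4], x[8], x[12]);  QR(x[1], x[5], x[9], x[13]);
		QR(x[2], x[6], x[10], x[14]); QR(x[3], x[7], x[11], x[15]);
		QR(x[0], x[5], x[10], x[15]); QR(x[1], x[6], x[11], x[12]);
		QR(x[2], x[7], x[8], x[13]);  QR(x[3], x[4], x[9], x[14]);
	}
}

/* State layout: 4 constant words, 8 key words, 4 words taken from the IV. */
static void init_state(uint32_t s[16], const uint32_t key[8],
		       const uint8_t iv[16])
{
	static const uint32_t c[4] = {
		0x61707865, 0x3320646e, 0x79622d32, 0x6b206574
	};
	int i;

	memcpy(&s[0], c, 16);
	memcpy(&s[4], key, 32);
	for (i = 0; i < 4; i++)
		s[12 + i] = le32(iv + 4 * i);
}

/* XOR the keystream into buf in place, one 64-byte block at a time. */
static void stream_xor(const uint32_t key[8], const uint8_t iv[16],
		       uint8_t *buf, size_t len, int nrounds)
{
	uint32_t s[16], x[16];
	size_t off = 0;
	int i;

	init_state(s, key, iv);
	while (off < len) {
		memcpy(x, s, sizeof(x));
		permute(x, nrounds);
		for (i = 0; i < 64 && off < len; i++, off++) {
			uint32_t w = x[i / 4] + s[i / 4]; /* feed-forward */

			buf[off] ^= (uint8_t)(w >> (8 * (i % 4)));
		}
		s[12]++; /* advance the block counter */
	}
}

/* Encrypt or decrypt buf in place; iv is the 32-byte XChaCha IV. */
void sketch_xchacha_crypt(const uint32_t key[8], const uint8_t iv[32],
			  uint8_t *buf, size_t len, int nrounds)
{
	uint32_t s[16], subkey[8];
	uint8_t real_iv[16];

	/* HChaCha: permute without the final addition, keep words 0-3, 12-15. */
	init_state(s, key, iv);
	permute(s, nrounds);
	memcpy(&subkey[0], &s[0], 16);
	memcpy(&subkey[4], &s[12], 16);

	memcpy(&real_iv[0], iv + 24, 8); /* stream position */
	memcpy(&real_iv[8], iv + 16, 8); /* remaining 64 nonce bits */
	stream_xor(subkey, real_iv, buf, len, nrounds);
}
```

In this sketch the same round count drives both the subkey derivation and the keystream generation, so XChaCha12 and XChaCha20 differ only in the `nrounds` argument.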

> ---
>  crypto/Kconfig          |   8 +-
>  crypto/chacha_generic.c |  26 +-
>  crypto/testmgr.c        |   6 +
>  crypto/testmgr.h        | 578 ++++++++++++++++++++++++++++++++++++++++
>  include/crypto/chacha.h |   7 +
>  lib/chacha.c            |   6 +-
>  6 files changed, 625 insertions(+), 6 deletions(-)
>
> diff --git a/crypto/Kconfig b/crypto/Kconfig
> index d9acbce23d4d5..4fa0a4a0e8615 100644
> --- a/crypto/Kconfig
> +++ b/crypto/Kconfig
> @@ -1387,10 +1387,10 @@ config CRYPTO_SALSA20
>           Bernstein <djb@cr.yp.to>. See <http://cr.yp.to/snuffle.html>
>
>  config CRYPTO_CHACHA20
> -       tristate "ChaCha20 stream cipher algorithms"
> +       tristate "ChaCha stream cipher algorithms"
>         select CRYPTO_BLKCIPHER
>         help
> -         The ChaCha20 and XChaCha20 stream cipher algorithms.
> +         The ChaCha20, XChaCha20, and XChaCha12 stream cipher algorithms.
>
>           ChaCha20 is a 256-bit high-speed stream cipher designed by Daniel J.
>           Bernstein and further specified in RFC7539 for use in IETF protocols.
> @@ -1403,6 +1403,10 @@ config CRYPTO_CHACHA20
>           while provably retaining ChaCha20's security.  See also:
>           <https://cr.yp.to/snuffle/xsalsa-20081128.pdf>
>
> +         XChaCha12 is XChaCha20 reduced to 12 rounds, with correspondingly
> +         reduced security margin but increased performance.  It can be needed
> +         in some performance-sensitive scenarios.
> +
>  config CRYPTO_CHACHA20_X86_64
>         tristate "ChaCha20 cipher algorithm (x86_64/SSSE3/AVX2)"
>         depends on X86 && 64BIT
> diff --git a/crypto/chacha_generic.c b/crypto/chacha_generic.c
> index 8e25e9930c549..8f8f84e51f334 100644
> --- a/crypto/chacha_generic.c
> +++ b/crypto/chacha_generic.c
> @@ -1,5 +1,5 @@
>  /*
> - * ChaCha20 (RFC7539) and XChaCha20 stream cipher algorithms
> + * ChaCha and XChaCha stream ciphers, including ChaCha20 (RFC7539)
>   *
>   * Copyright (C) 2015 Martin Willi
>   * Copyright (C) 2018 Google LLC
> @@ -106,6 +106,13 @@ int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
>  }
>  EXPORT_SYMBOL_GPL(crypto_chacha20_setkey);
>
> +int crypto_chacha12_setkey(struct crypto_skcipher *tfm, const u8 *key,
> +                          unsigned int keysize)
> +{
> +       return chacha_setkey(tfm, key, keysize, 12);
> +}
> +EXPORT_SYMBOL_GPL(crypto_chacha12_setkey);
> +
>  int crypto_chacha_crypt(struct skcipher_request *req)
>  {
>         struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
> @@ -168,6 +175,21 @@ static struct skcipher_alg algs[] = {
>                 .setkey                 = crypto_chacha20_setkey,
>                 .encrypt                = crypto_xchacha_crypt,
>                 .decrypt                = crypto_xchacha_crypt,
> +       }, {
> +               .base.cra_name          = "xchacha12",
> +               .base.cra_driver_name   = "xchacha12-generic",
> +               .base.cra_priority      = 100,
> +               .base.cra_blocksize     = 1,
> +               .base.cra_ctxsize       = sizeof(struct chacha_ctx),
> +               .base.cra_module        = THIS_MODULE,
> +
> +               .min_keysize            = CHACHA_KEY_SIZE,
> +               .max_keysize            = CHACHA_KEY_SIZE,
> +               .ivsize                 = XCHACHA_IV_SIZE,
> +               .chunksize              = CHACHA_BLOCK_SIZE,
> +               .setkey                 = crypto_chacha12_setkey,
> +               .encrypt                = crypto_xchacha_crypt,
> +               .decrypt                = crypto_xchacha_crypt,
>         }
>  };
>
> @@ -191,3 +213,5 @@ MODULE_ALIAS_CRYPTO("chacha20");
>  MODULE_ALIAS_CRYPTO("chacha20-generic");
>  MODULE_ALIAS_CRYPTO("xchacha20");
>  MODULE_ALIAS_CRYPTO("xchacha20-generic");
> +MODULE_ALIAS_CRYPTO("xchacha12");
> +MODULE_ALIAS_CRYPTO("xchacha12-generic");
> diff --git a/crypto/testmgr.c b/crypto/testmgr.c
> index a5512e69c8f31..3ff70ebc745cb 100644
> --- a/crypto/testmgr.c
> +++ b/crypto/testmgr.c
> @@ -3544,6 +3544,12 @@ static const struct alg_test_desc alg_test_descs[] = {
>                 .suite = {
>                         .hash = __VECS(aes_xcbc128_tv_template)
>                 }
> +       }, {
> +               .alg = "xchacha12",
> +               .test = alg_test_skcipher,
> +               .suite = {
> +                       .cipher = __VECS(xchacha12_tv_template)
> +               },
>         }, {
>                 .alg = "xchacha20",
>                 .test = alg_test_skcipher,
> diff --git a/crypto/testmgr.h b/crypto/testmgr.h
> index 371641c73cf8c..3b57b2701fcb2 100644
> --- a/crypto/testmgr.h
> +++ b/crypto/testmgr.h
> @@ -31379,6 +31379,584 @@ static const struct cipher_testvec xchacha20_tv_template[] = {
>         },
>  };
>
> +/*
> + * Same as the XChaCha20 test vectors above, but with the ciphertext
> + * recomputed with XChaCha12 using a modified libsodium.
> + */
> +static const struct cipher_testvec xchacha12_tv_template[] = {
> +       {
> +               .key    = "\x79\xc9\x97\x98\xac\x67\x30\x0b"
> +                         "\xbb\x27\x04\xc9\x5c\x34\x1e\x32"
> +                         "\x45\xf3\xdc\xb2\x17\x61\xb9\x8e"
> +                         "\x52\xff\x45\xb2\x4f\x30\x4f\xc4",
> +               .klen   = 32,
> +               .iv     = "\xb3\x3f\xfd\x30\x96\x47\x9b\xcf"
> +                         "\xbc\x9a\xee\x49\x41\x76\x88\xa0"
> +                         "\xa2\x55\x4f\x8d\x95\x38\x94\x19"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00",
> +               .ptext  = "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00",
> +               .ctext  = "\x1b\x78\x7f\xd7\xa1\x41\x68\xab"
> +                         "\x3d\x3f\xd1\x7b\x69\x56\xb2\xd5"
> +                         "\x43\xce\xeb\xaf\x36\xf0\x29\x9d"
> +                         "\x3a\xfb\x18\xae\x1b",
> +               .len    = 29,
> +       }, {
> +               .key    = "\x9d\x23\xbd\x41\x49\xcb\x97\x9c"
> +                         "\xcf\x3c\x5c\x94\xdd\x21\x7e\x98"
> +                         "\x08\xcb\x0e\x50\xcd\x0f\x67\x81"
> +                         "\x22\x35\xea\xaf\x60\x1d\x62\x32",
> +               .klen   = 32,
> +               .iv     = "\xc0\x47\x54\x82\x66\xb7\xc3\x70"
> +                         "\xd3\x35\x66\xa2\x42\x5c\xbf\x30"
> +                         "\xd8\x2d\x1e\xaf\x52\x94\x10\x9e"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00",
> +               .ptext  = "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00",
> +               .ctext  = "\xfb\x32\x09\x1d\x83\x05\xae\x4c"
> +                         "\x13\x1f\x12\x71\xf2\xca\xb2\xeb"
> +                         "\x5b\x83\x14\x7d\x83\xf6\x57\x77"
> +                         "\x2e\x40\x1f\x92\x2c\xf9\xec\x35"
> +                         "\x34\x1f\x93\xdf\xfb\x30\xd7\x35"
> +                         "\x03\x05\x78\xc1\x20\x3b\x7a\xe3"
> +                         "\x62\xa3\x89\xdc\x11\x11\x45\xa8"
> +                         "\x82\x89\xa0\xf1\x4e\xc7\x0f\x11"
> +                         "\x69\xdd\x0c\x84\x2b\x89\x5c\xdc"
> +                         "\xf0\xde\x01\xef\xc5\x65\x79\x23"
> +                         "\x87\x67\xd6\x50\xd9\x8d\xd9\x92"
> +                         "\x54\x5b\x0e",
> +               .len    = 91,
> +       }, {
> +               .key    = "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00",
> +               .klen   = 32,
> +               .iv     = "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x67\xc6\x69\x73"
> +                         "\x51\xff\x4a\xec\x29\xcd\xba\xab"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00",
> +               .ptext  = "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00",
> +               .ctext  = "\xdf\x2d\xc6\x21\x2a\x9d\xa1\xbb"
> +                         "\xc2\x77\x66\x0c\x5c\x46\xef\xa7"
> +                         "\x79\x1b\xb9\xdf\x55\xe2\xf9\x61"
> +                         "\x4c\x7b\xa4\x52\x24\xaf\xa2\xda"
> +                         "\xd1\x8f\x8f\xa2\x9e\x53\x4d\xc4"
> +                         "\xb8\x55\x98\x08\x7c\x08\xd4\x18"
> +                         "\x67\x8f\xef\x50\xb1\x5f\xa5\x77"
> +                         "\x4c\x25\xe7\x86\x26\x42\xca\x44",
> +               .len    = 64,
> +       }, {
> +               .key    = "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x01",
> +               .klen   = 32,
> +               .iv     = "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x02\xf2\xfb\xe3\x46"
> +                         "\x7c\xc2\x54\xf8\x1b\xe8\xe7\x8d"
> +                         "\x01\x00\x00\x00\x00\x00\x00\x00",
> +               .ptext  = "\x41\x6e\x79\x20\x73\x75\x62\x6d"
> +                         "\x69\x73\x73\x69\x6f\x6e\x20\x74"
> +                         "\x6f\x20\x74\x68\x65\x20\x49\x45"
> +                         "\x54\x46\x20\x69\x6e\x74\x65\x6e"
> +                         "\x64\x65\x64\x20\x62\x79\x20\x74"
> +                         "\x68\x65\x20\x43\x6f\x6e\x74\x72"
> +                         "\x69\x62\x75\x74\x6f\x72\x20\x66"
> +                         "\x6f\x72\x20\x70\x75\x62\x6c\x69"
> +                         "\x63\x61\x74\x69\x6f\x6e\x20\x61"
> +                         "\x73\x20\x61\x6c\x6c\x20\x6f\x72"
> +                         "\x20\x70\x61\x72\x74\x20\x6f\x66"
> +                         "\x20\x61\x6e\x20\x49\x45\x54\x46"
> +                         "\x20\x49\x6e\x74\x65\x72\x6e\x65"
> +                         "\x74\x2d\x44\x72\x61\x66\x74\x20"
> +                         "\x6f\x72\x20\x52\x46\x43\x20\x61"
> +                         "\x6e\x64\x20\x61\x6e\x79\x20\x73"
> +                         "\x74\x61\x74\x65\x6d\x65\x6e\x74"
> +                         "\x20\x6d\x61\x64\x65\x20\x77\x69"
> +                         "\x74\x68\x69\x6e\x20\x74\x68\x65"
> +                         "\x20\x63\x6f\x6e\x74\x65\x78\x74"
> +                         "\x20\x6f\x66\x20\x61\x6e\x20\x49"
> +                         "\x45\x54\x46\x20\x61\x63\x74\x69"
> +                         "\x76\x69\x74\x79\x20\x69\x73\x20"
> +                         "\x63\x6f\x6e\x73\x69\x64\x65\x72"
> +                         "\x65\x64\x20\x61\x6e\x20\x22\x49"
> +                         "\x45\x54\x46\x20\x43\x6f\x6e\x74"
> +                         "\x72\x69\x62\x75\x74\x69\x6f\x6e"
> +                         "\x22\x2e\x20\x53\x75\x63\x68\x20"
> +                         "\x73\x74\x61\x74\x65\x6d\x65\x6e"
> +                         "\x74\x73\x20\x69\x6e\x63\x6c\x75"
> +                         "\x64\x65\x20\x6f\x72\x61\x6c\x20"
> +                         "\x73\x74\x61\x74\x65\x6d\x65\x6e"
> +                         "\x74\x73\x20\x69\x6e\x20\x49\x45"
> +                         "\x54\x46\x20\x73\x65\x73\x73\x69"
> +                         "\x6f\x6e\x73\x2c\x20\x61\x73\x20"
> +                         "\x77\x65\x6c\x6c\x20\x61\x73\x20"
> +                         "\x77\x72\x69\x74\x74\x65\x6e\x20"
> +                         "\x61\x6e\x64\x20\x65\x6c\x65\x63"
> +                         "\x74\x72\x6f\x6e\x69\x63\x20\x63"
> +                         "\x6f\x6d\x6d\x75\x6e\x69\x63\x61"
> +                         "\x74\x69\x6f\x6e\x73\x20\x6d\x61"
> +                         "\x64\x65\x20\x61\x74\x20\x61\x6e"
> +                         "\x79\x20\x74\x69\x6d\x65\x20\x6f"
> +                         "\x72\x20\x70\x6c\x61\x63\x65\x2c"
> +                         "\x20\x77\x68\x69\x63\x68\x20\x61"
> +                         "\x72\x65\x20\x61\x64\x64\x72\x65"
> +                         "\x73\x73\x65\x64\x20\x74\x6f",
> +               .ctext  = "\xe4\xa6\xc8\x30\xc4\x23\x13\xd6"
> +                         "\x08\x4d\xc9\xb7\xa5\x64\x7c\xb9"
> +                         "\x71\xe2\xab\x3e\xa8\x30\x8a\x1c"
> +                         "\x4a\x94\x6d\x9b\xe0\xb3\x6f\xf1"
> +                         "\xdc\xe3\x1b\xb3\xa9\x6d\x0d\xd6"
> +                         "\xd0\xca\x12\xef\xe7\x5f\xd8\x61"
> +                         "\x3c\x82\xd3\x99\x86\x3c\x6f\x66"
> +                         "\x02\x06\xdc\x55\xf9\xed\xdf\x38"
> +                         "\xb4\xa6\x17\x00\x7f\xef\xbf\x4f"
> +                         "\xf8\x36\xf1\x60\x7e\x47\xaf\xdb"
> +                         "\x55\x9b\x12\xcb\x56\x44\xa7\x1f"
> +                         "\xd3\x1a\x07\x3b\x00\xec\xe6\x4c"
> +                         "\xa2\x43\x27\xdf\x86\x19\x4f\x16"
> +                         "\xed\xf9\x4a\xf3\x63\x6f\xfa\x7f"
> +                         "\x78\x11\xf6\x7d\x97\x6f\xec\x6f"
> +                         "\x85\x0f\x5c\x36\x13\x8d\x87\xe0"
> +                         "\x80\xb1\x69\x0b\x98\x89\x9c\x4e"
> +                         "\xf8\xdd\xee\x5c\x0a\x85\xce\xd4"
> +                         "\xea\x1b\x48\xbe\x08\xf8\xe2\xa8"
> +                         "\xa5\xb0\x3c\x79\xb1\x15\xb4\xb9"
> +                         "\x75\x10\x95\x35\x81\x7e\x26\xe6"
> +                         "\x78\xa4\x88\xcf\xdb\x91\x34\x18"
> +                         "\xad\xd7\x8e\x07\x7d\xab\x39\xf9"
> +                         "\xa3\x9e\xa5\x1d\xbb\xed\x61\xfd"
> +                         "\xdc\xb7\x5a\x27\xfc\xb5\xc9\x10"
> +                         "\xa8\xcc\x52\x7f\x14\x76\x90\xe7"
> +                         "\x1b\x29\x60\x74\xc0\x98\x77\xbb"
> +                         "\xe0\x54\xbb\x27\x49\x59\x1e\x62"
> +                         "\x3d\xaf\x74\x06\xa4\x42\x6f\xc6"
> +                         "\x52\x97\xc4\x1d\xc4\x9f\xe2\xe5"
> +                         "\x38\x57\x91\xd1\xa2\x28\xcc\x40"
> +                         "\xcc\x70\x59\x37\xfc\x9f\x4b\xda"
> +                         "\xa0\xeb\x97\x9a\x7d\xed\x14\x5c"
> +                         "\x9c\xb7\x93\x26\x41\xa8\x66\xdd"
> +                         "\x87\x6a\xc0\xd3\xc2\xa9\x3e\xae"
> +                         "\xe9\x72\xfe\xd1\xb3\xac\x38\xea"
> +                         "\x4d\x15\xa9\xd5\x36\x61\xe9\x96"
> +                         "\x6c\x23\xf8\x43\xe4\x92\x29\xd9"
> +                         "\x8b\x78\xf7\x0a\x52\xe0\x19\x5b"
> +                         "\x59\x69\x5b\x5d\xa1\x53\xc4\x68"
> +                         "\xe1\xbb\xac\x89\x14\xe2\xe2\x85"
> +                         "\x41\x18\xf5\xb3\xd1\xfa\x68\x19"
> +                         "\x44\x78\xdc\xcf\xe7\x88\x2d\x52"
> +                         "\x5f\x40\xb5\x7e\xf8\x88\xa2\xae"
> +                         "\x4a\xb2\x07\x35\x9d\x9b\x07\x88"
> +                         "\xb7\x00\xd0\x0c\xb6\xa0\x47\x59"
> +                         "\xda\x4e\xc9\xab\x9b\x8a\x7b",
> +
> +               .len    = 375,
> +               .also_non_np = 1,
> +               .np     = 3,
> +               .tap    = { 375 - 20, 4, 16 },
> +
> +       }, {
> +               .key    = "\x1c\x92\x40\xa5\xeb\x55\xd3\x8a"
> +                         "\xf3\x33\x88\x86\x04\xf6\xb5\xf0"
> +                         "\x47\x39\x17\xc1\x40\x2b\x80\x09"
> +                         "\x9d\xca\x5c\xbc\x20\x70\x75\xc0",
> +               .klen   = 32,
> +               .iv     = "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x02\x76\x5a\x2e\x63"
> +                         "\x33\x9f\xc9\x9a\x66\x32\x0d\xb7"
> +                         "\x2a\x00\x00\x00\x00\x00\x00\x00",
> +               .ptext  = "\x27\x54\x77\x61\x73\x20\x62\x72"
> +                         "\x69\x6c\x6c\x69\x67\x2c\x20\x61"
> +                         "\x6e\x64\x20\x74\x68\x65\x20\x73"
> +                         "\x6c\x69\x74\x68\x79\x20\x74\x6f"
> +                         "\x76\x65\x73\x0a\x44\x69\x64\x20"
> +                         "\x67\x79\x72\x65\x20\x61\x6e\x64"
> +                         "\x20\x67\x69\x6d\x62\x6c\x65\x20"
> +                         "\x69\x6e\x20\x74\x68\x65\x20\x77"
> +                         "\x61\x62\x65\x3a\x0a\x41\x6c\x6c"
> +                         "\x20\x6d\x69\x6d\x73\x79\x20\x77"
> +                         "\x65\x72\x65\x20\x74\x68\x65\x20"
> +                         "\x62\x6f\x72\x6f\x67\x6f\x76\x65"
> +                         "\x73\x2c\x0a\x41\x6e\x64\x20\x74"
> +                         "\x68\x65\x20\x6d\x6f\x6d\x65\x20"
> +                         "\x72\x61\x74\x68\x73\x20\x6f\x75"
> +                         "\x74\x67\x72\x61\x62\x65\x2e",
> +               .ctext  = "\xb9\x68\xbc\x6a\x24\xbc\xcc\xd8"
> +                         "\x9b\x2a\x8d\x5b\x96\xaf\x56\xe3"
> +                         "\x11\x61\xe7\xa7\x9b\xce\x4e\x7d"
> +                         "\x60\x02\x48\xac\xeb\xd5\x3a\x26"
> +                         "\x9d\x77\x3b\xb5\x32\x13\x86\x8e"
> +                         "\x20\x82\x26\x72\xae\x64\x1b\x7e"
> +                         "\x2e\x01\x68\xb4\x87\x45\xa1\x24"
> +                         "\xe4\x48\x40\xf0\xaa\xac\xee\xa9"
> +                         "\xfc\x31\xad\x9d\x89\xa3\xbb\xd2"
> +                         "\xe4\x25\x13\xad\x0f\x5e\xdf\x3c"
> +                         "\x27\xab\xb8\x62\x46\x22\x30\x48"
> +                         "\x55\x2c\x4e\x84\x78\x1d\x0d\x34"
> +                         "\x8d\x3c\x91\x0a\x7f\x5b\x19\x9f"
> +                         "\x97\x05\x4c\xa7\x62\x47\x8b\xc5"
> +                         "\x44\x2e\x20\x33\xdd\xa0\x82\xa9"
> +                         "\x25\x76\x37\xe6\x3c\x67\x5b",
> +               .len    = 127,
> +       }, {
> +               .key    = "\x1c\x92\x40\xa5\xeb\x55\xd3\x8a"
> +                         "\xf3\x33\x88\x86\x04\xf6\xb5\xf0"
> +                         "\x47\x39\x17\xc1\x40\x2b\x80\x09"
> +                         "\x9d\xca\x5c\xbc\x20\x70\x75\xc0",
> +               .klen   = 32,
> +               .iv     = "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x01\x31\x58\xa3\x5a"
> +                         "\x25\x5d\x05\x17\x58\xe9\x5e\xd4"
> +                         "\x1c\x00\x00\x00\x00\x00\x00\x00",
> +               .ptext  = "\x49\xee\xe0\xdc\x24\x90\x40\xcd"
> +                         "\xc5\x40\x8f\x47\x05\xbc\xdd\x81"
> +                         "\x47\xc6\x8d\xe6\xb1\x8f\xd7\xcb"
> +                         "\x09\x0e\x6e\x22\x48\x1f\xbf\xb8"
> +                         "\x5c\xf7\x1e\x8a\xc1\x23\xf2\xd4"
> +                         "\x19\x4b\x01\x0f\x4e\xa4\x43\xce"
> +                         "\x01\xc6\x67\xda\x03\x91\x18\x90"
> +                         "\xa5\xa4\x8e\x45\x03\xb3\x2d\xac"
> +                         "\x74\x92\xd3\x53\x47\xc8\xdd\x25"
> +                         "\x53\x6c\x02\x03\x87\x0d\x11\x0c"
> +                         "\x58\xe3\x12\x18\xfd\x2a\x5b\x40"
> +                         "\x0c\x30\xf0\xb8\x3f\x43\xce\xae"
> +                         "\x65\x3a\x7d\x7c\xf4\x54\xaa\xcc"
> +                         "\x33\x97\xc3\x77\xba\xc5\x70\xde"
> +                         "\xd7\xd5\x13\xa5\x65\xc4\x5f\x0f"
> +                         "\x46\x1a\x0d\x97\xb5\xf3\xbb\x3c"
> +                         "\x84\x0f\x2b\xc5\xaa\xea\xf2\x6c"
> +                         "\xc9\xb5\x0c\xee\x15\xf3\x7d\xbe"
> +                         "\x9f\x7b\x5a\xa6\xae\x4f\x83\xb6"
> +                         "\x79\x49\x41\xf4\x58\x18\xcb\x86"
> +                         "\x7f\x30\x0e\xf8\x7d\x44\x36\xea"
> +                         "\x75\xeb\x88\x84\x40\x3c\xad\x4f"
> +                         "\x6f\x31\x6b\xaa\x5d\xe5\xa5\xc5"
> +                         "\x21\x66\xe9\xa7\xe3\xb2\x15\x88"
> +                         "\x78\xf6\x79\xa1\x59\x47\x12\x4e"
> +                         "\x9f\x9f\x64\x1a\xa0\x22\x5b\x08"
> +                         "\xbe\x7c\x36\xc2\x2b\x66\x33\x1b"
> +                         "\xdd\x60\x71\xf7\x47\x8c\x61\xc3"
> +                         "\xda\x8a\x78\x1e\x16\xfa\x1e\x86"
> +                         "\x81\xa6\x17\x2a\xa7\xb5\xc2\xe7"
> +                         "\xa4\xc7\x42\xf1\xcf\x6a\xca\xb4"
> +                         "\x45\xcf\xf3\x93\xf0\xe7\xea\xf6"
> +                         "\xf4\xe6\x33\x43\x84\x93\xa5\x67"
> +                         "\x9b\x16\x58\x58\x80\x0f\x2b\x5c"
> +                         "\x24\x74\x75\x7f\x95\x81\xb7\x30"
> +                         "\x7a\x33\xa7\xf7\x94\x87\x32\x27"
> +                         "\x10\x5d\x14\x4c\x43\x29\xdd\x26"
> +                         "\xbd\x3e\x3c\x0e\xfe\x0e\xa5\x10"
> +                         "\xea\x6b\x64\xfd\x73\xc6\xed\xec"
> +                         "\xa8\xc9\xbf\xb3\xba\x0b\x4d\x07"
> +                         "\x70\xfc\x16\xfd\x79\x1e\xd7\xc5"
> +                         "\x49\x4e\x1c\x8b\x8d\x79\x1b\xb1"
> +                         "\xec\xca\x60\x09\x4c\x6a\xd5\x09"
> +                         "\x49\x46\x00\x88\x22\x8d\xce\xea"
> +                         "\xb1\x17\x11\xde\x42\xd2\x23\xc1"
> +                         "\x72\x11\xf5\x50\x73\x04\x40\x47"
> +                         "\xf9\x5d\xe7\xa7\x26\xb1\x7e\xb0"
> +                         "\x3f\x58\xc1\x52\xab\x12\x67\x9d"
> +                         "\x3f\x43\x4b\x68\xd4\x9c\x68\x38"
> +                         "\x07\x8a\x2d\x3e\xf3\xaf\x6a\x4b"
> +                         "\xf9\xe5\x31\x69\x22\xf9\xa6\x69"
> +                         "\xc6\x9c\x96\x9a\x12\x35\x95\x1d"
> +                         "\x95\xd5\xdd\xbe\xbf\x93\x53\x24"
> +                         "\xfd\xeb\xc2\x0a\x64\xb0\x77\x00"
> +                         "\x6f\x88\xc4\x37\x18\x69\x7c\xd7"
> +                         "\x41\x92\x55\x4c\x03\xa1\x9a\x4b"
> +                         "\x15\xe5\xdf\x7f\x37\x33\x72\xc1"
> +                         "\x8b\x10\x67\xa3\x01\x57\x94\x25"
> +                         "\x7b\x38\x71\x7e\xdd\x1e\xcc\x73"
> +                         "\x55\xd2\x8e\xeb\x07\xdd\xf1\xda"
> +                         "\x58\xb1\x47\x90\xfe\x42\x21\x72"
> +                         "\xa3\x54\x7a\xa0\x40\xec\x9f\xdd"
> +                         "\xc6\x84\x6e\xca\xae\xe3\x68\xb4"
> +                         "\x9d\xe4\x78\xff\x57\xf2\xf8\x1b"
> +                         "\x03\xa1\x31\xd9\xde\x8d\xf5\x22"
> +                         "\x9c\xdd\x20\xa4\x1e\x27\xb1\x76"
> +                         "\x4f\x44\x55\xe2\x9b\xa1\x9c\xfe"
> +                         "\x54\xf7\x27\x1b\xf4\xde\x02\xf5"
> +                         "\x1b\x55\x48\x5c\xdc\x21\x4b\x9e"
> +                         "\x4b\x6e\xed\x46\x23\xdc\x65\xb2"
> +                         "\xcf\x79\x5f\x28\xe0\x9e\x8b\xe7"
> +                         "\x4c\x9d\x8a\xff\xc1\xa6\x28\xb8"
> +                         "\x65\x69\x8a\x45\x29\xef\x74\x85"
> +                         "\xde\x79\xc7\x08\xae\x30\xb0\xf4"
> +                         "\xa3\x1d\x51\x41\xab\xce\xcb\xf6"
> +                         "\xb5\xd8\x6d\xe0\x85\xe1\x98\xb3"
> +                         "\x43\xbb\x86\x83\x0a\xa0\xf5\xb7"
> +                         "\x04\x0b\xfa\x71\x1f\xb0\xf6\xd9"
> +                         "\x13\x00\x15\xf0\xc7\xeb\x0d\x5a"
> +                         "\x9f\xd7\xb9\x6c\x65\x14\x22\x45"
> +                         "\x6e\x45\x32\x3e\x7e\x60\x1a\x12"
> +                         "\x97\x82\x14\xfb\xaa\x04\x22\xfa"
> +                         "\xa0\xe5\x7e\x8c\x78\x02\x48\x5d"
> +                         "\x78\x33\x5a\x7c\xad\xdb\x29\xce"
> +                         "\xbb\x8b\x61\xa4\xb7\x42\xe2\xac"
> +                         "\x8b\x1a\xd9\x2f\x0b\x8b\x62\x21"
> +                         "\x83\x35\x7e\xad\x73\xc2\xb5\x6c"
> +                         "\x10\x26\x38\x07\xe5\xc7\x36\x80"
> +                         "\xe2\x23\x12\x61\xf5\x48\x4b\x2b"
> +                         "\xc5\xdf\x15\xd9\x87\x01\xaa\xac"
> +                         "\x1e\x7c\xad\x73\x78\x18\x63\xe0"
> +                         "\x8b\x9f\x81\xd8\x12\x6a\x28\x10"
> +                         "\xbe\x04\x68\x8a\x09\x7c\x1b\x1c"
> +                         "\x83\x66\x80\x47\x80\xe8\xfd\x35"
> +                         "\x1c\x97\x6f\xae\x49\x10\x66\xcc"
> +                         "\xc6\xd8\xcc\x3a\x84\x91\x20\x77"
> +                         "\x72\xe4\x24\xd2\x37\x9f\xc5\xc9"
> +                         "\x25\x94\x10\x5f\x40\x00\x64\x99"
> +                         "\xdc\xae\xd7\x21\x09\x78\x50\x15"
> +                         "\xac\x5f\xc6\x2c\xa2\x0b\xa9\x39"
> +                         "\x87\x6e\x6d\xab\xde\x08\x51\x16"
> +                         "\xc7\x13\xe9\xea\xed\x06\x8e\x2c"
> +                         "\xf8\x37\x8c\xf0\xa6\x96\x8d\x43"
> +                         "\xb6\x98\x37\xb2\x43\xed\xde\xdf"
> +                         "\x89\x1a\xe7\xeb\x9d\xa1\x7b\x0b"
> +                         "\x77\xb0\xe2\x75\xc0\xf1\x98\xd9"
> +                         "\x80\x55\xc9\x34\x91\xd1\x59\xe8"
> +                         "\x4b\x0f\xc1\xa9\x4b\x7a\x84\x06"
> +                         "\x20\xa8\x5d\xfa\xd1\xde\x70\x56"
> +                         "\x2f\x9e\x91\x9c\x20\xb3\x24\xd8"
> +                         "\x84\x3d\xe1\x8c\x7e\x62\x52\xe5"
> +                         "\x44\x4b\x9f\xc2\x93\x03\xea\x2b"
> +                         "\x59\xc5\xfa\x3f\x91\x2b\xbb\x23"
> +                         "\xf5\xb2\x7b\xf5\x38\xaf\xb3\xee"
> +                         "\x63\xdc\x7b\xd1\xff\xaa\x8b\xab"
> +                         "\x82\x6b\x37\x04\xeb\x74\xbe\x79"
> +                         "\xb9\x83\x90\xef\x20\x59\x46\xff"
> +                         "\xe9\x97\x3e\x2f\xee\xb6\x64\x18"
> +                         "\x38\x4c\x7a\x4a\xf9\x61\xe8\x9a"
> +                         "\xa1\xb5\x01\xa6\x47\xd3\x11\xd4"
> +                         "\xce\xd3\x91\x49\x88\xc7\xb8\x4d"
> +                         "\xb1\xb9\x07\x6d\x16\x72\xae\x46"
> +                         "\x5e\x03\xa1\x4b\xb6\x02\x30\xa8"
> +                         "\x3d\xa9\x07\x2a\x7c\x19\xe7\x62"
> +                         "\x87\xe3\x82\x2f\x6f\xe1\x09\xd9"
> +                         "\x94\x97\xea\xdd\x58\x9e\xae\x76"
> +                         "\x7e\x35\xe5\xb4\xda\x7e\xf4\xde"
> +                         "\xf7\x32\x87\xcd\x93\xbf\x11\x56"
> +                         "\x11\xbe\x08\x74\xe1\x69\xad\xe2"
> +                         "\xd7\xf8\x86\x75\x8a\x3c\xa4\xbe"
> +                         "\x70\xa7\x1b\xfc\x0b\x44\x2a\x76"
> +                         "\x35\xea\x5d\x85\x81\xaf\x85\xeb"
> +                         "\xa0\x1c\x61\xc2\xf7\x4f\xa5\xdc"
> +                         "\x02\x7f\xf6\x95\x40\x6e\x8a\x9a"
> +                         "\xf3\x5d\x25\x6e\x14\x3a\x22\xc9"
> +                         "\x37\x1c\xeb\x46\x54\x3f\xa5\x91"
> +                         "\xc2\xb5\x8c\xfe\x53\x08\x97\x32"
> +                         "\x1b\xb2\x30\x27\xfe\x25\x5d\xdc"
> +                         "\x08\x87\xd0\xe5\x94\x1a\xd4\xf1"
> +                         "\xfe\xd6\xb4\xa3\xe6\x74\x81\x3c"
> +                         "\x1b\xb7\x31\xa7\x22\xfd\xd4\xdd"
> +                         "\x20\x4e\x7c\x51\xb0\x60\x73\xb8"
> +                         "\x9c\xac\x91\x90\x7e\x01\xb0\xe1"
> +                         "\x8a\x2f\x75\x1c\x53\x2a\x98\x2a"
> +                         "\x06\x52\x95\x52\xb2\xe9\x25\x2e"
> +                         "\x4c\xe2\x5a\x00\xb2\x13\x81\x03"
> +                         "\x77\x66\x0d\xa5\x99\xda\x4e\x8c"
> +                         "\xac\xf3\x13\x53\x27\x45\xaf\x64"
> +                         "\x46\xdc\xea\x23\xda\x97\xd1\xab"
> +                         "\x7d\x6c\x30\x96\x1f\xbc\x06\x34"
> +                         "\x18\x0b\x5e\x21\x35\x11\x8d\x4c"
> +                         "\xe0\x2d\xe9\x50\x16\x74\x81\xa8"
> +                         "\xb4\x34\xb9\x72\x42\xa6\xcc\xbc"
> +                         "\xca\x34\x83\x27\x10\x5b\x68\x45"
> +                         "\x8f\x52\x22\x0c\x55\x3d\x29\x7c"
> +                         "\xe3\xc0\x66\x05\x42\x91\x5f\x58"
> +                         "\xfe\x4a\x62\xd9\x8c\xa9\x04\x19"
> +                         "\x04\xa9\x08\x4b\x57\xfc\x67\x53"
> +                         "\x08\x7c\xbc\x66\x8a\xb0\xb6\x9f"
> +                         "\x92\xd6\x41\x7c\x5b\x2a\x00\x79"
> +                         "\x72",
> +               .ctext  = "\xe1\xb6\x8b\x5c\x80\xb8\xcc\x08"
> +                         "\x1b\x84\xb2\xd1\xad\xa4\x70\xac"
> +                         "\x67\xa9\x39\x27\xac\xb4\x5b\xb7"
> +                         "\x4c\x26\x77\x23\x1d\xce\x0a\xbe"
> +                         "\x18\x9e\x42\x8b\xbd\x7f\xd6\xf1"
> +                         "\xf1\x6b\xe2\x6d\x7f\x92\x0e\xcb"
> +                         "\xb8\x79\xba\xb4\xac\x7e\x2d\xc0"
> +                         "\x9e\x83\x81\x91\xd5\xea\xc3\x12"
> +                         "\x8d\xa4\x26\x70\xa4\xf9\x71\x0b"
> +                         "\xbd\x2e\xe1\xb3\x80\x42\x25\xb3"
> +                         "\x0b\x31\x99\xe1\x0d\xde\xa6\x90"
> +                         "\xf2\xa3\x10\xf7\xe5\xf3\x83\x1e"
> +                         "\x2c\xfb\x4d\xf0\x45\x3d\x28\x3c"
> +                         "\xb8\xf1\xcb\xbf\x67\xd8\x43\x5a"
> +                         "\x9d\x7b\x73\x29\x88\x0f\x13\x06"
> +                         "\x37\x50\x0d\x7c\xe6\x9b\x07\xdd"
> +                         "\x7e\x01\x1f\x81\x90\x10\x69\xdb"
> +                         "\xa4\xad\x8a\x5e\xac\x30\x72\xf2"
> +                         "\x36\xcd\xe3\x23\x49\x02\x93\xfa"
> +                         "\x3d\xbb\xe2\x98\x83\xeb\xe9\x8d"
> +                         "\xb3\x8f\x11\xaa\x53\xdb\xaf\x2e"
> +                         "\x95\x13\x99\x3d\x71\xbd\x32\x92"
> +                         "\xdd\xfc\x9d\x5e\x6f\x63\x2c\xee"
> +                         "\x91\x1f\x4c\x64\x3d\x87\x55\x0f"
> +                         "\xcc\x3d\x89\x61\x53\x02\x57\x8f"
> +                         "\xe4\x77\x29\x32\xaf\xa6\x2f\x0a"
> +                         "\xae\x3c\x3f\x3f\xf4\xfb\x65\x52"
> +                         "\xc5\xc1\x78\x78\x53\x28\xad\xed"
> +                         "\xd1\x67\x37\xc7\x59\x70\xcd\x0a"
> +                         "\xb8\x0f\x80\x51\x9f\xc0\x12\x5e"
> +                         "\x06\x0a\x7e\xec\x24\x5f\x73\x00"
> +                         "\xb1\x0b\x31\x47\x4f\x73\x8d\xb4"
> +                         "\xce\xf3\x55\x45\x6c\x84\x27\xba"
> +                         "\xb9\x6f\x03\x4a\xeb\x98\x88\x6e"
> +                         "\x53\xed\x25\x19\x0d\x8f\xfe\xca"
> +                         "\x60\xe5\x00\x93\x6e\x3c\xff\x19"
> +                         "\xae\x08\x3b\x8a\xa6\x84\x05\xfe"
> +                         "\x9b\x59\xa0\x8c\xc8\x05\x45\xf5"
> +                         "\x05\x37\xdc\x45\x6f\x8b\x95\x8c"
> +                         "\x4e\x11\x45\x7a\xce\x21\xa5\xf7"
> +                         "\x71\x67\xb9\xce\xd7\xf9\xe9\x5e"
> +                         "\x60\xf5\x53\x7a\xa8\x85\x14\x03"
> +                         "\xa0\x92\xec\xf3\x51\x80\x84\xc4"
> +                         "\xdc\x11\x9e\x57\xce\x4b\x45\xcf"
> +                         "\x90\x95\x85\x0b\x96\xe9\xee\x35"
> +                         "\x10\xb8\x9b\xf2\x59\x4a\xc6\x7e"
> +                         "\x85\xe5\x6f\x38\x51\x93\x40\x0c"
> +                         "\x99\xd7\x7f\x32\xa8\x06\x27\xd1"
> +                         "\x2b\xd5\xb5\x3a\x1a\xe1\x5e\xda"
> +                         "\xcd\x5a\x50\x30\x3c\xc7\xe7\x65"
> +                         "\xa6\x07\x0b\x98\x91\xc6\x20\x27"
> +                         "\x2a\x03\x63\x1b\x1e\x3d\xaf\xc8"
> +                         "\x71\x48\x46\x6a\x64\x28\xf9\x3d"
> +                         "\xd1\x1d\xab\xc8\x40\x76\xc2\x39"
> +                         "\x4e\x00\x75\xd2\x0e\x82\x58\x8c"
> +                         "\xd3\x73\x5a\xea\x46\x89\xbe\xfd"
> +                         "\x4e\x2c\x0d\x94\xaa\x9b\x68\xac"
> +                         "\x86\x87\x30\x7e\xa9\x16\xcd\x59"
> +                         "\xd2\xa6\xbe\x0a\xd8\xf5\xfd\x2d"
> +                         "\x49\x69\xd2\x1a\x90\xd2\x1b\xed"
> +                         "\xff\x71\x04\x87\x87\x21\xc4\xb8"
> +                         "\x1f\x5b\x51\x33\xd0\xd6\x59\x9a"
> +                         "\x03\x0e\xd3\x8b\xfb\x57\x73\xfd"
> +                         "\x5a\x52\x63\x82\xc8\x85\x2f\xcb"
> +                         "\x74\x6d\x4e\xd9\x68\x37\x85\x6a"
> +                         "\xd4\xfb\x94\xed\x8d\xd1\x1a\xaf"
> +                         "\x76\xa7\xb7\x88\xd0\x2b\x4e\xda"
> +                         "\xec\x99\x94\x27\x6f\x87\x8c\xdf"
> +                         "\x4b\x5e\xa6\x66\xdd\xcb\x33\x7b"
> +                         "\x64\x94\x31\xa8\x37\xa6\x1d\xdb"
> +                         "\x0d\x5c\x93\xa4\x40\xf9\x30\x53"
> +                         "\x4b\x74\x8d\xdd\xf6\xde\x3c\xac"
> +                         "\x5c\x80\x01\x3a\xef\xb1\x9a\x02"
> +                         "\x0c\x22\x8e\xe7\x44\x09\x74\x4c"
> +                         "\xf2\x9a\x27\x69\x7f\x12\x32\x36"
> +                         "\xde\x92\xdf\xde\x8f\x5b\x31\xab"
> +                         "\x4a\x01\x26\xe0\xb1\xda\xe8\x37"
> +                         "\x21\x64\xe8\xff\x69\xfc\x9e\x41"
> +                         "\xd2\x96\x2d\x18\x64\x98\x33\x78"
> +                         "\x24\x61\x73\x9b\x47\x29\xf1\xa7"
> +                         "\xcb\x27\x0f\xf0\x85\x6d\x8c\x9d"
> +                         "\x2c\x95\x9e\xe5\xb2\x8e\x30\x29"
> +                         "\x78\x8a\x9d\x65\xb4\x8e\xde\x7b"
> +                         "\xd9\x00\x50\xf5\x7f\x81\xc3\x1b"
> +                         "\x25\x85\xeb\xc2\x8c\x33\x22\x1e"
> +                         "\x68\x38\x22\x30\xd8\x2e\x00\x98"
> +                         "\x85\x16\x06\x56\xb4\x81\x74\x20"
> +                         "\x95\xdb\x1c\x05\x19\xe8\x23\x4d"
> +                         "\x65\x5d\xcc\xd8\x7f\xc4\x2d\x0f"
> +                         "\x57\x26\x71\x07\xad\xaa\x71\x9f"
> +                         "\x19\x76\x2f\x25\x51\x88\xe4\xc0"
> +                         "\x82\x6e\x08\x05\x37\x04\xee\x25"
> +                         "\x23\x90\xe9\x4e\xce\x9b\x16\xc1"
> +                         "\x31\xe7\x6e\x2c\x1b\xe1\x85\x9a"
> +                         "\x0c\x8c\xbb\x12\x1e\x68\x7b\x93"
> +                         "\xa9\x3c\x39\x56\x23\x3e\x6e\xc7"
> +                         "\x77\x84\xd3\xe0\x86\x59\xaa\xb9"
> +                         "\xd5\x53\x58\xc9\x0a\x83\x5f\x85"
> +                         "\xd8\x47\x14\x67\x8a\x3c\x17\xe0"
> +                         "\xab\x02\x51\xea\xf1\xf0\x4f\x30"
> +                         "\x7d\xe0\x92\xc2\x5f\xfb\x19\x5a"
> +                         "\x3f\xbd\xf4\x39\xa4\x31\x0c\x39"
> +                         "\xd1\xae\x4e\xf7\x65\x7f\x1f\xce"
> +                         "\xc2\x39\xd1\x84\xd4\xe5\x02\xe0"
> +                         "\x58\xaa\xf1\x5e\x81\xaf\x7f\x72"
> +                         "\x0f\x08\x99\x43\xb9\xd8\xac\x41"
> +                         "\x35\x55\xf2\xb2\xd4\x98\xb8\x3b"
> +                         "\x2b\x3c\x3e\x16\x06\x31\xfc\x79"
> +                         "\x47\x38\x63\x51\xc5\xd0\x26\xd7"
> +                         "\x43\xb4\x2b\xd9\xc5\x05\xf2\x9d"
> +                         "\x18\xc9\x26\x82\x56\xd2\x11\x05"
> +                         "\xb6\x89\xb4\x43\x9c\xb5\x9d\x11"
> +                         "\x6c\x83\x37\x71\x27\x1c\xae\xbf"
> +                         "\xcd\x57\xd2\xee\x0d\x5a\x15\x26"
> +                         "\x67\x88\x80\x80\x1b\xdc\xc1\x62"
> +                         "\xdd\x4c\xff\x92\x5c\x6c\xe1\xa0"
> +                         "\xe3\x79\xa9\x65\x8c\x8c\x14\x42"
> +                         "\xe5\x11\xd2\x1a\xad\xa9\x56\x6f"
> +                         "\x98\xfc\x8a\x7b\x56\x1f\xc6\xc1"
> +                         "\x52\x12\x92\x9b\x41\x0f\x4b\xae"
> +                         "\x1b\x4a\xbc\xfe\x23\xb6\x94\x70"
> +                         "\x04\x30\x9e\x69\x47\xbe\xb8\x8f"
> +                         "\xca\x45\xd7\x8a\xf4\x78\x3e\xaa"
> +                         "\x71\x17\xd8\x1e\xb8\x11\x8f\xbc"
> +                         "\xc8\x1a\x65\x7b\x41\x89\x72\xc7"
> +                         "\x5f\xbe\xc5\x2a\xdb\x5c\x54\xf9"
> +                         "\x25\xa3\x7a\x80\x56\x9c\x8c\xab"
> +                         "\x26\x19\x10\x36\xa6\xf3\x14\x79"
> +                         "\x40\x98\x70\x68\xb7\x35\xd9\xb9"
> +                         "\x27\xd4\xe7\x74\x5b\x3d\x97\xb4"
> +                         "\xd9\xaa\xd9\xf2\xb5\x14\x84\x1f"
> +                         "\xa9\xde\x12\x44\x5b\x00\xc0\xbc"
> +                         "\xc8\x11\x25\x1b\x67\x7a\x15\x72"
> +                         "\xa6\x31\x6f\xf4\x68\x7a\x86\x9d"
> +                         "\x43\x1c\x5f\x16\xd3\xad\x2e\x52"
> +                         "\xf3\xb4\xc3\xfa\x27\x2e\x68\x6c"
> +                         "\x06\xe7\x4c\x4f\xa2\xe0\xe4\x21"
> +                         "\x5d\x9e\x33\x58\x8d\xbf\xd5\x70"
> +                         "\xf8\x80\xa5\xdd\xe7\x18\x79\xfa"
> +                         "\x7b\xfd\x09\x69\x2c\x37\x32\xa8"
> +                         "\x65\xfa\x8d\x8b\x5c\xcc\xe8\xf3"
> +                         "\x37\xf6\xa6\xc6\x5c\xa2\x66\x79"
> +                         "\xfa\x8a\xa7\xd1\x0b\x2e\x1b\x5e"
> +                         "\x95\x35\x00\x76\xae\x42\xf7\x50"
> +                         "\x51\x78\xfb\xb4\x28\x24\xde\x1a"
> +                         "\x70\x8b\xed\xca\x3c\x5e\xe4\xbd"
> +                         "\x28\xb5\xf3\x76\x4f\x67\x5d\x81"
> +                         "\xb2\x60\x87\xd9\x7b\x19\x1a\xa7"
> +                         "\x79\xa2\xfa\x3f\x9e\xa9\xd7\x25"
> +                         "\x61\xe1\x74\x31\xa2\x77\xa0\x1b"
> +                         "\xf6\xf7\xcb\xc5\xaa\x9e\xce\xf9"
> +                         "\x9b\x96\xef\x51\xc3\x1a\x44\x96"
> +                         "\xae\x17\x50\xab\x29\x08\xda\xcc"
> +                         "\x1a\xb3\x12\xd0\x24\xe4\xe2\xe0"
> +                         "\xc6\xe3\xcc\x82\xd0\xba\x47\x4c"
> +                         "\x3f\x49\xd7\xe8\xb6\x61\xaa\x65"
> +                         "\x25\x18\x40\x2d\x62\x25\x02\x71"
> +                         "\x61\xa2\xc1\xb2\x13\xd2\x71\x3f"
> +                         "\x43\x1a\xc9\x09\x92\xff\xd5\x57"
> +                         "\xf0\xfc\x5e\x1c\xf1\xf5\xf9\xf3"
> +                         "\x5b",
> +               .len    = 1281,
> +               .also_non_np = 1,
> +               .np     = 3,
> +               .tap    = { 1200, 1, 80 },
> +       },
> +};
> +
>  /*
>   * CTS (Cipher Text Stealing) mode tests
>   */
> diff --git a/include/crypto/chacha.h b/include/crypto/chacha.h
> index ae79e9983c72f..3d261f5cd156d 100644
> --- a/include/crypto/chacha.h
> +++ b/include/crypto/chacha.h
> @@ -5,6 +5,11 @@
>   * XChaCha extends ChaCha's nonce to 192 bits, while provably retaining ChaCha's
>   * security.  Here they share the same key size, tfm context, and setkey
>   * function; only their IV size and encrypt/decrypt function differ.
> + *
> + * The ChaCha paper specifies 20, 12, and 8-round variants.  In general, it is
> + * recommended to use the 20-round variant ChaCha20.  However, the other
> + * variants can be needed in some performance-sensitive scenarios.  The generic
> + * ChaCha code currently allows only the 20 and 12-round variants.
>   */
>
>  #ifndef _CRYPTO_CHACHA_H
> @@ -39,6 +44,8 @@ void crypto_chacha_init(u32 *state, struct chacha_ctx *ctx, u8 *iv);
>
>  int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
>                            unsigned int keysize);
> +int crypto_chacha12_setkey(struct crypto_skcipher *tfm, const u8 *key,
> +                          unsigned int keysize);
>
>  int crypto_chacha_crypt(struct skcipher_request *req);
>  int crypto_xchacha_crypt(struct skcipher_request *req);
> diff --git a/lib/chacha.c b/lib/chacha.c
> index 0a2c2e5b7b84d..c4d69a83fcd2d 100644
> --- a/lib/chacha.c
> +++ b/lib/chacha.c
> @@ -21,7 +21,7 @@ static void chacha_permute(u32 *x, int nrounds)
>         int i;
>
>         /* whitelist the allowed round counts */
> -       BUG_ON(nrounds != 20);
> +       BUG_ON(nrounds != 20 && nrounds != 12);
>

I didn't spot this until this patch, but BUG_ON() may bring down the
kernel, and so it should really only be used as a last resort. (i.e.,
if this is called from non-process context things may explode rather
painfully)

I didn't look at the entire file [which is a bit cumbersome while
reviewing incremental changes like this] and so I don't really have
another suggestion right now, but please try to come up with something
better if you can.


>         for (i = 0; i < nrounds; i += 2) {
>                 x[0]  += x[4];    x[12] = rol32(x[12] ^ x[0],  16);
> @@ -70,7 +70,7 @@ static void chacha_permute(u32 *x, int nrounds)
>   * chacha_block - generate one keystream block and increment block counter
>   * @state: input state matrix (16 32-bit words)
>   * @stream: output keystream block (64 bytes)
> - * @nrounds: number of rounds (currently must be 20)
> + * @nrounds: number of rounds (20 or 12; 20 is recommended)
>   *
>   * This is the ChaCha core, a function from 64-byte strings to 64-byte strings.
>   * The caller has already converted the endianness of the input.  This function
> @@ -96,7 +96,7 @@ EXPORT_SYMBOL(chacha_block);
>   * hchacha_block - abbreviated ChaCha core, for XChaCha
>   * @in: input state matrix (16 32-bit words)
>   * @out: output (8 32-bit words)
> - * @nrounds: number of rounds (currently must be 20)
> + * @nrounds: number of rounds (20 or 12; 20 is recommended)
>   *
>   * HChaCha is the ChaCha equivalent of HSalsa and is an intermediate step
>   * towards XChaCha (see https://cr.yp.to/snuffle/xsalsa-20081128.pdf).  HChaCha
> --
> 2.19.1.331.ge82ca0e54c-goog
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 00/12] crypto: Adiantum support
  2018-10-15 17:54 [RFC PATCH v2 00/12] crypto: Adiantum support Eric Biggers
                   ` (11 preceding siblings ...)
  2018-10-15 17:54 ` [RFC PATCH v2 12/12] fscrypt: " Eric Biggers
@ 2018-10-19 15:58 ` Jason A. Donenfeld
  2018-10-19 18:19   ` Paul Crowley
  2018-10-19 19:04   ` Eric Biggers
  12 siblings, 2 replies; 54+ messages in thread
From: Jason A. Donenfeld @ 2018-10-19 15:58 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Linux Crypto Mailing List, linux-fscrypt, linux-arm-kernel, LKML,
	Herbert Xu, Paul Crowley, Greg Kaiser, Michael Halcrow,
	Samuel Neves, Tomer Ashur

Hello Eric,

> As before, some of these patches conflict with the new "Zinc" crypto
> library.  But I don't know when Zinc will be merged, so for now I've
> continued to base this patchset on the current 'cryptodev'.

I'd appreciate it if you waited to merge this until you can rebase it
on top of Zinc. In fact, if you already want to build it on top of
Zinc, I'm happy to work with you on that in a shared repo or similar.
We can also hash out the details of that in person in Vancouver in a
few weeks. I think pushing this in before will create undesirable
churn for both of us.

> Therefore, we (well, Paul Crowley did the real work) designed a new
> encryption mode, Adiantum.  In essence, Adiantum makes it secure to use
> the ChaCha stream cipher for disk encryption.  Adiantum is specified by
> our paper here: https://eprint.iacr.org/2018/720.pdf ("Adiantum:
> length-preserving encryption for entry-level processors").  Reference
> code and test vectors are here: https://github.com/google/adiantum.
> Most of the high-level concepts of Adiantum are not new; similar
> existing modes include XCB, HCTR, and HCH.  Adiantum and these modes are
> true wide-block modes (tweakable super-pseudorandom permutations), so
> they actually provide a stronger notion of security than XTS.

Great, I'm very happy to see you've created such a high performance alternative.

Before merging this into the kernel, do you want to wait until you've
received some public review from academia?

Jason

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 00/12] crypto: Adiantum support
  2018-10-19 15:58 ` [RFC PATCH v2 00/12] crypto: " Jason A. Donenfeld
@ 2018-10-19 18:19   ` Paul Crowley
  2018-10-20  3:24     ` Ard Biesheuvel
       [not found]     ` <2395454e-a0dc-408f-4138-9d15ab5f20b8@esat.kuleuven.be>
  2018-10-19 19:04   ` Eric Biggers
  1 sibling, 2 replies; 54+ messages in thread
From: Paul Crowley @ 2018-10-19 18:19 UTC (permalink / raw)
  To: Jason
  Cc: ebiggers, linux-crypto, linux-fscrypt, linux-arm-kernel,
	linux-kernel, Herbert Xu, Greg Kaiser, Michael Halcrow,
	samuel.c.p.neves, tomer.ashur

On Fri, 19 Oct 2018 at 08:58, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> Before merging this into the kernel, do you want to wait until you've
> received some public review from academia?

I would prefer not to wait. Unlike a new primitive whose strength can
only be known through attempts at cryptanalysis, Adiantum is a
construction based on well-understood and trusted primitives; it is
secure if the proof accompanying it is correct. Given that (outside
competitions or standardization efforts) no-one ever issues public
statements that they think algorithms or proofs are good, what I'm
expecting from academia is silence :) The most we could hope for would
be getting the paper accepted at a conference, and we're pursuing that
but there's a good chance that won't happen simply because it's not
very novel. It basically takes existing ideas and applies them using a
stream cipher instead of a block cipher, and a faster hashing mode;
it's also a small update from HPolyC. I've had some private feedback
that the proof seems correct, and that's all I'm expecting to get.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 04/12] crypto: chacha - add XChaCha12 support
  2018-10-19 14:34   ` Ard Biesheuvel
@ 2018-10-19 18:28     ` Eric Biggers
  0 siblings, 0 replies; 54+ messages in thread
From: Eric Biggers @ 2018-10-19 18:28 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE,
	Jason A . Donenfeld, Greg Kaiser, Herbert Xu, Samuel Neves,
	Michael Halcrow, Linux Kernel Mailing List, linux-fscrypt,
	Tomer Ashur, linux-arm-kernel, Paul Crowley

Hi Ard,

On Fri, Oct 19, 2018 at 10:34:41PM +0800, Ard Biesheuvel wrote:
> > diff --git a/include/crypto/chacha.h b/include/crypto/chacha.h
> > index ae79e9983c72f..3d261f5cd156d 100644
> > --- a/include/crypto/chacha.h
> > +++ b/include/crypto/chacha.h
> > @@ -5,6 +5,11 @@
> >   * XChaCha extends ChaCha's nonce to 192 bits, while provably retaining ChaCha's
> >   * security.  Here they share the same key size, tfm context, and setkey
> >   * function; only their IV size and encrypt/decrypt function differ.
> > + *
> > + * The ChaCha paper specifies 20, 12, and 8-round variants.  In general, it is
> > + * recommended to use the 20-round variant ChaCha20.  However, the other
> > + * variants can be needed in some performance-sensitive scenarios.  The generic
> > + * ChaCha code currently allows only the 20 and 12-round variants.
> >   */
> >
> >  #ifndef _CRYPTO_CHACHA_H
> > @@ -39,6 +44,8 @@ void crypto_chacha_init(u32 *state, struct chacha_ctx *ctx, u8 *iv);
> >
> >  int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
> >                            unsigned int keysize);
> > +int crypto_chacha12_setkey(struct crypto_skcipher *tfm, const u8 *key,
> > +                          unsigned int keysize);
> >
> >  int crypto_chacha_crypt(struct skcipher_request *req);
> >  int crypto_xchacha_crypt(struct skcipher_request *req);
> > diff --git a/lib/chacha.c b/lib/chacha.c
> > index 0a2c2e5b7b84d..c4d69a83fcd2d 100644
> > --- a/lib/chacha.c
> > +++ b/lib/chacha.c
> > @@ -21,7 +21,7 @@ static void chacha_permute(u32 *x, int nrounds)
> >         int i;
> >
> >         /* whitelist the allowed round counts */
> > -       BUG_ON(nrounds != 20);
> > +       BUG_ON(nrounds != 20 && nrounds != 12);
> >
> 
> I didn't spot this until this patch, but BUG_ON() may bring down the
> kernel, and so it should really only be used as a last resort. (i.e.,
> if this is called from non-process context things may explode rather
> painfully)
> 
> I didn't look at the entire file [which is a bit cumbersome while
> reviewing incremental changes like this] and so I don't really have
> another suggestion right now, but please try to come up with something
> better if you can.
> 

I'll change it to WARN_ON_ONCE(), I guess.  I do still want it to be very noisy
if something fishy is going on with the round count.
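
To illustrate the difference being discussed, here is a minimal userspace
sketch of a "warn once, then fail" round-count check.  It is not the kernel
code: warn_on_once() is a hypothetical stand-in for the kernel's
WARN_ON_ONCE() macro, chacha_check_rounds() is an invented helper name, and
returning -1 to the caller is an assumption of this sketch (the patch itself
only asserts inside chacha_permute()).

```c
#include <stdio.h>

/* Number of warnings actually printed; only here so the sketch is testable. */
static int warnings_printed;

/*
 * Hypothetical stand-in for WARN_ON_ONCE(): report the first time the
 * condition holds, stay silent afterwards, and always return the condition
 * so the caller can still bail out.
 */
static int warn_on_once(int cond, int *already_warned, const char *msg)
{
	if (cond && !*already_warned) {
		*already_warned = 1;
		warnings_printed++;
		fprintf(stderr, "WARNING: %s\n", msg);
	}
	return cond;
}

/* Returns 0 if nrounds is whitelisted (20 or 12), -1 otherwise. */
static int chacha_check_rounds(int nrounds)
{
	static int warned;

	if (warn_on_once(nrounds != 20 && nrounds != 12, &warned,
			 "unsupported ChaCha round count"))
		return -1;
	return 0;
}
```

Unlike BUG_ON(), a bad round count here is loud exactly once and the process
keeps running; whether the caller should then refuse to encrypt is a separate
design decision.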

- Eric

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 00/12] crypto: Adiantum support
  2018-10-19 15:58 ` [RFC PATCH v2 00/12] crypto: " Jason A. Donenfeld
  2018-10-19 18:19   ` Paul Crowley
@ 2018-10-19 19:04   ` Eric Biggers
  2018-10-20 10:26     ` Milan Broz
  2018-10-21 22:23     ` Eric Biggers
  1 sibling, 2 replies; 54+ messages in thread
From: Eric Biggers @ 2018-10-19 19:04 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Linux Crypto Mailing List, linux-fscrypt, linux-arm-kernel, LKML,
	Herbert Xu, Paul Crowley, Greg Kaiser, Michael Halcrow,
	Samuel Neves, Tomer Ashur

Hi Jason,

On Fri, Oct 19, 2018 at 05:58:35PM +0200, Jason A. Donenfeld wrote:
> Hello Eric,
> 
> > As before, some of these patches conflict with the new "Zinc" crypto
> > library.  But I don't know when Zinc will be merged, so for now I've
> > continued to base this patchset on the current 'cryptodev'.
> 
> I'd appreciate it if you waited to merge this until you can rebase it
> on top of Zinc. In fact, if you already want to build it on top of
> Zinc, I'm happy to work with you on that in a shared repo or similar.
> We can also hash out the details of that in person in Vancouver in a
> few weeks. I think pushing this in before will create undesirable
> churn for both of us.
> 

I won't be at Plumbers, sorry!  For if/when it's needed, I'll start a version of
this based on Zinc.  The basic requirements are that we need (1) xchacha12 and
xchacha20 available as 'skciphers' in the crypto API, and (2) the poly1305_core
functions (see patch 08/12).  In principle, these can be implemented in Zinc.
The Adiantum template and all the NHPoly1305 stuff will be the same either way.
(Unless you'll want one or both of those moved to Zinc too.  To be honest, even
after your explanations I still don't have a clear idea of what is supposed to
go in Zinc and what isn't...)

However, for now I'm hesitant to completely abandon the current approach and bet
the farm on Zinc.  Zinc has a large scope and various controversies that haven't
yet been fully resolved to everyone's satisfaction, including unclear licenses
on some of the essential assembly files.  It's not appropriate to grind kernel
crypto development to a halt while everyone waits for Zinc.

So if Zinc is ready, then it makes sense for it to go first;
otherwise, it doesn't.  It's not yet clear which is the case.

Thanks,

- Eric

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 05/12] crypto: arm/chacha20 - add XChaCha20 support
  2018-10-15 17:54 ` [RFC PATCH v2 05/12] crypto: arm/chacha20 - add XChaCha20 support Eric Biggers
@ 2018-10-20  2:29   ` Ard Biesheuvel
  0 siblings, 0 replies; 54+ messages in thread
From: Ard Biesheuvel @ 2018-10-20  2:29 UTC (permalink / raw)
  To: Eric Biggers
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE,
	Jason A . Donenfeld, Greg Kaiser, Herbert Xu, Samuel Neves,
	Michael Halcrow, Linux Kernel Mailing List, linux-fscrypt,
	Tomer Ashur, linux-arm-kernel, Paul Crowley

On 16 October 2018 at 01:54, Eric Biggers <ebiggers@kernel.org> wrote:
> From: Eric Biggers <ebiggers@google.com>
>
> Add an XChaCha20 implementation that is hooked up to the ARM NEON
> implementation of ChaCha20.  This is needed for use in the Adiantum
> encryption mode; see the generic code patch,
> "crypto: chacha20-generic - add XChaCha20 support", for more details.
>
> We also update the NEON code to support HChaCha20 on one block, so we
> can use that in XChaCha20 rather than calling the generic HChaCha20.
> This required factoring the permutation out into its own macro.
>
> Signed-off-by: Eric Biggers <ebiggers@google.com>
> ---
>  arch/arm/crypto/Kconfig              |   2 +-
>  arch/arm/crypto/chacha20-neon-core.S |  68 ++++++++++------
>  arch/arm/crypto/chacha20-neon-glue.c | 111 ++++++++++++++++++++-------
>  3 files changed, 130 insertions(+), 51 deletions(-)
>
> diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
> index ef0c7feea6e29..0aa1471f27d2e 100644
> --- a/arch/arm/crypto/Kconfig
> +++ b/arch/arm/crypto/Kconfig
> @@ -117,7 +117,7 @@ config CRYPTO_CRC32_ARM_CE
>         select CRYPTO_HASH
>
>  config CRYPTO_CHACHA20_NEON
> -       tristate "NEON accelerated ChaCha20 symmetric cipher"
> +       tristate "NEON accelerated ChaCha20 stream cipher algorithms"
>         depends on KERNEL_MODE_NEON
>         select CRYPTO_BLKCIPHER
>         select CRYPTO_CHACHA20
> diff --git a/arch/arm/crypto/chacha20-neon-core.S b/arch/arm/crypto/chacha20-neon-core.S
> index 50e7b98968189..db59f1fbc728b 100644
> --- a/arch/arm/crypto/chacha20-neon-core.S
> +++ b/arch/arm/crypto/chacha20-neon-core.S
> @@ -52,33 +52,22 @@
>         .fpu            neon
>         .align          5
>
> -ENTRY(chacha20_block_xor_neon)
> -       // r0: Input state matrix, s
> -       // r1: 1 data block output, o
> -       // r2: 1 data block input, i
> -
> -       //
> -       // This function encrypts one ChaCha20 block by loading the state matrix
> -       // in four NEON registers. It performs matrix operation on four words in
> -       // parallel, but requireds shuffling to rearrange the words after each
> -       // round.
> -       //
> -
> -       // x0..3 = s0..3
> -       add             ip, r0, #0x20
> -       vld1.32         {q0-q1}, [r0]
> -       vld1.32         {q2-q3}, [ip]
> -
> -       vmov            q8, q0
> -       vmov            q9, q1
> -       vmov            q10, q2
> -       vmov            q11, q3
> +/*
> + * _chacha20_permute - permute one block
> + *
> + * Permute one 64-byte block where the state matrix is stored in the four NEON
> + * registers q0-q3.  It performs matrix operation on four words in parallel, but

operations [since you're touching this anyway]

> + * requires shuffling to rearrange the words after each round.
> + *
> + * Clobbers: r3, q4-q5
> + */
> +.macro _chacha20_permute
>

As you know, I'd prefer the GAS directives to be indented and their
arguments to be aligned with the right hand sides of the ordinary
instructions. However, this entire file may end up getting replaced
once we move to your scalar version combined with AndyP's NEON
version, at which point it may no longer matter. [Does that code
support the alternative xchacha constructions btw?]
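
For anyone following the XChaCha question, here is a rough userspace sketch
of the construction as this series describes it: HChaCha runs the ChaCha
permutation on a 16-word state and outputs 8 words (the first and last rows,
with no feed-forward), and the glue code then builds the 16-byte IV for the
inner ChaCha from bytes 24..31 and 16..23 of the 32-byte XChaCha IV.  The
names hchacha_sketch() and xchacha_real_iv() are made up for this example,
and returning -1 for an unsupported round count is an assumption of the
sketch, not what the patch does.

```c
#include <stdint.h>
#include <string.h>

static uint32_t rol32(uint32_t v, int n)
{
	return (v << n) | (v >> (32 - n));
}

/* One ChaCha quarter round on words a, b, c, d of the local array x. */
#define QR(a, b, c, d) do {				\
	x[a] += x[b]; x[d] = rol32(x[d] ^ x[a], 16);	\
	x[c] += x[d]; x[b] = rol32(x[b] ^ x[c], 12);	\
	x[a] += x[b]; x[d] = rol32(x[d] ^ x[a], 8);	\
	x[c] += x[d]; x[b] = rol32(x[b] ^ x[c], 7);	\
} while (0)

/* nrounds / 2 double rounds: four column rounds, then four diagonal rounds. */
static void permute(uint32_t x[16], int nrounds)
{
	int i;

	for (i = 0; i < nrounds; i += 2) {
		QR(0, 4, 8, 12);  QR(1, 5, 9, 13);
		QR(2, 6, 10, 14); QR(3, 7, 11, 15);
		QR(0, 5, 10, 15); QR(1, 6, 11, 12);
		QR(2, 7, 8, 13);  QR(3, 4, 9, 14);
	}
}

/* HChaCha: permute, then keep words 0..3 and 12..15 as the 8-word output. */
static int hchacha_sketch(const uint32_t in[16], uint32_t out[8], int nrounds)
{
	uint32_t x[16];

	if (nrounds != 20 && nrounds != 12)
		return -1;	/* assumption: reject instead of asserting */

	memcpy(x, in, sizeof(x));
	permute(x, nrounds);
	memcpy(&out[0], &x[0], 4 * sizeof(uint32_t));
	memcpy(&out[4], &x[12], 4 * sizeof(uint32_t));
	return 0;
}

/* Same byte shuffling as the memcpy() pair in xchacha20_neon(). */
static void xchacha_real_iv(const uint8_t iv[32], uint8_t real_iv[16])
{
	memcpy(&real_iv[0], iv + 24, 8);	/* stream position */
	memcpy(&real_iv[8], iv + 16, 8);	/* remaining 64 nonce bits */
}
```

The 8 output words become the key for an ordinary ChaCha run with real_iv,
which is why XChaCha can share the existing setkey and stream code and only
needs a different IV size and entry point.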

>         adr             ip, .Lrol8_table
>         mov             r3, #10
>         vld1.8          {d10}, [ip, :64]
>
> -.Ldoubleround:
> +.Ldoubleround_\@:
>         // x0 += x1, x3 = rotl32(x3 ^ x0, 16)
>         vadd.i32        q0, q0, q1
>         veor            q3, q3, q0
> @@ -140,7 +129,25 @@ ENTRY(chacha20_block_xor_neon)
>         vext.8          q3, q3, q3, #4
>
>         subs            r3, r3, #1
> -       bne             .Ldoubleround
> +       bne             .Ldoubleround_\@
> +.endm
> +

Since your macro does not take any parameters: could we change this to
a subroutine?

> +ENTRY(chacha20_block_xor_neon)
> +       // r0: Input state matrix, s
> +       // r1: 1 data block output, o
> +       // r2: 1 data block input, i
> +
> +       // x0..3 = s0..3
> +       add             ip, r0, #0x20
> +       vld1.32         {q0-q1}, [r0]
> +       vld1.32         {q2-q3}, [ip]
> +
> +       vmov            q8, q0
> +       vmov            q9, q1
> +       vmov            q10, q2
> +       vmov            q11, q3
> +
> +       _chacha20_permute
>
>         add             ip, r2, #0x20
>         vld1.8          {q4-q5}, [r2]
> @@ -169,6 +176,21 @@ ENTRY(chacha20_block_xor_neon)
>         bx              lr
>  ENDPROC(chacha20_block_xor_neon)
>
> +ENTRY(hchacha20_block_neon)
> +       // r0: Input state matrix, s
> +       // r1: output (8 32-bit words)
> +
> +       vld1.32         {q0-q1}, [r0]!
> +       vld1.32         {q2-q3}, [r0]
> +
> +       _chacha20_permute
> +
> +       vst1.32         {q0}, [r1]!
> +       vst1.32         {q3}, [r1]
> +
> +       bx              lr
> +ENDPROC(hchacha20_block_neon)
> +
>         .align          4
>  .Lctrinc:      .word   0, 1, 2, 3
>  .Lrol8_table:  .byte   3, 0, 1, 2, 7, 4, 5, 6
> diff --git a/arch/arm/crypto/chacha20-neon-glue.c b/arch/arm/crypto/chacha20-neon-glue.c
> index 7386eb1c1889d..becc7990b1d39 100644
> --- a/arch/arm/crypto/chacha20-neon-glue.c
> +++ b/arch/arm/crypto/chacha20-neon-glue.c
> @@ -1,5 +1,5 @@
>  /*
> - * ChaCha20 256-bit cipher algorithm, RFC7539, ARM NEON functions
> + * ChaCha20 (RFC7539) and XChaCha20 stream ciphers, NEON accelerated
>   *
>   * Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
>   *
> @@ -30,6 +30,7 @@
>
>  asmlinkage void chacha20_block_xor_neon(u32 *state, u8 *dst, const u8 *src);
>  asmlinkage void chacha20_4block_xor_neon(u32 *state, u8 *dst, const u8 *src);
> +asmlinkage void hchacha20_block_neon(const u32 *state, u32 *out);
>
>  static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
>                             unsigned int bytes)
> @@ -57,22 +58,17 @@ static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
>         }
>  }
>
> -static int chacha20_neon(struct skcipher_request *req)
> +static int chacha20_neon_stream_xor(struct skcipher_request *req,
> +                                   struct chacha_ctx *ctx, u8 *iv)
>  {
> -       struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
> -       struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
>         struct skcipher_walk walk;
>         u32 state[16];
>         int err;
>
> -       if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd())
> -               return crypto_chacha_crypt(req);
> -
>         err = skcipher_walk_virt(&walk, req, true);
>

I am slightly unhappy that we are still using atomic==true here, and
perform the entire scatterwalk with preemption disabled. Could we
please try and fix that as well (as a separate patch)? Thanks.

> -       crypto_chacha_init(state, ctx, walk.iv);
> +       crypto_chacha_init(state, ctx, iv);
>
> -       kernel_neon_begin();
>         while (walk.nbytes > 0) {
>                 unsigned int nbytes = walk.nbytes;
>
> @@ -83,27 +79,85 @@ static int chacha20_neon(struct skcipher_request *req)
>                                 nbytes);
>                 err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
>         }
> +
> +       return err;
> +}
> +
> +static int chacha20_neon(struct skcipher_request *req)
> +{
> +       struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
> +       struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
> +       int err;
> +
> +       if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd())
> +               return crypto_chacha_crypt(req);
> +
> +       kernel_neon_begin();
> +       err = chacha20_neon_stream_xor(req, ctx, req->iv);
> +       kernel_neon_end();
> +       return err;
> +}
> +
> +static int xchacha20_neon(struct skcipher_request *req)
> +{
> +       struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
> +       struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
> +       struct chacha_ctx subctx;
> +       u32 state[16];
> +       u8 real_iv[16];
> +       int err;
> +
> +       if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd())
> +               return crypto_xchacha_crypt(req);
> +
> +       crypto_chacha_init(state, ctx, req->iv);
> +
> +       kernel_neon_begin();
> +
> +       hchacha20_block_neon(state, subctx.key);
> +       memcpy(&real_iv[0], req->iv + 24, 8);
> +       memcpy(&real_iv[8], req->iv + 16, 8);
> +       err = chacha20_neon_stream_xor(req, &subctx, real_iv);
> +
>         kernel_neon_end();
>
>         return err;
>  }
>
> -static struct skcipher_alg alg = {
> -       .base.cra_name          = "chacha20",
> -       .base.cra_driver_name   = "chacha20-neon",
> -       .base.cra_priority      = 300,
> -       .base.cra_blocksize     = 1,
> -       .base.cra_ctxsize       = sizeof(struct chacha_ctx),
> -       .base.cra_module        = THIS_MODULE,
> -
> -       .min_keysize            = CHACHA_KEY_SIZE,
> -       .max_keysize            = CHACHA_KEY_SIZE,
> -       .ivsize                 = CHACHA_IV_SIZE,
> -       .chunksize              = CHACHA_BLOCK_SIZE,
> -       .walksize               = 4 * CHACHA_BLOCK_SIZE,
> -       .setkey                 = crypto_chacha20_setkey,
> -       .encrypt                = chacha20_neon,
> -       .decrypt                = chacha20_neon,
> +static struct skcipher_alg algs[] = {
> +       {
> +               .base.cra_name          = "chacha20",
> +               .base.cra_driver_name   = "chacha20-neon",
> +               .base.cra_priority      = 300,
> +               .base.cra_blocksize     = 1,
> +               .base.cra_ctxsize       = sizeof(struct chacha_ctx),
> +               .base.cra_module        = THIS_MODULE,
> +
> +               .min_keysize            = CHACHA_KEY_SIZE,
> +               .max_keysize            = CHACHA_KEY_SIZE,
> +               .ivsize                 = CHACHA_IV_SIZE,
> +               .chunksize              = CHACHA_BLOCK_SIZE,
> +               .walksize               = 4 * CHACHA_BLOCK_SIZE,
> +               .setkey                 = crypto_chacha20_setkey,
> +               .encrypt                = chacha20_neon,
> +               .decrypt                = chacha20_neon,
> +       }, {
> +               .base.cra_name          = "xchacha20",
> +               .base.cra_driver_name   = "xchacha20-neon",
> +               .base.cra_priority      = 300,
> +               .base.cra_blocksize     = 1,
> +               .base.cra_ctxsize       = sizeof(struct chacha_ctx),
> +               .base.cra_module        = THIS_MODULE,
> +
> +               .min_keysize            = CHACHA_KEY_SIZE,
> +               .max_keysize            = CHACHA_KEY_SIZE,
> +               .ivsize                 = XCHACHA_IV_SIZE,
> +               .chunksize              = CHACHA_BLOCK_SIZE,
> +               .walksize               = 4 * CHACHA_BLOCK_SIZE,
> +               .setkey                 = crypto_chacha20_setkey,
> +               .encrypt                = xchacha20_neon,
> +               .decrypt                = xchacha20_neon,
> +       }
>  };
>
>  static int __init chacha20_simd_mod_init(void)
> @@ -111,12 +165,12 @@ static int __init chacha20_simd_mod_init(void)
>         if (!(elf_hwcap & HWCAP_NEON))
>                 return -ENODEV;
>
> -       return crypto_register_skcipher(&alg);
> +       return crypto_register_skciphers(algs, ARRAY_SIZE(algs));
>  }
>
>  static void __exit chacha20_simd_mod_fini(void)
>  {
> -       crypto_unregister_skcipher(&alg);
> +       crypto_unregister_skciphers(algs, ARRAY_SIZE(algs));
>  }
>
>  module_init(chacha20_simd_mod_init);
> @@ -125,3 +179,6 @@ module_exit(chacha20_simd_mod_fini);
>  MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
>  MODULE_LICENSE("GPL v2");
>  MODULE_ALIAS_CRYPTO("chacha20");
> +MODULE_ALIAS_CRYPTO("chacha20-neon");
> +MODULE_ALIAS_CRYPTO("xchacha20");
> +MODULE_ALIAS_CRYPTO("xchacha20-neon");
> --
> 2.19.1.331.ge82ca0e54c-goog
>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 00/12] crypto: Adiantum support
  2018-10-19 18:19   ` Paul Crowley
@ 2018-10-20  3:24     ` Ard Biesheuvel
  2018-10-20  5:22       ` Eric Biggers
       [not found]     ` <2395454e-a0dc-408f-4138-9d15ab5f20b8@esat.kuleuven.be>
  1 sibling, 1 reply; 54+ messages in thread
From: Ard Biesheuvel @ 2018-10-20  3:24 UTC (permalink / raw)
  To: Paul Crowley
  Cc: Jason A. Donenfeld, Eric Biggers,
	open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fscrypt,
	linux-arm-kernel, Linux Kernel Mailing List, Herbert Xu,
	Greg Kaiser, Michael Halcrow, Samuel Neves, Tomer Ashur

On 20 October 2018 at 02:19, Paul Crowley <paulcrowley@google.com> wrote:
> On Fri, 19 Oct 2018 at 08:58, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>> Before merging this into the kernel, do you want to wait until you've
>> received some public review from academia?
>
> I would prefer not to wait. Unlike a new primitive whose strength can
> only be known through attempts at cryptanalysis, Adiantum is a
> construction based on well-understood and trusted primitives; it is
> secure if the proof accompanying it is correct. Given that (outside
> competitions or standardization efforts) no-one ever issues public
> statements that they think algorithms or proofs are good, what I'm
> expecting from academia is silence :) The most we could hope for would
> be getting the paper accepted at a conference, and we're pursuing that
> but there's a good chance that won't happen simply because it's not
> very novel. It basically takes existing ideas and applies them using a
> stream cipher instead of a block cipher, and a faster hashing mode;
> it's also a small update from HPolyC. I've had some private feedback
> that the proof seems correct, and that's all I'm expecting to get.

Hi Paul, Eric,

The Adiantum paper claims

"On an ARM Cortex-A7 processor, Adiantum decrypts 4096-byte messages
at 11 cycles per byte, five times faster than AES-256-XTS, with a
constant-time implementation."

which is surprising to me: it implies roughly 55 cycles per byte
(5 x 11) for AES-256-XTS. The bit-sliced NEON AES core runs at ~14
cycles per byte on a Cortex-A15 (when encrypting), so 55 cycles per
byte on an A7 sounds rather high. Is it really that bad?

Also, the paper mentions that the second hash pass and the stream
cipher en/decryption pass could be executed in parallel, while your
implementation performs three distinct passes. Do you have any
estimates of the potential performance gain from implementing that? In
my experience (which is mostly A53- rather than A7-based, mind you),
removing memory accesses can help tremendously to speed up
execution on low-end cores.
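
To make the question about fusing passes concrete, here is a minimal
sketch in C. It is not Adiantum: the keystream and hash below are toy
stand-ins (hypothetical names, chosen only for illustration), and the
only point is that hashing each byte right after it is XORed gives the
same result as a separate pass while touching the buffer once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy keystream: NOT ChaCha, just a deterministic byte generator. */
static uint8_t toy_keystream_byte(uint32_t *st)
{
	*st = *st * 1664525u + 1013904223u;	/* one LCG step */
	return (uint8_t)(*st >> 24);
}

/* Toy hash step: NOT NH or Poly1305, just a multiply-accumulate. */
static uint32_t toy_hash_step(uint32_t h, uint8_t b)
{
	return (h + b) * 0x01000193u;
}

/* Two passes: XOR the whole buffer, then read it again to hash it. */
uint32_t toy_two_pass(uint8_t *buf, size_t len, uint32_t seed)
{
	uint32_t st = seed, h = 0;
	size_t i;

	assert(buf != NULL || len == 0);
	for (i = 0; i < len; i++)
		buf[i] ^= toy_keystream_byte(&st);
	for (i = 0; i < len; i++)
		h = toy_hash_step(h, buf[i]);
	return h;
}

/* Fused: hash each byte immediately after it has been XORed. */
uint32_t toy_fused(uint8_t *buf, size_t len, uint32_t seed)
{
	uint32_t st = seed, h = 0;
	size_t i;

	assert(buf != NULL || len == 0);
	for (i = 0; i < len; i++) {
		buf[i] ^= toy_keystream_byte(&st);
		h = toy_hash_step(h, buf[i]);
	}
	return h;
}
```

Both functions return the same hash and leave the same bytes in the
buffer; the fused variant simply avoids re-reading the data. Whether a
real implementation would gain from this on a Cortex-A7 is exactly the
open question above.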

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 06/12] crypto: arm/chacha20 - refactor to allow varying number of rounds
  2018-10-15 17:54 ` [RFC PATCH v2 06/12] crypto: arm/chacha20 - refactor to allow varying number of rounds Eric Biggers
@ 2018-10-20  3:35   ` Ard Biesheuvel
  2018-10-20  5:26     ` Eric Biggers
  0 siblings, 1 reply; 54+ messages in thread
From: Ard Biesheuvel @ 2018-10-20  3:35 UTC (permalink / raw)
  To: Eric Biggers
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fscrypt,
	linux-arm-kernel, Linux Kernel Mailing List, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

On 16 October 2018 at 01:54, Eric Biggers <ebiggers@kernel.org> wrote:
> From: Eric Biggers <ebiggers@google.com>
>
> In preparation for adding XChaCha12 support, rename/refactor the NEON
> implementation of ChaCha20 to support different numbers of rounds.
>
> Signed-off-by: Eric Biggers <ebiggers@google.com>

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

> ---
>  arch/arm/crypto/Makefile                      |  4 +-
>  ...hacha20-neon-core.S => chacha-neon-core.S} | 36 ++++++------
>  ...hacha20-neon-glue.c => chacha-neon-glue.c} | 56 ++++++++++---------
>  3 files changed, 52 insertions(+), 44 deletions(-)
>  rename arch/arm/crypto/{chacha20-neon-core.S => chacha-neon-core.S} (96%)
>  rename arch/arm/crypto/{chacha20-neon-glue.c => chacha-neon-glue.c} (73%)
>
> diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile
> index bd5bceef0605f..005482ff95047 100644
> --- a/arch/arm/crypto/Makefile
> +++ b/arch/arm/crypto/Makefile
> @@ -9,7 +9,7 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM) += sha1-arm.o
>  obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o
>  obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o
>  obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o
> -obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o
> +obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha-neon.o
>

I take it you are preserving the Kconfig symbol name to prevent
breaking existing configs?

If so, we might consider doing something like

config CRYPTO_CHACHA20_NEON
    tristate

config CRYPTO_CHACHA_NEON
    default CRYPTO_CHACHA20_NEON
    ... the existing kconfig symbol description ...

and drop the former at some point in the future?

>  ce-obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o
>  ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += sha1-arm-ce.o
> @@ -52,7 +52,7 @@ aes-arm-ce-y  := aes-ce-core.o aes-ce-glue.o
>  ghash-arm-ce-y := ghash-ce-core.o ghash-ce-glue.o
>  crct10dif-arm-ce-y     := crct10dif-ce-core.o crct10dif-ce-glue.o
>  crc32-arm-ce-y:= crc32-ce-core.o crc32-ce-glue.o
> -chacha20-neon-y := chacha20-neon-core.o chacha20-neon-glue.o
> +chacha-neon-y := chacha-neon-core.o chacha-neon-glue.o
>
>  ifdef REGENERATE_ARM_CRYPTO
>  quiet_cmd_perl = PERL    $@
> diff --git a/arch/arm/crypto/chacha20-neon-core.S b/arch/arm/crypto/chacha-neon-core.S
> similarity index 96%
> rename from arch/arm/crypto/chacha20-neon-core.S
> rename to arch/arm/crypto/chacha-neon-core.S
> index db59f1fbc728b..4b12064449f78 100644
> --- a/arch/arm/crypto/chacha20-neon-core.S
> +++ b/arch/arm/crypto/chacha-neon-core.S
> @@ -1,5 +1,5 @@
>  /*
> - * ChaCha20 256-bit cipher algorithm, RFC7539, ARM NEON functions
> + * ChaCha/XChaCha NEON helper functions
>   *
>   * Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
>   *
> @@ -53,18 +53,19 @@
>         .align          5
>
>  /*
> - * _chacha20_permute - permute one block
> + * _chacha_permute - permute one block
>   *
>   * Permute one 64-byte block where the state matrix is stored in the four NEON
>   * registers q0-q3.  It performs matrix operation on four words in parallel, but
>   * requires shuffling to rearrange the words after each round.
>   *
> + * The round count is given in r3.
> + *
>   * Clobbers: r3, q4-q5
>   */
> -.macro _chacha20_permute
> +.macro _chacha_permute
>
>         adr             ip, .Lrol8_table
> -       mov             r3, #10
>         vld1.8          {d10}, [ip, :64]
>
>  .Ldoubleround_\@:
> @@ -128,14 +129,15 @@
>         // x3 = shuffle32(x3, MASK(0, 3, 2, 1))
>         vext.8          q3, q3, q3, #4
>
> -       subs            r3, r3, #1
> +       subs            r3, r3, #2
>         bne             .Ldoubleround_\@
>  .endm
>
> -ENTRY(chacha20_block_xor_neon)
> +ENTRY(chacha_block_xor_neon)
>         // r0: Input state matrix, s
>         // r1: 1 data block output, o
>         // r2: 1 data block input, i
> +       // r3: nrounds
>
>         // x0..3 = s0..3
>         add             ip, r0, #0x20
> @@ -147,7 +149,7 @@ ENTRY(chacha20_block_xor_neon)
>         vmov            q10, q2
>         vmov            q11, q3
>
> -       _chacha20_permute
> +       _chacha_permute
>
>         add             ip, r2, #0x20
>         vld1.8          {q4-q5}, [r2]
> @@ -174,29 +176,31 @@ ENTRY(chacha20_block_xor_neon)
>         vst1.8          {q2-q3}, [ip]
>
>         bx              lr
> -ENDPROC(chacha20_block_xor_neon)
> +ENDPROC(chacha_block_xor_neon)
>
> -ENTRY(hchacha20_block_neon)
> +ENTRY(hchacha_block_neon)
>         // r0: Input state matrix, s
>         // r1: output (8 32-bit words)
> +       // r2: nrounds
>
>         vld1.32         {q0-q1}, [r0]!
>         vld1.32         {q2-q3}, [r0]
>
> -       _chacha20_permute
> +       mov             r3, r2
> +       _chacha_permute
>
>         vst1.32         {q0}, [r1]!
>         vst1.32         {q3}, [r1]
>
>         bx              lr
> -ENDPROC(hchacha20_block_neon)
> +ENDPROC(hchacha_block_neon)
>
>         .align          4
>  .Lctrinc:      .word   0, 1, 2, 3
>  .Lrol8_table:  .byte   3, 0, 1, 2, 7, 4, 5, 6
>
>         .align          5
> -ENTRY(chacha20_4block_xor_neon)
> +ENTRY(chacha_4block_xor_neon)
>         push            {r4-r5}
>         mov             r4, sp                  // preserve the stack pointer
>         sub             ip, sp, #0x20           // allocate a 32 byte buffer
> @@ -206,9 +210,10 @@ ENTRY(chacha20_4block_xor_neon)
>         // r0: Input state matrix, s
>         // r1: 4 data blocks output, o
>         // r2: 4 data blocks input, i
> +       // r3: nrounds
>
>         //
> -       // This function encrypts four consecutive ChaCha20 blocks by loading
> +       // This function encrypts four consecutive ChaCha blocks by loading
>         // the state matrix in NEON registers four times. The algorithm performs
>         // each operation on the corresponding word of each state matrix, hence
>         // requires no word shuffling. The words are re-interleaved before the
> @@ -241,7 +246,6 @@ ENTRY(chacha20_4block_xor_neon)
>         vdup.32         q0, d0[0]
>
>         adr             ip, .Lrol8_table
> -       mov             r3, #10
>         b               1f
>
>  .Ldoubleround4:
> @@ -439,7 +443,7 @@ ENTRY(chacha20_4block_xor_neon)
>         vsri.u32        q5, q8, #25
>         vsri.u32        q6, q9, #25
>
> -       subs            r3, r3, #1
> +       subs            r3, r3, #2
>         bne             .Ldoubleround4
>
>         // x0..7[0-3] are in q0-q7, x10..15[0-3] are in q10-q15.
> @@ -549,4 +553,4 @@ ENTRY(chacha20_4block_xor_neon)
>
>         pop             {r4-r5}
>         bx              lr
> -ENDPROC(chacha20_4block_xor_neon)
> +ENDPROC(chacha_4block_xor_neon)
> diff --git a/arch/arm/crypto/chacha20-neon-glue.c b/arch/arm/crypto/chacha-neon-glue.c
> similarity index 73%
> rename from arch/arm/crypto/chacha20-neon-glue.c
> rename to arch/arm/crypto/chacha-neon-glue.c
> index becc7990b1d39..b236af4889c61 100644
> --- a/arch/arm/crypto/chacha20-neon-glue.c
> +++ b/arch/arm/crypto/chacha-neon-glue.c
> @@ -28,24 +28,26 @@
>  #include <asm/neon.h>
>  #include <asm/simd.h>
>
> -asmlinkage void chacha20_block_xor_neon(u32 *state, u8 *dst, const u8 *src);
> -asmlinkage void chacha20_4block_xor_neon(u32 *state, u8 *dst, const u8 *src);
> -asmlinkage void hchacha20_block_neon(const u32 *state, u32 *out);
> -
> -static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
> -                           unsigned int bytes)
> +asmlinkage void chacha_block_xor_neon(const u32 *state, u8 *dst, const u8 *src,
> +                                     int nrounds);
> +asmlinkage void chacha_4block_xor_neon(const u32 *state, u8 *dst, const u8 *src,
> +                                      int nrounds);
> +asmlinkage void hchacha_block_neon(const u32 *state, u32 *out, int nrounds);
> +
> +static void chacha_doneon(u32 *state, u8 *dst, const u8 *src,
> +                         unsigned int bytes, int nrounds)
>  {
>         u8 buf[CHACHA_BLOCK_SIZE];
>
>         while (bytes >= CHACHA_BLOCK_SIZE * 4) {
> -               chacha20_4block_xor_neon(state, dst, src);
> +               chacha_4block_xor_neon(state, dst, src, nrounds);
>                 bytes -= CHACHA_BLOCK_SIZE * 4;
>                 src += CHACHA_BLOCK_SIZE * 4;
>                 dst += CHACHA_BLOCK_SIZE * 4;
>                 state[12] += 4;
>         }
>         while (bytes >= CHACHA_BLOCK_SIZE) {
> -               chacha20_block_xor_neon(state, dst, src);
> +               chacha_block_xor_neon(state, dst, src, nrounds);
>                 bytes -= CHACHA_BLOCK_SIZE;
>                 src += CHACHA_BLOCK_SIZE;
>                 dst += CHACHA_BLOCK_SIZE;
> @@ -53,13 +55,13 @@ static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
>         }
>         if (bytes) {
>                 memcpy(buf, src, bytes);
> -               chacha20_block_xor_neon(state, buf, buf);
> +               chacha_block_xor_neon(state, buf, buf, nrounds);
>                 memcpy(dst, buf, bytes);
>         }
>  }
>
> -static int chacha20_neon_stream_xor(struct skcipher_request *req,
> -                                   struct chacha_ctx *ctx, u8 *iv)
> +static int chacha_neon_stream_xor(struct skcipher_request *req,
> +                                 struct chacha_ctx *ctx, u8 *iv)
>  {
>         struct skcipher_walk walk;
>         u32 state[16];
> @@ -75,15 +77,15 @@ static int chacha20_neon_stream_xor(struct skcipher_request *req,
>                 if (nbytes < walk.total)
>                         nbytes = round_down(nbytes, walk.stride);
>
> -               chacha20_doneon(state, walk.dst.virt.addr, walk.src.virt.addr,
> -                               nbytes);
> +               chacha_doneon(state, walk.dst.virt.addr, walk.src.virt.addr,
> +                             nbytes, ctx->nrounds);
>                 err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
>         }
>
>         return err;
>  }
>
> -static int chacha20_neon(struct skcipher_request *req)
> +static int chacha_neon(struct skcipher_request *req)
>  {
>         struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
>         struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
> @@ -93,12 +95,12 @@ static int chacha20_neon(struct skcipher_request *req)
>                 return crypto_chacha_crypt(req);
>
>         kernel_neon_begin();
> -       err = chacha20_neon_stream_xor(req, ctx, req->iv);
> +       err = chacha_neon_stream_xor(req, ctx, req->iv);
>         kernel_neon_end();
>         return err;
>  }
>
> -static int xchacha20_neon(struct skcipher_request *req)
> +static int xchacha_neon(struct skcipher_request *req)
>  {
>         struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
>         struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
> @@ -114,10 +116,11 @@ static int xchacha20_neon(struct skcipher_request *req)
>
>         kernel_neon_begin();
>
> -       hchacha20_block_neon(state, subctx.key);
> +       hchacha_block_neon(state, subctx.key, ctx->nrounds);
> +       subctx.nrounds = ctx->nrounds;
>         memcpy(&real_iv[0], req->iv + 24, 8);
>         memcpy(&real_iv[8], req->iv + 16, 8);
> -       err = chacha20_neon_stream_xor(req, &subctx, real_iv);
> +       err = chacha_neon_stream_xor(req, &subctx, real_iv);
>
>         kernel_neon_end();
>
> @@ -139,8 +142,8 @@ static struct skcipher_alg algs[] = {
>                 .chunksize              = CHACHA_BLOCK_SIZE,
>                 .walksize               = 4 * CHACHA_BLOCK_SIZE,
>                 .setkey                 = crypto_chacha20_setkey,
> -               .encrypt                = chacha20_neon,
> -               .decrypt                = chacha20_neon,
> +               .encrypt                = chacha_neon,
> +               .decrypt                = chacha_neon,
>         }, {
>                 .base.cra_name          = "xchacha20",
>                 .base.cra_driver_name   = "xchacha20-neon",
> @@ -155,12 +158,12 @@ static struct skcipher_alg algs[] = {
>                 .chunksize              = CHACHA_BLOCK_SIZE,
>                 .walksize               = 4 * CHACHA_BLOCK_SIZE,
>                 .setkey                 = crypto_chacha20_setkey,
> -               .encrypt                = xchacha20_neon,
> -               .decrypt                = xchacha20_neon,
> +               .encrypt                = xchacha_neon,
> +               .decrypt                = xchacha_neon,
>         }
>  };
>
> -static int __init chacha20_simd_mod_init(void)
> +static int __init chacha_simd_mod_init(void)
>  {
>         if (!(elf_hwcap & HWCAP_NEON))
>                 return -ENODEV;
> @@ -168,14 +171,15 @@ static int __init chacha20_simd_mod_init(void)
>         return crypto_register_skciphers(algs, ARRAY_SIZE(algs));
>  }
>
> -static void __exit chacha20_simd_mod_fini(void)
> +static void __exit chacha_simd_mod_fini(void)
>  {
>         crypto_unregister_skciphers(algs, ARRAY_SIZE(algs));
>  }
>
> -module_init(chacha20_simd_mod_init);
> -module_exit(chacha20_simd_mod_fini);
> +module_init(chacha_simd_mod_init);
> +module_exit(chacha_simd_mod_fini);
>
> +MODULE_DESCRIPTION("ChaCha and XChaCha stream ciphers (NEON accelerated)");
>  MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
>  MODULE_LICENSE("GPL v2");
>  MODULE_ALIAS_CRYPTO("chacha20");
> --
> 2.19.1.331.ge82ca0e54c-goog
>
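
For readers who do not read NEON assembly, the change to the round
loop in the quoted patch can be sketched in portable C. This is an
illustrative reference only (the function names here are made up, not
the kernel's): the round count is passed in by the caller, and since
each loop iteration performs a double round, the counter drops by 2,
mirroring "subs r3, r3, #2".

```c
#include <assert.h>
#include <stdint.h>

static uint32_t rol32_ref(uint32_t v, int n)
{
	return (v << n) | (v >> (32 - n));
}

/* One ChaCha quarter round on four words of the 16-word state. */
void chacha_quarterround_ref(uint32_t x[16], int a, int b, int c, int d)
{
	x[a] += x[b]; x[d] = rol32_ref(x[d] ^ x[a], 16);
	x[c] += x[d]; x[b] = rol32_ref(x[b] ^ x[c], 12);
	x[a] += x[b]; x[d] = rol32_ref(x[d] ^ x[a], 8);
	x[c] += x[d]; x[b] = rol32_ref(x[b] ^ x[c], 7);
}

/*
 * Permute the state with a caller-supplied round count.  Each loop
 * iteration is one double round (column round + diagonal round), so
 * the counter is decremented by 2 and must be even to reach zero.
 */
void chacha_permute_ref(uint32_t x[16], int nrounds)
{
	int i;

	assert(nrounds >= 0 && nrounds % 2 == 0);

	for (i = nrounds; i != 0; i -= 2) {
		/* column round */
		chacha_quarterround_ref(x, 0, 4,  8, 12);
		chacha_quarterround_ref(x, 1, 5,  9, 13);
		chacha_quarterround_ref(x, 2, 6, 10, 14);
		chacha_quarterround_ref(x, 3, 7, 11, 15);
		/* diagonal round */
		chacha_quarterround_ref(x, 0, 5, 10, 15);
		chacha_quarterround_ref(x, 1, 6, 11, 12);
		chacha_quarterround_ref(x, 2, 7,  8, 13);
		chacha_quarterround_ref(x, 3, 4,  9, 14);
	}
}
```

Because the double round has no per-round constants, permuting with 12
rounds and then 8 more gives the same state as 20 rounds, which is why
a single loop with a variable count can serve ChaCha20 and ChaCha12
alike.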

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 07/12] crypto: arm/chacha - add XChaCha12 support
  2018-10-15 17:54 ` [RFC PATCH v2 07/12] crypto: arm/chacha - add XChaCha12 support Eric Biggers
@ 2018-10-20  3:36   ` Ard Biesheuvel
  0 siblings, 0 replies; 54+ messages in thread
From: Ard Biesheuvel @ 2018-10-20  3:36 UTC (permalink / raw)
  To: Eric Biggers
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fscrypt,
	linux-arm-kernel, Linux Kernel Mailing List, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

On 16 October 2018 at 01:54, Eric Biggers <ebiggers@kernel.org> wrote:
> From: Eric Biggers <ebiggers@google.com>
>
> Now that the 32-bit ARM NEON implementation of ChaCha20 and XChaCha20
> has been refactored to support varying the number of rounds, add support
> for XChaCha12.  This is identical to XChaCha20 except for the number of
> rounds, which is 12 instead of 20.
>
> XChaCha12 is faster than XChaCha20 but has a lower security margin,
> though still greater than AES-256's since the best known attacks make it
> through only 7 rounds.  See the patch "crypto: chacha - add XChaCha12
> support" for more details about why we need XChaCha12 support.
>
> Signed-off-by: Eric Biggers <ebiggers@google.com>

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

> ---
>  arch/arm/crypto/Kconfig            |  2 +-
>  arch/arm/crypto/chacha-neon-glue.c | 21 ++++++++++++++++++++-
>  2 files changed, 21 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
> index 0aa1471f27d2e..cc932d9bba561 100644
> --- a/arch/arm/crypto/Kconfig
> +++ b/arch/arm/crypto/Kconfig
> @@ -117,7 +117,7 @@ config CRYPTO_CRC32_ARM_CE
>         select CRYPTO_HASH
>
>  config CRYPTO_CHACHA20_NEON
> -       tristate "NEON accelerated ChaCha20 stream cipher algorithms"
> +       tristate "NEON accelerated ChaCha stream cipher algorithms"
>         depends on KERNEL_MODE_NEON
>         select CRYPTO_BLKCIPHER
>         select CRYPTO_CHACHA20
> diff --git a/arch/arm/crypto/chacha-neon-glue.c b/arch/arm/crypto/chacha-neon-glue.c
> index b236af4889c61..0b1b238227707 100644
> --- a/arch/arm/crypto/chacha-neon-glue.c
> +++ b/arch/arm/crypto/chacha-neon-glue.c
> @@ -1,5 +1,6 @@
>  /*
> - * ChaCha20 (RFC7539) and XChaCha20 stream ciphers, NEON accelerated
> + * ARM NEON accelerated ChaCha and XChaCha stream ciphers,
> + * including ChaCha20 (RFC7539)
>   *
>   * Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
>   *
> @@ -160,6 +161,22 @@ static struct skcipher_alg algs[] = {
>                 .setkey                 = crypto_chacha20_setkey,
>                 .encrypt                = xchacha_neon,
>                 .decrypt                = xchacha_neon,
> +       }, {
> +               .base.cra_name          = "xchacha12",
> +               .base.cra_driver_name   = "xchacha12-neon",
> +               .base.cra_priority      = 300,
> +               .base.cra_blocksize     = 1,
> +               .base.cra_ctxsize       = sizeof(struct chacha_ctx),
> +               .base.cra_module        = THIS_MODULE,
> +
> +               .min_keysize            = CHACHA_KEY_SIZE,
> +               .max_keysize            = CHACHA_KEY_SIZE,
> +               .ivsize                 = XCHACHA_IV_SIZE,
> +               .chunksize              = CHACHA_BLOCK_SIZE,
> +               .walksize               = 4 * CHACHA_BLOCK_SIZE,
> +               .setkey                 = crypto_chacha12_setkey,
> +               .encrypt                = xchacha_neon,
> +               .decrypt                = xchacha_neon,
>         }
>  };
>
> @@ -186,3 +203,5 @@ MODULE_ALIAS_CRYPTO("chacha20");
>  MODULE_ALIAS_CRYPTO("chacha20-neon");
>  MODULE_ALIAS_CRYPTO("xchacha20");
>  MODULE_ALIAS_CRYPTO("xchacha20-neon");
> +MODULE_ALIAS_CRYPTO("xchacha12");
> +MODULE_ALIAS_CRYPTO("xchacha12-neon");
> --
> 2.19.1.331.ge82ca0e54c-goog
>
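
The quoted algs[] entry reuses xchacha_neon for "xchacha12" and only
swaps the setkey callback. A rough, stand-alone sketch of that pattern
follows. It assumes that the setkey helper is what records the round
count in the context (the quoted glue code reads ctx->nrounds, but the
setkey bodies are not shown in this message); every name prefixed with
toy_ is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical context, loosely modeled on the nrounds field of chacha_ctx. */
struct toy_chacha_ctx {
	int nrounds;
};

struct toy_alg {
	const char *name;
	void (*setkey)(struct toy_chacha_ctx *ctx);
};

/* Assumption: the only per-variant difference is the stored round count. */
static void toy_chacha20_setkey(struct toy_chacha_ctx *ctx)
{
	ctx->nrounds = 20;
}

static void toy_chacha12_setkey(struct toy_chacha_ctx *ctx)
{
	ctx->nrounds = 12;
}

static const struct toy_alg toy_algs[] = {
	{ "xchacha20", toy_chacha20_setkey },
	{ "xchacha12", toy_chacha12_setkey },
};

/* Look up a variant by name; returns its round count, or -1 if unknown. */
int toy_rounds_for(const char *name)
{
	struct toy_chacha_ctx ctx;
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(toy_algs) / sizeof(toy_algs[0]); i++) {
		if (strcmp(toy_algs[i].name, name) == 0) {
			toy_algs[i].setkey(&ctx);
			return ctx.nrounds;
		}
	}
	return -1;
}
```

With this shape, adding a variant is one more table entry plus one
small setkey helper, and the shared encrypt/decrypt path stays
untouched.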

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 08/12] crypto: poly1305 - add Poly1305 core API
  2018-10-15 17:54 ` [RFC PATCH v2 08/12] crypto: poly1305 - add Poly1305 core API Eric Biggers
@ 2018-10-20  3:45   ` Ard Biesheuvel
  0 siblings, 0 replies; 54+ messages in thread
From: Ard Biesheuvel @ 2018-10-20  3:45 UTC (permalink / raw)
  To: Eric Biggers
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fscrypt,
	linux-arm-kernel, Linux Kernel Mailing List, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

On 16 October 2018 at 01:54, Eric Biggers <ebiggers@kernel.org> wrote:
> From: Eric Biggers <ebiggers@google.com>
>
> Expose a low-level Poly1305 API which implements the
> ε-almost-∆-universal (εA∆U) hash function underlying the Poly1305 MAC
> and supports block-aligned inputs only.
>
> This is needed for Adiantum hashing, which builds an εA∆U hash function
> from NH and a polynomial evaluation in GF(2^{130}-5); this polynomial
> evaluation is identical to the one the Poly1305 MAC does.  However, the
> crypto_shash Poly1305 API isn't very appropriate for this because its
> calling convention assumes it is used as a MAC, with a 32-byte
> "one-time key" provided for every digest.
>
> But by design, in Adiantum hashing the performance of the polynomial
> evaluation isn't nearly as critical as NH.  So it suffices to just have
> some C helper functions.  Thus, this patch adds such functions.
>
> Signed-off-by: Eric Biggers <ebiggers@google.com>
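
As an aside for readers unfamiliar with the polynomial evaluation the
commit message refers to, here is a toy sketch of the same Horner-style
recurrence, h = (h + m[i]) * r mod p. It uses the small prime 2^31 - 1
purely as a stand-in for 2^130 - 5 and 32-bit words instead of 16-byte
blocks, so it is an illustration of the structure and not Poly1305.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 2^31 - 1: a small prime used here only as a stand-in for 2^130 - 5. */
#define TOY_P 2147483647u

/*
 * Evaluate m[0]*r^n + m[1]*r^(n-1) + ... + m[n-1]*r (mod TOY_P)
 * with Horner's rule.  All intermediate values fit in 64 bits:
 * (h + m) < 2^32 and r < 2^31, so the product is below 2^63.
 */
uint32_t toy_poly_hash(const uint32_t *m, size_t nblocks, uint32_t r)
{
	uint64_t h = 0;
	size_t i;

	assert(m != NULL || nblocks == 0);
	for (i = 0; i < nblocks; i++)
		h = ((h + m[i] % TOY_P) * (r % TOY_P)) % TOY_P;
	return (uint32_t)h;
}
```

For example, with r = 3 the blocks {1, 2} hash to 1*9 + 2*3 = 15. Real
Poly1305 additionally sets a high bit above each 16-byte block (the
hibit argument, 1 << 24 in the top limb, in the quoted code), which
this toy omits.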

Could we split this up into
- a patch that updates the poly1305_desc_ctx layout and fixes up all
  the references, and
- a patch that actually breaks out the functionality you need to
  access separately?

I am aware that you'll end up touching some lines twice, but it should
be much easier to review.
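
To help with reviewing the key handling in the quoted
poly1305_core_setkey(), the following stand-alone sketch repeats the
same shifts and masks and then reassembles the five 26-bit limbs into a
128-bit value. The toy_ and load_le32 names are illustrative and not
part of the patch; load_le32 stands in for get_unaligned_le32().

```c
#include <assert.h>
#include <stdint.h>

/* Portable little-endian 32-bit load (stand-in for get_unaligned_le32). */
static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Split the 16-byte r key into 26-bit limbs, clamping as in the patch. */
void toy_setkey_limbs(uint32_t r[5], const uint8_t raw_key[16])
{
	assert(r != 0 && raw_key != 0);
	r[0] = (load_le32(raw_key +  0) >> 0) & 0x3ffffff;
	r[1] = (load_le32(raw_key +  3) >> 2) & 0x3ffff03;
	r[2] = (load_le32(raw_key +  6) >> 4) & 0x3ffc0ff;
	r[3] = (load_le32(raw_key +  9) >> 6) & 0x3f03fff;
	r[4] = (load_le32(raw_key + 12) >> 8) & 0x00fffff;
}

/* Reassemble r = sum of r[i] * 2^(26*i) as two 64-bit halves. */
void toy_limbs_to_u128(const uint32_t r[5], uint64_t *lo, uint64_t *hi)
{
	/* limb 2 straddles the 64-bit boundary: 12 bits low, 14 bits high */
	*lo = (uint64_t)r[0] | (uint64_t)r[1] << 26 | (uint64_t)r[2] << 52;
	*hi = (uint64_t)r[2] >> 12 | (uint64_t)r[3] << 14 | (uint64_t)r[4] << 40;
}
```

Feeding in an all-0xff key makes each limb equal to its mask, and the
reassembled value is 0x0ffffffc0ffffffc0ffffffc0fffffff, i.e. the usual
Poly1305 clamp.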

> ---
>  arch/x86/crypto/poly1305_glue.c |  20 ++--
>  crypto/poly1305_generic.c       | 174 ++++++++++++++++++--------------
>  include/crypto/poly1305.h       |  28 ++++-
>  3 files changed, 136 insertions(+), 86 deletions(-)
>
> diff --git a/arch/x86/crypto/poly1305_glue.c b/arch/x86/crypto/poly1305_glue.c
> index f012b7e28ad1d..88cc01506c84a 100644
> --- a/arch/x86/crypto/poly1305_glue.c
> +++ b/arch/x86/crypto/poly1305_glue.c
> @@ -83,35 +83,37 @@ static unsigned int poly1305_simd_blocks(struct poly1305_desc_ctx *dctx,
>         if (poly1305_use_avx2 && srclen >= POLY1305_BLOCK_SIZE * 4) {
>                 if (unlikely(!sctx->wset)) {
>                         if (!sctx->uset) {
> -                               memcpy(sctx->u, dctx->r, sizeof(sctx->u));
> -                               poly1305_simd_mult(sctx->u, dctx->r);
> +                               memcpy(sctx->u, dctx->r.r, sizeof(sctx->u));
> +                               poly1305_simd_mult(sctx->u, dctx->r.r);
>                                 sctx->uset = true;
>                         }
>                         memcpy(sctx->u + 5, sctx->u, sizeof(sctx->u));
> -                       poly1305_simd_mult(sctx->u + 5, dctx->r);
> +                       poly1305_simd_mult(sctx->u + 5, dctx->r.r);
>                         memcpy(sctx->u + 10, sctx->u + 5, sizeof(sctx->u));
> -                       poly1305_simd_mult(sctx->u + 10, dctx->r);
> +                       poly1305_simd_mult(sctx->u + 10, dctx->r.r);
>                         sctx->wset = true;
>                 }
>                 blocks = srclen / (POLY1305_BLOCK_SIZE * 4);
> -               poly1305_4block_avx2(dctx->h, src, dctx->r, blocks, sctx->u);
> +               poly1305_4block_avx2(dctx->h.h, src, dctx->r.r, blocks,
> +                                    sctx->u);
>                 src += POLY1305_BLOCK_SIZE * 4 * blocks;
>                 srclen -= POLY1305_BLOCK_SIZE * 4 * blocks;
>         }
>  #endif
>         if (likely(srclen >= POLY1305_BLOCK_SIZE * 2)) {
>                 if (unlikely(!sctx->uset)) {
> -                       memcpy(sctx->u, dctx->r, sizeof(sctx->u));
> -                       poly1305_simd_mult(sctx->u, dctx->r);
> +                       memcpy(sctx->u, dctx->r.r, sizeof(sctx->u));
> +                       poly1305_simd_mult(sctx->u, dctx->r.r);
>                         sctx->uset = true;
>                 }
>                 blocks = srclen / (POLY1305_BLOCK_SIZE * 2);
> -               poly1305_2block_sse2(dctx->h, src, dctx->r, blocks, sctx->u);
> +               poly1305_2block_sse2(dctx->h.h, src, dctx->r.r, blocks,
> +                                    sctx->u);
>                 src += POLY1305_BLOCK_SIZE * 2 * blocks;
>                 srclen -= POLY1305_BLOCK_SIZE * 2 * blocks;
>         }
>         if (srclen >= POLY1305_BLOCK_SIZE) {
> -               poly1305_block_sse2(dctx->h, src, dctx->r, 1);
> +               poly1305_block_sse2(dctx->h.h, src, dctx->r.r, 1);
>                 srclen -= POLY1305_BLOCK_SIZE;
>         }
>         return srclen;
> diff --git a/crypto/poly1305_generic.c b/crypto/poly1305_generic.c
> index 47d3a6b83931e..2a06874204e87 100644
> --- a/crypto/poly1305_generic.c
> +++ b/crypto/poly1305_generic.c
> @@ -38,7 +38,7 @@ int crypto_poly1305_init(struct shash_desc *desc)
>  {
>         struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc);
>
> -       memset(dctx->h, 0, sizeof(dctx->h));
> +       poly1305_core_init(&dctx->h);
>         dctx->buflen = 0;
>         dctx->rset = false;
>         dctx->sset = false;
> @@ -47,23 +47,16 @@ int crypto_poly1305_init(struct shash_desc *desc)
>  }
>  EXPORT_SYMBOL_GPL(crypto_poly1305_init);
>
> -static void poly1305_setrkey(struct poly1305_desc_ctx *dctx, const u8 *key)
> +void poly1305_core_setkey(struct poly1305_key *key, const u8 *raw_key)
>  {
>         /* r &= 0xffffffc0ffffffc0ffffffc0fffffff */
> -       dctx->r[0] = (get_unaligned_le32(key +  0) >> 0) & 0x3ffffff;
> -       dctx->r[1] = (get_unaligned_le32(key +  3) >> 2) & 0x3ffff03;
> -       dctx->r[2] = (get_unaligned_le32(key +  6) >> 4) & 0x3ffc0ff;
> -       dctx->r[3] = (get_unaligned_le32(key +  9) >> 6) & 0x3f03fff;
> -       dctx->r[4] = (get_unaligned_le32(key + 12) >> 8) & 0x00fffff;
> -}
> -
> -static void poly1305_setskey(struct poly1305_desc_ctx *dctx, const u8 *key)
> -{
> -       dctx->s[0] = get_unaligned_le32(key +  0);
> -       dctx->s[1] = get_unaligned_le32(key +  4);
> -       dctx->s[2] = get_unaligned_le32(key +  8);
> -       dctx->s[3] = get_unaligned_le32(key + 12);
> +       key->r[0] = (get_unaligned_le32(raw_key +  0) >> 0) & 0x3ffffff;
> +       key->r[1] = (get_unaligned_le32(raw_key +  3) >> 2) & 0x3ffff03;
> +       key->r[2] = (get_unaligned_le32(raw_key +  6) >> 4) & 0x3ffc0ff;
> +       key->r[3] = (get_unaligned_le32(raw_key +  9) >> 6) & 0x3f03fff;
> +       key->r[4] = (get_unaligned_le32(raw_key + 12) >> 8) & 0x00fffff;
>  }
> +EXPORT_SYMBOL_GPL(poly1305_core_setkey);
>
>  /*
>   * Poly1305 requires a unique key for each tag, which implies that we can't set
> @@ -75,13 +68,16 @@ unsigned int crypto_poly1305_setdesckey(struct poly1305_desc_ctx *dctx,
>  {
>         if (!dctx->sset) {
>                 if (!dctx->rset && srclen >= POLY1305_BLOCK_SIZE) {
> -                       poly1305_setrkey(dctx, src);
> +                       poly1305_core_setkey(&dctx->r, src);
>                         src += POLY1305_BLOCK_SIZE;
>                         srclen -= POLY1305_BLOCK_SIZE;
>                         dctx->rset = true;
>                 }
>                 if (srclen >= POLY1305_BLOCK_SIZE) {
> -                       poly1305_setskey(dctx, src);
> +                       dctx->s[0] = get_unaligned_le32(src +  0);
> +                       dctx->s[1] = get_unaligned_le32(src +  4);
> +                       dctx->s[2] = get_unaligned_le32(src +  8);
> +                       dctx->s[3] = get_unaligned_le32(src + 12);
>                         src += POLY1305_BLOCK_SIZE;
>                         srclen -= POLY1305_BLOCK_SIZE;
>                         dctx->sset = true;
> @@ -91,41 +87,37 @@ unsigned int crypto_poly1305_setdesckey(struct poly1305_desc_ctx *dctx,
>  }
>  EXPORT_SYMBOL_GPL(crypto_poly1305_setdesckey);
>
> -static unsigned int poly1305_blocks(struct poly1305_desc_ctx *dctx,
> -                                   const u8 *src, unsigned int srclen,
> -                                   u32 hibit)
> +static void poly1305_blocks_internal(struct poly1305_state *state,
> +                                    const struct poly1305_key *key,
> +                                    const void *src, unsigned int nblocks,
> +                                    u32 hibit)
>  {
>         u32 r0, r1, r2, r3, r4;
>         u32 s1, s2, s3, s4;
>         u32 h0, h1, h2, h3, h4;
>         u64 d0, d1, d2, d3, d4;
> -       unsigned int datalen;
>
> -       if (unlikely(!dctx->sset)) {
> -               datalen = crypto_poly1305_setdesckey(dctx, src, srclen);
> -               src += srclen - datalen;
> -               srclen = datalen;
> -       }
> +       if (!nblocks)
> +               return;
>
> -       r0 = dctx->r[0];
> -       r1 = dctx->r[1];
> -       r2 = dctx->r[2];
> -       r3 = dctx->r[3];
> -       r4 = dctx->r[4];
> +       r0 = key->r[0];
> +       r1 = key->r[1];
> +       r2 = key->r[2];
> +       r3 = key->r[3];
> +       r4 = key->r[4];
>
>         s1 = r1 * 5;
>         s2 = r2 * 5;
>         s3 = r3 * 5;
>         s4 = r4 * 5;
>
> -       h0 = dctx->h[0];
> -       h1 = dctx->h[1];
> -       h2 = dctx->h[2];
> -       h3 = dctx->h[3];
> -       h4 = dctx->h[4];
> -
> -       while (likely(srclen >= POLY1305_BLOCK_SIZE)) {
> +       h0 = state->h[0];
> +       h1 = state->h[1];
> +       h2 = state->h[2];
> +       h3 = state->h[3];
> +       h4 = state->h[4];
>
> +       do {
>                 /* h += m[i] */
>                 h0 += (get_unaligned_le32(src +  0) >> 0) & 0x3ffffff;
>                 h1 += (get_unaligned_le32(src +  3) >> 2) & 0x3ffffff;
> @@ -154,16 +146,36 @@ static unsigned int poly1305_blocks(struct poly1305_desc_ctx *dctx,
>                 h1 += h0 >> 26;       h0 = h0 & 0x3ffffff;
>
>                 src += POLY1305_BLOCK_SIZE;
> -               srclen -= POLY1305_BLOCK_SIZE;
> -       }
> +       } while (--nblocks);
>
> -       dctx->h[0] = h0;
> -       dctx->h[1] = h1;
> -       dctx->h[2] = h2;
> -       dctx->h[3] = h3;
> -       dctx->h[4] = h4;
> +       state->h[0] = h0;
> +       state->h[1] = h1;
> +       state->h[2] = h2;
> +       state->h[3] = h3;
> +       state->h[4] = h4;
> +}
>
> -       return srclen;
> +void poly1305_core_blocks(struct poly1305_state *state,
> +                         const struct poly1305_key *key,
> +                         const void *src, unsigned int nblocks)
> +{
> +       poly1305_blocks_internal(state, key, src, nblocks, 1 << 24);
> +}
> +EXPORT_SYMBOL_GPL(poly1305_core_blocks);
> +
> +static void poly1305_blocks(struct poly1305_desc_ctx *dctx,
> +                           const u8 *src, unsigned int srclen, u32 hibit)
> +{
> +       unsigned int datalen;
> +
> +       if (unlikely(!dctx->sset)) {
> +               datalen = crypto_poly1305_setdesckey(dctx, src, srclen);
> +               src += srclen - datalen;
> +               srclen = datalen;
> +       }
> +
> +       poly1305_blocks_internal(&dctx->h, &dctx->r,
> +                                src, srclen / POLY1305_BLOCK_SIZE, hibit);
>  }
>
>  int crypto_poly1305_update(struct shash_desc *desc,
> @@ -187,9 +199,9 @@ int crypto_poly1305_update(struct shash_desc *desc,
>         }
>
>         if (likely(srclen >= POLY1305_BLOCK_SIZE)) {
> -               bytes = poly1305_blocks(dctx, src, srclen, 1 << 24);
> -               src += srclen - bytes;
> -               srclen = bytes;
> +               poly1305_blocks(dctx, src, srclen, 1 << 24);
> +               src += srclen - (srclen % POLY1305_BLOCK_SIZE);
> +               srclen %= POLY1305_BLOCK_SIZE;
>         }
>
>         if (unlikely(srclen)) {
> @@ -201,30 +213,18 @@ int crypto_poly1305_update(struct shash_desc *desc,
>  }
>  EXPORT_SYMBOL_GPL(crypto_poly1305_update);
>
> -int crypto_poly1305_final(struct shash_desc *desc, u8 *dst)
> +void poly1305_core_emit(const struct poly1305_state *state, void *dst)
>  {
> -       struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc);
>         u32 h0, h1, h2, h3, h4;
>         u32 g0, g1, g2, g3, g4;
>         u32 mask;
> -       u64 f = 0;
> -
> -       if (unlikely(!dctx->sset))
> -               return -ENOKEY;
> -
> -       if (unlikely(dctx->buflen)) {
> -               dctx->buf[dctx->buflen++] = 1;
> -               memset(dctx->buf + dctx->buflen, 0,
> -                      POLY1305_BLOCK_SIZE - dctx->buflen);
> -               poly1305_blocks(dctx, dctx->buf, POLY1305_BLOCK_SIZE, 0);
> -       }
>
>         /* fully carry h */
> -       h0 = dctx->h[0];
> -       h1 = dctx->h[1];
> -       h2 = dctx->h[2];
> -       h3 = dctx->h[3];
> -       h4 = dctx->h[4];
> +       h0 = state->h[0];
> +       h1 = state->h[1];
> +       h2 = state->h[2];
> +       h3 = state->h[3];
> +       h4 = state->h[4];
>
>         h2 += (h1 >> 26);     h1 = h1 & 0x3ffffff;
>         h3 += (h2 >> 26);     h2 = h2 & 0x3ffffff;
> @@ -254,16 +254,40 @@ int crypto_poly1305_final(struct shash_desc *desc, u8 *dst)
>         h4 = (h4 & mask) | g4;
>
>         /* h = h % (2^128) */
> -       h0 = (h0 >>  0) | (h1 << 26);
> -       h1 = (h1 >>  6) | (h2 << 20);
> -       h2 = (h2 >> 12) | (h3 << 14);
> -       h3 = (h3 >> 18) | (h4 <<  8);
> +       put_unaligned_le32((h0 >>  0) | (h1 << 26), dst +  0);
> +       put_unaligned_le32((h1 >>  6) | (h2 << 20), dst +  4);
> +       put_unaligned_le32((h2 >> 12) | (h3 << 14), dst +  8);
> +       put_unaligned_le32((h3 >> 18) | (h4 <<  8), dst + 12);
> +}
> +EXPORT_SYMBOL_GPL(poly1305_core_emit);
> +
> +int crypto_poly1305_final(struct shash_desc *desc, u8 *dst)
> +{
> +       struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc);
> +       __le32 digest[4];
> +       u64 f = 0;
> +
> +       if (unlikely(!dctx->sset))
> +               return -ENOKEY;
> +
> +       if (unlikely(dctx->buflen)) {
> +               dctx->buf[dctx->buflen++] = 1;
> +               memset(dctx->buf + dctx->buflen, 0,
> +                      POLY1305_BLOCK_SIZE - dctx->buflen);
> +               poly1305_blocks(dctx, dctx->buf, POLY1305_BLOCK_SIZE, 0);
> +       }
> +
> +       poly1305_core_emit(&dctx->h, digest);
>
>         /* mac = (h + s) % (2^128) */
> -       f = (f >> 32) + h0 + dctx->s[0]; put_unaligned_le32(f, dst +  0);
> -       f = (f >> 32) + h1 + dctx->s[1]; put_unaligned_le32(f, dst +  4);
> -       f = (f >> 32) + h2 + dctx->s[2]; put_unaligned_le32(f, dst +  8);
> -       f = (f >> 32) + h3 + dctx->s[3]; put_unaligned_le32(f, dst + 12);
> +       f = (f >> 32) + le32_to_cpu(digest[0]) + dctx->s[0];
> +       put_unaligned_le32(f, dst + 0);
> +       f = (f >> 32) + le32_to_cpu(digest[1]) + dctx->s[1];
> +       put_unaligned_le32(f, dst + 4);
> +       f = (f >> 32) + le32_to_cpu(digest[2]) + dctx->s[2];
> +       put_unaligned_le32(f, dst + 8);
> +       f = (f >> 32) + le32_to_cpu(digest[3]) + dctx->s[3];
> +       put_unaligned_le32(f, dst + 12);
>
>         return 0;
>  }
> diff --git a/include/crypto/poly1305.h b/include/crypto/poly1305.h
> index f718a19da82f7..34317ed2071e6 100644
> --- a/include/crypto/poly1305.h
> +++ b/include/crypto/poly1305.h
> @@ -13,13 +13,21 @@
>  #define POLY1305_KEY_SIZE      32
>  #define POLY1305_DIGEST_SIZE   16
>
> +struct poly1305_key {
> +       u32 r[5];       /* key, base 2^26 */
> +};
> +
> +struct poly1305_state {
> +       u32 h[5];       /* accumulator, base 2^26 */
> +};
> +
>  struct poly1305_desc_ctx {
>         /* key */
> -       u32 r[5];
> +       struct poly1305_key r;
>         /* finalize key */
>         u32 s[4];
>         /* accumulator */
> -       u32 h[5];
> +       struct poly1305_state h;
>         /* partial buffer */
>         u8 buf[POLY1305_BLOCK_SIZE];
>         /* bytes used in partial buffer */
> @@ -30,6 +38,22 @@ struct poly1305_desc_ctx {
>         bool sset;
>  };
>
> +/*
> + * Poly1305 core functions.  These implement the ε-almost-∆-universal hash
> + * function underlying the Poly1305 MAC, i.e. they don't add an encrypted nonce
> + * ("s key") at the end.  They also only support block-aligned inputs.
> + */
> +void poly1305_core_setkey(struct poly1305_key *key, const u8 *raw_key);
> +static inline void poly1305_core_init(struct poly1305_state *state)
> +{
> +       memset(state->h, 0, sizeof(state->h));
> +}
> +void poly1305_core_blocks(struct poly1305_state *state,
> +                         const struct poly1305_key *key,
> +                         const void *src, unsigned int nblocks);
> +void poly1305_core_emit(const struct poly1305_state *state, void *dst);
> +
> +/* Crypto API helper functions for the Poly1305 MAC */
>  int crypto_poly1305_init(struct shash_desc *desc);
>  unsigned int crypto_poly1305_setdesckey(struct poly1305_desc_ctx *dctx,
>                                         const u8 *src, unsigned int srclen);
> --
> 2.19.1.331.ge82ca0e54c-goog
>

* Re: [RFC PATCH v2 09/12] crypto: nhpoly1305 - add NHPoly1305 support
  2018-10-15 17:54 ` [RFC PATCH v2 09/12] crypto: nhpoly1305 - add NHPoly1305 support Eric Biggers
@ 2018-10-20  4:00   ` Ard Biesheuvel
  2018-10-20  5:38     ` Eric Biggers
  0 siblings, 1 reply; 54+ messages in thread
From: Ard Biesheuvel @ 2018-10-20  4:00 UTC (permalink / raw)
  To: Eric Biggers
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fscrypt,
	linux-arm-kernel, Linux Kernel Mailing List, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

On 16 October 2018 at 01:54, Eric Biggers <ebiggers@kernel.org> wrote:
> From: Eric Biggers <ebiggers@google.com>
>
> Add a generic implementation of NHPoly1305, an ε-almost-∆-universal hash
> function used in the Adiantum encryption mode.
>
> CONFIG_NHPOLY1305 is not selectable by itself since there won't be any
> real reason to enable it without also enabling Adiantum support.
>
> Signed-off-by: Eric Biggers <ebiggers@google.com>
> ---
>  crypto/Kconfig              |    5 +
>  crypto/Makefile             |    1 +
>  crypto/nhpoly1305.c         |  288 ++++++++
>  crypto/testmgr.c            |    6 +
>  crypto/testmgr.h            | 1240 ++++++++++++++++++++++++++++++++++-
>  include/crypto/nhpoly1305.h |   74 +++
>  6 files changed, 1610 insertions(+), 4 deletions(-)
>  create mode 100644 crypto/nhpoly1305.c
>  create mode 100644 include/crypto/nhpoly1305.h
>
> diff --git a/crypto/Kconfig b/crypto/Kconfig
> index 4fa0a4a0e8615..431beca903623 100644
> --- a/crypto/Kconfig
> +++ b/crypto/Kconfig
> @@ -493,6 +493,11 @@ config CRYPTO_KEYWRAP
>           Support for key wrapping (NIST SP800-38F / RFC3394) without
>           padding.
>
> +config CRYPTO_NHPOLY1305
> +       tristate
> +       select CRYPTO_HASH
> +       select CRYPTO_POLY1305
> +
>  comment "Hash modes"
>
>  config CRYPTO_CMAC
> diff --git a/crypto/Makefile b/crypto/Makefile
> index 7e673f7c71107..87b86f221a2a2 100644
> --- a/crypto/Makefile
> +++ b/crypto/Makefile
> @@ -84,6 +84,7 @@ obj-$(CONFIG_CRYPTO_LRW) += lrw.o
>  obj-$(CONFIG_CRYPTO_XTS) += xts.o
>  obj-$(CONFIG_CRYPTO_CTR) += ctr.o
>  obj-$(CONFIG_CRYPTO_KEYWRAP) += keywrap.o
> +obj-$(CONFIG_CRYPTO_NHPOLY1305) += nhpoly1305.o
>  obj-$(CONFIG_CRYPTO_GCM) += gcm.o
>  obj-$(CONFIG_CRYPTO_CCM) += ccm.o
>  obj-$(CONFIG_CRYPTO_CHACHA20POLY1305) += chacha20poly1305.o
> diff --git a/crypto/nhpoly1305.c b/crypto/nhpoly1305.c
> new file mode 100644
> index 0000000000000..087ad7680dd62
> --- /dev/null
> +++ b/crypto/nhpoly1305.c
> @@ -0,0 +1,288 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * NHPoly1305 - ε-almost-∆-universal hash function for Adiantum
> + *
> + * Copyright 2018 Google LLC
> + */
> +
> +/*
> + * "NHPoly1305" is the main component of Adiantum hashing.
> + * Specifically, it is the calculation
> + *
> + *     H_M ← Poly1305_{K_M}(NH_{K_N}(pad_{128}(M)))
> + *
> + * from the procedure in section A.5 of the Adiantum paper [1].  It is an
> + * ε-almost-∆-universal (εA∆U) hash function for equal-length inputs over
> + * Z/(2^{128}Z), where the "∆" operation is addition.  It hashes 1024-byte
> + * chunks of the input with the NH hash function [2], reducing the input length
> + * by 32x.  The resulting NH digests are evaluated as a polynomial in
> + * GF(2^{130}-5), like in the Poly1305 MAC [3].  Note that the polynomial
> + * evaluation by itself would suffice to achieve the εA∆U property; NH is used
> + * for performance since it's over twice as fast as Poly1305.
> + *
> + * This is *not* a cryptographic hash function; do not use it as such!
> + *
> + * [1] Adiantum: length-preserving encryption for entry-level processors
> + *     (https://eprint.iacr.org/2018/720.pdf)
> + * [2] UMAC: Fast and Secure Message Authentication
> + *     (https://fastcrypto.org/umac/umac_proc.pdf)
> + * [3] The Poly1305-AES message-authentication code
> + *     (https://cr.yp.to/mac/poly1305-20050329.pdf)
> + */
> +
> +#include <asm/unaligned.h>
> +#include <crypto/algapi.h>
> +#include <crypto/internal/hash.h>
> +#include <crypto/nhpoly1305.h>
> +#include <linux/crypto.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +
> +#define NH_STRIDE(K0, K1, K2, K3)                              \
> +({                                                             \
> +       m_A = get_unaligned_le32(src); src += 4;                \
> +       m_B = get_unaligned_le32(src); src += 4;                \
> +       m_C = get_unaligned_le32(src); src += 4;                \
> +       m_D = get_unaligned_le32(src); src += 4;                \
> +       K3##_A = *key++;                                        \
> +       K3##_B = *key++;                                        \
> +       K3##_C = *key++;                                        \
> +       K3##_D = *key++;                                        \
> +       sum0 += (u64)(u32)(m_A + K0##_A) * (u32)(m_C + K0##_C); \
> +       sum1 += (u64)(u32)(m_A + K1##_A) * (u32)(m_C + K1##_C); \
> +       sum2 += (u64)(u32)(m_A + K2##_A) * (u32)(m_C + K2##_C); \
> +       sum3 += (u64)(u32)(m_A + K3##_A) * (u32)(m_C + K3##_C); \
> +       sum0 += (u64)(u32)(m_B + K0##_B) * (u32)(m_D + K0##_D); \
> +       sum1 += (u64)(u32)(m_B + K1##_B) * (u32)(m_D + K1##_D); \
> +       sum2 += (u64)(u32)(m_B + K2##_B) * (u32)(m_D + K2##_D); \
> +       sum3 += (u64)(u32)(m_B + K3##_B) * (u32)(m_D + K3##_D); \
> +})
> +
> +static void nh_generic(const u32 *key, const u8 *src, size_t srclen,
> +                      __le64 hash[NH_NUM_PASSES])
> +{
> +       u64 sum0 = 0, sum1 = 0, sum2 = 0, sum3 = 0;
> +       u32 k0_A = *key++;
> +       u32 k0_B = *key++;
> +       u32 k0_C = *key++;
> +       u32 k0_D = *key++;
> +       u32 k1_A = *key++;
> +       u32 k1_B = *key++;
> +       u32 k1_C = *key++;
> +       u32 k1_D = *key++;
> +       u32 k2_A = *key++;
> +       u32 k2_B = *key++;
> +       u32 k2_C = *key++;
> +       u32 k2_D = *key++;
> +       u32 k3_A, k3_B, k3_C, k3_D;
> +       u32 m_A, m_B, m_C, m_D;
> +       size_t n = srclen / NH_MESSAGE_UNIT;
> +
> +       BUILD_BUG_ON(NH_PAIR_STRIDE != 2);
> +       BUILD_BUG_ON(NH_NUM_PASSES != 4);
> +
> +       while (n >= 4) {
> +               NH_STRIDE(k0, k1, k2, k3);
> +               NH_STRIDE(k1, k2, k3, k0);
> +               NH_STRIDE(k2, k3, k0, k1);
> +               NH_STRIDE(k3, k0, k1, k2);
> +               n -= 4;
> +       }
> +       if (n) {
> +               NH_STRIDE(k0, k1, k2, k3);
> +               if (--n) {
> +                       NH_STRIDE(k1, k2, k3, k0);
> +                       if (--n)
> +                               NH_STRIDE(k2, k3, k0, k1);
> +               }
> +       }
> +

This all looks a bit clunky to me, with the macro, the *key++s in the
initializers and these conditionals.

Was it written in this particular way to get GCC to optimize it in the
right way?
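
For comparison, here is a minimal sketch of the same NH computation written
as two plain nested loops. It is not taken from the patch. The name
nh_simple(), the load_le32() helper and the host-endian uint64_t output
(instead of __le64) are assumptions made for illustration. It follows the
key schedule implied by NH_STRIDE above: for message unit u and pass i, the
key words used are key[4 * (u + i)] through key[4 * (u + i) + 3].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NH_NUM_PASSES	4	/* number of parallel sums, as in the patch */
#define NH_MESSAGE_UNIT	16	/* bytes consumed per stride, as in the patch */

/* Hypothetical helper: portable unaligned little-endian 32-bit load. */
static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Loop-based NH sketch.  'key' must hold at least
 * 4 * (srclen / NH_MESSAGE_UNIT + NH_NUM_PASSES - 1) words.
 * The sums are returned in host byte order.
 */
static void nh_simple(const uint32_t *key, const uint8_t *src, size_t srclen,
		      uint64_t sums[NH_NUM_PASSES])
{
	size_t nunits = srclen / NH_MESSAGE_UNIT;
	size_t unit, pass;

	assert(srclen % NH_MESSAGE_UNIT == 0);

	for (pass = 0; pass < NH_NUM_PASSES; pass++)
		sums[pass] = 0;

	for (unit = 0; unit < nunits; unit++) {
		const uint8_t *p = src + unit * NH_MESSAGE_UNIT;
		uint32_t m_A = load_le32(p + 0);
		uint32_t m_B = load_le32(p + 4);
		uint32_t m_C = load_le32(p + 8);
		uint32_t m_D = load_le32(p + 12);

		for (pass = 0; pass < NH_NUM_PASSES; pass++) {
			/* Each pass sees the key shifted by one 4-word group. */
			const uint32_t *k = &key[4 * (unit + pass)];

			/* 32-bit wrapping adds, then a 32x32 -> 64 multiply. */
			sums[pass] += (uint64_t)(uint32_t)(m_A + k[0]) *
				      (uint32_t)(m_C + k[2]);
			sums[pass] += (uint64_t)(uint32_t)(m_B + k[1]) *
				      (uint32_t)(m_D + k[3]);
		}
	}
}
```

For a full 1024-byte NH message this needs 4 * (64 + 3) = 268 key words,
i.e. 1072 bytes, which together with a 16-byte Poly1305 key would be
consistent with the 1088-byte ksize in the test vectors. Whether a compiler
turns this form into code as good as the unrolled macro version is exactly
the open question; the sketch only shows what is being computed.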

> +       hash[0] = cpu_to_le64(sum0);
> +       hash[1] = cpu_to_le64(sum1);
> +       hash[2] = cpu_to_le64(sum2);
> +       hash[3] = cpu_to_le64(sum3);
> +}
> +
> +/* Pass the next NH hash value through Poly1305 */
> +static void process_nh_hash_value(struct nhpoly1305_state *state,
> +                                 const struct nhpoly1305_key *key)
> +{
> +       BUILD_BUG_ON(NH_HASH_BYTES % POLY1305_BLOCK_SIZE != 0);
> +
> +       poly1305_core_blocks(&state->poly_state, &key->poly_key, state->nh_hash,
> +                            NH_HASH_BYTES / POLY1305_BLOCK_SIZE);
> +}
> +
> +/*
> + * Feed the next portion of the source data, as a whole number of 16-byte
> + * "NH message units", through NH and Poly1305.  Each NH hash is taken over
> + * 1024 bytes, except possibly the final one which is taken over a multiple of
> + * 16 bytes up to 1024.  Also, in the case where data is passed in misaligned
> + * chunks, we combine partial hashes; the end result is the same either way.
> + */
> +static void nhpoly1305_units(struct nhpoly1305_state *state,
> +                            const struct nhpoly1305_key *key,
> +                            const u8 *src, unsigned int srclen, nh_t nh_fn)

Since indirect calls are going out of style: can we get rid of the
function pointer? Or is the compiler already inferring that it always
refers to nh_generic()?
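
One common way to keep a shared helper without a runtime indirect call is
to make the helper inlinable and pass the callback as a compile-time
constant. The sketch below is a toy under stated assumptions, not code from
the patch: chunk_fn_t, chunk_sum_generic(), process_chunks() and
process_generic() are hypothetical names, and the "hash" is just a byte
sum. Inlining is a compiler decision, so this only makes a direct call
possible and does not guarantee one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the nh_t callback type. */
typedef uint32_t (*chunk_fn_t)(const uint8_t *src, size_t len);

/* Toy "generic" implementation: sums the bytes of one chunk. */
static uint32_t chunk_sum_generic(const uint8_t *src, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < len; i++)
		sum += src[i];
	return sum;
}

/*
 * Shared helper parameterised by a callback.  Because it is static inline,
 * every caller that passes a constant 'fn' can get its own specialised
 * copy in which the compiler may resolve the call at compile time.
 */
static inline uint32_t process_chunks(const uint8_t *src, size_t len,
				      size_t chunk, chunk_fn_t fn)
{
	uint32_t total = 0;

	assert(chunk != 0);

	while (len) {
		size_t n = len < chunk ? len : chunk;

		total += fn(src, n);
		src += n;
		len -= n;
	}
	return total;
}

/* Entry point bound to the generic implementation. */
uint32_t process_generic(const uint8_t *src, size_t len)
{
	return process_chunks(src, len, 16, chunk_sum_generic);
}
```

An architecture-specific file would define its own entry point the same
way, passing its own chunk function. The trade-off is that the helper body
is duplicated in each caller instead of living once behind an exported
symbol.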

> +{
> +       do {
> +               unsigned int bytes;
> +
> +               if (state->nh_remaining == 0) {
> +                       /* Starting a new NH message */
> +                       bytes = min_t(unsigned int, srclen, NH_MESSAGE_BYTES);
> +                       nh_fn(key->nh_key, src, bytes, state->nh_hash);
> +                       state->nh_remaining = NH_MESSAGE_BYTES - bytes;
> +               } else {
> +                       /* Continuing a previous NH message */
> +                       __le64 tmp_hash[NH_NUM_PASSES];
> +                       unsigned int pos;
> +                       int i;
> +
> +                       pos = NH_MESSAGE_BYTES - state->nh_remaining;
> +                       bytes = min(srclen, state->nh_remaining);
> +                       nh_fn(&key->nh_key[pos / 4], src, bytes, tmp_hash);
> +                       for (i = 0; i < NH_NUM_PASSES; i++)
> +                               le64_add_cpu(&state->nh_hash[i],
> +                                            le64_to_cpu(tmp_hash[i]));
> +                       state->nh_remaining -= bytes;
> +               }
> +               if (state->nh_remaining == 0)
> +                       process_nh_hash_value(state, key);
> +               src += bytes;
> +               srclen -= bytes;
> +       } while (srclen);
> +}
> +
> +int crypto_nhpoly1305_setkey(struct crypto_shash *tfm,
> +                            const u8 *key, unsigned int keylen)
> +{
> +       struct nhpoly1305_key *ctx = crypto_shash_ctx(tfm);
> +       int i;
> +
> +       if (keylen != NHPOLY1305_KEY_SIZE)
> +               return -EINVAL;
> +
> +       poly1305_core_setkey(&ctx->poly_key, key);
> +       key += POLY1305_BLOCK_SIZE;
> +
> +       for (i = 0; i < NH_KEY_WORDS; i++)
> +               ctx->nh_key[i] = get_unaligned_le32(key + i * sizeof(u32));
> +
> +       return 0;
> +}
> +EXPORT_SYMBOL(crypto_nhpoly1305_setkey);
> +
> +int crypto_nhpoly1305_init(struct shash_desc *desc)
> +{
> +       struct nhpoly1305_state *state = shash_desc_ctx(desc);
> +
> +       poly1305_core_init(&state->poly_state);
> +       state->buflen = 0;
> +       state->nh_remaining = 0;
> +       return 0;
> +}
> +EXPORT_SYMBOL(crypto_nhpoly1305_init);
> +
> +int crypto_nhpoly1305_update_helper(struct shash_desc *desc,
> +                                   const u8 *src, unsigned int srclen,
> +                                   nh_t nh_fn)
> +{
> +       struct nhpoly1305_state *state = shash_desc_ctx(desc);
> +       const struct nhpoly1305_key *key = crypto_shash_ctx(desc->tfm);
> +       unsigned int bytes;
> +
> +       if (state->buflen) {
> +               bytes = min(srclen, (int)NH_MESSAGE_UNIT - state->buflen);
> +               memcpy(&state->buffer[state->buflen], src, bytes);
> +               state->buflen += bytes;
> +               if (state->buflen < NH_MESSAGE_UNIT)
> +                       return 0;
> +               nhpoly1305_units(state, key, state->buffer, NH_MESSAGE_UNIT,
> +                                nh_fn);
> +               state->buflen = 0;
> +               src += bytes;
> +               srclen -= bytes;
> +       }
> +
> +       if (srclen >= NH_MESSAGE_UNIT) {
> +               bytes = round_down(srclen, NH_MESSAGE_UNIT);
> +               nhpoly1305_units(state, key, src, bytes, nh_fn);
> +               src += bytes;
> +               srclen -= bytes;
> +       }
> +
> +       if (srclen) {
> +               memcpy(state->buffer, src, srclen);
> +               state->buflen = srclen;
> +       }
> +       return 0;
> +}
> +EXPORT_SYMBOL(crypto_nhpoly1305_update_helper);
> +
> +int crypto_nhpoly1305_update(struct shash_desc *desc,
> +                            const u8 *src, unsigned int srclen)
> +{
> +       return crypto_nhpoly1305_update_helper(desc, src, srclen, nh_generic);
> +}
> +EXPORT_SYMBOL(crypto_nhpoly1305_update);
> +
> +int crypto_nhpoly1305_final_helper(struct shash_desc *desc, u8 *dst, nh_t nh_fn)
> +{
> +       struct nhpoly1305_state *state = shash_desc_ctx(desc);
> +       const struct nhpoly1305_key *key = crypto_shash_ctx(desc->tfm);
> +
> +       if (state->buflen) {
> +               memset(&state->buffer[state->buflen], 0,
> +                      NH_MESSAGE_UNIT - state->buflen);
> +               nhpoly1305_units(state, key, state->buffer, NH_MESSAGE_UNIT,
> +                                nh_fn);
> +       }
> +
> +       if (state->nh_remaining)
> +               process_nh_hash_value(state, key);
> +
> +       poly1305_core_emit(&state->poly_state, dst);
> +       return 0;
> +}
> +EXPORT_SYMBOL(crypto_nhpoly1305_final_helper);
> +
> +int crypto_nhpoly1305_final(struct shash_desc *desc, u8 *dst)
> +{
> +       return crypto_nhpoly1305_final_helper(desc, dst, nh_generic);
> +}
> +EXPORT_SYMBOL(crypto_nhpoly1305_final);
> +
> +static struct shash_alg nhpoly1305_alg = {
> +       .digestsize     = POLY1305_DIGEST_SIZE,
> +       .init           = crypto_nhpoly1305_init,
> +       .update         = crypto_nhpoly1305_update,
> +       .final          = crypto_nhpoly1305_final,
> +       .setkey         = crypto_nhpoly1305_setkey,
> +       .descsize       = sizeof(struct nhpoly1305_state),
> +       .base           = {
> +               .cra_name               = "nhpoly1305",
> +               .cra_driver_name        = "nhpoly1305-generic",
> +               .cra_priority           = 100,
> +               .cra_ctxsize            = sizeof(struct nhpoly1305_key),
> +               .cra_module             = THIS_MODULE,
> +       },

Could we use the .base.xxxx idiom here instead of the separately indented block?
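
To illustrate the difference, the sketch below initializes the same object
both ways using C99 nested designators. The demo_* structs are hypothetical,
cut-down stand-ins for struct shash_alg and struct crypto_alg so that the
example compiles on its own; only the field names and values are borrowed
from the patch.

```c
#include <string.h>

/* Hypothetical, reduced stand-in for struct crypto_alg. */
struct demo_base {
	const char *cra_name;
	const char *cra_driver_name;
	unsigned int cra_priority;
};

/* Hypothetical, reduced stand-in for struct shash_alg. */
struct demo_shash_alg {
	unsigned int digestsize;
	struct demo_base base;
};

/* Style used in the patch: a separately indented nested block. */
const struct demo_shash_alg demo_alg_nested = {
	.digestsize	= 16,
	.base		= {
		.cra_name		= "nhpoly1305",
		.cra_driver_name	= "nhpoly1305-generic",
		.cra_priority		= 100,
	},
};

/* The .base.xxxx style: one flat list of nested designators. */
const struct demo_shash_alg demo_alg_flat = {
	.base.cra_name		= "nhpoly1305",
	.base.cra_driver_name	= "nhpoly1305-generic",
	.base.cra_priority	= 100,
	.digestsize		= 16,
};

/* Returns 1 if both objects describe the same algorithm, 0 otherwise. */
int demo_algs_equal(const struct demo_shash_alg *a,
		    const struct demo_shash_alg *b)
{
	return a->digestsize == b->digestsize &&
	       a->base.cra_priority == b->base.cra_priority &&
	       strcmp(a->base.cra_name, b->base.cra_name) == 0 &&
	       strcmp(a->base.cra_driver_name, b->base.cra_driver_name) == 0;
}
```

Both initializers produce identical objects, so the choice is purely about
readability and one less level of indentation.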

> +};
> +
> +static int __init nhpoly1305_mod_init(void)
> +{
> +       return crypto_register_shash(&nhpoly1305_alg);
> +}
> +
> +static void __exit nhpoly1305_mod_exit(void)
> +{
> +       crypto_unregister_shash(&nhpoly1305_alg);
> +}
> +
> +module_init(nhpoly1305_mod_init);
> +module_exit(nhpoly1305_mod_exit);
> +
> +MODULE_DESCRIPTION("NHPoly1305 ε-almost-∆-universal hash function");
> +MODULE_LICENSE("GPL v2");
> +MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
> +MODULE_ALIAS_CRYPTO("nhpoly1305");
> +MODULE_ALIAS_CRYPTO("nhpoly1305-generic");
> diff --git a/crypto/testmgr.c b/crypto/testmgr.c
> index 3ff70ebc745cb..039a5d850a29c 100644
> --- a/crypto/testmgr.c
> +++ b/crypto/testmgr.c
> @@ -3291,6 +3291,12 @@ static const struct alg_test_desc alg_test_descs[] = {
>                                 .dec = __VECS(morus640_dec_tv_template),
>                         }
>                 }
> +       }, {
> +               .alg = "nhpoly1305",
> +               .test = alg_test_hash,
> +               .suite = {
> +                       .hash = __VECS(nhpoly1305_tv_template)
> +               }
>         }, {
>                 .alg = "ofb(aes)",
>                 .test = alg_test_skcipher,
> diff --git a/crypto/testmgr.h b/crypto/testmgr.h
> index 3b57b2701fcb2..40197d74b3d56 100644
> --- a/crypto/testmgr.h
> +++ b/crypto/testmgr.h
> @@ -27,7 +27,7 @@
>  #define MAX_DIGEST_SIZE                64
>  #define MAX_TAP                        8
>
> -#define MAX_KEYLEN             160
> +#define MAX_KEYLEN             1088
>  #define MAX_IVLEN              32
>
>  struct hash_testvec {
> @@ -35,10 +35,10 @@ struct hash_testvec {
>         const char *key;
>         const char *plaintext;
>         const char *digest;
> -       unsigned char tap[MAX_TAP];
> +       unsigned short tap[MAX_TAP];
> +       unsigned short np;
>         unsigned short psize;
> -       unsigned char np;
> -       unsigned char ksize;
> +       unsigned short ksize;
>  };
>
>  /*
> @@ -5593,6 +5593,1238 @@ static const struct hash_testvec poly1305_tv_template[] = {
>         },
>  };
>
> +/* NHPoly1305 test vectors from https://github.com/google/adiantum */
> +static const struct hash_testvec nhpoly1305_tv_template[] = {
> +       {
> +               .key    = "\xd2\x5d\x4c\xdd\x8d\x2b\x7f\x7a"
> +                         "\xd9\xbe\x71\xec\xd1\x83\x52\xe3"
> +                         "\xe1\xad\xd7\x5c\x0a\x75\x9d\xec"
> +                         "\x1d\x13\x7e\x5d\x71\x07\xc9\xe4"
> +                         "\x57\x2d\x44\x68\xcf\xd8\xd6\xc5"
> +                         "\x39\x69\x7d\x32\x75\x51\x4f\x7e"
> +                         "\xb2\x4c\xc6\x90\x51\x6e\xd9\xd6"
> +                         "\xa5\x8b\x2d\xf1\x94\xf9\xf7\x5e"
> +                         "\x2c\x84\x7b\x41\x0f\x88\x50\x89"
> +                         "\x30\xd9\xa1\x38\x46\x6c\xc0\x4f"
> +                         "\xe8\xdf\xdc\x66\xab\x24\x43\x41"
> +                         "\x91\x55\x29\x65\x86\x28\x5e\x45"
> +                         "\xd5\x2d\xb7\x80\x08\x9a\xc3\xd4"
> +                         "\x9a\x77\x0a\xd4\xef\x3e\xe6\x3f"
> +                         "\x6f\x2f\x9b\x3a\x7d\x12\x1e\x80"
> +                         "\x6c\x44\xa2\x25\xe1\xf6\x60\xe9"
> +                         "\x0d\xaf\xc5\x3c\xa5\x79\xae\x64"
> +                         "\xbc\xa0\x39\xa3\x4d\x10\xe5\x4d"
> +                         "\xd5\xe7\x89\x7a\x13\xee\x06\x78"
> +                         "\xdc\xa4\xdc\x14\x27\xe6\x49\x38"
> +                         "\xd0\xe0\x45\x25\x36\xc5\xf4\x79"
> +                         "\x2e\x9a\x98\x04\xe4\x2b\x46\x52"
> +                         "\x7c\x33\xca\xe2\x56\x51\x50\xe2"
> +                         "\xa5\x9a\xae\x18\x6a\x13\xf8\xd2"
> +                         "\x21\x31\x66\x02\xe2\xda\x8d\x7e"
> +                         "\x41\x19\xb2\x61\xee\x48\x8f\xf1"
> +                         "\x65\x24\x2e\x1e\x68\xce\x05\xd9"
> +                         "\x2a\xcf\xa5\x3a\x57\xdd\x35\x91"
> +                         "\x93\x01\xca\x95\xfc\x2b\x36\x04"
> +                         "\xe6\x96\x97\x28\xf6\x31\xfe\xa3"
> +                         "\x9d\xf6\x6a\x1e\x80\x8d\xdc\xec"
> +                         "\xaf\x66\x11\x13\x02\x88\xd5\x27"
> +                         "\x33\xb4\x1a\xcd\xa3\xf6\xde\x31"
> +                         "\x8e\xc0\x0e\x6c\xd8\x5a\x97\x5e"
> +                         "\xdd\xfd\x60\x69\x38\x46\x3f\x90"
> +                         "\x5e\x97\xd3\x32\x76\xc7\x82\x49"
> +                         "\xfe\xba\x06\x5f\x2f\xa2\xfd\xff"
> +                         "\x80\x05\x40\xe4\x33\x03\xfb\x10"
> +                         "\xc0\xde\x65\x8c\xc9\x8d\x3a\x9d"
> +                         "\xb5\x7b\x36\x4b\xb5\x0c\xcf\x00"
> +                         "\x9c\x87\xe4\x49\xad\x90\xda\x4a"
> +                         "\xdd\xbd\xff\xe2\x32\x57\xd6\x78"
> +                         "\x36\x39\x6c\xd3\x5b\x9b\x88\x59"
> +                         "\x2d\xf0\x46\xe4\x13\x0e\x2b\x35"
> +                         "\x0d\x0f\x73\x8a\x4f\x26\x84\x75"
> +                         "\x88\x3c\xc5\x58\x66\x18\x1a\xb4"
> +                         "\x64\x51\x34\x27\x1b\xa4\x11\xc9"
> +                         "\x6d\x91\x8a\xfa\x32\x60\x9d\xd7"
> +                         "\x87\xe5\xaa\x43\x72\xf8\xda\xd1"
> +                         "\x48\x44\x13\x61\xdc\x8c\x76\x17"
> +                         "\x0c\x85\x4e\xf3\xdd\xa2\x42\xd2"
> +                         "\x74\xc1\x30\x1b\xeb\x35\x31\x29"
> +                         "\x5b\xd7\x4c\x94\x46\x35\xa1\x23"
> +                         "\x50\xf2\xa2\x8e\x7e\x4f\x23\x4f"
> +                         "\x51\xff\xe2\xc9\xa3\x7d\x56\x8b"
> +                         "\x41\xf2\xd0\xc5\x57\x7e\x59\xac"
> +                         "\xbb\x65\xf3\xfe\xf7\x17\xef\x63"
> +                         "\x7c\x6f\x23\xdd\x22\x8e\xed\x84"
> +                         "\x0e\x3b\x09\xb3\xf3\xf4\x8f\xcd"
> +                         "\x37\xa8\xe1\xa7\x30\xdb\xb1\xa2"
> +                         "\x9c\xa2\xdf\x34\x17\x3e\x68\x44"
> +                         "\xd0\xde\x03\x50\xd1\x48\x6b\x20"
> +                         "\xe2\x63\x45\xa5\xea\x87\xc2\x42"
> +                         "\x95\x03\x49\x05\xed\xe0\x90\x29"
> +                         "\x1a\xb8\xcf\x9b\x43\xcf\x29\x7a"
> +                         "\x63\x17\x41\x9f\xe0\xc9\x10\xfd"
> +                         "\x2c\x56\x8c\x08\x55\xb4\xa9\x27"
> +                         "\x0f\x23\xb1\x05\x6a\x12\x46\xc7"
> +                         "\xe1\xfe\x28\x93\x93\xd7\x2f\xdc"
> +                         "\x98\x30\xdb\x75\x8a\xbe\x97\x7a"
> +                         "\x02\xfb\x8c\xba\xbe\x25\x09\xbe"
> +                         "\xce\xcb\xa2\xef\x79\x4d\x0e\x9d"
> +                         "\x1b\x9d\xb6\x39\x34\x38\xfa\x07"
> +                         "\xec\xe8\xfc\x32\x85\x1d\xf7\x85"
> +                         "\x63\xc3\x3c\xc0\x02\x75\xd7\x3f"
> +                         "\xb2\x68\x60\x66\x65\x81\xc6\xb1"
> +                         "\x42\x65\x4b\x4b\x28\xd7\xc7\xaa"
> +                         "\x9b\xd2\xdc\x1b\x01\xe0\x26\x39"
> +                         "\x01\xc1\x52\x14\xd1\x3f\xb7\xe6"
> +                         "\x61\x41\xc7\x93\xd2\xa2\x67\xc6"
> +                         "\xf7\x11\xb5\xf5\xea\xdd\x19\xfb"
> +                         "\x4d\x21\x12\xd6\x7d\xf1\x10\xb0"
> +                         "\x89\x07\xc7\x5a\x52\x73\x70\x2f"
> +                         "\x32\xef\x65\x2b\x12\xb2\xf0\xf5"
> +                         "\x20\xe0\x90\x59\x7e\x64\xf1\x4c"
> +                         "\x41\xb3\xa5\x91\x08\xe6\x5e\x5f"
> +                         "\x05\x56\x76\xb4\xb0\xcd\x70\x53"
> +                         "\x10\x48\x9c\xff\xc2\x69\x55\x24"
> +                         "\x87\xef\x84\xea\xfb\xa7\xbf\xa0"
> +                         "\x91\x04\xad\x4f\x8b\x57\x54\x4b"
> +                         "\xb6\xe9\xd1\xac\x37\x2f\x1d\x2e"
> +                         "\xab\xa5\xa4\xe8\xff\xfb\xd9\x39"
> +                         "\x2f\xb7\xac\xd1\xfe\x0b\x9a\x80"
> +                         "\x0f\xb6\xf4\x36\x39\x90\x51\xe3"
> +                         "\x0a\x2f\xb6\x45\x76\x89\xcd\x61"
> +                         "\xfe\x48\x5f\x75\x1d\x13\x00\x62"
> +                         "\x80\x24\x47\xe7\xbc\x37\xd7\xe3"
> +                         "\x15\xe8\x68\x22\xaf\x80\x6f\x4b"
> +                         "\xa8\x9f\x01\x10\x48\x14\xc3\x02"
> +                         "\x52\xd2\xc7\x75\x9b\x52\x6d\x30"
> +                         "\xac\x13\x85\xc8\xf7\xa3\x58\x4b"
> +                         "\x49\xf7\x1c\x45\x55\x8c\x39\x9a"
> +                         "\x99\x6d\x97\x27\x27\xe6\xab\xdd"
> +                         "\x2c\x42\x1b\x35\xdd\x9d\x73\xbb"
> +                         "\x6c\xf3\x64\xf1\xfb\xb9\xf7\xe6"
> +                         "\x4a\x3c\xc0\x92\xc0\x2e\xb7\x1a"
> +                         "\xbe\xab\xb3\x5a\xe5\xea\xb1\x48"
> +                         "\x58\x13\x53\x90\xfd\xc3\x8e\x54"
> +                         "\xf9\x18\x16\x73\xe8\xcb\x6d\x39"
> +                         "\x0e\xd7\xe0\xfe\xb6\x9f\x43\x97"
> +                         "\xe8\xd0\x85\x56\x83\x3e\x98\x68"
> +                         "\x7f\xbd\x95\xa8\x9a\x61\x21\x8f"
> +                         "\x06\x98\x34\xa6\xc8\xd6\x1d\xf3"
> +                         "\x3d\x43\xa4\x9a\x8c\xe5\xd3\x5a"
> +                         "\x32\xa2\x04\x22\xa4\x19\x1a\x46"
> +                         "\x42\x7e\x4d\xe5\xe0\xe6\x0e\xca"
> +                         "\xd5\x58\x9d\x2c\xaf\xda\x33\x5c"
> +                         "\xb0\x79\x9e\xc9\xfc\xca\xf0\x2f"
> +                         "\xa8\xb2\x77\xeb\x7a\xa2\xdd\x37"
> +                         "\x35\x83\x07\xd6\x02\x1a\xb6\x6c"
> +                         "\x24\xe2\x59\x08\x0e\xfd\x3e\x46"
> +                         "\xec\x40\x93\xf4\x00\x26\x4f\x2a"
> +                         "\xff\x47\x2f\xeb\x02\x92\x26\x5b"
> +                         "\x53\x17\xc2\x8d\x2a\xc7\xa3\x1b"
> +                         "\xcd\xbc\xa7\xe8\xd1\x76\xe3\x80"
> +                         "\x21\xca\x5d\x3b\xe4\x9c\x8f\xa9"
> +                         "\x5b\x7f\x29\x7f\x7c\xd8\xed\x6d"
> +                         "\x8c\xb2\x86\x85\xe7\x77\xf2\x85"
> +                         "\xab\x38\xa9\x9d\xc1\x4e\xc5\x64"
> +                         "\x33\x73\x8b\x59\x03\xad\x05\xdf"
> +                         "\x25\x98\x31\xde\xef\x13\xf1\x9b"
> +                         "\x3c\x91\x9d\x7b\xb1\xfa\xe6\xbf"
> +                         "\x5b\xed\xa5\x55\xe6\xea\x6c\x74"
> +                         "\xf4\xb9\xe4\x45\x64\x72\x81\xc2"
> +                         "\x4c\x28\xd4\xcd\xac\xe2\xde\xf9"
> +                         "\xeb\x5c\xeb\x61\x60\x5a\xe5\x28",
> +               .ksize  = 1088,
> +               .plaintext      = "",
> +               .psize  = 0,
> +               .digest = "\x00\x00\x00\x00\x00\x00\x00\x00"
> +                         "\x00\x00\x00\x00\x00\x00\x00\x00",
> +       }, {
> +               .key    = "\x29\x21\x43\xcb\xcb\x13\x07\xde"
> +                         "\xbf\x48\xdf\x8a\x7f\xa2\x84\xde"
> +                         "\x72\x23\x9d\xf5\xf0\x07\xf2\x4c"
> +                         "\x20\x3a\x93\xb9\xcd\x5d\xfe\xcb"
> +                         "\x99\x2c\x2b\x58\xc6\x50\x5f\x94"
> +                         "\x56\xc3\x7c\x0d\x02\x3f\xb8\x5e"
> +                         "\x7b\xc0\x6c\x51\x34\x76\xc0\x0e"
> +                         "\xc6\x22\xc8\x9e\x92\xa0\x21\xc9"
> +                         "\x85\x5c\x7c\xf8\xe2\x64\x47\xc9"
> +                         "\xe4\xa2\x57\x93\xf8\xa2\x69\xcd"
> +                         "\x62\x98\x99\xf4\xd7\x7b\x14\xb1"
> +                         "\xd8\x05\xff\x04\x15\xc9\xe1\x6e"
> +                         "\x9b\xe6\x50\x6b\x0b\x3f\x22\x1f"
> +                         "\x08\xde\x0c\x5b\x08\x7e\xc6\x2f"
> +                         "\x6c\xed\xd6\xb2\x15\xa4\xb3\xf9"
> +                         "\xa7\x46\x38\x2a\xea\x69\xa5\xde"
> +                         "\x02\xc3\x96\x89\x4d\x55\x3b\xed"
> +                         "\x3d\x3a\x85\x77\xbf\x97\x45\x5c"
> +                         "\x9e\x02\x69\xe2\x1b\x68\xbe\x96"
> +                         "\xfb\x64\x6f\x0f\xf6\x06\x40\x67"
> +                         "\xfa\x04\xe3\x55\xfa\xbe\xa4\x60"
> +                         "\xef\x21\x66\x97\xe6\x9d\x5c\x1f"
> +                         "\x62\x37\xaa\x31\xde\xe4\x9c\x28"
> +                         "\x95\xe0\x22\x86\xf4\x4d\xf3\x07"
> +                         "\xfd\x5f\x3a\x54\x2c\x51\x80\x71"
> +                         "\xba\x78\x69\x5b\x65\xab\x1f\x81"
> +                         "\xed\x3b\xff\x34\xa3\xfb\xbc\x73"
> +                         "\x66\x7d\x13\x7f\xdf\x6e\xe2\xe2"
> +                         "\xeb\x4f\x6c\xda\x7d\x33\x57\xd0"
> +                         "\xd3\x7c\x95\x4f\x33\x58\x21\xc7"
> +                         "\xc0\xe5\x6f\x42\x26\xc6\x1f\x5e"
> +                         "\x85\x1b\x98\x9a\xa2\x1e\x55\x77"
> +                         "\x23\xdf\x81\x5e\x79\x55\x05\xfc"
> +                         "\xfb\xda\xee\xba\x5a\xba\xf7\x77"
> +                         "\x7f\x0e\xd3\xe1\x37\xfe\x8d\x2b"
> +                         "\xd5\x3f\xfb\xd0\xc0\x3c\x0b\x3f"
> +                         "\xcf\x3c\x14\xcf\xfb\x46\x72\x4c"
> +                         "\x1f\x39\xe2\xda\x03\x71\x6d\x23"
> +                         "\xef\x93\xcd\x39\xd9\x37\x80\x4d"
> +                         "\x65\x61\xd1\x2c\x03\xa9\x47\x72"
> +                         "\x4d\x1e\x0e\x16\x33\x0f\x21\x17"
> +                         "\xec\x92\xea\x6f\x37\x22\xa4\xd8"
> +                         "\x03\x33\x9e\xd8\x03\x69\x9a\xe8"
> +                         "\xb2\x57\xaf\x78\x99\x05\x12\xab"
> +                         "\x48\x90\x80\xf0\x12\x9b\x20\x64"
> +                         "\x7a\x1d\x47\x5f\xba\x3c\xf9\xc3"
> +                         "\x0a\x0d\x8d\xa1\xf9\x1b\x82\x13"
> +                         "\x3e\x0d\xec\x0a\x83\xc0\x65\xe1"
> +                         "\xe9\x95\xff\x97\xd6\xf2\xe4\xd5"
> +                         "\x86\xc0\x1f\x29\x27\x63\xd7\xde"
> +                         "\xb7\x0a\x07\x99\x04\x2d\xa3\x89"
> +                         "\xa2\x43\xcf\xf3\xe1\x43\xac\x4a"
> +                         "\x06\x97\xd0\x05\x4f\x87\xfa\xf9"
> +                         "\x9b\xbf\x52\x70\xbd\xbc\x6c\xf3"
> +                         "\x03\x13\x60\x41\x28\x09\xec\xcc"
> +                         "\xb1\x1a\xec\xd6\xfb\x6f\x2a\x89"
> +                         "\x5d\x0b\x53\x9c\x59\xc1\x84\x21"
> +                         "\x33\x51\x47\x19\x31\x9c\xd4\x0a"
> +                         "\x4d\x04\xec\x50\x90\x61\xbd\xbc"
> +                         "\x7e\xc8\xd9\x6c\x98\x1d\x45\x41"
> +                         "\x17\x5e\x97\x1c\xc5\xa8\xe8\xea"
> +                         "\x46\x58\x53\xf7\x17\xd5\xad\x11"
> +                         "\xc8\x54\xf5\x7a\x33\x90\xf5\x19"
> +                         "\xba\x36\xb4\xfc\x52\xa5\x72\x3d"
> +                         "\x14\xbb\x55\xa7\xe9\xe3\x12\xf7"
> +                         "\x1c\x30\xa2\x82\x03\xbf\x53\x91"
> +                         "\x2e\x60\x41\x9f\x5b\x69\x39\xf6"
> +                         "\x4d\xc8\xf8\x46\x7a\x7f\xa4\x98"
> +                         "\x36\xff\x06\xcb\xca\xe7\x33\xf2"
> +                         "\xc0\x4a\xf4\x3c\x14\x44\x5f\x6b"
> +                         "\x75\xef\x02\x36\x75\x08\x14\xfd"
> +                         "\x10\x8e\xa5\x58\xd0\x30\x46\x49"
> +                         "\xaf\x3a\xf8\x40\x3d\x35\xdb\x84"
> +                         "\x11\x2e\x97\x6a\xb7\x87\x7f\xad"
> +                         "\xf1\xfa\xa5\x63\x60\xd8\x5e\xbf"
> +                         "\x41\x78\x49\xcf\x77\xbb\x56\xbb"
> +                         "\x7d\x01\x67\x05\x22\xc8\x8f\x41"
> +                         "\xba\x81\xd2\xca\x2c\x38\xac\x76"
> +                         "\x06\xc1\x1a\xc2\xce\xac\x90\x67"
> +                         "\x57\x3e\x20\x12\x5b\xd9\x97\x58"
> +                         "\x65\x05\xb7\x04\x61\x7e\xd8\x3a"
> +                         "\xbf\x55\x3b\x13\xe9\x34\x5a\x37"
> +                         "\x36\xcb\x94\x45\xc5\x32\xb3\xa0"
> +                         "\x0c\x3e\x49\xc5\xd3\xed\xa7\xf0"
> +                         "\x1c\x69\xcc\xea\xcc\x83\xc9\x16"
> +                         "\x95\x72\x4b\xf4\x89\xd5\xb9\x10"
> +                         "\xf6\x2d\x60\x15\xea\x3c\x06\x66"
> +                         "\x9f\x82\xad\x17\xce\xd2\xa4\x48"
> +                         "\x7c\x65\xd9\xf8\x02\x4d\x9b\x4c"
> +                         "\x89\x06\x3a\x34\x85\x48\x89\x86"
> +                         "\xf9\x24\xa9\x54\x72\xdb\x44\x95"
> +                         "\xc7\x44\x1c\x19\x11\x4c\x04\xdc"
> +                         "\x13\xb9\x67\xc8\xc3\x3a\x6a\x50"
> +                         "\xfa\xd1\xfb\xe1\x88\xb6\xf1\xa3"
> +                         "\xc5\x3b\xdc\x38\x45\x16\x26\x02"
> +                         "\x3b\xb8\x8f\x8b\x58\x7d\x23\x04"
> +                         "\x50\x6b\x81\x9f\xae\x66\xac\x6f"
> +                         "\xcf\x2a\x9d\xf1\xfd\x1d\x57\x07"
> +                         "\xbe\x58\xeb\x77\x0c\xe3\xc2\x19"
> +                         "\x14\x74\x1b\x51\x1c\x4f\x41\xf3"
> +                         "\x32\x89\xb3\xe7\xde\x62\xf6\x5f"
> +                         "\xc7\x6a\x4a\x2a\x5b\x0f\x5f\x87"
> +                         "\x9c\x08\xb9\x02\x88\xc8\x29\xb7"
> +                         "\x94\x52\xfa\x52\xfe\xaa\x50\x10"
> +                         "\xba\x48\x75\x5e\x11\x1b\xe6\x39"
> +                         "\xd7\x82\x2c\x87\xf1\x1e\xa4\x38"
> +                         "\x72\x3e\x51\xe7\xd8\x3e\x5b\x7b"
> +                         "\x31\x16\x89\xba\xd6\xad\x18\x5e"
> +                         "\xba\xf8\x12\xb3\xf4\x6c\x47\x30"
> +                         "\xc0\x38\x58\xb3\x10\x8d\x58\x5d"
> +                         "\xb4\xfb\x19\x7e\x41\xc3\x66\xb8"
> +                         "\xd6\x72\x84\xe1\x1a\xc2\x71\x4c"
> +                         "\x0d\x4a\x21\x7a\xab\xa2\xc0\x36"
> +                         "\x15\xc5\xe9\x46\xd7\x29\x17\x76"
> +                         "\x5e\x47\x36\x7f\x72\x05\xa7\xcc"
> +                         "\x36\x63\xf9\x47\x7d\xe6\x07\x3c"
> +                         "\x8b\x79\x1d\x96\x61\x8d\x90\x65"
> +                         "\x7c\xf5\xeb\x4e\x6e\x09\x59\x6d"
> +                         "\x62\x50\x1b\x0f\xe0\xdc\x78\xf2"
> +                         "\x5b\x83\x1a\xa1\x11\x75\xfd\x18"
> +                         "\xd7\xe2\x8d\x65\x14\x21\xce\xbe"
> +                         "\xb5\x87\xe3\x0a\xda\x24\x0a\x64"
> +                         "\xa9\x9f\x03\x8d\x46\x5d\x24\x1a"
> +                         "\x8a\x0c\x42\x01\xca\xb1\x5f\x7c"
> +                         "\xa5\xac\x32\x4a\xb8\x07\x91\x18"
> +                         "\x6f\xb0\x71\x3c\xc9\xb1\xa8\xf8"
> +                         "\x5f\x69\xa5\xa1\xca\x9e\x7a\xaa"
> +                         "\xac\xe9\xc7\x47\x41\x75\x25\xc3"
> +                         "\x73\xe2\x0b\xdd\x6d\x52\x71\xbe"
> +                         "\xc5\xdc\xb4\xe7\x01\x26\x53\x77"
> +                         "\x86\x90\x85\x68\x6b\x7b\x03\x53"
> +                         "\xda\x52\x52\x51\x68\xc8\xf3\xec"
> +                         "\x6c\xd5\x03\x7a\xa3\x0e\xb4\x02"
> +                         "\x5f\x1a\xab\xee\xca\x67\x29\x7b"
> +                         "\xbd\x96\x59\xb3\x8b\x32\x7a\x92"
> +                         "\x9f\xd8\x25\x2b\xdf\xc0\x4c\xda",
> +               .ksize  = 1088,
> +               .plaintext      = "\xbc\xda\x81\xa8\x78\x79\x1c\xbf"
> +                         "\x77\x53\xba\x4c\x30\x5b\xb8\x33",
> +               .psize  = 16,
> +               .digest = "\x04\xbf\x7f\x6a\xce\x72\xea\x6a"
> +                         "\x79\xdb\xb0\xc9\x60\xf6\x12\xcc",
> +               .np     = 6,
> +               .tap    = { 4, 4, 1, 1, 1, 5 },
> +       }, {
> +               .key    = "\x65\x4d\xe3\xf8\xd2\x4c\xac\x28"
> +                         "\x68\xf5\xb3\x81\x71\x4b\xa1\xfa"
> +                         "\x04\x0e\xd3\x81\x36\xbe\x0c\x81"
> +                         "\x5e\xaf\xbc\x3a\xa4\xc0\x8e\x8b"
> +                         "\x55\x63\xd3\x52\x97\x88\xd6\x19"
> +                         "\xbc\x96\xdf\x49\xff\x04\x63\xf5"
> +                         "\x0c\x11\x13\xaa\x9e\x1f\x5a\xf7"
> +                         "\xdd\xbd\x37\x80\xc3\xd0\xbe\xa7"
> +                         "\x05\xc8\x3c\x98\x1e\x05\x3c\x84"
> +                         "\x39\x61\xc4\xed\xed\x71\x1b\xc4"
> +                         "\x74\x45\x2c\xa1\x56\x70\x97\xfd"
> +                         "\x44\x18\x07\x7d\xca\x60\x1f\x73"
> +                         "\x3b\x6d\x21\xcb\x61\x87\x70\x25"
> +                         "\x46\x21\xf1\x1f\x21\x91\x31\x2d"
> +                         "\x5d\xcc\xb7\xd1\x84\x3e\x3d\xdb"
> +                         "\x03\x53\x2a\x82\xa6\x9a\x95\xbc"
> +                         "\x1a\x1e\x0a\x5e\x07\x43\xab\x43"
> +                         "\xaf\x92\x82\x06\x91\x04\x09\xf4"
> +                         "\x17\x0a\x9a\x2c\x54\xdb\xb8\xf4"
> +                         "\xd0\xf0\x10\x66\x24\x8d\xcd\xda"
> +                         "\xfe\x0e\x45\x9d\x6f\xc4\x4e\xf4"
> +                         "\x96\xaf\x13\xdc\xa9\xd4\x8c\xc4"
> +                         "\xc8\x57\x39\x3c\xc2\xd3\x0a\x76"
> +                         "\x4a\x1f\x75\x83\x44\xc7\xd1\x39"
> +                         "\xd8\xb5\x41\xba\x73\x87\xfa\x96"
> +                         "\xc7\x18\x53\xfb\x9b\xda\xa0\x97"
> +                         "\x1d\xee\x60\x85\x9e\x14\xc3\xce"
> +                         "\xc4\x05\x29\x3b\x95\x30\xa3\xd1"
> +                         "\x9f\x82\x6a\x04\xf5\xa7\x75\x57"
> +                         "\x82\x04\xfe\x71\x51\x71\xb1\x49"
> +                         "\x50\xf8\xe0\x96\xf1\xfa\xa8\x88"
> +                         "\x3f\xa0\x86\x20\xd4\x60\x79\x59"
> +                         "\x17\x2d\xd1\x09\xf4\xec\x05\x57"
> +                         "\xcf\x62\x7e\x0e\x7e\x60\x78\xe6"
> +                         "\x08\x60\x29\xd8\xd5\x08\x1a\x24"
> +                         "\xc4\x6c\x24\xe7\x92\x08\x3d\x8a"
> +                         "\x98\x7a\xcf\x99\x0a\x65\x0e\xdc"
> +                         "\x8c\x8a\xbe\x92\x82\x91\xcc\x62"
> +                         "\x30\xb6\xf4\x3f\xc6\x8a\x7f\x12"
> +                         "\x4a\x8a\x49\xfa\x3f\x5c\xd4\x5a"
> +                         "\xa6\x82\xa3\xe6\xaa\x34\x76\xb2"
> +                         "\xab\x0a\x30\xef\x6c\x77\x58\x3f"
> +                         "\x05\x6b\xcc\x5c\xae\xdc\xd7\xb9"
> +                         "\x51\x7e\x8d\x32\x5b\x24\x25\xbe"
> +                         "\x2b\x24\x01\xcf\x80\xda\x16\xd8"
> +                         "\x90\x72\x2c\xad\x34\x8d\x0c\x74"
> +                         "\x02\xcb\xfd\xcf\x6e\xef\x97\xb5"
> +                         "\x4c\xf2\x68\xca\xde\x43\x9e\x8a"
> +                         "\xc5\x5f\x31\x7f\x14\x71\x38\xec"
> +                         "\xbd\x98\xe5\x71\xc4\xb5\xdb\xef"
> +                         "\x59\xd2\xca\xc0\xc1\x86\x75\x01"
> +                         "\xd4\x15\x0d\x6f\xa4\xf7\x7b\x37"
> +                         "\x47\xda\x18\x93\x63\xda\xbe\x9e"
> +                         "\x07\xfb\xb2\x83\xd5\xc4\x34\x55"
> +                         "\xee\x73\xa1\x42\x96\xf9\x66\x41"
> +                         "\xa4\xcc\xd2\x93\x6e\xe1\x0a\xbb"
> +                         "\xd2\xdd\x18\x23\xe6\x6b\x98\x0b"
> +                         "\x8a\x83\x59\x2c\xc3\xa6\x59\x5b"
> +                         "\x01\x22\x59\xf7\xdc\xb0\x87\x7e"
> +                         "\xdb\x7d\xf4\x71\x41\xab\xbd\xee"
> +                         "\x79\xbe\x3c\x01\x76\x0b\x2d\x0a"
> +                         "\x42\xc9\x77\x8c\xbb\x54\x95\x60"
> +                         "\x43\x2e\xe0\x17\x52\xbd\x90\xc9"
> +                         "\xc2\x2c\xdd\x90\x24\x22\x76\x40"
> +                         "\x5c\xb9\x41\xc9\xa1\xd5\xbd\xe3"
> +                         "\x44\xe0\xa4\xab\xcc\xb8\xe2\x32"
> +                         "\x02\x15\x04\x1f\x8c\xec\x5d\x14"
> +                         "\xac\x18\xaa\xef\x6e\x33\x19\x6e"
> +                         "\xde\xfe\x19\xdb\xeb\x61\xca\x18"
> +                         "\xad\xd8\x3d\xbf\x09\x11\xc7\xa5"
> +                         "\x86\x0b\x0f\xe5\x3e\xde\xe8\xd9"
> +                         "\x0a\x69\x9e\x4c\x20\xff\xf9\xc5"
> +                         "\xfa\xf8\xf3\x7f\xa5\x01\x4b\x5e"
> +                         "\x0f\xf0\x3b\x68\xf0\x46\x8c\x2a"
> +                         "\x7a\xc1\x8f\xa0\xfe\x6a\x5b\x44"
> +                         "\x70\x5c\xcc\x92\x2c\x6f\x0f\xbd"
> +                         "\x25\x3e\xb7\x8e\x73\x58\xda\xc9"
> +                         "\xa5\xaa\x9e\xf3\x9b\xfd\x37\x3e"
> +                         "\xe2\x88\xa4\x7b\xc8\x5c\xa8\x93"
> +                         "\x0e\xe7\x9a\x9c\x2e\x95\x18\x9f"
> +                         "\xc8\x45\x0c\x88\x9e\x53\x4f\x3a"
> +                         "\x76\xc1\x35\xfa\x17\xd8\xac\xa0"
> +                         "\x0c\x2d\x47\x2e\x4f\x69\x9b\xf7"
> +                         "\xd0\xb6\x96\x0c\x19\xb3\x08\x01"
> +                         "\x65\x7a\x1f\xc7\x31\x86\xdb\xc8"
> +                         "\xc1\x99\x8f\xf8\x08\x4a\x9d\x23"
> +                         "\x22\xa8\xcf\x27\x01\x01\x88\x93"
> +                         "\x9c\x86\x45\xbd\xe0\x51\xca\x52"
> +                         "\x84\xba\xfe\x03\xf7\xda\xc5\xce"
> +                         "\x3e\x77\x75\x86\xaf\x84\xc8\x05"
> +                         "\x44\x01\x0f\x02\xf3\x58\xb0\x06"
> +                         "\x5a\xd7\x12\x30\x8d\xdf\x1f\x1f"
> +                         "\x0a\xe6\xd2\xea\xf6\x3a\x7a\x99"
> +                         "\x63\xe8\xd2\xc1\x4a\x45\x8b\x40"
> +                         "\x4d\x0a\xa9\x76\x92\xb3\xda\x87"
> +                         "\x36\x33\xf0\x78\xc3\x2f\x5f\x02"
> +                         "\x1a\x6a\x2c\x32\xcd\x76\xbf\xbd"
> +                         "\x5a\x26\x20\x28\x8c\x8c\xbc\x52"
> +                         "\x3d\x0a\xc9\xcb\xab\xa4\x21\xb0"
> +                         "\x54\x40\x81\x44\xc7\xd6\x1c\x11"
> +                         "\x44\xc6\x02\x92\x14\x5a\xbf\x1a"
> +                         "\x09\x8a\x18\xad\xcd\x64\x3d\x53"
> +                         "\x4a\xb6\xa5\x1b\x57\x0e\xef\xe0"
> +                         "\x8c\x44\x5f\x7d\xbd\x6c\xfd\x60"
> +                         "\xae\x02\x24\xb6\x99\xdd\x8c\xaf"
> +                         "\x59\x39\x75\x3c\xd1\x54\x7b\x86"
> +                         "\xcc\x99\xd9\x28\x0c\xb0\x94\x62"
> +                         "\xf9\x51\xd1\x19\x96\x2d\x66\xf5"
> +                         "\x55\xcf\x9e\x59\xe2\x6b\x2c\x08"
> +                         "\xc0\x54\x48\x24\x45\xc3\x8c\x73"
> +                         "\xea\x27\x6e\x66\x7d\x1d\x0e\x6e"
> +                         "\x13\xe8\x56\x65\x3a\xb0\x81\x5c"
> +                         "\xf0\xe8\xd8\x00\x6b\xcd\x8f\xad"
> +                         "\xdd\x53\xf3\xa4\x6c\x43\xd6\x31"
> +                         "\xaf\xd2\x76\x1e\x91\x12\xdb\x3c"
> +                         "\x8c\xc2\x81\xf0\x49\xdb\xe2\x6b"
> +                         "\x76\x62\x0a\x04\xe4\xaa\x8a\x7c"
> +                         "\x08\x0b\x5d\xd0\xee\x1d\xfb\xc4"
> +                         "\x02\x75\x42\xd6\xba\xa7\x22\xa8"
> +                         "\x47\x29\xb7\x85\x6d\x93\x3a\xdb"
> +                         "\x00\x53\x0b\xa2\xeb\xf8\xfe\x01"
> +                         "\x6f\x8a\x31\xd6\x17\x05\x6f\x67"
> +                         "\x88\x95\x32\xfe\x4f\xa6\x4b\xf8"
> +                         "\x03\xe4\xcd\x9a\x18\xe8\x4e\x2d"
> +                         "\xf7\x97\x9a\x0c\x7d\x9f\x7e\x44"
> +                         "\x69\x51\xe0\x32\x6b\x62\x86\x8f"
> +                         "\xa6\x8e\x0b\x21\x96\xe5\xaf\x77"
> +                         "\xc0\x83\xdf\xa5\x0e\xd0\xa1\x04"
> +                         "\xaf\xc1\x10\xcb\x5a\x40\xe4\xe3"
> +                         "\x38\x7e\x07\xe8\x4d\xfa\xed\xc5"
> +                         "\xf0\x37\xdf\xbb\x8a\xcf\x3d\xdc"
> +                         "\x61\xd2\xc6\x2b\xff\x07\xc9\x2f"
> +                         "\x0c\x2d\x5c\x07\xa8\x35\x6a\xfc"
> +                         "\xae\x09\x03\x45\x74\x51\x4d\xc4"
> +                         "\xb8\x23\x87\x4a\x99\x27\x20\x87"
> +                         "\x62\x44\x0a\x4a\xce\x78\x47\x22",
> +               .ksize  = 1088,
> +               .plaintext      = "\x8e\xb0\x4c\xde\x9c\x4a\x04\x5a"
> +                         "\xf6\xa9\x7f\x45\x25\xa5\x7b\x3a"
> +                         "\xbc\x4d\x73\x39\x81\xb5\xbd\x3d"
> +                         "\x21\x6f\xd7\x37\x50\x3c\x7b\x28"
> +                         "\xd1\x03\x3a\x17\xed\x7b\x7c\x2a"
> +                         "\x16\xbc\xdf\x19\x89\x52\x71\x31"
> +                         "\xb6\xc0\xfd\xb5\xd3\xba\x96\x99"
> +                         "\xb6\x34\x0b\xd0\x99\x93\xfc\x1a"
> +                         "\x01\x3c\x85\xc6\x9b\x78\x5c\x8b"
> +                         "\xfe\xae\xd2\xbf\xb2\x6f\xf9\xed"
> +                         "\xc8\x25\x17\xfe\x10\x3b\x7d\xda"
> +                         "\xf4\x8d\x35\x4b\x7c\x7b\x82\xe7"
> +                         "\xc2\xb3\xee\x60\x4a\x03\x86\xc9"
> +                         "\x4e\xb5\xc4\xbe\xd2\xbd\x66\xf1"
> +                         "\x13\xf1\x09\xab\x5d\xca\x63\x1f"
> +                         "\xfc\xfb\x57\x2a\xfc\xca\x66\xd8"
> +                         "\x77\x84\x38\x23\x1d\xac\xd3\xb3"
> +                         "\x7a\xad\x4c\x70\xfa\x9c\xc9\x61"
> +                         "\xa6\x1b\xba\x33\x4b\x4e\x33\xec"
> +                         "\xa0\xa1\x64\x39\x40\x05\x1c\xc2"
> +                         "\x3f\x49\x9d\xae\xf2\xc5\xf2\xc5"
> +                         "\xfe\xe8\xf4\xc2\xf9\x96\x2d\x28"
> +                         "\x92\x30\x44\xbc\xd2\x7f\xe1\x6e"
> +                         "\x62\x02\x8f\x3d\x1c\x80\xda\x0e"
> +                         "\x6a\x90\x7e\x75\xff\xec\x3e\xc4"
> +                         "\xcd\x16\x34\x3b\x05\x6d\x4d\x20"
> +                         "\x1c\x7b\xf5\x57\x4f\xfa\x3d\xac"
> +                         "\xd0\x13\x55\xe8\xb3\xe1\x1b\x78"
> +                         "\x30\xe6\x9f\x84\xd4\x69\xd1\x08"
> +                         "\x12\x77\xa7\x4a\xbd\xc0\xf2\xd2"
> +                         "\x78\xdd\xa3\x81\x12\xcb\x6c\x14"
> +                         "\x90\x61\xe2\x84\xc6\x2b\x16\xcc"
> +                         "\x40\x99\x50\x88\x01\x09\x64\x4f"
> +                         "\x0a\x80\xbe\x61\xae\x46\xc9\x0a"
> +                         "\x5d\xe0\xfb\x72\x7a\x1a\xdd\x61"
> +                         "\x63\x20\x05\xa0\x4a\xf0\x60\x69"
> +                         "\x7f\x92\xbc\xbf\x4e\x39\x4d\xdd"
> +                         "\x74\xd1\xb7\xc0\x5a\x34\xb7\xae"
> +                         "\x76\x65\x2e\xbc\x36\xb9\x04\x95"
> +                         "\x42\xe9\x6f\xca\x78\xb3\x72\x07"
> +                         "\xa3\xba\x02\x94\x67\x4c\xb1\xd7"
> +                         "\xe9\x30\x0d\xf0\x3b\xb8\x10\x6d"
> +                         "\xea\x2b\x21\xbf\x74\x59\x82\x97"
> +                         "\x85\xaa\xf1\xd7\x54\x39\xeb\x05"
> +                         "\xbd\xf3\x40\xa0\x97\xe6\x74\xfe"
> +                         "\xb4\x82\x5b\xb1\x36\xcb\xe8\x0d"
> +                         "\xce\x14\xd9\xdf\xf1\x94\x22\xcd"
> +                         "\xd6\x00\xba\x04\x4c\x05\x0c\xc0"
> +                         "\xd1\x5a\xeb\x52\xd5\xa8\x8e\xc8"
> +                         "\x97\xa1\xaa\xc1\xea\xc1\xbe\x7c"
> +                         "\x36\xb3\x36\xa0\xc6\x76\x66\xc5"
> +                         "\xe2\xaf\xd6\x5c\xe2\xdb\x2c\xb3"
> +                         "\x6c\xb9\x99\x7f\xff\x9f\x03\x24"
> +                         "\xe1\x51\x44\x66\xd8\x0c\x5d\x7f"
> +                         "\x5c\x85\x22\x2a\xcf\x6d\x79\x28"
> +                         "\xab\x98\x01\x72\xfe\x80\x87\x5f"
> +                         "\x46\xba\xef\x81\x24\xee\xbf\xb0"
> +                         "\x24\x74\xa3\x65\x97\x12\xc4\xaf"
> +                         "\x8b\xa0\x39\xda\x8a\x7e\x74\x6e"
> +                         "\x1b\x42\xb4\x44\x37\xfc\x59\xfd"
> +                         "\x86\xed\xfb\x8c\x66\x33\xda\x63"
> +                         "\x75\xeb\xe1\xa4\x85\x4f\x50\x8f"
> +                         "\x83\x66\x0d\xd3\x37\xfa\xe6\x9c"
> +                         "\x4f\x30\x87\x35\x18\xe3\x0b\xb7"
> +                         "\x6e\x64\x54\xcd\x70\xb3\xde\x54"
> +                         "\xb7\x1d\xe6\x4c\x4d\x55\x12\x12"
> +                         "\xaf\x5f\x7f\x5e\xee\x9d\xe8\x8e"
> +                         "\x32\x9d\x4e\x75\xeb\xc6\xdd\xaa"
> +                         "\x48\x82\xa4\x3f\x3c\xd7\xd3\xa8"
> +                         "\x63\x9e\x64\xfe\xe3\x97\x00\x62"
> +                         "\xe5\x40\x5d\xc3\xad\x72\xe1\x28"
> +                         "\x18\x50\xb7\x75\xef\xcd\x23\xbf"
> +                         "\x3f\xc0\x51\x36\xf8\x41\xc3\x08"
> +                         "\xcb\xf1\x8d\x38\x34\xbd\x48\x45"
> +                         "\x75\xed\xbc\x65\x7b\xb5\x0c\x9b"
> +                         "\xd7\x67\x7d\x27\xb4\xc4\x80\xd7"
> +                         "\xa9\xb9\xc7\x4a\x97\xaa\xda\xc8"
> +                         "\x3c\x74\xcf\x36\x8f\xe4\x41\xe3"
> +                         "\xd4\xd3\x26\xa7\xf3\x23\x9d\x8f"
> +                         "\x6c\x20\x05\x32\x3e\xe0\xc3\xc8"
> +                         "\x56\x3f\xa7\x09\xb7\xfb\xc7\xf7"
> +                         "\xbe\x2a\xdd\x0f\x06\x7b\x0d\xdd"
> +                         "\xb0\xb4\x86\x17\xfd\xb9\x04\xe5"
> +                         "\xc0\x64\x5d\xad\x2a\x36\x38\xdb"
> +                         "\x24\xaf\x5b\xff\xca\xf9\x41\xe8"
> +                         "\xf9\x2f\x1e\x5e\xf9\xf5\xd5\xf2"
> +                         "\xb2\x88\xca\xc9\xa1\x31\xe2\xe8"
> +                         "\x10\x95\x65\xbf\xf1\x11\x61\x7a"
> +                         "\x30\x1a\x54\x90\xea\xd2\x30\xf6"
> +                         "\xa5\xad\x60\xf9\x4d\x84\x21\x1b"
> +                         "\xe4\x42\x22\xc8\x12\x4b\xb0\x58"
> +                         "\x3e\x9c\x2d\x32\x95\x0a\x8e\xb0"
> +                         "\x0a\x7e\x77\x2f\xe8\x97\x31\x6a"
> +                         "\xf5\x59\xb4\x26\xe6\x37\x12\xc9"
> +                         "\xcb\xa0\x58\x33\x6f\xd5\x55\x55"
> +                         "\x3c\xa1\x33\xb1\x0b\x7e\x2e\xb4"
> +                         "\x43\x2a\x84\x39\xf0\x9c\xf4\x69"
> +                         "\x4f\x1e\x79\xa6\x15\x1b\x87\xbb"
> +                         "\xdb\x9b\xe0\xf1\x0b\xba\xe3\x6e"
> +                         "\xcc\x2f\x49\x19\x22\x29\xfc\x71"
> +                         "\xbb\x77\x38\x18\x61\xaf\x85\x76"
> +                         "\xeb\xd1\x09\xcc\x86\x04\x20\x9a"
> +                         "\x66\x53\x2f\x44\x8b\xc6\xa3\xd2"
> +                         "\x5f\xc7\x79\x82\x66\xa8\x6e\x75"
> +                         "\x7d\x94\xd1\x86\x75\x0f\xa5\x4f"
> +                         "\x3c\x7a\x33\xce\xd1\x6e\x9d\x7b"
> +                         "\x1f\x91\x37\xb8\x37\x80\xfb\xe0"
> +                         "\x52\x26\xd0\x9a\xd4\x48\x02\x41"
> +                         "\x05\xe3\x5a\x94\xf1\x65\x61\x19"
> +                         "\xb8\x88\x4e\x2b\xea\xba\x8b\x58"
> +                         "\x8b\x42\x01\x00\xa8\xfe\x00\x5c"
> +                         "\xfe\x1c\xee\x31\x15\x69\xfa\xb3"
> +                         "\x9b\x5f\x22\x8e\x0d\x2c\xe3\xa5"
> +                         "\x21\xb9\x99\x8a\x8e\x94\x5a\xef"
> +                         "\x13\x3e\x99\x96\x79\x6e\xd5\x42"
> +                         "\x36\x03\xa9\xe2\xca\x65\x4e\x8a"
> +                         "\x8a\x30\xd2\x7d\x74\xe7\xf0\xaa"
> +                         "\x23\x26\xdd\xcb\x82\x39\xfc\x9d"
> +                         "\x51\x76\x21\x80\xa2\xbe\x93\x03"
> +                         "\x47\xb0\xc1\xb6\xdc\x63\xfd\x9f"
> +                         "\xca\x9d\xa5\xca\x27\x85\xe2\xd8"
> +                         "\x15\x5b\x7e\x14\x7a\xc4\x89\xcc"
> +                         "\x74\x14\x4b\x46\xd2\xce\xac\x39"
> +                         "\x6b\x6a\x5a\xa4\x0e\xe3\x7b\x15"
> +                         "\x94\x4b\x0f\x74\xcb\x0c\x7f\xa9"
> +                         "\xbe\x09\x39\xa3\xdd\x56\x5c\xc7"
> +                         "\x99\x56\x65\x39\xf4\x0b\x7d\x87"
> +                         "\xec\xaa\xe3\x4d\x22\x65\x39\x4e",
> +               .psize  = 1024,
> +               .digest = "\x64\x3a\xbc\xc3\x3f\x74\x40\x51"
> +                         "\x6e\x56\x01\x1a\x51\xec\x36\xde",
> +               .np     = 8,
> +               .tap    = { 64, 203, 267, 28, 263, 62, 54, 83 },
> +       }, {
> +               .key    = "\x1b\x82\x2e\x1b\x17\x23\xb9\x6d"
> +                         "\xdc\x9c\xda\x99\x07\xe3\x5f\xd8"
> +                         "\xd2\xf8\x43\x80\x8d\x86\x7d\x80"
> +                         "\x1a\xd0\xcc\x13\xb9\x11\x05\x3f"
> +                         "\x7e\xcf\x7e\x80\x0e\xd8\x25\x48"
> +                         "\x8b\xaa\x63\x83\x92\xd0\x72\xf5"
> +                         "\x4f\x67\x7e\x50\x18\x25\xa4\xd1"
> +                         "\xe0\x7e\x1e\xba\xd8\xa7\x6e\xdb"
> +                         "\x1a\xcc\x0d\xfe\x9f\x6d\x22\x35"
> +                         "\xe1\xe6\xe0\xa8\x7b\x9c\xb1\x66"
> +                         "\xa3\xf8\xff\x4d\x90\x84\x28\xbc"
> +                         "\xdc\x19\xc7\x91\x49\xfc\xf6\x33"
> +                         "\xc9\x6e\x65\x7f\x28\x6f\x68\x2e"
> +                         "\xdf\x1a\x75\xe9\xc2\x0c\x96\xb9"
> +                         "\x31\x22\xc4\x07\xc6\x0a\x2f\xfd"
> +                         "\x36\x06\x5f\x5c\xc5\xb1\x3a\xf4"
> +                         "\x5e\x48\xa4\x45\x2b\x88\xa7\xee"
> +                         "\xa9\x8b\x52\xcc\x99\xd9\x2f\xb8"
> +                         "\xa4\x58\x0a\x13\xeb\x71\x5a\xfa"
> +                         "\xe5\x5e\xbe\xf2\x64\xad\x75\xbc"
> +                         "\x0b\x5b\x34\x13\x3b\x23\x13\x9a"
> +                         "\x69\x30\x1e\x9a\xb8\x03\xb8\x8b"
> +                         "\x3e\x46\x18\x6d\x38\xd9\xb3\xd8"
> +                         "\xbf\xf1\xd0\x28\xe6\x51\x57\x80"
> +                         "\x5e\x99\xfb\xd0\xce\x1e\x83\xf7"
> +                         "\xe9\x07\x5a\x63\xa9\xef\xce\xa5"
> +                         "\xfb\x3f\x37\x17\xfc\x0b\x37\x0e"
> +                         "\xbb\x4b\x21\x62\xb7\x83\x0e\xa9"
> +                         "\x9e\xb0\xc4\xad\x47\xbe\x35\xe7"
> +                         "\x51\xb2\xf2\xac\x2b\x65\x7b\x48"
> +                         "\xe3\x3f\x5f\xb6\x09\x04\x0c\x58"
> +                         "\xce\x99\xa9\x15\x2f\x4e\xc1\xf2"
> +                         "\x24\x48\xc0\xd8\x6c\xd3\x76\x17"
> +                         "\x83\x5d\xe6\xe3\xfd\x01\x8e\xf7"
> +                         "\x42\xa5\x04\x29\x30\xdf\xf9\x00"
> +                         "\x4a\xdc\x71\x22\x1a\x33\x15\xb6"
> +                         "\xd7\x72\xfb\x9a\xb8\xeb\x2b\x38"
> +                         "\xea\xa8\x61\xa8\x90\x11\x9d\x73"
> +                         "\x2e\x6c\xce\x81\x54\x5a\x9f\xcd"
> +                         "\xcf\xd5\xbd\x26\x5d\x66\xdb\xfb"
> +                         "\xdc\x1e\x7c\x10\xfe\x58\x82\x10"
> +                         "\x16\x24\x01\xce\x67\x55\x51\xd1"
> +                         "\xdd\x6b\x44\xa3\x20\x8e\xa9\xa6"
> +                         "\x06\xa8\x29\x77\x6e\x00\x38\x5b"
> +                         "\xde\x4d\x58\xd8\x1f\x34\xdf\xf9"
> +                         "\x2c\xac\x3e\xad\xfb\x92\x0d\x72"
> +                         "\x39\xa4\xac\x44\x10\xc0\x43\xc4"
> +                         "\xa4\x77\x3b\xfc\xc4\x0d\x37\xd3"
> +                         "\x05\x84\xda\x53\x71\xf8\x80\xd3"
> +                         "\x34\x44\xdb\x09\xb4\x2b\x8e\xe3"
> +                         "\x00\x75\x50\x9e\x43\x22\x00\x0b"
> +                         "\x7c\x70\xab\xd4\x41\xf1\x93\xcd"
> +                         "\x25\x2d\x84\x74\xb5\xf2\x92\xcd"
> +                         "\x0a\x28\xea\x9a\x49\x02\x96\xcb"
> +                         "\x85\x9e\x2f\x33\x03\x86\x1d\xdc"
> +                         "\x1d\x31\xd5\xfc\x9d\xaa\xc5\xe9"
> +                         "\x9a\xc4\x57\xf5\x35\xed\xf4\x4b"
> +                         "\x3d\x34\xc2\x29\x13\x86\x36\x42"
> +                         "\x5d\xbf\x90\x86\x13\x77\xe5\xc3"
> +                         "\x62\xb4\xfe\x0b\x70\x39\x35\x65"
> +                         "\x02\xea\xf6\xce\x57\x0c\xbb\x74"
> +                         "\x29\xe3\xfd\x60\x90\xfd\x10\x38"
> +                         "\xd5\x4e\x86\xbd\x37\x70\xf0\x97"
> +                         "\xa6\xab\x3b\x83\x64\x52\xca\x66"
> +                         "\x2f\xf9\xa4\xca\x3a\x55\x6b\xb0"
> +                         "\xe8\x3a\x34\xdb\x9e\x48\x50\x2f"
> +                         "\x3b\xef\xfd\x08\x2d\x5f\xc1\x37"
> +                         "\x5d\xbe\x73\xe4\xd8\xe9\xac\xca"
> +                         "\x8a\xaa\x48\x7c\x5c\xf4\xa6\x96"
> +                         "\x5f\xfa\x70\xa6\xb7\x8b\x50\xcb"
> +                         "\xa6\xf5\xa9\xbd\x7b\x75\x4c\x22"
> +                         "\x0b\x19\x40\x2e\xc9\x39\x39\x32"
> +                         "\x83\x03\xa8\xa4\x98\xe6\x8e\x16"
> +                         "\xb9\xde\x08\xc5\xfc\xbf\xad\x39"
> +                         "\xa8\xc7\x93\x6c\x6f\x23\xaf\xc1"
> +                         "\xab\xe1\xdf\xbb\x39\xae\x93\x29"
> +                         "\x0e\x7d\x80\x8d\x3e\x65\xf3\xfd"
> +                         "\x96\x06\x65\x90\xa1\x28\x64\x4b"
> +                         "\x69\xf9\xa8\x84\x27\x50\xfc\x87"
> +                         "\xf7\xbf\x55\x8e\x56\x13\x58\x7b"
> +                         "\x85\xb4\x6a\x72\x0f\x40\xf1\x4f"
> +                         "\x83\x81\x1f\x76\xde\x15\x64\x7a"
> +                         "\x7a\x80\xe4\xc7\x5e\x63\x01\x91"
> +                         "\xd7\x6b\xea\x0b\x9b\xa2\x99\x3b"
> +                         "\x6c\x88\xd8\xfd\x59\x3c\x8d\x22"
> +                         "\x86\x56\xbe\xab\xa1\x37\x08\x01"
> +                         "\x50\x85\x69\x29\xee\x9f\xdf\x21"
> +                         "\x3e\x20\x20\xf5\xb0\xbb\x6b\xd0"
> +                         "\x9c\x41\x38\xec\x54\x6f\x2d\xbd"
> +                         "\x0f\xe1\xbd\xf1\x2b\x6e\x60\x56"
> +                         "\x29\xe5\x7a\x70\x1c\xe2\xfc\x97"
> +                         "\x82\x68\x67\xd9\x3d\x1f\xfb\xd8"
> +                         "\x07\x9f\xbf\x96\x74\xba\x6a\x0e"
> +                         "\x10\x48\x20\xd8\x13\x1e\xb5\x44"
> +                         "\xf2\xcc\xb1\x8b\xfb\xbb\xec\xd7"
> +                         "\x37\x70\x1f\x7c\x55\xd2\x4b\xb9"
> +                         "\xfd\x70\x5e\xa3\x91\x73\x63\x52"
> +                         "\x13\x47\x5a\x06\xfb\x01\x67\xa5"
> +                         "\xc0\xd0\x49\x19\x56\x66\x9a\x77"
> +                         "\x64\xaf\x8c\x25\x91\x52\x87\x0e"
> +                         "\x18\xf3\x5f\x97\xfd\x71\x13\xf8"
> +                         "\x05\xa5\x39\xcc\x65\xd3\xcc\x63"
> +                         "\x5b\xdb\x5f\x7e\x5f\x6e\xad\xc4"
> +                         "\xf4\xa0\xc5\xc2\x2b\x4d\x97\x38"
> +                         "\x4f\xbc\xfa\x33\x17\xb4\x47\xb9"
> +                         "\x43\x24\x15\x8d\xd2\xed\x80\x68"
> +                         "\x84\xdb\x04\x80\xca\x5e\x6a\x35"
> +                         "\x2c\x2c\xe7\xc5\x03\x5f\x54\xb0"
> +                         "\x5e\x4f\x1d\x40\x54\x3d\x78\x9a"
> +                         "\xac\xda\x80\x27\x4d\x15\x4c\x1a"
> +                         "\x6e\x80\xc9\xc4\x3b\x84\x0e\xd9"
> +                         "\x2e\x93\x01\x8c\xc3\xc8\x91\x4b"
> +                         "\xb3\xaa\x07\x04\x68\x5b\x93\xa5"
> +                         "\xe7\xc4\x9d\xe7\x07\xee\xf5\x3b"
> +                         "\x40\x89\xcc\x60\x34\x9d\xb4\x06"
> +                         "\x1b\xef\x92\xe6\xc1\x2a\x7d\x0f"
> +                         "\x81\xaa\x56\xe3\xd7\xed\xa7\xd4"
> +                         "\xa7\x3a\x49\xc4\xad\x81\x5c\x83"
> +                         "\x55\x8e\x91\x54\xb7\x7d\x65\xa5"
> +                         "\x06\x16\xd5\x9a\x16\xc1\xb0\xa2"
> +                         "\x06\xd8\x98\x47\x73\x7e\x73\xa0"
> +                         "\xb8\x23\xb1\x52\xbf\x68\x74\x5d"
> +                         "\x0b\xcb\xfa\x8c\x46\xe3\x24\xe6"
> +                         "\xab\xd4\x69\x8d\x8c\xf2\x8a\x59"
> +                         "\xbe\x48\x46\x50\x8c\x9a\xe8\xe3"
> +                         "\x31\x55\x0a\x06\xed\x4f\xf8\xb7"
> +                         "\x4f\xe3\x85\x17\x30\xbd\xd5\x20"
> +                         "\xe7\x5b\xb2\x32\xcf\x6b\x16\x44"
> +                         "\xd2\xf5\x7e\xd7\xd1\x2f\xee\x64"
> +                         "\x3e\x9d\x10\xef\x27\x35\x43\x64"
> +                         "\x67\xfb\x7a\x7b\xe0\x62\x31\x9a"
> +                         "\x4d\xdf\xa5\xab\xc0\x20\xbb\x01"
> +                         "\xe9\x7b\x54\xf1\xde\xb2\x79\x50"
> +                         "\x6c\x4b\x91\xdb\x7f\xbb\x50\xc1"
> +                         "\x55\x44\x38\x9a\xe0\x9f\xe8\x29"
> +                         "\x6f\x15\xf8\x4e\xa6\xec\xa0\x60",
> +               .ksize  = 1088,
> +               .plaintext      = "\x15\x68\x9e\x2f\xad\x15\x52\xdf"
> +                         "\xf0\x42\x62\x24\x2a\x2d\xea\xbf"
> +                         "\xc7\xf3\xb4\x1a\xf5\xed\xb2\x08"
> +                         "\x15\x60\x1c\x00\x77\xbf\x0b\x0e"
> +                         "\xb7\x2c\xcf\x32\x3a\xc7\x01\x77"
> +                         "\xef\xa6\x75\xd0\x29\xc7\x68\x20"
> +                         "\xb2\x92\x25\xbf\x12\x34\xe9\xa4"
> +                         "\xfd\x32\x7b\x3f\x7c\xbd\xa5\x02"
> +                         "\x38\x41\xde\xc9\xc1\x09\xd9\xfc"
> +                         "\x6e\x78\x22\x83\x18\xf7\x50\x8d"
> +                         "\x8f\x9c\x2d\x02\xa5\x30\xac\xff"
> +                         "\xea\x63\x2e\x80\x37\x83\xb0\x58"
> +                         "\xda\x2f\xef\x21\x55\xba\x7b\xb1"
> +                         "\xb6\xed\xf5\xd2\x4d\xaa\x8c\xa9"
> +                         "\xdd\xdb\x0f\xb4\xce\xc1\x9a\xb1"
> +                         "\xc1\xdc\xbd\xab\x86\xc2\xdf\x0b"
> +                         "\xe1\x2c\xf9\xbe\xf6\xd8\xda\x62"
> +                         "\x72\xdd\x98\x09\x52\xc0\xc4\xb6"
> +                         "\x7b\x17\x5c\xf5\xd8\x4b\x88\xd6"
> +                         "\x6b\xbf\x84\x4a\x3f\xf5\x4d\xd2"
> +                         "\x94\xe2\x9c\xff\xc7\x3c\xd9\xc8"
> +                         "\x37\x38\xbc\x8c\xf3\xe7\xb7\xd0"
> +                         "\x1d\x78\xc4\x39\x07\xc8\x5e\x79"
> +                         "\xb6\x5a\x90\x5b\x6e\x97\xc9\xd4"
> +                         "\x82\x9c\xf3\x83\x7a\xe7\x97\xfc"
> +                         "\x1d\xbb\xef\xdb\xce\xe0\x82\xad"
> +                         "\xca\x07\x6c\x54\x62\x6f\x81\xe6"
> +                         "\x7a\x5a\x96\x6e\x80\x3a\xa2\x37"
> +                         "\x6f\xc6\xa4\x29\xc3\x9e\x19\x94"
> +                         "\x9f\xb0\x3e\x38\xfb\x3c\x2b\x7d"
> +                         "\xaa\xb8\x74\xda\x54\x23\x51\x12"
> +                         "\x4b\x96\x36\x8f\x91\x4f\x19\x37"
> +                         "\x83\xc9\xdd\xc7\x1a\x32\x2d\xab"
> +                         "\xc7\x89\xe2\x07\x47\x6c\xe8\xa6"
> +                         "\x70\x6b\x8e\x0c\xda\x5c\x6a\x59"
> +                         "\x27\x33\x0e\xe1\xe1\x20\xe8\xc8"
> +                         "\xae\xdc\xd0\xe3\x6d\xa8\xa6\x06"
> +                         "\x41\xb4\xd4\xd4\xcf\x91\x3e\x06"
> +                         "\xb0\x9a\xf7\xf1\xaa\xa6\x23\x92"
> +                         "\x10\x86\xf0\x94\xd1\x7c\x2e\x07"
> +                         "\x30\xfb\xc5\xd8\xf3\x12\xa9\xe8"
> +                         "\x22\x1c\x97\x1a\xad\x96\xb0\xa1"
> +                         "\x72\x6a\x6b\xb4\xfd\xf7\xe8\xfa"
> +                         "\xe2\x74\xd8\x65\x8d\x35\x17\x4b"
> +                         "\x00\x23\x5c\x8c\x70\xad\x71\xa2"
> +                         "\xca\xc5\x6c\x59\xbf\xb4\xc0\x6d"
> +                         "\x86\x98\x3e\x19\x5a\x90\x92\xb1"
> +                         "\x66\x57\x6a\x91\x68\x7c\xbc\xf3"
> +                         "\xf1\xdb\x94\xf8\x48\xf1\x36\xd8"
> +                         "\x78\xac\x1c\xa9\xcc\xd6\x27\xba"
> +                         "\x91\x54\x22\xf5\xe6\x05\x3f\xcc"
> +                         "\xc2\x8f\x2c\x3b\x2b\xc3\x2b\x2b"
> +                         "\x3b\xb8\xb6\x29\xb7\x2f\x94\xb6"
> +                         "\x7b\xfc\x94\x3e\xd0\x7a\x41\x59"
> +                         "\x7b\x1f\x9a\x09\xa6\xed\x4a\x82"
> +                         "\x9d\x34\x1c\xbd\x4e\x1c\x3a\x66"
> +                         "\x80\x74\x0e\x9a\x4f\x55\x54\x47"
> +                         "\x16\xba\x2a\x0a\x03\x35\x99\xa3"
> +                         "\x5c\x63\x8d\xa2\x72\x8b\x17\x15"
> +                         "\x68\x39\x73\xeb\xec\xf2\xe8\xf5"
> +                         "\x95\x32\x27\xd6\xc4\xfe\xb0\x51"
> +                         "\xd5\x0c\x50\xc5\xcd\x6d\x16\xb3"
> +                         "\xa3\x1e\x95\x69\xad\x78\x95\x06"
> +                         "\xb9\x46\xf2\x6d\x24\x5a\x99\x76"
> +                         "\x73\x6a\x91\xa6\xac\x12\xe1\x28"
> +                         "\x79\xbc\x08\x4e\x97\x00\x98\x63"
> +                         "\x07\x1c\x4e\xd1\x68\xf3\xb3\x81"
> +                         "\xa8\xa6\x5f\xf1\x01\xc9\xc1\xaf"
> +                         "\x3a\x96\xf9\x9d\xb5\x5a\x5f\x8f"
> +                         "\x7e\xc1\x7e\x77\x0a\x40\xc8\x8e"
> +                         "\xfc\x0e\xed\xe1\x0d\xb0\xe5\x5e"
> +                         "\x5e\x6f\xf5\x7f\xab\x33\x7d\xcd"
> +                         "\xf0\x09\x4b\xb2\x11\x37\xdc\x65"
> +                         "\x97\x32\x62\x71\x3a\x29\x54\xb9"
> +                         "\xc7\xa4\xbf\x75\x0f\xf9\x40\xa9"
> +                         "\x8d\xd7\x8b\xa7\xe0\x9a\xbe\x15"
> +                         "\xc6\xda\xd8\x00\x14\x69\x1a\xaf"
> +                         "\x5f\x79\xc3\xf5\xbb\x6c\x2a\x9d"
> +                         "\xdd\x3c\x5f\x97\x21\xe1\x3a\x03"
> +                         "\x84\x6a\xe9\x76\x11\x1f\xd3\xd5"
> +                         "\xf0\x54\x20\x4d\xc2\x91\xc3\xa4"
> +                         "\x36\x25\xbe\x1b\x2a\x06\xb7\xf3"
> +                         "\xd1\xd0\x55\x29\x81\x4c\x83\xa3"
> +                         "\xa6\x84\x1e\x5c\xd1\xd0\x6c\x90"
> +                         "\xa4\x11\xf0\xd7\x63\x6a\x48\x05"
> +                         "\xbc\x48\x18\x53\xcd\xb0\x8d\xdb"
> +                         "\xdc\xfe\x55\x11\x5c\x51\xb3\xab"
> +                         "\xab\x63\x3e\x31\x5a\x8b\x93\x63"
> +                         "\x34\xa9\xba\x2b\x69\x1a\xc0\xe3"
> +                         "\xcb\x41\xbc\xd7\xf5\x7f\x82\x3e"
> +                         "\x01\xa3\x3c\x72\xf4\xfe\xdf\xbe"
> +                         "\xb1\x67\x17\x2b\x37\x60\x0d\xca"
> +                         "\x6f\xc3\x94\x2c\xd2\x92\x6d\x9d"
> +                         "\x75\x18\x77\xaa\x29\x38\x96\xed"
> +                         "\x0e\x20\x70\x92\xd5\xd0\xb4\x00"
> +                         "\xc0\x31\xf2\xc9\x43\x0e\x75\x1d"
> +                         "\x4b\x64\xf2\x1f\xf2\x29\x6c\x7b"
> +                         "\x7f\xec\x59\x7d\x8c\x0d\xd4\xd3"
> +                         "\xac\x53\x4c\xa3\xde\x42\x92\x95"
> +                         "\x6d\xa3\x4f\xd0\xe6\x3d\xe7\xec"
> +                         "\x7a\x4d\x68\xf1\xfe\x67\x66\x09"
> +                         "\x83\x22\xb1\x98\x43\x8c\xab\xb8"
> +                         "\x45\xe6\x6d\xdf\x5e\x50\x71\xce"
> +                         "\xf5\x4e\x40\x93\x2b\xfa\x86\x0e"
> +                         "\xe8\x30\xbd\x82\xcc\x1c\x9c\x5f"
> +                         "\xad\xfd\x08\x31\xbe\x52\xe7\xe6"
> +                         "\xf2\x06\x01\x62\x25\x15\x99\x74"
> +                         "\x33\x51\x52\x57\x3f\x57\x87\x61"
> +                         "\xb9\x7f\x29\x3d\xcd\x92\x5e\xa6"
> +                         "\x5c\x3b\xf1\xed\x5f\xeb\x82\xed"
> +                         "\x56\x7b\x61\xe7\xfd\x02\x47\x0e"
> +                         "\x2a\x15\xa4\xce\x43\x86\x9b\xe1"
> +                         "\x2b\x4c\x2a\xd9\x42\x97\xf7\x9a"
> +                         "\xe5\x47\x46\x48\xd3\x55\x6f\x4d"
> +                         "\xd9\xeb\x4b\xdd\x7b\x21\x2f\xb3"
> +                         "\xa8\x36\x28\xdf\xca\xf1\xf6\xd9"
> +                         "\x10\xf6\x1c\xfd\x2e\x0c\x27\xe0"
> +                         "\x01\xb3\xff\x6d\x47\x08\x4d\xd4"
> +                         "\x00\x25\xee\x55\x4a\xe9\xe8\x5b"
> +                         "\xd8\xf7\x56\x12\xd4\x50\xb2\xe5"
> +                         "\x51\x6f\x34\x63\x69\xd2\x4e\x96"
> +                         "\x4e\xbc\x79\xbf\x18\xae\xc6\x13"
> +                         "\x80\x92\x77\xb0\xb4\x0f\x29\x94"
> +                         "\x6f\x4c\xbb\x53\x11\x36\xc3\x9f"
> +                         "\x42\x8e\x96\x8a\x91\xc8\xe9\xfc"
> +                         "\xfe\xbf\x7c\x2d\x6f\xf9\xb8\x44"
> +                         "\x89\x1b\x09\x53\x0a\x2a\x92\xc3"
> +                         "\x54\x7a\x3a\xf9\xe2\xe4\x75\x87"
> +                         "\xa0\x5e\x4b\x03\x7a\x0d\x8a\xf4"
> +                         "\x55\x59\x94\x2b\x63\x96\x0e\xf5",
> +               .psize  = 1040,
> +               .digest = "\xb5\xb9\x08\xb3\x24\x3e\x03\xf0"
> +                         "\xd6\x0b\x57\xbc\x0a\x6d\x89\x59",
> +       }, {
> +               .key    = "\xf6\x34\x42\x71\x35\x52\x8b\x58"
> +                         "\x02\x3a\x8e\x4a\x8d\x41\x13\xe9"
> +                         "\x7f\xba\xb9\x55\x9d\x73\x4d\xf8"
> +                         "\x3f\x5d\x73\x15\xff\xd3\x9e\x7f"
> +                         "\x20\x2a\x6a\xa8\xd1\xf0\x8f\x12"
> +                         "\x6b\x02\xd8\x6c\xde\xba\x80\x22"
> +                         "\x19\x37\xc8\xd0\x4e\x89\x17\x7c"
> +                         "\x7c\xdd\x88\xfd\x41\xc0\x04\xb7"
> +                         "\x1d\xac\x19\xe3\x20\xc7\x16\xcf"
> +                         "\x58\xee\x1d\x7a\x61\x69\xa9\x12"
> +                         "\x4b\xef\x4f\xb6\x38\xdd\x78\xf8"
> +                         "\x28\xee\x70\x08\xc7\x7c\xcc\xc8"
> +                         "\x1e\x41\xf5\x80\x86\x70\xd0\xf0"
> +                         "\xa3\x87\x6b\x0a\x00\xd2\x41\x28"
> +                         "\x74\x26\xf1\x24\xf3\xd0\x28\x77"
> +                         "\xd7\xcd\xf6\x2d\x61\xf4\xa2\x13"
> +                         "\x77\xb4\x6f\xa0\xf4\xfb\xd6\xb5"
> +                         "\x38\x9d\x5a\x0c\x51\xaf\xad\x63"
> +                         "\x27\x67\x8c\x01\xea\x42\x1a\x66"
> +                         "\xda\x16\x7c\x3c\x30\x0c\x66\x53"
> +                         "\x1c\x88\xa4\x5c\xb2\xe3\x78\x0a"
> +                         "\x13\x05\x6d\xe2\xaf\xb3\xe4\x75"
> +                         "\x00\x99\x58\xee\x76\x09\x64\xaa"
> +                         "\xbb\x2e\xb1\x81\xec\xd8\x0e\xd3"
> +                         "\x0c\x33\x5d\xb7\x98\xef\x36\xb6"
> +                         "\xd2\x65\x69\x41\x70\x12\xdc\x25"
> +                         "\x41\x03\x99\x81\x41\x19\x62\x13"
> +                         "\xd1\x0a\x29\xc5\x8c\xe0\x4c\xf3"
> +                         "\xd6\xef\x4c\xf4\x1d\x83\x2e\x6d"
> +                         "\x8e\x14\x87\xed\x80\xe0\xaa\xd3"
> +                         "\x08\x04\x73\x1a\x84\x40\xf5\x64"
> +                         "\xbd\x61\x32\x65\x40\x42\xfb\xb0"
> +                         "\x40\xf6\x40\x8d\xc7\x7f\x14\xd0"
> +                         "\x83\x99\xaa\x36\x7e\x60\xc6\xbf"
> +                         "\x13\x8a\xf9\x21\xe4\x7e\x68\x87"
> +                         "\xf3\x33\x86\xb4\xe0\x23\x7e\x0a"
> +                         "\x21\xb1\xf5\xad\x67\x3c\x9c\x9d"
> +                         "\x09\xab\xaf\x5f\xba\xe0\xd0\x82"
> +                         "\x48\x22\x70\xb5\x6d\x53\xd6\x0e"
> +                         "\xde\x64\x92\x41\xb0\xd3\xfb\xda"
> +                         "\x21\xfe\xab\xea\x20\xc4\x03\x58"
> +                         "\x18\x2e\x7d\x2f\x03\xa9\x47\x66"
> +                         "\xdf\x7b\xa4\x6b\x34\x6b\x55\x9c"
> +                         "\x4f\xd7\x9c\x47\xfb\xa9\x42\xec"
> +                         "\x5a\x12\xfd\xfe\x76\xa0\x92\x9d"
> +                         "\xfe\x1e\x16\xdd\x24\x2a\xe4\x27"
> +                         "\xd5\xa9\xf2\x05\x4f\x83\xa2\xaf"
> +                         "\xfe\xee\x83\x7a\xad\xde\xdf\x9a"
> +                         "\x80\xd5\x81\x14\x93\x16\x7e\x46"
> +                         "\x47\xc2\x14\xef\x49\x6e\xb9\xdb"
> +                         "\x40\xe8\x06\x6f\x9c\x2a\xfd\x62"
> +                         "\x06\x46\xfd\x15\x1d\x36\x61\x6f"
> +                         "\x77\x77\x5e\x64\xce\x78\x1b\x85"
> +                         "\xbf\x50\x9a\xfd\x67\xa6\x1a\x65"
> +                         "\xad\x5b\x33\x30\xf1\x71\xaa\xd9"
> +                         "\x23\x0d\x92\x24\x5f\xae\x57\xb0"
> +                         "\x24\x37\x0a\x94\x12\xfb\xb5\xb1"
> +                         "\xd3\xb8\x1d\x12\x29\xb0\x80\x24"
> +                         "\x2d\x47\x9f\x96\x1f\x95\xf1\xb1"
> +                         "\xda\x35\xf6\x29\xe0\xe1\x23\x96"
> +                         "\xc7\xe8\x22\x9b\x7c\xac\xf9\x41"
> +                         "\x39\x01\xe5\x73\x15\x5e\x99\xec"
> +                         "\xb4\xc1\xf4\xe7\xa7\x97\x6a\xd5"
> +                         "\x90\x9a\xa0\x1d\xf3\x5a\x8b\x5f"
> +                         "\xdf\x01\x52\xa4\x93\x31\x97\xb0"
> +                         "\x93\x24\xb5\xbc\xb2\x14\x24\x98"
> +                         "\x4a\x8f\x19\x85\xc3\x2d\x0f\x74"
> +                         "\x9d\x16\x13\x80\x5e\x59\x62\x62"
> +                         "\x25\xe0\xd1\x2f\x64\xef\xba\xac"
> +                         "\xcd\x09\x07\x15\x8a\xcf\x73\xb5"
> +                         "\x8b\xc9\xd8\x24\xb0\x53\xd5\x6f"
> +                         "\xe1\x2b\x77\xb1\xc5\xe4\xa7\x0e"
> +                         "\x18\x45\xab\x36\x03\x59\xa8\xbd"
> +                         "\x43\xf0\xd8\x2c\x1a\x69\x96\xbb"
> +                         "\x13\xdf\x6c\x33\x77\xdf\x25\x34"
> +                         "\x5b\xa5\x5b\x8c\xf9\x51\x05\xd4"
> +                         "\x8b\x8b\x44\x87\x49\xfc\xa0\x8f"
> +                         "\x45\x15\x5b\x40\x42\xc4\x09\x92"
> +                         "\x98\x0c\x4d\xf4\x26\x37\x1b\x13"
> +                         "\x76\x01\x93\x8d\x4f\xe6\xed\x18"
> +                         "\xd0\x79\x7b\x3f\x44\x50\xcb\xee"
> +                         "\xf7\x4a\xc9\x9e\xe0\x96\x74\xa7"
> +                         "\xe6\x93\xb2\x53\xca\x55\xa8\xdc"
> +                         "\x1e\x68\x07\x87\xb7\x2e\xc1\x08"
> +                         "\xb2\xa4\x5b\xaf\xc6\xdb\x5c\x66"
> +                         "\x41\x1c\x51\xd9\xb0\x07\x00\x0d"
> +                         "\xf0\x4c\xdc\x93\xde\xa9\x1e\x8e"
> +                         "\xd3\x22\x62\xd8\x8b\x88\x2c\xea"
> +                         "\x5e\xf1\x6e\x14\x40\xc7\xbe\xaa"
> +                         "\x42\x28\xd0\x26\x30\x78\x01\x9b"
> +                         "\x83\x07\xbc\x94\xc7\x57\xa2\x9f"
> +                         "\x03\x07\xff\x16\xff\x3c\x6e\x48"
> +                         "\x0a\xd0\xdd\x4c\xf6\x64\x9a\xf1"
> +                         "\xcd\x30\x12\x82\x2c\x38\xd3\x26"
> +                         "\x83\xdb\xab\x3e\xc6\xf8\xe6\xfa"
> +                         "\x77\x0a\x78\x82\x75\xf8\x63\x51"
> +                         "\x59\xd0\x8d\x24\x9f\x25\xe6\xa3"
> +                         "\x4c\xbc\x34\xfc\xe3\x10\xc7\x62"
> +                         "\xd4\x23\xc8\x3d\xa7\xc6\xa6\x0a"
> +                         "\x4f\x7e\x29\x9d\x6d\xbe\xb5\xf1"
> +                         "\xdf\xa4\x53\xfa\xc0\x23\x0f\x37"
> +                         "\x84\x68\xd0\xb5\xc8\xc6\xae\xf8"
> +                         "\xb7\x8d\xb3\x16\xfe\x8f\x87\xad"
> +                         "\xd0\xc1\x08\xee\x12\x1c\x9b\x1d"
> +                         "\x90\xf8\xd1\x63\xa4\x92\x3c\xf0"
> +                         "\xc7\x34\xd8\xf1\x14\xed\xa3\xbc"
> +                         "\x17\x7e\xd4\x62\x42\x54\x57\x2c"
> +                         "\x3e\x7a\x35\x35\x17\x0f\x0b\x7f"
> +                         "\x81\xa1\x3f\xd0\xcd\xc8\x3b\x96"
> +                         "\xe9\xe0\x4a\x04\xe1\xb6\x3c\xa1"
> +                         "\xd6\xca\xc4\xbd\xb6\xb5\x95\x34"
> +                         "\x12\x9d\xc5\x96\xf2\xdf\xba\x54"
> +                         "\x76\xd1\xb2\x6b\x3b\x39\xe0\xb9"
> +                         "\x18\x62\xfb\xf7\xfc\x12\xf1\x5f"
> +                         "\x7e\xc7\xe3\x59\x4c\xa6\xc2\x3d"
> +                         "\x40\x15\xf9\xa3\x95\x64\x4c\x74"
> +                         "\x8b\x73\x77\x33\x07\xa7\x04\x1d"
> +                         "\x33\x5a\x7e\x8f\xbd\x86\x01\x4f"
> +                         "\x3e\xb9\x27\x6f\xe2\x41\xf7\x09"
> +                         "\x67\xfd\x29\x28\xc5\xe4\xf6\x18"
> +                         "\x4c\x1b\x49\xb2\x9c\x5b\xf6\x81"
> +                         "\x4f\xbb\x5c\xcc\x0b\xdf\x84\x23"
> +                         "\x58\xd6\x28\x34\x93\x3a\x25\x97"
> +                         "\xdf\xb2\xc3\x9e\x97\x38\x0b\x7d"
> +                         "\x10\xb3\x54\x35\x23\x8c\x64\xee"
> +                         "\xf0\xd8\x66\xff\x8b\x22\xd2\x5b"
> +                         "\x05\x16\x3c\x89\xf7\xb1\x75\xaf"
> +                         "\xc0\xae\x6a\x4f\x3f\xaf\x9a\xf4"
> +                         "\xf4\x9a\x24\xd9\x80\x82\xc0\x12"
> +                         "\xde\x96\xd1\xbe\x15\x0b\x8d\x6a"
> +                         "\xd7\x12\xe4\x85\x9f\x83\xc9\xc3"
> +                         "\xff\x0b\xb5\xaf\x3b\xd8\x6d\x67"
> +                         "\x81\x45\xe6\xac\xec\xc1\x7b\x16"
> +                         "\x18\x0a\xce\x4b\xc0\x2e\x76\xbc"
> +                         "\x1b\xfa\xb4\x34\xb8\xfc\x3e\xc8"
> +                         "\x5d\x90\x71\x6d\x7a\x79\xef\x06",
> +               .ksize  = 1088,
> +               .plaintext      = "\xaa\x5d\x54\xcb\xea\x1e\x46\x0f"
> +                         "\x45\x87\x70\x51\x8a\x66\x7a\x33"
> +                         "\xb4\x18\xff\xa9\x82\xf9\x45\x4b"
> +                         "\x93\xae\x2e\x7f\xab\x98\xfe\xbf"
> +                         "\x01\xee\xe5\xa0\x37\x8f\x57\xa6"
> +                         "\xb0\x76\x0d\xa4\xd6\x28\x2b\x5d"
> +                         "\xe1\x03\xd6\x1c\x6f\x34\x0d\xe7"
> +                         "\x61\x2d\x2e\xe5\xae\x5d\x47\xc7"
> +                         "\x80\x4b\x18\x8f\xa8\x99\xbc\x28"
> +                         "\xed\x1d\x9d\x86\x7d\xd7\x41\xd1"
> +                         "\xe0\x2b\xe1\x8c\x93\x2a\xa7\x80"
> +                         "\xe1\x07\xa0\xa9\x9f\x8c\x8d\x1a"
> +                         "\x55\xfc\x6b\x24\x7a\xbd\x3e\x51"
> +                         "\x68\x4b\x26\x59\xc8\xa7\x16\xd9"
> +                         "\xb9\x61\x13\xde\x8b\x63\x1c\xf6"
> +                         "\x60\x01\xfb\x08\xb3\x5b\x0a\xbf"
> +                         "\x34\x73\xda\x87\x87\x3d\x6f\x97"
> +                         "\x4a\x0c\xa3\x58\x20\xa2\xc0\x81"
> +                         "\x5b\x8c\xef\xa9\xc2\x01\x1e\x64"
> +                         "\x83\x8c\xbc\x03\xb6\xd0\x29\x9f"
> +                         "\x54\xe2\xce\x8b\xc2\x07\x85\x78"
> +                         "\x25\x38\x96\x4c\xb4\xbe\x17\x4a"
> +                         "\x65\xa6\xfa\x52\x9d\x66\x9d\x65"
> +                         "\x4a\xd1\x01\x01\xf0\xcb\x13\xcc"
> +                         "\xa5\x82\xf3\xf2\x66\xcd\x3f\x9d"
> +                         "\xd1\xaa\xe4\x67\xea\xf2\xad\x88"
> +                         "\x56\x76\xa7\x9b\x59\x3c\xb1\x5d"
> +                         "\x78\xfd\x69\x79\x74\x78\x43\x26"
> +                         "\x7b\xde\x3f\xf1\xf5\x4e\x14\xd9"
> +                         "\x15\xf5\x75\xb5\x2e\x19\xf3\x0c"
> +                         "\x48\x72\xd6\x71\x6d\x03\x6e\xaa"
> +                         "\xa7\x08\xf9\xaa\x70\xa3\x0f\x4d"
> +                         "\x12\x8a\xdd\xe3\x39\x73\x7e\xa7"
> +                         "\xea\x1f\x6d\x06\x26\x2a\xf2\xc5"
> +                         "\x52\xb4\xbf\xfd\x52\x0c\x06\x60"
> +                         "\x90\xd1\xb2\x7b\x56\xae\xac\x58"
> +                         "\x5a\x6b\x50\x2a\xf5\xe0\x30\x3c"
> +                         "\x2a\x98\x0f\x1b\x5b\x0a\x84\x6c"
> +                         "\x31\xae\x92\xe2\xd4\xbb\x7f\x59"
> +                         "\x26\x10\xb9\x89\x37\x68\x26\xbf"
> +                         "\x41\xc8\x49\xc4\x70\x35\x7d\xff"
> +                         "\x2d\x7f\xf6\x8a\x93\x68\x8c\x78"
> +                         "\x0d\x53\xce\x7d\xff\x7d\xfb\xae"
> +                         "\x13\x1b\x75\xc4\x78\xd7\x71\xd8"
> +                         "\xea\xd3\xf4\x9d\x95\x64\x8e\xb4"
> +                         "\xde\xb8\xe4\xa6\x68\xc8\xae\x73"
> +                         "\x58\xaf\xa8\xb0\x5a\x20\xde\x87"
> +                         "\x43\xb9\x0f\xe3\xad\x41\x4b\xd5"
> +                         "\xb7\xad\x16\x00\xa6\xff\xf6\x74"
> +                         "\xbf\x8c\x9f\xb3\x58\x1b\xb6\x55"
> +                         "\xa9\x90\x56\x28\xf0\xb5\x13\x4e"
> +                         "\x9e\xf7\x25\x86\xe0\x07\x7b\x98"
> +                         "\xd8\x60\x5d\x38\x95\x3c\xe4\x22"
> +                         "\x16\x2f\xb2\xa2\xaf\xe8\x90\x17"
> +                         "\xec\x11\x83\x1a\xf4\xa9\x26\xda"
> +                         "\x39\x72\xf5\x94\x61\x05\x51\xec"
> +                         "\xa8\x30\x8b\x2c\x13\xd0\x72\xac"
> +                         "\xb9\xd2\xa0\x4c\x4b\x78\xe8\x6e"
> +                         "\x04\x85\xe9\x04\x49\x82\x91\xff"
> +                         "\x89\xe5\xab\x4c\xaa\x37\x03\x12"
> +                         "\xca\x8b\x74\x10\xfd\x9e\xd9\x7b"
> +                         "\xcb\xdb\x82\x6e\xce\x2e\x33\x39"
> +                         "\xce\xd2\x84\x6e\x34\x71\x51\x6e"
> +                         "\x0d\xd6\x01\x87\xc7\xfa\x0a\xd3"
> +                         "\xad\x36\xf3\x4c\x9f\x96\x5e\x62"
> +                         "\x62\x54\xc3\x03\x78\xd6\xab\xdd"
> +                         "\x89\x73\x55\x25\x30\xf8\xa7\xe6"
> +                         "\x4f\x11\x0c\x7c\x0a\xa1\x2b\x7b"
> +                         "\x3d\x0d\xde\x81\xd4\x9d\x0b\xae"
> +                         "\xdf\x00\xf9\x4c\xb6\x90\x8e\x16"
> +                         "\xcb\x11\xc8\xd1\x2e\x73\x13\x75"
> +                         "\x75\x3e\xaa\xf5\xee\x02\xb3\x18"
> +                         "\xa6\x2d\xf5\x3b\x51\xd1\x1f\x47"
> +                         "\x6b\x2c\xdb\xc4\x10\xe0\xc8\xba"
> +                         "\x9d\xac\xb1\x9d\x75\xd5\x41\x0e"
> +                         "\x7e\xbe\x18\x5b\xa4\x1f\xf8\x22"
> +                         "\x4c\xc1\x68\xda\x6d\x51\x34\x6c"
> +                         "\x19\x59\xec\xb5\xb1\xec\xa7\x03"
> +                         "\xca\x54\x99\x63\x05\x6c\xb1\xac"
> +                         "\x9c\x31\xd6\xdb\xba\x7b\x14\x12"
> +                         "\x7a\xc3\x2f\xbf\x8d\xdc\x37\x46"
> +                         "\xdb\xd2\xbc\xd4\x2f\xab\x30\xd5"
> +                         "\xed\x34\x99\x8e\x83\x3e\xbe\x4c"
> +                         "\x86\x79\x58\xe0\x33\x8d\x9a\xb8"
> +                         "\xa9\xa6\x90\x46\xa2\x02\xb8\xdd"
> +                         "\xf5\xf9\x1a\x5c\x8c\x01\xaa\x6e"
> +                         "\xb4\x22\x12\xf5\x0c\x1b\x9b\x7a"
> +                         "\xc3\x80\xf3\x06\x00\x5f\x30\xd5"
> +                         "\x06\xdb\x7d\x82\xc2\xd4\x0b\x4c"
> +                         "\x5f\xe9\xc5\xf5\xdf\x97\x12\xbf"
> +                         "\x56\xaf\x9b\x69\xcd\xee\x30\xb4"
> +                         "\xa8\x71\xff\x3e\x7d\x73\x7a\xb4"
> +                         "\x0d\xa5\x46\x7a\xf3\xf4\x15\x87"
> +                         "\x5d\x93\x2b\x8c\x37\x64\xb5\xdd"
> +                         "\x48\xd1\xe5\x8c\xae\xd4\xf1\x76"
> +                         "\xda\xf4\xba\x9e\x25\x0e\xad\xa3"
> +                         "\x0d\x08\x7c\xa8\x82\x16\x8d\x90"
> +                         "\x56\x40\x16\x84\xe7\x22\x53\x3a"
> +                         "\x58\xbc\xb9\x8f\x33\xc8\xc2\x84"
> +                         "\x22\xe6\x0d\xe7\xb3\xdc\x5d\xdf"
> +                         "\xd7\x2a\x36\xe4\x16\x06\x07\xd2"
> +                         "\x97\x60\xb2\xf5\x5e\x14\xc9\xfd"
> +                         "\x8b\x05\xd1\xce\xee\x9a\x65\x99"
> +                         "\xb7\xae\x19\xb7\xc8\xbc\xd5\xa2"
> +                         "\x7b\x95\xe1\xcc\xba\x0d\xdc\x8a"
> +                         "\x1d\x59\x52\x50\xaa\x16\x02\x82"
> +                         "\xdf\x61\x33\x2e\x44\xce\x49\xc7"
> +                         "\xe5\xc6\x2e\x76\xcf\x80\x52\xf0"
> +                         "\x3d\x17\x34\x47\x3f\xd3\x80\x48"
> +                         "\xa2\xba\xd5\xc7\x7b\x02\x28\xdb"
> +                         "\xac\x44\xc7\x6e\x05\x5c\xc2\x79"
> +                         "\xb3\x7d\x6a\x47\x77\x66\xf1\x38"
> +                         "\xf0\xf5\x4f\x27\x1a\x31\xca\x6c"
> +                         "\x72\x95\x92\x8e\x3f\xb0\xec\x1d"
> +                         "\xc7\x2a\xff\x73\xee\xdf\x55\x80"
> +                         "\x93\xd2\xbd\x34\xd3\x9f\x00\x51"
> +                         "\xfb\x2e\x41\xba\x6c\x5a\x7c\x17"
> +                         "\x7f\xe6\x70\xac\x8d\x39\x3f\x77"
> +                         "\xe2\x23\xac\x8f\x72\x4e\xe4\x53"
> +                         "\xcc\xf1\x1b\xf1\x35\xfe\x52\xa4"
> +                         "\xd6\xb8\x40\x6b\xc1\xfd\xa0\xa1"
> +                         "\xf5\x46\x65\xc2\x50\xbb\x43\xe2"
> +                         "\xd1\x43\x28\x34\x74\xf5\x87\xa0"
> +                         "\xf2\x5e\x27\x3b\x59\x2b\x3e\x49"
> +                         "\xdf\x46\xee\xaf\x71\xd7\x32\x36"
> +                         "\xc7\x14\x0b\x58\x6e\x3e\x2d\x41"
> +                         "\xfa\x75\x66\x3a\x54\xe0\xb2\xb9"
> +                         "\xaf\xdd\x04\x80\x15\x19\x3f\x6f"
> +                         "\xce\x12\xb4\xd8\xe8\x89\x3c\x05"
> +                         "\x30\xeb\xf3\x3d\xcd\x27\xec\xdc"
> +                         "\x56\x70\x12\xcf\x78\x2b\x77\xbf"
> +                         "\x22\xf0\x1b\x17\x9c\xcc\xd6\x1b"
> +                         "\x2d\x3d\xa0\x3b\xd8\xc9\x70\xa4"
> +                         "\x7a\x3e\x07\xb9\x06\xc3\xfa\xb0"
> +                         "\x33\xee\xc1\xd8\xf6\xe0\xf0\xb2"
> +                         "\x61\x12\x69\xb0\x5f\x28\x99\xda"
> +                         "\xc3\x61\x48\xfa\x07\x16\x03\xc4"
> +                         "\xa8\xe1\x3c\xe8\x0e\x64\x15\x30"
> +                         "\xc1\x9d\x84\x2f\x73\x98\x0e\x3a"
> +                         "\xf2\x86\x21\xa4\x9e\x1d\xb5\x86"
> +                         "\x16\xdb\x2b\x9a\x06\x64\x8e\x79"
> +                         "\x8d\x76\x3e\xc3\xc2\x64\x44\xe3"
> +                         "\xda\xbc\x1a\x52\xd7\x61\x03\x65"
> +                         "\x54\x32\x77\x01\xed\x9d\x8a\x43"
> +                         "\x25\x24\xe3\xc1\xbe\xb8\x2f\xcb"
> +                         "\x89\x14\x64\xab\xf6\xa0\x6e\x02"
> +                         "\x57\xe4\x7d\xa9\x4e\x9a\x03\x36"
> +                         "\xad\xf1\xb1\xfc\x0b\xe6\x79\x51"
> +                         "\x9f\x81\x77\xc4\x14\x78\x9d\xbf"
> +                         "\xb6\xd6\xa3\x8c\xba\x0b\x26\xe7"
> +                         "\xc8\xb9\x5c\xcc\xe1\x5f\xd5\xc6"
> +                         "\xc4\xca\xc2\xa3\x45\xba\x94\x13"
> +                         "\xb2\x8f\xc3\x54\x01\x09\xe7\x8b"
> +                         "\xda\x2a\x0a\x11\x02\x43\xcb\x57"
> +                         "\xc9\xcc\xb5\x5c\xab\xc4\xec\x54"
> +                         "\x00\x06\x34\xe1\x6e\x03\x89\x7c"
> +                         "\xc6\xfb\x6a\xc7\x60\x43\xd6\xc5"
> +                         "\xb5\x68\x72\x89\x8f\x42\xc3\x74"
> +                         "\xbd\x25\xaa\x9f\x67\xb5\xdf\x26"
> +                         "\x20\xe8\xb7\x01\x3c\xe4\x77\xce"
> +                         "\xc4\x65\xa7\x23\x79\xea\x33\xc7"
> +                         "\x82\x14\x5c\x82\xf2\x4e\x3d\xf6"
> +                         "\xc6\x4a\x0e\x29\xbb\xec\x44\xcd"
> +                         "\x2f\xd1\x4f\x21\x71\xa9\xce\x0f"
> +                         "\x5c\xf2\x72\x5c\x08\x2e\x21\xd2"
> +                         "\xc3\x29\x13\xd8\xac\xc3\xda\x13"
> +                         "\x1a\x9d\xa7\x71\x1d\x27\x1d\x27"
> +                         "\x1d\xea\xab\x44\x79\xad\xe5\xeb"
> +                         "\xef\x1f\x22\x0a\x44\x4f\xcb\x87"
> +                         "\xa7\x58\x71\x0e\x66\xf8\x60\xbf"
> +                         "\x60\x74\x4a\xb4\xec\x2e\xfe\xd3"
> +                         "\xf5\xb8\xfe\x46\x08\x50\x99\x6c"
> +                         "\x66\xa5\xa8\x34\x44\xb5\xe5\xf0"
> +                         "\xdd\x2c\x67\x4e\x35\x96\x8e\x67"
> +                         "\x48\x3f\x5f\x37\x44\x60\x51\x2e"
> +                         "\x14\x91\x5e\x57\xc3\x0e\x79\x77"
> +                         "\x2f\x03\xf4\xe2\x1c\x72\xbf\x85"
> +                         "\x5d\xd3\x17\xdf\x6c\xc5\x70\x24"
> +                         "\x42\xdf\x51\x4e\x2a\xb2\xd2\x5b"
> +                         "\x9e\x69\x83\x41\x11\xfe\x73\x22"
> +                         "\xde\x8a\x9e\xd8\x8a\xfb\x20\x38"
> +                         "\xd8\x47\x6f\xd5\xed\x8f\x41\xfd"
> +                         "\x13\x7a\x18\x03\x7d\x0f\xcd\x7d"
> +                         "\xa6\x7d\x31\x9e\xf1\x8f\x30\xa3"
> +                         "\x8b\x4c\x24\xb7\xf5\x48\xd7\xd9"
> +                         "\x12\xe7\x84\x97\x5c\x31\x6d\xfb"
> +                         "\xdf\xf3\xd3\xd1\xd5\x0c\x30\x06"
> +                         "\x01\x6a\xbc\x6c\x78\x7b\xa6\x50"
> +                         "\xfa\x0f\x3c\x42\x2d\xa5\xa3\x3b"
> +                         "\xcf\x62\x50\xff\x71\x6d\xe7\xda"
> +                         "\x27\xab\xc6\x67\x16\x65\x68\x64"
> +                         "\xc7\xd5\x5f\x81\xa9\xf6\x65\xb3"
> +                         "\x5e\x43\x91\x16\xcd\x3d\x55\x37"
> +                         "\x55\xb3\xf0\x28\xc5\x54\x19\xc0"
> +                         "\xe0\xd6\x2a\x61\xd4\xc8\x72\x51"
> +                         "\xe9\xa1\x7b\x48\x21\xad\x44\x09"
> +                         "\xe4\x01\x61\x3c\x8a\x5b\xf9\xa1"
> +                         "\x6e\x1b\xdf\xc0\x04\xa8\x8b\xf2"
> +                         "\x21\xbe\x34\x7b\xfc\xa1\xcd\xc9"
> +                         "\xa9\x96\xf4\xa4\x4c\xf7\x4e\x8f"
> +                         "\x84\xcc\xd3\xa8\x92\x77\x8f\x36"
> +                         "\xe2\x2e\x8c\x33\xe8\x84\xa6\x0c"
> +                         "\x6c\x8a\xda\x14\x32\xc2\x96\xff"
> +                         "\xc6\x4a\xc2\x9b\x30\x7f\xd1\x29"
> +                         "\xc0\xd5\x78\x41\x00\x80\x80\x03"
> +                         "\x2a\xb1\xde\x26\x03\x48\x49\xee"
> +                         "\x57\x14\x76\x51\x3c\x36\x5d\x0a"
> +                         "\x5c\x9f\xe8\xd8\x53\xdb\x4f\xd4"
> +                         "\x38\xbf\x66\xc9\x75\x12\x18\x75"
> +                         "\x34\x2d\x93\x22\x96\x51\x24\x6e"
> +                         "\x4e\xd9\x30\xea\x67\xff\x92\x1c"
> +                         "\x16\x26\xe9\xb5\x33\xab\x8c\x22"
> +                         "\x47\xdb\xa0\x2c\x08\xf0\x12\x69"
> +                         "\x7e\x93\x52\xda\xa5\xe5\xca\xc1"
> +                         "\x0f\x55\x2a\xbd\x09\x30\x88\x1b"
> +                         "\x9c\xc6\x9f\xe6\xdb\xa6\x92\xeb"
> +                         "\xf4\xbd\x5c\xc4\xdb\xc6\x71\x09"
> +                         "\xab\x5e\x48\x0c\xed\x6f\xda\x8e"
> +                         "\x8d\x0c\x98\x71\x7d\x10\xd0\x9c"
> +                         "\x20\x9b\x79\x53\x26\x5d\xb9\x85"
> +                         "\x8a\x31\xb8\xc5\x1c\x97\xde\x88"
> +                         "\x61\x55\x7f\x7c\x21\x06\xea\xc4"
> +                         "\x5f\xaf\xf2\xf0\xd5\x5e\x7d\xb4"
> +                         "\x6e\xcf\xe9\xae\x1b\x0e\x11\x80"
> +                         "\xc1\x9a\x74\x7e\x52\x6f\xa0\xb7"
> +                         "\x24\xcd\x8d\x0a\x11\x40\x63\x72"
> +                         "\xfa\xe2\xc5\xb3\x94\xef\x29\xa2"
> +                         "\x1a\x23\x43\x04\x37\x55\x0d\xe9"
> +                         "\x83\xb2\x29\x51\x49\x64\xa0\xbd"
> +                         "\xde\x73\xfd\xa5\x7c\x95\x70\x62"
> +                         "\x58\xdc\xe2\xd0\xbf\x98\xf5\x8a"
> +                         "\x6a\xfd\xce\xa8\x0e\x42\x2a\xeb"
> +                         "\xd2\xff\x83\x27\x53\x5c\xa0\x6e"
> +                         "\x93\xef\xe2\xb9\x5d\x35\xd6\x98"
> +                         "\xf6\x71\x19\x7a\x54\xa1\xa7\xe8"
> +                         "\x09\xfe\xf6\x9e\xc7\xbd\x3e\x29"
> +                         "\xbd\x6b\x17\xf4\xe7\x3e\x10\x5c"
> +                         "\xc1\xd2\x59\x4f\x4b\x12\x1a\x5b"
> +                         "\x50\x80\x59\xb9\xec\x13\x66\xa8"
> +                         "\xd2\x31\x7b\x6a\x61\x22\xdd\x7d"
> +                         "\x61\xee\x87\x16\x46\x9f\xf9\xc7"
> +                         "\x41\xee\x74\xf8\xd0\x96\x2c\x76"
> +                         "\x2a\xac\x7d\x6e\x9f\x0e\x7f\x95"
> +                         "\xfe\x50\x16\xb2\x23\xca\x62\xd5"
> +                         "\x68\xcf\x07\x3f\x3f\x97\x85\x2a"
> +                         "\x0c\x25\x45\xba\xdb\x32\xcb\x83"
> +                         "\x8c\x4f\xe0\x6d\x9a\x99\xf9\xc9"
> +                         "\xda\xd4\x19\x31\xc1\x7c\x6d\xd9"
> +                         "\x9c\x56\xd3\xec\xc1\x81\x4c\xed"
> +                         "\x28\x9d\x87\xeb\x19\xd7\x1a\x4f"
> +                         "\x04\x6a\xcb\x1f\xcf\x1f\xa2\x16"
> +                         "\xfc\x2a\x0d\xa1\x14\x2d\xfa\xc5"
> +                         "\x5a\xd2\xc5\xf9\x19\x7c\x20\x1f"
> +                         "\x2d\x10\xc0\x66\x7c\xd9\x2d\xe5"
> +                         "\x88\x70\x59\xa7\x85\xd5\x2e\x7c"
> +                         "\x5c\xe3\xb7\x12\xd6\x97\x3f\x29",
> +               .psize  = 2048,
> +               .digest = "\x37\x90\x92\xc2\xeb\x01\x87\xd9"
> +                         "\x95\xc7\x91\xc3\x17\x8b\x38\x52",
> +       }
> +};
> +
> +
>  /*
>   * DES test vectors.
>   */
> diff --git a/include/crypto/nhpoly1305.h b/include/crypto/nhpoly1305.h
> new file mode 100644
> index 0000000000000..06bfb876a1563
> --- /dev/null
> +++ b/include/crypto/nhpoly1305.h
> @@ -0,0 +1,74 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Common values and helper functions for the NHPoly1305 hash function.
> + */
> +
> +#ifndef _NHPOLY1305_H
> +#define _NHPOLY1305_H
> +
> +#include <crypto/hash.h>
> +#include <crypto/poly1305.h>
> +
> +/* NH parameterization: */
> +
> +/* Endianness: little */
> +/* Word size: 32 bits (works well on NEON, SSE2, AVX2) */
> +
> +/* Stride: 2 words (optimal on ARM32 NEON; works okay on other CPUs too) */
> +#define NH_PAIR_STRIDE         2
> +#define NH_MESSAGE_UNIT                (NH_PAIR_STRIDE * 2 * sizeof(u32))
> +
> +/* Num passes (Toeplitz iteration count): 4, to give ε = 2^{-128} */
> +#define NH_NUM_PASSES          4
> +#define NH_HASH_BYTES          (NH_NUM_PASSES * sizeof(u64))
> +
> +/* Max message size: 1024 bytes (32x compression factor) */
> +#define NH_NUM_STRIDES         64
> +#define NH_MESSAGE_WORDS       (NH_PAIR_STRIDE * 2 * NH_NUM_STRIDES)
> +#define NH_MESSAGE_BYTES       (NH_MESSAGE_WORDS * sizeof(u32))
> +#define NH_KEY_WORDS           (NH_MESSAGE_WORDS + \
> +                                NH_PAIR_STRIDE * 2 * (NH_NUM_PASSES - 1))
> +#define NH_KEY_BYTES           (NH_KEY_WORDS * sizeof(u32))
> +
> +#define NHPOLY1305_KEY_SIZE    (POLY1305_BLOCK_SIZE + NH_KEY_BYTES)
> +
> +struct nhpoly1305_key {
> +       struct poly1305_key poly_key;
> +       u32 nh_key[NH_KEY_WORDS];
> +};
> +
> +struct nhpoly1305_state {
> +
> +       /* Running total of polynomial evaluation */
> +       struct poly1305_state poly_state;
> +
> +       /* Partial block buffer */
> +       u8 buffer[NH_MESSAGE_UNIT];
> +       unsigned int buflen;
> +
> +       /*
> +        * Number of bytes remaining until the current NH message reaches
> +        * NH_MESSAGE_BYTES.  When nonzero, 'nh_hash' holds the partial NH hash.
> +        */
> +       unsigned int nh_remaining;
> +
> +       __le64 nh_hash[NH_NUM_PASSES];
> +};
> +
> +typedef void (*nh_t)(const u32 *key, const u8 *src, size_t srclen,
> +                    __le64 hash[NH_NUM_PASSES]);
> +
> +int crypto_nhpoly1305_setkey(struct crypto_shash *tfm,
> +                            const u8 *key, unsigned int keylen);
> +
> +int crypto_nhpoly1305_init(struct shash_desc *desc);
> +int crypto_nhpoly1305_update(struct shash_desc *desc,
> +                            const u8 *src, unsigned int srclen);
> +int crypto_nhpoly1305_update_helper(struct shash_desc *desc,
> +                                   const u8 *src, unsigned int srclen,
> +                                   nh_t nh_fn);
> +int crypto_nhpoly1305_final(struct shash_desc *desc, u8 *dst);
> +int crypto_nhpoly1305_final_helper(struct shash_desc *desc, u8 *dst,
> +                                  nh_t nh_fn);
> +
> +#endif /* _NHPOLY1305_H */
> --
> 2.19.1.331.ge82ca0e54c-goog
>
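
The derived sizes in the quoted nhpoly1305.h can be checked by hand with a small userspace program. This is only a sketch: the macros are copied from the header with u32/u64 swapped for stdint types, POLY1305_BLOCK_SIZE is assumed to be 16 as in <crypto/poly1305.h>, and the two helper functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: same value as POLY1305_BLOCK_SIZE in <crypto/poly1305.h>. */
#define POLY1305_BLOCK_SIZE	16

/* Copied from the quoted header, with u32/u64 swapped for stdint types. */
#define NH_PAIR_STRIDE		2
#define NH_MESSAGE_UNIT		(NH_PAIR_STRIDE * 2 * sizeof(uint32_t))
#define NH_NUM_PASSES		4
#define NH_HASH_BYTES		(NH_NUM_PASSES * sizeof(uint64_t))
#define NH_NUM_STRIDES		64
#define NH_MESSAGE_WORDS	(NH_PAIR_STRIDE * 2 * NH_NUM_STRIDES)
#define NH_MESSAGE_BYTES	(NH_MESSAGE_WORDS * sizeof(uint32_t))
#define NH_KEY_WORDS		(NH_MESSAGE_WORDS + \
				 NH_PAIR_STRIDE * 2 * (NH_NUM_PASSES - 1))
#define NH_KEY_BYTES		(NH_KEY_WORDS * sizeof(uint32_t))
#define NHPOLY1305_KEY_SIZE	(POLY1305_BLOCK_SIZE + NH_KEY_BYTES)

/* Check the values stated in the header comments (hypothetical helper). */
void nh_check_params(void)
{
	assert(NH_MESSAGE_UNIT == 16);			/* one stride */
	assert(NH_HASH_BYTES == 32);			/* four 64-bit sums */
	assert(NH_MESSAGE_BYTES == 1024);		/* max NH message */
	assert(NH_MESSAGE_BYTES / NH_HASH_BYTES == 32);	/* 32x compression */
}

/* Print the key sizes for inspection (hypothetical helper). */
void nh_print_params(void)
{
	printf("NH_KEY_WORDS        = %d\n", NH_KEY_WORDS);
	printf("NH_KEY_BYTES        = %zu\n", (size_t)NH_KEY_BYTES);
	printf("NHPOLY1305_KEY_SIZE = %zu\n", (size_t)NHPOLY1305_KEY_SIZE);
}
```

The extra 12 key words come from the Toeplitz construction: each of the 3 additional passes shifts the key by one 4-word stride.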


* Re: [RFC PATCH v2 10/12] crypto: arm/nhpoly1305 - add NEON-accelerated NHPoly1305
  2018-10-15 17:54 ` [RFC PATCH v2 10/12] crypto: arm/nhpoly1305 - add NEON-accelerated NHPoly1305 Eric Biggers
@ 2018-10-20  4:12   ` Ard Biesheuvel
  2018-10-20  5:51     ` Eric Biggers
  0 siblings, 1 reply; 54+ messages in thread
From: Ard Biesheuvel @ 2018-10-20  4:12 UTC (permalink / raw)
  To: Eric Biggers
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fscrypt,
	linux-arm-kernel, Linux Kernel Mailing List, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

On 16 October 2018 at 01:54, Eric Biggers <ebiggers@kernel.org> wrote:
> From: Eric Biggers <ebiggers@google.com>
>
> Add an ARM NEON implementation of NHPoly1305, an ε-almost-∆-universal
> hash function used in the Adiantum encryption mode.  For now, only the
> NH portion is actually NEON-accelerated; the Poly1305 part is less
> performance-critical so is just implemented in C.
>
> Signed-off-by: Eric Biggers <ebiggers@google.com>
> ---
>  arch/arm/crypto/Kconfig                |   5 ++
>  arch/arm/crypto/Makefile               |   2 +
>  arch/arm/crypto/nh-neon-core.S         | 116 +++++++++++++++++++++++++
>  arch/arm/crypto/nhpoly1305-neon-glue.c |  78 +++++++++++++++++
>  4 files changed, 201 insertions(+)
>  create mode 100644 arch/arm/crypto/nh-neon-core.S
>  create mode 100644 arch/arm/crypto/nhpoly1305-neon-glue.c
>
> diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
> index cc932d9bba561..458562a34aabe 100644
> --- a/arch/arm/crypto/Kconfig
> +++ b/arch/arm/crypto/Kconfig
> @@ -122,4 +122,9 @@ config CRYPTO_CHACHA20_NEON
>         select CRYPTO_BLKCIPHER
>         select CRYPTO_CHACHA20
>
> +config CRYPTO_NHPOLY1305_NEON
> +       tristate "NEON accelerated NHPoly1305 hash function (for Adiantum)"
> +       depends on KERNEL_MODE_NEON
> +       select CRYPTO_NHPOLY1305
> +
>  endif
> diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile
> index 005482ff95047..b65d6bfab8e6b 100644
> --- a/arch/arm/crypto/Makefile
> +++ b/arch/arm/crypto/Makefile
> @@ -10,6 +10,7 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o
>  obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o
>  obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o
>  obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha-neon.o
> +obj-$(CONFIG_CRYPTO_NHPOLY1305_NEON) += nhpoly1305-neon.o
>
>  ce-obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o
>  ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += sha1-arm-ce.o
> @@ -53,6 +54,7 @@ ghash-arm-ce-y        := ghash-ce-core.o ghash-ce-glue.o
>  crct10dif-arm-ce-y     := crct10dif-ce-core.o crct10dif-ce-glue.o
>  crc32-arm-ce-y:= crc32-ce-core.o crc32-ce-glue.o
>  chacha-neon-y := chacha-neon-core.o chacha-neon-glue.o
> +nhpoly1305-neon-y := nh-neon-core.o nhpoly1305-neon-glue.o
>
>  ifdef REGENERATE_ARM_CRYPTO
>  quiet_cmd_perl = PERL    $@
> diff --git a/arch/arm/crypto/nh-neon-core.S b/arch/arm/crypto/nh-neon-core.S
> new file mode 100644
> index 0000000000000..434d80ab531c2
> --- /dev/null
> +++ b/arch/arm/crypto/nh-neon-core.S
> @@ -0,0 +1,116 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * NH - ε-almost-universal hash function, NEON accelerated version
> + *
> + * Copyright 2018 Google LLC
> + *
> + * Author: Eric Biggers <ebiggers@google.com>
> + */
> +
> +#include <linux/linkage.h>
> +
> +       .text
> +       .fpu            neon
> +
> +       KEY             .req    r0
> +       MESSAGE         .req    r1
> +       MESSAGE_LEN     .req    r2
> +       HASH            .req    r3
> +
> +       PASS0_SUMS      .req    q0
> +       PASS0_SUM_A     .req    d0
> +       PASS0_SUM_B     .req    d1
> +       PASS1_SUMS      .req    q1
> +       PASS1_SUM_A     .req    d2
> +       PASS1_SUM_B     .req    d3
> +       PASS2_SUMS      .req    q2
> +       PASS2_SUM_A     .req    d4
> +       PASS2_SUM_B     .req    d5
> +       PASS3_SUMS      .req    q3
> +       PASS3_SUM_A     .req    d6
> +       PASS3_SUM_B     .req    d7
> +       K0              .req    q4
> +       K1              .req    q5
> +       K2              .req    q6
> +       K3              .req    q7
> +       T0              .req    q8
> +       T0_L            .req    d16
> +       T0_H            .req    d17
> +       T1              .req    q9
> +       T1_L            .req    d18
> +       T1_H            .req    d19
> +       T2              .req    q10
> +       T2_L            .req    d20
> +       T2_H            .req    d21
> +       T3              .req    q11
> +       T3_L            .req    d22
> +       T3_H            .req    d23
> +
> +.macro _nh_stride      k0, k1, k2, k3
> +
> +       // Load next message stride
> +       vld1.8          {T3}, [MESSAGE]!
> +
> +       // Load next key stride
> +       vld1.32         {\k3}, [KEY]!
> +
> +       // Add message words to key words
> +       vadd.u32        T0, T3, \k0
> +       vadd.u32        T1, T3, \k1
> +       vadd.u32        T2, T3, \k2
> +       vadd.u32        T3, T3, \k3
> +
> +       // Multiply 32x32 => 64 and accumulate
> +       vmlal.u32       PASS0_SUMS, T0_L, T0_H
> +       vmlal.u32       PASS1_SUMS, T1_L, T1_H
> +       vmlal.u32       PASS2_SUMS, T2_L, T2_H
> +       vmlal.u32       PASS3_SUMS, T3_L, T3_H
> +.endm
> +

Since we seem to have some spare NEON registers: would it help to have
a double round version of this macro?
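
For readers following along, here is what one invocation of the quoted _nh_stride macro appears to compute, written as portable C. This is a userspace model under stated assumptions, not the kernel's generic implementation: the names nh_model and load_le32 are made up here, and the sums are returned as host-endian integers rather than __le64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NH_NUM_PASSES	4
#define NH_MESSAGE_UNIT	16	/* bytes consumed per stride */

/* Load a little-endian 32-bit word from a possibly unaligned address. */
uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Model of NH over 'message_len' bytes (must be a multiple of 16).
 * 'key' must hold message_len / 4 + 12 words.
 */
void nh_model(const uint32_t *key, const uint8_t *message, size_t message_len,
	      uint64_t sums[NH_NUM_PASSES])
{
	int pass;

	assert(message_len % NH_MESSAGE_UNIT == 0);

	for (pass = 0; pass < NH_NUM_PASSES; pass++)
		sums[pass] = 0;

	while (message_len) {
		const uint32_t m0 = load_le32(message + 0);
		const uint32_t m1 = load_le32(message + 4);
		const uint32_t m2 = load_le32(message + 8);
		const uint32_t m3 = load_le32(message + 12);

		for (pass = 0; pass < NH_NUM_PASSES; pass++) {
			/* Pass p uses the key shifted by p strides (4 words each) */
			const uint32_t *k = key + 4 * pass;

			/* Like vadd.u32: word-wise addition modulo 2^32 */
			const uint32_t t0 = m0 + k[0];
			const uint32_t t1 = m1 + k[1];
			const uint32_t t2 = m2 + k[2];
			const uint32_t t3 = m3 + k[3];

			/* Like vmlal.u32: low pair times high pair, accumulated */
			sums[pass] += (uint64_t)t0 * t2;
			sums[pass] += (uint64_t)t1 * t3;
		}
		key += 4;
		message += NH_MESSAGE_UNIT;
		message_len -= NH_MESSAGE_UNIT;
	}
}
```

The NEON code keeps the two products of each pass in separate 64-bit lanes and adds them only at .Ldone; the model adds them immediately, which gives the same result modulo 2^64.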

> +/*
> + * void nh_neon(const u32 *key, const u8 *message, size_t message_len,
> + *             u8 hash[NH_HASH_BYTES])
> + *
> + * It's guaranteed that message_len % 16 == 0.
> + */
> +ENTRY(nh_neon)
> +
> +       vld1.32         {K0,K1}, [KEY]!
> +         vmov.u64      PASS0_SUMS, #0
> +         vmov.u64      PASS1_SUMS, #0
> +       vld1.32         {K2}, [KEY]!
> +         vmov.u64      PASS2_SUMS, #0
> +         vmov.u64      PASS3_SUMS, #0
> +
> +       subs            MESSAGE_LEN, MESSAGE_LEN, #64
> +       blt             .Lloop4_done
> +.Lloop4:
> +       _nh_stride      K0, K1, K2, K3
> +       _nh_stride      K1, K2, K3, K0
> +       _nh_stride      K2, K3, K0, K1
> +       _nh_stride      K3, K0, K1, K2
> +       subs            MESSAGE_LEN, MESSAGE_LEN, #64
> +       bge             .Lloop4
> +
> +.Lloop4_done:
> +       ands            MESSAGE_LEN, MESSAGE_LEN, #63
> +       beq             .Ldone
> +       _nh_stride      K0, K1, K2, K3
> +
> +       subs            MESSAGE_LEN, MESSAGE_LEN, #16
> +       beq             .Ldone
> +       _nh_stride      K1, K2, K3, K0
> +
> +       subs            MESSAGE_LEN, MESSAGE_LEN, #16
> +       beq             .Ldone
> +       _nh_stride      K2, K3, K0, K1
> +
> +.Ldone:
> +       // Sum the accumulators for each pass, then store the sums to 'hash'
> +       vadd.u64        T0_L, PASS0_SUM_A, PASS0_SUM_B
> +       vadd.u64        T0_H, PASS1_SUM_A, PASS1_SUM_B
> +       vadd.u64        T1_L, PASS2_SUM_A, PASS2_SUM_B
> +       vadd.u64        T1_H, PASS3_SUM_A, PASS3_SUM_B
> +       vst1.8          {T0-T1}, [HASH]
> +       bx              lr
> +ENDPROC(nh_neon)
> diff --git a/arch/arm/crypto/nhpoly1305-neon-glue.c b/arch/arm/crypto/nhpoly1305-neon-glue.c
> new file mode 100644
> index 0000000000000..df48a00f4c50f
> --- /dev/null
> +++ b/arch/arm/crypto/nhpoly1305-neon-glue.c
> @@ -0,0 +1,78 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * NHPoly1305 - ε-almost-∆-universal hash function for Adiantum
> + * (NEON accelerated version)
> + *
> + * Copyright 2018 Google LLC
> + */
> +
> +#include <asm/neon.h>
> +#include <asm/simd.h>
> +#include <crypto/internal/hash.h>
> +#include <crypto/nhpoly1305.h>
> +#include <linux/module.h>
> +
> +asmlinkage void nh_neon(const u32 *key, const u8 *message, size_t message_len,
> +                       u8 hash[NH_HASH_BYTES]);
> +
> +static void _nh_neon(const u32 *key, const u8 *message, size_t message_len,
> +                    __le64 hash[NH_NUM_PASSES])
> +{
> +       nh_neon(key, message, message_len, (u8 *)hash);
> +}
> +

Why do we need this function?
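
For context: in the quoted code, nh_t takes '__le64 hash[NH_NUM_PASSES]' while the assembly prototype takes 'u8 hash[NH_HASH_BYTES]', so the two function types are not compatible as written. A thin wrapper is one way to bridge that; whether that is the intended reason is for the author to confirm. The sketch below is userspace C with stand-in names (nh_fn_t, nh_asm_like, nh_wrapper and run_nh are all hypothetical).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NH_NUM_PASSES	4
#define NH_HASH_BYTES	(NH_NUM_PASSES * sizeof(uint64_t))

/* Callback type modelled on nh_t: the hash is an array of 64-bit words. */
typedef void (*nh_fn_t)(const uint32_t *key, const uint8_t *src,
			size_t srclen, uint64_t hash[NH_NUM_PASSES]);

/* Hypothetical stand-in for the assembly routine: it writes raw bytes. */
void nh_asm_like(const uint32_t *key, const uint8_t *src, size_t srclen,
		 uint8_t hash[NH_HASH_BYTES])
{
	(void)key;
	(void)src;
	memset(hash, (int)(srclen & 0xff), NH_HASH_BYTES);
}

/* Thin adapter: same arguments, only the type of 'hash' differs. */
void nh_wrapper(const uint32_t *key, const uint8_t *src, size_t srclen,
		uint64_t hash[NH_NUM_PASSES])
{
	nh_asm_like(key, src, srclen, (uint8_t *)hash);
}

/* Hypothetical caller that, like the update helper, takes the callback. */
void run_nh(nh_fn_t fn, const uint8_t *src, size_t srclen,
	    uint64_t hash[NH_NUM_PASSES])
{
	static const uint32_t zero_key[16];

	fn(zero_key, src, srclen, hash);
}
```

Passing nh_asm_like to run_nh directly would need a function-pointer cast, and calling a function through a pointer of an incompatible type is undefined behaviour in C. Changing the assembly prototype to match nh_t would be the alternative to the wrapper.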

> +static int nhpoly1305_neon_update(struct shash_desc *desc,
> +                                 const u8 *src, unsigned int srclen)
> +{
> +       if (srclen < 64 || !may_use_simd())
> +               return crypto_nhpoly1305_update(desc, src, srclen);
> +
> +       do {
> +               unsigned int n = min_t(unsigned int, srclen, PAGE_SIZE);
> +
> +               kernel_neon_begin();
> +               crypto_nhpoly1305_update_helper(desc, src, n, _nh_neon);
> +               kernel_neon_end();
> +               src += n;
> +               srclen -= n;
> +       } while (srclen);
> +       return 0;
> +}
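
The loop in the quoted update function hands the helper at most PAGE_SIZE bytes per kernel_neon_begin()/kernel_neon_end() section, presumably to bound the time spent with preemption disabled. The chunking pattern on its own looks like the userspace sketch below. Assumptions: CHUNK_MAX stands in for PAGE_SIZE and is taken to be 4096, and chunk_stats, process_chunk and chunked_update are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_MAX	4096	/* stands in for PAGE_SIZE; assumed 4 KiB */

struct chunk_stats {
	unsigned int sections;	/* number of begin/end sections entered */
	size_t bytes;		/* total bytes handed to the worker */
	size_t largest;		/* largest single chunk seen */
};

/* Hypothetical worker; the patch calls crypto_nhpoly1305_update_helper(). */
void process_chunk(struct chunk_stats *st, const uint8_t *src, size_t n)
{
	(void)src;
	st->bytes += n;
	if (n > st->largest)
		st->largest = n;
}

/* Feed 'srclen' bytes to the worker in sections of at most CHUNK_MAX. */
void chunked_update(struct chunk_stats *st, const uint8_t *src, size_t srclen)
{
	assert(srclen > 0);	/* the patch only gets here with srclen >= 64 */

	do {
		size_t n = srclen < CHUNK_MAX ? srclen : CHUNK_MAX;

		st->sections++;		/* kernel_neon_begin() would go here */
		process_chunk(st, src, n);
					/* kernel_neon_end() would go here */
		src += n;
		srclen -= n;
	} while (srclen);
}
```

A 10000-byte input is therefore processed in three sections of 4096, 4096 and 1808 bytes.
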
> +
> +static struct shash_alg nhpoly1305_alg = {
> +       .digestsize     = POLY1305_DIGEST_SIZE,
> +       .init           = crypto_nhpoly1305_init,
> +       .update         = nhpoly1305_neon_update,
> +       .final          = crypto_nhpoly1305_final,
> +       .setkey         = crypto_nhpoly1305_setkey,
> +       .descsize       = sizeof(struct nhpoly1305_state),
> +       .base           = {
> +               .cra_name               = "nhpoly1305",
> +               .cra_driver_name        = "nhpoly1305-neon",
> +               .cra_priority           = 200,
> +               .cra_ctxsize            = sizeof(struct nhpoly1305_key),
> +               .cra_module             = THIS_MODULE,
> +       },

Can we use .base.xxx please?
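
To illustrate the request: C99 designated initializers allow either nested braces or a '.base.xxx' designator path, and both produce the same object. The sketch below uses cut-down hypothetical structs (toy_base, toy_alg), not the real struct crypto_alg / struct shash_alg, and the digest size of 16 is only an example value.

```c
#include <assert.h>
#include <string.h>

/* Cut-down, hypothetical stand-ins for the crypto API structs. */
struct toy_base {
	const char *cra_name;
	const char *cra_driver_name;
	unsigned int cra_priority;
};

struct toy_alg {
	unsigned int digestsize;
	struct toy_base base;
};

/* Style used in the quoted patch: nested braces for the embedded struct. */
const struct toy_alg alg_nested = {
	.digestsize	= 16,
	.base		= {
		.cra_name		= "nhpoly1305",
		.cra_driver_name	= "nhpoly1305-neon",
		.cra_priority		= 200,
	},
};

/* Requested style: one designator path per line, no extra nesting. */
const struct toy_alg alg_flat = {
	.digestsize		= 16,
	.base.cra_name		= "nhpoly1305",
	.base.cra_driver_name	= "nhpoly1305-neon",
	.base.cra_priority	= 200,
};

/* Return 1 if both descriptions carry the same values, else 0. */
int toy_alg_equal(const struct toy_alg *a, const struct toy_alg *b)
{
	return a->digestsize == b->digestsize &&
	       strcmp(a->base.cra_name, b->base.cra_name) == 0 &&
	       strcmp(a->base.cra_driver_name, b->base.cra_driver_name) == 0 &&
	       a->base.cra_priority == b->base.cra_priority;
}
```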

> +};
> +
> +static int __init nhpoly1305_mod_init(void)
> +{
> +       if (!(elf_hwcap & HWCAP_NEON))
> +               return -ENODEV;
> +
> +       return crypto_register_shash(&nhpoly1305_alg);
> +}
> +
> +static void __exit nhpoly1305_mod_exit(void)
> +{
> +       crypto_unregister_shash(&nhpoly1305_alg);
> +}
> +
> +module_init(nhpoly1305_mod_init);
> +module_exit(nhpoly1305_mod_exit);
> +
> +MODULE_DESCRIPTION("NHPoly1305 ε-almost-∆-universal hash function (NEON-accelerated)");
> +MODULE_LICENSE("GPL v2");
> +MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
> +MODULE_ALIAS_CRYPTO("nhpoly1305");
> +MODULE_ALIAS_CRYPTO("nhpoly1305-neon");
> --
> 2.19.1.331.ge82ca0e54c-goog
>


* Re: [RFC PATCH v2 11/12] crypto: adiantum - add Adiantum support
  2018-10-15 17:54 ` [RFC PATCH v2 11/12] crypto: adiantum - add Adiantum support Eric Biggers
@ 2018-10-20  4:17   ` Ard Biesheuvel
  2018-10-20  7:12     ` Eric Biggers
  0 siblings, 1 reply; 54+ messages in thread
From: Ard Biesheuvel @ 2018-10-20  4:17 UTC (permalink / raw)
  To: Eric Biggers
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fscrypt,
	linux-arm-kernel, Linux Kernel Mailing List, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

On 16 October 2018 at 01:54, Eric Biggers <ebiggers@kernel.org> wrote:
> From: Eric Biggers <ebiggers@google.com>
>
> Add support for the Adiantum encryption mode.  Adiantum was designed by
> Paul Crowley and is specified by our paper:
>
>     Adiantum: length-preserving encryption for entry-level processors
>     (https://eprint.iacr.org/2018/720.pdf)
>
> See our paper for full details; this patch only provides an overview.
>
> Adiantum is a tweakable, length-preserving encryption mode designed for
> fast and secure disk encryption, especially on CPUs without dedicated
> crypto instructions.  Adiantum encrypts each sector using the XChaCha12
> stream cipher, two passes of an ε-almost-∆-universal (εA∆U) hash
> function, and an invocation of the AES-256 block cipher on a single
> 16-byte block.  On CPUs without AES instructions, Adiantum is much
> faster than AES-XTS; for example, on ARM Cortex-A7, on 4096-byte sectors
> Adiantum encryption is about 4 times faster than AES-256-XTS encryption,
> and decryption about 5 times faster.
>
> Adiantum is a specialization of the more general HBSH construction.  Our
> earlier proposal, HPolyC, was also a HBSH specialization, but it used a
> different εA∆U hash function, one based on Poly1305 only.  Adiantum's
> εA∆U hash function, which is based primarily on the "NH" hash function
> like that used in UMAC (RFC4418), is about twice as fast as HPolyC's;
> consequently, Adiantum is about 20% faster than HPolyC.
>
> This speed comes with no loss of security: Adiantum is provably just as
> secure as HPolyC, in fact slightly *more* secure.  Like HPolyC,
> Adiantum's security is reducible to that of XChaCha12 and AES-256,
> subject to a security bound.  XChaCha12 itself has a security reduction
> to ChaCha12.  Therefore, one need not "trust" Adiantum; one need only
> trust ChaCha12 and AES-256.  Note that the εA∆U hash function is only
> used for its proven combinatorial properties so cannot be "broken".
>

So what happens if the part of the input covered by the block cipher
is identical between different generations of the same disk block
(whose sector number is used as the 'outer' IV)?  How are we not in the
same boat as before when using stream ciphers for disk encryption?
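
To make the "same boat" concern concrete: with a plain stream cipher, rewriting a sector under the same key and IV reuses the keystream, so XORing the two ciphertext generations cancels it and exposes the XOR of the plaintexts. The sketch below shows only that failure mode. It uses a toy xorshift keystream (not ChaCha, not secure), all names are hypothetical, and it does not model Adiantum, where per the quoted rbuf comment the stream cipher IV is built from the intermediate block instead of being taken from the sector number directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toy keystream cipher: NOT ChaCha and not secure.  It only needs to be
 * deterministic in (key, iv), which is the property that matters here.
 */
void toy_stream_xor(uint64_t key, uint64_t iv, const uint8_t *in,
		    uint8_t *out, size_t len)
{
	uint64_t s = key ^ (iv * 0x9E3779B97F4A7C15ULL);
	size_t i;

	if (s == 0)
		s = 1;		/* xorshift state must be nonzero */
	for (i = 0; i < len; i++) {
		s ^= s << 13;
		s ^= s >> 7;
		s ^= s << 17;
		out[i] = in[i] ^ (uint8_t)s;
	}
}

/* Return 1 if (c1 XOR c2) == (p1 XOR p2) byte for byte, else 0. */
int keystream_cancels(const uint8_t *p1, const uint8_t *p2,
		      const uint8_t *c1, const uint8_t *c2, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		if ((c1[i] ^ c2[i]) != (p1[i] ^ p2[i]))
			return 0;
	}
	return 1;
}
```

Encrypting two different 16-byte buffers with the same (key, iv) and passing all four buffers to keystream_cancels() returns 1 regardless of the keystream generator used.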

> Adiantum is also a true wide-block encryption mode, so flipping any
> plaintext bit in the sector scrambles the entire ciphertext, and vice
> versa.  No other such mode is available in the kernel currently; doing
> the same with XTS scrambles only 16 bytes.  Adiantum also supports
> arbitrary-length tweaks and naturally supports any length input >= 16
> bytes without needing "ciphertext stealing".
>
> For the stream cipher, Adiantum uses XChaCha12 rather than XChaCha20 in
> order to make encryption feasible on the widest range of devices.
> Although the 20-round variant is quite popular, the best known attacks
> on ChaCha are on only 7 rounds, so ChaCha12 still has a substantial
> security margin; in fact, larger than AES-256's.  12-round Salsa20 is
> also the eSTREAM recommendation.  For the block cipher, Adiantum uses
> AES-256, despite it having a lower security margin than XChaCha12 and
> needing table lookups, due to AES's extensive adoption and analysis
> making it the obvious first choice.  Nevertheless, for flexibility this
> patch also permits the "adiantum" template to be instantiated with
> XChaCha20 and/or with an alternate block cipher.
>
> We need Adiantum support in the kernel for use in dm-crypt and fscrypt,
> where currently the only other suitable options are block cipher modes
> such as AES-XTS.  A big problem with this is that many low-end mobile
> devices (e.g. Android Go phones sold primarily in developing countries,
> as well as some smartwatches) still have CPUs that lack AES
> instructions, e.g. ARM Cortex-A7.  Sadly, AES-XTS encryption is much too
> slow to be viable on these devices.  We did find that some "lightweight"
> block ciphers are fast enough, but these suffer from problems such as
> not having much cryptanalysis or being too controversial.
>
> The ChaCha stream cipher has excellent performance but is insecure to
> use directly for disk encryption, since each sector's IV is reused each
> time it is overwritten.  Even restricting the threat model to offline
> attacks only isn't enough, since modern flash storage devices don't
> guarantee that "overwrites" are really overwrites, due to wear-leveling.
> Adiantum avoids this problem by constructing a
> "tweakable super-pseudorandom permutation"; this is the strongest
> possible security model for length-preserving encryption.
>
> Of course, storing random nonces along with the ciphertext would be the
> ideal solution.  But doing that with existing hardware and filesystems
> runs into major practical problems; in most cases it would require data
> journaling (like dm-integrity) which severely degrades performance.
> Thus, for now length-preserving encryption is still needed.
>
> Signed-off-by: Eric Biggers <ebiggers@google.com>
> ---
>  crypto/Kconfig    |  23 ++
>  crypto/Makefile   |   1 +
>  crypto/adiantum.c | 648 ++++++++++++++++++++++++++++++++++++++++++++++
>  crypto/testmgr.c  |  12 +
>  crypto/testmgr.h  | 461 +++++++++++++++++++++++++++++++++
>  5 files changed, 1145 insertions(+)
>  create mode 100644 crypto/adiantum.c
>
> diff --git a/crypto/Kconfig b/crypto/Kconfig
> index 431beca903623..d60a8575049c0 100644
> --- a/crypto/Kconfig
> +++ b/crypto/Kconfig
> @@ -498,6 +498,29 @@ config CRYPTO_NHPOLY1305
>         select CRYPTO_HASH
>         select CRYPTO_POLY1305
>
> +config CRYPTO_ADIANTUM
> +       tristate "Adiantum support"
> +       select CRYPTO_CHACHA20
> +       select CRYPTO_POLY1305
> +       select CRYPTO_NHPOLY1305
> +       help
> +         Adiantum is a tweakable, length-preserving encryption mode
> +         designed for fast and secure disk encryption, especially on
> +         CPUs without dedicated crypto instructions.  It encrypts
> +         each sector using the XChaCha12 stream cipher, two passes of
> +         an ε-almost-∆-universal hash function, and an invocation of
> +         the AES-256 block cipher on a single 16-byte block.  On CPUs
> +         without AES instructions, Adiantum is much faster than
> +         AES-XTS.
> +
> +         Adiantum's security is provably reducible to that of its
> +         underlying stream and block ciphers, subject to a security
> +         bound.  Unlike XTS, Adiantum is a true wide-block encryption
> +         mode, so it actually provides an even stronger notion of
> +         security than XTS, subject to the security bound.
> +
> +         If unsure, say N.
> +
>  comment "Hash modes"
>
>  config CRYPTO_CMAC
> diff --git a/crypto/Makefile b/crypto/Makefile
> index 87b86f221a2a2..1c66475593af3 100644
> --- a/crypto/Makefile
> +++ b/crypto/Makefile
> @@ -84,6 +84,7 @@ obj-$(CONFIG_CRYPTO_LRW) += lrw.o
>  obj-$(CONFIG_CRYPTO_XTS) += xts.o
>  obj-$(CONFIG_CRYPTO_CTR) += ctr.o
>  obj-$(CONFIG_CRYPTO_KEYWRAP) += keywrap.o
> +obj-$(CONFIG_CRYPTO_ADIANTUM) += adiantum.o
>  obj-$(CONFIG_CRYPTO_NHPOLY1305) += nhpoly1305.o
>  obj-$(CONFIG_CRYPTO_GCM) += gcm.o
>  obj-$(CONFIG_CRYPTO_CCM) += ccm.o
> diff --git a/crypto/adiantum.c b/crypto/adiantum.c
> new file mode 100644
> index 0000000000000..b5738ea2f98f5
> --- /dev/null
> +++ b/crypto/adiantum.c
> @@ -0,0 +1,648 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Adiantum length-preserving encryption mode
> + *
> + * Copyright 2018 Google LLC
> + */
> +
> +/*
> + * Adiantum is a tweakable, length-preserving encryption mode designed for fast
> + * and secure disk encryption, especially on CPUs without dedicated crypto
> + * instructions.  Adiantum encrypts each sector using the XChaCha12 stream
> + * cipher, two passes of an ε-almost-∆-universal (εA∆U) hash function based on
> + * NH and Poly1305, and an invocation of the AES-256 block cipher on a single
> + * 16-byte block.  See the paper for details:
> + *
> + *     Adiantum: length-preserving encryption for entry-level processors
> + *      (https://eprint.iacr.org/2018/720.pdf)
> + *
> + * For flexibility, this implementation also allows other ciphers:
> + *
> + *     - Stream cipher: XChaCha12 or XChaCha20
> + *     - Block cipher: any with a 128-bit block size and 256-bit key
> + *
> + * This implementation doesn't currently allow other εA∆U hash functions, i.e.
> + * HPolyC is not supported.  This is because Adiantum is ~20% faster than HPolyC
> + * but still provably as secure, and also the εA∆U hash function of HBSH is
> + * formally defined to take two inputs (tweak, message) which makes it difficult
> + * to wrap with the crypto_shash API.  Rather, some details need to be handled
> + * here.  Nevertheless, if needed in the future, support for other εA∆U hash
> + * functions could be added here.
> + */
> +
> +#include <crypto/b128ops.h>
> +#include <crypto/chacha.h>
> +#include <crypto/internal/hash.h>
> +#include <crypto/internal/skcipher.h>
> +#include <crypto/nhpoly1305.h>
> +#include <crypto/scatterwalk.h>
> +#include <linux/module.h>
> +
> +#include "internal.h"
> +
> +/*
> + * Size of right-hand block of input data, in bytes; also the block cipher's
> + * block size and the size of the hash function's output.
> + */
> +#define BLOCKCIPHER_BLOCK_SIZE         16
> +
> +/* Size of the block cipher key (K_E) in bytes */
> +#define BLOCKCIPHER_KEY_SIZE           32
> +
> +/* Size of the hash key (K_H) in bytes */
> +#define HASH_KEY_SIZE          (POLY1305_BLOCK_SIZE + NHPOLY1305_KEY_SIZE)
> +
> +/*
> + * The specification allows variable-length tweaks, but Linux's crypto API
> + * currently only allows algorithms to support a single length.  The "natural"
> + * tweak length for Adiantum is 16, since that fits into one Poly1305 block for
> + * the best performance.  But longer tweaks are useful for fscrypt, to avoid
> + * needing to derive per-file keys.  So instead we use two blocks, or 32 bytes.
> + */
> +#define TWEAK_SIZE             32
> +
> +struct adiantum_instance_ctx {
> +       struct crypto_skcipher_spawn streamcipher_spawn;
> +       struct crypto_spawn blockcipher_spawn;
> +       struct crypto_shash_spawn hash_spawn;
> +};
> +
> +struct adiantum_tfm_ctx {
> +       struct crypto_skcipher *streamcipher;
> +       struct crypto_cipher *blockcipher;
> +       struct crypto_shash *hash;
> +       struct poly1305_key header_hash_key;
> +};
> +
> +struct adiantum_request_ctx {
> +
> +       /*
> +        * Buffer for right-hand block of data, i.e.
> +        *
> +        *    P_R => P_M => C_M => C_R when encrypting, or
> +        *    C_R => C_M => P_M => P_R when decrypting.
> +        *
> +        * Also used to build the IV for the stream cipher.
> +        */
> +       union {
> +               u8 bytes[XCHACHA_IV_SIZE];
> +               __le32 words[XCHACHA_IV_SIZE / sizeof(__le32)];
> +               le128 bignum;   /* interpret as element of Z/(2^{128}Z) */
> +       } rbuf;
> +
> +       bool enc; /* true if encrypting, false if decrypting */
> +
> +       /*
> +        * The result of the Poly1305 εA∆U hash function applied to
> +        * (message length, tweak).
> +        */
> +       le128 header_hash;
> +
> +       /* Sub-requests, must be last */
> +       union {
> +               struct shash_desc hash_desc;
> +               struct skcipher_request streamcipher_req;
> +       } u;
> +};
> +
> +/*
> + * Given the XChaCha stream key K_S, derive the block cipher key K_E and the
> + * hash key K_H as follows:
> + *
> + *     K_E || K_H || ... = XChaCha(key=K_S, nonce=1||0^191)
> + *
> + * Note that this denotes using bits from the XChaCha keystream, which here we
> + * get indirectly by encrypting a buffer containing all 0's.
> + */
> +static int adiantum_setkey(struct crypto_skcipher *tfm, const u8 *key,
> +                          unsigned int keylen)
> +{
> +       struct adiantum_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
> +       struct {
> +               u8 iv[XCHACHA_IV_SIZE];
> +               u8 derived_keys[BLOCKCIPHER_KEY_SIZE + HASH_KEY_SIZE];
> +               struct scatterlist sg;
> +               struct crypto_wait wait;
> +               struct skcipher_request req; /* must be last */
> +       } *data;
> +       u8 *keyp;
> +       int err;
> +
> +       /* Set the stream cipher key (K_S) */
> +       crypto_skcipher_clear_flags(tctx->streamcipher, CRYPTO_TFM_REQ_MASK);
> +       crypto_skcipher_set_flags(tctx->streamcipher,
> +                                 crypto_skcipher_get_flags(tfm) &
> +                                 CRYPTO_TFM_REQ_MASK);
> +       err = crypto_skcipher_setkey(tctx->streamcipher, key, keylen);
> +       crypto_skcipher_set_flags(tfm,
> +                               crypto_skcipher_get_flags(tctx->streamcipher) &
> +                               CRYPTO_TFM_RES_MASK);
> +       if (err)
> +               return err;
> +
> +       /* Derive the subkeys */
> +       data = kzalloc(sizeof(*data) +
> +                      crypto_skcipher_reqsize(tctx->streamcipher), GFP_KERNEL);
> +       if (!data)
> +               return -ENOMEM;
> +       data->iv[0] = 1;
> +       sg_init_one(&data->sg, data->derived_keys, sizeof(data->derived_keys));
> +       crypto_init_wait(&data->wait);
> +       skcipher_request_set_tfm(&data->req, tctx->streamcipher);
> +       skcipher_request_set_callback(&data->req, CRYPTO_TFM_REQ_MAY_SLEEP |
> +                                                 CRYPTO_TFM_REQ_MAY_BACKLOG,
> +                                     crypto_req_done, &data->wait);
> +       skcipher_request_set_crypt(&data->req, &data->sg, &data->sg,
> +                                  sizeof(data->derived_keys), data->iv);
> +       err = crypto_wait_req(crypto_skcipher_encrypt(&data->req), &data->wait);
> +       if (err)
> +               goto out;
> +       keyp = data->derived_keys;
> +
> +       /* Set the block cipher key (K_E) */
> +       crypto_cipher_clear_flags(tctx->blockcipher, CRYPTO_TFM_REQ_MASK);
> +       crypto_cipher_set_flags(tctx->blockcipher,
> +                               crypto_skcipher_get_flags(tfm) &
> +                               CRYPTO_TFM_REQ_MASK);
> +       err = crypto_cipher_setkey(tctx->blockcipher, keyp,
> +                                  BLOCKCIPHER_KEY_SIZE);
> +       crypto_skcipher_set_flags(tfm,
> +                                 crypto_cipher_get_flags(tctx->blockcipher) &
> +                                 CRYPTO_TFM_RES_MASK);
> +       if (err)
> +               goto out;
> +       keyp += BLOCKCIPHER_KEY_SIZE;
> +
> +       /* Set the hash key (K_H) */
> +       poly1305_core_setkey(&tctx->header_hash_key, keyp);
> +       keyp += POLY1305_BLOCK_SIZE;
> +
> +       crypto_shash_clear_flags(tctx->hash, CRYPTO_TFM_REQ_MASK);
> +       crypto_shash_set_flags(tctx->hash, crypto_skcipher_get_flags(tfm) &
> +                                          CRYPTO_TFM_REQ_MASK);
> +       err = crypto_shash_setkey(tctx->hash, keyp, NHPOLY1305_KEY_SIZE);
> +       crypto_skcipher_set_flags(tfm, crypto_shash_get_flags(tctx->hash) &
> +                                      CRYPTO_TFM_RES_MASK);
> +       keyp += NHPOLY1305_KEY_SIZE;
> +       WARN_ON(keyp != &data->derived_keys[ARRAY_SIZE(data->derived_keys)]);
> +out:
> +       kzfree(data);
> +       return err;
> +}
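
The layout of the derived key buffer in the quoted adiantum_setkey() can be written out as a small userspace sketch. Assumptions: POLY1305_BLOCK_SIZE is 16 and NHPOLY1305_KEY_SIZE is 1088 (16 + 1072 NH key bytes, from the nhpoly1305.h definitions); struct subkeys and split_derived_keys() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCKCIPHER_KEY_SIZE	32
#define POLY1305_BLOCK_SIZE	16	/* assumption, as in <crypto/poly1305.h> */
#define NHPOLY1305_KEY_SIZE	1088	/* 16 + 1072 NH key bytes */
#define HASH_KEY_SIZE		(POLY1305_BLOCK_SIZE + NHPOLY1305_KEY_SIZE)
#define DERIVED_KEYS_SIZE	(BLOCKCIPHER_KEY_SIZE + HASH_KEY_SIZE)

struct subkeys {
	const uint8_t *block_key;	/* K_E */
	const uint8_t *header_hash_key;	/* first part of K_H */
	const uint8_t *nhpoly1305_key;	/* remainder of K_H */
};

/*
 * Slice the keystream-derived buffer in the order used by the patch.
 * Returns the number of bytes consumed, which should equal
 * DERIVED_KEYS_SIZE (the patch checks the same thing with WARN_ON).
 */
size_t split_derived_keys(const uint8_t *derived, struct subkeys *out)
{
	const uint8_t *p = derived;

	out->block_key = p;
	p += BLOCKCIPHER_KEY_SIZE;

	out->header_hash_key = p;
	p += POLY1305_BLOCK_SIZE;

	out->nhpoly1305_key = p;
	p += NHPOLY1305_KEY_SIZE;

	return (size_t)(p - derived);
}
```

With these sizes the buffer is 1136 bytes: K_E at offset 0, the header hash key at offset 32, and the NHPoly1305 key at offset 48.
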
> +
> +/* Addition in Z/(2^{128}Z) */
> +static inline void le128_add(le128 *r, const le128 *v1, const le128 *v2)
> +{
> +       u64 x = le64_to_cpu(v1->b);
> +       u64 y = le64_to_cpu(v2->b);
> +
> +       r->b = cpu_to_le64(x + y);
> +       r->a = cpu_to_le64(le64_to_cpu(v1->a) + le64_to_cpu(v2->a) +
> +                          (x + y < x));
> +}
> +
> +/* Subtraction in Z/(2^{128}Z) */
> +static inline void le128_sub(le128 *r, const le128 *v1, const le128 *v2)
> +{
> +       u64 x = le64_to_cpu(v1->b);
> +       u64 y = le64_to_cpu(v2->b);
> +
> +       r->b = cpu_to_le64(x - y);
> +       r->a = cpu_to_le64(le64_to_cpu(v1->a) - le64_to_cpu(v2->a) -
> +                          (x - y > x));
> +}
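
The carry and borrow tricks in the quoted le128_add()/le128_sub() can be exercised in userspace. This is a sketch with a host-endian stand-in type: struct u128 and the u128_* names are hypothetical, and as in le128 the 'b' field is the low half and 'a' the high half.

```c
#include <assert.h>
#include <stdint.h>

/* Host-endian stand-in for le128: 'a' is the high half, 'b' the low half. */
struct u128 {
	uint64_t a;
	uint64_t b;
};

/* Addition modulo 2^128 */
struct u128 u128_add(struct u128 v1, struct u128 v2)
{
	struct u128 r;

	r.b = v1.b + v2.b;
	/* The low half wrapped around iff the sum is smaller than an operand */
	r.a = v1.a + v2.a + (r.b < v1.b);
	return r;
}

/* Subtraction modulo 2^128 */
struct u128 u128_sub(struct u128 v1, struct u128 v2)
{
	struct u128 r;

	r.b = v1.b - v2.b;
	/* The low half borrowed iff the difference exceeds the minuend */
	r.a = v1.a - v2.a - (r.b > v1.b);
	return r;
}
```

For example, adding 1 to 2^64 - 1 carries into the high half, and subtracting 1 again borrows it back.
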
> +
> +/*
> + * Apply the Poly1305 εA∆U hash function to (message length, tweak) and save the
> + * result to rctx->header_hash.
> + *
> + * This value is reused in both the first and second hash steps.  Specifically,
> + * it's added to the result of an independently keyed εA∆U hash function (for
> + * equal length inputs only) taken over the message.  This gives the overall
> + * Adiantum hash of the (tweak, message) pair.
> + */
> +static void adiantum_hash_header(struct skcipher_request *req)
> +{
> +       struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
> +       const struct adiantum_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
> +       struct adiantum_request_ctx *rctx = skcipher_request_ctx(req);
> +       const unsigned int bulk_len = req->cryptlen - BLOCKCIPHER_BLOCK_SIZE;
> +       struct {
> +               __le64 message_bits;
> +               __le64 padding;
> +       } header = {
> +               .message_bits = cpu_to_le64((u64)bulk_len * 8)
> +       };
> +       struct poly1305_state state;
> +
> +       poly1305_core_init(&state);
> +
> +       BUILD_BUG_ON(sizeof(header) % POLY1305_BLOCK_SIZE != 0);
> +       poly1305_core_blocks(&state, &tctx->header_hash_key,
> +                            &header, sizeof(header) / POLY1305_BLOCK_SIZE);
> +
> +       BUILD_BUG_ON(TWEAK_SIZE % POLY1305_BLOCK_SIZE != 0);
> +       poly1305_core_blocks(&state, &tctx->header_hash_key, req->iv,
> +                            TWEAK_SIZE / POLY1305_BLOCK_SIZE);
> +
> +       poly1305_core_emit(&state, &rctx->header_hash);
> +}
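
To make the header layout concrete, a small userspace sketch of the
16-byte block that is hashed ahead of the tweak.  The function name
adiantum_header_sketch() is made up for illustration; the layout
(little-endian 64-bit bit count followed by 8 zero bytes) is taken from
the struct in the quoted code.

```c
#include <stdint.h>

/*
 * Fill a 16-byte header: bytes 0..7 hold the bulk length in bits as a
 * little-endian 64-bit integer, bytes 8..15 are zero padding.
 */
void adiantum_header_sketch(uint8_t out[16], unsigned int bulk_len)
{
	uint64_t bits = (uint64_t)bulk_len * 8;
	int i;

	for (i = 0; i < 8; i++)
		out[i] = (uint8_t)(bits >> (8 * i));
	for (i = 8; i < 16; i++)
		out[i] = 0;
}
```

For a 4096-byte sector, bulk_len is 4080, so the header starts with
0x80 0x7f and is zero everywhere else.
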
> +
> +/* Hash the left-hand part (the "bulk") of the message using NHPoly1305 */
> +static int adiantum_hash_message(struct skcipher_request *req,
> +                                struct scatterlist *sgl, le128 *digest)
> +{
> +       struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
> +       const struct adiantum_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
> +       struct adiantum_request_ctx *rctx = skcipher_request_ctx(req);
> +       const unsigned int bulk_len = req->cryptlen - BLOCKCIPHER_BLOCK_SIZE;
> +       struct shash_desc *hash_desc = &rctx->u.hash_desc;
> +       struct sg_mapping_iter miter;
> +       unsigned int i, n;
> +       int err;
> +
> +       hash_desc->tfm = tctx->hash;
> +       hash_desc->flags = 0;
> +
> +       err = crypto_shash_init(hash_desc);
> +       if (err)
> +               return err;
> +
> +       sg_miter_start(&miter, sgl, sg_nents(sgl),
> +                      SG_MITER_FROM_SG | SG_MITER_ATOMIC);
> +       for (i = 0; i < bulk_len; i += n) {
> +               sg_miter_next(&miter);
> +               n = min_t(unsigned int, miter.length, bulk_len - i);
> +               err = crypto_shash_update(hash_desc, miter.addr, n);
> +               if (err)
> +                       break;
> +       }
> +       sg_miter_stop(&miter);
> +       if (err)
> +               return err;
> +
> +       return crypto_shash_final(hash_desc, (u8 *)digest);
> +}
> +
> +/* Continue Adiantum encryption/decryption after the stream cipher step */
> +static int adiantum_finish(struct skcipher_request *req)
> +{
> +       struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
> +       const struct adiantum_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
> +       struct adiantum_request_ctx *rctx = skcipher_request_ctx(req);
> +       const unsigned int bulk_len = req->cryptlen - BLOCKCIPHER_BLOCK_SIZE;
> +       le128 digest;
> +       int err;
> +
> +       /* If decrypting, decrypt C_M with the block cipher to get P_M */
> +       if (!rctx->enc)
> +               crypto_cipher_decrypt_one(tctx->blockcipher, rctx->rbuf.bytes,
> +                                         rctx->rbuf.bytes);
> +
> +       /*
> +        * Second hash step
> +        *      enc: C_R = C_M - H_{K_H}(T, C_L)
> +        *      dec: P_R = P_M - H_{K_H}(T, P_L)
> +        */
> +       err = adiantum_hash_message(req, req->dst, &digest);
> +       if (err)
> +               return err;
> +       le128_add(&digest, &digest, &rctx->header_hash);
> +       le128_sub(&rctx->rbuf.bignum, &rctx->rbuf.bignum, &digest);
> +       scatterwalk_map_and_copy(&rctx->rbuf.bignum, req->dst,
> +                                bulk_len, BLOCKCIPHER_BLOCK_SIZE, 1);
> +       return 0;
> +}
> +
> +static void adiantum_streamcipher_done(struct crypto_async_request *areq, int err)
> +{
> +       struct skcipher_request *req = areq->data;
> +
> +       if (!err)
> +               err = adiantum_finish(req);
> +
> +       skcipher_request_complete(req, err);
> +}
> +
> +static int adiantum_crypt(struct skcipher_request *req, bool enc)
> +{
> +       struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
> +       const struct adiantum_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
> +       struct adiantum_request_ctx *rctx = skcipher_request_ctx(req);
> +       const unsigned int bulk_len = req->cryptlen - BLOCKCIPHER_BLOCK_SIZE;
> +       unsigned int stream_len;
> +       le128 digest;
> +       int err;
> +
> +       if (req->cryptlen < BLOCKCIPHER_BLOCK_SIZE)
> +               return -EINVAL;
> +
> +       rctx->enc = enc;
> +
> +       /*
> +        * First hash step
> +        *      enc: P_M = P_R + H_{K_H}(T, P_L)
> +        *      dec: C_M = C_R + H_{K_H}(T, C_L)
> +        */
> +       adiantum_hash_header(req);
> +       err = adiantum_hash_message(req, req->src, &digest);
> +       if (err)
> +               return err;
> +       le128_add(&digest, &digest, &rctx->header_hash);
> +       scatterwalk_map_and_copy(&rctx->rbuf.bignum, req->src,
> +                                bulk_len, BLOCKCIPHER_BLOCK_SIZE, 0);
> +       le128_add(&rctx->rbuf.bignum, &rctx->rbuf.bignum, &digest);
> +
> +       /* If encrypting, encrypt P_M with the block cipher to get C_M */
> +       if (enc)
> +               crypto_cipher_encrypt_one(tctx->blockcipher, rctx->rbuf.bytes,
> +                                         rctx->rbuf.bytes);
> +
> +       /* Initialize the rest of the XChaCha IV (first part is C_M) */
> +       BUILD_BUG_ON(BLOCKCIPHER_BLOCK_SIZE != 16);
> +       BUILD_BUG_ON(XCHACHA_IV_SIZE != 32);    /* nonce || stream position */
> +       rctx->rbuf.words[4] = cpu_to_le32(1);
> +       rctx->rbuf.words[5] = 0;
> +       rctx->rbuf.words[6] = 0;
> +       rctx->rbuf.words[7] = 0;
> +
> +       /*
> +        * XChaCha needs to be done on all the data except the last 16 bytes;
> +        * for disk encryption that usually means 4080 or 496 bytes.  But ChaCha
> +        * implementations tend to be most efficient when passed a whole number
> +        * of 64-byte ChaCha blocks, or sometimes even a multiple of 256 bytes.
> +        * And here it doesn't matter whether the last 16 bytes are written to,
> +        * as the second hash step will overwrite them.  Thus, round the XChaCha
> +        * length up to the next 64-byte boundary if possible.
> +        */
> +       stream_len = bulk_len;
> +       if (round_up(stream_len, CHACHA_BLOCK_SIZE) <= req->cryptlen)
> +               stream_len = round_up(stream_len, CHACHA_BLOCK_SIZE);
> +
> +       skcipher_request_set_tfm(&rctx->u.streamcipher_req, tctx->streamcipher);
> +       skcipher_request_set_crypt(&rctx->u.streamcipher_req, req->src,
> +                                  req->dst, stream_len, &rctx->rbuf);
> +       skcipher_request_set_callback(&rctx->u.streamcipher_req,
> +                                     req->base.flags,
> +                                     adiantum_streamcipher_done, req);
> +       return crypto_skcipher_encrypt(&rctx->u.streamcipher_req) ?:
> +               adiantum_finish(req);
> +}
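
The stream_len rounding above can be sketched in userspace as follows.
This is an illustration under the constants the patch asserts (16-byte
block cipher block, 64-byte ChaCha block); round_up_pow2() and
adiantum_stream_len_sketch() are hypothetical names, not kernel
functions.

```c
#define BLOCKCIPHER_BLOCK_SIZE	16
#define CHACHA_BLOCK_SIZE	64

/* Round x up to a multiple of m, where m must be a power of two */
unsigned int round_up_pow2(unsigned int x, unsigned int m)
{
	return (x + m - 1) & ~(m - 1);
}

/* cryptlen must be >= BLOCKCIPHER_BLOCK_SIZE, as checked in the patch */
unsigned int adiantum_stream_len_sketch(unsigned int cryptlen)
{
	unsigned int stream_len = cryptlen - BLOCKCIPHER_BLOCK_SIZE;
	unsigned int rounded = round_up_pow2(stream_len, CHACHA_BLOCK_SIZE);

	/* Only round up if the extra bytes still fall inside the message */
	if (rounded <= cryptlen)
		stream_len = rounded;
	return stream_len;
}
```

With this, 4096- and 512-byte sectors get a stream length of 4096 and
512 respectively, while a 31-byte message keeps its 15-byte bulk length
because rounding to 64 would run past the end of the buffer.
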
> +
> +static int adiantum_encrypt(struct skcipher_request *req)
> +{
> +       return adiantum_crypt(req, true);
> +}
> +
> +static int adiantum_decrypt(struct skcipher_request *req)
> +{
> +       return adiantum_crypt(req, false);
> +}
> +
> +static int adiantum_init_tfm(struct crypto_skcipher *tfm)
> +{
> +       struct skcipher_instance *inst = skcipher_alg_instance(tfm);
> +       struct adiantum_instance_ctx *ictx = skcipher_instance_ctx(inst);
> +       struct adiantum_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
> +       struct crypto_skcipher *streamcipher;
> +       struct crypto_cipher *blockcipher;
> +       struct crypto_shash *hash;
> +       unsigned int subreq_size;
> +       int err;
> +
> +       streamcipher = crypto_spawn_skcipher(&ictx->streamcipher_spawn);
> +       if (IS_ERR(streamcipher))
> +               return PTR_ERR(streamcipher);
> +
> +       blockcipher = crypto_spawn_cipher(&ictx->blockcipher_spawn);
> +       if (IS_ERR(blockcipher)) {
> +               err = PTR_ERR(blockcipher);
> +               goto err_free_streamcipher;
> +       }
> +
> +       hash = crypto_spawn_shash(&ictx->hash_spawn);
> +       if (IS_ERR(hash)) {
> +               err = PTR_ERR(hash);
> +               goto err_free_blockcipher;
> +       }
> +
> +       tctx->streamcipher = streamcipher;
> +       tctx->blockcipher = blockcipher;
> +       tctx->hash = hash;
> +
> +       BUILD_BUG_ON(offsetofend(struct adiantum_request_ctx, u) !=
> +                    sizeof(struct adiantum_request_ctx));
> +       subreq_size = max(FIELD_SIZEOF(struct adiantum_request_ctx,
> +                                      u.hash_desc) +
> +                         crypto_shash_descsize(hash),
> +                         FIELD_SIZEOF(struct adiantum_request_ctx,
> +                                      u.streamcipher_req) +
> +                         crypto_skcipher_reqsize(streamcipher));
> +
> +       crypto_skcipher_set_reqsize(tfm,
> +                                   offsetof(struct adiantum_request_ctx, u) +
> +                                   subreq_size);
> +       return 0;
> +
> +err_free_blockcipher:
> +       crypto_free_cipher(blockcipher);
> +err_free_streamcipher:
> +       crypto_free_skcipher(streamcipher);
> +       return err;
> +}
> +
> +static void adiantum_exit_tfm(struct crypto_skcipher *tfm)
> +{
> +       struct adiantum_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
> +
> +       crypto_free_skcipher(tctx->streamcipher);
> +       crypto_free_cipher(tctx->blockcipher);
> +       crypto_free_shash(tctx->hash);
> +}
> +
> +static void adiantum_free_instance(struct skcipher_instance *inst)
> +{
> +       struct adiantum_instance_ctx *ictx = skcipher_instance_ctx(inst);
> +
> +       crypto_drop_skcipher(&ictx->streamcipher_spawn);
> +       crypto_drop_spawn(&ictx->blockcipher_spawn);
> +       crypto_drop_shash(&ictx->hash_spawn);
> +       kfree(inst);
> +}
> +
> +/*
> + * Check for a supported set of inner algorithms.
> + * See the comment at the beginning of this file.
> + */
> +static bool adiantum_supported_algorithms(struct skcipher_alg *streamcipher_alg,
> +                                         struct crypto_alg *blockcipher_alg,
> +                                         struct shash_alg *hash_alg)
> +{
> +       if (strcmp(streamcipher_alg->base.cra_name, "xchacha12") != 0 &&
> +           strcmp(streamcipher_alg->base.cra_name, "xchacha20") != 0)
> +               return false;
> +
> +       if (blockcipher_alg->cra_cipher.cia_min_keysize > BLOCKCIPHER_KEY_SIZE ||
> +           blockcipher_alg->cra_cipher.cia_max_keysize < BLOCKCIPHER_KEY_SIZE)
> +               return false;
> +       if (blockcipher_alg->cra_blocksize != BLOCKCIPHER_BLOCK_SIZE)
> +               return false;
> +
> +       if (strcmp(hash_alg->base.cra_name, "nhpoly1305") != 0)
> +               return false;
> +
> +       return true;
> +}
> +
> +static int adiantum_create(struct crypto_template *tmpl, struct rtattr **tb)
> +{
> +       struct crypto_attr_type *algt;
> +       const char *streamcipher_name;
> +       const char *blockcipher_name;
> +       struct skcipher_instance *inst;
> +       struct adiantum_instance_ctx *ictx;
> +       struct skcipher_alg *streamcipher_alg;
> +       struct crypto_alg *blockcipher_alg;
> +       struct crypto_alg *_hash_alg;
> +       struct shash_alg *hash_alg;
> +       int err;
> +
> +       algt = crypto_get_attr_type(tb);
> +       if (IS_ERR(algt))
> +               return PTR_ERR(algt);
> +
> +       if ((algt->type ^ CRYPTO_ALG_TYPE_SKCIPHER) & algt->mask)
> +               return -EINVAL;
> +
> +       streamcipher_name = crypto_attr_alg_name(tb[1]);
> +       if (IS_ERR(streamcipher_name))
> +               return PTR_ERR(streamcipher_name);
> +
> +       blockcipher_name = crypto_attr_alg_name(tb[2]);
> +       if (IS_ERR(blockcipher_name))
> +               return PTR_ERR(blockcipher_name);
> +
> +       inst = kzalloc(sizeof(*inst) + sizeof(*ictx), GFP_KERNEL);
> +       if (!inst)
> +               return -ENOMEM;
> +       ictx = skcipher_instance_ctx(inst);
> +
> +       /* Stream cipher, e.g. "xchacha12" */
> +       err = crypto_grab_skcipher(&ictx->streamcipher_spawn, streamcipher_name,
> +                                  0, crypto_requires_sync(algt->type,
> +                                                          algt->mask));
> +       if (err)
> +               goto out_free_inst;
> +       streamcipher_alg = crypto_spawn_skcipher_alg(&ictx->streamcipher_spawn);
> +
> +       /* Block cipher, e.g. "aes" */
> +       err = crypto_grab_spawn(&ictx->blockcipher_spawn, blockcipher_name,
> +                               CRYPTO_ALG_TYPE_CIPHER, CRYPTO_ALG_TYPE_MASK);
> +       if (err)
> +               goto out_drop_streamcipher;
> +       blockcipher_alg = ictx->blockcipher_spawn.alg;
> +
> +       /* NHPoly1305 εA∆U hash function */
> +       _hash_alg = crypto_alg_mod_lookup("nhpoly1305", CRYPTO_ALG_TYPE_SHASH,
> +                                         CRYPTO_ALG_TYPE_MASK);
> +       if (IS_ERR(_hash_alg)) {
> +               err = PTR_ERR(_hash_alg);
> +               goto out_drop_blockcipher;
> +       }
> +       hash_alg = __crypto_shash_alg(_hash_alg);
> +       err = crypto_init_shash_spawn(&ictx->hash_spawn, hash_alg,
> +                                     skcipher_crypto_instance(inst));
> +       if (err) {
> +               crypto_mod_put(_hash_alg);
> +               goto out_drop_blockcipher;
> +       }
> +
> +       /* Check the set of algorithms */
> +       err = -EINVAL;
> +       if (!adiantum_supported_algorithms(streamcipher_alg, blockcipher_alg,
> +                                          hash_alg)) {
> +               pr_warn("Unsupported Adiantum instantiation: (%s,%s,%s)\n",
> +                       streamcipher_alg->base.cra_name,
> +                       blockcipher_alg->cra_name, hash_alg->base.cra_name);
> +               goto out_drop_hash;
> +       }
> +
> +       /* Instance fields */
> +
> +       err = -ENAMETOOLONG;
> +       if (snprintf(inst->alg.base.cra_name, CRYPTO_MAX_ALG_NAME,
> +                    "adiantum(%s,%s)", streamcipher_alg->base.cra_name,
> +                    blockcipher_alg->cra_name) >= CRYPTO_MAX_ALG_NAME)
> +               goto out_drop_hash;
> +       if (snprintf(inst->alg.base.cra_driver_name, CRYPTO_MAX_ALG_NAME,
> +                    "adiantum_base(%s,%s,%s)",
> +                    streamcipher_alg->base.cra_driver_name,
> +                    blockcipher_alg->cra_driver_name,
> +                    hash_alg->base.cra_driver_name) >= CRYPTO_MAX_ALG_NAME)
> +               goto out_drop_hash;
> +
> +       inst->alg.base.cra_blocksize = BLOCKCIPHER_BLOCK_SIZE;
> +       inst->alg.base.cra_ctxsize = sizeof(struct adiantum_tfm_ctx);
> +       inst->alg.base.cra_alignmask = streamcipher_alg->base.cra_alignmask;
> +       /*
> +        * The block cipher is only invoked once per message, so for long
> +        * messages (e.g. sectors for disk encryption) its performance doesn't
> +        * matter as much as that of the stream cipher and hash function.  Thus,
> +        * weigh the block cipher's ->cra_priority less.
> +        */
> +       inst->alg.base.cra_priority = (4 * streamcipher_alg->base.cra_priority +
> +                                      2 * hash_alg->base.cra_priority +
> +                                      blockcipher_alg->cra_priority) / 7;
> +
> +       inst->alg.setkey = adiantum_setkey;
> +       inst->alg.encrypt = adiantum_encrypt;
> +       inst->alg.decrypt = adiantum_decrypt;
> +       inst->alg.init = adiantum_init_tfm;
> +       inst->alg.exit = adiantum_exit_tfm;
> +       inst->alg.min_keysize = streamcipher_alg->min_keysize;
> +       inst->alg.max_keysize = streamcipher_alg->max_keysize;
> +       inst->alg.ivsize = TWEAK_SIZE;
> +
> +       inst->free = adiantum_free_instance;
> +
> +       err = skcipher_register_instance(tmpl, inst);
> +       if (err)
> +               goto out_drop_hash;
> +
> +       return 0;
> +
> +out_drop_hash:
> +       crypto_drop_shash(&ictx->hash_spawn);
> +out_drop_blockcipher:
> +       crypto_drop_spawn(&ictx->blockcipher_spawn);
> +out_drop_streamcipher:
> +       crypto_drop_skcipher(&ictx->streamcipher_spawn);
> +out_free_inst:
> +       kfree(inst);
> +       return err;
> +}
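
As a worked example of the ->cra_priority weighting above, a tiny
userspace sketch.  The function name and the sample priorities are
invented for illustration; only the 4:2:1 weighting and the integer
division by 7 come from the quoted code.

```c
/* Weighted average: stream cipher x4, hash x2, block cipher x1 */
int adiantum_priority_sketch(int streamcipher_prio, int hash_prio,
			     int blockcipher_prio)
{
	return (4 * streamcipher_prio + 2 * hash_prio +
		blockcipher_prio) / 7;
}
```

For instance, priorities of 300, 200 and 100 give (1200 + 400 + 100) / 7,
which truncates to 242, so the result stays close to the stream
cipher's priority.
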
> +
> +/* adiantum(streamcipher_name, blockcipher_name) */
> +static struct crypto_template adiantum_tmpl = {
> +       .name = "adiantum",
> +       .create = adiantum_create,
> +       .module = THIS_MODULE,
> +};
> +
> +static int __init adiantum_module_init(void)
> +{
> +       return crypto_register_template(&adiantum_tmpl);
> +}
> +
> +static void __exit adiantum_module_exit(void)
> +{
> +       crypto_unregister_template(&adiantum_tmpl);
> +}
> +
> +module_init(adiantum_module_init);
> +module_exit(adiantum_module_exit);
> +
> +MODULE_DESCRIPTION("Adiantum length-preserving encryption mode");
> +MODULE_LICENSE("GPL v2");
> +MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
> +MODULE_ALIAS_CRYPTO("adiantum");
> diff --git a/crypto/testmgr.c b/crypto/testmgr.c
> index 039a5d850a29c..4ce255a4509de 100644
> --- a/crypto/testmgr.c
> +++ b/crypto/testmgr.c
> @@ -2404,6 +2404,18 @@ static int alg_test_null(const struct alg_test_desc *desc,
>  /* Please keep this list sorted by algorithm name. */
>  static const struct alg_test_desc alg_test_descs[] = {
>         {
> +               .alg = "adiantum(xchacha12,aes)",
> +               .test = alg_test_skcipher,
> +               .suite = {
> +                       .cipher = __VECS(adiantum_xchacha12_aes_tv_template)
> +               },
> +       }, {
> +               .alg = "adiantum(xchacha20,aes)",
> +               .test = alg_test_skcipher,
> +               .suite = {
> +                       .cipher = __VECS(adiantum_xchacha20_aes_tv_template)
> +               },
> +       }, {
>                 .alg = "aegis128",
>                 .test = alg_test_aead,
>                 .suite = {
> diff --git a/crypto/testmgr.h b/crypto/testmgr.h
> index 40197d74b3d56..6b2fb444f6877 100644
> --- a/crypto/testmgr.h
> +++ b/crypto/testmgr.h
> @@ -33189,6 +33189,467 @@ static const struct cipher_testvec xchacha12_tv_template[] = {
>         },
>  };
>
> +/* Adiantum test vectors from https://github.com/google/adiantum */
> +static const struct cipher_testvec adiantum_xchacha12_aes_tv_template[] = {
> +       {
> +               .key    = "\x9e\xeb\xb2\x49\x3c\x1c\xf5\xf4"
> +                         "\x6a\x99\xc2\xc4\xdf\xb1\xf4\xdd"
> +                         "\x75\x20\x57\xea\x2c\x4f\xcd\xb2"
> +                         "\xa5\x3d\x7b\x49\x1e\xab\xfd\x0f",
> +               .klen   = 32,
> +               .iv     = "\xdf\x63\xd4\xab\xd2\x49\xf3\xd8"
> +                         "\x33\x81\x37\x60\x7d\xfa\x73\x08"
> +                         "\xd8\x49\x6d\x80\xe8\x2f\x62\x54"
> +                         "\xeb\x0e\xa9\x39\x5b\x45\x7f\x8a",
> +               .ptext  = "\x67\xc9\xf2\x30\x84\x41\x8e\x43"
> +                         "\xfb\xf3\xb3\x3e\x79\x36\x7f\xe8",
> +               .ctext  = "\x6d\x32\x86\x18\x67\x86\x0f\x3f"
> +                         "\x96\x7c\x9d\x28\x0d\x53\xec\x9f",
> +               .len    = 16,
> +               .also_non_np = 1,
> +               .np     = 2,
> +               .tap    = { 14, 2 },
> +       }, {
> +               .key    = "\x36\x2b\x57\x97\xf8\x5d\xcd\x99"
> +                         "\x5f\x1a\x5a\x44\x1d\x92\x0f\x27"
> +                         "\xcc\x16\xd7\x2b\x85\x63\x99\xd3"
> +                         "\xba\x96\xa1\xdb\xd2\x60\x68\xda",
> +               .klen   = 32,
> +               .iv     = "\xef\x58\x69\xb1\x2c\x5e\x9a\x47"
> +                         "\x24\xc1\xb1\x69\xe1\x12\x93\x8f"
> +                         "\x43\x3d\x6d\x00\xdb\x5e\xd8\xd9"
> +                         "\x12\x9a\xfe\xd9\xff\x2d\xaa\xc4",
> +               .ptext  = "\x5e\xa8\x68\x19\x85\x98\x12\x23"
> +                         "\x26\x0a\xcc\xdb\x0a\x04\xb9\xdf"
> +                         "\x4d\xb3\x48\x7b\xb0\xe3\xc8\x19"
> +                         "\x43\x5a\x46\x06\x94\x2d\xf2",
> +               .ctext  = "\xc7\xc6\xf1\x73\x8f\xc4\xff\x4a"
> +                         "\x39\xbe\x78\xbe\x8d\x28\xc8\x89"
> +                         "\x46\x63\xe7\x0c\x7d\x87\xe8\x4e"
> +                         "\xc9\x18\x7b\xbe\x18\x60\x50",
> +               .len    = 31,
> +       }, {
> +               .key    = "\xa5\x28\x24\x34\x1a\x3c\xd8\xf7"
> +                         "\x05\x91\x8f\xee\x85\x1f\x35\x7f"
> +                         "\x80\x3d\xfc\x9b\x94\xf6\xfc\x9e"
> +                         "\x19\x09\x00\xa9\x04\x31\x4f\x11",
> +               .klen   = 32,
> +               .iv     = "\xa1\xba\x49\x95\xff\x34\x6d\xb8"
> +                         "\xcd\x87\x5d\x5e\xfd\xea\x85\xdb"
> +                         "\x8a\x7b\x5e\xb2\x5d\x57\xdd\x62"
> +                         "\xac\xa9\x8c\x41\x42\x94\x75\xb7",
> +               .ptext  = "\x69\xb4\xe8\x8c\x37\xe8\x67\x82"
> +                         "\xf1\xec\x5d\x04\xe5\x14\x91\x13"
> +                         "\xdf\xf2\x87\x1b\x69\x81\x1d\x71"
> +                         "\x70\x9e\x9c\x3b\xde\x49\x70\x11"
> +                         "\xa0\xa3\xdb\x0d\x54\x4f\x66\x69"
> +                         "\xd7\xdb\x80\xa7\x70\x92\x68\xce"
> +                         "\x81\x04\x2c\xc6\xab\xae\xe5\x60"
> +                         "\x15\xe9\x6f\xef\xaa\x8f\xa7\xa7"
> +                         "\x63\x8f\xf2\xf0\x77\xf1\xa8\xea"
> +                         "\xe1\xb7\x1f\x9e\xab\x9e\x4b\x3f"
> +                         "\x07\x87\x5b\x6f\xcd\xa8\xaf\xb9"
> +                         "\xfa\x70\x0b\x52\xb8\xa8\xa7\x9e"
> +                         "\x07\x5f\xa6\x0e\xb3\x9b\x79\x13"
> +                         "\x79\xc3\x3e\x8d\x1c\x2c\x68\xc8"
> +                         "\x51\x1d\x3c\x7b\x7d\x79\x77\x2a"
> +                         "\x56\x65\xc5\x54\x23\x28\xb0\x03",
> +               .ctext  = "\x9e\x16\xab\xed\x4b\xa7\x42\x5a"
> +                         "\xc6\xfb\x4e\x76\xff\xbe\x03\xa0"
> +                         "\x0f\xe3\xad\xba\xe4\x98\x2b\x0e"
> +                         "\x21\x48\xa0\xb8\x65\x48\x27\x48"
> +                         "\x84\x54\x54\xb2\x9a\x94\x7b\xe6"
> +                         "\x4b\x29\xe9\xcf\x05\x91\x80\x1a"
> +                         "\x3a\xf3\x41\x96\x85\x1d\x9f\x74"
> +                         "\x51\x56\x63\xfa\x7c\x28\x85\x49"
> +                         "\xf7\x2f\xf9\xf2\x18\x46\xf5\x33"
> +                         "\x80\xa3\x3c\xce\xb2\x57\x93\xf5"
> +                         "\xae\xbd\xa9\xf5\x7b\x30\xc4\x93"
> +                         "\x66\xe0\x30\x77\x16\xe4\xa0\x31"
> +                         "\xba\x70\xbc\x68\x13\xf5\xb0\x9a"
> +                         "\xc1\xfc\x7e\xfe\x55\x80\x5c\x48"
> +                         "\x74\xa6\xaa\xa3\xac\xdc\xc2\xf5"
> +                         "\x8d\xde\x34\x86\x78\x60\x75\x8d",
> +               .len    = 128,
> +               .also_non_np = 1,
> +               .np     = 4,
> +               .tap    = { 104, 16, 4, 4 },
> +       }, {
> +               .key    = "\xd3\x81\x72\x18\x23\xff\x6f\x4a"
> +                         "\x25\x74\x29\x0d\x51\x8a\x0e\x13"
> +                         "\xc1\x53\x5d\x30\x8d\xee\x75\x0d"
> +                         "\x14\xd6\x69\xc9\x15\xa9\x0c\x60",
> +               .klen   = 32,
> +               .iv     = "\x65\x9b\xd4\xa8\x7d\x29\x1d\xf4"
> +                         "\xc4\xd6\x9b\x6a\x28\xab\x64\xe2"
> +                         "\x62\x81\x97\xc5\x81\xaa\xf9\x44"
> +                         "\xc1\x72\x59\x82\xaf\x16\xc8\x2c",
> +               .ptext  = "\xc7\x6b\x52\x6a\x10\xf0\xcc\x09"
> +                         "\xc1\x12\x1d\x6d\x21\xa6\x78\xf5"
> +                         "\x05\xa3\x69\x60\x91\x36\x98\x57"
> +                         "\xba\x0c\x14\xcc\xf3\x2d\x73\x03"
> +                         "\xc6\xb2\x5f\xc8\x16\x27\x37\x5d"
> +                         "\xd0\x0b\x87\xb2\x50\x94\x7b\x58"
> +                         "\x04\xf4\xe0\x7f\x6e\x57\x8e\xc9"
> +                         "\x41\x84\xc1\xb1\x7e\x4b\x91\x12"
> +                         "\x3a\x8b\x5d\x50\x82\x7b\xcb\xd9"
> +                         "\x9a\xd9\x4e\x18\x06\x23\x9e\xd4"
> +                         "\xa5\x20\x98\xef\xb5\xda\xe5\xc0"
> +                         "\x8a\x6a\x83\x77\x15\x84\x1e\xae"
> +                         "\x78\x94\x9d\xdf\xb7\xd1\xea\x67"
> +                         "\xaa\xb0\x14\x15\xfa\x67\x21\x84"
> +                         "\xd3\x41\x2a\xce\xba\x4b\x4a\xe8"
> +                         "\x95\x62\xa9\x55\xf0\x80\xad\xbd"
> +                         "\xab\xaf\xdd\x4f\xa5\x7c\x13\x36"
> +                         "\xed\x5e\x4f\x72\xad\x4b\xf1\xd0"
> +                         "\x88\x4e\xec\x2c\x88\x10\x5e\xea"
> +                         "\x12\xc0\x16\x01\x29\xa3\xa0\x55"
> +                         "\xaa\x68\xf3\xe9\x9d\x3b\x0d\x3b"
> +                         "\x6d\xec\xf8\xa0\x2d\xf0\x90\x8d"
> +                         "\x1c\xe2\x88\xd4\x24\x71\xf9\xb3"
> +                         "\xc1\x9f\xc5\xd6\x76\x70\xc5\x2e"
> +                         "\x9c\xac\xdb\x90\xbd\x83\x72\xba"
> +                         "\x6e\xb5\xa5\x53\x83\xa9\xa5\xbf"
> +                         "\x7d\x06\x0e\x3c\x2a\xd2\x04\xb5"
> +                         "\x1e\x19\x38\x09\x16\xd2\x82\x1f"
> +                         "\x75\x18\x56\xb8\x96\x0b\xa6\xf9"
> +                         "\xcf\x62\xd9\x32\x5d\xa9\xd7\x1d"
> +                         "\xec\xe4\xdf\x1b\xbe\xf1\x36\xee"
> +                         "\xe3\x7b\xb5\x2f\xee\xf8\x53\x3d"
> +                         "\x6a\xb7\x70\xa9\xfc\x9c\x57\x25"
> +                         "\xf2\x89\x10\xd3\xb8\xa8\x8c\x30"
> +                         "\xae\x23\x4f\x0e\x13\x66\x4f\xe1"
> +                         "\xb6\xc0\xe4\xf8\xef\x93\xbd\x6e"
> +                         "\x15\x85\x6b\xe3\x60\x81\x1d\x68"
> +                         "\xd7\x31\x87\x89\x09\xab\xd5\x96"
> +                         "\x1d\xf3\x6d\x67\x80\xca\x07\x31"
> +                         "\x5d\xa7\xe4\xfb\x3e\xf2\x9b\x33"
> +                         "\x52\x18\xc8\x30\xfe\x2d\xca\x1e"
> +                         "\x79\x92\x7a\x60\x5c\xb6\x58\x87"
> +                         "\xa4\x36\xa2\x67\x92\x8b\xa4\xb7"
> +                         "\xf1\x86\xdf\xdc\xc0\x7e\x8f\x63"
> +                         "\xd2\xa2\xdc\x78\xeb\x4f\xd8\x96"
> +                         "\x47\xca\xb8\x91\xf9\xf7\x94\x21"
> +                         "\x5f\x9a\x9f\x5b\xb8\x40\x41\x4b"
> +                         "\x66\x69\x6a\x72\xd0\xcb\x70\xb7"
> +                         "\x93\xb5\x37\x96\x05\x37\x4f\xe5"
> +                         "\x8c\xa7\x5a\x4e\x8b\xb7\x84\xea"
> +                         "\xc7\xfc\x19\x6e\x1f\x5a\xa1\xac"
> +                         "\x18\x7d\x52\x3b\xb3\x34\x62\x99"
> +                         "\xe4\x9e\x31\x04\x3f\xc0\x8d\x84"
> +                         "\x17\x7c\x25\x48\x52\x67\x11\x27"
> +                         "\x67\xbb\x5a\x85\xca\x56\xb2\x5c"
> +                         "\xe6\xec\xd5\x96\x3d\x15\xfc\xfb"
> +                         "\x22\x25\xf4\x13\xe5\x93\x4b\x9a"
> +                         "\x77\xf1\x52\x18\xfa\x16\x5e\x49"
> +                         "\x03\x45\xa8\x08\xfa\xb3\x41\x92"
> +                         "\x79\x50\x33\xca\xd0\xd7\x42\x55"
> +                         "\xc3\x9a\x0c\x4e\xd9\xa4\x3c\x86"
> +                         "\x80\x9f\x53\xd1\xa4\x2e\xd1\xbc"
> +                         "\xf1\x54\x6e\x93\xa4\x65\x99\x8e"
> +                         "\xdf\x29\xc0\x64\x63\x07\xbb\xea",
> +               .ctext  = "\x15\x97\xd0\x86\x18\x03\x9c\x51"
> +                         "\xc5\x11\x36\x62\x13\x92\xe6\x73"
> +                         "\x29\x79\xde\xa1\x00\x3e\x08\x64"
> +                         "\x17\x1a\xbc\xd5\xfe\x33\x0e\x0c"
> +                         "\x7c\x94\xa7\xc6\x3c\xbe\xac\xa2"
> +                         "\x89\xe6\xbc\xdf\x0c\x33\x27\x42"
> +                         "\x46\x73\x2f\xba\x4e\xa6\x46\x8f"
> +                         "\xe4\xee\x39\x63\x42\x65\xa3\x88"
> +                         "\x7a\xad\x33\x23\xa9\xa7\x20\x7f"
> +                         "\x0b\xe6\x6a\xc3\x60\xda\x9e\xb4"
> +                         "\xd6\x07\x8a\x77\x26\xd1\xab\x44"
> +                         "\x99\x55\x03\x5e\xed\x8d\x7b\xbd"
> +                         "\xc8\x21\xb7\x21\x30\x3f\xc0\xb5"
> +                         "\xc8\xec\x6c\x23\xa6\xa3\x6d\xf1"
> +                         "\x30\x0a\xd0\xa6\xa9\x28\x69\xae"
> +                         "\x2a\xe6\x54\xac\x82\x9d\x6a\x95"
> +                         "\x6f\x06\x44\xc5\x5a\x77\x6e\xec"
> +                         "\xf8\xf8\x63\xb2\xe6\xaa\xbd\x8e"
> +                         "\x0e\x8a\x62\x00\x03\xc8\x84\xdd"
> +                         "\x47\x4a\xc3\x55\xba\xb7\xe7\xdf"
> +                         "\x08\xbf\x62\xf5\xe8\xbc\xb6\x11"
> +                         "\xe4\xcb\xd0\x66\x74\x32\xcf\xd4"
> +                         "\xf8\x51\x80\x39\x14\x05\x12\xdb"
> +                         "\x87\x93\xe2\x26\x30\x9c\x3a\x21"
> +                         "\xe5\xd0\x38\x57\x80\x15\xe4\x08"
> +                         "\x58\x05\x49\x7d\xe6\x92\x77\x70"
> +                         "\xfb\x1e\x2d\x6a\x84\x00\xc8\x68"
> +                         "\xf7\x1a\xdd\xf0\x7b\x38\x1e\xd8"
> +                         "\x2c\x78\x78\x61\xcf\xe3\xde\x69"
> +                         "\x1f\xd5\x03\xd5\x1a\xb4\xcf\x03"
> +                         "\xc8\x7a\x70\x68\x35\xb4\xf6\xbe"
> +                         "\x90\x62\xb2\x28\x99\x86\xf5\x44"
> +                         "\x99\xeb\x31\xcf\xca\xdf\xd0\x21"
> +                         "\xd6\x60\xf7\x0f\x40\xb4\x80\xb7"
> +                         "\xab\xe1\x9b\x45\xba\x66\xda\xee"
> +                         "\xdd\x04\x12\x40\x98\xe1\x69\xe5"
> +                         "\x2b\x9c\x59\x80\xe7\x7b\xcc\x63"
> +                         "\xa6\xc0\x3a\xa9\xfe\x8a\xf9\x62"
> +                         "\x11\x34\x61\x94\x35\xfe\xf2\x99"
> +                         "\xfd\xee\x19\xea\x95\xb6\x12\xbf"
> +                         "\x1b\xdf\x02\x1a\xcc\x3e\x7e\x65"
> +                         "\x78\x74\x10\x50\x29\x63\x28\xea"
> +                         "\x6b\xab\xd4\x06\x4d\x15\x24\x31"
> +                         "\xc7\x0a\xc9\x16\xb6\x48\xf0\xbf"
> +                         "\x49\xdb\x68\x71\x31\x8f\x87\xe2"
> +                         "\x13\x05\x64\xd6\x22\x0c\xf8\x36"
> +                         "\x84\x24\x3e\x69\x5e\xb8\x9e\x16"
> +                         "\x73\x6c\x83\x1e\xe0\x9f\x9e\xba"
> +                         "\xe5\x59\x21\x33\x1b\xa9\x26\xc2"
> +                         "\xc7\xd9\x30\x73\xb6\xa6\x73\x82"
> +                         "\x19\xfa\x44\x4d\x40\x8b\x69\x04"
> +                         "\x94\x74\xea\x6e\xb3\x09\x47\x01"
> +                         "\x2a\xb9\x78\x34\x43\x11\xed\xd6"
> +                         "\x8c\x95\x65\x1b\x85\x67\xa5\x40"
> +                         "\xac\x9c\x05\x4b\x57\x4a\xa9\x96"
> +                         "\x0f\xdd\x4f\xa1\xe0\xcf\x6e\xc7"
> +                         "\x1b\xed\xa2\xb4\x56\x8c\x09\x6e"
> +                         "\xa6\x65\xd7\x55\x81\xb7\xed\x11"
> +                         "\x9b\x40\x75\xa8\x6b\x56\xaf\x16"
> +                         "\x8b\x3d\xf4\xcb\xfe\xd5\x1d\x3d"
> +                         "\x85\xc2\xc0\xde\x43\x39\x4a\x96"
> +                         "\xba\x88\x97\xc0\xd6\x00\x0e\x27"
> +                         "\x21\xb0\x21\x52\xba\xa7\x37\xaa"
> +                         "\xcc\xbf\x95\xa8\xf4\xd0\x91\xf6",
> +               .len    = 512,
> +               .also_non_np = 1,
> +               .np     = 2,
> +               .tap    = { 144, 368 },
> +       }
> +};
> +
> +/* Adiantum with XChaCha20 instead of XChaCha12 */
> +/* Test vectors from https://github.com/google/adiantum */
> +static const struct cipher_testvec adiantum_xchacha20_aes_tv_template[] = {
> +       {
> +               .key    = "\x9e\xeb\xb2\x49\x3c\x1c\xf5\xf4"
> +                         "\x6a\x99\xc2\xc4\xdf\xb1\xf4\xdd"
> +                         "\x75\x20\x57\xea\x2c\x4f\xcd\xb2"
> +                         "\xa5\x3d\x7b\x49\x1e\xab\xfd\x0f",
> +               .klen   = 32,
> +               .iv     = "\xdf\x63\xd4\xab\xd2\x49\xf3\xd8"
> +                         "\x33\x81\x37\x60\x7d\xfa\x73\x08"
> +                         "\xd8\x49\x6d\x80\xe8\x2f\x62\x54"
> +                         "\xeb\x0e\xa9\x39\x5b\x45\x7f\x8a",
> +               .ptext  = "\x67\xc9\xf2\x30\x84\x41\x8e\x43"
> +                         "\xfb\xf3\xb3\x3e\x79\x36\x7f\xe8",
> +               .ctext  = "\xf6\x78\x97\xd6\xaa\x94\x01\x27"
> +                         "\x2e\x4d\x83\xe0\x6e\x64\x9a\xdf",
> +               .len    = 16,
> +               .also_non_np = 1,
> +               .np     = 3,
> +               .tap    = { 5, 2, 9 },
> +       }, {
> +               .key    = "\x36\x2b\x57\x97\xf8\x5d\xcd\x99"
> +                         "\x5f\x1a\x5a\x44\x1d\x92\x0f\x27"
> +                         "\xcc\x16\xd7\x2b\x85\x63\x99\xd3"
> +                         "\xba\x96\xa1\xdb\xd2\x60\x68\xda",
> +               .klen   = 32,
> +               .iv     = "\xef\x58\x69\xb1\x2c\x5e\x9a\x47"
> +                         "\x24\xc1\xb1\x69\xe1\x12\x93\x8f"
> +                         "\x43\x3d\x6d\x00\xdb\x5e\xd8\xd9"
> +                         "\x12\x9a\xfe\xd9\xff\x2d\xaa\xc4",
> +               .ptext  = "\x5e\xa8\x68\x19\x85\x98\x12\x23"
> +                         "\x26\x0a\xcc\xdb\x0a\x04\xb9\xdf"
> +                         "\x4d\xb3\x48\x7b\xb0\xe3\xc8\x19"
> +                         "\x43\x5a\x46\x06\x94\x2d\xf2",
> +               .ctext  = "\x4b\xb8\x90\x10\xdf\x7f\x64\x08"
> +                         "\x0e\x14\x42\x5f\x00\x74\x09\x36"
> +                         "\x57\x72\xb5\xfd\xb5\x5d\xb8\x28"
> +                         "\x0c\x04\x91\x14\x91\xe9\x37",
> +               .len    = 31,
> +               .also_non_np = 1,
> +               .np     = 2,
> +               .tap    = { 16, 15 },
> +       }, {
> +               .key    = "\xa5\x28\x24\x34\x1a\x3c\xd8\xf7"
> +                         "\x05\x91\x8f\xee\x85\x1f\x35\x7f"
> +                         "\x80\x3d\xfc\x9b\x94\xf6\xfc\x9e"
> +                         "\x19\x09\x00\xa9\x04\x31\x4f\x11",
> +               .klen   = 32,
> +               .iv     = "\xa1\xba\x49\x95\xff\x34\x6d\xb8"
> +                         "\xcd\x87\x5d\x5e\xfd\xea\x85\xdb"
> +                         "\x8a\x7b\x5e\xb2\x5d\x57\xdd\x62"
> +                         "\xac\xa9\x8c\x41\x42\x94\x75\xb7",
> +               .ptext  = "\x69\xb4\xe8\x8c\x37\xe8\x67\x82"
> +                         "\xf1\xec\x5d\x04\xe5\x14\x91\x13"
> +                         "\xdf\xf2\x87\x1b\x69\x81\x1d\x71"
> +                         "\x70\x9e\x9c\x3b\xde\x49\x70\x11"
> +                         "\xa0\xa3\xdb\x0d\x54\x4f\x66\x69"
> +                         "\xd7\xdb\x80\xa7\x70\x92\x68\xce"
> +                         "\x81\x04\x2c\xc6\xab\xae\xe5\x60"
> +                         "\x15\xe9\x6f\xef\xaa\x8f\xa7\xa7"
> +                         "\x63\x8f\xf2\xf0\x77\xf1\xa8\xea"
> +                         "\xe1\xb7\x1f\x9e\xab\x9e\x4b\x3f"
> +                         "\x07\x87\x5b\x6f\xcd\xa8\xaf\xb9"
> +                         "\xfa\x70\x0b\x52\xb8\xa8\xa7\x9e"
> +                         "\x07\x5f\xa6\x0e\xb3\x9b\x79\x13"
> +                         "\x79\xc3\x3e\x8d\x1c\x2c\x68\xc8"
> +                         "\x51\x1d\x3c\x7b\x7d\x79\x77\x2a"
> +                         "\x56\x65\xc5\x54\x23\x28\xb0\x03",
> +               .ctext  = "\xb1\x8b\xa0\x05\x77\xa8\x4d\x59"
> +                         "\x1b\x8e\x21\xfc\x3a\x49\xfa\xd4"
> +                         "\xeb\x36\xf3\xc4\xdf\xdc\xae\x67"
> +                         "\x07\x3f\x70\x0e\xe9\x66\xf5\x0c"
> +                         "\x30\x4d\x66\xc9\xa4\x2f\x73\x9c"
> +                         "\x13\xc8\x49\x44\xcc\x0a\x90\x9d"
> +                         "\x7c\xdd\x19\x3f\xea\x72\x8d\x58"
> +                         "\xab\xe7\x09\x2c\xec\xb5\x44\xd2"
> +                         "\xca\xa6\x2d\x7a\x5c\x9c\x2b\x15"
> +                         "\xec\x2a\xa6\x69\x91\xf9\xf3\x13"
> +                         "\xf7\x72\xc1\xc1\x40\xd5\xe1\x94"
> +                         "\xf4\x29\xa1\x3e\x25\x02\xa8\x3e"
> +                         "\x94\xc1\x91\x14\xa1\x14\xcb\xbe"
> +                         "\x67\x4c\xb9\x38\xfe\xa7\xaa\x32"
> +                         "\x29\x62\x0d\xb2\xf6\x3c\x58\x57"
> +                         "\xc1\xd5\x5a\xbb\xd6\xa6\x2a\xe5",
> +               .len    = 128,
> +               .also_non_np = 1,
> +               .np     = 4,
> +               .tap    = { 112, 7, 8, 1 },
> +       }, {
> +               .key    = "\xd3\x81\x72\x18\x23\xff\x6f\x4a"
> +                         "\x25\x74\x29\x0d\x51\x8a\x0e\x13"
> +                         "\xc1\x53\x5d\x30\x8d\xee\x75\x0d"
> +                         "\x14\xd6\x69\xc9\x15\xa9\x0c\x60",
> +               .klen   = 32,
> +               .iv     = "\x65\x9b\xd4\xa8\x7d\x29\x1d\xf4"
> +                         "\xc4\xd6\x9b\x6a\x28\xab\x64\xe2"
> +                         "\x62\x81\x97\xc5\x81\xaa\xf9\x44"
> +                         "\xc1\x72\x59\x82\xaf\x16\xc8\x2c",
> +               .ptext  = "\xc7\x6b\x52\x6a\x10\xf0\xcc\x09"
> +                         "\xc1\x12\x1d\x6d\x21\xa6\x78\xf5"
> +                         "\x05\xa3\x69\x60\x91\x36\x98\x57"
> +                         "\xba\x0c\x14\xcc\xf3\x2d\x73\x03"
> +                         "\xc6\xb2\x5f\xc8\x16\x27\x37\x5d"
> +                         "\xd0\x0b\x87\xb2\x50\x94\x7b\x58"
> +                         "\x04\xf4\xe0\x7f\x6e\x57\x8e\xc9"
> +                         "\x41\x84\xc1\xb1\x7e\x4b\x91\x12"
> +                         "\x3a\x8b\x5d\x50\x82\x7b\xcb\xd9"
> +                         "\x9a\xd9\x4e\x18\x06\x23\x9e\xd4"
> +                         "\xa5\x20\x98\xef\xb5\xda\xe5\xc0"
> +                         "\x8a\x6a\x83\x77\x15\x84\x1e\xae"
> +                         "\x78\x94\x9d\xdf\xb7\xd1\xea\x67"
> +                         "\xaa\xb0\x14\x15\xfa\x67\x21\x84"
> +                         "\xd3\x41\x2a\xce\xba\x4b\x4a\xe8"
> +                         "\x95\x62\xa9\x55\xf0\x80\xad\xbd"
> +                         "\xab\xaf\xdd\x4f\xa5\x7c\x13\x36"
> +                         "\xed\x5e\x4f\x72\xad\x4b\xf1\xd0"
> +                         "\x88\x4e\xec\x2c\x88\x10\x5e\xea"
> +                         "\x12\xc0\x16\x01\x29\xa3\xa0\x55"
> +                         "\xaa\x68\xf3\xe9\x9d\x3b\x0d\x3b"
> +                         "\x6d\xec\xf8\xa0\x2d\xf0\x90\x8d"
> +                         "\x1c\xe2\x88\xd4\x24\x71\xf9\xb3"
> +                         "\xc1\x9f\xc5\xd6\x76\x70\xc5\x2e"
> +                         "\x9c\xac\xdb\x90\xbd\x83\x72\xba"
> +                         "\x6e\xb5\xa5\x53\x83\xa9\xa5\xbf"
> +                         "\x7d\x06\x0e\x3c\x2a\xd2\x04\xb5"
> +                         "\x1e\x19\x38\x09\x16\xd2\x82\x1f"
> +                         "\x75\x18\x56\xb8\x96\x0b\xa6\xf9"
> +                         "\xcf\x62\xd9\x32\x5d\xa9\xd7\x1d"
> +                         "\xec\xe4\xdf\x1b\xbe\xf1\x36\xee"
> +                         "\xe3\x7b\xb5\x2f\xee\xf8\x53\x3d"
> +                         "\x6a\xb7\x70\xa9\xfc\x9c\x57\x25"
> +                         "\xf2\x89\x10\xd3\xb8\xa8\x8c\x30"
> +                         "\xae\x23\x4f\x0e\x13\x66\x4f\xe1"
> +                         "\xb6\xc0\xe4\xf8\xef\x93\xbd\x6e"
> +                         "\x15\x85\x6b\xe3\x60\x81\x1d\x68"
> +                         "\xd7\x31\x87\x89\x09\xab\xd5\x96"
> +                         "\x1d\xf3\x6d\x67\x80\xca\x07\x31"
> +                         "\x5d\xa7\xe4\xfb\x3e\xf2\x9b\x33"
> +                         "\x52\x18\xc8\x30\xfe\x2d\xca\x1e"
> +                         "\x79\x92\x7a\x60\x5c\xb6\x58\x87"
> +                         "\xa4\x36\xa2\x67\x92\x8b\xa4\xb7"
> +                         "\xf1\x86\xdf\xdc\xc0\x7e\x8f\x63"
> +                         "\xd2\xa2\xdc\x78\xeb\x4f\xd8\x96"
> +                         "\x47\xca\xb8\x91\xf9\xf7\x94\x21"
> +                         "\x5f\x9a\x9f\x5b\xb8\x40\x41\x4b"
> +                         "\x66\x69\x6a\x72\xd0\xcb\x70\xb7"
> +                         "\x93\xb5\x37\x96\x05\x37\x4f\xe5"
> +                         "\x8c\xa7\x5a\x4e\x8b\xb7\x84\xea"
> +                         "\xc7\xfc\x19\x6e\x1f\x5a\xa1\xac"
> +                         "\x18\x7d\x52\x3b\xb3\x34\x62\x99"
> +                         "\xe4\x9e\x31\x04\x3f\xc0\x8d\x84"
> +                         "\x17\x7c\x25\x48\x52\x67\x11\x27"
> +                         "\x67\xbb\x5a\x85\xca\x56\xb2\x5c"
> +                         "\xe6\xec\xd5\x96\x3d\x15\xfc\xfb"
> +                         "\x22\x25\xf4\x13\xe5\x93\x4b\x9a"
> +                         "\x77\xf1\x52\x18\xfa\x16\x5e\x49"
> +                         "\x03\x45\xa8\x08\xfa\xb3\x41\x92"
> +                         "\x79\x50\x33\xca\xd0\xd7\x42\x55"
> +                         "\xc3\x9a\x0c\x4e\xd9\xa4\x3c\x86"
> +                         "\x80\x9f\x53\xd1\xa4\x2e\xd1\xbc"
> +                         "\xf1\x54\x6e\x93\xa4\x65\x99\x8e"
> +                         "\xdf\x29\xc0\x64\x63\x07\xbb\xea",
> +               .ctext  = "\xe0\x33\xf6\xe0\xb4\xa5\xdd\x2b"
> +                         "\xdd\xce\xfc\x12\x1e\xfc\x2d\xf2"
> +                         "\x8b\xc7\xeb\xc1\xc4\x2a\xe8\x44"
> +                         "\x0f\x3d\x97\x19\x2e\x6d\xa2\x38"
> +                         "\x9d\xa6\xaa\xe1\x96\xb9\x08\xe8"
> +                         "\x0b\x70\x48\x5c\xed\xb5\x9b\xcb"
> +                         "\x8b\x40\x88\x7e\x69\x73\xf7\x16"
> +                         "\x71\xbb\x5b\xfc\xa3\x47\x5d\xa6"
> +                         "\xae\x3a\x64\xc4\xe7\xb8\xa8\xe7"
> +                         "\xb1\x32\x19\xdb\xe3\x01\xb8\xf0"
> +                         "\xa4\x86\xb4\x4c\xc2\xde\x5c\xd2"
> +                         "\x6c\x77\xd2\xe8\x18\xb7\x0a\xc9"
> +                         "\x3d\x53\xb5\xc4\x5c\xf0\x8c\x06"
> +                         "\xdc\x90\xe0\x74\x47\x1b\x0b\xf6"
> +                         "\xd2\x71\x6b\xc4\xf1\x97\x00\x2d"
> +                         "\x63\x57\x44\x1f\x8c\xf4\xe6\x9b"
> +                         "\xe0\x7a\xdd\xec\x32\x73\x42\x32"
> +                         "\x7f\x35\x67\x60\x0d\xcf\x10\x52"
> +                         "\x61\x22\x53\x8d\x8e\xbb\x33\x76"
> +                         "\x59\xd9\x10\xce\xdf\xef\xc0\x41"
> +                         "\xd5\x33\x29\x6a\xda\x46\xa4\x51"
> +                         "\xf0\x99\x3d\x96\x31\xdd\xb5\xcb"
> +                         "\x3e\x2a\x1f\xc7\x5c\x79\xd3\xc5"
> +                         "\x20\xa1\xb1\x39\x1b\xc6\x0a\x70"
> +                         "\x26\x39\x95\x07\xad\x7a\xc9\x69"
> +                         "\xfe\x81\xc7\x88\x08\x38\xaf\xad"
> +                         "\x9e\x8d\xfb\xe8\x24\x0d\x22\xb8"
> +                         "\x0e\xed\xbe\x37\x53\x7c\xa6\xc6"
> +                         "\x78\x62\xec\xa3\x59\xd9\xc6\x9d"
> +                         "\xb8\x0e\x69\x77\x84\x2d\x6a\x4c"
> +                         "\xc5\xd9\xb2\xa0\x2b\xa8\x80\xcc"
> +                         "\xe9\x1e\x9c\x5a\xc4\xa1\xb2\x37"
> +                         "\x06\x9b\x30\x32\x67\xf7\xe7\xd2"
> +                         "\x42\xc7\xdf\x4e\xd4\xcb\xa0\x12"
> +                         "\x94\xa1\x34\x85\x93\x50\x4b\x0a"
> +                         "\x3c\x7d\x49\x25\x01\x41\x6b\x96"
> +                         "\xa9\x12\xbb\x0b\xc0\xd7\xd0\x93"
> +                         "\x1f\x70\x38\xb8\x21\xee\xf6\xa7"
> +                         "\xee\xeb\xe7\x81\xa4\x13\xb4\x87"
> +                         "\xfa\xc1\xb0\xb5\x37\x8b\x74\xa2"
> +                         "\x4e\xc7\xc2\xad\x3d\x62\x3f\xf8"
> +                         "\x34\x42\xe5\xae\x45\x13\x63\xfe"
> +                         "\xfc\x2a\x17\x46\x61\xa9\xd3\x1c"
> +                         "\x4c\xaf\xf0\x09\x62\x26\x66\x1e"
> +                         "\x74\xcf\xd6\x68\x3d\x7d\xd8\xb7"
> +                         "\xe7\xe6\xf8\xf0\x08\x20\xf7\x47"
> +                         "\x1c\x52\xaa\x0f\x3e\x21\xa3\xf2"
> +                         "\xbf\x2f\x95\x16\xa8\xc8\xc8\x8c"
> +                         "\x99\x0f\x5d\xfb\xfa\x2b\x58\x8a"
> +                         "\x7e\xd6\x74\x02\x60\xf0\xd0\x5b"
> +                         "\x65\xa8\xac\xea\x8d\x68\x46\x34"
> +                         "\x26\x9d\x4f\xb1\x9a\x8e\xc0\x1a"
> +                         "\xf1\xed\xc6\x7a\x83\xfd\x8a\x57"
> +                         "\xf2\xe6\xe4\xba\xfc\xc6\x3c\xad"
> +                         "\x5b\x19\x50\x2f\x3a\xcc\x06\x46"
> +                         "\x04\x51\x3f\x91\x97\xf0\xd2\x07"
> +                         "\xe7\x93\x89\x7e\xb5\x32\x0f\x03"
> +                         "\xe5\x58\x9e\x74\x72\xeb\xc2\x38"
> +                         "\x00\x0c\x91\x72\x69\xed\x7d\x6d"
> +                         "\xc8\x71\xf0\xec\xff\x80\xd9\x1c"
> +                         "\x9e\xd2\xfa\x15\xfc\x6c\x4e\xbc"
> +                         "\xb1\xa6\xbd\xbd\x70\x40\xca\x20"
> +                         "\xb8\x78\xd2\xa3\xc6\xf3\x79\x9c"
> +                         "\xc7\x27\xe1\x6a\x29\xad\xa4\x03",
> +               .len    = 512,
> +       }
> +};
> +
>  /*
>   * CTS (Cipher Text Stealing) mode tests
>   */
> --
> 2.19.1.331.ge82ca0e54c-goog
>

* Re: [RFC PATCH v2 00/12] crypto: Adiantum support
  2018-10-20  3:24     ` Ard Biesheuvel
@ 2018-10-20  5:22       ` Eric Biggers
  0 siblings, 0 replies; 54+ messages in thread
From: Eric Biggers @ 2018-10-20  5:22 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Paul Crowley, Jason A. Donenfeld,
	open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fscrypt,
	linux-arm-kernel, Linux Kernel Mailing List, Herbert Xu,
	Greg Kaiser, Michael Halcrow, Samuel Neves, Tomer Ashur

Hi Ard,

On Sat, Oct 20, 2018 at 11:24:05AM +0800, Ard Biesheuvel wrote:
> On 20 October 2018 at 02:19, Paul Crowley <paulcrowley@google.com> wrote:
> > On Fri, 19 Oct 2018 at 08:58, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> >> Before merging this into the kernel, do you want to wait until you've
> >> received some public review from academia?
> >
> > I would prefer not to wait. Unlike a new primitive whose strength can
> > only be known through attempts at cryptanalysis, Adiantum is a
> > > construction based on well-understood and trusted primitives;
> > > it is secure if the proof
> > accompanying it is correct. Given that (outside competitions or
> > standardization efforts) no-one ever issues public statements that
> > they think algorithms or proofs are good, what I'm expecting from
> > academia is silence :) The most we could hope for would be getting the
> > paper accepted at a conference, and we're pursuing that but there's a
> > good chance that won't happen simply because it's not very novel. It
> > basically takes existing ideas and applies them using a stream cipher
> > instead of a block cipher, and a faster hashing mode; it's also a
> > small update from HPolyC. I've had some private feedback that the
> > proof seems correct, and that's all I'm expecting to get.
> 
> Hi Paul, Eric,
> 
> The Adiantum paper claims
> 
> "On an ARM Cortex-A7 processor, Adiantum decrypts 4096-byte messages
> at 11 cycles per byte, five times faster than AES-256-XTS, with a
> constant-time implementation."
> 
> which is surprising to me. The bit slicing NEON AES core runs at ~14
> cycles per byte on a Cortex-A15 (when encrypting), so 55 cycles per
> byte on A7 sounds rather high. Is it really that bad?

Yes, it's really that slow, maybe because the NEON unit on Cortex-A7 isn't very
good.  Our figures are shown in the performance table in section 4.  Note that
the abstract is talking about AES-256-XTS.  AES-128-XTS is ~27% faster.  You can
also reproduce our performance results using our userspace benchmark program
from https://github.com/google/adiantum/tree/master/benchmark.  It uses a copy
of aes-neonbs-core.S from the kernel source tree.

> 
> Also, the paper mentions that the second hash pass and the stream
> cipher en/decryption pass could be executed in parallel, while your
> implementation performs three distinct passes. Do you have any
> estimates on the potential performance gain of implementing that? In
> my experience (which is mostly A53 rather than A7 based, mind you),
> removing memory accesses can help tremendously to speed up the
> execution on low end cores.

As a quick hack, on Cortex-A7 I timed "NH" without loading the message words.
It became about 10% faster.  My NEON-accelerated NH is already only about 1.3
cpb, so that means in theory not having to reload the message words would save
~0.13 cpb...  But Adiantum as a whole is ~11 cpb, so that suggests the
improvement would be only a bit over 1%.

Maybe it could actually be better (for example, not having to map the pages
again could save a lot), but in practice, given the increased complexity and
the likelihood that there wouldn't be enough registers to do everything
efficiently, it seemed like far too much trouble to bother with yet (at least
for the Linux kernel implementation; a two-pass implementation could still be
useful elsewhere, of course).
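
For reference, the back-of-the-envelope estimate above can be written out as a
small C sketch.  The function names are made up for illustration, and the
inputs are just the rough figures quoted in this message (NH at ~1.3 cpb, a
~10% speedup from skipping the message loads, Adiantum at ~11 cpb).  It follows
the same approximation used above, i.e. it treats "10% faster" as 10% of the NH
cost.

```c
#include <stdio.h>

/* Cycles per byte saved if NH no longer had to reload the message words. */
double nh_saving_cpb(double nh_cpb, double nh_speedup)
{
	return nh_cpb * nh_speedup;
}

/* The same saving expressed as a fraction of the whole-cipher cost. */
double overall_improvement(double saving_cpb, double total_cpb)
{
	return saving_cpb / total_cpb;
}

/* Print the estimate for the approximate figures quoted in this message. */
void print_estimate(void)
{
	double saving = nh_saving_cpb(1.3, 0.10);
	double frac = overall_improvement(saving, 11.0);

	printf("saving: %.2f cpb, overall: %.1f%%\n", saving, 100.0 * frac);
}
```

This only models the avoided loads; it does not account for effects such as
avoiding a second page mapping, which is why the real gain could differ.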

- Eric

* Re: [RFC PATCH v2 06/12] crypto: arm/chacha20 - refactor to allow varying number of rounds
  2018-10-20  3:35   ` Ard Biesheuvel
@ 2018-10-20  5:26     ` Eric Biggers
  0 siblings, 0 replies; 54+ messages in thread
From: Eric Biggers @ 2018-10-20  5:26 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fscrypt,
	linux-arm-kernel, Linux Kernel Mailing List, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

Hi Ard,

On Sat, Oct 20, 2018 at 11:35:22AM +0800, Ard Biesheuvel wrote:
> On 16 October 2018 at 01:54, Eric Biggers <ebiggers@kernel.org> wrote:
> > From: Eric Biggers <ebiggers@google.com>
> >
> > In preparation for adding XChaCha12 support, rename/refactor the NEON
> > implementation of ChaCha20 to support different numbers of rounds.
> >
> > Signed-off-by: Eric Biggers <ebiggers@google.com>
> 
> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> 
> > ---
> >  arch/arm/crypto/Makefile                      |  4 +-
> >  ...hacha20-neon-core.S => chacha-neon-core.S} | 36 ++++++------
> >  ...hacha20-neon-glue.c => chacha-neon-glue.c} | 56 ++++++++++---------
> >  3 files changed, 52 insertions(+), 44 deletions(-)
> >  rename arch/arm/crypto/{chacha20-neon-core.S => chacha-neon-core.S} (96%)
> >  rename arch/arm/crypto/{chacha20-neon-glue.c => chacha-neon-glue.c} (73%)
> >
> > diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile
> > index bd5bceef0605f..005482ff95047 100644
> > --- a/arch/arm/crypto/Makefile
> > +++ b/arch/arm/crypto/Makefile
> > @@ -9,7 +9,7 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM) += sha1-arm.o
> >  obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o
> >  obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o
> >  obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o
> > -obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o
> > +obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha-neon.o
> >
> 
> I take it you are preserving the Kconfig symbol name to prevent
> breaking existing configs?

Yes, that's the intent.  Though perhaps we should just change it.

> 
> If so, we might consider doing something like
> 
> config CRYPTO_CHACHA20_NEON
>     tristate
> 
> config CRYPTO_CHACHA_NEON
>     default CRYPTO_CHACHA20_NEON
>     ... the existing kconfig symbol description ...
> 
> and drop the former at some point in the future?
> 

The problem is that only symbols with a prompt string can be set explicitly,
e.g. in a kconfig file.  So it's not possible to migrate a symbol to a new one
without breakage, unless both remain separately promptable.  It seems there
should be a way, but when I last checked, there wasn't one...

- Eric

* Re: [RFC PATCH v2 09/12] crypto: nhpoly1305 - add NHPoly1305 support
  2018-10-20  4:00   ` Ard Biesheuvel
@ 2018-10-20  5:38     ` Eric Biggers
  2018-10-20 15:06       ` Ard Biesheuvel
  0 siblings, 1 reply; 54+ messages in thread
From: Eric Biggers @ 2018-10-20  5:38 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fscrypt,
	linux-arm-kernel, Linux Kernel Mailing List, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

Hi Ard,

On Sat, Oct 20, 2018 at 12:00:31PM +0800, Ard Biesheuvel wrote:
> On 16 October 2018 at 01:54, Eric Biggers <ebiggers@kernel.org> wrote:
> > From: Eric Biggers <ebiggers@google.com>
> >
> > Add a generic implementation of NHPoly1305, an ε-almost-∆-universal hash
> > function used in the Adiantum encryption mode.
> >
> > CONFIG_NHPOLY1305 is not selectable by itself since there won't be any
> > real reason to enable it without also enabling Adiantum support.
> >
> > Signed-off-by: Eric Biggers <ebiggers@google.com>
> > ---
> >  crypto/Kconfig              |    5 +
> >  crypto/Makefile             |    1 +
> >  crypto/nhpoly1305.c         |  288 ++++++++
> >  crypto/testmgr.c            |    6 +
> >  crypto/testmgr.h            | 1240 ++++++++++++++++++++++++++++++++++-
> >  include/crypto/nhpoly1305.h |   74 +++
> >  6 files changed, 1610 insertions(+), 4 deletions(-)
> >  create mode 100644 crypto/nhpoly1305.c
> >  create mode 100644 include/crypto/nhpoly1305.h
> >
> > diff --git a/crypto/Kconfig b/crypto/Kconfig
> > index 4fa0a4a0e8615..431beca903623 100644
> > --- a/crypto/Kconfig
> > +++ b/crypto/Kconfig
> > @@ -493,6 +493,11 @@ config CRYPTO_KEYWRAP
> >           Support for key wrapping (NIST SP800-38F / RFC3394) without
> >           padding.
> >
> > +config CRYPTO_NHPOLY1305
> > +       tristate
> > +       select CRYPTO_HASH
> > +       select CRYPTO_POLY1305
> > +
> >  comment "Hash modes"
> >
> >  config CRYPTO_CMAC
> > diff --git a/crypto/Makefile b/crypto/Makefile
> > index 7e673f7c71107..87b86f221a2a2 100644
> > --- a/crypto/Makefile
> > +++ b/crypto/Makefile
> > @@ -84,6 +84,7 @@ obj-$(CONFIG_CRYPTO_LRW) += lrw.o
> >  obj-$(CONFIG_CRYPTO_XTS) += xts.o
> >  obj-$(CONFIG_CRYPTO_CTR) += ctr.o
> >  obj-$(CONFIG_CRYPTO_KEYWRAP) += keywrap.o
> > +obj-$(CONFIG_CRYPTO_NHPOLY1305) += nhpoly1305.o
> >  obj-$(CONFIG_CRYPTO_GCM) += gcm.o
> >  obj-$(CONFIG_CRYPTO_CCM) += ccm.o
> >  obj-$(CONFIG_CRYPTO_CHACHA20POLY1305) += chacha20poly1305.o
> > diff --git a/crypto/nhpoly1305.c b/crypto/nhpoly1305.c
> > new file mode 100644
> > index 0000000000000..087ad7680dd62
> > --- /dev/null
> > +++ b/crypto/nhpoly1305.c
> > @@ -0,0 +1,288 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * NHPoly1305 - ε-almost-∆-universal hash function for Adiantum
> > + *
> > + * Copyright 2018 Google LLC
> > + */
> > +
> > +/*
> > + * "NHPoly1305" is the main component of Adiantum hashing.
> > + * Specifically, it is the calculation
> > + *
> > + *     H_M ← Poly1305_{K_M}(NH_{K_N}(pad_{128}(M)))
> > + *
> > + * from the procedure in section A.5 of the Adiantum paper [1].  It is an
> > + * ε-almost-∆-universal (εA∆U) hash function for equal-length inputs over
> > + * Z/(2^{128}Z), where the "∆" operation is addition.  It hashes 1024-byte
> > + * chunks of the input with the NH hash function [2], reducing the input length
> > + * by 32x.  The resulting NH digests are evaluated as a polynomial in
> > + * GF(2^{130}-5), like in the Poly1305 MAC [3].  Note that the polynomial
> > + * evaluation by itself would suffice to achieve the εA∆U property; NH is used
> > + * for performance since it's over twice as fast as Poly1305.
> > + *
> > + * This is *not* a cryptographic hash function; do not use it as such!
> > + *
> > + * [1] Adiantum: length-preserving encryption for entry-level processors
> > + *     (https://eprint.iacr.org/2018/720.pdf)
> > + * [2] UMAC: Fast and Secure Message Authentication
> > + *     (https://fastcrypto.org/umac/umac_proc.pdf)
> > + * [3] The Poly1305-AES message-authentication code
> > + *     (https://cr.yp.to/mac/poly1305-20050329.pdf)
> > + */
> > +
> > +#include <asm/unaligned.h>
> > +#include <crypto/algapi.h>
> > +#include <crypto/internal/hash.h>
> > +#include <crypto/nhpoly1305.h>
> > +#include <linux/crypto.h>
> > +#include <linux/kernel.h>
> > +#include <linux/module.h>
> > +
> > +#define NH_STRIDE(K0, K1, K2, K3)                              \
> > +({                                                             \
> > +       m_A = get_unaligned_le32(src); src += 4;                \
> > +       m_B = get_unaligned_le32(src); src += 4;                \
> > +       m_C = get_unaligned_le32(src); src += 4;                \
> > +       m_D = get_unaligned_le32(src); src += 4;                \
> > +       K3##_A = *key++;                                        \
> > +       K3##_B = *key++;                                        \
> > +       K3##_C = *key++;                                        \
> > +       K3##_D = *key++;                                        \
> > +       sum0 += (u64)(u32)(m_A + K0##_A) * (u32)(m_C + K0##_C); \
> > +       sum1 += (u64)(u32)(m_A + K1##_A) * (u32)(m_C + K1##_C); \
> > +       sum2 += (u64)(u32)(m_A + K2##_A) * (u32)(m_C + K2##_C); \
> > +       sum3 += (u64)(u32)(m_A + K3##_A) * (u32)(m_C + K3##_C); \
> > +       sum0 += (u64)(u32)(m_B + K0##_B) * (u32)(m_D + K0##_D); \
> > +       sum1 += (u64)(u32)(m_B + K1##_B) * (u32)(m_D + K1##_D); \
> > +       sum2 += (u64)(u32)(m_B + K2##_B) * (u32)(m_D + K2##_D); \
> > +       sum3 += (u64)(u32)(m_B + K3##_B) * (u32)(m_D + K3##_D); \
> > +})
> > +
> > +static void nh_generic(const u32 *key, const u8 *src, size_t srclen,
> > +                      __le64 hash[NH_NUM_PASSES])
> > +{
> > +       u64 sum0 = 0, sum1 = 0, sum2 = 0, sum3 = 0;
> > +       u32 k0_A = *key++;
> > +       u32 k0_B = *key++;
> > +       u32 k0_C = *key++;
> > +       u32 k0_D = *key++;
> > +       u32 k1_A = *key++;
> > +       u32 k1_B = *key++;
> > +       u32 k1_C = *key++;
> > +       u32 k1_D = *key++;
> > +       u32 k2_A = *key++;
> > +       u32 k2_B = *key++;
> > +       u32 k2_C = *key++;
> > +       u32 k2_D = *key++;
> > +       u32 k3_A, k3_B, k3_C, k3_D;
> > +       u32 m_A, m_B, m_C, m_D;
> > +       size_t n = srclen / NH_MESSAGE_UNIT;
> > +
> > +       BUILD_BUG_ON(NH_PAIR_STRIDE != 2);
> > +       BUILD_BUG_ON(NH_NUM_PASSES != 4);
> > +
> > +       while (n >= 4) {
> > +               NH_STRIDE(k0, k1, k2, k3);
> > +               NH_STRIDE(k1, k2, k3, k0);
> > +               NH_STRIDE(k2, k3, k0, k1);
> > +               NH_STRIDE(k3, k0, k1, k2);
> > +               n -= 4;
> > +       }
> > +       if (n) {
> > +               NH_STRIDE(k0, k1, k2, k3);
> > +               if (--n) {
> > +                       NH_STRIDE(k1, k2, k3, k0);
> > +                       if (--n)
> > +                               NH_STRIDE(k2, k3, k0, k1);
> > +               }
> > +       }
> > +
> 
> This all looks a bit clunky to me, with the macro, the *key++s in the
> initializers and these conditionals.
> 
> Was it written in this particular way to get GCC to optimize it in the
> right way?

This does get compiled into something much faster than a naive version, which
you can find commented out at
https://github.com/google/adiantum/blob/master/benchmark/src/nh.c#L14.

Though, I admit that I haven't put a ton of effort into this C implementation of
NH yet.  Right now it's actually somewhat of a translation of the NEON version.
I'll do some experiments and see if it can be made into something less ugly
without losing performance.
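
To make the comparison concrete, here is a minimal sketch of what a
straightforward NH loop could look like.  It is not the commented-out code from
the benchmark repository, just an illustration written against the quoted
nh_generic(): nh_naive() and load_le32() are hypothetical names, the constants
are redefined locally so the snippet stands alone, and the sums are returned in
host byte order rather than as __le64.

```c
#include <stddef.h>
#include <stdint.h>

#define NH_NUM_PASSES    4   /* independent 64-bit sums */
#define NH_MESSAGE_UNIT  16  /* bytes consumed per stride */

/* Read a little-endian 32-bit word from a possibly unaligned address. */
static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Straightforward NH.  srclen must be a multiple of NH_MESSAGE_UNIT, and key
 * must hold 4 * (srclen / NH_MESSAGE_UNIT + NH_NUM_PASSES - 1) words.
 */
void nh_naive(const uint32_t *key, const uint8_t *src, size_t srclen,
	      uint64_t sums[NH_NUM_PASSES])
{
	size_t n = srclen / NH_MESSAGE_UNIT;

	for (int pass = 0; pass < NH_NUM_PASSES; pass++)
		sums[pass] = 0;

	for (size_t unit = 0; unit < n; unit++) {
		const uint8_t *m = src + unit * NH_MESSAGE_UNIT;
		uint32_t m_A = load_le32(m);
		uint32_t m_B = load_le32(m + 4);
		uint32_t m_C = load_le32(m + 8);
		uint32_t m_D = load_le32(m + 12);

		for (int pass = 0; pass < NH_NUM_PASSES; pass++) {
			/* Pass i sees the key shifted by i message units. */
			const uint32_t *k = key + 4 * (unit + pass);

			/* Adds wrap mod 2^32; products accumulate mod 2^64. */
			sums[pass] += (uint64_t)(uint32_t)(m_A + k[0]) *
				      (uint32_t)(m_C + k[2]);
			sums[pass] += (uint64_t)(uint32_t)(m_B + k[1]) *
				      (uint32_t)(m_D + k[3]);
		}
	}
}
```

The inner loop re-reads four key words per pass for every message unit, whereas
the macro version in the patch keeps the key words in locals and rotates them
between strides, so each key word is loaded only once.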

> 
> > +       hash[0] = cpu_to_le64(sum0);
> > +       hash[1] = cpu_to_le64(sum1);
> > +       hash[2] = cpu_to_le64(sum2);
> > +       hash[3] = cpu_to_le64(sum3);
> > +}
> > +
> > +/* Pass the next NH hash value through Poly1305 */
> > +static void process_nh_hash_value(struct nhpoly1305_state *state,
> > +                                 const struct nhpoly1305_key *key)
> > +{
> > +       BUILD_BUG_ON(NH_HASH_BYTES % POLY1305_BLOCK_SIZE != 0);
> > +
> > +       poly1305_core_blocks(&state->poly_state, &key->poly_key, state->nh_hash,
> > +                            NH_HASH_BYTES / POLY1305_BLOCK_SIZE);
> > +}
> > +
> > +/*
> > + * Feed the next portion of the source data, as a whole number of 16-byte
> > + * "NH message units", through NH and Poly1305.  Each NH hash is taken over
> > + * 1024 bytes, except possibly the final one which is taken over a multiple of
> > + * 16 bytes up to 1024.  Also, in the case where data is passed in misaligned
> > + * chunks, we combine partial hashes; the end result is the same either way.
> > + */
> > +static void nhpoly1305_units(struct nhpoly1305_state *state,
> > +                            const struct nhpoly1305_key *key,
> > +                            const u8 *src, unsigned int srclen, nh_t nh_fn)
> 
> Since indirect calls are going out of style: can we get rid of the
> function pointer? Or is the compiler already inferring that it always
> refers to nh_generic()?
> 

At least for now I want to use the same crypto_nhpoly1305_*_helper() functions
for all nhpoly1305 implementations, and that requires that 'nh' be a function
pointer.  The helpers could be placed in a header and inlined which would turn
'nh' into a direct call, but it seemed to be too much code to inline, and
normally 'nh' is only invoked once per 1024 bytes anyway.
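
As a rough illustration of why the indirect call is cheap here, the sketch
below shows a shared helper that takes the NH routine as a function pointer and
calls it once per chunk of up to 1024 bytes.  It is deliberately simplified and
is not the code from the patch: hash_chunks(), nh_fn_t and count_bytes() are
hypothetical names, and the partial-hash handling for misaligned input is left
out.

```c
#include <stddef.h>
#include <stdint.h>

#define NH_MESSAGE_BYTES 1024  /* bytes covered by one NH invocation */

/* Hypothetical stand-in for the patch's nh_t function-pointer type. */
typedef void (*nh_fn_t)(const uint8_t *src, size_t len, void *ctx);

/*
 * Shared helper: splits the input into chunks of up to NH_MESSAGE_BYTES and
 * makes one indirect call per chunk.  Returns the number of calls made.
 */
size_t hash_chunks(const uint8_t *src, size_t len, nh_fn_t nh, void *ctx)
{
	size_t calls = 0;

	while (len) {
		size_t n = len < NH_MESSAGE_BYTES ? len : NH_MESSAGE_BYTES;

		nh(src, n, ctx);
		src += n;
		len -= n;
		calls++;
	}
	return calls;
}

/* Example backend: just totals the bytes it was handed. */
void count_bytes(const uint8_t *src, size_t len, void *ctx)
{
	(void)src;
	*(size_t *)ctx += len;
}
```

With this shape, a 4096-byte input results in four indirect calls, so the
per-call overhead is amortized over 1024 bytes of work each time.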

- Eric

* Re: [RFC PATCH v2 10/12] crypto: arm/nhpoly1305 - add NEON-accelerated NHPoly1305
  2018-10-20  4:12   ` Ard Biesheuvel
@ 2018-10-20  5:51     ` Eric Biggers
  2018-10-20 15:00       ` Ard Biesheuvel
  0 siblings, 1 reply; 54+ messages in thread
From: Eric Biggers @ 2018-10-20  5:51 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fscrypt,
	linux-arm-kernel, Linux Kernel Mailing List, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

On Sat, Oct 20, 2018 at 12:12:56PM +0800, Ard Biesheuvel wrote:
> On 16 October 2018 at 01:54, Eric Biggers <ebiggers@kernel.org> wrote:
> > From: Eric Biggers <ebiggers@google.com>
> >
> > Add an ARM NEON implementation of NHPoly1305, an ε-almost-∆-universal
> > hash function used in the Adiantum encryption mode.  For now, only the
> > NH portion is actually NEON-accelerated; the Poly1305 part is less
> > performance-critical so is just implemented in C.
> >
> > Signed-off-by: Eric Biggers <ebiggers@google.com>
> > ---
> >  arch/arm/crypto/Kconfig                |   5 ++
> >  arch/arm/crypto/Makefile               |   2 +
> >  arch/arm/crypto/nh-neon-core.S         | 116 +++++++++++++++++++++++++
> >  arch/arm/crypto/nhpoly1305-neon-glue.c |  78 +++++++++++++++++
> >  4 files changed, 201 insertions(+)
> >  create mode 100644 arch/arm/crypto/nh-neon-core.S
> >  create mode 100644 arch/arm/crypto/nhpoly1305-neon-glue.c
> >
> > diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
> > index cc932d9bba561..458562a34aabe 100644
> > --- a/arch/arm/crypto/Kconfig
> > +++ b/arch/arm/crypto/Kconfig
> > @@ -122,4 +122,9 @@ config CRYPTO_CHACHA20_NEON
> >         select CRYPTO_BLKCIPHER
> >         select CRYPTO_CHACHA20
> >
> > +config CRYPTO_NHPOLY1305_NEON
> > +       tristate "NEON accelerated NHPoly1305 hash function (for Adiantum)"
> > +       depends on KERNEL_MODE_NEON
> > +       select CRYPTO_NHPOLY1305
> > +
> >  endif
> > diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile
> > index 005482ff95047..b65d6bfab8e6b 100644
> > --- a/arch/arm/crypto/Makefile
> > +++ b/arch/arm/crypto/Makefile
> > @@ -10,6 +10,7 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o
> >  obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o
> >  obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o
> >  obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha-neon.o
> > +obj-$(CONFIG_CRYPTO_NHPOLY1305_NEON) += nhpoly1305-neon.o
> >
> >  ce-obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o
> >  ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += sha1-arm-ce.o
> > @@ -53,6 +54,7 @@ ghash-arm-ce-y        := ghash-ce-core.o ghash-ce-glue.o
> >  crct10dif-arm-ce-y     := crct10dif-ce-core.o crct10dif-ce-glue.o
> >  crc32-arm-ce-y:= crc32-ce-core.o crc32-ce-glue.o
> >  chacha-neon-y := chacha-neon-core.o chacha-neon-glue.o
> > +nhpoly1305-neon-y := nh-neon-core.o nhpoly1305-neon-glue.o
> >
> >  ifdef REGENERATE_ARM_CRYPTO
> >  quiet_cmd_perl = PERL    $@
> > diff --git a/arch/arm/crypto/nh-neon-core.S b/arch/arm/crypto/nh-neon-core.S
> > new file mode 100644
> > index 0000000000000..434d80ab531c2
> > --- /dev/null
> > +++ b/arch/arm/crypto/nh-neon-core.S
> > @@ -0,0 +1,116 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * NH - ε-almost-universal hash function, NEON accelerated version
> > + *
> > + * Copyright 2018 Google LLC
> > + *
> > + * Author: Eric Biggers <ebiggers@google.com>
> > + */
> > +
> > +#include <linux/linkage.h>
> > +
> > +       .text
> > +       .fpu            neon
> > +
> > +       KEY             .req    r0
> > +       MESSAGE         .req    r1
> > +       MESSAGE_LEN     .req    r2
> > +       HASH            .req    r3
> > +
> > +       PASS0_SUMS      .req    q0
> > +       PASS0_SUM_A     .req    d0
> > +       PASS0_SUM_B     .req    d1
> > +       PASS1_SUMS      .req    q1
> > +       PASS1_SUM_A     .req    d2
> > +       PASS1_SUM_B     .req    d3
> > +       PASS2_SUMS      .req    q2
> > +       PASS2_SUM_A     .req    d4
> > +       PASS2_SUM_B     .req    d5
> > +       PASS3_SUMS      .req    q3
> > +       PASS3_SUM_A     .req    d6
> > +       PASS3_SUM_B     .req    d7
> > +       K0              .req    q4
> > +       K1              .req    q5
> > +       K2              .req    q6
> > +       K3              .req    q7
> > +       T0              .req    q8
> > +       T0_L            .req    d16
> > +       T0_H            .req    d17
> > +       T1              .req    q9
> > +       T1_L            .req    d18
> > +       T1_H            .req    d19
> > +       T2              .req    q10
> > +       T2_L            .req    d20
> > +       T2_H            .req    d21
> > +       T3              .req    q11
> > +       T3_L            .req    d22
> > +       T3_H            .req    d23
> > +
> > +.macro _nh_stride      k0, k1, k2, k3
> > +
> > +       // Load next message stride
> > +       vld1.8          {T3}, [MESSAGE]!
> > +
> > +       // Load next key stride
> > +       vld1.32         {\k3}, [KEY]!
> > +
> > +       // Add message words to key words
> > +       vadd.u32        T0, T3, \k0
> > +       vadd.u32        T1, T3, \k1
> > +       vadd.u32        T2, T3, \k2
> > +       vadd.u32        T3, T3, \k3
> > +
> > +       // Multiply 32x32 => 64 and accumulate
> > +       vmlal.u32       PASS0_SUMS, T0_L, T0_H
> > +       vmlal.u32       PASS1_SUMS, T1_L, T1_H
> > +       vmlal.u32       PASS2_SUMS, T2_L, T2_H
> > +       vmlal.u32       PASS3_SUMS, T3_L, T3_H
> > +.endm
> > +
> 
> Since we seem to have some spare NEON registers: would it help to have
> a double round version of this macro?
> 

It helps a little bit, but not much.  The loads are the only part that can be
optimized further.  I think I'd rather have the shorter + simpler version,
unless the loads can be optimized significantly more on other processors.

Also, originally I had it loading the key and message for the next stride while
doing the current one, but that didn't seem worthwhile either.

> > +/*
> > + * void nh_neon(const u32 *key, const u8 *message, size_t message_len,
> > + *             u8 hash[NH_HASH_BYTES])
> > + *
> > + * It's guaranteed that message_len % 16 == 0.
> > + */
> > +ENTRY(nh_neon)
> > +
> > +       vld1.32         {K0,K1}, [KEY]!
> > +         vmov.u64      PASS0_SUMS, #0
> > +         vmov.u64      PASS1_SUMS, #0
> > +       vld1.32         {K2}, [KEY]!
> > +         vmov.u64      PASS2_SUMS, #0
> > +         vmov.u64      PASS3_SUMS, #0
> > +
> > +       subs            MESSAGE_LEN, MESSAGE_LEN, #64
> > +       blt             .Lloop4_done
> > +.Lloop4:
> > +       _nh_stride      K0, K1, K2, K3
> > +       _nh_stride      K1, K2, K3, K0
> > +       _nh_stride      K2, K3, K0, K1
> > +       _nh_stride      K3, K0, K1, K2
> > +       subs            MESSAGE_LEN, MESSAGE_LEN, #64
> > +       bge             .Lloop4
> > +
> > +.Lloop4_done:
> > +       ands            MESSAGE_LEN, MESSAGE_LEN, #63
> > +       beq             .Ldone
> > +       _nh_stride      K0, K1, K2, K3
> > +
> > +       subs            MESSAGE_LEN, MESSAGE_LEN, #16
> > +       beq             .Ldone
> > +       _nh_stride      K1, K2, K3, K0
> > +
> > +       subs            MESSAGE_LEN, MESSAGE_LEN, #16
> > +       beq             .Ldone
> > +       _nh_stride      K2, K3, K0, K1
> > +
> > +.Ldone:
> > +       // Sum the accumulators for each pass, then store the sums to 'hash'
> > +       vadd.u64        T0_L, PASS0_SUM_A, PASS0_SUM_B
> > +       vadd.u64        T0_H, PASS1_SUM_A, PASS1_SUM_B
> > +       vadd.u64        T1_L, PASS2_SUM_A, PASS2_SUM_B
> > +       vadd.u64        T1_H, PASS3_SUM_A, PASS3_SUM_B
> > +       vst1.8          {T0-T1}, [HASH]
> > +       bx              lr
> > +ENDPROC(nh_neon)
> > diff --git a/arch/arm/crypto/nhpoly1305-neon-glue.c b/arch/arm/crypto/nhpoly1305-neon-glue.c
> > new file mode 100644
> > index 0000000000000..df48a00f4c50f
> > --- /dev/null
> > +++ b/arch/arm/crypto/nhpoly1305-neon-glue.c
> > @@ -0,0 +1,78 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * NHPoly1305 - ε-almost-∆-universal hash function for Adiantum
> > + * (NEON accelerated version)
> > + *
> > + * Copyright 2018 Google LLC
> > + */
> > +
> > +#include <asm/neon.h>
> > +#include <asm/simd.h>
> > +#include <crypto/internal/hash.h>
> > +#include <crypto/nhpoly1305.h>
> > +#include <linux/module.h>
> > +
> > +asmlinkage void nh_neon(const u32 *key, const u8 *message, size_t message_len,
> > +                       u8 hash[NH_HASH_BYTES]);
> > +
> > +static void _nh_neon(const u32 *key, const u8 *message, size_t message_len,
> > +                    __le64 hash[NH_NUM_PASSES])
> > +{
> > +       nh_neon(key, message, message_len, (u8 *)hash);
> > +}
> > +
> 
> Why do we need this function?
> 

For now it's not needed so I should probably just remove it, but it seems likely
that indirect calls to assembly functions in the kernel will be going away in
order to add support for CFI (control flow integrity).  The android-4.9 and
android-4.14 kernels support CFI on arm64, so you might notice that some of the
arm64 crypto code had to be patched for this reason.
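
To make the wrapper's purpose concrete, here is a minimal userspace sketch of
the pattern, not the kernel code.  It assumes the shared helpers take a
function pointer whose prototype ends in a 64-bit hash array, while the
assembly routine is declared with a u8 array.  nh_asm_standin() and
hash_one_chunk() are hypothetical names used only for illustration, and the
stand-in is plain C so that the sketch compiles anywhere.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NH_NUM_PASSES 4
#define NH_HASH_BYTES (NH_NUM_PASSES * sizeof(uint64_t))

/* Assumed shape of the pointer type that the shared helpers call through. */
typedef void (*nh_t)(const uint32_t *key, const uint8_t *message,
		     size_t message_len, uint64_t hash[NH_NUM_PASSES]);

/*
 * Stand-in for the assembly routine (nh_neon() in the patch).  It only
 * fills the output with a recognizable byte pattern; it does not hash.
 */
static void nh_asm_standin(const uint32_t *key, const uint8_t *message,
			   size_t message_len, uint8_t hash[NH_HASH_BYTES])
{
	(void)key;
	(void)message;
	memset(hash, (int)(message_len & 0xff), NH_HASH_BYTES);
}

/*
 * C wrapper whose prototype matches nh_t exactly.  The indirect call lands
 * on this C function; the call into the "assembly" is then a direct call.
 */
static void nh_wrapper(const uint32_t *key, const uint8_t *message,
		       size_t message_len, uint64_t hash[NH_NUM_PASSES])
{
	nh_asm_standin(key, message, message_len, (uint8_t *)hash);
}

/* Hypothetical shared helper: it only ever calls through the pointer. */
static void hash_one_chunk(nh_t nh_fn, const uint32_t *key,
			   const uint8_t *src, size_t len,
			   uint64_t out[NH_NUM_PASSES])
{
	nh_fn(key, src, len, out);
}
```

With this shape, a CFI scheme that checks the type of indirect-call targets
only ever sees a C function of the expected type, which is my understanding of
why such wrappers were needed in the Android trees.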

- Eric

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 11/12] crypto: adiantum - add Adiantum support
  2018-10-20  4:17   ` Ard Biesheuvel
@ 2018-10-20  7:12     ` Eric Biggers
  2018-10-23 10:40       ` Ard Biesheuvel
  0 siblings, 1 reply; 54+ messages in thread
From: Eric Biggers @ 2018-10-20  7:12 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fscrypt,
	linux-arm-kernel, Linux Kernel Mailing List, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

Hi Ard,

On Sat, Oct 20, 2018 at 12:17:58PM +0800, Ard Biesheuvel wrote:
> On 16 October 2018 at 01:54, Eric Biggers <ebiggers@kernel.org> wrote:
> > From: Eric Biggers <ebiggers@google.com>
> >
> > Add support for the Adiantum encryption mode.  Adiantum was designed by
> > Paul Crowley and is specified by our paper:
> >
> >     Adiantum: length-preserving encryption for entry-level processors
> >     (https://eprint.iacr.org/2018/720.pdf)
> >
> > See our paper for full details; this patch only provides an overview.
> >
> > Adiantum is a tweakable, length-preserving encryption mode designed for
> > fast and secure disk encryption, especially on CPUs without dedicated
> > crypto instructions.  Adiantum encrypts each sector using the XChaCha12
> > stream cipher, two passes of an ε-almost-∆-universal (εA∆U) hash
> > function, and an invocation of the AES-256 block cipher on a single
> > 16-byte block.  On CPUs without AES instructions, Adiantum is much
> > faster than AES-XTS; for example, on ARM Cortex-A7, on 4096-byte sectors
> > Adiantum encryption is about 4 times faster than AES-256-XTS encryption,
> > and decryption about 5 times faster.
> >
> > Adiantum is a specialization of the more general HBSH construction.  Our
> > earlier proposal, HPolyC, was also a HBSH specialization, but it used a
> > different εA∆U hash function, one based on Poly1305 only.  Adiantum's
> > εA∆U hash function, which is based primarily on the "NH" hash function
> > like that used in UMAC (RFC4418), is about twice as fast as HPolyC's;
> > consequently, Adiantum is about 20% faster than HPolyC.
> >
> > This speed comes with no loss of security: Adiantum is provably just as
> > secure as HPolyC, in fact slightly *more* secure.  Like HPolyC,
> > Adiantum's security is reducible to that of XChaCha12 and AES-256,
> > subject to a security bound.  XChaCha12 itself has a security reduction
> > to ChaCha12.  Therefore, one need not "trust" Adiantum; one need only
> > trust ChaCha12 and AES-256.  Note that the εA∆U hash function is only
> > used for its proven combinatorial properties so cannot be "broken".
> >
> 
> So what happens if the part of the input covered by the block cipher
> is identical between different generations of the same disk block
> (whose sector count is used as the 'outer' IV). How are we not in the
> same boat as before when using stream ciphers for disk encryption?
> 

This is the point of the hash step.  The value encrypted with the block cipher
to produce the intermediate value C_M (used as the stream cipher nonce) is
H(T, P_L) + P_R.  (T is the tweak a.k.a. the IV, P_L is the plaintext except the
last 16 bytes, P_R is the last 16 bytes.)  A collision in this value occurs iff:

	H(T1, P1_L) + P1_R = H(T2, P2_L) + P2_R
i.e.
	H(T1, P1_L) - H(T2, P2_L) = P2_R - P1_R
	
If (T1, P1_L) = (T2, P2_L) then P1_R != P2_R so the equation has no solutions
(since we don't consider queries where the whole input is the same; those
unavoidably produce the same ciphertext).  Otherwise (T1, P1_L) != (T2, P2_L),
and since the hash function H is ε-almost-∆-universal over integers mod 2^128,
the equation is true for at most a very small proportion 'ε' of hash keys.
But, the hash key is chosen at random and is unknown to the attacker.

The same applies in the other direction, for chosen ciphertext attacks.

Basically, it's very difficult for an attacker to cause the intermediate value
C_M to be reused, and the outputs will appear random until they do.
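
The counting argument can be checked by brute force on a toy example.  The
sketch below is an illustration only: it replaces NHPoly1305 with the
multiplication hash H_k(x) = k * x mod p over a small prime p = 251, which is
assumed here purely because it is easy to enumerate.  x stands in for
(T, P_L) and r for P_R; all names are hypothetical and inputs must be below
TOY_P.

```c
#include <stdint.h>

#define TOY_P 251u /* small prime standing in for 2^128 */

/* Toy hash H_k(x) = k * x mod p; NOT the hash used by Adiantum. */
static uint32_t toy_hash(uint32_t k, uint32_t x)
{
	return (k * x) % TOY_P;
}

/* Count the keys k for which H_k(x1) + r1 == H_k(x2) + r2 (mod p). */
static unsigned int count_colliding_keys(uint32_t x1, uint32_t r1,
					 uint32_t x2, uint32_t r2)
{
	unsigned int count = 0;

	for (uint32_t k = 0; k < TOY_P; k++) {
		uint32_t lhs = (toy_hash(k, x1) + r1) % TOY_P;
		uint32_t rhs = (toy_hash(k, x2) + r2) % TOY_P;

		if (lhs == rhs)
			count++;
	}
	return count;
}
```

For x1 != x2, exactly one of the 251 keys collides, so ε = 1/251 in the toy.
For x1 == x2 with r1 != r2, no key collides.  Only when the whole input is
identical does every key collide.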

Of course, all this is explained much more precisely and comprehensively in our
paper.  See section 5, "Security reduction".

- Eric

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 00/12] crypto: Adiantum support
  2018-10-19 19:04   ` Eric Biggers
@ 2018-10-20 10:26     ` Milan Broz
  2018-10-20 13:47       ` Jason A. Donenfeld
  2018-11-16 21:52       ` Eric Biggers
  2018-10-21 22:23     ` Eric Biggers
  1 sibling, 2 replies; 54+ messages in thread
From: Milan Broz @ 2018-10-20 10:26 UTC (permalink / raw)
  To: Eric Biggers, Jason A. Donenfeld
  Cc: Linux Crypto Mailing List, linux-fscrypt, linux-arm-kernel, LKML,
	Herbert Xu, Paul Crowley, Greg Kaiser, Michael Halcrow,
	Samuel Neves, Tomer Ashur

On 19/10/2018 21:04, Eric Biggers wrote:
> Hi Jason,
> 
> On Fri, Oct 19, 2018 at 05:58:35PM +0200, Jason A. Donenfeld wrote:
>> Hello Eric,
>>
>>> As before, some of these patches conflict with the new "Zinc" crypto
>>> library.  But I don't know when Zinc will be merged, so for now I've
>>> continued to base this patchset on the current 'cryptodev'.
>>
>> I'd appreciate it if you waited to merge this until you can rebase it
>> on top of Zinc. In fact, if you already want to build it on top of
>> Zinc, I'm happy to work with you on that in a shared repo or similar.
>> We can also hash out the details of that in person in Vancouver in a
>> few weeks. I think pushing this in before will create undesirable
>> churn for both of us.
>>
> 
> I won't be at Plumbers, sorry!  For if/when it's needed, I'll start a version of
> this based on Zinc.  The basic requirements are that we need (1) xchacha12 and
> xchacha20 available as 'skciphers' in the crypto API, and (2) the poly1305_core
> functions (see patch 08/12).  In principle, these can be implemented in Zinc.
> The Adiantum template and all the NHPoly1305 stuff will be the same either way.
> (Unless you'll want one or both of those moved to Zinc too.  To be honest, even
> after your explanations I still don't have a clear idea of what is supposed to
> go in Zinc and what isn't...)
> 
> However, for now I'm hesitant to completely abandon the current approach and bet
> the farm on Zinc.  Zinc has a large scope and various controversies that haven't
> yet been fully resolved to everyone's satisfaction, including unclear licenses
> on some of the essential assembly files.  It's not appropriate to grind kernel
> crypto development to a halt while everyone waits for Zinc.
> 
> So if Zinc is ready, then it makes sense for it to go first;
> otherwise, it doesn't.  It's not yet clear which is the case.

Does that mean that if Adiantum is based on Zinc, it can no longer be used
for FDE (dm-crypt)? IOW only file-based encryption is possible?

Adiantum (as in your current git branches on kernel.org) can be used for dm-crypt
without any changes (yes, I played with it :) and with some easy tricks directly
through cryptsetup/LUKS as well.

I think we should have this as an alternative to length-preserving wide-block
cipher modes for FDE.

Milan

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 00/12] crypto: Adiantum support
  2018-10-20 10:26     ` Milan Broz
@ 2018-10-20 13:47       ` Jason A. Donenfeld
  2018-11-16 21:52       ` Eric Biggers
  1 sibling, 0 replies; 54+ messages in thread
From: Jason A. Donenfeld @ 2018-10-20 13:47 UTC (permalink / raw)
  To: Milan Broz
  Cc: Eric Biggers, Linux Crypto Mailing List, linux-fscrypt,
	linux-arm-kernel, LKML, Herbert Xu, Paul Crowley, Greg Kaiser,
	Michael Halcrow, Samuel Neves, Tomer Ashur

Hi Milan,

On Sat, Oct 20, 2018 at 12:53 PM Milan Broz <gmazyland@gmail.com> wrote:
> Does that mean that if Adiantum is based on Zinc, it can no longer be used
> for FDE (dm-crypt)? IOW only file-based encryption is possible?

No, don't worry. All I had in mind was the software implementations of
chacha12 and so forth. There aren't any current plans at this point to
change the scaffolding underlying dm-crypt.

Jason

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 10/12] crypto: arm/nhpoly1305 - add NEON-accelerated NHPoly1305
  2018-10-20  5:51     ` Eric Biggers
@ 2018-10-20 15:00       ` Ard Biesheuvel
  0 siblings, 0 replies; 54+ messages in thread
From: Ard Biesheuvel @ 2018-10-20 15:00 UTC (permalink / raw)
  To: Eric Biggers
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fscrypt,
	linux-arm-kernel, Linux Kernel Mailing List, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

On 20 October 2018 at 13:51, Eric Biggers <ebiggers@kernel.org> wrote:
> On Sat, Oct 20, 2018 at 12:12:56PM +0800, Ard Biesheuvel wrote:
>> On 16 October 2018 at 01:54, Eric Biggers <ebiggers@kernel.org> wrote:
>> > From: Eric Biggers <ebiggers@google.com>
>> >
>> > Add an ARM NEON implementation of NHPoly1305, an ε-almost-∆-universal
>> > hash function used in the Adiantum encryption mode.  For now, only the
>> > NH portion is actually NEON-accelerated; the Poly1305 part is less
>> > performance-critical so is just implemented in C.
>> >
>> > Signed-off-by: Eric Biggers <ebiggers@google.com>
>> > ---
>> >  arch/arm/crypto/Kconfig                |   5 ++
>> >  arch/arm/crypto/Makefile               |   2 +
>> >  arch/arm/crypto/nh-neon-core.S         | 116 +++++++++++++++++++++++++
>> >  arch/arm/crypto/nhpoly1305-neon-glue.c |  78 +++++++++++++++++
>> >  4 files changed, 201 insertions(+)
>> >  create mode 100644 arch/arm/crypto/nh-neon-core.S
>> >  create mode 100644 arch/arm/crypto/nhpoly1305-neon-glue.c
>> >
>> > diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
>> > index cc932d9bba561..458562a34aabe 100644
>> > --- a/arch/arm/crypto/Kconfig
>> > +++ b/arch/arm/crypto/Kconfig
>> > @@ -122,4 +122,9 @@ config CRYPTO_CHACHA20_NEON
>> >         select CRYPTO_BLKCIPHER
>> >         select CRYPTO_CHACHA20
>> >
>> > +config CRYPTO_NHPOLY1305_NEON
>> > +       tristate "NEON accelerated NHPoly1305 hash function (for Adiantum)"
>> > +       depends on KERNEL_MODE_NEON
>> > +       select CRYPTO_NHPOLY1305
>> > +
>> >  endif
>> > diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile
>> > index 005482ff95047..b65d6bfab8e6b 100644
>> > --- a/arch/arm/crypto/Makefile
>> > +++ b/arch/arm/crypto/Makefile
>> > @@ -10,6 +10,7 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o
>> >  obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o
>> >  obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o
>> >  obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha-neon.o
>> > +obj-$(CONFIG_CRYPTO_NHPOLY1305_NEON) += nhpoly1305-neon.o
>> >
>> >  ce-obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o
>> >  ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += sha1-arm-ce.o
>> > @@ -53,6 +54,7 @@ ghash-arm-ce-y        := ghash-ce-core.o ghash-ce-glue.o
>> >  crct10dif-arm-ce-y     := crct10dif-ce-core.o crct10dif-ce-glue.o
>> >  crc32-arm-ce-y:= crc32-ce-core.o crc32-ce-glue.o
>> >  chacha-neon-y := chacha-neon-core.o chacha-neon-glue.o
>> > +nhpoly1305-neon-y := nh-neon-core.o nhpoly1305-neon-glue.o
>> >
>> >  ifdef REGENERATE_ARM_CRYPTO
>> >  quiet_cmd_perl = PERL    $@
>> > diff --git a/arch/arm/crypto/nh-neon-core.S b/arch/arm/crypto/nh-neon-core.S
>> > new file mode 100644
>> > index 0000000000000..434d80ab531c2
>> > --- /dev/null
>> > +++ b/arch/arm/crypto/nh-neon-core.S
>> > @@ -0,0 +1,116 @@
>> > +/* SPDX-License-Identifier: GPL-2.0 */
>> > +/*
>> > + * NH - ε-almost-universal hash function, NEON accelerated version
>> > + *
>> > + * Copyright 2018 Google LLC
>> > + *
>> > + * Author: Eric Biggers <ebiggers@google.com>
>> > + */
>> > +
>> > +#include <linux/linkage.h>
>> > +
>> > +       .text
>> > +       .fpu            neon
>> > +
>> > +       KEY             .req    r0
>> > +       MESSAGE         .req    r1
>> > +       MESSAGE_LEN     .req    r2
>> > +       HASH            .req    r3
>> > +
>> > +       PASS0_SUMS      .req    q0
>> > +       PASS0_SUM_A     .req    d0
>> > +       PASS0_SUM_B     .req    d1
>> > +       PASS1_SUMS      .req    q1
>> > +       PASS1_SUM_A     .req    d2
>> > +       PASS1_SUM_B     .req    d3
>> > +       PASS2_SUMS      .req    q2
>> > +       PASS2_SUM_A     .req    d4
>> > +       PASS2_SUM_B     .req    d5
>> > +       PASS3_SUMS      .req    q3
>> > +       PASS3_SUM_A     .req    d6
>> > +       PASS3_SUM_B     .req    d7
>> > +       K0              .req    q4
>> > +       K1              .req    q5
>> > +       K2              .req    q6
>> > +       K3              .req    q7
>> > +       T0              .req    q8
>> > +       T0_L            .req    d16
>> > +       T0_H            .req    d17
>> > +       T1              .req    q9
>> > +       T1_L            .req    d18
>> > +       T1_H            .req    d19
>> > +       T2              .req    q10
>> > +       T2_L            .req    d20
>> > +       T2_H            .req    d21
>> > +       T3              .req    q11
>> > +       T3_L            .req    d22
>> > +       T3_H            .req    d23
>> > +
>> > +.macro _nh_stride      k0, k1, k2, k3
>> > +
>> > +       // Load next message stride
>> > +       vld1.8          {T3}, [MESSAGE]!
>> > +
>> > +       // Load next key stride
>> > +       vld1.32         {\k3}, [KEY]!
>> > +
>> > +       // Add message words to key words
>> > +       vadd.u32        T0, T3, \k0
>> > +       vadd.u32        T1, T3, \k1
>> > +       vadd.u32        T2, T3, \k2
>> > +       vadd.u32        T3, T3, \k3
>> > +
>> > +       // Multiply 32x32 => 64 and accumulate
>> > +       vmlal.u32       PASS0_SUMS, T0_L, T0_H
>> > +       vmlal.u32       PASS1_SUMS, T1_L, T1_H
>> > +       vmlal.u32       PASS2_SUMS, T2_L, T2_H
>> > +       vmlal.u32       PASS3_SUMS, T3_L, T3_H
>> > +.endm
>> > +
>>
>> Since we seem to have some spare NEON registers: would it help to have
>> a double round version of this macro?
>>
>
> It helps a little bit, but not much.  The loads are the only part that can be
> optimized further.  I think I'd rather have the shorter + simpler version,
> unless the loads can be optimized significantly more on other processors.
>
> Also, originally I had it loading the key and message for the next stride while
> doing the current one, but that didn't seem worthwhile either.
>

Fair enough.

>> > +/*
>> > + * void nh_neon(const u32 *key, const u8 *message, size_t message_len,
>> > + *             u8 hash[NH_HASH_BYTES])
>> > + *
>> > + * It's guaranteed that message_len % 16 == 0.
>> > + */
>> > +ENTRY(nh_neon)
>> > +
>> > +       vld1.32         {K0,K1}, [KEY]!
>> > +         vmov.u64      PASS0_SUMS, #0
>> > +         vmov.u64      PASS1_SUMS, #0
>> > +       vld1.32         {K2}, [KEY]!
>> > +         vmov.u64      PASS2_SUMS, #0
>> > +         vmov.u64      PASS3_SUMS, #0
>> > +
>> > +       subs            MESSAGE_LEN, MESSAGE_LEN, #64
>> > +       blt             .Lloop4_done
>> > +.Lloop4:
>> > +       _nh_stride      K0, K1, K2, K3
>> > +       _nh_stride      K1, K2, K3, K0
>> > +       _nh_stride      K2, K3, K0, K1
>> > +       _nh_stride      K3, K0, K1, K2
>> > +       subs            MESSAGE_LEN, MESSAGE_LEN, #64
>> > +       bge             .Lloop4
>> > +
>> > +.Lloop4_done:
>> > +       ands            MESSAGE_LEN, MESSAGE_LEN, #63
>> > +       beq             .Ldone
>> > +       _nh_stride      K0, K1, K2, K3
>> > +
>> > +       subs            MESSAGE_LEN, MESSAGE_LEN, #16
>> > +       beq             .Ldone
>> > +       _nh_stride      K1, K2, K3, K0
>> > +
>> > +       subs            MESSAGE_LEN, MESSAGE_LEN, #16
>> > +       beq             .Ldone
>> > +       _nh_stride      K2, K3, K0, K1
>> > +
>> > +.Ldone:
>> > +       // Sum the accumulators for each pass, then store the sums to 'hash'
>> > +       vadd.u64        T0_L, PASS0_SUM_A, PASS0_SUM_B
>> > +       vadd.u64        T0_H, PASS1_SUM_A, PASS1_SUM_B
>> > +       vadd.u64        T1_L, PASS2_SUM_A, PASS2_SUM_B
>> > +       vadd.u64        T1_H, PASS3_SUM_A, PASS3_SUM_B
>> > +       vst1.8          {T0-T1}, [HASH]
>> > +       bx              lr
>> > +ENDPROC(nh_neon)
>> > diff --git a/arch/arm/crypto/nhpoly1305-neon-glue.c b/arch/arm/crypto/nhpoly1305-neon-glue.c
>> > new file mode 100644
>> > index 0000000000000..df48a00f4c50f
>> > --- /dev/null
>> > +++ b/arch/arm/crypto/nhpoly1305-neon-glue.c
>> > @@ -0,0 +1,78 @@
>> > +// SPDX-License-Identifier: GPL-2.0
>> > +/*
>> > + * NHPoly1305 - ε-almost-∆-universal hash function for Adiantum
>> > + * (NEON accelerated version)
>> > + *
>> > + * Copyright 2018 Google LLC
>> > + */
>> > +
>> > +#include <asm/neon.h>
>> > +#include <asm/simd.h>
>> > +#include <crypto/internal/hash.h>
>> > +#include <crypto/nhpoly1305.h>
>> > +#include <linux/module.h>
>> > +
>> > +asmlinkage void nh_neon(const u32 *key, const u8 *message, size_t message_len,
>> > +                       u8 hash[NH_HASH_BYTES]);
>> > +
>> > +static void _nh_neon(const u32 *key, const u8 *message, size_t message_len,
>> > +                    __le64 hash[NH_NUM_PASSES])
>> > +{
>> > +       nh_neon(key, message, message_len, (u8 *)hash);
>> > +}
>> > +
>>
>> Why do we need this function?
>>
>
> For now it's not needed so I should probably just remove it, but it seems likely
> that indirect calls to assembly functions in the kernel will be going away in
> order to add support for CFI (control flow integrity).  The android-4.9 and
> android-4.14 kernels support CFI on arm64, so you might notice that some of the
> arm64 crypto code had to be patched for this reason.
>

I didn't actually look at those kernel trees so I hadn't noticed yet.
In any case, I'd suggest that we just keep this wrapper then, but
please add a comment describing why it's there.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 09/12] crypto: nhpoly1305 - add NHPoly1305 support
  2018-10-20  5:38     ` Eric Biggers
@ 2018-10-20 15:06       ` Ard Biesheuvel
  2018-10-22 18:42         ` Eric Biggers
  0 siblings, 1 reply; 54+ messages in thread
From: Ard Biesheuvel @ 2018-10-20 15:06 UTC (permalink / raw)
  To: Eric Biggers
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fscrypt,
	linux-arm-kernel, Linux Kernel Mailing List, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

On 20 October 2018 at 13:38, Eric Biggers <ebiggers@kernel.org> wrote:
> Hi Ard,
>
> On Sat, Oct 20, 2018 at 12:00:31PM +0800, Ard Biesheuvel wrote:
>> On 16 October 2018 at 01:54, Eric Biggers <ebiggers@kernel.org> wrote:
>> > From: Eric Biggers <ebiggers@google.com>
>> >
>> > Add a generic implementation of NHPoly1305, an ε-almost-∆-universal hash
>> > function used in the Adiantum encryption mode.
>> >
>> > CONFIG_NHPOLY1305 is not selectable by itself since there won't be any
>> > real reason to enable it without also enabling Adiantum support.
>> >
>> > Signed-off-by: Eric Biggers <ebiggers@google.com>
>> > ---
>> >  crypto/Kconfig              |    5 +
>> >  crypto/Makefile             |    1 +
>> >  crypto/nhpoly1305.c         |  288 ++++++++
>> >  crypto/testmgr.c            |    6 +
>> >  crypto/testmgr.h            | 1240 ++++++++++++++++++++++++++++++++++-
>> >  include/crypto/nhpoly1305.h |   74 +++
>> >  6 files changed, 1610 insertions(+), 4 deletions(-)
>> >  create mode 100644 crypto/nhpoly1305.c
>> >  create mode 100644 include/crypto/nhpoly1305.h
>> >
>> > diff --git a/crypto/Kconfig b/crypto/Kconfig
>> > index 4fa0a4a0e8615..431beca903623 100644
>> > --- a/crypto/Kconfig
>> > +++ b/crypto/Kconfig
>> > @@ -493,6 +493,11 @@ config CRYPTO_KEYWRAP
>> >           Support for key wrapping (NIST SP800-38F / RFC3394) without
>> >           padding.
>> >
>> > +config CRYPTO_NHPOLY1305
>> > +       tristate
>> > +       select CRYPTO_HASH
>> > +       select CRYPTO_POLY1305
>> > +
>> >  comment "Hash modes"
>> >
>> >  config CRYPTO_CMAC
>> > diff --git a/crypto/Makefile b/crypto/Makefile
>> > index 7e673f7c71107..87b86f221a2a2 100644
>> > --- a/crypto/Makefile
>> > +++ b/crypto/Makefile
>> > @@ -84,6 +84,7 @@ obj-$(CONFIG_CRYPTO_LRW) += lrw.o
>> >  obj-$(CONFIG_CRYPTO_XTS) += xts.o
>> >  obj-$(CONFIG_CRYPTO_CTR) += ctr.o
>> >  obj-$(CONFIG_CRYPTO_KEYWRAP) += keywrap.o
>> > +obj-$(CONFIG_CRYPTO_NHPOLY1305) += nhpoly1305.o
>> >  obj-$(CONFIG_CRYPTO_GCM) += gcm.o
>> >  obj-$(CONFIG_CRYPTO_CCM) += ccm.o
>> >  obj-$(CONFIG_CRYPTO_CHACHA20POLY1305) += chacha20poly1305.o
>> > diff --git a/crypto/nhpoly1305.c b/crypto/nhpoly1305.c
>> > new file mode 100644
>> > index 0000000000000..087ad7680dd62
>> > --- /dev/null
>> > +++ b/crypto/nhpoly1305.c
>> > @@ -0,0 +1,288 @@
>> > +// SPDX-License-Identifier: GPL-2.0
>> > +/*
>> > + * NHPoly1305 - ε-almost-∆-universal hash function for Adiantum
>> > + *
>> > + * Copyright 2018 Google LLC
>> > + */
>> > +
>> > +/*
>> > + * "NHPoly1305" is the main component of Adiantum hashing.
>> > + * Specifically, it is the calculation
>> > + *
>> > + *     H_M ← Poly1305_{K_M}(NH_{K_N}(pad_{128}(M)))
>> > + *
>> > + * from the procedure in section A.5 of the Adiantum paper [1].  It is an
>> > + * ε-almost-∆-universal (εA∆U) hash function for equal-length inputs over
>> > + * Z/(2^{128}Z), where the "∆" operation is addition.  It hashes 1024-byte
>> > + * chunks of the input with the NH hash function [2], reducing the input length
>> > + * by 32x.  The resulting NH digests are evaluated as a polynomial in
>> > + * GF(2^{130}-5), like in the Poly1305 MAC [3].  Note that the polynomial
>> > + * evaluation by itself would suffice to achieve the εA∆U property; NH is used
>> > + * for performance since it's over twice as fast as Poly1305.
>> > + *
>> > + * This is *not* a cryptographic hash function; do not use it as such!
>> > + *
>> > + * [1] Adiantum: length-preserving encryption for entry-level processors
>> > + *     (https://eprint.iacr.org/2018/720.pdf)
>> > + * [2] UMAC: Fast and Secure Message Authentication
>> > + *     (https://fastcrypto.org/umac/umac_proc.pdf)
>> > + * [3] The Poly1305-AES message-authentication code
>> > + *     (https://cr.yp.to/mac/poly1305-20050329.pdf)
>> > + */
>> > +
>> > +#include <asm/unaligned.h>
>> > +#include <crypto/algapi.h>
>> > +#include <crypto/internal/hash.h>
>> > +#include <crypto/nhpoly1305.h>
>> > +#include <linux/crypto.h>
>> > +#include <linux/kernel.h>
>> > +#include <linux/module.h>
>> > +
>> > +#define NH_STRIDE(K0, K1, K2, K3)                              \
>> > +({                                                             \
>> > +       m_A = get_unaligned_le32(src); src += 4;                \
>> > +       m_B = get_unaligned_le32(src); src += 4;                \
>> > +       m_C = get_unaligned_le32(src); src += 4;                \
>> > +       m_D = get_unaligned_le32(src); src += 4;                \
>> > +       K3##_A = *key++;                                        \
>> > +       K3##_B = *key++;                                        \
>> > +       K3##_C = *key++;                                        \
>> > +       K3##_D = *key++;                                        \
>> > +       sum0 += (u64)(u32)(m_A + K0##_A) * (u32)(m_C + K0##_C); \
>> > +       sum1 += (u64)(u32)(m_A + K1##_A) * (u32)(m_C + K1##_C); \
>> > +       sum2 += (u64)(u32)(m_A + K2##_A) * (u32)(m_C + K2##_C); \
>> > +       sum3 += (u64)(u32)(m_A + K3##_A) * (u32)(m_C + K3##_C); \
>> > +       sum0 += (u64)(u32)(m_B + K0##_B) * (u32)(m_D + K0##_D); \
>> > +       sum1 += (u64)(u32)(m_B + K1##_B) * (u32)(m_D + K1##_D); \
>> > +       sum2 += (u64)(u32)(m_B + K2##_B) * (u32)(m_D + K2##_D); \
>> > +       sum3 += (u64)(u32)(m_B + K3##_B) * (u32)(m_D + K3##_D); \
>> > +})
>> > +
>> > +static void nh_generic(const u32 *key, const u8 *src, size_t srclen,
>> > +                      __le64 hash[NH_NUM_PASSES])
>> > +{
>> > +       u64 sum0 = 0, sum1 = 0, sum2 = 0, sum3 = 0;
>> > +       u32 k0_A = *key++;
>> > +       u32 k0_B = *key++;
>> > +       u32 k0_C = *key++;
>> > +       u32 k0_D = *key++;
>> > +       u32 k1_A = *key++;
>> > +       u32 k1_B = *key++;
>> > +       u32 k1_C = *key++;
>> > +       u32 k1_D = *key++;
>> > +       u32 k2_A = *key++;
>> > +       u32 k2_B = *key++;
>> > +       u32 k2_C = *key++;
>> > +       u32 k2_D = *key++;
>> > +       u32 k3_A, k3_B, k3_C, k3_D;
>> > +       u32 m_A, m_B, m_C, m_D;
>> > +       size_t n = srclen / NH_MESSAGE_UNIT;
>> > +
>> > +       BUILD_BUG_ON(NH_PAIR_STRIDE != 2);
>> > +       BUILD_BUG_ON(NH_NUM_PASSES != 4);
>> > +
>> > +       while (n >= 4) {
>> > +               NH_STRIDE(k0, k1, k2, k3);
>> > +               NH_STRIDE(k1, k2, k3, k0);
>> > +               NH_STRIDE(k2, k3, k0, k1);
>> > +               NH_STRIDE(k3, k0, k1, k2);
>> > +               n -= 4;
>> > +       }
>> > +       if (n) {
>> > +               NH_STRIDE(k0, k1, k2, k3);
>> > +               if (--n) {
>> > +                       NH_STRIDE(k1, k2, k3, k0);
>> > +                       if (--n)
>> > +                               NH_STRIDE(k2, k3, k0, k1);
>> > +               }
>> > +       }
>> > +
>>
>> This all looks a bit clunky to me, with the macro, the *key++s in the
>> initializers and these conditionals.
>>
>> Was it written in this particular way to get GCC to optimize it in the
>> right way?
>
> This does get compiled into something much faster than a naive version, which
> you can find commented out at
> https://github.com/google/adiantum/blob/master/benchmark/src/nh.c#L14.
>
> Though, I admit that I haven't put a ton of effort into this C implementation of
> NH yet.  Right now it's actually somewhat of a translation of the NEON version.
> I'll do some experiments and see if it can be made into something less ugly
> without losing performance.
>

No that's fine but please document it.

>>
>> > +       hash[0] = cpu_to_le64(sum0);
>> > +       hash[1] = cpu_to_le64(sum1);
>> > +       hash[2] = cpu_to_le64(sum2);
>> > +       hash[3] = cpu_to_le64(sum3);
>> > +}
>> > +
>> > +/* Pass the next NH hash value through Poly1305 */
>> > +static void process_nh_hash_value(struct nhpoly1305_state *state,
>> > +                                 const struct nhpoly1305_key *key)
>> > +{
>> > +       BUILD_BUG_ON(NH_HASH_BYTES % POLY1305_BLOCK_SIZE != 0);
>> > +
>> > +       poly1305_core_blocks(&state->poly_state, &key->poly_key, state->nh_hash,
>> > +                            NH_HASH_BYTES / POLY1305_BLOCK_SIZE);
>> > +}
>> > +
>> > +/*
>> > + * Feed the next portion of the source data, as a whole number of 16-byte
>> > + * "NH message units", through NH and Poly1305.  Each NH hash is taken over
>> > + * 1024 bytes, except possibly the final one which is taken over a multiple of
>> > + * 16 bytes up to 1024.  Also, in the case where data is passed in misaligned
>> > + * chunks, we combine partial hashes; the end result is the same either way.
>> > + */
>> > +static void nhpoly1305_units(struct nhpoly1305_state *state,
>> > +                            const struct nhpoly1305_key *key,
>> > +                            const u8 *src, unsigned int srclen, nh_t nh_fn)
>>
>> Since indirect calls are going out of style: can we get rid of the
>> function pointer? Or is the compiler already inferring that it always
>> refers to nh_generic()?
>>
>
> At least for now I want to use the same crypto_nhpoly1305_*_helper() functions
> for all nhpoly1305 implementations, and that requires that 'nh' be a function
> pointer.  The helpers could be placed in a header and inlined which would turn
> 'nh' into a direct call, but it seemed to be too much code to inline, and
> normally 'nh' is only invoked once per 1024 bytes anyway.
>

OK.
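
To make the trade-off above concrete, here is a minimal userspace sketch of a
helper that takes NH as a function pointer and invokes it once per chunk of up
to 1024 bytes.  The names for_each_nh_chunk(), nh_fn_t, nh_stub() and
NH_MESSAGE_BYTES are assumptions made for this illustration, not the patch's
identifiers, and the sketch ignores the partial-hash handling for misaligned
chunks that the patch comment mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NH_NUM_PASSES    4
#define NH_MESSAGE_BYTES 1024	/* assumed name: max bytes covered by one NH hash */

/* Hypothetical userspace stand-in for the patch's 'nh_t' function pointer. */
typedef void (*nh_fn_t)(const uint32_t *key, const uint8_t *msg,
			size_t len, uint64_t hash[NH_NUM_PASSES]);

/*
 * Walk 'len' bytes (assumed to be a multiple of 16) in chunks of up to
 * 1024 bytes.  The indirect call happens once per chunk, not once per
 * 16-byte message unit.  Returns the number of NH invocations.
 */
static size_t for_each_nh_chunk(const uint32_t *key, const uint8_t *src,
				size_t len, nh_fn_t nh)
{
	size_t calls = 0;

	while (len) {
		size_t n = len < NH_MESSAGE_BYTES ? len : NH_MESSAGE_BYTES;
		uint64_t hash[NH_NUM_PASSES];

		nh(key, src, n, hash);
		/* A real helper would pass 'hash' through Poly1305 here. */
		src += n;
		len -= n;
		calls++;
	}
	return calls;
}

/* Stub NH that only records the length of the chunk it was given. */
static size_t last_chunk_len;

static void nh_stub(const uint32_t *key, const uint8_t *msg,
		    size_t len, uint64_t hash[NH_NUM_PASSES])
{
	int i;

	(void)key;
	(void)msg;
	for (i = 0; i < NH_NUM_PASSES; i++)
		hash[i] = 0;
	last_chunk_len = len;
}

/* 2064 bytes split into chunks of 1024, 1024 and 16 bytes. */
static void chunking_selfcheck(void)
{
	static uint8_t buf[2064];

	assert(for_each_nh_chunk(NULL, buf, sizeof(buf), nh_stub) == 3);
	assert(last_chunk_len == 16);
}
```

With 1024-byte chunks, the cost of the indirect call is spread over 64 message
units, which is consistent with the point that 'nh' is normally invoked only
once per 1024 bytes.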

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 00/12] crypto: Adiantum support
  2018-10-19 19:04   ` Eric Biggers
  2018-10-20 10:26     ` Milan Broz
@ 2018-10-21 22:23     ` Eric Biggers
  2018-10-21 22:51       ` Jason A. Donenfeld
  1 sibling, 1 reply; 54+ messages in thread
From: Eric Biggers @ 2018-10-21 22:23 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Linux Crypto Mailing List, linux-fscrypt, linux-arm-kernel, LKML,
	Herbert Xu, Paul Crowley, Greg Kaiser, Michael Halcrow,
	Samuel Neves, Tomer Ashur

On Fri, Oct 19, 2018 at 12:04:11PM -0700, Eric Biggers wrote:
> Hi Jason,
> 
> On Fri, Oct 19, 2018 at 05:58:35PM +0200, Jason A. Donenfeld wrote:
> > Hello Eric,
> > 
> > > As before, some of these patches conflict with the new "Zinc" crypto
> > > library.  But I don't know when Zinc will be merged, so for now I've
> > > continued to base this patchset on the current 'cryptodev'.
> > 
> > I'd appreciate it if you waited to merge this until you can rebase it
> > on top of Zinc. In fact, if you already want to build it on top of
> > Zinc, I'm happy to work with you on that in a shared repo or similar.
> > We can also hash out the details of that in person in Vancouver in a
> > few weeks. I think pushing this in before will create undesirable
> > churn for both of us.
> > 
> 
> I won't be at Plumbers, sorry!  For if/when it's needed, I'll start a version of
> this based on Zinc.  The basic requirements are that we need (1) xchacha12 and
> xchacha20 available as 'skciphers' in the crypto API, and (2) the poly1305_core
> functions (see patch 08/12).  In principle, these can be implemented in Zinc.
> The Adiantum template and all the NHPoly1305 stuff will be the same either way.
> (Unless you'll want one or both of those moved to Zinc too.  To be honest, even
> after your explanations I still don't have a clear idea of what is supposed to
> go in Zinc and what isn't...)
> 
> However, for now I'm hesitant to completely abandon the current approach and bet
> the farm on Zinc.  Zinc has a large scope and various controversies that haven't
> yet been fully resolved to everyone's satisfaction, including unclear licenses
> on some of the essential assembly files.  It's not appropriate to grind kernel
> crypto development to a halt while everyone waits for Zinc.
> 
> So if Zinc is ready, then it makes sense for it to go first;
> otherwise, it doesn't.  It's not yet clear which is the case.
> 

I started a branch based on Zinc:
https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux.git,
branch "adiantum-zinc".

For Poly1305, for now I decided to just use the existing functions, passing 0
for the 16-byte element that is added at the end.  This causes some unnecessary
overhead, but it's not very much.  It also results in a much larger size of
'struct nhpoly1305_state', but that doesn't matter too much anymore either [1].

For ChaCha, I haven't yet updated all the "Zinc" assembly to support 12 rounds.
So far I've updated my ARM scalar implementation.  I still don't see how you
expect people to maintain the files like chacha20-x86_64.S from which all
comments, register aliases, etc. were removed in comparison to the original
OpenSSL code.  I find it very hard to understand what's going on from what is
nearly an 'objdump' output.  (I'll figure it out eventually, but it will take
some time.)  I don't see how dumping thousands of lines of undocumented,
generated assembly code into the kernel fits with your goals of "Zinc's focus is
on simplicity and clarity" and "inviting collaboration".  Note that the
OpenSSL-derived assembly files still have an unclear license as well.

I'm also still not a fan of the remaining duplication between "zinc" and
"crypto", e.g. we still have both crypto/chacha.h and zinc/chacha.h, and
separate tests for "zinc" and "crypto".  (I haven't yet gotten around to adding
"zinc tests" for XChaCha12, though I did add "crypto tests".  Note that "crypto
tests" are much easier to add, since all algorithms of the same type share a
common test framework -- not the case for Zinc.)

Of course, both myself and others have expressed concerns about these issues
previously too, yet they remain unaddressed, nor is there a documentation file
explaining things.  So please understand that until it's clear that Zinc is
ready, I still have to have Adiantum ready to go without Zinc, just in case.

Thanks,

- Eric

[1] Originally we were going to define Adiantum's hash function to be
    Poly1305(message_length || tweak_length || tweak || NH(message)), which
    would have made it desirable to export the Poly1305 state before NH, so that
    it could be imported for the second hash step to avoid redundantly hashing
    the message length and tweak.  But later we changed it to
    Poly1305(message_length || tweak) + Poly1305(NH(message)).

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 00/12] crypto: Adiantum support
  2018-10-21 22:23     ` Eric Biggers
@ 2018-10-21 22:51       ` Jason A. Donenfeld
  2018-10-22 17:17         ` Paul Crowley
  0 siblings, 1 reply; 54+ messages in thread
From: Jason A. Donenfeld @ 2018-10-21 22:51 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Linux Crypto Mailing List, linux-fscrypt, linux-arm-kernel, LKML,
	Herbert Xu, Paul Crowley, Greg Kaiser, Michael Halcrow,
	Samuel Neves, Tomer Ashur

Hey Eric,

On Mon, Oct 22, 2018 at 12:23 AM Eric Biggers <ebiggers@kernel.org> wrote:
> I started a branch based on Zinc:

Nice to see. I'm heading to bed in a second, so I'll give this a
thorough read-through tomorrow, but some preliminary notes on your
comments:

> For Poly1305, for now I decided to just use the existing functions, passing 0
> for the 16-byte element that is added at the end.  This causes some unnecessary
> overhead, but it's not very much.  It also results in a much larger size of
> 'struct nhpoly1305_state', but that doesn't matter too much anymore either [1].
> [1] Originally we were going to define Adiantum's hash function to be
>     Poly1305(message_length || tweak_length || tweak || NH(message)), which
>     would have made it desirable to export the Poly1305 state before NH, so that
>     it could be imported for the second hash step to avoid redundantly hashing
>     the message length and tweak.  But later we changed it to
>     Poly1305(message_length || tweak) + Poly1305(NH(message)).

Out of curiosity, why this change?

> For ChaCha, I haven't yet updated all the "Zinc" assembly to support 12 rounds.
> So far I've updated my ARM scalar implementation.  I still don't see how you
> expect people to maintain the files like chacha20-x86_64.S from which all
> comments, register aliases, etc. were removed in comparison to the original
> OpenSSL code.

For at least the ARM[64] and MIPS64 code, I think it will be feasible
to import the .pl eventually. There's an open PR from Andy importing
some of the necessary changes. For the x86_64, that might be a little
trickier, but I can take another stab at it.

> I don't see how dumping thousands of lines of undocumented,
> generated assembly code into the kernel fits with your goals of "Zinc's focus is
> on simplicity and clarity" and "inviting collaboration".

It's not totally "undocumented" and totally "dumped"; that's a bit
hyperbolic. But I can understand it's not as friendly as we'd like.
I'll try to improve that.

> Note that the
> OpenSSL-derived assembly files still have an unclear license as well.

Andy's been pretty clear about the CRYPTOGAMS aspect with me. But, as
you pointed out on lkml and in the private thread, it hasn't yet
migrated over to the CRYPTOGAMS repo. I don't think this is a cause
for immediate concern, because it seems pretty certain it will wind up
there soon enough.

> (I haven't yet gotten around to adding
> "zinc tests" for XChaCha12, though I did add "crypto tests".  Note that "crypto
> tests" are much easier to add, since all algorithms of the same type share a
> common test framework -- not the case for Zinc.)

Actually the advantage of not working with a winding abstraction layer
is that specific tests can test particular aspects of particular
primitives -- for example, by looking at different chunking patterns.
It also enables you to write tests for internal, non-exported
functions.

> nor is there a documentation file
> explaining things.

Sorry, my bad on delaying that one. I'll be sure the Documentation/
stuff is ready before posting another series.

> So please understand that until it's clear that Zinc is
> ready, I still have to have Adiantum ready to go without Zinc, just in case.

Makes sense. I do really appreciate you taking the time, though, to
try this out with Zinc as well. Thanks for that.

Regards,
Jason

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 00/12] crypto: Adiantum support
       [not found]     ` <2395454e-a0dc-408f-4138-9d15ab5f20b8@esat.kuleuven.be>
@ 2018-10-22 11:20       ` Tomer Ashur
  0 siblings, 0 replies; 54+ messages in thread
From: Tomer Ashur @ 2018-10-22 11:20 UTC (permalink / raw)
  To: Paul Crowley, Jason
  Cc: ebiggers, linux-crypto, linux-fscrypt, linux-arm-kernel,
	linux-kernel, Herbert Xu, Greg Kaiser, Michael Halcrow,
	samuel.c.p.neves



> On 19-Oct-18 8:19 PM, Paul Crowley wrote:
>> I would prefer not to wait. Unlike a new primitive whose strength can
>> only be known through attempts at cryptanalysis, Adiantum is a
>> construction based on
>> well-understood and trusted primitives; it is secure if the proof
>> accompanying it is correct. Given that (outside competitions or
>> standardization efforts) no-one ever issues public statements that
>> they think algorithms or proofs are good, what I'm expecting from
>> academia is silence :) The most we could hope for would be getting the
>> paper accepted at a conference, and we're pursuing that but there's a
>> good chance that won't happen simply because it's not very novel. It
>> basically takes existing ideas and applies them using a stream cipher
>> instead of a block cipher, and a faster hashing mode; it's also a
>> small update from HPolyC. I've had some private feedback that the
>> proof seems correct, and that's all I'm expecting to get.
>
I tend to agree with Paul on this point. This is a place where academia
needs to improve. An attempt to do so is the Real World Crypto
conference (RWC; https://rwc.iacr.org/2019/), but the deadline for
submissions was October 1st. For HpolyC I asked a few people to take a
look at the construction and the consensus was that it seems secure but
that the proof style makes it hard to verify. I haven't had the time yet
to read the Adiantum paper (and I'm not a provable security person
anyway) but I suppose Paul took the comments he received on this into
account and that's the best we can hope for. Academia simply moves at a
different pace and has different incentives.

 Tomer



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 00/12] crypto: Adiantum support
  2018-10-21 22:51       ` Jason A. Donenfeld
@ 2018-10-22 17:17         ` Paul Crowley
  0 siblings, 0 replies; 54+ messages in thread
From: Paul Crowley @ 2018-10-22 17:17 UTC (permalink / raw)
  To: Jason
  Cc: ebiggers, linux-crypto, linux-fscrypt, linux-arm-kernel,
	linux-kernel, Herbert Xu, Greg Kaiser, Michael Halcrow,
	samuel.c.p.neves, tomer.ashur

On Sun, 21 Oct 2018 at 15:52, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> > [1] Originally we were going to define Adiantum's hash function to be
> >     Poly1305(message_length || tweak_length || tweak || NH(message)), which
> >     would have made it desirable to export the Poly1305 state before NH, so that
> >     it could be imported for the second hash step to avoid redundantly hashing
> >     the message length and tweak.  But later we changed it to
> >     Poly1305(message_length || tweak) + Poly1305(NH(message)).
>
> Out of curiosity, why this change?

With the old system, Eric ended up implementing a function which took
"message_length || tweak_length || tweak || message" as input and
*parsed out* the lengths in the first 16 bytes to know when to start
applying NH. That struck me as not nice at all, and we worked together
to design something that fitted more naturally into the way that
crypto is done in the kernel.

With this change, the part that can be kept in common between the two
hashing stages is cleanly separated from the part that will be
different, and the Poly1305(NH(message)) construction is a relatively
clean thing by itself to be part of the Linux kernel, though by itself
it is only epsilon-almost-delta-universal over equal-length inputs so
it has to be combined with something else to handle varying-length
inputs. This is not too dissimilar from the caveats around GHASH which
is also part of the kernel.
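
Written out, the revised definition quoted above and the property mentioned
for Poly1305(NH(message)) might look as follows.  This is only a sketch: the
symbols H and h and the key names K_T, K_M and K_N are labels chosen here, not
notation from this thread, and the "+" stands for whatever group addition the
Adiantum paper specifies for the two Poly1305 outputs.

```latex
\[
  H_K(T, M) \;=\; \mathrm{Poly1305}_{K_T}\bigl(\mathrm{len}(M) \,\|\, T\bigr)
            \;+\; h_K(M),
  \qquad
  h_K(M) \;=\; \mathrm{Poly1305}_{K_M}\bigl(\mathrm{NH}_{K_N}(M)\bigr)
\]
\[
  \Pr_{K}\bigl[\, h_K(M) - h_K(M') = \delta \,\bigr] \;\le\; \varepsilon
  \quad \text{for all } M \ne M' \text{ with } |M| = |M'| \text{ and all } \delta
\]
```

The second line is the usual statement of epsilon-almost-delta-universality,
restricted to equal-length inputs.  The first term of H depends on the message
length, which is what lets the combination cope with varying-length inputs.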

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 09/12] crypto: nhpoly1305 - add NHPoly1305 support
  2018-10-20 15:06       ` Ard Biesheuvel
@ 2018-10-22 18:42         ` Eric Biggers
  2018-10-22 22:25           ` Ard Biesheuvel
  0 siblings, 1 reply; 54+ messages in thread
From: Eric Biggers @ 2018-10-22 18:42 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fscrypt,
	linux-arm-kernel, Linux Kernel Mailing List, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

On Sat, Oct 20, 2018 at 11:06:00PM +0800, Ard Biesheuvel wrote:
> >> > +
> >> > +#define NH_STRIDE(K0, K1, K2, K3)                              \
> >> > +({                                                             \
> >> > +       m_A = get_unaligned_le32(src); src += 4;                \
> >> > +       m_B = get_unaligned_le32(src); src += 4;                \
> >> > +       m_C = get_unaligned_le32(src); src += 4;                \
> >> > +       m_D = get_unaligned_le32(src); src += 4;                \
> >> > +       K3##_A = *key++;                                        \
> >> > +       K3##_B = *key++;                                        \
> >> > +       K3##_C = *key++;                                        \
> >> > +       K3##_D = *key++;                                        \
> >> > +       sum0 += (u64)(u32)(m_A + K0##_A) * (u32)(m_C + K0##_C); \
> >> > +       sum1 += (u64)(u32)(m_A + K1##_A) * (u32)(m_C + K1##_C); \
> >> > +       sum2 += (u64)(u32)(m_A + K2##_A) * (u32)(m_C + K2##_C); \
> >> > +       sum3 += (u64)(u32)(m_A + K3##_A) * (u32)(m_C + K3##_C); \
> >> > +       sum0 += (u64)(u32)(m_B + K0##_B) * (u32)(m_D + K0##_D); \
> >> > +       sum1 += (u64)(u32)(m_B + K1##_B) * (u32)(m_D + K1##_D); \
> >> > +       sum2 += (u64)(u32)(m_B + K2##_B) * (u32)(m_D + K2##_D); \
> >> > +       sum3 += (u64)(u32)(m_B + K3##_B) * (u32)(m_D + K3##_D); \
> >> > +})
> >> > +
> >> > +static void nh_generic(const u32 *key, const u8 *src, size_t srclen,
> >> > +                      __le64 hash[NH_NUM_PASSES])
> >> > +{
> >> > +       u64 sum0 = 0, sum1 = 0, sum2 = 0, sum3 = 0;
> >> > +       u32 k0_A = *key++;
> >> > +       u32 k0_B = *key++;
> >> > +       u32 k0_C = *key++;
> >> > +       u32 k0_D = *key++;
> >> > +       u32 k1_A = *key++;
> >> > +       u32 k1_B = *key++;
> >> > +       u32 k1_C = *key++;
> >> > +       u32 k1_D = *key++;
> >> > +       u32 k2_A = *key++;
> >> > +       u32 k2_B = *key++;
> >> > +       u32 k2_C = *key++;
> >> > +       u32 k2_D = *key++;
> >> > +       u32 k3_A, k3_B, k3_C, k3_D;
> >> > +       u32 m_A, m_B, m_C, m_D;
> >> > +       size_t n = srclen / NH_MESSAGE_UNIT;
> >> > +
> >> > +       BUILD_BUG_ON(NH_PAIR_STRIDE != 2);
> >> > +       BUILD_BUG_ON(NH_NUM_PASSES != 4);
> >> > +
> >> > +       while (n >= 4) {
> >> > +               NH_STRIDE(k0, k1, k2, k3);
> >> > +               NH_STRIDE(k1, k2, k3, k0);
> >> > +               NH_STRIDE(k2, k3, k0, k1);
> >> > +               NH_STRIDE(k3, k0, k1, k2);
> >> > +               n -= 4;
> >> > +       }
> >> > +       if (n) {
> >> > +               NH_STRIDE(k0, k1, k2, k3);
> >> > +               if (--n) {
> >> > +                       NH_STRIDE(k1, k2, k3, k0);
> >> > +                       if (--n)
> >> > +                               NH_STRIDE(k2, k3, k0, k1);
> >> > +               }
> >> > +       }
> >> > +
> >>
> >> This all looks a bit clunky to me, with the macro, the *key++s in the
> >> initializers and these conditionals.
> >>
> >> Was it written in this particular way to get GCC to optimize it in the
> >> right way?
> >
> > This does get compiled into something much faster than a naive version, which
> > you can find commented out at
> > https://github.com/google/adiantum/blob/master/benchmark/src/nh.c#L14.
> >
> > Though, I admit that I haven't put a ton of effort into this C implementation of
> > NH yet.  Right now it's actually somewhat of a translation of the NEON version.
> > I'll do some experiments and see if it can be made into something less ugly
> > without losing performance.
> >
> 
> No that's fine but please document it.
> 

Hmm, I'm actually leaning towards the following instead.  Unrolling multiple
strides to try to reduce loads of the keys doesn't seem worthwhile in the C
implementation; for one, it bloats the code size a lot
(412 => 2332 bytes on arm32).

static void nh_generic(const u32 *key, const u8 *message, size_t message_len,
		       __le64 hash[NH_NUM_PASSES])
{
	u64 sums[4] = { 0, 0, 0, 0 };

	BUILD_BUG_ON(NH_PAIR_STRIDE != 2);
	BUILD_BUG_ON(NH_NUM_PASSES != 4);

	while (message_len) {
		u32 m0 = get_unaligned_le32(message + 0);
		u32 m1 = get_unaligned_le32(message + 4);
		u32 m2 = get_unaligned_le32(message + 8);
		u32 m3 = get_unaligned_le32(message + 12);

		sums[0] += (u64)(u32)(m0 + key[ 0]) * (u32)(m2 + key[ 2]);
		sums[1] += (u64)(u32)(m0 + key[ 4]) * (u32)(m2 + key[ 6]);
		sums[2] += (u64)(u32)(m0 + key[ 8]) * (u32)(m2 + key[10]);
		sums[3] += (u64)(u32)(m0 + key[12]) * (u32)(m2 + key[14]);
		sums[0] += (u64)(u32)(m1 + key[ 1]) * (u32)(m3 + key[ 3]);
		sums[1] += (u64)(u32)(m1 + key[ 5]) * (u32)(m3 + key[ 7]);
		sums[2] += (u64)(u32)(m1 + key[ 9]) * (u32)(m3 + key[11]);
		sums[3] += (u64)(u32)(m1 + key[13]) * (u32)(m3 + key[15]);
		key += NH_MESSAGE_UNIT / sizeof(key[0]);
		message += NH_MESSAGE_UNIT;
		message_len -= NH_MESSAGE_UNIT;
	}

	hash[0] = cpu_to_le64(sums[0]);
	hash[1] = cpu_to_le64(sums[1]);
	hash[2] = cpu_to_le64(sums[2]);
	hash[3] = cpu_to_le64(sums[3]);
}
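
For anyone who wants to try the simplified loop outside the kernel, here is a
minimal userspace sketch of it.  It is an adaptation, not the patch itself:
nh_sketch() and load_le32() are names made up for this example, load_le32()
stands in for get_unaligned_le32(), the four passes are folded into a loop,
and the results are left in host byte order instead of going through
cpu_to_le64().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NH_MESSAGE_UNIT 16
#define NH_NUM_PASSES   4

/* Portable little-endian 32-bit load from a possibly unaligned pointer. */
static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/*
 * 'message_len' must be a multiple of NH_MESSAGE_UNIT, and 'key' must hold
 * at least 12 + 4 * (message_len / NH_MESSAGE_UNIT) words.
 */
static void nh_sketch(const uint32_t *key, const uint8_t *message,
		      size_t message_len, uint64_t hash[NH_NUM_PASSES])
{
	uint64_t sums[NH_NUM_PASSES] = { 0, 0, 0, 0 };
	int i;

	while (message_len) {
		uint32_t m0 = load_le32(message + 0);
		uint32_t m1 = load_le32(message + 4);
		uint32_t m2 = load_le32(message + 8);
		uint32_t m3 = load_le32(message + 12);

		/* Pass i uses the four key words starting at key[4 * i]. */
		for (i = 0; i < NH_NUM_PASSES; i++) {
			const uint32_t *k = key + 4 * i;

			sums[i] += (uint64_t)(uint32_t)(m0 + k[0]) *
				   (uint32_t)(m2 + k[2]);
			sums[i] += (uint64_t)(uint32_t)(m1 + k[1]) *
				   (uint32_t)(m3 + k[3]);
		}
		key += NH_MESSAGE_UNIT / sizeof(key[0]);
		message += NH_MESSAGE_UNIT;
		message_len -= NH_MESSAGE_UNIT;
	}

	for (i = 0; i < NH_NUM_PASSES; i++)
		hash[i] = sums[i];
}

/* One message unit (m0..m3 = 1, 2, 3, 4) hashed with key[i] = i. */
static void nh_sketch_selfcheck(void)
{
	const uint8_t msg[16] = { 1, 0, 0, 0, 2, 0, 0, 0,
				  3, 0, 0, 0, 4, 0, 0, 0 };
	uint32_t key[16];
	uint64_t hash[NH_NUM_PASSES];
	uint32_t i;

	for (i = 0; i < 16; i++)
		key[i] = i;
	nh_sketch(key, msg, sizeof(msg), hash);
	assert(hash[0] == 26);	/* (1+0)*(3+2)   + (2+1)*(4+3)   */
	assert(hash[1] == 122);	/* (1+4)*(3+6)   + (2+5)*(4+7)   */
	assert(hash[2] == 282);	/* (1+8)*(3+10)  + (2+9)*(4+11)  */
	assert(hash[3] == 506);	/* (1+12)*(3+14) + (2+13)*(4+15) */
}
```

The hand-computed values in nh_sketch_selfcheck() are just arithmetic on small
inputs; they are not official NH test vectors.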

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 09/12] crypto: nhpoly1305 - add NHPoly1305 support
  2018-10-22 18:42         ` Eric Biggers
@ 2018-10-22 22:25           ` Ard Biesheuvel
  2018-10-22 22:40             ` Eric Biggers
  0 siblings, 1 reply; 54+ messages in thread
From: Ard Biesheuvel @ 2018-10-22 22:25 UTC (permalink / raw)
  To: Eric Biggers
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fscrypt,
	linux-arm-kernel, Linux Kernel Mailing List, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

On 22 October 2018 at 15:42, Eric Biggers <ebiggers@kernel.org> wrote:
> On Sat, Oct 20, 2018 at 11:06:00PM +0800, Ard Biesheuvel wrote:
>> >> > +
>> >> > +#define NH_STRIDE(K0, K1, K2, K3)                              \
>> >> > +({                                                             \
>> >> > +       m_A = get_unaligned_le32(src); src += 4;                \
>> >> > +       m_B = get_unaligned_le32(src); src += 4;                \
>> >> > +       m_C = get_unaligned_le32(src); src += 4;                \
>> >> > +       m_D = get_unaligned_le32(src); src += 4;                \
>> >> > +       K3##_A = *key++;                                        \
>> >> > +       K3##_B = *key++;                                        \
>> >> > +       K3##_C = *key++;                                        \
>> >> > +       K3##_D = *key++;                                        \
>> >> > +       sum0 += (u64)(u32)(m_A + K0##_A) * (u32)(m_C + K0##_C); \
>> >> > +       sum1 += (u64)(u32)(m_A + K1##_A) * (u32)(m_C + K1##_C); \
>> >> > +       sum2 += (u64)(u32)(m_A + K2##_A) * (u32)(m_C + K2##_C); \
>> >> > +       sum3 += (u64)(u32)(m_A + K3##_A) * (u32)(m_C + K3##_C); \
>> >> > +       sum0 += (u64)(u32)(m_B + K0##_B) * (u32)(m_D + K0##_D); \
>> >> > +       sum1 += (u64)(u32)(m_B + K1##_B) * (u32)(m_D + K1##_D); \
>> >> > +       sum2 += (u64)(u32)(m_B + K2##_B) * (u32)(m_D + K2##_D); \
>> >> > +       sum3 += (u64)(u32)(m_B + K3##_B) * (u32)(m_D + K3##_D); \
>> >> > +})
>> >> > +
>> >> > +static void nh_generic(const u32 *key, const u8 *src, size_t srclen,
>> >> > +                      __le64 hash[NH_NUM_PASSES])
>> >> > +{
>> >> > +       u64 sum0 = 0, sum1 = 0, sum2 = 0, sum3 = 0;
>> >> > +       u32 k0_A = *key++;
>> >> > +       u32 k0_B = *key++;
>> >> > +       u32 k0_C = *key++;
>> >> > +       u32 k0_D = *key++;
>> >> > +       u32 k1_A = *key++;
>> >> > +       u32 k1_B = *key++;
>> >> > +       u32 k1_C = *key++;
>> >> > +       u32 k1_D = *key++;
>> >> > +       u32 k2_A = *key++;
>> >> > +       u32 k2_B = *key++;
>> >> > +       u32 k2_C = *key++;
>> >> > +       u32 k2_D = *key++;
>> >> > +       u32 k3_A, k3_B, k3_C, k3_D;
>> >> > +       u32 m_A, m_B, m_C, m_D;
>> >> > +       size_t n = srclen / NH_MESSAGE_UNIT;
>> >> > +
>> >> > +       BUILD_BUG_ON(NH_PAIR_STRIDE != 2);
>> >> > +       BUILD_BUG_ON(NH_NUM_PASSES != 4);
>> >> > +
>> >> > +       while (n >= 4) {
>> >> > +               NH_STRIDE(k0, k1, k2, k3);
>> >> > +               NH_STRIDE(k1, k2, k3, k0);
>> >> > +               NH_STRIDE(k2, k3, k0, k1);
>> >> > +               NH_STRIDE(k3, k0, k1, k2);
>> >> > +               n -= 4;
>> >> > +       }
>> >> > +       if (n) {
>> >> > +               NH_STRIDE(k0, k1, k2, k3);
>> >> > +               if (--n) {
>> >> > +                       NH_STRIDE(k1, k2, k3, k0);
>> >> > +                       if (--n)
>> >> > +                               NH_STRIDE(k2, k3, k0, k1);
>> >> > +               }
>> >> > +       }
>> >> > +
>> >>
>> >> This all looks a bit clunky to me, with the macro, the *key++s in the
>> >> initializers and these conditionals.
>> >>
>> >> Was it written in this particular way to get GCC to optimize it in the
>> >> right way?
>> >
>> > This does get compiled into something much faster than a naive version, which
>> > you can find commented out at
>> > https://github.com/google/adiantum/blob/master/benchmark/src/nh.c#L14.
>> >
>> > Though, I admit that I haven't put a ton of effort into this C implementation of
>> > NH yet.  Right now it's actually somewhat of a translation of the NEON version.
>> > I'll do some experiments and see if it can be made into something less ugly
>> > without losing performance.
>> >
>>
>> No that's fine but please document it.
>>
>
> Hmm, I'm actually leaning towards the following instead.  Unrolling multiple
> strides to try to reduce loads of the keys doesn't seem worthwhile in the C
> implementation; for one, it bloats the code size a lot
> (412 => 2332 bytes on arm32).
>
> static void nh_generic(const u32 *key, const u8 *message, size_t message_len,
>                        __le64 hash[NH_NUM_PASSES])
> {
>         u64 sums[4] = { 0, 0, 0, 0 };
>
>         BUILD_BUG_ON(NH_PAIR_STRIDE != 2);
>         BUILD_BUG_ON(NH_NUM_PASSES != 4);
>
>         while (message_len) {
>                 u32 m0 = get_unaligned_le32(message + 0);
>                 u32 m1 = get_unaligned_le32(message + 4);
>                 u32 m2 = get_unaligned_le32(message + 8);
>                 u32 m3 = get_unaligned_le32(message + 12);
>
>                 sums[0] += (u64)(u32)(m0 + key[ 0]) * (u32)(m2 + key[ 2]);
>                 sums[1] += (u64)(u32)(m0 + key[ 4]) * (u32)(m2 + key[ 6]);
>                 sums[2] += (u64)(u32)(m0 + key[ 8]) * (u32)(m2 + key[10]);
>                 sums[3] += (u64)(u32)(m0 + key[12]) * (u32)(m2 + key[14]);
>                 sums[0] += (u64)(u32)(m1 + key[ 1]) * (u32)(m3 + key[ 3]);
>                 sums[1] += (u64)(u32)(m1 + key[ 5]) * (u32)(m3 + key[ 7]);
>                 sums[2] += (u64)(u32)(m1 + key[ 9]) * (u32)(m3 + key[11]);
>                 sums[3] += (u64)(u32)(m1 + key[13]) * (u32)(m3 + key[15]);

Are these (u32) casts really necessary? All the addends are u32 types,
so I'd expect each (x + y) subexpression to have a u32 type already as
well. Or am I missing something?

>                 key += NH_MESSAGE_UNIT / sizeof(key[0]);
>                 message += NH_MESSAGE_UNIT;
>                 message_len -= NH_MESSAGE_UNIT;
>         }
>
>         hash[0] = cpu_to_le64(sums[0]);
>         hash[1] = cpu_to_le64(sums[1]);
>         hash[2] = cpu_to_le64(sums[2]);
>         hash[3] = cpu_to_le64(sums[3]);
> }

In any case, this looks much better to me, so if the performance is
satisfactory, let's use this version.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 09/12] crypto: nhpoly1305 - add NHPoly1305 support
  2018-10-22 22:25           ` Ard Biesheuvel
@ 2018-10-22 22:40             ` Eric Biggers
  2018-10-22 22:43               ` Ard Biesheuvel
  0 siblings, 1 reply; 54+ messages in thread
From: Eric Biggers @ 2018-10-22 22:40 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fscrypt,
	linux-arm-kernel, Linux Kernel Mailing List, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

Hi Ard,

On Mon, Oct 22, 2018 at 07:25:27PM -0300, Ard Biesheuvel wrote:
> >
> > Hmm, I'm actually leaning towards the following instead.  Unrolling multiple
> > strides to try to reduce loads of the keys doesn't seem worthwhile in the C
> > implementation; for one, it bloats the code size a lot
> > (412 => 2332 bytes on arm32).
> >
> > static void nh_generic(const u32 *key, const u8 *message, size_t message_len,
> >                        __le64 hash[NH_NUM_PASSES])
> > {
> >         u64 sums[4] = { 0, 0, 0, 0 };
> >
> >         BUILD_BUG_ON(NH_PAIR_STRIDE != 2);
> >         BUILD_BUG_ON(NH_NUM_PASSES != 4);
> >
> >         while (message_len) {
> >                 u32 m0 = get_unaligned_le32(message + 0);
> >                 u32 m1 = get_unaligned_le32(message + 4);
> >                 u32 m2 = get_unaligned_le32(message + 8);
> >                 u32 m3 = get_unaligned_le32(message + 12);
> >
> >                 sums[0] += (u64)(u32)(m0 + key[ 0]) * (u32)(m2 + key[ 2]);
> >                 sums[1] += (u64)(u32)(m0 + key[ 4]) * (u32)(m2 + key[ 6]);
> >                 sums[2] += (u64)(u32)(m0 + key[ 8]) * (u32)(m2 + key[10]);
> >                 sums[3] += (u64)(u32)(m0 + key[12]) * (u32)(m2 + key[14]);
> >                 sums[0] += (u64)(u32)(m1 + key[ 1]) * (u32)(m3 + key[ 3]);
> >                 sums[1] += (u64)(u32)(m1 + key[ 5]) * (u32)(m3 + key[ 7]);
> >                 sums[2] += (u64)(u32)(m1 + key[ 9]) * (u32)(m3 + key[11]);
> >                 sums[3] += (u64)(u32)(m1 + key[13]) * (u32)(m3 + key[15]);
> 
> Are these (u32) casts really necessary? All the addends are u32 types,
> so I'd expect each (x + y) subexpression to have a u32 type already as
> well. Or am I missing something?
> 

The (u32) casts are only necessary when sizeof(int) > sizeof(u32), as then the
addends will be promoted to 'int'.  Of course, that's never the case for the
Linux kernel.  But I prefer it to be as robust and well-defined as possible,
since people might use this as a reference when coding other implementations,
which could end up finding their way into unusual and/or future platforms.
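
The same promotion effect can be seen today with a narrower type.  The sketch
below uses uint16_t as a stand-in for u32 on a hypothetical platform with a
wider 'int'; it assumes 'int' is at least 32 bits, and the function names are
made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Without the casts: a + b and c + d are computed in 'int' after integer
 * promotion, so the sums do not wrap at 16 bits.
 */
static uint32_t mul_no_cast(uint16_t a, uint16_t b, uint16_t c, uint16_t d)
{
	return (uint32_t)(a + b) * (c + d);
}

/* With the casts: each sum is truncated back to 16 bits before multiplying. */
static uint32_t mul_with_cast(uint16_t a, uint16_t b, uint16_t c, uint16_t d)
{
	return (uint32_t)(uint16_t)(a + b) * (uint16_t)(c + d);
}

static void promotion_selfcheck(void)
{
	/* 0xFFFF + 1 is 0x10000 as an int, but wraps to 0 as a uint16_t. */
	assert(mul_no_cast(0xFFFF, 1, 1, 1) == 0x20000);
	assert(mul_with_cast(0xFFFF, 1, 1, 1) == 0);
}
```

Replacing uint16_t with u32 and 'int' with a hypothetical 64-bit 'int' gives
the situation the explicit (u32) casts guard against.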

- Eric

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC PATCH v2 09/12] crypto: nhpoly1305 - add NHPoly1305 support
  2018-10-22 22:40             ` Eric Biggers
@ 2018-10-22 22:43               ` Ard Biesheuvel
  0 siblings, 0 replies; 54+ messages in thread
From: Ard Biesheuvel @ 2018-10-22 22:43 UTC (permalink / raw)
  To: Eric Biggers
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fscrypt,
	linux-arm-kernel, Linux Kernel Mailing List, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

On 22 October 2018 at 19:40, Eric Biggers <ebiggers@kernel.org> wrote:
> Hi Ard,
>
> On Mon, Oct 22, 2018 at 07:25:27PM -0300, Ard Biesheuvel wrote:
>> >
>> > Hmm, I'm actually leaning towards the following instead.  Unrolling multiple
>> > strides to try to reduce loads of the keys doesn't seem worthwhile in the C
>> > implementation; for one, it bloats the code size a lot
>> > (412 => 2332 bytes on arm32).
>> >
>> > static void nh_generic(const u32 *key, const u8 *message, size_t message_len,
>> >                        __le64 hash[NH_NUM_PASSES])
>> > {
>> >         u64 sums[4] = { 0, 0, 0, 0 };
>> >
>> >         BUILD_BUG_ON(NH_PAIR_STRIDE != 2);
>> >         BUILD_BUG_ON(NH_NUM_PASSES != 4);
>> >
>> >         while (message_len) {
>> >                 u32 m0 = get_unaligned_le32(message + 0);
>> >                 u32 m1 = get_unaligned_le32(message + 4);
>> >                 u32 m2 = get_unaligned_le32(message + 8);
>> >                 u32 m3 = get_unaligned_le32(message + 12);
>> >
>> >                 sums[0] += (u64)(u32)(m0 + key[ 0]) * (u32)(m2 + key[ 2]);
>> >                 sums[1] += (u64)(u32)(m0 + key[ 4]) * (u32)(m2 + key[ 6]);
>> >                 sums[2] += (u64)(u32)(m0 + key[ 8]) * (u32)(m2 + key[10]);
>> >                 sums[3] += (u64)(u32)(m0 + key[12]) * (u32)(m2 + key[14]);
>> >                 sums[0] += (u64)(u32)(m1 + key[ 1]) * (u32)(m3 + key[ 3]);
>> >                 sums[1] += (u64)(u32)(m1 + key[ 5]) * (u32)(m3 + key[ 7]);
>> >                 sums[2] += (u64)(u32)(m1 + key[ 9]) * (u32)(m3 + key[11]);
>> >                 sums[3] += (u64)(u32)(m1 + key[13]) * (u32)(m3 + key[15]);
>>
>> Are these (u32) casts really necessary? All the addends are u32 types,
>> so I'd expect each (x + y) subexpression to have a u32 type already as
>> well. Or am I missing something?
>>
>
> The (u32) casts are only necessary when sizeof(int) > sizeof(u32), as then the
> addends will be promoted to 'int'.  Of course, that's never the case for the
> Linux kernel.  But I prefer it to be as robust and well-defined as possible,
> since people might use this as a reference when coding other implementations,
> which could end up finding their way into unusual and/or future platforms.
>

Fair enough.
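
A minimal user-space sketch of the point about the casts, using <stdint.h>
types in place of the kernel's u32/u64.  The helper name nh_term() is
hypothetical and is not part of the patch; it only isolates one of the
multiply-accumulate terms from the loop quoted above.

```c
#include <stdint.h>

/*
 * One NH term: add a message word and a key word modulo 2^32 (twice), then
 * form the full 64-bit product of the two 32-bit sums.
 * nh_term() is an illustrative helper, not a function from the patch.
 */
static uint64_t nh_term(uint32_t m_a, uint32_t k_a, uint32_t m_b, uint32_t k_b)
{
	/*
	 * The (uint32_t) casts force the wraparound at 2^32 even on a
	 * platform where 'int' is wider than 32 bits and the operands would
	 * be promoted before the addition.  The (uint64_t) cast makes the
	 * multiplication 64 bits wide so that no product bits are lost.
	 */
	return (uint64_t)(uint32_t)(m_a + k_a) * (uint32_t)(m_b + k_b);
}
```

On the usual platforms where 'int' is 32 bits the inner casts are no-ops, so
they cost nothing; they only pin down the intended arithmetic.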

* Re: [RFC PATCH v2 11/12] crypto: adiantum - add Adiantum support
  2018-10-20  7:12     ` Eric Biggers
@ 2018-10-23 10:40       ` Ard Biesheuvel
  2018-10-24 22:06         ` Eric Biggers
  0 siblings, 1 reply; 54+ messages in thread
From: Ard Biesheuvel @ 2018-10-23 10:40 UTC (permalink / raw)
  To: Eric Biggers
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fscrypt,
	linux-arm-kernel, Linux Kernel Mailing List, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

On 20 October 2018 at 15:12, Eric Biggers <ebiggers@kernel.org> wrote:
> Hi Ard,
>
> On Sat, Oct 20, 2018 at 12:17:58PM +0800, Ard Biesheuvel wrote:
>> On 16 October 2018 at 01:54, Eric Biggers <ebiggers@kernel.org> wrote:
>> > From: Eric Biggers <ebiggers@google.com>
>> >
>> > Add support for the Adiantum encryption mode.  Adiantum was designed by
>> > Paul Crowley and is specified by our paper:
>> >
>> >     Adiantum: length-preserving encryption for entry-level processors
>> >     (https://eprint.iacr.org/2018/720.pdf)
>> >
>> > See our paper for full details; this patch only provides an overview.
>> >
>> > Adiantum is a tweakable, length-preserving encryption mode designed for
>> > fast and secure disk encryption, especially on CPUs without dedicated
>> > crypto instructions.  Adiantum encrypts each sector using the XChaCha12
>> > stream cipher, two passes of an ε-almost-∆-universal (εA∆U) hash
>> > function, and an invocation of the AES-256 block cipher on a single
>> > 16-byte block.  On CPUs without AES instructions, Adiantum is much
>> > faster than AES-XTS; for example, on ARM Cortex-A7, on 4096-byte sectors
>> > Adiantum encryption is about 4 times faster than AES-256-XTS encryption,
>> > and decryption about 5 times faster.
>> >
>> > Adiantum is a specialization of the more general HBSH construction.  Our
>> > earlier proposal, HPolyC, was also a HBSH specialization, but it used a
>> > different εA∆U hash function, one based on Poly1305 only.  Adiantum's
>> > εA∆U hash function, which is based primarily on the "NH" hash function
>> > like that used in UMAC (RFC4418), is about twice as fast as HPolyC's;
>> > consequently, Adiantum is about 20% faster than HPolyC.
>> >
>> > This speed comes with no loss of security: Adiantum is provably just as
>> > secure as HPolyC, in fact slightly *more* secure.  Like HPolyC,
>> > Adiantum's security is reducible to that of XChaCha12 and AES-256,
>> > subject to a security bound.  XChaCha12 itself has a security reduction
>> > to ChaCha12.  Therefore, one need not "trust" Adiantum; one need only
>> > trust ChaCha12 and AES-256.  Note that the εA∆U hash function is only
>> > used for its proven combinatorical properties so cannot be "broken".
>> >
>>
>> So what happens if the part of the input covered by the block cipher
>> is identical between different generations of the same disk block
>> (whose sector count is used as the 'outer' IV). How are we not in the
>> same boat as before when using stream ciphers for disk encryption?
>>
>
> This is the point of the hash step.  The value encrypted with the block cipher
> to produce the intermediate value C_M (used as the stream cipher nonce) is
> H(T, P_L) + P_R.  (T is the tweak a.k.a the IV, P_L is the plaintext except the
> last 16 bytes, P_R is the last 16 bytes.)  A collision in this value occurs iff:
>
>         H(T1, P1_L) + P1_R = H(T2, P2_L) + P2_R
> i.e.
>         H(T1, P1_L) - H(T2, P2_L) = P2_R - P1_R
>
> If (T1, P1_L) = (T2, P2_L) then P1_R != P2_R so the equation has no solutions
> (since we don't consider queries where the whole input is the same; those
> unavoidably produce the same ciphertext).  Otherwise (T1, P1_L) != (T2, P2_L),
> and since the hash function H is ε-almost-∆-universal over integers mod 2^128,
> the equation is true for at most a very small proportion 'ε' of hash keys.
> But, the hash key is chosen at random and is unknown to the attacker.
>
> The same applies in the other direction, for chosen ciphertext attacks.
>
> Basically, it's very difficult for an attacker to cause the intermediate value
> C_M to be reused, and the outputs will appear random until they do.
>
> Of course, all this is explained much more precisely and comprehensively in our
> paper.  See section 5, "Security reduction".
>

Thanks for the explanation. I saw that the result of the AES
encryption was used as the XChaCha nonce, but I failed to spot that
the result of the nhpoly1305 pass is added/subtracted to/from that
particular block first.
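
The addition and subtraction referred to here are over integers mod 2^128.
A minimal sketch of that arithmetic in portable C follows, assuming a 128-bit
value is held as two 64-bit halves; the type and function names are
hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/* A 128-bit integer as two 64-bit halves; all arithmetic is modulo 2^128. */
struct u128 {
	uint64_t lo;
	uint64_t hi;
};

/* r = a + b (mod 2^128) */
static struct u128 u128_add(struct u128 a, struct u128 b)
{
	struct u128 r;

	r.lo = a.lo + b.lo;
	r.hi = a.hi + b.hi + (r.lo < a.lo);	/* carry out of the low half */
	return r;
}

/* r = a - b (mod 2^128) */
static struct u128 u128_sub(struct u128 a, struct u128 b)
{
	struct u128 r;

	r.lo = a.lo - b.lo;
	r.hi = a.hi - b.hi - (a.lo < b.lo);	/* borrow from the high half */
	return r;
}

static int u128_eq(struct u128 a, struct u128 b)
{
	return a.lo == b.lo && a.hi == b.hi;
}
```

With h standing in for H(T, P_L) and p for P_R, u128_add(h, p) is the value
that goes into the block cipher, and u128_sub() undoes it on the way back.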

In any case, this looks good to me: as far as I can tell, the code
implements the algorithm as described in the paper, and the plumbing
into the crypto API looks correct to me as well.

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Whether the paper is correct is a different matter: it looks
convincing to me but IANAC.

The only request I have is to add a speed test to tcrypt as well so we
can easily benchmark it.

* Re: [RFC PATCH v2 11/12] crypto: adiantum - add Adiantum support
  2018-10-23 10:40       ` Ard Biesheuvel
@ 2018-10-24 22:06         ` Eric Biggers
  2018-10-30  8:17           ` Herbert Xu
  0 siblings, 1 reply; 54+ messages in thread
From: Eric Biggers @ 2018-10-24 22:06 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fscrypt,
	linux-arm-kernel, Linux Kernel Mailing List, Herbert Xu,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

On Tue, Oct 23, 2018 at 07:40:34AM -0300, Ard Biesheuvel wrote:
> On 20 October 2018 at 15:12, Eric Biggers <ebiggers@kernel.org> wrote:
> > Hi Ard,
> >
> > On Sat, Oct 20, 2018 at 12:17:58PM +0800, Ard Biesheuvel wrote:
> >> On 16 October 2018 at 01:54, Eric Biggers <ebiggers@kernel.org> wrote:
> >> > From: Eric Biggers <ebiggers@google.com>
> >> >
> >> > Add support for the Adiantum encryption mode.  Adiantum was designed by
> >> > Paul Crowley and is specified by our paper:
> >> >
> >> >     Adiantum: length-preserving encryption for entry-level processors
> >> >     (https://eprint.iacr.org/2018/720.pdf)
> >> >
> >> > See our paper for full details; this patch only provides an overview.
> >> >
> >> > Adiantum is a tweakable, length-preserving encryption mode designed for
> >> > fast and secure disk encryption, especially on CPUs without dedicated
> >> > crypto instructions.  Adiantum encrypts each sector using the XChaCha12
> >> > stream cipher, two passes of an ε-almost-∆-universal (εA∆U) hash
> >> > function, and an invocation of the AES-256 block cipher on a single
> >> > 16-byte block.  On CPUs without AES instructions, Adiantum is much
> >> > faster than AES-XTS; for example, on ARM Cortex-A7, on 4096-byte sectors
> >> > Adiantum encryption is about 4 times faster than AES-256-XTS encryption,
> >> > and decryption about 5 times faster.
> >> >
> >> > Adiantum is a specialization of the more general HBSH construction.  Our
> >> > earlier proposal, HPolyC, was also a HBSH specialization, but it used a
> >> > different εA∆U hash function, one based on Poly1305 only.  Adiantum's
> >> > εA∆U hash function, which is based primarily on the "NH" hash function
> >> > like that used in UMAC (RFC4418), is about twice as fast as HPolyC's;
> >> > consequently, Adiantum is about 20% faster than HPolyC.
> >> >
> >> > This speed comes with no loss of security: Adiantum is provably just as
> >> > secure as HPolyC, in fact slightly *more* secure.  Like HPolyC,
> >> > Adiantum's security is reducible to that of XChaCha12 and AES-256,
> >> > subject to a security bound.  XChaCha12 itself has a security reduction
> >> > to ChaCha12.  Therefore, one need not "trust" Adiantum; one need only
> >> > trust ChaCha12 and AES-256.  Note that the εA∆U hash function is only
> >> > used for its proven combinatorical properties so cannot be "broken".
> >> >
> >>
> >> So what happens if the part of the input covered by the block cipher
> >> is identical between different generations of the same disk block
> >> (whose sector count is used as the 'outer' IV). How are we not in the
> >> same boat as before when using stream ciphers for disk encryption?
> >>
> >
> > This is the point of the hash step.  The value encrypted with the block cipher
> > to produce the intermediate value C_M (used as the stream cipher nonce) is
> > H(T, P_L) + P_R.  (T is the tweak a.k.a the IV, P_L is the plaintext except the
> > last 16 bytes, P_R is the last 16 bytes.)  A collision in this value occurs iff:
> >
> >         H(T1, P1_L) + P1_R = H(T2, P2_L) + P2_R
> > i.e.
> >         H(T1, P1_L) - H(T2, P2_L) = P2_R - P1_R
> >
> > If (T1, P1_L) = (T2, P2_L) then P1_R != P2_R so the equation has no solutions
> > (since we don't consider queries where the whole input is the same; those
> > unavoidably produce the same ciphertext).  Otherwise (T1, P1_L) != (T2, P2_L),
> > and since the hash function H is ε-almost-∆-universal over integers mod 2^128,
> > the equation is true for at most a very small proportion 'ε' of hash keys.
> > But, the hash key is chosen at random and is unknown to the attacker.
> >
> > The same applies in the other direction, for chosen ciphertext attacks.
> >
> > Basically, it's very difficult for an attacker to cause the intermediate value
> > C_M to be reused, and the outputs will appear random until they do.
> >
> > Of course, all this is explained much more precisely and comprehensively in our
> > paper.  See section 5, "Security reduction".
> >
> 
> Thanks for the explanation. I saw that the result of the AES
> encryption was used as the XChaCha nonce, but I failed to spot that
> the result of the nhpoly1305 pass is added/subtracted to/from that
> particular block first.
> 
> In any case, this looks good to me: as far as I can tell, the code
> implements the algorithm as described in the paper, and the plumbing
> into the crypto API looks correct to me as well.
> 
> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> 
> Whether the paper is correct is a different matter: it looks
> convincing to me but IANAC.
> 
> The only request I have is to add a speed test to tcrypt as well so we
> can easily benchmark it.

I'll add an Adiantum testing mode to tcrypt, though I have to admit that I
really dislike tcrypt, so I've actually been using a custom patch instead
(https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux.git/commit/?h=cryptobench).

FWIW, I mostly dislike tcrypt's lack of flexibility.  Every benchmark setting
has to be coded explicitly in tcrypt, with a mode number to select it.  It's
often missing what I want, or what I want is included but comes with a bunch of
unwanted tests too.  Also tcrypt has to be built as a loadable module, so it
can't be used with CONFIG_MODULES=n.  And the benchmark results only go directly
to the kernel log, where they aren't easily retrievable from a script.

The patch I've been using just exposes a file /proc/cryptobench (note: it should
maybe be a char device instead) to which you can write commands like:

	algtype=skcipher algname=adiantum(xchacha12,aes) keysize=32 bufsize=4096 niter=1000

Then userspace reads back the result:

	SUCCESS algname=adiantum(xchacha12,aes) driver_name=adiantum(xchacha12-software,aes-aesni,nhpoly1305-generic) measurement=0x30a5abd5776e0af enc_time=44831104 dec_time=38303077

It's then pretty straightforward to wrap this API in a userspace script that
does all the benchmarks you need, and formats the results nicely.
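
As a sketch of what such a wrapper has to do with the result line, the
following C helper pulls one numeric "name=value" field out of it.  The
function name is hypothetical, and the only thing assumed about the format is
what the example above shows: space-separated name=value tokens.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Look up "name=value" in a whitespace-separated result line and parse the
 * value as an unsigned integer (base auto-detected, so "0x..." works too).
 * Returns 0 on success, -1 if the field is absent or is not a number.
 */
static int result_field_u64(const char *line, const char *name,
			    unsigned long long *out)
{
	size_t name_len = strlen(name);
	const char *p = line;

	while (*p) {
		size_t tok_len = strcspn(p, " \t\n");

		if (tok_len > name_len && p[name_len] == '=' &&
		    strncmp(p, name, name_len) == 0) {
			const char *val = p + name_len + 1;
			char *end;
			unsigned long long v = strtoull(val, &end, 0);

			/* The number must span the whole value, no more. */
			if (end == val || end != p + tok_len)
				return -1;
			*out = v;
			return 0;
		}
		p += tok_len;			/* skip this token */
		p += strspn(p, " \t\n");	/* and the separator */
	}
	return -1;
}
```

Applied to the example line, looking up "enc_time" and "dec_time" yields the
two timing values; non-numeric fields such as "algname" are rejected.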

For correctness testing there's also an option 'sgl_fuzz' that randomizes the
scatterlist division.

Currently the patch is missing some features such as AEAD support, but at some
point I'd like to get it into a state where it can be included upstream.

- Eric

* Re: [RFC PATCH v2 11/12] crypto: adiantum - add Adiantum support
  2018-10-24 22:06         ` Eric Biggers
@ 2018-10-30  8:17           ` Herbert Xu
  0 siblings, 0 replies; 54+ messages in thread
From: Herbert Xu @ 2018-10-30  8:17 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Ard Biesheuvel, open list:HARDWARE RANDOM NUMBER GENERATOR CORE,
	linux-fscrypt, linux-arm-kernel, Linux Kernel Mailing List,
	Paul Crowley, Greg Kaiser, Michael Halcrow, Jason A . Donenfeld,
	Samuel Neves, Tomer Ashur

On Wed, Oct 24, 2018 at 03:06:17PM -0700, Eric Biggers wrote:
>
> The patch I've been using just exposes a file /proc/cryptobench (note: it maybe
> should be made into a char device instead) to which you can write commands like:
> 
> 	algtype=skcipher algname=adiantum(xchacha12,aes) keysize=32 bufsize=4096 niter=1000
> 
> Then userspace reads back the result:
> 
> 	SUCCESS algname=adiantum(xchacha12,aes) driver_name=adiantum(xchacha12-software,aes-aesni,nhpoly1305-generic) measurement=0x30a5abd5776e0af enc_time=44831104 dec_time=38303077
> 
> It's then pretty straightforward to wrap this API in a userspace script that
> does all the benchmarks you need, and formats the results nicely.
> 
> For correctness testing there's also an option 'sgl_fuzz' that randomizes the
> scatterlist division.
> 
> Currently the patch is missing some features such as AEAD support, but at some
> point I'd like to get it into a state where it can be included upstream.

I completely agree that a new interface would be much better for
tcrypt.

However, we're trying to avoid using /proc for new APIs.  So perhaps
netlink would be a better choice given the existing configuration and
stats APIs that already exist for it.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: [RFC PATCH v2 00/12] crypto: Adiantum support
  2018-10-20 10:26     ` Milan Broz
  2018-10-20 13:47       ` Jason A. Donenfeld
@ 2018-11-16 21:52       ` Eric Biggers
  2018-11-17 10:29         ` Milan Broz
  1 sibling, 1 reply; 54+ messages in thread
From: Eric Biggers @ 2018-11-16 21:52 UTC (permalink / raw)
  To: Milan Broz
  Cc: Jason A. Donenfeld, Linux Crypto Mailing List, linux-fscrypt,
	linux-arm-kernel, LKML, Herbert Xu, Paul Crowley, Greg Kaiser,
	Michael Halcrow, Samuel Neves, Tomer Ashur

Hi Milan,

On Sat, Oct 20, 2018 at 12:26:20PM +0200, Milan Broz wrote:
> 
> Adiantum (as in your current git branches on kernel.org) can be used for dm-crypt
> without any changes (yes, I played with it :) and with some easy tricks directly
> through cryptsetup/LUKS as well.
> 
> I think we should have this as an alternative to length-preserving wide-block
> cipher modes for FDE.
> 

Yes, dm-crypt can use Adiantum by specifying the cipher as
"capi:adiantum(xchacha12,aes)-plain64".

But, I'm having trouble getting cryptsetup/LUKS to use Adiantum.
Using LUKS1, the following works:

    cryptsetup luksFormat /dev/$partition --cipher='capi:adiantum(xchacha12,aes)-plain64' --key-size 256

However, when possible we'd like people to use 4K sectors for better
performance, which I understand requires using the LUKS2 format along with
cryptsetup v2.0.0+ and Linux v4.12+.  But the following does *not* work:

    cryptsetup luksFormat /dev/$partition --cipher='capi:adiantum(xchacha12,aes)-plain64' --key-size 256 --type luks2 --sector-size 4096

The problem seems to be that when cryptsetup tries to encrypt the keyslot in
luks2_encrypt_to_storage(), it tries to use the algorithm via AF_ALG, but it
incorrectly requests "plain64(capi:adiantum(xchacha12,aes))" which fails.
It should request just "adiantum(xchacha12,aes)".
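
A minimal sketch of the split that is needed here: given the dm-crypt style
string, separate the crypto API algorithm name from the IV mode.  The function
name is hypothetical, and the rule used (strip "capi:", then cut at the last
'-') is an assumption about the syntax; IV options after a ':' are not handled.

```c
#include <string.h>

/*
 * Split "capi:<crypto API name>-<ivmode>" into its two parts.
 * Returns 0 on success, -1 on malformed input or too-small buffers.
 */
static int split_capi_spec(const char *spec, char *alg, size_t alg_size,
			   char *ivmode, size_t ivmode_size)
{
	static const char prefix[] = "capi:";
	const size_t prefix_len = sizeof(prefix) - 1;
	const char *body, *dash;
	size_t alg_len, iv_len;

	if (strncmp(spec, prefix, prefix_len) != 0)
		return -1;
	body = spec + prefix_len;

	/* The IV mode follows the last '-'; the name itself may contain '-'. */
	dash = strrchr(body, '-');
	if (!dash)
		return -1;
	alg_len = (size_t)(dash - body);
	iv_len = strlen(dash + 1);
	if (alg_len == 0 || iv_len == 0 ||
	    alg_len >= alg_size || iv_len >= ivmode_size)
		return -1;

	memcpy(alg, body, alg_len);
	alg[alg_len] = '\0';
	memcpy(ivmode, dash + 1, iv_len + 1);	/* includes the terminator */
	return 0;
}
```

For "capi:adiantum(xchacha12,aes)-plain64" this gives the algorithm name
"adiantum(xchacha12,aes)" and the IV mode "plain64"; only the former is what
would be passed to AF_ALG.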

What are the "easy tricks" you had in mind -- do you mean there's already a way
to use Adiantum with cryptsetup, or do you mean that cryptsetup still needs to
be updated to fully support algorithms using the crypto API syntax?

Thanks,

- Eric

* Re: [RFC PATCH v2 00/12] crypto: Adiantum support
  2018-11-16 21:52       ` Eric Biggers
@ 2018-11-17 10:29         ` Milan Broz
  2018-11-19 19:28           ` Eric Biggers
  0 siblings, 1 reply; 54+ messages in thread
From: Milan Broz @ 2018-11-17 10:29 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Jason A. Donenfeld, Linux Crypto Mailing List, linux-fscrypt,
	linux-arm-kernel, LKML, Herbert Xu, Paul Crowley, Greg Kaiser,
	Michael Halcrow, Samuel Neves, Tomer Ashur

On 16/11/2018 22:52, Eric Biggers wrote:
> Hi Milan,
> 
> On Sat, Oct 20, 2018 at 12:26:20PM +0200, Milan Broz wrote:
>>
>> Adiantum (as in your current git branches on kernel.org) can be used for dm-crypt
>> without any changes (yes, I played with it :) and with some easy tricks directly
>> through cryptsetup/LUKS as well.
>>
>> I think we should have this as an alternative to length-preserving wide-block
>> cipher modes for FDE.
>>
> 
> Yes, dm-crypt can use Adiantum by specifying the cipher as
> "capi:adiantum(xchacha12,aes)-plain64".
> 
> But, I'm having trouble getting cryptsetup/LUKS to use Adiantum.
> Using LUKS1, the following works:
> 
>     cryptsetup luksFormat /dev/$partition --cipher='capi:adiantum(xchacha12,aes)-plain64' --key-size 256
> 
> However, when possible we'd like people to use 4K sectors for better
> performance, which I understand requires using the LUKS2 format along with
> cryptsetup v2.0.0+ and Linux v4.12+.  But the following does *not* work:
> 
>     cryptsetup luksFormat /dev/$partition --cipher='capi:adiantum(xchacha12,aes)-plain64' --key-size 256 --type luks2 --sector-size 4096

Hi Eric,

actually I planned to test it and then reply to these patches with example cryptsetup
commands, but did not have time for it yet.
So thanks for a reminder ;-)

Recent cryptsetup supports --sector-size even for plain devices.

You do not actually need the capi: prefix.  Adiantum is a composition,
so "xchacha20,aes-adiantum-plain64" works as well (and it should work even with old cryptsetup).
(It is ugly, but it should be compatible.)

# cryptsetup open --type plain -c xchacha20,aes-adiantum-plain64 -s 256 --sector-size 4096 /dev/sdb test
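
A sketch of how such a "cipher-chainmode-ivmode" string maps to a crypto API
name, assuming the name is built as chainmode(cipher).  The function name is
hypothetical, and key-count or IV-option suffixes are not handled.

```c
#include <stdio.h>
#include <string.h>

/*
 * Turn "cipher-chainmode[-ivmode]" into "chainmode(cipher)".
 * Returns 0 on success, -1 on malformed input or a too-small buffer.
 */
static int legacy_spec_to_alg(const char *spec, char *alg, size_t alg_size)
{
	const char *d1 = strchr(spec, '-');	/* end of cipher */
	const char *d2;				/* end of chainmode */
	int n;

	if (!d1 || d1 == spec)
		return -1;
	d2 = strchr(d1 + 1, '-');
	if (!d2)
		d2 = d1 + 1 + strlen(d1 + 1);	/* no IV mode given */
	if (d2 == d1 + 1)
		return -1;			/* empty chainmode */

	n = snprintf(alg, alg_size, "%.*s(%.*s)",
		     (int)(d2 - d1 - 1), d1 + 1,	/* chainmode */
		     (int)(d1 - spec), spec);		/* cipher */
	return (n < 0 || (size_t)n >= alg_size) ? -1 : 0;
}
```

Under that assumption "xchacha20,aes-adiantum-plain64" becomes
"adiantum(xchacha20,aes)", in the same way that "aes-xts-plain64" becomes
"xts(aes)".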

For LUKS and benchmark, Adiantum needs a 32-byte IV, and unfortunately we have
this parameter hardcoded...
(I guess there is already a way to get this dynamically from the userspace crypto API now.)
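
To illustrate what the IV size means for the plain64 mode used in the commands
above, here is a minimal sketch of a plain64-style IV: the 64-bit sector number
in little-endian order, zero-padded to the cipher's IV size.  This reflects the
usual description of plain64 and is an assumption rather than code taken from
dm-crypt; the function name is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/*
 * Fill 'iv' (iv_size bytes) with the sector number in little-endian byte
 * order, followed by zero padding.
 */
static void plain64_iv(uint8_t *iv, size_t iv_size, uint64_t sector)
{
	size_t i;

	memset(iv, 0, iv_size);
	for (i = 0; i < 8 && i < iv_size; i++)
		iv[i] = (uint8_t)(sector >> (8 * i));
}
```

With a 32-byte IV, only the first 8 bytes vary from sector to sector and the
remaining 24 bytes stay zero.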

So, a few days ago I added a patch to the devel branch so that benchmark supports Adiantum:
https://gitlab.com/cryptsetup/cryptsetup/commit/bce567db461e558af7d735c694a50146db899709

This allows a trivial benchmark (but it just encrypts one big blob of data):

#  cryptsetup benchmark -c xchacha20,aes-adiantum -s 256
# Tests are approximate using memory only (no storage IO).
#            Algorithm |       Key |      Encryption |      Decryption
xchacha20,aes-adiantum        256b       146.6 MiB/s       148.0 MiB/s
...

# ./cryptsetup benchmark -c xchacha12,aes-adiantum -s 256
xchacha12,aes-adiantum        256b       181.7 MiB/s       184.6 MiB/s

For LUKS2, we need a similar change to the crypto API IV size (unfortunately it does not
fall back to the old keyslot handling, so LUKS2 does not work right now).

I quickly added a workaround that falls back to the default keyslot encryption
in this case:
https://gitlab.com/cryptsetup/cryptsetup/commit/29e87add5aac9d5eb0087881146988d9c4280915

Then you can use LUKS2:
# cryptsetup luksFormat --type luks2 --sector-size 4096 -c xchacha20,aes-adiantum-plain64 -s 256 /dev/sdb

(The example above will encrypt keyslots with AES-XTS and use Adiantum for data only.)

So, unfortunately yes, we need some small changes in cryptsetup for LUKS;
plain mode should work out of the box (with the syntax above).

Milan

* Re: [RFC PATCH v2 00/12] crypto: Adiantum support
  2018-11-17 10:29         ` Milan Broz
@ 2018-11-19 19:28           ` Eric Biggers
  2018-11-19 20:05             ` Milan Broz
  0 siblings, 1 reply; 54+ messages in thread
From: Eric Biggers @ 2018-11-19 19:28 UTC (permalink / raw)
  To: Milan Broz
  Cc: Jason A. Donenfeld, Linux Crypto Mailing List, linux-fscrypt,
	linux-arm-kernel, LKML, Herbert Xu, Paul Crowley, Greg Kaiser,
	Michael Halcrow, Samuel Neves, Tomer Ashur

Hi Milan,

On Sat, Nov 17, 2018 at 11:29:23AM +0100, Milan Broz wrote:
> On 16/11/2018 22:52, Eric Biggers wrote:
> > Hi Milan,
> > 
> > On Sat, Oct 20, 2018 at 12:26:20PM +0200, Milan Broz wrote:
> >>
> >> Adiantum (as in your current git branches on kernel.org) can be used for dm-crypt
> >> without any changes (yes, I played with it :) and with some easy tricks directly
> >> through cryptsetup/LUKS as well.
> >>
> >> I think we should have this as an alternative to length-preserving wide-block
> >> cipher modes for FDE.
> >>
> > 
> > Yes, dm-crypt can use Adiantum by specifying the cipher as
> > "capi:adiantum(xchacha12,aes)-plain64".
> > 
> > But, I'm having trouble getting cryptsetup/LUKS to use Adiantum.
> > Using LUKS1, the following works:
> > 
> >     cryptsetup luksFormat /dev/$partition --cipher='capi:adiantum(xchacha12,aes)-plain64' --key-size 256
> > 
> > However, when possible we'd like people to use 4K sectors for better
> > performance, which I understand requires using the LUKS2 format along with
> > cryptsetup v2.0.0+ and Linux v4.12+.  But the following does *not* work:
> > 
> >     cryptsetup luksFormat /dev/$partition --cipher='capi:adiantum(xchacha12,aes)-plain64' --key-size 256 --type luks2 --sector-size 4096
> 
> Hi Eric,
> 
> actually I planned to test it and then reply to these patches with example cryptsetup
> commands, but did not have time for it yet.
> So thanks for a reminder ;-)
> 
> Recent cryptsetup supports sector-size even for plain device.
> 
> You actually do not need to use capi: prefix, Adiantum is a composition,
> so "xchacha20,aes-adiantum-plain64" works as well (and it should work even for old cryptsetup).
> (It is ugly, but it should be compatible.)

Okay, good to know the "capi:" prefix is not needed.
That makes things slightly easier for us.

> 
> # cryptsetup open --type plain -c xchacha20,aes-adiantum-plain64 -s 256 --sector-size 4096 /dev/sdb test
> 
> For LUKS and benchmark, Adiantum need to use 32 bytes IV. And we have these parameter,
> unfortunately, hardcoded...
> (I guess there is already a way how to get this dynamically from userspace crypto API now.)
> 
> So, I already added patch to devel branch patch for benchmark to support Adiantum few days ago
> https://gitlab.com/cryptsetup/cryptsetup/commit/bce567db461e558af7d735c694a50146db899709
> 
> This allows trivial benchmark (but it just encrypts one big blob of data):
> 
> #  cryptsetup benchmark -c xchacha20,aes-adiantum -s 256
> # Tests are approximate using memory only (no storage IO).
> #            Algorithm |       Key |      Encryption |      Decryption
> xchacha20,aes-adiantum        256b       146.6 MiB/s       148.0 MiB/s
> ...
> 
> # ./cryptsetup benchmark -c xchacha12,aes-adiantum -s 256
> xchacha12,aes-adiantum        256b       181.7 MiB/s       184.6 MiB/s

Note that Adiantum benchmarks on x86 are misleading at the moment, since the
initial kernel patchset doesn't include SSE2 and AVX2 optimized XChaCha and
NHPoly1305.  To start, only C and arm32 NEON implementations are included.
Hence, on x86 Adiantum will appear much slower than it should be.  But I'm
planning to add the x86 and arm64 implementations, so it will get much faster.

> 
> For LUKS2, we need a similar change to cryptoAPI IV size (unfortunately it does not
> fallback to old keyslot handling, so LUKS2 does not work currently now).
> 
> I quickly added a workaround that fallbacks to default keyslot encryption for keyslots
> in this case
> https://gitlab.com/cryptsetup/cryptsetup/commit/29e87add5aac9d5eb0087881146988d9c4280915
> 
> then you can use LUKS2
> # cryptsetup luksFormat --type luks2 --sector-size 4096 -c xchacha20,aes-adiantum-plain64 -s 256 /dev/sdb
> 
> (Example above will encrypt keyslots with AES-XTS and use Aviantum for data only.)
> 
> So, unfortunately yes, we need some small changes in cryptsetup for LUKS;
> plain mode should work out of the box (with the syntax above).

I think that when using AF_ALG, cryptsetup should get the IV size from
/proc/crypto, or else have it hardcoded that "adiantum" uses 32-byte IVs.
(Actually, Adiantum can formally use an IV of any size, but we had to choose a
fixed size for Linux's crypto API.)
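
A minimal sketch of the /proc/crypto option: scan text in that file's
"key : value" format, with entries separated by blank lines, for a given
algorithm name and return its ivsize.  The function name is hypothetical, and
the sketch takes the text as a string so it does not depend on the running
kernel; an algorithm only appears in the real file once it is registered.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Scan text in /proc/crypto format for the first entry whose "name" equals
 * alg_name and return its "ivsize" value, or -1 if there is none.
 */
static int proc_crypto_ivsize(const char *text, const char *alg_name)
{
	int in_match = 0;

	while (*text) {
		size_t len = strcspn(text, "\n");
		const char *colon = memchr(text, ':', len);

		if (len == 0) {
			in_match = 0;		/* blank line ends an entry */
		} else if (colon) {
			size_t key_len = (size_t)(colon - text);
			const char *val = colon + 1;
			size_t val_len;

			/* Trim the padding around the key and the value. */
			while (key_len > 0 && text[key_len - 1] == ' ')
				key_len--;
			while (val < text + len && *val == ' ')
				val++;
			val_len = (size_t)(text + len - val);

			if (key_len == 4 && memcmp(text, "name", 4) == 0)
				in_match = val_len == strlen(alg_name) &&
					   memcmp(val, alg_name, val_len) == 0;
			else if (in_match && key_len == 6 &&
				 memcmp(text, "ivsize", 6) == 0)
				return atoi(val);
		}
		text += len;
		if (*text == '\n')
			text++;
	}
	return -1;
}
```

A caller would read /proc/crypto into a buffer and pass it in together with
"adiantum(xchacha12,aes)"; hardcoding 32 for Adiantum remains the simpler
alternative.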

Getting the IV size via CRYPTO_MSG_GETALG via NETLINK_CRYPTO is also an option,
but that requires the kconfig option CONFIG_CRYPTO_USER which isn't guaranteed
to be enabled even if CONFIG_CRYPTO_USER_API_SKCIPHER is.

Also: why is cryptsetup's default keyslot encryption AES-128-XTS instead of
AES-256-XTS?  People can choose a cipher with a 256-bit key strength such as
AES-256-XTS or Adiantum, so the keyslots should use at least that strength too.

Thanks,

- Eric

* Re: [RFC PATCH v2 00/12] crypto: Adiantum support
  2018-11-19 19:28           ` Eric Biggers
@ 2018-11-19 20:05             ` Milan Broz
  2018-11-19 20:30               ` Jason A. Donenfeld
  0 siblings, 1 reply; 54+ messages in thread
From: Milan Broz @ 2018-11-19 20:05 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Jason A. Donenfeld, Linux Crypto Mailing List, linux-fscrypt,
	linux-arm-kernel, LKML, Herbert Xu, Paul Crowley, Greg Kaiser,
	Michael Halcrow, Samuel Neves, Tomer Ashur

Hi,

On 19/11/2018 20:28, Eric Biggers wrote:
> Note that Adiantum benchmarks on x86 are misleading at the moment, since the
> initial kernel patchset doesn't include SSE2 and AVX2 optimized XChaCha and
> NHPoly1305.  To start, only C and arm32 NEON implementations are included.
> Hence, on x86 Adiantum will appear much slower than it should be.  But I'm
> planning to add the x86 and arm64 implementations, so it will get much faster.

The posted benchmark was just an example (it was a 32-bit virtual machine on my
old laptop, so the numbers are misleading).

If Adiantum is going to be merged, I expect it can be used as an alternative
even on x86, so I expect more optimizations.

...
> I think that when using AF_ALG, cryptsetup should get the IV size from
> /proc/crypto, or else have it hardcoded that "adiantum" uses 32-byte IVs.
> (Actually Adiantum can formally can use any size IV, but we had to choose a
> fixed size for Linux's crypto API.)

I do not want to parse /proc/crypto (the module needs to be loaded first anyway),
and a proper API did not exist yet when I wrote this code (I think we were the first
real user of the userspace crypto API...).

> Getting the IV size via CRYPTO_MSG_GETALG via NETLINK_CRYPTO is also an option,
> but that requires the kconfig option CONFIG_CRYPTO_USER which isn't guaranteed
> to be enabled even if CONFIG_CRYPTO_USER_API_SKCIPHER is.

Yes. For now, I hardcode Adiantum IV size in cryptsetup and later we will try to
find a more generic way.

> Also: why is cryptsetup's default keyslot encryption AES-128-XTS instead of
> AES-256-XTS?  People can choose a cipher with a 256-bit key strength such as
> AES-256-XTS or Adiantum, so the keyslots should use at least that strength too.

It was inherited from the 256-bit default key size (so 2x AES-128 in XTS).
It is still the default for LUKS1, but we should perhaps double the key size
for XTS mode (at least for fallback keyslot encryption).
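
The arithmetic behind this, as a tiny sketch: XTS splits the key it is given
into two equal halves, one for the data cipher and one for the tweak cipher.
The function name is hypothetical.

```c
#include <stddef.h>

/*
 * Size in bits of each underlying cipher key when XTS is given a key of
 * xts_key_bits in total (one half for data, one half for the tweak).
 */
static size_t xts_cipher_key_bits(size_t xts_key_bits)
{
	return xts_key_bits / 2;
}
```

So a 256-bit XTS key means AES-128 underneath, and AES-256-XTS needs a
512-bit key.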

Anyway, we will release cryptsetup 2.0.6 very soon to fix one problem
in LUKS2, so I'll add the Adiantum IV size there as well so people can play with it.

Thanks,
Milan

p.s.
Reading the discussion about Zinc/Adiantum - I would perhaps prefer to merge
Adiantum first (if it is ready).
It is a new feature, I see it as a useful cipher alternative for dm-crypt, and it can be
easily backported without Zinc to older kernels (I am actually testing it this way).

* Re: [RFC PATCH v2 00/12] crypto: Adiantum support
  2018-11-19 20:05             ` Milan Broz
@ 2018-11-19 20:30               ` Jason A. Donenfeld
  0 siblings, 0 replies; 54+ messages in thread
From: Jason A. Donenfeld @ 2018-11-19 20:30 UTC (permalink / raw)
  To: Milan Broz
  Cc: Eric Biggers, Linux Crypto Mailing List, linux-fscrypt,
	linux-arm-kernel, LKML, Herbert Xu, Paul Crowley, Greg Kaiser,
	Michael Halcrow, Samuel Neves, Tomer Ashur

On Mon, Nov 19, 2018 at 9:05 PM Milan Broz <gmazyland@gmail.com> wrote:
> p.s.
> Reading the discussion about Zinc/Adiantum - I would perhaps prefer to merge
> Adiantum first (if it is ready).
> It is a new feature, I see it as useful cipher alternative for dm-crypt and it can be
> esily backported without Zinc to older kernels (I am testing it actually this way).

Seems reasonable to me.

Jason

Thread overview: 54+ messages
2018-10-15 17:54 [RFC PATCH v2 00/12] crypto: Adiantum support Eric Biggers
2018-10-15 17:54 ` [RFC PATCH v2 01/12] crypto: chacha20-generic - add HChaCha20 library function Eric Biggers
2018-10-19 14:13   ` Ard Biesheuvel
2018-10-15 17:54 ` [RFC PATCH v2 02/12] crypto: chacha20-generic - add XChaCha20 support Eric Biggers
2018-10-19 14:24   ` Ard Biesheuvel
2018-10-15 17:54 ` [RFC PATCH v2 03/12] crypto: chacha20-generic - refactor to allow varying number of rounds Eric Biggers
2018-10-19 14:25   ` Ard Biesheuvel
2018-10-15 17:54 ` [RFC PATCH v2 04/12] crypto: chacha - add XChaCha12 support Eric Biggers
2018-10-19 14:34   ` Ard Biesheuvel
2018-10-19 18:28     ` Eric Biggers
2018-10-15 17:54 ` [RFC PATCH v2 05/12] crypto: arm/chacha20 - add XChaCha20 support Eric Biggers
2018-10-20  2:29   ` Ard Biesheuvel
2018-10-15 17:54 ` [RFC PATCH v2 06/12] crypto: arm/chacha20 - refactor to allow varying number of rounds Eric Biggers
2018-10-20  3:35   ` Ard Biesheuvel
2018-10-20  5:26     ` Eric Biggers
2018-10-15 17:54 ` [RFC PATCH v2 07/12] crypto: arm/chacha - add XChaCha12 support Eric Biggers
2018-10-20  3:36   ` Ard Biesheuvel
2018-10-15 17:54 ` [RFC PATCH v2 08/12] crypto: poly1305 - add Poly1305 core API Eric Biggers
2018-10-20  3:45   ` Ard Biesheuvel
2018-10-15 17:54 ` [RFC PATCH v2 09/12] crypto: nhpoly1305 - add NHPoly1305 support Eric Biggers
2018-10-20  4:00   ` Ard Biesheuvel
2018-10-20  5:38     ` Eric Biggers
2018-10-20 15:06       ` Ard Biesheuvel
2018-10-22 18:42         ` Eric Biggers
2018-10-22 22:25           ` Ard Biesheuvel
2018-10-22 22:40             ` Eric Biggers
2018-10-22 22:43               ` Ard Biesheuvel
2018-10-15 17:54 ` [RFC PATCH v2 10/12] crypto: arm/nhpoly1305 - add NEON-accelerated NHPoly1305 Eric Biggers
2018-10-20  4:12   ` Ard Biesheuvel
2018-10-20  5:51     ` Eric Biggers
2018-10-20 15:00       ` Ard Biesheuvel
2018-10-15 17:54 ` [RFC PATCH v2 11/12] crypto: adiantum - add Adiantum support Eric Biggers
2018-10-20  4:17   ` Ard Biesheuvel
2018-10-20  7:12     ` Eric Biggers
2018-10-23 10:40       ` Ard Biesheuvel
2018-10-24 22:06         ` Eric Biggers
2018-10-30  8:17           ` Herbert Xu
2018-10-15 17:54 ` [RFC PATCH v2 12/12] fscrypt: " Eric Biggers
2018-10-19 15:58 ` [RFC PATCH v2 00/12] crypto: " Jason A. Donenfeld
2018-10-19 18:19   ` Paul Crowley
2018-10-20  3:24     ` Ard Biesheuvel
2018-10-20  5:22       ` Eric Biggers
     [not found]     ` <2395454e-a0dc-408f-4138-9d15ab5f20b8@esat.kuleuven.be>
2018-10-22 11:20       ` Tomer Ashur
2018-10-19 19:04   ` Eric Biggers
2018-10-20 10:26     ` Milan Broz
2018-10-20 13:47       ` Jason A. Donenfeld
2018-11-16 21:52       ` Eric Biggers
2018-11-17 10:29         ` Milan Broz
2018-11-19 19:28           ` Eric Biggers
2018-11-19 20:05             ` Milan Broz
2018-11-19 20:30               ` Jason A. Donenfeld
2018-10-21 22:23     ` Eric Biggers
2018-10-21 22:51       ` Jason A. Donenfeld
2018-10-22 17:17         ` Paul Crowley
