linux-crypto.vger.kernel.org archive mirror
* [PATCH v4 0/4] crypto: lrw - Fixes and improvements
@ 2018-09-13  8:51 Ondrej Mosnacek
  2018-09-13  8:51 ` [PATCH v4 1/4] crypto: lrw - Fix out-of bounds access on counter overflow Ondrej Mosnacek
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Ondrej Mosnacek @ 2018-09-13  8:51 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Eric Biggers, dm-devel, Mikulas Patocka, linux-crypto, Ondrej Mosnacek

This patchset contains a corner-case fix and several improvements for
the LRW template.

The first patch fixes an out-of-bounds array access (and subsequently
incorrect cipher output) when the LRW counter goes from all ones to all
zeros. This patch should be applied to the crypto-2.6 tree and also go
to stable.

The second patch adds a test vector for lrw(aes) that covers the above
bug.

The third patch is a small optimization of the LRW tweak computation.

The fourth patch is a follow-up to a similar patch for XTS (it
simplifies away the use of dynamically allocated auxiliary buffer to
cache the computed tweak values):
https://patchwork.kernel.org/patch/10588775/

Patches 2-4 should be applied only to cryptodev-2.6, but they all depend
on the first patch.

Changes in v4:
  - applied various corrections/suggestions from Eric Biggers
  - added a fix for buggy behavior on counter wrap-around (+ test vector)

v3: https://www.spinics.net/lists/linux-crypto/msg34946.html
Changes in v3:
  - fix a copy-paste error

v2: https://www.spinics.net/lists/linux-crypto/msg34890.html
Changes in v2:
  - small cleanup suggested by Eric Biggers

v1: https://www.spinics.net/lists/linux-crypto/msg34871.html

Ondrej Mosnacek (4):
  crypto: lrw - Fix out-of bounds access on counter overflow
  crypto: testmgr - Add test for LRW counter wrap-around
  crypto: lrw - Optimize tweak computation
  crypto: lrw - Do not use auxiliary buffer

 crypto/lrw.c     | 342 +++++++++++++----------------------------------
 crypto/testmgr.h |  21 +++
 2 files changed, 112 insertions(+), 251 deletions(-)

-- 
2.17.1

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v4 1/4] crypto: lrw - Fix out-of bounds access on counter overflow
  2018-09-13  8:51 [PATCH v4 0/4] crypto: lrw - Fixes and improvements Ondrej Mosnacek
@ 2018-09-13  8:51 ` Ondrej Mosnacek
  2018-09-13  8:51 ` [PATCH v4 2/4] crypto: testmgr - Add test for LRW counter wrap-around Ondrej Mosnacek
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Ondrej Mosnacek @ 2018-09-13  8:51 UTC (permalink / raw)
  To: Herbert Xu
  Cc: linux-crypto, dm-devel, Mikulas Patocka, Eric Biggers,
	Ondrej Mosnacek, stable

When the LRW block counter overflows, the current implementation returns
128 as the index to the precomputed multiplication table, which has 128
entries. This patch fixes it to return the correct value (127).

Fixes: 64470f1b8510 ("[CRYPTO] lrw: Liskov Rivest Wagner, a tweakable narrow block cipher mode")
Cc: <stable@vger.kernel.org> # 2.6.20+
Reported-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
---
 crypto/lrw.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/crypto/lrw.c b/crypto/lrw.c
index 393a782679c7..5504d1325a56 100644
--- a/crypto/lrw.c
+++ b/crypto/lrw.c
@@ -143,7 +143,12 @@ static inline int get_index128(be128 *block)
 		return x + ffz(val);
 	}
 
-	return x;
+	/*
+	 * If we get here, then x == 128 and we are incrementing the counter
+	 * from all ones to all zeros. This means we must return index 127, i.e.
+	 * the one corresponding to key2*{ 1,...,1 }.
+	 */
+	return 127;
 }
 
 static int post_crypt(struct skcipher_request *req)
-- 
2.17.1


* [PATCH v4 2/4] crypto: testmgr - Add test for LRW counter wrap-around
  2018-09-13  8:51 [PATCH v4 0/4] crypto: lrw - Fixes and improvements Ondrej Mosnacek
  2018-09-13  8:51 ` [PATCH v4 1/4] crypto: lrw - Fix out-of bounds access on counter overflow Ondrej Mosnacek
@ 2018-09-13  8:51 ` Ondrej Mosnacek
  2018-09-13  8:51 ` [PATCH v4 3/4] crypto: lrw - Optimize tweak computation Ondrej Mosnacek
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Ondrej Mosnacek @ 2018-09-13  8:51 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Eric Biggers, dm-devel, Mikulas Patocka, linux-crypto, Ondrej Mosnacek

This patch adds a test vector for lrw(aes) that triggers wrap-around of
the counter, which is a tricky corner case.

Suggested-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
---
 crypto/testmgr.h | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/crypto/testmgr.h b/crypto/testmgr.h
index 0b3d7cadbe93..47629cb1efd3 100644
--- a/crypto/testmgr.h
+++ b/crypto/testmgr.h
@@ -13145,6 +13145,27 @@ static const struct cipher_testvec aes_lrw_tv_template[] = {
 		.ctext	= "\x5b\x90\x8e\xc1\xab\xdd\x67\x5f"
 			  "\x3d\x69\x8a\x95\x53\xc8\x9c\xe5",
 		.len	= 16,
+	}, { /* Test counter wrap-around, modified from LRW-32-AES 1 */
+		.key    = "\x45\x62\xac\x25\xf8\x28\x17\x6d"
+			  "\x4c\x26\x84\x14\xb5\x68\x01\x85"
+			  "\x25\x8e\x2a\x05\xe7\x3e\x9d\x03"
+			  "\xee\x5a\x83\x0c\xcc\x09\x4c\x87",
+		.klen   = 32,
+		.iv     = "\xff\xff\xff\xff\xff\xff\xff\xff"
+			  "\xff\xff\xff\xff\xff\xff\xff\xff",
+		.ptext	= "\x30\x31\x32\x33\x34\x35\x36\x37"
+			  "\x38\x39\x41\x42\x43\x44\x45\x46"
+			  "\x30\x31\x32\x33\x34\x35\x36\x37"
+			  "\x38\x39\x41\x42\x43\x44\x45\x46"
+			  "\x30\x31\x32\x33\x34\x35\x36\x37"
+			  "\x38\x39\x41\x42\x43\x44\x45\x46",
+		.ctext	= "\x47\x90\x50\xf6\xf4\x8d\x5c\x7f"
+			  "\x84\xc7\x83\x95\x2d\xa2\x02\xc0"
+			  "\xda\x7f\xa3\xc0\x88\x2a\x0a\x50"
+			  "\xfb\xc1\x78\x03\x39\xfe\x1d\xe5"
+			  "\xf1\xb2\x73\xcd\x65\xa3\xdf\x5f"
+			  "\xe9\x5d\x48\x92\x54\x63\x4e\xb8",
+		.len	= 48,
 	}, {
 /* http://www.mail-archive.com/stds-p1619@listserv.ieee.org/msg00173.html */
 		.key    = "\xf8\xd4\x76\xff\xd6\x46\xee\x6c"
-- 
2.17.1


* [PATCH v4 3/4] crypto: lrw - Optimize tweak computation
  2018-09-13  8:51 [PATCH v4 0/4] crypto: lrw - Fixes and improvements Ondrej Mosnacek
  2018-09-13  8:51 ` [PATCH v4 1/4] crypto: lrw - Fix out-of bounds access on counter overflow Ondrej Mosnacek
  2018-09-13  8:51 ` [PATCH v4 2/4] crypto: testmgr - Add test for LRW counter wrap-around Ondrej Mosnacek
@ 2018-09-13  8:51 ` Ondrej Mosnacek
  2018-09-13  8:51 ` [PATCH v4 4/4] crypto: lrw - Do not use auxiliary buffer Ondrej Mosnacek
  2018-09-21  5:45 ` [PATCH v4 0/4] crypto: lrw - Fixes and improvements Herbert Xu
  4 siblings, 0 replies; 8+ messages in thread
From: Ondrej Mosnacek @ 2018-09-13  8:51 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Eric Biggers, dm-devel, Mikulas Patocka, linux-crypto, Ondrej Mosnacek

This patch rewrites the tweak computation to a slightly simpler method
that performs fewer bswaps. Performance measurements suggest that the
new code is slightly faster than the old one.

PERFORMANCE MEASUREMENTS (x86_64)
Performed using: https://gitlab.com/omos/linux-crypto-bench
Crypto driver used: lrw(ecb-aes-aesni)

Before:
       ALGORITHM KEY (b)        DATA (B)   TIME ENC (ns)   TIME DEC (ns)
        lrw(aes)     256              64             204             286
        lrw(aes)     320              64             227             203
        lrw(aes)     384              64             208             204
        lrw(aes)     256             512             441             439
        lrw(aes)     320             512             456             455
        lrw(aes)     384             512             469             483
        lrw(aes)     256            4096            2136            2190
        lrw(aes)     320            4096            2161            2213
        lrw(aes)     384            4096            2295            2369
        lrw(aes)     256           16384            7692            7868
        lrw(aes)     320           16384            8230            8691
        lrw(aes)     384           16384            8971            8813
        lrw(aes)     256           32768           15336           15560
        lrw(aes)     320           32768           16410           16346
        lrw(aes)     384           32768           18023           17465

After:
       ALGORITHM KEY (b)        DATA (B)   TIME ENC (ns)   TIME DEC (ns)
        lrw(aes)     256              64             200             203
        lrw(aes)     320              64             202             204
        lrw(aes)     384              64             204             205
        lrw(aes)     256             512             415             415
        lrw(aes)     320             512             432             440
        lrw(aes)     384             512             449             451
        lrw(aes)     256            4096            1838            1995
        lrw(aes)     320            4096            2123            1980
        lrw(aes)     384            4096            2100            2119
        lrw(aes)     256           16384            7183            6954
        lrw(aes)     320           16384            7844            7631
        lrw(aes)     384           16384            8256            8126
        lrw(aes)     256           32768           14772           14484
        lrw(aes)     320           32768           15281           15431
        lrw(aes)     384           32768           16469           16293

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
---
 crypto/lrw.c | 61 +++++++++++++++++++++++++++++++---------------------
 1 file changed, 37 insertions(+), 24 deletions(-)

diff --git a/crypto/lrw.c b/crypto/lrw.c
index 5504d1325a56..7377b5b486fd 100644
--- a/crypto/lrw.c
+++ b/crypto/lrw.c
@@ -120,27 +120,28 @@ static int setkey(struct crypto_skcipher *parent, const u8 *key,
 	return 0;
 }
 
-static inline void inc(be128 *iv)
-{
-	be64_add_cpu(&iv->b, 1);
-	if (!iv->b)
-		be64_add_cpu(&iv->a, 1);
-}
-
-/* this returns the number of consequative 1 bits starting
- * from the right, get_index128(00 00 00 00 00 00 ... 00 00 10 FB) = 2 */
-static inline int get_index128(be128 *block)
+/*
+ * Returns the number of trailing '1' bits in the words of the counter, which is
+ * represented by 4 32-bit words, arranged from least to most significant.
+ * At the same time, increments the counter by one.
+ *
+ * For example:
+ *
+ * u32 counter[4] = { 0xFFFFFFFF, 0x1, 0x0, 0x0 };
+ * int i = next_index(&counter);
+ * // i == 33, counter == { 0x0, 0x2, 0x0, 0x0 }
+ */
+static int next_index(u32 *counter)
 {
-	int x;
-	__be32 *p = (__be32 *) block;
+	int i, res = 0;
 
-	for (p += 3, x = 0; x < 128; p--, x += 32) {
-		u32 val = be32_to_cpup(p);
-
-		if (!~val)
-			continue;
-
-		return x + ffz(val);
+	for (i = 0; i < 4; i++) {
+		if (counter[i] + 1 != 0) {
+			res += ffz(counter[i]++);
+			break;
+		}
+		counter[i] = 0;
+		res += 32;
 	}
 
 	/*
@@ -214,8 +215,9 @@ static int pre_crypt(struct skcipher_request *req)
 	struct scatterlist *sg;
 	unsigned cryptlen;
 	unsigned offset;
-	be128 *iv;
 	bool more;
+	__be32 *iv;
+	u32 counter[4];
 	int err;
 
 	subreq = &rctx->subreq;
@@ -230,7 +232,12 @@ static int pre_crypt(struct skcipher_request *req)
 				   cryptlen, req->iv);
 
 	err = skcipher_walk_virt(&w, subreq, false);
-	iv = w.iv;
+	iv = (__be32 *)w.iv;
+
+	counter[0] = be32_to_cpu(iv[3]);
+	counter[1] = be32_to_cpu(iv[2]);
+	counter[2] = be32_to_cpu(iv[1]);
+	counter[3] = be32_to_cpu(iv[0]);
 
 	while (w.nbytes) {
 		unsigned int avail = w.nbytes;
@@ -247,10 +254,16 @@ static int pre_crypt(struct skcipher_request *req)
 			/* T <- I*Key2, using the optimization
 			 * discussed in the specification */
 			be128_xor(&rctx->t, &rctx->t,
-				  &ctx->mulinc[get_index128(iv)]);
-			inc(iv);
+				  &ctx->mulinc[next_index(counter)]);
 		} while ((avail -= bs) >= bs);
 
+		if (w.nbytes == w.total) {
+			iv[0] = cpu_to_be32(counter[3]);
+			iv[1] = cpu_to_be32(counter[2]);
+			iv[2] = cpu_to_be32(counter[1]);
+			iv[3] = cpu_to_be32(counter[0]);
+		}
+
 		err = skcipher_walk_done(&w, avail);
 	}
 
@@ -548,7 +561,7 @@ static int create(struct crypto_template *tmpl, struct rtattr **tb)
 	inst->alg.base.cra_priority = alg->base.cra_priority;
 	inst->alg.base.cra_blocksize = LRW_BLOCK_SIZE;
 	inst->alg.base.cra_alignmask = alg->base.cra_alignmask |
-				       (__alignof__(u64) - 1);
+				       (__alignof__(__be32) - 1);
 
 	inst->alg.ivsize = LRW_BLOCK_SIZE;
 	inst->alg.min_keysize = crypto_skcipher_alg_min_keysize(alg) +
-- 
2.17.1


* [PATCH v4 4/4] crypto: lrw - Do not use auxiliary buffer
  2018-09-13  8:51 [PATCH v4 0/4] crypto: lrw - Fixes and improvements Ondrej Mosnacek
                   ` (2 preceding siblings ...)
  2018-09-13  8:51 ` [PATCH v4 3/4] crypto: lrw - Optimize tweak computation Ondrej Mosnacek
@ 2018-09-13  8:51 ` Ondrej Mosnacek
  2018-09-21  5:45 ` [PATCH v4 0/4] crypto: lrw - Fixes and improvements Herbert Xu
  4 siblings, 0 replies; 8+ messages in thread
From: Ondrej Mosnacek @ 2018-09-13  8:51 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Eric Biggers, dm-devel, Mikulas Patocka, linux-crypto, Ondrej Mosnacek

This patch simplifies the LRW template to recompute the LRW tweaks from
scratch in the second pass and thus also removes the need to allocate a
dynamic buffer using kmalloc().

As discussed at [1], the use of kmalloc causes deadlocks with dm-crypt.

PERFORMANCE MEASUREMENTS (x86_64)
Performed using: https://gitlab.com/omos/linux-crypto-bench
Crypto driver used: lrw(ecb-aes-aesni)

The results show that the new code has about the same performance as the
old code. For 512-byte messages it even seems to be slightly faster, but
that might just be noise.

Before:
       ALGORITHM KEY (b)        DATA (B)   TIME ENC (ns)   TIME DEC (ns)
        lrw(aes)     256              64             200             203
        lrw(aes)     320              64             202             204
        lrw(aes)     384              64             204             205
        lrw(aes)     256             512             415             415
        lrw(aes)     320             512             432             440
        lrw(aes)     384             512             449             451
        lrw(aes)     256            4096            1838            1995
        lrw(aes)     320            4096            2123            1980
        lrw(aes)     384            4096            2100            2119
        lrw(aes)     256           16384            7183            6954
        lrw(aes)     320           16384            7844            7631
        lrw(aes)     384           16384            8256            8126
        lrw(aes)     256           32768           14772           14484
        lrw(aes)     320           32768           15281           15431
        lrw(aes)     384           32768           16469           16293

After:
       ALGORITHM KEY (b)        DATA (B)   TIME ENC (ns)   TIME DEC (ns)
        lrw(aes)     256              64             197             196
        lrw(aes)     320              64             200             197
        lrw(aes)     384              64             203             199
        lrw(aes)     256             512             385             380
        lrw(aes)     320             512             401             395
        lrw(aes)     384             512             415             415
        lrw(aes)     256            4096            1869            1846
        lrw(aes)     320            4096            2080            1981
        lrw(aes)     384            4096            2160            2109
        lrw(aes)     256           16384            7077            7127
        lrw(aes)     320           16384            7807            7766
        lrw(aes)     384           16384            8108            8357
        lrw(aes)     256           32768           14111           14454
        lrw(aes)     320           32768           15268           15082
        lrw(aes)     384           32768           16581           16250

[1] https://lkml.org/lkml/2018/8/23/1315

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
---
 crypto/lrw.c | 280 ++++++++++-----------------------------------------
 1 file changed, 51 insertions(+), 229 deletions(-)

diff --git a/crypto/lrw.c b/crypto/lrw.c
index 7377b5b486fd..6fcf0d431185 100644
--- a/crypto/lrw.c
+++ b/crypto/lrw.c
@@ -29,8 +29,6 @@
 #include <crypto/b128ops.h>
 #include <crypto/gf128mul.h>
 
-#define LRW_BUFFER_SIZE 128u
-
 #define LRW_BLOCK_SIZE 16
 
 struct priv {
@@ -56,19 +54,7 @@ struct priv {
 };
 
 struct rctx {
-	be128 buf[LRW_BUFFER_SIZE / sizeof(be128)];
-
 	be128 t;
-
-	be128 *ext;
-
-	struct scatterlist srcbuf[2];
-	struct scatterlist dstbuf[2];
-	struct scatterlist *src;
-	struct scatterlist *dst;
-
-	unsigned int left;
-
 	struct skcipher_request subreq;
 };
 
@@ -152,86 +138,31 @@ static int next_index(u32 *counter)
 	return 127;
 }
 
-static int post_crypt(struct skcipher_request *req)
+/*
+ * We compute the tweak masks twice (both before and after the ECB encryption or
+ * decryption) to avoid having to allocate a temporary buffer and/or make
+ * multiple calls to the 'ecb(..)' instance, which usually would be slower than
+ * just doing the next_index() calls again.
+ */
+static int xor_tweak(struct skcipher_request *req, bool second_pass)
 {
-	struct rctx *rctx = skcipher_request_ctx(req);
-	be128 *buf = rctx->ext ?: rctx->buf;
-	struct skcipher_request *subreq;
 	const int bs = LRW_BLOCK_SIZE;
-	struct skcipher_walk w;
-	struct scatterlist *sg;
-	unsigned offset;
-	int err;
-
-	subreq = &rctx->subreq;
-	err = skcipher_walk_virt(&w, subreq, false);
-
-	while (w.nbytes) {
-		unsigned int avail = w.nbytes;
-		be128 *wdst;
-
-		wdst = w.dst.virt.addr;
-
-		do {
-			be128_xor(wdst, buf++, wdst);
-			wdst++;
-		} while ((avail -= bs) >= bs);
-
-		err = skcipher_walk_done(&w, avail);
-	}
-
-	rctx->left -= subreq->cryptlen;
-
-	if (err || !rctx->left)
-		goto out;
-
-	rctx->dst = rctx->dstbuf;
-
-	scatterwalk_done(&w.out, 0, 1);
-	sg = w.out.sg;
-	offset = w.out.offset;
-
-	if (rctx->dst != sg) {
-		rctx->dst[0] = *sg;
-		sg_unmark_end(rctx->dst);
-		scatterwalk_crypto_chain(rctx->dst, sg_next(sg), 2);
-	}
-	rctx->dst[0].length -= offset - sg->offset;
-	rctx->dst[0].offset = offset;
-
-out:
-	return err;
-}
-
-static int pre_crypt(struct skcipher_request *req)
-{
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
-	struct rctx *rctx = skcipher_request_ctx(req);
 	struct priv *ctx = crypto_skcipher_ctx(tfm);
-	be128 *buf = rctx->ext ?: rctx->buf;
-	struct skcipher_request *subreq;
-	const int bs = LRW_BLOCK_SIZE;
+	struct rctx *rctx = skcipher_request_ctx(req);
+	be128 t = rctx->t;
 	struct skcipher_walk w;
-	struct scatterlist *sg;
-	unsigned cryptlen;
-	unsigned offset;
-	bool more;
 	__be32 *iv;
 	u32 counter[4];
 	int err;
 
-	subreq = &rctx->subreq;
-	skcipher_request_set_tfm(subreq, tfm);
-
-	cryptlen = subreq->cryptlen;
-	more = rctx->left > cryptlen;
-	if (!more)
-		cryptlen = rctx->left;
-
-	skcipher_request_set_crypt(subreq, rctx->src, rctx->dst,
-				   cryptlen, req->iv);
+	if (second_pass) {
+		req = &rctx->subreq;
+		/* set to our TFM to enforce correct alignment: */
+		skcipher_request_set_tfm(req, tfm);
+	}
 
-	err = skcipher_walk_virt(&w, subreq, false);
+	err = skcipher_walk_virt(&w, req, false);
 	iv = (__be32 *)w.iv;
 
 	counter[0] = be32_to_cpu(iv[3]);
@@ -248,16 +179,14 @@ static int pre_crypt(struct skcipher_request *req)
 		wdst = w.dst.virt.addr;
 
 		do {
-			*buf++ = rctx->t;
-			be128_xor(wdst++, &rctx->t, wsrc++);
+			be128_xor(wdst++, &t, wsrc++);
 
 			/* T <- I*Key2, using the optimization
 			 * discussed in the specification */
-			be128_xor(&rctx->t, &rctx->t,
-				  &ctx->mulinc[next_index(counter)]);
+			be128_xor(&t, &t, &ctx->mulinc[next_index(counter)]);
 		} while ((avail -= bs) >= bs);
 
-		if (w.nbytes == w.total) {
+		if (second_pass && w.nbytes == w.total) {
 			iv[0] = cpu_to_be32(counter[3]);
 			iv[1] = cpu_to_be32(counter[2]);
 			iv[2] = cpu_to_be32(counter[1]);
@@ -267,175 +196,68 @@ static int pre_crypt(struct skcipher_request *req)
 		err = skcipher_walk_done(&w, avail);
 	}
 
-	skcipher_request_set_tfm(subreq, ctx->child);
-	skcipher_request_set_crypt(subreq, rctx->dst, rctx->dst,
-				   cryptlen, NULL);
-
-	if (err || !more)
-		goto out;
-
-	rctx->src = rctx->srcbuf;
-
-	scatterwalk_done(&w.in, 0, 1);
-	sg = w.in.sg;
-	offset = w.in.offset;
-
-	if (rctx->src != sg) {
-		rctx->src[0] = *sg;
-		sg_unmark_end(rctx->src);
-		scatterwalk_crypto_chain(rctx->src, sg_next(sg), 2);
-	}
-	rctx->src[0].length -= offset - sg->offset;
-	rctx->src[0].offset = offset;
-
-out:
 	return err;
 }
 
-static int init_crypt(struct skcipher_request *req, crypto_completion_t done)
+static int xor_tweak_pre(struct skcipher_request *req)
 {
-	struct priv *ctx = crypto_skcipher_ctx(crypto_skcipher_reqtfm(req));
-	struct rctx *rctx = skcipher_request_ctx(req);
-	struct skcipher_request *subreq;
-	gfp_t gfp;
-
-	subreq = &rctx->subreq;
-	skcipher_request_set_callback(subreq, req->base.flags, done, req);
-
-	gfp = req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP ? GFP_KERNEL :
-							   GFP_ATOMIC;
-	rctx->ext = NULL;
-
-	subreq->cryptlen = LRW_BUFFER_SIZE;
-	if (req->cryptlen > LRW_BUFFER_SIZE) {
-		unsigned int n = min(req->cryptlen, (unsigned int)PAGE_SIZE);
-
-		rctx->ext = kmalloc(n, gfp);
-		if (rctx->ext)
-			subreq->cryptlen = n;
-	}
-
-	rctx->src = req->src;
-	rctx->dst = req->dst;
-	rctx->left = req->cryptlen;
-
-	/* calculate first value of T */
-	memcpy(&rctx->t, req->iv, sizeof(rctx->t));
-
-	/* T <- I*Key2 */
-	gf128mul_64k_bbe(&rctx->t, ctx->table);
-
-	return 0;
+	return xor_tweak(req, false);
 }
 
-static void exit_crypt(struct skcipher_request *req)
+static int xor_tweak_post(struct skcipher_request *req)
 {
-	struct rctx *rctx = skcipher_request_ctx(req);
-
-	rctx->left = 0;
-
-	if (rctx->ext)
-		kzfree(rctx->ext);
+	return xor_tweak(req, true);
 }
 
-static int do_encrypt(struct skcipher_request *req, int err)
-{
-	struct rctx *rctx = skcipher_request_ctx(req);
-	struct skcipher_request *subreq;
-
-	subreq = &rctx->subreq;
-
-	while (!err && rctx->left) {
-		err = pre_crypt(req) ?:
-		      crypto_skcipher_encrypt(subreq) ?:
-		      post_crypt(req);
-
-		if (err == -EINPROGRESS || err == -EBUSY)
-			return err;
-	}
-
-	exit_crypt(req);
-	return err;
-}
-
-static void encrypt_done(struct crypto_async_request *areq, int err)
+static void crypt_done(struct crypto_async_request *areq, int err)
 {
 	struct skcipher_request *req = areq->data;
-	struct skcipher_request *subreq;
-	struct rctx *rctx;
-
-	rctx = skcipher_request_ctx(req);
-
-	if (err == -EINPROGRESS) {
-		if (rctx->left != req->cryptlen)
-			return;
-		goto out;
-	}
 
-	subreq = &rctx->subreq;
-	subreq->base.flags &= CRYPTO_TFM_REQ_MAY_BACKLOG;
+	if (!err)
+		err = xor_tweak_post(req);
 
-	err = do_encrypt(req, err ?: post_crypt(req));
-	if (rctx->left)
-		return;
-
-out:
 	skcipher_request_complete(req, err);
 }
 
-static int encrypt(struct skcipher_request *req)
-{
-	return do_encrypt(req, init_crypt(req, encrypt_done));
-}
-
-static int do_decrypt(struct skcipher_request *req, int err)
+static void init_crypt(struct skcipher_request *req)
 {
+	struct priv *ctx = crypto_skcipher_ctx(crypto_skcipher_reqtfm(req));
 	struct rctx *rctx = skcipher_request_ctx(req);
-	struct skcipher_request *subreq;
+	struct skcipher_request *subreq = &rctx->subreq;
 
-	subreq = &rctx->subreq;
-
-	while (!err && rctx->left) {
-		err = pre_crypt(req) ?:
-		      crypto_skcipher_decrypt(subreq) ?:
-		      post_crypt(req);
+	skcipher_request_set_tfm(subreq, ctx->child);
+	skcipher_request_set_callback(subreq, req->base.flags, crypt_done, req);
+	/* pass req->iv as IV (will be used by xor_tweak, ECB will ignore it) */
+	skcipher_request_set_crypt(subreq, req->dst, req->dst,
+				   req->cryptlen, req->iv);
 
-		if (err == -EINPROGRESS || err == -EBUSY)
-			return err;
-	}
+	/* calculate first value of T */
+	memcpy(&rctx->t, req->iv, sizeof(rctx->t));
 
-	exit_crypt(req);
-	return err;
+	/* T <- I*Key2 */
+	gf128mul_64k_bbe(&rctx->t, ctx->table);
 }
 
-static void decrypt_done(struct crypto_async_request *areq, int err)
+static int encrypt(struct skcipher_request *req)
 {
-	struct skcipher_request *req = areq->data;
-	struct skcipher_request *subreq;
-	struct rctx *rctx;
-
-	rctx = skcipher_request_ctx(req);
-
-	if (err == -EINPROGRESS) {
-		if (rctx->left != req->cryptlen)
-			return;
-		goto out;
-	}
-
-	subreq = &rctx->subreq;
-	subreq->base.flags &= CRYPTO_TFM_REQ_MAY_BACKLOG;
-
-	err = do_decrypt(req, err ?: post_crypt(req));
-	if (rctx->left)
-		return;
+	struct rctx *rctx = skcipher_request_ctx(req);
+	struct skcipher_request *subreq = &rctx->subreq;
 
-out:
-	skcipher_request_complete(req, err);
+	init_crypt(req);
+	return xor_tweak_pre(req) ?:
+		crypto_skcipher_encrypt(subreq) ?:
+		xor_tweak_post(req);
 }
 
 static int decrypt(struct skcipher_request *req)
 {
-	return do_decrypt(req, init_crypt(req, decrypt_done));
+	struct rctx *rctx = skcipher_request_ctx(req);
+	struct skcipher_request *subreq = &rctx->subreq;
+
+	init_crypt(req);
+	return xor_tweak_pre(req) ?:
+		crypto_skcipher_decrypt(subreq) ?:
+		xor_tweak_post(req);
 }
 
 static int init_tfm(struct crypto_skcipher *tfm)
-- 
2.17.1


* Re: [PATCH v4 0/4] crypto: lrw - Fixes and improvements
  2018-09-13  8:51 [PATCH v4 0/4] crypto: lrw - Fixes and improvements Ondrej Mosnacek
                   ` (3 preceding siblings ...)
  2018-09-13  8:51 ` [PATCH v4 4/4] crypto: lrw - Do not use auxiliary buffer Ondrej Mosnacek
@ 2018-09-21  5:45 ` Herbert Xu
  2018-09-30 19:00   ` Ard Biesheuvel
  4 siblings, 1 reply; 8+ messages in thread
From: Herbert Xu @ 2018-09-21  5:45 UTC (permalink / raw)
  To: Ondrej Mosnacek; +Cc: Eric Biggers, dm-devel, Mikulas Patocka, linux-crypto

On Thu, Sep 13, 2018 at 10:51:30AM +0200, Ondrej Mosnacek wrote:
> This patchset contains a corner-case fix and several improvements for
> the LRW template.
>
> [...]

All applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH v4 0/4] crypto: lrw - Fixes and improvements
  2018-09-21  5:45 ` [PATCH v4 0/4] crypto: lrw - Fixes and improvements Herbert Xu
@ 2018-09-30 19:00   ` Ard Biesheuvel
  2018-09-30 19:40     ` Ard Biesheuvel
  0 siblings, 1 reply; 8+ messages in thread
From: Ard Biesheuvel @ 2018-09-30 19:00 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Eric Biggers, device-mapper development, Mikulas Patocka,
	Ondrej Mosnacek, open list:HARDWARE RANDOM NUMBER GENERATOR CORE

On 21 September 2018 at 07:45, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Thu, Sep 13, 2018 at 10:51:30AM +0200, Ondrej Mosnacek wrote:
>> This patchset contains a corner-case fix and several improvements for
>> the LRW template.
>>
>> [...]
>
> All applied.  Thanks.

I am seeing tcrypt failures with this code:

alg: skcipher: Test 8 failed (invalid result) on encryption for lrw(ecb-aes-ce)
00000000: 47 90 50 f6 f4 8d 5c 7f 84 c7 83 95 2d a2 02 c0
00000010: da 7f a3 c0 88 2a 0a 50 fb c1 78 03 39 fe 1d e5
00000020: 47 90 50 f6 f4 8d 5c 7f 84 c7 83 95 2d a2 02 c0

reproduced on both arm64 and ARM (the latter in LE and BE modes)


* Re: [PATCH v4 0/4] crypto: lrw - Fixes and improvements
  2018-09-30 19:00   ` Ard Biesheuvel
@ 2018-09-30 19:40     ` Ard Biesheuvel
  0 siblings, 0 replies; 8+ messages in thread
From: Ard Biesheuvel @ 2018-09-30 19:40 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Eric Biggers, device-mapper development, Mikulas Patocka,
	Ondrej Mosnacek, open list:HARDWARE RANDOM NUMBER GENERATOR CORE

On 30 September 2018 at 21:00, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> On 21 September 2018 at 07:45, Herbert Xu <herbert@gondor.apana.org.au> wrote:
>> On Thu, Sep 13, 2018 at 10:51:30AM +0200, Ondrej Mosnacek wrote:
>>> This patchset contains a corner-case fix and several improvements for
>>> the LRW template.
>>>
>>> [...]
>>
>> All applied.  Thanks.
>
> I am seeing tcrypt failures with this code:
>
> alg: skcipher: Test 8 failed (invalid result) on encryption for lrw(ecb-aes-ce)
> 00000000: 47 90 50 f6 f4 8d 5c 7f 84 c7 83 95 2d a2 02 c0
> 00000010: da 7f a3 c0 88 2a 0a 50 fb c1 78 03 39 fe 1d e5
> 00000020: 47 90 50 f6 f4 8d 5c 7f 84 c7 83 95 2d a2 02 c0
>
> reproduced on both arm64 and ARM (the latter in LE and BE modes)

This fixes it for me

diff --git a/crypto/lrw.c b/crypto/lrw.c
index 6fcf0d431185..0430ccd08728 100644
--- a/crypto/lrw.c
+++ b/crypto/lrw.c
@@ -122,10 +122,9 @@ static int next_index(u32 *counter)
        int i, res = 0;

        for (i = 0; i < 4; i++) {
-               if (counter[i] + 1 != 0) {
-                       res += ffz(counter[i]++);
-                       break;
-               }
+               if (counter[i] + 1 != 0)
+                       return res + ffz(counter[i]++);
+
                counter[i] = 0;
                res += 32;
        }

since otherwise, the function always returns 127 (!)


