Stable Archive on lore.kernel.org
 help / Atom feed
* [PATCH] crypto: caam - Do not overwrite IV
@ 2019-01-31  6:12 Sascha Hauer
  2019-02-04 12:26 ` Horia Geanta
  0 siblings, 1 reply; 14+ messages in thread
From: Sascha Hauer @ 2019-01-31  6:12 UTC (permalink / raw)
  To: linux-crypto; +Cc: Horia Geantă, kernel, stable, Sascha Hauer

In skcipher_decrypt() the IV passed in by the caller is overwritten and
the tcrypt module fails with:

alg: aead: decryption failed on test 1 for gcm_base(ctr-aes-caam,ghash-generic): ret=74
alg: aead: Failed to load transform for gcm(aes): -2

With this patch tcrypt runs without errors.

Fixes: 115957bb3e59 ("crypto: caam - fix IV DMA mapping and updating")

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 drivers/crypto/caam/caamalg.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/drivers/crypto/caam/caamalg.c b/drivers/crypto/caam/caamalg.c
index 80ae69f906fb..493fa4169382 100644
--- a/drivers/crypto/caam/caamalg.c
+++ b/drivers/crypto/caam/caamalg.c
@@ -1735,7 +1735,6 @@ static int skcipher_decrypt(struct skcipher_request *req)
 	struct skcipher_edesc *edesc;
 	struct crypto_skcipher *skcipher = crypto_skcipher_reqtfm(req);
 	struct caam_ctx *ctx = crypto_skcipher_ctx(skcipher);
-	int ivsize = crypto_skcipher_ivsize(skcipher);
 	struct device *jrdev = ctx->jrdev;
 	u32 *desc;
 	int ret = 0;
@@ -1745,13 +1744,6 @@ static int skcipher_decrypt(struct skcipher_request *req)
 	if (IS_ERR(edesc))
 		return PTR_ERR(edesc);
 
-	/*
-	 * The crypto API expects us to set the IV (req->iv) to the last
-	 * ciphertext block.
-	 */
-	scatterwalk_map_and_copy(req->iv, req->src, req->cryptlen - ivsize,
-				 ivsize, 0);
-
 	/* Create and submit job descriptor*/
 	init_skcipher_job(req, edesc, false);
 	desc = edesc->hw_desc;
-- 
2.20.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] crypto: caam - Do not overwrite IV
  2019-01-31  6:12 [PATCH] crypto: caam - Do not overwrite IV Sascha Hauer
@ 2019-02-04 12:26 ` Horia Geanta
  2019-02-05  8:31   ` Sascha Hauer
  2019-02-08  7:16   ` Herbert Xu
  0 siblings, 2 replies; 14+ messages in thread
From: Horia Geanta @ 2019-02-04 12:26 UTC (permalink / raw)
  To: Sascha Hauer, linux-crypto, Herbert Xu; +Cc: kernel, stable

On 1/31/2019 8:12 AM, Sascha Hauer wrote:
> In skcipher_decrypt() the IV passed in by the caller is overwritten and
> the tcrypt module fails with:
> 
> alg: aead: decryption failed on test 1 for gcm_base(ctr-aes-caam,ghash-generic): ret=74
> alg: aead: Failed to load transform for gcm(aes): -2
> 
> With this patch tcrypt runs without errors.
> 
This doesn't mean the patch is correct.
crypto API requires skcipher implementations to update the IV with the last
ciphertext block.

The root cause of the issue is cache line sharing.

struct crypto_gcm_req_priv_ctx {
        u8 iv[16];
        u8 auth_tag[16];
	[...]
};

Since caam does not support ghash on i.MX6, only ctr skcipher part of the gcm is
offloaded.
The skcipher request received by caam has req->src pointing to auth_tag[16] (1st
S/G entry) and req->iv pointing to iv[16].
caam driver:
1-DMA maps req->src
2-copies original req->iv to internal buffer
3-updates req->iv (scatterwalk_map_and_copy from last block in req->src)
4-sends job to crypto engine

Problem is that operation 3 above is writing iv[16], which is on the same cache
line as auth_tag[16] that was previously DMA mapped.

I've checked that forcing auth_tag and iv to be on separate cache lines
-       u8 auth_tag[16];
+       u8 auth_tag[16] ____cacheline_aligned;
solves the issue.

OTOH, maybe the fix should be done in caam driver, by avoiding any writes
(touching any data, even seemingly unrelated req->iv) after DMA mapping
req->src, req->dst etc.
Having req->iv and req->src sharing the same cache line is unfortunate.

Herbert, what do you think?

Thanks,
Horia

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] crypto: caam - Do not overwrite IV
  2019-02-04 12:26 ` Horia Geanta
@ 2019-02-05  8:31   ` Sascha Hauer
  2019-02-05 11:49     ` Horia Geanta
  2019-02-08  7:16   ` Herbert Xu
  1 sibling, 1 reply; 14+ messages in thread
From: Sascha Hauer @ 2019-02-05  8:31 UTC (permalink / raw)
  To: Horia Geanta; +Cc: linux-crypto, Herbert Xu, kernel, stable

Hi Horia,

On Mon, Feb 04, 2019 at 12:26:26PM +0000, Horia Geanta wrote:
> On 1/31/2019 8:12 AM, Sascha Hauer wrote:
> > In skcipher_decrypt() the IV passed in by the caller is overwritten and
> > the tcrypt module fails with:
> > 
> > alg: aead: decryption failed on test 1 for gcm_base(ctr-aes-caam,ghash-generic): ret=74
> > alg: aead: Failed to load transform for gcm(aes): -2
> > 
> > With this patch tcrypt runs without errors.
> > 
> This doesn't mean the patch is correct.
> crypto API requires skcipher implementations to update the IV with the last
> ciphertext block.
> 
> The root cause of the issue is cache line sharing.
> 
> struct crypto_gcm_req_priv_ctx {
>         u8 iv[16];
>         u8 auth_tag[16];
> 	[...]
> };
> 
> Since caam does not support ghash on i.MX6, only ctr skcipher part of the gcm is
> offloaded.
> The skcipher request received by caam has req->src pointing to auth_tag[16] (1st
> S/G entry) and req->iv pointing to iv[16].
> caam driver:
> 1-DMA maps req->src
> 2-copies original req->iv to internal buffer
> 3-updates req->iv (scatterwalk_map_and_copy from last block in req->src)
> 4-sends job to crypto engine
> 
> Problem is that operation 3 above is writing iv[16], which is on the same cache
> line as auth_tag[16] that was previously DMA mapped.
> 
> I've checked that forcing auth_tag and iv to be on separate cache lines
> -       u8 auth_tag[16];
> +       u8 auth_tag[16] ____cacheline_aligned;
> solves the issue.

I can confirm that this solves it here on my side aswell.

I have no idea what's best to do here, but I'd like to have that fixed.

Is there some easy to reproduce testcase to show the issues that arise
with my patch? Apparently tcrypt doesn't do chaining, right?

Sascha

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] crypto: caam - Do not overwrite IV
  2019-02-05  8:31   ` Sascha Hauer
@ 2019-02-05 11:49     ` Horia Geanta
  0 siblings, 0 replies; 14+ messages in thread
From: Horia Geanta @ 2019-02-05 11:49 UTC (permalink / raw)
  To: Sascha Hauer; +Cc: linux-crypto, Herbert Xu, kernel, stable

On 2/5/2019 10:31 AM, Sascha Hauer wrote:
> Hi Horia,
> 
> On Mon, Feb 04, 2019 at 12:26:26PM +0000, Horia Geanta wrote:
>> On 1/31/2019 8:12 AM, Sascha Hauer wrote:
>>> In skcipher_decrypt() the IV passed in by the caller is overwritten and
>>> the tcrypt module fails with:
>>>
>>> alg: aead: decryption failed on test 1 for gcm_base(ctr-aes-caam,ghash-generic): ret=74
>>> alg: aead: Failed to load transform for gcm(aes): -2
>>>
>>> With this patch tcrypt runs without errors.
>>>
>> This doesn't mean the patch is correct.
>> crypto API requires skcipher implementations to update the IV with the last
>> ciphertext block.
>>
>> The root cause of the issue is cache line sharing.
>>
>> struct crypto_gcm_req_priv_ctx {
>>         u8 iv[16];
>>         u8 auth_tag[16];
>> 	[...]
>> };
>>
>> Since caam does not support ghash on i.MX6, only ctr skcipher part of the gcm is
>> offloaded.
>> The skcipher request received by caam has req->src pointing to auth_tag[16] (1st
>> S/G entry) and req->iv pointing to iv[16].
>> caam driver:
>> 1-DMA maps req->src
>> 2-copies original req->iv to internal buffer
>> 3-updates req->iv (scatterwalk_map_and_copy from last block in req->src)
>> 4-sends job to crypto engine
>>
>> Problem is that operation 3 above is writing iv[16], which is on the same cache
>> line as auth_tag[16] that was previously DMA mapped.
>>
>> I've checked that forcing auth_tag and iv to be on separate cache lines
>> -       u8 auth_tag[16];
>> +       u8 auth_tag[16] ____cacheline_aligned;
>> solves the issue.
> 
> I can confirm that this solves it here on my side aswell.
> 
Thanks for confirming.

> I have no idea what's best to do here, but I'd like to have that fixed.
> 
Yes, me too.
I would wait for Herbert's input though. As I said, it's a bit weird for a
skcipher implementation to be aware of updating req->iv having such a side
effect on req->src.

> Is there some easy to reproduce testcase to show the issues that arise
> with my patch? Apparently tcrypt doesn't do chaining, right?
> 
In theory cts(cbc(aes)) decryption would have to fail when cbc(aes) would
execute on caam and cts() in SW.

However, cts() SW implementation (crypto/cts.c) does not rely on underlying
cbc(aes) to update the IV, instead it uses a temporary saved IV - see discussion
here:
https://www.mail-archive.com/linux-crypto@vger.kernel.org/msg27719.html

Horia


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] crypto: caam - Do not overwrite IV
  2019-02-04 12:26 ` Horia Geanta
  2019-02-05  8:31   ` Sascha Hauer
@ 2019-02-08  7:16   ` Herbert Xu
  2019-02-08  8:41     ` Horia Geanta
  1 sibling, 1 reply; 14+ messages in thread
From: Herbert Xu @ 2019-02-08  7:16 UTC (permalink / raw)
  To: Horia Geanta; +Cc: Sascha Hauer, linux-crypto, kernel, stable

On Mon, Feb 04, 2019 at 12:26:26PM +0000, Horia Geanta wrote:
>
> The root cause of the issue is cache line sharing.
> 
> struct crypto_gcm_req_priv_ctx {
>         u8 iv[16];
>         u8 auth_tag[16];
> 	[...]
> };
> 
> Since caam does not support ghash on i.MX6, only ctr skcipher part of the gcm is
> offloaded.
> The skcipher request received by caam has req->src pointing to auth_tag[16] (1st
> S/G entry) and req->iv pointing to iv[16].
> caam driver:
> 1-DMA maps req->src
> 2-copies original req->iv to internal buffer
> 3-updates req->iv (scatterwalk_map_and_copy from last block in req->src)
> 4-sends job to crypto engine
> 
> Problem is that operation 3 above is writing iv[16], which is on the same cache
> line as auth_tag[16] that was previously DMA mapped.
> 
> I've checked that forcing auth_tag and iv to be on separate cache lines
> -       u8 auth_tag[16];
> +       u8 auth_tag[16] ____cacheline_aligned;
> solves the issue.
> 
> OTOH, maybe the fix should be done in caam driver, by avoiding any writes
> (touching any data, even seemingly unrelated req->iv) after DMA mapping
> req->src, req->dst etc.
> Having req->iv and req->src sharing the same cache line is unfortunate.
> 
> Herbert, what do you think?

Well just like the other cases if your input is a kernel pointer you
must not perform DMA on it.  Only SG lists can be used for DMA.

So the IV needs to be copied on completion.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] crypto: caam - Do not overwrite IV
  2019-02-08  7:16   ` Herbert Xu
@ 2019-02-08  8:41     ` Horia Geanta
  2019-02-08  8:55       ` Ard Biesheuvel
  2019-02-08 11:45       ` Herbert Xu
  0 siblings, 2 replies; 14+ messages in thread
From: Horia Geanta @ 2019-02-08  8:41 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Sascha Hauer, linux-crypto, kernel, stable

On 2/8/2019 9:16 AM, Herbert Xu wrote:
> On Mon, Feb 04, 2019 at 12:26:26PM +0000, Horia Geanta wrote:
>>
>> The root cause of the issue is cache line sharing.
>>
>> struct crypto_gcm_req_priv_ctx {
>>         u8 iv[16];
>>         u8 auth_tag[16];
>> 	[...]
>> };
>>
>> Since caam does not support ghash on i.MX6, only ctr skcipher part of the gcm is
>> offloaded.
>> The skcipher request received by caam has req->src pointing to auth_tag[16] (1st
>> S/G entry) and req->iv pointing to iv[16].
>> caam driver:
>> 1-DMA maps req->src
>> 2-copies original req->iv to internal buffer
>> 3-updates req->iv (scatterwalk_map_and_copy from last block in req->src)
>> 4-sends job to crypto engine
>>
>> Problem is that operation 3 above is writing iv[16], which is on the same cache
>> line as auth_tag[16] that was previously DMA mapped.
>>
>> I've checked that forcing auth_tag and iv to be on separate cache lines
>> -       u8 auth_tag[16];
>> +       u8 auth_tag[16] ____cacheline_aligned;
>> solves the issue.
>>
>> OTOH, maybe the fix should be done in caam driver, by avoiding any writes
>> (touching any data, even seemingly unrelated req->iv) after DMA mapping
>> req->src, req->dst etc.
>> Having req->iv and req->src sharing the same cache line is unfortunate.
>>
>> Herbert, what do you think?
> 
> Well just like the other cases if your input is a kernel pointer you
> must not perform DMA on it.  Only SG lists can be used for DMA.
> 
As I said at point 2 above, req->iv is copied to an internal buffer, which is
allocated using kmalloc.

> So the IV needs to be copied on completion.
> 
Is it mandatory to be copied *on completion*?
In some cases implementations could update req->iv before completion - for e.g.
in case of cbc decryption the last block from req->src is copied into req->iv
before the engine performs decryption (to avoid in-place decryption, where the
last block would be overwritten).

I'll try to explain issue at hand in more detail.

------------------
|       IV       |
------------------
|  input buffer  |
------------------

Consider that the skcipher implementation receives, via crypto API, a request
with req->IV pointing to "IV" and, for simplicity, a 1-entry req->src
scatterlist pointing at "input buffer".

In caam's particular case (and for decryption):
a-req->src is DMA mapped
b-req->iv is overwritten with last block of req->src
c-crypto engine executes decryption (using the original value of req->iv)

If IV and input buffer are on the same cache line, there is a problem when the
device is non-coherent (i.MX case) since CPU is touching part of the cache line
(writing the IV) after DMA API mapping was called for the same cacheline
(req->src -> input buffer).

I don't think we could ask an implementation to be aware of the memory layout of
req->iv and req->src (and req->dst etc.) buffers.

If I am not mistaken, req->src for skcipher request in crypto/gcm.c is breaking
one of the DMA API rules - Documentation/DMA-API.txt:

.. warning::

        Memory coherency operates at a granularity called the cache
        line width.  In order for memory mapped by this API to operate
        correctly, the mapped region must begin exactly on a cache line
        boundary and end exactly on one (to prevent two separately mapped
        regions from sharing a single cache line).

So if there is a real intention to support offloading skcipher, as this old
commit suggests:

84c911523020 ("[CRYPTO] gcm: Add support for async ciphers")
    This patch adds the necessary changes for GCM to be used with async
    ciphers.  This would allow it to be used with hardware devices that
    support CTR.

then we must take special care when building skcipher req->src and avoid having
it's first entry (auth_tag[16] in crypto_gcm_req_priv_ctx structure) sharing a
cache line with req->iv.

Thanks,
Horia

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] crypto: caam - Do not overwrite IV
  2019-02-08  8:41     ` Horia Geanta
@ 2019-02-08  8:55       ` Ard Biesheuvel
  2019-02-08 11:44         ` Herbert Xu
  2019-02-12  9:13         ` Ard Biesheuvel
  2019-02-08 11:45       ` Herbert Xu
  1 sibling, 2 replies; 14+ messages in thread
From: Ard Biesheuvel @ 2019-02-08  8:55 UTC (permalink / raw)
  To: Horia Geanta; +Cc: Herbert Xu, Sascha Hauer, linux-crypto, kernel, stable

On Fri, 8 Feb 2019 at 09:41, Horia Geanta <horia.geanta@nxp.com> wrote:
>
> On 2/8/2019 9:16 AM, Herbert Xu wrote:
> > On Mon, Feb 04, 2019 at 12:26:26PM +0000, Horia Geanta wrote:
> >>
> >> The root cause of the issue is cache line sharing.
> >>
> >> struct crypto_gcm_req_priv_ctx {
> >>         u8 iv[16];
> >>         u8 auth_tag[16];
> >>      [...]
> >> };
> >>
> >> Since caam does not support ghash on i.MX6, only ctr skcipher part of the gcm is
> >> offloaded.
> >> The skcipher request received by caam has req->src pointing to auth_tag[16] (1st
> >> S/G entry) and req->iv pointing to iv[16].
> >> caam driver:
> >> 1-DMA maps req->src
> >> 2-copies original req->iv to internal buffer
> >> 3-updates req->iv (scatterwalk_map_and_copy from last block in req->src)

This violates the DMA api, since you are touching memory that is owned
by the device at this point (even though the addresses do not actually
overlap). Note that on architectures that support non-cache coherent
DMA, the kmalloc alignment is at least the cacheline size, for this
exact reason.

However, it should not be the driver's job to figure out whether
kernel pointers may alias cachelines that are covered by a
scatterlist.

So I would argue that the generic GCM driver should ensure that
whatever it passes into scatterlists is safe for non-cache coherent
DMA. Blowing up the struct like this is probably not the right answer,
instead, we should probably have an auth_tag pointer and a separate
kmalloc buffer so we don't affect cache coherent architectures too
much.


> >> 4-sends job to crypto engine
> >>
> >> Problem is that operation 3 above is writing iv[16], which is on the same cache
> >> line as auth_tag[16] that was previously DMA mapped.
> >>
> >> I've checked that forcing auth_tag and iv to be on separate cache lines
> >> -       u8 auth_tag[16];
> >> +       u8 auth_tag[16] ____cacheline_aligned;
> >> solves the issue.
> >>
> >> OTOH, maybe the fix should be done in caam driver, by avoiding any writes
> >> (touching any data, even seemingly unrelated req->iv) after DMA mapping
> >> req->src, req->dst etc.
> >> Having req->iv and req->src sharing the same cache line is unfortunate.
> >>
> >> Herbert, what do you think?
> >
> > Well just like the other cases if your input is a kernel pointer you
> > must not perform DMA on it.  Only SG lists can be used for DMA.
> >
> As I said at point 2 above, req->iv is copied to an internal buffer, which is
> allocated using kmalloc.
>
> > So the IV needs to be copied on completion.
> >
> Is it mandatory to be copied *on completion*?
> In some cases implementations could update req->iv before completion - for e.g.
> in case of cbc decryption the last block from req->src is copied into req->iv
> before the engine performs decryption (to avoid in-place decryption, where the
> last block would be overwritten).
>
> I'll try to explain issue at hand in more detail.
>
> ------------------
> |       IV       |
> ------------------
> |  input buffer  |
> ------------------
>
> Consider that the skcipher implementation receives, via crypto API, a request
> with req->IV pointing to "IV" and, for simplicity, a 1-entry req->src
> scatterlist pointing at "input buffer".
>
> In caam's particular case (and for decryption):
> a-req->src is DMA mapped
> b-req->iv is overwritten with last block of req->src
> c-crypto engine executes decryption (using the original value of req->iv)
>
> If IV and input buffer are on the same cache line, there is a problem when the
> device is non-coherent (i.MX case) since CPU is touching part of the cache line
> (writing the IV) after DMA API mapping was called for the same cacheline
> (req->src -> input buffer).
>
> I don't think we could ask an implementation to be aware of the memory layout of
> req->iv and req->src (and req->dst etc.) buffers.
>
> If I am not mistaken, req->src for skcipher request in crypto/gcm.c is breaking
> one of the DMA API rules - Documentation/DMA-API.txt:
>
> .. warning::
>
>         Memory coherency operates at a granularity called the cache
>         line width.  In order for memory mapped by this API to operate
>         correctly, the mapped region must begin exactly on a cache line
>         boundary and end exactly on one (to prevent two separately mapped
>         regions from sharing a single cache line).
>
> So if there is a real intention to support offloading skcipher, as this old
> commit suggests:
>
> 84c911523020 ("[CRYPTO] gcm: Add support for async ciphers")
>     This patch adds the necessary changes for GCM to be used with async
>     ciphers.  This would allow it to be used with hardware devices that
>     support CTR.
>
> then we must take special care when building skcipher req->src and avoid having
> it's first entry (auth_tag[16] in crypto_gcm_req_priv_ctx structure) sharing a
> cache line with req->iv.
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] crypto: caam - Do not overwrite IV
  2019-02-08  8:55       ` Ard Biesheuvel
@ 2019-02-08 11:44         ` Herbert Xu
  2019-02-12  9:13         ` Ard Biesheuvel
  1 sibling, 0 replies; 14+ messages in thread
From: Herbert Xu @ 2019-02-08 11:44 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: Horia Geanta, Sascha Hauer, linux-crypto, kernel, stable

On Fri, Feb 08, 2019 at 09:55:19AM +0100, Ard Biesheuvel wrote:
>
> So I would argue that the generic GCM driver should ensure that
> whatever it passes into scatterlists is safe for non-cache coherent
> DMA. Blowing up the struct like this is probably not the right answer,
> instead, we should probably have an auth_tag pointer and a separate
> kmalloc buffer so we don't affect cache coherent architectures too
> much.

Yes I agree.  GCM needs to ensure auth_tag is aligned properly.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] crypto: caam - Do not overwrite IV
  2019-02-08  8:41     ` Horia Geanta
  2019-02-08  8:55       ` Ard Biesheuvel
@ 2019-02-08 11:45       ` Herbert Xu
  2019-02-11 15:13         ` Horia Geanta
  1 sibling, 1 reply; 14+ messages in thread
From: Herbert Xu @ 2019-02-08 11:45 UTC (permalink / raw)
  To: Horia Geanta; +Cc: Sascha Hauer, linux-crypto, kernel, stable

On Fri, Feb 08, 2019 at 08:41:37AM +0000, Horia Geanta wrote:
>
> So if there is a real intention to support offloading skcipher, as this old
> commit suggests:
> 
> 84c911523020 ("[CRYPTO] gcm: Add support for async ciphers")
>     This patch adds the necessary changes for GCM to be used with async
>     ciphers.  This would allow it to be used with hardware devices that
>     support CTR.
> 
> then we must take special care when building skcipher req->src and avoid having
> it's first entry (auth_tag[16] in crypto_gcm_req_priv_ctx structure) sharing a
> cache line with req->iv.

Could you prepare a patch for this?

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] crypto: caam - Do not overwrite IV
  2019-02-08 11:45       ` Herbert Xu
@ 2019-02-11 15:13         ` Horia Geanta
  2019-02-11 17:28           ` Ard Biesheuvel
  0 siblings, 1 reply; 14+ messages in thread
From: Horia Geanta @ 2019-02-11 15:13 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Sascha Hauer, linux-crypto, kernel, stable

On 2/8/2019 1:45 PM, Herbert Xu wrote:
> On Fri, Feb 08, 2019 at 08:41:37AM +0000, Horia Geanta wrote:
>>
>> So if there is a real intention to support offloading skcipher, as this old
>> commit suggests:
>>
>> 84c911523020 ("[CRYPTO] gcm: Add support for async ciphers")
>>     This patch adds the necessary changes for GCM to be used with async
>>     ciphers.  This would allow it to be used with hardware devices that
>>     support CTR.
>>
>> then we must take special care when building skcipher req->src and avoid having
>> it's first entry (auth_tag[16] in crypto_gcm_req_priv_ctx structure) sharing a
>> cache line with req->iv.
> 
> Could you prepare a patch for this?
> 
Yes, will do.

Note that I am seeing the same issue on CCM.

Also for GCM, besides auth_tag there is iauth_tag[16] buffer that is added in a
1-entry S/G in gcm_hash_len():
	sg_init_one(&pctx->sg, pctx->iauth_tag, 16);

It looks like there other places where this happens, for e.g. "tail" member of
poly_req (crypto/cacha20poly1305.c) in poly_tail():
	sg_set_buf(preq->src, &preq->tail, sizeof(preq->tail));

Thanks,
Horia

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] crypto: caam - Do not overwrite IV
  2019-02-11 15:13         ` Horia Geanta
@ 2019-02-11 17:28           ` Ard Biesheuvel
  0 siblings, 0 replies; 14+ messages in thread
From: Ard Biesheuvel @ 2019-02-11 17:28 UTC (permalink / raw)
  To: Horia Geanta; +Cc: Herbert Xu, Sascha Hauer, linux-crypto, kernel, stable

On Mon, 11 Feb 2019 at 16:13, Horia Geanta <horia.geanta@nxp.com> wrote:
>
> On 2/8/2019 1:45 PM, Herbert Xu wrote:
> > On Fri, Feb 08, 2019 at 08:41:37AM +0000, Horia Geanta wrote:
> >>
> >> So if there is a real intention to support offloading skcipher, as this old
> >> commit suggests:
> >>
> >> 84c911523020 ("[CRYPTO] gcm: Add support for async ciphers")
> >>     This patch adds the necessary changes for GCM to be used with async
> >>     ciphers.  This would allow it to be used with hardware devices that
> >>     support CTR.
> >>
> >> then we must take special care when building skcipher req->src and avoid having
> >> it's first entry (auth_tag[16] in crypto_gcm_req_priv_ctx structure) sharing a
> >> cache line with req->iv.
> >
> > Could you prepare a patch for this?
> >
> Yes, will do.
>
> Note that I am seeing the same issue on CCM.
>
> Also for GCM, besides auth_tag there is iauth_tag[16] buffer that is added in a
> 1-entry S/G in gcm_hash_len():
>         sg_init_one(&pctx->sg, pctx->iauth_tag, 16);
>

Yes, but I suppose the issue only occurs when writing data from the
device, no? AFAICT, iauth_tag[] is passed into the device but never
updated by it.

> It looks like there other places where this happens, for e.g. "tail" member of
> poly_req (crypto/cacha20poly1305.c) in poly_tail():
>         sg_set_buf(preq->src, &preq->tail, sizeof(preq->tail));
>

Same here. If it never occurs in the destination of the scatterlist,
it should be safe for non-cache coherent DMA (since DMA to the device
does not involve cache invalidation)

However, it would be nice if we could test this in some way ...

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] crypto: caam - Do not overwrite IV
  2019-02-08  8:55       ` Ard Biesheuvel
  2019-02-08 11:44         ` Herbert Xu
@ 2019-02-12  9:13         ` Ard Biesheuvel
  2019-02-13 20:48           ` Horia Geanta
  1 sibling, 1 reply; 14+ messages in thread
From: Ard Biesheuvel @ 2019-02-12  9:13 UTC (permalink / raw)
  To: Horia Geanta; +Cc: Herbert Xu, Sascha Hauer, linux-crypto, kernel, stable

On Fri, 8 Feb 2019 at 09:55, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>
> On Fri, 8 Feb 2019 at 09:41, Horia Geanta <horia.geanta@nxp.com> wrote:
> >
> > On 2/8/2019 9:16 AM, Herbert Xu wrote:
> > > On Mon, Feb 04, 2019 at 12:26:26PM +0000, Horia Geanta wrote:
> > >>
> > >> The root cause of the issue is cache line sharing.
> > >>
> > >> struct crypto_gcm_req_priv_ctx {
> > >>         u8 iv[16];
> > >>         u8 auth_tag[16];
> > >>      [...]
> > >> };
> > >>
> > >> Since caam does not support ghash on i.MX6, only ctr skcipher part of the gcm is
> > >> offloaded.
> > >> The skcipher request received by caam has req->src pointing to auth_tag[16] (1st
> > >> S/G entry) and req->iv pointing to iv[16].
> > >> caam driver:
> > >> 1-DMA maps req->src
> > >> 2-copies original req->iv to internal buffer
> > >> 3-updates req->iv (scatterwalk_map_and_copy from last block in req->src)
>
> This violates the DMA api, since you are touching memory that is owned
> by the device at this point (even though the addresses do not actually
> overlap). Note that on architectures that support non-cache coherent
> DMA, the kmalloc alignment is at least the cacheline size, for this
> exact reason.
>

Actually, the driver does violate the DMA api in another way:
scatterwalk_map_and_copy() is accessing req->src after DMA mapping it.
Does the issue still exist if scatterwalk_map_and_copy() is done
before the DMA map?

(On a non-cache coherent system, the DMA map will typically perform a
clean+invalidate, which means that the invalidate that occurs at unmap
time cannot corrupt adjacent data, but this only works if the CPU does
not write to the same cacheline while it is mapped for the device)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] crypto: caam - Do not overwrite IV
  2019-02-12  9:13         ` Ard Biesheuvel
@ 2019-02-13 20:48           ` Horia Geanta
  2019-02-13 21:15             ` Ard Biesheuvel
  0 siblings, 1 reply; 14+ messages in thread
From: Horia Geanta @ 2019-02-13 20:48 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: Herbert Xu, Sascha Hauer, linux-crypto, kernel, stable

On 2/12/2019 11:13 AM, Ard Biesheuvel wrote:
> On Fri, 8 Feb 2019 at 09:55, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>>
>> On Fri, 8 Feb 2019 at 09:41, Horia Geanta <horia.geanta@nxp.com> wrote:
>>>
>>> On 2/8/2019 9:16 AM, Herbert Xu wrote:
>>>> On Mon, Feb 04, 2019 at 12:26:26PM +0000, Horia Geanta wrote:
>>>>>
>>>>> The root cause of the issue is cache line sharing.
>>>>>
>>>>> struct crypto_gcm_req_priv_ctx {
>>>>>         u8 iv[16];
>>>>>         u8 auth_tag[16];
>>>>>      [...]
>>>>> };
>>>>>
>>>>> Since caam does not support ghash on i.MX6, only ctr skcipher part of the gcm is
>>>>> offloaded.
>>>>> The skcipher request received by caam has req->src pointing to auth_tag[16] (1st
>>>>> S/G entry) and req->iv pointing to iv[16].
>>>>> caam driver:
>>>>> 1-DMA maps req->src
>>>>> 2-copies original req->iv to internal buffer
>>>>> 3-updates req->iv (scatterwalk_map_and_copy from last block in req->src)
>>
>> This violates the DMA api, since you are touching memory that is owned
>> by the device at this point (even though the addresses do not actually
>> overlap). Note that on architectures that support non-cache coherent
>> DMA, the kmalloc alignment is at least the cacheline size, for this
>> exact reason.
>>
> 
> Actually, the driver does violate the DMA api in another way:
> scatterwalk_map_and_copy() is accessing req->src after DMA mapping it.
> Does the issue still exist if scatterwalk_map_and_copy() is done
> before the DMA map?
> 
> (On a non-cache coherent system, the DMA map will typically perform a
> clean+invalidate, which means that the invalidate that occurs at unmap
> time cannot corrupt adjacent data, but this only works if the CPU does
> not write to the same cacheline while it is mapped for the device)
> 
scatterwalk_map_and_copy() is reading from req->src.
Are you saying it's forbidden for CPU to read from an area after it's DMA mapped?

Thanks,
Horia

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] crypto: caam - Do not overwrite IV
  2019-02-13 20:48           ` Horia Geanta
@ 2019-02-13 21:15             ` Ard Biesheuvel
  0 siblings, 0 replies; 14+ messages in thread
From: Ard Biesheuvel @ 2019-02-13 21:15 UTC (permalink / raw)
  To: Horia Geanta; +Cc: Herbert Xu, Sascha Hauer, linux-crypto, kernel, stable

On Wed, 13 Feb 2019 at 21:48, Horia Geanta <horia.geanta@nxp.com> wrote:
>
> On 2/12/2019 11:13 AM, Ard Biesheuvel wrote:
> > On Fri, 8 Feb 2019 at 09:55, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> >>
> >> On Fri, 8 Feb 2019 at 09:41, Horia Geanta <horia.geanta@nxp.com> wrote:
> >>>
> >>> On 2/8/2019 9:16 AM, Herbert Xu wrote:
> >>>> On Mon, Feb 04, 2019 at 12:26:26PM +0000, Horia Geanta wrote:
> >>>>>
> >>>>> The root cause of the issue is cache line sharing.
> >>>>>
> >>>>> struct crypto_gcm_req_priv_ctx {
> >>>>>         u8 iv[16];
> >>>>>         u8 auth_tag[16];
> >>>>>      [...]
> >>>>> };
> >>>>>
> >>>>> Since caam does not support ghash on i.MX6, only ctr skcipher part of the gcm is
> >>>>> offloaded.
> >>>>> The skcipher request received by caam has req->src pointing to auth_tag[16] (1st
> >>>>> S/G entry) and req->iv pointing to iv[16].
> >>>>> caam driver:
> >>>>> 1-DMA maps req->src
> >>>>> 2-copies original req->iv to internal buffer
> >>>>> 3-updates req->iv (scatterwalk_map_and_copy from last block in req->src)
> >>
> >> This violates the DMA api, since you are touching memory that is owned
> >> by the device at this point (even though the addresses do not actually
> >> overlap). Note that on architectures that support non-cache coherent
> >> DMA, the kmalloc alignment is at least the cacheline size, for this
> >> exact reason.
> >>
> >
> > Actually, the driver does violate the DMA api in another way:
> > scatterwalk_map_and_copy() is accessing req->src after DMA mapping it.
> > Does the issue still exist if scatterwalk_map_and_copy() is done
> > before the DMA map?
> >
> > (On a non-cache coherent system, the DMA map will typically perform a
> > clean+invalidate, which means that the invalidate that occurs at unmap
> > time cannot corrupt adjacent data, but this only works if the CPU does
> > not write to the same cacheline while it is mapped for the device)
> >
> scatterwalk_map_and_copy() is reading from req->src.
> Are you saying it's forbidden for CPU to read from an area after it's DMA mapped?
>

OK, I read that the wrong way around. The DMA api does permit reading
from a region after it has been mapped for DMA.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, back to index

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-31  6:12 [PATCH] crypto: caam - Do not overwrite IV Sascha Hauer
2019-02-04 12:26 ` Horia Geanta
2019-02-05  8:31   ` Sascha Hauer
2019-02-05 11:49     ` Horia Geanta
2019-02-08  7:16   ` Herbert Xu
2019-02-08  8:41     ` Horia Geanta
2019-02-08  8:55       ` Ard Biesheuvel
2019-02-08 11:44         ` Herbert Xu
2019-02-12  9:13         ` Ard Biesheuvel
2019-02-13 20:48           ` Horia Geanta
2019-02-13 21:15             ` Ard Biesheuvel
2019-02-08 11:45       ` Herbert Xu
2019-02-11 15:13         ` Horia Geanta
2019-02-11 17:28           ` Ard Biesheuvel

Stable Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/stable/0 stable/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 stable stable/ https://lore.kernel.org/stable \
		stable@vger.kernel.org stable@archiver.kernel.org
	public-inbox-index stable


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.stable


AGPL code for this site: git clone https://public-inbox.org/ public-inbox