* [RFC PATCH 0/4] crypto: add CRYPTO_TFM_REQ_DMA flag
From: Iuliana Prodan (OSS) @ 2020-11-25 21:13 UTC
  To: Herbert Xu, Ard Biesheuvel, David S. Miller, Horia Geanta
  Cc: Aymen Sghaier, Silvano Di Ninno, Franck Lenormand, linux-crypto,
	linux-kernel, linux-imx, Iuliana Prodan

From: Iuliana Prodan <iuliana.prodan@nxp.com>

Add the option to allocate the crypto request object, plus any extra
space needed by the driver, in DMA-able memory.

Add a CRYPTO_TFM_REQ_DMA flag that backend implementations can use to
tell the crypto API to allocate GFP_DMA memory for the private
contexts of crypto requests.

For IPsec use cases, the CRYPTO_TFM_REQ_DMA flag is also checked in
the esp_alloc_tmp() function, for both IPv4 and IPv6.

This series includes an example of how a driver can use the
CRYPTO_TFM_REQ_DMA flag while setting reqsize to a larger value,
to avoid allocating memory at crypto request runtime.
The extra space needed by the driver is added to the reqsize field,
which indicates how much memory may be needed per request.
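
A minimal sketch of the driver-side usage this enables, assuming the
series is applied; my_drv_req_ctx and MY_DRV_EDESC_MAX are hypothetical
names:

static int my_drv_skcipher_init(struct crypto_skcipher *tfm)
{
	/* have the API OR GFP_DMA into the mask used to allocate requests */
	crypto_skcipher_set_flags(tfm, CRYPTO_TFM_REQ_DMA);

	/* reserve per-request space so nothing is allocated at runtime */
	crypto_skcipher_set_reqsize(tfm, sizeof(struct my_drv_req_ctx) +
				    MY_DRV_EDESC_MAX);
	return 0;
}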

Iuliana Prodan (4):
  crypto: add CRYPTO_TFM_REQ_DMA flag
  net: esp: check CRYPTO_TFM_REQ_DMA flag when allocating crypto request
  crypto: caam - avoid allocating memory at crypto request runtime for
    skcipher
  crypto: caam - avoid allocating memory at crypto request runtime for
    aead

 drivers/crypto/caam/caamalg.c | 130 +++++++++++++++++++++++++---------
 include/crypto/aead.h         |   4 ++
 include/crypto/akcipher.h     |  21 ++++++
 include/crypto/hash.h         |   4 ++
 include/crypto/skcipher.h     |   4 ++
 include/linux/crypto.h        |   1 +
 net/ipv4/esp4.c               |   7 +-
 net/ipv6/esp6.c               |   7 +-
 8 files changed, 144 insertions(+), 34 deletions(-)

-- 
2.17.1



* [RFC PATCH 1/4] crypto: add CRYPTO_TFM_REQ_DMA flag
From: Iuliana Prodan (OSS) @ 2020-11-25 21:13 UTC
  To: Herbert Xu, Ard Biesheuvel, David S. Miller, Horia Geanta
  Cc: Aymen Sghaier, Silvano Di Ninno, Franck Lenormand, linux-crypto,
	linux-kernel, linux-imx, Iuliana Prodan

From: Iuliana Prodan <iuliana.prodan@nxp.com>

The CRYPTO_TFM_REQ_DMA flag can be used by backend implementations to
tell the crypto API to allocate GFP_DMA memory for the private
contexts of crypto requests.

For public key encryption (akcipher), also add the functions needed
to set/get/clear the tfm flags.
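
For illustration (not part of the patch): a caller stays unchanged,
since the allocation helpers apply the flag internally.

	/* tfm belongs to a backend whose init set CRYPTO_TFM_REQ_DMA;
	 * aead_request_alloc() then ORs GFP_DMA into the mask below.
	 */
	struct aead_request *req = aead_request_alloc(tfm, GFP_KERNEL);

	if (!req)
		return -ENOMEM;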

Signed-off-by: Horia Geanta <horia.geanta@nxp.com>
Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com>
---
 include/crypto/aead.h     |  4 ++++
 include/crypto/akcipher.h | 21 +++++++++++++++++++++
 include/crypto/hash.h     |  4 ++++
 include/crypto/skcipher.h |  4 ++++
 include/linux/crypto.h    |  1 +
 5 files changed, 34 insertions(+)

diff --git a/include/crypto/aead.h b/include/crypto/aead.h
index fcc12c593ef8..ae2ef87cfb0d 100644
--- a/include/crypto/aead.h
+++ b/include/crypto/aead.h
@@ -416,6 +416,10 @@ static inline struct aead_request *aead_request_alloc(struct crypto_aead *tfm,
 {
 	struct aead_request *req;
 
+	if (crypto_aead_reqsize(tfm) &&
+	    (crypto_aead_get_flags(tfm) & CRYPTO_TFM_REQ_DMA))
+		gfp |= GFP_DMA;
+
 	req = kmalloc(sizeof(*req) + crypto_aead_reqsize(tfm), gfp);
 
 	if (likely(req))
diff --git a/include/crypto/akcipher.h b/include/crypto/akcipher.h
index 1d3aa252caba..c06c140d1b7a 100644
--- a/include/crypto/akcipher.h
+++ b/include/crypto/akcipher.h
@@ -158,6 +158,23 @@ static inline unsigned int crypto_akcipher_reqsize(struct crypto_akcipher *tfm)
 	return crypto_akcipher_alg(tfm)->reqsize;
 }
 
+static inline u32 crypto_akcipher_get_flags(struct crypto_akcipher *tfm)
+{
+	return crypto_tfm_get_flags(crypto_akcipher_tfm(tfm));
+}
+
+static inline void crypto_akcipher_set_flags(struct crypto_akcipher *tfm,
+					     u32 flags)
+{
+	crypto_tfm_set_flags(crypto_akcipher_tfm(tfm), flags);
+}
+
+static inline void crypto_akcipher_clear_flags(struct crypto_akcipher *tfm,
+					       u32 flags)
+{
+	crypto_tfm_clear_flags(crypto_akcipher_tfm(tfm), flags);
+}
+
 static inline void akcipher_request_set_tfm(struct akcipher_request *req,
 					    struct crypto_akcipher *tfm)
 {
@@ -193,6 +210,10 @@ static inline struct akcipher_request *akcipher_request_alloc(
 {
 	struct akcipher_request *req;
 
+	if (crypto_akcipher_reqsize(tfm) &&
+	    (crypto_akcipher_get_flags(tfm) & CRYPTO_TFM_REQ_DMA))
+		gfp |= GFP_DMA;
+
 	req = kmalloc(sizeof(*req) + crypto_akcipher_reqsize(tfm), gfp);
 	if (likely(req))
 		akcipher_request_set_tfm(req, tfm);
diff --git a/include/crypto/hash.h b/include/crypto/hash.h
index af2ff31ff619..cb28be54569a 100644
--- a/include/crypto/hash.h
+++ b/include/crypto/hash.h
@@ -599,6 +599,10 @@ static inline struct ahash_request *ahash_request_alloc(
 {
 	struct ahash_request *req;
 
+	if (crypto_ahash_reqsize(tfm) &&
+	    (crypto_ahash_get_flags(tfm) & CRYPTO_TFM_REQ_DMA))
+		gfp |= GFP_DMA;
+
 	req = kmalloc(sizeof(struct ahash_request) +
 		      crypto_ahash_reqsize(tfm), gfp);
 
diff --git a/include/crypto/skcipher.h b/include/crypto/skcipher.h
index 6a733b171a5d..3c598b56628b 100644
--- a/include/crypto/skcipher.h
+++ b/include/crypto/skcipher.h
@@ -493,6 +493,10 @@ static inline struct skcipher_request *skcipher_request_alloc(
 {
 	struct skcipher_request *req;
 
+	if (crypto_skcipher_reqsize(tfm) &&
+	    (crypto_skcipher_get_flags(tfm) & CRYPTO_TFM_REQ_DMA))
+		gfp |= GFP_DMA;
+
 	req = kmalloc(sizeof(struct skcipher_request) +
 		      crypto_skcipher_reqsize(tfm), gfp);
 
diff --git a/include/linux/crypto.h b/include/linux/crypto.h
index ef90e07c9635..87d7f0563c13 100644
--- a/include/linux/crypto.h
+++ b/include/linux/crypto.h
@@ -141,6 +141,7 @@
 #define CRYPTO_TFM_REQ_FORBID_WEAK_KEYS	0x00000100
 #define CRYPTO_TFM_REQ_MAY_SLEEP	0x00000200
 #define CRYPTO_TFM_REQ_MAY_BACKLOG	0x00000400
+#define CRYPTO_TFM_REQ_DMA		0x00000800
 
 /*
  * Miscellaneous stuff.
-- 
2.17.1



* [RFC PATCH 2/4] net: esp: check CRYPTO_TFM_REQ_DMA flag when allocating crypto request
From: Iuliana Prodan (OSS) @ 2020-11-25 21:13 UTC
  To: Herbert Xu, Ard Biesheuvel, David S. Miller, Horia Geanta
  Cc: Aymen Sghaier, Silvano Di Ninno, Franck Lenormand, linux-crypto,
	linux-kernel, linux-imx, Iuliana Prodan

From: Iuliana Prodan <iuliana.prodan@nxp.com>

Some crypto backends might require the requests' private contexts
to be allocated in DMA-able memory.

Signed-off-by: Horia Geanta <horia.geanta@nxp.com>
---
 net/ipv4/esp4.c | 7 ++++++-
 net/ipv6/esp6.c | 7 ++++++-
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index 8b07f3a4f2db..9edfb1012c3d 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -46,6 +46,7 @@ struct esp_output_extra {
 static void *esp_alloc_tmp(struct crypto_aead *aead, int nfrags, int extralen)
 {
 	unsigned int len;
+	gfp_t gfp = GFP_ATOMIC;
 
 	len = extralen;
 
@@ -62,7 +63,11 @@ static void *esp_alloc_tmp(struct crypto_aead *aead, int nfrags, int extralen)
 
 	len += sizeof(struct scatterlist) * nfrags;
 
-	return kmalloc(len, GFP_ATOMIC);
+	if (crypto_aead_reqsize(aead) &&
+	    (crypto_aead_get_flags(aead) & CRYPTO_TFM_REQ_DMA))
+		gfp |= GFP_DMA;
+
+	return kmalloc(len, gfp);
 }
 
 static inline void *esp_tmp_extra(void *tmp)
diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
index 52c2f063529f..e9125e1234b5 100644
--- a/net/ipv6/esp6.c
+++ b/net/ipv6/esp6.c
@@ -63,6 +63,7 @@ struct esp_output_extra {
 static void *esp_alloc_tmp(struct crypto_aead *aead, int nfrags, int seqihlen)
 {
 	unsigned int len;
+	gfp_t gfp = GFP_ATOMIC;
 
 	len = seqihlen;
 
@@ -79,7 +80,11 @@ static void *esp_alloc_tmp(struct crypto_aead *aead, int nfrags, int seqihlen)
 
 	len += sizeof(struct scatterlist) * nfrags;
 
-	return kmalloc(len, GFP_ATOMIC);
+	if (crypto_aead_reqsize(aead) &&
+	    (crypto_aead_get_flags(aead) & CRYPTO_TFM_REQ_DMA))
+		gfp |= GFP_DMA;
+
+	return kmalloc(len, gfp);
 }
 
 static inline void *esp_tmp_extra(void *tmp)
-- 
2.17.1



* [RFC PATCH 3/4] crypto: caam - avoid allocating memory at crypto request runtime for skcipher
From: Iuliana Prodan (OSS) @ 2020-11-25 21:13 UTC
  To: Herbert Xu, Ard Biesheuvel, David S. Miller, Horia Geanta
  Cc: Aymen Sghaier, Silvano Di Ninno, Franck Lenormand, linux-crypto,
	linux-kernel, linux-imx, Iuliana Prodan

From: Iuliana Prodan <iuliana.prodan@nxp.com>

Remove the CRYPTO_ALG_ALLOCATES_MEMORY flag and allocate the memory
needed by the driver to fulfil a request within the crypto
request object.
The extra space needed for the base extended descriptor, the hw
descriptor commands, the link tables and the IV is computed at
frontend driver (caamalg) initialization and saved in the reqsize
field, which indicates how much memory may be needed per request.

The CRYPTO_ALG_ALLOCATES_MEMORY flag matters only for dm-crypt
use cases, which appear to need at most 4 entries.
Therefore, in reqsize, we reserve memory for a maximum of 4 entries
for src plus 1 for the IV, and the same for dst, both aligned.
If the driver needs more than this maximum, the memory is
allocated dynamically, at runtime.

Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com>
---
 drivers/crypto/caam/caamalg.c | 71 +++++++++++++++++++++++++----------
 1 file changed, 52 insertions(+), 19 deletions(-)

diff --git a/drivers/crypto/caam/caamalg.c b/drivers/crypto/caam/caamalg.c
index 8697ae53b063..6ace8545faec 100644
--- a/drivers/crypto/caam/caamalg.c
+++ b/drivers/crypto/caam/caamalg.c
@@ -905,6 +905,7 @@ struct aead_edesc {
  * @iv_dma: dma address of iv for checking continuity and link table
  * @sec4_sg_bytes: length of dma mapped sec4_sg space
  * @bklog: stored to determine if the request needs backlog
+ * @free: stored to determine if skcipher_edesc needs to be freed
  * @sec4_sg_dma: bus physical mapped address of h/w link table
  * @sec4_sg: pointer to h/w link table
  * @hw_desc: the h/w job descriptor followed by any referenced link tables
@@ -918,6 +919,7 @@ struct skcipher_edesc {
 	dma_addr_t iv_dma;
 	int sec4_sg_bytes;
 	bool bklog;
+	bool free;
 	dma_addr_t sec4_sg_dma;
 	struct sec4_sg_entry *sec4_sg;
 	u32 hw_desc[];
@@ -1037,7 +1039,8 @@ static void skcipher_crypt_done(struct device *jrdev, u32 *desc, u32 err,
 		     DUMP_PREFIX_ADDRESS, 16, 4, req->dst,
 		     edesc->dst_nents > 1 ? 100 : req->cryptlen, 1);
 
-	kfree(edesc);
+	if (edesc->free)
+		kfree(edesc);
 
 	/*
 	 * If no backlog flag, the completion of the request is done
@@ -1604,7 +1607,7 @@ static struct skcipher_edesc *skcipher_edesc_alloc(struct skcipher_request *req,
 	dma_addr_t iv_dma = 0;
 	u8 *iv;
 	int ivsize = crypto_skcipher_ivsize(skcipher);
-	int dst_sg_idx, sec4_sg_ents, sec4_sg_bytes;
+	int dst_sg_idx, sec4_sg_ents, sec4_sg_bytes, edesc_size = 0;
 
 	src_nents = sg_nents_for_len(req->src, req->cryptlen);
 	if (unlikely(src_nents < 0)) {
@@ -1675,16 +1678,30 @@ static struct skcipher_edesc *skcipher_edesc_alloc(struct skcipher_request *req,
 
 	sec4_sg_bytes = sec4_sg_ents * sizeof(struct sec4_sg_entry);
 
-	/*
-	 * allocate space for base edesc and hw desc commands, link tables, IV
-	 */
-	edesc = kzalloc(sizeof(*edesc) + desc_bytes + sec4_sg_bytes + ivsize,
-			GFP_DMA | flags);
-	if (!edesc) {
-		dev_err(jrdev, "could not allocate extended descriptor\n");
-		caam_unmap(jrdev, req->src, req->dst, src_nents, dst_nents, 0,
-			   0, 0, 0);
-		return ERR_PTR(-ENOMEM);
+	/* Check if there's enough space for edesc saved in req */
+	edesc_size = sizeof(*edesc) + desc_bytes + sec4_sg_bytes + ivsize;
+	if (edesc_size > (crypto_skcipher_reqsize(skcipher) -
+			  sizeof(struct caam_skcipher_req_ctx))) {
+		/*
+		 * allocate space for base edesc and hw desc commands,
+		 * link tables, IV
+		 */
+		edesc = kzalloc(edesc_size, GFP_DMA | flags);
+		if (!edesc) {
+			caam_unmap(jrdev, req->src, req->dst, src_nents,
+				   dst_nents, 0, 0, 0, 0);
+			return ERR_PTR(-ENOMEM);
+		}
+		edesc->free = true;
+	} else {
+		/*
+		 * get address for base edesc and hw desc commands,
+		 * link tables, IV
+		 */
+		edesc = (struct skcipher_edesc *)((u8 *)rctx +
+			sizeof(struct caam_skcipher_req_ctx));
+		/* clear memory */
+		memset(edesc, 0, sizeof(*edesc));
 	}
 
 	edesc->src_nents = src_nents;
@@ -1764,11 +1781,11 @@ static int skcipher_do_one_req(struct crypto_engine *engine, void *areq)
 
 	if (ret != -EINPROGRESS) {
 		skcipher_unmap(ctx->jrdev, rctx->edesc, req);
-		kfree(rctx->edesc);
+		if (rctx->edesc->free)
+			kfree(rctx->edesc);
 	} else {
 		ret = 0;
 	}
-
 	return ret;
 }
 
@@ -3393,10 +3410,25 @@ static int caam_cra_init(struct crypto_skcipher *tfm)
 		container_of(alg, typeof(*caam_alg), skcipher);
 	struct caam_ctx *ctx = crypto_skcipher_ctx(tfm);
 	u32 alg_aai = caam_alg->caam.class1_alg_type & OP_ALG_AAI_MASK;
-	int ret = 0;
+	int ret = 0, extra_reqsize = 0;
 
 	ctx->enginectx.op.do_one_request = skcipher_do_one_req;
 
+	/*
+	 * Compute extra space needed for base edesc and
+	 * hw desc commands, link tables, IV
+	 */
+	extra_reqsize = sizeof(struct skcipher_edesc) +
+			DESC_JOB_IO_LEN * CAAM_CMD_SZ + /* hw desc commands */
+			/* link tables for src and dst:
+			 * 4 entries max + 1 for IV, aligned = 8
+			 */
+			(16 * sizeof(struct sec4_sg_entry)) +
+			AES_BLOCK_SIZE; /* ivsize */
+
+	/* Need GFP_DMA for extra request size */
+	crypto_skcipher_set_flags(tfm, CRYPTO_TFM_REQ_DMA);
+
 	if (alg_aai == OP_ALG_AAI_XTS) {
 		const char *tfm_name = crypto_tfm_alg_name(&tfm->base);
 		struct crypto_skcipher *fallback;
@@ -3411,9 +3443,11 @@ static int caam_cra_init(struct crypto_skcipher *tfm)
 
 		ctx->fallback = fallback;
 		crypto_skcipher_set_reqsize(tfm, sizeof(struct caam_skcipher_req_ctx) +
-					    crypto_skcipher_reqsize(fallback));
+					    crypto_skcipher_reqsize(fallback) +
+					    extra_reqsize);
 	} else {
-		crypto_skcipher_set_reqsize(tfm, sizeof(struct caam_skcipher_req_ctx));
+		crypto_skcipher_set_reqsize(tfm, sizeof(struct caam_skcipher_req_ctx) +
+					    extra_reqsize);
 	}
 
 	ret = caam_init_common(ctx, &caam_alg->caam, false);
@@ -3486,8 +3520,7 @@ static void caam_skcipher_alg_init(struct caam_skcipher_alg *t_alg)
 	alg->base.cra_module = THIS_MODULE;
 	alg->base.cra_priority = CAAM_CRA_PRIORITY;
 	alg->base.cra_ctxsize = sizeof(struct caam_ctx);
-	alg->base.cra_flags |= (CRYPTO_ALG_ASYNC | CRYPTO_ALG_ALLOCATES_MEMORY |
-			      CRYPTO_ALG_KERN_DRIVER_ONLY);
+	alg->base.cra_flags |= (CRYPTO_ALG_ASYNC | CRYPTO_ALG_KERN_DRIVER_ONLY);
 
 	alg->init = caam_cra_init;
 	alg->exit = caam_cra_exit;
-- 
2.17.1



* [RFC PATCH 4/4] crypto: caam - avoid allocating memory at crypto request runtime for aead
From: Iuliana Prodan (OSS) @ 2020-11-25 21:13 UTC
  To: Herbert Xu, Ard Biesheuvel, David S. Miller, Horia Geanta
  Cc: Aymen Sghaier, Silvano Di Ninno, Franck Lenormand, linux-crypto,
	linux-kernel, linux-imx, Iuliana Prodan

From: Iuliana Prodan <iuliana.prodan@nxp.com>

Remove the CRYPTO_ALG_ALLOCATES_MEMORY flag and allocate the memory
needed by the driver to fulfil a request within the crypto
request object.
The extra space needed for the base extended descriptor, the hw
descriptor commands and the link tables is computed at frontend
driver (caamalg) initialization and saved in the reqsize field,
which indicates how much memory may be needed per request.

The CRYPTO_ALG_ALLOCATES_MEMORY flag matters only for dm-crypt
use cases, which appear to need at most 4 entries.
Therefore, in reqsize, we reserve memory for a maximum of 4 entries
for src and 4 for dst, aligned.
If the driver needs more than this maximum, the memory is
allocated dynamically, at runtime.

Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com>
---
 drivers/crypto/caam/caamalg.c | 59 +++++++++++++++++++++++++++--------
 1 file changed, 46 insertions(+), 13 deletions(-)

diff --git a/drivers/crypto/caam/caamalg.c b/drivers/crypto/caam/caamalg.c
index 6ace8545faec..7038394c41c0 100644
--- a/drivers/crypto/caam/caamalg.c
+++ b/drivers/crypto/caam/caamalg.c
@@ -880,6 +880,7 @@ static int xts_skcipher_setkey(struct crypto_skcipher *skcipher, const u8 *key,
  * @mapped_dst_nents: number of segments in output h/w link table
  * @sec4_sg_bytes: length of dma mapped sec4_sg space
  * @bklog: stored to determine if the request needs backlog
+ * @free: stored to determine if aead_edesc needs to be freed
  * @sec4_sg_dma: bus physical mapped address of h/w link table
  * @sec4_sg: pointer to h/w link table
  * @hw_desc: the h/w job descriptor followed by any referenced link tables
@@ -891,6 +892,7 @@ struct aead_edesc {
 	int mapped_dst_nents;
 	int sec4_sg_bytes;
 	bool bklog;
+	bool free;
 	dma_addr_t sec4_sg_dma;
 	struct sec4_sg_entry *sec4_sg;
 	u32 hw_desc[];
@@ -987,8 +989,8 @@ static void aead_crypt_done(struct device *jrdev, u32 *desc, u32 err,
 		ecode = caam_jr_strstatus(jrdev, err);
 
 	aead_unmap(jrdev, edesc, req);
-
-	kfree(edesc);
+	if (edesc->free)
+		kfree(edesc);
 
 	/*
 	 * If no backlog flag, the completion of the request is done
@@ -1301,7 +1303,7 @@ static struct aead_edesc *aead_edesc_alloc(struct aead_request *req,
 	int src_nents, mapped_src_nents, dst_nents = 0, mapped_dst_nents = 0;
 	int src_len, dst_len = 0;
 	struct aead_edesc *edesc;
-	int sec4_sg_index, sec4_sg_len, sec4_sg_bytes;
+	int sec4_sg_index, sec4_sg_len, sec4_sg_bytes, edesc_size = 0;
 	unsigned int authsize = ctx->authsize;
 
 	if (unlikely(req->dst != req->src)) {
@@ -1381,13 +1383,30 @@ static struct aead_edesc *aead_edesc_alloc(struct aead_request *req,
 
 	sec4_sg_bytes = sec4_sg_len * sizeof(struct sec4_sg_entry);
 
-	/* allocate space for base edesc and hw desc commands, link tables */
-	edesc = kzalloc(sizeof(*edesc) + desc_bytes + sec4_sg_bytes,
-			GFP_DMA | flags);
-	if (!edesc) {
-		caam_unmap(jrdev, req->src, req->dst, src_nents, dst_nents, 0,
-			   0, 0, 0);
-		return ERR_PTR(-ENOMEM);
+	/* Check if there's enough space for edesc saved in req */
+	edesc_size = sizeof(*edesc) + desc_bytes + sec4_sg_bytes;
+	if (edesc_size > (crypto_aead_reqsize(aead) -
+			  sizeof(struct caam_aead_req_ctx))) {
+		/*
+		 * allocate space for base edesc and
+		 * hw desc commands, link tables
+		 */
+		edesc = kzalloc(edesc_size, GFP_DMA | flags);
+		if (!edesc) {
+			caam_unmap(jrdev, req->src, req->dst, src_nents,
+				   dst_nents, 0, 0, 0, 0);
+			return ERR_PTR(-ENOMEM);
+		}
+		edesc->free = true;
+	} else {
+		/*
+		 * get address for base edesc and
+		 * hw desc commands, link tables
+		 */
+		edesc = (struct aead_edesc *)((u8 *)rctx +
+			sizeof(struct caam_aead_req_ctx));
+		/* clear memory */
+		memset(edesc, 0, sizeof(*edesc));
 	}
 
 	edesc->src_nents = src_nents;
@@ -1538,7 +1557,8 @@ static int aead_do_one_req(struct crypto_engine *engine, void *areq)
 
 	if (ret != -EINPROGRESS) {
 		aead_unmap(ctx->jrdev, rctx->edesc, req);
-		kfree(rctx->edesc);
+		if (rctx->edesc->free)
+			kfree(rctx->edesc);
 	} else {
 		ret = 0;
 	}
@@ -3463,6 +3483,20 @@ static int caam_aead_init(struct crypto_aead *tfm)
 	struct caam_aead_alg *caam_alg =
 		 container_of(alg, struct caam_aead_alg, aead);
 	struct caam_ctx *ctx = crypto_aead_ctx(tfm);
+	int extra_reqsize = 0;
+
+	/*
+	 * Compute extra space needed for base edesc and
+	 * hw desc commands, link tables, IV
+	 */
+	extra_reqsize = sizeof(struct aead_edesc) +
+			 /* max size for hw desc commands */
+			(AEAD_DESC_JOB_IO_LEN + CAAM_CMD_SZ * 6) +
+			/* link tables for src and dst, 4 entries max, aligned */
+			(8 * sizeof(struct sec4_sg_entry));
+
+	/* Need GFP_DMA for extra request size */
+	crypto_aead_set_flags(tfm, CRYPTO_TFM_REQ_DMA);
 
 	crypto_aead_set_reqsize(tfm, sizeof(struct caam_aead_req_ctx));
 
@@ -3533,8 +3567,7 @@ static void caam_aead_alg_init(struct caam_aead_alg *t_alg)
 	alg->base.cra_module = THIS_MODULE;
 	alg->base.cra_priority = CAAM_CRA_PRIORITY;
 	alg->base.cra_ctxsize = sizeof(struct caam_ctx);
-	alg->base.cra_flags = CRYPTO_ALG_ASYNC | CRYPTO_ALG_ALLOCATES_MEMORY |
-			      CRYPTO_ALG_KERN_DRIVER_ONLY;
+	alg->base.cra_flags = CRYPTO_ALG_ASYNC | CRYPTO_ALG_KERN_DRIVER_ONLY;
 
 	alg->init = caam_aead_init;
 	alg->exit = caam_aead_exit;
-- 
2.17.1



* Re: [RFC PATCH 0/4] crypto: add CRYPTO_TFM_REQ_DMA flag
From: Ard Biesheuvel @ 2020-11-25 21:16 UTC
  To: Iuliana Prodan (OSS)
  Cc: Herbert Xu, Ard Biesheuvel, David S. Miller, Horia Geanta,
	Aymen Sghaier, Silvano Di Ninno, Franck Lenormand,
	Linux Crypto Mailing List, Linux Kernel Mailing List, linux-imx,
	Iuliana Prodan

On Wed, 25 Nov 2020 at 22:14, Iuliana Prodan (OSS)
<iuliana.prodan@oss.nxp.com> wrote:
>
> From: Iuliana Prodan <iuliana.prodan@nxp.com>
>
> Add the option to allocate the crypto request object plus any extra space
> needed by the driver into a DMA-able memory.
>
> Add CRYPTO_TFM_REQ_DMA flag to be used by backend implementations to
> indicate to crypto API the need to allocate GFP_DMA memory
> for private contexts of the crypto requests.
>

These are always directional DMA mappings, right? So why can't we use
bounce buffering here?

> For IPsec use cases, CRYPTO_TFM_REQ_DMA flag is also checked in
> esp_alloc_tmp() function for IPv4 and IPv6.
>
> This series includes an example of how a driver can use
> CRYPTO_TFM_REQ_DMA flag while setting reqsize to a larger value
> to avoid allocating memory at crypto request runtime.
> The extra size needed by the driver is added to the reqsize field
> that indicates how much memory could be needed per request.
>
> Iuliana Prodan (4):
>   crypto: add CRYPTO_TFM_REQ_DMA flag
>   net: esp: check CRYPTO_TFM_REQ_DMA flag when allocating crypto request
>   crypto: caam - avoid allocating memory at crypto request runtime for
>     skcipher
>   crypto: caam - avoid allocating memory at crypto request runtime for
>     aead
>
>  drivers/crypto/caam/caamalg.c | 130 +++++++++++++++++++++++++---------
>  include/crypto/aead.h         |   4 ++
>  include/crypto/akcipher.h     |  21 ++++++
>  include/crypto/hash.h         |   4 ++
>  include/crypto/skcipher.h     |   4 ++
>  include/linux/crypto.h        |   1 +
>  net/ipv4/esp4.c               |   7 +-
>  net/ipv6/esp6.c               |   7 +-
>  8 files changed, 144 insertions(+), 34 deletions(-)
>
> --
> 2.17.1
>


* Re: [RFC PATCH 0/4] crypto: add CRYPTO_TFM_REQ_DMA flag
From: Iuliana Prodan @ 2020-11-25 21:39 UTC
  To: Ard Biesheuvel, Iuliana Prodan (OSS)
  Cc: Herbert Xu, Ard Biesheuvel, David S. Miller, Horia Geanta,
	Aymen Sghaier, Silvano Di Ninno, Franck Lenormand,
	Linux Crypto Mailing List, Linux Kernel Mailing List, linux-imx

On 11/25/2020 11:16 PM, Ard Biesheuvel wrote:
> On Wed, 25 Nov 2020 at 22:14, Iuliana Prodan (OSS)
> <iuliana.prodan@oss.nxp.com> wrote:
>>
>> From: Iuliana Prodan <iuliana.prodan@nxp.com>
>>
>> Add the option to allocate the crypto request object plus any extra space
>> needed by the driver into a DMA-able memory.
>>
>> Add CRYPTO_TFM_REQ_DMA flag to be used by backend implementations to
>> indicate to crypto API the need to allocate GFP_DMA memory
>> for private contexts of the crypto requests.
>>
> 
> These are always directional DMA mappings, right? So why can't we use
> bounce buffering here?
> 
The idea was to avoid allocating any memory in crypto drivers.
We want to be able to use dm-crypt with CAAM, which needs DMA-able
memory, and increasing reqsize alone is not enough.
It started from here:
https://lore.kernel.org/linux-crypto/71b6f739-d4a8-8b26-bf78-ce9acf9a0f99@nxp.com/T/#m39684173a2f0f4b83d8bcbec223e98169273d1e4

>> For IPsec use cases, CRYPTO_TFM_REQ_DMA flag is also checked in
>> esp_alloc_tmp() function for IPv4 and IPv6.
>>
>> This series includes an example of how a driver can use
>> CRYPTO_TFM_REQ_DMA flag while setting reqsize to a larger value
>> to avoid allocating memory at crypto request runtime.
>> The extra size needed by the driver is added to the reqsize field
>> that indicates how much memory could be needed per request.
>>
>> Iuliana Prodan (4):
>>    crypto: add CRYPTO_TFM_REQ_DMA flag
>>    net: esp: check CRYPTO_TFM_REQ_DMA flag when allocating crypto request
>>    crypto: caam - avoid allocating memory at crypto request runtime for
>>      skcipher
>>    crypto: caam - avoid allocating memory at crypto request runtime for
>>      aead
>>
>>   drivers/crypto/caam/caamalg.c | 130 +++++++++++++++++++++++++---------
>>   include/crypto/aead.h         |   4 ++
>>   include/crypto/akcipher.h     |  21 ++++++
>>   include/crypto/hash.h         |   4 ++
>>   include/crypto/skcipher.h     |   4 ++
>>   include/linux/crypto.h        |   1 +
>>   net/ipv4/esp4.c               |   7 +-
>>   net/ipv6/esp6.c               |   7 +-
>>   8 files changed, 144 insertions(+), 34 deletions(-)
>>
>> --
>> 2.17.1
>>


* Re: [RFC PATCH 4/4] crypto: caam - avoid allocating memory at crypto request runtime for aead
From: kernel test robot @ 2020-11-25 23:32 UTC
  To: kbuild-all

Hi "Iuliana,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on cryptodev/master]
[also build test WARNING on crypto/master ipsec-next/master ipsec/master v5.10-rc5 next-20201125]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patches, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Iuliana-Prodan-OSS/crypto-add-CRYPTO_TFM_REQ_DMA-flag/20201126-051830
base:   https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git master
config: arm-defconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/d5dfd7856a3f3e56e4b7da2b5458f2a38847f913
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Iuliana-Prodan-OSS/crypto-add-CRYPTO_TFM_REQ_DMA-flag/20201126-051830
        git checkout d5dfd7856a3f3e56e4b7da2b5458f2a38847f913
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm 

If you fix the issue, kindly add the following tag as appropriate:
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/crypto/caam/caamalg.c: In function 'caam_aead_init':
>> drivers/crypto/caam/caamalg.c:3486:6: warning: variable 'extra_reqsize' set but not used [-Wunused-but-set-variable]
    3486 |  int extra_reqsize = 0;
         |      ^~~~~~~~~~~~~

vim +/extra_reqsize +3486 drivers/crypto/caam/caamalg.c

  3479	
  3480	static int caam_aead_init(struct crypto_aead *tfm)
  3481	{
  3482		struct aead_alg *alg = crypto_aead_alg(tfm);
  3483		struct caam_aead_alg *caam_alg =
  3484			 container_of(alg, struct caam_aead_alg, aead);
  3485		struct caam_ctx *ctx = crypto_aead_ctx(tfm);
> 3486		int extra_reqsize = 0;
  3487	
  3488		/*
  3489		 * Compute extra space needed for base edesc and
  3490		 * hw desc commands, link tables, IV
  3491		 */
  3492		extra_reqsize = sizeof(struct aead_edesc) +
  3493				 /* max size for hw desc commands */
  3494				(AEAD_DESC_JOB_IO_LEN + CAAM_CMD_SZ * 6) +
  3495				/* link tables for src and dst, 4 entries max, aligned */
  3496				(8 * sizeof(struct sec4_sg_entry));
  3497	
  3498		/* Need GFP_DMA for extra request size */
  3499		crypto_aead_set_flags(tfm, CRYPTO_TFM_REQ_DMA);
  3500	
  3501		crypto_aead_set_reqsize(tfm, sizeof(struct caam_aead_req_ctx));
  3502	
  3503		ctx->enginectx.op.do_one_request = aead_do_one_req;
  3504	
  3505		return caam_init_common(ctx, &caam_alg->caam, !caam_alg->caam.nodkp);
  3506	}
  3507	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org



* Re: [RFC PATCH 0/4] crypto: add CRYPTO_TFM_REQ_DMA flag
From: Ard Biesheuvel @ 2020-11-26  7:09 UTC
  To: Iuliana Prodan
  Cc: Iuliana Prodan (OSS),
	Herbert Xu, Ard Biesheuvel, David S. Miller, Horia Geanta,
	Aymen Sghaier, Silvano Di Ninno, Franck Lenormand,
	Linux Crypto Mailing List, Linux Kernel Mailing List, linux-imx

On Wed, 25 Nov 2020 at 22:39, Iuliana Prodan <iuliana.prodan@nxp.com> wrote:
>
> On 11/25/2020 11:16 PM, Ard Biesheuvel wrote:
> > On Wed, 25 Nov 2020 at 22:14, Iuliana Prodan (OSS)
> > <iuliana.prodan@oss.nxp.com> wrote:
> >>
> >> From: Iuliana Prodan <iuliana.prodan@nxp.com>
> >>
> >> Add the option to allocate the crypto request object plus any extra space
> >> needed by the driver into a DMA-able memory.
> >>
> >> Add CRYPTO_TFM_REQ_DMA flag to be used by backend implementations to
> >> indicate to crypto API the need to allocate GFP_DMA memory
> >> for private contexts of the crypto requests.
> >>
> >
> > These are always directional DMA mappings, right? So why can't we use
> > bounce buffering here?
> >
> The idea was to avoid allocating any memory in crypto drivers.
> We want to be able to use dm-crypt with CAAM, which needs DMA-able
> memory and increasing reqsize is not enough.

But what does 'needs DMA-able memory' mean? DMA operations are
asynchronous by definition, and so the DMA layer should be able to
allocate bounce buffers when needed. This will cost some performance
in cases where the hardware cannot address all of memory directly, but
this is a consequence of the design, and I don't think we should
burden the generic API with this.
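
As a sketch of that alternative for a directional mapping (dev, buf
and len are placeholders here):

	/*
	 * dma_map_single() transparently bounce-buffers (e.g. via swiotlb)
	 * when buf lies outside the device's DMA mask, so the driver needs
	 * no GFP_DMA allocation of its own.
	 */
	dma_addr_t handle = dma_map_single(dev, buf, len, DMA_TO_DEVICE);

	if (dma_mapping_error(dev, handle))
		return -ENOMEM;

	/* ... the device consumes the (possibly bounced) buffer ... */
	dma_unmap_single(dev, handle, len, DMA_TO_DEVICE);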


> It started from here
> https://lore.kernel.org/linux-crypto/71b6f739-d4a8-8b26-bf78-ce9acf9a0f99@nxp.com/T/#m39684173a2f0f4b83d8bcbec223e98169273d1e4
>
> >> For IPsec use cases, CRYPTO_TFM_REQ_DMA flag is also checked in
> >> esp_alloc_tmp() function for IPv4 and IPv6.
> >>
> >> This series includes an example of how a driver can use
> >> CRYPTO_TFM_REQ_DMA flag while setting reqsize to a larger value
> >> to avoid allocating memory at crypto request runtime.
> >> The extra size needed by the driver is added to the reqsize field
> >> that indicates how much memory could be needed per request.
> >>
> >> Iuliana Prodan (4):
> >>    crypto: add CRYPTO_TFM_REQ_DMA flag
> >>    net: esp: check CRYPTO_TFM_REQ_DMA flag when allocating crypto request
> >>    crypto: caam - avoid allocating memory at crypto request runtime for
> >>      skcipher
> >>    crypto: caam - avoid allocating memory at crypto request runtime for
> >>      aead
> >>
> >>   drivers/crypto/caam/caamalg.c | 130 +++++++++++++++++++++++++---------
> >>   include/crypto/aead.h         |   4 ++
> >>   include/crypto/akcipher.h     |  21 ++++++
> >>   include/crypto/hash.h         |   4 ++
> >>   include/crypto/skcipher.h     |   4 ++
> >>   include/linux/crypto.h        |   1 +
> >>   net/ipv4/esp4.c               |   7 +-
> >>   net/ipv6/esp6.c               |   7 +-
> >>   8 files changed, 144 insertions(+), 34 deletions(-)
> >>
> >> --
> >> 2.17.1
> >>


* Re: [RFC PATCH 0/4] crypto: add CRYPTO_TFM_REQ_DMA flag
From: Iuliana Prodan @ 2020-11-26 16:00 UTC
  To: Ard Biesheuvel, Horia Geanta
  Cc: Herbert Xu, Ard Biesheuvel, David S. Miller, Aymen Sghaier,
	Silvano Di Ninno, Franck Lenormand, Linux Crypto Mailing List,
	Linux Kernel Mailing List, linux-imx, Iuliana Prodan (OSS)

On 11/26/2020 9:09 AM, Ard Biesheuvel wrote:
> On Wed, 25 Nov 2020 at 22:39, Iuliana Prodan <iuliana.prodan@nxp.com> wrote:
>>
>> On 11/25/2020 11:16 PM, Ard Biesheuvel wrote:
>>> On Wed, 25 Nov 2020 at 22:14, Iuliana Prodan (OSS)
>>> <iuliana.prodan@oss.nxp.com> wrote:
>>>>
>>>> From: Iuliana Prodan <iuliana.prodan@nxp.com>
>>>>
>>>> Add the option to allocate the crypto request object plus any extra space
>>>> needed by the driver into a DMA-able memory.
>>>>
>>>> Add CRYPTO_TFM_REQ_DMA flag to be used by backend implementations to
>>>> indicate to crypto API the need to allocate GFP_DMA memory
>>>> for private contexts of the crypto requests.
>>>>
>>>
>>> These are always directional DMA mappings, right? So why can't we use
>>> bounce buffering here?
>>>
>> The idea was to avoid allocating any memory in crypto drivers.
>> We want to be able to use dm-crypt with CAAM, which needs DMA-able
>> memory and increasing reqsize is not enough.
> 
> But what does 'needs DMA-able memory' mean? DMA operations are
> asynchronous by definition, and so the DMA layer should be able to
> allocate bounce buffers when needed. This will cost some performance
> in cases where the hardware cannot address all of memory directly, but
> this is a consequence of the design, and I don't think we should
> burden the generic API with this.
> 
Ard, I believe you're right.

In CAAM, for req->src and req->dst, which come from the crypto request,
we use DMA mappings without knowing whether the memory is DMA-able or
not.

We should do the same for CAAM's hw descriptor commands and link
tables. That's the extra memory allocated by increasing reqsize.

Horia, do you see any limitations in CAAM that would prevent using the
above approach?


>> It started from here
>> https://lore.kernel.org/linux-crypto/71b6f739-d4a8-8b26-bf78-ce9acf9a0f99@nxp.com/T/#m39684173a2f0f4b83d8bcbec223e98169273d1e4
>>
>>>> For IPsec use cases, CRYPTO_TFM_REQ_DMA flag is also checked in
>>>> esp_alloc_tmp() function for IPv4 and IPv6.
>>>>
>>>> This series includes an example of how a driver can use
>>>> CRYPTO_TFM_REQ_DMA flag while setting reqsize to a larger value
>>>> to avoid allocating memory at crypto request runtime.
>>>> The extra size needed by the driver is added to the reqsize field
>>>> that indicates how much memory could be needed per request.
>>>>
>>>> Iuliana Prodan (4):
>>>>     crypto: add CRYPTO_TFM_REQ_DMA flag
>>>>     net: esp: check CRYPTO_TFM_REQ_DMA flag when allocating crypto request
>>>>     crypto: caam - avoid allocating memory at crypto request runtime for
>>>>       skcipher
>>>>     crypto: caam - avoid allocating memory at crypto request runtime for
>>>>       aead
>>>>
>>>>    drivers/crypto/caam/caamalg.c | 130 +++++++++++++++++++++++++---------
>>>>    include/crypto/aead.h         |   4 ++
>>>>    include/crypto/akcipher.h     |  21 ++++++
>>>>    include/crypto/hash.h         |   4 ++
>>>>    include/crypto/skcipher.h     |   4 ++
>>>>    include/linux/crypto.h        |   1 +
>>>>    net/ipv4/esp4.c               |   7 +-
>>>>    net/ipv6/esp6.c               |   7 +-
>>>>    8 files changed, 144 insertions(+), 34 deletions(-)
>>>>
>>>> --
>>>> 2.17.1
>>>>


* Re: [RFC PATCH 0/4] crypto: add CRYPTO_TFM_REQ_DMA flag
From: Ard Biesheuvel @ 2020-11-26 17:12 UTC
  To: Iuliana Prodan
  Cc: Horia Geanta, Herbert Xu, Ard Biesheuvel, David S. Miller,
	Aymen Sghaier, Silvano Di Ninno, Franck Lenormand,
	Linux Crypto Mailing List, Linux Kernel Mailing List, linux-imx,
	Iuliana Prodan (OSS)

On Thu, 26 Nov 2020 at 17:00, Iuliana Prodan <iuliana.prodan@nxp.com> wrote:
>
> On 11/26/2020 9:09 AM, Ard Biesheuvel wrote:
> > On Wed, 25 Nov 2020 at 22:39, Iuliana Prodan <iuliana.prodan@nxp.com> wrote:
> >>
> >> On 11/25/2020 11:16 PM, Ard Biesheuvel wrote:
> >>> On Wed, 25 Nov 2020 at 22:14, Iuliana Prodan (OSS)
> >>> <iuliana.prodan@oss.nxp.com> wrote:
> >>>>
> >>>> From: Iuliana Prodan <iuliana.prodan@nxp.com>
> >>>>
> >>>> Add the option to allocate the crypto request object plus any extra space
> >>>> needed by the driver into a DMA-able memory.
> >>>>
> >>>> Add CRYPTO_TFM_REQ_DMA flag to be used by backend implementations to
> >>>> indicate to crypto API the need to allocate GFP_DMA memory
> >>>> for private contexts of the crypto requests.
> >>>>
> >>>
> >>> These are always directional DMA mappings, right? So why can't we use
> >>> bounce buffering here?
> >>>
> >> The idea was to avoid allocating any memory in crypto drivers.
> >> We want to be able to use dm-crypt with CAAM, which needs DMA-able
> >> memory and increasing reqsize is not enough.
> >
> > But what does 'needs DMA-able memory' mean? DMA operations are
> > asynchronous by definition, and so the DMA layer should be able to
> > allocate bounce buffers when needed. This will cost some performance
> > in cases where the hardware cannot address all of memory directly, but
> > this is a consequence of the design, and I don't think we should
> > burden the generic API with this.
> >
> Ard, I believe you're right.
>
> In CAAM, for req->src and req->dst, which comes from crypto request, we
> use DMA mappings without knowing if the memory is DMAable or not.
>
> We should do the same for CAAM's hw descriptors commands and link
> tables. That's the extra memory allocated by increasing reqsize.
>

It depends on whether any such mappings are non-directional. But I
would not expect per-request mappings to be modifiable by both the CPU
and the device at the same time.


> Horia, do you see any limitations, in CAAM, for not using the above
> approach?
>
>
> >> It started from here
> >> https://lore.kernel.org/linux-crypto/71b6f739-d4a8-8b26-bf78-ce9acf9a0f99@nxp.com/T/#m39684173a2f0f4b83d8bcbec223e98169273d1e4
> >>
> >>>> For IPsec use cases, CRYPTO_TFM_REQ_DMA flag is also checked in
> >>>> esp_alloc_tmp() function for IPv4 and IPv6.
> >>>>
> >>>> This series includes an example of how a driver can use
> >>>> CRYPTO_TFM_REQ_DMA flag while setting reqsize to a larger value
> >>>> to avoid allocating memory at crypto request runtime.
> >>>> The extra size needed by the driver is added to the reqsize field
> >>>> that indicates how much memory could be needed per request.
> >>>>
> >>>> Iuliana Prodan (4):
> >>>>     crypto: add CRYPTO_TFM_REQ_DMA flag
> >>>>     net: esp: check CRYPTO_TFM_REQ_DMA flag when allocating crypto request
> >>>>     crypto: caam - avoid allocating memory at crypto request runtime for
> >>>>       skcipher
> >>>>     crypto: caam - avoid allocating memory at crypto request runtime for
> >>>>       aead
> >>>>
> >>>>    drivers/crypto/caam/caamalg.c | 130 +++++++++++++++++++++++++---------
> >>>>    include/crypto/aead.h         |   4 ++
> >>>>    include/crypto/akcipher.h     |  21 ++++++
> >>>>    include/crypto/hash.h         |   4 ++
> >>>>    include/crypto/skcipher.h     |   4 ++
> >>>>    include/linux/crypto.h        |   1 +
> >>>>    net/ipv4/esp4.c               |   7 +-
> >>>>    net/ipv6/esp6.c               |   7 +-
> >>>>    8 files changed, 144 insertions(+), 34 deletions(-)
> >>>>
> >>>> --
> >>>> 2.17.1
> >>>>


* Re: [RFC PATCH 0/4] crypto: add CRYPTO_TFM_REQ_DMA flag
From: Iuliana Prodan @ 2020-11-26 18:21 UTC
  To: Ard Biesheuvel
  Cc: Horia Geanta, Herbert Xu, Ard Biesheuvel, David S. Miller,
	Aymen Sghaier, Silvano Di Ninno, Franck Lenormand,
	Linux Crypto Mailing List, Linux Kernel Mailing List, linux-imx,
	Iuliana Prodan (OSS)

On 11/26/2020 7:12 PM, Ard Biesheuvel wrote:
> On Thu, 26 Nov 2020 at 17:00, Iuliana Prodan <iuliana.prodan@nxp.com> wrote:
>>
>> On 11/26/2020 9:09 AM, Ard Biesheuvel wrote:
>>> On Wed, 25 Nov 2020 at 22:39, Iuliana Prodan <iuliana.prodan@nxp.com> wrote:
>>>>
>>>> On 11/25/2020 11:16 PM, Ard Biesheuvel wrote:
>>>>> On Wed, 25 Nov 2020 at 22:14, Iuliana Prodan (OSS)
>>>>> <iuliana.prodan@oss.nxp.com> wrote:
>>>>>>
>>>>>> From: Iuliana Prodan <iuliana.prodan@nxp.com>
>>>>>>
>>>>>> Add the option to allocate the crypto request object plus any extra space
>>>>>> needed by the driver into a DMA-able memory.
>>>>>>
>>>>>> Add CRYPTO_TFM_REQ_DMA flag to be used by backend implementations to
>>>>>> indicate to crypto API the need to allocate GFP_DMA memory
>>>>>> for private contexts of the crypto requests.
>>>>>>
>>>>>
>>>>> These are always directional DMA mappings, right? So why can't we use
>>>>> bounce buffering here?
>>>>>
>>>> The idea was to avoid allocating any memory in crypto drivers.
>>>> We want to be able to use dm-crypt with CAAM, which needs DMA-able
>>>> memory and increasing reqsize is not enough.
>>>
>>> But what does 'needs DMA-able memory' mean? DMA operations are
>>> asynchronous by definition, and so the DMA layer should be able to
>>> allocate bounce buffers when needed. This will cost some performance
>>> in cases where the hardware cannot address all of memory directly, but
>>> this is a consequence of the design, and I don't think we should
>>> burden the generic API with this.
>>>
>> Ard, I believe you're right.
>>
>> In CAAM, for req->src and req->dst, which comes from crypto request, we
>> use DMA mappings without knowing if the memory is DMAable or not.
>>
>> We should do the same for CAAM's hw descriptors commands and link
>> tables. That's the extra memory allocated by increasing reqsize.
>>
> 
> It depends on whether any such mappings are non-directional. But I
> would not expect per-request mappings to be modifiable by both the CPU
> and the device at the same time.
> 
There are bidirectional mappings on req->src (if it's also used for
output) and on the IV (if it exists).
But these are not modified by the CPU and CAAM at the same time.

> 
>> Horia, do you see any limitations, in CAAM, for not using the above
>> approach?
>>
>>
>>>> It started from here
>>>> https://lore.kernel.org/linux-crypto/71b6f739-d4a8-8b26-bf78-ce9acf9a0f99@nxp.com/T/#m39684173a2f0f4b83d8bcbec223e98169273d1e4
>>>>
>>>>>> For IPsec use cases, CRYPTO_TFM_REQ_DMA flag is also checked in
>>>>>> esp_alloc_tmp() function for IPv4 and IPv6.
>>>>>>
>>>>>> This series includes an example of how a driver can use
>>>>>> CRYPTO_TFM_REQ_DMA flag while setting reqsize to a larger value
>>>>>> to avoid allocating memory at crypto request runtime.
>>>>>> The extra size needed by the driver is added to the reqsize field
>>>>>> that indicates how much memory could be needed per request.
>>>>>>
>>>>>> Iuliana Prodan (4):
>>>>>>      crypto: add CRYPTO_TFM_REQ_DMA flag
>>>>>>      net: esp: check CRYPTO_TFM_REQ_DMA flag when allocating crypto request
>>>>>>      crypto: caam - avoid allocating memory at crypto request runtime for
>>>>>>        skcipher
>>>>>>      crypto: caam - avoid allocating memory at crypto request runtime for
>>>>>>        aead
>>>>>>
>>>>>>     drivers/crypto/caam/caamalg.c | 130 +++++++++++++++++++++++++---------
>>>>>>     include/crypto/aead.h         |   4 ++
>>>>>>     include/crypto/akcipher.h     |  21 ++++++
>>>>>>     include/crypto/hash.h         |   4 ++
>>>>>>     include/crypto/skcipher.h     |   4 ++
>>>>>>     include/linux/crypto.h        |   1 +
>>>>>>     net/ipv4/esp4.c               |   7 +-
>>>>>>     net/ipv6/esp6.c               |   7 +-
>>>>>>     8 files changed, 144 insertions(+), 34 deletions(-)
>>>>>>
>>>>>> --
>>>>>> 2.17.1
>>>>>>


* Re: [RFC PATCH 0/4] crypto: add CRYPTO_TFM_REQ_DMA flag
From: Horia Geantă @ 2020-12-07 13:49 UTC
  To: Ard Biesheuvel, Iuliana Prodan
  Cc: Iuliana Prodan (OSS),
	Herbert Xu, Ard Biesheuvel, David S. Miller, Aymen Sghaier,
	Silvano Di Ninno, Franck Lenormand, Linux Crypto Mailing List,
	Linux Kernel Mailing List, dl-linux-imx

On 11/26/2020 9:09 AM, Ard Biesheuvel wrote:
> On Wed, 25 Nov 2020 at 22:39, Iuliana Prodan <iuliana.prodan@nxp.com> wrote:
>>
>> On 11/25/2020 11:16 PM, Ard Biesheuvel wrote:
>>> On Wed, 25 Nov 2020 at 22:14, Iuliana Prodan (OSS)
>>> <iuliana.prodan@oss.nxp.com> wrote:
>>>>
>>>> From: Iuliana Prodan <iuliana.prodan@nxp.com>
>>>>
>>>> Add the option to allocate the crypto request object plus any extra space
>>>> needed by the driver into a DMA-able memory.
>>>>
>>>> Add CRYPTO_TFM_REQ_DMA flag to be used by backend implementations to
>>>> indicate to crypto API the need to allocate GFP_DMA memory
>>>> for private contexts of the crypto requests.
>>>>
>>>
>>> These are always directional DMA mappings, right? So why can't we use
>>> bounce buffering here?
>>>
>> The idea was to avoid allocating any memory in crypto drivers.
>> We want to be able to use dm-crypt with CAAM, which needs DMA-able
>> memory and increasing reqsize is not enough.
> 
> But what does 'needs DMA-able memory' mean? DMA operations are
> asynchronous by definition, and so the DMA layer should be able to
> allocate bounce buffers when needed. This will cost some performance
> in cases where the hardware cannot address all of memory directly, but
> this is a consequence of the design, and I don't think we should
> burden the generic API with this.
> 
The performance loss due to bounce buffering is non-negligible.
Previous experiments we ran showed a 35% gain when forcing all data,
including I/O buffers, into ZONE_DMA32.

I don't have the exact numbers for the bounce buffering introduced by
allowing only the control data structures (descriptors etc.) in high
memory, but I don't think it's fair to dismiss this topic so easily,
given the big performance drop and the relatively low impact
on the generic API.

Thanks,
Horia


* Re: [RFC PATCH 0/4] crypto: add CRYPTO_TFM_REQ_DMA flag
From: Ard Biesheuvel @ 2020-12-08  7:43 UTC
  To: Horia Geantă
  Cc: Iuliana Prodan, Iuliana Prodan (OSS),
	Herbert Xu, Ard Biesheuvel, David S. Miller, Aymen Sghaier,
	Silvano Di Ninno, Franck Lenormand, Linux Crypto Mailing List,
	Linux Kernel Mailing List, dl-linux-imx

On Mon, 7 Dec 2020 at 14:50, Horia Geantă <horia.geanta@nxp.com> wrote:
>
> On 11/26/2020 9:09 AM, Ard Biesheuvel wrote:
> > On Wed, 25 Nov 2020 at 22:39, Iuliana Prodan <iuliana.prodan@nxp.com> wrote:
> >>
> >> On 11/25/2020 11:16 PM, Ard Biesheuvel wrote:
> >>> On Wed, 25 Nov 2020 at 22:14, Iuliana Prodan (OSS)
> >>> <iuliana.prodan@oss.nxp.com> wrote:
> >>>>
> >>>> From: Iuliana Prodan <iuliana.prodan@nxp.com>
> >>>>
> >>>> Add the option to allocate the crypto request object plus any extra space
> >>>> needed by the driver into a DMA-able memory.
> >>>>
> >>>> Add CRYPTO_TFM_REQ_DMA flag to be used by backend implementations to
> >>>> indicate to crypto API the need to allocate GFP_DMA memory
> >>>> for private contexts of the crypto requests.
> >>>>
> >>>
> >>> These are always directional DMA mappings, right? So why can't we use
> >>> bounce buffering here?
> >>>
> >> The idea was to avoid allocating any memory in crypto drivers.
> >> We want to be able to use dm-crypt with CAAM, which needs DMA-able
> >> memory and increasing reqsize is not enough.
> >
> > But what does 'needs DMA-able memory' mean? DMA operations are
> > asynchronous by definition, and so the DMA layer should be able to
> > allocate bounce buffers when needed. This will cost some performance
> > in cases where the hardware cannot address all of memory directly, but
> > this is a consequence of the design, and I don't think we should
> > burden the generic API with this.
> >
> The performance loss due to bounce buffering is non-negligible.
> Previous experiments we did showed a 35% gain (when forcing all data,
> including I/O buffers, in ZONE_DMA32).
>
> I don't have the exact numbers for bounce buffering introduced by allowing
> only by the control data structures (descriptors etc.) in high memory,
> but I don't think it's fair to easily dismiss this topic,
> given the big performance drop and relatively low impact
> on the generic API.
>

It is not about the impact on the API. It is about the layering
violation: all masters in a system will be affected by DMA addressing
limitations, and all will be affected by the performance impact of
bounce buffering when it is needed. DMA-accessible memory is generally
'owned' by the DMA layer so it can be used for bounce buffering for
all masters. If one device starts claiming DMA-able memory for its own
use, other masters could be adversely affected, given that they may
not be able to do DMA at all (not even via bounce buffers) once a
single master uses up all DMA-able memory.


Thread overview: 14 messages
2020-11-25 21:13 [RFC PATCH 0/4] crypto: add CRYPTO_TFM_REQ_DMA flag Iuliana Prodan (OSS)
2020-11-25 21:13 ` [RFC PATCH 1/4] " Iuliana Prodan (OSS)
2020-11-25 21:13 ` [RFC PATCH 2/4] net: esp: check CRYPTO_TFM_REQ_DMA flag when allocating crypto request Iuliana Prodan (OSS)
2020-11-25 21:13 ` [RFC PATCH 3/4] crypto: caam - avoid allocating memory at crypto request runtime for skcipher Iuliana Prodan (OSS)
2020-11-25 21:13 ` [RFC PATCH 4/4] crypto: caam - avoid allocating memory at crypto request runtime for aead Iuliana Prodan (OSS)
2020-11-25 23:32   ` kernel test robot
2020-11-25 21:16 ` [RFC PATCH 0/4] crypto: add CRYPTO_TFM_REQ_DMA flag Ard Biesheuvel
2020-11-25 21:39   ` Iuliana Prodan
2020-11-26  7:09     ` Ard Biesheuvel
2020-11-26 16:00       ` Iuliana Prodan
2020-11-26 17:12         ` Ard Biesheuvel
2020-11-26 18:21           ` Iuliana Prodan
2020-12-07 13:49       ` Horia Geantă
2020-12-08  7:43         ` Ard Biesheuvel
