* [PATCH 0/7] Chain crypto requests together at the DMA level
@ 2016-06-15 19:15 ` Romain Perier
From: Romain Perier @ 2016-06-15 19:15 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: Gregory Clement, Thomas Petazzoni, David S. Miller, Russell King,
	linux-crypto, linux-arm-kernel

The Cryptographic Engines and Security Accelerators (CESA) support a
TDMA chained mode. When this mode is enabled and crypto requests are
chained at the DMA level, multiple crypto requests can be handled by
the hardware engine without requiring any software intervention. This
approach limits the number of interrupts generated by the engines,
thus improving their throughput and making the whole system behave
nicely under heavy crypto load.
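
To illustrate the idea (this is a standalone sketch, not code from the
series; all names are made up), chaining two requests at the DMA level
amounts to linking the last descriptor of the queued chain to the first
descriptor of the incoming one, so the engine walks both lists back to
back and raises a single interrupt at the end:

#include <stddef.h>
#include <stdio.h>

struct tdma_desc {
	unsigned int byte_cnt;
	struct tdma_desc *next;	/* in hardware this is a DMA address */
};

struct tdma_chain {
	struct tdma_desc *first;
	struct tdma_desc *last;
};

/* Append the chain of an incoming request to the chain already queued. */
static void tdma_chain_requests(struct tdma_chain *queued,
				struct tdma_chain *incoming)
{
	if (!queued->first) {
		*queued = *incoming;
		return;
	}
	queued->last->next = incoming->first;
	queued->last = incoming->last;
}

int main(void)
{
	struct tdma_desc a = { 16, NULL }, b = { 32, NULL };
	struct tdma_chain queued = { &a, &a }, incoming = { &b, &b };

	tdma_chain_requests(&queued, &incoming);
	for (struct tdma_desc *d = queued.first; d; d = d->next)
		printf("desc: %u bytes\n", d->byte_cnt);
	return 0;
}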

Benchmarking results with dmcrypt
=================================
		I/O read	I/O write
Before		81.7 MB/s	31.7 MB/s
After		129  MB/s	39.8 MB/s

Improvement	+57.8 %		+25.5 %



Romain Perier (7):
  crypto: marvell: Add a macro constant for the size of the crypto queue
  crypto: marvell: Check engine is not already running when enabling a
    req
  crypto: marvell: Copy IV vectors by DMA transfers for acipher requests
  crypto: marvell: Moving the tdma chain out of mv_cesa_tdma_req
  crypto: marvell: Adding a complete operation for async requests
  crypto: marvell: Adding load balancing between engines
  crypto: marvell: Add support for chaining crypto requests in TDMA mode

 drivers/crypto/marvell/cesa.c   | 142 ++++++++++++++++++++++++++++++----------
 drivers/crypto/marvell/cesa.h   | 103 +++++++++++++++++++++++------
 drivers/crypto/marvell/cipher.c | 141 +++++++++++++++++++++++----------------
 drivers/crypto/marvell/hash.c   | 126 +++++++++++++++++------------------
 drivers/crypto/marvell/tdma.c   | 120 +++++++++++++++++++++++++++++++--
 5 files changed, 452 insertions(+), 180 deletions(-)

-- 
2.7.4

* [PATCH 1/7] crypto: marvell: Add a macro constant for the size of the crypto queue
  2016-06-15 19:15 ` Romain Perier
@ 2016-06-15 19:15   ` Romain Perier
From: Romain Perier @ 2016-06-15 19:15 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: Gregory Clement, Thomas Petazzoni, David S. Miller, Russell King,
	linux-crypto, linux-arm-kernel

Add a macro constant for the size of the crypto queue instead of
using a numeric value directly. This will be easier to maintain if
we ever add more crypto queues of the same size.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
---
 drivers/crypto/marvell/cesa.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
index 056a754..fb403e1 100644
--- a/drivers/crypto/marvell/cesa.c
+++ b/drivers/crypto/marvell/cesa.c
@@ -31,6 +31,9 @@
 
 #include "cesa.h"
 
+/* Limit of the crypto queue before reaching the backlog */
+#define CESA_CRYPTO_DEFAULT_MAX_QLEN 50
+
 static int allhwsupport = !IS_ENABLED(CONFIG_CRYPTO_DEV_MV_CESA);
 module_param_named(allhwsupport, allhwsupport, int, 0444);
 MODULE_PARM_DESC(allhwsupport, "Enable support for all hardware (even it if overlaps with the mv_cesa driver)");
@@ -416,7 +419,7 @@ static int mv_cesa_probe(struct platform_device *pdev)
 		return -ENOMEM;
 
 	spin_lock_init(&cesa->lock);
-	crypto_init_queue(&cesa->queue, 50);
+	crypto_init_queue(&cesa->queue, CESA_CRYPTO_DEFAULT_MAX_QLEN);
 	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "regs");
 	cesa->regs = devm_ioremap_resource(dev, res);
 	if (IS_ERR(cesa->regs))
-- 
2.7.4

* [PATCH 2/7] crypto: marvell: Check engine is not already running when enabling a req
  2016-06-15 19:15 ` Romain Perier
@ 2016-06-15 19:15   ` Romain Perier
From: Romain Perier @ 2016-06-15 19:15 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: Gregory Clement, Thomas Petazzoni, David S. Miller, Russell King,
	linux-crypto, linux-arm-kernel

Add a BUG_ON() check to make sure that the step operation does not
activate a request on an engine that is already processing a crypto
request. This will be helpful once support for chaining crypto
requests is added: instead of silently hanging the system when the
engine is in an incoherent state, this check produces an
understandable error.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
---
 drivers/crypto/marvell/cipher.c | 2 ++
 drivers/crypto/marvell/hash.c   | 2 ++
 drivers/crypto/marvell/tdma.c   | 2 ++
 3 files changed, 6 insertions(+)

diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index dcf1fce..8d0fabb 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -106,6 +106,8 @@ static void mv_cesa_ablkcipher_std_step(struct ablkcipher_request *req)
 
 	mv_cesa_set_int_mask(engine, CESA_SA_INT_ACCEL0_DONE);
 	writel_relaxed(CESA_SA_CFG_PARA_DIS, engine->regs + CESA_SA_CFG);
+	BUG_ON(readl(engine->regs + CESA_SA_CMD)
+				  & CESA_SA_CMD_EN_CESA_SA_ACCL0);
 	writel(CESA_SA_CMD_EN_CESA_SA_ACCL0, engine->regs + CESA_SA_CMD);
 }
 
diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index 7ca2e0f..0fae351 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -237,6 +237,8 @@ static void mv_cesa_ahash_std_step(struct ahash_request *req)
 
 	mv_cesa_set_int_mask(engine, CESA_SA_INT_ACCEL0_DONE);
 	writel_relaxed(CESA_SA_CFG_PARA_DIS, engine->regs + CESA_SA_CFG);
+	BUG_ON(readl(engine->regs + CESA_SA_CMD)
+				  & CESA_SA_CMD_EN_CESA_SA_ACCL0);
 	writel(CESA_SA_CMD_EN_CESA_SA_ACCL0, engine->regs + CESA_SA_CMD);
 }
 
diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
index 7642798..d493714 100644
--- a/drivers/crypto/marvell/tdma.c
+++ b/drivers/crypto/marvell/tdma.c
@@ -53,6 +53,8 @@ void mv_cesa_dma_step(struct mv_cesa_tdma_req *dreq)
 		       engine->regs + CESA_SA_CFG);
 	writel_relaxed(dreq->chain.first->cur_dma,
 		       engine->regs + CESA_TDMA_NEXT_ADDR);
+	BUG_ON(readl(engine->regs + CESA_SA_CMD)
+				  & CESA_SA_CMD_EN_CESA_SA_ACCL0);
 	writel(CESA_SA_CMD_EN_CESA_SA_ACCL0, engine->regs + CESA_SA_CMD);
 }
 
-- 
2.7.4

* [PATCH 3/7] crypto: marvell: Copy IV vectors by DMA transfers for acipher requests
  2016-06-15 19:15 ` Romain Perier
@ 2016-06-15 19:15   ` Romain Perier
From: Romain Perier @ 2016-06-15 19:15 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: Gregory Clement, Thomas Petazzoni, David S. Miller, Russell King,
	linux-crypto, linux-arm-kernel

Add a TDMA descriptor at the end of the request for copying the
output IV vector via a DMA transfer. This is required for processing
cipher requests asynchronously in chained mode, otherwise the content
of the IV vector would be overwritten by each new finished request.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
---
 drivers/crypto/marvell/cesa.c   |  4 ++++
 drivers/crypto/marvell/cesa.h   |  5 +++++
 drivers/crypto/marvell/cipher.c | 40 +++++++++++++++++++++++++++-------------
 drivers/crypto/marvell/tdma.c   | 29 +++++++++++++++++++++++++++++
 4 files changed, 65 insertions(+), 13 deletions(-)

diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
index fb403e1..93700cd 100644
--- a/drivers/crypto/marvell/cesa.c
+++ b/drivers/crypto/marvell/cesa.c
@@ -312,6 +312,10 @@ static int mv_cesa_dev_dma_init(struct mv_cesa_dev *cesa)
 	if (!dma->padding_pool)
 		return -ENOMEM;
 
+	dma->iv_pool = dmam_pool_create("cesa_iv", dev, 16, 1, 0);
+	if (!dma->iv_pool)
+		return -ENOMEM;
+
 	cesa->dma = dma;
 
 	return 0;
diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
index 74071e4..74b84bd 100644
--- a/drivers/crypto/marvell/cesa.h
+++ b/drivers/crypto/marvell/cesa.h
@@ -275,6 +275,7 @@ struct mv_cesa_op_ctx {
 #define CESA_TDMA_DUMMY				0
 #define CESA_TDMA_DATA				1
 #define CESA_TDMA_OP				2
+#define CESA_TDMA_IV				4
 
 /**
  * struct mv_cesa_tdma_desc - TDMA descriptor
@@ -390,6 +391,7 @@ struct mv_cesa_dev_dma {
 	struct dma_pool *op_pool;
 	struct dma_pool *cache_pool;
 	struct dma_pool *padding_pool;
+	struct dma_pool *iv_pool;
 };
 
 /**
@@ -790,6 +792,9 @@ mv_cesa_tdma_desc_iter_init(struct mv_cesa_tdma_chain *chain)
 	memset(chain, 0, sizeof(*chain));
 }
 
+int mv_cesa_dma_add_iv_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
+			  u32 size, u32 flags, gfp_t gfp_flags);
+
 struct mv_cesa_op_ctx *mv_cesa_dma_add_op(struct mv_cesa_tdma_chain *chain,
 					const struct mv_cesa_op_ctx *op_templ,
 					bool skip_ctx,
diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index 8d0fabb..f42620e 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -118,6 +118,7 @@ static int mv_cesa_ablkcipher_std_process(struct ablkcipher_request *req,
 	struct mv_cesa_ablkcipher_std_req *sreq = &creq->req.std;
 	struct mv_cesa_engine *engine = sreq->base.engine;
 	size_t len;
+	unsigned int ivsize;
 
 	len = sg_pcopy_from_buffer(req->dst, creq->dst_nents,
 				   engine->sram + CESA_SA_DATA_SRAM_OFFSET,
@@ -127,6 +128,10 @@ static int mv_cesa_ablkcipher_std_process(struct ablkcipher_request *req,
 	if (sreq->offset < req->nbytes)
 		return -EINPROGRESS;
 
+	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(req));
+	memcpy_fromio(req->info,
+		      engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET, ivsize);
+
 	return 0;
 }
 
@@ -135,23 +140,23 @@ static int mv_cesa_ablkcipher_process(struct crypto_async_request *req,
 {
 	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
-	struct mv_cesa_ablkcipher_std_req *sreq = &creq->req.std;
-	struct mv_cesa_engine *engine = sreq->base.engine;
-	int ret;
 
-	if (creq->req.base.type == CESA_DMA_REQ)
+	if (creq->req.base.type == CESA_DMA_REQ) {
+		int ret;
+		struct mv_cesa_tdma_req *dreq;
+		unsigned int ivsize;
+
 		ret = mv_cesa_dma_process(&creq->req.dma, status);
-	else
-		ret = mv_cesa_ablkcipher_std_process(ablkreq, status);
+		if (ret)
+			return ret;
 
-	if (ret)
+		dreq = &creq->req.dma;
+		ivsize = crypto_ablkcipher_ivsize(
+					     crypto_ablkcipher_reqtfm(ablkreq));
+		memcpy_fromio(ablkreq->info, dreq->chain.last->data, ivsize);
 		return ret;
-
-	memcpy_fromio(ablkreq->info,
-		      engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET,
-		      crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(ablkreq)));
-
-	return 0;
+	}
+	return mv_cesa_ablkcipher_std_process(ablkreq, status);
 }
 
 static void mv_cesa_ablkcipher_step(struct crypto_async_request *req)
@@ -302,6 +307,7 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
 	struct mv_cesa_tdma_chain chain;
 	bool skip_ctx = false;
 	int ret;
+	unsigned int ivsize;
 
 	dreq->base.type = CESA_DMA_REQ;
 	dreq->chain.first = NULL;
@@ -360,6 +366,14 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
 
 	} while (mv_cesa_ablkcipher_req_iter_next_op(&iter));
 
+	/* Add output data for IV */
+	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(req));
+	ret = mv_cesa_dma_add_iv_op(&chain, CESA_SA_CRYPT_IV_SRAM_OFFSET,
+				    ivsize, CESA_TDMA_SRC_IN_SRAM, flags);
+
+	if (ret)
+		goto err_free_tdma;
+
 	dreq->chain = chain;
 
 	return 0;
diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
index d493714..88c87be 100644
--- a/drivers/crypto/marvell/tdma.c
+++ b/drivers/crypto/marvell/tdma.c
@@ -68,6 +68,9 @@ void mv_cesa_dma_cleanup(struct mv_cesa_tdma_req *dreq)
 		if (tdma->flags & CESA_TDMA_OP)
 			dma_pool_free(cesa_dev->dma->op_pool, tdma->op,
 				      le32_to_cpu(tdma->src));
+		else if (tdma->flags & CESA_TDMA_IV)
+			dma_pool_free(cesa_dev->dma->iv_pool, tdma->data,
+				      le32_to_cpu(tdma->dst));
 
 		tdma = tdma->next;
 		dma_pool_free(cesa_dev->dma->tdma_desc_pool, old_tdma,
@@ -120,6 +123,32 @@ mv_cesa_dma_add_desc(struct mv_cesa_tdma_chain *chain, gfp_t flags)
 	return new_tdma;
 }
 
+int mv_cesa_dma_add_iv_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
+			  u32 size, u32 flags, gfp_t gfp_flags)
+{
+
+	struct mv_cesa_tdma_desc *tdma;
+	u8 *cache;
+	dma_addr_t dma_handle;
+
+	tdma = mv_cesa_dma_add_desc(chain, gfp_flags);
+	if (IS_ERR(tdma))
+		return PTR_ERR(tdma);
+
+	cache = dma_pool_alloc(cesa_dev->dma->iv_pool, flags, &dma_handle);
+	if (!cache)
+		return -ENOMEM;
+
+	tdma->byte_cnt = cpu_to_le32(size | BIT(31));
+	tdma->src = src;
+	tdma->dst = cpu_to_le32(dma_handle);
+	tdma->data = cache;
+
+	flags &= (CESA_TDMA_DST_IN_SRAM | CESA_TDMA_SRC_IN_SRAM);
+	tdma->flags = flags | CESA_TDMA_DATA | CESA_TDMA_IV;
+	return 0;
+}
+
 struct mv_cesa_op_ctx *mv_cesa_dma_add_op(struct mv_cesa_tdma_chain *chain,
 					const struct mv_cesa_op_ctx *op_templ,
 					bool skip_ctx,
-- 
2.7.4

* [PATCH 4/7] crypto: marvell: Moving the tdma chain out of mv_cesa_tdma_req
  2016-06-15 19:15 ` Romain Perier
@ 2016-06-15 19:15   ` Romain Perier
From: Romain Perier @ 2016-06-15 19:15 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: Gregory Clement, Thomas Petazzoni, David S. Miller, Russell King,
	linux-crypto, linux-arm-kernel

Currently, the only way to access the tdma chain is to use the 'req'
union from a mv_cesa_{ablkcipher,ahash}_req. This will soon become a
problem if we want to handle TDMA chaining vs standard/non-DMA
processing in a generic way (with generic functions at the cesa.c
level detecting whether the request should be queued at the DMA level
or not). Hence the decision to move the chain field to the
mv_cesa_req level, at the expense of adding two pointer fields to all
request contexts (including non-DMA ones). To limit the overhead, we
get rid of the type field, which can now be deduced from the
req->chain.first value.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
---
 drivers/crypto/marvell/cesa.c   |  3 ++-
 drivers/crypto/marvell/cesa.h   | 31 +++++++++++++------------------
 drivers/crypto/marvell/cipher.c | 40 ++++++++++++++++++++++------------------
 drivers/crypto/marvell/hash.c   | 36 +++++++++++++++---------------------
 drivers/crypto/marvell/tdma.c   |  8 ++++----
 5 files changed, 56 insertions(+), 62 deletions(-)

diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
index 93700cd..fe04d1b 100644
--- a/drivers/crypto/marvell/cesa.c
+++ b/drivers/crypto/marvell/cesa.c
@@ -111,7 +111,8 @@ static irqreturn_t mv_cesa_int(int irq, void *priv)
 	return ret;
 }
 
-int mv_cesa_queue_req(struct crypto_async_request *req)
+int mv_cesa_queue_req(struct crypto_async_request *req,
+		      struct mv_cesa_req *creq)
 {
 	int ret;
 	int i;
diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
index 74b84bd..158ff82 100644
--- a/drivers/crypto/marvell/cesa.h
+++ b/drivers/crypto/marvell/cesa.h
@@ -509,21 +509,11 @@ enum mv_cesa_req_type {
 
 /**
  * struct mv_cesa_req - CESA request
- * @type:	request type
  * @engine:	engine associated with this request
+ * @chain:	list of tdma descriptors associated  with this request
  */
 struct mv_cesa_req {
-	enum mv_cesa_req_type type;
 	struct mv_cesa_engine *engine;
-};
-
-/**
- * struct mv_cesa_tdma_req - CESA TDMA request
- * @base:	base information
- * @chain:	TDMA chain
- */
-struct mv_cesa_tdma_req {
-	struct mv_cesa_req base;
 	struct mv_cesa_tdma_chain chain;
 };
 
@@ -562,7 +552,6 @@ struct mv_cesa_ablkcipher_std_req {
 struct mv_cesa_ablkcipher_req {
 	union {
 		struct mv_cesa_req base;
-		struct mv_cesa_tdma_req dma;
 		struct mv_cesa_ablkcipher_std_req std;
 	} req;
 	int src_nents;
@@ -587,7 +576,6 @@ struct mv_cesa_ahash_std_req {
  * @cache_dma:		DMA address of the cache buffer
  */
 struct mv_cesa_ahash_dma_req {
-	struct mv_cesa_tdma_req base;
 	u8 *padding;
 	dma_addr_t padding_dma;
 	u8 *cache;
@@ -625,6 +613,12 @@ struct mv_cesa_ahash_req {
 
 extern struct mv_cesa_dev *cesa_dev;
 
+static inline enum mv_cesa_req_type
+mv_cesa_req_get_type(struct mv_cesa_req *req)
+{
+	return req->chain.first ? CESA_DMA_REQ : CESA_STD_REQ;
+}
+
 static inline void mv_cesa_update_op_cfg(struct mv_cesa_op_ctx *op,
 					 u32 cfg, u32 mask)
 {
@@ -697,7 +691,8 @@ static inline bool mv_cesa_mac_op_is_first_frag(const struct mv_cesa_op_ctx *op)
 		CESA_SA_DESC_CFG_FIRST_FRAG;
 }
 
-int mv_cesa_queue_req(struct crypto_async_request *req);
+int mv_cesa_queue_req(struct crypto_async_request *req,
+		      struct mv_cesa_req *creq);
 
 /*
  * Helper function that indicates whether a crypto request needs to be
@@ -767,9 +762,9 @@ static inline bool mv_cesa_req_dma_iter_next_op(struct mv_cesa_dma_iter *iter)
 	return iter->op_len;
 }
 
-void mv_cesa_dma_step(struct mv_cesa_tdma_req *dreq);
+void mv_cesa_dma_step(struct mv_cesa_req *dreq);
 
-static inline int mv_cesa_dma_process(struct mv_cesa_tdma_req *dreq,
+static inline int mv_cesa_dma_process(struct mv_cesa_req *dreq,
 				      u32 status)
 {
 	if (!(status & CESA_SA_INT_ACC0_IDMA_DONE))
@@ -781,10 +776,10 @@ static inline int mv_cesa_dma_process(struct mv_cesa_tdma_req *dreq,
 	return 0;
 }
 
-void mv_cesa_dma_prepare(struct mv_cesa_tdma_req *dreq,
+void mv_cesa_dma_prepare(struct mv_cesa_req *dreq,
 			 struct mv_cesa_engine *engine);
+void mv_cesa_dma_cleanup(struct mv_cesa_req *dreq);
 
-void mv_cesa_dma_cleanup(struct mv_cesa_tdma_req *dreq);
 
 static inline void
 mv_cesa_tdma_desc_iter_init(struct mv_cesa_tdma_chain *chain)
diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index f42620e..15d2c5a 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -70,14 +70,14 @@ mv_cesa_ablkcipher_dma_cleanup(struct ablkcipher_request *req)
 		dma_unmap_sg(cesa_dev->dev, req->src, creq->src_nents,
 			     DMA_BIDIRECTIONAL);
 	}
-	mv_cesa_dma_cleanup(&creq->req.dma);
+	mv_cesa_dma_cleanup(&creq->req.base);
 }
 
 static inline void mv_cesa_ablkcipher_cleanup(struct ablkcipher_request *req)
 {
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 
-	if (creq->req.base.type == CESA_DMA_REQ)
+	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
 		mv_cesa_ablkcipher_dma_cleanup(req);
 }
 
@@ -141,19 +141,19 @@ static int mv_cesa_ablkcipher_process(struct crypto_async_request *req,
 	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
 
-	if (creq->req.base.type == CESA_DMA_REQ) {
+	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ) {
 		int ret;
-		struct mv_cesa_tdma_req *dreq;
+		struct mv_cesa_req *basereq;
 		unsigned int ivsize;
 
-		ret = mv_cesa_dma_process(&creq->req.dma, status);
+		ret = mv_cesa_dma_process(&creq->req.base, status);
 		if (ret)
 			return ret;
 
-		dreq = &creq->req.dma;
+		basereq = &creq->req.base;
 		ivsize = crypto_ablkcipher_ivsize(
 					     crypto_ablkcipher_reqtfm(ablkreq));
-		memcpy_fromio(ablkreq->info, dreq->chain.last->data, ivsize);
+		memcpy_fromio(ablkreq->info, basereq->chain.last->data, ivsize);
 		return ret;
 	}
 	return mv_cesa_ablkcipher_std_process(ablkreq, status);
@@ -164,8 +164,8 @@ static void mv_cesa_ablkcipher_step(struct crypto_async_request *req)
 	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
 
-	if (creq->req.base.type == CESA_DMA_REQ)
-		mv_cesa_dma_step(&creq->req.dma);
+	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
+		mv_cesa_dma_step(&creq->req.base);
 	else
 		mv_cesa_ablkcipher_std_step(ablkreq);
 }
@@ -174,9 +174,9 @@ static inline void
 mv_cesa_ablkcipher_dma_prepare(struct ablkcipher_request *req)
 {
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
-	struct mv_cesa_tdma_req *dreq = &creq->req.dma;
+	struct mv_cesa_req *dreq = &creq->req.base;
 
-	mv_cesa_dma_prepare(dreq, dreq->base.engine);
+	mv_cesa_dma_prepare(dreq, dreq->engine);
 }
 
 static inline void
@@ -199,7 +199,7 @@ static inline void mv_cesa_ablkcipher_prepare(struct crypto_async_request *req,
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
 	creq->req.base.engine = engine;
 
-	if (creq->req.base.type == CESA_DMA_REQ)
+	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
 		mv_cesa_ablkcipher_dma_prepare(ablkreq);
 	else
 		mv_cesa_ablkcipher_std_prepare(ablkreq);
@@ -302,14 +302,13 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 	gfp_t flags = (req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) ?
 		      GFP_KERNEL : GFP_ATOMIC;
-	struct mv_cesa_tdma_req *dreq = &creq->req.dma;
+	struct mv_cesa_req *dreq = &creq->req.base;
 	struct mv_cesa_ablkcipher_dma_iter iter;
 	struct mv_cesa_tdma_chain chain;
 	bool skip_ctx = false;
 	int ret;
 	unsigned int ivsize;
 
-	dreq->base.type = CESA_DMA_REQ;
 	dreq->chain.first = NULL;
 	dreq->chain.last = NULL;
 
@@ -397,10 +396,12 @@ mv_cesa_ablkcipher_std_req_init(struct ablkcipher_request *req,
 {
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 	struct mv_cesa_ablkcipher_std_req *sreq = &creq->req.std;
+	struct mv_cesa_req *basereq = &creq->req.base;
 
-	sreq->base.type = CESA_STD_REQ;
 	sreq->op = *op_templ;
 	sreq->skip_ctx = false;
+	basereq->chain.first = NULL;
+	basereq->chain.last = NULL;
 
 	return 0;
 }
@@ -442,6 +443,7 @@ static int mv_cesa_ablkcipher_req_init(struct ablkcipher_request *req,
 static int mv_cesa_des_op(struct ablkcipher_request *req,
 			  struct mv_cesa_op_ctx *tmpl)
 {
+	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 	struct mv_cesa_des_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
 	int ret;
 
@@ -454,7 +456,7 @@ static int mv_cesa_des_op(struct ablkcipher_request *req,
 	if (ret)
 		return ret;
 
-	ret = mv_cesa_queue_req(&req->base);
+	ret = mv_cesa_queue_req(&req->base, &creq->req.base);
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ablkcipher_cleanup(req);
 
@@ -562,6 +564,7 @@ struct crypto_alg mv_cesa_cbc_des_alg = {
 static int mv_cesa_des3_op(struct ablkcipher_request *req,
 			   struct mv_cesa_op_ctx *tmpl)
 {
+	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 	struct mv_cesa_des3_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
 	int ret;
 
@@ -574,7 +577,7 @@ static int mv_cesa_des3_op(struct ablkcipher_request *req,
 	if (ret)
 		return ret;
 
-	ret = mv_cesa_queue_req(&req->base);
+	ret = mv_cesa_queue_req(&req->base, &creq->req.base);
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ablkcipher_cleanup(req);
 
@@ -688,6 +691,7 @@ struct crypto_alg mv_cesa_cbc_des3_ede_alg = {
 static int mv_cesa_aes_op(struct ablkcipher_request *req,
 			  struct mv_cesa_op_ctx *tmpl)
 {
+	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 	struct mv_cesa_aes_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
 	int ret, i;
 	u32 *key;
@@ -716,7 +720,7 @@ static int mv_cesa_aes_op(struct ablkcipher_request *req,
 	if (ret)
 		return ret;
 
-	ret = mv_cesa_queue_req(&req->base);
+	ret = mv_cesa_queue_req(&req->base, &creq->req.base);
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ablkcipher_cleanup(req);
 
diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index 0fae351..cc7c5b0 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -103,14 +103,14 @@ static inline void mv_cesa_ahash_dma_cleanup(struct ahash_request *req)
 
 	dma_unmap_sg(cesa_dev->dev, req->src, creq->src_nents, DMA_TO_DEVICE);
 	mv_cesa_ahash_dma_free_cache(&creq->req.dma);
-	mv_cesa_dma_cleanup(&creq->req.dma.base);
+	mv_cesa_dma_cleanup(&creq->req.base);
 }
 
 static inline void mv_cesa_ahash_cleanup(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 
-	if (creq->req.base.type == CESA_DMA_REQ)
+	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
 		mv_cesa_ahash_dma_cleanup(req);
 }
 
@@ -118,7 +118,7 @@ static void mv_cesa_ahash_last_cleanup(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 
-	if (creq->req.base.type == CESA_DMA_REQ)
+	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
 		mv_cesa_ahash_dma_last_cleanup(req);
 }
 
@@ -256,9 +256,9 @@ static int mv_cesa_ahash_std_process(struct ahash_request *req, u32 status)
 static inline void mv_cesa_ahash_dma_prepare(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
-	struct mv_cesa_tdma_req *dreq = &creq->req.dma.base;
+	struct mv_cesa_req *dreq = &creq->req.base;
 
-	mv_cesa_dma_prepare(dreq, dreq->base.engine);
+	mv_cesa_dma_prepare(dreq, dreq->engine);
 }
 
 static void mv_cesa_ahash_std_prepare(struct ahash_request *req)
@@ -277,8 +277,8 @@ static void mv_cesa_ahash_step(struct crypto_async_request *req)
 	struct ahash_request *ahashreq = ahash_request_cast(req);
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(ahashreq);
 
-	if (creq->req.base.type == CESA_DMA_REQ)
-		mv_cesa_dma_step(&creq->req.dma.base);
+	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
+		mv_cesa_dma_step(&creq->req.base);
 	else
 		mv_cesa_ahash_std_step(ahashreq);
 }
@@ -291,8 +291,8 @@ static int mv_cesa_ahash_process(struct crypto_async_request *req, u32 status)
 	unsigned int digsize;
 	int ret, i;
 
-	if (creq->req.base.type == CESA_DMA_REQ)
-		ret = mv_cesa_dma_process(&creq->req.dma.base, status);
+	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
+		ret = mv_cesa_dma_process(&creq->req.base, status);
 	else
 		ret = mv_cesa_ahash_std_process(ahashreq, status);
 
@@ -340,7 +340,7 @@ static void mv_cesa_ahash_prepare(struct crypto_async_request *req,
 
 	creq->req.base.engine = engine;
 
-	if (creq->req.base.type == CESA_DMA_REQ)
+	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
 		mv_cesa_ahash_dma_prepare(ahashreq);
 	else
 		mv_cesa_ahash_std_prepare(ahashreq);
@@ -555,8 +555,7 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 	gfp_t flags = (req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) ?
 		      GFP_KERNEL : GFP_ATOMIC;
-	struct mv_cesa_ahash_dma_req *ahashdreq = &creq->req.dma;
-	struct mv_cesa_tdma_req *dreq = &ahashdreq->base;
+	struct mv_cesa_req *dreq = &creq->req.base;
 	struct mv_cesa_ahash_dma_iter iter;
 	struct mv_cesa_op_ctx *op = NULL;
 	unsigned int frag_len;
@@ -662,11 +661,6 @@ static int mv_cesa_ahash_req_init(struct ahash_request *req, bool *cached)
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 	int ret;
 
-	if (cesa_dev->caps->has_tdma)
-		creq->req.base.type = CESA_DMA_REQ;
-	else
-		creq->req.base.type = CESA_STD_REQ;
-
 	creq->src_nents = sg_nents_for_len(req->src, req->nbytes);
 	if (creq->src_nents < 0) {
 		dev_err(cesa_dev->dev, "Invalid number of src SG");
@@ -680,7 +674,7 @@ static int mv_cesa_ahash_req_init(struct ahash_request *req, bool *cached)
 	if (*cached)
 		return 0;
 
-	if (creq->req.base.type == CESA_DMA_REQ)
+	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
 		ret = mv_cesa_ahash_dma_req_init(req);
 
 	return ret;
@@ -700,7 +694,7 @@ static int mv_cesa_ahash_update(struct ahash_request *req)
 	if (cached)
 		return 0;
 
-	ret = mv_cesa_queue_req(&req->base);
+	ret = mv_cesa_queue_req(&req->base, &creq->req.base);
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ahash_cleanup(req);
 
@@ -725,7 +719,7 @@ static int mv_cesa_ahash_final(struct ahash_request *req)
 	if (cached)
 		return 0;
 
-	ret = mv_cesa_queue_req(&req->base);
+	ret = mv_cesa_queue_req(&req->base, &creq->req.base);
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ahash_cleanup(req);
 
@@ -750,7 +744,7 @@ static int mv_cesa_ahash_finup(struct ahash_request *req)
 	if (cached)
 		return 0;
 
-	ret = mv_cesa_queue_req(&req->base);
+	ret = mv_cesa_queue_req(&req->base, &creq->req.base);
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ahash_cleanup(req);
 
diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
index 88c87be..9a424f9 100644
--- a/drivers/crypto/marvell/tdma.c
+++ b/drivers/crypto/marvell/tdma.c
@@ -37,9 +37,9 @@ bool mv_cesa_req_dma_iter_next_transfer(struct mv_cesa_dma_iter *iter,
 	return true;
 }
 
-void mv_cesa_dma_step(struct mv_cesa_tdma_req *dreq)
+void mv_cesa_dma_step(struct mv_cesa_req *dreq)
 {
-	struct mv_cesa_engine *engine = dreq->base.engine;
+	struct mv_cesa_engine *engine = dreq->engine;
 
 	writel_relaxed(0, engine->regs + CESA_SA_CFG);
 
@@ -58,7 +58,7 @@ void mv_cesa_dma_step(struct mv_cesa_tdma_req *dreq)
 	writel(CESA_SA_CMD_EN_CESA_SA_ACCL0, engine->regs + CESA_SA_CMD);
 }
 
-void mv_cesa_dma_cleanup(struct mv_cesa_tdma_req *dreq)
+void mv_cesa_dma_cleanup(struct mv_cesa_req *dreq)
 {
 	struct mv_cesa_tdma_desc *tdma;
 
@@ -81,7 +81,7 @@ void mv_cesa_dma_cleanup(struct mv_cesa_tdma_req *dreq)
 	dreq->chain.last = NULL;
 }
 
-void mv_cesa_dma_prepare(struct mv_cesa_tdma_req *dreq,
+void mv_cesa_dma_prepare(struct mv_cesa_req *dreq,
 			 struct mv_cesa_engine *engine)
 {
 	struct mv_cesa_tdma_desc *tdma;
-- 
2.7.4

* [PATCH 5/7] crypto: marvell: Adding a complete operation for async requests
  2016-06-15 19:15 ` Romain Perier
@ 2016-06-15 19:15   ` Romain Perier
  -1 siblings, 0 replies; 50+ messages in thread
From: Romain Perier @ 2016-06-15 19:15 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: Gregory Clement, Thomas Petazzoni, David S. Miller, Russell King,
	linux-crypto, linux-arm-kernel

So far, the 'process' operation was used both to check whether the
current request had been correctly handled by the engine and, if so, to
copy information from the SRAM back to main memory. This commit splits
that operation in two: 'process' still checks whether the request was
correctly handled by the engine, while a new 'complete' operation copies
the content of the SRAM to memory. This split will soon become useful,
as it lets us call the process and complete operations from different
locations depending on the type of the request (different cleanup
logic).
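
To illustrate the split (a hedged sketch, not the exact driver code:
locking and the dequeue of the next request are omitted), the interrupt
handler is expected to drive the two operations like this:

	res = ctx->ops->process(req, status);	/* check engine status only */
	if (res != -EINPROGRESS) {
		ctx->ops->complete(req);	/* copy the result out of the SRAM */
		ctx->ops->cleanup(req);		/* release the associated data */
		req->complete(req, res);	/* notify the crypto API */
	} else {
		ctx->ops->step(req);		/* launch the next chunk */
	}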

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
---
 drivers/crypto/marvell/cesa.c   |  1 +
 drivers/crypto/marvell/cesa.h   |  3 +++
 drivers/crypto/marvell/cipher.c | 47 ++++++++++++++++++++++++-----------------
 drivers/crypto/marvell/hash.c   | 22 ++++++++++---------
 4 files changed, 44 insertions(+), 29 deletions(-)

diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
index fe04d1b..af96426 100644
--- a/drivers/crypto/marvell/cesa.c
+++ b/drivers/crypto/marvell/cesa.c
@@ -98,6 +98,7 @@ static irqreturn_t mv_cesa_int(int irq, void *priv)
 				engine->req = NULL;
 				mv_cesa_dequeue_req_unlocked(engine);
 				spin_unlock_bh(&engine->lock);
+				ctx->ops->complete(req);
 				ctx->ops->cleanup(req);
 				local_bh_disable();
 				req->complete(req, res);
diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
index 158ff82..32de08b 100644
--- a/drivers/crypto/marvell/cesa.h
+++ b/drivers/crypto/marvell/cesa.h
@@ -456,6 +456,8 @@ struct mv_cesa_engine {
  *		code)
  * @step:	launch the crypto operation on the next chunk
  * @cleanup:	cleanup the crypto request (release associated data)
+ * @complete:	complete the request, i.e. copy the result from the SRAM
+ * 		or contexts when needed.
  */
 struct mv_cesa_req_ops {
 	void (*prepare)(struct crypto_async_request *req,
@@ -463,6 +465,7 @@ struct mv_cesa_req_ops {
 	int (*process)(struct crypto_async_request *req, u32 status);
 	void (*step)(struct crypto_async_request *req);
 	void (*cleanup)(struct crypto_async_request *req);
+	void (*complete)(struct crypto_async_request *req);
 };
 
 /**
diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index 15d2c5a..fbaae2f 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -118,7 +118,6 @@ static int mv_cesa_ablkcipher_std_process(struct ablkcipher_request *req,
 	struct mv_cesa_ablkcipher_std_req *sreq = &creq->req.std;
 	struct mv_cesa_engine *engine = sreq->base.engine;
 	size_t len;
-	unsigned int ivsize;
 
 	len = sg_pcopy_from_buffer(req->dst, creq->dst_nents,
 				   engine->sram + CESA_SA_DATA_SRAM_OFFSET,
@@ -128,10 +127,6 @@ static int mv_cesa_ablkcipher_std_process(struct ablkcipher_request *req,
 	if (sreq->offset < req->nbytes)
 		return -EINPROGRESS;
 
-	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(req));
-	memcpy_fromio(req->info,
-		      engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET, ivsize);
-
 	return 0;
 }
 
@@ -141,21 +136,9 @@ static int mv_cesa_ablkcipher_process(struct crypto_async_request *req,
 	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
 
-	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ) {
-		int ret;
-		struct mv_cesa_req *basereq;
-		unsigned int ivsize;
-
-		ret = mv_cesa_dma_process(&creq->req.base, status);
-		if (ret)
-			return ret;
+	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
+		return mv_cesa_dma_process(&creq->req.base, status);
 
-		basereq = &creq->req.base;
-		ivsize = crypto_ablkcipher_ivsize(
-					     crypto_ablkcipher_reqtfm(ablkreq));
-		memcpy_fromio(ablkreq->info, basereq->chain.last->data, ivsize);
-		return ret;
-	}
 	return mv_cesa_ablkcipher_std_process(ablkreq, status);
 }
 
@@ -197,6 +180,7 @@ static inline void mv_cesa_ablkcipher_prepare(struct crypto_async_request *req,
 {
 	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
+
 	creq->req.base.engine = engine;
 
 	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
@@ -213,11 +197,36 @@ mv_cesa_ablkcipher_req_cleanup(struct crypto_async_request *req)
 	mv_cesa_ablkcipher_cleanup(ablkreq);
 }
 
+static void
+mv_cesa_ablkcipher_complete(struct crypto_async_request *req)
+{
+	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
+	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
+	struct mv_cesa_engine *engine = creq->req.base.engine;
+	unsigned int ivsize;
+
+	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(ablkreq));
+
+	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ) {
+		struct mv_cesa_req *basereq;
+
+		basereq = &creq->req.base;
+		ivsize = crypto_ablkcipher_ivsize(
+					     crypto_ablkcipher_reqtfm(ablkreq));
+		memcpy_fromio(ablkreq->info, basereq->chain.last->data, ivsize);
+	} else {
+		memcpy_fromio(ablkreq->info,
+			      engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET,
+			      ivsize);
+	}
+}
+
 static const struct mv_cesa_req_ops mv_cesa_ablkcipher_req_ops = {
 	.step = mv_cesa_ablkcipher_step,
 	.process = mv_cesa_ablkcipher_process,
 	.prepare = mv_cesa_ablkcipher_prepare,
 	.cleanup = mv_cesa_ablkcipher_req_cleanup,
+	.complete = mv_cesa_ablkcipher_complete,
 };
 
 static int mv_cesa_ablkcipher_cra_init(struct crypto_tfm *tfm)
diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index cc7c5b0..f7f84cc 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -287,17 +287,20 @@ static int mv_cesa_ahash_process(struct crypto_async_request *req, u32 status)
 {
 	struct ahash_request *ahashreq = ahash_request_cast(req);
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(ahashreq);
-	struct mv_cesa_engine *engine = creq->req.base.engine;
-	unsigned int digsize;
-	int ret, i;
 
 	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
-		ret = mv_cesa_dma_process(&creq->req.base, status);
-	else
-		ret = mv_cesa_ahash_std_process(ahashreq, status);
+		return mv_cesa_dma_process(&creq->req.base, status);
 
-	if (ret == -EINPROGRESS)
-		return ret;
+	return mv_cesa_ahash_std_process(ahashreq, status);
+}
+
+static void mv_cesa_ahash_complete(struct crypto_async_request *req)
+{
+	struct ahash_request *ahashreq = ahash_request_cast(req);
+	struct mv_cesa_ahash_req *creq = ahash_request_ctx(ahashreq);
+	struct mv_cesa_engine *engine = creq->req.base.engine;
+	unsigned int digsize;
+	int i;
 
 	digsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(ahashreq));
 	for (i = 0; i < digsize / 4; i++)
@@ -326,8 +329,6 @@ static int mv_cesa_ahash_process(struct crypto_async_request *req, u32 status)
 				result[i] = cpu_to_be32(creq->state[i]);
 		}
 	}
-
-	return ret;
 }
 
 static void mv_cesa_ahash_prepare(struct crypto_async_request *req,
@@ -366,6 +367,7 @@ static const struct mv_cesa_req_ops mv_cesa_ahash_req_ops = {
 	.process = mv_cesa_ahash_process,
 	.prepare = mv_cesa_ahash_prepare,
 	.cleanup = mv_cesa_ahash_req_cleanup,
+	.complete = mv_cesa_ahash_complete,
 };
 
 static int mv_cesa_ahash_init(struct ahash_request *req,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 6/7] crypto: marvell: Adding load balancing between engines
  2016-06-15 19:15 ` Romain Perier
@ 2016-06-15 19:15   ` Romain Perier
  -1 siblings, 0 replies; 50+ messages in thread
From: Romain Perier @ 2016-06-15 19:15 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: Gregory Clement, Thomas Petazzoni, David S. Miller, Russell King,
	linux-crypto, linux-arm-kernel

This commit adds support for fine-grained load balancing on
multi-engine IPs. The engine is pre-selected based on its current load
and on the weight of the crypto request that is about to be processed.
The global crypto queue is also replaced by a per-engine queue. These
changes prepare the code for TDMA chaining between crypto requests,
because each tdma chain will be handled per engine. Using one crypto
queue per engine keeps the state of the tdma chain synchronized with
the crypto queue, reduces contention on 'cesa_dev->lock' and improves
parallelism.
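
The resulting per-request flow, shown here for ciphers (a condensed
sketch of the hunks below, with error handling left out), becomes:

	/* Submission: pick the least loaded engine, weighted by the
	 * request size, and queue on that engine's own queue.
	 */
	engine = mv_cesa_select_engine(req->nbytes);
	mv_cesa_ablkcipher_prepare(&req->base, engine);
	ret = mv_cesa_queue_req(&req->base, &creq->req.base);

	/* Completion (in ->complete()): give the weight back so the
	 * engine becomes eligible for new requests again.
	 */
	atomic_sub(ablkreq->nbytes, &engine->load);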

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
---
 drivers/crypto/marvell/cesa.c   | 30 +++++++++----------
 drivers/crypto/marvell/cesa.h   | 26 +++++++++++++++--
 drivers/crypto/marvell/cipher.c | 59 ++++++++++++++++++-------------------
 drivers/crypto/marvell/hash.c   | 65 +++++++++++++++++++----------------------
 4 files changed, 97 insertions(+), 83 deletions(-)

diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
index af96426..f9e6688 100644
--- a/drivers/crypto/marvell/cesa.c
+++ b/drivers/crypto/marvell/cesa.c
@@ -45,11 +45,9 @@ static void mv_cesa_dequeue_req_unlocked(struct mv_cesa_engine *engine)
 	struct crypto_async_request *req, *backlog;
 	struct mv_cesa_ctx *ctx;
 
-	spin_lock_bh(&cesa_dev->lock);
-	backlog = crypto_get_backlog(&cesa_dev->queue);
-	req = crypto_dequeue_request(&cesa_dev->queue);
+	backlog = crypto_get_backlog(&engine->queue);
+	req = crypto_dequeue_request(&engine->queue);
 	engine->req = req;
-	spin_unlock_bh(&cesa_dev->lock);
 
 	if (!req)
 		return;
@@ -58,7 +56,6 @@ static void mv_cesa_dequeue_req_unlocked(struct mv_cesa_engine *engine)
 		backlog->complete(backlog, -EINPROGRESS);
 
 	ctx = crypto_tfm_ctx(req->tfm);
-	ctx->ops->prepare(req, engine);
 	ctx->ops->step(req);
 }
 
@@ -116,21 +113,19 @@ int mv_cesa_queue_req(struct crypto_async_request *req,
 		      struct mv_cesa_req *creq)
 {
 	int ret;
-	int i;
+	struct mv_cesa_engine *engine = creq->engine;
 
-	spin_lock_bh(&cesa_dev->lock);
-	ret = crypto_enqueue_request(&cesa_dev->queue, req);
-	spin_unlock_bh(&cesa_dev->lock);
+	spin_lock_bh(&engine->lock);
+	ret = crypto_enqueue_request(&engine->queue, req);
+	spin_unlock_bh(&engine->lock);
 
 	if (ret != -EINPROGRESS)
 		return ret;
 
-	for (i = 0; i < cesa_dev->caps->nengines; i++) {
-		spin_lock_bh(&cesa_dev->engines[i].lock);
-		if (!cesa_dev->engines[i].req)
-			mv_cesa_dequeue_req_unlocked(&cesa_dev->engines[i]);
-		spin_unlock_bh(&cesa_dev->engines[i].lock);
-	}
+	spin_lock_bh(&engine->lock);
+	if (!engine->req)
+		mv_cesa_dequeue_req_unlocked(engine);
+	spin_unlock_bh(&engine->lock);
 
 	return -EINPROGRESS;
 }
@@ -425,7 +420,7 @@ static int mv_cesa_probe(struct platform_device *pdev)
 		return -ENOMEM;
 
 	spin_lock_init(&cesa->lock);
-	crypto_init_queue(&cesa->queue, CESA_CRYPTO_DEFAULT_MAX_QLEN);
+
 	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "regs");
 	cesa->regs = devm_ioremap_resource(dev, res);
 	if (IS_ERR(cesa->regs))
@@ -498,6 +493,9 @@ static int mv_cesa_probe(struct platform_device *pdev)
 						engine);
 		if (ret)
 			goto err_cleanup;
+
+		crypto_init_queue(&engine->queue, CESA_CRYPTO_DEFAULT_MAX_QLEN);
+		atomic_set(&engine->load, 0);
 	}
 
 	cesa_dev = cesa;
diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
index 32de08b..5626aa7 100644
--- a/drivers/crypto/marvell/cesa.h
+++ b/drivers/crypto/marvell/cesa.h
@@ -400,7 +400,6 @@ struct mv_cesa_dev_dma {
  * @regs:	device registers
  * @sram_size:	usable SRAM size
  * @lock:	device lock
- * @queue:	crypto request queue
  * @engines:	array of engines
  * @dma:	dma pools
  *
@@ -412,7 +411,6 @@ struct mv_cesa_dev {
 	struct device *dev;
 	unsigned int sram_size;
 	spinlock_t lock;
-	struct crypto_queue queue;
 	struct mv_cesa_engine *engines;
 	struct mv_cesa_dev_dma *dma;
 };
@@ -431,6 +429,8 @@ struct mv_cesa_dev {
  * @int_mask:		interrupt mask cache
  * @pool:		memory pool pointing to the memory region reserved in
  *			SRAM
+ * @queue:		fifo of the pending crypto requests
+ * @load:		engine load counter, useful for load balancing
  *
  * Structure storing CESA engine information.
  */
@@ -446,6 +446,8 @@ struct mv_cesa_engine {
 	size_t max_req_len;
 	u32 int_mask;
 	struct gen_pool *pool;
+	struct crypto_queue queue;
+	atomic_t load;
 };
 
 /**
@@ -697,6 +699,26 @@ static inline bool mv_cesa_mac_op_is_first_frag(const struct mv_cesa_op_ctx *op)
 int mv_cesa_queue_req(struct crypto_async_request *req,
 		      struct mv_cesa_req *creq);
 
+static inline struct mv_cesa_engine *mv_cesa_select_engine(int weight)
+{
+	int i;
+	u32 min_load = U32_MAX;
+	struct mv_cesa_engine *selected = NULL;
+
+	for (i = 0; i < cesa_dev->caps->nengines; i++) {
+		struct mv_cesa_engine *engine = cesa_dev->engines + i;
+		u32 load = atomic_read(&engine->load);
+		if (load < min_load) {
+			min_load = load;
+			selected = engine;
+		}
+	}
+
+	atomic_add(weight, &selected->load);
+
+	return selected;
+}
+
 /*
  * Helper function that indicates whether a crypto request needs to be
  * cleaned up or not after being enqueued using mv_cesa_queue_req().
diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index fbaae2f..02aa38f 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -89,6 +89,9 @@ static void mv_cesa_ablkcipher_std_step(struct ablkcipher_request *req)
 	size_t  len = min_t(size_t, req->nbytes - sreq->offset,
 			    CESA_SA_SRAM_PAYLOAD_SIZE);
 
+	mv_cesa_adjust_op(engine, &sreq->op);
+	memcpy_toio(engine->sram, &sreq->op, sizeof(sreq->op));
+
 	len = sg_pcopy_to_buffer(req->src, creq->src_nents,
 				 engine->sram + CESA_SA_DATA_SRAM_OFFSET,
 				 len, sreq->offset);
@@ -167,12 +170,9 @@ mv_cesa_ablkcipher_std_prepare(struct ablkcipher_request *req)
 {
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 	struct mv_cesa_ablkcipher_std_req *sreq = &creq->req.std;
-	struct mv_cesa_engine *engine = sreq->base.engine;
 
 	sreq->size = 0;
 	sreq->offset = 0;
-	mv_cesa_adjust_op(engine, &sreq->op);
-	memcpy_toio(engine->sram, &sreq->op, sizeof(sreq->op));
 }
 
 static inline void mv_cesa_ablkcipher_prepare(struct crypto_async_request *req,
@@ -205,6 +205,7 @@ mv_cesa_ablkcipher_complete(struct crypto_async_request *req)
 	struct mv_cesa_engine *engine = creq->req.base.engine;
 	unsigned int ivsize;
 
+	atomic_sub(ablkreq->nbytes, &engine->load);
 	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(ablkreq));
 
 	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ) {
@@ -449,29 +450,43 @@ static int mv_cesa_ablkcipher_req_init(struct ablkcipher_request *req,
 	return ret;
 }
 
-static int mv_cesa_des_op(struct ablkcipher_request *req,
-			  struct mv_cesa_op_ctx *tmpl)
+static int mv_cesa_ablkcipher_queue_req(struct ablkcipher_request *req,
+					struct mv_cesa_op_ctx *tmpl)
 {
-	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
-	struct mv_cesa_des_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
 	int ret;
-
-	mv_cesa_update_op_cfg(tmpl, CESA_SA_DESC_CFG_CRYPTM_DES,
-			      CESA_SA_DESC_CFG_CRYPTM_MSK);
-
-	memcpy(tmpl->ctx.blkcipher.key, ctx->key, DES_KEY_SIZE);
+	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
+	struct mv_cesa_engine *engine;
 
 	ret = mv_cesa_ablkcipher_req_init(req, tmpl);
 	if (ret)
 		return ret;
 
+	engine = mv_cesa_select_engine(req->nbytes);
+	mv_cesa_ablkcipher_prepare(&req->base, engine);
+
 	ret = mv_cesa_queue_req(&req->base, &creq->req.base);
+
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ablkcipher_cleanup(req);
 
 	return ret;
 }
 
+static int mv_cesa_des_op(struct ablkcipher_request *req,
+			  struct mv_cesa_op_ctx *tmpl)
+{
+	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
+	struct mv_cesa_des_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
+	int ret;
+
+	mv_cesa_update_op_cfg(tmpl, CESA_SA_DESC_CFG_CRYPTM_DES,
+			      CESA_SA_DESC_CFG_CRYPTM_MSK);
+
+	memcpy(tmpl->ctx.blkcipher.key, ctx->key, DES_KEY_SIZE);
+
+	return mv_cesa_ablkcipher_queue_req(req, tmpl);
+}
+
 static int mv_cesa_ecb_des_encrypt(struct ablkcipher_request *req)
 {
 	struct mv_cesa_op_ctx tmpl;
@@ -582,15 +597,7 @@ static int mv_cesa_des3_op(struct ablkcipher_request *req,
 
 	memcpy(tmpl->ctx.blkcipher.key, ctx->key, DES3_EDE_KEY_SIZE);
 
-	ret = mv_cesa_ablkcipher_req_init(req, tmpl);
-	if (ret)
-		return ret;
-
-	ret = mv_cesa_queue_req(&req->base, &creq->req.base);
-	if (mv_cesa_req_needs_cleanup(&req->base, ret))
-		mv_cesa_ablkcipher_cleanup(req);
-
-	return ret;
+	return mv_cesa_ablkcipher_queue_req(req, tmpl);
 }
 
 static int mv_cesa_ecb_des3_ede_encrypt(struct ablkcipher_request *req)
@@ -725,15 +732,7 @@ static int mv_cesa_aes_op(struct ablkcipher_request *req,
 			      CESA_SA_DESC_CFG_CRYPTM_MSK |
 			      CESA_SA_DESC_CFG_AES_LEN_MSK);
 
-	ret = mv_cesa_ablkcipher_req_init(req, tmpl);
-	if (ret)
-		return ret;
-
-	ret = mv_cesa_queue_req(&req->base, &creq->req.base);
-	if (mv_cesa_req_needs_cleanup(&req->base, ret))
-		mv_cesa_ablkcipher_cleanup(req);
-
-	return ret;
+	return mv_cesa_ablkcipher_queue_req(req, tmpl);
 }
 
 static int mv_cesa_ecb_aes_encrypt(struct ablkcipher_request *req)
diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index f7f84cc..5946a69 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -162,6 +162,15 @@ static void mv_cesa_ahash_std_step(struct ahash_request *req)
 	unsigned int new_cache_ptr = 0;
 	u32 frag_mode;
 	size_t  len;
+	unsigned int digsize;
+	int i;
+
+	mv_cesa_adjust_op(engine, &creq->op_tmpl);
+	memcpy_toio(engine->sram, &creq->op_tmpl, sizeof(creq->op_tmpl));
+
+	digsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(req));
+	for (i = 0; i < digsize / 4; i++)
+		writel_relaxed(creq->state[i], engine->regs + CESA_IVDIG(i));
 
 	if (creq->cache_ptr)
 		memcpy_toio(engine->sram + CESA_SA_DATA_SRAM_OFFSET,
@@ -265,11 +274,8 @@ static void mv_cesa_ahash_std_prepare(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 	struct mv_cesa_ahash_std_req *sreq = &creq->req.std;
-	struct mv_cesa_engine *engine = sreq->base.engine;
 
 	sreq->offset = 0;
-	mv_cesa_adjust_op(engine, &creq->op_tmpl);
-	memcpy_toio(engine->sram, &creq->op_tmpl, sizeof(creq->op_tmpl));
 }
 
 static void mv_cesa_ahash_step(struct crypto_async_request *req)
@@ -329,6 +335,8 @@ static void mv_cesa_ahash_complete(struct crypto_async_request *req)
 				result[i] = cpu_to_be32(creq->state[i]);
 		}
 	}
+
+	atomic_sub(ahashreq->nbytes, &engine->load);
 }
 
 static void mv_cesa_ahash_prepare(struct crypto_async_request *req,
@@ -336,8 +344,6 @@ static void mv_cesa_ahash_prepare(struct crypto_async_request *req,
 {
 	struct ahash_request *ahashreq = ahash_request_cast(req);
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(ahashreq);
-	unsigned int digsize;
-	int i;
 
 	creq->req.base.engine = engine;
 
@@ -345,10 +351,6 @@ static void mv_cesa_ahash_prepare(struct crypto_async_request *req,
 		mv_cesa_ahash_dma_prepare(ahashreq);
 	else
 		mv_cesa_ahash_std_prepare(ahashreq);
-
-	digsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(ahashreq));
-	for (i = 0; i < digsize / 4; i++)
-		writel_relaxed(creq->state[i], engine->regs + CESA_IVDIG(i));
 }
 
 static void mv_cesa_ahash_req_cleanup(struct crypto_async_request *req)
@@ -682,13 +684,13 @@ static int mv_cesa_ahash_req_init(struct ahash_request *req, bool *cached)
 	return ret;
 }
 
-static int mv_cesa_ahash_update(struct ahash_request *req)
+static int mv_cesa_ahash_queue_req(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
+	struct mv_cesa_engine *engine;
 	bool cached = false;
 	int ret;
 
-	creq->len += req->nbytes;
 	ret = mv_cesa_ahash_req_init(req, &cached);
 	if (ret)
 		return ret;
@@ -696,13 +698,28 @@ static int mv_cesa_ahash_update(struct ahash_request *req)
 	if (cached)
 		return 0;
 
+	engine = mv_cesa_select_engine(req->nbytes);
+	mv_cesa_ahash_prepare(&req->base, engine);
+
 	ret = mv_cesa_queue_req(&req->base, &creq->req.base);
+
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ahash_cleanup(req);
 
 	return ret;
 }
 
+static int mv_cesa_ahash_update(struct ahash_request *req)
+{
+	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
+
+	creq->len += req->nbytes;
+
+	return mv_cesa_ahash_queue_req(req);
+}
+
 static int mv_cesa_ahash_final(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
@@ -714,18 +731,7 @@ static int mv_cesa_ahash_final(struct ahash_request *req)
 	creq->last_req = true;
 	req->nbytes = 0;
 
-	ret = mv_cesa_ahash_req_init(req, &cached);
-	if (ret)
-		return ret;
-
-	if (cached)
-		return 0;
-
-	ret = mv_cesa_queue_req(&req->base, &creq->req.base);
-	if (mv_cesa_req_needs_cleanup(&req->base, ret))
-		mv_cesa_ahash_cleanup(req);
-
-	return ret;
+	return mv_cesa_ahash_queue_req(req);
 }
 
 static int mv_cesa_ahash_finup(struct ahash_request *req)
@@ -739,18 +745,7 @@ static int mv_cesa_ahash_finup(struct ahash_request *req)
 	mv_cesa_set_mac_op_total_len(tmpl, creq->len);
 	creq->last_req = true;
 
-	ret = mv_cesa_ahash_req_init(req, &cached);
-	if (ret)
-		return ret;
-
-	if (cached)
-		return 0;
-
-	ret = mv_cesa_queue_req(&req->base, &creq->req.base);
-	if (mv_cesa_req_needs_cleanup(&req->base, ret))
-		mv_cesa_ahash_cleanup(req);
-
-	return ret;
+	return mv_cesa_ahash_queue_req(req);
 }
 
 static int mv_cesa_ahash_export(struct ahash_request *req, void *hash,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 7/7] crypto: marvell: Add support for chaining crypto requests in TDMA mode
  2016-06-15 19:15 ` Romain Perier
@ 2016-06-15 19:15   ` Romain Perier
  -1 siblings, 0 replies; 50+ messages in thread
From: Romain Perier @ 2016-06-15 19:15 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: Gregory Clement, Thomas Petazzoni, David S. Miller, Russell King,
	linux-crypto, linux-arm-kernel

The Cryptographic Engines and Security Accelerators (CESA) support the
Multi-Packet Chain Mode. With this mode enabled, multiple tdma requests
can be chained and processed by the hardware without software
intervention. This mode was already activated, but the crypto requests
were not chained together. Chaining them significantly reduces the
number of IRQs: instead of being interrupted at the end of each crypto
request, we are only interrupted at the end of the last cryptographic
request processed by the engine.
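
Concretely, when a DMA request is queued, its tdma descriptors are
appended to the engine's current chain, both on the CPU side and on the
hardware side (a condensed view of mv_cesa_tdma_chain() from this
patch; locking is not shown):

	if (!engine->chain.first) {
		/* Idle engine: the request becomes the whole chain */
		engine->chain.first = dreq->chain.first;
		engine->chain.last  = dreq->chain.last;
	} else {
		struct mv_cesa_tdma_desc *last = engine->chain.last;

		last->next = dreq->chain.first;
		engine->chain.last = dreq->chain.last;

		/* Hardware link, unless chaining past this descriptor
		 * is explicitly forbidden (CESA_TDMA_NOT_CHAIN).
		 */
		if (!(last->flags & CESA_TDMA_NOT_CHAIN))
			last->next_dma = dreq->chain.first->cur_dma;
	}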

This commit refactors the code, changes the code architecture and adds
the required data structures to chain cryptographic requests together
before sending them to an engine.
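
On the interrupt side, the handler now walks the chain up to the
descriptor the engine last fetched, completing every request whose
CESA_TDMA_END_OF_REQ descriptor has been passed (a simplified view of
mv_cesa_tdma_process() below; locking, backlog and error handling are
omitted):

	tdma_cur = readl(engine->regs + CESA_TDMA_CUR);

	for (tdma = engine->chain.first; tdma; tdma = next) {
		next = tdma->next;

		if (tdma->flags & CESA_TDMA_END_OF_REQ) {
			/* Detach the completed request from the chain,
			 * then run its process and complete operations.
			 */
			engine->chain.first = tdma->next;
			tdma->next = NULL;
			...
		}

		if (res || tdma->cur_dma == tdma_cur)
			break;	/* reached the engine's current position */
	}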

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
---
 drivers/crypto/marvell/cesa.c   | 117 +++++++++++++++++++++++++++++++---------
 drivers/crypto/marvell/cesa.h   |  38 ++++++++++++-
 drivers/crypto/marvell/cipher.c |   3 +-
 drivers/crypto/marvell/hash.c   |   9 +++-
 drivers/crypto/marvell/tdma.c   |  81 ++++++++++++++++++++++++++++
 5 files changed, 218 insertions(+), 30 deletions(-)

diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
index f9e6688..33411f6 100644
--- a/drivers/crypto/marvell/cesa.c
+++ b/drivers/crypto/marvell/cesa.c
@@ -32,7 +32,7 @@
 #include "cesa.h"
 
 /* Limit of the crypto queue before reaching the backlog */
-#define CESA_CRYPTO_DEFAULT_MAX_QLEN 50
+#define CESA_CRYPTO_DEFAULT_MAX_QLEN 128
 
 static int allhwsupport = !IS_ENABLED(CONFIG_CRYPTO_DEV_MV_CESA);
 module_param_named(allhwsupport, allhwsupport, int, 0444);
@@ -40,23 +40,83 @@ MODULE_PARM_DESC(allhwsupport, "Enable support for all hardware (even it if over
 
 struct mv_cesa_dev *cesa_dev;
 
-static void mv_cesa_dequeue_req_unlocked(struct mv_cesa_engine *engine)
+struct crypto_async_request *mv_cesa_dequeue_req_locked(
+	struct mv_cesa_engine *engine, struct crypto_async_request **backlog)
+{
+	struct crypto_async_request *req;
+
+	*backlog = crypto_get_backlog(&engine->queue);
+	req = crypto_dequeue_request(&engine->queue);
+
+	return req;
+}
+
+static void mv_cesa_rearm_engine(struct mv_cesa_engine *engine)
 {
 	struct crypto_async_request *req, *backlog;
 	struct mv_cesa_ctx *ctx;
 
-	backlog = crypto_get_backlog(&engine->queue);
-	req = crypto_dequeue_request(&engine->queue);
-	engine->req = req;
 
+	spin_lock_bh(&engine->lock);
+	if (engine->req)
+		goto out_unlock;
+
+	req = mv_cesa_dequeue_req_locked(engine, &backlog);
 	if (!req)
-		return;
+		goto out_unlock;
+
+	engine->req = req;
+	spin_unlock_bh(&engine->lock);
 
 	if (backlog)
 		backlog->complete(backlog, -EINPROGRESS);
 
 	ctx = crypto_tfm_ctx(req->tfm);
 	ctx->ops->step(req);
+	return;
+out_unlock:
+	spin_unlock_bh(&engine->lock);
+}
+
+static int mv_cesa_std_process(struct mv_cesa_engine *engine, u32 status)
+{
+	struct crypto_async_request *req;
+	struct mv_cesa_ctx *ctx;
+	int res;
+
+	req = engine->req;
+	ctx = crypto_tfm_ctx(req->tfm);
+	res = ctx->ops->process(req, status);
+
+	if (res == 0) {
+		ctx->ops->complete(req);
+		mv_cesa_engine_enqueue_complete_request(engine, req);
+	} else if (res == -EINPROGRESS) {
+		ctx->ops->step(req);
+	} else {
+		ctx->ops->complete(req);
+	}
+
+	return res;
+}
+
+static int mv_cesa_int_process(struct mv_cesa_engine *engine, u32 status)
+{
+	if (engine->chain.first && engine->chain.last)
+		return mv_cesa_tdma_process(engine, status);
+	return mv_cesa_std_process(engine, status);
+}
+
+static inline void mv_cesa_complete_req(struct mv_cesa_ctx *ctx,
+	struct crypto_async_request *req, int res)
+{
+	ctx->ops->cleanup(req);
+	local_bh_disable();
+	req->complete(req, res);
+	local_bh_enable();
 }
 
 static irqreturn_t mv_cesa_int(int irq, void *priv)
@@ -83,26 +143,31 @@ static irqreturn_t mv_cesa_int(int irq, void *priv)
 		writel(~status, engine->regs + CESA_SA_FPGA_INT_STATUS);
 		writel(~status, engine->regs + CESA_SA_INT_STATUS);
 
+		/* Process fetched requests */
+		res = mv_cesa_int_process(engine, status & mask);
 		ret = IRQ_HANDLED;
+
 		spin_lock_bh(&engine->lock);
 		req = engine->req;
+		if (res != -EINPROGRESS)
+			engine->req = NULL;
 		spin_unlock_bh(&engine->lock);
-		if (req) {
-			ctx = crypto_tfm_ctx(req->tfm);
-			res = ctx->ops->process(req, status & mask);
-			if (res != -EINPROGRESS) {
-				spin_lock_bh(&engine->lock);
-				engine->req = NULL;
-				mv_cesa_dequeue_req_unlocked(engine);
-				spin_unlock_bh(&engine->lock);
-				ctx->ops->complete(req);
-				ctx->ops->cleanup(req);
-				local_bh_disable();
-				req->complete(req, res);
-				local_bh_enable();
-			} else {
-				ctx->ops->step(req);
-			}
+
+		ctx = crypto_tfm_ctx(req->tfm);
+
+		if (res && res != -EINPROGRESS)
+			mv_cesa_complete_req(ctx, req, res);
+
+		/* Launch the next pending request */
+		mv_cesa_rearm_engine(engine);
+
+		/* Iterate over the complete queue */
+		while (true) {
+			req = mv_cesa_engine_dequeue_complete_request(engine);
+			if (!req)
+				break;
+
+			mv_cesa_complete_req(ctx, req, 0);
 		}
 	}
 
@@ -116,16 +181,15 @@ int mv_cesa_queue_req(struct crypto_async_request *req,
 	struct mv_cesa_engine *engine = creq->engine;
 
 	spin_lock_bh(&engine->lock);
+	if (mv_cesa_req_get_type(creq) == CESA_DMA_REQ)
+		mv_cesa_tdma_chain(engine, creq);
 	ret = crypto_enqueue_request(&engine->queue, req);
 	spin_unlock_bh(&engine->lock);
 
 	if (ret != -EINPROGRESS)
 		return ret;
 
-	spin_lock_bh(&engine->lock);
-	if (!engine->req)
-		mv_cesa_dequeue_req_unlocked(engine);
-	spin_unlock_bh(&engine->lock);
+	mv_cesa_rearm_engine(engine);
 
 	return -EINPROGRESS;
 }
@@ -496,6 +560,7 @@ static int mv_cesa_probe(struct platform_device *pdev)
 
 		crypto_init_queue(&engine->queue, CESA_CRYPTO_DEFAULT_MAX_QLEN);
 		atomic_set(&engine->load, 0);
+		INIT_LIST_HEAD(&engine->complete_queue);
 	}
 
 	cesa_dev = cesa;
diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
index 5626aa7..e0fee1f 100644
--- a/drivers/crypto/marvell/cesa.h
+++ b/drivers/crypto/marvell/cesa.h
@@ -271,7 +271,9 @@ struct mv_cesa_op_ctx {
 /* TDMA descriptor flags */
 #define CESA_TDMA_DST_IN_SRAM			BIT(31)
 #define CESA_TDMA_SRC_IN_SRAM			BIT(30)
-#define CESA_TDMA_TYPE_MSK			GENMASK(29, 0)
+#define CESA_TDMA_END_OF_REQ			BIT(29)
+#define CESA_TDMA_NOT_CHAIN			BIT(28)
+#define CESA_TDMA_TYPE_MSK			GENMASK(27, 0)
 #define CESA_TDMA_DUMMY				0
 #define CESA_TDMA_DATA				1
 #define CESA_TDMA_OP				2
@@ -431,6 +433,9 @@ struct mv_cesa_dev {
  *			SRAM
  * @queue:		fifo of the pending crypto requests
  * @load:		engine load counter, useful for load balancing
+ * @chain:		list of the current tdma descriptors being processed
+ * 			by this engine.
+ * @complete_queue:	fifo of the requests processed by the engine
  *
  * Structure storing CESA engine information.
  */
@@ -448,6 +453,8 @@ struct mv_cesa_engine {
 	struct gen_pool *pool;
 	struct crypto_queue queue;
 	atomic_t load;
+	struct mv_cesa_tdma_chain chain;
+	struct list_head complete_queue;
 };
 
 /**
@@ -618,6 +625,28 @@ struct mv_cesa_ahash_req {
 
 extern struct mv_cesa_dev *cesa_dev;
 
+
+static inline void mv_cesa_engine_enqueue_complete_request(
+	struct mv_cesa_engine *engine, struct crypto_async_request *req)
+{
+	list_add_tail(&req->list, &engine->complete_queue);
+}
+
+static inline struct crypto_async_request *
+mv_cesa_engine_dequeue_complete_request(struct mv_cesa_engine *engine)
+{
+	struct crypto_async_request *req;
+
+	req = list_first_entry_or_null(&engine->complete_queue,
+				       struct crypto_async_request,
+				       list);
+	if (req)
+		list_del(&req->list);
+
+	return req;
+}
+
+
 static inline enum mv_cesa_req_type
 mv_cesa_req_get_type(struct mv_cesa_req *req)
 {
@@ -699,6 +728,10 @@ static inline bool mv_cesa_mac_op_is_first_frag(const struct mv_cesa_op_ctx *op)
 int mv_cesa_queue_req(struct crypto_async_request *req,
 		      struct mv_cesa_req *creq);
 
+struct crypto_async_request *mv_cesa_dequeue_req_locked(
+		      struct mv_cesa_engine *engine,
+		      struct crypto_async_request **backlog);
+
 static inline struct mv_cesa_engine *mv_cesa_select_engine(int weight)
 {
 	int i;
@@ -804,6 +837,9 @@ static inline int mv_cesa_dma_process(struct mv_cesa_req *dreq,
 void mv_cesa_dma_prepare(struct mv_cesa_req *dreq,
 			 struct mv_cesa_engine *engine);
 void mv_cesa_dma_cleanup(struct mv_cesa_req *dreq);
+void mv_cesa_tdma_chain(struct mv_cesa_engine *engine,
+			struct mv_cesa_req *dreq);
+int mv_cesa_tdma_process(struct mv_cesa_engine *engine, u32 status);
 
 
 static inline void
diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index 02aa38f..9033191 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -225,7 +225,6 @@ mv_cesa_ablkcipher_complete(struct crypto_async_request *req)
 static const struct mv_cesa_req_ops mv_cesa_ablkcipher_req_ops = {
 	.step = mv_cesa_ablkcipher_step,
 	.process = mv_cesa_ablkcipher_process,
-	.prepare = mv_cesa_ablkcipher_prepare,
 	.cleanup = mv_cesa_ablkcipher_req_cleanup,
 	.complete = mv_cesa_ablkcipher_complete,
 };
@@ -384,6 +383,7 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
 		goto err_free_tdma;
 
 	dreq->chain = chain;
+	dreq->chain.last->flags |= CESA_TDMA_END_OF_REQ;
 
 	return 0;
 
@@ -441,7 +441,6 @@ static int mv_cesa_ablkcipher_req_init(struct ablkcipher_request *req,
 	mv_cesa_update_op_cfg(tmpl, CESA_SA_DESC_CFG_OP_CRYPT_ONLY,
 			      CESA_SA_DESC_CFG_OP_MSK);
 
-	/* TODO: add a threshold for DMA usage */
 	if (cesa_dev->caps->has_tdma)
 		ret = mv_cesa_ablkcipher_dma_req_init(req, tmpl);
 	else
diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index 5946a69..c2ff353 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -172,6 +172,9 @@ static void mv_cesa_ahash_std_step(struct ahash_request *req)
 	for (i = 0; i < digsize / 4; i++)
 		writel_relaxed(creq->state[i], engine->regs + CESA_IVDIG(i));
 
+	mv_cesa_adjust_op(engine, &creq->op_tmpl);
+	memcpy_toio(engine->sram, &creq->op_tmpl, sizeof(creq->op_tmpl));
+
 	if (creq->cache_ptr)
 		memcpy_toio(engine->sram + CESA_SA_DATA_SRAM_OFFSET,
 			    creq->cache, creq->cache_ptr);
@@ -282,6 +285,9 @@ static void mv_cesa_ahash_step(struct crypto_async_request *req)
 {
 	struct ahash_request *ahashreq = ahash_request_cast(req);
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(ahashreq);
+	struct mv_cesa_engine *engine = creq->req.base.engine;
+	unsigned int digsize;
+	int i;
 
 	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
 		mv_cesa_dma_step(&creq->req.base);
@@ -367,7 +373,6 @@ static void mv_cesa_ahash_req_cleanup(struct crypto_async_request *req)
 static const struct mv_cesa_req_ops mv_cesa_ahash_req_ops = {
 	.step = mv_cesa_ahash_step,
 	.process = mv_cesa_ahash_process,
-	.prepare = mv_cesa_ahash_prepare,
 	.cleanup = mv_cesa_ahash_req_cleanup,
 	.complete = mv_cesa_ahash_complete,
 };
@@ -648,6 +653,8 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 	else
 		creq->cache_ptr = 0;
 
+	dreq->chain.last->flags |= (CESA_TDMA_END_OF_REQ | CESA_TDMA_NOT_CHAIN);
+
 	return 0;
 
 err_free_tdma:
diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
index 9a424f9..ae50545 100644
--- a/drivers/crypto/marvell/tdma.c
+++ b/drivers/crypto/marvell/tdma.c
@@ -98,6 +98,87 @@ void mv_cesa_dma_prepare(struct mv_cesa_req *dreq,
 	}
 }
 
+void
+mv_cesa_tdma_chain(struct mv_cesa_engine *engine, struct mv_cesa_req *dreq)
+{
+	if (engine->chain.first == NULL && engine->chain.last == NULL) {
+		engine->chain.first = dreq->chain.first;
+		engine->chain.last  = dreq->chain.last;
+	} else {
+		struct mv_cesa_tdma_desc *last;
+
+		last = engine->chain.last;
+		last->next = dreq->chain.first;
+		engine->chain.last = dreq->chain.last;
+		if (!(last->flags & CESA_TDMA_NOT_CHAIN))
+			last->next_dma = dreq->chain.first->cur_dma;
+	}
+}
+
+int
+mv_cesa_tdma_process(struct mv_cesa_engine *engine, u32 status)
+{
+	struct crypto_async_request *req = NULL;
+	struct mv_cesa_tdma_desc *tdma = NULL, *next = NULL;
+	dma_addr_t tdma_cur;
+	int res = 0;
+
+	tdma_cur = readl(engine->regs + CESA_TDMA_CUR);
+
+	for (tdma = engine->chain.first; tdma; tdma = next) {
+		spin_lock_bh(&engine->lock);
+		next = tdma->next;
+		spin_unlock_bh(&engine->lock);
+
+		if (tdma->flags & CESA_TDMA_END_OF_REQ) {
+			struct crypto_async_request *backlog = NULL;
+			struct mv_cesa_ctx *ctx;
+
+			spin_lock_bh(&engine->lock);
+			/*
+			 * if req is NULL, this means we're processing the
+			 * request in engine->req.
+			 */
+			if (!req)
+				req = engine->req;
+			else
+				req = mv_cesa_dequeue_req_locked(engine,
+								 &backlog);
+
+			/* Re-chaining to the next request */
+			engine->chain.first = tdma->next;
+			tdma->next = NULL;
+
+			/* If this is the last request, clear the chain */
+			if (engine->chain.first == NULL)
+				engine->chain.last  = NULL;
+			spin_unlock_bh(&engine->lock);
+
+			ctx = crypto_tfm_ctx(req->tfm);
+			res = ctx->ops->process(req, status);
+			ctx->ops->complete(req);
+
+			if (res == 0)
+				mv_cesa_engine_enqueue_complete_request(engine,
+									req);
+
+			if (backlog)
+				backlog->complete(backlog, -EINPROGRESS);
+		}
+		if (res || tdma->cur_dma == tdma_cur)
+			break;
+	}
+
+	if (res) {
+		spin_lock_bh(&engine->lock);
+		engine->req = req;
+		spin_unlock_bh(&engine->lock);
+	}
+
+	return res;
+}
+
+
 static struct mv_cesa_tdma_desc *
 mv_cesa_dma_add_desc(struct mv_cesa_tdma_chain *chain, gfp_t flags)
 {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 50+ messages in thread
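
As an aside on the hunks above: the two new TDMA flags play different
roles, and the distinction is easy to miss. Here is a minimal sketch
(a hypothetical helper, not code from the series; names are taken from
the patch) of how a request marks its last descriptor:

	/*
	 * CESA_TDMA_END_OF_REQ marks the last descriptor of a request so
	 * that mv_cesa_tdma_process() knows where to split the engine-wide
	 * chain. CESA_TDMA_NOT_CHAIN additionally tells mv_cesa_tdma_chain()
	 * not to hardware-link the next request's first descriptor via
	 * next_dma, as the ahash path does above.
	 */
	static void mv_cesa_mark_last_desc(struct mv_cesa_req *dreq,
					   bool hw_chainable)
	{
		u32 flags = CESA_TDMA_END_OF_REQ;

		if (!hw_chainable)
			flags |= CESA_TDMA_NOT_CHAIN;

		dreq->chain.last->flags |= flags;
	}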

* Re: [PATCH 1/7] crypto: marvell: Add a macro constant for the size of the crypto queue
  2016-06-15 19:15   ` Romain Perier
@ 2016-06-15 19:20     ` Boris Brezillon
  -1 siblings, 0 replies; 50+ messages in thread
From: Boris Brezillon @ 2016-06-15 19:20 UTC (permalink / raw)
  To: Romain Perier
  Cc: Arnaud Ebalard, Gregory Clement, Thomas Petazzoni,
	David S. Miller, Russell King, linux-crypto, linux-arm-kernel

On Wed, 15 Jun 2016 21:15:28 +0200
Romain Perier <romain.perier@free-electrons.com> wrote:

> Adding a macro constant to be used for the size of the crypto queue,
> instead of using a numeric value directly. It will be easier to
> maintain in case we add more than one crypto queue of the same size.
> 
> Signed-off-by: Romain Perier <romain.perier@free-electrons.com>

Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>

> ---
>  drivers/crypto/marvell/cesa.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
> index 056a754..fb403e1 100644
> --- a/drivers/crypto/marvell/cesa.c
> +++ b/drivers/crypto/marvell/cesa.c
> @@ -31,6 +31,9 @@
>  
>  #include "cesa.h"
>  
> +/* Limit of the crypto queue before reaching the backlog */
> +#define CESA_CRYPTO_DEFAULT_MAX_QLEN 50
> +
>  static int allhwsupport = !IS_ENABLED(CONFIG_CRYPTO_DEV_MV_CESA);
>  module_param_named(allhwsupport, allhwsupport, int, 0444);
>  MODULE_PARM_DESC(allhwsupport, "Enable support for all hardware (even it if overlaps with the mv_cesa driver)");
> @@ -416,7 +419,7 @@ static int mv_cesa_probe(struct platform_device *pdev)
>  		return -ENOMEM;
>  
>  	spin_lock_init(&cesa->lock);
> -	crypto_init_queue(&cesa->queue, 50);
> +	crypto_init_queue(&cesa->queue, CESA_CRYPTO_DEFAULT_MAX_QLEN);
>  	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "regs");
>  	cesa->regs = devm_ioremap_resource(dev, res);
>  	if (IS_ERR(cesa->regs))



-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 2/7] crypto: marvell: Check engine is not already running when enabling a req
  2016-06-15 19:15   ` Romain Perier
@ 2016-06-15 19:37     ` Boris Brezillon
  -1 siblings, 0 replies; 50+ messages in thread
From: Boris Brezillon @ 2016-06-15 19:37 UTC (permalink / raw)
  To: Romain Perier
  Cc: Arnaud Ebalard, Gregory Clement, Thomas Petazzoni,
	David S. Miller, Russell King, linux-crypto, linux-arm-kernel

On Wed, 15 Jun 2016 21:15:29 +0200
Romain Perier <romain.perier@free-electrons.com> wrote:

> Adding BUG_ON() macro to be sure that the step operation is not about
> to activate a request on the engine if the corresponding engine is
> already processing a crypto request. This is helpful when the support
> for chaining crypto requests will be added. Instead of hanging the
> system when the engine is in an incoherent state, we add this macro

You don't add the macro, you use it.

> which throws an understandable error.

How about rewording the commit message this way:

"
Add a BUG_ON() call when the driver tries to launch a crypto request
while the engine is still processing the previous one. This replaces
a silent system hang with a verbose kernel panic and the associated
backtrace to let the user know that something went wrong in the CESA
driver.
"

> 
> Signed-off-by: Romain Perier <romain.perier@free-electrons.com>

Apart from the coding style issue mentioned below,

Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>

> ---
>  drivers/crypto/marvell/cipher.c | 2 ++
>  drivers/crypto/marvell/hash.c   | 2 ++
>  drivers/crypto/marvell/tdma.c   | 2 ++
>  3 files changed, 6 insertions(+)
> 
> diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
> index dcf1fce..8d0fabb 100644
> --- a/drivers/crypto/marvell/cipher.c
> +++ b/drivers/crypto/marvell/cipher.c
> @@ -106,6 +106,8 @@ static void mv_cesa_ablkcipher_std_step(struct ablkcipher_request *req)
>  
>  	mv_cesa_set_int_mask(engine, CESA_SA_INT_ACCEL0_DONE);
>  	writel_relaxed(CESA_SA_CFG_PARA_DIS, engine->regs + CESA_SA_CFG);
> +	BUG_ON(readl(engine->regs + CESA_SA_CMD)
> +				  & CESA_SA_CMD_EN_CESA_SA_ACCL0);

Nit: please put the '&' operator at the end of the first line and
align CESA_SA_CMD_EN_CESA_SA_ACCL0 on the open parenthesis.

	BUG_ON(readl(engine->regs + CESA_SA_CMD) &
	       CESA_SA_CMD_EN_CESA_SA_ACCL0);

>  	writel(CESA_SA_CMD_EN_CESA_SA_ACCL0, engine->regs + CESA_SA_CMD);
>  }
>  
> diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
> index 7ca2e0f..0fae351 100644
> --- a/drivers/crypto/marvell/hash.c
> +++ b/drivers/crypto/marvell/hash.c
> @@ -237,6 +237,8 @@ static void mv_cesa_ahash_std_step(struct ahash_request *req)
>  
>  	mv_cesa_set_int_mask(engine, CESA_SA_INT_ACCEL0_DONE);
>  	writel_relaxed(CESA_SA_CFG_PARA_DIS, engine->regs + CESA_SA_CFG);
> +	BUG_ON(readl(engine->regs + CESA_SA_CMD)
> +				  & CESA_SA_CMD_EN_CESA_SA_ACCL0);

Ditto.

>  	writel(CESA_SA_CMD_EN_CESA_SA_ACCL0, engine->regs + CESA_SA_CMD);
>  }
>  
> diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
> index 7642798..d493714 100644
> --- a/drivers/crypto/marvell/tdma.c
> +++ b/drivers/crypto/marvell/tdma.c
> @@ -53,6 +53,8 @@ void mv_cesa_dma_step(struct mv_cesa_tdma_req *dreq)
>  		       engine->regs + CESA_SA_CFG);
>  	writel_relaxed(dreq->chain.first->cur_dma,
>  		       engine->regs + CESA_TDMA_NEXT_ADDR);
> +	BUG_ON(readl(engine->regs + CESA_SA_CMD)
> +				  & CESA_SA_CMD_EN_CESA_SA_ACCL0);

Ditto.

>  	writel(CESA_SA_CMD_EN_CESA_SA_ACCL0, engine->regs + CESA_SA_CMD);
>  }
>  



-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 3/7] crypto: marvell: Copy IV vectors by DMA transfers for acipher requests
  2016-06-15 19:15   ` Romain Perier
@ 2016-06-15 20:07     ` Boris Brezillon
  -1 siblings, 0 replies; 50+ messages in thread
From: Boris Brezillon @ 2016-06-15 20:07 UTC (permalink / raw)
  To: Romain Perier
  Cc: Thomas Petazzoni, Russell King, Arnaud Ebalard, linux-crypto,
	Gregory Clement, David S. Miller, linux-arm-kernel

On Wed, 15 Jun 2016 21:15:30 +0200
Romain Perier <romain.perier@free-electrons.com> wrote:

> Adding a TDMA descriptor at the end of the request for copying the
> output IV vector via a DMA transfer. This is required for processing
> cipher requests asynchroniously in chained mode, otherwise the content

		  asynchronously

> of the IV vector will be overwriten for each new finished request.

BTW, not sure the term 'asynchronously' is appropriate here. The
standard (AKA non-DMA) processing is also asynchronous. The real reason
here is that you want to chain the requests and offload as much
processing as possible to the DMA and crypto engine. And as you
explained, this is only possible if we retrieve the updated IV using
DMA. 

> 
> Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
> ---
>  drivers/crypto/marvell/cesa.c   |  4 ++++
>  drivers/crypto/marvell/cesa.h   |  5 +++++
>  drivers/crypto/marvell/cipher.c | 40 +++++++++++++++++++++++++++-------------
>  drivers/crypto/marvell/tdma.c   | 29 +++++++++++++++++++++++++++++
>  4 files changed, 65 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
> index fb403e1..93700cd 100644
> --- a/drivers/crypto/marvell/cesa.c
> +++ b/drivers/crypto/marvell/cesa.c
> @@ -312,6 +312,10 @@ static int mv_cesa_dev_dma_init(struct mv_cesa_dev *cesa)
>  	if (!dma->padding_pool)
>  		return -ENOMEM;
>  
> +	dma->iv_pool = dmam_pool_create("cesa_iv", dev, 16, 1, 0);
> +	if (!dma->iv_pool)
> +		return -ENOMEM;
> +
>  	cesa->dma = dma;
>  
>  	return 0;
> diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
> index 74071e4..74b84bd 100644
> --- a/drivers/crypto/marvell/cesa.h
> +++ b/drivers/crypto/marvell/cesa.h
> @@ -275,6 +275,7 @@ struct mv_cesa_op_ctx {
>  #define CESA_TDMA_DUMMY				0
>  #define CESA_TDMA_DATA				1
>  #define CESA_TDMA_OP				2
> +#define CESA_TDMA_IV				4

Should be 3 and not 4: TDMA_TYPE is an enum, not a bit field.
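
To make that concrete, here is a sketch (not from the patch) of how the
flags word is carved up, and why the next free type value is 3:

	/* bit 31: CESA_TDMA_DST_IN_SRAM
	 * bit 30: CESA_TDMA_SRC_IN_SRAM
	 * bits 29-0 (CESA_TDMA_TYPE_MSK): an integer type field, not a
	 * bit mask, so the value after CESA_TDMA_OP (2) is 3, not BIT(2).
	 */
	u32 type = tdma->flags & CESA_TDMA_TYPE_MSK;	/* extract the type */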

>  
>  /**
>   * struct mv_cesa_tdma_desc - TDMA descriptor
> @@ -390,6 +391,7 @@ struct mv_cesa_dev_dma {
>  	struct dma_pool *op_pool;
>  	struct dma_pool *cache_pool;
>  	struct dma_pool *padding_pool;
> +	struct dma_pool *iv_pool;
>  };
>  
>  /**
> @@ -790,6 +792,9 @@ mv_cesa_tdma_desc_iter_init(struct mv_cesa_tdma_chain *chain)
>  	memset(chain, 0, sizeof(*chain));
>  }
>  
> +int mv_cesa_dma_add_iv_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
> +			  u32 size, u32 flags, gfp_t gfp_flags);
> +
>  struct mv_cesa_op_ctx *mv_cesa_dma_add_op(struct mv_cesa_tdma_chain *chain,
>  					const struct mv_cesa_op_ctx *op_templ,
>  					bool skip_ctx,
> diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
> index 8d0fabb..f42620e 100644
> --- a/drivers/crypto/marvell/cipher.c
> +++ b/drivers/crypto/marvell/cipher.c
> @@ -118,6 +118,7 @@ static int mv_cesa_ablkcipher_std_process(struct ablkcipher_request *req,
>  	struct mv_cesa_ablkcipher_std_req *sreq = &creq->req.std;
>  	struct mv_cesa_engine *engine = sreq->base.engine;
>  	size_t len;
> +	unsigned int ivsize;
>  
>  	len = sg_pcopy_from_buffer(req->dst, creq->dst_nents,
>  				   engine->sram + CESA_SA_DATA_SRAM_OFFSET,
> @@ -127,6 +128,10 @@ static int mv_cesa_ablkcipher_std_process(struct ablkcipher_request *req,
>  	if (sreq->offset < req->nbytes)
>  		return -EINPROGRESS;
>  
> +	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(req));
> +	memcpy_fromio(req->info,
> +		      engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET, ivsize);
> +
>  	return 0;
>  }
>  
> @@ -135,23 +140,23 @@ static int mv_cesa_ablkcipher_process(struct crypto_async_request *req,
>  {
>  	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
>  	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
> -	struct mv_cesa_ablkcipher_std_req *sreq = &creq->req.std;
> -	struct mv_cesa_engine *engine = sreq->base.engine;
> -	int ret;
>  
> -	if (creq->req.base.type == CESA_DMA_REQ)
> +	if (creq->req.base.type == CESA_DMA_REQ) {
> +		int ret;
> +		struct mv_cesa_tdma_req *dreq;
> +		unsigned int ivsize;
> +
>  		ret = mv_cesa_dma_process(&creq->req.dma, status);
> -	else
> -		ret = mv_cesa_ablkcipher_std_process(ablkreq, status);
> +		if (ret)
> +			return ret;
>  
> -	if (ret)
> +		dreq = &creq->req.dma;
> +		ivsize = crypto_ablkcipher_ivsize(
> +					     crypto_ablkcipher_reqtfm(ablkreq));

Sometimes it's better to break the < 80 characters rule than to do
funky stuff ;).

> +		memcpy_fromio(ablkreq->info, dreq->chain.last->data, ivsize);
>  		return ret;
> -
> -	memcpy_fromio(ablkreq->info,
> -		      engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET,
> -		      crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(ablkreq)));
> -
> -	return 0;
> +	}

Missing blank line.

> +	return mv_cesa_ablkcipher_std_process(ablkreq, status);

This version is more readable IMHO:

	struct mv_cesa_tdma_req *dreq;
	unsigned int ivsize;
	int ret;

	if (creq->req.base.type == CESA_STD_REQ)
		return mv_cesa_ablkcipher_std_process(ablkreq, status);

	ret = mv_cesa_dma_process(&creq->req.dma, status);
	if (ret)
		return ret;

	dreq = &creq->req.dma;
	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(ablkreq));
	memcpy_fromio(ablkreq->info, dreq->chain.last->data, ivsize);

	return 0;

>  
>  static void mv_cesa_ablkcipher_step(struct crypto_async_request *req)
> @@ -302,6 +307,7 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
>  	struct mv_cesa_tdma_chain chain;
>  	bool skip_ctx = false;
>  	int ret;
> +	unsigned int ivsize;
>  
>  	dreq->base.type = CESA_DMA_REQ;
>  	dreq->chain.first = NULL;
> @@ -360,6 +366,14 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
>  
>  	} while (mv_cesa_ablkcipher_req_iter_next_op(&iter));
>  
> +	/* Add output data for IV */
> +	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(req));
> +	ret = mv_cesa_dma_add_iv_op(&chain, CESA_SA_CRYPT_IV_SRAM_OFFSET,
> +				    ivsize, CESA_TDMA_SRC_IN_SRAM, flags);
> +
> +	if (ret)
> +		goto err_free_tdma;
> +
>  	dreq->chain = chain;
>  
>  	return 0;
> diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
> index d493714..88c87be 100644
> --- a/drivers/crypto/marvell/tdma.c
> +++ b/drivers/crypto/marvell/tdma.c
> @@ -68,6 +68,9 @@ void mv_cesa_dma_cleanup(struct mv_cesa_tdma_req *dreq)
>  		if (tdma->flags & CESA_TDMA_OP)

I realize this test is wrong.

It should be
		type = tdma->flags & CESA_TDMA_TYPE_MSK;
		if (type == CESA_TDMA_OP)

>  			dma_pool_free(cesa_dev->dma->op_pool, tdma->op,
>  				      le32_to_cpu(tdma->src));
> +		else if (tdma->flags & CESA_TDMA_IV)

and here
		else if (type == CESA_TDMA_IV)
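
Applied together, both fixes would make the cleanup test read like this
(a sketch combining the two corrections above):

	u32 type = tdma->flags & CESA_TDMA_TYPE_MSK;

	if (type == CESA_TDMA_OP)
		dma_pool_free(cesa_dev->dma->op_pool, tdma->op,
			      le32_to_cpu(tdma->src));
	else if (type == CESA_TDMA_IV)
		dma_pool_free(cesa_dev->dma->iv_pool, tdma->data,
			      le32_to_cpu(tdma->dst));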

> +			dma_pool_free(cesa_dev->dma->iv_pool, tdma->data,
> +				      le32_to_cpu(tdma->dst));
>  
>  		tdma = tdma->next;
>  		dma_pool_free(cesa_dev->dma->tdma_desc_pool, old_tdma,
> @@ -120,6 +123,32 @@ mv_cesa_dma_add_desc(struct mv_cesa_tdma_chain *chain, gfp_t flags)
>  	return new_tdma;
>  }
>  
> +int mv_cesa_dma_add_iv_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
> +			  u32 size, u32 flags, gfp_t gfp_flags)
> +{
> +
> +	struct mv_cesa_tdma_desc *tdma;
> +	u8 *cache;

Why do you name that one 'cache'? 'iv' would be a better name.

> +	dma_addr_t dma_handle;
> +
> +	tdma = mv_cesa_dma_add_desc(chain, gfp_flags);
> +	if (IS_ERR(tdma))
> +		return PTR_ERR(tdma);
> +
> +	cache = dma_pool_alloc(cesa_dev->dma->iv_pool, flags, &dma_handle);
> +	if (!cache)
> +		return -ENOMEM;
> +
> +	tdma->byte_cnt = cpu_to_le32(size | BIT(31));
> +	tdma->src = src;
> +	tdma->dst = cpu_to_le32(dma_handle);
> +	tdma->data = cache;
> +
> +	flags &= (CESA_TDMA_DST_IN_SRAM | CESA_TDMA_SRC_IN_SRAM);
> +	tdma->flags = flags | CESA_TDMA_DATA | CESA_TDMA_IV;

You should not mix two different types: it's either CESA_TDMA_DATA or
CESA_TDMA_IV, and in this case it should be CESA_TDMA_IV.

> +	return 0;
> +}
> +
>  struct mv_cesa_op_ctx *mv_cesa_dma_add_op(struct mv_cesa_tdma_chain *chain,
>  					const struct mv_cesa_op_ctx *op_templ,
>  					bool skip_ctx,



-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 4/7] crypto: marvell: Moving the tdma chain out of mv_cesa_tdma_req
  2016-06-15 19:15   ` Romain Perier
@ 2016-06-15 20:42     ` Boris Brezillon
  -1 siblings, 0 replies; 50+ messages in thread
From: Boris Brezillon @ 2016-06-15 20:42 UTC (permalink / raw)
  To: Romain Perier
  Cc: Arnaud Ebalard, Gregory Clement, Thomas Petazzoni,
	David S. Miller, Russell King, linux-crypto, linux-arm-kernel

On Wed, 15 Jun 2016 21:15:31 +0200
Romain Perier <romain.perier@free-electrons.com> wrote:

> Actually the only way to access the tdma chain is to use the 'req' union

Currently, ...

> from a mv_cesa_{ablkcipher,ahash}. This will soon become a problem if we
> want to handle the TDMA chaining vs standard/non-DMA processing in a
> generic way (with generic functions at the cesa.c level detecting
> whether the request should be queued at the DMA level or not). Hence the
> decision to move the chain field a the mv_cesa_req level at the expense

				   at

> of adding 2 void * fields to all request contexts (including non-DMA
> ones). To limit the overhead, we get rid of the type field, which can
> now be deduced from the req->chain.first value.
> 
> Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
> ---
>  drivers/crypto/marvell/cesa.c   |  3 ++-
>  drivers/crypto/marvell/cesa.h   | 31 +++++++++++++------------------
>  drivers/crypto/marvell/cipher.c | 40 ++++++++++++++++++++++------------------
>  drivers/crypto/marvell/hash.c   | 36 +++++++++++++++---------------------
>  drivers/crypto/marvell/tdma.c   |  8 ++++----
>  5 files changed, 56 insertions(+), 62 deletions(-)
> 
> diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
> index 93700cd..fe04d1b 100644
> --- a/drivers/crypto/marvell/cesa.c
> +++ b/drivers/crypto/marvell/cesa.c
> @@ -111,7 +111,8 @@ static irqreturn_t mv_cesa_int(int irq, void *priv)
>  	return ret;
>  }
>  
> -int mv_cesa_queue_req(struct crypto_async_request *req)
> +int mv_cesa_queue_req(struct crypto_async_request *req,
> +		      struct mv_cesa_req *creq)
>  {
>  	int ret;
>  	int i;
> diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
> index 74b84bd..158ff82 100644
> --- a/drivers/crypto/marvell/cesa.h
> +++ b/drivers/crypto/marvell/cesa.h
> @@ -509,21 +509,11 @@ enum mv_cesa_req_type {
>  
>  /**
>   * struct mv_cesa_req - CESA request
> - * @type:	request type
>   * @engine:	engine associated with this request
> + * @chain:	list of tdma descriptors associated  with this request

						   ^ extra white space.

>   */
>  struct mv_cesa_req {
> -	enum mv_cesa_req_type type;
>  	struct mv_cesa_engine *engine;
> -};
> -
> -/**
> - * struct mv_cesa_tdma_req - CESA TDMA request
> - * @base:	base information
> - * @chain:	TDMA chain
> - */
> -struct mv_cesa_tdma_req {
> -	struct mv_cesa_req base;
>  	struct mv_cesa_tdma_chain chain;
>  };
>  
> @@ -562,7 +552,6 @@ struct mv_cesa_ablkcipher_std_req {
>  struct mv_cesa_ablkcipher_req {
>  	union {
>  		struct mv_cesa_req base;
> -		struct mv_cesa_tdma_req dma;
>  		struct mv_cesa_ablkcipher_std_req std;

Now that the DMA-specific fields are part of the base request, there's
no reason to keep this union.

You can just put struct mv_cesa_req base; directly under struct
mv_cesa_ablkcipher_req, and move the mv_cesa_ablkcipher_std_req fields
into mv_cesa_ablkcipher_req.
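
Flattened, that could look like this (a sketch only; field names taken
from the existing std/base structs, exact layout up to the author):

	struct mv_cesa_ablkcipher_req {
		struct mv_cesa_req base;	/* engine + tdma chain */
		struct mv_cesa_op_ctx op;	/* from the old std req */
		size_t offset;
		bool skip_ctx;
		int src_nents;
		int dst_nents;
	};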

>  	} req;
>  	int src_nents;
> @@ -587,7 +576,6 @@ struct mv_cesa_ahash_std_req {
>   * @cache_dma:		DMA address of the cache buffer
>   */
>  struct mv_cesa_ahash_dma_req {
> -	struct mv_cesa_tdma_req base;
>  	u8 *padding;
>  	dma_addr_t padding_dma;
>  	u8 *cache;
> @@ -625,6 +613,12 @@ struct mv_cesa_ahash_req {
>  
>  extern struct mv_cesa_dev *cesa_dev;
>  
> +static inline enum mv_cesa_req_type
> +mv_cesa_req_get_type(struct mv_cesa_req *req)
> +{
> +	return req->chain.first ? CESA_DMA_REQ : CESA_STD_REQ;
> +}
> +
>  static inline void mv_cesa_update_op_cfg(struct mv_cesa_op_ctx *op,
>  					 u32 cfg, u32 mask)
>  {
> @@ -697,7 +691,8 @@ static inline bool mv_cesa_mac_op_is_first_frag(const struct mv_cesa_op_ctx *op)
>  		CESA_SA_DESC_CFG_FIRST_FRAG;
>  }
>  
> -int mv_cesa_queue_req(struct crypto_async_request *req);
> +int mv_cesa_queue_req(struct crypto_async_request *req,
> +		      struct mv_cesa_req *creq);
>  
>  /*
>   * Helper function that indicates whether a crypto request needs to be
> @@ -767,9 +762,9 @@ static inline bool mv_cesa_req_dma_iter_next_op(struct mv_cesa_dma_iter *iter)
>  	return iter->op_len;
>  }
>  
> -void mv_cesa_dma_step(struct mv_cesa_tdma_req *dreq);
> +void mv_cesa_dma_step(struct mv_cesa_req *dreq);
>  
> -static inline int mv_cesa_dma_process(struct mv_cesa_tdma_req *dreq,
> +static inline int mv_cesa_dma_process(struct mv_cesa_req *dreq,
>  				      u32 status)
>  {
>  	if (!(status & CESA_SA_INT_ACC0_IDMA_DONE))
> @@ -781,10 +776,10 @@ static inline int mv_cesa_dma_process(struct mv_cesa_tdma_req *dreq,
>  	return 0;
>  }
>  
> -void mv_cesa_dma_prepare(struct mv_cesa_tdma_req *dreq,
> +void mv_cesa_dma_prepare(struct mv_cesa_req *dreq,
>  			 struct mv_cesa_engine *engine);
> +void mv_cesa_dma_cleanup(struct mv_cesa_req *dreq);
>  
> -void mv_cesa_dma_cleanup(struct mv_cesa_tdma_req *dreq);
>  
>  static inline void
>  mv_cesa_tdma_desc_iter_init(struct mv_cesa_tdma_chain *chain)
> diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
> index f42620e..15d2c5a 100644
> --- a/drivers/crypto/marvell/cipher.c
> +++ b/drivers/crypto/marvell/cipher.c
> @@ -70,14 +70,14 @@ mv_cesa_ablkcipher_dma_cleanup(struct ablkcipher_request *req)
>  		dma_unmap_sg(cesa_dev->dev, req->src, creq->src_nents,
>  			     DMA_BIDIRECTIONAL);
>  	}
> -	mv_cesa_dma_cleanup(&creq->req.dma);
> +	mv_cesa_dma_cleanup(&creq->req.base);
>  }
>  
>  static inline void mv_cesa_ablkcipher_cleanup(struct ablkcipher_request *req)
>  {
>  	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
>  
> -	if (creq->req.base.type == CESA_DMA_REQ)
> +	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
>  		mv_cesa_ablkcipher_dma_cleanup(req);
>  }
>  
> @@ -141,19 +141,19 @@ static int mv_cesa_ablkcipher_process(struct crypto_async_request *req,
>  	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
>  	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
>  
> -	if (creq->req.base.type == CESA_DMA_REQ) {
> +	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ) {
>  		int ret;
> -		struct mv_cesa_tdma_req *dreq;
> +		struct mv_cesa_req *basereq;
>  		unsigned int ivsize;
>  
> -		ret = mv_cesa_dma_process(&creq->req.dma, status);
> +		ret = mv_cesa_dma_process(&creq->req.base, status);

Initialize basereq earlier and pass it as the first argument of
mv_cesa_dma_process().
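
I.e., something along these lines (sketch):

	struct mv_cesa_req *basereq = &creq->req.base;

	ret = mv_cesa_dma_process(basereq, status);
	if (ret)
		return ret;

	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(ablkreq));
	memcpy_fromio(ablkreq->info, basereq->chain.last->data, ivsize);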

>  		if (ret)
>  			return ret;
>  
> -		dreq = &creq->req.dma;
> +		basereq = &creq->req.base;
>  		ivsize = crypto_ablkcipher_ivsize(
>  					     crypto_ablkcipher_reqtfm(ablkreq));
> -		memcpy_fromio(ablkreq->info, dreq->chain.last->data, ivsize);
> +		memcpy_fromio(ablkreq->info, basereq->chain.last->data, ivsize);
>  		return ret;
>  	}
>  	return mv_cesa_ablkcipher_std_process(ablkreq, status);
> @@ -164,8 +164,8 @@ static void mv_cesa_ablkcipher_step(struct crypto_async_request *req)
>  	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
>  	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
>  
> -	if (creq->req.base.type == CESA_DMA_REQ)
> -		mv_cesa_dma_step(&creq->req.dma);
> +	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
> +		mv_cesa_dma_step(&creq->req.base);
>  	else
>  		mv_cesa_ablkcipher_std_step(ablkreq);
>  }
> @@ -174,9 +174,9 @@ static inline void
>  mv_cesa_ablkcipher_dma_prepare(struct ablkcipher_request *req)
>  {
>  	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
> -	struct mv_cesa_tdma_req *dreq = &creq->req.dma;
> +	struct mv_cesa_req *dreq = &creq->req.base;

You named it basereq in mv_cesa_ablkcipher_process(). Try to be
consistent, no matter the name.

>  
> -	mv_cesa_dma_prepare(dreq, dreq->base.engine);
> +	mv_cesa_dma_prepare(dreq, dreq->engine);
>  }
>  
>  static inline void
> @@ -199,7 +199,7 @@ static inline void mv_cesa_ablkcipher_prepare(struct crypto_async_request *req,
>  	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
>  	creq->req.base.engine = engine;
>  
> -	if (creq->req.base.type == CESA_DMA_REQ)
> +	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
>  		mv_cesa_ablkcipher_dma_prepare(ablkreq);
>  	else
>  		mv_cesa_ablkcipher_std_prepare(ablkreq);
> @@ -302,14 +302,13 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
>  	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
>  	gfp_t flags = (req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) ?
>  		      GFP_KERNEL : GFP_ATOMIC;
> -	struct mv_cesa_tdma_req *dreq = &creq->req.dma;
> +	struct mv_cesa_req *dreq = &creq->req.base;

Ditto.

>  	struct mv_cesa_ablkcipher_dma_iter iter;
>  	struct mv_cesa_tdma_chain chain;
>  	bool skip_ctx = false;
>  	int ret;
>  	unsigned int ivsize;
>  
> -	dreq->base.type = CESA_DMA_REQ;
>  	dreq->chain.first = NULL;
>  	dreq->chain.last = NULL;
>  
> @@ -397,10 +396,12 @@ mv_cesa_ablkcipher_std_req_init(struct ablkcipher_request *req,
>  {
>  	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
>  	struct mv_cesa_ablkcipher_std_req *sreq = &creq->req.std;
> +	struct mv_cesa_req *basereq = &creq->req.base;
>  
> -	sreq->base.type = CESA_STD_REQ;
>  	sreq->op = *op_templ;
>  	sreq->skip_ctx = false;
> +	basereq->chain.first = NULL;
> +	basereq->chain.last = NULL;
>  
>  	return 0;
>  }
> @@ -442,6 +443,7 @@ static int mv_cesa_ablkcipher_req_init(struct ablkcipher_request *req,
>  static int mv_cesa_des_op(struct ablkcipher_request *req,
>  			  struct mv_cesa_op_ctx *tmpl)
>  {
> +	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
>  	struct mv_cesa_des_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
>  	int ret;
>  
> @@ -454,7 +456,7 @@ static int mv_cesa_des_op(struct ablkcipher_request *req,
>  	if (ret)
>  		return ret;
>  
> -	ret = mv_cesa_queue_req(&req->base);
> +	ret = mv_cesa_queue_req(&req->base, &creq->req.base);
>  	if (mv_cesa_req_needs_cleanup(&req->base, ret))
>  		mv_cesa_ablkcipher_cleanup(req);
>  
> @@ -562,6 +564,7 @@ struct crypto_alg mv_cesa_cbc_des_alg = {
>  static int mv_cesa_des3_op(struct ablkcipher_request *req,
>  			   struct mv_cesa_op_ctx *tmpl)
>  {
> +	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
>  	struct mv_cesa_des3_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
>  	int ret;
>  
> @@ -574,7 +577,7 @@ static int mv_cesa_des3_op(struct ablkcipher_request *req,
>  	if (ret)
>  		return ret;
>  
> -	ret = mv_cesa_queue_req(&req->base);
> +	ret = mv_cesa_queue_req(&req->base, &creq->req.base);
>  	if (mv_cesa_req_needs_cleanup(&req->base, ret))
>  		mv_cesa_ablkcipher_cleanup(req);
>  
> @@ -688,6 +691,7 @@ struct crypto_alg mv_cesa_cbc_des3_ede_alg = {
>  static int mv_cesa_aes_op(struct ablkcipher_request *req,
>  			  struct mv_cesa_op_ctx *tmpl)
>  {
> +	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
>  	struct mv_cesa_aes_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
>  	int ret, i;
>  	u32 *key;
> @@ -716,7 +720,7 @@ static int mv_cesa_aes_op(struct ablkcipher_request *req,
>  	if (ret)
>  		return ret;
>  
> -	ret = mv_cesa_queue_req(&req->base);
> +	ret = mv_cesa_queue_req(&req->base, &creq->req.base);
>  	if (mv_cesa_req_needs_cleanup(&req->base, ret))
>  		mv_cesa_ablkcipher_cleanup(req);
>  
> diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
> index 0fae351..cc7c5b0 100644
> --- a/drivers/crypto/marvell/hash.c
> +++ b/drivers/crypto/marvell/hash.c
> @@ -103,14 +103,14 @@ static inline void mv_cesa_ahash_dma_cleanup(struct ahash_request *req)
>  
>  	dma_unmap_sg(cesa_dev->dev, req->src, creq->src_nents, DMA_TO_DEVICE);
>  	mv_cesa_ahash_dma_free_cache(&creq->req.dma);
> -	mv_cesa_dma_cleanup(&creq->req.dma.base);
> +	mv_cesa_dma_cleanup(&creq->req.base);
>  }
>  
>  static inline void mv_cesa_ahash_cleanup(struct ahash_request *req)
>  {
>  	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
>  
> -	if (creq->req.base.type == CESA_DMA_REQ)
> +	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
>  		mv_cesa_ahash_dma_cleanup(req);
>  }
>  
> @@ -118,7 +118,7 @@ static void mv_cesa_ahash_last_cleanup(struct ahash_request *req)
>  {
>  	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
>  
> -	if (creq->req.base.type == CESA_DMA_REQ)
> +	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
>  		mv_cesa_ahash_dma_last_cleanup(req);
>  }
>  
> @@ -256,9 +256,9 @@ static int mv_cesa_ahash_std_process(struct ahash_request *req, u32 status)
>  static inline void mv_cesa_ahash_dma_prepare(struct ahash_request *req)
>  {
>  	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
> -	struct mv_cesa_tdma_req *dreq = &creq->req.dma.base;
> +	struct mv_cesa_req *dreq = &creq->req.base;

Ditto.

>  
> -	mv_cesa_dma_prepare(dreq, dreq->base.engine);
> +	mv_cesa_dma_prepare(dreq, dreq->engine);
>  }
>  
>  static void mv_cesa_ahash_std_prepare(struct ahash_request *req)
> @@ -277,8 +277,8 @@ static void mv_cesa_ahash_step(struct crypto_async_request *req)
>  	struct ahash_request *ahashreq = ahash_request_cast(req);
>  	struct mv_cesa_ahash_req *creq = ahash_request_ctx(ahashreq);
>  
> -	if (creq->req.base.type == CESA_DMA_REQ)
> -		mv_cesa_dma_step(&creq->req.dma.base);
> +	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
> +		mv_cesa_dma_step(&creq->req.base);
>  	else
>  		mv_cesa_ahash_std_step(ahashreq);
>  }
> @@ -291,8 +291,8 @@ static int mv_cesa_ahash_process(struct crypto_async_request *req, u32 status)
>  	unsigned int digsize;
>  	int ret, i;
>  
> -	if (creq->req.base.type == CESA_DMA_REQ)
> -		ret = mv_cesa_dma_process(&creq->req.dma.base, status);
> +	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
> +		ret = mv_cesa_dma_process(&creq->req.base, status);
>  	else
>  		ret = mv_cesa_ahash_std_process(ahashreq, status);
>  
> @@ -340,7 +340,7 @@ static void mv_cesa_ahash_prepare(struct crypto_async_request *req,
>  
>  	creq->req.base.engine = engine;
>  
> -	if (creq->req.base.type == CESA_DMA_REQ)
> +	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
>  		mv_cesa_ahash_dma_prepare(ahashreq);
>  	else
>  		mv_cesa_ahash_std_prepare(ahashreq);
> @@ -555,8 +555,7 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
>  	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
>  	gfp_t flags = (req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) ?
>  		      GFP_KERNEL : GFP_ATOMIC;
> -	struct mv_cesa_ahash_dma_req *ahashdreq = &creq->req.dma;
> -	struct mv_cesa_tdma_req *dreq = &ahashdreq->base;
> +	struct mv_cesa_req *dreq = &creq->req.base;

Ditto.

>  	struct mv_cesa_ahash_dma_iter iter;
>  	struct mv_cesa_op_ctx *op = NULL;
>  	unsigned int frag_len;
> @@ -662,11 +661,6 @@ static int mv_cesa_ahash_req_init(struct ahash_request *req, bool *cached)
>  	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
>  	int ret;
>  
> -	if (cesa_dev->caps->has_tdma)
> -		creq->req.base.type = CESA_DMA_REQ;
> -	else
> -		creq->req.base.type = CESA_STD_REQ;
> -

Hm, where is it decided now? I mean, I don't see this test anywhere
else in your patch, which means you're now always using standard mode.

>  	creq->src_nents = sg_nents_for_len(req->src, req->nbytes);
>  	if (creq->src_nents < 0) {
>  		dev_err(cesa_dev->dev, "Invalid number of src SG");
> @@ -680,7 +674,7 @@ static int mv_cesa_ahash_req_init(struct ahash_request *req, bool *cached)
>  	if (*cached)
>  		return 0;
>  
> -	if (creq->req.base.type == CESA_DMA_REQ)
> +	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)

Should be

	if (cesa_dev->caps->has_tdma)
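
For clarity, the tail of mv_cesa_ahash_req_init() would then read
something like this (untested sketch, only reusing lines from your
diff):

	if (*cached)
		return 0;

	/* type is now derived from chain.first, so test the hw cap */
	if (cesa_dev->caps->has_tdma)
		ret = mv_cesa_ahash_dma_req_init(req);

	return ret;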

>  		ret = mv_cesa_ahash_dma_req_init(req);
>  
>  	return ret;
> @@ -700,7 +694,7 @@ static int mv_cesa_ahash_update(struct ahash_request *req)
>  	if (cached)
>  		return 0;
>  
> -	ret = mv_cesa_queue_req(&req->base);
> +	ret = mv_cesa_queue_req(&req->base, &creq->req.base);
>  	if (mv_cesa_req_needs_cleanup(&req->base, ret))
>  		mv_cesa_ahash_cleanup(req);
>  
> @@ -725,7 +719,7 @@ static int mv_cesa_ahash_final(struct ahash_request *req)
>  	if (cached)
>  		return 0;
>  
> -	ret = mv_cesa_queue_req(&req->base);
> +	ret = mv_cesa_queue_req(&req->base, &creq->req.base);
>  	if (mv_cesa_req_needs_cleanup(&req->base, ret))
>  		mv_cesa_ahash_cleanup(req);
>  
> @@ -750,7 +744,7 @@ static int mv_cesa_ahash_finup(struct ahash_request *req)
>  	if (cached)
>  		return 0;
>  
> -	ret = mv_cesa_queue_req(&req->base);
> +	ret = mv_cesa_queue_req(&req->base, &creq->req.base);
>  	if (mv_cesa_req_needs_cleanup(&req->base, ret))
>  		mv_cesa_ahash_cleanup(req);
>  
> diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
> index 88c87be..9a424f9 100644
> --- a/drivers/crypto/marvell/tdma.c
> +++ b/drivers/crypto/marvell/tdma.c
> @@ -37,9 +37,9 @@ bool mv_cesa_req_dma_iter_next_transfer(struct mv_cesa_dma_iter *iter,
>  	return true;
>  }
>  
> -void mv_cesa_dma_step(struct mv_cesa_tdma_req *dreq)
> +void mv_cesa_dma_step(struct mv_cesa_req *dreq)
>  {
> -	struct mv_cesa_engine *engine = dreq->base.engine;
> +	struct mv_cesa_engine *engine = dreq->engine;
>  
>  	writel_relaxed(0, engine->regs + CESA_SA_CFG);
>  
> @@ -58,7 +58,7 @@ void mv_cesa_dma_step(struct mv_cesa_tdma_req *dreq)
>  	writel(CESA_SA_CMD_EN_CESA_SA_ACCL0, engine->regs + CESA_SA_CMD);
>  }
>  
> -void mv_cesa_dma_cleanup(struct mv_cesa_tdma_req *dreq)
> +void mv_cesa_dma_cleanup(struct mv_cesa_req *dreq)
>  {
>  	struct mv_cesa_tdma_desc *tdma;
>  
> @@ -81,7 +81,7 @@ void mv_cesa_dma_cleanup(struct mv_cesa_tdma_req *dreq)
>  	dreq->chain.last = NULL;
>  }
>  
> -void mv_cesa_dma_prepare(struct mv_cesa_tdma_req *dreq,
> +void mv_cesa_dma_prepare(struct mv_cesa_req *dreq,
>  			 struct mv_cesa_engine *engine)
>  {
>  	struct mv_cesa_tdma_desc *tdma;



-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 3/7] crypto: marvell: Copy IV vectors by DMA transfers for acipher requests
  2016-06-15 19:15   ` Romain Perier
@ 2016-06-15 20:48     ` Boris Brezillon
  -1 siblings, 0 replies; 50+ messages in thread
From: Boris Brezillon @ 2016-06-15 20:48 UTC (permalink / raw)
  To: Romain Perier
  Cc: Arnaud Ebalard, Gregory Clement, Thomas Petazzoni,
	David S. Miller, Russell King, linux-crypto, linux-arm-kernel

On Wed, 15 Jun 2016 21:15:30 +0200
Romain Perier <romain.perier@free-electrons.com> wrote:

> @@ -135,23 +140,23 @@ static int mv_cesa_ablkcipher_process(struct crypto_async_request *req,
>  {
>  	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
>  	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
> -	struct mv_cesa_ablkcipher_std_req *sreq = &creq->req.std;
> -	struct mv_cesa_engine *engine = sreq->base.engine;
> -	int ret;
>  
> -	if (creq->req.base.type == CESA_DMA_REQ)
> +	if (creq->req.base.type == CESA_DMA_REQ) {
> +		int ret;
> +		struct mv_cesa_tdma_req *dreq;
> +		unsigned int ivsize;
> +
>  		ret = mv_cesa_dma_process(&creq->req.dma, status);
> -	else
> -		ret = mv_cesa_ablkcipher_std_process(ablkreq, status);
> +		if (ret)
> +			return ret;
>  
> -	if (ret)
> +		dreq = &creq->req.dma;
> +		ivsize = crypto_ablkcipher_ivsize(
> +					     crypto_ablkcipher_reqtfm(ablkreq));
> +		memcpy_fromio(ablkreq->info, dreq->chain.last->data, ivsize);

Just use memcpy(): you're not copying from an iomem region here.
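
In other words (untested, same variables as in your diff):

	memcpy(ablkreq->info, dreq->chain.last->data, ivsize);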

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/7] crypto: marvell: Adding a complete operation for async requests
  2016-06-15 19:15   ` Romain Perier
@ 2016-06-15 20:55     ` Boris Brezillon
  -1 siblings, 0 replies; 50+ messages in thread
From: Boris Brezillon @ 2016-06-15 20:55 UTC (permalink / raw)
  To: Romain Perier
  Cc: Arnaud Ebalard, Gregory Clement, Thomas Petazzoni,
	David S. Miller, Russell King, linux-crypto, linux-arm-kernel

On Wed, 15 Jun 2016 21:15:32 +0200
Romain Perier <romain.perier@free-electrons.com> wrote:

> So far, the 'process' operation was used to check if the current request
> was correctly handled by the engine, if it was the case it copied
> information from the SRAM to the main memory. Now, we split this
> operation. We keep the 'process' operation, which still checks if the
> request was correctly handled by the engine or not, then we add a new
> operation for completion. The 'complete' method copies the content of
> the SRAM to memory. This will soon become useful if we want to call
> the process and the complete operations from different locations
> depending on the type of the request (different cleanup logic).
> 
> Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
> ---
>  drivers/crypto/marvell/cesa.c   |  1 +
>  drivers/crypto/marvell/cesa.h   |  3 +++
>  drivers/crypto/marvell/cipher.c | 47 ++++++++++++++++++++++++-----------------
>  drivers/crypto/marvell/hash.c   | 22 ++++++++++---------
>  4 files changed, 44 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
> index fe04d1b..af96426 100644
> --- a/drivers/crypto/marvell/cesa.c
> +++ b/drivers/crypto/marvell/cesa.c
> @@ -98,6 +98,7 @@ static irqreturn_t mv_cesa_int(int irq, void *priv)
>  				engine->req = NULL;
>  				mv_cesa_dequeue_req_unlocked(engine);
>  				spin_unlock_bh(&engine->lock);
> +				ctx->ops->complete(req);
>  				ctx->ops->cleanup(req);
>  				local_bh_disable();
>  				req->complete(req, res);
> diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
> index 158ff82..32de08b 100644
> --- a/drivers/crypto/marvell/cesa.h
> +++ b/drivers/crypto/marvell/cesa.h
> @@ -456,6 +456,8 @@ struct mv_cesa_engine {
>   *		code)
>   * @step:	launch the crypto operation on the next chunk
>   * @cleanup:	cleanup the crypto request (release associated data)
> + * @complete:	complete the request, i.e copy result from sram or contexts
> + * 		when it is needed.
>   */
>  struct mv_cesa_req_ops {
>  	void (*prepare)(struct crypto_async_request *req,
> @@ -463,6 +465,7 @@ struct mv_cesa_req_ops {
>  	int (*process)(struct crypto_async_request *req, u32 status);
>  	void (*step)(struct crypto_async_request *req);
>  	void (*cleanup)(struct crypto_async_request *req);
> +	void (*complete)(struct crypto_async_request *req);
>  };
>  
>  /**
> diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
> index 15d2c5a..fbaae2f 100644
> --- a/drivers/crypto/marvell/cipher.c
> +++ b/drivers/crypto/marvell/cipher.c
> @@ -118,7 +118,6 @@ static int mv_cesa_ablkcipher_std_process(struct ablkcipher_request *req,
>  	struct mv_cesa_ablkcipher_std_req *sreq = &creq->req.std;
>  	struct mv_cesa_engine *engine = sreq->base.engine;
>  	size_t len;
> -	unsigned int ivsize;
>  
>  	len = sg_pcopy_from_buffer(req->dst, creq->dst_nents,
>  				   engine->sram + CESA_SA_DATA_SRAM_OFFSET,
> @@ -128,10 +127,6 @@ static int mv_cesa_ablkcipher_std_process(struct ablkcipher_request *req,
>  	if (sreq->offset < req->nbytes)
>  		return -EINPROGRESS;
>  
> -	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(req));
> -	memcpy_fromio(req->info,
> -		      engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET, ivsize);
> -
>  	return 0;
>  }
>  
> @@ -141,21 +136,9 @@ static int mv_cesa_ablkcipher_process(struct crypto_async_request *req,
>  	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
>  	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
>  
> -	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ) {
> -		int ret;
> -		struct mv_cesa_req *basereq;
> -		unsigned int ivsize;
> -
> -		ret = mv_cesa_dma_process(&creq->req.base, status);
> -		if (ret)
> -			return ret;
> +	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
> +		return mv_cesa_dma_process(&creq->req.base, status);
>  
> -		basereq = &creq->req.base;
> -		ivsize = crypto_ablkcipher_ivsize(
> -					     crypto_ablkcipher_reqtfm(ablkreq));
> -		memcpy_fromio(ablkreq->info, basereq->chain.last->data, ivsize);
> -		return ret;
> -	}
>  	return mv_cesa_ablkcipher_std_process(ablkreq, status);
>  }
>  
> @@ -197,6 +180,7 @@ static inline void mv_cesa_ablkcipher_prepare(struct crypto_async_request *req,
>  {
>  	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
>  	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
> +

Nit: not sure you should mix this cosmetic change with the other
changes.

>  	creq->req.base.engine = engine;
>  
>  	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
> @@ -213,11 +197,36 @@ mv_cesa_ablkcipher_req_cleanup(struct crypto_async_request *req)
>  	mv_cesa_ablkcipher_cleanup(ablkreq);
>  }
>  
> +static void
> +mv_cesa_ablkcipher_complete(struct crypto_async_request *req)
> +{
> +	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
> +	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
> +	struct mv_cesa_engine *engine = creq->req.base.engine;
> +	unsigned int ivsize;
> +
> +	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(ablkreq));
> +
> +	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ) {
> +		struct mv_cesa_req *basereq;
> +
> +		basereq = &creq->req.base;
> +		ivsize = crypto_ablkcipher_ivsize(
> +					     crypto_ablkcipher_reqtfm(ablkreq));

You already have ivsize initialized.

> +		memcpy_fromio(ablkreq->info, basereq->chain.last->data, ivsize);

Use memcpy() here.
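
With both points addressed, the whole DMA branch could look like this
(untested sketch, only reusing fields already present in your patch):

	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ) {
		struct mv_cesa_req *basereq = &creq->req.base;

		memcpy(ablkreq->info, basereq->chain.last->data, ivsize);
	} else {
		memcpy_fromio(ablkreq->info,
			      engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET,
			      ivsize);
	}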

> +	} else {
> +		memcpy_fromio(ablkreq->info,
> +			      engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET,
> +			      ivsize);
> +	}
> +}
> +
>  static const struct mv_cesa_req_ops mv_cesa_ablkcipher_req_ops = {
>  	.step = mv_cesa_ablkcipher_step,
>  	.process = mv_cesa_ablkcipher_process,
>  	.prepare = mv_cesa_ablkcipher_prepare,
>  	.cleanup = mv_cesa_ablkcipher_req_cleanup,
> +	.complete = mv_cesa_ablkcipher_complete,
>  };
>  
>  static int mv_cesa_ablkcipher_cra_init(struct crypto_tfm *tfm)
> diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
> index cc7c5b0..f7f84cc 100644
> --- a/drivers/crypto/marvell/hash.c
> +++ b/drivers/crypto/marvell/hash.c
> @@ -287,17 +287,20 @@ static int mv_cesa_ahash_process(struct crypto_async_request *req, u32 status)
>  {
>  	struct ahash_request *ahashreq = ahash_request_cast(req);
>  	struct mv_cesa_ahash_req *creq = ahash_request_ctx(ahashreq);
> -	struct mv_cesa_engine *engine = creq->req.base.engine;
> -	unsigned int digsize;
> -	int ret, i;
>  
>  	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
> -		ret = mv_cesa_dma_process(&creq->req.base, status);
> -	else
> -		ret = mv_cesa_ahash_std_process(ahashreq, status);
> +		return mv_cesa_dma_process(&creq->req.base, status);
>  
> -	if (ret == -EINPROGRESS)
> -		return ret;
> +	return mv_cesa_ahash_std_process(ahashreq, status);
> +}
> +
> +static void mv_cesa_ahash_complete(struct crypto_async_request *req)
> +{
> +	struct ahash_request *ahashreq = ahash_request_cast(req);
> +	struct mv_cesa_ahash_req *creq = ahash_request_ctx(ahashreq);
> +	struct mv_cesa_engine *engine = creq->req.base.engine;
> +	unsigned int digsize;
> +	int i;
>  
>  	digsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(ahashreq));
>  	for (i = 0; i < digsize / 4; i++)
> @@ -326,8 +329,6 @@ static int mv_cesa_ahash_process(struct crypto_async_request *req, u32 status)
>  				result[i] = cpu_to_be32(creq->state[i]);
>  		}
>  	}
> -
> -	return ret;
>  }
>  
>  static void mv_cesa_ahash_prepare(struct crypto_async_request *req,
> @@ -366,6 +367,7 @@ static const struct mv_cesa_req_ops mv_cesa_ahash_req_ops = {
>  	.process = mv_cesa_ahash_process,
>  	.prepare = mv_cesa_ahash_prepare,
>  	.cleanup = mv_cesa_ahash_req_cleanup,
> +	.complete = mv_cesa_ahash_complete,
>  };
>  
>  static int mv_cesa_ahash_init(struct ahash_request *req,



-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 6/7] crypto: marvell: Adding load balancing between engines
  2016-06-15 19:15   ` Romain Perier
@ 2016-06-15 21:13     ` Boris Brezillon
  -1 siblings, 0 replies; 50+ messages in thread
From: Boris Brezillon @ 2016-06-15 21:13 UTC (permalink / raw)
  To: Romain Perier
  Cc: Arnaud Ebalard, Gregory Clement, Thomas Petazzoni,
	David S. Miller, Russell King, linux-crypto, linux-arm-kernel

On Wed, 15 Jun 2016 21:15:33 +0200
Romain Perier <romain.perier@free-electrons.com> wrote:

> This commits adds support for fine grained load balancing on
> multi-engine IPs. The engine is pre-selected based on its current load
> and on the weight of the crypto request that is about to be processed.
> The global crypto queue is also moved to each engine. These changes are

					to the mv_cesa_engine object.

> useful for preparing the code to support TDMA chaining between crypto
> requests, because each tdma chain will be handled per engine.

These changes are required to allow chaining crypto requests at the DMA
level.

> By using
> a crypto queue per engine, we make sure that we keep the state of the
> tdma chain synchronized with the crypto queue. We also reduce contention
> on 'cesa_dev->lock' and improve parallelism.
> 
> Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
> ---
>  drivers/crypto/marvell/cesa.c   | 30 +++++++++----------
>  drivers/crypto/marvell/cesa.h   | 26 +++++++++++++++--
>  drivers/crypto/marvell/cipher.c | 59 ++++++++++++++++++-------------------
>  drivers/crypto/marvell/hash.c   | 65 +++++++++++++++++++----------------------
>  4 files changed, 97 insertions(+), 83 deletions(-)
> 

[...]

> diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
> index fbaae2f..02aa38f 100644
> --- a/drivers/crypto/marvell/cipher.c
> +++ b/drivers/crypto/marvell/cipher.c
> @@ -89,6 +89,9 @@ static void mv_cesa_ablkcipher_std_step(struct ablkcipher_request *req)
>  	size_t  len = min_t(size_t, req->nbytes - sreq->offset,
>  			    CESA_SA_SRAM_PAYLOAD_SIZE);
>  
> +	mv_cesa_adjust_op(engine, &sreq->op);
> +	memcpy_toio(engine->sram, &sreq->op, sizeof(sreq->op));
> +
>  	len = sg_pcopy_to_buffer(req->src, creq->src_nents,
>  				 engine->sram + CESA_SA_DATA_SRAM_OFFSET,
>  				 len, sreq->offset);
> @@ -167,12 +170,9 @@ mv_cesa_ablkcipher_std_prepare(struct ablkcipher_request *req)
>  {
>  	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
>  	struct mv_cesa_ablkcipher_std_req *sreq = &creq->req.std;
> -	struct mv_cesa_engine *engine = sreq->base.engine;
>  
>  	sreq->size = 0;
>  	sreq->offset = 0;
> -	mv_cesa_adjust_op(engine, &sreq->op);
> -	memcpy_toio(engine->sram, &sreq->op, sizeof(sreq->op));

Are these changes really related to this load balancing support?
AFAICT, it's something that could have been done earlier, and is not
dependent on the changes you're introducing here, but maybe I'm missing
something.

>  }

[...]

>  static int mv_cesa_ecb_aes_encrypt(struct ablkcipher_request *req)
> diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
> index f7f84cc..5946a69 100644
> --- a/drivers/crypto/marvell/hash.c
> +++ b/drivers/crypto/marvell/hash.c
> @@ -162,6 +162,15 @@ static void mv_cesa_ahash_std_step(struct ahash_request *req)
>  	unsigned int new_cache_ptr = 0;
>  	u32 frag_mode;
>  	size_t  len;
> +	unsigned int digsize;
> +	int i;
> +
> +	mv_cesa_adjust_op(engine, &creq->op_tmpl);
> +	memcpy_toio(engine->sram, &creq->op_tmpl, sizeof(creq->op_tmpl));
> +
> +	digsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(req));
> +	for (i = 0; i < digsize / 4; i++)
> +		writel_relaxed(creq->state[i], engine->regs + CESA_IVDIG(i));
>  
>  	if (creq->cache_ptr)
>  		memcpy_toio(engine->sram + CESA_SA_DATA_SRAM_OFFSET,
> @@ -265,11 +274,8 @@ static void mv_cesa_ahash_std_prepare(struct ahash_request *req)
>  {
>  	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
>  	struct mv_cesa_ahash_std_req *sreq = &creq->req.std;
> -	struct mv_cesa_engine *engine = sreq->base.engine;
>  
>  	sreq->offset = 0;
> -	mv_cesa_adjust_op(engine, &creq->op_tmpl);
> -	memcpy_toio(engine->sram, &creq->op_tmpl, sizeof(creq->op_tmpl));

Same as above: it doesn't seem related to the load balancing stuff.

>  }

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 7/7] crypto: marvell: Add support for chaining crypto requests in TDMA mode
  2016-06-15 19:15   ` Romain Perier
@ 2016-06-15 21:43     ` Boris Brezillon
  -1 siblings, 0 replies; 50+ messages in thread
From: Boris Brezillon @ 2016-06-15 21:43 UTC (permalink / raw)
  To: Romain Perier
  Cc: Arnaud Ebalard, Gregory Clement, Thomas Petazzoni,
	David S. Miller, Russell King, linux-crypto, linux-arm-kernel

On Wed, 15 Jun 2016 21:15:34 +0200
Romain Perier <romain.perier@free-electrons.com> wrote:

> The Cryptographic Engines and Security Accelerators (CESA) supports the
> Multi-Packet Chain Mode. With this mode enabled, multiple tdma requests
> can be chained and processed by the hardware without software
> interferences.

intervention.

> This mode was already activated, however the crypto
> requests were not chained together. By doing so, we reduce significantly

						   significantly reduce

> the number of IRQs. Instead of being interrupted at the end of each
> crypto request, we are interrupted at the end of the last cryptographic
> request processed by the engine.
> 
> This commits re-factorizes the code, changes the code architecture and
> adds the required data structures to chain cryptographic requests
> together before sending them to an engine.

Not necessarily before sending them to the engine; chaining can also
happen while the engine is running.

> 
> Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
> ---
>  drivers/crypto/marvell/cesa.c   | 117 +++++++++++++++++++++++++++++++---------
>  drivers/crypto/marvell/cesa.h   |  38 ++++++++++++-
>  drivers/crypto/marvell/cipher.c |   3 +-
>  drivers/crypto/marvell/hash.c   |   9 +++-
>  drivers/crypto/marvell/tdma.c   |  81 ++++++++++++++++++++++++++++
>  5 files changed, 218 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
> index f9e6688..33411f6 100644
> --- a/drivers/crypto/marvell/cesa.c
> +++ b/drivers/crypto/marvell/cesa.c
> @@ -32,7 +32,7 @@
>  #include "cesa.h"
>  
>  /* Limit of the crypto queue before reaching the backlog */
> -#define CESA_CRYPTO_DEFAULT_MAX_QLEN 50
> +#define CESA_CRYPTO_DEFAULT_MAX_QLEN 128
>  
>  static int allhwsupport = !IS_ENABLED(CONFIG_CRYPTO_DEV_MV_CESA);
>  module_param_named(allhwsupport, allhwsupport, int, 0444);
> @@ -40,23 +40,83 @@ MODULE_PARM_DESC(allhwsupport, "Enable support for all hardware (even it if over
>  
>  struct mv_cesa_dev *cesa_dev;
>  
> -static void mv_cesa_dequeue_req_unlocked(struct mv_cesa_engine *engine)
> +struct crypto_async_request *mv_cesa_dequeue_req_locked(
> +	struct mv_cesa_engine *engine, struct crypto_async_request **backlog)

Coding style issue:

struct crypto_async_request *
mv_cesa_dequeue_req_locked(struct mv_cesa_engine *engine,
			   struct crypto_async_request **backlog)

> +{
> +	struct crypto_async_request *req;
> +
> +	*backlog = crypto_get_backlog(&engine->queue);
> +	req = crypto_dequeue_request(&engine->queue);
> +
> +	if (!req)
> +		return NULL;
> +
> +	return req;
> +}
> +
> +static void mv_cesa_rearm_engine(struct mv_cesa_engine *engine)
>  {
>  	struct crypto_async_request *req, *backlog;
>  	struct mv_cesa_ctx *ctx;
>  
> -	backlog = crypto_get_backlog(&engine->queue);
> -	req = crypto_dequeue_request(&engine->queue);
> -	engine->req = req;
>  
> +	spin_lock_bh(&engine->lock);
> +	if (engine->req)
> +		goto out_unlock;
> +
> +	req = mv_cesa_dequeue_req_locked(engine, &backlog);
>  	if (!req)
> -		return;
> +		goto out_unlock;
> +
> +	engine->req = req;
> +	spin_unlock_bh(&engine->lock);

I'm not a big fan of those multiple unlock() locations, and since
your code is pretty simple I'd prefer seeing something like:

	spin_lock_bh(&engine->lock);
	if (!engine->req) {
		req = mv_cesa_dequeue_req_locked(engine, &backlog);
		engine->req = req;
	}
	spin_unlock_bh(&engine->lock);

	if (!req)
		return;

With req and backlog initialized to NULL at the beginning of the
function.

>  
>  	if (backlog)
>  		backlog->complete(backlog, -EINPROGRESS);
>  
>  	ctx = crypto_tfm_ctx(req->tfm);
>  	ctx->ops->step(req);
> +	return;

Missing blank line.

> +out_unlock:
> +	spin_unlock_bh(&engine->lock);
> +}
> +
> +static int mv_cesa_std_process(struct mv_cesa_engine *engine, u32 status)
> +{
> +	struct crypto_async_request *req;
> +	struct mv_cesa_ctx *ctx;
> +	int res;
> +
> +	req = engine->req;
> +	ctx = crypto_tfm_ctx(req->tfm);
> +	res = ctx->ops->process(req, status);
> +
> +	if (res == 0) {
> +		ctx->ops->complete(req);
> +		mv_cesa_engine_enqueue_complete_request(engine, req);
> +	} else if (res == -EINPROGRESS) {
> +		ctx->ops->step(req);
> +	} else {
> +		ctx->ops->complete(req);

Do we really have to call ->complete() in this case?

> +	}
> +
> +	return res;
> +}
> +
> +static int mv_cesa_int_process(struct mv_cesa_engine *engine, u32 status)
> +{
> +	if (engine->chain.first && engine->chain.last)
> +		return mv_cesa_tdma_process(engine, status);

Missing blank line.

> +	return mv_cesa_std_process(engine, status);
> +}
> +
> +static inline void mv_cesa_complete_req(struct mv_cesa_ctx *ctx,
> +	struct crypto_async_request *req, int res)

Align parameters to the open parenthesis.

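In other words, something like this:

static inline void mv_cesa_complete_req(struct mv_cesa_ctx *ctx,
                                        struct crypto_async_request *req,
                                        int res)
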
> +{
> +	ctx->ops->cleanup(req);
> +	local_bh_disable();
> +	req->complete(req, res);
> +	local_bh_enable();
>  }
>  
>  static irqreturn_t mv_cesa_int(int irq, void *priv)
> @@ -83,26 +143,31 @@ static irqreturn_t mv_cesa_int(int irq, void *priv)
>  		writel(~status, engine->regs + CESA_SA_FPGA_INT_STATUS);
>  		writel(~status, engine->regs + CESA_SA_INT_STATUS);
>  
> +		/* Process fetched requests */
> +		res = mv_cesa_int_process(engine, status & mask);
>  		ret = IRQ_HANDLED;
> +
>  		spin_lock_bh(&engine->lock);
>  		req = engine->req;
> +		if (res != -EINPROGRESS)
> +			engine->req = NULL;
>  		spin_unlock_bh(&engine->lock);
> -		if (req) {
> -			ctx = crypto_tfm_ctx(req->tfm);
> -			res = ctx->ops->process(req, status & mask);
> -			if (res != -EINPROGRESS) {
> -				spin_lock_bh(&engine->lock);
> -				engine->req = NULL;
> -				mv_cesa_dequeue_req_unlocked(engine);
> -				spin_unlock_bh(&engine->lock);
> -				ctx->ops->complete(req);
> -				ctx->ops->cleanup(req);
> -				local_bh_disable();
> -				req->complete(req, res);
> -				local_bh_enable();
> -			} else {
> -				ctx->ops->step(req);
> -			}
> +
> +		ctx = crypto_tfm_ctx(req->tfm);
> +
> +		if (res && res != -EINPROGRESS)
> +			mv_cesa_complete_req(ctx, req, res);
> +
> +		/* Launch the next pending request */
> +		mv_cesa_rearm_engine(engine);
> +
> +		/* Iterate over the complete queue */
> +		while (true) {
> +			req = mv_cesa_engine_dequeue_complete_request(engine);
> +			if (!req)
> +				break;
> +
> +			mv_cesa_complete_req(ctx, req, 0);
>  		}
>  	}
>  
> @@ -116,16 +181,15 @@ int mv_cesa_queue_req(struct crypto_async_request *req,
>  	struct mv_cesa_engine *engine = creq->engine;
>  
>  	spin_lock_bh(&engine->lock);
> +	if (mv_cesa_req_get_type(creq) == CESA_DMA_REQ)
> +		mv_cesa_tdma_chain(engine, creq);

Missing blank line.

>  	ret = crypto_enqueue_request(&engine->queue, req);
>  	spin_unlock_bh(&engine->lock);
>  
>  	if (ret != -EINPROGRESS)
>  		return ret;
>  
> -	spin_lock_bh(&engine->lock);
> -	if (!engine->req)
> -		mv_cesa_dequeue_req_unlocked(engine);
> -	spin_unlock_bh(&engine->lock);
> +	mv_cesa_rearm_engine(engine);
>  
>  	return -EINPROGRESS;
>  }
> @@ -496,6 +560,7 @@ static int mv_cesa_probe(struct platform_device *pdev)
>  
>  		crypto_init_queue(&engine->queue, CESA_CRYPTO_DEFAULT_MAX_QLEN);
>  		atomic_set(&engine->load, 0);
> +		INIT_LIST_HEAD(&engine->complete_queue);
>  	}
>  
>  	cesa_dev = cesa;
> diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
> index 5626aa7..e0fee1f 100644
> --- a/drivers/crypto/marvell/cesa.h
> +++ b/drivers/crypto/marvell/cesa.h
> @@ -271,7 +271,9 @@ struct mv_cesa_op_ctx {
>  /* TDMA descriptor flags */
>  #define CESA_TDMA_DST_IN_SRAM			BIT(31)
>  #define CESA_TDMA_SRC_IN_SRAM			BIT(30)
> -#define CESA_TDMA_TYPE_MSK			GENMASK(29, 0)
> +#define CESA_TDMA_END_OF_REQ			BIT(29)
> +#define CESA_TDMA_NOT_CHAIN			BIT(28)

I would name it CESA_TDMA_BREAK_CHAIN.

> +#define CESA_TDMA_TYPE_MSK			GENMASK(27, 0)
>  #define CESA_TDMA_DUMMY				0
>  #define CESA_TDMA_DATA				1
>  #define CESA_TDMA_OP				2
> @@ -431,6 +433,9 @@ struct mv_cesa_dev {
>   *			SRAM
>   * @queue:		fifo of the pending crypto requests
>   * @load:		engine load counter, useful for load balancing
> + * @chain:		list of the current tdma descriptors being processed
> + * 			by this engine.
> + * @complete_queue:	fifo of the processed requests by the engine
>   *
>   * Structure storing CESA engine information.
>   */
> @@ -448,6 +453,8 @@ struct mv_cesa_engine {
>  	struct gen_pool *pool;
>  	struct crypto_queue queue;
>  	atomic_t load;
> +	struct mv_cesa_tdma_chain chain;
> +	struct list_head complete_queue;
>  };
>  
>  /**
> @@ -618,6 +625,28 @@ struct mv_cesa_ahash_req {
>  
>  extern struct mv_cesa_dev *cesa_dev;
>  
> +
> +static inline void mv_cesa_engine_enqueue_complete_request(
> +	struct mv_cesa_engine *engine, struct crypto_async_request *req)

Coding style issue (see my previous comments).

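That is, following the same convention as above:

static inline void
mv_cesa_engine_enqueue_complete_request(struct mv_cesa_engine *engine,
                                        struct crypto_async_request *req)
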
> +{
> +	list_add_tail(&req->list, &engine->complete_queue);
> +}
> +
> +static inline struct crypto_async_request *
> +mv_cesa_engine_dequeue_complete_request(struct mv_cesa_engine *engine)
> +{
> +	struct crypto_async_request *req;
> +
> +	req = list_first_entry_or_null(&engine->complete_queue,
> +				       struct crypto_async_request,
> +				       list);
> +	if (req)
> +		list_del(&req->list);
> +
> +	return req;
> +}
> +
> +
>  static inline enum mv_cesa_req_type
>  mv_cesa_req_get_type(struct mv_cesa_req *req)
>  {
> @@ -699,6 +728,10 @@ static inline bool mv_cesa_mac_op_is_first_frag(const struct mv_cesa_op_ctx *op)
>  int mv_cesa_queue_req(struct crypto_async_request *req,
>  		      struct mv_cesa_req *creq);
>  
> +struct crypto_async_request *mv_cesa_dequeue_req_locked(
> +		      struct mv_cesa_engine *engine,
> +		      struct crypto_async_request **backlog);

Ditto.

> +
>  static inline struct mv_cesa_engine *mv_cesa_select_engine(int weight)
>  {
>  	int i;
> @@ -804,6 +837,9 @@ static inline int mv_cesa_dma_process(struct mv_cesa_req *dreq,
>  void mv_cesa_dma_prepare(struct mv_cesa_req *dreq,
>  			 struct mv_cesa_engine *engine);
>  void mv_cesa_dma_cleanup(struct mv_cesa_req *dreq);
> +void mv_cesa_tdma_chain(struct mv_cesa_engine *engine,
> +			struct mv_cesa_req *dreq);
> +int mv_cesa_tdma_process(struct mv_cesa_engine *engine, u32 status);
>  
>  
>  static inline void
> diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
> index 02aa38f..9033191 100644
> --- a/drivers/crypto/marvell/cipher.c
> +++ b/drivers/crypto/marvell/cipher.c
> @@ -225,7 +225,6 @@ mv_cesa_ablkcipher_complete(struct crypto_async_request *req)
>  static const struct mv_cesa_req_ops mv_cesa_ablkcipher_req_ops = {
>  	.step = mv_cesa_ablkcipher_step,
>  	.process = mv_cesa_ablkcipher_process,
> -	.prepare = mv_cesa_ablkcipher_prepare,
>  	.cleanup = mv_cesa_ablkcipher_req_cleanup,
>  	.complete = mv_cesa_ablkcipher_complete,
>  };
> @@ -384,6 +383,7 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
>  		goto err_free_tdma;
>  
>  	dreq->chain = chain;
> +	dreq->chain.last->flags |= CESA_TDMA_END_OF_REQ;
>  
>  	return 0;
>  
> @@ -441,7 +441,6 @@ static int mv_cesa_ablkcipher_req_init(struct ablkcipher_request *req,
>  	mv_cesa_update_op_cfg(tmpl, CESA_SA_DESC_CFG_OP_CRYPT_ONLY,
>  			      CESA_SA_DESC_CFG_OP_MSK);
>  
> -	/* TODO: add a threshold for DMA usage */
>  	if (cesa_dev->caps->has_tdma)
>  		ret = mv_cesa_ablkcipher_dma_req_init(req, tmpl);
>  	else
> diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
> index 5946a69..c2ff353 100644
> --- a/drivers/crypto/marvell/hash.c
> +++ b/drivers/crypto/marvell/hash.c
> @@ -172,6 +172,9 @@ static void mv_cesa_ahash_std_step(struct ahash_request *req)
>  	for (i = 0; i < digsize / 4; i++)
>  		writel_relaxed(creq->state[i], engine->regs + CESA_IVDIG(i));
>  
> +	mv_cesa_adjust_op(engine, &creq->op_tmpl);
> +	memcpy_toio(engine->sram, &creq->op_tmpl, sizeof(creq->op_tmpl));
> +
>  	if (creq->cache_ptr)
>  		memcpy_toio(engine->sram + CESA_SA_DATA_SRAM_OFFSET,
>  			    creq->cache, creq->cache_ptr);
> @@ -282,6 +285,9 @@ static void mv_cesa_ahash_step(struct crypto_async_request *req)
>  {
>  	struct ahash_request *ahashreq = ahash_request_cast(req);
>  	struct mv_cesa_ahash_req *creq = ahash_request_ctx(ahashreq);
> +	struct mv_cesa_engine *engine = creq->req.base.engine;
> +	unsigned int digsize;
> +	int i;
>  
>  	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
>  		mv_cesa_dma_step(&creq->req.base);
> @@ -367,7 +373,6 @@ static void mv_cesa_ahash_req_cleanup(struct crypto_async_request *req)
>  static const struct mv_cesa_req_ops mv_cesa_ahash_req_ops = {
>  	.step = mv_cesa_ahash_step,
>  	.process = mv_cesa_ahash_process,
> -	.prepare = mv_cesa_ahash_prepare,

Why are you doing that?

>  	.cleanup = mv_cesa_ahash_req_cleanup,
>  	.complete = mv_cesa_ahash_complete,
>  };
> @@ -648,6 +653,8 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
>  	else
>  		creq->cache_ptr = 0;
>  
> +	dreq->chain.last->flags |= (CESA_TDMA_END_OF_REQ | CESA_TDMA_NOT_CHAIN);
> +
>  	return 0;
>  
>  err_free_tdma:
> diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
> index 9a424f9..ae50545 100644
> --- a/drivers/crypto/marvell/tdma.c
> +++ b/drivers/crypto/marvell/tdma.c
> @@ -98,6 +98,87 @@ void mv_cesa_dma_prepare(struct mv_cesa_req *dreq,
>  	}
>  }
>  
> +void
> +mv_cesa_tdma_chain(struct mv_cesa_engine *engine, struct mv_cesa_req *dreq)
> +{
> +	if (engine->chain.first == NULL && engine->chain.last == NULL) {
> +		engine->chain.first = dreq->chain.first;
> +		engine->chain.last  = dreq->chain.last;
> +	} else {
> +		struct mv_cesa_tdma_desc *last;
> +
> +		last = engine->chain.last;
> +		last->next = dreq->chain.first;
> +		engine->chain.last = dreq->chain.last;

Missing blank line.

> +		if (!(last->flags & CESA_TDMA_NOT_CHAIN))
> +			last->next_dma = dreq->chain.first->cur_dma;
> +	}
> +}
> +
> +int
> +mv_cesa_tdma_process(struct mv_cesa_engine *engine, u32 status)
> +{
> +	struct crypto_async_request *req = NULL;
> +	struct mv_cesa_tdma_desc *tdma = NULL, *next = NULL;
> +	dma_addr_t tdma_cur;
> +	int res = 0;
> +
> +	tdma_cur = readl(engine->regs + CESA_TDMA_CUR);
> +
> +	for (tdma = engine->chain.first; tdma; tdma = next) {
> +		spin_lock_bh(&engine->lock);
> +		next = tdma->next;
> +		spin_unlock_bh(&engine->lock);
> +
> +		if (tdma->flags & CESA_TDMA_END_OF_REQ) {
> +			struct crypto_async_request *backlog = NULL;
> +			struct mv_cesa_ctx *ctx;
> +
> +			spin_lock_bh(&engine->lock);
> +			/*
> +			 * if req is NULL, this means we're processing the
> +			 * request in engine->req.
> +			 */
> +			if (!req)
> +				req = engine->req;
> +			else
> +				req = mv_cesa_dequeue_req_locked(engine,
> +								 &backlog);
> +
> +			/* Re-chaining to the next request */
> +			engine->chain.first = tdma->next;
> +			tdma->next = NULL;
> +
> +			/* If this is the last request, clear the chain */
> +			if (engine->chain.first == NULL)
> +				engine->chain.last  = NULL;
> +			spin_unlock_bh(&engine->lock);
> +
> +			ctx = crypto_tfm_ctx(req->tfm);
> +			res = ctx->ops->process(req, status);

Hm, that's not exactly true. The status you're passing here is only
valid for the last request that has been processed. Say you queued 3
requests. 2 of them were correctly processed, but the last one
triggered an error. You don't want the first 2 requests to be
considered bad.

A solution would be to pass a 'fake' valid status, until we reach the
last request (IOW, tdma->cur_dma == tdma_cur).

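A minimal sketch of that idea (assuming CESA_SA_INT_ACC0_IDMA_DONE is the
status value that ->process() treats as a normal completion):

        u32 current_status;

        /* Only the request the engine stopped on may carry an error. */
        current_status = (tdma->cur_dma == tdma_cur) ?
                          status : CESA_SA_INT_ACC0_IDMA_DONE;
        res = ctx->ops->process(req, current_status);
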
> +			ctx->ops->complete(req);
> +
> +			if (res == 0)
> +				mv_cesa_engine_enqueue_complete_request(engine,
> +									req);
> +
> +			if (backlog)
> +				backlog->complete(backlog, -EINPROGRESS);
> +		}

Missing blank line.

> +		if (res || tdma->cur_dma == tdma_cur)
> +			break;
> +	}
> +
> +	if (res) {
> +		spin_lock_bh(&engine->lock);
> +		engine->req = req;
> +		spin_unlock_bh(&engine->lock);
> +	}

Maybe you can add a comment explaining that you are actually setting
the last processed request into engine->req, so that the core can know
which request was faulty.

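Something like this, for instance (the wording is only a suggestion):

        if (res) {
                spin_lock_bh(&engine->lock);
                /*
                 * Save the faulty request in engine->req, so the core
                 * can know which request triggered the error.
                 */
                engine->req = req;
                spin_unlock_bh(&engine->lock);
        }
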
> +
> +	return res;
> +}
> +
> +
>  static struct mv_cesa_tdma_desc *
>  mv_cesa_dma_add_desc(struct mv_cesa_tdma_chain *chain, gfp_t flags)
>  {



-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 2/7] crypto: marvell: Check engine is not already running when enabling a req
  2016-06-15 19:37     ` Boris Brezillon
@ 2016-06-16  8:18       ` Romain Perier
  -1 siblings, 0 replies; 50+ messages in thread
From: Romain Perier @ 2016-06-16  8:18 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Arnaud Ebalard, Gregory Clement, Thomas Petazzoni,
	David S. Miller, Russell King, linux-crypto, linux-arm-kernel

Hello,

On 15/06/2016 21:37, Boris Brezillon wrote:
>
> "
> Add a BUG_ON() call when the driver tries to launch a crypto request
> while the engine is still processing the previous one. This replaces
> a silent system hang by a verbose kernel panic with the associated
> backtrace to let the user know that something went wrong in the CESA
> driver.
> "

thanks

>
>> ---
>>   drivers/crypto/marvell/cipher.c | 2 ++
>>   drivers/crypto/marvell/hash.c   | 2 ++
>>   drivers/crypto/marvell/tdma.c   | 2 ++
>>   3 files changed, 6 insertions(+)
>>
>> diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
>> index dcf1fce..8d0fabb 100644
>> --- a/drivers/crypto/marvell/cipher.c
>> +++ b/drivers/crypto/marvell/cipher.c
>> @@ -106,6 +106,8 @@ static void mv_cesa_ablkcipher_std_step(struct ablkcipher_request *req)
>>
>>   	mv_cesa_set_int_mask(engine, CESA_SA_INT_ACCEL0_DONE);
>>   	writel_relaxed(CESA_SA_CFG_PARA_DIS, engine->regs + CESA_SA_CFG);
>> +	BUG_ON(readl(engine->regs + CESA_SA_CMD)
>> +				  & CESA_SA_CMD_EN_CESA_SA_ACCL0);
>
> Nit: please put the '&' operator at the end of the first line and
> align CESA_SA_CMD_EN_CESA_SA_ACCL0 on the open parenthesis.

Arf, ok I will fix this.

>
> 	BUG_ON(readl(engine->regs + CESA_SA_CMD) &
> 	       CESA_SA_CMD_EN_CESA_SA_ACCL0);
>
>>   	writel(CESA_SA_CMD_EN_CESA_SA_ACCL0, engine->regs + CESA_SA_CMD);
>>   }
>>
>> diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
>> index 7ca2e0f..0fae351 100644
>> --- a/drivers/crypto/marvell/hash.c
>> +++ b/drivers/crypto/marvell/hash.c
>> @@ -237,6 +237,8 @@ static void mv_cesa_ahash_std_step(struct ahash_request *req)
>>
>>   	mv_cesa_set_int_mask(engine, CESA_SA_INT_ACCEL0_DONE);
>>   	writel_relaxed(CESA_SA_CFG_PARA_DIS, engine->regs + CESA_SA_CFG);
>> +	BUG_ON(readl(engine->regs + CESA_SA_CMD)
>> +				  & CESA_SA_CMD_EN_CESA_SA_ACCL0);
>
> Ditto.

ack

>
>>   	writel(CESA_SA_CMD_EN_CESA_SA_ACCL0, engine->regs + CESA_SA_CMD);
>>   }
>>
>> diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
>> index 7642798..d493714 100644
>> --- a/drivers/crypto/marvell/tdma.c
>> +++ b/drivers/crypto/marvell/tdma.c
>> @@ -53,6 +53,8 @@ void mv_cesa_dma_step(struct mv_cesa_tdma_req *dreq)
>>   		       engine->regs + CESA_SA_CFG);
>>   	writel_relaxed(dreq->chain.first->cur_dma,
>>   		       engine->regs + CESA_TDMA_NEXT_ADDR);
>> +	BUG_ON(readl(engine->regs + CESA_SA_CMD)
>> +				  & CESA_SA_CMD_EN_CESA_SA_ACCL0);
>
> Ditto.
ack

Regards,
Romain
-- 
Romain Perier, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 3/7] crypto: marvell: Copy IV vectors by DMA transfers for acipher requests
  2016-06-15 20:07     ` Boris Brezillon
@ 2016-06-16  8:29       ` Romain Perier
  -1 siblings, 0 replies; 50+ messages in thread
From: Romain Perier @ 2016-06-16  8:29 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Arnaud Ebalard, Gregory Clement, Thomas Petazzoni,
	David S. Miller, Russell King, linux-crypto, linux-arm-kernel

Hello,

On 15/06/2016 22:07, Boris Brezillon wrote:
> On Wed, 15 Jun 2016 21:15:30 +0200
> Romain Perier <romain.perier@free-electrons.com> wrote:
>
>> Adding a TDMA descriptor at the end of the request for copying the
>> output IV vector via a DMA transfer. This is required for processing
>> cipher requests asynchroniously in chained mode, otherwise the content
>
> 		  asynchronously
>
>> of the IV vector will be overwriten for each new finished request.
>
> BTW, Not sure the term 'asynchronously' is appropriate here. The
> standard (AKA non-DMA) processing is also asynchronous. The real reason
> here is that you want to chain the requests and offload as much
> processing as possible to the DMA and crypto engine. And as you
> explained, this is only possible if we retrieve the updated IV using
> DMA.
>

What do you think of the following description?
"
Adding a TDMA descriptor at the end of the request for copying the
output IV vector via a DMA transfer. This is a good way to offload
as much processing as possible to the DMA and the crypto engine.
This is also required for processing multiple cipher requests
in chained mode, otherwise the content of the IV vector would be
overwritten by the last processed request.
"

This point is true if multiple chained requests are processed via TDMA:
the content of the "global" IV output vector would be overwritten
by the last request.


>> diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
>> index 74071e4..74b84bd 100644
>> --- a/drivers/crypto/marvell/cesa.h
>> +++ b/drivers/crypto/marvell/cesa.h
>> @@ -275,6 +275,7 @@ struct mv_cesa_op_ctx {
>>   #define CESA_TDMA_DUMMY				0
>>   #define CESA_TDMA_DATA				1
>>   #define CESA_TDMA_OP				2
>> +#define CESA_TDMA_IV				4
>
> Should be 3 and not 4: TDMA_TYPE is an enum, not a bit field.

Ok

>
> Sometime it's better to offend the < 80 characters rule than doing
> funky stuff ;).

I just wanted to make checkpatch happy :D
Yeah, that's ugly, I agree. I will fix this.

>
>> +		memcpy_fromio(ablkreq->info, dreq->chain.last->data, ivsize);
>>   		return ret;
>> -
>> -	memcpy_fromio(ablkreq->info,
>> -		      engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET,
>> -		      crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(ablkreq)));
>> -
>> -	return 0;
>> +	}
>
> Missing blank line.

ack

>
>> +	return mv_cesa_ablkcipher_std_process(ablkreq, status);
>
> This version is more readable IMHO:
>
> 	struct mv_cesa_tdma_req *dreq;
> 	unsigned int ivsize;
> 	int ret;
>
> 	if (creq->req.base.type == CESA_STD_REQ)
> 		return mv_cesa_ablkcipher_std_process(ablkreq, status);
>
> 	ret = mv_cesa_dma_process(&creq->req.dma, status);
> 	if (ret)
> 		return ret;
>
> 	dreq = &creq->req.dma;
> 	ivsize =
> 	crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(ablkreq));
> 	memcpy_fromio(ablkreq->info, dreq->chain.last->data, ivsize);
>
> 	return 0;
>
>>
>>   static void mv_cesa_ablkcipher_step(struct crypto_async_request *req)
>> @@ -302,6 +307,7 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
>>   	struct mv_cesa_tdma_chain chain;
>>   	bool skip_ctx = false;
>>   	int ret;
>> +	unsigned int ivsize;
>>
>>   	dreq->base.type = CESA_DMA_REQ;
>>   	dreq->chain.first = NULL;
>> @@ -360,6 +366,14 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
>>
>>   	} while (mv_cesa_ablkcipher_req_iter_next_op(&iter));
>>
>> +	/* Add output data for IV */
>> +	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(req));
>> +	ret = mv_cesa_dma_add_iv_op(&chain, CESA_SA_CRYPT_IV_SRAM_OFFSET,
>> +				    ivsize, CESA_TDMA_SRC_IN_SRAM, flags);
>> +
>> +	if (ret)
>> +		goto err_free_tdma;
>> +
>>   	dreq->chain = chain;
>>
>>   	return 0;
>> diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
>> index d493714..88c87be 100644
>> --- a/drivers/crypto/marvell/tdma.c
>> +++ b/drivers/crypto/marvell/tdma.c
>> @@ -68,6 +68,9 @@ void mv_cesa_dma_cleanup(struct mv_cesa_tdma_req *dreq)
>>   		if (tdma->flags & CESA_TDMA_OP)
>
> I realize this test is wrong.
>
> It should be
> 		type = tdma->flags & CESA_TDMA_TYPE_MSK;
> 		if (type == CESA_TDMA_OP)
>
>>   			dma_pool_free(cesa_dev->dma->op_pool, tdma->op,
>>   				      le32_to_cpu(tdma->src));
>> +		else if (tdma->flags & CESA_TDMA_IV)
>
> and here

I propose a separate commit to fix this problem. What do you think?

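Put together, the fixed cleanup loop could look like this (just a sketch
based on the quoted context, assuming cur_dma is a plain dma_addr_t):

        for (tdma = dreq->chain.first; tdma;) {
                struct mv_cesa_tdma_desc *old_tdma = tdma;
                u32 type = tdma->flags & CESA_TDMA_TYPE_MSK;

                /* Compare the full type field, not individual bits. */
                if (type == CESA_TDMA_OP)
                        dma_pool_free(cesa_dev->dma->op_pool, tdma->op,
                                      le32_to_cpu(tdma->src));
                else if (type == CESA_TDMA_IV)
                        dma_pool_free(cesa_dev->dma->iv_pool, tdma->data,
                                      le32_to_cpu(tdma->dst));

                tdma = tdma->next;
                dma_pool_free(cesa_dev->dma->tdma_desc_pool, old_tdma,
                              old_tdma->cur_dma);
        }
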
> 		else if (type == CESA_TDMA_IV)
>
>> +			dma_pool_free(cesa_dev->dma->iv_pool, tdma->data,
>> +				      le32_to_cpu(tdma->dst));
>>
>>   		tdma = tdma->next;
>>   		dma_pool_free(cesa_dev->dma->tdma_desc_pool, old_tdma,
>> @@ -120,6 +123,32 @@ mv_cesa_dma_add_desc(struct mv_cesa_tdma_chain *chain, gfp_t flags)
>>   	return new_tdma;
>>   }
>>
>> +int mv_cesa_dma_add_iv_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
>> +			  u32 size, u32 flags, gfp_t gfp_flags)
>> +{
>> +
>> +	struct mv_cesa_tdma_desc *tdma;
>> +	u8 *cache;
>
> Why do you name that one cache? iv would be a better name.

ok

>
>> +	dma_addr_t dma_handle;
>> +
>> +	tdma = mv_cesa_dma_add_desc(chain, gfp_flags);
>> +	if (IS_ERR(tdma))
>> +		return PTR_ERR(tdma);
>> +
>> +	cache = dma_pool_alloc(cesa_dev->dma->iv_pool, flags, &dma_handle);
>> +	if (!cache)
>> +		return -ENOMEM;
>> +
>> +	tdma->byte_cnt = cpu_to_le32(size | BIT(31));
>> +	tdma->src = src;
>> +	tdma->dst = cpu_to_le32(dma_handle);
>> +	tdma->data = cache;
>> +
>> +	flags &= (CESA_TDMA_DST_IN_SRAM | CESA_TDMA_SRC_IN_SRAM);
>> +	tdma->flags = flags | CESA_TDMA_DATA | CESA_TDMA_IV;
>
> You should not mix 2 different types, it's either CESA_TDMA_DATA or
> CESA_TDMA_IV, and in this case it should be CESA_TDMA_IV.

good catch.

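So the assignment simply becomes:

        flags &= (CESA_TDMA_DST_IN_SRAM | CESA_TDMA_SRC_IN_SRAM);
        tdma->flags = flags | CESA_TDMA_IV;
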
>
>> +	return 0;
>> +}
>> +
>>   struct mv_cesa_op_ctx *mv_cesa_dma_add_op(struct mv_cesa_tdma_chain *chain,
>>   					const struct mv_cesa_op_ctx *op_templ,
>>   					bool skip_ctx,
>
>
>

Thanks,
Regards,
Romain
-- 
Romain Perier, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 3/7] crypto: marvell: Copy IV vectors by DMA transfers for acipher requests
  2016-06-16  8:29       ` Romain Perier
@ 2016-06-16  8:32         ` Gregory CLEMENT
  -1 siblings, 0 replies; 50+ messages in thread
From: Gregory CLEMENT @ 2016-06-16  8:32 UTC (permalink / raw)
  To: Romain Perier
  Cc: Boris Brezillon, Arnaud Ebalard, Thomas Petazzoni,
	David S. Miller, Russell King, linux-crypto, linux-arm-kernel

Hi Romain,
 
 On Thu., Jun 16 2016, Romain Perier <romain.perier@free-electrons.com> wrote:

>
>>> diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
>>> index 74071e4..74b84bd 100644
>>> --- a/drivers/crypto/marvell/cesa.h
>>> +++ b/drivers/crypto/marvell/cesa.h
>>> @@ -275,6 +275,7 @@ struct mv_cesa_op_ctx {
>>>   #define CESA_TDMA_DUMMY				0
>>>   #define CESA_TDMA_DATA				1
>>>   #define CESA_TDMA_OP				2
>>> +#define CESA_TDMA_IV				4
>>
>> Should be 3 and not 4: TDMA_TYPE is an enum, not a bit field.
>
> Ok
>
>>
>> Sometime it's better to offend the < 80 characters rule than doing
>> funky stuff ;).
>
> I just wanted to make checkpatch happy :D

In this case you can use a temporary variable.

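For example, assuming the over-long line in question is the inline ivsize
computation quoted earlier in the thread:

        unsigned int ivsize;

        ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(ablkreq));
        memcpy_fromio(ablkreq->info, dreq->chain.last->data, ivsize);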

-- 
Gregory Clement, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 4/7] crypto: marvell: Moving the tdma chain out of mv_cesa_tdma_req
  2016-06-15 20:42     ` Boris Brezillon
@ 2016-06-16 12:02       ` Romain Perier
  -1 siblings, 0 replies; 50+ messages in thread
From: Romain Perier @ 2016-06-16 12:02 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Arnaud Ebalard, Gregory Clement, Thomas Petazzoni,
	David S. Miller, Russell King, linux-crypto, linux-arm-kernel

Hello,

On 15/06/2016 22:42, Boris Brezillon wrote:
> On Wed, 15 Jun 2016 21:15:31 +0200
> Romain Perier <romain.perier@free-electrons.com> wrote:
>
>> Actually the only way to access the tdma chain is to use the 'req' union
>
> Currently, ...

ok

> Now that the dma specific fields are part of the base request there's no
> reason to keep this union.
>
> You can just put struct mv_cesa_req base; directly under struct
> mv_cesa_ablkcipher_req, and move mv_cesa_ablkcipher_std_req fields in
> mv_cesa_ablkcipher_req.


Well, I think that I might keep the changes related to mv_cesa_tdma_req 
in this commit (+ put struct mv_cesa_req base; directly under struct 
mv_cesa_ablkcipher_req) and move the changes related to 
mv_cesa_ablkcipher_std_req into another commit. What do you think?

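A sketch of the layout this would give (the exact remaining members are an
assumption):

        struct mv_cesa_ablkcipher_req {
                struct mv_cesa_req base;
                struct mv_cesa_ablkcipher_std_req std;
                int src_nents;
                int dst_nents;
        };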

> Initialize basereq earlier and pass it as the first argument of
> mv_cesa_dma_process().

ok



>> @@ -174,9 +174,9 @@ static inline void
>>   mv_cesa_ablkcipher_dma_prepare(struct ablkcipher_request *req)
>>   {
>>   	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
>> -	struct mv_cesa_tdma_req *dreq = &creq->req.dma;
>> +	struct mv_cesa_req *dreq = &creq->req.base;
>
> You named it basereq in mv_cesa_ablkcipher_step(). Try to be
> consistent, no matter the name.

ack

>
>>
>> -	mv_cesa_dma_prepare(dreq, dreq->base.engine);
>> +	mv_cesa_dma_prepare(dreq, dreq->engine);
>>   }
>>
>>   static inline void
>> @@ -199,7 +199,7 @@ static inline void mv_cesa_ablkcipher_prepare(struct crypto_async_request *req,
>>   	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
>>   	creq->req.base.engine = engine;
>>
>> -	if (creq->req.base.type == CESA_DMA_REQ)
>> +	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
>>   		mv_cesa_ablkcipher_dma_prepare(ablkreq);
>>   	else
>>   		mv_cesa_ablkcipher_std_prepare(ablkreq);
>> @@ -302,14 +302,13 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
>>   	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
>>   	gfp_t flags = (req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) ?
>>   		      GFP_KERNEL : GFP_ATOMIC;
>> -	struct mv_cesa_tdma_req *dreq = &creq->req.dma;
>> +	struct mv_cesa_req *dreq = &creq->req.base;
>
> Ditto.

ack


>> @@ -256,9 +256,9 @@ static int mv_cesa_ahash_std_process(struct ahash_request *req, u32 status)
>>   static inline void mv_cesa_ahash_dma_prepare(struct ahash_request *req)
>>   {
>>   	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
>> -	struct mv_cesa_tdma_req *dreq = &creq->req.dma.base;
>> +	struct mv_cesa_req *dreq = &creq->req.base;
>
> Ditto.

ack

>> @@ -340,7 +340,7 @@ static void mv_cesa_ahash_prepare(struct crypto_async_request *req,
>>
>>   	creq->req.base.engine = engine;
>>
>> -	if (creq->req.base.type == CESA_DMA_REQ)
>> +	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
>>   		mv_cesa_ahash_dma_prepare(ahashreq);
>>   	else
>>   		mv_cesa_ahash_std_prepare(ahashreq);
>> @@ -555,8 +555,7 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
>>   	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
>>   	gfp_t flags = (req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) ?
>>   		      GFP_KERNEL : GFP_ATOMIC;
>> -	struct mv_cesa_ahash_dma_req *ahashdreq = &creq->req.dma;
>> -	struct mv_cesa_tdma_req *dreq = &ahashdreq->base;
>> +	struct mv_cesa_req *dreq = &creq->req.base;
>
> Ditto.

ack

>
>>   	struct mv_cesa_ahash_dma_iter iter;
>>   	struct mv_cesa_op_ctx *op = NULL;
>>   	unsigned int frag_len;
>> @@ -662,11 +661,6 @@ static int mv_cesa_ahash_req_init(struct ahash_request *req, bool *cached)
>>   	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
>>   	int ret;
>>
>> -	if (cesa_dev->caps->has_tdma)
>> -		creq->req.base.type = CESA_DMA_REQ;
>> -	else
>> -		creq->req.base.type = CESA_STD_REQ;
>> -
>
> Hm, where is it decided now? I mean, I don't see this test anywhere
> else in your patch, which means you're now always using standard mode.

It has been replaced by mv_cesa_req_get_type() + initializing 
chain.first to NULL in std_init. So, that's the same thing, no?

>
>>   	creq->src_nents = sg_nents_for_len(req->src, req->nbytes);
>>   	if (creq->src_nents < 0) {
>>   		dev_err(cesa_dev->dev, "Invalid number of src SG");
>> @@ -680,7 +674,7 @@ static int mv_cesa_ahash_req_init(struct ahash_request *req, bool *cached)
>>   	if (*cached)
>>   		return 0;
>>
>> -	if (creq->req.base.type == CESA_DMA_REQ)
>> +	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)
>
> Should be
>
> 	if (cesa_dev->caps->has_tdma)
>
>>   		ret = mv_cesa_ahash_dma_req_init(req);

Why? mv_cesa_req_get_type() tests mv_cesa_req->chain and returns a code 
depending on its value. This value is initialized according to what is 
set in "has_tdma"...

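For reference, the helper boils down to a test on the chain, along these
lines (a sketch, assuming chain.first is only set when a TDMA chain was
built):

        static inline enum mv_cesa_req_type
        mv_cesa_req_get_type(struct mv_cesa_req *req)
        {
                return req->chain.first ? CESA_DMA_REQ : CESA_STD_REQ;
        }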

Thanks,
Regards,
Romain
-- 
Romain Perier, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 4/7] crypto: marvell: Moving the tdma chain out of mv_cesa_tdma_req
  2016-06-16 12:02       ` Romain Perier
@ 2016-06-16 12:45         ` Boris Brezillon
  -1 siblings, 0 replies; 50+ messages in thread
From: Boris Brezillon @ 2016-06-16 12:45 UTC (permalink / raw)
  To: Romain Perier
  Cc: Arnaud Ebalard, Gregory Clement, Thomas Petazzoni,
	David S. Miller, Russell King, linux-crypto, linux-arm-kernel

On Thu, 16 Jun 2016 14:02:42 +0200
Romain Perier <romain.perier@free-electrons.com> wrote:

> > Now that the dma specific fields are part of the base request there's no
> > reason to keep this union.
> >
> > You can just put struct mv_cesa_req base; directly under struct
> > mv_cesa_ablkcipher_req, and move mv_cesa_ablkcipher_std_req fields in
> > mv_cesa_ablkcipher_req.  
> 
> 
> Well, I think that I might keep the changes related to mv_cesa_tdma_req 
> in this commit (+ put struct mv_cesa_req base; directly under struct 
> mv_cesa_ablkcipher_req) and move the changes related to 
> mv_cesa_ablkcipher_std_req into another commit. What do you think?

Sounds good.

> >  
> >>   	struct mv_cesa_ahash_dma_iter iter;
> >>   	struct mv_cesa_op_ctx *op = NULL;
> >>   	unsigned int frag_len;
> >> @@ -662,11 +661,6 @@ static int mv_cesa_ahash_req_init(struct ahash_request *req, bool *cached)
> >>   	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
> >>   	int ret;
> >>
> >> -	if (cesa_dev->caps->has_tdma)
> >> -		creq->req.base.type = CESA_DMA_REQ;
> >> -	else
> >> -		creq->req.base.type = CESA_STD_REQ;
> >> -  
> >
> > Hm, where is it decided now? I mean, I don't see this test anywhere
> > else in your patch, which means you're now always using standard mode.  
> 
> It has been replaced by mv_cesa_req_get_type() + initializing 
> chain.first to NULL in std_init. So, that's the same thing, no ?

And that's exactly my point :-). When these fields are NULL the request
is a STD request...

> 
> >  
> >>   	creq->src_nents = sg_nents_for_len(req->src, req->nbytes);
> >>   	if (creq->src_nents < 0) {
> >>   		dev_err(cesa_dev->dev, "Invalid number of src SG");
> >> @@ -680,7 +674,7 @@ static int mv_cesa_ahash_req_init(struct ahash_request *req, bool *cached)
> >>   	if (*cached)
> >>   		return 0;
> >>
> >> -	if (creq->req.base.type == CESA_DMA_REQ)
> >> +	if (mv_cesa_req_get_type(&creq->req.base) == CESA_DMA_REQ)  

... and here you're testing if it's a DMA request, which will always be
false, since mv_cesa_ahash_dma_req_init() is the function supposed to
fill the ->first and ->last fields.

> >
> > Should be
> >
> > 	if (cesa_dev->caps->has_tdma)
> >  
> >>   		ret = mv_cesa_ahash_dma_req_init(req);  
> 
> Why? mv_cesa_req_get_type() tests mv_cesa_req->chain and returns a code 
> depending on its value. This value is initialized according to what is 
> set in "has_tdma"...

As explained above, it's not ;).
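
In other words, at mv_cesa_ahash_req_init() time nothing has filled
->chain.first yet, so mv_cesa_req_get_type() can only return
CESA_STD_REQ there. The DMA path has to be picked from the
capabilities, i.e.:

	if (cesa_dev->caps->has_tdma)
		ret = mv_cesa_ahash_dma_req_init(req);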


-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 4/7] crypto: marvell: Moving the tdma chain out of mv_cesa_tdma_req
  2016-06-16 12:02       ` Romain Perier
@ 2016-06-16 12:57         ` Boris Brezillon
  -1 siblings, 0 replies; 50+ messages in thread
From: Boris Brezillon @ 2016-06-16 12:57 UTC (permalink / raw)
  To: Romain Perier
  Cc: Arnaud Ebalard, Gregory Clement, Thomas Petazzoni,
	David S. Miller, Russell King, linux-crypto, linux-arm-kernel

On Thu, 16 Jun 2016 14:02:42 +0200
Romain Perier <romain.perier@free-electrons.com> wrote:

> > Now that the dma specific fields are part of the base request there's no
> > reason to keep this union.
> >
> > You can just put struct mv_cesa_req base; directly under struct
> > mv_cesa_ablkcipher_req, and move mv_cesa_ablkcipher_std_req fields in
> > mv_cesa_ablkcipher_req.  
> 
> 
> Well, I think that I might keep the changes related to mv_cesa_tdma_req 
> in this commit (+ put struct mv_cesa_req base; directly under struct 
> mv_cesa_ablkcipher_req) and move the changes related to 
> mv_cesa_ablkcipher_std_req into another commit. What do you think?

After re-reading the code, I'm not sure the last part (moving
mv_cesa_ablkcipher_std_req fields into mv_cesa_ablkcipher_req) is a
good idea anymore.

So let's just kill the union, and move mv_cesa_ablkcipher_std_req and
mv_cesa_req base into mv_cesa_ablkcipher_req (you'll also have to remove
the base field from the mv_cesa_ablkcipher_std_req struct).
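
i.e. something like this (a sketch; the nents fields assumed from the
existing struct):

	struct mv_cesa_ablkcipher_req {
		struct mv_cesa_req base;
		struct mv_cesa_ablkcipher_std_req std;
		int src_nents;
		int dst_nents;
	};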

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/7] crypto: marvell: Adding a complete operation for async requests
  2016-06-15 20:55     ` Boris Brezillon
@ 2016-06-16 13:41       ` Romain Perier
  -1 siblings, 0 replies; 50+ messages in thread
From: Romain Perier @ 2016-06-16 13:41 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Arnaud Ebalard, Gregory Clement, Thomas Petazzoni,
	David S. Miller, Russell King, linux-crypto, linux-arm-kernel

Hello,

On 15/06/2016 22:55, Boris Brezillon wrote:
>> +
>
> Nit: not sure you should mix this cosmetic change with the other
> changes.

Ok


> You already have ivsize initialized.
>
>> +		memcpy_fromio(ablkreq->info, basereq->chain.last->data, ivsize);
>
> Use memcpy() here.

good catch, for both.
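
(i.e. the line simply becomes

	memcpy(ablkreq->info, basereq->chain.last->data, ivsize);

assuming I read the buffer types right: the descriptor data here comes
from a DMA pool, not from iomem.)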

Thanks,
Romain
-- 
Romain Perier, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 6/7] crypto: marvell: Adding load balancing between engines
  2016-06-15 21:13     ` Boris Brezillon
@ 2016-06-16 13:44       ` Romain Perier
  -1 siblings, 0 replies; 50+ messages in thread
From: Romain Perier @ 2016-06-16 13:44 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Arnaud Ebalard, Gregory Clement, Thomas Petazzoni,
	David S. Miller, Russell King, linux-crypto, linux-arm-kernel

Hello,

On 15/06/2016 23:13, Boris Brezillon wrote:
> On Wed, 15 Jun 2016 21:15:33 +0200
> Romain Perier <romain.perier@free-electrons.com> wrote:
>
>> This commit adds support for fine-grained load balancing on
>> multi-engine IPs. The engine is pre-selected based on its current load
>> and on the weight of the crypto request that is about to be processed.
>> The global crypto queue is also moved to each engine. These changes are
>
> 					to the mv_cesa_engine object.
>
>> useful for preparing the code to support TDMA chaining between crypto
>> requests, because each tdma chain will be handled per engine.
>
> These changes are required to allow chaining crypto requests at the DMA
> level.

ack
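
(For reference, the pre-selection mentioned above is done by a helper
along these lines -- a simplified sketch of what the patch adds, with
the nengines/engines names assumed from the driver:

	static struct mv_cesa_engine *mv_cesa_select_engine(int weight)
	{
		u32 min_load = U32_MAX;
		struct mv_cesa_engine *selected = NULL;
		int i;

		/* Pick the engine with the smallest load counter. */
		for (i = 0; i < cesa_dev->caps->nengines; i++) {
			struct mv_cesa_engine *engine = cesa_dev->engines + i;
			u32 load = atomic_read(&engine->load);

			if (load < min_load) {
				min_load = load;
				selected = engine;
			}
		}

		/* Account for the new request's weight up front. */
		atomic_add(weight, &selected->load);

		return selected;
	}

the least-loaded engine wins, and the request's weight is added to its
load counter before the request is queued.)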

>> diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
>> index fbaae2f..02aa38f 100644
>> --- a/drivers/crypto/marvell/cipher.c
>> +++ b/drivers/crypto/marvell/cipher.c
>> @@ -89,6 +89,9 @@ static void mv_cesa_ablkcipher_std_step(struct ablkcipher_request *req)
>>   	size_t  len = min_t(size_t, req->nbytes - sreq->offset,
>>   			    CESA_SA_SRAM_PAYLOAD_SIZE);
>>
>> +	mv_cesa_adjust_op(engine, &sreq->op);
>> +	memcpy_toio(engine->sram, &sreq->op, sizeof(sreq->op));
>> +
>>   	len = sg_pcopy_to_buffer(req->src, creq->src_nents,
>>   				 engine->sram + CESA_SA_DATA_SRAM_OFFSET,
>>   				 len, sreq->offset);
>> @@ -167,12 +170,9 @@ mv_cesa_ablkcipher_std_prepare(struct ablkcipher_request *req)
>>   {
>>   	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
>>   	struct mv_cesa_ablkcipher_std_req *sreq = &creq->req.std;
>> -	struct mv_cesa_engine *engine = sreq->base.engine;
>>
>>   	sreq->size = 0;
>>   	sreq->offset = 0;
>> -	mv_cesa_adjust_op(engine, &sreq->op);
>> -	memcpy_toio(engine->sram, &sreq->op, sizeof(sreq->op));
>
> Are these changes really related to this load balancing support?
> AFAICT, it's something that could have been done earlier, and is not
> dependent on the changes you're introducing here, but maybe I'm missing
> something.

Yeah, indeed. I suggest doing it in a separate commit. What do you think?

>
>>   }
>
> [...]
>
>>   static int mv_cesa_ecb_aes_encrypt(struct ablkcipher_request *req)
>> diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
>> index f7f84cc..5946a69 100644
>> --- a/drivers/crypto/marvell/hash.c
>> +++ b/drivers/crypto/marvell/hash.c
>> @@ -162,6 +162,15 @@ static void mv_cesa_ahash_std_step(struct ahash_request *req)
>>   	unsigned int new_cache_ptr = 0;
>>   	u32 frag_mode;
>>   	size_t  len;
>> +	unsigned int digsize;
>> +	int i;
>> +
>> +	mv_cesa_adjust_op(engine, &creq->op_tmpl);
>> +	memcpy_toio(engine->sram, &creq->op_tmpl, sizeof(creq->op_tmpl));
>> +
>> +	digsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(req));
>> +	for (i = 0; i < digsize / 4; i++)
>> +		writel_relaxed(creq->state[i], engine->regs + CESA_IVDIG(i));
>>
>>   	if (creq->cache_ptr)
>>   		memcpy_toio(engine->sram + CESA_SA_DATA_SRAM_OFFSET,
>> @@ -265,11 +274,8 @@ static void mv_cesa_ahash_std_prepare(struct ahash_request *req)
>>   {
>>   	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
>>   	struct mv_cesa_ahash_std_req *sreq = &creq->req.std;
>> -	struct mv_cesa_engine *engine = sreq->base.engine;
>>
>>   	sreq->offset = 0;
>> -	mv_cesa_adjust_op(engine, &creq->op_tmpl);
>> -	memcpy_toio(engine->sram, &creq->op_tmpl, sizeof(creq->op_tmpl));
>
> Same as above: it doesn't seem related to the load balancing stuff.

It might be moved into the separate commit described above.

Thanks,
Romain
-- 
Romain Perier, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 7/7] crypto: marvell: Add support for chaining crypto requests in TDMA mode
  2016-06-15 21:43     ` Boris Brezillon
@ 2016-06-17  9:54       ` Romain Perier
  -1 siblings, 0 replies; 50+ messages in thread
From: Romain Perier @ 2016-06-17  9:54 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Arnaud Ebalard, Gregory Clement, Thomas Petazzoni,
	David S. Miller, Russell King, linux-crypto, linux-arm-kernel

Hello,

On 15/06/2016 23:43, Boris Brezillon wrote:
> On Wed, 15 Jun 2016 21:15:34 +0200
> Romain Perier <romain.perier@free-electrons.com> wrote:
>
>> The Cryptographic Engines and Security Accelerators (CESA) supports the
>> Multi-Packet Chain Mode. With this mode enabled, multiple tdma requests
>> can be chained and processed by the hardware without software
>> interferences.
>
> intervention.

ack


> Not necessarily before sending them to the engine; it can be done while
> the engine is running.

I re-worded it

> Coding style issue:
>
> struct crypto_async_request *
> mv_cesa_dequeue_req_locked(struct mv_cesa_engine *engine,
> 			   struct crypto_async_request **backlog)

ack

>
>> +{
>> +	struct crypto_async_request *req;
>> +
>> +	*backlog = crypto_get_backlog(&engine->queue);
>> +	req = crypto_dequeue_request(&engine->queue);
>> +
>> +	if (!req)
>> +		return NULL;
>> +
>> +	return req;
>> +}
>> +
>> +static void mv_cesa_rearm_engine(struct mv_cesa_engine *engine)
>>   {
>>   	struct crypto_async_request *req, *backlog;
>>   	struct mv_cesa_ctx *ctx;
>>
>> -	backlog = crypto_get_backlog(&engine->queue);
>> -	req = crypto_dequeue_request(&engine->queue);
>> -	engine->req = req;
>>
>> +	spin_lock_bh(&engine->lock);
>> +	if (engine->req)
>> +		goto out_unlock;
>> +
>> +	req = mv_cesa_dequeue_req_locked(engine, &backlog);
>>   	if (!req)
>> -		return;
>> +		goto out_unlock;
>> +
>> +	engine->req = req;
>> +	spin_unlock_bh(&engine->lock);
>
> I'm not a big fan of those multiple 'unlock() locations', and since
> your code is pretty simple I'd prefer seeing something like:

Mhhh, yes, I re-worked this function recently (the locking was more 
complicated before); I will change the code.

>
> 	spin_lock_bh(&engine->lock);
> 	if (!engine->req) {
> 		req = mv_cesa_dequeue_req_locked(engine, &backlog);
> 		engine->req = req;
> 	}
> 	spin_unlock_bh(&engine->lock);
>
> 	if (!req)
> 		return;
>
> With req and backlog initialized to NULL at the beginning of the
> function.

ack

>
>>
>>   	if (backlog)
>>   		backlog->complete(backlog, -EINPROGRESS);
>>
>>   	ctx = crypto_tfm_ctx(req->tfm);
>>   	ctx->ops->step(req);
>> +	return;
>
> Missing blank line.

ack

>
>> +out_unlock:
>> +	spin_unlock_bh(&engine->lock);
>> +}
>> +
>> +static int mv_cesa_std_process(struct mv_cesa_engine *engine, u32 status)
>> +{
>> +	struct crypto_async_request *req;
>> +	struct mv_cesa_ctx *ctx;
>> +	int res;
>> +
>> +	req = engine->req;
>> +	ctx = crypto_tfm_ctx(req->tfm);
>> +	res = ctx->ops->process(req, status);
>> +
>> +	if (res == 0) {
>> +		ctx->ops->complete(req);
>> +		mv_cesa_engine_enqueue_complete_request(engine, req);
>> +	} else if (res == -EINPROGRESS) {
>> +		ctx->ops->step(req);
>> +	} else {
>> +		ctx->ops->complete(req);
>
> Do we really have to call ->complete() in this case?

I was simply trying to be consistent with the old code (that is 
currently in mainline), but to be honest I don't think so...

>
>> +	}
>> +
>> +	return res;
>> +}
>> +
>> +static int mv_cesa_int_process(struct mv_cesa_engine *engine, u32 status)
>> +{
>> +	if (engine->chain.first && engine->chain.last)
>> +		return mv_cesa_tdma_process(engine, status);
>
> Missing blank line.

ack

>
>> +	return mv_cesa_std_process(engine, status);
>> +}
>> +
>> +static inline void mv_cesa_complete_req(struct mv_cesa_ctx *ctx,
>> +	struct crypto_async_request *req, int res)
>
> Align parameters to the open parenthesis.

ack



>> @@ -116,16 +181,15 @@ int mv_cesa_queue_req(struct crypto_async_request *req,
>>   	struct mv_cesa_engine *engine = creq->engine;
>>
>>   	spin_lock_bh(&engine->lock);
>> +	if (mv_cesa_req_get_type(creq) == CESA_DMA_REQ)
>> +		mv_cesa_tdma_chain(engine, creq);
>
> Missing blank line.

ack


>> diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
>> index 5626aa7..e0fee1f 100644
>> --- a/drivers/crypto/marvell/cesa.h
>> +++ b/drivers/crypto/marvell/cesa.h
>> @@ -271,7 +271,9 @@ struct mv_cesa_op_ctx {
>>   /* TDMA descriptor flags */
>>   #define CESA_TDMA_DST_IN_SRAM			BIT(31)
>>   #define CESA_TDMA_SRC_IN_SRAM			BIT(30)
>> -#define CESA_TDMA_TYPE_MSK			GENMASK(29, 0)
>> +#define CESA_TDMA_END_OF_REQ			BIT(29)
>> +#define CESA_TDMA_NOT_CHAIN			BIT(28)
>
> I would name it CESA_TDMA_BREAK_CHAIN.

ack
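
So the flag block becomes:

	#define CESA_TDMA_END_OF_REQ			BIT(29)
	#define CESA_TDMA_BREAK_CHAIN			BIT(28)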

>
>> +#define CESA_TDMA_TYPE_MSK			GENMASK(27, 0)
>>   #define CESA_TDMA_DUMMY				0
>>   #define CESA_TDMA_DATA				1
>>   #define CESA_TDMA_OP				2
>> @@ -431,6 +433,9 @@ struct mv_cesa_dev {
>>    *			SRAM
>>    * @queue:		fifo of the pending crypto requests
>>    * @load:		engine load counter, useful for load balancing
>> + * @chain:		list of the current tdma descriptors being processed
>> + * 			by this engine.
>> + * @complete_queue:	fifo of the processed requests by the engine
>>    *
>>    * Structure storing CESA engine information.
>>    */
>> @@ -448,6 +453,8 @@ struct mv_cesa_engine {
>>   	struct gen_pool *pool;
>>   	struct crypto_queue queue;
>>   	atomic_t load;
>> +	struct mv_cesa_tdma_chain chain;
>> +	struct list_head complete_queue;
>>   };
>>
>>   /**
>> @@ -618,6 +625,28 @@ struct mv_cesa_ahash_req {
>>
>>   extern struct mv_cesa_dev *cesa_dev;
>>
>> +
>> +static inline void mv_cesa_engine_enqueue_complete_request(
>> +	struct mv_cesa_engine *engine, struct crypto_async_request *req)
>
> Coding style issue (see my previous comments).

ok


>>
>> +struct crypto_async_request *mv_cesa_dequeue_req_locked(
>> +		      struct mv_cesa_engine *engine,
>> +		      struct crypto_async_request **backlog);
>
> Ditto.

ok


>> +void
>> +mv_cesa_tdma_chain(struct mv_cesa_engine *engine, struct mv_cesa_req *dreq)
>> +{
>> +	if (engine->chain.first == NULL && engine->chain.last == NULL) {
>> +		engine->chain.first = dreq->chain.first;
>> +		engine->chain.last  = dreq->chain.last;
>> +	} else {
>> +		struct mv_cesa_tdma_desc *last;
>> +
>> +		last = engine->chain.last;
>> +		last->next = dreq->chain.first;
>> +		engine->chain.last = dreq->chain.last;
>
> Missing blank line.

ack

>
>> +		if (!(last->flags & CESA_TDMA_NOT_CHAIN))
>> +			last->next_dma = dreq->chain.first->cur_dma;
>> +	}
>> +}
>> +
>> +int
>> +mv_cesa_tdma_process(struct mv_cesa_engine *engine, u32 status)
>> +{
>> +	struct crypto_async_request *req = NULL;
>> +	struct mv_cesa_tdma_desc *tdma = NULL, *next = NULL;
>> +	dma_addr_t tdma_cur;
>> +	int res = 0;
>> +
>> +	tdma_cur = readl(engine->regs + CESA_TDMA_CUR);
>> +
>> +	for (tdma = engine->chain.first; tdma; tdma = next) {
>> +		spin_lock_bh(&engine->lock);
>> +		next = tdma->next;
>> +		spin_unlock_bh(&engine->lock);
>> +
>> +		if (tdma->flags & CESA_TDMA_END_OF_REQ) {
>> +			struct crypto_async_request *backlog = NULL;
>> +			struct mv_cesa_ctx *ctx;
>> +
>> +			spin_lock_bh(&engine->lock);
>> +			/*
>> +			 * if req is NULL, this means we're processing the
>> +			 * request in engine->req.
>> +			 */
>> +			if (!req)
>> +				req = engine->req;
>> +			else
>> +				req = mv_cesa_dequeue_req_locked(engine,
>> +								 &backlog);
>> +
>> +			/* Re-chaining to the next request */
>> +			engine->chain.first = tdma->next;
>> +			tdma->next = NULL;
>> +
>> +			/* If this is the last request, clear the chain */
>> +			if (engine->chain.first == NULL)
>> +				engine->chain.last  = NULL;
>> +			spin_unlock_bh(&engine->lock);
>> +
>> +			ctx = crypto_tfm_ctx(req->tfm);
>> +			res = ctx->ops->process(req, status);
>
> Hm, that's not exactly true. The status you're passing here is only
> valid for the last request that has been processed. Say you queued 3
> requests. 2 of them were correctly processed, but the last one
> triggered an error. You don't want the first 2 requests to be
> considered bad.

I will re-work this part
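
Probably by passing the real status only to the request that was
actually active when the interrupt fired, roughly (a sketch):

	current_status = (tdma->cur_dma == tdma_cur) ?
			 status : CESA_SA_INT_ACC0_IDMA_DONE;
	res = ctx->ops->process(req, current_status);

so the requests that already completed in the chain are reported as
successful.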


>
>> +			ctx->ops->complete(req);
>> +
>> +			if (res == 0)
>> +				mv_cesa_engine_enqueue_complete_request(engine,
>> +									req);
>> +
>> +			if (backlog)
>> +				backlog->complete(backlog, -EINPROGRESS);
>> +		}
>
> Missing blank line.

ok

>
>> +		if (res || tdma->cur_dma == tdma_cur)
>> +			break;
>> +	}
>> +
>> +	if (res) {
>> +		spin_lock_bh(&engine->lock);
>> +		engine->req = req;
>> +		spin_unlock_bh(&engine->lock);
>> +	}
>
> Maybe you can add a comment explaining that you are actually setting
> the last processed request into engine->req, so that the core can know
> which request was faulty.
>
I added a comment


Thanks !
Romain
-- 
Romain Perier, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 50+ messages in thread
