* [PATCH v3 00/10] Chain crypto requests together at the DMA level
@ 2016-06-21  8:08 ` Romain Perier
  0 siblings, 0 replies; 32+ messages in thread
From: Romain Perier @ 2016-06-21  8:08 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: Thomas Petazzoni, Russell King, linux-crypto, Gregory Clement,
	David S. Miller, linux-arm-kernel

The Cryptographic Engines and Security Accelerators (CESA) support
TDMA chained mode. When this mode is enabled and crypto requests are
chained at the DMA level, multiple crypto requests can be handled by
the hardware engine without any software intervention. This approach
limits the number of interrupts generated by the engines, thus
improving their throughput and making the whole system behave nicely
under heavy crypto load.
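
Conceptually, chaining works by linking the first TDMA descriptor of a
newly queued request after the last descriptor of the chain the engine
is already processing, so the hardware moves on to the next request
without waiting for a completion interrupt. A rough sketch of the idea
(illustrative only, not the exact driver code):

	/* append the new request to the chain the engine is executing */
	last = running_chain_last;
	last->next = new_chain_first;              /* CPU-side link      */
	last->next_dma = new_chain_first->cur_dma; /* hardware-side link */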

Benchmarking results with dmcrypt
=================================
		I/O read	I/O write
Before		81.7 MB/s	31.7 MB/s
After		129  MB/s	39.8 MB/s

Improvement	+57.8 %		+25.5 %

Romain Perier (10):
  crypto: marvell: Add a macro constant for the size of the crypto queue
  crypto: marvell: Check engine is not already running when enabling a
    req
  crypto: marvell: Fix wrong type check in dma functions
  crypto: marvell: Copy IV vectors by DMA transfers for acipher requests
  crypto: marvell: Move tdma chain out of mv_cesa_tdma_req and remove it
  crypto: marvell: Add a complete operation for async requests
  crypto: marvell: Move SRAM I/O operations to step functions
  crypto: marvell: Add load balancing between engines
  crypto: marvell: Add support for chaining crypto requests in TDMA mode
  crypto: marvell: Increase the size of the crypto queue

 drivers/crypto/marvell/cesa.c   | 142 +++++++++++++++++++++++++++---------
 drivers/crypto/marvell/cesa.h   | 120 +++++++++++++++++++++---------
 drivers/crypto/marvell/cipher.c | 157 ++++++++++++++++++++++++----------------
 drivers/crypto/marvell/hash.c   | 150 ++++++++++++++++++--------------------
 drivers/crypto/marvell/tdma.c   | 132 +++++++++++++++++++++++++++++++--
 5 files changed, 483 insertions(+), 218 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v3 01/10] crypto: marvell: Add a macro constant for the size of the crypto queue
  2016-06-21  8:08 ` Romain Perier
@ 2016-06-21  8:08   ` Romain Perier
  -1 siblings, 0 replies; 32+ messages in thread
From: Romain Perier @ 2016-06-21  8:08 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: Gregory Clement, Thomas Petazzoni, David S. Miller, Russell King,
	linux-crypto, linux-arm-kernel

Add a macro constant to be used for the size of the crypto queue,
instead of using a numeric value directly. This will be easier to
maintain if more than one crypto queue of the same size is added later.
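
For instance, a later change introducing a second queue of the same
depth (say, one per engine) could simply reuse the constant. A
hypothetical sketch, not part of this patch:

	crypto_init_queue(&engine->queue, CESA_CRYPTO_DEFAULT_MAX_QLEN);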

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
---
 drivers/crypto/marvell/cesa.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
index 056a754..fb403e1 100644
--- a/drivers/crypto/marvell/cesa.c
+++ b/drivers/crypto/marvell/cesa.c
@@ -31,6 +31,9 @@
 
 #include "cesa.h"
 
+/* Limit of the crypto queue before reaching the backlog */
+#define CESA_CRYPTO_DEFAULT_MAX_QLEN 50
+
 static int allhwsupport = !IS_ENABLED(CONFIG_CRYPTO_DEV_MV_CESA);
 module_param_named(allhwsupport, allhwsupport, int, 0444);
 MODULE_PARM_DESC(allhwsupport, "Enable support for all hardware (even it if overlaps with the mv_cesa driver)");
@@ -416,7 +419,7 @@ static int mv_cesa_probe(struct platform_device *pdev)
 		return -ENOMEM;
 
 	spin_lock_init(&cesa->lock);
-	crypto_init_queue(&cesa->queue, 50);
+	crypto_init_queue(&cesa->queue, CESA_CRYPTO_DEFAULT_MAX_QLEN);
 	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "regs");
 	cesa->regs = devm_ioremap_resource(dev, res);
 	if (IS_ERR(cesa->regs))
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v3 02/10] crypto: marvell: Check engine is not already running when enabling a req
  2016-06-21  8:08 ` Romain Perier
@ 2016-06-21  8:08   ` Romain Perier
  -1 siblings, 0 replies; 32+ messages in thread
From: Romain Perier @ 2016-06-21  8:08 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: Gregory Clement, Thomas Petazzoni, David S. Miller, Russell King,
	linux-crypto, linux-arm-kernel

Add a BUG_ON() call when the driver tries to launch a crypto request
while the engine is still processing the previous one. This replaces a
silent system hang with a verbose kernel panic and the associated
backtrace, letting the user know that something went wrong in the CESA
driver.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
---

Changes in v3:
  - Fixed incorrectly aligned parameter for BUG_ON in
    mv_cesa_ablkcipher_std_step

Changes in v2:
  - Reworded the commit message
  - Made cosmetic changes


 drivers/crypto/marvell/cipher.c | 2 ++
 drivers/crypto/marvell/hash.c   | 2 ++
 drivers/crypto/marvell/tdma.c   | 2 ++
 3 files changed, 6 insertions(+)

diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index dcf1fce..8c1432e 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -106,6 +106,8 @@ static void mv_cesa_ablkcipher_std_step(struct ablkcipher_request *req)
 
 	mv_cesa_set_int_mask(engine, CESA_SA_INT_ACCEL0_DONE);
 	writel_relaxed(CESA_SA_CFG_PARA_DIS, engine->regs + CESA_SA_CFG);
+	BUG_ON(readl(engine->regs + CESA_SA_CMD) &
+	       CESA_SA_CMD_EN_CESA_SA_ACCL0);
 	writel(CESA_SA_CMD_EN_CESA_SA_ACCL0, engine->regs + CESA_SA_CMD);
 }
 
diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index 7ca2e0f..80bddd7 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -237,6 +237,8 @@ static void mv_cesa_ahash_std_step(struct ahash_request *req)
 
 	mv_cesa_set_int_mask(engine, CESA_SA_INT_ACCEL0_DONE);
 	writel_relaxed(CESA_SA_CFG_PARA_DIS, engine->regs + CESA_SA_CFG);
+	BUG_ON(readl(engine->regs + CESA_SA_CMD) &
+	       CESA_SA_CMD_EN_CESA_SA_ACCL0);
 	writel(CESA_SA_CMD_EN_CESA_SA_ACCL0, engine->regs + CESA_SA_CMD);
 }
 
diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
index 7642798..8c86bb6 100644
--- a/drivers/crypto/marvell/tdma.c
+++ b/drivers/crypto/marvell/tdma.c
@@ -53,6 +53,8 @@ void mv_cesa_dma_step(struct mv_cesa_tdma_req *dreq)
 		       engine->regs + CESA_SA_CFG);
 	writel_relaxed(dreq->chain.first->cur_dma,
 		       engine->regs + CESA_TDMA_NEXT_ADDR);
+	BUG_ON(readl(engine->regs + CESA_SA_CMD) &
+	       CESA_SA_CMD_EN_CESA_SA_ACCL0);
 	writel(CESA_SA_CMD_EN_CESA_SA_ACCL0, engine->regs + CESA_SA_CMD);
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v3 03/10] crypto: marvell: Fix wrong type check in dma functions
  2016-06-21  8:08 ` Romain Perier
@ 2016-06-21  8:08   ` Romain Perier
  -1 siblings, 0 replies; 32+ messages in thread
From: Romain Perier @ 2016-06-21  8:08 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: Gregory Clement, Thomas Petazzoni, David S. Miller, Russell King,
	linux-crypto, linux-arm-kernel

So far, the type of a TDMA operation was checked incorrectly. We have
to apply the type mask to the flags field in order to extract the part
that actually encodes the type of the operation.
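
To illustrate (based on the type values defined in cesa.h): the
descriptor type is an enumerated value stored in the low bits of
tdma->flags, not a set of independent flag bits, so a bitwise test such
as

	if (tdma->flags & CESA_TDMA_OP)

also matches any type whose encoding shares a bit with CESA_TDMA_OP (2),
e.g. the CESA_TDMA_IV (3) type introduced later in this series. Masking
first and comparing for equality,

	if ((tdma->flags & CESA_TDMA_TYPE_MSK) == CESA_TDMA_OP)

only matches real operation descriptors.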

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
---
 drivers/crypto/marvell/tdma.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
index 8c86bb6..de8c253 100644
--- a/drivers/crypto/marvell/tdma.c
+++ b/drivers/crypto/marvell/tdma.c
@@ -64,8 +64,9 @@ void mv_cesa_dma_cleanup(struct mv_cesa_tdma_req *dreq)
 
 	for (tdma = dreq->chain.first; tdma;) {
 		struct mv_cesa_tdma_desc *old_tdma = tdma;
+		u32 type = tdma->flags & CESA_TDMA_TYPE_MSK;
 
-		if (tdma->flags & CESA_TDMA_OP)
+		if (type == CESA_TDMA_OP)
 			dma_pool_free(cesa_dev->dma->op_pool, tdma->op,
 				      le32_to_cpu(tdma->src));
 
@@ -90,7 +91,7 @@ void mv_cesa_dma_prepare(struct mv_cesa_tdma_req *dreq,
 		if (tdma->flags & CESA_TDMA_SRC_IN_SRAM)
 			tdma->src = cpu_to_le32(tdma->src + engine->sram_dma);
 
-		if (tdma->flags & CESA_TDMA_OP)
+		if ((tdma->flags & CESA_TDMA_TYPE_MSK) == CESA_TDMA_OP)
 			mv_cesa_adjust_op(engine, tdma->op);
 	}
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v3 04/10] crypto: marvell: Copy IV vectors by DMA transfers for acipher requests
  2016-06-21  8:08 ` Romain Perier
@ 2016-06-21  8:08   ` Romain Perier
  -1 siblings, 0 replies; 32+ messages in thread
From: Romain Perier @ 2016-06-21  8:08 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: Gregory Clement, Thomas Petazzoni, David S. Miller, Russell King,
	linux-crypto, linux-arm-kernel

Add a TDMA descriptor at the end of the request for copying the output
IV vector via a DMA transfer. This is a good way to offload as much
processing as possible to the DMA and the crypto engine. It is also
required for processing multiple cipher requests in chained mode:
otherwise the content of the IV vector would be overwritten by the last
processed request.
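
With this change, the TDMA chain built for a DMA-backed cipher request
ends with one extra descriptor that copies the updated IV from the
engine SRAM into a small per-request DRAM buffer. Roughly (illustrative
layout only, not the exact descriptor contents):

	[copy-in xfers] -> [op] -> [copy-out xfers] -> ... -> [IV copy: SRAM -> DRAM]

The completion handler can then read the IV from that buffer
(chain.last->data) instead of from the shared SRAM location, so each
request keeps its own output IV even when several requests are chained.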

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
---

Changes in v3:
  - Fixed coding style issues

Changes in v2:
  - Reworded the commit message, as the term 'asynchronously' was
    ambiguous
  - Changed the value of CESA_TDMA_IV from 4 to 3
  - Adding missing blank lines
  - Rewrote the function mv_cesa_ablkcipher_process to something more
    readable.
  - Fixed a bug in how the type of a TDMA operation was tested in
    mv_cesa_dma_cleanup and mv_cesa_dma_prepare; created a separate
    commit for that (see PATCH 03/10)
  - Renamed variables in mv_cesa_dma_add_iv_op
  - Removed the flag CESA_TDMA_DATA from mv_cesa_dma_add_iv_op (not
    needed)

 drivers/crypto/marvell/cesa.c   |  4 ++++
 drivers/crypto/marvell/cesa.h   |  5 +++++
 drivers/crypto/marvell/cipher.c | 31 ++++++++++++++++++++++---------
 drivers/crypto/marvell/tdma.c   | 29 +++++++++++++++++++++++++++++
 4 files changed, 60 insertions(+), 9 deletions(-)

diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
index fb403e1..93700cd 100644
--- a/drivers/crypto/marvell/cesa.c
+++ b/drivers/crypto/marvell/cesa.c
@@ -312,6 +312,10 @@ static int mv_cesa_dev_dma_init(struct mv_cesa_dev *cesa)
 	if (!dma->padding_pool)
 		return -ENOMEM;
 
+	dma->iv_pool = dmam_pool_create("cesa_iv", dev, 16, 1, 0);
+	if (!dma->iv_pool)
+		return -ENOMEM;
+
 	cesa->dma = dma;
 
 	return 0;
diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
index 74071e4..685a627 100644
--- a/drivers/crypto/marvell/cesa.h
+++ b/drivers/crypto/marvell/cesa.h
@@ -275,6 +275,7 @@ struct mv_cesa_op_ctx {
 #define CESA_TDMA_DUMMY				0
 #define CESA_TDMA_DATA				1
 #define CESA_TDMA_OP				2
+#define CESA_TDMA_IV				3
 
 /**
  * struct mv_cesa_tdma_desc - TDMA descriptor
@@ -390,6 +391,7 @@ struct mv_cesa_dev_dma {
 	struct dma_pool *op_pool;
 	struct dma_pool *cache_pool;
 	struct dma_pool *padding_pool;
+	struct dma_pool *iv_pool;
 };
 
 /**
@@ -790,6 +792,9 @@ mv_cesa_tdma_desc_iter_init(struct mv_cesa_tdma_chain *chain)
 	memset(chain, 0, sizeof(*chain));
 }
 
+int mv_cesa_dma_add_iv_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
+			  u32 size, u32 flags, gfp_t gfp_flags);
+
 struct mv_cesa_op_ctx *mv_cesa_dma_add_op(struct mv_cesa_tdma_chain *chain,
 					const struct mv_cesa_op_ctx *op_templ,
 					bool skip_ctx,
diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index ec23609..908be86 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -118,6 +118,7 @@ static int mv_cesa_ablkcipher_std_process(struct ablkcipher_request *req,
 	struct mv_cesa_ablkcipher_std_req *sreq = &creq->req.std;
 	struct mv_cesa_engine *engine = sreq->base.engine;
 	size_t len;
+	unsigned int ivsize;
 
 	len = sg_pcopy_from_buffer(req->dst, creq->dst_nents,
 				   engine->sram + CESA_SA_DATA_SRAM_OFFSET,
@@ -127,6 +128,10 @@ static int mv_cesa_ablkcipher_std_process(struct ablkcipher_request *req,
 	if (sreq->offset < req->nbytes)
 		return -EINPROGRESS;
 
+	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(req));
+	memcpy_fromio(req->info,
+		      engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET, ivsize);
+
 	return 0;
 }
 
@@ -135,21 +140,20 @@ static int mv_cesa_ablkcipher_process(struct crypto_async_request *req,
 {
 	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
-	struct mv_cesa_ablkcipher_std_req *sreq = &creq->req.std;
-	struct mv_cesa_engine *engine = sreq->base.engine;
+	struct mv_cesa_tdma_req *dreq;
+	unsigned int ivsize;
 	int ret;
 
-	if (creq->req.base.type == CESA_DMA_REQ)
-		ret = mv_cesa_dma_process(&creq->req.dma, status);
-	else
-		ret = mv_cesa_ablkcipher_std_process(ablkreq, status);
+	if (creq->req.base.type == CESA_STD_REQ)
+		return mv_cesa_ablkcipher_std_process(ablkreq, status);
 
+	ret = mv_cesa_dma_process(&creq->req.dma, status);
 	if (ret)
 		return ret;
 
-	memcpy_fromio(ablkreq->info,
-		      engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET,
-		      crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(ablkreq)));
+	dreq = &creq->req.dma;
+	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(ablkreq));
+	memcpy_fromio(ablkreq->info, dreq->chain.last->data, ivsize);
 
 	return 0;
 }
@@ -302,6 +306,7 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
 	struct mv_cesa_tdma_chain chain;
 	bool skip_ctx = false;
 	int ret;
+	unsigned int ivsize;
 
 	dreq->base.type = CESA_DMA_REQ;
 	dreq->chain.first = NULL;
@@ -360,6 +365,14 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
 
 	} while (mv_cesa_ablkcipher_req_iter_next_op(&iter));
 
+	/* Add output data for IV */
+	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(req));
+	ret = mv_cesa_dma_add_iv_op(&chain, CESA_SA_CRYPT_IV_SRAM_OFFSET,
+				    ivsize, CESA_TDMA_SRC_IN_SRAM, flags);
+
+	if (ret)
+		goto err_free_tdma;
+
 	dreq->chain = chain;
 
 	return 0;
diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
index de8c253..01dda58 100644
--- a/drivers/crypto/marvell/tdma.c
+++ b/drivers/crypto/marvell/tdma.c
@@ -69,6 +69,9 @@ void mv_cesa_dma_cleanup(struct mv_cesa_tdma_req *dreq)
 		if (type == CESA_TDMA_OP)
 			dma_pool_free(cesa_dev->dma->op_pool, tdma->op,
 				      le32_to_cpu(tdma->src));
+		else if (type == CESA_TDMA_IV)
+			dma_pool_free(cesa_dev->dma->iv_pool, tdma->data,
+				      le32_to_cpu(tdma->dst));
 
 		tdma = tdma->next;
 		dma_pool_free(cesa_dev->dma->tdma_desc_pool, old_tdma,
@@ -121,6 +124,32 @@ mv_cesa_dma_add_desc(struct mv_cesa_tdma_chain *chain, gfp_t flags)
 	return new_tdma;
 }
 
+int mv_cesa_dma_add_iv_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
+			  u32 size, u32 flags, gfp_t gfp_flags)
+{
+
+	struct mv_cesa_tdma_desc *tdma;
+	u8 *iv;
+	dma_addr_t dma_handle;
+
+	tdma = mv_cesa_dma_add_desc(chain, gfp_flags);
+	if (IS_ERR(tdma))
+		return PTR_ERR(tdma);
+
+	iv = dma_pool_alloc(cesa_dev->dma->iv_pool, gfp_flags, &dma_handle);
+	if (!iv)
+		return -ENOMEM;
+
+	tdma->byte_cnt = cpu_to_le32(size | BIT(31));
+	tdma->src = src;
+	tdma->dst = cpu_to_le32(dma_handle);
+	tdma->data = iv;
+
+	flags &= (CESA_TDMA_DST_IN_SRAM | CESA_TDMA_SRC_IN_SRAM);
+	tdma->flags = flags | CESA_TDMA_IV;
+	return 0;
+}
+
 struct mv_cesa_op_ctx *mv_cesa_dma_add_op(struct mv_cesa_tdma_chain *chain,
 					const struct mv_cesa_op_ctx *op_templ,
 					bool skip_ctx,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v3 05/10] crypto: marvell: Move tdma chain out of mv_cesa_tdma_req and remove it
  2016-06-21  8:08 ` Romain Perier
@ 2016-06-21  8:08   ` Romain Perier
  -1 siblings, 0 replies; 32+ messages in thread
From: Romain Perier @ 2016-06-21  8:08 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: Thomas Petazzoni, Russell King, linux-crypto, Gregory Clement,
	David S. Miller, linux-arm-kernel

Currently, the only way to access the tdma chain is to use the 'req'
union from a mv_cesa_{ablkcipher,ahash}_req. This will soon become a
problem if we want to handle TDMA chaining vs standard/non-DMA
processing in a generic way (with generic functions at the cesa.c level
detecting whether the request should be queued at the DMA level or
not). Hence the decision to move the chain field to the mv_cesa_req
level, at the expense of adding two extra pointer fields to all request
contexts (including non-DMA ones). To limit the overhead, we get rid of
the type field, which can now be deduced from the req->chain.first
value. Once these changes are done, the union is no longer needed, so
remove it and move mv_cesa_ablkcipher_std_req and mv_cesa_req into
mv_cesa_ablkcipher_req directly. There is also no need to keep the
'base' field in the union of mv_cesa_ahash_req, so move it into the
upper structure.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
---

Changes in v2:
  - Reworded the commit log
  - In mv_cesa_ablkcipher_req moved 'base' and 'std' into the upper
    structure. Also removed the union
  - Removed 'base' from mv_cesa_ablkcipher_std_req
  - In mv_cesa_hash_req moved 'base' into the upper structure
  - Removed 'base' from mv_cesa_ahash_std_req and mv_cesa_ahash_dma_req
  - Cosmetic changes: variables renaming, missing blank lines
  - Replaced the test in mv_cesa_ahash_req_init from
    'mv_cesa_req_get_type == CESA_DMA_REQ' to 'cesa_dev->caps->has_tdma',
    now mv_cesa_hash_dma_req_init is really called. 

 drivers/crypto/marvell/cesa.c   |  3 +-
 drivers/crypto/marvell/cesa.h   | 44 ++++++++++-----------------
 drivers/crypto/marvell/cipher.c | 66 +++++++++++++++++++++--------------------
 drivers/crypto/marvell/hash.c   | 64 ++++++++++++++++++---------------------
 drivers/crypto/marvell/tdma.c   |  8 ++---
 5 files changed, 85 insertions(+), 100 deletions(-)

diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
index 93700cd..fe04d1b 100644
--- a/drivers/crypto/marvell/cesa.c
+++ b/drivers/crypto/marvell/cesa.c
@@ -111,7 +111,8 @@ static irqreturn_t mv_cesa_int(int irq, void *priv)
 	return ret;
 }
 
-int mv_cesa_queue_req(struct crypto_async_request *req)
+int mv_cesa_queue_req(struct crypto_async_request *req,
+		      struct mv_cesa_req *creq)
 {
 	int ret;
 	int i;
diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
index 685a627..e67e3f1 100644
--- a/drivers/crypto/marvell/cesa.h
+++ b/drivers/crypto/marvell/cesa.h
@@ -509,21 +509,11 @@ enum mv_cesa_req_type {
 
 /**
  * struct mv_cesa_req - CESA request
- * @type:	request type
  * @engine:	engine associated with this request
+ * @chain:	list of tdma descriptors associated  with this request
  */
 struct mv_cesa_req {
-	enum mv_cesa_req_type type;
 	struct mv_cesa_engine *engine;
-};
-
-/**
- * struct mv_cesa_tdma_req - CESA TDMA request
- * @base:	base information
- * @chain:	TDMA chain
- */
-struct mv_cesa_tdma_req {
-	struct mv_cesa_req base;
 	struct mv_cesa_tdma_chain chain;
 };
 
@@ -540,13 +530,11 @@ struct mv_cesa_sg_std_iter {
 
 /**
  * struct mv_cesa_ablkcipher_std_req - cipher standard request
- * @base:	base information
  * @op:		operation context
  * @offset:	current operation offset
  * @size:	size of the crypto operation
  */
 struct mv_cesa_ablkcipher_std_req {
-	struct mv_cesa_req base;
 	struct mv_cesa_op_ctx op;
 	unsigned int offset;
 	unsigned int size;
@@ -560,34 +548,27 @@ struct mv_cesa_ablkcipher_std_req {
  * @dst_nents:	number of entries in the dest sg list
  */
 struct mv_cesa_ablkcipher_req {
-	union {
-		struct mv_cesa_req base;
-		struct mv_cesa_tdma_req dma;
-		struct mv_cesa_ablkcipher_std_req std;
-	} req;
+	struct mv_cesa_req base;
+	struct mv_cesa_ablkcipher_std_req std;
 	int src_nents;
 	int dst_nents;
 };
 
 /**
  * struct mv_cesa_ahash_std_req - standard hash request
- * @base:	base information
  * @offset:	current operation offset
  */
 struct mv_cesa_ahash_std_req {
-	struct mv_cesa_req base;
 	unsigned int offset;
 };
 
 /**
  * struct mv_cesa_ahash_dma_req - DMA hash request
- * @base:		base information
  * @padding:		padding buffer
  * @padding_dma:	DMA address of the padding buffer
  * @cache_dma:		DMA address of the cache buffer
  */
 struct mv_cesa_ahash_dma_req {
-	struct mv_cesa_tdma_req base;
 	u8 *padding;
 	dma_addr_t padding_dma;
 	u8 *cache;
@@ -606,8 +587,8 @@ struct mv_cesa_ahash_dma_req {
  * @state:		hash state
  */
 struct mv_cesa_ahash_req {
+	struct mv_cesa_req base;
 	union {
-		struct mv_cesa_req base;
 		struct mv_cesa_ahash_dma_req dma;
 		struct mv_cesa_ahash_std_req std;
 	} req;
@@ -625,6 +606,12 @@ struct mv_cesa_ahash_req {
 
 extern struct mv_cesa_dev *cesa_dev;
 
+static inline enum mv_cesa_req_type
+mv_cesa_req_get_type(struct mv_cesa_req *req)
+{
+	return req->chain.first ? CESA_DMA_REQ : CESA_STD_REQ;
+}
+
 static inline void mv_cesa_update_op_cfg(struct mv_cesa_op_ctx *op,
 					 u32 cfg, u32 mask)
 {
@@ -697,7 +684,8 @@ static inline bool mv_cesa_mac_op_is_first_frag(const struct mv_cesa_op_ctx *op)
 		CESA_SA_DESC_CFG_FIRST_FRAG;
 }
 
-int mv_cesa_queue_req(struct crypto_async_request *req);
+int mv_cesa_queue_req(struct crypto_async_request *req,
+		      struct mv_cesa_req *creq);
 
 /*
  * Helper function that indicates whether a crypto request needs to be
@@ -767,9 +755,9 @@ static inline bool mv_cesa_req_dma_iter_next_op(struct mv_cesa_dma_iter *iter)
 	return iter->op_len;
 }
 
-void mv_cesa_dma_step(struct mv_cesa_tdma_req *dreq);
+void mv_cesa_dma_step(struct mv_cesa_req *dreq);
 
-static inline int mv_cesa_dma_process(struct mv_cesa_tdma_req *dreq,
+static inline int mv_cesa_dma_process(struct mv_cesa_req *dreq,
 				      u32 status)
 {
 	if (!(status & CESA_SA_INT_ACC0_IDMA_DONE))
@@ -781,10 +769,10 @@ static inline int mv_cesa_dma_process(struct mv_cesa_tdma_req *dreq,
 	return 0;
 }
 
-void mv_cesa_dma_prepare(struct mv_cesa_tdma_req *dreq,
+void mv_cesa_dma_prepare(struct mv_cesa_req *dreq,
 			 struct mv_cesa_engine *engine);
+void mv_cesa_dma_cleanup(struct mv_cesa_req *dreq);
 
-void mv_cesa_dma_cleanup(struct mv_cesa_tdma_req *dreq);
 
 static inline void
 mv_cesa_tdma_desc_iter_init(struct mv_cesa_tdma_chain *chain)
diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index ded5feb..ffe0f4a 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -70,22 +70,22 @@ mv_cesa_ablkcipher_dma_cleanup(struct ablkcipher_request *req)
 		dma_unmap_sg(cesa_dev->dev, req->src, creq->src_nents,
 			     DMA_BIDIRECTIONAL);
 	}
-	mv_cesa_dma_cleanup(&creq->req.dma);
+	mv_cesa_dma_cleanup(&creq->base);
 }
 
 static inline void mv_cesa_ablkcipher_cleanup(struct ablkcipher_request *req)
 {
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 
-	if (creq->req.base.type == CESA_DMA_REQ)
+	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ)
 		mv_cesa_ablkcipher_dma_cleanup(req);
 }
 
 static void mv_cesa_ablkcipher_std_step(struct ablkcipher_request *req)
 {
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
-	struct mv_cesa_ablkcipher_std_req *sreq = &creq->req.std;
-	struct mv_cesa_engine *engine = sreq->base.engine;
+	struct mv_cesa_ablkcipher_std_req *sreq = &creq->std;
+	struct mv_cesa_engine *engine = creq->base.engine;
 	size_t  len = min_t(size_t, req->nbytes - sreq->offset,
 			    CESA_SA_SRAM_PAYLOAD_SIZE);
 
@@ -115,8 +115,8 @@ static int mv_cesa_ablkcipher_std_process(struct ablkcipher_request *req,
 					  u32 status)
 {
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
-	struct mv_cesa_ablkcipher_std_req *sreq = &creq->req.std;
-	struct mv_cesa_engine *engine = sreq->base.engine;
+	struct mv_cesa_ablkcipher_std_req *sreq = &creq->std;
+	struct mv_cesa_engine *engine = creq->base.engine;
 	size_t len;
 	unsigned int ivsize;
 
@@ -140,21 +140,19 @@ static int mv_cesa_ablkcipher_process(struct crypto_async_request *req,
 {
 	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
-	struct mv_cesa_tdma_req *dreq;
+	struct mv_cesa_req *basereq = &creq->base;
 	unsigned int ivsize;
 	int ret;
 
-	if (creq->req.base.type == CESA_STD_REQ)
+	if (mv_cesa_req_get_type(basereq) == CESA_STD_REQ)
 		return mv_cesa_ablkcipher_std_process(ablkreq, status);
 
-	ret = mv_cesa_dma_process(&creq->req.dma, status);
+	ret = mv_cesa_dma_process(basereq, status);
 	if (ret)
 		return ret;
 
-	dreq = &creq->req.dma;
-	ivsize =
-	crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(ablkreq));
-	memcpy_fromio(ablkreq->info, dreq->chain.last->data, ivsize);
+	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(ablkreq));
+	memcpy_fromio(ablkreq->info, basereq->chain.last->data, ivsize);
 
 	return 0;
 }
@@ -164,8 +162,8 @@ static void mv_cesa_ablkcipher_step(struct crypto_async_request *req)
 	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
 
-	if (creq->req.base.type == CESA_DMA_REQ)
-		mv_cesa_dma_step(&creq->req.dma);
+	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ)
+		mv_cesa_dma_step(&creq->base);
 	else
 		mv_cesa_ablkcipher_std_step(ablkreq);
 }
@@ -174,17 +172,17 @@ static inline void
 mv_cesa_ablkcipher_dma_prepare(struct ablkcipher_request *req)
 {
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
-	struct mv_cesa_tdma_req *dreq = &creq->req.dma;
+	struct mv_cesa_req *basereq = &creq->base;
 
-	mv_cesa_dma_prepare(dreq, dreq->base.engine);
+	mv_cesa_dma_prepare(basereq, basereq->engine);
 }
 
 static inline void
 mv_cesa_ablkcipher_std_prepare(struct ablkcipher_request *req)
 {
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
-	struct mv_cesa_ablkcipher_std_req *sreq = &creq->req.std;
-	struct mv_cesa_engine *engine = sreq->base.engine;
+	struct mv_cesa_ablkcipher_std_req *sreq = &creq->std;
+	struct mv_cesa_engine *engine = creq->base.engine;
 
 	sreq->size = 0;
 	sreq->offset = 0;
@@ -197,9 +195,9 @@ static inline void mv_cesa_ablkcipher_prepare(struct crypto_async_request *req,
 {
 	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
-	creq->req.base.engine = engine;
+	creq->base.engine = engine;
 
-	if (creq->req.base.type == CESA_DMA_REQ)
+	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ)
 		mv_cesa_ablkcipher_dma_prepare(ablkreq);
 	else
 		mv_cesa_ablkcipher_std_prepare(ablkreq);
@@ -302,16 +300,15 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 	gfp_t flags = (req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) ?
 		      GFP_KERNEL : GFP_ATOMIC;
-	struct mv_cesa_tdma_req *dreq = &creq->req.dma;
+	struct mv_cesa_req *basereq = &creq->base;
 	struct mv_cesa_ablkcipher_dma_iter iter;
 	struct mv_cesa_tdma_chain chain;
 	bool skip_ctx = false;
 	int ret;
 	unsigned int ivsize;
 
-	dreq->base.type = CESA_DMA_REQ;
-	dreq->chain.first = NULL;
-	dreq->chain.last = NULL;
+	basereq->chain.first = NULL;
+	basereq->chain.last = NULL;
 
 	if (req->src != req->dst) {
 		ret = dma_map_sg(cesa_dev->dev, req->src, creq->src_nents,
@@ -374,12 +371,12 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
 	if (ret)
 		goto err_free_tdma;
 
-	dreq->chain = chain;
+	basereq->chain = chain;
 
 	return 0;
 
 err_free_tdma:
-	mv_cesa_dma_cleanup(dreq);
+	mv_cesa_dma_cleanup(basereq);
 	if (req->dst != req->src)
 		dma_unmap_sg(cesa_dev->dev, req->dst, creq->dst_nents,
 			     DMA_FROM_DEVICE);
@@ -396,11 +393,13 @@ mv_cesa_ablkcipher_std_req_init(struct ablkcipher_request *req,
 				const struct mv_cesa_op_ctx *op_templ)
 {
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
-	struct mv_cesa_ablkcipher_std_req *sreq = &creq->req.std;
+	struct mv_cesa_ablkcipher_std_req *sreq = &creq->std;
+	struct mv_cesa_req *basereq = &creq->base;
 
-	sreq->base.type = CESA_STD_REQ;
 	sreq->op = *op_templ;
 	sreq->skip_ctx = false;
+	basereq->chain.first = NULL;
+	basereq->chain.last = NULL;
 
 	return 0;
 }
@@ -442,6 +441,7 @@ static int mv_cesa_ablkcipher_req_init(struct ablkcipher_request *req,
 static int mv_cesa_des_op(struct ablkcipher_request *req,
 			  struct mv_cesa_op_ctx *tmpl)
 {
+	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 	struct mv_cesa_des_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
 	int ret;
 
@@ -454,7 +454,7 @@ static int mv_cesa_des_op(struct ablkcipher_request *req,
 	if (ret)
 		return ret;
 
-	ret = mv_cesa_queue_req(&req->base);
+	ret = mv_cesa_queue_req(&req->base, &creq->base);
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ablkcipher_cleanup(req);
 
@@ -562,6 +562,7 @@ struct crypto_alg mv_cesa_cbc_des_alg = {
 static int mv_cesa_des3_op(struct ablkcipher_request *req,
 			   struct mv_cesa_op_ctx *tmpl)
 {
+	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 	struct mv_cesa_des3_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
 	int ret;
 
@@ -574,7 +575,7 @@ static int mv_cesa_des3_op(struct ablkcipher_request *req,
 	if (ret)
 		return ret;
 
-	ret = mv_cesa_queue_req(&req->base);
+	ret = mv_cesa_queue_req(&req->base, &creq->base);
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ablkcipher_cleanup(req);
 
@@ -688,6 +689,7 @@ struct crypto_alg mv_cesa_cbc_des3_ede_alg = {
 static int mv_cesa_aes_op(struct ablkcipher_request *req,
 			  struct mv_cesa_op_ctx *tmpl)
 {
+	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 	struct mv_cesa_aes_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
 	int ret, i;
 	u32 *key;
@@ -716,7 +718,7 @@ static int mv_cesa_aes_op(struct ablkcipher_request *req,
 	if (ret)
 		return ret;
 
-	ret = mv_cesa_queue_req(&req->base);
+	ret = mv_cesa_queue_req(&req->base, &creq->base);
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ablkcipher_cleanup(req);
 
diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index 80bddd7..21a4737 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -103,14 +103,14 @@ static inline void mv_cesa_ahash_dma_cleanup(struct ahash_request *req)
 
 	dma_unmap_sg(cesa_dev->dev, req->src, creq->src_nents, DMA_TO_DEVICE);
 	mv_cesa_ahash_dma_free_cache(&creq->req.dma);
-	mv_cesa_dma_cleanup(&creq->req.dma.base);
+	mv_cesa_dma_cleanup(&creq->base);
 }
 
 static inline void mv_cesa_ahash_cleanup(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 
-	if (creq->req.base.type == CESA_DMA_REQ)
+	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ)
 		mv_cesa_ahash_dma_cleanup(req);
 }
 
@@ -118,7 +118,7 @@ static void mv_cesa_ahash_last_cleanup(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 
-	if (creq->req.base.type == CESA_DMA_REQ)
+	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ)
 		mv_cesa_ahash_dma_last_cleanup(req);
 }
 
@@ -157,7 +157,7 @@ static void mv_cesa_ahash_std_step(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 	struct mv_cesa_ahash_std_req *sreq = &creq->req.std;
-	struct mv_cesa_engine *engine = sreq->base.engine;
+	struct mv_cesa_engine *engine = creq->base.engine;
 	struct mv_cesa_op_ctx *op;
 	unsigned int new_cache_ptr = 0;
 	u32 frag_mode;
@@ -256,16 +256,16 @@ static int mv_cesa_ahash_std_process(struct ahash_request *req, u32 status)
 static inline void mv_cesa_ahash_dma_prepare(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
-	struct mv_cesa_tdma_req *dreq = &creq->req.dma.base;
+	struct mv_cesa_req *basereq = &creq->base;
 
-	mv_cesa_dma_prepare(dreq, dreq->base.engine);
+	mv_cesa_dma_prepare(basereq, basereq->engine);
 }
 
 static void mv_cesa_ahash_std_prepare(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 	struct mv_cesa_ahash_std_req *sreq = &creq->req.std;
-	struct mv_cesa_engine *engine = sreq->base.engine;
+	struct mv_cesa_engine *engine = creq->base.engine;
 
 	sreq->offset = 0;
 	mv_cesa_adjust_op(engine, &creq->op_tmpl);
@@ -277,8 +277,8 @@ static void mv_cesa_ahash_step(struct crypto_async_request *req)
 	struct ahash_request *ahashreq = ahash_request_cast(req);
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(ahashreq);
 
-	if (creq->req.base.type == CESA_DMA_REQ)
-		mv_cesa_dma_step(&creq->req.dma.base);
+	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ)
+		mv_cesa_dma_step(&creq->base);
 	else
 		mv_cesa_ahash_std_step(ahashreq);
 }
@@ -287,12 +287,12 @@ static int mv_cesa_ahash_process(struct crypto_async_request *req, u32 status)
 {
 	struct ahash_request *ahashreq = ahash_request_cast(req);
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(ahashreq);
-	struct mv_cesa_engine *engine = creq->req.base.engine;
+	struct mv_cesa_engine *engine = creq->base.engine;
 	unsigned int digsize;
 	int ret, i;
 
-	if (creq->req.base.type == CESA_DMA_REQ)
-		ret = mv_cesa_dma_process(&creq->req.dma.base, status);
+	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ)
+		ret = mv_cesa_dma_process(&creq->base, status);
 	else
 		ret = mv_cesa_ahash_std_process(ahashreq, status);
 
@@ -338,9 +338,9 @@ static void mv_cesa_ahash_prepare(struct crypto_async_request *req,
 	unsigned int digsize;
 	int i;
 
-	creq->req.base.engine = engine;
+	creq->base.engine = engine;
 
-	if (creq->req.base.type == CESA_DMA_REQ)
+	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ)
 		mv_cesa_ahash_dma_prepare(ahashreq);
 	else
 		mv_cesa_ahash_std_prepare(ahashreq);
@@ -555,15 +555,14 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 	gfp_t flags = (req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) ?
 		      GFP_KERNEL : GFP_ATOMIC;
-	struct mv_cesa_ahash_dma_req *ahashdreq = &creq->req.dma;
-	struct mv_cesa_tdma_req *dreq = &ahashdreq->base;
+	struct mv_cesa_req *basereq = &creq->base;
 	struct mv_cesa_ahash_dma_iter iter;
 	struct mv_cesa_op_ctx *op = NULL;
 	unsigned int frag_len;
 	int ret;
 
-	dreq->chain.first = NULL;
-	dreq->chain.last = NULL;
+	basereq->chain.first = NULL;
+	basereq->chain.last = NULL;
 
 	if (creq->src_nents) {
 		ret = dma_map_sg(cesa_dev->dev, req->src, creq->src_nents,
@@ -574,14 +573,14 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 		}
 	}
 
-	mv_cesa_tdma_desc_iter_init(&dreq->chain);
+	mv_cesa_tdma_desc_iter_init(&basereq->chain);
 	mv_cesa_ahash_req_iter_init(&iter, req);
 
 	/*
 	 * Add the cache (left-over data from a previous block) first.
 	 * This will never overflow the SRAM size.
 	 */
-	ret = mv_cesa_ahash_dma_add_cache(&dreq->chain, &iter, creq, flags);
+	ret = mv_cesa_ahash_dma_add_cache(&basereq->chain, &iter, creq, flags);
 	if (ret)
 		goto err_free_tdma;
 
@@ -592,7 +591,7 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 		 * data. We intentionally do not add the final op block.
 		 */
 		while (true) {
-			ret = mv_cesa_dma_add_op_transfers(&dreq->chain,
+			ret = mv_cesa_dma_add_op_transfers(&basereq->chain,
 							   &iter.base,
 							   &iter.src, flags);
 			if (ret)
@@ -603,7 +602,7 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 			if (!mv_cesa_ahash_req_iter_next_op(&iter))
 				break;
 
-			op = mv_cesa_dma_add_frag(&dreq->chain, &creq->op_tmpl,
+			op = mv_cesa_dma_add_frag(&basereq->chain, &creq->op_tmpl,
 						  frag_len, flags);
 			if (IS_ERR(op)) {
 				ret = PTR_ERR(op);
@@ -621,10 +620,10 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 	 * operation, which depends whether this is the final request.
 	 */
 	if (creq->last_req)
-		op = mv_cesa_ahash_dma_last_req(&dreq->chain, &iter, creq,
+		op = mv_cesa_ahash_dma_last_req(&basereq->chain, &iter, creq,
 						frag_len, flags);
 	else if (frag_len)
-		op = mv_cesa_dma_add_frag(&dreq->chain, &creq->op_tmpl,
+		op = mv_cesa_dma_add_frag(&basereq->chain, &creq->op_tmpl,
 					  frag_len, flags);
 
 	if (IS_ERR(op)) {
@@ -634,7 +633,7 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 
 	if (op) {
 		/* Add dummy desc to wait for crypto operation end */
-		ret = mv_cesa_dma_add_dummy_end(&dreq->chain, flags);
+		ret = mv_cesa_dma_add_dummy_end(&basereq->chain, flags);
 		if (ret)
 			goto err_free_tdma;
 	}
@@ -648,7 +647,7 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 	return 0;
 
 err_free_tdma:
-	mv_cesa_dma_cleanup(dreq);
+	mv_cesa_dma_cleanup(basereq);
 	dma_unmap_sg(cesa_dev->dev, req->src, creq->src_nents, DMA_TO_DEVICE);
 
 err:
@@ -662,11 +661,6 @@ static int mv_cesa_ahash_req_init(struct ahash_request *req, bool *cached)
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 	int ret;
 
-	if (cesa_dev->caps->has_tdma)
-		creq->req.base.type = CESA_DMA_REQ;
-	else
-		creq->req.base.type = CESA_STD_REQ;
-
 	creq->src_nents = sg_nents_for_len(req->src, req->nbytes);
 	if (creq->src_nents < 0) {
 		dev_err(cesa_dev->dev, "Invalid number of src SG");
@@ -680,7 +674,7 @@ static int mv_cesa_ahash_req_init(struct ahash_request *req, bool *cached)
 	if (*cached)
 		return 0;
 
-	if (creq->req.base.type == CESA_DMA_REQ)
+	if (cesa_dev->caps->has_tdma)
 		ret = mv_cesa_ahash_dma_req_init(req);
 
 	return ret;
@@ -700,7 +694,7 @@ static int mv_cesa_ahash_update(struct ahash_request *req)
 	if (cached)
 		return 0;
 
-	ret = mv_cesa_queue_req(&req->base);
+	ret = mv_cesa_queue_req(&req->base, &creq->base);
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ahash_cleanup(req);
 
@@ -725,7 +719,7 @@ static int mv_cesa_ahash_final(struct ahash_request *req)
 	if (cached)
 		return 0;
 
-	ret = mv_cesa_queue_req(&req->base);
+	ret = mv_cesa_queue_req(&req->base, &creq->base);
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ahash_cleanup(req);
 
@@ -750,7 +744,7 @@ static int mv_cesa_ahash_finup(struct ahash_request *req)
 	if (cached)
 		return 0;
 
-	ret = mv_cesa_queue_req(&req->base);
+	ret = mv_cesa_queue_req(&req->base, &creq->base);
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ahash_cleanup(req);
 
diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
index 01dda58..9d944ad 100644
--- a/drivers/crypto/marvell/tdma.c
+++ b/drivers/crypto/marvell/tdma.c
@@ -37,9 +37,9 @@ bool mv_cesa_req_dma_iter_next_transfer(struct mv_cesa_dma_iter *iter,
 	return true;
 }
 
-void mv_cesa_dma_step(struct mv_cesa_tdma_req *dreq)
+void mv_cesa_dma_step(struct mv_cesa_req *dreq)
 {
-	struct mv_cesa_engine *engine = dreq->base.engine;
+	struct mv_cesa_engine *engine = dreq->engine;
 
 	writel_relaxed(0, engine->regs + CESA_SA_CFG);
 
@@ -58,7 +58,7 @@ void mv_cesa_dma_step(struct mv_cesa_tdma_req *dreq)
 	writel(CESA_SA_CMD_EN_CESA_SA_ACCL0, engine->regs + CESA_SA_CMD);
 }
 
-void mv_cesa_dma_cleanup(struct mv_cesa_tdma_req *dreq)
+void mv_cesa_dma_cleanup(struct mv_cesa_req *dreq)
 {
 	struct mv_cesa_tdma_desc *tdma;
 
@@ -82,7 +82,7 @@ void mv_cesa_dma_cleanup(struct mv_cesa_tdma_req *dreq)
 	dreq->chain.last = NULL;
 }
 
-void mv_cesa_dma_prepare(struct mv_cesa_tdma_req *dreq,
+void mv_cesa_dma_prepare(struct mv_cesa_req *dreq,
 			 struct mv_cesa_engine *engine)
 {
 	struct mv_cesa_tdma_desc *tdma;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v3 05/10] crypto: marvell: Move tdma chain out of mv_cesa_tdma_req and remove it
@ 2016-06-21  8:08   ` Romain Perier
  0 siblings, 0 replies; 32+ messages in thread
From: Romain Perier @ 2016-06-21  8:08 UTC (permalink / raw)
  To: linux-arm-kernel

Currently, the only way to access the tdma chain is to use the 'req'
union from a mv_cesa_{ablkcipher,ahash}_req. This will soon become a
problem if we want to handle TDMA chaining vs standard/non-DMA
processing in a generic way (with generic functions at the cesa.c level
detecting whether the request should be queued at the DMA level or
not). Hence the decision to move the chain field to the mv_cesa_req
level, at the expense of adding two extra pointer fields to all request
contexts (including non-DMA ones). To limit the overhead, we get rid of
the type field, which can now be deduced from the req->chain.first
value. Once these changes are done, the union is no longer needed, so
remove it and move mv_cesa_ablkcipher_std_req and mv_cesa_req into
mv_cesa_ablkcipher_req directly. There is also no need to keep the
'base' field in the union of mv_cesa_ahash_req, so move it into the
upper structure.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
---

Changes in v2:
  - Reworded the commit log
  - In mv_cesa_ablkcipher_req moved 'base' and 'std' into the upper
    structure. Also removed the union
  - Removed 'base' from mv_cesa_ablkcipher_std_req
  - In mv_cesa_ahash_req moved 'base' into the upper structure
  - Removed 'base' from mv_cesa_ahash_std_req and mv_cesa_ahash_dma_req
  - Cosmetic changes: variable renaming, missing blank lines
  - Replaced the test in mv_cesa_ahash_req_init from
    'mv_cesa_req_get_type == CESA_DMA_REQ' to 'cesa_dev->caps->has_tdma',
    so that mv_cesa_ahash_dma_req_init is actually called.

 drivers/crypto/marvell/cesa.c   |  3 +-
 drivers/crypto/marvell/cesa.h   | 44 ++++++++++-----------------
 drivers/crypto/marvell/cipher.c | 66 +++++++++++++++++++++--------------------
 drivers/crypto/marvell/hash.c   | 64 ++++++++++++++++++---------------------
 drivers/crypto/marvell/tdma.c   |  8 ++---
 5 files changed, 85 insertions(+), 100 deletions(-)

diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
index 93700cd..fe04d1b 100644
--- a/drivers/crypto/marvell/cesa.c
+++ b/drivers/crypto/marvell/cesa.c
@@ -111,7 +111,8 @@ static irqreturn_t mv_cesa_int(int irq, void *priv)
 	return ret;
 }
 
-int mv_cesa_queue_req(struct crypto_async_request *req)
+int mv_cesa_queue_req(struct crypto_async_request *req,
+		      struct mv_cesa_req *creq)
 {
 	int ret;
 	int i;
diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
index 685a627..e67e3f1 100644
--- a/drivers/crypto/marvell/cesa.h
+++ b/drivers/crypto/marvell/cesa.h
@@ -509,21 +509,11 @@ enum mv_cesa_req_type {
 
 /**
  * struct mv_cesa_req - CESA request
- * @type:	request type
  * @engine:	engine associated with this request
+ * @chain:	list of tdma descriptors associated  with this request
  */
 struct mv_cesa_req {
-	enum mv_cesa_req_type type;
 	struct mv_cesa_engine *engine;
-};
-
-/**
- * struct mv_cesa_tdma_req - CESA TDMA request
- * @base:	base information
- * @chain:	TDMA chain
- */
-struct mv_cesa_tdma_req {
-	struct mv_cesa_req base;
 	struct mv_cesa_tdma_chain chain;
 };
 
@@ -540,13 +530,11 @@ struct mv_cesa_sg_std_iter {
 
 /**
  * struct mv_cesa_ablkcipher_std_req - cipher standard request
- * @base:	base information
  * @op:		operation context
  * @offset:	current operation offset
  * @size:	size of the crypto operation
  */
 struct mv_cesa_ablkcipher_std_req {
-	struct mv_cesa_req base;
 	struct mv_cesa_op_ctx op;
 	unsigned int offset;
 	unsigned int size;
@@ -560,34 +548,27 @@ struct mv_cesa_ablkcipher_std_req {
  * @dst_nents:	number of entries in the dest sg list
  */
 struct mv_cesa_ablkcipher_req {
-	union {
-		struct mv_cesa_req base;
-		struct mv_cesa_tdma_req dma;
-		struct mv_cesa_ablkcipher_std_req std;
-	} req;
+	struct mv_cesa_req base;
+	struct mv_cesa_ablkcipher_std_req std;
 	int src_nents;
 	int dst_nents;
 };
 
 /**
  * struct mv_cesa_ahash_std_req - standard hash request
- * @base:	base information
  * @offset:	current operation offset
  */
 struct mv_cesa_ahash_std_req {
-	struct mv_cesa_req base;
 	unsigned int offset;
 };
 
 /**
  * struct mv_cesa_ahash_dma_req - DMA hash request
- * @base:		base information
  * @padding:		padding buffer
  * @padding_dma:	DMA address of the padding buffer
  * @cache_dma:		DMA address of the cache buffer
  */
 struct mv_cesa_ahash_dma_req {
-	struct mv_cesa_tdma_req base;
 	u8 *padding;
 	dma_addr_t padding_dma;
 	u8 *cache;
@@ -606,8 +587,8 @@ struct mv_cesa_ahash_dma_req {
  * @state:		hash state
  */
 struct mv_cesa_ahash_req {
+	struct mv_cesa_req base;
 	union {
-		struct mv_cesa_req base;
 		struct mv_cesa_ahash_dma_req dma;
 		struct mv_cesa_ahash_std_req std;
 	} req;
@@ -625,6 +606,12 @@ struct mv_cesa_ahash_req {
 
 extern struct mv_cesa_dev *cesa_dev;
 
+static inline enum mv_cesa_req_type
+mv_cesa_req_get_type(struct mv_cesa_req *req)
+{
+	return req->chain.first ? CESA_DMA_REQ : CESA_STD_REQ;
+}
+
 static inline void mv_cesa_update_op_cfg(struct mv_cesa_op_ctx *op,
 					 u32 cfg, u32 mask)
 {
@@ -697,7 +684,8 @@ static inline bool mv_cesa_mac_op_is_first_frag(const struct mv_cesa_op_ctx *op)
 		CESA_SA_DESC_CFG_FIRST_FRAG;
 }
 
-int mv_cesa_queue_req(struct crypto_async_request *req);
+int mv_cesa_queue_req(struct crypto_async_request *req,
+		      struct mv_cesa_req *creq);
 
 /*
  * Helper function that indicates whether a crypto request needs to be
@@ -767,9 +755,9 @@ static inline bool mv_cesa_req_dma_iter_next_op(struct mv_cesa_dma_iter *iter)
 	return iter->op_len;
 }
 
-void mv_cesa_dma_step(struct mv_cesa_tdma_req *dreq);
+void mv_cesa_dma_step(struct mv_cesa_req *dreq);
 
-static inline int mv_cesa_dma_process(struct mv_cesa_tdma_req *dreq,
+static inline int mv_cesa_dma_process(struct mv_cesa_req *dreq,
 				      u32 status)
 {
 	if (!(status & CESA_SA_INT_ACC0_IDMA_DONE))
@@ -781,10 +769,10 @@ static inline int mv_cesa_dma_process(struct mv_cesa_tdma_req *dreq,
 	return 0;
 }
 
-void mv_cesa_dma_prepare(struct mv_cesa_tdma_req *dreq,
+void mv_cesa_dma_prepare(struct mv_cesa_req *dreq,
 			 struct mv_cesa_engine *engine);
+void mv_cesa_dma_cleanup(struct mv_cesa_req *dreq);
 
-void mv_cesa_dma_cleanup(struct mv_cesa_tdma_req *dreq);
 
 static inline void
 mv_cesa_tdma_desc_iter_init(struct mv_cesa_tdma_chain *chain)
diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index ded5feb..ffe0f4a 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -70,22 +70,22 @@ mv_cesa_ablkcipher_dma_cleanup(struct ablkcipher_request *req)
 		dma_unmap_sg(cesa_dev->dev, req->src, creq->src_nents,
 			     DMA_BIDIRECTIONAL);
 	}
-	mv_cesa_dma_cleanup(&creq->req.dma);
+	mv_cesa_dma_cleanup(&creq->base);
 }
 
 static inline void mv_cesa_ablkcipher_cleanup(struct ablkcipher_request *req)
 {
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 
-	if (creq->req.base.type == CESA_DMA_REQ)
+	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ)
 		mv_cesa_ablkcipher_dma_cleanup(req);
 }
 
 static void mv_cesa_ablkcipher_std_step(struct ablkcipher_request *req)
 {
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
-	struct mv_cesa_ablkcipher_std_req *sreq = &creq->req.std;
-	struct mv_cesa_engine *engine = sreq->base.engine;
+	struct mv_cesa_ablkcipher_std_req *sreq = &creq->std;
+	struct mv_cesa_engine *engine = creq->base.engine;
 	size_t  len = min_t(size_t, req->nbytes - sreq->offset,
 			    CESA_SA_SRAM_PAYLOAD_SIZE);
 
@@ -115,8 +115,8 @@ static int mv_cesa_ablkcipher_std_process(struct ablkcipher_request *req,
 					  u32 status)
 {
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
-	struct mv_cesa_ablkcipher_std_req *sreq = &creq->req.std;
-	struct mv_cesa_engine *engine = sreq->base.engine;
+	struct mv_cesa_ablkcipher_std_req *sreq = &creq->std;
+	struct mv_cesa_engine *engine = creq->base.engine;
 	size_t len;
 	unsigned int ivsize;
 
@@ -140,21 +140,19 @@ static int mv_cesa_ablkcipher_process(struct crypto_async_request *req,
 {
 	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
-	struct mv_cesa_tdma_req *dreq;
+	struct mv_cesa_req *basereq = &creq->base;
 	unsigned int ivsize;
 	int ret;
 
-	if (creq->req.base.type == CESA_STD_REQ)
+	if (mv_cesa_req_get_type(basereq) == CESA_STD_REQ)
 		return mv_cesa_ablkcipher_std_process(ablkreq, status);
 
-	ret = mv_cesa_dma_process(&creq->req.dma, status);
+	ret = mv_cesa_dma_process(basereq, status);
 	if (ret)
 		return ret;
 
-	dreq = &creq->req.dma;
-	ivsize =
-	crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(ablkreq));
-	memcpy_fromio(ablkreq->info, dreq->chain.last->data, ivsize);
+	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(ablkreq));
+	memcpy_fromio(ablkreq->info, basereq->chain.last->data, ivsize);
 
 	return 0;
 }
@@ -164,8 +162,8 @@ static void mv_cesa_ablkcipher_step(struct crypto_async_request *req)
 	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
 
-	if (creq->req.base.type == CESA_DMA_REQ)
-		mv_cesa_dma_step(&creq->req.dma);
+	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ)
+		mv_cesa_dma_step(&creq->base);
 	else
 		mv_cesa_ablkcipher_std_step(ablkreq);
 }
@@ -174,17 +172,17 @@ static inline void
 mv_cesa_ablkcipher_dma_prepare(struct ablkcipher_request *req)
 {
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
-	struct mv_cesa_tdma_req *dreq = &creq->req.dma;
+	struct mv_cesa_req *basereq = &creq->base;
 
-	mv_cesa_dma_prepare(dreq, dreq->base.engine);
+	mv_cesa_dma_prepare(basereq, basereq->engine);
 }
 
 static inline void
 mv_cesa_ablkcipher_std_prepare(struct ablkcipher_request *req)
 {
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
-	struct mv_cesa_ablkcipher_std_req *sreq = &creq->req.std;
-	struct mv_cesa_engine *engine = sreq->base.engine;
+	struct mv_cesa_ablkcipher_std_req *sreq = &creq->std;
+	struct mv_cesa_engine *engine = creq->base.engine;
 
 	sreq->size = 0;
 	sreq->offset = 0;
@@ -197,9 +195,9 @@ static inline void mv_cesa_ablkcipher_prepare(struct crypto_async_request *req,
 {
 	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
-	creq->req.base.engine = engine;
+	creq->base.engine = engine;
 
-	if (creq->req.base.type == CESA_DMA_REQ)
+	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ)
 		mv_cesa_ablkcipher_dma_prepare(ablkreq);
 	else
 		mv_cesa_ablkcipher_std_prepare(ablkreq);
@@ -302,16 +300,15 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 	gfp_t flags = (req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) ?
 		      GFP_KERNEL : GFP_ATOMIC;
-	struct mv_cesa_tdma_req *dreq = &creq->req.dma;
+	struct mv_cesa_req *basereq = &creq->base;
 	struct mv_cesa_ablkcipher_dma_iter iter;
 	struct mv_cesa_tdma_chain chain;
 	bool skip_ctx = false;
 	int ret;
 	unsigned int ivsize;
 
-	dreq->base.type = CESA_DMA_REQ;
-	dreq->chain.first = NULL;
-	dreq->chain.last = NULL;
+	basereq->chain.first = NULL;
+	basereq->chain.last = NULL;
 
 	if (req->src != req->dst) {
 		ret = dma_map_sg(cesa_dev->dev, req->src, creq->src_nents,
@@ -374,12 +371,12 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
 	if (ret)
 		goto err_free_tdma;
 
-	dreq->chain = chain;
+	basereq->chain = chain;
 
 	return 0;
 
 err_free_tdma:
-	mv_cesa_dma_cleanup(dreq);
+	mv_cesa_dma_cleanup(basereq);
 	if (req->dst != req->src)
 		dma_unmap_sg(cesa_dev->dev, req->dst, creq->dst_nents,
 			     DMA_FROM_DEVICE);
@@ -396,11 +393,13 @@ mv_cesa_ablkcipher_std_req_init(struct ablkcipher_request *req,
 				const struct mv_cesa_op_ctx *op_templ)
 {
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
-	struct mv_cesa_ablkcipher_std_req *sreq = &creq->req.std;
+	struct mv_cesa_ablkcipher_std_req *sreq = &creq->std;
+	struct mv_cesa_req *basereq = &creq->base;
 
-	sreq->base.type = CESA_STD_REQ;
 	sreq->op = *op_templ;
 	sreq->skip_ctx = false;
+	basereq->chain.first = NULL;
+	basereq->chain.last = NULL;
 
 	return 0;
 }
@@ -442,6 +441,7 @@ static int mv_cesa_ablkcipher_req_init(struct ablkcipher_request *req,
 static int mv_cesa_des_op(struct ablkcipher_request *req,
 			  struct mv_cesa_op_ctx *tmpl)
 {
+	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 	struct mv_cesa_des_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
 	int ret;
 
@@ -454,7 +454,7 @@ static int mv_cesa_des_op(struct ablkcipher_request *req,
 	if (ret)
 		return ret;
 
-	ret = mv_cesa_queue_req(&req->base);
+	ret = mv_cesa_queue_req(&req->base, &creq->base);
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ablkcipher_cleanup(req);
 
@@ -562,6 +562,7 @@ struct crypto_alg mv_cesa_cbc_des_alg = {
 static int mv_cesa_des3_op(struct ablkcipher_request *req,
 			   struct mv_cesa_op_ctx *tmpl)
 {
+	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 	struct mv_cesa_des3_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
 	int ret;
 
@@ -574,7 +575,7 @@ static int mv_cesa_des3_op(struct ablkcipher_request *req,
 	if (ret)
 		return ret;
 
-	ret = mv_cesa_queue_req(&req->base);
+	ret = mv_cesa_queue_req(&req->base, &creq->base);
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ablkcipher_cleanup(req);
 
@@ -688,6 +689,7 @@ struct crypto_alg mv_cesa_cbc_des3_ede_alg = {
 static int mv_cesa_aes_op(struct ablkcipher_request *req,
 			  struct mv_cesa_op_ctx *tmpl)
 {
+	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 	struct mv_cesa_aes_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
 	int ret, i;
 	u32 *key;
@@ -716,7 +718,7 @@ static int mv_cesa_aes_op(struct ablkcipher_request *req,
 	if (ret)
 		return ret;
 
-	ret = mv_cesa_queue_req(&req->base);
+	ret = mv_cesa_queue_req(&req->base, &creq->base);
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ablkcipher_cleanup(req);
 
diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index 80bddd7..21a4737 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -103,14 +103,14 @@ static inline void mv_cesa_ahash_dma_cleanup(struct ahash_request *req)
 
 	dma_unmap_sg(cesa_dev->dev, req->src, creq->src_nents, DMA_TO_DEVICE);
 	mv_cesa_ahash_dma_free_cache(&creq->req.dma);
-	mv_cesa_dma_cleanup(&creq->req.dma.base);
+	mv_cesa_dma_cleanup(&creq->base);
 }
 
 static inline void mv_cesa_ahash_cleanup(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 
-	if (creq->req.base.type == CESA_DMA_REQ)
+	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ)
 		mv_cesa_ahash_dma_cleanup(req);
 }
 
@@ -118,7 +118,7 @@ static void mv_cesa_ahash_last_cleanup(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 
-	if (creq->req.base.type == CESA_DMA_REQ)
+	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ)
 		mv_cesa_ahash_dma_last_cleanup(req);
 }
 
@@ -157,7 +157,7 @@ static void mv_cesa_ahash_std_step(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 	struct mv_cesa_ahash_std_req *sreq = &creq->req.std;
-	struct mv_cesa_engine *engine = sreq->base.engine;
+	struct mv_cesa_engine *engine = creq->base.engine;
 	struct mv_cesa_op_ctx *op;
 	unsigned int new_cache_ptr = 0;
 	u32 frag_mode;
@@ -256,16 +256,16 @@ static int mv_cesa_ahash_std_process(struct ahash_request *req, u32 status)
 static inline void mv_cesa_ahash_dma_prepare(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
-	struct mv_cesa_tdma_req *dreq = &creq->req.dma.base;
+	struct mv_cesa_req *basereq = &creq->base;
 
-	mv_cesa_dma_prepare(dreq, dreq->base.engine);
+	mv_cesa_dma_prepare(basereq, basereq->engine);
 }
 
 static void mv_cesa_ahash_std_prepare(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 	struct mv_cesa_ahash_std_req *sreq = &creq->req.std;
-	struct mv_cesa_engine *engine = sreq->base.engine;
+	struct mv_cesa_engine *engine = creq->base.engine;
 
 	sreq->offset = 0;
 	mv_cesa_adjust_op(engine, &creq->op_tmpl);
@@ -277,8 +277,8 @@ static void mv_cesa_ahash_step(struct crypto_async_request *req)
 	struct ahash_request *ahashreq = ahash_request_cast(req);
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(ahashreq);
 
-	if (creq->req.base.type == CESA_DMA_REQ)
-		mv_cesa_dma_step(&creq->req.dma.base);
+	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ)
+		mv_cesa_dma_step(&creq->base);
 	else
 		mv_cesa_ahash_std_step(ahashreq);
 }
@@ -287,12 +287,12 @@ static int mv_cesa_ahash_process(struct crypto_async_request *req, u32 status)
 {
 	struct ahash_request *ahashreq = ahash_request_cast(req);
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(ahashreq);
-	struct mv_cesa_engine *engine = creq->req.base.engine;
+	struct mv_cesa_engine *engine = creq->base.engine;
 	unsigned int digsize;
 	int ret, i;
 
-	if (creq->req.base.type == CESA_DMA_REQ)
-		ret = mv_cesa_dma_process(&creq->req.dma.base, status);
+	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ)
+		ret = mv_cesa_dma_process(&creq->base, status);
 	else
 		ret = mv_cesa_ahash_std_process(ahashreq, status);
 
@@ -338,9 +338,9 @@ static void mv_cesa_ahash_prepare(struct crypto_async_request *req,
 	unsigned int digsize;
 	int i;
 
-	creq->req.base.engine = engine;
+	creq->base.engine = engine;
 
-	if (creq->req.base.type == CESA_DMA_REQ)
+	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ)
 		mv_cesa_ahash_dma_prepare(ahashreq);
 	else
 		mv_cesa_ahash_std_prepare(ahashreq);
@@ -555,15 +555,14 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 	gfp_t flags = (req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) ?
 		      GFP_KERNEL : GFP_ATOMIC;
-	struct mv_cesa_ahash_dma_req *ahashdreq = &creq->req.dma;
-	struct mv_cesa_tdma_req *dreq = &ahashdreq->base;
+	struct mv_cesa_req *basereq = &creq->base;
 	struct mv_cesa_ahash_dma_iter iter;
 	struct mv_cesa_op_ctx *op = NULL;
 	unsigned int frag_len;
 	int ret;
 
-	dreq->chain.first = NULL;
-	dreq->chain.last = NULL;
+	basereq->chain.first = NULL;
+	basereq->chain.last = NULL;
 
 	if (creq->src_nents) {
 		ret = dma_map_sg(cesa_dev->dev, req->src, creq->src_nents,
@@ -574,14 +573,14 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 		}
 	}
 
-	mv_cesa_tdma_desc_iter_init(&dreq->chain);
+	mv_cesa_tdma_desc_iter_init(&basereq->chain);
 	mv_cesa_ahash_req_iter_init(&iter, req);
 
 	/*
 	 * Add the cache (left-over data from a previous block) first.
 	 * This will never overflow the SRAM size.
 	 */
-	ret = mv_cesa_ahash_dma_add_cache(&dreq->chain, &iter, creq, flags);
+	ret = mv_cesa_ahash_dma_add_cache(&basereq->chain, &iter, creq, flags);
 	if (ret)
 		goto err_free_tdma;
 
@@ -592,7 +591,7 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 		 * data. We intentionally do not add the final op block.
 		 */
 		while (true) {
-			ret = mv_cesa_dma_add_op_transfers(&dreq->chain,
+			ret = mv_cesa_dma_add_op_transfers(&basereq->chain,
 							   &iter.base,
 							   &iter.src, flags);
 			if (ret)
@@ -603,7 +602,7 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 			if (!mv_cesa_ahash_req_iter_next_op(&iter))
 				break;
 
-			op = mv_cesa_dma_add_frag(&dreq->chain, &creq->op_tmpl,
+			op = mv_cesa_dma_add_frag(&basereq->chain, &creq->op_tmpl,
 						  frag_len, flags);
 			if (IS_ERR(op)) {
 				ret = PTR_ERR(op);
@@ -621,10 +620,10 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 	 * operation, which depends whether this is the final request.
 	 */
 	if (creq->last_req)
-		op = mv_cesa_ahash_dma_last_req(&dreq->chain, &iter, creq,
+		op = mv_cesa_ahash_dma_last_req(&basereq->chain, &iter, creq,
 						frag_len, flags);
 	else if (frag_len)
-		op = mv_cesa_dma_add_frag(&dreq->chain, &creq->op_tmpl,
+		op = mv_cesa_dma_add_frag(&basereq->chain, &creq->op_tmpl,
 					  frag_len, flags);
 
 	if (IS_ERR(op)) {
@@ -634,7 +633,7 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 
 	if (op) {
 		/* Add dummy desc to wait for crypto operation end */
-		ret = mv_cesa_dma_add_dummy_end(&dreq->chain, flags);
+		ret = mv_cesa_dma_add_dummy_end(&basereq->chain, flags);
 		if (ret)
 			goto err_free_tdma;
 	}
@@ -648,7 +647,7 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 	return 0;
 
 err_free_tdma:
-	mv_cesa_dma_cleanup(dreq);
+	mv_cesa_dma_cleanup(basereq);
 	dma_unmap_sg(cesa_dev->dev, req->src, creq->src_nents, DMA_TO_DEVICE);
 
 err:
@@ -662,11 +661,6 @@ static int mv_cesa_ahash_req_init(struct ahash_request *req, bool *cached)
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 	int ret;
 
-	if (cesa_dev->caps->has_tdma)
-		creq->req.base.type = CESA_DMA_REQ;
-	else
-		creq->req.base.type = CESA_STD_REQ;
-
 	creq->src_nents = sg_nents_for_len(req->src, req->nbytes);
 	if (creq->src_nents < 0) {
 		dev_err(cesa_dev->dev, "Invalid number of src SG");
@@ -680,7 +674,7 @@ static int mv_cesa_ahash_req_init(struct ahash_request *req, bool *cached)
 	if (*cached)
 		return 0;
 
-	if (creq->req.base.type == CESA_DMA_REQ)
+	if (cesa_dev->caps->has_tdma)
 		ret = mv_cesa_ahash_dma_req_init(req);
 
 	return ret;
@@ -700,7 +694,7 @@ static int mv_cesa_ahash_update(struct ahash_request *req)
 	if (cached)
 		return 0;
 
-	ret = mv_cesa_queue_req(&req->base);
+	ret = mv_cesa_queue_req(&req->base, &creq->base);
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ahash_cleanup(req);
 
@@ -725,7 +719,7 @@ static int mv_cesa_ahash_final(struct ahash_request *req)
 	if (cached)
 		return 0;
 
-	ret = mv_cesa_queue_req(&req->base);
+	ret = mv_cesa_queue_req(&req->base, &creq->base);
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ahash_cleanup(req);
 
@@ -750,7 +744,7 @@ static int mv_cesa_ahash_finup(struct ahash_request *req)
 	if (cached)
 		return 0;
 
-	ret = mv_cesa_queue_req(&req->base);
+	ret = mv_cesa_queue_req(&req->base, &creq->base);
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ahash_cleanup(req);
 
diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
index 01dda58..9d944ad 100644
--- a/drivers/crypto/marvell/tdma.c
+++ b/drivers/crypto/marvell/tdma.c
@@ -37,9 +37,9 @@ bool mv_cesa_req_dma_iter_next_transfer(struct mv_cesa_dma_iter *iter,
 	return true;
 }
 
-void mv_cesa_dma_step(struct mv_cesa_tdma_req *dreq)
+void mv_cesa_dma_step(struct mv_cesa_req *dreq)
 {
-	struct mv_cesa_engine *engine = dreq->base.engine;
+	struct mv_cesa_engine *engine = dreq->engine;
 
 	writel_relaxed(0, engine->regs + CESA_SA_CFG);
 
@@ -58,7 +58,7 @@ void mv_cesa_dma_step(struct mv_cesa_tdma_req *dreq)
 	writel(CESA_SA_CMD_EN_CESA_SA_ACCL0, engine->regs + CESA_SA_CMD);
 }
 
-void mv_cesa_dma_cleanup(struct mv_cesa_tdma_req *dreq)
+void mv_cesa_dma_cleanup(struct mv_cesa_req *dreq)
 {
 	struct mv_cesa_tdma_desc *tdma;
 
@@ -82,7 +82,7 @@ void mv_cesa_dma_cleanup(struct mv_cesa_tdma_req *dreq)
 	dreq->chain.last = NULL;
 }
 
-void mv_cesa_dma_prepare(struct mv_cesa_tdma_req *dreq,
+void mv_cesa_dma_prepare(struct mv_cesa_req *dreq,
 			 struct mv_cesa_engine *engine)
 {
 	struct mv_cesa_tdma_desc *tdma;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v3 06/10] crypto: marvell: Add a complete operation for async requests
  2016-06-21  8:08 ` Romain Perier
@ 2016-06-21  8:08   ` Romain Perier
  -1 siblings, 0 replies; 32+ messages in thread
From: Romain Perier @ 2016-06-21  8:08 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: Thomas Petazzoni, Russell King, linux-crypto, Gregory Clement,
	David S. Miller, linux-arm-kernel

So far, the 'process' operation was used both to check whether the
current request had been correctly handled by the engine and, if so, to
copy information from the SRAM back to main memory. This commit splits
that operation in two: 'process' keeps checking whether the request was
correctly handled by the engine, while a new 'complete' operation copies
the content of the SRAM to memory. This will soon become useful when we
want to call the process and complete operations from different
locations depending on the type of the request (different cleanup
logic).
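
As a rough sketch of the resulting flow (abridged from the hunks below;
mv_cesa_finish_req is a made-up name used only for illustration, and
locking/BH handling is omitted):

	struct mv_cesa_req_ops {
		void (*prepare)(struct crypto_async_request *req,
				struct mv_cesa_engine *engine);
		int (*process)(struct crypto_async_request *req, u32 status);
		void (*step)(struct crypto_async_request *req);
		void (*cleanup)(struct crypto_async_request *req);
		void (*complete)(struct crypto_async_request *req);	/* new */
	};

	/* Once 'process' reports that the request is done: */
	static void mv_cesa_finish_req(struct mv_cesa_ctx *ctx,
				       struct crypto_async_request *req, int res)
	{
		ctx->ops->complete(req);	/* copy result/IV/digest out of SRAM */
		ctx->ops->cleanup(req);		/* release DMA descriptors and mappings */
		req->complete(req, res);	/* notify the crypto API caller */
	}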

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
---

Changes in v3:
  - Fixed comment for the "complete" field in mv_cesa_req_ops.

Changes in v2:

  - Removed useless cosmetic change added for checkpatch (which
    had nothing to do with the patch itself)
  - Removed duplicated initialization of 'ivsize' in
    mv_cesa_ablkcipher_complete
  - Replaced memcpy_fromio with memcpy in mv_cesa_ablkcipher_complete

 drivers/crypto/marvell/cesa.c   |  1 +
 drivers/crypto/marvell/cesa.h   |  3 +++
 drivers/crypto/marvell/cipher.c | 28 +++++++++++++++++++++++-----
 drivers/crypto/marvell/hash.c   | 22 ++++++++++++----------
 4 files changed, 39 insertions(+), 15 deletions(-)

diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
index fe04d1b..af96426 100644
--- a/drivers/crypto/marvell/cesa.c
+++ b/drivers/crypto/marvell/cesa.c
@@ -98,6 +98,7 @@ static irqreturn_t mv_cesa_int(int irq, void *priv)
 				engine->req = NULL;
 				mv_cesa_dequeue_req_unlocked(engine);
 				spin_unlock_bh(&engine->lock);
+				ctx->ops->complete(req);
 				ctx->ops->cleanup(req);
 				local_bh_disable();
 				req->complete(req, res);
diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
index e67e3f1..d749335 100644
--- a/drivers/crypto/marvell/cesa.h
+++ b/drivers/crypto/marvell/cesa.h
@@ -456,6 +456,8 @@ struct mv_cesa_engine {
  *		code)
  * @step:	launch the crypto operation on the next chunk
  * @cleanup:	cleanup the crypto request (release associated data)
+ * @complete:	complete the request, i.e copy result or context from sram when
+ * 		needed.
  */
 struct mv_cesa_req_ops {
 	void (*prepare)(struct crypto_async_request *req,
@@ -463,6 +465,7 @@ struct mv_cesa_req_ops {
 	int (*process)(struct crypto_async_request *req, u32 status);
 	void (*step)(struct crypto_async_request *req);
 	void (*cleanup)(struct crypto_async_request *req);
+	void (*complete)(struct crypto_async_request *req);
 };
 
 /**
diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index ffe0f4a..175ce76 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -118,7 +118,6 @@ static int mv_cesa_ablkcipher_std_process(struct ablkcipher_request *req,
 	struct mv_cesa_ablkcipher_std_req *sreq = &creq->std;
 	struct mv_cesa_engine *engine = creq->base.engine;
 	size_t len;
-	unsigned int ivsize;
 
 	len = sg_pcopy_from_buffer(req->dst, creq->dst_nents,
 				   engine->sram + CESA_SA_DATA_SRAM_OFFSET,
@@ -128,10 +127,6 @@ static int mv_cesa_ablkcipher_std_process(struct ablkcipher_request *req,
 	if (sreq->offset < req->nbytes)
 		return -EINPROGRESS;
 
-	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(req));
-	memcpy_fromio(req->info,
-		      engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET, ivsize);
-
 	return 0;
 }
 
@@ -211,11 +206,34 @@ mv_cesa_ablkcipher_req_cleanup(struct crypto_async_request *req)
 	mv_cesa_ablkcipher_cleanup(ablkreq);
 }
 
+static void
+mv_cesa_ablkcipher_complete(struct crypto_async_request *req)
+{
+	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
+	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
+	struct mv_cesa_engine *engine = creq->base.engine;
+	unsigned int ivsize;
+
+	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(ablkreq));
+
+	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ) {
+		struct mv_cesa_req *basereq;
+
+		basereq = &creq->base;
+		memcpy(ablkreq->info, basereq->chain.last->data, ivsize);
+	} else {
+		memcpy_fromio(ablkreq->info,
+			      engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET,
+			      ivsize);
+	}
+}
+
 static const struct mv_cesa_req_ops mv_cesa_ablkcipher_req_ops = {
 	.step = mv_cesa_ablkcipher_step,
 	.process = mv_cesa_ablkcipher_process,
 	.prepare = mv_cesa_ablkcipher_prepare,
 	.cleanup = mv_cesa_ablkcipher_req_cleanup,
+	.complete = mv_cesa_ablkcipher_complete,
 };
 
 static int mv_cesa_ablkcipher_cra_init(struct crypto_tfm *tfm)
diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index 21a4737..09665a7 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -287,17 +287,20 @@ static int mv_cesa_ahash_process(struct crypto_async_request *req, u32 status)
 {
 	struct ahash_request *ahashreq = ahash_request_cast(req);
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(ahashreq);
-	struct mv_cesa_engine *engine = creq->base.engine;
-	unsigned int digsize;
-	int ret, i;
 
 	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ)
-		ret = mv_cesa_dma_process(&creq->base, status);
-	else
-		ret = mv_cesa_ahash_std_process(ahashreq, status);
+		return mv_cesa_dma_process(&creq->base, status);
 
-	if (ret == -EINPROGRESS)
-		return ret;
+	return mv_cesa_ahash_std_process(ahashreq, status);
+}
+
+static void mv_cesa_ahash_complete(struct crypto_async_request *req)
+{
+	struct ahash_request *ahashreq = ahash_request_cast(req);
+	struct mv_cesa_ahash_req *creq = ahash_request_ctx(ahashreq);
+	struct mv_cesa_engine *engine = creq->base.engine;
+	unsigned int digsize;
+	int i;
 
 	digsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(ahashreq));
 	for (i = 0; i < digsize / 4; i++)
@@ -326,8 +329,6 @@ static int mv_cesa_ahash_process(struct crypto_async_request *req, u32 status)
 				result[i] = cpu_to_be32(creq->state[i]);
 		}
 	}
-
-	return ret;
 }
 
 static void mv_cesa_ahash_prepare(struct crypto_async_request *req,
@@ -366,6 +367,7 @@ static const struct mv_cesa_req_ops mv_cesa_ahash_req_ops = {
 	.process = mv_cesa_ahash_process,
 	.prepare = mv_cesa_ahash_prepare,
 	.cleanup = mv_cesa_ahash_req_cleanup,
+	.complete = mv_cesa_ahash_complete,
 };
 
 static int mv_cesa_ahash_init(struct ahash_request *req,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v3 07/10] crypto: marvell: Move SRAM I/O operations to step functions
  2016-06-21  8:08 ` Romain Perier
@ 2016-06-21  8:08   ` Romain Perier
  -1 siblings, 0 replies; 32+ messages in thread
From: Romain Perier @ 2016-06-21  8:08 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: Thomas Petazzoni, Russell King, linux-crypto, Gregory Clement,
	David S. Miller, linux-arm-kernel

Currently, crypto requests are sent to the engines sequentially. This
commit moves the SRAM I/O operations from the prepare functions to the
step functions. It provides flexibility for future work and allows a
request to be prepared while the engine is running.
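
Abridged sketch of the resulting step function for the standard
(non-DMA) cipher path, matching the cipher.c hunk below:

	static void mv_cesa_ablkcipher_std_step(struct ablkcipher_request *req)
	{
		struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
		struct mv_cesa_ablkcipher_std_req *sreq = &creq->std;
		struct mv_cesa_engine *engine = creq->base.engine;

		/* Moved here from mv_cesa_ablkcipher_std_prepare(): program the
		 * operation descriptor into SRAM right before starting the
		 * engine, so prepare() no longer touches the engine's SRAM.
		 */
		mv_cesa_adjust_op(engine, &sreq->op);
		memcpy_toio(engine->sram, &sreq->op, sizeof(sreq->op));

		/* ... copy the next data chunk into SRAM and kick the engine,
		 * unchanged from the current code ...
		 */
	}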

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
---
 drivers/crypto/marvell/cipher.c |  6 +++---
 drivers/crypto/marvell/hash.c   | 18 +++++++++---------
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index 175ce76..79d4175 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -89,6 +89,9 @@ static void mv_cesa_ablkcipher_std_step(struct ablkcipher_request *req)
 	size_t  len = min_t(size_t, req->nbytes - sreq->offset,
 			    CESA_SA_SRAM_PAYLOAD_SIZE);
 
+	mv_cesa_adjust_op(engine, &sreq->op);
+	memcpy_toio(engine->sram, &sreq->op, sizeof(sreq->op));
+
 	len = sg_pcopy_to_buffer(req->src, creq->src_nents,
 				 engine->sram + CESA_SA_DATA_SRAM_OFFSET,
 				 len, sreq->offset);
@@ -177,12 +180,9 @@ mv_cesa_ablkcipher_std_prepare(struct ablkcipher_request *req)
 {
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 	struct mv_cesa_ablkcipher_std_req *sreq = &creq->std;
-	struct mv_cesa_engine *engine = creq->base.engine;
 
 	sreq->size = 0;
 	sreq->offset = 0;
-	mv_cesa_adjust_op(engine, &sreq->op);
-	memcpy_toio(engine->sram, &sreq->op, sizeof(sreq->op));
 }
 
 static inline void mv_cesa_ablkcipher_prepare(struct crypto_async_request *req,
diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index 09665a7..e1f8acd 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -162,6 +162,15 @@ static void mv_cesa_ahash_std_step(struct ahash_request *req)
 	unsigned int new_cache_ptr = 0;
 	u32 frag_mode;
 	size_t  len;
+	unsigned int digsize;
+	int i;
+
+	mv_cesa_adjust_op(engine, &creq->op_tmpl);
+	memcpy_toio(engine->sram, &creq->op_tmpl, sizeof(creq->op_tmpl));
+
+	digsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(req));
+	for (i = 0; i < digsize / 4; i++)
+		writel_relaxed(creq->state[i], engine->regs + CESA_IVDIG(i));
 
 	if (creq->cache_ptr)
 		memcpy_toio(engine->sram + CESA_SA_DATA_SRAM_OFFSET,
@@ -265,11 +274,8 @@ static void mv_cesa_ahash_std_prepare(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 	struct mv_cesa_ahash_std_req *sreq = &creq->req.std;
-	struct mv_cesa_engine *engine = creq->base.engine;
 
 	sreq->offset = 0;
-	mv_cesa_adjust_op(engine, &creq->op_tmpl);
-	memcpy_toio(engine->sram, &creq->op_tmpl, sizeof(creq->op_tmpl));
 }
 
 static void mv_cesa_ahash_step(struct crypto_async_request *req)
@@ -336,8 +342,6 @@ static void mv_cesa_ahash_prepare(struct crypto_async_request *req,
 {
 	struct ahash_request *ahashreq = ahash_request_cast(req);
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(ahashreq);
-	unsigned int digsize;
-	int i;
 
 	creq->base.engine = engine;
 
@@ -345,10 +349,6 @@ static void mv_cesa_ahash_prepare(struct crypto_async_request *req,
 		mv_cesa_ahash_dma_prepare(ahashreq);
 	else
 		mv_cesa_ahash_std_prepare(ahashreq);
-
-	digsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(ahashreq));
-	for (i = 0; i < digsize / 4; i++)
-		writel_relaxed(creq->state[i], engine->regs + CESA_IVDIG(i));
 }
 
 static void mv_cesa_ahash_req_cleanup(struct crypto_async_request *req)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v3 08/10] crypto: marvell: Add load balancing between engines
  2016-06-21  8:08 ` Romain Perier
@ 2016-06-21  8:08   ` Romain Perier
  -1 siblings, 0 replies; 32+ messages in thread
From: Romain Perier @ 2016-06-21  8:08 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: Thomas Petazzoni, Russell King, linux-crypto, Gregory Clement,
	David S. Miller, linux-arm-kernel

This commit adds support for fine-grained load balancing on
multi-engine IPs. The engine is pre-selected based on its current load
and on the weight of the crypto request that is about to be processed.
The global crypto queue is also replaced by a per-engine queue. These
changes are required to allow chaining crypto requests at the DMA level:
with one crypto queue per engine, the state of the tdma chain stays
synchronized with that queue. They also reduce contention on
'cesa_dev->lock' and improve parallelism.
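
The selection logic boils down to picking the least loaded engine and
accounting for the weight of the request being queued, as in the
mv_cesa_select_engine() helper added below (sketch):

	static inline struct mv_cesa_engine *mv_cesa_select_engine(int weight)
	{
		struct mv_cesa_engine *selected = NULL;
		u32 min_load = U32_MAX;
		int i;

		for (i = 0; i < cesa_dev->caps->nengines; i++) {
			struct mv_cesa_engine *engine = cesa_dev->engines + i;
			u32 load = atomic_read(&engine->load);

			if (load < min_load) {
				min_load = load;
				selected = engine;
			}
		}

		atomic_add(weight, &selected->load);

		return selected;
	}

Callers pass req->nbytes as the weight, and the per-request complete()
hooks subtract it back with atomic_sub() once the request is done.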

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
---

Changes in v3:

  - Renamed mv_cesa_dequeue_req_unlocked => mv_cesa_dequeue_req_locked

Changes in v2:

  - Reworded the commit message
  - Moved the code about SRAM I/O operations from this commit to
    a separate commit (see PATCH 07/10).

 drivers/crypto/marvell/cesa.c   | 34 ++++++++++++------------
 drivers/crypto/marvell/cesa.h   | 29 +++++++++++++++++----
 drivers/crypto/marvell/cipher.c | 57 ++++++++++++++++++-----------------------
 drivers/crypto/marvell/hash.c   | 50 ++++++++++++++----------------------
 4 files changed, 84 insertions(+), 86 deletions(-)

diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
index af96426..c0497ac 100644
--- a/drivers/crypto/marvell/cesa.c
+++ b/drivers/crypto/marvell/cesa.c
@@ -40,16 +40,14 @@ MODULE_PARM_DESC(allhwsupport, "Enable support for all hardware (even it if over
 
 struct mv_cesa_dev *cesa_dev;
 
-static void mv_cesa_dequeue_req_unlocked(struct mv_cesa_engine *engine)
+static void mv_cesa_dequeue_req_locked(struct mv_cesa_engine *engine)
 {
 	struct crypto_async_request *req, *backlog;
 	struct mv_cesa_ctx *ctx;
 
-	spin_lock_bh(&cesa_dev->lock);
-	backlog = crypto_get_backlog(&cesa_dev->queue);
-	req = crypto_dequeue_request(&cesa_dev->queue);
+	backlog = crypto_get_backlog(&engine->queue);
+	req = crypto_dequeue_request(&engine->queue);
 	engine->req = req;
-	spin_unlock_bh(&cesa_dev->lock);
 
 	if (!req)
 		return;
@@ -58,7 +56,6 @@ static void mv_cesa_dequeue_req_unlocked(struct mv_cesa_engine *engine)
 		backlog->complete(backlog, -EINPROGRESS);
 
 	ctx = crypto_tfm_ctx(req->tfm);
-	ctx->ops->prepare(req, engine);
 	ctx->ops->step(req);
 }
 
@@ -96,7 +93,7 @@ static irqreturn_t mv_cesa_int(int irq, void *priv)
 			if (res != -EINPROGRESS) {
 				spin_lock_bh(&engine->lock);
 				engine->req = NULL;
-				mv_cesa_dequeue_req_unlocked(engine);
+				mv_cesa_dequeue_req_locked(engine);
 				spin_unlock_bh(&engine->lock);
 				ctx->ops->complete(req);
 				ctx->ops->cleanup(req);
@@ -116,21 +113,19 @@ int mv_cesa_queue_req(struct crypto_async_request *req,
 		      struct mv_cesa_req *creq)
 {
 	int ret;
-	int i;
+	struct mv_cesa_engine *engine = creq->engine;
 
-	spin_lock_bh(&cesa_dev->lock);
-	ret = crypto_enqueue_request(&cesa_dev->queue, req);
-	spin_unlock_bh(&cesa_dev->lock);
+	spin_lock_bh(&engine->lock);
+	ret = crypto_enqueue_request(&engine->queue, req);
+	spin_unlock_bh(&engine->lock);
 
 	if (ret != -EINPROGRESS)
 		return ret;
 
-	for (i = 0; i < cesa_dev->caps->nengines; i++) {
-		spin_lock_bh(&cesa_dev->engines[i].lock);
-		if (!cesa_dev->engines[i].req)
-			mv_cesa_dequeue_req_unlocked(&cesa_dev->engines[i]);
-		spin_unlock_bh(&cesa_dev->engines[i].lock);
-	}
+	spin_lock_bh(&engine->lock);
+	if (!engine->req)
+		mv_cesa_dequeue_req_locked(engine);
+	spin_unlock_bh(&engine->lock);
 
 	return -EINPROGRESS;
 }
@@ -425,7 +420,7 @@ static int mv_cesa_probe(struct platform_device *pdev)
 		return -ENOMEM;
 
 	spin_lock_init(&cesa->lock);
-	crypto_init_queue(&cesa->queue, CESA_CRYPTO_DEFAULT_MAX_QLEN);
+
 	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "regs");
 	cesa->regs = devm_ioremap_resource(dev, res);
 	if (IS_ERR(cesa->regs))
@@ -498,6 +493,9 @@ static int mv_cesa_probe(struct platform_device *pdev)
 						engine);
 		if (ret)
 			goto err_cleanup;
+
+		crypto_init_queue(&engine->queue, CESA_CRYPTO_DEFAULT_MAX_QLEN);
+		atomic_set(&engine->load, 0);
 	}
 
 	cesa_dev = cesa;
diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
index c463528..644be35 100644
--- a/drivers/crypto/marvell/cesa.h
+++ b/drivers/crypto/marvell/cesa.h
@@ -400,7 +400,6 @@ struct mv_cesa_dev_dma {
  * @regs:	device registers
  * @sram_size:	usable SRAM size
  * @lock:	device lock
- * @queue:	crypto request queue
  * @engines:	array of engines
  * @dma:	dma pools
  *
@@ -412,7 +411,6 @@ struct mv_cesa_dev {
 	struct device *dev;
 	unsigned int sram_size;
 	spinlock_t lock;
-	struct crypto_queue queue;
 	struct mv_cesa_engine *engines;
 	struct mv_cesa_dev_dma *dma;
 };
@@ -431,6 +429,8 @@ struct mv_cesa_dev {
  * @int_mask:		interrupt mask cache
  * @pool:		memory pool pointing to the memory region reserved in
  *			SRAM
+ * @queue:		fifo of the pending crypto requests
+ * @load:		engine load counter, useful for load balancing
  *
  * Structure storing CESA engine information.
  */
@@ -446,11 +446,12 @@ struct mv_cesa_engine {
 	size_t max_req_len;
 	u32 int_mask;
 	struct gen_pool *pool;
+	struct crypto_queue queue;
+	atomic_t load;
 };
 
 /**
  * struct mv_cesa_req_ops - CESA request operations
- * @prepare:	prepare a request to be executed on the specified engine
  * @process:	process a request chunk result (should return 0 if the
  *		operation, -EINPROGRESS if it needs more steps or an error
  *		code)
@@ -460,8 +461,6 @@ struct mv_cesa_engine {
  * 		when it is needed.
  */
 struct mv_cesa_req_ops {
-	void (*prepare)(struct crypto_async_request *req,
-			struct mv_cesa_engine *engine);
 	int (*process)(struct crypto_async_request *req, u32 status);
 	void (*step)(struct crypto_async_request *req);
 	void (*cleanup)(struct crypto_async_request *req);
@@ -690,6 +689,26 @@ static inline bool mv_cesa_mac_op_is_first_frag(const struct mv_cesa_op_ctx *op)
 int mv_cesa_queue_req(struct crypto_async_request *req,
 		      struct mv_cesa_req *creq);
 
+static inline struct mv_cesa_engine *mv_cesa_select_engine(int weight)
+{
+	int i;
+	u32 min_load = U32_MAX;
+	struct mv_cesa_engine *selected = NULL;
+
+	for (i = 0; i < cesa_dev->caps->nengines; i++) {
+		struct mv_cesa_engine *engine = cesa_dev->engines + i;
+		u32 load = atomic_read(&engine->load);
+		if (load < min_load) {
+			min_load = load;
+			selected = engine;
+		}
+	}
+
+	atomic_add(weight, &selected->load);
+
+	return selected;
+}
+
 /*
  * Helper function that indicates whether a crypto request needs to be
  * cleaned up or not after being enqueued using mv_cesa_queue_req().
diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index 79d4175..28894be 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -214,6 +214,7 @@ mv_cesa_ablkcipher_complete(struct crypto_async_request *req)
 	struct mv_cesa_engine *engine = creq->base.engine;
 	unsigned int ivsize;
 
+	atomic_sub(ablkreq->nbytes, &engine->load);
 	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(ablkreq));
 
 	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ) {
@@ -231,7 +232,6 @@ mv_cesa_ablkcipher_complete(struct crypto_async_request *req)
 static const struct mv_cesa_req_ops mv_cesa_ablkcipher_req_ops = {
 	.step = mv_cesa_ablkcipher_step,
 	.process = mv_cesa_ablkcipher_process,
-	.prepare = mv_cesa_ablkcipher_prepare,
 	.cleanup = mv_cesa_ablkcipher_req_cleanup,
 	.complete = mv_cesa_ablkcipher_complete,
 };
@@ -456,29 +456,41 @@ static int mv_cesa_ablkcipher_req_init(struct ablkcipher_request *req,
 	return ret;
 }
 
-static int mv_cesa_des_op(struct ablkcipher_request *req,
-			  struct mv_cesa_op_ctx *tmpl)
+static int mv_cesa_ablkcipher_queue_req(struct ablkcipher_request *req,
+					struct mv_cesa_op_ctx *tmpl)
 {
-	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
-	struct mv_cesa_des_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
 	int ret;
-
-	mv_cesa_update_op_cfg(tmpl, CESA_SA_DESC_CFG_CRYPTM_DES,
-			      CESA_SA_DESC_CFG_CRYPTM_MSK);
-
-	memcpy(tmpl->ctx.blkcipher.key, ctx->key, DES_KEY_SIZE);
+	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
+	struct mv_cesa_engine *engine;
 
 	ret = mv_cesa_ablkcipher_req_init(req, tmpl);
 	if (ret)
 		return ret;
 
+	engine = mv_cesa_select_engine(req->nbytes);
+	mv_cesa_ablkcipher_prepare(&req->base, engine);
+
 	ret = mv_cesa_queue_req(&req->base, &creq->base);
+
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ablkcipher_cleanup(req);
 
 	return ret;
 }
 
+static int mv_cesa_des_op(struct ablkcipher_request *req,
+			  struct mv_cesa_op_ctx *tmpl)
+{
+	struct mv_cesa_des_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
+
+	mv_cesa_update_op_cfg(tmpl, CESA_SA_DESC_CFG_CRYPTM_DES,
+			      CESA_SA_DESC_CFG_CRYPTM_MSK);
+
+	memcpy(tmpl->ctx.blkcipher.key, ctx->key, DES_KEY_SIZE);
+
+	return mv_cesa_ablkcipher_queue_req(req, tmpl);
+}
+
 static int mv_cesa_ecb_des_encrypt(struct ablkcipher_request *req)
 {
 	struct mv_cesa_op_ctx tmpl;
@@ -580,24 +592,14 @@ struct crypto_alg mv_cesa_cbc_des_alg = {
 static int mv_cesa_des3_op(struct ablkcipher_request *req,
 			   struct mv_cesa_op_ctx *tmpl)
 {
-	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 	struct mv_cesa_des3_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
-	int ret;
 
 	mv_cesa_update_op_cfg(tmpl, CESA_SA_DESC_CFG_CRYPTM_3DES,
 			      CESA_SA_DESC_CFG_CRYPTM_MSK);
 
 	memcpy(tmpl->ctx.blkcipher.key, ctx->key, DES3_EDE_KEY_SIZE);
 
-	ret = mv_cesa_ablkcipher_req_init(req, tmpl);
-	if (ret)
-		return ret;
-
-	ret = mv_cesa_queue_req(&req->base, &creq->base);
-	if (mv_cesa_req_needs_cleanup(&req->base, ret))
-		mv_cesa_ablkcipher_cleanup(req);
-
-	return ret;
+	return mv_cesa_ablkcipher_queue_req(req, tmpl);
 }
 
 static int mv_cesa_ecb_des3_ede_encrypt(struct ablkcipher_request *req)
@@ -707,9 +709,8 @@ struct crypto_alg mv_cesa_cbc_des3_ede_alg = {
 static int mv_cesa_aes_op(struct ablkcipher_request *req,
 			  struct mv_cesa_op_ctx *tmpl)
 {
-	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 	struct mv_cesa_aes_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
-	int ret, i;
+	int i;
 	u32 *key;
 	u32 cfg;
 
@@ -732,15 +733,7 @@ static int mv_cesa_aes_op(struct ablkcipher_request *req,
 			      CESA_SA_DESC_CFG_CRYPTM_MSK |
 			      CESA_SA_DESC_CFG_AES_LEN_MSK);
 
-	ret = mv_cesa_ablkcipher_req_init(req, tmpl);
-	if (ret)
-		return ret;
-
-	ret = mv_cesa_queue_req(&req->base, &creq->base);
-	if (mv_cesa_req_needs_cleanup(&req->base, ret))
-		mv_cesa_ablkcipher_cleanup(req);
-
-	return ret;
+	return mv_cesa_ablkcipher_queue_req(req, tmpl);
 }
 
 static int mv_cesa_ecb_aes_encrypt(struct ablkcipher_request *req)
diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index e1f8acd..b7cfc42 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -335,6 +335,8 @@ static void mv_cesa_ahash_complete(struct crypto_async_request *req)
 				result[i] = cpu_to_be32(creq->state[i]);
 		}
 	}
+
+	atomic_sub(ahashreq->nbytes, &engine->load);
 }
 
 static void mv_cesa_ahash_prepare(struct crypto_async_request *req,
@@ -365,7 +367,6 @@ static void mv_cesa_ahash_req_cleanup(struct crypto_async_request *req)
 static const struct mv_cesa_req_ops mv_cesa_ahash_req_ops = {
 	.step = mv_cesa_ahash_step,
 	.process = mv_cesa_ahash_process,
-	.prepare = mv_cesa_ahash_prepare,
 	.cleanup = mv_cesa_ahash_req_cleanup,
 	.complete = mv_cesa_ahash_complete,
 };
@@ -682,13 +683,13 @@ static int mv_cesa_ahash_req_init(struct ahash_request *req, bool *cached)
 	return ret;
 }
 
-static int mv_cesa_ahash_update(struct ahash_request *req)
+static int mv_cesa_ahash_queue_req(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
+	struct mv_cesa_engine *engine;
 	bool cached = false;
 	int ret;
 
-	creq->len += req->nbytes;
 	ret = mv_cesa_ahash_req_init(req, &cached);
 	if (ret)
 		return ret;
@@ -696,61 +697,48 @@ static int mv_cesa_ahash_update(struct ahash_request *req)
 	if (cached)
 		return 0;
 
+	engine = mv_cesa_select_engine(req->nbytes);
+	mv_cesa_ahash_prepare(&req->base, engine);
+
 	ret = mv_cesa_queue_req(&req->base, &creq->base);
+
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ahash_cleanup(req);
 
 	return ret;
 }
 
+static int mv_cesa_ahash_update(struct ahash_request *req)
+{
+	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
+
+	creq->len += req->nbytes;
+
+	return mv_cesa_ahash_queue_req(req);
+}
+
 static int mv_cesa_ahash_final(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 	struct mv_cesa_op_ctx *tmpl = &creq->op_tmpl;
-	bool cached = false;
-	int ret;
 
 	mv_cesa_set_mac_op_total_len(tmpl, creq->len);
 	creq->last_req = true;
 	req->nbytes = 0;
 
-	ret = mv_cesa_ahash_req_init(req, &cached);
-	if (ret)
-		return ret;
-
-	if (cached)
-		return 0;
-
-	ret = mv_cesa_queue_req(&req->base, &creq->base);
-	if (mv_cesa_req_needs_cleanup(&req->base, ret))
-		mv_cesa_ahash_cleanup(req);
-
-	return ret;
+	return mv_cesa_ahash_queue_req(req);
 }
 
 static int mv_cesa_ahash_finup(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 	struct mv_cesa_op_ctx *tmpl = &creq->op_tmpl;
-	bool cached = false;
-	int ret;
 
 	creq->len += req->nbytes;
 	mv_cesa_set_mac_op_total_len(tmpl, creq->len);
 	creq->last_req = true;
 
-	ret = mv_cesa_ahash_req_init(req, &cached);
-	if (ret)
-		return ret;
-
-	if (cached)
-		return 0;
-
-	ret = mv_cesa_queue_req(&req->base, &creq->base);
-	if (mv_cesa_req_needs_cleanup(&req->base, ret))
-		mv_cesa_ahash_cleanup(req);
-
-	return ret;
+	return mv_cesa_ahash_queue_req(req);
 }
 
 static int mv_cesa_ahash_export(struct ahash_request *req, void *hash,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v3 08/10] crypto: marvell: Add load balancing between engines
@ 2016-06-21  8:08   ` Romain Perier
  0 siblings, 0 replies; 32+ messages in thread
From: Romain Perier @ 2016-06-21  8:08 UTC (permalink / raw)
  To: linux-arm-kernel

This commit adds support for fine-grained load balancing on
multi-engine IPs. The engine is pre-selected based on its current load
and on the weight of the crypto request that is about to be processed.
The global crypto queue is also replaced by a per-engine queue. These
changes are required to allow chaining crypto requests at the DMA
level. By using a crypto queue per engine, we make sure that the state
of the tdma chain stays synchronized with the crypto queue. We also
reduce contention on 'cesa_dev->lock' and improve parallelism.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
---

Changes in v3:

  - Renamed mv_cesa_dequeue_req_unlocked => mv_cesa_dequeue_req_locked

Changes in v2:

  - Reworded the commit message
  - Moved the code about SRAM I/O operations from this commit to
    a separate commit (see PATCH 07/10).

 drivers/crypto/marvell/cesa.c   | 34 ++++++++++++------------
 drivers/crypto/marvell/cesa.h   | 29 +++++++++++++++++----
 drivers/crypto/marvell/cipher.c | 57 ++++++++++++++++++-----------------------
 drivers/crypto/marvell/hash.c   | 50 ++++++++++++++----------------------
 4 files changed, 84 insertions(+), 86 deletions(-)

diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
index af96426..c0497ac 100644
--- a/drivers/crypto/marvell/cesa.c
+++ b/drivers/crypto/marvell/cesa.c
@@ -40,16 +40,14 @@ MODULE_PARM_DESC(allhwsupport, "Enable support for all hardware (even it if over
 
 struct mv_cesa_dev *cesa_dev;
 
-static void mv_cesa_dequeue_req_unlocked(struct mv_cesa_engine *engine)
+static void mv_cesa_dequeue_req_locked(struct mv_cesa_engine *engine)
 {
 	struct crypto_async_request *req, *backlog;
 	struct mv_cesa_ctx *ctx;
 
-	spin_lock_bh(&cesa_dev->lock);
-	backlog = crypto_get_backlog(&cesa_dev->queue);
-	req = crypto_dequeue_request(&cesa_dev->queue);
+	backlog = crypto_get_backlog(&engine->queue);
+	req = crypto_dequeue_request(&engine->queue);
 	engine->req = req;
-	spin_unlock_bh(&cesa_dev->lock);
 
 	if (!req)
 		return;
@@ -58,7 +56,6 @@ static void mv_cesa_dequeue_req_unlocked(struct mv_cesa_engine *engine)
 		backlog->complete(backlog, -EINPROGRESS);
 
 	ctx = crypto_tfm_ctx(req->tfm);
-	ctx->ops->prepare(req, engine);
 	ctx->ops->step(req);
 }
 
@@ -96,7 +93,7 @@ static irqreturn_t mv_cesa_int(int irq, void *priv)
 			if (res != -EINPROGRESS) {
 				spin_lock_bh(&engine->lock);
 				engine->req = NULL;
-				mv_cesa_dequeue_req_unlocked(engine);
+				mv_cesa_dequeue_req_locked(engine);
 				spin_unlock_bh(&engine->lock);
 				ctx->ops->complete(req);
 				ctx->ops->cleanup(req);
@@ -116,21 +113,19 @@ int mv_cesa_queue_req(struct crypto_async_request *req,
 		      struct mv_cesa_req *creq)
 {
 	int ret;
-	int i;
+	struct mv_cesa_engine *engine = creq->engine;
 
-	spin_lock_bh(&cesa_dev->lock);
-	ret = crypto_enqueue_request(&cesa_dev->queue, req);
-	spin_unlock_bh(&cesa_dev->lock);
+	spin_lock_bh(&engine->lock);
+	ret = crypto_enqueue_request(&engine->queue, req);
+	spin_unlock_bh(&engine->lock);
 
 	if (ret != -EINPROGRESS)
 		return ret;
 
-	for (i = 0; i < cesa_dev->caps->nengines; i++) {
-		spin_lock_bh(&cesa_dev->engines[i].lock);
-		if (!cesa_dev->engines[i].req)
-			mv_cesa_dequeue_req_unlocked(&cesa_dev->engines[i]);
-		spin_unlock_bh(&cesa_dev->engines[i].lock);
-	}
+	spin_lock_bh(&engine->lock);
+	if (!engine->req)
+		mv_cesa_dequeue_req_locked(engine);
+	spin_unlock_bh(&engine->lock);
 
 	return -EINPROGRESS;
 }
@@ -425,7 +420,7 @@ static int mv_cesa_probe(struct platform_device *pdev)
 		return -ENOMEM;
 
 	spin_lock_init(&cesa->lock);
-	crypto_init_queue(&cesa->queue, CESA_CRYPTO_DEFAULT_MAX_QLEN);
+
 	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "regs");
 	cesa->regs = devm_ioremap_resource(dev, res);
 	if (IS_ERR(cesa->regs))
@@ -498,6 +493,9 @@ static int mv_cesa_probe(struct platform_device *pdev)
 						engine);
 		if (ret)
 			goto err_cleanup;
+
+		crypto_init_queue(&engine->queue, CESA_CRYPTO_DEFAULT_MAX_QLEN);
+		atomic_set(&engine->load, 0);
 	}
 
 	cesa_dev = cesa;
diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
index c463528..644be35 100644
--- a/drivers/crypto/marvell/cesa.h
+++ b/drivers/crypto/marvell/cesa.h
@@ -400,7 +400,6 @@ struct mv_cesa_dev_dma {
  * @regs:	device registers
  * @sram_size:	usable SRAM size
  * @lock:	device lock
- * @queue:	crypto request queue
  * @engines:	array of engines
  * @dma:	dma pools
  *
@@ -412,7 +411,6 @@ struct mv_cesa_dev {
 	struct device *dev;
 	unsigned int sram_size;
 	spinlock_t lock;
-	struct crypto_queue queue;
 	struct mv_cesa_engine *engines;
 	struct mv_cesa_dev_dma *dma;
 };
@@ -431,6 +429,8 @@ struct mv_cesa_dev {
  * @int_mask:		interrupt mask cache
  * @pool:		memory pool pointing to the memory region reserved in
  *			SRAM
+ * @queue:		fifo of the pending crypto requests
+ * @load:		engine load counter, useful for load balancing
  *
  * Structure storing CESA engine information.
  */
@@ -446,11 +446,12 @@ struct mv_cesa_engine {
 	size_t max_req_len;
 	u32 int_mask;
 	struct gen_pool *pool;
+	struct crypto_queue queue;
+	atomic_t load;
 };
 
 /**
  * struct mv_cesa_req_ops - CESA request operations
- * @prepare:	prepare a request to be executed on the specified engine
  * @process:	process a request chunk result (should return 0 if the
  *		operation, -EINPROGRESS if it needs more steps or an error
  *		code)
@@ -460,8 +461,6 @@ struct mv_cesa_engine {
  * 		when it is needed.
  */
 struct mv_cesa_req_ops {
-	void (*prepare)(struct crypto_async_request *req,
-			struct mv_cesa_engine *engine);
 	int (*process)(struct crypto_async_request *req, u32 status);
 	void (*step)(struct crypto_async_request *req);
 	void (*cleanup)(struct crypto_async_request *req);
@@ -690,6 +689,26 @@ static inline bool mv_cesa_mac_op_is_first_frag(const struct mv_cesa_op_ctx *op)
 int mv_cesa_queue_req(struct crypto_async_request *req,
 		      struct mv_cesa_req *creq);
 
+static inline struct mv_cesa_engine *mv_cesa_select_engine(int weight)
+{
+	int i;
+	u32 min_load = U32_MAX;
+	struct mv_cesa_engine *selected = NULL;
+
+	for (i = 0; i < cesa_dev->caps->nengines; i++) {
+		struct mv_cesa_engine *engine = cesa_dev->engines + i;
+		u32 load = atomic_read(&engine->load);
+		if (load < min_load) {
+			min_load = load;
+			selected = engine;
+		}
+	}
+
+	atomic_add(weight, &selected->load);
+
+	return selected;
+}
+
 /*
  * Helper function that indicates whether a crypto request needs to be
  * cleaned up or not after being enqueued using mv_cesa_queue_req().
diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index 79d4175..28894be 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -214,6 +214,7 @@ mv_cesa_ablkcipher_complete(struct crypto_async_request *req)
 	struct mv_cesa_engine *engine = creq->base.engine;
 	unsigned int ivsize;
 
+	atomic_sub(ablkreq->nbytes, &engine->load);
 	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(ablkreq));
 
 	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ) {
@@ -231,7 +232,6 @@ mv_cesa_ablkcipher_complete(struct crypto_async_request *req)
 static const struct mv_cesa_req_ops mv_cesa_ablkcipher_req_ops = {
 	.step = mv_cesa_ablkcipher_step,
 	.process = mv_cesa_ablkcipher_process,
-	.prepare = mv_cesa_ablkcipher_prepare,
 	.cleanup = mv_cesa_ablkcipher_req_cleanup,
 	.complete = mv_cesa_ablkcipher_complete,
 };
@@ -456,29 +456,41 @@ static int mv_cesa_ablkcipher_req_init(struct ablkcipher_request *req,
 	return ret;
 }
 
-static int mv_cesa_des_op(struct ablkcipher_request *req,
-			  struct mv_cesa_op_ctx *tmpl)
+static int mv_cesa_ablkcipher_queue_req(struct ablkcipher_request *req,
+					struct mv_cesa_op_ctx *tmpl)
 {
-	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
-	struct mv_cesa_des_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
 	int ret;
-
-	mv_cesa_update_op_cfg(tmpl, CESA_SA_DESC_CFG_CRYPTM_DES,
-			      CESA_SA_DESC_CFG_CRYPTM_MSK);
-
-	memcpy(tmpl->ctx.blkcipher.key, ctx->key, DES_KEY_SIZE);
+	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
+	struct mv_cesa_engine *engine;
 
 	ret = mv_cesa_ablkcipher_req_init(req, tmpl);
 	if (ret)
 		return ret;
 
+	engine = mv_cesa_select_engine(req->nbytes);
+	mv_cesa_ablkcipher_prepare(&req->base, engine);
+
 	ret = mv_cesa_queue_req(&req->base, &creq->base);
+
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ablkcipher_cleanup(req);
 
 	return ret;
 }
 
+static int mv_cesa_des_op(struct ablkcipher_request *req,
+			  struct mv_cesa_op_ctx *tmpl)
+{
+	struct mv_cesa_des_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
+
+	mv_cesa_update_op_cfg(tmpl, CESA_SA_DESC_CFG_CRYPTM_DES,
+			      CESA_SA_DESC_CFG_CRYPTM_MSK);
+
+	memcpy(tmpl->ctx.blkcipher.key, ctx->key, DES_KEY_SIZE);
+
+	return mv_cesa_ablkcipher_queue_req(req, tmpl);
+}
+
 static int mv_cesa_ecb_des_encrypt(struct ablkcipher_request *req)
 {
 	struct mv_cesa_op_ctx tmpl;
@@ -580,24 +592,14 @@ struct crypto_alg mv_cesa_cbc_des_alg = {
 static int mv_cesa_des3_op(struct ablkcipher_request *req,
 			   struct mv_cesa_op_ctx *tmpl)
 {
-	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 	struct mv_cesa_des3_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
-	int ret;
 
 	mv_cesa_update_op_cfg(tmpl, CESA_SA_DESC_CFG_CRYPTM_3DES,
 			      CESA_SA_DESC_CFG_CRYPTM_MSK);
 
 	memcpy(tmpl->ctx.blkcipher.key, ctx->key, DES3_EDE_KEY_SIZE);
 
-	ret = mv_cesa_ablkcipher_req_init(req, tmpl);
-	if (ret)
-		return ret;
-
-	ret = mv_cesa_queue_req(&req->base, &creq->base);
-	if (mv_cesa_req_needs_cleanup(&req->base, ret))
-		mv_cesa_ablkcipher_cleanup(req);
-
-	return ret;
+	return mv_cesa_ablkcipher_queue_req(req, tmpl);
 }
 
 static int mv_cesa_ecb_des3_ede_encrypt(struct ablkcipher_request *req)
@@ -707,9 +709,8 @@ struct crypto_alg mv_cesa_cbc_des3_ede_alg = {
 static int mv_cesa_aes_op(struct ablkcipher_request *req,
 			  struct mv_cesa_op_ctx *tmpl)
 {
-	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 	struct mv_cesa_aes_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
-	int ret, i;
+	int i;
 	u32 *key;
 	u32 cfg;
 
@@ -732,15 +733,7 @@ static int mv_cesa_aes_op(struct ablkcipher_request *req,
 			      CESA_SA_DESC_CFG_CRYPTM_MSK |
 			      CESA_SA_DESC_CFG_AES_LEN_MSK);
 
-	ret = mv_cesa_ablkcipher_req_init(req, tmpl);
-	if (ret)
-		return ret;
-
-	ret = mv_cesa_queue_req(&req->base, &creq->base);
-	if (mv_cesa_req_needs_cleanup(&req->base, ret))
-		mv_cesa_ablkcipher_cleanup(req);
-
-	return ret;
+	return mv_cesa_ablkcipher_queue_req(req, tmpl);
 }
 
 static int mv_cesa_ecb_aes_encrypt(struct ablkcipher_request *req)
diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index e1f8acd..b7cfc42 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -335,6 +335,8 @@ static void mv_cesa_ahash_complete(struct crypto_async_request *req)
 				result[i] = cpu_to_be32(creq->state[i]);
 		}
 	}
+
+	atomic_sub(ahashreq->nbytes, &engine->load);
 }
 
 static void mv_cesa_ahash_prepare(struct crypto_async_request *req,
@@ -365,7 +367,6 @@ static void mv_cesa_ahash_req_cleanup(struct crypto_async_request *req)
 static const struct mv_cesa_req_ops mv_cesa_ahash_req_ops = {
 	.step = mv_cesa_ahash_step,
 	.process = mv_cesa_ahash_process,
-	.prepare = mv_cesa_ahash_prepare,
 	.cleanup = mv_cesa_ahash_req_cleanup,
 	.complete = mv_cesa_ahash_complete,
 };
@@ -682,13 +683,13 @@ static int mv_cesa_ahash_req_init(struct ahash_request *req, bool *cached)
 	return ret;
 }
 
-static int mv_cesa_ahash_update(struct ahash_request *req)
+static int mv_cesa_ahash_queue_req(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
+	struct mv_cesa_engine *engine;
 	bool cached = false;
 	int ret;
 
-	creq->len += req->nbytes;
 	ret = mv_cesa_ahash_req_init(req, &cached);
 	if (ret)
 		return ret;
@@ -696,61 +697,48 @@ static int mv_cesa_ahash_update(struct ahash_request *req)
 	if (cached)
 		return 0;
 
+	engine = mv_cesa_select_engine(req->nbytes);
+	mv_cesa_ahash_prepare(&req->base, engine);
+
 	ret = mv_cesa_queue_req(&req->base, &creq->base);
+
 	if (mv_cesa_req_needs_cleanup(&req->base, ret))
 		mv_cesa_ahash_cleanup(req);
 
 	return ret;
 }
 
+static int mv_cesa_ahash_update(struct ahash_request *req)
+{
+	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
+
+	creq->len += req->nbytes;
+
+	return mv_cesa_ahash_queue_req(req);
+}
+
 static int mv_cesa_ahash_final(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 	struct mv_cesa_op_ctx *tmpl = &creq->op_tmpl;
-	bool cached = false;
-	int ret;
 
 	mv_cesa_set_mac_op_total_len(tmpl, creq->len);
 	creq->last_req = true;
 	req->nbytes = 0;
 
-	ret = mv_cesa_ahash_req_init(req, &cached);
-	if (ret)
-		return ret;
-
-	if (cached)
-		return 0;
-
-	ret = mv_cesa_queue_req(&req->base, &creq->base);
-	if (mv_cesa_req_needs_cleanup(&req->base, ret))
-		mv_cesa_ahash_cleanup(req);
-
-	return ret;
+	return mv_cesa_ahash_queue_req(req);
 }
 
 static int mv_cesa_ahash_finup(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 	struct mv_cesa_op_ctx *tmpl = &creq->op_tmpl;
-	bool cached = false;
-	int ret;
 
 	creq->len += req->nbytes;
 	mv_cesa_set_mac_op_total_len(tmpl, creq->len);
 	creq->last_req = true;
 
-	ret = mv_cesa_ahash_req_init(req, &cached);
-	if (ret)
-		return ret;
-
-	if (cached)
-		return 0;
-
-	ret = mv_cesa_queue_req(&req->base, &creq->base);
-	if (mv_cesa_req_needs_cleanup(&req->base, ret))
-		mv_cesa_ahash_cleanup(req);
-
-	return ret;
+	return mv_cesa_ahash_queue_req(req);
 }
 
 static int mv_cesa_ahash_export(struct ahash_request *req, void *hash,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v3 09/10] crypto: marvell: Add support for chaining crypto requests in TDMA mode
  2016-06-21  8:08 ` Romain Perier
@ 2016-06-21  8:08   ` Romain Perier
  -1 siblings, 0 replies; 32+ messages in thread
From: Romain Perier @ 2016-06-21  8:08 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: Thomas Petazzoni, Russell King, linux-crypto, Gregory Clement,
	David S. Miller, linux-arm-kernel

The Cryptographic Engines and Security Accelerators (CESA) supports the
Multi-Packet Chain Mode. With this mode enabled, multiple tdma requests
can be chained and processed by the hardware without software
intervention. This mode was already activated, but the crypto requests
were not chained together. By chaining them, we significantly reduce
the number of IRQs: instead of being interrupted at the end of each
crypto request, we are interrupted only at the end of the last
cryptographic request processed by the engine.

This commit refactors the code, changes the code architecture and adds
the required data structures to chain cryptographic requests together
before sending them to an engine (whether stopped or already running).
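
As an illustration of the chaining rule (not an additional change; a
simplified stand-alone model of mv_cesa_tdma_chain() added below, with
made-up 'struct desc'/'struct chain' types instead of the driver's
mv_cesa_tdma_desc/mv_cesa_tdma_chain):

	#define BREAK_CHAIN	(1u << 28)	/* mirrors CESA_TDMA_BREAK_CHAIN */

	struct desc {
		struct desc *next;	/* CPU-side link */
		unsigned int next_dma;	/* link followed by the DMA engine */
		unsigned int cur_dma;	/* DMA address of this descriptor */
		unsigned int flags;
	};

	struct chain {
		struct desc *first;
		struct desc *last;
	};

	static void chain_append(struct chain *engine, struct chain *req)
	{
		if (!engine->first) {
			*engine = *req;
			return;
		}

		/* Link the chains on the CPU side... */
		engine->last->next = req->first;

		/* ...and on the hardware side, unless a break was requested. */
		if (!(engine->last->flags & BREAK_CHAIN))
			engine->last->next_dma = req->first->cur_dma;

		engine->last = req->last;
	}

Each request marks its last descriptor with CESA_TDMA_END_OF_REQ so the
interrupt handler can tell where one request ends; hash requests also
set CESA_TDMA_BREAK_CHAIN so the engine stops there rather than running
straight into the next request's descriptors.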

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
---

Changes in v3:

  - Cosmetic changes: Extra blank lines and coding style issues
    on prototypes.

Changes in v2:

  - Reworded the commit message
  - Fixed cosmetic changes: coding styles issues, missing blank lines
  - Reworked mv_cesa_rearm_engine: lock handling is simpler
  - Removed the call to the complete operation in mv_cesa_std_process,
    in case of errors (not required)
  - Squashed the removal of the '.prepare' fields (cipher.c, hash.c)
    into another commit (see PATCH 08/10).
  - In mv_cesa_tdma_process only treat the status argument for the last
    request, use 'normal' status for the other ones.
  - Added a comment for explaining how the errors are notified to the
    cesa core.

 drivers/crypto/marvell/cesa.c   | 115 +++++++++++++++++++++++++++++++---------
 drivers/crypto/marvell/cesa.h   |  39 +++++++++++++-
 drivers/crypto/marvell/cipher.c |   2 +-
 drivers/crypto/marvell/hash.c   |   6 +++
 drivers/crypto/marvell/tdma.c   |  86 ++++++++++++++++++++++++++++++
 5 files changed, 221 insertions(+), 27 deletions(-)

diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
index c0497ac..bb91156 100644
--- a/drivers/crypto/marvell/cesa.c
+++ b/drivers/crypto/marvell/cesa.c
@@ -40,14 +40,33 @@ MODULE_PARM_DESC(allhwsupport, "Enable support for all hardware (even it if over
 
 struct mv_cesa_dev *cesa_dev;
 
-static void mv_cesa_dequeue_req_locked(struct mv_cesa_engine *engine)
+struct crypto_async_request *
+mv_cesa_dequeue_req_locked(struct mv_cesa_engine *engine,
+			   struct crypto_async_request **backlog)
 {
-	struct crypto_async_request *req, *backlog;
-	struct mv_cesa_ctx *ctx;
+	struct crypto_async_request *req;
 
-	backlog = crypto_get_backlog(&engine->queue);
+	*backlog = crypto_get_backlog(&engine->queue);
 	req = crypto_dequeue_request(&engine->queue);
-	engine->req = req;
+
+	if (!req)
+		return NULL;
+
+	return req;
+}
+
+static void mv_cesa_rearm_engine(struct mv_cesa_engine *engine)
+{
+	struct crypto_async_request *req = NULL, *backlog = NULL;
+	struct mv_cesa_ctx *ctx;
+
+
+	spin_lock_bh(&engine->lock);
+	if (!engine->req) {
+		req = mv_cesa_dequeue_req_locked(engine, &backlog);
+		engine->req = req;
+	}
+	spin_unlock_bh(&engine->lock);
 
 	if (!req)
 		return;
@@ -57,6 +76,46 @@ static void mv_cesa_dequeue_req_locked(struct mv_cesa_engine *engine)
 
 	ctx = crypto_tfm_ctx(req->tfm);
 	ctx->ops->step(req);
+
+	return;
+}
+
+static int mv_cesa_std_process(struct mv_cesa_engine *engine, u32 status)
+{
+	struct crypto_async_request *req;
+	struct mv_cesa_ctx *ctx;
+	int res;
+
+	req = engine->req;
+	ctx = crypto_tfm_ctx(req->tfm);
+	res = ctx->ops->process(req, status);
+
+	if (res == 0) {
+		ctx->ops->complete(req);
+		mv_cesa_engine_enqueue_complete_request(engine, req);
+	} else if (res == -EINPROGRESS) {
+		ctx->ops->step(req);
+	}
+
+	return res;
+}
+
+static int mv_cesa_int_process(struct mv_cesa_engine *engine, u32 status)
+{
+	if (engine->chain.first && engine->chain.last)
+		return mv_cesa_tdma_process(engine, status);
+
+	return mv_cesa_std_process(engine, status);
+}
+
+static inline void
+mv_cesa_complete_req(struct mv_cesa_ctx *ctx, struct crypto_async_request *req,
+		     int res)
+{
+	ctx->ops->cleanup(req);
+	local_bh_disable();
+	req->complete(req, res);
+	local_bh_enable();
 }
 
 static irqreturn_t mv_cesa_int(int irq, void *priv)
@@ -83,26 +142,31 @@ static irqreturn_t mv_cesa_int(int irq, void *priv)
 		writel(~status, engine->regs + CESA_SA_FPGA_INT_STATUS);
 		writel(~status, engine->regs + CESA_SA_INT_STATUS);
 
+		/* Process fetched requests */
+		res = mv_cesa_int_process(engine, status & mask);
 		ret = IRQ_HANDLED;
+
 		spin_lock_bh(&engine->lock);
 		req = engine->req;
+		if (res != -EINPROGRESS)
+			engine->req = NULL;
 		spin_unlock_bh(&engine->lock);
-		if (req) {
-			ctx = crypto_tfm_ctx(req->tfm);
-			res = ctx->ops->process(req, status & mask);
-			if (res != -EINPROGRESS) {
-				spin_lock_bh(&engine->lock);
-				engine->req = NULL;
-				mv_cesa_dequeue_req_locked(engine);
-				spin_unlock_bh(&engine->lock);
-				ctx->ops->complete(req);
-				ctx->ops->cleanup(req);
-				local_bh_disable();
-				req->complete(req, res);
-				local_bh_enable();
-			} else {
-				ctx->ops->step(req);
-			}
+
+		ctx = crypto_tfm_ctx(req->tfm);
+
+		if (res && res != -EINPROGRESS)
+			mv_cesa_complete_req(ctx, req, res);
+
+		/* Launch the next pending request */
+		mv_cesa_rearm_engine(engine);
+
+		/* Iterate over the complete queue */
+		while (true) {
+			req = mv_cesa_engine_dequeue_complete_request(engine);
+			if (!req)
+				break;
+
+			mv_cesa_complete_req(ctx, req, 0);
 		}
 	}
 
@@ -116,16 +180,16 @@ int mv_cesa_queue_req(struct crypto_async_request *req,
 	struct mv_cesa_engine *engine = creq->engine;
 
 	spin_lock_bh(&engine->lock);
+	if (mv_cesa_req_get_type(creq) == CESA_DMA_REQ)
+		mv_cesa_tdma_chain(engine, creq);
+
 	ret = crypto_enqueue_request(&engine->queue, req);
 	spin_unlock_bh(&engine->lock);
 
 	if (ret != -EINPROGRESS)
 		return ret;
 
-	spin_lock_bh(&engine->lock);
-	if (!engine->req)
-		mv_cesa_dequeue_req_locked(engine);
-	spin_unlock_bh(&engine->lock);
+	mv_cesa_rearm_engine(engine);
 
 	return -EINPROGRESS;
 }
@@ -496,6 +560,7 @@ static int mv_cesa_probe(struct platform_device *pdev)
 
 		crypto_init_queue(&engine->queue, CESA_CRYPTO_DEFAULT_MAX_QLEN);
 		atomic_set(&engine->load, 0);
+		INIT_LIST_HEAD(&engine->complete_queue);
 	}
 
 	cesa_dev = cesa;
diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
index 644be35..50a1fb2 100644
--- a/drivers/crypto/marvell/cesa.h
+++ b/drivers/crypto/marvell/cesa.h
@@ -271,7 +271,9 @@ struct mv_cesa_op_ctx {
 /* TDMA descriptor flags */
 #define CESA_TDMA_DST_IN_SRAM			BIT(31)
 #define CESA_TDMA_SRC_IN_SRAM			BIT(30)
-#define CESA_TDMA_TYPE_MSK			GENMASK(29, 0)
+#define CESA_TDMA_END_OF_REQ			BIT(29)
+#define CESA_TDMA_BREAK_CHAIN			BIT(28)
+#define CESA_TDMA_TYPE_MSK			GENMASK(27, 0)
 #define CESA_TDMA_DUMMY				0
 #define CESA_TDMA_DATA				1
 #define CESA_TDMA_OP				2
@@ -431,6 +433,9 @@ struct mv_cesa_dev {
  *			SRAM
  * @queue:		fifo of the pending crypto requests
  * @load:		engine load counter, useful for load balancing
+ * @chain:		list of the current tdma descriptors being processed
+ * 			by this engine.
+ * @complete_queue:	fifo of the processed requests by the engine
  *
  * Structure storing CESA engine information.
  */
@@ -448,6 +453,8 @@ struct mv_cesa_engine {
 	struct gen_pool *pool;
 	struct crypto_queue queue;
 	atomic_t load;
+	struct mv_cesa_tdma_chain chain;
+	struct list_head complete_queue;
 };
 
 /**
@@ -608,6 +615,29 @@ struct mv_cesa_ahash_req {
 
 extern struct mv_cesa_dev *cesa_dev;
 
+
+static inline void
+mv_cesa_engine_enqueue_complete_request(struct mv_cesa_engine *engine,
+					struct crypto_async_request *req)
+{
+	list_add_tail(&req->list, &engine->complete_queue);
+}
+
+static inline struct crypto_async_request *
+mv_cesa_engine_dequeue_complete_request(struct mv_cesa_engine *engine)
+{
+	struct crypto_async_request *req;
+
+	req = list_first_entry_or_null(&engine->complete_queue,
+				       struct crypto_async_request,
+				       list);
+	if (req)
+		list_del(&req->list);
+
+	return req;
+}
+
+
 static inline enum mv_cesa_req_type
 mv_cesa_req_get_type(struct mv_cesa_req *req)
 {
@@ -689,6 +719,10 @@ static inline bool mv_cesa_mac_op_is_first_frag(const struct mv_cesa_op_ctx *op)
 int mv_cesa_queue_req(struct crypto_async_request *req,
 		      struct mv_cesa_req *creq);
 
+struct crypto_async_request *
+mv_cesa_dequeue_req_locked(struct mv_cesa_engine *engine,
+			   struct crypto_async_request **backlog);
+
 static inline struct mv_cesa_engine *mv_cesa_select_engine(int weight)
 {
 	int i;
@@ -794,6 +828,9 @@ static inline int mv_cesa_dma_process(struct mv_cesa_req *dreq,
 void mv_cesa_dma_prepare(struct mv_cesa_req *dreq,
 			 struct mv_cesa_engine *engine);
 void mv_cesa_dma_cleanup(struct mv_cesa_req *dreq);
+void mv_cesa_tdma_chain(struct mv_cesa_engine *engine,
+			struct mv_cesa_req *dreq);
+int mv_cesa_tdma_process(struct mv_cesa_engine *engine, u32 status);
 
 
 static inline void
diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index 28894be..a9ca0dc 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -390,6 +390,7 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
 		goto err_free_tdma;
 
 	basereq->chain = chain;
+	basereq->chain.last->flags |= CESA_TDMA_END_OF_REQ;
 
 	return 0;
 
@@ -447,7 +448,6 @@ static int mv_cesa_ablkcipher_req_init(struct ablkcipher_request *req,
 	mv_cesa_update_op_cfg(tmpl, CESA_SA_DESC_CFG_OP_CRYPT_ONLY,
 			      CESA_SA_DESC_CFG_OP_MSK);
 
-	/* TODO: add a threshold for DMA usage */
 	if (cesa_dev->caps->has_tdma)
 		ret = mv_cesa_ablkcipher_dma_req_init(req, tmpl);
 	else
diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index b7cfc42..c7e5a46 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -172,6 +172,9 @@ static void mv_cesa_ahash_std_step(struct ahash_request *req)
 	for (i = 0; i < digsize / 4; i++)
 		writel_relaxed(creq->state[i], engine->regs + CESA_IVDIG(i));
 
+	mv_cesa_adjust_op(engine, &creq->op_tmpl);
+	memcpy_toio(engine->sram, &creq->op_tmpl, sizeof(creq->op_tmpl));
+
 	if (creq->cache_ptr)
 		memcpy_toio(engine->sram + CESA_SA_DATA_SRAM_OFFSET,
 			    creq->cache, creq->cache_ptr);
@@ -647,6 +650,9 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 	else
 		creq->cache_ptr = 0;
 
+	basereq->chain.last->flags |= (CESA_TDMA_END_OF_REQ |
+				       CESA_TDMA_BREAK_CHAIN);
+
 	return 0;
 
 err_free_tdma:
diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
index 9d944ad..8de8c83 100644
--- a/drivers/crypto/marvell/tdma.c
+++ b/drivers/crypto/marvell/tdma.c
@@ -99,6 +99,92 @@ void mv_cesa_dma_prepare(struct mv_cesa_req *dreq,
 	}
 }
 
+void mv_cesa_tdma_chain(struct mv_cesa_engine *engine,
+			struct mv_cesa_req *dreq)
+{
+	if (engine->chain.first == NULL && engine->chain.last == NULL) {
+		engine->chain.first = dreq->chain.first;
+		engine->chain.last  = dreq->chain.last;
+	} else {
+		struct mv_cesa_tdma_desc *last;
+
+		last = engine->chain.last;
+		last->next = dreq->chain.first;
+		engine->chain.last = dreq->chain.last;
+
+		if (!(last->flags & CESA_TDMA_BREAK_CHAIN))
+			last->next_dma = dreq->chain.first->cur_dma;
+	}
+}
+
+int mv_cesa_tdma_process(struct mv_cesa_engine *engine, u32 status)
+{
+	struct crypto_async_request *req = NULL;
+	struct mv_cesa_tdma_desc *tdma = NULL, *next = NULL;
+	dma_addr_t tdma_cur;
+	int res = 0;
+
+	tdma_cur = readl(engine->regs + CESA_TDMA_CUR);
+
+	for (tdma = engine->chain.first; tdma; tdma = next) {
+		spin_lock_bh(&engine->lock);
+		next = tdma->next;
+		spin_unlock_bh(&engine->lock);
+
+		if (tdma->flags & CESA_TDMA_END_OF_REQ) {
+			struct crypto_async_request *backlog = NULL;
+			struct mv_cesa_ctx *ctx;
+			u32 current_status;
+
+			spin_lock_bh(&engine->lock);
+			/*
+			 * if req is NULL, this means we're processing the
+			 * request in engine->req.
+			 */
+			if (!req)
+				req = engine->req;
+			else
+				req = mv_cesa_dequeue_req_locked(engine,
+								 &backlog);
+
+			/* Re-chaining to the next request */
+			engine->chain.first = tdma->next;
+			tdma->next = NULL;
+
+			/* If this is the last request, clear the chain */
+			if (engine->chain.first == NULL)
+				engine->chain.last  = NULL;
+			spin_unlock_bh(&engine->lock);
+
+			ctx = crypto_tfm_ctx(req->tfm);
+			current_status = (tdma->cur_dma == tdma_cur) ?
+					  status : CESA_SA_INT_ACC0_IDMA_DONE;
+			res = ctx->ops->process(req, current_status);
+			ctx->ops->complete(req);
+
+			if (res == 0)
+				mv_cesa_engine_enqueue_complete_request(engine,
+									req);
+
+			if (backlog)
+				backlog->complete(backlog, -EINPROGRESS);
+		}
+
+		if (res || tdma->cur_dma == tdma_cur)
+			break;
+	}
+
+	/* Save the last request in error to engine->req, so that the core
+	 * knows which request was faulty */
+	if (res) {
+		spin_lock_bh(&engine->lock);
+		engine->req = req;
+		spin_unlock_bh(&engine->lock);
+	}
+
+	return res;
+}
+
 static struct mv_cesa_tdma_desc *
 mv_cesa_dma_add_desc(struct mv_cesa_tdma_chain *chain, gfp_t flags)
 {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v3 09/10] crypto: marvell: Add support for chaining crypto requests in TDMA mode
@ 2016-06-21  8:08   ` Romain Perier
  0 siblings, 0 replies; 32+ messages in thread
From: Romain Perier @ 2016-06-21  8:08 UTC (permalink / raw)
  To: linux-arm-kernel

The Cryptographic Engines and Security Accelerators (CESA) supports the
Multi-Packet Chain Mode. With this mode enabled, multiple tdma requests
can be chained and processed by the hardware without software
intervention. This mode was already activated, but the crypto requests
were not chained together. By chaining them, we significantly reduce
the number of IRQs: instead of being interrupted at the end of each
crypto request, we are interrupted only at the end of the last
cryptographic request processed by the engine.

This commit refactors the code, changes the code architecture and adds
the required data structures to chain cryptographic requests together
before sending them to an engine (whether stopped or already running).
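
As an illustration of the chaining rule (not an additional change; a
simplified stand-alone model of mv_cesa_tdma_chain() added below, with
made-up 'struct desc'/'struct chain' types instead of the driver's
mv_cesa_tdma_desc/mv_cesa_tdma_chain):

	#define BREAK_CHAIN	(1u << 28)	/* mirrors CESA_TDMA_BREAK_CHAIN */

	struct desc {
		struct desc *next;	/* CPU-side link */
		unsigned int next_dma;	/* link followed by the DMA engine */
		unsigned int cur_dma;	/* DMA address of this descriptor */
		unsigned int flags;
	};

	struct chain {
		struct desc *first;
		struct desc *last;
	};

	static void chain_append(struct chain *engine, struct chain *req)
	{
		if (!engine->first) {
			*engine = *req;
			return;
		}

		/* Link the chains on the CPU side... */
		engine->last->next = req->first;

		/* ...and on the hardware side, unless a break was requested. */
		if (!(engine->last->flags & BREAK_CHAIN))
			engine->last->next_dma = req->first->cur_dma;

		engine->last = req->last;
	}

Each request marks its last descriptor with CESA_TDMA_END_OF_REQ so the
interrupt handler can tell where one request ends; hash requests also
set CESA_TDMA_BREAK_CHAIN so the engine stops there rather than running
straight into the next request's descriptors.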

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
---

Changes in v3:

  - Cosmetic changes: Extra blank lines and coding style issues
    on prototypes.

Changes in v2:

  - Reworded the commit message
  - Fixed cosmetic changes: coding styles issues, missing blank lines
  - Reworked mv_cesa_rearm_engine: lock handling is simpler
  - Removed the call to the complete operation in mv_cesa_std_process,
    in case of errors (not required)
  - Squashed the removal of the '.prepare' fields (cipher.c, hash.c)
    into another commit (see PATCH 08/10).
  - In mv_cesa_tdma_process only treat the status argument for the last
    request, use 'normal' status for the other ones.
  - Added a comment for explaining how the errors are notified to the
    cesa core.

 drivers/crypto/marvell/cesa.c   | 115 +++++++++++++++++++++++++++++++---------
 drivers/crypto/marvell/cesa.h   |  39 +++++++++++++-
 drivers/crypto/marvell/cipher.c |   2 +-
 drivers/crypto/marvell/hash.c   |   6 +++
 drivers/crypto/marvell/tdma.c   |  86 ++++++++++++++++++++++++++++++
 5 files changed, 221 insertions(+), 27 deletions(-)

diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
index c0497ac..bb91156 100644
--- a/drivers/crypto/marvell/cesa.c
+++ b/drivers/crypto/marvell/cesa.c
@@ -40,14 +40,33 @@ MODULE_PARM_DESC(allhwsupport, "Enable support for all hardware (even it if over
 
 struct mv_cesa_dev *cesa_dev;
 
-static void mv_cesa_dequeue_req_locked(struct mv_cesa_engine *engine)
+struct crypto_async_request *
+mv_cesa_dequeue_req_locked(struct mv_cesa_engine *engine,
+			   struct crypto_async_request **backlog)
 {
-	struct crypto_async_request *req, *backlog;
-	struct mv_cesa_ctx *ctx;
+	struct crypto_async_request *req;
 
-	backlog = crypto_get_backlog(&engine->queue);
+	*backlog = crypto_get_backlog(&engine->queue);
 	req = crypto_dequeue_request(&engine->queue);
-	engine->req = req;
+
+	if (!req)
+		return NULL;
+
+	return req;
+}
+
+static void mv_cesa_rearm_engine(struct mv_cesa_engine *engine)
+{
+	struct crypto_async_request *req = NULL, *backlog = NULL;
+	struct mv_cesa_ctx *ctx;
+
+
+	spin_lock_bh(&engine->lock);
+	if (!engine->req) {
+		req = mv_cesa_dequeue_req_locked(engine, &backlog);
+		engine->req = req;
+	}
+	spin_unlock_bh(&engine->lock);
 
 	if (!req)
 		return;
@@ -57,6 +76,46 @@ static void mv_cesa_dequeue_req_locked(struct mv_cesa_engine *engine)
 
 	ctx = crypto_tfm_ctx(req->tfm);
 	ctx->ops->step(req);
+
+	return;
+}
+
+static int mv_cesa_std_process(struct mv_cesa_engine *engine, u32 status)
+{
+	struct crypto_async_request *req;
+	struct mv_cesa_ctx *ctx;
+	int res;
+
+	req = engine->req;
+	ctx = crypto_tfm_ctx(req->tfm);
+	res = ctx->ops->process(req, status);
+
+	if (res == 0) {
+		ctx->ops->complete(req);
+		mv_cesa_engine_enqueue_complete_request(engine, req);
+	} else if (res == -EINPROGRESS) {
+		ctx->ops->step(req);
+	}
+
+	return res;
+}
+
+static int mv_cesa_int_process(struct mv_cesa_engine *engine, u32 status)
+{
+	if (engine->chain.first && engine->chain.last)
+		return mv_cesa_tdma_process(engine, status);
+
+	return mv_cesa_std_process(engine, status);
+}
+
+static inline void
+mv_cesa_complete_req(struct mv_cesa_ctx *ctx, struct crypto_async_request *req,
+		     int res)
+{
+	ctx->ops->cleanup(req);
+	local_bh_disable();
+	req->complete(req, res);
+	local_bh_enable();
 }
 
 static irqreturn_t mv_cesa_int(int irq, void *priv)
@@ -83,26 +142,31 @@ static irqreturn_t mv_cesa_int(int irq, void *priv)
 		writel(~status, engine->regs + CESA_SA_FPGA_INT_STATUS);
 		writel(~status, engine->regs + CESA_SA_INT_STATUS);
 
+		/* Process fetched requests */
+		res = mv_cesa_int_process(engine, status & mask);
 		ret = IRQ_HANDLED;
+
 		spin_lock_bh(&engine->lock);
 		req = engine->req;
+		if (res != -EINPROGRESS)
+			engine->req = NULL;
 		spin_unlock_bh(&engine->lock);
-		if (req) {
-			ctx = crypto_tfm_ctx(req->tfm);
-			res = ctx->ops->process(req, status & mask);
-			if (res != -EINPROGRESS) {
-				spin_lock_bh(&engine->lock);
-				engine->req = NULL;
-				mv_cesa_dequeue_req_locked(engine);
-				spin_unlock_bh(&engine->lock);
-				ctx->ops->complete(req);
-				ctx->ops->cleanup(req);
-				local_bh_disable();
-				req->complete(req, res);
-				local_bh_enable();
-			} else {
-				ctx->ops->step(req);
-			}
+
+		ctx = crypto_tfm_ctx(req->tfm);
+
+		if (res && res != -EINPROGRESS)
+			mv_cesa_complete_req(ctx, req, res);
+
+		/* Launch the next pending request */
+		mv_cesa_rearm_engine(engine);
+
+		/* Iterate over the complete queue */
+		while (true) {
+			req = mv_cesa_engine_dequeue_complete_request(engine);
+			if (!req)
+				break;
+
+			mv_cesa_complete_req(ctx, req, 0);
 		}
 	}
 
@@ -116,16 +180,16 @@ int mv_cesa_queue_req(struct crypto_async_request *req,
 	struct mv_cesa_engine *engine = creq->engine;
 
 	spin_lock_bh(&engine->lock);
+	if (mv_cesa_req_get_type(creq) == CESA_DMA_REQ)
+		mv_cesa_tdma_chain(engine, creq);
+
 	ret = crypto_enqueue_request(&engine->queue, req);
 	spin_unlock_bh(&engine->lock);
 
 	if (ret != -EINPROGRESS)
 		return ret;
 
-	spin_lock_bh(&engine->lock);
-	if (!engine->req)
-		mv_cesa_dequeue_req_locked(engine);
-	spin_unlock_bh(&engine->lock);
+	mv_cesa_rearm_engine(engine);
 
 	return -EINPROGRESS;
 }
@@ -496,6 +560,7 @@ static int mv_cesa_probe(struct platform_device *pdev)
 
 		crypto_init_queue(&engine->queue, CESA_CRYPTO_DEFAULT_MAX_QLEN);
 		atomic_set(&engine->load, 0);
+		INIT_LIST_HEAD(&engine->complete_queue);
 	}
 
 	cesa_dev = cesa;
diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
index 644be35..50a1fb2 100644
--- a/drivers/crypto/marvell/cesa.h
+++ b/drivers/crypto/marvell/cesa.h
@@ -271,7 +271,9 @@ struct mv_cesa_op_ctx {
 /* TDMA descriptor flags */
 #define CESA_TDMA_DST_IN_SRAM			BIT(31)
 #define CESA_TDMA_SRC_IN_SRAM			BIT(30)
-#define CESA_TDMA_TYPE_MSK			GENMASK(29, 0)
+#define CESA_TDMA_END_OF_REQ			BIT(29)
+#define CESA_TDMA_BREAK_CHAIN			BIT(28)
+#define CESA_TDMA_TYPE_MSK			GENMASK(27, 0)
 #define CESA_TDMA_DUMMY				0
 #define CESA_TDMA_DATA				1
 #define CESA_TDMA_OP				2
@@ -431,6 +433,9 @@ struct mv_cesa_dev {
  *			SRAM
  * @queue:		fifo of the pending crypto requests
  * @load:		engine load counter, useful for load balancing
+ * @chain:		list of the current tdma descriptors being processed
+ * 			by this engine.
+ * @complete_queue:	fifo of the processed requests by the engine
  *
  * Structure storing CESA engine information.
  */
@@ -448,6 +453,8 @@ struct mv_cesa_engine {
 	struct gen_pool *pool;
 	struct crypto_queue queue;
 	atomic_t load;
+	struct mv_cesa_tdma_chain chain;
+	struct list_head complete_queue;
 };
 
 /**
@@ -608,6 +615,29 @@ struct mv_cesa_ahash_req {
 
 extern struct mv_cesa_dev *cesa_dev;
 
+
+static inline void
+mv_cesa_engine_enqueue_complete_request(struct mv_cesa_engine *engine,
+					struct crypto_async_request *req)
+{
+	list_add_tail(&req->list, &engine->complete_queue);
+}
+
+static inline struct crypto_async_request *
+mv_cesa_engine_dequeue_complete_request(struct mv_cesa_engine *engine)
+{
+	struct crypto_async_request *req;
+
+	req = list_first_entry_or_null(&engine->complete_queue,
+				       struct crypto_async_request,
+				       list);
+	if (req)
+		list_del(&req->list);
+
+	return req;
+}
+
+
 static inline enum mv_cesa_req_type
 mv_cesa_req_get_type(struct mv_cesa_req *req)
 {
@@ -689,6 +719,10 @@ static inline bool mv_cesa_mac_op_is_first_frag(const struct mv_cesa_op_ctx *op)
 int mv_cesa_queue_req(struct crypto_async_request *req,
 		      struct mv_cesa_req *creq);
 
+struct crypto_async_request *
+mv_cesa_dequeue_req_locked(struct mv_cesa_engine *engine,
+			   struct crypto_async_request **backlog);
+
 static inline struct mv_cesa_engine *mv_cesa_select_engine(int weight)
 {
 	int i;
@@ -794,6 +828,9 @@ static inline int mv_cesa_dma_process(struct mv_cesa_req *dreq,
 void mv_cesa_dma_prepare(struct mv_cesa_req *dreq,
 			 struct mv_cesa_engine *engine);
 void mv_cesa_dma_cleanup(struct mv_cesa_req *dreq);
+void mv_cesa_tdma_chain(struct mv_cesa_engine *engine,
+			struct mv_cesa_req *dreq);
+int mv_cesa_tdma_process(struct mv_cesa_engine *engine, u32 status);
 
 
 static inline void
diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index 28894be..a9ca0dc 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -390,6 +390,7 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
 		goto err_free_tdma;
 
 	basereq->chain = chain;
+	basereq->chain.last->flags |= CESA_TDMA_END_OF_REQ;
 
 	return 0;
 
@@ -447,7 +448,6 @@ static int mv_cesa_ablkcipher_req_init(struct ablkcipher_request *req,
 	mv_cesa_update_op_cfg(tmpl, CESA_SA_DESC_CFG_OP_CRYPT_ONLY,
 			      CESA_SA_DESC_CFG_OP_MSK);
 
-	/* TODO: add a threshold for DMA usage */
 	if (cesa_dev->caps->has_tdma)
 		ret = mv_cesa_ablkcipher_dma_req_init(req, tmpl);
 	else
diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index b7cfc42..c7e5a46 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -172,6 +172,9 @@ static void mv_cesa_ahash_std_step(struct ahash_request *req)
 	for (i = 0; i < digsize / 4; i++)
 		writel_relaxed(creq->state[i], engine->regs + CESA_IVDIG(i));
 
+	mv_cesa_adjust_op(engine, &creq->op_tmpl);
+	memcpy_toio(engine->sram, &creq->op_tmpl, sizeof(creq->op_tmpl));
+
 	if (creq->cache_ptr)
 		memcpy_toio(engine->sram + CESA_SA_DATA_SRAM_OFFSET,
 			    creq->cache, creq->cache_ptr);
@@ -647,6 +650,9 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 	else
 		creq->cache_ptr = 0;
 
+	basereq->chain.last->flags |= (CESA_TDMA_END_OF_REQ |
+				       CESA_TDMA_BREAK_CHAIN);
+
 	return 0;
 
 err_free_tdma:
diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
index 9d944ad..8de8c83 100644
--- a/drivers/crypto/marvell/tdma.c
+++ b/drivers/crypto/marvell/tdma.c
@@ -99,6 +99,92 @@ void mv_cesa_dma_prepare(struct mv_cesa_req *dreq,
 	}
 }
 
+void mv_cesa_tdma_chain(struct mv_cesa_engine *engine,
+			struct mv_cesa_req *dreq)
+{
+	if (engine->chain.first == NULL && engine->chain.last == NULL) {
+		engine->chain.first = dreq->chain.first;
+		engine->chain.last  = dreq->chain.last;
+	} else {
+		struct mv_cesa_tdma_desc *last;
+
+		last = engine->chain.last;
+		last->next = dreq->chain.first;
+		engine->chain.last = dreq->chain.last;
+
+		if (!(last->flags & CESA_TDMA_BREAK_CHAIN))
+			last->next_dma = dreq->chain.first->cur_dma;
+	}
+}
+
+int mv_cesa_tdma_process(struct mv_cesa_engine *engine, u32 status)
+{
+	struct crypto_async_request *req = NULL;
+	struct mv_cesa_tdma_desc *tdma = NULL, *next = NULL;
+	dma_addr_t tdma_cur;
+	int res = 0;
+
+	tdma_cur = readl(engine->regs + CESA_TDMA_CUR);
+
+	for (tdma = engine->chain.first; tdma; tdma = next) {
+		spin_lock_bh(&engine->lock);
+		next = tdma->next;
+		spin_unlock_bh(&engine->lock);
+
+		if (tdma->flags & CESA_TDMA_END_OF_REQ) {
+			struct crypto_async_request *backlog = NULL;
+			struct mv_cesa_ctx *ctx;
+			u32 current_status;
+
+			spin_lock_bh(&engine->lock);
+			/*
+			 * if req is NULL, this means we're processing the
+			 * request in engine->req.
+			 */
+			if (!req)
+				req = engine->req;
+			else
+				req = mv_cesa_dequeue_req_locked(engine,
+								 &backlog);
+
+			/* Re-chaining to the next request */
+			engine->chain.first = tdma->next;
+			tdma->next = NULL;
+
+			/* If this is the last request, clear the chain */
+			if (engine->chain.first == NULL)
+				engine->chain.last  = NULL;
+			spin_unlock_bh(&engine->lock);
+
+			ctx = crypto_tfm_ctx(req->tfm);
+			current_status = (tdma->cur_dma == tdma_cur) ?
+					  status : CESA_SA_INT_ACC0_IDMA_DONE;
+			res = ctx->ops->process(req, current_status);
+			ctx->ops->complete(req);
+
+			if (res == 0)
+				mv_cesa_engine_enqueue_complete_request(engine,
+									req);
+
+			if (backlog)
+				backlog->complete(backlog, -EINPROGRESS);
+		}
+
+		if (res || tdma->cur_dma == tdma_cur)
+			break;
+	}
+
+	/* Save the last request in error to engine->req, so that the core
+	 * knows which request was faulty */
+	if (res) {
+		spin_lock_bh(&engine->lock);
+		engine->req = req;
+		spin_unlock_bh(&engine->lock);
+	}
+
+	return res;
+}
+
 static struct mv_cesa_tdma_desc *
 mv_cesa_dma_add_desc(struct mv_cesa_tdma_chain *chain, gfp_t flags)
 {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v3 10/10] crypto: marvell: Increase the size of the crypto queue
  2016-06-21  8:08 ` Romain Perier
@ 2016-06-21  8:08   ` Romain Perier
  -1 siblings, 0 replies; 32+ messages in thread
From: Romain Perier @ 2016-06-21  8:08 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: Thomas Petazzoni, Russell King, linux-crypto, Gregory Clement,
	David S. Miller, linux-arm-kernel

Now that crypto requests are chained together at the DMA level, we
increase the size of the crypto queue for each engine. Because the
backlog limit is reached later, the crypto stack is not prevented from
submitting asynchronous requests as early, so more cryptographic tasks
are processed by the engines.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
---
 drivers/crypto/marvell/cesa.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
index bb91156..5147073 100644
--- a/drivers/crypto/marvell/cesa.c
+++ b/drivers/crypto/marvell/cesa.c
@@ -32,7 +32,7 @@
 #include "cesa.h"
 
 /* Limit of the crypto queue before reaching the backlog */
-#define CESA_CRYPTO_DEFAULT_MAX_QLEN 50
+#define CESA_CRYPTO_DEFAULT_MAX_QLEN 128
 
 static int allhwsupport = !IS_ENABLED(CONFIG_CRYPTO_DEV_MV_CESA);
 module_param_named(allhwsupport, allhwsupport, int, 0444);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v3 10/10] crypto: marvell: Increase the size of the crypto queue
@ 2016-06-21  8:08   ` Romain Perier
  0 siblings, 0 replies; 32+ messages in thread
From: Romain Perier @ 2016-06-21  8:08 UTC (permalink / raw)
  To: linux-arm-kernel

Now that crypto requests are chained together at the DMA level, we
increase the size of the crypto queue for each engine. Because the
backlog limit is reached later, the crypto stack is not prevented from
submitting asynchronous requests as early, so more cryptographic tasks
are processed by the engines.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
---
 drivers/crypto/marvell/cesa.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
index bb91156..5147073 100644
--- a/drivers/crypto/marvell/cesa.c
+++ b/drivers/crypto/marvell/cesa.c
@@ -32,7 +32,7 @@
 #include "cesa.h"
 
 /* Limit of the crypto queue before reaching the backlog */
-#define CESA_CRYPTO_DEFAULT_MAX_QLEN 50
+#define CESA_CRYPTO_DEFAULT_MAX_QLEN 128
 
 static int allhwsupport = !IS_ENABLED(CONFIG_CRYPTO_DEV_MV_CESA);
 module_param_named(allhwsupport, allhwsupport, int, 0444);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 08/10] crypto: marvell: Add load balancing between engines
  2016-06-21  8:08   ` Romain Perier
@ 2016-06-21 12:33     ` Boris Brezillon
  -1 siblings, 0 replies; 32+ messages in thread
From: Boris Brezillon @ 2016-06-21 12:33 UTC (permalink / raw)
  To: Romain Perier
  Cc: Arnaud Ebalard, Gregory Clement, Thomas Petazzoni,
	David S. Miller, Russell King, linux-crypto, linux-arm-kernel

On Tue, 21 Jun 2016 10:08:38 +0200
Romain Perier <romain.perier@free-electrons.com> wrote:

> This commits adds support for fine grained load balancing on
> multi-engine IPs. The engine is pre-selected based on its current load
> and on the weight of the crypto request that is about to be processed.
> The global crypto queue is also moved to each engine. These changes are
> required to allow chaining crypto requests at the DMA level. By using
> a crypto queue per engine, we make sure that we keep the state of the
> tdma chain synchronized with the crypto queue. We also reduce contention
> on 'cesa_dev->lock' and improve parallelism.
> 
> Signed-off-by: Romain Perier <romain.perier@free-electrons.com>

Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>

> ---
> 
> Changes in v3:
> 
>   - Renamed mv_cesa_dequeue_req_unlocked => mv_cesa_dequeue_req_locked
> 
> Changes in v2:
> 
>   - Reworded the commit message
>   - Moved the code about SRAM I/O operations from this commit to
>     a separated commit (see PATCH 07/10).
> 
>  drivers/crypto/marvell/cesa.c   | 34 ++++++++++++------------
>  drivers/crypto/marvell/cesa.h   | 29 +++++++++++++++++----
>  drivers/crypto/marvell/cipher.c | 57 ++++++++++++++++++-----------------------
>  drivers/crypto/marvell/hash.c   | 50 ++++++++++++++----------------------
>  4 files changed, 84 insertions(+), 86 deletions(-)
> 
> diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
> index af96426..c0497ac 100644
> --- a/drivers/crypto/marvell/cesa.c
> +++ b/drivers/crypto/marvell/cesa.c
> @@ -40,16 +40,14 @@ MODULE_PARM_DESC(allhwsupport, "Enable support for all hardware (even it if over
>  
>  struct mv_cesa_dev *cesa_dev;
>  
> -static void mv_cesa_dequeue_req_unlocked(struct mv_cesa_engine *engine)
> +static void mv_cesa_dequeue_req_locked(struct mv_cesa_engine *engine)
>  {
>  	struct crypto_async_request *req, *backlog;
>  	struct mv_cesa_ctx *ctx;
>  
> -	spin_lock_bh(&cesa_dev->lock);
> -	backlog = crypto_get_backlog(&cesa_dev->queue);
> -	req = crypto_dequeue_request(&cesa_dev->queue);
> +	backlog = crypto_get_backlog(&engine->queue);
> +	req = crypto_dequeue_request(&engine->queue);
>  	engine->req = req;
> -	spin_unlock_bh(&cesa_dev->lock);
>  
>  	if (!req)
>  		return;
> @@ -58,7 +56,6 @@ static void mv_cesa_dequeue_req_unlocked(struct mv_cesa_engine *engine)
>  		backlog->complete(backlog, -EINPROGRESS);
>  
>  	ctx = crypto_tfm_ctx(req->tfm);
> -	ctx->ops->prepare(req, engine);
>  	ctx->ops->step(req);
>  }
>  
> @@ -96,7 +93,7 @@ static irqreturn_t mv_cesa_int(int irq, void *priv)
>  			if (res != -EINPROGRESS) {
>  				spin_lock_bh(&engine->lock);
>  				engine->req = NULL;
> -				mv_cesa_dequeue_req_unlocked(engine);
> +				mv_cesa_dequeue_req_locked(engine);
>  				spin_unlock_bh(&engine->lock);
>  				ctx->ops->complete(req);
>  				ctx->ops->cleanup(req);
> @@ -116,21 +113,19 @@ int mv_cesa_queue_req(struct crypto_async_request *req,
>  		      struct mv_cesa_req *creq)
>  {
>  	int ret;
> -	int i;
> +	struct mv_cesa_engine *engine = creq->engine;
>  
> -	spin_lock_bh(&cesa_dev->lock);
> -	ret = crypto_enqueue_request(&cesa_dev->queue, req);
> -	spin_unlock_bh(&cesa_dev->lock);
> +	spin_lock_bh(&engine->lock);
> +	ret = crypto_enqueue_request(&engine->queue, req);
> +	spin_unlock_bh(&engine->lock);
>  
>  	if (ret != -EINPROGRESS)
>  		return ret;
>  
> -	for (i = 0; i < cesa_dev->caps->nengines; i++) {
> -		spin_lock_bh(&cesa_dev->engines[i].lock);
> -		if (!cesa_dev->engines[i].req)
> -			mv_cesa_dequeue_req_unlocked(&cesa_dev->engines[i]);
> -		spin_unlock_bh(&cesa_dev->engines[i].lock);
> -	}
> +	spin_lock_bh(&engine->lock);
> +	if (!engine->req)
> +		mv_cesa_dequeue_req_locked(engine);
> +	spin_unlock_bh(&engine->lock);
>  
>  	return -EINPROGRESS;
>  }
> @@ -425,7 +420,7 @@ static int mv_cesa_probe(struct platform_device *pdev)
>  		return -ENOMEM;
>  
>  	spin_lock_init(&cesa->lock);
> -	crypto_init_queue(&cesa->queue, CESA_CRYPTO_DEFAULT_MAX_QLEN);
> +
>  	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "regs");
>  	cesa->regs = devm_ioremap_resource(dev, res);
>  	if (IS_ERR(cesa->regs))
> @@ -498,6 +493,9 @@ static int mv_cesa_probe(struct platform_device *pdev)
>  						engine);
>  		if (ret)
>  			goto err_cleanup;
> +
> +		crypto_init_queue(&engine->queue, CESA_CRYPTO_DEFAULT_MAX_QLEN);
> +		atomic_set(&engine->load, 0);
>  	}
>  
>  	cesa_dev = cesa;
> diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
> index c463528..644be35 100644
> --- a/drivers/crypto/marvell/cesa.h
> +++ b/drivers/crypto/marvell/cesa.h
> @@ -400,7 +400,6 @@ struct mv_cesa_dev_dma {
>   * @regs:	device registers
>   * @sram_size:	usable SRAM size
>   * @lock:	device lock
> - * @queue:	crypto request queue
>   * @engines:	array of engines
>   * @dma:	dma pools
>   *
> @@ -412,7 +411,6 @@ struct mv_cesa_dev {
>  	struct device *dev;
>  	unsigned int sram_size;
>  	spinlock_t lock;
> -	struct crypto_queue queue;
>  	struct mv_cesa_engine *engines;
>  	struct mv_cesa_dev_dma *dma;
>  };
> @@ -431,6 +429,8 @@ struct mv_cesa_dev {
>   * @int_mask:		interrupt mask cache
>   * @pool:		memory pool pointing to the memory region reserved in
>   *			SRAM
> + * @queue:		fifo of the pending crypto requests
> + * @load:		engine load counter, useful for load balancing
>   *
>   * Structure storing CESA engine information.
>   */
> @@ -446,11 +446,12 @@ struct mv_cesa_engine {
>  	size_t max_req_len;
>  	u32 int_mask;
>  	struct gen_pool *pool;
> +	struct crypto_queue queue;
> +	atomic_t load;
>  };
>  
>  /**
>   * struct mv_cesa_req_ops - CESA request operations
> - * @prepare:	prepare a request to be executed on the specified engine
>   * @process:	process a request chunk result (should return 0 if the
>   *		operation, -EINPROGRESS if it needs more steps or an error
>   *		code)
> @@ -460,8 +461,6 @@ struct mv_cesa_engine {
>   * 		when it is needed.
>   */
>  struct mv_cesa_req_ops {
> -	void (*prepare)(struct crypto_async_request *req,
> -			struct mv_cesa_engine *engine);
>  	int (*process)(struct crypto_async_request *req, u32 status);
>  	void (*step)(struct crypto_async_request *req);
>  	void (*cleanup)(struct crypto_async_request *req);
> @@ -690,6 +689,26 @@ static inline bool mv_cesa_mac_op_is_first_frag(const struct mv_cesa_op_ctx *op)
>  int mv_cesa_queue_req(struct crypto_async_request *req,
>  		      struct mv_cesa_req *creq);
>  
> +static inline struct mv_cesa_engine *mv_cesa_select_engine(int weight)
> +{
> +	int i;
> +	u32 min_load = U32_MAX;
> +	struct mv_cesa_engine *selected = NULL;
> +
> +	for (i = 0; i < cesa_dev->caps->nengines; i++) {
> +		struct mv_cesa_engine *engine = cesa_dev->engines + i;
> +		u32 load = atomic_read(&engine->load);
> +		if (load < min_load) {
> +			min_load = load;
> +			selected = engine;
> +		}
> +	}
> +
> +	atomic_add(weight, &selected->load);
> +
> +	return selected;
> +}
> +
>  /*
>   * Helper function that indicates whether a crypto request needs to be
>   * cleaned up or not after being enqueued using mv_cesa_queue_req().
> diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
> index 79d4175..28894be 100644
> --- a/drivers/crypto/marvell/cipher.c
> +++ b/drivers/crypto/marvell/cipher.c
> @@ -214,6 +214,7 @@ mv_cesa_ablkcipher_complete(struct crypto_async_request *req)
>  	struct mv_cesa_engine *engine = creq->base.engine;
>  	unsigned int ivsize;
>  
> +	atomic_sub(ablkreq->nbytes, &engine->load);
>  	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(ablkreq));
>  
>  	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ) {
> @@ -231,7 +232,6 @@ mv_cesa_ablkcipher_complete(struct crypto_async_request *req)
>  static const struct mv_cesa_req_ops mv_cesa_ablkcipher_req_ops = {
>  	.step = mv_cesa_ablkcipher_step,
>  	.process = mv_cesa_ablkcipher_process,
> -	.prepare = mv_cesa_ablkcipher_prepare,
>  	.cleanup = mv_cesa_ablkcipher_req_cleanup,
>  	.complete = mv_cesa_ablkcipher_complete,
>  };
> @@ -456,29 +456,41 @@ static int mv_cesa_ablkcipher_req_init(struct ablkcipher_request *req,
>  	return ret;
>  }
>  
> -static int mv_cesa_des_op(struct ablkcipher_request *req,
> -			  struct mv_cesa_op_ctx *tmpl)
> +static int mv_cesa_ablkcipher_queue_req(struct ablkcipher_request *req,
> +					struct mv_cesa_op_ctx *tmpl)
>  {
> -	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
> -	struct mv_cesa_des_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
>  	int ret;
> -
> -	mv_cesa_update_op_cfg(tmpl, CESA_SA_DESC_CFG_CRYPTM_DES,
> -			      CESA_SA_DESC_CFG_CRYPTM_MSK);
> -
> -	memcpy(tmpl->ctx.blkcipher.key, ctx->key, DES_KEY_SIZE);
> +	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
> +	struct mv_cesa_engine *engine;
>  
>  	ret = mv_cesa_ablkcipher_req_init(req, tmpl);
>  	if (ret)
>  		return ret;
>  
> +	engine = mv_cesa_select_engine(req->nbytes);
> +	mv_cesa_ablkcipher_prepare(&req->base, engine);
> +
>  	ret = mv_cesa_queue_req(&req->base, &creq->base);
> +
>  	if (mv_cesa_req_needs_cleanup(&req->base, ret))
>  		mv_cesa_ablkcipher_cleanup(req);
>  
>  	return ret;
>  }
>  
> +static int mv_cesa_des_op(struct ablkcipher_request *req,
> +			  struct mv_cesa_op_ctx *tmpl)
> +{
> +	struct mv_cesa_des_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
> +
> +	mv_cesa_update_op_cfg(tmpl, CESA_SA_DESC_CFG_CRYPTM_DES,
> +			      CESA_SA_DESC_CFG_CRYPTM_MSK);
> +
> +	memcpy(tmpl->ctx.blkcipher.key, ctx->key, DES_KEY_SIZE);
> +
> +	return mv_cesa_ablkcipher_queue_req(req, tmpl);
> +}
> +
>  static int mv_cesa_ecb_des_encrypt(struct ablkcipher_request *req)
>  {
>  	struct mv_cesa_op_ctx tmpl;
> @@ -580,24 +592,14 @@ struct crypto_alg mv_cesa_cbc_des_alg = {
>  static int mv_cesa_des3_op(struct ablkcipher_request *req,
>  			   struct mv_cesa_op_ctx *tmpl)
>  {
> -	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
>  	struct mv_cesa_des3_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
> -	int ret;
>  
>  	mv_cesa_update_op_cfg(tmpl, CESA_SA_DESC_CFG_CRYPTM_3DES,
>  			      CESA_SA_DESC_CFG_CRYPTM_MSK);
>  
>  	memcpy(tmpl->ctx.blkcipher.key, ctx->key, DES3_EDE_KEY_SIZE);
>  
> -	ret = mv_cesa_ablkcipher_req_init(req, tmpl);
> -	if (ret)
> -		return ret;
> -
> -	ret = mv_cesa_queue_req(&req->base, &creq->base);
> -	if (mv_cesa_req_needs_cleanup(&req->base, ret))
> -		mv_cesa_ablkcipher_cleanup(req);
> -
> -	return ret;
> +	return mv_cesa_ablkcipher_queue_req(req, tmpl);
>  }
>  
>  static int mv_cesa_ecb_des3_ede_encrypt(struct ablkcipher_request *req)
> @@ -707,9 +709,8 @@ struct crypto_alg mv_cesa_cbc_des3_ede_alg = {
>  static int mv_cesa_aes_op(struct ablkcipher_request *req,
>  			  struct mv_cesa_op_ctx *tmpl)
>  {
> -	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
>  	struct mv_cesa_aes_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
> -	int ret, i;
> +	int i;
>  	u32 *key;
>  	u32 cfg;
>  
> @@ -732,15 +733,7 @@ static int mv_cesa_aes_op(struct ablkcipher_request *req,
>  			      CESA_SA_DESC_CFG_CRYPTM_MSK |
>  			      CESA_SA_DESC_CFG_AES_LEN_MSK);
>  
> -	ret = mv_cesa_ablkcipher_req_init(req, tmpl);
> -	if (ret)
> -		return ret;
> -
> -	ret = mv_cesa_queue_req(&req->base, &creq->base);
> -	if (mv_cesa_req_needs_cleanup(&req->base, ret))
> -		mv_cesa_ablkcipher_cleanup(req);
> -
> -	return ret;
> +	return mv_cesa_ablkcipher_queue_req(req, tmpl);
>  }
>  
>  static int mv_cesa_ecb_aes_encrypt(struct ablkcipher_request *req)
> diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
> index e1f8acd..b7cfc42 100644
> --- a/drivers/crypto/marvell/hash.c
> +++ b/drivers/crypto/marvell/hash.c
> @@ -335,6 +335,8 @@ static void mv_cesa_ahash_complete(struct crypto_async_request *req)
>  				result[i] = cpu_to_be32(creq->state[i]);
>  		}
>  	}
> +
> +	atomic_sub(ahashreq->nbytes, &engine->load);
>  }
>  
>  static void mv_cesa_ahash_prepare(struct crypto_async_request *req,
> @@ -365,7 +367,6 @@ static void mv_cesa_ahash_req_cleanup(struct crypto_async_request *req)
>  static const struct mv_cesa_req_ops mv_cesa_ahash_req_ops = {
>  	.step = mv_cesa_ahash_step,
>  	.process = mv_cesa_ahash_process,
> -	.prepare = mv_cesa_ahash_prepare,
>  	.cleanup = mv_cesa_ahash_req_cleanup,
>  	.complete = mv_cesa_ahash_complete,
>  };
> @@ -682,13 +683,13 @@ static int mv_cesa_ahash_req_init(struct ahash_request *req, bool *cached)
>  	return ret;
>  }
>  
> -static int mv_cesa_ahash_update(struct ahash_request *req)
> +static int mv_cesa_ahash_queue_req(struct ahash_request *req)
>  {
>  	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
> +	struct mv_cesa_engine *engine;
>  	bool cached = false;
>  	int ret;
>  
> -	creq->len += req->nbytes;
>  	ret = mv_cesa_ahash_req_init(req, &cached);
>  	if (ret)
>  		return ret;
> @@ -696,61 +697,48 @@ static int mv_cesa_ahash_update(struct ahash_request *req)
>  	if (cached)
>  		return 0;
>  
> +	engine = mv_cesa_select_engine(req->nbytes);
> +	mv_cesa_ahash_prepare(&req->base, engine);
> +
>  	ret = mv_cesa_queue_req(&req->base, &creq->base);
> +
>  	if (mv_cesa_req_needs_cleanup(&req->base, ret))
>  		mv_cesa_ahash_cleanup(req);
>  
>  	return ret;
>  }
>  
> +static int mv_cesa_ahash_update(struct ahash_request *req)
> +{
> +	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
> +
> +	creq->len += req->nbytes;
> +
> +	return mv_cesa_ahash_queue_req(req);
> +}
> +
>  static int mv_cesa_ahash_final(struct ahash_request *req)
>  {
>  	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
>  	struct mv_cesa_op_ctx *tmpl = &creq->op_tmpl;
> -	bool cached = false;
> -	int ret;
>  
>  	mv_cesa_set_mac_op_total_len(tmpl, creq->len);
>  	creq->last_req = true;
>  	req->nbytes = 0;
>  
> -	ret = mv_cesa_ahash_req_init(req, &cached);
> -	if (ret)
> -		return ret;
> -
> -	if (cached)
> -		return 0;
> -
> -	ret = mv_cesa_queue_req(&req->base, &creq->base);
> -	if (mv_cesa_req_needs_cleanup(&req->base, ret))
> -		mv_cesa_ahash_cleanup(req);
> -
> -	return ret;
> +	return mv_cesa_ahash_queue_req(req);
>  }
>  
>  static int mv_cesa_ahash_finup(struct ahash_request *req)
>  {
>  	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
>  	struct mv_cesa_op_ctx *tmpl = &creq->op_tmpl;
> -	bool cached = false;
> -	int ret;
>  
>  	creq->len += req->nbytes;
>  	mv_cesa_set_mac_op_total_len(tmpl, creq->len);
>  	creq->last_req = true;
>  
> -	ret = mv_cesa_ahash_req_init(req, &cached);
> -	if (ret)
> -		return ret;
> -
> -	if (cached)
> -		return 0;
> -
> -	ret = mv_cesa_queue_req(&req->base, &creq->base);
> -	if (mv_cesa_req_needs_cleanup(&req->base, ret))
> -		mv_cesa_ahash_cleanup(req);
> -
> -	return ret;
> +	return mv_cesa_ahash_queue_req(req);
>  }
>  
>  static int mv_cesa_ahash_export(struct ahash_request *req, void *hash,

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 09/10] crypto: marvell: Add support for chaining crypto requests in TDMA mode
  2016-06-21  8:08   ` Romain Perier
@ 2016-06-21 12:37     ` Boris Brezillon
  -1 siblings, 0 replies; 32+ messages in thread
From: Boris Brezillon @ 2016-06-21 12:37 UTC (permalink / raw)
  To: Romain Perier
  Cc: Thomas Petazzoni, Russell King, Arnaud Ebalard, linux-crypto,
	Gregory Clement, David S. Miller, linux-arm-kernel

On Tue, 21 Jun 2016 10:08:39 +0200
Romain Perier <romain.perier@free-electrons.com> wrote:

> The Cryptographic Engines and Security Accelerators (CESA) support the
> Multi-Packet Chain Mode. With this mode enabled, multiple tdma requests
> can be chained and processed by the hardware without software
> intervention. This mode was already activated; however, the crypto
> requests were not chained together. By chaining the crypto requests,
> we significantly reduce the number of IRQs. Instead of being
> interrupted at the end of each crypto request, we are interrupted at
> the end of the last cryptographic request processed by the engine.
> 
> This commit refactors the code, changes the code architecture and adds
> the required data structures to chain cryptographic requests together
> before sending them to an engine (stopped or possibly already running).
> 
> Signed-off-by: Romain Perier <romain.perier@free-electrons.com>

Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>

One nit below ;).

> ---
> 
> Changes in v3:
> 
>   - Cosmetic changes: Extra blank lines and coding style issues
>     on prototypes.
> 
> Changes in v2:
> 
>   - Reworded the commit message
>   - Fixed cosmetic problems: coding style issues, missing blank lines
>   - Reworked mv_cesa_rearm_engine: lock handling is simpler
>   - Removed the call to the complete operation in mv_cesa_std_process,
>     in case of errors (not required)
>   - Squashed the removal of the '.prepare' fields (cipher.c, hash.c)
>     into another commit (see PATCH 08/10).
>   - In mv_cesa_tdma_process only treat the status argument for the last
>     request, use 'normal' status for the other ones.
>   - Added a comment for explaining how the errors are notified to the
>     cesa core.
> 
>  drivers/crypto/marvell/cesa.c   | 115 +++++++++++++++++++++++++++++++---------
>  drivers/crypto/marvell/cesa.h   |  39 +++++++++++++-
>  drivers/crypto/marvell/cipher.c |   2 +-
>  drivers/crypto/marvell/hash.c   |   6 +++
>  drivers/crypto/marvell/tdma.c   |  86 ++++++++++++++++++++++++++++++
>  5 files changed, 221 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
> index c0497ac..bb91156 100644
> --- a/drivers/crypto/marvell/cesa.c
> +++ b/drivers/crypto/marvell/cesa.c
> @@ -40,14 +40,33 @@ MODULE_PARM_DESC(allhwsupport, "Enable support for all hardware (even it if over
>  
>  struct mv_cesa_dev *cesa_dev;
>  
> -static void mv_cesa_dequeue_req_locked(struct mv_cesa_engine *engine)
> +struct crypto_async_request *
> +mv_cesa_dequeue_req_locked(struct mv_cesa_engine *engine,
> +			   struct crypto_async_request **backlog)
>  {
> -	struct crypto_async_request *req, *backlog;
> -	struct mv_cesa_ctx *ctx;
> +	struct crypto_async_request *req;
>  
> -	backlog = crypto_get_backlog(&engine->queue);
> +	*backlog = crypto_get_backlog(&engine->queue);
>  	req = crypto_dequeue_request(&engine->queue);
> -	engine->req = req;
> +
> +	if (!req)
> +		return NULL;
> +
> +	return req;
> +}
> +
> +static void mv_cesa_rearm_engine(struct mv_cesa_engine *engine)
> +{
> +	struct crypto_async_request *req = NULL, *backlog = NULL;
> +	struct mv_cesa_ctx *ctx;
> +
> +
> +	spin_lock_bh(&engine->lock);
> +	if (!engine->req) {
> +		req = mv_cesa_dequeue_req_locked(engine, &backlog);
> +		engine->req = req;
> +	}
> +	spin_unlock_bh(&engine->lock);
>  
>  	if (!req)
>  		return;
> @@ -57,6 +76,46 @@ static void mv_cesa_dequeue_req_locked(struct mv_cesa_engine *engine)
>  
>  	ctx = crypto_tfm_ctx(req->tfm);
>  	ctx->ops->step(req);
> +
> +	return;
> +}
> +
> +static int mv_cesa_std_process(struct mv_cesa_engine *engine, u32 status)
> +{
> +	struct crypto_async_request *req;
> +	struct mv_cesa_ctx *ctx;
> +	int res;
> +
> +	req = engine->req;
> +	ctx = crypto_tfm_ctx(req->tfm);
> +	res = ctx->ops->process(req, status);
> +
> +	if (res == 0) {
> +		ctx->ops->complete(req);
> +		mv_cesa_engine_enqueue_complete_request(engine, req);
> +	} else if (res == -EINPROGRESS) {
> +		ctx->ops->step(req);
> +	}
> +
> +	return res;
> +}
> +
> +static int mv_cesa_int_process(struct mv_cesa_engine *engine, u32 status)
> +{
> +	if (engine->chain.first && engine->chain.last)
> +		return mv_cesa_tdma_process(engine, status);
> +
> +	return mv_cesa_std_process(engine, status);
> +}
> +
> +static inline void
> +mv_cesa_complete_req(struct mv_cesa_ctx *ctx, struct crypto_async_request *req,
> +		     int res)
> +{
> +	ctx->ops->cleanup(req);
> +	local_bh_disable();
> +	req->complete(req, res);
> +	local_bh_enable();
>  }
>  
>  static irqreturn_t mv_cesa_int(int irq, void *priv)
> @@ -83,26 +142,31 @@ static irqreturn_t mv_cesa_int(int irq, void *priv)
>  		writel(~status, engine->regs + CESA_SA_FPGA_INT_STATUS);
>  		writel(~status, engine->regs + CESA_SA_INT_STATUS);
>  
> +		/* Process fetched requests */
> +		res = mv_cesa_int_process(engine, status & mask);
>  		ret = IRQ_HANDLED;
> +
>  		spin_lock_bh(&engine->lock);
>  		req = engine->req;
> +		if (res != -EINPROGRESS)
> +			engine->req = NULL;
>  		spin_unlock_bh(&engine->lock);
> -		if (req) {
> -			ctx = crypto_tfm_ctx(req->tfm);
> -			res = ctx->ops->process(req, status & mask);
> -			if (res != -EINPROGRESS) {
> -				spin_lock_bh(&engine->lock);
> -				engine->req = NULL;
> -				mv_cesa_dequeue_req_locked(engine);
> -				spin_unlock_bh(&engine->lock);
> -				ctx->ops->complete(req);
> -				ctx->ops->cleanup(req);
> -				local_bh_disable();
> -				req->complete(req, res);
> -				local_bh_enable();
> -			} else {
> -				ctx->ops->step(req);
> -			}
> +
> +		ctx = crypto_tfm_ctx(req->tfm);
> +
> +		if (res && res != -EINPROGRESS)
> +			mv_cesa_complete_req(ctx, req, res);
> +
> +		/* Launch the next pending request */
> +		mv_cesa_rearm_engine(engine);
> +
> +		/* Iterate over the complete queue */
> +		while (true) {
> +			req = mv_cesa_engine_dequeue_complete_request(engine);
> +			if (!req)
> +				break;
> +
> +			mv_cesa_complete_req(ctx, req, 0);
>  		}
>  	}
>  
> @@ -116,16 +180,16 @@ int mv_cesa_queue_req(struct crypto_async_request *req,
>  	struct mv_cesa_engine *engine = creq->engine;
>  
>  	spin_lock_bh(&engine->lock);
> +	if (mv_cesa_req_get_type(creq) == CESA_DMA_REQ)
> +		mv_cesa_tdma_chain(engine, creq);
> +
>  	ret = crypto_enqueue_request(&engine->queue, req);
>  	spin_unlock_bh(&engine->lock);
>  
>  	if (ret != -EINPROGRESS)
>  		return ret;
>  
> -	spin_lock_bh(&engine->lock);
> -	if (!engine->req)
> -		mv_cesa_dequeue_req_locked(engine);
> -	spin_unlock_bh(&engine->lock);
> +	mv_cesa_rearm_engine(engine);
>  
>  	return -EINPROGRESS;
>  }
> @@ -496,6 +560,7 @@ static int mv_cesa_probe(struct platform_device *pdev)
>  
>  		crypto_init_queue(&engine->queue, CESA_CRYPTO_DEFAULT_MAX_QLEN);
>  		atomic_set(&engine->load, 0);
> +		INIT_LIST_HEAD(&engine->complete_queue);
>  	}
>  
>  	cesa_dev = cesa;
> diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
> index 644be35..50a1fb2 100644
> --- a/drivers/crypto/marvell/cesa.h
> +++ b/drivers/crypto/marvell/cesa.h
> @@ -271,7 +271,9 @@ struct mv_cesa_op_ctx {
>  /* TDMA descriptor flags */
>  #define CESA_TDMA_DST_IN_SRAM			BIT(31)
>  #define CESA_TDMA_SRC_IN_SRAM			BIT(30)
> -#define CESA_TDMA_TYPE_MSK			GENMASK(29, 0)
> +#define CESA_TDMA_END_OF_REQ			BIT(29)
> +#define CESA_TDMA_BREAK_CHAIN			BIT(28)
> +#define CESA_TDMA_TYPE_MSK			GENMASK(27, 0)
>  #define CESA_TDMA_DUMMY				0
>  #define CESA_TDMA_DATA				1
>  #define CESA_TDMA_OP				2
> @@ -431,6 +433,9 @@ struct mv_cesa_dev {
>   *			SRAM
>   * @queue:		fifo of the pending crypto requests
>   * @load:		engine load counter, useful for load balancing
> + * @chain:		list of the current tdma descriptors being processed
> + * 			by this engine.
> + * @complete_queue:	fifo of the processed requests by the engine
>   *
>   * Structure storing CESA engine information.
>   */
> @@ -448,6 +453,8 @@ struct mv_cesa_engine {
>  	struct gen_pool *pool;
>  	struct crypto_queue queue;
>  	atomic_t load;
> +	struct mv_cesa_tdma_chain chain;
> +	struct list_head complete_queue;
>  };
>  
>  /**
> @@ -608,6 +615,29 @@ struct mv_cesa_ahash_req {
>  
>  extern struct mv_cesa_dev *cesa_dev;
>  
> +
> +static inline void
> +mv_cesa_engine_enqueue_complete_request(struct mv_cesa_engine *engine,
> +					struct crypto_async_request *req)
> +{
> +	list_add_tail(&req->list, &engine->complete_queue);
> +}
> +
> +static inline struct crypto_async_request *
> +mv_cesa_engine_dequeue_complete_request(struct mv_cesa_engine *engine)
> +{
> +	struct crypto_async_request *req;
> +
> +	req = list_first_entry_or_null(&engine->complete_queue,
> +				       struct crypto_async_request,
> +				       list);
> +	if (req)
> +		list_del(&req->list);
> +
> +	return req;
> +}
> +
> +
>  static inline enum mv_cesa_req_type
>  mv_cesa_req_get_type(struct mv_cesa_req *req)
>  {
> @@ -689,6 +719,10 @@ static inline bool mv_cesa_mac_op_is_first_frag(const struct mv_cesa_op_ctx *op)
>  int mv_cesa_queue_req(struct crypto_async_request *req,
>  		      struct mv_cesa_req *creq);
>  
> +struct crypto_async_request *
> +mv_cesa_dequeue_req_locked(struct mv_cesa_engine *engine,
> +			   struct crypto_async_request **backlog);
> +
>  static inline struct mv_cesa_engine *mv_cesa_select_engine(int weight)
>  {
>  	int i;
> @@ -794,6 +828,9 @@ static inline int mv_cesa_dma_process(struct mv_cesa_req *dreq,
>  void mv_cesa_dma_prepare(struct mv_cesa_req *dreq,
>  			 struct mv_cesa_engine *engine);
>  void mv_cesa_dma_cleanup(struct mv_cesa_req *dreq);
> +void mv_cesa_tdma_chain(struct mv_cesa_engine *engine,
> +			struct mv_cesa_req *dreq);
> +int mv_cesa_tdma_process(struct mv_cesa_engine *engine, u32 status);
>  
>  
>  static inline void
> diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
> index 28894be..a9ca0dc 100644
> --- a/drivers/crypto/marvell/cipher.c
> +++ b/drivers/crypto/marvell/cipher.c
> @@ -390,6 +390,7 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
>  		goto err_free_tdma;
>  
>  	basereq->chain = chain;
> +	basereq->chain.last->flags |= CESA_TDMA_END_OF_REQ;
>  
>  	return 0;
>  
> @@ -447,7 +448,6 @@ static int mv_cesa_ablkcipher_req_init(struct ablkcipher_request *req,
>  	mv_cesa_update_op_cfg(tmpl, CESA_SA_DESC_CFG_OP_CRYPT_ONLY,
>  			      CESA_SA_DESC_CFG_OP_MSK);
>  
> -	/* TODO: add a threshold for DMA usage */
>  	if (cesa_dev->caps->has_tdma)
>  		ret = mv_cesa_ablkcipher_dma_req_init(req, tmpl);
>  	else
> diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
> index b7cfc42..c7e5a46 100644
> --- a/drivers/crypto/marvell/hash.c
> +++ b/drivers/crypto/marvell/hash.c
> @@ -172,6 +172,9 @@ static void mv_cesa_ahash_std_step(struct ahash_request *req)
>  	for (i = 0; i < digsize / 4; i++)
>  		writel_relaxed(creq->state[i], engine->regs + CESA_IVDIG(i));
>  
> +	mv_cesa_adjust_op(engine, &creq->op_tmpl);
> +	memcpy_toio(engine->sram, &creq->op_tmpl, sizeof(creq->op_tmpl));
> +
>  	if (creq->cache_ptr)
>  		memcpy_toio(engine->sram + CESA_SA_DATA_SRAM_OFFSET,
>  			    creq->cache, creq->cache_ptr);
> @@ -647,6 +650,9 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
>  	else
>  		creq->cache_ptr = 0;
>  
> +	basereq->chain.last->flags |= (CESA_TDMA_END_OF_REQ |
> +				       CESA_TDMA_BREAK_CHAIN);
> +
>  	return 0;
>  
>  err_free_tdma:
> diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
> index 9d944ad..8de8c83 100644
> --- a/drivers/crypto/marvell/tdma.c
> +++ b/drivers/crypto/marvell/tdma.c
> @@ -99,6 +99,92 @@ void mv_cesa_dma_prepare(struct mv_cesa_req *dreq,
>  	}
>  }
>  
> +void mv_cesa_tdma_chain(struct mv_cesa_engine *engine,
> +			struct mv_cesa_req *dreq)
> +{
> +	if (engine->chain.first == NULL && engine->chain.last == NULL) {
> +		engine->chain.first = dreq->chain.first;
> +		engine->chain.last  = dreq->chain.last;
> +	} else {
> +		struct mv_cesa_tdma_desc *last;
> +
> +		last = engine->chain.last;
> +		last->next = dreq->chain.first;
> +		engine->chain.last = dreq->chain.last;
> +
> +		if (!(last->flags & CESA_TDMA_BREAK_CHAIN))
> +			last->next_dma = dreq->chain.first->cur_dma;
> +	}
> +}
> +
> +int mv_cesa_tdma_process(struct mv_cesa_engine *engine, u32 status)
> +{
> +	struct crypto_async_request *req = NULL;
> +	struct mv_cesa_tdma_desc *tdma = NULL, *next = NULL;
> +	dma_addr_t tdma_cur;
> +	int res = 0;
> +
> +	tdma_cur = readl(engine->regs + CESA_TDMA_CUR);
> +
> +	for (tdma = engine->chain.first; tdma; tdma = next) {
> +		spin_lock_bh(&engine->lock);
> +		next = tdma->next;
> +		spin_unlock_bh(&engine->lock);
> +
> +		if (tdma->flags & CESA_TDMA_END_OF_REQ) {
> +			struct crypto_async_request *backlog = NULL;
> +			struct mv_cesa_ctx *ctx;
> +			u32 current_status;
> +
> +			spin_lock_bh(&engine->lock);
> +			/*
> +			 * if req is NULL, this means we're processing the
> +			 * request in engine->req.
> +			 */
> +			if (!req)
> +				req = engine->req;
> +			else
> +				req = mv_cesa_dequeue_req_locked(engine,
> +								 &backlog);
> +
> +			/* Re-chaining to the next request */
> +			engine->chain.first = tdma->next;
> +			tdma->next = NULL;
> +
> +			/* If this is the last request, clear the chain */
> +			if (engine->chain.first == NULL)
> +				engine->chain.last  = NULL;
> +			spin_unlock_bh(&engine->lock);
> +
> +			ctx = crypto_tfm_ctx(req->tfm);
> +			current_status = (tdma->cur_dma == tdma_cur) ?
> +					  status : CESA_SA_INT_ACC0_IDMA_DONE;
> +			res = ctx->ops->process(req, current_status);
> +			ctx->ops->complete(req);
> +
> +			if (res == 0)
> +				mv_cesa_engine_enqueue_complete_request(engine,
> +									req);
> +
> +			if (backlog)
> +				backlog->complete(backlog, -EINPROGRESS);
> +		}
> +
> +		if (res || tdma->cur_dma == tdma_cur)
> +			break;
> +	}
> +
> +	/* Save the last request in error to engine->req, so that the core
> +	 * knows which request was faulty */

Please use the standard comment style for over 80 char comments:

	/*
	 * <long message>
	 */

> +	if (res) {
> +		spin_lock_bh(&engine->lock);
> +		engine->req = req;
> +		spin_unlock_bh(&engine->lock);
> +	}
> +
> +	return res;
> +}
> +
>  static struct mv_cesa_tdma_desc *
>  mv_cesa_dma_add_desc(struct mv_cesa_tdma_chain *chain, gfp_t flags)
>  {

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 02/10] crypto: marvell: Check engine is not already running when enabling a req
  2016-06-21  8:08   ` Romain Perier
@ 2016-06-22 10:33     ` Herbert Xu
  -1 siblings, 0 replies; 32+ messages in thread
From: Herbert Xu @ 2016-06-22 10:33 UTC (permalink / raw)
  To: Romain Perier
  Cc: thomas.petazzoni, boris.brezillon, linux, arno, linux-crypto,
	gregory.clement, davem, linux-arm-kernel

Romain Perier <romain.perier@free-electrons.com> wrote:
> Add a BUG_ON() call when the driver tries to launch a crypto request
> while the engine is still processing the previous one. This replaces
> a silent system hang by a verbose kernel panic with the associated
> backtrace to let the user know that something went wrong in the CESA
> driver.

Hmm, so how can this happen? If it is triggerable then we better
try to recover from it more gracefully.  If it is not triggerable
then why bother?

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 02/10] crypto: marvell: Check engine is not already running when enabling a req
  2016-06-22 10:33     ` Herbert Xu
@ 2016-06-22 11:23       ` Romain Perier
  -1 siblings, 0 replies; 32+ messages in thread
From: Romain Perier @ 2016-06-22 11:23 UTC (permalink / raw)
  To: Herbert Xu
  Cc: thomas.petazzoni, boris.brezillon, linux, arno, linux-crypto,
	gregory.clement, davem, linux-arm-kernel

Hello,

On 22/06/2016 12:33, Herbert Xu wrote:
> Romain Perier <romain.perier@free-electrons.com> wrote:
>> Add a BUG_ON() call when the driver tries to launch a crypto request
>> while the engine is still processing the previous one. This replaces
>> a silent system hang by a verbose kernel panic with the associated
>> backtrace to let the user know that something went wrong in the CESA
>> driver.
>
> Hmm, so how can this happen?
> If it is triggerable then we better
> try to recover from it more gracefully.  If it is not triggerable
> then why bother?
>

Well, it does not happen with the current driver (in mainline). This is
a bug I had when I added support for chaining requests. Take a look at
patch 08/10: it changes the way the requests are "prepared". If you
enable a request while the engine is already running, that is very hard
to debug. It is more useful to have a backtrace letting the user know
that something is wrong than a silent system hang. That makes it easier
to debug and lets you detect regressions.
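
To give an idea, the check is essentially a sanity guard before the
engine is started in the step functions. Roughly, it looks like this
(a simplified sketch, not the exact hunk, reusing the driver's command
register definitions):

	/* The engine must be idle before a new request is stepped;
	 * catching this early is much easier to debug than a hang. */
	BUG_ON(readl(engine->regs + CESA_SA_CMD) &
	       CESA_SA_CMD_EN_CESA_SA_ACCL0);

	writel(CESA_SA_CMD_EN_CESA_SA_ACCL0, engine->regs + CESA_SA_CMD);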

Regards,
Romain
-- 
Romain Perier, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 02/10] crypto: marvell: Check engine is not already running when enabling a req
  2016-06-22 11:23       ` Romain Perier
@ 2016-06-23 10:44         ` Herbert Xu
  -1 siblings, 0 replies; 32+ messages in thread
From: Herbert Xu @ 2016-06-23 10:44 UTC (permalink / raw)
  To: Romain Perier
  Cc: boris.brezillon, arno, gregory.clement, thomas.petazzoni, davem,
	linux, linux-crypto, linux-arm-kernel

On Wed, Jun 22, 2016 at 01:23:39PM +0200, Romain Perier wrote:
> Hello,
> 
> On 22/06/2016 12:33, Herbert Xu wrote:
> >Romain Perier <romain.perier@free-electrons.com> wrote:
> >>Add a BUG_ON() call when the driver tries to launch a crypto request
> >>while the engine is still processing the previous one. This replaces
> >>a silent system hang by a verbose kernel panic with the associated
> >>backtrace to let the user know that something went wrong in the CESA
> >>driver.
> >
> >Hmm, so how can this happen?
> >If it is triggerable then we better
> >try to recover from it more gracefully.  If it is not triggerable
> >then why bother?
> >
> 
> Well, it does not happen with the current driver (in mainline). This
> is a bug I had when I added support for chaining requests. Take a
> look at patch 08/10: it changes the way the requests are "prepared".
> If you enable a request while the engine is already running, that is
> very hard to debug. It is more useful to have a backtrace letting the
> user know that something is wrong than a silent system hang. That
> makes it easier to debug and lets you detect regressions.

OK.  All applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2016-06-23 10:45 UTC | newest]

Thread overview: 32+ messages
2016-06-21  8:08 [PATCH v3 00/10] Chain crypto requests together at the DMA level Romain Perier
2016-06-21  8:08 ` Romain Perier
2016-06-21  8:08 ` [PATCH v3 01/10] crypto: marvell: Add a macro constant for the size of the crypto queue Romain Perier
2016-06-21  8:08   ` Romain Perier
2016-06-21  8:08 ` [PATCH v3 02/10] crypto: marvell: Check engine is not already running when enabling a req Romain Perier
2016-06-21  8:08   ` Romain Perier
2016-06-22 10:33   ` Herbert Xu
2016-06-22 10:33     ` Herbert Xu
2016-06-22 11:23     ` Romain Perier
2016-06-22 11:23       ` Romain Perier
2016-06-23 10:44       ` Herbert Xu
2016-06-23 10:44         ` Herbert Xu
2016-06-21  8:08 ` [PATCH v3 03/10] crypto: marvell: Fix wrong type check in dma functions Romain Perier
2016-06-21  8:08   ` Romain Perier
2016-06-21  8:08 ` [PATCH v3 04/10] crypto: marvell: Copy IV vectors by DMA transfers for acipher requests Romain Perier
2016-06-21  8:08   ` Romain Perier
2016-06-21  8:08 ` [PATCH v3 05/10] crypto: marvell: Move tdma chain out of mv_cesa_tdma_req and remove it Romain Perier
2016-06-21  8:08   ` Romain Perier
2016-06-21  8:08 ` [PATCH v3 06/10] crypto: marvell: Add a complete operation for async requests Romain Perier
2016-06-21  8:08   ` Romain Perier
2016-06-21  8:08 ` [PATCH v3 07/10] crypto: marvell: Move SRAM I/O operations to step functions Romain Perier
2016-06-21  8:08   ` Romain Perier
2016-06-21  8:08 ` [PATCH v3 08/10] crypto: marvell: Add load balancing between engines Romain Perier
2016-06-21  8:08   ` Romain Perier
2016-06-21 12:33   ` Boris Brezillon
2016-06-21 12:33     ` Boris Brezillon
2016-06-21  8:08 ` [PATCH v3 09/10] crypto: marvell: Add support for chaining crypto requests in TDMA mode Romain Perier
2016-06-21  8:08   ` Romain Perier
2016-06-21 12:37   ` Boris Brezillon
2016-06-21 12:37     ` Boris Brezillon
2016-06-21  8:08 ` [PATCH v3 10/10] crypto: marvell: Increase the size of the crypto queue Romain Perier
2016-06-21  8:08   ` Romain Perier
