linux-crypto.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/2] Improve DMA chaining for ahash requests
@ 2016-10-05  7:56 Romain Perier
  2016-10-05  7:56 ` [PATCH v4 1/2] crypto: marvell - Use an unique pool to copy results of requests Romain Perier
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Romain Perier @ 2016-10-05  7:56 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: David S. Miller, Herbert Xu, Thomas Petazzoni, Jason Cooper,
	Andrew Lunn, Sebastian Hesselbarth, Gregory Clement,
	Nadav Haklai, Ofer Heifetz, linux-crypto, linux-arm-kernel

This series contain performance improvement regarding ahash requests.
So far, ahash requests were systematically not chained at the DMA level.
However, in some case, like this is the case by using IPSec, some ahash
requests can be processed directly by the engine, and don't have
intermediaire partial update states.

This series firstly re-work the way outer IVs are copied from the SRAM
into the dma pool. To do so, we introduce a common dma pool for all type
of requests that contains outer results (like IV or digest). Then, for
ahash requests that can be processed directly by the engine, outer
results are copied from the SRAM into the common dma pool. These requests
are then allowed to be chained at the DMA level.


Benchmarking results with iperf throught IPSec
==============================================
		ESP			AH

Before		343 Mbits/s		492 Mbits/s
After		422 Mbits/s		577 Mbits/s
Improvement	+23%			+17%

Romain Perier (2):
  crypto: marvell - Use an unique pool to copy results of requests
  crypto: marvell - Don't break chain for computable last ahash requests

 drivers/crypto/marvell/cesa.c   |  4 ---
 drivers/crypto/marvell/cesa.h   |  5 ++--
 drivers/crypto/marvell/cipher.c |  8 +++--
 drivers/crypto/marvell/hash.c   | 65 +++++++++++++++++++++++++++++++----------
 drivers/crypto/marvell/tdma.c   | 28 +++++++++---------
 5 files changed, 70 insertions(+), 40 deletions(-)

-- 
2.9.3

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v4 1/2] crypto: marvell - Use an unique pool to copy results of requests
  2016-10-05  7:56 [PATCH v4 0/2] Improve DMA chaining for ahash requests Romain Perier
@ 2016-10-05  7:56 ` Romain Perier
  2016-10-05  7:56 ` [PATCH v4 2/2] crypto: marvell - Don't break chain for computable last ahash requests Romain Perier
  2016-10-21  3:13 ` [PATCH v4 0/2] Improve DMA chaining for " Herbert Xu
  2 siblings, 0 replies; 5+ messages in thread
From: Romain Perier @ 2016-10-05  7:56 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: David S. Miller, Herbert Xu, Thomas Petazzoni, Jason Cooper,
	Andrew Lunn, Sebastian Hesselbarth, Gregory Clement,
	Nadav Haklai, Ofer Heifetz, linux-crypto, linux-arm-kernel

So far, we used a dedicated dma pool to copy the result of outer IV for
cipher requests. Instead of using a dma pool per outer data, we prefer
use the op dma pool that contains all part of the request from the SRAM.
Then, the outer data that is likely to be used by the 'complete'
operation, is copied later. In this way, any type of result can be
retrieved by DMA for cipher or ahash requests.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
---

Changes in v4:
  - Added a comment that explains why we retrieve the first op ctx
    of the chain.
  - Added the tag 'Acked-by'

Changes in v3:
  - Don't allocate a new op ctx for the last tdma descriptor. Instead
    we point to the last op ctx in the tdma chain, and copy the context
    of the current request to this location.

Changes in v2:
  - Use the dma pool "op" to retrieve outer data intead of introducing
    a new one.

 drivers/crypto/marvell/cesa.c   |  4 ----
 drivers/crypto/marvell/cesa.h   |  5 ++---
 drivers/crypto/marvell/cipher.c |  8 +++++---
 drivers/crypto/marvell/tdma.c   | 33 +++++++++++++++++++--------------
 4 files changed, 26 insertions(+), 24 deletions(-)

diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
index 37dadb2..6e7a5c7 100644
--- a/drivers/crypto/marvell/cesa.c
+++ b/drivers/crypto/marvell/cesa.c
@@ -375,10 +375,6 @@ static int mv_cesa_dev_dma_init(struct mv_cesa_dev *cesa)
 	if (!dma->padding_pool)
 		return -ENOMEM;
 
-	dma->iv_pool = dmam_pool_create("cesa_iv", dev, 16, 1, 0);
-	if (!dma->iv_pool)
-		return -ENOMEM;
-
 	cesa->dma = dma;
 
 	return 0;
diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
index e423d33..a768da7 100644
--- a/drivers/crypto/marvell/cesa.h
+++ b/drivers/crypto/marvell/cesa.h
@@ -277,7 +277,7 @@ struct mv_cesa_op_ctx {
 #define CESA_TDMA_DUMMY				0
 #define CESA_TDMA_DATA				1
 #define CESA_TDMA_OP				2
-#define CESA_TDMA_IV				3
+#define CESA_TDMA_RESULT			3
 
 /**
  * struct mv_cesa_tdma_desc - TDMA descriptor
@@ -393,7 +393,6 @@ struct mv_cesa_dev_dma {
 	struct dma_pool *op_pool;
 	struct dma_pool *cache_pool;
 	struct dma_pool *padding_pool;
-	struct dma_pool *iv_pool;
 };
 
 /**
@@ -839,7 +838,7 @@ mv_cesa_tdma_desc_iter_init(struct mv_cesa_tdma_chain *chain)
 	memset(chain, 0, sizeof(*chain));
 }
 
-int mv_cesa_dma_add_iv_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
+int mv_cesa_dma_add_result_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
 			  u32 size, u32 flags, gfp_t gfp_flags);
 
 struct mv_cesa_op_ctx *mv_cesa_dma_add_op(struct mv_cesa_tdma_chain *chain,
diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index d19dc96..098871a 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -212,7 +212,8 @@ mv_cesa_ablkcipher_complete(struct crypto_async_request *req)
 		struct mv_cesa_req *basereq;
 
 		basereq = &creq->base;
-		memcpy(ablkreq->info, basereq->chain.last->data, ivsize);
+		memcpy(ablkreq->info, basereq->chain.last->op->ctx.blkcipher.iv,
+		       ivsize);
 	} else {
 		memcpy_fromio(ablkreq->info,
 			      engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET,
@@ -373,8 +374,9 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
 
 	/* Add output data for IV */
 	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(req));
-	ret = mv_cesa_dma_add_iv_op(&basereq->chain, CESA_SA_CRYPT_IV_SRAM_OFFSET,
-				    ivsize, CESA_TDMA_SRC_IN_SRAM, flags);
+	ret = mv_cesa_dma_add_result_op(&basereq->chain, CESA_SA_CFG_SRAM_OFFSET,
+				    CESA_SA_DATA_SRAM_OFFSET,
+				    CESA_TDMA_SRC_IN_SRAM, flags);
 
 	if (ret)
 		goto err_free_tdma;
diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
index 9fd7a5f..4416b88 100644
--- a/drivers/crypto/marvell/tdma.c
+++ b/drivers/crypto/marvell/tdma.c
@@ -69,9 +69,6 @@ void mv_cesa_dma_cleanup(struct mv_cesa_req *dreq)
 		if (type == CESA_TDMA_OP)
 			dma_pool_free(cesa_dev->dma->op_pool, tdma->op,
 				      le32_to_cpu(tdma->src));
-		else if (type == CESA_TDMA_IV)
-			dma_pool_free(cesa_dev->dma->iv_pool, tdma->data,
-				      le32_to_cpu(tdma->dst));
 
 		tdma = tdma->next;
 		dma_pool_free(cesa_dev->dma->tdma_desc_pool, old_tdma,
@@ -209,29 +206,37 @@ mv_cesa_dma_add_desc(struct mv_cesa_tdma_chain *chain, gfp_t flags)
 	return new_tdma;
 }
 
-int mv_cesa_dma_add_iv_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
+int mv_cesa_dma_add_result_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
 			  u32 size, u32 flags, gfp_t gfp_flags)
 {
-
-	struct mv_cesa_tdma_desc *tdma;
-	u8 *iv;
-	dma_addr_t dma_handle;
+	struct mv_cesa_tdma_desc *tdma, *op_desc;
 
 	tdma = mv_cesa_dma_add_desc(chain, gfp_flags);
 	if (IS_ERR(tdma))
 		return PTR_ERR(tdma);
 
-	iv = dma_pool_alloc(cesa_dev->dma->iv_pool, gfp_flags, &dma_handle);
-	if (!iv)
-		return -ENOMEM;
+	/* We re-use an existing op_desc object to retrieve the context
+	 * and result instead of allocating a new one.
+	 * There is at least one object of this type in a CESA crypto
+	 * req, just pick the first one in the chain.
+	 */
+	for (op_desc = chain->first; op_desc; op_desc = op_desc->next) {
+		u32 type = op_desc->flags & CESA_TDMA_TYPE_MSK;
+
+		if (type == CESA_TDMA_OP)
+			break;
+	}
+
+	if (!op_desc)
+		return -EIO;
 
 	tdma->byte_cnt = cpu_to_le32(size | BIT(31));
 	tdma->src = src;
-	tdma->dst = cpu_to_le32(dma_handle);
-	tdma->data = iv;
+	tdma->dst = op_desc->src;
+	tdma->op = op_desc->op;
 
 	flags &= (CESA_TDMA_DST_IN_SRAM | CESA_TDMA_SRC_IN_SRAM);
-	tdma->flags = flags | CESA_TDMA_IV;
+	tdma->flags = flags | CESA_TDMA_RESULT;
 	return 0;
 }
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v4 2/2] crypto: marvell - Don't break chain for computable last ahash requests
  2016-10-05  7:56 [PATCH v4 0/2] Improve DMA chaining for ahash requests Romain Perier
  2016-10-05  7:56 ` [PATCH v4 1/2] crypto: marvell - Use an unique pool to copy results of requests Romain Perier
@ 2016-10-05  7:56 ` Romain Perier
  2016-10-14 10:40   ` Boris Brezillon
  2016-10-21  3:13 ` [PATCH v4 0/2] Improve DMA chaining for " Herbert Xu
  2 siblings, 1 reply; 5+ messages in thread
From: Romain Perier @ 2016-10-05  7:56 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: David S. Miller, Herbert Xu, Thomas Petazzoni, Jason Cooper,
	Andrew Lunn, Sebastian Hesselbarth, Gregory Clement,
	Nadav Haklai, Ofer Heifetz, linux-crypto, linux-arm-kernel

Currently, the driver breaks chain for all kind of hash requests in order to
don't override intermediate states of partial ahash updates. However, some final
ahash requests can be directly processed by the engine, and so without
intermediate state. This is typically the case for most for the HMAC requests
processed via IPSec.

This commits adds a TDMA descriptor to copy context for these of requests
into the "op" dma pool, then it allow to chain these requests at the DMA level.
The 'complete' operation is also updated to retrieve the MAC digest from the
right location.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
---

Changes in v4:
 - Remove the dummy descriptor at the end of the chain, when a TDMA_RESULT
   is present. So, we re-wrote a bit the code of ahash_complete accordingly.

Changes in v3:
 - Copy the whole context back to RAM and not just the digest. Also
   fixed a rebase issue ^^ (whoops)

Changes in v2:
 - Replaced BUG_ON by an error
 - Add a variable "break_chain", with "type" to break the chain

 drivers/crypto/marvell/hash.c | 65 ++++++++++++++++++++++++++++++++-----------
 1 file changed, 49 insertions(+), 16 deletions(-)

diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index 9f28468..2a92605 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -312,24 +312,40 @@ static void mv_cesa_ahash_complete(struct crypto_async_request *req)
 	int i;
 
 	digsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(ahashreq));
-	for (i = 0; i < digsize / 4; i++)
-		creq->state[i] = readl_relaxed(engine->regs + CESA_IVDIG(i));
 
-	if (creq->last_req) {
+	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ &&
+	    (creq->base.chain.last->flags & CESA_TDMA_TYPE_MSK) == CESA_TDMA_RESULT) {
+		__le32 *data = NULL;
+
 		/*
-		 * Hardware's MD5 digest is in little endian format, but
-		 * SHA in big endian format
+		 * Result is already in the correct endianess when the SA is
+		 * used
 		 */
-		if (creq->algo_le) {
-			__le32 *result = (void *)ahashreq->result;
+		data = creq->base.chain.last->op->ctx.hash.hash;
+		for (i = 0; i < digsize / 4; i++)
+			creq->state[i] = cpu_to_le32(data[i]);
 
-			for (i = 0; i < digsize / 4; i++)
-				result[i] = cpu_to_le32(creq->state[i]);
-		} else {
-			__be32 *result = (void *)ahashreq->result;
+		memcpy(ahashreq->result, data, digsize);
+	} else {
+		for (i = 0; i < digsize / 4; i++)
+			creq->state[i] = readl_relaxed(engine->regs +
+						       CESA_IVDIG(i));
+		if (creq->last_req) {
+			/*
+			* Hardware's MD5 digest is in little endian format, but
+			* SHA in big endian format
+			*/
+			if (creq->algo_le) {
+				__le32 *result = (void *)ahashreq->result;
+
+				for (i = 0; i < digsize / 4; i++)
+					result[i] = cpu_to_le32(creq->state[i]);
+			} else {
+				__be32 *result = (void *)ahashreq->result;
 
-			for (i = 0; i < digsize / 4; i++)
-				result[i] = cpu_to_be32(creq->state[i]);
+				for (i = 0; i < digsize / 4; i++)
+					result[i] = cpu_to_be32(creq->state[i]);
+			}
 		}
 	}
 
@@ -504,6 +520,12 @@ mv_cesa_ahash_dma_last_req(struct mv_cesa_tdma_chain *chain,
 						CESA_SA_DESC_CFG_LAST_FRAG,
 				      CESA_SA_DESC_CFG_FRAG_MSK);
 
+		ret = mv_cesa_dma_add_result_op(chain,
+						CESA_SA_CFG_SRAM_OFFSET,
+						CESA_SA_DATA_SRAM_OFFSET,
+						CESA_TDMA_SRC_IN_SRAM, flags);
+		if (ret)
+			return ERR_PTR(-ENOMEM);
 		return op;
 	}
 
@@ -564,6 +586,7 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 	struct mv_cesa_op_ctx *op = NULL;
 	unsigned int frag_len;
 	int ret;
+	u32 type;
 
 	basereq->chain.first = NULL;
 	basereq->chain.last = NULL;
@@ -635,7 +658,15 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 		goto err_free_tdma;
 	}
 
-	if (op) {
+	/*
+	 * If results are copied via DMA, this means that this
+	 * request can be directly processed by the engine,
+	 * without partial updates. So we can chain it at the
+	 * DMA level with other requests.
+	 */
+	type = basereq->chain.last->flags & CESA_TDMA_TYPE_MSK;
+
+	if (op && type != CESA_TDMA_RESULT) {
 		/* Add dummy desc to wait for crypto operation end */
 		ret = mv_cesa_dma_add_dummy_end(&basereq->chain, flags);
 		if (ret)
@@ -648,8 +679,10 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 	else
 		creq->cache_ptr = 0;
 
-	basereq->chain.last->flags |= (CESA_TDMA_END_OF_REQ |
-				       CESA_TDMA_BREAK_CHAIN);
+	basereq->chain.last->flags |= CESA_TDMA_END_OF_REQ;
+
+	if (type != CESA_TDMA_RESULT)
+		basereq->chain.last->flags |= CESA_TDMA_BREAK_CHAIN;
 
 	return 0;
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v4 2/2] crypto: marvell - Don't break chain for computable last ahash requests
  2016-10-05  7:56 ` [PATCH v4 2/2] crypto: marvell - Don't break chain for computable last ahash requests Romain Perier
@ 2016-10-14 10:40   ` Boris Brezillon
  0 siblings, 0 replies; 5+ messages in thread
From: Boris Brezillon @ 2016-10-14 10:40 UTC (permalink / raw)
  To: Romain Perier
  Cc: Arnaud Ebalard, David S. Miller, Herbert Xu, Thomas Petazzoni,
	Jason Cooper, Andrew Lunn, Sebastian Hesselbarth,
	Gregory Clement, Nadav Haklai, Ofer Heifetz, linux-crypto,
	linux-arm-kernel

On Wed,  5 Oct 2016 09:56:33 +0200
Romain Perier <romain.perier@free-electrons.com> wrote:

> Currently, the driver breaks chain for all kind of hash requests in order to
> don't override intermediate states of partial ahash updates. However, some final
> ahash requests can be directly processed by the engine, and so without
> intermediate state. This is typically the case for most for the HMAC requests
> processed via IPSec.
> 
> This commits adds a TDMA descriptor to copy context for these of requests
> into the "op" dma pool, then it allow to chain these requests at the DMA level.
> The 'complete' operation is also updated to retrieve the MAC digest from the
> right location.
> 
> Signed-off-by: Romain Perier <romain.perier@free-electrons.com>

Minor comments below, otherwise it looks good.

Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>

> ---
> 
> Changes in v4:
>  - Remove the dummy descriptor at the end of the chain, when a TDMA_RESULT
>    is present. So, we re-wrote a bit the code of ahash_complete accordingly.
> 
> Changes in v3:
>  - Copy the whole context back to RAM and not just the digest. Also
>    fixed a rebase issue ^^ (whoops)
> 
> Changes in v2:
>  - Replaced BUG_ON by an error
>  - Add a variable "break_chain", with "type" to break the chain
> 
>  drivers/crypto/marvell/hash.c | 65 ++++++++++++++++++++++++++++++++-----------
>  1 file changed, 49 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
> index 9f28468..2a92605 100644
> --- a/drivers/crypto/marvell/hash.c
> +++ b/drivers/crypto/marvell/hash.c
> @@ -312,24 +312,40 @@ static void mv_cesa_ahash_complete(struct crypto_async_request *req)
>  	int i;
>  
>  	digsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(ahashreq));
> -	for (i = 0; i < digsize / 4; i++)
> -		creq->state[i] = readl_relaxed(engine->regs + CESA_IVDIG(i));
>  
> -	if (creq->last_req) {
> +	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ &&
> +	    (creq->base.chain.last->flags & CESA_TDMA_TYPE_MSK) == CESA_TDMA_RESULT) {

Maybe it's time to create an helper to extract the TDMA desc type (as
you did for the CESA req type with mv_cesa_req_get_type()).

> +		__le32 *data = NULL;
> +
>  		/*
> -		 * Hardware's MD5 digest is in little endian format, but
> -		 * SHA in big endian format
> +		 * Result is already in the correct endianess when the SA is
> +		 * used
>  		 */
> -		if (creq->algo_le) {
> -			__le32 *result = (void *)ahashreq->result;
> +		data = creq->base.chain.last->op->ctx.hash.hash;
> +		for (i = 0; i < digsize / 4; i++)
> +			creq->state[i] = cpu_to_le32(data[i]);
>  
> -			for (i = 0; i < digsize / 4; i++)
> -				result[i] = cpu_to_le32(creq->state[i]);
> -		} else {
> -			__be32 *result = (void *)ahashreq->result;
> +		memcpy(ahashreq->result, data, digsize);
> +	} else {
> +		for (i = 0; i < digsize / 4; i++)
> +			creq->state[i] = readl_relaxed(engine->regs +
> +						       CESA_IVDIG(i));
> +		if (creq->last_req) {
> +			/*
> +			* Hardware's MD5 digest is in little endian format, but
> +			* SHA in big endian format
> +			*/
> +			if (creq->algo_le) {
> +				__le32 *result = (void *)ahashreq->result;
> +
> +				for (i = 0; i < digsize / 4; i++)
> +					result[i] = cpu_to_le32(creq->state[i]);
> +			} else {
> +				__be32 *result = (void *)ahashreq->result;
>  
> -			for (i = 0; i < digsize / 4; i++)
> -				result[i] = cpu_to_be32(creq->state[i]);
> +				for (i = 0; i < digsize / 4; i++)
> +					result[i] = cpu_to_be32(creq->state[i]);
> +			}
>  		}
>  	}
>  
> @@ -504,6 +520,12 @@ mv_cesa_ahash_dma_last_req(struct mv_cesa_tdma_chain *chain,
>  						CESA_SA_DESC_CFG_LAST_FRAG,
>  				      CESA_SA_DESC_CFG_FRAG_MSK);
>  
> +		ret = mv_cesa_dma_add_result_op(chain,
> +						CESA_SA_CFG_SRAM_OFFSET,
> +						CESA_SA_DATA_SRAM_OFFSET,
> +						CESA_TDMA_SRC_IN_SRAM, flags);
> +		if (ret)
> +			return ERR_PTR(-ENOMEM);
>  		return op;
>  	}
>  
> @@ -564,6 +586,7 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
>  	struct mv_cesa_op_ctx *op = NULL;
>  	unsigned int frag_len;
>  	int ret;
> +	u32 type;
>  
>  	basereq->chain.first = NULL;
>  	basereq->chain.last = NULL;
> @@ -635,7 +658,15 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
>  		goto err_free_tdma;
>  	}
>  
> -	if (op) {
> +	/*
> +	 * If results are copied via DMA, this means that this
> +	 * request can be directly processed by the engine,
> +	 * without partial updates. So we can chain it at the
> +	 * DMA level with other requests.
> +	 */

Can you move this comment where it really belongs: when you
conditionally set the CESA_TDMA_BREAK_CHAIN flag.

> +	type = basereq->chain.last->flags & CESA_TDMA_TYPE_MSK;
> +
> +	if (op && type != CESA_TDMA_RESULT) {
>  		/* Add dummy desc to wait for crypto operation end */
>  		ret = mv_cesa_dma_add_dummy_end(&basereq->chain, flags);
>  		if (ret)
> @@ -648,8 +679,10 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
>  	else
>  		creq->cache_ptr = 0;
>  
> -	basereq->chain.last->flags |= (CESA_TDMA_END_OF_REQ |
> -				       CESA_TDMA_BREAK_CHAIN);
> +	basereq->chain.last->flags |= CESA_TDMA_END_OF_REQ;
> +
> +	if (type != CESA_TDMA_RESULT)
> +		basereq->chain.last->flags |= CESA_TDMA_BREAK_CHAIN;
>  
>  	return 0;
>  

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v4 0/2] Improve DMA chaining for ahash requests
  2016-10-05  7:56 [PATCH v4 0/2] Improve DMA chaining for ahash requests Romain Perier
  2016-10-05  7:56 ` [PATCH v4 1/2] crypto: marvell - Use an unique pool to copy results of requests Romain Perier
  2016-10-05  7:56 ` [PATCH v4 2/2] crypto: marvell - Don't break chain for computable last ahash requests Romain Perier
@ 2016-10-21  3:13 ` Herbert Xu
  2 siblings, 0 replies; 5+ messages in thread
From: Herbert Xu @ 2016-10-21  3:13 UTC (permalink / raw)
  To: Romain Perier
  Cc: Boris Brezillon, Arnaud Ebalard, David S. Miller,
	Thomas Petazzoni, Jason Cooper, Andrew Lunn,
	Sebastian Hesselbarth, Gregory Clement, Nadav Haklai,
	Ofer Heifetz, linux-crypto, linux-arm-kernel

On Wed, Oct 05, 2016 at 09:56:31AM +0200, Romain Perier wrote:
> This series contain performance improvement regarding ahash requests.
> So far, ahash requests were systematically not chained at the DMA level.
> However, in some case, like this is the case by using IPSec, some ahash
> requests can be processed directly by the engine, and don't have
> intermediaire partial update states.
> 
> This series firstly re-work the way outer IVs are copied from the SRAM
> into the dma pool. To do so, we introduce a common dma pool for all type
> of requests that contains outer results (like IV or digest). Then, for
> ahash requests that can be processed directly by the engine, outer
> results are copied from the SRAM into the common dma pool. These requests
> are then allowed to be chained at the DMA level.
> 
> 
> Benchmarking results with iperf throught IPSec
> ==============================================
> 		ESP			AH
> 
> Before		343 Mbits/s		492 Mbits/s
> After		422 Mbits/s		577 Mbits/s
> Improvement	+23%			+17%
> 
> Romain Perier (2):
>   crypto: marvell - Use an unique pool to copy results of requests
>   crypto: marvell - Don't break chain for computable last ahash requests
> 
>  drivers/crypto/marvell/cesa.c   |  4 ---
>  drivers/crypto/marvell/cesa.h   |  5 ++--
>  drivers/crypto/marvell/cipher.c |  8 +++--
>  drivers/crypto/marvell/hash.c   | 65 +++++++++++++++++++++++++++++++----------
>  drivers/crypto/marvell/tdma.c   | 28 +++++++++---------
>  5 files changed, 70 insertions(+), 40 deletions(-)

All applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-10-21  3:13 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-05  7:56 [PATCH v4 0/2] Improve DMA chaining for ahash requests Romain Perier
2016-10-05  7:56 ` [PATCH v4 1/2] crypto: marvell - Use an unique pool to copy results of requests Romain Perier
2016-10-05  7:56 ` [PATCH v4 2/2] crypto: marvell - Don't break chain for computable last ahash requests Romain Perier
2016-10-14 10:40   ` Boris Brezillon
2016-10-21  3:13 ` [PATCH v4 0/2] Improve DMA chaining for " Herbert Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).