* [PATCH 0/4] crypto: ccp - selective algorithm registration and ipsec-related fixes
@ 2014-01-24 22:17 Tom Lendacky
  2014-01-24 22:17 ` [PATCH 1/4] crypto: ccp - Allow for selective disablement of crypto API algorithms Tom Lendacky
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Tom Lendacky @ 2014-01-24 22:17 UTC (permalink / raw)
  To: linux-crypto, herbert, davem; +Cc: linux-kernel

Patch 1: Allow the registration of an algorithm family (the SHA or AES
algorithms) to be selectively disabled via module parameters.

Patches 2-4: Fix errors/issues that were found during IPSec testing. In
order to prevent deadlocks with the networking code, the crypto callback
was changed to run as a tasklet.  For the callback to be able to run as
a tasklet, the HMAC calculation needed to be moved out of the callback
path (since it sleeps) and into the CCP SHA operation logic.  Additionally,
trying to allow concurrency within a tfm while maintaining per-CPU
serialization of that tfm was not working properly, so a single queue is
now used.

This patch series is based on the cryptodev-2.6 kernel tree.

---

Tom Lendacky (4):
      crypto: ccp - Allow for selective disablement of crypto API algorithms
      crypto: ccp - Move HMAC calculation down to ccp ops file
      crypto: ccp - Use a single queue for proper ordering of tfm requests
      crypto: ccp - Perform completion callbacks using a tasklet


 drivers/crypto/ccp/ccp-crypto-main.c |  201 ++++++++++++----------------------
 drivers/crypto/ccp/ccp-crypto-sha.c  |  130 ++++------------------
 drivers/crypto/ccp/ccp-crypto.h      |    8 +
 drivers/crypto/ccp/ccp-dev.c         |   21 +++-
 drivers/crypto/ccp/ccp-ops.c         |  104 +++++++++++++++++-
 include/linux/ccp.h                  |    7 +
 6 files changed, 229 insertions(+), 242 deletions(-)

-- 
Tom Lendacky



* [PATCH 1/4] crypto: ccp - Allow for selective disablement of crypto API algorithms
  2014-01-24 22:17 [PATCH 0/4] crypto: ccp - selective algorithm registration and ipsec-related fixes Tom Lendacky
@ 2014-01-24 22:17 ` Tom Lendacky
  2014-01-24 22:18 ` [PATCH 2/4] crypto: ccp - Move HMAC calculation down to ccp ops file Tom Lendacky
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Tom Lendacky @ 2014-01-24 22:17 UTC (permalink / raw)
  To: linux-crypto, herbert, davem; +Cc: linux-kernel

Introduce module parameters that allow an algorithm family
to be disabled by not registering those algorithms with the
crypto API.
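
Both parameters are read-only (0444), so they are intended to be set
at module load time; for example, loading the module with
sha_disable=1 registers only the AES-based algorithms.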

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 drivers/crypto/ccp/ccp-crypto-main.c |   37 +++++++++++++++++++++++-----------
 1 file changed, 25 insertions(+), 12 deletions(-)

diff --git a/drivers/crypto/ccp/ccp-crypto-main.c b/drivers/crypto/ccp/ccp-crypto-main.c
index 2636f04..b3f22b0 100644
--- a/drivers/crypto/ccp/ccp-crypto-main.c
+++ b/drivers/crypto/ccp/ccp-crypto-main.c
@@ -11,6 +11,7 @@
  */
 
 #include <linux/module.h>
+#include <linux/moduleparam.h>
 #include <linux/kernel.h>
 #include <linux/list.h>
 #include <linux/ccp.h>
@@ -24,6 +25,14 @@ MODULE_LICENSE("GPL");
 MODULE_VERSION("1.0.0");
 MODULE_DESCRIPTION("AMD Cryptographic Coprocessor crypto API support");
 
+static unsigned int aes_disable;
+module_param(aes_disable, uint, 0444);
+MODULE_PARM_DESC(aes_disable, "Disable use of AES - any non-zero value");
+
+static unsigned int sha_disable;
+module_param(sha_disable, uint, 0444);
+MODULE_PARM_DESC(sha_disable, "Disable use of SHA - any non-zero value");
+
 
 /* List heads for the supported algorithms */
 static LIST_HEAD(hash_algs);
@@ -337,21 +346,25 @@ static int ccp_register_algs(void)
 {
 	int ret;
 
-	ret = ccp_register_aes_algs(&cipher_algs);
-	if (ret)
-		return ret;
+	if (!aes_disable) {
+		ret = ccp_register_aes_algs(&cipher_algs);
+		if (ret)
+			return ret;
 
-	ret = ccp_register_aes_cmac_algs(&hash_algs);
-	if (ret)
-		return ret;
+		ret = ccp_register_aes_cmac_algs(&hash_algs);
+		if (ret)
+			return ret;
 
-	ret = ccp_register_aes_xts_algs(&cipher_algs);
-	if (ret)
-		return ret;
+		ret = ccp_register_aes_xts_algs(&cipher_algs);
+		if (ret)
+			return ret;
+	}
 
-	ret = ccp_register_sha_algs(&hash_algs);
-	if (ret)
-		return ret;
+	if (!sha_disable) {
+		ret = ccp_register_sha_algs(&hash_algs);
+		if (ret)
+			return ret;
+	}
 
 	return 0;
 }




* [PATCH 2/4] crypto: ccp - Move HMAC calculation down to ccp ops file
  2014-01-24 22:17 [PATCH 0/4] crypto: ccp - selective algorithm registration and ipsec-related fixes Tom Lendacky
  2014-01-24 22:17 ` [PATCH 1/4] crypto: ccp - Allow for selective disablement of crypto API algorithms Tom Lendacky
@ 2014-01-24 22:18 ` Tom Lendacky
  2014-01-24 22:18 ` [PATCH 3/4] crypto: ccp - Use a single queue for proper ordering of tfm requests Tom Lendacky
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Tom Lendacky @ 2014-01-24 22:18 UTC (permalink / raw)
  To: linux-crypto, herbert, davem; +Cc: linux-kernel

Move the support for performing an HMAC calculation into
the CCP operations file.  This eliminates the need to
perform a synchronous SHA operation in the request
completion callback in order to calculate the HMAC.
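
For reference, the step being moved is the final (outer) hash of the
standard HMAC construction (with the key zero-padded, or pre-hashed,
to the block size):

	HMAC(K, m) = H((K xor opad) || H((K xor ipad) || m))

Once the CCP has produced the inner digest, all that remains is one
SHA pass over the opad-XORed key block followed by that digest, which
is what the recursive ccp_run_sha_cmd() call added below performs.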

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 drivers/crypto/ccp/ccp-crypto-sha.c |  130 +++++++----------------------------
 drivers/crypto/ccp/ccp-crypto.h     |    8 +-
 drivers/crypto/ccp/ccp-ops.c        |  104 ++++++++++++++++++++++++++++
 include/linux/ccp.h                 |    7 ++
 4 files changed, 139 insertions(+), 110 deletions(-)

diff --git a/drivers/crypto/ccp/ccp-crypto-sha.c b/drivers/crypto/ccp/ccp-crypto-sha.c
index 3867290..873f234 100644
--- a/drivers/crypto/ccp/ccp-crypto-sha.c
+++ b/drivers/crypto/ccp/ccp-crypto-sha.c
@@ -24,75 +24,10 @@
 #include "ccp-crypto.h"
 
 
-struct ccp_sha_result {
-	struct completion completion;
-	int err;
-};
-
-static void ccp_sync_hash_complete(struct crypto_async_request *req, int err)
-{
-	struct ccp_sha_result *result = req->data;
-
-	if (err == -EINPROGRESS)
-		return;
-
-	result->err = err;
-	complete(&result->completion);
-}
-
-static int ccp_sync_hash(struct crypto_ahash *tfm, u8 *buf,
-			 struct scatterlist *sg, unsigned int len)
-{
-	struct ccp_sha_result result;
-	struct ahash_request *req;
-	int ret;
-
-	init_completion(&result.completion);
-
-	req = ahash_request_alloc(tfm, GFP_KERNEL);
-	if (!req)
-		return -ENOMEM;
-
-	ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
-				   ccp_sync_hash_complete, &result);
-	ahash_request_set_crypt(req, sg, buf, len);
-
-	ret = crypto_ahash_digest(req);
-	if ((ret == -EINPROGRESS) || (ret == -EBUSY)) {
-		ret = wait_for_completion_interruptible(&result.completion);
-		if (!ret)
-			ret = result.err;
-	}
-
-	ahash_request_free(req);
-
-	return ret;
-}
-
-static int ccp_sha_finish_hmac(struct crypto_async_request *async_req)
-{
-	struct ahash_request *req = ahash_request_cast(async_req);
-	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
-	struct ccp_ctx *ctx = crypto_ahash_ctx(tfm);
-	struct ccp_sha_req_ctx *rctx = ahash_request_ctx(req);
-	struct scatterlist sg[2];
-	unsigned int block_size =
-		crypto_tfm_alg_blocksize(crypto_ahash_tfm(tfm));
-	unsigned int digest_size = crypto_ahash_digestsize(tfm);
-
-	sg_init_table(sg, ARRAY_SIZE(sg));
-	sg_set_buf(&sg[0], ctx->u.sha.opad, block_size);
-	sg_set_buf(&sg[1], rctx->ctx, digest_size);
-
-	return ccp_sync_hash(ctx->u.sha.hmac_tfm, req->result, sg,
-			     block_size + digest_size);
-}
-
 static int ccp_sha_complete(struct crypto_async_request *async_req, int ret)
 {
 	struct ahash_request *req = ahash_request_cast(async_req);
 	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
-	struct ccp_ctx *ctx = crypto_ahash_ctx(tfm);
 	struct ccp_sha_req_ctx *rctx = ahash_request_ctx(req);
 	unsigned int digest_size = crypto_ahash_digestsize(tfm);
 
@@ -112,10 +47,6 @@ static int ccp_sha_complete(struct crypto_async_request *async_req, int ret)
 	if (req->result)
 		memcpy(req->result, rctx->ctx, digest_size);
 
-	/* If we're doing an HMAC, we need to perform that on the final op */
-	if (rctx->final && ctx->u.sha.key_len)
-		ret = ccp_sha_finish_hmac(async_req);
-
 e_free:
 	sg_free_table(&rctx->data_sg);
 
@@ -126,6 +57,7 @@ static int ccp_do_sha_update(struct ahash_request *req, unsigned int nbytes,
 			     unsigned int final)
 {
 	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct ccp_ctx *ctx = crypto_ahash_ctx(tfm);
 	struct ccp_sha_req_ctx *rctx = ahash_request_ctx(req);
 	struct scatterlist *sg;
 	unsigned int block_size =
@@ -196,6 +128,11 @@ static int ccp_do_sha_update(struct ahash_request *req, unsigned int nbytes,
 	rctx->cmd.u.sha.ctx_len = sizeof(rctx->ctx);
 	rctx->cmd.u.sha.src = sg;
 	rctx->cmd.u.sha.src_len = rctx->hash_cnt;
+	rctx->cmd.u.sha.opad = ctx->u.sha.key_len ?
+		&ctx->u.sha.opad_sg : NULL;
+	rctx->cmd.u.sha.opad_len = ctx->u.sha.key_len ?
+		ctx->u.sha.opad_count : 0;
+	rctx->cmd.u.sha.first = rctx->first;
 	rctx->cmd.u.sha.final = rctx->final;
 	rctx->cmd.u.sha.msg_bits = rctx->msg_bits;
 
@@ -218,7 +155,6 @@ static int ccp_sha_init(struct ahash_request *req)
 
 	memset(rctx, 0, sizeof(*rctx));
 
-	memcpy(rctx->ctx, alg->init, sizeof(rctx->ctx));
 	rctx->type = alg->type;
 	rctx->first = 1;
 
@@ -261,10 +197,13 @@ static int ccp_sha_setkey(struct crypto_ahash *tfm, const u8 *key,
 			  unsigned int key_len)
 {
 	struct ccp_ctx *ctx = crypto_tfm_ctx(crypto_ahash_tfm(tfm));
-	struct scatterlist sg;
-	unsigned int block_size =
-		crypto_tfm_alg_blocksize(crypto_ahash_tfm(tfm));
-	unsigned int digest_size = crypto_ahash_digestsize(tfm);
+	struct crypto_shash *shash = ctx->u.sha.hmac_tfm;
+	struct {
+		struct shash_desc sdesc;
+		char ctx[crypto_shash_descsize(shash)];
+	} desc;
+	unsigned int block_size = crypto_shash_blocksize(shash);
+	unsigned int digest_size = crypto_shash_digestsize(shash);
 	int i, ret;
 
 	/* Set to zero until complete */
@@ -277,8 +216,12 @@ static int ccp_sha_setkey(struct crypto_ahash *tfm, const u8 *key,
 
 	if (key_len > block_size) {
 		/* Must hash the input key */
-		sg_init_one(&sg, key, key_len);
-		ret = ccp_sync_hash(tfm, ctx->u.sha.key, &sg, key_len);
+		desc.sdesc.tfm = shash;
+		desc.sdesc.flags = crypto_ahash_get_flags(tfm) &
+			CRYPTO_TFM_REQ_MAY_SLEEP;
+
+		ret = crypto_shash_digest(&desc.sdesc, key, key_len,
+					  ctx->u.sha.key);
 		if (ret) {
 			crypto_ahash_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
 			return -EINVAL;
@@ -293,6 +236,9 @@ static int ccp_sha_setkey(struct crypto_ahash *tfm, const u8 *key,
 		ctx->u.sha.opad[i] = ctx->u.sha.key[i] ^ 0x5c;
 	}
 
+	sg_init_one(&ctx->u.sha.opad_sg, ctx->u.sha.opad, block_size);
+	ctx->u.sha.opad_count = block_size;
+
 	ctx->u.sha.key_len = key_len;
 
 	return 0;
@@ -319,10 +265,9 @@ static int ccp_hmac_sha_cra_init(struct crypto_tfm *tfm)
 {
 	struct ccp_ctx *ctx = crypto_tfm_ctx(tfm);
 	struct ccp_crypto_ahash_alg *alg = ccp_crypto_ahash_alg(tfm);
-	struct crypto_ahash *hmac_tfm;
+	struct crypto_shash *hmac_tfm;
 
-	hmac_tfm = crypto_alloc_ahash(alg->child_alg,
-				      CRYPTO_ALG_TYPE_AHASH, 0);
+	hmac_tfm = crypto_alloc_shash(alg->child_alg, 0, 0);
 	if (IS_ERR(hmac_tfm)) {
 		pr_warn("could not load driver %s need for HMAC support\n",
 			alg->child_alg);
@@ -339,35 +284,14 @@ static void ccp_hmac_sha_cra_exit(struct crypto_tfm *tfm)
 	struct ccp_ctx *ctx = crypto_tfm_ctx(tfm);
 
 	if (ctx->u.sha.hmac_tfm)
-		crypto_free_ahash(ctx->u.sha.hmac_tfm);
+		crypto_free_shash(ctx->u.sha.hmac_tfm);
 
 	ccp_sha_cra_exit(tfm);
 }
 
-static const __be32 sha1_init[CCP_SHA_CTXSIZE / sizeof(__be32)] = {
-	cpu_to_be32(SHA1_H0), cpu_to_be32(SHA1_H1),
-	cpu_to_be32(SHA1_H2), cpu_to_be32(SHA1_H3),
-	cpu_to_be32(SHA1_H4), 0, 0, 0,
-};
-
-static const __be32 sha224_init[CCP_SHA_CTXSIZE / sizeof(__be32)] = {
-	cpu_to_be32(SHA224_H0), cpu_to_be32(SHA224_H1),
-	cpu_to_be32(SHA224_H2), cpu_to_be32(SHA224_H3),
-	cpu_to_be32(SHA224_H4), cpu_to_be32(SHA224_H5),
-	cpu_to_be32(SHA224_H6), cpu_to_be32(SHA224_H7),
-};
-
-static const __be32 sha256_init[CCP_SHA_CTXSIZE / sizeof(__be32)] = {
-	cpu_to_be32(SHA256_H0), cpu_to_be32(SHA256_H1),
-	cpu_to_be32(SHA256_H2), cpu_to_be32(SHA256_H3),
-	cpu_to_be32(SHA256_H4), cpu_to_be32(SHA256_H5),
-	cpu_to_be32(SHA256_H6), cpu_to_be32(SHA256_H7),
-};
-
 struct ccp_sha_def {
 	const char *name;
 	const char *drv_name;
-	const __be32 *init;
 	enum ccp_sha_type type;
 	u32 digest_size;
 	u32 block_size;
@@ -377,7 +301,6 @@ static struct ccp_sha_def sha_algs[] = {
 	{
 		.name		= "sha1",
 		.drv_name	= "sha1-ccp",
-		.init		= sha1_init,
 		.type		= CCP_SHA_TYPE_1,
 		.digest_size	= SHA1_DIGEST_SIZE,
 		.block_size	= SHA1_BLOCK_SIZE,
@@ -385,7 +308,6 @@ static struct ccp_sha_def sha_algs[] = {
 	{
 		.name		= "sha224",
 		.drv_name	= "sha224-ccp",
-		.init		= sha224_init,
 		.type		= CCP_SHA_TYPE_224,
 		.digest_size	= SHA224_DIGEST_SIZE,
 		.block_size	= SHA224_BLOCK_SIZE,
@@ -393,7 +315,6 @@ static struct ccp_sha_def sha_algs[] = {
 	{
 		.name		= "sha256",
 		.drv_name	= "sha256-ccp",
-		.init		= sha256_init,
 		.type		= CCP_SHA_TYPE_256,
 		.digest_size	= SHA256_DIGEST_SIZE,
 		.block_size	= SHA256_BLOCK_SIZE,
@@ -460,7 +381,6 @@ static int ccp_register_sha_alg(struct list_head *head,
 
 	INIT_LIST_HEAD(&ccp_alg->entry);
 
-	ccp_alg->init = def->init;
 	ccp_alg->type = def->type;
 
 	alg = &ccp_alg->alg;
diff --git a/drivers/crypto/ccp/ccp-crypto.h b/drivers/crypto/ccp/ccp-crypto.h
index b222231..9aa4ae1 100644
--- a/drivers/crypto/ccp/ccp-crypto.h
+++ b/drivers/crypto/ccp/ccp-crypto.h
@@ -137,11 +137,14 @@ struct ccp_aes_cmac_req_ctx {
 #define MAX_SHA_BLOCK_SIZE	SHA256_BLOCK_SIZE
 
 struct ccp_sha_ctx {
+	struct scatterlist opad_sg;
+	unsigned int opad_count;
+
 	unsigned int key_len;
 	u8 key[MAX_SHA_BLOCK_SIZE];
 	u8 ipad[MAX_SHA_BLOCK_SIZE];
 	u8 opad[MAX_SHA_BLOCK_SIZE];
-	struct crypto_ahash *hmac_tfm;
+	struct crypto_shash *hmac_tfm;
 };
 
 struct ccp_sha_req_ctx {
@@ -167,9 +170,6 @@ struct ccp_sha_req_ctx {
 	unsigned int buf_count;
 	u8 buf[MAX_SHA_BLOCK_SIZE];
 
-	/* HMAC support field */
-	struct scatterlist pad_sg;
-
 	/* CCP driver command */
 	struct ccp_cmd cmd;
 };
diff --git a/drivers/crypto/ccp/ccp-ops.c b/drivers/crypto/ccp/ccp-ops.c
index 71ed3ad..c3462e5 100644
--- a/drivers/crypto/ccp/ccp-ops.c
+++ b/drivers/crypto/ccp/ccp-ops.c
@@ -23,6 +23,7 @@
 #include <linux/ccp.h>
 #include <linux/scatterlist.h>
 #include <crypto/scatterwalk.h>
+#include <crypto/sha.h>
 
 #include "ccp-dev.h"
 
@@ -132,6 +133,27 @@ struct ccp_op {
 	} u;
 };
 
+/* SHA initial context values */
+static const __be32 ccp_sha1_init[CCP_SHA_CTXSIZE / sizeof(__be32)] = {
+	cpu_to_be32(SHA1_H0), cpu_to_be32(SHA1_H1),
+	cpu_to_be32(SHA1_H2), cpu_to_be32(SHA1_H3),
+	cpu_to_be32(SHA1_H4), 0, 0, 0,
+};
+
+static const __be32 ccp_sha224_init[CCP_SHA_CTXSIZE / sizeof(__be32)] = {
+	cpu_to_be32(SHA224_H0), cpu_to_be32(SHA224_H1),
+	cpu_to_be32(SHA224_H2), cpu_to_be32(SHA224_H3),
+	cpu_to_be32(SHA224_H4), cpu_to_be32(SHA224_H5),
+	cpu_to_be32(SHA224_H6), cpu_to_be32(SHA224_H7),
+};
+
+static const __be32 ccp_sha256_init[CCP_SHA_CTXSIZE / sizeof(__be32)] = {
+	cpu_to_be32(SHA256_H0), cpu_to_be32(SHA256_H1),
+	cpu_to_be32(SHA256_H2), cpu_to_be32(SHA256_H3),
+	cpu_to_be32(SHA256_H4), cpu_to_be32(SHA256_H5),
+	cpu_to_be32(SHA256_H6), cpu_to_be32(SHA256_H7),
+};
+
 /* The CCP cannot perform zero-length sha operations so the caller
  * is required to buffer data for the final operation.  However, a
  * sha operation for a message with a total length of zero is valid
@@ -1411,7 +1433,27 @@ static int ccp_run_sha_cmd(struct ccp_cmd_queue *cmd_q, struct ccp_cmd *cmd)
 	if (ret)
 		return ret;
 
-	ccp_set_dm_area(&ctx, 0, sha->ctx, 0, sha->ctx_len);
+	if (sha->first) {
+		const __be32 *init;
+
+		switch (sha->type) {
+		case CCP_SHA_TYPE_1:
+			init = ccp_sha1_init;
+			break;
+		case CCP_SHA_TYPE_224:
+			init = ccp_sha224_init;
+			break;
+		case CCP_SHA_TYPE_256:
+			init = ccp_sha256_init;
+			break;
+		default:
+			ret = -EINVAL;
+			goto e_ctx;
+		}
+		memcpy(ctx.address, init, CCP_SHA_CTXSIZE);
+	} else
+		ccp_set_dm_area(&ctx, 0, sha->ctx, 0, sha->ctx_len);
+
 	ret = ccp_copy_to_ksb(cmd_q, &ctx, op.jobid, op.ksb_ctx,
 			      CCP_PASSTHRU_BYTESWAP_256BIT);
 	if (ret) {
@@ -1451,6 +1493,66 @@ static int ccp_run_sha_cmd(struct ccp_cmd_queue *cmd_q, struct ccp_cmd *cmd)
 
 	ccp_get_dm_area(&ctx, 0, sha->ctx, 0, sha->ctx_len);
 
+	if (sha->final && sha->opad) {
+		/* HMAC operation, recursively perform final SHA */
+		struct ccp_cmd hmac_cmd;
+		struct scatterlist sg;
+		u64 block_size, digest_size;
+		u8 *hmac_buf;
+
+		switch (sha->type) {
+		case CCP_SHA_TYPE_1:
+			block_size = SHA1_BLOCK_SIZE;
+			digest_size = SHA1_DIGEST_SIZE;
+			break;
+		case CCP_SHA_TYPE_224:
+			block_size = SHA224_BLOCK_SIZE;
+			digest_size = SHA224_DIGEST_SIZE;
+			break;
+		case CCP_SHA_TYPE_256:
+			block_size = SHA256_BLOCK_SIZE;
+			digest_size = SHA256_DIGEST_SIZE;
+			break;
+		default:
+			ret = -EINVAL;
+			goto e_data;
+		}
+
+		if (sha->opad_len != block_size) {
+			ret = -EINVAL;
+			goto e_data;
+		}
+
+		hmac_buf = kmalloc(block_size + digest_size, GFP_KERNEL);
+		if (!hmac_buf) {
+			ret = -ENOMEM;
+			goto e_data;
+		}
+		sg_init_one(&sg, hmac_buf, block_size + digest_size);
+
+		scatterwalk_map_and_copy(hmac_buf, sha->opad, 0, block_size, 0);
+		memcpy(hmac_buf + block_size, ctx.address, digest_size);
+
+		memset(&hmac_cmd, 0, sizeof(hmac_cmd));
+		hmac_cmd.engine = CCP_ENGINE_SHA;
+		hmac_cmd.u.sha.type = sha->type;
+		hmac_cmd.u.sha.ctx = sha->ctx;
+		hmac_cmd.u.sha.ctx_len = sha->ctx_len;
+		hmac_cmd.u.sha.src = &sg;
+		hmac_cmd.u.sha.src_len = block_size + digest_size;
+		hmac_cmd.u.sha.opad = NULL;
+		hmac_cmd.u.sha.opad_len = 0;
+		hmac_cmd.u.sha.first = 1;
+		hmac_cmd.u.sha.final = 1;
+		hmac_cmd.u.sha.msg_bits = (block_size + digest_size) << 3;
+
+		ret = ccp_run_sha_cmd(cmd_q, &hmac_cmd);
+		if (ret)
+			cmd->engine_error = hmac_cmd.engine_error;
+
+		kfree(hmac_buf);
+	}
+
 e_data:
 	ccp_free_data(&src, cmd_q);
 
diff --git a/include/linux/ccp.h b/include/linux/ccp.h
index b941ab9..ebcc9d1 100644
--- a/include/linux/ccp.h
+++ b/include/linux/ccp.h
@@ -232,6 +232,9 @@ enum ccp_sha_type {
  * @ctx_len: length in bytes of hash value
  * @src: data to be used for this operation
  * @src_len: length in bytes of data used for this operation
+ * @opad: data to be used for final HMAC operation
+ * @opad_len: length in bytes of data used for final HMAC operation
+ * @first: indicates first SHA operation
  * @final: indicates final SHA operation
  * @msg_bits: total length of the message in bits used in final SHA operation
  *
@@ -251,6 +254,10 @@ struct ccp_sha_engine {
 	struct scatterlist *src;
 	u64 src_len;		/* In bytes */
 
+	struct scatterlist *opad;
+	u32 opad_len;		/* In bytes */
+
+	u32 first;		/* Indicates first sha cmd */
 	u32 final;		/* Indicates final sha cmd */
 	u64 msg_bits;		/* Message length in bits required for
 				 * final sha cmd */




* [PATCH 3/4] crypto: ccp - Use a single queue for proper ordering of tfm requests
  2014-01-24 22:17 [PATCH 0/4] crypto: ccp - selective algorithm registration and ipsec-related fixes Tom Lendacky
  2014-01-24 22:17 ` [PATCH 1/4] crypto: ccp - Allow for selective disablement of crypto API algorithms Tom Lendacky
  2014-01-24 22:18 ` [PATCH 2/4] crypto: ccp - Move HMAC calculation down to ccp ops file Tom Lendacky
@ 2014-01-24 22:18 ` Tom Lendacky
  2014-01-24 22:18 ` [PATCH 4/4] crypto: ccp - Perform completion callbacks using a tasklet Tom Lendacky
  2014-02-09  9:21 ` [PATCH 0/4] crypto: ccp - selective algorithm registration and ipsec-related fixes Herbert Xu
  4 siblings, 0 replies; 6+ messages in thread
From: Tom Lendacky @ 2014-01-24 22:18 UTC (permalink / raw)
  To: linux-crypto, herbert, davem; +Cc: linux-kernel

Move to a single queue to serialize requests within a tfm.  When
testing IPSec with a large number of network connections, the
per-CPU tfm queuing logic was not working properly.
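
The enqueue path below reduces to roughly this sketch (backlog and
error handling omitted):

	spin_lock_irqsave(&req_queue_lock, flags);

	/* Submit to the CCP only if no earlier cmd for the same tfm
	 * is still on the list; a held cmd is submitted later, from
	 * the completion path of the cmd ahead of it.
	 */
	list_for_each_entry(tmp, &req_queue.cmds, entry) {
		if (crypto_cmd->tfm != tmp->tfm)
			continue;
		active = tmp;
		break;
	}
	if (!active)
		ret = ccp_enqueue_cmd(crypto_cmd->cmd);

	req_queue.cmd_count++;
	list_add_tail(&crypto_cmd->entry, &req_queue.cmds);

	spin_unlock_irqrestore(&req_queue_lock, flags);

On completion, ccp_crypto_cmd_complete() searches the same list for
the next cmd with a matching tfm so that it can be submitted,
preserving per-tfm ordering while still allowing different tfms to
use the CCP's multiple hardware queues concurrently.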

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 drivers/crypto/ccp/ccp-crypto-main.c |  164 ++++++++++------------------------
 1 file changed, 48 insertions(+), 116 deletions(-)

diff --git a/drivers/crypto/ccp/ccp-crypto-main.c b/drivers/crypto/ccp/ccp-crypto-main.c
index b3f22b0..010fded 100644
--- a/drivers/crypto/ccp/ccp-crypto-main.c
+++ b/drivers/crypto/ccp/ccp-crypto-main.c
@@ -38,23 +38,20 @@ MODULE_PARM_DESC(sha_disable, "Disable use of SHA - any non-zero value");
 static LIST_HEAD(hash_algs);
 static LIST_HEAD(cipher_algs);
 
-/* For any tfm, requests for that tfm on the same CPU must be returned
- * in the order received.  With multiple queues available, the CCP can
- * process more than one cmd at a time.  Therefore we must maintain
- * a cmd list to insure the proper ordering of requests on a given tfm/cpu
- * combination.
+/* For any tfm, requests for that tfm must be returned on the order
+ * received.  With multiple queues available, the CCP can process more
+ * than one cmd at a time.  Therefore we must maintain a cmd list to insure
+ * the proper ordering of requests on a given tfm.
  */
-struct ccp_crypto_cpu_queue {
+struct ccp_crypto_queue {
 	struct list_head cmds;
 	struct list_head *backlog;
 	unsigned int cmd_count;
 };
-#define CCP_CRYPTO_MAX_QLEN	50
+#define CCP_CRYPTO_MAX_QLEN	100
 
-struct ccp_crypto_percpu_queue {
-	struct ccp_crypto_cpu_queue __percpu *cpu_queue;
-};
-static struct ccp_crypto_percpu_queue req_queue;
+static struct ccp_crypto_queue req_queue;
+static spinlock_t req_queue_lock;
 
 struct ccp_crypto_cmd {
 	struct list_head entry;
@@ -71,8 +68,6 @@ struct ccp_crypto_cmd {
 
 	/* Used for held command processing to determine state */
 	int ret;
-
-	int cpu;
 };
 
 struct ccp_crypto_cpu {
@@ -91,25 +86,21 @@ static inline bool ccp_crypto_success(int err)
 	return true;
 }
 
-/*
- * ccp_crypto_cmd_complete must be called while running on the appropriate
- * cpu and the caller must have done a get_cpu to disable preemption
- */
 static struct ccp_crypto_cmd *ccp_crypto_cmd_complete(
 	struct ccp_crypto_cmd *crypto_cmd, struct ccp_crypto_cmd **backlog)
 {
-	struct ccp_crypto_cpu_queue *cpu_queue;
 	struct ccp_crypto_cmd *held = NULL, *tmp;
+	unsigned long flags;
 
 	*backlog = NULL;
 
-	cpu_queue = this_cpu_ptr(req_queue.cpu_queue);
+	spin_lock_irqsave(&req_queue_lock, flags);
 
 	/* Held cmds will be after the current cmd in the queue so start
 	 * searching for a cmd with a matching tfm for submission.
 	 */
 	tmp = crypto_cmd;
-	list_for_each_entry_continue(tmp, &cpu_queue->cmds, entry) {
+	list_for_each_entry_continue(tmp, &req_queue.cmds, entry) {
 		if (crypto_cmd->tfm != tmp->tfm)
 			continue;
 		held = tmp;
@@ -120,47 +111,45 @@ static struct ccp_crypto_cmd *ccp_crypto_cmd_complete(
 	 *   Because cmds can be executed from any point in the cmd list
 	 *   special precautions have to be taken when handling the backlog.
 	 */
-	if (cpu_queue->backlog != &cpu_queue->cmds) {
+	if (req_queue.backlog != &req_queue.cmds) {
 		/* Skip over this cmd if it is the next backlog cmd */
-		if (cpu_queue->backlog == &crypto_cmd->entry)
-			cpu_queue->backlog = crypto_cmd->entry.next;
+		if (req_queue.backlog == &crypto_cmd->entry)
+			req_queue.backlog = crypto_cmd->entry.next;
 
-		*backlog = container_of(cpu_queue->backlog,
+		*backlog = container_of(req_queue.backlog,
 					struct ccp_crypto_cmd, entry);
-		cpu_queue->backlog = cpu_queue->backlog->next;
+		req_queue.backlog = req_queue.backlog->next;
 
 		/* Skip over this cmd if it is now the next backlog cmd */
-		if (cpu_queue->backlog == &crypto_cmd->entry)
-			cpu_queue->backlog = crypto_cmd->entry.next;
+		if (req_queue.backlog == &crypto_cmd->entry)
+			req_queue.backlog = crypto_cmd->entry.next;
 	}
 
 	/* Remove the cmd entry from the list of cmds */
-	cpu_queue->cmd_count--;
+	req_queue.cmd_count--;
 	list_del(&crypto_cmd->entry);
 
+	spin_unlock_irqrestore(&req_queue_lock, flags);
+
 	return held;
 }
 
-static void ccp_crypto_complete_on_cpu(struct work_struct *work)
+static void ccp_crypto_complete(void *data, int err)
 {
-	struct ccp_crypto_cpu *cpu_work =
-		container_of(work, struct ccp_crypto_cpu, work);
-	struct ccp_crypto_cmd *crypto_cmd = cpu_work->crypto_cmd;
+	struct ccp_crypto_cmd *crypto_cmd = data;
 	struct ccp_crypto_cmd *held, *next, *backlog;
 	struct crypto_async_request *req = crypto_cmd->req;
 	struct ccp_ctx *ctx = crypto_tfm_ctx(req->tfm);
-	int cpu, ret;
-
-	cpu = get_cpu();
+	int ret;
 
-	if (cpu_work->err == -EINPROGRESS) {
+	if (err == -EINPROGRESS) {
 		/* Only propogate the -EINPROGRESS if necessary */
 		if (crypto_cmd->ret == -EBUSY) {
 			crypto_cmd->ret = -EINPROGRESS;
 			req->complete(req, -EINPROGRESS);
 		}
 
-		goto e_cpu;
+		return;
 	}
 
 	/* Operation has completed - update the queue before invoking
@@ -178,7 +167,7 @@ static void ccp_crypto_complete_on_cpu(struct work_struct *work)
 		req->complete(req, -EINPROGRESS);
 
 	/* Completion callbacks */
-	ret = cpu_work->err;
+	ret = err;
 	if (ctx->complete)
 		ret = ctx->complete(req, ret);
 	req->complete(req, ret);
@@ -203,52 +192,28 @@ static void ccp_crypto_complete_on_cpu(struct work_struct *work)
 	}
 
 	kfree(crypto_cmd);
-
-e_cpu:
-	put_cpu();
-
-	complete(&cpu_work->completion);
-}
-
-static void ccp_crypto_complete(void *data, int err)
-{
-	struct ccp_crypto_cmd *crypto_cmd = data;
-	struct ccp_crypto_cpu cpu_work;
-
-	INIT_WORK(&cpu_work.work, ccp_crypto_complete_on_cpu);
-	init_completion(&cpu_work.completion);
-	cpu_work.crypto_cmd = crypto_cmd;
-	cpu_work.err = err;
-
-	schedule_work_on(crypto_cmd->cpu, &cpu_work.work);
-
-	/* Keep the completion call synchronous */
-	wait_for_completion(&cpu_work.completion);
 }
 
 static int ccp_crypto_enqueue_cmd(struct ccp_crypto_cmd *crypto_cmd)
 {
-	struct ccp_crypto_cpu_queue *cpu_queue;
 	struct ccp_crypto_cmd *active = NULL, *tmp;
-	int cpu, ret;
-
-	cpu = get_cpu();
-	crypto_cmd->cpu = cpu;
+	unsigned long flags;
+	int ret;
 
-	cpu_queue = this_cpu_ptr(req_queue.cpu_queue);
+	spin_lock_irqsave(&req_queue_lock, flags);
 
 	/* Check if the cmd can/should be queued */
-	if (cpu_queue->cmd_count >= CCP_CRYPTO_MAX_QLEN) {
+	if (req_queue.cmd_count >= CCP_CRYPTO_MAX_QLEN) {
 		ret = -EBUSY;
 		if (!(crypto_cmd->cmd->flags & CCP_CMD_MAY_BACKLOG))
-			goto e_cpu;
+			goto e_lock;
 	}
 
 	/* Look for an entry with the same tfm.  If there is a cmd
-	 * with the same tfm in the list for this cpu then the current
-	 * cmd cannot be submitted to the CCP yet.
+	 * with the same tfm in the list then the current cmd cannot
+	 * be submitted to the CCP yet.
 	 */
-	list_for_each_entry(tmp, &cpu_queue->cmds, entry) {
+	list_for_each_entry(tmp, &req_queue.cmds, entry) {
 		if (crypto_cmd->tfm != tmp->tfm)
 			continue;
 		active = tmp;
@@ -259,21 +224,21 @@ static int ccp_crypto_enqueue_cmd(struct ccp_crypto_cmd *crypto_cmd)
 	if (!active) {
 		ret = ccp_enqueue_cmd(crypto_cmd->cmd);
 		if (!ccp_crypto_success(ret))
-			goto e_cpu;
+			goto e_lock;
 	}
 
-	if (cpu_queue->cmd_count >= CCP_CRYPTO_MAX_QLEN) {
+	if (req_queue.cmd_count >= CCP_CRYPTO_MAX_QLEN) {
 		ret = -EBUSY;
-		if (cpu_queue->backlog == &cpu_queue->cmds)
-			cpu_queue->backlog = &crypto_cmd->entry;
+		if (req_queue.backlog == &req_queue.cmds)
+			req_queue.backlog = &crypto_cmd->entry;
 	}
 	crypto_cmd->ret = ret;
 
-	cpu_queue->cmd_count++;
-	list_add_tail(&crypto_cmd->entry, &cpu_queue->cmds);
+	req_queue.cmd_count++;
+	list_add_tail(&crypto_cmd->entry, &req_queue.cmds);
 
-e_cpu:
-	put_cpu();
+e_lock:
+	spin_unlock_irqrestore(&req_queue_lock, flags);
 
 	return ret;
 }
@@ -387,50 +352,18 @@ static void ccp_unregister_algs(void)
 	}
 }
 
-static int ccp_init_queues(void)
-{
-	struct ccp_crypto_cpu_queue *cpu_queue;
-	int cpu;
-
-	req_queue.cpu_queue = alloc_percpu(struct ccp_crypto_cpu_queue);
-	if (!req_queue.cpu_queue)
-		return -ENOMEM;
-
-	for_each_possible_cpu(cpu) {
-		cpu_queue = per_cpu_ptr(req_queue.cpu_queue, cpu);
-		INIT_LIST_HEAD(&cpu_queue->cmds);
-		cpu_queue->backlog = &cpu_queue->cmds;
-		cpu_queue->cmd_count = 0;
-	}
-
-	return 0;
-}
-
-static void ccp_fini_queue(void)
-{
-	struct ccp_crypto_cpu_queue *cpu_queue;
-	int cpu;
-
-	for_each_possible_cpu(cpu) {
-		cpu_queue = per_cpu_ptr(req_queue.cpu_queue, cpu);
-		BUG_ON(!list_empty(&cpu_queue->cmds));
-	}
-	free_percpu(req_queue.cpu_queue);
-}
-
 static int ccp_crypto_init(void)
 {
 	int ret;
 
-	ret = ccp_init_queues();
-	if (ret)
-		return ret;
+	spin_lock_init(&req_queue_lock);
+	INIT_LIST_HEAD(&req_queue.cmds);
+	req_queue.backlog = &req_queue.cmds;
+	req_queue.cmd_count = 0;
 
 	ret = ccp_register_algs();
-	if (ret) {
+	if (ret)
 		ccp_unregister_algs();
-		ccp_fini_queue();
-	}
 
 	return ret;
 }
@@ -438,7 +371,6 @@ static int ccp_crypto_init(void)
 static void ccp_crypto_exit(void)
 {
 	ccp_unregister_algs();
-	ccp_fini_queue();
 }
 
 module_init(ccp_crypto_init);




* [PATCH 4/4] crypto: ccp - Perform completion callbacks using a tasklet
  2014-01-24 22:17 [PATCH 0/4] crypto: ccp - selective algorithm registration and ipsec-related fixes Tom Lendacky
                   ` (2 preceding siblings ...)
  2014-01-24 22:18 ` [PATCH 3/4] crypto: ccp - Use a single queue for proper ordering of tfm requests Tom Lendacky
@ 2014-01-24 22:18 ` Tom Lendacky
  2014-02-09  9:21 ` [PATCH 0/4] crypto: ccp - selective algorithm registration and ipsec-related fixes Herbert Xu
  4 siblings, 0 replies; 6+ messages in thread
From: Tom Lendacky @ 2014-01-24 22:18 UTC (permalink / raw)
  To: linux-crypto, herbert, davem; +Cc: linux-kernel

Change from scheduling work to scheduling a tasklet to perform
the callback operations.
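
Running the callback from a tasklet means it executes in softirq
context, which avoids the deadlocks with the networking code seen
during IPSec testing.  The command queue thread still waits on a
completion signalled at the end of the tasklet, so callbacks remain
serialized per queue and the on-stack tasklet data is not reused
while a callback is in flight.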

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 drivers/crypto/ccp/ccp-dev.c |   21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/ccp/ccp-dev.c b/drivers/crypto/ccp/ccp-dev.c
index c3bc212..2c78161 100644
--- a/drivers/crypto/ccp/ccp-dev.c
+++ b/drivers/crypto/ccp/ccp-dev.c
@@ -30,6 +30,11 @@ MODULE_LICENSE("GPL");
 MODULE_VERSION("1.0.0");
 MODULE_DESCRIPTION("AMD Cryptographic Coprocessor driver");
 
+struct ccp_tasklet_data {
+	struct completion completion;
+	struct ccp_cmd *cmd;
+};
+
 
 static struct ccp_device *ccp_dev;
 static inline struct ccp_device *ccp_get_device(void)
@@ -192,17 +197,23 @@ static struct ccp_cmd *ccp_dequeue_cmd(struct ccp_cmd_queue *cmd_q)
 	return cmd;
 }
 
-static void ccp_do_cmd_complete(struct work_struct *work)
+static void ccp_do_cmd_complete(unsigned long data)
 {
-	struct ccp_cmd *cmd = container_of(work, struct ccp_cmd, work);
+	struct ccp_tasklet_data *tdata = (struct ccp_tasklet_data *)data;
+	struct ccp_cmd *cmd = tdata->cmd;
 
 	cmd->callback(cmd->data, cmd->ret);
+	complete(&tdata->completion);
 }
 
 static int ccp_cmd_queue_thread(void *data)
 {
 	struct ccp_cmd_queue *cmd_q = (struct ccp_cmd_queue *)data;
 	struct ccp_cmd *cmd;
+	struct ccp_tasklet_data tdata;
+	struct tasklet_struct tasklet;
+
+	tasklet_init(&tasklet, ccp_do_cmd_complete, (unsigned long)&tdata);
 
 	set_current_state(TASK_INTERRUPTIBLE);
 	while (!kthread_should_stop()) {
@@ -220,8 +231,10 @@ static int ccp_cmd_queue_thread(void *data)
 		cmd->ret = ccp_run_cmd(cmd_q, cmd);
 
 		/* Schedule the completion callback */
-		INIT_WORK(&cmd->work, ccp_do_cmd_complete);
-		schedule_work(&cmd->work);
+		tdata.cmd = cmd;
+		init_completion(&tdata.completion);
+		tasklet_schedule(&tasklet);
+		wait_for_completion(&tdata.completion);
 	}
 
 	__set_current_state(TASK_RUNNING);




* Re: [PATCH 0/4] crypto: ccp - selective algorithm registration and ipsec-related fixes
  2014-01-24 22:17 [PATCH 0/4] crypto: ccp - selective algorithm registration and ipsec-related fixes Tom Lendacky
                   ` (3 preceding siblings ...)
  2014-01-24 22:18 ` [PATCH 4/4] crypto: ccp - Perform completion callbacks using a tasklet Tom Lendacky
@ 2014-02-09  9:21 ` Herbert Xu
  4 siblings, 0 replies; 6+ messages in thread
From: Herbert Xu @ 2014-02-09  9:21 UTC (permalink / raw)
  To: Tom Lendacky; +Cc: linux-crypto, davem, linux-kernel

On Fri, Jan 24, 2014 at 04:17:50PM -0600, Tom Lendacky wrote:
> Patch 1: Allow the registration of an algorithm family (the SHA or AES
> algorithms) to be selectively disabled via module parameters.
> 
> Patches 2-4: Fix errors/issues that were found during IPSec testing. In
> order to prevent deadlocks with the networking code, the crypto callback
> was changed to run as a tasklet.  For the callback to be able to run as
> a tasklet, the HMAC calculation needed to be moved out of the callback
> path (since it sleeps) and into the CCP SHA operation logic.  Additionally,
> trying to allow concurrency within a tfm while maintaining per-CPU
> serialization of that tfm was not working properly, so a single queue is
> now used.
> 
> This patch series is based on the cryptodev-2.6 kernel tree.

All applied.  Thanks!
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
