All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v7 0/9] enable CAAM's HWRNG as default
@ 2020-01-27 16:56 Andrey Smirnov
  2020-01-27 16:56 ` [PATCH v7 1/9] crypto: caam - allocate RNG instantiation descriptor with GFP_DMA Andrey Smirnov
                   ` (8 more replies)
  0 siblings, 9 replies; 27+ messages in thread
From: Andrey Smirnov @ 2020-01-27 16:56 UTC (permalink / raw)
  To: linux-crypto
  Cc: Andrey Smirnov, Chris Healy, Lucas Stach, Horia Geantă,
	Herbert Xu, Iuliana Prodan, linux-kernel, linux-imx

Everyone:

This series is a continuation of original [discussion]. I don't know
if what's in the series is enough to use CAAMs HWRNG system wide, but
I am hoping that with enough iterations and feedback it will be.

Changes since [v1]:

    - Original hw_random replaced with the one using output of TRNG directly

    - SEC4 DRNG IP block exposed via crypto API

    - Small fix regarding use of GFP_DMA added to the series

Chagnes since [v2]:

    - msleep in polling loop to avoid wasting CPU cycles

    - caam_trng_read() bails out early if 'wait' is set to 'false'

    - fixed typo in ZII's name

Changes since [v3]:

    - DRNG's .cra_name is now "stdrng"

    - collected Reviewd-by tag from Lucas

    - typo fixes in commit messages of the series

Changes since [v4]:

    - Dropped "crypto: caam - RNG4 TRNG errata" and "crypto: caam -
      enable prediction resistance in HRWNG" to limit the scope of the
      series. Those two patches are not yet ready and can be submitted
      separately later.

    - Collected Tested-by from Chris

Changes since [v5]:

    - Series is converted back to implementing HWRNG using a job ring
      as per feedback from Horia.

Changes since [v6]:

    - "crypto: caam - drop global context pointer and init_done"
      changed to use devres group to allow freeing HWRNG resource
      independently of the parent device lifecycle. Code to deal with
      circular deallocation dependency is added as well

    - Removed worker self-scheduling in "crypto: caam - simplify RNG
      implementation". It didn't bring much value, but meant that
      simple cleanup with just a call to flush_work() wasn't good
      enough.

    - Added a simple function with a FIXME item for MC firmware check in
      "crypto: caam - enable prediction resistance in HRWNG"

    - "crypto: caam - limit single JD RNG output to maximum of 16
      bytes" now shrinks async FIFO size from 32K to 64 bytes, since
      having a buffer that big doesn't seem to do any good given that
      througput of TRNG

Feedback is welcome!

Thanks,
Andrey Smirnov

[discussion] https://patchwork.kernel.org/patch/9850669/
[v1] https://lore.kernel.org/lkml/20191029162916.26579-1-andrew.smirnov@gmail.com
[v2] https://lore.kernel.org/lkml/20191118153843.28136-1-andrew.smirnov@gmail.com
[v3] https://lore.kernel.org/lkml/20191120165341.32669-1-andrew.smirnov@gmail.com
[v4] https://lore.kernel.org/lkml/20191121155554.1227-1-andrew.smirnov@gmail.com
[v5] https://lore.kernel.org/lkml/20191203162357.21942-1-andrew.smirnov@gmail.com
[v6] https://lore.kernel.org/lkml/20200108154047.12526-1-andrew.smirnov@gmail.com


Andrey Smirnov (9):
  crypto: caam - allocate RNG instantiation descriptor with GFP_DMA
  crypto: caam - use struct hwrng's .init for initialization
  crypto: caam - use devm_kzalloc to allocate JR data
  crypto: caam - drop global context pointer and init_done
  crypto: caam - simplify RNG implementation
  crypto: caam - check if RNG job failed
  crypto: caam - invalidate entropy register during RNG initialization
  crypto: caam - enable prediction resistance in HRWNG
  crypto: caam - limit single JD RNG output to maximum of 16 bytes

 drivers/crypto/caam/caamrng.c | 395 +++++++++++++---------------------
 drivers/crypto/caam/ctrl.c    |  56 +++--
 drivers/crypto/caam/desc.h    |   2 +
 drivers/crypto/caam/intern.h  |   7 +-
 drivers/crypto/caam/jr.c      |  13 +-
 drivers/crypto/caam/regs.h    |   7 +-
 6 files changed, 209 insertions(+), 271 deletions(-)

--
2.21.0

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v7 1/9] crypto: caam - allocate RNG instantiation descriptor with GFP_DMA
  2020-01-27 16:56 [PATCH v7 0/9] enable CAAM's HWRNG as default Andrey Smirnov
@ 2020-01-27 16:56 ` Andrey Smirnov
  2020-02-04 14:08   ` Horia Geanta
  2020-01-27 16:56 ` [PATCH v7 2/9] crypto: caam - use struct hwrng's .init for initialization Andrey Smirnov
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Andrey Smirnov @ 2020-01-27 16:56 UTC (permalink / raw)
  To: linux-crypto
  Cc: Andrey Smirnov, Chris Healy, Lucas Stach, Horia Geantă,
	Herbert Xu, Iuliana Prodan, linux-imx, linux-kernel

Be consistent with the rest of the codebase and use GFP_DMA when
allocating memory for a CAAM JR descriptor.

Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Chris Healy <cphealy@gmail.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Horia Geantă <horia.geanta@nxp.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Iuliana Prodan <iuliana.prodan@nxp.com>
Cc: linux-imx@nxp.com
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/crypto/caam/ctrl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c
index 6659c8d9672e..a11c13a89ef8 100644
--- a/drivers/crypto/caam/ctrl.c
+++ b/drivers/crypto/caam/ctrl.c
@@ -194,7 +194,7 @@ static int deinstantiate_rng(struct device *ctrldev, int state_handle_mask)
 	u32 *desc, status;
 	int sh_idx, ret = 0;
 
-	desc = kmalloc(CAAM_CMD_SZ * 3, GFP_KERNEL);
+	desc = kmalloc(CAAM_CMD_SZ * 3, GFP_KERNEL | GFP_DMA);
 	if (!desc)
 		return -ENOMEM;
 
@@ -271,7 +271,7 @@ static int instantiate_rng(struct device *ctrldev, int state_handle_mask,
 	int ret = 0, sh_idx;
 
 	ctrl = (struct caam_ctrl __iomem *)ctrlpriv->ctrl;
-	desc = kmalloc(CAAM_CMD_SZ * 7, GFP_KERNEL);
+	desc = kmalloc(CAAM_CMD_SZ * 7, GFP_KERNEL | GFP_DMA);
 	if (!desc)
 		return -ENOMEM;
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v7 2/9] crypto: caam - use struct hwrng's .init for initialization
  2020-01-27 16:56 [PATCH v7 0/9] enable CAAM's HWRNG as default Andrey Smirnov
  2020-01-27 16:56 ` [PATCH v7 1/9] crypto: caam - allocate RNG instantiation descriptor with GFP_DMA Andrey Smirnov
@ 2020-01-27 16:56 ` Andrey Smirnov
  2020-02-11 14:39   ` Horia Geanta
  2020-01-27 16:56 ` [PATCH v7 3/9] crypto: caam - use devm_kzalloc to allocate JR data Andrey Smirnov
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Andrey Smirnov @ 2020-01-27 16:56 UTC (permalink / raw)
  To: linux-crypto
  Cc: Andrey Smirnov, Chris Healy, Lucas Stach, Horia Geantă,
	Herbert Xu, Iuliana Prodan, linux-kernel, linux-imx

Make caamrng code a bit more symmetric by moving initialization code
to .init hook of struct hwrng.

Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Chris Healy <cphealy@gmail.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Horia Geantă <horia.geanta@nxp.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Iuliana Prodan <iuliana.prodan@nxp.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-imx@nxp.com
---
 drivers/crypto/caam/caamrng.c | 47 ++++++++++++++++++-----------------
 1 file changed, 24 insertions(+), 23 deletions(-)

diff --git a/drivers/crypto/caam/caamrng.c b/drivers/crypto/caam/caamrng.c
index e8baacaabe07..1ce7fbd29e85 100644
--- a/drivers/crypto/caam/caamrng.c
+++ b/drivers/crypto/caam/caamrng.c
@@ -256,6 +256,7 @@ static void caam_cleanup(struct hwrng *rng)
 	}
 
 	rng_unmap_ctx(rng_ctx);
+	caam_jr_free(rng_ctx->jrdev);
 }
 
 static int caam_init_buf(struct caam_rng_ctx *ctx, int buf_id)
@@ -274,28 +275,43 @@ static int caam_init_buf(struct caam_rng_ctx *ctx, int buf_id)
 	return 0;
 }
 
-static int caam_init_rng(struct caam_rng_ctx *ctx, struct device *jrdev)
+static int caam_init(struct hwrng *rng)
 {
+	struct caam_rng_ctx *ctx = rng_ctx;
 	int err;
 
-	ctx->jrdev = jrdev;
+	ctx->jrdev = caam_jr_alloc();
+	err = PTR_ERR_OR_ZERO(ctx->jrdev);
+	if (err) {
+		pr_err("Job Ring Device allocation for transform failed\n");
+		return err;
+	}
 
 	err = rng_create_sh_desc(ctx);
 	if (err)
-		return err;
+		goto free_jrdev;
 
 	ctx->current_buf = 0;
 	ctx->cur_buf_idx = 0;
 
 	err = caam_init_buf(ctx, 0);
 	if (err)
-		return err;
+		goto free_jrdev;
+
+	err = caam_init_buf(ctx, 1);
+	if (err)
+		goto free_jrdev;
 
-	return caam_init_buf(ctx, 1);
+	return 0;
+
+free_jrdev:
+	caam_jr_free(ctx->jrdev);
+	return err;
 }
 
 static struct hwrng caam_rng = {
 	.name		= "rng-caam",
+	.init           = caam_init,
 	.cleanup	= caam_cleanup,
 	.read		= caam_read,
 };
@@ -305,14 +321,12 @@ void caam_rng_exit(void)
 	if (!init_done)
 		return;
 
-	caam_jr_free(rng_ctx->jrdev);
 	hwrng_unregister(&caam_rng);
 	kfree(rng_ctx);
 }
 
 int caam_rng_init(struct device *ctrldev)
 {
-	struct device *dev;
 	u32 rng_inst;
 	struct caam_drv_private *priv = dev_get_drvdata(ctrldev);
 	int err;
@@ -328,21 +342,11 @@ int caam_rng_init(struct device *ctrldev)
 	if (!rng_inst)
 		return 0;
 
-	dev = caam_jr_alloc();
-	if (IS_ERR(dev)) {
-		pr_err("Job Ring Device allocation for transform failed\n");
-		return PTR_ERR(dev);
-	}
 	rng_ctx = kmalloc(sizeof(*rng_ctx), GFP_DMA | GFP_KERNEL);
-	if (!rng_ctx) {
-		err = -ENOMEM;
-		goto free_caam_alloc;
-	}
-	err = caam_init_rng(rng_ctx, dev);
-	if (err)
-		goto free_rng_ctx;
+	if (!rng_ctx)
+		return -ENOMEM;
 
-	dev_info(dev, "registering rng-caam\n");
+	dev_info(ctrldev, "registering rng-caam\n");
 
 	err = hwrng_register(&caam_rng);
 	if (!err) {
@@ -350,9 +354,6 @@ int caam_rng_init(struct device *ctrldev)
 		return err;
 	}
 
-free_rng_ctx:
 	kfree(rng_ctx);
-free_caam_alloc:
-	caam_jr_free(dev);
 	return err;
 }
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v7 3/9] crypto: caam - use devm_kzalloc to allocate JR data
  2020-01-27 16:56 [PATCH v7 0/9] enable CAAM's HWRNG as default Andrey Smirnov
  2020-01-27 16:56 ` [PATCH v7 1/9] crypto: caam - allocate RNG instantiation descriptor with GFP_DMA Andrey Smirnov
  2020-01-27 16:56 ` [PATCH v7 2/9] crypto: caam - use struct hwrng's .init for initialization Andrey Smirnov
@ 2020-01-27 16:56 ` Andrey Smirnov
  2020-02-11 18:23   ` Horia Geanta
  2020-01-27 16:56 ` [PATCH v7 4/9] crypto: caam - drop global context pointer and init_done Andrey Smirnov
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Andrey Smirnov @ 2020-01-27 16:56 UTC (permalink / raw)
  To: linux-crypto
  Cc: Andrey Smirnov, Chris Healy, Lucas Stach, Horia Geantă,
	Herbert Xu, Iuliana Prodan, linux-kernel, linux-imx

Use devm_kzalloc() to allocate JR data in order to make sure that it
is initialized consistently every time.

Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Chris Healy <cphealy@gmail.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Horia Geantă <horia.geanta@nxp.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Iuliana Prodan <iuliana.prodan@nxp.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-imx@nxp.com
---
 drivers/crypto/caam/jr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/caam/jr.c b/drivers/crypto/caam/jr.c
index fc97cde27059..a627de959b1e 100644
--- a/drivers/crypto/caam/jr.c
+++ b/drivers/crypto/caam/jr.c
@@ -505,7 +505,7 @@ static int caam_jr_probe(struct platform_device *pdev)
 	int error;
 
 	jrdev = &pdev->dev;
-	jrpriv = devm_kmalloc(jrdev, sizeof(*jrpriv), GFP_KERNEL);
+	jrpriv = devm_kzalloc(jrdev, sizeof(*jrpriv), GFP_KERNEL);
 	if (!jrpriv)
 		return -ENOMEM;
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v7 4/9] crypto: caam - drop global context pointer and init_done
  2020-01-27 16:56 [PATCH v7 0/9] enable CAAM's HWRNG as default Andrey Smirnov
                   ` (2 preceding siblings ...)
  2020-01-27 16:56 ` [PATCH v7 3/9] crypto: caam - use devm_kzalloc to allocate JR data Andrey Smirnov
@ 2020-01-27 16:56 ` Andrey Smirnov
  2020-02-11 18:53   ` Horia Geanta
  2020-02-11 20:57   ` Horia Geanta
  2020-01-27 16:56 ` [PATCH v7 5/9] crypto: caam - simplify RNG implementation Andrey Smirnov
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 27+ messages in thread
From: Andrey Smirnov @ 2020-01-27 16:56 UTC (permalink / raw)
  To: linux-crypto
  Cc: Andrey Smirnov, Chris Healy, Lucas Stach, Horia Geantă,
	Herbert Xu, Iuliana Prodan, linux-kernel, linux-imx

Leverage devres to get rid of code storing global context as well as
init_done flag.

Original code also has a circular deallocation dependency where
unregister_algs() -> caam_rng_exit() -> caam_jr_free() chain would
only happen if all of JRs were freed. Fix this by moving
caam_rng_exit() outside of unregister_algs() and doing it specifically
for JR that instantiated HWRNG.

Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Chris Healy <cphealy@gmail.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Horia Geantă <horia.geanta@nxp.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Iuliana Prodan <iuliana.prodan@nxp.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-imx@nxp.com
---
 drivers/crypto/caam/caamrng.c | 66 +++++++++++++++++------------------
 drivers/crypto/caam/intern.h  |  7 ++--
 drivers/crypto/caam/jr.c      | 11 +++---
 3 files changed, 42 insertions(+), 42 deletions(-)

diff --git a/drivers/crypto/caam/caamrng.c b/drivers/crypto/caam/caamrng.c
index 1ce7fbd29e85..47b15c25b66f 100644
--- a/drivers/crypto/caam/caamrng.c
+++ b/drivers/crypto/caam/caamrng.c
@@ -70,6 +70,7 @@ struct buf_data {
 
 /* rng per-device context */
 struct caam_rng_ctx {
+	struct hwrng rng;
 	struct device *jrdev;
 	dma_addr_t sh_desc_dma;
 	u32 sh_desc[DESC_RNG_LEN];
@@ -78,13 +79,10 @@ struct caam_rng_ctx {
 	struct buf_data bufs[2];
 };
 
-static struct caam_rng_ctx *rng_ctx;
-
-/*
- * Variable used to avoid double free of resources in case
- * algorithm registration was unsuccessful
- */
-static bool init_done;
+static struct caam_rng_ctx *to_caam_rng_ctx(struct hwrng *r)
+{
+	return container_of(r, struct caam_rng_ctx, rng);
+}
 
 static inline void rng_unmap_buf(struct device *jrdev, struct buf_data *bd)
 {
@@ -143,7 +141,7 @@ static inline int submit_job(struct caam_rng_ctx *ctx, int to_current)
 
 static int caam_read(struct hwrng *rng, void *data, size_t max, bool wait)
 {
-	struct caam_rng_ctx *ctx = rng_ctx;
+	struct caam_rng_ctx *ctx = to_caam_rng_ctx(rng);
 	struct buf_data *bd = &ctx->bufs[ctx->current_buf];
 	int next_buf_idx, copied_idx;
 	int err;
@@ -246,17 +244,18 @@ static inline int rng_create_job_desc(struct caam_rng_ctx *ctx, int buf_id)
 
 static void caam_cleanup(struct hwrng *rng)
 {
+	struct caam_rng_ctx *ctx = to_caam_rng_ctx(rng);
 	int i;
 	struct buf_data *bd;
 
 	for (i = 0; i < 2; i++) {
-		bd = &rng_ctx->bufs[i];
+		bd = &ctx->bufs[i];
 		if (atomic_read(&bd->empty) == BUF_PENDING)
 			wait_for_completion(&bd->filled);
 	}
 
-	rng_unmap_ctx(rng_ctx);
-	caam_jr_free(rng_ctx->jrdev);
+	rng_unmap_ctx(ctx);
+	caam_jr_free(ctx->jrdev);
 }
 
 static int caam_init_buf(struct caam_rng_ctx *ctx, int buf_id)
@@ -277,7 +276,7 @@ static int caam_init_buf(struct caam_rng_ctx *ctx, int buf_id)
 
 static int caam_init(struct hwrng *rng)
 {
-	struct caam_rng_ctx *ctx = rng_ctx;
+	struct caam_rng_ctx *ctx = to_caam_rng_ctx(rng);
 	int err;
 
 	ctx->jrdev = caam_jr_alloc();
@@ -309,28 +308,19 @@ static int caam_init(struct hwrng *rng)
 	return err;
 }
 
-static struct hwrng caam_rng = {
-	.name		= "rng-caam",
-	.init           = caam_init,
-	.cleanup	= caam_cleanup,
-	.read		= caam_read,
-};
+int caam_rng_init(struct device *ctrldev);
 
-void caam_rng_exit(void)
+void caam_rng_exit(struct device *ctrldev)
 {
-	if (!init_done)
-		return;
-
-	hwrng_unregister(&caam_rng);
-	kfree(rng_ctx);
+	devres_release_group(ctrldev, caam_rng_init);
 }
 
 int caam_rng_init(struct device *ctrldev)
 {
+	struct caam_rng_ctx *ctx;
 	u32 rng_inst;
 	struct caam_drv_private *priv = dev_get_drvdata(ctrldev);
-	int err;
-	init_done = false;
+	int ret;
 
 	/* Check for an instantiated RNG before registration */
 	if (priv->era < 10)
@@ -342,18 +332,26 @@ int caam_rng_init(struct device *ctrldev)
 	if (!rng_inst)
 		return 0;
 
-	rng_ctx = kmalloc(sizeof(*rng_ctx), GFP_DMA | GFP_KERNEL);
-	if (!rng_ctx)
+	if (!devres_open_group(ctrldev, caam_rng_init, GFP_KERNEL))
+		return -ENOMEM;
+
+	ctx = devm_kzalloc(ctrldev, sizeof(*ctx), GFP_DMA | GFP_KERNEL);
+	if (!ctx)
 		return -ENOMEM;
 
+	ctx->rng.name    = "rng-caam";
+	ctx->rng.init    = caam_init;
+	ctx->rng.cleanup = caam_cleanup;
+	ctx->rng.read    = caam_read;
+
 	dev_info(ctrldev, "registering rng-caam\n");
 
-	err = hwrng_register(&caam_rng);
-	if (!err) {
-		init_done = true;
-		return err;
+	ret = devm_hwrng_register(ctrldev, &ctx->rng);
+	if (ret) {
+		caam_rng_exit(ctrldev);
+		return ret;
 	}
 
-	kfree(rng_ctx);
-	return err;
+	devres_close_group(ctrldev, caam_rng_init);
+	return 0;
 }
diff --git a/drivers/crypto/caam/intern.h b/drivers/crypto/caam/intern.h
index c7c10c90464b..e11c9722c2dd 100644
--- a/drivers/crypto/caam/intern.h
+++ b/drivers/crypto/caam/intern.h
@@ -46,6 +46,7 @@ struct caam_drv_private_jr {
 	struct caam_job_ring __iomem *rregs;	/* JobR's register space */
 	struct tasklet_struct irqtask;
 	int irq;			/* One per queue */
+	bool hwrng;
 
 	/* Number of scatterlist crypt transforms active on the JobR */
 	atomic_t tfm_count ____cacheline_aligned;
@@ -161,7 +162,7 @@ static inline void caam_pkc_exit(void)
 #ifdef CONFIG_CRYPTO_DEV_FSL_CAAM_RNG_API
 
 int caam_rng_init(struct device *dev);
-void caam_rng_exit(void);
+void caam_rng_exit(struct device *dev);
 
 #else
 
@@ -170,9 +171,7 @@ static inline int caam_rng_init(struct device *dev)
 	return 0;
 }
 
-static inline void caam_rng_exit(void)
-{
-}
+static inline void caam_rng_exit(struct device *dev) {}
 
 #endif /* CONFIG_CRYPTO_DEV_FSL_CAAM_RNG_API */
 
diff --git a/drivers/crypto/caam/jr.c b/drivers/crypto/caam/jr.c
index a627de959b1e..7b8a8d3db40e 100644
--- a/drivers/crypto/caam/jr.c
+++ b/drivers/crypto/caam/jr.c
@@ -27,7 +27,8 @@ static struct jr_driver_data driver_data;
 static DEFINE_MUTEX(algs_lock);
 static unsigned int active_devs;
 
-static void register_algs(struct device *dev)
+static void register_algs(struct caam_drv_private_jr *jrpriv,
+			  struct device *dev)
 {
 	mutex_lock(&algs_lock);
 
@@ -37,7 +38,7 @@ static void register_algs(struct device *dev)
 	caam_algapi_init(dev);
 	caam_algapi_hash_init(dev);
 	caam_pkc_init(dev);
-	caam_rng_init(dev);
+	jrpriv->hwrng = !caam_rng_init(dev);
 	caam_qi_algapi_init(dev);
 
 algs_unlock:
@@ -53,7 +54,6 @@ static void unregister_algs(void)
 
 	caam_qi_algapi_exit();
 
-	caam_rng_exit();
 	caam_pkc_exit();
 	caam_algapi_hash_exit();
 	caam_algapi_exit();
@@ -126,6 +126,9 @@ static int caam_jr_remove(struct platform_device *pdev)
 	jrdev = &pdev->dev;
 	jrpriv = dev_get_drvdata(jrdev);
 
+	if (jrpriv->hwrng)
+		caam_rng_exit(jrdev->parent);
+
 	/*
 	 * Return EBUSY if job ring already allocated.
 	 */
@@ -562,7 +565,7 @@ static int caam_jr_probe(struct platform_device *pdev)
 
 	atomic_set(&jrpriv->tfm_count, 0);
 
-	register_algs(jrdev->parent);
+	register_algs(jrpriv, jrdev->parent);
 
 	return 0;
 }
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v7 5/9] crypto: caam - simplify RNG implementation
  2020-01-27 16:56 [PATCH v7 0/9] enable CAAM's HWRNG as default Andrey Smirnov
                   ` (3 preceding siblings ...)
  2020-01-27 16:56 ` [PATCH v7 4/9] crypto: caam - drop global context pointer and init_done Andrey Smirnov
@ 2020-01-27 16:56 ` Andrey Smirnov
  2020-02-12 13:20   ` Horia Geanta
  2020-01-27 16:56 ` [PATCH v7 6/9] crypto: caam - check if RNG job failed Andrey Smirnov
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Andrey Smirnov @ 2020-01-27 16:56 UTC (permalink / raw)
  To: linux-crypto
  Cc: Andrey Smirnov, Chris Healy, Lucas Stach, Horia Geantă,
	Herbert Xu, Iuliana Prodan, linux-kernel, linux-imx

Rework CAAM RNG implementation as follows:

- Make use of the fact that HWRNG supports partial reads and will
handle such cases gracefully by removing recursion in caam_read()

- Convert blocking caam_read() codepath to do a single blocking job
read directly into requested buffer, bypassing any intermediary
buffers

- Convert async caam_read() codepath into a simple single
reader/single writer FIFO use-case, thus simplifying concurrency
handling and delegating buffer read/write position management to KFIFO
subsystem.

- Leverage the same low level RNG data extraction code for both async
and blocking caam_read() scenarios, get rid of the shared job
descriptor and make non-shared one as a simple as possible (just
HEADER + ALGORITHM OPERATION + FIFO STORE)

- Split private context from DMA related memory, so that the former
could be allocated without GFP_DMA.

NOTE: On its face value this commit decreased throughput numbers
reported by

  dd if=/dev/hwrng of=/dev/null bs=1 count=100K [iflag=nonblock]

by about 15%, however commits that enable prediction resistance and
limit JR total size impact the performance so much and move the
bottleneck such as to make this regression irrelevant.

NOTE: On the bright side, this commit reduces RNG in kernel DMA buffer
memory usage from 2 x RN_BUF_SIZE (~256K) to 32K.

Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Chris Healy <cphealy@gmail.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Horia Geantă <horia.geanta@nxp.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Iuliana Prodan <iuliana.prodan@nxp.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-imx@nxp.com
---
 drivers/crypto/caam/caamrng.c | 321 ++++++++++++----------------------
 1 file changed, 107 insertions(+), 214 deletions(-)

diff --git a/drivers/crypto/caam/caamrng.c b/drivers/crypto/caam/caamrng.c
index 47b15c25b66f..cb498186b9b9 100644
--- a/drivers/crypto/caam/caamrng.c
+++ b/drivers/crypto/caam/caamrng.c
@@ -7,35 +7,12 @@
  *
  * Based on caamalg.c crypto API driver.
  *
- * relationship between job descriptors to shared descriptors:
- *
- * ---------------                     --------------
- * | JobDesc #0  |-------------------->| ShareDesc  |
- * | *(buffer 0) |      |------------->| (generate) |
- * ---------------      |              | (move)     |
- *                      |              | (store)    |
- * ---------------      |              --------------
- * | JobDesc #1  |------|
- * | *(buffer 1) |
- * ---------------
- *
- * A job desc looks like this:
- *
- * ---------------------
- * | Header            |
- * | ShareDesc Pointer |
- * | SEQ_OUT_PTR       |
- * | (output buffer)   |
- * ---------------------
- *
- * The SharedDesc never changes, and each job descriptor points to one of two
- * buffers for each device, from which the data will be copied into the
- * requested destination
  */
 
 #include <linux/hw_random.h>
 #include <linux/completion.h>
 #include <linux/atomic.h>
+#include <linux/kfifo.h>
 
 #include "compat.h"
 
@@ -45,38 +22,34 @@
 #include "jr.h"
 #include "error.h"
 
+/* length of descriptors */
+#define CAAM_RNG_MAX_FIFO_STORE_SIZE	U16_MAX
+
+#define CAAM_RNG_FIFO_LEN		SZ_32K /* Must be a multiple of 2 */
+
 /*
- * Maximum buffer size: maximum number of random, cache-aligned bytes that
- * will be generated and moved to seq out ptr (extlen not allowed)
+ * See caam_init_desc()
  */
-#define RN_BUF_SIZE			(0xffff / L1_CACHE_BYTES * \
-					 L1_CACHE_BYTES)
+#define CAAM_RNG_DESC_LEN (CAAM_CMD_SZ +				\
+			   CAAM_CMD_SZ +				\
+			   CAAM_CMD_SZ + CAAM_PTR_SZ_MAX)
 
-/* length of descriptors */
-#define DESC_JOB_O_LEN			(CAAM_CMD_SZ * 2 + CAAM_PTR_SZ_MAX * 2)
-#define DESC_RNG_LEN			(3 * CAAM_CMD_SZ)
-
-/* Buffer, its dma address and lock */
-struct buf_data {
-	u8 buf[RN_BUF_SIZE] ____cacheline_aligned;
-	dma_addr_t addr;
-	struct completion filled;
-	u32 hw_desc[DESC_JOB_O_LEN];
-#define BUF_NOT_EMPTY 0
-#define BUF_EMPTY 1
-#define BUF_PENDING 2  /* Empty, but with job pending --don't submit another */
-	atomic_t empty;
+typedef u8 caam_rng_desc[CAAM_RNG_DESC_LEN];
+
+enum {
+	DESC_ASYNC,
+	DESC_SYNC,
+	DESC_NUM,
 };
 
 /* rng per-device context */
 struct caam_rng_ctx {
 	struct hwrng rng;
 	struct device *jrdev;
-	dma_addr_t sh_desc_dma;
-	u32 sh_desc[DESC_RNG_LEN];
-	unsigned int cur_buf_idx;
-	int current_buf;
-	struct buf_data bufs[2];
+	struct device *ctrldev;
+	caam_rng_desc *desc;
+	struct work_struct worker;
+	struct kfifo fifo;
 };
 
 static struct caam_rng_ctx *to_caam_rng_ctx(struct hwrng *r)
@@ -84,228 +57,146 @@ static struct caam_rng_ctx *to_caam_rng_ctx(struct hwrng *r)
 	return container_of(r, struct caam_rng_ctx, rng);
 }
 
-static inline void rng_unmap_buf(struct device *jrdev, struct buf_data *bd)
+static void caam_rng_done(struct device *jrdev, u32 *desc, u32 err,
+			  void *context)
 {
-	if (bd->addr)
-		dma_unmap_single(jrdev, bd->addr, RN_BUF_SIZE,
-				 DMA_FROM_DEVICE);
-}
-
-static inline void rng_unmap_ctx(struct caam_rng_ctx *ctx)
-{
-	struct device *jrdev = ctx->jrdev;
-
-	if (ctx->sh_desc_dma)
-		dma_unmap_single(jrdev, ctx->sh_desc_dma,
-				 desc_bytes(ctx->sh_desc), DMA_TO_DEVICE);
-	rng_unmap_buf(jrdev, &ctx->bufs[0]);
-	rng_unmap_buf(jrdev, &ctx->bufs[1]);
-}
-
-static void rng_done(struct device *jrdev, u32 *desc, u32 err, void *context)
-{
-	struct buf_data *bd;
-
-	bd = container_of(desc, struct buf_data, hw_desc[0]);
+	struct completion *done = context;
 
 	if (err)
 		caam_jr_strstatus(jrdev, err);
 
-	atomic_set(&bd->empty, BUF_NOT_EMPTY);
-	complete(&bd->filled);
-
-	/* Buffer refilled, invalidate cache */
-	dma_sync_single_for_cpu(jrdev, bd->addr, RN_BUF_SIZE, DMA_FROM_DEVICE);
-
-	print_hex_dump_debug("rng refreshed buf@: ", DUMP_PREFIX_ADDRESS, 16, 4,
-			     bd->buf, RN_BUF_SIZE, 1);
+	complete(done);
 }
 
-static inline int submit_job(struct caam_rng_ctx *ctx, int to_current)
+static u32 *caam_init_desc(u32 *desc, dma_addr_t dst_dma, int len)
 {
-	struct buf_data *bd = &ctx->bufs[!(to_current ^ ctx->current_buf)];
-	struct device *jrdev = ctx->jrdev;
-	u32 *desc = bd->hw_desc;
-	int err;
+	init_job_desc(desc, 0);	/* + 1 cmd_sz */
+	/* Generate random bytes: + 1 cmd_sz */
+	append_operation(desc, OP_ALG_ALGSEL_RNG | OP_TYPE_CLASS1_ALG);
+	/* Store bytes */
+	append_fifo_store(desc, dst_dma, len, FIFOST_TYPE_RNGSTORE);
 
-	dev_dbg(jrdev, "submitting job %d\n", !(to_current ^ ctx->current_buf));
-	init_completion(&bd->filled);
-	err = caam_jr_enqueue(jrdev, desc, rng_done, ctx);
-	if (err)
-		complete(&bd->filled); /* don't wait on failed job*/
-	else
-		atomic_inc(&bd->empty); /* note if pending */
+	print_hex_dump_debug("rng job desc@: ", DUMP_PREFIX_ADDRESS,
+			     16, 4, desc, desc_bytes(desc), 1);
 
-	return err;
+	return desc;
 }
 
-static int caam_read(struct hwrng *rng, void *data, size_t max, bool wait)
+static int caam_rng_read_one(struct device *jrdev,
+			     void *dst, int len,
+			     void *desc,
+			     struct completion *done)
 {
-	struct caam_rng_ctx *ctx = to_caam_rng_ctx(rng);
-	struct buf_data *bd = &ctx->bufs[ctx->current_buf];
-	int next_buf_idx, copied_idx;
+	dma_addr_t dst_dma;
 	int err;
 
-	if (atomic_read(&bd->empty)) {
-		/* try to submit job if there wasn't one */
-		if (atomic_read(&bd->empty) == BUF_EMPTY) {
-			err = submit_job(ctx, 1);
-			/* if can't submit job, can't even wait */
-			if (err)
-				return 0;
-		}
-		/* no immediate data, so exit if not waiting */
-		if (!wait)
-			return 0;
-
-		/* waiting for pending job */
-		if (atomic_read(&bd->empty))
-			wait_for_completion(&bd->filled);
-	}
-
-	next_buf_idx = ctx->cur_buf_idx + max;
-	dev_dbg(ctx->jrdev, "%s: start reading at buffer %d, idx %d\n",
-		 __func__, ctx->current_buf, ctx->cur_buf_idx);
+	len = min_t(int, len, CAAM_RNG_MAX_FIFO_STORE_SIZE);
 
-	/* if enough data in current buffer */
-	if (next_buf_idx < RN_BUF_SIZE) {
-		memcpy(data, bd->buf + ctx->cur_buf_idx, max);
-		ctx->cur_buf_idx = next_buf_idx;
-		return max;
+	dst_dma = dma_map_single(jrdev, dst, len, DMA_FROM_DEVICE);
+	if (dma_mapping_error(jrdev, dst_dma)) {
+		dev_err(jrdev, "unable to map destination memory\n");
+		return -ENOMEM;
 	}
 
-	/* else, copy what's left... */
-	copied_idx = RN_BUF_SIZE - ctx->cur_buf_idx;
-	memcpy(data, bd->buf + ctx->cur_buf_idx, copied_idx);
-	ctx->cur_buf_idx = 0;
-	atomic_set(&bd->empty, BUF_EMPTY);
-
-	/* ...refill... */
-	submit_job(ctx, 1);
+	init_completion(done);
+	err = caam_jr_enqueue(jrdev,
+			      caam_init_desc(desc, dst_dma, len),
+			      caam_rng_done, done);
+	if (!err)
+		wait_for_completion(done);
 
-	/* and use next buffer */
-	ctx->current_buf = !ctx->current_buf;
-	dev_dbg(ctx->jrdev, "switched to buffer %d\n", ctx->current_buf);
+	dma_unmap_single(jrdev, dst_dma, len, DMA_FROM_DEVICE);
 
-	/* since there already is some data read, don't wait */
-	return copied_idx + caam_read(rng, data + copied_idx,
-				      max - copied_idx, false);
+	return err ?: len;
 }
 
-static inline int rng_create_sh_desc(struct caam_rng_ctx *ctx)
+static void caam_rng_fill_async(struct caam_rng_ctx *ctx)
 {
-	struct device *jrdev = ctx->jrdev;
-	u32 *desc = ctx->sh_desc;
-
-	init_sh_desc(desc, HDR_SHARE_SERIAL);
-
-	/* Generate random bytes */
-	append_operation(desc, OP_ALG_ALGSEL_RNG | OP_TYPE_CLASS1_ALG);
-
-	/* Store bytes */
-	append_seq_fifo_store(desc, RN_BUF_SIZE, FIFOST_TYPE_RNGSTORE);
-
-	ctx->sh_desc_dma = dma_map_single(jrdev, desc, desc_bytes(desc),
-					  DMA_TO_DEVICE);
-	if (dma_mapping_error(jrdev, ctx->sh_desc_dma)) {
-		dev_err(jrdev, "unable to map shared descriptor\n");
-		return -ENOMEM;
-	}
-
-	print_hex_dump_debug("rng shdesc@: ", DUMP_PREFIX_ADDRESS, 16, 4,
-			     desc, desc_bytes(desc), 1);
+	struct scatterlist sg[1];
+	struct completion done;
+	int len, nents;
+
+	sg_init_table(sg, ARRAY_SIZE(sg));
+	nents = kfifo_dma_in_prepare(&ctx->fifo, sg, ARRAY_SIZE(sg),
+				     CAAM_RNG_FIFO_LEN);
+	if (!nents)
+		return;
+
+	len = caam_rng_read_one(ctx->jrdev, sg_virt(&sg[0]),
+				sg[0].length,
+				&ctx->desc[DESC_ASYNC],
+				&done);
+	if (len < 0)
+		return;
+
+	kfifo_dma_in_finish(&ctx->fifo, len);
+}
 
-	return 0;
+static void caam_rng_worker(struct work_struct *work)
+{
+	struct caam_rng_ctx *ctx = container_of(work, struct caam_rng_ctx,
+						worker);
+	caam_rng_fill_async(ctx);
 }
 
-static inline int rng_create_job_desc(struct caam_rng_ctx *ctx, int buf_id)
+static int caam_read(struct hwrng *rng, void *dst, size_t max, bool wait)
 {
-	struct device *jrdev = ctx->jrdev;
-	struct buf_data *bd = &ctx->bufs[buf_id];
-	u32 *desc = bd->hw_desc;
-	int sh_len = desc_len(ctx->sh_desc);
+	struct caam_rng_ctx *ctx = to_caam_rng_ctx(rng);
+	int out;
 
-	init_job_desc_shared(desc, ctx->sh_desc_dma, sh_len, HDR_SHARE_DEFER |
-			     HDR_REVERSE);
+	if (wait) {
+		struct completion done;
 
-	bd->addr = dma_map_single(jrdev, bd->buf, RN_BUF_SIZE, DMA_FROM_DEVICE);
-	if (dma_mapping_error(jrdev, bd->addr)) {
-		dev_err(jrdev, "unable to map dst\n");
-		return -ENOMEM;
+		return caam_rng_read_one(ctx->jrdev, dst, max,
+					 ctx->desc[DESC_SYNC], &done);
 	}
 
-	append_seq_out_ptr_intlen(desc, bd->addr, RN_BUF_SIZE, 0);
-
-	print_hex_dump_debug("rng job desc@: ", DUMP_PREFIX_ADDRESS, 16, 4,
-			     desc, desc_bytes(desc), 1);
+	out = kfifo_out(&ctx->fifo, dst, max);
+	if (kfifo_len(&ctx->fifo) <= CAAM_RNG_FIFO_LEN / 2)
+		schedule_work(&ctx->worker);
 
-	return 0;
+	return out;
 }
 
 static void caam_cleanup(struct hwrng *rng)
 {
 	struct caam_rng_ctx *ctx = to_caam_rng_ctx(rng);
-	int i;
-	struct buf_data *bd;
-
-	for (i = 0; i < 2; i++) {
-		bd = &ctx->bufs[i];
-		if (atomic_read(&bd->empty) == BUF_PENDING)
-			wait_for_completion(&bd->filled);
-	}
 
-	rng_unmap_ctx(ctx);
+	flush_work(&ctx->worker);
 	caam_jr_free(ctx->jrdev);
+	kfifo_free(&ctx->fifo);
 }
 
-static int caam_init_buf(struct caam_rng_ctx *ctx, int buf_id)
+static int caam_init(struct hwrng *rng)
 {
-	struct buf_data *bd = &ctx->bufs[buf_id];
+	struct caam_rng_ctx *ctx = to_caam_rng_ctx(rng);
 	int err;
 
-	err = rng_create_job_desc(ctx, buf_id);
-	if (err)
-		return err;
-
-	atomic_set(&bd->empty, BUF_EMPTY);
-	submit_job(ctx, buf_id == ctx->current_buf);
-	wait_for_completion(&bd->filled);
+	ctx->desc = devm_kcalloc(ctx->ctrldev, DESC_NUM, sizeof(*ctx->desc),
+				 GFP_DMA | GFP_KERNEL);
+	if (!ctx->desc)
+		return -ENOMEM;
 
-	return 0;
-}
+	if (kfifo_alloc(&ctx->fifo, CAAM_RNG_FIFO_LEN, GFP_DMA | GFP_KERNEL))
+		return -ENOMEM;
 
-static int caam_init(struct hwrng *rng)
-{
-	struct caam_rng_ctx *ctx = to_caam_rng_ctx(rng);
-	int err;
+	INIT_WORK(&ctx->worker, caam_rng_worker);
 
 	ctx->jrdev = caam_jr_alloc();
 	err = PTR_ERR_OR_ZERO(ctx->jrdev);
 	if (err) {
+		kfifo_free(&ctx->fifo);
 		pr_err("Job Ring Device allocation for transform failed\n");
 		return err;
 	}
 
-	err = rng_create_sh_desc(ctx);
-	if (err)
-		goto free_jrdev;
-
-	ctx->current_buf = 0;
-	ctx->cur_buf_idx = 0;
-
-	err = caam_init_buf(ctx, 0);
-	if (err)
-		goto free_jrdev;
-
-	err = caam_init_buf(ctx, 1);
-	if (err)
-		goto free_jrdev;
+	/*
+	 * Fill async buffer to have early randomness data for
+	 * hw_random
+	 */
+	caam_rng_fill_async(ctx);
 
 	return 0;
-
-free_jrdev:
-	caam_jr_free(ctx->jrdev);
-	return err;
 }
 
 int caam_rng_init(struct device *ctrldev);
@@ -335,10 +226,12 @@ int caam_rng_init(struct device *ctrldev)
 	if (!devres_open_group(ctrldev, caam_rng_init, GFP_KERNEL))
 		return -ENOMEM;
 
-	ctx = devm_kzalloc(ctrldev, sizeof(*ctx), GFP_DMA | GFP_KERNEL);
+	ctx = devm_kzalloc(ctrldev, sizeof(*ctx), GFP_KERNEL);
 	if (!ctx)
 		return -ENOMEM;
 
+	ctx->ctrldev = ctrldev;
+
 	ctx->rng.name    = "rng-caam";
 	ctx->rng.init    = caam_init;
 	ctx->rng.cleanup = caam_cleanup;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v7 6/9] crypto: caam - check if RNG job failed
  2020-01-27 16:56 [PATCH v7 0/9] enable CAAM's HWRNG as default Andrey Smirnov
                   ` (4 preceding siblings ...)
  2020-01-27 16:56 ` [PATCH v7 5/9] crypto: caam - simplify RNG implementation Andrey Smirnov
@ 2020-01-27 16:56 ` Andrey Smirnov
  2020-02-12 10:41   ` Horia Geanta
  2020-01-27 16:56 ` [PATCH v7 7/9] crypto: caam - invalidate entropy register during RNG initialization Andrey Smirnov
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Andrey Smirnov @ 2020-01-27 16:56 UTC (permalink / raw)
  To: linux-crypto
  Cc: Andrey Smirnov, Chris Healy, Lucas Stach, Horia Geantă,
	Herbert Xu, Iuliana Prodan, linux-kernel, linux-imx

We shouldn't stay silent if RNG job fails. Add appropriate code to
check for that case and propagate error code up appropriately.

Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Chris Healy <cphealy@gmail.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Horia Geantă <horia.geanta@nxp.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Iuliana Prodan <iuliana.prodan@nxp.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-imx@nxp.com
---
 drivers/crypto/caam/caamrng.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/caam/caamrng.c b/drivers/crypto/caam/caamrng.c
index cb498186b9b9..790624ae83c6 100644
--- a/drivers/crypto/caam/caamrng.c
+++ b/drivers/crypto/caam/caamrng.c
@@ -52,6 +52,11 @@ struct caam_rng_ctx {
 	struct kfifo fifo;
 };
 
+struct caam_rng_job_ctx {
+	struct completion *done;
+	int *err;
+};
+
 static struct caam_rng_ctx *to_caam_rng_ctx(struct hwrng *r)
 {
 	return container_of(r, struct caam_rng_ctx, rng);
@@ -60,12 +65,12 @@ static struct caam_rng_ctx *to_caam_rng_ctx(struct hwrng *r)
 static void caam_rng_done(struct device *jrdev, u32 *desc, u32 err,
 			  void *context)
 {
-	struct completion *done = context;
+	struct caam_rng_job_ctx *jctx = context;
 
 	if (err)
-		caam_jr_strstatus(jrdev, err);
+		*jctx->err = caam_jr_strstatus(jrdev, err);
 
-	complete(done);
+	complete(jctx->done);
 }
 
 static u32 *caam_init_desc(u32 *desc, dma_addr_t dst_dma, int len)
@@ -89,6 +94,10 @@ static int caam_rng_read_one(struct device *jrdev,
 {
 	dma_addr_t dst_dma;
 	int err;
+	struct caam_rng_job_ctx jctx = {
+		.done = done,
+		.err  = &err,
+	};
 
 	len = min_t(int, len, CAAM_RNG_MAX_FIFO_STORE_SIZE);
 
@@ -101,7 +110,7 @@ static int caam_rng_read_one(struct device *jrdev,
 	init_completion(done);
 	err = caam_jr_enqueue(jrdev,
 			      caam_init_desc(desc, dst_dma, len),
-			      caam_rng_done, done);
+			      caam_rng_done, &jctx);
 	if (!err)
 		wait_for_completion(done);
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v7 7/9] crypto: caam - invalidate entropy register during RNG initialization
  2020-01-27 16:56 [PATCH v7 0/9] enable CAAM's HWRNG as default Andrey Smirnov
                   ` (5 preceding siblings ...)
  2020-01-27 16:56 ` [PATCH v7 6/9] crypto: caam - check if RNG job failed Andrey Smirnov
@ 2020-01-27 16:56 ` Andrey Smirnov
  2020-02-25 20:26   ` Horia Geanta
  2020-01-27 16:56 ` [PATCH v7 8/9] crypto: caam - enable prediction resistance in HRWNG Andrey Smirnov
  2020-01-27 16:56 ` [PATCH v7 9/9] crypto: caam - limit single JD RNG output to maximum of 16 bytes Andrey Smirnov
  8 siblings, 1 reply; 27+ messages in thread
From: Andrey Smirnov @ 2020-01-27 16:56 UTC (permalink / raw)
  To: linux-crypto
  Cc: Andrey Smirnov, Aymen Sghaier, Vipul Kumar, Chris Healy,
	Lucas Stach, Horia Geantă,
	Herbert Xu, Iuliana Prodan, linux-kernel, linux-imx

In order to make sure that we always use non-stale entropy data, change
the code to invalidate entropy register during RNG initialization.

Signed-off-by: Aymen Sghaier <aymen.sghaier@nxp.com>
Signed-off-by: Vipul Kumar <vipul_kumar@mentor.com>
[andrew.smirnov@gmail.com ported to upstream kernel, rewrote commit msg]
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Chris Healy <cphealy@gmail.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Horia Geantă <horia.geanta@nxp.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Iuliana Prodan <iuliana.prodan@nxp.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-imx@nxp.com
---
 drivers/crypto/caam/ctrl.c | 11 ++++++++---
 drivers/crypto/caam/regs.h |  3 ++-
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c
index a11c13a89ef8..bcbc832b208e 100644
--- a/drivers/crypto/caam/ctrl.c
+++ b/drivers/crypto/caam/ctrl.c
@@ -339,8 +339,12 @@ static void kick_trng(struct platform_device *pdev, int ent_delay)
 	ctrl = (struct caam_ctrl __iomem *)ctrlpriv->ctrl;
 	r4tst = &ctrl->r4tst[0];
 
-	/* put RNG4 into program mode */
-	clrsetbits_32(&r4tst->rtmctl, 0, RTMCTL_PRGM);
+	/*
+	 * Setting both RTMCTL:PRGM and RTMCTL:TRNG_ACC causes TRNG to
+	 * properly invalidate the entropy in the entropy register and
+	 * force re-generation.
+	 */
+	clrsetbits_32(&r4tst->rtmctl, 0, RTMCTL_PRGM | RTMCTL_ACC);
 
 	/*
 	 * Performance-wise, it does not make sense to
@@ -370,7 +374,8 @@ static void kick_trng(struct platform_device *pdev, int ent_delay)
 	 * select raw sampling in both entropy shifter
 	 * and statistical checker; ; put RNG4 into run mode
 	 */
-	clrsetbits_32(&r4tst->rtmctl, RTMCTL_PRGM, RTMCTL_SAMP_MODE_RAW_ES_SC);
+	clrsetbits_32(&r4tst->rtmctl, RTMCTL_PRGM | RTMCTL_ACC,
+		      RTMCTL_SAMP_MODE_RAW_ES_SC);
 }
 
 static int caam_get_era_from_hw(struct caam_ctrl __iomem *ctrl)
diff --git a/drivers/crypto/caam/regs.h b/drivers/crypto/caam/regs.h
index 05127b70527d..c191e8fd0fa7 100644
--- a/drivers/crypto/caam/regs.h
+++ b/drivers/crypto/caam/regs.h
@@ -487,7 +487,8 @@ struct rngtst {
 
 /* RNG4 TRNG test registers */
 struct rng4tst {
-#define RTMCTL_PRGM	0x00010000	/* 1 -> program mode, 0 -> run mode */
+#define RTMCTL_ACC  BIT(5)  /* TRNG access mode */
+#define RTMCTL_PRGM BIT(16) /* 1 -> program mode, 0 -> run mode */
 #define RTMCTL_SAMP_MODE_VON_NEUMANN_ES_SC	0 /* use von Neumann data in
 						     both entropy shifter and
 						     statistical checker */
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v7 8/9] crypto: caam - enable prediction resistance in HRWNG
  2020-01-27 16:56 [PATCH v7 0/9] enable CAAM's HWRNG as default Andrey Smirnov
                   ` (6 preceding siblings ...)
  2020-01-27 16:56 ` [PATCH v7 7/9] crypto: caam - invalidate entropy register during RNG initialization Andrey Smirnov
@ 2020-01-27 16:56 ` Andrey Smirnov
  2020-02-04 13:09   ` [EXT] " Andrei Botila (OSS)
  2020-01-27 16:56 ` [PATCH v7 9/9] crypto: caam - limit single JD RNG output to maximum of 16 bytes Andrey Smirnov
  8 siblings, 1 reply; 27+ messages in thread
From: Andrey Smirnov @ 2020-01-27 16:56 UTC (permalink / raw)
  To: linux-crypto
  Cc: Andrey Smirnov, Chris Healy, Lucas Stach, Horia Geantă,
	Herbert Xu, Iuliana Prodan, linux-kernel, linux-imx

Instantiate CAAM RNG with prediction resistance enabled to improve its
quality (with PR on DRNG is forced to reseed from TRNG every time
random data is generated).

Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Chris Healy <cphealy@gmail.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Horia Geantă <horia.geanta@nxp.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Iuliana Prodan <iuliana.prodan@nxp.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-imx@nxp.com
---
 drivers/crypto/caam/caamrng.c |  3 ++-
 drivers/crypto/caam/ctrl.c    | 41 +++++++++++++++++++++++++++--------
 drivers/crypto/caam/desc.h    |  2 ++
 drivers/crypto/caam/regs.h    |  4 +++-
 4 files changed, 39 insertions(+), 11 deletions(-)

diff --git a/drivers/crypto/caam/caamrng.c b/drivers/crypto/caam/caamrng.c
index 790624ae83c6..62f3a69ae837 100644
--- a/drivers/crypto/caam/caamrng.c
+++ b/drivers/crypto/caam/caamrng.c
@@ -77,7 +77,8 @@ static u32 *caam_init_desc(u32 *desc, dma_addr_t dst_dma, int len)
 {
 	init_job_desc(desc, 0);	/* + 1 cmd_sz */
 	/* Generate random bytes: + 1 cmd_sz */
-	append_operation(desc, OP_ALG_ALGSEL_RNG | OP_TYPE_CLASS1_ALG);
+	append_operation(desc, OP_ALG_ALGSEL_RNG | OP_TYPE_CLASS1_ALG |
+			 OP_ALG_PR_ON);
 	/* Store bytes */
 	append_fifo_store(desc, dst_dma, len, FIFOST_TYPE_RNGSTORE);
 
diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c
index bcbc832b208e..ad3f6aa921d3 100644
--- a/drivers/crypto/caam/ctrl.c
+++ b/drivers/crypto/caam/ctrl.c
@@ -36,7 +36,8 @@ static void build_instantiation_desc(u32 *desc, int handle, int do_sk)
 	init_job_desc(desc, 0);
 
 	op_flags = OP_TYPE_CLASS1_ALG | OP_ALG_ALGSEL_RNG |
-			(handle << OP_ALG_AAI_SHIFT) | OP_ALG_AS_INIT;
+			(handle << OP_ALG_AAI_SHIFT) | OP_ALG_AS_INIT |
+			OP_ALG_PR_ON;
 
 	/* INIT RNG in non-test mode */
 	append_operation(desc, op_flags);
@@ -276,12 +277,25 @@ static int instantiate_rng(struct device *ctrldev, int state_handle_mask,
 		return -ENOMEM;
 
 	for (sh_idx = 0; sh_idx < RNG4_MAX_HANDLES; sh_idx++) {
+		const u32 rdsta_if = RDSTA_IF0 << sh_idx;
+		const u32 rdsta_pr = RDSTA_PR0 << sh_idx;
+		const u32 rdsta_mask = rdsta_if | rdsta_pr;
 		/*
 		 * If the corresponding bit is set, this state handle
 		 * was initialized by somebody else, so it's left alone.
 		 */
-		if ((1 << sh_idx) & state_handle_mask)
-			continue;
+		if (rdsta_if & state_handle_mask) {
+			if (rdsta_pr & state_handle_mask)
+				continue;
+
+			dev_info(ctrldev,
+				 "RNG4 SH%d was previously instantiated without prediction resistance. Tearing it down\n",
+				 sh_idx);
+
+			ret = deinstantiate_rng(ctrldev, rdsta_if);
+			if (ret)
+				break;
+		}
 
 		/* Create the descriptor for instantiating RNG State Handle */
 		build_instantiation_desc(desc, sh_idx, gen_sk);
@@ -301,9 +315,9 @@ static int instantiate_rng(struct device *ctrldev, int state_handle_mask,
 		if (ret)
 			break;
 
-		rdsta_val = rd_reg32(&ctrl->r4tst[0].rdsta) & RDSTA_IFMASK;
+		rdsta_val = rd_reg32(&ctrl->r4tst[0].rdsta) & RDSTA_MASK;
 		if ((status && status != JRSTA_SSRC_JUMP_HALT_CC) ||
-		    !(rdsta_val & (1 << sh_idx))) {
+		    (rdsta_val & rdsta_mask) != rdsta_mask) {
 			ret = -EAGAIN;
 			break;
 		}
@@ -563,6 +577,15 @@ static void caam_remove_debugfs(void *root)
 }
 #endif
 
+static bool caam_mc_skip_hwrng_init(struct caam_drv_private *ctrlpriv)
+{
+	return ctrlpriv->mc_en;
+	/*
+	 * FIXME: Add check for MC firmware version that need
+	 * reinitialization due to PR bit
+	 */
+}
+
 /* Probe routine for CAAM top (controller) level */
 static int caam_probe(struct platform_device *pdev)
 {
@@ -783,7 +806,7 @@ static int caam_probe(struct platform_device *pdev)
 	 * already instantiated, do RNG instantiation
 	 * In case of SoCs with Management Complex, RNG is managed by MC f/w.
 	 */
-	if (!ctrlpriv->mc_en && rng_vid >= 4) {
+	if (!caam_mc_skip_hwrng_init(ctrlpriv) && rng_vid >= 4) {
 		ctrlpriv->rng4_sh_init =
 			rd_reg32(&ctrl->r4tst[0].rdsta);
 		/*
@@ -793,11 +816,11 @@ static int caam_probe(struct platform_device *pdev)
 		 * to regenerate these keys before the next POR.
 		 */
 		gen_sk = ctrlpriv->rng4_sh_init & RDSTA_SKVN ? 0 : 1;
-		ctrlpriv->rng4_sh_init &= RDSTA_IFMASK;
+		ctrlpriv->rng4_sh_init &= RDSTA_MASK;
 		do {
 			int inst_handles =
 				rd_reg32(&ctrl->r4tst[0].rdsta) &
-								RDSTA_IFMASK;
+								RDSTA_MASK;
 			/*
 			 * If either SH were instantiated by somebody else
 			 * (e.g. u-boot) then it is assumed that the entropy
@@ -837,7 +860,7 @@ static int caam_probe(struct platform_device *pdev)
 		 * Set handles init'ed by this module as the complement of the
 		 * already initialized ones
 		 */
-		ctrlpriv->rng4_sh_init = ~ctrlpriv->rng4_sh_init & RDSTA_IFMASK;
+		ctrlpriv->rng4_sh_init = ~ctrlpriv->rng4_sh_init & RDSTA_MASK;
 
 		/* Enable RDB bit so that RNG works faster */
 		clrsetbits_32(&ctrl->scfgr, 0, SCFGR_RDBENABLE);
diff --git a/drivers/crypto/caam/desc.h b/drivers/crypto/caam/desc.h
index 4b6854bf896a..e796d3cb9be8 100644
--- a/drivers/crypto/caam/desc.h
+++ b/drivers/crypto/caam/desc.h
@@ -1254,6 +1254,8 @@
 #define OP_ALG_ICV_OFF		(0 << OP_ALG_ICV_SHIFT)
 #define OP_ALG_ICV_ON		(1 << OP_ALG_ICV_SHIFT)
 
+#define OP_ALG_PR_ON		BIT(1)
+
 #define OP_ALG_DIR_SHIFT	0
 #define OP_ALG_DIR_MASK		1
 #define OP_ALG_DECRYPT		0
diff --git a/drivers/crypto/caam/regs.h b/drivers/crypto/caam/regs.h
index c191e8fd0fa7..0f810bc13b2b 100644
--- a/drivers/crypto/caam/regs.h
+++ b/drivers/crypto/caam/regs.h
@@ -524,9 +524,11 @@ struct rng4tst {
 	u32 rsvd1[40];
 #define RDSTA_SKVT 0x80000000
 #define RDSTA_SKVN 0x40000000
+#define RDSTA_PR0 BIT(4)
+#define RDSTA_PR1 BIT(5)
 #define RDSTA_IF0 0x00000001
 #define RDSTA_IF1 0x00000002
-#define RDSTA_IFMASK (RDSTA_IF1 | RDSTA_IF0)
+#define RDSTA_MASK (RDSTA_PR1 | RDSTA_PR0 | RDSTA_IF1 | RDSTA_IF0)
 	u32 rdsta;
 	u32 rsvd2[15];
 };
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v7 9/9] crypto: caam - limit single JD RNG output to maximum of 16 bytes
  2020-01-27 16:56 [PATCH v7 0/9] enable CAAM's HWRNG as default Andrey Smirnov
                   ` (7 preceding siblings ...)
  2020-01-27 16:56 ` [PATCH v7 8/9] crypto: caam - enable prediction resistance in HRWNG Andrey Smirnov
@ 2020-01-27 16:56 ` Andrey Smirnov
  8 siblings, 0 replies; 27+ messages in thread
From: Andrey Smirnov @ 2020-01-27 16:56 UTC (permalink / raw)
  To: linux-crypto
  Cc: Andrey Smirnov, Chris Healy, Lucas Stach, Horia Geantă,
	Herbert Xu, Iuliana Prodan, linux-kernel, linux-imx

In order to follow recommendation in SP800-90C (section "9.4 The
Oversampling-NRBG Construction") limit the output of "generate" JD
submitted to CAAM. See
https://lore.kernel.org/linux-crypto/VI1PR0402MB3485EF10976A4A69F90E5B0F98580@VI1PR0402MB3485.eurprd04.prod.outlook.com/
for more details.

This change should make CAAM's hwrng driver good enough to have 1024
quality rating.

Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Chris Healy <cphealy@gmail.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Horia Geantă <horia.geanta@nxp.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Iuliana Prodan <iuliana.prodan@nxp.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-imx@nxp.com
---
 drivers/crypto/caam/caamrng.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/crypto/caam/caamrng.c b/drivers/crypto/caam/caamrng.c
index 62f3a69ae837..c0981fce68b7 100644
--- a/drivers/crypto/caam/caamrng.c
+++ b/drivers/crypto/caam/caamrng.c
@@ -23,9 +23,9 @@
 #include "error.h"
 
 /* length of descriptors */
-#define CAAM_RNG_MAX_FIFO_STORE_SIZE	U16_MAX
+#define CAAM_RNG_MAX_FIFO_STORE_SIZE	16
 
-#define CAAM_RNG_FIFO_LEN		SZ_32K /* Must be a multiple of 2 */
+#define CAAM_RNG_FIFO_LEN		64 /* Must be a multiple of 2 */
 
 /*
  * See caam_init_desc()
@@ -128,7 +128,7 @@ static void caam_rng_fill_async(struct caam_rng_ctx *ctx)
 
 	sg_init_table(sg, ARRAY_SIZE(sg));
 	nents = kfifo_dma_in_prepare(&ctx->fifo, sg, ARRAY_SIZE(sg),
-				     CAAM_RNG_FIFO_LEN);
+				     CAAM_RNG_MAX_FIFO_STORE_SIZE);
 	if (!nents)
 		return;
 
@@ -246,6 +246,7 @@ int caam_rng_init(struct device *ctrldev)
 	ctx->rng.init    = caam_init;
 	ctx->rng.cleanup = caam_cleanup;
 	ctx->rng.read    = caam_read;
+	ctx->rng.quality = 1024;
 
 	dev_info(ctrldev, "registering rng-caam\n");
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [EXT] [PATCH v7 8/9] crypto: caam - enable prediction resistance in HRWNG
  2020-01-27 16:56 ` [PATCH v7 8/9] crypto: caam - enable prediction resistance in HRWNG Andrey Smirnov
@ 2020-02-04 13:09   ` Andrei Botila (OSS)
  2020-02-04 14:19     ` Horia Geanta
  0 siblings, 1 reply; 27+ messages in thread
From: Andrei Botila (OSS) @ 2020-02-04 13:09 UTC (permalink / raw)
  To: Andrey Smirnov, linux-crypto
  Cc: Chris Healy, Lucas Stach, Horia Geantă,
	Herbert Xu, Iuliana Prodan, linux-kernel, linux-imx

On 1/27/2020 6:56 PM, Andrey Smirnov wrote:
> +static bool caam_mc_skip_hwrng_init(struct caam_drv_private *ctrlpriv)
> +{
> +       return ctrlpriv->mc_en;
> +       /*
> +        * FIXME: Add check for MC firmware version that need
> +        * reinitialization due to PR bit
> +        */
> +}
> +

Hi Andrey,

Please use the following patch as a way to check for MC firmware version.
This should be squashed into current PATCH v7 8/9.

-- >8 --

From: Andrei Botila <andrei.botila@nxp.com>
Subject: [PATCH] crypto: caam - check mc firmware version before instantiating
  rng

Management Complex firmware with version lower than 10.20.0
doesn't provide prediction resistance support. Consider this
and only instantiate rng when mc f/w version is lower.

Signed-off-by: Andrei Botila <andrei.botila@nxp.com>
---
  drivers/crypto/caam/Kconfig |  1 +
  drivers/crypto/caam/ctrl.c  | 46 ++++++++++++++++++++++++++++---------
  2 files changed, 36 insertions(+), 11 deletions(-)

diff --git a/drivers/crypto/caam/Kconfig b/drivers/crypto/caam/Kconfig
index fac5b2e26610..d0e833121d8c 100644
--- a/drivers/crypto/caam/Kconfig
+++ b/drivers/crypto/caam/Kconfig
@@ -13,6 +13,7 @@ config CRYPTO_DEV_FSL_CAAM
  	depends on FSL_SOC || ARCH_MXC || ARCH_LAYERSCAPE
  	select SOC_BUS
  	select CRYPTO_DEV_FSL_CAAM_COMMON
+	imply FSL_MC_BUS
  	help
  	  Enables the driver module for Freescale's Cryptographic Accelerator
  	  and Assurance Module (CAAM), also known as the SEC version 4 (SEC4).
diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c
index 167a79fa3b8a..52b98e8d5175 100644
--- a/drivers/crypto/caam/ctrl.c
+++ b/drivers/crypto/caam/ctrl.c
@@ -10,6 +10,7 @@
  #include <linux/of_address.h>
  #include <linux/of_irq.h>
  #include <linux/sys_soc.h>
+#include <linux/fsl/mc.h>
  
  #include "compat.h"
  #include "regs.h"
@@ -578,14 +579,24 @@ static void caam_remove_debugfs(void *root)
  }
  #endif
  
-static bool caam_mc_skip_hwrng_init(struct caam_drv_private *ctrlpriv)
+#ifdef CONFIG_FSL_MC_BUS
+static bool check_version(struct fsl_mc_version *mc_version, u32 major,
+			  u32 minor, u32 revision)
  {
-	return ctrlpriv->mc_en;
-	/*
-	 * FIXME: Add check for MC firmware version that need
-	 * reinitialization due to PR bit
-	 */
+	if (mc_version->major > major)
+		return true;
+
+	if (mc_version->major == major) {
+		if (mc_version->minor > minor)
+			return true;
+
+		if (mc_version->minor == minor && mc_version->revision > 0)
+			return true;
+	}
+
+	return false;
  }
+#endif
  
  /* Probe routine for CAAM top (controller) level */
  static int caam_probe(struct platform_device *pdev)
@@ -605,6 +616,7 @@ static int caam_probe(struct platform_device *pdev)
  	u8 rng_vid;
  	int pg_size;
  	int BLOCK_OFFSET = 0;
+	bool pr_support = false;
  
  	ctrlpriv = devm_kzalloc(&pdev->dev, sizeof(*ctrlpriv), GFP_KERNEL);
  	if (!ctrlpriv)
@@ -691,16 +703,28 @@ static int caam_probe(struct platform_device *pdev)
  	/* Get the IRQ of the controller (for security violations only) */
  	ctrlpriv->secvio_irq = irq_of_parse_and_map(nprop, 0);
  
+	np = of_find_compatible_node(NULL, NULL, "fsl,qoriq-mc");
+	ctrlpriv->mc_en = !!np;
+	of_node_put(np);
+
+#ifdef CONFIG_FSL_MC_BUS
+	if (ctrlpriv->mc_en) {
+		struct fsl_mc_version *mc_version;
+
+		mc_version = fsl_mc_get_version();
+		if (mc_version)
+			pr_support = check_version(mc_version, 10, 20, 0);
+		else
+			return -EPROBE_DEFER;
+	}
+#endif
+
  	/*
  	 * Enable DECO watchdogs and, if this is a PHYS_ADDR_T_64BIT kernel,
  	 * long pointers in master configuration register.
  	 * In case of SoCs with Management Complex, MC f/w performs
  	 * the configuration.
  	 */
-	np = of_find_compatible_node(NULL, NULL, "fsl,qoriq-mc");
-	ctrlpriv->mc_en = !!np;
-	of_node_put(np);
-
  	if (!ctrlpriv->mc_en)
  		clrsetbits_32(&ctrl->mcr, MCFGR_AWCACHE_MASK,
  			      MCFGR_AWCACHE_CACH | MCFGR_AWCACHE_BUFF |
@@ -807,7 +831,7 @@ static int caam_probe(struct platform_device *pdev)
  	 * already instantiated, do RNG instantiation
  	 * In case of SoCs with Management Complex, RNG is managed by MC f/w.
  	 */
-	if (!caam_mc_skip_hwrng_init(ctrlpriv) && rng_vid >= 4) {
+	if (!(ctrlpriv->mc_en && pr_support) && rng_vid >= 4) {
  		ctrlpriv->rng4_sh_init =
  			rd_reg32(&ctrl->r4tst[0].rdsta);
  		/*
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 1/9] crypto: caam - allocate RNG instantiation descriptor with GFP_DMA
  2020-01-27 16:56 ` [PATCH v7 1/9] crypto: caam - allocate RNG instantiation descriptor with GFP_DMA Andrey Smirnov
@ 2020-02-04 14:08   ` Horia Geanta
  2020-02-24 16:40     ` Andrey Smirnov
  2020-03-16  4:14     ` Andrey Smirnov
  0 siblings, 2 replies; 27+ messages in thread
From: Horia Geanta @ 2020-02-04 14:08 UTC (permalink / raw)
  To: Andrey Smirnov, linux-crypto
  Cc: Chris Healy, Lucas Stach, Herbert Xu, Iuliana Prodan,
	dl-linux-imx, linux-kernel

On 1/27/2020 6:57 PM, Andrey Smirnov wrote:
> Be consistent with the rest of the codebase and use GFP_DMA when
> allocating memory for a CAAM JR descriptor.
> 
Please use GFP_DMA32 instead.
Device is not limited to less than 32 bits of addressing
in any of its incarnations.

s/GFP_DMA/GFP_DMA32 should be performed throughout caam driver.
(But of course, I wouldn't include this change in current patch series).

Horia

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [EXT] [PATCH v7 8/9] crypto: caam - enable prediction resistance in HRWNG
  2020-02-04 13:09   ` [EXT] " Andrei Botila (OSS)
@ 2020-02-04 14:19     ` Horia Geanta
  0 siblings, 0 replies; 27+ messages in thread
From: Horia Geanta @ 2020-02-04 14:19 UTC (permalink / raw)
  To: Andrei Botila (OSS),
	Andrey Smirnov, linux-crypto, Herbert Xu, Greg Kroah-Hartman
  Cc: Chris Healy, Lucas Stach, Iuliana Prodan, linux-kernel, dl-linux-imx

On 2/4/2020 3:09 PM, Andrei Botila (OSS) wrote:
> On 1/27/2020 6:56 PM, Andrey Smirnov wrote:
>> +static bool caam_mc_skip_hwrng_init(struct caam_drv_private *ctrlpriv)
>> +{
>> +       return ctrlpriv->mc_en;
>> +       /*
>> +        * FIXME: Add check for MC firmware version that need
>> +        * reinitialization due to PR bit
>> +        */
>> +}
>> +
> 
> Hi Andrey,
> 
> Please use the following patch as a way to check for MC firmware version.
> This should be squashed into current PATCH v7 8/9.
> 
Btw, this depends on the fsl-mc bus patch that adds fsl_mc_get_version()
bus: fsl-mc: add api to retrieve mc version
https://patchwork.kernel.org/patch/11352493/

As already stated, I would like to take the fsl-mc bus dependency
through the crypto tree.
Greg, Herbert - are you ok with this?

Thanks,
Horia

> -- >8 --
> 
> From: Andrei Botila <andrei.botila@nxp.com>
> Subject: [PATCH] crypto: caam - check mc firmware version before instantiating
>   rng
> 
> Management Complex firmware with version lower than 10.20.0
> doesn't provide prediction resistance support. Consider this
> and only instantiate rng when mc f/w version is lower.
> 
> Signed-off-by: Andrei Botila <andrei.botila@nxp.com>
> ---
>   drivers/crypto/caam/Kconfig |  1 +
>   drivers/crypto/caam/ctrl.c  | 46 ++++++++++++++++++++++++++++---------
>   2 files changed, 36 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/crypto/caam/Kconfig b/drivers/crypto/caam/Kconfig
> index fac5b2e26610..d0e833121d8c 100644
> --- a/drivers/crypto/caam/Kconfig
> +++ b/drivers/crypto/caam/Kconfig
> @@ -13,6 +13,7 @@ config CRYPTO_DEV_FSL_CAAM
>   	depends on FSL_SOC || ARCH_MXC || ARCH_LAYERSCAPE
>   	select SOC_BUS
>   	select CRYPTO_DEV_FSL_CAAM_COMMON
> +	imply FSL_MC_BUS
>   	help
>   	  Enables the driver module for Freescale's Cryptographic Accelerator
>   	  and Assurance Module (CAAM), also known as the SEC version 4 (SEC4).
> diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c
> index 167a79fa3b8a..52b98e8d5175 100644
> --- a/drivers/crypto/caam/ctrl.c
> +++ b/drivers/crypto/caam/ctrl.c
> @@ -10,6 +10,7 @@
>   #include <linux/of_address.h>
>   #include <linux/of_irq.h>
>   #include <linux/sys_soc.h>
> +#include <linux/fsl/mc.h>
>   
>   #include "compat.h"
>   #include "regs.h"
> @@ -578,14 +579,24 @@ static void caam_remove_debugfs(void *root)
>   }
>   #endif
>   
> -static bool caam_mc_skip_hwrng_init(struct caam_drv_private *ctrlpriv)
> +#ifdef CONFIG_FSL_MC_BUS
> +static bool check_version(struct fsl_mc_version *mc_version, u32 major,
> +			  u32 minor, u32 revision)
>   {
> -	return ctrlpriv->mc_en;
> -	/*
> -	 * FIXME: Add check for MC firmware version that need
> -	 * reinitialization due to PR bit
> -	 */
> +	if (mc_version->major > major)
> +		return true;
> +
> +	if (mc_version->major == major) {
> +		if (mc_version->minor > minor)
> +			return true;
> +
> +		if (mc_version->minor == minor && mc_version->revision > 0)
> +			return true;
> +	}
> +
> +	return false;
>   }
> +#endif
>   
>   /* Probe routine for CAAM top (controller) level */
>   static int caam_probe(struct platform_device *pdev)
> @@ -605,6 +616,7 @@ static int caam_probe(struct platform_device *pdev)
>   	u8 rng_vid;
>   	int pg_size;
>   	int BLOCK_OFFSET = 0;
> +	bool pr_support = false;
>   
>   	ctrlpriv = devm_kzalloc(&pdev->dev, sizeof(*ctrlpriv), GFP_KERNEL);
>   	if (!ctrlpriv)
> @@ -691,16 +703,28 @@ static int caam_probe(struct platform_device *pdev)
>   	/* Get the IRQ of the controller (for security violations only) */
>   	ctrlpriv->secvio_irq = irq_of_parse_and_map(nprop, 0);
>   
> +	np = of_find_compatible_node(NULL, NULL, "fsl,qoriq-mc");
> +	ctrlpriv->mc_en = !!np;
> +	of_node_put(np);
> +
> +#ifdef CONFIG_FSL_MC_BUS
> +	if (ctrlpriv->mc_en) {
> +		struct fsl_mc_version *mc_version;
> +
> +		mc_version = fsl_mc_get_version();
> +		if (mc_version)
> +			pr_support = check_version(mc_version, 10, 20, 0);
> +		else
> +			return -EPROBE_DEFER;
> +	}
> +#endif
> +
>   	/*
>   	 * Enable DECO watchdogs and, if this is a PHYS_ADDR_T_64BIT kernel,
>   	 * long pointers in master configuration register.
>   	 * In case of SoCs with Management Complex, MC f/w performs
>   	 * the configuration.
>   	 */
> -	np = of_find_compatible_node(NULL, NULL, "fsl,qoriq-mc");
> -	ctrlpriv->mc_en = !!np;
> -	of_node_put(np);
> -
>   	if (!ctrlpriv->mc_en)
>   		clrsetbits_32(&ctrl->mcr, MCFGR_AWCACHE_MASK,
>   			      MCFGR_AWCACHE_CACH | MCFGR_AWCACHE_BUFF |
> @@ -807,7 +831,7 @@ static int caam_probe(struct platform_device *pdev)
>   	 * already instantiated, do RNG instantiation
>   	 * In case of SoCs with Management Complex, RNG is managed by MC f/w.
>   	 */
> -	if (!caam_mc_skip_hwrng_init(ctrlpriv) && rng_vid >= 4) {
> +	if (!(ctrlpriv->mc_en && pr_support) && rng_vid >= 4) {
>   		ctrlpriv->rng4_sh_init =
>   			rd_reg32(&ctrl->r4tst[0].rdsta);
>   		/*
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 2/9] crypto: caam - use struct hwrng's .init for initialization
  2020-01-27 16:56 ` [PATCH v7 2/9] crypto: caam - use struct hwrng's .init for initialization Andrey Smirnov
@ 2020-02-11 14:39   ` Horia Geanta
  0 siblings, 0 replies; 27+ messages in thread
From: Horia Geanta @ 2020-02-11 14:39 UTC (permalink / raw)
  To: Andrey Smirnov, linux-crypto
  Cc: Chris Healy, Lucas Stach, Herbert Xu, Iuliana Prodan,
	linux-kernel, dl-linux-imx

On 1/27/2020 6:57 PM, Andrey Smirnov wrote:
> Make caamrng code a bit more symmetric by moving initialization code
> to .init hook of struct hwrng.
> 
> Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
> Cc: Chris Healy <cphealy@gmail.com>
> Cc: Lucas Stach <l.stach@pengutronix.de>
> Cc: Horia Geantă <horia.geanta@nxp.com>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: Iuliana Prodan <iuliana.prodan@nxp.com>
> Cc: linux-crypto@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-imx@nxp.com
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>

Thanks,
Horia

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 3/9] crypto: caam - use devm_kzalloc to allocate JR data
  2020-01-27 16:56 ` [PATCH v7 3/9] crypto: caam - use devm_kzalloc to allocate JR data Andrey Smirnov
@ 2020-02-11 18:23   ` Horia Geanta
  2020-02-24 16:39     ` Andrey Smirnov
  0 siblings, 1 reply; 27+ messages in thread
From: Horia Geanta @ 2020-02-11 18:23 UTC (permalink / raw)
  To: Andrey Smirnov, linux-crypto
  Cc: Chris Healy, Lucas Stach, Herbert Xu, Iuliana Prodan,
	linux-kernel, dl-linux-imx

On 1/27/2020 6:57 PM, Andrey Smirnov wrote:
> Use devm_kzalloc() to allocate JR data in order to make sure that it
> is initialized consistently every time.
> 
The commit message is a bit vague.

I assume this is needed in patch 4/9, which adds a new member (hwrng)
in caam_drv_private_jr structure.

If so, it's probably better to have the change merged into patch 4/9.

Horia

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 4/9] crypto: caam - drop global context pointer and init_done
  2020-01-27 16:56 ` [PATCH v7 4/9] crypto: caam - drop global context pointer and init_done Andrey Smirnov
@ 2020-02-11 18:53   ` Horia Geanta
  2020-02-11 20:57   ` Horia Geanta
  1 sibling, 0 replies; 27+ messages in thread
From: Horia Geanta @ 2020-02-11 18:53 UTC (permalink / raw)
  To: Andrey Smirnov, linux-crypto
  Cc: Chris Healy, Lucas Stach, Herbert Xu, Iuliana Prodan,
	linux-kernel, dl-linux-imx

On 1/27/2020 6:57 PM, Andrey Smirnov wrote:
> Leverage devres to get rid of code storing global context as well as
> init_done flag.
> 
> Original code also has a circular deallocation dependency where
> unregister_algs() -> caam_rng_exit() -> caam_jr_free() chain would
> only happen if all of JRs were freed. Fix this by moving
I wouldn't call this a circular dependency.

I think the discussion & patch here:
https://patchwork.kernel.org/patch/11140741/
https://lore.kernel.org/linux-crypto/VI1PR0402MB34859D108C03F3AB0F64EE6598B10@VI1PR0402MB3485.eurprd04.prod.outlook.com/

describes more accurately the problem:
"The issue in caamrng is actually that caam/jr driver (jr.c) tries to call
caam_rng_exit() on the last available jr device.
Instead, caam_rng_exit() must be called on the same jr device that
was used during caam_rng_init()."

> caam_rng_exit() outside of unregister_algs() and doing it specifically
> for JR that instantiated HWRNG.
> 
> Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
> Cc: Chris Healy <cphealy@gmail.com>
> Cc: Lucas Stach <l.stach@pengutronix.de>
> Cc: Horia Geantă <horia.geanta@nxp.com>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: Iuliana Prodan <iuliana.prodan@nxp.com>
> Cc: linux-crypto@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-imx@nxp.com
Current patch is similar with the one mentioned above and solves the problem.
Thus, with rewording the commit message (and squashing patch 3/9)
you can add my
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>

Thanks,
Horia


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 4/9] crypto: caam - drop global context pointer and init_done
  2020-01-27 16:56 ` [PATCH v7 4/9] crypto: caam - drop global context pointer and init_done Andrey Smirnov
  2020-02-11 18:53   ` Horia Geanta
@ 2020-02-11 20:57   ` Horia Geanta
  2020-02-24 16:40     ` Andrey Smirnov
  1 sibling, 1 reply; 27+ messages in thread
From: Horia Geanta @ 2020-02-11 20:57 UTC (permalink / raw)
  To: Andrey Smirnov, linux-crypto
  Cc: Chris Healy, Lucas Stach, Herbert Xu, Iuliana Prodan,
	linux-kernel, dl-linux-imx

On 1/27/2020 6:57 PM, Andrey Smirnov wrote:
> @@ -70,6 +70,7 @@ struct buf_data {
>  
>  /* rng per-device context */
>  struct caam_rng_ctx {
> +	struct hwrng rng;
>  	struct device *jrdev;
>  	dma_addr_t sh_desc_dma;
>  	u32 sh_desc[DESC_RNG_LEN];
> @@ -78,13 +79,10 @@ struct caam_rng_ctx {
>  	struct buf_data bufs[2];
>  };
>  
> -static struct caam_rng_ctx *rng_ctx;
> -
> -/*
> - * Variable used to avoid double free of resources in case
> - * algorithm registration was unsuccessful
> - */
> -static bool init_done;
> +static struct caam_rng_ctx *to_caam_rng_ctx(struct hwrng *r)
> +{
> +	return container_of(r, struct caam_rng_ctx, rng);
> +}
[...]
> -static struct hwrng caam_rng = {
> -	.name		= "rng-caam",
> -	.init           = caam_init,
> -	.cleanup	= caam_cleanup,
> -	.read		= caam_read,
> -};
[...]> +	ctx->rng.name    = "rng-caam";
> +	ctx->rng.init    = caam_init;
> +	ctx->rng.cleanup = caam_cleanup;
> +	ctx->rng.read    = caam_read;
An alternative (probably better) for storing caamrng context
is to use what is already available in struct hwrng:
 * @priv:               Private data, for use by the RNG driver.

Horia

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 6/9] crypto: caam - check if RNG job failed
  2020-01-27 16:56 ` [PATCH v7 6/9] crypto: caam - check if RNG job failed Andrey Smirnov
@ 2020-02-12 10:41   ` Horia Geanta
  2020-02-24 16:37     ` Andrey Smirnov
  0 siblings, 1 reply; 27+ messages in thread
From: Horia Geanta @ 2020-02-12 10:41 UTC (permalink / raw)
  To: Andrey Smirnov, linux-crypto
  Cc: Chris Healy, Lucas Stach, Herbert Xu, Iuliana Prodan,
	linux-kernel, dl-linux-imx

On 1/27/2020 6:57 PM, Andrey Smirnov wrote:
> @@ -60,12 +65,12 @@ static struct caam_rng_ctx *to_caam_rng_ctx(struct hwrng *r)
>  static void caam_rng_done(struct device *jrdev, u32 *desc, u32 err,
>  			  void *context)
>  {
> -	struct completion *done = context;
> +	struct caam_rng_job_ctx *jctx = context;
>  
>  	if (err)
> -		caam_jr_strstatus(jrdev, err);
> +		*jctx->err = caam_jr_strstatus(jrdev, err);
>  
> -	complete(done);
> +	complete(jctx->done);
>  }
>  
>  static u32 *caam_init_desc(u32 *desc, dma_addr_t dst_dma, int len)
> @@ -89,6 +94,10 @@ static int caam_rng_read_one(struct device *jrdev,
>  {
>  	dma_addr_t dst_dma;
>  	int err;
> +	struct caam_rng_job_ctx jctx = {
> +		.done = done,
> +		.err  = &err,
> +	};
>  
>  	len = min_t(int, len, CAAM_RNG_MAX_FIFO_STORE_SIZE);
>  
> @@ -101,7 +110,7 @@ static int caam_rng_read_one(struct device *jrdev,
>  	init_completion(done);
>  	err = caam_jr_enqueue(jrdev,
>  			      caam_init_desc(desc, dst_dma, len),
> -			      caam_rng_done, done);
> +			      caam_rng_done, &jctx);
AFAICT there's a race condition b/w caam_jr_enqueue() and caam_rng_done(),
both writing to "err":
caam_jr_enqueue()
	-> JR interrupt -> caam_jr_interrupt() -> tasklet_schedule()...
	-> spin_unlock_bh()
	-> caam_jr_dequeue() -> caam_rng_done() -> write err
	-> return 0 -> write err

Horia

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 5/9] crypto: caam - simplify RNG implementation
  2020-01-27 16:56 ` [PATCH v7 5/9] crypto: caam - simplify RNG implementation Andrey Smirnov
@ 2020-02-12 13:20   ` Horia Geanta
  2020-02-24 17:16     ` Andrey Smirnov
  0 siblings, 1 reply; 27+ messages in thread
From: Horia Geanta @ 2020-02-12 13:20 UTC (permalink / raw)
  To: Andrey Smirnov, linux-crypto
  Cc: Chris Healy, Lucas Stach, Herbert Xu, Iuliana Prodan,
	linux-kernel, dl-linux-imx

On 1/27/2020 6:57 PM, Andrey Smirnov wrote:
> Rework CAAM RNG implementation as follows:
> 
> - Make use of the fact that HWRNG supports partial reads and will
> handle such cases gracefully by removing recursion in caam_read()
> 
> - Convert blocking caam_read() codepath to do a single blocking job
> read directly into requested buffer, bypassing any intermediary
> buffers
> 
> - Convert async caam_read() codepath into a simple single
> reader/single writer FIFO use-case, thus simplifying concurrency
> handling and delegating buffer read/write position management to KFIFO
> subsystem.
> 
> - Leverage the same low level RNG data extraction code for both async
> and blocking caam_read() scenarios, get rid of the shared job
> descriptor and make non-shared one as a simple as possible (just
> HEADER + ALGORITHM OPERATION + FIFO STORE)
> 
> - Split private context from DMA related memory, so that the former
> could be allocated without GFP_DMA.
> 
> NOTE: On its face value this commit decreased throughput numbers
> reported by
> 
>   dd if=/dev/hwrng of=/dev/null bs=1 count=100K [iflag=nonblock]
> 
> by about 15%, however commits that enable prediction resistance and
Running dd as mentioned above, on a i.MX8MM board I see:
~ 20% decrease in non-blocking case (525 kB/s vs. 662 kB/s)
~ 75% decrease in blocking case (170 kB/s vs. 657 kB/s)

bs=1 is a bit drastic.
Using bs=16 the numbers look better in terms of overall speed,
however the relative degradation is still there:
~ 66% decrease in blocking case (3.5 MB/s vs. 10.1 MB/s)

> limit JR total size impact the performance so much and move the
> bottleneck such as to make this regression irrelevant.
> 
Yes, performance is greatly impacted by moving from a DRBG configuration
to a TRNG one.

The speed that I get with this patch set (1.3 kB/s)
is ~ 20% lower than theoretical output (1.583 kB/s) (see below).
Seeing this and also the relative decrease in case of DRBG
makes me wonder whether the SW overhead could be lowered.

Theoretical TRNG output speed in this configuration
can be computed as:
Speed = (SZ x CAAM_CLK_FREQ) / (RTSDCTL[ENT_DLY] x RTSDCTL[SAMP_SIZE]) [bps]

SZ is sample taken from the DRBG, b/w two consecutive reseedings.
As previously discussed, this is limited to 128 bits (16 bytes),
such that the DRBG behaves as a TRNG.

If:
-CAAM_CLK_FREQ = 166 MHz (as for i.MXM*)
-RTSDCTL[ENT_DLY] = 3200 clocks (default / POR value)
-RTSDCTL[SAMP_SIZE] = 512 (recommended; default / POR value is 2500)
then theoretical speed is 1.583 kB/s.

> @@ -45,38 +22,34 @@
>  #include "jr.h"
>  #include "error.h"
>  
> +/* length of descriptors */
This comment is misplaced, length of descriptors (CAAM_RNG_DESC_LEN)
is further below.

> +#define CAAM_RNG_MAX_FIFO_STORE_SIZE	U16_MAX
> +
> +#define CAAM_RNG_FIFO_LEN		SZ_32K /* Must be a multiple of 2 */
> +
>  /*
> - * Maximum buffer size: maximum number of random, cache-aligned bytes that
> - * will be generated and moved to seq out ptr (extlen not allowed)
> + * See caam_init_desc()
>   */
> -#define RN_BUF_SIZE			(0xffff / L1_CACHE_BYTES * \
> -					 L1_CACHE_BYTES)
> +#define CAAM_RNG_DESC_LEN (CAAM_CMD_SZ +				\
> +			   CAAM_CMD_SZ +				\
> +			   CAAM_CMD_SZ + CAAM_PTR_SZ_MAX)

> +typedef u8 caam_rng_desc[CAAM_RNG_DESC_LEN];
Is this really necessary?

> -static int caam_read(struct hwrng *rng, void *data, size_t max, bool wait)
> +static int caam_rng_read_one(struct device *jrdev,
> +			     void *dst, int len,
> +			     void *desc,
> +			     struct completion *done)
[...]
> +	len = min_t(int, len, CAAM_RNG_MAX_FIFO_STORE_SIZE);
For the blocking case, i.e. caam_read() -> caam_rng_read_one(),
"len" is at least 32B - cf. include/linux/hw_random.h:
 * @read:		New API. drivers can fill up to max bytes of data
 *			into the buffer. The buffer is aligned for any type
 *			and max is a multiple of 4 and >= 32 bytes.

For reducing the SW overhead, it might be worth optimizing this path.
For example, considering
min_t(int, len, CAAM_RNG_MAX_FIFO_STORE_SIZE) = CAAM_RNG_MAX_FIFO_STORE_SIZE
this means length is fixed, thus also ctx->desc[DESC_SYNC] descriptor is fixed
and its generation could be moved out of the hot path.

Horia


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 6/9] crypto: caam - check if RNG job failed
  2020-02-12 10:41   ` Horia Geanta
@ 2020-02-24 16:37     ` Andrey Smirnov
  0 siblings, 0 replies; 27+ messages in thread
From: Andrey Smirnov @ 2020-02-24 16:37 UTC (permalink / raw)
  To: Horia Geanta
  Cc: linux-crypto, Chris Healy, Lucas Stach, Herbert Xu,
	Iuliana Prodan, linux-kernel, dl-linux-imx

On Wed, Feb 12, 2020 at 2:41 AM Horia Geanta <horia.geanta@nxp.com> wrote:
>
> On 1/27/2020 6:57 PM, Andrey Smirnov wrote:
> > @@ -60,12 +65,12 @@ static struct caam_rng_ctx *to_caam_rng_ctx(struct hwrng *r)
> >  static void caam_rng_done(struct device *jrdev, u32 *desc, u32 err,
> >                         void *context)
> >  {
> > -     struct completion *done = context;
> > +     struct caam_rng_job_ctx *jctx = context;
> >
> >       if (err)
> > -             caam_jr_strstatus(jrdev, err);
> > +             *jctx->err = caam_jr_strstatus(jrdev, err);
> >
> > -     complete(done);
> > +     complete(jctx->done);
> >  }
> >
> >  static u32 *caam_init_desc(u32 *desc, dma_addr_t dst_dma, int len)
> > @@ -89,6 +94,10 @@ static int caam_rng_read_one(struct device *jrdev,
> >  {
> >       dma_addr_t dst_dma;
> >       int err;
> > +     struct caam_rng_job_ctx jctx = {
> > +             .done = done,
> > +             .err  = &err,
> > +     };
> >
> >       len = min_t(int, len, CAAM_RNG_MAX_FIFO_STORE_SIZE);
> >
> > @@ -101,7 +110,7 @@ static int caam_rng_read_one(struct device *jrdev,
> >       init_completion(done);
> >       err = caam_jr_enqueue(jrdev,
> >                             caam_init_desc(desc, dst_dma, len),
> > -                           caam_rng_done, done);
> > +                           caam_rng_done, &jctx);
> AFAICT there's a race condition b/w caam_jr_enqueue() and caam_rng_done(),
> both writing to "err":
> caam_jr_enqueue()
>         -> JR interrupt -> caam_jr_interrupt() -> tasklet_schedule()...
>         -> spin_unlock_bh()
>         -> caam_jr_dequeue() -> caam_rng_done() -> write err
>         -> return 0 -> write err
>

Yes, I thought it didn't really matter for calling
wait_for_completion(done), but now that I think on it again, it can
return wrong result code from vcaam_rng_read_one(). Will fix in v8.

Thanks,
Andrey Smirnov

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 3/9] crypto: caam - use devm_kzalloc to allocate JR data
  2020-02-11 18:23   ` Horia Geanta
@ 2020-02-24 16:39     ` Andrey Smirnov
  0 siblings, 0 replies; 27+ messages in thread
From: Andrey Smirnov @ 2020-02-24 16:39 UTC (permalink / raw)
  To: Horia Geanta
  Cc: linux-crypto, Chris Healy, Lucas Stach, Herbert Xu,
	Iuliana Prodan, linux-kernel, dl-linux-imx

On Tue, Feb 11, 2020 at 10:23 AM Horia Geanta <horia.geanta@nxp.com> wrote:
>
> On 1/27/2020 6:57 PM, Andrey Smirnov wrote:
> > Use devm_kzalloc() to allocate JR data in order to make sure that it
> > is initialized consistently every time.
> >
> The commit message is a bit vague.
>
> I assume this is needed in patch 4/9, which adds a new member (hwrng)
> in caam_drv_private_jr structure.
>
> If so, it's probably better to have the change merged into patch 4/9.
>

Yes, it is specifically needed for 4/9, but it's generally a good
practice to zero out allocated memory, so I made it a separate patch.
Will fold it into 4/9, since that's what you prefer.

Thanks,
Andrey Smirnov

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 1/9] crypto: caam - allocate RNG instantiation descriptor with GFP_DMA
  2020-02-04 14:08   ` Horia Geanta
@ 2020-02-24 16:40     ` Andrey Smirnov
  2020-03-16  4:14     ` Andrey Smirnov
  1 sibling, 0 replies; 27+ messages in thread
From: Andrey Smirnov @ 2020-02-24 16:40 UTC (permalink / raw)
  To: Horia Geanta
  Cc: linux-crypto, Chris Healy, Lucas Stach, Herbert Xu,
	Iuliana Prodan, dl-linux-imx, linux-kernel

On Tue, Feb 4, 2020 at 6:08 AM Horia Geanta <horia.geanta@nxp.com> wrote:
>
> On 1/27/2020 6:57 PM, Andrey Smirnov wrote:
> > Be consistent with the rest of the codebase and use GFP_DMA when
> > allocating memory for a CAAM JR descriptor.
> >
> Please use GFP_DMA32 instead.
> Device is not limited to less than 32 bits of addressing
> in any of its incarnations.
>
> s/GFP_DMA/GFP_DMA32 should be performed throughout caam driver.
> (But of course, I wouldn't include this change in current patch series).
>

Will do in v8.

Thanks,
Andrey Smirnov

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 4/9] crypto: caam - drop global context pointer and init_done
  2020-02-11 20:57   ` Horia Geanta
@ 2020-02-24 16:40     ` Andrey Smirnov
  0 siblings, 0 replies; 27+ messages in thread
From: Andrey Smirnov @ 2020-02-24 16:40 UTC (permalink / raw)
  To: Horia Geanta
  Cc: linux-crypto, Chris Healy, Lucas Stach, Herbert Xu,
	Iuliana Prodan, linux-kernel, dl-linux-imx

On Tue, Feb 11, 2020 at 12:57 PM Horia Geanta <horia.geanta@nxp.com> wrote:
>
> On 1/27/2020 6:57 PM, Andrey Smirnov wrote:
> > @@ -70,6 +70,7 @@ struct buf_data {
> >
> >  /* rng per-device context */
> >  struct caam_rng_ctx {
> > +     struct hwrng rng;
> >       struct device *jrdev;
> >       dma_addr_t sh_desc_dma;
> >       u32 sh_desc[DESC_RNG_LEN];
> > @@ -78,13 +79,10 @@ struct caam_rng_ctx {
> >       struct buf_data bufs[2];
> >  };
> >
> > -static struct caam_rng_ctx *rng_ctx;
> > -
> > -/*
> > - * Variable used to avoid double free of resources in case
> > - * algorithm registration was unsuccessful
> > - */
> > -static bool init_done;
> > +static struct caam_rng_ctx *to_caam_rng_ctx(struct hwrng *r)
> > +{
> > +     return container_of(r, struct caam_rng_ctx, rng);
> > +}
> [...]
> > -static struct hwrng caam_rng = {
> > -     .name           = "rng-caam",
> > -     .init           = caam_init,
> > -     .cleanup        = caam_cleanup,
> > -     .read           = caam_read,
> > -};
> [...]> +        ctx->rng.name    = "rng-caam";
> > +     ctx->rng.init    = caam_init;
> > +     ctx->rng.cleanup = caam_cleanup;
> > +     ctx->rng.read    = caam_read;
> An alternative (probably better) for storing caamrng context
> is to use what is already available in struct hwrng:
>  * @priv:               Private data, for use by the RNG driver.
>

OK, will do in v8.

Thanks,
Andrey Smirnov

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 5/9] crypto: caam - simplify RNG implementation
  2020-02-12 13:20   ` Horia Geanta
@ 2020-02-24 17:16     ` Andrey Smirnov
  0 siblings, 0 replies; 27+ messages in thread
From: Andrey Smirnov @ 2020-02-24 17:16 UTC (permalink / raw)
  To: Horia Geanta
  Cc: linux-crypto, Chris Healy, Lucas Stach, Herbert Xu,
	Iuliana Prodan, linux-kernel, dl-linux-imx

On Wed, Feb 12, 2020 at 5:20 AM Horia Geanta <horia.geanta@nxp.com> wrote:
>
> On 1/27/2020 6:57 PM, Andrey Smirnov wrote:
> > Rework CAAM RNG implementation as follows:
> >
> > - Make use of the fact that HWRNG supports partial reads and will
> > handle such cases gracefully by removing recursion in caam_read()
> >
> > - Convert blocking caam_read() codepath to do a single blocking job
> > read directly into requested buffer, bypassing any intermediary
> > buffers
> >
> > - Convert async caam_read() codepath into a simple single
> > reader/single writer FIFO use-case, thus simplifying concurrency
> > handling and delegating buffer read/write position management to KFIFO
> > subsystem.
> >
> > - Leverage the same low level RNG data extraction code for both async
> > and blocking caam_read() scenarios, get rid of the shared job
> > descriptor and make non-shared one as a simple as possible (just
> > HEADER + ALGORITHM OPERATION + FIFO STORE)
> >
> > - Split private context from DMA related memory, so that the former
> > could be allocated without GFP_DMA.
> >
> > NOTE: On its face value this commit decreased throughput numbers
> > reported by
> >
> >   dd if=/dev/hwrng of=/dev/null bs=1 count=100K [iflag=nonblock]
> >
> > by about 15%, however commits that enable prediction resistance and
> Running dd as mentioned above, on a i.MX8MM board I see:
> ~ 20% decrease in non-blocking case (525 kB/s vs. 662 kB/s)
> ~ 75% decrease in blocking case (170 kB/s vs. 657 kB/s)
>
> bs=1 is a bit drastic.
> Using bs=16 the numbers look better in terms of overall speed,
> however the relative degradation is still there:
> ~ 66% decrease in blocking case (3.5 MB/s vs. 10.1 MB/s)
>
> > limit JR total size impact the performance so much and move the
> > bottleneck such as to make this regression irrelevant.
> >
> Yes, performance is greatly impacted by moving from a DRBG configuration
> to a TRNG one.
>
> The speed that I get with this patch set (1.3 kB/s)
> is ~ 20% lower than theoretical output (1.583 kB/s) (see below).
> Seeing this and also the relative decrease in case of DRBG
> makes me wonder whether the SW overhead could be lowered.
>
> Theoretical TRNG output speed in this configuration
> can be computed as:
> Speed = (SZ x CAAM_CLK_FREQ) / (RTSDCTL[ENT_DLY] x RTSDCTL[SAMP_SIZE]) [bps]
>
> SZ is sample taken from the DRBG, b/w two consecutive reseedings.
> As previously discussed, this is limited to 128 bits (16 bytes),
> such that the DRBG behaves as a TRNG.
>
> If:
> -CAAM_CLK_FREQ = 166 MHz (as for i.MXM*)
> -RTSDCTL[ENT_DLY] = 3200 clocks (default / POR value)
> -RTSDCTL[SAMP_SIZE] = 512 (recommended; default / POR value is 2500)
> then theoretical speed is 1.583 kB/s.
>
> > @@ -45,38 +22,34 @@
> >  #include "jr.h"
> >  #include "error.h"
> >
> > +/* length of descriptors */
> This comment is misplaced, length of descriptors (CAAM_RNG_DESC_LEN)
> is further below.
>
> > +#define CAAM_RNG_MAX_FIFO_STORE_SIZE U16_MAX
> > +
> > +#define CAAM_RNG_FIFO_LEN            SZ_32K /* Must be a multiple of 2 */
> > +
> >  /*
> > - * Maximum buffer size: maximum number of random, cache-aligned bytes that
> > - * will be generated and moved to seq out ptr (extlen not allowed)
> > + * See caam_init_desc()
> >   */
> > -#define RN_BUF_SIZE                  (0xffff / L1_CACHE_BYTES * \
> > -                                      L1_CACHE_BYTES)
> > +#define CAAM_RNG_DESC_LEN (CAAM_CMD_SZ +                             \
> > +                        CAAM_CMD_SZ +                                \
> > +                        CAAM_CMD_SZ + CAAM_PTR_SZ_MAX)
>
> > +typedef u8 caam_rng_desc[CAAM_RNG_DESC_LEN];
> Is this really necessary?
>

Will drop in v8.

> > -static int caam_read(struct hwrng *rng, void *data, size_t max, bool wait)
> > +static int caam_rng_read_one(struct device *jrdev,
> > +                          void *dst, int len,
> > +                          void *desc,
> > +                          struct completion *done)
> [...]
> > +     len = min_t(int, len, CAAM_RNG_MAX_FIFO_STORE_SIZE);
> For the blocking case, i.e. caam_read() -> caam_rng_read_one(),
> "len" is at least 32B - cf. include/linux/hw_random.h:
>  * @read:               New API. drivers can fill up to max bytes of data
>  *                      into the buffer. The buffer is aligned for any type
>  *                      and max is a multiple of 4 and >= 32 bytes.
>
> For reducing the SW overhead, it might be worth optimizing this path.
> For example, considering
> min_t(int, len, CAAM_RNG_MAX_FIFO_STORE_SIZE) = CAAM_RNG_MAX_FIFO_STORE_SIZE

This isn't true until next commit. Here CAAM_RNG_MAX_FIFO_STORE_SIZE
is still 32K, so "len" is going to be smaller. I'll update the code to
assume fixed length once CAAM_RNG_MAX_FIFO_STORE_SIZE becomes 16
bytes.

> this means length is fixed, thus also ctx->desc[DESC_SYNC] descriptor is fixed
> and its generation could be moved out of the hot path.

That descriptor also includes DMA address of a buffer we are given, so
the descriptor is never fixed.

Thanks,
Andrey Smirnov

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 7/9] crypto: caam - invalidate entropy register during RNG initialization
  2020-01-27 16:56 ` [PATCH v7 7/9] crypto: caam - invalidate entropy register during RNG initialization Andrey Smirnov
@ 2020-02-25 20:26   ` Horia Geanta
  0 siblings, 0 replies; 27+ messages in thread
From: Horia Geanta @ 2020-02-25 20:26 UTC (permalink / raw)
  To: Andrey Smirnov, linux-crypto
  Cc: Aymen Sghaier, Vipul Kumar, Chris Healy, Lucas Stach, Herbert Xu,
	Iuliana Prodan, linux-kernel, dl-linux-imx

On 1/27/2020 6:57 PM, Andrey Smirnov wrote:
> In order to make sure that we always use non-stale entropy data, change
> the code to invalidate entropy register during RNG initialization.
> 
> Signed-off-by: Aymen Sghaier <aymen.sghaier@nxp.com>
> Signed-off-by: Vipul Kumar <vipul_kumar@mentor.com>
> [andrew.smirnov@gmail.com ported to upstream kernel, rewrote commit msg]
> Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
> Cc: Chris Healy <cphealy@gmail.com>
> Cc: Lucas Stach <l.stach@pengutronix.de>
> Cc: Horia Geantă <horia.geanta@nxp.com>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: Iuliana Prodan <iuliana.prodan@nxp.com>
> Cc: linux-crypto@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-imx@nxp.com
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>

Thanks,
Horia


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 1/9] crypto: caam - allocate RNG instantiation descriptor with GFP_DMA
  2020-02-04 14:08   ` Horia Geanta
  2020-02-24 16:40     ` Andrey Smirnov
@ 2020-03-16  4:14     ` Andrey Smirnov
  2020-03-17 15:20       ` Horia Geantă
  1 sibling, 1 reply; 27+ messages in thread
From: Andrey Smirnov @ 2020-03-16  4:14 UTC (permalink / raw)
  To: Horia Geanta
  Cc: linux-crypto, Chris Healy, Lucas Stach, Herbert Xu,
	Iuliana Prodan, dl-linux-imx, linux-kernel

On Tue, Feb 4, 2020 at 6:08 AM Horia Geanta <horia.geanta@nxp.com> wrote:
>
> On 1/27/2020 6:57 PM, Andrey Smirnov wrote:
> > Be consistent with the rest of the codebase and use GFP_DMA when
> > allocating memory for a CAAM JR descriptor.
> >
> Please use GFP_DMA32 instead.
> Device is not limited to less than 32 bits of addressing
> in any of its incarnations.
>
> s/GFP_DMA/GFP_DMA32 should be performed throughout caam driver.
> (But of course, I wouldn't include this change in current patch series).
>

Hmm, I am triggering
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/slub.c?h=v5.6-rc6#n1721
by using GFP_DMA32. AFAICT, GFP_DMA32 can't be used in SLUB/SLAB
allocated memory:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/internal.h?h=v5.6-rc6#n32

I'll stick with GFP_DMA for now, unless you have a different preference.

Thanks,
Andrey Smirnov

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 1/9] crypto: caam - allocate RNG instantiation descriptor with GFP_DMA
  2020-03-16  4:14     ` Andrey Smirnov
@ 2020-03-17 15:20       ` Horia Geantă
  0 siblings, 0 replies; 27+ messages in thread
From: Horia Geantă @ 2020-03-17 15:20 UTC (permalink / raw)
  To: Andrey Smirnov
  Cc: linux-crypto, Chris Healy, Lucas Stach, Herbert Xu,
	Iuliana Prodan, dl-linux-imx, linux-kernel

On 3/16/2020 6:15 AM, Andrey Smirnov wrote:
> On Tue, Feb 4, 2020 at 6:08 AM Horia Geanta <horia.geanta@nxp.com> wrote:
>>
>> On 1/27/2020 6:57 PM, Andrey Smirnov wrote:
>>> Be consistent with the rest of the codebase and use GFP_DMA when
>>> allocating memory for a CAAM JR descriptor.
>>>
>> Please use GFP_DMA32 instead.
>> Device is not limited to less than 32 bits of addressing
>> in any of its incarnations.
>>
>> s/GFP_DMA/GFP_DMA32 should be performed throughout caam driver.
>> (But of course, I wouldn't include this change in current patch series).
>>
> 
> Hmm, I am triggering
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/slub.c?h=v5.6-rc6#n1721
> by using GFP_DMA32. AFAICT, GFP_DMA32 can't be used in SLUB/SLAB
> allocated memory:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/internal.h?h=v5.6-rc6#n32
> 
Indeed.

> I'll stick with GFP_DMA for now, unless you have a different preference.
> 
I'm ok with this, will deal with memory allocation separately.

Thanks,
Horia

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2020-03-17 15:20 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-27 16:56 [PATCH v7 0/9] enable CAAM's HWRNG as default Andrey Smirnov
2020-01-27 16:56 ` [PATCH v7 1/9] crypto: caam - allocate RNG instantiation descriptor with GFP_DMA Andrey Smirnov
2020-02-04 14:08   ` Horia Geanta
2020-02-24 16:40     ` Andrey Smirnov
2020-03-16  4:14     ` Andrey Smirnov
2020-03-17 15:20       ` Horia Geantă
2020-01-27 16:56 ` [PATCH v7 2/9] crypto: caam - use struct hwrng's .init for initialization Andrey Smirnov
2020-02-11 14:39   ` Horia Geanta
2020-01-27 16:56 ` [PATCH v7 3/9] crypto: caam - use devm_kzalloc to allocate JR data Andrey Smirnov
2020-02-11 18:23   ` Horia Geanta
2020-02-24 16:39     ` Andrey Smirnov
2020-01-27 16:56 ` [PATCH v7 4/9] crypto: caam - drop global context pointer and init_done Andrey Smirnov
2020-02-11 18:53   ` Horia Geanta
2020-02-11 20:57   ` Horia Geanta
2020-02-24 16:40     ` Andrey Smirnov
2020-01-27 16:56 ` [PATCH v7 5/9] crypto: caam - simplify RNG implementation Andrey Smirnov
2020-02-12 13:20   ` Horia Geanta
2020-02-24 17:16     ` Andrey Smirnov
2020-01-27 16:56 ` [PATCH v7 6/9] crypto: caam - check if RNG job failed Andrey Smirnov
2020-02-12 10:41   ` Horia Geanta
2020-02-24 16:37     ` Andrey Smirnov
2020-01-27 16:56 ` [PATCH v7 7/9] crypto: caam - invalidate entropy register during RNG initialization Andrey Smirnov
2020-02-25 20:26   ` Horia Geanta
2020-01-27 16:56 ` [PATCH v7 8/9] crypto: caam - enable prediction resistance in HRWNG Andrey Smirnov
2020-02-04 13:09   ` [EXT] " Andrei Botila (OSS)
2020-02-04 14:19     ` Horia Geanta
2020-01-27 16:56 ` [PATCH v7 9/9] crypto: caam - limit single JD RNG output to maximum of 16 bytes Andrey Smirnov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.