linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v4] crypto: Add Allwinner Security System crypto accelerator
@ 2014-07-12 12:59 LABBE Corentin
  2014-07-12 12:59 ` [PATCH v4 1/3] ARM: sun7i: dt: Add Security System to A20 SoC DTS LABBE Corentin
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: LABBE Corentin @ 2014-07-12 12:59 UTC (permalink / raw)
  To: linux-arm-kernel


Hello

This is the driver for the Security System included in the Allwinner A20 SoC.
The Security System (SS for short) is a hardware cryptographic accelerator that supports the AES/MD5/SHA1/DES/3DES/PRNG algorithms.
It can also be found on other Allwinner SoCs:
- The A10s and A31 diagrams describe it precisely (AES/DES/3DES/MD5/SHA1/PRNG)
- The A10 and A13 manuals give the same SS datasheet as the A20
- The A23 documentation mentions a security system, but without details
However, I do not have access to any of that hardware, so tests are welcome.

This driver currently supports:
- MD5 and SHA1 hash algorithms
- AES block cipher in CBC mode with 128/192/256-bit keys.
- DES and 3DES block cipher in CBC mode
The driver exposes all those algorithms through the kernel cryptographic API.
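
As an illustration of the AES key sizes listed above, here is a minimal
userspace sketch (not code from the patch) of the key-length check a setkey
routine performs. The enum and function names are hypothetical and exist only
for this sketch; the driver itself stores the SS_AES_*BITS values from
sunxi-ss.h in op->mode.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mode identifiers for this sketch only; the real driver
 * uses the SS_AES_*BITS register values defined in sunxi-ss.h. */
enum sketch_aes_mode {
	SKETCH_AES_INVALID = -1,
	SKETCH_AES_128,
	SKETCH_AES_192,
	SKETCH_AES_256,
};

/* Map a key length in bytes to a mode, rejecting every other length. */
static enum sketch_aes_mode sketch_aes_mode_for_key(size_t keylen_bytes)
{
	switch (keylen_bytes) {
	case 128 / 8:
		return SKETCH_AES_128;
	case 192 / 8:
		return SKETCH_AES_192;
	case 256 / 8:
		return SKETCH_AES_256;
	default:
		return SKETCH_AES_INVALID;
	}
}

/* Small self-check that a test program can call. */
static void sketch_check_key_sizes(void)
{
	assert(sketch_aes_mode_for_key(16) == SKETCH_AES_128);
	assert(sketch_aes_mode_for_key(24) == SKETCH_AES_192);
	assert(sketch_aes_mode_for_key(32) == SKETCH_AES_256);
	assert(sketch_aes_mode_for_key(20) == SKETCH_AES_INVALID);
}
```

Only 16, 24 and 32 byte keys are accepted; any other length is rejected, which
is where the driver returns -EINVAL.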

The driver supports only the CPU-driven (aka poll mode) transfer mode, since the DMA engine of the A20 does not have a driver yet.
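
To make "poll mode" concrete, here is a minimal userspace sketch of the idea,
assuming a toy device model rather than the real hardware: the CPU asks how
much room the input FIFO has, writes that many 32-bit words, then reads back
whatever output is ready, and repeats until all output has been collected.
Everything named fake_* or FAKE_* is hypothetical, and the XOR stands in for
the actual cipher. The 32-word depth mirrors the default FIFO space mentioned
in the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_FIFO_WORDS 32          /* FIFO depth in 32-bit words */
#define FAKE_MASK       0xA5A5A5A5u /* placeholder "cipher": XOR with a mask */

/* Toy stand-in for the device; not the real Security System. A word
 * written to the RX side is transformed at once and queued for TX. */
struct fake_ss {
	uint32_t fifo[FAKE_FIFO_WORDS];
	unsigned int head;  /* index of the next word to read */
	unsigned int count; /* words currently queued */
};

static unsigned int fake_rx_spaces(const struct fake_ss *dev)
{
	return FAKE_FIFO_WORDS - dev->count;
}

static unsigned int fake_tx_ready(const struct fake_ss *dev)
{
	return dev->count;
}

static void fake_write_rx(struct fake_ss *dev, uint32_t word)
{
	assert(dev->count < FAKE_FIFO_WORDS);
	dev->fifo[(dev->head + dev->count) % FAKE_FIFO_WORDS] = word ^ FAKE_MASK;
	dev->count++;
}

static uint32_t fake_read_tx(struct fake_ss *dev)
{
	uint32_t word;

	assert(dev->count > 0);
	word = dev->fifo[dev->head];
	dev->head = (dev->head + 1) % FAKE_FIFO_WORDS;
	dev->count--;
	return word;
}

/* CPU-driven transfer: loop until every output word has been read. */
static void poll_transfer(struct fake_ss *dev, const uint32_t *src,
			  uint32_t *dst, size_t nwords)
{
	size_t ileft = nwords;
	size_t oleft = nwords;

	while (oleft > 0) {
		unsigned int n;

		/* Feed as many input words as the RX side has room for. */
		for (n = fake_rx_spaces(dev); n > 0 && ileft > 0; n--, ileft--)
			fake_write_rx(dev, *src++);

		/* Drain whatever the TX side has produced so far. */
		for (n = fake_tx_ready(dev); n > 0 && oleft > 0; n--, oleft--)
			*dst++ = fake_read_tx(dev);
	}
}
```

The loop terminates on the output count, as the driver's loops do, because the
last output words only become available after the last input words have been
written.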

Changes since v3:
- Remove all algorithms options from Kconfig, so now only one module is used
- Add the sunxi_ss_cipher function to unify mode calculation
- Remove the sunxi_cipher_exit empty function
- Add some missing mutex_unlock()
- Drop PRNG support; I am waiting for more comments on its results before re-enabling it.

Changes since v2:
- Fix Makefile and Kconfig for static kernel.

Changes since v1:
- annotate ss->base as __iomem
- regroup all mutex in the ss_ctx structure
- split the driver into 7 modules (core md5 sha1 aes des 3des prng) in the sunxi-ss directory
- use dev_exit_p() for .remove
- added missing CRYPTO_BLKCIPHER dep in Kconfig
- use ahash instead of shash
- use ablkcipher instead of blkcipher
- use crypto_rng_ctx instead of crypto_tfm_ctx
- set seed as a u32
- drop useless comment decoration
- drop useless debug
- ss_ctx is now a static pointer and the whole structure is allocated
- fix the platform_get_resource/devm_ioremap_resource pattern
- invert getting die id and configuring clock
- set clock value as a const unsigned long
- add MODULE_ALIAS
- use define names more consistently (SS_xxx)
- fix PRNG errors
- respell SS to Security System in DT documentation

* [PATCH v4 1/3] ARM: sun7i: dt: Add Security System to A20 SoC DTS
  2014-07-12 12:59 [PATCH v4] crypto: Add Allwinner Security System crypto accelerator LABBE Corentin
@ 2014-07-12 12:59 ` LABBE Corentin
  2014-07-12 12:59 ` [PATCH v4 2/3] ARM: sunxi: dt: Add DT bindings documentation for SUNXI Security System LABBE Corentin
  2014-07-12 12:59 ` [PATCH v4 3/3] crypto: Add Allwinner Security System crypto accelerator LABBE Corentin
  2 siblings, 0 replies; 14+ messages in thread
From: LABBE Corentin @ 2014-07-12 12:59 UTC (permalink / raw)
  To: linux-arm-kernel

The Security System is a hardware cryptographic accelerator that supports
the AES/MD5/SHA1/DES/3DES/PRNG algorithms.
It can be found on many Allwinner SoCs.

This patch enables the Security System in the Allwinner A20 SoC device tree.

Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
---
 arch/arm/boot/dts/sun7i-a20.dtsi | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm/boot/dts/sun7i-a20.dtsi b/arch/arm/boot/dts/sun7i-a20.dtsi
index 6acdbdf..19b1ced 100644
--- a/arch/arm/boot/dts/sun7i-a20.dtsi
+++ b/arch/arm/boot/dts/sun7i-a20.dtsi
@@ -529,6 +529,14 @@
 			status = "disabled";
 		};
 
+		crypto: crypto-engine@01c15000 {
+			compatible = "allwinner,sun7i-a20-crypto";
+			reg = <0x01c15000 0x1000>;
+			interrupts = <0 86 4>;
+			clocks = <&ahb_gates 5>, <&ss_clk>;
+			clock-names = "ahb", "mod";
+		};
+
 		spi2: spi@01c17000 {
 			compatible = "allwinner,sun4i-a10-spi";
 			reg = <0x01c17000 0x1000>;
-- 
1.8.5.5

* [PATCH v4 2/3] ARM: sunxi: dt: Add DT bindings documentation for SUNXI Security System
  2014-07-12 12:59 [PATCH v4] crypto: Add Allwinner Security System crypto accelerator LABBE Corentin
  2014-07-12 12:59 ` [PATCH v4 1/3] ARM: sun7i: dt: Add Security System to A20 SoC DTS LABBE Corentin
@ 2014-07-12 12:59 ` LABBE Corentin
  2014-07-25 10:10   ` Maxime Ripard
  2014-07-12 12:59 ` [PATCH v4 3/3] crypto: Add Allwinner Security System crypto accelerator LABBE Corentin
  2 siblings, 1 reply; 14+ messages in thread
From: LABBE Corentin @ 2014-07-12 12:59 UTC (permalink / raw)
  To: linux-arm-kernel

This patch adds documentation for Device-Tree bindings for the Security
System cryptographic accelerator driver.

Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
---
 Documentation/devicetree/bindings/crypto/sunxi-ss.txt | 9 +++++++++
 1 file changed, 9 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/crypto/sunxi-ss.txt

diff --git a/Documentation/devicetree/bindings/crypto/sunxi-ss.txt b/Documentation/devicetree/bindings/crypto/sunxi-ss.txt
new file mode 100644
index 0000000..a566803
--- /dev/null
+++ b/Documentation/devicetree/bindings/crypto/sunxi-ss.txt
@@ -0,0 +1,9 @@
+* Allwinner Security System found on A20 SoC
+
+Required properties:
+- compatible : Should be "allwinner,sun7i-a20-crypto".
+- reg: Should contain the Security System register location and length.
+- interrupts: Should contain the IRQ line for the Security System.
+- clocks : Phandles to the AHB gate clock and the module clock of the Security System
+- clock-names : Names of the clocks, should be "ahb" and "mod".
+
-- 
1.8.5.5

* [PATCH v4 3/3] crypto: Add Allwinner Security System crypto accelerator
  2014-07-12 12:59 [PATCH v4] crypto: Add Allwinner Security System crypto accelerator LABBE Corentin
  2014-07-12 12:59 ` [PATCH v4 1/3] ARM: sun7i: dt: Add Security System to A20 SoC DTS LABBE Corentin
  2014-07-12 12:59 ` [PATCH v4 2/3] ARM: sunxi: dt: Add DT bindings documentation for SUNXI Security System LABBE Corentin
@ 2014-07-12 12:59 ` LABBE Corentin
  2014-07-23 13:16   ` Herbert Xu
                     ` (2 more replies)
  2 siblings, 3 replies; 14+ messages in thread
From: LABBE Corentin @ 2014-07-12 12:59 UTC (permalink / raw)
  To: linux-arm-kernel

Add support for the Security System included in the Allwinner A20 SoC.
The Security System is a hardware cryptographic accelerator that supports the AES/MD5/SHA1/DES/3DES/PRNG algorithms.

Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
---
 drivers/crypto/Kconfig                    |  17 ++
 drivers/crypto/Makefile                   |   1 +
 drivers/crypto/sunxi-ss/Makefile          |   2 +
 drivers/crypto/sunxi-ss/sunxi-ss-cipher.c | 461 ++++++++++++++++++++++++++++++
 drivers/crypto/sunxi-ss/sunxi-ss-core.c   | 308 ++++++++++++++++++++
 drivers/crypto/sunxi-ss/sunxi-ss-hash.c   | 241 ++++++++++++++++
 drivers/crypto/sunxi-ss/sunxi-ss.h        | 183 ++++++++++++
 7 files changed, 1213 insertions(+)
 create mode 100644 drivers/crypto/sunxi-ss/Makefile
 create mode 100644 drivers/crypto/sunxi-ss/sunxi-ss-cipher.c
 create mode 100644 drivers/crypto/sunxi-ss/sunxi-ss-core.c
 create mode 100644 drivers/crypto/sunxi-ss/sunxi-ss-hash.c
 create mode 100644 drivers/crypto/sunxi-ss/sunxi-ss.h

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index 03ccdb0..a2acda4 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -418,4 +418,21 @@ config CRYPTO_DEV_MXS_DCP
 	  To compile this driver as a module, choose M here: the module
 	  will be called mxs-dcp.
 
+config CRYPTO_DEV_SUNXI_SS
+	tristate "Support for Allwinner Security System cryptographic accelerator"
+	depends on ARCH_SUNXI
+	select CRYPTO_MD5
+	select CRYPTO_SHA1
+	select CRYPTO_AES
+	select CRYPTO_DES
+	select CRYPTO_BLKCIPHER
+	help
+	  Some Allwinner SoCs have a crypto accelerator named
+	  Security System. Select this if you want to use it.
+	  The Security System handles AES/DES/3DES ciphers in CBC mode
+	  and SHA1 and MD5 hash algorithms.
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called sunxi-ss.
+
 endif # CRYPTO_HW
diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
index 482f090..855292a 100644
--- a/drivers/crypto/Makefile
+++ b/drivers/crypto/Makefile
@@ -23,3 +23,4 @@ obj-$(CONFIG_CRYPTO_DEV_S5P) += s5p-sss.o
 obj-$(CONFIG_CRYPTO_DEV_SAHARA) += sahara.o
 obj-$(CONFIG_CRYPTO_DEV_TALITOS) += talitos.o
 obj-$(CONFIG_CRYPTO_DEV_UX500) += ux500/
+obj-$(CONFIG_CRYPTO_DEV_SUNXI_SS) += sunxi-ss/
diff --git a/drivers/crypto/sunxi-ss/Makefile b/drivers/crypto/sunxi-ss/Makefile
new file mode 100644
index 0000000..8bb287d
--- /dev/null
+++ b/drivers/crypto/sunxi-ss/Makefile
@@ -0,0 +1,2 @@
+obj-$(CONFIG_CRYPTO_DEV_SUNXI_SS) += sunxi-ss.o
+sunxi-ss-y += sunxi-ss-core.o sunxi-ss-hash.o sunxi-ss-cipher.o
diff --git a/drivers/crypto/sunxi-ss/sunxi-ss-cipher.c b/drivers/crypto/sunxi-ss/sunxi-ss-cipher.c
new file mode 100644
index 0000000..c2422f7
--- /dev/null
+++ b/drivers/crypto/sunxi-ss/sunxi-ss-cipher.c
@@ -0,0 +1,461 @@
+/*
+ * sunxi-ss-cipher.c - hardware cryptographic accelerator for Allwinner A20 SoC
+ *
+ * Copyright (C) 2013-2014 Corentin LABBE <clabbe.montjoie@gmail.com>
+ *
+ * This file adds support for the AES, DES and 3DES ciphers in CBC mode
+ * (AES with 128-, 192- and 256-bit keys).
+ *
+ * You can find the datasheet at
+ * http://dl.linux-sunxi.org/A20/A20%20User%20Manual%202013-03-22.pdf
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+#include "sunxi-ss.h"
+
+extern struct sunxi_ss_ctx *ss;
+
+static int sunxi_ss_cipher(struct ablkcipher_request *areq, u32 mode)
+{
+	struct crypto_ablkcipher *tfm = crypto_ablkcipher_reqtfm(areq);
+	struct sunxi_req_ctx *op = crypto_ablkcipher_ctx(tfm);
+	const char *cipher_type;
+
+	cipher_type = crypto_tfm_alg_name(crypto_ablkcipher_tfm(tfm));
+
+	if (areq->nbytes == 0) {
+		mutex_unlock(&ss->lock);
+		return 0;
+	}
+
+	if (areq->info == NULL) {
+		dev_err(ss->dev, "ERROR: Empty IV\n");
+		mutex_unlock(&ss->lock);
+		return -EINVAL;
+	}
+
+	if (areq->src == NULL || areq->dst == NULL) {
+		dev_err(ss->dev, "ERROR: Some SGs are NULL\n");
+		mutex_unlock(&ss->lock);
+		return -EINVAL;
+	}
+
+	if (strcmp("cbc(aes)", cipher_type) == 0) {
+		op->mode |= SS_OP_AES | SS_CBC | SS_ENABLED | mode;
+		return sunxi_ss_aes_poll(areq);
+	}
+	if (strcmp("cbc(des)", cipher_type) == 0) {
+		op->mode = SS_OP_DES | SS_CBC | SS_ENABLED | mode;
+		return sunxi_ss_des_poll(areq);
+	}
+	if (strcmp("cbc(des3_ede)", cipher_type) == 0) {
+		op->mode = SS_OP_3DES | SS_CBC | SS_ENABLED | mode;
+		return sunxi_ss_des_poll(areq);
+	}
+	dev_err(ss->dev, "ERROR: Cipher %s not handled\n", cipher_type);
+	mutex_unlock(&ss->lock);
+	return -EINVAL;
+}
+
+int sunxi_ss_cipher_encrypt(struct ablkcipher_request *areq)
+{
+	return sunxi_ss_cipher(areq, SS_ENCRYPTION);
+}
+
+int sunxi_ss_cipher_decrypt(struct ablkcipher_request *areq)
+{
+	return sunxi_ss_cipher(areq, SS_DECRYPTION);
+}
+
+int sunxi_ss_cipher_init(struct crypto_tfm *tfm)
+{
+	struct sunxi_req_ctx *op = crypto_tfm_ctx(tfm);
+
+	mutex_lock(&ss->lock);
+
+	memset(op, 0, sizeof(struct sunxi_req_ctx));
+	return 0;
+}
+
+int sunxi_ss_aes_poll(struct ablkcipher_request *areq)
+{
+	u32 spaces;
+	struct crypto_ablkcipher *tfm = crypto_ablkcipher_reqtfm(areq);
+	struct sunxi_req_ctx *op = crypto_ablkcipher_ctx(tfm);
+	unsigned int ivsize = crypto_ablkcipher_ivsize(tfm);
+	/* when activating SS, the default FIFO space is 32 */
+	u32 rx_cnt = 32;
+	u32 tx_cnt = 0;
+	u32 v;
+	int i;
+	struct scatterlist *in_sg;
+	struct scatterlist *out_sg;
+	void *src_addr;
+	void *dst_addr;
+	unsigned int ileft = areq->nbytes;
+	unsigned int oleft = areq->nbytes;
+	unsigned int sgileft = areq->src->length;
+	unsigned int sgoleft = areq->dst->length;
+	unsigned int todo;
+	u32 *src32;
+	u32 *dst32;
+
+	in_sg = areq->src;
+	out_sg = areq->dst;
+	for (i = 0; i < op->keylen; i += 4)
+		writel(*(op->key + i/4), ss->base + SS_KEY0 + i);
+	if (areq->info != NULL) {
+		for (i = 0; i < 4 && i < ivsize / 4; i++) {
+			v = *(u32 *)(areq->info + i * 4);
+			writel(v, ss->base + SS_IV0 + i * 4);
+		}
+	}
+	writel(op->mode, ss->base + SS_CTL);
+
+	/* If we have only one SG, we can use kmap_atomic */
+	if (sg_next(in_sg) == NULL && sg_next(out_sg) == NULL) {
+		src_addr = kmap_atomic(sg_page(in_sg)) + in_sg->offset;
+		if (src_addr == NULL) {
+			dev_err(ss->dev, "kmap_atomic error for src SG\n");
+			writel(0, ss->base + SS_CTL);
+			mutex_unlock(&ss->lock);
+			return -EINVAL;
+		}
+		dst_addr = kmap_atomic(sg_page(out_sg)) + out_sg->offset;
+		if (dst_addr == NULL) {
+			dev_err(ss->dev, "kmap_atomic error for dst SG\n");
+			writel(0, ss->base + SS_CTL);
+			kunmap_atomic(src_addr);
+			mutex_unlock(&ss->lock);
+			return -EINVAL;
+		}
+		src32 = (u32 *)src_addr;
+		dst32 = (u32 *)dst_addr;
+		ileft = areq->nbytes / 4;
+		oleft = areq->nbytes / 4;
+		i = 0;
+		do {
+			if (ileft > 0 && rx_cnt > 0) {
+				todo = min(rx_cnt, ileft);
+				ileft -= todo;
+				do {
+					writel_relaxed(*src32++,
+						       ss->base +
+						       SS_RXFIFO);
+					todo--;
+				} while (todo > 0);
+			}
+			if (tx_cnt > 0) {
+				todo = min(tx_cnt, oleft);
+				oleft -= todo;
+				do {
+					*dst32++ = readl_relaxed(ss->base +
+								SS_TXFIFO);
+					todo--;
+				} while (todo > 0);
+			}
+			spaces = readl_relaxed(ss->base + SS_FCSR);
+			rx_cnt = SS_RXFIFO_SPACES(spaces);
+			tx_cnt = SS_TXFIFO_SPACES(spaces);
+		} while (oleft > 0);
+		writel(0, ss->base + SS_CTL);
+		kunmap_atomic(dst_addr);
+		kunmap_atomic(src_addr);
+		mutex_unlock(&ss->lock);
+		return 0;
+	}
+
+	/* If we have more than one SG, we cannot use kmap_atomic since
+	 * we hold the mapping too long
+	 */
+	src_addr = kmap(sg_page(in_sg)) + in_sg->offset;
+	if (src_addr == NULL) {
+		dev_err(ss->dev, "KMAP error for src SG\n");
+		mutex_unlock(&ss->lock);
+		return -EINVAL;
+	}
+	dst_addr = kmap(sg_page(out_sg)) + out_sg->offset;
+	if (dst_addr == NULL) {
+		kunmap(sg_page(in_sg));
+		dev_err(ss->dev, "KMAP error for dst SG\n");
+		mutex_unlock(&ss->lock);
+		return -EINVAL;
+	}
+	src32 = (u32 *)src_addr;
+	dst32 = (u32 *)dst_addr;
+	ileft = areq->nbytes / 4;
+	oleft = areq->nbytes / 4;
+	sgileft = in_sg->length / 4;
+	sgoleft = out_sg->length / 4;
+	do {
+		spaces = readl_relaxed(ss->base + SS_FCSR);
+		rx_cnt = SS_RXFIFO_SPACES(spaces);
+		tx_cnt = SS_TXFIFO_SPACES(spaces);
+		todo = min3(rx_cnt, ileft, sgileft);
+		if (todo > 0) {
+			ileft -= todo;
+			sgileft -= todo;
+		}
+		while (todo > 0) {
+			writel_relaxed(*src32++, ss->base + SS_RXFIFO);
+			todo--;
+		}
+		if (in_sg != NULL && sgileft == 0 && ileft > 0) {
+			kunmap(sg_page(in_sg));
+			in_sg = sg_next(in_sg);
+			while (in_sg != NULL && in_sg->length == 0)
+				in_sg = sg_next(in_sg);
+			if (in_sg != NULL && ileft > 0) {
+				src_addr = kmap(sg_page(in_sg)) + in_sg->offset;
+				if (src_addr == NULL) {
+					dev_err(ss->dev, "ERROR: KMAP for src SG\n");
+					mutex_unlock(&ss->lock);
+					return -EINVAL;
+				}
+				src32 = src_addr;
+				sgileft = in_sg->length / 4;
+			}
+		}
+		/* do not test oleft since when oleft == 0 we have finished */
+		todo = min3(tx_cnt, oleft, sgoleft);
+		if (todo > 0) {
+			oleft -= todo;
+			sgoleft -= todo;
+		}
+		while (todo > 0) {
+			*dst32++ = readl_relaxed(ss->base + SS_TXFIFO);
+			todo--;
+		}
+		if (out_sg != NULL && sgoleft == 0 && oleft >= 0) {
+			kunmap(sg_page(out_sg));
+			out_sg = sg_next(out_sg);
+			while (out_sg != NULL && out_sg->length == 0)
+				out_sg = sg_next(out_sg);
+			if (out_sg != NULL && oleft > 0) {
+				dst_addr = kmap(sg_page(out_sg)) +
+					out_sg->offset;
+				if (dst_addr == NULL) {
+					dev_err(ss->dev, "KMAP error\n");
+					mutex_unlock(&ss->lock);
+					return -EINVAL;
+				}
+				dst32 = dst_addr;
+				sgoleft = out_sg->length / 4;
+			}
+		}
+	} while (oleft > 0);
+
+	writel(0, ss->base + SS_CTL);
+	mutex_unlock(&ss->lock);
+	return 0;
+}
+
+/* Pure CPU way of doing DES/3DES with SS
+ * Since DES and 3DES SGs could be smaller than 4 bytes, I use sg_copy_to_buffer
+ * to "linearize" them.
+ * The problem with that is that I alloc (2 x areq->nbytes) for buf_in/buf_out
+ * TODO: change this system
+ * SGsrc -> buf_in -> SS -> buf_out -> SGdst */
+int sunxi_ss_des_poll(struct ablkcipher_request *areq)
+{
+	u32 value, spaces;
+	size_t nb_in_sg_tx, nb_in_sg_rx;
+	size_t ir, it;
+	struct crypto_ablkcipher *tfm = crypto_ablkcipher_reqtfm(areq);
+	struct sunxi_req_ctx *op = crypto_ablkcipher_ctx(tfm);
+	unsigned int ivsize = crypto_ablkcipher_ivsize(tfm);
+	u32 tx_cnt = 0;
+	u32 rx_cnt = 0;
+	u32 v;
+	int i;
+	int no_chunk = 1;
+
+	/* if we have only SGs with size multiple of 4,
+	 * we can use the SS AES function */
+	struct scatterlist *in_sg;
+	struct scatterlist *out_sg;
+
+	in_sg = areq->src;
+	out_sg = areq->dst;
+
+	while (in_sg != NULL && no_chunk == 1) {
+		if ((in_sg->length % 4) != 0)
+			no_chunk = 0;
+		in_sg = sg_next(in_sg);
+	}
+	while (out_sg != NULL && no_chunk == 1) {
+		if ((out_sg->length % 4) != 0)
+			no_chunk = 0;
+		out_sg = sg_next(out_sg);
+	}
+
+	if (no_chunk == 1)
+		return sunxi_ss_aes_poll(areq);
+	in_sg = areq->src;
+	out_sg = areq->dst;
+
+	nb_in_sg_rx = sg_nents(in_sg);
+	nb_in_sg_tx = sg_nents(out_sg);
+
+	mutex_lock(&ss->bufin_lock);
+	if (ss->buf_in == NULL) {
+		ss->buf_in = kmalloc(areq->nbytes, GFP_KERNEL);
+		ss->buf_in_size = areq->nbytes;
+	} else {
+		if (areq->nbytes > ss->buf_in_size) {
+			kfree(ss->buf_in);
+			ss->buf_in = kmalloc(areq->nbytes, GFP_KERNEL);
+			ss->buf_in_size = areq->nbytes;
+		}
+	}
+	if (ss->buf_in == NULL) {
+		ss->buf_in_size = 0;
+		mutex_unlock(&ss->bufin_lock);
+		dev_err(ss->dev, "Unable to allocate pages.\n");
+		return -ENOMEM;
+	}
+	if (ss->buf_out == NULL) {
+		mutex_lock(&ss->bufout_lock);
+		ss->buf_out = kmalloc(areq->nbytes, GFP_KERNEL);
+		if (ss->buf_out == NULL) {
+			ss->buf_out_size = 0;
+			mutex_unlock(&ss->bufout_lock);
+			dev_err(ss->dev, "Unable to allocate pages.\n");
+			return -ENOMEM;
+		}
+		ss->buf_out_size = areq->nbytes;
+		mutex_unlock(&ss->bufout_lock);
+	} else {
+		if (areq->nbytes > ss->buf_out_size) {
+			mutex_lock(&ss->bufout_lock);
+			kfree(ss->buf_out);
+			ss->buf_out = kmalloc(areq->nbytes, GFP_KERNEL);
+			if (ss->buf_out == NULL) {
+				ss->buf_out_size = 0;
+				mutex_unlock(&ss->bufout_lock);
+				dev_err(ss->dev, "Unable to allocate pages.\n");
+				return -ENOMEM;
+			}
+			ss->buf_out_size = areq->nbytes;
+			mutex_unlock(&ss->bufout_lock);
+		}
+	}
+
+	sg_copy_to_buffer(areq->src, nb_in_sg_rx, ss->buf_in, areq->nbytes);
+
+	ir = 0;
+	it = 0;
+
+	for (i = 0; i < op->keylen; i += 4)
+		writel(*(op->key + i/4), ss->base + SS_KEY0 + i);
+	if (areq->info != NULL) {
+		for (i = 0; i < 4 && i < ivsize / 4; i++) {
+			v = *(u32 *)(areq->info + i * 4);
+			writel(v, ss->base + SS_IV0 + i * 4);
+		}
+	}
+	writel(op->mode, ss->base + SS_CTL);
+
+	do {
+		if (rx_cnt == 0 || tx_cnt == 0) {
+			spaces = readl(ss->base + SS_FCSR);
+			rx_cnt = SS_RXFIFO_SPACES(spaces);
+			tx_cnt = SS_TXFIFO_SPACES(spaces);
+		}
+		if (rx_cnt > 0 && ir < areq->nbytes) {
+			do {
+				value = *(u32 *)(ss->buf_in + ir);
+				writel(value, ss->base + SS_RXFIFO);
+				ir += 4;
+				rx_cnt--;
+			} while (rx_cnt > 0 && ir < areq->nbytes);
+		}
+		if (tx_cnt > 0 && it < areq->nbytes) {
+			do {
+				value = readl(ss->base + SS_TXFIFO);
+				*(u32 *)(ss->buf_out + it) = value;
+				it += 4;
+				tx_cnt--;
+			} while (tx_cnt > 0 && it < areq->nbytes);
+		}
+		if (ir == areq->nbytes) {
+			mutex_unlock(&ss->bufin_lock);
+			ir++;
+		}
+	} while (it < areq->nbytes);
+
+	writel(0, ss->base + SS_CTL);
+	mutex_unlock(&ss->lock);
+
+	/* a simple optimization, since we don't need the hardware for this copy
+	 * we release the lock and do the copy. With that we gain 5/10% perf */
+	mutex_lock(&ss->bufout_lock);
+	sg_copy_from_buffer(areq->dst, nb_in_sg_tx, ss->buf_out, areq->nbytes);
+
+	mutex_unlock(&ss->bufout_lock);
+	return 0;
+}
+
+/* check and set the AES key, prepare the mode to be used */
+int sunxi_ss_aes_setkey(struct crypto_ablkcipher *tfm, const u8 *key,
+		unsigned int keylen)
+{
+	struct sunxi_req_ctx *op = crypto_ablkcipher_ctx(tfm);
+
+	switch (keylen) {
+	case 128 / 8:
+		op->mode = SS_AES_128BITS;
+		break;
+	case 192 / 8:
+		op->mode = SS_AES_192BITS;
+		break;
+	case 256 / 8:
+		op->mode = SS_AES_256BITS;
+		break;
+	default:
+		dev_err(ss->dev, "ERROR: Invalid keylen %u\n", keylen);
+		crypto_ablkcipher_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
+		mutex_unlock(&ss->lock);
+		return -EINVAL;
+	}
+	op->keylen = keylen;
+	memcpy(op->key, key, keylen);
+	return 0;
+}
+
+/* check and set the DES key, prepare the mode to be used */
+int sunxi_ss_des_setkey(struct crypto_ablkcipher *tfm, const u8 *key,
+		unsigned int keylen)
+{
+	struct sunxi_req_ctx *op = crypto_ablkcipher_ctx(tfm);
+
+	if (keylen != DES_KEY_SIZE) {
+		dev_err(ss->dev, "Invalid keylen %u\n", keylen);
+		crypto_ablkcipher_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
+		mutex_unlock(&ss->lock);
+		return -EINVAL;
+	}
+	op->keylen = keylen;
+	memcpy(op->key, key, keylen);
+	return 0;
+}
+
+/* check and set the 3DES key, prepare the mode to be used */
+int sunxi_ss_des3_setkey(struct crypto_ablkcipher *tfm, const u8 *key,
+		unsigned int keylen)
+{
+	struct sunxi_req_ctx *op = crypto_ablkcipher_ctx(tfm);
+
+	if (keylen != 3 * DES_KEY_SIZE) {
+		dev_err(ss->dev, "Invalid keylen %u\n", keylen);
+		crypto_ablkcipher_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
+		mutex_unlock(&ss->lock);
+		return -EINVAL;
+	}
+	op->keylen = keylen;
+	memcpy(op->key, key, keylen);
+	return 0;
+}
diff --git a/drivers/crypto/sunxi-ss/sunxi-ss-core.c b/drivers/crypto/sunxi-ss/sunxi-ss-core.c
new file mode 100644
index 0000000..c76016e
--- /dev/null
+++ b/drivers/crypto/sunxi-ss/sunxi-ss-core.c
@@ -0,0 +1,308 @@
+/*
+ * sunxi-ss-core.c - hardware cryptographic accelerator for Allwinner A20 SoC
+ *
+ * Copyright (C) 2013-2014 Corentin LABBE <clabbe.montjoie@gmail.com>
+ *
+ * Core file which registers crypto algorithms supported by the SS.
+ *
+ * You can find the datasheet at
+ * http://dl.linux-sunxi.org/A20/A20%20User%20Manual%202013-03-22.pdf
+ *
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+#include <linux/clk.h>
+#include <linux/crypto.h>
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <crypto/scatterwalk.h>
+#include <linux/scatterlist.h>
+#include <linux/interrupt.h>
+#include <linux/delay.h>
+
+#include "sunxi-ss.h"
+
+struct sunxi_ss_ctx *ss;
+
+/* General notes:
+ * I cannot use a key/IV cache because each time one of these changes, ALL of
+ * it needs to be rewritten (rewrite SS_KEYX and SS_IVX).
+ * And for example, with dm-crypt, the IV changes on each request.
+ *
+ * After each request the device must be disabled with a write of 0 in SS_CTL
+ *
+ * For performance reasons, we use writel_relaxed/readl_relaxed for all
+ * operations on RX and TX FIFO and also SS_FCSR.
+ * For all other registers, we use writel/readl.
+ * See http://permalink.gmane.org/gmane.linux.ports.arm.kernel/117644
+ * and http://permalink.gmane.org/gmane.linux.ports.arm.kernel/117640
+ * */
+
+static struct ahash_alg sunxi_md5_alg = {
+	.init = sunxi_hash_init,
+	.update = sunxi_hash_update,
+	.final = sunxi_hash_final,
+	.finup = sunxi_hash_finup,
+	.digest = sunxi_hash_digest,
+	.halg = {
+		.digestsize = MD5_DIGEST_SIZE,
+		.base = {
+			.cra_name = "md5",
+			.cra_driver_name = "md5-sunxi-ss",
+			.cra_priority = 300,
+			.cra_alignmask = 3,
+			.cra_flags = CRYPTO_ALG_TYPE_AHASH | CRYPTO_ALG_ASYNC,
+			.cra_blocksize = MD5_HMAC_BLOCK_SIZE,
+			.cra_ctxsize = sizeof(struct sunxi_req_ctx),
+			.cra_module = THIS_MODULE,
+			.cra_type = &crypto_ahash_type
+		}
+	}
+};
+static struct ahash_alg sunxi_sha1_alg = {
+	.init = sunxi_hash_init,
+	.update = sunxi_hash_update,
+	.final = sunxi_hash_final,
+	.finup = sunxi_hash_finup,
+	.digest = sunxi_hash_digest,
+	.halg = {
+		.digestsize = SHA1_DIGEST_SIZE,
+		.base = {
+			.cra_name = "sha1",
+			.cra_driver_name = "sha1-sunxi-ss",
+			.cra_priority = 300,
+			.cra_alignmask = 3,
+			.cra_flags = CRYPTO_ALG_TYPE_AHASH | CRYPTO_ALG_ASYNC,
+			.cra_blocksize = SHA1_BLOCK_SIZE,
+			.cra_ctxsize = sizeof(struct sunxi_req_ctx),
+			.cra_module = THIS_MODULE,
+			.cra_type = &crypto_ahash_type
+		}
+	}
+};
+
+static struct crypto_alg sunxi_cipher_algs[] = {
+{
+	.cra_name = "cbc(aes)",
+	.cra_driver_name = "cbc-aes-sunxi-ss",
+	.cra_priority = 300,
+	.cra_blocksize = AES_BLOCK_SIZE,
+	.cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER,
+	.cra_ctxsize = sizeof(struct sunxi_req_ctx),
+	.cra_module = THIS_MODULE,
+	.cra_alignmask = 3,
+	.cra_type = &crypto_ablkcipher_type,
+	.cra_init = sunxi_ss_cipher_init,
+	.cra_u = {
+		.ablkcipher = {
+			.min_keysize    = AES_MIN_KEY_SIZE,
+			.max_keysize    = AES_MAX_KEY_SIZE,
+			.ivsize         = AES_BLOCK_SIZE,
+			.setkey         = sunxi_ss_aes_setkey,
+			.encrypt        = sunxi_ss_cipher_encrypt,
+			.decrypt        = sunxi_ss_cipher_decrypt,
+		}
+	}
+}, {
+	.cra_name = "cbc(des)",
+	.cra_driver_name = "cbc-des-sunxi-ss",
+	.cra_priority = 300,
+	.cra_blocksize = DES_BLOCK_SIZE,
+	.cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER,
+	.cra_ctxsize = sizeof(struct sunxi_req_ctx),
+	.cra_module = THIS_MODULE,
+	.cra_alignmask = 3,
+	.cra_type = &crypto_ablkcipher_type,
+	.cra_init = sunxi_ss_cipher_init,
+	.cra_u.ablkcipher = {
+		.min_keysize    = DES_KEY_SIZE,
+		.max_keysize    = DES_KEY_SIZE,
+		.ivsize         = DES_BLOCK_SIZE,
+		.setkey         = sunxi_ss_des_setkey,
+		.encrypt        = sunxi_ss_cipher_encrypt,
+		.decrypt        = sunxi_ss_cipher_decrypt,
+	}
+}, {
+	.cra_name = "cbc(des3_ede)",
+	.cra_driver_name = "cbc-des3-sunxi-ss",
+	.cra_priority = 300,
+	.cra_blocksize = DES3_EDE_BLOCK_SIZE,
+	.cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER,
+	.cra_ctxsize = sizeof(struct sunxi_req_ctx),
+	.cra_module = THIS_MODULE,
+	.cra_alignmask = 3,
+	.cra_type = &crypto_ablkcipher_type,
+	.cra_init = sunxi_ss_cipher_init,
+	.cra_u.ablkcipher = {
+		.min_keysize    = DES3_EDE_KEY_SIZE,
+		.max_keysize    = DES3_EDE_KEY_SIZE,
+		.ivsize         = DES3_EDE_BLOCK_SIZE,
+		.setkey         = sunxi_ss_des3_setkey,
+		.encrypt        = sunxi_ss_cipher_encrypt,
+		.decrypt        = sunxi_ss_cipher_decrypt,
+	}
+}
+};
+
+static int sunxi_ss_probe(struct platform_device *pdev)
+{
+	struct resource *res;
+	u32 v;
+	int err;
+	unsigned long cr;
+	const unsigned long cr_ahb = 24 * 1000 * 1000;
+	const unsigned long cr_mod = 150 * 1000 * 1000;
+
+	if (!pdev->dev.of_node)
+		return -ENODEV;
+
+	ss = devm_kzalloc(&pdev->dev, sizeof(*ss), GFP_KERNEL);
+	if (ss == NULL)
+		return -ENOMEM;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	ss->base = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(ss->base)) {
+		dev_err(&pdev->dev, "Cannot request MMIO\n");
+		return PTR_ERR(ss->base);
+	}
+
+	ss->ssclk = devm_clk_get(&pdev->dev, "mod");
+	if (IS_ERR(ss->ssclk)) {
+		err = PTR_ERR(ss->ssclk);
+		dev_err(&pdev->dev, "Cannot get SS clock err=%d\n", err);
+		return err;
+	}
+	dev_dbg(&pdev->dev, "clock ss acquired\n");
+
+	ss->busclk = devm_clk_get(&pdev->dev, "ahb");
+	if (IS_ERR(ss->busclk)) {
+		err = PTR_ERR(ss->busclk);
+		dev_err(&pdev->dev, "Cannot get AHB SS clock err=%d\n", err);
+		return err;
+	}
+	dev_dbg(&pdev->dev, "clock ahb_ss acquired\n");
+
+	/* Enable the clocks */
+	err = clk_prepare_enable(ss->busclk);
+	if (err != 0) {
+		dev_err(&pdev->dev, "Cannot prepare_enable busclk\n");
+		return err;
+	}
+	err = clk_prepare_enable(ss->ssclk);
+	if (err != 0) {
+		dev_err(&pdev->dev, "Cannot prepare_enable ssclk\n");
+		clk_disable_unprepare(ss->busclk);
+		return err;
+	}
+
+	/* Check that the clocks have the correct rates given in the datasheet */
+	/* Try to set the clock to the maximum allowed */
+	err = clk_set_rate(ss->ssclk, cr_mod);
+	if (err != 0) {
+		dev_err(&pdev->dev, "Cannot set clock rate to ssclk\n");
+		clk_disable_unprepare(ss->ssclk);
+		clk_disable_unprepare(ss->busclk);
+		return err;
+	}
+	cr = clk_get_rate(ss->busclk);
+	if (cr >= cr_ahb)
+		dev_dbg(&pdev->dev, "Clock bus %lu (%lu MHz) (must be >= %lu)\n",
+				cr, cr / 1000000, cr_ahb);
+	else
+		dev_warn(&pdev->dev, "Clock bus %lu (%lu MHz) (must be >= %lu)\n",
+				cr, cr / 1000000, cr_ahb);
+	cr = clk_get_rate(ss->ssclk);
+	if (cr <= cr_mod)
+		dev_dbg(&pdev->dev, "Clock ss %lu (%lu MHz) (must be <= %lu)\n",
+				cr, cr / 1000000, cr_mod);
+	else {
+		dev_warn(&pdev->dev, "Clock ss is at %lu (%lu MHz) (must be <= %lu)\n",
+				cr, cr / 1000000, cr_mod);
+	}
+
+	/* TODO: Could this information be useful? */
+	writel(SS_ENABLED, ss->base + SS_CTL);
+	v = readl(ss->base + SS_CTL);
+	v >>= 16;
+	v &= 0x07;
+	dev_info(&pdev->dev, "Die ID %u\n", v);
+	writel(0, ss->base + SS_CTL);
+
+	ss->dev = &pdev->dev;
+
+	mutex_init(&ss->lock);
+	mutex_init(&ss->bufin_lock);
+	mutex_init(&ss->bufout_lock);
+
+	err = crypto_register_ahash(&sunxi_md5_alg);
+	if (err)
+		goto error_md5;
+	err = crypto_register_ahash(&sunxi_sha1_alg);
+	if (err)
+		goto error_sha1;
+	err = crypto_register_algs(sunxi_cipher_algs,
+			ARRAY_SIZE(sunxi_cipher_algs));
+	if (err)
+		goto error_ciphers;
+
+	return 0;
+error_ciphers:
+	crypto_unregister_ahash(&sunxi_sha1_alg);
+error_sha1:
+	crypto_unregister_ahash(&sunxi_md5_alg);
+error_md5:
+	clk_disable_unprepare(ss->ssclk);
+	clk_disable_unprepare(ss->busclk);
+	return err;
+}
+
+static int __exit sunxi_ss_remove(struct platform_device *pdev)
+{
+	if (!pdev->dev.of_node)
+		return 0;
+
+	crypto_unregister_ahash(&sunxi_md5_alg);
+	crypto_unregister_ahash(&sunxi_sha1_alg);
+	crypto_unregister_algs(sunxi_cipher_algs,
+			ARRAY_SIZE(sunxi_cipher_algs));
+
+	if (ss->buf_in != NULL)
+		kfree(ss->buf_in);
+	if (ss->buf_out != NULL)
+		kfree(ss->buf_out);
+
+	writel(0, ss->base + SS_CTL);
+	clk_disable_unprepare(ss->busclk);
+	clk_disable_unprepare(ss->ssclk);
+	return 0;
+}
+
+/*============================================================================*/
+/*============================================================================*/
+static const struct of_device_id a20ss_crypto_of_match_table[] = {
+	{ .compatible = "allwinner,sun7i-a20-crypto" },
+	{}
+};
+MODULE_DEVICE_TABLE(of, a20ss_crypto_of_match_table);
+
+static struct platform_driver sunxi_ss_driver = {
+	.probe          = sunxi_ss_probe,
+	.remove         = __exit_p(sunxi_ss_remove),
+	.driver         = {
+		.owner          = THIS_MODULE,
+		.name           = "sunxi-ss",
+		.of_match_table	= a20ss_crypto_of_match_table,
+	},
+};
+
+module_platform_driver(sunxi_ss_driver);
+
+MODULE_DESCRIPTION("Allwinner Security System cryptographic accelerator");
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Corentin LABBE <clabbe.montjoie@gmail.com>");
diff --git a/drivers/crypto/sunxi-ss/sunxi-ss-hash.c b/drivers/crypto/sunxi-ss/sunxi-ss-hash.c
new file mode 100644
index 0000000..6412bfb
--- /dev/null
+++ b/drivers/crypto/sunxi-ss/sunxi-ss-hash.c
@@ -0,0 +1,241 @@
+/*
+ * sunxi-ss-hash.c - hardware cryptographic accelerator for Allwinner A20 SoC
+ *
+ * Copyright (C) 2013-2014 Corentin LABBE <clabbe.montjoie@gmail.com>
+ *
+ * This file adds support for MD5 and SHA1.
+ *
+ * You can find the datasheet at
+ * http://dl.linux-sunxi.org/A20/A20%20User%20Manual%202013-03-22.pdf
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+#include "sunxi-ss.h"
+
+extern struct sunxi_ss_ctx *ss;
+
+/* sunxi_hash_init: initialize request context
+ * Activate the SS, and configure it for MD5 or SHA1
+ */
+int sunxi_hash_init(struct ahash_request *areq)
+{
+	const char *hash_type;
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(areq);
+	struct sunxi_req_ctx *op = crypto_ahash_ctx(tfm);
+
+	mutex_lock(&ss->lock);
+
+	hash_type = crypto_tfm_alg_name(areq->base.tfm);
+
+	op->byte_count = 0;
+	op->nbwait = 0;
+	op->waitbuf = 0;
+
+	/* Enable and configure SS for MD5 or SHA1 */
+	if (strcmp(hash_type, "sha1") == 0)
+		op->mode = SS_OP_SHA1;
+	else
+		op->mode = SS_OP_MD5;
+
+	writel(op->mode | SS_ENABLED, ss->base + SS_CTL);
+	return 0;
+}
+
+/*
+ * sunxi_hash_update: update hash engine
+ *
+ * Can be used for both SHA1 and MD5
+ * Write the data to the SS in steps of 32 bits.
+ * The remaining data is stored (nbwait bytes) in op->waitbuf
+ * As an optimisation, we do not check RXFIFO_SPACES, since the SS drains
+ * the FIFO faster than we write to it
+ */
+int sunxi_hash_update(struct ahash_request *areq)
+{
+	u32 v;
+	unsigned int i = 0;/* bytes read, to be compared to areq->nbytes */
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(areq);
+	struct sunxi_req_ctx *op = crypto_ahash_ctx(tfm);
+	struct scatterlist *in_sg;
+	unsigned int in_i = 0;/* advancement in the current SG */
+	void *src_addr;
+
+	u8 *waitbuf = (u8 *)(&op->waitbuf);
+
+	if (areq->nbytes == 0)
+		return 0;
+
+	in_sg = areq->src;
+	do {
+		src_addr = kmap(sg_page(in_sg)) + in_sg->offset;
+		/* step 1, if some bytes remain from the last SG,
+		 * try to complete them to 4 and send them */
+		if (op->nbwait > 0) {
+			while (op->nbwait < 4 && i < areq->nbytes &&
+					in_i < in_sg->length) {
+				waitbuf[op->nbwait] = *(u8 *)(src_addr + in_i);
+				i++;
+				in_i++;
+				op->nbwait++;
+			}
+			if (op->nbwait == 4) {
+				writel(op->waitbuf, ss->base + SS_RXFIFO);
+				op->byte_count += 4;
+				op->nbwait = 0;
+				op->waitbuf = 0;
+			}
+		}
+		/* step 2, main loop, read data 4bytes at a time */
+		while (i < areq->nbytes && areq->nbytes - i >= 4 &&
+				in_i < in_sg->length &&
+				in_sg->length - in_i >= 4) {
+			v = *(u32 *)(src_addr + in_i);
+			writel_relaxed(v, ss->base + SS_RXFIFO);
+			i += 4;
+			op->byte_count += 4;
+			in_i += 4;
+		}
+		/* step 3, if we have less than 4 bytes, copy them in waitbuf
+		 * no need to check for op->nbwait < 4 since we cannot have
+		 * more than 4 bytes remaining */
+		if (in_i < in_sg->length && in_sg->length - in_i < 4 &&
+				i < areq->nbytes) {
+			do {
+				waitbuf[op->nbwait] = *(u8 *)(src_addr + in_i);
+				op->nbwait++;
+				in_i++;
+				i++;
+			} while (in_i < in_sg->length && i < areq->nbytes);
+		}
+		/* we have finished the current SG, try next one */
+		kunmap(sg_page(in_sg));
+		in_sg = sg_next(in_sg);
+		in_i = 0;
+	} while (in_sg != NULL && i < areq->nbytes);
+	return 0;
+}
+
+/*
+ * sunxi_hash_final: finalize hashing operation
+ *
+ * If we have some remaining bytes, send them.
+ * Then ask the SS to finalize the hash
+ */
+int sunxi_hash_final(struct ahash_request *areq)
+{
+	u32 v;
+	unsigned int i;
+	int zeros;
+	unsigned int index, padlen;
+	__be64 bits;
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(areq);
+	struct sunxi_req_ctx *op = crypto_ahash_ctx(tfm);
+
+	if (op->nbwait > 0) {
+		op->waitbuf |= ((1 << 7) << (op->nbwait * 8));
+		writel(op->waitbuf, ss->base + SS_RXFIFO);
+	} else {
+		writel((1 << 7), ss->base + SS_RXFIFO);
+	}
+
+	/* zero padding needed to reach 64 bytes, minus 8 (size) minus 4 (final 1)
+	 * example len=0
+	 * example len=56
+	 * */
+
+	/* we have already sent 4 more bytes, of which nbwait are data */
+	if (op->mode == SS_OP_MD5) {
+		index = (op->byte_count + 4) & 0x3f;
+		op->byte_count += op->nbwait;
+		if (index > 56)
+			zeros = (120 - index) / 4;
+		else
+			zeros = (56 - index) / 4;
+	} else {
+		op->byte_count += op->nbwait;
+		index = op->byte_count & 0x3f;
+		padlen = (index < 56) ? (56 - index) : ((64+56) - index);
+		zeros = (padlen - 1) / 4;
+	}
+	for (i = 0; i < zeros; i++)
+		writel(0, ss->base + SS_RXFIFO);
+
+	/* write the length */
+	if (op->mode == SS_OP_SHA1) {
+		bits = cpu_to_be64(op->byte_count << 3);
+		writel(bits & 0xffffffff, ss->base + SS_RXFIFO);
+		writel((bits >> 32) & 0xffffffff, ss->base + SS_RXFIFO);
+	} else {
+		writel((op->byte_count << 3) & 0xffffffff,
+				ss->base + SS_RXFIFO);
+		writel((op->byte_count >> 29) & 0xffffffff,
+				ss->base + SS_RXFIFO);
+	}
+
+	/* stop the hashing */
+	v = readl(ss->base + SS_CTL);
+	v |= SS_DATA_END;
+	writel(v, ss->base + SS_CTL);
+
+	/* check the end */
+	/* The timeout can happen only in case of bad overclocking */
+#define SS_TIMEOUT 100
+	i = 0;
+	do {
+		v = readl(ss->base + SS_CTL);
+		i++;
+	} while (i < SS_TIMEOUT && (v & SS_DATA_END) > 0);
+	if (i >= SS_TIMEOUT) {
+		dev_err(ss->dev, "ERROR: hash end timeout %d>%d\n",
+				i, SS_TIMEOUT);
+		writel(0, ss->base + SS_CTL);
+		mutex_unlock(&ss->lock);
+		return -1;
+	}
+
+	if (op->mode == SS_OP_SHA1) {
+		for (i = 0; i < 5; i++) {
+			v = cpu_to_be32(readl(ss->base + SS_MD0 + i * 4));
+			memcpy(areq->result + i * 4, &v, 4);
+		}
+	} else {
+		for (i = 0; i < 4; i++) {
+			v = readl(ss->base + SS_MD0 + i * 4);
+			memcpy(areq->result + i * 4, &v, 4);
+		}
+	}
+	writel(0, ss->base + SS_CTL);
+	mutex_unlock(&ss->lock);
+	return 0;
+}
+
+/* sunxi_hash_finup: finalize hashing operation after an update */
+int sunxi_hash_finup(struct ahash_request *areq)
+{
+	int err;
+
+	err = sunxi_hash_update(areq);
+	if (err != 0)
+		return err;
+
+	return sunxi_hash_final(areq);
+}
+
+/* combo of init/update/final functions */
+int sunxi_hash_digest(struct ahash_request *areq)
+{
+	int err;
+
+	err = sunxi_hash_init(areq);
+	if (err != 0)
+		return err;
+
+	err = sunxi_hash_update(areq);
+	if (err != 0)
+		return err;
+
+	return sunxi_hash_final(areq);
+}
diff --git a/drivers/crypto/sunxi-ss/sunxi-ss.h b/drivers/crypto/sunxi-ss/sunxi-ss.h
new file mode 100644
index 0000000..94aca20
--- /dev/null
+++ b/drivers/crypto/sunxi-ss/sunxi-ss.h
@@ -0,0 +1,183 @@
+/*
+ * sunxi-ss.h - hardware cryptographic accelerator for Allwinner A20 SoC
+ *
+ * Copyright (C) 2013-2014 Corentin LABBE <clabbe.montjoie@gmail.com>
+ *
+ * Support AES cipher with 128,192,256 bits keysize.
+ * Support MD5 and SHA1 hash algorithms.
+ * Support DES and 3DES
+ * Support PRNG
+ *
+ * You could find the datasheet at
+ * http://dl.linux-sunxi.org/A20/A20%20User%20Manual%202013-03-22.pdf
+ *
+ *
+ * Licensed under the GPL-2.
+ */
+
+#include <linux/clk.h>
+#include <linux/crypto.h>
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <crypto/scatterwalk.h>
+#include <linux/scatterlist.h>
+#include <linux/interrupt.h>
+#include <linux/delay.h>
+#include <crypto/md5.h>
+#include <crypto/sha.h>
+#include <crypto/hash.h>
+#include <crypto/internal/hash.h>
+#include <crypto/aes.h>
+#include <crypto/des.h>
+#include <crypto/internal/rng.h>
+
+#define SS_CTL            0x00
+#define SS_KEY0           0x04
+#define SS_KEY1           0x08
+#define SS_KEY2           0x0C
+#define SS_KEY3           0x10
+#define SS_KEY4           0x14
+#define SS_KEY5           0x18
+#define SS_KEY6           0x1C
+#define SS_KEY7           0x20
+
+#define SS_IV0            0x24
+#define SS_IV1            0x28
+#define SS_IV2            0x2C
+#define SS_IV3            0x30
+
+#define SS_CNT0           0x34
+#define SS_CNT1           0x38
+#define SS_CNT2           0x3C
+#define SS_CNT3           0x40
+
+#define SS_FCSR           0x44
+#define SS_ICSR           0x48
+
+#define SS_MD0            0x4C
+#define SS_MD1            0x50
+#define SS_MD2            0x54
+#define SS_MD3            0x58
+#define SS_MD4            0x5C
+
+#define SS_RXFIFO         0x200
+#define SS_TXFIFO         0x204
+
+/* SS_CTL configuration values */
+
+/* PRNG generator mode - bit 15 */
+#define SS_PRNG_ONESHOT		(0 << 15)
+#define SS_PRNG_CONTINUE	(1 << 15)
+
+/* SS operation mode - bits 12-13 */
+#define SS_ECB			(0 << 12)
+#define SS_CBC			(1 << 12)
+#define SS_CNT			(2 << 12)
+
+/* Counter width for CNT mode - bits 10-11 */
+#define SS_CNT_16BITS		(0 << 10)
+#define SS_CNT_32BITS		(1 << 10)
+#define SS_CNT_64BITS		(2 << 10)
+
+/* Key size for AES - bits 8-9 */
+#define SS_AES_128BITS		(0 << 8)
+#define SS_AES_192BITS		(1 << 8)
+#define SS_AES_256BITS		(2 << 8)
+
+/* Operation direction - bit 7 */
+#define SS_ENCRYPTION		(0 << 7)
+#define SS_DECRYPTION		(1 << 7)
+
+/* SS Method - bits 4-6 */
+#define SS_OP_AES		(0 << 4)
+#define SS_OP_DES		(1 << 4)
+#define SS_OP_3DES		(2 << 4)
+#define SS_OP_SHA1		(3 << 4)
+#define SS_OP_MD5		(4 << 4)
+#define SS_OP_PRNG		(5 << 4)
+
+/* Data end bit - bit 2 */
+#define SS_DATA_END		(1 << 2)
+
+/* PRNG start bit - bit 1 */
+#define SS_PRNG_START		(1 << 1)
+
+/* SS Enable bit - bit 0 */
+#define SS_DISABLED		(0 << 0)
+#define SS_ENABLED		(1 << 0)
+
+/* SS_FCSR configuration values */
+/* RX FIFO status - bit 30 */
+#define SS_RXFIFO_FREE		(1 << 30)
+
+/* RX FIFO empty spaces - bits 24-29 */
+#define SS_RXFIFO_SPACES(val)	(((val) >> 24) & 0x3f)
+
+/* TX FIFO status - bit 22 */
+#define SS_TXFIFO_AVAILABLE	(1 << 22)
+
+/* TX FIFO available spaces - bits 16-21 */
+#define SS_TXFIFO_SPACES(val)	(((val) >> 16) & 0x3f)
+
+#define SS_RXFIFO_EMP_INT_PENDING	(1 << 10)
+#define SS_TXFIFO_AVA_INT_PENDING	(1 << 8)
+#define SS_RXFIFO_EMP_INT_ENABLE	(1 << 2)
+#define SS_TXFIFO_AVA_INT_ENABLE	(1 << 0)
+
+/* SS_ICSR configuration values */
+#define SS_ICS_DRQ_ENABLE		(1 << 4)
+
+struct sunxi_ss_ctx {
+	void __iomem *base;
+	int irq;
+	struct clk *busclk;
+	struct clk *ssclk;
+	struct device *dev;
+	struct resource *res;
+	void *buf_in; /* pointer to data to be uploaded to the device */
+	size_t buf_in_size; /* size of buf_in */
+	void *buf_out;
+	size_t buf_out_size;
+	struct mutex lock; /* control the use of the device */
+	struct mutex bufout_lock; /* control the use of buf_out */
+	struct mutex bufin_lock; /* control the use of buf_in */
+};
+
+struct sunxi_req_ctx {
+	u32 key[AES_MAX_KEY_SIZE / 4];/* divided by sizeof(u32) */
+	u32 keylen;
+	u32 mode;
+	u64 byte_count; /* number of bytes "uploaded" to the device */
+	u32 waitbuf; /* a partial word waiting to be completed and
+			uploaded to the device */
+	/* number of bytes to be uploaded in the waitbuf word */
+	unsigned int nbwait;
+};
+
+#define SS_SEED_LEN (192/8)
+#define SS_DATA_LEN (160/8)
+
+struct prng_context {
+	u32 seed[SS_SEED_LEN/4];
+	unsigned int slen;
+};
+
+int sunxi_hash_init(struct ahash_request *areq);
+int sunxi_hash_update(struct ahash_request *areq);
+int sunxi_hash_final(struct ahash_request *areq);
+int sunxi_hash_finup(struct ahash_request *areq);
+int sunxi_hash_digest(struct ahash_request *areq);
+
+int sunxi_ss_aes_poll(struct ablkcipher_request *areq);
+int sunxi_ss_des_poll(struct ablkcipher_request *areq);
+int sunxi_ss_cipher_init(struct crypto_tfm *tfm);
+int sunxi_ss_cipher_encrypt(struct ablkcipher_request *areq);
+int sunxi_ss_cipher_decrypt(struct ablkcipher_request *areq);
+int sunxi_ss_aes_setkey(struct crypto_ablkcipher *tfm, const u8 *key,
+		unsigned int keylen);
+int sunxi_ss_des_setkey(struct crypto_ablkcipher *tfm, const u8 *key,
+		unsigned int keylen);
+int sunxi_ss_des3_setkey(struct crypto_ablkcipher *tfm, const u8 *key,
+		unsigned int keylen);
-- 
1.8.5.5

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v4 3/3] crypto: Add Allwinner Security System crypto accelerator
  2014-07-12 12:59 ` [PATCH v4 3/3] crypto: Add Allwinner Security System crypto accelerator LABBE Corentin
@ 2014-07-23 13:16   ` Herbert Xu
  2014-07-23 13:48     ` Maxime Ripard
  2014-07-24  6:00   ` Herbert Xu
  2014-07-25 11:36   ` Maxime Ripard
  2 siblings, 1 reply; 14+ messages in thread
From: Herbert Xu @ 2014-07-23 13:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Jul 12, 2014 at 02:59:13PM +0200, LABBE Corentin wrote:
> Add support for the Security System included in Allwinner SoC A20.
> The Security System is a hardware cryptographic accelerator that support AES/MD5/SHA1/DES/3DES/PRNG algorithms.
> 
> Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>

This is essentially a synchronous driver, no? If so please
switch to the blkcipher/shash interface.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v4 3/3] crypto: Add Allwinner Security System crypto accelerator
  2014-07-23 13:16   ` Herbert Xu
@ 2014-07-23 13:48     ` Maxime Ripard
  2014-07-23 13:54       ` Herbert Xu
  0 siblings, 1 reply; 14+ messages in thread
From: Maxime Ripard @ 2014-07-23 13:48 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On Wed, Jul 23, 2014 at 09:16:20PM +0800, Herbert Xu wrote:
> On Sat, Jul 12, 2014 at 02:59:13PM +0200, LABBE Corentin wrote:
> > Add support for the Security System included in Allwinner SoC A20.
> > The Security System is a hardware cryptographic accelerator that support AES/MD5/SHA1/DES/3DES/PRNG algorithms.
> > 
> > Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
> 
> This is essentially a synchronous driver, no? If so please
> switch to the blkcipher/shash interface.

The exact opposite has been asked for during v1's review...

Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v4 3/3] crypto: Add Allwinner Security System crypto accelerator
  2014-07-23 13:48     ` Maxime Ripard
@ 2014-07-23 13:54       ` Herbert Xu
  0 siblings, 0 replies; 14+ messages in thread
From: Herbert Xu @ 2014-07-23 13:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 23, 2014 at 03:48:57PM +0200, Maxime Ripard wrote:
>
> The exact opposite has been asked for during v1's review...

Indeed but unfortunately it was bogus advice.  The async interface
brings with it a lot of complexity which should be avoided unless
you actually need it.

Even if you use the sync interface your driver will still be
available to all async users.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v4 3/3] crypto: Add Allwinner Security System crypto accelerator
  2014-07-12 12:59 ` [PATCH v4 3/3] crypto: Add Allwinner Security System crypto accelerator LABBE Corentin
  2014-07-23 13:16   ` Herbert Xu
@ 2014-07-24  6:00   ` Herbert Xu
  2014-07-24 11:04     ` Corentin LABBE
  2014-07-25 11:36   ` Maxime Ripard
  2 siblings, 1 reply; 14+ messages in thread
From: Herbert Xu @ 2014-07-24  6:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Jul 12, 2014 at 02:59:13PM +0200, LABBE Corentin wrote:
>
> +/* sunxi_hash_init: initialize request context
> + * Activate the SS, and configure it for MD5 or SHA1
> + */
> +int sunxi_hash_init(struct ahash_request *areq)
> +{
> +	const char *hash_type;
> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(areq);
> +	struct sunxi_req_ctx *op = crypto_ahash_ctx(tfm);
> +
> +	mutex_lock(&ss->lock);
> +
> +	hash_type = crypto_tfm_alg_name(areq->base.tfm);
> +
> +	op->byte_count = 0;
> +	op->nbwait = 0;
> +	op->waitbuf = 0;
> +
> +	/* Enable and configure SS for MD5 or SHA1 */
> +	if (strcmp(hash_type, "sha1") == 0)
> +		op->mode = SS_OP_SHA1;
> +	else
> +		op->mode = SS_OP_MD5;
> +
> +	writel(op->mode | SS_ENABLED, ss->base + SS_CTL);
> +	return 0;

The hash driver is completely broken.  You are modifying tfm
ctx data which is shared by all users of a single tfm.  So
if two users conduct hashes in parallel they will step all
over each other.

Worse, the unpaired mutex_lock will quickly lead to deadlocks.

You cannot assume that final will be called.
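
The two points above (state shared through the tfm context, and a lock taken in init but released only in final) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `model_tfm`, `model_req`, `model_init` and `model_update` are hypothetical names, not kernel API. In the kernel the per-request area would normally be obtained with ahash_request_ctx() after the driver sets its size with crypto_ahash_set_reqsize().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel objects; not the real crypto API. */
struct model_tfm {
	int lock_depth;		/* models ss->lock: 1 while held, 0 otherwise */
};

struct model_req {
	struct model_tfm *tfm;
	uint64_t byte_count;	/* per-request, never shared through the tfm */
};

static void model_init(struct model_req *req, struct model_tfm *tfm)
{
	req->tfm = tfm;
	req->byte_count = 0;	/* no lock taken here: final() may never run */
}

static int model_update(struct model_req *req, size_t len)
{
	struct model_tfm *tfm = req->tfm;

	if (tfm->lock_depth != 0)
		return -1;	/* a real driver would block or queue here */
	tfm->lock_depth++;	/* lock ... */
	req->byte_count += len;
	tfm->lock_depth--;	/* ... and unlock within the same call */
	return 0;
}
```

Two requests created on the same `model_tfm` can interleave their updates without disturbing each other's `byte_count`, and the lock is never left held between calls.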

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v4 3/3] crypto: Add Allwinner Security System crypto accelerator
  2014-07-24  6:00   ` Herbert Xu
@ 2014-07-24 11:04     ` Corentin LABBE
  2014-07-24 13:38       ` Herbert Xu
  0 siblings, 1 reply; 14+ messages in thread
From: Corentin LABBE @ 2014-07-24 11:04 UTC (permalink / raw)
  To: linux-arm-kernel

On 24/07/2014 08:00, Herbert Xu wrote:
> On Sat, Jul 12, 2014 at 02:59:13PM +0200, LABBE Corentin wrote:
>>
>> +/* sunxi_hash_init: initialize request context
>> + * Activate the SS, and configure it for MD5 or SHA1
>> + */
>> +int sunxi_hash_init(struct ahash_request *areq)
>> +{
>> +	const char *hash_type;
>> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(areq);
>> +	struct sunxi_req_ctx *op = crypto_ahash_ctx(tfm);
>> +
>> +	mutex_lock(&ss->lock);
>> +
>> +	hash_type = crypto_tfm_alg_name(areq->base.tfm);
>> +
>> +	op->byte_count = 0;
>> +	op->nbwait = 0;
>> +	op->waitbuf = 0;
>> +
>> +	/* Enable and configure SS for MD5 or SHA1 */
>> +	if (strcmp(hash_type, "sha1") == 0)
>> +		op->mode = SS_OP_SHA1;
>> +	else
>> +		op->mode = SS_OP_MD5;
>> +
>> +	writel(op->mode | SS_ENABLED, ss->base + SS_CTL);
>> +	return 0;
> 
> The hash driver is completely broken.  You are modifying tfm
> ctx data which is shared by all users of a single tfm.  So
> if two users conduct hashes in parallel they will step all
> over each other.

So where can I store data for each request ?

> 
> Worse, the unpaired mutex_lock will quickly lead to dead locks.
> 
> You cannot assume that final will be called.

A user reported a similar problem when running the openssl speed test with cryptodev.
Would cryptoqueue be a good answer to that problem, since the device can handle only one transformation at a time?
And perhaps with cryptoqueue, my first question is moot.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v4 3/3] crypto: Add Allwinner Security System crypto accelerator
  2014-07-24 11:04     ` Corentin LABBE
@ 2014-07-24 13:38       ` Herbert Xu
  2014-07-26 14:01         ` Corentin LABBE
  0 siblings, 1 reply; 14+ messages in thread
From: Herbert Xu @ 2014-07-24 13:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jul 24, 2014 at 01:04:55PM +0200, Corentin LABBE wrote:
> On 24/07/2014 08:00, Herbert Xu wrote:
> > On Sat, Jul 12, 2014 at 02:59:13PM +0200, LABBE Corentin wrote:
> >>
> >> +/* sunxi_hash_init: initialize request context
> >> + * Activate the SS, and configure it for MD5 or SHA1
> >> + */
> >> +int sunxi_hash_init(struct ahash_request *areq)
> >> +{
> >> +	const char *hash_type;
> >> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(areq);
> >> +	struct sunxi_req_ctx *op = crypto_ahash_ctx(tfm);
> >> +
> >> +	mutex_lock(&ss->lock);
> >> +
> >> +	hash_type = crypto_tfm_alg_name(areq->base.tfm);
> >> +
> >> +	op->byte_count = 0;
> >> +	op->nbwait = 0;
> >> +	op->waitbuf = 0;
> >> +
> >> +	/* Enable and configure SS for MD5 or SHA1 */
> >> +	if (strcmp(hash_type, "sha1") == 0)
> >> +		op->mode = SS_OP_SHA1;
> >> +	else
> >> +		op->mode = SS_OP_MD5;
> >> +
> >> +	writel(op->mode | SS_ENABLED, ss->base + SS_CTL);
> >> +	return 0;
> > 
> > The hash driver is completely broken.  You are modifying tfm
> > ctx data which is shared by all users of a single tfm.  So
> > if two users conduct hashes in parallel they will step all
> > over each other.
> 
> So where can I store data for each request ?

Well, first of all you need to stop storing state in the hardware.
After each operation the hardware may be used by some other user
for a completely different hash request.  So leaving the hash state
in the hardware is a no-no.

If your hardware supports exporting the hash state then you just
have to export it after each operation and reimport it before the
next one.

If your hardware is incapable of exporting partial hash state then
you will have to use a software fallback for init/update.  If your
hardware is incapable of importing partial hash state then you will
also have to do finup/final using a software fallback.
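
The export/import scheme described above can be sketched in userspace. Everything here is an assumption made for illustration: `toy_engine` stands in for the hardware, the FNV-1a arithmetic stands in for the real hash, and none of the names come from the driver or from the crypto API. Whether the SS can actually export or import partial state is not stated in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy engine with one internal state register, standing in for the SS. */
struct toy_engine {
	uint32_t state;
};

/* Per-request partial state, kept outside the engine between calls. */
struct toy_req {
	uint32_t saved;
};

#define TOY_INIT  2166136261u	/* FNV-1a offset basis, used only as a stand-in */
#define TOY_PRIME 16777619u

static void toy_req_init(struct toy_req *req)
{
	req->saved = TOY_INIT;
}

/* Import the saved state, process the data, then export the state again. */
static void toy_req_update(struct toy_engine *hw, struct toy_req *req,
			   const uint8_t *data, size_t len)
{
	size_t i;

	hw->state = req->saved;		/* import */
	for (i = 0; i < len; i++) {
		hw->state ^= data[i];
		hw->state *= TOY_PRIME;
	}
	req->saved = hw->state;		/* export */
	hw->state = 0;			/* nothing is left in the engine */
}
```

Because the engine holds no state between calls, another request can use it between two updates of the first one, and the first request still ends with the same value as if its data had been hashed in one go.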

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v4 2/3] ARM: sunxi: dt: Add DT bindings documentation for SUNXI Security System
  2014-07-12 12:59 ` [PATCH v4 2/3] ARM: sunxi: dt: Add DT bindings documentation for SUNXI Security System LABBE Corentin
@ 2014-07-25 10:10   ` Maxime Ripard
  0 siblings, 0 replies; 14+ messages in thread
From: Maxime Ripard @ 2014-07-25 10:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Jul 12, 2014 at 02:59:12PM +0200, LABBE Corentin wrote:
> This patch adds documentation for Device-Tree bindings for the Security
> System cryptographic accelerator driver.
> 
> Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>

Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>

Thanks!
Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v4 3/3] crypto: Add Allwinner Security System crypto accelerator
  2014-07-12 12:59 ` [PATCH v4 3/3] crypto: Add Allwinner Security System crypto accelerator LABBE Corentin
  2014-07-23 13:16   ` Herbert Xu
  2014-07-24  6:00   ` Herbert Xu
@ 2014-07-25 11:36   ` Maxime Ripard
  2 siblings, 0 replies; 14+ messages in thread
From: Maxime Ripard @ 2014-07-25 11:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Jul 12, 2014 at 02:59:13PM +0200, LABBE Corentin wrote:
> Add support for the Security System included in Allwinner SoC A20.
> The Security System is a hardware cryptographic accelerator that support AES/MD5/SHA1/DES/3DES/PRNG algorithms.
> 
> Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
> ---
>  drivers/crypto/Kconfig                    |  17 ++
>  drivers/crypto/Makefile                   |   1 +
>  drivers/crypto/sunxi-ss/Makefile          |   2 +
>  drivers/crypto/sunxi-ss/sunxi-ss-cipher.c | 461 ++++++++++++++++++++++++++++++
>  drivers/crypto/sunxi-ss/sunxi-ss-core.c   | 308 ++++++++++++++++++++
>  drivers/crypto/sunxi-ss/sunxi-ss-hash.c   | 241 ++++++++++++++++
>  drivers/crypto/sunxi-ss/sunxi-ss.h        | 183 ++++++++++++
>  7 files changed, 1213 insertions(+)
>  create mode 100644 drivers/crypto/sunxi-ss/Makefile
>  create mode 100644 drivers/crypto/sunxi-ss/sunxi-ss-cipher.c
>  create mode 100644 drivers/crypto/sunxi-ss/sunxi-ss-core.c
>  create mode 100644 drivers/crypto/sunxi-ss/sunxi-ss-hash.c
>  create mode 100644 drivers/crypto/sunxi-ss/sunxi-ss.h
> 
> diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
> index 03ccdb0..a2acda4 100644
> --- a/drivers/crypto/Kconfig
> +++ b/drivers/crypto/Kconfig
> @@ -418,4 +418,21 @@ config CRYPTO_DEV_MXS_DCP
>  	  To compile this driver as a module, choose M here: the module
>  	  will be called mxs-dcp.
>  
> +config CRYPTO_DEV_SUNXI_SS
> +	tristate "Support for Allwinner Security System cryptographic accelerator"
> +	depends on ARCH_SUNXI
> +	select CRYPTO_MD5
> +	select CRYPTO_SHA1
> +	select CRYPTO_AES
> +	select CRYPTO_DES
> +	select CRYPTO_BLKCIPHER
> +	help
> +	  Some Allwinner SoC have a crypto accelerator named
> +	  Security System. Select this if you want to use it.
> +	  The Security System handle AES/DES/3DES ciphers in CBC mode
> +	  and SHA1 and MD5 hash algorithms.
> +
> +	  To compile this driver as a module, choose M here: the module
> +	  will be called sunxi-ss.
> +
>  endif # CRYPTO_HW
> diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
> index 482f090..855292a 100644
> --- a/drivers/crypto/Makefile
> +++ b/drivers/crypto/Makefile
> @@ -23,3 +23,4 @@ obj-$(CONFIG_CRYPTO_DEV_S5P) += s5p-sss.o
>  obj-$(CONFIG_CRYPTO_DEV_SAHARA) += sahara.o
>  obj-$(CONFIG_CRYPTO_DEV_TALITOS) += talitos.o
>  obj-$(CONFIG_CRYPTO_DEV_UX500) += ux500/
> +obj-$(CONFIG_CRYPTO_DEV_SUNXI_SS) += sunxi-ss/
> diff --git a/drivers/crypto/sunxi-ss/Makefile b/drivers/crypto/sunxi-ss/Makefile
> new file mode 100644
> index 0000000..8bb287d
> --- /dev/null
> +++ b/drivers/crypto/sunxi-ss/Makefile
> @@ -0,0 +1,2 @@
> +obj-$(CONFIG_CRYPTO_DEV_SUNXI_SS) += sunxi-ss.o
> +sunxi-ss-y += sunxi-ss-core.o sunxi-ss-hash.o sunxi-ss-cipher.o
> diff --git a/drivers/crypto/sunxi-ss/sunxi-ss-cipher.c b/drivers/crypto/sunxi-ss/sunxi-ss-cipher.c
> new file mode 100644
> index 0000000..c2422f7
> --- /dev/null
> +++ b/drivers/crypto/sunxi-ss/sunxi-ss-cipher.c
> @@ -0,0 +1,461 @@
> +/*
> + * sunxi-ss-cipher.c - hardware cryptographic accelerator for Allwinner A20 SoC
> + *
> + * Copyright (C) 2013-2014 Corentin LABBE <clabbe.montjoie@gmail.com>
> + *
> + * This file add support for AES cipher with 128,192,256 bits
> + * keysize in CBC mode.
> + *
> + * You could find the datasheet at
> + * http://dl.linux-sunxi.org/A20/A20%20User%20Manual%202013-03-22.pdf

It's already documented in Documentation/arm/sunxi/README, you don't
need this.

> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +#include "sunxi-ss.h"
> +
> +extern struct sunxi_ss_ctx *ss;
> +
> +static int sunxi_ss_cipher(struct ablkcipher_request *areq, u32 mode)
> +{
> +	struct crypto_ablkcipher *tfm = crypto_ablkcipher_reqtfm(areq);
> +	struct sunxi_req_ctx *op = crypto_ablkcipher_ctx(tfm);
> +	const char *cipher_type;
> +
> +	cipher_type = crypto_tfm_alg_name(crypto_ablkcipher_tfm(tfm));
> +
> +	if (areq->nbytes == 0) {
> +		mutex_unlock(&ss->lock);
> +		return 0;
> +	}
> +
> +	if (areq->info == NULL) {
> +		dev_err(ss->dev, "ERROR: Empty IV\n");
> +		mutex_unlock(&ss->lock);
> +		return -EINVAL;
> +	}
> +
> +	if (areq->src == NULL || areq->dst == NULL) {
> +		dev_err(ss->dev, "ERROR: Some SGs are NULL\n");
> +		mutex_unlock(&ss->lock);
> +		return -EINVAL;
> +	}
> +
> +	if (strcmp("cbc(aes)", cipher_type) == 0) {
> +		op->mode |= SS_OP_AES | SS_CBC | SS_ENABLED | mode;
> +		return sunxi_ss_aes_poll(areq);
> +	}

Newline

> +	if (strcmp("cbc(des)", cipher_type) == 0) {
> +		op->mode = SS_OP_DES | SS_CBC | SS_ENABLED | mode;
> +		return sunxi_ss_des_poll(areq);
> +	}

Ditto.

> +	if (strcmp("cbc(des3_ede)", cipher_type) == 0) {
> +		op->mode = SS_OP_3DES | SS_CBC | SS_ENABLED | mode;
> +		return sunxi_ss_des_poll(areq);
> +	}

Ditto.

> +	dev_err(ss->dev, "ERROR: Cipher %s not handled\n", cipher_type);
> +	mutex_unlock(&ss->lock);
> +	return -EINVAL;

You could avoid much of this code duplication with gotos.
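
As a rough illustration of the goto suggestion, here is a userspace sketch with a single exit label that releases the lock. The names (`model_cipher`, `model_cipher_req`, `model_unlock`, `model_lock_held`) are hypothetical, and unlike the patch, the success path in this model also goes through the same label instead of leaving the unlock to a poll function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

static int model_lock_held = 1;	/* models ss->lock, taken by the caller */

static void model_unlock(void)
{
	model_lock_held = 0;
}

/* Hypothetical request; only the fields the checks need. */
struct model_cipher_req {
	size_t nbytes;
	const void *iv;
	const char *alg;
};

static int model_cipher(const struct model_cipher_req *req)
{
	int err = 0;

	if (req->nbytes == 0)
		goto out_unlock;	/* nothing to do, not an error */

	err = -EINVAL;
	if (req->iv == NULL)
		goto out_unlock;

	if (strcmp(req->alg, "cbc(aes)") != 0 &&
	    strcmp(req->alg, "cbc(des)") != 0 &&
	    strcmp(req->alg, "cbc(des3_ede)") != 0)
		goto out_unlock;

	err = 0;	/* a real driver would run the transfer here */

out_unlock:
	model_unlock();	/* the only place where the lock is released */
	return err;
}
```

With one label, every early return shares the same cleanup, so a forgotten mutex_unlock() on a new error path becomes much less likely.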

> +}
> +
> +int sunxi_ss_cipher_encrypt(struct ablkcipher_request *areq)
> +{
> +	return sunxi_ss_cipher(areq, SS_ENCRYPTION);
> +}
> +
> +int sunxi_ss_cipher_decrypt(struct ablkcipher_request *areq)
> +{
> +	return sunxi_ss_cipher(areq, SS_DECRYPTION);
> +}
> +
> +int sunxi_ss_cipher_init(struct crypto_tfm *tfm)
> +{
> +	struct sunxi_req_ctx *op = crypto_tfm_ctx(tfm);
> +
> +	mutex_lock(&ss->lock);
> +
> +	memset(op, 0, sizeof(struct sunxi_req_ctx));

And you never unlock the mutex? And where is this ss coming from?

> +	return 0;
> +}
> +
> +int sunxi_ss_aes_poll(struct ablkcipher_request *areq)
> +{
> +	u32 spaces;
> +	struct crypto_ablkcipher *tfm = crypto_ablkcipher_reqtfm(areq);
> +	struct sunxi_req_ctx *op = crypto_ablkcipher_ctx(tfm);
> +	unsigned int ivsize = crypto_ablkcipher_ivsize(tfm);
> +	/* when activating SS, the default FIFO space is 32 */
> +	u32 rx_cnt = 32;
> +	u32 tx_cnt = 0;
> +	u32 v;
> +	int i;
> +	struct scatterlist *in_sg;
> +	struct scatterlist *out_sg;
> +	void *src_addr;
> +	void *dst_addr;
> +	unsigned int ileft = areq->nbytes;
> +	unsigned int oleft = areq->nbytes;
> +	unsigned int sgileft = areq->src->length;
> +	unsigned int sgoleft = areq->dst->length;
> +	unsigned int todo;
> +	u32 *src32;
> +	u32 *dst32;
> +
> +	in_sg = areq->src;
> +	out_sg = areq->dst;
> +	for (i = 0; i < op->keylen; i += 4)
> +		writel(*(op->key + i/4), ss->base + SS_KEY0 + i);

Newline

> +	if (areq->info != NULL) {
> +		for (i = 0; i < 4 && i < ivsize / 4; i++) {
> +			v = *(u32 *)(areq->info + i * 4);
> +			writel(v, ss->base + SS_IV0 + i * 4);
> +		}
> +	}
> +	writel(op->mode, ss->base + SS_CTL);
> +
> +	/* If we have only one SG, we can use kmap_atomic */
> +	if (sg_next(in_sg) == NULL && sg_next(out_sg) == NULL) {
> +		src_addr = kmap_atomic(sg_page(in_sg)) + in_sg->offset;
> +		if (src_addr == NULL) {
> +			dev_err(ss->dev, "kmap_atomic error for src SG\n");
> +			writel(0, ss->base + SS_CTL);
> +			mutex_unlock(&ss->lock);
> +			return -EINVAL;
> +		}

Newline...

You get the idea, please go over your driver to catch all of them.

> +		dst_addr = kmap_atomic(sg_page(out_sg)) + out_sg->offset;
> +		if (dst_addr == NULL) {
> +			dev_err(ss->dev, "kmap_atomic error for dst SG\n");
> +			writel(0, ss->base + SS_CTL);
> +			kunmap_atomic(src_addr);
> +			mutex_unlock(&ss->lock);
> +			return -EINVAL;
> +		}
> +		src32 = (u32 *)src_addr;
> +		dst32 = (u32 *)dst_addr;
> +		ileft = areq->nbytes / 4;
> +		oleft = areq->nbytes / 4;
> +		i = 0;
> +		do {
> +			if (ileft > 0 && rx_cnt > 0) {
> +				todo = min(rx_cnt, ileft);
> +				ileft -= todo;
> +				do {
> +					writel_relaxed(*src32++,
> +						       ss->base +
> +						       SS_RXFIFO);
> +					todo--;
> +				} while (todo > 0);
> +			}
> +			if (tx_cnt > 0) {
> +				todo = min(tx_cnt, oleft);
> +				oleft -= todo;
> +				do {
> +					*dst32++ = readl_relaxed(ss->base +
> +								SS_TXFIFO);
> +					todo--;
> +				} while (todo > 0);
> +			}
> +			spaces = readl_relaxed(ss->base + SS_FCSR);
> +			rx_cnt = SS_RXFIFO_SPACES(spaces);
> +			tx_cnt = SS_TXFIFO_SPACES(spaces);
> +		} while (oleft > 0);
> +		writel(0, ss->base + SS_CTL);
> +		kunmap_atomic(src_addr);
> +		kunmap_atomic(dst_addr);
> +		mutex_unlock(&ss->lock);

Again, gotos here would be nice 

> +		return 0;
> +	}

So you just have two huge branches:
if (sg_num == 1)
    do_something
else
    do_something_else

right?

Can't you turn those two into separate functions? The number of
nested test cases is quite large and not really readable.

> +	/* If we have more than one SG, we cannot use kmap_atomic since
> +	 * we hold the mapping too long
> +	 */

This is not the proper way of defining multi-line comments. Please
read CodingStyle, and fix all of them in your driver.

> +	src_addr = kmap(sg_page(in_sg)) + in_sg->offset;
> +	if (src_addr == NULL) {
> +		dev_err(ss->dev, "KMAP error for src SG\n");
> +		mutex_unlock(&ss->lock);
> +		return -EINVAL;
> +	}
> +	dst_addr = kmap(sg_page(out_sg)) + out_sg->offset;
> +	if (dst_addr == NULL) {
> +		kunmap(sg_page(in_sg));
> +		dev_err(ss->dev, "KMAP error for dst SG\n");
> +		mutex_unlock(&ss->lock);
> +		return -EINVAL;
> +	}
> +	src32 = (u32 *)src_addr;
> +	dst32 = (u32 *)dst_addr;
> +	ileft = areq->nbytes / 4;
> +	oleft = areq->nbytes / 4;
> +	sgileft = in_sg->length / 4;
> +	sgoleft = out_sg->length / 4;
> +	do {
> +		spaces = readl_relaxed(ss->base + SS_FCSR);
> +		rx_cnt = SS_RXFIFO_SPACES(spaces);
> +		tx_cnt = SS_TXFIFO_SPACES(spaces);
> +		todo = min3(rx_cnt, ileft, sgileft);
> +		if (todo > 0) {
> +			ileft -= todo;
> +			sgileft -= todo;
> +		}
> +		while (todo > 0) {
> +			writel_relaxed(*src32++, ss->base + SS_RXFIFO);
> +			todo--;
> +		}
> +		if (in_sg != NULL && sgileft == 0 && ileft > 0) {
> +			kunmap(sg_page(in_sg));
> +			in_sg = sg_next(in_sg);
> +			while (in_sg != NULL && in_sg->length == 0)
> +				in_sg = sg_next(in_sg);
> +			if (in_sg != NULL && ileft > 0) {
> +				src_addr = kmap(sg_page(in_sg)) + in_sg->offset;
> +				if (src_addr == NULL) {
> +					dev_err(ss->dev, "ERROR: KMAP for src SG\n");
> +					mutex_unlock(&ss->lock);
> +					return -EINVAL;
> +				}
> +				src32 = src_addr;
> +				sgileft = in_sg->length / 4;
> +			}
> +		}
> +		/* do not test oleft since when oleft == 0 we have finished */
> +		todo = min3(tx_cnt, oleft, sgoleft);
> +		if (todo > 0) {
> +			oleft -= todo;
> +			sgoleft -= todo;
> +		}
> +		while (todo > 0) {
> +			*dst32++ = readl_relaxed(ss->base + SS_TXFIFO);
> +			todo--;
> +		}
> +		if (out_sg != NULL && sgoleft == 0 && oleft >= 0) {
> +			kunmap(sg_page(out_sg));
> +			out_sg = sg_next(out_sg);
> +			while (out_sg != NULL && out_sg->length == 0)
> +				out_sg = sg_next(out_sg);
> +			if (out_sg != NULL && oleft > 0) {
> +				dst_addr = kmap(sg_page(out_sg)) +
> +					out_sg->offset;
> +				if (dst_addr == NULL) {
> +					dev_err(ss->dev, "KMAP error\n");
> +					mutex_unlock(&ss->lock);
> +					return -EINVAL;
> +				}
> +				dst32 = dst_addr;
> +				sgoleft = out_sg->length / 4;
> +			}
> +		}
> +	} while (oleft > 0);
> +
> +	writel(0, ss->base + SS_CTL);
> +	mutex_unlock(&ss->lock);
> +	return 0;

Yep, splitting this into a few functions, and adding some comments
would definitely help.
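
One possible factoring, sketched in userspace C: the "compute todo,
push todo words, update both counters" step becomes a helper. The FIFO
here is a made-up in-memory model (the real code would use
writel_relaxed() on SS_RXFIFO), and every demo_* name and the depth of
32 words are assumptions for illustration:

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_FIFO_DEPTH 32	/* assumed depth, for the model only */

/* In-memory stand-in for the RX FIFO. */
struct demo_fifo {
	uint32_t words[DEMO_FIFO_DEPTH];
	size_t used;
};

static size_t demo_min3(size_t a, size_t b, size_t c)
{
	size_t m = a < b ? a : b;

	return m < c ? m : c;
}

/*
 * Push as many words as the FIFO, the request and the current SG allow.
 * Advances *src and decrements both counters; returns the words pushed.
 */
size_t demo_fifo_push(struct demo_fifo *f, const uint32_t **src,
		      size_t *ileft, size_t *sgileft)
{
	size_t room = DEMO_FIFO_DEPTH - f->used;
	size_t todo = demo_min3(room, *ileft, *sgileft);
	size_t i;

	for (i = 0; i < todo; i++)
		f->words[f->used++] = *(*src)++;
	*ileft -= todo;
	*sgileft -= todo;
	return todo;
}
```

A matching pop helper for the TX side would leave the main loop with
little more than the SG advance logic.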

> +}
> +
> +/* Pure CPU way of doing DES/3DES with SS
> + * Since DES and 3DES SGs could be smaller than 4 bytes, I use sg_copy_to_buffer
> + * for "linearize" them.
> + * The problem with that is that I alloc (2 x areq->nbytes) for buf_in/buf_out
> + * TODO: change this system

Change this system for what?

> + * SGsrc -> buf_in -> SS -> buf_out -> SGdst */
> +int sunxi_ss_des_poll(struct ablkcipher_request *areq)
> +{
> +	u32 value, spaces;
> +	size_t nb_in_sg_tx, nb_in_sg_rx;
> +	size_t ir, it;
> +	struct crypto_ablkcipher *tfm = crypto_ablkcipher_reqtfm(areq);
> +	struct sunxi_req_ctx *op = crypto_ablkcipher_ctx(tfm);
> +	unsigned int ivsize = crypto_ablkcipher_ivsize(tfm);
> +	u32 tx_cnt = 0;
> +	u32 rx_cnt = 0;
> +	u32 v;
> +	int i;
> +	int no_chunk = 1;
> +
> +	/* if we have only SGs with size multiple of 4,
> +	 * we can use the SS AES function */
> +	struct scatterlist *in_sg;
> +	struct scatterlist *out_sg;
> +
> +	in_sg = areq->src;
> +	out_sg = areq->dst;
> +
> +	while (in_sg != NULL && no_chunk == 1) {
> +		if ((in_sg->length % 4) != 0)
> +			no_chunk = 0;
> +		in_sg = sg_next(in_sg);
> +	}
> +	while (out_sg != NULL && no_chunk == 1) {
> +		if ((out_sg->length % 4) != 0)
> +			no_chunk = 0;
> +		out_sg = sg_next(out_sg);
> +	}
> +
> +	if (no_chunk == 1)
> +		return sunxi_ss_aes_poll(areq);
> +	in_sg = areq->src;
> +	out_sg = areq->dst;
> +
> +	nb_in_sg_rx = sg_nents(in_sg);
> +	nb_in_sg_tx = sg_nents(out_sg);
> +
> +	mutex_lock(&ss->bufin_lock);
> +	if (ss->buf_in == NULL) {
> +		ss->buf_in = kmalloc(areq->nbytes, GFP_KERNEL);
> +		ss->buf_in_size = areq->nbytes;
> +	} else {
> +		if (areq->nbytes > ss->buf_in_size) {
> +			kfree(ss->buf_in);
> +			ss->buf_in = kmalloc(areq->nbytes, GFP_KERNEL);
> +			ss->buf_in_size = areq->nbytes;
> +		}
> +	}
> +	if (ss->buf_in == NULL) {
> +		ss->buf_in_size = 0;
> +		mutex_unlock(&ss->bufin_lock);
> +		dev_err(ss->dev, "Unable to allocate pages.\n");
> +		return -ENOMEM;
> +	}
> +	if (ss->buf_out == NULL) {
> +		mutex_lock(&ss->bufout_lock);
> +		ss->buf_out = kmalloc(areq->nbytes, GFP_KERNEL);
> +		if (ss->buf_out == NULL) {
> +			ss->buf_out_size = 0;
> +			mutex_unlock(&ss->bufout_lock);
> +			dev_err(ss->dev, "Unable to allocate pages.\n");
> +			return -ENOMEM;
> +		}
> +		ss->buf_out_size = areq->nbytes;
> +		mutex_unlock(&ss->bufout_lock);
> +	} else {
> +		if (areq->nbytes > ss->buf_out_size) {
> +			mutex_lock(&ss->bufout_lock);
> +			kfree(ss->buf_out);
> +			ss->buf_out = kmalloc(areq->nbytes, GFP_KERNEL);

Why do you free it to reallocate it right away? I only see two cases
for this and none of them are great:
   - This buffer has been preallocated by this function, but never
     free'd before, which would mean that we leak resources.
   - This buffer has been allocated by another function, which makes
     the tracking of the allocation/deallocation quite hard to do,
     which means that we will end up leaking memory.
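
If a cached bounce buffer is really wanted, one way to keep it
trackable is a single reserve/release pair that owns it. This is only a
userspace sketch with hypothetical demo_* names, using malloc()/free()
where the driver would use kmalloc()/kfree():

```c
#include <stdlib.h>

/* Hypothetical grow-only buffer with one clear owner. */
struct demo_buf {
	void *data;
	size_t size;
};

/*
 * Make sure the buffer holds at least 'need' bytes.
 * On failure the previous buffer and size are left untouched.
 */
int demo_buf_reserve(struct demo_buf *b, size_t need)
{
	void *p;

	if (need <= b->size)
		return 0;
	p = malloc(need);
	if (!p)
		return -1;	/* placeholder for -ENOMEM */
	free(b->data);		/* old contents are not needed */
	b->data = p;
	b->size = need;
	return 0;
}

/* The only place where the buffer is freed. */
void demo_buf_release(struct demo_buf *b)
{
	free(b->data);
	b->data = NULL;
	b->size = 0;
}
```

With this shape the size can never disagree with the pointer, and there
is exactly one function to call from the remove path.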

> +			if (ss->buf_out == NULL) {
> +				ss->buf_out_size = 0;
> +				mutex_unlock(&ss->bufout_lock);
> +				dev_err(ss->dev, "Unable to allocate pages.\n");
> +				return -ENOMEM;
> +			}
> +			ss->buf_out_size = areq->nbytes;
> +			mutex_unlock(&ss->bufout_lock);
> +		}
> +	}
> +
> +	sg_copy_to_buffer(areq->src, nb_in_sg_rx, ss->buf_in, areq->nbytes);
> +
> +	ir = 0;
> +	it = 0;
> +
> +	for (i = 0; i < op->keylen; i += 4)
> +		writel(*(op->key + i/4), ss->base + SS_KEY0 + i);
> +	if (areq->info != NULL) {
> +		for (i = 0; i < 4 && i < ivsize / 4; i++) {
> +			v = *(u32 *)(areq->info + i * 4);
> +			writel(v, ss->base + SS_IV0 + i * 4);
> +		}
> +	}
> +	writel(op->mode, ss->base + SS_CTL);
> +
> +	do {
> +		if (rx_cnt == 0 || tx_cnt == 0) {
> +			spaces = readl(ss->base + SS_FCSR);
> +			rx_cnt = SS_RXFIFO_SPACES(spaces);
> +			tx_cnt = SS_TXFIFO_SPACES(spaces);
> +		}
> +		if (rx_cnt > 0 && ir < areq->nbytes) {
> +			do {
> +				value = *(u32 *)(ss->buf_in + ir);
> +				writel(value, ss->base + SS_RXFIFO);
> +				ir += 4;
> +				rx_cnt--;
> +			} while (rx_cnt > 0 && ir < areq->nbytes);
> +		}
> +		if (tx_cnt > 0 && it < areq->nbytes) {
> +			do {
> +				value = readl(ss->base + SS_TXFIFO);
> +				*(u32 *)(ss->buf_out + it) = value;
> +				it += 4;
> +				tx_cnt--;
> +			} while (tx_cnt > 0 && it < areq->nbytes);
> +		}
> +		if (ir == areq->nbytes) {
> +			mutex_unlock(&ss->bufin_lock);
> +			ir++;
> +		}
> +	} while (it < areq->nbytes);
> +
> +	writel(0, ss->base + SS_CTL);
> +	mutex_unlock(&ss->lock);
> +
> +	/* a simple optimization, since we dont need the hardware for this copy
> +	 * we release the lock and do the copy. With that we gain 5/10% perf */
> +	mutex_lock(&ss->bufout_lock);
> +	sg_copy_from_buffer(areq->dst, nb_in_sg_tx, ss->buf_out, areq->nbytes);
> +
> +	mutex_unlock(&ss->bufout_lock);
> +	return 0;
> +}
> +
> +/* check and set the AES key, prepare the mode to be used */
> +int sunxi_ss_aes_setkey(struct crypto_ablkcipher *tfm, const u8 *key,
> +		unsigned int keylen)
> +{
> +	struct sunxi_req_ctx *op = crypto_ablkcipher_ctx(tfm);
> +
> +	switch (keylen) {
> +	case 128 / 8:
> +		op->mode = SS_AES_128BITS;
> +		break;
> +	case 192 / 8:
> +		op->mode = SS_AES_192BITS;
> +		break;
> +	case 256 / 8:
> +		op->mode = SS_AES_256BITS;
> +		break;
> +	default:
> +		dev_err(ss->dev, "ERROR: Invalid keylen %u\n", keylen);
> +		crypto_ablkcipher_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
> +		mutex_unlock(&ss->lock);

Who takes this lock?
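
As far as I can tell, setkey only touches the per-tfm context, so a
version with no locking at all might look like the sketch below. It is
userspace C with hypothetical demo_* names; the mode values mirror the
SS_AES_*BITS defines from the header in this patch:

```c
#include <stdint.h>
#include <string.h>

#define DEMO_AES_128BITS	(0u << 8)
#define DEMO_AES_192BITS	(1u << 8)
#define DEMO_AES_256BITS	(2u << 8)
#define DEMO_AES_MAX_KEY	32

/* Hypothetical per-transform context. */
struct demo_ctx {
	uint8_t key[DEMO_AES_MAX_KEY];
	unsigned int keylen;
	uint32_t mode;
};

/* Validate the key length first; touch the context only on success. */
int demo_aes_setkey(struct demo_ctx *op, const uint8_t *key,
		    unsigned int keylen)
{
	uint32_t mode;

	switch (keylen) {
	case 128 / 8:
		mode = DEMO_AES_128BITS;
		break;
	case 192 / 8:
		mode = DEMO_AES_192BITS;
		break;
	case 256 / 8:
		mode = DEMO_AES_256BITS;
		break;
	default:
		return -1;	/* placeholder for -EINVAL, nothing to unlock */
	}
	op->mode = mode;
	op->keylen = keylen;
	memcpy(op->key, key, keylen);
	return 0;
}
```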

> +		return -EINVAL;
> +	}
> +	op->keylen = keylen;
> +	memcpy(op->key, key, keylen);
> +	return 0;
> +}
> +
> +/* check and set the DES key, prepare the mode to be used */
> +int sunxi_ss_des_setkey(struct crypto_ablkcipher *tfm, const u8 *key,
> +		unsigned int keylen)
> +{
> +	struct sunxi_req_ctx *op = crypto_ablkcipher_ctx(tfm);
> +
> +	if (keylen != DES_KEY_SIZE) {
> +		dev_err(ss->dev, "Invalid keylen %u\n", keylen);
> +		crypto_ablkcipher_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
> +		mutex_unlock(&ss->lock);

Ditto

> +		return -EINVAL;
> +	}
> +	op->keylen = keylen;
> +	memcpy(op->key, key, keylen);
> +	return 0;
> +}
> +
> +/* check and set the 3DES key, prepare the mode to be used */
> +int sunxi_ss_des3_setkey(struct crypto_ablkcipher *tfm, const u8 *key,
> +		unsigned int keylen)
> +{
> +	struct sunxi_req_ctx *op = crypto_ablkcipher_ctx(tfm);
> +
> +	if (keylen != 3 * DES_KEY_SIZE) {
> +		dev_err(ss->dev, "Invalid keylen %u\n", keylen);
> +		crypto_ablkcipher_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
> +		mutex_unlock(&ss->lock);
> +		return -EINVAL;
> +	}
> +	op->keylen = keylen;
> +	memcpy(op->key, key, keylen);
> +	return 0;
> +}
> diff --git a/drivers/crypto/sunxi-ss/sunxi-ss-core.c b/drivers/crypto/sunxi-ss/sunxi-ss-core.c
> new file mode 100644
> index 0000000..c76016e
> --- /dev/null
> +++ b/drivers/crypto/sunxi-ss/sunxi-ss-core.c
> @@ -0,0 +1,308 @@
> +/*
> + * sunxi-ss.c - hardware cryptographic accelerator for Allwinner A20 SoC
> + *
> + * Copyright (C) 2013-2014 Corentin LABBE <clabbe.montjoie@gmail.com>
> + *
> + * Core file which registers crypto algorithms supported by the SS.
> + *
> + * You could find the datasheet at
> + * http://dl.linux-sunxi.org/A20/A20%20User%20Manual%202013-03-22.pdf
> + *
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +#include <linux/clk.h>
> +#include <linux/crypto.h>
> +#include <linux/io.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include <crypto/scatterwalk.h>
> +#include <linux/scatterlist.h>
> +#include <linux/interrupt.h>
> +#include <linux/delay.h>
> +
> +#include "sunxi-ss.h"
> +
> +struct sunxi_ss_ctx *ss;
> +
> +/* General notes:
> + * I cannot use a key/IV cache because each time one of these change ALL stuff
> + * need to be re-writed (rewrite SS_KEYX ans SS_IVX).
> + * And for example, with dm-crypt IV changes on each request.
> + *
> + * After each request the device must be disabled with a write of 0 in SS_CTL
> + *
> + * For performance reason, we use writel_relaxed/read_relaxed for all
> + * operations on RX and TX FIFO and also SS_FCSR.
> + * For all other registers, we use writel/readl.
> + * See http://permalink.gmane.org/gmane.linux.ports.arm.kernel/117644
> + * and http://permalink.gmane.org/gmane.linux.ports.arm.kernel/117640
> + * */
> +
> +static struct ahash_alg sunxi_md5_alg = {
> +	.init = sunxi_hash_init,
> +	.update = sunxi_hash_update,
> +	.final = sunxi_hash_final,
> +	.finup = sunxi_hash_finup,
> +	.digest = sunxi_hash_digest,
> +	.halg = {
> +		.digestsize = MD5_DIGEST_SIZE,
> +		.base = {
> +			.cra_name = "md5",
> +			.cra_driver_name = "md5-sunxi-ss",
> +			.cra_priority = 300,
> +			.cra_alignmask = 3,
> +			.cra_flags = CRYPTO_ALG_TYPE_AHASH | CRYPTO_ALG_ASYNC,
> +			.cra_blocksize = MD5_HMAC_BLOCK_SIZE,
> +			.cra_ctxsize = sizeof(struct sunxi_req_ctx),
> +			.cra_module = THIS_MODULE,
> +			.cra_type = &crypto_ahash_type
> +		}
> +	}
> +};
> +static struct ahash_alg sunxi_sha1_alg = {
> +	.init = sunxi_hash_init,
> +	.update = sunxi_hash_update,
> +	.final = sunxi_hash_final,
> +	.finup = sunxi_hash_finup,
> +	.digest = sunxi_hash_digest,
> +	.halg = {
> +		.digestsize = SHA1_DIGEST_SIZE,
> +		.base = {
> +			.cra_name = "sha1",
> +			.cra_driver_name = "sha1-sunxi-ss",
> +			.cra_priority = 300,
> +			.cra_alignmask = 3,
> +			.cra_flags = CRYPTO_ALG_TYPE_AHASH | CRYPTO_ALG_ASYNC,
> +			.cra_blocksize = SHA1_BLOCK_SIZE,
> +			.cra_ctxsize = sizeof(struct sunxi_req_ctx),
> +			.cra_module = THIS_MODULE,
> +			.cra_type = &crypto_ahash_type
> +		}
> +	}
> +};
> +
> +static struct crypto_alg sunxi_cipher_algs[] = {
> +{
> +	.cra_name = "cbc(aes)",
> +	.cra_driver_name = "cbc-aes-sunxi-ss",
> +	.cra_priority = 300,
> +	.cra_blocksize = AES_BLOCK_SIZE,
> +	.cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER,
> +	.cra_ctxsize = sizeof(struct sunxi_req_ctx),
> +	.cra_module = THIS_MODULE,
> +	.cra_alignmask = 3,
> +	.cra_type = &crypto_ablkcipher_type,
> +	.cra_init = sunxi_ss_cipher_init,
> +	.cra_u = {
> +		.ablkcipher = {
> +			.min_keysize    = AES_MIN_KEY_SIZE,
> +			.max_keysize    = AES_MAX_KEY_SIZE,
> +			.ivsize         = AES_BLOCK_SIZE,
> +			.setkey         = sunxi_ss_aes_setkey,
> +			.encrypt        = sunxi_ss_cipher_encrypt,
> +			.decrypt        = sunxi_ss_cipher_decrypt,
> +		}
> +	}
> +}, {
> +	.cra_name = "cbc(des)",
> +	.cra_driver_name = "cbc-des-sunxi-ss",
> +	.cra_priority = 300,
> +	.cra_blocksize = DES_BLOCK_SIZE,
> +	.cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER,
> +	.cra_ctxsize = sizeof(struct sunxi_req_ctx),
> +	.cra_module = THIS_MODULE,
> +	.cra_alignmask = 3,
> +	.cra_type = &crypto_ablkcipher_type,
> +	.cra_init = sunxi_ss_cipher_init,
> +	.cra_u.ablkcipher = {
> +		.min_keysize    = DES_KEY_SIZE,
> +		.max_keysize    = DES_KEY_SIZE,
> +		.ivsize         = DES_BLOCK_SIZE,
> +		.setkey         = sunxi_ss_des_setkey,
> +		.encrypt        = sunxi_ss_cipher_encrypt,
> +		.decrypt        = sunxi_ss_cipher_decrypt,
> +	}
> +}, {
> +	.cra_name = "cbc(des3_ede)",
> +	.cra_driver_name = "cbc-des3-sunxi-ss",
> +	.cra_priority = 300,
> +	.cra_blocksize = DES3_EDE_BLOCK_SIZE,
> +	.cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER,
> +	.cra_ctxsize = sizeof(struct sunxi_req_ctx),
> +	.cra_module = THIS_MODULE,
> +	.cra_alignmask = 3,
> +	.cra_type = &crypto_ablkcipher_type,
> +	.cra_init = sunxi_ss_cipher_init,
> +	.cra_u.ablkcipher = {
> +		.min_keysize    = DES3_EDE_KEY_SIZE,
> +		.max_keysize    = DES3_EDE_KEY_SIZE,
> +		.ivsize         = DES3_EDE_BLOCK_SIZE,
> +		.setkey         = sunxi_ss_des3_setkey,
> +		.encrypt        = sunxi_ss_cipher_encrypt,
> +		.decrypt        = sunxi_ss_cipher_decrypt,
> +	}
> +}
> +};
> +
> +static int sunxi_ss_probe(struct platform_device *pdev)
> +{
> +	struct resource *res;
> +	u32 v;
> +	int err;
> +	unsigned long cr;
> +	const unsigned long cr_ahb = 24 * 1000 * 1000;
> +	const unsigned long cr_mod = 150 * 1000 * 1000;
> +
> +	if (!pdev->dev.of_node)
> +		return -ENODEV;
> +
> +	ss = devm_kzalloc(&pdev->dev, sizeof(*ss), GFP_KERNEL);
> +	if (ss == NULL)
> +		return -ENOMEM;
> +
> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +	ss->base = devm_ioremap_resource(&pdev->dev, res);
> +	if (IS_ERR(ss->base)) {
> +		dev_err(&pdev->dev, "Cannot request MMIO\n");
> +		return PTR_ERR(ss->base);
> +	}
> +
> +	ss->ssclk = devm_clk_get(&pdev->dev, "mod");
> +	if (IS_ERR(ss->ssclk)) {
> +		err = PTR_ERR(ss->ssclk);
> +		dev_err(&pdev->dev, "Cannot get SS clock err=%d\n", err);
> +		return err;
> +	}
> +	dev_dbg(&pdev->dev, "clock ss acquired\n");
> +
> +	ss->busclk = devm_clk_get(&pdev->dev, "ahb");
> +	if (IS_ERR(ss->busclk)) {
> +		err = PTR_ERR(ss->busclk);
> +		dev_err(&pdev->dev, "Cannot get AHB SS clock err=%d\n", err);
> +		return err;
> +	}
> +	dev_dbg(&pdev->dev, "clock ahb_ss acquired\n");
> +
> +	/* Enable the clocks */
> +	err = clk_prepare_enable(ss->busclk);
> +	if (err != 0) {
> +		dev_err(&pdev->dev, "Cannot prepare_enable busclk\n");
> +		return err;
> +	}
> +	err = clk_prepare_enable(ss->ssclk);
> +	if (err != 0) {
> +		dev_err(&pdev->dev, "Cannot prepare_enable ssclk\n");
> +		clk_disable_unprepare(ss->busclk);
> +		return err;
> +	}
> +
> +	/* Check that clock have the correct rates gived in the datasheet */
> +	/* Try to set the clock to the maximum allowed */
> +	err = clk_set_rate(ss->ssclk, cr_mod);
> +	if (err != 0) {
> +		dev_err(&pdev->dev, "Cannot set clock rate to ssclk\n");
> +		clk_disable_unprepare(ss->ssclk);
> +		clk_disable_unprepare(ss->busclk);
> +		return err;
> +	}
> +	cr = clk_get_rate(ss->busclk);
> +	if (cr >= cr_ahb)
> +		dev_dbg(&pdev->dev, "Clock bus %lu (%lu MHz) (must be >= %lu)\n",
> +				cr, cr / 1000000, cr_ahb);
> +	else
> +		dev_warn(&pdev->dev, "Clock bus %lu (%lu MHz) (must be >= %lu)\n",
> +				cr, cr / 1000000, cr_ahb);

Why do you need such a check?

> +	cr = clk_get_rate(ss->ssclk);
> +	if (cr == cr_mod)
> +		dev_dbg(&pdev->dev, "Clock ss %lu (%lu MHz) (must be <= %lu)\n",
> +				cr, cr / 1000000, cr_mod);
> +	else {
> +		dev_warn(&pdev->dev, "Clock ss is at %lu (%lu MHz) (must be <= %lu)\n",
> +				cr, cr / 1000000, cr_mod);
> +	}

Ditto, you just changed the rate here. Why are you double-checking?

> +	/* TODO Does this information could be usefull ? */

I don't know, what is it? :)

> +	writel(SS_ENABLED, ss->base + SS_CTL);
> +	v = readl(ss->base + SS_CTL);
> +	v >>= 16;
> +	v &= 0x07;
> +	dev_info(&pdev->dev, "Die ID %d\n", v);
> +	writel(0, ss->base + SS_CTL);
> +
> +	ss->dev = &pdev->dev;
> +
> +	mutex_init(&ss->lock);
> +	mutex_init(&ss->bufin_lock);
> +	mutex_init(&ss->bufout_lock);
> +
> +	err = crypto_register_ahash(&sunxi_md5_alg);
> +	if (err)
> +		goto error_md5;
> +	err = crypto_register_ahash(&sunxi_sha1_alg);
> +	if (err)
> +		goto error_sha1;
> +	err = crypto_register_algs(sunxi_cipher_algs,
> +			ARRAY_SIZE(sunxi_cipher_algs));
> +	if (err)
> +		goto error_ciphers;
> +
> +	return 0;
> +error_ciphers:
> +	crypto_unregister_ahash(&sunxi_sha1_alg);
> +error_sha1:
> +	crypto_unregister_ahash(&sunxi_md5_alg);
> +error_md5:
> +	clk_disable_unprepare(ss->ssclk);
> +	clk_disable_unprepare(ss->busclk);
> +	return err;
> +}
> +
> +static int __exit sunxi_ss_remove(struct platform_device *pdev)
> +{
> +	if (!pdev->dev.of_node)
> +		return 0;
> +
> +	crypto_unregister_ahash(&sunxi_md5_alg);
> +	crypto_unregister_ahash(&sunxi_sha1_alg);
> +	crypto_unregister_algs(sunxi_cipher_algs,
> +			ARRAY_SIZE(sunxi_cipher_algs));
> +
> +	if (ss->buf_in != NULL)
> +		kfree(ss->buf_in);
> +	if (ss->buf_out != NULL)
> +		kfree(ss->buf_out);
> +
> +	writel(0, ss->base + SS_CTL);
> +	clk_disable_unprepare(ss->busclk);
> +	clk_disable_unprepare(ss->ssclk);
> +	return 0;
> +}
> +
> +/*============================================================================*/
> +/*============================================================================*/

Drop these two lines

> +static const struct of_device_id a20ss_crypto_of_match_table[] = {
> +	{ .compatible = "allwinner,sun7i-a20-crypto" },
> +	{}
> +};
> +MODULE_DEVICE_TABLE(of, a20ss_crypto_of_match_table);
> +
> +static struct platform_driver sunxi_ss_driver = {
> +	.probe          = sunxi_ss_probe,
> +	.remove         = __exit_p(sunxi_ss_remove),
> +	.driver         = {
> +		.owner          = THIS_MODULE,
> +		.name           = "sunxi-ss",
> +		.of_match_table	= a20ss_crypto_of_match_table,
> +	},
> +};
> +
> +module_platform_driver(sunxi_ss_driver);
> +
> +MODULE_DESCRIPTION("Allwinner Security System cryptographic accelerator");
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Corentin LABBE <clabbe.montjoie@gmail.com>");
> diff --git a/drivers/crypto/sunxi-ss/sunxi-ss-hash.c b/drivers/crypto/sunxi-ss/sunxi-ss-hash.c
> new file mode 100644
> index 0000000..6412bfb
> --- /dev/null
> +++ b/drivers/crypto/sunxi-ss/sunxi-ss-hash.c
> @@ -0,0 +1,241 @@
> +/*
> + * sunxi-ss-hash.c - hardware cryptographic accelerator for Allwinner A20 SoC
> + *
> + * Copyright (C) 2013-2014 Corentin LABBE <clabbe.montjoie@gmail.com>
> + *
> + * This file add support for MD5 and SHA1.
> + *
> + * You could find the datasheet at
> + * http://dl.linux-sunxi.org/A20/A20%20User%20Manual%202013-03-22.pdf
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +#include "sunxi-ss.h"
> +
> +extern struct sunxi_ss_ctx *ss;
> +
> +/* sunxi_hash_init: initialize request context
> + * Activate the SS, and configure it for MD5 or SHA1
> + */
> +int sunxi_hash_init(struct ahash_request *areq)
> +{
> +	const char *hash_type;
> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(areq);
> +	struct sunxi_req_ctx *op = crypto_ahash_ctx(tfm);
> +
> +	mutex_lock(&ss->lock);
> +
> +	hash_type = crypto_tfm_alg_name(areq->base.tfm);
> +
> +	op->byte_count = 0;
> +	op->nbwait = 0;
> +	op->waitbuf = 0;
> +
> +	/* Enable and configure SS for MD5 or SHA1 */
> +	if (strcmp(hash_type, "sha1") == 0)
> +		op->mode = SS_OP_SHA1;
> +	else
> +		op->mode = SS_OP_MD5;
> +
> +	writel(op->mode | SS_ENABLED, ss->base + SS_CTL);
> +	return 0;
> +}
> +
> +/*
> + * sunxi_hash_update: update hash engine
> + *
> + * Could be used for both SHA1 and MD5
> + * Write data by step of 32bits and put then in the SS.
> + * The remaining data is stored (nbwait bytes) in op->waitbuf
> + * As an optimisation, we do not check RXFIFO_SPACES, since SS handle
> + * the FIFO faster than our writes
> + */
> +int sunxi_hash_update(struct ahash_request *areq)
> +{
> +	u32 v;
> +	unsigned int i = 0;/* bytes read, to be compared to areq->nbytes */
> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(areq);
> +	struct sunxi_req_ctx *op = crypto_ahash_ctx(tfm);
> +	struct scatterlist *in_sg;
> +	unsigned int in_i = 0;/* advancement in the current SG */
> +	void *src_addr;
> +
> +	u8 *waitbuf = (u8 *)(&op->waitbuf);
> +
> +	if (areq->nbytes == 0)
> +		return 0;
> +
> +	in_sg = areq->src;
> +	do {
> +		src_addr = kmap(sg_page(in_sg)) + in_sg->offset;
> +		/* step 1, if some bytes remains from last SG,
> +		 * try to complete them to 4 and sent its */
> +		if (op->nbwait > 0) {
> +			while (op->nbwait < 4 && i < areq->nbytes &&
> +					in_i < in_sg->length) {
> +				waitbuf[op->nbwait] = *(u8 *)(src_addr + in_i);
> +				i++;
> +				in_i++;
> +				op->nbwait++;
> +			}
> +			if (op->nbwait == 4) {
> +				writel(op->waitbuf, ss->base + SS_RXFIFO);
> +				op->byte_count += 4;
> +				op->nbwait = 0;
> +				op->waitbuf = 0;
> +			}
> +		}
> +		/* step 2, main loop, read data 4bytes at a time */
> +		while (i < areq->nbytes && areq->nbytes - i >= 4 &&
> +				in_i < in_sg->length &&
> +				in_sg->length - in_i >= 4) {
> +			v = *(u32 *)(src_addr + in_i);
> +			writel_relaxed(v, ss->base + SS_RXFIFO);
> +			i += 4;
> +			op->byte_count += 4;
> +			in_i += 4;
> +		}
> +		/* step 3, if we have less than 4 bytes, copy them in waitbuf
> +		 * no need to check for op->nbwait < 4 since we cannot have
> +		 * more than 4 bytes remaining */
> +		if (in_i < in_sg->length && in_sg->length - in_i < 4 &&
> +				i < areq->nbytes) {
> +			do {
> +				waitbuf[op->nbwait] = *(u8 *)(src_addr + in_i);
> +				op->nbwait++;
> +				in_i++;
> +				i++;
> +			} while (in_i < in_sg->length && i < areq->nbytes);
> +		}
> +		/* we have finished the current SG, try next one */
> +		kunmap(sg_page(in_sg));
> +		in_sg = sg_next(in_sg);
> +		in_i = 0;
> +	} while (in_sg != NULL && i < areq->nbytes);
> +	return 0;
> +}
> +
> +/*
> + * sunxi_hash_final: finalize hashing operation
> + *
> + * If we have some remaining bytes, send it.
> + * Then ask the SS for finalizing the hash
> + */
> +int sunxi_hash_final(struct ahash_request *areq)
> +{
> +	u32 v;
> +	unsigned int i;
> +	int zeros;
> +	unsigned int index, padlen;
> +	__be64 bits;
> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(areq);
> +	struct sunxi_req_ctx *op = crypto_ahash_ctx(tfm);
> +
> +	if (op->nbwait > 0) {
> +		op->waitbuf |= ((1 << 7) << (op->nbwait * 8));
> +		writel(op->waitbuf, ss->base + SS_RXFIFO);
> +	} else {
> +		writel((1 << 7), ss->base + SS_RXFIFO);
> +	}
> +
> +	/* number of space to pad to obtain 64o minus 8(size) minus 4 (final 1)
> +	 * example len=0
> +	 * example len=56
> +	 * */
> +
> +	/* we have already send 4 more byte of which nbwait data */
> +	if (op->mode == SS_OP_MD5) {
> +		index = (op->byte_count + 4) & 0x3f;
> +		op->byte_count += op->nbwait;
> +		if (index > 56)
> +			zeros = (120 - index) / 4;
> +		else
> +			zeros = (56 - index) / 4;
> +	} else {
> +		op->byte_count += op->nbwait;
> +		index = op->byte_count & 0x3f;
> +		padlen = (index < 56) ? (56 - index) : ((64+56) - index);
> +		zeros = (padlen - 1) / 4;
> +	}
> +	for (i = 0; i < zeros; i++)
> +		writel(0, ss->base + SS_RXFIFO);
> +
> +	/* write the lenght */
> +	if (op->mode == SS_OP_SHA1) {
> +		bits = cpu_to_be64(op->byte_count << 3);
> +		writel(bits & 0xffffffff, ss->base + SS_RXFIFO);
> +		writel((bits >> 32) & 0xffffffff, ss->base + SS_RXFIFO);
> +	} else {
> +		writel((op->byte_count << 3) & 0xffffffff,
> +				ss->base + SS_RXFIFO);
> +		writel((op->byte_count >> 29) & 0xffffffff,
> +				ss->base + SS_RXFIFO);
> +	}
> +
> +	/* stop the hashing */
> +	v = readl(ss->base + SS_CTL);
> +	v |= SS_DATA_END;
> +	writel(v, ss->base + SS_CTL);
> +
> +	/* check the end */
> +	/* The timeout could happend only in case of bad overcloking */
> +#define SS_TIMEOUT 100
> +	i = 0;
> +	do {
> +		v = readl(ss->base + SS_CTL);
> +		i++;
> +	} while (i < SS_TIMEOUT && (v & SS_DATA_END) > 0);
> +	if (i >= SS_TIMEOUT) {
> +		dev_err(ss->dev, "ERROR: hash end timeout %d>%d\n",
> +				i, SS_TIMEOUT);
> +		writel(0, ss->base + SS_CTL);
> +		mutex_unlock(&ss->lock);
> +		return -1;
> +	}
> +
> +	if (op->mode == SS_OP_SHA1) {
> +		for (i = 0; i < 5; i++) {
> +			v = cpu_to_be32(readl(ss->base + SS_MD0 + i * 4));
> +			memcpy(areq->result + i * 4, &v, 4);
> +		}
> +	} else {
> +		for (i = 0; i < 4; i++) {
> +			v = readl(ss->base + SS_MD0 + i * 4);
> +			memcpy(areq->result + i * 4, &v, 4);
> +		}
> +	}
> +	writel(0, ss->base + SS_CTL);
> +	mutex_unlock(&ss->lock);
> +	return 0;
> +}
> +
> +/* sunxi_hash_finup: finalize hashing operation after an update */
> +int sunxi_hash_finup(struct ahash_request *areq)
> +{
> +	int err;
> +
> +	err = sunxi_hash_update(areq);
> +	if (err != 0)
> +		return err;
> +
> +	return sunxi_hash_final(areq);
> +}
> +
> +/* combo of init/update/final functions */
> +int sunxi_hash_digest(struct ahash_request *areq)
> +{
> +	int err;
> +
> +	err = sunxi_hash_init(areq);
> +	if (err != 0)
> +		return err;
> +
> +	err = sunxi_hash_update(areq);
> +	if (err != 0)
> +		return err;
> +
> +	return sunxi_hash_final(areq);
> +}
> diff --git a/drivers/crypto/sunxi-ss/sunxi-ss.h b/drivers/crypto/sunxi-ss/sunxi-ss.h
> new file mode 100644
> index 0000000..94aca20
> --- /dev/null
> +++ b/drivers/crypto/sunxi-ss/sunxi-ss.h
> @@ -0,0 +1,183 @@
> +/*
> + * sunxi-ss.c - hardware cryptographic accelerator for Allwinner A20 SoC
> + *
> + * Copyright (C) 2013-2014 Corentin LABBE <clabbe.montjoie@gmail.com>
> + *
> + * Support AES cipher with 128,192,256 bits keysize.
> + * Support MD5 and SHA1 hash algorithms.
> + * Support DES and 3DES
> + * Support PRNG
> + *
> + * You could find the datasheet at
> + * http://dl.linux-sunxi.org/A20/A20%20User%20Manual%202013-03-22.pdf
> + *
> + *
> + * Licensed under the GPL-2.
> + */
> +
> +#include <linux/clk.h>
> +#include <linux/crypto.h>
> +#include <linux/io.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include <crypto/scatterwalk.h>
> +#include <linux/scatterlist.h>
> +#include <linux/interrupt.h>
> +#include <linux/delay.h>
> +#include <crypto/md5.h>
> +#include <crypto/sha.h>
> +#include <crypto/hash.h>
> +#include <crypto/internal/hash.h>
> +#include <crypto/aes.h>
> +#include <crypto/des.h>
> +#include <crypto/internal/rng.h>
> +
> +#define SS_CTL            0x00
> +#define SS_KEY0           0x04
> +#define SS_KEY1           0x08
> +#define SS_KEY2           0x0C
> +#define SS_KEY3           0x10
> +#define SS_KEY4           0x14
> +#define SS_KEY5           0x18
> +#define SS_KEY6           0x1C
> +#define SS_KEY7           0x20
> +
> +#define SS_IV0            0x24
> +#define SS_IV1            0x28
> +#define SS_IV2            0x2C
> +#define SS_IV3            0x30
> +
> +#define SS_CNT0           0x34
> +#define SS_CNT1           0x38
> +#define SS_CNT2           0x3C
> +#define SS_CNT3           0x40
> +
> +#define SS_FCSR           0x44
> +#define SS_ICSR           0x48
> +
> +#define SS_MD0            0x4C
> +#define SS_MD1            0x50
> +#define SS_MD2            0x54
> +#define SS_MD3            0x58
> +#define SS_MD4            0x5C
> +
> +#define SS_RXFIFO         0x200
> +#define SS_TXFIFO         0x204
> +
> +/* SS_CTL configuration values */
> +
> +/* PRNG generator mode - bit 15 */
> +#define SS_PRNG_ONESHOT		(0 << 15)
> +#define SS_PRNG_CONTINUE	(1 << 15)
> +
> +/* SS operation mode - bits 12-13 */
> +#define SS_ECB			(0 << 12)
> +#define SS_CBC			(1 << 12)
> +#define SS_CNT			(2 << 12)
> +
> +/* Counter width for CNT mode - bits 10-11 */
> +#define SS_CNT_16BITS		(0 << 10)
> +#define SS_CNT_32BITS		(1 << 10)
> +#define SS_CNT_64BITS		(2 << 10)
> +
> +/* Key size for AES - bits 8-9 */
> +#define SS_AES_128BITS		(0 << 8)
> +#define SS_AES_192BITS		(1 << 8)
> +#define SS_AES_256BITS		(2 << 8)
> +
> +/* Operation direction - bit 7 */
> +#define SS_ENCRYPTION		(0 << 7)
> +#define SS_DECRYPTION		(1 << 7)
> +
> +/* SS Method - bits 4-6 */
> +#define SS_OP_AES		(0 << 4)
> +#define SS_OP_DES		(1 << 4)
> +#define SS_OP_3DES		(2 << 4)
> +#define SS_OP_SHA1		(3 << 4)
> +#define SS_OP_MD5		(4 << 4)
> +#define SS_OP_PRNG		(5 << 4)
> +
> +/* Data end bit - bit 2 */
> +#define SS_DATA_END		(1 << 2)
> +
> +/* PRNG start bit - bit 1 */
> +#define SS_PRNG_START		(1 << 1)
> +
> +/* SS Enable bit - bit 0 */
> +#define SS_DISABLED		(0 << 0)
> +#define SS_ENABLED		(1 << 0)
> +
> +/* SS_FCSR configuration values */
> +/* RX FIFO status - bit 30 */
> +#define SS_RXFIFO_FREE		(1 << 30)
> +
> +/* RX FIFO empty spaces - bits 24-29 */
> +#define SS_RXFIFO_SPACES(val)	(((val) >> 24) & 0x3f)
> +
> +/* TX FIFO status - bit 22 */
> +#define SS_TXFIFO_AVAILABLE	(1 << 22)
> +
> +/* TX FIFO available spaces - bits 16-21 */
> +#define SS_TXFIFO_SPACES(val)	(((val) >> 16) & 0x3f)
> +
> +#define SS_RXFIFO_EMP_INT_PENDING	(1 << 10)
> +#define SS_TXFIFO_AVA_INT_PENDING	(1 << 8)
> +#define SS_RXFIFO_EMP_INT_ENABLE	(1 << 2)
> +#define SS_TXFIFO_AVA_INT_ENABLE	(1 << 0)
> +
> +/* SS_ICSR configuration values */
> +#define SS_ICS_DRQ_ENABLE		(1 << 4)
> +
> +struct sunxi_ss_ctx {
> +	void __iomem *base;
> +	int irq;
> +	struct clk *busclk;
> +	struct clk *ssclk;
> +	struct device *dev;
> +	struct resource *res;
> +	void *buf_in; /* pointer to data to be uploaded to the device */
> +	size_t buf_in_size; /* size of buf_in */
> +	void *buf_out;
> +	size_t buf_out_size;
> +	struct mutex lock; /* control the use of the device */
> +	struct mutex bufout_lock; /* control the use of buf_out*/
> +	struct mutex bufin_lock; /* control the sue of buf_in*/
> +};
> +
> +struct sunxi_req_ctx {
> +	u32 key[AES_MAX_KEY_SIZE / 4];/* divided by sizeof(u32) */
> +	u32 keylen;
> +	u32 mode;
> +	u64 byte_count; /* number of bytes "uploaded" to the device */
> +	u32 waitbuf; /* a partial word waiting to be completed and
> +			uploaded to the device */
> +	/* number of bytes to be uploaded in the waitbuf word */
> +	unsigned int nbwait;
> +};
> +
> +#define SS_SEED_LEN (192/8)
> +#define SS_DATA_LEN (160/8)
> +
> +struct prng_context {
> +	u32 seed[SS_SEED_LEN/4];
> +	unsigned int slen;
> +};
> +
> +int sunxi_hash_init(struct ahash_request *areq);
> +int sunxi_hash_update(struct ahash_request *areq);
> +int sunxi_hash_final(struct ahash_request *areq);
> +int sunxi_hash_finup(struct ahash_request *areq);
> +int sunxi_hash_digest(struct ahash_request *areq);
> +
> +int sunxi_ss_aes_poll(struct ablkcipher_request *areq);
> +int sunxi_ss_des_poll(struct ablkcipher_request *areq);
> +int sunxi_ss_cipher_init(struct crypto_tfm *tfm);
> +int sunxi_ss_cipher_encrypt(struct ablkcipher_request *areq);
> +int sunxi_ss_cipher_decrypt(struct ablkcipher_request *areq);
> +int sunxi_ss_aes_setkey(struct crypto_ablkcipher *tfm, const u8 *key,
> +		unsigned int keylen);
> +int sunxi_ss_des_setkey(struct crypto_ablkcipher *tfm, const u8 *key,
> +		unsigned int keylen);
> +int sunxi_ss_des3_setkey(struct crypto_ablkcipher *tfm, const u8 *key,
> +		unsigned int keylen);
> -- 
> 1.8.5.5
> 

Thanks for your work.

Overall, I guess it needs more work, especially on the memory
allocation/locking.

In your driver, your lock management and your memory management seem a
bit messy.

Try to stick with a symmetrical pattern in your functions if possible,
like

kmalloc
mutex_lock
do_something
mutex_unlock
free

or at least create symmetrical functions to allocate/free the resources
you need.

So far, there's a lot of places where you just allocate, or free, or
lock or unlock, which makes it kind of hard to follow, and a very good
recipe for deadlocks/memory leaks.

Also, try to use gotos as much as possible; it will clean up your exit
paths, making your functions much easier to read.
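
The pattern described above (acquire in order, release in reverse order,
one exit path reached via goto) could be sketched as follows. This is a
userspace model, not driver code: demo_mutex, demo_alloc, demo_free and
demo_process are hypothetical stand-ins for the kernel's mutex and
kmalloc/kfree, and the 0xff check only simulates a device error.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-ins for the kernel mutex and kmalloc/kfree. */
struct demo_mutex { int held; };
static int demo_live_allocs;	/* buffers allocated and not yet freed */

static void demo_lock(struct demo_mutex *m)   { assert(!m->held); m->held = 1; }
static void demo_unlock(struct demo_mutex *m) { assert(m->held);  m->held = 0; }

static void *demo_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		demo_live_allocs++;
	return p;
}

static void demo_free(void *p)
{
	if (p)
		demo_live_allocs--;
	free(p);
}

/*
 * Copy src into a bounce buffer and "process" it under the device lock.
 * Every resource taken here is released here, in reverse order.
 */
static int demo_process(struct demo_mutex *dev_lock, const void *src, size_t len)
{
	unsigned char *buf;
	int err = 0;

	if (len == 0)
		return -EINVAL;	/* nothing acquired yet, plain return is fine */

	buf = demo_alloc(len);
	if (!buf)
		return -ENOMEM;

	demo_lock(dev_lock);

	memcpy(buf, src, len);
	if (buf[0] == 0xff) {	/* assumed stand-in for a device error */
		err = -EIO;
		goto out_unlock;
	}

	/* ... the actual work on buf would go here ... */

out_unlock:
	demo_unlock(dev_lock);
	demo_free(buf);
	return err;
}
```

Because the success path and the error path both leave through
out_unlock, the lock and the buffer cannot be forgotten on either one.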

Thanks!
Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


* [PATCH v4 3/3] crypto: Add Allwinner Security System crypto accelerator
  2014-07-24 13:38       ` Herbert Xu
@ 2014-07-26 14:01         ` Corentin LABBE
  2014-07-27 14:52           ` Herbert Xu
  0 siblings, 1 reply; 14+ messages in thread
From: Corentin LABBE @ 2014-07-26 14:01 UTC (permalink / raw)
  To: linux-arm-kernel

On 24/07/2014 15:38, Herbert Xu wrote:
> On Thu, Jul 24, 2014 at 01:04:55PM +0200, Corentin LABBE wrote:
>> On 24/07/2014 08:00, Herbert Xu wrote:
>>> On Sat, Jul 12, 2014 at 02:59:13PM +0200, LABBE Corentin wrote:
>>>>
>>>> +/* sunxi_hash_init: initialize request context
>>>> + * Activate the SS, and configure it for MD5 or SHA1
>>>> + */
>>>> +int sunxi_hash_init(struct ahash_request *areq)
>>>> +{
>>>> +	const char *hash_type;
>>>> +	struct crypto_ahash *tfm = crypto_ahash_reqtfm(areq);
>>>> +	struct sunxi_req_ctx *op = crypto_ahash_ctx(tfm);
>>>> +
>>>> +	mutex_lock(&ss->lock);
>>>> +
>>>> +	hash_type = crypto_tfm_alg_name(areq->base.tfm);
>>>> +
>>>> +	op->byte_count = 0;
>>>> +	op->nbwait = 0;
>>>> +	op->waitbuf = 0;
>>>> +
>>>> +	/* Enable and configure SS for MD5 or SHA1 */
>>>> +	if (strcmp(hash_type, "sha1") == 0)
>>>> +		op->mode = SS_OP_SHA1;
>>>> +	else
>>>> +		op->mode = SS_OP_MD5;
>>>> +
>>>> +	writel(op->mode | SS_ENABLED, ss->base + SS_CTL);
>>>> +	return 0;
>>>
>>> The hash driver is completely broken.  You are modifying tfm
>>> ctx data which is shared by all users of a single tfm.  So
>>> if two users conduct hashes in parallel they will step all
>>> over each other.
>>
>> So where can I store data for each request?
> 
> Well, first of all you need to stop storing state in the hardware.
> After each operation the hardware may be used by some other user
> for a completely different hash request.  So leaving the hash state
> in the hardware is a no-no.
> 
> If your hardware supports exporting the hash state then you just
> have to export it after each operation and reimport it before the
> next one.

Although it is undocumented, the hardware seems to support it.
Since crypto_ahash_ctx is per tfm, is ahash_request_ctx the right place to store the data?
(after a call to crypto_ahash_set_reqsize in cra_init)

I have also seen the export/import functions; do I need to use them?


> 
> If your hardware is incapable of exporting partial hash state then
> you will have to use a software fallback for init/update.  If your
> hardware is incapable of importing partial hash state then you will
> also have to do finup/final using a software fallback.
> 
> Cheers,
> 


* [PATCH v4 3/3] crypto: Add Allwinner Security System crypto accelerator
  2014-07-26 14:01         ` Corentin LABBE
@ 2014-07-27 14:52           ` Herbert Xu
  0 siblings, 0 replies; 14+ messages in thread
From: Herbert Xu @ 2014-07-27 14:52 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Jul 26, 2014 at 04:01:26PM +0200, Corentin LABBE wrote:
>
> Although it is undocumented, the hardware seems to support it.
> Since crypto_ahash_ctx is per tfm, is ahash_request_ctx the right place to store the data?
> (after a call to crypto_ahash_set_reqsize in cra_init)

Yes, any hash state-related data should go into the request context.
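
To illustrate the split, here is a minimal userspace model, assuming the
per-request fields are the byte_count, waitbuf and nbwait members from
the patch. The model_* names are hypothetical and are not the kernel's
structures; in the driver the per-request part would be what
ahash_request_ctx returns once the request size has been set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; these are not the kernel's structures. */

/* Per-transform data: shared by every request issued on the same tfm. */
struct model_tfm_ctx {
	uint32_t mode;		/* algorithm selection, fixed at tfm init */
};

/* Per-request data: one private copy for each hash in flight. */
struct model_req_ctx {
	uint64_t byte_count;	/* bytes consumed so far */
	uint32_t waitbuf;	/* partial word not yet uploaded */
	unsigned int nbwait;	/* number of valid bytes in waitbuf (0..3) */
};

struct model_req {
	const struct model_tfm_ctx *tfm;
	struct model_req_ctx ctx;
};

static void model_hash_init(struct model_req *req,
			    const struct model_tfm_ctx *tfm)
{
	req->tfm = tfm;		/* read-only use of the shared data */
	req->ctx.byte_count = 0;
	req->ctx.waitbuf = 0;
	req->ctx.nbwait = 0;
}

/* Account for len bytes, keeping the trailing partial word in waitbuf. */
static void model_hash_update(struct model_req *req,
			      const uint8_t *data, size_t len)
{
	struct model_req_ctx *c = &req->ctx;
	size_t i;

	for (i = 0; i < len; i++) {
		c->waitbuf |= (uint32_t)data[i] << (8 * c->nbwait);
		c->nbwait++;
		c->byte_count++;
		if (c->nbwait == 4) {
			/* a full word would be written to the device here */
			c->waitbuf = 0;
			c->nbwait = 0;
		}
	}
}
```

Two requests created on the same model_tfm_ctx can interleave their
updates without disturbing each other's counters, which is the property
the tfm-context version lacks.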

> I have also seen the export/import functions; do I need to use them?

Absolutely.
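
As a rough sketch of the export-after/import-before idea, assuming a
device with a single state register: the toy_* names are hypothetical,
and a 32-bit FNV-1a step stands in for the real MD5/SHA1 engine purely
to keep the example small. A real driver would also hold the device
lock across each import/feed/export sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one shared "device" with a single state register. */
static uint32_t dev_state;

#define TOY_INIT  2166136261u	/* FNV-1a offset basis (stand-in hash) */
#define TOY_PRIME 16777619u	/* FNV-1a prime */

/* Partial hash state kept per request, outside the device. */
struct toy_req_state {
	uint32_t partial;
};

static void toy_dev_import(const struct toy_req_state *s)
{
	dev_state = s->partial;
}

static void toy_dev_export(struct toy_req_state *s)
{
	s->partial = dev_state;
}

/* The "hardware" consumes data and updates its internal state. */
static void toy_dev_feed(const uint8_t *data, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		dev_state ^= data[i];
		dev_state *= TOY_PRIME;
	}
}

static void toy_init(struct toy_req_state *s)
{
	s->partial = TOY_INIT;
}

/* Each update restores the request's state, runs, and saves it again. */
static void toy_update(struct toy_req_state *s, const uint8_t *data, size_t len)
{
	toy_dev_import(s);
	toy_dev_feed(data, len);
	toy_dev_export(s);
}

static uint32_t toy_final(const struct toy_req_state *s)
{
	return s->partial;
}
```

Since nothing is left in dev_state between calls, two requests whose
updates are interleaved end up with the same results as if each had
been hashed in one go.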

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


end of thread, other threads:[~2014-07-27 14:52 UTC | newest]

Thread overview: 14+ messages
2014-07-12 12:59 [PATCH v4] crypto: Add Allwinner Security System crypto accelerator LABBE Corentin
2014-07-12 12:59 ` [PATCH v4 1/3] ARM: sun7i: dt: Add Security System to A20 SoC DTS LABBE Corentin
2014-07-12 12:59 ` [PATCH v4 2/3] ARM: sunxi: dt: Add DT bindings documentation for SUNXI Security System LABBE Corentin
2014-07-25 10:10   ` Maxime Ripard
2014-07-12 12:59 ` [PATCH v4 3/3] crypto: Add Allwinner Security System crypto accelerator LABBE Corentin
2014-07-23 13:16   ` Herbert Xu
2014-07-23 13:48     ` Maxime Ripard
2014-07-23 13:54       ` Herbert Xu
2014-07-24  6:00   ` Herbert Xu
2014-07-24 11:04     ` Corentin LABBE
2014-07-24 13:38       ` Herbert Xu
2014-07-26 14:01         ` Corentin LABBE
2014-07-27 14:52           ` Herbert Xu
2014-07-25 11:36   ` Maxime Ripard
