* [PATCH v2 00/14] crypto: omap-aes: Improve DMA, add PIO mode and support for AM437x
From: Joel Fernandes @ 2013-08-18  2:42 UTC
  To: Herbert Xu, David S. Miller, Mark Greer, Tony Lindgren,
	Santosh Shilimkar, Rajendra Nayak, Lokesh Vutla
  Cc: Linux OMAP List, Linux ARM Kernel List,
	Linux Kernel Mailing List, Linux Crypto Mailing List

The following patch series rewrites the DMA code to be cleaner and faster.
Earlier, only a single SG was used for DMA, and the SG-list passed from the
crypto layer was copied and DMA'd one entry at a time. This turned out to be
quite inefficient and needed a lot of code, so we replace it with a much simpler
approach that passes the SG-list from crypto directly to the DMA layer wherever
possible. In all cases where such direct passing of the SG-list is not
possible, we create a new SG-list and do the copying. This is still better than
before, as we create an SG-list as big as needed and not just a 1-element list.

We also add PIO mode support to the driver, and switch to it whenever a DMA
channel cannot be allocated. PIO mode has also been shown to give good
performance for small blocks, as shown below.

Tests have been performed on AM335x, OMAP4 and AM437x SoCs.

Below is a sample run on an AM335x SoC (BeagleBone board), showing the
performance improvement (about 20% for 8K blocks):

With DMA rewrite (key size = 128-bit)
16 byte blocks: 4318 operations in 1 seconds (69088 bytes)
64 byte blocks: 4360 operations in 1 seconds (279040 bytes)
256 byte blocks: 3609 operations in 1 seconds (923904 bytes)
1024 byte blocks: 3418 operations in 1 seconds (3500032 bytes)
8192 byte blocks: 1766 operations in 1 seconds (14467072 bytes)

Without DMA rewrite:
16 byte blocks: 4417 operations in 1 seconds (70672 bytes)
64 byte blocks: 4221 operations in 1 seconds (270144 bytes)
256 byte blocks: 3528 operations in 1 seconds (903168 bytes)
1024 byte blocks: 3281 operations in 1 seconds (3359744 bytes)
8192 byte blocks: 1460 operations in 1 seconds (11960320 bytes)

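The arithmetic behind the quoted improvement can be checked with a short
userspace sketch. This is only an illustration built from the two tables above;
the function names improvement_percent() and print_dma_rewrite_gains() are
hypothetical and not part of the series. For 8192-byte blocks it gives roughly
21%, while the 16-byte case comes out about 2% lower with the rewrite, which may
well be within run-to-run noise.

```c
#include <assert.h>
#include <stdio.h>

/* Relative change in operations/second, in percent (positive = faster). */
double improvement_percent(unsigned int ops_new, unsigned int ops_old)
{
	assert(ops_old != 0);
	return ((double)ops_new - (double)ops_old) * 100.0 / (double)ops_old;
}

/* Print the gain per block size for the AM335x figures quoted above. */
void print_dma_rewrite_gains(void)
{
	static const unsigned int block_size[] = { 16, 64, 256, 1024, 8192 };
	static const unsigned int ops_with[] = { 4318, 4360, 3609, 3418, 1766 };
	static const unsigned int ops_without[] = { 4417, 4221, 3528, 3281, 1460 };
	size_t i;

	for (i = 0; i < sizeof(block_size) / sizeof(block_size[0]); i++)
		printf("%4u byte blocks: %+.1f%%\n", block_size[i],
		       improvement_percent(ops_with[i], ops_without[i]));
}
```

Calling print_dma_rewrite_gains() from a small main() prints one line per block
size.
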
With PIO mode, good performance is observed for small blocks:
16 byte blocks: 20585 operations in 1 seconds (329360 bytes)
64 byte blocks: 8106 operations in 1 seconds (518784 bytes)
256 byte blocks: 2359 operations in 1 seconds (603904 bytes)
1024 byte blocks: 605 operations in 1 seconds (619520 bytes)
8192 byte blocks: 79 operations in 1 seconds (647168 bytes)

Future work in this direction would be to dynamically change between PIO/DMA mode
based on the block size.

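As a rough illustration of the future work mentioned above, a block-size based
choice between PIO and DMA could look like the sketch below. Everything here is
an assumption: the enum, the function name choose_xfer_mode() and the 64-byte
threshold are hypothetical. The threshold is only read off the tables above,
where PIO is faster at 16 and 64 bytes and DMA is faster from 256 bytes up; the
real crossover lies somewhere in between and has not been measured.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical transfer modes; these names are not taken from the driver. */
enum xfer_mode { XFER_PIO, XFER_DMA };

/*
 * Assumed threshold: in the sample run PIO wins at 16 and 64 byte blocks
 * and DMA wins at 256 bytes and above.
 */
#define PIO_MAX_BLOCK 64

/* Pick a mode for one request; without a DMA channel only PIO is possible. */
enum xfer_mode choose_xfer_mode(size_t nbytes, int dma_available)
{
	assert(nbytes > 0);

	if (!dma_available)
		return XFER_PIO;

	return nbytes <= PIO_MAX_BLOCK ? XFER_PIO : XFER_DMA;
}
```

A real implementation would probably want the threshold to be tunable per SoC
rather than a compile-time constant.
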
Changes since last series:
* Unaligned cases for omap-aes are handled with the patch:
   "Add support for cases of unaligned lengths"
* Support for the AM437x SoC is added and tested.
* Changes following review comments on the debug patch

Note:
  The debug patch "crypto: omap-aes: Add useful debug macros" generates a
  checkpatch error that cannot be fixed. Refer to that patch for the error
  message and the reasons why it cannot be fixed.

Joel Fernandes (14):
  crypto: scatterwalk:  Add support for calculating number of SG
    elements
  crypto: omap-aes: Add useful debug macros
  crypto: omap-aes: Populate number of SG elements
  crypto: omap-aes: Simplify DMA usage by using direct SGs
  crypto: omap-aes: Sync SG before DMA operation
  crypto: omap-aes: Remove previously used intermediate buffers
  crypto: omap-aes: Add IRQ info and helper macros
  crypto: omap-aes: PIO mode: Add IRQ handler and walk SGs
  crypto: omap-aes: PIO mode: platform data for OMAP4/AM437x and
    trigger
  crypto: omap-aes: Switch to PIO mode during probe
  crypto: omap-aes: Add support for cases of unaligned lengths
  crypto: omap-aes: Convert kzalloc to devm_kzalloc
  crypto: omap-aes: Convert request_irq to devm_request_irq
  crypto: omap-aes: Kconfig: Add build support for AM437x

 crypto/scatterwalk.c         |   22 ++
 drivers/crypto/Kconfig       |    2 +-
 drivers/crypto/omap-aes.c    |  466 +++++++++++++++++++++++-------------------
 include/crypto/scatterwalk.h |    2 +
 4 files changed, 284 insertions(+), 208 deletions(-)

-- 
1.7.9.5

* [PATCH v2 01/14] crypto: scatterwalk:  Add support for calculating number of SG elements
From: Joel Fernandes @ 2013-08-18  2:42 UTC
  To: Herbert Xu, David S. Miller, Mark Greer, Tony Lindgren,
	Santosh Shilimkar, Rajendra Nayak, Lokesh Vutla
  Cc: Linux OMAP List, Linux ARM Kernel List,
	Linux Kernel Mailing List, Linux Crypto Mailing List,
	Joel Fernandes

The crypto layer only passes nbytes to encrypt, but in the omap-aes driver we
need to know the number of SG elements to pass to the dmaengine slave API. We
add a function to the scatterwalk library to calculate this.

Signed-off-by: Joel Fernandes <joelf@ti.com>
---
 crypto/scatterwalk.c         |   22 ++++++++++++++++++++++
 include/crypto/scatterwalk.h |    2 ++
 2 files changed, 24 insertions(+)

diff --git a/crypto/scatterwalk.c b/crypto/scatterwalk.c
index 7281b8a..79ca227 100644
--- a/crypto/scatterwalk.c
+++ b/crypto/scatterwalk.c
@@ -124,3 +124,25 @@ void scatterwalk_map_and_copy(void *buf, struct scatterlist *sg,
 	scatterwalk_done(&walk, out, 0);
 }
 EXPORT_SYMBOL_GPL(scatterwalk_map_and_copy);
+
+int scatterwalk_bytes_sglen(struct scatterlist *sg, int num_bytes)
+{
+	int offset = 0, n = 0;
+
+	/* num_bytes is too small */
+	if (num_bytes < sg->length)
+		return -1;
+
+	do {
+		offset += sg->length;
+		n++;
+		sg = scatterwalk_sg_next(sg);
+
+		/* num_bytes is too large */
+		if (unlikely(!sg && (num_bytes < offset)))
+			return -1;
+	} while (sg && (num_bytes > offset));
+
+	return n;
+}
+EXPORT_SYMBOL_GPL(scatterwalk_bytes_sglen);
diff --git a/include/crypto/scatterwalk.h b/include/crypto/scatterwalk.h
index 3744d2a..13621cc 100644
--- a/include/crypto/scatterwalk.h
+++ b/include/crypto/scatterwalk.h
@@ -113,4 +113,6 @@ void scatterwalk_done(struct scatter_walk *walk, int out, int more);
 void scatterwalk_map_and_copy(void *buf, struct scatterlist *sg,
 			      unsigned int start, unsigned int nbytes, int out);
 
+int scatterwalk_bytes_sglen(struct scatterlist *sg, int num_bytes);
+
 #endif  /* _CRYPTO_SCATTERWALK_H */
-- 
1.7.9.5

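The walk performed by scatterwalk_bytes_sglen() can be tried out in userspace
with the sketch below. It is a model only: struct mock_sg and mock_bytes_sglen()
are hypothetical stand-ins for struct scatterlist and the new helper, with a
plain next pointer in place of scatterwalk_sg_next(). The conditions are copied
from the patch. In this model a request that ends in the middle of the last
entry is rejected, while a request longer than the whole list is not caught by
either check.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct scatterlist (assumption for this demo). */
struct mock_sg {
	unsigned int length;
	struct mock_sg *next;	/* NULL terminates the list */
};

/*
 * Count the entries needed to cover num_bytes, mirroring the checks in
 * the patch. Returns -1 when the first entry is longer than num_bytes or
 * when the list ends with more bytes than requested.
 */
int mock_bytes_sglen(const struct mock_sg *sg, int num_bytes)
{
	int offset = 0, n = 0;

	assert(sg != NULL);

	if (num_bytes < (int)sg->length)
		return -1;

	do {
		offset += (int)sg->length;
		n++;
		sg = sg->next;

		if (!sg && num_bytes < offset)
			return -1;
	} while (sg && num_bytes > offset);

	return n;
}
```

With two 16-byte entries, a 32-byte request needs both entries and a 16-byte
request needs only the first one.
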
* [PATCH v2 02/14] crypto: omap-aes: Add useful debug macros
From: Joel Fernandes @ 2013-08-18  2:42 UTC
  To: Herbert Xu, David S. Miller, Mark Greer, Tony Lindgren,
	Santosh Shilimkar, Rajendra Nayak, Lokesh Vutla
  Cc: Linux OMAP List, Linux ARM Kernel List,
	Linux Kernel Mailing List, Linux Crypto Mailing List,
	Joel Fernandes

When DEBUG is enabled, these macros can be used to print variables in integer
and hex format, and to clearly display which registers, offsets and values are
being read/written, including the names of the offsets and their values.

Note:
This patch results in a checkpatch error that cannot be fixed.
ERROR: Macros with multiple statements should be enclosed in a do - while loop
+#define omap_aes_read(dd, offset)                                      \
+       __raw_readl(dd->io_base + offset);                              \
+       pr_debug("omap_aes_read(" #offset ")\n");

Using a do-while loop would break a lot of code such as:
ret = omap_aes_read(..);

On the other hand, not using a do-while loop only results in a spurious debug
print message when DEBUG is enabled; any other issues would be caught at
compile time. As of now, no code in the driver requires the do-while form, but
there is code that would break if the macro used one, so we ignore the
checkpatch error in this case.

Signed-off-by: Joel Fernandes <joelf@ti.com>
---
 drivers/crypto/omap-aes.c |   22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index ee15b0f..26b802b 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -13,7 +13,8 @@
  *
  */
 
-#define pr_fmt(fmt) "%s: " fmt, __func__
+#define prn(num) pr_debug(#num "=%d\n", num)
+#define prx(num) pr_debug(#num "=%x\n", num)
 
 #include <linux/err.h>
 #include <linux/module.h>
@@ -172,16 +173,35 @@ struct omap_aes_dev {
 static LIST_HEAD(dev_list);
 static DEFINE_SPINLOCK(list_lock);
 
+#ifdef DEBUG
+/*
+ * Note: In DEBUG mode, when using conditionals, omap_aes_read _must_
+ * be surrounded by braces otherwise you may see spurious prints.
+ */
+#define omap_aes_read(dd, offset)					\
+	__raw_readl(dd->io_base + offset);				\
+	pr_debug("omap_aes_read(" #offset ")\n");
+#else
 static inline u32 omap_aes_read(struct omap_aes_dev *dd, u32 offset)
 {
 	return __raw_readl(dd->io_base + offset);
 }
+#endif
 
+#ifdef DEBUG
+#define omap_aes_write(dd, offset, value)				\
+	do {								\
+		pr_debug("omap_aes_write(" #offset "=%x) value=%d\n",	\
+			 offset, value);				\
+		__raw_writel(value, dd->io_base + offset);		\
+	} while (0)
+#else
 static inline void omap_aes_write(struct omap_aes_dev *dd, u32 offset,
 				  u32 value)
 {
 	__raw_writel(value, dd->io_base + offset);
 }
+#endif
 
 static inline void omap_aes_write_mask(struct omap_aes_dev *dd, u32 offset,
 					u32 value, u32 mask)
-- 
1.7.9.5

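For comparison, one possible alternative to the two-statement read macro is a
GNU C statement expression, which yields a value and still expands to a single
expression. The sketch below is a userspace illustration and not what this patch
does: fake_regs, fake_readl(), FAKE_REG_CTRL, dbg_read() and read_ctrl_if() are
hypothetical names, and fprintf() stands in for pr_debug(). It relies on a
GCC/Clang extension.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Fake register file standing in for dd->io_base (demo assumption). */
static uint32_t fake_regs[4] = { 0x11, 0x22, 0x33, 0x44 };

#define FAKE_REG_CTRL 1	/* hypothetical register index */

/* Plays the role of __raw_readl() in this userspace model. */
static inline uint32_t fake_readl(unsigned int index)
{
	return fake_regs[index];
}

/*
 * Statement expression: the last expression (val_) is the value of the
 * whole macro, so "ret = dbg_read(x);" works and the debug print stays
 * attached to the read even inside an unbraced if.
 */
#define dbg_read(offset)						\
	({								\
		uint32_t val_ = fake_readl(offset);			\
		fprintf(stderr, "dbg_read(" #offset ")=%x\n",		\
			(unsigned int)val_);				\
		val_;							\
	})

/* Reads the control register only when cond is set; no braces needed. */
uint32_t read_ctrl_if(int cond)
{
	uint32_t ret = 0;

	if (cond)
		ret = dbg_read(FAKE_REG_CTRL);

	return ret;
}
```

Because #offset is stringified before macro expansion, the message shows the
register name (FAKE_REG_CTRL) rather than its numeric value, which is the same
trick the prn()/prx() macros in the patch rely on.
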
* [PATCH v2 03/14] crypto: omap-aes: Populate number of SG elements
From: Joel Fernandes @ 2013-08-18  2:42 UTC
  To: Herbert Xu, David S. Miller, Mark Greer, Tony Lindgren,
	Santosh Shilimkar, Rajendra Nayak, Lokesh Vutla
  Cc: Joel Fernandes, Linux OMAP List, Linux Kernel Mailing List,
	Linux ARM Kernel List, Linux Crypto Mailing List

The crypto layer only passes nbytes, but the number of SG elements is needed to
map or unmap the SGs in one go with the dma_map* API, and it must also be passed
to the dmaengine prep function.

We call the function added to scatterwalk for this purpose in
omap_aes_handle_queue to populate the values, which are used later.

Signed-off-by: Joel Fernandes <joelf@ti.com>
---
 drivers/crypto/omap-aes.c |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index 26b802b..e369e6e 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -164,6 +164,8 @@ struct omap_aes_dev {
 	void			*buf_out;
 	int			dma_out;
 	struct dma_chan		*dma_lch_out;
+	int			in_sg_len;
+	int			out_sg_len;
 	dma_addr_t		dma_addr_out;
 
 	const struct omap_aes_pdata	*pdata;
@@ -731,6 +733,10 @@ static int omap_aes_handle_queue(struct omap_aes_dev *dd,
 	dd->out_offset = 0;
 	dd->out_sg = req->dst;
 
+	dd->in_sg_len = scatterwalk_bytes_sglen(dd->in_sg, dd->total);
+	dd->out_sg_len = scatterwalk_bytes_sglen(dd->out_sg, dd->total);
+	BUG_ON(dd->in_sg_len < 0 || dd->out_sg_len < 0);
+
 	rctx = ablkcipher_request_ctx(req);
 	ctx = crypto_ablkcipher_ctx(crypto_ablkcipher_reqtfm(req));
 	rctx->mode &= FLAGS_MODE_MASK;
-- 
1.7.9.5

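The reason for keeping two separate counts is that the source and destination
lists of one request can be segmented differently even though both cover the
same number of bytes. The userspace sketch below illustrates this with
hypothetical types (struct mock_sg, struct mock_dev) and a counting helper
modelled on the one from patch 01. One deliberate difference, stated as an
assumption: the sketch reports a bad count through a return value, whereas the
patch uses BUG_ON().

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct scatterlist (demo assumption). */
struct mock_sg {
	unsigned int length;
	struct mock_sg *next;	/* NULL terminates the list */
};

/* Only the fields of the device structure that matter for this demo. */
struct mock_dev {
	struct mock_sg *in_sg;
	struct mock_sg *out_sg;
	int total;
	int in_sg_len;
	int out_sg_len;
};

/* Same walk as the helper added in patch 01, modelled in userspace. */
int mock_bytes_sglen(const struct mock_sg *sg, int num_bytes)
{
	int offset = 0, n = 0;

	assert(sg != NULL);

	if (num_bytes < (int)sg->length)
		return -1;

	do {
		offset += (int)sg->length;
		n++;
		sg = sg->next;

		if (!sg && num_bytes < offset)
			return -1;
	} while (sg && num_bytes > offset);

	return n;
}

/* Fill in both counts; returns 0 on success, -1 if either count is bad. */
int mock_populate_sg_len(struct mock_dev *dd)
{
	dd->in_sg_len = mock_bytes_sglen(dd->in_sg, dd->total);
	dd->out_sg_len = mock_bytes_sglen(dd->out_sg, dd->total);

	return (dd->in_sg_len < 0 || dd->out_sg_len < 0) ? -1 : 0;
}
```

For a 64-byte request with a single 64-byte source entry and four 16-byte
destination entries, the two counts come out as 1 and 4.
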
* [PATCH v2 04/14] crypto: omap-aes: Simplify DMA usage by using direct SGs
From: Joel Fernandes @ 2013-08-18  2:42 UTC
  To: Herbert Xu, David S. Miller, Mark Greer, Tony Lindgren,
	Santosh Shilimkar, Rajendra Nayak, Lokesh Vutla
  Cc: Linux OMAP List, Linux ARM Kernel List,
	Linux Kernel Mailing List, Linux Crypto Mailing List,
	Joel Fernandes

In an early version of this driver, assumptions were made such as the DMA layer
requiring contiguous buffers. Because of this, new buffers were allocated,
mapped and used for DMA. These assumptions no longer hold, and DMAEngine
scatter-gather DMA has no such requirement. We simplify the DMA operations by
directly using the scatter-gather buffers provided by the crypto layer instead
of creating our own.

A lot of logic that handled DMA'ing only X bytes of the total, or as much as
fitted into a 3rd party buffer, is removed as it is no longer required.

Also, a good performance improvement of at least ~20% is seen when encrypting a
buffer of 8K (1800 ops/sec vs 1400 ops/sec). The improvement is expected to be
higher for much larger blocks, though such benchmarking is left as an exercise
for the reader. DMA usage is also much simpler and more consistent with the
rest of the code.

Signed-off-by: Joel Fernandes <joelf@ti.com>
---
 drivers/crypto/omap-aes.c |  147 ++++++++-------------------------------------
 1 file changed, 25 insertions(+), 122 deletions(-)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index e369e6e..64dd5c1 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -480,22 +480,14 @@ static int sg_copy(struct scatterlist **sg, size_t *offset, void *buf,
 }
 
 static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
-		struct scatterlist *in_sg, struct scatterlist *out_sg)
+		struct scatterlist *in_sg, struct scatterlist *out_sg,
+		int in_sg_len, int out_sg_len)
 {
 	struct omap_aes_ctx *ctx = crypto_tfm_ctx(tfm);
 	struct omap_aes_dev *dd = ctx->dd;
 	struct dma_async_tx_descriptor *tx_in, *tx_out;
 	struct dma_slave_config cfg;
-	dma_addr_t dma_addr_in = sg_dma_address(in_sg);
-	int ret, length = sg_dma_len(in_sg);
-
-	pr_debug("len: %d\n", length);
-
-	dd->dma_size = length;
-
-	if (!(dd->flags & FLAGS_FAST))
-		dma_sync_single_for_device(dd->dev, dma_addr_in, length,
-					   DMA_TO_DEVICE);
+	int ret;
 
 	memset(&cfg, 0, sizeof(cfg));
 
@@ -514,7 +506,7 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
 		return ret;
 	}
 
-	tx_in = dmaengine_prep_slave_sg(dd->dma_lch_in, in_sg, 1,
+	tx_in = dmaengine_prep_slave_sg(dd->dma_lch_in, in_sg, in_sg_len,
 					DMA_MEM_TO_DEV,
 					DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
 	if (!tx_in) {
@@ -533,7 +525,7 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
 		return ret;
 	}
 
-	tx_out = dmaengine_prep_slave_sg(dd->dma_lch_out, out_sg, 1,
+	tx_out = dmaengine_prep_slave_sg(dd->dma_lch_out, out_sg, out_sg_len,
 					DMA_DEV_TO_MEM,
 					DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
 	if (!tx_out) {
@@ -551,7 +543,7 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
 	dma_async_issue_pending(dd->dma_lch_out);
 
 	/* start DMA */
-	dd->pdata->trigger(dd, length);
+	dd->pdata->trigger(dd, dd->total);
 
 	return 0;
 }
@@ -560,93 +552,28 @@ static int omap_aes_crypt_dma_start(struct omap_aes_dev *dd)
 {
 	struct crypto_tfm *tfm = crypto_ablkcipher_tfm(
 					crypto_ablkcipher_reqtfm(dd->req));
-	int err, fast = 0, in, out;
-	size_t count;
-	dma_addr_t addr_in, addr_out;
-	struct scatterlist *in_sg, *out_sg;
-	int len32;
+	int err;
 
 	pr_debug("total: %d\n", dd->total);
 
-	if (sg_is_last(dd->in_sg) && sg_is_last(dd->out_sg)) {
-		/* check for alignment */
-		in = IS_ALIGNED((u32)dd->in_sg->offset, sizeof(u32));
-		out = IS_ALIGNED((u32)dd->out_sg->offset, sizeof(u32));
-
-		fast = in && out;
+	err = dma_map_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE);
+	if (!err) {
+		dev_err(dd->dev, "dma_map_sg() error\n");
+		return -EINVAL;
 	}
 
-	if (fast)  {
-		count = min(dd->total, sg_dma_len(dd->in_sg));
-		count = min(count, sg_dma_len(dd->out_sg));
-
-		if (count != dd->total) {
-			pr_err("request length != buffer length\n");
-			return -EINVAL;
-		}
-
-		pr_debug("fast\n");
-
-		err = dma_map_sg(dd->dev, dd->in_sg, 1, DMA_TO_DEVICE);
-		if (!err) {
-			dev_err(dd->dev, "dma_map_sg() error\n");
-			return -EINVAL;
-		}
-
-		err = dma_map_sg(dd->dev, dd->out_sg, 1, DMA_FROM_DEVICE);
-		if (!err) {
-			dev_err(dd->dev, "dma_map_sg() error\n");
-			dma_unmap_sg(dd->dev, dd->in_sg, 1, DMA_TO_DEVICE);
-			return -EINVAL;
-		}
-
-		addr_in = sg_dma_address(dd->in_sg);
-		addr_out = sg_dma_address(dd->out_sg);
-
-		in_sg = dd->in_sg;
-		out_sg = dd->out_sg;
-
-		dd->flags |= FLAGS_FAST;
-
-	} else {
-		/* use cache buffers */
-		count = sg_copy(&dd->in_sg, &dd->in_offset, dd->buf_in,
-				 dd->buflen, dd->total, 0);
-
-		len32 = DIV_ROUND_UP(count, DMA_MIN) * DMA_MIN;
-
-		/*
-		 * The data going into the AES module has been copied
-		 * to a local buffer and the data coming out will go
-		 * into a local buffer so set up local SG entries for
-		 * both.
-		 */
-		sg_init_table(&dd->in_sgl, 1);
-		dd->in_sgl.offset = dd->in_offset;
-		sg_dma_len(&dd->in_sgl) = len32;
-		sg_dma_address(&dd->in_sgl) = dd->dma_addr_in;
-
-		sg_init_table(&dd->out_sgl, 1);
-		dd->out_sgl.offset = dd->out_offset;
-		sg_dma_len(&dd->out_sgl) = len32;
-		sg_dma_address(&dd->out_sgl) = dd->dma_addr_out;
-
-		in_sg = &dd->in_sgl;
-		out_sg = &dd->out_sgl;
-
-		addr_in = dd->dma_addr_in;
-		addr_out = dd->dma_addr_out;
-
-		dd->flags &= ~FLAGS_FAST;
-
+	err = dma_map_sg(dd->dev, dd->out_sg, dd->out_sg_len, DMA_FROM_DEVICE);
+	if (!err) {
+		dev_err(dd->dev, "dma_map_sg() error\n");
+		return -EINVAL;
 	}
 
-	dd->total -= count;
-
-	err = omap_aes_crypt_dma(tfm, in_sg, out_sg);
+	err = omap_aes_crypt_dma(tfm, dd->in_sg, dd->out_sg, dd->in_sg_len,
+				 dd->out_sg_len);
 	if (err) {
-		dma_unmap_sg(dd->dev, dd->in_sg, 1, DMA_TO_DEVICE);
-		dma_unmap_sg(dd->dev, dd->out_sg, 1, DMA_TO_DEVICE);
+		dma_unmap_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE);
+		dma_unmap_sg(dd->dev, dd->out_sg, dd->out_sg_len,
+			     DMA_FROM_DEVICE);
 	}
 
 	return err;
@@ -667,7 +594,6 @@ static void omap_aes_finish_req(struct omap_aes_dev *dd, int err)
 static int omap_aes_crypt_dma_stop(struct omap_aes_dev *dd)
 {
 	int err = 0;
-	size_t count;
 
 	pr_debug("total: %d\n", dd->total);
 
@@ -676,21 +602,8 @@ static int omap_aes_crypt_dma_stop(struct omap_aes_dev *dd)
 	dmaengine_terminate_all(dd->dma_lch_in);
 	dmaengine_terminate_all(dd->dma_lch_out);
 
-	if (dd->flags & FLAGS_FAST) {
-		dma_unmap_sg(dd->dev, dd->out_sg, 1, DMA_FROM_DEVICE);
-		dma_unmap_sg(dd->dev, dd->in_sg, 1, DMA_TO_DEVICE);
-	} else {
-		dma_sync_single_for_device(dd->dev, dd->dma_addr_out,
-					   dd->dma_size, DMA_FROM_DEVICE);
-
-		/* copy data */
-		count = sg_copy(&dd->out_sg, &dd->out_offset, dd->buf_out,
-				 dd->buflen, dd->dma_size, 1);
-		if (count != dd->dma_size) {
-			err = -EINVAL;
-			pr_err("not all data converted: %u\n", count);
-		}
-	}
+	dma_unmap_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE);
+	dma_unmap_sg(dd->dev, dd->out_sg, dd->out_sg_len, DMA_FROM_DEVICE);
 
 	return err;
 }
@@ -760,21 +673,11 @@ static int omap_aes_handle_queue(struct omap_aes_dev *dd,
 static void omap_aes_done_task(unsigned long data)
 {
 	struct omap_aes_dev *dd = (struct omap_aes_dev *)data;
-	int err;
-
-	pr_debug("enter\n");
 
-	err = omap_aes_crypt_dma_stop(dd);
-
-	err = dd->err ? : err;
-
-	if (dd->total && !err) {
-		err = omap_aes_crypt_dma_start(dd);
-		if (!err)
-			return; /* DMA started. Not fininishing. */
-	}
+	pr_debug("enter done_task\n");
 
-	omap_aes_finish_req(dd, err);
+	omap_aes_crypt_dma_stop(dd);
+	omap_aes_finish_req(dd, 0);
 	omap_aes_handle_queue(dd, NULL);
 
 	pr_debug("exit\n");
-- 
1.7.9.5

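The map, start, unmap sequence that this patch reduces
omap_aes_crypt_dma_start() to can be modelled in userspace as below. All names
(mock_map_sg(), mock_unmap_sg(), mock_dma_start(), mock_fail_dir) are
hypothetical, and the mock only tracks how many entries are mapped per
direction. Note one difference from the added code in the diff above: the sketch
unmaps the input list again when mapping the output list fails, as the removed
fast path did, whereas the new code returns -EINVAL directly. Whether the driver
needs that unwinding is left to review; the sketch only shows the pattern.

```c
#include <assert.h>

enum mock_dir { MOCK_TO_DEVICE, MOCK_FROM_DEVICE };

static int mock_mapped[2];	/* entries currently mapped, per direction */
static int mock_fail_dir = -1;	/* direction forced to fail, -1 for none */

/* Like dma_map_sg(): returns the number of mapped entries, 0 on failure. */
static int mock_map_sg(int nents, enum mock_dir dir)
{
	if ((int)dir == mock_fail_dir)
		return 0;

	mock_mapped[dir] = nents;
	return nents;
}

/* Like dma_unmap_sg(): drops the mapping for one direction. */
static void mock_unmap_sg(int nents, enum mock_dir dir)
{
	(void)nents;
	mock_mapped[dir] = 0;
}

/* Map both lists in full; undo the first mapping if the second fails. */
int mock_dma_start(int in_sg_len, int out_sg_len)
{
	assert(in_sg_len > 0 && out_sg_len > 0);

	if (!mock_map_sg(in_sg_len, MOCK_TO_DEVICE))
		return -1;

	if (!mock_map_sg(out_sg_len, MOCK_FROM_DEVICE)) {
		mock_unmap_sg(in_sg_len, MOCK_TO_DEVICE);
		return -1;
	}

	return 0;
}
```

Setting mock_fail_dir to MOCK_FROM_DEVICE before calling mock_dma_start()
exercises the unwind path.
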
* [PATCH v2 04/14] crypto: omap-aes: Simplify DMA usage by using direct SGs
From: Joel Fernandes @ 2013-08-18  2:42 UTC
  To: linux-arm-kernel

In early versions of this driver, assumptions were made such as the DMA layer
requiring contiguous buffers. Because of this, new buffers were allocated,
mapped and used for DMA. These assumptions no longer hold: DMAEngine
scatter-gather DMA has no such requirement. We simplify the DMA operations
by directly using the scatter-gather buffers provided by the crypto layer
instead of creating our own.

A lot of logic that handled DMA'ing only X bytes of the total, or as much as
fitted into a 3rd party buffer, is removed as it is no longer required.

Also, a performance improvement of at least ~20% is seen when encrypting a
buffer of size 8K (1800 ops/sec vs 1400 ops/sec). The improvement should be
higher for much larger blocks, though such benchmarking is left as an exercise
for the reader. DMA usage is also much simpler and more coherent with the rest
of the code.
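
To pass a whole SG list to the DMA layer, the driver needs to know how many
entries cover the request; later in this series that count comes from
scatterwalk_bytes_sglen(). The idea can be sketched in plain userspace C.
This is only an illustration under stated assumptions: struct sg_entry and
sg_entries_for() are hypothetical stand-ins, not kernel types or helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct scatterlist. */
struct sg_entry {
	size_t length;			/* bytes described by this entry */
	struct sg_entry *next;		/* NULL terminates the list */
};

/*
 * Count how many list entries are needed to cover 'total' bytes.
 * Returns -1 if the list ends before 'total' bytes are covered.
 */
static int sg_entries_for(const struct sg_entry *sg, size_t total)
{
	int n = 0;

	while (total) {
		if (!sg)
			return -1;
		n++;
		total = (sg->length >= total) ? 0 : total - sg->length;
		sg = sg->next;
	}

	return n;
}
```

With such a count in hand, the whole list can be mapped and submitted in one
call instead of one entry at a time, which is what the diff below does with
in_sg_len and out_sg_len.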

Signed-off-by: Joel Fernandes <joelf@ti.com>
---
 drivers/crypto/omap-aes.c |  147 ++++++++-------------------------------------
 1 file changed, 25 insertions(+), 122 deletions(-)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index e369e6e..64dd5c1 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -480,22 +480,14 @@ static int sg_copy(struct scatterlist **sg, size_t *offset, void *buf,
 }
 
 static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
-		struct scatterlist *in_sg, struct scatterlist *out_sg)
+		struct scatterlist *in_sg, struct scatterlist *out_sg,
+		int in_sg_len, int out_sg_len)
 {
 	struct omap_aes_ctx *ctx = crypto_tfm_ctx(tfm);
 	struct omap_aes_dev *dd = ctx->dd;
 	struct dma_async_tx_descriptor *tx_in, *tx_out;
 	struct dma_slave_config cfg;
-	dma_addr_t dma_addr_in = sg_dma_address(in_sg);
-	int ret, length = sg_dma_len(in_sg);
-
-	pr_debug("len: %d\n", length);
-
-	dd->dma_size = length;
-
-	if (!(dd->flags & FLAGS_FAST))
-		dma_sync_single_for_device(dd->dev, dma_addr_in, length,
-					   DMA_TO_DEVICE);
+	int ret;
 
 	memset(&cfg, 0, sizeof(cfg));
 
@@ -514,7 +506,7 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
 		return ret;
 	}
 
-	tx_in = dmaengine_prep_slave_sg(dd->dma_lch_in, in_sg, 1,
+	tx_in = dmaengine_prep_slave_sg(dd->dma_lch_in, in_sg, in_sg_len,
 					DMA_MEM_TO_DEV,
 					DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
 	if (!tx_in) {
@@ -533,7 +525,7 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
 		return ret;
 	}
 
-	tx_out = dmaengine_prep_slave_sg(dd->dma_lch_out, out_sg, 1,
+	tx_out = dmaengine_prep_slave_sg(dd->dma_lch_out, out_sg, out_sg_len,
 					DMA_DEV_TO_MEM,
 					DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
 	if (!tx_out) {
@@ -551,7 +543,7 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
 	dma_async_issue_pending(dd->dma_lch_out);
 
 	/* start DMA */
-	dd->pdata->trigger(dd, length);
+	dd->pdata->trigger(dd, dd->total);
 
 	return 0;
 }
@@ -560,93 +552,28 @@ static int omap_aes_crypt_dma_start(struct omap_aes_dev *dd)
 {
 	struct crypto_tfm *tfm = crypto_ablkcipher_tfm(
 					crypto_ablkcipher_reqtfm(dd->req));
-	int err, fast = 0, in, out;
-	size_t count;
-	dma_addr_t addr_in, addr_out;
-	struct scatterlist *in_sg, *out_sg;
-	int len32;
+	int err;
 
 	pr_debug("total: %d\n", dd->total);
 
-	if (sg_is_last(dd->in_sg) && sg_is_last(dd->out_sg)) {
-		/* check for alignment */
-		in = IS_ALIGNED((u32)dd->in_sg->offset, sizeof(u32));
-		out = IS_ALIGNED((u32)dd->out_sg->offset, sizeof(u32));
-
-		fast = in && out;
+	err = dma_map_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE);
+	if (!err) {
+		dev_err(dd->dev, "dma_map_sg() error\n");
+		return -EINVAL;
 	}
 
-	if (fast)  {
-		count = min(dd->total, sg_dma_len(dd->in_sg));
-		count = min(count, sg_dma_len(dd->out_sg));
-
-		if (count != dd->total) {
-			pr_err("request length != buffer length\n");
-			return -EINVAL;
-		}
-
-		pr_debug("fast\n");
-
-		err = dma_map_sg(dd->dev, dd->in_sg, 1, DMA_TO_DEVICE);
-		if (!err) {
-			dev_err(dd->dev, "dma_map_sg() error\n");
-			return -EINVAL;
-		}
-
-		err = dma_map_sg(dd->dev, dd->out_sg, 1, DMA_FROM_DEVICE);
-		if (!err) {
-			dev_err(dd->dev, "dma_map_sg() error\n");
-			dma_unmap_sg(dd->dev, dd->in_sg, 1, DMA_TO_DEVICE);
-			return -EINVAL;
-		}
-
-		addr_in = sg_dma_address(dd->in_sg);
-		addr_out = sg_dma_address(dd->out_sg);
-
-		in_sg = dd->in_sg;
-		out_sg = dd->out_sg;
-
-		dd->flags |= FLAGS_FAST;
-
-	} else {
-		/* use cache buffers */
-		count = sg_copy(&dd->in_sg, &dd->in_offset, dd->buf_in,
-				 dd->buflen, dd->total, 0);
-
-		len32 = DIV_ROUND_UP(count, DMA_MIN) * DMA_MIN;
-
-		/*
-		 * The data going into the AES module has been copied
-		 * to a local buffer and the data coming out will go
-		 * into a local buffer so set up local SG entries for
-		 * both.
-		 */
-		sg_init_table(&dd->in_sgl, 1);
-		dd->in_sgl.offset = dd->in_offset;
-		sg_dma_len(&dd->in_sgl) = len32;
-		sg_dma_address(&dd->in_sgl) = dd->dma_addr_in;
-
-		sg_init_table(&dd->out_sgl, 1);
-		dd->out_sgl.offset = dd->out_offset;
-		sg_dma_len(&dd->out_sgl) = len32;
-		sg_dma_address(&dd->out_sgl) = dd->dma_addr_out;
-
-		in_sg = &dd->in_sgl;
-		out_sg = &dd->out_sgl;
-
-		addr_in = dd->dma_addr_in;
-		addr_out = dd->dma_addr_out;
-
-		dd->flags &= ~FLAGS_FAST;
-
+	err = dma_map_sg(dd->dev, dd->out_sg, dd->out_sg_len, DMA_FROM_DEVICE);
+	if (!err) {
+		dev_err(dd->dev, "dma_map_sg() error\n");
+		return -EINVAL;
 	}
 
-	dd->total -= count;
-
-	err = omap_aes_crypt_dma(tfm, in_sg, out_sg);
+	err = omap_aes_crypt_dma(tfm, dd->in_sg, dd->out_sg, dd->in_sg_len,
+				 dd->out_sg_len);
 	if (err) {
-		dma_unmap_sg(dd->dev, dd->in_sg, 1, DMA_TO_DEVICE);
-		dma_unmap_sg(dd->dev, dd->out_sg, 1, DMA_TO_DEVICE);
+		dma_unmap_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE);
+		dma_unmap_sg(dd->dev, dd->out_sg, dd->out_sg_len,
+			     DMA_FROM_DEVICE);
 	}
 
 	return err;
@@ -667,7 +594,6 @@ static void omap_aes_finish_req(struct omap_aes_dev *dd, int err)
 static int omap_aes_crypt_dma_stop(struct omap_aes_dev *dd)
 {
 	int err = 0;
-	size_t count;
 
 	pr_debug("total: %d\n", dd->total);
 
@@ -676,21 +602,8 @@ static int omap_aes_crypt_dma_stop(struct omap_aes_dev *dd)
 	dmaengine_terminate_all(dd->dma_lch_in);
 	dmaengine_terminate_all(dd->dma_lch_out);
 
-	if (dd->flags & FLAGS_FAST) {
-		dma_unmap_sg(dd->dev, dd->out_sg, 1, DMA_FROM_DEVICE);
-		dma_unmap_sg(dd->dev, dd->in_sg, 1, DMA_TO_DEVICE);
-	} else {
-		dma_sync_single_for_device(dd->dev, dd->dma_addr_out,
-					   dd->dma_size, DMA_FROM_DEVICE);
-
-		/* copy data */
-		count = sg_copy(&dd->out_sg, &dd->out_offset, dd->buf_out,
-				 dd->buflen, dd->dma_size, 1);
-		if (count != dd->dma_size) {
-			err = -EINVAL;
-			pr_err("not all data converted: %u\n", count);
-		}
-	}
+	dma_unmap_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE);
+	dma_unmap_sg(dd->dev, dd->out_sg, dd->out_sg_len, DMA_FROM_DEVICE);
 
 	return err;
 }
@@ -760,21 +673,11 @@ static int omap_aes_handle_queue(struct omap_aes_dev *dd,
 static void omap_aes_done_task(unsigned long data)
 {
 	struct omap_aes_dev *dd = (struct omap_aes_dev *)data;
-	int err;
-
-	pr_debug("enter\n");
 
-	err = omap_aes_crypt_dma_stop(dd);
-
-	err = dd->err ? : err;
-
-	if (dd->total && !err) {
-		err = omap_aes_crypt_dma_start(dd);
-		if (!err)
-			return; /* DMA started. Not fininishing. */
-	}
+	pr_debug("enter done_task\n");
 
-	omap_aes_finish_req(dd, err);
+	omap_aes_crypt_dma_stop(dd);
+	omap_aes_finish_req(dd, 0);
 	omap_aes_handle_queue(dd, NULL);
 
 	pr_debug("exit\n");
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 05/14] crypto: omap-aes: Sync SG before DMA operation
  2013-08-18  2:42 ` Joel Fernandes
@ 2013-08-18  2:42   ` Joel Fernandes
  -1 siblings, 0 replies; 44+ messages in thread
From: Joel Fernandes @ 2013-08-18  2:42 UTC (permalink / raw)
  To: Herbert Xu, David S. Miller, Mark Greer, Tony Lindgren,
	Santosh Shilimkar, Rajendra Nayak, Lokesh Vutla
  Cc: Linux OMAP List, Linux ARM Kernel List,
	Linux Kernel Mailing List, Linux Crypto Mailing List,
	Joel Fernandes

The earlier functions that did a similar sync are replaced by the
dma_sync_sg_* functions, which can operate on an entire SG list.
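
Why a list-wide sync matters can be pictured with a toy ownership model: the
CPU should touch the buffers only while it owns every entry. The sketch below
is a userspace illustration, not the kernel DMA API; struct sg_model and the
model_* helpers are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>

enum owner { OWNER_CPU, OWNER_DEVICE };

/* Hypothetical stand-in for one SG entry; it only tracks ownership. */
struct sg_model {
	enum owner owner;
};

/* Hand every entry to the device, as a list-wide sync for the device would. */
static void model_sync_for_device(struct sg_model *sg, size_t nents)
{
	for (size_t i = 0; i < nents; i++)
		sg[i].owner = OWNER_DEVICE;
}

/* Hand every entry back to the CPU once the transfer has finished. */
static void model_sync_for_cpu(struct sg_model *sg, size_t nents)
{
	for (size_t i = 0; i < nents; i++)
		sg[i].owner = OWNER_CPU;
}

/* The CPU may read or write the data only if it owns all entries. */
static int cpu_owns_all(const struct sg_model *sg, size_t nents)
{
	for (size_t i = 0; i < nents; i++)
		if (sg[i].owner != OWNER_CPU)
			return 0;
	return 1;
}
```

A single-buffer sync would flip only one entry in this model, leaving the
rest of the list in the wrong state.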

Signed-off-by: Joel Fernandes <joelf@ti.com>
---
 drivers/crypto/omap-aes.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index 64dd5c1..5e034a1 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -489,6 +489,8 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
 	struct dma_slave_config cfg;
 	int ret;
 
+	dma_sync_sg_for_device(dd->dev, dd->in_sg, in_sg_len, DMA_TO_DEVICE);
+
 	memset(&cfg, 0, sizeof(cfg));
 
 	cfg.src_addr = dd->phys_base + AES_REG_DATA_N(dd, 0);
@@ -676,6 +678,8 @@ static void omap_aes_done_task(unsigned long data)
 
 	pr_debug("enter done_task\n");
 
+	dma_sync_sg_for_cpu(dd->dev, dd->in_sg, dd->in_sg_len, DMA_FROM_DEVICE);
+
 	omap_aes_crypt_dma_stop(dd);
 	omap_aes_finish_req(dd, 0);
 	omap_aes_handle_queue(dd, NULL);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 06/14] crypto: omap-aes: Remove previously used intermediate buffers
  2013-08-18  2:42 ` Joel Fernandes
@ 2013-08-18  2:42   ` Joel Fernandes
  -1 siblings, 0 replies; 44+ messages in thread
From: Joel Fernandes @ 2013-08-18  2:42 UTC (permalink / raw)
  To: Herbert Xu, David S. Miller, Mark Greer, Tony Lindgren,
	Santosh Shilimkar, Rajendra Nayak, Lokesh Vutla
  Cc: Linux OMAP List, Linux ARM Kernel List,
	Linux Kernel Mailing List, Linux Crypto Mailing List,
	Joel Fernandes

Intermediate buffers were allocated, mapped and used for DMA. These are no
longer required, as previous commits in the series use the SGs from the crypto
layer directly. Along with them, remove the logic for copying SGs, as it is no
longer used, and all the associated variables in struct omap_aes_dev.

Signed-off-by: Joel Fernandes <joelf@ti.com>
---
 drivers/crypto/omap-aes.c |   90 ---------------------------------------------
 1 file changed, 90 deletions(-)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index 5e034a1..bbdd1c3 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -149,25 +149,13 @@ struct omap_aes_dev {
 	struct ablkcipher_request	*req;
 	size_t				total;
 	struct scatterlist		*in_sg;
-	struct scatterlist		in_sgl;
-	size_t				in_offset;
 	struct scatterlist		*out_sg;
-	struct scatterlist		out_sgl;
-	size_t				out_offset;
-
-	size_t			buflen;
-	void			*buf_in;
-	size_t			dma_size;
 	int			dma_in;
 	struct dma_chan		*dma_lch_in;
-	dma_addr_t		dma_addr_in;
-	void			*buf_out;
 	int			dma_out;
 	struct dma_chan		*dma_lch_out;
 	int			in_sg_len;
 	int			out_sg_len;
-	dma_addr_t		dma_addr_out;
-
 	const struct omap_aes_pdata	*pdata;
 };
 
@@ -352,33 +340,6 @@ static int omap_aes_dma_init(struct omap_aes_dev *dd)
 	dd->dma_lch_out = NULL;
 	dd->dma_lch_in = NULL;
 
-	dd->buf_in = (void *)__get_free_pages(GFP_KERNEL, OMAP_AES_CACHE_SIZE);
-	dd->buf_out = (void *)__get_free_pages(GFP_KERNEL, OMAP_AES_CACHE_SIZE);
-	dd->buflen = PAGE_SIZE << OMAP_AES_CACHE_SIZE;
-	dd->buflen &= ~(AES_BLOCK_SIZE - 1);
-
-	if (!dd->buf_in || !dd->buf_out) {
-		dev_err(dd->dev, "unable to alloc pages.\n");
-		goto err_alloc;
-	}
-
-	/* MAP here */
-	dd->dma_addr_in = dma_map_single(dd->dev, dd->buf_in, dd->buflen,
-					 DMA_TO_DEVICE);
-	if (dma_mapping_error(dd->dev, dd->dma_addr_in)) {
-		dev_err(dd->dev, "dma %d bytes error\n", dd->buflen);
-		err = -EINVAL;
-		goto err_map_in;
-	}
-
-	dd->dma_addr_out = dma_map_single(dd->dev, dd->buf_out, dd->buflen,
-					  DMA_FROM_DEVICE);
-	if (dma_mapping_error(dd->dev, dd->dma_addr_out)) {
-		dev_err(dd->dev, "dma %d bytes error\n", dd->buflen);
-		err = -EINVAL;
-		goto err_map_out;
-	}
-
 	dma_cap_zero(mask);
 	dma_cap_set(DMA_SLAVE, mask);
 
@@ -405,14 +366,6 @@ static int omap_aes_dma_init(struct omap_aes_dev *dd)
 err_dma_out:
 	dma_release_channel(dd->dma_lch_in);
 err_dma_in:
-	dma_unmap_single(dd->dev, dd->dma_addr_out, dd->buflen,
-			 DMA_FROM_DEVICE);
-err_map_out:
-	dma_unmap_single(dd->dev, dd->dma_addr_in, dd->buflen, DMA_TO_DEVICE);
-err_map_in:
-	free_pages((unsigned long)dd->buf_out, OMAP_AES_CACHE_SIZE);
-	free_pages((unsigned long)dd->buf_in, OMAP_AES_CACHE_SIZE);
-err_alloc:
 	if (err)
 		pr_err("error: %d\n", err);
 	return err;
@@ -422,11 +375,6 @@ static void omap_aes_dma_cleanup(struct omap_aes_dev *dd)
 {
 	dma_release_channel(dd->dma_lch_out);
 	dma_release_channel(dd->dma_lch_in);
-	dma_unmap_single(dd->dev, dd->dma_addr_out, dd->buflen,
-			 DMA_FROM_DEVICE);
-	dma_unmap_single(dd->dev, dd->dma_addr_in, dd->buflen, DMA_TO_DEVICE);
-	free_pages((unsigned long)dd->buf_out, OMAP_AES_CACHE_SIZE);
-	free_pages((unsigned long)dd->buf_in, OMAP_AES_CACHE_SIZE);
 }
 
 static void sg_copy_buf(void *buf, struct scatterlist *sg,
@@ -443,42 +391,6 @@ static void sg_copy_buf(void *buf, struct scatterlist *sg,
 	scatterwalk_done(&walk, out, 0);
 }
 
-static int sg_copy(struct scatterlist **sg, size_t *offset, void *buf,
-		   size_t buflen, size_t total, int out)
-{
-	unsigned int count, off = 0;
-
-	while (buflen && total) {
-		count = min((*sg)->length - *offset, total);
-		count = min(count, buflen);
-
-		if (!count)
-			return off;
-
-		/*
-		 * buflen and total are AES_BLOCK_SIZE size aligned,
-		 * so count should be also aligned
-		 */
-
-		sg_copy_buf(buf + off, *sg, *offset, count, out);
-
-		off += count;
-		buflen -= count;
-		*offset += count;
-		total -= count;
-
-		if (*offset == (*sg)->length) {
-			*sg = sg_next(*sg);
-			if (*sg)
-				*offset = 0;
-			else
-				total = 0;
-		}
-	}
-
-	return off;
-}
-
 static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
 		struct scatterlist *in_sg, struct scatterlist *out_sg,
 		int in_sg_len, int out_sg_len)
@@ -643,9 +555,7 @@ static int omap_aes_handle_queue(struct omap_aes_dev *dd,
 	/* assign new request to device */
 	dd->req = req;
 	dd->total = req->nbytes;
-	dd->in_offset = 0;
 	dd->in_sg = req->src;
-	dd->out_offset = 0;
 	dd->out_sg = req->dst;
 
 	dd->in_sg_len = scatterwalk_bytes_sglen(dd->in_sg, dd->total);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 07/14] crypto: omap-aes: Add IRQ info and helper macros
  2013-08-18  2:42 ` Joel Fernandes
@ 2013-08-18  2:42   ` Joel Fernandes
  -1 siblings, 0 replies; 44+ messages in thread
From: Joel Fernandes @ 2013-08-18  2:42 UTC (permalink / raw)
  To: Herbert Xu, David S. Miller, Mark Greer, Tony Lindgren,
	Santosh Shilimkar, Rajendra Nayak, Lokesh Vutla
  Cc: Linux OMAP List, Linux ARM Kernel List,
	Linux Kernel Mailing List, Linux Crypto Mailing List,
	Joel Fernandes

Add IRQ information to pdata and helper macros. These are required
for PIO-mode support.
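
The pattern used here, per-SoC register offsets kept in platform data and
reached through a macro, can be shown in isolation. This is a minimal
userspace sketch: the struct names aes_pdata and aes_dev are hypothetical,
BIT() is redefined locally, and only the offsets and bit positions are taken
from this patch.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)	(1u << (n))

/* Hypothetical, trimmed-down platform data: only the IRQ offsets. */
struct aes_pdata {
	uint32_t irq_enable_ofs;
	uint32_t irq_status_ofs;
};

/* Hypothetical device structure holding a pointer to its platform data. */
struct aes_dev {
	const struct aes_pdata *pdata;
};

/* The register offset is looked up per device, so it can differ per SoC. */
#define AES_REG_IRQ_STATUS(dd)	((dd)->pdata->irq_status_ofs)
#define AES_REG_IRQ_ENABLE(dd)	((dd)->pdata->irq_enable_ofs)
#define AES_REG_IRQ_DATA_IN	BIT(1)
#define AES_REG_IRQ_DATA_OUT	BIT(2)

/* Offsets as given for OMAP4 in this patch. */
static const struct aes_pdata pdata_omap4 = {
	.irq_status_ofs = 0x8c,
	.irq_enable_ofs = 0x90,
};
```

Code that uses the macros never hard-codes an offset, so adding another SoC
only means adding another pdata instance.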

Signed-off-by: Joel Fernandes <joelf@ti.com>
---
 drivers/crypto/omap-aes.c |    8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index bbdd1c3..9a964e8 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -75,6 +75,10 @@
 
 #define AES_REG_LENGTH_N(x)		(0x54 + ((x) * 0x04))
 
+#define AES_REG_IRQ_STATUS(dd)         ((dd)->pdata->irq_status_ofs)
+#define AES_REG_IRQ_ENABLE(dd)         ((dd)->pdata->irq_enable_ofs)
+#define AES_REG_IRQ_DATA_IN            BIT(1)
+#define AES_REG_IRQ_DATA_OUT           BIT(2)
 #define DEFAULT_TIMEOUT		(5*HZ)
 
 #define FLAGS_MODE_MASK		0x000f
@@ -120,6 +124,8 @@ struct omap_aes_pdata {
 	u32		data_ofs;
 	u32		rev_ofs;
 	u32		mask_ofs;
+	u32             irq_enable_ofs;
+	u32             irq_status_ofs;
 
 	u32		dma_enable_in;
 	u32		dma_enable_out;
@@ -836,6 +842,8 @@ static const struct omap_aes_pdata omap_aes_pdata_omap4 = {
 	.data_ofs	= 0x60,
 	.rev_ofs	= 0x80,
 	.mask_ofs	= 0x84,
+	.irq_status_ofs = 0x8c,
+	.irq_enable_ofs = 0x90,
 	.dma_enable_in	= BIT(5),
 	.dma_enable_out	= BIT(6),
 	.major_mask	= 0x0700,
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 08/14] crypto: omap-aes: PIO mode: Add IRQ handler and walk SGs
  2013-08-18  2:42 ` Joel Fernandes
@ 2013-08-18  2:42   ` Joel Fernandes
  -1 siblings, 0 replies; 44+ messages in thread
From: Joel Fernandes @ 2013-08-18  2:42 UTC (permalink / raw)
  To: Herbert Xu, David S. Miller, Mark Greer, Tony Lindgren,
	Santosh Shilimkar, Rajendra Nayak, Lokesh Vutla
  Cc: Linux OMAP List, Linux ARM Kernel List,
	Linux Kernel Mailing List, Linux Crypto Mailing List,
	Joel Fernandes

We add an IRQ handler that implements a state machine for PIO mode, and data
structures for walking the scatter-gather list. The IRQ handler is called in
succession, both when data is available to read and when the next data can be
sent for processing. This continues until the entire in/out SG lists have been
walked. Once the SG lists have been completely walked, the IRQ handler
schedules the done_task tasklet.
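
The alternation between the two interrupt sources can be modelled in
userspace. The sketch below is an illustration only: struct fake_aes,
fake_irq() and pio_run() are hypothetical, flat buffers replace the SG walk,
and a byte-wise XOR stands in for the AES engine. Only the DATA_IN/DATA_OUT
bit positions and the overall flow follow the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE	16u
#define IRQ_DATA_IN	(1u << 1)
#define IRQ_DATA_OUT	(1u << 2)

/* Hypothetical model of the device: data registers plus IRQ enable. */
struct fake_aes {
	uint8_t data[BLOCK_SIZE];
	uint32_t irq_enable;
	const uint8_t *in;
	uint8_t *out;
	size_t total;		/* bytes still to be read back */
	int done;		/* set where the driver would schedule done_task */
};

static void fake_irq(struct fake_aes *dd)
{
	/* In this model the enabled interrupt is the one that fired. */
	uint32_t status = dd->irq_enable;
	size_t i;

	dd->irq_enable = 0;

	if (status & IRQ_DATA_IN) {
		/* Feed one block, then wait for the result. */
		memcpy(dd->data, dd->in, BLOCK_SIZE);
		dd->in += BLOCK_SIZE;
		for (i = 0; i < BLOCK_SIZE; i++)
			dd->data[i] ^= 0xff;	/* placeholder, not AES */
		dd->irq_enable = IRQ_DATA_OUT;
	} else if (status & IRQ_DATA_OUT) {
		/* Drain one block, then either finish or ask for more input. */
		memcpy(dd->out, dd->data, BLOCK_SIZE);
		dd->out += BLOCK_SIZE;
		dd->total -= BLOCK_SIZE;
		if (!dd->total)
			dd->done = 1;
		else
			dd->irq_enable = IRQ_DATA_IN;
	}
}

/* Returns the number of simulated interrupts, or -1 on a bad length. */
static int pio_run(const uint8_t *in, uint8_t *out, size_t total)
{
	struct fake_aes dd = {
		.in = in, .out = out, .total = total,
		.irq_enable = IRQ_DATA_IN,
	};
	int irqs = 0;

	if (!total || total % BLOCK_SIZE)
		return -1;

	while (!dd.done) {
		fake_irq(&dd);
		irqs++;
	}

	return irqs;
}
```

Each block costs two interrupts in this model, one to feed it and one to
drain it, so the per-request overhead grows with the number of blocks.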

Also add a useful macro that is used throughout the IRQ code for a common
pattern of calculating how much of an SG list has been walked. This improves
code readability and avoids checkpatch errors.
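
How the macro expands can be seen in a standalone sketch. The macro body is
the one added by this patch; the surrounding types (sg_model, walk_model,
dev_model) and the two wrapper functions are hypothetical and exist only to
make the example compile in userspace.

```c
#include <assert.h>

/* Hypothetical stand-ins carrying only the fields the macro touches. */
struct sg_model {
	unsigned int offset;
	unsigned int length;
};

struct walk_model {
	unsigned int offset;
};

struct dev_model {
	struct sg_model *in_sg, *out_sg;
	struct walk_model in_walk, out_walk;
};

/* Token pasting turns _calc_walked(in) into in_walk / in_sg accesses. */
#define _calc_walked(inout) (dd->inout##_walk.offset - dd->inout##_sg->offset)

/* The macro relies on a variable named 'dd' being in scope. */
static unsigned int in_walked(const struct dev_model *dd)
{
	return _calc_walked(in);
}

static unsigned int out_walked(const struct dev_model *dd)
{
	return _calc_walked(out);
}
```

The dependence on a local named dd keeps call sites short, at the cost of
the macro being usable only in functions that follow that naming.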

Signed-off-by: Joel Fernandes <joelf@ti.com>
---
 drivers/crypto/omap-aes.c |   90 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 90 insertions(+)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index 9a964e8..889dc99 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -39,6 +39,8 @@
 #define DST_MAXBURST			4
 #define DMA_MIN				(DST_MAXBURST * sizeof(u32))
 
+#define _calc_walked(inout) (dd->inout##_walk.offset - dd->inout##_sg->offset)
+
 /* OMAP TRM gives bitfields as start:end, where start is the higher bit
    number. For example 7:0 */
 #define FLD_MASK(start, end)	(((1 << ((start) - (end) + 1)) - 1) << (end))
@@ -91,6 +93,8 @@
 #define FLAGS_FAST		BIT(5)
 #define FLAGS_BUSY		BIT(6)
 
+#define AES_BLOCK_WORDS		(AES_BLOCK_SIZE >> 2)
+
 struct omap_aes_ctx {
 	struct omap_aes_dev *dd;
 
@@ -156,6 +160,8 @@ struct omap_aes_dev {
 	size_t				total;
 	struct scatterlist		*in_sg;
 	struct scatterlist		*out_sg;
+	struct scatter_walk		in_walk;
+	struct scatter_walk		out_walk;
 	int			dma_in;
 	struct dma_chan		*dma_lch_in;
 	int			dma_out;
@@ -852,6 +858,90 @@ static const struct omap_aes_pdata omap_aes_pdata_omap4 = {
 	.minor_shift	= 0,
 };
 
+static irqreturn_t omap_aes_irq(int irq, void *dev_id)
+{
+	struct omap_aes_dev *dd = dev_id;
+	u32 status, i;
+	u32 *src, *dst;
+
+	status = omap_aes_read(dd, AES_REG_IRQ_STATUS(dd));
+	if (status & AES_REG_IRQ_DATA_IN) {
+		omap_aes_write(dd, AES_REG_IRQ_ENABLE(dd), 0x0);
+
+		BUG_ON(!dd->in_sg);
+
+		BUG_ON(_calc_walked(in) > dd->in_sg->length);
+
+		src = sg_virt(dd->in_sg) + _calc_walked(in);
+
+		for (i = 0; i < AES_BLOCK_WORDS; i++) {
+			omap_aes_write(dd, AES_REG_DATA_N(dd, i), *src);
+
+			scatterwalk_advance(&dd->in_walk, 4);
+			if (dd->in_sg->length == _calc_walked(in)) {
+				dd->in_sg = scatterwalk_sg_next(dd->in_sg);
+				if (dd->in_sg) {
+					scatterwalk_start(&dd->in_walk,
+							  dd->in_sg);
+					src = sg_virt(dd->in_sg) +
+					      _calc_walked(in);
+				}
+			} else {
+				src++;
+			}
+		}
+
+		/* Clear IRQ status */
+		status &= ~AES_REG_IRQ_DATA_IN;
+		omap_aes_write(dd, AES_REG_IRQ_STATUS(dd), status);
+
+		/* Enable DATA_OUT interrupt */
+		omap_aes_write(dd, AES_REG_IRQ_ENABLE(dd), 0x4);
+
+	} else if (status & AES_REG_IRQ_DATA_OUT) {
+		omap_aes_write(dd, AES_REG_IRQ_ENABLE(dd), 0x0);
+
+		BUG_ON(!dd->out_sg);
+
+		BUG_ON(_calc_walked(out) > dd->out_sg->length);
+
+		dst = sg_virt(dd->out_sg) + _calc_walked(out);
+
+		for (i = 0; i < AES_BLOCK_WORDS; i++) {
+			*dst = omap_aes_read(dd, AES_REG_DATA_N(dd, i));
+			scatterwalk_advance(&dd->out_walk, 4);
+			if (dd->out_sg->length == _calc_walked(out)) {
+				dd->out_sg = scatterwalk_sg_next(dd->out_sg);
+				if (dd->out_sg) {
+					scatterwalk_start(&dd->out_walk,
+							  dd->out_sg);
+					dst = sg_virt(dd->out_sg) +
+					      _calc_walked(out);
+				}
+			} else {
+				dst++;
+			}
+		}
+
+		dd->total -= AES_BLOCK_SIZE;
+
+		BUG_ON(dd->total < 0);
+
+		/* Clear IRQ status */
+		status &= ~AES_REG_IRQ_DATA_OUT;
+		omap_aes_write(dd, AES_REG_IRQ_STATUS(dd), status);
+
+		if (!dd->total)
+			/* All bytes read! */
+			tasklet_schedule(&dd->done_task);
+		else
+			/* Enable DATA_IN interrupt for next block */
+			omap_aes_write(dd, AES_REG_IRQ_ENABLE(dd), 0x2);
+	}
+
+	return IRQ_HANDLED;
+}
+
 static const struct of_device_id omap_aes_of_match[] = {
 	{
 		.compatible	= "ti,omap2-aes",
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 09/14] crypto: omap-aes: PIO mode: platform data for OMAP4/AM437x and trigger
  2013-08-18  2:42 ` Joel Fernandes
@ 2013-08-18  2:42   ` Joel Fernandes
  -1 siblings, 0 replies; 44+ messages in thread
From: Joel Fernandes @ 2013-08-18  2:42 UTC (permalink / raw)
  To: Herbert Xu, David S. Miller, Mark Greer, Tony Lindgren,
	Santosh Shilimkar, Rajendra Nayak, Lokesh Vutla
  Cc: Linux OMAP List, Linux ARM Kernel List,
	Linux Kernel Mailing List, Linux Crypto Mailing List,
	Joel Fernandes

Initialize the scatterlist walks needed for PIO mode and, when the pio_only
flag is set, skip all DMA-only paths such as mapping and unmapping buffers.
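
For readers following the PIO flow, here is a minimal userspace sketch of the
interrupt ping-pong that the pio_only branch starts: DATA_IN is armed once,
and each interrupt then hands over to the other until no bytes remain. It is
not driver code. struct pio_model, fake_engine() and the flat src/dst buffers
are assumptions made for illustration (the driver walks scatterlists and uses
the real AES engine); only the 0x2/0x4 enable values come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_WORDS 4                  /* one AES block is 4 x 32-bit words */
#define BLOCK_SIZE  (BLOCK_WORDS * 4)

/* Enable values as written by the patch: 0x2 = DATA_IN, 0x4 = DATA_OUT. */
enum { EN_NONE = 0x0, EN_DATA_IN = 0x2, EN_DATA_OUT = 0x4 };

/* Hypothetical stand-in for the device state, not the driver's struct. */
struct pio_model {
	const uint8_t *src;          /* flat input (replaces the in SG walk) */
	uint8_t *dst;                /* flat output (replaces the out SG walk) */
	size_t total;                /* bytes still to be read back */
	uint32_t data[BLOCK_WORDS];  /* models the AES_REG_DATA_N registers */
	uint32_t irq_enable;         /* models AES_REG_IRQ_ENABLE */
	int done;                    /* set where the driver schedules done_task */
};

/* Toy engine: XOR with a constant instead of AES, purely for illustration. */
void fake_engine(struct pio_model *m)
{
	for (int i = 0; i < BLOCK_WORDS; i++)
		m->data[i] ^= 0xA5A5A5A5u;
}

/* Mirrors the pio_only branch: set up the walks, arm DATA_IN, return. */
void pio_start(struct pio_model *m, const uint8_t *src, uint8_t *dst,
	       size_t total)
{
	assert(total > 0 && total % BLOCK_SIZE == 0);
	m->src = src;
	m->dst = dst;
	m->total = total;
	m->done = 0;
	m->irq_enable = EN_DATA_IN;
}

/* One simulated interrupt; returns 0 when no interrupt source is enabled. */
int pio_irq(struct pio_model *m)
{
	if (m->irq_enable == EN_DATA_IN) {
		memcpy(m->data, m->src, BLOCK_SIZE);   /* feed one block */
		m->src += BLOCK_SIZE;
		fake_engine(m);
		m->irq_enable = EN_DATA_OUT;           /* wait for the result */
		return 1;
	}
	if (m->irq_enable == EN_DATA_OUT) {
		memcpy(m->dst, m->data, BLOCK_SIZE);   /* drain one block */
		m->dst += BLOCK_SIZE;
		m->total -= BLOCK_SIZE;
		if (m->total == 0) {
			m->irq_enable = EN_NONE;
			m->done = 1;                   /* driver: tasklet_schedule() */
		} else {
			m->irq_enable = EN_DATA_IN;    /* next block */
		}
		return 1;
	}
	return 0;
}

/* Runs the model to completion; returns the number of interrupts taken. */
int pio_run(struct pio_model *m, const uint8_t *src, uint8_t *dst,
	    size_t total)
{
	int irqs = 0;

	pio_start(m, src, dst, total);
	while (pio_irq(m))
		irqs++;
	return irqs;
}
```

In this model every block costs two interrupts, one to feed it and one to
drain it, so the per-block overhead is fixed regardless of request size.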

Signed-off-by: Joel Fernandes <joelf@ti.com>
---
 drivers/crypto/omap-aes.c |   43 ++++++++++++++++++++++++++++++-------------
 1 file changed, 30 insertions(+), 13 deletions(-)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index 889dc99..62b3260 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -168,6 +168,7 @@ struct omap_aes_dev {
 	struct dma_chan		*dma_lch_out;
 	int			in_sg_len;
 	int			out_sg_len;
+	int			pio_only;
 	const struct omap_aes_pdata	*pdata;
 };
 
@@ -413,6 +414,16 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
 	struct dma_slave_config cfg;
 	int ret;
 
+	if (dd->pio_only) {
+		scatterwalk_start(&dd->in_walk, dd->in_sg);
+		scatterwalk_start(&dd->out_walk, dd->out_sg);
+
+		/* Enable DATAIN interrupt and let it take
+		   care of the rest */
+		omap_aes_write(dd, AES_REG_IRQ_ENABLE(dd), 0x2);
+		return 0;
+	}
+
 	dma_sync_sg_for_device(dd->dev, dd->in_sg, in_sg_len, DMA_TO_DEVICE);
 
 	memset(&cfg, 0, sizeof(cfg));
@@ -482,21 +493,25 @@ static int omap_aes_crypt_dma_start(struct omap_aes_dev *dd)
 
 	pr_debug("total: %d\n", dd->total);
 
-	err = dma_map_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE);
-	if (!err) {
-		dev_err(dd->dev, "dma_map_sg() error\n");
-		return -EINVAL;
-	}
+	if (!dd->pio_only) {
+		err = dma_map_sg(dd->dev, dd->in_sg, dd->in_sg_len,
+				 DMA_TO_DEVICE);
+		if (!err) {
+			dev_err(dd->dev, "dma_map_sg() error\n");
+			return -EINVAL;
+		}
 
-	err = dma_map_sg(dd->dev, dd->out_sg, dd->out_sg_len, DMA_FROM_DEVICE);
-	if (!err) {
-		dev_err(dd->dev, "dma_map_sg() error\n");
-		return -EINVAL;
+		err = dma_map_sg(dd->dev, dd->out_sg, dd->out_sg_len,
+				 DMA_FROM_DEVICE);
+		if (!err) {
+			dev_err(dd->dev, "dma_map_sg() error\n");
+			return -EINVAL;
+		}
 	}
 
 	err = omap_aes_crypt_dma(tfm, dd->in_sg, dd->out_sg, dd->in_sg_len,
 				 dd->out_sg_len);
-	if (err) {
+	if (err && !dd->pio_only) {
 		dma_unmap_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE);
 		dma_unmap_sg(dd->dev, dd->out_sg, dd->out_sg_len,
 			     DMA_FROM_DEVICE);
@@ -600,9 +615,11 @@ static void omap_aes_done_task(unsigned long data)
 
 	pr_debug("enter done_task\n");
 
-	dma_sync_sg_for_cpu(dd->dev, dd->in_sg, dd->in_sg_len, DMA_FROM_DEVICE);
-
-	omap_aes_crypt_dma_stop(dd);
+	if (!dd->pio_only) {
+		dma_sync_sg_for_device(dd->dev, dd->out_sg, dd->out_sg_len,
+				       DMA_FROM_DEVICE);
+		omap_aes_crypt_dma_stop(dd);
+	}
 	omap_aes_finish_req(dd, 0);
 	omap_aes_handle_queue(dd, NULL);
 
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 10/14] crypto: omap-aes: Switch to PIO mode during probe
  2013-08-18  2:42 ` Joel Fernandes
@ 2013-08-18  2:42   ` Joel Fernandes
  -1 siblings, 0 replies; 44+ messages in thread
From: Joel Fernandes @ 2013-08-18  2:42 UTC (permalink / raw)
  To: Herbert Xu, David S. Miller, Mark Greer, Tony Lindgren,
	Santosh Shilimkar, Rajendra Nayak, Lokesh Vutla
  Cc: Linux OMAP List, Linux ARM Kernel List,
	Linux Kernel Mailing List, Linux Crypto Mailing List,
	Joel Fernandes

If requesting the DMA channels fails, or channel numbers are not provided in
DT or platform data, switch to PIO-only mode, provided the platform supplies
an IRQ number and the interrupt register offsets in DT or platform data. All
DMA-only paths are avoided in this mode.
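
The decision described above can be sketched as a small pure function. This
is an illustration only: struct probe_info, enum xfer_mode and choose_mode()
are hypothetical names, not part of the driver. One assumption is labeled in
the code: when DMA setup fails and the IRQ register offsets are missing, the
sketch reports an error, whereas the hunk as posted simply skips the PIO
block and carries on.

```c
#include <assert.h>

enum xfer_mode { MODE_DMA, MODE_PIO, MODE_ERROR };

/* Hypothetical condensed view of what probe knows at this point. */
struct probe_info {
	int dma_err;              /* result of DMA channel setup, 0 on success */
	unsigned irq_status_ofs;  /* 0: no IRQ status register in platform data */
	unsigned irq_enable_ofs;  /* 0: no IRQ enable register in platform data */
	int irq;                  /* negative when no IRQ is described */
};

enum xfer_mode choose_mode(const struct probe_info *p)
{
	assert(p != 0);

	/* DMA channels were obtained: keep using DMA. */
	if (!p->dma_err)
		return MODE_DMA;

	/*
	 * Assumption: without the interrupt registers there is no way to
	 * drive PIO, so this sketch treats the case as a hard failure.
	 */
	if (!p->irq_status_ofs || !p->irq_enable_ofs)
		return MODE_ERROR;

	/* PIO is interrupt driven, so a valid IRQ number is required. */
	if (p->irq < 0)
		return MODE_ERROR;

	return MODE_PIO;
}
```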

Signed-off-by: Joel Fernandes <joelf@ti.com>
---
 drivers/crypto/omap-aes.c |   28 +++++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index 62b3260..d214dbf 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -1064,7 +1064,7 @@ static int omap_aes_probe(struct platform_device *pdev)
 	struct omap_aes_dev *dd;
 	struct crypto_alg *algp;
 	struct resource res;
-	int err = -ENOMEM, i, j;
+	int err = -ENOMEM, i, j, irq = -1;
 	u32 reg;
 
 	dd = kzalloc(sizeof(struct omap_aes_dev), GFP_KERNEL);
@@ -1108,8 +1108,23 @@ static int omap_aes_probe(struct platform_device *pdev)
 	tasklet_init(&dd->queue_task, omap_aes_queue_task, (unsigned long)dd);
 
 	err = omap_aes_dma_init(dd);
-	if (err)
-		goto err_dma;
+	if (err && AES_REG_IRQ_STATUS(dd) && AES_REG_IRQ_ENABLE(dd)) {
+		dd->pio_only = 1;
+
+		irq = platform_get_irq(pdev, 0);
+		if (irq < 0) {
+			dev_err(dev, "can't get IRQ resource\n");
+			goto err_irq;
+		}
+
+		err = request_irq(irq, omap_aes_irq, 0,
+				dev_name(dev), dd);
+		if (err) {
+			dev_err(dev, "Unable to grab omap-aes IRQ\n");
+			goto err_irq;
+		}
+	}
+
 
 	INIT_LIST_HEAD(&dd->list);
 	spin_lock(&list_lock);
@@ -1137,8 +1152,11 @@ err_algs:
 		for (j = dd->pdata->algs_info[i].registered - 1; j >= 0; j--)
 			crypto_unregister_alg(
 					&dd->pdata->algs_info[i].algs_list[j]);
-	omap_aes_dma_cleanup(dd);
-err_dma:
+	if (dd->pio_only)
+		free_irq(irq, dd);
+	else
+		omap_aes_dma_cleanup(dd);
+err_irq:
 	tasklet_kill(&dd->done_task);
 	tasklet_kill(&dd->queue_task);
 	pm_runtime_disable(dev);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 11/14] crypto: omap-aes: Add support for cases of unaligned lengths
  2013-08-18  2:42 ` Joel Fernandes
  (?)
@ 2013-08-18  2:42   ` Joel Fernandes
  -1 siblings, 0 replies; 44+ messages in thread
From: Joel Fernandes @ 2013-08-18  2:42 UTC (permalink / raw)
  To: Herbert Xu, David S. Miller, Mark Greer, Tony Lindgren,
	Santosh Shilimkar, Rajendra Nayak, Lokesh Vutla
  Cc: Joel Fernandes, Linux OMAP List, Linux Kernel Mailing List,
	Linux ARM Kernel List, Linux Crypto Mailing List

For cases where the offset or length of any entry in the input or output SG
list is not suitably aligned (offsets are checked for 4-byte alignment and
lengths against AES_BLOCK_SIZE), we copy all the pages from the input SG list
into a contiguous buffer and prepare a single-element SG list for this buffer,
with its length set to the total number of bytes to crypt.

This is required for cases such as an SG list of 16 bytes total size that
contains 16 pages, each holding 1 byte. DMA directly to or from the buffers
of such lists is not possible.

For this purpose, we first detect the unaligned case, then allocate enough
pages to satisfy the request and prepare the SG lists. We then copy data into
the buffer, and copy data out of it on completion.
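
As a rough userspace illustration of the approach (not the driver code),
the sketch below checks a list of segments with the same two conditions as
omap_aes_check_aligned() and, for the unaligned case, gathers the input into
a contiguous bounce buffer, processes it, and scatters the result back.
struct seg, segs_copy(), fake_crypt() and bounce_crypt() are hypothetical
names; malloc() stands in for __get_free_pages(), and the byte inversion
stands in for the AES engine. Unlike the posted hunk, the sketch frees both
buffers when only one allocation succeeds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 16

/* Hypothetical userspace stand-in for one scatterlist entry. */
struct seg {
	uint8_t *base;
	size_t offset;
	size_t length;
};

/* Same conditions as the patch: 0 if usable directly, -1 otherwise. */
int segs_check_aligned(const struct seg *sg, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (sg[i].offset % 4)
			return -1;
		if (sg[i].length % BLOCK_SIZE)
			return -1;
	}
	return 0;
}

/*
 * Copy up to total bytes between the segments and a flat buffer.
 * to_segs == 0 gathers into buf, to_segs != 0 scatters out of buf.
 * Returns the number of bytes copied.
 */
size_t segs_copy(uint8_t *buf, const struct seg *sg, size_t n, size_t total,
		 int to_segs)
{
	size_t done = 0;

	for (size_t i = 0; i < n && done < total; i++) {
		uint8_t *p = sg[i].base + sg[i].offset;
		size_t len = sg[i].length;

		if (len > total - done)
			len = total - done;
		if (to_segs)
			memcpy(p, buf + done, len);
		else
			memcpy(buf + done, p, len);
		done += len;
	}
	return done;
}

/* Placeholder for the hardware operation: inverts every byte. */
void fake_crypt(uint8_t *dst, const uint8_t *src, size_t len)
{
	for (size_t i = 0; i < len; i++)
		dst[i] = (uint8_t)~src[i];
}

/* Bounce-buffer path for unaligned lists; returns 0 on success, -1 on error. */
int bounce_crypt(const struct seg *in, size_t nin, const struct seg *out,
		 size_t nout, size_t total)
{
	uint8_t *buf_in = malloc(total);
	uint8_t *buf_out = malloc(total);

	if (!buf_in || !buf_out) {
		free(buf_in);              /* free(NULL) is a no-op */
		free(buf_out);
		return -1;
	}

	segs_copy(buf_in, in, nin, total, 0);      /* gather input */
	fake_crypt(buf_out, buf_in, total);        /* one contiguous request */
	segs_copy(buf_out, out, nout, total, 1);   /* scatter result */

	free(buf_in);
	free(buf_out);
	return 0;
}
```

With the example from the description, sixteen 1-byte segments fail the
check and go through bounce_crypt(), while a single 16-byte segment at
offset 0 passes and could be handed over directly.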

Signed-off-by: Joel Fernandes <joelf@ti.com>
---
 drivers/crypto/omap-aes.c |   86 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 83 insertions(+), 3 deletions(-)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index d214dbf..cbeaaf4 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -157,9 +157,23 @@ struct omap_aes_dev {
 	struct tasklet_struct	queue_task;
 
 	struct ablkcipher_request	*req;
+
+	/*
+	 * total is used by PIO mode for book keeping so introduce
+	 * variable total_save as need it to calc page_order
+	 */
 	size_t				total;
+	size_t				total_save;
+
 	struct scatterlist		*in_sg;
 	struct scatterlist		*out_sg;
+
+	/* Buffers for copying for unaligned cases */
+	struct scatterlist		in_sgl;
+	struct scatterlist		out_sgl;
+	struct scatterlist		*orig_out;
+	int				sgs_copied;
+
 	struct scatter_walk		in_walk;
 	struct scatter_walk		out_walk;
 	int			dma_in;
@@ -543,12 +557,51 @@ static int omap_aes_crypt_dma_stop(struct omap_aes_dev *dd)
 	dmaengine_terminate_all(dd->dma_lch_in);
 	dmaengine_terminate_all(dd->dma_lch_out);
 
-	dma_unmap_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE);
-	dma_unmap_sg(dd->dev, dd->out_sg, dd->out_sg_len, DMA_FROM_DEVICE);
-
 	return err;
 }
 
+int omap_aes_check_aligned(struct scatterlist *sg)
+{
+	while (sg) {
+		if (!IS_ALIGNED(sg->offset, 4))
+			return -1;
+		if (!IS_ALIGNED(sg->length, AES_BLOCK_SIZE))
+			return -1;
+		sg = sg_next(sg);
+	}
+	return 0;
+}
+
+int omap_aes_copy_sgs(struct omap_aes_dev *dd)
+{
+	void *buf_in, *buf_out;
+	int pages;
+
+	pages = get_order(dd->total);
+
+	buf_in = (void *)__get_free_pages(GFP_ATOMIC, pages);
+	buf_out = (void *)__get_free_pages(GFP_ATOMIC, pages);
+
+	if (!buf_in || !buf_out) {
+		pr_err("Couldn't allocate pages for unaligned cases.\n");
+		return -1;
+	}
+
+	dd->orig_out = dd->out_sg;
+
+	sg_copy_buf(buf_in, dd->in_sg, 0, dd->total, 0);
+
+	sg_init_table(&dd->in_sgl, 1);
+	sg_set_buf(&dd->in_sgl, buf_in, dd->total);
+	dd->in_sg = &dd->in_sgl;
+
+	sg_init_table(&dd->out_sgl, 1);
+	sg_set_buf(&dd->out_sgl, buf_out, dd->total);
+	dd->out_sg = &dd->out_sgl;
+
+	return 0;
+}
+
 static int omap_aes_handle_queue(struct omap_aes_dev *dd,
 			       struct ablkcipher_request *req)
 {
@@ -582,9 +635,19 @@ static int omap_aes_handle_queue(struct omap_aes_dev *dd,
 	/* assign new request to device */
 	dd->req = req;
 	dd->total = req->nbytes;
+	dd->total_save = req->nbytes;
 	dd->in_sg = req->src;
 	dd->out_sg = req->dst;
 
+	if (omap_aes_check_aligned(dd->in_sg) ||
+	    omap_aes_check_aligned(dd->out_sg)) {
+		if (omap_aes_copy_sgs(dd))
+			pr_err("Failed to copy SGs for unaligned cases\n");
+		dd->sgs_copied = 1;
+	} else {
+		dd->sgs_copied = 0;
+	}
+
 	dd->in_sg_len = scatterwalk_bytes_sglen(dd->in_sg, dd->total);
 	dd->out_sg_len = scatterwalk_bytes_sglen(dd->out_sg, dd->total);
 	BUG_ON(dd->in_sg_len < 0 || dd->out_sg_len < 0);
@@ -612,14 +675,31 @@ static int omap_aes_handle_queue(struct omap_aes_dev *dd,
 static void omap_aes_done_task(unsigned long data)
 {
 	struct omap_aes_dev *dd = (struct omap_aes_dev *)data;
+	void *buf_in, *buf_out;
+	int pages;
 
 	pr_debug("enter done_task\n");
 
 	if (!dd->pio_only) {
 		dma_sync_sg_for_device(dd->dev, dd->out_sg, dd->out_sg_len,
 				       DMA_FROM_DEVICE);
+		dma_unmap_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE);
+		dma_unmap_sg(dd->dev, dd->out_sg, dd->out_sg_len,
+			     DMA_FROM_DEVICE);
 		omap_aes_crypt_dma_stop(dd);
 	}
+
+	if (dd->sgs_copied) {
+		buf_in = sg_virt(&dd->in_sgl);
+		buf_out = sg_virt(&dd->out_sgl);
+
+		sg_copy_buf(buf_out, dd->orig_out, 0, dd->total_save, 1);
+
+		pages = get_order(dd->total_save);
+		free_pages((unsigned long)buf_in, pages);
+		free_pages((unsigned long)buf_out, pages);
+	}
+
 	omap_aes_finish_req(dd, 0);
 	omap_aes_handle_queue(dd, NULL);
 
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 12/14] crypto: omap-aes: Convert kzalloc to devm_kzalloc
  2013-08-18  2:42 ` Joel Fernandes
@ 2013-08-18  2:42   ` Joel Fernandes
  -1 siblings, 0 replies; 44+ messages in thread
From: Joel Fernandes @ 2013-08-18  2:42 UTC (permalink / raw)
  To: Herbert Xu, David S. Miller, Mark Greer, Tony Lindgren,
	Santosh Shilimkar, Rajendra Nayak, Lokesh Vutla
  Cc: Linux OMAP List, Linux ARM Kernel List,
	Linux Kernel Mailing List, Linux Crypto Mailing List,
	Joel Fernandes

Use devm_kzalloc instead of kzalloc. With this change, there is no need to
call kfree in error/exit paths.

Signed-off-by: Joel Fernandes <joelf@ti.com>
---
 drivers/crypto/omap-aes.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index cbeaaf4..6a4ac4a 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -1147,7 +1147,7 @@ static int omap_aes_probe(struct platform_device *pdev)
 	int err = -ENOMEM, i, j, irq = -1;
 	u32 reg;
 
-	dd = kzalloc(sizeof(struct omap_aes_dev), GFP_KERNEL);
+	dd = devm_kzalloc(dev, sizeof(struct omap_aes_dev), GFP_KERNEL);
 	if (dd == NULL) {
 		dev_err(dev, "unable to alloc data struct.\n");
 		goto err_data;
@@ -1241,7 +1241,6 @@ err_irq:
 	tasklet_kill(&dd->queue_task);
 	pm_runtime_disable(dev);
 err_res:
-	kfree(dd);
 	dd = NULL;
 err_data:
 	dev_err(dev, "initialization failed.\n");
@@ -1269,7 +1268,6 @@ static int omap_aes_remove(struct platform_device *pdev)
 	tasklet_kill(&dd->queue_task);
 	omap_aes_dma_cleanup(dd);
 	pm_runtime_disable(dd->dev);
-	kfree(dd);
 	dd = NULL;
 
 	return 0;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 13/14] crypto: omap-aes: Convert request_irq to devm_request_irq
  2013-08-18  2:42 ` Joel Fernandes
@ 2013-08-18  2:42   ` Joel Fernandes
  -1 siblings, 0 replies; 44+ messages in thread
From: Joel Fernandes @ 2013-08-18  2:42 UTC (permalink / raw)
  To: Herbert Xu, David S. Miller, Mark Greer, Tony Lindgren,
	Santosh Shilimkar, Rajendra Nayak, Lokesh Vutla
  Cc: Linux OMAP List, Linux ARM Kernel List,
	Linux Kernel Mailing List, Linux Crypto Mailing List,
	Joel Fernandes

Use devm_request_irq instead of request_irq. This keeps the exit and error
paths simpler, since the IRQ no longer has to be freed explicitly.

Suggested-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Joel Fernandes <joelf@ti.com>
---
 drivers/crypto/omap-aes.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index 6a4ac4a..ab449b5 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -1197,7 +1197,7 @@ static int omap_aes_probe(struct platform_device *pdev)
 			goto err_irq;
 		}
 
-		err = request_irq(irq, omap_aes_irq, 0,
+		err = devm_request_irq(dev, irq, omap_aes_irq, 0,
 				dev_name(dev), dd);
 		if (err) {
 			dev_err(dev, "Unable to grab omap-aes IRQ\n");
@@ -1232,9 +1232,7 @@ err_algs:
 		for (j = dd->pdata->algs_info[i].registered - 1; j >= 0; j--)
 			crypto_unregister_alg(
 					&dd->pdata->algs_info[i].algs_list[j]);
-	if (dd->pio_only)
-		free_irq(irq, dd);
-	else
+	if (!dd->pio_only)
 		omap_aes_dma_cleanup(dd);
 err_irq:
 	tasklet_kill(&dd->done_task);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 14/14] crypto: omap-aes: Kconfig: Add build support for AM437x
  2013-08-18  2:42 ` Joel Fernandes
  (?)
@ 2013-08-18  2:42   ` Joel Fernandes
  -1 siblings, 0 replies; 44+ messages in thread
From: Joel Fernandes @ 2013-08-18  2:42 UTC (permalink / raw)
  To: Herbert Xu, David S. Miller, Mark Greer, Tony Lindgren,
	Santosh Shilimkar, Rajendra Nayak, Lokesh Vutla
  Cc: Joel Fernandes, Linux OMAP List, Linux Kernel Mailing List,
	Linux ARM Kernel List, Linux Crypto Mailing List

For AM437x SoC, ARCH_OMAP2 and ARCH_OMAP3 are not enabled in the defconfig. We
follow the same approach as the SHA driver and add a dependency on
ARCH_OMAP2PLUS so that the AES driver is selectable on AM437x SoC builds.

Signed-off-by: Joel Fernandes <joelf@ti.com>
---
 drivers/crypto/Kconfig |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index dffb855..e289afa 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -252,7 +252,7 @@ config CRYPTO_DEV_OMAP_SHAM
 
 config CRYPTO_DEV_OMAP_AES
 	tristate "Support for OMAP AES hw engine"
-	depends on ARCH_OMAP2 || ARCH_OMAP3
+	depends on ARCH_OMAP2 || ARCH_OMAP3 || ARCH_OMAP2PLUS
 	select CRYPTO_AES
 	select CRYPTO_BLKCIPHER2
 	help
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 02/14] crypto: omap-aes: Add useful debug macros
  2013-08-18  2:42   ` Joel Fernandes
@ 2013-08-18  4:22     ` Joe Perches
  -1 siblings, 0 replies; 44+ messages in thread
From: Joe Perches @ 2013-08-18  4:22 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Herbert Xu, David S. Miller, Mark Greer, Tony Lindgren,
	Santosh Shilimkar, Rajendra Nayak, Lokesh Vutla, Linux OMAP List,
	Linux ARM Kernel List, Linux Kernel Mailing List,
	Linux Crypto Mailing List

On Sat, 2013-08-17 at 21:42 -0500, Joel Fernandes wrote:
> When DEBUG is enabled, these macros can be used to print variables in integer
> and hex format, and clearly display which registers, offsets and values are
> being read/written, including printing the names of the offsets and their values.
> 
> Note:
> This patch results in a checkpatch error that cannot be fixed.
> ERROR: Macros with multiple statements should be enclosed in a do - while loop
> +#define omap_aes_read(dd, offset)                                      \
> +       __raw_readl(dd->io_base + offset);                              \
> +       pr_debug("omap_aes_read(" #offset ")\n");
> 
> Using do-while loop will break a lot of code such as:
> ret = omap_aes_read(..);

That's where you use a statement expression macro:

#define omap_aes_read(dd, offset)					\
({									\
	pr_debug("omap_aes_read(" #offset ")\n");			\
	__raw_readl((dd)->io_base + (offset));				\
})
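
For readers unfamiliar with the construct, here is a minimal userspace sketch
of a statement expression macro. It assumes GCC or Clang (statement
expressions are a GNU C extension), and fake_regs, fake_log_count, fake_read()
and fake_sum_first_two() are hypothetical names used only for illustration;
they are not part of the driver.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped register block. */
static uint32_t fake_regs[4] = { 0x10, 0x20, 0x30, 0x40 };

/* Counts how many reads were logged, so the side effect is observable. */
static int fake_log_count;

/*
 * GNU C statement expression: the value of the last expression inside
 * ({ ... }) becomes the value of the whole macro, so the macro can log
 * and still be used on the right-hand side of an assignment.
 */
#define fake_read(regs, index)						\
({									\
	uint32_t _val = (regs)[(index)];				\
	fprintf(stderr, "fake_read(" #index ") = %#x\n",		\
		(unsigned int)_val);					\
	fake_log_count++;						\
	_val;								\
})

/* Sums two registers, using the macro where an expression is required. */
static uint32_t fake_sum_first_two(void)
{
	uint32_t a = fake_read(fake_regs, 0);
	uint32_t b = fake_read(fake_regs, 1);

	return a + b;
}
```

A do { } while (0) wrapper could not be used here, because it is a statement
and yields no value; the statement expression form keeps callers such as
"ret = omap_aes_read(..);" working.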

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 02/14] crypto: omap-aes: Add useful debug macros
  2013-08-18  4:22     ` Joe Perches
@ 2013-08-18  5:56       ` Joel Fernandes
  -1 siblings, 0 replies; 44+ messages in thread
From: Joel Fernandes @ 2013-08-18  5:56 UTC (permalink / raw)
  To: Joe Perches
  Cc: Herbert Xu, David S. Miller, Mark Greer, Tony Lindgren,
	Santosh Shilimkar, Rajendra Nayak, Lokesh Vutla, Linux OMAP List,
	Linux ARM Kernel List, Linux Kernel Mailing List,
	Linux Crypto Mailing List

On 08/17/2013 11:22 PM, Joe Perches wrote:
> On Sat, 2013-08-17 at 21:42 -0500, Joel Fernandes wrote:
>> When DEBUG is enabled, these macros can be used to print variables in integer
>> and hex format, and clearly display which registers, offsets and values are
>> being read/written, including printing the names of the offsets and their values.
>>
>> Note:
>> This patch results in a checkpatch error that cannot be fixed.
>> ERROR: Macros with multiple statements should be enclosed in a do - while loop
>> +#define omap_aes_read(dd, offset)                                      \
>> +       __raw_readl(dd->io_base + offset);                              \
>> +       pr_debug("omap_aes_read(" #offset ")\n");
>>
>> Using do-while loop will break a lot of code such as:
>> ret = omap_aes_read(..);
> 
> That's where you use a statement expression macro
> 
> #define omap_aes_read(dd, offset)					\
> ({									\
> 	pr_debug("omap_aes_read(" #offset ")\n");			\
> 	__raw_readl((dd)->io_base + (offset));				\
> })
> 

That made things a lot simpler, thanks. Re-spinning just this patch as below:

--->8---

From: Joel Fernandes <joelf@ti.com>
Subject: [PATCH] crypto: omap-aes: Add useful debug macros

When DEBUG is enabled, these macros can be used to print variables in integer
and hex format, and to clearly display which registers, offsets and values are
being read/written, including the names of the offsets and their values.

Use a statement expression macro in the read path.

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Joel Fernandes <joelf@ti.com>
---
 drivers/crypto/omap-aes.c |   24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index ee15b0f..e26d4d4 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -13,7 +13,9 @@
  *
  */

-#define pr_fmt(fmt) "%s: " fmt, __func__
+#define pr_fmt(fmt) "%20s: " fmt, __func__
+#define prn(num) pr_debug(#num "=%d\n", num)
+#define prx(num) pr_debug(#num "=%x\n", num)

 #include <linux/err.h>
 #include <linux/module.h>
@@ -172,16 +174,36 @@ struct omap_aes_dev {
 static LIST_HEAD(dev_list);
 static DEFINE_SPINLOCK(list_lock);

+#ifdef DEBUG
+#define omap_aes_read(dd, offset)				\
+({								\
+	int _read_ret;						\
+	_read_ret = __raw_readl(dd->io_base + offset);		\
+	pr_debug("omap_aes_read(" #offset "=%#x)= %#x\n",	\
+		 offset, _read_ret);				\
+	_read_ret;						\
+})
+#else
 static inline u32 omap_aes_read(struct omap_aes_dev *dd, u32 offset)
 {
 	return __raw_readl(dd->io_base + offset);
 }
+#endif

+#ifdef DEBUG
+#define omap_aes_write(dd, offset, value)				\
+	do {								\
+		pr_debug("omap_aes_write(" #offset "=%#x) value=%#x\n",	\
+			 offset, value);				\
+		__raw_writel(value, dd->io_base + offset);		\
+	} while (0)
+#else
 static inline void omap_aes_write(struct omap_aes_dev *dd, u32 offset,
 				  u32 value)
 {
 	__raw_writel(value, dd->io_base + offset);
 }
+#endif

 static inline void omap_aes_write_mask(struct omap_aes_dev *dd, u32 offset,
 					u32 value, u32 mask)
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 04/14] crypto: omap-aes: Simplify DMA usage by using direct SGs
  2013-08-18  2:42   ` Joel Fernandes
@ 2013-08-20 12:57     ` Lokesh Vutla
  -1 siblings, 0 replies; 44+ messages in thread
From: Lokesh Vutla @ 2013-08-20 12:57 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Herbert Xu, David S. Miller, Mark Greer, Tony Lindgren,
	Santosh Shilimkar, Rajendra Nayak, Linux OMAP List,
	Linux ARM Kernel List, Linux Kernel Mailing List,
	Linux Crypto Mailing List

Hi Joel,

On Sunday 18 August 2013 08:12 AM, Joel Fernandes wrote:
> In an early version of this driver, assumptions were made such as the DMA layer
> requiring contiguous buffers. Due to this, new buffers were allocated,
> mapped and used for DMA. These assumptions are no longer true and DMAEngine
> scatter-gather DMA doesn't have such requirements. We simplify the DMA operations
> by directly using the scatter-gather buffers provided by the crypto layer
> instead of creating our own.
> 
> A lot of logic that handled DMA'ing only X number of bytes of the total, or as
> much as fitted into a 3rd party buffer, is removed and is no longer required.
> 
> Also, a good performance improvement of at least ~20% is seen when encrypting a
> buffer of size 8K (1800 ops/sec vs 1400 ops/sec). The improvement will be higher
> for much larger blocks, though such benchmarking is left as an exercise for the
> reader. DMA usage is also much simpler and more coherent with the rest of the
> code.
> 
> Signed-off-by: Joel Fernandes <joelf@ti.com>
> ---
>  drivers/crypto/omap-aes.c |  147 ++++++++-------------------------------------
>  1 file changed, 25 insertions(+), 122 deletions(-)
> 
> diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
> index e369e6e..64dd5c1 100644
> --- a/drivers/crypto/omap-aes.c
> +++ b/drivers/crypto/omap-aes.c
> @@ -480,22 +480,14 @@ static int sg_copy(struct scatterlist **sg, size_t *offset, void *buf,
>  }
>  
>  static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
> -		struct scatterlist *in_sg, struct scatterlist *out_sg)
> +		struct scatterlist *in_sg, struct scatterlist *out_sg,
> +		int in_sg_len, int out_sg_len)
>  {
>  	struct omap_aes_ctx *ctx = crypto_tfm_ctx(tfm);
>  	struct omap_aes_dev *dd = ctx->dd;
>  	struct dma_async_tx_descriptor *tx_in, *tx_out;
>  	struct dma_slave_config cfg;
> -	dma_addr_t dma_addr_in = sg_dma_address(in_sg);
> -	int ret, length = sg_dma_len(in_sg);
> -
> -	pr_debug("len: %d\n", length);
> -
> -	dd->dma_size = length;
> -
> -	if (!(dd->flags & FLAGS_FAST))
> -		dma_sync_single_for_device(dd->dev, dma_addr_in, length,
> -					   DMA_TO_DEVICE);
> +	int ret;
With this change FLAGS_FAST is unused, so it can be cleaned up, right?
Or am I missing something?

Thanks and regards,
Lokesh
>  
>  	memset(&cfg, 0, sizeof(cfg));
>  
> @@ -514,7 +506,7 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
>  		return ret;
>  	}
>  
> -	tx_in = dmaengine_prep_slave_sg(dd->dma_lch_in, in_sg, 1,
> +	tx_in = dmaengine_prep_slave_sg(dd->dma_lch_in, in_sg, in_sg_len,
>  					DMA_MEM_TO_DEV,
>  					DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
>  	if (!tx_in) {
> @@ -533,7 +525,7 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
>  		return ret;
>  	}
>  
> -	tx_out = dmaengine_prep_slave_sg(dd->dma_lch_out, out_sg, 1,
> +	tx_out = dmaengine_prep_slave_sg(dd->dma_lch_out, out_sg, out_sg_len,
>  					DMA_DEV_TO_MEM,
>  					DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
>  	if (!tx_out) {
> @@ -551,7 +543,7 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
>  	dma_async_issue_pending(dd->dma_lch_out);
>  
>  	/* start DMA */
> -	dd->pdata->trigger(dd, length);
> +	dd->pdata->trigger(dd, dd->total);
>  
>  	return 0;
>  }
> @@ -560,93 +552,28 @@ static int omap_aes_crypt_dma_start(struct omap_aes_dev *dd)
>  {
>  	struct crypto_tfm *tfm = crypto_ablkcipher_tfm(
>  					crypto_ablkcipher_reqtfm(dd->req));
> -	int err, fast = 0, in, out;
> -	size_t count;
> -	dma_addr_t addr_in, addr_out;
> -	struct scatterlist *in_sg, *out_sg;
> -	int len32;
> +	int err;
>  
>  	pr_debug("total: %d\n", dd->total);
>  
> -	if (sg_is_last(dd->in_sg) && sg_is_last(dd->out_sg)) {
> -		/* check for alignment */
> -		in = IS_ALIGNED((u32)dd->in_sg->offset, sizeof(u32));
> -		out = IS_ALIGNED((u32)dd->out_sg->offset, sizeof(u32));
> -
> -		fast = in && out;
> +	err = dma_map_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE);
> +	if (!err) {
> +		dev_err(dd->dev, "dma_map_sg() error\n");
> +		return -EINVAL;
>  	}
>  
> -	if (fast)  {
> -		count = min(dd->total, sg_dma_len(dd->in_sg));
> -		count = min(count, sg_dma_len(dd->out_sg));
> -
> -		if (count != dd->total) {
> -			pr_err("request length != buffer length\n");
> -			return -EINVAL;
> -		}
> -
> -		pr_debug("fast\n");
> -
> -		err = dma_map_sg(dd->dev, dd->in_sg, 1, DMA_TO_DEVICE);
> -		if (!err) {
> -			dev_err(dd->dev, "dma_map_sg() error\n");
> -			return -EINVAL;
> -		}
> -
> -		err = dma_map_sg(dd->dev, dd->out_sg, 1, DMA_FROM_DEVICE);
> -		if (!err) {
> -			dev_err(dd->dev, "dma_map_sg() error\n");
> -			dma_unmap_sg(dd->dev, dd->in_sg, 1, DMA_TO_DEVICE);
> -			return -EINVAL;
> -		}
> -
> -		addr_in = sg_dma_address(dd->in_sg);
> -		addr_out = sg_dma_address(dd->out_sg);
> -
> -		in_sg = dd->in_sg;
> -		out_sg = dd->out_sg;
> -
> -		dd->flags |= FLAGS_FAST;
> -
> -	} else {
> -		/* use cache buffers */
> -		count = sg_copy(&dd->in_sg, &dd->in_offset, dd->buf_in,
> -				 dd->buflen, dd->total, 0);
> -
> -		len32 = DIV_ROUND_UP(count, DMA_MIN) * DMA_MIN;
> -
> -		/*
> -		 * The data going into the AES module has been copied
> -		 * to a local buffer and the data coming out will go
> -		 * into a local buffer so set up local SG entries for
> -		 * both.
> -		 */
> -		sg_init_table(&dd->in_sgl, 1);
> -		dd->in_sgl.offset = dd->in_offset;
> -		sg_dma_len(&dd->in_sgl) = len32;
> -		sg_dma_address(&dd->in_sgl) = dd->dma_addr_in;
> -
> -		sg_init_table(&dd->out_sgl, 1);
> -		dd->out_sgl.offset = dd->out_offset;
> -		sg_dma_len(&dd->out_sgl) = len32;
> -		sg_dma_address(&dd->out_sgl) = dd->dma_addr_out;
> -
> -		in_sg = &dd->in_sgl;
> -		out_sg = &dd->out_sgl;
> -
> -		addr_in = dd->dma_addr_in;
> -		addr_out = dd->dma_addr_out;
> -
> -		dd->flags &= ~FLAGS_FAST;
> -
> +	err = dma_map_sg(dd->dev, dd->out_sg, dd->out_sg_len, DMA_FROM_DEVICE);
> +	if (!err) {
> +		dev_err(dd->dev, "dma_map_sg() error\n");
> +		return -EINVAL;
>  	}
>  
> -	dd->total -= count;
> -
> -	err = omap_aes_crypt_dma(tfm, in_sg, out_sg);
> +	err = omap_aes_crypt_dma(tfm, dd->in_sg, dd->out_sg, dd->in_sg_len,
> +				 dd->out_sg_len);
>  	if (err) {
> -		dma_unmap_sg(dd->dev, dd->in_sg, 1, DMA_TO_DEVICE);
> -		dma_unmap_sg(dd->dev, dd->out_sg, 1, DMA_TO_DEVICE);
> +		dma_unmap_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE);
> +		dma_unmap_sg(dd->dev, dd->out_sg, dd->out_sg_len,
> +			     DMA_FROM_DEVICE);
>  	}
>  
>  	return err;
> @@ -667,7 +594,6 @@ static void omap_aes_finish_req(struct omap_aes_dev *dd, int err)
>  static int omap_aes_crypt_dma_stop(struct omap_aes_dev *dd)
>  {
>  	int err = 0;
> -	size_t count;
>  
>  	pr_debug("total: %d\n", dd->total);
>  
> @@ -676,21 +602,8 @@ static int omap_aes_crypt_dma_stop(struct omap_aes_dev *dd)
>  	dmaengine_terminate_all(dd->dma_lch_in);
>  	dmaengine_terminate_all(dd->dma_lch_out);
>  
> -	if (dd->flags & FLAGS_FAST) {
> -		dma_unmap_sg(dd->dev, dd->out_sg, 1, DMA_FROM_DEVICE);
> -		dma_unmap_sg(dd->dev, dd->in_sg, 1, DMA_TO_DEVICE);
> -	} else {
> -		dma_sync_single_for_device(dd->dev, dd->dma_addr_out,
> -					   dd->dma_size, DMA_FROM_DEVICE);
> -
> -		/* copy data */
> -		count = sg_copy(&dd->out_sg, &dd->out_offset, dd->buf_out,
> -				 dd->buflen, dd->dma_size, 1);
> -		if (count != dd->dma_size) {
> -			err = -EINVAL;
> -			pr_err("not all data converted: %u\n", count);
> -		}
> -	}
> +	dma_unmap_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE);
> +	dma_unmap_sg(dd->dev, dd->out_sg, dd->out_sg_len, DMA_FROM_DEVICE);
>  
>  	return err;
>  }
> @@ -760,21 +673,11 @@ static int omap_aes_handle_queue(struct omap_aes_dev *dd,
>  static void omap_aes_done_task(unsigned long data)
>  {
>  	struct omap_aes_dev *dd = (struct omap_aes_dev *)data;
> -	int err;
> -
> -	pr_debug("enter\n");
>  
> -	err = omap_aes_crypt_dma_stop(dd);
> -
> -	err = dd->err ? : err;
> -
> -	if (dd->total && !err) {
> -		err = omap_aes_crypt_dma_start(dd);
> -		if (!err)
> -			return; /* DMA started. Not fininishing. */
> -	}
> +	pr_debug("enter done_task\n");
>  
> -	omap_aes_finish_req(dd, err);
> +	omap_aes_crypt_dma_stop(dd);
> +	omap_aes_finish_req(dd, 0);
>  	omap_aes_handle_queue(dd, NULL);
>  
>  	pr_debug("exit\n");
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 04/14] crypto: omap-aes: Simplify DMA usage by using direct SGs
  2013-08-20 12:57     ` Lokesh Vutla
@ 2013-08-21  0:54       ` Joel Fernandes
  -1 siblings, 0 replies; 44+ messages in thread
From: Joel Fernandes @ 2013-08-21  0:54 UTC (permalink / raw)
  To: Lokesh Vutla
  Cc: Herbert Xu, David S. Miller, Mark Greer, Tony Lindgren,
	Santosh Shilimkar, Rajendra Nayak, Linux OMAP List,
	Linux ARM Kernel List, Linux Kernel Mailing List,
	Linux Crypto Mailing List

On 08/20/2013 07:57 AM, Lokesh Vutla wrote:
> Hi Joel,
> 
> On Sunday 18 August 2013 08:12 AM, Joel Fernandes wrote:
>> In early versions of this driver, assumptions were made, such as the DMA layer
>> requiring contiguous buffers. Due to this, new buffers were allocated,
>> mapped and used for DMA. These assumptions are no longer true, and DMAEngine
>> scatter-gather DMA doesn't have such requirements. We simplify the DMA operations
>> by directly using the scatter-gather buffers provided by the crypto layer
>> instead of creating our own.
>>
>> A lot of logic that handled DMA'ing only X number of bytes of the total, or as
>> much as fitted into a 3rd party buffer, is removed and is no longer required.
>>
>> Also, a good performance improvement of at least ~20% is seen when encrypting a
>> buffer size of 8K (1800 ops/sec vs 1400 ops/sec).  Improvement will be higher
>> for much larger blocks, though such benchmarking is left as an exercise for the
>> reader.  DMA usage is also much simpler and more coherent with the rest of the
>> code.
>>
>> Signed-off-by: Joel Fernandes <joelf@ti.com>
>> ---
>>  drivers/crypto/omap-aes.c |  147 ++++++++-------------------------------------
>>  1 file changed, 25 insertions(+), 122 deletions(-)
>>
>> diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
>> index e369e6e..64dd5c1 100644
>> --- a/drivers/crypto/omap-aes.c
>> +++ b/drivers/crypto/omap-aes.c
>> @@ -480,22 +480,14 @@ static int sg_copy(struct scatterlist **sg, size_t *offset, void *buf,
>>  }
>>  
>>  static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
>> -		struct scatterlist *in_sg, struct scatterlist *out_sg)
>> +		struct scatterlist *in_sg, struct scatterlist *out_sg,
>> +		int in_sg_len, int out_sg_len)
>>  {
>>  	struct omap_aes_ctx *ctx = crypto_tfm_ctx(tfm);
>>  	struct omap_aes_dev *dd = ctx->dd;
>>  	struct dma_async_tx_descriptor *tx_in, *tx_out;
>>  	struct dma_slave_config cfg;
>> -	dma_addr_t dma_addr_in = sg_dma_address(in_sg);
>> -	int ret, length = sg_dma_len(in_sg);
>> -
>> -	pr_debug("len: %d\n", length);
>> -
>> -	dd->dma_size = length;
>> -
>> -	if (!(dd->flags & FLAGS_FAST))
>> -		dma_sync_single_for_device(dd->dev, dma_addr_in, length,
>> -					   DMA_TO_DEVICE);
>> +	int ret;
> With this change FLAGS_FAST is unused, so it can be cleaned up, right?
> Or am I missing something?

Yes, FLAGS_FAST would be unused now and can go away. Since it is a very trivial
change, I will make it in the not-immediate future and submit it then.

Thanks,

-Joel


* Re: [PATCH v2 00/14] crypto: omap-aes: Improve DMA, add PIO mode and support for AM437x
  2013-08-18  2:42 ` Joel Fernandes
  (?)
@ 2013-08-21 11:50   ` Herbert Xu
  -1 siblings, 0 replies; 44+ messages in thread
From: Herbert Xu @ 2013-08-21 11:50 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: David S. Miller, Mark Greer, Tony Lindgren, Santosh Shilimkar,
	Rajendra Nayak, Lokesh Vutla, Linux OMAP List,
	Linux ARM Kernel List, Linux Kernel Mailing List,
	Linux Crypto Mailing List

On Sat, Aug 17, 2013 at 09:42:21PM -0500, Joel Fernandes wrote:
> The following patch series rewrites the DMA code to be cleaner and faster. Earlier,
> only a single SG was used for DMA, and the SG-list passed from the
> crypto layer was copied and DMA'd one entry at a time. This turns out to
> be quite inefficient and a lot of code, so we replace it with a much simpler approach
> that directly passes the SG-list from crypto to the DMA layers where
> possible. For all cases where such direct passing of the SG-list is not possible,
> we create a new SG-list and do the copying. This is still better than before, as
> we create an SG-list as big as needed and not just a 1-element list.
> 
> We also add PIO mode support to the driver, and switch to it whenever a DMA
> channel is not available. This has also been shown to give good performance
> for small blocks as shown below.
> 
> Tests have been performed on AM335x, OMAP4 and AM437x SoCs.
> 
> Below is a sample run on AM335x SoC (beaglebone board), showing
> performance improvement (20% for 8K blocks):

All applied.  Thanks!
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


end of thread, other threads:[~2013-08-21 11:51 UTC | newest]

Thread overview: 44+ messages
2013-08-18  2:42 [PATCH v2 00/14] crypto: omap-aes: Improve DMA, add PIO mode and support for AM437x Joel Fernandes
2013-08-18  2:42 ` [PATCH v2 01/14] crypto: scatterwalk: Add support for calculating number of SG elements Joel Fernandes
2013-08-18  2:42 ` [PATCH v2 02/14] crypto: omap-aes: Add useful debug macros Joel Fernandes
2013-08-18  4:22   ` Joe Perches
2013-08-18  5:56     ` Joel Fernandes
2013-08-18  2:42 ` [PATCH v2 03/14] crypto: omap-aes: Populate number of SG elements Joel Fernandes
2013-08-18  2:42 ` [PATCH v2 04/14] crypto: omap-aes: Simplify DMA usage by using direct SGs Joel Fernandes
2013-08-20 12:57   ` Lokesh Vutla
2013-08-21  0:54     ` Joel Fernandes
2013-08-18  2:42 ` [PATCH v2 05/14] crypto: omap-aes: Sync SG before DMA operation Joel Fernandes
2013-08-18  2:42 ` [PATCH v2 06/14] crypto: omap-aes: Remove previously used intermediate buffers Joel Fernandes
2013-08-18  2:42 ` [PATCH v2 07/14] crypto: omap-aes: Add IRQ info and helper macros Joel Fernandes
2013-08-18  2:42 ` [PATCH v2 08/14] crypto: omap-aes: PIO mode: Add IRQ handler and walk SGs Joel Fernandes
2013-08-18  2:42 ` [PATCH v2 09/14] crypto: omap-aes: PIO mode: platform data for OMAP4/AM437x and trigger Joel Fernandes
2013-08-18  2:42 ` [PATCH v2 10/14] crypto: omap-aes: Switch to PIO mode during probe Joel Fernandes
2013-08-18  2:42 ` [PATCH v2 11/14] crypto: omap-aes: Add support for cases of unaligned lengths Joel Fernandes
2013-08-18  2:42 ` [PATCH v2 12/14] crypto: omap-aes: Convert kzalloc to devm_kzalloc Joel Fernandes
2013-08-18  2:42 ` [PATCH v2 13/14] crypto: omap-aes: Convert request_irq to devm_request_irq Joel Fernandes
2013-08-18  2:42 ` [PATCH v2 14/14] crypto: omap-aes: Kconfig: Add build support for AM437x Joel Fernandes
2013-08-21 11:50 ` [PATCH v2 00/14] crypto: omap-aes: Improve DMA, add PIO mode and " Herbert Xu
