linux-kernel.vger.kernel.org archive mirror
* [PATCH 00/12] dmaengine_unmap_data
From: Dan Williams @ 2012-12-06  9:25 UTC
  To: linux-kernel; +Cc: vinod.koul, linux, dave.jiang

dmaengine has, from the beginning, placed the burden of unmapping dma
buffers on the individual drivers, the thought being that since the dma
driver already has the descriptor it can use that information for
unmapping.  This results in a lot of cruft to read back data from
descriptors, places a burden on channels that need to break an
operation up internally into multiple descriptors, and makes it
difficult to have dma mappings with lifetimes different from that of
the current operation.

For example, an xor->copy->xor chain wants to leave all buffers mapped
until completion, but async_tx currently performs invalid overlapping
mappings instead.  With dmaengine_unmap_data the buffers are mapped
once and each descriptor that uses the mapping takes a reference.
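
As a rough sketch of the end state after the series (not verbatim from
any one patch; 'chan', 'src', 'dest' and 'len' assumed set up by the
caller, error handling omitted), the intended usage is:

	struct dma_device *dma = chan->device;
	struct dmaengine_unmap_data *unmap;
	struct dma_async_tx_descriptor *tx;

	/* map once... */
	unmap = dmaengine_get_unmap_data(dma->dev, 2, GFP_NOIO);
	unmap->to_cnt = 1;
	unmap->from_cnt = 1;
	unmap->len = len;
	unmap->addr[0] = dma_map_page(dma->dev, src, 0, len, DMA_TO_DEVICE);
	unmap->addr[1] = dma_map_page(dma->dev, dest, 0, len, DMA_FROM_DEVICE);

	/* ...each descriptor that uses the mapping takes a reference */
	tx = dma->device_prep_dma_memcpy(chan, unmap->addr[1], unmap->addr[0],
					 len, DMA_CTRL_ACK);
	dma_set_unmap(tx, unmap);
	tx->tx_submit(tx);

	/* drop the submitter's reference; the buffers are unmapped when
	 * the last descriptor referencing them completes
	 */
	dmaengine_unmap_put(unmap);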

Thanks to Bart for getting this cleanup started!

I'll also push this out to the 'unmap' branch:
git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine unmap

--
Dan


---

Bartlomiej Zolnierkiewicz (2):
      dmaengine: remove DMA unmap from drivers
      dmaengine: remove DMA unmap flags

Dan Williams (10):
      dmaengine: consolidate memcpy apis
      dmaengine: prepare for generic 'unmap' data
      dmaengine: reference counted unmap data
      async_memcpy: convert to dmaengine_unmap_data
      async_memset: convert to dmaengine_unmap_data
      async_xor: convert to dmaengine_unmap_data
      async_xor_val: convert to dmaengine_unmap_data
      async_raid6_recov: convert to dmaengine_unmap_data
      async_pq: convert to dmaengine_unmap_data
      async_pq_val: convert to dmaengine_unmap_data


 arch/arm/include/asm/hardware/iop3xx-adma.h |   30 ---
 arch/arm/include/asm/hardware/iop_adma.h    |    4 
 arch/arm/mach-iop13xx/include/mach/adma.h   |   26 ---
 crypto/async_tx/async_memcpy.c              |   36 ++--
 crypto/async_tx/async_memset.c              |   15 +-
 crypto/async_tx/async_pq.c                  |  174 ++++++++++-------
 crypto/async_tx/async_raid6_recov.c         |   61 ++++--
 crypto/async_tx/async_xor.c                 |  122 +++++++-----
 drivers/ata/pata_arasan_cf.c                |    3 
 drivers/dma/amba-pl08x.c                    |   32 ---
 drivers/dma/at_hdmac.c                      |   26 ---
 drivers/dma/dmaengine.c                     |  261 ++++++++++++++++++--------
 drivers/dma/dmatest.c                       |    3 
 drivers/dma/dw_dmac.c                       |   21 --
 drivers/dma/ep93xx_dma.c                    |   30 ---
 drivers/dma/fsldma.c                        |   17 --
 drivers/dma/ioat/dma.c                      |   20 --
 drivers/dma/ioat/dma.h                      |   12 -
 drivers/dma/ioat/dma_v2.c                   |    2 
 drivers/dma/ioat/dma_v3.c                   |  143 +-------------
 drivers/dma/iop-adma.c                      |   99 ----------
 drivers/dma/mv_xor.c                        |   46 -----
 drivers/dma/ppc4xx/adma.c                   |  270 ---------------------------
 drivers/dma/timb_dma.c                      |   37 ----
 drivers/dma/txx9dmac.c                      |   25 ---
 drivers/media/platform/m2m-deinterlace.c    |    3 
 drivers/media/platform/timblogiw.c          |    2 
 drivers/misc/carma/carma-fpga.c             |    3 
 drivers/mtd/nand/atmel_nand.c               |    3 
 drivers/mtd/nand/fsmc_nand.c                |    2 
 drivers/net/ethernet/micrel/ks8842.c        |    6 -
 drivers/spi/spi-dw-mid.c                    |    4 
 include/linux/dmaengine.h                   |   49 ++++-
 33 files changed, 481 insertions(+), 1106 deletions(-)


* [PATCH 01/12] dmaengine: consolidate memcpy apis
From: Dan Williams @ 2012-12-06  9:25 UTC
  To: linux-kernel; +Cc: vinod.koul, linux, dave.jiang

Copying from page to page (dma_async_memcpy_pg_to_pg) is the superset;
make the other two apis use that one in preparation for providing a
common dma unmap implementation.  The common implementation just wants
to assume all buffers are mapped with dma_map_page().
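
Callers are unaffected by the consolidation; a
dma_async_memcpy_buf_to_buf() user still looks roughly like this
(minimal sketch, 'chan' and the buffers assumed set up elsewhere):

	dma_cookie_t cookie;

	cookie = dma_async_memcpy_buf_to_buf(chan, dest, src, len);
	if (cookie < 0)
		memcpy(dest, src, len);	/* no descriptor, fall back to cpu */
	else
		dma_async_issue_pending(chan);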

Signed-off-by: Dan Williams <djbw@fb.com>
---
 drivers/dma/dmaengine.c |  137 +++++++++++++++--------------------------------
 1 file changed, 45 insertions(+), 92 deletions(-)

diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index 07109d0..f3cadc6 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -857,20 +857,23 @@ void dma_async_device_unregister(struct dma_device *device)
 EXPORT_SYMBOL(dma_async_device_unregister);
 
 /**
- * dma_async_memcpy_buf_to_buf - offloaded copy between virtual addresses
+ * dma_async_memcpy_pg_to_pg - offloaded copy from page to page
  * @chan: DMA channel to offload copy to
- * @dest: destination address (virtual)
- * @src: source address (virtual)
+ * @dest_pg: destination page
+ * @dest_off: offset in page to copy to
+ * @src_pg: source page
+ * @src_off: offset in page to copy from
  * @len: length
  *
- * Both @dest and @src must be mappable to a bus address according to the
- * DMA mapping API rules for streaming mappings.
- * Both @dest and @src must stay memory resident (kernel memory or locked
- * user space pages).
+ * Both @dest_page/@dest_off and @src_page/@src_off must be mappable to a bus
+ * address according to the DMA mapping API rules for streaming mappings.
+ * Both @dest_page/@dest_off and @src_page/@src_off must stay memory resident
+ * (kernel memory or locked user space pages).
  */
 dma_cookie_t
-dma_async_memcpy_buf_to_buf(struct dma_chan *chan, void *dest,
-			void *src, size_t len)
+dma_async_memcpy_pg_to_pg(struct dma_chan *chan, struct page *dest_pg,
+	unsigned int dest_off, struct page *src_pg, unsigned int src_off,
+	size_t len)
 {
 	struct dma_device *dev = chan->device;
 	struct dma_async_tx_descriptor *tx;
@@ -878,16 +881,15 @@ dma_async_memcpy_buf_to_buf(struct dma_chan *chan, void *dest,
 	dma_cookie_t cookie;
 	unsigned long flags;
 
-	dma_src = dma_map_single(dev->dev, src, len, DMA_TO_DEVICE);
-	dma_dest = dma_map_single(dev->dev, dest, len, DMA_FROM_DEVICE);
-	flags = DMA_CTRL_ACK |
-		DMA_COMPL_SRC_UNMAP_SINGLE |
-		DMA_COMPL_DEST_UNMAP_SINGLE;
+	dma_src = dma_map_page(dev->dev, src_pg, src_off, len, DMA_TO_DEVICE);
+	dma_dest = dma_map_page(dev->dev, dest_pg, dest_off, len,
+				DMA_FROM_DEVICE);
+	flags = DMA_CTRL_ACK;
 	tx = dev->device_prep_dma_memcpy(chan, dma_dest, dma_src, len, flags);
 
 	if (!tx) {
-		dma_unmap_single(dev->dev, dma_src, len, DMA_TO_DEVICE);
-		dma_unmap_single(dev->dev, dma_dest, len, DMA_FROM_DEVICE);
+		dma_unmap_page(dev->dev, dma_src, len, DMA_TO_DEVICE);
+		dma_unmap_page(dev->dev, dma_dest, len, DMA_FROM_DEVICE);
 		return -ENOMEM;
 	}
 
@@ -901,6 +903,29 @@ dma_async_memcpy_buf_to_buf(struct dma_chan *chan, void *dest,
 
 	return cookie;
 }
+EXPORT_SYMBOL(dma_async_memcpy_pg_to_pg);
+
+/**
+ * dma_async_memcpy_buf_to_buf - offloaded copy between virtual addresses
+ * @chan: DMA channel to offload copy to
+ * @dest: destination address (virtual)
+ * @src: source address (virtual)
+ * @len: length
+ *
+ * Both @dest and @src must be mappable to a bus address according to the
+ * DMA mapping API rules for streaming mappings.
+ * Both @dest and @src must stay memory resident (kernel memory or locked
+ * user space pages).
+ */
+dma_cookie_t
+dma_async_memcpy_buf_to_buf(struct dma_chan *chan, void *dest,
+			    void *src, size_t len)
+{
+	return dma_async_memcpy_pg_to_pg(chan, virt_to_page(dest),
+					 (unsigned long) dest & ~PAGE_MASK,
+					 virt_to_page(src),
+					 (unsigned long) src & ~PAGE_MASK, len);
+}
 EXPORT_SYMBOL(dma_async_memcpy_buf_to_buf);
 
 /**
@@ -918,86 +943,14 @@ EXPORT_SYMBOL(dma_async_memcpy_buf_to_buf);
  */
 dma_cookie_t
 dma_async_memcpy_buf_to_pg(struct dma_chan *chan, struct page *page,
-			unsigned int offset, void *kdata, size_t len)
+			   unsigned int offset, void *kdata, size_t len)
 {
-	struct dma_device *dev = chan->device;
-	struct dma_async_tx_descriptor *tx;
-	dma_addr_t dma_dest, dma_src;
-	dma_cookie_t cookie;
-	unsigned long flags;
-
-	dma_src = dma_map_single(dev->dev, kdata, len, DMA_TO_DEVICE);
-	dma_dest = dma_map_page(dev->dev, page, offset, len, DMA_FROM_DEVICE);
-	flags = DMA_CTRL_ACK | DMA_COMPL_SRC_UNMAP_SINGLE;
-	tx = dev->device_prep_dma_memcpy(chan, dma_dest, dma_src, len, flags);
-
-	if (!tx) {
-		dma_unmap_single(dev->dev, dma_src, len, DMA_TO_DEVICE);
-		dma_unmap_page(dev->dev, dma_dest, len, DMA_FROM_DEVICE);
-		return -ENOMEM;
-	}
-
-	tx->callback = NULL;
-	cookie = tx->tx_submit(tx);
-
-	preempt_disable();
-	__this_cpu_add(chan->local->bytes_transferred, len);
-	__this_cpu_inc(chan->local->memcpy_count);
-	preempt_enable();
-
-	return cookie;
+	return dma_async_memcpy_pg_to_pg(chan, page, offset,
+					 virt_to_page(kdata),
+					 (unsigned long) kdata & ~PAGE_MASK, len);
 }
 EXPORT_SYMBOL(dma_async_memcpy_buf_to_pg);
 
-/**
- * dma_async_memcpy_pg_to_pg - offloaded copy from page to page
- * @chan: DMA channel to offload copy to
- * @dest_pg: destination page
- * @dest_off: offset in page to copy to
- * @src_pg: source page
- * @src_off: offset in page to copy from
- * @len: length
- *
- * Both @dest_page/@dest_off and @src_page/@src_off must be mappable to a bus
- * address according to the DMA mapping API rules for streaming mappings.
- * Both @dest_page/@dest_off and @src_page/@src_off must stay memory resident
- * (kernel memory or locked user space pages).
- */
-dma_cookie_t
-dma_async_memcpy_pg_to_pg(struct dma_chan *chan, struct page *dest_pg,
-	unsigned int dest_off, struct page *src_pg, unsigned int src_off,
-	size_t len)
-{
-	struct dma_device *dev = chan->device;
-	struct dma_async_tx_descriptor *tx;
-	dma_addr_t dma_dest, dma_src;
-	dma_cookie_t cookie;
-	unsigned long flags;
-
-	dma_src = dma_map_page(dev->dev, src_pg, src_off, len, DMA_TO_DEVICE);
-	dma_dest = dma_map_page(dev->dev, dest_pg, dest_off, len,
-				DMA_FROM_DEVICE);
-	flags = DMA_CTRL_ACK;
-	tx = dev->device_prep_dma_memcpy(chan, dma_dest, dma_src, len, flags);
-
-	if (!tx) {
-		dma_unmap_page(dev->dev, dma_src, len, DMA_TO_DEVICE);
-		dma_unmap_page(dev->dev, dma_dest, len, DMA_FROM_DEVICE);
-		return -ENOMEM;
-	}
-
-	tx->callback = NULL;
-	cookie = tx->tx_submit(tx);
-
-	preempt_disable();
-	__this_cpu_add(chan->local->bytes_transferred, len);
-	__this_cpu_inc(chan->local->memcpy_count);
-	preempt_enable();
-
-	return cookie;
-}
-EXPORT_SYMBOL(dma_async_memcpy_pg_to_pg);
-
 void dma_async_tx_descriptor_init(struct dma_async_tx_descriptor *tx,
 	struct dma_chan *chan)
 {



* [PATCH 02/12] dmaengine: prepare for generic 'unmap' data
From: Dan Williams @ 2012-12-06  9:25 UTC
  To: linux-kernel; +Cc: vinod.koul, linux, dave.jiang, Bartlomiej Zolnierkiewicz

Add a hook for a common dma unmap implementation to enable removal of
the per-driver custom unmap code.  (This is a reworked version of
Bartlomiej Zolnierkiewicz's patches that drops the custom callbacks and
avoids the size increase of dma_async_tx_descriptor for drivers that
don't care about raid.)
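
The hook is meant to be called once per completed descriptor from each
driver's cleanup path, roughly like this ('foo' is a hypothetical
driver used for illustration; at this point in the series
dma_descriptor_unmap() only clears tx->unmap, the actual unmapping
arrives with the next patch):

	static void foo_dma_cleanup(struct foo_desc *desc)
	{
		struct dma_async_tx_descriptor *txd = &desc->txd;

		dma_cookie_complete(txd);
		dma_descriptor_unmap(txd);	/* the new generic hook */
		if (txd->callback)
			txd->callback(txd->callback_param);
	}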

Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Dan Williams <djbw@fb.com>
---
 drivers/dma/amba-pl08x.c  |    1 +
 drivers/dma/at_hdmac.c    |    1 +
 drivers/dma/dw_dmac.c     |    1 +
 drivers/dma/ep93xx_dma.c  |    1 +
 drivers/dma/fsldma.c      |    1 +
 drivers/dma/ioat/dma.c    |    1 +
 drivers/dma/ioat/dma_v2.c |    1 +
 drivers/dma/ioat/dma_v3.c |    1 +
 drivers/dma/iop-adma.c    |    1 +
 drivers/dma/mv_xor.c      |    1 +
 drivers/dma/ppc4xx/adma.c |    1 +
 drivers/dma/timb_dma.c    |    1 +
 drivers/dma/txx9dmac.c    |    1 +
 include/linux/dmaengine.h |   26 ++++++++++++++++++++++++++
 14 files changed, 39 insertions(+)

diff --git a/drivers/dma/amba-pl08x.c b/drivers/dma/amba-pl08x.c
index d1cc579..4cb2f23 100644
--- a/drivers/dma/amba-pl08x.c
+++ b/drivers/dma/amba-pl08x.c
@@ -1083,6 +1083,7 @@ static void pl08x_desc_free(struct virt_dma_desc *vd)
 	struct pl08x_txd *txd = to_pl08x_txd(&vd->tx);
 	struct pl08x_dma_chan *plchan = to_pl08x_chan(vd->tx.chan);
 
+	dma_descriptor_unmap(txd);
 	if (!plchan->slave)
 		pl08x_unmap_buffers(txd);
 
diff --git a/drivers/dma/at_hdmac.c b/drivers/dma/at_hdmac.c
index 13a02f4..280ce87 100644
--- a/drivers/dma/at_hdmac.c
+++ b/drivers/dma/at_hdmac.c
@@ -253,6 +253,7 @@ atc_chain_complete(struct at_dma_chan *atchan, struct at_desc *desc)
 	list_move(&desc->desc_node, &atchan->free_list);
 
 	/* unmap dma addresses (not on slave channels) */
+	dma_descriptor_unmap(txd);
 	if (!atchan->chan_common.private) {
 		struct device *parent = chan2parent(&atchan->chan_common);
 		if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
diff --git a/drivers/dma/dw_dmac.c b/drivers/dma/dw_dmac.c
index c4b0eb3..5d0b58c 100644
--- a/drivers/dma/dw_dmac.c
+++ b/drivers/dma/dw_dmac.c
@@ -326,6 +326,7 @@ dwc_descriptor_complete(struct dw_dma_chan *dwc, struct dw_desc *desc,
 	list_splice_init(&desc->tx_list, &dwc->free_list);
 	list_move(&desc->desc_node, &dwc->free_list);
 
+	dma_descriptor_unmap(txd);
 	if (!dwc->chan.private) {
 		struct device *parent = chan2parent(&dwc->chan);
 		if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
diff --git a/drivers/dma/ep93xx_dma.c b/drivers/dma/ep93xx_dma.c
index bcfde40..5d08aeb 100644
--- a/drivers/dma/ep93xx_dma.c
+++ b/drivers/dma/ep93xx_dma.c
@@ -791,6 +791,7 @@ static void ep93xx_dma_tasklet(unsigned long data)
 		 * For the memcpy channels the API requires us to unmap the
 		 * buffers unless requested otherwise.
 		 */
+		dma_descriptor_unmap(&desc->txd);
 		if (!edmac->chan.private)
 			ep93xx_dma_unmap_buffers(desc);
 
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index 094437b..7e4e44c 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -868,6 +868,7 @@ static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
 	/* Run any dependencies */
 	dma_run_dependencies(txd);
 
+	dma_descriptor_unmap(txd);
 	/* Unmap the dst buffer, if requested */
 	if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
 		if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
diff --git a/drivers/dma/ioat/dma.c b/drivers/dma/ioat/dma.c
index 464138a..38918cf 100644
--- a/drivers/dma/ioat/dma.c
+++ b/drivers/dma/ioat/dma.c
@@ -602,6 +602,7 @@ static void __cleanup(struct ioat_dma_chan *ioat, dma_addr_t phys_complete)
 		dump_desc_dbg(ioat, desc);
 		if (tx->cookie) {
 			dma_cookie_complete(tx);
+			dma_descriptor_unmap(tx);
 			ioat_dma_unmap(chan, tx->flags, desc->len, desc->hw);
 			ioat->active -= desc->hw->tx_cnt;
 			if (tx->callback) {
diff --git a/drivers/dma/ioat/dma_v2.c b/drivers/dma/ioat/dma_v2.c
index b9d6678..2714b0e 100644
--- a/drivers/dma/ioat/dma_v2.c
+++ b/drivers/dma/ioat/dma_v2.c
@@ -148,6 +148,7 @@ static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete)
 		tx = &desc->txd;
 		dump_desc_dbg(ioat, desc);
 		if (tx->cookie) {
+			dma_descriptor_unmap(tx);
 			ioat_dma_unmap(chan, tx->flags, desc->len, desc->hw);
 			dma_cookie_complete(tx);
 			if (tx->callback) {
diff --git a/drivers/dma/ioat/dma_v3.c b/drivers/dma/ioat/dma_v3.c
index e52cf1e..70385d5 100644
--- a/drivers/dma/ioat/dma_v3.c
+++ b/drivers/dma/ioat/dma_v3.c
@@ -279,6 +279,7 @@ static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete)
 		tx = &desc->txd;
 		if (tx->cookie) {
 			dma_cookie_complete(tx);
+			dma_descriptor_unmap(tx);
 			ioat3_dma_unmap(ioat, desc, idx + i);
 			if (tx->callback) {
 				tx->callback(tx->callback_param);
diff --git a/drivers/dma/iop-adma.c b/drivers/dma/iop-adma.c
index 79e3eba..32f5d46 100644
--- a/drivers/dma/iop-adma.c
+++ b/drivers/dma/iop-adma.c
@@ -152,6 +152,7 @@ iop_adma_run_tx_complete_actions(struct iop_adma_desc_slot *desc,
 		if (tx->callback)
 			tx->callback(tx->callback_param);
 
+		dma_descriptor_unmap(tx);
 		/* unmap dma addresses
 		 * (unmap_single vs unmap_page?)
 		 */
diff --git a/drivers/dma/mv_xor.c b/drivers/dma/mv_xor.c
index e362e2b..4a5c073 100644
--- a/drivers/dma/mv_xor.c
+++ b/drivers/dma/mv_xor.c
@@ -303,6 +303,7 @@ mv_xor_run_tx_complete_actions(struct mv_xor_desc_slot *desc,
 			desc->async_tx.callback(
 				desc->async_tx.callback_param);
 
+		dma_descriptor_unmap(&desc->async_tx);
 		/* unmap dma addresses
 		 * (unmap_single vs unmap_page?)
 		 */
diff --git a/drivers/dma/ppc4xx/adma.c b/drivers/dma/ppc4xx/adma.c
index f72348d..883b343 100644
--- a/drivers/dma/ppc4xx/adma.c
+++ b/drivers/dma/ppc4xx/adma.c
@@ -1765,6 +1765,7 @@ static dma_cookie_t ppc440spe_adma_run_tx_complete_actions(
 			desc->async_tx.callback(
 				desc->async_tx.callback_param);
 
+		dma_descriptor_unmap(&desc->async_tx);
 		/* unmap dma addresses
 		 * (unmap_single vs unmap_page?)
 		 *
diff --git a/drivers/dma/timb_dma.c b/drivers/dma/timb_dma.c
index 4e0dff5..4b82112 100644
--- a/drivers/dma/timb_dma.c
+++ b/drivers/dma/timb_dma.c
@@ -293,6 +293,7 @@ static void __td_finish(struct timb_dma_chan *td_chan)
 
 	list_move(&td_desc->desc_node, &td_chan->free_list);
 
+	dma_descriptor_unmap(txd);
 	if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP))
 		__td_unmap_descs(td_desc,
 			txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE);
diff --git a/drivers/dma/txx9dmac.c b/drivers/dma/txx9dmac.c
index 913f55c..041b675 100644
--- a/drivers/dma/txx9dmac.c
+++ b/drivers/dma/txx9dmac.c
@@ -419,6 +419,7 @@ txx9dmac_descriptor_complete(struct txx9dmac_chan *dc,
 	list_splice_init(&desc->tx_list, &dc->free_list);
 	list_move(&desc->desc_node, &dc->free_list);
 
+	dma_descriptor_unmap(txd);
 	if (!ds) {
 		dma_addr_t dmaaddr;
 		if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 1750e09..da58d79 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -392,6 +392,17 @@ void dma_chan_cleanup(struct kref *kref);
 typedef bool (*dma_filter_fn)(struct dma_chan *chan, void *filter_param);
 
 typedef void (*dma_async_tx_callback)(void *dma_async_param);
+
+struct dmaengine_unmap_data {
+	u8 to_cnt;
+	u8 from_cnt;
+	u8 bidi_cnt;
+	struct device *dev;
+	struct kref kref;
+	size_t len;
+	dma_addr_t addr[0];
+};
+
 /**
  * struct dma_async_tx_descriptor - async transaction descriptor
  * ---dma generic offload fields---
@@ -417,6 +428,7 @@ struct dma_async_tx_descriptor {
 	dma_cookie_t (*tx_submit)(struct dma_async_tx_descriptor *tx);
 	dma_async_tx_callback callback;
 	void *callback_param;
+	struct dmaengine_unmap_data *unmap;
 #ifdef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH
 	struct dma_async_tx_descriptor *next;
 	struct dma_async_tx_descriptor *parent;
@@ -424,6 +436,20 @@ struct dma_async_tx_descriptor {
 #endif
 };
 
+static inline void dma_set_unmap(struct dma_async_tx_descriptor *tx,
+				 struct dmaengine_unmap_data *unmap)
+{
+	kref_get(&unmap->kref);
+	tx->unmap = unmap;
+}
+
+static inline void dma_descriptor_unmap(struct dma_async_tx_descriptor *tx)
+{
+	if (tx->unmap) {
+		tx->unmap = NULL;
+	}
+}
+
 #ifndef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH
 static inline void txd_lock(struct dma_async_tx_descriptor *txd)
 {



* [PATCH 03/12] dmaengine: reference counted unmap data
From: Dan Williams @ 2012-12-06  9:25 UTC
  To: linux-kernel
  Cc: linux, Bartlomiej Zolnierkiewicz, Vinod Koul, Tomasz Figa,
	Kyungmin Park, dave.jiang

Hang a common 'unmap' object off of dma descriptors for the purpose of
providing a unified unmapping interface.  The lifetime of a mapping may
span multiple descriptors, so these unmap objects are reference
counted, with each descriptor that uses the mapping holding a
reference.
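
For a mapping shared by two descriptors the resulting lifecycle looks
roughly like this (sketch; allocations are rounded up to the nearest
pool size of 2, 16, 128, or 256 addresses):

	unmap = dmaengine_get_unmap_data(dma->dev, 2, GFP_NOIO); /* kref == 1 */

	/* fill in unmap->addr[], the counts, and unmap->len here */

	dma_set_unmap(tx1, unmap);	/* kref == 2 */
	dma_set_unmap(tx2, unmap);	/* kref == 3 */

	dmaengine_unmap_put(unmap);	/* submitter's reference, kref == 2 */

	/* dma_descriptor_unmap() drops one reference as each descriptor
	 * completes; the pages are unmapped when the count hits zero
	 */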

Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Tomasz Figa <t.figa@samsung.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Dan Williams <djbw@fb.com>
---
 drivers/dma/dmaengine.c   |  157 ++++++++++++++++++++++++++++++++++++++++++---
 include/linux/dmaengine.h |    3 +
 2 files changed, 151 insertions(+), 9 deletions(-)

diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index f3cadc6..00f0baf 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -62,6 +62,7 @@
 #include <linux/rculist.h>
 #include <linux/idr.h>
 #include <linux/slab.h>
+#include <linux/mempool.h>
 
 static DEFINE_MUTEX(dma_list_mutex);
 static DEFINE_IDR(dma_idr);
@@ -856,6 +857,131 @@ void dma_async_device_unregister(struct dma_device *device)
 }
 EXPORT_SYMBOL(dma_async_device_unregister);
 
+struct dmaengine_unmap_pool {
+	struct kmem_cache *cache;
+	const char *name;
+	mempool_t *pool;
+	size_t size;
+};
+
+#define __UNMAP_POOL(x) { .size = x, .name = "dmaengine-unmap-" __stringify(x) }
+static struct dmaengine_unmap_pool unmap_pool[] = {
+	__UNMAP_POOL(2),
+	#if IS_ENABLED(CONFIG_ASYNC_TX_DMA)
+	__UNMAP_POOL(16),
+	__UNMAP_POOL(128),
+	__UNMAP_POOL(256),
+	#endif
+};
+
+static struct dmaengine_unmap_pool *__get_unmap_pool(int nr)
+{
+	int order = get_count_order(nr);
+
+	switch (order) {
+	case 0 ... 1:
+		return &unmap_pool[0];
+	case 2 ... 4:
+		return &unmap_pool[1];
+	case 5 ... 7:
+		return &unmap_pool[2];
+	case 8:
+		return &unmap_pool[3];
+	default:
+		BUG();
+		return NULL;
+	}
+
+}
+
+static void dmaengine_unmap(struct kref *kref)
+{
+	struct dmaengine_unmap_data *unmap = container_of(kref, typeof(*unmap), kref);
+	struct device *dev = unmap->dev;
+	int cnt, i;
+
+	cnt = unmap->to_cnt;
+	for (i = 0; i < cnt; i++)
+		dma_unmap_page(dev, unmap->addr[i], unmap->len,
+			       DMA_TO_DEVICE);
+	cnt += unmap->from_cnt;
+	for (; i < cnt; i++)
+		dma_unmap_page(dev, unmap->addr[i], unmap->len,
+			       DMA_FROM_DEVICE);
+	cnt += unmap->bidi_cnt;
+	for (; i < cnt; i++)
+		dma_unmap_page(dev, unmap->addr[i], unmap->len,
+			       DMA_BIDIRECTIONAL);
+	mempool_free(unmap, __get_unmap_pool(cnt)->pool);
+}
+
+void dmaengine_unmap_put(struct dmaengine_unmap_data *unmap)
+{
+	if (unmap)
+		kref_put(&unmap->kref, dmaengine_unmap);
+}
+EXPORT_SYMBOL_GPL(dmaengine_unmap_put);
+
+static void dmaengine_destroy_unmap_pool(void)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(unmap_pool); i++) {
+		struct dmaengine_unmap_pool *p = &unmap_pool[i];
+
+		if (p->cache)
+			kmem_cache_destroy(p->cache);
+		p->cache = NULL;
+		if (p->pool)
+			mempool_destroy(p->pool);
+		p->pool = NULL;
+	}
+}
+
+static int dmaengine_init_unmap_pool(void)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(unmap_pool); i++) {
+		struct dmaengine_unmap_pool *p = &unmap_pool[i];
+		size_t size;
+
+		size = sizeof(struct dmaengine_unmap_data) +
+		       sizeof(dma_addr_t) * p->size;
+
+		p->cache = kmem_cache_create(p->name, size, 0,
+					     SLAB_HWCACHE_ALIGN, NULL);
+		if (!p->cache)
+			break;
+		p->pool = mempool_create_slab_pool(1, p->cache);
+		if (!p->pool)
+			break;
+	}
+
+	if (i == ARRAY_SIZE(unmap_pool))
+		return 0;
+
+	dmaengine_destroy_unmap_pool();
+	return -ENOMEM;
+}
+
+static struct dmaengine_unmap_data *
+dmaengine_get_unmap_data(struct device *dev, int nr, gfp_t flags)
+{
+	struct dmaengine_unmap_data *unmap;
+
+	unmap = mempool_alloc(__get_unmap_pool(nr)->pool, flags);
+	if (!unmap)
+		return NULL;
+	unmap->to_cnt = 0;
+	unmap->from_cnt = 0;
+	unmap->bidi_cnt = 0;
+	kref_init(&unmap->kref);
+	unmap->dev = dev;
+
+	return unmap;
+}
+
 /**
  * dma_async_memcpy_pg_to_pg - offloaded copy from page to page
  * @chan: DMA channel to offload copy to
@@ -877,24 +1003,34 @@ dma_async_memcpy_pg_to_pg(struct dma_chan *chan, struct page *dest_pg,
 {
 	struct dma_device *dev = chan->device;
 	struct dma_async_tx_descriptor *tx;
-	dma_addr_t dma_dest, dma_src;
+	struct dmaengine_unmap_data *unmap;
 	dma_cookie_t cookie;
 	unsigned long flags;
 
-	dma_src = dma_map_page(dev->dev, src_pg, src_off, len, DMA_TO_DEVICE);
-	dma_dest = dma_map_page(dev->dev, dest_pg, dest_off, len,
-				DMA_FROM_DEVICE);
-	flags = DMA_CTRL_ACK;
-	tx = dev->device_prep_dma_memcpy(chan, dma_dest, dma_src, len, flags);
+	unmap = dmaengine_get_unmap_data(dev->dev, 2, GFP_NOIO);
+	if (!unmap)
+		return -ENOMEM;
+
+	unmap->to_cnt = 1;
+	unmap->from_cnt = 1;
+	unmap->len = len;
+	unmap->addr[0] = dma_map_page(dev->dev, src_pg, src_off, len,
+				      DMA_TO_DEVICE);
+	unmap->addr[1] = dma_map_page(dev->dev, dest_pg, dest_off, len,
+				      DMA_FROM_DEVICE);
+	flags = DMA_CTRL_ACK | DMA_COMPL_SKIP_SRC_UNMAP |
+		DMA_COMPL_SKIP_DEST_UNMAP;
+	tx = dev->device_prep_dma_memcpy(chan, unmap->addr[1], unmap->addr[0],
+					 len, flags);
 
 	if (!tx) {
-		dma_unmap_page(dev->dev, dma_src, len, DMA_TO_DEVICE);
-		dma_unmap_page(dev->dev, dma_dest, len, DMA_FROM_DEVICE);
+		dmaengine_unmap_put(unmap);
 		return -ENOMEM;
 	}
 
-	tx->callback = NULL;
+	dma_set_unmap(tx, unmap);
 	cookie = tx->tx_submit(tx);
+	dmaengine_unmap_put(unmap);
 
 	preempt_disable();
 	__this_cpu_add(chan->local->bytes_transferred, len);
@@ -1024,6 +1159,10 @@ EXPORT_SYMBOL_GPL(dma_run_dependencies);
 
 static int __init dma_bus_init(void)
 {
+	int err = dmaengine_init_unmap_pool();
+
+	if (err)
+		return err;
 	return class_register(&dma_devclass);
 }
 arch_initcall(dma_bus_init);
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index da58d79..c90d6b6 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -443,9 +443,12 @@ static inline void dma_set_unmap(struct dma_async_tx_descriptor *tx,
 	tx->unmap = unmap;
 }
 
+void dmaengine_unmap_put(struct dmaengine_unmap_data *unmap);
+
 static inline void dma_descriptor_unmap(struct dma_async_tx_descriptor *tx)
 {
 	if (tx->unmap) {
+		dmaengine_unmap_put(tx->unmap);
 		tx->unmap = NULL;
 	}
 }



* [PATCH 04/12] async_memcpy: convert to dmaengine_unmap_data
From: Dan Williams @ 2012-12-06  9:25 UTC
  To: linux-kernel
  Cc: linux, Bartlomiej Zolnierkiewicz, vinod.koul, Tomasz Figa,
	Kyungmin Park, dave.jiang

Use the generic unmap object to unmap dma buffers.
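
The conversion pattern, shared by this and the following async_tx
patches, condenses to (sketch of the diff below):

	unmap = dmaengine_get_unmap_data(device->dev, 2, GFP_NOIO);
	if (unmap && is_dma_copy_aligned(device, src_offset, dest_offset, len)) {
		/* dma_map_page() src and dest into unmap->addr[], set the
		 * counts and unmap->len, then prep the descriptor
		 */
		tx = device->device_prep_dma_memcpy(chan, unmap->addr[1],
						    unmap->addr[0], len,
						    dma_prep_flags);
	}
	if (tx) {
		dma_set_unmap(tx, unmap);	/* descriptor takes a reference */
		async_tx_submit(chan, tx, submit);
	} else {
		/* run the copy synchronously, as before */
	}
	dmaengine_unmap_put(unmap);	/* NULL-safe, covers both paths */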

Cc: Tomasz Figa <t.figa@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Dan Williams <djbw@fb.com>
---
 crypto/async_tx/async_memcpy.c |   39 ++++++++++++++++++++++-----------------
 drivers/dma/dmaengine.c        |    3 ++-
 include/linux/dmaengine.h      |    2 ++
 3 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/crypto/async_tx/async_memcpy.c b/crypto/async_tx/async_memcpy.c
index 9e62fef..ca95c4c 100644
--- a/crypto/async_tx/async_memcpy.c
+++ b/crypto/async_tx/async_memcpy.c
@@ -50,33 +50,37 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
 						      &dest, 1, &src, 1, len);
 	struct dma_device *device = chan ? chan->device : NULL;
 	struct dma_async_tx_descriptor *tx = NULL;
+	struct dmaengine_unmap_data *unmap = NULL;
 
-	if (device && is_dma_copy_aligned(device, src_offset, dest_offset, len)) {
-		dma_addr_t dma_dest, dma_src;
-		unsigned long dma_prep_flags = 0;
+	if (device)
+		unmap = dmaengine_get_unmap_data(device->dev, 2, GFP_NOIO);
+
+	if (unmap && is_dma_copy_aligned(device, src_offset, dest_offset, len)) {
+		unsigned long dma_prep_flags = DMA_COMPL_SKIP_SRC_UNMAP |
+					       DMA_COMPL_SKIP_DEST_UNMAP;
 
 		if (submit->cb_fn)
 			dma_prep_flags |= DMA_PREP_INTERRUPT;
 		if (submit->flags & ASYNC_TX_FENCE)
 			dma_prep_flags |= DMA_PREP_FENCE;
-		dma_dest = dma_map_page(device->dev, dest, dest_offset, len,
-					DMA_FROM_DEVICE);
-
-		dma_src = dma_map_page(device->dev, src, src_offset, len,
-				       DMA_TO_DEVICE);
-
-		tx = device->device_prep_dma_memcpy(chan, dma_dest, dma_src,
-						    len, dma_prep_flags);
-		if (!tx) {
-			dma_unmap_page(device->dev, dma_dest, len,
-				       DMA_FROM_DEVICE);
-			dma_unmap_page(device->dev, dma_src, len,
-				       DMA_TO_DEVICE);
-		}
+
+		unmap->to_cnt = 1;
+		unmap->addr[0] = dma_map_page(device->dev, src, src_offset, len,
+					      DMA_TO_DEVICE);
+		unmap->from_cnt = 1;
+		unmap->addr[1] = dma_map_page(device->dev, dest, dest_offset, len,
+					      DMA_FROM_DEVICE);
+		unmap->len = len;
+
+		tx = device->device_prep_dma_memcpy(chan, unmap->addr[1],
+						    unmap->addr[0], len,
+						    dma_prep_flags);
 	}
 
 	if (tx) {
 		pr_debug("%s: (async) len: %zu\n", __func__, len);
+
+		dma_set_unmap(tx, unmap);
 		async_tx_submit(chan, tx, submit);
 	} else {
 		void *dest_buf, *src_buf;
@@ -96,6 +99,8 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
 		async_tx_sync_epilog(submit);
 	}
 
+	dmaengine_unmap_put(unmap);
+
 	return tx;
 }
 EXPORT_SYMBOL_GPL(async_memcpy);
diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index 00f0baf..1b76227 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -965,7 +965,7 @@ static int dmaengine_init_unmap_pool(void)
 	return -ENOMEM;
 }
 
-static struct dmaengine_unmap_data *
+struct dmaengine_unmap_data *
 dmaengine_get_unmap_data(struct device *dev, int nr, gfp_t flags)
 {
 	struct dmaengine_unmap_data *unmap;
@@ -981,6 +981,7 @@ dmaengine_get_unmap_data(struct device *dev, int nr, gfp_t flags)
 
 	return unmap;
 }
+EXPORT_SYMBOL(dmaengine_get_unmap_data);
 
 /**
  * dma_async_memcpy_pg_to_pg - offloaded copy from page to page
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index c90d6b6..e954e9f 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -443,6 +443,8 @@ static inline void dma_set_unmap(struct dma_async_tx_descriptor *tx,
 	tx->unmap = unmap;
 }
 
+struct dmaengine_unmap_data *
+dmaengine_get_unmap_data(struct device *dev, int nr, gfp_t flags);
 void dmaengine_unmap_put(struct dmaengine_unmap_data *unmap);
 
 static inline void dma_descriptor_unmap(struct dma_async_tx_descriptor *tx)



* [PATCH 05/12] async_memset: convert to dmaengine_unmap_data
From: Dan Williams @ 2012-12-06  9:25 UTC
  To: linux-kernel
  Cc: linux, Bartlomiej Zolnierkiewicz, vinod.koul, Tomasz Figa,
	Kyungmin Park, dave.jiang

Use the generic unmap object to unmap dma buffers.

Cc: Tomasz Figa <t.figa@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Dan Williams <djbw@fb.com>
---
 crypto/async_tx/async_memset.c |   18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/crypto/async_tx/async_memset.c b/crypto/async_tx/async_memset.c
index 05a4d1e..ffca53b 100644
--- a/crypto/async_tx/async_memset.c
+++ b/crypto/async_tx/async_memset.c
@@ -47,17 +47,23 @@ async_memset(struct page *dest, int val, unsigned int offset, size_t len,
 						      &dest, 1, NULL, 0, len);
 	struct dma_device *device = chan ? chan->device : NULL;
 	struct dma_async_tx_descriptor *tx = NULL;
+	struct dmaengine_unmap_data *unmap = NULL;
 
-	if (device && is_dma_fill_aligned(device, offset, 0, len)) {
-		dma_addr_t dma_dest;
-		unsigned long dma_prep_flags = 0;
+	if (device)
+		unmap = dmaengine_get_unmap_data(device->dev, 1, GFP_NOIO);
+
+	if (unmap && is_dma_fill_aligned(device, offset, 0, len)) {
+		unsigned long dma_prep_flags = DMA_COMPL_SKIP_SRC_UNMAP |
+					       DMA_COMPL_SKIP_DEST_UNMAP;
 
 		if (submit->cb_fn)
 			dma_prep_flags |= DMA_PREP_INTERRUPT;
 		if (submit->flags & ASYNC_TX_FENCE)
 			dma_prep_flags |= DMA_PREP_FENCE;
-		dma_dest = dma_map_page(device->dev, dest, offset, len,
-					DMA_FROM_DEVICE);
+		unmap->from_cnt = 1;
+		unmap->addr[0] = dma_map_page(device->dev, dest, offset, len,
+					      DMA_FROM_DEVICE);
+		unmap->len = len;
 
-		tx = device->device_prep_dma_memset(chan, dma_dest, val, len,
-						    dma_prep_flags);
+		tx = device->device_prep_dma_memset(chan, unmap->addr[0], val,
+						    len, dma_prep_flags);
@@ -65,6 +70,8 @@ async_memset(struct page *dest, int val, unsigned int offset, size_t len,
 
 	if (tx) {
 		pr_debug("%s: (async) len: %zu\n", __func__, len);
+
+		dma_set_unmap(tx, unmap);
 		async_tx_submit(chan, tx, submit);
 	} else { /* run the memset synchronously */
 		void *dest_buf;
@@ -79,6 +86,7 @@ async_memset(struct page *dest, int val, unsigned int offset, size_t len,
 
 		async_tx_sync_epilog(submit);
 	}
+	dmaengine_unmap_put(unmap);
 
 	return tx;
 }



* [PATCH 06/12] async_xor: convert to dmaengine_unmap_data
From: Dan Williams @ 2012-12-06  9:25 UTC
  To: linux-kernel
  Cc: linux, Bartlomiej Zolnierkiewicz, vinod.koul, Tomasz Figa,
	Kyungmin Park, dave.jiang

Use the generic unmap object to unmap dma buffers.

Later we can push this unmap object up to the raid layer and get rid of
the 'scribble' parameter.
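
The unmap->addr[] layout for xor puts the sources first and the
destination last as the single bidirectional entry, e.g. for three
sources (sketch):

	/* addr[0..2]: sources, DMA_TO_DEVICE      (to_cnt == 3)
	 * addr[3]   : dest, DMA_BIDIRECTIONAL     (bidi_cnt == 1)
	 *
	 * do_async_xor() finds the destination at addr[to_cnt], which
	 * also lets a chained continuation substitute the intermediate
	 * result for a consumed source in place.
	 */
	unmap = dmaengine_get_unmap_data(device->dev, 3 + 1, GFP_NOIO);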

Cc: Tomasz Figa <t.figa@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Dan Williams <djbw@fb.com>
---
 crypto/async_tx/async_xor.c |   96 +++++++++++++++++++++++--------------------
 1 file changed, 52 insertions(+), 44 deletions(-)

diff --git a/crypto/async_tx/async_xor.c b/crypto/async_tx/async_xor.c
index 154cc84..46bbdb3 100644
--- a/crypto/async_tx/async_xor.c
+++ b/crypto/async_tx/async_xor.c
@@ -33,48 +33,33 @@
 
 /* do_async_xor - dma map the pages and perform the xor with an engine */
 static __async_inline struct dma_async_tx_descriptor *
-do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
-	     unsigned int offset, int src_cnt, size_t len, dma_addr_t *dma_src,
-	     struct async_submit_ctl *submit)
+do_async_xor(struct dma_chan *chan, struct dmaengine_unmap_data *unmap,
+             struct async_submit_ctl *submit)
 {
 	struct dma_device *dma = chan->device;
 	struct dma_async_tx_descriptor *tx = NULL;
-	int src_off = 0;
-	int i;
 	dma_async_tx_callback cb_fn_orig = submit->cb_fn;
 	void *cb_param_orig = submit->cb_param;
 	enum async_tx_flags flags_orig = submit->flags;
 	enum dma_ctrl_flags dma_flags;
-	int xor_src_cnt = 0;
-	dma_addr_t dma_dest;
-
-	/* map the dest bidrectional in case it is re-used as a source */
-	dma_dest = dma_map_page(dma->dev, dest, offset, len, DMA_BIDIRECTIONAL);
-	for (i = 0; i < src_cnt; i++) {
-		/* only map the dest once */
-		if (!src_list[i])
-			continue;
-		if (unlikely(src_list[i] == dest)) {
-			dma_src[xor_src_cnt++] = dma_dest;
-			continue;
-		}
-		dma_src[xor_src_cnt++] = dma_map_page(dma->dev, src_list[i], offset,
-						      len, DMA_TO_DEVICE);
-	}
-	src_cnt = xor_src_cnt;
+	int src_cnt = unmap->to_cnt;
+	int xor_src_cnt;
+	dma_addr_t dma_dest = unmap->addr[unmap->to_cnt];
+	dma_addr_t *src_list = unmap->addr;
 
 	while (src_cnt) {
+		dma_addr_t tmp;
+
 		submit->flags = flags_orig;
 		dma_flags = 0;
 		xor_src_cnt = min(src_cnt, (int)dma->max_xor);
-		/* if we are submitting additional xors, leave the chain open,
-		 * clear the callback parameters, and leave the destination
-		 * buffer mapped
+		/* if we are submitting additional xors, leave the chain open
+		 * and clear the callback parameters
 		 */
+		dma_flags = DMA_COMPL_SKIP_SRC_UNMAP | DMA_COMPL_SKIP_DEST_UNMAP;
 		if (src_cnt > xor_src_cnt) {
 			submit->flags &= ~ASYNC_TX_ACK;
 			submit->flags |= ASYNC_TX_FENCE;
-			dma_flags = DMA_COMPL_SKIP_DEST_UNMAP;
 			submit->cb_fn = NULL;
 			submit->cb_param = NULL;
 		} else {
@@ -85,12 +70,18 @@ do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
 			dma_flags |= DMA_PREP_INTERRUPT;
 		if (submit->flags & ASYNC_TX_FENCE)
 			dma_flags |= DMA_PREP_FENCE;
-		/* Since we have clobbered the src_list we are committed
-		 * to doing this asynchronously.  Drivers force forward progress
-		 * in case they can not provide a descriptor
+
+		/* Drivers force forward progress in case they can not provide a
+		 * descriptor
 		 */
-		tx = dma->device_prep_dma_xor(chan, dma_dest, &dma_src[src_off],
-					      xor_src_cnt, len, dma_flags);
+		tmp = src_list[0];
+		if (src_list > unmap->addr)
+			src_list[0] = dma_dest;
+		tx = dma->device_prep_dma_xor(chan, dma_dest, src_list,
+					      xor_src_cnt, unmap->len,
+					      dma_flags);
+		src_list[0] = tmp;
 
 		if (unlikely(!tx))
 			async_tx_quiesce(&submit->depend_tx);
@@ -99,22 +90,21 @@ do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
 		while (unlikely(!tx)) {
 			dma_async_issue_pending(chan);
 			tx = dma->device_prep_dma_xor(chan, dma_dest,
-						      &dma_src[src_off],
-						      xor_src_cnt, len,
+						      src_list,
+						      xor_src_cnt, unmap->len,
 						      dma_flags);
 		}
 
+		dma_set_unmap(tx, unmap);
 		async_tx_submit(chan, tx, submit);
 		submit->depend_tx = tx;
 
 		if (src_cnt > xor_src_cnt) {
 			/* drop completed sources */
 			src_cnt -= xor_src_cnt;
-			src_off += xor_src_cnt;
-
 			/* use the intermediate result a source */
-			dma_src[--src_off] = dma_dest;
 			src_cnt++;
+			src_list += xor_src_cnt - 1;
 		} else
 			break;
 	}
@@ -189,22 +179,40 @@ async_xor(struct page *dest, struct page **src_list, unsigned int offset,
 	struct dma_chan *chan = async_tx_find_channel(submit, DMA_XOR,
 						      &dest, 1, src_list,
 						      src_cnt, len);
-	dma_addr_t *dma_src = NULL;
+	struct dma_device *device = chan ? chan->device : NULL;
+	struct dmaengine_unmap_data *unmap = NULL;
 
 	BUG_ON(src_cnt <= 1);
 
-	if (submit->scribble)
-		dma_src = submit->scribble;
-	else if (sizeof(dma_addr_t) <= sizeof(struct page *))
-		dma_src = (dma_addr_t *) src_list;
+	if (device)
+		unmap = dmaengine_get_unmap_data(device->dev, src_cnt+1, GFP_NOIO);
+
+	if (unmap && is_dma_xor_aligned(device, offset, 0, len)) {
+		struct dma_async_tx_descriptor *tx;
+		int i, j;
 
-	if (dma_src && chan && is_dma_xor_aligned(chan->device, offset, 0, len)) {
 		/* run the xor asynchronously */
 		pr_debug("%s (async): len: %zu\n", __func__, len);
 
-		return do_async_xor(chan, dest, src_list, offset, src_cnt, len,
-				    dma_src, submit);
+		unmap->len = len;
+		for (i = 0, j = 0; i < src_cnt; i++) {
+			if (!src_list[i])
+				continue;
+			unmap->to_cnt++;
+			unmap->addr[j++] = dma_map_page(chan->device->dev, src_list[i],
+							offset, len, DMA_TO_DEVICE);
+		}
+
+		/* map it bidirectional as it may be re-used as a source */
+		unmap->addr[j] = dma_map_page(chan->device->dev, dest, offset, len,
+					      DMA_BIDIRECTIONAL);
+		unmap->bidi_cnt = 1;
+
+		tx = do_async_xor(chan, unmap, submit);
+		dmaengine_unmap_put(unmap);
+		return tx;
 	} else {
+		dmaengine_unmap_put(unmap);
 		/* run the xor synchronously */
 		pr_debug("%s (sync): len: %zu\n", __func__, len);
 		WARN_ONCE(chan, "%s: no space for dma address conversion\n",



* [PATCH 07/12] async_xor_val: convert to dmaengine_unmap_data
From: Dan Williams @ 2012-12-06  9:25 UTC
  To: linux-kernel
  Cc: linux, Bartlomiej Zolnierkiewicz, vinod.koul, Tomasz Figa,
	Kyungmin Park, dave.jiang

Use the generic unmap object to unmap dma buffers.

Cc: Tomasz Figa <t.figa@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Dan Williams <djbw@fb.com>
---
 crypto/async_tx/async_xor.c |   30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/crypto/async_tx/async_xor.c b/crypto/async_tx/async_xor.c
index 46bbdb3..d6e1dc0 100644
--- a/crypto/async_tx/async_xor.c
+++ b/crypto/async_tx/async_xor.c
@@ -278,18 +278,17 @@ async_xor_val(struct page *dest, struct page **src_list, unsigned int offset,
 	struct dma_chan *chan = xor_val_chan(submit, dest, src_list, src_cnt, len);
 	struct dma_device *device = chan ? chan->device : NULL;
 	struct dma_async_tx_descriptor *tx = NULL;
-	dma_addr_t *dma_src = NULL;
+	struct dmaengine_unmap_data *unmap = NULL;
 
 	BUG_ON(src_cnt <= 1);
 
-	if (submit->scribble)
-		dma_src = submit->scribble;
-	else if (sizeof(dma_addr_t) <= sizeof(struct page *))
-		dma_src = (dma_addr_t *) src_list;
+	if (device)
+		unmap = dmaengine_get_unmap_data(device->dev, src_cnt, GFP_NOIO);
 
-	if (dma_src && device && src_cnt <= device->max_xor &&
+	if (unmap && src_cnt <= device->max_xor &&
 	    is_dma_xor_aligned(device, offset, 0, len)) {
-		unsigned long dma_prep_flags = 0;
+		unsigned long dma_prep_flags = DMA_COMPL_SKIP_SRC_UNMAP |
+					       DMA_COMPL_SKIP_DEST_UNMAP;
 		int i;
 
 		pr_debug("%s: (async) len: %zu\n", __func__, len);
@@ -298,11 +297,15 @@ async_xor_val(struct page *dest, struct page **src_list, unsigned int offset,
 			dma_prep_flags |= DMA_PREP_INTERRUPT;
 		if (submit->flags & ASYNC_TX_FENCE)
 			dma_prep_flags |= DMA_PREP_FENCE;
-		for (i = 0; i < src_cnt; i++)
-			dma_src[i] = dma_map_page(device->dev, src_list[i],
-						  offset, len, DMA_TO_DEVICE);
 
-		tx = device->device_prep_dma_xor_val(chan, dma_src, src_cnt,
+		for (i = 0; i < src_cnt; i++) {
+			unmap->addr[i] = dma_map_page(device->dev, src_list[i],
+						      offset, len, DMA_TO_DEVICE);
+			unmap->to_cnt++;
+		}
+		unmap->len = len;
+
+		tx = device->device_prep_dma_xor_val(chan, unmap->addr, src_cnt,
 						     len, result,
 						     dma_prep_flags);
 		if (unlikely(!tx)) {
@@ -311,11 +314,11 @@ async_xor_val(struct page *dest, struct page **src_list, unsigned int offset,
 			while (!tx) {
 				dma_async_issue_pending(chan);
 				tx = device->device_prep_dma_xor_val(chan,
-					dma_src, src_cnt, len, result,
+					unmap->addr, src_cnt, len, result,
 					dma_prep_flags);
 			}
 		}
-
+		dma_set_unmap(tx, unmap);
 		async_tx_submit(chan, tx, submit);
 	} else {
 		enum async_tx_flags flags_orig = submit->flags;
@@ -337,6 +340,7 @@ async_xor_val(struct page *dest, struct page **src_list, unsigned int offset,
 		async_tx_sync_epilog(submit);
 		submit->flags = flags_orig;
 	}
+	dmaengine_unmap_put(unmap);
 
 	return tx;
 }



* [PATCH 08/12] async_raid6_recov: convert to dmaengine_unmap_data
From: Dan Williams @ 2012-12-06  9:25 UTC
  To: linux-kernel
  Cc: linux, Bartlomiej Zolnierkiewicz, vinod.koul, Tomasz Figa,
	Kyungmin Park, dave.jiang

Use the generic unmap object to unmap dma buffers.
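
The one subtlety in the recovery paths is the pq destination
convention: device_prep_dma_pq() takes a two-slot {P, Q} destination
array, and with DMA_PREP_PQ_DISABLE_P set only the Q slot is read, so
the real destination is parked in slot 1 (sketch, following the
async_sum_product() hunk below):

	dma_addr_t pq[2];

	pq[1] = unmap->addr[2];	/* Q; pq[0] (P) is ignored */
	dma_flags |= DMA_PREP_PQ_DISABLE_P;
	tx = dma->device_prep_dma_pq(chan, pq, unmap->addr, 2, coef,
				     len, dma_flags);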

Cc: Tomasz Figa <t.figa@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Dan Williams <djbw@fb.com>
---
 crypto/async_tx/async_raid6_recov.c |   69 ++++++++++++++++++++++++-----------
 1 file changed, 48 insertions(+), 21 deletions(-)

diff --git a/crypto/async_tx/async_raid6_recov.c b/crypto/async_tx/async_raid6_recov.c
index a9f08a6..20aea04 100644
--- a/crypto/async_tx/async_raid6_recov.c
+++ b/crypto/async_tx/async_raid6_recov.c
@@ -26,6 +26,7 @@
 #include <linux/dma-mapping.h>
 #include <linux/raid/pq.h>
 #include <linux/async_tx.h>
+#include <linux/dmaengine.h>
 
 static struct dma_async_tx_descriptor *
 async_sum_product(struct page *dest, struct page **srcs, unsigned char *coef,
@@ -34,35 +35,47 @@ async_sum_product(struct page *dest, struct page **srcs, unsigned char *coef,
 	struct dma_chan *chan = async_tx_find_channel(submit, DMA_PQ,
 						      &dest, 1, srcs, 2, len);
 	struct dma_device *dma = chan ? chan->device : NULL;
+	struct dmaengine_unmap_data *unmap = NULL;
 	const u8 *amul, *bmul;
 	u8 ax, bx;
 	u8 *a, *b, *c;
 
-	if (dma) {
-		dma_addr_t dma_dest[2];
-		dma_addr_t dma_src[2];
+	if (dma)
+		unmap = dmaengine_get_unmap_data(dma->dev, 3, GFP_NOIO);
+
+	if (unmap) {
 		struct device *dev = dma->dev;
+		dma_addr_t pq[2];
 		struct dma_async_tx_descriptor *tx;
-		enum dma_ctrl_flags dma_flags = DMA_PREP_PQ_DISABLE_P;
+		enum dma_ctrl_flags dma_flags = DMA_COMPL_SKIP_SRC_UNMAP |
+						DMA_COMPL_SKIP_DEST_UNMAP |
+						DMA_PREP_PQ_DISABLE_P;
 
 		if (submit->flags & ASYNC_TX_FENCE)
 			dma_flags |= DMA_PREP_FENCE;
-		dma_dest[1] = dma_map_page(dev, dest, 0, len, DMA_BIDIRECTIONAL);
-		dma_src[0] = dma_map_page(dev, srcs[0], 0, len, DMA_TO_DEVICE);
-		dma_src[1] = dma_map_page(dev, srcs[1], 0, len, DMA_TO_DEVICE);
-		tx = dma->device_prep_dma_pq(chan, dma_dest, dma_src, 2, coef,
+		unmap->addr[0] = dma_map_page(dev, srcs[0], 0, len, DMA_TO_DEVICE);
+		unmap->addr[1] = dma_map_page(dev, srcs[1], 0, len, DMA_TO_DEVICE);
+		unmap->to_cnt = 2;
+
+		unmap->addr[2] = dma_map_page(dev, dest, 0, len, DMA_BIDIRECTIONAL);
+		unmap->bidi_cnt = 1;
+		/* engine only looks at Q, but expects it to follow P */
+		pq[1] = unmap->addr[2];
+
+		unmap->len = len;
+		tx = dma->device_prep_dma_pq(chan, pq, unmap->addr, 2, coef,
 					     len, dma_flags);
 		if (tx) {
+			dma_set_unmap(tx, unmap);
 			async_tx_submit(chan, tx, submit);
+			dmaengine_unmap_put(unmap);
 			return tx;
 		}
 
 		/* could not get a descriptor, unmap and fall through to
 		 * the synchronous path
 		 */
-		dma_unmap_page(dev, dma_dest[1], len, DMA_BIDIRECTIONAL);
-		dma_unmap_page(dev, dma_src[0], len, DMA_TO_DEVICE);
-		dma_unmap_page(dev, dma_src[1], len, DMA_TO_DEVICE);
+		dmaengine_unmap_put(unmap);
 	}
 
 	/* run the operation synchronously */
@@ -89,23 +102,38 @@ async_mult(struct page *dest, struct page *src, u8 coef, size_t len,
 	struct dma_chan *chan = async_tx_find_channel(submit, DMA_PQ,
 						      &dest, 1, &src, 1, len);
 	struct dma_device *dma = chan ? chan->device : NULL;
+	struct dmaengine_unmap_data *unmap = NULL;
 	const u8 *qmul; /* Q multiplier table */
 	u8 *d, *s;
 
-	if (dma) {
-		dma_addr_t dma_dest[2];
-		dma_addr_t dma_src[1];
+	if (dma)
+		unmap = dmaengine_get_unmap_data(dma->dev, 3, GFP_NOIO);
+
+	if (unmap) {
 		struct device *dev = dma->dev;
 		struct dma_async_tx_descriptor *tx;
-		enum dma_ctrl_flags dma_flags = DMA_PREP_PQ_DISABLE_P;
+		enum dma_ctrl_flags dma_flags = DMA_COMPL_SKIP_SRC_UNMAP |
+						DMA_COMPL_SKIP_DEST_UNMAP |
+						DMA_PREP_PQ_DISABLE_P;
 
 		if (submit->flags & ASYNC_TX_FENCE)
 			dma_flags |= DMA_PREP_FENCE;
-		dma_dest[1] = dma_map_page(dev, dest, 0, len, DMA_BIDIRECTIONAL);
-		dma_src[0] = dma_map_page(dev, src, 0, len, DMA_TO_DEVICE);
-		tx = dma->device_prep_dma_pq(chan, dma_dest, dma_src, 1, &coef,
-					     len, dma_flags);
+		unmap->addr[0] = dma_map_page(dev, src, 0, len, DMA_TO_DEVICE);
+		unmap->to_cnt++;
+		unmap->addr[1] = dma_map_page(dev, dest, 0, len, DMA_BIDIRECTIONAL);
+		unmap->bidi_cnt++;
+		unmap->len = len;
+
+		/* this looks funny, but the engine looks for Q at
+		 * unmap->addr[1] and ignores unmap->addr[0] as a dest
+		 * due to DMA_PREP_PQ_DISABLE_P
+		 */
+		tx = dma->device_prep_dma_pq(chan, unmap->addr, unmap->addr,
+					     1, &coef, len, dma_flags);
+
 		if (tx) {
+			dma_set_unmap(tx, unmap);
+			dmaengine_unmap_put(unmap);
 			async_tx_submit(chan, tx, submit);
 			return tx;
 		}
@@ -113,8 +141,7 @@ async_mult(struct page *dest, struct page *src, u8 coef, size_t len,
 		/* could not get a descriptor, unmap and fall through to
 		 * the synchronous path
 		 */
-		dma_unmap_page(dev, dma_dest[1], len, DMA_BIDIRECTIONAL);
-		dma_unmap_page(dev, dma_src[0], len, DMA_TO_DEVICE);
+		dmaengine_unmap_put(unmap);
 	}
 
 	/* no channel available, or failed to allocate a descriptor, so



* [PATCH 09/12] async_pq: convert to dmaengine_unmap_data
From: Dan Williams @ 2012-12-06  9:25 UTC
  To: linux-kernel
  Cc: linux, Bartlomiej Zolnierkiewicz, vinod.koul, Tomasz Figa,
	Kyungmin Park, dave.jiang

Use the generic unmap object to unmap dma buffers.
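
gen_syndrome adds one wrinkle: a missing P or Q destination still
occupies a bidirectional slot, marked with a 0 dma address that
dmaengine_unmap() is taught to skip (condensed from the diff below):

	if (P(blocks, disks))
		unmap->addr[j++] = dma_map_page(device->dev, P(blocks, disks),
						offset, len, DMA_BIDIRECTIONAL);
	else {
		unmap->addr[j++] = 0;	/* placeholder, skipped at unmap */
		dma_flags |= DMA_PREP_PQ_DISABLE_P;
	}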

Cc: Tomasz Figa <t.figa@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Dan Williams <djbw@fb.com>
---
 crypto/async_tx/async_pq.c |  117 ++++++++++++++++++++++++--------------------
 drivers/dma/dmaengine.c    |    5 ++
 2 files changed, 68 insertions(+), 54 deletions(-)

diff --git a/crypto/async_tx/async_pq.c b/crypto/async_tx/async_pq.c
index 91d5d38..1d78984 100644
--- a/crypto/async_tx/async_pq.c
+++ b/crypto/async_tx/async_pq.c
@@ -46,49 +46,25 @@ static struct page *pq_scribble_page;
  * do_async_gen_syndrome - asynchronously calculate P and/or Q
  */
 static __async_inline struct dma_async_tx_descriptor *
-do_async_gen_syndrome(struct dma_chan *chan, struct page **blocks,
-		      const unsigned char *scfs, unsigned int offset, int disks,
-		      size_t len, dma_addr_t *dma_src,
+do_async_gen_syndrome(struct dma_chan *chan,
+		      const unsigned char *scfs, int disks,
+		      struct dmaengine_unmap_data *unmap,
+		      enum dma_ctrl_flags dma_flags,
 		      struct async_submit_ctl *submit)
 {
+	dma_addr_t *dma_dest = &unmap->addr[disks - 2];
 	struct dma_async_tx_descriptor *tx = NULL;
 	struct dma_device *dma = chan->device;
-	enum dma_ctrl_flags dma_flags = 0;
 	enum async_tx_flags flags_orig = submit->flags;
 	dma_async_tx_callback cb_fn_orig = submit->cb_fn;
 	dma_async_tx_callback cb_param_orig = submit->cb_param;
 	int src_cnt = disks - 2;
-	unsigned char coefs[src_cnt];
 	unsigned short pq_src_cnt;
-	dma_addr_t dma_dest[2];
 	int src_off = 0;
-	int idx;
-	int i;
 
-	/* DMAs use destinations as sources, so use BIDIRECTIONAL mapping */
-	if (P(blocks, disks))
-		dma_dest[0] = dma_map_page(dma->dev, P(blocks, disks), offset,
-					   len, DMA_BIDIRECTIONAL);
-	else
-		dma_flags |= DMA_PREP_PQ_DISABLE_P;
-	if (Q(blocks, disks))
-		dma_dest[1] = dma_map_page(dma->dev, Q(blocks, disks), offset,
-					   len, DMA_BIDIRECTIONAL);
-	else
-		dma_flags |= DMA_PREP_PQ_DISABLE_Q;
-
-	/* convert source addresses being careful to collapse 'empty'
-	 * sources and update the coefficients accordingly
-	 */
-	for (i = 0, idx = 0; i < src_cnt; i++) {
-		if (blocks[i] == NULL)
-			continue;
-		dma_src[idx] = dma_map_page(dma->dev, blocks[i], offset, len,
-					    DMA_TO_DEVICE);
-		coefs[idx] = scfs[i];
-		idx++;
-	}
-	src_cnt = idx;
+	dma_flags |= DMA_COMPL_SKIP_SRC_UNMAP | DMA_COMPL_SKIP_DEST_UNMAP;
+	if (submit->flags & ASYNC_TX_FENCE)
+		dma_flags |= DMA_PREP_FENCE;
 
 	while (src_cnt > 0) {
 		submit->flags = flags_orig;
@@ -100,28 +76,23 @@ do_async_gen_syndrome(struct dma_chan *chan, struct page **blocks,
 		if (src_cnt > pq_src_cnt) {
 			submit->flags &= ~ASYNC_TX_ACK;
 			submit->flags |= ASYNC_TX_FENCE;
-			dma_flags |= DMA_COMPL_SKIP_DEST_UNMAP;
 			submit->cb_fn = NULL;
 			submit->cb_param = NULL;
 		} else {
-			dma_flags &= ~DMA_COMPL_SKIP_DEST_UNMAP;
 			submit->cb_fn = cb_fn_orig;
 			submit->cb_param = cb_param_orig;
 			if (cb_fn_orig)
 				dma_flags |= DMA_PREP_INTERRUPT;
 		}
-		if (submit->flags & ASYNC_TX_FENCE)
-			dma_flags |= DMA_PREP_FENCE;
 
-		/* Since we have clobbered the src_list we are committed
-		 * to doing this asynchronously.  Drivers force forward
-		 * progress in case they can not provide a descriptor
+		/* Drivers force forward progress in case they can not provide
+		 * a descriptor
 		 */
 		for (;;) {
 			tx = dma->device_prep_dma_pq(chan, dma_dest,
-						     &dma_src[src_off],
+						     &unmap->addr[src_off],
 						     pq_src_cnt,
-						     &coefs[src_off], len,
+						     &scfs[src_off], unmap->len,
 						     dma_flags);
 			if (likely(tx))
 				break;
@@ -129,6 +100,7 @@ do_async_gen_syndrome(struct dma_chan *chan, struct page **blocks,
 			dma_async_issue_pending(chan);
 		}
 
+		dma_set_unmap(tx, unmap);
 		async_tx_submit(chan, tx, submit);
 		submit->depend_tx = tx;
 
@@ -188,10 +160,6 @@ do_sync_gen_syndrome(struct page **blocks, unsigned int offset, int disks,
  * set to NULL those buffers will be replaced with the raid6_zero_page
  * in the synchronous path and omitted in the hardware-asynchronous
  * path.
- *
- * 'blocks' note: if submit->scribble is NULL then the contents of
- * 'blocks' may be overwritten to perform address conversions
- * (dma_map_page() or page_address()).
  */
 struct dma_async_tx_descriptor *
 async_gen_syndrome(struct page **blocks, unsigned int offset, int disks,
@@ -202,26 +170,69 @@ async_gen_syndrome(struct page **blocks, unsigned int offset, int disks,
 						      &P(blocks, disks), 2,
 						      blocks, src_cnt, len);
 	struct dma_device *device = chan ? chan->device : NULL;
-	dma_addr_t *dma_src = NULL;
+	struct dmaengine_unmap_data *unmap = NULL;
 
 	BUG_ON(disks > 255 || !(P(blocks, disks) || Q(blocks, disks)));
 
-	if (submit->scribble)
-		dma_src = submit->scribble;
-	else if (sizeof(dma_addr_t) <= sizeof(struct page *))
-		dma_src = (dma_addr_t *) blocks;
+	if (device)
+		unmap = dmaengine_get_unmap_data(device->dev, disks, GFP_NOIO);
 
-	if (dma_src && device &&
+	if (unmap &&
 	    (src_cnt <= dma_maxpq(device, 0) ||
 	     dma_maxpq(device, DMA_PREP_CONTINUE) > 0) &&
 	    is_dma_pq_aligned(device, offset, 0, len)) {
+		struct dma_async_tx_descriptor *tx;
+		enum dma_ctrl_flags dma_flags = 0;
+		unsigned char coefs[src_cnt];
+		int i, j;
+
 		/* run the p+q asynchronously */
 		pr_debug("%s: (async) disks: %d len: %zu\n",
 			 __func__, disks, len);
-		return do_async_gen_syndrome(chan, blocks, raid6_gfexp, offset,
-					     disks, len, dma_src, submit);
+
+		/* convert source addresses being careful to collapse 'empty'
+		 * sources and update the coefficients accordingly
+		 */
+		unmap->len = len;
+		for (i = 0, j = 0; i < src_cnt; i++) {
+			if (blocks[i] == NULL)
+				continue;
+			unmap->addr[j] = dma_map_page(device->dev, blocks[i], offset,
+						      len, DMA_TO_DEVICE);
+			coefs[j] = raid6_gfexp[i];
+			unmap->to_cnt++;
+			j++;
+		}
+
+		/*
+		 * DMAs use destinations as sources,
+		 * so use BIDIRECTIONAL mapping
+		 */
+		unmap->bidi_cnt++;
+		if (P(blocks, disks))
+			unmap->addr[j++] = dma_map_page(device->dev, P(blocks, disks),
+							offset, len, DMA_BIDIRECTIONAL);
+		else {
+			unmap->addr[j++] = 0;
+			dma_flags |= DMA_PREP_PQ_DISABLE_P;
+		}
+
+		unmap->bidi_cnt++;
+		if (Q(blocks, disks))
+			unmap->addr[j++] = dma_map_page(device->dev, Q(blocks, disks),
+						       offset, len, DMA_BIDIRECTIONAL);
+		else {
+			unmap->addr[j++] = 0;
+			dma_flags |= DMA_PREP_PQ_DISABLE_Q;
+		}
+
+		tx = do_async_gen_syndrome(chan, coefs, j, unmap, dma_flags, submit);
+		dmaengine_unmap_put(unmap);
+		return tx;
 	}
 
+	dmaengine_unmap_put(unmap);
+
 	/* run the pq synchronously */
 	pr_debug("%s: (sync) disks: %d len: %zu\n", __func__, disks, len);
 
diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index 1b76227..5a3c7c0 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -909,9 +909,12 @@ static void dmaengine_unmap(struct kref *kref)
 		dma_unmap_page(dev, unmap->addr[i], unmap->len,
 			       DMA_FROM_DEVICE);
 	cnt += unmap->bidi_cnt;
-	for (; i < cnt; i++)
+	for (; i < cnt; i++) {
+		if (unmap->addr[i] == 0)
+			continue;
 		dma_unmap_page(dev, unmap->addr[i], unmap->len,
 			       DMA_BIDIRECTIONAL);
+	}
 	kmem_cache_free(__get_unmap_pool(cnt)->cache, unmap);
 }
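
For reference, with this hunk applied the whole release path reads roughly
as follows (a reconstructed sketch built around the hunk above; the lines
outside the hunk are inferred and may differ slightly from the actual
tree).  The new zero check skips the placeholder addresses that the
async_pq conversion stores for disabled P/Q destinations:

static void dmaengine_unmap(struct kref *kref)
{
	struct dmaengine_unmap_data *unmap =
		container_of(kref, typeof(*unmap), kref);
	struct device *dev = unmap->dev;
	int cnt, i;

	/* unmap TO_DEVICE sources first, then FROM_DEVICE and
	 * BIDIRECTIONAL destinations, in the order they were stored
	 * in ->addr[]
	 */
	cnt = unmap->to_cnt;
	for (i = 0; i < cnt; i++)
		dma_unmap_page(dev, unmap->addr[i], unmap->len,
			       DMA_TO_DEVICE);
	cnt += unmap->from_cnt;
	for (; i < cnt; i++)
		dma_unmap_page(dev, unmap->addr[i], unmap->len,
			       DMA_FROM_DEVICE);
	cnt += unmap->bidi_cnt;
	for (; i < cnt; i++) {
		if (unmap->addr[i] == 0)
			continue;	/* disabled P or Q destination */
		dma_unmap_page(dev, unmap->addr[i], unmap->len,
			       DMA_BIDIRECTIONAL);
	}
	kmem_cache_free(__get_unmap_pool(cnt)->cache, unmap);
}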
 



* [PATCH 10/12] async_pq_val: convert to dmaengine_unmap_data
  2012-12-06  9:25 [PATCH 00/12] dmaengine_unmap_data Dan Williams
                   ` (8 preceding siblings ...)
  2012-12-06  9:25 ` [PATCH 09/12] async_pq: " Dan Williams
@ 2012-12-06  9:26 ` Dan Williams
  2012-12-06 15:48   ` Bartlomiej Zolnierkiewicz
  2012-12-06  9:26 ` [PATCH 11/12] dmaengine: remove DMA unmap from drivers Dan Williams
  2012-12-06  9:26 ` [PATCH 12/12] dmaengine: remove DMA unmap flags Dan Williams
  11 siblings, 1 reply; 24+ messages in thread
From: Dan Williams @ 2012-12-06  9:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux, Bartlomiej Zolnierkiewicz, vinod.koul, Tomasz Figa,
	Kyungmin Park, dave.jiang

Use the generic unmap object to unmap dma buffers.
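
The conversion follows the same map/prep/submit/put pattern as the earlier
async_tx patches in this series.  In outline (an illustrative two-buffer,
memcpy-shaped sketch only, not part of this patch; device, chan, src,
dest, len and submit are assumed from the surrounding async_tx context,
and the unmap field names are per the new API):

	struct dmaengine_unmap_data *unmap;

	unmap = dmaengine_get_unmap_data(device->dev, 2, GFP_NOIO);
	if (unmap) {
		unmap->len = len;
		unmap->addr[0] = dma_map_page(device->dev, src, src_off,
					      len, DMA_TO_DEVICE);
		unmap->to_cnt = 1;
		unmap->addr[1] = dma_map_page(device->dev, dest, dest_off,
					      len, DMA_FROM_DEVICE);
		unmap->from_cnt = 1;

		tx = device->device_prep_dma_memcpy(chan, unmap->addr[1],
						    unmap->addr[0], len,
						    dma_prep_flags);
		if (tx) {
			dma_set_unmap(tx, unmap); /* tx takes a reference */
			async_tx_submit(chan, tx, submit);
		}
		dmaengine_unmap_put(unmap); /* drop the local reference */
	}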

Cc: Tomasz Figa <t.figa@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Dan Williams <djbw@fb.com>
---
 crypto/async_tx/async_pq.c |   58 +++++++++++++++++++++++++++-----------------
 1 file changed, 35 insertions(+), 23 deletions(-)

diff --git a/crypto/async_tx/async_pq.c b/crypto/async_tx/async_pq.c
index 1d78984..e5ddb31 100644
--- a/crypto/async_tx/async_pq.c
+++ b/crypto/async_tx/async_pq.c
@@ -288,50 +288,60 @@ async_syndrome_val(struct page **blocks, unsigned int offset, int disks,
 	struct dma_async_tx_descriptor *tx;
 	unsigned char coefs[disks-2];
 	enum dma_ctrl_flags dma_flags = submit->cb_fn ? DMA_PREP_INTERRUPT : 0;
-	dma_addr_t *dma_src = NULL;
-	int src_cnt = 0;
+	struct dmaengine_unmap_data *unmap = NULL;
 
 	BUG_ON(disks < 4);
 
-	if (submit->scribble)
-		dma_src = submit->scribble;
-	else if (sizeof(dma_addr_t) <= sizeof(struct page *))
-		dma_src = (dma_addr_t *) blocks;
+	if (device)
+		unmap = dmaengine_get_unmap_data(device->dev, disks, GFP_NOIO);
 
-	if (dma_src && device && disks <= dma_maxpq(device, 0) &&
+	if (unmap && disks <= dma_maxpq(device, 0) &&
 	    is_dma_pq_aligned(device, offset, 0, len)) {
 		struct device *dev = device->dev;
-		dma_addr_t *pq = &dma_src[disks-2];
-		int i;
+		dma_addr_t pq[2];
+		int i, j = 0, src_cnt = 0;
 
 		pr_debug("%s: (async) disks: %d len: %zu\n",
 			 __func__, disks, len);
-		if (!P(blocks, disks))
+
+		unmap->len = len;
+		for (i = 0; i < disks-2; i++)
+			if (likely(blocks[i])) {
+				unmap->addr[j] = dma_map_page(dev, blocks[i],
+							      offset, len,
+							      DMA_TO_DEVICE);
+				coefs[j] = raid6_gfexp[i];
+				unmap->to_cnt++;
+				src_cnt++;
+				j++;
+			}
+
+		if (!P(blocks, disks)) {
+			pq[0] = 0;
 			dma_flags |= DMA_PREP_PQ_DISABLE_P;
-		else
+		} else {
 			pq[0] = dma_map_page(dev, P(blocks, disks),
 					     offset, len,
 					     DMA_TO_DEVICE);
-		if (!Q(blocks, disks))
+			unmap->addr[j++] = pq[0];
+			unmap->to_cnt++;
+		}
+		if (!Q(blocks, disks)) {
+			pq[1] = 0;
 			dma_flags |= DMA_PREP_PQ_DISABLE_Q;
-		else
+		} else {
 			pq[1] = dma_map_page(dev, Q(blocks, disks),
 					     offset, len,
 					     DMA_TO_DEVICE);
+			unmap->addr[j++] = pq[1];
+			unmap->to_cnt++;
+		}
 
 		if (submit->flags & ASYNC_TX_FENCE)
 			dma_flags |= DMA_PREP_FENCE;
-		for (i = 0; i < disks-2; i++)
-			if (likely(blocks[i])) {
-				dma_src[src_cnt] = dma_map_page(dev, blocks[i],
-								offset, len,
-								DMA_TO_DEVICE);
-				coefs[src_cnt] = raid6_gfexp[i];
-				src_cnt++;
-			}
-
 		for (;;) {
-			tx = device->device_prep_dma_pq_val(chan, pq, dma_src,
+			tx = device->device_prep_dma_pq_val(chan, pq,
+							    unmap->addr,
 							    src_cnt,
 							    coefs,
 							    len, pqres,
@@ -341,6 +351,8 @@ async_syndrome_val(struct page **blocks, unsigned int offset, int disks,
 			async_tx_quiesce(&submit->depend_tx);
 			dma_async_issue_pending(chan);
 		}
+
+		dma_set_unmap(tx, unmap);
 		async_tx_submit(chan, tx, submit);
 
 		return tx;



* [PATCH 11/12] dmaengine: remove DMA unmap from drivers
  2012-12-06  9:25 [PATCH 00/12] dmaengine_unmap_data Dan Williams
                   ` (9 preceding siblings ...)
  2012-12-06  9:26 ` [PATCH 10/12] async_pq_val: " Dan Williams
@ 2012-12-06  9:26 ` Dan Williams
  2012-12-06  9:26 ` [PATCH 12/12] dmaengine: remove DMA unmap flags Dan Williams
  11 siblings, 0 replies; 24+ messages in thread
From: Dan Williams @ 2012-12-06  9:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux, Bartlomiej Zolnierkiewicz, Vinod Koul, Tomasz Figa,
	Kyungmin Park, dave.jiang

From: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>

Remove support for DMA unmapping from drivers as it is no longer
needed (the DMA core now handles it).
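
All of the removed driver-side blocks funnel through the common
dma_descriptor_unmap() hook added earlier in this series ("dmaengine:
prepare for generic 'unmap' data").  For reference, that hook amounts to
the following (a sketch reconstructed from how the drivers below use it):

static inline void
dma_descriptor_unmap(struct dma_async_tx_descriptor *tx)
{
	if (tx->unmap) {
		dmaengine_unmap_put(tx->unmap);
		tx->unmap = NULL;
	}
}

so each driver's completion path collapses to a single call, and the core
drops the descriptor's reference on the shared unmap object.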

Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Tomasz Figa <t.figa@samsung.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Dan Williams <djbw@fb.com>
---
 arch/arm/include/asm/hardware/iop3xx-adma.h |   30 ---
 arch/arm/include/asm/hardware/iop_adma.h    |    4 
 arch/arm/mach-iop13xx/include/mach/adma.h   |   26 ---
 drivers/dma/amba-pl08x.c                    |   31 ---
 drivers/dma/at_hdmac.c                      |   25 ---
 drivers/dma/dw_dmac.c                       |   20 --
 drivers/dma/ep93xx_dma.c                    |   29 ---
 drivers/dma/fsldma.c                        |   16 --
 drivers/dma/ioat/dma.c                      |   16 --
 drivers/dma/ioat/dma.h                      |   12 -
 drivers/dma/ioat/dma_v2.c                   |    1 
 drivers/dma/ioat/dma_v3.c                   |  126 -------------
 drivers/dma/iop-adma.c                      |   98 ----------
 drivers/dma/mv_xor.c                        |   45 -----
 drivers/dma/ppc4xx/adma.c                   |  269 ---------------------------
 drivers/dma/timb_dma.c                      |   36 ----
 drivers/dma/txx9dmac.c                      |   24 --
 17 files changed, 3 insertions(+), 805 deletions(-)

diff --git a/arch/arm/include/asm/hardware/iop3xx-adma.h b/arch/arm/include/asm/hardware/iop3xx-adma.h
index 9b28f12..240b29e 100644
--- a/arch/arm/include/asm/hardware/iop3xx-adma.h
+++ b/arch/arm/include/asm/hardware/iop3xx-adma.h
@@ -393,36 +393,6 @@ static inline int iop_chan_zero_sum_slot_count(size_t len, int src_cnt,
 	return slot_cnt;
 }
 
-static inline int iop_desc_is_pq(struct iop_adma_desc_slot *desc)
-{
-	return 0;
-}
-
-static inline u32 iop_desc_get_dest_addr(struct iop_adma_desc_slot *desc,
-					struct iop_adma_chan *chan)
-{
-	union iop3xx_desc hw_desc = { .ptr = desc->hw_desc, };
-
-	switch (chan->device->id) {
-	case DMA0_ID:
-	case DMA1_ID:
-		return hw_desc.dma->dest_addr;
-	case AAU_ID:
-		return hw_desc.aau->dest_addr;
-	default:
-		BUG();
-	}
-	return 0;
-}
-
-
-static inline u32 iop_desc_get_qdest_addr(struct iop_adma_desc_slot *desc,
-					  struct iop_adma_chan *chan)
-{
-	BUG();
-	return 0;
-}
-
 static inline u32 iop_desc_get_byte_count(struct iop_adma_desc_slot *desc,
 					struct iop_adma_chan *chan)
 {
diff --git a/arch/arm/include/asm/hardware/iop_adma.h b/arch/arm/include/asm/hardware/iop_adma.h
index 122f86d..250760e 100644
--- a/arch/arm/include/asm/hardware/iop_adma.h
+++ b/arch/arm/include/asm/hardware/iop_adma.h
@@ -82,8 +82,6 @@ struct iop_adma_chan {
  * @slot_cnt: total slots used in an transaction (group of operations)
  * @slots_per_op: number of slots per operation
  * @idx: pool index
- * @unmap_src_cnt: number of xor sources
- * @unmap_len: transaction bytecount
  * @tx_list: list of descriptors that are associated with one operation
  * @async_tx: support for the async_tx api
  * @group_list: list of slots that make up a multi-descriptor transaction
@@ -99,8 +97,6 @@ struct iop_adma_desc_slot {
 	u16 slot_cnt;
 	u16 slots_per_op;
 	u16 idx;
-	u16 unmap_src_cnt;
-	size_t unmap_len;
 	struct list_head tx_list;
 	struct dma_async_tx_descriptor async_tx;
 	union {
diff --git a/arch/arm/mach-iop13xx/include/mach/adma.h b/arch/arm/mach-iop13xx/include/mach/adma.h
index 6d3782d..a86fd0e 100644
--- a/arch/arm/mach-iop13xx/include/mach/adma.h
+++ b/arch/arm/mach-iop13xx/include/mach/adma.h
@@ -218,20 +218,6 @@ iop_chan_xor_slot_count(size_t len, int src_cnt, int *slots_per_op)
 #define iop_chan_pq_slot_count iop_chan_xor_slot_count
 #define iop_chan_pq_zero_sum_slot_count iop_chan_xor_slot_count
 
-static inline u32 iop_desc_get_dest_addr(struct iop_adma_desc_slot *desc,
-					struct iop_adma_chan *chan)
-{
-	struct iop13xx_adma_desc_hw *hw_desc = desc->hw_desc;
-	return hw_desc->dest_addr;
-}
-
-static inline u32 iop_desc_get_qdest_addr(struct iop_adma_desc_slot *desc,
-					  struct iop_adma_chan *chan)
-{
-	struct iop13xx_adma_desc_hw *hw_desc = desc->hw_desc;
-	return hw_desc->q_dest_addr;
-}
-
 static inline u32 iop_desc_get_byte_count(struct iop_adma_desc_slot *desc,
 					struct iop_adma_chan *chan)
 {
@@ -350,18 +336,6 @@ iop_desc_init_pq(struct iop_adma_desc_slot *desc, int src_cnt,
 	hw_desc->desc_ctrl = u_desc_ctrl.value;
 }
 
-static inline int iop_desc_is_pq(struct iop_adma_desc_slot *desc)
-{
-	struct iop13xx_adma_desc_hw *hw_desc = desc->hw_desc;
-	union {
-		u32 value;
-		struct iop13xx_adma_desc_ctrl field;
-	} u_desc_ctrl;
-
-	u_desc_ctrl.value = hw_desc->desc_ctrl;
-	return u_desc_ctrl.field.pq_xfer_en;
-}
-
 static inline void
 iop_desc_init_pq_zero_sum(struct iop_adma_desc_slot *desc, int src_cnt,
 			  unsigned long flags)
diff --git a/drivers/dma/amba-pl08x.c b/drivers/dma/amba-pl08x.c
index 4cb2f23..6ff6547 100644
--- a/drivers/dma/amba-pl08x.c
+++ b/drivers/dma/amba-pl08x.c
@@ -1050,43 +1050,12 @@ static void pl08x_free_txd(struct pl08x_driver_data *pl08x,
 	kfree(txd);
 }
 
-static void pl08x_unmap_buffers(struct pl08x_txd *txd)
-{
-	struct device *dev = txd->vd.tx.chan->device->dev;
-	struct pl08x_sg *dsg;
-
-	if (!(txd->vd.tx.flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
-		if (txd->vd.tx.flags & DMA_COMPL_SRC_UNMAP_SINGLE)
-			list_for_each_entry(dsg, &txd->dsg_list, node)
-				dma_unmap_single(dev, dsg->src_addr, dsg->len,
-						DMA_TO_DEVICE);
-		else {
-			list_for_each_entry(dsg, &txd->dsg_list, node)
-				dma_unmap_page(dev, dsg->src_addr, dsg->len,
-						DMA_TO_DEVICE);
-		}
-	}
-	if (!(txd->vd.tx.flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
-		if (txd->vd.tx.flags & DMA_COMPL_DEST_UNMAP_SINGLE)
-			list_for_each_entry(dsg, &txd->dsg_list, node)
-				dma_unmap_single(dev, dsg->dst_addr, dsg->len,
-						DMA_FROM_DEVICE);
-		else
-			list_for_each_entry(dsg, &txd->dsg_list, node)
-				dma_unmap_page(dev, dsg->dst_addr, dsg->len,
-						DMA_FROM_DEVICE);
-	}
-}
-
 static void pl08x_desc_free(struct virt_dma_desc *vd)
 {
 	struct pl08x_txd *txd = to_pl08x_txd(&vd->tx);
 	struct pl08x_dma_chan *plchan = to_pl08x_chan(vd->tx.chan);
 
 	dma_descriptor_unmap(txd);
-	if (!plchan->slave)
-		pl08x_unmap_buffers(txd);
-
 	if (!txd->done)
 		pl08x_release_mux(plchan);
 
diff --git a/drivers/dma/at_hdmac.c b/drivers/dma/at_hdmac.c
index 280ce87..154f9df 100644
--- a/drivers/dma/at_hdmac.c
+++ b/drivers/dma/at_hdmac.c
@@ -252,32 +252,7 @@ atc_chain_complete(struct at_dma_chan *atchan, struct at_desc *desc)
 	/* move myself to free_list */
 	list_move(&desc->desc_node, &atchan->free_list);
 
-	/* unmap dma addresses (not on slave channels) */
 	dma_descriptor_unmap(txd);
-	if (!atchan->chan_common.private) {
-		struct device *parent = chan2parent(&atchan->chan_common);
-		if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
-			if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
-				dma_unmap_single(parent,
-						desc->lli.daddr,
-						desc->len, DMA_FROM_DEVICE);
-			else
-				dma_unmap_page(parent,
-						desc->lli.daddr,
-						desc->len, DMA_FROM_DEVICE);
-		}
-		if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
-			if (txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE)
-				dma_unmap_single(parent,
-						desc->lli.saddr,
-						desc->len, DMA_TO_DEVICE);
-			else
-				dma_unmap_page(parent,
-						desc->lli.saddr,
-						desc->len, DMA_TO_DEVICE);
-		}
-	}
-
 	/* for cyclic transfers,
 	 * no need to replay callback function while stopping */
 	if (!atc_chan_is_cyclic(atchan)) {
diff --git a/drivers/dma/dw_dmac.c b/drivers/dma/dw_dmac.c
index 5d0b58c..f626deb 100644
--- a/drivers/dma/dw_dmac.c
+++ b/drivers/dma/dw_dmac.c
@@ -327,26 +327,6 @@ dwc_descriptor_complete(struct dw_dma_chan *dwc, struct dw_desc *desc,
 	list_move(&desc->desc_node, &dwc->free_list);
 
 	dma_descriptor_unmap(txd);
-	if (!dwc->chan.private) {
-		struct device *parent = chan2parent(&dwc->chan);
-		if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
-			if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
-				dma_unmap_single(parent, desc->lli.dar,
-						desc->len, DMA_FROM_DEVICE);
-			else
-				dma_unmap_page(parent, desc->lli.dar,
-						desc->len, DMA_FROM_DEVICE);
-		}
-		if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
-			if (txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE)
-				dma_unmap_single(parent, desc->lli.sar,
-						desc->len, DMA_TO_DEVICE);
-			else
-				dma_unmap_page(parent, desc->lli.sar,
-						desc->len, DMA_TO_DEVICE);
-		}
-	}
-
 	spin_unlock_irqrestore(&dwc->lock, flags);
 
 	if (callback_required && callback)
diff --git a/drivers/dma/ep93xx_dma.c b/drivers/dma/ep93xx_dma.c
index 5d08aeb..493392e 100644
--- a/drivers/dma/ep93xx_dma.c
+++ b/drivers/dma/ep93xx_dma.c
@@ -733,28 +733,6 @@ static void ep93xx_dma_advance_work(struct ep93xx_dma_chan *edmac)
 	spin_unlock_irqrestore(&edmac->lock, flags);
 }
 
-static void ep93xx_dma_unmap_buffers(struct ep93xx_dma_desc *desc)
-{
-	struct device *dev = desc->txd.chan->device->dev;
-
-	if (!(desc->txd.flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
-		if (desc->txd.flags & DMA_COMPL_SRC_UNMAP_SINGLE)
-			dma_unmap_single(dev, desc->src_addr, desc->size,
-					 DMA_TO_DEVICE);
-		else
-			dma_unmap_page(dev, desc->src_addr, desc->size,
-				       DMA_TO_DEVICE);
-	}
-	if (!(desc->txd.flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
-		if (desc->txd.flags & DMA_COMPL_DEST_UNMAP_SINGLE)
-			dma_unmap_single(dev, desc->dst_addr, desc->size,
-					 DMA_FROM_DEVICE);
-		else
-			dma_unmap_page(dev, desc->dst_addr, desc->size,
-				       DMA_FROM_DEVICE);
-	}
-}
-
 static void ep93xx_dma_tasklet(unsigned long data)
 {
 	struct ep93xx_dma_chan *edmac = (struct ep93xx_dma_chan *)data;
@@ -787,14 +765,7 @@ static void ep93xx_dma_tasklet(unsigned long data)
 
 	/* Now we can release all the chained descriptors */
 	list_for_each_entry_safe(desc, d, &list, node) {
-		/*
-		 * For the memcpy channels the API requires us to unmap the
-		 * buffers unless requested otherwise.
-		 */
 		dma_descriptor_unmap(&desc->txd);
-		if (!edmac->chan.private)
-			ep93xx_dma_unmap_buffers(desc);
-
 		ep93xx_dma_desc_put(edmac, desc);
 	}
 
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index 7e4e44c..e3aad8d 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -869,22 +869,6 @@ static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
 	dma_run_dependencies(txd);
 
 	dma_descriptor_unmap(txd);
-	/* Unmap the dst buffer, if requested */
-	if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
-		if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
-			dma_unmap_single(dev, dst, len, DMA_FROM_DEVICE);
-		else
-			dma_unmap_page(dev, dst, len, DMA_FROM_DEVICE);
-	}
-
-	/* Unmap the src buffer, if requested */
-	if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
-		if (txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE)
-			dma_unmap_single(dev, src, len, DMA_TO_DEVICE);
-		else
-			dma_unmap_page(dev, src, len, DMA_TO_DEVICE);
-	}
-
 #ifdef FSL_DMA_LD_DEBUG
 	chan_dbg(chan, "LD %p free\n", desc);
 #endif
diff --git a/drivers/dma/ioat/dma.c b/drivers/dma/ioat/dma.c
index 38918cf..b01df57 100644
--- a/drivers/dma/ioat/dma.c
+++ b/drivers/dma/ioat/dma.c
@@ -531,21 +531,6 @@ static void ioat1_cleanup_event(unsigned long data)
 	writew(IOAT_CHANCTRL_RUN, ioat->base.reg_base + IOAT_CHANCTRL_OFFSET);
 }
 
-void ioat_dma_unmap(struct ioat_chan_common *chan, enum dma_ctrl_flags flags,
-		    size_t len, struct ioat_dma_descriptor *hw)
-{
-	struct pci_dev *pdev = chan->device->pdev;
-	size_t offset = len - hw->size;
-
-	if (!(flags & DMA_COMPL_SKIP_DEST_UNMAP))
-		ioat_unmap(pdev, hw->dst_addr - offset, len,
-			   PCI_DMA_FROMDEVICE, flags, 1);
-
-	if (!(flags & DMA_COMPL_SKIP_SRC_UNMAP))
-		ioat_unmap(pdev, hw->src_addr - offset, len,
-			   PCI_DMA_TODEVICE, flags, 0);
-}
-
 dma_addr_t ioat_get_current_completion(struct ioat_chan_common *chan)
 {
 	dma_addr_t phys_complete;
@@ -603,7 +588,6 @@ static void __cleanup(struct ioat_dma_chan *ioat, dma_addr_t phys_complete)
 		if (tx->cookie) {
 			dma_cookie_complete(tx);
 			dma_descriptor_unmap(tx);
-			ioat_dma_unmap(chan, tx->flags, desc->len, desc->hw);
 			ioat->active -= desc->hw->tx_cnt;
 			if (tx->callback) {
 				tx->callback(tx->callback_param);
diff --git a/drivers/dma/ioat/dma.h b/drivers/dma/ioat/dma.h
index 5e8fe01..4e9e027 100644
--- a/drivers/dma/ioat/dma.h
+++ b/drivers/dma/ioat/dma.h
@@ -293,16 +293,6 @@ static inline bool is_ioat_bug(unsigned long err)
 	return !!err;
 }
 
-static inline void ioat_unmap(struct pci_dev *pdev, dma_addr_t addr, size_t len,
-			      int direction, enum dma_ctrl_flags flags, bool dst)
-{
-	if ((dst && (flags & DMA_COMPL_DEST_UNMAP_SINGLE)) ||
-	    (!dst && (flags & DMA_COMPL_SRC_UNMAP_SINGLE)))
-		pci_unmap_single(pdev, addr, len, direction);
-	else
-		pci_unmap_page(pdev, addr, len, direction);
-}
-
 int __devinit ioat_probe(struct ioatdma_device *device);
 int __devinit ioat_register(struct ioatdma_device *device);
 int __devinit ioat1_dma_probe(struct ioatdma_device *dev, int dca);
@@ -315,8 +305,6 @@ void ioat_init_channel(struct ioatdma_device *device,
 		       struct ioat_chan_common *chan, int idx);
 enum dma_status ioat_dma_tx_status(struct dma_chan *c, dma_cookie_t cookie,
 				   struct dma_tx_state *txstate);
-void ioat_dma_unmap(struct ioat_chan_common *chan, enum dma_ctrl_flags flags,
-		    size_t len, struct ioat_dma_descriptor *hw);
 bool ioat_cleanup_preamble(struct ioat_chan_common *chan,
 			   dma_addr_t *phys_complete);
 void ioat_kobject_add(struct ioatdma_device *device, struct kobj_type *type);
diff --git a/drivers/dma/ioat/dma_v2.c b/drivers/dma/ioat/dma_v2.c
index 2714b0e..e786ef6 100644
--- a/drivers/dma/ioat/dma_v2.c
+++ b/drivers/dma/ioat/dma_v2.c
@@ -149,7 +149,6 @@ static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete)
 		dump_desc_dbg(ioat, desc);
 		if (tx->cookie) {
 			dma_descriptor_unmap(tx);
-			ioat_dma_unmap(chan, tx->flags, desc->len, desc->hw);
 			dma_cookie_complete(tx);
 			if (tx->callback) {
 				tx->callback(tx->callback_param);
diff --git a/drivers/dma/ioat/dma_v3.c b/drivers/dma/ioat/dma_v3.c
index 70385d5..cdf37ff 100644
--- a/drivers/dma/ioat/dma_v3.c
+++ b/drivers/dma/ioat/dma_v3.c
@@ -79,13 +79,6 @@ static const u8 xor_idx_to_field[] = { 1, 4, 5, 6, 7, 0, 1, 2 };
 static const u8 pq_idx_to_desc = 0xf8;
 static const u8 pq_idx_to_field[] = { 1, 4, 5, 0, 1, 2, 4, 5 };
 
-static dma_addr_t xor_get_src(struct ioat_raw_descriptor *descs[2], int idx)
-{
-	struct ioat_raw_descriptor *raw = descs[xor_idx_to_desc >> idx & 1];
-
-	return raw->field[xor_idx_to_field[idx]];
-}
-
 static void xor_set_src(struct ioat_raw_descriptor *descs[2],
 			dma_addr_t addr, u32 offset, int idx)
 {
@@ -111,124 +104,6 @@ static void pq_set_src(struct ioat_raw_descriptor *descs[2],
 	pq->coef[idx] = coef;
 }
 
-static void ioat3_dma_unmap(struct ioat2_dma_chan *ioat,
-			    struct ioat_ring_ent *desc, int idx)
-{
-	struct ioat_chan_common *chan = &ioat->base;
-	struct pci_dev *pdev = chan->device->pdev;
-	size_t len = desc->len;
-	size_t offset = len - desc->hw->size;
-	struct dma_async_tx_descriptor *tx = &desc->txd;
-	enum dma_ctrl_flags flags = tx->flags;
-
-	switch (desc->hw->ctl_f.op) {
-	case IOAT_OP_COPY:
-		if (!desc->hw->ctl_f.null) /* skip 'interrupt' ops */
-			ioat_dma_unmap(chan, flags, len, desc->hw);
-		break;
-	case IOAT_OP_FILL: {
-		struct ioat_fill_descriptor *hw = desc->fill;
-
-		if (!(flags & DMA_COMPL_SKIP_DEST_UNMAP))
-			ioat_unmap(pdev, hw->dst_addr - offset, len,
-				   PCI_DMA_FROMDEVICE, flags, 1);
-		break;
-	}
-	case IOAT_OP_XOR_VAL:
-	case IOAT_OP_XOR: {
-		struct ioat_xor_descriptor *xor = desc->xor;
-		struct ioat_ring_ent *ext;
-		struct ioat_xor_ext_descriptor *xor_ex = NULL;
-		int src_cnt = src_cnt_to_sw(xor->ctl_f.src_cnt);
-		struct ioat_raw_descriptor *descs[2];
-		int i;
-
-		if (src_cnt > 5) {
-			ext = ioat2_get_ring_ent(ioat, idx + 1);
-			xor_ex = ext->xor_ex;
-		}
-
-		if (!(flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
-			descs[0] = (struct ioat_raw_descriptor *) xor;
-			descs[1] = (struct ioat_raw_descriptor *) xor_ex;
-			for (i = 0; i < src_cnt; i++) {
-				dma_addr_t src = xor_get_src(descs, i);
-
-				ioat_unmap(pdev, src - offset, len,
-					   PCI_DMA_TODEVICE, flags, 0);
-			}
-
-			/* dest is a source in xor validate operations */
-			if (xor->ctl_f.op == IOAT_OP_XOR_VAL) {
-				ioat_unmap(pdev, xor->dst_addr - offset, len,
-					   PCI_DMA_TODEVICE, flags, 1);
-				break;
-			}
-		}
-
-		if (!(flags & DMA_COMPL_SKIP_DEST_UNMAP))
-			ioat_unmap(pdev, xor->dst_addr - offset, len,
-				   PCI_DMA_FROMDEVICE, flags, 1);
-		break;
-	}
-	case IOAT_OP_PQ_VAL:
-	case IOAT_OP_PQ: {
-		struct ioat_pq_descriptor *pq = desc->pq;
-		struct ioat_ring_ent *ext;
-		struct ioat_pq_ext_descriptor *pq_ex = NULL;
-		int src_cnt = src_cnt_to_sw(pq->ctl_f.src_cnt);
-		struct ioat_raw_descriptor *descs[2];
-		int i;
-
-		if (src_cnt > 3) {
-			ext = ioat2_get_ring_ent(ioat, idx + 1);
-			pq_ex = ext->pq_ex;
-		}
-
-		/* in the 'continue' case don't unmap the dests as sources */
-		if (dmaf_p_disabled_continue(flags))
-			src_cnt--;
-		else if (dmaf_continue(flags))
-			src_cnt -= 3;
-
-		if (!(flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
-			descs[0] = (struct ioat_raw_descriptor *) pq;
-			descs[1] = (struct ioat_raw_descriptor *) pq_ex;
-			for (i = 0; i < src_cnt; i++) {
-				dma_addr_t src = pq_get_src(descs, i);
-
-				ioat_unmap(pdev, src - offset, len,
-					   PCI_DMA_TODEVICE, flags, 0);
-			}
-
-			/* the dests are sources in pq validate operations */
-			if (pq->ctl_f.op == IOAT_OP_XOR_VAL) {
-				if (!(flags & DMA_PREP_PQ_DISABLE_P))
-					ioat_unmap(pdev, pq->p_addr - offset,
-						   len, PCI_DMA_TODEVICE, flags, 0);
-				if (!(flags & DMA_PREP_PQ_DISABLE_Q))
-					ioat_unmap(pdev, pq->q_addr - offset,
-						   len, PCI_DMA_TODEVICE, flags, 0);
-				break;
-			}
-		}
-
-		if (!(flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
-			if (!(flags & DMA_PREP_PQ_DISABLE_P))
-				ioat_unmap(pdev, pq->p_addr - offset, len,
-					   PCI_DMA_BIDIRECTIONAL, flags, 1);
-			if (!(flags & DMA_PREP_PQ_DISABLE_Q))
-				ioat_unmap(pdev, pq->q_addr - offset, len,
-					   PCI_DMA_BIDIRECTIONAL, flags, 1);
-		}
-		break;
-	}
-	default:
-		dev_err(&pdev->dev, "%s: unknown op type: %#x\n",
-			__func__, desc->hw->ctl_f.op);
-	}
-}
-
 static bool desc_has_ext(struct ioat_ring_ent *desc)
 {
 	struct ioat_dma_descriptor *hw = desc->hw;
@@ -280,7 +155,6 @@ static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete)
 		if (tx->cookie) {
 			dma_cookie_complete(tx);
 			dma_descriptor_unmap(tx);
-			ioat3_dma_unmap(ioat, desc, idx + i);
 			if (tx->callback) {
 				tx->callback(tx->callback_param);
 				tx->callback = NULL;
diff --git a/drivers/dma/iop-adma.c b/drivers/dma/iop-adma.c
index 32f5d46..4175bb2 100644
--- a/drivers/dma/iop-adma.c
+++ b/drivers/dma/iop-adma.c
@@ -61,80 +61,6 @@ static void iop_adma_free_slots(struct iop_adma_desc_slot *slot)
 	}
 }
 
-static void
-iop_desc_unmap(struct iop_adma_chan *iop_chan, struct iop_adma_desc_slot *desc)
-{
-	struct dma_async_tx_descriptor *tx = &desc->async_tx;
-	struct iop_adma_desc_slot *unmap = desc->group_head;
-	struct device *dev = &iop_chan->device->pdev->dev;
-	u32 len = unmap->unmap_len;
-	enum dma_ctrl_flags flags = tx->flags;
-	u32 src_cnt;
-	dma_addr_t addr;
-	dma_addr_t dest;
-
-	src_cnt = unmap->unmap_src_cnt;
-	dest = iop_desc_get_dest_addr(unmap, iop_chan);
-	if (!(flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
-		enum dma_data_direction dir;
-
-		if (src_cnt > 1) /* is xor? */
-			dir = DMA_BIDIRECTIONAL;
-		else
-			dir = DMA_FROM_DEVICE;
-
-		dma_unmap_page(dev, dest, len, dir);
-	}
-
-	if (!(flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
-		while (src_cnt--) {
-			addr = iop_desc_get_src_addr(unmap, iop_chan, src_cnt);
-			if (addr == dest)
-				continue;
-			dma_unmap_page(dev, addr, len, DMA_TO_DEVICE);
-		}
-	}
-	desc->group_head = NULL;
-}
-
-static void
-iop_desc_unmap_pq(struct iop_adma_chan *iop_chan, struct iop_adma_desc_slot *desc)
-{
-	struct dma_async_tx_descriptor *tx = &desc->async_tx;
-	struct iop_adma_desc_slot *unmap = desc->group_head;
-	struct device *dev = &iop_chan->device->pdev->dev;
-	u32 len = unmap->unmap_len;
-	enum dma_ctrl_flags flags = tx->flags;
-	u32 src_cnt = unmap->unmap_src_cnt;
-	dma_addr_t pdest = iop_desc_get_dest_addr(unmap, iop_chan);
-	dma_addr_t qdest = iop_desc_get_qdest_addr(unmap, iop_chan);
-	int i;
-
-	if (tx->flags & DMA_PREP_CONTINUE)
-		src_cnt -= 3;
-
-	if (!(flags & DMA_COMPL_SKIP_DEST_UNMAP) && !desc->pq_check_result) {
-		dma_unmap_page(dev, pdest, len, DMA_BIDIRECTIONAL);
-		dma_unmap_page(dev, qdest, len, DMA_BIDIRECTIONAL);
-	}
-
-	if (!(flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
-		dma_addr_t addr;
-
-		for (i = 0; i < src_cnt; i++) {
-			addr = iop_desc_get_src_addr(unmap, iop_chan, i);
-			dma_unmap_page(dev, addr, len, DMA_TO_DEVICE);
-		}
-		if (desc->pq_check_result) {
-			dma_unmap_page(dev, pdest, len, DMA_TO_DEVICE);
-			dma_unmap_page(dev, qdest, len, DMA_TO_DEVICE);
-		}
-	}
-
-	desc->group_head = NULL;
-}
-
-
 static dma_cookie_t
 iop_adma_run_tx_complete_actions(struct iop_adma_desc_slot *desc,
 	struct iop_adma_chan *iop_chan, dma_cookie_t cookie)
@@ -153,15 +79,8 @@ iop_adma_run_tx_complete_actions(struct iop_adma_desc_slot *desc,
 			tx->callback(tx->callback_param);
 
 		dma_descriptor_unmap(tx);
-		/* unmap dma addresses
-		 * (unmap_single vs unmap_page?)
-		 */
-		if (desc->group_head && desc->unmap_len) {
-			if (iop_desc_is_pq(desc))
-				iop_desc_unmap_pq(iop_chan, desc);
-			else
-				iop_desc_unmap(iop_chan, desc);
-		}
+		if (desc->group_head)
+			desc->group_head = NULL;
 	}
 
 	/* run dependent operations */
@@ -592,7 +511,6 @@ iop_adma_prep_dma_interrupt(struct dma_chan *chan, unsigned long flags)
 	if (sw_desc) {
 		grp_start = sw_desc->group_head;
 		iop_desc_init_interrupt(grp_start, iop_chan);
-		grp_start->unmap_len = 0;
 		sw_desc->async_tx.flags = flags;
 	}
 	spin_unlock_bh(&iop_chan->lock);
@@ -624,8 +542,6 @@ iop_adma_prep_dma_memcpy(struct dma_chan *chan, dma_addr_t dma_dest,
 		iop_desc_set_byte_count(grp_start, iop_chan, len);
 		iop_desc_set_dest_addr(grp_start, iop_chan, dma_dest);
 		iop_desc_set_memcpy_src_addr(grp_start, dma_src);
-		sw_desc->unmap_src_cnt = 1;
-		sw_desc->unmap_len = len;
 		sw_desc->async_tx.flags = flags;
 	}
 	spin_unlock_bh(&iop_chan->lock);
@@ -657,8 +573,6 @@ iop_adma_prep_dma_memset(struct dma_chan *chan, dma_addr_t dma_dest,
 		iop_desc_set_byte_count(grp_start, iop_chan, len);
 		iop_desc_set_block_fill_val(grp_start, value);
 		iop_desc_set_dest_addr(grp_start, iop_chan, dma_dest);
-		sw_desc->unmap_src_cnt = 1;
-		sw_desc->unmap_len = len;
 		sw_desc->async_tx.flags = flags;
 	}
 	spin_unlock_bh(&iop_chan->lock);
@@ -691,8 +605,6 @@ iop_adma_prep_dma_xor(struct dma_chan *chan, dma_addr_t dma_dest,
 		iop_desc_init_xor(grp_start, src_cnt, flags);
 		iop_desc_set_byte_count(grp_start, iop_chan, len);
 		iop_desc_set_dest_addr(grp_start, iop_chan, dma_dest);
-		sw_desc->unmap_src_cnt = src_cnt;
-		sw_desc->unmap_len = len;
 		sw_desc->async_tx.flags = flags;
 		while (src_cnt--)
 			iop_desc_set_xor_src_addr(grp_start, src_cnt,
@@ -728,8 +640,6 @@ iop_adma_prep_dma_xor_val(struct dma_chan *chan, dma_addr_t *dma_src,
 		grp_start->xor_check_result = result;
 		pr_debug("\t%s: grp_start->xor_check_result: %p\n",
 			__func__, grp_start->xor_check_result);
-		sw_desc->unmap_src_cnt = src_cnt;
-		sw_desc->unmap_len = len;
 		sw_desc->async_tx.flags = flags;
 		while (src_cnt--)
 			iop_desc_set_zero_sum_src_addr(grp_start, src_cnt,
@@ -782,8 +692,6 @@ iop_adma_prep_dma_pq(struct dma_chan *chan, dma_addr_t *dst, dma_addr_t *src,
 			dst[0] = dst[1] & 0x7;
 
 		iop_desc_set_pq_addr(g, dst);
-		sw_desc->unmap_src_cnt = src_cnt;
-		sw_desc->unmap_len = len;
 		sw_desc->async_tx.flags = flags;
 		for (i = 0; i < src_cnt; i++)
 			iop_desc_set_pq_src_addr(g, i, src[i], scf[i]);
@@ -838,8 +746,6 @@ iop_adma_prep_dma_pq_val(struct dma_chan *chan, dma_addr_t *pq, dma_addr_t *src,
 		g->pq_check_result = pqres;
 		pr_debug("\t%s: g->pq_check_result: %p\n",
 			__func__, g->pq_check_result);
-		sw_desc->unmap_src_cnt = src_cnt+2;
-		sw_desc->unmap_len = len;
 		sw_desc->async_tx.flags = flags;
 		while (src_cnt--)
 			iop_desc_set_pq_zero_sum_src_addr(g, src_cnt,
diff --git a/drivers/dma/mv_xor.c b/drivers/dma/mv_xor.c
index 4a5c073..7d0ba4a 100644
--- a/drivers/dma/mv_xor.c
+++ b/drivers/dma/mv_xor.c
@@ -57,14 +57,6 @@ static u32 mv_desc_get_dest_addr(struct mv_xor_desc_slot *desc)
 	return hw_desc->phy_dest_addr;
 }
 
-static u32 mv_desc_get_src_addr(struct mv_xor_desc_slot *desc,
-				int src_idx)
-{
-	struct mv_xor_desc *hw_desc = desc->hw_desc;
-	return hw_desc->phy_src_addr[src_idx];
-}
-
-
 static void mv_desc_set_byte_count(struct mv_xor_desc_slot *desc,
 				   u32 byte_count)
 {
@@ -304,43 +296,8 @@ mv_xor_run_tx_complete_actions(struct mv_xor_desc_slot *desc,
 				desc->async_tx.callback_param);
 
 		dma_descriptor_unmap(&desc->async_tx);
-		/* unmap dma addresses
-		 * (unmap_single vs unmap_page?)
-		 */
-		if (desc->group_head && desc->unmap_len) {
-			struct mv_xor_desc_slot *unmap = desc->group_head;
-			struct device *dev =
-				&mv_chan->device->pdev->dev;
-			u32 len = unmap->unmap_len;
-			enum dma_ctrl_flags flags = desc->async_tx.flags;
-			u32 src_cnt;
-			dma_addr_t addr;
-			dma_addr_t dest;
-
-			src_cnt = unmap->unmap_src_cnt;
-			dest = mv_desc_get_dest_addr(unmap);
-			if (!(flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
-				enum dma_data_direction dir;
-
-				if (src_cnt > 1) /* is xor ? */
-					dir = DMA_BIDIRECTIONAL;
-				else
-					dir = DMA_FROM_DEVICE;
-				dma_unmap_page(dev, dest, len, dir);
-			}
-
-			if (!(flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
-				while (src_cnt--) {
-					addr = mv_desc_get_src_addr(unmap,
-								    src_cnt);
-					if (addr == dest)
-						continue;
-					dma_unmap_page(dev, addr, len,
-						       DMA_TO_DEVICE);
-				}
-			}
+		if (desc->group_head)
 			desc->group_head = NULL;
-		}
 	}
 
 	/* run dependent operations */
diff --git a/drivers/dma/ppc4xx/adma.c b/drivers/dma/ppc4xx/adma.c
index 883b343..4c7f53b 100644
--- a/drivers/dma/ppc4xx/adma.c
+++ b/drivers/dma/ppc4xx/adma.c
@@ -802,218 +802,6 @@ static void ppc440spe_desc_set_link(struct ppc440spe_adma_chan *chan,
 }
 
 /**
- * ppc440spe_desc_get_src_addr - extract the source address from the descriptor
- */
-static u32 ppc440spe_desc_get_src_addr(struct ppc440spe_adma_desc_slot *desc,
-				struct ppc440spe_adma_chan *chan, int src_idx)
-{
-	struct dma_cdb *dma_hw_desc;
-	struct xor_cb *xor_hw_desc;
-
-	switch (chan->device->id) {
-	case PPC440SPE_DMA0_ID:
-	case PPC440SPE_DMA1_ID:
-		dma_hw_desc = desc->hw_desc;
-		/* May have 0, 1, 2, or 3 sources */
-		switch (dma_hw_desc->opc) {
-		case DMA_CDB_OPC_NO_OP:
-		case DMA_CDB_OPC_DFILL128:
-			return 0;
-		case DMA_CDB_OPC_DCHECK128:
-			if (unlikely(src_idx)) {
-				printk(KERN_ERR "%s: try to get %d source for"
-				    " DCHECK128\n", __func__, src_idx);
-				BUG();
-			}
-			return le32_to_cpu(dma_hw_desc->sg1l);
-		case DMA_CDB_OPC_MULTICAST:
-		case DMA_CDB_OPC_MV_SG1_SG2:
-			if (unlikely(src_idx > 2)) {
-				printk(KERN_ERR "%s: try to get %d source from"
-				    " DMA descr\n", __func__, src_idx);
-				BUG();
-			}
-			if (src_idx) {
-				if (le32_to_cpu(dma_hw_desc->sg1u) &
-				    DMA_CUED_XOR_WIN_MSK) {
-					u8 region;
-
-					if (src_idx == 1)
-						return le32_to_cpu(
-						    dma_hw_desc->sg1l) +
-							desc->unmap_len;
-
-					region = (le32_to_cpu(
-					    dma_hw_desc->sg1u)) >>
-						DMA_CUED_REGION_OFF;
-
-					region &= DMA_CUED_REGION_MSK;
-					switch (region) {
-					case DMA_RXOR123:
-						return le32_to_cpu(
-						    dma_hw_desc->sg1l) +
-							(desc->unmap_len << 1);
-					case DMA_RXOR124:
-						return le32_to_cpu(
-						    dma_hw_desc->sg1l) +
-							(desc->unmap_len * 3);
-					case DMA_RXOR125:
-						return le32_to_cpu(
-						    dma_hw_desc->sg1l) +
-							(desc->unmap_len << 2);
-					default:
-						printk(KERN_ERR
-						    "%s: try to"
-						    " get src3 for region %02x"
-						    "PPC440SPE_DESC_RXOR12?\n",
-						    __func__, region);
-						BUG();
-					}
-				} else {
-					printk(KERN_ERR
-						"%s: try to get %d"
-						" source for non-cued descr\n",
-						__func__, src_idx);
-					BUG();
-				}
-			}
-			return le32_to_cpu(dma_hw_desc->sg1l);
-		default:
-			printk(KERN_ERR "%s: unknown OPC 0x%02x\n",
-				__func__, dma_hw_desc->opc);
-			BUG();
-		}
-		return le32_to_cpu(dma_hw_desc->sg1l);
-	case PPC440SPE_XOR_ID:
-		/* May have up to 16 sources */
-		xor_hw_desc = desc->hw_desc;
-		return xor_hw_desc->ops[src_idx].l;
-	}
-	return 0;
-}
-
-/**
- * ppc440spe_desc_get_dest_addr - extract the destination address from the
- * descriptor
- */
-static u32 ppc440spe_desc_get_dest_addr(struct ppc440spe_adma_desc_slot *desc,
-				struct ppc440spe_adma_chan *chan, int idx)
-{
-	struct dma_cdb *dma_hw_desc;
-	struct xor_cb *xor_hw_desc;
-
-	switch (chan->device->id) {
-	case PPC440SPE_DMA0_ID:
-	case PPC440SPE_DMA1_ID:
-		dma_hw_desc = desc->hw_desc;
-
-		if (likely(!idx))
-			return le32_to_cpu(dma_hw_desc->sg2l);
-		return le32_to_cpu(dma_hw_desc->sg3l);
-	case PPC440SPE_XOR_ID:
-		xor_hw_desc = desc->hw_desc;
-		return xor_hw_desc->cbtal;
-	}
-	return 0;
-}
-
-/**
- * ppc440spe_desc_get_src_num - extract the number of source addresses from
- * the descriptor
- */
-static u32 ppc440spe_desc_get_src_num(struct ppc440spe_adma_desc_slot *desc,
-				struct ppc440spe_adma_chan *chan)
-{
-	struct dma_cdb *dma_hw_desc;
-	struct xor_cb *xor_hw_desc;
-
-	switch (chan->device->id) {
-	case PPC440SPE_DMA0_ID:
-	case PPC440SPE_DMA1_ID:
-		dma_hw_desc = desc->hw_desc;
-
-		switch (dma_hw_desc->opc) {
-		case DMA_CDB_OPC_NO_OP:
-		case DMA_CDB_OPC_DFILL128:
-			return 0;
-		case DMA_CDB_OPC_DCHECK128:
-			return 1;
-		case DMA_CDB_OPC_MV_SG1_SG2:
-		case DMA_CDB_OPC_MULTICAST:
-			/*
-			 * Only for RXOR operations we have more than
-			 * one source
-			 */
-			if (le32_to_cpu(dma_hw_desc->sg1u) &
-			    DMA_CUED_XOR_WIN_MSK) {
-				/* RXOR op, there are 2 or 3 sources */
-				if (((le32_to_cpu(dma_hw_desc->sg1u) >>
-				    DMA_CUED_REGION_OFF) &
-				      DMA_CUED_REGION_MSK) == DMA_RXOR12) {
-					/* RXOR 1-2 */
-					return 2;
-				} else {
-					/* RXOR 1-2-3/1-2-4/1-2-5 */
-					return 3;
-				}
-			}
-			return 1;
-		default:
-			printk(KERN_ERR "%s: unknown OPC 0x%02x\n",
-				__func__, dma_hw_desc->opc);
-			BUG();
-		}
-	case PPC440SPE_XOR_ID:
-		/* up to 16 sources */
-		xor_hw_desc = desc->hw_desc;
-		return xor_hw_desc->cbc & XOR_CDCR_OAC_MSK;
-	default:
-		BUG();
-	}
-	return 0;
-}
-
-/**
- * ppc440spe_desc_get_dst_num - get the number of destination addresses in
- * this descriptor
- */
-static u32 ppc440spe_desc_get_dst_num(struct ppc440spe_adma_desc_slot *desc,
-				struct ppc440spe_adma_chan *chan)
-{
-	struct dma_cdb *dma_hw_desc;
-
-	switch (chan->device->id) {
-	case PPC440SPE_DMA0_ID:
-	case PPC440SPE_DMA1_ID:
-		/* May be 1 or 2 destinations */
-		dma_hw_desc = desc->hw_desc;
-		switch (dma_hw_desc->opc) {
-		case DMA_CDB_OPC_NO_OP:
-		case DMA_CDB_OPC_DCHECK128:
-			return 0;
-		case DMA_CDB_OPC_MV_SG1_SG2:
-		case DMA_CDB_OPC_DFILL128:
-			return 1;
-		case DMA_CDB_OPC_MULTICAST:
-			if (desc->dst_cnt == 2)
-				return 2;
-			else
-				return 1;
-		default:
-			printk(KERN_ERR "%s: unknown OPC 0x%02x\n",
-				__func__, dma_hw_desc->opc);
-			BUG();
-		}
-	case PPC440SPE_XOR_ID:
-		/* Always only 1 destination */
-		return 1;
-	default:
-		BUG();
-	}
-	return 0;
-}
-
-/**
  * ppc440spe_desc_get_link - get the address of the descriptor that
  * follows this one
  */
@@ -1705,43 +1493,6 @@ static void ppc440spe_adma_free_slots(struct ppc440spe_adma_desc_slot *slot,
 	}
 }
 
-static void ppc440spe_adma_unmap(struct ppc440spe_adma_chan *chan,
-				 struct ppc440spe_adma_desc_slot *desc)
-{
-	u32 src_cnt, dst_cnt;
-	dma_addr_t addr;
-
-	/*
-	 * get the number of sources & destination
-	 * included in this descriptor and unmap
-	 * them all
-	 */
-	src_cnt = ppc440spe_desc_get_src_num(desc, chan);
-	dst_cnt = ppc440spe_desc_get_dst_num(desc, chan);
-
-	/* unmap destinations */
-	if (!(desc->async_tx.flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
-		while (dst_cnt--) {
-			addr = ppc440spe_desc_get_dest_addr(
-				desc, chan, dst_cnt);
-			dma_unmap_page(chan->device->dev,
-					addr, desc->unmap_len,
-					DMA_FROM_DEVICE);
-		}
-	}
-
-	/* unmap sources */
-	if (!(desc->async_tx.flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
-		while (src_cnt--) {
-			addr = ppc440spe_desc_get_src_addr(
-				desc, chan, src_cnt);
-			dma_unmap_page(chan->device->dev,
-					addr, desc->unmap_len,
-					DMA_TO_DEVICE);
-		}
-	}
-}
-
 /**
  * ppc440spe_adma_run_tx_complete_actions - call functions to be called
  * upon completion
@@ -1766,26 +1517,6 @@ static dma_cookie_t ppc440spe_adma_run_tx_complete_actions(
 				desc->async_tx.callback_param);
 
 		dma_descriptor_unmap(&desc->async_tx);
-		/* unmap dma addresses
-		 * (unmap_single vs unmap_page?)
-		 *
-		 * actually, ppc's dma_unmap_page() functions are empty, so
-		 * the following code is just for the sake of completeness
-		 */
-		if (chan && chan->needs_unmap && desc->group_head &&
-		     desc->unmap_len) {
-			struct ppc440spe_adma_desc_slot *unmap =
-							desc->group_head;
-			/* assume 1 slot per op always */
-			u32 slot_count = unmap->slot_cnt;
-
-			/* Run through the group list and unmap addresses */
-			for (i = 0; i < slot_count; i++) {
-				BUG_ON(!unmap);
-				ppc440spe_adma_unmap(chan, unmap);
-				unmap = unmap->hw_next;
-			}
-		}
 	}
 
 	/* run dependent operations */
diff --git a/drivers/dma/timb_dma.c b/drivers/dma/timb_dma.c
index 4b82112..866a97c 100644
--- a/drivers/dma/timb_dma.c
+++ b/drivers/dma/timb_dma.c
@@ -154,38 +154,6 @@ static bool __td_dma_done_ack(struct timb_dma_chan *td_chan)
 	return done;
 }
 
-static void __td_unmap_desc(struct timb_dma_chan *td_chan, const u8 *dma_desc,
-	bool single)
-{
-	dma_addr_t addr;
-	int len;
-
-	addr = (dma_desc[7] << 24) | (dma_desc[6] << 16) | (dma_desc[5] << 8) |
-		dma_desc[4];
-
-	len = (dma_desc[3] << 8) | dma_desc[2];
-
-	if (single)
-		dma_unmap_single(chan2dev(&td_chan->chan), addr, len,
-			DMA_TO_DEVICE);
-	else
-		dma_unmap_page(chan2dev(&td_chan->chan), addr, len,
-			DMA_TO_DEVICE);
-}
-
-static void __td_unmap_descs(struct timb_dma_desc *td_desc, bool single)
-{
-	struct timb_dma_chan *td_chan = container_of(td_desc->txd.chan,
-		struct timb_dma_chan, chan);
-	u8 *descs;
-
-	for (descs = td_desc->desc_list; ; descs += TIMB_DMA_DESC_SIZE) {
-		__td_unmap_desc(td_chan, descs, single);
-		if (descs[0] & 0x02)
-			break;
-	}
-}
-
 static int td_fill_desc(struct timb_dma_chan *td_chan, u8 *dma_desc,
 	struct scatterlist *sg, bool last)
 {
@@ -294,10 +262,6 @@ static void __td_finish(struct timb_dma_chan *td_chan)
 	list_move(&td_desc->desc_node, &td_chan->free_list);
 
 	dma_descriptor_unmap(txd);
-	if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP))
-		__td_unmap_descs(td_desc,
-			txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE);
-
 	/*
 	 * The API requires that no submissions are done from a
 	 * callback, so we don't need to drop the lock here
diff --git a/drivers/dma/txx9dmac.c b/drivers/dma/txx9dmac.c
index 041b675..aee38a7 100644
--- a/drivers/dma/txx9dmac.c
+++ b/drivers/dma/txx9dmac.c
@@ -420,30 +420,6 @@ txx9dmac_descriptor_complete(struct txx9dmac_chan *dc,
 	list_move(&desc->desc_node, &dc->free_list);
 
 	dma_descriptor_unmap(txd);
-	if (!ds) {
-		dma_addr_t dmaaddr;
-		if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
-			dmaaddr = is_dmac64(dc) ?
-				desc->hwdesc.DAR : desc->hwdesc32.DAR;
-			if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
-				dma_unmap_single(chan2parent(&dc->chan),
-					dmaaddr, desc->len, DMA_FROM_DEVICE);
-			else
-				dma_unmap_page(chan2parent(&dc->chan),
-					dmaaddr, desc->len, DMA_FROM_DEVICE);
-		}
-		if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
-			dmaaddr = is_dmac64(dc) ?
-				desc->hwdesc.SAR : desc->hwdesc32.SAR;
-			if (txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE)
-				dma_unmap_single(chan2parent(&dc->chan),
-					dmaaddr, desc->len, DMA_TO_DEVICE);
-			else
-				dma_unmap_page(chan2parent(&dc->chan),
-					dmaaddr, desc->len, DMA_TO_DEVICE);
-		}
-	}
-
 	/*
 	 * The API requires that no submissions are done from a
 	 * callback, so we don't need to drop the lock here



* [PATCH 12/12] dmaengine: remove DMA unmap flags
  2012-12-06  9:25 [PATCH 00/12] dmaengine_unmap_data Dan Williams
                   ` (10 preceding siblings ...)
  2012-12-06  9:26 ` [PATCH 11/12] dmaengine: remove DMA unmap from drivers Dan Williams
@ 2012-12-06  9:26 ` Dan Williams
  11 siblings, 0 replies; 24+ messages in thread
From: Dan Williams @ 2012-12-06  9:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux, Bartlomiej Zolnierkiewicz, Vinod Koul, Tomasz Figa,
	Kyungmin Park, dave.jiang

From: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>

Remove no longer needed DMA unmap flags:
- DMA_COMPL_SKIP_SRC_UNMAP
- DMA_COMPL_SKIP_DEST_UNMAP
- DMA_COMPL_SRC_UNMAP_SINGLE
- DMA_COMPL_DEST_UNMAP_SINGLE
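
The call-site change is purely mechanical: the flags are dropped from the
prep-time flag masks, e.g. (illustrative):

	/* before */
	flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT |
		DMA_COMPL_SKIP_SRC_UNMAP | DMA_COMPL_SKIP_DEST_UNMAP;
	/* after */
	flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT;

The remaining dma_ctrl_flags bits are renumbered accordingly (see the
include/linux/dmaengine.h hunk at the end of this patch).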

Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Tomasz Figa <t.figa@samsung.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Dan Williams <djbw@fb.com>
---
 crypto/async_tx/async_memcpy.c           |    3 +--
 crypto/async_tx/async_memset.c           |    3 +--
 crypto/async_tx/async_pq.c               |    1 -
 crypto/async_tx/async_raid6_recov.c      |    8 ++------
 crypto/async_tx/async_xor.c              |    4 +---
 drivers/ata/pata_arasan_cf.c             |    3 +--
 drivers/dma/dmaengine.c                  |    3 +--
 drivers/dma/dmatest.c                    |    3 +--
 drivers/dma/ioat/dma.c                   |    3 +--
 drivers/dma/ioat/dma_v3.c                |   16 ++++------------
 drivers/media/platform/m2m-deinterlace.c |    3 +--
 drivers/media/platform/timblogiw.c       |    2 +-
 drivers/misc/carma/carma-fpga.c          |    3 +--
 drivers/mtd/nand/atmel_nand.c            |    3 +--
 drivers/mtd/nand/fsmc_nand.c             |    2 --
 drivers/net/ethernet/micrel/ks8842.c     |    6 ++----
 drivers/spi/spi-dw-mid.c                 |    4 ++--
 include/linux/dmaengine.h                |   18 ++++--------------
 18 files changed, 25 insertions(+), 63 deletions(-)

diff --git a/crypto/async_tx/async_memcpy.c b/crypto/async_tx/async_memcpy.c
index ca95c4c..8caffb0 100644
--- a/crypto/async_tx/async_memcpy.c
+++ b/crypto/async_tx/async_memcpy.c
@@ -56,8 +56,7 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
 		unmap = dmaengine_get_unmap_data(device->dev, 2, GFP_NOIO);
 
 	if (unmap && is_dma_copy_aligned(device, src_offset, dest_offset, len)) {
-		unsigned long dma_prep_flags = DMA_COMPL_SKIP_SRC_UNMAP |
-					       DMA_COMPL_SKIP_DEST_UNMAP;
+		unsigned long dma_prep_flags = 0;
 
 		if (submit->cb_fn)
 			dma_prep_flags |= DMA_PREP_INTERRUPT;
diff --git a/crypto/async_tx/async_memset.c b/crypto/async_tx/async_memset.c
index ffca53b..169ab25 100644
--- a/crypto/async_tx/async_memset.c
+++ b/crypto/async_tx/async_memset.c
@@ -53,8 +53,7 @@ async_memset(struct page *dest, int val, unsigned int offset, size_t len,
 		unmap = dmaengine_get_unmap_data(device->dev, 1, GFP_NOIO);
 
 	if (unmap && is_dma_fill_aligned(device, offset, 0, len)) {
-		unsigned long dma_prep_flags = DMA_COMPL_SKIP_SRC_UNMAP |
-					       DMA_COMPL_SKIP_DEST_UNMAP;
+		unsigned long dma_prep_flags = 0;
 
 		if (submit->cb_fn)
 			dma_prep_flags |= DMA_PREP_INTERRUPT;
diff --git a/crypto/async_tx/async_pq.c b/crypto/async_tx/async_pq.c
index e5ddb31..a00aa1e 100644
--- a/crypto/async_tx/async_pq.c
+++ b/crypto/async_tx/async_pq.c
@@ -62,7 +62,6 @@ do_async_gen_syndrome(struct dma_chan *chan,
 	unsigned short pq_src_cnt;
 	int src_off = 0;
 
-	dma_flags |= DMA_COMPL_SKIP_SRC_UNMAP | DMA_COMPL_SKIP_DEST_UNMAP;
 	if (submit->flags & ASYNC_TX_FENCE)
 		dma_flags |= DMA_PREP_FENCE;
 
diff --git a/crypto/async_tx/async_raid6_recov.c b/crypto/async_tx/async_raid6_recov.c
index 20aea04..01b00c9 100644
--- a/crypto/async_tx/async_raid6_recov.c
+++ b/crypto/async_tx/async_raid6_recov.c
@@ -47,9 +47,7 @@ async_sum_product(struct page *dest, struct page **srcs, unsigned char *coef,
 		struct device *dev = dma->dev;
 		dma_addr_t pq[2];
 		struct dma_async_tx_descriptor *tx;
-		enum dma_ctrl_flags dma_flags = DMA_COMPL_SKIP_SRC_UNMAP |
-						DMA_COMPL_SKIP_DEST_UNMAP |
-						DMA_PREP_PQ_DISABLE_P;
+		enum dma_ctrl_flags dma_flags = DMA_PREP_PQ_DISABLE_P;
 
 		if (submit->flags & ASYNC_TX_FENCE)
 			dma_flags |= DMA_PREP_FENCE;
@@ -112,9 +110,7 @@ async_mult(struct page *dest, struct page *src, u8 coef, size_t len,
 	if (unmap) {
 		struct device *dev = dma->dev;
 		struct dma_async_tx_descriptor *tx;
-		enum dma_ctrl_flags dma_flags = DMA_COMPL_SKIP_SRC_UNMAP |
-						DMA_COMPL_SKIP_DEST_UNMAP |
-						DMA_PREP_PQ_DISABLE_P;
+		enum dma_ctrl_flags dma_flags = DMA_PREP_PQ_DISABLE_P;
 
 		if (submit->flags & ASYNC_TX_FENCE)
 			dma_flags |= DMA_PREP_FENCE;
diff --git a/crypto/async_tx/async_xor.c b/crypto/async_tx/async_xor.c
index d6e1dc0..a5e1ebc 100644
--- a/crypto/async_tx/async_xor.c
+++ b/crypto/async_tx/async_xor.c
@@ -56,7 +56,6 @@ do_async_xor(struct dma_chan *chan, struct dmaengine_unmap_data *unmap,
 		/* if we are submitting additional xors, leave the chain open
 		 * and clear the callback parameters
 		 */
-		dma_flags = DMA_COMPL_SKIP_SRC_UNMAP | DMA_COMPL_SKIP_DEST_UNMAP;
 		if (src_cnt > xor_src_cnt) {
 			submit->flags &= ~ASYNC_TX_ACK;
 			submit->flags |= ASYNC_TX_FENCE;
@@ -287,8 +286,7 @@ async_xor_val(struct page *dest, struct page **src_list, unsigned int offset,
 
 	if (unmap && src_cnt <= device->max_xor &&
 	    is_dma_xor_aligned(device, offset, 0, len)) {
-		unsigned long dma_prep_flags = DMA_COMPL_SKIP_SRC_UNMAP |
-					       DMA_COMPL_SKIP_DEST_UNMAP;
+		unsigned long dma_prep_flags = 0;
 		int i;
 
 		pr_debug("%s: (async) len: %zu\n", __func__, len);
diff --git a/drivers/ata/pata_arasan_cf.c b/drivers/ata/pata_arasan_cf.c
index 26201eb..9a6d38d 100644
--- a/drivers/ata/pata_arasan_cf.c
+++ b/drivers/ata/pata_arasan_cf.c
@@ -393,8 +393,7 @@ dma_xfer(struct arasan_cf_dev *acdev, dma_addr_t src, dma_addr_t dest, u32 len)
 	struct dma_async_tx_descriptor *tx;
 	struct dma_chan *chan = acdev->dma_chan;
 	dma_cookie_t cookie;
-	unsigned long flags = DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_SRC_UNMAP |
-		DMA_COMPL_SKIP_DEST_UNMAP;
+	unsigned long flags = DMA_PREP_INTERRUPT;
 	int ret = 0;
 
 	tx = chan->device->device_prep_dma_memcpy(chan, dest, src, len, flags);
diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index 5a3c7c0..96b5493 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -1021,8 +1021,7 @@ dma_async_memcpy_pg_to_pg(struct dma_chan *chan, struct page *dest_pg,
 				      DMA_TO_DEVICE);
 	unmap->addr[1] = dma_map_page(dev->dev, dest_pg, dest_off, len,
 				      DMA_FROM_DEVICE);
-	flags = DMA_CTRL_ACK | DMA_COMPL_SKIP_SRC_UNMAP |
-		DMA_COMPL_SKIP_DEST_UNMAP;
+	flags = DMA_CTRL_ACK;
 	tx = dev->device_prep_dma_memcpy(chan, unmap->addr[1], unmap->addr[0],
 					 len, flags);
 
diff --git a/drivers/dma/dmatest.c b/drivers/dma/dmatest.c
index 3b36890..df78702 100644
--- a/drivers/dma/dmatest.c
+++ b/drivers/dma/dmatest.c
@@ -305,8 +305,7 @@ static int dmatest_func(void *data)
 	set_user_nice(current, 10);
 
 	/* src and dst buffers are freed by ourselves below */
-	flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT |
-		DMA_COMPL_SKIP_SRC_UNMAP | DMA_COMPL_SKIP_DEST_UNMAP;
+	flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT;
 
 	while (!kthread_should_stop()
 	       && !(iterations && total_tests >= iterations)) {
diff --git a/drivers/dma/ioat/dma.c b/drivers/dma/ioat/dma.c
index b01df57..c8fc6d4 100644
--- a/drivers/dma/ioat/dma.c
+++ b/drivers/dma/ioat/dma.c
@@ -818,8 +818,7 @@ int __devinit ioat_dma_self_test(struct ioatdma_device *device)
 
 	dma_src = dma_map_single(dev, src, IOAT_TEST_SIZE, DMA_TO_DEVICE);
 	dma_dest = dma_map_single(dev, dest, IOAT_TEST_SIZE, DMA_FROM_DEVICE);
-	flags = DMA_COMPL_SKIP_SRC_UNMAP | DMA_COMPL_SKIP_DEST_UNMAP |
-		DMA_PREP_INTERRUPT;
+	flags = DMA_PREP_INTERRUPT;
 	tx = device->common.device_prep_dma_memcpy(dma_chan, dma_dest, dma_src,
 						   IOAT_TEST_SIZE, flags);
 	if (!tx) {
diff --git a/drivers/dma/ioat/dma_v3.c b/drivers/dma/ioat/dma_v3.c
index cdf37ff..e38077c 100644
--- a/drivers/dma/ioat/dma_v3.c
+++ b/drivers/dma/ioat/dma_v3.c
@@ -792,9 +792,7 @@ static int __devinit ioat_xor_val_self_test(struct ioatdma_device *device)
 					   DMA_TO_DEVICE);
 	tx = dma->device_prep_dma_xor(dma_chan, dest_dma, dma_srcs,
 				      IOAT_NUM_SRC_TEST, PAGE_SIZE,
-				      DMA_PREP_INTERRUPT |
-				      DMA_COMPL_SKIP_SRC_UNMAP |
-				      DMA_COMPL_SKIP_DEST_UNMAP);
+				      DMA_PREP_INTERRUPT);
 
 	if (!tx) {
 		dev_err(dev, "Self-test xor prep failed\n");
@@ -855,9 +853,7 @@ static int __devinit ioat_xor_val_self_test(struct ioatdma_device *device)
 					   DMA_TO_DEVICE);
 	tx = dma->device_prep_dma_xor_val(dma_chan, dma_srcs,
 					  IOAT_NUM_SRC_TEST + 1, PAGE_SIZE,
-					  &xor_val_result, DMA_PREP_INTERRUPT |
-					  DMA_COMPL_SKIP_SRC_UNMAP |
-					  DMA_COMPL_SKIP_DEST_UNMAP);
+					  &xor_val_result, DMA_PREP_INTERRUPT);
 	if (!tx) {
 		dev_err(dev, "Self-test zero prep failed\n");
 		err = -ENODEV;
@@ -903,9 +899,7 @@ static int __devinit ioat_xor_val_self_test(struct ioatdma_device *device)
 	dma_addr = dma_map_page(dev, dest, 0,
 			PAGE_SIZE, DMA_FROM_DEVICE);
 	tx = dma->device_prep_dma_memset(dma_chan, dma_addr, 0, PAGE_SIZE,
-					 DMA_PREP_INTERRUPT |
-					 DMA_COMPL_SKIP_SRC_UNMAP |
-					 DMA_COMPL_SKIP_DEST_UNMAP);
+					 DMA_PREP_INTERRUPT);
 	if (!tx) {
 		dev_err(dev, "Self-test memset prep failed\n");
 		err = -ENODEV;
@@ -952,9 +946,7 @@ static int __devinit ioat_xor_val_self_test(struct ioatdma_device *device)
 					   DMA_TO_DEVICE);
 	tx = dma->device_prep_dma_xor_val(dma_chan, dma_srcs,
 					  IOAT_NUM_SRC_TEST + 1, PAGE_SIZE,
-					  &xor_val_result, DMA_PREP_INTERRUPT |
-					  DMA_COMPL_SKIP_SRC_UNMAP |
-					  DMA_COMPL_SKIP_DEST_UNMAP);
+					  &xor_val_result, DMA_PREP_INTERRUPT);
 	if (!tx) {
 		dev_err(dev, "Self-test 2nd zero prep failed\n");
 		err = -ENODEV;
diff --git a/drivers/media/platform/m2m-deinterlace.c b/drivers/media/platform/m2m-deinterlace.c
index 45164c4..8c63b93 100644
--- a/drivers/media/platform/m2m-deinterlace.c
+++ b/drivers/media/platform/m2m-deinterlace.c
@@ -344,8 +344,7 @@ static void deinterlace_issue_dma(struct deinterlace_ctx *ctx, int op,
 	ctx->xt->dir = DMA_MEM_TO_MEM;
 	ctx->xt->src_sgl = false;
 	ctx->xt->dst_sgl = true;
-	flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT |
-		DMA_COMPL_SKIP_DEST_UNMAP | DMA_COMPL_SKIP_SRC_UNMAP;
+	flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT;
 
 	tx = dmadev->device_prep_interleaved_dma(chan, ctx->xt, flags);
 	if (tx == NULL) {
diff --git a/drivers/media/platform/timblogiw.c b/drivers/media/platform/timblogiw.c
index 02194c0..8ae630c 100644
--- a/drivers/media/platform/timblogiw.c
+++ b/drivers/media/platform/timblogiw.c
@@ -566,7 +566,7 @@ static void buffer_queue(struct videobuf_queue *vq, struct videobuf_buffer *vb)
 
 	desc = dmaengine_prep_slave_sg(fh->chan,
 		buf->sg, sg_elems, DMA_DEV_TO_MEM,
-		DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_SRC_UNMAP);
+		DMA_PREP_INTERRUPT);
 	if (!desc) {
 		spin_lock_irq(&fh->queue_lock);
 		list_del_init(&vb->queue);
diff --git a/drivers/misc/carma/carma-fpga.c b/drivers/misc/carma/carma-fpga.c
index 7508caf..b654c79 100644
--- a/drivers/misc/carma/carma-fpga.c
+++ b/drivers/misc/carma/carma-fpga.c
@@ -631,8 +631,7 @@ static int data_submit_dma(struct fpga_device *priv, struct data_buf *buf)
 	struct dma_async_tx_descriptor *tx;
 	dma_cookie_t cookie;
 	dma_addr_t dst, src;
-	unsigned long dma_flags = DMA_COMPL_SKIP_DEST_UNMAP |
-				  DMA_COMPL_SKIP_SRC_UNMAP;
+	unsigned long dma_flags = 0;
 
 	dst_sg = buf->vb.sglist;
 	dst_nents = buf->vb.sglen;
diff --git a/drivers/mtd/nand/atmel_nand.c b/drivers/mtd/nand/atmel_nand.c
index 9144557..5e1f88f 100644
--- a/drivers/mtd/nand/atmel_nand.c
+++ b/drivers/mtd/nand/atmel_nand.c
@@ -238,8 +238,7 @@ static int atmel_nand_dma_op(struct mtd_info *mtd, void *buf, int len,
 
 	dma_dev = host->dma_chan->device;
 
-	flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_SRC_UNMAP |
-		DMA_COMPL_SKIP_DEST_UNMAP;
+	flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT;
 
 	phys_addr = dma_map_single(dma_dev->dev, p, len, dir);
 	if (dma_mapping_error(dma_dev->dev, phys_addr)) {
diff --git a/drivers/mtd/nand/fsmc_nand.c b/drivers/mtd/nand/fsmc_nand.c
index 679ede8..fdb98f0 100644
--- a/drivers/mtd/nand/fsmc_nand.c
+++ b/drivers/mtd/nand/fsmc_nand.c
@@ -569,8 +569,6 @@ static int dma_xfer(struct fsmc_nand_data *host, void *buffer, int len,
 	dma_dev = chan->device;
 	dma_addr = dma_map_single(dma_dev->dev, buffer, len, direction);
 
-	flags |= DMA_COMPL_SKIP_SRC_UNMAP | DMA_COMPL_SKIP_DEST_UNMAP;
-
 	if (direction == DMA_TO_DEVICE) {
 		dma_src = dma_addr;
 		dma_dst = host->data_pa;
diff --git a/drivers/net/ethernet/micrel/ks8842.c b/drivers/net/ethernet/micrel/ks8842.c
index 24fb049..f657760 100644
--- a/drivers/net/ethernet/micrel/ks8842.c
+++ b/drivers/net/ethernet/micrel/ks8842.c
@@ -459,8 +459,7 @@ static int ks8842_tx_frame_dma(struct sk_buff *skb, struct net_device *netdev)
 		sg_dma_len(&ctl->sg) += 4 - sg_dma_len(&ctl->sg) % 4;
 
 	ctl->adesc = dmaengine_prep_slave_sg(ctl->chan,
-		&ctl->sg, 1, DMA_MEM_TO_DEV,
-		DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_SRC_UNMAP);
+		&ctl->sg, 1, DMA_MEM_TO_DEV, DMA_PREP_INTERRUPT);
 	if (!ctl->adesc)
 		return NETDEV_TX_BUSY;
 
@@ -571,8 +570,7 @@ static int __ks8842_start_new_rx_dma(struct net_device *netdev)
 		sg_dma_len(sg) = DMA_BUFFER_SIZE;
 
 		ctl->adesc = dmaengine_prep_slave_sg(ctl->chan,
-			sg, 1, DMA_DEV_TO_MEM,
-			DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_SRC_UNMAP);
+			sg, 1, DMA_DEV_TO_MEM, DMA_PREP_INTERRUPT);
 
 		if (!ctl->adesc)
 			goto out;
diff --git a/drivers/spi/spi-dw-mid.c b/drivers/spi/spi-dw-mid.c
index b9f0192..6d207af 100644
--- a/drivers/spi/spi-dw-mid.c
+++ b/drivers/spi/spi-dw-mid.c
@@ -150,7 +150,7 @@ static int mid_spi_dma_transfer(struct dw_spi *dws, int cs_change)
 				&dws->tx_sgl,
 				1,
 				DMA_MEM_TO_DEV,
-				DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_DEST_UNMAP);
+				DMA_PREP_INTERRUPT);
 	txdesc->callback = dw_spi_dma_done;
 	txdesc->callback_param = dws;
 
@@ -173,7 +173,7 @@ static int mid_spi_dma_transfer(struct dw_spi *dws, int cs_change)
 				&dws->rx_sgl,
 				1,
 				DMA_DEV_TO_MEM,
-				DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_DEST_UNMAP);
+				DMA_PREP_INTERRUPT);
 	rxdesc->callback = dw_spi_dma_done;
 	rxdesc->callback_param = dws;
 
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index e954e9f..6f3873d 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -169,12 +169,6 @@ struct dma_interleaved_template {
  * @DMA_CTRL_ACK - if clear, the descriptor cannot be reused until the client
  *  acknowledges receipt, i.e. has has a chance to establish any dependency
  *  chains
- * @DMA_COMPL_SKIP_SRC_UNMAP - set to disable dma-unmapping the source buffer(s)
- * @DMA_COMPL_SKIP_DEST_UNMAP - set to disable dma-unmapping the destination(s)
- * @DMA_COMPL_SRC_UNMAP_SINGLE - set to do the source dma-unmapping as single
- * 	(if not set, do the source dma-unmapping as page)
- * @DMA_COMPL_DEST_UNMAP_SINGLE - set to do the destination dma-unmapping as single
- * 	(if not set, do the destination dma-unmapping as page)
  * @DMA_PREP_PQ_DISABLE_P - prevent generation of P while generating Q
  * @DMA_PREP_PQ_DISABLE_Q - prevent generation of Q while generating P
  * @DMA_PREP_CONTINUE - indicate to a driver that it is reusing buffers as
@@ -186,14 +180,10 @@ struct dma_interleaved_template {
 enum dma_ctrl_flags {
 	DMA_PREP_INTERRUPT = (1 << 0),
 	DMA_CTRL_ACK = (1 << 1),
-	DMA_COMPL_SKIP_SRC_UNMAP = (1 << 2),
-	DMA_COMPL_SKIP_DEST_UNMAP = (1 << 3),
-	DMA_COMPL_SRC_UNMAP_SINGLE = (1 << 4),
-	DMA_COMPL_DEST_UNMAP_SINGLE = (1 << 5),
-	DMA_PREP_PQ_DISABLE_P = (1 << 6),
-	DMA_PREP_PQ_DISABLE_Q = (1 << 7),
-	DMA_PREP_CONTINUE = (1 << 8),
-	DMA_PREP_FENCE = (1 << 9),
+	DMA_PREP_PQ_DISABLE_P = (1 << 2),
+	DMA_PREP_PQ_DISABLE_Q = (1 << 3),
+	DMA_PREP_CONTINUE = (1 << 4),
+	DMA_PREP_FENCE = (1 << 5),
 };
 
 /**



* Re: [PATCH 02/12] dmaengine: prepare for generic 'unmap' data
  2012-12-06  9:25 ` [PATCH 02/12] dmaengine: prepare for generic 'unmap' data Dan Williams
@ 2012-12-06 15:47   ` Bartlomiej Zolnierkiewicz
  2013-06-26 10:44     ` dmaengine_unmap_data patches (was: Re: Re: [PATCH 02/12] dmaengine: prepare for generic 'unmap' data) Bartlomiej Zolnierkiewicz
  0 siblings, 1 reply; 24+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2012-12-06 15:47 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-kernel, vinod.koul, linux, dave.jiang

On Thursday 06 December 2012 10:25:20 Dan Williams wrote:
> Add a hook for a common dma unmap implementation to enable removal of
> the per driver custom unmap code.  (A reworked version of Bartlomiej
> Zolnierkiewicz's patches to remove the custom callbacks and the size
> increase of dma_async_tx_descriptor for drivers that don't care about raid).
> 
> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
> Signed-off-by: Dan Williams <djbw@fb.com>

Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>

Thanks for reworking my patches!  The patchset looks generally
good, but there are a few issues that should be fixed (please see
the other mails).

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung Poland R&D Center


* Re: [PATCH 03/12] dmaengine: reference counted unmap data
  2012-12-06  9:25 ` [PATCH 03/12] dmaengine: reference counted unmap data Dan Williams
@ 2012-12-06 15:47   ` Bartlomiej Zolnierkiewicz
  0 siblings, 0 replies; 24+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2012-12-06 15:47 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-kernel, linux, Vinod Koul, Tomasz Figa, Kyungmin Park, dave.jiang


On Thursday 06 December 2012 10:25:26 Dan Williams wrote:
> hang a common 'unmap' object off of dma descriptors for the purpose of
> providing a unified unmapping interface.  The lifetime of a mapping may
> span multiple descriptors, so these unmap objects are reference counted
> by each related descriptor.
> 
> Cc: Vinod Koul <vinod.koul@intel.com>
> Cc: Tomasz Figa <t.figa@samsung.com>
> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
> Cc: Kyungmin Park <kyungmin.park@samsung.com>
> Signed-off-by: Dan Williams <djbw@fb.com>
> ---
>  drivers/dma/dmaengine.c   |  157 ++++++++++++++++++++++++++++++++++++++++++---
>  include/linux/dmaengine.h |    3 +
>  2 files changed, 151 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
> index f3cadc6..00f0baf 100644
> --- a/drivers/dma/dmaengine.c
> +++ b/drivers/dma/dmaengine.c
> @@ -62,6 +62,7 @@
>  #include <linux/rculist.h>
>  #include <linux/idr.h>
>  #include <linux/slab.h>
> +#include <linux/mempool.h>
>  
>  static DEFINE_MUTEX(dma_list_mutex);
>  static DEFINE_IDR(dma_idr);
> @@ -856,6 +857,131 @@ void dma_async_device_unregister(struct dma_device *device)
>  }
>  EXPORT_SYMBOL(dma_async_device_unregister);
>  
> +struct dmaengine_unmap_pool {
> +	struct kmem_cache *cache;
> +	const char *name;
> +	mempool_t *pool;
> +	size_t size;
> +};
> +
> +#define __UNMAP_POOL(x) { .size = x, .name = "dmaengine-unmap-" __stringify(x) }
> +static struct dmaengine_unmap_pool unmap_pool[] = {
> +	__UNMAP_POOL(2),
> +	#if IS_ENABLED(ASYNC_TX_DMA)
> +	__UNMAP_POOL(16),
> +	__UNMAP_POOL(128),
> +	__UNMAP_POOL(256),
> +	#endif
> +};
> +
> +static struct dmaengine_unmap_pool *__get_unmap_pool(int nr)
> +{
> +	int order = get_count_order(nr);
> +
> +	switch (order) {
> +	case 0 ... 1:
> +		return &unmap_pool[0];
> +	case 2 ... 4:
> +		return &unmap_pool[1];
> +	case 5 ... 7:
> +		return &unmap_pool[2];
> +	case 8:
> +		return &unmap_pool[3];
> +	default:
> +		BUG();
> +		return NULL;
> +	}
> +
> +}
> +
> +static void dmaengine_unmap(struct kref *kref)
> +{
> +	struct dmaengine_unmap_data *unmap = container_of(kref, typeof(*unmap), kref);
> +	struct device *dev = unmap->dev;
> +	int cnt, i;
> +
> +	cnt = unmap->to_cnt;
> +	for (i = 0; i < cnt; i++)
> +		dma_unmap_page(dev, unmap->addr[i], unmap->len,
> +			       DMA_TO_DEVICE);
> +	cnt += unmap->from_cnt;
> +	for (; i < cnt; i++)
> +		dma_unmap_page(dev, unmap->addr[i], unmap->len,
> +			       DMA_FROM_DEVICE);
> +	cnt += unmap->bidi_cnt;
> +	for (; i < cnt; i++)
> +		dma_unmap_page(dev, unmap->addr[i], unmap->len,
> +			       DMA_BIDIRECTIONAL);
> +	kmem_cache_free(__get_unmap_pool(cnt)->cache, unmap);

Shouldn't this call mempool_free() instead?  The object was allocated
from the mempool, so freeing it straight to the slab cache bypasses
the pool.
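
For reference, a minimal sketch of the suggested change (not what was
posted; it assumes the __get_unmap_pool() helper above):

	cnt += unmap->bidi_cnt;
	for (; i < cnt; i++)
		dma_unmap_page(dev, unmap->addr[i], unmap->len,
			       DMA_BIDIRECTIONAL);
	/* give the object back to the mempool it was allocated from,
	 * instead of bypassing the pool and freeing the slab object
	 */
	mempool_free(unmap, __get_unmap_pool(cnt)->pool);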

> +}
> +
> +void dmaengine_unmap_put(struct dmaengine_unmap_data *unmap)
> +{
> +	if (unmap)
> +		kref_put(&unmap->kref, dmaengine_unmap);
> +}
> +EXPORT_SYMBOL_GPL(dmaengine_unmap_put);
> +
> +static void dmaengine_destroy_unmap_pool(void)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(unmap_pool); i++) {
> +		struct dmaengine_unmap_pool *p = &unmap_pool[i];
> +
> +		if (p->cache)
> +			kmem_cache_destroy(p->cache);
> +		p->cache = NULL;
> +		if (p->pool)
> +			mempool_destroy(p->pool);
> +		p->pool = NULL;
> +	}
> +}
> +
> +static int dmaengine_init_unmap_pool(void)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(unmap_pool); i++) {
> +		struct dmaengine_unmap_pool *p = &unmap_pool[i];
> +		size_t size;
> +
> +		size = sizeof(struct dmaengine_unmap_data) +
> +		       sizeof(dma_addr_t) * p->size;
> +
> +		p->cache = kmem_cache_create(p->name, size, 0,
> +					     SLAB_HWCACHE_ALIGN, NULL);
> +		if (!p->cache)
> +			break;
> +		p->pool = mempool_create_slab_pool(1, p->cache);
> +		if (!p->pool)
> +			break;
> +	}
> +
> +	if (i > ARRAY_SIZE(unmap_pool))
> +		return 0;
> +
> +	dmaengine_destroy_unmap_pool();
> +	return -ENOMEM;
> +}
> +
> +static struct dmaengine_unmap_data *
> +dmaengine_get_unmap_data(struct device *dev, int nr, gfp_t flags)
> +{
> +	struct dmaengine_unmap_data *unmap;
> +
> +	unmap = mempool_alloc(__get_unmap_pool(nr)->pool, flags);
> +	if (!unmap)
> +		return NULL;
> +	unmap->to_cnt = 0;
> +	unmap->from_cnt = 0;
> +	unmap->bidi_cnt = 0;

unmap->len is not initialized here, so a caller that forgets to set it
will pass a garbage length to dma_unmap_page() at unmap time.
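
A defensive one-line fix in the allocator could look like this (a
sketch only; callers would still have to set the real mapping length
before submitting):

	unmap->to_cnt = 0;
	unmap->from_cnt = 0;
	unmap->bidi_cnt = 0;
	unmap->len = 0;	/* callers must overwrite with the mapping length */
	kref_init(&unmap->kref);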

> +	kref_init(&unmap->kref);
> +	unmap->dev = dev;
> +
> +	return unmap;
> +}
> +
>  /**
>   * dma_async_memcpy_pg_to_pg - offloaded copy from page to page
>   * @chan: DMA channel to offload copy to
> @@ -877,24 +1003,33 @@ dma_async_memcpy_pg_to_pg(struct dma_chan *chan, struct page *dest_pg,
>  {
>  	struct dma_device *dev = chan->device;
>  	struct dma_async_tx_descriptor *tx;
> -	dma_addr_t dma_dest, dma_src;
> +	struct dmaengine_unmap_data *unmap;
>  	dma_cookie_t cookie;
>  	unsigned long flags;
>  
> -	dma_src = dma_map_page(dev->dev, src_pg, src_off, len, DMA_TO_DEVICE);
> -	dma_dest = dma_map_page(dev->dev, dest_pg, dest_off, len,
> -				DMA_FROM_DEVICE);
> -	flags = DMA_CTRL_ACK;
> -	tx = dev->device_prep_dma_memcpy(chan, dma_dest, dma_src, len, flags);
> +	unmap = dmaengine_get_unmap_data(dev->dev, 2, GFP_NOIO);
> +	if (!unmap)
> +		return -ENOMEM;
> +
> +	unmap->to_cnt = 1;
> +	unmap->from_cnt = 1;
> +	unmap->addr[0] = dma_map_page(dev->dev, src_pg, src_off, len,
> +				      DMA_TO_DEVICE);
> +	unmap->addr[1] = dma_map_page(dev->dev, dest_pg, dest_off, len,
> +				      DMA_FROM_DEVICE);

unmap->len is not set anywhere in this function, so dmaengine_unmap()
will unmap with a bogus length.
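
The caller-side fix is a single line after the mappings (a sketch):

	unmap->len = len;	/* dmaengine_unmap() unmaps with this length */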

> +	flags = DMA_CTRL_ACK | DMA_COMPL_SKIP_SRC_UNMAP |
> +		DMA_COMPL_SKIP_DEST_UNMAP;
> +	tx = dev->device_prep_dma_memcpy(chan, unmap->addr[1], unmap->addr[0],
> +					 len, flags);
>  
>  	if (!tx) {
> -		dma_unmap_page(dev->dev, dma_src, len, DMA_TO_DEVICE);
> -		dma_unmap_page(dev->dev, dma_dest, len, DMA_FROM_DEVICE);
> +		dmaengine_unmap_put(unmap);
>  		return -ENOMEM;
>  	}
>  
> -	tx->callback = NULL;
> +	dma_set_unmap(tx, unmap);
>  	cookie = tx->tx_submit(tx);
> +	dmaengine_unmap_put(unmap);
>  
>  	preempt_disable();
>  	__this_cpu_add(chan->local->bytes_transferred, len);

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung Poland R&D Center


* Re: [PATCH 04/12] async_memcpy: convert to dmaengine_unmap_data
  2012-12-06  9:25 ` [PATCH 04/12] async_memcpy: convert to dmaengine_unmap_data Dan Williams
@ 2012-12-06 15:47   ` Bartlomiej Zolnierkiewicz
  0 siblings, 0 replies; 24+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2012-12-06 15:47 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-kernel, linux, vinod.koul, Tomasz Figa, Kyungmin Park, dave.jiang


On Thursday 06 December 2012 10:25:31 Dan Williams wrote:
> Use the generic unmap object to unmap dma buffers.
> 
> Cc: Tomasz Figa <t.figa@samsung.com>
> Cc: Kyungmin Park <kyungmin.park@samsung.com>
> Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
> Signed-off-by: Dan Williams <djbw@fb.com>
> ---
>  crypto/async_tx/async_memcpy.c |   39 ++++++++++++++++++++++-----------------
>  drivers/dma/dmaengine.c        |    3 ++-
>  include/linux/dmaengine.h      |    2 ++
>  3 files changed, 26 insertions(+), 18 deletions(-)
> 
> diff --git a/crypto/async_tx/async_memcpy.c b/crypto/async_tx/async_memcpy.c
> index 9e62fef..ca95c4c 100644
> --- a/crypto/async_tx/async_memcpy.c
> +++ b/crypto/async_tx/async_memcpy.c
> @@ -50,33 +50,36 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
>  						      &dest, 1, &src, 1, len);
>  	struct dma_device *device = chan ? chan->device : NULL;
>  	struct dma_async_tx_descriptor *tx = NULL;
> +	struct dmaengine_unmap_data *unmap = NULL;
>  
> -	if (device && is_dma_copy_aligned(device, src_offset, dest_offset, len)) {
> -		dma_addr_t dma_dest, dma_src;
> -		unsigned long dma_prep_flags = 0;
> +	if (device)
> +		unmap = dmaengine_get_unmap_data(device->dev, 2, GFP_NOIO);
> +
> +	if (unmap && is_dma_copy_aligned(device, src_offset, dest_offset, len)) {
> +		unsigned long dma_prep_flags = DMA_COMPL_SKIP_SRC_UNMAP |
> +					       DMA_COMPL_SKIP_DEST_UNMAP;
>  
>  		if (submit->cb_fn)
>  			dma_prep_flags |= DMA_PREP_INTERRUPT;
>  		if (submit->flags & ASYNC_TX_FENCE)
>  			dma_prep_flags |= DMA_PREP_FENCE;
> -		dma_dest = dma_map_page(device->dev, dest, dest_offset, len,
> -					DMA_FROM_DEVICE);
> -
> -		dma_src = dma_map_page(device->dev, src, src_offset, len,
> -				       DMA_TO_DEVICE);
> -
> -		tx = device->device_prep_dma_memcpy(chan, dma_dest, dma_src,
> -						    len, dma_prep_flags);
> -		if (!tx) {
> -			dma_unmap_page(device->dev, dma_dest, len,
> -				       DMA_FROM_DEVICE);
> -			dma_unmap_page(device->dev, dma_src, len,
> -				       DMA_TO_DEVICE);
> -		}
> +
> +		unmap->to_cnt = 1;
> +		unmap->addr[0] = dma_map_page(device->dev, src, src_offset, len,
> +		                              DMA_TO_DEVICE);
> +		unmap->from_cnt = 1;
> +		unmap->addr[1] = dma_map_page(device->dev, dest, dest_offset, len,
> +					      DMA_FROM_DEVICE);

unmap->len is not set anywhere

> +
> +		tx = device->device_prep_dma_memcpy(chan, unmap->addr[1],
> +						    unmap->addr[0], len,
> +						    dma_prep_flags);
>  	}
>  
>  	if (tx) {
>  		pr_debug("%s: (async) len: %zu\n", __func__, len);
> +
> +		dma_set_unmap(tx, unmap);
>  		async_tx_submit(chan, tx, submit);
>  	} else {
>  		void *dest_buf, *src_buf;
> @@ -96,6 +99,8 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
>  		async_tx_sync_epilog(submit);
>  	}
>  
> +	dmaengine_unmap_put(unmap);
> +
>  	return tx;
>  }
>  EXPORT_SYMBOL_GPL(async_memcpy);

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung Poland R&D Center


* Re: [PATCH 05/12] async_memset: convert to dmaengine_unmap_data
  2012-12-06  9:25 ` [PATCH 05/12] async_memset: " Dan Williams
@ 2012-12-06 15:48   ` Bartlomiej Zolnierkiewicz
  0 siblings, 0 replies; 24+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2012-12-06 15:48 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-kernel, linux, vinod.koul, Tomasz Figa, Kyungmin Park, dave.jiang


On Thursday 06 December 2012 10:25:37 Dan Williams wrote:
> Use the generic unmap object to unmap dma buffers.
> 
> Cc: Tomasz Figa <t.figa@samsung.com>
> Cc: Kyungmin Park <kyungmin.park@samsung.com>
> Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
> Signed-off-by: Dan Williams <djbw@fb.com>
> ---
>  crypto/async_tx/async_memset.c |   18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)
> 
> diff --git a/crypto/async_tx/async_memset.c b/crypto/async_tx/async_memset.c
> index 05a4d1e..ffca53b 100644
> --- a/crypto/async_tx/async_memset.c
> +++ b/crypto/async_tx/async_memset.c
> @@ -47,17 +47,22 @@ async_memset(struct page *dest, int val, unsigned int offset, size_t len,
>  						      &dest, 1, NULL, 0, len);
>  	struct dma_device *device = chan ? chan->device : NULL;
>  	struct dma_async_tx_descriptor *tx = NULL;
> +	struct dmaengine_unmap_data *unmap = NULL;
>  
> -	if (device && is_dma_fill_aligned(device, offset, 0, len)) {
> -		dma_addr_t dma_dest;
> -		unsigned long dma_prep_flags = 0;
> +	if (device)
> +		unmap = dmaengine_get_unmap_data(device->dev, 1, GFP_NOIO);
> +
> +	if (unmap && is_dma_fill_aligned(device, offset, 0, len)) {
> +		unsigned long dma_prep_flags = DMA_COMPL_SKIP_SRC_UNMAP |
> +					       DMA_COMPL_SKIP_DEST_UNMAP;
>  
>  		if (submit->cb_fn)
>  			dma_prep_flags |= DMA_PREP_INTERRUPT;
>  		if (submit->flags & ASYNC_TX_FENCE)
>  			dma_prep_flags |= DMA_PREP_FENCE;
> -		dma_dest = dma_map_page(device->dev, dest, offset, len,
> -					DMA_FROM_DEVICE);
> +		unmap->from_cnt = 1;
> +		unmap->addr[0] = dma_map_page(device->dev, dest, offset, len,
> +					      DMA_FROM_DEVICE);

unmap->len is not set anywhere

>  		tx = device->device_prep_dma_memset(chan, dma_dest, val, len,
>  						    dma_prep_flags);
> @@ -65,6 +70,8 @@ async_memset(struct page *dest, int val, unsigned int offset, size_t len,
>  
>  	if (tx) {
>  		pr_debug("%s: (async) len: %zu\n", __func__, len);
> +
> +		dma_set_unmap(tx, unmap);
>  		async_tx_submit(chan, tx, submit);
>  	} else { /* run the memset synchronously */
>  		void *dest_buf;
> @@ -79,6 +86,7 @@ async_memset(struct page *dest, int val, unsigned int offset, size_t len,
>  
>  		async_tx_sync_epilog(submit);
>  	}
> +	dmaengine_unmap_put(unmap);
>  
>  	return tx;
>  }

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung Poland R&D Center


* Re: [PATCH 06/12] async_xor: convert to dmaengine_unmap_data
  2012-12-06  9:25 ` [PATCH 06/12] async_xor: " Dan Williams
@ 2012-12-06 15:48   ` Bartlomiej Zolnierkiewicz
  0 siblings, 0 replies; 24+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2012-12-06 15:48 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-kernel, linux, vinod.koul, Tomasz Figa, Kyungmin Park, dave.jiang


On Thursday 06 December 2012 10:25:42 Dan Williams wrote:
> Use the generic unmap object to unmap dma buffers.
> 
> Later we can push this unmap object up to the raid layer and get rid of
> the 'scribble' parameter.
> 
> Cc: Tomasz Figa <t.figa@samsung.com>
> Cc: Kyungmin Park <kyungmin.park@samsung.com>
> Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
> Signed-off-by: Dan Williams <djbw@fb.com>
> ---
>  crypto/async_tx/async_xor.c |   96 +++++++++++++++++++++++--------------------
>  1 file changed, 52 insertions(+), 44 deletions(-)
> 
> diff --git a/crypto/async_tx/async_xor.c b/crypto/async_tx/async_xor.c
> index 154cc84..46bbdb3 100644
> --- a/crypto/async_tx/async_xor.c
> +++ b/crypto/async_tx/async_xor.c
> @@ -33,48 +33,33 @@
>  
>  /* do_async_xor - dma map the pages and perform the xor with an engine */
>  static __async_inline struct dma_async_tx_descriptor *
> -do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
> -	     unsigned int offset, int src_cnt, size_t len, dma_addr_t *dma_src,
> -	     struct async_submit_ctl *submit)
> +do_async_xor(struct dma_chan *chan, struct dmaengine_unmap_data *unmap,
> +             struct async_submit_ctl *submit)
>  {
>  	struct dma_device *dma = chan->device;
>  	struct dma_async_tx_descriptor *tx = NULL;
> -	int src_off = 0;
> -	int i;
>  	dma_async_tx_callback cb_fn_orig = submit->cb_fn;
>  	void *cb_param_orig = submit->cb_param;
>  	enum async_tx_flags flags_orig = submit->flags;
>  	enum dma_ctrl_flags dma_flags;
> -	int xor_src_cnt = 0;
> -	dma_addr_t dma_dest;
> -
> -	/* map the dest bidrectional in case it is re-used as a source */
> -	dma_dest = dma_map_page(dma->dev, dest, offset, len, DMA_BIDIRECTIONAL);
> -	for (i = 0; i < src_cnt; i++) {
> -		/* only map the dest once */
> -		if (!src_list[i])
> -			continue;
> -		if (unlikely(src_list[i] == dest)) {
> -			dma_src[xor_src_cnt++] = dma_dest;
> -			continue;
> -		}
> -		dma_src[xor_src_cnt++] = dma_map_page(dma->dev, src_list[i], offset,
> -						      len, DMA_TO_DEVICE);
> -	}
> -	src_cnt = xor_src_cnt;
> +	int src_cnt = unmap->to_cnt;
> +	int xor_src_cnt;
> +	dma_addr_t dma_dest = unmap->addr[unmap->to_cnt];
> +	dma_addr_t *src_list = unmap->addr;
>  
>  	while (src_cnt) {
> +		dma_addr_t tmp;
> +
>  		submit->flags = flags_orig;
>  		dma_flags = 0;

The 'dma_flags = 0;' line above can be removed now; dma_flags is
unconditionally assigned a few lines below.

>  		xor_src_cnt = min(src_cnt, (int)dma->max_xor);
> -		/* if we are submitting additional xors, leave the chain open,
> -		 * clear the callback parameters, and leave the destination
> -		 * buffer mapped
> +		/* if we are submitting additional xors, leave the chain open
> +		 * and clear the callback parameters
>  		 */
> +		dma_flags = DMA_COMPL_SKIP_SRC_UNMAP | DMA_COMPL_SKIP_DEST_UNMAP;
>  		if (src_cnt > xor_src_cnt) {
>  			submit->flags &= ~ASYNC_TX_ACK;
>  			submit->flags |= ASYNC_TX_FENCE;
> -			dma_flags = DMA_COMPL_SKIP_DEST_UNMAP;
>  			submit->cb_fn = NULL;
>  			submit->cb_param = NULL;
>  		} else {
> @@ -85,12 +70,18 @@ do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
>  			dma_flags |= DMA_PREP_INTERRUPT;
>  		if (submit->flags & ASYNC_TX_FENCE)
>  			dma_flags |= DMA_PREP_FENCE;
> -		/* Since we have clobbered the src_list we are committed
> -		 * to doing this asynchronously.  Drivers force forward progress
> -		 * in case they can not provide a descriptor
> +
> +		/* Drivers force forward progress in case they can not provide a
> +		 * descriptor
>  		 */
> -		tx = dma->device_prep_dma_xor(chan, dma_dest, &dma_src[src_off],
> -					      xor_src_cnt, len, dma_flags);
> +		tmp = src_list[0];
> +		if (src_list > unmap->addr)
> +			src_list[0] = dma_dest;
> +		tx = dma->device_prep_dma_xor(chan, dma_dest, src_list,
> +					      xor_src_cnt, unmap->len,
> +					      dma_flags);
> +		src_list[0] = tmp;
> +
>  
>  		if (unlikely(!tx))
>  			async_tx_quiesce(&submit->depend_tx);
> @@ -99,22 +90,21 @@ do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
>  		while (unlikely(!tx)) {
>  			dma_async_issue_pending(chan);
>  			tx = dma->device_prep_dma_xor(chan, dma_dest,
> -						      &dma_src[src_off],
> -						      xor_src_cnt, len,
> +						      src_list,
> +						      xor_src_cnt, unmap->len,
>  						      dma_flags);
>  		}
>  
> +		dma_set_unmap(tx, unmap);
>  		async_tx_submit(chan, tx, submit);
>  		submit->depend_tx = tx;
>  
>  		if (src_cnt > xor_src_cnt) {
>  			/* drop completed sources */
>  			src_cnt -= xor_src_cnt;
> -			src_off += xor_src_cnt;
> -
>  			/* use the intermediate result a source */
> -			dma_src[--src_off] = dma_dest;
>  			src_cnt++;
> +			src_list += xor_src_cnt - 1;
>  		} else
>  			break;
>  	}
> @@ -189,22 +179,40 @@ async_xor(struct page *dest, struct page **src_list, unsigned int offset,
>  	struct dma_chan *chan = async_tx_find_channel(submit, DMA_XOR,
>  						      &dest, 1, src_list,
>  						      src_cnt, len);
> -	dma_addr_t *dma_src = NULL;
> +	struct dma_device *device = chan ? chan->device : NULL;
> +	struct dmaengine_unmap_data *unmap = NULL;
>  
>  	BUG_ON(src_cnt <= 1);
>  
> -	if (submit->scribble)
> -		dma_src = submit->scribble;
> -	else if (sizeof(dma_addr_t) <= sizeof(struct page *))
> -		dma_src = (dma_addr_t *) src_list;
> +	if (device)
> +		unmap = dmaengine_get_unmap_data(device->dev, src_cnt+1, GFP_NOIO);
> +
> +	if (unmap && is_dma_xor_aligned(device, offset, 0, len)) {
> +		struct dma_async_tx_descriptor *tx;
> +		int i, j;
>  
> -	if (dma_src && chan && is_dma_xor_aligned(chan->device, offset, 0, len)) {
>  		/* run the xor asynchronously */
>  		pr_debug("%s (async): len: %zu\n", __func__, len);
>  
> -		return do_async_xor(chan, dest, src_list, offset, src_cnt, len,
> -				    dma_src, submit);
> +		unmap->len = len;
> +		for (i = 0, j = 0; i < src_cnt; i++) {
> +			if (!src_list[i])
> +				continue;
> +			unmap->to_cnt++;
> +			unmap->addr[j++] = dma_map_page(chan->device->dev, src_list[i],

device->dev can now be used instead of chan->device->dev

> +							offset, len, DMA_TO_DEVICE);
> +		}
> +
> +		/* map it bidirectional as it may be re-used as a source */
> +		unmap->addr[j] = dma_map_page(chan->device->dev, dest, offset, len,

ditto

> +					      DMA_BIDIRECTIONAL);
> +		unmap->bidi_cnt = 1;
> +
> +		tx = do_async_xor(chan, unmap, submit);
> +		dmaengine_unmap_put(unmap);
> +		return tx;
>  	} else {
> +		dmaengine_unmap_put(unmap);
>  		/* run the xor synchronously */
>  		pr_debug("%s (sync): len: %zu\n", __func__, len);
>  		WARN_ONCE(chan, "%s: no space for dma address conversion\n",

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung Poland R&D Center


* Re: [PATCH 07/12] async_xor_val: convert to dmaengine_unmap_data
  2012-12-06  9:25 ` [PATCH 07/12] async_xor_val: " Dan Williams
@ 2012-12-06 15:48   ` Bartlomiej Zolnierkiewicz
  0 siblings, 0 replies; 24+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2012-12-06 15:48 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-kernel, linux, vinod.koul, Tomasz Figa, Kyungmin Park, dave.jiang

On Thursday 06 December 2012 10:25:48 Dan Williams wrote:
> Use the generic unmap object to unmap dma buffers.
> 
> Cc: Tomasz Figa <t.figa@samsung.com>
> Cc: Kyungmin Park <kyungmin.park@samsung.com>
> Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
> Signed-off-by: Dan Williams <djbw@fb.com>

Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung Poland R&D Center


* Re: [PATCH 08/12] async_raid6_recov: convert to dmaengine_unmap_data
  2012-12-06  9:25 ` [PATCH 08/12] async_raid6_recov: " Dan Williams
@ 2012-12-06 15:48   ` Bartlomiej Zolnierkiewicz
  0 siblings, 0 replies; 24+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2012-12-06 15:48 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-kernel, linux, vinod.koul, Tomasz Figa, Kyungmin Park, dave.jiang

On Thursday 06 December 2012 10:25:54 Dan Williams wrote:
> Use the generic unmap object to unmap dma buffers.
> 
> Cc: Tomasz Figa <t.figa@samsung.com>
> Cc: Kyungmin Park <kyungmin.park@samsung.com>
> Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
> Signed-off-by: Dan Williams <djbw@fb.com>
> ---
>  crypto/async_tx/async_raid6_recov.c |   69 ++++++++++++++++++++++++-----------
>  1 file changed, 48 insertions(+), 21 deletions(-)
> 
> diff --git a/crypto/async_tx/async_raid6_recov.c b/crypto/async_tx/async_raid6_recov.c
> index a9f08a6..20aea04 100644
> --- a/crypto/async_tx/async_raid6_recov.c
> +++ b/crypto/async_tx/async_raid6_recov.c
> @@ -26,6 +26,7 @@
>  #include <linux/dma-mapping.h>
>  #include <linux/raid/pq.h>
>  #include <linux/async_tx.h>
> +#include <linux/dmaengine.h>
>  
>  static struct dma_async_tx_descriptor *
>  async_sum_product(struct page *dest, struct page **srcs, unsigned char *coef,
> @@ -34,35 +35,47 @@ async_sum_product(struct page *dest, struct page **srcs, unsigned char *coef,
>  	struct dma_chan *chan = async_tx_find_channel(submit, DMA_PQ,
>  						      &dest, 1, srcs, 2, len);
>  	struct dma_device *dma = chan ? chan->device : NULL;
> +	struct dmaengine_unmap_data *unmap = NULL;
>  	const u8 *amul, *bmul;
>  	u8 ax, bx;
>  	u8 *a, *b, *c;
>  
> -	if (dma) {
> -		dma_addr_t dma_dest[2];
> -		dma_addr_t dma_src[2];
> +	if (dma)
> +		unmap = dmaengine_get_unmap_data(dma->dev, 3, GFP_NOIO);
> +
> +	if (unmap) {
>  		struct device *dev = dma->dev;
> +		dma_addr_t pq[2];
>  		struct dma_async_tx_descriptor *tx;
> -		enum dma_ctrl_flags dma_flags = DMA_PREP_PQ_DISABLE_P;
> +		enum dma_ctrl_flags dma_flags = DMA_COMPL_SKIP_SRC_UNMAP |
> +						DMA_COMPL_SKIP_DEST_UNMAP |
> +						DMA_PREP_PQ_DISABLE_P;
>  
>  		if (submit->flags & ASYNC_TX_FENCE)
>  			dma_flags |= DMA_PREP_FENCE;
> -		dma_dest[1] = dma_map_page(dev, dest, 0, len, DMA_BIDIRECTIONAL);
> -		dma_src[0] = dma_map_page(dev, srcs[0], 0, len, DMA_TO_DEVICE);
> -		dma_src[1] = dma_map_page(dev, srcs[1], 0, len, DMA_TO_DEVICE);
> -		tx = dma->device_prep_dma_pq(chan, dma_dest, dma_src, 2, coef,
> +		unmap->addr[0] = dma_map_page(dev, srcs[0], 0, len, DMA_TO_DEVICE);
> +		unmap->addr[1] = dma_map_page(dev, srcs[1], 0, len, DMA_TO_DEVICE);
> +		unmap->to_cnt = 2;
> +
> +		unmap->addr[2] = dma_map_page(dev, dest, 0, len, DMA_BIDIRECTIONAL);
> +		unmap->bidi_cnt = 1;
> +		/* engine only looks at Q, but expects it to follow P */
> +		pq[1] = unmap->addr[2];
> +
> +		unmap->len = len;
> +		tx = dma->device_prep_dma_pq(chan, pq, unmap->addr, 2, coef,
>  					     len, dma_flags);
>  		if (tx) {
> +			dma_set_unmap(tx, unmap);
>  			async_tx_submit(chan, tx, submit);
> +			dmaengine_unmap_put(unmap);
>  			return tx;
>  		}
>  
>  		/* could not get a descriptor, unmap and fall through to
>  		 * the synchronous path
>  		 */
> -		dma_unmap_page(dev, dma_dest[1], len, DMA_BIDIRECTIONAL);
> -		dma_unmap_page(dev, dma_src[0], len, DMA_TO_DEVICE);
> -		dma_unmap_page(dev, dma_src[1], len, DMA_TO_DEVICE);
> +		dmaengine_unmap_put(unmap);
>  	}
>  
>  	/* run the operation synchronously */
> @@ -89,23 +102,38 @@ async_mult(struct page *dest, struct page *src, u8 coef, size_t len,
>  	struct dma_chan *chan = async_tx_find_channel(submit, DMA_PQ,
>  						      &dest, 1, &src, 1, len);
>  	struct dma_device *dma = chan ? chan->device : NULL;
> +	struct dmaengine_unmap_data *unmap = NULL;
>  	const u8 *qmul; /* Q multiplier table */
>  	u8 *d, *s;
>  
> -	if (dma) {
> -		dma_addr_t dma_dest[2];
> -		dma_addr_t dma_src[1];
> +	if (dma)
> +		unmap = dmaengine_get_unmap_data(dma->dev, 3, GFP_NOIO);
> +
> +	if (unmap) {
>  		struct device *dev = dma->dev;
>  		struct dma_async_tx_descriptor *tx;
> -		enum dma_ctrl_flags dma_flags = DMA_PREP_PQ_DISABLE_P;
> +		enum dma_ctrl_flags dma_flags = DMA_COMPL_SKIP_SRC_UNMAP |
> +						DMA_COMPL_SKIP_DEST_UNMAP |
> +						DMA_PREP_PQ_DISABLE_P;
>  
>  		if (submit->flags & ASYNC_TX_FENCE)
>  			dma_flags |= DMA_PREP_FENCE;
> -		dma_dest[1] = dma_map_page(dev, dest, 0, len, DMA_BIDIRECTIONAL);
> -		dma_src[0] = dma_map_page(dev, src, 0, len, DMA_TO_DEVICE);
> -		tx = dma->device_prep_dma_pq(chan, dma_dest, dma_src, 1, &coef,
> -					     len, dma_flags);
> +		unmap->addr[0] = dma_map_page(dev, src, 0, len, DMA_TO_DEVICE);
> +		unmap->to_cnt++;
> +		unmap->addr[1] = dma_map_page(dev, dest, 0, len, DMA_BIDIRECTIONAL);
> +		unmap->bidi_cnt++;
> +		unmap->len = len;
> +
> +		/* this looks funny, but the engine looks for Q at
> +		 * unmap->addr[1] and ignores unmap->addr[0] as a dest
> +		 * due to DMA_PREP_PQ_DISABLE_P
> +		 */
> +		tx = dma->device_prep_dma_pq(chan, unmap->addr, unmap->addr,
> +					     1, &coef, len, dma_flags);

at least iop-adma.c and ioat/dma_v3.c seem to modify the contents of
unmap->addr[0], which is probably not what we want, so a temporary
dma_dest array should still be used here:

static struct dma_async_tx_descriptor *
iop_adma_prep_dma_pq(struct dma_chan *chan, dma_addr_t *dst, dma_addr_t *src,
		     unsigned int src_cnt, const unsigned char *scf, size_t len,
		     unsigned long flags)
{
...
		/* even if P is disabled its destination address (bits
		 * [3:0]) must match Q.  It is ok if P points to an
		 * invalid address, it won't be written.
		 */
		if (flags & DMA_PREP_PQ_DISABLE_P)
			dst[0] = dst[1] & 0x7;
...
}

static struct dma_async_tx_descriptor *
ioat3_prep_pq(struct dma_chan *chan, dma_addr_t *dst, dma_addr_t *src,
	      unsigned int src_cnt, const unsigned char *scf, size_t len,
	      unsigned long flags)
{
	/* specify valid address for disabled result */
	if (flags & DMA_PREP_PQ_DISABLE_P)
		dst[0] = dst[1];
	if (flags & DMA_PREP_PQ_DISABLE_Q)
		dst[1] = dst[0];
...
}
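
A sketch of the suggested fix (hypothetical, not what was posted):
keep unmap->addr intact and hand the engine a scratch destination
array instead:

	dma_addr_t dma_dest[2];

	/* P is disabled, so the engine only reads Q (dma_dest[1]);
	 * the scratch array keeps drivers that rewrite dst[0] (see
	 * above) from clobbering unmap->addr[0], the source here
	 */
	dma_dest[0] = 0;
	dma_dest[1] = unmap->addr[1];
	tx = dma->device_prep_dma_pq(chan, dma_dest, unmap->addr,
				     1, &coef, len, dma_flags);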

> +
>  		if (tx) {
> +			dma_set_unmap(tx, unmap);
> +			dmaengine_unmap_put(unmap);
>  			async_tx_submit(chan, tx, submit);
>  			return tx;
>  		}
> @@ -113,8 +141,7 @@ async_mult(struct page *dest, struct page *src, u8 coef, size_t len,
>  		/* could not get a descriptor, unmap and fall through to
>  		 * the synchronous path
>  		 */
> -		dma_unmap_page(dev, dma_dest[1], len, DMA_BIDIRECTIONAL);
> -		dma_unmap_page(dev, dma_src[0], len, DMA_TO_DEVICE);
> +		dmaengine_unmap_put(unmap);
>  	}
>  
>  	/* no channel available, or failed to allocate a descriptor, so

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung Poland R&D Center


* Re: [PATCH 09/12] async_pq: convert to dmaengine_unmap_data
  2012-12-06  9:25 ` [PATCH 09/12] async_pq: " Dan Williams
@ 2012-12-06 15:48   ` Bartlomiej Zolnierkiewicz
  0 siblings, 0 replies; 24+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2012-12-06 15:48 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-kernel, linux, vinod.koul, Tomasz Figa, Kyungmin Park, dave.jiang

On Thursday 06 December 2012 10:25:59 Dan Williams wrote:
> Use the generic unmap object to unmap dma buffers.
> 
> Cc: Tomasz Figa <t.figa@samsung.com>
> Cc: Kyungmin Park <kyungmin.park@samsung.com>
> Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
> Signed-off-by: Dan Williams <djbw@fb.com>
> ---
>  crypto/async_tx/async_pq.c |  117 ++++++++++++++++++++++++--------------------
>  drivers/dma/dmaengine.c    |    5 ++
>  2 files changed, 68 insertions(+), 54 deletions(-)
> 
> diff --git a/crypto/async_tx/async_pq.c b/crypto/async_tx/async_pq.c
> index 91d5d38..1d78984 100644
> --- a/crypto/async_tx/async_pq.c
> +++ b/crypto/async_tx/async_pq.c
> @@ -46,49 +46,25 @@ static struct page *pq_scribble_page;
>   * do_async_gen_syndrome - asynchronously calculate P and/or Q
>   */
>  static __async_inline struct dma_async_tx_descriptor *
> -do_async_gen_syndrome(struct dma_chan *chan, struct page **blocks,
> -		      const unsigned char *scfs, unsigned int offset, int disks,
> -		      size_t len, dma_addr_t *dma_src,
> +do_async_gen_syndrome(struct dma_chan *chan,
> +		      const unsigned char *scfs, int disks,
> +		      struct dmaengine_unmap_data *unmap,
> +		      enum dma_ctrl_flags dma_flags,
>  		      struct async_submit_ctl *submit)
>  {
> +	dma_addr_t *dma_dest = &unmap->addr[disks - 2];

This seems to have the same dma_dest[0] overwrite issue as patch #8,
so a temporary array is really necessary here too; a sketch follows.
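
A sketch of the same kind of fix (hypothetical): refresh scratch
copies of the two destination addresses before each prep call:

	dma_addr_t dma_dest[2];

	/* drivers may rewrite dst[0]/dst[1] when P or Q is disabled,
	 * so never hand them pointers into unmap->addr directly
	 */
	dma_dest[0] = unmap->addr[disks - 2];	/* P */
	dma_dest[1] = unmap->addr[disks - 1];	/* Q */
	tx = dma->device_prep_dma_pq(chan, dma_dest,
				     &unmap->addr[src_off],
				     pq_src_cnt, &scfs[src_off],
				     unmap->len, dma_flags);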

>  	struct dma_async_tx_descriptor *tx = NULL;
>  	struct dma_device *dma = chan->device;
> -	enum dma_ctrl_flags dma_flags = 0;
>  	enum async_tx_flags flags_orig = submit->flags;
>  	dma_async_tx_callback cb_fn_orig = submit->cb_fn;
>  	dma_async_tx_callback cb_param_orig = submit->cb_param;
>  	int src_cnt = disks - 2;
> -	unsigned char coefs[src_cnt];
>  	unsigned short pq_src_cnt;
> -	dma_addr_t dma_dest[2];
>  	int src_off = 0;
> -	int idx;
> -	int i;
>  
> -	/* DMAs use destinations as sources, so use BIDIRECTIONAL mapping */
> -	if (P(blocks, disks))
> -		dma_dest[0] = dma_map_page(dma->dev, P(blocks, disks), offset,
> -					   len, DMA_BIDIRECTIONAL);
> -	else
> -		dma_flags |= DMA_PREP_PQ_DISABLE_P;
> -	if (Q(blocks, disks))
> -		dma_dest[1] = dma_map_page(dma->dev, Q(blocks, disks), offset,
> -					   len, DMA_BIDIRECTIONAL);
> -	else
> -		dma_flags |= DMA_PREP_PQ_DISABLE_Q;
> -
> -	/* convert source addresses being careful to collapse 'empty'
> -	 * sources and update the coefficients accordingly
> -	 */
> -	for (i = 0, idx = 0; i < src_cnt; i++) {
> -		if (blocks[i] == NULL)
> -			continue;
> -		dma_src[idx] = dma_map_page(dma->dev, blocks[i], offset, len,
> -					    DMA_TO_DEVICE);
> -		coefs[idx] = scfs[i];
> -		idx++;
> -	}
> -	src_cnt = idx;
> +	dma_flags |= DMA_COMPL_SKIP_SRC_UNMAP | DMA_COMPL_SKIP_DEST_UNMAP;
> +	if (submit->flags & ASYNC_TX_FENCE)
> +		dma_flags |= DMA_PREP_FENCE;
>  
>  	while (src_cnt > 0) {
>  		submit->flags = flags_orig;
> @@ -100,28 +76,23 @@ do_async_gen_syndrome(struct dma_chan *chan, struct page **blocks,
>  		if (src_cnt > pq_src_cnt) {
>  			submit->flags &= ~ASYNC_TX_ACK;
>  			submit->flags |= ASYNC_TX_FENCE;
> -			dma_flags |= DMA_COMPL_SKIP_DEST_UNMAP;
>  			submit->cb_fn = NULL;
>  			submit->cb_param = NULL;
>  		} else {
> -			dma_flags &= ~DMA_COMPL_SKIP_DEST_UNMAP;
>  			submit->cb_fn = cb_fn_orig;
>  			submit->cb_param = cb_param_orig;
>  			if (cb_fn_orig)
>  				dma_flags |= DMA_PREP_INTERRUPT;
>  		}
> -		if (submit->flags & ASYNC_TX_FENCE)
> -			dma_flags |= DMA_PREP_FENCE;
>  
> -		/* Since we have clobbered the src_list we are committed
> -		 * to doing this asynchronously.  Drivers force forward
> -		 * progress in case they can not provide a descriptor
> +		/* Drivers force forward progress in case they can not provide
> +		 * a descriptor
>  		 */
>  		for (;;) {
>  			tx = dma->device_prep_dma_pq(chan, dma_dest,
> -						     &dma_src[src_off],
> +						     &unmap->addr[src_off],
>  						     pq_src_cnt,
> -						     &coefs[src_off], len,
> +						     &scfs[src_off], unmap->len,
>  						     dma_flags);
>  			if (likely(tx))
>  				break;
> @@ -129,6 +100,7 @@ do_async_gen_syndrome(struct dma_chan *chan, struct page **blocks,
>  			dma_async_issue_pending(chan);
>  		}
>  
> +		dma_set_unmap(tx, unmap);
>  		async_tx_submit(chan, tx, submit);
>  		submit->depend_tx = tx;
>  
> @@ -188,10 +160,6 @@ do_sync_gen_syndrome(struct page **blocks, unsigned int offset, int disks,
>   * set to NULL those buffers will be replaced with the raid6_zero_page
>   * in the synchronous path and omitted in the hardware-asynchronous
>   * path.
> - *
> - * 'blocks' note: if submit->scribble is NULL then the contents of
> - * 'blocks' may be overwritten to perform address conversions
> - * (dma_map_page() or page_address()).
>   */
>  struct dma_async_tx_descriptor *
>  async_gen_syndrome(struct page **blocks, unsigned int offset, int disks,
> @@ -202,26 +170,69 @@ async_gen_syndrome(struct page **blocks, unsigned int offset, int disks,
>  						      &P(blocks, disks), 2,
>  						      blocks, src_cnt, len);
>  	struct dma_device *device = chan ? chan->device : NULL;
> -	dma_addr_t *dma_src = NULL;
> +	struct dmaengine_unmap_data *unmap = NULL;
>  
>  	BUG_ON(disks > 255 || !(P(blocks, disks) || Q(blocks, disks)));
>  
> -	if (submit->scribble)
> -		dma_src = submit->scribble;
> -	else if (sizeof(dma_addr_t) <= sizeof(struct page *))
> -		dma_src = (dma_addr_t *) blocks;
> +	if (device)
> +		unmap = dmaengine_get_unmap_data(device->dev, disks, GFP_NOIO);
>  
> -	if (dma_src && device &&
> +	if (unmap &&
>  	    (src_cnt <= dma_maxpq(device, 0) ||
>  	     dma_maxpq(device, DMA_PREP_CONTINUE) > 0) &&
>  	    is_dma_pq_aligned(device, offset, 0, len)) {
> +		struct dma_async_tx_descriptor *tx;
> +		enum dma_ctrl_flags dma_flags = 0;
> +		unsigned char coefs[src_cnt];
> +		int i, j;
> +
>  		/* run the p+q asynchronously */
>  		pr_debug("%s: (async) disks: %d len: %zu\n",
>  			 __func__, disks, len);
> -		return do_async_gen_syndrome(chan, blocks, raid6_gfexp, offset,
> -					     disks, len, dma_src, submit);
> +
> +		/* convert source addresses being careful to collapse 'empty'
> +		 * sources and update the coefficients accordingly
> +		 */
> +		unmap->len = len;
> +		for (i = 0, j = 0; i < src_cnt; i++) {
> +			if (blocks[i] == NULL)
> +				continue;
> +			unmap->addr[j] = dma_map_page(device->dev, blocks[i], offset,
> +						      len, DMA_TO_DEVICE);
> +			coefs[j] = raid6_gfexp[i];
> +			unmap->to_cnt++;
> +			j++;
> +		}
> +
> +		/*
> +		 * DMAs use destinations as sources,
> +		 * so use BIDIRECTIONAL mapping
> +		 */
> +		unmap->bidi_cnt++;
> +		if (P(blocks, disks))
> +			unmap->addr[j++] = dma_map_page(device->dev, P(blocks, disks),
> +							offset, len, DMA_BIDIRECTIONAL);
> +		else {
> +			unmap->addr[j++] = 0;
> +			dma_flags |= DMA_PREP_PQ_DISABLE_P;
> +		}
> +
> +		unmap->bidi_cnt++;
> +		if (Q(blocks, disks))
> +			unmap->addr[j++] = dma_map_page(device->dev, Q(blocks, disks),
> +						       offset, len, DMA_BIDIRECTIONAL);
> +		else {
> +			unmap->addr[j++] = 0;
> +			dma_flags |= DMA_PREP_PQ_DISABLE_Q;
> +		}
> +
> +		tx = do_async_gen_syndrome(chan, coefs, j, unmap, dma_flags, submit);
> +		dmaengine_unmap_put(unmap);
> +		return tx;
>  	}
>  
> +	dmaengine_unmap_put(unmap);
> +
>  	/* run the pq synchronously */
>  	pr_debug("%s: (sync) disks: %d len: %zu\n", __func__, disks, len);
>  
> diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
> index 1b76227..5a3c7c0 100644
> --- a/drivers/dma/dmaengine.c
> +++ b/drivers/dma/dmaengine.c
> @@ -909,9 +909,12 @@ static void dmaengine_unmap(struct kref *kref)
>  		dma_unmap_page(dev, unmap->addr[i], unmap->len,
>  			       DMA_FROM_DEVICE);
>  	cnt += unmap->bidi_cnt;
> -	for (; i < cnt; i++)
> +	for (; i < cnt; i++) {
> +		if (unmap->addr[i] == 0)
> +			continue;
>  		dma_unmap_page(dev, unmap->addr[i], unmap->len,
>  			       DMA_BIDIRECTIONAL);
> +	}
>  	kmem_cache_free(__get_unmap_pool(cnt)->cache, unmap);
>  }

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung Poland R&D Center


* Re: [PATCH 10/12] async_pq_val: convert to dmaengine_unmap_data
  2012-12-06  9:26 ` [PATCH 10/12] async_pq_val: " Dan Williams
@ 2012-12-06 15:48   ` Bartlomiej Zolnierkiewicz
  0 siblings, 0 replies; 24+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2012-12-06 15:48 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-kernel, linux, vinod.koul, Tomasz Figa, Kyungmin Park, dave.jiang

On Thursday 06 December 2012 10:26:05 Dan Williams wrote:
> Use the generic unmap object to unmap dma buffers.
> 
> Cc: Tomasz Figa <t.figa@samsung.com>
> Cc: Kyungmin Park <kyungmin.park@samsung.com>
> Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
> Signed-off-by: Dan Williams <djbw@fb.com>
> ---
>  crypto/async_tx/async_pq.c |   58 +++++++++++++++++++++++++++-----------------
>  1 file changed, 35 insertions(+), 23 deletions(-)
> 
> diff --git a/crypto/async_tx/async_pq.c b/crypto/async_tx/async_pq.c
> index 1d78984..e5ddb31 100644
> --- a/crypto/async_tx/async_pq.c
> +++ b/crypto/async_tx/async_pq.c
> @@ -288,50 +288,60 @@ async_syndrome_val(struct page **blocks, unsigned int offset, int disks,
>  	struct dma_async_tx_descriptor *tx;
>  	unsigned char coefs[disks-2];
>  	enum dma_ctrl_flags dma_flags = submit->cb_fn ? DMA_PREP_INTERRUPT : 0;
> -	dma_addr_t *dma_src = NULL;
> -	int src_cnt = 0;
> +	struct dmaengine_unmap_data *unmap = NULL;
>  
>  	BUG_ON(disks < 4);
>  
> -	if (submit->scribble)
> -		dma_src = submit->scribble;
> -	else if (sizeof(dma_addr_t) <= sizeof(struct page *))
> -		dma_src = (dma_addr_t *) blocks;
> +	if (device)
> +		unmap = dmaengine_get_unmap_data(device->dev, disks, GFP_NOIO);
>  
> -	if (dma_src && device && disks <= dma_maxpq(device, 0) &&
> +	if (unmap && disks <= dma_maxpq(device, 0) &&
>  	    is_dma_pq_aligned(device, offset, 0, len)) {
>  		struct device *dev = device->dev;
> -		dma_addr_t *pq = &dma_src[disks-2];
> -		int i;
> +		dma_addr_t pq[2];
> +		int i, j = 0, src_cnt = 0;
>  
>  		pr_debug("%s: (async) disks: %d len: %zu\n",
>  			 __func__, disks, len);
> -		if (!P(blocks, disks))
> +
> +		unmap->len = len;
> +		for (i = 0; i < disks-2; i++)
> +			if (likely(blocks[i])) {
> +				unmap->addr[j] = dma_map_page(dev, blocks[i],
> +							      offset, len,
> +							      DMA_TO_DEVICE);
> +				coefs[j] = raid6_gfexp[i];
> +				unmap->to_cnt++;
> +				src_cnt++;
> +				j++;
> +			}
> +
> +		if (!P(blocks, disks)) {
> +			pq[0] = 0;
>  			dma_flags |= DMA_PREP_PQ_DISABLE_P;
> -		else
> +		} else {
>  			pq[0] = dma_map_page(dev, P(blocks, disks),
>  					     offset, len,
>  					     DMA_TO_DEVICE);
> -		if (!Q(blocks, disks))
> +			unmap->addr[j++] = pq[0];
> +			unmap->to_cnt++;
> +		}
> +		if (!Q(blocks, disks)) {
> +			pq[1] = 0;
>  			dma_flags |= DMA_PREP_PQ_DISABLE_Q;
> -		else
> +		} else {
>  			pq[1] = dma_map_page(dev, Q(blocks, disks),
>  					     offset, len,
>  					     DMA_TO_DEVICE);
> +			unmap->addr[j++] = pq[1];
> +			unmap->to_cnt++;
> +		}
>  
>  		if (submit->flags & ASYNC_TX_FENCE)
>  			dma_flags |= DMA_PREP_FENCE;
> -		for (i = 0; i < disks-2; i++)
> -			if (likely(blocks[i])) {
> -				dma_src[src_cnt] = dma_map_page(dev, blocks[i],
> -								offset, len,
> -								DMA_TO_DEVICE);
> -				coefs[src_cnt] = raid6_gfexp[i];
> -				src_cnt++;
> -			}
> -
>  		for (;;) {
> -			tx = device->device_prep_dma_pq_val(chan, pq, dma_src,
> +			tx = device->device_prep_dma_pq_val(chan, pq,
> +							    unmap->addr,
>  							    src_cnt,
>  							    coefs,
>  							    len, pqres,
> @@ -341,6 +351,8 @@ async_syndrome_val(struct page **blocks, unsigned int offset, int disks,
>  			async_tx_quiesce(&submit->depend_tx);
>  			dma_async_issue_pending(chan);
>  		}
> +
> +		dma_set_unmap(tx, unmap);
>  		async_tx_submit(chan, tx, submit);
>  
>  		return tx;

What happened to the dmaengine_unmap_put() calls?  The reference taken
by dmaengine_get_unmap_data() is never dropped here.
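
For reference, a sketch of where the puts belong, mirroring the other
conversions in this series (the synchronous fallback path would need
one as well):

		dma_set_unmap(tx, unmap);
		async_tx_submit(chan, tx, submit);
		dmaengine_unmap_put(unmap);	/* drop the allocation reference */

		return tx;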

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung Poland R&D Center


* dmaengine_unmap_data patches (was: Re: Re: [PATCH 02/12] dmaengine: prepare for generic 'unmap' data)
  2012-12-06 15:47   ` Bartlomiej Zolnierkiewicz
@ 2013-06-26 10:44     ` Bartlomiej Zolnierkiewicz
  2013-06-28 17:35       ` Dan Williams
  0 siblings, 1 reply; 24+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2013-06-26 10:44 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-kernel, vinod.koul, linux, dave.jiang


Hi,

The original discussion is here:

	https://lkml.org/lkml/2012/12/6/71

Unfortunately there hasn't been any progress on these patches since
last December.  Dan, if you lack the time to polish and push them,
would you like me to take them over?

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics

On Thursday, December 06, 2012 04:47:51 PM Bartlomiej Zolnierkiewicz wrote:
> On Thursday 06 December 2012 10:25:20 Dan Williams wrote:
> > Add a hook for a common dma unmap implementation to enable removal of
> > the per driver custom unmap code.  (A reworked version of Bartlomiej
> > Zolnierkiewicz's patches to remove the custom callbacks and the size
> > increase of dma_async_tx_descriptor for drivers that don't care about raid).
> > 
> > Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
> > Signed-off-by: Dan Williams <djbw@fb.com>
> 
> Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
> 
> Thanks for reworking my patches!  The patchset looks generally
> good, but there are a few issues that should be fixed (please see
> the other mails).
> 
> Best regards,
> --
> Bartlomiej Zolnierkiewicz
> Samsung Poland R&D Center



* Re: dmaengine_unmap_data patches (was: Re: Re: [PATCH 02/12] dmaengine: prepare for generic 'unmap' data)
  2013-06-26 10:44     ` dmaengine_unmap_data patches (was: Re: Re: [PATCH 02/12] dmaengine: prepare for generic 'unmap' data) Bartlomiej Zolnierkiewicz
@ 2013-06-28 17:35       ` Dan Williams
  0 siblings, 0 replies; 24+ messages in thread
From: Dan Williams @ 2013-06-28 17:35 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz; +Cc: linux-kernel, vinod.koul, linux, dave.jiang



On 6/26/13 3:44 AM, "Bartlomiej Zolnierkiewicz" <b.zolnierkie@samsung.com>
wrote:

>
>Hi,
>
>The original discussion is here:
>
>	https://lkml.org/lkml/2012/12/6/71
>
>Unfortunately there hasn't been any progress on these patches since
>last December.

Apologies, indeed these have languished.  I don't expect to have them
polished and tested in time for the upcoming merge window, but let's get
these in shape for 3.12.  I'll circle back in a few weeks unless you beat
me to it.

--
Dan



Thread overview: 24+ messages
2012-12-06  9:25 [PATCH 00/12] dmaengine_unmap_data Dan Williams
2012-12-06  9:25 ` [PATCH 01/12] dmaengine: consolidate memcpy apis Dan Williams
2012-12-06  9:25 ` [PATCH 02/12] dmaengine: prepare for generic 'unmap' data Dan Williams
2012-12-06 15:47   ` Bartlomiej Zolnierkiewicz
2013-06-26 10:44     ` dmaengine_unmap_data patches (was: Re: Re: [PATCH 02/12] dmaengine: prepare for generic 'unmap' data) Bartlomiej Zolnierkiewicz
2013-06-28 17:35       ` Dan Williams
2012-12-06  9:25 ` [PATCH 03/12] dmaengine: reference counted unmap data Dan Williams
2012-12-06 15:47   ` Bartlomiej Zolnierkiewicz
2012-12-06  9:25 ` [PATCH 04/12] async_memcpy: convert to dmaengine_unmap_data Dan Williams
2012-12-06 15:47   ` Bartlomiej Zolnierkiewicz
2012-12-06  9:25 ` [PATCH 05/12] async_memset: " Dan Williams
2012-12-06 15:48   ` Bartlomiej Zolnierkiewicz
2012-12-06  9:25 ` [PATCH 06/12] async_xor: " Dan Williams
2012-12-06 15:48   ` Bartlomiej Zolnierkiewicz
2012-12-06  9:25 ` [PATCH 07/12] async_xor_val: " Dan Williams
2012-12-06 15:48   ` Bartlomiej Zolnierkiewicz
2012-12-06  9:25 ` [PATCH 08/12] async_raid6_recov: " Dan Williams
2012-12-06 15:48   ` Bartlomiej Zolnierkiewicz
2012-12-06  9:25 ` [PATCH 09/12] async_pq: " Dan Williams
2012-12-06 15:48   ` Bartlomiej Zolnierkiewicz
2012-12-06  9:26 ` [PATCH 10/12] async_pq_val: " Dan Williams
2012-12-06 15:48   ` Bartlomiej Zolnierkiewicz
2012-12-06  9:26 ` [PATCH 11/12] dmaengine: remove DMA unmap from drivers Dan Williams
2012-12-06  9:26 ` [PATCH 12/12] dmaengine: remove DMA unmap flags Dan Williams
