* [PATCH 0/3] DaVinci DMA engine conversion
@ 2012-08-16 21:44 Matt Porter
  2012-08-16 21:44 ` [PATCH 1/3] dmaengine: add TI EDMA DMA engine driver Matt Porter
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Matt Porter @ 2012-08-16 21:44 UTC (permalink / raw)
  To: vinod.koul, dan.j.williams, cjb, grant.likely
  Cc: Linux Kernel Mailing List, Linux ARM Kernel List, Linux MMC List,
	Linux SPI Devel List, Linux DaVinci Kernel List, Sekhar Nori

This series begins the conversion of the DaVinci private
EDMA API implementation to a DMA engine driver and
converts two of the three in-kernel users of the private
EDMA API to DMA engine.

The approach taken is similar to the recent OMAP DMA
Engine conversion. The EDMA DMA Engine driver is a
wrapper around the existing private EDMA implementation
and registers the platform device within the driver.
This allows the conversion series to stand alone with
just the drivers and no changes to platform code. It
also allows peripheral drivers to continue to use the
private EDMA implementation until they are converted.
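
For illustration, that self-contained registration is
condensed below from patch 1 (error handling trimmed;
all names are the driver's own):

static struct platform_device *pdev;

static const struct platform_device_info edma_dev_info = {
	.name = "edma-dma-engine",
	.id = -1,
	.dma_mask = DMA_BIT_MASK(32),
};

static int edma_init(void)
{
	/* register the driver, then create its device from within it */
	int ret = platform_driver_register(&edma_driver);

	if (ret == 0)
		pdev = platform_device_register_full(&edma_dev_info);

	return ret;
}
subsys_initcall(edma_init);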

The EDMA DMA Engine driver supports only slave transfers
at this time. Cyclic transfer support is planned as a
follow-on for audio peripherals.

There are currently three users of the private EDMA API
in the kernel: davinci_mmc, spi-davinci, and davinci-mcasp.
This series provides DMA Engine conversions for the
davinci_mmc and spi-davinci drivers, both of which need
only the slave transfers supported here.

This series has been tested on an AM18x EVM and
performance is comparable to the private EDMA API
implementations. I do not have a WiFi module for the
AM18x EVM to test the MMC1 instance, which has DMA
channels on the second EDMA channel controller
instance. Testing is needed on all DaVinci platforms
including DM355/365, DM644x/6x, DA830/OMAP-L137/AM17x,
and DA850/OMAP-L138/AM18x.

After this series, the current plan is to complete the
McASP driver conversion, which includes adding cyclic
DMA support. That will allow the private EDMA API
functionality to be removed and refactored into the
EDMA DMA Engine driver. Since EDMA is also used on the
AM33xx family of parts in mach-omap2/, the plan is to
enable this driver on that platform as well.
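
For context, a converted peripheral driver picks up its
channel through the generic dmaengine API plus the
exported filter, roughly as in the sketch below. This
mirrors what the mmc and spi conversions do; the helper
name is made up here, and ch_num is the EDMA channel
number the driver already receives via platform data:

#include <linux/dmaengine.h>
#include <linux/edma.h>

static struct dma_chan *request_edma_chan(unsigned ch_num)
{
	dma_cap_mask_t mask;

	dma_cap_zero(mask);
	dma_cap_set(DMA_SLAVE, mask);

	/* edma_filter_fn() matches the channel numbered *(unsigned *)param */
	return dma_request_channel(mask, edma_filter_fn, &ch_num);
}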

Matt Porter (3):
  dmaengine: add TI EDMA DMA engine driver
  mmc: davinci_mmc: convert to DMA engine API
  spi: spi-davinci: convert to DMA engine API

 drivers/dma/Kconfig            |    9 +
 drivers/dma/Makefile           |    1 +
 drivers/dma/edma.c             |  729 ++++++++++++++++++++++++++++++++++++++++
 drivers/mmc/host/davinci_mmc.c |  271 +++++----------
 drivers/spi/spi-davinci.c      |  292 +++++++---------
 include/linux/edma.h           |   29 ++
 6 files changed, 980 insertions(+), 351 deletions(-)
 create mode 100644 drivers/dma/edma.c
 create mode 100644 include/linux/edma.h

-- 
1.7.9.5



* [PATCH 1/3] dmaengine: add TI EDMA DMA engine driver
  2012-08-16 21:44 [PATCH 0/3] DaVinci DMA engine conversion Matt Porter
@ 2012-08-16 21:44 ` Matt Porter
  2012-08-16 22:29   ` Russell King - ARM Linux
  2012-08-21 18:20   ` Matt Porter
  2012-08-16 21:44 ` [PATCH 2/3] mmc: davinci_mmc: convert to DMA engine API Matt Porter
  2012-08-16 21:44 ` [PATCH 3/3] spi: spi-davinci: " Matt Porter
  2 siblings, 2 replies; 7+ messages in thread
From: Matt Porter @ 2012-08-16 21:44 UTC (permalink / raw)
  To: vinod.koul, dan.j.williams, cjb, grant.likely
  Cc: Linux Kernel Mailing List, Linux ARM Kernel List, Linux MMC List,
	Linux SPI Devel List, Linux DaVinci Kernel List, Sekhar Nori

Add a DMA engine driver for the TI EDMA controller. This driver
is implemented as a wrapper around the existing DaVinci private
DMA implementation. This approach allows for incremental conversion
of each peripheral driver to the DMA engine API. The EDMA driver
supports slave transfers but does not yet support cyclic transfers.

Signed-off-by: Matt Porter <mporter@ti.com>
---
 drivers/dma/Kconfig  |    9 +
 drivers/dma/Makefile |    1 +
 drivers/dma/edma.c   |  729 ++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/edma.h |   29 ++
 4 files changed, 768 insertions(+)
 create mode 100644 drivers/dma/edma.c
 create mode 100644 include/linux/edma.h

diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index d06ea29..d9f17d4 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -208,6 +208,15 @@ config SIRF_DMA
 	help
 	  Enable support for the CSR SiRFprimaII DMA engine.
 
+config TI_EDMA
+	tristate "TI EDMA support"
+	depends on ARCH_DAVINCI
+	select DMA_ENGINE
+	default y
+	help
+	  Enable support for the TI EDMA controller. This DMA
+	  engine is found on TI DaVinci and AM33xx parts.
+
 config ARCH_HAS_ASYNC_TX_FIND_CHANNEL
 	bool
 
diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
index 4cf6b12..f5cf310 100644
--- a/drivers/dma/Makefile
+++ b/drivers/dma/Makefile
@@ -23,6 +23,7 @@ obj-$(CONFIG_IMX_DMA) += imx-dma.o
 obj-$(CONFIG_MXS_DMA) += mxs-dma.o
 obj-$(CONFIG_TIMB_DMA) += timb_dma.o
 obj-$(CONFIG_SIRF_DMA) += sirf-dma.o
+obj-$(CONFIG_TI_EDMA) += edma.o
 obj-$(CONFIG_STE_DMA40) += ste_dma40.o ste_dma40_ll.o
 obj-$(CONFIG_TEGRA20_APB_DMA) += tegra20-apb-dma.o
 obj-$(CONFIG_PL330_DMA) += pl330.o
diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
new file mode 100644
index 0000000..ba4de75
--- /dev/null
+++ b/drivers/dma/edma.c
@@ -0,0 +1,729 @@
+/*
+ * TI EDMA DMA engine driver
+ *
+ * Copyright 2012 Texas Instruments
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/module.h>
+#include <linux/dmaengine.h>
+#include <linux/dma-mapping.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/slab.h>
+#include <linux/platform_device.h>
+
+#include <mach/edma.h>
+
+#include "dmaengine.h"
+
+#define EDMA_MAX_CHANNELS	64
+#define EDMA_DESCRIPTORS	16
+/* Max of 16 segments per channel to conserve PaRAM slots */
+#define MAX_NR_SG		16
+#define EDMA_MAX_SLOTS		MAX_NR_SG
+
+struct edma_desc {
+	struct dma_async_tx_descriptor	txd;
+	struct list_head		node;
+
+	struct edmacc_param		pset[EDMA_MAX_SLOTS];
+	int				pset_nr;
+	int				absync;
+};
+
+struct edma_cc;
+
+struct edma_chan {
+	struct dma_chan			chan;
+	struct list_head		free;
+	struct list_head		prepared;
+	struct list_head		queued;
+	struct list_head		active;
+	struct list_head		completed;
+	struct tasklet_struct		work;
+
+	struct edma_cc			*ecc;
+	int				ch_num;
+	bool				alloced;
+	int				slot[EDMA_MAX_SLOTS];
+
+	dma_addr_t			addr;
+	int				addr_width;
+	int				maxburst;
+
+	/* Lock for this structure */
+	spinlock_t			lock;
+};
+
+/* EDMA channel controller */
+struct edma_cc {
+	struct dma_device		dma_slave;
+	struct edma_chan		slave_chans[EDMA_MAX_CHANNELS];
+	int				num_slave_chans;
+	int				dummy_slot[2];
+};
+
+/* Convert struct dma_chan to struct edma_chan */
+static inline
+struct edma_chan *to_edma_chan(struct dma_chan *c)
+{
+	return container_of(c, struct edma_chan, chan);
+}
+
+/* Dispatch a queued descriptor to the controller (caller holds lock) */
+static void edma_execute(struct edma_chan *echan)
+{
+	struct edma_desc *edesc = NULL;
+	int i;
+
+	/* Move the first queued descriptor to active list */
+	edesc = list_first_entry(&echan->queued, struct edma_desc,
+		node);
+	list_move_tail(&edesc->node, &echan->active);
+
+	/* Write descriptor PaRAM set(s) */
+	for (i = 0; i < edesc->pset_nr; i++) {
+		edma_write_slot(echan->slot[i], &edesc->pset[i]);
+		dev_dbg(echan->chan.device->dev,
+			"\n pset[%d]:\n"
+			"  chnum\t%d\n"
+			"  slot\t%d\n"
+			"  opt\t%08x\n"
+			"  src\t%08x\n"
+			"  dst\t%08x\n"
+			"  abcnt\t%08x\n"
+			"  ccnt\t%08x\n"
+			"  bidx\t%08x\n"
+			"  cidx\t%08x\n"
+			"  lkrld\t%08x\n",
+			i, echan->ch_num, echan->slot[i],
+			edesc->pset[i].opt,
+			edesc->pset[i].src,
+			edesc->pset[i].dst,
+			edesc->pset[i].a_b_cnt,
+			edesc->pset[i].ccnt,
+			edesc->pset[i].src_dst_bidx,
+			edesc->pset[i].src_dst_cidx,
+			edesc->pset[i].link_bcntrld);
+		/* Link to the previous slot if not the last set */
+		if (i != (edesc->pset_nr - 1))
+			edma_link(echan->slot[i], echan->slot[i+1]);
+		/* Final pset links to the dummy pset */
+		else
+			edma_link(echan->slot[i],
+				echan->ecc->dummy_slot[
+				EDMA_CTLR(echan->ch_num)]);
+	}
+
+	edma_start(echan->ch_num);
+}
+
+/* Process completed descriptors */
+static void edma_process_completed(struct edma_chan *echan)
+{
+	struct edma_desc *edesc;
+	struct dma_async_tx_descriptor *txd;
+	unsigned long flags;
+	LIST_HEAD(list);
+
+	/* Get all completed descriptors */
+	if (!list_empty(&echan->completed)) {
+		spin_lock_irqsave(&echan->lock, flags);
+		list_splice_tail_init(&echan->completed, &list);
+		spin_unlock_irqrestore(&echan->lock, flags);
+
+		/* Execute callbacks and run dependencies */
+		list_for_each_entry(edesc, &list, node) {
+			txd = &edesc->txd;
+
+			dma_cookie_complete(txd);
+
+			if (txd->callback)
+				txd->callback(txd->callback_param);
+
+			dma_run_dependencies(txd);
+		}
+
+		/* Free descriptors */
+		spin_lock_irqsave(&echan->lock, flags);
+		list_splice_tail_init(&list, &echan->free);
+		spin_unlock_irqrestore(&echan->lock, flags);
+	}
+}
+
+/* EDMA tasklet */
+static void edma_work(unsigned long data)
+{
+	struct edma_chan *echan = (void *)data;
+
+	edma_process_completed(echan);
+}
+
+/* Queue descriptor for processing */
+static dma_cookie_t edma_tx_submit(struct dma_async_tx_descriptor *txd)
+{
+	struct edma_chan *echan = to_edma_chan(txd->chan);
+	struct edma_desc *edesc;
+	dma_cookie_t cookie;
+	unsigned long flags;
+
+	edesc = container_of(txd, struct edma_desc, txd);
+
+	spin_lock_irqsave(&echan->lock, flags);
+
+	/* Assign a cookie and queue it */
+	cookie = dma_cookie_assign(&edesc->txd);
+	list_move_tail(&edesc->node, &echan->queued);
+
+	spin_unlock_irqrestore(&echan->lock, flags);
+
+	return cookie;
+}
+
+static int edma_slave_config(struct edma_chan *echan,
+	struct dma_slave_config *config)
+{
+	if ((config->src_addr_width > DMA_SLAVE_BUSWIDTH_4_BYTES) ||
+		(config->dst_addr_width > DMA_SLAVE_BUSWIDTH_4_BYTES))
+		return -EINVAL;
+
+	if (config->direction == DMA_MEM_TO_DEV) {
+		if (config->dst_addr)
+			echan->addr = config->dst_addr;
+		if (config->dst_addr_width)
+			echan->addr_width = config->dst_addr_width;
+		if (config->dst_maxburst)
+			echan->maxburst = config->dst_maxburst;
+	} else if (config->direction == DMA_DEV_TO_MEM) {
+		if (config->src_addr)
+			echan->addr = config->src_addr;
+		if (config->src_addr_width)
+			echan->addr_width = config->src_addr_width;
+		if (config->src_maxburst)
+			echan->maxburst = config->src_maxburst;
+	}
+
+	return 0;
+}
+
+static int edma_control(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
+			unsigned long arg)
+{
+	int ret = 0;
+	struct dma_slave_config *config;
+	struct edma_chan *echan = to_edma_chan(chan);
+
+	switch (cmd) {
+	case DMA_TERMINATE_ALL:
+		edma_stop(echan->ch_num);
+		break;
+	case DMA_SLAVE_CONFIG:
+		config = (struct dma_slave_config *)arg;
+		ret = edma_slave_config(echan, config);
+		break;
+	default:
+		ret = -ENOSYS;
+	}
+
+	return ret;
+}
+
+static struct dma_async_tx_descriptor *edma_prep_slave_sg(
+	struct dma_chan *chan, struct scatterlist *sgl,
+	unsigned int sg_len, enum dma_transfer_direction direction,
+	unsigned long dma_flags, void *context)
+{
+	struct edma_chan *echan = to_edma_chan(chan);
+	struct device *dev = echan->chan.device->dev;
+	struct edma_desc *edesc = NULL;
+	struct scatterlist *sg;
+	unsigned long flags;
+	int i;
+	int acnt, bcnt, ccnt, src, dst, cidx;
+	int src_bidx, dst_bidx, src_cidx, dst_cidx;
+
+	if (unlikely(!echan || !sgl || !sg_len))
+		return NULL;
+
+	if (echan->addr_width == DMA_SLAVE_BUSWIDTH_UNDEFINED) {
+		dev_err(dev, "Undefined slave buswidth\n");
+		return NULL;
+	}
+
+	if (sg_len > MAX_NR_SG) {
+		dev_err(dev, "Exceeded max SG segments %d > %d\n",
+			sg_len, MAX_NR_SG);
+		return NULL;
+	}
+
+	spin_lock_irqsave(&echan->lock, flags);
+	/* Get free descriptor */
+	if (!list_empty(&echan->free)) {
+		edesc = list_first_entry(&echan->free,
+				struct edma_desc, node);
+		list_del_init(&edesc->node);
+	}
+	spin_unlock_irqrestore(&echan->lock, flags);
+
+	if (!edesc) {
+		dev_dbg(dev, "Out of descriptors for channel %d",
+			echan->ch_num);
+		/* Try to free completed descriptors */
+		edma_process_completed(echan);
+		return NULL;
+	}
+
+	edesc->pset_nr = sg_len;
+
+	for_each_sg(sgl, sg, sg_len, i) {
+		/* Allocate a PaRAM slot, if needed */
+		if (echan->slot[i] < 0) {
+			echan->slot[i] =
+				edma_alloc_slot(EDMA_CTLR(echan->ch_num),
+						EDMA_SLOT_ANY);
+			if (echan->slot[i] < 0) {
+				dev_err(dev, "Failed to allocate slot\n");
+				return NULL;
+			}
+		}
+
+		acnt = echan->addr_width;
+
+		/*
+		 * If the maxburst is equal to the fifo width, use
+		 * A-synced transfers. This allows for large contiguous
+		 * buffer transfers using only one PaRAM set.
+		 */
+		if (echan->maxburst == 1) {
+			edesc->absync = false;
+			ccnt = sg_dma_len(sg) / acnt / (SZ_64K - 1);
+			bcnt = sg_dma_len(sg) / acnt - ccnt * (SZ_64K - 1);
+			if (bcnt)
+				ccnt++;
+			else
+				bcnt = SZ_64K - 1;
+			cidx = acnt;
+		/*
+		 * If maxburst is greater than the fifo address_width,
+		 * use AB-synced transfers where A count is the fifo
+		 * address_width and B count is the maxburst. In this
+		 * case, we are limited to transfers of C count frames
+		 * of (address_width * maxburst) where C count is limited
+		 * to SZ_64K-1. This places an upper bound on the length
+		 * of an SG segment that can be handled.
+		 */
+		} else {
+			edesc->absync = true;
+			bcnt = echan->maxburst;
+			ccnt = sg_dma_len(sg) / (acnt * bcnt);
+			if (ccnt > (SZ_64K - 1)) {
+				dev_err(dev, "Exceeded max SG segment size\n");
+				return NULL;
+			}
+			cidx = acnt * bcnt;
+		}
+
+		if (direction == DMA_MEM_TO_DEV) {
+			src = sg_dma_address(sg);
+			dst = echan->addr;
+			src_bidx = acnt;
+			src_cidx = cidx;
+			dst_bidx = 0;
+			dst_cidx = 0;
+		} else {
+			src = echan->addr;
+			dst = sg_dma_address(sg);
+			src_bidx = 0;
+			src_cidx = 0;
+			dst_bidx = acnt;
+			dst_cidx = cidx;
+		}
+
+		edesc->pset[i].opt = EDMA_TCC(EDMA_CHAN_SLOT(echan->ch_num));
+		/* Configure A or AB synchronized transfers */
+		if (edesc->absync)
+			edesc->pset[i].opt |= SYNCDIM;
+		/* If this is the last set, enable completion interrupt flag */
+		if (i == sg_len - 1)
+			edesc->pset[i].opt |= TCINTEN;
+
+		edesc->pset[i].src = src;
+		edesc->pset[i].dst = dst;
+
+		edesc->pset[i].src_dst_bidx = (dst_bidx << 16) | src_bidx;
+		edesc->pset[i].src_dst_cidx = (dst_cidx << 16) | src_cidx;
+
+		edesc->pset[i].a_b_cnt = bcnt << 16 | acnt;
+		edesc->pset[i].ccnt = ccnt;
+		edesc->pset[i].link_bcntrld = 0xffffffff;
+
+	}
+	edesc->txd.flags = dma_flags;
+
+	/* Place descriptor in prepared list */
+	spin_lock_irqsave(&echan->lock, flags);
+	list_add_tail(&edesc->node, &echan->prepared);
+	spin_unlock_irqrestore(&echan->lock, flags);
+
+	return &edesc->txd;
+}
+
+static void edma_callback(unsigned ch_num, u16 ch_status, void *data)
+{
+	struct edma_chan *echan = to_edma_chan((struct dma_chan *)data);
+	struct device *dev = echan->chan.device->dev;
+	struct edma_desc *edesc = NULL;
+
+	/* Stop the channel */
+	edma_stop(echan->ch_num);
+
+	switch (ch_status) {
+	case DMA_COMPLETE:
+		dev_dbg(dev, "transfer complete on channel %d\n", ch_num);
+
+		spin_lock(&echan->lock);
+
+		/* Grab the first active descriptor */
+		edesc = list_first_entry(&echan->active, struct edma_desc,
+			node);
+
+		/* Mark the descriptor completed */
+		list_move_tail(&edesc->node, &echan->completed);
+
+		/* Execute queued descriptors */
+		if (!list_empty(&echan->queued)) {
+			dev_dbg(dev, "descriptor queued on channel %d\n",
+				ch_num);
+			edma_execute(echan);
+		}
+
+		spin_unlock(&echan->lock);
+
+		/* Schedule work */
+		tasklet_schedule(&echan->work);
+
+		break;
+	case DMA_CC_ERROR:
+		dev_dbg(dev, "transfer error on channel %d\n", ch_num);
+		break;
+	default:
+		break;
+	}
+}
+
+/* Alloc channel resources */
+static int edma_alloc_chan_resources(struct dma_chan *chan)
+{
+	struct edma_chan *echan = to_edma_chan(chan);
+	struct device *dev = echan->chan.device->dev;
+	struct edma_desc *edesc;
+	int ret;
+	int a_ch_num;
+	LIST_HEAD(descs);
+	int i;
+	unsigned long flags;
+
+	a_ch_num = edma_alloc_channel(echan->ch_num, edma_callback,
+					chan, EVENTQ_DEFAULT);
+
+	if (a_ch_num < 0) {
+		ret = -ENODEV;
+		goto err_no_chan;
+	}
+
+	if (a_ch_num != echan->ch_num) {
+		dev_err(dev, "failed to allocate requested channel %d\n",
+			echan->ch_num);
+		ret = -ENODEV;
+		goto err_wrong_chan;
+	}
+
+	dma_cookie_init(&echan->chan);
+	echan->alloced = true;
+	echan->slot[0] = echan->ch_num;
+
+	/* Alloc descriptors for this channel */
+	for (i = 0; i < EDMA_DESCRIPTORS; i++) {
+		edesc = kzalloc(sizeof(*edesc), GFP_KERNEL);
+		if (!edesc) {
+			dev_notice(dev,
+				"allocated only %u descriptors\n", i);
+			break;
+		}
+
+		dma_async_tx_descriptor_init(&edesc->txd, chan);
+		edesc->txd.flags = DMA_CTRL_ACK;
+		edesc->txd.tx_submit = edma_tx_submit;
+
+		list_add_tail(&edesc->node, &descs);
+	}
+
+	/* Return error only if no descriptors were allocated */
+	if (i == 0) {
+		ret = -ENOMEM;
+		goto err_alloc_desc;
+	}
+
+	spin_lock_irqsave(&echan->lock, flags);
+
+	list_splice_tail_init(&descs, &echan->free);
+
+	spin_unlock_irqrestore(&echan->lock, flags);
+
+	return 0;
+
+err_alloc_desc:
+err_wrong_chan:
+	edma_free_channel(a_ch_num);
+err_no_chan:
+	return ret;
+}
+
+/* Free channel resources */
+static void edma_free_chan_resources(struct dma_chan *chan)
+{
+	struct edma_chan *echan = to_edma_chan(chan);
+	struct edma_desc *edesc, *tmp;
+	unsigned long flags;
+	LIST_HEAD(descs);
+	int i;
+
+	/* Terminate transfers */
+	edma_stop(echan->ch_num);
+
+	spin_lock_irqsave(&echan->lock, flags);
+
+	/* Channel must be idle */
+	BUG_ON(!list_empty(&echan->prepared));
+	BUG_ON(!list_empty(&echan->queued));
+	BUG_ON(!list_empty(&echan->active));
+	BUG_ON(!list_empty(&echan->completed));
+
+	list_splice_tail_init(&echan->free, &descs);
+
+	spin_unlock_irqrestore(&echan->lock, flags);
+
+	/* Free descriptors */
+	list_for_each_entry_safe(edesc, tmp, &descs, node)
+		kfree(edesc);
+
+	/* Free EDMA PaRAM slots */
+	for (i = 1; i < EDMA_MAX_SLOTS; i++) {
+		if (echan->slot[i] >= 0) {
+			edma_free_slot(echan->slot[i]);
+			echan->slot[i] = -1;
+		}
+	}
+
+	/* Free EDMA channel */
+	if (echan->alloced) {
+		edma_free_channel(echan->ch_num);
+		echan->alloced = false;
+	}
+}
+
+/* Send pending descriptor to hardware */
+static void edma_issue_pending(struct dma_chan *chan)
+{
+	struct edma_chan *echan = to_edma_chan(chan);
+	unsigned long flags;
+
+	spin_lock_irqsave(&echan->lock, flags);
+
+	if (list_empty(&echan->active) && !list_empty(&echan->queued))
+		edma_execute(echan);
+
+	spin_unlock_irqrestore(&echan->lock, flags);
+}
+
+/* Check request completion status */
+static enum dma_status edma_tx_status(struct dma_chan *chan,
+				      dma_cookie_t cookie,
+				      struct dma_tx_state *txstate)
+{
+	struct edma_chan *echan = to_edma_chan(chan);
+	unsigned long flags;
+	enum dma_status ret;
+
+	spin_lock_irqsave(&echan->lock, flags);
+	ret = dma_cookie_status(chan, cookie, txstate);
+	spin_unlock_irqrestore(&echan->lock, flags);
+
+	return ret;
+}
+
+static void __init edma_chan_init(struct edma_cc *ecc,
+				  struct dma_device *dma,
+				  struct edma_chan *echans)
+{
+	int i, j;
+	int chcnt = 0;
+
+	INIT_LIST_HEAD(&dma->channels);
+
+	for (i = 0; i < EDMA_MAX_CHANNELS; i++) {
+		struct edma_chan *echan = &echans[chcnt];
+		echan->ch_num = i;
+		echan->ecc = ecc;
+		echan->chan.device = dma;
+		for (j = 0; j < EDMA_MAX_SLOTS; j++)
+			echan->slot[j] = -1;
+
+		spin_lock_init(&echan->lock);
+
+		INIT_LIST_HEAD(&echan->free);
+		INIT_LIST_HEAD(&echan->prepared);
+		INIT_LIST_HEAD(&echan->queued);
+		INIT_LIST_HEAD(&echan->active);
+		INIT_LIST_HEAD(&echan->completed);
+
+		tasklet_init(&echan->work, edma_work, (unsigned long)echan);
+
+		list_add_tail(&echan->chan.device_node, &dma->channels);
+
+		chcnt++;
+	}
+}
+
+static void edma_ops_init(struct edma_cc *ecc, struct dma_device *dma,
+			  struct device *dev)
+{
+	if (dma_has_cap(DMA_SLAVE, dma->cap_mask))
+		dma->device_prep_slave_sg = edma_prep_slave_sg;
+
+	dma->device_alloc_chan_resources = edma_alloc_chan_resources;
+	dma->device_free_chan_resources = edma_free_chan_resources;
+	dma->device_issue_pending = edma_issue_pending;
+	dma->device_tx_status = edma_tx_status;
+	dma->device_control = edma_control;
+	dma->dev = dev;
+}
+
+static int __devinit edma_probe(struct platform_device *pdev)
+{
+	struct edma_cc *ecc;
+	int ret;
+
+	ecc = devm_kzalloc(&pdev->dev, sizeof(*ecc), GFP_KERNEL);
+	if (!ecc) {
+		dev_err(&pdev->dev, "Can't allocate controller\n");
+		ret = -ENOMEM;
+		goto err_alloc_ecc;
+	}
+
+	ecc->dummy_slot[0] = edma_alloc_slot(0, EDMA_SLOT_ANY);
+	if (ecc->dummy_slot[0] < 0) {
+		dev_err(&pdev->dev, "Can't allocate PaRAM dummy slot\n");
+		ret = -EIO;
+		goto err_alloc_slot0;
+	}
+
+	ecc->dummy_slot[1] = edma_alloc_slot(1, EDMA_SLOT_ANY);
+	if (ecc->dummy_slot[1] < 0) {
+		dev_err(&pdev->dev, "Can't allocate PaRAM dummy slot\n");
+		ret = -EIO;
+		goto err_alloc_slot1;
+	}
+
+	edma_chan_init(ecc, &ecc->dma_slave, ecc->slave_chans);
+
+	dma_cap_zero(ecc->dma_slave.cap_mask);
+	dma_cap_set(DMA_SLAVE, ecc->dma_slave.cap_mask);
+
+	edma_ops_init(ecc, &ecc->dma_slave, &pdev->dev);
+
+	ret = dma_async_device_register(&ecc->dma_slave);
+	if (ret)
+		goto err_reg1;
+
+	platform_set_drvdata(pdev, ecc);
+
+	dev_info(&pdev->dev, "TI EDMA DMA engine driver\n");
+
+	return 0;
+
+err_reg1:
+	edma_free_slot(ecc->dummy_slot[1]);
+err_alloc_slot1:
+	edma_free_slot(ecc->dummy_slot[0]);
+err_alloc_slot0:
+	devm_kfree(&pdev->dev, ecc);
+err_alloc_ecc:
+	return ret;
+}
+
+static int __devexit edma_remove(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct edma_cc *ecc = dev_get_drvdata(dev);
+
+	dma_async_device_unregister(&ecc->dma_slave);
+	edma_free_slot(ecc->dummy_slot[0]);
+	edma_free_slot(ecc->dummy_slot[1]);
+	devm_kfree(dev, ecc);
+
+	return 0;
+}
+
+static struct platform_driver edma_driver = {
+	.probe		= edma_probe,
+	.remove		= __devexit_p(edma_remove),
+	.driver = {
+		.name = "edma-dma-engine",
+		.owner = THIS_MODULE,
+	},
+};
+
+bool edma_filter_fn(struct dma_chan *chan, void *param)
+{
+	if (chan->device->dev->driver == &edma_driver.driver) {
+		struct edma_chan *echan = to_edma_chan(chan);
+		unsigned ch_req = *(unsigned *)param;
+		return ch_req == echan->ch_num;
+	}
+	return false;
+}
+EXPORT_SYMBOL(edma_filter_fn);
+
+static struct platform_device *pdev;
+
+static const struct platform_device_info edma_dev_info = {
+	.name = "edma-dma-engine",
+	.id = -1,
+	.dma_mask = DMA_BIT_MASK(32),
+};
+
+static int edma_init(void)
+{
+	int ret = platform_driver_register(&edma_driver);
+
+	if (ret == 0) {
+		pdev = platform_device_register_full(&edma_dev_info);
+		if (IS_ERR(pdev)) {
+			platform_driver_unregister(&edma_driver);
+			ret = PTR_ERR(pdev);
+		}
+	}
+	return ret;
+}
+subsys_initcall(edma_init);
+
+static void __exit edma_exit(void)
+{
+	platform_device_unregister(pdev);
+	platform_driver_unregister(&edma_driver);
+}
+module_exit(edma_exit);
+
+MODULE_AUTHOR("Matt Porter <mporter@ti.com>");
+MODULE_DESCRIPTION("TI EDMA DMA engine driver");
+MODULE_LICENSE("GPL v2");
diff --git a/include/linux/edma.h b/include/linux/edma.h
new file mode 100644
index 0000000..a1307e7
--- /dev/null
+++ b/include/linux/edma.h
@@ -0,0 +1,29 @@
+/*
+ * TI EDMA DMA engine driver
+ *
+ * Copyright 2012 Texas Instruments
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+#ifndef __LINUX_EDMA_H
+#define __LINUX_EDMA_H
+
+struct dma_chan;
+
+#if defined(CONFIG_TI_EDMA) || defined(CONFIG_TI_EDMA_MODULE)
+bool edma_filter_fn(struct dma_chan *, void *);
+#else
+static inline bool edma_filter_fn(struct dma_chan *chan, void *param)
+{
+	return false;
+}
+#endif
+
+#endif
-- 
1.7.9.5



* [PATCH 2/3] mmc: davinci_mmc: convert to DMA engine API
  2012-08-16 21:44 [PATCH 0/3] DaVinci DMA engine conversion Matt Porter
  2012-08-16 21:44 ` [PATCH 1/3] dmaengine: add TI EDMA DMA engine driver Matt Porter
@ 2012-08-16 21:44 ` Matt Porter
  2012-08-16 21:44 ` [PATCH 3/3] spi: spi-davinci: " Matt Porter
  2 siblings, 0 replies; 7+ messages in thread
From: Matt Porter @ 2012-08-16 21:44 UTC (permalink / raw)
  To: vinod.koul, dan.j.williams, cjb, grant.likely
  Cc: Linux Kernel Mailing List, Linux ARM Kernel List, Linux MMC List,
	Linux SPI Devel List, Linux DaVinci Kernel List, Sekhar Nori

Removes use of the DaVinci EDMA private DMA API and replaces
it with use of the DMA engine API.

Signed-off-by: Matt Porter <mporter@ti.com>
---
 drivers/mmc/host/davinci_mmc.c |  271 ++++++++++++----------------------------
 1 file changed, 82 insertions(+), 189 deletions(-)

diff --git a/drivers/mmc/host/davinci_mmc.c b/drivers/mmc/host/davinci_mmc.c
index 7cf6c62..c5e1eeb 100644
--- a/drivers/mmc/host/davinci_mmc.c
+++ b/drivers/mmc/host/davinci_mmc.c
@@ -30,11 +30,12 @@
 #include <linux/io.h>
 #include <linux/irq.h>
 #include <linux/delay.h>
+#include <linux/dmaengine.h>
 #include <linux/dma-mapping.h>
+#include <linux/edma.h>
 #include <linux/mmc/mmc.h>
 
 #include <mach/mmc.h>
-#include <mach/edma.h>
 
 /*
  * Register Definitions
@@ -200,21 +201,13 @@ struct mmc_davinci_host {
 	u32 bytes_left;
 
 	u32 rxdma, txdma;
+	struct dma_chan *dma_tx;
+	struct dma_chan *dma_rx;
 	bool use_dma;
 	bool do_dma;
 	bool sdio_int;
 	bool active_request;
 
-	/* Scatterlist DMA uses one or more parameter RAM entries:
-	 * the main one (associated with rxdma or txdma) plus zero or
-	 * more links.  The entries for a given transfer differ only
-	 * by memory buffer (address, length) and link field.
-	 */
-	struct edmacc_param	tx_template;
-	struct edmacc_param	rx_template;
-	unsigned		n_link;
-	u32			links[MAX_NR_SG - 1];
-
 	/* For PIO we walk scatterlists one segment at a time. */
 	unsigned int		sg_len;
 	struct scatterlist *sg;
@@ -410,153 +403,74 @@ static void mmc_davinci_start_command(struct mmc_davinci_host *host,
 
 static void davinci_abort_dma(struct mmc_davinci_host *host)
 {
-	int sync_dev;
+	struct dma_chan *sync_dev;
 
 	if (host->data_dir == DAVINCI_MMC_DATADIR_READ)
-		sync_dev = host->rxdma;
+		sync_dev = host->dma_rx;
 	else
-		sync_dev = host->txdma;
-
-	edma_stop(sync_dev);
-	edma_clean_channel(sync_dev);
-}
-
-static void
-mmc_davinci_xfer_done(struct mmc_davinci_host *host, struct mmc_data *data);
-
-static void mmc_davinci_dma_cb(unsigned channel, u16 ch_status, void *data)
-{
-	if (DMA_COMPLETE != ch_status) {
-		struct mmc_davinci_host *host = data;
-
-		/* Currently means:  DMA Event Missed, or "null" transfer
-		 * request was seen.  In the future, TC errors (like bad
-		 * addresses) might be presented too.
-		 */
-		dev_warn(mmc_dev(host->mmc), "DMA %s error\n",
-			(host->data->flags & MMC_DATA_WRITE)
-				? "write" : "read");
-		host->data->error = -EIO;
-		mmc_davinci_xfer_done(host, host->data);
-	}
-}
-
-/* Set up tx or rx template, to be modified and updated later */
-static void __init mmc_davinci_dma_setup(struct mmc_davinci_host *host,
-		bool tx, struct edmacc_param *template)
-{
-	unsigned	sync_dev;
-	const u16	acnt = 4;
-	const u16	bcnt = rw_threshold >> 2;
-	const u16	ccnt = 0;
-	u32		src_port = 0;
-	u32		dst_port = 0;
-	s16		src_bidx, dst_bidx;
-	s16		src_cidx, dst_cidx;
-
-	/*
-	 * A-B Sync transfer:  each DMA request is for one "frame" of
-	 * rw_threshold bytes, broken into "acnt"-size chunks repeated
-	 * "bcnt" times.  Each segment needs "ccnt" such frames; since
-	 * we tell the block layer our mmc->max_seg_size limit, we can
-	 * trust (later) that it's within bounds.
-	 *
-	 * The FIFOs are read/written in 4-byte chunks (acnt == 4) and
-	 * EDMA will optimize memory operations to use larger bursts.
-	 */
-	if (tx) {
-		sync_dev = host->txdma;
-
-		/* src_prt, ccnt, and link to be set up later */
-		src_bidx = acnt;
-		src_cidx = acnt * bcnt;
-
-		dst_port = host->mem_res->start + DAVINCI_MMCDXR;
-		dst_bidx = 0;
-		dst_cidx = 0;
-	} else {
-		sync_dev = host->rxdma;
-
-		src_port = host->mem_res->start + DAVINCI_MMCDRR;
-		src_bidx = 0;
-		src_cidx = 0;
-
-		/* dst_prt, ccnt, and link to be set up later */
-		dst_bidx = acnt;
-		dst_cidx = acnt * bcnt;
-	}
-
-	/*
-	 * We can't use FIFO mode for the FIFOs because MMC FIFO addresses
-	 * are not 256-bit (32-byte) aligned.  So we use INCR, and the W8BIT
-	 * parameter is ignored.
-	 */
-	edma_set_src(sync_dev, src_port, INCR, W8BIT);
-	edma_set_dest(sync_dev, dst_port, INCR, W8BIT);
+		sync_dev = host->dma_tx;
 
-	edma_set_src_index(sync_dev, src_bidx, src_cidx);
-	edma_set_dest_index(sync_dev, dst_bidx, dst_cidx);
-
-	edma_set_transfer_params(sync_dev, acnt, bcnt, ccnt, 8, ABSYNC);
-
-	edma_read_slot(sync_dev, template);
-
-	/* don't bother with irqs or chaining */
-	template->opt |= EDMA_CHAN_SLOT(sync_dev) << 12;
+	dmaengine_terminate_all(sync_dev);
 }
 
-static void mmc_davinci_send_dma_request(struct mmc_davinci_host *host,
+static int mmc_davinci_send_dma_request(struct mmc_davinci_host *host,
 		struct mmc_data *data)
 {
-	struct edmacc_param	*template;
-	int			channel, slot;
-	unsigned		link;
-	struct scatterlist	*sg;
-	unsigned		sg_len;
-	unsigned		bytes_left = host->bytes_left;
-	const unsigned		shift = ffs(rw_threshold) - 1;
+	struct dma_chan *chan;
+	struct dma_async_tx_descriptor *desc;
+	int ret = 0;
 
 	if (host->data_dir == DAVINCI_MMC_DATADIR_WRITE) {
-		template = &host->tx_template;
-		channel = host->txdma;
+		struct dma_slave_config dma_tx_conf = {
+			.direction = DMA_MEM_TO_DEV,
+			.dst_addr = host->mem_res->start + DAVINCI_MMCDXR,
+			.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES,
+			.dst_maxburst =
+				rw_threshold / DMA_SLAVE_BUSWIDTH_4_BYTES,
+		};
+		chan = host->dma_tx;
+		dmaengine_slave_config(host->dma_tx, &dma_tx_conf);
+
+		desc = dmaengine_prep_slave_sg(host->dma_tx,
+				data->sg,
+				host->sg_len,
+				DMA_MEM_TO_DEV,
+				DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
+		if (!desc) {
+			dev_dbg(mmc_dev(host->mmc),
+				"failed to allocate DMA TX descriptor");
+			ret = -1;
+			goto out;
+		}
 	} else {
-		template = &host->rx_template;
-		channel = host->rxdma;
-	}
-
-	/* We know sg_len and ccnt will never be out of range because
-	 * we told the mmc layer which in turn tells the block layer
-	 * to ensure that it only hands us one scatterlist segment
-	 * per EDMA PARAM entry.  Update the PARAM
-	 * entries needed for each segment of this scatterlist.
-	 */
-	for (slot = channel, link = 0, sg = data->sg, sg_len = host->sg_len;
-			sg_len-- != 0 && bytes_left;
-			sg = sg_next(sg), slot = host->links[link++]) {
-		u32		buf = sg_dma_address(sg);
-		unsigned	count = sg_dma_len(sg);
-
-		template->link_bcntrld = sg_len
-				? (EDMA_CHAN_SLOT(host->links[link]) << 5)
-				: 0xffff;
-
-		if (count > bytes_left)
-			count = bytes_left;
-		bytes_left -= count;
-
-		if (host->data_dir == DAVINCI_MMC_DATADIR_WRITE)
-			template->src = buf;
-		else
-			template->dst = buf;
-		template->ccnt = count >> shift;
-
-		edma_write_slot(slot, template);
+		struct dma_slave_config dma_rx_conf = {
+			.direction = DMA_DEV_TO_MEM,
+			.src_addr = host->mem_res->start + DAVINCI_MMCDRR,
+			.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES,
+			.src_maxburst =
+				rw_threshold / DMA_SLAVE_BUSWIDTH_4_BYTES,
+		};
+		chan = host->dma_rx;
+		dmaengine_slave_config(host->dma_rx, &dma_rx_conf);
+
+		desc = dmaengine_prep_slave_sg(host->dma_rx,
+				data->sg,
+				host->sg_len,
+				DMA_DEV_TO_MEM,
+				DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
+		if (!desc) {
+			dev_dbg(mmc_dev(host->mmc),
+				"failed to allocate DMA RX descriptor");
+			ret = -1;
+			goto out;
+		}
 	}
 
-	if (host->version == MMC_CTLR_VERSION_2)
-		edma_clear_event(channel);
+	dmaengine_submit(desc);
+	dma_async_issue_pending(chan);
 
-	edma_start(channel);
+out:
+	return ret;
 }
 
 static int mmc_davinci_start_dma_transfer(struct mmc_davinci_host *host,
@@ -564,6 +478,7 @@ static int mmc_davinci_start_dma_transfer(struct mmc_davinci_host *host,
 {
 	int i;
 	int mask = rw_threshold - 1;
+	int ret = 0;
 
 	host->sg_len = dma_map_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
 				((data->flags & MMC_DATA_WRITE)
@@ -583,70 +498,48 @@ static int mmc_davinci_start_dma_transfer(struct mmc_davinci_host *host,
 	}
 
 	host->do_dma = 1;
-	mmc_davinci_send_dma_request(host, data);
+	ret = mmc_davinci_send_dma_request(host, data);
 
-	return 0;
+	return ret;
 }
 
 static void __init_or_module
 davinci_release_dma_channels(struct mmc_davinci_host *host)
 {
-	unsigned	i;
-
 	if (!host->use_dma)
 		return;
 
-	for (i = 0; i < host->n_link; i++)
-		edma_free_slot(host->links[i]);
-
-	edma_free_channel(host->txdma);
-	edma_free_channel(host->rxdma);
+	dma_release_channel(host->dma_tx);
+	dma_release_channel(host->dma_rx);
 }
 
 static int __init davinci_acquire_dma_channels(struct mmc_davinci_host *host)
 {
-	u32 link_size;
-	int r, i;
-
-	/* Acquire master DMA write channel */
-	r = edma_alloc_channel(host->txdma, mmc_davinci_dma_cb, host,
-			EVENTQ_DEFAULT);
-	if (r < 0) {
-		dev_warn(mmc_dev(host->mmc), "alloc %s channel err %d\n",
-				"tx", r);
-		return r;
-	}
-	mmc_davinci_dma_setup(host, true, &host->tx_template);
-
-	/* Acquire master DMA read channel */
-	r = edma_alloc_channel(host->rxdma, mmc_davinci_dma_cb, host,
-			EVENTQ_DEFAULT);
-	if (r < 0) {
-		dev_warn(mmc_dev(host->mmc), "alloc %s channel err %d\n",
-				"rx", r);
-		goto free_master_write;
+	int r;
+	dma_cap_mask_t mask;
+
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_SLAVE, mask);
+
+	host->dma_tx =
+		dma_request_channel(mask, edma_filter_fn, &host->txdma);
+	if (!host->dma_tx) {
+		dev_err(mmc_dev(host->mmc), "Can't get dma_tx channel\n");
+		return -ENODEV;
 	}
-	mmc_davinci_dma_setup(host, false, &host->rx_template);
 
-	/* Allocate parameter RAM slots, which will later be bound to a
-	 * channel as needed to handle a scatterlist.
-	 */
-	link_size = min_t(unsigned, host->nr_sg, ARRAY_SIZE(host->links));
-	for (i = 0; i < link_size; i++) {
-		r = edma_alloc_slot(EDMA_CTLR(host->txdma), EDMA_SLOT_ANY);
-		if (r < 0) {
-			dev_dbg(mmc_dev(host->mmc), "dma PaRAM alloc --> %d\n",
-				r);
-			break;
-		}
-		host->links[i] = r;
+	host->dma_rx =
+		dma_request_channel(mask, edma_filter_fn, &host->rxdma);
+	if (!host->dma_rx) {
+		dev_err(mmc_dev(host->mmc), "Can't get dma_rx channel\n");
+		r = -ENODEV;
+		goto free_master_write;
 	}
-	host->n_link = i;
 
 	return 0;
 
 free_master_write:
-	edma_free_channel(host->txdma);
+	dma_release_channel(host->dma_tx);
 
 	return r;
 }
@@ -1359,7 +1252,7 @@ static int __init davinci_mmcsd_probe(struct platform_device *pdev)
 	 * Each hw_seg uses one EDMA parameter RAM slot, always one
 	 * channel and then usually some linked slots.
 	 */
-	mmc->max_segs		= 1 + host->n_link;
+	mmc->max_segs		= MAX_NR_SG;
 
 	/* EDMA limit per hw segment (one or two MBytes) */
 	mmc->max_seg_size	= MAX_CCNT * rw_threshold;
-- 
1.7.9.5



* [PATCH 3/3] spi: spi-davinci: convert to DMA engine API
  2012-08-16 21:44 [PATCH 0/3] DaVinci DMA engine conversion Matt Porter
  2012-08-16 21:44 ` [PATCH 1/3] dmaengine: add TI EDMA DMA engine driver Matt Porter
  2012-08-16 21:44 ` [PATCH 2/3] mmc: davinci_mmc: convert to DMA engine API Matt Porter
@ 2012-08-16 21:44 ` Matt Porter
  2 siblings, 0 replies; 7+ messages in thread
From: Matt Porter @ 2012-08-16 21:44 UTC (permalink / raw)
  To: vinod.koul, dan.j.williams, cjb, grant.likely
  Cc: Linux Kernel Mailing List, Linux ARM Kernel List, Linux MMC List,
	Linux SPI Devel List, Linux DaVinci Kernel List, Sekhar Nori

Removes use of the DaVinci EDMA private DMA API and replaces
it with use of the DMA engine API.

Signed-off-by: Matt Porter <mporter@ti.com>
---
 drivers/spi/spi-davinci.c |  292 ++++++++++++++++++++-------------------------
 1 file changed, 130 insertions(+), 162 deletions(-)

diff --git a/drivers/spi/spi-davinci.c b/drivers/spi/spi-davinci.c
index 9b2901f..c1ec52d 100644
--- a/drivers/spi/spi-davinci.c
+++ b/drivers/spi/spi-davinci.c
@@ -25,13 +25,14 @@
 #include <linux/platform_device.h>
 #include <linux/err.h>
 #include <linux/clk.h>
+#include <linux/dmaengine.h>
 #include <linux/dma-mapping.h>
+#include <linux/edma.h>
 #include <linux/spi/spi.h>
 #include <linux/spi/spi_bitbang.h>
 #include <linux/slab.h>
 
 #include <mach/spi.h>
-#include <mach/edma.h>
 
 #define SPI_NO_RESOURCE		((resource_size_t)-1)
 
@@ -113,14 +114,6 @@
 #define SPIDEF		0x4c
 #define SPIFMT0		0x50
 
-/* We have 2 DMA channels per CS, one for RX and one for TX */
-struct davinci_spi_dma {
-	int			tx_channel;
-	int			rx_channel;
-	int			dummy_param_slot;
-	enum dma_event_q	eventq;
-};
-
 /* SPI Controller driver's private data. */
 struct davinci_spi {
 	struct spi_bitbang	bitbang;
@@ -134,11 +127,14 @@ struct davinci_spi {
 
 	const void		*tx;
 	void			*rx;
-#define SPI_TMP_BUFSZ	(SMP_CACHE_BYTES + 1)
-	u8			rx_tmp_buf[SPI_TMP_BUFSZ];
 	int			rcount;
 	int			wcount;
-	struct davinci_spi_dma	dma;
+
+	struct dma_chan		*dma_rx;
+	struct dma_chan		*dma_tx;
+	int			dma_rx_chnum;
+	int			dma_tx_chnum;
+
 	struct davinci_spi_platform_data *pdata;
 
 	void			(*get_rx)(u32 rx_data, struct davinci_spi *);
@@ -496,21 +492,23 @@ out:
 	return errors;
 }
 
-static void davinci_spi_dma_callback(unsigned lch, u16 status, void *data)
+static void davinci_spi_dma_rx_callback(void *data)
 {
-	struct davinci_spi *dspi = data;
-	struct davinci_spi_dma *dma = &dspi->dma;
+	struct davinci_spi *dspi = (struct davinci_spi *)data;
 
-	edma_stop(lch);
+	dspi->rcount = 0;
 
-	if (status == DMA_COMPLETE) {
-		if (lch == dma->rx_channel)
-			dspi->rcount = 0;
-		if (lch == dma->tx_channel)
-			dspi->wcount = 0;
-	}
+	if (!dspi->wcount && !dspi->rcount)
+		complete(&dspi->done);
+}
 
-	if ((!dspi->wcount && !dspi->rcount) || (status != DMA_COMPLETE))
+static void davinci_spi_dma_tx_callback(void *data)
+{
+	struct davinci_spi *dspi = (struct davinci_spi *)data;
+
+	dspi->wcount = 0;
+
+	if (!dspi->wcount && !dspi->rcount)
 		complete(&dspi->done);
 }
 
@@ -526,20 +524,20 @@ static void davinci_spi_dma_callback(unsigned lch, u16 status, void *data)
 static int davinci_spi_bufs(struct spi_device *spi, struct spi_transfer *t)
 {
 	struct davinci_spi *dspi;
-	int data_type, ret;
+	int data_type, ret = -ENOMEM;
 	u32 tx_data, spidat1;
 	u32 errors = 0;
 	struct davinci_spi_config *spicfg;
 	struct davinci_spi_platform_data *pdata;
 	unsigned uninitialized_var(rx_buf_count);
-	struct device *sdev;
+	void *dummy_buf = NULL;
+	struct scatterlist sg_rx, sg_tx;
 
 	dspi = spi_master_get_devdata(spi->master);
 	pdata = dspi->pdata;
 	spicfg = (struct davinci_spi_config *)spi->controller_data;
 	if (!spicfg)
 		spicfg = &davinci_spi_default_cfg;
-	sdev = dspi->bitbang.master->dev.parent;
 
 	/* convert len to words based on bits_per_word */
 	data_type = dspi->bytes_per_word[spi->chip_select];
@@ -567,112 +565,83 @@ static int davinci_spi_bufs(struct spi_device *spi, struct spi_transfer *t)
 		spidat1 |= tx_data & 0xFFFF;
 		iowrite32(spidat1, dspi->base + SPIDAT1);
 	} else {
-		struct davinci_spi_dma *dma;
-		unsigned long tx_reg, rx_reg;
-		struct edmacc_param param;
-		void *rx_buf;
-		int b, c;
-
-		dma = &dspi->dma;
-
-		tx_reg = (unsigned long)dspi->pbase + SPIDAT1;
-		rx_reg = (unsigned long)dspi->pbase + SPIBUF;
-
-		/*
-		 * Transmit DMA setup
-		 *
-		 * If there is transmit data, map the transmit buffer, set it
-		 * as the source of data and set the source B index to data
-		 * size. If there is no transmit data, set the transmit register
-		 * as the source of data, and set the source B index to zero.
-		 *
-		 * The destination is always the transmit register itself. And
-		 * the destination never increments.
-		 */
-
-		if (t->tx_buf) {
-			t->tx_dma = dma_map_single(&spi->dev, (void *)t->tx_buf,
-						t->len, DMA_TO_DEVICE);
-			if (dma_mapping_error(&spi->dev, t->tx_dma)) {
-				dev_dbg(sdev, "Unable to DMA map %d bytes"
-						"TX buffer\n", t->len);
-				return -ENOMEM;
-			}
-		}
-
-		/*
-		 * If number of words is greater than 65535, then we need
-		 * to configure a 3 dimension transfer.  Use the BCNTRLD
-		 * feature to allow for transfers that aren't even multiples
-		 * of 65535 (or any other possible b size) by first transferring
-		 * the remainder amount then grabbing the next N blocks of
-		 * 65535 words.
-		 */
-
-		c = dspi->wcount / (SZ_64K - 1);	/* N 65535 Blocks */
-		b = dspi->wcount - c * (SZ_64K - 1);	/* Remainder */
-		if (b)
-			c++;
+		struct dma_slave_config dma_rx_conf = {
+			.direction = DMA_DEV_TO_MEM,
+			.src_addr = (unsigned long)dspi->pbase + SPIBUF,
+			.src_addr_width = data_type,
+			.src_maxburst = 1,
+		};
+		struct dma_slave_config dma_tx_conf = {
+			.direction = DMA_MEM_TO_DEV,
+			.dst_addr = (unsigned long)dspi->pbase + SPIDAT1,
+			.dst_addr_width = data_type,
+			.dst_maxburst = 1,
+		};
+		struct dma_async_tx_descriptor *rxdesc;
+		struct dma_async_tx_descriptor *txdesc;
+		void *buf;
+
+		dummy_buf = kzalloc(t->len, GFP_KERNEL);
+		if (!dummy_buf)
+			goto err_alloc_dummy_buf;
+
+		dmaengine_slave_config(dspi->dma_rx, &dma_rx_conf);
+		dmaengine_slave_config(dspi->dma_tx, &dma_tx_conf);
+
+		sg_init_table(&sg_rx, 1);
+		if (!t->rx_buf)
+			buf = dummy_buf;
 		else
-			b = SZ_64K - 1;
-
-		param.opt = TCINTEN | EDMA_TCC(dma->tx_channel);
-		param.src = t->tx_buf ? t->tx_dma : tx_reg;
-		param.a_b_cnt = b << 16 | data_type;
-		param.dst = tx_reg;
-		param.src_dst_bidx = t->tx_buf ? data_type : 0;
-		param.link_bcntrld = 0xffffffff;
-		param.src_dst_cidx = t->tx_buf ? data_type : 0;
-		param.ccnt = c;
-		edma_write_slot(dma->tx_channel, &param);
-		edma_link(dma->tx_channel, dma->dummy_param_slot);
-
-		/*
-		 * Receive DMA setup
-		 *
-		 * If there is receive buffer, use it to receive data. If there
-		 * is none provided, use a temporary receive buffer. Set the
-		 * destination B index to 0 so effectively only one byte is used
-		 * in the temporary buffer (address does not increment).
-		 *
-		 * The source of receive data is the receive data register. The
-		 * source address never increments.
-		 */
-
-		if (t->rx_buf) {
-			rx_buf = t->rx_buf;
-			rx_buf_count = t->len;
-		} else {
-			rx_buf = dspi->rx_tmp_buf;
-			rx_buf_count = sizeof(dspi->rx_tmp_buf);
+			buf = t->rx_buf;
+		t->rx_dma = dma_map_single(&spi->dev, buf,
+				t->len, DMA_FROM_DEVICE);
+		if (!t->rx_dma) {
+			ret = -EFAULT;
+			goto err_rx_map;
 		}
+		sg_dma_address(&sg_rx) = t->rx_dma;
+		sg_dma_len(&sg_rx) = t->len;
 
-		t->rx_dma = dma_map_single(&spi->dev, rx_buf, rx_buf_count,
-							DMA_FROM_DEVICE);
-		if (dma_mapping_error(&spi->dev, t->rx_dma)) {
-			dev_dbg(sdev, "Couldn't DMA map a %d bytes RX buffer\n",
-								rx_buf_count);
-			if (t->tx_buf)
-				dma_unmap_single(&spi->dev, t->tx_dma, t->len,
-								DMA_TO_DEVICE);
-			return -ENOMEM;
+		sg_init_table(&sg_tx, 1);
+		if (!t->tx_buf)
+			buf = dummy_buf;
+		else
+			buf = (void *)t->tx_buf;
+		t->tx_dma = dma_map_single(&spi->dev, buf,
+				t->len, DMA_TO_DEVICE);
+		if (!t->tx_dma) {
+			ret = -EFAULT;
+			goto err_tx_map;
 		}
-
-		param.opt = TCINTEN | EDMA_TCC(dma->rx_channel);
-		param.src = rx_reg;
-		param.a_b_cnt = b << 16 | data_type;
-		param.dst = t->rx_dma;
-		param.src_dst_bidx = (t->rx_buf ? data_type : 0) << 16;
-		param.link_bcntrld = 0xffffffff;
-		param.src_dst_cidx = (t->rx_buf ? data_type : 0) << 16;
-		param.ccnt = c;
-		edma_write_slot(dma->rx_channel, &param);
+		sg_dma_address(&sg_tx) = t->tx_dma;
+		sg_dma_len(&sg_tx) = t->len;
+
+		rxdesc = dmaengine_prep_slave_sg(dspi->dma_rx,
+				&sg_rx, 1, DMA_DEV_TO_MEM,
+				DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
+		if (!rxdesc)
+			goto err_desc;
+
+		txdesc = dmaengine_prep_slave_sg(dspi->dma_tx,
+				&sg_tx, 1, DMA_MEM_TO_DEV,
+				DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
+		if (!txdesc)
+			goto err_desc;
+
+		rxdesc->callback = davinci_spi_dma_rx_callback;
+		rxdesc->callback_param = (void *)dspi;
+		txdesc->callback = davinci_spi_dma_tx_callback;
+		txdesc->callback_param = (void *)dspi;
 
 		if (pdata->cshold_bug)
 			iowrite16(spidat1 >> 16, dspi->base + SPIDAT1 + 2);
 
-		edma_start(dma->rx_channel);
-		edma_start(dma->tx_channel);
+		dmaengine_submit(rxdesc);
+		dmaengine_submit(txdesc);
+
+		dma_async_issue_pending(dspi->dma_rx);
+		dma_async_issue_pending(dspi->dma_tx);
+
 		set_io_bits(dspi->base + SPIINT, SPIINT_DMA_REQ_EN);
 	}
 
@@ -690,15 +659,13 @@ static int davinci_spi_bufs(struct spi_device *spi, struct spi_transfer *t)
 
 	clear_io_bits(dspi->base + SPIINT, SPIINT_MASKALL);
 	if (spicfg->io_type == SPI_IO_TYPE_DMA) {
-
-		if (t->tx_buf)
-			dma_unmap_single(&spi->dev, t->tx_dma, t->len,
-								DMA_TO_DEVICE);
-
-		dma_unmap_single(&spi->dev, t->rx_dma, rx_buf_count,
-							DMA_FROM_DEVICE);
-
 		clear_io_bits(dspi->base + SPIINT, SPIINT_DMA_REQ_EN);
+
+		dma_unmap_single(&spi->dev, t->rx_dma,
+				t->len, DMA_FROM_DEVICE);
+		dma_unmap_single(&spi->dev, t->tx_dma,
+				t->len, DMA_TO_DEVICE);
+		kfree(dummy_buf);
 	}
 
 	clear_io_bits(dspi->base + SPIGCR1, SPIGCR1_SPIENA_MASK);
@@ -716,11 +683,20 @@ static int davinci_spi_bufs(struct spi_device *spi, struct spi_transfer *t)
 	}
 
 	if (dspi->rcount != 0 || dspi->wcount != 0) {
-		dev_err(sdev, "SPI data transfer error\n");
+		dev_err(&spi->dev, "SPI data transfer error\n");
 		return -EIO;
 	}
 
 	return t->len;
+
+err_desc:
+	dma_unmap_single(&spi->dev, t->tx_dma, t->len, DMA_TO_DEVICE);
+err_tx_map:
+	dma_unmap_single(&spi->dev, t->rx_dma, t->len, DMA_FROM_DEVICE);
+err_rx_map:
+	kfree(dummy_buf);
+err_alloc_dummy_buf:
+	return ret;
 }
 
 /**
@@ -751,39 +727,33 @@ static irqreturn_t davinci_spi_irq(s32 irq, void *data)
 
 static int davinci_spi_request_dma(struct davinci_spi *dspi)
 {
+	dma_cap_mask_t mask;
+	struct device *sdev = dspi->bitbang.master->dev.parent;
 	int r;
-	struct davinci_spi_dma *dma = &dspi->dma;
 
-	r = edma_alloc_channel(dma->rx_channel, davinci_spi_dma_callback, dspi,
-								dma->eventq);
-	if (r < 0) {
-		pr_err("Unable to request DMA channel for SPI RX\n");
-		r = -EAGAIN;
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_SLAVE, mask);
+
+	dspi->dma_rx = dma_request_channel(mask, edma_filter_fn,
+					   &dspi->dma_rx_chnum);
+	if (!dspi->dma_rx) {
+		dev_err(sdev, "request RX DMA channel failed\n");
+		r = -ENODEV;
 		goto rx_dma_failed;
 	}
 
-	r = edma_alloc_channel(dma->tx_channel, davinci_spi_dma_callback, dspi,
-								dma->eventq);
-	if (r < 0) {
-		pr_err("Unable to request DMA channel for SPI TX\n");
-		r = -EAGAIN;
+	dspi->dma_tx = dma_request_channel(mask, edma_filter_fn,
+					   &dspi->dma_tx_chnum);
+	if (!dspi->dma_tx) {
+		dev_err(sdev, "request TX DMA channel failed\n");
+		r = -ENODEV;
 		goto tx_dma_failed;
 	}
 
-	r = edma_alloc_slot(EDMA_CTLR(dma->tx_channel), EDMA_SLOT_ANY);
-	if (r < 0) {
-		pr_err("Unable to request SPI TX DMA param slot\n");
-		r = -EAGAIN;
-		goto param_failed;
-	}
-	dma->dummy_param_slot = r;
-	edma_link(dma->dummy_param_slot, dma->dummy_param_slot);
-
 	return 0;
-param_failed:
-	edma_free_channel(dma->tx_channel);
+
 tx_dma_failed:
-	edma_free_channel(dma->rx_channel);
+	dma_release_channel(dspi->dma_rx);
 rx_dma_failed:
 	return r;
 }
@@ -898,9 +868,8 @@ static int __devinit davinci_spi_probe(struct platform_device *pdev)
 	dspi->bitbang.txrx_bufs = davinci_spi_bufs;
 	if (dma_rx_chan != SPI_NO_RESOURCE &&
 	    dma_tx_chan != SPI_NO_RESOURCE) {
-		dspi->dma.rx_channel = dma_rx_chan;
-		dspi->dma.tx_channel = dma_tx_chan;
-		dspi->dma.eventq = pdata->dma_event_q;
+		dspi->dma_rx_chnum = dma_rx_chan;
+		dspi->dma_tx_chnum = dma_tx_chan;
 
 		ret = davinci_spi_request_dma(dspi);
 		if (ret)
@@ -955,9 +924,8 @@ static int __devinit davinci_spi_probe(struct platform_device *pdev)
 	return ret;
 
 free_dma:
-	edma_free_channel(dspi->dma.tx_channel);
-	edma_free_channel(dspi->dma.rx_channel);
-	edma_free_slot(dspi->dma.dummy_param_slot);
+	dma_release_channel(dspi->dma_rx);
+	dma_release_channel(dspi->dma_tx);
 free_clk:
 	clk_disable(dspi->clk);
 	clk_put(dspi->clk);
-- 
1.7.9.5



* Re: [PATCH 1/3] dmaengine: add TI EDMA DMA engine driver
  2012-08-16 21:44 ` [PATCH 1/3] dmaengine: add TI EDMA DMA engine driver Matt Porter
@ 2012-08-16 22:29   ` Russell King - ARM Linux
  2012-08-20 14:17     ` Matt Porter
  2012-08-21 18:20   ` Matt Porter
  1 sibling, 1 reply; 7+ messages in thread
From: Russell King - ARM Linux @ 2012-08-16 22:29 UTC (permalink / raw)
  To: Matt Porter
  Cc: vinod.koul, dan.j.williams, cjb, grant.likely,
	Linux DaVinci Kernel List, Sekhar Nori, Linux MMC List,
	Linux Kernel Mailing List, Linux SPI Devel List,
	Linux ARM Kernel List

On Thu, Aug 16, 2012 at 05:44:29PM -0400, Matt Porter wrote:
> Add a DMA engine driver for the TI EDMA controller. This driver
> is implemented as a wrapper around the existing DaVinci private
> DMA implementation. This approach allows for incremental conversion
> of each peripheral driver to the DMA engine API. The EDMA driver
> supports slave transfers but does not yet support cyclic transfers.

Have you looked at the virt-dma support?  That should allow you to
avoid some common errors, like forgetting that stuff submitted but
not issued should not be started even if the channel is currently
running.


* Re: [PATCH 1/3] dmaengine: add TI EDMA DMA engine driver
  2012-08-16 22:29   ` Russell King - ARM Linux
@ 2012-08-20 14:17     ` Matt Porter
  0 siblings, 0 replies; 7+ messages in thread
From: Matt Porter @ 2012-08-20 14:17 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: vinod.koul, dan.j.williams, cjb, grant.likely,
	Linux DaVinci Kernel List, Sekhar Nori, Linux MMC List,
	Linux Kernel Mailing List, Linux SPI Devel List,
	Linux ARM Kernel List

On Thu, Aug 16, 2012 at 11:29:12PM +0100, Russell King wrote:
> On Thu, Aug 16, 2012 at 05:44:29PM -0400, Matt Porter wrote:
> > Add a DMA engine driver for the TI EDMA controller. This driver
> > is implemented as a wrapper around the existing DaVinci private
> > DMA implementation. This approach allows for incremental conversion
> > of each peripheral driver to the DMA engine API. The EDMA driver
> > supports slave transfers but does not yet support cyclic transfers.
> 
> Have you looked at the virt-dma support?  That should allow you to
> avoid some common errors, like forgetting that stuff submitted but
> not issued should not be started even if the channel is currently
> running.

I had previously skimmed it a bit and wrote it off as a helper for
certain types of controllers. On your suggestion, I looked into it
in detail and it does look to be a nice improvement in handling those
types of errors, some of which I noticed were not being handled
properly in my v1 version. I did a quick-and-dirty conversion of the
EDMA driver to use virt-dma, and it also results in simplified
descriptor handling and about a 10% code reduction. I'll clean up
this virt-dma-enabled version and post v2.

One thing to note is that although I'm borrowing a lot from how you
did the omap-dma.c driver, the EDMA implementation of issue_pending()
cannot schedule a tasklet to send the descriptor to the hardware. The
client drivers have to be sure that dma_async_issue_pending() results
in the PaRAM set being written to the h/w before the driver-specific
DMA requests are enabled. Enabling the DMA request before the PaRAM
set is programmed results in an event error condition on this h/w.
Ideally there would be a dma_status state to reflect that a descriptor
has actually been written to the h/w so that client drivers can safely
enable DMA requests where this is a requirement.
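
To make that concrete, the client-side ordering looks roughly like
the sketch below, condensed from the spi-davinci patch in this
series (the wrapper function is only for illustration; everything
else is that driver's):

static int dspi_start_rx_dma(struct davinci_spi *dspi,
			     struct scatterlist *sg, int nents)
{
	struct dma_async_tx_descriptor *desc;

	desc = dmaengine_prep_slave_sg(dspi->dma_rx, sg, nents,
			DMA_DEV_TO_MEM,
			DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
	if (!desc)
		return -ENOMEM;

	dmaengine_submit(desc);

	/* issue_pending must leave the PaRAM set programmed in the h/w... */
	dma_async_issue_pending(dspi->dma_rx);

	/* ...before the peripheral's DMA request is enabled */
	set_io_bits(dspi->base + SPIINT, SPIINT_DMA_REQ_EN);

	return 0;
}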

Thanks,
Matt


* Re: [PATCH 1/3] dmaengine: add TI EDMA DMA engine driver
  2012-08-16 21:44 ` [PATCH 1/3] dmaengine: add TI EDMA DMA engine driver Matt Porter
  2012-08-16 22:29   ` Russell King - ARM Linux
@ 2012-08-21 18:20   ` Matt Porter
  1 sibling, 0 replies; 7+ messages in thread
From: Matt Porter @ 2012-08-21 18:20 UTC (permalink / raw)
  To: vinod.koul, dan.j.williams, cjb, grant.likely
  Cc: Linux Kernel Mailing List, Linux ARM Kernel List, Linux MMC List,
	Linux SPI Devel List, Linux DaVinci Kernel List, Sekhar Nori

On Thu, Aug 16, 2012 at 05:44:29PM -0400, Matt Porter wrote:
> Add a DMA engine driver for the TI EDMA controller. This driver
> is implemented as a wrapper around the existing DaVinci private
> DMA implementation. This approach allows for incremental conversion
> of each peripheral driver to the DMA engine API. The EDMA driver
> supports slave transfers but does not yet support cyclic transfers.

I got my hands on an AM18x WL12xx module so I could test MMC1, and
this exposed some bugs in the driver's handling of the second EDMA
channel controller. DA850/OMAP-L138/AM1x parts have this second EDMA
controller. I've addressed this in v2, which will be posted shortly.

I'm expecting to receive a DM646x EVM soon so I can cover testing
that part as well.

-Matt


