linux-pci.vger.kernel.org archive mirror
* [RFC v3 0/6] dmaengine: Add Synopsys eDMA IP driver (version 0)
@ 2019-01-11 18:33 Gustavo Pimentel
  2019-01-11 18:33 ` [RFC v3 1/7] dmaengine: Add Synopsys eDMA IP core driver Gustavo Pimentel
                   ` (6 more replies)
  0 siblings, 7 replies; 39+ messages in thread
From: Gustavo Pimentel @ 2019-01-11 18:33 UTC (permalink / raw)
  To: linux-pci, dmaengine
  Cc: Gustavo Pimentel, Vinod Koul, Andy Shevchenko, Russell King,
	Eugeniy Paltsev, Lorenzo Pieralisi, Bjorn Helgaas,
	Kishon Vijay Abraham I, Niklas Cassel, Joao Pinto, Jose Abreu,
	Luis Oliveira, Vitor Soares, Nelson Costa, Pedro Sousa

Add the Synopsys eDMA IP driver (version 0, EP side only) to the Linux
kernel. This IP is generally distributed with the Synopsys PCIe EndPoint
IP (depending on the use case and licensing agreement) and supports:
 - legacy and unroll modes
 - 16 independent and concurrent channels (8 write + 8 read)
 - linked list (scatter-gather) transfers
 - linked list descriptors that transfer from 1 byte to 4 Gbytes each
 - cyclic transfers
 - PCIe EndPoint glue logic

This patch series contains:
 - the eDMA core + eDMA core v0 driver (implements the DMAengine
 controller API and interfaces with the eDMA HW block)
 - an eDMA PCIe glue-logic reference driver (attaches to the Synopsys EP
 and provides memory access to the eDMA core driver)
 - an eDMA test driver (a reference driver that exercises the eDMA
 through the DMAengine client API)

Gustavo Pimentel (7):
  dmaengine: Add Synopsys eDMA IP core driver
  dmaengine: Add Synopsys eDMA IP version 0 support
  dmaengine: Add Synopsys eDMA IP version 0 debugfs support
  PCI: Add Synopsys endpoint EDDA Device id
  dmaengine: Add Synopsys eDMA IP PCIe glue-logic
  MAINTAINERS: Add Synopsys eDMA IP driver maintainer
  dmaengine: Add Synopsys eDMA IP test and sample driver

 MAINTAINERS                              |    7 +
 drivers/dma/Kconfig                      |    2 +
 drivers/dma/Makefile                     |    1 +
 drivers/dma/dw-edma/Kconfig              |   25 +
 drivers/dma/dw-edma/Makefile             |    8 +
 drivers/dma/dw-edma/dw-edma-core.c       | 1080 ++++++++++++++++++++++++++++++
 drivers/dma/dw-edma/dw-edma-core.h       |  151 +++++
 drivers/dma/dw-edma/dw-edma-pcie.c       |  254 +++++++
 drivers/dma/dw-edma/dw-edma-test.c       |  897 +++++++++++++++++++++++++
 drivers/dma/dw-edma/dw-edma-v0-core.c    |  347 ++++++++++
 drivers/dma/dw-edma/dw-edma-v0-core.h    |   26 +
 drivers/dma/dw-edma/dw-edma-v0-debugfs.c |  361 ++++++++++
 drivers/dma/dw-edma/dw-edma-v0-debugfs.h |   24 +
 drivers/dma/dw-edma/dw-edma-v0-regs.h    |  156 +++++
 drivers/misc/pci_endpoint_test.c         |    2 +-
 include/linux/dma/edma.h                 |   43 ++
 include/linux/pci_ids.h                  |    1 +
 17 files changed, 3384 insertions(+), 1 deletion(-)
 create mode 100644 drivers/dma/dw-edma/Kconfig
 create mode 100644 drivers/dma/dw-edma/Makefile
 create mode 100644 drivers/dma/dw-edma/dw-edma-core.c
 create mode 100644 drivers/dma/dw-edma/dw-edma-core.h
 create mode 100644 drivers/dma/dw-edma/dw-edma-pcie.c
 create mode 100644 drivers/dma/dw-edma/dw-edma-test.c
 create mode 100644 drivers/dma/dw-edma/dw-edma-v0-core.c
 create mode 100644 drivers/dma/dw-edma/dw-edma-v0-core.h
 create mode 100644 drivers/dma/dw-edma/dw-edma-v0-debugfs.c
 create mode 100644 drivers/dma/dw-edma/dw-edma-v0-debugfs.h
 create mode 100644 drivers/dma/dw-edma/dw-edma-v0-regs.h
 create mode 100644 include/linux/dma/edma.h

Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Eugeniy Paltsev <paltsev@synopsys.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Kishon Vijay Abraham I <kishon@ti.com>
Cc: Niklas Cassel <niklas.cassel@linaro.org>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Jose Abreu <jose.abreu@synopsys.com>
Cc: Luis Oliveira <lolivei@synopsys.com>
Cc: Vitor Soares <vitor.soares@synopsys.com>
Cc: Nelson Costa <nelson.costa@synopsys.com>
Cc: Pedro Sousa <pedrom.sousa@synopsys.com>

-- 
2.7.4



* [RFC v3 1/7] dmaengine: Add Synopsys eDMA IP core driver
  2019-01-11 18:33 [RFC v3 0/6] dmaengine: Add Synopsys eDMA IP driver (version 0) Gustavo Pimentel
@ 2019-01-11 18:33 ` Gustavo Pimentel
  2019-01-16 10:21   ` Jose Abreu
  2019-01-20 11:44   ` Vinod Koul
  2019-01-11 18:33 ` [RFC v3 2/7] dmaengine: Add Synopsys eDMA IP version 0 support Gustavo Pimentel
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 39+ messages in thread
From: Gustavo Pimentel @ 2019-01-11 18:33 UTC (permalink / raw)
  To: linux-pci, dmaengine
  Cc: Gustavo Pimentel, Vinod Koul, Dan Williams, Eugeniy Paltsev,
	Andy Shevchenko, Russell King, Niklas Cassel, Joao Pinto,
	Jose Abreu, Luis Oliveira, Vitor Soares, Nelson Costa,
	Pedro Sousa

Add the Synopsys eDMA IP core driver to the kernel.

This core driver initializes and configures the eDMA IP using the
virt-dma helper functions and the DMA engine subsystem.

It also creates an abstraction layer through callbacks, allowing
different register mappings in the future, organized into versions.

This driver can be compiled as built-in or as a loadable module.

To enable it, select the DW_EDMA option in the kernel configuration;
it automatically selects the DMA_ENGINE and DMA_VIRTUAL_CHANNELS
options as well.
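For reference, a resulting configuration fragment might look like the following (module build shown; the dependent options are selected automatically, so they need not be set by hand):

```
CONFIG_DW_EDMA=m
# selected automatically by DW_EDMA:
CONFIG_DMA_ENGINE=y
CONFIG_DMA_VIRTUAL_CHANNELS=y
```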

Changes:
RFC v1->RFC v2:
 - Replace // (C99-style) comments with /* */
 - Fix the headers of the .c and .h files according to the most recent
   convention
 - Fix errors and warnings reported by checkpatch with the --strict
   option
 - Change the patch subject prefix from dma to dmaengine
 - Change some dev_info() calls into dev_dbg()
 - Remove unnecessary zero initialization after kzalloc
 - Remove direction validation from the config() API, since the
   direction parameter is deprecated
 - Refactor code to replace atomic_t with u32
 - Rename start_transfer() to dw_edma_start_transfer()
 - Add a spinlock to dw_edma_device_prep_slave_sg()
 - Add a spinlock to dw_edma_free_chunk()
 - Simplify switch statements into ifs in dw_edma_device_pause(),
   dw_edma_device_resume() and dw_edma_device_terminate_all()
RFC v2->RFC v3:
 - Add a driver parameter to disable the MSI-X feature
 - Fix printk format for phys_addr_t variables
 - Fix printk format for __iomem variables
 - Fix printk format for size_t variables
 - Add comments or improve existing ones
 - Add support for multiple IRQs
 - Fix source and destination addresses
 - Add defines for magic numbers
 - Add DMA cyclic transfer support

Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Eugeniy Paltsev <paltsev@synopsys.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Niklas Cassel <niklas.cassel@linaro.org>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Jose Abreu <jose.abreu@synopsys.com>
Cc: Luis Oliveira <lolivei@synopsys.com>
Cc: Vitor Soares <vitor.soares@synopsys.com>
Cc: Nelson Costa <nelson.costa@synopsys.com>
Cc: Pedro Sousa <pedrom.sousa@synopsys.com>
---
 drivers/dma/Kconfig                |    2 +
 drivers/dma/Makefile               |    1 +
 drivers/dma/dw-edma/Kconfig        |    9 +
 drivers/dma/dw-edma/Makefile       |    4 +
 drivers/dma/dw-edma/dw-edma-core.c | 1059 ++++++++++++++++++++++++++++++++++++
 drivers/dma/dw-edma/dw-edma-core.h |  151 +++++
 include/linux/dma/edma.h           |   43 ++
 7 files changed, 1269 insertions(+)
 create mode 100644 drivers/dma/dw-edma/Kconfig
 create mode 100644 drivers/dma/dw-edma/Makefile
 create mode 100644 drivers/dma/dw-edma/dw-edma-core.c
 create mode 100644 drivers/dma/dw-edma/dw-edma-core.h
 create mode 100644 include/linux/dma/edma.h

diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index d2286c7..5877ee5 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -651,6 +651,8 @@ source "drivers/dma/qcom/Kconfig"
 
 source "drivers/dma/dw/Kconfig"
 
+source "drivers/dma/dw-edma/Kconfig"
+
 source "drivers/dma/hsu/Kconfig"
 
 source "drivers/dma/sh/Kconfig"
diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
index 09571a8..e42e7d6 100644
--- a/drivers/dma/Makefile
+++ b/drivers/dma/Makefile
@@ -29,6 +29,7 @@ obj-$(CONFIG_DMA_SUN4I) += sun4i-dma.o
 obj-$(CONFIG_DMA_SUN6I) += sun6i-dma.o
 obj-$(CONFIG_DW_AXI_DMAC) += dw-axi-dmac/
 obj-$(CONFIG_DW_DMAC_CORE) += dw/
+obj-$(CONFIG_DW_EDMA) += dw-edma/
 obj-$(CONFIG_EP93XX_DMA) += ep93xx_dma.o
 obj-$(CONFIG_FSL_DMA) += fsldma.o
 obj-$(CONFIG_FSL_EDMA) += fsl-edma.o fsl-edma-common.o
diff --git a/drivers/dma/dw-edma/Kconfig b/drivers/dma/dw-edma/Kconfig
new file mode 100644
index 0000000..3016bed
--- /dev/null
+++ b/drivers/dma/dw-edma/Kconfig
@@ -0,0 +1,9 @@
+# SPDX-License-Identifier: GPL-2.0
+
+config DW_EDMA
+	tristate "Synopsys DesignWare eDMA controller driver"
+	select DMA_ENGINE
+	select DMA_VIRTUAL_CHANNELS
+	help
+	  Support the Synopsys DesignWare eDMA controller, normally
+	  implemented in endpoint SoCs.
diff --git a/drivers/dma/dw-edma/Makefile b/drivers/dma/dw-edma/Makefile
new file mode 100644
index 0000000..3224010
--- /dev/null
+++ b/drivers/dma/dw-edma/Makefile
@@ -0,0 +1,4 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_DW_EDMA)		+= dw-edma.o
+dw-edma-objs			:= dw-edma-core.o
diff --git a/drivers/dma/dw-edma/dw-edma-core.c b/drivers/dma/dw-edma/dw-edma-core.c
new file mode 100644
index 0000000..2b6b70f
--- /dev/null
+++ b/drivers/dma/dw-edma/dw-edma-core.c
@@ -0,0 +1,1059 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2018 Synopsys, Inc. and/or its affiliates.
+ * Synopsys DesignWare eDMA core driver
+ */
+
+#include <linux/module.h>
+#include <linux/device.h>
+#include <linux/kernel.h>
+#include <linux/pm_runtime.h>
+#include <linux/dmaengine.h>
+#include <linux/err.h>
+#include <linux/interrupt.h>
+#include <linux/dma/edma.h>
+#include <linux/pci.h>
+
+#include "dw-edma-core.h"
+#include "../dmaengine.h"
+#include "../virt-dma.h"
+
+#define SET(reg, name, val)			\
+	reg.name = val
+
+#define SET_BOTH_CH(name, value)		\
+	do {					\
+		SET(dw->wr_edma, name, value);	\
+		SET(dw->rd_edma, name, value);	\
+	} while (0)
+
+static inline
+struct device *dchan2dev(struct dma_chan *dchan)
+{
+	return &dchan->dev->device;
+}
+
+static inline
+struct device *chan2dev(struct dw_edma_chan *chan)
+{
+	return &chan->vc.chan.dev->device;
+}
+
+static inline
+const struct dw_edma_core_ops *chan2ops(struct dw_edma_chan *chan)
+{
+	return chan->chip->dw->ops;
+}
+
+static inline
+struct dw_edma_desc *vd2dw_edma_desc(struct virt_dma_desc *vd)
+{
+	return container_of(vd, struct dw_edma_desc, vd);
+}
+
+static struct dw_edma_burst *dw_edma_alloc_burst(struct dw_edma_chunk *chunk)
+{
+	struct dw_edma_chan *chan = chunk->chan;
+	struct dw_edma_burst *burst;
+
+	burst = kvzalloc(sizeof(*burst), GFP_NOWAIT);
+	if (unlikely(!burst))
+		return NULL;
+
+	INIT_LIST_HEAD(&burst->list);
+
+	if (chunk->burst) {
+		/* Create and add new element into the linked list */
+		chunk->bursts_alloc++;
+		dev_dbg(chan2dev(chan), "alloc new burst element (%d)\n",
+			chunk->bursts_alloc);
+		list_add_tail(&burst->list, &chunk->burst->list);
+	} else {
+		/* List head */
+		chunk->bursts_alloc = 0;
+		chunk->burst = burst;
+		dev_dbg(chan2dev(chan), "alloc new burst head\n");
+	}
+
+	return burst;
+}
+
+static struct dw_edma_chunk *dw_edma_alloc_chunk(struct dw_edma_desc *desc)
+{
+	struct dw_edma_chan *chan = desc->chan;
+	struct dw_edma *dw = chan->chip->dw;
+	struct dw_edma_chunk *chunk;
+
+	chunk = kvzalloc(sizeof(*chunk), GFP_NOWAIT);
+	if (unlikely(!chunk))
+		return NULL;
+
+	INIT_LIST_HEAD(&chunk->list);
+	chunk->chan = chan;
+	chunk->cb = !(desc->chunks_alloc % 2);
+	chunk->ll_region.paddr = dw->ll_region.paddr + chan->ll_off;
+	chunk->ll_region.vaddr = dw->ll_region.vaddr + chan->ll_off;
+
+	if (desc->chunk) {
+		/* Create and add new element into the linked list */
+		desc->chunks_alloc++;
+		dev_dbg(chan2dev(chan), "alloc new chunk element (%d)\n",
+			desc->chunks_alloc);
+		list_add_tail(&chunk->list, &desc->chunk->list);
+		dw_edma_alloc_burst(chunk);
+	} else {
+		/* List head */
+		chunk->burst = NULL;
+		desc->chunks_alloc = 0;
+		desc->chunk = chunk;
+		dev_dbg(chan2dev(chan), "alloc new chunk head\n");
+	}
+
+	return chunk;
+}
+
+static struct dw_edma_desc *dw_edma_alloc_desc(struct dw_edma_chan *chan)
+{
+	struct dw_edma_desc *desc;
+
+	dev_dbg(chan2dev(chan), "alloc new descriptor\n");
+
+	desc = kvzalloc(sizeof(*desc), GFP_NOWAIT);
+	if (unlikely(!desc))
+		return NULL;
+
+	desc->chan = chan;
+	dw_edma_alloc_chunk(desc);
+
+	return desc;
+}
+
+static void dw_edma_free_burst(struct dw_edma_chunk *chunk)
+{
+	struct dw_edma_burst *child, *_next;
+
+	if (!chunk->burst)
+		return;
+
+	/* Remove all the list elements */
+	list_for_each_entry_safe(child, _next, &chunk->burst->list, list) {
+		list_del(&child->list);
+		kvfree(child);
+		chunk->bursts_alloc--;
+	}
+
+	/* Remove the list head */
+	kvfree(child);
+	chunk->burst = NULL;
+}
+
+static void dw_edma_free_chunk(struct dw_edma_desc *desc)
+{
+	struct dw_edma_chan *chan = desc->chan;
+	struct dw_edma_chunk *child, *_next;
+
+	if (!desc->chunk)
+		return;
+
+	/* Remove all the list elements */
+	list_for_each_entry_safe(child, _next, &desc->chunk->list, list) {
+		dw_edma_free_burst(child);
+		if (child->bursts_alloc)
+			dev_dbg(chan2dev(chan),	"%u bursts still allocated\n",
+				child->bursts_alloc);
+		list_del(&child->list);
+		kvfree(child);
+		desc->chunks_alloc--;
+	}
+
+	/* Remove the list head */
+	kvfree(child);
+	desc->chunk = NULL;
+}
+
+static void dw_edma_free_desc(struct dw_edma_desc *desc)
+{
+	struct dw_edma_chan *chan = desc->chan;
+	unsigned long flags;
+
+	spin_lock_irqsave(&chan->vc.lock, flags);
+
+	dw_edma_free_chunk(desc);
+	if (desc->chunks_alloc)
+		dev_dbg(chan2dev(chan), "%u chunks still allocated\n",
+			desc->chunks_alloc);
+
+	spin_unlock_irqrestore(&chan->vc.lock, flags);
+}
+
+static void vchan_free_desc(struct virt_dma_desc *vdesc)
+{
+	dw_edma_free_desc(vd2dw_edma_desc(vdesc));
+}
+
+static void dw_edma_start_transfer(struct dw_edma_chan *chan)
+{
+	struct virt_dma_desc *vd;
+	struct dw_edma_desc *desc;
+	struct dw_edma_chunk *child;
+	const struct dw_edma_core_ops *ops = chan2ops(chan);
+
+	vd = vchan_next_desc(&chan->vc);
+	if (!vd)
+		return;
+
+	desc = vd2dw_edma_desc(vd);
+	if (!desc)
+		return;
+
+	child = list_first_entry_or_null(&desc->chunk->list,
+					 struct dw_edma_chunk, list);
+	if (!child)
+		return;
+
+	ops->start(child, !desc->xfer_sz);
+	desc->xfer_sz += child->ll_region.sz;
+	dev_dbg(chan2dev(chan), "transfer of %u bytes started\n",
+		child->ll_region.sz);
+
+	dw_edma_free_burst(child);
+	if (child->bursts_alloc)
+		dev_dbg(chan2dev(chan),	"%u bursts still allocated\n",
+			child->bursts_alloc);
+	list_del(&child->list);
+	kvfree(child);
+	desc->chunks_alloc--;
+}
+
+static int dw_edma_device_config(struct dma_chan *dchan,
+				 struct dma_slave_config *config)
+{
+	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
+	const struct dw_edma_core_ops *ops = chan2ops(chan);
+	unsigned long flags;
+	int err = 0;
+
+	spin_lock_irqsave(&chan->vc.lock, flags);
+
+	if (!config) {
+		err = -EINVAL;
+		goto err_config;
+	}
+
+	if (chan->status != EDMA_ST_IDLE) {
+		dev_err(chan2dev(chan), "channel is busy or paused\n");
+		err = -EPERM;
+		goto err_config;
+	}
+
+	dev_dbg(chan2dev(chan), "addr(physical) src=%pa, dst=%pa\n",
+		&config->src_addr, &config->dst_addr);
+
+	chan->src_addr = config->src_addr;
+	chan->dst_addr = config->dst_addr;
+
+	err = ops->device_config(dchan);
+	if (!err) {
+		chan->configured = true;
+		dev_dbg(chan2dev(chan),	"channel configured\n");
+	}
+
+err_config:
+	spin_unlock_irqrestore(&chan->vc.lock, flags);
+	return err;
+}
+
+static int dw_edma_device_pause(struct dma_chan *dchan)
+{
+	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
+	unsigned long flags;
+	int err = 0;
+
+	spin_lock_irqsave(&chan->vc.lock, flags);
+
+	if (!chan->configured) {
+		dev_err(chan2dev(chan), "(pause) channel not configured\n");
+		err = -EPERM;
+		goto err_pause;
+	}
+
+	if (chan->status != EDMA_ST_BUSY) {
+		err = -EPERM;
+		goto err_pause;
+	}
+
+	if (chan->request != EDMA_REQ_NONE) {
+		err = -EPERM;
+		goto err_pause;
+	}
+
+	chan->request = EDMA_REQ_PAUSE;
+	dev_dbg(chan2dev(chan), "pause requested\n");
+
+err_pause:
+	spin_unlock_irqrestore(&chan->vc.lock, flags);
+	return err;
+}
+
+static int dw_edma_device_resume(struct dma_chan *dchan)
+{
+	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
+	unsigned long flags;
+	int err = 0;
+
+	spin_lock_irqsave(&chan->vc.lock, flags);
+
+	if (!chan->configured) {
+		dev_err(chan2dev(chan), "(resume) channel not configured\n");
+		err = -EPERM;
+		goto err_resume;
+	}
+
+	if (chan->status != EDMA_ST_PAUSE) {
+		err = -EPERM;
+		goto err_resume;
+	}
+
+	if (chan->request != EDMA_REQ_NONE) {
+		err = -EPERM;
+		goto err_resume;
+	}
+
+	chan->status = EDMA_ST_BUSY;
+	dev_dbg(dchan2dev(dchan), "transfer resumed\n");
+	dw_edma_start_transfer(chan);
+
+err_resume:
+	spin_unlock_irqrestore(&chan->vc.lock, flags);
+	return err;
+}
+
+static int dw_edma_device_terminate_all(struct dma_chan *dchan)
+{
+	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
+	const struct dw_edma_core_ops *ops = chan2ops(chan);
+	unsigned long flags;
+	int err = 0;
+	LIST_HEAD(head);
+
+	spin_lock_irqsave(&chan->vc.lock, flags);
+
+	if (!chan->configured)
+		goto err_terminate;
+
+	if (chan->status == EDMA_ST_PAUSE) {
+		dev_dbg(dchan2dev(dchan), "channel is paused, stopping immediately\n");
+		chan->status = EDMA_ST_IDLE;
+		chan->configured = false;
+		goto err_terminate;
+	} else if (chan->status == EDMA_ST_IDLE) {
+		chan->configured = false;
+		goto err_terminate;
+	} else if (ops->ch_status(chan) == DMA_COMPLETE) {
+		/*
+		 * The channel is in a false BUSY state, probably didn't
+		 * receive or lost an interrupt
+		 */
+		chan->status = EDMA_ST_IDLE;
+		chan->configured = false;
+		goto err_terminate;
+	}
+
+	if (chan->request > EDMA_REQ_PAUSE) {
+		err = -EPERM;
+		goto err_terminate;
+	}
+
+	chan->request = EDMA_REQ_STOP;
+	dev_dbg(dchan2dev(dchan), "termination requested\n");
+
+err_terminate:
+	spin_unlock_irqrestore(&chan->vc.lock, flags);
+	return err;
+}
+
+static void dw_edma_device_issue_pending(struct dma_chan *dchan)
+{
+	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
+	unsigned long flags;
+
+	spin_lock_irqsave(&chan->vc.lock, flags);
+
+	if (chan->configured && chan->request == EDMA_REQ_NONE &&
+	    chan->status == EDMA_ST_IDLE && vchan_issue_pending(&chan->vc)) {
+		dev_dbg(dchan2dev(dchan), "transfer issued\n");
+		chan->status = EDMA_ST_BUSY;
+		dw_edma_start_transfer(chan);
+	}
+
+	spin_unlock_irqrestore(&chan->vc.lock, flags);
+}
+
+static enum dma_status
+dw_edma_device_tx_status(struct dma_chan *dchan, dma_cookie_t cookie,
+			 struct dma_tx_state *txstate)
+{
+	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
+	const struct dw_edma_core_ops *ops = chan2ops(chan);
+	unsigned long flags;
+	enum dma_status ret;
+
+	spin_lock_irqsave(&chan->vc.lock, flags);
+
+	ret = ops->ch_status(chan);
+	if (ret == DMA_ERROR) {
+		goto ret_status;
+	} else if (ret == DMA_IN_PROGRESS) {
+		chan->status = EDMA_ST_BUSY;
+		goto ret_status;
+	} else {
+		/* DMA_COMPLETE */
+		if (chan->status == EDMA_ST_PAUSE)
+			ret = DMA_PAUSED;
+		else if (chan->status == EDMA_ST_BUSY)
+			ret = DMA_IN_PROGRESS;
+		else
+			ret = DMA_COMPLETE;
+	}
+
+ret_status:
+	spin_unlock_irqrestore(&chan->vc.lock, flags);
+	dma_set_residue(txstate, 0);
+
+	return ret;
+}
+
+static struct dma_async_tx_descriptor *
+dw_edma_device_prep_slave_sg(struct dma_chan *dchan, struct scatterlist *sgl,
+			     unsigned int sg_len,
+			     enum dma_transfer_direction direction,
+			     unsigned long flags, void *context)
+{
+	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
+	struct dw_edma_desc *desc;
+	struct dw_edma_chunk *chunk;
+	struct dw_edma_burst *burst;
+	struct scatterlist *sg;
+	unsigned long sflags;
+	phys_addr_t src_addr;
+	phys_addr_t dst_addr;
+	int i;
+
+	if (sg_len < 1) {
+		dev_err(chan2dev(chan), "invalid sg length %u\n", sg_len);
+		return NULL;
+	}
+
+	if (direction == DMA_DEV_TO_MEM && chan->dir == EDMA_DIR_WRITE) {
+		dev_dbg(chan2dev(chan),	"prepare operation (WRITE)\n");
+	} else if (direction == DMA_MEM_TO_DEV && chan->dir == EDMA_DIR_READ) {
+		dev_dbg(chan2dev(chan),	"prepare operation (READ)\n");
+	} else {
+		dev_err(chan2dev(chan), "invalid direction\n");
+		return NULL;
+	}
+
+	if (!chan->configured) {
+		dev_err(chan2dev(chan), "(prep_slave_sg) channel not configured\n");
+		return NULL;
+	}
+
+	if (chan->status != EDMA_ST_IDLE) {
+		dev_err(chan2dev(chan), "channel is busy or paused\n");
+		return NULL;
+	}
+
+	spin_lock_irqsave(&chan->vc.lock, sflags);
+
+	desc = dw_edma_alloc_desc(chan);
+	if (unlikely(!desc))
+		goto err_alloc;
+
+	chunk = dw_edma_alloc_chunk(desc);
+	if (unlikely(!chunk))
+		goto err_alloc;
+
+	src_addr = chan->src_addr;
+	dst_addr = chan->dst_addr;
+
+	for_each_sg(sgl, sg, sg_len, i) {
+		if (chunk->bursts_alloc == chan->ll_max) {
+			chunk = dw_edma_alloc_chunk(desc);
+			if (unlikely(!chunk))
+				goto err_alloc;
+		}
+
+		burst = dw_edma_alloc_burst(chunk);
+
+		if (unlikely(!burst))
+			goto err_alloc;
+
+		burst->sz = sg_dma_len(sg);
+		chunk->ll_region.sz += burst->sz;
+		desc->alloc_sz += burst->sz;
+
+		if (direction == DMA_MEM_TO_DEV) {
+			burst->sar = sg_dma_address(sg);
+			burst->dar = dst_addr;
+			dst_addr += sg_dma_len(sg);
+		} else {
+			burst->sar = src_addr;
+			burst->dar = sg_dma_address(sg);
+			src_addr += sg_dma_len(sg);
+		}
+
+		dev_dbg(chan2dev(chan), "lli %u/%u, sar=0x%.8llx, dar=0x%.8llx, size=%u bytes\n",
+			i + 1, sg_len, burst->sar, burst->dar, burst->sz);
+	}
+
+	spin_unlock_irqrestore(&chan->vc.lock, sflags);
+	return vchan_tx_prep(&chan->vc, &desc->vd, flags);
+
+err_alloc:
+	if (desc)
+		dw_edma_free_desc(desc);
+	spin_unlock_irqrestore(&chan->vc.lock, sflags);
+	return NULL;
+}
+
+static struct dma_async_tx_descriptor *
+dw_edma_device_prep_dma_cyclic(struct dma_chan *dchan, dma_addr_t buf_addr,
+			       size_t len, size_t cyclic_cnt,
+			       enum dma_transfer_direction direction,
+			       unsigned long flags)
+{
+	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
+	struct dw_edma_desc *desc;
+	struct dw_edma_chunk *chunk;
+	struct dw_edma_burst *burst;
+	unsigned long sflags;
+	phys_addr_t src_addr;
+	phys_addr_t dst_addr;
+	u32 i;
+
+	if (!len || !cyclic_cnt) {
+		dev_err(chan2dev(chan), "invalid len or cyclic count\n");
+		return NULL;
+	}
+
+	if (direction == DMA_DEV_TO_MEM && chan->dir == EDMA_DIR_WRITE) {
+		dev_dbg(chan2dev(chan),	"prepare operation (WRITE)\n");
+	} else if (direction == DMA_MEM_TO_DEV && chan->dir == EDMA_DIR_READ) {
+		dev_dbg(chan2dev(chan),	"prepare operation (READ)\n");
+	} else {
+		dev_err(chan2dev(chan), "invalid direction\n");
+		return NULL;
+	}
+
+	if (!chan->configured) {
+		dev_err(chan2dev(chan), "(prep_dma_cyclic) channel not configured\n");
+		return NULL;
+	}
+
+	if (chan->status != EDMA_ST_IDLE) {
+		dev_err(chan2dev(chan), "channel is busy or paused\n");
+		return NULL;
+	}
+
+	spin_lock_irqsave(&chan->vc.lock, sflags);
+
+	desc = dw_edma_alloc_desc(chan);
+	if (unlikely(!desc))
+		goto err_alloc;
+
+	chunk = dw_edma_alloc_chunk(desc);
+	if (unlikely(!chunk))
+		goto err_alloc;
+
+	src_addr = chan->src_addr;
+	dst_addr = chan->dst_addr;
+
+	for (i = 0; i < cyclic_cnt; i++) {
+		if (chunk->bursts_alloc == chan->ll_max) {
+			chunk = dw_edma_alloc_chunk(desc);
+			if (unlikely(!chunk))
+				goto err_alloc;
+		}
+
+		burst = dw_edma_alloc_burst(chunk);
+
+		if (unlikely(!burst))
+			goto err_alloc;
+
+		burst->sz = len;
+		chunk->ll_region.sz += burst->sz;
+		desc->alloc_sz += burst->sz;
+
+		burst->sar = src_addr;
+		burst->dar = dst_addr;
+
+		dev_dbg(chan2dev(chan), "lli %u/%u, sar=0x%.8llx, dar=0x%.8llx, size=%u bytes\n",
+			i + 1, cyclic_cnt, burst->sar, burst->dar, burst->sz);
+	}
+
+	spin_unlock_irqrestore(&chan->vc.lock, sflags);
+	return vchan_tx_prep(&chan->vc, &desc->vd, flags);
+
+err_alloc:
+	if (desc)
+		dw_edma_free_desc(desc);
+	spin_unlock_irqrestore(&chan->vc.lock, sflags);
+	return NULL;
+}
+
+static void dw_edma_done_interrupt(struct dw_edma_chan *chan)
+{
+	struct dw_edma *dw = chan->chip->dw;
+	const struct dw_edma_core_ops *ops = dw->ops;
+	struct virt_dma_desc *vd;
+	struct dw_edma_desc *desc;
+	unsigned long flags;
+
+	ops->clear_done_int(chan);
+	dev_dbg(chan2dev(chan), "clear done interrupt\n");
+
+	spin_lock_irqsave(&chan->vc.lock, flags);
+	vd = vchan_next_desc(&chan->vc);
+	switch (chan->request) {
+	case EDMA_REQ_NONE:
+		if (!vd)
+			break;
+
+		desc = vd2dw_edma_desc(vd);
+		if (desc->chunks_alloc) {
+			dev_dbg(chan2dev(chan), "sub-transfer complete\n");
+			chan->status = EDMA_ST_BUSY;
+			dev_dbg(chan2dev(chan), "transferred %u bytes\n",
+				desc->xfer_sz);
+			dw_edma_start_transfer(chan);
+		} else {
+			list_del(&vd->node);
+			vchan_cookie_complete(vd);
+			chan->status = EDMA_ST_IDLE;
+			dev_dbg(chan2dev(chan), "transfer complete\n");
+		}
+		break;
+	case EDMA_REQ_STOP:
+		if (!vd)
+			break;
+
+		list_del(&vd->node);
+		vchan_cookie_complete(vd);
+		chan->request = EDMA_REQ_NONE;
+		chan->status = EDMA_ST_IDLE;
+		chan->configured = false;
+		dev_dbg(chan2dev(chan), "transfer stop\n");
+		break;
+	case EDMA_REQ_PAUSE:
+		chan->request = EDMA_REQ_NONE;
+		chan->status = EDMA_ST_PAUSE;
+		break;
+	default:
+		dev_err(chan2dev(chan), "invalid status state\n");
+		break;
+	}
+	spin_unlock_irqrestore(&chan->vc.lock, flags);
+}
+
+static void dw_edma_abort_interrupt(struct dw_edma_chan *chan)
+{
+	struct dw_edma *dw = chan->chip->dw;
+	const struct dw_edma_core_ops *ops = dw->ops;
+	struct virt_dma_desc *vd;
+	unsigned long flags;
+
+	ops->clear_abort_int(chan);
+	dev_dbg(chan2dev(chan), "clear abort interrupt\n");
+
+	spin_lock_irqsave(&chan->vc.lock, flags);
+	vd = vchan_next_desc(&chan->vc);
+	if (vd) {
+		list_del(&vd->node);
+		vchan_cookie_complete(vd);
+	}
+	chan->request = EDMA_REQ_NONE;
+	chan->status = EDMA_ST_IDLE;
+
+	spin_unlock_irqrestore(&chan->vc.lock, flags);
+}
+
+static irqreturn_t dw_edma_interrupt_write(int irq, void *data)
+{
+	struct dw_edma_chip *chip = data;
+	struct dw_edma *dw = chip->dw;
+	const struct dw_edma_core_ops *ops = dw->ops;
+	unsigned long tot = dw->wr_ch_cnt;
+	unsigned long pos = 0;
+	unsigned long val;
+
+	pos = 0;
+	val = ops->status_done_int(dw, EDMA_DIR_WRITE);
+	while ((pos = find_next_bit(&val, tot, pos)) != tot) {
+		struct dw_edma_chan *chan = &dw->chan[pos];
+
+		dw_edma_done_interrupt(chan);
+		pos++;
+	}
+
+	pos = 0;
+	val = ops->status_abort_int(dw, EDMA_DIR_WRITE);
+	while ((pos = find_next_bit(&val, tot, pos)) != tot) {
+		struct dw_edma_chan *chan = &dw->chan[pos];
+
+		dw_edma_abort_interrupt(chan);
+		pos++;
+	}
+
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t dw_edma_interrupt_read(int irq, void *data)
+{
+	struct dw_edma_chip *chip = data;
+	struct dw_edma *dw = chip->dw;
+	const struct dw_edma_core_ops *ops = dw->ops;
+	unsigned long tot = dw->rd_ch_cnt;
+	unsigned long off = dw->wr_ch_cnt;
+	unsigned long pos, val;
+
+	pos = 0;
+	val = ops->status_done_int(dw, EDMA_DIR_READ);
+	while ((pos = find_next_bit(&val, tot, pos)) != tot) {
+		struct dw_edma_chan *chan = &dw->chan[pos + off];
+
+		dw_edma_done_interrupt(chan);
+		pos++;
+	}
+
+	pos = 0;
+	val = ops->status_abort_int(dw, EDMA_DIR_READ);
+	while ((pos = find_next_bit(&val, tot, pos)) != tot) {
+		struct dw_edma_chan *chan = &dw->chan[pos + off];
+
+		dw_edma_abort_interrupt(chan);
+		pos++;
+	}
+
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t dw_edma_interrupt_all(int irq, void *data)
+{
+	dw_edma_interrupt_write(irq, data);
+	dw_edma_interrupt_read(irq, data);
+
+	return IRQ_HANDLED;
+}
+
+static int dw_edma_alloc_chan_resources(struct dma_chan *dchan)
+{
+	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
+
+	if (chan->status != EDMA_ST_IDLE) {
+		dev_err(chan2dev(chan), "channel is busy\n");
+		return -EBUSY;
+	}
+
+	dev_dbg(dchan2dev(dchan), "allocated\n");
+
+	pm_runtime_get(chan->chip->dev);
+
+	return 0;
+}
+
+static void dw_edma_free_chan_resources(struct dma_chan *dchan)
+{
+	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
+	unsigned long timeout = jiffies + msecs_to_jiffies(5000);
+	int ret;
+
+	if (chan->status != EDMA_ST_IDLE)
+		dev_err(chan2dev(chan), "channel is busy\n");
+
+	do {
+		ret = dw_edma_device_terminate_all(dchan);
+		if (!ret)
+			break;
+
+		if (time_after_eq(jiffies, timeout)) {
+			dev_err(chan2dev(chan), "free timeout\n");
+			return;
+		}
+
+		cpu_relax();
+	} while (1);
+
+	dev_dbg(dchan2dev(dchan), "channel freed\n");
+
+	pm_runtime_put(chan->chip->dev);
+}
+
+int dw_edma_probe(struct dw_edma_chip *chip)
+{
+	struct dw_edma *dw = chip->dw;
+	struct device *dev = chip->dev;
+	const struct dw_edma_core_ops *ops;
+	size_t ll_chunk = dw->ll_region.sz;
+	size_t dt_chunk = dw->dt_region.sz;
+	u32 ch_tot;
+	int i, j, err;
+
+	raw_spin_lock_init(&dw->lock);
+
+	/* Callback operation selection accordingly to eDMA version */
+	switch (dw->version) {
+	default:
+		dev_err(dev, "unsupported version\n");
+		return -EPERM;
+	}
+
+	pm_runtime_get_sync(dev);
+
+	/* Find out how many write channels are supported by hardware */
+	dw->wr_ch_cnt = ops->ch_count(dw, EDMA_DIR_WRITE);
+	if (!dw->wr_ch_cnt) {
+		dev_err(dev, "invalid number of write channels(0)\n");
+		return -EINVAL;
+	}
+
+	/* Find out how many read channels are supported by hardware */
+	dw->rd_ch_cnt = ops->ch_count(dw, EDMA_DIR_READ);
+	if (!dw->rd_ch_cnt) {
+		dev_err(dev, "invalid number of read channels(0)\n");
+		return -EINVAL;
+	}
+
+	dev_dbg(dev, "Channels:\twrite=%d, read=%d\n",
+		dw->wr_ch_cnt, dw->rd_ch_cnt);
+
+	ch_tot = dw->wr_ch_cnt + dw->rd_ch_cnt;
+
+	/* Allocate channels */
+	dw->chan = devm_kcalloc(dev, ch_tot, sizeof(*dw->chan), GFP_KERNEL);
+	if (!dw->chan)
+		return -ENOMEM;
+
+	/* Calculate the linked list chunk for each channel */
+	ll_chunk /= roundup_pow_of_two(ch_tot);
+
+	/* Calculate the data transfer chunk for each channel */
+	dt_chunk /= roundup_pow_of_two(ch_tot);
+
+	/* Disable eDMA, only to establish the ideal initial conditions */
+	ops->off(dw);
+
+	snprintf(dw->name, sizeof(dw->name), "dw-edma-core:%d", chip->id);
+
+	/* Request IRQs */
+	if (dw->nr_irqs != 1) {
+		dev_err(dev, "invalid number of irqs (%u)\n", dw->nr_irqs);
+		return -EINVAL;
+	}
+
+	for (i = 0; i < dw->nr_irqs; i++) {
+		err = devm_request_irq(dev, pci_irq_vector(to_pci_dev(dev), i),
+				       dw_edma_interrupt_all,
+				       IRQF_SHARED, dw->name, chip);
+		if (err)
+			return err;
+	}
+
+	/* Create write channels */
+	INIT_LIST_HEAD(&dw->wr_edma.channels);
+	for (i = 0; i < dw->wr_ch_cnt; i++) {
+		struct dw_edma_chan *chan = &dw->chan[i];
+		struct dw_edma_region *dt_region;
+
+		dt_region = devm_kzalloc(dev, sizeof(*dt_region), GFP_KERNEL);
+		if (!dt_region)
+			return -ENOMEM;
+
+		chan->vc.chan.private = dt_region;
+
+		chan->chip = chip;
+		chan->id = i;
+		chan->dir = EDMA_DIR_WRITE;
+		chan->configured = false;
+		chan->request = EDMA_REQ_NONE;
+		chan->status = EDMA_ST_IDLE;
+
+		chan->ll_off = (ll_chunk * i);
+		chan->ll_max = (ll_chunk / EDMA_LL_SZ) - 1;
+
+		chan->dt_off = (dt_chunk * i);
+
+		dev_dbg(dev, "L. List:\tChannel write[%u] off=0x%.8lx, max_cnt=%u\n",
+			i, chan->ll_off, chan->ll_max);
+
+		memcpy(&chan->msi, &dw->msi[0], sizeof(chan->msi));
+
+		dev_dbg(dev, "MSI:\t\tChannel write[%u] addr=0x%.8x%.8x, data=0x%.8x\n",
+			i, chan->msi.address_hi, chan->msi.address_lo,
+			chan->msi.data);
+
+		chan->vc.desc_free = vchan_free_desc;
+		vchan_init(&chan->vc, &dw->wr_edma);
+
+		dt_region->paddr = dw->dt_region.paddr + chan->dt_off;
+		dt_region->vaddr = dw->dt_region.vaddr + chan->dt_off;
+		dt_region->sz = dt_chunk;
+
+		dev_dbg(dev, "Data:\tChannel write[%u] off=0x%.8lx\n",
+			i, chan->dt_off);
+	}
+	dma_cap_zero(dw->wr_edma.cap_mask);
+	dma_cap_set(DMA_SLAVE, dw->wr_edma.cap_mask);
+	dma_cap_set(DMA_CYCLIC, dw->wr_edma.cap_mask);
+	dw->wr_edma.directions = BIT(DMA_DEV_TO_MEM);
+	dw->wr_edma.chancnt = dw->wr_ch_cnt;
+
+	/* Create read channels */
+	INIT_LIST_HEAD(&dw->rd_edma.channels);
+	for (j = 0; j < dw->rd_ch_cnt; j++, i++) {
+		struct dw_edma_chan *chan = &dw->chan[i];
+		struct dw_edma_region *dt_region;
+
+		dt_region = devm_kzalloc(dev, sizeof(*dt_region), GFP_KERNEL);
+		if (!dt_region)
+			return -ENOMEM;
+
+		chan->vc.chan.private = dt_region;
+
+		chan->chip = chip;
+		chan->id = j;
+		chan->dir = EDMA_DIR_READ;
+		chan->configured = false;
+		chan->request = EDMA_REQ_NONE;
+		chan->status = EDMA_ST_IDLE;
+
+		chan->ll_off = (ll_chunk * i);
+		chan->ll_max = (ll_chunk / EDMA_LL_SZ) - 1;
+
+		chan->dt_off = (dt_chunk * i);
+
+		dev_dbg(dev, "L. List:\tChannel read[%u] off=0x%.8lx, max_cnt=%u\n",
+			j, chan->ll_off, chan->ll_max);
+
+		memcpy(&chan->msi, &dw->msi[0], sizeof(chan->msi));
+
+		dev_dbg(dev, "MSI:\t\tChannel read[%u] addr=0x%.8x%.8x, data=0x%.8x\n",
+			j, chan->msi.address_hi, chan->msi.address_lo,
+			chan->msi.data);
+
+		chan->vc.desc_free = vchan_free_desc;
+		vchan_init(&chan->vc, &dw->rd_edma);
+
+		dt_region->paddr = dw->dt_region.paddr + chan->dt_off;
+		dt_region->vaddr = dw->dt_region.vaddr + chan->dt_off;
+		dt_region->sz = dt_chunk;
+
+		dev_dbg(dev, "Data:\tChannel read[%u] off=0x%.8lx\n",
+			j, chan->dt_off);
+	}
+	dma_cap_zero(dw->rd_edma.cap_mask);
+	dma_cap_set(DMA_SLAVE, dw->rd_edma.cap_mask);
+	dma_cap_set(DMA_CYCLIC, dw->rd_edma.cap_mask);
+	dw->rd_edma.directions = BIT(DMA_MEM_TO_DEV);
+	dw->rd_edma.chancnt = dw->rd_ch_cnt;
+
+	/* Set DMA channel capabilities */
+	SET_BOTH_CH(src_addr_widths, BIT(DMA_SLAVE_BUSWIDTH_4_BYTES));
+	SET_BOTH_CH(dst_addr_widths, BIT(DMA_SLAVE_BUSWIDTH_4_BYTES));
+	SET_BOTH_CH(residue_granularity, DMA_RESIDUE_GRANULARITY_DESCRIPTOR);
+
+	SET_BOTH_CH(dev, dev);
+
+	SET_BOTH_CH(device_alloc_chan_resources, dw_edma_alloc_chan_resources);
+	SET_BOTH_CH(device_free_chan_resources, dw_edma_free_chan_resources);
+
+	SET_BOTH_CH(device_config, dw_edma_device_config);
+	SET_BOTH_CH(device_pause, dw_edma_device_pause);
+	SET_BOTH_CH(device_resume, dw_edma_device_resume);
+	SET_BOTH_CH(device_terminate_all, dw_edma_device_terminate_all);
+	SET_BOTH_CH(device_issue_pending, dw_edma_device_issue_pending);
+	SET_BOTH_CH(device_tx_status, dw_edma_device_tx_status);
+	SET_BOTH_CH(device_prep_slave_sg, dw_edma_device_prep_slave_sg);
+	SET_BOTH_CH(device_prep_dma_cyclic, dw_edma_device_prep_dma_cyclic);
+
+	/* Power management */
+	pm_runtime_enable(dev);
+
+	/* Register DMA device */
+	err = dma_async_device_register(&dw->wr_edma);
+	if (err)
+		goto err_pm_disable;
+
+	err = dma_async_device_register(&dw->rd_edma);
+	if (err)
+		goto err_pm_disable;
+
+	/* Turn debugfs on */
+	err = ops->debugfs_on(chip);
+	if (err) {
+		dev_err(dev, "unable to create debugfs structure\n");
+		goto err_pm_disable;
+	}
+
+	dev_info(dev, "DesignWare eDMA controller driver loaded\n");
+
+	return 0;
+
+err_pm_disable:
+	pm_runtime_disable(dev);
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(dw_edma_probe);
+
+int dw_edma_remove(struct dw_edma_chip *chip)
+{
+	struct dw_edma *dw = chip->dw;
+	struct device *dev = chip->dev;
+	const struct dw_edma_core_ops *ops = dw->ops;
+	struct dw_edma_chan *chan, *_chan;
+	int i;
+
+	/* Disable eDMA */
+	if (ops)
+		ops->off(dw);
+
+	/* Free irqs */
+	for (i = 0; i < dw->nr_irqs; i++) {
+		if (pci_irq_vector(to_pci_dev(dev), i) < 0)
+			continue;
+
+		devm_free_irq(dev, pci_irq_vector(to_pci_dev(dev), i), chip);
+	}
+
+	/* Power management */
+	pm_runtime_disable(dev);
+
+	list_for_each_entry_safe(chan, _chan, &dw->wr_edma.channels,
+				 vc.chan.device_node) {
+		list_del(&chan->vc.chan.device_node);
+		tasklet_kill(&chan->vc.task);
+	}
+
+	list_for_each_entry_safe(chan, _chan, &dw->rd_edma.channels,
+				 vc.chan.device_node) {
+		list_del(&chan->vc.chan.device_node);
+		tasklet_kill(&chan->vc.task);
+	}
+
+	/* Deregister eDMA device */
+	dma_async_device_unregister(&dw->wr_edma);
+	dma_async_device_unregister(&dw->rd_edma);
+
+	/* Turn debugfs off */
+	if (ops)
+		ops->debugfs_off();
+
+	dev_info(dev, "DesignWare eDMA controller driver unloaded\n");
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(dw_edma_remove);
+
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("Synopsys DesignWare eDMA controller core driver");
+MODULE_AUTHOR("Gustavo Pimentel <gustavo.pimentel@synopsys.com>");
diff --git a/drivers/dma/dw-edma/dw-edma-core.h b/drivers/dma/dw-edma/dw-edma-core.h
new file mode 100644
index 0000000..2d98a10
--- /dev/null
+++ b/drivers/dma/dw-edma/dw-edma-core.h
@@ -0,0 +1,151 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2018 Synopsys, Inc. and/or its affiliates.
+ * Synopsys DesignWare eDMA core driver
+ */
+
+#ifndef _DW_EDMA_CORE_H
+#define _DW_EDMA_CORE_H
+
+#include <linux/msi.h>
+#include <linux/dma/edma.h>
+
+#include "../virt-dma.h"
+
+#define EDMA_LL_SZ					24
+
+enum dw_edma_dir {
+	EDMA_DIR_WRITE = 0,
+	EDMA_DIR_READ
+};
+
+enum dw_edma_mode {
+	EDMA_MODE_LEGACY = 0,
+	EDMA_MODE_UNROLL
+};
+
+enum dw_edma_request {
+	EDMA_REQ_NONE = 0,
+	EDMA_REQ_STOP,
+	EDMA_REQ_PAUSE
+};
+
+enum dw_edma_status {
+	EDMA_ST_IDLE = 0,
+	EDMA_ST_PAUSE,
+	EDMA_ST_BUSY
+};
+
+struct dw_edma_chan;
+struct dw_edma_chunk;
+
+struct dw_edma_core_ops {
+	/* eDMA management callbacks */
+	void (*off)(struct dw_edma *dw);
+	u16 (*ch_count)(struct dw_edma *dw, enum dw_edma_dir dir);
+	enum dma_status (*ch_status)(struct dw_edma_chan *chan);
+	void (*clear_done_int)(struct dw_edma_chan *chan);
+	void (*clear_abort_int)(struct dw_edma_chan *chan);
+	u32 (*status_done_int)(struct dw_edma *dw, enum dw_edma_dir dir);
+	u32 (*status_abort_int)(struct dw_edma *dw, enum dw_edma_dir dir);
+	void (*start)(struct dw_edma_chunk *chunk, bool first);
+	int (*device_config)(struct dma_chan *dchan);
+	/* eDMA debug fs callbacks */
+	int (*debugfs_on)(struct dw_edma_chip *chip);
+	void (*debugfs_off)(void);
+};
+
+struct dw_edma_burst {
+	struct list_head		list;
+	u64				sar;
+	u64				dar;
+	u32				sz;
+};
+
+struct dw_edma_region {
+	phys_addr_t			paddr;
+	dma_addr_t			vaddr;
+	size_t				sz;
+};
+
+struct dw_edma_chunk {
+	struct list_head		list;
+	struct dw_edma_chan		*chan;
+	struct dw_edma_burst		*burst;
+
+	u32				bursts_alloc;
+
+	u8				cb;
+	struct dw_edma_region		ll_region;	/* Linked list */
+};
+
+struct dw_edma_desc {
+	struct virt_dma_desc		vd;
+	struct dw_edma_chan		*chan;
+	struct dw_edma_chunk		*chunk;
+
+	u32				chunks_alloc;
+
+	u32				alloc_sz;
+	u32				xfer_sz;
+};
+
+struct dw_edma_chan {
+	struct virt_dma_chan		vc;
+	struct dw_edma_chip		*chip;
+	int				id;
+	enum dw_edma_dir		dir;
+
+	off_t				ll_off;
+	u32				ll_max;
+
+	off_t				dt_off;
+
+	struct msi_msg			msi;
+
+	enum dw_edma_request		request;
+	enum dw_edma_status		status;
+	u8				configured;
+
+	phys_addr_t			src_addr;
+	phys_addr_t			dst_addr;
+};
+
+struct dw_edma {
+	char				name[20];
+
+	struct dma_device		wr_edma;
+	u16				wr_ch_cnt;
+
+	struct dma_device		rd_edma;
+	u16				rd_ch_cnt;
+
+	struct dw_edma_region		rg_region;	/* Registers */
+	struct dw_edma_region		ll_region;	/* Linked list */
+	struct dw_edma_region		dt_region;	/* Data */
+
+	struct msi_msg			*msi;
+	int				nr_irqs;
+
+	u32				version;
+	enum dw_edma_mode		mode;
+
+	struct dw_edma_chan		*chan;
+	const struct dw_edma_core_ops	*ops;
+
+	raw_spinlock_t			lock;		/* Only for legacy */
+};
+
+static inline
+struct dw_edma_chan *vc2dw_edma_chan(struct virt_dma_chan *vc)
+{
+	return container_of(vc, struct dw_edma_chan, vc);
+}
+
+static inline
+struct dw_edma_chan *dchan2dw_edma_chan(struct dma_chan *dchan)
+{
+	return vc2dw_edma_chan(to_virt_chan(dchan));
+}
+
+#endif /* _DW_EDMA_CORE_H */
diff --git a/include/linux/dma/edma.h b/include/linux/dma/edma.h
new file mode 100644
index 0000000..349e542
--- /dev/null
+++ b/include/linux/dma/edma.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2018 Synopsys, Inc. and/or its affiliates.
+ * Synopsys DesignWare eDMA core driver
+ */
+
+#ifndef _DW_EDMA_H
+#define _DW_EDMA_H
+
+#include <linux/device.h>
+#include <linux/dmaengine.h>
+
+struct dw_edma;
+
+/**
+ * struct dw_edma_chip - representation of DesignWare eDMA controller hardware
+ * @dev:		 struct device of the eDMA controller
+ * @id:			 instance ID
+ * @irq:		 irq line
+ * @dw:			 struct dw_edma that is filled by dw_edma_probe()
+ */
+struct dw_edma_chip {
+	struct device		*dev;
+	int			id;
+	int			irq;
+	struct dw_edma		*dw;
+};
+
+/* Export to the platform drivers */
+#if IS_ENABLED(CONFIG_DW_EDMA)
+int dw_edma_probe(struct dw_edma_chip *chip);
+int dw_edma_remove(struct dw_edma_chip *chip);
+#else
+static inline int dw_edma_probe(struct dw_edma_chip *chip)
+{
+	return -ENODEV;
+}
+
+static inline int dw_edma_remove(struct dw_edma_chip *chip)
+{
+	return 0;
+}
+#endif /* CONFIG_DW_EDMA */
+
+#endif /* _DW_EDMA_H */
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [RFC v3 2/7] dmaengine: Add Synopsys eDMA IP version 0 support
  2019-01-11 18:33 [RFC v3 0/6] dmaengine: Add Synopsys eDMA IP driver (version 0) Gustavo Pimentel
  2019-01-11 18:33 ` [RFC v3 1/7] dmaengine: Add Synopsys eDMA IP core driver Gustavo Pimentel
@ 2019-01-11 18:33 ` Gustavo Pimentel
  2019-01-16 10:33   ` Jose Abreu
  2019-01-11 18:33 ` [RFC v3 3/7] dmaengine: Add Synopsys eDMA IP version 0 debugfs support Gustavo Pimentel
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 39+ messages in thread
From: Gustavo Pimentel @ 2019-01-11 18:33 UTC (permalink / raw)
  To: linux-pci, dmaengine
  Cc: Gustavo Pimentel, Vinod Koul, Dan Williams, Eugeniy Paltsev,
	Andy Shevchenko, Russell King, Niklas Cassel, Joao Pinto,
	Jose Abreu, Luis Oliveira, Vitor Soares, Nelson Costa,
	Pedro Sousa

Add support for the eDMA IP version 0 driver for both register maps (legacy
and unroll).

The legacy register map was the initial implementation: all channel
registers are multiplexed behind a viewport register that selects which
channel is visible at any given time, so a concurrent access can switch
the viewport at any moment and cause a race condition.

This register map is neither effective nor efficient in a multithreaded
environment, which led to the development of the unroll register map,
where all channel registers are accessible at any time because each
channel's register block is placed at a fixed offset from the others.

This version supports a maximum of 16 independent channels (8 write +
8 read), which can run simultaneously.

Implements scatter-gather transfers through a linked list, whose size
depends on the allocated memory, divided equally among all channels.

Each linked list descriptor can transfer from 1 byte to 4 Gbytes and is
DWORD-aligned.

Both SAR (Source Address Register) and DAR (Destination Address
Register) are byte-aligned.

Changes:
RFC v1->RFC v2:
 - Replace // comments (C99 style) with /* */ comments
 - Replace magic numbers by defines
 - Replace boolean return from ternary operation by a double negation
   operation
 - Replace QWORD_HI/QWORD_LO macros by upper_32_bits()/lower_32_bits()
 - Fix the headers of the .c and .h files according to the most recent
   convention
 - Fix errors and checks pointed out by checkpatch with --strict option
 - Change the patch subject prefix from dma to dmaengine
 - Refactor code to replace atomic_t by u32 variable type
RFC v2->RFC v3:
 - Code rewrite to use FIELD_PREP() and FIELD_GET()
 - Add define to magic numbers
 - Fix minor bugs

Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Eugeniy Paltsev <paltsev@synopsys.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Niklas Cassel <niklas.cassel@linaro.org>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Jose Abreu <jose.abreu@synopsys.com>
Cc: Luis Oliveira <lolivei@synopsys.com>
Cc: Vitor Soares <vitor.soares@synopsys.com>
Cc: Nelson Costa <nelson.costa@synopsys.com>
Cc: Pedro Sousa <pedrom.sousa@synopsys.com>
---
 drivers/dma/dw-edma/Makefile          |   3 +-
 drivers/dma/dw-edma/dw-edma-core.c    |  21 +++
 drivers/dma/dw-edma/dw-edma-v0-core.c | 346 ++++++++++++++++++++++++++++++++++
 drivers/dma/dw-edma/dw-edma-v0-core.h |  26 +++
 drivers/dma/dw-edma/dw-edma-v0-regs.h | 156 +++++++++++++++
 5 files changed, 551 insertions(+), 1 deletion(-)
 create mode 100644 drivers/dma/dw-edma/dw-edma-v0-core.c
 create mode 100644 drivers/dma/dw-edma/dw-edma-v0-core.h
 create mode 100644 drivers/dma/dw-edma/dw-edma-v0-regs.h

diff --git a/drivers/dma/dw-edma/Makefile b/drivers/dma/dw-edma/Makefile
index 3224010..01c7c63 100644
--- a/drivers/dma/dw-edma/Makefile
+++ b/drivers/dma/dw-edma/Makefile
@@ -1,4 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
 
 obj-$(CONFIG_DW_EDMA)		+= dw-edma.o
-dw-edma-objs			:= dw-edma-core.o
+dw-edma-objs			:= dw-edma-core.o \
+					dw-edma-v0-core.o
diff --git a/drivers/dma/dw-edma/dw-edma-core.c b/drivers/dma/dw-edma/dw-edma-core.c
index 2b6b70f..772a22f 100644
--- a/drivers/dma/dw-edma/dw-edma-core.c
+++ b/drivers/dma/dw-edma/dw-edma-core.c
@@ -15,6 +15,7 @@
 #include <linux/pci.h>
 
 #include "dw-edma-core.h"
+#include "dw-edma-v0-core.h"
 #include "../dmaengine.h"
 #include "../virt-dma.h"
 
@@ -27,6 +28,22 @@
 		SET(dw->rd_edma, name, value);	\
 	} while (0)
 
+static const struct dw_edma_core_ops dw_edma_v0_core_ops = {
+	/* eDMA management callbacks */
+	.off = dw_edma_v0_core_off,
+	.ch_count = dw_edma_v0_core_ch_count,
+	.ch_status = dw_edma_v0_core_ch_status,
+	.clear_done_int = dw_edma_v0_core_clear_done_int,
+	.clear_abort_int = dw_edma_v0_core_clear_abort_int,
+	.status_done_int = dw_edma_v0_core_status_done_int,
+	.status_abort_int = dw_edma_v0_core_status_abort_int,
+	.start = dw_edma_v0_core_start,
+	.device_config = dw_edma_v0_core_device_config,
+	/* eDMA debug fs callbacks */
+	.debugfs_on = dw_edma_v0_core_debugfs_on,
+	.debugfs_off = dw_edma_v0_core_debugfs_off,
+};
+
 static inline
 struct device *dchan2dev(struct dma_chan *dchan)
 {
@@ -802,10 +819,14 @@ int dw_edma_probe(struct dw_edma_chip *chip)
 
 	/* Callback operation selection accordingly to eDMA version */
 	switch (dw->version) {
+	case 0:
+		ops = &dw_edma_v0_core_ops;
+		break;
 	default:
 		dev_err(dev, "unsupported version\n");
 		return -EPERM;
 	}
+	dw->ops = ops;
 
 	pm_runtime_get_sync(dev);
 
diff --git a/drivers/dma/dw-edma/dw-edma-v0-core.c b/drivers/dma/dw-edma/dw-edma-v0-core.c
new file mode 100644
index 0000000..5b76e42
--- /dev/null
+++ b/drivers/dma/dw-edma/dw-edma-v0-core.c
@@ -0,0 +1,346 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2018 Synopsys, Inc. and/or its affiliates.
+ * Synopsys DesignWare eDMA v0 core
+ */
+
+#include <linux/bitfield.h>
+
+#include "dw-edma-core.h"
+#include "dw-edma-v0-core.h"
+#include "dw-edma-v0-regs.h"
+#include "dw-edma-v0-debugfs.h"
+
+enum dw_edma_control {
+	DW_EDMA_V0_CB					= BIT(0),
+	DW_EDMA_V0_TCB					= BIT(1),
+	DW_EDMA_V0_LLP					= BIT(2),
+	DW_EDMA_V0_LIE					= BIT(3),
+	DW_EDMA_V0_RIE					= BIT(4),
+	DW_EDMA_V0_CCS					= BIT(8),
+	DW_EDMA_V0_LLE					= BIT(9),
+};
+
+static inline struct dw_edma_v0_regs __iomem *__dw_regs(struct dw_edma *dw)
+{
+	return (struct dw_edma_v0_regs __iomem *)dw->rg_region.vaddr;
+}
+
+#define SET(dw, name, value)				\
+	writel(value, &(__dw_regs(dw)->name))
+
+#define GET(dw, name)					\
+	readl(&(__dw_regs(dw)->name))
+
+#define SET_RW(dw, dir, name, value)			\
+	do {						\
+		if ((dir) == EDMA_DIR_WRITE)		\
+			SET(dw, wr_##name, value);	\
+		else					\
+			SET(dw, rd_##name, value);	\
+	} while (0)
+
+#define GET_RW(dw, dir, name)				\
+	((dir) == EDMA_DIR_WRITE			\
+	  ? GET(dw, wr_##name)				\
+	  : GET(dw, rd_##name))
+
+#define SET_BOTH(dw, name, value)			\
+	do {						\
+		SET(dw, wr_##name, value);		\
+		SET(dw, rd_##name, value);		\
+	} while (0)
+
+static inline struct dw_edma_v0_ch_regs __iomem *
+__dw_ch_regs(struct dw_edma *dw, enum dw_edma_dir dir, u16 ch)
+{
+	if (dw->mode == EDMA_MODE_LEGACY)
+		return &(__dw_regs(dw)->type.legacy.ch);
+
+	if (dir == EDMA_DIR_WRITE)
+		return &__dw_regs(dw)->type.unroll.ch[ch].wr;
+
+	return &__dw_regs(dw)->type.unroll.ch[ch].rd;
+}
+
+static inline void writel_ch(struct dw_edma *dw, enum dw_edma_dir dir, u16 ch,
+			     u32 value, void __iomem *addr)
+{
+	if (dw->mode == EDMA_MODE_LEGACY) {
+		u32 viewport_sel;
+		unsigned long flags;
+
+		raw_spin_lock_irqsave(&dw->lock, flags);
+
+		viewport_sel = FIELD_PREP(EDMA_V0_VIEWPORT_MASK, ch);
+		if (dir == EDMA_DIR_READ)
+			viewport_sel |= BIT(31);
+
+		writel(viewport_sel,
+		       &(__dw_regs(dw)->type.legacy.viewport_sel));
+		writel(value, addr);
+
+		raw_spin_unlock_irqrestore(&dw->lock, flags);
+	} else {
+		writel(value, addr);
+	}
+}
+
+static inline u32 readl_ch(struct dw_edma *dw, enum dw_edma_dir dir, u16 ch,
+			   const void __iomem *addr)
+{
+	u32 value;
+
+	if (dw->mode == EDMA_MODE_LEGACY) {
+		u32 viewport_sel;
+		unsigned long flags;
+
+		raw_spin_lock_irqsave(&dw->lock, flags);
+
+		viewport_sel = FIELD_PREP(EDMA_V0_VIEWPORT_MASK, ch);
+		if (dir == EDMA_DIR_READ)
+			viewport_sel |= BIT(31);
+
+		writel(viewport_sel,
+		       &(__dw_regs(dw)->type.legacy.viewport_sel));
+		value = readl(addr);
+
+		raw_spin_unlock_irqrestore(&dw->lock, flags);
+	} else {
+		value = readl(addr);
+	}
+
+	return value;
+}
+
+#define SET_CH(dw, dir, ch, name, value) \
+	writel_ch(dw, dir, ch, value, &(__dw_ch_regs(dw, dir, ch)->name))
+
+#define GET_CH(dw, dir, ch, name) \
+	readl_ch(dw, dir, ch, &(__dw_ch_regs(dw, dir, ch)->name))
+
+#define SET_LL(ll, value) \
+	writel(value, ll)
+
+/* eDMA management callbacks */
+void dw_edma_v0_core_off(struct dw_edma *dw)
+{
+	SET_BOTH(dw, int_mask, EDMA_V0_DONE_INT_MASK | EDMA_V0_ABORT_INT_MASK);
+	SET_BOTH(dw, int_clear, EDMA_V0_DONE_INT_MASK | EDMA_V0_ABORT_INT_MASK);
+	SET_BOTH(dw, engine_en, 0);
+}
+
+u16 dw_edma_v0_core_ch_count(struct dw_edma *dw, enum dw_edma_dir dir)
+{
+	u32 num_ch;
+
+	if (dir == EDMA_DIR_WRITE)
+		num_ch = FIELD_GET(EDMA_V0_WRITE_CH_COUNT_MASK, GET(dw, ctrl));
+	else
+		num_ch = FIELD_GET(EDMA_V0_READ_CH_COUNT_MASK, GET(dw, ctrl));
+
+	if (num_ch > EDMA_V0_MAX_NR_CH)
+		num_ch = EDMA_V0_MAX_NR_CH;
+
+	return (u16)num_ch;
+}
+
+enum dma_status dw_edma_v0_core_ch_status(struct dw_edma_chan *chan)
+{
+	struct dw_edma *dw = chan->chip->dw;
+	u32 tmp;
+
+	tmp = FIELD_GET(EDMA_V0_CH_STATUS_MASK,
+			GET_CH(dw, chan->dir, chan->id, ch_control1));
+
+	if (tmp == 1)
+		return DMA_IN_PROGRESS;
+	else if (tmp == 3)
+		return DMA_COMPLETE;
+	else
+		return DMA_ERROR;
+}
+
+void dw_edma_v0_core_clear_done_int(struct dw_edma_chan *chan)
+{
+	struct dw_edma *dw = chan->chip->dw;
+
+	SET_RW(dw, chan->dir, int_clear,
+	       FIELD_PREP(EDMA_V0_DONE_INT_MASK, BIT(chan->id)));
+}
+
+void dw_edma_v0_core_clear_abort_int(struct dw_edma_chan *chan)
+{
+	struct dw_edma *dw = chan->chip->dw;
+
+	SET_RW(dw, chan->dir, int_clear,
+	       FIELD_PREP(EDMA_V0_ABORT_INT_MASK, BIT(chan->id)));
+}
+
+u32 dw_edma_v0_core_status_done_int(struct dw_edma *dw, enum dw_edma_dir dir)
+{
+	return FIELD_GET(EDMA_V0_DONE_INT_MASK, GET_RW(dw, dir, int_status));
+}
+
+u32 dw_edma_v0_core_status_abort_int(struct dw_edma *dw, enum dw_edma_dir dir)
+{
+	return FIELD_GET(EDMA_V0_ABORT_INT_MASK, GET_RW(dw, dir, int_status));
+}
+
+static void dw_edma_v0_core_write_chunk(struct dw_edma_chunk *chunk)
+{
+	struct dw_edma_burst *child;
+	struct dw_edma_v0_lli *lli;
+	struct dw_edma_v0_llp *llp;
+	u32 control = 0, i = 0;
+	u64 sar, dar, addr;
+	int j;
+
+	lli = (struct dw_edma_v0_lli *)chunk->ll_region.vaddr;
+
+	if (chunk->cb)
+		control = DW_EDMA_V0_CB;
+
+	j = chunk->bursts_alloc;
+	list_for_each_entry(child, &chunk->burst->list, list) {
+		j--;
+		if (!j)
+			control |= (DW_EDMA_V0_LIE | DW_EDMA_V0_RIE);
+
+		/* Channel control */
+		SET_LL(&lli[i].control, control);
+		/* Transfer size */
+		SET_LL(&lli[i].transfer_size, child->sz);
+		/* SAR - low, high */
+		sar = cpu_to_le64(child->sar);
+		SET_LL(&lli[i].sar_low, lower_32_bits(sar));
+		SET_LL(&lli[i].sar_high, upper_32_bits(sar));
+		/* DAR - low, high */
+		dar = cpu_to_le64(child->dar);
+		SET_LL(&lli[i].dar_low, lower_32_bits(dar));
+		SET_LL(&lli[i].dar_high, upper_32_bits(dar));
+		i++;
+	}
+
+	llp = (struct dw_edma_v0_llp *)&lli[i];
+	control = DW_EDMA_V0_LLP | DW_EDMA_V0_TCB;
+	if (!chunk->cb)
+		control |= DW_EDMA_V0_CB;
+
+	/* Channel control */
+	SET_LL(&llp->control, control);
+	/* Linked list - low, high */
+	addr = cpu_to_le64(chunk->ll_region.paddr);
+	SET_LL(&llp->llp_low, lower_32_bits(addr));
+	SET_LL(&llp->llp_high, upper_32_bits(addr));
+}
+
+void dw_edma_v0_core_start(struct dw_edma_chunk *chunk, bool first)
+{
+	struct dw_edma_chan *chan = chunk->chan;
+	struct dw_edma *dw = chan->chip->dw;
+	u32 tmp;
+	u64 llp;
+
+	dw_edma_v0_core_write_chunk(chunk);
+
+	if (first) {
+		/* Enable engine */
+		SET_RW(dw, chan->dir, engine_en, BIT(0));
+		/* Interrupt unmask - done, abort */
+		tmp = GET_RW(dw, chan->dir, int_mask);
+		tmp &= ~FIELD_PREP(EDMA_V0_DONE_INT_MASK, BIT(chan->id));
+		tmp &= ~FIELD_PREP(EDMA_V0_ABORT_INT_MASK, BIT(chan->id));
+		SET_RW(dw, chan->dir, int_mask, tmp);
+		/* Linked list error */
+		tmp = GET_RW(dw, chan->dir, linked_list_err_en);
+		tmp |= FIELD_PREP(EDMA_V0_LINKED_LIST_ERR_MASK, BIT(chan->id));
+		SET_RW(dw, chan->dir, linked_list_err_en, tmp);
+		/* Channel control */
+		SET_CH(dw, chan->dir, chan->id, ch_control1,
+		       (DW_EDMA_V0_CCS | DW_EDMA_V0_LLE));
+		/* Linked list - low, high */
+		llp = cpu_to_le64(chunk->ll_region.paddr);
+		SET_CH(dw, chan->dir, chan->id, llp_low, lower_32_bits(llp));
+		SET_CH(dw, chan->dir, chan->id, llp_high, upper_32_bits(llp));
+	}
+	/* Doorbell */
+	SET_RW(dw, chan->dir, doorbell,
+	       FIELD_PREP(EDMA_V0_DOORBELL_CH_MASK, chan->id));
+}
+
+int dw_edma_v0_core_device_config(struct dma_chan *dchan)
+{
+	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
+	struct dw_edma *dw = chan->chip->dw;
+	u32 tmp = 0;
+
+	/* MSI done addr - low, high */
+	SET_RW(dw, chan->dir, done_imwr_low, chan->msi.address_lo);
+	SET_RW(dw, chan->dir, done_imwr_high, chan->msi.address_hi);
+	/* MSI abort addr - low, high */
+	SET_RW(dw, chan->dir, abort_imwr_low, chan->msi.address_lo);
+	SET_RW(dw, chan->dir, abort_imwr_high, chan->msi.address_hi);
+	/* MSI data - low, high */
+	switch (chan->id) {
+	case 0:
+	case 1:
+		tmp = GET_RW(dw, chan->dir, ch01_imwr_data);
+		break;
+	case 2:
+	case 3:
+		tmp = GET_RW(dw, chan->dir, ch23_imwr_data);
+		break;
+	case 4:
+	case 5:
+		tmp = GET_RW(dw, chan->dir, ch45_imwr_data);
+		break;
+	case 6:
+	case 7:
+		tmp = GET_RW(dw, chan->dir, ch67_imwr_data);
+		break;
+	}
+
+	if (chan->id & BIT(0)) {
+		/* Channel odd {1, 3, 5, 7} */
+		tmp &= EDMA_V0_CH_EVEN_MSI_DATA_MASK;
+		tmp |= FIELD_PREP(EDMA_V0_CH_ODD_MSI_DATA_MASK,
+				  chan->msi.data);
+	} else {
+		/* Channel even {0, 2, 4, 6} */
+		tmp &= EDMA_V0_CH_ODD_MSI_DATA_MASK;
+		tmp |= FIELD_PREP(EDMA_V0_CH_EVEN_MSI_DATA_MASK,
+				  chan->msi.data);
+	}
+
+	switch (chan->id) {
+	case 0:
+	case 1:
+		SET_RW(dw, chan->dir, ch01_imwr_data, tmp);
+		break;
+	case 2:
+	case 3:
+		SET_RW(dw, chan->dir, ch23_imwr_data, tmp);
+		break;
+	case 4:
+	case 5:
+		SET_RW(dw, chan->dir, ch45_imwr_data, tmp);
+		break;
+	case 6:
+	case 7:
+		SET_RW(dw, chan->dir, ch67_imwr_data, tmp);
+		break;
+	}
+
+	return 0;
+}
+
+/* eDMA debugfs callbacks */
+int dw_edma_v0_core_debugfs_on(struct dw_edma_chip *chip)
+{
+	return 0;
+}
+
+void dw_edma_v0_core_debugfs_off(void)
+{
+}
diff --git a/drivers/dma/dw-edma/dw-edma-v0-core.h b/drivers/dma/dw-edma/dw-edma-v0-core.h
new file mode 100644
index 0000000..2d35bff
--- /dev/null
+++ b/drivers/dma/dw-edma/dw-edma-v0-core.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2018 Synopsys, Inc. and/or its affiliates.
+ * Synopsys DesignWare eDMA v0 core
+ */
+
+#ifndef _DW_EDMA_V0_CORE_H
+#define _DW_EDMA_V0_CORE_H
+
+#include <linux/dma/edma.h>
+
+/* eDMA management callbacks */
+void dw_edma_v0_core_off(struct dw_edma *dw);
+u16 dw_edma_v0_core_ch_count(struct dw_edma *dw, enum dw_edma_dir dir);
+enum dma_status dw_edma_v0_core_ch_status(struct dw_edma_chan *chan);
+void dw_edma_v0_core_clear_done_int(struct dw_edma_chan *chan);
+void dw_edma_v0_core_clear_abort_int(struct dw_edma_chan *chan);
+u32 dw_edma_v0_core_status_done_int(struct dw_edma *dw, enum dw_edma_dir dir);
+u32 dw_edma_v0_core_status_abort_int(struct dw_edma *dw, enum dw_edma_dir dir);
+void dw_edma_v0_core_start(struct dw_edma_chunk *chunk, bool first);
+int dw_edma_v0_core_device_config(struct dma_chan *dchan);
+/* eDMA debug fs callbacks */
+int dw_edma_v0_core_debugfs_on(struct dw_edma_chip *chip);
+void dw_edma_v0_core_debugfs_off(void);
+
+#endif /* _DW_EDMA_V0_CORE_H */
diff --git a/drivers/dma/dw-edma/dw-edma-v0-regs.h b/drivers/dma/dw-edma/dw-edma-v0-regs.h
new file mode 100644
index 0000000..eb04715
--- /dev/null
+++ b/drivers/dma/dw-edma/dw-edma-v0-regs.h
@@ -0,0 +1,156 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2018 Synopsys, Inc. and/or its affiliates.
+ * Synopsys DesignWare eDMA v0 core
+ */
+
+#ifndef _DW_EDMA_V0_REGS_H
+#define _DW_EDMA_V0_REGS_H
+
+#include <linux/dmaengine.h>
+
+#define EDMA_V0_MAX_NR_CH				8
+#define EDMA_V0_VIEWPORT_MASK				GENMASK(2, 0)
+#define EDMA_V0_DONE_INT_MASK				GENMASK(7, 0)
+#define EDMA_V0_ABORT_INT_MASK				GENMASK(23, 16)
+#define EDMA_V0_WRITE_CH_COUNT_MASK			GENMASK(3, 0)
+#define EDMA_V0_READ_CH_COUNT_MASK			GENMASK(19, 16)
+#define EDMA_V0_CH_STATUS_MASK				GENMASK(6, 5)
+#define EDMA_V0_DOORBELL_CH_MASK			GENMASK(2, 0)
+#define EDMA_V0_LINKED_LIST_ERR_MASK			GENMASK(7, 0)
+
+#define EDMA_V0_CH_ODD_MSI_DATA_MASK			GENMASK(15, 8)
+#define EDMA_V0_CH_EVEN_MSI_DATA_MASK			GENMASK(7, 0)
+
+struct dw_edma_v0_ch_regs {
+	u32 ch_control1;				/* 0x000 */
+	u32 ch_control2;				/* 0x004 */
+	u32 transfer_size;				/* 0x008 */
+	u32 sar_low;					/* 0x00c */
+	u32 sar_high;					/* 0x010 */
+	u32 dar_low;					/* 0x014 */
+	u32 dar_high;					/* 0x018 */
+	u32 llp_low;					/* 0x01c */
+	u32 llp_high;					/* 0x020 */
+};
+
+struct dw_edma_v0_ch {
+	struct dw_edma_v0_ch_regs wr;			/* 0x200 */
+	u32 padding_1[55];				/* [0x224..0x2fc] */
+	struct dw_edma_v0_ch_regs rd;			/* 0x300 */
+	u32 padding_2[55];				/* [0x324..0x3fc] */
+};
+
+struct dw_edma_v0_unroll {
+	u32 padding_1;					/* 0x0f8 */
+	u32 wr_engine_chgroup;				/* 0x100 */
+	u32 rd_engine_chgroup;				/* 0x104 */
+	u32 wr_engine_hshake_cnt_low;			/* 0x108 */
+	u32 wr_engine_hshake_cnt_high;			/* 0x10c */
+	u32 padding_2[2];				/* [0x110..0x114] */
+	u32 rd_engine_hshake_cnt_low;			/* 0x118 */
+	u32 rd_engine_hshake_cnt_high;			/* 0x11c */
+	u32 padding_3[2];				/* [0x120..0x124] */
+	u32 wr_ch0_pwr_en;				/* 0x128 */
+	u32 wr_ch1_pwr_en;				/* 0x12c */
+	u32 wr_ch2_pwr_en;				/* 0x130 */
+	u32 wr_ch3_pwr_en;				/* 0x134 */
+	u32 wr_ch4_pwr_en;				/* 0x138 */
+	u32 wr_ch5_pwr_en;				/* 0x13c */
+	u32 wr_ch6_pwr_en;				/* 0x140 */
+	u32 wr_ch7_pwr_en;				/* 0x144 */
+	u32 padding_4[8];				/* [0x148..0x164] */
+	u32 rd_ch0_pwr_en;				/* 0x168 */
+	u32 rd_ch1_pwr_en;				/* 0x16c */
+	u32 rd_ch2_pwr_en;				/* 0x170 */
+	u32 rd_ch3_pwr_en;				/* 0x174 */
+	u32 rd_ch4_pwr_en;				/* 0x178 */
+	u32 rd_ch5_pwr_en;				/* 0x17c */
+	u32 rd_ch6_pwr_en;				/* 0x180 */
+	u32 rd_ch7_pwr_en;				/* 0x184 */
+	u32 padding_5[30];				/* [0x188..0x1fc] */
+	struct dw_edma_v0_ch ch[EDMA_V0_MAX_NR_CH];	/* [0x200..0x1120] */
+};
+
+struct dw_edma_v0_legacy {
+	u32 viewport_sel;				/* 0x0f8 */
+	struct dw_edma_v0_ch_regs ch;			/* [0x100..0x120] */
+};
+
+struct dw_edma_v0_regs {
+	/* eDMA global registers */
+	u32 ctrl_data_arb_prior;			/* 0x000 */
+	u32 padding_1;					/* 0x004 */
+	u32 ctrl;					/* 0x008 */
+	u32 wr_engine_en;				/* 0x00c */
+	u32 wr_doorbell;				/* 0x010 */
+	u32 padding_2;					/* 0x014 */
+	u32 wr_ch_arb_weight_low;			/* 0x018 */
+	u32 wr_ch_arb_weight_high;			/* 0x01c */
+	u32 padding_3[3];				/* [0x020..0x028] */
+	u32 rd_engine_en;				/* 0x02c */
+	u32 rd_doorbell;				/* 0x030 */
+	u32 padding_4;					/* 0x034 */
+	u32 rd_ch_arb_weight_low;			/* 0x038 */
+	u32 rd_ch_arb_weight_high;			/* 0x03c */
+	u32 padding_5[3];				/* [0x040..0x048] */
+	/* eDMA interrupts registers */
+	u32 wr_int_status;				/* 0x04c */
+	u32 padding_6;					/* 0x050 */
+	u32 wr_int_mask;				/* 0x054 */
+	u32 wr_int_clear;				/* 0x058 */
+	u32 wr_err_status;				/* 0x05c */
+	u32 wr_done_imwr_low;				/* 0x060 */
+	u32 wr_done_imwr_high;				/* 0x064 */
+	u32 wr_abort_imwr_low;				/* 0x068 */
+	u32 wr_abort_imwr_high;				/* 0x06c */
+	u32 wr_ch01_imwr_data;				/* 0x070 */
+	u32 wr_ch23_imwr_data;				/* 0x074 */
+	u32 wr_ch45_imwr_data;				/* 0x078 */
+	u32 wr_ch67_imwr_data;				/* 0x07c */
+	u32 padding_7[4];				/* [0x080..0x08c] */
+	u32 wr_linked_list_err_en;			/* 0x090 */
+	u32 padding_8[3];				/* [0x094..0x09c] */
+	u32 rd_int_status;				/* 0x0a0 */
+	u32 padding_9;					/* 0x0a4 */
+	u32 rd_int_mask;				/* 0x0a8 */
+	u32 rd_int_clear;				/* 0x0ac */
+	u32 padding_10;					/* 0x0b0 */
+	u32 rd_err_status_low;				/* 0x0b4 */
+	u32 rd_err_status_high;				/* 0x0b8 */
+	u32 padding_11[2];				/* [0x0bc..0x0c0] */
+	u32 rd_linked_list_err_en;			/* 0x0c4 */
+	u32 padding_12;					/* 0x0c8 */
+	u32 rd_done_imwr_low;				/* 0x0cc */
+	u32 rd_done_imwr_high;				/* 0x0d0 */
+	u32 rd_abort_imwr_low;				/* 0x0d4 */
+	u32 rd_abort_imwr_high;				/* 0x0d8 */
+	u32 rd_ch01_imwr_data;				/* 0x0dc */
+	u32 rd_ch23_imwr_data;				/* 0x0e0 */
+	u32 rd_ch45_imwr_data;				/* 0x0e4 */
+	u32 rd_ch67_imwr_data;				/* 0x0e8 */
+	u32 padding_13[4];				/* [0x0ec..0x0f8] */
+	/* eDMA channel context grouping */
+	union dw_edma_v0_type {
+		struct dw_edma_v0_legacy legacy;	/* [0x0f8..0x120] */
+		struct dw_edma_v0_unroll unroll;	/* [0x0f8..0x1120] */
+	} type;
+};
+
+struct dw_edma_v0_lli {
+	u32 control;
+	u32 transfer_size;
+	u32 sar_low;
+	u32 sar_high;
+	u32 dar_low;
+	u32 dar_high;
+};
+
+struct dw_edma_v0_llp {
+	u32 control;
+	u32 reserved;
+	u32 llp_low;
+	u32 llp_high;
+};
+
+#endif /* _DW_EDMA_V0_REGS_H */
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [RFC v3 3/7] dmaengine: Add Synopsys eDMA IP version 0 debugfs support
  2019-01-11 18:33 [RFC v3 0/6] dmaengine: Add Synopsys eDMA IP driver (version 0) Gustavo Pimentel
  2019-01-11 18:33 ` [RFC v3 1/7] dmaengine: Add Synopsys eDMA IP core driver Gustavo Pimentel
  2019-01-11 18:33 ` [RFC v3 2/7] dmaengine: Add Synopsys eDMA IP version 0 support Gustavo Pimentel
@ 2019-01-11 18:33 ` Gustavo Pimentel
  2019-01-11 18:33 ` [RFC v3 4/7] PCI: Add Synopsys endpoint EDDA Device id Gustavo Pimentel
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 39+ messages in thread
From: Gustavo Pimentel @ 2019-01-11 18:33 UTC (permalink / raw)
  To: linux-pci, dmaengine
  Cc: Gustavo Pimentel, Vinod Koul, Dan Williams, Eugeniy Paltsev,
	Andy Shevchenko, Russell King, Niklas Cassel, Joao Pinto,
	Jose Abreu, Luis Oliveira, Vitor Soares, Nelson Costa,
	Pedro Sousa

Add Synopsys eDMA IP version 0 debugfs support to assist future
debugging.

Creates a file system structure composed of directories and files that
mimics the IP register map (these files are read-only) to ease debugging.

To enable this feature, select the DEBUG_FS option in the kernel
configuration.

Small output example:

(eDMA IP version 0, unroll, 1 write + 1 read channels)

% mount -t debugfs none /sys/kernel/debug/
% tree /sys/kernel/debug/dw-edma-core:0/
dw-edma/
├── version
├── mode
├── wr_ch_cnt
├── rd_ch_cnt
└── registers
    ├── ctrl_data_arb_prior
    ├── ctrl
    ├── write
    │   ├── engine_en
    │   ├── doorbell
    │   ├── ch_arb_weight_low
    │   ├── ch_arb_weight_high
    │   ├── int_status
    │   ├── int_mask
    │   ├── int_clear
    │   ├── err_status
    │   ├── done_imwr_low
    │   ├── done_imwr_high
    │   ├── abort_imwr_low
    │   ├── abort_imwr_high
    │   ├── ch01_imwr_data
    │   ├── ch23_imwr_data
    │   ├── ch45_imwr_data
    │   ├── ch67_imwr_data
    │   ├── linked_list_err_en
    │   ├── engine_chgroup
    │   ├── engine_hshake_cnt_low
    │   ├── engine_hshake_cnt_high
    │   ├── ch0_pwr_en
    │   ├── ch1_pwr_en
    │   ├── ch2_pwr_en
    │   ├── ch3_pwr_en
    │   ├── ch4_pwr_en
    │   ├── ch5_pwr_en
    │   ├── ch6_pwr_en
    │   ├── ch7_pwr_en
    │   └── channel:0
    │       ├── ch_control1
    │       ├── ch_control2
    │       ├── transfer_size
    │       ├── sar_low
    │       ├── sar_high
    │       ├── dar_low
    │       ├── dar_high
    │       ├── llp_low
    │       └── llp_high
    └── read
        ├── engine_en
        ├── doorbell
        ├── ch_arb_weight_low
        ├── ch_arb_weight_high
        ├── int_status
        ├── int_mask
        ├── int_clear
        ├── err_status_low
        ├── err_status_high
        ├── done_imwr_low
        ├── done_imwr_high
        ├── abort_imwr_low
        ├── abort_imwr_high
        ├── ch01_imwr_data
        ├── ch23_imwr_data
        ├── ch45_imwr_data
        ├── ch67_imwr_data
        ├── linked_list_err_en
        ├── engine_chgroup
        ├── engine_hshake_cnt_low
        ├── engine_hshake_cnt_high
        ├── ch0_pwr_en
        ├── ch1_pwr_en
        ├── ch2_pwr_en
        ├── ch3_pwr_en
        ├── ch4_pwr_en
        ├── ch5_pwr_en
        ├── ch6_pwr_en
        ├── ch7_pwr_en
        └── channel:0
            ├── ch_control1
            ├── ch_control2
            ├── transfer_size
            ├── sar_low
            ├── sar_high
            ├── dar_low
            ├── dar_high
            ├── llp_low
            └── llp_high

Changes:
RFC v1->RFC v2:
 - Replace C99-style // comments with /* */ comments
 - Fix the headers of the .c and .h files according to the most recent
   convention
 - Fix errors and checks pointed out by checkpatch with --strict
   option
 - Change the patch subject prefix from dma to dmaengine
RFC v2->RFC v3:
 - Code rewrite to use FIELD_PREP() and FIELD_GET()
 - Add defines for magic numbers

Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Eugeniy Paltsev <paltsev@synopsys.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Niklas Cassel <niklas.cassel@linaro.org>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Jose Abreu <jose.abreu@synopsys.com>
Cc: Luis Oliveira <lolivei@synopsys.com>
Cc: Vitor Soares <vitor.soares@synopsys.com>
Cc: Nelson Costa <nelson.costa@synopsys.com>
Cc: Pedro Sousa <pedrom.sousa@synopsys.com>
---
 drivers/dma/dw-edma/Makefile             |   3 +-
 drivers/dma/dw-edma/dw-edma-v0-core.c    |   3 +-
 drivers/dma/dw-edma/dw-edma-v0-debugfs.c | 361 +++++++++++++++++++++++++++++++
 drivers/dma/dw-edma/dw-edma-v0-debugfs.h |  24 ++
 4 files changed, 389 insertions(+), 2 deletions(-)
 create mode 100644 drivers/dma/dw-edma/dw-edma-v0-debugfs.c
 create mode 100644 drivers/dma/dw-edma/dw-edma-v0-debugfs.h

diff --git a/drivers/dma/dw-edma/Makefile b/drivers/dma/dw-edma/Makefile
index 01c7c63..0c53033 100644
--- a/drivers/dma/dw-edma/Makefile
+++ b/drivers/dma/dw-edma/Makefile
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 
 obj-$(CONFIG_DW_EDMA)		+= dw-edma.o
+dw-edma-$(CONFIG_DEBUG_FS)	:= dw-edma-v0-debugfs.o
 dw-edma-objs			:= dw-edma-core.o \
-					dw-edma-v0-core.o
+					dw-edma-v0-core.o $(dw-edma-y)
diff --git a/drivers/dma/dw-edma/dw-edma-v0-core.c b/drivers/dma/dw-edma/dw-edma-v0-core.c
index 5b76e42..62295af 100644
--- a/drivers/dma/dw-edma/dw-edma-v0-core.c
+++ b/drivers/dma/dw-edma/dw-edma-v0-core.c
@@ -338,9 +338,10 @@ int dw_edma_v0_core_device_config(struct dma_chan *dchan)
 /* eDMA debugfs callbacks */
 int dw_edma_v0_core_debugfs_on(struct dw_edma_chip *chip)
 {
-	return 0;
+	return dw_edma_v0_debugfs_on(chip);
 }
 
 void dw_edma_v0_core_debugfs_off(void)
 {
+	dw_edma_v0_debugfs_off();
 }
diff --git a/drivers/dma/dw-edma/dw-edma-v0-debugfs.c b/drivers/dma/dw-edma/dw-edma-v0-debugfs.c
new file mode 100644
index 0000000..385bd38
--- /dev/null
+++ b/drivers/dma/dw-edma/dw-edma-v0-debugfs.c
@@ -0,0 +1,361 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2018 Synopsys, Inc. and/or its affiliates.
+ * Synopsys DesignWare eDMA v0 core
+ */
+
+#include <linux/kernel.h>
+#include <linux/debugfs.h>
+#include <linux/bitfield.h>
+
+#include "dw-edma-v0-debugfs.h"
+#include "dw-edma-v0-regs.h"
+#include "dw-edma-core.h"
+
+#define RD_PERM					0444
+
+#define REGS_ADDR(name) \
+	(&regs->name)
+#define REGISTER(name) \
+	{ #name, REGS_ADDR(name) }
+
+#define WR_REGISTER(name) \
+	{ #name, REGS_ADDR(wr_##name) }
+#define RD_REGISTER(name) \
+	{ #name, REGS_ADDR(rd_##name) }
+
+#define WR_REGISTER_LEGACY(name) \
+	{ #name, REGS_ADDR(type.legacy.wr_##name) }
+#define RD_REGISTER_LEGACY(name) \
+	{ #name, REGS_ADDR(type.legacy.rd_##name) }
+
+#define WR_REGISTER_UNROLL(name) \
+	{ #name, REGS_ADDR(type.unroll.wr_##name) }
+#define RD_REGISTER_UNROLL(name) \
+	{ #name, REGS_ADDR(type.unroll.rd_##name) }
+
+#define WRITE_STR				"write"
+#define READ_STR				"read"
+#define CHANNEL_STR				"channel"
+#define REGISTERS_STR				"registers"
+
+static struct dentry				*base_dir;
+static struct dw_edma				*dw;
+static struct dw_edma_v0_regs			*regs;
+
+static struct {
+	void					*start;
+	void					*end;
+} lim[2][EDMA_V0_MAX_NR_CH];
+
+struct debugfs_entries {
+	char					name[24];
+	dma_addr_t				*reg;
+};
+
+static int dw_edma_debugfs_u32_get(void *data, u64 *val)
+{
+	if (dw->mode == EDMA_MODE_LEGACY &&
+	    data >= (void *)&regs->type.legacy.ch) {
+		void *ptr = (void *)&regs->type.legacy.ch;
+		u32 viewport_sel = 0;
+		unsigned long flags;
+		u16 ch;
+
+		for (ch = 0; ch < dw->wr_ch_cnt; ch++)
+			if (data >= lim[0][ch].start && data < lim[0][ch].end) {
+				ptr += (data - lim[0][ch].start);
+				goto legacy_sel_wr;
+			}
+
+		for (ch = 0; ch < dw->rd_ch_cnt; ch++)
+			if (data >= lim[1][ch].start && data < lim[1][ch].end) {
+				ptr += (data - lim[1][ch].start);
+				goto legacy_sel_rd;
+			}
+
+		return 0;
+legacy_sel_rd:
+		viewport_sel = BIT(31);
+legacy_sel_wr:
+		viewport_sel |= FIELD_PREP(EDMA_V0_VIEWPORT_MASK, ch);
+
+		raw_spin_lock_irqsave(&dw->lock, flags);
+
+		writel(viewport_sel, &regs->type.legacy.viewport_sel);
+		*val = readl((u32 *)ptr);
+
+		raw_spin_unlock_irqrestore(&dw->lock, flags);
+	} else {
+		*val = readl((u32 *)data);
+	}
+
+	return 0;
+}
+DEFINE_DEBUGFS_ATTRIBUTE(fops_x32, dw_edma_debugfs_u32_get, NULL, "0x%08llx\n");
+
+static int dw_edma_debugfs_create_x32(const struct debugfs_entries entries[],
+				      int nr_entries, struct dentry *dir)
+{
+	struct dentry *entry;
+	int i;
+
+	for (i = 0; i < nr_entries; i++) {
+		entry = debugfs_create_file_unsafe(entries[i].name, RD_PERM,
+						   dir, entries[i].reg,
+						   &fops_x32);
+		if (!entry)
+			return -EPERM;
+	}
+
+	return 0;
+}
+
+static int dw_edma_debugfs_regs_ch(struct dw_edma_v0_ch_regs *regs,
+				   struct dentry *dir)
+{
+	int nr_entries;
+	const struct debugfs_entries debugfs_regs[] = {
+		REGISTER(ch_control1),
+		REGISTER(ch_control2),
+		REGISTER(transfer_size),
+		REGISTER(sar_low),
+		REGISTER(sar_high),
+		REGISTER(dar_low),
+		REGISTER(dar_high),
+		REGISTER(llp_low),
+		REGISTER(llp_high),
+	};
+
+	nr_entries = ARRAY_SIZE(debugfs_regs);
+	return dw_edma_debugfs_create_x32(debugfs_regs, nr_entries, dir);
+}
+
+static int dw_edma_debugfs_regs_wr(struct dentry *dir)
+{
+	struct dentry *regs_dir, *ch_dir;
+	int nr_entries, i, err;
+	char name[16];
+	const struct debugfs_entries debugfs_regs[] = {
+		/* eDMA global registers */
+		WR_REGISTER(engine_en),
+		WR_REGISTER(doorbell),
+		WR_REGISTER(ch_arb_weight_low),
+		WR_REGISTER(ch_arb_weight_high),
+		/* eDMA interrupts registers */
+		WR_REGISTER(int_status),
+		WR_REGISTER(int_mask),
+		WR_REGISTER(int_clear),
+		WR_REGISTER(err_status),
+		WR_REGISTER(done_imwr_low),
+		WR_REGISTER(done_imwr_high),
+		WR_REGISTER(abort_imwr_low),
+		WR_REGISTER(abort_imwr_high),
+		WR_REGISTER(ch01_imwr_data),
+		WR_REGISTER(ch23_imwr_data),
+		WR_REGISTER(ch45_imwr_data),
+		WR_REGISTER(ch67_imwr_data),
+		WR_REGISTER(linked_list_err_en),
+	};
+	const struct debugfs_entries debugfs_unroll_regs[] = {
+		/* eDMA channel context grouping */
+		WR_REGISTER_UNROLL(engine_chgroup),
+		WR_REGISTER_UNROLL(engine_hshake_cnt_low),
+		WR_REGISTER_UNROLL(engine_hshake_cnt_high),
+		WR_REGISTER_UNROLL(ch0_pwr_en),
+		WR_REGISTER_UNROLL(ch1_pwr_en),
+		WR_REGISTER_UNROLL(ch2_pwr_en),
+		WR_REGISTER_UNROLL(ch3_pwr_en),
+		WR_REGISTER_UNROLL(ch4_pwr_en),
+		WR_REGISTER_UNROLL(ch5_pwr_en),
+		WR_REGISTER_UNROLL(ch6_pwr_en),
+		WR_REGISTER_UNROLL(ch7_pwr_en),
+	};
+
+	regs_dir = debugfs_create_dir(WRITE_STR, dir);
+	if (!regs_dir)
+		return -EPERM;
+
+	nr_entries = ARRAY_SIZE(debugfs_regs);
+	err = dw_edma_debugfs_create_x32(debugfs_regs, nr_entries, regs_dir);
+	if (err)
+		return err;
+
+	if (dw->mode == EDMA_MODE_UNROLL) {
+		nr_entries = ARRAY_SIZE(debugfs_unroll_regs);
+		err = dw_edma_debugfs_create_x32(debugfs_unroll_regs,
+						 nr_entries, regs_dir);
+		if (err)
+			return err;
+	}
+
+	for (i = 0; i < dw->wr_ch_cnt; i++) {
+		snprintf(name, sizeof(name), "%s:%d", CHANNEL_STR, i);
+
+		ch_dir = debugfs_create_dir(name, regs_dir);
+		if (!ch_dir)
+			return -EPERM;
+
+		err = dw_edma_debugfs_regs_ch(&regs->type.unroll.ch[i].wr,
+					      ch_dir);
+		if (err)
+			return err;
+
+		lim[0][i].start = &regs->type.unroll.ch[i].wr;
+		lim[0][i].end = &regs->type.unroll.ch[i].padding_1[0];
+	}
+
+	return 0;
+}
+
+static int dw_edma_debugfs_regs_rd(struct dentry *dir)
+{
+	struct dentry *regs_dir, *ch_dir;
+	int nr_entries, i, err;
+	char name[16];
+	const struct debugfs_entries debugfs_regs[] = {
+		/* eDMA global registers */
+		RD_REGISTER(engine_en),
+		RD_REGISTER(doorbell),
+		RD_REGISTER(ch_arb_weight_low),
+		RD_REGISTER(ch_arb_weight_high),
+		/* eDMA interrupts registers */
+		RD_REGISTER(int_status),
+		RD_REGISTER(int_mask),
+		RD_REGISTER(int_clear),
+		RD_REGISTER(err_status_low),
+		RD_REGISTER(err_status_high),
+		RD_REGISTER(linked_list_err_en),
+		RD_REGISTER(done_imwr_low),
+		RD_REGISTER(done_imwr_high),
+		RD_REGISTER(abort_imwr_low),
+		RD_REGISTER(abort_imwr_high),
+		RD_REGISTER(ch01_imwr_data),
+		RD_REGISTER(ch23_imwr_data),
+		RD_REGISTER(ch45_imwr_data),
+		RD_REGISTER(ch67_imwr_data),
+	};
+	const struct debugfs_entries debugfs_unroll_regs[] = {
+		/* eDMA channel context grouping */
+		RD_REGISTER_UNROLL(engine_chgroup),
+		RD_REGISTER_UNROLL(engine_hshake_cnt_low),
+		RD_REGISTER_UNROLL(engine_hshake_cnt_high),
+		RD_REGISTER_UNROLL(ch0_pwr_en),
+		RD_REGISTER_UNROLL(ch1_pwr_en),
+		RD_REGISTER_UNROLL(ch2_pwr_en),
+		RD_REGISTER_UNROLL(ch3_pwr_en),
+		RD_REGISTER_UNROLL(ch4_pwr_en),
+		RD_REGISTER_UNROLL(ch5_pwr_en),
+		RD_REGISTER_UNROLL(ch6_pwr_en),
+		RD_REGISTER_UNROLL(ch7_pwr_en),
+	};
+
+	regs_dir = debugfs_create_dir(READ_STR, dir);
+	if (!regs_dir)
+		return -EPERM;
+
+	nr_entries = ARRAY_SIZE(debugfs_regs);
+	err = dw_edma_debugfs_create_x32(debugfs_regs, nr_entries, regs_dir);
+	if (err)
+		return err;
+
+	if (dw->mode == EDMA_MODE_UNROLL) {
+		nr_entries = ARRAY_SIZE(debugfs_unroll_regs);
+		err = dw_edma_debugfs_create_x32(debugfs_unroll_regs,
+						 nr_entries, regs_dir);
+		if (err)
+			return err;
+	}
+
+	for (i = 0; i < dw->rd_ch_cnt; i++) {
+		snprintf(name, sizeof(name), "%s:%d", CHANNEL_STR, i);
+
+		ch_dir = debugfs_create_dir(name, regs_dir);
+		if (!ch_dir)
+			return -EPERM;
+
+		err = dw_edma_debugfs_regs_ch(&regs->type.unroll.ch[i].rd,
+					      ch_dir);
+		if (err)
+			return err;
+
+		lim[1][i].start = &regs->type.unroll.ch[i].rd;
+		lim[1][i].end = &regs->type.unroll.ch[i].padding_2[0];
+	}
+
+	return 0;
+}
+
+static int dw_edma_debugfs_regs(void)
+{
+	struct dentry *regs_dir;
+	int nr_entries, err;
+	const struct debugfs_entries debugfs_regs[] = {
+		REGISTER(ctrl_data_arb_prior),
+		REGISTER(ctrl),
+	};
+
+	regs_dir = debugfs_create_dir(REGISTERS_STR, base_dir);
+	if (!regs_dir)
+		return -EPERM;
+
+	nr_entries = ARRAY_SIZE(debugfs_regs);
+	err = dw_edma_debugfs_create_x32(debugfs_regs, nr_entries, regs_dir);
+	if (err)
+		return err;
+
+	err = dw_edma_debugfs_regs_wr(regs_dir);
+	if (err)
+		return err;
+
+	err = dw_edma_debugfs_regs_rd(regs_dir);
+	if (err)
+		return err;
+
+	return 0;
+}
+
+int dw_edma_v0_debugfs_on(struct dw_edma_chip *chip)
+{
+	struct dentry *entry;
+	int err;
+
+	dw = chip->dw;
+	if (!dw)
+		return -EPERM;
+
+	regs = (struct dw_edma_v0_regs *)dw->rg_region.vaddr;
+	if (!regs)
+		return -EPERM;
+
+	base_dir = debugfs_create_dir(dw->name, 0);
+	if (!base_dir)
+		return -EPERM;
+
+	entry = debugfs_create_u32("version", RD_PERM, base_dir, &dw->version);
+	if (!entry)
+		return -EPERM;
+
+	entry = debugfs_create_u32("mode", RD_PERM, base_dir, &dw->mode);
+	if (!entry)
+		return -EPERM;
+
+	entry = debugfs_create_u16("wr_ch_cnt", RD_PERM, base_dir,
+				   &dw->wr_ch_cnt);
+	if (!entry)
+		return -EPERM;
+
+	entry = debugfs_create_u16("rd_ch_cnt", RD_PERM, base_dir,
+				   &dw->rd_ch_cnt);
+	if (!entry)
+		return -EPERM;
+
+	err = dw_edma_debugfs_regs();
+	return err;
+}
+
+void dw_edma_v0_debugfs_off(void)
+{
+	debugfs_remove_recursive(base_dir);
+}
diff --git a/drivers/dma/dw-edma/dw-edma-v0-debugfs.h b/drivers/dma/dw-edma/dw-edma-v0-debugfs.h
new file mode 100644
index 0000000..175f646
--- /dev/null
+++ b/drivers/dma/dw-edma/dw-edma-v0-debugfs.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2018 Synopsys, Inc. and/or its affiliates.
+ * Synopsys DesignWare eDMA v0 core
+ */
+
+#ifndef _DW_EDMA_V0_DEBUG_FS_H
+#define _DW_EDMA_V0_DEBUG_FS_H
+
+#include <linux/dma/edma.h>
+
+#ifdef CONFIG_DEBUG_FS
+int dw_edma_v0_debugfs_on(struct dw_edma_chip *chip);
+void dw_edma_v0_debugfs_off(void);
+#else
+static inline int dw_edma_v0_debugfs_on(struct dw_edma_chip *chip)
+{
+	return 0;
+}
+
+static inline void dw_edma_v0_debugfs_off(void)
+{
+}
+#endif /* CONFIG_DEBUG_FS */
+
+#endif /* _DW_EDMA_V0_DEBUG_FS_H */
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [RFC v3 4/7] PCI: Add Synopsys endpoint EDDA Device id
  2019-01-11 18:33 [RFC v3 0/6] dmaengine: Add Synopsys eDMA IP driver (version 0) Gustavo Pimentel
                   ` (2 preceding siblings ...)
  2019-01-11 18:33 ` [RFC v3 3/7] dmaengine: Add Synopsys eDMA IP version 0 debugfs support Gustavo Pimentel
@ 2019-01-11 18:33 ` Gustavo Pimentel
  2019-01-14 14:41   ` Bjorn Helgaas
  2019-01-11 18:33 ` [RFC v3 5/7] dmaengine: Add Synopsys eDMA IP PCIe glue-logic Gustavo Pimentel
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 39+ messages in thread
From: Gustavo Pimentel @ 2019-01-11 18:33 UTC (permalink / raw)
  To: linux-pci, dmaengine
  Cc: Gustavo Pimentel, Kishon Vijay Abraham I, Bjorn Helgaas,
	Lorenzo Pieralisi, Niklas Cassel, Joao Pinto, Jose Abreu,
	Luis Oliveira, Vitor Soares, Nelson Costa, Pedro Sousa

Create and add the Synopsys endpoint EDDA device ID to the PCI ID list,
since this ID is now being used by two different drivers
(pci_endpoint_test.ko and dw-edma-pcie.ko).

Changes:
RFC v1->RFC v2:
 - Reword subject line patch
 - Reorder patch order on the series
RFC v2->RFC v3:
 - No changes

Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Kishon Vijay Abraham I <kishon@ti.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Niklas Cassel <niklas.cassel@linaro.org>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Jose Abreu <jose.abreu@synopsys.com>
Cc: Luis Oliveira <lolivei@synopsys.com>
Cc: Vitor Soares <vitor.soares@synopsys.com>
Cc: Nelson Costa <nelson.costa@synopsys.com>
Cc: Pedro Sousa <pedrom.sousa@synopsys.com>
---
 drivers/misc/pci_endpoint_test.c | 2 +-
 include/linux/pci_ids.h          | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c
index 896e2df..d27efe838 100644
--- a/drivers/misc/pci_endpoint_test.c
+++ b/drivers/misc/pci_endpoint_test.c
@@ -788,7 +788,7 @@ static void pci_endpoint_test_remove(struct pci_dev *pdev)
 static const struct pci_device_id pci_endpoint_test_tbl[] = {
 	{ PCI_DEVICE(PCI_VENDOR_ID_TI, PCI_DEVICE_ID_TI_DRA74x) },
 	{ PCI_DEVICE(PCI_VENDOR_ID_TI, PCI_DEVICE_ID_TI_DRA72x) },
-	{ PCI_DEVICE(PCI_VENDOR_ID_SYNOPSYS, 0xedda) },
+	{ PCI_DEVICE_DATA(SYNOPSYS, EDDA, NULL) },
 	{ }
 };
 MODULE_DEVICE_TABLE(pci, pci_endpoint_test_tbl);
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 5eaf39d..faf55af 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2364,6 +2364,7 @@
 #define PCI_DEVICE_ID_SYNOPSYS_HAPSUSB3		0xabcd
 #define PCI_DEVICE_ID_SYNOPSYS_HAPSUSB3_AXI	0xabce
 #define PCI_DEVICE_ID_SYNOPSYS_HAPSUSB31	0xabcf
+#define PCI_DEVICE_ID_SYNOPSYS_EDDA	0xedda
 
 #define PCI_VENDOR_ID_USR		0x16ec
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [RFC v3 5/7] dmaengine: Add Synopsys eDMA IP PCIe glue-logic
  2019-01-11 18:33 [RFC v3 0/6] dmaengine: Add Synopsys eDMA IP driver (version 0) Gustavo Pimentel
                   ` (3 preceding siblings ...)
  2019-01-11 18:33 ` [RFC v3 4/7] PCI: Add Synopsys endpoint EDDA Device id Gustavo Pimentel
@ 2019-01-11 18:33 ` Gustavo Pimentel
  2019-01-11 19:47   ` Andy Shevchenko
  2019-01-11 18:33 ` [RFC v3 6/7] MAINTAINERS: Add Synopsys eDMA IP driver maintainer Gustavo Pimentel
  2019-01-11 18:33 ` [RFC v3 7/7] dmaengine: Add Synopsys eDMA IP test and sample driver Gustavo Pimentel
  6 siblings, 1 reply; 39+ messages in thread
From: Gustavo Pimentel @ 2019-01-11 18:33 UTC (permalink / raw)
  To: linux-pci, dmaengine
  Cc: Gustavo Pimentel, Vinod Koul, Dan Williams, Eugeniy Paltsev,
	Andy Shevchenko, Russell King, Niklas Cassel, Lorenzo Pieralisi,
	Joao Pinto, Jose Abreu, Luis Oliveira, Vitor Soares,
	Nelson Costa, Pedro Sousa

Synopsys eDMA IP is normally distributed along with Synopsys PCIe
EndPoint IP (depends on the use and licensing agreement).

This IP requires some basic configurations, such as:
 - eDMA registers BAR
 - eDMA registers offset
 - eDMA registers size
 - eDMA linked list memory BAR
 - eDMA linked list memory offset
 - eDMA linked list memory size
 - eDMA data memory BAR
 - eDMA data memory offset
 - eDMA data memory size
 - eDMA version
 - eDMA mode
 - IRQs available for eDMA

As a working example, PCIe glue-logic will attach to a Synopsys PCIe
EndPoint IP prototype kit (Vendor ID = 0x16c3, Device ID = 0xedda),
which has built-in an eDMA IP with this default configuration:
 - eDMA registers BAR = 0
 - eDMA registers offset = 0x00001000 (4 Kbytes)
 - eDMA registers size = 0x00002000 (8 Kbytes)
 - eDMA linked list memory BAR = 2
 - eDMA linked list memory offset = 0x00000000 (0 Kbytes)
 - eDMA linked list memory size = 0x00800000 (8 Mbytes)
 - eDMA data memory BAR = 2
 - eDMA data memory offset = 0x00800000 (8 Mbytes)
 - eDMA data memory size = 0x03800000 (56 Mbytes)
 - eDMA version = 0
 - eDMA mode = EDMA_MODE_UNROLL
 - IRQs = 1

This driver can be compiled as a built-in or as an external kernel
module.

To enable this driver, select the DW_EDMA_PCIE option in the kernel
configuration; it requires and automatically selects the DW_EDMA option
as well.

Changes:
RFC v1->RFC v2:
 - Replace C99-style // comments with /* */ comments
 - Merge two pcim_iomap_regions() calls into just one call
 - Remove pci_try_set_mwi() call
 - Replace some dev_info() by dev_dbg() to reduce *noise*
 - Remove pci_name(pdev) call after being call dw_edma_remove()
 - Remove all power management support
 - Fix the headers of the .c and .h files according to the most recent
   convention
 - Fix errors and checks pointed out by checkpatch with --strict option
 - Change the patch subject prefix from dma to dmaengine
RFC v2->RFC v3:
 - Fix printk variable of phys_addr_t type
 - Fix missing variable initialization (chan->configured)
 - Change linked list size to 512 Kbytes
 - Add data memory information
 - Add register size information
 - Add comments or improve existing ones
 - Add possibility to work with multiple IRQs feature
 - Replace MSI and MSI-X enable condition by pci_dev_msi_enabled()
 - Replace code to acquire MSI(-X) address and data by
   get_cached_msi_msg()

Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Eugeniy Paltsev <paltsev@synopsys.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Niklas Cassel <niklas.cassel@linaro.org>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Jose Abreu <jose.abreu@synopsys.com>
Cc: Luis Oliveira <lolivei@synopsys.com>
Cc: Vitor Soares <vitor.soares@synopsys.com>
Cc: Nelson Costa <nelson.costa@synopsys.com>
Cc: Pedro Sousa <pedrom.sousa@synopsys.com>
---
 drivers/dma/dw-edma/Kconfig        |   9 ++
 drivers/dma/dw-edma/Makefile       |   1 +
 drivers/dma/dw-edma/dw-edma-pcie.c | 254 +++++++++++++++++++++++++++++++++++++
 3 files changed, 264 insertions(+)
 create mode 100644 drivers/dma/dw-edma/dw-edma-pcie.c

diff --git a/drivers/dma/dw-edma/Kconfig b/drivers/dma/dw-edma/Kconfig
index 3016bed..c0838ce 100644
--- a/drivers/dma/dw-edma/Kconfig
+++ b/drivers/dma/dw-edma/Kconfig
@@ -7,3 +7,12 @@ config DW_EDMA
 	help
 	  Support the Synopsys DesignWare eDMA controller, normally
 	  implemented on endpoints SoCs.
+
+config DW_EDMA_PCIE
+	tristate "Synopsys DesignWare eDMA PCIe driver"
+	depends on PCI && PCI_MSI
+	select DW_EDMA
+	help
+	  Provides glue-logic between the Synopsys DesignWare
+	  eDMA controller and a PCIe endpoint device. This also serves
+	  as a reference design for anyone who desires to use this IP.
diff --git a/drivers/dma/dw-edma/Makefile b/drivers/dma/dw-edma/Makefile
index 0c53033..8d45c0d 100644
--- a/drivers/dma/dw-edma/Makefile
+++ b/drivers/dma/dw-edma/Makefile
@@ -4,3 +4,4 @@ obj-$(CONFIG_DW_EDMA)		+= dw-edma.o
 dw-edma-$(CONFIG_DEBUG_FS)	:= dw-edma-v0-debugfs.o
 dw-edma-objs			:= dw-edma-core.o \
 					dw-edma-v0-core.o $(dw-edma-y)
+obj-$(CONFIG_DW_EDMA_PCIE)	+= dw-edma-pcie.o
diff --git a/drivers/dma/dw-edma/dw-edma-pcie.c b/drivers/dma/dw-edma/dw-edma-pcie.c
new file mode 100644
index 0000000..b96b3c4
--- /dev/null
+++ b/drivers/dma/dw-edma/dw-edma-pcie.c
@@ -0,0 +1,254 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2018 Synopsys, Inc. and/or its affiliates.
+ * Synopsys DesignWare eDMA PCIe driver
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/device.h>
+#include <linux/dma/edma.h>
+#include <linux/msi.h>
+
+#include "dw-edma-core.h"
+
+enum dw_edma_pcie_bar {
+	BAR_0,
+	BAR_1,
+	BAR_2,
+	BAR_3,
+	BAR_4,
+	BAR_5
+};
+
+struct dw_edma_pcie_data {
+	/* eDMA registers location */
+	enum dw_edma_pcie_bar		rg_bar;
+	off_t				rg_off;
+	size_t				rg_sz;
+	/* eDMA memory linked list location */
+	enum dw_edma_pcie_bar		ll_bar;
+	off_t				ll_off;
+	size_t				ll_sz;
+	/* eDMA memory data location */
+	enum dw_edma_pcie_bar		dt_bar;
+	off_t				dt_off;
+	size_t				dt_sz;
+	/* Other */
+	u32				version;
+	enum dw_edma_mode		mode;
+	u8				irqs_cnt;
+};
+
+static const struct dw_edma_pcie_data snps_edda_data = {
+	/* eDMA registers location */
+	.rg_bar				= BAR_0,
+	.rg_off				= 0x00001000,	/*  4 Kbytes */
+	.rg_sz				= 0x00002000,	/*  8 Kbytes */
+	/* eDMA memory linked list location */
+	.ll_bar				= BAR_2,
+	.ll_off				= 0x00000000,	/*  0 Kbytes */
+	.ll_sz				= 0x00800000,	/*  8 Mbytes */
+	/* eDMA memory data location */
+	.dt_bar				= BAR_2,
+	.dt_off				= 0x00800000,	/*  8 Mbytes */
+	.dt_sz				= 0x03800000,	/* 56 Mbytes */
+	/* Other */
+	.version			= 0,
+	.mode				= EDMA_MODE_UNROLL,
+	.irqs_cnt			= 1,
+};
+
+static bool disable_msix;
+module_param(disable_msix, bool, 0644);
+MODULE_PARM_DESC(disable_msix, "Disable MSI-X interrupts");
+
+static int dw_edma_pcie_probe(struct pci_dev *pdev,
+			      const struct pci_device_id *pid)
+{
+	const struct dw_edma_pcie_data *pdata = (void *)pid->driver_data;
+	struct device *dev = &pdev->dev;
+	struct dw_edma_chip *chip;
+	struct dw_edma *dw;
+	unsigned int irq_flags = PCI_IRQ_MSI;
+	int err, nr_irqs, i;
+
+	if (!pdata) {
+		dev_err(dev, "%s missing data structure\n", pci_name(pdev));
+		return -EFAULT;
+	}
+
+	/* Enable PCI device */
+	err = pcim_enable_device(pdev);
+	if (err) {
+		dev_err(dev, "%s enabling device failed\n", pci_name(pdev));
+		return err;
+	}
+
+	/* Mapping PCI BAR regions */
+	err = pcim_iomap_regions(pdev, BIT(pdata->rg_bar) |
+				       BIT(pdata->ll_bar) |
+				       BIT(pdata->dt_bar),
+				 pci_name(pdev));
+	if (err) {
+		dev_err(dev, "%s eDMA BAR I/O remapping failed\n",
+			pci_name(pdev));
+		return err;
+	}
+
+	pci_set_master(pdev);
+
+	/* DMA configuration */
+	err = pci_set_dma_mask(pdev, DMA_BIT_MASK(32));
+	if (err) {
+		dev_err(dev, "%s DMA mask set failed\n", pci_name(pdev));
+		return err;
+	}
+
+	err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
+	if (err) {
+		dev_err(dev, "%s consistent DMA mask set failed\n",
+			pci_name(pdev));
+		return err;
+	}
+
+	/* Data structure allocation */
+	chip = devm_kzalloc(dev, sizeof(*chip), GFP_KERNEL);
+	if (!chip)
+		return -ENOMEM;
+
+	dw = devm_kzalloc(dev, sizeof(*dw), GFP_KERNEL);
+	if (!dw)
+		return -ENOMEM;
+
+	/* IRQs allocation */
+	if (!disable_msix)
+		irq_flags |= PCI_IRQ_MSIX;
+
+	nr_irqs = pci_alloc_irq_vectors(pdev, 1, pdata->irqs_cnt, irq_flags);
+	if (nr_irqs < 1) {
+		dev_err(dev, "%s failed to alloc IRQ vector (Number of IRQs=%u)\n",
+			pci_name(pdev), nr_irqs);
+		return -EPERM;
+	}
+
+	/* Data structure initialization */
+	chip->dw = dw;
+	chip->dev = dev;
+	chip->id = pdev->devfn;
+	chip->irq = pdev->irq;
+
+	if (!pcim_iomap_table(pdev))
+		return -EACCES;
+
+	dw->rg_region.vaddr = (dma_addr_t)pcim_iomap_table(pdev)[pdata->rg_bar];
+	dw->rg_region.vaddr += pdata->rg_off;
+	dw->rg_region.paddr = pdev->resource[pdata->rg_bar].start;
+	dw->rg_region.paddr += pdata->rg_off;
+	dw->rg_region.sz = pdata->rg_sz;
+
+	dw->ll_region.vaddr = (dma_addr_t)pcim_iomap_table(pdev)[pdata->ll_bar];
+	dw->ll_region.vaddr += pdata->ll_off;
+	dw->ll_region.paddr = pdev->resource[pdata->ll_bar].start;
+	dw->ll_region.paddr += pdata->ll_off;
+	dw->ll_region.sz = pdata->ll_sz;
+
+	dw->dt_region.vaddr = (dma_addr_t)pcim_iomap_table(pdev)[pdata->dt_bar];
+	dw->dt_region.vaddr += pdata->dt_off;
+	dw->dt_region.paddr = pdev->resource[pdata->dt_bar].start;
+	dw->dt_region.paddr += pdata->dt_off;
+	dw->dt_region.sz = pdata->dt_sz;
+
+	dw->version = pdata->version;
+	dw->mode = pdata->mode;
+	dw->nr_irqs = nr_irqs;
+
+	/* Debug info */
+	dev_dbg(dev, "Version:\t%u\n", dw->version);
+
+	dev_dbg(dev, "Mode:\t%s\n",
+		dw->mode == EDMA_MODE_LEGACY ? "Legacy" : "Unroll");
+
+	dev_dbg(dev, "Registers:\tBAR=%u, off=0x%.8lx, sz=0x%zx bytes, addr(v=%pa, p=%pa)\n",
+		pdata->rg_bar, pdata->rg_off, pdata->rg_sz,
+		&dw->rg_region.vaddr, &dw->rg_region.paddr);
+
+	dev_dbg(dev, "L. List:\tBAR=%u, off=0x%.8lx, sz=0x%zx bytes, addr(v=%pa, p=%pa)\n",
+		pdata->ll_bar, pdata->ll_off, pdata->ll_sz,
+		&dw->ll_region.vaddr, &dw->ll_region.paddr);
+
+	dev_dbg(dev, "Data:\tBAR=%u, off=0x%.8lx, sz=0x%zx bytes, addr(v=%pa, p=%pa)\n",
+		pdata->dt_bar, pdata->dt_off, pdata->dt_sz,
+		&dw->dt_region.vaddr, &dw->dt_region.paddr);
+
+	dev_dbg(dev, "Nr. IRQs:\t%u\n", dw->nr_irqs);
+
+	/* Validating if PCI interrupts were enabled */
+	if (!pci_dev_msi_enabled(pdev)) {
+		dev_err(dev, "%s enable interrupt failed\n", pci_name(pdev));
+		return -EPERM;
+	}
+
+	/*
+	 * Acquiring PCI MSI(-X) configuration (address and data) for
+	 * setting it later on eDMA interrupt registers
+	 */
+	dw->msi = devm_kcalloc(dev, nr_irqs, sizeof(*dw->msi), GFP_KERNEL);
+	if (!dw->msi)
+		return -ENOMEM;
+
+	for (i = 0; i < nr_irqs; i++)
+		get_cached_msi_msg(pci_irq_vector(to_pci_dev(dev), i),
+				   &dw->msi[i]);
+
+	/* Starting eDMA driver */
+	err = dw_edma_probe(chip);
+	if (err) {
+		dev_err(dev, "%s eDMA probe failed\n", pci_name(pdev));
+		return err;
+	}
+
+	/* Saving data structure reference */
+	pci_set_drvdata(pdev, chip);
+
+	dev_info(dev, "DesignWare eDMA PCIe driver loaded completely\n");
+
+	return 0;
+}
+
+static void dw_edma_pcie_remove(struct pci_dev *pdev)
+{
+	struct dw_edma_chip *chip = pci_get_drvdata(pdev);
+	struct device *dev = &pdev->dev;
+	int err;
+
+	/* Stopping eDMA driver */
+	err = dw_edma_remove(chip);
+	if (err)
+		dev_warn(dev, "can't remove device properly: %d\n", err);
+
+	/* Freeing IRQs */
+	pci_free_irq_vectors(pdev);
+
+	dev_info(dev, "DesignWare eDMA PCIe driver unloaded completely\n");
+}
+
+static const struct pci_device_id dw_edma_pcie_id_table[] = {
+	{ PCI_DEVICE_DATA(SYNOPSYS, EDDA, &snps_edda_data) },
+	{ }
+};
+MODULE_DEVICE_TABLE(pci, dw_edma_pcie_id_table);
+
+static struct pci_driver dw_edma_pcie_driver = {
+	.name		= "dw-edma-pcie",
+	.id_table	= dw_edma_pcie_id_table,
+	.probe		= dw_edma_pcie_probe,
+	.remove		= dw_edma_pcie_remove,
+};
+
+module_pci_driver(dw_edma_pcie_driver);
+
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("Synopsys DesignWare eDMA PCIe driver");
+MODULE_AUTHOR("Gustavo Pimentel <gustavo.pimentel@synopsys.com>");
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [RFC v3 6/7] MAINTAINERS: Add Synopsys eDMA IP driver maintainer
  2019-01-11 18:33 [RFC v3 0/6] dmaengine: Add Synopsys eDMA IP driver (version 0) Gustavo Pimentel
                   ` (4 preceding siblings ...)
  2019-01-11 18:33 ` [RFC v3 5/7] dmaengine: Add Synopsys eDMA IP PCIe glue-logic Gustavo Pimentel
@ 2019-01-11 18:33 ` Gustavo Pimentel
  2019-01-11 18:33 ` [RFC v3 7/7] dmaengine: Add Synopsys eDMA IP test and sample driver Gustavo Pimentel
  6 siblings, 0 replies; 39+ messages in thread
From: Gustavo Pimentel @ 2019-01-11 18:33 UTC (permalink / raw)
  To: linux-pci, dmaengine
  Cc: Gustavo Pimentel, Vinod Koul, Eugeniy Paltsev, Joao Pinto,
	Jose Abreu, Luis Oliveira, Vitor Soares, Nelson Costa,
	Pedro Sousa

Add Synopsys eDMA IP driver maintainer.

This driver aims to support the Synopsys eDMA IP and is normally
distributed along with the Synopsys PCIe EndPoint IP (depends on the use
and licensing agreement).

Changes:
RFC v1->RFC v2:
 - No changes
RFC v2->RFC v3:
 - No changes

Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: Eugeniy Paltsev <paltsev@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Jose Abreu <jose.abreu@synopsys.com>
Cc: Luis Oliveira <lolivei@synopsys.com>
Cc: Vitor Soares <vitor.soares@synopsys.com>
Cc: Nelson Costa <nelson.costa@synopsys.com>
Cc: Pedro Sousa <pedrom.sousa@synopsys.com>
---
 MAINTAINERS | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 32d44447..e30c1ee 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4401,6 +4401,13 @@ L:	linux-mtd@lists.infradead.org
 S:	Supported
 F:	drivers/mtd/nand/raw/denali*
 
+DESIGNWARE EDMA CORE IP DRIVER
+M:	Gustavo Pimentel <gustavo.pimentel@synopsys.com>
+L:	dmaengine@vger.kernel.org
+S:	Maintained
+F:	drivers/dma/dw-edma/
+F:	include/linux/dma/edma.h
+
 DESIGNWARE USB2 DRD IP DRIVER
 M:	Minas Harutyunyan <hminas@synopsys.com>
 L:	linux-usb@vger.kernel.org
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [RFC v3 7/7] dmaengine: Add Synopsys eDMA IP test and sample driver
  2019-01-11 18:33 [RFC v3 0/6] dmaengine: Add Synopsys eDMA IP driver (version 0) Gustavo Pimentel
                   ` (5 preceding siblings ...)
  2019-01-11 18:33 ` [RFC v3 6/7] MAINTAINERS: Add Synopsys eDMA IP driver maintainer Gustavo Pimentel
@ 2019-01-11 18:33 ` Gustavo Pimentel
  2019-01-11 19:48   ` Andy Shevchenko
  2019-01-16 10:45   ` Jose Abreu
  6 siblings, 2 replies; 39+ messages in thread
From: Gustavo Pimentel @ 2019-01-11 18:33 UTC (permalink / raw)
  To: linux-pci, dmaengine
  Cc: Gustavo Pimentel, Vinod Koul, Dan Williams, Eugeniy Paltsev,
	Andy Shevchenko, Russell King, Niklas Cassel, Joao Pinto,
	Jose Abreu, Luis Oliveira, Vitor Soares, Nelson Costa,
	Pedro Sousa

Add Synopsys eDMA IP test and sample driver to be use for testing
purposes and also as a reference for any developer who needs to
implement and use Synopsys eDMA.

This driver can be compile as built-in or external module in kernel.

To enable this driver just select DW_EDMA_TEST option in kernel
configuration, however it requires and selects automatically DW_EDMA
option too.

Changes:
RFC v1->RFC v2:
 - No changes
RFC v2->RFC v3:
 - Add test module

Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Eugeniy Paltsev <paltsev@synopsys.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Niklas Cassel <niklas.cassel@linaro.org>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Jose Abreu <jose.abreu@synopsys.com>
Cc: Luis Oliveira <lolivei@synopsys.com>
Cc: Vitor Soares <vitor.soares@synopsys.com>
Cc: Nelson Costa <nelson.costa@synopsys.com>
Cc: Pedro Sousa <pedrom.sousa@synopsys.com>
---
 drivers/dma/dw-edma/Kconfig        |   7 +
 drivers/dma/dw-edma/Makefile       |   1 +
 drivers/dma/dw-edma/dw-edma-test.c | 897 +++++++++++++++++++++++++++++++++++++
 3 files changed, 905 insertions(+)
 create mode 100644 drivers/dma/dw-edma/dw-edma-test.c

diff --git a/drivers/dma/dw-edma/Kconfig b/drivers/dma/dw-edma/Kconfig
index c0838ce..fe2b129 100644
--- a/drivers/dma/dw-edma/Kconfig
+++ b/drivers/dma/dw-edma/Kconfig
@@ -16,3 +16,10 @@ config DW_EDMA_PCIE
 	  Provides a glue-logic between the Synopsys DesignWare
 	  eDMA controller and an endpoint PCIe device. This also serves
 	  as a reference design to whom desires to use this IP.
+
+config DW_EDMA_TEST
+	tristate "Synopsys DesignWare eDMA test driver"
+	select DW_EDMA
+	help
+	  Simple DMA test client. Say N unless you're debugging a
+	  Synopsys eDMA device driver.
diff --git a/drivers/dma/dw-edma/Makefile b/drivers/dma/dw-edma/Makefile
index 8d45c0d..76e1e73 100644
--- a/drivers/dma/dw-edma/Makefile
+++ b/drivers/dma/dw-edma/Makefile
@@ -5,3 +5,4 @@ dw-edma-$(CONFIG_DEBUG_FS)	:= dw-edma-v0-debugfs.o
 dw-edma-objs			:= dw-edma-core.o \
 					dw-edma-v0-core.o $(dw-edma-y)
 obj-$(CONFIG_DW_EDMA_PCIE)	+= dw-edma-pcie.o
+obj-$(CONFIG_DW_EDMA_TEST)	+= dw-edma-test.o
diff --git a/drivers/dma/dw-edma/dw-edma-test.c b/drivers/dma/dw-edma/dw-edma-test.c
new file mode 100644
index 0000000..23f8c23
--- /dev/null
+++ b/drivers/dma/dw-edma/dw-edma-test.c
@@ -0,0 +1,897 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2018 Synopsys, Inc. and/or its affiliates.
+ * Synopsys DesignWare eDMA test driver
+ */
+
+#include <linux/delay.h>
+#include <linux/dma-mapping.h>
+#include <linux/dmaengine.h>
+#include <linux/freezer.h>
+#include <linux/kthread.h>
+#include <linux/sched/task.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/slab.h>
+#include <linux/wait.h>
+
+#include "dw-edma-core.h"
+
+enum channel_id {
+	EDMA_CH_WR = 0,
+	EDMA_CH_RD,
+	EDMA_CH_END
+};
+
+static const char * const channel_name[] = {"WRITE", "READ"};
+
+#define EDMA_TEST_MAX_THREADS_CHANNEL		8
+#define EDMA_TEST_DEVICE_NAME			"0000:01:00.0"
+#define EDMA_TEST_CHANNEL_NAME			"dma%uchan%u"
+
+static u32 buf_sz = 14 * 1024 * 1024;		/* 14 Mbytes */
+module_param(buf_sz, uint, 0644);
+MODULE_PARM_DESC(buf_sz, "Buffer test size in bytes");
+
+static u32 buf_seg = 2 * 1024 * 1024;		/*  2 Mbytes */
+module_param(buf_seg, uint, 0644);
+MODULE_PARM_DESC(buf_seg, "Buffer test size segments in bytes");
+
+static u32 wr_threads = EDMA_TEST_MAX_THREADS_CHANNEL;
+module_param(wr_threads, uint, 0644);
+MODULE_PARM_DESC(wr_threads, "Number of write threads");
+
+static u32 rd_threads = EDMA_TEST_MAX_THREADS_CHANNEL;
+module_param(rd_threads, uint, 0644);
+MODULE_PARM_DESC(rd_threads, "Number of reads threads");
+
+static u32 repetitions;
+module_param(repetitions, uint, 0644);
+MODULE_PARM_DESC(repetitions, "Number of repetitions");
+
+static u32 timeout = 5000;
+module_param(timeout, uint, 0644);
+MODULE_PARM_DESC(timeout, "Transfer timeout in msec");
+
+static bool pattern;
+module_param(pattern, bool, 0644);
+MODULE_PARM_DESC(pattern, "Set CPU memory with a pattern before the transfer");
+
+static bool dump_mem;
+module_param(dump_mem, bool, 0644);
+MODULE_PARM_DESC(dump_mem, "Prints on console the CPU and Endpoint memory before and after the transfer");
+
+static u32 dump_sz = 5;
+module_param(dump_sz, uint, 0644);
+MODULE_PARM_DESC(dump_sz, "Size of memory dump");
+
+static bool check;
+module_param(check, bool, 0644);
+MODULE_PARM_DESC(check, "Performs a verification after the transfer to validate data");
+
+static int dw_edma_test_run_set(const char *val, const struct kernel_param *kp);
+
+static int dw_edma_test_run_set(const char *val, const struct kernel_param *kp);
+static int dw_edma_test_run_get(char *val, const struct kernel_param *kp);
+static const struct kernel_param_ops run_ops = {
+	.set = dw_edma_test_run_set,
+	.get = dw_edma_test_run_get,
+};
+
+static bool run_test;
+module_param_cb(run_test, &run_ops, &run_test, 0644);
+MODULE_PARM_DESC(run_test, "Run test");
+
+struct dw_edma_test_params {
+	u32				buf_sz;
+	u32				buf_seg;
+	u32				num_threads[EDMA_CH_END];
+	u32				repetitions;
+	u32				timeout;
+	u8				pattern;
+	u8				dump_mem;
+	u32				dump_sz;
+	u8				check;
+};
+
+static struct dw_edma_test_info {
+	struct dw_edma_test_params	params;
+	struct list_head		channels;
+	struct mutex			lock;
+	bool				init;
+} test_info = {
+	.channels = LIST_HEAD_INIT(test_info.channels),
+	.lock = __MUTEX_INITIALIZER(test_info.lock),
+};
+
+struct dw_edma_test_done {
+	bool				done;
+	wait_queue_head_t		*wait;
+};
+
+struct dw_edma_test_thread {
+	struct dw_edma_test_info	*info;
+	struct task_struct		*task;
+	struct dma_chan			*chan;
+	enum dma_transfer_direction	direction;
+	wait_queue_head_t		done_wait;
+	struct dw_edma_test_done	test_done;
+	bool				done;
+};
+
+struct dw_edma_test_chan {
+	struct list_head		node;
+	struct dma_chan			*chan;
+	struct dw_edma_test_thread	*thread;
+};
+
+static DECLARE_WAIT_QUEUE_HEAD(thread_wait);
+
+static void dw_edma_test_callback(void *arg)
+{
+	struct dw_edma_test_done *done = arg;
+	struct dw_edma_test_thread *thread =
+		container_of(done, struct dw_edma_test_thread, test_done);
+	if (!thread->done) {
+		done->done = true;
+		wake_up_all(done->wait);
+	} else {
+		WARN(1, "dw_edma_test: Kernel memory may be corrupted!!\n");
+	}
+}
+
+static void dw_edma_test_memset(dma_addr_t addr, int sz)
+{
+	void __iomem *ptr = (void __iomem *)addr;
+	int rem_sz = sz, step = 0;
+
+	while (rem_sz >= 0) {
+#ifdef CONFIG_64BIT
+		if (rem_sz >= 8) {
+			step = 8;
+			writeq(0x0123456789ABCDEF, ptr);
+		} else if (rem_sz >= 4) {
+#else
+		if (rem_sz >= 4) {
+#endif
+			step = 4;
+			writel(0x01234567, ptr);
+		} else if (rem_sz >= 2) {
+			step = 2;
+			writew(0x0123, ptr);
+		} else {
+			step = 1;
+			writeb(0x01, ptr);
+		}
+		ptr += step;
+		rem_sz -= step;
+	}
+}
+
+static bool dw_edma_test_check(dma_addr_t v1, dma_addr_t v2, int sz)
+{
+	void __iomem *ptr1 = (void __iomem *)v1;
+	void __iomem *ptr2 = (void __iomem *)v2;
+	int rem_sz = sz, step = 0;
+
+	while (rem_sz >= 0) {
+#ifdef CONFIG_64BIT
+		if (rem_sz >= 8) {
+			step = 8;
+			if (readq(ptr1) != readq(ptr2))
+				return false;
+		} else if (rem_sz >= 4) {
+#else
+		if (rem_sz >= 4) {
+#endif
+			step = 4;
+			if (readl(ptr1) != readl(ptr2))
+				return false;
+		} else if (rem_sz >= 2) {
+			step = 2;
+			if (readw(ptr1) != readw(ptr2))
+				return false;
+		} else {
+			step = 1;
+			if (readb(ptr1) != readb(ptr2))
+				return false;
+		}
+		ptr1 += step;
+		ptr2 += step;
+		rem_sz -= step;
+	}
+
+	return true;
+}
+
+static void dw_edma_test_dump(struct device *dev,
+			      enum dma_transfer_direction direction, int sz,
+			      struct dw_edma_region *r1,
+			      struct dw_edma_region *r2)
+{
+	u32 *ptr1, *ptr2, *ptr3, *ptr4;
+	int i, cnt = min(r1->sz, r2->sz);
+
+	cnt = min(cnt, sz);
+	cnt -= cnt % 4;
+
+	if (direction == DMA_DEV_TO_MEM) {
+		ptr1 = (u32 *)r1->vaddr;
+		ptr2 = (u32 *)r1->paddr;
+		ptr3 = (u32 *)r2->vaddr;
+		ptr4 = (u32 *)r2->paddr;
+		dev_info(dev, "      ============= EP memory =============\t============= CPU memory ============\n");
+	} else {
+		ptr1 = (u32 *)r2->vaddr;
+		ptr2 = (u32 *)r2->paddr;
+		ptr3 = (u32 *)r1->vaddr;
+		ptr4 = (u32 *)r1->paddr;
+		dev_info(dev, "      ============= CPU memory ============\t============= EP memory =============\n");
+	}
+	dev_info(dev, "      ============== Source ===============\t============ Destination ============\n");
+	dev_info(dev, "      [Virt. Addr][Phys. Addr]=[   Value  ]\t[Virt. Addr][Phys. Addr]=[   Value  ]\n");
+	for (i = 0; i < cnt; i++, ptr1++, ptr2++, ptr3++, ptr4++)
+		dev_info(dev, "[%.3u] [%pa][%pa]=[0x%.8x]\t[%pa][%pa]=[0x%.8x]\n",
+			 i,
+			 &ptr1, &ptr2, readl(ptr1),
+			 &ptr3, &ptr4, readl(ptr3));
+}
+
+static int dw_edma_test_sg(void *data)
+{
+	struct dw_edma_test_thread *thread = data;
+	struct dw_edma_test_done *done = &thread->test_done;
+	struct dw_edma_test_info *info = thread->info;
+	struct dw_edma_test_params *params = &info->params;
+	struct dma_chan	*chan = thread->chan;
+	struct device *dev = chan->device->dev;
+	struct dw_edma_region *dt_region = chan->private;
+	u32 rem_len = params->buf_sz;
+	u32 f_prp_cnt = 0;
+	u32 f_sbt_cnt = 0;
+	u32 f_tm_cnt = 0;
+	u32 f_cpl_err = 0;
+	u32 f_cpl_bsy = 0;
+	dma_cookie_t cookie;
+	enum dma_status status;
+	struct dw_edma_region *descs;
+	struct sg_table	*sgt;
+	struct scatterlist *sg;
+	struct dma_slave_config	sconf;
+	struct dma_async_tx_descriptor *txdesc;
+	int i, sgs, err = 0;
+
+	set_freezable();
+	set_user_nice(current, 10);
+
+	/* Calculates the maximum number of segments */
+	sgs = DIV_ROUND_UP(params->buf_sz, params->buf_seg);
+
+	if (!sgs)
+		goto err_end;
+
+	/* Allocate scatter-gather table */
+	sgt = kvmalloc(sizeof(*sgt), GFP_KERNEL);
+	if (!sgt)
+		goto err_end;
+
+	err = sg_alloc_table(sgt, sgs, GFP_KERNEL);
+	if (err)
+		goto err_sg_alloc_table;
+
+	sg = &sgt->sgl[0];
+	if (!sg)
+		goto err_alloc_descs;
+
+	/*
+	 * Allocate structure to hold all scatter-gather segments (size,
+	 * virtual and physical addresses)
+	 */
+	descs = devm_kcalloc(dev, sgs, sizeof(*descs), GFP_KERNEL);
+	if (!descs)
+		goto err_alloc_descs;
+
+	for (i = 0; sg && i < sgs; i++) {
+		descs[i].paddr = 0;
+		descs[i].sz = min(rem_len, params->buf_seg);
+		rem_len -= descs[i].sz;
+
+		descs[i].vaddr = (dma_addr_t)dma_alloc_coherent(dev,
+								descs[i].sz,
+								&descs[i].paddr,
+								GFP_KERNEL);
+		if (!descs[i].vaddr || !descs[i].paddr) {
+			dev_err(dev, "%s: (%u)fail to allocate %u bytes\n",
+				dma_chan_name(chan), i,	descs[i].sz);
+			goto err_descs;
+		}
+
+		dev_dbg(dev, "%s: CPU: segment %u, addr(v=%pa, p=%pa)\n",
+			dma_chan_name(chan), i,
+			&descs[i].vaddr, &descs[i].paddr);
+
+		sg_set_buf(sg, (void *)descs[i].paddr, descs[i].sz);
+		sg = sg_next(sg);
+	}
+
+	/* Dumps the first segment memory */
+	if (params->dump_mem)
+		dw_edma_test_dump(dev, thread->direction, params->dump_sz,
+				  dt_region, &descs[0]);
+
+	/* Fills CPU memory with a known pattern */
+	if (params->pattern)
+		dw_edma_test_memset(descs[0].vaddr, params->buf_sz);
+
+	/*
+	 * Configures DMA channel according to the direction
+	 *  - flags
+	 *  - source and destination addresses
+	 */
+	if (thread->direction == DMA_DEV_TO_MEM) {
+		/* DMA_DEV_TO_MEM - WRITE - DMA_FROM_DEVICE */
+		dev_dbg(dev, "%s: DMA_DEV_TO_MEM - WRITE - DMA_FROM_DEVICE\n",
+			dma_chan_name(chan));
+		err = dma_map_sg(dev, sgt->sgl, sgt->nents, DMA_FROM_DEVICE);
+		if (!err)
+			goto err_descs;
+
+		sgt->nents = err;
+		/* Endpoint memory */
+		sconf.src_addr = dt_region->paddr;
+		/* CPU memory */
+		sconf.dst_addr = descs[0].paddr;
+	} else {
+		/* DMA_MEM_TO_DEV - READ - DMA_TO_DEVICE */
+		dev_dbg(dev, "%s: DMA_MEM_TO_DEV - READ - DMA_TO_DEVICE\n",
+			dma_chan_name(chan));
+		err = dma_map_sg(dev, sgt->sgl, sgt->nents, DMA_TO_DEVICE);
+		if (!err)
+			goto err_descs;
+
+		sgt->nents = err;
+		/* CPU memory */
+		sconf.src_addr = descs[0].paddr;
+		/* Endpoint memory */
+		sconf.dst_addr = dt_region->paddr;
+	}
+
+	dmaengine_slave_config(chan, &sconf);
+	dev_dbg(dev, "%s: addr(physical) src=%pa, dst=%pa\n",
+		dma_chan_name(chan), &sconf.src_addr, &sconf.dst_addr);
+	dev_dbg(dev, "%s: len=%u bytes, sgs=%u, seg_sz=%u bytes\n",
+		dma_chan_name(chan), params->buf_sz, sgs, params->buf_seg);
+
+	/*
+	 * Prepare the DMA channel for the transfer
+	 *  - provide scatter-gather list
+	 *  - configure to trigger an interrupt after the transfer
+	 */
+	txdesc = dmaengine_prep_slave_sg(chan, sgt->sgl, sgt->nents,
+					 thread->direction,
+					 DMA_PREP_INTERRUPT);
+	if (!txdesc) {
+		dev_dbg(dev, "%s: dmaengine_prep_slave_sg\n",
+			dma_chan_name(chan));
+		f_prp_cnt++;
+		goto err_stats;
+	}
+
+	done->done = false;
+	txdesc->callback = dw_edma_test_callback;
+	txdesc->callback_param = done;
+	cookie = dmaengine_submit(txdesc);
+	if (dma_submit_error(cookie)) {
+		dev_dbg(dev, "%s: dma_submit_error\n", dma_chan_name(chan));
+		f_sbt_cnt++;
+		goto err_stats;
+	}
+
+	/* Start DMA transfer */
+	dma_async_issue_pending(chan);
+
+	/* Thread waits here for transfer completion or exists by timeout */
+	wait_event_freezable_timeout(thread->done_wait, done->done,
+				     msecs_to_jiffies(params->timeout));
+
+	/* Check DMA transfer status and act upon it  */
+	status = dma_async_is_tx_complete(chan, cookie, NULL, NULL);
+	if (!done->done) {
+		dev_dbg(dev, "%s: timeout\n", dma_chan_name(chan));
+		f_tm_cnt++;
+	} else if (status != DMA_COMPLETE) {
+		if (status == DMA_ERROR) {
+			dev_dbg(dev, "%s:  completion error status\n",
+				dma_chan_name(chan));
+			f_cpl_err++;
+		} else {
+			dev_dbg(dev, "%s: completion busy status\n",
+				dma_chan_name(chan));
+			f_cpl_bsy++;
+		}
+	}
+
+err_stats:
+	/* Display some stats information */
+	if (f_prp_cnt || f_sbt_cnt || f_tm_cnt || f_cpl_err || f_cpl_bsy) {
+		dev_info(dev, "%s: test failed - dmaengine_prep_slave_sg=%u, dma_submit_error=%u, timeout=%u, completion error status=%u, completion busy status=%u\n",
+			 dma_chan_name(chan), f_prp_cnt, f_sbt_cnt,
+			 f_tm_cnt, f_cpl_err, f_cpl_bsy);
+	} else {
+		dev_info(dev, "%s: test passed\n", dma_chan_name(chan));
+	}
+
+	/* Dumps the first segment memory */
+	if (params->dump_mem)
+		dw_edma_test_dump(dev, thread->direction, params->dump_sz,
+				  dt_region, &descs[0]);
+
+	/* Check if the data was correctly transfer */
+	if (params->check) {
+		dev_info(dev, "%s: performing check\n", dma_chan_name(chan));
+		err = dw_edma_test_check(descs[i].vaddr, dt_region->vaddr,
+					 params->buf_sz);
+		if (err)
+			dev_info(dev, "%s: check pass\n", dma_chan_name(chan));
+		else
+			dev_info(dev, "%s: check fail\n", dma_chan_name(chan));
+	}
+
+	/* Terminate any DMA operation, (fail safe) */
+	dmaengine_terminate_all(chan);
+
+err_descs:
+	for (i = 0; i < sgs && descs[i].vaddr && descs[i].paddr; i++)
+		dma_free_coherent(dev, descs[i].sz, (void *)descs[i].vaddr,
+				  descs[i].paddr);
+	devm_kfree(dev, descs);
+err_alloc_descs:
+	sg_free_table(sgt);
+err_sg_alloc_table:
+	kvfree(sgt);
+err_end:
+	thread->done = true;
+	wake_up(&thread_wait);
+
+	return 0;
+}
+
+static int dw_edma_test_cyclic(void *data)
+{
+	struct dw_edma_test_thread *thread = data;
+	struct dw_edma_test_done *done = &thread->test_done;
+	struct dw_edma_test_info *info = thread->info;
+	struct dw_edma_test_params *params = &info->params;
+	struct dma_chan	*chan = thread->chan;
+	struct device *dev = chan->device->dev;
+	struct dw_edma_region *dt_region = chan->private;
+	u32 f_prp_cnt = 0;
+	u32 f_sbt_cnt = 0;
+	u32 f_tm_cnt = 0;
+	u32 f_cpl_err = 0;
+	u32 f_cpl_bsy = 0;
+	dma_cookie_t cookie;
+	enum dma_status status;
+	struct dw_edma_region desc;
+	struct dma_slave_config	sconf;
+	struct dma_async_tx_descriptor *txdesc;
+	int err = 0;
+
+	set_freezable();
+	set_user_nice(current, 10);
+
+	desc.paddr = 0;
+	desc.sz = params->buf_seg;
+	desc.vaddr = (dma_addr_t)dma_alloc_coherent(dev, desc.sz, &desc.paddr,
+						    GFP_KERNEL);
+	if (!desc.vaddr || !desc.paddr) {
+		dev_err(dev, "%s: fail to allocate %u bytes\n",
+			dma_chan_name(chan), desc.sz);
+		goto err_end;
+	}
+
+	dev_dbg(dev, "%s: CPU: addr(v=%pa, p=%pa)\n",
+		dma_chan_name(chan), &desc.vaddr, &desc.paddr);
+
+	/* Dumps the first segment memory */
+	if (params->dump_mem)
+		dw_edma_test_dump(dev, thread->direction, params->dump_sz,
+				  dt_region, &desc);
+
+	/* Fills CPU memory with a known pattern */
+	if (params->pattern)
+		dw_edma_test_memset(desc.vaddr, params->buf_sz);
+
+	/*
+	 * Configures DMA channel according to the direction
+	 *  - flags
+	 *  - source and destination addresses
+	 */
+	if (thread->direction == DMA_DEV_TO_MEM) {
+		/* DMA_DEV_TO_MEM - WRITE - DMA_FROM_DEVICE */
+		dev_dbg(dev, "%s: DMA_DEV_TO_MEM - WRITE - DMA_FROM_DEVICE\n",
+			dma_chan_name(chan));
+
+		/* Endpoint memory */
+		sconf.src_addr = dt_region->paddr;
+		/* CPU memory */
+		sconf.dst_addr = desc.paddr;
+	} else {
+		/* DMA_MEM_TO_DEV - READ - DMA_TO_DEVICE */
+		dev_dbg(dev, "%s: DMA_MEM_TO_DEV - READ - DMA_TO_DEVICE\n",
+			dma_chan_name(chan));
+
+		/* CPU memory */
+		sconf.src_addr = desc.paddr;
+		/* Endpoint memory */
+		sconf.dst_addr = dt_region->paddr;
+	}
+
+	dmaengine_slave_config(chan, &sconf);
+	dev_dbg(dev, "%s: addr(physical) src=%pa, dst=%pa\n",
+		dma_chan_name(chan), &sconf.src_addr, &sconf.dst_addr);
+	dev_dbg(dev, "%s: len=%u bytes\n",
+		dma_chan_name(chan), params->buf_sz);
+
+	/*
+	 * Prepare the DMA channel for the transfer
+	 *  - provide buffer, size and number of repetitions
+	 *  - configure to trigger an interrupt after the transfer
+	 */
+	txdesc = dmaengine_prep_dma_cyclic(chan, desc.vaddr, desc.sz,
+					   params->repetitions,
+					   thread->direction,
+					   DMA_PREP_INTERRUPT);
+	if (!txdesc) {
+		dev_dbg(dev, "%s: dmaengine_prep_slave_sg\n",
+			dma_chan_name(chan));
+		f_prp_cnt++;
+		goto err_stats;
+	}
+
+	done->done = false;
+	txdesc->callback = dw_edma_test_callback;
+	txdesc->callback_param = done;
+	cookie = dmaengine_submit(txdesc);
+	if (dma_submit_error(cookie)) {
+		dev_dbg(dev, "%s: dma_submit_error\n", dma_chan_name(chan));
+		f_sbt_cnt++;
+		goto err_stats;
+	}
+
+	/* Start DMA transfer */
+	dma_async_issue_pending(chan);
+
+	/* Thread waits here for transfer completion or exists by timeout */
+	wait_event_freezable_timeout(thread->done_wait, done->done,
+				     msecs_to_jiffies(params->timeout));
+
+	/* Check DMA transfer status and act upon it */
+	status = dma_async_is_tx_complete(chan, cookie, NULL, NULL);
+	if (!done->done) {
+		dev_dbg(dev, "%s: timeout\n", dma_chan_name(chan));
+		f_tm_cnt++;
+	} else if (status != DMA_COMPLETE) {
+		if (status == DMA_ERROR) {
+			dev_dbg(dev, "%s:  completion error status\n",
+				dma_chan_name(chan));
+			f_cpl_err++;
+		} else {
+			dev_dbg(dev, "%s: completion busy status\n",
+				dma_chan_name(chan));
+			f_cpl_bsy++;
+		}
+	}
+
+err_stats:
+	/* Display some stats information */
+	if (f_prp_cnt || f_sbt_cnt || f_tm_cnt || f_cpl_err || f_cpl_bsy) {
+		dev_info(dev, "%s: test failed - dmaengine_prep_slave_sg=%u, dma_submit_error=%u, timeout=%u, completion error status=%u, completion busy status=%u\n",
+			 dma_chan_name(chan), f_prp_cnt, f_sbt_cnt,
+			 f_tm_cnt, f_cpl_err, f_cpl_bsy);
+	} else {
+		dev_info(dev, "%s: test passed\n", dma_chan_name(chan));
+	}
+
+	/* Dumps the first segment memory */
+	if (params->dump_mem)
+		dw_edma_test_dump(dev, thread->direction, params->dump_sz,
+				  dt_region, &desc);
+
+	/* Check if the data was correctly transfer */
+	if (params->check) {
+		dev_info(dev, "%s: performing check\n", dma_chan_name(chan));
+		err = dw_edma_test_check(desc.vaddr, dt_region->vaddr,
+					 params->buf_sz);
+		if (err)
+			dev_info(dev, "%s: check pass\n", dma_chan_name(chan));
+		else
+			dev_info(dev, "%s: check fail\n", dma_chan_name(chan));
+	}
+
+	/* Terminate any DMA operation, (fail safe) */
+	dmaengine_terminate_all(chan);
+
+	dma_free_coherent(dev, desc.sz, (void *)desc.vaddr, desc.paddr);
+err_end:
+	thread->done = true;
+	wake_up(&thread_wait);
+
+	return 0;
+}
+
+static int dw_edma_test_add_channel(struct dw_edma_test_info *info,
+				    struct dma_chan *chan,
+				    u32 channel)
+{
+	struct dw_edma_test_params *params = &info->params;
+	struct dw_edma_test_thread *thread;
+	struct dw_edma_test_chan *tchan;
+
+	tchan = kvmalloc(sizeof(*tchan), GFP_KERNEL);
+	if (!tchan)
+		return -ENOMEM;
+
+	tchan->chan = chan;
+
+	thread = kvzalloc(sizeof(*thread), GFP_KERNEL);
+	if (!thread) {
+		kvfree(tchan);
+		return -ENOMEM;
+	}
+
+	thread->info = info;
+	thread->chan = tchan->chan;
+	switch (channel) {
+	case EDMA_CH_WR:
+		thread->direction = DMA_DEV_TO_MEM;
+		break;
+	case EDMA_CH_RD:
+		thread->direction = DMA_MEM_TO_DEV;
+		break;
+	default:
+		kvfree(tchan);
+		return -EPERM;
+	}
+	thread->test_done.wait = &thread->done_wait;
+	init_waitqueue_head(&thread->done_wait);
+
+	if (!params->repetitions)
+		thread->task = kthread_create(dw_edma_test_sg, thread, "%s",
+					      dma_chan_name(chan));
+	else
+		thread->task = kthread_create(dw_edma_test_cyclic, thread, "%s",
+					      dma_chan_name(chan));
+
+	if (IS_ERR(thread->task)) {
+		pr_err("failed to create thread %s\n", dma_chan_name(chan));
+		kvfree(tchan);
+		kvfree(thread);
+		return -EPERM;
+	}
+
+	tchan->thread = thread;
+	dev_dbg(chan->device->dev, "add thread %s\n", dma_chan_name(chan));
+	list_add_tail(&tchan->node, &info->channels);
+
+	return 0;
+}
+
+static void dw_edma_test_del_channel(struct dw_edma_test_chan *tchan)
+{
+	struct dw_edma_test_thread *thread = tchan->thread;
+
+	kthread_stop(thread->task);
+	dev_dbg(tchan->chan->device->dev, "thread %s exited\n",
+		thread->task->comm);
+	put_task_struct(thread->task);
+	kvfree(thread);
+	tchan->thread = NULL;
+
+	dmaengine_terminate_all(tchan->chan);
+	kvfree(tchan);
+}
+
+static void dw_edma_test_run_channel(struct dw_edma_test_chan *tchan)
+{
+	struct dw_edma_test_thread *thread = tchan->thread;
+
+	get_task_struct(thread->task);
+	wake_up_process(thread->task);
+	dev_dbg(tchan->chan->device->dev, "thread %s started\n",
+		thread->task->comm);
+}
+
+static bool dw_edma_test_filter(struct dma_chan *chan, void *filter)
+{
+	if (strcmp(dev_name(chan->device->dev), EDMA_TEST_DEVICE_NAME) ||
+	    strcmp(dma_chan_name(chan), filter))
+		return false;
+
+	return true;
+}
+
+static void dw_edma_test_thread_create(struct dw_edma_test_info *info)
+{
+	struct dw_edma_test_params *params = &info->params;
+	struct dma_chan *chan;
+	struct dw_edma_region *dt_region;
+	dma_cap_mask_t mask;
+	char filter[20];
+	int i, j;
+
+	params->num_threads[EDMA_CH_WR] = min_t(u32,
+						EDMA_TEST_MAX_THREADS_CHANNEL,
+						wr_threads);
+	params->num_threads[EDMA_CH_RD] = min_t(u32,
+						EDMA_TEST_MAX_THREADS_CHANNEL,
+						rd_threads);
+	params->repetitions = repetitions;
+	params->timeout = timeout;
+	params->pattern = pattern;
+	params->dump_mem = dump_mem;
+	params->dump_sz = dump_sz;
+	params->check = check;
+	params->buf_sz = buf_sz;
+	params->buf_seg = min(buf_seg, buf_sz);
+
+#ifndef CONFIG_CMA_SIZE_MBYTES
+	pr_warn("CMA not present/activated! Contiguous Memory may fail to be allocted\n");
+#endif
+
+	pr_info("Number of write threads = %u\n", wr_threads);
+	pr_info("Number of read threads = %u\n", rd_threads);
+	if (!params->repetitions)
+		pr_info("Scatter-gather mode\n");
+	else
+		pr_info("Cyclic mode (repetitions per thread %u)\n",
+			params->repetitions);
+	pr_info("Timeout = %u ms\n", params->timeout);
+	pr_info("Use pattern = %s\n", params->pattern ? "true" : "false");
+	pr_info("Dump memory = %s\n", params->dump_mem ? "true" : "false");
+	pr_info("Perform check = %s\n", params->check ? "true" : "false");
+
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_SLAVE, mask);
+	dma_cap_set(DMA_CYCLIC, mask);
+
+	for (i = 0; i < EDMA_CH_END; i++) {
+		for (j = 0; j < params->num_threads[i]; j++) {
+			snprintf(filter, sizeof(filter),
+				 EDMA_TEST_CHANNEL_NAME, i, j);
+
+			chan = dma_request_channel(mask, dw_edma_test_filter,
+						   filter);
+			if (!chan)
+				continue;
+
+			if (dw_edma_test_add_channel(info, chan, i)) {
+				dma_release_channel(chan);
+				pr_err("error adding %s channel thread %u\n",
+				       channel_name[i], j);
+				continue;
+			}
+
+			dt_region = chan->private;
+			params->buf_sz = min(params->buf_sz, dt_region->sz);
+			params->buf_seg = min(params->buf_seg, dt_region->sz);
+		}
+	}
+}
+
+static void dw_edma_test_thread_run(struct dw_edma_test_info *info)
+{
+	struct dw_edma_test_chan *tchan, *_tchan;
+
+	list_for_each_entry_safe(tchan, _tchan, &info->channels, node)
+		dw_edma_test_run_channel(tchan);
+}
+
+static void dw_edma_test_thread_stop(struct dw_edma_test_info *info)
+{
+	struct dw_edma_test_chan *tchan, *_tchan;
+	struct dma_chan *chan;
+
+	list_for_each_entry_safe(tchan, _tchan, &info->channels, node) {
+		list_del(&tchan->node);
+		chan = tchan->chan;
+		dw_edma_test_del_channel(tchan);
+		dma_release_channel(chan);
+		pr_info("deleted channel %s\n", dma_chan_name(chan));
+	}
+}
+
+static bool dw_edma_test_is_thread_run(struct dw_edma_test_info *info)
+{
+	struct dw_edma_test_chan *tchan;
+
+	list_for_each_entry(tchan, &info->channels, node) {
+		struct dw_edma_test_thread *thread = tchan->thread;
+
+		if (!thread->done)
+			return true;
+	}
+
+	return false;
+}
+
+static void dw_edma_test_thread_restart(struct dw_edma_test_info *info,
+					bool run)
+{
+	if (!info->init)
+		return;
+
+	dw_edma_test_thread_stop(info);
+	dw_edma_test_thread_create(info);
+	dw_edma_test_thread_run(info);
+}
+
+static int dw_edma_test_run_get(char *val, const struct kernel_param *kp)
+{
+	struct dw_edma_test_info *info = &test_info;
+
+	mutex_lock(&info->lock);
+
+	run_test = dw_edma_test_is_thread_run(info);
+	if (!run_test)
+		dw_edma_test_thread_stop(info);
+
+	mutex_unlock(&info->lock);
+
+	return param_get_bool(val, kp);
+}
+
+static int dw_edma_test_run_set(const char *val, const struct kernel_param *kp)
+{
+	struct dw_edma_test_info *info = &test_info;
+	int ret;
+
+	mutex_lock(&info->lock);
+
+	ret = param_set_bool(val, kp);
+	if (ret)
+		goto err_set;
+
+	if (dw_edma_test_is_thread_run(info))
+		ret = -EBUSY;
+	else if (run_test)
+		dw_edma_test_thread_restart(info, run_test);
+
+err_set:
+	mutex_unlock(&info->lock);
+
+	return ret;
+}
+
+static int __init dw_edma_test_init(void)
+{
+	struct dw_edma_test_info *info = &test_info;
+
+	if (run_test) {
+		mutex_lock(&info->lock);
+		dw_edma_test_thread_create(info);
+		dw_edma_test_thread_run(info);
+		mutex_unlock(&info->lock);
+	}
+
+	wait_event(thread_wait, !dw_edma_test_is_thread_run(info));
+
+	info->init = true;
+
+	return 0;
+}
+late_initcall(dw_edma_test_init);
+
+static void __exit dw_edma_test_exit(void)
+{
+	struct dw_edma_test_info *info = &test_info;
+
+	mutex_lock(&info->lock);
+	dw_edma_test_thread_stop(info);
+	mutex_unlock(&info->lock);
+}
+module_exit(dw_edma_test_exit);
+
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("Synopsys DesignWare eDMA test driver");
+MODULE_AUTHOR("Gustavo Pimentel <gustavo.pimentel@synopsys.com>");
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [RFC v3 5/7] dmaengine: Add Synopsys eDMA IP PCIe glue-logic
  2019-01-11 18:33 ` [RFC v3 5/7] dmaengine: Add Synopsys eDMA IP PCIe glue-logic Gustavo Pimentel
@ 2019-01-11 19:47   ` Andy Shevchenko
  2019-01-14 11:38     ` Gustavo Pimentel
  0 siblings, 1 reply; 39+ messages in thread
From: Andy Shevchenko @ 2019-01-11 19:47 UTC (permalink / raw)
  To: Gustavo Pimentel
  Cc: linux-pci, dmaengine, Vinod Koul, Dan Williams, Eugeniy Paltsev,
	Russell King, Niklas Cassel, Lorenzo Pieralisi, Joao Pinto,
	Jose Abreu, Luis Oliveira, Vitor Soares, Nelson Costa,
	Pedro Sousa

On Fri, Jan 11, 2019 at 07:33:41PM +0100, Gustavo Pimentel wrote:
> Synopsys eDMA IP is normally distributed along with Synopsys PCIe
> EndPoint IP (depends of the use and licensing agreement).
> 
> This IP requires some basic configurations, such as:
>  - eDMA registers BAR
>  - eDMA registers offset
>  - eDMA registers size
>  - eDMA linked list memory BAR
>  - eDMA linked list memory offset
>  - eDMA linked list memory sze
>  - eDMA data memory BAR
>  - eDMA data memory offset
>  - eDMA data memory size
>  - eDMA version
>  - eDMA mode
>  - IRQs available for eDMA
> 
> As a working example, PCIe glue-logic will attach to a Synopsys PCIe
> EndPoint IP prototype kit (Vendor ID = 0x16c3, Device ID = 0xedda),
> which has built-in an eDMA IP with this default configuration:
>  - eDMA registers BAR = 0
>  - eDMA registers offset = 0x00001000 (4 Kbytes)
>  - eDMA registers size = 0x00002000 (8 Kbytes)
>  - eDMA linked list memory BAR = 2
>  - eDMA linked list memory offset = 0x00000000 (0 Kbytes)
>  - eDMA linked list memory size = 0x00800000 (8 Mbytes)
>  - eDMA data memory BAR = 2
>  - eDMA data memory offset = 0x00800000 (8 Mbytes)
>  - eDMA data memory size = 0x03800000 (56 Mbytes)
>  - eDMA version = 0
>  - eDMA mode = EDMA_MODE_UNROLL
>  - IRQs = 1
> 
> This driver can be compile as built-in or external module in kernel.
> 
> To enable this driver just select DW_EDMA_PCIE option in kernel
> configuration, however it requires and selects automatically DW_EDMA
> option too.
> 

> Changes:
> RFC v1->RFC v2:

Changes go after '--- ' line.

>  - Replace comments // (C99 style) by /**/
>  - Merge two pcim_iomap_regions() calls into just one call
>  - Remove pci_try_set_mwi() call
>  - Replace some dev_info() by dev_dbg() to reduce *noise*
>  - Remove pci_name(pdev) call after being call dw_edma_remove()
>  - Remove all power management support
>  - Fix the headers of the .c and .h files according to the most recent
>    convention
>  - Fix errors and checks pointed out by checkpatch with --strict option
>  - Replace patch small description tag from dma by dmaengine
> RFC v2->RFC v3:
>  - Fix printk variable of phys_addr_t type
>  - Fix missing variable initialization (chan->configured)
>  - Change linked list size to 512 Kbytes
>  - Add data memory information
>  - Add register size information
>  - Add comments or improve existing ones
>  - Add possibility to work with multiple IRQs feature
>  - Replace MSI and MSI-X enable condition by pci_dev_msi_enabled()
>  - Replace code to acquire MSI(-X) address and data by
>    get_cached_msi_msg()

> +enum dw_edma_pcie_bar {
> +	BAR_0,
> +	BAR_1,
> +	BAR_2,
> +	BAR_3,
> +	BAR_4,
> +	BAR_5
> +};

pci-epf.h has this.
Why duplicate?


What else is being duplicated from PCI core?

> +static bool disable_msix;
> +module_param(disable_msix, bool, 0644);
> +MODULE_PARM_DESC(disable_msix, "Disable MSI-X interrupts");

Why?!
We are no allow new module parameters without very strong arguments.

> +
> +static int dw_edma_pcie_probe(struct pci_dev *pdev,
> +			      const struct pci_device_id *pid)
> +{
> +	const struct dw_edma_pcie_data *pdata = (void *)pid->driver_data;
> +	struct device *dev = &pdev->dev;
> +	struct dw_edma_chip *chip;
> +	struct dw_edma *dw;
> +	unsigned int irq_flags = PCI_IRQ_MSI;
> +	int err, nr_irqs, i;
> +

> +	if (!pdata) {
> +		dev_err(dev, "%s missing data structure\n", pci_name(pdev));
> +		return -EFAULT;
> +	}

Useless check.

> +
> +	/* Enable PCI device */
> +	err = pcim_enable_device(pdev);
> +	if (err) {
> +		dev_err(dev, "%s enabling device failed\n", pci_name(pdev));
> +		return err;
> +	}
> +
> +	/* Mapping PCI BAR regions */
> +	err = pcim_iomap_regions(pdev, BIT(pdata->rg_bar) |
> +				       BIT(pdata->ll_bar) |
> +				       BIT(pdata->dt_bar),
> +				 pci_name(pdev));
> +	if (err) {

> +		dev_err(dev, "%s eDMA BAR I/O remapping failed\n",
> +			pci_name(pdev));

Isn't it pci_err() ?
Same comment for the rest similar cases above and below.

> +		return err;
> +	}
> +
> +	pci_set_master(pdev);
> +
> +	nr_irqs = pci_alloc_irq_vectors(pdev, 1, pdata->irqs_cnt, irq_flags);
> +	if (nr_irqs < 1) {
> +		dev_err(dev, "%s failed to alloc IRQ vector (Number of IRQs=%u)\n",
> +			pci_name(pdev), nr_irqs);
> +		return -EPERM;
> +	}
> +
> +	/* Data structure initialization */
> +	chip->dw = dw;
> +	chip->dev = dev;
> +	chip->id = pdev->devfn;
> +	chip->irq = pdev->irq;
> +

> +	if (!pcim_iomap_table(pdev))
> +		return -EACCES;

Never happen condition. Thus useless.

> +	dev_info(dev, "DesignWare eDMA PCIe driver loaded completely\n");

Useless.

> +}
> +
> +static void dw_edma_pcie_remove(struct pci_dev *pdev)
> +{
> +	struct dw_edma_chip *chip = pci_get_drvdata(pdev);
> +	struct device *dev = &pdev->dev;
> +	int err;
> +
> +	/* Stopping eDMA driver */
> +	err = dw_edma_remove(chip);
> +	if (err)
> +		dev_warn(dev, "can't remove device properly: %d\n", err);
> +
> +	/* Freeing IRQs */
> +	pci_free_irq_vectors(pdev);
> +
> +	dev_info(dev, "DesignWare eDMA PCIe driver unloaded completely\n");

Ditto.

> +}

> +MODULE_DEVICE_TABLE(pci, dw_edma_pcie_id_table);
> +
> +static struct pci_driver dw_edma_pcie_driver = {
> +	.name		= "dw-edma-pcie",
> +	.id_table	= dw_edma_pcie_id_table,
> +	.probe		= dw_edma_pcie_probe,
> +	.remove		= dw_edma_pcie_remove,

Power management?

> +};

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC v3 7/7] dmaengine: Add Synopsys eDMA IP test and sample driver
  2019-01-11 18:33 ` [RFC v3 7/7] dmaengine: Add Synopsys eDMA IP test and sample driver Gustavo Pimentel
@ 2019-01-11 19:48   ` Andy Shevchenko
  2019-01-14 11:44     ` Gustavo Pimentel
  2019-01-16 10:45   ` Jose Abreu
  1 sibling, 1 reply; 39+ messages in thread
From: Andy Shevchenko @ 2019-01-11 19:48 UTC (permalink / raw)
  To: Gustavo Pimentel
  Cc: linux-pci, dmaengine, Vinod Koul, Dan Williams, Eugeniy Paltsev,
	Russell King, Niklas Cassel, Joao Pinto, Jose Abreu,
	Luis Oliveira, Vitor Soares, Nelson Costa, Pedro Sousa

On Fri, Jan 11, 2019 at 07:33:43PM +0100, Gustavo Pimentel wrote:
> Add a Synopsys eDMA IP test and sample driver to be used for testing
> purposes and also as a reference for any developer who needs to
> implement and use the Synopsys eDMA.
> 
> This driver can be compiled as a built-in or as an external kernel module.
> 
> To enable this driver, select the DW_EDMA_TEST option in the kernel
> configuration; it depends on, and automatically selects, the DW_EDMA
> option as well.
> 

Hmm... This doesn't explain what's wrong with dmatest module.

-- 
With Best Regards,
Andy Shevchenko




* Re: [RFC v3 5/7] dmaengine: Add Synopsys eDMA IP PCIe glue-logic
  2019-01-11 19:47   ` Andy Shevchenko
@ 2019-01-14 11:38     ` Gustavo Pimentel
  2019-01-15  5:43       ` Andy Shevchenko
  0 siblings, 1 reply; 39+ messages in thread
From: Gustavo Pimentel @ 2019-01-14 11:38 UTC (permalink / raw)
  To: Andy Shevchenko, Gustavo Pimentel
  Cc: linux-pci, dmaengine, Vinod Koul, Dan Williams, Eugeniy Paltsev,
	Russell King, Niklas Cassel, Lorenzo Pieralisi, Joao Pinto,
	Jose Abreu, Luis Oliveira, Vitor Soares, Nelson Costa,
	Pedro Sousa

On 11/01/2019 19:47, Andy Shevchenko wrote:
> On Fri, Jan 11, 2019 at 07:33:41PM +0100, Gustavo Pimentel wrote:
>> Synopsys eDMA IP is normally distributed along with Synopsys PCIe
>> EndPoint IP (depends of the use and licensing agreement).
>>
>> This IP requires some basic configurations, such as:
>>  - eDMA registers BAR
>>  - eDMA registers offset
>>  - eDMA registers size
>>  - eDMA linked list memory BAR
>>  - eDMA linked list memory offset
>>  - eDMA linked list memory size
>>  - eDMA data memory BAR
>>  - eDMA data memory offset
>>  - eDMA data memory size
>>  - eDMA version
>>  - eDMA mode
>>  - IRQs available for eDMA
>>
>> As a working example, PCIe glue-logic will attach to a Synopsys PCIe
>> EndPoint IP prototype kit (Vendor ID = 0x16c3, Device ID = 0xedda),
>> which has built-in an eDMA IP with this default configuration:
>>  - eDMA registers BAR = 0
>>  - eDMA registers offset = 0x00001000 (4 Kbytes)
>>  - eDMA registers size = 0x00002000 (8 Kbytes)
>>  - eDMA linked list memory BAR = 2
>>  - eDMA linked list memory offset = 0x00000000 (0 Kbytes)
>>  - eDMA linked list memory size = 0x00800000 (8 Mbytes)
>>  - eDMA data memory BAR = 2
>>  - eDMA data memory offset = 0x00800000 (8 Mbytes)
>>  - eDMA data memory size = 0x03800000 (56 Mbytes)
>>  - eDMA version = 0
>>  - eDMA mode = EDMA_MODE_UNROLL
>>  - IRQs = 1
>>
>> This driver can be compiled as a built-in or as an external kernel module.
>>
>> To enable this driver, select the DW_EDMA_PCIE option in the kernel
>> configuration; it depends on, and automatically selects, the DW_EDMA
>> option as well.
>>
> 
>> Changes:
>> RFC v1->RFC v2:
> 
> Changes go after '--- ' line.

At the last Linux Plumbers Conference there were some subsystem maintainers who
asked that the change history be included in the description as a way of not
losing the previous work. That's why I put it before the '---' line, but either
way is fine with me; I can put it after the '---' line.

> 
>>  - Replace comments // (C99 style) by /**/
>>  - Merge two pcim_iomap_regions() calls into just one call
>>  - Remove pci_try_set_mwi() call
>>  - Replace some dev_info() by dev_dbg() to reduce *noise*
>>  - Remove pci_name(pdev) call after being call dw_edma_remove()
>>  - Remove all power management support
>>  - Fix the headers of the .c and .h files according to the most recent
>>    convention
>>  - Fix errors and checks pointed out by checkpatch with --strict option
>>  - Replace patch small description tag from dma by dmaengine
>> RFC v2->RFC v3:
>>  - Fix printk variable of phys_addr_t type
>>  - Fix missing variable initialization (chan->configured)
>>  - Change linked list size to 512 Kbytes
>>  - Add data memory information
>>  - Add register size information
>>  - Add comments or improve existing ones
>>  - Add possibility to work with multiple IRQs feature
>>  - Replace MSI and MSI-X enable condition by pci_dev_msi_enabled()
>>  - Replace code to acquire MSI(-X) address and data by
>>    get_cached_msi_msg()
> 
>> +enum dw_edma_pcie_bar {
>> +	BAR_0,
>> +	BAR_1,
>> +	BAR_2,
>> +	BAR_3,
>> +	BAR_4,
>> +	BAR_5
>> +};
> 
> pci-epf.h has this.
> Why duplicate?

I can use that header, sure. Thanks.

> 
> 
> What else is being duplicated from PCI core?
> 
>> +static bool disable_msix;
>> +module_param(disable_msix, bool, 0644);
>> +MODULE_PARM_DESC(disable_msix, "Disable MSI-X interrupts");
> 
> Why?!
> We do not allow new module parameters without very strong arguments.

Since this is a reference driver that might be used to test customized HW
solutions, I added this parameter to make it possible to force MSI binding
while testing. This is needed especially when whoever tests this solution
has a Root Complex with both features available (MSI and MSI-X), because
the kernel will always prefer MSI-X binding (assuming that the EP also has
both features available).

> 
>> +
>> +static int dw_edma_pcie_probe(struct pci_dev *pdev,
>> +			      const struct pci_device_id *pid)
>> +{
>> +	const struct dw_edma_pcie_data *pdata = (void *)pid->driver_data;
>> +	struct device *dev = &pdev->dev;
>> +	struct dw_edma_chip *chip;
>> +	struct dw_edma *dw;
>> +	unsigned int irq_flags = PCI_IRQ_MSI;
>> +	int err, nr_irqs, i;
>> +
> 
>> +	if (!pdata) {
>> +		dev_err(dev, "%s missing data structure\n", pci_name(pdev));
>> +		return -EFAULT;
>> +	}
> 
> Useless check.

Why? It's just a precaution; isn't it good practice to always plan for the
worst case?

> 
>> +
>> +	/* Enable PCI device */
>> +	err = pcim_enable_device(pdev);
>> +	if (err) {
>> +		dev_err(dev, "%s enabling device failed\n", pci_name(pdev));
>> +		return err;
>> +	}
>> +
>> +	/* Mapping PCI BAR regions */
>> +	err = pcim_iomap_regions(pdev, BIT(pdata->rg_bar) |
>> +				       BIT(pdata->ll_bar) |
>> +				       BIT(pdata->dt_bar),
>> +				 pci_name(pdev));
>> +	if (err) {
> 
>> +		dev_err(dev, "%s eDMA BAR I/O remapping failed\n",
>> +			pci_name(pdev));
> 
> Isn't it pci_err() ?
> Same comment for the rest similar cases above and below.

Ok, I'll replace all the dev_* functions in this file.
Thanks.

> 
>> +		return err;
>> +	}
>> +
>> +	pci_set_master(pdev);
>> +
>> +	nr_irqs = pci_alloc_irq_vectors(pdev, 1, pdata->irqs_cnt, irq_flags);
>> +	if (nr_irqs < 1) {
>> +		dev_err(dev, "%s failed to alloc IRQ vector (Number of IRQs=%u)\n",
>> +			pci_name(pdev), nr_irqs);
>> +		return -EPERM;
>> +	}
>> +
>> +	/* Data structure initialization */
>> +	chip->dw = dw;
>> +	chip->dev = dev;
>> +	chip->id = pdev->devfn;
>> +	chip->irq = pdev->irq;
>> +
> 
>> +	if (!pcim_iomap_table(pdev))
>> +		return -EACCES;
> 
> Never happen condition. Thus useless.

pcim_iomap_table() can return NULL in case of an allocation failure. Besides
that, isn't it good practice to always plan for the worst case?

> 
>> +	dev_info(dev, "DesignWare eDMA PCIe driver loaded completely\n");
> 
> Useless.

It's helpful for bring-up; I can change it to dev_dbg().

> 
>> +}
>> +
>> +static void dw_edma_pcie_remove(struct pci_dev *pdev)
>> +{
>> +	struct dw_edma_chip *chip = pci_get_drvdata(pdev);
>> +	struct device *dev = &pdev->dev;
>> +	int err;
>> +
>> +	/* Stopping eDMA driver */
>> +	err = dw_edma_remove(chip);
>> +	if (err)
>> +		dev_warn(dev, "can't remove device properly: %d\n", err);
>> +
>> +	/* Freeing IRQs */
>> +	pci_free_irq_vectors(pdev);
>> +
>> +	dev_info(dev, "DesignWare eDMA PCIe driver unloaded completely\n");
> 
> Ditto.

It's helpful for bring-up; I can change it to dev_dbg().

> 
>> +}
> 
>> +MODULE_DEVICE_TABLE(pci, dw_edma_pcie_id_table);
>> +
>> +static struct pci_driver dw_edma_pcie_driver = {
>> +	.name		= "dw-edma-pcie",
>> +	.id_table	= dw_edma_pcie_id_table,
>> +	.probe		= dw_edma_pcie_probe,
>> +	.remove		= dw_edma_pcie_remove,
> 
> Power management?

I've removed the power management for now, since with my current setup I don't
have the necessary conditions to test it. I prefer not submitting that code for now.

> 
>> +};
> 

Thanks for the inputs Andy! They have been pretty good!

Regards,
Gustavo


* Re: [RFC v3 7/7] dmaengine: Add Synopsys eDMA IP test and sample driver
  2019-01-11 19:48   ` Andy Shevchenko
@ 2019-01-14 11:44     ` Gustavo Pimentel
  2019-01-15  5:45       ` Andy Shevchenko
  0 siblings, 1 reply; 39+ messages in thread
From: Gustavo Pimentel @ 2019-01-14 11:44 UTC (permalink / raw)
  To: Andy Shevchenko, Gustavo Pimentel
  Cc: linux-pci, dmaengine, Vinod Koul, Dan Williams, Eugeniy Paltsev,
	Russell King, Niklas Cassel, Joao Pinto, Jose Abreu,
	Luis Oliveira, Vitor Soares, Nelson Costa, Pedro Sousa

On 11/01/2019 19:48, Andy Shevchenko wrote:
> On Fri, Jan 11, 2019 at 07:33:43PM +0100, Gustavo Pimentel wrote:
>> Add a Synopsys eDMA IP test and sample driver to be used for testing
>> purposes and also as a reference for any developer who needs to
>> implement and use the Synopsys eDMA.
>>
>> This driver can be compiled as a built-in or as an external kernel module.
>>
>> To enable this driver, select the DW_EDMA_TEST option in the kernel
>> configuration; it depends on, and automatically selects, the DW_EDMA
>> option as well.
>>
> 
> Hmm... This doesn't explain what's wrong with dmatest module.

There isn't anything wrong with the dmatest module, that I know of. In the
beginning I was planning to use it; however, it only works with MEM_TO_MEM
transfers. That's why I created a similar module, but for MEM_TO_DEV and
DEV_TO_MEM with scatter-gather and cyclic transfer types, for my use case.
I don't know if it can be applied to other cases; if that is feasible, I'm
glad to share it.

> 



* Re: [RFC v3 4/7] PCI: Add Synopsys endpoint EDDA Device id
  2019-01-11 18:33 ` [RFC v3 4/7] PCI: Add Synopsys endpoint EDDA Device id Gustavo Pimentel
@ 2019-01-14 14:41   ` Bjorn Helgaas
  0 siblings, 0 replies; 39+ messages in thread
From: Bjorn Helgaas @ 2019-01-14 14:41 UTC (permalink / raw)
  To: Gustavo Pimentel
  Cc: linux-pci, dmaengine, Kishon Vijay Abraham I, Lorenzo Pieralisi,
	Niklas Cassel, Joao Pinto, Jose Abreu, Luis Oliveira,
	Vitor Soares, Nelson Costa, Pedro Sousa

On Fri, Jan 11, 2019 at 07:33:40PM +0100, Gustavo Pimentel wrote:
> Create and add Synopsys Endpoint EDDA Device id to PCI id list, since
> this id is now being used by two different drivers (pci_endpoint_test.ko
> and dw-edma-pcie.ko).

Nit if you update this series for some other reason: s/id/ID/ in
subject and changelog.  "Id" is an English word, but it's not
applicable in this context, so using "ID" makes it clear that we don't
mean the psychoanalysis term.

Bjorn


* Re: [RFC v3 5/7] dmaengine: Add Synopsys eDMA IP PCIe glue-logic
  2019-01-14 11:38     ` Gustavo Pimentel
@ 2019-01-15  5:43       ` Andy Shevchenko
  2019-01-15 12:48         ` Gustavo Pimentel
  0 siblings, 1 reply; 39+ messages in thread
From: Andy Shevchenko @ 2019-01-15  5:43 UTC (permalink / raw)
  To: Gustavo Pimentel
  Cc: linux-pci, dmaengine, Vinod Koul, Dan Williams, Eugeniy Paltsev,
	Russell King, Niklas Cassel, Lorenzo Pieralisi, Joao Pinto,
	Jose Abreu, Luis Oliveira, Vitor Soares, Nelson Costa,
	Pedro Sousa

On Mon, Jan 14, 2019 at 11:38:02AM +0000, Gustavo Pimentel wrote:
> On 11/01/2019 19:47, Andy Shevchenko wrote:
> > On Fri, Jan 11, 2019 at 07:33:41PM +0100, Gustavo Pimentel wrote:

> >> +static bool disable_msix;
> >> +module_param(disable_msix, bool, 0644);
> >> +MODULE_PARM_DESC(disable_msix, "Disable MSI-X interrupts");
> > 
> > Why?!
> > We do not allow new module parameters without very strong arguments.
> 
> Since this is a reference driver that might be used to test customized HW
> solutions, I added this parameter to make it possible to force MSI binding
> while testing. This is needed especially when whoever tests this solution
> has a Root Complex with both features available (MSI and MSI-X), because
> the kernel will always prefer MSI-X binding (assuming that the EP also has
> both features available).

Yes, you may do it for testing purposes, but it doesn't fit the kernel standards.

> >> +	if (!pdata) {
> >> +		dev_err(dev, "%s missing data structure\n", pci_name(pdev));
> >> +		return -EFAULT;
> >> +	}
> > 
> > Useless check.
> 
> Why? It's just a precaution; isn't it good practice to always plan for the
> worst case?

You can just make pdata an implicit requirement rather than doing this
useless check. I don't believe it would make sense to have a NULL pdata for
the driver, since it wouldn't be functional anyhow.

> >> +	/* Mapping PCI BAR regions */
> >> +	err = pcim_iomap_regions(pdev, BIT(pdata->rg_bar) |
> >> +				       BIT(pdata->ll_bar) |
> >> +				       BIT(pdata->dt_bar),
> >> +				 pci_name(pdev));
> >> +	if (err) {

> >> +		return err;
> >> +	}

> >> +	if (!pcim_iomap_table(pdev))
> >> +		return -EACCES;
> > 
> > Never happen condition. Thus useless.
> 
> pcim_iomap_table() can return NULL in case of an allocation failure. Besides
> that, isn't it good practice to always plan for the worst case?

No, it can't under the conditions you have in the code. See the lines I left
above. If pcim_iomap_regions() finished successfully...

> >> +	dev_info(dev, "DesignWare eDMA PCIe driver loaded completely\n");
> > 
> > Useless.
> 
> It's helpful for bring-up; I can change it to dev_dbg().

It just shows that someone didn't use the existing tools and features. This message and similar ones are useless.
Hint: initcall_debug.

> >> +	dev_info(dev, "DesignWare eDMA PCIe driver unloaded completely\n");
> > 
> > Ditto.
> 
> It's helpful for bring-up; I can change it to dev_dbg().

Ditto.

-- 
With Best Regards,
Andy Shevchenko




* Re: [RFC v3 7/7] dmaengine: Add Synopsys eDMA IP test and sample driver
  2019-01-14 11:44     ` Gustavo Pimentel
@ 2019-01-15  5:45       ` Andy Shevchenko
  2019-01-15 13:02         ` Gustavo Pimentel
  0 siblings, 1 reply; 39+ messages in thread
From: Andy Shevchenko @ 2019-01-15  5:45 UTC (permalink / raw)
  To: Gustavo Pimentel
  Cc: linux-pci, dmaengine, Vinod Koul, Dan Williams, Eugeniy Paltsev,
	Russell King, Niklas Cassel, Joao Pinto, Jose Abreu,
	Luis Oliveira, Vitor Soares, Nelson Costa, Pedro Sousa

On Mon, Jan 14, 2019 at 11:44:22AM +0000, Gustavo Pimentel wrote:
> On 11/01/2019 19:48, Andy Shevchenko wrote:
> > On Fri, Jan 11, 2019 at 07:33:43PM +0100, Gustavo Pimentel wrote:
> >> Add a Synopsys eDMA IP test and sample driver to be used for testing
> >> purposes and also as a reference for any developer who needs to
> >> implement and use the Synopsys eDMA.
> >>
> >> This driver can be compiled as a built-in or as an external kernel module.
> >>
> >> To enable this driver, select the DW_EDMA_TEST option in the kernel
> >> configuration; it depends on, and automatically selects, the DW_EDMA
> >> option as well.
> >>
> > 
> > Hmm... This doesn't explain what's wrong with dmatest module.
> 
> There isn't anything wrong with the dmatest module, that I know of. In the
> beginning I was planning to use it; however, it only works with MEM_TO_MEM
> transfers. That's why I created a similar module, but for MEM_TO_DEV and
> DEV_TO_MEM with scatter-gather and cyclic transfer types, for my use case.
> I don't know if it can be applied to other cases; if that is feasible, I'm
> glad to share it.

What I'm trying to say is that it would be nice for the dmatest driver to have
such a capability.

-- 
With Best Regards,
Andy Shevchenko




* Re: [RFC v3 5/7] dmaengine: Add Synopsys eDMA IP PCIe glue-logic
  2019-01-15  5:43       ` Andy Shevchenko
@ 2019-01-15 12:48         ` Gustavo Pimentel
  2019-01-19 15:45           ` Andy Shevchenko
  0 siblings, 1 reply; 39+ messages in thread
From: Gustavo Pimentel @ 2019-01-15 12:48 UTC (permalink / raw)
  To: Andy Shevchenko, Gustavo Pimentel
  Cc: linux-pci, dmaengine, Vinod Koul, Dan Williams, Eugeniy Paltsev,
	Russell King, Niklas Cassel, Lorenzo Pieralisi, Joao Pinto,
	Jose Abreu, Luis Oliveira, Vitor Soares, Nelson Costa,
	Pedro Sousa

On 15/01/2019 05:43, Andy Shevchenko wrote:
> On Mon, Jan 14, 2019 at 11:38:02AM +0000, Gustavo Pimentel wrote:
>> On 11/01/2019 19:47, Andy Shevchenko wrote:
>>> On Fri, Jan 11, 2019 at 07:33:41PM +0100, Gustavo Pimentel wrote:
> 
>>>> +static bool disable_msix;
>>>> +module_param(disable_msix, bool, 0644);
>>>> +MODULE_PARM_DESC(disable_msix, "Disable MSI-X interrupts");
>>>
>>> Why?!
>>> We do not allow new module parameters without very strong arguments.
>>
>> Since this is a reference driver that might be used to test customized HW
>> solutions, I added this parameter to make it possible to force MSI binding
>> while testing. This is needed especially when whoever tests this solution
>> has a Root Complex with both features available (MSI and MSI-X), because
>> the kernel will always prefer MSI-X binding (assuming that the EP also has
>> both features available).
> 
> Yes, you may do it for testing purposes, but it doesn't fit the kernel standards.

Ok, but how should I proceed? May I keep it, or should I replace it with
another way of doing this? If so, how?
As I said, it is intended to be used only for this test case; in normal
operation the parameter should always be false.

> 
>>>> +	if (!pdata) {
>>>> +		dev_err(dev, "%s missing data structure\n", pci_name(pdev));
>>>> +		return -EFAULT;
>>>> +	}
>>>
>>> Useless check.
>>
>> Why? It's just a precaution; isn't it good practice to always plan for the
>> worst case?
> 
> You can just make pdata an implicit requirement rather than doing this

Ok, how can I do that? What should I add to the code to enforce it?

> useless check. I don't believe it would make sense to have a NULL pdata for
> the driver, since it wouldn't be functional anyhow.

Yes, you're right, without pdata the driver can't do anything.

> 
>>>> +	/* Mapping PCI BAR regions */
>>>> +	err = pcim_iomap_regions(pdev, BIT(pdata->rg_bar) |
>>>> +				       BIT(pdata->ll_bar) |
>>>> +				       BIT(pdata->dt_bar),
>>>> +				 pci_name(pdev));
>>>> +	if (err) {
> 
>>>> +		return err;
>>>> +	}
> 
>>>> +	if (!pcim_iomap_table(pdev))
>>>> +		return -EACCES;
>>>
>>> Never happen condition. Thus useless.
>>
>> pcim_iomap_table() can return NULL in case of an allocation failure. Besides
>> that, isn't it good practice to always plan for the worst case?
> 
> No, it can't under the conditions you have in the code. See the lines I left
> above. If pcim_iomap_regions() finished successfully...

Nice catch, I didn't see that. I added that validation because of a Coverity
analysis. I'll add a comment here to suppress this Coverity false positive.

> 
>>>> +	dev_info(dev, "DesignWare eDMA PCIe driver loaded completely\n");
>>>
>>> Useless.
>>
>> It's helpful for bring-up; I can change it to dev_dbg().
> 
> It just shows that someone didn't use the existing tools and features. This message and similar ones are useless.
> Hint: initcall_debug.

I wasn't aware of it. Thanks for the hint :)

> 
>>>> +	dev_info(dev, "DesignWare eDMA PCIe driver unloaded completely\n");
>>>
>>> Ditto.
>>
>> It's helpful for bring-up; I can change it to dev_dbg().
> 
> Ditto.
> 



* Re: [RFC v3 7/7] dmaengine: Add Synopsys eDMA IP test and sample driver
  2019-01-15  5:45       ` Andy Shevchenko
@ 2019-01-15 13:02         ` Gustavo Pimentel
  2019-01-17  5:03           ` Vinod Koul
  0 siblings, 1 reply; 39+ messages in thread
From: Gustavo Pimentel @ 2019-01-15 13:02 UTC (permalink / raw)
  To: Andy Shevchenko, Gustavo Pimentel
  Cc: linux-pci, dmaengine, Vinod Koul, Dan Williams, Eugeniy Paltsev,
	Russell King, Niklas Cassel, Joao Pinto, Jose Abreu,
	Luis Oliveira, Vitor Soares, Nelson Costa, Pedro Sousa

On 15/01/2019 05:45, Andy Shevchenko wrote:
> On Mon, Jan 14, 2019 at 11:44:22AM +0000, Gustavo Pimentel wrote:
>> On 11/01/2019 19:48, Andy Shevchenko wrote:
>>> On Fri, Jan 11, 2019 at 07:33:43PM +0100, Gustavo Pimentel wrote:
>>>> Add a Synopsys eDMA IP test and sample driver to be used for testing
>>>> purposes and also as a reference for any developer who needs to
>>>> implement and use the Synopsys eDMA.
>>>>
>>>> This driver can be compiled as a built-in or as an external kernel module.
>>>>
>>>> To enable this driver, select the DW_EDMA_TEST option in the kernel
>>>> configuration; it depends on, and automatically selects, the DW_EDMA
>>>> option as well.
>>>>
>>>
>>> Hmm... This doesn't explain what's wrong with dmatest module.
>>
>> There isn't anything wrong with the dmatest module, that I know of. In the
>> beginning I was planning to use it; however, it only works with MEM_TO_MEM
>> transfers. That's why I created a similar module, but for MEM_TO_DEV and
>> DEV_TO_MEM with scatter-gather and cyclic transfer types, for my use case.
>> I don't know if it can be applied to other cases; if that is feasible, I'm
>> glad to share it.
> 
> What I'm trying to say is that it would be nice for the dmatest driver to
> have such a capability.
> 

At the beginning I thought about adding those features to the dmatest module,
but decided against it for several reasons:
 - I didn't want, by any chance, to break the dmatest module while
implementing support for those new features;
 - Since I'm still gaining knowledge about this subsystem, I preferred to do
it in a more controlled environment;
 - Since this driver is also a reference driver that aims to teach someone
how to use the dmaengine API, I preferred to keep it simple.

Maybe in the future both modules could be merged into a single tool, but for
now I would prefer to keep them separate, especially so that any maturity
issues in this module can be fixed in isolation.



* Re: [RFC v3 1/7] dmaengine: Add Synopsys eDMA IP core driver
  2019-01-11 18:33 ` [RFC v3 1/7] dmaengine: Add Synopsys eDMA IP core driver Gustavo Pimentel
@ 2019-01-16 10:21   ` Jose Abreu
  2019-01-16 11:53     ` Gustavo Pimentel
  2019-01-20 11:44   ` Vinod Koul
  1 sibling, 1 reply; 39+ messages in thread
From: Jose Abreu @ 2019-01-16 10:21 UTC (permalink / raw)
  To: Gustavo Pimentel, linux-pci, dmaengine
  Cc: Vinod Koul, Dan Williams, Eugeniy Paltsev, Andy Shevchenko,
	Russell King, Niklas Cassel, Joao Pinto, Jose Abreu,
	Luis Oliveira, Vitor Soares, Nelson Costa, Pedro Sousa

Hi Gustavo,

On 1/11/2019 6:33 PM, Gustavo Pimentel wrote:
> Add the Synopsys eDMA IP core driver to the kernel.
> 
> This core driver initializes and configures the eDMA IP using the virt-dma
> helper functions and the dmaengine subsystem.
> 
> It also creates an abstraction layer through callbacks, allowing different
> register mappings in the future, organized into versions.
> 
> This driver can be compiled as a built-in or as an external kernel module.
> 
> To enable this driver, select the DW_EDMA option in the kernel
> configuration; it depends on, and automatically selects, the DMA_ENGINE and
> DMA_VIRTUAL_CHANNELS options as well.
> 
> Changes:
> RFC v1->RFC v2:
>  - Replace comments // (C99 style) by /**/
>  - Fix the headers of the .c and .h files according to the most recent
>    convention
>  - Fix errors and checks pointed out by checkpatch with --strict option
>  - Replace patch small description tag from dma by dmaengine
>  - Change some dev_info() into dev_dbg()
>  - Remove unnecessary zero initialization after kzalloc
>  - Remove direction validation on config() API, since the direction
>    parameter is deprecated
>  - Refactor code to replace atomic_t by u32 variable type
>  - Replace start_transfer() name by dw_edma_start_transfer()
>  - Add spinlock to dw_edma_device_prep_slave_sg()
>  - Add spinlock to dw_edma_free_chunk()
>  - Simplify switch case into if on dw_edma_device_pause(),
>    dw_edma_device_resume() and dw_edma_device_terminate_all()
> RFC v2->RFC v3:
>  - Add driver parameter to disable msix feature
>  - Fix printk variable of phys_addr_t type
>  - Fix printk variable of __iomem type
>  - Fix printk variable of size_t type
>  - Add comments or improve existing ones
>  - Add possibility to work with multiple IRQs feature
>  - Fix source and destination addresses
>  - Add define to magic numbers
>  - Add DMA cyclic transfer feature
> 
> Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
> Cc: Vinod Koul <vkoul@kernel.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Eugeniy Paltsev <paltsev@synopsys.com>
> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Cc: Russell King <rmk+kernel@armlinux.org.uk>
> Cc: Niklas Cassel <niklas.cassel@linaro.org>
> Cc: Joao Pinto <jpinto@synopsys.com>
> Cc: Jose Abreu <jose.abreu@synopsys.com>
> Cc: Luis Oliveira <lolivei@synopsys.com>
> Cc: Vitor Soares <vitor.soares@synopsys.com>
> Cc: Nelson Costa <nelson.costa@synopsys.com>
> Cc: Pedro Sousa <pedrom.sousa@synopsys.com>


> +static struct dw_edma_chunk *dw_edma_alloc_chunk(struct dw_edma_desc *desc)
> +{
> +	struct dw_edma_chan *chan = desc->chan;
> +	struct dw_edma *dw = chan->chip->dw;
> +	struct dw_edma_chunk *chunk;
> +
> +	chunk = kvzalloc(sizeof(*chunk), GFP_NOWAIT);
> +	if (unlikely(!chunk))
> +		return NULL;
> +
> +	INIT_LIST_HEAD(&chunk->list);
> +	chunk->chan = chan;
> +	chunk->cb = !(desc->chunks_alloc % 2);
> +	chunk->ll_region.paddr = dw->ll_region.paddr + chan->ll_off;
> +	chunk->ll_region.vaddr = dw->ll_region.vaddr + chan->ll_off;
> +
> +	if (desc->chunk) {
> +		/* Create and add new element into the linked list */
> +		desc->chunks_alloc++;
> +		dev_dbg(chan2dev(chan), "alloc new chunk element (%d)\n",
> +			desc->chunks_alloc);
> +		list_add_tail(&chunk->list, &desc->chunk->list);
> +		dw_edma_alloc_burst(chunk);

No return check ?

> +	} else {
> +		/* List head */
> +		chunk->burst = NULL;
> +		desc->chunks_alloc = 0;
> +		desc->chunk = chunk;
> +		dev_dbg(chan2dev(chan), "alloc new chunk head\n");
> +	}
> +
> +	return chunk;
> +}
> +
> +static struct dw_edma_desc *dw_edma_alloc_desc(struct dw_edma_chan *chan)
> +{
> +	struct dw_edma_desc *desc;
> +
> +	dev_dbg(chan2dev(chan), "alloc new descriptor\n");
> +
> +	desc = kvzalloc(sizeof(*desc), GFP_NOWAIT);
> +	if (unlikely(!desc))
> +		return NULL;
> +
> +	desc->chan = chan;
> +	dw_edma_alloc_chunk(desc);

No return check ?

> +
> +	return desc;
> +}
> +

> +static void dw_edma_start_transfer(struct dw_edma_chan *chan)
> +{
> +	struct virt_dma_desc *vd;
> +	struct dw_edma_desc *desc;
> +	struct dw_edma_chunk *child;
> +	const struct dw_edma_core_ops *ops = chan2ops(chan);

Reverse Christmas-tree order here and in the remaining functions ...

> +
> +	vd = vchan_next_desc(&chan->vc);
> +	if (!vd)
> +		return;
> +
> +	desc = vd2dw_edma_desc(vd);
> +	if (!desc)
> +		return;
> +
> +	child = list_first_entry_or_null(&desc->chunk->list,
> +					 struct dw_edma_chunk, list);
> +	if (!child)
> +		return;
> +
> +	ops->start(child, !desc->xfer_sz);
> +	desc->xfer_sz += child->ll_region.sz;
> +	dev_dbg(chan2dev(chan), "transfer of %u bytes started\n",
> +		child->ll_region.sz);
> +
> +	dw_edma_free_burst(child);
> +	if (child->bursts_alloc)
> +		dev_dbg(chan2dev(chan),	"%u bursts still allocated\n",
> +			child->bursts_alloc);
> +	list_del(&child->list);
> +	kvfree(child);
> +	desc->chunks_alloc--;
> +}
> +

> +int dw_edma_probe(struct dw_edma_chip *chip)
> +{
> +	struct dw_edma *dw = chip->dw;
> +	struct device *dev = chip->dev;
> +	const struct dw_edma_core_ops *ops;
> +	size_t ll_chunk = dw->ll_region.sz;
> +	size_t dt_chunk = dw->dt_region.sz;
> +	u32 ch_tot;
> +	int i, j, err;
> +
> +	raw_spin_lock_init(&dw->lock);

Why raw ?

> +
> +	/* Callback operation selection accordingly to eDMA version */
> +	switch (dw->version) {
> +	default:
> +		dev_err(dev, "unsupported version\n");
> +		return -EPERM;
> +	}
> +
> +	pm_runtime_get_sync(dev);

Why do you need to increase usage count at probe ? And shouldn't
this be after pm_runtime_enable() ?

> +
> +	/* Find out how many write channels are supported by hardware */
> +	dw->wr_ch_cnt = ops->ch_count(dw, EDMA_DIR_WRITE);
> +	if (!dw->wr_ch_cnt) {
> +		dev_err(dev, "invalid number of write channels(0)\n");
> +		return -EINVAL;
> +	}
> +
> +	/* Find out how many read channels are supported by hardware */
> +	dw->rd_ch_cnt = ops->ch_count(dw, EDMA_DIR_READ);
> +	if (!dw->rd_ch_cnt) {
> +		dev_err(dev, "invalid number of read channels(0)\n");
> +		return -EINVAL;
> +	}
> +
> +	dev_dbg(dev, "Channels:\twrite=%d, read=%d\n",
> +		dw->wr_ch_cnt, dw->rd_ch_cnt);
> +
> +	ch_tot = dw->wr_ch_cnt + dw->rd_ch_cnt;
> +
> +	/* Allocate channels */
> +	dw->chan = devm_kcalloc(dev, ch_tot, sizeof(*dw->chan), GFP_KERNEL);
> +	if (!dw->chan)
> +		return -ENOMEM;
> +
> +	/* Calculate the linked list chunk for each channel */
> +	ll_chunk /= roundup_pow_of_two(ch_tot);
> +
> +	/* Calculate the linked list chunk for each channel */
> +	dt_chunk /= roundup_pow_of_two(ch_tot);
> +
> +	/* Disable eDMA, only to establish the ideal initial conditions */
> +	ops->off(dw);
> +
> +	snprintf(dw->name, sizeof(dw->name), "dw-edma-core:%d", chip->id);
> +
> +	/* Request IRQs */
> +	if (dw->nr_irqs != 1) {
> +		dev_err(dev, "invalid number of irqs (%u)\n", dw->nr_irqs);
> +		return -EINVAL;
> +	}
> +
> +	for (i = 0; i < dw->nr_irqs; i++) {
> +		err = devm_request_irq(dev, pci_irq_vector(to_pci_dev(dev), i),
> +				       dw_edma_interrupt_all,
> +				       IRQF_SHARED, dw->name, chip);
> +		if (err)
> +			return err;
> +	}
> +
> +	/* Create write channels */
> +	INIT_LIST_HEAD(&dw->wr_edma.channels);
> +	for (i = 0; i < dw->wr_ch_cnt; i++) {
> +		struct dw_edma_chan *chan = &dw->chan[i];
> +		struct dw_edma_region *dt_region;
> +
> +		dt_region = devm_kzalloc(dev, sizeof(*dt_region), GFP_KERNEL);
> +		if (!dt_region)
> +			return -ENOMEM;
> +
> +		chan->vc.chan.private = dt_region;
> +
> +		chan->chip = chip;
> +		chan->id = i;
> +		chan->dir = EDMA_DIR_WRITE;
> +		chan->configured = false;
> +		chan->request = EDMA_REQ_NONE;
> +		chan->status = EDMA_ST_IDLE;
> +
> +		chan->ll_off = (ll_chunk * i);
> +		chan->ll_max = (ll_chunk / EDMA_LL_SZ) - 1;
> +
> +		chan->dt_off = (dt_chunk * i);
> +
> +		dev_dbg(dev, "L. List:\tChannel write[%u] off=0x%.8lx, max_cnt=%u\n",
> +			i, chan->ll_off, chan->ll_max);
> +
> +		memcpy(&chan->msi, &dw->msi[0], sizeof(chan->msi));
> +
> +		dev_dbg(dev, "MSI:\t\tChannel write[%u] addr=0x%.8x%.8x, data=0x%.8x\n",
> +			i, chan->msi.address_hi, chan->msi.address_lo,
> +			chan->msi.data);
> +
> +		chan->vc.desc_free = vchan_free_desc;
> +		vchan_init(&chan->vc, &dw->wr_edma);
> +
> +		dt_region->paddr = dw->dt_region.paddr + chan->dt_off;
> +		dt_region->vaddr = dw->dt_region.vaddr + chan->dt_off;
> +		dt_region->sz = dt_chunk;
> +
> +		dev_dbg(dev, "Data:\tChannel write[%u] off=0x%.8lx\n",
> +			i, chan->dt_off);
> +	}
> +	dma_cap_zero(dw->wr_edma.cap_mask);
> +	dma_cap_set(DMA_SLAVE, dw->wr_edma.cap_mask);
> +	dma_cap_set(DMA_CYCLIC, dw->wr_edma.cap_mask);
> +	dw->wr_edma.directions = BIT(DMA_DEV_TO_MEM);
> +	dw->wr_edma.chancnt = dw->wr_ch_cnt;
> +
> +	/* Create read channels */
> +	INIT_LIST_HEAD(&dw->rd_edma.channels);
> +	for (j = 0; j < dw->rd_ch_cnt; j++, i++) {
> +		struct dw_edma_chan *chan = &dw->chan[i];
> +		struct dw_edma_region *dt_region;
> +
> +		dt_region = devm_kzalloc(dev, sizeof(*dt_region), GFP_KERNEL);
> +		if (!dt_region)
> +			return -ENOMEM;
> +
> +		chan->vc.chan.private = dt_region;
> +
> +		chan->chip = chip;
> +		chan->id = j;
> +		chan->dir = EDMA_DIR_READ;
> +		chan->configured = false;
> +		chan->request = EDMA_REQ_NONE;
> +		chan->status = EDMA_ST_IDLE;
> +
> +		chan->ll_off = (ll_chunk * i);
> +		chan->ll_max = (ll_chunk / EDMA_LL_SZ) - 1;
> +
> +		chan->dt_off = (dt_chunk * i);
> +
> +		dev_dbg(dev, "L. List:\tChannel read[%u] off=0x%.8lx, max_cnt=%u\n",
> +			j, chan->ll_off, chan->ll_max);
> +
> +		memcpy(&chan->msi, &dw->msi[0], sizeof(chan->msi));
> +
> +		dev_dbg(dev, "MSI:\t\tChannel read[%u] addr=0x%.8x%.8x, data=0x%.8x\n",
> +			j, chan->msi.address_hi, chan->msi.address_lo,
> +			chan->msi.data);
> +
> +		chan->vc.desc_free = vchan_free_desc;
> +		vchan_init(&chan->vc, &dw->rd_edma);
> +
> +		dt_region->paddr = dw->dt_region.paddr + chan->dt_off;
> +		dt_region->vaddr = dw->dt_region.vaddr + chan->dt_off;
> +		dt_region->sz = dt_chunk;
> +
> +		dev_dbg(dev, "Data:\tChannel read[%u] off=0x%.8lx\n",
> +			j, chan->dt_off);
> +	}
> +	dma_cap_zero(dw->rd_edma.cap_mask);
> +	dma_cap_set(DMA_SLAVE, dw->rd_edma.cap_mask);
> +	dma_cap_set(DMA_CYCLIC, dw->rd_edma.cap_mask);
> +	dw->rd_edma.directions = BIT(DMA_MEM_TO_DEV);
> +	dw->rd_edma.chancnt = dw->rd_ch_cnt;
> +
> +	/* Set DMA channel capabilities */
> +	SET_BOTH_CH(src_addr_widths, BIT(DMA_SLAVE_BUSWIDTH_4_BYTES));
> +	SET_BOTH_CH(dst_addr_widths, BIT(DMA_SLAVE_BUSWIDTH_4_BYTES));
> +	SET_BOTH_CH(residue_granularity, DMA_RESIDUE_GRANULARITY_DESCRIPTOR);
> +
> +	SET_BOTH_CH(dev, dev);
> +
> +	SET_BOTH_CH(device_alloc_chan_resources, dw_edma_alloc_chan_resources);
> +	SET_BOTH_CH(device_free_chan_resources, dw_edma_free_chan_resources);
> +
> +	SET_BOTH_CH(device_config, dw_edma_device_config);
> +	SET_BOTH_CH(device_pause, dw_edma_device_pause);
> +	SET_BOTH_CH(device_resume, dw_edma_device_resume);
> +	SET_BOTH_CH(device_terminate_all, dw_edma_device_terminate_all);
> +	SET_BOTH_CH(device_issue_pending, dw_edma_device_issue_pending);
> +	SET_BOTH_CH(device_tx_status, dw_edma_device_tx_status);
> +	SET_BOTH_CH(device_prep_slave_sg, dw_edma_device_prep_slave_sg);
> +	SET_BOTH_CH(device_prep_dma_cyclic, dw_edma_device_prep_dma_cyclic);
> +
> +	/* Power management */
> +	pm_runtime_enable(dev);
> +
> +	/* Register DMA device */
> +	err = dma_async_device_register(&dw->wr_edma);
> +	if (err)
> +		goto err_pm_disable;
> +
> +	err = dma_async_device_register(&dw->rd_edma);
> +	if (err)
> +		goto err_pm_disable;
> +
> +	/* Turn debugfs on */
> +	err = ops->debugfs_on(chip);
> +	if (err) {
> +		dev_err(dev, "unable to create debugfs structure\n");
> +		goto err_pm_disable;
> +	}
> +
> +	dev_info(dev, "DesignWare eDMA controller driver loaded completely\n");
> +
> +	return 0;
> +
> +err_pm_disable:
> +	pm_runtime_disable(dev);
> +
> +	return err;
> +}
> +EXPORT_SYMBOL_GPL(dw_edma_probe);
> +
> +int dw_edma_remove(struct dw_edma_chip *chip)
> +{
> +	struct dw_edma *dw = chip->dw;
> +	struct device *dev = chip->dev;
> +	const struct dw_edma_core_ops *ops = dw->ops;
> +	struct dw_edma_chan *chan, *_chan;
> +	int i;
> +
> +	/* Disable eDMA */
> +	if (ops)
> +		ops->off(dw);
> +
> +	/* Free irqs */

But devm will automatically free it, no ?

> +	for (i = 0; i < dw->nr_irqs; i++) {
> +		if (pci_irq_vector(to_pci_dev(dev), i) < 0)
> +			continue;
> +
> +		devm_free_irq(dev, pci_irq_vector(to_pci_dev(dev), i), chip);
> +	}
> +
> +	/* Power management */
> +	pm_runtime_disable(dev);
> +
> +	list_for_each_entry_safe(chan, _chan, &dw->wr_edma.channels,
> +				 vc.chan.device_node) {
> +		list_del(&chan->vc.chan.device_node);
> +		tasklet_kill(&chan->vc.task);
> +	}
> +
> +	list_for_each_entry_safe(chan, _chan, &dw->rd_edma.channels,
> +				 vc.chan.device_node) {
> +		list_del(&chan->vc.chan.device_node);
> +		tasklet_kill(&chan->vc.task);
> +	}
> +
> +	/* Deregister eDMA device */
> +	dma_async_device_unregister(&dw->wr_edma);
> +	dma_async_device_unregister(&dw->rd_edma);
> +
> +	/* Turn debugfs off */
> +	if (ops)
> +		ops->debugfs_off();
> +
> +	dev_info(dev, "DesignWare eDMA controller driver unloaded completely\n");
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(dw_edma_remove);

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC v3 2/7] dmaengine: Add Synopsys eDMA IP version 0 support
  2019-01-11 18:33 ` [RFC v3 2/7] dmaengine: Add Synopsys eDMA IP version 0 support Gustavo Pimentel
@ 2019-01-16 10:33   ` Jose Abreu
  2019-01-16 14:02     ` Gustavo Pimentel
  0 siblings, 1 reply; 39+ messages in thread
From: Jose Abreu @ 2019-01-16 10:33 UTC (permalink / raw)
  To: Gustavo Pimentel, linux-pci, dmaengine
  Cc: Vinod Koul, Dan Williams, Eugeniy Paltsev, Andy Shevchenko,
	Russell King, Niklas Cassel, Joao Pinto, Jose Abreu,
	Luis Oliveira, Vitor Soares, Nelson Costa, Pedro Sousa

Hi Gustavo,

On 1/11/2019 6:33 PM, Gustavo Pimentel wrote:
> Add support for the eDMA IP version 0 driver for both register maps (legacy
> and unroll).
> 
> The legacy register mapping was the initial implementation, which consisted
> of having all channel registers multiplexed, and the channel they map to
> could be changed at any time (which could lead to a race condition) via a
> view port register (only one channel is accessible at a time).
> 
> This register mapping is not very efficient in a multithreaded
> environment, which led to the development of the unroll register mapping,
> which makes all channel registers accessible at any time by spacing each
> channel's registers at a fixed offset from the others.
> 
> This version supports a maximum of 16 independent channels (8 write +
> 8 read), which can run simultaneously.
> 
> Implements a scatter-gather transfer through a linked list, where the size
> of the linked list depends on the allocated memory divided equally among all
> channels.
> 
> Each linked list descriptor can transfer from 1 byte to 4 Gbytes and is
> aligned to DWORD.
> 
> Both SAR (Source Address Register) and DAR (Destination Address Register)
> are byte-aligned.
> 
> Changes:
> RFC v1->RFC v2:
>  - Replace comments // (C99 style) by /**/
>  - Replace magic numbers by defines
>  - Replace boolean return from ternary operation by a double negation
>    operation
>  - Replace QWORD_HI/QWORD_LO macros by upper_32_bits()/lower_32_bits()
>  - Fix the headers of the .c and .h files according to the most recent
>    convention
>  - Fix errors and checks pointed out by checkpatch with --strict option
>  - Replace patch small description tag from dma by dmaengine
>  - Refactor code to replace atomic_t by u32 variable type
> RFC v2->RFC v3:
>  - Code rewrite to use FIELD_PREP() and FIELD_GET()
>  - Add define to magic numbers
>  - Fix minor bugs
> 
> Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
> Cc: Vinod Koul <vkoul@kernel.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Eugeniy Paltsev <paltsev@synopsys.com>
> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Cc: Russell King <rmk+kernel@armlinux.org.uk>
> Cc: Niklas Cassel <niklas.cassel@linaro.org>
> Cc: Joao Pinto <jpinto@synopsys.com>
> Cc: Jose Abreu <jose.abreu@synopsys.com>
> Cc: Luis Oliveira <lolivei@synopsys.com>
> Cc: Vitor Soares <vitor.soares@synopsys.com>
> Cc: Nelson Costa <nelson.costa@synopsys.com>
> Cc: Pedro Sousa <pedrom.sousa@synopsys.com>

> +void dw_edma_v0_core_start(struct dw_edma_chunk *chunk, bool first)
> +{
> +	struct dw_edma_chan *chan = chunk->chan;
> +	struct dw_edma *dw = chan->chip->dw;
> +	u32 tmp;
> +	u64 llp;
> +
> +	dw_edma_v0_core_write_chunk(chunk);
> +
> +	if (first) {
> +		/* Enable engine */
> +		SET_RW(dw, chan->dir, engine_en, BIT(0));
> +		/* Interrupt unmask - done, abort */
> +		tmp = GET_RW(dw, chan->dir, int_mask);
> +		tmp &= ~FIELD_PREP(EDMA_V0_DONE_INT_MASK, BIT(chan->id));
> +		tmp &= ~FIELD_PREP(EDMA_V0_ABORT_INT_MASK, BIT(chan->id));
> +		SET_RW(dw, chan->dir, int_mask, tmp);
> +		/* Linked list error */
> +		tmp = GET_RW(dw, chan->dir, linked_list_err_en);
> +		tmp |= FIELD_PREP(EDMA_V0_LINKED_LIST_ERR_MASK, BIT(chan->id));
> +		SET_RW(dw, chan->dir, linked_list_err_en, tmp);
> +		/* Channel control */
> +		SET_CH(dw, chan->dir, chan->id, ch_control1,
> +		       (DW_EDMA_V0_CCS | DW_EDMA_V0_LLE));
> +		/* Linked list - low, high */
> +		llp = cpu_to_le64(chunk->ll_region.paddr);
> +		SET_CH(dw, chan->dir, chan->id, llp_low, lower_32_bits(llp));
> +		SET_CH(dw, chan->dir, chan->id, llp_high, upper_32_bits(llp));
> +	}
> +	/* Doorbell */

Not sure if the DMA subsystem does this, but maybe you need some kind
of barrier to ensure everything is coherent before granting
control to the eDMA?

> +	SET_RW(dw, chan->dir, doorbell,
> +	       FIELD_PREP(EDMA_V0_DOORBELL_CH_MASK, chan->id));
> +}
> +

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC v3 7/7] dmaengine: Add Synopsys eDMA IP test and sample driver
  2019-01-11 18:33 ` [RFC v3 7/7] dmaengine: Add Synopsys eDMA IP test and sample driver Gustavo Pimentel
  2019-01-11 19:48   ` Andy Shevchenko
@ 2019-01-16 10:45   ` Jose Abreu
  2019-01-16 11:56     ` Gustavo Pimentel
  1 sibling, 1 reply; 39+ messages in thread
From: Jose Abreu @ 2019-01-16 10:45 UTC (permalink / raw)
  To: Gustavo Pimentel, linux-pci, dmaengine
  Cc: Vinod Koul, Dan Williams, Eugeniy Paltsev, Andy Shevchenko,
	Russell King, Niklas Cassel, Joao Pinto, Jose Abreu,
	Luis Oliveira, Vitor Soares, Nelson Costa, Pedro Sousa

Hi Gustavo,

On 1/11/2019 6:33 PM, Gustavo Pimentel wrote:
> Add Synopsys eDMA IP test and sample driver to be used for testing
> purposes and also as a reference for any developer who needs to
> implement and use the Synopsys eDMA.
> 
> This driver can be compiled as a built-in or as an external kernel module.
> 
> To enable this driver just select the DW_EDMA_TEST option in the kernel
> configuration; it automatically selects the DW_EDMA option too.
> 
> Changes:
> RFC v1->RFC v2:
>  - No changes
> RFC v2->RFC v3:
>  - Add test module
> 
> Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
> Cc: Vinod Koul <vkoul@kernel.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Eugeniy Paltsev <paltsev@synopsys.com>
> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Cc: Russell King <rmk+kernel@armlinux.org.uk>
> Cc: Niklas Cassel <niklas.cassel@linaro.org>
> Cc: Joao Pinto <jpinto@synopsys.com>
> Cc: Jose Abreu <jose.abreu@synopsys.com>
> Cc: Luis Oliveira <lolivei@synopsys.com>
> Cc: Vitor Soares <vitor.soares@synopsys.com>
> Cc: Nelson Costa <nelson.costa@synopsys.com>
> Cc: Pedro Sousa <pedrom.sousa@synopsys.com>

> +static int dw_edma_test_add_channel(struct dw_edma_test_info *info,
> +				    struct dma_chan *chan,
> +				    u32 channel)
> +{
> +	struct dw_edma_test_params *params = &info->params;
> +	struct dw_edma_test_thread *thread;
> +	struct dw_edma_test_chan *tchan;
> +
> +	tchan = kvmalloc(sizeof(*tchan), GFP_KERNEL);
> +	if (!tchan)
> +		return -ENOMEM;
> +
> +	tchan->chan = chan;
> +
> +	thread = kvzalloc(sizeof(*thread), GFP_KERNEL);
> +	if (!thread) {
> +		kvfree(tchan);
> +		return -ENOMEM;
> +	}
> +
> +	thread->info = info;
> +	thread->chan = tchan->chan;
> +	switch (channel) {
> +	case EDMA_CH_WR:
> +		thread->direction = DMA_DEV_TO_MEM;
> +		break;
> +	case EDMA_CH_RD:
> +		thread->direction = DMA_MEM_TO_DEV;
> +		break;
> +	default:
> +		kvfree(tchan);

You are leaking thread here.

> +		return -EPERM;
> +	}
> +	thread->test_done.wait = &thread->done_wait;
> +	init_waitqueue_head(&thread->done_wait);
> +
> +	if (!params->repetitions)
> +		thread->task = kthread_create(dw_edma_test_sg, thread, "%s",
> +					      dma_chan_name(chan));
> +	else
> +		thread->task = kthread_create(dw_edma_test_cyclic, thread, "%s",
> +					      dma_chan_name(chan));
> +
> +	if (IS_ERR(thread->task)) {
> +		pr_err("failed to create thread %s\n", dma_chan_name(chan));
> +		kvfree(tchan);
> +		kvfree(thread);
> +		return -EPERM;
> +	}
> +
> +	tchan->thread = thread;
> +	dev_dbg(chan->device->dev, "add thread %s\n", dma_chan_name(chan));
> +	list_add_tail(&tchan->node, &info->channels);
> +
> +	return 0;
> +}
> +

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC v3 1/7] dmaengine: Add Synopsys eDMA IP core driver
  2019-01-16 10:21   ` Jose Abreu
@ 2019-01-16 11:53     ` Gustavo Pimentel
  2019-01-19 16:21       ` Andy Shevchenko
  2019-01-20 11:47       ` Vinod Koul
  0 siblings, 2 replies; 39+ messages in thread
From: Gustavo Pimentel @ 2019-01-16 11:53 UTC (permalink / raw)
  To: Jose Abreu, Gustavo Pimentel, linux-pci, dmaengine
  Cc: Vinod Koul, Dan Williams, Eugeniy Paltsev, Andy Shevchenko,
	Russell King, Niklas Cassel, Joao Pinto, Jose Abreu,
	Luis de Oliveira, Vitor Soares, Nelson Costa, Pedro Sousa

Hi Jose,

On 16/01/2019 10:21, Jose Abreu wrote:
> Hi Gustavo,
> 
> On 1/11/2019 6:33 PM, Gustavo Pimentel wrote:
>> Add Synopsys eDMA IP core driver to the kernel.
>>
>> This core driver initializes and configures the eDMA IP using virt-dma
>> helper functions and the DMA engine subsystem.
>>
>> Also creates an abstraction layer through callbacks, allowing different
>> register mappings in the future, organized into versions.
>>
>> This driver can be compiled as a built-in or as an external kernel module.
>>
>> To enable this driver just select the DW_EDMA option in the kernel
>> configuration; it automatically selects the DMA_ENGINE and
>> DMA_VIRTUAL_CHANNELS options too.
>>
>> Changes:
>> RFC v1->RFC v2:
>>  - Replace comments // (C99 style) by /**/
>>  - Fix the headers of the .c and .h files according to the most recent
>>    convention
>>  - Fix errors and checks pointed out by checkpatch with --strict option
>>  - Replace patch small description tag from dma by dmaengine
>>  - Change some dev_info() into dev_dbg()
>>  - Remove unnecessary zero initialization after kzalloc
>>  - Remove direction validation on config() API, since the direction
>>    parameter is deprecated
>>  - Refactor code to replace atomic_t by u32 variable type
>>  - Replace start_transfer() name by dw_edma_start_transfer()
>>  - Add spinlock to dw_edma_device_prep_slave_sg()
>>  - Add spinlock to dw_edma_free_chunk()
>>  - Simplify switch case into if on dw_edma_device_pause(),
>>    dw_edma_device_resume() and dw_edma_device_terminate_all()
>> RFC v2->RFC v3:
>>  - Add driver parameter to disable msix feature
>>  - Fix printk variable of phys_addr_t type
>>  - Fix printk variable of __iomem type
>>  - Fix printk variable of size_t type
>>  - Add comments or improve existing ones
>>  - Add possibility to work with multiple IRQs feature
>>  - Fix source and destination addresses
>>  - Add define to magic numbers
>>  - Add DMA cyclic transfer feature
>>
>> Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
>> Cc: Vinod Koul <vkoul@kernel.org>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Eugeniy Paltsev <paltsev@synopsys.com>
>> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
>> Cc: Russell King <rmk+kernel@armlinux.org.uk>
>> Cc: Niklas Cassel <niklas.cassel@linaro.org>
>> Cc: Joao Pinto <jpinto@synopsys.com>
>> Cc: Jose Abreu <jose.abreu@synopsys.com>
>> Cc: Luis Oliveira <lolivei@synopsys.com>
>> Cc: Vitor Soares <vitor.soares@synopsys.com>
>> Cc: Nelson Costa <nelson.costa@synopsys.com>
>> Cc: Pedro Sousa <pedrom.sousa@synopsys.com>
> 
> 
>> +static struct dw_edma_chunk *dw_edma_alloc_chunk(struct dw_edma_desc *desc)
>> +{
>> +	struct dw_edma_chan *chan = desc->chan;
>> +	struct dw_edma *dw = chan->chip->dw;
>> +	struct dw_edma_chunk *chunk;
>> +
>> +	chunk = kvzalloc(sizeof(*chunk), GFP_NOWAIT);
>> +	if (unlikely(!chunk))
>> +		return NULL;
>> +
>> +	INIT_LIST_HEAD(&chunk->list);
>> +	chunk->chan = chan;
>> +	chunk->cb = !(desc->chunks_alloc % 2);
>> +	chunk->ll_region.paddr = dw->ll_region.paddr + chan->ll_off;
>> +	chunk->ll_region.vaddr = dw->ll_region.vaddr + chan->ll_off;
>> +
>> +	if (desc->chunk) {
>> +		/* Create and add new element into the linked list */
>> +		desc->chunks_alloc++;
>> +		dev_dbg(chan2dev(chan), "alloc new chunk element (%d)\n",
>> +			desc->chunks_alloc);
>> +		list_add_tail(&chunk->list, &desc->chunk->list);
>> +		dw_edma_alloc_burst(chunk);
> 
> No return check ?

Nice catch! I'll do a return check.

> 
>> +	} else {
>> +		/* List head */
>> +		chunk->burst = NULL;
>> +		desc->chunks_alloc = 0;
>> +		desc->chunk = chunk;
>> +		dev_dbg(chan2dev(chan), "alloc new chunk head\n");
>> +	}
>> +
>> +	return chunk;
>> +}
>> +
>> +static struct dw_edma_desc *dw_edma_alloc_desc(struct dw_edma_chan *chan)
>> +{
>> +	struct dw_edma_desc *desc;
>> +
>> +	dev_dbg(chan2dev(chan), "alloc new descriptor\n");
>> +
>> +	desc = kvzalloc(sizeof(*desc), GFP_NOWAIT);
>> +	if (unlikely(!desc))
>> +		return NULL;
>> +
>> +	desc->chan = chan;
>> +	dw_edma_alloc_chunk(desc);
> 
> No return check ?

Ditto.

> 
>> +
>> +	return desc;
>> +}
>> +
> 
>> +static void dw_edma_start_transfer(struct dw_edma_chan *chan)
>> +{
>> +	struct virt_dma_desc *vd;
>> +	struct dw_edma_desc *desc;
>> +	struct dw_edma_chunk *child;
>> +	const struct dw_edma_core_ops *ops = chan2ops(chan);
> 
> Reverse tree order here and in remaining functions ...

Ok, I'll do that.

> 
>> +
>> +	vd = vchan_next_desc(&chan->vc);
>> +	if (!vd)
>> +		return;
>> +
>> +	desc = vd2dw_edma_desc(vd);
>> +	if (!desc)
>> +		return;
>> +
>> +	child = list_first_entry_or_null(&desc->chunk->list,
>> +					 struct dw_edma_chunk, list);
>> +	if (!child)
>> +		return;
>> +
>> +	ops->start(child, !desc->xfer_sz);
>> +	desc->xfer_sz += child->ll_region.sz;
>> +	dev_dbg(chan2dev(chan), "transfer of %u bytes started\n",
>> +		child->ll_region.sz);
>> +
>> +	dw_edma_free_burst(child);
>> +	if (child->bursts_alloc)
>> +		dev_dbg(chan2dev(chan),	"%u bursts still allocated\n",
>> +			child->bursts_alloc);
>> +	list_del(&child->list);
>> +	kvfree(child);
>> +	desc->chunks_alloc--;
>> +}
>> +
> 
>> +int dw_edma_probe(struct dw_edma_chip *chip)
>> +{
>> +	struct dw_edma *dw = chip->dw;
>> +	struct device *dev = chip->dev;
>> +	const struct dw_edma_core_ops *ops;
>> +	size_t ll_chunk = dw->ll_region.sz;
>> +	size_t dt_chunk = dw->dt_region.sz;
>> +	u32 ch_tot;
>> +	int i, j, err;
>> +
>> +	raw_spin_lock_init(&dw->lock);
> 
> Why raw ?

Marc Zyngier told me some time ago that raw spin locks would be preferable,
since they don't have any additional effect on the mainline kernel, but will
make it work correctly if you use the RT patches.

> 
>> +
>> +	/* Callback operation selection accordingly to eDMA version */
>> +	switch (dw->version) {
>> +	default:
>> +		dev_err(dev, "unsupported version\n");
>> +		return -EPERM;
>> +	}
>> +
>> +	pm_runtime_get_sync(dev);
> 
> Why do you need to increase usage count at probe ? And shouldn't
> this be after pm_runtime_enable() ?

It shouldn't be there. I'll remove it.

> 
>> +
>> +	/* Find out how many write channels are supported by hardware */
>> +	dw->wr_ch_cnt = ops->ch_count(dw, EDMA_DIR_WRITE);
>> +	if (!dw->wr_ch_cnt) {
>> +		dev_err(dev, "invalid number of write channels(0)\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	/* Find out how many read channels are supported by hardware */
>> +	dw->rd_ch_cnt = ops->ch_count(dw, EDMA_DIR_READ);
>> +	if (!dw->rd_ch_cnt) {
>> +		dev_err(dev, "invalid number of read channels(0)\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	dev_dbg(dev, "Channels:\twrite=%d, read=%d\n",
>> +		dw->wr_ch_cnt, dw->rd_ch_cnt);
>> +
>> +	ch_tot = dw->wr_ch_cnt + dw->rd_ch_cnt;
>> +
>> +	/* Allocate channels */
>> +	dw->chan = devm_kcalloc(dev, ch_tot, sizeof(*dw->chan), GFP_KERNEL);
>> +	if (!dw->chan)
>> +		return -ENOMEM;
>> +
>> +	/* Calculate the linked list chunk for each channel */
>> +	ll_chunk /= roundup_pow_of_two(ch_tot);
>> +
>> +	/* Calculate the data chunk for each channel */
>> +	dt_chunk /= roundup_pow_of_two(ch_tot);
>> +
>> +	/* Disable eDMA, only to establish the ideal initial conditions */
>> +	ops->off(dw);
>> +
>> +	snprintf(dw->name, sizeof(dw->name), "dw-edma-core:%d", chip->id);
>> +
>> +	/* Request IRQs */
>> +	if (dw->nr_irqs != 1) {
>> +		dev_err(dev, "invalid number of irqs (%u)\n", dw->nr_irqs);
>> +		return -EINVAL;
>> +	}
>> +
>> +	for (i = 0; i < dw->nr_irqs; i++) {
>> +		err = devm_request_irq(dev, pci_irq_vector(to_pci_dev(dev), i),
>> +				       dw_edma_interrupt_all,
>> +				       IRQF_SHARED, dw->name, chip);
>> +		if (err)
>> +			return err;
>> +	}
>> +
>> +	/* Create write channels */
>> +	INIT_LIST_HEAD(&dw->wr_edma.channels);
>> +	for (i = 0; i < dw->wr_ch_cnt; i++) {
>> +		struct dw_edma_chan *chan = &dw->chan[i];
>> +		struct dw_edma_region *dt_region;
>> +
>> +		dt_region = devm_kzalloc(dev, sizeof(*dt_region), GFP_KERNEL);
>> +		if (!dt_region)
>> +			return -ENOMEM;
>> +
>> +		chan->vc.chan.private = dt_region;
>> +
>> +		chan->chip = chip;
>> +		chan->id = i;
>> +		chan->dir = EDMA_DIR_WRITE;
>> +		chan->configured = false;
>> +		chan->request = EDMA_REQ_NONE;
>> +		chan->status = EDMA_ST_IDLE;
>> +
>> +		chan->ll_off = (ll_chunk * i);
>> +		chan->ll_max = (ll_chunk / EDMA_LL_SZ) - 1;
>> +
>> +		chan->dt_off = (dt_chunk * i);
>> +
>> +		dev_dbg(dev, "L. List:\tChannel write[%u] off=0x%.8lx, max_cnt=%u\n",
>> +			i, chan->ll_off, chan->ll_max);
>> +
>> +		memcpy(&chan->msi, &dw->msi[0], sizeof(chan->msi));
>> +
>> +		dev_dbg(dev, "MSI:\t\tChannel write[%u] addr=0x%.8x%.8x, data=0x%.8x\n",
>> +			i, chan->msi.address_hi, chan->msi.address_lo,
>> +			chan->msi.data);
>> +
>> +		chan->vc.desc_free = vchan_free_desc;
>> +		vchan_init(&chan->vc, &dw->wr_edma);
>> +
>> +		dt_region->paddr = dw->dt_region.paddr + chan->dt_off;
>> +		dt_region->vaddr = dw->dt_region.vaddr + chan->dt_off;
>> +		dt_region->sz = dt_chunk;
>> +
>> +		dev_dbg(dev, "Data:\tChannel write[%u] off=0x%.8lx\n",
>> +			i, chan->dt_off);
>> +	}
>> +	dma_cap_zero(dw->wr_edma.cap_mask);
>> +	dma_cap_set(DMA_SLAVE, dw->wr_edma.cap_mask);
>> +	dma_cap_set(DMA_CYCLIC, dw->wr_edma.cap_mask);
>> +	dw->wr_edma.directions = BIT(DMA_DEV_TO_MEM);
>> +	dw->wr_edma.chancnt = dw->wr_ch_cnt;
>> +
>> +	/* Create read channels */
>> +	INIT_LIST_HEAD(&dw->rd_edma.channels);
>> +	for (j = 0; j < dw->rd_ch_cnt; j++, i++) {
>> +		struct dw_edma_chan *chan = &dw->chan[i];
>> +		struct dw_edma_region *dt_region;
>> +
>> +		dt_region = devm_kzalloc(dev, sizeof(*dt_region), GFP_KERNEL);
>> +		if (!dt_region)
>> +			return -ENOMEM;
>> +
>> +		chan->vc.chan.private = dt_region;
>> +
>> +		chan->chip = chip;
>> +		chan->id = j;
>> +		chan->dir = EDMA_DIR_READ;
>> +		chan->configured = false;
>> +		chan->request = EDMA_REQ_NONE;
>> +		chan->status = EDMA_ST_IDLE;
>> +
>> +		chan->ll_off = (ll_chunk * i);
>> +		chan->ll_max = (ll_chunk / EDMA_LL_SZ) - 1;
>> +
>> +		chan->dt_off = (dt_chunk * i);
>> +
>> +		dev_dbg(dev, "L. List:\tChannel read[%u] off=0x%.8lx, max_cnt=%u\n",
>> +			j, chan->ll_off, chan->ll_max);
>> +
>> +		memcpy(&chan->msi, &dw->msi[0], sizeof(chan->msi));
>> +
>> +		dev_dbg(dev, "MSI:\t\tChannel read[%u] addr=0x%.8x%.8x, data=0x%.8x\n",
>> +			j, chan->msi.address_hi, chan->msi.address_lo,
>> +			chan->msi.data);
>> +
>> +		chan->vc.desc_free = vchan_free_desc;
>> +		vchan_init(&chan->vc, &dw->rd_edma);
>> +
>> +		dt_region->paddr = dw->dt_region.paddr + chan->dt_off;
>> +		dt_region->vaddr = dw->dt_region.vaddr + chan->dt_off;
>> +		dt_region->sz = dt_chunk;
>> +
>> +		dev_dbg(dev, "Data:\tChannel read[%u] off=0x%.8lx\n",
>> +			j, chan->dt_off);
>> +	}
>> +	dma_cap_zero(dw->rd_edma.cap_mask);
>> +	dma_cap_set(DMA_SLAVE, dw->rd_edma.cap_mask);
>> +	dma_cap_set(DMA_CYCLIC, dw->rd_edma.cap_mask);
>> +	dw->rd_edma.directions = BIT(DMA_MEM_TO_DEV);
>> +	dw->rd_edma.chancnt = dw->rd_ch_cnt;
>> +
>> +	/* Set DMA channel capabilities */
>> +	SET_BOTH_CH(src_addr_widths, BIT(DMA_SLAVE_BUSWIDTH_4_BYTES));
>> +	SET_BOTH_CH(dst_addr_widths, BIT(DMA_SLAVE_BUSWIDTH_4_BYTES));
>> +	SET_BOTH_CH(residue_granularity, DMA_RESIDUE_GRANULARITY_DESCRIPTOR);
>> +
>> +	SET_BOTH_CH(dev, dev);
>> +
>> +	SET_BOTH_CH(device_alloc_chan_resources, dw_edma_alloc_chan_resources);
>> +	SET_BOTH_CH(device_free_chan_resources, dw_edma_free_chan_resources);
>> +
>> +	SET_BOTH_CH(device_config, dw_edma_device_config);
>> +	SET_BOTH_CH(device_pause, dw_edma_device_pause);
>> +	SET_BOTH_CH(device_resume, dw_edma_device_resume);
>> +	SET_BOTH_CH(device_terminate_all, dw_edma_device_terminate_all);
>> +	SET_BOTH_CH(device_issue_pending, dw_edma_device_issue_pending);
>> +	SET_BOTH_CH(device_tx_status, dw_edma_device_tx_status);
>> +	SET_BOTH_CH(device_prep_slave_sg, dw_edma_device_prep_slave_sg);
>> +	SET_BOTH_CH(device_prep_dma_cyclic, dw_edma_device_prep_dma_cyclic);
>> +
>> +	/* Power management */
>> +	pm_runtime_enable(dev);
>> +
>> +	/* Register DMA device */
>> +	err = dma_async_device_register(&dw->wr_edma);
>> +	if (err)
>> +		goto err_pm_disable;
>> +
>> +	err = dma_async_device_register(&dw->rd_edma);
>> +	if (err)
>> +		goto err_pm_disable;
>> +
>> +	/* Turn debugfs on */
>> +	err = ops->debugfs_on(chip);
>> +	if (err) {
>> +		dev_err(dev, "unable to create debugfs structure\n");
>> +		goto err_pm_disable;
>> +	}
>> +
>> +	dev_info(dev, "DesignWare eDMA controller driver loaded completely\n");
>> +
>> +	return 0;
>> +
>> +err_pm_disable:
>> +	pm_runtime_disable(dev);
>> +
>> +	return err;
>> +}
>> +EXPORT_SYMBOL_GPL(dw_edma_probe);
>> +
>> +int dw_edma_remove(struct dw_edma_chip *chip)
>> +{
>> +	struct dw_edma *dw = chip->dw;
>> +	struct device *dev = chip->dev;
>> +	const struct dw_edma_core_ops *ops = dw->ops;
>> +	struct dw_edma_chan *chan, *_chan;
>> +	int i;
>> +
>> +	/* Disable eDMA */
>> +	if (ops)
>> +		ops->off(dw);
>> +
>> +	/* Free irqs */
> 
> But devm will automatically free it, no ?

Yes, it shouldn't be there. I'll remove it.

> 
>> +	for (i = 0; i < dw->nr_irqs; i++) {
>> +		if (pci_irq_vector(to_pci_dev(dev), i) < 0)
>> +			continue;
>> +
>> +		devm_free_irq(dev, pci_irq_vector(to_pci_dev(dev), i), chip);
>> +	}
>> +
>> +	/* Power management */
>> +	pm_runtime_disable(dev);
>> +
>> +	list_for_each_entry_safe(chan, _chan, &dw->wr_edma.channels,
>> +				 vc.chan.device_node) {
>> +		list_del(&chan->vc.chan.device_node);
>> +		tasklet_kill(&chan->vc.task);
>> +	}
>> +
>> +	list_for_each_entry_safe(chan, _chan, &dw->rd_edma.channels,
>> +				 vc.chan.device_node) {
>> +		list_del(&chan->vc.chan.device_node);
>> +		tasklet_kill(&chan->vc.task);
>> +	}
>> +
>> +	/* Deregister eDMA device */
>> +	dma_async_device_unregister(&dw->wr_edma);
>> +	dma_async_device_unregister(&dw->rd_edma);
>> +
>> +	/* Turn debugfs off */
>> +	if (ops)
>> +		ops->debugfs_off();
>> +
>> +	dev_info(dev, "DesignWare eDMA controller driver unloaded completely\n");
>> +
>> +	return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(dw_edma_remove);

Thanks!

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC v3 7/7] dmaengine: Add Synopsys eDMA IP test and sample driver
  2019-01-16 10:45   ` Jose Abreu
@ 2019-01-16 11:56     ` Gustavo Pimentel
  0 siblings, 0 replies; 39+ messages in thread
From: Gustavo Pimentel @ 2019-01-16 11:56 UTC (permalink / raw)
  To: Jose Abreu, Gustavo Pimentel, linux-pci, dmaengine
  Cc: Vinod Koul, Dan Williams, Eugeniy Paltsev, Andy Shevchenko,
	Russell King, Niklas Cassel, Joao Pinto, Jose Abreu,
	Luis de Oliveira, Vitor Soares, Nelson Costa, Pedro Sousa

Hi Jose,

On 16/01/2019 10:45, Jose Abreu wrote:
> Hi Gustavo,
> 
> On 1/11/2019 6:33 PM, Gustavo Pimentel wrote:
>> Add Synopsys eDMA IP test and sample driver to be used for testing
>> purposes and also as a reference for any developer who needs to
>> implement and use the Synopsys eDMA.
>>
>> This driver can be compiled as a built-in or as an external kernel module.
>>
>> To enable this driver just select the DW_EDMA_TEST option in the kernel
>> configuration; it automatically selects the DW_EDMA option too.
>>
>> Changes:
>> RFC v1->RFC v2:
>>  - No changes
>> RFC v2->RFC v3:
>>  - Add test module
>>
>> Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
>> Cc: Vinod Koul <vkoul@kernel.org>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Eugeniy Paltsev <paltsev@synopsys.com>
>> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
>> Cc: Russell King <rmk+kernel@armlinux.org.uk>
>> Cc: Niklas Cassel <niklas.cassel@linaro.org>
>> Cc: Joao Pinto <jpinto@synopsys.com>
>> Cc: Jose Abreu <jose.abreu@synopsys.com>
>> Cc: Luis Oliveira <lolivei@synopsys.com>
>> Cc: Vitor Soares <vitor.soares@synopsys.com>
>> Cc: Nelson Costa <nelson.costa@synopsys.com>
>> Cc: Pedro Sousa <pedrom.sousa@synopsys.com>
> 
>> +static int dw_edma_test_add_channel(struct dw_edma_test_info *info,
>> +				    struct dma_chan *chan,
>> +				    u32 channel)
>> +{
>> +	struct dw_edma_test_params *params = &info->params;
>> +	struct dw_edma_test_thread *thread;
>> +	struct dw_edma_test_chan *tchan;
>> +
>> +	tchan = kvmalloc(sizeof(*tchan), GFP_KERNEL);
>> +	if (!tchan)
>> +		return -ENOMEM;
>> +
>> +	tchan->chan = chan;
>> +
>> +	thread = kvzalloc(sizeof(*thread), GFP_KERNEL);
>> +	if (!thread) {
>> +		kvfree(tchan);
>> +		return -ENOMEM;
>> +	}
>> +
>> +	thread->info = info;
>> +	thread->chan = tchan->chan;
>> +	switch (channel) {
>> +	case EDMA_CH_WR:
>> +		thread->direction = DMA_DEV_TO_MEM;
>> +		break;
>> +	case EDMA_CH_RD:
>> +		thread->direction = DMA_MEM_TO_DEV;
>> +		break;
>> +	default:
>> +		kvfree(tchan);
> 
> You are leaking thread here.

Yes, indeed.

> 
>> +		return -EPERM;
>> +	}
>> +	thread->test_done.wait = &thread->done_wait;
>> +	init_waitqueue_head(&thread->done_wait);
>> +
>> +	if (!params->repetitions)
>> +		thread->task = kthread_create(dw_edma_test_sg, thread, "%s",
>> +					      dma_chan_name(chan));
>> +	else
>> +		thread->task = kthread_create(dw_edma_test_cyclic, thread, "%s",
>> +					      dma_chan_name(chan));
>> +
>> +	if (IS_ERR(thread->task)) {
>> +		pr_err("failed to create thread %s\n", dma_chan_name(chan));
>> +		kvfree(tchan);
>> +		kvfree(thread);
>> +		return -EPERM;
>> +	}
>> +
>> +	tchan->thread = thread;
>> +	dev_dbg(chan->device->dev, "add thread %s\n", dma_chan_name(chan));
>> +	list_add_tail(&tchan->node, &info->channels);
>> +
>> +	return 0;
>> +}
>> +

Thanks!


* Re: [RFC v3 2/7] dmaengine: Add Synopsys eDMA IP version 0 support
  2019-01-16 10:33   ` Jose Abreu
@ 2019-01-16 14:02     ` Gustavo Pimentel
  0 siblings, 0 replies; 39+ messages in thread
From: Gustavo Pimentel @ 2019-01-16 14:02 UTC (permalink / raw)
  To: Jose Abreu, Gustavo Pimentel, linux-pci, dmaengine
  Cc: Vinod Koul, Dan Williams, Eugeniy Paltsev, Andy Shevchenko,
	Russell King, Niklas Cassel, Joao Pinto, Luis Oliveira,
	Vitor Soares, Nelson Costa, Pedro Sousa

Hi Jose,

On 16/01/2019 10:33, Jose Abreu wrote:
> Hi Gustavo,
> 
> On 1/11/2019 6:33 PM, Gustavo Pimentel wrote:
>> Add support for the eDMA IP version 0 driver for both register maps (legacy
>> and unroll).
>>
>> The legacy register mapping was the initial implementation, which consisted
>> of having all channel registers multiplexed, switchable at any time (which
>> could lead to a race condition) through a viewport register (only one
>> channel's registers accessible at a time).
>>
>> This register mapping is not very efficient in a multithreaded environment,
>> which led to the development of the unroll register mapping, where all
>> channel registers are accessible at any time, each channel's registers laid
>> out at a fixed offset from the previous one's.
>>
>> This version supports a maximum of 16 independent channels (8 write +
>> 8 read), which can run simultaneously.
>>
>> Implements scatter-gather transfers through a linked list, where the size
>> of the linked list depends on the allocated memory, divided equally among
>> all channels.
>>
>> Each linked list descriptor can transfer from 1 byte to 4 Gbytes and is
>> DWORD-aligned.
>>
>> Both SAR (Source Address Register) and DAR (Destination Address Register)
>> are byte-aligned.
>>
>> Changes:
>> RFC v1->RFC v2:
>>  - Replace comments // (C99 style) by /**/
>>  - Replace magic numbers by defines
>>  - Replace boolean return from ternary operation by a double negation
>>    operation
>>  - Replace QWORD_HI/QWORD_LO macros by upper_32_bits()/lower_32_bits()
>>  - Fix the headers of the .c and .h files according to the most recent
>>    convention
>>  - Fix errors and checks pointed out by checkpatch with --strict option
>>  - Replace patch small description tag from dma by dmaengine
>>  - Refactor code to replace atomic_t by u32 variable type
>> RFC v2->RFC v3:
>>  - Code rewrite to use FIELD_PREP() and FIELD_GET()
>>  - Add define to magic numbers
>>  - Fix minor bugs
>>
>> Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
>> Cc: Vinod Koul <vkoul@kernel.org>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Eugeniy Paltsev <paltsev@synopsys.com>
>> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
>> Cc: Russell King <rmk+kernel@armlinux.org.uk>
>> Cc: Niklas Cassel <niklas.cassel@linaro.org>
>> Cc: Joao Pinto <jpinto@synopsys.com>
>> Cc: Jose Abreu <jose.abreu@synopsys.com>
>> Cc: Luis Oliveira <lolivei@synopsys.com>
>> Cc: Vitor Soares <vitor.soares@synopsys.com>
>> Cc: Nelson Costa <nelson.costa@synopsys.com>
>> Cc: Pedro Sousa <pedrom.sousa@synopsys.com>
> 
>> +void dw_edma_v0_core_start(struct dw_edma_chunk *chunk, bool first)
>> +{
>> +	struct dw_edma_chan *chan = chunk->chan;
>> +	struct dw_edma *dw = chan->chip->dw;
>> +	u32 tmp;
>> +	u64 llp;
>> +
>> +	dw_edma_v0_core_write_chunk(chunk);
>> +
>> +	if (first) {
>> +		/* Enable engine */
>> +		SET_RW(dw, chan->dir, engine_en, BIT(0));
>> +		/* Interrupt unmask - done, abort */
>> +		tmp = GET_RW(dw, chan->dir, int_mask);
>> +		tmp &= ~FIELD_PREP(EDMA_V0_DONE_INT_MASK, BIT(chan->id));
>> +		tmp &= ~FIELD_PREP(EDMA_V0_ABORT_INT_MASK, BIT(chan->id));
>> +		SET_RW(dw, chan->dir, int_mask, tmp);
>> +		/* Linked list error */
>> +		tmp = GET_RW(dw, chan->dir, linked_list_err_en);
>> +		tmp |= FIELD_PREP(EDMA_V0_LINKED_LIST_ERR_MASK, BIT(chan->id));
>> +		SET_RW(dw, chan->dir, linked_list_err_en, tmp);
>> +		/* Channel control */
>> +		SET_CH(dw, chan->dir, chan->id, ch_control1,
>> +		       (DW_EDMA_V0_CCS | DW_EDMA_V0_LLE));
>> +		/* Linked list - low, high */
>> +		llp = cpu_to_le64(chunk->ll_region.paddr);
>> +		SET_CH(dw, chan->dir, chan->id, llp_low, lower_32_bits(llp));
>> +		SET_CH(dw, chan->dir, chan->id, llp_high, upper_32_bits(llp));
>> +	}
>> +	/* Doorbell */
> 
> Not sure if DMA subsystem does this but maybe you need some kind
> of barrier to ensure everything is coherent before granting
> control to eDMA ?

Each readl() and writel() inside the helper macros (SET_RW/GET_RW and
SET_CH/GET_CH) has a memory barrier that guarantees coherency; however,
this is platform dependent.

> 
>> +	SET_RW(dw, chan->dir, doorbell,
>> +	       FIELD_PREP(EDMA_V0_DOORBELL_CH_MASK, chan->id));
>> +}
>> +

Thanks!


* Re: [RFC v3 7/7] dmaengine: Add Synopsys eDMA IP test and sample driver
  2019-01-15 13:02         ` Gustavo Pimentel
@ 2019-01-17  5:03           ` Vinod Koul
  2019-01-21 15:59             ` Gustavo Pimentel
  0 siblings, 1 reply; 39+ messages in thread
From: Vinod Koul @ 2019-01-17  5:03 UTC (permalink / raw)
  To: Gustavo Pimentel
  Cc: Andy Shevchenko, linux-pci, dmaengine, Dan Williams,
	Eugeniy Paltsev, Russell King, Niklas Cassel, Joao Pinto,
	Jose Abreu, Luis Oliveira, Vitor Soares, Nelson Costa,
	Pedro Sousa

On 15-01-19, 13:02, Gustavo Pimentel wrote:
> On 15/01/2019 05:45, Andy Shevchenko wrote:
> > On Mon, Jan 14, 2019 at 11:44:22AM +0000, Gustavo Pimentel wrote:
> >> On 11/01/2019 19:48, Andy Shevchenko wrote:
> >>> On Fri, Jan 11, 2019 at 07:33:43PM +0100, Gustavo Pimentel wrote:
> >>>> Add a Synopsys eDMA IP test and sample driver to be used for testing
> >>>> purposes and also as a reference for any developer who needs to
> >>>> implement and use the Synopsys eDMA.
> >>>>
> >>>> This driver can be compiled as a built-in or as an external module.
> >>>>
> >>>> To enable this driver just select the DW_EDMA_TEST option in the kernel
> >>>> configuration; it requires and automatically selects the DW_EDMA
> >>>> option too.
> >>>>
> >>>
> >>> Hmm... This doesn't explain what's wrong with dmatest module.
> >>
> >> There isn't anything wrong with the dmatest module, that I know of. In the
> >> beginning I was planning to use it, however it only works with MEM_TO_MEM
> >> transfers; that's why I created a similar module, but for MEM_TO_DEV and
> >> DEV_TO_MEM with scatter-gather and cyclic transfer types for my use case.
> >> I don't know if it can be applied to other cases; if that is feasible, I'm
> >> glad to share it.
> > 
> > What I'm trying to tell is that the dmatest driver would be nice to have such
> > capability.
> > 
> 
> At the beginning I thought to add those features to the dmatest module, but
> because of several points, such as:
>  - I didn't want, by any chance, to break the dmatest module while
> implementing the support for those new features;
>  - Since I'm still gaining knowledge about this subsystem, I preferred to do
> it in a more controlled environment;
>  - Since this driver is also a reference driver, aimed at teaching someone
> how to use the dmaengine API, I preferred to keep it simple.
> 
> Maybe in the future both modules could be merged into just one tool, but for
> now I would prefer to keep them separate, especially to fix any immaturity
> errors of this module.

With decent testing we should be able to iron out kinks in dmatest
module. It gets tested for memcpy controllers so we should easily catch
issues.

That said, it would make sense to add support in the generic dmatest so that
other folks can reuse and contribute as well. Future work always gets
side-tracked by life, so let's do it now :)


-- 
~Vinod


* Re: [RFC v3 5/7] dmaengine: Add Synopsys eDMA IP PCIe glue-logic
  2019-01-15 12:48         ` Gustavo Pimentel
@ 2019-01-19 15:45           ` Andy Shevchenko
  2019-01-21  9:21             ` Gustavo Pimentel
  0 siblings, 1 reply; 39+ messages in thread
From: Andy Shevchenko @ 2019-01-19 15:45 UTC (permalink / raw)
  To: Gustavo Pimentel
  Cc: linux-pci, dmaengine, Vinod Koul, Dan Williams, Eugeniy Paltsev,
	Russell King, Niklas Cassel, Lorenzo Pieralisi, Joao Pinto,
	Jose Abreu, Luis Oliveira, Vitor Soares, Nelson Costa,
	Pedro Sousa

On Tue, Jan 15, 2019 at 12:48:34PM +0000, Gustavo Pimentel wrote:
> On 15/01/2019 05:43, Andy Shevchenko wrote:
> > On Mon, Jan 14, 2019 at 11:38:02AM +0000, Gustavo Pimentel wrote:
> >> On 11/01/2019 19:47, Andy Shevchenko wrote:
> >>> On Fri, Jan 11, 2019 at 07:33:41PM +0100, Gustavo Pimentel wrote:

> >>>> +static bool disable_msix;
> >>>> +module_param(disable_msix, bool, 0644);
> >>>> +MODULE_PARM_DESC(disable_msix, "Disable MSI-X interrupts");
> >>>
> >>> Why?!
> >>> We do not allow new module parameters without very strong arguments.
> >>
> >> Since this is a reference driver and might be used to test customized HW
> >> solutions, I added this parameter to allow the possibility of testing the
> >> solution while forcing the MSI feature binding. This is required especially
> >> if whoever tests this solution has a Root Complex with both features
> >> available (MSI and MSI-X), because the kernel will always give preference
> >> to the MSI-X binding (assuming that the EP also has both features
> >> available).
> > 
> > Yes, you may do it for testing purposes, but it doesn't fit the kernel standards.
> 
> Ok, but how should I proceed? May I leave it, or substitute it with another
> way of doing it? If so, how?
> As I said, the intent is for it to be used only in this test case; in normal
> operation the parameter should always be false.

Keep out-of-tree patch for your needs.

> >>>> +	if (!pdata) {
> >>>> +		dev_err(dev, "%s missing data structure\n", pci_name(pdev));
> >>>> +		return -EFAULT;
> >>>> +	}
> >>>
> >>> Useless check.
> >>
> >> Why? It's just a precaution; isn't it good practice to always think of the
> >> worst case?
> > 
> > You can just put an implicit requirement on pdata rather than doing this
> 
> Ok, how can I do it? What I should add to the code to force that?

Not adding, removing. That's what I put before.

> 
> > useless check. I don't believe it would make sense to have NULL pdata for the
> > driver since it wouldn't be functional anyhow.
> 
> Yes, you're right without pdata the driver can't do anything.

-- 
With Best Regards,
Andy Shevchenko




* Re: [RFC v3 1/7] dmaengine: Add Synopsys eDMA IP core driver
  2019-01-16 11:53     ` Gustavo Pimentel
@ 2019-01-19 16:21       ` Andy Shevchenko
  2019-01-21  9:14         ` Gustavo Pimentel
  2019-01-20 11:47       ` Vinod Koul
  1 sibling, 1 reply; 39+ messages in thread
From: Andy Shevchenko @ 2019-01-19 16:21 UTC (permalink / raw)
  To: Gustavo Pimentel
  Cc: Jose Abreu, linux-pci, dmaengine, Vinod Koul, Dan Williams,
	Eugeniy Paltsev, Russell King, Niklas Cassel, Joao Pinto,
	Luis de Oliveira, Vitor Soares, Nelson Costa, Pedro Sousa

On Wed, Jan 16, 2019 at 11:53:00AM +0000, Gustavo Pimentel wrote:
> On 16/01/2019 10:21, Jose Abreu wrote:
> > On 1/11/2019 6:33 PM, Gustavo Pimentel wrote:

> >> +	/* Free irqs */
> > 
> > But devm will automatically free it, no ?
> 
> Yes, it shouldn't be there. I'll remove it.

It depends. If the driver is using tasklets it must free the IRQs explicitly like this.

P.S. devm_*_irq() is quite doubtful to have in the first place.
	
-- 
With Best Regards,
Andy Shevchenko




* Re: [RFC v3 1/7] dmaengine: Add Synopsys eDMA IP core driver
  2019-01-11 18:33 ` [RFC v3 1/7] dmaengine: Add Synopsys eDMA IP core driver Gustavo Pimentel
  2019-01-16 10:21   ` Jose Abreu
@ 2019-01-20 11:44   ` Vinod Koul
  2019-01-21 15:48     ` Gustavo Pimentel
  1 sibling, 1 reply; 39+ messages in thread
From: Vinod Koul @ 2019-01-20 11:44 UTC (permalink / raw)
  To: Gustavo Pimentel
  Cc: linux-pci, dmaengine, Dan Williams, Eugeniy Paltsev,
	Andy Shevchenko, Russell King, Niklas Cassel, Joao Pinto,
	Jose Abreu, Luis Oliveira, Vitor Soares, Nelson Costa,
	Pedro Sousa

On 11-01-19, 19:33, Gustavo Pimentel wrote:
> Add Synopsys eDMA IP core driver to the kernel.
> 
> This core driver initializes and configures the eDMA IP using virt-dma
> helper functions and the dmaengine subsystem.

A description of eDMA IP will help review the driver

> Also creates an abstraction layer through callbacks, allowing different
> register mappings in the future, organized into versions.
> 
> This driver can be compiled as a built-in or as an external module.
> 
> To enable this driver just select the DW_EDMA option in the kernel
> configuration; it requires and automatically selects the DMA_ENGINE and
> DMA_VIRTUAL_CHANNELS options too.
> 
> Changes:
> RFC v1->RFC v2:

These do not belong in the change log; either move them to the cover letter or
keep them after the s-o-b lines..

> @@ -0,0 +1,1059 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2018 Synopsys, Inc. and/or its affiliates.

2019 now

> +static struct dw_edma_burst *dw_edma_alloc_burst(struct dw_edma_chunk *chunk)
> +{
> +	struct dw_edma_chan *chan = chunk->chan;
> +	struct dw_edma_burst *burst;
> +
> +	burst = kvzalloc(sizeof(*burst), GFP_NOWAIT);

Is there a specific reason for kvzalloc()? Do you submit this to HW..

> +static struct dw_edma_chunk *dw_edma_alloc_chunk(struct dw_edma_desc *desc)
> +{
> +	struct dw_edma_chan *chan = desc->chan;
> +	struct dw_edma *dw = chan->chip->dw;
> +	struct dw_edma_chunk *chunk;
> +
> +	chunk = kvzalloc(sizeof(*chunk), GFP_NOWAIT);
> +	if (unlikely(!chunk))
> +		return NULL;
> +
> +	INIT_LIST_HEAD(&chunk->list);
> +	chunk->chan = chan;
> +	chunk->cb = !(desc->chunks_alloc % 2);

cb ..?

> +static int dw_edma_device_config(struct dma_chan *dchan,
> +				 struct dma_slave_config *config)
> +{
> +	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
> +	const struct dw_edma_core_ops *ops = chan2ops(chan);
> +	unsigned long flags;
> +	int err = 0;
> +
> +	spin_lock_irqsave(&chan->vc.lock, flags);
> +
> +	if (!config) {
> +		err = -EINVAL;
> +		goto err_config;
> +	}
> +
> +	if (chan->status != EDMA_ST_IDLE) {
> +		dev_err(chan2dev(chan), "channel is busy or paused\n");
> +		err = -EPERM;

this is not correct behaviour; device_config can be called at any time and
the values take effect on the next transaction submitted..

> +		goto err_config;
> +	}
> +
> +	dev_dbg(chan2dev(chan), "addr(physical) src=%pa, dst=%pa\n",
> +		&config->src_addr, &config->dst_addr);
> +
> +	chan->src_addr = config->src_addr;
> +	chan->dst_addr = config->dst_addr;
> +
> +	err = ops->device_config(dchan);

what does this do?

> +static enum dma_status
> +dw_edma_device_tx_status(struct dma_chan *dchan, dma_cookie_t cookie,
> +			 struct dma_tx_state *txstate)
> +{
> +	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
> +	const struct dw_edma_core_ops *ops = chan2ops(chan);
> +	unsigned long flags;
> +	enum dma_status ret;
> +
> +	spin_lock_irqsave(&chan->vc.lock, flags);
> +
> +	ret = ops->ch_status(chan);
> +	if (ret == DMA_ERROR) {
> +		goto ret_status;
> +	} else if (ret == DMA_IN_PROGRESS) {
> +		chan->status = EDMA_ST_BUSY;
> +		goto ret_status;

so in this case you set residue as zero, which is not correct

> +	} else {
> +		/* DMA_COMPLETE */
> +		if (chan->status == EDMA_ST_PAUSE)
> +			ret = DMA_PAUSED;
> +		else if (chan->status == EDMA_ST_BUSY)
> +			ret = DMA_IN_PROGRESS;

?? if txn is complete how are you returning DMA_IN_PROGRESS?

> +		else
> +			ret = DMA_COMPLETE;
> +	}

this looks like an incorrect interpretation to me. The status is to be
retrieved for the given cookie passed, and the fact that you do not even use
this argument tells me that you have understood this as 'channel' status
reporting, which is not correct

> +static struct dma_async_tx_descriptor *
> +dw_edma_device_prep_slave_sg(struct dma_chan *dchan, struct scatterlist *sgl,
> +			     unsigned int sg_len,
> +			     enum dma_transfer_direction direction,
> +			     unsigned long flags, void *context)
> +{
> +	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
> +	struct dw_edma_desc *desc;
> +	struct dw_edma_chunk *chunk;
> +	struct dw_edma_burst *burst;
> +	struct scatterlist *sg;
> +	unsigned long sflags;
> +	phys_addr_t src_addr;
> +	phys_addr_t dst_addr;
> +	int i;
> +
> +	if (sg_len < 1) {
> +		dev_err(chan2dev(chan), "invalid sg length %u\n", sg_len);
> +		return NULL;
> +	}
> +
> +	if (direction == DMA_DEV_TO_MEM && chan->dir == EDMA_DIR_WRITE) {

what is the second part of the check, can you explain that, who sets
chan->dir?

> +		dev_dbg(chan2dev(chan),	"prepare operation (WRITE)\n");
> +	} else if (direction == DMA_MEM_TO_DEV && chan->dir == EDMA_DIR_READ) {
> +		dev_dbg(chan2dev(chan),	"prepare operation (READ)\n");
> +	} else {
> +		dev_err(chan2dev(chan), "invalid direction\n");
> +		return NULL;
> +	}
> +
> +	if (!chan->configured) {
> +		dev_err(chan2dev(chan), "(prep_slave_sg) channel not configured\n");
> +		return NULL;
> +	}
> +
> +	if (chan->status != EDMA_ST_IDLE) {
> +		dev_err(chan2dev(chan), "channel is busy or paused\n");
> +		return NULL;
> +	}

No, wrong again. The txn must be prepared and then, on submit, added to a
queue. You are writing a driver for dmaengine; surely you don't expect
the channel to be free before you can do a txn.. that would be very
inefficient!

> +static struct dma_async_tx_descriptor *
> +dw_edma_device_prep_dma_cyclic(struct dma_chan *dchan, dma_addr_t buf_addr,
> +			       size_t len, size_t cyclic_cnt,
> +			       enum dma_transfer_direction direction,
> +			       unsigned long flags)
> +{
> +	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
> +	struct dw_edma_desc *desc;
> +	struct dw_edma_chunk *chunk;
> +	struct dw_edma_burst *burst;
> +	unsigned long sflags;
> +	phys_addr_t src_addr;
> +	phys_addr_t dst_addr;
> +	u32 i;
> +
> +	if (!len || !cyclic_cnt) {
> +		dev_err(chan2dev(chan), "invalid len or cyclic count\n");
> +		return NULL;
> +	}
> +
> +	if (direction == DMA_DEV_TO_MEM && chan->dir == EDMA_DIR_WRITE) {
> +		dev_dbg(chan2dev(chan),	"prepare operation (WRITE)\n");
> +	} else if (direction == DMA_MEM_TO_DEV && chan->dir == EDMA_DIR_READ) {
> +		dev_dbg(chan2dev(chan),	"prepare operation (READ)\n");
> +	} else {
> +		dev_err(chan2dev(chan), "invalid direction\n");
> +		return NULL;
> +	}
> +
> +	if (!chan->configured) {
> +		dev_err(chan2dev(chan), "(prep_dma_cyclic) channel not configured\n");
> +		return NULL;
> +	}
> +
> +	if (chan->status != EDMA_ST_IDLE) {
> +		dev_err(chan2dev(chan), "channel is busy or paused\n");
> +		return NULL;
> +	}
> +
> +	spin_lock_irqsave(&chan->vc.lock, sflags);
> +
> +	desc = dw_edma_alloc_desc(chan);
> +	if (unlikely(!desc))
> +		goto err_alloc;
> +
> +	chunk = dw_edma_alloc_chunk(desc);
> +	if (unlikely(!chunk))
> +		goto err_alloc;
> +
> +	src_addr = chan->src_addr;
> +	dst_addr = chan->dst_addr;
> +
> +	for (i = 0; i < cyclic_cnt; i++) {
> +		if (chunk->bursts_alloc == chan->ll_max) {
> +			chunk = dw_edma_alloc_chunk(desc);
> +			if (unlikely(!chunk))
> +				goto err_alloc;
> +		}
> +
> +		burst = dw_edma_alloc_burst(chunk);
> +
> +		if (unlikely(!burst))
> +			goto err_alloc;
> +
> +		burst->sz = len;
> +		chunk->ll_region.sz += burst->sz;
> +		desc->alloc_sz += burst->sz;
> +
> +		burst->sar = src_addr;
> +		burst->dar = dst_addr;
> +
> +		dev_dbg(chan2dev(chan), "lli %u/%u, sar=0x%.8llx, dar=0x%.8llx, size=%u bytes\n",
> +			i + 1, cyclic_cnt, burst->sar, burst->dar, burst->sz);
> +	}
> +
> +	spin_unlock_irqrestore(&chan->vc.lock, sflags);
> +	return vchan_tx_prep(&chan->vc, &desc->vd, flags);

how is it different from the previous one? I am sure we can reuse bits!

> +static void dw_edma_done_interrupt(struct dw_edma_chan *chan)
> +{
> +	struct dw_edma *dw = chan->chip->dw;
> +	const struct dw_edma_core_ops *ops = dw->ops;
> +	struct virt_dma_desc *vd;
> +	struct dw_edma_desc *desc;
> +	unsigned long flags;
> +
> +	ops->clear_done_int(chan);
> +	dev_dbg(chan2dev(chan), "clear done interrupt\n");
> +
> +	spin_lock_irqsave(&chan->vc.lock, flags);
> +	vd = vchan_next_desc(&chan->vc);
> +	switch (chan->request) {
> +	case EDMA_REQ_NONE:
> +		if (!vd)
> +			break;
> +
> +		desc = vd2dw_edma_desc(vd);
> +		if (desc->chunks_alloc) {
> +			dev_dbg(chan2dev(chan), "sub-transfer complete\n");
> +			chan->status = EDMA_ST_BUSY;
> +			dev_dbg(chan2dev(chan), "transferred %u bytes\n",
> +				desc->xfer_sz);
> +			dw_edma_start_transfer(chan);
> +		} else {
> +			list_del(&vd->node);
> +			vchan_cookie_complete(vd);
> +			chan->status = EDMA_ST_IDLE;
> +			dev_dbg(chan2dev(chan), "transfer complete\n");
> +		}
> +		break;
> +	case EDMA_REQ_STOP:
> +		if (!vd)
> +			break;
> +
> +		list_del(&vd->node);
> +		vchan_cookie_complete(vd);
> +		chan->request = EDMA_REQ_NONE;
> +		chan->status = EDMA_ST_IDLE;
> +		chan->configured = false;

why is configuration deemed invalid, it can still be used again!

> +int dw_edma_probe(struct dw_edma_chip *chip)
> +{
> +	struct dw_edma *dw = chip->dw;
> +	struct device *dev = chip->dev;
> +	const struct dw_edma_core_ops *ops;
> +	size_t ll_chunk = dw->ll_region.sz;
> +	size_t dt_chunk = dw->dt_region.sz;
> +	u32 ch_tot;
> +	int i, j, err;
> +
> +	raw_spin_lock_init(&dw->lock);
> +
> +	/* Callback operation selection accordingly to eDMA version */
> +	switch (dw->version) {
> +	default:
> +		dev_err(dev, "unsupported version\n");
> +		return -EPERM;
> +	}

So we have only one case which returns error, was this code tested?

> +
> +	pm_runtime_get_sync(dev);
> +
> +	/* Find out how many write channels are supported by hardware */
> +	dw->wr_ch_cnt = ops->ch_count(dw, EDMA_DIR_WRITE);
> +	if (!dw->wr_ch_cnt) {
> +		dev_err(dev, "invalid number of write channels(0)\n");
> +		return -EINVAL;
> +	}
> +
> +	/* Find out how many read channels are supported by hardware */
> +	dw->rd_ch_cnt = ops->ch_count(dw, EDMA_DIR_READ);
> +	if (!dw->rd_ch_cnt) {
> +		dev_err(dev, "invalid number of read channels(0)\n");
> +		return -EINVAL;
> +	}
> +
> +	dev_dbg(dev, "Channels:\twrite=%d, read=%d\n",
> +		dw->wr_ch_cnt, dw->rd_ch_cnt);
> +
> +	ch_tot = dw->wr_ch_cnt + dw->rd_ch_cnt;
> +
> +	/* Allocate channels */
> +	dw->chan = devm_kcalloc(dev, ch_tot, sizeof(*dw->chan), GFP_KERNEL);

you may use struct_size() here

> +	if (!dw->chan)
> +		return -ENOMEM;
> +
> +	/* Calculate the linked list chunk for each channel */
> +	ll_chunk /= roundup_pow_of_two(ch_tot);
> +
> +	/* Calculate the linked list chunk for each channel */
> +	dt_chunk /= roundup_pow_of_two(ch_tot);
> +
> +	/* Disable eDMA, only to establish the ideal initial conditions */
> +	ops->off(dw);
> +
> +	snprintf(dw->name, sizeof(dw->name), "dw-edma-core:%d", chip->id);
> +
> +	/* Request IRQs */
> +	if (dw->nr_irqs != 1) {
> +		dev_err(dev, "invalid number of irqs (%u)\n", dw->nr_irqs);
> +		return -EINVAL;
> +	}
> +
> +	for (i = 0; i < dw->nr_irqs; i++) {
> +		err = devm_request_irq(dev, pci_irq_vector(to_pci_dev(dev), i),
> +				       dw_edma_interrupt_all,
> +				       IRQF_SHARED, dw->name, chip);
> +		if (err)
> +			return err;
> +	}
> +
> +	/* Create write channels */
> +	INIT_LIST_HEAD(&dw->wr_edma.channels);
> +	for (i = 0; i < dw->wr_ch_cnt; i++) {
> +		struct dw_edma_chan *chan = &dw->chan[i];
> +		struct dw_edma_region *dt_region;
> +
> +		dt_region = devm_kzalloc(dev, sizeof(*dt_region), GFP_KERNEL);
> +		if (!dt_region)
> +			return -ENOMEM;
> +
> +		chan->vc.chan.private = dt_region;
> +
> +		chan->chip = chip;
> +		chan->id = i;
> +		chan->dir = EDMA_DIR_WRITE;
> +		chan->configured = false;
> +		chan->request = EDMA_REQ_NONE;
> +		chan->status = EDMA_ST_IDLE;
> +
> +		chan->ll_off = (ll_chunk * i);
> +		chan->ll_max = (ll_chunk / EDMA_LL_SZ) - 1;
> +
> +		chan->dt_off = (dt_chunk * i);
> +
> +		dev_dbg(dev, "L. List:\tChannel write[%u] off=0x%.8lx, max_cnt=%u\n",
> +			i, chan->ll_off, chan->ll_max);
> +
> +		memcpy(&chan->msi, &dw->msi[0], sizeof(chan->msi));
> +
> +		dev_dbg(dev, "MSI:\t\tChannel write[%u] addr=0x%.8x%.8x, data=0x%.8x\n",
> +			i, chan->msi.address_hi, chan->msi.address_lo,
> +			chan->msi.data);
> +
> +		chan->vc.desc_free = vchan_free_desc;
> +		vchan_init(&chan->vc, &dw->wr_edma);
> +
> +		dt_region->paddr = dw->dt_region.paddr + chan->dt_off;
> +		dt_region->vaddr = dw->dt_region.vaddr + chan->dt_off;
> +		dt_region->sz = dt_chunk;
> +
> +		dev_dbg(dev, "Data:\tChannel write[%u] off=0x%.8lx\n",
> +			i, chan->dt_off);
> +	}
> +	dma_cap_zero(dw->wr_edma.cap_mask);
> +	dma_cap_set(DMA_SLAVE, dw->wr_edma.cap_mask);
> +	dma_cap_set(DMA_CYCLIC, dw->wr_edma.cap_mask);
> +	dw->wr_edma.directions = BIT(DMA_DEV_TO_MEM);
> +	dw->wr_edma.chancnt = dw->wr_ch_cnt;
> +
> +	/* Create read channels */
> +	INIT_LIST_HEAD(&dw->rd_edma.channels);
> +	for (j = 0; j < dw->rd_ch_cnt; j++, i++) {
> +		struct dw_edma_chan *chan = &dw->chan[i];
> +		struct dw_edma_region *dt_region;
> +
> +		dt_region = devm_kzalloc(dev, sizeof(*dt_region), GFP_KERNEL);
> +		if (!dt_region)
> +			return -ENOMEM;
> +
> +		chan->vc.chan.private = dt_region;
> +
> +		chan->chip = chip;
> +		chan->id = j;
> +		chan->dir = EDMA_DIR_READ;
> +		chan->configured = false;
> +		chan->request = EDMA_REQ_NONE;
> +		chan->status = EDMA_ST_IDLE;
> +
> +		chan->ll_off = (ll_chunk * i);
> +		chan->ll_max = (ll_chunk / EDMA_LL_SZ) - 1;
> +
> +		chan->dt_off = (dt_chunk * i);
> +
> +		dev_dbg(dev, "L. List:\tChannel read[%u] off=0x%.8lx, max_cnt=%u\n",
> +			j, chan->ll_off, chan->ll_max);
> +
> +		memcpy(&chan->msi, &dw->msi[0], sizeof(chan->msi));
> +
> +		dev_dbg(dev, "MSI:\t\tChannel read[%u] addr=0x%.8x%.8x, data=0x%.8x\n",
> +			j, chan->msi.address_hi, chan->msi.address_lo,
> +			chan->msi.data);
> +
> +		chan->vc.desc_free = vchan_free_desc;
> +		vchan_init(&chan->vc, &dw->rd_edma);
> +
> +		dt_region->paddr = dw->dt_region.paddr + chan->dt_off;
> +		dt_region->vaddr = dw->dt_region.vaddr + chan->dt_off;
> +		dt_region->sz = dt_chunk;
> +
> +		dev_dbg(dev, "Data:\tChannel read[%u] off=0x%.8lx\n",
> +			i, chan->dt_off);

this is similar to the previous one with obvious changes; I think this can be
made a common routine invoked for both R and W ...

So HW provides read channels and write channels and not generic channels
which can be used for either R or W?
-- 
~Vinod


* Re: [RFC v3 1/7] dmaengine: Add Synopsys eDMA IP core driver
  2019-01-16 11:53     ` Gustavo Pimentel
  2019-01-19 16:21       ` Andy Shevchenko
@ 2019-01-20 11:47       ` Vinod Koul
  2019-01-21 15:49         ` Gustavo Pimentel
  1 sibling, 1 reply; 39+ messages in thread
From: Vinod Koul @ 2019-01-20 11:47 UTC (permalink / raw)
  To: Gustavo Pimentel
  Cc: Jose Abreu, linux-pci, dmaengine, Dan Williams, Eugeniy Paltsev,
	Andy Shevchenko, Russell King, Niklas Cassel, Joao Pinto,
	Luis de Oliveira, Vitor Soares, Nelson Costa, Pedro Sousa

On 16-01-19, 11:53, Gustavo Pimentel wrote:

> >> +	/* Free irqs */
> > 
> > But devm will automatically free it, no ?
> 
> Yes, it shouldn't be there. I'll remove it.

Nope, this is correct. We need to ensure IRQs are freed and tasklets are
quiesced, otherwise you can end up with a case where a spurious interrupt
fires and tasklets keep spinning while the core unrolls. devm for irq is not
a good idea unless you really know what you are doing!

-- 
~Vinod


* Re: [RFC v3 1/7] dmaengine: Add Synopsys eDMA IP core driver
  2019-01-19 16:21       ` Andy Shevchenko
@ 2019-01-21  9:14         ` Gustavo Pimentel
  0 siblings, 0 replies; 39+ messages in thread
From: Gustavo Pimentel @ 2019-01-21  9:14 UTC (permalink / raw)
  To: Andy Shevchenko, Gustavo Pimentel
  Cc: Jose Abreu, linux-pci, dmaengine, Vinod Koul, Dan Williams,
	Eugeniy Paltsev, Russell King, Niklas Cassel, Joao Pinto,
	Luis de Oliveira, Vitor Soares, Nelson Costa, Pedro Sousa

On 19/01/2019 16:21, Andy Shevchenko wrote:
> On Wed, Jan 16, 2019 at 11:53:00AM +0000, Gustavo Pimentel wrote:
>> On 16/01/2019 10:21, Jose Abreu wrote:
>>> On 1/11/2019 6:33 PM, Gustavo Pimentel wrote:
> 
>>>> +	/* Free irqs */
>>>
>>> But devm will automatically free it, no ?
>>
>> Yes, it shouldn't be there. I'll remove it.
> 
> It depends. If the driver is using tasklets it must free it explicitly like this.

In this driver I'm not using tasklets.

> 
> P.S. devm_*_irq() is quite doubtful to have in the first place.

For curiosity, why?

> 	
> 



* Re: [RFC v3 5/7] dmaengine: Add Synopsys eDMA IP PCIe glue-logic
  2019-01-19 15:45           ` Andy Shevchenko
@ 2019-01-21  9:21             ` Gustavo Pimentel
  0 siblings, 0 replies; 39+ messages in thread
From: Gustavo Pimentel @ 2019-01-21  9:21 UTC (permalink / raw)
  To: Andy Shevchenko, Gustavo Pimentel
  Cc: linux-pci, dmaengine, Vinod Koul, Dan Williams, Eugeniy Paltsev,
	Russell King, Niklas Cassel, Lorenzo Pieralisi, Joao Pinto,
	Jose Abreu, Luis Oliveira, Vitor Soares, Nelson Costa,
	Pedro Sousa

On 19/01/2019 15:45, Andy Shevchenko wrote:
> On Tue, Jan 15, 2019 at 12:48:34PM +0000, Gustavo Pimentel wrote:
>> On 15/01/2019 05:43, Andy Shevchenko wrote:
>>> On Mon, Jan 14, 2019 at 11:38:02AM +0000, Gustavo Pimentel wrote:
>>>> On 11/01/2019 19:47, Andy Shevchenko wrote:
>>>>> On Fri, Jan 11, 2019 at 07:33:41PM +0100, Gustavo Pimentel wrote:
> 
>>>>>> +static bool disable_msix;
>>>>>> +module_param(disable_msix, bool, 0644);
>>>>>> +MODULE_PARM_DESC(disable_msix, "Disable MSI-X interrupts");
>>>>>
>>>>> Why?!
>>>>> We are no allow new module parameters without very strong arguments.
>>>>
>>>> Since this is a reference driver and might be used to test customized HW
>>>> solutions, I added this parameter to allow the possibility of testing the
>>>> solution while forcing the MSI feature binding. This is required especially
>>>> if whoever tests this solution has a Root Complex with both features
>>>> available (MSI and MSI-X), because the kernel will always give preference
>>>> to the MSI-X binding (assuming that the EP also has both features
>>>> available).
>>>
>>> Yes, you may do it for testing purposes, but it doesn't fit the kernel standards.
>>
>> Ok, but how should I proceed? May I leave it, or substitute it with another
>> way of doing it? If so, how?
>> As I said, the intent is for it to be used only in this test case; in normal
>> operation the parameter should always be false.
> 
> Keep out-of-tree patch for your needs.

Ok.

> 
>>>>>> +	if (!pdata) {
>>>>>> +		dev_err(dev, "%s missing data structure\n", pci_name(pdev));
>>>>>> +		return -EFAULT;
>>>>>> +	}
>>>>>
>>>>> Useless check.
>>>>
>>>> Why? It's just a precaution; isn't it good practice to always think of the
>>>> worst case?
>>>
>>> You can just put an implicit requirement on pdata rather than doing this
>>
>> Ok, how can I do it? What I should add to the code to force that?
> 
> Not adding, removing. That's what I put before.

I thought there was some API or code design to force that. Sorry, my bad.

> 
>>
>>> useless check. I don't believe it would make sense to have NULL pdata for the
>>> driver since it wouldn't be functional anyhow.
>>
>> Yes, you're right without pdata the driver can't do anything.
> 



* Re: [RFC v3 1/7] dmaengine: Add Synopsys eDMA IP core driver
  2019-01-20 11:44   ` Vinod Koul
@ 2019-01-21 15:48     ` Gustavo Pimentel
  2019-01-23 13:08       ` Vinod Koul
  0 siblings, 1 reply; 39+ messages in thread
From: Gustavo Pimentel @ 2019-01-21 15:48 UTC (permalink / raw)
  To: Vinod Koul, Gustavo Pimentel
  Cc: linux-pci, dmaengine, Dan Williams, Eugeniy Paltsev,
	Andy Shevchenko, Russell King, Niklas Cassel, Joao Pinto,
	Jose Abreu, Luis Oliveira, Vitor Soares, Nelson Costa,
	Pedro Sousa

On 20/01/2019 11:44, Vinod Koul wrote:
> On 11-01-19, 19:33, Gustavo Pimentel wrote:
>> Add Synopsys eDMA IP core driver to kernel.
>>
>> This core driver, initializes and configures the eDMA IP using vma-helpers
>> functions and dma-engine subsystem.
> 
> A description of eDMA IP will help review the driver

I have the IP description in the cover letter, but I'll bring it into this
patch if it helps.

> 
>> Also creates an abstration layer through callbacks allowing different
>> registers mappings in the future, organized in to versions.
>>
>> This driver can be compile as built-in or external module in kernel.
>>
>> To enable this driver just select DW_EDMA option in kernel configuration,
>> however it requires and selects automatically DMA_ENGINE and
>> DMA_VIRTUAL_CHANNELS option too.
>>
>> Changes:
>> RFC v1->RFC v2:
> 
> These do not belong to change log, either move them to cover-letter or
> keep them after s-o-b lines..

Ok, Andy also mentioned that. I'll keep them after the s-o-b lines.

> 
>> @@ -0,0 +1,1059 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Copyright (c) 2018 Synopsys, Inc. and/or its affiliates.
> 
> 2019 now

I've changed it to "Copyright (c) 2018-present Synopsys, Inc. and/or its
affiliates."; that way it's always up to date. I also kept 2018, because that's
when I started developing this driver, if you don't mind.

> 
>> +static struct dw_edma_burst *dw_edma_alloc_burst(struct dw_edma_chunk *chunk)
>> +{
>> +	struct dw_edma_chan *chan = chunk->chan;
>> +	struct dw_edma_burst *burst;
>> +
>> +	burst = kvzalloc(sizeof(*burst), GFP_NOWAIT);
> 
> Is there a specific reason for kvzalloc(),  do you submit this to HW..

There is no reason for that; it's a leftover from an initial implementation.
I'll replace them with kzalloc().
Nicely spotted!

> 
>> +static struct dw_edma_chunk *dw_edma_alloc_chunk(struct dw_edma_desc *desc)
>> +{
>> +	struct dw_edma_chan *chan = desc->chan;
>> +	struct dw_edma *dw = chan->chip->dw;
>> +	struct dw_edma_chunk *chunk;
>> +
>> +	chunk = kvzalloc(sizeof(*chunk), GFP_NOWAIT);
>> +	if (unlikely(!chunk))
>> +		return NULL;
>> +
>> +	INIT_LIST_HEAD(&chunk->list);
>> +	chunk->chan = chan;
>> +	chunk->cb = !(desc->chunks_alloc % 2);
> 
> cb ..?

CB = change bit, a property of this eDMA IP. Basically it's a kind of
handshake which serves to validate whether the linked list has been updated or
not, especially useful when linked-list elements are recycled (every linked-list
recycle is a new chunk, so this allows each chunk to be differentiated).

> 
>> +static int dw_edma_device_config(struct dma_chan *dchan,
>> +				 struct dma_slave_config *config)
>> +{
>> +	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
>> +	const struct dw_edma_core_ops *ops = chan2ops(chan);
>> +	unsigned long flags;
>> +	int err = 0;
>> +
>> +	spin_lock_irqsave(&chan->vc.lock, flags);
>> +
>> +	if (!config) {
>> +		err = -EINVAL;
>> +		goto err_config;
>> +	}
>> +
>> +	if (chan->status != EDMA_ST_IDLE) {
>> +		dev_err(chan2dev(chan), "channel is busy or paused\n");
>> +		err = -EPERM;
> 
> this is not correct behaviour, device_config can be called anytime and
> values can take affect on next transaction submitted..

Hmm, I thought we could only reconfigure after the transfer finished.

> 
>> +		goto err_config;
>> +	}
>> +
>> +	dev_dbg(chan2dev(chan), "addr(physical) src=%pa, dst=%pa\n",
>> +		&config->src_addr, &config->dst_addr);
>> +
>> +	chan->src_addr = config->src_addr;
>> +	chan->dst_addr = config->dst_addr;
>> +
>> +	err = ops->device_config(dchan);
> 
> what does this do?

This is an initialization procedure that sets up interrupts (data and
addresses) for each channel on the eDMA IP, so they are triggered after a
transfer completes or aborts. Given that config() can be called at any time, it
doesn't make sense to have this procedure here; I'll move it to probe().

> 
>> +static enum dma_status
>> +dw_edma_device_tx_status(struct dma_chan *dchan, dma_cookie_t cookie,
>> +			 struct dma_tx_state *txstate)
>> +{
>> +	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
>> +	const struct dw_edma_core_ops *ops = chan2ops(chan);
>> +	unsigned long flags;
>> +	enum dma_status ret;
>> +
>> +	spin_lock_irqsave(&chan->vc.lock, flags);
>> +
>> +	ret = ops->ch_status(chan);
>> +	if (ret == DMA_ERROR) {
>> +		goto ret_status;
>> +	} else if (ret == DMA_IN_PROGRESS) {
>> +		chan->status = EDMA_ST_BUSY;
>> +		goto ret_status;
> 
> so in this case you set residue as zero, which is not correct

Ok, the residue should be the remaining bytes to transfer, right?

> 
>> +	} else {
>> +		/* DMA_COMPLETE */
>> +		if (chan->status == EDMA_ST_PAUSE)
>> +			ret = DMA_PAUSED;
>> +		else if (chan->status == EDMA_ST_BUSY)
>> +			ret = DMA_IN_PROGRESS;
> 
> ?? if txn is complete how are you returning DMA_IN_PROGRESS?

This IP reports DMA_COMPLETE when it successfully completes the transfer of all
linked-list elements; given that there may still be more linked-list elements
to be transferred, the overall state is set to DMA_IN_PROGRESS.
> 
>> +		else
>> +			ret = DMA_COMPLETE;
>> +	}
> 
> this looks incorrect interpretation to me. The status is to be retrieved
> for the given cookie passed and given that you do not even use this
> argument tells me that you have understood this as 'channel' status
> reporting, which is not correct

Yes, you're right; my interpretation assumed this function reports
channel/transaction status. What would be the correct implementation?
Something like this?

{
	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
	const struct dw_edma_core_ops *ops = chan2ops(chan);
	struct dw_edma_desc *desc;
	struct virt_dma_desc *vd;
	unsigned long flags;
	enum dma_status ret;
	u32 residue = 0;

	spin_lock_irqsave(&chan->vc.lock, flags);

	ret = dma_cookie_status(dchan, cookie, txstate);
	if (ret == DMA_COMPLETE)
		goto ret_status;

	vd = vchan_next_desc(&chan->vc);
	if (!vd)
		goto ret_status;

	desc = vd2dw_edma_desc(vd);
	if (desc)
		residue = desc->alloc_sz - desc->xfer_sz;

	if (ret == DMA_IN_PROGRESS && chan->status == EDMA_ST_PAUSE)
		ret = DMA_PAUSED;

ret_status:
	spin_unlock_irqrestore(&chan->vc.lock, flags);
	dma_set_residue(txstate, residue);

	return ret;
}

> 
>> +static struct dma_async_tx_descriptor *
>> +dw_edma_device_prep_slave_sg(struct dma_chan *dchan, struct scatterlist *sgl,
>> +			     unsigned int sg_len,
>> +			     enum dma_transfer_direction direction,
>> +			     unsigned long flags, void *context)
>> +{
>> +	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
>> +	struct dw_edma_desc *desc;
>> +	struct dw_edma_chunk *chunk;
>> +	struct dw_edma_burst *burst;
>> +	struct scatterlist *sg;
>> +	unsigned long sflags;
>> +	phys_addr_t src_addr;
>> +	phys_addr_t dst_addr;
>> +	int i;
>> +
>> +	if (sg_len < 1) {
>> +		dev_err(chan2dev(chan), "invalid sg length %u\n", sg_len);
>> +		return NULL;
>> +	}
>> +
>> +	if (direction == DMA_DEV_TO_MEM && chan->dir == EDMA_DIR_WRITE) {
> 
> what is the second part of the check, can you explain that, who sets
> chan->dir?

chan->dir is set in probe() during the configuration of each channel.

> 
>> +		dev_dbg(chan2dev(chan),	"prepare operation (WRITE)\n");
>> +	} else if (direction == DMA_MEM_TO_DEV && chan->dir == EDMA_DIR_READ) {
>> +		dev_dbg(chan2dev(chan),	"prepare operation (READ)\n");
>> +	} else {
>> +		dev_err(chan2dev(chan), "invalid direction\n");
>> +		return NULL;
>> +	}
>> +
>> +	if (!chan->configured) {
>> +		dev_err(chan2dev(chan), "(prep_slave_sg) channel not configured\n");
>> +		return NULL;
>> +	}
>> +
>> +	if (chan->status != EDMA_ST_IDLE) {
>> +		dev_err(chan2dev(chan), "channel is busy or paused\n");
>> +		return NULL;
>> +	}
> 
> No, wrong again. The txn must be prepared and then on submit added to a
> queue. You are writing a driver for dmaengine, surely you dont expect
> the channel to be free and then do a txn.. that would be very
> inefficient!

I did not realize the flow could be as you mentioned. The documentation I read
about the subsystem didn't give me that idea. Thank you for the clarification.

> 
>> +static struct dma_async_tx_descriptor *
>> +dw_edma_device_prep_dma_cyclic(struct dma_chan *dchan, dma_addr_t buf_addr,
>> +			       size_t len, size_t cyclic_cnt,
>> +			       enum dma_transfer_direction direction,
>> +			       unsigned long flags)
>> +{
>> +	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
>> +	struct dw_edma_desc *desc;
>> +	struct dw_edma_chunk *chunk;
>> +	struct dw_edma_burst *burst;
>> +	unsigned long sflags;
>> +	phys_addr_t src_addr;
>> +	phys_addr_t dst_addr;
>> +	u32 i;
>> +
>> +	if (!len || !cyclic_cnt) {
>> +		dev_err(chan2dev(chan), "invalid len or cyclic count\n");
>> +		return NULL;
>> +	}
>> +
>> +	if (direction == DMA_DEV_TO_MEM && chan->dir == EDMA_DIR_WRITE) {
>> +		dev_dbg(chan2dev(chan),	"prepare operation (WRITE)\n");
>> +	} else if (direction == DMA_MEM_TO_DEV && chan->dir == EDMA_DIR_READ) {
>> +		dev_dbg(chan2dev(chan),	"prepare operation (READ)\n");
>> +	} else {
>> +		dev_err(chan2dev(chan), "invalid direction\n");
>> +		return NULL;
>> +	}
>> +
>> +	if (!chan->configured) {
>> +		dev_err(chan2dev(chan), "(prep_dma_cyclic) channel not configured\n");
>> +		return NULL;
>> +	}
>> +
>> +	if (chan->status != EDMA_ST_IDLE) {
>> +		dev_err(chan2dev(chan), "channel is busy or paused\n");
>> +		return NULL;
>> +	}
>> +
>> +	spin_lock_irqsave(&chan->vc.lock, sflags);
>> +
>> +	desc = dw_edma_alloc_desc(chan);
>> +	if (unlikely(!desc))
>> +		goto err_alloc;
>> +
>> +	chunk = dw_edma_alloc_chunk(desc);
>> +	if (unlikely(!chunk))
>> +		goto err_alloc;
>> +
>> +	src_addr = chan->src_addr;
>> +	dst_addr = chan->dst_addr;
>> +
>> +	for (i = 0; i < cyclic_cnt; i++) {
>> +		if (chunk->bursts_alloc == chan->ll_max) {
>> +			chunk = dw_edma_alloc_chunk(desc);
>> +			if (unlikely(!chunk))
>> +				goto err_alloc;
>> +		}
>> +
>> +		burst = dw_edma_alloc_burst(chunk);
>> +
>> +		if (unlikely(!burst))
>> +			goto err_alloc;
>> +
>> +		burst->sz = len;
>> +		chunk->ll_region.sz += burst->sz;
>> +		desc->alloc_sz += burst->sz;
>> +
>> +		burst->sar = src_addr;
>> +		burst->dar = dst_addr;
>> +
>> +		dev_dbg(chan2dev(chan), "lli %u/%u, sar=0x%.8llx, dar=0x%.8llx, size=%u bytes\n",
>> +			i + 1, cyclic_cnt, burst->sar, burst->dar, burst->sz);
>> +	}
>> +
>> +	spin_unlock_irqrestore(&chan->vc.lock, sflags);
>> +	return vchan_tx_prep(&chan->vc, &desc->vd, flags);
> 
> how is it different from previous? am sure we can reuse bits!

There are some differences, but I agree I can rewrite this and the previous
function to share code between them. This function was a last-minute
addition :).

> 
>> +static void dw_edma_done_interrupt(struct dw_edma_chan *chan)
>> +{
>> +	struct dw_edma *dw = chan->chip->dw;
>> +	const struct dw_edma_core_ops *ops = dw->ops;
>> +	struct virt_dma_desc *vd;
>> +	struct dw_edma_desc *desc;
>> +	unsigned long flags;
>> +
>> +	ops->clear_done_int(chan);
>> +	dev_dbg(chan2dev(chan), "clear done interrupt\n");
>> +
>> +	spin_lock_irqsave(&chan->vc.lock, flags);
>> +	vd = vchan_next_desc(&chan->vc);
>> +	switch (chan->request) {
>> +	case EDMA_REQ_NONE:
>> +		if (!vd)
>> +			break;
>> +
>> +		desc = vd2dw_edma_desc(vd);
>> +		if (desc->chunks_alloc) {
>> +			dev_dbg(chan2dev(chan), "sub-transfer complete\n");
>> +			chan->status = EDMA_ST_BUSY;
>> +			dev_dbg(chan2dev(chan), "transferred %u bytes\n",
>> +				desc->xfer_sz);
>> +			dw_edma_start_transfer(chan);
>> +		} else {
>> +			list_del(&vd->node);
>> +			vchan_cookie_complete(vd);
>> +			chan->status = EDMA_ST_IDLE;
>> +			dev_dbg(chan2dev(chan), "transfer complete\n");
>> +		}
>> +		break;
>> +	case EDMA_REQ_STOP:
>> +		if (!vd)
>> +			break;
>> +
>> +		list_del(&vd->node);
>> +		vchan_cookie_complete(vd);
>> +		chan->request = EDMA_REQ_NONE;
>> +		chan->status = EDMA_ST_IDLE;
>> +		chan->configured = false;
> 
> why is configuration deemed invalid, it can still be used again!

At the time I thought it was necessary to reconfigure before every new
transfer. You've already explained to me that it isn't. My bad!

> 
>> +int dw_edma_probe(struct dw_edma_chip *chip)
>> +{
>> +	struct dw_edma *dw = chip->dw;
>> +	struct device *dev = chip->dev;
>> +	const struct dw_edma_core_ops *ops;
>> +	size_t ll_chunk = dw->ll_region.sz;
>> +	size_t dt_chunk = dw->dt_region.sz;
>> +	u32 ch_tot;
>> +	int i, j, err;
>> +
>> +	raw_spin_lock_init(&dw->lock);
>> +
>> +	/* Callback operation selection accordingly to eDMA version */
>> +	switch (dw->version) {
>> +	default:
>> +		dev_err(dev, "unsupported version\n");
>> +		return -EPERM;
>> +	}
> 
> So we have only one case which returns error, was this code tested?

Yes, it was, but I understand your point of view.
I did it like this because I wanted to segment the patch series as follows:
 1) Add the eDMA core driver, which contains the driver skeleton and all the
associated logic.
 2) and 3) Add the callbacks for the eDMA register mapping version 0 (a new
version will appear in the future; I thought it would arrive while I was
gathering feedback on this patch series, and would therefore add another two
patches for version 1, isolated and independent from version 0).
 4) and 5) Add the PCIe glue-logic and the associated device ID.
 6) Add a maintainer for this driver.
 7) Add a test driver.

Since this switch only gains its associated case in patch 2, no case appears
in patch 1.

If you think it makes sense to squash patch 2 into patch 1, just say so and
I'll do it for you :)

> 
>> +
>> +	pm_runtime_get_sync(dev);
>> +
>> +	/* Find out how many write channels are supported by hardware */
>> +	dw->wr_ch_cnt = ops->ch_count(dw, EDMA_DIR_WRITE);
>> +	if (!dw->wr_ch_cnt) {
>> +		dev_err(dev, "invalid number of write channels(0)\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	/* Find out how many read channels are supported by hardware */
>> +	dw->rd_ch_cnt = ops->ch_count(dw, EDMA_DIR_READ);
>> +	if (!dw->rd_ch_cnt) {
>> +		dev_err(dev, "invalid number of read channels(0)\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	dev_dbg(dev, "Channels:\twrite=%d, read=%d\n",
>> +		dw->wr_ch_cnt, dw->rd_ch_cnt);
>> +
>> +	ch_tot = dw->wr_ch_cnt + dw->rd_ch_cnt;
>> +
>> +	/* Allocate channels */
>> +	dw->chan = devm_kcalloc(dev, ch_tot, sizeof(*dw->chan), GFP_KERNEL);
> 
> you may use struct_size() here

Hmm, this would be useful if I wanted to allocate the dw struct as well, right?
Since the dw struct is already allocated, it looks like struct_size() can't be
used here, or am I missing something?

> 
>> +	if (!dw->chan)
>> +		return -ENOMEM;
>> +
>> +	/* Calculate the linked list chunk for each channel */
>> +	ll_chunk /= roundup_pow_of_two(ch_tot);
>> +
>> +	/* Calculate the linked list chunk for each channel */
>> +	dt_chunk /= roundup_pow_of_two(ch_tot);
>> +
>> +	/* Disable eDMA, only to establish the ideal initial conditions */
>> +	ops->off(dw);
>> +
>> +	snprintf(dw->name, sizeof(dw->name), "dw-edma-core:%d", chip->id);
>> +
>> +	/* Request IRQs */
>> +	if (dw->nr_irqs != 1) {
>> +		dev_err(dev, "invalid number of irqs (%u)\n", dw->nr_irqs);
>> +		return -EINVAL;
>> +	}
>> +
>> +	for (i = 0; i < dw->nr_irqs; i++) {
>> +		err = devm_request_irq(dev, pci_irq_vector(to_pci_dev(dev), i),
>> +				       dw_edma_interrupt_all,
>> +				       IRQF_SHARED, dw->name, chip);
>> +		if (err)
>> +			return err;
>> +	}
>> +
>> +	/* Create write channels */
>> +	INIT_LIST_HEAD(&dw->wr_edma.channels);
>> +	for (i = 0; i < dw->wr_ch_cnt; i++) {
>> +		struct dw_edma_chan *chan = &dw->chan[i];
>> +		struct dw_edma_region *dt_region;
>> +
>> +		dt_region = devm_kzalloc(dev, sizeof(*dt_region), GFP_KERNEL);
>> +		if (!dt_region)
>> +			return -ENOMEM;
>> +
>> +		chan->vc.chan.private = dt_region;
>> +
>> +		chan->chip = chip;
>> +		chan->id = i;
>> +		chan->dir = EDMA_DIR_WRITE;
>> +		chan->configured = false;
>> +		chan->request = EDMA_REQ_NONE;
>> +		chan->status = EDMA_ST_IDLE;
>> +
>> +		chan->ll_off = (ll_chunk * i);
>> +		chan->ll_max = (ll_chunk / EDMA_LL_SZ) - 1;
>> +
>> +		chan->dt_off = (dt_chunk * i);
>> +
>> +		dev_dbg(dev, "L. List:\tChannel write[%u] off=0x%.8lx, max_cnt=%u\n",
>> +			i, chan->ll_off, chan->ll_max);
>> +
>> +		memcpy(&chan->msi, &dw->msi[0], sizeof(chan->msi));
>> +
>> +		dev_dbg(dev, "MSI:\t\tChannel write[%u] addr=0x%.8x%.8x, data=0x%.8x\n",
>> +			i, chan->msi.address_hi, chan->msi.address_lo,
>> +			chan->msi.data);
>> +
>> +		chan->vc.desc_free = vchan_free_desc;
>> +		vchan_init(&chan->vc, &dw->wr_edma);
>> +
>> +		dt_region->paddr = dw->dt_region.paddr + chan->dt_off;
>> +		dt_region->vaddr = dw->dt_region.vaddr + chan->dt_off;
>> +		dt_region->sz = dt_chunk;
>> +
>> +		dev_dbg(dev, "Data:\tChannel write[%u] off=0x%.8lx\n",
>> +			i, chan->dt_off);
>> +	}
>> +	dma_cap_zero(dw->wr_edma.cap_mask);
>> +	dma_cap_set(DMA_SLAVE, dw->wr_edma.cap_mask);
>> +	dma_cap_set(DMA_CYCLIC, dw->wr_edma.cap_mask);
>> +	dw->wr_edma.directions = BIT(DMA_DEV_TO_MEM);
>> +	dw->wr_edma.chancnt = dw->wr_ch_cnt;
>> +
>> +	/* Create read channels */
>> +	INIT_LIST_HEAD(&dw->rd_edma.channels);
>> +	for (j = 0; j < dw->rd_ch_cnt; j++, i++) {
>> +		struct dw_edma_chan *chan = &dw->chan[i];
>> +		struct dw_edma_region *dt_region;
>> +
>> +		dt_region = devm_kzalloc(dev, sizeof(*dt_region), GFP_KERNEL);
>> +		if (!dt_region)
>> +			return -ENOMEM;
>> +
>> +		chan->vc.chan.private = dt_region;
>> +
>> +		chan->chip = chip;
>> +		chan->id = j;
>> +		chan->dir = EDMA_DIR_READ;
>> +		chan->configured = false;
>> +		chan->request = EDMA_REQ_NONE;
>> +		chan->status = EDMA_ST_IDLE;
>> +
>> +		chan->ll_off = (ll_chunk * i);
>> +		chan->ll_max = (ll_chunk / EDMA_LL_SZ) - 1;
>> +
>> +		chan->dt_off = (dt_chunk * i);
>> +
>> +		dev_dbg(dev, "L. List:\tChannel read[%u] off=0x%.8lx, max_cnt=%u\n",
>> +			j, chan->ll_off, chan->ll_max);
>> +
>> +		memcpy(&chan->msi, &dw->msi[0], sizeof(chan->msi));
>> +
>> +		dev_dbg(dev, "MSI:\t\tChannel read[%u] addr=0x%.8x%.8x, data=0x%.8x\n",
>> +			j, chan->msi.address_hi, chan->msi.address_lo,
>> +			chan->msi.data);
>> +
>> +		chan->vc.desc_free = vchan_free_desc;
>> +		vchan_init(&chan->vc, &dw->rd_edma);
>> +
>> +		dt_region->paddr = dw->dt_region.paddr + chan->dt_off;
>> +		dt_region->vaddr = dw->dt_region.vaddr + chan->dt_off;
>> +		dt_region->sz = dt_chunk;
>> +
>> +		dev_dbg(dev, "Data:\tChannel read[%u] off=0x%.8lx\n",
>> +			i, chan->dt_off);
> 
> this is similar to previous with obvious changes, I think this can be
> made common routine invoke for R and W ...

OK, I can rewrite the code to simplify this.

> 
> So HW provides read channels and write channels and not generic channels
> which can be used for either R or W?
> 

The HW provides well-defined channels with different register sets.
Initially I thought about masking that, but since the number of channels is
client-configurable [0..8] for each channel type and can be asymmetric (write
vs. read), I preferred to keep them separate.


Once again, thanks Vinod!



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC v3 1/7] dmaengine: Add Synopsys eDMA IP core driver
  2019-01-20 11:47       ` Vinod Koul
@ 2019-01-21 15:49         ` Gustavo Pimentel
  0 siblings, 0 replies; 39+ messages in thread
From: Gustavo Pimentel @ 2019-01-21 15:49 UTC (permalink / raw)
  To: Vinod Koul, Gustavo Pimentel
  Cc: Jose Abreu, linux-pci, dmaengine, Dan Williams, Eugeniy Paltsev,
	Andy Shevchenko, Russell King, Niklas Cassel, Joao Pinto,
	Luis de Oliveira, Vitor Soares, Nelson Costa, Pedro Sousa

On 20/01/2019 11:47, Vinod Koul wrote:
> On 16-01-19, 11:53, Gustavo Pimentel wrote:
> 
>>>> +	/* Free irqs */
>>>
>>> But devm will automatically free it, no ?
>>
>> Yes, it shouldn't be there. I'll remove it.
> 
> Nope this is correct. we need to ensure irqs are freed and tasklets are
> quiesced otherwise you can end up with a case when a spurious interrupt
> and then tasklets spinning while core unrolls. devm for irq is not a
> good idea unless you really know what you are doing!
> 

Ok. Nice explanation.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC v3 7/7] dmaengine: Add Synopsys eDMA IP test and sample driver
  2019-01-17  5:03           ` Vinod Koul
@ 2019-01-21 15:59             ` Gustavo Pimentel
  0 siblings, 0 replies; 39+ messages in thread
From: Gustavo Pimentel @ 2019-01-21 15:59 UTC (permalink / raw)
  To: Vinod Koul, Gustavo Pimentel
  Cc: Andy Shevchenko, linux-pci, dmaengine, Dan Williams,
	Eugeniy Paltsev, Russell King, Niklas Cassel, Joao Pinto,
	Jose Abreu, Luis Oliveira, Vitor Soares, Nelson Costa,
	Pedro Sousa

On 17/01/2019 05:03, Vinod Koul wrote:
> On 15-01-19, 13:02, Gustavo Pimentel wrote:
>> On 15/01/2019 05:45, Andy Shevchenko wrote:
>>> On Mon, Jan 14, 2019 at 11:44:22AM +0000, Gustavo Pimentel wrote:
>>>> On 11/01/2019 19:48, Andy Shevchenko wrote:
>>>>> On Fri, Jan 11, 2019 at 07:33:43PM +0100, Gustavo Pimentel wrote:
>>>>>> Add Synopsys eDMA IP test and sample driver to be use for testing
>>>>>> purposes and also as a reference for any developer who needs to
>>>>>> implement and use Synopsys eDMA.
>>>>>>
>>>>>> This driver can be compile as built-in or external module in kernel.
>>>>>>
>>>>>> To enable this driver just select DW_EDMA_TEST option in kernel
>>>>>> configuration, however it requires and selects automatically DW_EDMA
>>>>>> option too.
>>>>>>
>>>>>
>>>>> Hmm... This doesn't explain what's wrong with dmatest module.
>>>>
>>>> There isn't anything wrong with dmatest module, that I know of. In beginning I
>>>> was planning to used it, however only works with MEM_TO_MEM transfers, that's
>>>> why I created a similar module but for MEM_TO_DEV and DEV_TO_MEM with
>>>> scatter-gather and cyclic transfers type for my use case. I don't know if can be
>>>> applied to other cases, if that is feasible, I'm glad to share it.
>>>
>>> What I'm trying to tell is that the dmatest driver would be nice to have such
>>> capability.
>>>
>>
>> At the beginning I thought to add those features to the dmatest module, but
>> because of several points such as:
>>  - I didn't want, by any chance, to break the dmatest module while
>> implementing support for those new features;
>>  - Since I'm still gaining knowledge about this subsystem, I preferred to do
>> it in a more controlled environment;
>>  - Since this driver is also a reference driver, aimed at teaching someone
>> how to use the dmaengine API, I preferred to keep it simple.
>>
>> Maybe in the future both modules could be merged into just one tool, but for
>> now I would prefer to keep them separate, especially to iron out any
>> immaturity errors in this module.
> 
> With decent testing we should be able to iron out kinks in dmatest
> module. It gets tested for memcpy controllers so we should easily catch
> issues.

Ok, since I don't have a setup to test that, may I count on you to test it?

> 
> Said that, it would make sense to add support in generic dmatest so that
> other folks can reuse and contribute as well. Future work always gets
> side tracked by life, so lets do it now :)

But there are some points that will need to be reworked or clarified.
For instance, I need to know the physical and virtual addresses and the maximum
size allowed for the endpoint (in my case), where I will put/get the data.

Another difference is that I create all the threads and only after that start
them all simultaneously; dmatest doesn't have that behavior.

There are one or two more details, but those are the ones with the greatest
impact.


> 
> 


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC v3 1/7] dmaengine: Add Synopsys eDMA IP core driver
  2019-01-21 15:48     ` Gustavo Pimentel
@ 2019-01-23 13:08       ` Vinod Koul
  2019-01-31 11:33         ` Gustavo Pimentel
  0 siblings, 1 reply; 39+ messages in thread
From: Vinod Koul @ 2019-01-23 13:08 UTC (permalink / raw)
  To: Gustavo Pimentel
  Cc: linux-pci, dmaengine, Dan Williams, Eugeniy Paltsev,
	Andy Shevchenko, Russell King, Niklas Cassel, Joao Pinto,
	Jose Abreu, Luis Oliveira, Vitor Soares, Nelson Costa,
	Pedro Sousa

On 21-01-19, 15:48, Gustavo Pimentel wrote:
> On 20/01/2019 11:44, Vinod Koul wrote:
> > On 11-01-19, 19:33, Gustavo Pimentel wrote:
> >> Add Synopsys eDMA IP core driver to kernel.
> >>
> >> This core driver, initializes and configures the eDMA IP using vma-helpers
> >> functions and dma-engine subsystem.
> > 
> > A description of eDMA IP will help review the driver
> 
> I've the IP description on the cover-letter, but I'll bring it to this patch, if
> it helps.

yeah, the cover doesn't get applied; changelogs are very important
documentation for the kernel

> >> @@ -0,0 +1,1059 @@
> >> +// SPDX-License-Identifier: GPL-2.0
> >> +/*
> >> + * Copyright (c) 2018 Synopsys, Inc. and/or its affiliates.
> > 
> > 2019 now
> 
> I've changed to "Copyright (c) 2018-present Synopsys, Inc. and/or its
> affiliates." this way it's always up to date and I also kept 2018, because it
> was the date that I started to develop this driver, if you don't mind.

yeah, 18 is fine :) it needs to always end with the current year

> >> +static struct dw_edma_chunk *dw_edma_alloc_chunk(struct dw_edma_desc *desc)
> >> +{
> >> +	struct dw_edma_chan *chan = desc->chan;
> >> +	struct dw_edma *dw = chan->chip->dw;
> >> +	struct dw_edma_chunk *chunk;
> >> +
> >> +	chunk = kvzalloc(sizeof(*chunk), GFP_NOWAIT);
> >> +	if (unlikely(!chunk))
> >> +		return NULL;
> >> +
> >> +	INIT_LIST_HEAD(&chunk->list);
> >> +	chunk->chan = chan;
> >> +	chunk->cb = !(desc->chunks_alloc % 2);
> > 
> > cb ..?
> 
> CB = change bit, is a property of this eDMA IP. Basically it is a kind of
> handshake which serves to validate whether the linked list has been updated or
> not, especially useful in cases of recycled linked list elements (every linked
> list recycle is a new chunk, this will allow to differentiate each chunk).

okay, please add that somewhere. Also, it would help me if you explained what a
chunk is, along with the other terminology used in this driver

> >> +static int dw_edma_device_config(struct dma_chan *dchan,
> >> +				 struct dma_slave_config *config)
> >> +{
> >> +	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
> >> +	const struct dw_edma_core_ops *ops = chan2ops(chan);
> >> +	unsigned long flags;
> >> +	int err = 0;
> >> +
> >> +	spin_lock_irqsave(&chan->vc.lock, flags);
> >> +
> >> +	if (!config) {
> >> +		err = -EINVAL;
> >> +		goto err_config;
> >> +	}
> >> +
> >> +	if (chan->status != EDMA_ST_IDLE) {
> >> +		dev_err(chan2dev(chan), "channel is busy or paused\n");
> >> +		err = -EPERM;
> > 
> > this is not correct behaviour, device_config can be called anytime and
> > values can take affect on next transaction submitted..
> 
> Hum, I thought we could only reconfigure after transfer being finished.

Nope, anytime. They take effect for next prepare when you use it

> >> +	dev_dbg(chan2dev(chan), "addr(physical) src=%pa, dst=%pa\n",
> >> +		&config->src_addr, &config->dst_addr);
> >> +
> >> +	chan->src_addr = config->src_addr;
> >> +	chan->dst_addr = config->dst_addr;
> >> +
> >> +	err = ops->device_config(dchan);
> > 
> > what does this do?
> 
> This is an initialization procedure to setup interrupts (data and addresses) to
> each channel on the eDMA IP,  in order to be triggered after transfer being
> completed or aborted. Due the fact the config() can be called at anytime,
> doesn't make sense to have this procedure here, I'll moved it to probe().

Yeah, I am still not convinced about having another layer! Have you thought
about using a common lib for the common parts?

> > this looks incorrect interpretation to me. The status is to be retrieved
> > for the given cookie passed and given that you do not even use this
> > argument tells me that you have understood this as 'channel' status
> > reporting, which is not correct
> 
> Yes, you're right, my interpretation assumes this function reports
> channel/transaction status. What would be the correct implementation?
> Something like this?
> 
> {
> 	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
> 	const struct dw_edma_core_ops *ops = chan2ops(chan);
> 	struct dw_edma_desc *desc;
> 	struct virt_dma_desc *vd;
> 	unsigned long flags;
> 	enum dma_status ret;
> 	u32 residue = 0;
> 
> 	spin_lock_irqsave(&chan->vc.lock, flags);
> 
> 	ret = dma_cookie_status(dchan, cookie, txstate);
> 	if (ret == DMA_COMPLETE)
> 		goto ret_status;
> 
> 	vd = vchan_next_desc(&chan->vc);
> 	if (!vd)
> 		goto ret_status;
> 
> 	desc = vd2dw_edma_desc(vd);
> 	if (desc)
> 		residue = desc->alloc_sz - desc->xfer_sz;
> 		
> 	if (ret == DMA_IN_PROGRESS && chan->status == EDMA_ST_PAUSE)
> 		ret = DMA_PAUSED;

this looks better; please do keep in mind txstate can be NULL, so the
residue calc can be skipped

> >> +static struct dma_async_tx_descriptor *
> >> +dw_edma_device_prep_slave_sg(struct dma_chan *dchan, struct scatterlist *sgl,
> >> +			     unsigned int sg_len,
> >> +			     enum dma_transfer_direction direction,
> >> +			     unsigned long flags, void *context)
> >> +{
> >> +	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
> >> +	struct dw_edma_desc *desc;
> >> +	struct dw_edma_chunk *chunk;
> >> +	struct dw_edma_burst *burst;
> >> +	struct scatterlist *sg;
> >> +	unsigned long sflags;
> >> +	phys_addr_t src_addr;
> >> +	phys_addr_t dst_addr;
> >> +	int i;
> >> +
> >> +	if (sg_len < 1) {
> >> +		dev_err(chan2dev(chan), "invalid sg length %u\n", sg_len);
> >> +		return NULL;
> >> +	}
> >> +
> >> +	if (direction == DMA_DEV_TO_MEM && chan->dir == EDMA_DIR_WRITE) {
> > 
> > what is the second part of the check, can you explain that, who sets
> > chan->dir?
> 
> The chan->dir is set on probe() during the process of configuring each channel.

So you have channels that are unidirectional?

> >> +		dev_dbg(chan2dev(chan),	"prepare operation (WRITE)\n");
> >> +	} else if (direction == DMA_MEM_TO_DEV && chan->dir == EDMA_DIR_READ) {
> >> +		dev_dbg(chan2dev(chan),	"prepare operation (READ)\n");
> >> +	} else {
> >> +		dev_err(chan2dev(chan), "invalid direction\n");
> >> +		return NULL;
> >> +	}
> >> +
> >> +	if (!chan->configured) {
> >> +		dev_err(chan2dev(chan), "(prep_slave_sg) channel not configured\n");
> >> +		return NULL;
> >> +	}
> >> +
> >> +	if (chan->status != EDMA_ST_IDLE) {
> >> +		dev_err(chan2dev(chan), "channel is busy or paused\n");
> >> +		return NULL;
> >> +	}
> > 
> > No, wrong again. The txn must be prepared and then on submit added to a
> > queue. You are writing a driver for dmaengine, surely you dont expect
> > the channel to be free and then do a txn.. that would be very
> > inefficient!
> 
> I did not realize that the flow could be as you mentioned. The documentation I
> read about the subsystem did not give me this idea. Thank you for clarifying me.

I think we have improved that part a lot, please do feel free to point
out inconsistencies
See DMA usage in Documentation/driver-api/dmaengine/client.rst

> >> +int dw_edma_probe(struct dw_edma_chip *chip)
> >> +{
> >> +	struct dw_edma *dw = chip->dw;
> >> +	struct device *dev = chip->dev;
> >> +	const struct dw_edma_core_ops *ops;
> >> +	size_t ll_chunk = dw->ll_region.sz;
> >> +	size_t dt_chunk = dw->dt_region.sz;
> >> +	u32 ch_tot;
> >> +	int i, j, err;
> >> +
> >> +	raw_spin_lock_init(&dw->lock);
> >> +
> >> +	/* Callback operation selection accordingly to eDMA version */
> >> +	switch (dw->version) {
> >> +	default:
> >> +		dev_err(dev, "unsupported version\n");
> >> +		return -EPERM;
> >> +	}
> > 
> > So we have only one case which returns error, was this code tested?
> 
> Yes it was, but I understand what your point of view.
> This was done like this, because I wanna to segment the patch series like this:
>  1) Adding eDMA driver core, which contains the driver skeleton and the whole
> logic associated.
>  2) and 3) Adding the callbacks for the eDMA register mapping version 0 (it will
> appear in the future a new version, I thought that this new version would came
> while I was trying to get the feedback about this patch series, therefore would
> have another 2 patches for the version 1 isolated and independent from the
> version 0).
>  4) and 5) Adding the PCIe glue-logic and device ID associated.
>  6) Adding maintainer for this driver.
>  7) Adding a test driver.
> 
> Since this switch will only have the associated case on patch 2, that why on
> patch 1 doesn't appear any possibility.
> 
> If you feel logic to squash patch 2 with patch 1, just say something and I'll do
> it for you :)

well each patch should build and work on its own, otherwise we get
problems :) But since this is a new driver it is okay. Anyway, this patch
is quite _huge_ so let's not add more to it..

I would have moved the callback check to the subsequent patch..

> >> +	pm_runtime_get_sync(dev);
> >> +
> >> +	/* Find out how many write channels are supported by hardware */
> >> +	dw->wr_ch_cnt = ops->ch_count(dw, EDMA_DIR_WRITE);
> >> +	if (!dw->wr_ch_cnt) {
> >> +		dev_err(dev, "invalid number of write channels(0)\n");
> >> +		return -EINVAL;
> >> +	}
> >> +
> >> +	/* Find out how many read channels are supported by hardware */
> >> +	dw->rd_ch_cnt = ops->ch_count(dw, EDMA_DIR_READ);
> >> +	if (!dw->rd_ch_cnt) {
> >> +		dev_err(dev, "invalid number of read channels(0)\n");
> >> +		return -EINVAL;
> >> +	}
> >> +
> >> +	dev_dbg(dev, "Channels:\twrite=%d, read=%d\n",
> >> +		dw->wr_ch_cnt, dw->rd_ch_cnt);
> >> +
> >> +	ch_tot = dw->wr_ch_cnt + dw->rd_ch_cnt;
> >> +
> >> +	/* Allocate channels */
> >> +	dw->chan = devm_kcalloc(dev, ch_tot, sizeof(*dw->chan), GFP_KERNEL);
> > 
> > you may use struct_size() here
> 
> Hum, this would be useful if I wanted to allocate the dw struct as well, right?
> Since dw struct is already allocated, it looks like this can't be used, or am I
> missing something?

yeah you can allocate dw + chan in one shot...
-- 
~Vinod

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC v3 1/7] dmaengine: Add Synopsys eDMA IP core driver
  2019-01-23 13:08       ` Vinod Koul
@ 2019-01-31 11:33         ` Gustavo Pimentel
  2019-02-01  4:14           ` Vinod Koul
  0 siblings, 1 reply; 39+ messages in thread
From: Gustavo Pimentel @ 2019-01-31 11:33 UTC (permalink / raw)
  To: Vinod Koul, Gustavo Pimentel
  Cc: linux-pci, dmaengine, Dan Williams, Eugeniy Paltsev,
	Andy Shevchenko, Russell King, Niklas Cassel, Joao Pinto,
	Jose Abreu, Luis Oliveira, Vitor Soares, Nelson Costa,
	Pedro Sousa

On 23/01/2019 13:08, Vinod Koul wrote:
> On 21-01-19, 15:48, Gustavo Pimentel wrote:
>> On 20/01/2019 11:44, Vinod Koul wrote:
>>> On 11-01-19, 19:33, Gustavo Pimentel wrote:
>>>> Add Synopsys eDMA IP core driver to kernel.
>>>>
>>>> This core driver, initializes and configures the eDMA IP using vma-helpers
>>>> functions and dma-engine subsystem.
>>>
>>> A description of eDMA IP will help review the driver
>>
>> I've the IP description on the cover-letter, but I'll bring it to this patch, if
>> it helps.
> 
> yeah cover doesnt get applied, changelog are very important
> documentation for kernel

Ok. See below.

> 
>>>> @@ -0,0 +1,1059 @@
>>>> +// SPDX-License-Identifier: GPL-2.0
>>>> +/*
>>>> + * Copyright (c) 2018 Synopsys, Inc. and/or its affiliates.
>>>
>>> 2019 now
>>
>> I've changed to "Copyright (c) 2018-present Synopsys, Inc. and/or its
>> affiliates." this way it's always up to date and I also kept 2018, because it
>> was the date that I started to develop this driver, if you don't mind.
> 
> yeah 18 is fine :) it need to end with current year always

Just to be sure, are you saying that must be: "Copyright (c) 2018-2019 Synopsys,
Inc. and/or its affiliates."?

> 
>>>> +static struct dw_edma_chunk *dw_edma_alloc_chunk(struct dw_edma_desc *desc)
>>>> +{
>>>> +	struct dw_edma_chan *chan = desc->chan;
>>>> +	struct dw_edma *dw = chan->chip->dw;
>>>> +	struct dw_edma_chunk *chunk;
>>>> +
>>>> +	chunk = kvzalloc(sizeof(*chunk), GFP_NOWAIT);
>>>> +	if (unlikely(!chunk))
>>>> +		return NULL;
>>>> +
>>>> +	INIT_LIST_HEAD(&chunk->list);
>>>> +	chunk->chan = chan;
>>>> +	chunk->cb = !(desc->chunks_alloc % 2);
>>>
>>> cb ..?
>>
>> CB = change bit, is a property of this eDMA IP. Basically it is a kind of
>> handshake which serves to validate whether the linked list has been updated or
>> not, especially useful in cases of recycled linked list elements (every linked
>> list recycle is a new chunk, this will allow to differentiate each chunk).
> 
> okay please add that somewhere. Also it would help me if you explain
> what is chunk and other terminologies used in this driver

I'm thinking of putting the description below in the patch; please check whether
it is sufficiently explicit and clear about what this IP needs and does.

In order to transfer data from point A to B as fast as possible, this IP requires
a dedicated memory space in which a linked list of elements resides.
All elements of this linked list are contiguous and each one describes a data
transfer (source and destination addresses, length and a control variable).

For the sake of simplicity, let's assume a memory space for write channel 0 which
holds up to 42 elements.

+---------+
| Desc #0 |--+
+---------+  |
             V
        +----------+
        | Chunk #0 |---+
        |  CB = 1  |   |   +----------+   +-----+   +-----------+   +-----+
        +----------+   +-->| Burst #0 |-->| ... |-->| Burst #41 |-->| llp |
              |            +----------+   +-----+   +-----------+   +-----+
              V
        +----------+
        | Chunk #1 |---+
        |  CB = 0  |   |   +-----------+   +-----+   +-----------+   +-----+
        +----------+   +-->| Burst #42 |-->| ... |-->| Burst #83 |-->| llp |
              |            +-----------+   +-----+   +-----------+   +-----+
              V
        +----------+
        | Chunk #2 |---+
        |  CB = 1  |   |   +-----------+   +-----+   +------------+   +-----+
        +----------+   +-->| Burst #84 |-->| ... |-->| Burst #125 |-->| llp |
              |            +-----------+   +-----+   +------------+   +-----+
              V
        +----------+
        | Chunk #3 |---+
        |  CB = 0  |   |   +------------+   +-----+   +------------+   +-----+
        +----------+   +-->| Burst #126 |-->| ... |-->| Burst #129 |-->| llp |
                           +------------+   +-----+   +------------+   +-----+

Legend:

*Linked list*, also known as Chunk
*Linked list element*, also known as Burst
*CB*, also known as Change Bit, is a control bit (typically toggled) that
makes it easy to identify and differentiate between the current linked list
and the previous or the next one.
*LLP* is a special element that indicates the end of the linked-list element
stream and also informs that the next CB should be toggled.

In scatter-gather transfer mode, the client submits a scatter-gather list of
n (in this case 130) elements, which is divided into multiple Chunks. Each
Chunk holds a limited number of Bursts (in this case 42); after all its Bursts
have been transferred, an interrupt is triggered, which allows the dedicated
linked-list memory to be recycled with the information for the next Chunk and
its associated Bursts, and the whole cycle repeats.

In cyclic transfer mode, the client submits a buffer pointer, its length and a
number of repetitions; in this case each Burst corresponds directly to one
repetition.

Each Burst describes a data transfer from point A (source) to point
B (destination) with a length from 1 byte up to 4 GB. Since the dedicated
memory space where the linked list resides is limited, the whole set of n Burst
elements is organized into several Chunks, which are used later to recycle
the dedicated memory space and initiate a new sequence of data transfers.

The whole transfer is considered complete when all Bursts have been transferred.

Currently this IP has a well-known register map, which includes support for
legacy and unroll modes. Legacy mode is the variant of this register map that
has a multiplexer register to switch between the registers of all write and
read channels, while unroll mode repeats the registers of all write and read
channels with an offset between them. This register map is called v0.

The IP team is creating a new register map more suitable to the latest PCIe
features, which will very likely change the register layout; that version will
be called v1. As soon as this new version is released by the IP team, support
for it will be included in this driver.

What do you think? Is there any gray area that I could clarify?

> 
>>>> +static int dw_edma_device_config(struct dma_chan *dchan,
>>>> +				 struct dma_slave_config *config)
>>>> +{
>>>> +	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
>>>> +	const struct dw_edma_core_ops *ops = chan2ops(chan);
>>>> +	unsigned long flags;
>>>> +	int err = 0;
>>>> +
>>>> +	spin_lock_irqsave(&chan->vc.lock, flags);
>>>> +
>>>> +	if (!config) {
>>>> +		err = -EINVAL;
>>>> +		goto err_config;
>>>> +	}
>>>> +
>>>> +	if (chan->status != EDMA_ST_IDLE) {
>>>> +		dev_err(chan2dev(chan), "channel is busy or paused\n");
>>>> +		err = -EPERM;
>>>
>>> this is not correct behaviour, device_config can be called anytime and
>>> values can take affect on next transaction submitted..
>>
>> Hum, I thought we could only reconfigure after transfer being finished.
> 
> Nope, anytime. They take effect for next prepare when you use it
> 
>>>> +	dev_dbg(chan2dev(chan), "addr(physical) src=%pa, dst=%pa\n",
>>>> +		&config->src_addr, &config->dst_addr);
>>>> +
>>>> +	chan->src_addr = config->src_addr;
>>>> +	chan->dst_addr = config->dst_addr;
>>>> +
>>>> +	err = ops->device_config(dchan);
>>>
>>> what does this do?
>>
>> This is an initialization procedure to setup interrupts (data and addresses) to
>> each channel on the eDMA IP,  in order to be triggered after transfer being
>> completed or aborted. Due the fact the config() can be called at anytime,
>> doesn't make sense to have this procedure here, I'll moved it to probe().
> 
> Yeah I am not still convinced about having another layer! Have you
> though about using common lib for common parts ..?

Maybe I'm explaining myself poorly. I don't have any clue about the register
map for future versions. I honestly tried to implement a common lib for the
whole process that interacts with the DMA engine controller, to ease any
future addition of register mappings, by having these common callbacks that
are only responsible for interfacing with the HW according to the register
map version. Maybe I can simplify something in the future, but I'll only be
able to conclude that after having some idea about the new register map.

IMHO this is the easier and cleaner way to do it, in terms of code
maintenance and architecture, but if you have another idea that you can show
me, or can point out a driver that implements something similar, I have no
problem checking it out.

> 
>>> this looks incorrect interpretation to me. The status is to be retrieved
>>> for the given cookie passed and given that you do not even use this
>>> argument tells me that you have understood this as 'channel' status
>>> reporting, which is not correct
>>
>> Yes, you're right, my interpretation assumes this function reports
>> channel/transaction status. What would be the correct implementation?
>> Something like this?
>>
>> {
>> 	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
>> 	const struct dw_edma_core_ops *ops = chan2ops(chan);
>> 	struct dw_edma_desc *desc;
>> 	struct virt_dma_desc *vd;
>> 	unsigned long flags;
>> 	enum dma_status ret;
>> 	u32 residue = 0;
>>
>> 	spin_lock_irqsave(&chan->vc.lock, flags);
>>
>> 	ret = dma_cookie_status(chan, cookie, txstate);
>> 	if (ret == DMA_COMPLETE)
>> 		goto ret_status;
>>
>> 	vd = vchan_next_desc(&chan->vc);
>> 	if (!vd)
>> 		goto ret_status;
>>
>> 	desc = vd2dw_edma_desc(vd);
>> 	if (desc)
>> 		residue = desc->alloc_sz - desc->xfer_sz;
>> 		
>> 	if (ret == DMA_IN_PROGRESS && chan->status == EDMA_ST_PAUSE)
>> 		ret = DMA_PAUSED;
> 
> this looks better, please do keep in mind txstate can be null, so
> residue caln can be skipped

Ok.

> 
>>>> +static struct dma_async_tx_descriptor *
>>>> +dw_edma_device_prep_slave_sg(struct dma_chan *dchan, struct scatterlist *sgl,
>>>> +			     unsigned int sg_len,
>>>> +			     enum dma_transfer_direction direction,
>>>> +			     unsigned long flags, void *context)
>>>> +{
>>>> +	struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);
>>>> +	struct dw_edma_desc *desc;
>>>> +	struct dw_edma_chunk *chunk;
>>>> +	struct dw_edma_burst *burst;
>>>> +	struct scatterlist *sg;
>>>> +	unsigned long sflags;
>>>> +	phys_addr_t src_addr;
>>>> +	phys_addr_t dst_addr;
>>>> +	int i;
>>>> +
>>>> +	if (sg_len < 1) {
>>>> +		dev_err(chan2dev(chan), "invalid sg length %u\n", sg_len);
>>>> +		return NULL;
>>>> +	}
>>>> +
>>>> +	if (direction == DMA_DEV_TO_MEM && chan->dir == EDMA_DIR_WRITE) {
>>>
>>> what is the second part of the check, can you explain that, who sets
>>> chan->dir?
>>
>> The chan->dir is set on probe() during the process of configuring each channel.
> 
> So you have channels that are unidirectional?

Yes. That's another reason, IMHO, to keep the dw-edma-test separate from
dmatest, since this IP is pickier and has these particularities.

This doesn't prevent adding, in the future, features similar to dmatest's
that are generic enough to work with other IPs; at least that is my opinion.

> 
>>>> +		dev_dbg(chan2dev(chan),	"prepare operation (WRITE)\n");
>>>> +	} else if (direction == DMA_MEM_TO_DEV && chan->dir == EDMA_DIR_READ) {
>>>> +		dev_dbg(chan2dev(chan),	"prepare operation (READ)\n");
>>>> +	} else {
>>>> +		dev_err(chan2dev(chan), "invalid direction\n");
>>>> +		return NULL;
>>>> +	}
>>>> +
>>>> +	if (!chan->configured) {
>>>> +		dev_err(chan2dev(chan), "(prep_slave_sg) channel not configured\n");
>>>> +		return NULL;
>>>> +	}
>>>> +
>>>> +	if (chan->status != EDMA_ST_IDLE) {
>>>> +		dev_err(chan2dev(chan), "channel is busy or paused\n");
>>>> +		return NULL;
>>>> +	}
>>>
>>> No, wrong again. The txn must be prepared and then on submit added to a
>>> queue. You are writing a driver for dmaengine, surely you dont expect
>>> the channel to be free and then do a txn.. that would be very
>>> inefficient!
>>
>> I did not realize that the flow could be as you mentioned. The documentation I
>> read about the subsystem did not give me this idea. Thank you for clarifying me.
> 
> I think we have improved  that part a lot, please do feel free to point
> out inconsistency
> See DMA usage in Documentation/driver-api/dmaengine/client.rst
> 
>>>> +int dw_edma_probe(struct dw_edma_chip *chip)
>>>> +{
>>>> +	struct dw_edma *dw = chip->dw;
>>>> +	struct device *dev = chip->dev;
>>>> +	const struct dw_edma_core_ops *ops;
>>>> +	size_t ll_chunk = dw->ll_region.sz;
>>>> +	size_t dt_chunk = dw->dt_region.sz;
>>>> +	u32 ch_tot;
>>>> +	int i, j, err;
>>>> +
>>>> +	raw_spin_lock_init(&dw->lock);
>>>> +
>>>> +	/* Callback operation selection accordingly to eDMA version */
>>>> +	switch (dw->version) {
>>>> +	default:
>>>> +		dev_err(dev, "unsupported version\n");
>>>> +		return -EPERM;
>>>> +	}
>>>
>>> So we have only one case which returns error, was this code tested?
>>
>> Yes it was, but I understand what your point of view.
>> This was done like this, because I wanna to segment the patch series like this:
>>  1) Adding eDMA driver core, which contains the driver skeleton and the whole
>> logic associated.
>>  2) and 3) Adding the callbacks for the eDMA register mapping version 0 (it will
>> appear in the future a new version, I thought that this new version would came
>> while I was trying to get the feedback about this patch series, therefore would
>> have another 2 patches for the version 1 isolated and independent from the
>> version 0).
>>  4) and 5) Adding the PCIe glue-logic and device ID associated.
>>  6) Adding maintainer for this driver.
>>  7) Adding a test driver.
>>
>> Since this switch will only have the associated case on patch 2, that why on
>> patch 1 doesn't appear any possibility.
>>
>> If you feel logic to squash patch 2 with patch 1, just say something and I'll do
>> it for you :)
> 
> well each patch should build and work on its own, otherwise we get
> problems :) But since this is a new driver it is okay. Anyway this patch
> is quite _huge_ so lets not add more to it..
> 
> I would have moved the callback check to subsequent one..

Sorry. I didn't catch this. Can you explain it?

> 
>>>> +	pm_runtime_get_sync(dev);
>>>> +
>>>> +	/* Find out how many write channels are supported by hardware */
>>>> +	dw->wr_ch_cnt = ops->ch_count(dw, EDMA_DIR_WRITE);
>>>> +	if (!dw->wr_ch_cnt) {
>>>> +		dev_err(dev, "invalid number of write channels(0)\n");
>>>> +		return -EINVAL;
>>>> +	}
>>>> +
>>>> +	/* Find out how many read channels are supported by hardware */
>>>> +	dw->rd_ch_cnt = ops->ch_count(dw, EDMA_DIR_READ);
>>>> +	if (!dw->rd_ch_cnt) {
>>>> +		dev_err(dev, "invalid number of read channels(0)\n");
>>>> +		return -EINVAL;
>>>> +	}
>>>> +
>>>> +	dev_dbg(dev, "Channels:\twrite=%d, read=%d\n",
>>>> +		dw->wr_ch_cnt, dw->rd_ch_cnt);
>>>> +
>>>> +	ch_tot = dw->wr_ch_cnt + dw->rd_ch_cnt;
>>>> +
>>>> +	/* Allocate channels */
>>>> +	dw->chan = devm_kcalloc(dev, ch_tot, sizeof(*dw->chan), GFP_KERNEL);
>>>
>>> you may use struct_size() here
>>
>> Hum, this would be useful if I wanted to allocate the dw struct as well, right?
>> Since dw struct is already allocated, it looks like this can't be used, or am I
>> missing something?
> 
> yeah you can allocate dw + chan one shot...

I'm allocating the dw struct in the PCIe glue-logic driver in order to later
set some data in it that will be useful here; here I'm only adding the
channels. That's why I can't allocate it all in one shot.

> 


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC v3 1/7] dmaengine: Add Synopsys eDMA IP core driver
  2019-01-31 11:33         ` Gustavo Pimentel
@ 2019-02-01  4:14           ` Vinod Koul
  2019-02-01 11:23             ` Gustavo Pimentel
  0 siblings, 1 reply; 39+ messages in thread
From: Vinod Koul @ 2019-02-01  4:14 UTC (permalink / raw)
  To: Gustavo Pimentel
  Cc: linux-pci, dmaengine, Dan Williams, Eugeniy Paltsev,
	Andy Shevchenko, Russell King, Niklas Cassel, Joao Pinto,
	Jose Abreu, Luis Oliveira, Vitor Soares, Nelson Costa,
	Pedro Sousa

On 31-01-19, 11:33, Gustavo Pimentel wrote:
> On 23/01/2019 13:08, Vinod Koul wrote:
> > On 21-01-19, 15:48, Gustavo Pimentel wrote:
> >> On 20/01/2019 11:44, Vinod Koul wrote:
> >>> On 11-01-19, 19:33, Gustavo Pimentel wrote:

> >>>> @@ -0,0 +1,1059 @@
> >>>> +// SPDX-License-Identifier: GPL-2.0
> >>>> +/*
> >>>> + * Copyright (c) 2018 Synopsys, Inc. and/or its affiliates.
> >>>
> >>> 2019 now
> >>
> >> I've changed to "Copyright (c) 2018-present Synopsys, Inc. and/or its
> >> affiliates." this way it's always up to date and I also kept 2018, because it
> >> was the date that I started to develop this driver, if you don't mind.
> > 
> > yeah 18 is fine :) it need to end with current year always
> 
> Just to be sure, are you saying that must be: "Copyright (c) 2018-2019 Synopsys,
> Inc. and/or its affiliates."?

Yup :)

> >>>> +static struct dw_edma_chunk *dw_edma_alloc_chunk(struct dw_edma_desc *desc)
> >>>> +{
> >>>> +	struct dw_edma_chan *chan = desc->chan;
> >>>> +	struct dw_edma *dw = chan->chip->dw;
> >>>> +	struct dw_edma_chunk *chunk;
> >>>> +
> >>>> +	chunk = kvzalloc(sizeof(*chunk), GFP_NOWAIT);
> >>>> +	if (unlikely(!chunk))
> >>>> +		return NULL;
> >>>> +
> >>>> +	INIT_LIST_HEAD(&chunk->list);
> >>>> +	chunk->chan = chan;
> >>>> +	chunk->cb = !(desc->chunks_alloc % 2);
> >>>
> >>> cb ..?
> >>
> >> CB = change bit, is a property of this eDMA IP. Basically it is a kind of
> >> handshake which serves to validate whether the linked list has been updated or
> >> not, especially useful in cases of recycled linked list elements (every linked
> >> list recycle is a new chunk, this will allow to differentiate each chunk).
> > 
> > okay please add that somewhere. Also it would help me if you explain
> > what is chunk and other terminologies used in this driver
> 
> I'm thinking to put the below description on the patch, please check if this is
> sufficient explicit and clear to understand what this IP needs and does.
> 
> In order to transfer data from point A to B as fast as possible this IP requires
> a dedicated memory space where will reside a linked list of elements.

rephrasing: a dedicated memory space containing linked list of elements

> All elements of this linked list are continuous and each one describes a data
> transfer (source and destination addresses, length and a control variable).
> 
> For the sake of simplicity, lets assume a memory space for channel write 0 which
> allows about 42 elements.
> 
> +---------+
> | Desc #0 |--+
> +---------+  |
>              V
>         +----------+
>         | Chunk #0 |---+
>         |  CB = 1  |   |   +----------+   +-----+   +-----------+   +-----+
>         +----------+   +-->| Burst #0 |-->| ... |-->| Burst #41 |-->| llp |
>               |            +----------+   +-----+   +-----------+   +-----+
>               V
>         +----------+
>         | Chunk #1 |---+
>         |  CB = 0  |   |   +-----------+   +-----+   +-----------+   +-----+
>         +----------+   +-->| Burst #42 |-->| ... |-->| Burst #83 |-->| llp |
>               |            +-----------+   +-----+   +-----------+   +-----+
>               V
>         +----------+
>         | Chunk #2 |---+
>         |  CB = 1  |   |   +-----------+   +-----+   +------------+   +-----+
>         +----------+   +-->| Burst #84 |-->| ... |-->| Burst #125 |-->| llp |
>               |            +-----------+   +-----+   +------------+   +-----+
>               V
>         +----------+
>         | Chunk #3 |---+
>         |  CB = 0  |   |   +------------+   +-----+   +------------+   +-----+
>         +----------+   +-->| Burst #126 |-->| ... |-->| Burst #129 |-->| llp |
>                            +------------+   +-----+   +------------+   +-----+

This is great and required to understand the driver.

How does the controller move from Burst #41 of Chunk 0 to Chunk 1?
Is Burst 0 to 129 a linked list with Burst 0, 42, 84 and 126 having a
callback (done bit set)?

This sounds **very** similar to dw dma concepts!

> 
> Legend:
> 
> *Linked list*, also know as Chunk
> *Linked list element*, also know as Burst
> *CB*, also know as Change Bit, it's a control bit (and typically is toggled)
> that allows to easily identify and differentiate between the current linked list
> and the previous or the next one.
> *LLP*, is a special element that indicates the end of the linked list element
> stream also informs that the next CB should be toggle.
> 
> On scatter-gather transfer mode, the client will submit a scatter-gather list of
> n (on this case 130) elements, that will be divide in multiple Chunks, each
> Chunk will have (on this case 42) a limited number of Bursts and after
> transferring all Bursts, an interrupt will be triggered, which will allow to
> recycle the all linked list dedicated memory again with the new information
> relative to the next Chunk and respective Burst associated and repeat the whole
> cycle again.
> 
> On cyclic transfer mode, the client will submit a buffer pointer, length of it
> and number of repetitions, in this case each burst will correspond directly to
> each repetition.
> 
> Each Burst can describes a data transfer from point A(source) to point
> B(destination) with a length that can be from 1 byte up to 4 GB. Since dedicated
> the memory space where the linked list will reside is limited, the whole n burst
> elements will be organized in several Chunks, that will be used later to recycle
> the dedicated memory space to initiate a new sequence of data transfers.
> 
> The whole transfer is considered has completed when it was transferred all bursts.
> 
> Currently this IP has a set well-known register map, which includes support for
> legacy and unroll modes. Legacy mode is version of this register map that has

what's unroll..

> multiplexer register that allows to switch registers between all write and read
> channels and the unroll modes repeats all write and read channels registers with
> an offset between them. This register map is called v0.
> 
> The IP team is creating a new register map more suitable to the latest PCIe
> features, that very likely will change the map register, which this version will
> be called v1. As soon as this new version is released by the IP team the support
> for this version in be included on this driver.
> 
> What do you think? There is any gray area that I could clarify?

This sounds good. But we are also catering to a WIP IP which can change,
right? Doesn't sound like a very good idea to me :)

> >>>> +	dev_dbg(chan2dev(chan), "addr(physical) src=%pa, dst=%pa\n",
> >>>> +		&config->src_addr, &config->dst_addr);
> >>>> +
> >>>> +	chan->src_addr = config->src_addr;
> >>>> +	chan->dst_addr = config->dst_addr;
> >>>> +
> >>>> +	err = ops->device_config(dchan);
> >>>
> >>> what does this do?
> >>
> >> This is an initialization procedure to setup interrupts (data and addresses) to
> >> each channel on the eDMA IP,  in order to be triggered after transfer being
> >> completed or aborted. Due the fact the config() can be called at anytime,
> >> doesn't make sense to have this procedure here, I'll moved it to probe().
> > 
> > Yeah I am not still convinced about having another layer! Have you
> > though about using common lib for common parts ..?
> 
> Maybe I'm explaining myself wrongly. I don't have any clue about the new
> register map for the future versions. I honestly tried to implement the common
> lib for the whole process that interact with dma engine controller to ease in
> the future any new addition of register mapping, by having this common callbacks
> that will only be responsible for interfacing the HW accordingly to register map
> version. Maybe I can simplify something in the future, but I only be able to
> conclude that after having some idea about the new register map.
> 
> IMHO I think this is the easier and clean way to do it, in terms of code
> maintenance and architecture, but if you have another idea that you can show me
> or pointing out for a driver that implements something similar, I'm no problem
> to check it out.

That is what my fear was :)

Let's step back and solve one problem at a time. Right now that is v0 of
the IP. Please write a simple driver which solves v0 without any layers
involved.

Once v1 is in good shape, you will know what is required and then we
can split the v0 driver into a common lib and a v0 driver, and then add
the v1 driver.


> >>> what is the second part of the check, can you explain that, who sets
> >>> chan->dir?
> >>
> >> The chan->dir is set on probe() during the process of configuring each channel.
> > 
> > So you have channels that are unidirectional?
> 
> Yes. That's one another reason IMHO to keep the dw-edma-test separate from
> dma-test, since this IP is more picky and have this particularities.

That is okay; that should be handled by the prep calls. If you get a prep
call for a direction you don't support, return an error and move to the
next available channel.

That can be done generically in the dmatest driver and, to answer your
question, yes I would test that :)

> >>>> +int dw_edma_probe(struct dw_edma_chip *chip)
> >>>> +{
> >>>> +	struct dw_edma *dw = chip->dw;
> >>>> +	struct device *dev = chip->dev;
> >>>> +	const struct dw_edma_core_ops *ops;
> >>>> +	size_t ll_chunk = dw->ll_region.sz;
> >>>> +	size_t dt_chunk = dw->dt_region.sz;
> >>>> +	u32 ch_tot;
> >>>> +	int i, j, err;
> >>>> +
> >>>> +	raw_spin_lock_init(&dw->lock);
> >>>> +
> >>>> +	/* Callback operation selection accordingly to eDMA version */
> >>>> +	switch (dw->version) {
> >>>> +	default:
> >>>> +		dev_err(dev, "unsupported version\n");
> >>>> +		return -EPERM;
> >>>> +	}
> >>>
> >>> So we have only one case which returns error, was this code tested?
> >>
> >> Yes it was, but I understand what your point of view.
> >> This was done like this, because I wanna to segment the patch series like this:
> >>  1) Adding eDMA driver core, which contains the driver skeleton and the whole
> >> logic associated.
> >>  2) and 3) Adding the callbacks for the eDMA register mapping version 0 (it will
> >> appear in the future a new version, I thought that this new version would came
> >> while I was trying to get the feedback about this patch series, therefore would
> >> have another 2 patches for the version 1 isolated and independent from the
> >> version 0).
> >>  4) and 5) Adding the PCIe glue-logic and device ID associated.
> >>  6) Adding maintainer for this driver.
> >>  7) Adding a test driver.
> >>
> >> Since this switch will only have the associated case on patch 2, that why on
> >> patch 1 doesn't appear any possibility.
> >>
> >> If you feel logic to squash patch 2 with patch 1, just say something and I'll do
> >> it for you :)
> > 
> > well each patch should build and work on its own, otherwise we get
> > problems :) But since this is a new driver it is okay. Anyway this patch
> > is quite _huge_ so lets not add more to it..
> > 
> > I would have moved the callback check to subsequent one..
> 
> Sorry. I didn't catch this. Can you explain it?

I would have moved these lines to the next patch to give them the right
meaning; here it feels wrong:

        switch (dw->version) {
        case foo:
                ...
        default:
                ...
        }
-- 
~Vinod

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC v3 1/7] dmaengine: Add Synopsys eDMA IP core driver
  2019-02-01  4:14           ` Vinod Koul
@ 2019-02-01 11:23             ` Gustavo Pimentel
  2019-02-02 10:07               ` Vinod Koul
  0 siblings, 1 reply; 39+ messages in thread
From: Gustavo Pimentel @ 2019-02-01 11:23 UTC (permalink / raw)
  To: Vinod Koul, Gustavo Pimentel
  Cc: linux-pci, dmaengine, Dan Williams, Eugeniy Paltsev,
	Andy Shevchenko, Russell King, Niklas Cassel, Joao Pinto,
	Jose Abreu, Luis Oliveira, Vitor Soares, Nelson Costa,
	Pedro Sousa

On 01/02/2019 04:14, Vinod Koul wrote:
> On 31-01-19, 11:33, Gustavo Pimentel wrote:
>> On 23/01/2019 13:08, Vinod Koul wrote:
>>> On 21-01-19, 15:48, Gustavo Pimentel wrote:
>>>> On 20/01/2019 11:44, Vinod Koul wrote:
>>>>> On 11-01-19, 19:33, Gustavo Pimentel wrote:
> 
>>>>>> @@ -0,0 +1,1059 @@
>>>>>> +// SPDX-License-Identifier: GPL-2.0
>>>>>> +/*
>>>>>> + * Copyright (c) 2018 Synopsys, Inc. and/or its affiliates.
>>>>>
>>>>> 2019 now
>>>>
>>>> I've changed to "Copyright (c) 2018-present Synopsys, Inc. and/or its
>>>> affiliates." this way it's always up to date and I also kept 2018, because it
>>>> was the date that I started to develop this driver, if you don't mind.
>>>
>>> yeah 18 is fine :) it need to end with current year always
>>
>> Just to be sure, are you saying that must be: "Copyright (c) 2018-2019 Synopsys,
>> Inc. and/or its affiliates."?
> 
> Yup :)
> 
>>>>>> +static struct dw_edma_chunk *dw_edma_alloc_chunk(struct dw_edma_desc *desc)
>>>>>> +{
>>>>>> +	struct dw_edma_chan *chan = desc->chan;
>>>>>> +	struct dw_edma *dw = chan->chip->dw;
>>>>>> +	struct dw_edma_chunk *chunk;
>>>>>> +
>>>>>> +	chunk = kvzalloc(sizeof(*chunk), GFP_NOWAIT);
>>>>>> +	if (unlikely(!chunk))
>>>>>> +		return NULL;
>>>>>> +
>>>>>> +	INIT_LIST_HEAD(&chunk->list);
>>>>>> +	chunk->chan = chan;
>>>>>> +	chunk->cb = !(desc->chunks_alloc % 2);
>>>>>
>>>>> cb ..?
>>>>
>>>> CB = change bit, is a property of this eDMA IP. Basically it is a kind of
>>>> handshake which serves to validate whether the linked list has been updated or
>>>> not, especially useful in cases of recycled linked list elements (every linked
>>>> list recycle is a new chunk, this will allow to differentiate each chunk).
>>>
>>> okay please add that somewhere. Also it would help me if you explain
>>> what is chunk and other terminologies used in this driver
>>
>> I'm thinking to put the below description on the patch, please check if this is
>> sufficient explicit and clear to understand what this IP needs and does.
>>
>> In order to transfer data from point A to B as fast as possible this IP requires
>> a dedicated memory space where will reside a linked list of elements.
> 
> rephrasing: a dedicated memory space containing linked list of elements
> 
>> All elements of this linked list are continuous and each one describes a data
>> transfer (source and destination addresses, length and a control variable).
>>
>> For the sake of simplicity, lets assume a memory space for channel write 0 which
>> allows about 42 elements.
>>
>> +---------+
>> | Desc #0 |--+
>> +---------+  |
>>              V
>>         +----------+
>>         | Chunk #0 |---+
>>         |  CB = 1  |   |   +----------+   +-----+   +-----------+   +-----+
>>         +----------+   +-->| Burst #0 |-->| ... |-->| Burst #41 |-->| llp |
>>               |            +----------+   +-----+   +-----------+   +-----+
>>               V
>>         +----------+
>>         | Chunk #1 |---+
>>         |  CB = 0  |   |   +-----------+   +-----+   +-----------+   +-----+
>>         +----------+   +-->| Burst #42 |-->| ... |-->| Burst #83 |-->| llp |
>>               |            +-----------+   +-----+   +-----------+   +-----+
>>               V
>>         +----------+
>>         | Chunk #2 |---+
>>         |  CB = 1  |   |   +-----------+   +-----+   +------------+   +-----+
>>         +----------+   +-->| Burst #84 |-->| ... |-->| Burst #125 |-->| llp |
>>               |            +-----------+   +-----+   +------------+   +-----+
>>               V
>>         +----------+
>>         | Chunk #3 |---+
>>         |  CB = 0  |   |   +------------+   +-----+   +------------+   +-----+
>>         +----------+   +-->| Burst #126 |-->| ... |-->| Burst #129 |-->| llp |
>>                            +------------+   +-----+   +------------+   +-----+
> 
> This is great and required to understand the driver.
> 
> How does the controller move from Burst #41 of Chunk 0 to Chunk 1 ?

I forgot to explain that...

The last Burst of each Chunk (Burst #41, Burst #83, Burst #125 and also Burst
#129) has some flags set in its control variable (the RIE and LIE bits) that
trigger a "done" interrupt.

In the interrupt callback it is decided whether to recycle the linked list
memory space by writing a new set of Burst elements (if there are still Chunks
left to transfer) or to consider the transfer completed (if there are no
Chunks left to transfer).
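
The flow I'm describing can be sketched roughly like this (plain C with
invented names, just to illustrate the decision, not the actual driver code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of the "done" interrupt decision described above.
 * All names are illustrative; this is not dw-edma code.
 */
struct xfer_state {
	size_t chunks_total;	/* chunks the transfer was sliced into */
	size_t chunks_done;	/* chunks already moved by the HW */
	int cb;			/* change bit for the next chunk */
};

/* Called from the "done" interrupt; returns true when the whole
 * transfer has completed.
 */
static bool on_done_irq(struct xfer_state *x)
{
	x->chunks_done++;
	if (x->chunks_done < x->chunks_total) {
		/* Chunks still pending: recycle the linked-list memory
		 * by writing the next chunk's Burst elements, toggling
		 * the change bit so the HW sees a fresh list.
		 */
		x->cb = !x->cb;
		return false;
	}
	return true;	/* no chunks left: transfer completed */
}
```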

> Is Burst 0 to 129 a link list with Burst 0, 42, 84 and 126 having a
> callback (done bit set)..

I didn't quite understand your question.

It comes from the prep_slave_sg n elements (130 in the example), which will be
divided into several Chunks (#0, #1, #2 and #3 in the example), i.e. a linked
list whose nodes each contain another linked list for the Bursts, plus the CB
and the Chunk size. The linked list inside each Chunk will contain a number of
Bursts (limited by the memory space size), each one holding the Source
Address, Destination Address and size of that sub-transfer.
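
In other words, the slicing could be sketched like this (illustrative C only,
with an invented name and a fixed limit taken from the example; the real
driver of course builds proper list structures):

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative slicing of a prep_slave_sg request into Chunks of
 * Bursts; all names and the fixed limit are just for the example.
 */
#define MAX_BURSTS_PER_CHUNK 42	/* limit from the example above */

struct burst {
	size_t src;	/* source address */
	size_t dst;	/* destination address */
	size_t len;	/* sub-transfer size */
};

/* Walk the n scatter-gather entries, starting a new Chunk (and
 * toggling the change bit) every MAX_BURSTS_PER_CHUNK Bursts.
 * Returns how many Chunks the transfer was sliced into.
 */
static size_t slice_into_chunks(const struct burst *sg, size_t n)
{
	size_t i, chunks = 0;
	int cb = 0;

	for (i = 0; i < n; i++) {
		if (i % MAX_BURSTS_PER_CHUNK == 0) {
			chunks++;	/* begin a new Chunk */
			cb = !cb;	/* toggle CB for it */
		}
		/* here sg[i] (src, dst, len) would be written into the
		 * Chunk's linked-list memory */
	}
	(void)sg;
	return chunks;
}
```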

> 
> This sound **very** similar to dw dma concepts!

I believe some parts of dw dma and dw edma behavior are similar, which makes
perfect sense since both DMAs are done by Synopsys and there may be some
shared knowledge; however, they are different IPs applied to different
products.

> 
>>
>> Legend:
>>
>> *Linked list*, also know as Chunk
>> *Linked list element*, also know as Burst
>> *CB*, also know as Change Bit, it's a control bit (and typically is toggled)
>> that allows to easily identify and differentiate between the current linked list
>> and the previous or the next one.
>> *LLP*, is a special element that indicates the end of the linked list element
>> stream also informs that the next CB should be toggle.
>>
>> On scatter-gather transfer mode, the client will submit a scatter-gather list of
>> n (on this case 130) elements, that will be divide in multiple Chunks, each
>> Chunk will have (on this case 42) a limited number of Bursts and after
>> transferring all Bursts, an interrupt will be triggered, which will allow to
>> recycle the all linked list dedicated memory again with the new information
>> relative to the next Chunk and respective Burst associated and repeat the whole
>> cycle again.
>>
>> On cyclic transfer mode, the client will submit a buffer pointer, length of it
>> and number of repetitions, in this case each burst will correspond directly to
>> each repetition.
>>
>> Each Burst can describes a data transfer from point A(source) to point
>> B(destination) with a length that can be from 1 byte up to 4 GB. Since dedicated
>> the memory space where the linked list will reside is limited, the whole n burst
>> elements will be organized in several Chunks, that will be used later to recycle
>> the dedicated memory space to initiate a new sequence of data transfers.
>>
>> The whole transfer is considered has completed when it was transferred all bursts.
>>
>> Currently this IP has a set well-known register map, which includes support for
>> legacy and unroll modes. Legacy mode is version of this register map that has
> 
> whats  unroll..
> 
>> multiplexer register that allows to switch registers between all write and read

The unroll is explained here, see below

>> channels and the unroll modes repeats all write and read channels registers with
>> an offset between them. This register map is called v0.
>>
>> The IP team is creating a new register map more suitable to the latest PCIe
>> features, that very likely will change the map register, which this version will
>> be called v1. As soon as this new version is released by the IP team the support
>> for this version in be included on this driver.
>>
>> What do you think? There is any gray area that I could clarify?
> 
> This sounds good. But we are also catering to a WIP IP which can change
> right. Doesnt sound very good idea to me :)

This IP has existed like this for several years and works quite well; however,
because of new features and requests that are coming (SR-IOV, more DMA
channels, function segregation and isolation, performance improvements), it's
natural for improvements to appear. The driver should follow that evolution
and be robust enough to adapt to these new circumstances.

> 
>>>>>> +	dev_dbg(chan2dev(chan), "addr(physical) src=%pa, dst=%pa\n",
>>>>>> +		&config->src_addr, &config->dst_addr);
>>>>>> +
>>>>>> +	chan->src_addr = config->src_addr;
>>>>>> +	chan->dst_addr = config->dst_addr;
>>>>>> +
>>>>>> +	err = ops->device_config(dchan);
>>>>>
>>>>> what does this do?
>>>>
>>>> This is an initialization procedure to setup interrupts (data and addresses) to
>>>> each channel on the eDMA IP,  in order to be triggered after transfer being
>>>> completed or aborted. Due the fact the config() can be called at anytime,
>>>> doesn't make sense to have this procedure here, I'll moved it to probe().
>>>
>>> Yeah I am not still convinced about having another layer! Have you
>>> though about using common lib for common parts ..?
>>
>> Maybe I'm explaining myself wrongly. I don't have any clue about the new
>> register map for the future versions. I honestly tried to implement the common
>> lib for the whole process that interact with dma engine controller to ease in
>> the future any new addition of register mapping, by having this common callbacks
>> that will only be responsible for interfacing the HW accordingly to register map
>> version. Maybe I can simplify something in the future, but I only be able to
>> conclude that after having some idea about the new register map.
>>
>> IMHO I think this is the easier and clean way to do it, in terms of code
>> maintenance and architecture, but if you have another idea that you can show me
>> or pointing out for a driver that implements something similar, I'm no problem
>> to check it out.
> 
> That is what my fear was :)
> 
> Lets step back and solve one problem at a time. Right now that is v0 of
> IP. Please write a simple driver which solve v0 without any layers
> involved.
> 
> Once v1 is in good shape, you would know what it required and then we
> can split v0 driver into common lib and v0 driver and then add v1
> driver.

Can I keep the code segregation as it is now? With the dw-edma-v0-core.c/h,
dw-edma-v0-debugfs.c/h and dw-edma-v0-regs.h

That way I would only replace the callback calls with direct function calls
and remove the switch-case callback selection based on the version.
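
Just to make sure we mean the same thing, something like this (a sketch only,
with invented names, not the driver's real interfaces):

```c
#include <assert.h>

/* Before: an ops table selected at probe time by a version switch. */
struct edma_core_ops {
	int (*device_config)(void *chan);
};

/* After: only v0 exists, so the core calls the v0 function directly;
 * the dw-edma-v0-*.c/h split is kept purely as file organization.
 */
static int dw_edma_v0_device_config(void *chan)
{
	(void)chan;
	return 0;	/* stand-in for the real register programming */
}

static int dw_edma_device_config(void *chan)
{
	/* no version switch, no callback table */
	return dw_edma_v0_device_config(chan);
}
```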

> 
> 
>>>>> what is the second part of the check, can you explain that, who sets
>>>>> chan->dir?
>>>>
>>>> The chan->dir is set on probe() during the process of configuring each channel.
>>>
>>> So you have channels that are unidirectional?
>>
>> Yes. That's one another reason IMHO to keep the dw-edma-test separate from
>> dma-test, since this IP is more picky and have this particularities.
> 
> That is okay, that should be handled by prep calls, if you get a prep
> call for direction you dont support return error and move to next
> available one.
> 
> That can be done generically in dmatest driver and to answer your
> question, yes I would test that :)

Like you said, let's do this in small steps. For now I would like to suggest
leaving out the dw-edma-test driver and just focusing on the current driver,
if you don't mind. I never thought that this test driver would raise this kind
of discussion.

> 
>>>>>> +int dw_edma_probe(struct dw_edma_chip *chip)
>>>>>> +{
>>>>>> +	struct dw_edma *dw = chip->dw;
>>>>>> +	struct device *dev = chip->dev;
>>>>>> +	const struct dw_edma_core_ops *ops;
>>>>>> +	size_t ll_chunk = dw->ll_region.sz;
>>>>>> +	size_t dt_chunk = dw->dt_region.sz;
>>>>>> +	u32 ch_tot;
>>>>>> +	int i, j, err;
>>>>>> +
>>>>>> +	raw_spin_lock_init(&dw->lock);
>>>>>> +
>>>>>> +	/* Callback operation selection accordingly to eDMA version */
>>>>>> +	switch (dw->version) {
>>>>>> +	default:
>>>>>> +		dev_err(dev, "unsupported version\n");
>>>>>> +		return -EPERM;
>>>>>> +	}
>>>>>
>>>>> So we have only one case which returns error, was this code tested?
>>>>
>>>> Yes it was, but I understand what your point of view.
>>>> This was done like this, because I wanna to segment the patch series like this:
>>>>  1) Adding eDMA driver core, which contains the driver skeleton and the whole
>>>> logic associated.
>>>>  2) and 3) Adding the callbacks for the eDMA register mapping version 0 (it will
>>>> appear in the future a new version, I thought that this new version would came
>>>> while I was trying to get the feedback about this patch series, therefore would
>>>> have another 2 patches for the version 1 isolated and independent from the
>>>> version 0).
>>>>  4) and 5) Adding the PCIe glue-logic and device ID associated.
>>>>  6) Adding maintainer for this driver.
>>>>  7) Adding a test driver.
>>>>
>>>> Since this switch will only have the associated case on patch 2, that why on
>>>> patch 1 doesn't appear any possibility.
>>>>
>>>> If you feel logic to squash patch 2 with patch 1, just say something and I'll do
>>>> it for you :)
>>>
>>> well each patch should build and work on its own, otherwise we get
>>> problems :) But since this is a new driver it is okay. Anyway this patch
>>> is quite _huge_ so lets not add more to it..
>>>
>>> I would have moved the callback check to subsequent one..
>>
>> Sorry. I didn't catch this. Can you explain it?
> 
> I would have moved these lines to next patch to give them right meaning,
> here it feels wrong:
> 
>         switch (dw->version) {
>         case foo:
>                 ...
>         default:
>                 ...
>         }
> 


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC v3 1/7] dmaengine: Add Synopsys eDMA IP core driver
  2019-02-01 11:23             ` Gustavo Pimentel
@ 2019-02-02 10:07               ` Vinod Koul
  2019-02-06 18:06                 ` Gustavo Pimentel
  0 siblings, 1 reply; 39+ messages in thread
From: Vinod Koul @ 2019-02-02 10:07 UTC (permalink / raw)
  To: Gustavo Pimentel
  Cc: linux-pci, dmaengine, Dan Williams, Eugeniy Paltsev,
	Andy Shevchenko, Russell King, Niklas Cassel, Joao Pinto,
	Jose Abreu, Luis Oliveira, Vitor Soares, Nelson Costa,
	Pedro Sousa

On 01-02-19, 11:23, Gustavo Pimentel wrote:
> On 01/02/2019 04:14, Vinod Koul wrote:
> > On 31-01-19, 11:33, Gustavo Pimentel wrote:
> >> On 23/01/2019 13:08, Vinod Koul wrote:
> >>> On 21-01-19, 15:48, Gustavo Pimentel wrote:
> >>>> On 20/01/2019 11:44, Vinod Koul wrote:
> >>>>> On 11-01-19, 19:33, Gustavo Pimentel wrote:
> > 
> >>>>>> @@ -0,0 +1,1059 @@
> >>>>>> +// SPDX-License-Identifier: GPL-2.0
> >>>>>> +/*
> >>>>>> + * Copyright (c) 2018 Synopsys, Inc. and/or its affiliates.
> >>>>>
> >>>>> 2019 now
> >>>>
> >>>> I've changed to "Copyright (c) 2018-present Synopsys, Inc. and/or its
> >>>> affiliates." this way it's always up to date and I also kept 2018, because it
> >>>> was the date that I started to develop this driver, if you don't mind.
> >>>
> >>> yeah 18 is fine :) it need to end with current year always
> >>
> >> Just to be sure, are you saying that must be: "Copyright (c) 2018-2019 Synopsys,
> >> Inc. and/or its affiliates."?
> > 
> > Yup :)
> > 
> >>>>>> +static struct dw_edma_chunk *dw_edma_alloc_chunk(struct dw_edma_desc *desc)
> >>>>>> +{
> >>>>>> +	struct dw_edma_chan *chan = desc->chan;
> >>>>>> +	struct dw_edma *dw = chan->chip->dw;
> >>>>>> +	struct dw_edma_chunk *chunk;
> >>>>>> +
> >>>>>> +	chunk = kvzalloc(sizeof(*chunk), GFP_NOWAIT);
> >>>>>> +	if (unlikely(!chunk))
> >>>>>> +		return NULL;
> >>>>>> +
> >>>>>> +	INIT_LIST_HEAD(&chunk->list);
> >>>>>> +	chunk->chan = chan;
> >>>>>> +	chunk->cb = !(desc->chunks_alloc % 2);
> >>>>>
> >>>>> cb ..?
> >>>>
> >>>> CB = change bit, is a property of this eDMA IP. Basically it is a kind of
> >>>> handshake which serves to validate whether the linked list has been updated or
> >>>> not, especially useful in cases of recycled linked list elements (every linked
> >>>> list recycle is a new chunk, this will allow to differentiate each chunk).
> >>>
> >>> okay please add that somewhere. Also it would help me if you explain
> >>> what is chunk and other terminologies used in this driver
> >>
> >> I'm thinking to put the below description on the patch, please check if this is
> >> sufficient explicit and clear to understand what this IP needs and does.
> >>
> >> In order to transfer data from point A to B as fast as possible this IP requires
> >> a dedicated memory space where will reside a linked list of elements.
> > 
> > rephrasing: a dedicated memory space containing linked list of elements
> > 
> >> All elements of this linked list are continuous and each one describes a data
> >> transfer (source and destination addresses, length and a control variable).
> >>
> >> For the sake of simplicity, lets assume a memory space for channel write 0 which
> >> allows about 42 elements.
> >>
> >> +---------+
> >> | Desc #0 |--+
> >> +---------+  |
> >>              V
> >>         +----------+
> >>         | Chunk #0 |---+
> >>         |  CB = 1  |   |   +----------+   +-----+   +-----------+   +-----+
> >>         +----------+   +-->| Burst #0 |-->| ... |-->| Burst #41 |-->| llp |
> >>               |            +----------+   +-----+   +-----------+   +-----+
> >>               V
> >>         +----------+
> >>         | Chunk #1 |---+
> >>         |  CB = 0  |   |   +-----------+   +-----+   +-----------+   +-----+
> >>         +----------+   +-->| Burst #42 |-->| ... |-->| Burst #83 |-->| llp |
> >>               |            +-----------+   +-----+   +-----------+   +-----+
> >>               V
> >>         +----------+
> >>         | Chunk #2 |---+
> >>         |  CB = 1  |   |   +-----------+   +-----+   +------------+   +-----+
> >>         +----------+   +-->| Burst #84 |-->| ... |-->| Burst #125 |-->| llp |
> >>               |            +-----------+   +-----+   +------------+   +-----+
> >>               V
> >>         +----------+
> >>         | Chunk #3 |---+
> >>         |  CB = 0  |   |   +------------+   +-----+   +------------+   +-----+
> >>         +----------+   +-->| Burst #126 |-->| ... |-->| Burst #129 |-->| llp |
> >>                            +------------+   +-----+   +------------+   +-----+
> > 
> > This is great and required to understand the driver.
> > 
> > How does the controller move from Burst #41 of Chunk 0 to Chunk 1 ?
> 
> I forgot to explain that...
> 
> The last Burst of each Chunk (Burst #41, Burst #83, Burst #125 and also Burst
> #129) has some flags set in its control variable (the RIE and LIE bits) that
> trigger a "done" interrupt.
> 
> In the interrupt callback it is decided whether to recycle the linked list
> memory space by writing a new set of Burst elements (if there are still Chunks
> left to transfer) or to consider the transfer completed (if there are no
> Chunks left to transfer).
> 
> > Is Burst 0 to 129 a link list with Burst 0, 42, 84 and 126 having a
> > callback (done bit set)..
> 
> I didn't quite understand your question.
> 
> It comes from the prep_slave_sg n elements (130 in the example), which will be
> divided into several Chunks (#0, #1, #2 and #3 in the example), i.e. a linked
> list whose nodes each contain another linked list for the Bursts, plus the CB
> and the Chunk size. The linked list inside each Chunk will contain a number of
> Bursts (limited by the memory space size), each one holding the Source
> Address, Destination Address and size of that sub-transfer.

Thanks this helps :)

so in this case what do the prep_slave_sg n elements correspond to (130
bursts)..? If so, how do you split a transaction into chunks and bursts?


> 
> > 
> > This sound **very** similar to dw dma concepts!
> 
> I believe some parts of dw dma and dw edma behavior are similar, which makes
> perfect sense since both DMAs are done by Synopsys and there may be some
> shared knowledge; however, they are different IPs applied to different
> products.
> 
> > 
> >>
> >> Legend:
> >>
> >> *Linked list*, also know as Chunk
> >> *Linked list element*, also know as Burst
> >> *CB*, also know as Change Bit, it's a control bit (and typically is toggled)
> >> that allows to easily identify and differentiate between the current linked list
> >> and the previous or the next one.
> >> *LLP*, is a special element that indicates the end of the linked list element
> >> stream also informs that the next CB should be toggle.
> >>
> >> On scatter-gather transfer mode, the client will submit a scatter-gather list of
> >> n (on this case 130) elements, that will be divide in multiple Chunks, each
> >> Chunk will have (on this case 42) a limited number of Bursts and after
> >> transferring all Bursts, an interrupt will be triggered, which will allow to
> >> recycle the all linked list dedicated memory again with the new information
> >> relative to the next Chunk and respective Burst associated and repeat the whole
> >> cycle again.
> >>
> >> On cyclic transfer mode, the client will submit a buffer pointer, length of it
> >> and number of repetitions, in this case each burst will correspond directly to
> >> each repetition.
> >>
> >> Each Burst can describes a data transfer from point A(source) to point
> >> B(destination) with a length that can be from 1 byte up to 4 GB. Since dedicated
> >> the memory space where the linked list will reside is limited, the whole n burst
> >> elements will be organized in several Chunks, that will be used later to recycle
> >> the dedicated memory space to initiate a new sequence of data transfers.
> >>
> >> The whole transfer is considered has completed when it was transferred all bursts.
> >>
> >> Currently this IP has a set well-known register map, which includes support for
> >> legacy and unroll modes. Legacy mode is version of this register map that has
> > 
> > whats  unroll..
> > 
> >> multiplexer register that allows to switch registers between all write and read
> 
> The unroll is explained here, see below
> 
> >> channels and the unroll modes repeats all write and read channels registers with
> >> an offset between them. This register map is called v0.
> >>
> >> The IP team is creating a new register map more suitable to the latest PCIe
> >> features, that very likely will change the map register, which this version will
> >> be called v1. As soon as this new version is released by the IP team the support
> >> for this version in be included on this driver.
> >>
> >> What do you think? There is any gray area that I could clarify?
> > 
> > This sounds good. But we are also catering to a WIP IP which can change
> > right. Doesnt sound very good idea to me :)
> 
> This IP has existed like this for several years and works quite well; however,
> because of new features and requests that are coming (SR-IOV, more DMA
> channels, function segregation and isolation, performance improvements), it's
> natural for improvements to appear. The driver should follow that evolution
> and be robust enough to adapt to these new circumstances.

Kernel drivers should be modular by design; if we keep things simple then
adding/splitting to support future revisions should be easy!

> >>>>>> +	dev_dbg(chan2dev(chan), "addr(physical) src=%pa, dst=%pa\n",
> >>>>>> +		&config->src_addr, &config->dst_addr);
> >>>>>> +
> >>>>>> +	chan->src_addr = config->src_addr;
> >>>>>> +	chan->dst_addr = config->dst_addr;
> >>>>>> +
> >>>>>> +	err = ops->device_config(dchan);
> >>>>>
> >>>>> what does this do?
> >>>>
> >>>> This is an initialization procedure to setup interrupts (data and addresses) to
> >>>> each channel on the eDMA IP,  in order to be triggered after transfer being
> >>>> completed or aborted. Due the fact the config() can be called at anytime,
> >>>> doesn't make sense to have this procedure here, I'll moved it to probe().
> >>>
> >>> Yeah I am not still convinced about having another layer! Have you
> >>> though about using common lib for common parts ..?
> >>
> >> Maybe I'm explaining myself wrongly. I don't have any clue about the new
> >> register map for the future versions. I honestly tried to implement the common
> >> lib for the whole process that interact with dma engine controller to ease in
> >> the future any new addition of register mapping, by having this common callbacks
> >> that will only be responsible for interfacing the HW accordingly to register map
> >> version. Maybe I can simplify something in the future, but I only be able to
> >> conclude that after having some idea about the new register map.
> >>
> >> IMHO I think this is the easier and clean way to do it, in terms of code
> >> maintenance and architecture, but if you have another idea that you can show me
> >> or pointing out for a driver that implements something similar, I'm no problem
> >> to check it out.
> > 
> > That is what my fear was :)
> > 
> > Lets step back and solve one problem at a time. Right now that is v0 of
> > IP. Please write a simple driver which solve v0 without any layers
> > involved.
> > 
> > Once v1 is in good shape, you would know what it required and then we
> > can split v0 driver into common lib and v0 driver and then add v1
> > driver.
> 
> Can I keep the code segregation as it is now? With the dw-edma-v0-core.c/h,
> dw-edma-v0-debugfs.c/h and dw-edma-v0-regs.h
> 
> That way I would only replace the callback calls with direct function calls
> and remove the switch-case callback selection based on the version.

That sounds better; please keep patches small (anything going over 600 lines
becomes harder to review) and logically split.

> >>>>> what is the second part of the check, can you explain that, who sets
> >>>>> chan->dir?
> >>>>
> >>>> The chan->dir is set on probe() during the process of configuring each channel.
> >>>
> >>> So you have channels that are unidirectional?
> >>
> >> Yes. That's one another reason IMHO to keep the dw-edma-test separate from
> >> dma-test, since this IP is more picky and have this particularities.
> > 
> > That is okay, that should be handled by prep calls, if you get a prep
> > call for direction you dont support return error and move to next
> > available one.
> > 
> > That can be done generically in dmatest driver and to answer your
> > question, yes I would test that :)
> 
> Like you said, let's do this in small steps. For now I would like to suggest
> leaving out the dw-edma-test driver and just focusing on the current driver,
> if you don't mind. I never thought that this test driver would raise this
> kind of discussion.

Well, dmatest is just a testing aid and doesn't care about the internal
design of your driver. I would say keep it, as it is a useful aid and will
help other folks as well :)
-- 
~Vinod

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC v3 1/7] dmaengine: Add Synopsys eDMA IP core driver
  2019-02-02 10:07               ` Vinod Koul
@ 2019-02-06 18:06                 ` Gustavo Pimentel
  0 siblings, 0 replies; 39+ messages in thread
From: Gustavo Pimentel @ 2019-02-06 18:06 UTC (permalink / raw)
  To: Vinod Koul, Gustavo Pimentel
  Cc: linux-pci, dmaengine, Dan Williams, Eugeniy Paltsev,
	Andy Shevchenko, Russell King, Niklas Cassel, Joao Pinto,
	Jose Abreu, Luis Oliveira, Vitor Soares, Nelson Costa,
	Pedro Sousa

On 02/02/2019 10:07, Vinod Koul wrote:
> On 01-02-19, 11:23, Gustavo Pimentel wrote:
>> On 01/02/2019 04:14, Vinod Koul wrote:
>>> On 31-01-19, 11:33, Gustavo Pimentel wrote:
>>>> On 23/01/2019 13:08, Vinod Koul wrote:
>>>>> On 21-01-19, 15:48, Gustavo Pimentel wrote:
>>>>>> On 20/01/2019 11:44, Vinod Koul wrote:
>>>>>>> On 11-01-19, 19:33, Gustavo Pimentel wrote:
>>>
>>>>>>>> @@ -0,0 +1,1059 @@
>>>>>>>> +// SPDX-License-Identifier: GPL-2.0
>>>>>>>> +/*
>>>>>>>> + * Copyright (c) 2018 Synopsys, Inc. and/or its affiliates.
>>>>>>>
>>>>>>> 2019 now
>>>>>>
>>>>>> I've changed to "Copyright (c) 2018-present Synopsys, Inc. and/or its
>>>>>> affiliates." this way it's always up to date and I also kept 2018, because it
>>>>>> was the date that I started to develop this driver, if you don't mind.
>>>>>
>>>>> yeah 18 is fine :) it need to end with current year always
>>>>
>>>> Just to be sure, are you saying that must be: "Copyright (c) 2018-2019 Synopsys,
>>>> Inc. and/or its affiliates."?
>>>
>>> Yup :)
>>>
>>>>>>>> +static struct dw_edma_chunk *dw_edma_alloc_chunk(struct dw_edma_desc *desc)
>>>>>>>> +{
>>>>>>>> +	struct dw_edma_chan *chan = desc->chan;
>>>>>>>> +	struct dw_edma *dw = chan->chip->dw;
>>>>>>>> +	struct dw_edma_chunk *chunk;
>>>>>>>> +
>>>>>>>> +	chunk = kvzalloc(sizeof(*chunk), GFP_NOWAIT);
>>>>>>>> +	if (unlikely(!chunk))
>>>>>>>> +		return NULL;
>>>>>>>> +
>>>>>>>> +	INIT_LIST_HEAD(&chunk->list);
>>>>>>>> +	chunk->chan = chan;
>>>>>>>> +	chunk->cb = !(desc->chunks_alloc % 2);
>>>>>>>
>>>>>>> cb ..?
>>>>>>
>>>>>> CB = change bit, is a property of this eDMA IP. Basically it is a kind of
>>>>>> handshake which serves to validate whether the linked list has been updated or
>>>>>> not, especially useful in cases of recycled linked list elements (every linked
>>>>>> list recycle is a new chunk, this will allow to differentiate each chunk).
>>>>>
>>>>> okay please add that somewhere. Also it would help me if you explain
>>>>> what is chunk and other terminologies used in this driver
>>>>
>>>> I'm thinking to put the below description on the patch, please check if this is
>>>> sufficient explicit and clear to understand what this IP needs and does.
>>>>
>>>> In order to transfer data from point A to B as fast as possible this IP requires
>>>> a dedicated memory space where will reside a linked list of elements.
>>>
>>> rephrasing: a dedicated memory space containing linked list of elements
>>>
>>>> All elements of this linked list are continuous and each one describes a data
>>>> transfer (source and destination addresses, length and a control variable).
>>>>
>>>> For the sake of simplicity, lets assume a memory space for channel write 0 which
>>>> allows about 42 elements.
>>>>
>>>> +---------+
>>>> | Desc #0 |--+
>>>> +---------+  |
>>>>              V
>>>>         +----------+
>>>>         | Chunk #0 |---+
>>>>         |  CB = 1  |   |   +----------+   +-----+   +-----------+   +-----+
>>>>         +----------+   +-->| Burst #0 |-->| ... |-->| Burst #41 |-->| llp |
>>>>               |            +----------+   +-----+   +-----------+   +-----+
>>>>               V
>>>>         +----------+
>>>>         | Chunk #1 |---+
>>>>         |  CB = 0  |   |   +-----------+   +-----+   +-----------+   +-----+
>>>>         +----------+   +-->| Burst #42 |-->| ... |-->| Burst #83 |-->| llp |
>>>>               |            +-----------+   +-----+   +-----------+   +-----+
>>>>               V
>>>>         +----------+
>>>>         | Chunk #2 |---+
>>>>         |  CB = 1  |   |   +-----------+   +-----+   +------------+   +-----+
>>>>         +----------+   +-->| Burst #84 |-->| ... |-->| Burst #125 |-->| llp |
>>>>               |            +-----------+   +-----+   +------------+   +-----+
>>>>               V
>>>>         +----------+
>>>>         | Chunk #3 |---+
>>>>         |  CB = 0  |   |   +------------+   +-----+   +------------+   +-----+
>>>>         +----------+   +-->| Burst #126 |-->| ... |-->| Burst #129 |-->| llp |
>>>>                            +------------+   +-----+   +------------+   +-----+
>>>
>>> This is great and required to understand the driver.
>>>
>>> How does the controller move from Burst #41 of Chunk 0 to Chunk 1 ?
>>
>> I forgot to explain that...
>>
>> The last Burst of each Chunk (Burst #41, Burst #83, Burst #125 and also Burst
>> #129) has some flags set in its control variable (the RIE and LIE bits) that
>> trigger a "done" interrupt.
>>
>> In the interrupt callback it is decided whether to recycle the linked list
>> memory space by writing a new set of Burst elements (if there are still Chunks
>> left to transfer) or to consider the transfer completed (if there are no
>> Chunks left to transfer).
>>
>>> Is Burst 0 to 129 a link list with Burst 0, 42, 84 and 126 having a
>>> callback (done bit set)..
>>
>> I didn't quite understand your question.
>>
>> It comes from the prep_slave_sg n elements (130 in the example), which will be
>> divided into several Chunks (#0, #1, #2 and #3 in the example), i.e. a linked
>> list whose nodes each contain another linked list for the Bursts, plus the CB
>> and the Chunk size. The linked list inside each Chunk will contain a number of
>> Bursts (limited by the memory space size), each one holding the Source
>> Address, Destination Address and size of that sub-transfer.
> 
> Thanks this helps :)
> 
> so in this case what do the prep_slave_sg n elements correspond to (130
> bursts)..? If so, how do you split a transaction into chunks and bursts?

In the example case, the prep_slave_sg API receives a total of 130 elements
and the dw-edma-core implementation will slice them into 4 chunks (the first 3
chunks containing 42 bursts each and the last chunk containing only 4 bursts).

The maximum burst capacity of each chunk depends on the linked list memory
space available (or destined for it); in the example I didn't specify the
actual amount of space, but the maximum number of bursts is calculated by:

maximum number of bursts = (linked list memory size / 24) - 1

The size is in bytes and 24 is the size in bytes of each burst information
element. One element is also subtracted because at the end of each burst
stream the llp element is needed, and for simplicity its size is considered
equal to that of a burst element.
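
That arithmetic, as a quick sketch (the 24-byte element size and the reserved
llp slot are as stated above; the function names are made up for the example):

```c
#include <assert.h>
#include <stddef.h>

#define BURST_ELEM_SIZE 24	/* bytes per burst information element */

/* One slot is reserved for the terminating llp element, whose size is
 * taken to be the same as a burst element's.
 */
static size_t max_bursts_per_chunk(size_t ll_mem_size)
{
	return ll_mem_size / BURST_ELEM_SIZE - 1;
}

/* Ceiling division: how many chunks an n-burst transfer needs. */
static size_t chunks_needed(size_t nr_bursts, size_t ll_mem_size)
{
	size_t per_chunk = max_bursts_per_chunk(ll_mem_size);

	return (nr_bursts + per_chunk - 1) / per_chunk;
}
```

With a linked-list region of 43 * 24 = 1032 bytes this gives 42 bursts per
chunk, so the 130-element example needs 4 chunks, matching the figure above.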

> 
> 
>>
>>>
>>> This sound **very** similar to dw dma concepts!
>>
>> I believe some parts of dw dma and dw edma behavior are similar, which makes
>> perfect sense since both DMAs are done by Synopsys and there may be some
>> shared knowledge; however, they are different IPs applied to different
>> products.
>>
>>>
>>>>
>>>> Legend:
>>>>
>>>> *Linked list*, also known as Chunk
>>>> *Linked list element*, also known as Burst
>>>> *CB*, also known as Change Bit, is a control bit (typically toggled) that
>>>> allows easily identifying and differentiating between the current linked
>>>> list and the previous or the next one.
>>>> *LLP* is a special element that indicates the end of the linked list
>>>> element stream and also informs that the next CB should be toggled.
>>>>
>>>> In scatter-gather transfer mode, the client submits a scatter-gather list
>>>> of n (in this case 130) elements, which will be divided into multiple
>>>> Chunks. Each Chunk holds a limited number of Bursts (in this case 42), and
>>>> after all its Bursts have been transferred, an interrupt is triggered,
>>>> which allows recycling the linked list dedicated memory with the
>>>> information of the next Chunk and its associated Bursts, repeating the
>>>> whole cycle again.
>>>>
>>>> In cyclic transfer mode, the client submits a buffer pointer, its length
>>>> and the number of repetitions; in this case each Burst corresponds
>>>> directly to one repetition.
>>>>
>>>> Each Burst describes a data transfer from point A (source) to point B
>>>> (destination) with a length that can range from 1 byte up to 4 GB. Since
>>>> the dedicated memory space where the linked list resides is limited, the
>>>> whole set of n Burst elements is organized into several Chunks, which are
>>>> used later to recycle the dedicated memory space and initiate a new
>>>> sequence of data transfers.
>>>>
>>>> The whole transfer is considered complete when all Bursts have been transferred.
>>>>
>>>> Currently this IP has a well-known register map, which includes support for
>>>> legacy and unroll modes. Legacy mode is the version of this register map that has a
>>>
>>> what's unroll?
>>>
>>>> multiplexer register that allows switching the registers between all write and read
>>
>> The unroll is explained here, see below
>>
>>>> channels, while unroll mode repeats all write and read channel registers
>>>> with an offset between them. This register map is called v0.
>>>>
>>>> The IP team is creating a new register map more suitable for the latest
>>>> PCIe features, which very likely will change the register map; this version
>>>> will be called v1. As soon as this new version is released by the IP team,
>>>> support for it will be included in this driver.
>>>>
>>>> What do you think? There is any gray area that I could clarify?
>>>
>>> This sounds good. But we are also catering to a WIP IP which can change,
>>> right? Doesn't sound like a very good idea to me :)
>>
>> This IP has existed like this for several years and works quite well;
>> however, because of new features and requests that are coming (SR-IOV, more
>> DMA channels, function segregation and isolation, performance improvements),
>> it's natural for improvements to appear. The drivers should follow that
>> evolution and be robust enough to adapt to the new circumstances.
> 
> Kernel drivers should be modular by design; if we keep things simple then
> adding/splitting to support future revisions should be easy!
> 
>>>>>>>> +	dev_dbg(chan2dev(chan), "addr(physical) src=%pa, dst=%pa\n",
>>>>>>>> +		&config->src_addr, &config->dst_addr);
>>>>>>>> +
>>>>>>>> +	chan->src_addr = config->src_addr;
>>>>>>>> +	chan->dst_addr = config->dst_addr;
>>>>>>>> +
>>>>>>>> +	err = ops->device_config(dchan);
>>>>>>>
>>>>>>> what does this do?
>>>>>>
>>>>>> This is an initialization procedure to set up interrupts (data and
>>>>>> addresses) for each channel on the eDMA IP, so that they are triggered
>>>>>> after a transfer completes or aborts. Since config() can be called at
>>>>>> any time, it doesn't make sense to have this procedure here; I'll move
>>>>>> it to probe().
>>>>>
>>>>> Yeah, I am still not convinced about having another layer! Have you
>>>>> thought about using a common lib for the common parts?
>>>>
>>>> Maybe I'm explaining myself badly. I don't have any clue about the
>>>> register map for future versions. I honestly tried to implement a common
>>>> lib for the whole process that interacts with the DMA engine controller,
>>>> to ease any future addition of register mappings, by having these common
>>>> callbacks that are only responsible for interfacing with the HW according
>>>> to the register map version. Maybe I can simplify something in the future,
>>>> but I'll only be able to conclude that after having some idea about the
>>>> new register map.
>>>>
>>>> IMHO this is the easier and cleaner way to do it, in terms of code
>>>> maintenance and architecture, but if you have another idea that you can
>>>> show me, or can point out a driver that implements something similar, I
>>>> have no problem checking it out.
>>>
>>> That is what my fear was :)
>>>
>>> Let's step back and solve one problem at a time. Right now that is v0 of
>>> the IP. Please write a simple driver which solves v0 without any layers
>>> involved.
>>>
>>> Once v1 is in good shape, you will know what is required, and then we can
>>> split the v0 driver into a common lib and a v0 driver, and then add the v1
>>> driver.
>>
>> Can I keep the code segregation as it is now? With dw-edma-v0-core.c/h,
>> dw-edma-v0-debugfs.c/h and dw-edma-v0-regs.h.
>>
>> That way I would only replace the callback calls with direct function calls
>> and remove the switch-case callback selection based on the version.
> 
> That sounds better; please keep patches smallish (anything going beyond 600
> lines becomes harder to review) and logically split.
> 
>>>>>>> what is the second part of the check, can you explain that, who sets
>>>>>>> chan->dir?
>>>>>>
>>>>>> The chan->dir is set on probe() during the process of configuring each channel.
>>>>>
>>>>> So you have channels that are unidirectional?
>>>>
>>>> Yes. That's another reason, IMHO, to keep dw-edma-test separate from
>>>> dmatest, since this IP is more picky and has these particularities.
>>>
>>> That is okay; that should be handled by the prep calls. If you get a prep
>>> call for a direction you don't support, return an error and move to the
>>> next available one.
>>>
>>> That can be done generically in dmatest driver and to answer your
>>> question, yes I would test that :)
>>
>> Like you said, let's do this in small steps. For now I would like to suggest
>> leaving out the dw-edma-test driver and just focusing on the current driver,
>> if you don't mind. I never thought that this test driver would raise this
>> kind of discussion.
> 
> Well, dmatest is just a testing aid and doesn't care about the internal
> design of your driver. I would say keep it, as it is a useful aid and will
> help other folks as well :)
> 

Let's see, I honestly hope so...

Gustavo

end of thread, other threads:[~2019-02-06 18:11 UTC | newest]

Thread overview: 39+ messages
2019-01-11 18:33 [RFC v3 0/6] dmaengine: Add Synopsys eDMA IP driver (version 0) Gustavo Pimentel
2019-01-11 18:33 ` [RFC v3 1/7] dmaengine: Add Synopsys eDMA IP core driver Gustavo Pimentel
2019-01-16 10:21   ` Jose Abreu
2019-01-16 11:53     ` Gustavo Pimentel
2019-01-19 16:21       ` Andy Shevchenko
2019-01-21  9:14         ` Gustavo Pimentel
2019-01-20 11:47       ` Vinod Koul
2019-01-21 15:49         ` Gustavo Pimentel
2019-01-20 11:44   ` Vinod Koul
2019-01-21 15:48     ` Gustavo Pimentel
2019-01-23 13:08       ` Vinod Koul
2019-01-31 11:33         ` Gustavo Pimentel
2019-02-01  4:14           ` Vinod Koul
2019-02-01 11:23             ` Gustavo Pimentel
2019-02-02 10:07               ` Vinod Koul
2019-02-06 18:06                 ` Gustavo Pimentel
2019-01-11 18:33 ` [RFC v3 2/7] dmaengine: Add Synopsys eDMA IP version 0 support Gustavo Pimentel
2019-01-16 10:33   ` Jose Abreu
2019-01-16 14:02     ` Gustavo Pimentel
2019-01-11 18:33 ` [RFC v3 3/7] dmaengine: Add Synopsys eDMA IP version 0 debugfs support Gustavo Pimentel
2019-01-11 18:33 ` [RFC v3 4/7] PCI: Add Synopsys endpoint EDDA Device id Gustavo Pimentel
2019-01-14 14:41   ` Bjorn Helgaas
2019-01-11 18:33 ` [RFC v3 5/7] dmaengine: Add Synopsys eDMA IP PCIe glue-logic Gustavo Pimentel
2019-01-11 19:47   ` Andy Shevchenko
2019-01-14 11:38     ` Gustavo Pimentel
2019-01-15  5:43       ` Andy Shevchenko
2019-01-15 12:48         ` Gustavo Pimentel
2019-01-19 15:45           ` Andy Shevchenko
2019-01-21  9:21             ` Gustavo Pimentel
2019-01-11 18:33 ` [RFC v3 6/7] MAINTAINERS: Add Synopsys eDMA IP driver maintainer Gustavo Pimentel
2019-01-11 18:33 ` [RFC v3 7/7] dmaengine: Add Synopsys eDMA IP test and sample driver Gustavo Pimentel
2019-01-11 19:48   ` Andy Shevchenko
2019-01-14 11:44     ` Gustavo Pimentel
2019-01-15  5:45       ` Andy Shevchenko
2019-01-15 13:02         ` Gustavo Pimentel
2019-01-17  5:03           ` Vinod Koul
2019-01-21 15:59             ` Gustavo Pimentel
2019-01-16 10:45   ` Jose Abreu
2019-01-16 11:56     ` Gustavo Pimentel
