dmaengine Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v3 0/3] Add AMD PassThru DMA Engine driver
@ 2020-01-21  9:04 Sanjay R Mehta
  2020-01-21  9:04 ` [PATCH v3 1/3] dmaengine: ptdma: Initial driver for the AMD PassThru DMA engine Sanjay R Mehta
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Sanjay R Mehta @ 2020-01-21  9:04 UTC (permalink / raw)
  To: vkoul, gregkh, dan.j.williams, Thomas.Lendacky, Shyam-sundar.S-k,
	Nehal-bakulchandra.Shah
  Cc: robh, mchehab+samsung, davem, Jonathan.Cameron, linux-kernel,
	dmaengine, Sanjay R Mehta

From: Sanjay R Mehta <sanju.mehta@amd.com>

This patch series adds support for AMD PassThru DMA Engine which
performs high bandwidth memory-to-memory and IO copy operation and
performs DMA transfer through queue based descriptor management.

AMD Processor has multiple ptdma device instances and each engine has
single queue. The driver also adds support for for multiple PTDMA
instances, each device will get an unique identifier and uniquely
named resources.

v3:
	- Fixed the sparse warnings.

v2:
	- Added controller description in cover letter
	- Removed "default m" from Kconfig
	- Replaced low_address() and high_address() functions with kernel
	  API's lower_32_bits & upper_32_bits().
	- Removed the BH handler function pt_core_irq_bh() and instead
	  handling transaction in irq handler itself.
	- Moved presetting of command queue registers into new function
	  "init_cmdq_regs()"
	- Removed the kernel thread dependency to submit transaction.
	- Increased the hardware command queue size to 32 and adding it
	  as a module parameter.
	- Removed backlog command queue handling mechanism.
	- Removed software command queue handling and instead submitting
	  transaction command directly to
	  hardware command queue.
	- Added tasklet structure variable in "struct pt_device".
	  This is used to invoke pt_do_cmd_complete() upon receiving interrupt
	  for command completion.
	- pt_core_perform_passthru() function parameters are modified and it is
	  now used to submit command directly to hardware from dmaengine framework.
	- Removed below structures, enums, macros and functions, as these values are
	  constants. Making command submission simple,
	   - Removed "union pt_function"  and several macros like PT_VERSION,
	     PT_BYTESWAP, PT_CMD_* etc..
	   - enum pt_passthru_bitwise, enum pt_passthru_byteswap, enum pt_memtype
	     struct pt_dma_info, struct pt_data, struct pt_mem, struct pt_passthru_op,
	     struct pt_op,
	   - Removed functions -> pt_cmd_queue_thread() pt_run_passthru_cmd(),
	     pt_run_cmd(), pt_dev_init(), pt_dequeue_cmd(), pt_do_cmd_backlog(),
	     pt_enqueue_cmd(),
	- Below functions, stuctures and variables moved from ptdma-ops.c
	   - Moved function pt_alloc_struct() to ptdma-pci.c as its used only there.
	   - Moved "struct pt_tasklet_data" structure to ptdma.h
	   - Moved functions pt_do_cmd_complete(), pt_present(), pt_get_device(),
	     pt_add_device(), pt_del_device(), pt_log_error() and its dependent
	     static variables pt_unit_lock, pt_units, pt_rr_lock, pt_rr, pt_error_codes,
	     pt_ordinal  to ptdma-dev.c as they are used only in that file.

Links of the review comments for v2:
1. https://lkml.org/lkml/2019/12/27/630
2. https://lkml.org/lkml/2020/1/3/23
3. https://lkml.org/lkml/2020/1/3/314
4. https://lkml.org/lkml/2020/1/10/100


Links of the review comments for v1:

1. https://lkml.org/lkml/2019/9/24/490
2. https://lkml.org/lkml/2019/9/24/399
3. https://lkml.org/lkml/2019/9/24/862
4. https://lkml.org/lkml/2019/9/24/122

Sanjay R Mehta (3):
  dmaengine: ptdma: Initial driver for the AMD PassThru DMA engine
  dmaengine: ptdma: Register pass-through engine as a DMA resource
  dmaengine: ptdma: Add debugfs entries for PTDMA information

 MAINTAINERS                         |   6 +
 drivers/dma/Kconfig                 |   2 +
 drivers/dma/Makefile                |   1 +
 drivers/dma/ptdma/Kconfig           |   7 +
 drivers/dma/ptdma/Makefile          |  12 +
 drivers/dma/ptdma/ptdma-debugfs.c   | 237 ++++++++++++
 drivers/dma/ptdma/ptdma-dev.c       | 448 +++++++++++++++++++++++
 drivers/dma/ptdma/ptdma-dmaengine.c | 704 ++++++++++++++++++++++++++++++++++++
 drivers/dma/ptdma/ptdma-pci.c       | 269 ++++++++++++++
 drivers/dma/ptdma/ptdma.h           | 378 +++++++++++++++++++
 10 files changed, 2064 insertions(+)
 create mode 100644 drivers/dma/ptdma/Kconfig
 create mode 100644 drivers/dma/ptdma/Makefile
 create mode 100644 drivers/dma/ptdma/ptdma-debugfs.c
 create mode 100644 drivers/dma/ptdma/ptdma-dev.c
 create mode 100644 drivers/dma/ptdma/ptdma-dmaengine.c
 create mode 100644 drivers/dma/ptdma/ptdma-pci.c
 create mode 100644 drivers/dma/ptdma/ptdma.h

-- 
2.7.4


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v3 1/3] dmaengine: ptdma: Initial driver for the AMD PassThru DMA engine
  2020-01-21  9:04 [PATCH v3 0/3] Add AMD PassThru DMA Engine driver Sanjay R Mehta
@ 2020-01-21  9:04 ` Sanjay R Mehta
  2020-01-24  6:00   ` Vinod Koul
  2020-01-21  9:04 ` [PATCH v3 2/3] dmaengine: ptdma: Register pass-through engine as a DMA resource Sanjay R Mehta
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Sanjay R Mehta @ 2020-01-21  9:04 UTC (permalink / raw)
  To: vkoul, gregkh, dan.j.williams, Thomas.Lendacky, Shyam-sundar.S-k,
	Nehal-bakulchandra.Shah
  Cc: robh, mchehab+samsung, davem, Jonathan.Cameron, linux-kernel,
	dmaengine, Sanjay R Mehta, Gary R Hook

From: Sanjay R Mehta <sanju.mehta@amd.com>

This device performs high-bandwidth memory-to-memory
transfer operations.

Device commands are managed via a circular queue of
'descriptors', each of which specifies source and
destination addresses for copying a single buffer of data.

The driver must handle multiple devices, which are logged
on a linked list; all devices are treated equivalently.

Signed-off-by: Sanjay R Mehta <sanju.mehta@amd.com>
Signed-off-by: Gary R Hook <gary.hook@amd.com>
---
 MAINTAINERS                   |   6 +
 drivers/dma/Kconfig           |   2 +
 drivers/dma/Makefile          |   1 +
 drivers/dma/ptdma/Kconfig     |   6 +
 drivers/dma/ptdma/Makefile    |  10 ++
 drivers/dma/ptdma/ptdma-dev.c | 387 ++++++++++++++++++++++++++++++++++++++++++
 drivers/dma/ptdma/ptdma-pci.c | 269 +++++++++++++++++++++++++++++
 drivers/dma/ptdma/ptdma.h     | 321 +++++++++++++++++++++++++++++++++++
 8 files changed, 1002 insertions(+)
 create mode 100644 drivers/dma/ptdma/Kconfig
 create mode 100644 drivers/dma/ptdma/Makefile
 create mode 100644 drivers/dma/ptdma/ptdma-dev.c
 create mode 100644 drivers/dma/ptdma/ptdma-pci.c
 create mode 100644 drivers/dma/ptdma/ptdma.h

diff --git a/MAINTAINERS b/MAINTAINERS
index bd5847e..42ef93c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -882,6 +882,12 @@ S:	Supported
 F:	drivers/net/ethernet/amd/xgbe/
 F:	arch/arm64/boot/dts/amd/amd-seattle-xgbe*.dtsi
 
+AMD PTDMA DRIVER
+M:	Sanjay R Mehta <sanju.mehta@amd.com>
+L:	dmaengine@vger.kernel.org
+S:	Maintained
+F:	drivers/dma/ptdma/
+
 ANALOG DEVICES INC AD5686 DRIVER
 M:	Stefan Popa <stefan.popa@analog.com>
 L:	linux-pm@vger.kernel.org
diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index 6fa1eba..d9856cc 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -697,6 +697,8 @@ source "drivers/dma/ti/Kconfig"
 
 source "drivers/dma/fsl-dpaa2-qdma/Kconfig"
 
+source "drivers/dma/ptdma/Kconfig"
+
 # clients
 comment "DMA Clients"
 	depends on DMA_ENGINE
diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
index 42d7e2f..a806824 100644
--- a/drivers/dma/Makefile
+++ b/drivers/dma/Makefile
@@ -79,6 +79,7 @@ obj-$(CONFIG_XGENE_DMA) += xgene-dma.o
 obj-$(CONFIG_ZX_DMA) += zx_dma.o
 obj-$(CONFIG_ST_FDMA) += st_fdma.o
 obj-$(CONFIG_FSL_DPAA2_QDMA) += fsl-dpaa2-qdma/
+obj-$(CONFIG_AMD_PTDMA) += ptdma/
 
 obj-y += mediatek/
 obj-y += qcom/
diff --git a/drivers/dma/ptdma/Kconfig b/drivers/dma/ptdma/Kconfig
new file mode 100644
index 0000000..4ec259e
--- /dev/null
+++ b/drivers/dma/ptdma/Kconfig
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0-only
+config AMD_PTDMA
+	tristate  "AMD PassThru DMA Engine"
+	depends on X86_64 && PCI
+	help
+	  Provides the support for AMD PassThru DMA Engine.
diff --git a/drivers/dma/ptdma/Makefile b/drivers/dma/ptdma/Makefile
new file mode 100644
index 0000000..320fa82
--- /dev/null
+++ b/drivers/dma/ptdma/Makefile
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# AMD Passthru DMA driver
+#
+
+obj-$(CONFIG_AMD_PTDMA) += ptdma.o
+
+ptdma-objs := ptdma-dev.o
+
+ptdma-$(CONFIG_PCI) += ptdma-pci.o
diff --git a/drivers/dma/ptdma/ptdma-dev.c b/drivers/dma/ptdma/ptdma-dev.c
new file mode 100644
index 0000000..c10adce
--- /dev/null
+++ b/drivers/dma/ptdma/ptdma-dev.c
@@ -0,0 +1,387 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * AMD Passthru DMA device driver
+ * -- Based on the CCP driver
+ *
+ * Copyright (C) 2016,2019 Advanced Micro Devices, Inc.
+ *
+ * Author: Sanjay R Mehta <sanju.mehta@amd.com>
+ * Author: Gary R Hook <gary.hook@amd.com>
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/dma-mapping.h>
+#include <linux/interrupt.h>
+
+#include "ptdma.h"
+
+static int cmd_queue_length = 32;
+module_param(cmd_queue_length, uint, 0644);
+MODULE_PARM_DESC(cmd_queue_length,
+		 " length of the command queue, a power of 2 (2 <= val <= 128)");
+
+/*
+ * List of PTDMAs, PTDMA count, read-write access lock, and access functions
+ *
+ * Lock structure: get pt_unit_lock for reading whenever we need to
+ * examine the PTDMA list. While holding it for reading we can acquire
+ * the RR lock to update the round-robin next-PTDMA pointer. The unit lock
+ * must be acquired before the RR lock.
+ *
+ * If the unit-lock is acquired for writing, we have total control over
+ * the list, so there's no value in getting the RR lock.
+ */
+static DEFINE_RWLOCK(pt_unit_lock);
+static LIST_HEAD(pt_units);
+
+static struct pt_device *pt_rr;
+
+/* Human-readable error strings */
+static char *pt_error_codes[] = {
+	"",
+	"ERR 01: ILLEGAL_ENGINE",
+	"ERR 03: ILLEGAL_FUNCTION_TYPE",
+	"ERR 04: ILLEGAL_FUNCTION_MODE",
+	"ERR 06: ILLEGAL_FUNCTION_SIZE",
+	"ERR 08: ILLEGAL_FUNCTION_RSVD",
+	"ERR 09: ILLEGAL_BUFFER_LENGTH",
+	"ERR 10: VLSB_FAULT",
+	"ERR 11: ILLEGAL_MEM_ADDR",
+	"ERR 12: ILLEGAL_MEM_SEL",
+	"ERR 13: ILLEGAL_CONTEXT_ID",
+	"ERR 15: 0xF Reserved",
+	"ERR 18: CMD_TIMEOUT",
+	"ERR 19: IDMA0_AXI_SLVERR",
+	"ERR 20: IDMA0_AXI_DECERR",
+	"ERR 21: 0x15 Reserved",
+	"ERR 22: IDMA1_AXI_SLAVE_FAULT",
+	"ERR 23: IDMA1_AIXI_DECERR",
+	"ERR 24: 0x18 Reserved",
+	"ERR 27: 0x1B Reserved",
+	"ERR 38: ODMA0_AXI_SLVERR",
+	"ERR 39: ODMA0_AXI_DECERR",
+	"ERR 40: 0x28 Reserved",
+	"ERR 41: ODMA1_AXI_SLVERR",
+	"ERR 42: ODMA1_AXI_DECERR",
+	"ERR 43: LSB_PARITY_ERR",
+};
+
+static void pt_log_error(struct pt_device *d, int e)
+{
+	dev_err(d->dev, "PTDMA error: %s (0x%x)\n", pt_error_codes[e], e);
+}
+
+/*
+ * pt_add_device - add a PTDMA device to the list
+ *
+ * @pt: pt_device struct pointer
+ *
+ * Put this PTDMA on the unit list, which makes it available
+ * for use.
+ *
+ * Returns zero if a PTDMA device is present, -ENODEV otherwise.
+ */
+static void pt_add_device(struct pt_device *pt)
+{
+	unsigned long flags;
+
+	write_lock_irqsave(&pt_unit_lock, flags);
+	list_add_tail(&pt->entry, &pt_units);
+	if (!pt_rr)
+		/*
+		 * We already have the list lock (we're first) so this
+		 * pointer can't change on us. Set its initial value.
+		 */
+		pt_rr = pt;
+	write_unlock_irqrestore(&pt_unit_lock, flags);
+}
+
+/*
+ * pt_del_device - remove a PTDMA device from the list
+ *
+ * @pt: pt_device struct pointer
+ *
+ * Remove this unit from the list of devices. If the next device
+ * up for use is this one, adjust the pointer. If this is the last
+ * device, NULL the pointer.
+ */
+static void pt_del_device(struct pt_device *pt)
+{
+	unsigned long flags;
+
+	write_lock_irqsave(&pt_unit_lock, flags);
+	if (pt_rr == pt) {
+		/*
+		 * pt_unit_lock is read/write; any read access
+		 * will be suspended while we make changes to the
+		 * list and RR pointer.
+		 */
+		if (list_is_last(&pt_rr->entry, &pt_units))
+			pt_rr = list_first_entry(&pt_units, struct pt_device,
+						 entry);
+		else
+			pt_rr = list_next_entry(pt_rr, entry);
+	}
+	list_del(&pt->entry);
+	if (list_empty(&pt_units))
+		pt_rr = NULL;
+	write_unlock_irqrestore(&pt_unit_lock, flags);
+}
+
+static int pt_core_execute_cmd(struct ptdma_desc *desc,
+			       struct pt_cmd_queue *cmd_q)
+{
+	__le32 *mp;
+	u32 *dp;
+	u32 tail;
+	int	i;
+	int ret = 0;
+
+	if (desc->dw0.soc) {
+		desc->dw0.ioc = 1;
+		desc->dw0.soc = 0;
+	}
+	mutex_lock(&cmd_q->q_mutex);
+
+	mp = (__le32 *)&cmd_q->qbase[cmd_q->qidx];
+	dp = (u32 *)desc;
+	for (i = 0; i < 8; i++)
+		mp[i] = cpu_to_le32(dp[i]); /* handle endianness */
+
+	cmd_q->qidx = (cmd_q->qidx + 1) % cmd_queue_length;
+
+	/* The data used by this command must be flushed to memory */
+	wmb();
+
+	/* Write the new tail address back to the queue register */
+	tail = lower_32_bits(cmd_q->qdma_tail + cmd_q->qidx * Q_DESC_SIZE);
+	iowrite32(tail, cmd_q->reg_tail_lo);
+
+	/* Turn the queue back on using our cached control register */
+	iowrite32(cmd_q->qcontrol | CMD_Q_RUN, cmd_q->reg_control);
+	mutex_unlock(&cmd_q->q_mutex);
+
+	return ret;
+}
+
+int pt_core_perform_passthru(struct pt_cmd_queue *cmd_q,
+			     struct pt_passthru_engine *pt_engine)
+{
+	struct ptdma_desc desc;
+
+	cmd_q->cmd_error = 0;
+
+	memset(&desc, 0, Q_DESC_SIZE);
+
+	desc.dw0.val = CMD_DESC_DW0_VAL;
+
+	desc.length = pt_engine->src_len;
+
+	desc.src_lo = lower_32_bits(pt_engine->src_dma);
+	desc.dw3.src_hi = upper_32_bits(pt_engine->src_dma);
+
+	desc.dst_lo = lower_32_bits(pt_engine->dst_dma);
+	desc.dw5.dst_hi = upper_32_bits(pt_engine->dst_dma);
+
+	return pt_core_execute_cmd(&desc, cmd_q);
+}
+
+static void pt_core_disable_queue_interrupts(struct pt_device *pt)
+{
+	iowrite32(0x0, pt->cmd_q.reg_int_enable);
+}
+
+static void pt_core_enable_queue_interrupts(struct pt_device *pt)
+{
+	iowrite32(SUPPORTED_INTERRUPTS, pt->cmd_q.reg_int_enable);
+}
+
+static irqreturn_t pt_core_irq_handler(int irq, void *data)
+{
+	struct pt_device *pt = (struct pt_device *)data;
+	struct pt_cmd_queue *cmd_q = &pt->cmd_q;
+	u32 status;
+
+	pt_core_disable_queue_interrupts(pt);
+
+	status = ioread32(cmd_q->reg_interrupt_status);
+	if (status) {
+		cmd_q->int_status = status;
+		cmd_q->q_status = ioread32(cmd_q->reg_status);
+		cmd_q->q_int_status = ioread32(cmd_q->reg_int_status);
+
+		/* On error, only save the first error value */
+		if ((status & INT_ERROR) && !cmd_q->cmd_error)
+			cmd_q->cmd_error = CMD_Q_ERROR(cmd_q->q_status);
+
+		/* Acknowledge the interrupt */
+		iowrite32(status, cmd_q->reg_interrupt_status);
+	}
+
+	pt_core_enable_queue_interrupts(pt);
+
+	return IRQ_HANDLED;
+}
+
+static void pt_init_cmdq_regs(struct pt_cmd_queue *cmd_q)
+{
+	void __iomem *io_regs = cmd_q->reg_control;
+
+	cmd_q->reg_tail_lo = io_regs + CMD_Q_TAIL_LO_BASE;
+	cmd_q->reg_head_lo = io_regs + CMD_Q_HEAD_LO_BASE;
+	cmd_q->reg_status = io_regs + CMD_Q_STATUS_BASE;
+	cmd_q->reg_int_enable = io_regs + CMD_Q_INT_ENABLE_BASE;
+	cmd_q->reg_int_status = io_regs + CMD_Q_INT_STATUS_BASE;
+	cmd_q->reg_dma_status = io_regs + CMD_Q_DMA_STATUS_BASE;
+	cmd_q->reg_dma_read_status = io_regs + CMD_Q_DMA_READ_STATUS_BASE;
+	cmd_q->reg_dma_write_status = io_regs + CMD_Q_DMA_WRITE_STATUS_BASE;
+	cmd_q->reg_interrupt_status = io_regs + CMD_Q_INTERRUPT_STATUS_BASE;
+}
+
+int pt_core_init(struct pt_device *pt)
+{
+	struct device *dev = pt->dev;
+	struct pt_cmd_queue *cmd_q = &pt->cmd_q;
+	struct dma_pool *dma_pool;
+	char dma_pool_name[MAX_DMAPOOL_NAME_LEN];
+	int ret;
+	u32 dma_addr_lo, dma_addr_hi;
+
+	/* Allocate a dma pool for the queue */
+	snprintf(dma_pool_name, sizeof(dma_pool_name), "%s_q", pt->name);
+
+	dma_pool = dma_pool_create(dma_pool_name, dev,
+				   PT_DMAPOOL_MAX_SIZE,
+				   PT_DMAPOOL_ALIGN, 0);
+	if (!dma_pool) {
+		dev_err(dev, "unable to allocate dma pool\n");
+		ret = -ENOMEM;
+		return ret;
+	}
+
+	/* ptdma core initialisation */
+	iowrite32(CMD_CONFIG_VHB_EN, pt->io_regs + CMD_CONFIG_OFFSET);
+	iowrite32(CMD_QUEUE_PRIO, pt->io_regs + CMD_QUEUE_PRIO_OFFSET);
+	iowrite32(CMD_TIMEOUT_DISABLE, pt->io_regs + CMD_TIMEOUT_OFFSET);
+	iowrite32(CMD_CLK_GATE_CONFIG, pt->io_regs + CMD_CLK_GATE_CTL_OFFSET);
+	iowrite32(CMD_CONFIG_REQID, pt->io_regs + CMD_REQID_CONFIG_OFFSET);
+
+	cmd_q->pt = pt;
+	cmd_q->dma_pool = dma_pool;
+	mutex_init(&cmd_q->q_mutex);
+
+	/* Set the length module parameter */
+	if ((cmd_queue_length & (cmd_queue_length - 1)) ||
+	    cmd_queue_length < 2 || cmd_queue_length > 128) {
+		dev_err(dev, "invalid command queue length\n");
+		ret = -EINVAL;
+	}
+	dev_notice(dev, "Setting queue size to %d commands\n",
+		   cmd_queue_length);
+
+	/* Page alignment satisfies our needs for N <= 128 */
+	cmd_q->qsize = Q_SIZE(Q_DESC_SIZE);
+	cmd_q->qbase = dma_alloc_coherent(dev, cmd_q->qsize,
+					  &cmd_q->qbase_dma,
+					   GFP_KERNEL);
+	if (!cmd_q->qbase) {
+		dev_err(dev, "unable to allocate command queue\n");
+		ret = -ENOMEM;
+		goto e_dma_alloc;
+	}
+
+	cmd_q->qidx = 0;
+
+	/* Preset some register values */
+	cmd_q->reg_control = pt->io_regs + CMD_Q_STATUS_INCR;
+	pt_init_cmdq_regs(cmd_q);
+
+	dev_dbg(dev, "queue available\n");
+
+	/* Turn off the queues and disable interrupts until ready */
+	pt_core_disable_queue_interrupts(pt);
+
+	cmd_q->qcontrol = 0; /* Start with nothing */
+	iowrite32(cmd_q->qcontrol, cmd_q->reg_control);
+
+	ioread32(cmd_q->reg_int_status);
+	ioread32(cmd_q->reg_status);
+
+	/* Clear the interrupt status */
+	iowrite32(SUPPORTED_INTERRUPTS, cmd_q->reg_interrupt_status);
+
+	/* Request an irq */
+	ret = request_irq(pt->pt_irq, pt_core_irq_handler, 0, pt->name, pt);
+	if (ret) {
+		dev_err(dev, "unable to allocate an IRQ\n");
+		goto e_pool;
+	}
+
+	/* Update the device registers with queue information.  */
+
+	cmd_q->qcontrol &= ~(CMD_Q_SIZE << CMD_Q_SHIFT);
+	cmd_q->qcontrol |= QUEUE_SIZE_VAL << CMD_Q_SHIFT;
+
+	cmd_q->qdma_tail = cmd_q->qbase_dma;
+	dma_addr_lo = lower_32_bits(cmd_q->qdma_tail);
+	iowrite32((u32)dma_addr_lo, cmd_q->reg_tail_lo);
+	iowrite32((u32)dma_addr_lo, cmd_q->reg_head_lo);
+
+	dma_addr_hi = upper_32_bits(cmd_q->qdma_tail);
+	cmd_q->qcontrol |= (dma_addr_hi << 16);
+	iowrite32(cmd_q->qcontrol, cmd_q->reg_control);
+
+	dev_dbg(dev, "Enabling interrupts...\n");
+	pt_core_enable_queue_interrupts(pt);
+
+	/* Put this on the unit list to make it available */
+	pt_add_device(pt);
+
+	dev_dbg(dev, "PTDMA device %s registration successful...\n", pt->name);
+
+	return 0;
+
+e_dma_alloc:
+	dma_free_coherent(dev, cmd_q->qsize, cmd_q->qbase, cmd_q->qbase_dma);
+
+e_pool:
+	dma_pool_destroy(pt->cmd_q.dma_pool);
+
+	return ret;
+}
+
+void pt_core_destroy(struct pt_device *pt)
+{
+	struct device *dev = pt->dev;
+	struct pt_cmd_queue *cmd_q = &pt->cmd_q;
+	struct pt_cmd *cmd;
+
+	/* Remove this device from the list of available units first */
+	pt_del_device(pt);
+
+	/* Disable and clear interrupts */
+	pt_core_disable_queue_interrupts(pt);
+
+	/* Turn off the run bit */
+	iowrite32(cmd_q->qcontrol & ~CMD_Q_RUN, cmd_q->reg_control);
+
+	/* Clear the interrupt status */
+	iowrite32(SUPPORTED_INTERRUPTS, cmd_q->reg_interrupt_status);
+	ioread32(cmd_q->reg_int_status);
+	ioread32(cmd_q->reg_status);
+
+	free_irq(pt->pt_irq, pt);
+
+	dma_free_coherent(dev, cmd_q->qsize, cmd_q->qbase,
+			  cmd_q->qbase_dma);
+
+	/* Flush the cmd queue */
+	while (!list_empty(&pt->cmd)) {
+		/* Invoke the callback directly with an error code */
+		cmd = list_first_entry(&pt->cmd, struct pt_cmd, entry);
+		list_del(&cmd->entry);
+		cmd->pt_cmd_callback(cmd->data, -ENODEV);
+	}
+}
diff --git a/drivers/dma/ptdma/ptdma-pci.c b/drivers/dma/ptdma/ptdma-pci.c
new file mode 100644
index 0000000..8c8883d
--- /dev/null
+++ b/drivers/dma/ptdma/ptdma-pci.c
@@ -0,0 +1,269 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * AMD Passthru DMA device driver
+ * -- Based on the CCP driver
+ *
+ * Copyright (C) 2016,2019 Advanced Micro Devices, Inc.
+ *
+ * Author: Sanjay R Mehta <sanju.mehta@amd.com>
+ * Author: Tom Lendacky <thomas.lendacky@amd.com>
+ * Author: Gary R Hook <gary.hook@amd.com>
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/pci_ids.h>
+#include <linux/dma-mapping.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+#include <linux/delay.h>
+
+#include "ptdma.h"
+
+/* Ever-increasing value to produce unique unit numbers */
+static atomic_t pt_ordinal;
+
+struct pt_msix {
+	int msix_count;
+	struct msix_entry msix_entry;
+};
+
+/*
+ * pt_alloc_struct - allocate and initialize the pt_device struct
+ *
+ * @dev: device struct of the PTDMA
+ */
+static struct pt_device *pt_alloc_struct(struct device *dev)
+{
+	struct pt_device *pt;
+
+	pt = devm_kzalloc(dev, sizeof(*pt), GFP_KERNEL);
+	if (!pt)
+		return NULL;
+	pt->dev = dev;
+	pt->ord = atomic_inc_return(&pt_ordinal);
+
+	INIT_LIST_HEAD(&pt->cmd);
+
+	snprintf(pt->name, MAX_PT_NAME_LEN, "pt-%u", pt->ord);
+
+	return pt;
+}
+
+static int pt_get_msix_irqs(struct pt_device *pt)
+{
+	struct pt_msix *pt_msix = pt->pt_msix;
+	struct device *dev = pt->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+	int ret;
+
+	pt_msix->msix_entry.entry = 0;
+
+	ret = pci_enable_msix_range(pdev, &pt_msix->msix_entry, 1, 1);
+	if (ret < 0)
+		return ret;
+
+	pt_msix->msix_count = ret;
+
+	pt->pt_irq = pt_msix->msix_entry.vector;
+
+	return 0;
+}
+
+static int pt_get_msi_irq(struct pt_device *pt)
+{
+	struct device *dev = pt->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+	int ret;
+
+	ret = pci_enable_msi(pdev);
+	if (ret)
+		return ret;
+
+	pt->pt_irq = pdev->irq;
+
+	return 0;
+}
+
+static int pt_get_irqs(struct pt_device *pt)
+{
+	struct device *dev = pt->dev;
+	int ret;
+
+	ret = pt_get_msix_irqs(pt);
+	if (!ret)
+		return 0;
+
+	/* Couldn't get MSI-X vectors, try MSI */
+	dev_notice(dev, "could not enable MSI-X (%d), trying MSI\n", ret);
+	ret = pt_get_msi_irq(pt);
+	if (!ret)
+		return 0;
+
+	/* Couldn't get MSI interrupt */
+	dev_notice(dev, "could not enable MSI (%d)\n", ret);
+
+	return ret;
+}
+
+static void pt_free_irqs(struct pt_device *pt)
+{
+	struct pt_msix *pt_msix = pt->pt_msix;
+	struct device *dev = pt->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+
+	if (pt_msix->msix_count)
+		pci_disable_msix(pdev);
+	else if (pt->pt_irq)
+		pci_disable_msi(pdev);
+
+	pt->pt_irq = 0;
+}
+
+static int pt_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+	struct pt_device *pt;
+	struct pt_msix *pt_msix;
+	struct device *dev = &pdev->dev;
+	void __iomem * const *iomap_table;
+	int bar_mask;
+	int ret = -ENOMEM;
+
+	pt = pt_alloc_struct(dev);
+	if (!pt)
+		goto e_err;
+
+	pt_msix = devm_kzalloc(dev, sizeof(*pt_msix), GFP_KERNEL);
+	if (!pt_msix)
+		goto e_err;
+
+	pt->pt_msix = pt_msix;
+	pt->dev_vdata = (struct pt_dev_vdata *)id->driver_data;
+	if (!pt->dev_vdata) {
+		ret = -ENODEV;
+		dev_err(dev, "missing driver data\n");
+		goto e_err;
+	}
+
+	ret = pcim_enable_device(pdev);
+	if (ret) {
+		dev_err(dev, "pcim_enable_device failed (%d)\n", ret);
+		goto e_err;
+	}
+
+	bar_mask = pci_select_bars(pdev, IORESOURCE_MEM);
+	ret = pcim_iomap_regions(pdev, bar_mask, "ptdma");
+	if (ret) {
+		dev_err(dev, "pcim_iomap_regions failed (%d)\n", ret);
+		goto e_err;
+	}
+
+	iomap_table = pcim_iomap_table(pdev);
+	if (!iomap_table) {
+		dev_err(dev, "pcim_iomap_table failed\n");
+		ret = -ENOMEM;
+		goto e_err;
+	}
+
+	pt->io_regs = iomap_table[pt->dev_vdata->bar];
+	if (!pt->io_regs) {
+		dev_err(dev, "ioremap failed\n");
+		ret = -ENOMEM;
+		goto e_err;
+	}
+
+	ret = pt_get_irqs(pt);
+	if (ret)
+		goto e_err;
+
+	pci_set_master(pdev);
+
+	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(48));
+	if (ret) {
+		ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
+		if (ret) {
+			dev_err(dev, "dma_set_mask_and_coherent failed (%d)\n",
+				ret);
+			goto e_err;
+		}
+	}
+
+	dev_set_drvdata(dev, pt);
+
+	if (pt->dev_vdata)
+		ret = pt_core_init(pt);
+
+	if (ret) {
+		dev_notice(dev, "PTDMA initialization failed\n");
+		goto e_err;
+	}
+
+	dev_notice(dev, "PTDMA enabled\n");
+
+	return 0;
+
+e_err:
+	dev_notice(dev, "initialization failed\n");
+	return ret;
+}
+
+static void pt_pci_remove(struct pci_dev *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct pt_device *pt = dev_get_drvdata(dev);
+
+	if (!pt)
+		return;
+
+	if (pt->dev_vdata)
+		pt_core_destroy(pt);
+
+	pt_free_irqs(pt);
+}
+
+#ifdef CONFIG_PM
+static int pt_pci_suspend(struct pci_dev *pdev, pm_message_t state)
+{
+	return -ENOSYS;
+}
+
+static int pt_pci_resume(struct pci_dev *pdev)
+{
+	return -ENOSYS;
+}
+#endif
+
+static const struct pt_dev_vdata dev_vdata[] = {
+	{
+		.bar = 2,
+		.version = PT_VERSION(5, 0),
+	},
+};
+
+static const struct pci_device_id pt_pci_table[] = {
+	{ PCI_VDEVICE(AMD, 0x1498), (kernel_ulong_t)&dev_vdata[0] },
+	/* Last entry must be zero */
+	{ 0, }
+};
+MODULE_DEVICE_TABLE(pci, pt_pci_table);
+
+static struct pci_driver pt_pci_driver = {
+	.name = "ptdma",
+	.id_table = pt_pci_table,
+	.probe = pt_pci_probe,
+	.remove = pt_pci_remove,
+#ifdef CONFIG_PM
+	.suspend = pt_pci_suspend,
+	.resume = pt_pci_resume,
+#endif
+};
+
+module_pci_driver(pt_pci_driver);
+
+MODULE_AUTHOR("Sanjay R Mehta <sanju.mehta@amd.com>");
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("AMD PassThru DMA driver");
diff --git a/drivers/dma/ptdma/ptdma.h b/drivers/dma/ptdma/ptdma.h
new file mode 100644
index 0000000..7cd1b0d
--- /dev/null
+++ b/drivers/dma/ptdma/ptdma.h
@@ -0,0 +1,321 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * AMD Passthru DMA device driver
+ * -- Based on the CCP driver
+ *
+ * Copyright (C) 2016,2019 Advanced Micro Devices, Inc.
+ *
+ * Author: Sanjay R Mehta <sanju.mehta@amd.com>
+ * Author: Tom Lendacky <thomas.lendacky@amd.com>
+ * Author: Gary R Hook <gary.hook@amd.com>
+ */
+
+#ifndef __PT_DEV_H__
+#define __PT_DEV_H__
+
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+#include <linux/list.h>
+#include <linux/wait.h>
+#include <linux/dmapool.h>
+
+#define MAX_PT_NAME_LEN			16
+#define MAX_DMAPOOL_NAME_LEN	32
+
+#define MAX_HW_QUEUES		1
+#define MAX_CMD_QLEN		100
+
+#define PT_ENGINE_PASSTHRU	5
+#define PT_OFFSET			0x0
+
+#define	PT_VSIZE			16
+#define	PT_VMASK			((unsigned int)((1 << PT_VSIZE) - 1))
+#define	PT_VERSION(v, r)	((unsigned int)(((v) << PT_VSIZE) \
+					       | ((r) & PT_VMASK)))
+
+/* Register Mappings */
+#define IRQ_MASK_REG		0x040
+#define IRQ_STATUS_REG		0x200
+
+#define CMD_Q_ERROR(__qs)	((__qs) & 0x0000003f)
+
+#define	CMD_QUEUE_PRIO_OFFSET		0x00
+#define	CMD_REQID_CONFIG_OFFSET		0x04
+#define	CMD_TIMEOUT_OFFSET		0x08
+
+#define CMD_Q_CONTROL_BASE		0x0000
+#define CMD_Q_TAIL_LO_BASE		0x0004
+#define CMD_Q_HEAD_LO_BASE		0x0008
+#define CMD_Q_INT_ENABLE_BASE		0x000C
+#define CMD_Q_INTERRUPT_STATUS_BASE	0x0010
+
+#define CMD_Q_STATUS_BASE		0x0100
+#define CMD_Q_INT_STATUS_BASE		0x0104
+#define CMD_Q_DMA_STATUS_BASE		0x0108
+#define CMD_Q_DMA_READ_STATUS_BASE	0x010C
+#define CMD_Q_DMA_WRITE_STATUS_BASE	0x0110
+#define CMD_Q_ABORT_BASE		0x0114
+#define CMD_Q_AX_CACHE_BASE		0x0118
+
+#define	CMD_CONFIG_OFFSET		0x1120
+#define	CMD_CLK_GATE_CTL_OFFSET		0x6004
+
+/* Address offset for virtual queue registers */
+#define CMD_Q_STATUS_INCR		0x1000
+
+/* Bit masks */
+#define	CMD_DESC_DW0_VAL		0x500012
+#define	CMD_CONFIG_REQID		0x0
+#define	CMD_CONFIG_VHB_EN		0x00000001
+#define	CMD_QUEUE_PRIO			0x00000006
+#define	CMD_TIMEOUT_DISABLE		0x00000000
+#define	CMD_CLK_DYN_GATING_EN	0x1
+#define	CMD_CLK_DYN_GATING_DIS	0x0
+#define	CMD_CLK_HW_GATE_MODE	0x1
+#define	CMD_CLK_SW_GATE_MODE	0x0
+#define	CMD_CLK_GATE_ON_DELAY	0x1000
+#define	CMD_CLK_GATE_CTL	0x0
+#define	CMD_CLK_GATE_OFF_DELAY	0x1000
+
+#define CMD_CLK_GATE_CONFIG	(CMD_CLK_GATE_CTL | \
+				CMD_CLK_HW_GATE_MODE | \
+				CMD_CLK_GATE_ON_DELAY | \
+				CMD_CLK_DYN_GATING_EN | \
+				CMD_CLK_GATE_OFF_DELAY)
+
+#define CMD_Q_RUN		0x1
+#define CMD_Q_HALT		0x2
+#define CMD_Q_MEM_LOCATION	0x4
+#define CMD_Q_SIZE		0x1F
+#define CMD_Q_SHIFT		3
+#define QUEUE_SIZE_VAL		((ffs(cmd_queue_length) - 2) & \
+							  CMD_Q_SIZE)
+#define Q_PTR_MASK		(2 << (QUEUE_SIZE_VAL + 5) - 1)
+#define Q_DESC_SIZE		sizeof(struct ptdma_desc)
+#define Q_SIZE(n)		(cmd_queue_length * (n))
+
+#define INT_COMPLETION		0x1
+#define INT_ERROR		0x2
+#define INT_QUEUE_STOPPED	0x4
+#define	INT_EMPTY_QUEUE		0x8
+#define SUPPORTED_INTERRUPTS	(INT_COMPLETION | INT_ERROR)
+
+/****** Local Storage Block ******/
+#define LSB_START		0
+#define LSB_END			127
+#define LSB_COUNT		(LSB_END - LSB_START + 1)
+
+#define PT_DMAPOOL_MAX_SIZE	64
+#define PT_DMAPOOL_ALIGN	BIT(5)
+
+#define PT_PASSTHRU_BLOCKSIZE	512
+
+struct pt_device;
+
+struct pt_tasklet_data {
+	struct completion completion;
+	struct pt_cmd *cmd;
+};
+
+/*
+ * struct pt_passthru_engine - pass-through operation
+ *   without performing DMA mapping
+ * @mask: mask to be applied to data
+ * @mask_len: length in bytes of mask
+ * @src: data to be used for this operation
+ * @dst: data produced by this operation
+ * @src_len: length in bytes of data used for this operation
+ * @final: indicate final pass-through operation
+ *
+ * Variables required to be set when calling pt_enqueue_cmd():
+ *   - bit_mod, byte_swap, src, dst, src_len
+ *   - mask, mask_len if bit_mod is not PT_PASSTHRU_BITWISE_NOOP
+ */
+struct pt_passthru_engine {
+	dma_addr_t mask;
+	u32 mask_len;		/* In bytes */
+
+	dma_addr_t src_dma, dst_dma;
+	u64 src_len;		/* In bytes */
+
+	u32 final;
+};
+
+/*
+ * struct pt_cmd - PTDMA operation request
+ * @entry: list element
+ * @work: work element used for callbacks
+ * @pt: PT device to be run on
+ * @ret: operation return code
+ * @flags: cmd processing flags
+ * @engine: PTDMA operation to perform (passthru)
+ * @engine_error: PT engine return code
+ * @passthru: engine specific structures, refer to specific engine struct below
+ * @callback: operation completion callback function
+ * @data: parameter value to be supplied to the callback function
+ *
+ * Variables required to be set when calling pt_enqueue_cmd():
+ *   - engine, callback
+ *   - See the operation structures below for what is required for each
+ *     operation.
+ */
+struct pt_cmd {
+	struct list_head entry;
+	struct work_struct work;
+	struct pt_device *pt;
+	int ret;
+
+	u32 engine;
+	u32 engine_error;
+
+	struct pt_passthru_engine passthru;
+
+	/* Completion callback support */
+	void (*pt_cmd_callback)(void *data, int err);
+	void *data;
+};
+
+struct pt_cmd_queue {
+	struct pt_device *pt;
+
+	/* Queue dma pool */
+	struct dma_pool *dma_pool;
+
+	/* Queue base address (not neccessarily aligned)*/
+	struct ptdma_desc *qbase;
+
+	/* Aligned queue start address (per requirement) */
+	struct mutex q_mutex ____cacheline_aligned;
+	unsigned int qidx;
+
+	unsigned int qsize;
+	dma_addr_t qbase_dma;
+	dma_addr_t qdma_tail;
+
+	unsigned int active;
+	unsigned int suspended;
+
+	/* Register addresses for queue */
+	void __iomem *reg_control;
+	void __iomem *reg_tail_lo;
+	void __iomem *reg_head_lo;
+	void __iomem *reg_int_enable;
+	void __iomem *reg_interrupt_status;
+	void __iomem *reg_status;
+	void __iomem *reg_int_status;
+	void __iomem *reg_dma_status;
+	void __iomem *reg_dma_read_status;
+	void __iomem *reg_dma_write_status;
+	u32 qcontrol; /* Cached control register */
+
+	/* Status values from job */
+	u32 int_status;
+	u32 q_status;
+	u32 q_int_status;
+	u32 cmd_error;
+
+} ____cacheline_aligned;
+
+struct pt_device {
+	struct list_head entry;
+
+	unsigned int ord;
+	char name[MAX_PT_NAME_LEN];
+
+	struct device *dev;
+
+	/* Bus specific device information */
+	struct pt_msix *pt_msix;
+
+	struct pt_dev_vdata *dev_vdata;
+
+	unsigned int pt_irq;
+
+	/* I/O area used for device communication */
+	void __iomem *io_regs;
+
+	spinlock_t cmd_lock ____cacheline_aligned;
+	unsigned int cmd_count;
+	struct list_head cmd;
+
+	/*
+	 * The command queue. This represent the queue available on the
+	 * PTDMA that are available for processing cmds
+	 */
+	struct pt_cmd_queue cmd_q;
+
+	wait_queue_head_t lsb_queue;
+
+	struct tasklet_struct tasklet;
+	struct pt_tasklet_data tdata;
+};
+
+/*
+ * descriptor for PTDMA commands
+ * 8 32-bit words:
+ * word 0: function; engine; control bits
+ * word 1: length of source data
+ * word 2: low 32 bits of source pointer
+ * word 3: upper 16 bits of source pointer; source memory type
+ * word 4: low 32 bits of destination pointer
+ * word 5: upper 16 bits of destination pointer; destination memory type
+ * word 6: reserved 32 bits
+ * word 7: reserved 32 bits
+ */
+
+union dword0 {
+	struct {
+		unsigned int soc:1;
+		unsigned int ioc:1;
+		unsigned int rsvd1:1;
+		unsigned int init:1;
+		unsigned int eom:1;
+		unsigned int function:15;
+		unsigned int engine:4;
+		unsigned int prot:1;
+		unsigned int rsvd2:7;
+	};
+	u32 val;
+};
+
+struct dword3 {
+	unsigned int  src_hi:16;
+	unsigned int  src_mem:2;
+	unsigned int  lsb_cxt_id:8;
+	unsigned int  rsvd1:5;
+	unsigned int  fixed:1;
+};
+
+struct dword5 {
+	unsigned int  dst_hi:16;
+	unsigned int  dst_mem:2;
+	unsigned int  rsvd1:13;
+	unsigned int  fixed:1;
+};
+
+struct ptdma_desc {
+	union dword0 dw0;
+	u32 length;
+	u32 src_lo;
+	struct dword3 dw3;
+	u32 dst_lo;
+	struct dword5 dw5;
+	__le32 rsvd1;
+	__le32 rsvd2;
+};
+
+/* Structure to hold PT device data */
+struct pt_dev_vdata {
+	const unsigned int bar;
+	const unsigned int version;
+};
+
+int pt_core_init(struct pt_device *pt);
+void pt_core_destroy(struct pt_device *pt);
+
+int pt_core_perform_passthru(struct pt_cmd_queue *cmd_q,
+			     struct pt_passthru_engine *pt_engine);
+
+#endif
-- 
2.7.4


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v3 2/3] dmaengine: ptdma: Register pass-through engine as a DMA resource
  2020-01-21  9:04 [PATCH v3 0/3] Add AMD PassThru DMA Engine driver Sanjay R Mehta
  2020-01-21  9:04 ` [PATCH v3 1/3] dmaengine: ptdma: Initial driver for the AMD PassThru DMA engine Sanjay R Mehta
@ 2020-01-21  9:04 ` Sanjay R Mehta
  2020-01-24  6:05   ` Vinod Koul
  2020-01-21  9:04 ` [PATCH v3 3/3] dmaengine: ptdma: Add debugfs entries for PTDMA information Sanjay R Mehta
  2020-03-26 16:05 ` [PATCH v3 0/3] Add AMD PassThru DMA Engine driver Vitaly Mayatskih
  3 siblings, 1 reply; 7+ messages in thread
From: Sanjay R Mehta @ 2020-01-21  9:04 UTC (permalink / raw)
  To: vkoul, gregkh, dan.j.williams, Thomas.Lendacky, Shyam-sundar.S-k,
	Nehal-bakulchandra.Shah
  Cc: robh, mchehab+samsung, davem, Jonathan.Cameron, linux-kernel,
	dmaengine, Sanjay R Mehta

From: Sanjay R Mehta <sanju.mehta@amd.com>

This Registers the ptdma queue to Linux dmaengine
framework as general purpose DMA channels.

Signed-off-by: Sanjay R Mehta <sanju.mehta@amd.com>
---
 drivers/dma/ptdma/Kconfig           |   1 +
 drivers/dma/ptdma/Makefile          |   3 +-
 drivers/dma/ptdma/ptdma-dev.c       |  35 ++
 drivers/dma/ptdma/ptdma-dmaengine.c | 704 ++++++++++++++++++++++++++++++++++++
 drivers/dma/ptdma/ptdma.h           |  45 +++
 5 files changed, 787 insertions(+), 1 deletion(-)
 create mode 100644 drivers/dma/ptdma/ptdma-dmaengine.c

diff --git a/drivers/dma/ptdma/Kconfig b/drivers/dma/ptdma/Kconfig
index 4ec259e..f4848bf 100644
--- a/drivers/dma/ptdma/Kconfig
+++ b/drivers/dma/ptdma/Kconfig
@@ -2,5 +2,6 @@
 config AMD_PTDMA
 	tristate  "AMD PassThru DMA Engine"
 	depends on X86_64 && PCI
+	select DMA_ENGINE
 	help
 	  Provides the support for AMD PassThru DMA Engine.
diff --git a/drivers/dma/ptdma/Makefile b/drivers/dma/ptdma/Makefile
index 320fa82..6fcb4ad 100644
--- a/drivers/dma/ptdma/Makefile
+++ b/drivers/dma/ptdma/Makefile
@@ -5,6 +5,7 @@
 
 obj-$(CONFIG_AMD_PTDMA) += ptdma.o
 
-ptdma-objs := ptdma-dev.o
+ptdma-objs := ptdma-dev.o \
+	      ptdma-dmaengine.o
 
 ptdma-$(CONFIG_PCI) += ptdma-pci.o
diff --git a/drivers/dma/ptdma/ptdma-dev.c b/drivers/dma/ptdma/ptdma-dev.c
index c10adce..0162ecd 100644
--- a/drivers/dma/ptdma/ptdma-dev.c
+++ b/drivers/dma/ptdma/ptdma-dev.c
@@ -222,6 +222,8 @@ static irqreturn_t pt_core_irq_handler(int irq, void *data)
 
 	pt_core_enable_queue_interrupts(pt);
 
+	tasklet_schedule(&pt->tasklet);
+
 	return IRQ_HANDLED;
 }
 
@@ -240,6 +242,26 @@ static void pt_init_cmdq_regs(struct pt_cmd_queue *cmd_q)
 	cmd_q->reg_interrupt_status = io_regs + CMD_Q_INTERRUPT_STATUS_BASE;
 }
 
+static void pt_do_cmd_complete(unsigned long data)
+{
+	struct pt_tasklet_data *tdata = (struct pt_tasklet_data *)data;
+	struct pt_cmd *cmd = tdata->cmd;
+	struct pt_cmd_queue *cmd_q = &cmd->pt->cmd_q;
+	u32 tail;
+
+	tail = lower_32_bits(cmd_q->qdma_tail + cmd_q->qidx * Q_DESC_SIZE);
+	if (cmd_q->cmd_error) {
+	       /*
+		* Log the error and flush the queue by
+		* moving the head pointer
+		*/
+		pt_log_error(cmd_q->pt, cmd_q->cmd_error);
+		iowrite32(tail, cmd_q->reg_head_lo);
+	}
+
+	cmd->pt_cmd_callback(cmd->data, cmd->ret);
+}
+
 int pt_core_init(struct pt_device *pt)
 {
 	struct device *dev = pt->dev;
@@ -341,8 +363,18 @@ int pt_core_init(struct pt_device *pt)
 
 	dev_dbg(dev, "PTDMA device %s registration successful...\n", pt->name);
 
+	/* Register the DMA engine support */
+	ret = pt_dmaengine_register(pt);
+	if (ret)
+		goto e_dmaengine;
+
+	tasklet_init(&pt->tasklet, pt_do_cmd_complete, (ulong)&pt->tdata);
+
 	return 0;
 
+e_dmaengine:
+	free_irq(pt->pt_irq, pt);
+
 e_dma_alloc:
 	dma_free_coherent(dev, cmd_q->qsize, cmd_q->qbase, cmd_q->qbase_dma);
 
@@ -358,6 +390,9 @@ void pt_core_destroy(struct pt_device *pt)
 	struct pt_cmd_queue *cmd_q = &pt->cmd_q;
 	struct pt_cmd *cmd;
 
+	/* Unregister the DMA engine */
+	pt_dmaengine_unregister(pt);
+
 	/* Remove this device from the list of available units first */
 	pt_del_device(pt);
 
diff --git a/drivers/dma/ptdma/ptdma-dmaengine.c b/drivers/dma/ptdma/ptdma-dmaengine.c
new file mode 100644
index 0000000..b91a83e
--- /dev/null
+++ b/drivers/dma/ptdma/ptdma-dmaengine.c
@@ -0,0 +1,704 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * AMD Passthrough DMA device driver
+ * -- Based on the CCP driver
+ *
+ * Copyright (C) 2016,2019 Advanced Micro Devices, Inc.
+ *
+ * Author: Sanjay R Mehta <sanju.mehta@amd.com>
+ * Author: Gary R Hook <gary.hook@amd.com>
+ */
+
+#include "ptdma.h"
+#include "../dmaengine.h"
+
+#define PT_DMA_WIDTH(_mask)		\
+({					\
+	u64 mask = (_mask) + 1;		\
+	(mask == 0) ? 64 : fls64(mask);	\
+})
+
+static void pt_free_cmd_resources(struct pt_device *pt,
+				  struct list_head *list)
+{
+	struct pt_dma_cmd *cmd, *ctmp;
+
+	list_for_each_entry_safe(cmd, ctmp, list, entry) {
+		list_del(&cmd->entry);
+		kmem_cache_free(pt->dma_cmd_cache, cmd);
+	}
+}
+
+static void pt_free_desc_resources(struct pt_device *pt,
+				   struct list_head *list)
+{
+	struct pt_dma_desc *desc, *dtmp;
+
+	list_for_each_entry_safe(desc, dtmp, list, entry) {
+		pt_free_cmd_resources(pt, &desc->active);
+		pt_free_cmd_resources(pt, &desc->pending);
+
+		list_del(&desc->entry);
+		kmem_cache_free(pt->dma_desc_cache, desc);
+	}
+}
+
+static void pt_free_chan_resources(struct dma_chan *dma_chan)
+{
+	struct pt_dma_chan *chan = container_of(dma_chan, struct pt_dma_chan,
+						 dma_chan);
+	unsigned long flags;
+
+	dev_dbg(chan->pt->dev, "%s - chan=%p\n", __func__, chan);
+
+	spin_lock_irqsave(&chan->lock, flags);
+
+	pt_free_desc_resources(chan->pt, &chan->complete);
+	pt_free_desc_resources(chan->pt, &chan->active);
+	pt_free_desc_resources(chan->pt, &chan->pending);
+	pt_free_desc_resources(chan->pt, &chan->created);
+
+	spin_unlock_irqrestore(&chan->lock, flags);
+}
+
+static void pt_cleanup_desc_resources(struct pt_device *pt,
+				      struct list_head *list)
+{
+	struct pt_dma_desc *desc, *dtmp;
+
+	list_for_each_entry_safe_reverse(desc, dtmp, list, entry) {
+		if (!async_tx_test_ack(&desc->tx_desc))
+			continue;
+
+		dev_dbg(pt->dev, "%s - desc=%p\n", __func__, desc);
+
+		pt_free_cmd_resources(pt, &desc->active);
+		pt_free_cmd_resources(pt, &desc->pending);
+
+		list_del(&desc->entry);
+		kmem_cache_free(pt->dma_desc_cache, desc);
+	}
+}
+
+static void pt_do_cleanup(unsigned long data)
+{
+	struct pt_dma_chan *chan = (struct pt_dma_chan *)data;
+	unsigned long flags;
+
+	dev_dbg(chan->pt->dev, "%s - chan=%s\n", __func__,
+		dma_chan_name(&chan->dma_chan));
+
+	spin_lock_irqsave(&chan->lock, flags);
+
+	pt_cleanup_desc_resources(chan->pt, &chan->complete);
+
+	spin_unlock_irqrestore(&chan->lock, flags);
+}
+
+static int pt_issue_next_cmd(struct pt_dma_desc *desc)
+{
+	struct pt_passthru_engine *pt_engine;
+	struct pt_dma_cmd *cmd;
+	struct pt_device *pt;
+	struct pt_cmd *pt_cmd;
+	struct pt_cmd_queue *cmd_q;
+
+	cmd = list_first_entry(&desc->pending, struct pt_dma_cmd, entry);
+	list_move(&cmd->entry, &desc->active);
+
+	dev_dbg(desc->pt->dev, "%s - tx %d, cmd=%p\n", __func__,
+		desc->tx_desc.cookie, cmd);
+
+	pt_cmd = &cmd->pt_cmd;
+	pt = pt_cmd->pt;
+	cmd_q = &pt->cmd_q;
+	pt_engine = &pt_cmd->passthru;
+
+	if (!pt_engine->final)
+		return -EINVAL;
+
+	if (!pt_engine->src_dma || !pt_engine->dst_dma)
+		return -EINVAL;
+
+	pt->tdata.cmd = pt_cmd;
+
+	/* Execute the command */
+	pt_cmd->ret = pt_core_perform_passthru(cmd_q, pt_engine);
+
+	return 0;
+}
+
+static void pt_free_active_cmd(struct pt_dma_desc *desc)
+{
+	struct pt_dma_cmd *cmd;
+
+	cmd = list_first_entry_or_null(&desc->active, struct pt_dma_cmd,
+				       entry);
+	if (!cmd)
+		return;
+
+	dev_dbg(desc->pt->dev, "%s - freeing tx %d cmd=%p\n",
+		__func__, desc->tx_desc.cookie, cmd);
+
+	list_del(&cmd->entry);
+	kmem_cache_free(desc->pt->dma_cmd_cache, cmd);
+}
+
+static struct pt_dma_desc *__pt_next_dma_desc(struct pt_dma_chan *chan,
+					      struct pt_dma_desc *desc)
+{
+	/* Move current DMA descriptor to the complete list */
+	if (desc)
+		list_move(&desc->entry, &chan->complete);
+
+	/* Get the next DMA descriptor on the active list */
+	desc = list_first_entry_or_null(&chan->active, struct pt_dma_desc,
+					entry);
+
+	return desc;
+}
+
+static struct pt_dma_desc *pt_handle_active_desc(struct pt_dma_chan *chan,
+						 struct pt_dma_desc *desc)
+{
+	struct dma_async_tx_descriptor *tx_desc;
+	unsigned long flags;
+
+	/* Loop over descriptors until one is found with commands */
+	do {
+		if (desc) {
+			/* Remove the DMA command from the list and free it */
+			pt_free_active_cmd(desc);
+
+			if (!list_empty(&desc->pending)) {
+				/* No errors, keep going */
+				if (desc->status != DMA_ERROR)
+					return desc;
+
+				/* Error, free remaining commands and move on */
+				pt_free_cmd_resources(desc->pt,
+						      &desc->pending);
+			}
+
+			tx_desc = &desc->tx_desc;
+		} else {
+			tx_desc = NULL;
+		}
+
+		spin_lock_irqsave(&chan->lock, flags);
+
+		if (desc) {
+			if (desc->status != DMA_ERROR)
+				desc->status = DMA_COMPLETE;
+
+			dev_dbg(desc->pt->dev,
+				"%s - tx %d complete, status=%u\n", __func__,
+				desc->tx_desc.cookie, desc->status);
+
+			dma_cookie_complete(tx_desc);
+			dma_descriptor_unmap(tx_desc);
+		}
+
+		desc = __pt_next_dma_desc(chan, desc);
+
+		spin_unlock_irqrestore(&chan->lock, flags);
+
+		if (tx_desc) {
+			dmaengine_desc_get_callback_invoke(tx_desc, NULL);
+
+			dma_run_dependencies(tx_desc);
+		}
+	} while (desc);
+
+	return NULL;
+}
+
+static struct pt_dma_desc *__pt_pending_to_active(struct pt_dma_chan *chan)
+{
+	struct pt_dma_desc *desc;
+
+	if (list_empty(&chan->pending))
+		return NULL;
+
+	desc = list_empty(&chan->active)
+		? list_first_entry(&chan->pending, struct pt_dma_desc, entry)
+		: NULL;
+
+	list_splice_tail_init(&chan->pending, &chan->active);
+
+	return desc;
+}
+
+static void pt_cmd_callback(void *data, int err)
+{
+	struct pt_dma_desc *desc = data;
+	struct pt_dma_chan *chan;
+	int ret;
+
+	if (err == -EINPROGRESS)
+		return;
+
+	chan = container_of(desc->tx_desc.chan, struct pt_dma_chan,
+			    dma_chan);
+
+	dev_dbg(chan->pt->dev, "%s - tx %d callback, err=%d\n",
+		__func__, desc->tx_desc.cookie, err);
+
+	if (err)
+		desc->status = DMA_ERROR;
+
+	while (true) {
+		/* Check for DMA descriptor completion */
+		desc = pt_handle_active_desc(chan, desc);
+
+		/* Don't submit cmd if no descriptor or DMA is paused */
+		if (!desc || chan->status == DMA_PAUSED)
+			break;
+
+		ret = pt_issue_next_cmd(desc);
+		if (!ret)
+			break;
+
+		desc->status = DMA_ERROR;
+	}
+
+	tasklet_schedule(&chan->cleanup_tasklet);
+}
+
+static dma_cookie_t pt_tx_submit(struct dma_async_tx_descriptor *tx_desc)
+{
+	struct pt_dma_desc *desc = container_of(tx_desc, struct pt_dma_desc,
+						 tx_desc);
+	struct pt_dma_chan *chan;
+	dma_cookie_t cookie;
+	unsigned long flags;
+
+	chan = container_of(tx_desc->chan, struct pt_dma_chan, dma_chan);
+
+	spin_lock_irqsave(&chan->lock, flags);
+
+	cookie = dma_cookie_assign(tx_desc);
+	list_del(&desc->entry);
+	list_add_tail(&desc->entry, &chan->pending);
+
+	spin_unlock_irqrestore(&chan->lock, flags);
+
+	dev_dbg(chan->pt->dev, "%s - added tx descriptor %d to pending list\n",
+		__func__, cookie);
+
+	return cookie;
+}
+
+static struct pt_dma_cmd *pt_alloc_dma_cmd(struct pt_dma_chan *chan)
+{
+	struct pt_dma_cmd *cmd;
+
+	cmd = kmem_cache_zalloc(chan->pt->dma_cmd_cache, GFP_NOWAIT);
+
+	return cmd;
+}
+
+static struct pt_dma_desc *pt_alloc_dma_desc(struct pt_dma_chan *chan,
+					     unsigned long flags)
+{
+	struct pt_dma_desc *desc;
+
+	desc = kmem_cache_zalloc(chan->pt->dma_desc_cache, GFP_NOWAIT);
+	if (!desc)
+		return NULL;
+
+	dma_async_tx_descriptor_init(&desc->tx_desc, &chan->dma_chan);
+	desc->tx_desc.flags = flags;
+	desc->tx_desc.tx_submit = pt_tx_submit;
+	desc->pt = chan->pt;
+	INIT_LIST_HEAD(&desc->pending);
+	INIT_LIST_HEAD(&desc->active);
+	desc->status = DMA_IN_PROGRESS;
+
+	return desc;
+}
+
+static struct pt_dma_desc *pt_create_desc(struct dma_chan *dma_chan,
+					  struct scatterlist *dst_sg,
+					    unsigned int dst_nents,
+					    struct scatterlist *src_sg,
+					    unsigned int src_nents,
+					    unsigned long flags)
+{
+	struct pt_dma_chan *chan = container_of(dma_chan, struct pt_dma_chan,
+						 dma_chan);
+	struct pt_device *pt = chan->pt;
+	struct pt_dma_desc *desc;
+	struct pt_dma_cmd *cmd;
+	struct pt_cmd *pt_cmd;
+	struct pt_passthru_engine *pt_engine;
+	unsigned int src_offset, src_len;
+	unsigned int dst_offset, dst_len;
+	unsigned int len;
+	unsigned long sflags;
+	size_t total_len;
+
+	if (!dst_sg || !src_sg)
+		return NULL;
+
+	if (!dst_nents || !src_nents)
+		return NULL;
+
+	desc = pt_alloc_dma_desc(chan, flags);
+	if (!desc)
+		return NULL;
+
+	total_len = 0;
+
+	src_len = sg_dma_len(src_sg);
+	src_offset = 0;
+
+	dst_len = sg_dma_len(dst_sg);
+	dst_offset = 0;
+
+	while (true) {
+		if (!src_len) {
+			src_nents--;
+			if (!src_nents)
+				break;
+
+			src_sg = sg_next(src_sg);
+			if (!src_sg)
+				break;
+
+			src_len = sg_dma_len(src_sg);
+			src_offset = 0;
+			continue;
+		}
+
+		if (!dst_len) {
+			dst_nents--;
+			if (!dst_nents)
+				break;
+
+			dst_sg = sg_next(dst_sg);
+			if (!dst_sg)
+				break;
+
+			dst_len = sg_dma_len(dst_sg);
+			dst_offset = 0;
+			continue;
+		}
+
+		len = min(dst_len, src_len);
+
+		cmd = pt_alloc_dma_cmd(chan);
+		if (!cmd)
+			goto err;
+
+		pt_cmd = &cmd->pt_cmd;
+		pt_cmd->pt = chan->pt;
+		pt_engine = &pt_cmd->passthru;
+		pt_cmd->engine = PT_ENGINE_PASSTHRU;
+		pt_engine->src_dma = sg_dma_address(src_sg) + src_offset;
+		pt_engine->dst_dma = sg_dma_address(dst_sg) + dst_offset;
+		pt_engine->src_len = len;
+		pt_engine->final = 1;
+		pt_cmd->pt_cmd_callback = pt_cmd_callback;
+		pt_cmd->data = desc;
+
+		list_add_tail(&cmd->entry, &desc->pending);
+
+		dev_dbg(pt->dev,
+			"%s - cmd=%p, src=%pad, dst=%pad, len=%llu\n", __func__,
+			cmd, &pt_engine->src_dma,
+			&pt_engine->dst_dma, pt_engine->src_len);
+
+		total_len += len;
+
+		src_len -= len;
+		src_offset += len;
+
+		dst_len -= len;
+		dst_offset += len;
+	}
+
+	desc->len = total_len;
+
+	if (list_empty(&desc->pending))
+		goto err;
+
+	dev_dbg(pt->dev, "%s - desc=%p\n", __func__, desc);
+
+	spin_lock_irqsave(&chan->lock, sflags);
+
+	list_add_tail(&desc->entry, &chan->created);
+
+	spin_unlock_irqrestore(&chan->lock, sflags);
+
+	return desc;
+
+err:
+	pt_free_cmd_resources(pt, &desc->pending);
+	kmem_cache_free(pt->dma_desc_cache, desc);
+
+	return NULL;
+}
+
+static struct dma_async_tx_descriptor *
+pt_prep_dma_memcpy(struct dma_chan *dma_chan, dma_addr_t dst,
+		   dma_addr_t src, size_t len, unsigned long flags)
+{
+	struct pt_dma_chan *chan = container_of(dma_chan, struct pt_dma_chan,
+						 dma_chan);
+	struct pt_dma_desc *desc;
+	struct scatterlist dst_sg, src_sg;
+
+	dev_dbg(chan->pt->dev,
+		"%s - src=%pad, dst=%pad, len=%zu, flags=%#lx\n",
+		__func__, &src, &dst, len, flags);
+
+	sg_init_table(&dst_sg, 1);
+	sg_dma_address(&dst_sg) = dst;
+	sg_dma_len(&dst_sg) = len;
+
+	sg_init_table(&src_sg, 1);
+	sg_dma_address(&src_sg) = src;
+	sg_dma_len(&src_sg) = len;
+
+	desc = pt_create_desc(dma_chan, &dst_sg, 1, &src_sg, 1, flags);
+	if (!desc)
+		return NULL;
+
+	return &desc->tx_desc;
+}
+
+static struct dma_async_tx_descriptor *
+pt_prep_dma_interrupt(struct dma_chan *dma_chan, unsigned long flags)
+{
+	struct pt_dma_chan *chan = container_of(dma_chan, struct pt_dma_chan,
+						 dma_chan);
+	struct pt_dma_desc *desc;
+
+	desc = pt_alloc_dma_desc(chan, flags);
+	if (!desc)
+		return NULL;
+
+	return &desc->tx_desc;
+}
+
+static void pt_issue_pending(struct dma_chan *dma_chan)
+{
+	struct pt_dma_chan *chan = container_of(dma_chan, struct pt_dma_chan,
+						 dma_chan);
+	struct pt_dma_desc *desc;
+	unsigned long flags;
+
+	dev_dbg(chan->pt->dev, "%s\n", __func__);
+
+	spin_lock_irqsave(&chan->lock, flags);
+
+	desc = __pt_pending_to_active(chan);
+
+	spin_unlock_irqrestore(&chan->lock, flags);
+
+	/* If there was nothing active, start processing */
+	if (desc)
+		pt_cmd_callback(desc, 0);
+}
+
+static enum dma_status pt_tx_status(struct dma_chan *dma_chan,
+				    dma_cookie_t cookie,
+				    struct dma_tx_state *state)
+{
+	struct pt_dma_chan *chan = container_of(dma_chan, struct pt_dma_chan,
+						 dma_chan);
+	struct pt_dma_desc *desc;
+	enum dma_status ret;
+	unsigned long flags;
+
+	if (chan->status == DMA_PAUSED) {
+		ret = DMA_PAUSED;
+		goto out;
+	}
+
+	ret = dma_cookie_status(dma_chan, cookie, state);
+	if (ret == DMA_COMPLETE) {
+		spin_lock_irqsave(&chan->lock, flags);
+
+		/* Get status from complete chain, if still there */
+		list_for_each_entry(desc, &chan->complete, entry) {
+			if (desc->tx_desc.cookie != cookie)
+				continue;
+
+			ret = desc->status;
+			break;
+		}
+
+		spin_unlock_irqrestore(&chan->lock, flags);
+	}
+
+out:
+	dev_dbg(chan->pt->dev, "%s - %u\n", __func__, ret);
+
+	return ret;
+}
+
+static int pt_pause(struct dma_chan *dma_chan)
+{
+	struct pt_dma_chan *chan = container_of(dma_chan, struct pt_dma_chan,
+						 dma_chan);
+
+	chan->status = DMA_PAUSED;
+
+	/*TODO: Wait for active DMA to complete before returning? */
+
+	return 0;
+}
+
+static int pt_resume(struct dma_chan *dma_chan)
+{
+	struct pt_dma_chan *chan = container_of(dma_chan, struct pt_dma_chan,
+						 dma_chan);
+	struct pt_dma_desc *desc;
+	unsigned long flags;
+
+	spin_lock_irqsave(&chan->lock, flags);
+
+	desc = list_first_entry_or_null(&chan->active, struct pt_dma_desc,
+					entry);
+
+	spin_unlock_irqrestore(&chan->lock, flags);
+
+	/* Indicate the channel is running again */
+	chan->status = DMA_IN_PROGRESS;
+
+	/* If there was something active, re-start */
+	if (desc)
+		pt_cmd_callback(desc, 0);
+
+	return 0;
+}
+
+static int pt_terminate_all(struct dma_chan *dma_chan)
+{
+	struct pt_dma_chan *chan = container_of(dma_chan, struct pt_dma_chan,
+						 dma_chan);
+	unsigned long flags;
+
+	dev_dbg(chan->pt->dev, "%s\n", __func__);
+
+	spin_lock_irqsave(&chan->lock, flags);
+
+	pt_free_desc_resources(chan->pt, &chan->active);
+	pt_free_desc_resources(chan->pt, &chan->pending);
+	pt_free_desc_resources(chan->pt, &chan->created);
+
+	spin_unlock_irqrestore(&chan->lock, flags);
+
+	return 0;
+}
+
+int pt_dmaengine_register(struct pt_device *pt)
+{
+	struct pt_dma_chan *chan;
+	struct dma_device *dma_dev = &pt->dma_dev;
+	struct dma_chan *dma_chan;
+	char *dma_cmd_cache_name;
+	char *dma_desc_cache_name;
+	int ret;
+
+	pt->pt_dma_chan = devm_kcalloc(pt->dev, 1,
+				       sizeof(*pt->pt_dma_chan),
+				       GFP_KERNEL);
+	if (!pt->pt_dma_chan)
+		return -ENOMEM;
+
+	dma_cmd_cache_name = devm_kasprintf(pt->dev, GFP_KERNEL,
+					    "%s-dmaengine-cmd-cache",
+					    pt->name);
+	if (!dma_cmd_cache_name)
+		return -ENOMEM;
+
+	pt->dma_cmd_cache = kmem_cache_create(dma_cmd_cache_name,
+					      sizeof(struct pt_dma_cmd),
+					      sizeof(void *),
+					      SLAB_HWCACHE_ALIGN, NULL);
+	if (!pt->dma_cmd_cache)
+		return -ENOMEM;
+
+	dma_desc_cache_name = devm_kasprintf(pt->dev, GFP_KERNEL,
+					     "%s-dmaengine-desc-cache",
+					     pt->name);
+	if (!dma_desc_cache_name) {
+		ret = -ENOMEM;
+		goto err_cache;
+	}
+
+	pt->dma_desc_cache = kmem_cache_create(dma_desc_cache_name,
+					       sizeof(struct pt_dma_desc),
+					       sizeof(void *),
+					       SLAB_HWCACHE_ALIGN, NULL);
+	if (!pt->dma_desc_cache) {
+		ret = -ENOMEM;
+		goto err_cache;
+	}
+
+	dma_dev->dev = pt->dev;
+	dma_dev->src_addr_widths = PT_DMA_WIDTH(dma_get_mask(pt->dev));
+	dma_dev->dst_addr_widths = PT_DMA_WIDTH(dma_get_mask(pt->dev));
+	dma_dev->directions = DMA_MEM_TO_MEM;
+	dma_dev->residue_granularity = DMA_RESIDUE_GRANULARITY_DESCRIPTOR;
+	dma_cap_set(DMA_MEMCPY, dma_dev->cap_mask);
+	dma_cap_set(DMA_INTERRUPT, dma_dev->cap_mask);
+	dma_cap_set(DMA_PRIVATE, dma_dev->cap_mask);
+
+	INIT_LIST_HEAD(&dma_dev->channels);
+
+	chan = pt->pt_dma_chan;
+	dma_chan = &chan->dma_chan;
+
+	chan->pt = pt;
+
+	spin_lock_init(&chan->lock);
+	INIT_LIST_HEAD(&chan->created);
+	INIT_LIST_HEAD(&chan->pending);
+	INIT_LIST_HEAD(&chan->active);
+	INIT_LIST_HEAD(&chan->complete);
+
+	tasklet_init(&chan->cleanup_tasklet, pt_do_cleanup,
+		     (unsigned long)chan);
+
+	dma_chan->device = dma_dev;
+	dma_cookie_init(dma_chan);
+
+	list_add_tail(&dma_chan->device_node, &dma_dev->channels);
+
+	dma_dev->device_free_chan_resources = pt_free_chan_resources;
+	dma_dev->device_prep_dma_memcpy = pt_prep_dma_memcpy;
+	dma_dev->device_prep_dma_interrupt = pt_prep_dma_interrupt;
+	dma_dev->device_issue_pending = pt_issue_pending;
+	dma_dev->device_tx_status = pt_tx_status;
+	dma_dev->device_pause = pt_pause;
+	dma_dev->device_resume = pt_resume;
+	dma_dev->device_terminate_all = pt_terminate_all;
+
+	ret = dma_async_device_register(dma_dev);
+	if (ret)
+		goto err_reg;
+
+	return 0;
+
+err_reg:
+	kmem_cache_destroy(pt->dma_desc_cache);
+
+err_cache:
+	kmem_cache_destroy(pt->dma_cmd_cache);
+
+	return ret;
+}
+
+void pt_dmaengine_unregister(struct pt_device *pt)
+{
+	struct dma_device *dma_dev = &pt->dma_dev;
+
+	dma_async_device_unregister(dma_dev);
+
+	kmem_cache_destroy(pt->dma_desc_cache);
+	kmem_cache_destroy(pt->dma_cmd_cache);
+}
diff --git a/drivers/dma/ptdma/ptdma.h b/drivers/dma/ptdma/ptdma.h
index 7cd1b0d..943660b 100644
--- a/drivers/dma/ptdma/ptdma.h
+++ b/drivers/dma/ptdma/ptdma.h
@@ -20,6 +20,7 @@
 #include <linux/list.h>
 #include <linux/wait.h>
 #include <linux/dmapool.h>
+#include <linux/dmaengine.h>
 
 #define MAX_PT_NAME_LEN			16
 #define MAX_DMAPOOL_NAME_LEN	32
@@ -177,6 +178,41 @@ struct pt_cmd {
 	void *data;
 };
 
+struct pt_dma_cmd {
+	struct list_head entry;
+	struct pt_cmd pt_cmd;
+};
+
+struct pt_dma_desc {
+	struct list_head entry;
+
+	struct pt_device *pt;
+
+	struct list_head pending;
+	struct list_head active;
+
+	enum dma_status status;
+	struct dma_async_tx_descriptor tx_desc;
+	size_t len;
+};
+
+struct pt_dma_chan {
+	struct pt_device *pt;
+
+	/* channel lock */
+	spinlock_t lock;
+
+	struct list_head created;
+	struct list_head pending;
+	struct list_head active;
+	struct list_head complete;
+
+	struct tasklet_struct cleanup_tasklet;
+
+	enum dma_status status;
+	struct dma_chan dma_chan;
+};
+
 struct pt_cmd_queue {
 	struct pt_device *pt;
 
@@ -246,6 +282,12 @@ struct pt_device {
 	 */
 	struct pt_cmd_queue cmd_q;
 
+	/* Support for the DMA Engine capabilities */
+	struct dma_device dma_dev;
+	struct pt_dma_chan *pt_dma_chan;
+	struct kmem_cache *dma_cmd_cache;
+	struct kmem_cache *dma_desc_cache;
+
 	wait_queue_head_t lsb_queue;
 
 	struct tasklet_struct tasklet;
@@ -312,6 +354,9 @@ struct pt_dev_vdata {
 	const unsigned int version;
 };
 
+int pt_dmaengine_register(struct pt_device *pt);
+void pt_dmaengine_unregister(struct pt_device *pt);
+
 int pt_core_init(struct pt_device *pt);
 void pt_core_destroy(struct pt_device *pt);
 
-- 
2.7.4


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v3 3/3] dmaengine: ptdma: Add debugfs entries for PTDMA information
  2020-01-21  9:04 [PATCH v3 0/3] Add AMD PassThru DMA Engine driver Sanjay R Mehta
  2020-01-21  9:04 ` [PATCH v3 1/3] dmaengine: ptdma: Initial driver for the AMD PassThru DMA engine Sanjay R Mehta
  2020-01-21  9:04 ` [PATCH v3 2/3] dmaengine: ptdma: Register pass-through engine as a DMA resource Sanjay R Mehta
@ 2020-01-21  9:04 ` Sanjay R Mehta
  2020-03-26 16:05 ` [PATCH v3 0/3] Add AMD PassThru DMA Engine driver Vitaly Mayatskih
  3 siblings, 0 replies; 7+ messages in thread
From: Sanjay R Mehta @ 2020-01-21  9:04 UTC (permalink / raw)
  To: vkoul, gregkh, dan.j.williams, Thomas.Lendacky, Shyam-sundar.S-k,
	Nehal-bakulchandra.Shah
  Cc: robh, mchehab+samsung, davem, Jonathan.Cameron, linux-kernel,
	dmaengine, Sanjay R Mehta

From: Sanjay R Mehta <sanju.mehta@amd.com>

Expose data about the configuration and operation of the
PTDMA through debugfs entries: device name, capabilities,
configuration, statistics.

Signed-off-by: Sanjay R Mehta <sanju.mehta@amd.com>
---
 drivers/dma/ptdma/Makefile        |   3 +-
 drivers/dma/ptdma/ptdma-debugfs.c | 237 ++++++++++++++++++++++++++++++++++++++
 drivers/dma/ptdma/ptdma-dev.c     |  26 +++++
 drivers/dma/ptdma/ptdma.h         |  12 ++
 4 files changed, 277 insertions(+), 1 deletion(-)
 create mode 100644 drivers/dma/ptdma/ptdma-debugfs.c

diff --git a/drivers/dma/ptdma/Makefile b/drivers/dma/ptdma/Makefile
index 6fcb4ad..60e7c10 100644
--- a/drivers/dma/ptdma/Makefile
+++ b/drivers/dma/ptdma/Makefile
@@ -6,6 +6,7 @@
 obj-$(CONFIG_AMD_PTDMA) += ptdma.o
 
 ptdma-objs := ptdma-dev.o \
-	      ptdma-dmaengine.o
+	      ptdma-dmaengine.o \
+	      ptdma-debugfs.o
 
 ptdma-$(CONFIG_PCI) += ptdma-pci.o
diff --git a/drivers/dma/ptdma/ptdma-debugfs.c b/drivers/dma/ptdma/ptdma-debugfs.c
new file mode 100644
index 0000000..b4af83c
--- /dev/null
+++ b/drivers/dma/ptdma/ptdma-debugfs.c
@@ -0,0 +1,237 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * AMD Passthrough DMA device driver
+ * -- Based on the CCP driver
+ *
+ * Copyright (C) 2016,2019 Advanced Micro Devices, Inc.
+ *
+ * Author: Sanjay R Mehta <sanju.mehta@amd.com>
+ * Author: Gary R Hook <gary.hook@amd.com>
+ */
+
+#include <linux/debugfs.h>
+
+#include "ptdma.h"
+
+/* DebugFS helpers */
+#define	OBUFP		(obuf + oboff)
+#define	OBUFLEN		512
+#define	OBUFSPC		(OBUFLEN - oboff)
+
+#define	MAX_NAME_LEN	20
+#define	BUFLEN		63
+#define	RI_VERSION_NUM	0x0000003F
+
+#define	RI_NUM_VQM	0x00078000
+#define	RI_NVQM_SHIFT	15
+#define	RI_NVQM(r)	(((r) * RI_NUM_VQM) >> RI_NVQM_SHIFT)
+#define	RI_LSB_ENTRIES	0x0FF80000
+#define	RI_NLSB_SHIFT	19
+#define	RI_NLSB(r)	(((r) * RI_LSB_ENTRIES) >> RI_NLSB_SHIFT)
+
+static struct dentry *pt_debugfs_dir;
+static DEFINE_MUTEX(pt_debugfs_lock);
+
+static ssize_t ptdma_debugfs_info_read(struct file *filp, char __user *ubuf,
+				       size_t count, loff_t *offp)
+{
+	struct pt_device *pt = filp->private_data;
+	unsigned int oboff = 0;
+	unsigned int regval;
+	ssize_t ret;
+	char *obuf;
+
+	if (!pt)
+		return 0;
+
+	obuf = kmalloc(OBUFLEN, GFP_KERNEL);
+	if (!obuf)
+		return -ENOMEM;
+
+	oboff += snprintf(OBUFP, OBUFSPC, "Device name: %s\n", pt->name);
+	oboff += snprintf(OBUFP, OBUFSPC, "   # Queues: %d\n", 1);
+	oboff += snprintf(OBUFP, OBUFSPC, "     # Cmds: %d\n", pt->cmd_count);
+
+	regval = ioread32(pt->io_regs + CMD_PT_VERSION);
+
+	oboff += snprintf(OBUFP, OBUFSPC, "    Version: %d\n",
+		   regval & RI_VERSION_NUM);
+	oboff += snprintf(OBUFP, OBUFSPC, "    Engines:");
+	oboff += snprintf(OBUFP, OBUFSPC, "\n");
+	oboff += snprintf(OBUFP, OBUFSPC, "     Queues: %d\n",
+		   (regval & RI_NUM_VQM) >> RI_NVQM_SHIFT);
+
+	ret = simple_read_from_buffer(ubuf, count, offp, obuf, oboff);
+	kfree(obuf);
+
+	return ret;
+}
+
+/*
+ * Return a formatted buffer containing the current
+ * statistics of queue for PTDMA
+ */
+static ssize_t ptdma_debugfs_stats_read(struct file *filp, char __user *ubuf,
+					size_t count, loff_t *offp)
+{
+	struct pt_device *pt = filp->private_data;
+	unsigned long total_pt_ops = 0;
+	unsigned int oboff = 0;
+	ssize_t ret = 0;
+	char *obuf;
+	struct pt_cmd_queue *cmd_q = &pt->cmd_q;
+
+	total_pt_ops += cmd_q->total_pt_ops;
+
+	obuf = kmalloc(OBUFLEN, GFP_KERNEL);
+	if (!obuf)
+		return -ENOMEM;
+
+	oboff += snprintf(OBUFP, OBUFSPC, "Total Interrupts Handled: %ld\n",
+			    pt->total_interrupts);
+
+	ret = simple_read_from_buffer(ubuf, count, offp, obuf, oboff);
+	kfree(obuf);
+
+	return ret;
+}
+
+/*
+ * Reset the counters in a queue
+ */
+static void ptdma_debugfs_reset_queue_stats(struct pt_cmd_queue *cmd_q)
+{
+	cmd_q->total_pt_ops = 0L;
+}
+
+/*
+ * A value was written to the stats variable, which
+ * should be used to reset the queue counters across
+ * that device.
+ */
+static ssize_t ptdma_debugfs_stats_write(struct file *filp,
+					 const char __user *ubuf,
+					 size_t count, loff_t *offp)
+{
+	struct pt_device *pt = filp->private_data;
+
+	ptdma_debugfs_reset_queue_stats(&pt->cmd_q);
+	pt->total_interrupts = 0L;
+
+	return count;
+}
+
+/*
+ * Return a formatted buffer containing the current information
+ * for that queue
+ */
+static ssize_t ptdma_debugfs_queue_read(struct file *filp, char __user *ubuf,
+					size_t count, loff_t *offp)
+{
+	struct pt_cmd_queue *cmd_q = filp->private_data;
+	unsigned int oboff = 0;
+	unsigned int regval;
+	ssize_t ret;
+	char *obuf;
+
+	if (!cmd_q)
+		return 0;
+
+	obuf = kmalloc(OBUFLEN, GFP_KERNEL);
+	if (!obuf)
+		return -ENOMEM;
+
+	oboff += snprintf(OBUFP, OBUFSPC, "               Pass-Thru: %ld\n",
+			    cmd_q->total_pt_ops);
+
+	regval = ioread32(cmd_q->reg_int_enable);
+	oboff += snprintf(OBUFP, OBUFSPC, "      Enabled Interrupts:");
+	if (regval & INT_EMPTY_QUEUE)
+		oboff += snprintf(OBUFP, OBUFSPC, " EMPTY");
+	if (regval & INT_QUEUE_STOPPED)
+		oboff += snprintf(OBUFP, OBUFSPC, " STOPPED");
+	if (regval & INT_ERROR)
+		oboff += snprintf(OBUFP, OBUFSPC, " ERROR");
+	if (regval & INT_COMPLETION)
+		oboff += snprintf(OBUFP, OBUFSPC, " COMPLETION");
+	oboff += snprintf(OBUFP, OBUFSPC, "\n");
+
+	ret = simple_read_from_buffer(ubuf, count, offp, obuf, oboff);
+	kfree(obuf);
+
+	return ret;
+}
+
+/*
+ * A value was written to the stats variable for a
+ * queue. Reset the queue counters to this value.
+ */
+static ssize_t ptdma_debugfs_queue_write(struct file *filp,
+					 const char __user *ubuf,
+					 size_t count, loff_t *offp)
+{
+	struct pt_cmd_queue *cmd_q = filp->private_data;
+
+	ptdma_debugfs_reset_queue_stats(cmd_q);
+
+	return count;
+}
+
+static const struct file_operations pt_debugfs_info_ops = {
+	.owner = THIS_MODULE,
+	.open = simple_open,
+	.read = ptdma_debugfs_info_read,
+	.write = NULL,
+};
+
+static const struct file_operations pt_debugfs_queue_ops = {
+	.owner = THIS_MODULE,
+	.open = simple_open,
+	.read = ptdma_debugfs_queue_read,
+	.write = ptdma_debugfs_queue_write,
+};
+
+static const struct file_operations pt_debugfs_stats_ops = {
+	.owner = THIS_MODULE,
+	.open = simple_open,
+	.read = ptdma_debugfs_stats_read,
+	.write = ptdma_debugfs_stats_write,
+};
+
+void ptdma_debugfs_setup(struct pt_device *pt)
+{
+	struct pt_cmd_queue *cmd_q;
+	char name[MAX_NAME_LEN + 1];
+	struct dentry *debugfs_q_instance;
+
+	if (!debugfs_initialized())
+		return;
+
+	mutex_lock(&pt_debugfs_lock);
+	if (!pt_debugfs_dir)
+		pt_debugfs_dir = debugfs_create_dir(KBUILD_MODNAME, NULL);
+	mutex_unlock(&pt_debugfs_lock);
+
+	pt->debugfs_instance = debugfs_create_dir(pt->name, pt_debugfs_dir);
+
+	debugfs_create_file("info", 0400, pt->debugfs_instance, pt,
+			    &pt_debugfs_info_ops);
+
+	debugfs_create_file("stats", 0600, pt->debugfs_instance, pt,
+			    &pt_debugfs_stats_ops);
+
+	cmd_q = &pt->cmd_q;
+
+	snprintf(name, MAX_NAME_LEN - 1, "q");
+
+	debugfs_q_instance =
+		debugfs_create_dir(name, pt->debugfs_instance);
+
+	debugfs_create_file("stats", 0600, debugfs_q_instance, cmd_q,
+			    &pt_debugfs_queue_ops);
+}
+
+void ptdma_debugfs_destroy(void)
+{
+	debugfs_remove_recursive(pt_debugfs_dir);
+}
diff --git a/drivers/dma/ptdma/ptdma-dev.c b/drivers/dma/ptdma/ptdma-dev.c
index 0162ecd..9815185 100644
--- a/drivers/dma/ptdma/ptdma-dev.c
+++ b/drivers/dma/ptdma/ptdma-dev.c
@@ -14,6 +14,7 @@
 #include <linux/pci.h>
 #include <linux/dma-mapping.h>
 #include <linux/interrupt.h>
+#include <linux/debugfs.h>
 
 #include "ptdma.h"
 
@@ -130,6 +131,23 @@ static void pt_del_device(struct pt_device *pt)
 	write_unlock_irqrestore(&pt_unit_lock, flags);
 }
 
+/*
+ * pt_present - check if a PTDMA device is present
+ *
+ * Returns zero if a PTDMA device is present, -ENODEV otherwise.
+ */
+static int pt_present(void)
+{
+	unsigned long flags;
+	int ret;
+
+	read_lock_irqsave(&pt_unit_lock, flags);
+	ret = list_empty(&pt_units);
+	read_unlock_irqrestore(&pt_unit_lock, flags);
+
+	return ret ? -ENODEV : 0;
+}
+
 static int pt_core_execute_cmd(struct ptdma_desc *desc,
 			       struct pt_cmd_queue *cmd_q)
 {
@@ -173,6 +191,7 @@ int pt_core_perform_passthru(struct pt_cmd_queue *cmd_q,
 
 	cmd_q->cmd_error = 0;
 
+	cmd_q->total_pt_ops++;
 	memset(&desc, 0, Q_DESC_SIZE);
 
 	desc.dw0.val = CMD_DESC_DW0_VAL;
@@ -205,6 +224,7 @@ static irqreturn_t pt_core_irq_handler(int irq, void *data)
 	u32 status;
 
 	pt_core_disable_queue_interrupts(pt);
+	pt->total_interrupts++;
 
 	status = ioread32(cmd_q->reg_interrupt_status);
 	if (status) {
@@ -370,6 +390,9 @@ int pt_core_init(struct pt_device *pt)
 
 	tasklet_init(&pt->tasklet, pt_do_cmd_complete, (ulong)&pt->tdata);
 
+	/* Set up debugfs entries */
+	ptdma_debugfs_setup(pt);
+
 	return 0;
 
 e_dmaengine:
@@ -396,6 +419,9 @@ void pt_core_destroy(struct pt_device *pt)
 	/* Remove this device from the list of available units first */
 	pt_del_device(pt);
 
+	if (pt_present())
+		ptdma_debugfs_destroy();
+
 	/* Disable and clear interrupts */
 	pt_core_disable_queue_interrupts(pt);
 
diff --git a/drivers/dma/ptdma/ptdma.h b/drivers/dma/ptdma/ptdma.h
index 943660b..cbeed10 100644
--- a/drivers/dma/ptdma/ptdma.h
+++ b/drivers/dma/ptdma/ptdma.h
@@ -45,6 +45,7 @@
 #define	CMD_QUEUE_PRIO_OFFSET		0x00
 #define	CMD_REQID_CONFIG_OFFSET		0x04
 #define	CMD_TIMEOUT_OFFSET		0x08
+#define	CMD_PT_VERSION			0x10
 
 #define CMD_Q_CONTROL_BASE		0x0000
 #define CMD_Q_TAIL_LO_BASE		0x0004
@@ -252,6 +253,8 @@ struct pt_cmd_queue {
 	u32 q_int_status;
 	u32 cmd_error;
 
+	/* queue Statistics */
+	unsigned long total_pt_ops;
 } ____cacheline_aligned;
 
 struct pt_device {
@@ -290,6 +293,12 @@ struct pt_device {
 
 	wait_queue_head_t lsb_queue;
 
+	/* Device Statistics */
+	unsigned long total_interrupts;
+
+	/* DebugFS info */
+	struct dentry *debugfs_instance;
+
 	struct tasklet_struct tasklet;
 	struct pt_tasklet_data tdata;
 };
@@ -357,6 +366,9 @@ struct pt_dev_vdata {
 int pt_dmaengine_register(struct pt_device *pt);
 void pt_dmaengine_unregister(struct pt_device *pt);
 
+void ptdma_debugfs_setup(struct pt_device *pt);
+void ptdma_debugfs_destroy(void);
+
 int pt_core_init(struct pt_device *pt);
 void pt_core_destroy(struct pt_device *pt);
 
-- 
2.7.4


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3 1/3] dmaengine: ptdma: Initial driver for the AMD PassThru DMA engine
  2020-01-21  9:04 ` [PATCH v3 1/3] dmaengine: ptdma: Initial driver for the AMD PassThru DMA engine Sanjay R Mehta
@ 2020-01-24  6:00   ` Vinod Koul
  0 siblings, 0 replies; 7+ messages in thread
From: Vinod Koul @ 2020-01-24  6:00 UTC (permalink / raw)
  To: Sanjay R Mehta
  Cc: gregkh, dan.j.williams, Thomas.Lendacky, Shyam-sundar.S-k,
	Nehal-bakulchandra.Shah, robh, mchehab+samsung, davem,
	Jonathan.Cameron, linux-kernel, dmaengine, Gary R Hook

On 21-01-20, 03:04, Sanjay R Mehta wrote:
> From: Sanjay R Mehta <sanju.mehta@amd.com>
> 
> This device performs high-bandwidth memory-to-memory
> transfer operations.

Why is it called PassThru DMA engine?

>  obj-y += mediatek/
>  obj-y += qcom/
> diff --git a/drivers/dma/ptdma/Kconfig b/drivers/dma/ptdma/Kconfig
> new file mode 100644
> index 0000000..4ec259e
> --- /dev/null
> +++ b/drivers/dma/ptdma/Kconfig
> @@ -0,0 +1,6 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +config AMD_PTDMA
> +	tristate  "AMD PassThru DMA Engine"
> +	depends on X86_64 && PCI

not using DMA_VIRTUAL_CHANNELS?

> +	help
> +	  Provides the support for AMD PassThru DMA Engine.

more help text please

> +++ b/drivers/dma/ptdma/ptdma-dev.c
> @@ -0,0 +1,387 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * AMD Passthru DMA device driver
> + * -- Based on the CCP driver

What is ccp driver?

> + *
> + * Copyright (C) 2016,2019 Advanced Micro Devices, Inc.

2020..

> +#ifdef CONFIG_PM
> +static int pt_pci_suspend(struct pci_dev *pdev, pm_message_t state)
> +{
> +	return -ENOSYS;
> +}
> +
> +static int pt_pci_resume(struct pci_dev *pdev)
> +{
> +	return -ENOSYS;
> +}
> +#endif

please remove the dummy code, you can add these when you have support
for it

> +/* Bit masks */
> +#define	CMD_DESC_DW0_VAL		0x500012
> +#define	CMD_CONFIG_REQID		0x0
> +#define	CMD_CONFIG_VHB_EN		0x00000001
> +#define	CMD_QUEUE_PRIO			0x00000006
> +#define	CMD_TIMEOUT_DISABLE		0x00000000
> +#define	CMD_CLK_DYN_GATING_EN	0x1
> +#define	CMD_CLK_DYN_GATING_DIS	0x0
> +#define	CMD_CLK_HW_GATE_MODE	0x1
> +#define	CMD_CLK_SW_GATE_MODE	0x0
> +#define	CMD_CLK_GATE_ON_DELAY	0x1000
> +#define	CMD_CLK_GATE_CTL	0x0
> +#define	CMD_CLK_GATE_OFF_DELAY	0x1000

BIT and GENMASK for these...

All this adds handling of pt controller, I am not seeing dmaengine bits,
so please word the changelog accordingly and mention this adds based
bits and not dmaengine support (yet)

-- 
~Vinod

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3 2/3] dmaengine: ptdma: Register pass-through engine as a DMA resource
  2020-01-21  9:04 ` [PATCH v3 2/3] dmaengine: ptdma: Register pass-through engine as a DMA resource Sanjay R Mehta
@ 2020-01-24  6:05   ` Vinod Koul
  0 siblings, 0 replies; 7+ messages in thread
From: Vinod Koul @ 2020-01-24  6:05 UTC (permalink / raw)
  To: Sanjay R Mehta
  Cc: gregkh, dan.j.williams, Thomas.Lendacky, Shyam-sundar.S-k,
	Nehal-bakulchandra.Shah, robh, mchehab+samsung, davem,
	Jonathan.Cameron, linux-kernel, dmaengine

On 21-01-20, 03:04, Sanjay R Mehta wrote:

> +static void pt_free_chan_resources(struct dma_chan *dma_chan)
> +{
> +	struct pt_dma_chan *chan = container_of(dma_chan, struct pt_dma_chan,
> +						 dma_chan);
> +	unsigned long flags;
> +
> +	dev_dbg(chan->pt->dev, "%s - chan=%p\n", __func__, chan);
> +
> +	spin_lock_irqsave(&chan->lock, flags);
> +
> +	pt_free_desc_resources(chan->pt, &chan->complete);
> +	pt_free_desc_resources(chan->pt, &chan->active);
> +	pt_free_desc_resources(chan->pt, &chan->pending);
> +	pt_free_desc_resources(chan->pt, &chan->created);

can you use the virt-dma layer instead for list and descriptor
management..

> +static enum dma_status pt_tx_status(struct dma_chan *dma_chan,
> +				    dma_cookie_t cookie,
> +				    struct dma_tx_state *state)
> +{
> +	struct pt_dma_chan *chan = container_of(dma_chan, struct pt_dma_chan,
> +						 dma_chan);
> +	struct pt_dma_desc *desc;
> +	enum dma_status ret;
> +	unsigned long flags;
> +
> +	if (chan->status == DMA_PAUSED) {
> +		ret = DMA_PAUSED;

the pt_tx_status is for a specific cookie and not for the channel, so
you need to return status of the cookie which may have been completed or
pending (where pause makes sense)

> +static int pt_pause(struct dma_chan *dma_chan)
> +{
> +	struct pt_dma_chan *chan = container_of(dma_chan, struct pt_dma_chan,
> +						 dma_chan);
> +
> +	chan->status = DMA_PAUSED;
> +
> +	/*TODO: Wait for active DMA to complete before returning? */

When will the be resolved :)

-- 
~Vinod

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3 0/3] Add AMD PassThru DMA Engine driver
  2020-01-21  9:04 [PATCH v3 0/3] Add AMD PassThru DMA Engine driver Sanjay R Mehta
                   ` (2 preceding siblings ...)
  2020-01-21  9:04 ` [PATCH v3 3/3] dmaengine: ptdma: Add debugfs entries for PTDMA information Sanjay R Mehta
@ 2020-03-26 16:05 ` Vitaly Mayatskih
  3 siblings, 0 replies; 7+ messages in thread
From: Vitaly Mayatskih @ 2020-03-26 16:05 UTC (permalink / raw)
  To: Sanjay R Mehta
  Cc: vkoul, gregkh, dan.j.williams, Thomas.Lendacky, Shyam-sundar.S-k,
	Nehal-bakulchandra.Shah, robh, mchehab+samsung, davem,
	Jonathan.Cameron, linux-kernel, dmaengine

dmatest measures around 30 MB/s for one ptdma channel in EPYC 7302p
and 11 GB/s for ioatdma channel in Xeon Gold 6248. Is it a driver
limitation?

I run dmatest as modprobe dmatest timeout=10000 transfer_size=512
test_buf_size=1048576 threads_per_chan=1 iterations=1000 run=1 wait=1


On Tue, Jan 21, 2020 at 4:06 AM Sanjay R Mehta <Sanju.Mehta@amd.com> wrote:
>
> From: Sanjay R Mehta <sanju.mehta@amd.com>
>
> This patch series adds support for AMD PassThru DMA Engine which
> performs high bandwidth memory-to-memory and IO copy operation and
> performs DMA transfer through queue based descriptor management.
>
> AMD Processor has multiple ptdma device instances and each engine has
> single queue. The driver also adds support for for multiple PTDMA
> instances, each device will get an unique identifier and uniquely
> named resources.
>
> v3:
>         - Fixed the sparse warnings.
>
> v2:
>         - Added controller description in cover letter
>         - Removed "default m" from Kconfig
>         - Replaced low_address() and high_address() functions with kernel
>           API's lower_32_bits & upper_32_bits().
>         - Removed the BH handler function pt_core_irq_bh() and instead
>           handling transaction in irq handler itself.
>         - Moved presetting of command queue registers into new function
>           "init_cmdq_regs()"
>         - Removed the kernel thread dependency to submit transaction.
>         - Increased the hardware command queue size to 32 and adding it
>           as a module parameter.
>         - Removed backlog command queue handling mechanism.
>         - Removed software command queue handling and instead submitting
>           transaction command directly to
>           hardware command queue.
>         - Added tasklet structure variable in "struct pt_device".
>           This is used to invoke pt_do_cmd_complete() upon receiving interrupt
>           for command completion.
>         - pt_core_perform_passthru() function parameters are modified and it is
>           now used to submit command directly to hardware from dmaengine framework.
>         - Removed below structures, enums, macros and functions, as these values are
>           constants. Making command submission simple,
>            - Removed "union pt_function"  and several macros like PT_VERSION,
>              PT_BYTESWAP, PT_CMD_* etc..
>            - enum pt_passthru_bitwise, enum pt_passthru_byteswap, enum pt_memtype
>              struct pt_dma_info, struct pt_data, struct pt_mem, struct pt_passthru_op,
>              struct pt_op,
>            - Removed functions -> pt_cmd_queue_thread() pt_run_passthru_cmd(),
>              pt_run_cmd(), pt_dev_init(), pt_dequeue_cmd(), pt_do_cmd_backlog(),
>              pt_enqueue_cmd(),
>         - Below functions, stuctures and variables moved from ptdma-ops.c
>            - Moved function pt_alloc_struct() to ptdma-pci.c as its used only there.
>            - Moved "struct pt_tasklet_data" structure to ptdma.h
>            - Moved functions pt_do_cmd_complete(), pt_present(), pt_get_device(),
>              pt_add_device(), pt_del_device(), pt_log_error() and its dependent
>              static variables pt_unit_lock, pt_units, pt_rr_lock, pt_rr, pt_error_codes,
>              pt_ordinal  to ptdma-dev.c as they are used only in that file.
>
> Links of the review comments for v2:
> 1. https://lkml.org/lkml/2019/12/27/630
> 2. https://lkml.org/lkml/2020/1/3/23
> 3. https://lkml.org/lkml/2020/1/3/314
> 4. https://lkml.org/lkml/2020/1/10/100
>
>
> Links of the review comments for v1:
>
> 1. https://lkml.org/lkml/2019/9/24/490
> 2. https://lkml.org/lkml/2019/9/24/399
> 3. https://lkml.org/lkml/2019/9/24/862
> 4. https://lkml.org/lkml/2019/9/24/122
>
> Sanjay R Mehta (3):
>   dmaengine: ptdma: Initial driver for the AMD PassThru DMA engine
>   dmaengine: ptdma: Register pass-through engine as a DMA resource
>   dmaengine: ptdma: Add debugfs entries for PTDMA information
>
>  MAINTAINERS                         |   6 +
>  drivers/dma/Kconfig                 |   2 +
>  drivers/dma/Makefile                |   1 +
>  drivers/dma/ptdma/Kconfig           |   7 +
>  drivers/dma/ptdma/Makefile          |  12 +
>  drivers/dma/ptdma/ptdma-debugfs.c   | 237 ++++++++++++
>  drivers/dma/ptdma/ptdma-dev.c       | 448 +++++++++++++++++++++++
>  drivers/dma/ptdma/ptdma-dmaengine.c | 704 ++++++++++++++++++++++++++++++++++++
>  drivers/dma/ptdma/ptdma-pci.c       | 269 ++++++++++++++
>  drivers/dma/ptdma/ptdma.h           | 378 +++++++++++++++++++
>  10 files changed, 2064 insertions(+)
>  create mode 100644 drivers/dma/ptdma/Kconfig
>  create mode 100644 drivers/dma/ptdma/Makefile
>  create mode 100644 drivers/dma/ptdma/ptdma-debugfs.c
>  create mode 100644 drivers/dma/ptdma/ptdma-dev.c
>  create mode 100644 drivers/dma/ptdma/ptdma-dmaengine.c
>  create mode 100644 drivers/dma/ptdma/ptdma-pci.c
>  create mode 100644 drivers/dma/ptdma/ptdma.h
>
> --
> 2.7.4
>


--
wbr, Vitaly

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, back to index

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-21  9:04 [PATCH v3 0/3] Add AMD PassThru DMA Engine driver Sanjay R Mehta
2020-01-21  9:04 ` [PATCH v3 1/3] dmaengine: ptdma: Initial driver for the AMD PassThru DMA engine Sanjay R Mehta
2020-01-24  6:00   ` Vinod Koul
2020-01-21  9:04 ` [PATCH v3 2/3] dmaengine: ptdma: Register pass-through engine as a DMA resource Sanjay R Mehta
2020-01-24  6:05   ` Vinod Koul
2020-01-21  9:04 ` [PATCH v3 3/3] dmaengine: ptdma: Add debugfs entries for PTDMA information Sanjay R Mehta
2020-03-26 16:05 ` [PATCH v3 0/3] Add AMD PassThru DMA Engine driver Vitaly Mayatskih

dmaengine Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/dmaengine/0 dmaengine/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dmaengine dmaengine/ https://lore.kernel.org/dmaengine \
		dmaengine@vger.kernel.org
	public-inbox-index dmaengine

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.dmaengine


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git