dmaengine.vger.kernel.org archive mirror
* [PATCH v4 0/6] dma: Add Xilinx ZynqMP DPDMA driver
@ 2020-05-13 16:59 Laurent Pinchart
  2020-05-13 16:59 ` [PATCH v4 1/6] dt: bindings: dma: xilinx: dpdma: DT bindings for Xilinx DPDMA Laurent Pinchart
                   ` (5 more replies)
  0 siblings, 6 replies; 23+ messages in thread
From: Laurent Pinchart @ 2020-05-13 16:59 UTC (permalink / raw)
  To: dmaengine
  Cc: Michal Simek, Hyun Kwon, Tejas Upadhyay, Satish Kumar Nagireddy,
	Vinod Koul, Peter Ujfalusi

Hello,

This patch series adds a new driver for the DPDMA engine found in the
Xilinx ZynqMP.

The previous version can be found at [1]. All review comments have been
taken into account. The most notable change is the replacement of the
proposed new DMA transfer type, which combined interleaved and cyclic
transfers, with a new dma_ctrl_flags flag (patch 3/6, as suggested by Peter).

The driver has been successfully tested with the ZynqMP DisplayPort
subsystem DRM driver.

[1] https://lore.kernel.org/dmaengine/20200123022939.9739-1-laurent.pinchart@ideasonboard.com/

Hyun Kwon (1):
  dmaengine: xilinx: dpdma: Add the Xilinx DisplayPort DMA engine driver

Laurent Pinchart (5):
  dt: bindings: dma: xilinx: dpdma: DT bindings for Xilinx DPDMA
  dmaengine: virt-dma: Use lockdep to check locking requirements
  dmaengine: Add support for repeating transactions
  dmaengine: xilinx: dpdma: Add debugfs support
  arm64: dts: zynqmp: Add DPDMA node

 .../dma/xilinx/xlnx,zynqmp-dpdma.yaml         |   68 +
 MAINTAINERS                                   |    9 +
 .../arm64/boot/dts/xilinx/zynqmp-clk-ccf.dtsi |    4 +
 arch/arm64/boot/dts/xilinx/zynqmp.dtsi        |   10 +
 drivers/dma/Kconfig                           |   10 +
 drivers/dma/virt-dma.c                        |    2 +
 drivers/dma/virt-dma.h                        |   10 +
 drivers/dma/xilinx/Makefile                   |    1 +
 drivers/dma/xilinx/xilinx_dpdma.c             | 1771 +++++++++++++++++
 include/dt-bindings/dma/xlnx-zynqmp-dpdma.h   |   16 +
 include/linux/dmaengine.h                     |   10 +
 11 files changed, 1911 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/dma/xilinx/xlnx,zynqmp-dpdma.yaml
 create mode 100644 drivers/dma/xilinx/xilinx_dpdma.c
 create mode 100644 include/dt-bindings/dma/xlnx-zynqmp-dpdma.h

-- 
Regards,

Laurent Pinchart


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v4 1/6] dt: bindings: dma: xilinx: dpdma: DT bindings for Xilinx DPDMA
  2020-05-13 16:59 [PATCH v4 0/6] dma: Add Xilinx ZynqMP DPDMA driver Laurent Pinchart
@ 2020-05-13 16:59 ` Laurent Pinchart
  2020-05-13 16:59 ` [PATCH v4 2/6] dmaengine: virt-dma: Use lockdep to check locking requirements Laurent Pinchart
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 23+ messages in thread
From: Laurent Pinchart @ 2020-05-13 16:59 UTC (permalink / raw)
  To: dmaengine
  Cc: Michal Simek, Hyun Kwon, Tejas Upadhyay, Satish Kumar Nagireddy,
	Vinod Koul, Peter Ujfalusi

The ZynqMP includes the DisplayPort subsystem with its own DMA engine
called DPDMA. The DPDMA IP comes with 6 individual channels
(4 for display, 2 for audio). This documentation describes the DT bindings
of the DPDMA.

Signed-off-by: Hyun Kwon <hyun.kwon@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Rob Herring <robh@kernel.org>
---
Changes since v2:

- Fix id URL
- Fix path to dma-controller.yaml
- Update license to GPL-2.0-only OR BSD-2-Clause

Changes since v1:

- Convert the DT bindings to YAML
- Drop the DT child nodes
---
 .../dma/xilinx/xlnx,zynqmp-dpdma.yaml         | 68 +++++++++++++++++++
 MAINTAINERS                                   |  8 +++
 include/dt-bindings/dma/xlnx-zynqmp-dpdma.h   | 16 +++++
 3 files changed, 92 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/dma/xilinx/xlnx,zynqmp-dpdma.yaml
 create mode 100644 include/dt-bindings/dma/xlnx-zynqmp-dpdma.h

diff --git a/Documentation/devicetree/bindings/dma/xilinx/xlnx,zynqmp-dpdma.yaml b/Documentation/devicetree/bindings/dma/xilinx/xlnx,zynqmp-dpdma.yaml
new file mode 100644
index 000000000000..5de510f8c88c
--- /dev/null
+++ b/Documentation/devicetree/bindings/dma/xilinx/xlnx,zynqmp-dpdma.yaml
@@ -0,0 +1,68 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/dma/xilinx/xlnx,zynqmp-dpdma.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Xilinx ZynqMP DisplayPort DMA Controller Device Tree Bindings
+
+description: |
+  These bindings describe the DMA engine included in the Xilinx ZynqMP
+  DisplayPort Subsystem. The DMA engine supports up to 6 DMA channels (3
+  channels for a video stream, 1 channel for a graphics stream, and 2 channels
+  for an audio stream).
+
+maintainers:
+  - Laurent Pinchart <laurent.pinchart@ideasonboard.com>
+
+allOf:
+  - $ref: "../dma-controller.yaml#"
+
+properties:
+  "#dma-cells":
+    const: 1
+    description: |
+      The cell is the DMA channel ID (see dt-bindings/dma/xlnx-zynqmp-dpdma.h
+      for a list of channel IDs).
+
+  compatible:
+    const: xlnx,zynqmp-dpdma
+
+  reg:
+    maxItems: 1
+
+  interrupts:
+    maxItems: 1
+
+  clocks:
+    description: The AXI clock
+    maxItems: 1
+
+  clock-names:
+    const: axi_clk
+
+required:
+  - "#dma-cells"
+  - compatible
+  - reg
+  - interrupts
+  - clocks
+  - clock-names
+
+additionalProperties: false
+
+examples:
+  - |
+    #include <dt-bindings/interrupt-controller/arm-gic.h>
+
+    dma: dma-controller@fd4c0000 {
+      compatible = "xlnx,zynqmp-dpdma";
+      reg = <0x0 0xfd4c0000 0x0 0x1000>;
+      interrupts = <GIC_SPI 122 IRQ_TYPE_LEVEL_HIGH>;
+      interrupt-parent = <&gic>;
+      clocks = <&dpdma_clk>;
+      clock-names = "axi_clk";
+      #dma-cells = <1>;
+    };
+
+...
diff --git a/MAINTAINERS b/MAINTAINERS
index 0fc210d509ec..be39aab8b706 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18429,6 +18429,14 @@ F:	drivers/misc/Kconfig
 F:	drivers/misc/Makefile
 F:	include/uapi/misc/xilinx_sdfec.h
 
+XILINX ZYNQMP DPDMA DRIVER
+M:	Hyun Kwon <hyun.kwon@xilinx.com>
+M:	Laurent Pinchart <laurent.pinchart@ideasonboard.com>
+L:	dmaengine@vger.kernel.org
+S:	Supported
+F:	Documentation/devicetree/bindings/dma/xilinx/xlnx,zynqmp-dpdma.yaml
+F:	include/dt-bindings/dma/xlnx-zynqmp-dpdma.h
+
 XILLYBUS DRIVER
 M:	Eli Billauer <eli.billauer@gmail.com>
 L:	linux-kernel@vger.kernel.org
diff --git a/include/dt-bindings/dma/xlnx-zynqmp-dpdma.h b/include/dt-bindings/dma/xlnx-zynqmp-dpdma.h
new file mode 100644
index 000000000000..3719cda5679d
--- /dev/null
+++ b/include/dt-bindings/dma/xlnx-zynqmp-dpdma.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: (GPL-2.0 OR MIT) */
+/*
+ * Copyright 2019 Laurent Pinchart <laurent.pinchart@ideasonboard.com>
+ */
+
+#ifndef __DT_BINDINGS_DMA_XLNX_ZYNQMP_DPDMA_H__
+#define __DT_BINDINGS_DMA_XLNX_ZYNQMP_DPDMA_H__
+
+#define ZYNQMP_DPDMA_VIDEO0		0
+#define ZYNQMP_DPDMA_VIDEO1		1
+#define ZYNQMP_DPDMA_VIDEO2		2
+#define ZYNQMP_DPDMA_GRAPHICS		3
+#define ZYNQMP_DPDMA_AUDIO0		4
+#define ZYNQMP_DPDMA_AUDIO1		5
+
+#endif /* __DT_BINDINGS_DMA_XLNX_ZYNQMP_DPDMA_H__ */
-- 
Regards,

Laurent Pinchart


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v4 2/6] dmaengine: virt-dma: Use lockdep to check locking requirements
  2020-05-13 16:59 [PATCH v4 0/6] dma: Add Xilinx ZynqMP DPDMA driver Laurent Pinchart
  2020-05-13 16:59 ` [PATCH v4 1/6] dt: bindings: dma: xilinx: dpdma: DT bindings for Xilinx DPDMA Laurent Pinchart
@ 2020-05-13 16:59 ` Laurent Pinchart
  2020-05-13 16:59 ` [PATCH v4 3/6] dmaengine: Add support for repeating transactions Laurent Pinchart
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 23+ messages in thread
From: Laurent Pinchart @ 2020-05-13 16:59 UTC (permalink / raw)
  To: dmaengine
  Cc: Michal Simek, Hyun Kwon, Tejas Upadhyay, Satish Kumar Nagireddy,
	Vinod Koul, Peter Ujfalusi

A few virt-dma functions are documented as requiring the vc.lock to be
held by the caller. Check this with lockdep.

The vchan_vdesc_fini() and vchan_find_desc() functions gain a lockdep
check as well, because, even though they are not documented with this
requirement (and not documented at all for the latter), they touch
fields documented as protected by vc.lock. All callers have been
manually inspected to verify they call the functions with the lock held.
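
For illustration, the calling pattern that these assertions enforce looks
roughly like the following driver-side sketch (not part of this patch, and
the surrounding channel code is hypothetical):

	struct virt_dma_chan *vc = &chan->vchan;
	unsigned long flags;
	LIST_HEAD(head);

	spin_lock_irqsave(&vc->lock, flags);
	vchan_get_all_descriptors(vc, &head);	/* lockdep assertion passes */
	spin_unlock_irqrestore(&vc->lock, flags);

	/* Calling vchan_get_all_descriptors() without vc->lock held would
	 * now produce a lockdep splat instead of going unnoticed. */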

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
 drivers/dma/virt-dma.c |  2 ++
 drivers/dma/virt-dma.h | 10 ++++++++++
 2 files changed, 12 insertions(+)

diff --git a/drivers/dma/virt-dma.c b/drivers/dma/virt-dma.c
index 23e33a85f033..1cb36ab3d9c1 100644
--- a/drivers/dma/virt-dma.c
+++ b/drivers/dma/virt-dma.c
@@ -68,6 +68,8 @@ struct virt_dma_desc *vchan_find_desc(struct virt_dma_chan *vc,
 {
 	struct virt_dma_desc *vd;
 
+	lockdep_assert_held(&vc->lock);
+
 	list_for_each_entry(vd, &vc->desc_issued, node)
 		if (vd->tx.cookie == cookie)
 			return vd;
diff --git a/drivers/dma/virt-dma.h b/drivers/dma/virt-dma.h
index e9f5250fbe4d..59d9eabc8b67 100644
--- a/drivers/dma/virt-dma.h
+++ b/drivers/dma/virt-dma.h
@@ -81,6 +81,8 @@ static inline struct dma_async_tx_descriptor *vchan_tx_prep(struct virt_dma_chan
  */
 static inline bool vchan_issue_pending(struct virt_dma_chan *vc)
 {
+	lockdep_assert_held(&vc->lock);
+
 	list_splice_tail_init(&vc->desc_submitted, &vc->desc_issued);
 	return !list_empty(&vc->desc_issued);
 }
@@ -96,6 +98,8 @@ static inline void vchan_cookie_complete(struct virt_dma_desc *vd)
 	struct virt_dma_chan *vc = to_virt_chan(vd->tx.chan);
 	dma_cookie_t cookie;
 
+	lockdep_assert_held(&vc->lock);
+
 	cookie = vd->tx.cookie;
 	dma_cookie_complete(&vd->tx);
 	dev_vdbg(vc->chan.device->dev, "txd %p[%x]: marked complete\n",
@@ -146,6 +150,8 @@ static inline void vchan_terminate_vdesc(struct virt_dma_desc *vd)
 {
 	struct virt_dma_chan *vc = to_virt_chan(vd->tx.chan);
 
+	lockdep_assert_held(&vc->lock);
+
 	list_add_tail(&vd->node, &vc->desc_terminated);
 
 	if (vc->cyclic == vd)
@@ -160,6 +166,8 @@ static inline void vchan_terminate_vdesc(struct virt_dma_desc *vd)
  */
 static inline struct virt_dma_desc *vchan_next_desc(struct virt_dma_chan *vc)
 {
+	lockdep_assert_held(&vc->lock);
+
 	return list_first_entry_or_null(&vc->desc_issued,
 					struct virt_dma_desc, node);
 }
@@ -177,6 +185,8 @@ static inline struct virt_dma_desc *vchan_next_desc(struct virt_dma_chan *vc)
 static inline void vchan_get_all_descriptors(struct virt_dma_chan *vc,
 	struct list_head *head)
 {
+	lockdep_assert_held(&vc->lock);
+
 	list_splice_tail_init(&vc->desc_allocated, head);
 	list_splice_tail_init(&vc->desc_submitted, head);
 	list_splice_tail_init(&vc->desc_issued, head);
-- 
Regards,

Laurent Pinchart


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v4 3/6] dmaengine: Add support for repeating transactions
  2020-05-13 16:59 [PATCH v4 0/6] dma: Add Xilinx ZynqMP DPDMA driver Laurent Pinchart
  2020-05-13 16:59 ` [PATCH v4 1/6] dt: bindings: dma: xilinx: dpdma: DT bindings for Xilinx DPDMA Laurent Pinchart
  2020-05-13 16:59 ` [PATCH v4 2/6] dmaengine: virt-dma: Use lockdep to check locking requirements Laurent Pinchart
@ 2020-05-13 16:59 ` Laurent Pinchart
  2020-05-14 18:23   ` Vinod Koul
  2020-05-13 16:59 ` [PATCH v4 4/6] dmaengine: xilinx: dpdma: Add the Xilinx DisplayPort DMA engine driver Laurent Pinchart
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 23+ messages in thread
From: Laurent Pinchart @ 2020-05-13 16:59 UTC (permalink / raw)
  To: dmaengine
  Cc: Michal Simek, Hyun Kwon, Tejas Upadhyay, Satish Kumar Nagireddy,
	Vinod Koul, Peter Ujfalusi

DMA engines used with displays perform 2D interleaved transfers to read
framebuffers from memory and feed the data to the display engine. As the
same framebuffer can be displayed for multiple frames, the DMA
transactions need to be repeated until a new framebuffer replaces the
current one. This feature is implemented natively by some DMA engines
that have the ability to repeat transactions and switch to a new
transaction at the end of a transfer without any race condition or frame
loss.

This patch implements support for this feature in the DMA engine API. A
new DMA_PREP_REPEAT transaction flag allows DMA clients to instruct the
DMA channel to repeat the transaction automatically until one or more
new transactions are issued on the channel (or until all active DMA
transfers are explicitly terminated with the dmaengine_terminate_*()
functions). A new DMA_REPEAT transaction type is also added for DMA
engine drivers to report their support of the DMA_PREP_REPEAT flag.

The DMA_PREP_REPEAT flag is currently supported for interleaved
transactions only. Its usage can easily be extended to cover more
transaction types simply by adding an appropriate check in the
corresponding dmaengine_prep_*() function.
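
For reference, client usage of the new flag is expected to look roughly as
follows (a sketch only, with error handling omitted; the surrounding client
code and variables are hypothetical):

	struct dma_async_tx_descriptor *tx;

	/* xt describes one 2D interleaved frame (the framebuffer). */
	tx = dmaengine_prep_interleaved_dma(chan, xt,
					    DMA_PREP_REPEAT | DMA_PREP_INTERRUPT);
	if (!tx)
		return -EINVAL;	/* e.g. the channel doesn't report DMA_REPEAT */

	dmaengine_submit(tx);
	dma_async_issue_pending(chan);

	/* The frame now repeats until a new transaction is issued on the
	 * channel, or the transfer is terminated with dmaengine_terminate_*(). */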

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
If this approach is accepted I can send a new version that updates
documentation in Documentation/driver-api/dmaengine/, and extend support
of DMA_PREP_REPEAT to the other transaction types, if desired already.

 include/linux/dmaengine.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 64461fc64e1b..9fa00bdbf583 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -61,6 +61,7 @@ enum dma_transaction_type {
 	DMA_SLAVE,
 	DMA_CYCLIC,
 	DMA_INTERLEAVE,
+	DMA_REPEAT,
 /* last transaction type for creation of the capabilities mask */
 	DMA_TX_TYPE_END,
 };
@@ -176,6 +177,11 @@ struct dma_interleaved_template {
  * @DMA_PREP_CMD: tell the driver that the data passed to DMA API is command
  *  data and the descriptor should be in different format from normal
  *  data descriptors.
+ * @DMA_PREP_REPEAT: tell the driver that the transaction shall be automatically
+ *  repeated when it ends if no other transaction has been issued on the same
+ *  channel. If other transactions have been issued, this transaction completes
+ *  normally. This flag is only applicable to interleaved transactions and is
+ *  ignored for all other transaction types.
  */
 enum dma_ctrl_flags {
 	DMA_PREP_INTERRUPT = (1 << 0),
@@ -186,6 +192,7 @@ enum dma_ctrl_flags {
 	DMA_PREP_FENCE = (1 << 5),
 	DMA_CTRL_REUSE = (1 << 6),
 	DMA_PREP_CMD = (1 << 7),
+	DMA_PREP_REPEAT = (1 << 8),
 };
 
 /**
@@ -967,6 +974,9 @@ static inline struct dma_async_tx_descriptor *dmaengine_prep_interleaved_dma(
 {
 	if (!chan || !chan->device || !chan->device->device_prep_interleaved_dma)
 		return NULL;
+	if (flags & DMA_PREP_REPEAT &&
+	    !test_bit(DMA_REPEAT, chan->device->cap_mask.bits))
+		return NULL;
 
 	return chan->device->device_prep_interleaved_dma(chan, xt, flags);
 }
-- 
Regards,

Laurent Pinchart


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v4 4/6] dmaengine: xilinx: dpdma: Add the Xilinx DisplayPort DMA engine driver
  2020-05-13 16:59 [PATCH v4 0/6] dma: Add Xilinx ZynqMP DPDMA driver Laurent Pinchart
                   ` (2 preceding siblings ...)
  2020-05-13 16:59 ` [PATCH v4 3/6] dmaengine: Add support for repeating transactions Laurent Pinchart
@ 2020-05-13 16:59 ` Laurent Pinchart
  2020-05-13 16:59 ` [PATCH v4 5/6] dmaengine: xilinx: dpdma: Add debugfs support Laurent Pinchart
  2020-05-13 16:59 ` [PATCH v4 6/6] arm64: dts: zynqmp: Add DPDMA node Laurent Pinchart
  5 siblings, 0 replies; 23+ messages in thread
From: Laurent Pinchart @ 2020-05-13 16:59 UTC (permalink / raw)
  To: dmaengine
  Cc: Michal Simek, Hyun Kwon, Tejas Upadhyay, Satish Kumar Nagireddy,
	Vinod Koul, Peter Ujfalusi

From: Hyun Kwon <hyun.kwon@xilinx.com>

The ZynqMP DisplayPort subsystem includes a DMA engine called DPDMA with
6 DMA channels (4 for display and 2 for audio). This driver exposes the
DPDMA through the dmaengine API, to be used by audio (ALSA) and display
(DRM) drivers for the DisplayPort subsystem.
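
For context, the DisplayPort consumers are expected to obtain the channels
through the regular dmaengine API, roughly as in this sketch (the "vid0"
name is illustrative only, and would refer to a dmas/dma-names pair using
the channel IDs from include/dt-bindings/dma/xlnx-zynqmp-dpdma.h):

	struct dma_chan *chan;

	chan = dma_request_chan(dev, "vid0");
	if (IS_ERR(chan))
		return PTR_ERR(chan);

	/* ... prepare and submit interleaved transfers ... */

	dma_release_channel(chan);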

Signed-off-by: Hyun Kwon <hyun.kwon@xilinx.com>
Signed-off-by: Tejas Upadhyay <tejasu@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
Changes since v3:

- Fix uninitialized variable in xilinx_dpdma_config()
- Free pending and active descriptors upon termination
- Switch to DMA_PREP_REPEAT

Changes since v2:

- Switch to virt-dma
- Support interleaved cyclic transfers and nothing else
- Fix terminate_all behaviour (don't wait)
- Fix bug in extended address handling for hw desc
- Clean up video group handling
- Update driver name
- Use macros for bitfields
- Remove unneeded header
- Coding style and typo fixes

Changes since v1:

- Remove unneeded #include
- Drop enum xilinx_dpdma_chan_id
- Update compatible string
- Drop DT subnodes
- Replace XILINX_DPDMA_NUM_CHAN with ARRAY_SIZE(xdev->chan)
- Disable IRQ at remove() time
- Use devm_platform_ioremap_resource()
- Don't inline functions manually
- Add section headers
- Merge DMA engine implementation in their wrappers
- Rename xilinx_dpdma_sw_desc::phys to dma_addr
- Use GENMASK()
- Use FIELD_PREP/FIELD_GET
- Fix MSB handling in xilinx_dpdma_sw_desc_addr_64()
- Fix logic in xilinx_dpdma_chan_prep_slave_sg()
- Document why xilinx_dpdma_config() doesn't need to check most
  parameters
- Remove debugfs support
- Reschedule errored descriptor
- Align the line size with 128bit
- SPDX header formatting
---
 MAINTAINERS                       |    1 +
 drivers/dma/Kconfig               |   10 +
 drivers/dma/xilinx/Makefile       |    1 +
 drivers/dma/xilinx/xilinx_dpdma.c | 1543 +++++++++++++++++++++++++++++
 4 files changed, 1555 insertions(+)
 create mode 100644 drivers/dma/xilinx/xilinx_dpdma.c

diff --git a/MAINTAINERS b/MAINTAINERS
index be39aab8b706..aae90caba2a6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18435,6 +18435,7 @@ M:	Laurent Pinchart <laurent.pinchart@ideasonboard.com>
 L:	dmaengine@vger.kernel.org
 S:	Supported
 F:	Documentation/devicetree/bindings/dma/xilinx/xlnx,zynqmp-dpdma.yaml
+F:	drivers/dma/xilinx/xilinx_dpdma.c
 F:	include/dt-bindings/dma/xlnx-zynqmp-dpdma.h
 
 XILLYBUS DRIVER
diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index 5142da401db3..b8f9d6ba7bf2 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -697,6 +697,16 @@ config XILINX_ZYNQMP_DMA
 	help
 	  Enable support for Xilinx ZynqMP DMA controller.
 
+config XILINX_ZYNQMP_DPDMA
+	tristate "Xilinx DPDMA Engine"
+	select DMA_ENGINE
+	select DMA_VIRTUAL_CHANNELS
+	help
+	  Enable support for Xilinx ZynqMP DisplayPort DMA. Choose this option
+	  if you have a Xilinx ZynqMP SoC with a DisplayPort subsystem. The
+	  driver provides the dmaengine required by the DisplayPort subsystem
+	  display driver.
+
 config ZX_DMA
 	tristate "ZTE ZX DMA support"
 	depends on ARCH_ZX || COMPILE_TEST
diff --git a/drivers/dma/xilinx/Makefile b/drivers/dma/xilinx/Makefile
index e921de575b55..767bb45f641f 100644
--- a/drivers/dma/xilinx/Makefile
+++ b/drivers/dma/xilinx/Makefile
@@ -1,3 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0-only
 obj-$(CONFIG_XILINX_DMA) += xilinx_dma.o
 obj-$(CONFIG_XILINX_ZYNQMP_DMA) += zynqmp_dma.o
+obj-$(CONFIG_XILINX_ZYNQMP_DPDMA) += xilinx_dpdma.o
diff --git a/drivers/dma/xilinx/xilinx_dpdma.c b/drivers/dma/xilinx/xilinx_dpdma.c
new file mode 100644
index 000000000000..de28c63afc7d
--- /dev/null
+++ b/drivers/dma/xilinx/xilinx_dpdma.c
@@ -0,0 +1,1543 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Xilinx ZynqMP DPDMA Engine driver
+ *
+ * Copyright (C) 2015 - 2019 Xilinx, Inc.
+ *
+ * Author: Hyun Woo Kwon <hyun.kwon@xilinx.com>
+ */
+
+#include <linux/bitfield.h>
+#include <linux/bits.h>
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/dmaengine.h>
+#include <linux/dmapool.h>
+#include <linux/interrupt.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_dma.h>
+#include <linux/platform_device.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/wait.h>
+
+#include <dt-bindings/dma/xlnx-zynqmp-dpdma.h>
+
+#include "../dmaengine.h"
+#include "../virt-dma.h"
+
+/* DPDMA registers */
+#define XILINX_DPDMA_ERR_CTRL				0x000
+#define XILINX_DPDMA_ISR				0x004
+#define XILINX_DPDMA_IMR				0x008
+#define XILINX_DPDMA_IEN				0x00c
+#define XILINX_DPDMA_IDS				0x010
+#define XILINX_DPDMA_INTR_DESC_DONE(n)			BIT((n) + 0)
+#define XILINX_DPDMA_INTR_DESC_DONE_MASK		GENMASK(5, 0)
+#define XILINX_DPDMA_INTR_NO_OSTAND(n)			BIT((n) + 6)
+#define XILINX_DPDMA_INTR_NO_OSTAND_MASK		GENMASK(11, 6)
+#define XILINX_DPDMA_INTR_AXI_ERR(n)			BIT((n) + 12)
+#define XILINX_DPDMA_INTR_AXI_ERR_MASK			GENMASK(17, 12)
+#define XILINX_DPDMA_INTR_DESC_ERR(n)			BIT((n) + 16)
+#define XILINX_DPDMA_INTR_DESC_ERR_MASK			GENMASK(23, 18)
+#define XILINX_DPDMA_INTR_WR_CMD_FIFO_FULL		BIT(24)
+#define XILINX_DPDMA_INTR_WR_DATA_FIFO_FULL		BIT(25)
+#define XILINX_DPDMA_INTR_AXI_4K_CROSS			BIT(26)
+#define XILINX_DPDMA_INTR_VSYNC				BIT(27)
+#define XILINX_DPDMA_INTR_CHAN_ERR_MASK			0x00041000
+#define XILINX_DPDMA_INTR_CHAN_ERR			0x00fff000
+#define XILINX_DPDMA_INTR_GLOBAL_ERR			0x07000000
+#define XILINX_DPDMA_INTR_ERR_ALL			0x07fff000
+#define XILINX_DPDMA_INTR_CHAN_MASK			0x00041041
+#define XILINX_DPDMA_INTR_GLOBAL_MASK			0x0f000000
+#define XILINX_DPDMA_INTR_ALL				0x0fffffff
+#define XILINX_DPDMA_EISR				0x014
+#define XILINX_DPDMA_EIMR				0x018
+#define XILINX_DPDMA_EIEN				0x01c
+#define XILINX_DPDMA_EIDS				0x020
+#define XILINX_DPDMA_EINTR_INV_APB			BIT(0)
+#define XILINX_DPDMA_EINTR_RD_AXI_ERR(n)		BIT((n) + 1)
+#define XILINX_DPDMA_EINTR_RD_AXI_ERR_MASK		GENMASK(6, 1)
+#define XILINX_DPDMA_EINTR_PRE_ERR(n)			BIT((n) + 7)
+#define XILINX_DPDMA_EINTR_PRE_ERR_MASK			GENMASK(12, 7)
+#define XILINX_DPDMA_EINTR_CRC_ERR(n)			BIT((n) + 13)
+#define XILINX_DPDMA_EINTR_CRC_ERR_MASK			GENMASK(18, 13)
+#define XILINX_DPDMA_EINTR_WR_AXI_ERR(n)		BIT((n) + 19)
+#define XILINX_DPDMA_EINTR_WR_AXI_ERR_MASK		GENMASK(24, 19)
+#define XILINX_DPDMA_EINTR_DESC_DONE_ERR(n)		BIT((n) + 25)
+#define XILINX_DPDMA_EINTR_DESC_DONE_ERR_MASK		GENMASK(30, 25)
+#define XILINX_DPDMA_EINTR_RD_CMD_FIFO_FULL		BIT(32)
+#define XILINX_DPDMA_EINTR_CHAN_ERR_MASK		0x02082082
+#define XILINX_DPDMA_EINTR_CHAN_ERR			0x7ffffffe
+#define XILINX_DPDMA_EINTR_GLOBAL_ERR			0x80000001
+#define XILINX_DPDMA_EINTR_ALL				0xffffffff
+#define XILINX_DPDMA_CNTL				0x100
+#define XILINX_DPDMA_GBL				0x104
+#define XILINX_DPDMA_GBL_TRIG_MASK(n)			((n) << 0)
+#define XILINX_DPDMA_GBL_RETRIG_MASK(n)			((n) << 6)
+#define XILINX_DPDMA_ALC0_CNTL				0x108
+#define XILINX_DPDMA_ALC0_STATUS			0x10c
+#define XILINX_DPDMA_ALC0_MAX				0x110
+#define XILINX_DPDMA_ALC0_MIN				0x114
+#define XILINX_DPDMA_ALC0_ACC				0x118
+#define XILINX_DPDMA_ALC0_ACC_TRAN			0x11c
+#define XILINX_DPDMA_ALC1_CNTL				0x120
+#define XILINX_DPDMA_ALC1_STATUS			0x124
+#define XILINX_DPDMA_ALC1_MAX				0x128
+#define XILINX_DPDMA_ALC1_MIN				0x12c
+#define XILINX_DPDMA_ALC1_ACC				0x130
+#define XILINX_DPDMA_ALC1_ACC_TRAN			0x134
+
+/* Channel register */
+#define XILINX_DPDMA_CH_BASE				0x200
+#define XILINX_DPDMA_CH_OFFSET				0x100
+#define XILINX_DPDMA_CH_DESC_START_ADDRE		0x000
+#define XILINX_DPDMA_CH_DESC_START_ADDRE_MASK		GENMASK(15, 0)
+#define XILINX_DPDMA_CH_DESC_START_ADDR			0x004
+#define XILINX_DPDMA_CH_DESC_NEXT_ADDRE			0x008
+#define XILINX_DPDMA_CH_DESC_NEXT_ADDR			0x00c
+#define XILINX_DPDMA_CH_PYLD_CUR_ADDRE			0x010
+#define XILINX_DPDMA_CH_PYLD_CUR_ADDR			0x014
+#define XILINX_DPDMA_CH_CNTL				0x018
+#define XILINX_DPDMA_CH_CNTL_ENABLE			BIT(0)
+#define XILINX_DPDMA_CH_CNTL_PAUSE			BIT(1)
+#define XILINX_DPDMA_CH_CNTL_QOS_DSCR_WR_MASK		GENMASK(5, 2)
+#define XILINX_DPDMA_CH_CNTL_QOS_DSCR_RD_MASK		GENMASK(9, 6)
+#define XILINX_DPDMA_CH_CNTL_QOS_DATA_RD_MASK		GENMASK(13, 10)
+#define XILINX_DPDMA_CH_CNTL_QOS_VID_CLASS		11
+#define XILINX_DPDMA_CH_STATUS				0x01c
+#define XILINX_DPDMA_CH_STATUS_OTRAN_CNT_MASK		GENMASK(24, 21)
+#define XILINX_DPDMA_CH_VDO				0x020
+#define XILINX_DPDMA_CH_PYLD_SZ				0x024
+#define XILINX_DPDMA_CH_DESC_ID				0x028
+
+/* DPDMA descriptor fields */
+#define XILINX_DPDMA_DESC_CONTROL_PREEMBLE		0xa5
+#define XILINX_DPDMA_DESC_CONTROL_COMPLETE_INTR		BIT(8)
+#define XILINX_DPDMA_DESC_CONTROL_DESC_UPDATE		BIT(9)
+#define XILINX_DPDMA_DESC_CONTROL_IGNORE_DONE		BIT(10)
+#define XILINX_DPDMA_DESC_CONTROL_FRAG_MODE		BIT(18)
+#define XILINX_DPDMA_DESC_CONTROL_LAST			BIT(19)
+#define XILINX_DPDMA_DESC_CONTROL_ENABLE_CRC		BIT(20)
+#define XILINX_DPDMA_DESC_CONTROL_LAST_OF_FRAME		BIT(21)
+#define XILINX_DPDMA_DESC_ID_MASK			GENMASK(15, 0)
+#define XILINX_DPDMA_DESC_HSIZE_STRIDE_HSIZE_MASK	GENMASK(17, 0)
+#define XILINX_DPDMA_DESC_HSIZE_STRIDE_STRIDE_MASK	GENMASK(31, 18)
+#define XILINX_DPDMA_DESC_ADDR_EXT_NEXT_ADDR_MASK	GENMASK(15, 0)
+#define XILINX_DPDMA_DESC_ADDR_EXT_SRC_ADDR_MASK	GENMASK(31, 16)
+
+#define XILINX_DPDMA_ALIGN_BYTES			256
+#define XILINX_DPDMA_LINESIZE_ALIGN_BITS		128
+
+#define XILINX_DPDMA_NUM_CHAN				6
+
+struct xilinx_dpdma_chan;
+
+/**
+ * struct xilinx_dpdma_hw_desc - DPDMA hardware descriptor
+ * @control: control configuration field
+ * @desc_id: descriptor ID
+ * @xfer_size: transfer size
+ * @hsize_stride: horizontal size and stride
+ * @timestamp_lsb: LSB of time stamp
+ * @timestamp_msb: MSB of time stamp
+ * @addr_ext: upper 16 bit of 48 bit address (next_desc and src_addr)
+ * @next_desc: next descriptor 32 bit address
+ * @src_addr: payload source address (1st page, 32 LSB)
+ * @addr_ext_23: payload source address (2nd and 3rd pages, 16 LSBs)
+ * @addr_ext_45: payload source address (4th and 5th pages, 16 LSBs)
+ * @src_addr2: payload source address (2nd page, 32 LSB)
+ * @src_addr3: payload source address (3rd page, 32 LSB)
+ * @src_addr4: payload source address (4th page, 32 LSB)
+ * @src_addr5: payload source address (5th page, 32 LSB)
+ * @crc: descriptor CRC
+ */
+struct xilinx_dpdma_hw_desc {
+	u32 control;
+	u32 desc_id;
+	u32 xfer_size;
+	u32 hsize_stride;
+	u32 timestamp_lsb;
+	u32 timestamp_msb;
+	u32 addr_ext;
+	u32 next_desc;
+	u32 src_addr;
+	u32 addr_ext_23;
+	u32 addr_ext_45;
+	u32 src_addr2;
+	u32 src_addr3;
+	u32 src_addr4;
+	u32 src_addr5;
+	u32 crc;
+} __aligned(XILINX_DPDMA_ALIGN_BYTES);
+
+/**
+ * struct xilinx_dpdma_sw_desc - DPDMA software descriptor
+ * @hw: DPDMA hardware descriptor
+ * @node: list node for software descriptors
+ * @dma_addr: DMA address of the software descriptor
+ */
+struct xilinx_dpdma_sw_desc {
+	struct xilinx_dpdma_hw_desc hw;
+	struct list_head node;
+	dma_addr_t dma_addr;
+};
+
+/**
+ * struct xilinx_dpdma_tx_desc - DPDMA transaction descriptor
+ * @vdesc: virtual DMA descriptor
+ * @chan: DMA channel
+ * @descriptors: list of software descriptors
+ * @error: an error has been detected with this descriptor
+ */
+struct xilinx_dpdma_tx_desc {
+	struct virt_dma_desc vdesc;
+	struct xilinx_dpdma_chan *chan;
+	struct list_head descriptors;
+	bool error;
+};
+
+#define to_dpdma_tx_desc(_desc) \
+	container_of(_desc, struct xilinx_dpdma_tx_desc, vdesc)
+
+/**
+ * struct xilinx_dpdma_chan - DPDMA channel
+ * @vchan: virtual DMA channel
+ * @reg: register base address
+ * @id: channel ID
+ * @wait_to_stop: queue to wait for outstanding transactions before stopping
+ * @running: true if the channel is running
+ * @first_frame: flag for the first frame of the stream
+ * @video_group: flag set if multi-channel operation is needed for video channels
+ * @lock: lock to access struct xilinx_dpdma_chan
+ * @desc_pool: descriptor allocation pool
+ * @err_task: error IRQ bottom half handler
+ * @desc.pending: Descriptor scheduled to the hardware, pending execution
+ * @desc.active: Descriptor being executed by the hardware
+ * @xdev: DPDMA device
+ */
+struct xilinx_dpdma_chan {
+	struct virt_dma_chan vchan;
+	void __iomem *reg;
+	unsigned int id;
+
+	wait_queue_head_t wait_to_stop;
+	bool running;
+	bool first_frame;
+	bool video_group;
+
+	spinlock_t lock; /* lock to access struct xilinx_dpdma_chan */
+	struct dma_pool *desc_pool;
+	struct tasklet_struct err_task;
+
+	struct {
+		struct xilinx_dpdma_tx_desc *pending;
+		struct xilinx_dpdma_tx_desc *active;
+	} desc;
+
+	struct xilinx_dpdma_device *xdev;
+};
+
+#define to_xilinx_chan(_chan) \
+	container_of(_chan, struct xilinx_dpdma_chan, vchan.chan)
+
+/**
+ * struct xilinx_dpdma_device - DPDMA device
+ * @common: generic dma device structure
+ * @reg: register base address
+ * @dev: generic device structure
+ * @irq: the interrupt number
+ * @axi_clk: axi clock
+ * @chan: DPDMA channels
+ * @ext_addr: flag for 64 bit system (48 bit addressing)
+ */
+struct xilinx_dpdma_device {
+	struct dma_device common;
+	void __iomem *reg;
+	struct device *dev;
+	int irq;
+
+	struct clk *axi_clk;
+	struct xilinx_dpdma_chan *chan[XILINX_DPDMA_NUM_CHAN];
+
+	bool ext_addr;
+};
+
+/* -----------------------------------------------------------------------------
+ * I/O Accessors
+ */
+
+static inline u32 dpdma_read(void __iomem *base, u32 offset)
+{
+	return ioread32(base + offset);
+}
+
+static inline void dpdma_write(void __iomem *base, u32 offset, u32 val)
+{
+	iowrite32(val, base + offset);
+}
+
+static inline void dpdma_clr(void __iomem *base, u32 offset, u32 clr)
+{
+	dpdma_write(base, offset, dpdma_read(base, offset) & ~clr);
+}
+
+static inline void dpdma_set(void __iomem *base, u32 offset, u32 set)
+{
+	dpdma_write(base, offset, dpdma_read(base, offset) | set);
+}
+
+/* -----------------------------------------------------------------------------
+ * Descriptor Operations
+ */
+
+/**
+ * xilinx_dpdma_sw_desc_set_dma_addrs - Set DMA addresses in the descriptor
+ * @sw_desc: The software descriptor in which to set DMA addresses
+ * @prev: The previous descriptor
+ * @dma_addr: array of dma addresses
+ * @num_src_addr: number of addresses in @dma_addr
+ *
+ * Set all the DMA addresses in the hardware descriptor of @sw_desc, taken
+ * from @dma_addr. If a previous descriptor is specified in @prev, its next
+ * descriptor DMA address is set to the DMA address of @sw_desc. @prev may be
+ * identical to @sw_desc for cyclic transfers.
+ */
+static void xilinx_dpdma_sw_desc_set_dma_addrs(struct xilinx_dpdma_device *xdev,
+					       struct xilinx_dpdma_sw_desc *sw_desc,
+					       struct xilinx_dpdma_sw_desc *prev,
+					       dma_addr_t dma_addr[],
+					       unsigned int num_src_addr)
+{
+	struct xilinx_dpdma_hw_desc *hw_desc = &sw_desc->hw;
+	unsigned int i;
+
+	hw_desc->src_addr = lower_32_bits(dma_addr[0]);
+	if (xdev->ext_addr)
+		hw_desc->addr_ext |=
+			FIELD_PREP(XILINX_DPDMA_DESC_ADDR_EXT_SRC_ADDR_MASK,
+				   upper_32_bits(dma_addr[0]));
+
+	for (i = 1; i < num_src_addr; i++) {
+		u32 *addr = &hw_desc->src_addr2;
+
+		addr[i-1] = lower_32_bits(dma_addr[i]);
+
+		if (xdev->ext_addr) {
+			u32 *addr_ext = &hw_desc->addr_ext_23;
+			u32 addr_msb;
+
+			addr_msb = upper_32_bits(dma_addr[i]) & GENMASK(15, 0);
+			addr_msb <<= 16 * ((i - 1) % 2);
+			addr_ext[(i - 1) / 2] |= addr_msb;
+		}
+	}
+
+	if (!prev)
+		return;
+
+	prev->hw.next_desc = lower_32_bits(sw_desc->dma_addr);
+	if (xdev->ext_addr)
+		prev->hw.addr_ext |=
+			FIELD_PREP(XILINX_DPDMA_DESC_ADDR_EXT_NEXT_ADDR_MASK,
+				   upper_32_bits(sw_desc->dma_addr));
+}
+
+/**
+ * xilinx_dpdma_chan_alloc_sw_desc - Allocate a software descriptor
+ * @chan: DPDMA channel
+ *
+ * Allocate a software descriptor from the channel's descriptor pool.
+ *
+ * Return: a software descriptor or NULL.
+ */
+static struct xilinx_dpdma_sw_desc *
+xilinx_dpdma_chan_alloc_sw_desc(struct xilinx_dpdma_chan *chan)
+{
+	struct xilinx_dpdma_sw_desc *sw_desc;
+	dma_addr_t dma_addr;
+
+	sw_desc = dma_pool_zalloc(chan->desc_pool, GFP_ATOMIC, &dma_addr);
+	if (!sw_desc)
+		return NULL;
+
+	sw_desc->dma_addr = dma_addr;
+
+	return sw_desc;
+}
+
+/**
+ * xilinx_dpdma_chan_free_sw_desc - Free a software descriptor
+ * @chan: DPDMA channel
+ * @sw_desc: software descriptor to free
+ *
+ * Free a software descriptor from the channel's descriptor pool.
+ */
+static void
+xilinx_dpdma_chan_free_sw_desc(struct xilinx_dpdma_chan *chan,
+			       struct xilinx_dpdma_sw_desc *sw_desc)
+{
+	dma_pool_free(chan->desc_pool, sw_desc, sw_desc->dma_addr);
+}
+
+/**
+ * xilinx_dpdma_chan_dump_tx_desc - Dump a tx descriptor
+ * @chan: DPDMA channel
+ * @tx_desc: tx descriptor to dump
+ *
+ * Dump contents of a tx descriptor
+ */
+static void xilinx_dpdma_chan_dump_tx_desc(struct xilinx_dpdma_chan *chan,
+					   struct xilinx_dpdma_tx_desc *tx_desc)
+{
+	struct xilinx_dpdma_sw_desc *sw_desc;
+	struct device *dev = chan->xdev->dev;
+	unsigned int i = 0;
+
+	dev_dbg(dev, "------- TX descriptor dump start -------\n");
+	dev_dbg(dev, "------- channel ID = %d -------\n", chan->id);
+
+	list_for_each_entry(sw_desc, &tx_desc->descriptors, node) {
+		struct xilinx_dpdma_hw_desc *hw_desc = &sw_desc->hw;
+
+		dev_dbg(dev, "------- HW descriptor %d -------\n", i++);
+		dev_dbg(dev, "descriptor DMA addr: %pad\n", &sw_desc->dma_addr);
+		dev_dbg(dev, "control: 0x%08x\n", hw_desc->control);
+		dev_dbg(dev, "desc_id: 0x%08x\n", hw_desc->desc_id);
+		dev_dbg(dev, "xfer_size: 0x%08x\n", hw_desc->xfer_size);
+		dev_dbg(dev, "hsize_stride: 0x%08x\n", hw_desc->hsize_stride);
+		dev_dbg(dev, "timestamp_lsb: 0x%08x\n", hw_desc->timestamp_lsb);
+		dev_dbg(dev, "timestamp_msb: 0x%08x\n", hw_desc->timestamp_msb);
+		dev_dbg(dev, "addr_ext: 0x%08x\n", hw_desc->addr_ext);
+		dev_dbg(dev, "next_desc: 0x%08x\n", hw_desc->next_desc);
+		dev_dbg(dev, "src_addr: 0x%08x\n", hw_desc->src_addr);
+		dev_dbg(dev, "addr_ext_23: 0x%08x\n", hw_desc->addr_ext_23);
+		dev_dbg(dev, "addr_ext_45: 0x%08x\n", hw_desc->addr_ext_45);
+		dev_dbg(dev, "src_addr2: 0x%08x\n", hw_desc->src_addr2);
+		dev_dbg(dev, "src_addr3: 0x%08x\n", hw_desc->src_addr3);
+		dev_dbg(dev, "src_addr4: 0x%08x\n", hw_desc->src_addr4);
+		dev_dbg(dev, "src_addr5: 0x%08x\n", hw_desc->src_addr5);
+		dev_dbg(dev, "crc: 0x%08x\n", hw_desc->crc);
+	}
+
+	dev_dbg(dev, "------- TX descriptor dump end -------\n");
+}
+
+/**
+ * xilinx_dpdma_chan_alloc_tx_desc - Allocate a transaction descriptor
+ * @chan: DPDMA channel
+ *
+ * Allocate a tx descriptor.
+ *
+ * Return: a tx descriptor or NULL.
+ */
+static struct xilinx_dpdma_tx_desc *
+xilinx_dpdma_chan_alloc_tx_desc(struct xilinx_dpdma_chan *chan)
+{
+	struct xilinx_dpdma_tx_desc *tx_desc;
+
+	tx_desc = kzalloc(sizeof(*tx_desc), GFP_KERNEL);
+	if (!tx_desc)
+		return NULL;
+
+	INIT_LIST_HEAD(&tx_desc->descriptors);
+	tx_desc->chan = chan;
+	tx_desc->error = false;
+
+	return tx_desc;
+}
+
+/**
+ * xilinx_dpdma_chan_free_tx_desc - Free a virtual DMA descriptor
+ * @vdesc: virtual DMA descriptor
+ *
+ * Free the virtual DMA descriptor @vdesc including its software descriptors.
+ */
+static void xilinx_dpdma_chan_free_tx_desc(struct virt_dma_desc *vdesc)
+{
+	struct xilinx_dpdma_sw_desc *sw_desc, *next;
+	struct xilinx_dpdma_tx_desc *desc;
+
+	if (!vdesc)
+		return;
+
+	desc = to_dpdma_tx_desc(vdesc);
+
+	list_for_each_entry_safe(sw_desc, next, &desc->descriptors, node) {
+		list_del(&sw_desc->node);
+		xilinx_dpdma_chan_free_sw_desc(desc->chan, sw_desc);
+	}
+
+	kfree(desc);
+}
+
+/**
+ * xilinx_dpdma_chan_prep_interleaved_dma - Prepare an interleaved dma
+ *					    descriptor
+ * @chan: DPDMA channel
+ * @xt: dma interleaved template
+ *
+ * Prepare a tx descriptor including internal software/hardware descriptors
+ * based on @xt.
+ *
+ * Return: A DPDMA TX descriptor on success, or NULL.
+ */
+static struct xilinx_dpdma_tx_desc *
+xilinx_dpdma_chan_prep_interleaved_dma(struct xilinx_dpdma_chan *chan,
+				       struct dma_interleaved_template *xt)
+{
+	struct xilinx_dpdma_tx_desc *tx_desc;
+	struct xilinx_dpdma_sw_desc *sw_desc;
+	struct xilinx_dpdma_hw_desc *hw_desc;
+	size_t hsize = xt->sgl[0].size;
+	size_t stride = hsize + xt->sgl[0].icg;
+
+	if (!IS_ALIGNED(xt->src_start, XILINX_DPDMA_ALIGN_BYTES)) {
+		dev_err(chan->xdev->dev, "buffer should be aligned at %d B\n",
+			XILINX_DPDMA_ALIGN_BYTES);
+		return NULL;
+	}
+
+	tx_desc = xilinx_dpdma_chan_alloc_tx_desc(chan);
+	if (!tx_desc)
+		return NULL;
+
+	sw_desc = xilinx_dpdma_chan_alloc_sw_desc(chan);
+	if (!sw_desc) {
+		xilinx_dpdma_chan_free_tx_desc(&tx_desc->vdesc);
+		return NULL;
+	}
+
+	xilinx_dpdma_sw_desc_set_dma_addrs(chan->xdev, sw_desc, sw_desc,
+					   &xt->src_start, 1);
+
+	hw_desc = &sw_desc->hw;
+	hsize = ALIGN(hsize, XILINX_DPDMA_LINESIZE_ALIGN_BITS / 8);
+	hw_desc->xfer_size = hsize * xt->numf;
+	hw_desc->hsize_stride =
+		FIELD_PREP(XILINX_DPDMA_DESC_HSIZE_STRIDE_HSIZE_MASK, hsize) |
+		FIELD_PREP(XILINX_DPDMA_DESC_HSIZE_STRIDE_STRIDE_MASK,
+			   stride / 16);
+	hw_desc->control |= XILINX_DPDMA_DESC_CONTROL_PREEMBLE;
+	hw_desc->control |= XILINX_DPDMA_DESC_CONTROL_COMPLETE_INTR;
+	hw_desc->control |= XILINX_DPDMA_DESC_CONTROL_IGNORE_DONE;
+	hw_desc->control |= XILINX_DPDMA_DESC_CONTROL_LAST_OF_FRAME;
+
+	list_add_tail(&sw_desc->node, &tx_desc->descriptors);
+
+	return tx_desc;
+}
+
+/* -----------------------------------------------------------------------------
+ * DPDMA Channel Operations
+ */
+
+/**
+ * xilinx_dpdma_chan_enable - Enable the channel
+ * @chan: DPDMA channel
+ *
+ * Enable the channel and its interrupts. Set the QoS values for video class.
+ */
+static void xilinx_dpdma_chan_enable(struct xilinx_dpdma_chan *chan)
+{
+	u32 reg;
+
+	reg = (XILINX_DPDMA_INTR_CHAN_MASK << chan->id)
+	    | XILINX_DPDMA_INTR_GLOBAL_MASK;
+	dpdma_write(chan->xdev->reg, XILINX_DPDMA_IEN, reg);
+	reg = (XILINX_DPDMA_EINTR_CHAN_ERR_MASK << chan->id)
+	    | XILINX_DPDMA_INTR_GLOBAL_ERR;
+	dpdma_write(chan->xdev->reg, XILINX_DPDMA_EIEN, reg);
+
+	reg = XILINX_DPDMA_CH_CNTL_ENABLE
+	    | FIELD_PREP(XILINX_DPDMA_CH_CNTL_QOS_DSCR_WR_MASK,
+			 XILINX_DPDMA_CH_CNTL_QOS_VID_CLASS)
+	    | FIELD_PREP(XILINX_DPDMA_CH_CNTL_QOS_DSCR_RD_MASK,
+			 XILINX_DPDMA_CH_CNTL_QOS_VID_CLASS)
+	    | FIELD_PREP(XILINX_DPDMA_CH_CNTL_QOS_DATA_RD_MASK,
+			 XILINX_DPDMA_CH_CNTL_QOS_VID_CLASS);
+	dpdma_set(chan->reg, XILINX_DPDMA_CH_CNTL, reg);
+}
+
+/**
+ * xilinx_dpdma_chan_disable - Disable the channel
+ * @chan: DPDMA channel
+ *
+ * Disable the channel and its interrupts.
+ */
+static void xilinx_dpdma_chan_disable(struct xilinx_dpdma_chan *chan)
+{
+	u32 reg;
+
+	reg = XILINX_DPDMA_INTR_CHAN_MASK << chan->id;
+	dpdma_write(chan->xdev->reg, XILINX_DPDMA_IEN, reg);
+	reg = XILINX_DPDMA_EINTR_CHAN_ERR_MASK << chan->id;
+	dpdma_write(chan->xdev->reg, XILINX_DPDMA_EIEN, reg);
+
+	dpdma_clr(chan->reg, XILINX_DPDMA_CH_CNTL, XILINX_DPDMA_CH_CNTL_ENABLE);
+}
+
+/**
+ * xilinx_dpdma_chan_pause - Pause the channel
+ * @chan: DPDMA channel
+ *
+ * Pause the channel.
+ */
+static void xilinx_dpdma_chan_pause(struct xilinx_dpdma_chan *chan)
+{
+	dpdma_set(chan->reg, XILINX_DPDMA_CH_CNTL, XILINX_DPDMA_CH_CNTL_PAUSE);
+}
+
+/**
+ * xilinx_dpdma_chan_unpause - Unpause the channel
+ * @chan: DPDMA channel
+ *
+ * Unpause the channel.
+ */
+static void xilinx_dpdma_chan_unpause(struct xilinx_dpdma_chan *chan)
+{
+	dpdma_clr(chan->reg, XILINX_DPDMA_CH_CNTL, XILINX_DPDMA_CH_CNTL_PAUSE);
+}
+
+static u32 xilinx_dpdma_chan_video_group_ready(struct xilinx_dpdma_chan *chan)
+{
+	struct xilinx_dpdma_device *xdev = chan->xdev;
+	u32 channels = 0;
+	unsigned int i;
+
+	for (i = ZYNQMP_DPDMA_VIDEO0; i <= ZYNQMP_DPDMA_VIDEO2; i++) {
+		if (xdev->chan[i]->video_group && !xdev->chan[i]->running)
+			return 0;
+
+		if (xdev->chan[i]->video_group)
+			channels |= BIT(i);
+	}
+
+	return channels;
+}
+
+/**
+ * xilinx_dpdma_chan_queue_transfer - Queue the next transfer
+ * @chan: DPDMA channel
+ *
+ * Queue the next descriptor, if any, to the hardware. If the channel is
+ * stopped, start it first. Otherwise retrigger it with the next descriptor.
+ */
+static void xilinx_dpdma_chan_queue_transfer(struct xilinx_dpdma_chan *chan)
+{
+	struct xilinx_dpdma_device *xdev = chan->xdev;
+	struct xilinx_dpdma_sw_desc *sw_desc;
+	struct xilinx_dpdma_tx_desc *desc;
+	struct virt_dma_desc *vdesc;
+	u32 reg, channels;
+
+	lockdep_assert_held(&chan->lock);
+
+	if (chan->desc.pending)
+		return;
+
+	if (!chan->running) {
+		xilinx_dpdma_chan_unpause(chan);
+		xilinx_dpdma_chan_enable(chan);
+		chan->first_frame = true;
+		chan->running = true;
+	}
+
+	if (chan->video_group)
+		channels = xilinx_dpdma_chan_video_group_ready(chan);
+	else
+		channels = BIT(chan->id);
+
+	if (!channels)
+		return;
+
+	vdesc = vchan_next_desc(&chan->vchan);
+	if (!vdesc)
+		return;
+
+	desc = to_dpdma_tx_desc(vdesc);
+	chan->desc.pending = desc;
+	list_del(&desc->vdesc.node);
+
+	/*
+	 * Assign the cookie to descriptors in this transaction. Only 16 bit
+	 * will be used, but it should be enough.
+	 */
+	list_for_each_entry(sw_desc, &desc->descriptors, node)
+		sw_desc->hw.desc_id = desc->vdesc.tx.cookie;
+
+	sw_desc = list_first_entry(&desc->descriptors,
+				   struct xilinx_dpdma_sw_desc, node);
+	dpdma_write(chan->reg, XILINX_DPDMA_CH_DESC_START_ADDR,
+		    lower_32_bits(sw_desc->dma_addr));
+	if (xdev->ext_addr)
+		dpdma_write(chan->reg, XILINX_DPDMA_CH_DESC_START_ADDRE,
+			    FIELD_PREP(XILINX_DPDMA_CH_DESC_START_ADDRE_MASK,
+				       upper_32_bits(sw_desc->dma_addr)));
+
+	if (chan->first_frame)
+		reg = XILINX_DPDMA_GBL_TRIG_MASK(channels);
+	else
+		reg = XILINX_DPDMA_GBL_RETRIG_MASK(channels);
+
+	chan->first_frame = false;
+
+	dpdma_write(xdev->reg, XILINX_DPDMA_GBL, reg);
+}
+
+/**
+ * xilinx_dpdma_chan_ostand - Number of outstanding transactions
+ * @chan: DPDMA channel
+ *
+ * Read and return the number of outstanding transactions from register.
+ *
+ * Return: Number of outstanding transactions from the status register.
+ */
+static u32 xilinx_dpdma_chan_ostand(struct xilinx_dpdma_chan *chan)
+{
+	return FIELD_GET(XILINX_DPDMA_CH_STATUS_OTRAN_CNT_MASK,
+			 dpdma_read(chan->reg, XILINX_DPDMA_CH_STATUS));
+}
+
+/**
+ * xilinx_dpdma_chan_no_ostand - Notify no outstanding transaction event
+ * @chan: DPDMA channel
+ *
+ * Notify waiters of the 'no outstanding transaction' event, so that they can
+ * stop the channel safely. This function is supposed to be called when the
+ * 'no outstanding' interrupt is generated. That interrupt is disabled here and
+ * should be re-enabled when this event is handled. If the channel status
+ * register still shows some number of outstanding transactions, the interrupt
+ * remains enabled.
+ *
+ * Return: 0 on success. On failure, -EWOULDBLOCK if there's still outstanding
+ * transaction(s).
+ */
+static int xilinx_dpdma_chan_notify_no_ostand(struct xilinx_dpdma_chan *chan)
+{
+	u32 cnt;
+
+	cnt = xilinx_dpdma_chan_ostand(chan);
+	if (cnt) {
+		dev_dbg(chan->xdev->dev, "%d outstanding transactions\n", cnt);
+		return -EWOULDBLOCK;
+	}
+
+	/* Disable 'no outstanding' interrupt */
+	dpdma_write(chan->xdev->reg, XILINX_DPDMA_IDS,
+		    XILINX_DPDMA_INTR_NO_OSTAND(chan->id));
+	wake_up(&chan->wait_to_stop);
+
+	return 0;
+}
+
+/**
+ * xilinx_dpdma_chan_wait_no_ostand - Wait for the no outstanding irq
+ * @chan: DPDMA channel
+ *
+ * Wait for the no outstanding transaction interrupt. This function can sleep
+ * for 50ms.
+ *
+ * Return: 0 on success. On failure, -ETIMEDOUT if the wait timed out, or the error code
+ * from wait_event_interruptible_timeout().
+ */
+static int xilinx_dpdma_chan_wait_no_ostand(struct xilinx_dpdma_chan *chan)
+{
+	int ret;
+
+	/* Wait for a no outstanding transaction interrupt for up to 50 ms */
+	ret = wait_event_interruptible_timeout(chan->wait_to_stop,
+					       !xilinx_dpdma_chan_ostand(chan),
+					       msecs_to_jiffies(50));
+	if (ret > 0) {
+		dpdma_write(chan->xdev->reg, XILINX_DPDMA_IEN,
+			    XILINX_DPDMA_INTR_NO_OSTAND(chan->id));
+		return 0;
+	}
+
+	dev_err(chan->xdev->dev, "not ready to stop: %d trans\n",
+		xilinx_dpdma_chan_ostand(chan));
+
+	if (ret == 0)
+		return -ETIMEDOUT;
+
+	return ret;
+}
+
+/**
+ * xilinx_dpdma_chan_poll_no_ostand - Poll the outstanding transaction status
+ * @chan: DPDMA channel
+ *
+ * Poll the outstanding transaction status, and return when there's no
+ * outstanding transaction. This function can be used in the interrupt context
+ * or where atomicity is required. The calling thread may wait more than 50ms.
+ *
+ * Return: 0 on success, or -ETIMEDOUT.
+ */
+static int xilinx_dpdma_chan_poll_no_ostand(struct xilinx_dpdma_chan *chan)
+{
+	u32 cnt, loop = 50000;
+
+	/* Poll at least for 50ms (20 fps). */
+	do {
+		cnt = xilinx_dpdma_chan_ostand(chan);
+		udelay(1);
+	} while (loop-- > 0 && cnt);
+
+	if (loop) {
+		dpdma_write(chan->xdev->reg, XILINX_DPDMA_IEN,
+			    XILINX_DPDMA_INTR_NO_OSTAND(chan->id));
+		return 0;
+	}
+
+	dev_err(chan->xdev->dev, "not ready to stop: %d trans\n",
+		xilinx_dpdma_chan_ostand(chan));
+
+	return -ETIMEDOUT;
+}
+
+/**
+ * xilinx_dpdma_chan_stop - Stop the channel
+ * @chan: DPDMA channel
+ *
+ * Stop a previously paused channel by first waiting for completion of all
+ * outstanding transactions and then disabling the channel.
+ *
+ * Return: 0 on success, or -ETIMEDOUT if the channel failed to stop.
+ */
+static int xilinx_dpdma_chan_stop(struct xilinx_dpdma_chan *chan)
+{
+	unsigned long flags;
+	int ret;
+
+	ret = xilinx_dpdma_chan_wait_no_ostand(chan);
+	if (ret)
+		return ret;
+
+	spin_lock_irqsave(&chan->lock, flags);
+	xilinx_dpdma_chan_disable(chan);
+	chan->running = false;
+	spin_unlock_irqrestore(&chan->lock, flags);
+
+	return 0;
+}
+
+/**
+ * xilinx_dpdma_chan_done_irq - Handle hardware descriptor completion
+ * @chan: DPDMA channel
+ *
+ * Handle completion of the currently active descriptor (@chan->desc.active). As
+ * we currently support cyclic transfers only, this just invokes the cyclic
+ * callback. The descriptor will be completed at the VSYNC interrupt when a new
+ * descriptor replaces it.
+ */
+static void xilinx_dpdma_chan_done_irq(struct xilinx_dpdma_chan *chan)
+{
+	struct xilinx_dpdma_tx_desc *active = chan->desc.active;
+	unsigned long flags;
+
+	spin_lock_irqsave(&chan->lock, flags);
+
+	if (active)
+		vchan_cyclic_callback(&active->vdesc);
+	else
+		dev_warn(chan->xdev->dev,
+			 "DONE IRQ with no active descriptor!\n");
+
+	spin_unlock_irqrestore(&chan->lock, flags);
+}
+
+/**
+ * xilinx_dpdma_chan_vsync_irq - Handle hardware descriptor scheduling
+ * @chan: DPDMA channel
+ *
+ * At VSYNC the active descriptor may have been replaced by the pending
+ * descriptor. Detect this through the DESC_ID and perform appropriate
+ * bookkeeping.
+ */
+static void xilinx_dpdma_chan_vsync_irq(struct xilinx_dpdma_chan *chan)
+{
+	struct xilinx_dpdma_tx_desc *pending;
+	struct xilinx_dpdma_sw_desc *sw_desc;
+	unsigned long flags;
+	u32 desc_id;
+
+	spin_lock_irqsave(&chan->lock, flags);
+
+	pending = chan->desc.pending;
+	if (!chan->running || !pending)
+		goto out;
+
+	desc_id = dpdma_read(chan->reg, XILINX_DPDMA_CH_DESC_ID);
+
+	/* If the retrigger raced with vsync, retry at the next frame. */
+	sw_desc = list_first_entry(&pending->descriptors,
+				   struct xilinx_dpdma_sw_desc, node);
+	if (sw_desc->hw.desc_id != desc_id)
+		goto out;
+
+	/*
+	 * Complete the active descriptor, if any, promote the pending
+	 * descriptor to active, and queue the next transfer, if any.
+	 */
+	if (chan->desc.active)
+		vchan_cookie_complete(&chan->desc.active->vdesc);
+	chan->desc.active = pending;
+	chan->desc.pending = NULL;
+
+	xilinx_dpdma_chan_queue_transfer(chan);
+
+out:
+	spin_unlock_irqrestore(&chan->lock, flags);
+}
+
+/**
+ * xilinx_dpdma_chan_err - Detect any channel error
+ * @chan: DPDMA channel
+ * @isr: masked Interrupt Status Register
+ * @eisr: Error Interrupt Status Register
+ *
+ * Return: true if any channel error occurs, or false otherwise.
+ */
+static bool
+xilinx_dpdma_chan_err(struct xilinx_dpdma_chan *chan, u32 isr, u32 eisr)
+{
+	if (!chan)
+		return false;
+
+	if (chan->running &&
+	    ((isr & (XILINX_DPDMA_INTR_CHAN_ERR_MASK << chan->id)) ||
+	    (eisr & (XILINX_DPDMA_EINTR_CHAN_ERR_MASK << chan->id))))
+		return true;
+
+	return false;
+}
+
+/**
+ * xilinx_dpdma_chan_handle_err - DPDMA channel error handling
+ * @chan: DPDMA channel
+ *
+ * This function is called when any channel error or any global error occurs.
+ * The function disables the channel that was paused by the error, and
+ * determines whether the currently active descriptor can be rescheduled,
+ * depending on the descriptor status.
+ */
+static void xilinx_dpdma_chan_handle_err(struct xilinx_dpdma_chan *chan)
+{
+	struct xilinx_dpdma_device *xdev = chan->xdev;
+	struct xilinx_dpdma_tx_desc *active;
+	unsigned long flags;
+
+	spin_lock_irqsave(&chan->lock, flags);
+
+	dev_dbg(xdev->dev, "cur desc addr = 0x%04x%08x\n",
+		dpdma_read(chan->reg, XILINX_DPDMA_CH_DESC_START_ADDRE),
+		dpdma_read(chan->reg, XILINX_DPDMA_CH_DESC_START_ADDR));
+	dev_dbg(xdev->dev, "cur payload addr = 0x%04x%08x\n",
+		dpdma_read(chan->reg, XILINX_DPDMA_CH_PYLD_CUR_ADDRE),
+		dpdma_read(chan->reg, XILINX_DPDMA_CH_PYLD_CUR_ADDR));
+
+	xilinx_dpdma_chan_disable(chan);
+	chan->running = false;
+
+	if (!chan->desc.active)
+		goto out_unlock;
+
+	active = chan->desc.active;
+	chan->desc.active = NULL;
+
+	xilinx_dpdma_chan_dump_tx_desc(chan, active);
+
+	if (active->error)
+		dev_dbg(xdev->dev, "repeated error on desc\n");
+
+	/* Reschedule if there's no new descriptor */
+	if (!chan->desc.pending &&
+	    list_empty(&chan->vchan.desc_issued)) {
+		active->error = true;
+		list_add_tail(&active->vdesc.node,
+			      &chan->vchan.desc_issued);
+	} else {
+		xilinx_dpdma_chan_free_tx_desc(&active->vdesc);
+	}
+
+out_unlock:
+	spin_unlock_irqrestore(&chan->lock, flags);
+}
+
+/* -----------------------------------------------------------------------------
+ * DMA Engine Operations
+ */
+
+static struct dma_async_tx_descriptor *
+xilinx_dpdma_prep_interleaved_dma(struct dma_chan *dchan,
+				  struct dma_interleaved_template *xt,
+				  unsigned long flags)
+{
+	struct xilinx_dpdma_chan *chan = to_xilinx_chan(dchan);
+	struct xilinx_dpdma_tx_desc *desc;
+
+	if (xt->dir != DMA_MEM_TO_DEV)
+		return NULL;
+
+	if (!xt->numf || !xt->sgl[0].size)
+		return NULL;
+
+	if (!(flags & DMA_PREP_REPEAT))
+		return NULL;
+
+	desc = xilinx_dpdma_chan_prep_interleaved_dma(chan, xt);
+	if (!desc)
+		return NULL;
+
+	vchan_tx_prep(&chan->vchan, &desc->vdesc, flags | DMA_CTRL_ACK);
+
+	return &desc->vdesc.tx;
+}
+
+/**
+ * xilinx_dpdma_alloc_chan_resources - Allocate resources for the channel
+ * @dchan: DMA channel
+ *
+ * Allocate a descriptor pool for the channel.
+ *
+ * Return: 0 on success, or -ENOMEM if failed to allocate a pool.
+ */
+static int xilinx_dpdma_alloc_chan_resources(struct dma_chan *dchan)
+{
+	struct xilinx_dpdma_chan *chan = to_xilinx_chan(dchan);
+	size_t align = __alignof__(struct xilinx_dpdma_sw_desc);
+
+	chan->desc_pool = dma_pool_create(dev_name(chan->xdev->dev),
+					  chan->xdev->dev,
+					  sizeof(struct xilinx_dpdma_sw_desc),
+					  align, 0);
+	if (!chan->desc_pool) {
+		dev_err(chan->xdev->dev,
+			"failed to allocate a descriptor pool\n");
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+/**
+ * xilinx_dpdma_free_chan_resources - Free all resources for the channel
+ * @dchan: DMA channel
+ *
+ * Free resources associated with the virtual DMA channel, and destroy the
+ * descriptor pool.
+ */
+static void xilinx_dpdma_free_chan_resources(struct dma_chan *dchan)
+{
+	struct xilinx_dpdma_chan *chan = to_xilinx_chan(dchan);
+
+	vchan_free_chan_resources(&chan->vchan);
+
+	dma_pool_destroy(chan->desc_pool);
+	chan->desc_pool = NULL;
+}
+
+static void xilinx_dpdma_issue_pending(struct dma_chan *dchan)
+{
+	struct xilinx_dpdma_chan *chan = to_xilinx_chan(dchan);
+	unsigned long flags;
+
+	spin_lock_irqsave(&chan->vchan.lock, flags);
+	if (vchan_issue_pending(&chan->vchan))
+		xilinx_dpdma_chan_queue_transfer(chan);
+	spin_unlock_irqrestore(&chan->vchan.lock, flags);
+}
+
+static int xilinx_dpdma_config(struct dma_chan *dchan,
+			       struct dma_slave_config *config)
+{
+	struct xilinx_dpdma_chan *chan = to_xilinx_chan(dchan);
+	unsigned long flags;
+	int ret = 0;
+
+	if (config->direction != DMA_MEM_TO_DEV)
+		return -EINVAL;
+
+	/*
+	 * The destination address doesn't need to be specified as the DPDMA is
+	 * hardwired to the destination (the DP controller). The transfer
+	 * width, burst size and port window size are thus meaningless, they're
+	 * fixed both on the DPDMA side and on the DP controller side.
+	 */
+
+	spin_lock_irqsave(&chan->lock, flags);
+
+	/* Can't reconfigure a running channel. */
+	if (chan->running) {
+		ret = -EBUSY;
+		goto unlock;
+	}
+
+	/*
+	 * Abuse the slave_id to indicate that the channel is part of a video
+	 * group.
+	 */
+	if (chan->id >= ZYNQMP_DPDMA_VIDEO0 && chan->id <= ZYNQMP_DPDMA_VIDEO2)
+		chan->video_group = config->slave_id != 0;
+
+unlock:
+	spin_unlock_irqrestore(&chan->lock, flags);
+
+	return ret;
+}
+
+static int xilinx_dpdma_pause(struct dma_chan *dchan)
+{
+	xilinx_dpdma_chan_pause(to_xilinx_chan(dchan));
+
+	return 0;
+}
+
+static int xilinx_dpdma_resume(struct dma_chan *dchan)
+{
+	xilinx_dpdma_chan_unpause(to_xilinx_chan(dchan));
+
+	return 0;
+}
+
+/**
+ * xilinx_dpdma_terminate_all - Terminate the channel and descriptors
+ * @dchan: DMA channel
+ *
+ * Pause the channel without waiting for ongoing transfers to complete. Waiting
+ * for completion is performed by xilinx_dpdma_synchronize(), which disables
+ * the channel to complete the stop.
+ *
+ * All the descriptors associated with the channel that are guaranteed not to
+ * be touched by the hardware are freed. The pending and active descriptors are
+ * not touched, and will be freed either upon completion, or by
+ * xilinx_dpdma_synchronize().
+ *
+ * Return: 0 on success, or -ETIMEDOUT if the channel failed to stop.
+ */
+static int xilinx_dpdma_terminate_all(struct dma_chan *dchan)
+{
+	struct xilinx_dpdma_chan *chan = to_xilinx_chan(dchan);
+	struct xilinx_dpdma_device *xdev = chan->xdev;
+	LIST_HEAD(descriptors);
+	unsigned long flags;
+	unsigned int i;
+
+	/* Pause the channel (including the whole video group if applicable). */
+	if (chan->video_group) {
+		for (i = ZYNQMP_DPDMA_VIDEO0; i <= ZYNQMP_DPDMA_VIDEO2; i++) {
+			if (xdev->chan[i]->video_group &&
+			    xdev->chan[i]->running) {
+				xilinx_dpdma_chan_pause(xdev->chan[i]);
+				xdev->chan[i]->video_group = false;
+			}
+		}
+	} else {
+		xilinx_dpdma_chan_pause(chan);
+	}
+
+	/* Gather all the descriptors we can free and free them. */
+	spin_lock_irqsave(&chan->vchan.lock, flags);
+	vchan_get_all_descriptors(&chan->vchan, &descriptors);
+	spin_unlock_irqrestore(&chan->vchan.lock, flags);
+
+	vchan_dma_desc_free_list(&chan->vchan, &descriptors);
+
+	return 0;
+}
+
+/**
+ * xilinx_dpdma_synchronize - Synchronize callback execution
+ * @dchan: DMA channel
+ *
+ * Synchronizing callback execution ensures that all previously issued
+ * transfers have completed and all associated callbacks have been called and
+ * have returned.
+ *
+ * This function waits for the DMA channel to stop. It assumes it has been
+ * paused by a previous call to dmaengine_terminate_async(), and that no new
+ * pending descriptors have been issued with dma_async_issue_pending(). The
+ * behaviour is undefined otherwise.
+ */
+static void xilinx_dpdma_synchronize(struct dma_chan *dchan)
+{
+	struct xilinx_dpdma_chan *chan = to_xilinx_chan(dchan);
+	unsigned long flags;
+
+	xilinx_dpdma_chan_stop(chan);
+
+	spin_lock_irqsave(&chan->vchan.lock, flags);
+	if (chan->desc.pending) {
+		vchan_terminate_vdesc(&chan->desc.pending->vdesc);
+		chan->desc.pending = NULL;
+	}
+	if (chan->desc.active) {
+		vchan_terminate_vdesc(&chan->desc.active->vdesc);
+		chan->desc.active = NULL;
+	}
+	spin_unlock_irqrestore(&chan->vchan.lock, flags);
+
+	vchan_synchronize(&chan->vchan);
+}
+
+/* -----------------------------------------------------------------------------
+ * Interrupt and Tasklet Handling
+ */
+
+/**
+ * xilinx_dpdma_err - Detect any global error
+ * @isr: Interrupt Status Register
+ * @eisr: Error Interrupt Status Register
+ *
+ * Return: True if any global error occurs, or false otherwise.
+ */
+static bool xilinx_dpdma_err(u32 isr, u32 eisr)
+{
+	if (isr & XILINX_DPDMA_INTR_GLOBAL_ERR ||
+	    eisr & XILINX_DPDMA_EINTR_GLOBAL_ERR)
+		return true;
+
+	return false;
+}
+
+/**
+ * xilinx_dpdma_handle_err_irq - Handle DPDMA error interrupt
+ * @xdev: DPDMA device
+ * @isr: masked Interrupt Status Register
+ * @eisr: Error Interrupt Status Register
+ *
+ * Handle errors, if any, based on @isr and @eisr. This function disables the
+ * corresponding error interrupts, and those should be re-enabled once handling
+ * is done.
+ */
+static void xilinx_dpdma_handle_err_irq(struct xilinx_dpdma_device *xdev,
+					u32 isr, u32 eisr)
+{
+	bool err = xilinx_dpdma_err(isr, eisr);
+	unsigned int i;
+
+	dev_dbg_ratelimited(xdev->dev,
+			    "error irq: isr = 0x%08x, eisr = 0x%08x\n",
+			    isr, eisr);
+
+	/* Disable channel error interrupts until errors are handled. */
+	dpdma_write(xdev->reg, XILINX_DPDMA_IDS,
+		    isr & ~XILINX_DPDMA_INTR_GLOBAL_ERR);
+	dpdma_write(xdev->reg, XILINX_DPDMA_EIDS,
+		    eisr & ~XILINX_DPDMA_EINTR_GLOBAL_ERR);
+
+	for (i = 0; i < ARRAY_SIZE(xdev->chan); i++)
+		if (err || xilinx_dpdma_chan_err(xdev->chan[i], isr, eisr))
+			tasklet_schedule(&xdev->chan[i]->err_task);
+}
+
+/**
+ * xilinx_dpdma_enable_irq - Enable interrupts
+ * @xdev: DPDMA device
+ *
+ * Enable interrupts.
+ */
+static void xilinx_dpdma_enable_irq(struct xilinx_dpdma_device *xdev)
+{
+	dpdma_write(xdev->reg, XILINX_DPDMA_IEN, XILINX_DPDMA_INTR_ALL);
+	dpdma_write(xdev->reg, XILINX_DPDMA_EIEN, XILINX_DPDMA_EINTR_ALL);
+}
+
+/**
+ * xilinx_dpdma_disable_irq - Disable interrupts
+ * @xdev: DPDMA device
+ *
+ * Disable interrupts.
+ */
+static void xilinx_dpdma_disable_irq(struct xilinx_dpdma_device *xdev)
+{
+	dpdma_write(xdev->reg, XILINX_DPDMA_IDS, XILINX_DPDMA_INTR_ERR_ALL);
+	dpdma_write(xdev->reg, XILINX_DPDMA_EIDS, XILINX_DPDMA_EINTR_ALL);
+}
+
+/**
+ * xilinx_dpdma_chan_err_task - Per channel tasklet for error handling
+ * @data: tasklet data to be cast to the DPDMA channel structure
+ *
+ * Per-channel error handling tasklet. This function waits for the outstanding
+ * transaction to complete and triggers error handling. After error handling,
+ * it re-enables the channel error interrupts and restarts the channel if
+ * needed.
+ */
+static void xilinx_dpdma_chan_err_task(unsigned long data)
+{
+	struct xilinx_dpdma_chan *chan = (struct xilinx_dpdma_chan *)data;
+	struct xilinx_dpdma_device *xdev = chan->xdev;
+	unsigned long flags;
+
+	/* Proceed with error handling even if polling fails. */
+	xilinx_dpdma_chan_poll_no_ostand(chan);
+
+	xilinx_dpdma_chan_handle_err(chan);
+
+	dpdma_write(xdev->reg, XILINX_DPDMA_IEN,
+		    XILINX_DPDMA_INTR_CHAN_ERR_MASK << chan->id);
+	dpdma_write(xdev->reg, XILINX_DPDMA_EIEN,
+		    XILINX_DPDMA_EINTR_CHAN_ERR_MASK << chan->id);
+
+	spin_lock_irqsave(&chan->lock, flags);
+	xilinx_dpdma_chan_queue_transfer(chan);
+	spin_unlock_irqrestore(&chan->lock, flags);
+}
+
+static irqreturn_t xilinx_dpdma_irq_handler(int irq, void *data)
+{
+	struct xilinx_dpdma_device *xdev = data;
+	unsigned long mask;
+	unsigned int i;
+	u32 status;
+	u32 error;
+
+	status = dpdma_read(xdev->reg, XILINX_DPDMA_ISR);
+	error = dpdma_read(xdev->reg, XILINX_DPDMA_EISR);
+	if (!status && !error)
+		return IRQ_NONE;
+
+	dpdma_write(xdev->reg, XILINX_DPDMA_ISR, status);
+	dpdma_write(xdev->reg, XILINX_DPDMA_EISR, error);
+
+	if (status & XILINX_DPDMA_INTR_VSYNC) {
+		/*
+		 * There's a single VSYNC interrupt that needs to be processed
+		 * by each running channel to update the active descriptor.
+		 */
+		for (i = 0; i < ARRAY_SIZE(xdev->chan); i++) {
+			struct xilinx_dpdma_chan *chan = xdev->chan[i];
+
+			if (chan)
+				xilinx_dpdma_chan_vsync_irq(chan);
+		}
+	}
+
+	mask = FIELD_GET(XILINX_DPDMA_INTR_DESC_DONE_MASK, status);
+	if (mask) {
+		for_each_set_bit(i, &mask, ARRAY_SIZE(xdev->chan))
+			xilinx_dpdma_chan_done_irq(xdev->chan[i]);
+	}
+
+	mask = FIELD_GET(XILINX_DPDMA_INTR_NO_OSTAND_MASK, status);
+	if (mask) {
+		for_each_set_bit(i, &mask, ARRAY_SIZE(xdev->chan))
+			xilinx_dpdma_chan_notify_no_ostand(xdev->chan[i]);
+	}
+
+	mask = status & XILINX_DPDMA_INTR_ERR_ALL;
+	if (mask || error)
+		xilinx_dpdma_handle_err_irq(xdev, mask, error);
+
+	return IRQ_HANDLED;
+}
+
+/* -----------------------------------------------------------------------------
+ * Initialization & Cleanup
+ */
+
+static int xilinx_dpdma_chan_init(struct xilinx_dpdma_device *xdev,
+				  unsigned int chan_id)
+{
+	struct xilinx_dpdma_chan *chan;
+
+	chan = devm_kzalloc(xdev->dev, sizeof(*chan), GFP_KERNEL);
+	if (!chan)
+		return -ENOMEM;
+
+	chan->id = chan_id;
+	chan->reg = xdev->reg + XILINX_DPDMA_CH_BASE
+		  + XILINX_DPDMA_CH_OFFSET * chan->id;
+	chan->running = false;
+	chan->xdev = xdev;
+
+	spin_lock_init(&chan->lock);
+	init_waitqueue_head(&chan->wait_to_stop);
+
+	tasklet_init(&chan->err_task, xilinx_dpdma_chan_err_task,
+		     (unsigned long)chan);
+
+	chan->vchan.desc_free = xilinx_dpdma_chan_free_tx_desc;
+	vchan_init(&chan->vchan, &xdev->common);
+
+	xdev->chan[chan->id] = chan;
+
+	return 0;
+}
+
+static void xilinx_dpdma_chan_remove(struct xilinx_dpdma_chan *chan)
+{
+	if (!chan)
+		return;
+
+	tasklet_kill(&chan->err_task);
+	list_del(&chan->vchan.chan.device_node);
+}
+
+static struct dma_chan *of_dma_xilinx_xlate(struct of_phandle_args *dma_spec,
+					    struct of_dma *ofdma)
+{
+	struct xilinx_dpdma_device *xdev = ofdma->of_dma_data;
+	uint32_t chan_id = dma_spec->args[0];
+
+	if (chan_id >= ARRAY_SIZE(xdev->chan))
+		return NULL;
+
+	if (!xdev->chan[chan_id])
+		return NULL;
+
+	return dma_get_slave_channel(&xdev->chan[chan_id]->vchan.chan);
+}
+
+static int xilinx_dpdma_probe(struct platform_device *pdev)
+{
+	struct xilinx_dpdma_device *xdev;
+	struct dma_device *ddev;
+	unsigned int i;
+	int ret;
+
+	xdev = devm_kzalloc(&pdev->dev, sizeof(*xdev), GFP_KERNEL);
+	if (!xdev)
+		return -ENOMEM;
+
+	xdev->dev = &pdev->dev;
+	xdev->ext_addr = sizeof(dma_addr_t) > 4;
+
+	INIT_LIST_HEAD(&xdev->common.channels);
+
+	platform_set_drvdata(pdev, xdev);
+
+	xdev->axi_clk = devm_clk_get(xdev->dev, "axi_clk");
+	if (IS_ERR(xdev->axi_clk))
+		return PTR_ERR(xdev->axi_clk);
+
+	xdev->reg = devm_platform_ioremap_resource(pdev, 0);
+	if (IS_ERR(xdev->reg))
+		return PTR_ERR(xdev->reg);
+
+	xdev->irq = platform_get_irq(pdev, 0);
+	if (xdev->irq < 0) {
+		dev_err(xdev->dev, "failed to get platform irq\n");
+		return xdev->irq;
+	}
+
+	ret = request_irq(xdev->irq, xilinx_dpdma_irq_handler, IRQF_SHARED,
+			  dev_name(xdev->dev), xdev);
+	if (ret) {
+		dev_err(xdev->dev, "failed to request IRQ\n");
+		return ret;
+	}
+
+	ddev = &xdev->common;
+	ddev->dev = &pdev->dev;
+
+	dma_cap_set(DMA_SLAVE, ddev->cap_mask);
+	dma_cap_set(DMA_PRIVATE, ddev->cap_mask);
+	dma_cap_set(DMA_INTERLEAVE, ddev->cap_mask);
+	dma_cap_set(DMA_REPEAT, ddev->cap_mask);
+	ddev->copy_align = fls(XILINX_DPDMA_ALIGN_BYTES - 1);
+
+	ddev->device_alloc_chan_resources = xilinx_dpdma_alloc_chan_resources;
+	ddev->device_free_chan_resources = xilinx_dpdma_free_chan_resources;
+	ddev->device_prep_interleaved_dma = xilinx_dpdma_prep_interleaved_dma;
+	/* TODO: Can we achieve better granularity? */
+	ddev->device_tx_status = dma_cookie_status;
+	ddev->device_issue_pending = xilinx_dpdma_issue_pending;
+	ddev->device_config = xilinx_dpdma_config;
+	ddev->device_pause = xilinx_dpdma_pause;
+	ddev->device_resume = xilinx_dpdma_resume;
+	ddev->device_terminate_all = xilinx_dpdma_terminate_all;
+	ddev->device_synchronize = xilinx_dpdma_synchronize;
+	ddev->src_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_UNDEFINED);
+	ddev->directions = BIT(DMA_MEM_TO_DEV);
+	ddev->residue_granularity = DMA_RESIDUE_GRANULARITY_DESCRIPTOR;
+
+	for (i = 0; i < ARRAY_SIZE(xdev->chan); ++i) {
+		ret = xilinx_dpdma_chan_init(xdev, i);
+		if (ret < 0) {
+			dev_err(xdev->dev, "failed to initialize channel %u\n",
+				i);
+			goto error;
+		}
+	}
+
+	ret = clk_prepare_enable(xdev->axi_clk);
+	if (ret) {
+		dev_err(xdev->dev, "failed to enable the axi clock\n");
+		goto error;
+	}
+
+	ret = dma_async_device_register(ddev);
+	if (ret) {
+		dev_err(xdev->dev, "failed to register the dma device\n");
+		goto error_dma_async;
+	}
+
+	ret = of_dma_controller_register(xdev->dev->of_node,
+					 of_dma_xilinx_xlate, ddev);
+	if (ret) {
+		dev_err(xdev->dev, "failed to register DMA to DT DMA helper\n");
+		goto error_of_dma;
+	}
+
+	xilinx_dpdma_enable_irq(xdev);
+
+	dev_info(&pdev->dev, "Xilinx DPDMA engine is probed\n");
+
+	return 0;
+
+error_of_dma:
+	dma_async_device_unregister(ddev);
+error_dma_async:
+	clk_disable_unprepare(xdev->axi_clk);
+error:
+	for (i = 0; i < ARRAY_SIZE(xdev->chan); i++)
+		xilinx_dpdma_chan_remove(xdev->chan[i]);
+
+	free_irq(xdev->irq, xdev);
+
+	return ret;
+}
+
+static int xilinx_dpdma_remove(struct platform_device *pdev)
+{
+	struct xilinx_dpdma_device *xdev = platform_get_drvdata(pdev);
+	unsigned int i;
+
+	/* Start by disabling the IRQ to avoid races during cleanup. */
+	free_irq(xdev->irq, xdev);
+
+	xilinx_dpdma_disable_irq(xdev);
+	of_dma_controller_free(pdev->dev.of_node);
+	dma_async_device_unregister(&xdev->common);
+	clk_disable_unprepare(xdev->axi_clk);
+
+	for (i = 0; i < ARRAY_SIZE(xdev->chan); i++)
+		xilinx_dpdma_chan_remove(xdev->chan[i]);
+
+	return 0;
+}
+
+static const struct of_device_id xilinx_dpdma_of_match[] = {
+	{ .compatible = "xlnx,zynqmp-dpdma",},
+	{ /* end of table */ },
+};
+MODULE_DEVICE_TABLE(of, xilinx_dpdma_of_match);
+
+static struct platform_driver xilinx_dpdma_driver = {
+	.probe			= xilinx_dpdma_probe,
+	.remove			= xilinx_dpdma_remove,
+	.driver			= {
+		.name		= "xilinx-zynqmp-dpdma",
+		.of_match_table	= xilinx_dpdma_of_match,
+	},
+};
+
+module_platform_driver(xilinx_dpdma_driver);
+
+MODULE_AUTHOR("Xilinx, Inc.");
+MODULE_DESCRIPTION("Xilinx ZynqMP DPDMA driver");
+MODULE_LICENSE("GPL v2");
-- 
Regards,

Laurent Pinchart


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v4 5/6] dmaengine: xilinx: dpdma: Add debugfs support
  2020-05-13 16:59 [PATCH v4 0/6] dma: Add Xilinx ZynqMP DPDMA driver Laurent Pinchart
                   ` (3 preceding siblings ...)
  2020-05-13 16:59 ` [PATCH v4 4/6] dmaengine: xilinx: dpdma: Add the Xilinx DisplayPort DMA engine driver Laurent Pinchart
@ 2020-05-13 16:59 ` Laurent Pinchart
  2020-05-13 16:59 ` [PATCH v4 6/6] arm64: dts: zynqmp: Add DPDMA node Laurent Pinchart
  5 siblings, 0 replies; 23+ messages in thread
From: Laurent Pinchart @ 2020-05-13 16:59 UTC (permalink / raw)
  To: dmaengine
  Cc: Michal Simek, Hyun Kwon, Tejas Upadhyay, Satish Kumar Nagireddy,
	Vinod Koul, Peter Ujfalusi

Expose statistics through debugfs when debugfs support is enabled. This
helps debugging issues with the DPDMA driver.
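
For reference, the interface can be exercised from userspace roughly as
follows. This is an illustrative sketch only, not part of the driver: it
assumes debugfs is mounted at /sys/kernel/debug, that channel 0 is the
VIDEO0 channel, and that the caller has write access to the file (e.g.
running as root).

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	static const char cmd[] = "DESCRIPTOR_DONE_INTR start 0";
	char buf[32] = "";
	int fd;

	/* Arm the DESCRIPTOR_DONE_INTR testcase on channel 0. */
	fd = open("/sys/kernel/debug/dpdma/testcase", O_WRONLY);
	if (fd < 0)
		return 1;
	write(fd, cmd, sizeof(cmd));
	close(fd);

	/* ... let the display output run for a while ... */

	/* Read back the number of done interrupts counted so far. */
	fd = open("/sys/kernel/debug/dpdma/testcase", O_RDONLY);
	if (fd < 0)
		return 1;
	read(fd, buf, sizeof(buf) - 1);
	close(fd);

	printf("done interrupts: %s\n", buf);
	return 0;
}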

Signed-off-by: Hyun Kwon <hyun.kwon@xilinx.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
Changes since v3:

- Return -EFAULT instead of bytes remaining after copy_to_user()

Changes since v2:

- Refactor debugfs code
---
 drivers/dma/xilinx/xilinx_dpdma.c | 228 ++++++++++++++++++++++++++++++
 1 file changed, 228 insertions(+)

diff --git a/drivers/dma/xilinx/xilinx_dpdma.c b/drivers/dma/xilinx/xilinx_dpdma.c
index de28c63afc7d..77febfe54cfb 100644
--- a/drivers/dma/xilinx/xilinx_dpdma.c
+++ b/drivers/dma/xilinx/xilinx_dpdma.c
@@ -10,6 +10,7 @@
 #include <linux/bitfield.h>
 #include <linux/bits.h>
 #include <linux/clk.h>
+#include <linux/debugfs.h>
 #include <linux/delay.h>
 #include <linux/dmaengine.h>
 #include <linux/dmapool.h>
@@ -265,6 +266,229 @@ struct xilinx_dpdma_device {
 	bool ext_addr;
 };
 
+/* -----------------------------------------------------------------------------
+ * DebugFS
+ */
+
+#ifdef CONFIG_DEBUG_FS
+
+#define XILINX_DPDMA_DEBUGFS_READ_MAX_SIZE	32
+#define XILINX_DPDMA_DEBUGFS_UINT16_MAX_STR	"65535"
+
+/* Match xilinx_dpdma_testcases vs dpdma_debugfs_reqs[] entry */
+enum xilinx_dpdma_testcases {
+	DPDMA_TC_INTR_DONE,
+	DPDMA_TC_NONE
+};
+
+struct xilinx_dpdma_debugfs {
+	enum xilinx_dpdma_testcases testcase;
+	u16 xilinx_dpdma_irq_done_count;
+	unsigned int chan_id;
+};
+
+static struct xilinx_dpdma_debugfs dpdma_debugfs;
+struct xilinx_dpdma_debugfs_request {
+	const char *name;
+	enum xilinx_dpdma_testcases tc;
+	ssize_t (*read)(char *buf);
+	int (*write)(char *args);
+};
+
+static void xilinx_dpdma_debugfs_desc_done_irq(struct xilinx_dpdma_chan *chan)
+{
+	if (chan->id == dpdma_debugfs.chan_id)
+		dpdma_debugfs.xilinx_dpdma_irq_done_count++;
+}
+
+static ssize_t xilinx_dpdma_debugfs_desc_done_irq_read(char *buf)
+{
+	size_t out_str_len;
+
+	dpdma_debugfs.testcase = DPDMA_TC_NONE;
+
+	out_str_len = strlen(XILINX_DPDMA_DEBUGFS_UINT16_MAX_STR);
+	out_str_len = min_t(size_t, XILINX_DPDMA_DEBUGFS_READ_MAX_SIZE,
+			    out_str_len);
+	snprintf(buf, out_str_len, "%d",
+		 dpdma_debugfs.xilinx_dpdma_irq_done_count);
+
+	return 0;
+}
+
+static int xilinx_dpdma_debugfs_desc_done_irq_write(char *args)
+{
+	char *arg;
+	int ret;
+	u32 id;
+
+	arg = strsep(&args, " ");
+	if (!arg || strncasecmp(arg, "start", 5))
+		return -EINVAL;
+
+	arg = strsep(&args, " ");
+	if (!arg)
+		return -EINVAL;
+
+	ret = kstrtou32(arg, 0, &id);
+	if (ret < 0)
+		return ret;
+
+	if (id < ZYNQMP_DPDMA_VIDEO0 || id > ZYNQMP_DPDMA_AUDIO1)
+		return -EINVAL;
+
+	dpdma_debugfs.testcase = DPDMA_TC_INTR_DONE;
+	dpdma_debugfs.xilinx_dpdma_irq_done_count = 0;
+	dpdma_debugfs.chan_id = id;
+
+	return 0;
+}
+
+/* Match xilinx_dpdma_testcases vs dpdma_debugfs_reqs[] entry */
+struct xilinx_dpdma_debugfs_request dpdma_debugfs_reqs[] = {
+	{
+		.name = "DESCRIPTOR_DONE_INTR",
+		.tc = DPDMA_TC_INTR_DONE,
+		.read = xilinx_dpdma_debugfs_desc_done_irq_read,
+		.write = xilinx_dpdma_debugfs_desc_done_irq_write,
+	},
+};
+
+static ssize_t xilinx_dpdma_debugfs_read(struct file *f, char __user *buf,
+					 size_t size, loff_t *pos)
+{
+	enum xilinx_dpdma_testcases testcase;
+	char *kern_buff;
+	int ret = 0;
+
+	if (*pos != 0 || size <= 0)
+		return -EINVAL;
+
+	kern_buff = kzalloc(XILINX_DPDMA_DEBUGFS_READ_MAX_SIZE, GFP_KERNEL);
+	if (!kern_buff) {
+		dpdma_debugfs.testcase = DPDMA_TC_NONE;
+		return -ENOMEM;
+	}
+
+	testcase = READ_ONCE(dpdma_debugfs.testcase);
+	if (testcase != DPDMA_TC_NONE) {
+		ret = dpdma_debugfs_reqs[testcase].read(kern_buff);
+		if (ret < 0)
+			goto done;
+	} else {
+		strlcpy(kern_buff, "No testcase executed",
+			XILINX_DPDMA_DEBUGFS_READ_MAX_SIZE);
+	}
+
+	size = min(size, strlen(kern_buff));
+	if (copy_to_user(buf, kern_buff, size))
+		ret = -EFAULT;
+
+done:
+	kfree(kern_buff);
+	if (ret)
+		return ret;
+
+	*pos = size + 1;
+	return size;
+}
+
+static ssize_t xilinx_dpdma_debugfs_write(struct file *f,
+					  const char __user *buf, size_t size,
+					  loff_t *pos)
+{
+	char *kern_buff, *kern_buff_start;
+	char *testcase;
+	unsigned int i;
+	int ret;
+
+	if (*pos != 0 || size <= 0)
+		return -EINVAL;
+
+	/* Supporting single instance of test as of now. */
+	if (dpdma_debugfs.testcase != DPDMA_TC_NONE)
+		return -EBUSY;
+
+	kern_buff = kzalloc(size, GFP_KERNEL);
+	if (!kern_buff)
+		return -ENOMEM;
+	kern_buff_start = kern_buff;
+
+	ret = strncpy_from_user(kern_buff, buf, size);
+	if (ret < 0)
+		goto done;
+
+	/* Read the testcase name from a user request. */
+	testcase = strsep(&kern_buff, " ");
+
+	for (i = 0; i < ARRAY_SIZE(dpdma_debugfs_reqs); i++) {
+		if (!strcasecmp(testcase, dpdma_debugfs_reqs[i].name))
+			break;
+	}
+
+	if (i == ARRAY_SIZE(dpdma_debugfs_reqs)) {
+		ret = -EINVAL;
+		goto done;
+	}
+
+	ret = dpdma_debugfs_reqs[i].write(kern_buff);
+	if (ret < 0)
+		goto done;
+
+	ret = size;
+
+done:
+	kfree(kern_buff_start);
+	return ret;
+}
+
+static const struct file_operations fops_xilinx_dpdma_dbgfs = {
+	.owner = THIS_MODULE,
+	.read = xilinx_dpdma_debugfs_read,
+	.write = xilinx_dpdma_debugfs_write,
+};
+
+static int xilinx_dpdma_debugfs_init(struct device *dev)
+{
+	int err;
+	struct dentry *xilinx_dpdma_debugfs_dir, *xilinx_dpdma_debugfs_file;
+
+	dpdma_debugfs.testcase = DPDMA_TC_NONE;
+
+	xilinx_dpdma_debugfs_dir = debugfs_create_dir("dpdma", NULL);
+	if (!xilinx_dpdma_debugfs_dir) {
+		dev_err(dev, "debugfs_create_dir failed\n");
+		return -ENODEV;
+	}
+
+	xilinx_dpdma_debugfs_file =
+		debugfs_create_file("testcase", 0444,
+				    xilinx_dpdma_debugfs_dir, NULL,
+				    &fops_xilinx_dpdma_dbgfs);
+	if (!xilinx_dpdma_debugfs_file) {
+		dev_err(dev, "debugfs_create_file testcase failed\n");
+		err = -ENODEV;
+		goto err_dbgfs;
+	}
+	return 0;
+
+err_dbgfs:
+	debugfs_remove_recursive(xilinx_dpdma_debugfs_dir);
+	xilinx_dpdma_debugfs_dir = NULL;
+	return err;
+}
+
+#else
+static int xilinx_dpdma_debugfs_init(struct device *dev)
+{
+	return 0;
+}
+
+static void xilinx_dpdma_debugfs_desc_done_irq(struct xilinx_dpdma_chan *chan)
+{
+}
+#endif /* CONFIG_DEBUG_FS */
+
 /* -----------------------------------------------------------------------------
  * I/O Accessors
  */
@@ -840,6 +1064,8 @@ static void xilinx_dpdma_chan_done_irq(struct xilinx_dpdma_chan *chan)
 
 	spin_lock_irqsave(&chan->lock, flags);
 
+	xilinx_dpdma_debugfs_desc_done_irq(chan);
+
 	if (active)
 		vchan_cyclic_callback(&active->vdesc);
 	else
@@ -1485,6 +1711,8 @@ static int xilinx_dpdma_probe(struct platform_device *pdev)
 
 	xilinx_dpdma_enable_irq(xdev);
 
+	xilinx_dpdma_debugfs_init(&pdev->dev);
+
 	dev_info(&pdev->dev, "Xilinx DPDMA engine is probed\n");
 
 	return 0;
-- 
Regards,

Laurent Pinchart


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v4 6/6] arm64: dts: zynqmp: Add DPDMA node
  2020-05-13 16:59 [PATCH v4 0/6] dma: Add Xilinx ZynqMP DPDMA driver Laurent Pinchart
                   ` (4 preceding siblings ...)
  2020-05-13 16:59 ` [PATCH v4 5/6] dmaengine: xilinx: dpdma: Add debugfs support Laurent Pinchart
@ 2020-05-13 16:59 ` Laurent Pinchart
  2020-05-14  5:56   ` Michal Simek
  5 siblings, 1 reply; 23+ messages in thread
From: Laurent Pinchart @ 2020-05-13 16:59 UTC (permalink / raw)
  To: dmaengine
  Cc: Michal Simek, Hyun Kwon, Tejas Upadhyay, Satish Kumar Nagireddy,
	Vinod Koul, Peter Ujfalusi

Add a DT node for the DisplayPort DMA engine (DPDMA).

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
 arch/arm64/boot/dts/xilinx/zynqmp-clk-ccf.dtsi |  4 ++++
 arch/arm64/boot/dts/xilinx/zynqmp.dtsi         | 10 ++++++++++
 2 files changed, 14 insertions(+)

diff --git a/arch/arm64/boot/dts/xilinx/zynqmp-clk-ccf.dtsi b/arch/arm64/boot/dts/xilinx/zynqmp-clk-ccf.dtsi
index 9868ca15dfc5..32c4914738d9 100644
--- a/arch/arm64/boot/dts/xilinx/zynqmp-clk-ccf.dtsi
+++ b/arch/arm64/boot/dts/xilinx/zynqmp-clk-ccf.dtsi
@@ -57,6 +57,10 @@ &cpu0 {
 	clocks = <&zynqmp_clk ACPU>;
 };
 
+&dpdma {
+	clocks = <&zynqmp_clk DPDMA_REF>;
+};
+
 &fpd_dma_chan1 {
 	clocks = <&zynqmp_clk GDMA_REF>, <&zynqmp_clk LPD_LSBUS>;
 };
diff --git a/arch/arm64/boot/dts/xilinx/zynqmp.dtsi b/arch/arm64/boot/dts/xilinx/zynqmp.dtsi
index 26d926eb1431..2e284eb8d3c1 100644
--- a/arch/arm64/boot/dts/xilinx/zynqmp.dtsi
+++ b/arch/arm64/boot/dts/xilinx/zynqmp.dtsi
@@ -246,6 +246,16 @@ pmu@9000 {
 			};
 		};
 
+		dpdma: dma-controller@fd4c0000 {
+			compatible = "xlnx,zynqmp-dpdma";
+			status = "disabled";
+			reg = <0x0 0xfd4c0000 0x0 0x1000>;
+			interrupts = <0 122 4>;
+			interrupt-parent = <&gic>;
+			clock-names = "axi_clk";
+			#dma-cells = <1>;
+		};
+
 		/* GDMA */
 		fpd_dma_chan1: dma@fd500000 {
 			status = "disabled";
-- 
Regards,

Laurent Pinchart


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 6/6] arm64: dts: zynqmp: Add DPDMA node
  2020-05-13 16:59 ` [PATCH v4 6/6] arm64: dts: zynqmp: Add DPDMA node Laurent Pinchart
@ 2020-05-14  5:56   ` Michal Simek
  2020-05-28  2:49     ` Laurent Pinchart
  0 siblings, 1 reply; 23+ messages in thread
From: Michal Simek @ 2020-05-14  5:56 UTC (permalink / raw)
  To: Laurent Pinchart, dmaengine
  Cc: Michal Simek, Hyun Kwon, Tejas Upadhyay, Satish Kumar Nagireddy,
	Vinod Koul, Peter Ujfalusi

On 13. 05. 20 18:59, Laurent Pinchart wrote:
> Add a DT node for the DisplayPort DMA engine (DPDMA).
> 
> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> ---
>  arch/arm64/boot/dts/xilinx/zynqmp-clk-ccf.dtsi |  4 ++++
>  arch/arm64/boot/dts/xilinx/zynqmp.dtsi         | 10 ++++++++++
>  2 files changed, 14 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/xilinx/zynqmp-clk-ccf.dtsi b/arch/arm64/boot/dts/xilinx/zynqmp-clk-ccf.dtsi
> index 9868ca15dfc5..32c4914738d9 100644
> --- a/arch/arm64/boot/dts/xilinx/zynqmp-clk-ccf.dtsi
> +++ b/arch/arm64/boot/dts/xilinx/zynqmp-clk-ccf.dtsi
> @@ -57,6 +57,10 @@ &cpu0 {
>  	clocks = <&zynqmp_clk ACPU>;
>  };
>  
> +&dpdma {
> +	clocks = <&zynqmp_clk DPDMA_REF>;
> +};
> +
>  &fpd_dma_chan1 {
>  	clocks = <&zynqmp_clk GDMA_REF>, <&zynqmp_clk LPD_LSBUS>;
>  };
> diff --git a/arch/arm64/boot/dts/xilinx/zynqmp.dtsi b/arch/arm64/boot/dts/xilinx/zynqmp.dtsi
> index 26d926eb1431..2e284eb8d3c1 100644
> --- a/arch/arm64/boot/dts/xilinx/zynqmp.dtsi
> +++ b/arch/arm64/boot/dts/xilinx/zynqmp.dtsi
> @@ -246,6 +246,16 @@ pmu@9000 {
>  			};
>  		};
>  
> +		dpdma: dma-controller@fd4c0000 {
> +			compatible = "xlnx,zynqmp-dpdma";
> +			status = "disabled";
> +			reg = <0x0 0xfd4c0000 0x0 0x1000>;
> +			interrupts = <0 122 4>;
> +			interrupt-parent = <&gic>;
> +			clock-names = "axi_clk";
> +			#dma-cells = <1>;
> +		};
> +
>  		/* GDMA */
>  		fpd_dma_chan1: dma@fd500000 {
>  			status = "disabled";
> 

Acked-by: Michal Simek <michal.simek@xilinx.com>

Feel free to take it with this series. Or let me know if you want me to
take it via my soc tree.

Thanks,
Michal

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 3/6] dmaengine: Add support for repeating transactions
  2020-05-13 16:59 ` [PATCH v4 3/6] dmaengine: Add support for repeating transactions Laurent Pinchart
@ 2020-05-14 18:23   ` Vinod Koul
  2020-05-14 20:07     ` Laurent Pinchart
  0 siblings, 1 reply; 23+ messages in thread
From: Vinod Koul @ 2020-05-14 18:23 UTC (permalink / raw)
  To: Laurent Pinchart, Peter Ujfalusi
  Cc: dmaengine, Michal Simek, Hyun Kwon, Tejas Upadhyay,
	Satish Kumar Nagireddy

Hi Laurent, Peter,

On 13-05-20, 19:59, Laurent Pinchart wrote:
> DMA engines used with displays perform 2D interleaved transfers to read
> framebuffers from memory and feed the data to the display engine. As the
> same framebuffer can be displayed for multiple frames, the DMA
> transactions need to be repeated until a new framebuffer replaces the
> current one. This feature is implemented natively by some DMA engines
> that have the ability to repeat transactions and switch to a new
> transaction at the end of a transfer without any race condition or frame
> loss.
> 
> This patch implements support for this feature in the DMA engine API. A
> new DMA_PREP_REPEAT transaction flag allows DMA clients to instruct the
> DMA channel to repeat the transaction automatically until one or more
> new transactions are issued on the channel (or until all active DMA
> transfers are explicitly terminated with the dmaengine_terminate_*()
> functions). A new DMA_REPEAT transaction type is also added for DMA
> engine drivers to report their support of the DMA_PREP_REPEAT flag.
> 
> The DMA_PREP_REPEAT flag is currently supported for interleaved
> transactions only. Its usage can easily be extended to cover more
> transaction types simply by adding an appropriate check in the
> corresponding dmaengine_prep_*() function.
> 
> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> ---
> If this approach is accepted I can send a new version that updates
> documentation in Documentation/driver-api/dmaengine/, and extend support
> of DMA_PREP_REPEAT to the other transaction types, if desired already.
> 
>  include/linux/dmaengine.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index 64461fc64e1b..9fa00bdbf583 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -61,6 +61,7 @@ enum dma_transaction_type {
>  	DMA_SLAVE,
>  	DMA_CYCLIC,
>  	DMA_INTERLEAVE,
> +	DMA_REPEAT,
>  /* last transaction type for creation of the capabilities mask */
>  	DMA_TX_TYPE_END,
>  };
> @@ -176,6 +177,11 @@ struct dma_interleaved_template {
>   * @DMA_PREP_CMD: tell the driver that the data passed to DMA API is command
>   *  data and the descriptor should be in different format from normal
>   *  data descriptors.
> + * @DMA_PREP_REPEAT: tell the driver that the transaction shall be automatically
> + *  repeated when it ends if no other transaction has been issued on the same
> + *  channel. If other transactions have been issued, this transaction completes
> + *  normally. This flag is only applicable to interleaved transactions and is
> + *  ignored for all other transaction types.
>   */
>  enum dma_ctrl_flags {
>  	DMA_PREP_INTERRUPT = (1 << 0),
> @@ -186,6 +192,7 @@ enum dma_ctrl_flags {
>  	DMA_PREP_FENCE = (1 << 5),
>  	DMA_CTRL_REUSE = (1 << 6),
>  	DMA_PREP_CMD = (1 << 7),
> +	DMA_PREP_REPEAT = (1 << 8),

Thanks for sending this. I think this is a good proposal which Peter
made for solving this issue and it has great merits, but this is
incomplete.

DMA_PREP_REPEAT|RELOAD should only imply repeating of the transaction,
nothing else. I would like to see APIs having explicit behaviour, so let
us also add another flag, DMA_PREP_LOAD_NEXT|NEW, to indicate that the
next transaction will replace the current one when submitted after
calling .issue_pending().

Also, it makes sense to explicitly specify when the transaction should
be reloaded. Rather than guessing based on hardware support, we should
specify the EOB/EOT behaviour in these flags as well.

Next is the callback notification mechanism and when it should be
invoked. EOT is today indicated by DMA_PREP_INTERRUPT; EOB needs to be
added.

So to summarize, your driver needs to pass
DMA_PREP_REPEAT|DMA_PREP_LOAD_NEXT|DMA_LOAD_EOT|DMA_PREP_INTERRUPT,
specifying that the transaction is repeated until the next one pops up
and is replaced at EOT, with callbacks being invoked at EOT boundaries.

@Peter, did I miss anything else in this...? Please send the patch for
this (to start with just the headers so that Laurent can start using
them) and a detailed patch with documentation as a follow-up. I trust
you two can coordinate :)

-- 
~Vinod

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 3/6] dmaengine: Add support for repeating transactions
  2020-05-14 18:23   ` Vinod Koul
@ 2020-05-14 20:07     ` Laurent Pinchart
  2020-05-15  8:38       ` Vinod Koul
  0 siblings, 1 reply; 23+ messages in thread
From: Laurent Pinchart @ 2020-05-14 20:07 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Vinod,

On Thu, May 14, 2020 at 11:53:44PM +0530, Vinod Koul wrote:
> On 13-05-20, 19:59, Laurent Pinchart wrote:
> > DMA engines used with displays perform 2D interleaved transfers to read
> > framebuffers from memory and feed the data to the display engine. As the
> > same framebuffer can be displayed for multiple frames, the DMA
> > transactions need to be repeated until a new framebuffer replaces the
> > current one. This feature is implemented natively by some DMA engines
> > that have the ability to repeat transactions and switch to a new
> > transaction at the end of a transfer without any race condition or frame
> > loss.
> > 
> > This patch implements support for this feature in the DMA engine API. A
> > new DMA_PREP_REPEAT transaction flag allows DMA clients to instruct the
> > DMA channel to repeat the transaction automatically until one or more
> > new transactions are issued on the channel (or until all active DMA
> > transfers are explicitly terminated with the dmaengine_terminate_*()
> > functions). A new DMA_REPEAT transaction type is also added for DMA
> > engine drivers to report their support of the DMA_PREP_REPEAT flag.
> > 
> > The DMA_PREP_REPEAT flag is currently supported for interleaved
> > transactions only. Its usage can easily be extended to cover more
> > transaction types simply by adding an appropriate check in the
> > corresponding dmaengine_prep_*() function.
> > 
> > Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > ---
> > If this approach is accepted I can send a new version that updates
> > documentation in Documentation/driver-api/dmaengine/, and extend support
> > of DMA_PREP_REPEAT to the other transaction types, if desired already.
> > 
> >  include/linux/dmaengine.h | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> > index 64461fc64e1b..9fa00bdbf583 100644
> > --- a/include/linux/dmaengine.h
> > +++ b/include/linux/dmaengine.h
> > @@ -61,6 +61,7 @@ enum dma_transaction_type {
> >  	DMA_SLAVE,
> >  	DMA_CYCLIC,
> >  	DMA_INTERLEAVE,
> > +	DMA_REPEAT,
> >  /* last transaction type for creation of the capabilities mask */
> >  	DMA_TX_TYPE_END,
> >  };
> > @@ -176,6 +177,11 @@ struct dma_interleaved_template {
> >   * @DMA_PREP_CMD: tell the driver that the data passed to DMA API is command
> >   *  data and the descriptor should be in different format from normal
> >   *  data descriptors.
> > + * @DMA_PREP_REPEAT: tell the driver that the transaction shall be automatically
> > + *  repeated when it ends if no other transaction has been issued on the same
> > + *  channel. If other transactions have been issued, this transaction completes
> > + *  normally. This flag is only applicable to interleaved transactions and is
> > + *  ignored for all other transaction types.
> >   */
> >  enum dma_ctrl_flags {
> >  	DMA_PREP_INTERRUPT = (1 << 0),
> > @@ -186,6 +192,7 @@ enum dma_ctrl_flags {
> >  	DMA_PREP_FENCE = (1 << 5),
> >  	DMA_CTRL_REUSE = (1 << 6),
> >  	DMA_PREP_CMD = (1 << 7),
> > +	DMA_PREP_REPEAT = (1 << 8),
> 
> Thanks for sending this. I think this is a good proposal which Peter
> made for solving this issue and it has great merits, but this is
> incomplete.
> 
> DMA_PREP_REPEAT|RELOAD should only imply repeating of transactions,
> nothing else. I would like to see APIs having explicit behaviour, so let
> us also add another flag DMA_PREP_LOAD_NEXT|NEW to indicate that the
> next transactions will replace the current one when submitted after calling
> .issue_pending().
> 
> Also it makes sense to explicitly specify when the transaction should be
> reloaded. Rather than make a guesswork based on hardware support, we
> should specify the EOB/EOT in these flags as well.
> 
> Next is callback notification mechanism and when it should be invoked.
> EOT is today indicated by DMA_PREP_INTERRUPT, EOB needs to be added.
> 
> So to summarize your driver needs to invoke
> DMA_PREP_REPEAT|DMA_PREP_LOAD_NEXT|DMA_LOAD_EOT|DMA_PREP_INTERRUPT
> specifying that the transactions are repeated untill next one pops up
> and replaced at EOT with callbacks being invoked at EOT boundaries.

Are you *serious*? I feel trapped in a cross-over of Groundhog Day and
Brazil.

> @Peter, did I miss anything else in this..? Please send the patch for
> this (to start with just the headers so that Laurent can start
> using them) and detailed patch with documentation as follow up, I trust
> you two can coordinate :)

I won't call that coordination, no. If you want to design something
absurd that's your call, not mine, I don't want to get involved.

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 3/6] dmaengine: Add support for repeating transactions
  2020-05-14 20:07     ` Laurent Pinchart
@ 2020-05-15  8:38       ` Vinod Koul
  2020-05-15 14:11         ` Laurent Pinchart
  0 siblings, 1 reply; 23+ messages in thread
From: Vinod Koul @ 2020-05-15  8:38 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Laurent,

On 14-05-20, 23:07, Laurent Pinchart wrote:
> On Thu, May 14, 2020 at 11:53:44PM +0530, Vinod Koul wrote:
> > On 13-05-20, 19:59, Laurent Pinchart wrote:
> > > DMA engines used with displays perform 2D interleaved transfers to read
> > > framebuffers from memory and feed the data to the display engine. As the
> > > same framebuffer can be displayed for multiple frames, the DMA
> > > transactions need to be repeated until a new framebuffer replaces the
> > > current one. This feature is implemented natively by some DMA engines
> > > that have the ability to repeat transactions and switch to a new
> > > transaction at the end of a transfer without any race condition or frame
> > > loss.
> > > 
> > > This patch implements support for this feature in the DMA engine API. A
> > > new DMA_PREP_REPEAT transaction flag allows DMA clients to instruct the
> > > DMA channel to repeat the transaction automatically until one or more
> > > new transactions are issued on the channel (or until all active DMA
> > > transfers are explicitly terminated with the dmaengine_terminate_*()
> > > functions). A new DMA_REPEAT transaction type is also added for DMA
> > > engine drivers to report their support of the DMA_PREP_REPEAT flag.
> > > 
> > > The DMA_PREP_REPEAT flag is currently supported for interleaved
> > > transactions only. Its usage can easily be extended to cover more
> > > transaction types simply by adding an appropriate check in the
> > > corresponding dmaengine_prep_*() function.
> > > 
> > > Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > > ---
> > > If this approach is accepted I can send a new version that updates
> > > documentation in Documentation/driver-api/dmaengine/, and extend support
> > > of DMA_PREP_REPEAT to the other transaction types, if desired already.
> > > 
> > >  include/linux/dmaengine.h | 10 ++++++++++
> > >  1 file changed, 10 insertions(+)
> > > 
> > > diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> > > index 64461fc64e1b..9fa00bdbf583 100644
> > > --- a/include/linux/dmaengine.h
> > > +++ b/include/linux/dmaengine.h
> > > @@ -61,6 +61,7 @@ enum dma_transaction_type {
> > >  	DMA_SLAVE,
> > >  	DMA_CYCLIC,
> > >  	DMA_INTERLEAVE,
> > > +	DMA_REPEAT,
> > >  /* last transaction type for creation of the capabilities mask */
> > >  	DMA_TX_TYPE_END,
> > >  };
> > > @@ -176,6 +177,11 @@ struct dma_interleaved_template {
> > >   * @DMA_PREP_CMD: tell the driver that the data passed to DMA API is command
> > >   *  data and the descriptor should be in different format from normal
> > >   *  data descriptors.
> > > + * @DMA_PREP_REPEAT: tell the driver that the transaction shall be automatically
> > > + *  repeated when it ends if no other transaction has been issued on the same
> > > + *  channel. If other transactions have been issued, this transaction completes
> > > + *  normally. This flag is only applicable to interleaved transactions and is
> > > + *  ignored for all other transaction types.
> > >   */
> > >  enum dma_ctrl_flags {
> > >  	DMA_PREP_INTERRUPT = (1 << 0),
> > > @@ -186,6 +192,7 @@ enum dma_ctrl_flags {
> > >  	DMA_PREP_FENCE = (1 << 5),
> > >  	DMA_CTRL_REUSE = (1 << 6),
> > >  	DMA_PREP_CMD = (1 << 7),
> > > +	DMA_PREP_REPEAT = (1 << 8),
> > 
> > Thanks for sending this. I think this is a good proposal which Peter
> > made for solving this issue and it has great merits, but this is
> > incomplete.
> > 
> > DMA_PREP_REPEAT|RELOAD should only imply repeating of transactions,
> > nothing else. I would like to see APIs having explicit behaviour, so let
> > us also add another flag DMA_PREP_LOAD_NEXT|NEW to indicate that the
> > next transactions will replace the current one when submitted after calling
> > .issue_pending().
> > 
> > Also it makes sense to explicitly specify when the transaction should be
> > reloaded. Rather than make a guesswork based on hardware support, we
> > should specify the EOB/EOT in these flags as well.
> > 
> > Next is callback notification mechanism and when it should be invoked.
> > EOT is today indicated by DMA_PREP_INTERRUPT, EOB needs to be added.
> > 
> > So to summarize your driver needs to invoke
> > DMA_PREP_REPEAT|DMA_PREP_LOAD_NEXT|DMA_LOAD_EOT|DMA_PREP_INTERRUPT
> > specifying that the transactions are repeated untill next one pops up
> > and replaced at EOT with callbacks being invoked at EOT boundaries.
> 
> Are you *serious* ? I feel trapped in a cross-over of Groundhog Day and
> Brazil.

Sorry, I don't understand that reference!

Nevertheless, you want a behaviour which is somehow defined by your use
case and magically implies certain conditions. I do not want it that
way. I would rather see all the required flags made explicit.

> > @Peter, did I miss anything else in this..? Please send the patch for
> > this (to start with just the headers so that Laurent can start
> > using them) and detailed patch with documentation as follow up, I trust
> > you two can coordinate :)
> 
> I won't call that coordination, no. If you want to design something
> absurd that's your call, not mine, I don't want to get involved.

Your wish!

-- 
~Vinod

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 3/6] dmaengine: Add support for repeating transactions
  2020-05-15  8:38       ` Vinod Koul
@ 2020-05-15 14:11         ` Laurent Pinchart
  2020-05-18 13:57           ` Peter Ujfalusi
  0 siblings, 1 reply; 23+ messages in thread
From: Laurent Pinchart @ 2020-05-15 14:11 UTC (permalink / raw)
  To: Peter Ujfalusi
  Cc: dmaengine, Vinod Koul, Michal Simek, Hyun Kwon, Tejas Upadhyay,
	Satish Kumar Nagireddy

On Fri, May 15, 2020 at 02:08:17PM +0530, Vinod Koul wrote:
> Hi Laurent,
> 
> On 14-05-20, 23:07, Laurent Pinchart wrote:
> > On Thu, May 14, 2020 at 11:53:44PM +0530, Vinod Koul wrote:
> > > On 13-05-20, 19:59, Laurent Pinchart wrote:
> > > > DMA engines used with displays perform 2D interleaved transfers to read
> > > > framebuffers from memory and feed the data to the display engine. As the
> > > > same framebuffer can be displayed for multiple frames, the DMA
> > > > transactions need to be repeated until a new framebuffer replaces the
> > > > current one. This feature is implemented natively by some DMA engines
> > > > that have the ability to repeat transactions and switch to a new
> > > > transaction at the end of a transfer without any race condition or frame
> > > > loss.
> > > > 
> > > > This patch implements support for this feature in the DMA engine API. A
> > > > new DMA_PREP_REPEAT transaction flag allows DMA clients to instruct the
> > > > DMA channel to repeat the transaction automatically until one or more
> > > > new transactions are issued on the channel (or until all active DMA
> > > > transfers are explicitly terminated with the dmaengine_terminate_*()
> > > > functions). A new DMA_REPEAT transaction type is also added for DMA
> > > > engine drivers to report their support of the DMA_PREP_REPEAT flag.
> > > > 
> > > > The DMA_PREP_REPEAT flag is currently supported for interleaved
> > > > transactions only. Its usage can easily be extended to cover more
> > > > transaction types simply by adding an appropriate check in the
> > > > corresponding dmaengine_prep_*() function.
> > > > 
> > > > Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > > > ---
> > > > If this approach is accepted I can send a new version that updates
> > > > documentation in Documentation/driver-api/dmaengine/, and extend support
> > > > of DMA_PREP_REPEAT to the other transaction types, if desired already.
> > > > 
> > > >  include/linux/dmaengine.h | 10 ++++++++++
> > > >  1 file changed, 10 insertions(+)
> > > > 
> > > > diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> > > > index 64461fc64e1b..9fa00bdbf583 100644
> > > > --- a/include/linux/dmaengine.h
> > > > +++ b/include/linux/dmaengine.h
> > > > @@ -61,6 +61,7 @@ enum dma_transaction_type {
> > > >  	DMA_SLAVE,
> > > >  	DMA_CYCLIC,
> > > >  	DMA_INTERLEAVE,
> > > > +	DMA_REPEAT,
> > > >  /* last transaction type for creation of the capabilities mask */
> > > >  	DMA_TX_TYPE_END,
> > > >  };
> > > > @@ -176,6 +177,11 @@ struct dma_interleaved_template {
> > > >   * @DMA_PREP_CMD: tell the driver that the data passed to DMA API is command
> > > >   *  data and the descriptor should be in different format from normal
> > > >   *  data descriptors.
> > > > + * @DMA_PREP_REPEAT: tell the driver that the transaction shall be automatically
> > > > + *  repeated when it ends if no other transaction has been issued on the same
> > > > + *  channel. If other transactions have been issued, this transaction completes
> > > > + *  normally. This flag is only applicable to interleaved transactions and is
> > > > + *  ignored for all other transaction types.
> > > >   */
> > > >  enum dma_ctrl_flags {
> > > >  	DMA_PREP_INTERRUPT = (1 << 0),
> > > > @@ -186,6 +192,7 @@ enum dma_ctrl_flags {
> > > >  	DMA_PREP_FENCE = (1 << 5),
> > > >  	DMA_CTRL_REUSE = (1 << 6),
> > > >  	DMA_PREP_CMD = (1 << 7),
> > > > +	DMA_PREP_REPEAT = (1 << 8),
> > > 
> > > Thanks for sending this. I think this is a good proposal which Peter
> > > made for solving this issue and it has great merits, but this is
> > > incomplete.
> > > 
> > > DMA_PREP_REPEAT|RELOAD should only imply repeating of transactions,
> > > nothing else. I would like to see APIs having explicit behaviour, so let
> > > us also add another flag DMA_PREP_LOAD_NEXT|NEW to indicate that the
> > > next transactions will replace the current one when submitted after calling
> > > .issue_pending().
> > > 
> > > Also it makes sense to explicitly specify when the transaction should be
> > > reloaded. Rather than make a guesswork based on hardware support, we
> > > should specify the EOB/EOT in these flags as well.
> > > 
> > > Next is callback notification mechanism and when it should be invoked.
> > > EOT is today indicated by DMA_PREP_INTERRUPT, EOB needs to be added.
> > > 
> > > So to summarize your driver needs to invoke
> > > DMA_PREP_REPEAT|DMA_PREP_LOAD_NEXT|DMA_LOAD_EOT|DMA_PREP_INTERRUPT
> > > specifying that the transactions are repeated untill next one pops up
> > > and replaced at EOT with callbacks being invoked at EOT boundaries.

Peter, what do you think?

> > Are you *serious* ? I feel trapped in a cross-over of Groundhog Day and
> > Brazil.
> 
> Sorry, I don't understand that reference!
> 
> Nevertheless, you want a behaviour which is somehow defined by your use
> and magically implies certain conditions. I do not want it that way.
> I would rather see all the flag required.
> 
> > > @Peter, did I miss anything else in this..? Please send the patch for
> > > this (to start with just the headers so that Laurent can start
> > > using them) and detailed patch with documentation as follow up, I trust
> > > you two can coordinate :)
> > 
> > I won't call that coordination, no. If you want to design something
> > absurd that's your call, not mine, I don't want to get involved.
> 
> Your wish!

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 3/6] dmaengine: Add support for repeating transactions
  2020-05-15 14:11         ` Laurent Pinchart
@ 2020-05-18 13:57           ` Peter Ujfalusi
  2020-05-18 14:32             ` Laurent Pinchart
  0 siblings, 1 reply; 23+ messages in thread
From: Peter Ujfalusi @ 2020-05-18 13:57 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: dmaengine, Vinod Koul, Michal Simek, Hyun Kwon, Tejas Upadhyay,
	Satish Kumar Nagireddy

Hi Laurent, Vinod,

On 15/05/2020 17.11, Laurent Pinchart wrote:
> On Fri, May 15, 2020 at 02:08:17PM +0530, Vinod Koul wrote:
>> Hi Laurent,
>>
>> On 14-05-20, 23:07, Laurent Pinchart wrote:
>>> On Thu, May 14, 2020 at 11:53:44PM +0530, Vinod Koul wrote:
>>>> On 13-05-20, 19:59, Laurent Pinchart wrote:
>>>>> DMA engines used with displays perform 2D interleaved transfers to read
>>>>> framebuffers from memory and feed the data to the display engine. As the
>>>>> same framebuffer can be displayed for multiple frames, the DMA
>>>>> transactions need to be repeated until a new framebuffer replaces the
>>>>> current one. This feature is implemented natively by some DMA engines
>>>>> that have the ability to repeat transactions and switch to a new
>>>>> transaction at the end of a transfer without any race condition or frame
>>>>> loss.
>>>>>
>>>>> This patch implements support for this feature in the DMA engine API. A
>>>>> new DMA_PREP_REPEAT transaction flag allows DMA clients to instruct the
>>>>> DMA channel to repeat the transaction automatically until one or more
>>>>> new transactions are issued on the channel (or until all active DMA
>>>>> transfers are explicitly terminated with the dmaengine_terminate_*()
>>>>> functions). A new DMA_REPEAT transaction type is also added for DMA
>>>>> engine drivers to report their support of the DMA_PREP_REPEAT flag.
>>>>>
>>>>> The DMA_PREP_REPEAT flag is currently supported for interleaved
>>>>> transactions only. Its usage can easily be extended to cover more
>>>>> transaction types simply by adding an appropriate check in the
>>>>> corresponding dmaengine_prep_*() function.
>>>>>
>>>>> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
>>>>> ---
>>>>> If this approach is accepted I can send a new version that updates
>>>>> documentation in Documentation/driver-api/dmaengine/, and extend support
>>>>> of DMA_PREP_REPEAT to the other transaction types, if desired already.
>>>>>
>>>>>  include/linux/dmaengine.h | 10 ++++++++++
>>>>>  1 file changed, 10 insertions(+)
>>>>>
>>>>> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
>>>>> index 64461fc64e1b..9fa00bdbf583 100644
>>>>> --- a/include/linux/dmaengine.h
>>>>> +++ b/include/linux/dmaengine.h
>>>>> @@ -61,6 +61,7 @@ enum dma_transaction_type {
>>>>>  	DMA_SLAVE,
>>>>>  	DMA_CYCLIC,
>>>>>  	DMA_INTERLEAVE,
>>>>> +	DMA_REPEAT,
>>>>>  /* last transaction type for creation of the capabilities mask */
>>>>>  	DMA_TX_TYPE_END,
>>>>>  };
>>>>> @@ -176,6 +177,11 @@ struct dma_interleaved_template {
>>>>>   * @DMA_PREP_CMD: tell the driver that the data passed to DMA API is command
>>>>>   *  data and the descriptor should be in different format from normal
>>>>>   *  data descriptors.
>>>>> + * @DMA_PREP_REPEAT: tell the driver that the transaction shall be automatically
>>>>> + *  repeated when it ends if no other transaction has been issued on the same
>>>>> + *  channel. If other transactions have been issued, this transaction completes
>>>>> + *  normally. This flag is only applicable to interleaved transactions and is
>>>>> + *  ignored for all other transaction types.

It should not be restricted to interleaved transfers; slave_sg/memcpy/etc.
can be repeated if the DMA driver implements it (and a user on a given
platform needs it).

>>>>>   */
>>>>>  enum dma_ctrl_flags {
>>>>>  	DMA_PREP_INTERRUPT = (1 << 0),
>>>>> @@ -186,6 +192,7 @@ enum dma_ctrl_flags {
>>>>>  	DMA_PREP_FENCE = (1 << 5),
>>>>>  	DMA_CTRL_REUSE = (1 << 6),
>>>>>  	DMA_PREP_CMD = (1 << 7),
>>>>> +	DMA_PREP_REPEAT = (1 << 8),
>>>>
>>>> Thanks for sending this. I think this is a good proposal which Peter
>>>> made for solving this issue and it has great merits, but this is
>>>> incomplete.
>>>>
>>>> DMA_PREP_REPEAT|RELOAD should only imply repeating of transactions,
>>>> nothing else. I would like to see APIs having explicit behaviour, so let
>>>> us also add another flag DMA_PREP_LOAD_NEXT|NEW to indicate that the
>>>> next transactions will replace the current one when submitted after calling
>>>> .issue_pending().
>>>>
>>>> Also it makes sense to explicitly specify when the transaction should be
>>>> reloaded. Rather than make a guesswork based on hardware support, we
>>>> should specify the EOB/EOT in these flags as well.
>>>>
>>>> Next is callback notification mechanism and when it should be invoked.
>>>> EOT is today indicated by DMA_PREP_INTERRUPT, EOB needs to be added.
>>>>
>>>> So to summarize your driver needs to invoke
>>>> DMA_PREP_REPEAT|DMA_PREP_LOAD_NEXT|DMA_LOAD_EOT|DMA_PREP_INTERRUPT
>>>> specifying that the transactions are repeated untill next one pops up
>>>> and replaced at EOT with callbacks being invoked at EOT boundaries.
> 
> Peter, what do you think ?

Well, I'm in between ;)

You have a dedicated DMA which can do one thing - service the display.
DMAengine provides a generic API for DMA users.

DMA_PREP_REPEAT is a new flag for a descriptor; imho it can be
introduced without breaking anything which exists today.

DMA_PREP_REPEAT - the descriptor should be repeated until the channel is
terminated with terminate_all.

DMA_PREP_LOAD_EOT - the descriptor should be loaded at the next EOT of
the currently running transfer, if any; otherwise it is started right
away.

DMA_PREP_INTERRUPT - as it is today. Callback at EOT (for
slave_sg/interleaved/memcpy/etc, cyclic interprets this differently -
callback at period elapse time).

So you would set DMA_PREP_REPEAT | DMA_PREP_LOAD_EOT (|
DMA_PREP_INTERRUPT if you need callbacks at EOT).
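
From the client side that would look roughly like this (untested sketch
only, assuming the new flags get merged under these names; needs
<linux/dmaengine.h>, with chan and xt set up by the display driver as
usual):

static int display_start_frame(struct dma_chan *chan,
			       struct dma_interleaved_template *xt)
{
	struct dma_async_tx_descriptor *tx;
	unsigned long flags = DMA_PREP_REPEAT | DMA_PREP_LOAD_EOT |
			      DMA_PREP_INTERRUPT;

	/* The descriptor repeats until the next one is loaded at EOT. */
	tx = dmaengine_prep_interleaved_dma(chan, xt, flags);
	if (!tx)
		return -ENOMEM;

	dmaengine_submit(tx);
	dma_async_issue_pending(chan);

	return 0;
}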

The capabilities of the device/channel should tell the user if it is
capable of REPEAT and LOAD_EOT.
It is possible that a DMA can do repeat, but lacks the ability to do
any type of LOAD_*.

I think this would give a nice starting point to extend on later.

- Péter

>>> Are you *serious* ? I feel trapped in a cross-over of Groundhog Day and
>>> Brazil.
>>
>> Sorry, I don't understand that reference!
>>
>> Nevertheless, you want a behaviour which is somehow defined by your use
>> and magically implies certain conditions. I do not want it that way.
>> I would rather see all the flag required.
>>
>>>> @Peter, did I miss anything else in this..? Please send the patch for
>>>> this (to start with just the headers so that Laurent can start
>>>> using them) and detailed patch with documentation as follow up, I trust
>>>> you two can coordinate :)
>>>
>>> I won't call that coordination, no. If you want to design something
>>> absurd that's your call, not mine, I don't want to get involved.
>>
>> Your wish!
> 

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 3/6] dmaengine: Add support for repeating transactions
  2020-05-18 13:57           ` Peter Ujfalusi
@ 2020-05-18 14:32             ` Laurent Pinchart
  2020-05-19 12:41               ` Peter Ujfalusi
  0 siblings, 1 reply; 23+ messages in thread
From: Laurent Pinchart @ 2020-05-18 14:32 UTC (permalink / raw)
  To: Peter Ujfalusi
  Cc: dmaengine, Vinod Koul, Michal Simek, Hyun Kwon, Tejas Upadhyay,
	Satish Kumar Nagireddy

Hi Peter,

On Mon, May 18, 2020 at 04:57:07PM +0300, Peter Ujfalusi wrote:
> On 15/05/2020 17.11, Laurent Pinchart wrote:
> > On Fri, May 15, 2020 at 02:08:17PM +0530, Vinod Koul wrote:
> >> On 14-05-20, 23:07, Laurent Pinchart wrote:
> >>> On Thu, May 14, 2020 at 11:53:44PM +0530, Vinod Koul wrote:
> >>>> On 13-05-20, 19:59, Laurent Pinchart wrote:
> >>>>> DMA engines used with displays perform 2D interleaved transfers to read
> >>>>> framebuffers from memory and feed the data to the display engine. As the
> >>>>> same framebuffer can be displayed for multiple frames, the DMA
> >>>>> transactions need to be repeated until a new framebuffer replaces the
> >>>>> current one. This feature is implemented natively by some DMA engines
> >>>>> that have the ability to repeat transactions and switch to a new
> >>>>> transaction at the end of a transfer without any race condition or frame
> >>>>> loss.
> >>>>>
> >>>>> This patch implements support for this feature in the DMA engine API. A
> >>>>> new DMA_PREP_REPEAT transaction flag allows DMA clients to instruct the
> >>>>> DMA channel to repeat the transaction automatically until one or more
> >>>>> new transactions are issued on the channel (or until all active DMA
> >>>>> transfers are explicitly terminated with the dmaengine_terminate_*()
> >>>>> functions). A new DMA_REPEAT transaction type is also added for DMA
> >>>>> engine drivers to report their support of the DMA_PREP_REPEAT flag.
> >>>>>
> >>>>> The DMA_PREP_REPEAT flag is currently supported for interleaved
> >>>>> transactions only. Its usage can easily be extended to cover more
> >>>>> transaction types simply by adding an appropriate check in the
> >>>>> corresponding dmaengine_prep_*() function.
> >>>>>
> >>>>> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> >>>>> ---
> >>>>> If this approach is accepted I can send a new version that updates
> >>>>> documentation in Documentation/driver-api/dmaengine/, and extend support
> >>>>> of DMA_PREP_REPEAT to the other transaction types, if desired already.
> >>>>>
> >>>>>  include/linux/dmaengine.h | 10 ++++++++++
> >>>>>  1 file changed, 10 insertions(+)
> >>>>>
> >>>>> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> >>>>> index 64461fc64e1b..9fa00bdbf583 100644
> >>>>> --- a/include/linux/dmaengine.h
> >>>>> +++ b/include/linux/dmaengine.h
> >>>>> @@ -61,6 +61,7 @@ enum dma_transaction_type {
> >>>>>  	DMA_SLAVE,
> >>>>>  	DMA_CYCLIC,
> >>>>>  	DMA_INTERLEAVE,
> >>>>> +	DMA_REPEAT,
> >>>>>  /* last transaction type for creation of the capabilities mask */
> >>>>>  	DMA_TX_TYPE_END,
> >>>>>  };
> >>>>> @@ -176,6 +177,11 @@ struct dma_interleaved_template {
> >>>>>   * @DMA_PREP_CMD: tell the driver that the data passed to DMA API is command
> >>>>>   *  data and the descriptor should be in different format from normal
> >>>>>   *  data descriptors.
> >>>>> + * @DMA_PREP_REPEAT: tell the driver that the transaction shall be automatically
> >>>>> + *  repeated when it ends if no other transaction has been issued on the same
> >>>>> + *  channel. If other transactions have been issued, this transaction completes
> >>>>> + *  normally. This flag is only applicable to interleaved transactions and is
> >>>>> + *  ignored for all other transaction types.
> 
> It should not be restricted to interleaved, slave_sg/memcpy/etc can be
> repeated if the DMA driver implements it (a user on a given platform
> needs it).

As mentioned in the commit message, I plan to extend that. I just didn't
want to add the checks to all the prepare operation wrappers until an
agreement on the approach had been reached. I also thought it would be
good not to allow this API for other transaction types until use cases
arise, in order to force upstream discussions instead of silently
abusing the API :-) I can extend the flag to all other transaction types
(except for the cyclic transaction, as it doesn't make sense there).
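
For the interleaved case, the check in the dmaengine_prep_interleaved_dma()
wrapper boils down to something like this (sketch only, not necessarily
the exact code from patch 3/6):

static inline struct dma_async_tx_descriptor *
dmaengine_prep_interleaved_dma(struct dma_chan *chan,
			       struct dma_interleaved_template *xt,
			       unsigned long flags)
{
	/* Reject DMA_PREP_REPEAT on channels that don't advertise DMA_REPEAT. */
	if ((flags & DMA_PREP_REPEAT) &&
	    !dma_has_cap(DMA_REPEAT, chan->device->cap_mask))
		return NULL;

	return chan->device->device_prep_interleaved_dma(chan, xt, flags);
}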

> >>>>>   */
> >>>>>  enum dma_ctrl_flags {
> >>>>>  	DMA_PREP_INTERRUPT = (1 << 0),
> >>>>> @@ -186,6 +192,7 @@ enum dma_ctrl_flags {
> >>>>>  	DMA_PREP_FENCE = (1 << 5),
> >>>>>  	DMA_CTRL_REUSE = (1 << 6),
> >>>>>  	DMA_PREP_CMD = (1 << 7),
> >>>>> +	DMA_PREP_REPEAT = (1 << 8),
> >>>>
> >>>> Thanks for sending this. I think this is a good proposal which Peter
> >>>> made for solving this issue and it has great merits, but this is
> >>>> incomplete.
> >>>>
> >>>> DMA_PREP_REPEAT|RELOAD should only imply repeating of transactions,
> >>>> nothing else. I would like to see APIs having explicit behaviour, so let
> >>>> us also add another flag DMA_PREP_LOAD_NEXT|NEW to indicate that the
> >>>> next transactions will replace the current one when submitted after calling
> >>>> .issue_pending().
> >>>>
> >>>> Also it makes sense to explicitly specify when the transaction should be
> >>>> reloaded. Rather than make a guesswork based on hardware support, we
> >>>> should specify the EOB/EOT in these flags as well.
> >>>>
> >>>> Next is callback notification mechanism and when it should be invoked.
> >>>> EOT is today indicated by DMA_PREP_INTERRUPT, EOB needs to be added.
> >>>>
> >>>> So to summarize your driver needs to invoke
> >>>> DMA_PREP_REPEAT|DMA_PREP_LOAD_NEXT|DMA_LOAD_EOT|DMA_PREP_INTERRUPT
> >>>> specifying that the transactions are repeated untill next one pops up
> >>>> and replaced at EOT with callbacks being invoked at EOT boundaries.
> > 
> > Peter, what do you think ?
> 
> Well, I'm in between ;)
> 
> You have a dedicated DMA which can do one thing - to service display.
> DMAengine provides generic API for DMA use users.
> 
> The DMA_PREP_REPEAT is a new flag for a descriptor, imho it can be
> introduced without breaking anything which exists today.
> 
> DMA_PREP_REPEAT - the descriptor should be repeated until the channel is
> terminated with terminate_all.

No concern about DMA_PREP_REPEAT, I like the idea.

> DMA_PREP_LOAD_EOT - the descriptor should be loaded at the next EOT of
> the currently running transfer, if any, otherwise start.

Why is this needed? Why can't this be the default behaviour? What is
the use case for queuing a descriptor with DMA_PREP_REPEAT and *not*
setting DMA_PREP_LOAD_EOT on the next one? If a client queues the next
descriptor without DMA_PREP_LOAD_EOT, the DMA engine will keep repeating
the previous one, the client will hang forever waiting for the switch to
the new descriptor that will never happen, and no error message or other
diagnostic information will be provided. This creates an API that has no
purpose (or at least no specified purpose; if there's an actual use case
for not specifying that flag, I'm willing to discuss it) and makes it
easy to shoot oneself in the foot. A good API should be impossible to
misuse (this can of course not always be achieved in practice, but it's
still a good goal to aim for).

And this doesn't even mention DMA_PREP_LOAD_NEXT, which seems equally
designed as a way to maximize the chances that drivers will get something
wrong :-)

> DMA_PREP_INTERRUPT - as it is today. Callback at EOT (for
> slave_sg/interleaved/memcpy/etc, cyclic interprets this differently -
> callback at period elapse time).
> 
> So you would set DMA_PREP_REPEAT | DMA_PREP_LOAD_EOT (|
> DMA_PREP_INTERRUPT if you need callbacks at EOT).
> 
> The capabilities of the device/channel should tell the user if it is
> capable of REPEAT and LOAD_EOT.
> It is possible that a DMA can do repeat, but lacks the ability to do any
> type of LOAD_*

This is the kind of information I was looking for, thanks. I agree that
some DMA engines may not be able to replace a repeated transfer (I'm
using the word repeated here instead of cyclic, to avoid confusion with
the existing cyclic transfer type) at EOT. I however assume they would
all have the ability to replace it immediately, as DMA engines are
required to implement terminate_all(), and replacing a transfer
immediately can then just be a combination of terminate_all() + starting
the next transfer. Whether we want DMA engines to implement this
internally instead of having the logic on the client side (as done
today) is another question, and I'm not pushing in one direction or
another here (although I think we could explore the option of
implementing this in the DMA engine core).
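
For illustration, a rough sketch (untested) of that client-side fallback,
with xt being the interleaved template for the next frame and
DMA_PREP_REPEAT the flag proposed in this series:

static int replace_repeated_transfer(struct dma_chan *chan,
				     struct dma_interleaved_template *xt)
{
	struct dma_async_tx_descriptor *desc;
	int ret;

	/* Stop the descriptor that is currently being repeated. */
	ret = dmaengine_terminate_sync(chan);
	if (ret)
		return ret;

	/* Queue the new transfer and restart the channel. */
	desc = dmaengine_prep_interleaved_dma(chan, xt,
					      DMA_PREP_REPEAT |
					      DMA_PREP_INTERRUPT);
	if (!desc)
		return -ENOMEM;

	dmaengine_submit(desc);
	dma_async_issue_pending(chan);

	return 0;
}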

Having a capability flag to report if a DMA engine supports replacing a
repeated transfer at EOT makes sense (no idea if we will have DMA
engines supporting DMA_REPEAT but not DMA_LOAD_EOT, but that's another
story, at least in theory it could happen). I however don't see why a
DMA_PREP_LOAD_EOT flag is needed; if this feature isn't supported,
shouldn't tx_submit() and/or issue_pending() fail when a repeated
transfer is queued ? Succeeding in tx_submit() and issue_pending() and
doing nothing with the newly queued transfers is, as I explained above,
a very good way to increase the chance of bugs. I don't see a reason why
accepting a call that we know will not perform what the caller expects
it to perform would be a good idea.

> I think this would give a nice starting point to extend on later.
> 
> >>> Are you *serious* ? I feel trapped in a cross-over of Groundhog Day and
> >>> Brazil.
> >>
> >> Sorry, I don't understand that reference!
> >>
> >> Nevertheless, you want a behaviour which is somehow defined by your use
> >> and magically implies certain conditions. I do not want it that way.
> >> I would rather see all the flag required.
> >>
> >>>> @Peter, did I miss anything else in this..? Please send the patch for
> >>>> this (to start with just the headers so that Laurent can start
> >>>> using them) and detailed patch with documentation as follow up, I trust
> >>>> you two can coordinate :)
> >>>
> >>> I won't call that coordination, no. If you want to design something
> >>> absurd that's your call, not mine, I don't want to get involved.
> >>
> >> Your wish!

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 3/6] dmaengine: Add support for repeating transactions
  2020-05-18 14:32             ` Laurent Pinchart
@ 2020-05-19 12:41               ` Peter Ujfalusi
  2020-05-28  2:10                 ` Laurent Pinchart
  0 siblings, 1 reply; 23+ messages in thread
From: Peter Ujfalusi @ 2020-05-19 12:41 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: dmaengine, Vinod Koul, Michal Simek, Hyun Kwon, Tejas Upadhyay,
	Satish Kumar Nagireddy

[-- Attachment #1: Type: text/plain, Size: 9063 bytes --]

Hi Laurent,

On 18/05/2020 17.32, Laurent Pinchart wrote:
>>>>>>> @@ -176,6 +177,11 @@ struct dma_interleaved_template {
>>>>>>>   * @DMA_PREP_CMD: tell the driver that the data passed to DMA API is command
>>>>>>>   *  data and the descriptor should be in different format from normal
>>>>>>>   *  data descriptors.
>>>>>>> + * @DMA_PREP_REPEAT: tell the driver that the transaction shall be automatically
>>>>>>> + *  repeated when it ends if no other transaction has been issued on the same
>>>>>>> + *  channel. If other transactions have been issued, this transaction completes
>>>>>>> + *  normally. This flag is only applicable to interleaved transactions and is
>>>>>>> + *  ignored for all other transaction types.
>>
>> It should not be restricted to interleaved, slave_sg/memcpy/etc can be
>> repeated if the DMA driver implements it (a user on a given platform
>> needs it).
> 
> As mentioned in the commit message, I plan to extend that, I just didn't
> want to add the checks to all the prepare operation wrappers until an
> agreement on the approach would be reached. I also thought it would be
> good to not allow this API for other transaction types until use cases
> arise, in order to force upstream discussions instead of silently
> abusing the API :-)

I would not object if slave_sg and memcpy got the same treatment. If the
DMA driver did not set the DMA_REPEAT then clients can not use this
feature anyways.

> I can extend the flag to all other transaction types
> (except for the cyclic transaction, as it doesn't make sense there).

Yep, cyclic is a different type of transfer, it is for circular buffers.
It could be seen as a special case of slave_sg. Some drivers actually
create temporary sg_list in case of cyclic and use the same setup
function to set up the transfer for slave_sg/cyclic...
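
For reference, that pattern looks roughly like this (a sketch only,
xxx_prep_sg() standing in for the driver's common slave_sg setup helper):

static struct dma_async_tx_descriptor *
xxx_prep_dma_cyclic(struct dma_chan *chan, dma_addr_t buf_addr,
		    size_t buf_len, size_t period_len,
		    enum dma_transfer_direction dir, unsigned long flags)
{
	unsigned int periods = buf_len / period_len;
	struct dma_async_tx_descriptor *desc;
	struct scatterlist *sgl;
	unsigned int i;

	sgl = kcalloc(periods, sizeof(*sgl), GFP_NOWAIT);
	if (!sgl)
		return NULL;

	/* Describe the circular buffer as one sg entry per period. */
	sg_init_table(sgl, periods);
	for (i = 0; i < periods; i++) {
		sg_dma_address(&sgl[i]) = buf_addr + i * period_len;
		sg_dma_len(&sgl[i]) = period_len;
	}

	/* Reuse the slave_sg setup path, flagging the circular nature. */
	desc = xxx_prep_sg(chan, sgl, periods, dir, flags, true);

	kfree(sgl);
	return desc;
}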

>>>>>>>   */
>>>>>>>  enum dma_ctrl_flags {
>>>>>>>  	DMA_PREP_INTERRUPT = (1 << 0),
>>>>>>> @@ -186,6 +192,7 @@ enum dma_ctrl_flags {
>>>>>>>  	DMA_PREP_FENCE = (1 << 5),
>>>>>>>  	DMA_CTRL_REUSE = (1 << 6),
>>>>>>>  	DMA_PREP_CMD = (1 << 7),
>>>>>>> +	DMA_PREP_REPEAT = (1 << 8),
>>>>>>
>>>>>> Thanks for sending this. I think this is a good proposal which Peter
>>>>>> made for solving this issue and it has great merits, but this is
>>>>>> incomplete.
>>>>>>
>>>>>> DMA_PREP_REPEAT|RELOAD should only imply repeating of transactions,
>>>>>> nothing else. I would like to see APIs having explicit behaviour, so let
>>>>>> us also add another flag DMA_PREP_LOAD_NEXT|NEW to indicate that the
>>>>>> next transactions will replace the current one when submitted after calling
>>>>>> .issue_pending().
>>>>>>
>>>>>> Also it makes sense to explicitly specify when the transaction should be
>>>>>> reloaded. Rather than make a guesswork based on hardware support, we
>>>>>> should specify the EOB/EOT in these flags as well.
>>>>>>
>>>>>> Next is callback notification mechanism and when it should be invoked.
>>>>>> EOT is today indicated by DMA_PREP_INTERRUPT, EOB needs to be added.
>>>>>>
>>>>>> So to summarize your driver needs to invoke
>>>>>> DMA_PREP_REPEAT|DMA_PREP_LOAD_NEXT|DMA_LOAD_EOT|DMA_PREP_INTERRUPT
>>>>>> specifying that the transactions are repeated until the next one pops up
>>>>>> and replaced at EOT with callbacks being invoked at EOT boundaries.
>>>
>>> Peter, what do you think ?
>>
>> Well, I'm in between ;)
>>
>> You have a dedicated DMA which can do one thing - to service display.
>> DMAengine provides generic API for DMA use users.
>>
>> The DMA_PREP_REPEAT is a new flag for a descriptor, imho it can be
>> introduced without breaking anything which exists today.
>>
>> DMA_PREP_REPEAT - the descriptor should be repeated until the channel is
>> terminated with terminate_all.
> 
> No concern about DMA_PREP_REPEAT, I like the idea.
> 
>> DMA_PREP_LOAD_EOT - the descriptor should be loaded at the next EOT of
>> the currently running transfer, if any, otherwise start.
> 
> Why is this needed ? Why can't this be the default behaviour ? What the
> use case for queuing a descriptor with DMA_PREP_REPEAT and *not* setting
> DMA_PREP_LOAD_EOT on the next one ? If a client queues the next
> descriptor without DMA_PREP_LOAD_EOT, the DMA engine will keep repeating
> the previous one, the client will hang forever waiting for the switch to
> the new descriptor that will never happen, and no error message or other
> diagnostic information will be provided. This creates an API that as no
> purpose (or at least no specified purpose, if there's an actual use case
> for not specifying that flag, I'm willing to discuss it) and makes it
> easy to shoot oneself in the foot. A good API should be impossible to
> misuse (this can of course not always be achieved in practice, but it's
> still a good goal to aim for).

If no DMA_PREP_LOAD_EOT is set then yes, the running transfer will not
move forward, just like how cyclic works.
and...

> And this doesn't even mention DMA_PREP_LOAD_NEXT, that seems equally
> design as a way to maximize chances that drivers will get something
> wrong :-)
> 
>> DMA_PREP_INTERRUPT - as it is today. Callback at EOT (for
>> slave_sg/interleaved/memcpy/etc, cyclic interprets this differently -
>> callback at period elapse time).
>>
>> So you would set DMA_PREP_REPEAT | DMA_PREP_LOAD_EOT (|
>> DMA_PREP_INTERRUPT if you need callbacks at EOT).
>>
>> The capabilities of the device/channel should tell the user if it is
>> capable of REPEAT and LOAD_EOT.
>> It is possible that a DMA can do repeat, but lacks the ability to do any
>> type of LOAD_*
> 
> This is the kind of information I was looking for, thanks. I agree that
> some DMA engines may not be able to replace a repeated transfer (I'm
> using the word repeated here instead of cyclic, to avoid confusion with
> the existing cyclic transfer type) at EOT. I however assume they would
> all have the ability to replace it immediately, as DMA engines are
> required to implement terminate_all(), and replacing a transfer
> immediately can then just be a combination of terminate_all() + starting
> the next transfer. Whether we want DMA engines to implement this
> internally instead of having the logical on the client side (as done
> today) is another question, and I'm not pushing in one direction or
> another here (although I think we could explore the option of
> implementing this in the DMA engine core).
> 
> Having a capability flag to report if a DMA engine supports replacing a
> repeated transfer at EOT makes sense (no idea if we will have DMA
> engines supporting DMA_REPEAT but not DMA_LOAD_EOT, but that's another
> story, at least in theory it could happen). I however don't see why a
> DMA_PREP_LOAD_EOT flag is needed, if this feature isn't supported,
> shouldn't tx_submit() and/or issue_pending() fail when a repeated
> transfer is queued ? Succeeding in tx_submit() and issue_pending() and
> doing nothing with the newly queued transfers is, as I explained above,
> a very good way to increase the chance of bugs. I don't see a reason why
> accepting a call that we know will not perform what the caller expects
> it to perform would be a good idea.

I would argue that the DMA_PREP_RELOAD_NOW (ASAP?) is a bit more than
terminate_all+issue_pending.

But DMA drivers might support neither of them, either of them, or both.
It is up to the client to pick the preferred method for its use.
It is not far-fetched that the next DMA servicing the client will have
different capabilities, and the client needs to handle EOT or NOW, or it
might even need a fallback for the case when neither is supported.

I don't like excessive flags either, but based on my experience
under-flagging can bite back sooner than later.

I'm aware that at the moment it feels like it is too explicit, but never
underestimate the creativity of the design - and in some cases the
constraint the design must fulfill.
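
As an example of how a client could cope with differing capabilities, a
sketch only, assuming the capabilities end up being advertised through
the device cap_mask with the DMA_REPEAT / DMA_LOAD_EOT names used in
this thread:

static unsigned long xxx_pick_flags(struct dma_chan *chan)
{
	unsigned long flags = DMA_PREP_INTERRUPT;

	/* No repeat support: requeue a descriptor for every frame. */
	if (!dma_has_cap(DMA_REPEAT, chan->device->cap_mask))
		return flags;

	flags |= DMA_PREP_REPEAT;

	/* Prefer a clean switch at EOT when the engine can do it. */
	if (dma_has_cap(DMA_LOAD_EOT, chan->device->cap_mask))
		flags |= DMA_PREP_LOAD_EOT;

	/* Otherwise fall back to terminate_all() + reissue for updates. */
	return flags;
}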

> 
>> I think this would give a nice starting point to extend on later.
>>
>>>>> Are you *serious* ? I feel trapped in a cross-over of Groundhog Day and
>>>>> Brazil.
>>>>
>>>> Sorry, I don't understand that reference!
>>>>
>>>> Nevertheless, you want a behaviour which is somehow defined by your use
>>>> and magically implies certain conditions. I do not want it that way.
>>>> I would rather see all the flag required.
>>>>
>>>>>> @Peter, did I miss anything else in this..? Please send the patch for
>>>>>> this (to start with just the headers so that Laurent can start
>>>>>> using them) and detailed patch with documentation as follow up, I trust
>>>>>> you two can coordinate :)
>>>>>
>>>>> I won't call that coordination, no. If you want to design something
>>>>> absurd that's your call, not mine, I don't want to get involved.
>>>>
>>>> Your wish!
> 

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

[-- Attachment #2: pEpkey.asc --]
[-- Type: application/pgp-keys, Size: 1783 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 3/6] dmaengine: Add support for repeating transactions
  2020-05-19 12:41               ` Peter Ujfalusi
@ 2020-05-28  2:10                 ` Laurent Pinchart
  2020-06-01 11:14                   ` Peter Ujfalusi
  0 siblings, 1 reply; 23+ messages in thread
From: Laurent Pinchart @ 2020-05-28  2:10 UTC (permalink / raw)
  To: Peter Ujfalusi
  Cc: dmaengine, Vinod Koul, Michal Simek, Hyun Kwon, Tejas Upadhyay,
	Satish Kumar Nagireddy

Hi Peter,

On Tue, May 19, 2020 at 03:41:17PM +0300, Peter Ujfalusi wrote:
> On 18/05/2020 17.32, Laurent Pinchart wrote:
> >>>>>>> @@ -176,6 +177,11 @@ struct dma_interleaved_template {
> >>>>>>>   * @DMA_PREP_CMD: tell the driver that the data passed to DMA API is command
> >>>>>>>   *  data and the descriptor should be in different format from normal
> >>>>>>>   *  data descriptors.
> >>>>>>> + * @DMA_PREP_REPEAT: tell the driver that the transaction shall be automatically
> >>>>>>> + *  repeated when it ends if no other transaction has been issued on the same
> >>>>>>> + *  channel. If other transactions have been issued, this transaction completes
> >>>>>>> + *  normally. This flag is only applicable to interleaved transactions and is
> >>>>>>> + *  ignored for all other transaction types.
> >>
> >> It should not be restricted to interleaved, slave_sg/memcpy/etc can be
> >> repeated if the DMA driver implements it (a user on a given platform
> >> needs it).
> > 
> > As mentioned in the commit message, I plan to extend that, I just didn't
> > want to add the checks to all the prepare operation wrappers until an
> > agreement on the approach would be reached. I also thought it would be
> > good to not allow this API for other transaction types until use cases
> > arise, in order to force upstream discussions instead of silently
> > abusing the API :-)
> 
> I would not object if slave_sg and memcpy got the same treatment. If the
> DMA driver did not set the DMA_REPEAT then clients can not use this
> feature anyways.

Would you not object, or would you prefer if it was done in v5 ? :-)
Overall I think that enabling APIs that have no user isn't necessarily
the best idea, as it's prone to design issues, but I don't mind doing so
if you think it needs to be done now.

> > I can extend the flag to all other transaction types
> > (except for the cyclic transaction, as it doesn't make sense there).
> 
> Yep, cyclic is a different type of transfer, it is for circular buffers.
> It could be seen as a special case of slave_sg. Some drivers actually
> create temporary sg_list in case of cyclic and use the same setup
> function to set up the transfer for slave_sg/cyclic...

Cyclic is different for historical reasons, but if I had to redesign it
today, I'd make it slave_sg + DMA_PREP_REPEAT. We obviously can't, and I
have no issue with that.
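
Purely as an illustration (names and setup assumed, not a proposal to
change the existing cyclic API), that would boil down to something like:

static int start_repeated_ring(struct dma_chan *chan,
			       struct scatterlist *sgl, unsigned int periods)
{
	struct dma_async_tx_descriptor *desc;

	/* One sg entry per period of the ring buffer, repeated forever. */
	desc = dmaengine_prep_slave_sg(chan, sgl, periods, DMA_DEV_TO_MEM,
				       DMA_PREP_REPEAT | DMA_PREP_INTERRUPT);
	if (!desc)
		return -ENOMEM;

	dmaengine_submit(desc);
	dma_async_issue_pending(chan);

	return 0;
}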

> >>>>>>>   */
> >>>>>>>  enum dma_ctrl_flags {
> >>>>>>>  	DMA_PREP_INTERRUPT = (1 << 0),
> >>>>>>> @@ -186,6 +192,7 @@ enum dma_ctrl_flags {
> >>>>>>>  	DMA_PREP_FENCE = (1 << 5),
> >>>>>>>  	DMA_CTRL_REUSE = (1 << 6),
> >>>>>>>  	DMA_PREP_CMD = (1 << 7),
> >>>>>>> +	DMA_PREP_REPEAT = (1 << 8),
> >>>>>>
> >>>>>> Thanks for sending this. I think this is a good proposal which Peter
> >>>>>> made for solving this issue and it has great merits, but this is
> >>>>>> incomplete.
> >>>>>>
> >>>>>> DMA_PREP_REPEAT|RELOAD should only imply repeating of transactions,
> >>>>>> nothing else. I would like to see APIs having explicit behaviour, so let
> >>>>>> us also add another flag DMA_PREP_LOAD_NEXT|NEW to indicate that the
> >>>>>> next transactions will replace the current one when submitted after calling
> >>>>>> .issue_pending().
> >>>>>>
> >>>>>> Also it makes sense to explicitly specify when the transaction should be
> >>>>>> reloaded. Rather than make a guesswork based on hardware support, we
> >>>>>> should specify the EOB/EOT in these flags as well.
> >>>>>>
> >>>>>> Next is callback notification mechanism and when it should be invoked.
> >>>>>> EOT is today indicated by DMA_PREP_INTERRUPT, EOB needs to be added.
> >>>>>>
> >>>>>> So to summarize your driver needs to invoke
> >>>>>> DMA_PREP_REPEAT|DMA_PREP_LOAD_NEXT|DMA_LOAD_EOT|DMA_PREP_INTERRUPT
> >>>>>> specifying that the transactions are repeated until the next one pops up
> >>>>>> and replaced at EOT with callbacks being invoked at EOT boundaries.
> >>>
> >>> Peter, what do you think ?
> >>
> >> Well, I'm in between ;)
> >>
> >> You have a dedicated DMA which can do one thing - to service display.
> >> DMAengine provides generic API for DMA use users.
> >>
> >> The DMA_PREP_REPEAT is a new flag for a descriptor, imho it can be
> >> introduced without breaking anything which exists today.
> >>
> >> DMA_PREP_REPEAT - the descriptor should be repeated until the channel is
> >> terminated with terminate_all.
> > 
> > No concern about DMA_PREP_REPEAT, I like the idea.
> > 
> >> DMA_PREP_LOAD_EOT - the descriptor should be loaded at the next EOT of
> >> the currently running transfer, if any, otherwise start.
> > 
> > Why is this needed ? Why can't this be the default behaviour ? What the
> > use case for queuing a descriptor with DMA_PREP_REPEAT and *not* setting
> > DMA_PREP_LOAD_EOT on the next one ? If a client queues the next
> > descriptor without DMA_PREP_LOAD_EOT, the DMA engine will keep repeating
> > the previous one, the client will hang forever waiting for the switch to
> > the new descriptor that will never happen, and no error message or other
> > diagnostic information will be provided. This creates an API that as no
> > purpose (or at least no specified purpose, if there's an actual use case
> > for not specifying that flag, I'm willing to discuss it) and makes it
> > easy to shoot oneself in the foot. A good API should be impossible to
> > misuse (this can of course not always be achieved in practice, but it's
> > still a good goal to aim for).
> 
> If no DMA_PREP_LOAD_EOT is set then yes, the running transfer will not
> move towards, like how the cyclic is working.
> and...
> 
> > And this doesn't even mention DMA_PREP_LOAD_NEXT, that seems equally
> > design as a way to maximize chances that drivers will get something
> > wrong :-)
> > 
> >> DMA_PREP_INTERRUPT - as it is today. Callback at EOT (for
> >> slave_sg/interleaved/memcpy/etc, cyclic interprets this differently -
> >> callback at period elapse time).
> >>
> >> So you would set DMA_PREP_REPEAT | DMA_PREP_LOAD_EOT (|
> >> DMA_PREP_INTERRUPT if you need callbacks at EOT).
> >>
> >> The capabilities of the device/channel should tell the user if it is
> >> capable of REPEAT and LOAD_EOT.
> >> It is possible that a DMA can do repeat, but lacks the ability to do any
> >> type of LOAD_*
> > 
> > This is the kind of information I was looking for, thanks. I agree that
> > some DMA engines may not be able to replace a repeated transfer (I'm
> > using the word repeated here instead of cyclic, to avoid confusion with
> > the existing cyclic transfer type) at EOT. I however assume they would
> > all have the ability to replace it immediately, as DMA engines are
> > required to implement terminate_all(), and replacing a transfer
> > immediately can then just be a combination of terminate_all() + starting
> > the next transfer. Whether we want DMA engines to implement this
> > internally instead of having the logical on the client side (as done
> > today) is another question, and I'm not pushing in one direction or
> > another here (although I think we could explore the option of
> > implementing this in the DMA engine core).
> > 
> > Having a capability flag to report if a DMA engine supports replacing a
> > repeated transfer at EOT makes sense (no idea if we will have DMA
> > engines supporting DMA_REPEAT but not DMA_LOAD_EOT, but that's another
> > story, at least in theory it could happen). I however don't see why a
> > DMA_PREP_LOAD_EOT flag is needed, if this feature isn't supported,
> > shouldn't tx_submit() and/or issue_pending() fail when a repeated
> > transfer is queued ? Succeeding in tx_submit() and issue_pending() and
> > doing nothing with the newly queued transfers is, as I explained above,
> > a very good way to increase the chance of bugs. I don't see a reason why
> > accepting a call that we know will not perform what the caller expects
> > it to perform would be a good idea.
> 
> I would argue that the DMA_PREP_RELOAD_NOW (ASAP?) is a bit more than
> terminate_all+issue_pending.
> 
> But, DMA drivers might support neither of them, either of them or both.
> It is up to the client to pick the preferred method for it's use.
> It is not far fetched that the next DMA the client is going to be
> serviced will have different capabilities and the client needs to handle
> EOT or NOW or it might even need to have fallback to case when neither
> is supported.
> 
> I don't like excessive flags either, but based on my experience
> under-flagging can bite back sooner than later.
> 
> I'm aware that at the moment it feels like it is too explicit, but never
> underestimate the creativity of the design - and in some cases the
> constraint the design must fulfill.

I'm still very puzzled by why you think adding DMA_PREP_LOAD_EOT now is
a good idea, given that there's no existing and no foreseen use case for
not setting it. Creating an API element that is completely disconnected
from any known use case doesn't seem like good API design to me,
especially for an in-kernel API.

> >> I think this would give a nice starting point to extend on later.
> >>
> >>>>> Are you *serious* ? I feel trapped in a cross-over of Groundhog Day and
> >>>>> Brazil.
> >>>>
> >>>> Sorry, I don't understand that reference!
> >>>>
> >>>> Nevertheless, you want a behaviour which is somehow defined by your use
> >>>> and magically implies certain conditions. I do not want it that way.
> >>>> I would rather see all the flag required.
> >>>>
> >>>>>> @Peter, did I miss anything else in this..? Please send the patch for
> >>>>>> this (to start with just the headers so that Laurent can start
> >>>>>> using them) and detailed patch with documentation as follow up, I trust
> >>>>>> you two can coordinate :)
> >>>>>
> >>>>> I won't call that coordination, no. If you want to design something
> >>>>> absurd that's your call, not mine, I don't want to get involved.
> >>>>
> >>>> Your wish!

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 6/6] arm64: dts: zynqmp: Add DPDMA node
  2020-05-14  5:56   ` Michal Simek
@ 2020-05-28  2:49     ` Laurent Pinchart
  0 siblings, 0 replies; 23+ messages in thread
From: Laurent Pinchart @ 2020-05-28  2:49 UTC (permalink / raw)
  To: Michal Simek
  Cc: dmaengine, Hyun Kwon, Tejas Upadhyay, Satish Kumar Nagireddy,
	Vinod Koul, Peter Ujfalusi

Hi Michal,

On Thu, May 14, 2020 at 07:56:47AM +0200, Michal Simek wrote:
> On 13. 05. 20 18:59, Laurent Pinchart wrote:
> > Add a DT node for the DisplayPort DMA engine (DPDMA).
> > 
> > Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > ---
> >  arch/arm64/boot/dts/xilinx/zynqmp-clk-ccf.dtsi |  4 ++++
> >  arch/arm64/boot/dts/xilinx/zynqmp.dtsi         | 10 ++++++++++
> >  2 files changed, 14 insertions(+)
> > 
> > diff --git a/arch/arm64/boot/dts/xilinx/zynqmp-clk-ccf.dtsi b/arch/arm64/boot/dts/xilinx/zynqmp-clk-ccf.dtsi
> > index 9868ca15dfc5..32c4914738d9 100644
> > --- a/arch/arm64/boot/dts/xilinx/zynqmp-clk-ccf.dtsi
> > +++ b/arch/arm64/boot/dts/xilinx/zynqmp-clk-ccf.dtsi
> > @@ -57,6 +57,10 @@ &cpu0 {
> >  	clocks = <&zynqmp_clk ACPU>;
> >  };
> >  
> > +&dpdma {
> > +	clocks = <&zynqmp_clk DPDMA_REF>;
> > +};
> > +
> >  &fpd_dma_chan1 {
> >  	clocks = <&zynqmp_clk GDMA_REF>, <&zynqmp_clk LPD_LSBUS>;
> >  };
> > diff --git a/arch/arm64/boot/dts/xilinx/zynqmp.dtsi b/arch/arm64/boot/dts/xilinx/zynqmp.dtsi
> > index 26d926eb1431..2e284eb8d3c1 100644
> > --- a/arch/arm64/boot/dts/xilinx/zynqmp.dtsi
> > +++ b/arch/arm64/boot/dts/xilinx/zynqmp.dtsi
> > @@ -246,6 +246,16 @@ pmu@9000 {
> >  			};
> >  		};
> >  
> > +		dpdma: dma-controller@fd4c0000 {
> > +			compatible = "xlnx,zynqmp-dpdma";
> > +			status = "disabled";
> > +			reg = <0x0 0xfd4c0000 0x0 0x1000>;
> > +			interrupts = <0 122 4>;
> > +			interrupt-parent = <&gic>;
> > +			clock-names = "axi_clk";
> > +			#dma-cells = <1>;
> > +		};
> > +
> >  		/* GDMA */
> >  		fpd_dma_chan1: dma@fd500000 {
> >  			status = "disabled";
> > 
> 
> Acked-by: Michal Simek <michal.simek@xilinx.com>
> 
> Feel free to take it with this series. Or let me know if you want me to
> take it via my soc tree.

If we ever find an agreement on the DMA engine API :-)

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 3/6] dmaengine: Add support for repeating transactions
  2020-05-28  2:10                 ` Laurent Pinchart
@ 2020-06-01 11:14                   ` Peter Ujfalusi
  2020-06-01 11:49                     ` Laurent Pinchart
  0 siblings, 1 reply; 23+ messages in thread
From: Peter Ujfalusi @ 2020-06-01 11:14 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: dmaengine, Vinod Koul, Michal Simek, Hyun Kwon, Tejas Upadhyay,
	Satish Kumar Nagireddy

[-- Attachment #1: Type: text/plain, Size: 3673 bytes --]

Hi Laurent,

On 28/05/2020 5.10, Laurent Pinchart wrote:
>>> As mentioned in the commit message, I plan to extend that, I just didn't
>>> want to add the checks to all the prepare operation wrappers until an
>>> agreement on the approach would be reached. I also thought it would be
>>> good to not allow this API for other transaction types until use cases
>>> arise, in order to force upstream discussions instead of silently
>>> abusing the API :-)
>>
>> I would not object if slave_sg and memcpy got the same treatment. If the
>> DMA driver did not set the DMA_REPEAT then clients can not use this
>> feature anyways.
> 
> Would you not object, or would you prefer if it was done in v5 ? :-)

DMA_REPEAT is a generic flag, not limited to only interleaved, but you
are going to be the first user of it with interleaved.

> Overall I think that enabling APIs that have no user isn't necessarily
> the best idea, as it's prone to design issues, but I don't mind doing so
> if you think it needs to be done now.

We would get the support in one go with the same commit. I don't think
it makes much sense to add slave_sg later, then memcpy another time.
True, there might be no users for them for some time, but their presence
might invite users?

>>> I can extend the flag to all other transaction types
>>> (except for the cyclic transaction, as it doesn't make sense there).
>>
>> Yep, cyclic is a different type of transfer, it is for circular buffers.
>> It could be seen as a special case of slave_sg. Some drivers actually
>> create temporary sg_list in case of cyclic and use the same setup
>> function to set up the transfer for slave_sg/cyclic...
> 
> Cyclic is different for historical reasons, but if I had to redesign it
> today, I'd make it slave_sg + DMA_PREP_REPEAT. We obviously can't, and I
> have no issue with that.

Which should be accompanied by a flag to tell that the sg_list covers a
circular buffer, to save all drivers from having to check whether the
sg_list describes a circular buffer (current cyclic) or a real sg.
Some DMAs can only do repeat on circular buffers (omap-dma, tegra, etc).

>> But, DMA drivers might support neither of them, either of them or both.
>> It is up to the client to pick the preferred method for it's use.
>> It is not far fetched that the next DMA the client is going to be
>> serviced will have different capabilities and the client needs to handle
>> EOT or NOW or it might even need to have fallback to case when neither
>> is supported.
>>
>> I don't like excessive flags either, but based on my experience
>> under-flagging can bite back sooner than later.
>>
>> I'm aware that at the moment it feels like it is too explicit, but never
>> underestimate the creativity of the design - and in some cases the
>> constraint the design must fulfill.
> 
> I'm still very puzzled by why you think adding DMA_PREP_LOAD_EOT now is
> a good idea, given that there's no existing and no foreseen use case for
> not setting it. Creating an API element that is completely disconnected
> from any known use case doesn't seem like good API design to me,
> especially for an in-kernel API.

If we document that DMA_REPEAT covers REPEAT _and_ LOAD_EOT with one
flag, then how could other drivers implement REPEAT if they cannot
support LOAD_EOT?
Should they do DMA_REPEAT | NOT_LOAD_EOT | LOAD_ASAP?

LOAD_EOT is a feature the HW can or cannot support, and it is an
operation mode that you may or may not want to use.

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

[-- Attachment #2: pEpkey.asc --]
[-- Type: application/pgp-keys, Size: 1783 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 3/6] dmaengine: Add support for repeating transactions
  2020-06-01 11:14                   ` Peter Ujfalusi
@ 2020-06-01 11:49                     ` Laurent Pinchart
  2020-06-03 10:51                       ` Peter Ujfalusi
  0 siblings, 1 reply; 23+ messages in thread
From: Laurent Pinchart @ 2020-06-01 11:49 UTC (permalink / raw)
  To: Peter Ujfalusi
  Cc: dmaengine, Vinod Koul, Michal Simek, Hyun Kwon, Tejas Upadhyay,
	Satish Kumar Nagireddy

Hi Peter,

On Mon, Jun 01, 2020 at 02:14:03PM +0300, Peter Ujfalusi wrote:
> On 28/05/2020 5.10, Laurent Pinchart wrote:
> >>> As mentioned in the commit message, I plan to extend that, I just didn't
> >>> want to add the checks to all the prepare operation wrappers until an
> >>> agreement on the approach would be reached. I also thought it would be
> >>> good to not allow this API for other transaction types until use cases
> >>> arise, in order to force upstream discussions instead of silently
> >>> abusing the API :-)
> >>
> >> I would not object if slave_sg and memcpy got the same treatment. If the
> >> DMA driver did not set the DMA_REPEAT then clients can not use this
> >> feature anyways.
> > 
> > Would you not object, or would you prefer if it was done in v5 ? :-)
> 
> DMA_REPEAT is a generic flag, not limited to only interleaved, but you
> are going to be the first user of it with interleaved.
> 
> > Overall I think that enabling APIs that have no user isn't necessarily
> > the best idea, as it's prone to design issues, but I don't mind doing so
> > if you think it needs to be done now.
> 
> We would get the support in one go with the same commit. I don't think
> it makes much sense to add slave_sg later, then memcpy another time.
> True, there might be no users for them for some time, but their presents
> might invite users?

My approach to API design is that an API designed without (at least) one
user is very prone to be a bad API. As I said before I don't mind
enabling support for slave_sg and memcpy today already, even if I don't
think it's a good idea. I want to get my use case supported, and I've
given up on what I would consider a good API :-) That's fine,
maintainers are the ones who have to support APIs and the design choices
behind them in the longer term, and I'm not a subsystem maintainer here.
I tried to prevent what I think may become a case of shooting in the
foot, but I could be wrong. Only the future will tell.

> >>> I can extend the flag to all other transaction types
> >>> (except for the cyclic transaction, as it doesn't make sense there).
> >>
> >> Yep, cyclic is a different type of transfer, it is for circular buffers.
> >> It could be seen as a special case of slave_sg. Some drivers actually
> >> create temporary sg_list in case of cyclic and use the same setup
> >> function to set up the transfer for slave_sg/cyclic...
> > 
> > Cyclic is different for historical reasons, but if I had to redesign it
> > today, I'd make it slave_sg + DMA_PREP_REPEAT. We obviously can't, and I
> > have no issue with that.
> 
> Which should be accompanied with a flag to tell that the sg_list is
> covering a circular buffer to save all drivers to check the sg_list that
> it is circular buffer (current cyclic) or really sg.
> Some DMA can only do repeat on circular buffers (omap-dma, tegra, etc).

Isn't DMA_PREP_REPEAT that flag ?

> >> But, DMA drivers might support neither of them, either of them or both.
> >> It is up to the client to pick the preferred method for it's use.
> >> It is not far fetched that the next DMA the client is going to be
> >> serviced will have different capabilities and the client needs to handle
> >> EOT or NOW or it might even need to have fallback to case when neither
> >> is supported.
> >>
> >> I don't like excessive flags either, but based on my experience
> >> under-flagging can bite back sooner than later.
> >>
> >> I'm aware that at the moment it feels like it is too explicit, but never
> >> underestimate the creativity of the design - and in some cases the
> >> constraint the design must fulfill.
> > 
> > I'm still very puzzled by why you think adding DMA_PREP_LOAD_EOT now is
> > a good idea, given that there's no existing and no foreseen use case for
> > not setting it. Creating an API element that is completely disconnected
> > from any known use case doesn't seem like good API design to me,
> > especially for an in-kernel API.
> 
> If we document that DMA_REPEAT covers REPEAT _and_ LOAD_EOT with one
> flag then how would other drivers can implement REPEAT if they can not
> support LOAD_EOT?
> They should do DMA_REPEAT | NOT_LOAD_EOT | LOAD_ASAP?

As stated before, I think a DMA_LOAD_EOT capability is useful. My
concern is about DMA_PREP_LOAD_EOT for which I can't see use cases. I've
added DMA_PREP_LOAD_EOT in the last patch series, and my DMA engine
driver ignores the transaction when DMA_PREP_LOAD_EOT is not set, as
required. It works fine as my client always sets it.

I'd expect Vinod or you to write the documentation though, as writing
code for an API I don't believe in is one thing, but writing documentation
to explain the rationale behind the API design will be more complex when
I don't believe there's any rationale :-)

> LOAD_EOT is a feature the HW can or can not support and it is an
> operation mode that you want to use or do not want to use.

DMA_PREP_REPEAT for the EOT mode, DMA_PREP_REPEAT | DMA_PREP_LOAD_NOW
for the immediate mode would work too, and wouldn't have the drawback of
artificially creating a case (!EOT && !NOW) that would fail.

I'm tired of arguing about this and trying to prevent mistakes from
being made, so could you please review the latest patch series, and tell
me if there's anything missing for the implementation ? Feel free to
provide that feedback as patches on top.

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 3/6] dmaengine: Add support for repeating transactions
  2020-06-01 11:49                     ` Laurent Pinchart
@ 2020-06-03 10:51                       ` Peter Ujfalusi
  2020-06-03 16:06                         ` Vinod Koul
  0 siblings, 1 reply; 23+ messages in thread
From: Peter Ujfalusi @ 2020-06-03 10:51 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: dmaengine, Vinod Koul, Michal Simek, Hyun Kwon, Tejas Upadhyay,
	Satish Kumar Nagireddy

[-- Attachment #1: Type: text/plain, Size: 7411 bytes --]



On 01/06/2020 14.49, Laurent Pinchart wrote:
> Hi Peter,
> 
> On Mon, Jun 01, 2020 at 02:14:03PM +0300, Peter Ujfalusi wrote:
>> On 28/05/2020 5.10, Laurent Pinchart wrote:
>>>>> As mentioned in the commit message, I plan to extend that, I just didn't
>>>>> want to add the checks to all the prepare operation wrappers until an
>>>>> agreement on the approach would be reached. I also thought it would be
>>>>> good to not allow this API for other transaction types until use cases
>>>>> arise, in order to force upstream discussions instead of silently
>>>>> abusing the API :-)
>>>>
>>>> I would not object if slave_sg and memcpy got the same treatment. If the
>>>> DMA driver did not set the DMA_REPEAT then clients can not use this
>>>> feature anyways.
>>>
>>> Would you not object, or would you prefer if it was done in v5 ? :-)
>>
>> DMA_REPEAT is a generic flag, not limited to only interleaved, but you
>> are going to be the first user of it with interleaved.
>>
>>> Overall I think that enabling APIs that have no user isn't necessarily
>>> the best idea, as it's prone to design issues, but I don't mind doing so
>>> if you think it needs to be done now.
>>
>> We would get the support in one go with the same commit. I don't think
>> it makes much sense to add slave_sg later, then memcpy another time.
>> True, there might be no users for them for some time, but their presents
>> might invite users?
> 
> My approach to API design is that an API designed without (at least) one
> user is very prone to be a bad API. As I said before I don't mind
> enabling support for slave_sg and memcpy today already, even if I don't
> think it's a good idea. I want to get my use case supported, and I've
> given up on what I would consider a good API :-) That's fine,
> maintainers are the ones who have to support APIs and the design choices
> behind them in the longer term, and I'm not a subsystem maintainer here.
> I tried to prevent what I think may become a case of shooting in the
> foot, but I could be wrong. Only the future will tell.

Yes, we will see in the longer run.

>>>>> I can extend the flag to all other transaction types
>>>>> (except for the cyclic transaction, as it doesn't make sense there).
>>>>
>>>> Yep, cyclic is a different type of transfer, it is for circular buffers.
>>>> It could be seen as a special case of slave_sg. Some drivers actually
>>>> create temporary sg_list in case of cyclic and use the same setup
>>>> function to set up the transfer for slave_sg/cyclic...
>>>
>>> Cyclic is different for historical reasons, but if I had to redesign it
>>> today, I'd make it slave_sg + DMA_PREP_REPEAT. We obviously can't, and I
>>> have no issue with that.
>>
>> Which should be accompanied with a flag to tell that the sg_list is
>> covering a circular buffer to save all drivers to check the sg_list that
>> it is circular buffer (current cyclic) or really sg.
>> Some DMA can only do repeat on circular buffers (omap-dma, tegra, etc).
> 
> Isn't DMA_PREP_REPEAT that flag ?

Not really. It tells that the descriptor should be repeated. In case of
slave_sg the list could describe one block of memory, split up into
'periods', or it could be a list of chunks scattered all over the place.

A circular buffer can be described with an sg_list.
An sg_list does not necessarily describe a circular buffer.

>>>> But, DMA drivers might support neither of them, either of them or both.
>>>> It is up to the client to pick the preferred method for it's use.
>>>> It is not far fetched that the next DMA the client is going to be
>>>> serviced will have different capabilities and the client needs to handle
>>>> EOT or NOW or it might even need to have fallback to case when neither
>>>> is supported.
>>>>
>>>> I don't like excessive flags either, but based on my experience
>>>> under-flagging can bite back sooner than later.
>>>>
>>>> I'm aware that at the moment it feels like it is too explicit, but never
>>>> underestimate the creativity of the design - and in some cases the
>>>> constraint the design must fulfill.
>>>
>>> I'm still very puzzled by why you think adding DMA_PREP_LOAD_EOT now is
>>> a good idea, given that there's no existing and no foreseen use case for
>>> not setting it. Creating an API element that is completely disconnected
>>> from any known use case doesn't seem like good API design to me,
>>> especially for an in-kernel API.
>>
>> If we document that DMA_REPEAT covers REPEAT _and_ LOAD_EOT with one
>> flag then how would other drivers can implement REPEAT if they can not
>> support LOAD_EOT?
>> They should do DMA_REPEAT | NOT_LOAD_EOT | LOAD_ASAP?
> 
> As stated before, I think a DMA_LOAD_EOT capability is useful. My
> concern is about DMA_PREP_LOAD_EOT for which I can't see use cases. I've
> added DMA_PREP_LOAD_EOT in the last patch series, and my DMA engine
> driver ignores the transaction when DMA_PREP_LOAD_EOT is not set, as
> required. It works fine as the my client always sets it.

Thanks.

> I'd expect Vinod or you to write the documentation though, as writing
> code for an API I don't believe in is one thing, writing documentation
> to explain the rationale behind the API design will be more complex

Vinod can correct me, but for the capabilities:
DMA_REPEAT: the controller (and driver) supports repeating the
	descriptor. It can be terminated with terminate_all
DMA_LOAD_EOT: the controller (and driver) supports loading the next
	issued transfer on a channel which is running DMA_REPEAT
	descriptor. Iow, instead of reloading the running transfer, it
	moves to the next one.
DMA_LOAD_NOW: the controller (and driver) supports aborting the
	active descriptor (either DMA_REPEAT or non repeated one) and
	moving to the next issued transfer without clients needing to
	use terminate_all.
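
In driver terms this could translate to something like the following at
probe time (a sketch only, ddev being the driver's struct dma_device;
the capability names are the ones proposed in this thread, not an
existing API):

static void xxx_init_caps(struct dma_device *ddev)
{
	dma_cap_set(DMA_INTERLEAVE, ddev->cap_mask);
	dma_cap_set(DMA_REPEAT, ddev->cap_mask);
	dma_cap_set(DMA_LOAD_EOT, ddev->cap_mask);
	/*
	 * DMA_LOAD_NOW would only be set by engines that can abort the
	 * active descriptor and replace it in place.
	 */
}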

> when I don't believe there's any rationale :-)

Sure, you have a specific DMA, which does one thing and one thing only.
When a subsystem decides to create a generic DMA layer on top of
DMAengine, for example to get rid of the duplicated code in the drivers,
then this generic code does need information to decide how the servicing
DMA should be used for optimal performance and quality.
Some DMAs (and drivers) might have slightly different capabilities.

>> LOAD_EOT is a feature the HW can or can not support and it is an
>> operation mode that you want to use or do not want to use.
> 
> DMA_PREP_REPEAT for the EOT mode, DMA_PREP_REPEAT | DMA_PREP_LOAD_NOW
> for the immediate mode would work too, and wouldn't have the drawback of
> artificially creating a case (!EOT && !NOW) that would fail.

But what if a DMA does not support LOAD_EOT at all? What if it does not
support LOAD_NOW either?
If anything, LOAD_NOW sounds more like a default expectation than
LOAD_EOT.
Yes, I know. The display use case needs LOAD_EOT to avoid artifacts on
screen, but DMA_REPEAT is not only for displays.

> I'm tired of arguing about this and trying to prevent mistakes from
> being made, so could you please review the latest patch series, and tell
> me if there's anything missing for the implementation ? Feel free to
> provide that feedback as patches on top.
> 

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

[-- Attachment #2: pEpkey.asc --]
[-- Type: application/pgp-keys, Size: 1783 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 3/6] dmaengine: Add support for repeating transactions
  2020-06-03 10:51                       ` Peter Ujfalusi
@ 2020-06-03 16:06                         ` Vinod Koul
  2020-06-16 21:39                           ` Laurent Pinchart
  0 siblings, 1 reply; 23+ messages in thread
From: Vinod Koul @ 2020-06-03 16:06 UTC (permalink / raw)
  To: Peter Ujfalusi
  Cc: Laurent Pinchart, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Laurent, Peter,

On 03-06-20, 13:51, Peter Ujfalusi wrote:
> On 01/06/2020 14.49, Laurent Pinchart wrote:
> > Hi Peter,
> > 
> > On Mon, Jun 01, 2020 at 02:14:03PM +0300, Peter Ujfalusi wrote:
> >> On 28/05/2020 5.10, Laurent Pinchart wrote:
> >>>>> As mentioned in the commit message, I plan to extend that, I just didn't
> >>>>> want to add the checks to all the prepare operation wrappers until an
> >>>>> agreement on the approach would be reached. I also thought it would be
> >>>>> good to not allow this API for other transaction types until use cases
> >>>>> arise, in order to force upstream discussions instead of silently
> >>>>> abusing the API :-)
> >>>>
> >>>> I would not object if slave_sg and memcpy got the same treatment. If the
> >>>> DMA driver did not set the DMA_REPEAT then clients can not use this
> >>>> feature anyways.
> >>>
> >>> Would you not object, or would you prefer if it was done in v5 ? :-)
> >>
> >> DMA_REPEAT is a generic flag, not limited to only interleaved, but you
> >> are going to be the first user of it with interleaved.
> >>
> >>> Overall I think that enabling APIs that have no user isn't necessarily
> >>> the best idea, as it's prone to design issues, but I don't mind doing so
> >>> if you think it needs to be done now.
> >>
> >> We would get the support in one go with the same commit. I don't think
> >> it makes much sense to add slave_sg later, then memcpy another time.
> >> True, there might be no users for them for some time, but their presents
> >> might invite users?
> > 
> > My approach to API design is that an API designed without (at least) one
> > user is very prone to be a bad API. As I said before I don't mind
> > enabling support for slave_sg and memcpy today already, even if I don't
> > think it's a good idea. I want to get my use case supported, and I've
> > given up on what I would consider a good API :-) That's fine,
> > maintainers are the ones who have to support APIs and the design choices
> > behind them in the longer term, and I'm not a subsystem maintainer here.
> > I tried to prevent what I think may become a case of shooting in the
> > foot, but I could be wrong. Only the future will tell.
> 
> Yes, we will see in the longer run.

I am not sure I would like to add an API without a user. We can add some
notes in the documentation for this and future ideas on how to add this,
but an API without a user doesn't sound right to me.

> >>>>> I can extend the flag to all other transaction types
> >>>>> (except for the cyclic transaction, as it doesn't make sense there).
> >>>>
> >>>> Yep, cyclic is a different type of transfer, it is for circular buffers.
> >>>> It could be seen as a special case of slave_sg. Some drivers actually
> >>>> create temporary sg_list in case of cyclic and use the same setup
> >>>> function to set up the transfer for slave_sg/cyclic...
> >>>
> >>> Cyclic is different for historical reasons, but if I had to redesign it
> >>> today, I'd make it slave_sg + DMA_PREP_REPEAT. We obviously can't, and I
> >>> have no issue with that.
> >>
> >> Which should be accompanied with a flag to tell that the sg_list is
> >> covering a circular buffer to save all drivers to check the sg_list that
> >> it is circular buffer (current cyclic) or really sg.
> >> Some DMA can only do repeat on circular buffers (omap-dma, tegra, etc).
> > 
> > Isn't DMA_PREP_REPEAT that flag ?
> 
> Not really. It tells that the descriptor should be repeated. In case of
> slave_sg the list could describe one block of memory, split up to
> 'periods' or it could be a list scattered chunks all over the place.
> 
> circular buffer can be described with sg_list.
> sg_list is not necessary describes a circular buffer.
> 
> >>>> But, DMA drivers might support neither of them, either of them or both.
> >>>> It is up to the client to pick the preferred method for it's use.
> >>>> It is not far fetched that the next DMA the client is going to be
> >>>> serviced will have different capabilities and the client needs to handle
> >>>> EOT or NOW or it might even need to have fallback to case when neither
> >>>> is supported.
> >>>>
> >>>> I don't like excessive flags either, but based on my experience
> >>>> under-flagging can bite back sooner than later.
> >>>>
> >>>> I'm aware that at the moment it feels like it is too explicit, but never
> >>>> underestimate the creativity of the design - and in some cases the
> >>>> constraint the design must fulfill.
> >>>
> >>> I'm still very puzzled by why you think adding DMA_PREP_LOAD_EOT now is
> >>> a good idea, given that there's no existing and no foreseen use case for
> >>> not setting it. Creating an API element that is completely disconnected
> >>> from any known use case doesn't seem like good API design to me,
> >>> especially for an in-kernel API.
> >>
> >> If we document that DMA_REPEAT covers REPEAT _and_ LOAD_EOT with one
> >> flag then how would other drivers can implement REPEAT if they can not
> >> support LOAD_EOT?
> >> They should do DMA_REPEAT | NOT_LOAD_EOT | LOAD_ASAP?
> > 
> > As stated before, I think a DMA_LOAD_EOT capability is useful. My
> > concern is about DMA_PREP_LOAD_EOT for which I can't see use cases. I've
> > added DMA_PREP_LOAD_EOT in the last patch series, and my DMA engine
> > driver ignores the transaction when DMA_PREP_LOAD_EOT is not set, as
> > required. It works fine as the my client always sets it.
> 
> Thanks.
> 
> > I'd expect Vinod or you to write the documentation though, as writing
> > code for an API I don't believe in is one thing, writing documentation
> > to explain the rationale behind the API design will be more complex
> 
> Vinod can correct me, but for the capabilities:
> DMA_REPEAT: the controller (and driver) supports repeating the
> 	descriptor. It can be terminated with terminate_all
> DMA_LOAD_EOT: the controller (and driver) supports loading the next
> 	issued transfer on a channel which is running DMA_REPEAT
> 	descriptor. Iow, instead of reloading the running transfer, it
> 	moves to the next one.
> DMA_LOAD_NOW: the controller (and driver) supports aborting the
> 	active descriptor (either DMA_REPEAT or non repeated one) and
> 	moving to the next issued transfer without clients needing to
> 	use terminate_all.

Sounds right to me.

> > when I don't believe there's any rationale :-)
> 
> Sure, you have a specific DMA, which does one thing and one thing only.
> When a subsystem decides to create a generic DMA layer on top of
> DMAengine for example to get rid of the duplicated code in the drivers
> then this generic code does need information to decide how the servicing
> DMA should be used for optimal performance and quality.
> Some DMAs (and drivers) might have slightly different capabilities.
> 
> >> LOAD_EOT is a feature the HW can or can not support and it is an
> >> operation mode that you want to use or do not want to use.
> > 
> > DMA_PREP_REPEAT for the EOT mode, DMA_PREP_REPEAT | DMA_PREP_LOAD_NOW
> > for the immediate mode would work too, and wouldn't have the drawback of
> > artificially creating a case (!EOT && !NOW) that would fail.
> 
> But if a DMA does not support LOAD_EOT at all? If it did not support
> LOAD_NOW either?
> But if anything the LOAD_NOW sounds more of a default expectation than
> LOAD_EOT.
> Yes, I know. The display use case needs LOAD_EOT to avoid artifacts on
> screen, but DMA_REPEAT is not only for displays.

Correct, a user can request LOAD_NOW or LOAD_EOT, and the driver should
be able to handle it (as long as the h/w supports it) and act
accordingly.

The dmaengine layer and drivers are not specific to one interface or one
user; the idea is to write generic dmaengine drivers catering to
different users, so supporting the different flags from the driver PoV
as well as the dmaengine framework PoV is required.

-- 
~Vinod

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 3/6] dmaengine: Add support for repeating transactions
  2020-06-03 16:06                         ` Vinod Koul
@ 2020-06-16 21:39                           ` Laurent Pinchart
  2020-06-23  9:47                             ` Laurent Pinchart
  0 siblings, 1 reply; 23+ messages in thread
From: Laurent Pinchart @ 2020-06-16 21:39 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Vinod and Peter,

On Wed, Jun 03, 2020 at 09:36:09PM +0530, Vinod Koul wrote:
> On 03-06-20, 13:51, Peter Ujfalusi wrote:
> > On 01/06/2020 14.49, Laurent Pinchart wrote:
> > > On Mon, Jun 01, 2020 at 02:14:03PM +0300, Peter Ujfalusi wrote:
> > >> On 28/05/2020 5.10, Laurent Pinchart wrote:
> > >>>>> As mentioned in the commit message, I plan to extend that, I just didn't
> > >>>>> want to add the checks to all the prepare operation wrappers until an
> > >>>>> agreement on the approach would be reached. I also thought it would be
> > >>>>> good to not allow this API for other transaction types until use cases
> > >>>>> arise, in order to force upstream discussions instead of silently
> > >>>>> abusing the API :-)
> > >>>>
> > >>>> I would not object if slave_sg and memcpy got the same treatment. If the
> > >>>> DMA driver did not set the DMA_REPEAT then clients can not use this
> > >>>> feature anyways.
> > >>>
> > >>> Would you not object, or would you prefer if it was done in v5 ? :-)
> > >>
> > >> DMA_REPEAT is a generic flag, not limited to only interleaved, but you
> > >> are going to be the first user of it with interleaved.
> > >>
> > >>> Overall I think that enabling APIs that have no user isn't necessarily
> > >>> the best idea, as it's prone to design issues, but I don't mind doing so
> > >>> if you think it needs to be done now.
> > >>
> > >> We would get the support in one go with the same commit. I don't think
> > >> it makes much sense to add slave_sg later, then memcpy another time.
> > >> True, there might be no users for them for some time, but their presents
> > >> might invite users?
> > > 
> > > My approach to API design is that an API designed without (at least) one
> > > user is very prone to be a bad API. As I said before I don't mind
> > > enabling support for slave_sg and memcpy today already, even if I don't
> > > think it's a good idea. I want to get my use case supported, and I've
> > > given up on what I would consider a good API :-) That's fine,
> > > maintainers are the ones who have to support APIs and the design choices
> > > behind them in the longer term, and I'm not a subsystem maintainer here.
> > > I tried to prevent what I think may become a case of shooting in the
> > > foot, but I could be wrong. Only the future will tell.
> > 
> > Yes, we will see in the longer run.
> 
> I am not sure I would like to add an API without a user, we can add some
> notes in documentation for this and future ideas on how to add this, but
> an API without user doesn't sound right to me.

That's my preference as well. Peter, are you OK with that ?

> > >>>>> I can extend the flag to all other transaction types
> > >>>>> (except for the cyclic transaction, as it doesn't make sense there).
> > >>>>
> > >>>> Yep, cyclic is a different type of transfer, it is for circular buffers.
> > >>>> It could be seen as a special case of slave_sg. Some drivers actually
> > >>>> create temporary sg_list in case of cyclic and use the same setup
> > >>>> function to set up the transfer for slave_sg/cyclic...
> > >>>
> > >>> Cyclic is different for historical reasons, but if I had to redesign it
> > >>> today, I'd make it slave_sg + DMA_PREP_REPEAT. We obviously can't, and I
> > >>> have no issue with that.
> > >>
> > >> Which should be accompanied with a flag to tell that the sg_list is
> > >> covering a circular buffer to save all drivers to check the sg_list that
> > >> it is circular buffer (current cyclic) or really sg.
> > >> Some DMA can only do repeat on circular buffers (omap-dma, tegra, etc).
> > > 
> > > Isn't DMA_PREP_REPEAT that flag ?
> > 
> > Not really. It tells that the descriptor should be repeated. In case of
> > slave_sg the list could describe one block of memory, split up to
> > 'periods' or it could be a list scattered chunks all over the place.
> > 
> > circular buffer can be described with sg_list.
> > sg_list is not necessary describes a circular buffer.
> > 
> > >>>> But, DMA drivers might support neither of them, either of them or both.
> > >>>> It is up to the client to pick the preferred method for it's use.
> > >>>> It is not far fetched that the next DMA the client is going to be
> > >>>> serviced will have different capabilities and the client needs to handle
> > >>>> EOT or NOW or it might even need to have fallback to case when neither
> > >>>> is supported.
> > >>>>
> > >>>> I don't like excessive flags either, but based on my experience
> > >>>> under-flagging can bite back sooner than later.
> > >>>>
> > >>>> I'm aware that at the moment it feels like it is too explicit, but never
> > >>>> underestimate the creativity of the design - and in some cases the
> > >>>> constraint the design must fulfill.
> > >>>
> > >>> I'm still very puzzled by why you think adding DMA_PREP_LOAD_EOT now is
> > >>> a good idea, given that there's no existing and no foreseen use case for
> > >>> not setting it. Creating an API element that is completely disconnected
> > >>> from any known use case doesn't seem like good API design to me,
> > >>> especially for an in-kernel API.
> > >>
> > >> If we document that DMA_REPEAT covers REPEAT _and_ LOAD_EOT with one
> > >> flag, then how could other drivers implement REPEAT if they cannot
> > >> support LOAD_EOT?
> > >> Should they do DMA_REPEAT | NOT_LOAD_EOT | LOAD_ASAP?
> > > 
> > > As stated before, I think a DMA_LOAD_EOT capability is useful. My
> > > concern is about DMA_PREP_LOAD_EOT for which I can't see use cases. I've
> > > added DMA_PREP_LOAD_EOT in the last patch series, and my DMA engine
> > > driver ignores the transaction when DMA_PREP_LOAD_EOT is not set, as
> > > required. It works fine as my client always sets it.
> > 
> > Thanks.
> > 
> > > I'd expect Vinod or you to write the documentation though, as writing
> > > code for an API I don't believe in is one thing, writing documentation
> > > to explain the rationale behind the API design will be more complex
> > 
> > Vinod can correct me, but for the capabilities:
> > DMA_REPEAT: the controller (and driver) supports repeating the
> > 	descriptor. It can be terminated with terminate_all.
> > DMA_LOAD_EOT: the controller (and driver) supports loading the next
> > 	issued transfer on a channel which is running a DMA_REPEAT
> > 	descriptor. IOW, instead of reloading the running transfer, it
> > 	moves to the next one.
> > DMA_LOAD_NOW: the controller (and driver) supports aborting the
> > 	active descriptor (either a DMA_REPEAT or a non-repeated one) and
> > 	moving to the next issued transfer without clients needing to
> > 	use terminate_all.
> 
> Sounds right to me.

For the same reason as above, my latest patch series doesn't include
DMA_LOAD_NOW, as that would be an API with no user. Vinod, is that OK
with you ?
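
For the record, here's roughly how I expect a display client to use the
three capabilities listed above. This is a minimal sketch, assuming
DMA_REPEAT and DMA_LOAD_EOT end up as capability bits in the channel
device's cap_mask and DMA_PREP_REPEAT / DMA_PREP_LOAD_EOT as
dma_ctrl_flags, as proposed in the series. The function and variable
names (display_issue_frame, fb_dma_addr, width, height, stride, bpp)
are made up for illustration.

#include <linux/dmaengine.h>
#include <linux/slab.h>

static int display_issue_frame(struct dma_chan *chan, dma_addr_t fb_dma_addr,
			       unsigned int width, unsigned int height,
			       unsigned int stride, unsigned int bpp)
{
	struct dma_interleaved_template *xt;
	struct dma_async_tx_descriptor *desc;

	/*
	 * The channel must be able to repeat a frame on its own and to
	 * switch to the next issued frame at the end of a transfer only.
	 */
	if (!dma_has_cap(DMA_REPEAT, chan->device->cap_mask) ||
	    !dma_has_cap(DMA_LOAD_EOT, chan->device->cap_mask))
		return -EOPNOTSUPP;

	xt = kzalloc(sizeof(*xt) + sizeof(xt->sgl[0]), GFP_KERNEL);
	if (!xt)
		return -ENOMEM;

	xt->dir = DMA_MEM_TO_DEV;
	xt->src_start = fb_dma_addr;
	xt->src_inc = true;
	xt->src_sgl = true;
	xt->numf = height;			/* one frame = 'height' lines */
	xt->frame_size = 1;
	xt->sgl[0].size = width * bpp;		/* pixel data per line, in bytes */
	xt->sgl[0].icg = stride - width * bpp;	/* padding between lines */

	desc = dmaengine_prep_interleaved_dma(chan, xt,
					      DMA_PREP_REPEAT |
					      DMA_PREP_LOAD_EOT);
	/* Assuming the provider copies what it needs from the template. */
	kfree(xt);
	if (!desc)
		return -ENOMEM;

	dmaengine_submit(desc);
	dma_async_issue_pending(chan);

	return 0;
}

Queueing the next frame the same way then makes the engine switch to it
at the end of the current frame, which is the behaviour the display
needs to avoid tearing.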

> > > when I don't believe there's any rationale :-)
> > 
> > Sure, you have a specific DMA, which does one thing and one thing only.
> > When a subsystem decides to create a generic DMA layer on top of
> > DMAengine, for example to get rid of the duplicated code in the drivers,
> > this generic code does need information to decide how the servicing
> > DMA should be used for optimal performance and quality.
> > Some DMAs (and drivers) might have slightly different capabilities.
> > 
> > >> LOAD_EOT is a feature the HW can or can not support and it is an
> > >> operation mode that you want to use or do not want to use.
> > > 
> > > DMA_PREP_REPEAT for the EOT mode, DMA_PREP_REPEAT | DMA_PREP_LOAD_NOW
> > > for the immediate mode would work too, and wouldn't have the drawback of
> > > artificially creating a case (!EOT && !NOW) that would fail.
> > 
> > But if a DMA does not support LOAD_EOT at all? If it did not support
> > LOAD_NOW either?

If the driver doesn't support EOT, then REPEAT without EOT would be
rejected by the prepare operation. If the driver supports neither EOT
nor NOW, then it wouldn't support REPEAT :-) In any case, both EOT and NOW
would be rejected, and so would REPEAT (REPEAT without EOT or NOW
doesn't make much sense).
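
To make that concrete, the check at the top of my prepare operation
looks roughly like the sketch below. The foo_* names are placeholders;
the real code is in the DPDMA driver patch.

static struct dma_async_tx_descriptor *
foo_prep_interleaved_dma(struct dma_chan *dchan,
			 struct dma_interleaved_template *xt,
			 unsigned long flags)
{
	/*
	 * This controller can only repeat a descriptor and reload the next
	 * issued one at the end of a transfer, so both flags are mandatory:
	 * a non-repeated transfer, or a repeated transfer without the EOT
	 * semantics, is rejected at prepare time.
	 */
	if (!(flags & DMA_PREP_REPEAT) || !(flags & DMA_PREP_LOAD_EOT))
		return NULL;

	return foo_prep_one_frame(dchan, xt, flags);
}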

> > But if anything, LOAD_NOW sounds more like a default expectation than
> > LOAD_EOT.
> > Yes, I know. The display use case needs LOAD_EOT to avoid artifacts on
> > screen, but DMA_REPEAT is not only for displays.
> 
> Correct, a user can request LOAD_NOW or LOAD_EOT, and the driver should
> be able to handle it (as long as the h/w supports it) and act accordingly.
> 
> The dmaengine layer and drivers are not specific to one interface or one
> user; the idea is to write a generic dmaengine driver catering to
> different users, so supporting different flags from the driver pov as
> well as the dmaengine framework pov is required.

Let's skip the lectures on API design; I think we're way past that
point.

Could you please review the latest patch series
(https://lore.kernel.org/dmaengine/20200528025228.31638-1-laurent.pinchart@ideasonboard.com/T/#t)
?

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 3/6] dmaengine: Add support for repeating transactions
  2020-06-16 21:39                           ` Laurent Pinchart
@ 2020-06-23  9:47                             ` Laurent Pinchart
  0 siblings, 0 replies; 23+ messages in thread
From: Laurent Pinchart @ 2020-06-23  9:47 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Vinod and Peter,

Gentle ping.

On Wed, Jun 17, 2020 at 12:39:09AM +0300, Laurent Pinchart wrote:
> On Wed, Jun 03, 2020 at 09:36:09PM +0530, Vinod Koul wrote:
> > On 03-06-20, 13:51, Peter Ujfalusi wrote:
> >> On 01/06/2020 14.49, Laurent Pinchart wrote:
> >>> On Mon, Jun 01, 2020 at 02:14:03PM +0300, Peter Ujfalusi wrote:
> >>>> On 28/05/2020 5.10, Laurent Pinchart wrote:
> >>>>>>> As mentioned in the commit message, I plan to extend that, I just didn't
> >>>>>>> want to add the checks to all the prepare operation wrappers until an
> >>>>>>> agreement on the approach would be reached. I also thought it would be
> >>>>>>> good to not allow this API for other transaction types until use cases
> >>>>>>> arise, in order to force upstream discussions instead of silently
> >>>>>>> abusing the API :-)
> >>>>>>
> >>>>>> I would not object if slave_sg and memcpy got the same treatment. If the
> >>>>>> DMA driver did not set the DMA_REPEAT then clients can not use this
> >>>>>> feature anyways.
> >>>>>
> >>>>> Would you not object, or would you prefer if it was done in v5 ? :-)
> >>>>
> >>>> DMA_REPEAT is a generic flag, not limited to only interleaved, but you
> >>>> are going to be the first user of it with interleaved.
> >>>>
> >>>>> Overall I think that enabling APIs that have no user isn't necessarily
> >>>>> the best idea, as it's prone to design issues, but I don't mind doing so
> >>>>> if you think it needs to be done now.
> >>>>
> >>>> We would get the support in one go with the same commit. I don't think
> >>>> it makes much sense to add slave_sg later, then memcpy another time.
> >>>> True, there might be no users for them for some time, but their presence
> >>>> might invite users?
> >>> 
> >>> My approach to API design is that an API designed without (at least) one
> >>> user is very prone to be a bad API. As I said before I don't mind
> >>> enabling support for slave_sg and memcpy today already, even if I don't
> >>> think it's a good idea. I want to get my use case supported, and I've
> >>> given up on what I would consider a good API :-) That's fine,
> >>> maintainers are the ones who have to support APIs and the design choices
> >>> behind them in the longer term, and I'm not a subsystem maintainer here.
> >>> I tried to prevent what I think may become a case of shooting in the
> >>> foot, but I could be wrong. Only the future will tell.
> >> 
> >> Yes, we will see in the longer run.
> > 
> > I am not sure I would like to add an API without a user, we can add some
> > notes in documentation for this and future ideas on how to add this, but
> > an API without user doesn't sound right to me.
> 
> That's my preference as well. Peter, are you OK with that ?
> 
> >>>>>>> I can extend the flag to all other transaction types
> >>>>>>> (except for the cyclic transaction, as it doesn't make sense there).
> >>>>>>
> >>>>>> Yep, cyclic is a different type of transfer, it is for circular buffers.
> >>>>>> It could be seen as a special case of slave_sg. Some drivers actually
> >>>> create a temporary sg_list in case of cyclic and use the same setup
> >>>>>> function to set up the transfer for slave_sg/cyclic...
> >>>>>
> >>>>> Cyclic is different for historical reasons, but if I had to redesign it
> >>>>> today, I'd make it slave_sg + DMA_PREP_REPEAT. We obviously can't, and I
> >>>>> have no issue with that.
> >>>>
> >>>> Which should be accompanied by a flag to tell that the sg_list is
> >>>> covering a circular buffer, to save all drivers from having to check
> >>>> whether the sg_list is a circular buffer (current cyclic) or really sg.
> >>>> Some DMA can only do repeat on circular buffers (omap-dma, tegra, etc).
> >>> 
> >>> Isn't DMA_PREP_REPEAT that flag ?
> >> 
> >> Not really. It tells that the descriptor should be repeated. In case of
> >> slave_sg the list could describe one block of memory, split up into
> >> 'periods', or it could be a list of chunks scattered all over the place.
> >> 
> >> A circular buffer can be described with an sg_list, but an sg_list does
> >> not necessarily describe a circular buffer.
> >> 
> >>>>>> But, DMA drivers might support neither of them, either of them or both.
> >>>>>> It is up to the client to pick the preferred method for its use.
> >>>>>> It is not far-fetched that the next DMA that services the client
> >>>>>> will have different capabilities, and the client needs to handle EOT
> >>>>>> or NOW, or it might even need a fallback for the case when neither
> >>>>>> is supported.
> >>>>>>
> >>>>>> I don't like excessive flags either, but based on my experience
> >>>>>> under-flagging can bite back sooner than later.
> >>>>>>
> >>>>>> I'm aware that at the moment it feels like it is too explicit, but never
> >>>>>> underestimate the creativity of the design - and in some cases the
> >>>>>> constraint the design must fulfill.
> >>>>>
> >>>>> I'm still very puzzled by why you think adding DMA_PREP_LOAD_EOT now is
> >>>>> a good idea, given that there's no existing and no foreseen use case for
> >>>>> not setting it. Creating an API element that is completely disconnected
> >>>>> from any known use case doesn't seem like good API design to me,
> >>>>> especially for an in-kernel API.
> >>>>
> >>>> If we document that DMA_REPEAT covers REPEAT _and_ LOAD_EOT with one
> >>>> flag, then how could other drivers implement REPEAT if they cannot
> >>>> support LOAD_EOT?
> >>>> Should they do DMA_REPEAT | NOT_LOAD_EOT | LOAD_ASAP?
> >>> 
> >>> As stated before, I think a DMA_LOAD_EOT capability is useful. My
> >>> concern is about DMA_PREP_LOAD_EOT for which I can't see use cases. I've
> >>> added DMA_PREP_LOAD_EOT in the last patch series, and my DMA engine
> >>> driver ignores the transaction when DMA_PREP_LOAD_EOT is not set, as
> >>> required. It works fine as my client always sets it.
> >> 
> >> Thanks.
> >> 
> >>> I'd expect Vinod or you to write the documentation though, as writing
> >>> code for an API I don't believe in is one thing, writing documentation
> >>> to explain the rationale behind the API design will be more complex
> >> 
> >> Vinod can correct me, but for the capabilities:
> >> DMA_REPEAT: the controller (and driver) supports repeating the
> >> 	descriptor. It can be terminated with terminate_all.
> >> DMA_LOAD_EOT: the controller (and driver) supports loading the next
> >> 	issued transfer on a channel which is running a DMA_REPEAT
> >> 	descriptor. IOW, instead of reloading the running transfer, it
> >> 	moves to the next one.
> >> DMA_LOAD_NOW: the controller (and driver) supports aborting the
> >> 	active descriptor (either a DMA_REPEAT or a non-repeated one) and
> >> 	moving to the next issued transfer without clients needing to
> >> 	use terminate_all.
> > 
> > Sounds right to me.
> 
> For the same reason as above, my latest patch series doesn't include
> DMA_LOAD_NOW, as that would be an API with no user. Vinod, is that OK
> with you ?
> 
> >>> when I don't believe there's any rationale :-)
> >> 
> >> Sure, you have a specific DMA, which does one thing and one thing only.
> >> When a subsystem decides to create a generic DMA layer on top of
> >> DMAengine, for example to get rid of the duplicated code in the drivers,
> >> this generic code does need information to decide how the servicing
> >> DMA should be used for optimal performance and quality.
> >> Some DMAs (and drivers) might have slightly different capabilities.
> >> 
> >>>> LOAD_EOT is a feature the HW can or can not support and it is an
> >>>> operation mode that you want to use or do not want to use.
> >>> 
> >>> DMA_PREP_REPEAT for the EOT mode, DMA_PREP_REPEAT | DMA_PREP_LOAD_NOW
> >>> for the immediate mode would work too, and wouldn't have the drawback of
> >>> artificially creating a case (!EOT && !NOW) that would fail.
> >> 
> >> But if a DMA does not support LOAD_EOT at all? If it did not support
> >> LOAD_NOW either?
> 
> If the driver doesn't support EOT, then REPEAT without EOT would be
> rejected by the prepare operation. If the driver supports neither EOT
> nor NOW, then it wouldn't support REPEAT :-) In any case, both EOT and NOW
> would be rejected, and so would REPEAT (REPEAT without EOT or NOW
> doesn't make much sense).
> 
> >> But if anything, LOAD_NOW sounds more like a default expectation than
> >> LOAD_EOT.
> >> Yes, I know. The display use case needs LOAD_EOT to avoid artifacts on
> >> screen, but DMA_REPEAT is not only for displays.
> > 
> > Correct, a user can request LOAD_NOW or LOAD_EOT, and the driver should
> > be able to handle it (as long as the h/w supports it) and act accordingly.
> > 
> > The dmaengine layer and drivers are not specific to one interface or one
> > user; the idea is to write a generic dmaengine driver catering to
> > different users, so supporting different flags from the driver pov as
> > well as the dmaengine framework pov is required.
> 
> Let's skip the lectures on API design; I think we're way past that
> point.
> 
> Could you please review the latest patch series
> (https://lore.kernel.org/dmaengine/20200528025228.31638-1-laurent.pinchart@ideasonboard.com/T/#t)
> ?

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2020-06-23  9:47 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-13 16:59 [PATCH v4 0/6] dma: Add Xilinx ZynqMP DPDMA driver Laurent Pinchart
2020-05-13 16:59 ` [PATCH v4 1/6] dt: bindings: dma: xilinx: dpdma: DT bindings for Xilinx DPDMA Laurent Pinchart
2020-05-13 16:59 ` [PATCH v4 2/6] dmaengine: virt-dma: Use lockdep to check locking requirements Laurent Pinchart
2020-05-13 16:59 ` [PATCH v4 3/6] dmaengine: Add support for repeating transactions Laurent Pinchart
2020-05-14 18:23   ` Vinod Koul
2020-05-14 20:07     ` Laurent Pinchart
2020-05-15  8:38       ` Vinod Koul
2020-05-15 14:11         ` Laurent Pinchart
2020-05-18 13:57           ` Peter Ujfalusi
2020-05-18 14:32             ` Laurent Pinchart
2020-05-19 12:41               ` Peter Ujfalusi
2020-05-28  2:10                 ` Laurent Pinchart
2020-06-01 11:14                   ` Peter Ujfalusi
2020-06-01 11:49                     ` Laurent Pinchart
2020-06-03 10:51                       ` Peter Ujfalusi
2020-06-03 16:06                         ` Vinod Koul
2020-06-16 21:39                           ` Laurent Pinchart
2020-06-23  9:47                             ` Laurent Pinchart
2020-05-13 16:59 ` [PATCH v4 4/6] dmaengine: xilinx: dpdma: Add the Xilinx DisplayPort DMA engine driver Laurent Pinchart
2020-05-13 16:59 ` [PATCH v4 5/6] dmaengine: xilinx: dpdma: Add debugfs support Laurent Pinchart
2020-05-13 16:59 ` [PATCH v4 6/6] arm64: dts: zynqmp: Add DPDMA node Laurent Pinchart
2020-05-14  5:56   ` Michal Simek
2020-05-28  2:49     ` Laurent Pinchart
