dmaengine.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/6] dma: Add Xilinx ZynqMP DPDMA driver
@ 2020-01-23  2:29 Laurent Pinchart
  2020-01-23  2:29 ` [PATCH v3 1/6] dt: bindings: dma: xilinx: dpdma: DT bindings for Xilinx DPDMA Laurent Pinchart
                   ` (5 more replies)
  0 siblings, 6 replies; 46+ messages in thread
From: Laurent Pinchart @ 2020-01-23  2:29 UTC (permalink / raw)
  To: dmaengine
  Cc: Michal Simek, Hyun Kwon, Tejas Upadhyay, Satish Kumar Nagireddy,
	Vinod Koul

Hello,

This patch series adds a new driver for the DPDMA engine found in the
Xilinx ZynqMP.

The previous version can be found at [1]. All review comments have been
taken into account. The most notable changes are

- Introduction of a new DMA transfer type that combines interleaved and
  cyclic tranfers (patch 2/6, suggested by Vinod)

- Switch to virt-dma (including a drive-by lockdep addition to virt-dma
  in patch 3/6)

- Removal of all non-interleaved, non-cyclic transfer types, as I have
  currently no way to test them given how the IP core is integrated in
  the hardware. Support for non-interleaved cyclic transfers may be
  added later for audio.

The driver has been successfully tested with the ZynqMP DisplayPort
subsystem DRM driver.

Vinod, please let me know if you would like authorship of patch 2/6 to
be assigned to you, in which case I will need your SoB line.

[1] https://lore.kernel.org/dmaengine/20191107021400.16474-1-laurent.pinchart@ideasonboard.com/

Hyun Kwon (1):
  dmaengine: xilinx: dpdma: Add the Xilinx DisplayPort DMA engine driver

Laurent Pinchart (5):
  dt: bindings: dma: xilinx: dpdma: DT bindings for Xilinx DPDMA
  dmaengine: Add interleaved cyclic transaction type
  dmaengine: virt-dma: Use lockdep to check locking requirements
  dmaengine: xilinx: dpdma: Add debugfs support
  arm64: dts: zynqmp: Add DPDMA node

 .../dma/xilinx/xlnx,zynqmp-dpdma.yaml         |   68 +
 MAINTAINERS                                   |    9 +
 arch/arm64/boot/dts/xilinx/zynqmp-clk.dtsi    |    4 +
 arch/arm64/boot/dts/xilinx/zynqmp.dtsi        |   10 +
 drivers/dma/Kconfig                           |   10 +
 drivers/dma/dmaengine.c                       |    8 +-
 drivers/dma/virt-dma.c                        |    2 +
 drivers/dma/virt-dma.h                        |   14 +
 drivers/dma/xilinx/Makefile                   |    1 +
 drivers/dma/xilinx/xilinx_dpdma.c             | 1754 +++++++++++++++++
 include/dt-bindings/dma/xlnx-zynqmp-dpdma.h   |   16 +
 include/linux/dmaengine.h                     |   18 +
 12 files changed, 1913 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/devicetree/bindings/dma/xilinx/xlnx,zynqmp-dpdma.yaml
 create mode 100644 drivers/dma/xilinx/xilinx_dpdma.c
 create mode 100644 include/dt-bindings/dma/xlnx-zynqmp-dpdma.h


base-commit: d1eef1c619749b2a57e514a3fa67d9a516ffa919
-- 
Regards,

Laurent Pinchart


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v3 1/6] dt: bindings: dma: xilinx: dpdma: DT bindings for Xilinx DPDMA
  2020-01-23  2:29 [PATCH v3 0/6] dma: Add Xilinx ZynqMP DPDMA driver Laurent Pinchart
@ 2020-01-23  2:29 ` Laurent Pinchart
  2020-01-23  2:29 ` [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type Laurent Pinchart
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 46+ messages in thread
From: Laurent Pinchart @ 2020-01-23  2:29 UTC (permalink / raw)
  To: dmaengine
  Cc: Michal Simek, Hyun Kwon, Tejas Upadhyay, Satish Kumar Nagireddy,
	devicetree

The ZynqMP includes the DisplayPort subsystem with its own DMA engine
called DPDMA. The DPDMA IP comes with 6 individual channels
(4 for display, 2 for audio). This documentation describes DT bindings
of DPDMA.

Signed-off-by: Hyun Kwon <hyun.kwon@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Rob Herring <robh@kernel.org>
---
Changes since v2:

- Fix id URL
- Fix path to dma-controller.yaml
- Update license to GPL-2.0-only OR BSD-2-Clause

Changes since v1:

- Convert the DT bindings to YAML
- Drop the DT child nodes
---
 .../dma/xilinx/xlnx,zynqmp-dpdma.yaml         | 68 +++++++++++++++++++
 MAINTAINERS                                   |  8 +++
 include/dt-bindings/dma/xlnx-zynqmp-dpdma.h   | 16 +++++
 3 files changed, 92 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/dma/xilinx/xlnx,zynqmp-dpdma.yaml
 create mode 100644 include/dt-bindings/dma/xlnx-zynqmp-dpdma.h

diff --git a/Documentation/devicetree/bindings/dma/xilinx/xlnx,zynqmp-dpdma.yaml b/Documentation/devicetree/bindings/dma/xilinx/xlnx,zynqmp-dpdma.yaml
new file mode 100644
index 000000000000..5de510f8c88c
--- /dev/null
+++ b/Documentation/devicetree/bindings/dma/xilinx/xlnx,zynqmp-dpdma.yaml
@@ -0,0 +1,68 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/dma/xilinx/xlnx,zynqmp-dpdma.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Xilinx ZynqMP DisplayPort DMA Controller Device Tree Bindings
+
+description: |
+  These bindings describe the DMA engine included in the Xilinx ZynqMP
+  DisplayPort Subsystem. The DMA engine supports up to 6 DMA channels (3
+  channels for a video stream, 1 channel for a graphics stream, and 2 channels
+  for an audio stream).
+
+maintainers:
+  - Laurent Pinchart <laurent.pinchart@ideasonboard.com>
+
+allOf:
+  - $ref: "../dma-controller.yaml#"
+
+properties:
+  "#dma-cells":
+    const: 1
+    description: |
+      The cell is the DMA channel ID (see dt-bindings/dma/xlnx-zynqmp-dpdma.h
+      for a list of channel IDs).
+
+  compatible:
+    const: xlnx,zynqmp-dpdma
+
+  reg:
+    maxItems: 1
+
+  interrupts:
+    maxItems: 1
+
+  clocks:
+    description: The AXI clock
+    maxItems: 1
+
+  clock-names:
+    const: axi_clk
+
+required:
+  - "#dma-cells"
+  - compatible
+  - reg
+  - interrupts
+  - clocks
+  - clock-names
+
+additionalProperties: false
+
+examples:
+  - |
+    #include <dt-bindings/interrupt-controller/arm-gic.h>
+
+    dma: dma-controller@fd4c0000 {
+      compatible = "xlnx,zynqmp-dpdma";
+      reg = <0x0 0xfd4c0000 0x0 0x1000>;
+      interrupts = <GIC_SPI 122 IRQ_TYPE_LEVEL_HIGH>;
+      interrupt-parent = <&gic>;
+      clocks = <&dpdma_clk>;
+      clock-names = "axi_clk";
+      #dma-cells = <1>;
+    };
+
+...
diff --git a/MAINTAINERS b/MAINTAINERS
index cc0a4a8ae06a..c7a011837102 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18182,6 +18182,14 @@ F:	drivers/misc/Kconfig
 F:	drivers/misc/Makefile
 F:	include/uapi/misc/xilinx_sdfec.h
 
+XILINX ZYNQMP DPDMA DRIVER
+M:	Hyun Kwon <hyun.kwon@xilinx.com>
+M:	Laurent Pinchart <laurent.pinchart@ideasonboard.com>
+L:	dmaengine@vger.kernel.org
+S:	Supported
+F:	Documentation/devicetree/bindings/dma/xilinx/xlnx,zynqmp-dpdma.yaml
+F:	include/dt-bindings/dma/xlnx-zynqmp-dpdma.h
+
 XILLYBUS DRIVER
 M:	Eli Billauer <eli.billauer@gmail.com>
 L:	linux-kernel@vger.kernel.org
diff --git a/include/dt-bindings/dma/xlnx-zynqmp-dpdma.h b/include/dt-bindings/dma/xlnx-zynqmp-dpdma.h
new file mode 100644
index 000000000000..3719cda5679d
--- /dev/null
+++ b/include/dt-bindings/dma/xlnx-zynqmp-dpdma.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: (GPL-2.0 OR MIT) */
+/*
+ * Copyright 2019 Laurent Pinchart <laurent.pinchart@ideasonboard.com>
+ */
+
+#ifndef __DT_BINDINGS_DMA_XLNX_ZYNQMP_DPDMA_H__
+#define __DT_BINDINGS_DMA_XLNX_ZYNQMP_DPDMA_H__
+
+#define ZYNQMP_DPDMA_VIDEO0		0
+#define ZYNQMP_DPDMA_VIDEO1		1
+#define ZYNQMP_DPDMA_VIDEO2		2
+#define ZYNQMP_DPDMA_GRAPHICS		3
+#define ZYNQMP_DPDMA_AUDIO0		4
+#define ZYNQMP_DPDMA_AUDIO1		5
+
+#endif /* __DT_BINDINGS_DMA_XLNX_ZYNQMP_DPDMA_H__ */
-- 
Regards,

Laurent Pinchart


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-01-23  2:29 [PATCH v3 0/6] dma: Add Xilinx ZynqMP DPDMA driver Laurent Pinchart
  2020-01-23  2:29 ` [PATCH v3 1/6] dt: bindings: dma: xilinx: dpdma: DT bindings for Xilinx DPDMA Laurent Pinchart
@ 2020-01-23  2:29 ` Laurent Pinchart
  2020-01-23  8:03   ` Peter Ujfalusi
  2020-01-23  2:29 ` [PATCH v3 3/6] dmaengine: virt-dma: Use lockdep to check locking requirements Laurent Pinchart
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 46+ messages in thread
From: Laurent Pinchart @ 2020-01-23  2:29 UTC (permalink / raw)
  To: dmaengine
  Cc: Michal Simek, Hyun Kwon, Tejas Upadhyay, Satish Kumar Nagireddy,
	Vinod Koul

The new interleaved cyclic transaction type combines interleaved and
cycle transactions. It is designed for DMA engines that back display
controllers, where the same 2D frame needs to be output to the display
until a new frame is available.

Suggested-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
 drivers/dma/dmaengine.c   |  8 +++++++-
 include/linux/dmaengine.h | 18 ++++++++++++++++++
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index 03ac4b96117c..4ffb98a47f31 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -981,7 +981,13 @@ int dma_async_device_register(struct dma_device *device)
 			"DMA_INTERLEAVE");
 		return -EIO;
 	}
-
+	if (dma_has_cap(DMA_INTERLEAVE_CYCLIC, device->cap_mask) &&
+	    !device->device_prep_interleaved_cyclic) {
+		dev_err(device->dev,
+			"Device claims capability %s, but op is not defined\n",
+			"DMA_INTERLEAVE_CYCLIC");
+		return -EIO;
+	}
 
 	if (!device->device_tx_status) {
 		dev_err(device->dev, "Device tx_status is not defined\n");
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 8fcdee1c0cf9..e9af3bf835cb 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -61,6 +61,7 @@ enum dma_transaction_type {
 	DMA_SLAVE,
 	DMA_CYCLIC,
 	DMA_INTERLEAVE,
+	DMA_INTERLEAVE_CYCLIC,
 /* last transaction type for creation of the capabilities mask */
 	DMA_TX_TYPE_END,
 };
@@ -701,6 +702,10 @@ struct dma_filter {
  *	The function takes a buffer of size buf_len. The callback function will
  *	be called after period_len bytes have been transferred.
  * @device_prep_interleaved_dma: Transfer expression in a generic way.
+ * @device_prep_interleaved_cyclic: prepares an interleaved cyclic transfer.
+ *	This is similar to @device_prep_interleaved_dma, but the transfer is
+ *	repeated until a new transfer is issued. This transfer type is meant
+ *	for display.
  * @device_prep_dma_imm_data: DMA's 8 byte immediate data to the dst address
  * @device_config: Pushes a new configuration to a channel, return 0 or an error
  *	code
@@ -785,6 +790,9 @@ struct dma_device {
 	struct dma_async_tx_descriptor *(*device_prep_interleaved_dma)(
 		struct dma_chan *chan, struct dma_interleaved_template *xt,
 		unsigned long flags);
+	struct dma_async_tx_descriptor *(*device_prep_interleaved_cyclic)(
+		struct dma_chan *chan, struct dma_interleaved_template *xt,
+		unsigned long flags);
 	struct dma_async_tx_descriptor *(*device_prep_dma_imm_data)(
 		struct dma_chan *chan, dma_addr_t dst, u64 data,
 		unsigned long flags);
@@ -880,6 +888,16 @@ static inline struct dma_async_tx_descriptor *dmaengine_prep_interleaved_dma(
 	return chan->device->device_prep_interleaved_dma(chan, xt, flags);
 }
 
+static inline struct dma_async_tx_descriptor *dmaengine_prep_interleaved_cyclic(
+		struct dma_chan *chan, struct dma_interleaved_template *xt,
+		unsigned long flags)
+{
+	if (!chan || !chan->device || !chan->device->device_prep_interleaved_cyclic)
+		return NULL;
+
+	return chan->device->device_prep_interleaved_cyclic(chan, xt, flags);
+}
+
 static inline struct dma_async_tx_descriptor *dmaengine_prep_dma_memset(
 		struct dma_chan *chan, dma_addr_t dest, int value, size_t len,
 		unsigned long flags)
-- 
Regards,

Laurent Pinchart


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 3/6] dmaengine: virt-dma: Use lockdep to check locking requirements
  2020-01-23  2:29 [PATCH v3 0/6] dma: Add Xilinx ZynqMP DPDMA driver Laurent Pinchart
  2020-01-23  2:29 ` [PATCH v3 1/6] dt: bindings: dma: xilinx: dpdma: DT bindings for Xilinx DPDMA Laurent Pinchart
  2020-01-23  2:29 ` [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type Laurent Pinchart
@ 2020-01-23  2:29 ` Laurent Pinchart
  2020-01-23  2:29 ` [PATCH v3 4/6] dmaengine: xilinx: dpdma: Add the Xilinx DisplayPort DMA engine driver Laurent Pinchart
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 46+ messages in thread
From: Laurent Pinchart @ 2020-01-23  2:29 UTC (permalink / raw)
  To: dmaengine; +Cc: Michal Simek, Hyun Kwon, Tejas Upadhyay, Satish Kumar Nagireddy

A few virt-dma functions are documented as requiring the vc.lock to be
held by the caller. Check this with lockdep.

The vchan_vdesc_fini() and vchan_find_desc() functions gain a lockdep
check as well, because, even though they are not documented with this
requirement (and not documented at all for the latter), they touch
fields documented as protected by vc.lock. All callers have been
manually inspected to verify they call the functions with the lock held.

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
 drivers/dma/virt-dma.c |  2 ++
 drivers/dma/virt-dma.h | 14 ++++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/drivers/dma/virt-dma.c b/drivers/dma/virt-dma.c
index ec4adf4260a0..9b59bc1c6a55 100644
--- a/drivers/dma/virt-dma.c
+++ b/drivers/dma/virt-dma.c
@@ -68,6 +68,8 @@ struct virt_dma_desc *vchan_find_desc(struct virt_dma_chan *vc,
 {
 	struct virt_dma_desc *vd;
 
+	lockdep_assert_held(&vc->lock);
+
 	list_for_each_entry(vd, &vc->desc_issued, node)
 		if (vd->tx.cookie == cookie)
 			return vd;
diff --git a/drivers/dma/virt-dma.h b/drivers/dma/virt-dma.h
index ab158bac03a7..942493e36666 100644
--- a/drivers/dma/virt-dma.h
+++ b/drivers/dma/virt-dma.h
@@ -81,6 +81,8 @@ static inline struct dma_async_tx_descriptor *vchan_tx_prep(struct virt_dma_chan
  */
 static inline bool vchan_issue_pending(struct virt_dma_chan *vc)
 {
+	lockdep_assert_held(&vc->lock);
+
 	list_splice_tail_init(&vc->desc_submitted, &vc->desc_issued);
 	return !list_empty(&vc->desc_issued);
 }
@@ -96,6 +98,8 @@ static inline void vchan_cookie_complete(struct virt_dma_desc *vd)
 	struct virt_dma_chan *vc = to_virt_chan(vd->tx.chan);
 	dma_cookie_t cookie;
 
+	lockdep_assert_held(&vc->lock);
+
 	cookie = vd->tx.cookie;
 	dma_cookie_complete(&vd->tx);
 	dev_vdbg(vc->chan.device->dev, "txd %p[%x]: marked complete\n",
@@ -108,11 +112,15 @@ static inline void vchan_cookie_complete(struct virt_dma_desc *vd)
 /**
  * vchan_vdesc_fini - Free or reuse a descriptor
  * @vd: virtual descriptor to free/reuse
+ *
+ * vc.lock must be held by caller
  */
 static inline void vchan_vdesc_fini(struct virt_dma_desc *vd)
 {
 	struct virt_dma_chan *vc = to_virt_chan(vd->tx.chan);
 
+	lockdep_assert_held(&vc->lock);
+
 	if (dmaengine_desc_test_reuse(&vd->tx))
 		list_add(&vd->node, &vc->desc_allocated);
 	else
@@ -141,6 +149,8 @@ static inline void vchan_terminate_vdesc(struct virt_dma_desc *vd)
 {
 	struct virt_dma_chan *vc = to_virt_chan(vd->tx.chan);
 
+	lockdep_assert_held(&vc->lock);
+
 	/* free up stuck descriptor */
 	if (vc->vd_terminated)
 		vchan_vdesc_fini(vc->vd_terminated);
@@ -158,6 +168,8 @@ static inline void vchan_terminate_vdesc(struct virt_dma_desc *vd)
  */
 static inline struct virt_dma_desc *vchan_next_desc(struct virt_dma_chan *vc)
 {
+	lockdep_assert_held(&vc->lock);
+
 	return list_first_entry_or_null(&vc->desc_issued,
 					struct virt_dma_desc, node);
 }
@@ -175,6 +187,8 @@ static inline struct virt_dma_desc *vchan_next_desc(struct virt_dma_chan *vc)
 static inline void vchan_get_all_descriptors(struct virt_dma_chan *vc,
 	struct list_head *head)
 {
+	lockdep_assert_held(&vc->lock);
+
 	list_splice_tail_init(&vc->desc_allocated, head);
 	list_splice_tail_init(&vc->desc_submitted, head);
 	list_splice_tail_init(&vc->desc_issued, head);
-- 
Regards,

Laurent Pinchart


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 4/6] dmaengine: xilinx: dpdma: Add the Xilinx DisplayPort DMA engine driver
  2020-01-23  2:29 [PATCH v3 0/6] dma: Add Xilinx ZynqMP DPDMA driver Laurent Pinchart
                   ` (2 preceding siblings ...)
  2020-01-23  2:29 ` [PATCH v3 3/6] dmaengine: virt-dma: Use lockdep to check locking requirements Laurent Pinchart
@ 2020-01-23  2:29 ` Laurent Pinchart
  2020-01-23  2:29 ` [PATCH v3 5/6] dmaengine: xilinx: dpdma: Add debugfs support Laurent Pinchart
  2020-01-23  2:29 ` [PATCH v3 6/6] arm64: dts: zynqmp: Add DPDMA node Laurent Pinchart
  5 siblings, 0 replies; 46+ messages in thread
From: Laurent Pinchart @ 2020-01-23  2:29 UTC (permalink / raw)
  To: dmaengine; +Cc: Michal Simek, Hyun Kwon, Tejas Upadhyay, Satish Kumar Nagireddy

From: Hyun Kwon <hyun.kwon@xilinx.com>

The ZynqMP DisplayPort subsystem includes a DMA engine called DPDMA with
6 DMa channels (4 for display and 2 for audio). This driver exposes the
DPDMA through the dmaengine API, to be used by audio (ALSA) and display
(DRM) drivers for the DisplayPort subsystem.

Signed-off-by: Hyun Kwon <hyun.kwon@xilinx.com>
Signed-off-by: Tejas Upadhyay <tejasu@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
Changes since v2:

- Switch to virt-dma
- Support interleaved cyclic transfers and nothing else
- Fix terminate_all behaviour (don't wait)
- Fix bug in extended address handling for hw desc
- Clean up video group handling
- Update driver name
- Use macros for bitfields
- Remove unneeded header
- Coding style and typo fixes

Changes since v1:

- Remove unneeded #include
- Drop enum xilinx_dpdma_chan_id
- Update compatible string
- Drop DT subnodes
- Replace XILINX_DPDMA_NUM_CHAN with ARRAY_SIZE(xdev->chan)
- Disable IRQ at remove() time
- Use devm_platform_ioremap_resource()
- Don't inline functions manually
- Add section headers
- Merge DMA engine implementation in their wrappers
- Rename xilinx_dpdma_sw_desc::phys to dma_addr
- Use GENMASK()
- Use FIELD_PREP/FIELD_GET
- Fix MSB handling in xilinx_dpdma_sw_desc_addr_64()
- Fix logic in xilinx_dpdma_chan_prep_slave_sg()
- Document why xilinx_dpdma_config() doesn't need to check most
  parameters
- Remove debugfs support
- Rechedule errored descriptor
- Align the line size with 128bit
- SPDX header formatting
---
 MAINTAINERS                       |    1 +
 drivers/dma/Kconfig               |   10 +
 drivers/dma/xilinx/Makefile       |    1 +
 drivers/dma/xilinx/xilinx_dpdma.c | 1527 +++++++++++++++++++++++++++++
 4 files changed, 1539 insertions(+)
 create mode 100644 drivers/dma/xilinx/xilinx_dpdma.c

diff --git a/MAINTAINERS b/MAINTAINERS
index c7a011837102..cabe9d0417c2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18188,6 +18188,7 @@ M:	Laurent Pinchart <laurent.pinchart@ideasonboard.com>
 L:	dmaengine@vger.kernel.org
 S:	Supported
 F:	Documentation/devicetree/bindings/dma/xilinx/xlnx,zynqmp-dpdma.yaml
+F:	drivers/dma/xilinx/xilinx_dpdma.c
 F:	include/dt-bindings/dma/xlnx-zynqmp-dpdma.h
 
 XILLYBUS DRIVER
diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index 6fa1eba9d477..7f6a87161344 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -667,6 +667,16 @@ config XILINX_ZYNQMP_DMA
 	help
 	  Enable support for Xilinx ZynqMP DMA controller.
 
+config XILINX_ZYNQMP_DPDMA
+	tristate "Xilinx DPDMA Engine"
+	select DMA_ENGINE
+	select DMA_VIRTUAL_CHANNELS
+	help
+	  Enable support for Xilinx ZynqMP DisplayPort DMA. Choose this option
+	  if you have a Xilinx ZynqMP SoC with a DisplayPort subsystem. The
+	  driver provides the dmaengine required by the DisplayPort subsystem
+	  display driver.
+
 config ZX_DMA
 	tristate "ZTE ZX DMA support"
 	depends on ARCH_ZX || COMPILE_TEST
diff --git a/drivers/dma/xilinx/Makefile b/drivers/dma/xilinx/Makefile
index e921de575b55..767bb45f641f 100644
--- a/drivers/dma/xilinx/Makefile
+++ b/drivers/dma/xilinx/Makefile
@@ -1,3 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0-only
 obj-$(CONFIG_XILINX_DMA) += xilinx_dma.o
 obj-$(CONFIG_XILINX_ZYNQMP_DMA) += zynqmp_dma.o
+obj-$(CONFIG_XILINX_ZYNQMP_DPDMA) += xilinx_dpdma.o
diff --git a/drivers/dma/xilinx/xilinx_dpdma.c b/drivers/dma/xilinx/xilinx_dpdma.c
new file mode 100644
index 000000000000..15ba85aa63d9
--- /dev/null
+++ b/drivers/dma/xilinx/xilinx_dpdma.c
@@ -0,0 +1,1527 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Xilinx ZynqMP DPDMA Engine driver
+ *
+ * Copyright (C) 2015 - 2019 Xilinx, Inc.
+ *
+ * Author: Hyun Woo Kwon <hyun.kwon@xilinx.com>
+ */
+
+#include <linux/bitfield.h>
+#include <linux/bits.h>
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/dmaengine.h>
+#include <linux/dmapool.h>
+#include <linux/interrupt.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_dma.h>
+#include <linux/platform_device.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/wait.h>
+
+#include <dt-bindings/dma/xlnx-zynqmp-dpdma.h>
+
+#include "../dmaengine.h"
+#include "../virt-dma.h"
+
+/* DPDMA registers */
+#define XILINX_DPDMA_ERR_CTRL				0x000
+#define XILINX_DPDMA_ISR				0x004
+#define XILINX_DPDMA_IMR				0x008
+#define XILINX_DPDMA_IEN				0x00c
+#define XILINX_DPDMA_IDS				0x010
+#define XILINX_DPDMA_INTR_DESC_DONE(n)			BIT((n) + 0)
+#define XILINX_DPDMA_INTR_DESC_DONE_MASK		GENMASK(5, 0)
+#define XILINX_DPDMA_INTR_NO_OSTAND(n)			BIT((n) + 6)
+#define XILINX_DPDMA_INTR_NO_OSTAND_MASK		GENMASK(11, 6)
+#define XILINX_DPDMA_INTR_AXI_ERR(n)			BIT((n) + 12)
+#define XILINX_DPDMA_INTR_AXI_ERR_MASK			GENMASK(17, 12)
+#define XILINX_DPDMA_INTR_DESC_ERR(n)			BIT((n) + 16)
+#define XILINX_DPDMA_INTR_DESC_ERR_MASK			GENMASK(23, 18)
+#define XILINX_DPDMA_INTR_WR_CMD_FIFO_FULL		BIT(24)
+#define XILINX_DPDMA_INTR_WR_DATA_FIFO_FULL		BIT(25)
+#define XILINX_DPDMA_INTR_AXI_4K_CROSS			BIT(26)
+#define XILINX_DPDMA_INTR_VSYNC				BIT(27)
+#define XILINX_DPDMA_INTR_CHAN_ERR_MASK			0x00041000
+#define XILINX_DPDMA_INTR_CHAN_ERR			0x00fff000
+#define XILINX_DPDMA_INTR_GLOBAL_ERR			0x07000000
+#define XILINX_DPDMA_INTR_ERR_ALL			0x07fff000
+#define XILINX_DPDMA_INTR_CHAN_MASK			0x00041041
+#define XILINX_DPDMA_INTR_GLOBAL_MASK			0x0f000000
+#define XILINX_DPDMA_INTR_ALL				0x0fffffff
+#define XILINX_DPDMA_EISR				0x014
+#define XILINX_DPDMA_EIMR				0x018
+#define XILINX_DPDMA_EIEN				0x01c
+#define XILINX_DPDMA_EIDS				0x020
+#define XILINX_DPDMA_EINTR_INV_APB			BIT(0)
+#define XILINX_DPDMA_EINTR_RD_AXI_ERR(n)		BIT((n) + 1)
+#define XILINX_DPDMA_EINTR_RD_AXI_ERR_MASK		GENMASK(6, 1)
+#define XILINX_DPDMA_EINTR_PRE_ERR(n)			BIT((n) + 7)
+#define XILINX_DPDMA_EINTR_PRE_ERR_MASK			GENMASK(12, 7)
+#define XILINX_DPDMA_EINTR_CRC_ERR(n)			BIT((n) + 13)
+#define XILINX_DPDMA_EINTR_CRC_ERR_MASK			GENMASK(18, 13)
+#define XILINX_DPDMA_EINTR_WR_AXI_ERR(n)		BIT((n) + 19)
+#define XILINX_DPDMA_EINTR_WR_AXI_ERR_MASK		GENMASK(24, 19)
+#define XILINX_DPDMA_EINTR_DESC_DONE_ERR(n)		BIT((n) + 25)
+#define XILINX_DPDMA_EINTR_DESC_DONE_ERR_MASK		GENMASK(30, 25)
+#define XILINX_DPDMA_EINTR_RD_CMD_FIFO_FULL		BIT(32)
+#define XILINX_DPDMA_EINTR_CHAN_ERR_MASK		0x02082082
+#define XILINX_DPDMA_EINTR_CHAN_ERR			0x7ffffffe
+#define XILINX_DPDMA_EINTR_GLOBAL_ERR			0x80000001
+#define XILINX_DPDMA_EINTR_ALL				0xffffffff
+#define XILINX_DPDMA_CNTL				0x100
+#define XILINX_DPDMA_GBL				0x104
+#define XILINX_DPDMA_GBL_TRIG_MASK(n)			((n) << 0)
+#define XILINX_DPDMA_GBL_RETRIG_MASK(n)			((n) << 6)
+#define XILINX_DPDMA_ALC0_CNTL				0x108
+#define XILINX_DPDMA_ALC0_STATUS			0x10c
+#define XILINX_DPDMA_ALC0_MAX				0x110
+#define XILINX_DPDMA_ALC0_MIN				0x114
+#define XILINX_DPDMA_ALC0_ACC				0x118
+#define XILINX_DPDMA_ALC0_ACC_TRAN			0x11c
+#define XILINX_DPDMA_ALC1_CNTL				0x120
+#define XILINX_DPDMA_ALC1_STATUS			0x124
+#define XILINX_DPDMA_ALC1_MAX				0x128
+#define XILINX_DPDMA_ALC1_MIN				0x12c
+#define XILINX_DPDMA_ALC1_ACC				0x130
+#define XILINX_DPDMA_ALC1_ACC_TRAN			0x134
+
+/* Channel register */
+#define XILINX_DPDMA_CH_BASE				0x200
+#define XILINX_DPDMA_CH_OFFSET				0x100
+#define XILINX_DPDMA_CH_DESC_START_ADDRE		0x000
+#define XILINX_DPDMA_CH_DESC_START_ADDRE_MASK		GENMASK(15, 0)
+#define XILINX_DPDMA_CH_DESC_START_ADDR			0x004
+#define XILINX_DPDMA_CH_DESC_NEXT_ADDRE			0x008
+#define XILINX_DPDMA_CH_DESC_NEXT_ADDR			0x00c
+#define XILINX_DPDMA_CH_PYLD_CUR_ADDRE			0x010
+#define XILINX_DPDMA_CH_PYLD_CUR_ADDR			0x014
+#define XILINX_DPDMA_CH_CNTL				0x018
+#define XILINX_DPDMA_CH_CNTL_ENABLE			BIT(0)
+#define XILINX_DPDMA_CH_CNTL_PAUSE			BIT(1)
+#define XILINX_DPDMA_CH_CNTL_QOS_DSCR_WR_MASK		GENMASK(5, 2)
+#define XILINX_DPDMA_CH_CNTL_QOS_DSCR_RD_MASK		GENMASK(9, 6)
+#define XILINX_DPDMA_CH_CNTL_QOS_DATA_RD_MASK		GENMASK(13, 10)
+#define XILINX_DPDMA_CH_CNTL_QOS_VID_CLASS		11
+#define XILINX_DPDMA_CH_STATUS				0x01c
+#define XILINX_DPDMA_CH_STATUS_OTRAN_CNT_MASK		GENMASK(24, 21)
+#define XILINX_DPDMA_CH_VDO				0x020
+#define XILINX_DPDMA_CH_PYLD_SZ				0x024
+#define XILINX_DPDMA_CH_DESC_ID				0x028
+
+/* DPDMA descriptor fields */
+#define XILINX_DPDMA_DESC_CONTROL_PREEMBLE		0xa5
+#define XILINX_DPDMA_DESC_CONTROL_COMPLETE_INTR		BIT(8)
+#define XILINX_DPDMA_DESC_CONTROL_DESC_UPDATE		BIT(9)
+#define XILINX_DPDMA_DESC_CONTROL_IGNORE_DONE		BIT(10)
+#define XILINX_DPDMA_DESC_CONTROL_FRAG_MODE		BIT(18)
+#define XILINX_DPDMA_DESC_CONTROL_LAST			BIT(19)
+#define XILINX_DPDMA_DESC_CONTROL_ENABLE_CRC		BIT(20)
+#define XILINX_DPDMA_DESC_CONTROL_LAST_OF_FRAME		BIT(21)
+#define XILINX_DPDMA_DESC_ID_MASK			GENMASK(15, 0)
+#define XILINX_DPDMA_DESC_HSIZE_STRIDE_HSIZE_MASK	GENMASK(17, 0)
+#define XILINX_DPDMA_DESC_HSIZE_STRIDE_STRIDE_MASK	GENMASK(31, 18)
+#define XILINX_DPDMA_DESC_ADDR_EXT_NEXT_ADDR_MASK	GENMASK(15, 0)
+#define XILINX_DPDMA_DESC_ADDR_EXT_SRC_ADDR_MASK	GENMASK(31, 16)
+
+#define XILINX_DPDMA_ALIGN_BYTES			256
+#define XILINX_DPDMA_LINESIZE_ALIGN_BITS		128
+
+#define XILINX_DPDMA_NUM_CHAN				6
+
+struct xilinx_dpdma_chan;
+
+/**
+ * struct xilinx_dpdma_hw_desc - DPDMA hardware descriptor
+ * @control: control configuration field
+ * @desc_id: descriptor ID
+ * @xfer_size: transfer size
+ * @hsize_stride: horizontal size and stride
+ * @timestamp_lsb: LSB of time stamp
+ * @timestamp_msb: MSB of time stamp
+ * @addr_ext: upper 16 bit of 48 bit address (next_desc and src_addr)
+ * @next_desc: next descriptor 32 bit address
+ * @src_addr: payload source address (1st page, 32 LSB)
+ * @addr_ext_23: payload source address (3nd and 3rd pages, 16 LSBs)
+ * @addr_ext_45: payload source address (4th and 5th pages, 16 LSBs)
+ * @src_addr2: payload source address (2nd page, 32 LSB)
+ * @src_addr3: payload source address (3rd page, 32 LSB)
+ * @src_addr4: payload source address (4th page, 32 LSB)
+ * @src_addr5: payload source address (5th page, 32 LSB)
+ * @crc: descriptor CRC
+ */
+struct xilinx_dpdma_hw_desc {
+	u32 control;
+	u32 desc_id;
+	u32 xfer_size;
+	u32 hsize_stride;
+	u32 timestamp_lsb;
+	u32 timestamp_msb;
+	u32 addr_ext;
+	u32 next_desc;
+	u32 src_addr;
+	u32 addr_ext_23;
+	u32 addr_ext_45;
+	u32 src_addr2;
+	u32 src_addr3;
+	u32 src_addr4;
+	u32 src_addr5;
+	u32 crc;
+} __aligned(XILINX_DPDMA_ALIGN_BYTES);
+
+/**
+ * struct xilinx_dpdma_sw_desc - DPDMA software descriptor
+ * @hw: DPDMA hardware descriptor
+ * @node: list node for software descriptors
+ * @dma_addr: DMA address of the software descriptor
+ */
+struct xilinx_dpdma_sw_desc {
+	struct xilinx_dpdma_hw_desc hw;
+	struct list_head node;
+	dma_addr_t dma_addr;
+};
+
+/**
+ * struct xilinx_dpdma_tx_desc - DPDMA transaction descriptor
+ * @vdesc: virtual DMA descriptor
+ * @chan: DMA channel
+ * @descriptors: list of software descriptors
+ * @error: an error has been detected with this descriptor
+ */
+struct xilinx_dpdma_tx_desc {
+	struct virt_dma_desc vdesc;
+	struct xilinx_dpdma_chan *chan;
+	struct list_head descriptors;
+	bool error;
+};
+
+#define to_dpdma_tx_desc(_desc) \
+	container_of(_desc, struct xilinx_dpdma_tx_desc, vdesc)
+
+/**
+ * struct xilinx_dpdma_chan - DPDMA channel
+ * @vchan: virtual DMA channel
+ * @reg: register base address
+ * @id: channel ID
+ * @wait_to_stop: queue to wait for outstanding transacitons before stopping
+ * @running: true if the channel is running
+ * @first_frame: flag for the first frame of stream
+ * @video_group: flag if multi-channel operation is needed for video channels
+ * @lock: lock to access struct xilinx_dpdma_chan
+ * @desc_pool: descriptor allocation pool
+ * @err_task: error IRQ bottom half handler
+ * @desc.pending: Descriptor schedule to the hardware, pending execution
+ * @desc.active: Descriptor being executed by the hardware
+ * @xdev: DPDMA device
+ */
+struct xilinx_dpdma_chan {
+	struct virt_dma_chan vchan;
+	void __iomem *reg;
+	unsigned int id;
+
+	wait_queue_head_t wait_to_stop;
+	bool running;
+	bool first_frame;
+	bool video_group;
+
+	spinlock_t lock; /* lock to access struct xilinx_dpdma_chan */
+	struct dma_pool *desc_pool;
+	struct tasklet_struct err_task;
+
+	struct {
+		struct xilinx_dpdma_tx_desc *pending;
+		struct xilinx_dpdma_tx_desc *active;
+	} desc;
+
+	struct xilinx_dpdma_device *xdev;
+};
+
+#define to_xilinx_chan(_chan) \
+	container_of(_chan, struct xilinx_dpdma_chan, vchan.chan)
+
+/**
+ * struct xilinx_dpdma_device - DPDMA device
+ * @common: generic dma device structure
+ * @reg: register base address
+ * @dev: generic device structure
+ * @irq: the interrupt number
+ * @axi_clk: axi clock
+ * @chan: DPDMA channels
+ * @ext_addr: flag for 64 bit system (48 bit addressing)
+ */
+struct xilinx_dpdma_device {
+	struct dma_device common;
+	void __iomem *reg;
+	struct device *dev;
+	int irq;
+
+	struct clk *axi_clk;
+	struct xilinx_dpdma_chan *chan[XILINX_DPDMA_NUM_CHAN];
+
+	bool ext_addr;
+};
+
+/* -----------------------------------------------------------------------------
+ * I/O Accessors
+ */
+
+static inline u32 dpdma_read(void __iomem *base, u32 offset)
+{
+	return ioread32(base + offset);
+}
+
+static inline void dpdma_write(void __iomem *base, u32 offset, u32 val)
+{
+	iowrite32(val, base + offset);
+}
+
+static inline void dpdma_clr(void __iomem *base, u32 offset, u32 clr)
+{
+	dpdma_write(base, offset, dpdma_read(base, offset) & ~clr);
+}
+
+static inline void dpdma_set(void __iomem *base, u32 offset, u32 set)
+{
+	dpdma_write(base, offset, dpdma_read(base, offset) | set);
+}
+
+/* -----------------------------------------------------------------------------
+ * Descriptor Operations
+ */
+
+/**
+ * xilinx_dpdma_sw_desc_set_dma_addrs - Set DMA addresses in the descriptor
+ * @sw_desc: The software descriptor in which to set DMA addresses
+ * @prev: The previous descriptor
+ * @dma_addr: array of dma addresses
+ * @num_src_addr: number of addresses in @dma_addr
+ *
+ * Set all the DMA addresses in the hardware descriptor corresponding to @dev
+ * from @dma_addr. If a previous descriptor is specified in @prev, its next
+ * descriptor DMA address is set to the DMA address of @sw_desc. @prev may be
+ * identical to @sw_desc for cyclic transfers.
+ */
+static void xilinx_dpdma_sw_desc_set_dma_addrs(struct xilinx_dpdma_device *xdev,
+					       struct xilinx_dpdma_sw_desc *sw_desc,
+					       struct xilinx_dpdma_sw_desc *prev,
+					       dma_addr_t dma_addr[],
+					       unsigned int num_src_addr)
+{
+	struct xilinx_dpdma_hw_desc *hw_desc = &sw_desc->hw;
+	unsigned int i;
+
+	hw_desc->src_addr = lower_32_bits(dma_addr[0]);
+	if (xdev->ext_addr)
+		hw_desc->addr_ext |=
+			FIELD_PREP(XILINX_DPDMA_DESC_ADDR_EXT_SRC_ADDR_MASK,
+				   upper_32_bits(dma_addr[0]));
+
+	for (i = 1; i < num_src_addr; i++) {
+		u32 *addr = &hw_desc->src_addr2;
+
+		addr[i-1] = lower_32_bits(dma_addr[i]);
+
+		if (xdev->ext_addr) {
+			u32 *addr_ext = &hw_desc->addr_ext_23;
+			u32 addr_msb;
+
+			addr_msb = upper_32_bits(dma_addr[i]) & GENMASK(15, 0);
+			addr_msb <<= 16 * ((i - 1) % 2);
+			addr_ext[(i - 1) / 2] |= addr_msb;
+		}
+	}
+
+	if (!prev)
+		return;
+
+	prev->hw.next_desc = lower_32_bits(sw_desc->dma_addr);
+	if (xdev->ext_addr)
+		prev->hw.addr_ext |=
+			FIELD_PREP(XILINX_DPDMA_DESC_ADDR_EXT_NEXT_ADDR_MASK,
+				   upper_32_bits(sw_desc->dma_addr));
+}
+
+/**
+ * xilinx_dpdma_chan_alloc_sw_desc - Allocate a software descriptor
+ * @chan: DPDMA channel
+ *
+ * Allocate a software descriptor from the channel's descriptor pool.
+ *
+ * Return: a software descriptor or NULL.
+ */
+static struct xilinx_dpdma_sw_desc *
+xilinx_dpdma_chan_alloc_sw_desc(struct xilinx_dpdma_chan *chan)
+{
+	struct xilinx_dpdma_sw_desc *sw_desc;
+	dma_addr_t dma_addr;
+
+	sw_desc = dma_pool_zalloc(chan->desc_pool, GFP_ATOMIC, &dma_addr);
+	if (!sw_desc)
+		return NULL;
+
+	sw_desc->dma_addr = dma_addr;
+
+	return sw_desc;
+}
+
+/**
+ * xilinx_dpdma_chan_free_sw_desc - Free a software descriptor
+ * @chan: DPDMA channel
+ * @sw_desc: software descriptor to free
+ *
+ * Free a software descriptor from the channel's descriptor pool.
+ */
+static void
+xilinx_dpdma_chan_free_sw_desc(struct xilinx_dpdma_chan *chan,
+			       struct xilinx_dpdma_sw_desc *sw_desc)
+{
+	dma_pool_free(chan->desc_pool, sw_desc, sw_desc->dma_addr);
+}
+
+/**
+ * xilinx_dpdma_chan_dump_tx_desc - Dump a tx descriptor
+ * @chan: DPDMA channel
+ * @tx_desc: tx descriptor to dump
+ *
+ * Dump contents of a tx descriptor
+ */
+static void xilinx_dpdma_chan_dump_tx_desc(struct xilinx_dpdma_chan *chan,
+					   struct xilinx_dpdma_tx_desc *tx_desc)
+{
+	struct xilinx_dpdma_sw_desc *sw_desc;
+	struct device *dev = chan->xdev->dev;
+	unsigned int i = 0;
+
+	dev_dbg(dev, "------- TX descriptor dump start -------\n");
+	dev_dbg(dev, "------- channel ID = %d -------\n", chan->id);
+
+	list_for_each_entry(sw_desc, &tx_desc->descriptors, node) {
+		struct xilinx_dpdma_hw_desc *hw_desc = &sw_desc->hw;
+
+		dev_dbg(dev, "------- HW descriptor %d -------\n", i++);
+		dev_dbg(dev, "descriptor DMA addr: %pad\n", &sw_desc->dma_addr);
+		dev_dbg(dev, "control: 0x%08x\n", hw_desc->control);
+		dev_dbg(dev, "desc_id: 0x%08x\n", hw_desc->desc_id);
+		dev_dbg(dev, "xfer_size: 0x%08x\n", hw_desc->xfer_size);
+		dev_dbg(dev, "hsize_stride: 0x%08x\n", hw_desc->hsize_stride);
+		dev_dbg(dev, "timestamp_lsb: 0x%08x\n", hw_desc->timestamp_lsb);
+		dev_dbg(dev, "timestamp_msb: 0x%08x\n", hw_desc->timestamp_msb);
+		dev_dbg(dev, "addr_ext: 0x%08x\n", hw_desc->addr_ext);
+		dev_dbg(dev, "next_desc: 0x%08x\n", hw_desc->next_desc);
+		dev_dbg(dev, "src_addr: 0x%08x\n", hw_desc->src_addr);
+		dev_dbg(dev, "addr_ext_23: 0x%08x\n", hw_desc->addr_ext_23);
+		dev_dbg(dev, "addr_ext_45: 0x%08x\n", hw_desc->addr_ext_45);
+		dev_dbg(dev, "src_addr2: 0x%08x\n", hw_desc->src_addr2);
+		dev_dbg(dev, "src_addr3: 0x%08x\n", hw_desc->src_addr3);
+		dev_dbg(dev, "src_addr4: 0x%08x\n", hw_desc->src_addr4);
+		dev_dbg(dev, "src_addr5: 0x%08x\n", hw_desc->src_addr5);
+		dev_dbg(dev, "crc: 0x%08x\n", hw_desc->crc);
+	}
+
+	dev_dbg(dev, "------- TX descriptor dump end -------\n");
+}
+
+/**
+ * xilinx_dpdma_chan_alloc_tx_desc - Allocate a transaction descriptor
+ * @chan: DPDMA channel
+ *
+ * Allocate a tx descriptor.
+ *
+ * Return: a tx descriptor or NULL.
+ */
+static struct xilinx_dpdma_tx_desc *
+xilinx_dpdma_chan_alloc_tx_desc(struct xilinx_dpdma_chan *chan)
+{
+	struct xilinx_dpdma_tx_desc *tx_desc;
+
+	tx_desc = kzalloc(sizeof(*tx_desc), GFP_KERNEL);
+	if (!tx_desc)
+		return NULL;
+
+	INIT_LIST_HEAD(&tx_desc->descriptors);
+	tx_desc->chan = chan;
+	tx_desc->error = false;
+
+	return tx_desc;
+}
+
+/**
+ * xilinx_dpdma_chan_free_tx_desc - Free a virtual DMA descriptor
+ * @vdesc: virtual DMA descriptor
+ *
+ * Free the virtual DMA descriptor @vdesc including its software descriptors.
+ */
+static void xilinx_dpdma_chan_free_tx_desc(struct virt_dma_desc *vdesc)
+{
+	struct xilinx_dpdma_sw_desc *sw_desc, *next;
+	struct xilinx_dpdma_tx_desc *desc;
+
+	if (!vdesc)
+		return;
+
+	desc = to_dpdma_tx_desc(vdesc);
+
+	list_for_each_entry_safe(sw_desc, next, &desc->descriptors, node) {
+		list_del(&sw_desc->node);
+		xilinx_dpdma_chan_free_sw_desc(desc->chan, sw_desc);
+	}
+
+	kfree(desc);
+}
+
+/**
+ * xilinx_dpdma_chan_prep_interleaved_cyclic - Prepare a cyclic interleaved dma
+ *					       descriptor
+ * @chan: DPDMA channel
+ * @xt: dma interleaved template
+ *
+ * Prepare a tx descriptor including internal software/hardware descriptors
+ * based on @xt.
+ *
+ * Return: A DPDMA TX descriptor on success, or NULL.
+ */
+static struct xilinx_dpdma_tx_desc *
+xilinx_dpdma_chan_prep_interleaved_cyclic(struct xilinx_dpdma_chan *chan,
+					  struct dma_interleaved_template *xt)
+{
+	struct xilinx_dpdma_tx_desc *tx_desc;
+	struct xilinx_dpdma_sw_desc *sw_desc;
+	struct xilinx_dpdma_hw_desc *hw_desc;
+	size_t hsize = xt->sgl[0].size;
+	size_t stride = hsize + xt->sgl[0].icg;
+
+	if (!IS_ALIGNED(xt->src_start, XILINX_DPDMA_ALIGN_BYTES)) {
+		dev_err(chan->xdev->dev, "buffer should be aligned at %d B\n",
+			XILINX_DPDMA_ALIGN_BYTES);
+		return NULL;
+	}
+
+	tx_desc = xilinx_dpdma_chan_alloc_tx_desc(chan);
+	if (!tx_desc)
+		return NULL;
+
+	sw_desc = xilinx_dpdma_chan_alloc_sw_desc(chan);
+	if (!sw_desc) {
+		xilinx_dpdma_chan_free_tx_desc(&tx_desc->vdesc);
+		return NULL;
+	}
+
+	xilinx_dpdma_sw_desc_set_dma_addrs(chan->xdev, sw_desc, sw_desc,
+					   &xt->src_start, 1);
+
+	hw_desc = &sw_desc->hw;
+	hsize = ALIGN(hsize, XILINX_DPDMA_LINESIZE_ALIGN_BITS / 8);
+	hw_desc->xfer_size = hsize * xt->numf;
+	hw_desc->hsize_stride =
+		FIELD_PREP(XILINX_DPDMA_DESC_HSIZE_STRIDE_HSIZE_MASK, hsize) |
+		FIELD_PREP(XILINX_DPDMA_DESC_HSIZE_STRIDE_STRIDE_MASK,
+			   stride / 16);
+	hw_desc->control |= XILINX_DPDMA_DESC_CONTROL_PREEMBLE;
+	hw_desc->control |= XILINX_DPDMA_DESC_CONTROL_COMPLETE_INTR;
+	hw_desc->control |= XILINX_DPDMA_DESC_CONTROL_IGNORE_DONE;
+	hw_desc->control |= XILINX_DPDMA_DESC_CONTROL_LAST_OF_FRAME;
+
+	list_add_tail(&sw_desc->node, &tx_desc->descriptors);
+
+	return tx_desc;
+}
+
+/* -----------------------------------------------------------------------------
+ * DPDMA Channel Operations
+ */
+
+/**
+ * xilinx_dpdma_chan_enable - Enable the channel
+ * @chan: DPDMA channel
+ *
+ * Enable the channel and its interrupts. Set the QoS values for video class.
+ */
+static void xilinx_dpdma_chan_enable(struct xilinx_dpdma_chan *chan)
+{
+	u32 reg;
+
+	reg = (XILINX_DPDMA_INTR_CHAN_MASK << chan->id)
+	    | XILINX_DPDMA_INTR_GLOBAL_MASK;
+	dpdma_write(chan->xdev->reg, XILINX_DPDMA_IEN, reg);
+	reg = (XILINX_DPDMA_EINTR_CHAN_ERR_MASK << chan->id)
+	    | XILINX_DPDMA_INTR_GLOBAL_ERR;
+	dpdma_write(chan->xdev->reg, XILINX_DPDMA_EIEN, reg);
+
+	reg = XILINX_DPDMA_CH_CNTL_ENABLE
+	    | FIELD_PREP(XILINX_DPDMA_CH_CNTL_QOS_DSCR_WR_MASK,
+			 XILINX_DPDMA_CH_CNTL_QOS_VID_CLASS)
+	    | FIELD_PREP(XILINX_DPDMA_CH_CNTL_QOS_DSCR_RD_MASK,
+			 XILINX_DPDMA_CH_CNTL_QOS_VID_CLASS)
+	    | FIELD_PREP(XILINX_DPDMA_CH_CNTL_QOS_DATA_RD_MASK,
+			 XILINX_DPDMA_CH_CNTL_QOS_VID_CLASS);
+	dpdma_set(chan->reg, XILINX_DPDMA_CH_CNTL, reg);
+}
+
+/**
+ * xilinx_dpdma_chan_disable - Disable the channel
+ * @chan: DPDMA channel
+ *
+ * Disable the channel and its interrupts.
+ */
+static void xilinx_dpdma_chan_disable(struct xilinx_dpdma_chan *chan)
+{
+	u32 reg;
+
+	reg = XILINX_DPDMA_INTR_CHAN_MASK << chan->id;
+	dpdma_write(chan->xdev->reg, XILINX_DPDMA_IEN, reg);
+	reg = XILINX_DPDMA_EINTR_CHAN_ERR_MASK << chan->id;
+	dpdma_write(chan->xdev->reg, XILINX_DPDMA_EIEN, reg);
+
+	dpdma_clr(chan->reg, XILINX_DPDMA_CH_CNTL, XILINX_DPDMA_CH_CNTL_ENABLE);
+}
+
+/**
+ * xilinx_dpdma_chan_pause - Pause the channel
+ * @chan: DPDMA channel
+ *
+ * Pause the channel.
+ */
+static void xilinx_dpdma_chan_pause(struct xilinx_dpdma_chan *chan)
+{
+	dpdma_set(chan->reg, XILINX_DPDMA_CH_CNTL, XILINX_DPDMA_CH_CNTL_PAUSE);
+}
+
+/**
+ * xilinx_dpdma_chan_unpause - Unpause the channel
+ * @chan: DPDMA channel
+ *
+ * Unpause the channel.
+ */
+static void xilinx_dpdma_chan_unpause(struct xilinx_dpdma_chan *chan)
+{
+	dpdma_clr(chan->reg, XILINX_DPDMA_CH_CNTL, XILINX_DPDMA_CH_CNTL_PAUSE);
+}
+
+static u32 xilinx_dpdma_chan_video_group_ready(struct xilinx_dpdma_chan *chan)
+{
+	struct xilinx_dpdma_device *xdev = chan->xdev;
+	u32 channels = 0;
+	unsigned int i;
+
+	for (i = ZYNQMP_DPDMA_VIDEO0; i <= ZYNQMP_DPDMA_VIDEO2; i++) {
+		if (xdev->chan[i]->video_group && !xdev->chan[i]->running)
+			return 0;
+
+		if (xdev->chan[i]->video_group)
+			channels |= BIT(i);
+	}
+
+	return channels;
+}
+
+/**
+ * xilinx_dpdma_chan_queue_transfer - Queue the next transfer
+ * @chan: DPDMA channel
+ *
+ * Queue the next descriptor, if any, to the hardware. If the channel is
+ * stopped, start it first. Otherwise retrigger it with the next descriptor.
+ */
+static void xilinx_dpdma_chan_queue_transfer(struct xilinx_dpdma_chan *chan)
+{
+	struct xilinx_dpdma_device *xdev = chan->xdev;
+	struct xilinx_dpdma_sw_desc *sw_desc;
+	struct xilinx_dpdma_tx_desc *desc;
+	struct virt_dma_desc *vdesc;
+	u32 reg, channels;
+
+	lockdep_assert_held(&chan->lock);
+
+	if (chan->desc.pending)
+		return;
+
+	if (!chan->running) {
+		xilinx_dpdma_chan_unpause(chan);
+		xilinx_dpdma_chan_enable(chan);
+		chan->first_frame = true;
+		chan->running = true;
+	}
+
+	if (chan->video_group)
+		channels = xilinx_dpdma_chan_video_group_ready(chan);
+	else
+		channels = BIT(chan->id);
+
+	if (!channels)
+		return;
+
+	vdesc = vchan_next_desc(&chan->vchan);
+	if (!vdesc)
+		return;
+
+	desc = to_dpdma_tx_desc(vdesc);
+	chan->desc.pending = desc;
+	list_del(&desc->vdesc.node);
+
+	/*
+	 * Assign the cookie to descriptors in this transaction. Only 16 bit
+	 * will be used, but it should be enough.
+	 */
+	list_for_each_entry(sw_desc, &desc->descriptors, node)
+		sw_desc->hw.desc_id = desc->vdesc.tx.cookie;
+
+	sw_desc = list_first_entry(&desc->descriptors,
+				   struct xilinx_dpdma_sw_desc, node);
+	dpdma_write(chan->reg, XILINX_DPDMA_CH_DESC_START_ADDR,
+		    lower_32_bits(sw_desc->dma_addr));
+	if (xdev->ext_addr)
+		dpdma_write(chan->reg, XILINX_DPDMA_CH_DESC_START_ADDRE,
+			    FIELD_PREP(XILINX_DPDMA_CH_DESC_START_ADDRE_MASK,
+				       upper_32_bits(sw_desc->dma_addr)));
+
+	if (chan->first_frame)
+		reg = XILINX_DPDMA_GBL_TRIG_MASK(channels);
+	else
+		reg = XILINX_DPDMA_GBL_RETRIG_MASK(channels);
+
+	chan->first_frame = false;
+
+	dpdma_write(xdev->reg, XILINX_DPDMA_GBL, reg);
+}
+
+/**
+ * xilinx_dpdma_chan_ostand - Number of outstanding transactions
+ * @chan: DPDMA channel
+ *
+ * Read and return the number of outstanding transactions from register.
+ *
+ * Return: Number of outstanding transactions from the status register.
+ */
+static u32 xilinx_dpdma_chan_ostand(struct xilinx_dpdma_chan *chan)
+{
+	return FIELD_GET(XILINX_DPDMA_CH_STATUS_OTRAN_CNT_MASK,
+			 dpdma_read(chan->reg, XILINX_DPDMA_CH_STATUS));
+}
+
+/**
+ * xilinx_dpdma_chan_no_ostand - Notify no outstanding transaction event
+ * @chan: DPDMA channel
+ *
+ * Notify waiters for no outstanding event, so waiters can stop the channel
+ * safely. This function is supposed to be called when 'no outstanding'
+ * interrupt is generated. The 'no outstanding' interrupt is disabled and
+ * should be re-enabled when this event is handled. If the channel status
+ * register still shows some number of outstanding transactions, the interrupt
+ * remains enabled.
+ *
+ * Return: 0 on success. On failure, -EWOULDBLOCK if there's still outstanding
+ * transaction(s).
+ */
+static int xilinx_dpdma_chan_notify_no_ostand(struct xilinx_dpdma_chan *chan)
+{
+	u32 cnt;
+
+	cnt = xilinx_dpdma_chan_ostand(chan);
+	if (cnt) {
+		dev_dbg(chan->xdev->dev, "%d outstanding transactions\n", cnt);
+		return -EWOULDBLOCK;
+	}
+
+	/* Disable 'no outstanding' interrupt */
+	dpdma_write(chan->xdev->reg, XILINX_DPDMA_IDS,
+		    XILINX_DPDMA_INTR_NO_OSTAND(chan->id));
+	wake_up(&chan->wait_to_stop);
+
+	return 0;
+}
+
+/**
+ * xilinx_dpdma_chan_wait_no_ostand - Wait for the no outstanding irq
+ * @chan: DPDMA channel
+ *
+ * Wait for the no outstanding transaction interrupt. This functions can sleep
+ * for 50ms.
+ *
+ * Return: 0 on success. On failure, -ETIMEOUT for time out, or the error code
+ * from wait_event_interruptible_timeout().
+ */
+static int xilinx_dpdma_chan_wait_no_ostand(struct xilinx_dpdma_chan *chan)
+{
+	int ret;
+
+	/* Wait for a no outstanding transaction interrupt upto 50msec */
+	ret = wait_event_interruptible_timeout(chan->wait_to_stop,
+					       !xilinx_dpdma_chan_ostand(chan),
+					       msecs_to_jiffies(50));
+	if (ret > 0) {
+		dpdma_write(chan->xdev->reg, XILINX_DPDMA_IEN,
+			    XILINX_DPDMA_INTR_NO_OSTAND(chan->id));
+		return 0;
+	}
+
+	dev_err(chan->xdev->dev, "not ready to stop: %d trans\n",
+		xilinx_dpdma_chan_ostand(chan));
+
+	if (ret == 0)
+		return -ETIMEDOUT;
+
+	return ret;
+}
+
+/**
+ * xilinx_dpdma_chan_poll_no_ostand - Poll the outstanding transaction status
+ * @chan: DPDMA channel
+ *
+ * Poll the outstanding transaction status, and return when there's no
+ * outstanding transaction. This functions can be used in the interrupt context
+ * or where the atomicity is required. Calling thread may wait more than 50ms.
+ *
+ * Return: 0 on success, or -ETIMEDOUT.
+ */
+static int xilinx_dpdma_chan_poll_no_ostand(struct xilinx_dpdma_chan *chan)
+{
+	u32 cnt, loop = 50000;
+
+	/* Poll at least for 50ms (20 fps). */
+	do {
+		cnt = xilinx_dpdma_chan_ostand(chan);
+		udelay(1);
+	} while (loop-- > 0 && cnt);
+
+	if (loop) {
+		dpdma_write(chan->xdev->reg, XILINX_DPDMA_IEN,
+			    XILINX_DPDMA_INTR_NO_OSTAND(chan->id));
+		return 0;
+	}
+
+	dev_err(chan->xdev->dev, "not ready to stop: %d trans\n",
+		xilinx_dpdma_chan_ostand(chan));
+
+	return -ETIMEDOUT;
+}
+
+/**
+ * xilinx_dpdma_chan_stop - Stop the channel
+ * @chan: DPDMA channel
+ *
+ * Stop a previously paused channel by first waiting for completion of all
+ * outstanding transaction and then disabling the channel.
+ *
+ * Return: 0 on success, or -ETIMEDOUT if the channel failed to stop.
+ */
+static int xilinx_dpdma_chan_stop(struct xilinx_dpdma_chan *chan)
+{
+	unsigned long flags;
+	int ret;
+
+	ret = xilinx_dpdma_chan_wait_no_ostand(chan);
+	if (ret)
+		return ret;
+
+	spin_lock_irqsave(&chan->lock, flags);
+	xilinx_dpdma_chan_disable(chan);
+	chan->running = false;
+	spin_unlock_irqrestore(&chan->lock, flags);
+
+	return 0;
+}
+
+/**
+ * xilinx_dpdma_chan_done_irq - Handle hardware descriptor completion
+ * @chan: DPDMA channel
+ *
+ * Handle completion of the currently active descriptor (@chan->desc.active). As
+ * we currently support cyclic transfers only, this just invokes the cyclic
+ * callback. The descriptor will be completed at the VSYNC interrupt when a new
+ * descriptor replaces it.
+ */
+static void xilinx_dpdma_chan_done_irq(struct xilinx_dpdma_chan *chan)
+{
+	struct xilinx_dpdma_tx_desc *active = chan->desc.active;
+	unsigned long flags;
+
+	spin_lock_irqsave(&chan->lock, flags);
+
+	if (active)
+		vchan_cyclic_callback(&active->vdesc);
+	else
+		dev_warn(chan->xdev->dev,
+			 "DONE IRQ with no active descriptor!\n");
+
+	spin_unlock_irqrestore(&chan->lock, flags);
+}
+
+/**
+ * xilinx_dpdma_chan_vsync_irq - Handle hardware descriptor scheduling
+ * @chan: DPDMA channel
+ *
+ * At VSYNC the active descriptor may have been replaced by the pending
+ * descriptor. Detect this through the DESC_ID and perform appropriate
+ * bookkeeping.
+ */
+static void xilinx_dpdma_chan_vsync_irq(struct  xilinx_dpdma_chan *chan)
+{
+	struct xilinx_dpdma_tx_desc *pending;
+	struct xilinx_dpdma_sw_desc *sw_desc;
+	unsigned long flags;
+	u32 desc_id;
+
+	spin_lock_irqsave(&chan->lock, flags);
+
+	pending = chan->desc.pending;
+	if (!chan->running || !pending)
+		goto out;
+
+	desc_id = dpdma_read(chan->reg, XILINX_DPDMA_CH_DESC_ID);
+
+	/* If the retrigger raced with vsync, retry at the next frame. */
+	sw_desc = list_first_entry(&pending->descriptors,
+				   struct xilinx_dpdma_sw_desc, node);
+	if (sw_desc->hw.desc_id != desc_id)
+		goto out;
+
+	/*
+	 * Complete the active descriptor, if any, promote the pending
+	 * descriptor to active, and queue the next transfer, if any.
+	 */
+	if (chan->desc.active)
+		vchan_cookie_complete(&chan->desc.active->vdesc);
+	chan->desc.active = pending;
+	chan->desc.pending = NULL;
+
+	xilinx_dpdma_chan_queue_transfer(chan);
+
+out:
+	spin_unlock_irqrestore(&chan->lock, flags);
+}
+
+/**
+ * xilinx_dpdma_chan_err - Detect any channel error
+ * @chan: DPDMA channel
+ * @isr: masked Interrupt Status Register
+ * @eisr: Error Interrupt Status Register
+ *
+ * Return: true if any channel error occurs, or false otherwise.
+ */
+static bool
+xilinx_dpdma_chan_err(struct xilinx_dpdma_chan *chan, u32 isr, u32 eisr)
+{
+	if (!chan)
+		return false;
+
+	if (chan->running &&
+	    ((isr & (XILINX_DPDMA_INTR_CHAN_ERR_MASK << chan->id)) ||
+	    (eisr & (XILINX_DPDMA_EINTR_CHAN_ERR_MASK << chan->id))))
+		return true;
+
+	return false;
+}
+
+/**
+ * xilinx_dpdma_chan_handle_err - DPDMA channel error handling
+ * @chan: DPDMA channel
+ *
+ * This function is called when any channel error or any global error occurs.
+ * The function disables the paused channel by errors and determines
+ * if the current active descriptor can be rescheduled depending on
+ * the descriptor status.
+ */
+static void xilinx_dpdma_chan_handle_err(struct xilinx_dpdma_chan *chan)
+{
+	struct xilinx_dpdma_device *xdev = chan->xdev;
+	struct xilinx_dpdma_tx_desc *active;
+	unsigned long flags;
+
+	spin_lock_irqsave(&chan->lock, flags);
+
+	dev_dbg(xdev->dev, "cur desc addr = 0x%04x%08x\n",
+		dpdma_read(chan->reg, XILINX_DPDMA_CH_DESC_START_ADDRE),
+		dpdma_read(chan->reg, XILINX_DPDMA_CH_DESC_START_ADDR));
+	dev_dbg(xdev->dev, "cur payload addr = 0x%04x%08x\n",
+		dpdma_read(chan->reg, XILINX_DPDMA_CH_PYLD_CUR_ADDRE),
+		dpdma_read(chan->reg, XILINX_DPDMA_CH_PYLD_CUR_ADDR));
+
+	xilinx_dpdma_chan_disable(chan);
+	chan->running = false;
+
+	if (!chan->desc.active)
+		goto out_unlock;
+
+	active = chan->desc.active;
+	chan->desc.active = NULL;
+
+	xilinx_dpdma_chan_dump_tx_desc(chan, active);
+
+	if (active->error)
+		dev_dbg(xdev->dev, "repeated error on desc\n");
+
+	/* Reschedule if there's no new descriptor */
+	if (!chan->desc.pending &&
+	    list_empty(&chan->vchan.desc_issued)) {
+		active->error = true;
+		list_add_tail(&active->vdesc.node,
+			      &chan->vchan.desc_issued);
+	} else {
+		xilinx_dpdma_chan_free_tx_desc(&active->vdesc);
+	}
+
+out_unlock:
+	spin_unlock_irqrestore(&chan->lock, flags);
+}
+
+/* -----------------------------------------------------------------------------
+ * DMA Engine Operations
+ */
+
+static struct dma_async_tx_descriptor *
+xilinx_dpdma_prep_interleaved_cyclic(struct dma_chan *dchan,
+				     struct dma_interleaved_template *xt,
+				     unsigned long flags)
+{
+	struct xilinx_dpdma_chan *chan = to_xilinx_chan(dchan);
+	struct xilinx_dpdma_tx_desc *desc;
+
+	if (xt->dir != DMA_MEM_TO_DEV)
+		return NULL;
+
+	if (!xt->numf || !xt->sgl[0].size)
+		return NULL;
+
+	desc = xilinx_dpdma_chan_prep_interleaved_cyclic(chan, xt);
+	if (!desc)
+		return NULL;
+
+	vchan_tx_prep(&chan->vchan, &desc->vdesc, flags | DMA_CTRL_ACK);
+
+	return &desc->vdesc.tx;
+}
+
+/**
+ * xilinx_dpdma_alloc_chan_resources - Allocate resources for the channel
+ * @dchan: DMA channel
+ *
+ * Allocate a descriptor pool for the channel.
+ *
+ * Return: 0 on success, or -ENOMEM if failed to allocate a pool.
+ */
+static int xilinx_dpdma_alloc_chan_resources(struct dma_chan *dchan)
+{
+	struct xilinx_dpdma_chan *chan = to_xilinx_chan(dchan);
+	size_t align = __alignof__(struct xilinx_dpdma_sw_desc);
+
+	chan->desc_pool = dma_pool_create(dev_name(chan->xdev->dev),
+					  chan->xdev->dev,
+					  sizeof(struct xilinx_dpdma_sw_desc),
+					  align, 0);
+	if (!chan->desc_pool) {
+		dev_err(chan->xdev->dev,
+			"failed to allocate a descriptor pool\n");
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+/**
+ * xilinx_dpdma_free_chan_resources - Free all resources for the channel
+ * @dchan: DMA channel
+ *
+ * Free resources associated with the virtual DMA channel, and destroy the
+ * descriptor pool.
+ */
+static void xilinx_dpdma_free_chan_resources(struct dma_chan *dchan)
+{
+	struct xilinx_dpdma_chan *chan = to_xilinx_chan(dchan);
+
+	vchan_free_chan_resources(&chan->vchan);
+
+	dma_pool_destroy(chan->desc_pool);
+	chan->desc_pool = NULL;
+}
+
+static void xilinx_dpdma_issue_pending(struct dma_chan *dchan)
+{
+	struct xilinx_dpdma_chan *chan = to_xilinx_chan(dchan);
+	unsigned long flags;
+
+	spin_lock_irqsave(&chan->vchan.lock, flags);
+	if (vchan_issue_pending(&chan->vchan))
+		xilinx_dpdma_chan_queue_transfer(chan);
+	spin_unlock_irqrestore(&chan->vchan.lock, flags);
+}
+
+static int xilinx_dpdma_config(struct dma_chan *dchan,
+			       struct dma_slave_config *config)
+{
+	struct xilinx_dpdma_chan *chan = to_xilinx_chan(dchan);
+	unsigned long flags;
+	int ret;
+
+	if (config->direction != DMA_MEM_TO_DEV)
+		return -EINVAL;
+
+	/*
+	 * The destination address doesn't need to be specified as the DPDMA is
+	 * hardwired to the destination (the DP controller). The transfer
+	 * width, burst size and port window size are thus meaningless, they're
+	 * fixed both on the DPDMA side and on the DP controller side.
+	 */
+
+	spin_lock_irqsave(&chan->lock, flags);
+
+	/* Can't reconfigure a running channel. */
+	if (chan->running) {
+		ret = -EBUSY;
+		goto unlock;
+	}
+
+	/*
+	 * Abuse the slave_id to indicate that the channel is part of a video
+	 * group.
+	 */
+	if (chan->id >= ZYNQMP_DPDMA_VIDEO0 && chan->id <= ZYNQMP_DPDMA_VIDEO2)
+		chan->video_group = config->slave_id != 0;
+
+unlock:
+	spin_unlock_irqrestore(&chan->lock, flags);
+
+	return ret;
+}
+
+static int xilinx_dpdma_pause(struct dma_chan *dchan)
+{
+	xilinx_dpdma_chan_pause(to_xilinx_chan(dchan));
+
+	return 0;
+}
+
+static int xilinx_dpdma_resume(struct dma_chan *dchan)
+{
+	xilinx_dpdma_chan_unpause(to_xilinx_chan(dchan));
+
+	return 0;
+}
+
+/**
+ * xilinx_dpdma_terminate_all - Terminate the channel and descriptors
+ * @dchan: DMA channel
+ *
+ * Pause the channel without waiting for ongoing transfers to complete. Waiting
+ * for completion is performed by xilinx_dpdma_synchronize() that will disable
+ * the channel to complete the stop.
+ *
+ * All the descriptors associated with the channel that are guaranteed not to
+ * be touched by the hardware. The pending and active descriptor are not
+ * touched, and will be freed either upon completion, or by
+ * xilinx_dpdma_synchronize().
+ *
+ * Return: 0 on success, or -ETIMEDOUT if the channel failed to stop.
+ */
+static int xilinx_dpdma_terminate_all(struct dma_chan *dchan)
+{
+	struct xilinx_dpdma_chan *chan = to_xilinx_chan(dchan);
+	struct xilinx_dpdma_device *xdev = chan->xdev;
+	LIST_HEAD(descriptors);
+	unsigned long flags;
+	unsigned int i;
+
+	/* Pause the channel (including the whole video group if applicable). */
+	if (chan->video_group) {
+		for (i = ZYNQMP_DPDMA_VIDEO0; i <= ZYNQMP_DPDMA_VIDEO2; i++) {
+			if (xdev->chan[i]->video_group &&
+			    xdev->chan[i]->running) {
+				xilinx_dpdma_chan_pause(xdev->chan[i]);
+				xdev->chan[i]->video_group = false;
+			}
+		}
+	} else {
+		xilinx_dpdma_chan_pause(chan);
+	}
+
+	/* Gather all the descriptors we can free and free them. */
+	spin_lock_irqsave(&chan->vchan.lock, flags);
+	vchan_get_all_descriptors(&chan->vchan, &descriptors);
+	spin_unlock_irqrestore(&chan->vchan.lock, flags);
+
+	vchan_dma_desc_free_list(&chan->vchan, &descriptors);
+
+	return 0;
+}
+
+/**
+ * xilinx_dpdma_synchronize - Synchronize callback execution
+ * @dchan: DMA channel
+ *
+ * Synchronizing callback execution ensures that all previously issued
+ * transfers have completed and all associated callbacks have been called and
+ * have returned.
+ *
+ * This function waits for the DMA channel to stop. It assumes it has been
+ * paused by a previous call to dmaengine_terminate_async(), and that no new
+ * pending descriptors have been issued with dma_async_issue_pending(). The
+ * behaviour is undefined otherwise.
+ */
+static void xilinx_dpdma_synchronize(struct dma_chan *dchan)
+{
+	struct xilinx_dpdma_chan *chan = to_xilinx_chan(dchan);
+
+	xilinx_dpdma_chan_stop(chan);
+
+	vchan_synchronize(&chan->vchan);
+}
+
+/* -----------------------------------------------------------------------------
+ * Interrupt and Tasklet Handling
+ */
+
+/**
+ * xilinx_dpdma_err - Detect any global error
+ * @isr: Interrupt Status Register
+ * @eisr: Error Interrupt Status Register
+ *
+ * Return: True if any global error occurs, or false otherwise.
+ */
+static bool xilinx_dpdma_err(u32 isr, u32 eisr)
+{
+	if (isr & XILINX_DPDMA_INTR_GLOBAL_ERR ||
+	    eisr & XILINX_DPDMA_EINTR_GLOBAL_ERR)
+		return true;
+
+	return false;
+}
+
+/**
+ * xilinx_dpdma_handle_err_irq - Handle DPDMA error interrupt
+ * @xdev: DPDMA device
+ * @isr: masked Interrupt Status Register
+ * @eisr: Error Interrupt Status Register
+ *
+ * Handle if any error occurs based on @isr and @eisr. This function disables
+ * corresponding error interrupts, and those should be re-enabled once handling
+ * is done.
+ */
+static void xilinx_dpdma_handle_err_irq(struct xilinx_dpdma_device *xdev,
+					u32 isr, u32 eisr)
+{
+	bool err = xilinx_dpdma_err(isr, eisr);
+	unsigned int i;
+
+	dev_dbg_ratelimited(xdev->dev,
+			    "error irq: isr = 0x%08x, eisr = 0x%08x\n",
+			    isr, eisr);
+
+	/* Disable channel error interrupts until errors are handled. */
+	dpdma_write(xdev->reg, XILINX_DPDMA_IDS,
+		    isr & ~XILINX_DPDMA_INTR_GLOBAL_ERR);
+	dpdma_write(xdev->reg, XILINX_DPDMA_EIDS,
+		    eisr & ~XILINX_DPDMA_EINTR_GLOBAL_ERR);
+
+	for (i = 0; i < ARRAY_SIZE(xdev->chan); i++)
+		if (err || xilinx_dpdma_chan_err(xdev->chan[i], isr, eisr))
+			tasklet_schedule(&xdev->chan[i]->err_task);
+}
+
+/**
+ * xilinx_dpdma_enable_irq - Enable interrupts
+ * @xdev: DPDMA device
+ *
+ * Enable interrupts.
+ */
+static void xilinx_dpdma_enable_irq(struct xilinx_dpdma_device *xdev)
+{
+	dpdma_write(xdev->reg, XILINX_DPDMA_IEN, XILINX_DPDMA_INTR_ALL);
+	dpdma_write(xdev->reg, XILINX_DPDMA_EIEN, XILINX_DPDMA_EINTR_ALL);
+}
+
+/**
+ * xilinx_dpdma_disable_irq - Disable interrupts
+ * @xdev: DPDMA device
+ *
+ * Disable interrupts.
+ */
+static void xilinx_dpdma_disable_irq(struct xilinx_dpdma_device *xdev)
+{
+	dpdma_write(xdev->reg, XILINX_DPDMA_IDS, XILINX_DPDMA_INTR_ERR_ALL);
+	dpdma_write(xdev->reg, XILINX_DPDMA_EIDS, XILINX_DPDMA_EINTR_ALL);
+}
+
+/**
+ * xilinx_dpdma_chan_err_task - Per channel tasklet for error handling
+ * @data: tasklet data to be casted to DPDMA channel structure
+ *
+ * Per channel error handling tasklet. This function waits for the outstanding
+ * transaction to complete and triggers error handling. After error handling,
+ * re-enable channel error interrupts, and restart the channel if needed.
+ */
+static void xilinx_dpdma_chan_err_task(unsigned long data)
+{
+	struct xilinx_dpdma_chan *chan = (struct xilinx_dpdma_chan *)data;
+	struct xilinx_dpdma_device *xdev = chan->xdev;
+	unsigned long flags;
+
+	/* Proceed error handling even when polling fails. */
+	xilinx_dpdma_chan_poll_no_ostand(chan);
+
+	xilinx_dpdma_chan_handle_err(chan);
+
+	dpdma_write(xdev->reg, XILINX_DPDMA_IEN,
+		    XILINX_DPDMA_INTR_CHAN_ERR_MASK << chan->id);
+	dpdma_write(xdev->reg, XILINX_DPDMA_EIEN,
+		    XILINX_DPDMA_EINTR_CHAN_ERR_MASK << chan->id);
+
+	spin_lock_irqsave(&chan->lock, flags);
+	xilinx_dpdma_chan_queue_transfer(chan);
+	spin_unlock_irqrestore(&chan->lock, flags);
+}
+
+static irqreturn_t xilinx_dpdma_irq_handler(int irq, void *data)
+{
+	struct xilinx_dpdma_device *xdev = data;
+	unsigned long mask;
+	unsigned int i;
+	u32 status;
+	u32 error;
+
+	status = dpdma_read(xdev->reg, XILINX_DPDMA_ISR);
+	error = dpdma_read(xdev->reg, XILINX_DPDMA_EISR);
+	if (!status && !error)
+		return IRQ_NONE;
+
+	dpdma_write(xdev->reg, XILINX_DPDMA_ISR, status);
+	dpdma_write(xdev->reg, XILINX_DPDMA_EISR, error);
+
+	if (status & XILINX_DPDMA_INTR_VSYNC) {
+		/*
+		 * There's a single VSYNC interrupt that needs to be processed
+		 * by each running channel to update the active descriptor.
+		 */
+		for (i = 0; i < ARRAY_SIZE(xdev->chan); i++) {
+			struct xilinx_dpdma_chan *chan = xdev->chan[i];
+
+			if (chan)
+				xilinx_dpdma_chan_vsync_irq(chan);
+		}
+	}
+
+	mask = FIELD_GET(XILINX_DPDMA_INTR_DESC_DONE_MASK, status);
+	if (mask) {
+		for_each_set_bit(i, &mask, ARRAY_SIZE(xdev->chan))
+			xilinx_dpdma_chan_done_irq(xdev->chan[i]);
+	}
+
+	mask = FIELD_GET(XILINX_DPDMA_INTR_NO_OSTAND_MASK, status);
+	if (mask) {
+		for_each_set_bit(i, &mask, ARRAY_SIZE(xdev->chan))
+			xilinx_dpdma_chan_notify_no_ostand(xdev->chan[i]);
+	}
+
+	mask = status & XILINX_DPDMA_INTR_ERR_ALL;
+	if (mask || error)
+		xilinx_dpdma_handle_err_irq(xdev, mask, error);
+
+	return IRQ_HANDLED;
+}
+
+/* -----------------------------------------------------------------------------
+ * Initialization & Cleanup
+ */
+
+static int xilinx_dpdma_chan_init(struct xilinx_dpdma_device *xdev,
+				  unsigned int chan_id)
+{
+	struct xilinx_dpdma_chan *chan;
+
+	chan = devm_kzalloc(xdev->dev, sizeof(*chan), GFP_KERNEL);
+	if (!chan)
+		return -ENOMEM;
+
+	chan->id = chan_id;
+	chan->reg = xdev->reg + XILINX_DPDMA_CH_BASE
+		  + XILINX_DPDMA_CH_OFFSET * chan->id;
+	chan->running = false;
+	chan->xdev = xdev;
+
+	spin_lock_init(&chan->lock);
+	init_waitqueue_head(&chan->wait_to_stop);
+
+	tasklet_init(&chan->err_task, xilinx_dpdma_chan_err_task,
+		     (unsigned long)chan);
+
+	chan->vchan.desc_free = xilinx_dpdma_chan_free_tx_desc;
+	vchan_init(&chan->vchan, &xdev->common);
+
+	xdev->chan[chan->id] = chan;
+
+	return 0;
+}
+
+static void xilinx_dpdma_chan_remove(struct xilinx_dpdma_chan *chan)
+{
+	if (!chan)
+		return;
+
+	tasklet_kill(&chan->err_task);
+	list_del(&chan->vchan.chan.device_node);
+}
+
+static struct dma_chan *of_dma_xilinx_xlate(struct of_phandle_args *dma_spec,
+					    struct of_dma *ofdma)
+{
+	struct xilinx_dpdma_device *xdev = ofdma->of_dma_data;
+	uint32_t chan_id = dma_spec->args[0];
+
+	if (chan_id >= ARRAY_SIZE(xdev->chan))
+		return NULL;
+
+	if (!xdev->chan[chan_id])
+		return NULL;
+
+	return dma_get_slave_channel(&xdev->chan[chan_id]->vchan.chan);
+}
+
+static int xilinx_dpdma_probe(struct platform_device *pdev)
+{
+	struct xilinx_dpdma_device *xdev;
+	struct dma_device *ddev;
+	unsigned int i;
+	int ret;
+
+	xdev = devm_kzalloc(&pdev->dev, sizeof(*xdev), GFP_KERNEL);
+	if (!xdev)
+		return -ENOMEM;
+
+	xdev->dev = &pdev->dev;
+	xdev->ext_addr = sizeof(dma_addr_t) > 4;
+
+	INIT_LIST_HEAD(&xdev->common.channels);
+
+	platform_set_drvdata(pdev, xdev);
+
+	xdev->axi_clk = devm_clk_get(xdev->dev, "axi_clk");
+	if (IS_ERR(xdev->axi_clk))
+		return PTR_ERR(xdev->axi_clk);
+
+	xdev->reg = devm_platform_ioremap_resource(pdev, 0);
+	if (IS_ERR(xdev->reg))
+		return PTR_ERR(xdev->reg);
+
+	xdev->irq = platform_get_irq(pdev, 0);
+	if (xdev->irq < 0) {
+		dev_err(xdev->dev, "failed to get platform irq\n");
+		return xdev->irq;
+	}
+
+	ret = request_irq(xdev->irq, xilinx_dpdma_irq_handler, IRQF_SHARED,
+			  dev_name(xdev->dev), xdev);
+	if (ret) {
+		dev_err(xdev->dev, "failed to request IRQ\n");
+		return ret;
+	}
+
+	ddev = &xdev->common;
+	ddev->dev = &pdev->dev;
+
+	dma_cap_set(DMA_SLAVE, ddev->cap_mask);
+	dma_cap_set(DMA_PRIVATE, ddev->cap_mask);
+	dma_cap_set(DMA_INTERLEAVE_CYCLIC, ddev->cap_mask);
+	ddev->copy_align = fls(XILINX_DPDMA_ALIGN_BYTES - 1);
+
+	ddev->device_alloc_chan_resources = xilinx_dpdma_alloc_chan_resources;
+	ddev->device_free_chan_resources = xilinx_dpdma_free_chan_resources;
+	ddev->device_prep_interleaved_cyclic = xilinx_dpdma_prep_interleaved_cyclic;
+	/* TODO: Can we achieve better granularity ? */
+	ddev->device_tx_status = dma_cookie_status;
+	ddev->device_issue_pending = xilinx_dpdma_issue_pending;
+	ddev->device_config = xilinx_dpdma_config;
+	ddev->device_pause = xilinx_dpdma_pause;
+	ddev->device_resume = xilinx_dpdma_resume;
+	ddev->device_terminate_all = xilinx_dpdma_terminate_all;
+	ddev->device_synchronize = xilinx_dpdma_synchronize;
+	ddev->src_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_UNDEFINED);
+	ddev->directions = BIT(DMA_MEM_TO_DEV);
+	ddev->residue_granularity = DMA_RESIDUE_GRANULARITY_DESCRIPTOR;
+
+	for (i = 0; i < ARRAY_SIZE(xdev->chan); ++i) {
+		ret = xilinx_dpdma_chan_init(xdev, i);
+		if (ret < 0) {
+			dev_err(xdev->dev, "failed to initialize channel %u\n",
+				i);
+			goto error;
+		}
+	}
+
+	ret = clk_prepare_enable(xdev->axi_clk);
+	if (ret) {
+		dev_err(xdev->dev, "failed to enable the axi clock\n");
+		goto error;
+	}
+
+	ret = dma_async_device_register(ddev);
+	if (ret) {
+		dev_err(xdev->dev, "failed to register the dma device\n");
+		goto error_dma_async;
+	}
+
+	ret = of_dma_controller_register(xdev->dev->of_node,
+					 of_dma_xilinx_xlate, ddev);
+	if (ret) {
+		dev_err(xdev->dev, "failed to register DMA to DT DMA helper\n");
+		goto error_of_dma;
+	}
+
+	xilinx_dpdma_enable_irq(xdev);
+
+	dev_info(&pdev->dev, "Xilinx DPDMA engine is probed\n");
+
+	return 0;
+
+error_of_dma:
+	dma_async_device_unregister(ddev);
+error_dma_async:
+	clk_disable_unprepare(xdev->axi_clk);
+error:
+	for (i = 0; i < ARRAY_SIZE(xdev->chan); i++)
+		xilinx_dpdma_chan_remove(xdev->chan[i]);
+
+	free_irq(xdev->irq, xdev);
+
+	return ret;
+}
+
+static int xilinx_dpdma_remove(struct platform_device *pdev)
+{
+	struct xilinx_dpdma_device *xdev = platform_get_drvdata(pdev);
+	unsigned int i;
+
+	/* Start by disabling the IRQ to avoid races during cleanup. */
+	free_irq(xdev->irq, xdev);
+
+	xilinx_dpdma_disable_irq(xdev);
+	of_dma_controller_free(pdev->dev.of_node);
+	dma_async_device_unregister(&xdev->common);
+	clk_disable_unprepare(xdev->axi_clk);
+
+	for (i = 0; i < ARRAY_SIZE(xdev->chan); i++)
+		xilinx_dpdma_chan_remove(xdev->chan[i]);
+
+	return 0;
+}
+
+static const struct of_device_id xilinx_dpdma_of_match[] = {
+	{ .compatible = "xlnx,zynqmp-dpdma",},
+	{ /* end of table */ },
+};
+MODULE_DEVICE_TABLE(of, xilinx_dpdma_of_match);
+
+static struct platform_driver xilinx_dpdma_driver = {
+	.probe			= xilinx_dpdma_probe,
+	.remove			= xilinx_dpdma_remove,
+	.driver			= {
+		.name		= "xilinx-zynqmp-dpdma",
+		.of_match_table	= xilinx_dpdma_of_match,
+	},
+};
+
+module_platform_driver(xilinx_dpdma_driver);
+
+MODULE_AUTHOR("Xilinx, Inc.");
+MODULE_DESCRIPTION("Xilinx ZynqMP DPDMA driver");
+MODULE_LICENSE("GPL v2");
-- 
Regards,

Laurent Pinchart


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 5/6] dmaengine: xilinx: dpdma: Add debugfs support
  2020-01-23  2:29 [PATCH v3 0/6] dma: Add Xilinx ZynqMP DPDMA driver Laurent Pinchart
                   ` (3 preceding siblings ...)
  2020-01-23  2:29 ` [PATCH v3 4/6] dmaengine: xilinx: dpdma: Add the Xilinx DisplayPort DMA engine driver Laurent Pinchart
@ 2020-01-23  2:29 ` Laurent Pinchart
  2020-01-23  2:29 ` [PATCH v3 6/6] arm64: dts: zynqmp: Add DPDMA node Laurent Pinchart
  5 siblings, 0 replies; 46+ messages in thread
From: Laurent Pinchart @ 2020-01-23  2:29 UTC (permalink / raw)
  To: dmaengine; +Cc: Michal Simek, Hyun Kwon, Tejas Upadhyay, Satish Kumar Nagireddy

Expose statistics to debugfs when available. This helps debugging issues
with the DPDMA driver.

Signed-off-by: Hyun Kwon <hyun.kwon@xilinx.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
Changes since v2:

- Refactor debugfs code
---
 drivers/dma/xilinx/xilinx_dpdma.c | 227 ++++++++++++++++++++++++++++++
 1 file changed, 227 insertions(+)

diff --git a/drivers/dma/xilinx/xilinx_dpdma.c b/drivers/dma/xilinx/xilinx_dpdma.c
index 15ba85aa63d9..a0df729e2034 100644
--- a/drivers/dma/xilinx/xilinx_dpdma.c
+++ b/drivers/dma/xilinx/xilinx_dpdma.c
@@ -10,6 +10,7 @@
 #include <linux/bitfield.h>
 #include <linux/bits.h>
 #include <linux/clk.h>
+#include <linux/debugfs.h>
 #include <linux/delay.h>
 #include <linux/dmaengine.h>
 #include <linux/dmapool.h>
@@ -265,6 +266,228 @@ struct xilinx_dpdma_device {
 	bool ext_addr;
 };
 
+/* -----------------------------------------------------------------------------
+ * DebugFS
+ */
+
+#ifdef CONFIG_DEBUG_FS
+
+#define XILINX_DPDMA_DEBUGFS_READ_MAX_SIZE	32
+#define XILINX_DPDMA_DEBUGFS_UINT16_MAX_STR	"65535"
+
+/* Match xilinx_dpdma_testcases vs dpdma_debugfs_reqs[] entry */
+enum xilinx_dpdma_testcases {
+	DPDMA_TC_INTR_DONE,
+	DPDMA_TC_NONE
+};
+
+struct xilinx_dpdma_debugfs {
+	enum xilinx_dpdma_testcases testcase;
+	u16 xilinx_dpdma_irq_done_count;
+	unsigned int chan_id;
+};
+
+static struct xilinx_dpdma_debugfs dpdma_debugfs;
+struct xilinx_dpdma_debugfs_request {
+	const char *name;
+	enum xilinx_dpdma_testcases tc;
+	ssize_t (*read)(char *buf);
+	int (*write)(char *args);
+};
+
+static void xilinx_dpdma_debugfs_desc_done_irq(struct xilinx_dpdma_chan *chan)
+{
+	if (chan->id == dpdma_debugfs.chan_id)
+		dpdma_debugfs.xilinx_dpdma_irq_done_count++;
+}
+
+static ssize_t xilinx_dpdma_debugfs_desc_done_irq_read(char *buf)
+{
+	size_t out_str_len;
+
+	dpdma_debugfs.testcase = DPDMA_TC_NONE;
+
+	out_str_len = strlen(XILINX_DPDMA_DEBUGFS_UINT16_MAX_STR);
+	out_str_len = min_t(size_t, XILINX_DPDMA_DEBUGFS_READ_MAX_SIZE,
+			    out_str_len);
+	snprintf(buf, out_str_len, "%d",
+		 dpdma_debugfs.xilinx_dpdma_irq_done_count);
+
+	return 0;
+}
+
+static int xilinx_dpdma_debugfs_desc_done_irq_write(char *args)
+{
+	char *arg;
+	int ret;
+	u32 id;
+
+	arg = strsep(&args, " ");
+	if (!arg || strncasecmp(arg, "start", 5))
+		return -EINVAL;
+
+	arg = strsep(&args, " ");
+	if (!arg)
+		return -EINVAL;
+
+	ret = kstrtou32(arg, 0, &id);
+	if (ret < 0)
+		return ret;
+
+	if (id < ZYNQMP_DPDMA_VIDEO0 || id > ZYNQMP_DPDMA_AUDIO1)
+		return -EINVAL;
+
+	dpdma_debugfs.testcase = DPDMA_TC_INTR_DONE;
+	dpdma_debugfs.xilinx_dpdma_irq_done_count = 0;
+	dpdma_debugfs.chan_id = id;
+
+	return 0;
+}
+
+/* Match xilinx_dpdma_testcases vs dpdma_debugfs_reqs[] entry */
+struct xilinx_dpdma_debugfs_request dpdma_debugfs_reqs[] = {
+	{
+		.name = "DESCRIPTOR_DONE_INTR",
+		.tc = DPDMA_TC_INTR_DONE,
+		.read = xilinx_dpdma_debugfs_desc_done_irq_read,
+		.write = xilinx_dpdma_debugfs_desc_done_irq_write,
+	},
+};
+
+static ssize_t xilinx_dpdma_debugfs_read(struct file *f, char __user *buf,
+					 size_t size, loff_t *pos)
+{
+	enum xilinx_dpdma_testcases testcase;
+	char *kern_buff;
+	int ret;
+
+	if (*pos != 0 || size <= 0)
+		return -EINVAL;
+
+	kern_buff = kzalloc(XILINX_DPDMA_DEBUGFS_READ_MAX_SIZE, GFP_KERNEL);
+	if (!kern_buff) {
+		dpdma_debugfs.testcase = DPDMA_TC_NONE;
+		return -ENOMEM;
+	}
+
+	testcase = READ_ONCE(dpdma_debugfs.testcase);
+	if (testcase != DPDMA_TC_NONE) {
+		ret = dpdma_debugfs_reqs[testcase].read(kern_buff);
+		if (ret < 0)
+			goto done;
+	} else {
+		strlcpy(kern_buff, "No testcase executed",
+			XILINX_DPDMA_DEBUGFS_READ_MAX_SIZE);
+	}
+
+	size = min(size, strlen(kern_buff));
+	ret = copy_to_user(buf, kern_buff, size);
+
+done:
+	kfree(kern_buff);
+	if (ret)
+		return ret;
+
+	*pos = size + 1;
+	return size;
+}
+
+static ssize_t xilinx_dpdma_debugfs_write(struct file *f,
+					  const char __user *buf, size_t size,
+					  loff_t *pos)
+{
+	char *kern_buff, *kern_buff_start;
+	char *testcase;
+	unsigned int i;
+	int ret;
+
+	if (*pos != 0 || size <= 0)
+		return -EINVAL;
+
+	/* Supporting single instance of test as of now. */
+	if (dpdma_debugfs.testcase != DPDMA_TC_NONE)
+		return -EBUSY;
+
+	kern_buff = kzalloc(size, GFP_KERNEL);
+	if (!kern_buff)
+		return -ENOMEM;
+	kern_buff_start = kern_buff;
+
+	ret = strncpy_from_user(kern_buff, buf, size);
+	if (ret < 0)
+		goto done;
+
+	/* Read the testcase name from a user request. */
+	testcase = strsep(&kern_buff, " ");
+
+	for (i = 0; i < ARRAY_SIZE(dpdma_debugfs_reqs); i++) {
+		if (!strcasecmp(testcase, dpdma_debugfs_reqs[i].name))
+			break;
+	}
+
+	if (i == ARRAY_SIZE(dpdma_debugfs_reqs)) {
+		ret = -EINVAL;
+		goto done;
+	}
+
+	ret = dpdma_debugfs_reqs[i].write(kern_buff);
+	if (ret < 0)
+		goto done;
+
+	ret = size;
+
+done:
+	kfree(kern_buff_start);
+	return ret;
+}
+
+static const struct file_operations fops_xilinx_dpdma_dbgfs = {
+	.owner = THIS_MODULE,
+	.read = xilinx_dpdma_debugfs_read,
+	.write = xilinx_dpdma_debugfs_write,
+};
+
+static int xilinx_dpdma_debugfs_init(struct device *dev)
+{
+	int err;
+	struct dentry *xilinx_dpdma_debugfs_dir, *xilinx_dpdma_debugfs_file;
+
+	dpdma_debugfs.testcase = DPDMA_TC_NONE;
+
+	xilinx_dpdma_debugfs_dir = debugfs_create_dir("dpdma", NULL);
+	if (!xilinx_dpdma_debugfs_dir) {
+		dev_err(dev, "debugfs_create_dir failed\n");
+		return -ENODEV;
+	}
+
+	xilinx_dpdma_debugfs_file =
+		debugfs_create_file("testcase", 0444,
+				    xilinx_dpdma_debugfs_dir, NULL,
+				    &fops_xilinx_dpdma_dbgfs);
+	if (!xilinx_dpdma_debugfs_file) {
+		dev_err(dev, "debugfs_create_file testcase failed\n");
+		err = -ENODEV;
+		goto err_dbgfs;
+	}
+	return 0;
+
+err_dbgfs:
+	debugfs_remove_recursive(xilinx_dpdma_debugfs_dir);
+	xilinx_dpdma_debugfs_dir = NULL;
+	return err;
+}
+
+#else
+static int xilinx_dpdma_debugfs_init(struct device *dev)
+{
+	return 0;
+}
+
+static void xilinx_dpdma_debugfs_desc_done_irq(struct xilinx_dpdma_chan *chan)
+{
+}
+#endif /* CONFIG_DEBUG_FS */
+
 /* -----------------------------------------------------------------------------
  * I/O Accessors
  */
@@ -840,6 +1063,8 @@ static void xilinx_dpdma_chan_done_irq(struct xilinx_dpdma_chan *chan)
 
 	spin_lock_irqsave(&chan->lock, flags);
 
+	xilinx_dpdma_debugfs_desc_done_irq(chan);
+
 	if (active)
 		vchan_cyclic_callback(&active->vdesc);
 	else
@@ -1469,6 +1694,8 @@ static int xilinx_dpdma_probe(struct platform_device *pdev)
 
 	xilinx_dpdma_enable_irq(xdev);
 
+	xilinx_dpdma_debugfs_init(&pdev->dev);
+
 	dev_info(&pdev->dev, "Xilinx DPDMA engine is probed\n");
 
 	return 0;
-- 
Regards,

Laurent Pinchart


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 6/6] arm64: dts: zynqmp: Add DPDMA node
  2020-01-23  2:29 [PATCH v3 0/6] dma: Add Xilinx ZynqMP DPDMA driver Laurent Pinchart
                   ` (4 preceding siblings ...)
  2020-01-23  2:29 ` [PATCH v3 5/6] dmaengine: xilinx: dpdma: Add debugfs support Laurent Pinchart
@ 2020-01-23  2:29 ` Laurent Pinchart
  5 siblings, 0 replies; 46+ messages in thread
From: Laurent Pinchart @ 2020-01-23  2:29 UTC (permalink / raw)
  To: dmaengine; +Cc: Michal Simek, Hyun Kwon, Tejas Upadhyay, Satish Kumar Nagireddy

Add a DT node for the DisplayPort DMA engine (DPDMA).

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
 arch/arm64/boot/dts/xilinx/zynqmp-clk.dtsi |  4 ++++
 arch/arm64/boot/dts/xilinx/zynqmp.dtsi     | 10 ++++++++++
 2 files changed, 14 insertions(+)

diff --git a/arch/arm64/boot/dts/xilinx/zynqmp-clk.dtsi b/arch/arm64/boot/dts/xilinx/zynqmp-clk.dtsi
index 306ad2157c98..2936e5f97f84 100644
--- a/arch/arm64/boot/dts/xilinx/zynqmp-clk.dtsi
+++ b/arch/arm64/boot/dts/xilinx/zynqmp-clk.dtsi
@@ -80,6 +80,10 @@ &can1 {
 	clocks = <&clk100 &clk100>;
 };
 
+&dpdma {
+	clocks = <&dpdma_clk>;
+};
+
 &fpd_dma_chan1 {
 	clocks = <&clk600>, <&clk100>;
 };
diff --git a/arch/arm64/boot/dts/xilinx/zynqmp.dtsi b/arch/arm64/boot/dts/xilinx/zynqmp.dtsi
index 3c731e73903a..7e986461fd57 100644
--- a/arch/arm64/boot/dts/xilinx/zynqmp.dtsi
+++ b/arch/arm64/boot/dts/xilinx/zynqmp.dtsi
@@ -219,6 +219,16 @@ pmu@9000 {
 			};
 		};
 
+		dpdma: dma-controller@fd4c0000 {
+			compatible = "xlnx,zynqmp-dpdma";
+			status = "disabled";
+			reg = <0x0 0xfd4c0000 0x0 0x1000>;
+			interrupts = <0 122 4>;
+			interrupt-parent = <&gic>;
+			clock-names = "axi_clk";
+			#dma-cells = <1>;
+		};
+
 		/* GDMA */
 		fpd_dma_chan1: dma@fd500000 {
 			status = "disabled";
-- 
Regards,

Laurent Pinchart


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-01-23  2:29 ` [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type Laurent Pinchart
@ 2020-01-23  8:03   ` Peter Ujfalusi
  2020-01-23  8:43     ` Vinod Koul
  0 siblings, 1 reply; 46+ messages in thread
From: Peter Ujfalusi @ 2020-01-23  8:03 UTC (permalink / raw)
  To: Laurent Pinchart, dmaengine
  Cc: Michal Simek, Hyun Kwon, Tejas Upadhyay, Satish Kumar Nagireddy,
	Vinod Koul

Hi Laurent,

On 23/01/2020 4.29, Laurent Pinchart wrote:
> The new interleaved cyclic transaction type combines interleaved and
> cycle transactions. It is designed for DMA engines that back display
> controllers, where the same 2D frame needs to be output to the display
> until a new frame is available.
> 
> Suggested-by: Vinod Koul <vkoul@kernel.org>
> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> ---
>  drivers/dma/dmaengine.c   |  8 +++++++-
>  include/linux/dmaengine.h | 18 ++++++++++++++++++
>  2 files changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
> index 03ac4b96117c..4ffb98a47f31 100644
> --- a/drivers/dma/dmaengine.c
> +++ b/drivers/dma/dmaengine.c
> @@ -981,7 +981,13 @@ int dma_async_device_register(struct dma_device *device)
>  			"DMA_INTERLEAVE");
>  		return -EIO;
>  	}
> -
> +	if (dma_has_cap(DMA_INTERLEAVE_CYCLIC, device->cap_mask) &&
> +	    !device->device_prep_interleaved_cyclic) {
> +		dev_err(device->dev,
> +			"Device claims capability %s, but op is not defined\n",
> +			"DMA_INTERLEAVE_CYCLIC");
> +		return -EIO;
> +	}
>  
>  	if (!device->device_tx_status) {
>  		dev_err(device->dev, "Device tx_status is not defined\n");
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index 8fcdee1c0cf9..e9af3bf835cb 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -61,6 +61,7 @@ enum dma_transaction_type {
>  	DMA_SLAVE,
>  	DMA_CYCLIC,
>  	DMA_INTERLEAVE,
> +	DMA_INTERLEAVE_CYCLIC,
>  /* last transaction type for creation of the capabilities mask */
>  	DMA_TX_TYPE_END,
>  };
> @@ -701,6 +702,10 @@ struct dma_filter {
>   *	The function takes a buffer of size buf_len. The callback function will
>   *	be called after period_len bytes have been transferred.
>   * @device_prep_interleaved_dma: Transfer expression in a generic way.
> + * @device_prep_interleaved_cyclic: prepares an interleaved cyclic transfer.
> + *	This is similar to @device_prep_interleaved_dma, but the transfer is
> + *	repeated until a new transfer is issued. This transfer type is meant
> + *	for display.

I think capture (camera) is another potential beneficiary of this.

So you don't need to terminate the running interleaved_cyclic and start
a new one, but prepare and issue a new one, which would
terminate/replace the currently running cyclic interleaved DMA?

Can you also update the documentation at
Documentation/driver-api/dmaengine/client.rst

One more thing might be good to clarify for the interleaved_cyclic:
What is expected when DMA_PREP_INTERRUPT is set in the flags? The
client's callback is called for each completion of
dma_interleaved_template, right?

- Péter

>   * @device_prep_dma_imm_data: DMA's 8 byte immediate data to the dst address
>   * @device_config: Pushes a new configuration to a channel, return 0 or an error
>   *	code
> @@ -785,6 +790,9 @@ struct dma_device {
>  	struct dma_async_tx_descriptor *(*device_prep_interleaved_dma)(
>  		struct dma_chan *chan, struct dma_interleaved_template *xt,
>  		unsigned long flags);
> +	struct dma_async_tx_descriptor *(*device_prep_interleaved_cyclic)(
> +		struct dma_chan *chan, struct dma_interleaved_template *xt,
> +		unsigned long flags);
>  	struct dma_async_tx_descriptor *(*device_prep_dma_imm_data)(
>  		struct dma_chan *chan, dma_addr_t dst, u64 data,
>  		unsigned long flags);
> @@ -880,6 +888,16 @@ static inline struct dma_async_tx_descriptor *dmaengine_prep_interleaved_dma(
>  	return chan->device->device_prep_interleaved_dma(chan, xt, flags);
>  }
>  
> +static inline struct dma_async_tx_descriptor *dmaengine_prep_interleaved_cyclic(
> +		struct dma_chan *chan, struct dma_interleaved_template *xt,
> +		unsigned long flags)
> +{
> +	if (!chan || !chan->device || !chan->device->device_prep_interleaved_cyclic)
> +		return NULL;
> +
> +	return chan->device->device_prep_interleaved_cyclic(chan, xt, flags);
> +}
> +
>  static inline struct dma_async_tx_descriptor *dmaengine_prep_dma_memset(
>  		struct dma_chan *chan, dma_addr_t dest, int value, size_t len,
>  		unsigned long flags)
> 

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-01-23  8:03   ` Peter Ujfalusi
@ 2020-01-23  8:43     ` Vinod Koul
  2020-01-23  8:51       ` Peter Ujfalusi
  0 siblings, 1 reply; 46+ messages in thread
From: Vinod Koul @ 2020-01-23  8:43 UTC (permalink / raw)
  To: Peter Ujfalusi
  Cc: Laurent Pinchart, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

On 23-01-20, 10:03, Peter Ujfalusi wrote:
> Hi Laurent,
> 
> On 23/01/2020 4.29, Laurent Pinchart wrote:
> > The new interleaved cyclic transaction type combines interleaved and
> > cycle transactions. It is designed for DMA engines that back display
> > controllers, where the same 2D frame needs to be output to the display
> > until a new frame is available.
> > 
> > Suggested-by: Vinod Koul <vkoul@kernel.org>
> > Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > ---
> >  drivers/dma/dmaengine.c   |  8 +++++++-
> >  include/linux/dmaengine.h | 18 ++++++++++++++++++
> >  2 files changed, 25 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
> > index 03ac4b96117c..4ffb98a47f31 100644
> > --- a/drivers/dma/dmaengine.c
> > +++ b/drivers/dma/dmaengine.c
> > @@ -981,7 +981,13 @@ int dma_async_device_register(struct dma_device *device)
> >  			"DMA_INTERLEAVE");
> >  		return -EIO;
> >  	}
> > -
> > +	if (dma_has_cap(DMA_INTERLEAVE_CYCLIC, device->cap_mask) &&
> > +	    !device->device_prep_interleaved_cyclic) {
> > +		dev_err(device->dev,
> > +			"Device claims capability %s, but op is not defined\n",
> > +			"DMA_INTERLEAVE_CYCLIC");
> > +		return -EIO;
> > +	}
> >  
> >  	if (!device->device_tx_status) {
> >  		dev_err(device->dev, "Device tx_status is not defined\n");
> > diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> > index 8fcdee1c0cf9..e9af3bf835cb 100644
> > --- a/include/linux/dmaengine.h
> > +++ b/include/linux/dmaengine.h
> > @@ -61,6 +61,7 @@ enum dma_transaction_type {
> >  	DMA_SLAVE,
> >  	DMA_CYCLIC,
> >  	DMA_INTERLEAVE,
> > +	DMA_INTERLEAVE_CYCLIC,
> >  /* last transaction type for creation of the capabilities mask */
> >  	DMA_TX_TYPE_END,
> >  };
> > @@ -701,6 +702,10 @@ struct dma_filter {
> >   *	The function takes a buffer of size buf_len. The callback function will
> >   *	be called after period_len bytes have been transferred.
> >   * @device_prep_interleaved_dma: Transfer expression in a generic way.
> > + * @device_prep_interleaved_cyclic: prepares an interleaved cyclic transfer.
> > + *	This is similar to @device_prep_interleaved_dma, but the transfer is
> > + *	repeated until a new transfer is issued. This transfer type is meant
> > + *	for display.
> 
> I think capture (camera) is another potential beneficiary of this.
> 
> So you don't need to terminate the running interleaved_cyclic and start
> a new one, but prepare and issue a new one, which would
> terminate/replace the currently running cyclic interleaved DMA?

Why not explicitly terminate the transfer and start when a new one is
issued. That can be common usage for audio and display..

-- 
~Vinod

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-01-23  8:43     ` Vinod Koul
@ 2020-01-23  8:51       ` Peter Ujfalusi
  2020-01-23 12:23         ` Laurent Pinchart
  0 siblings, 1 reply; 46+ messages in thread
From: Peter Ujfalusi @ 2020-01-23  8:51 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Laurent Pinchart, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Vinod,

On 23/01/2020 10.43, Vinod Koul wrote:
> On 23-01-20, 10:03, Peter Ujfalusi wrote:
>> Hi Laurent,
>>
>> On 23/01/2020 4.29, Laurent Pinchart wrote:
>>> The new interleaved cyclic transaction type combines interleaved and
>>> cycle transactions. It is designed for DMA engines that back display
>>> controllers, where the same 2D frame needs to be output to the display
>>> until a new frame is available.
>>>
>>> Suggested-by: Vinod Koul <vkoul@kernel.org>
>>> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
>>> ---
>>>  drivers/dma/dmaengine.c   |  8 +++++++-
>>>  include/linux/dmaengine.h | 18 ++++++++++++++++++
>>>  2 files changed, 25 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
>>> index 03ac4b96117c..4ffb98a47f31 100644
>>> --- a/drivers/dma/dmaengine.c
>>> +++ b/drivers/dma/dmaengine.c
>>> @@ -981,7 +981,13 @@ int dma_async_device_register(struct dma_device *device)
>>>  			"DMA_INTERLEAVE");
>>>  		return -EIO;
>>>  	}
>>> -
>>> +	if (dma_has_cap(DMA_INTERLEAVE_CYCLIC, device->cap_mask) &&
>>> +	    !device->device_prep_interleaved_cyclic) {
>>> +		dev_err(device->dev,
>>> +			"Device claims capability %s, but op is not defined\n",
>>> +			"DMA_INTERLEAVE_CYCLIC");
>>> +		return -EIO;
>>> +	}
>>>  
>>>  	if (!device->device_tx_status) {
>>>  		dev_err(device->dev, "Device tx_status is not defined\n");
>>> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
>>> index 8fcdee1c0cf9..e9af3bf835cb 100644
>>> --- a/include/linux/dmaengine.h
>>> +++ b/include/linux/dmaengine.h
>>> @@ -61,6 +61,7 @@ enum dma_transaction_type {
>>>  	DMA_SLAVE,
>>>  	DMA_CYCLIC,
>>>  	DMA_INTERLEAVE,
>>> +	DMA_INTERLEAVE_CYCLIC,
>>>  /* last transaction type for creation of the capabilities mask */
>>>  	DMA_TX_TYPE_END,
>>>  };
>>> @@ -701,6 +702,10 @@ struct dma_filter {
>>>   *	The function takes a buffer of size buf_len. The callback function will
>>>   *	be called after period_len bytes have been transferred.
>>>   * @device_prep_interleaved_dma: Transfer expression in a generic way.
>>> + * @device_prep_interleaved_cyclic: prepares an interleaved cyclic transfer.
>>> + *	This is similar to @device_prep_interleaved_dma, but the transfer is
>>> + *	repeated until a new transfer is issued. This transfer type is meant
>>> + *	for display.
>>
>> I think capture (camera) is another potential beneficiary of this.
>>
>> So you don't need to terminate the running interleaved_cyclic and start
>> a new one, but prepare and issue a new one, which would
>> terminate/replace the currently running cyclic interleaved DMA?
> 
> Why not explicitly terminate the transfer and start when a new one is
> issued. That can be common usage for audio and display..

Yes, this is what I'm asking. The cyclic transfer is running and in
order to start the new transfer, the previous should stop. But in cyclic
case it is not going to happen unless it is terminated.

When one would want to have different interleaved transfer the display
(or capture )IP needs to be reconfigured as well. The the would need to
be terminated anyways to avoid interpreting data in a wrong way.

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-01-23  8:51       ` Peter Ujfalusi
@ 2020-01-23 12:23         ` Laurent Pinchart
  2020-01-24  6:10           ` Vinod Koul
  2020-01-24  7:20           ` Peter Ujfalusi
  0 siblings, 2 replies; 46+ messages in thread
From: Laurent Pinchart @ 2020-01-23 12:23 UTC (permalink / raw)
  To: Peter Ujfalusi
  Cc: Vinod Koul, dmaengine, Michal Simek, Hyun Kwon, Tejas Upadhyay,
	Satish Kumar Nagireddy

Hello,

On Thu, Jan 23, 2020 at 10:51:42AM +0200, Peter Ujfalusi wrote:
> On 23/01/2020 10.43, Vinod Koul wrote:
> > On 23-01-20, 10:03, Peter Ujfalusi wrote:
> >> On 23/01/2020 4.29, Laurent Pinchart wrote:
> >>> The new interleaved cyclic transaction type combines interleaved and
> >>> cycle transactions. It is designed for DMA engines that back display
> >>> controllers, where the same 2D frame needs to be output to the display
> >>> until a new frame is available.
> >>>
> >>> Suggested-by: Vinod Koul <vkoul@kernel.org>
> >>> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> >>> ---
> >>>  drivers/dma/dmaengine.c   |  8 +++++++-
> >>>  include/linux/dmaengine.h | 18 ++++++++++++++++++
> >>>  2 files changed, 25 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
> >>> index 03ac4b96117c..4ffb98a47f31 100644
> >>> --- a/drivers/dma/dmaengine.c
> >>> +++ b/drivers/dma/dmaengine.c
> >>> @@ -981,7 +981,13 @@ int dma_async_device_register(struct dma_device *device)
> >>>  			"DMA_INTERLEAVE");
> >>>  		return -EIO;
> >>>  	}
> >>> -
> >>> +	if (dma_has_cap(DMA_INTERLEAVE_CYCLIC, device->cap_mask) &&
> >>> +	    !device->device_prep_interleaved_cyclic) {
> >>> +		dev_err(device->dev,
> >>> +			"Device claims capability %s, but op is not defined\n",
> >>> +			"DMA_INTERLEAVE_CYCLIC");
> >>> +		return -EIO;
> >>> +	}
> >>>  
> >>>  	if (!device->device_tx_status) {
> >>>  		dev_err(device->dev, "Device tx_status is not defined\n");
> >>> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> >>> index 8fcdee1c0cf9..e9af3bf835cb 100644
> >>> --- a/include/linux/dmaengine.h
> >>> +++ b/include/linux/dmaengine.h
> >>> @@ -61,6 +61,7 @@ enum dma_transaction_type {
> >>>  	DMA_SLAVE,
> >>>  	DMA_CYCLIC,
> >>>  	DMA_INTERLEAVE,
> >>> +	DMA_INTERLEAVE_CYCLIC,
> >>>  /* last transaction type for creation of the capabilities mask */
> >>>  	DMA_TX_TYPE_END,
> >>>  };
> >>> @@ -701,6 +702,10 @@ struct dma_filter {
> >>>   *	The function takes a buffer of size buf_len. The callback function will
> >>>   *	be called after period_len bytes have been transferred.
> >>>   * @device_prep_interleaved_dma: Transfer expression in a generic way.
> >>> + * @device_prep_interleaved_cyclic: prepares an interleaved cyclic transfer.
> >>> + *	This is similar to @device_prep_interleaved_dma, but the transfer is
> >>> + *	repeated until a new transfer is issued. This transfer type is meant
> >>> + *	for display.
> >>
> >> I think capture (camera) is another potential beneficiary of this.

Possibly, although in the camera case I'd rather have the hardware stop
if there's no more buffer. Requiring a buffer to always be present is
annoying from a userspace point of view. For display it's different, if
userspace doesn't submit a new frame, the same frame should keep being
displayed on the screen.

> >> So you don't need to terminate the running interleaved_cyclic and start
> >> a new one, but prepare and issue a new one, which would
> >> terminate/replace the currently running cyclic interleaved DMA?

Correct.

> > Why not explicitly terminate the transfer and start when a new one is
> > issued. That can be common usage for audio and display..
> 
> Yes, this is what I'm asking. The cyclic transfer is running and in
> order to start the new transfer, the previous should stop. But in cyclic
> case it is not going to happen unless it is terminated.
> 
> When one would want to have different interleaved transfer the display
> (or capture )IP needs to be reconfigured as well. The the would need to
> be terminated anyways to avoid interpreting data in a wrong way.

The use case here is not to switch to a new configuration, but to switch
to a new buffer. If the transfer had to be terminated manually first,
the DMA engine would potentially miss a frame, which is not acceptable.
We need an atomic way to switch to the next transfer.

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-01-23 12:23         ` Laurent Pinchart
@ 2020-01-24  6:10           ` Vinod Koul
  2020-01-24  8:50             ` Laurent Pinchart
  2020-01-24  7:20           ` Peter Ujfalusi
  1 sibling, 1 reply; 46+ messages in thread
From: Vinod Koul @ 2020-01-24  6:10 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Laurent,

On 23-01-20, 14:23, Laurent Pinchart wrote:
> > >>> @@ -701,6 +702,10 @@ struct dma_filter {
> > >>>   *	The function takes a buffer of size buf_len. The callback function will
> > >>>   *	be called after period_len bytes have been transferred.
> > >>>   * @device_prep_interleaved_dma: Transfer expression in a generic way.
> > >>> + * @device_prep_interleaved_cyclic: prepares an interleaved cyclic transfer.
> > >>> + *	This is similar to @device_prep_interleaved_dma, but the transfer is
> > >>> + *	repeated until a new transfer is issued. This transfer type is meant
> > >>> + *	for display.
> > >>
> > >> I think capture (camera) is another potential beneficiary of this.
> 
> Possibly, although in the camera case I'd rather have the hardware stop
> if there's no more buffer. Requiring a buffer to always be present is
> annoying from a userspace point of view. For display it's different, if
> userspace doesn't submit a new frame, the same frame should keep being
> displayed on the screen.
> 
> > >> So you don't need to terminate the running interleaved_cyclic and start
> > >> a new one, but prepare and issue a new one, which would
> > >> terminate/replace the currently running cyclic interleaved DMA?
> 
> Correct.
> 
> > > Why not explicitly terminate the transfer and start when a new one is
> > > issued. That can be common usage for audio and display..
> > 
> > Yes, this is what I'm asking. The cyclic transfer is running and in
> > order to start the new transfer, the previous should stop. But in cyclic
> > case it is not going to happen unless it is terminated.
> > 
> > When one would want to have different interleaved transfer the display
> > (or capture )IP needs to be reconfigured as well. The the would need to
> > be terminated anyways to avoid interpreting data in a wrong way.
> 
> The use case here is not to switch to a new configuration, but to switch
> to a new buffer. If the transfer had to be terminated manually first,
> the DMA engine would potentially miss a frame, which is not acceptable.
> We need an atomic way to switch to the next transfer.

So in this case you have, let's say a cyclic descriptor with N buffers
and they are cyclically capturing data and providing to client/user..

So why would you like to submit again...? Once whole capture has
completed you would terminate, right...

Sorry not able to wrap my head around why new submission is required and
if that is the case why previous one cant be terminated :)

-- 
~Vinod

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-01-23 12:23         ` Laurent Pinchart
  2020-01-24  6:10           ` Vinod Koul
@ 2020-01-24  7:20           ` Peter Ujfalusi
  2020-01-24  7:38             ` Peter Ujfalusi
  2020-01-24  8:56             ` Laurent Pinchart
  1 sibling, 2 replies; 46+ messages in thread
From: Peter Ujfalusi @ 2020-01-24  7:20 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Vinod Koul, dmaengine, Michal Simek, Hyun Kwon, Tejas Upadhyay,
	Satish Kumar Nagireddy

Hi Laurent,

On 23/01/2020 14.23, Laurent Pinchart wrote:
>>>> I think capture (camera) is another potential beneficiary of this.
> 
> Possibly, although in the camera case I'd rather have the hardware stop
> if there's no more buffer. Requiring a buffer to always be present is
> annoying from a userspace point of view. For display it's different, if
> userspace doesn't submit a new frame, the same frame should keep being
> displayed on the screen.
> 
>>>> So you don't need to terminate the running interleaved_cyclic and start
>>>> a new one, but prepare and issue a new one, which would
>>>> terminate/replace the currently running cyclic interleaved DMA?
> 
> Correct.
> 
>>> Why not explicitly terminate the transfer and start when a new one is
>>> issued. That can be common usage for audio and display..
>>
>> Yes, this is what I'm asking. The cyclic transfer is running and in
>> order to start the new transfer, the previous should stop. But in cyclic
>> case it is not going to happen unless it is terminated.
>>
>> When one would want to have different interleaved transfer the display
>> (or capture )IP needs to be reconfigured as well. The the would need to
>> be terminated anyways to avoid interpreting data in a wrong way.
> 
> The use case here is not to switch to a new configuration, but to switch
> to a new buffer. If the transfer had to be terminated manually first,
> the DMA engine would potentially miss a frame, which is not acceptable.
> We need an atomic way to switch to the next transfer.

You have a special hardware in hand, most DMAs can not just replace a
cyclic transfer in-flight and it also kind of violates the DMAengine
principles.
If cyclic transfer is started then it is expected to run forever until
it is terminated. Preparing and issuing a new transfer will not get
executed when there is already a cyclic transfer in flight as your only
option is to terminate_all, which will kill the running cyclic _and_
will discard the issued and pending transfers.

So the use case is page flip when you have multiple framebuffers and you
switch them to show the updated one, right?

There are things missing in DMAengine in API level for sure to do this,
imho.
The issue is that cyclic transfers will never complete, they run until
terminated, but you want to replace the currently executing one with a
another cyclic transfer without actually terminating the other.

It is like pause the 1st cyclic and continue with the 2nd one. Then at
some point you pause the 2nd one and restart the 1st one.
It is also crucial that the pause /switch happens when the executing one
finished the interleaved round and not in the middle somewhere, right?

If you:
desc_1 = dmaengine_prep_interleaved_cyclic(chan, );
cookie_1 = dmaengine_submit(desc_1);
desc_2 = dmaengine_prep_interleaved_cyclic(chan, );
cookie_2 = dmaengine_submit(desc_1);

/* cookie_1/desc_1 is started */
dma_async_issue_pending(chan);

/* When need to switch to cookie_2 */
dmaengine_cyclic_set_active_cookie(chan, cookie_2);
/*
 * cookie_1 execution is suspended after it finished the running
 * dma_interleaved_template or buffer in normal cyclic and cookie_2
 * is replacing it.
 */

/* Switch back to cookie_1 */
dmaengine_cyclic_set_active_cookie(chan, cookie_1);
/*
 * cookie_2 execution is suspended after it finished the running
 * dma_interleaved_template or buffer in normal cyclic and cookie_1
 * is replacing it.
 */

There should be a (yet another) capabilities flag got
cyclic_set_active_cookie and the documentation should be strict on what
is the expected behavior.

You can kill everything with terminate_all.
There is another thing which is missing imho from DMAengine: to
terminate a specific cookie, not the entire channel, which might be a
good addition as you might spawn framebuffers and then delete them and
you might want to release the corresponding cookie/descriptor as well.

What do you think?

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-01-24  7:20           ` Peter Ujfalusi
@ 2020-01-24  7:38             ` Peter Ujfalusi
  2020-01-24  8:58               ` Laurent Pinchart
  2020-01-24  8:56             ` Laurent Pinchart
  1 sibling, 1 reply; 46+ messages in thread
From: Peter Ujfalusi @ 2020-01-24  7:38 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Vinod Koul, dmaengine, Michal Simek, Hyun Kwon, Tejas Upadhyay,
	Satish Kumar Nagireddy



On 24/01/2020 9.20, Peter Ujfalusi wrote:
> Hi Laurent,
> 
> On 23/01/2020 14.23, Laurent Pinchart wrote:
>>>>> I think capture (camera) is another potential beneficiary of this.
>>
>> Possibly, although in the camera case I'd rather have the hardware stop
>> if there's no more buffer. Requiring a buffer to always be present is
>> annoying from a userspace point of view. For display it's different, if
>> userspace doesn't submit a new frame, the same frame should keep being
>> displayed on the screen.
>>
>>>>> So you don't need to terminate the running interleaved_cyclic and start
>>>>> a new one, but prepare and issue a new one, which would
>>>>> terminate/replace the currently running cyclic interleaved DMA?
>>
>> Correct.
>>
>>>> Why not explicitly terminate the transfer and start when a new one is
>>>> issued. That can be common usage for audio and display..
>>>
>>> Yes, this is what I'm asking. The cyclic transfer is running and in
>>> order to start the new transfer, the previous should stop. But in cyclic
>>> case it is not going to happen unless it is terminated.
>>>
>>> When one would want to have different interleaved transfer the display
>>> (or capture )IP needs to be reconfigured as well. The the would need to
>>> be terminated anyways to avoid interpreting data in a wrong way.
>>
>> The use case here is not to switch to a new configuration, but to switch
>> to a new buffer. If the transfer had to be terminated manually first,
>> the DMA engine would potentially miss a frame, which is not acceptable.
>> We need an atomic way to switch to the next transfer.
> 
> You have a special hardware in hand, most DMAs can not just replace a
> cyclic transfer in-flight and it also kind of violates the DMAengine
> principles.

Is there any specific reason why you need DMAengine driver for a display
DMA? Usually the drm drivers handle their DMA internally.

> If cyclic transfer is started then it is expected to run forever until
> it is terminated. Preparing and issuing a new transfer will not get
> executed when there is already a cyclic transfer in flight as your only
> option is to terminate_all, which will kill the running cyclic _and_
> will discard the issued and pending transfers.
> 
> So the use case is page flip when you have multiple framebuffers and you
> switch them to show the updated one, right?
> 
> There are things missing in DMAengine in API level for sure to do this,
> imho.
> The issue is that cyclic transfers will never complete, they run until
> terminated, but you want to replace the currently executing one with a
> another cyclic transfer without actually terminating the other.
> 
> It is like pause the 1st cyclic and continue with the 2nd one. Then at
> some point you pause the 2nd one and restart the 1st one.
> It is also crucial that the pause /switch happens when the executing one
> finished the interleaved round and not in the middle somewhere, right?
> 
> If you:
> desc_1 = dmaengine_prep_interleaved_cyclic(chan, );
> cookie_1 = dmaengine_submit(desc_1);
> desc_2 = dmaengine_prep_interleaved_cyclic(chan, );
> cookie_2 = dmaengine_submit(desc_1);
> 
> /* cookie_1/desc_1 is started */
> dma_async_issue_pending(chan);
> 
> /* When need to switch to cookie_2 */
> dmaengine_cyclic_set_active_cookie(chan, cookie_2);
> /*
>  * cookie_1 execution is suspended after it finished the running
>  * dma_interleaved_template or buffer in normal cyclic and cookie_2
>  * is replacing it.
>  */
> 
> /* Switch back to cookie_1 */
> dmaengine_cyclic_set_active_cookie(chan, cookie_1);
> /*
>  * cookie_2 execution is suspended after it finished the running
>  * dma_interleaved_template or buffer in normal cyclic and cookie_1
>  * is replacing it.
>  */
> 
> There should be a (yet another) capabilities flag got
> cyclic_set_active_cookie and the documentation should be strict on what
> is the expected behavior.
> 
> You can kill everything with terminate_all.
> There is another thing which is missing imho from DMAengine: to
> terminate a specific cookie, not the entire channel, which might be a
> good addition as you might spawn framebuffers and then delete them and
> you might want to release the corresponding cookie/descriptor as well.

This is a bit trickier as DMAengine's cookie is s32 and internally
treated as a running number and cookie status is checked against s32
numbers with < >, I think this will not like when someone kills a cookie
in the middle.

> 
> What do you think?
> 
> - Péter
> 
> Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
> Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
> 

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-01-24  6:10           ` Vinod Koul
@ 2020-01-24  8:50             ` Laurent Pinchart
  2020-02-10 14:06               ` Laurent Pinchart
  0 siblings, 1 reply; 46+ messages in thread
From: Laurent Pinchart @ 2020-01-24  8:50 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

On Fri, Jan 24, 2020 at 11:40:47AM +0530, Vinod Koul wrote:
> On 23-01-20, 14:23, Laurent Pinchart wrote:
> > > >>> @@ -701,6 +702,10 @@ struct dma_filter {
> > > >>>   *	The function takes a buffer of size buf_len. The callback function will
> > > >>>   *	be called after period_len bytes have been transferred.
> > > >>>   * @device_prep_interleaved_dma: Transfer expression in a generic way.
> > > >>> + * @device_prep_interleaved_cyclic: prepares an interleaved cyclic transfer.
> > > >>> + *	This is similar to @device_prep_interleaved_dma, but the transfer is
> > > >>> + *	repeated until a new transfer is issued. This transfer type is meant
> > > >>> + *	for display.
> > > >>
> > > >> I think capture (camera) is another potential beneficiary of this.
> > 
> > Possibly, although in the camera case I'd rather have the hardware stop
> > if there's no more buffer. Requiring a buffer to always be present is
> > annoying from a userspace point of view. For display it's different, if
> > userspace doesn't submit a new frame, the same frame should keep being
> > displayed on the screen.
> > 
> > > >> So you don't need to terminate the running interleaved_cyclic and start
> > > >> a new one, but prepare and issue a new one, which would
> > > >> terminate/replace the currently running cyclic interleaved DMA?
> > 
> > Correct.
> > 
> > > > Why not explicitly terminate the transfer and start when a new one is
> > > > issued. That can be common usage for audio and display..
> > > 
> > > Yes, this is what I'm asking. The cyclic transfer is running and in
> > > order to start the new transfer, the previous should stop. But in cyclic
> > > case it is not going to happen unless it is terminated.
> > > 
> > > When one would want to have different interleaved transfer the display
> > > (or capture )IP needs to be reconfigured as well. The the would need to
> > > be terminated anyways to avoid interpreting data in a wrong way.
> > 
> > The use case here is not to switch to a new configuration, but to switch
> > to a new buffer. If the transfer had to be terminated manually first,
> > the DMA engine would potentially miss a frame, which is not acceptable.
> > We need an atomic way to switch to the next transfer.
> 
> So in this case you have, let's say a cyclic descriptor with N buffers
> and they are cyclically capturing data and providing to client/user..

For the display case it's cyclic over a single buffer that is repeatedly
displayed over and over again until a new one replaces it, when
userspace wants to change the content on the screen. Userspace only has
to provide a new buffer when content changes, otherwise the display has
to keep displaying the same one.

For cameras I don't think cyclic makes too much sense, except when the
DMA engine can't work in single-shot mode and always requires a buffer
to write into. That shouldn't be the norm.

> So why would you like to submit again...? Once whole capture has
> completed you would terminate, right...
> 
> Sorry not able to wrap my head around why new submission is required and
> if that is the case why previous one cant be terminated :)

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-01-24  7:20           ` Peter Ujfalusi
  2020-01-24  7:38             ` Peter Ujfalusi
@ 2020-01-24  8:56             ` Laurent Pinchart
  1 sibling, 0 replies; 46+ messages in thread
From: Laurent Pinchart @ 2020-01-24  8:56 UTC (permalink / raw)
  To: Peter Ujfalusi
  Cc: Vinod Koul, dmaengine, Michal Simek, Hyun Kwon, Tejas Upadhyay,
	Satish Kumar Nagireddy

Hi Peter,

On Fri, Jan 24, 2020 at 09:20:15AM +0200, Peter Ujfalusi wrote:
> On 23/01/2020 14.23, Laurent Pinchart wrote:
> >>>> I think capture (camera) is another potential beneficiary of this.
> > 
> > Possibly, although in the camera case I'd rather have the hardware stop
> > if there's no more buffer. Requiring a buffer to always be present is
> > annoying from a userspace point of view. For display it's different, if
> > userspace doesn't submit a new frame, the same frame should keep being
> > displayed on the screen.
> > 
> >>>> So you don't need to terminate the running interleaved_cyclic and start
> >>>> a new one, but prepare and issue a new one, which would
> >>>> terminate/replace the currently running cyclic interleaved DMA?
> > 
> > Correct.
> > 
> >>> Why not explicitly terminate the transfer and start when a new one is
> >>> issued. That can be common usage for audio and display..
> >>
> >> Yes, this is what I'm asking. The cyclic transfer is running and in
> >> order to start the new transfer, the previous should stop. But in cyclic
> >> case it is not going to happen unless it is terminated.
> >>
> >> When one would want to have different interleaved transfer the display
> >> (or capture )IP needs to be reconfigured as well. The the would need to
> >> be terminated anyways to avoid interpreting data in a wrong way.
> > 
> > The use case here is not to switch to a new configuration, but to switch
> > to a new buffer. If the transfer had to be terminated manually first,
> > the DMA engine would potentially miss a frame, which is not acceptable.
> > We need an atomic way to switch to the next transfer.
> 
> You have a special hardware in hand, most DMAs can not just replace a
> cyclic transfer in-flight and it also kind of violates the DMAengine
> principles.

That's why cyclic support is optional :-)

> If cyclic transfer is started then it is expected to run forever until
> it is terminated. Preparing and issuing a new transfer will not get
> executed when there is already a cyclic transfer in flight as your only
> option is to terminate_all, which will kill the running cyclic _and_
> will discard the issued and pending transfers.

For the existing cyclic API, I could agree with that, although there's
very little documentation in the dmaengine subsystem to be used as an
authoritative source of information :-(

> So the use case is page flip when you have multiple framebuffers and you
> switch them to show the updated one, right?

Correct.

> There are things missing in DMAengine in API level for sure to do this,
> imho.
> The issue is that cyclic transfers will never complete, they run until
> terminated, but you want to replace the currently executing one with a
> another cyclic transfer without actually terminating the other.

Correct.

> It is like pause the 1st cyclic and continue with the 2nd one. Then at
> some point you pause the 2nd one and restart the 1st one.

No, after the 2nd one comes the 3rd one. It's not a double-buffering
case, it's really about replacing the buffer with another one,
regardless of where it comes from. Userspace may double-buffer, or
triple, or more.

> It is also crucial that the pause /switch happens when the executing one
> finished the interleaved round and not in the middle somewhere, right?

Yes. But that's not specific to this use case, with all non-cyclic
transfers submitting a new transfer request doesn't stop the ongoing
transfer (if any) immediately, it just queues the new transfer for
processing.

> If you:
> desc_1 = dmaengine_prep_interleaved_cyclic(chan, );
> cookie_1 = dmaengine_submit(desc_1);
> desc_2 = dmaengine_prep_interleaved_cyclic(chan, );
> cookie_2 = dmaengine_submit(desc_1);
> 
> /* cookie_1/desc_1 is started */
> dma_async_issue_pending(chan);
> 
> /* When need to switch to cookie_2 */
> dmaengine_cyclic_set_active_cookie(chan, cookie_2);
> /*
>  * cookie_1 execution is suspended after it finished the running
>  * dma_interleaved_template or buffer in normal cyclic and cookie_2
>  * is replacing it.
>  */
> 
> /* Switch back to cookie_1 */
> dmaengine_cyclic_set_active_cookie(chan, cookie_1);
> /*
>  * cookie_2 execution is suspended after it finished the running
>  * dma_interleaved_template or buffer in normal cyclic and cookie_1
>  * is replacing it.
>  */

As explained above, I don't want to switch back to a previous transfer,
I always want a new one. I don't see why we would need this kind of API
when we can just define that any queued interleaved transfer, whether
cyclic or not, is just queued and replaces the ongoing transfer at the
next frame boundary. Drivers don't have to implement the new API if the
hardware doesn't possess this capability.

> There should be a (yet another) capabilities flag got
> cyclic_set_active_cookie and the documentation should be strict on what
> is the expected behavior.
> 
> You can kill everything with terminate_all.
> There is another thing which is missing imho from DMAengine: to
> terminate a specific cookie, not the entire channel, which might be a
> good addition as you might spawn framebuffers and then delete them and
> you might want to release the corresponding cookie/descriptor as well.
> 
> What do you think?

I think it's overcomplicated for this use case :-)

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-01-24  7:38             ` Peter Ujfalusi
@ 2020-01-24  8:58               ` Laurent Pinchart
  0 siblings, 0 replies; 46+ messages in thread
From: Laurent Pinchart @ 2020-01-24  8:58 UTC (permalink / raw)
  To: Peter Ujfalusi
  Cc: Vinod Koul, dmaengine, Michal Simek, Hyun Kwon, Tejas Upadhyay,
	Satish Kumar Nagireddy

Hi Peter,

On Fri, Jan 24, 2020 at 09:38:50AM +0200, Peter Ujfalusi wrote:
> On 24/01/2020 9.20, Peter Ujfalusi wrote:
> > On 23/01/2020 14.23, Laurent Pinchart wrote:
> >>>>> I think capture (camera) is another potential beneficiary of this.
> >>
> >> Possibly, although in the camera case I'd rather have the hardware stop
> >> if there's no more buffer. Requiring a buffer to always be present is
> >> annoying from a userspace point of view. For display it's different, if
> >> userspace doesn't submit a new frame, the same frame should keep being
> >> displayed on the screen.
> >>
> >>>>> So you don't need to terminate the running interleaved_cyclic and start
> >>>>> a new one, but prepare and issue a new one, which would
> >>>>> terminate/replace the currently running cyclic interleaved DMA?
> >>
> >> Correct.
> >>
> >>>> Why not explicitly terminate the transfer and start when a new one is
> >>>> issued. That can be common usage for audio and display..
> >>>
> >>> Yes, this is what I'm asking. The cyclic transfer is running and in
> >>> order to start the new transfer, the previous should stop. But in cyclic
> >>> case it is not going to happen unless it is terminated.
> >>>
> >>> When one would want to have different interleaved transfer the display
> >>> (or capture )IP needs to be reconfigured as well. The the would need to
> >>> be terminated anyways to avoid interpreting data in a wrong way.
> >>
> >> The use case here is not to switch to a new configuration, but to switch
> >> to a new buffer. If the transfer had to be terminated manually first,
> >> the DMA engine would potentially miss a frame, which is not acceptable.
> >> We need an atomic way to switch to the next transfer.
> > 
> > You have a special hardware in hand, most DMAs can not just replace a
> > cyclic transfer in-flight and it also kind of violates the DMAengine
> > principles.
> 
> Is there any specific reason why you need DMAengine driver for a display
> DMA? Usually the drm drivers handle their DMA internally.

Because it's a separate IP core that can be reused in different FPGAs
for different purposes. It happens that in my case it's a hard IP
connected to a display controller, but it could be used for non-cyclic
use cases in a different chip.

> > If cyclic transfer is started then it is expected to run forever until
> > it is terminated. Preparing and issuing a new transfer will not get
> > executed when there is already a cyclic transfer in flight as your only
> > option is to terminate_all, which will kill the running cyclic _and_
> > will discard the issued and pending transfers.
> > 
> > So the use case is page flip when you have multiple framebuffers and you
> > switch them to show the updated one, right?
> > 
> > There are things missing in DMAengine in API level for sure to do this,
> > imho.
> > The issue is that cyclic transfers will never complete, they run until
> > terminated, but you want to replace the currently executing one with a
> > another cyclic transfer without actually terminating the other.
> > 
> > It is like pause the 1st cyclic and continue with the 2nd one. Then at
> > some point you pause the 2nd one and restart the 1st one.
> > It is also crucial that the pause /switch happens when the executing one
> > finished the interleaved round and not in the middle somewhere, right?
> > 
> > If you:
> > desc_1 = dmaengine_prep_interleaved_cyclic(chan, );
> > cookie_1 = dmaengine_submit(desc_1);
> > desc_2 = dmaengine_prep_interleaved_cyclic(chan, );
> > cookie_2 = dmaengine_submit(desc_1);
> > 
> > /* cookie_1/desc_1 is started */
> > dma_async_issue_pending(chan);
> > 
> > /* When need to switch to cookie_2 */
> > dmaengine_cyclic_set_active_cookie(chan, cookie_2);
> > /*
> >  * cookie_1 execution is suspended after it finished the running
> >  * dma_interleaved_template or buffer in normal cyclic and cookie_2
> >  * is replacing it.
> >  */
> > 
> > /* Switch back to cookie_1 */
> > dmaengine_cyclic_set_active_cookie(chan, cookie_1);
> > /*
> >  * cookie_2 execution is suspended after it finished the running
> >  * dma_interleaved_template or buffer in normal cyclic and cookie_1
> >  * is replacing it.
> >  */
> > 
> > There should be a (yet another) capabilities flag got
> > cyclic_set_active_cookie and the documentation should be strict on what
> > is the expected behavior.
> > 
> > You can kill everything with terminate_all.
> > There is another thing which is missing imho from DMAengine: to
> > terminate a specific cookie, not the entire channel, which might be a
> > good addition as you might spawn framebuffers and then delete them and
> > you might want to release the corresponding cookie/descriptor as well.
> 
> This is a bit trickier as DMAengine's cookie is s32 and internally
> treated as a running number and cookie status is checked against s32
> numbers with < >, I think this will not like when someone kills a cookie
> in the middle.

I would require a major redesign, yes. Not looking forward to that,
especially as I think we don't need it.

> > What do you think?

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-01-24  8:50             ` Laurent Pinchart
@ 2020-02-10 14:06               ` Laurent Pinchart
  2020-02-13 13:29                 ` Vinod Koul
  0 siblings, 1 reply; 46+ messages in thread
From: Laurent Pinchart @ 2020-02-10 14:06 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Vinod,

On Fri, Jan 24, 2020 at 10:50:51AM +0200, Laurent Pinchart wrote:
> On Fri, Jan 24, 2020 at 11:40:47AM +0530, Vinod Koul wrote:
> > On 23-01-20, 14:23, Laurent Pinchart wrote:
> >>>>>> @@ -701,6 +702,10 @@ struct dma_filter {
> >>>>>>   *	The function takes a buffer of size buf_len. The callback function will
> >>>>>>   *	be called after period_len bytes have been transferred.
> >>>>>>   * @device_prep_interleaved_dma: Transfer expression in a generic way.
> >>>>>> + * @device_prep_interleaved_cyclic: prepares an interleaved cyclic transfer.
> >>>>>> + *	This is similar to @device_prep_interleaved_dma, but the transfer is
> >>>>>> + *	repeated until a new transfer is issued. This transfer type is meant
> >>>>>> + *	for display.
> >>>>>
> >>>>> I think capture (camera) is another potential beneficiary of this.
> >> 
> >> Possibly, although in the camera case I'd rather have the hardware stop
> >> if there's no more buffer. Requiring a buffer to always be present is
> >> annoying from a userspace point of view. For display it's different, if
> >> userspace doesn't submit a new frame, the same frame should keep being
> >> displayed on the screen.
> >> 
> >>>>> So you don't need to terminate the running interleaved_cyclic and start
> >>>>> a new one, but prepare and issue a new one, which would
> >>>>> terminate/replace the currently running cyclic interleaved DMA?
> >> 
> >> Correct.
> >> 
> >>>> Why not explicitly terminate the transfer and start when a new one is
> >>>> issued. That can be common usage for audio and display..
> >>> 
> >>> Yes, this is what I'm asking. The cyclic transfer is running and in
> >>> order to start the new transfer, the previous should stop. But in cyclic
> >>> case it is not going to happen unless it is terminated.
> >>> 
> >>> When one would want to have different interleaved transfer the display
> >>> (or capture )IP needs to be reconfigured as well. The the would need to
> >>> be terminated anyways to avoid interpreting data in a wrong way.
> >> 
> >> The use case here is not to switch to a new configuration, but to switch
> >> to a new buffer. If the transfer had to be terminated manually first,
> >> the DMA engine would potentially miss a frame, which is not acceptable.
> >> We need an atomic way to switch to the next transfer.
> > 
> > So in this case you have, let's say a cyclic descriptor with N buffers
> > and they are cyclically capturing data and providing to client/user..
> 
> For the display case it's cyclic over a single buffer that is repeatedly
> displayed over and over again until a new one replaces it, when
> userspace wants to change the content on the screen. Userspace only has
> to provide a new buffer when content changes, otherwise the display has
> to keep displaying the same one.

Is the use case clear enough, or do you need more information ? Are you
fine with the API for this kind of use case ?

> For cameras I don't think cyclic makes too much sense, except when the
> DMA engine can't work in single-shot mode and always requires a buffer
> to write into. That shouldn't be the norm.
> 
> > So why would you like to submit again...? Once whole capture has
> > completed you would terminate, right...
> > 
> > Sorry not able to wrap my head around why new submission is required and
> > if that is the case why previous one cant be terminated :)

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-02-10 14:06               ` Laurent Pinchart
@ 2020-02-13 13:29                 ` Vinod Koul
  2020-02-13 13:48                   ` Laurent Pinchart
  0 siblings, 1 reply; 46+ messages in thread
From: Vinod Koul @ 2020-02-13 13:29 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Laurent,

On 10-02-20, 16:06, Laurent Pinchart wrote:

> > >> The use case here is not to switch to a new configuration, but to switch
> > >> to a new buffer. If the transfer had to be terminated manually first,
> > >> the DMA engine would potentially miss a frame, which is not acceptable.
> > >> We need an atomic way to switch to the next transfer.
> > > 
> > > So in this case you have, let's say a cyclic descriptor with N buffers
> > > and they are cyclically capturing data and providing to client/user..
> > 
> > For the display case it's cyclic over a single buffer that is repeatedly
> > displayed over and over again until a new one replaces it, when
> > userspace wants to change the content on the screen. Userspace only has
> > to provide a new buffer when content changes, otherwise the display has
> > to keep displaying the same one.
> 
> Is the use case clear enough, or do you need more information ? Are you
> fine with the API for this kind of use case ?

So we *know* when a new buffer is being used?

IOW would it be possible for display (rather a dmaengine facing display wrapper) to detect that we are reusing an
old buffer and keep the cyclic and once detected prepare a new
descriptor, submit a new one and then terminate old one which should
trigger next transaction to be submitted

Would that make sense here?

-- 
~Vinod

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-02-13 13:29                 ` Vinod Koul
@ 2020-02-13 13:48                   ` Laurent Pinchart
  2020-02-13 14:07                     ` Vinod Koul
  0 siblings, 1 reply; 46+ messages in thread
From: Laurent Pinchart @ 2020-02-13 13:48 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Vinod,

On Thu, Feb 13, 2020 at 06:59:38PM +0530, Vinod Koul wrote:
> On 10-02-20, 16:06, Laurent Pinchart wrote:
> 
> > > >> The use case here is not to switch to a new configuration, but to switch
> > > >> to a new buffer. If the transfer had to be terminated manually first,
> > > >> the DMA engine would potentially miss a frame, which is not acceptable.
> > > >> We need an atomic way to switch to the next transfer.
> > > > 
> > > > So in this case you have, let's say a cyclic descriptor with N buffers
> > > > and they are cyclically capturing data and providing to client/user..
> > > 
> > > For the display case it's cyclic over a single buffer that is repeatedly
> > > displayed over and over again until a new one replaces it, when
> > > userspace wants to change the content on the screen. Userspace only has
> > > to provide a new buffer when content changes, otherwise the display has
> > > to keep displaying the same one.
> > 
> > Is the use case clear enough, or do you need more information ? Are you
> > fine with the API for this kind of use case ?
> 
> So we *know* when a new buffer is being used?

The user of the DMA engine (the DRM DPSUB driver in this case) knows
when a new buffer needs to be used, as it receives it from userspace. In
response, it prepares a new interleaved cyclic transaction and queues
it. At the next IRQ, the DMA engine driver switches to the new
transaction (the implementation is slightly more complex to handle race
conditions, but that's the idea).

> IOW would it be possible for display (rather a dmaengine facing
> display wrapper) to detect that we are reusing an old buffer and keep
> the cyclic and once detected prepare a new descriptor, submit a new
> one and then terminate old one which should trigger next transaction
> to be submitted

I'm not sure to follow you. Do you mean that the display driver should
submit a non-cyclic transaction for every frame, reusing the same buffer
for every transaction, until a new buffer is available ? The issue with
this is that if the CPU load gets high, we may miss a frame, and the
display will break. The DPDMA hardware implements cyclic support for
this reason, and we want to use that feature to comply with the real
time requirements.

If you meant something else, could you please elaborate ?

> Would that make sense here?

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-02-13 13:48                   ` Laurent Pinchart
@ 2020-02-13 14:07                     ` Vinod Koul
  2020-02-13 14:15                       ` Peter Ujfalusi
  0 siblings, 1 reply; 46+ messages in thread
From: Vinod Koul @ 2020-02-13 14:07 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

On 13-02-20, 15:48, Laurent Pinchart wrote:
> Hi Vinod,
> 
> On Thu, Feb 13, 2020 at 06:59:38PM +0530, Vinod Koul wrote:
> > On 10-02-20, 16:06, Laurent Pinchart wrote:
> > 
> > > > >> The use case here is not to switch to a new configuration, but to switch
> > > > >> to a new buffer. If the transfer had to be terminated manually first,
> > > > >> the DMA engine would potentially miss a frame, which is not acceptable.
> > > > >> We need an atomic way to switch to the next transfer.
> > > > > 
> > > > > So in this case you have, let's say a cyclic descriptor with N buffers
> > > > > and they are cyclically capturing data and providing to client/user..
> > > > 
> > > > For the display case it's cyclic over a single buffer that is repeatedly
> > > > displayed over and over again until a new one replaces it, when
> > > > userspace wants to change the content on the screen. Userspace only has
> > > > to provide a new buffer when content changes, otherwise the display has
> > > > to keep displaying the same one.
> > > 
> > > Is the use case clear enough, or do you need more information ? Are you
> > > fine with the API for this kind of use case ?
> > 
> > So we *know* when a new buffer is being used?
> 
> The user of the DMA engine (the DRM DPSUB driver in this case) knows
> when a new buffer needs to be used, as it receives it from userspace. In
> response, it prepares a new interleaved cyclic transaction and queues
> it. At the next IRQ, the DMA engine driver switches to the new
> transaction (the implementation is slightly more complex to handle race
> conditions, but that's the idea).
> 
> > IOW would it be possible for display (rather a dmaengine facing
> > display wrapper) to detect that we are reusing an old buffer and keep
> > the cyclic and once detected prepare a new descriptor, submit a new
> > one and then terminate old one which should trigger next transaction
> > to be submitted
> 
> I'm not sure to follow you. Do you mean that the display driver should
> submit a non-cyclic transaction for every frame, reusing the same buffer
> for every transaction, until a new buffer is available ? The issue with
> this is that if the CPU load gets high, we may miss a frame, and the
> display will break. The DPDMA hardware implements cyclic support for
> this reason, and we want to use that feature to comply with the real
> time requirements.

Sorry to cause confusion :) I mean cyclic

So, DRM DPSUB get first buffer
A.1 Prepare cyclic interleave txn
A.2 Submit the txn (it doesn't start here)
A.3 Invoke issue_pending (that starts the txn)

DRM DPSUB gets next buffer:
B.1 Prepare cyclic interleave txn
B.2 Submit the txn
B.3 Call terminate for current cyclic txn (we need an updated terminate
which terminates the current txn, right now we have terminate_all which
is a sledge hammer approach)
B.4 Next txn would start once current one is started

Does this help and make sense in your case

Thanks
-- 
~Vinod

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-02-13 14:07                     ` Vinod Koul
@ 2020-02-13 14:15                       ` Peter Ujfalusi
  2020-02-13 16:52                         ` Laurent Pinchart
  0 siblings, 1 reply; 46+ messages in thread
From: Peter Ujfalusi @ 2020-02-13 14:15 UTC (permalink / raw)
  To: Vinod Koul, Laurent Pinchart
  Cc: dmaengine, Michal Simek, Hyun Kwon, Tejas Upadhyay,
	Satish Kumar Nagireddy

Hi Vinod, Laurent,

On 13/02/2020 16.07, Vinod Koul wrote:
> On 13-02-20, 15:48, Laurent Pinchart wrote:
>> Hi Vinod,
>>
>> On Thu, Feb 13, 2020 at 06:59:38PM +0530, Vinod Koul wrote:
>>> On 10-02-20, 16:06, Laurent Pinchart wrote:
>>>
>>>>>>> The use case here is not to switch to a new configuration, but to switch
>>>>>>> to a new buffer. If the transfer had to be terminated manually first,
>>>>>>> the DMA engine would potentially miss a frame, which is not acceptable.
>>>>>>> We need an atomic way to switch to the next transfer.
>>>>>>
>>>>>> So in this case you have, let's say a cyclic descriptor with N buffers
>>>>>> and they are cyclically capturing data and providing to client/user..
>>>>>
>>>>> For the display case it's cyclic over a single buffer that is repeatedly
>>>>> displayed over and over again until a new one replaces it, when
>>>>> userspace wants to change the content on the screen. Userspace only has
>>>>> to provide a new buffer when content changes, otherwise the display has
>>>>> to keep displaying the same one.
>>>>
>>>> Is the use case clear enough, or do you need more information ? Are you
>>>> fine with the API for this kind of use case ?
>>>
>>> So we *know* when a new buffer is being used?
>>
>> The user of the DMA engine (the DRM DPSUB driver in this case) knows
>> when a new buffer needs to be used, as it receives it from userspace. In
>> response, it prepares a new interleaved cyclic transaction and queues
>> it. At the next IRQ, the DMA engine driver switches to the new
>> transaction (the implementation is slightly more complex to handle race
>> conditions, but that's the idea).
>>
>>> IOW would it be possible for display (rather a dmaengine facing
>>> display wrapper) to detect that we are reusing an old buffer and keep
>>> the cyclic and once detected prepare a new descriptor, submit a new
>>> one and then terminate old one which should trigger next transaction
>>> to be submitted
>>
>> I'm not sure to follow you. Do you mean that the display driver should
>> submit a non-cyclic transaction for every frame, reusing the same buffer
>> for every transaction, until a new buffer is available ? The issue with
>> this is that if the CPU load gets high, we may miss a frame, and the
>> display will break. The DPDMA hardware implements cyclic support for
>> this reason, and we want to use that feature to comply with the real
>> time requirements.
> 
> Sorry to cause confusion :) I mean cyclic
> 
> So, DRM DPSUB get first buffer
> A.1 Prepare cyclic interleave txn
> A.2 Submit the txn (it doesn't start here)
> A.3 Invoke issue_pending (that starts the txn)
> 
> DRM DPSUB gets next buffer:
> B.1 Prepare cyclic interleave txn
> B.2 Submit the txn
> B.3 Call terminate for current cyclic txn (we need an updated terminate
> which terminates the current txn, right now we have terminate_all which
> is a sledge hammer approach)
> B.4 Next txn would start once current one is started
> 
> Does this help and make sense in your case

That would be a clean way to handle it. We were missing this API for a
long time to be able to cancel the ongoing transfer (whether it is
cyclic or slave_sg, or memcpy) and move to the next one if there is one
pending.

+1 from me if it counts ;)

> 
> Thanks
> 

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-02-13 14:15                       ` Peter Ujfalusi
@ 2020-02-13 16:52                         ` Laurent Pinchart
  2020-02-14  4:23                           ` Vinod Koul
  0 siblings, 1 reply; 46+ messages in thread
From: Laurent Pinchart @ 2020-02-13 16:52 UTC (permalink / raw)
  To: Peter Ujfalusi, Vinod Koul
  Cc: dmaengine, Michal Simek, Hyun Kwon, Tejas Upadhyay,
	Satish Kumar Nagireddy

Hi Vinod and Peter,

On Thu, Feb 13, 2020 at 04:15:38PM +0200, Peter Ujfalusi wrote:
> On 13/02/2020 16.07, Vinod Koul wrote:
> > On 13-02-20, 15:48, Laurent Pinchart wrote:
> >> On Thu, Feb 13, 2020 at 06:59:38PM +0530, Vinod Koul wrote:
> >>> On 10-02-20, 16:06, Laurent Pinchart wrote:
> >>>
> >>>>>>> The use case here is not to switch to a new configuration, but to switch
> >>>>>>> to a new buffer. If the transfer had to be terminated manually first,
> >>>>>>> the DMA engine would potentially miss a frame, which is not acceptable.
> >>>>>>> We need an atomic way to switch to the next transfer.
> >>>>>>
> >>>>>> So in this case you have, let's say a cyclic descriptor with N buffers
> >>>>>> and they are cyclically capturing data and providing to client/user..
> >>>>>
> >>>>> For the display case it's cyclic over a single buffer that is repeatedly
> >>>>> displayed over and over again until a new one replaces it, when
> >>>>> userspace wants to change the content on the screen. Userspace only has
> >>>>> to provide a new buffer when content changes, otherwise the display has
> >>>>> to keep displaying the same one.
> >>>>
> >>>> Is the use case clear enough, or do you need more information ? Are you
> >>>> fine with the API for this kind of use case ?
> >>>
> >>> So we *know* when a new buffer is being used?
> >>
> >> The user of the DMA engine (the DRM DPSUB driver in this case) knows
> >> when a new buffer needs to be used, as it receives it from userspace. In
> >> response, it prepares a new interleaved cyclic transaction and queues
> >> it. At the next IRQ, the DMA engine driver switches to the new
> >> transaction (the implementation is slightly more complex to handle race
> >> conditions, but that's the idea).
> >>
> >>> IOW would it be possible for display (rather a dmaengine facing
> >>> display wrapper) to detect that we are reusing an old buffer and keep
> >>> the cyclic and once detected prepare a new descriptor, submit a new
> >>> one and then terminate old one which should trigger next transaction
> >>> to be submitted
> >>
> >> I'm not sure to follow you. Do you mean that the display driver should
> >> submit a non-cyclic transaction for every frame, reusing the same buffer
> >> for every transaction, until a new buffer is available ? The issue with
> >> this is that if the CPU load gets high, we may miss a frame, and the
> >> display will break. The DPDMA hardware implements cyclic support for
> >> this reason, and we want to use that feature to comply with the real
> >> time requirements.
> > 
> > Sorry to cause confusion :) I mean cyclic
> > 
> > So, DRM DPSUB get first buffer
> > A.1 Prepare cyclic interleave txn
> > A.2 Submit the txn (it doesn't start here)
> > A.3 Invoke issue_pending (that starts the txn)

I assume that, at this point, the transfer is started, and repeated
forever until step B below, right ?

> > DRM DPSUB gets next buffer:
> > B.1 Prepare cyclic interleave txn
> > B.2 Submit the txn
> > B.3 Call terminate for current cyclic txn (we need an updated terminate
> > which terminates the current txn, right now we have terminate_all which
> > is a sledge hammer approach)
> > B.4 Next txn would start once current one is started

Do you mean "once current one is completed" ?

> > Does this help and make sense in your case

It does, but I really wonder why we need a new terminate operation that
would terminate a single transfer. If we call issue_pending at step B.3,
when the new txn submitted, we can terminate the current transfer at the
point. It changes the semantics of issue_pending, but only for cyclic
transfers (this whole discussions it only about cyclic transfers). As a
cyclic transfer will be repeated forever until terminated, there's no
use case for issuing a new transfer without terminating the one in
progress. I thus don't think we need a new terminate operation: the only
thing that makes sense to do when submitting a new cyclic transfer is to
terminate the current one and switch to the new one, and we already have
all the APIs we need to enable this behaviour.

> That would be a clean way to handle it. We were missing this API for a
> long time to be able to cancel the ongoing transfer (whether it is
> cyclic or slave_sg, or memcpy) and move to the next one if there is one
> pending.

Note that this new terminate API wouldn't terminate the ongoing transfer
immediately, it would complete first, until the end of the cycle for
cyclic transfers, and until the end of the whole transfer otherwise.
This new operation would thus essentially be a no-op for non-cyclic
transfers. I don't see how it would help :-) Do you have any particular
use case in mind ?

> +1 from me if it counts ;)

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-02-13 16:52                         ` Laurent Pinchart
@ 2020-02-14  4:23                           ` Vinod Koul
  2020-02-14 16:22                             ` Laurent Pinchart
  0 siblings, 1 reply; 46+ messages in thread
From: Vinod Koul @ 2020-02-14  4:23 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

On 13-02-20, 18:52, Laurent Pinchart wrote:
> Hi Vinod and Peter,
> 
> On Thu, Feb 13, 2020 at 04:15:38PM +0200, Peter Ujfalusi wrote:
> > On 13/02/2020 16.07, Vinod Koul wrote:
> > > On 13-02-20, 15:48, Laurent Pinchart wrote:
> > >> On Thu, Feb 13, 2020 at 06:59:38PM +0530, Vinod Koul wrote:
> > >>> On 10-02-20, 16:06, Laurent Pinchart wrote:
> > >>>
> > >>>>>>> The use case here is not to switch to a new configuration, but to switch
> > >>>>>>> to a new buffer. If the transfer had to be terminated manually first,
> > >>>>>>> the DMA engine would potentially miss a frame, which is not acceptable.
> > >>>>>>> We need an atomic way to switch to the next transfer.
> > >>>>>>
> > >>>>>> So in this case you have, let's say a cyclic descriptor with N buffers
> > >>>>>> and they are cyclically capturing data and providing to client/user..
> > >>>>>
> > >>>>> For the display case it's cyclic over a single buffer that is repeatedly
> > >>>>> displayed over and over again until a new one replaces it, when
> > >>>>> userspace wants to change the content on the screen. Userspace only has
> > >>>>> to provide a new buffer when content changes, otherwise the display has
> > >>>>> to keep displaying the same one.
> > >>>>
> > >>>> Is the use case clear enough, or do you need more information ? Are you
> > >>>> fine with the API for this kind of use case ?
> > >>>
> > >>> So we *know* when a new buffer is being used?
> > >>
> > >> The user of the DMA engine (the DRM DPSUB driver in this case) knows
> > >> when a new buffer needs to be used, as it receives it from userspace. In
> > >> response, it prepares a new interleaved cyclic transaction and queues
> > >> it. At the next IRQ, the DMA engine driver switches to the new
> > >> transaction (the implementation is slightly more complex to handle race
> > >> conditions, but that's the idea).
> > >>
> > >>> IOW would it be possible for display (rather a dmaengine facing
> > >>> display wrapper) to detect that we are reusing an old buffer and keep
> > >>> the cyclic and once detected prepare a new descriptor, submit a new
> > >>> one and then terminate old one which should trigger next transaction
> > >>> to be submitted
> > >>
> > >> I'm not sure to follow you. Do you mean that the display driver should
> > >> submit a non-cyclic transaction for every frame, reusing the same buffer
> > >> for every transaction, until a new buffer is available ? The issue with
> > >> this is that if the CPU load gets high, we may miss a frame, and the
> > >> display will break. The DPDMA hardware implements cyclic support for
> > >> this reason, and we want to use that feature to comply with the real
> > >> time requirements.
> > > 
> > > Sorry to cause confusion :) I mean cyclic
> > > 
> > > So, DRM DPSUB get first buffer
> > > A.1 Prepare cyclic interleave txn
> > > A.2 Submit the txn (it doesn't start here)
> > > A.3 Invoke issue_pending (that starts the txn)
> 
> I assume that, at this point, the transfer is started, and repeated
> forever until step B below, right ?

Right, since the transaction is cyclic in nature, the transaction will continue
until stopped or switched :)

> > > DRM DPSUB gets next buffer:
> > > B.1 Prepare cyclic interleave txn
> > > B.2 Submit the txn
> > > B.3 Call terminate for current cyclic txn (we need an updated terminate
> > > which terminates the current txn, right now we have terminate_all which
> > > is a sledge hammer approach)
> > > B.4 Next txn would start once current one is started
> 
> Do you mean "once current one is completed" ?

Yup, sorry for the typo!

> > > Does this help and make sense in your case
> 
> It does, but I really wonder why we need a new terminate operation that
> would terminate a single transfer. If we call issue_pending at step B.3,
> when the new txn submitted, we can terminate the current transfer at the
> point. It changes the semantics of issue_pending, but only for cyclic
> transfers (this whole discussions it only about cyclic transfers). As a
> cyclic transfer will be repeated forever until terminated, there's no
> use case for issuing a new transfer without terminating the one in
> progress. I thus don't think we need a new terminate operation: the only
> thing that makes sense to do when submitting a new cyclic transfer is to
> terminate the current one and switch to the new one, and we already have
> all the APIs we need to enable this behaviour.

The issue_pending() is a NOP when engine is already running.

The design of APIs is that we submit a txn to pending_list and then the
pending_list is started when issue_pending() is called.
Or if the engine is already running, it will take next txn from
pending_list() when current txn completes.

The only consideration here in this case is that the cyclic txn never
completes. Do we really treat a new txn submission as an 'indication' of
completeness? That is indeed a point to ponder upon.

Also, we need to keep in mind that the dmaengine wont stop a cyclic
txn. It would be running and start next transfer (in this case do
from start) while it also gives you an interrupt. Here we would be
required to stop it and then start a new one...

Or perhaps remove the cyclic setting from the txn when a new one
arrives and that behaviour IMO is controller dependent, not sure if
all controllers support it..

> > That would be a clean way to handle it. We were missing this API for a
> > long time to be able to cancel the ongoing transfer (whether it is
> > cyclic or slave_sg, or memcpy) and move to the next one if there is one
> > pending.
> 
> Note that this new terminate API wouldn't terminate the ongoing transfer
> immediately, it would complete first, until the end of the cycle for
> cyclic transfers, and until the end of the whole transfer otherwise.
> This new operation would thus essentially be a no-op for non-cyclic
> transfers. I don't see how it would help :-) Do you have any particular
> use case in mind ?

Yeah that is something more to think about. Do we really abort here or
wait for the txn to complete. I think Peter needs the former and your
falls in the latter category

Thanks
-- 
~Vinod

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-02-14  4:23                           ` Vinod Koul
@ 2020-02-14 16:22                             ` Laurent Pinchart
  2020-02-17 10:00                               ` Peter Ujfalusi
  0 siblings, 1 reply; 46+ messages in thread
From: Laurent Pinchart @ 2020-02-14 16:22 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Vinod,

On Fri, Feb 14, 2020 at 09:53:49AM +0530, Vinod Koul wrote:
> On 13-02-20, 18:52, Laurent Pinchart wrote:
> > On Thu, Feb 13, 2020 at 04:15:38PM +0200, Peter Ujfalusi wrote:
> > > On 13/02/2020 16.07, Vinod Koul wrote:
> > > > On 13-02-20, 15:48, Laurent Pinchart wrote:
> > > >> On Thu, Feb 13, 2020 at 06:59:38PM +0530, Vinod Koul wrote:
> > > >>> On 10-02-20, 16:06, Laurent Pinchart wrote:
> > > >>>
> > > >>>>>>> The use case here is not to switch to a new configuration, but to switch
> > > >>>>>>> to a new buffer. If the transfer had to be terminated manually first,
> > > >>>>>>> the DMA engine would potentially miss a frame, which is not acceptable.
> > > >>>>>>> We need an atomic way to switch to the next transfer.
> > > >>>>>>
> > > >>>>>> So in this case you have, let's say a cyclic descriptor with N buffers
> > > >>>>>> and they are cyclically capturing data and providing to client/user..
> > > >>>>>
> > > >>>>> For the display case it's cyclic over a single buffer that is repeatedly
> > > >>>>> displayed over and over again until a new one replaces it, when
> > > >>>>> userspace wants to change the content on the screen. Userspace only has
> > > >>>>> to provide a new buffer when content changes, otherwise the display has
> > > >>>>> to keep displaying the same one.
> > > >>>>
> > > >>>> Is the use case clear enough, or do you need more information ? Are you
> > > >>>> fine with the API for this kind of use case ?
> > > >>>
> > > >>> So we *know* when a new buffer is being used?
> > > >>
> > > >> The user of the DMA engine (the DRM DPSUB driver in this case) knows
> > > >> when a new buffer needs to be used, as it receives it from userspace. In
> > > >> response, it prepares a new interleaved cyclic transaction and queues
> > > >> it. At the next IRQ, the DMA engine driver switches to the new
> > > >> transaction (the implementation is slightly more complex to handle race
> > > >> conditions, but that's the idea).
> > > >>
> > > >>> IOW would it be possible for display (rather a dmaengine facing
> > > >>> display wrapper) to detect that we are reusing an old buffer and keep
> > > >>> the cyclic and once detected prepare a new descriptor, submit a new
> > > >>> one and then terminate old one which should trigger next transaction
> > > >>> to be submitted
> > > >>
> > > >> I'm not sure to follow you. Do you mean that the display driver should
> > > >> submit a non-cyclic transaction for every frame, reusing the same buffer
> > > >> for every transaction, until a new buffer is available ? The issue with
> > > >> this is that if the CPU load gets high, we may miss a frame, and the
> > > >> display will break. The DPDMA hardware implements cyclic support for
> > > >> this reason, and we want to use that feature to comply with the real
> > > >> time requirements.
> > > > 
> > > > Sorry to cause confusion :) I mean cyclic
> > > > 
> > > > So, DRM DPSUB get first buffer
> > > > A.1 Prepare cyclic interleave txn
> > > > A.2 Submit the txn (it doesn't start here)
> > > > A.3 Invoke issue_pending (that starts the txn)
> > 
> > I assume that, at this point, the transfer is started, and repeated
> > forever until step B below, right ?
> 
> Right, since the transaction is cyclic in nature, the transaction will continue
> until stopped or switched :)
> 
> > > > DRM DPSUB gets next buffer:
> > > > B.1 Prepare cyclic interleave txn
> > > > B.2 Submit the txn
> > > > B.3 Call terminate for current cyclic txn (we need an updated terminate
> > > > which terminates the current txn, right now we have terminate_all which
> > > > is a sledge hammer approach)
> > > > B.4 Next txn would start once current one is started
> > 
> > Do you mean "once current one is completed" ?
> 
> Yup, sorry for the typo!

No worries, I just wanted to make sure it wasn't a misunderstanding on
my side.

> > > > Does this help and make sense in your case
> > 
> > It does, but I really wonder why we need a new terminate operation that
> > would terminate a single transfer. If we call issue_pending at step B.3,
> > when the new txn submitted, we can terminate the current transfer at the
> > point. It changes the semantics of issue_pending, but only for cyclic
> > transfers (this whole discussions it only about cyclic transfers). As a
> > cyclic transfer will be repeated forever until terminated, there's no
> > use case for issuing a new transfer without terminating the one in
> > progress. I thus don't think we need a new terminate operation: the only
> > thing that makes sense to do when submitting a new cyclic transfer is to
> > terminate the current one and switch to the new one, and we already have
> > all the APIs we need to enable this behaviour.
> 
> The issue_pending() is a NOP when engine is already running.

That's not totally right. issue_pending() still moves submitted but not
issued transactions from the submitted queue to the issued queue. The
DMA engine only considers the issued queue, so issue_pending()
essentially tells the DMA engine to consider the submitted transaction
for processing after the already issued transactions complete (in the
non-cyclic case).

> The design of APIs is that we submit a txn to pending_list and then the
> pending_list is started when issue_pending() is called.
> Or if the engine is already running, it will take next txn from
> pending_list() when current txn completes.
> 
> The only consideration here in this case is that the cyclic txn never
> completes. Do we really treat a new txn submission as an 'indication' of
> completeness? That is indeed a point to ponder upon.

The reason why I think we should is two-fold:

1. I believe it's semantically aligned with the existing behaviour of
issue_pending(). As explained above, the operation tells the DMA engine
to consider submitted transactions for processing when the current (and
other issued) transactions complete. If we extend the definition of
complete to cover cyclic transactions, I think it's a good match.

2. There's really nothing else we could do with cyclic transactions.
They never complete today and have to be terminated manually with
terminate_all(). Using issue_pending() to move to a next cyclic
transaction doesn't change the existing behaviour by replacing a useful
(and used) feature, as issue_pending() is currently a no-op for cyclic
transactions. The newly issued transaction is never considered, and
calling terminate_all() will cancel the issued transactions. By
extending the behaviour of issue_pending(), we're making a new use case
possible, without restricting any other feature, and without "stealing"
issue_pending() and preventing it from implementing another useful
behaviour.

In a nutshell, an important reason why I like using issue_pending() for
this purpose is because it makes cyclic and non-cyclic transactions
behave more similarly, which I think is good from an API consistency
point of view.

> Also, we need to keep in mind that the dmaengine wont stop a cyclic
> txn. It would be running and start next transfer (in this case do
> from start) while it also gives you an interrupt. Here we would be
> required to stop it and then start a new one...

We wouldn't be required to stop it in the middle, the expected behaviour
is for the DMA engine to complete the cyclic transaction until the end
of the cycle and then replace it by the new one. That's exactly what
happens for non-cyclic transactions when you call issue_pending(), which
makes me like this solution.

> Or perhaps remove the cyclic setting from the txn when a new one
> arrives and that behaviour IMO is controller dependent, not sure if
> all controllers support it..

At the very least I would assume controllers to be able to stop a cyclic
transaction forcefully, otherwise terminate_all() could never be
implemented. This may not lead to a gracefully switch from one cyclic
transaction to another one if the hardware doesn't allow doing so. In
that case I think tx_submit() could return an error, or we could turn
issue_pending() into an int operation to signal the error. Note that
there's no need to mass-patch drivers here, if a DMA engine client
issues a second cyclic transaction while one is in progress, the second
transaction won't be considered today. Signalling an error is in my
opinion a useful feature, but not doing so in DMA engine drivers can't
be a regression. We could also add a flag to tell whether this mode of
operation is supported.

> > > That would be a clean way to handle it. We were missing this API for a
> > > long time to be able to cancel the ongoing transfer (whether it is
> > > cyclic or slave_sg, or memcpy) and move to the next one if there is one
> > > pending.
> > 
> > Note that this new terminate API wouldn't terminate the ongoing transfer
> > immediately, it would complete first, until the end of the cycle for
> > cyclic transfers, and until the end of the whole transfer otherwise.
> > This new operation would thus essentially be a no-op for non-cyclic
> > transfers. I don't see how it would help :-) Do you have any particular
> > use case in mind ?
> 
> Yeah that is something more to think about. Do we really abort here or
> wait for the txn to complete. I think Peter needs the former and your
> falls in the latter category

I definitely need the latter, otherwise the display will flicker (or
completely misoperate) every time a new frame is displayed, which isn't
a good idea :-) I'm not sure about Peter's use cases, but it seems to me
that aborting a transaction immediately is racy in most cases, unless
the DMA engine supports byte-level residue reporting. One non-intrusive
option would be to add a flag to signal that a newly issued transaction
should interrupt the current transaction immediately.

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-02-14 16:22                             ` Laurent Pinchart
@ 2020-02-17 10:00                               ` Peter Ujfalusi
  2020-02-19  9:25                                 ` Vinod Koul
  2020-02-26 16:24                                 ` Laurent Pinchart
  0 siblings, 2 replies; 46+ messages in thread
From: Peter Ujfalusi @ 2020-02-17 10:00 UTC (permalink / raw)
  To: Laurent Pinchart, Vinod Koul
  Cc: dmaengine, Michal Simek, Hyun Kwon, Tejas Upadhyay,
	Satish Kumar Nagireddy

Hi Laurent, Vinod,

On 14/02/2020 18.22, Laurent Pinchart wrote:
>>> It does, but I really wonder why we need a new terminate operation that
>>> would terminate a single transfer. If we call issue_pending at step B.3,
>>> when the new txn submitted, we can terminate the current transfer at the
>>> point. It changes the semantics of issue_pending, but only for cyclic
>>> transfers (this whole discussions it only about cyclic transfers). As a
>>> cyclic transfer will be repeated forever until terminated, there's no
>>> use case for issuing a new transfer without terminating the one in
>>> progress. I thus don't think we need a new terminate operation: the only
>>> thing that makes sense to do when submitting a new cyclic transfer is to
>>> terminate the current one and switch to the new one, and we already have
>>> all the APIs we need to enable this behaviour.
>>
>> The issue_pending() is a NOP when engine is already running.
> 
> That's not totally right. issue_pending() still moves submitted but not
> issued transactions from the submitted queue to the issued queue. The
> DMA engine only considers the issued queue, so issue_pending()
> essentially tells the DMA engine to consider the submitted transaction
> for processing after the already issued transactions complete (in the
> non-cyclic case).

Vinod's point is for the cyclic case at the current state. It is NOP
essentially as we don't have way to not kill the whole channel.

Just a sidenote: it is not even that clean cut for slave transfers
either as the slave_config must _not_ change between the issued
transfers. Iow, you can not switch between 16bit and 32bit word lengths
with some DMA. EDMA, sDMA can do that, but UDMA can not for example...

>> The design of APIs is that we submit a txn to pending_list and then the
>> pending_list is started when issue_pending() is called.
>> Or if the engine is already running, it will take next txn from
>> pending_list() when current txn completes.
>>
>> The only consideration here in this case is that the cyclic txn never
>> completes. Do we really treat a new txn submission as an 'indication' of
>> completeness? That is indeed a point to ponder upon.
> 
> The reason why I think we should is two-fold:
> 
> 1. I believe it's semantically aligned with the existing behaviour of
> issue_pending(). As explained above, the operation tells the DMA engine
> to consider submitted transactions for processing when the current (and
> other issued) transactions complete. If we extend the definition of
> complete to cover cyclic transactions, I think it's a good match.

We will end up with different behavior between cyclic and non cyclic
transfers and the new behavior should be somehow supported by existing
drivers.
Yes, issue_pending is moving the submitted tx to the issued queue to be
executed on HW when the current transfer finished.
We only needed this for non cyclic uses so far. Some DMA hw can replace
the current transfer with a new one (re-trigger to fetch the new
configuration, like your's), but some can not (none of the system DMAs
on TI platforms can).
If we say that this is the behavior the DMA drivers must follow then we
will have non compliant DMA drivers. You can not move simply to other
DMA or can not create generic DMA code shared by drivers.

> 2. There's really nothing else we could do with cyclic transactions.
> They never complete today and have to be terminated manually with
> terminate_all(). Using issue_pending() to move to a next cyclic
> transaction doesn't change the existing behaviour by replacing a useful
> (and used) feature, as issue_pending() is currently a no-op for cyclic
> transactions. The newly issued transaction is never considered, and
> calling terminate_all() will cancel the issued transactions. By
> extending the behaviour of issue_pending(), we're making a new use case
> possible, without restricting any other feature, and without "stealing"
> issue_pending() and preventing it from implementing another useful
> behaviour.

But at the same time we make existing drivers non compliant...

Imo a new callback to 'kill' / 'terminate' / 'replace' / 'abort' an
issued cookie would be cleaner.

cookie1 = dmaengine_issue_pending();
// will start the transfer
cookie2 = dmaengine_issue_pending();
// cookie1 still runs, cookie2 is waiting to be executed
dmaengine_abort_tx(chan);
// will kill cookie1 and executes cookie2

dmaengine_abort_tx() could take a cookie as parameter if we wish, so you
can say selectively which issued tx you want to remove, if it is the
running one, then stop it and move to the next one.
In place of the cookie parameter a 0 could imply that I don't know the
cookie, but kill the running one.

We would preserve what issue_pending does atm and would give us a
generic flow of how other drivers should handle such cases.

Note that this is not only useful for cyclic cases. Any driver which
currently uses brute-force termination can be upgraded.
Prime example is UART RX. We issue an RX buffer to receive data, but it
is not guarantied that the remote will send data which would fill the
buffer and we hit a timeout waiting. We could issue the next buffer and
kill the stale transfer to reclaim the received data.

I think this can be even implemented for DMAs which can not do the same
thing as your DMA can.

> In a nutshell, an important reason why I like using issue_pending() for
> this purpose is because it makes cyclic and non-cyclic transactions
> behave more similarly, which I think is good from an API consistency
> point of view.
> 
>> Also, we need to keep in mind that the dmaengine wont stop a cyclic
>> txn. It would be running and start next transfer (in this case do
>> from start) while it also gives you an interrupt. Here we would be
>> required to stop it and then start a new one...
> 
> We wouldn't be required to stop it in the middle, the expected behaviour
> is for the DMA engine to complete the cyclic transaction until the end
> of the cycle and then replace it by the new one. That's exactly what
> happens for non-cyclic transactions when you call issue_pending(), which
> makes me like this solution.

Right, so we have two different use cases. Replace the current transfers
with the next issued one and abort the current transfer now and arm the
next issued one.
dmaengine_abort_tx(chan, cookie, forced) ?
forced == false: replace it at cyclic boundary
forced == true: right away (as HW allows), do not wait for cyclic round

>> Or perhaps remove the cyclic setting from the txn when a new one
>> arrives and that behaviour IMO is controller dependent, not sure if
>> all controllers support it..
> 
> At the very least I would assume controllers to be able to stop a cyclic
> transaction forcefully, otherwise terminate_all() could never be
> implemented. This may not lead to a gracefully switch from one cyclic
> transaction to another one if the hardware doesn't allow doing so. In
> that case I think tx_submit() could return an error, or we could turn
> issue_pending() into an int operation to signal the error. Note that
> there's no need to mass-patch drivers here, if a DMA engine client
> issues a second cyclic transaction while one is in progress, the second
> transaction won't be considered today. Signalling an error is in my
> opinion a useful feature, but not doing so in DMA engine drivers can't
> be a regression. We could also add a flag to tell whether this mode of
> operation is supported.

My problems is that it is changing the behavior of issue_pending() for
cyclic. If we document this than all existing DMA drivers are broken
(not complaint with the API documentation) as they don't do this.


>>>> That would be a clean way to handle it. We were missing this API for a
>>>> long time to be able to cancel the ongoing transfer (whether it is
>>>> cyclic or slave_sg, or memcpy) and move to the next one if there is one
>>>> pending.
>>>
>>> Note that this new terminate API wouldn't terminate the ongoing transfer
>>> immediately, it would complete first, until the end of the cycle for
>>> cyclic transfers, and until the end of the whole transfer otherwise.
>>> This new operation would thus essentially be a no-op for non-cyclic
>>> transfers. I don't see how it would help :-) Do you have any particular
>>> use case in mind ?
>>
>> Yeah that is something more to think about. Do we really abort here or
>> wait for the txn to complete. I think Peter needs the former and your
>> falls in the latter category
> 
> I definitely need the latter, otherwise the display will flicker (or
> completely misoperate) every time a new frame is displayed, which isn't
> a good idea :-)

Sure, and it is a great feature.

> I'm not sure about Peter's use cases, but it seems to me
> that aborting a transaction immediately is racy in most cases, unless
> the DMA engine supports byte-level residue reporting.

Sort of yes. With EDMA, sDMA I can just kill the channel and set up a
new one right away.
UDMA on the other hand is not that forgiving... I would need to kill the
channel, wait for the termination to complete, reconfigure the channel
and execute the new transfer.

But with a separate callback API at least there will be an entry point
when this can be initiated and handled.
Fwiw, I think it should be simple to add this functionality to them, the
code is kind of handling it in other parts, but implementing it in the
issue_pending() is not really a clean solution.

In a channel you can run slave_sg transfers followed by cyclic if you
wish. A slave channel is what it is, slave channel which can be capable
to execute slave_sg and/or cyclic (and/or interleaved).
If issue_pending() is to take care then we need to check if the current
transfer is cyclic or not and decide based on that.

With a separate callback we in the DMA driver just need to do what the
client is asking for and no need to think.

> One non-intrusive
> option would be to add a flag to signal that a newly issued transaction
> should interrupt the current transaction immediately.

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-02-17 10:00                               ` Peter Ujfalusi
@ 2020-02-19  9:25                                 ` Vinod Koul
  2020-02-26 16:30                                   ` Laurent Pinchart
  2020-02-26 16:24                                 ` Laurent Pinchart
  1 sibling, 1 reply; 46+ messages in thread
From: Vinod Koul @ 2020-02-19  9:25 UTC (permalink / raw)
  To: Peter Ujfalusi
  Cc: Laurent Pinchart, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

On 17-02-20, 12:00, Peter Ujfalusi wrote:
> Hi Laurent, Vinod,
> 
> On 14/02/2020 18.22, Laurent Pinchart wrote:
> >>> It does, but I really wonder why we need a new terminate operation that
> >>> would terminate a single transfer. If we call issue_pending at step B.3,
> >>> when the new txn submitted, we can terminate the current transfer at the
> >>> point. It changes the semantics of issue_pending, but only for cyclic
> >>> transfers (this whole discussions it only about cyclic transfers). As a
> >>> cyclic transfer will be repeated forever until terminated, there's no
> >>> use case for issuing a new transfer without terminating the one in
> >>> progress. I thus don't think we need a new terminate operation: the only
> >>> thing that makes sense to do when submitting a new cyclic transfer is to
> >>> terminate the current one and switch to the new one, and we already have
> >>> all the APIs we need to enable this behaviour.
> >>
> >> The issue_pending() is a NOP when engine is already running.
> > 
> > That's not totally right. issue_pending() still moves submitted but not
> > issued transactions from the submitted queue to the issued queue. The
> > DMA engine only considers the issued queue, so issue_pending()
> > essentially tells the DMA engine to consider the submitted transaction
> > for processing after the already issued transactions complete (in the
> > non-cyclic case).
> 
> Vinod's point is for the cyclic case at the current state. It is NOP
> essentially as we don't have way to not kill the whole channel.

Or IOW there is no descriptor movement to hardware..

> Just a sidenote: it is not even that clean cut for slave transfers
> either as the slave_config must _not_ change between the issued
> transfers. Iow, you can not switch between 16bit and 32bit word lengths
> with some DMA. EDMA, sDMA can do that, but UDMA can not for example...
> 
> >> The design of APIs is that we submit a txn to pending_list and then the
> >> pending_list is started when issue_pending() is called.
> >> Or if the engine is already running, it will take next txn from
> >> pending_list() when current txn completes.
> >>
> >> The only consideration here in this case is that the cyclic txn never
> >> completes. Do we really treat a new txn submission as an 'indication' of
> >> completeness? That is indeed a point to ponder upon.
> > 
> > The reason why I think we should is two-fold:
> > 
> > 1. I believe it's semantically aligned with the existing behaviour of
> > issue_pending(). As explained above, the operation tells the DMA engine
> > to consider submitted transactions for processing when the current (and
> > other issued) transactions complete. If we extend the definition of
> > complete to cover cyclic transactions, I think it's a good match.
> 
> We will end up with different behavior between cyclic and non cyclic
> transfers and the new behavior should be somehow supported by existing
> drivers.
> Yes, issue_pending is moving the submitted tx to the issued queue to be
> executed on HW when the current transfer finished.
> We only needed this for non cyclic uses so far. Some DMA hw can replace
> the current transfer with a new one (re-trigger to fetch the new
> configuration, like your's), but some can not (none of the system DMAs
> on TI platforms can).
> If we say that this is the behavior the DMA drivers must follow then we
> will have non compliant DMA drivers. You can not move simply to other
> DMA or can not create generic DMA code shared by drivers.

That is very important point for API. We want no implicit behaviour, so
if we want an behaviour let us do that explicitly.

> > 2. There's really nothing else we could do with cyclic transactions.
> > They never complete today and have to be terminated manually with
> > terminate_all(). Using issue_pending() to move to a next cyclic
> > transaction doesn't change the existing behaviour by replacing a useful
> > (and used) feature, as issue_pending() is currently a no-op for cyclic
> > transactions. The newly issued transaction is never considered, and
> > calling terminate_all() will cancel the issued transactions. By
> > extending the behaviour of issue_pending(), we're making a new use case
> > possible, without restricting any other feature, and without "stealing"
> > issue_pending() and preventing it from implementing another useful
> > behaviour.
> 
> But at the same time we make existing drivers non compliant...
> 
> Imo a new callback to 'kill' / 'terminate' / 'replace' / 'abort' an
> issued cookie would be cleaner.
> 
> cookie1 = dmaengine_issue_pending();
> // will start the transfer
> cookie2 = dmaengine_issue_pending();
> // cookie1 still runs, cookie2 is waiting to be executed
> dmaengine_abort_tx(chan);
> // will kill cookie1 and executes cookie2

Right and we need a kill mode which kills the cookie1 at the end of
transfer (conditional to hw supporting that)

I think it should be generic API and usable in both the cyclic and
non-cyclic case

> 
> dmaengine_abort_tx() could take a cookie as parameter if we wish, so you
> can say selectively which issued tx you want to remove, if it is the
> running one, then stop it and move to the next one.
> In place of the cookie parameter a 0 could imply that I don't know the
> cookie, but kill the running one.
> 
> We would preserve what issue_pending does atm and would give us a
> generic flow of how other drivers should handle such cases.
> 
> Note that this is not only useful for cyclic cases. Any driver which
> currently uses brute-force termination can be upgraded.
> Prime example is UART RX. We issue an RX buffer to receive data, but it
> is not guarantied that the remote will send data which would fill the
> buffer and we hit a timeout waiting. We could issue the next buffer and
> kill the stale transfer to reclaim the received data.
> 
> I think this can be even implemented for DMAs which can not do the same
> thing as your DMA can.
> 
> > In a nutshell, an important reason why I like using issue_pending() for
> > this purpose is because it makes cyclic and non-cyclic transactions
> > behave more similarly, which I think is good from an API consistency
> > point of view.
> > 
> >> Also, we need to keep in mind that the dmaengine wont stop a cyclic
> >> txn. It would be running and start next transfer (in this case do
> >> from start) while it also gives you an interrupt. Here we would be
> >> required to stop it and then start a new one...
> > 
> > We wouldn't be required to stop it in the middle, the expected behaviour
> > is for the DMA engine to complete the cyclic transaction until the end
> > of the cycle and then replace it by the new one. That's exactly what
> > happens for non-cyclic transactions when you call issue_pending(), which
> > makes me like this solution.
> 
> Right, so we have two different use cases. Replace the current transfers
> with the next issued one and abort the current transfer now and arm the
> next issued one.
> dmaengine_abort_tx(chan, cookie, forced) ?
> forced == false: replace it at cyclic boundary
> forced == true: right away (as HW allows), do not wait for cyclic round
> 
> >> Or perhaps remove the cyclic setting from the txn when a new one
> >> arrives and that behaviour IMO is controller dependent, not sure if
> >> all controllers support it..
> > 
> > At the very least I would assume controllers to be able to stop a cyclic
> > transaction forcefully, otherwise terminate_all() could never be
> > implemented. This may not lead to a gracefully switch from one cyclic
> > transaction to another one if the hardware doesn't allow doing so. In
> > that case I think tx_submit() could return an error, or we could turn
> > issue_pending() into an int operation to signal the error. Note that
> > there's no need to mass-patch drivers here, if a DMA engine client
> > issues a second cyclic transaction while one is in progress, the second
> > transaction won't be considered today. Signalling an error is in my
> > opinion a useful feature, but not doing so in DMA engine drivers can't
> > be a regression. We could also add a flag to tell whether this mode of
> > operation is supported.
> 
> My problems is that it is changing the behavior of issue_pending() for
> cyclic. If we document this than all existing DMA drivers are broken
> (not complaint with the API documentation) as they don't do this.
> 
> 
> >>>> That would be a clean way to handle it. We were missing this API for a
> >>>> long time to be able to cancel the ongoing transfer (whether it is
> >>>> cyclic or slave_sg, or memcpy) and move to the next one if there is one
> >>>> pending.
> >>>
> >>> Note that this new terminate API wouldn't terminate the ongoing transfer
> >>> immediately, it would complete first, until the end of the cycle for
> >>> cyclic transfers, and until the end of the whole transfer otherwise.
> >>> This new operation would thus essentially be a no-op for non-cyclic
> >>> transfers. I don't see how it would help :-) Do you have any particular
> >>> use case in mind ?
> >>
> >> Yeah that is something more to think about. Do we really abort here or
> >> wait for the txn to complete. I think Peter needs the former and your
> >> falls in the latter category
> > 
> > I definitely need the latter, otherwise the display will flicker (or
> > completely misoperate) every time a new frame is displayed, which isn't
> > a good idea :-)
> 
> Sure, and it is a great feature.
> 
> > I'm not sure about Peter's use cases, but it seems to me
> > that aborting a transaction immediately is racy in most cases, unless
> > the DMA engine supports byte-level residue reporting.
> 
> Sort of yes. With EDMA, sDMA I can just kill the channel and set up a
> new one right away.
> UDMA on the other hand is not that forgiving... I would need to kill the
> channel, wait for the termination to complete, reconfigure the channel
> and execute the new transfer.
> 
> But with a separate callback API at least there will be an entry point
> when this can be initiated and handled.
> Fwiw, I think it should be simple to add this functionality to them, the
> code is kind of handling it in other parts, but implementing it in the
> issue_pending() is not really a clean solution.
> 
> In a channel you can run slave_sg transfers followed by cyclic if you
> wish. A slave channel is what it is, slave channel which can be capable
> to execute slave_sg and/or cyclic (and/or interleaved).
> If issue_pending() is to take care then we need to check if the current
> transfer is cyclic or not and decide based on that.
> 
> With a separate callback we in the DMA driver just need to do what the
> client is asking for and no need to think.
> 
> > One non-intrusive
> > option would be to add a flag to signal that a newly issued transaction
> > should interrupt the current transaction immediately.
> 
> - Péter
> 
> Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
> Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

-- 
~Vinod

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-02-17 10:00                               ` Peter Ujfalusi
  2020-02-19  9:25                                 ` Vinod Koul
@ 2020-02-26 16:24                                 ` Laurent Pinchart
  2020-03-02  3:42                                   ` Vinod Koul
  1 sibling, 1 reply; 46+ messages in thread
From: Laurent Pinchart @ 2020-02-26 16:24 UTC (permalink / raw)
  To: Peter Ujfalusi
  Cc: Vinod Koul, dmaengine, Michal Simek, Hyun Kwon, Tejas Upadhyay,
	Satish Kumar Nagireddy

Hi Peter,

On Mon, Feb 17, 2020 at 12:00:02PM +0200, Peter Ujfalusi wrote:
> On 14/02/2020 18.22, Laurent Pinchart wrote:
> >>> It does, but I really wonder why we need a new terminate operation that
> >>> would terminate a single transfer. If we call issue_pending at step B.3,
> >>> when the new txn submitted, we can terminate the current transfer at the
> >>> point. It changes the semantics of issue_pending, but only for cyclic
> >>> transfers (this whole discussions it only about cyclic transfers). As a
> >>> cyclic transfer will be repeated forever until terminated, there's no
> >>> use case for issuing a new transfer without terminating the one in
> >>> progress. I thus don't think we need a new terminate operation: the only
> >>> thing that makes sense to do when submitting a new cyclic transfer is to
> >>> terminate the current one and switch to the new one, and we already have
> >>> all the APIs we need to enable this behaviour.
> >>
> >> The issue_pending() is a NOP when engine is already running.
> > 
> > That's not totally right. issue_pending() still moves submitted but not
> > issued transactions from the submitted queue to the issued queue. The
> > DMA engine only considers the issued queue, so issue_pending()
> > essentially tells the DMA engine to consider the submitted transaction
> > for processing after the already issued transactions complete (in the
> > non-cyclic case).
> 
> Vinod's point is for the cyclic case at the current state. It is NOP
> essentially as we don't have way to not kill the whole channel.

Considering the current implementation of issue_pending(), for cyclic
transfers, that's correct.

My point was that, semantically, and as is implemented today for
non-cyclic transfers, issue_pending() is meant to tell the DMA engine to
consider the submitted transactions for processing after the already
issued transactions complete. For cyclic transactions, .issue_pending
has no defined semantics, and is implemented as a NOP. My proposal is to
extend the existing semantics of issue_pending() as defined for the
non-cyclic transactions to also cover the cyclic transactions. This
won't cause any breakage (the issue_pending() operation being unused for
cyclic transactions, it won't cause any change to existing code), and
will make the API more consistent as the same semantics (moving to the
next submitted transaction when the current one completes) will be
implemented using the same operation.

> Just a sidenote: it is not even that clean cut for slave transfers
> either as the slave_config must _not_ change between the issued
> transfers. Iow, you can not switch between 16bit and 32bit word lengths
> with some DMA. EDMA, sDMA can do that, but UDMA can not for example...

I agree this can be an issue, but I'm not sure how it's related :-) I
believe we need to consider this feature, and specify the API better,
but that's fairly unrelated, isn't it ?

> >> The design of APIs is that we submit a txn to pending_list and then the
> >> pending_list is started when issue_pending() is called.
> >> Or if the engine is already running, it will take next txn from
> >> pending_list() when current txn completes.
> >>
> >> The only consideration here in this case is that the cyclic txn never
> >> completes. Do we really treat a new txn submission as an 'indication' of
> >> completeness? That is indeed a point to ponder upon.
> > 
> > The reason why I think we should is two-fold:
> > 
> > 1. I believe it's semantically aligned with the existing behaviour of
> > issue_pending(). As explained above, the operation tells the DMA engine
> > to consider submitted transactions for processing when the current (and
> > other issued) transactions complete. If we extend the definition of
> > complete to cover cyclic transactions, I think it's a good match.
> 
> We will end up with different behavior between cyclic and non cyclic
> transfers and the new behavior should be somehow supported by existing
> drivers.
> Yes, issue_pending is moving the submitted tx to the issued queue to be
> executed on HW when the current transfer finished.
> We only needed this for non cyclic uses so far. Some DMA hw can replace
> the current transfer with a new one (re-trigger to fetch the new
> configuration, like your's), but some can not (none of the system DMAs
> on TI platforms can).
> If we say that this is the behavior the DMA drivers must follow then we
> will have non compliant DMA drivers. You can not move simply to other
> DMA or can not create generic DMA code shared by drivers.

I think that's a matter of reporting the capabilities of the DMA engine,
and I believe a flag is enough for this. My proposal really gives a
purpose to an API that is unused today (.issue_pending() for cyclic
transfers), and that purpose is semantically coherent with the purpose
of the same function for non-cyclic transfers. I thus believe it brings
the cyclic and non-cyclic cases closer, making their behaviour more
similar, not different.

> > 2. There's really nothing else we could do with cyclic transactions.
> > They never complete today and have to be terminated manually with
> > terminate_all(). Using issue_pending() to move to a next cyclic
> > transaction doesn't change the existing behaviour by replacing a useful
> > (and used) feature, as issue_pending() is currently a no-op for cyclic
> > transactions. The newly issued transaction is never considered, and
> > calling terminate_all() will cancel the issued transactions. By
> > extending the behaviour of issue_pending(), we're making a new use case
> > possible, without restricting any other feature, and without "stealing"
> > issue_pending() and preventing it from implementing another useful
> > behaviour.
> 
> But at the same time we make existing drivers non compliant...

With a flag to report this new feature, that's not a problem.

> Imo a new callback to 'kill' / 'terminate' / 'replace' / 'abort' an
> issued cookie would be cleaner.
> 
> cookie1 = dmaengine_issue_pending();
> // will start the transfer
> cookie2 = dmaengine_issue_pending();
> // cookie1 still runs, cookie2 is waiting to be executed
> dmaengine_abort_tx(chan);
> // will kill cookie1 and executes cookie2
> 
> dmaengine_abort_tx() could take a cookie as parameter if we wish, so you
> can say selectively which issued tx you want to remove, if it is the
> running one, then stop it and move to the next one.
> In place of the cookie parameter a 0 could imply that I don't know the
> cookie, but kill the running one.
> 
> We would preserve what issue_pending does atm and would give us a
> generic flow of how other drivers should handle such cases.
> 
> Note that this is not only useful for cyclic cases. Any driver which
> currently uses brute-force termination can be upgraded.
> Prime example is UART RX. We issue an RX buffer to receive data, but it
> is not guarantied that the remote will send data which would fill the
> buffer and we hit a timeout waiting. We could issue the next buffer and
> kill the stale transfer to reclaim the received data.
> 
> I think this can be even implemented for DMAs which can not do the same
> thing as your DMA can.

But that's a different use case. What I'm after is *not*
killing/aborting a currently running transfer, it's moving to the next
submitted transfer at the next available sync point. I don't want to
abort the transfer in progress immediately, that would kill the display.

I understand that the above can be useful, but I really don't see why
I'd need to implement support for a more complex use case that I have no
need for, and could hardly even test properly, when what I'm after is
fixing what I view as a bug in the existing implementation: we have an
operation, issue_pending(), with a defined purpose, and it happens that
for one of the transfer types that operation doesn't work. I really see
no reason to implement a brand new API in this case.

Note that using issue_pending() as I propose doesn't preclude anyone
(you, or someone else) to implement your above proposal, but please
don't make me do your work :-) This is becoming a case of yak shaving
where I'm asked to fix shortcomings of the DMA engine API when they're
unrelated to my use case.

> > In a nutshell, an important reason why I like using issue_pending() for
> > this purpose is because it makes cyclic and non-cyclic transactions
> > behave more similarly, which I think is good from an API consistency
> > point of view.
> > 
> >> Also, we need to keep in mind that the dmaengine wont stop a cyclic
> >> txn. It would be running and start next transfer (in this case do
> >> from start) while it also gives you an interrupt. Here we would be
> >> required to stop it and then start a new one...
> > 
> > We wouldn't be required to stop it in the middle, the expected behaviour
> > is for the DMA engine to complete the cyclic transaction until the end
> > of the cycle and then replace it by the new one. That's exactly what
> > happens for non-cyclic transactions when you call issue_pending(), which
> > makes me like this solution.
> 
> Right, so we have two different use cases. Replace the current transfers
> with the next issued one and abort the current transfer now and arm the
> next issued one.
> dmaengine_abort_tx(chan, cookie, forced) ?
> forced == false: replace it at cyclic boundary
> forced == true: right away (as HW allows), do not wait for cyclic round

See the above. You're making this more complicated than it should be,
designing an API that contains a small part that could help solving my
problem, and asking me to implement the 90% for free. Not fair :-)

> >> Or perhaps remove the cyclic setting from the txn when a new one
> >> arrives and that behaviour IMO is controller dependent, not sure if
> >> all controllers support it..
> > 
> > At the very least I would assume controllers to be able to stop a cyclic
> > transaction forcefully, otherwise terminate_all() could never be
> > implemented. This may not lead to a gracefully switch from one cyclic
> > transaction to another one if the hardware doesn't allow doing so. In
> > that case I think tx_submit() could return an error, or we could turn
> > issue_pending() into an int operation to signal the error. Note that
> > there's no need to mass-patch drivers here, if a DMA engine client
> > issues a second cyclic transaction while one is in progress, the second
> > transaction won't be considered today. Signalling an error is in my
> > opinion a useful feature, but not doing so in DMA engine drivers can't
> > be a regression. We could also add a flag to tell whether this mode of
> > operation is supported.
> 
> My problems is that it is changing the behavior of issue_pending() for
> cyclic. If we document this than all existing DMA drivers are broken
> (not complaint with the API documentation) as they don't do this.

Again, see above. I argue that it's not a behavioural change as such, as
the current behaviour is unused, because it's implemented as a NOP and
useless. With a simple flag to report if a DMA engine supports replacing
cyclic transfers, we would have a more consistent API as issue_pending()
will operate the same way for *all* types of transfers.

> >>>> That would be a clean way to handle it. We were missing this API for a
> >>>> long time to be able to cancel the ongoing transfer (whether it is
> >>>> cyclic or slave_sg, or memcpy) and move to the next one if there is one
> >>>> pending.
> >>>
> >>> Note that this new terminate API wouldn't terminate the ongoing transfer
> >>> immediately, it would complete first, until the end of the cycle for
> >>> cyclic transfers, and until the end of the whole transfer otherwise.
> >>> This new operation would thus essentially be a no-op for non-cyclic
> >>> transfers. I don't see how it would help :-) Do you have any particular
> >>> use case in mind ?
> >>
> >> Yeah that is something more to think about. Do we really abort here or
> >> wait for the txn to complete. I think Peter needs the former and your
> >> falls in the latter category
> > 
> > I definitely need the latter, otherwise the display will flicker (or
> > completely misoperate) every time a new frame is displayed, which isn't
> > a good idea :-)
> 
> Sure, and it is a great feature.
> 
> > I'm not sure about Peter's use cases, but it seems to me
> > that aborting a transaction immediately is racy in most cases, unless
> > the DMA engine supports byte-level residue reporting.
> 
> Sort of yes. With EDMA, sDMA I can just kill the channel and set up a
> new one right away.
> UDMA on the other hand is not that forgiving... I would need to kill the
> channel, wait for the termination to complete, reconfigure the channel
> and execute the new transfer.
> 
> But with a separate callback API at least there will be an entry point
> when this can be initiated and handled.
> Fwiw, I think it should be simple to add this functionality to them, the
> code is kind of handling it in other parts, but implementing it in the
> issue_pending() is not really a clean solution.
> 
> In a channel you can run slave_sg transfers followed by cyclic if you
> wish. A slave channel is what it is, slave channel which can be capable
> to execute slave_sg and/or cyclic (and/or interleaved).
> If issue_pending() is to take care then we need to check if the current
> transfer is cyclic or not and decide based on that.
> 
> With a separate callback we in the DMA driver just need to do what the
> client is asking for and no need to think.

Let's put it that way: are you volunteering to implement your proposal
(with proper API documentation) in a reasonable time frame, so that I
can try it for my use case ? Otherwise I see no reason to push against
my proposal.

> > One non-intrusive
> > option would be to add a flag to signal that a newly issued transaction
> > should interrupt the current transaction immediately.

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-02-19  9:25                                 ` Vinod Koul
@ 2020-02-26 16:30                                   ` Laurent Pinchart
  2020-03-02  3:47                                     ` Vinod Koul
  0 siblings, 1 reply; 46+ messages in thread
From: Laurent Pinchart @ 2020-02-26 16:30 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Vinod,

On Wed, Feb 19, 2020 at 02:55:14PM +0530, Vinod Koul wrote:
> On 17-02-20, 12:00, Peter Ujfalusi wrote:
> > On 14/02/2020 18.22, Laurent Pinchart wrote:
> >>>> It does, but I really wonder why we need a new terminate operation that
> >>>> would terminate a single transfer. If we call issue_pending at step B.3,
> >>>> when the new txn submitted, we can terminate the current transfer at the
> >>>> point. It changes the semantics of issue_pending, but only for cyclic
> >>>> transfers (this whole discussions it only about cyclic transfers). As a
> >>>> cyclic transfer will be repeated forever until terminated, there's no
> >>>> use case for issuing a new transfer without terminating the one in
> >>>> progress. I thus don't think we need a new terminate operation: the only
> >>>> thing that makes sense to do when submitting a new cyclic transfer is to
> >>>> terminate the current one and switch to the new one, and we already have
> >>>> all the APIs we need to enable this behaviour.
> >>>
> >>> The issue_pending() is a NOP when engine is already running.
> >> 
> >> That's not totally right. issue_pending() still moves submitted but not
> >> issued transactions from the submitted queue to the issued queue. The
> >> DMA engine only considers the issued queue, so issue_pending()
> >> essentially tells the DMA engine to consider the submitted transaction
> >> for processing after the already issued transactions complete (in the
> >> non-cyclic case).
> > 
> > Vinod's point is for the cyclic case at the current state. It is NOP
> > essentially as we don't have way to not kill the whole channel.
> 
> Or IOW there is no descriptor movement to hardware..
> 
> > Just a sidenote: it is not even that clean cut for slave transfers
> > either as the slave_config must _not_ change between the issued
> > transfers. Iow, you can not switch between 16bit and 32bit word lengths
> > with some DMA. EDMA, sDMA can do that, but UDMA can not for example...
> > 
> >>> The design of APIs is that we submit a txn to pending_list and then the
> >>> pending_list is started when issue_pending() is called.
> >>> Or if the engine is already running, it will take next txn from
> >>> pending_list() when current txn completes.
> >>>
> >>> The only consideration here in this case is that the cyclic txn never
> >>> completes. Do we really treat a new txn submission as an 'indication' of
> >>> completeness? That is indeed a point to ponder upon.
> >> 
> >> The reason why I think we should is two-fold:
> >> 
> >> 1. I believe it's semantically aligned with the existing behaviour of
> >> issue_pending(). As explained above, the operation tells the DMA engine
> >> to consider submitted transactions for processing when the current (and
> >> other issued) transactions complete. If we extend the definition of
> >> complete to cover cyclic transactions, I think it's a good match.
> > 
> > We will end up with different behavior between cyclic and non cyclic
> > transfers and the new behavior should be somehow supported by existing
> > drivers.
> > Yes, issue_pending is moving the submitted tx to the issued queue to be
> > executed on HW when the current transfer finished.
> > We only needed this for non cyclic uses so far. Some DMA hw can replace
> > the current transfer with a new one (re-trigger to fetch the new
> > configuration, like your's), but some can not (none of the system DMAs
> > on TI platforms can).
> > If we say that this is the behavior the DMA drivers must follow then we
> > will have non compliant DMA drivers. You can not move simply to other
> > DMA or can not create generic DMA code shared by drivers.
> 
> That is very important point for API. We want no implicit behaviour, so
> if we want an behaviour let us do that explicitly.

As I've just explained in my reply to Peter, there's nothing implicit in
my proposal :-) It's however missing a flag to report if the DMA engine
driver supports this feature, put apart from that, it makes the API
*more* consistent by making issue_pending() cover *all* transfer types
with the *same* semantics.

> >> 2. There's really nothing else we could do with cyclic transactions.
> >> They never complete today and have to be terminated manually with
> >> terminate_all(). Using issue_pending() to move to a next cyclic
> >> transaction doesn't change the existing behaviour by replacing a useful
> >> (and used) feature, as issue_pending() is currently a no-op for cyclic
> >> transactions. The newly issued transaction is never considered, and
> >> calling terminate_all() will cancel the issued transactions. By
> >> extending the behaviour of issue_pending(), we're making a new use case
> >> possible, without restricting any other feature, and without "stealing"
> >> issue_pending() and preventing it from implementing another useful
> >> behaviour.
> > 
> > But at the same time we make existing drivers non compliant...
> > 
> > Imo a new callback to 'kill' / 'terminate' / 'replace' / 'abort' an
> > issued cookie would be cleaner.
> > 
> > cookie1 = dmaengine_issue_pending();
> > // will start the transfer
> > cookie2 = dmaengine_issue_pending();
> > // cookie1 still runs, cookie2 is waiting to be executed
> > dmaengine_abort_tx(chan);
> > // will kill cookie1 and executes cookie2
> 
> Right and we need a kill mode which kills the cookie1 at the end of
> transfer (conditional to hw supporting that)
> 
> I think it should be generic API and usable in both the cyclic and
> non-cyclic case

I have no issue with an API that can abort ongoing transfers without
killing the whole queue of pending transfers, but that's not what I'm
after, it's not my use case. Again, as explained in my reply to Peter,
I'm not looking for a way to abort a transfer immediately, but to move
to the next transfer at the end of the current one. It's very different,
and the DMA engine API already supports this for all transfers but
cyclic transfers. I'd go as far as saying that my proposal is fixing a
bug in the current implementation :-)

> > dmaengine_abort_tx() could take a cookie as parameter if we wish, so you
> > can say selectively which issued tx you want to remove, if it is the
> > running one, then stop it and move to the next one.
> > In place of the cookie parameter a 0 could imply that I don't know the
> > cookie, but kill the running one.
> > 
> > We would preserve what issue_pending does atm and would give us a
> > generic flow of how other drivers should handle such cases.
> > 
> > Note that this is not only useful for cyclic cases. Any driver which
> > currently uses brute-force termination can be upgraded.
> > Prime example is UART RX. We issue an RX buffer to receive data, but it
> > is not guarantied that the remote will send data which would fill the
> > buffer and we hit a timeout waiting. We could issue the next buffer and
> > kill the stale transfer to reclaim the received data.
> > 
> > I think this can be even implemented for DMAs which can not do the same
> > thing as your DMA can.
> > 
> >> In a nutshell, an important reason why I like using issue_pending() for
> >> this purpose is because it makes cyclic and non-cyclic transactions
> >> behave more similarly, which I think is good from an API consistency
> >> point of view.
> >> 
> >>> Also, we need to keep in mind that the dmaengine wont stop a cyclic
> >>> txn. It would be running and start next transfer (in this case do
> >>> from start) while it also gives you an interrupt. Here we would be
> >>> required to stop it and then start a new one...
> >> 
> >> We wouldn't be required to stop it in the middle, the expected behaviour
> >> is for the DMA engine to complete the cyclic transaction until the end
> >> of the cycle and then replace it by the new one. That's exactly what
> >> happens for non-cyclic transactions when you call issue_pending(), which
> >> makes me like this solution.
> > 
> > Right, so we have two different use cases. Replace the current transfers
> > with the next issued one and abort the current transfer now and arm the
> > next issued one.
> > dmaengine_abort_tx(chan, cookie, forced) ?
> > forced == false: replace it at cyclic boundary
> > forced == true: right away (as HW allows), do not wait for cyclic round
> > 
> >>> Or perhaps remove the cyclic setting from the txn when a new one
> >>> arrives and that behaviour IMO is controller dependent, not sure if
> >>> all controllers support it..
> >> 
> >> At the very least I would assume controllers to be able to stop a cyclic
> >> transaction forcefully, otherwise terminate_all() could never be
> >> implemented. This may not lead to a gracefully switch from one cyclic
> >> transaction to another one if the hardware doesn't allow doing so. In
> >> that case I think tx_submit() could return an error, or we could turn
> >> issue_pending() into an int operation to signal the error. Note that
> >> there's no need to mass-patch drivers here, if a DMA engine client
> >> issues a second cyclic transaction while one is in progress, the second
> >> transaction won't be considered today. Signalling an error is in my
> >> opinion a useful feature, but not doing so in DMA engine drivers can't
> >> be a regression. We could also add a flag to tell whether this mode of
> >> operation is supported.
> > 
> > My problems is that it is changing the behavior of issue_pending() for
> > cyclic. If we document this than all existing DMA drivers are broken
> > (not complaint with the API documentation) as they don't do this.
> > 
> > 
> >>>>> That would be a clean way to handle it. We were missing this API for a
> >>>>> long time to be able to cancel the ongoing transfer (whether it is
> >>>>> cyclic or slave_sg, or memcpy) and move to the next one if there is one
> >>>>> pending.
> >>>>
> >>>> Note that this new terminate API wouldn't terminate the ongoing transfer
> >>>> immediately, it would complete first, until the end of the cycle for
> >>>> cyclic transfers, and until the end of the whole transfer otherwise.
> >>>> This new operation would thus essentially be a no-op for non-cyclic
> >>>> transfers. I don't see how it would help :-) Do you have any particular
> >>>> use case in mind ?
> >>>
> >>> Yeah that is something more to think about. Do we really abort here or
> >>> wait for the txn to complete. I think Peter needs the former and your
> >>> falls in the latter category
> >> 
> >> I definitely need the latter, otherwise the display will flicker (or
> >> completely misoperate) every time a new frame is displayed, which isn't
> >> a good idea :-)
> > 
> > Sure, and it is a great feature.
> > 
> >> I'm not sure about Peter's use cases, but it seems to me
> >> that aborting a transaction immediately is racy in most cases, unless
> >> the DMA engine supports byte-level residue reporting.
> > 
> > Sort of yes. With EDMA, sDMA I can just kill the channel and set up a
> > new one right away.
> > UDMA on the other hand is not that forgiving... I would need to kill the
> > channel, wait for the termination to complete, reconfigure the channel
> > and execute the new transfer.
> > 
> > But with a separate callback API at least there will be an entry point
> > when this can be initiated and handled.
> > Fwiw, I think it should be simple to add this functionality to them, the
> > code is kind of handling it in other parts, but implementing it in the
> > issue_pending() is not really a clean solution.
> > 
> > In a channel you can run slave_sg transfers followed by cyclic if you
> > wish. A slave channel is what it is, slave channel which can be capable
> > to execute slave_sg and/or cyclic (and/or interleaved).
> > If issue_pending() is to take care then we need to check if the current
> > transfer is cyclic or not and decide based on that.
> > 
> > With a separate callback we in the DMA driver just need to do what the
> > client is asking for and no need to think.
> > 
> >> One non-intrusive
> >> option would be to add a flag to signal that a newly issued transaction
> >> should interrupt the current transaction immediately.

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-02-26 16:24                                 ` Laurent Pinchart
@ 2020-03-02  3:42                                   ` Vinod Koul
  0 siblings, 0 replies; 46+ messages in thread
From: Vinod Koul @ 2020-03-02  3:42 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

On 26-02-20, 18:24, Laurent Pinchart wrote:
> Hi Peter,
> 
> On Mon, Feb 17, 2020 at 12:00:02PM +0200, Peter Ujfalusi wrote:
> > On 14/02/2020 18.22, Laurent Pinchart wrote:
> > >>> It does, but I really wonder why we need a new terminate operation that
> > >>> would terminate a single transfer. If we call issue_pending at step B.3,
> > >>> when the new txn submitted, we can terminate the current transfer at the
> > >>> point. It changes the semantics of issue_pending, but only for cyclic
> > >>> transfers (this whole discussions it only about cyclic transfers). As a
> > >>> cyclic transfer will be repeated forever until terminated, there's no
> > >>> use case for issuing a new transfer without terminating the one in
> > >>> progress. I thus don't think we need a new terminate operation: the only
> > >>> thing that makes sense to do when submitting a new cyclic transfer is to
> > >>> terminate the current one and switch to the new one, and we already have
> > >>> all the APIs we need to enable this behaviour.
> > >>
> > >> The issue_pending() is a NOP when engine is already running.
> > > 
> > > That's not totally right. issue_pending() still moves submitted but not
> > > issued transactions from the submitted queue to the issued queue. The
> > > DMA engine only considers the issued queue, so issue_pending()
> > > essentially tells the DMA engine to consider the submitted transaction
> > > for processing after the already issued transactions complete (in the
> > > non-cyclic case).
> > 
> > Vinod's point is for the cyclic case at the current state. It is NOP
> > essentially as we don't have way to not kill the whole channel.
> 
> Considering the current implementation of issue_pending(), for cyclic
> transfers, that's correct.
> 
> My point was that, semantically, and as is implemented today for
> non-cyclic transfers, issue_pending() is meant to tell the DMA engine to
> consider the submitted transactions for processing after the already
> issued transactions complete. For cyclic transactions, .issue_pending
> has no defined semantics, and is implemented as a NOP. My proposal is to
> extend the existing semantics of issue_pending() as defined for the
> non-cyclic transactions to also cover the cyclic transactions. This
> won't cause any breakage (the issue_pending() operation being unused for
> cyclic transactions, it won't cause any change to existing code), and
> will make the API more consistent as the same semantics (moving to the
> next submitted transaction when the current one completes) will be
> implemented using the same operation.

Only problem is cyclic by defination never completes, so we need to add
additional semantics for completion which is something me and Peter do
not seem to like :)

> 
> > Just a sidenote: it is not even that clean cut for slave transfers
> > either as the slave_config must _not_ change between the issued
> > transfers. Iow, you can not switch between 16bit and 32bit word lengths
> > with some DMA. EDMA, sDMA can do that, but UDMA can not for example...
> 
> I agree this can be an issue, but I'm not sure how it's related :-) I
> believe we need to consider this feature, and specify the API better,
> but that's fairly unrelated, isn't it ?
> 
> > >> The design of APIs is that we submit a txn to pending_list and then the
> > >> pending_list is started when issue_pending() is called.
> > >> Or if the engine is already running, it will take next txn from
> > >> pending_list() when current txn completes.
> > >>
> > >> The only consideration here in this case is that the cyclic txn never
> > >> completes. Do we really treat a new txn submission as an 'indication' of
> > >> completeness? That is indeed a point to ponder upon.
> > > 
> > > The reason why I think we should is two-fold:
> > > 
> > > 1. I believe it's semantically aligned with the existing behaviour of
> > > issue_pending(). As explained above, the operation tells the DMA engine
> > > to consider submitted transactions for processing when the current (and
> > > other issued) transactions complete. If we extend the definition of
> > > complete to cover cyclic transactions, I think it's a good match.
> > 
> > We will end up with different behavior between cyclic and non cyclic
> > transfers and the new behavior should be somehow supported by existing
> > drivers.
> > Yes, issue_pending is moving the submitted tx to the issued queue to be
> > executed on HW when the current transfer finished.
> > We only needed this for non cyclic uses so far. Some DMA hw can replace
> > the current transfer with a new one (re-trigger to fetch the new
> > configuration, like your's), but some can not (none of the system DMAs
> > on TI platforms can).
> > If we say that this is the behavior the DMA drivers must follow then we
> > will have non compliant DMA drivers. You can not move simply to other
> > DMA or can not create generic DMA code shared by drivers.
> 
> I think that's a matter of reporting the capabilities of the DMA engine,
> and I believe a flag is enough for this. My proposal really gives a
> purpose to an API that is unused today (.issue_pending() for cyclic
> transfers), and that purpose is semantically coherent with the purpose
> of the same function for non-cyclic transfers. I thus believe it brings
> the cyclic and non-cyclic cases closer, making their behaviour more
> similar, not different.
> 
> > > 2. There's really nothing else we could do with cyclic transactions.
> > > They never complete today and have to be terminated manually with
> > > terminate_all(). Using issue_pending() to move to a next cyclic
> > > transaction doesn't change the existing behaviour by replacing a useful
> > > (and used) feature, as issue_pending() is currently a no-op for cyclic
> > > transactions. The newly issued transaction is never considered, and
> > > calling terminate_all() will cancel the issued transactions. By
> > > extending the behaviour of issue_pending(), we're making a new use case
> > > possible, without restricting any other feature, and without "stealing"
> > > issue_pending() and preventing it from implementing another useful
> > > behaviour.
> > 
> > But at the same time we make existing drivers non compliant...
> 
> With a flag to report this new feature, that's not a problem.
> 
> > Imo a new callback to 'kill' / 'terminate' / 'replace' / 'abort' an
> > issued cookie would be cleaner.
> > 
> > cookie1 = dmaengine_issue_pending();
> > // will start the transfer
> > cookie2 = dmaengine_issue_pending();
> > // cookie1 still runs, cookie2 is waiting to be executed
> > dmaengine_abort_tx(chan);
> > // will kill cookie1 and executes cookie2
> > 
> > dmaengine_abort_tx() could take a cookie as parameter if we wish, so you
> > can say selectively which issued tx you want to remove, if it is the
> > running one, then stop it and move to the next one.
> > In place of the cookie parameter a 0 could imply that I don't know the
> > cookie, but kill the running one.
> > 
> > We would preserve what issue_pending does atm and would give us a
> > generic flow of how other drivers should handle such cases.
> > 
> > Note that this is not only useful for cyclic cases. Any driver which
> > currently uses brute-force termination can be upgraded.
> > Prime example is UART RX. We issue an RX buffer to receive data, but it
> > is not guarantied that the remote will send data which would fill the
> > buffer and we hit a timeout waiting. We could issue the next buffer and
> > kill the stale transfer to reclaim the received data.
> > 
> > I think this can be even implemented for DMAs which can not do the same
> > thing as your DMA can.
> 
> But that's a different use case. What I'm after is *not*
> killing/aborting a currently running transfer, it's moving to the next
> submitted transfer at the next available sync point. I don't want to
> abort the transfer in progress immediately, that would kill the display.
> 
> I understand that the above can be useful, but I really don't see why
> I'd need to implement support for a more complex use case that I have no
> need for, and could hardly even test properly, when what I'm after is
> fixing what I view as a bug in the existing implementation: we have an
> operation, issue_pending(), with a defined purpose, and it happens that
> for one of the transfer types that operation doesn't work. I really see
> no reason to implement a brand new API in this case.
> 
> Note that using issue_pending() as I propose doesn't preclude anyone
> (you, or someone else) to implement your above proposal, but please
> don't make me do your work :-) This is becoming a case of yak shaving
> where I'm asked to fix shortcomings of the DMA engine API when they're
> unrelated to my use case.
> 
> > > In a nutshell, an important reason why I like using issue_pending() for
> > > this purpose is because it makes cyclic and non-cyclic transactions
> > > behave more similarly, which I think is good from an API consistency
> > > point of view.
> > > 
> > >> Also, we need to keep in mind that the dmaengine wont stop a cyclic
> > >> txn. It would be running and start next transfer (in this case do
> > >> from start) while it also gives you an interrupt. Here we would be
> > >> required to stop it and then start a new one...
> > > 
> > > We wouldn't be required to stop it in the middle, the expected behaviour
> > > is for the DMA engine to complete the cyclic transaction until the end
> > > of the cycle and then replace it by the new one. That's exactly what
> > > happens for non-cyclic transactions when you call issue_pending(), which
> > > makes me like this solution.
> > 
> > Right, so we have two different use cases. Replace the current transfers
> > with the next issued one and abort the current transfer now and arm the
> > next issued one.
> > dmaengine_abort_tx(chan, cookie, forced) ?
> > forced == false: replace it at cyclic boundary
> > forced == true: right away (as HW allows), do not wait for cyclic round
> 
> See the above. You're making this more complicated than it should be,
> designing an API that contains a small part that could help solving my
> problem, and asking me to implement the 90% for free. Not fair :-)

I agree it may help in other cases, but my view here is that if we want
to terminate the cyclic, let us be explicit about it. I would rather
call an API to do so and explicitly convey that current cyclic txn is
ending rather than implictly submit a new one.

> > >> Or perhaps remove the cyclic setting from the txn when a new one
> > >> arrives and that behaviour IMO is controller dependent, not sure if
> > >> all controllers support it..
> > > 
> > > At the very least I would assume controllers to be able to stop a cyclic
> > > transaction forcefully, otherwise terminate_all() could never be
> > > implemented. This may not lead to a gracefully switch from one cyclic
> > > transaction to another one if the hardware doesn't allow doing so. In
> > > that case I think tx_submit() could return an error, or we could turn
> > > issue_pending() into an int operation to signal the error. Note that
> > > there's no need to mass-patch drivers here, if a DMA engine client
> > > issues a second cyclic transaction while one is in progress, the second
> > > transaction won't be considered today. Signalling an error is in my
> > > opinion a useful feature, but not doing so in DMA engine drivers can't
> > > be a regression. We could also add a flag to tell whether this mode of
> > > operation is supported.
> > 
> > My problems is that it is changing the behavior of issue_pending() for
> > cyclic. If we document this than all existing DMA drivers are broken
> > (not complaint with the API documentation) as they don't do this.
> 
> Again, see above. I argue that it's not a behavioural change as such, as
> the current behaviour is unused, because it's implemented as a NOP and
> useless. With a simple flag to report if a DMA engine supports replacing
> cyclic transfers, we would have a more consistent API as issue_pending()
> will operate the same way for *all* types of transfers.
> 
> > >>>> That would be a clean way to handle it. We were missing this API for a
> > >>>> long time to be able to cancel the ongoing transfer (whether it is
> > >>>> cyclic or slave_sg, or memcpy) and move to the next one if there is one
> > >>>> pending.
> > >>>
> > >>> Note that this new terminate API wouldn't terminate the ongoing transfer
> > >>> immediately, it would complete first, until the end of the cycle for
> > >>> cyclic transfers, and until the end of the whole transfer otherwise.
> > >>> This new operation would thus essentially be a no-op for non-cyclic
> > >>> transfers. I don't see how it would help :-) Do you have any particular
> > >>> use case in mind ?
> > >>
> > >> Yeah that is something more to think about. Do we really abort here or
> > >> wait for the txn to complete. I think Peter needs the former and your
> > >> falls in the latter category
> > > 
> > > I definitely need the latter, otherwise the display will flicker (or
> > > completely misoperate) every time a new frame is displayed, which isn't
> > > a good idea :-)
> > 
> > Sure, and it is a great feature.
> > 
> > > I'm not sure about Peter's use cases, but it seems to me
> > > that aborting a transaction immediately is racy in most cases, unless
> > > the DMA engine supports byte-level residue reporting.
> > 
> > Sort of yes. With EDMA, sDMA I can just kill the channel and set up a
> > new one right away.
> > UDMA on the other hand is not that forgiving... I would need to kill the
> > channel, wait for the termination to complete, reconfigure the channel
> > and execute the new transfer.
> > 
> > But with a separate callback API at least there will be an entry point
> > when this can be initiated and handled.
> > Fwiw, I think it should be simple to add this functionality to them, the
> > code is kind of handling it in other parts, but implementing it in the
> > issue_pending() is not really a clean solution.
> > 
> > In a channel you can run slave_sg transfers followed by cyclic if you
> > wish. A slave channel is what it is, slave channel which can be capable
> > to execute slave_sg and/or cyclic (and/or interleaved).
> > If issue_pending() is to take care then we need to check if the current
> > transfer is cyclic or not and decide based on that.
> > 
> > With a separate callback we in the DMA driver just need to do what the
> > client is asking for and no need to think.
> 
> Let's put it that way: are you volunteering to implement your proposal
> (with proper API documentation) in a reasonable time frame, so that I
> can try it for my use case ? Otherwise I see no reason to push against
> my proposal.
> 
> > > One non-intrusive
> > > option would be to add a flag to signal that a newly issued transaction
> > > should interrupt the current transaction immediately.
> 
> -- 
> Regards,
> 
> Laurent Pinchart

-- 
~Vinod

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-02-26 16:30                                   ` Laurent Pinchart
@ 2020-03-02  3:47                                     ` Vinod Koul
  2020-03-02  7:37                                       ` Laurent Pinchart
  0 siblings, 1 reply; 46+ messages in thread
From: Vinod Koul @ 2020-03-02  3:47 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Laurent,

On 26-02-20, 18:30, Laurent Pinchart wrote:
> On Wed, Feb 19, 2020 at 02:55:14PM +0530, Vinod Koul wrote:
> > On 17-02-20, 12:00, Peter Ujfalusi wrote:
> > > On 14/02/2020 18.22, Laurent Pinchart wrote:
> > >>>> It does, but I really wonder why we need a new terminate operation that
> > >>>> would terminate a single transfer. If we call issue_pending at step B.3,
> > >>>> when the new txn submitted, we can terminate the current transfer at the
> > >>>> point. It changes the semantics of issue_pending, but only for cyclic
> > >>>> transfers (this whole discussions it only about cyclic transfers). As a
> > >>>> cyclic transfer will be repeated forever until terminated, there's no
> > >>>> use case for issuing a new transfer without terminating the one in
> > >>>> progress. I thus don't think we need a new terminate operation: the only
> > >>>> thing that makes sense to do when submitting a new cyclic transfer is to
> > >>>> terminate the current one and switch to the new one, and we already have
> > >>>> all the APIs we need to enable this behaviour.
> > >>>
> > >>> The issue_pending() is a NOP when engine is already running.
> > >> 
> > >> That's not totally right. issue_pending() still moves submitted but not
> > >> issued transactions from the submitted queue to the issued queue. The
> > >> DMA engine only considers the issued queue, so issue_pending()
> > >> essentially tells the DMA engine to consider the submitted transaction
> > >> for processing after the already issued transactions complete (in the
> > >> non-cyclic case).
> > > 
> > > Vinod's point is for the cyclic case at the current state. It is NOP
> > > essentially as we don't have way to not kill the whole channel.
> > 
> > Or IOW there is no descriptor movement to hardware..
> > 
> > > Just a sidenote: it is not even that clean cut for slave transfers
> > > either as the slave_config must _not_ change between the issued
> > > transfers. Iow, you can not switch between 16bit and 32bit word lengths
> > > with some DMA. EDMA, sDMA can do that, but UDMA can not for example...
> > > 
> > >>> The design of APIs is that we submit a txn to pending_list and then the
> > >>> pending_list is started when issue_pending() is called.
> > >>> Or if the engine is already running, it will take next txn from
> > >>> pending_list() when current txn completes.
> > >>>
> > >>> The only consideration here in this case is that the cyclic txn never
> > >>> completes. Do we really treat a new txn submission as an 'indication' of
> > >>> completeness? That is indeed a point to ponder upon.
> > >> 
> > >> The reason why I think we should is two-fold:
> > >> 
> > >> 1. I believe it's semantically aligned with the existing behaviour of
> > >> issue_pending(). As explained above, the operation tells the DMA engine
> > >> to consider submitted transactions for processing when the current (and
> > >> other issued) transactions complete. If we extend the definition of
> > >> complete to cover cyclic transactions, I think it's a good match.
> > > 
> > > We will end up with different behavior between cyclic and non cyclic
> > > transfers and the new behavior should be somehow supported by existing
> > > drivers.
> > > Yes, issue_pending is moving the submitted tx to the issued queue to be
> > > executed on HW when the current transfer finished.
> > > We only needed this for non cyclic uses so far. Some DMA hw can replace
> > > the current transfer with a new one (re-trigger to fetch the new
> > > configuration, like your's), but some can not (none of the system DMAs
> > > on TI platforms can).
> > > If we say that this is the behavior the DMA drivers must follow then we
> > > will have non compliant DMA drivers. You can not move simply to other
> > > DMA or can not create generic DMA code shared by drivers.
> > 
> > That is very important point for API. We want no implicit behaviour, so
> > if we want an behaviour let us do that explicitly.
> 
> As I've just explained in my reply to Peter, there's nothing implicit in
> my proposal :-) It's however missing a flag to report if the DMA engine
> driver supports this feature, put apart from that, it makes the API
> *more* consistent by making issue_pending() cover *all* transfer types
> with the *same* semantics.

I would be more comfortable in calling an API to do so :)
The flow I am thinking is:

- prep cyclic1 txn
- submit cyclic1 txn
- call issue_pending() (cyclic one starts)

- prep cyclic2 txn
- submit cyclic2 txn
- signal_cyclic1_txn aka terminate_cookie()
- cyclic1 completes, switch to cyclic2 (dmaengine driver)
- get callback for cyclic1 (optional)

To check if hw supports terminate_cookie() or not we can check if the
callback support is implemented

> 
> > >> 2. There's really nothing else we could do with cyclic transactions.
> > >> They never complete today and have to be terminated manually with
> > >> terminate_all(). Using issue_pending() to move to a next cyclic
> > >> transaction doesn't change the existing behaviour by replacing a useful
> > >> (and used) feature, as issue_pending() is currently a no-op for cyclic
> > >> transactions. The newly issued transaction is never considered, and
> > >> calling terminate_all() will cancel the issued transactions. By
> > >> extending the behaviour of issue_pending(), we're making a new use case
> > >> possible, without restricting any other feature, and without "stealing"
> > >> issue_pending() and preventing it from implementing another useful
> > >> behaviour.
> > > 
> > > But at the same time we make existing drivers non compliant...
> > > 
> > > Imo a new callback to 'kill' / 'terminate' / 'replace' / 'abort' an
> > > issued cookie would be cleaner.
> > > 
> > > cookie1 = dmaengine_issue_pending();
> > > // will start the transfer
> > > cookie2 = dmaengine_issue_pending();
> > > // cookie1 still runs, cookie2 is waiting to be executed
> > > dmaengine_abort_tx(chan);
> > > // will kill cookie1 and executes cookie2
> > 
> > Right and we need a kill mode which kills the cookie1 at the end of
> > transfer (conditional to hw supporting that)
> > 
> > I think it should be generic API and usable in both the cyclic and
> > non-cyclic case
> 
> I have no issue with an API that can abort ongoing transfers without
> killing the whole queue of pending transfers, but that's not what I'm
> after, it's not my use case. Again, as explained in my reply to Peter,
> I'm not looking for a way to abort a transfer immediately, but to move
> to the next transfer at the end of the current one. It's very different,
> and the DMA engine API already supports this for all transfers but
> cyclic transfers. I'd go as far as saying that my proposal is fixing a
> bug in the current implementation :-)
> 
> > > dmaengine_abort_tx() could take a cookie as parameter if we wish, so you
> > > can say selectively which issued tx you want to remove, if it is the
> > > running one, then stop it and move to the next one.
> > > In place of the cookie parameter a 0 could imply that I don't know the
> > > cookie, but kill the running one.
> > > 
> > > We would preserve what issue_pending does atm and would give us a
> > > generic flow of how other drivers should handle such cases.
> > > 
> > > Note that this is not only useful for cyclic cases. Any driver which
> > > currently uses brute-force termination can be upgraded.
> > > Prime example is UART RX. We issue an RX buffer to receive data, but it
> > > is not guarantied that the remote will send data which would fill the
> > > buffer and we hit a timeout waiting. We could issue the next buffer and
> > > kill the stale transfer to reclaim the received data.
> > > 
> > > I think this can be even implemented for DMAs which can not do the same
> > > thing as your DMA can.
> > > 
> > >> In a nutshell, an important reason why I like using issue_pending() for
> > >> this purpose is because it makes cyclic and non-cyclic transactions
> > >> behave more similarly, which I think is good from an API consistency
> > >> point of view.
> > >> 
> > >>> Also, we need to keep in mind that the dmaengine wont stop a cyclic
> > >>> txn. It would be running and start next transfer (in this case do
> > >>> from start) while it also gives you an interrupt. Here we would be
> > >>> required to stop it and then start a new one...
> > >> 
> > >> We wouldn't be required to stop it in the middle, the expected behaviour
> > >> is for the DMA engine to complete the cyclic transaction until the end
> > >> of the cycle and then replace it by the new one. That's exactly what
> > >> happens for non-cyclic transactions when you call issue_pending(), which
> > >> makes me like this solution.
> > > 
> > > Right, so we have two different use cases. Replace the current transfers
> > > with the next issued one and abort the current transfer now and arm the
> > > next issued one.
> > > dmaengine_abort_tx(chan, cookie, forced) ?
> > > forced == false: replace it at cyclic boundary
> > > forced == true: right away (as HW allows), do not wait for cyclic round
> > > 
> > >>> Or perhaps remove the cyclic setting from the txn when a new one
> > >>> arrives and that behaviour IMO is controller dependent, not sure if
> > >>> all controllers support it..
> > >> 
> > >> At the very least I would assume controllers to be able to stop a cyclic
> > >> transaction forcefully, otherwise terminate_all() could never be
> > >> implemented. This may not lead to a gracefully switch from one cyclic
> > >> transaction to another one if the hardware doesn't allow doing so. In
> > >> that case I think tx_submit() could return an error, or we could turn
> > >> issue_pending() into an int operation to signal the error. Note that
> > >> there's no need to mass-patch drivers here, if a DMA engine client
> > >> issues a second cyclic transaction while one is in progress, the second
> > >> transaction won't be considered today. Signalling an error is in my
> > >> opinion a useful feature, but not doing so in DMA engine drivers can't
> > >> be a regression. We could also add a flag to tell whether this mode of
> > >> operation is supported.
> > > 
> > > My problems is that it is changing the behavior of issue_pending() for
> > > cyclic. If we document this than all existing DMA drivers are broken
> > > (not complaint with the API documentation) as they don't do this.
> > > 
> > > 
> > >>>>> That would be a clean way to handle it. We were missing this API for a
> > >>>>> long time to be able to cancel the ongoing transfer (whether it is
> > >>>>> cyclic or slave_sg, or memcpy) and move to the next one if there is one
> > >>>>> pending.
> > >>>>
> > >>>> Note that this new terminate API wouldn't terminate the ongoing transfer
> > >>>> immediately, it would complete first, until the end of the cycle for
> > >>>> cyclic transfers, and until the end of the whole transfer otherwise.
> > >>>> This new operation would thus essentially be a no-op for non-cyclic
> > >>>> transfers. I don't see how it would help :-) Do you have any particular
> > >>>> use case in mind ?
> > >>>
> > >>> Yeah that is something more to think about. Do we really abort here or
> > >>> wait for the txn to complete. I think Peter needs the former and your
> > >>> falls in the latter category
> > >> 
> > >> I definitely need the latter, otherwise the display will flicker (or
> > >> completely misoperate) every time a new frame is displayed, which isn't
> > >> a good idea :-)
> > > 
> > > Sure, and it is a great feature.
> > > 
> > >> I'm not sure about Peter's use cases, but it seems to me
> > >> that aborting a transaction immediately is racy in most cases, unless
> > >> the DMA engine supports byte-level residue reporting.
> > > 
> > > Sort of yes. With EDMA, sDMA I can just kill the channel and set up a
> > > new one right away.
> > > UDMA on the other hand is not that forgiving... I would need to kill the
> > > channel, wait for the termination to complete, reconfigure the channel
> > > and execute the new transfer.
> > > 
> > > But with a separate callback API at least there will be an entry point
> > > when this can be initiated and handled.
> > > Fwiw, I think it should be simple to add this functionality to them, the
> > > code is kind of handling it in other parts, but implementing it in the
> > > issue_pending() is not really a clean solution.
> > > 
> > > In a channel you can run slave_sg transfers followed by cyclic if you
> > > wish. A slave channel is what it is, slave channel which can be capable
> > > to execute slave_sg and/or cyclic (and/or interleaved).
> > > If issue_pending() is to take care then we need to check if the current
> > > transfer is cyclic or not and decide based on that.
> > > 
> > > With a separate callback we in the DMA driver just need to do what the
> > > client is asking for and no need to think.
> > > 
> > >> One non-intrusive
> > >> option would be to add a flag to signal that a newly issued transaction
> > >> should interrupt the current transaction immediately.
> 
> -- 
> Regards,
> 
> Laurent Pinchart

-- 
~Vinod

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-03-02  3:47                                     ` Vinod Koul
@ 2020-03-02  7:37                                       ` Laurent Pinchart
  2020-03-03  4:32                                         ` Vinod Koul
  0 siblings, 1 reply; 46+ messages in thread
From: Laurent Pinchart @ 2020-03-02  7:37 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Vinod,

On Mon, Mar 02, 2020 at 09:17:35AM +0530, Vinod Koul wrote:
> On 26-02-20, 18:30, Laurent Pinchart wrote:
> > On Wed, Feb 19, 2020 at 02:55:14PM +0530, Vinod Koul wrote:
> >> On 17-02-20, 12:00, Peter Ujfalusi wrote:
> >>> On 14/02/2020 18.22, Laurent Pinchart wrote:
> >>>>>> It does, but I really wonder why we need a new terminate operation that
> >>>>>> would terminate a single transfer. If we call issue_pending at step B.3,
> >>>>>> when the new txn submitted, we can terminate the current transfer at the
> >>>>>> point. It changes the semantics of issue_pending, but only for cyclic
> >>>>>> transfers (this whole discussions it only about cyclic transfers). As a
> >>>>>> cyclic transfer will be repeated forever until terminated, there's no
> >>>>>> use case for issuing a new transfer without terminating the one in
> >>>>>> progress. I thus don't think we need a new terminate operation: the only
> >>>>>> thing that makes sense to do when submitting a new cyclic transfer is to
> >>>>>> terminate the current one and switch to the new one, and we already have
> >>>>>> all the APIs we need to enable this behaviour.
> >>>>>
> >>>>> The issue_pending() is a NOP when engine is already running.
> >>>> 
> >>>> That's not totally right. issue_pending() still moves submitted but not
> >>>> issued transactions from the submitted queue to the issued queue. The
> >>>> DMA engine only considers the issued queue, so issue_pending()
> >>>> essentially tells the DMA engine to consider the submitted transaction
> >>>> for processing after the already issued transactions complete (in the
> >>>> non-cyclic case).
> >>> 
> >>> Vinod's point is for the cyclic case at the current state. It is NOP
> >>> essentially as we don't have way to not kill the whole channel.
> >> 
> >> Or IOW there is no descriptor movement to hardware..
> >> 
> >>> Just a sidenote: it is not even that clean cut for slave transfers
> >>> either as the slave_config must _not_ change between the issued
> >>> transfers. Iow, you can not switch between 16bit and 32bit word lengths
> >>> with some DMA. EDMA, sDMA can do that, but UDMA can not for example...
> >>> 
> >>>>> The design of APIs is that we submit a txn to pending_list and then the
> >>>>> pending_list is started when issue_pending() is called.
> >>>>> Or if the engine is already running, it will take next txn from
> >>>>> pending_list() when current txn completes.
> >>>>>
> >>>>> The only consideration here in this case is that the cyclic txn never
> >>>>> completes. Do we really treat a new txn submission as an 'indication' of
> >>>>> completeness? That is indeed a point to ponder upon.
> >>>> 
> >>>> The reason why I think we should is two-fold:
> >>>> 
> >>>> 1. I believe it's semantically aligned with the existing behaviour of
> >>>> issue_pending(). As explained above, the operation tells the DMA engine
> >>>> to consider submitted transactions for processing when the current (and
> >>>> other issued) transactions complete. If we extend the definition of
> >>>> complete to cover cyclic transactions, I think it's a good match.
> >>> 
> >>> We will end up with different behavior between cyclic and non cyclic
> >>> transfers and the new behavior should be somehow supported by existing
> >>> drivers.
> >>> Yes, issue_pending is moving the submitted tx to the issued queue to be
> >>> executed on HW when the current transfer finished.
> >>> We only needed this for non cyclic uses so far. Some DMA hw can replace
> >>> the current transfer with a new one (re-trigger to fetch the new
> >>> configuration, like your's), but some can not (none of the system DMAs
> >>> on TI platforms can).
> >>> If we say that this is the behavior the DMA drivers must follow then we
> >>> will have non compliant DMA drivers. You can not move simply to other
> >>> DMA or can not create generic DMA code shared by drivers.
> >> 
> >> That is very important point for API. We want no implicit behaviour, so
> >> if we want an behaviour let us do that explicitly.
> > 
> > As I've just explained in my reply to Peter, there's nothing implicit in
> > my proposal :-) It's however missing a flag to report if the DMA engine
> > driver supports this feature, put apart from that, it makes the API
> > *more* consistent by making issue_pending() cover *all* transfer types
> > with the *same* semantics.
> 
> I would be more comfortable in calling an API to do so :)
> The flow I am thinking is:
> 
> - prep cyclic1 txn
> - submit cyclic1 txn
> - call issue_pending() (cyclic one starts)
> 
> - prep cyclic2 txn
> - submit cyclic2 txn
> - signal_cyclic1_txn aka terminate_cookie()
> - cyclic1 completes, switch to cyclic2 (dmaengine driver)
> - get callback for cyclic1 (optional)
> 
> To check if hw supports terminate_cookie() or not we can check if the
> callback support is implemented

Two questions though:

- Where is .issue_pending() called for cyclic2 in your above sequence ?
  Surely it should be called somewhere, as the DMA engine API requires
  .issue_pending() to be called for a transfer to be executed, otherwise
  it stays in the submitted but not pending queue.

- With the introduction of a new .terminate_cookie() operation, we need
  to specify that operation for all transfer types. What's its
  envisioned semantics for non-cyclic transfers ? And how do DMA engine
  drivers report that they support .terminate_cookie() for cyclic
  transfers but not for other transfer types (the counterpart of
  reporting, in my proposition, that .issue_pending() isn't supported
  replace the current cyclic transfer) ?

> >>>> 2. There's really nothing else we could do with cyclic transactions.
> >>>> They never complete today and have to be terminated manually with
> >>>> terminate_all(). Using issue_pending() to move to a next cyclic
> >>>> transaction doesn't change the existing behaviour by replacing a useful
> >>>> (and used) feature, as issue_pending() is currently a no-op for cyclic
> >>>> transactions. The newly issued transaction is never considered, and
> >>>> calling terminate_all() will cancel the issued transactions. By
> >>>> extending the behaviour of issue_pending(), we're making a new use case
> >>>> possible, without restricting any other feature, and without "stealing"
> >>>> issue_pending() and preventing it from implementing another useful
> >>>> behaviour.
> >>> 
> >>> But at the same time we make existing drivers non compliant...
> >>> 
> >>> Imo a new callback to 'kill' / 'terminate' / 'replace' / 'abort' an
> >>> issued cookie would be cleaner.
> >>> 
> >>> cookie1 = dmaengine_issue_pending();
> >>> // will start the transfer
> >>> cookie2 = dmaengine_issue_pending();
> >>> // cookie1 still runs, cookie2 is waiting to be executed
> >>> dmaengine_abort_tx(chan);
> >>> // will kill cookie1 and executes cookie2
> >> 
> >> Right and we need a kill mode which kills the cookie1 at the end of
> >> transfer (conditional to hw supporting that)
> >> 
> >> I think it should be generic API and usable in both the cyclic and
> >> non-cyclic case
> > 
> > I have no issue with an API that can abort ongoing transfers without
> > killing the whole queue of pending transfers, but that's not what I'm
> > after, it's not my use case. Again, as explained in my reply to Peter,
> > I'm not looking for a way to abort a transfer immediately, but to move
> > to the next transfer at the end of the current one. It's very different,
> > and the DMA engine API already supports this for all transfers but
> > cyclic transfers. I'd go as far as saying that my proposal is fixing a
> > bug in the current implementation :-)
> > 
> >>> dmaengine_abort_tx() could take a cookie as parameter if we wish, so you
> >>> can say selectively which issued tx you want to remove, if it is the
> >>> running one, then stop it and move to the next one.
> >>> In place of the cookie parameter a 0 could imply that I don't know the
> >>> cookie, but kill the running one.
> >>> 
> >>> We would preserve what issue_pending does atm and would give us a
> >>> generic flow of how other drivers should handle such cases.
> >>> 
> >>> Note that this is not only useful for cyclic cases. Any driver which
> >>> currently uses brute-force termination can be upgraded.
> >>> Prime example is UART RX. We issue an RX buffer to receive data, but it
> >>> is not guarantied that the remote will send data which would fill the
> >>> buffer and we hit a timeout waiting. We could issue the next buffer and
> >>> kill the stale transfer to reclaim the received data.
> >>> 
> >>> I think this can be even implemented for DMAs which can not do the same
> >>> thing as your DMA can.
> >>> 
> >>>> In a nutshell, an important reason why I like using issue_pending() for
> >>>> this purpose is because it makes cyclic and non-cyclic transactions
> >>>> behave more similarly, which I think is good from an API consistency
> >>>> point of view.
> >>>> 
> >>>>> Also, we need to keep in mind that the dmaengine wont stop a cyclic
> >>>>> txn. It would be running and start next transfer (in this case do
> >>>>> from start) while it also gives you an interrupt. Here we would be
> >>>>> required to stop it and then start a new one...
> >>>> 
> >>>> We wouldn't be required to stop it in the middle, the expected behaviour
> >>>> is for the DMA engine to complete the cyclic transaction until the end
> >>>> of the cycle and then replace it by the new one. That's exactly what
> >>>> happens for non-cyclic transactions when you call issue_pending(), which
> >>>> makes me like this solution.
> >>> 
> >>> Right, so we have two different use cases. Replace the current transfers
> >>> with the next issued one and abort the current transfer now and arm the
> >>> next issued one.
> >>> dmaengine_abort_tx(chan, cookie, forced) ?
> >>> forced == false: replace it at cyclic boundary
> >>> forced == true: right away (as HW allows), do not wait for cyclic round
> >>> 
> >>>>> Or perhaps remove the cyclic setting from the txn when a new one
> >>>>> arrives and that behaviour IMO is controller dependent, not sure if
> >>>>> all controllers support it..
> >>>> 
> >>>> At the very least I would assume controllers to be able to stop a cyclic
> >>>> transaction forcefully, otherwise terminate_all() could never be
> >>>> implemented. This may not lead to a gracefully switch from one cyclic
> >>>> transaction to another one if the hardware doesn't allow doing so. In
> >>>> that case I think tx_submit() could return an error, or we could turn
> >>>> issue_pending() into an int operation to signal the error. Note that
> >>>> there's no need to mass-patch drivers here, if a DMA engine client
> >>>> issues a second cyclic transaction while one is in progress, the second
> >>>> transaction won't be considered today. Signalling an error is in my
> >>>> opinion a useful feature, but not doing so in DMA engine drivers can't
> >>>> be a regression. We could also add a flag to tell whether this mode of
> >>>> operation is supported.
> >>> 
> >>> My problems is that it is changing the behavior of issue_pending() for
> >>> cyclic. If we document this than all existing DMA drivers are broken
> >>> (not complaint with the API documentation) as they don't do this.
> >>> 
> >>> 
> >>>>>>> That would be a clean way to handle it. We were missing this API for a
> >>>>>>> long time to be able to cancel the ongoing transfer (whether it is
> >>>>>>> cyclic or slave_sg, or memcpy) and move to the next one if there is one
> >>>>>>> pending.
> >>>>>>
> >>>>>> Note that this new terminate API wouldn't terminate the ongoing transfer
> >>>>>> immediately, it would complete first, until the end of the cycle for
> >>>>>> cyclic transfers, and until the end of the whole transfer otherwise.
> >>>>>> This new operation would thus essentially be a no-op for non-cyclic
> >>>>>> transfers. I don't see how it would help :-) Do you have any particular
> >>>>>> use case in mind ?
> >>>>>
> >>>>> Yeah that is something more to think about. Do we really abort here or
> >>>>> wait for the txn to complete. I think Peter needs the former and your
> >>>>> falls in the latter category
> >>>> 
> >>>> I definitely need the latter, otherwise the display will flicker (or
> >>>> completely misoperate) every time a new frame is displayed, which isn't
> >>>> a good idea :-)
> >>> 
> >>> Sure, and it is a great feature.
> >>> 
> >>>> I'm not sure about Peter's use cases, but it seems to me
> >>>> that aborting a transaction immediately is racy in most cases, unless
> >>>> the DMA engine supports byte-level residue reporting.
> >>> 
> >>> Sort of yes. With EDMA, sDMA I can just kill the channel and set up a
> >>> new one right away.
> >>> UDMA on the other hand is not that forgiving... I would need to kill the
> >>> channel, wait for the termination to complete, reconfigure the channel
> >>> and execute the new transfer.
> >>> 
> >>> But with a separate callback API at least there will be an entry point
> >>> when this can be initiated and handled.
> >>> Fwiw, I think it should be simple to add this functionality to them, the
> >>> code is kind of handling it in other parts, but implementing it in the
> >>> issue_pending() is not really a clean solution.
> >>> 
> >>> In a channel you can run slave_sg transfers followed by cyclic if you
> >>> wish. A slave channel is what it is, slave channel which can be capable
> >>> to execute slave_sg and/or cyclic (and/or interleaved).
> >>> If issue_pending() is to take care then we need to check if the current
> >>> transfer is cyclic or not and decide based on that.
> >>> 
> >>> With a separate callback we in the DMA driver just need to do what the
> >>> client is asking for and no need to think.
> >>> 
> >>>> One non-intrusive
> >>>> option would be to add a flag to signal that a newly issued transaction
> >>>> should interrupt the current transaction immediately.

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-03-02  7:37                                       ` Laurent Pinchart
@ 2020-03-03  4:32                                         ` Vinod Koul
  2020-03-03 19:22                                           ` Laurent Pinchart
  0 siblings, 1 reply; 46+ messages in thread
From: Vinod Koul @ 2020-03-03  4:32 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Laurent,

On 02-03-20, 09:37, Laurent Pinchart wrote:

> > I would be more comfortable in calling an API to do so :)
> > The flow I am thinking is:
> > 
> > - prep cyclic1 txn
> > - submit cyclic1 txn
> > - call issue_pending() (cyclic one starts)
> > 
> > - prep cyclic2 txn
> > - submit cyclic2 txn
> > - signal_cyclic1_txn aka terminate_cookie()
> > - cyclic1 completes, switch to cyclic2 (dmaengine driver)
> > - get callback for cyclic1 (optional)
> > 
> > To check if hw supports terminate_cookie() or not we can check if the
> > callback support is implemented
> 
> Two questions though:
> 
> - Where is .issue_pending() called for cyclic2 in your above sequence ?
>   Surely it should be called somewhere, as the DMA engine API requires
>   .issue_pending() to be called for a transfer to be executed, otherwise
>   it stays in the submitted but not pending queue.

Sorry missed that one, I would do that after submit cyclic2 txn step and
then signal signal_cyclic1_txn termination

> - With the introduction of a new .terminate_cookie() operation, we need
>   to specify that operation for all transfer types. What's its

Correct

>   envisioned semantics for non-cyclic transfers ? And how do DMA engine
>   drivers report that they support .terminate_cookie() for cyclic
>   transfers but not for other transfer types (the counterpart of
>   reporting, in my proposition, that .issue_pending() isn't supported
>   replace the current cyclic transfer) ?

Typically for dmaengine controller cyclic is *not* a special mode, only
change is that a list provided to controller is circular.

So, the .terminate_cookie() should be a feature for all type of txn's.
If for some reason (dont discount what hw designers can do) a controller
supports this for some specific type(s), then they should return
-ENOTSUPP for cookies that do not support and let the caller know.

-- 
~Vinod

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-03-03  4:32                                         ` Vinod Koul
@ 2020-03-03 19:22                                           ` Laurent Pinchart
  2020-03-04  5:13                                             ` Vinod Koul
  0 siblings, 1 reply; 46+ messages in thread
From: Laurent Pinchart @ 2020-03-03 19:22 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Vinod,

On Tue, Mar 03, 2020 at 10:02:54AM +0530, Vinod Koul wrote:
> On 02-03-20, 09:37, Laurent Pinchart wrote:
> 
> > > I would be more comfortable in calling an API to do so :)
> > > The flow I am thinking is:
> > > 
> > > - prep cyclic1 txn
> > > - submit cyclic1 txn
> > > - call issue_pending() (cyclic one starts)
> > > 
> > > - prep cyclic2 txn
> > > - submit cyclic2 txn
> > > - signal_cyclic1_txn aka terminate_cookie()
> > > - cyclic1 completes, switch to cyclic2 (dmaengine driver)
> > > - get callback for cyclic1 (optional)
> > > 
> > > To check if hw supports terminate_cookie() or not we can check if the
> > > callback support is implemented
> > 
> > Two questions though:
> > 
> > - Where is .issue_pending() called for cyclic2 in your above sequence ?
> >   Surely it should be called somewhere, as the DMA engine API requires
> >   .issue_pending() to be called for a transfer to be executed, otherwise
> >   it stays in the submitted but not pending queue.
> 
> Sorry missed that one, I would do that after submit cyclic2 txn step and
> then signal signal_cyclic1_txn termination

OK, that matches my understanding, good :-)

> > - With the introduction of a new .terminate_cookie() operation, we need
> >   to specify that operation for all transfer types. What's its
> 
> Correct
> 
> >   envisioned semantics for non-cyclic transfers ? And how do DMA engine
> >   drivers report that they support .terminate_cookie() for cyclic
> >   transfers but not for other transfer types (the counterpart of
> >   reporting, in my proposition, that .issue_pending() isn't supported
> >   replace the current cyclic transfer) ?
> 
> Typically for dmaengine controller cyclic is *not* a special mode, only
> change is that a list provided to controller is circular.

I don't agree with this. For cyclic transfers to be replaceable in a
clean way, the feature must be specifically implemented at the hardware
level. A DMA engine that supports chaining transfers with an explicit
way to override that chaining, and without the logic to report if the
inherent race was lost or not, really can't support this API.

Furthemore, for non-cyclic transfers, what would .terminate_cookie() do
? I need it to be defined as terminating the current transfer when it
ends for the cyclic case, not terminating it immediately. All non-cyclic
transfers terminate by themselves when they end, so what would this new
operation do ?

> So, the .terminate_cookie() should be a feature for all type of txn's.
> If for some reason (dont discount what hw designers can do) a controller
> supports this for some specific type(s), then they should return
> -ENOTSUPP for cookies that do not support and let the caller know.

But then the caller can't know ahead of time, it will only find out when
it's too late, and can't decide not to use the DMA engine if it doesn't
support the feature. I don't think that's a very good option.

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-03-03 19:22                                           ` Laurent Pinchart
@ 2020-03-04  5:13                                             ` Vinod Koul
  2020-03-04  8:01                                               ` Laurent Pinchart
  0 siblings, 1 reply; 46+ messages in thread
From: Vinod Koul @ 2020-03-04  5:13 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

On 03-03-20, 21:22, Laurent Pinchart wrote:
> Hi Vinod,
> 
> On Tue, Mar 03, 2020 at 10:02:54AM +0530, Vinod Koul wrote:
> > On 02-03-20, 09:37, Laurent Pinchart wrote:
> > 
> > > > I would be more comfortable in calling an API to do so :)
> > > > The flow I am thinking is:
> > > > 
> > > > - prep cyclic1 txn
> > > > - submit cyclic1 txn
> > > > - call issue_pending() (cyclic one starts)
> > > > 
> > > > - prep cyclic2 txn
> > > > - submit cyclic2 txn
> > > > - signal_cyclic1_txn aka terminate_cookie()
> > > > - cyclic1 completes, switch to cyclic2 (dmaengine driver)
> > > > - get callback for cyclic1 (optional)
> > > > 
> > > > To check if hw supports terminate_cookie() or not we can check if the
> > > > callback support is implemented
> > > 
> > > Two questions though:
> > > 
> > > - Where is .issue_pending() called for cyclic2 in your above sequence ?
> > >   Surely it should be called somewhere, as the DMA engine API requires
> > >   .issue_pending() to be called for a transfer to be executed, otherwise
> > >   it stays in the submitted but not pending queue.
> > 
> > Sorry missed that one, I would do that after submit cyclic2 txn step and
> > then signal signal_cyclic1_txn termination
> 
> OK, that matches my understanding, good :-)
> 
> > > - With the introduction of a new .terminate_cookie() operation, we need
> > >   to specify that operation for all transfer types. What's its
> > 
> > Correct
> > 
> > >   envisioned semantics for non-cyclic transfers ? And how do DMA engine
> > >   drivers report that they support .terminate_cookie() for cyclic
> > >   transfers but not for other transfer types (the counterpart of
> > >   reporting, in my proposition, that .issue_pending() isn't supported
> > >   replace the current cyclic transfer) ?
> > 
> > Typically for dmaengine controller cyclic is *not* a special mode, only
> > change is that a list provided to controller is circular.
> 
> I don't agree with this. For cyclic transfers to be replaceable in a
> clean way, the feature must be specifically implemented at the hardware
> level. A DMA engine that supports chaining transfers with an explicit
> way to override that chaining, and without the logic to report if the
> inherent race was lost or not, really can't support this API.

Well chaining is a typical feature in dmaengine and making last chain
point to first makes it circular. I have seen couple of engines and this
was the implementation in the hardware.

There can exist special hardware for this purposes as well, but the
point is that the cyclic can be treated as circular list.

> Furthemore, for non-cyclic transfers, what would .terminate_cookie() do
> ? I need it to be defined as terminating the current transfer when it
> ends for the cyclic case, not terminating it immediately. All non-cyclic
> transfers terminate by themselves when they end, so what would this new
> operation do ?

I would use it for two purposes, cancelling txn but at the end of
current txn. I have couple of usages where this would
be helpful. Second in error handling where some engines do not support
aborting (unless we reset the whole controller)

But yes the .terminate_cookie() semantics should indicate if the
termination should be immediate or end of current txn. I see people
using it for both.

And with this I think it would make sense to also add this to
capabilities :)

> > So, the .terminate_cookie() should be a feature for all type of txn's.
> > If for some reason (dont discount what hw designers can do) a controller
> > supports this for some specific type(s), then they should return
> > -ENOTSUPP for cookies that do not support and let the caller know.
> 
> But then the caller can't know ahead of time, it will only find out when
> it's too late, and can't decide not to use the DMA engine if it doesn't
> support the feature. I don't think that's a very good option.

Agreed so lets go with adding these in caps.

-- 
~Vinod

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-03-04  5:13                                             ` Vinod Koul
@ 2020-03-04  8:01                                               ` Laurent Pinchart
  2020-03-04 15:37                                                 ` Vinod Koul
  0 siblings, 1 reply; 46+ messages in thread
From: Laurent Pinchart @ 2020-03-04  8:01 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Vinod,

On Wed, Mar 04, 2020 at 10:43:01AM +0530, Vinod Koul wrote:
> On 03-03-20, 21:22, Laurent Pinchart wrote:
> > On Tue, Mar 03, 2020 at 10:02:54AM +0530, Vinod Koul wrote:
> > > On 02-03-20, 09:37, Laurent Pinchart wrote:
> > > > > I would be more comfortable in calling an API to do so :)
> > > > > The flow I am thinking is:
> > > > > 
> > > > > - prep cyclic1 txn
> > > > > - submit cyclic1 txn
> > > > > - call issue_pending() (cyclic one starts)
> > > > > 
> > > > > - prep cyclic2 txn
> > > > > - submit cyclic2 txn
> > > > > - signal_cyclic1_txn aka terminate_cookie()
> > > > > - cyclic1 completes, switch to cyclic2 (dmaengine driver)
> > > > > - get callback for cyclic1 (optional)
> > > > > 
> > > > > To check if hw supports terminate_cookie() or not we can check if the
> > > > > callback support is implemented
> > > > 
> > > > Two questions though:
> > > > 
> > > > - Where is .issue_pending() called for cyclic2 in your above sequence ?
> > > >   Surely it should be called somewhere, as the DMA engine API requires
> > > >   .issue_pending() to be called for a transfer to be executed, otherwise
> > > >   it stays in the submitted but not pending queue.
> > > 
> > > Sorry missed that one, I would do that after submit cyclic2 txn step and
> > > then signal signal_cyclic1_txn termination
> > 
> > OK, that matches my understanding, good :-)
> > 
> > > > - With the introduction of a new .terminate_cookie() operation, we need
> > > >   to specify that operation for all transfer types. What's its
> > > 
> > > Correct
> > > 
> > > >   envisioned semantics for non-cyclic transfers ? And how do DMA engine
> > > >   drivers report that they support .terminate_cookie() for cyclic
> > > >   transfers but not for other transfer types (the counterpart of
> > > >   reporting, in my proposition, that .issue_pending() isn't supported
> > > >   replace the current cyclic transfer) ?
> > > 
> > > Typically for dmaengine controller cyclic is *not* a special mode, only
> > > change is that a list provided to controller is circular.
> > 
> > I don't agree with this. For cyclic transfers to be replaceable in a
> > clean way, the feature must be specifically implemented at the hardware
> > level. A DMA engine that supports chaining transfers with an explicit
> > way to override that chaining, and without the logic to report if the
> > inherent race was lost or not, really can't support this API.
> 
> Well chaining is a typical feature in dmaengine and making last chain
> point to first makes it circular. I have seen couple of engines and this
> was the implementation in the hardware.
> 
> There can exist special hardware for this purposes as well, but the
> point is that the cyclic can be treated as circular list.
> 
> > Furthemore, for non-cyclic transfers, what would .terminate_cookie() do
> > ? I need it to be defined as terminating the current transfer when it
> > ends for the cyclic case, not terminating it immediately. All non-cyclic
> > transfers terminate by themselves when they end, so what would this new
> > operation do ?
> 
> I would use it for two purposes, cancelling txn but at the end of
> current txn. I have couple of usages where this would be helpful.

I fail to see how that would help. Non-cyclic transfers always stop at
the end of the transfer. "Cancelling txn but at the end of current txn"
is what DMA engine drivers already do if you call .terminate_cookie() on
the ongoing transfer. It would thus be a no-op.

> Second in error handling where some engines do not support
> aborting (unless we reset the whole controller)

Could you explain that one ? I'm not sure to understand it.

> But yes the .terminate_cookie() semantics should indicate if the
> termination should be immediate or end of current txn. I see people
> using it for both.

Immediate termination is *not* something I'll implement as I have no
good way to test that semantics. I assume you would be fine with leaving
that for later, when someone will need it ?

> And with this I think it would make sense to also add this to
> capabilities :)

I'll repeat the comment I made to Peter: you want me to implement a
feature that you think would be useful, but is completely unrelated to
my use case, while there's a more natural way to handle my issue with
the current API, without precluding in any way the addition of your new
feature in the future. Not fair.

> > > So, the .terminate_cookie() should be a feature for all type of txn's.
> > > If for some reason (dont discount what hw designers can do) a controller
> > > supports this for some specific type(s), then they should return
> > > -ENOTSUPP for cookies that do not support and let the caller know.
> > 
> > But then the caller can't know ahead of time, it will only find out when
> > it's too late, and can't decide not to use the DMA engine if it doesn't
> > support the feature. I don't think that's a very good option.
> 
> Agreed so lets go with adding these in caps.

So if there's a need for caps anyway, why not a cap that marks
.issue_pending() as moving from the current cyclic transfer to the next
one ? 

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-03-04  8:01                                               ` Laurent Pinchart
@ 2020-03-04 15:37                                                 ` Vinod Koul
  2020-03-04 16:00                                                   ` Laurent Pinchart
  0 siblings, 1 reply; 46+ messages in thread
From: Vinod Koul @ 2020-03-04 15:37 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

On 04-03-20, 10:01, Laurent Pinchart wrote:
> Hi Vinod,
> 
> On Wed, Mar 04, 2020 at 10:43:01AM +0530, Vinod Koul wrote:
> > On 03-03-20, 21:22, Laurent Pinchart wrote:
> > > On Tue, Mar 03, 2020 at 10:02:54AM +0530, Vinod Koul wrote:
> > > > On 02-03-20, 09:37, Laurent Pinchart wrote:
> > > > > > I would be more comfortable in calling an API to do so :)
> > > > > > The flow I am thinking is:
> > > > > > 
> > > > > > - prep cyclic1 txn
> > > > > > - submit cyclic1 txn
> > > > > > - call issue_pending() (cyclic one starts)
> > > > > > 
> > > > > > - prep cyclic2 txn
> > > > > > - submit cyclic2 txn
> > > > > > - signal_cyclic1_txn aka terminate_cookie()
> > > > > > - cyclic1 completes, switch to cyclic2 (dmaengine driver)
> > > > > > - get callback for cyclic1 (optional)
> > > > > > 
> > > > > > To check if hw supports terminate_cookie() or not we can check if the
> > > > > > callback support is implemented
> > > > > 
> > > > > Two questions though:
> > > > > 
> > > > > - Where is .issue_pending() called for cyclic2 in your above sequence ?
> > > > >   Surely it should be called somewhere, as the DMA engine API requires
> > > > >   .issue_pending() to be called for a transfer to be executed, otherwise
> > > > >   it stays in the submitted but not pending queue.
> > > > 
> > > > Sorry missed that one, I would do that after submit cyclic2 txn step and
> > > > then signal signal_cyclic1_txn termination
> > > 
> > > OK, that matches my understanding, good :-)
> > > 
> > > > > - With the introduction of a new .terminate_cookie() operation, we need
> > > > >   to specify that operation for all transfer types. What's its
> > > > 
> > > > Correct
> > > > 
> > > > >   envisioned semantics for non-cyclic transfers ? And how do DMA engine
> > > > >   drivers report that they support .terminate_cookie() for cyclic
> > > > >   transfers but not for other transfer types (the counterpart of
> > > > >   reporting, in my proposition, that .issue_pending() isn't supported
> > > > >   replace the current cyclic transfer) ?
> > > > 
> > > > Typically for dmaengine controller cyclic is *not* a special mode, only
> > > > change is that a list provided to controller is circular.
> > > 
> > > I don't agree with this. For cyclic transfers to be replaceable in a
> > > clean way, the feature must be specifically implemented at the hardware
> > > level. A DMA engine that supports chaining transfers with an explicit
> > > way to override that chaining, and without the logic to report if the
> > > inherent race was lost or not, really can't support this API.
> > 
> > Well chaining is a typical feature in dmaengine and making last chain
> > point to first makes it circular. I have seen couple of engines and this
> > was the implementation in the hardware.
> > 
> > There can exist special hardware for this purposes as well, but the
> > point is that the cyclic can be treated as circular list.
> > 
> > > Furthemore, for non-cyclic transfers, what would .terminate_cookie() do
> > > ? I need it to be defined as terminating the current transfer when it
> > > ends for the cyclic case, not terminating it immediately. All non-cyclic
> > > transfers terminate by themselves when they end, so what would this new
> > > operation do ?
> > 
> > I would use it for two purposes, cancelling txn but at the end of
> > current txn. I have couple of usages where this would be helpful.
> 
> I fail to see how that would help. Non-cyclic transfers always stop at
> the end of the transfer. "Cancelling txn but at the end of current txn"
> is what DMA engine drivers already do if you call .terminate_cookie() on
> the ongoing transfer. It would thus be a no-op.

Well that actually depends on the hardware, some of them support abort
so people cancel it (terminate_all approach atm)

> 
> > Second in error handling where some engines do not support
> > aborting (unless we reset the whole controller)
> 
> Could you explain that one ? I'm not sure to understand it.

So I have dma to a slow peripheral and it is stuck for some reason. I
want to abort the cookie and let subsequent ones runs (btw this is for
non cyclic case), so I would use that here. Today we terminate_all and
then resubmit...

> > But yes the .terminate_cookie() semantics should indicate if the
> > termination should be immediate or end of current txn. I see people
> > using it for both.
> 
> Immediate termination is *not* something I'll implement as I have no
> good way to test that semantics. I assume you would be fine with leaving
> that for later, when someone will need it ?

Sure, if you have hw to support please test. If not, you will not
implement that.

The point is that API should support it and people can add support in
the controllers and test :)

> > And with this I think it would make sense to also add this to
> > capabilities :)
> 
> I'll repeat the comment I made to Peter: you want me to implement a
> feature that you think would be useful, but is completely unrelated to
> my use case, while there's a more natural way to handle my issue with
> the current API, without precluding in any way the addition of your new
> feature in the future. Not fair.

So from API design pov, I would like this to support both the features.
This helps us to not rework the API again for the immediate abort.

I am not expecting this to be implemented by you if your hw doesn't
support it. The core changes are pretty minimal and callback in the
driver is the one which does the job and yours wont do this

> > > > So, the .terminate_cookie() should be a feature for all type of txn's.
> > > > If for some reason (dont discount what hw designers can do) a controller
> > > > supports this for some specific type(s), then they should return
> > > > -ENOTSUPP for cookies that do not support and let the caller know.
> > > 
> > > But then the caller can't know ahead of time, it will only find out when
> > > it's too late, and can't decide not to use the DMA engine if it doesn't
> > > support the feature. I don't think that's a very good option.
> > 
> > Agreed so lets go with adding these in caps.
> 
> So if there's a need for caps anyway, why not a cap that marks
> .issue_pending() as moving from the current cyclic transfer to the next
> one ? 

Is the overhead really too much on that :) If you like I can send the
core patches and you would need to implement the driver side?

-- 
~Vinod

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-03-04 15:37                                                 ` Vinod Koul
@ 2020-03-04 16:00                                                   ` Laurent Pinchart
  2020-03-04 16:24                                                     ` Vinod Koul
  2020-03-06 14:49                                                     ` Peter Ujfalusi
  0 siblings, 2 replies; 46+ messages in thread
From: Laurent Pinchart @ 2020-03-04 16:00 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Vinod,

On Wed, Mar 04, 2020 at 09:07:18PM +0530, Vinod Koul wrote:
> On 04-03-20, 10:01, Laurent Pinchart wrote:
> > On Wed, Mar 04, 2020 at 10:43:01AM +0530, Vinod Koul wrote:
> >> On 03-03-20, 21:22, Laurent Pinchart wrote:
> >>> On Tue, Mar 03, 2020 at 10:02:54AM +0530, Vinod Koul wrote:
> >>>> On 02-03-20, 09:37, Laurent Pinchart wrote:
> >>>>>> I would be more comfortable in calling an API to do so :)
> >>>>>> The flow I am thinking is:
> >>>>>> 
> >>>>>> - prep cyclic1 txn
> >>>>>> - submit cyclic1 txn
> >>>>>> - call issue_pending() (cyclic one starts)
> >>>>>> 
> >>>>>> - prep cyclic2 txn
> >>>>>> - submit cyclic2 txn
> >>>>>> - signal_cyclic1_txn aka terminate_cookie()
> >>>>>> - cyclic1 completes, switch to cyclic2 (dmaengine driver)
> >>>>>> - get callback for cyclic1 (optional)
> >>>>>> 
> >>>>>> To check if hw supports terminate_cookie() or not we can check if the
> >>>>>> callback support is implemented
> >>>>> 
> >>>>> Two questions though:
> >>>>> 
> >>>>> - Where is .issue_pending() called for cyclic2 in your above sequence ?
> >>>>>   Surely it should be called somewhere, as the DMA engine API requires
> >>>>>   .issue_pending() to be called for a transfer to be executed, otherwise
> >>>>>   it stays in the submitted but not pending queue.
> >>>> 
> >>>> Sorry missed that one, I would do that after submit cyclic2 txn step and
> >>>> then signal signal_cyclic1_txn termination
> >>> 
> >>> OK, that matches my understanding, good :-)
> >>> 
> >>>>> - With the introduction of a new .terminate_cookie() operation, we need
> >>>>>   to specify that operation for all transfer types. What's its
> >>>> 
> >>>> Correct
> >>>> 
> >>>>>   envisioned semantics for non-cyclic transfers ? And how do DMA engine
> >>>>>   drivers report that they support .terminate_cookie() for cyclic
> >>>>>   transfers but not for other transfer types (the counterpart of
> >>>>>   reporting, in my proposition, that .issue_pending() isn't supported
> >>>>>   replace the current cyclic transfer) ?
> >>>> 
> >>>> Typically for dmaengine controller cyclic is *not* a special mode, only
> >>>> change is that a list provided to controller is circular.
> >>> 
> >>> I don't agree with this. For cyclic transfers to be replaceable in a
> >>> clean way, the feature must be specifically implemented at the hardware
> >>> level. A DMA engine that supports chaining transfers with an explicit
> >>> way to override that chaining, and without the logic to report if the
> >>> inherent race was lost or not, really can't support this API.
> >> 
> >> Well chaining is a typical feature in dmaengine and making last chain
> >> point to first makes it circular. I have seen couple of engines and this
> >> was the implementation in the hardware.
> >> 
> >> There can exist special hardware for this purposes as well, but the
> >> point is that the cyclic can be treated as circular list.
> >> 
> >>> Furthemore, for non-cyclic transfers, what would .terminate_cookie() do
> >>> ? I need it to be defined as terminating the current transfer when it
> >>> ends for the cyclic case, not terminating it immediately. All non-cyclic
> >>> transfers terminate by themselves when they end, so what would this new
> >>> operation do ?
> >> 
> >> I would use it for two purposes, cancelling txn but at the end of
> >> current txn. I have couple of usages where this would be helpful.
> > 
> > I fail to see how that would help. Non-cyclic transfers always stop at
> > the end of the transfer. "Cancelling txn but at the end of current txn"
> > is what DMA engine drivers already do if you call .terminate_cookie() on
> > the ongoing transfer. It would thus be a no-op.
> 
> Well that actually depends on the hardware, some of them support abort
> so people cancel it (terminate_all approach atm)

In that case it's not terminating at the end of the current transfer,
but terminating immediately (a.k.a. aborting), right ? Cancelling at the
end of the current transfer still seems to be a no-op to me for
non-cyclic transfers, as that's what they do on their own already.

> >> Second in error handling where some engines do not support
> >> aborting (unless we reset the whole controller)
> > 
> > Could you explain that one ? I'm not sure to understand it.
> 
> So I have dma to a slow peripheral and it is stuck for some reason. I
> want to abort the cookie and let subsequent ones runs (btw this is for
> non cyclic case), so I would use that here. Today we terminate_all and
> then resubmit...

That's also for immediate abort, right ?

For this to work properly we need very accurate residue reporting, as
the client will usually need to know exactly what has been transferred.
The device would need to support DMA_RESIDUE_GRANULARITY_BURST when
aborting an ongoing transfer. What hardware supports this ?

> >> But yes the .terminate_cookie() semantics should indicate if the
> >> termination should be immediate or end of current txn. I see people
> >> using it for both.
> > 
> > Immediate termination is *not* something I'll implement as I have no
> > good way to test that semantics. I assume you would be fine with leaving
> > that for later, when someone will need it ?
> 
> Sure, if you have hw to support please test. If not, you will not
> implement that.
> 
> The point is that API should support it and people can add support in
> the controllers and test :)

I still think this is a different API. We'll have

1. Existing .issue_pending(), queueing the next transfer for non-cyclic
   cases, and being a no-op for cyclic cases.
2. New .terminate_cookie(AT_END_OF_TRANSFER), being a no-op for
   non-cyclic cases, and moving to the next transfer for cyclic cases.
3. New .terminate_cookie(ABORT_IMMEDIATELY), applicable to both cyclic
   and non-cyclic cases.

3. is an API I don't need, and can't easily test. I agree that it can
have use cases (provided the DMA device can abort an ongoing transfer
*and* still support DMA_RESIDUE_GRANULARITY_BURST in that case).

I'm troubled by my inability to convince you that 1. and 2. are really
the same, with 1. addressing the non-cyclic case and 2. addressing the
cyclic case :-) This is why I think they should both be implemeted using
.issue_pending() (no other option for 1., that's what it uses today).
This wouldn't prevent implementing 3. with a new .terminate_cookie()
operation, that wouldn't need to take a flag as it would always operate
in ABORT_IMMEDIATELY mode. There would also be no need to report a new
capability for 3., as the presence of the .terminate_cookie() handler
would be enough to tell clients that the API is supported. Only a new
capability for 2. would be needed.

> >> And with this I think it would make sense to also add this to
> >> capabilities :)
> > 
> > I'll repeat the comment I made to Peter: you want me to implement a
> > feature that you think would be useful, but is completely unrelated to
> > my use case, while there's a more natural way to handle my issue with
> > the current API, without precluding in any way the addition of your new
> > feature in the future. Not fair.
> 
> So from API design pov, I would like this to support both the features.
> This helps us to not rework the API again for the immediate abort.
> 
> I am not expecting this to be implemented by you if your hw doesn't
> support it. The core changes are pretty minimal and callback in the
> driver is the one which does the job and yours wont do this

Xilinx DMA drivers don't support DMA_RESIDUE_GRANULARITY_BURST so I
can't test this indeed.

> >>>> So, the .terminate_cookie() should be a feature for all type of txn's.
> >>>> If for some reason (dont discount what hw designers can do) a controller
> >>>> supports this for some specific type(s), then they should return
> >>>> -ENOTSUPP for cookies that do not support and let the caller know.
> >>> 
> >>> But then the caller can't know ahead of time, it will only find out when
> >>> it's too late, and can't decide not to use the DMA engine if it doesn't
> >>> support the feature. I don't think that's a very good option.
> >> 
> >> Agreed so lets go with adding these in caps.
> > 
> > So if there's a need for caps anyway, why not a cap that marks
> > .issue_pending() as moving from the current cyclic transfer to the next
> > one ? 
> 
> Is the overhead really too much on that :) If you like I can send the
> core patches and you would need to implement the driver side?

We can try that as a compromise. One of main concerns with developing
the core patches myself is that the .terminate_cookie() API still seems
ill-defined to me, so it would be much more efficient if you translate
the idea you have in your idea into code than trying to communicate it
to me in all details (one of the grey areas is what should
.terminate_cookie() do if the cookie passed to the function corresponds
to an already terminated or, more tricky from a completion callback
point of view, an issued but not-yet-started transfer, or also a
submitted but not issued transfer). If you implement the core part, then
that problem will go away.

How about the implementation in virt-dma.[ch] by the way ?

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-03-04 16:00                                                   ` Laurent Pinchart
@ 2020-03-04 16:24                                                     ` Vinod Koul
       [not found]                                                       ` <20200311155248.GA4772@pendragon.ideasonboard.com>
  2020-03-06 14:49                                                     ` Peter Ujfalusi
  1 sibling, 1 reply; 46+ messages in thread
From: Vinod Koul @ 2020-03-04 16:24 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Laurent,

On 04-03-20, 18:00, Laurent Pinchart wrote:
> On Wed, Mar 04, 2020 at 09:07:18PM +0530, Vinod Koul wrote:
> > On 04-03-20, 10:01, Laurent Pinchart wrote:
> > > On Wed, Mar 04, 2020 at 10:43:01AM +0530, Vinod Koul wrote:
> > >> On 03-03-20, 21:22, Laurent Pinchart wrote:
> > >>> On Tue, Mar 03, 2020 at 10:02:54AM +0530, Vinod Koul wrote:
> > >>>> On 02-03-20, 09:37, Laurent Pinchart wrote:
> > >>>>>> I would be more comfortable in calling an API to do so :)
> > >>>>>> The flow I am thinking is:
> > >>>>>> 
> > >>>>>> - prep cyclic1 txn
> > >>>>>> - submit cyclic1 txn
> > >>>>>> - call issue_pending() (cyclic one starts)
> > >>>>>> 
> > >>>>>> - prep cyclic2 txn
> > >>>>>> - submit cyclic2 txn
> > >>>>>> - signal_cyclic1_txn aka terminate_cookie()
> > >>>>>> - cyclic1 completes, switch to cyclic2 (dmaengine driver)
> > >>>>>> - get callback for cyclic1 (optional)
> > >>>>>> 
> > >>>>>> To check if hw supports terminate_cookie() or not we can check if the
> > >>>>>> callback support is implemented
> > >>>>> 
> > >>>>> Two questions though:
> > >>>>> 
> > >>>>> - Where is .issue_pending() called for cyclic2 in your above sequence ?
> > >>>>>   Surely it should be called somewhere, as the DMA engine API requires
> > >>>>>   .issue_pending() to be called for a transfer to be executed, otherwise
> > >>>>>   it stays in the submitted but not pending queue.
> > >>>> 
> > >>>> Sorry missed that one, I would do that after submit cyclic2 txn step and
> > >>>> then signal signal_cyclic1_txn termination
> > >>> 
> > >>> OK, that matches my understanding, good :-)
> > >>> 
> > >>>>> - With the introduction of a new .terminate_cookie() operation, we need
> > >>>>>   to specify that operation for all transfer types. What's its
> > >>>> 
> > >>>> Correct
> > >>>> 
> > >>>>>   envisioned semantics for non-cyclic transfers ? And how do DMA engine
> > >>>>>   drivers report that they support .terminate_cookie() for cyclic
> > >>>>>   transfers but not for other transfer types (the counterpart of
> > >>>>>   reporting, in my proposition, that .issue_pending() isn't supported
> > >>>>>   replace the current cyclic transfer) ?
> > >>>> 
> > >>>> Typically for dmaengine controller cyclic is *not* a special mode, only
> > >>>> change is that a list provided to controller is circular.
> > >>> 
> > >>> I don't agree with this. For cyclic transfers to be replaceable in a
> > >>> clean way, the feature must be specifically implemented at the hardware
> > >>> level. A DMA engine that supports chaining transfers with an explicit
> > >>> way to override that chaining, and without the logic to report if the
> > >>> inherent race was lost or not, really can't support this API.
> > >> 
> > >> Well chaining is a typical feature in dmaengine and making last chain
> > >> point to first makes it circular. I have seen couple of engines and this
> > >> was the implementation in the hardware.
> > >> 
> > >> There can exist special hardware for this purposes as well, but the
> > >> point is that the cyclic can be treated as circular list.
> > >> 
> > >>> Furthemore, for non-cyclic transfers, what would .terminate_cookie() do
> > >>> ? I need it to be defined as terminating the current transfer when it
> > >>> ends for the cyclic case, not terminating it immediately. All non-cyclic
> > >>> transfers terminate by themselves when they end, so what would this new
> > >>> operation do ?
> > >> 
> > >> I would use it for two purposes, cancelling txn but at the end of
> > >> current txn. I have couple of usages where this would be helpful.
> > > 
> > > I fail to see how that would help. Non-cyclic transfers always stop at
> > > the end of the transfer. "Cancelling txn but at the end of current txn"
> > > is what DMA engine drivers already do if you call .terminate_cookie() on
> > > the ongoing transfer. It would thus be a no-op.
> > 
> > Well that actually depends on the hardware, some of them support abort
> > so people cancel it (terminate_all approach atm)
> 
> In that case it's not terminating at the end of the current transfer,
> but terminating immediately (a.k.a. aborting), right ? Cancelling at the
> end of the current transfer still seems to be a no-op to me for
> non-cyclic transfers, as that's what they do on their own already.

Correct, it is abort for current txn.

> > >> Second in error handling where some engines do not support
> > >> aborting (unless we reset the whole controller)
> > > 
> > > Could you explain that one ? I'm not sure to understand it.
> > 
> > So I have dma to a slow peripheral and it is stuck for some reason. I
> > want to abort the cookie and let subsequent ones runs (btw this is for
> > non cyclic case), so I would use that here. Today we terminate_all and
> > then resubmit...
> 
> That's also for immediate abort, right ?

Right

> For this to work properly we need very accurate residue reporting, as
> the client will usually need to know exactly what has been transferred.
> The device would need to support DMA_RESIDUE_GRANULARITY_BURST when
> aborting an ongoing transfer. What hardware supports this ?

 git grep DMA_RESIDUE_GRANULARITY_BURST drivers/dma/ |wc -l
27

So it seems many do support the burst reporting.

> > >> But yes the .terminate_cookie() semantics should indicate if the
> > >> termination should be immediate or end of current txn. I see people
> > >> using it for both.
> > > 
> > > Immediate termination is *not* something I'll implement as I have no
> > > good way to test that semantics. I assume you would be fine with leaving
> > > that for later, when someone will need it ?
> > 
> > Sure, if you have hw to support please test. If not, you will not
> > implement that.
> > 
> > The point is that API should support it and people can add support in
> > the controllers and test :)
> 
> I still think this is a different API. We'll have
> 
> 1. Existing .issue_pending(), queueing the next transfer for non-cyclic
>    cases, and being a no-op for cyclic cases.
> 2. New .terminate_cookie(AT_END_OF_TRANSFER), being a no-op for
>    non-cyclic cases, and moving to the next transfer for cyclic cases.
> 3. New .terminate_cookie(ABORT_IMMEDIATELY), applicable to both cyclic
>    and non-cyclic cases.
> 
> 3. is an API I don't need, and can't easily test. I agree that it can
> have use cases (provided the DMA device can abort an ongoing transfer
> *and* still support DMA_RESIDUE_GRANULARITY_BURST in that case).
> 
> I'm troubled by my inability to convince you that 1. and 2. are really
> the same, with 1. addressing the non-cyclic case and 2. addressing the
> cyclic case :-) This is why I think they should both be implemeted using
> .issue_pending() (no other option for 1., that's what it uses today).
> This wouldn't prevent implementing 3. with a new .terminate_cookie()
> operation, that wouldn't need to take a flag as it would always operate
> in ABORT_IMMEDIATELY mode. There would also be no need to report a new
> capability for 3., as the presence of the .terminate_cookie() handler
> would be enough to tell clients that the API is supported. Only a new
> capability for 2. would be needed.

Well I agree 1 & 2 seem similar but I would like to define the behaviour
not dependent on the txn being cyclic or not. That is my concern and
hence the idea that:

1. .issue_pending() will push txn to pending_queue, you may have a case
where that is done only once (due to nature of txn), but no other
implication

2. .terminate_cookie(EOT) will abort the transfer at the end. Maybe not
used for cyclic but irrespective of that, the behaviour would be abort
at end of cyclic

3. .terminate_cookie(IMMEDIATE) will abort immediately. If there is
anything in pending_queue that will get pushed to hardware.

4. Cyclic by nature never completes
   - as a consequence needs to be stopped by terminate_all/terminate_cookie

Does these rules make sense :)

> > >> And with this I think it would make sense to also add this to
> > >> capabilities :)
> > > 
> > > I'll repeat the comment I made to Peter: you want me to implement a
> > > feature that you think would be useful, but is completely unrelated to
> > > my use case, while there's a more natural way to handle my issue with
> > > the current API, without precluding in any way the addition of your new
> > > feature in the future. Not fair.
> > 
> > So from API design pov, I would like this to support both the features.
> > This helps us to not rework the API again for the immediate abort.
> > 
> > I am not expecting this to be implemented by you if your hw doesn't
> > support it. The core changes are pretty minimal and callback in the
> > driver is the one which does the job and yours wont do this
> 
> Xilinx DMA drivers don't support DMA_RESIDUE_GRANULARITY_BURST so I
> can't test this indeed.

Sure I understand that! Am sure folks will respond to CFT and I guess
Peter will also be interested in testing.

> > >>>> So, the .terminate_cookie() should be a feature for all type of txn's.
> > >>>> If for some reason (dont discount what hw designers can do) a controller
> > >>>> supports this for some specific type(s), then they should return
> > >>>> -ENOTSUPP for cookies that do not support and let the caller know.
> > >>> 
> > >>> But then the caller can't know ahead of time, it will only find out when
> > >>> it's too late, and can't decide not to use the DMA engine if it doesn't
> > >>> support the feature. I don't think that's a very good option.
> > >> 
> > >> Agreed so lets go with adding these in caps.
> > > 
> > > So if there's a need for caps anyway, why not a cap that marks
> > > .issue_pending() as moving from the current cyclic transfer to the next
> > > one ? 
> > 
> > Is the overhead really too much on that :) If you like I can send the
> > core patches and you would need to implement the driver side?
> 
> We can try that as a compromise. One of main concerns with developing
> the core patches myself is that the .terminate_cookie() API still seems
> ill-defined to me, so it would be much more efficient if you translate

yeah lets take a stab at defining this and see if we come up with
something meaningful

> the idea you have in your idea into code than trying to communicate it
> to me in all details (one of the grey areas is what should
> .terminate_cookie() do if the cookie passed to the function corresponds
> to an already terminated or, more tricky from a completion callback
> point of view, an issued but not-yet-started transfer, or also a
> submitted but not issued transfer). If you implement the core part, then
> that problem will go away.
> 
> How about the implementation in virt-dma.[ch] by the way ?

It needs to be comprehended and tested as well.. since these are simple
callbacks to driver, we should not need huge changes here (i need to
double check though)

-- 
~Vinod

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-03-04 16:00                                                   ` Laurent Pinchart
  2020-03-04 16:24                                                     ` Vinod Koul
@ 2020-03-06 14:49                                                     ` Peter Ujfalusi
  2020-03-11 23:15                                                       ` Laurent Pinchart
  1 sibling, 1 reply; 46+ messages in thread
From: Peter Ujfalusi @ 2020-03-06 14:49 UTC (permalink / raw)
  To: Laurent Pinchart, Vinod Koul
  Cc: dmaengine, Michal Simek, Hyun Kwon, Tejas Upadhyay,
	Satish Kumar Nagireddy

Laureant,

On 04/03/2020 18.00, Laurent Pinchart wrote:
> I still think this is a different API. We'll have
> 
> 1. Existing .issue_pending(), queueing the next transfer for non-cyclic
>    cases, and being a no-op for cyclic cases.
> 2. New .terminate_cookie(AT_END_OF_TRANSFER), being a no-op for
>    non-cyclic cases, and moving to the next transfer for cyclic cases.
> 3. New .terminate_cookie(ABORT_IMMEDIATELY), applicable to both cyclic
>    and non-cyclic cases.
> 
> 3. is an API I don't need, and can't easily test. I agree that it can
> have use cases (provided the DMA device can abort an ongoing transfer
> *and* still support DMA_RESIDUE_GRANULARITY_BURST in that case).
> 
> I'm troubled by my inability to convince you that 1. and 2. are really
> the same, with 1. addressing the non-cyclic case and 2. addressing the
> cyclic case :-) This is why I think they should both be implemeted using
> .issue_pending() (no other option for 1., that's what it uses today).
> This wouldn't prevent implementing 3. with a new .terminate_cookie()
> operation, that wouldn't need to take a flag as it would always operate
> in ABORT_IMMEDIATELY mode. There would also be no need to report a new
> capability for 3., as the presence of the .terminate_cookie() handler
> would be enough to tell clients that the API is supported. Only a new
> capability for 2. would be needed.

Let's see the two cases, AT_END_OF_TRANSFER and ABORT_IMMEDIATELY
against cyclic and slave for simplicity:
- AT_END_OF_TRANSFER
...
issue_pending(1)
issue_pending(2)
terminate_cookie(AT_END_OF_TRANSFER)

In case of cyclic:
When cookie1 finishes a tx cookie2 will start.

Same sequence in case of slave:
When cookie1 finishes a tx cookie2 will start.
 Yes, terminate_cookie(AT_END_OF_TRANSFER) is NOP

- ABORT_IMMEDIATELY
...
issue_pending(1)
issue_pending(2)
terminate_cookie(ABORT_IMMEDIATELY)

In case of cyclic and slave:
Abort cookie1 right away and start cookie2.

In case of cyclic:
When cookie1 finishes a tx cookie2 will start.

True, we have NOP operation, but as you can see the semantics of the two
cases are well defined and consistent among different operations.

Imho the only thing which is not really defined is the
AT_END_OF_TRANSFER, is it after the current period, or when finishing
the buffer / after a frame or all frames are consumed in the current tx
for interleaved.


>>>> And with this I think it would make sense to also add this to
>>>> capabilities :)
>>>
>>> I'll repeat the comment I made to Peter: you want me to implement a
>>> feature that you think would be useful, but is completely unrelated to
>>> my use case, while there's a more natural way to handle my issue with
>>> the current API, without precluding in any way the addition of your new
>>> feature in the future. Not fair.
>>
>> So from API design pov, I would like this to support both the features.
>> This helps us to not rework the API again for the immediate abort.
>>
>> I am not expecting this to be implemented by you if your hw doesn't
>> support it. The core changes are pretty minimal and callback in the
>> driver is the one which does the job and yours wont do this
> 
> Xilinx DMA drivers don't support DMA_RESIDUE_GRANULARITY_BURST so I
> can't test this indeed.

All TI DMA supports it ;)

> 
>>>>>> So, the .terminate_cookie() should be a feature for all type of txn's.
>>>>>> If for some reason (dont discount what hw designers can do) a controller
>>>>>> supports this for some specific type(s), then they should return
>>>>>> -ENOTSUPP for cookies that do not support and let the caller know.
>>>>>
>>>>> But then the caller can't know ahead of time, it will only find out when
>>>>> it's too late, and can't decide not to use the DMA engine if it doesn't
>>>>> support the feature. I don't think that's a very good option.
>>>>
>>>> Agreed so lets go with adding these in caps.
>>>
>>> So if there's a need for caps anyway, why not a cap that marks
>>> .issue_pending() as moving from the current cyclic transfer to the next
>>> one ? 
>>
>> Is the overhead really too much on that :) If you like I can send the
>> core patches and you would need to implement the driver side?
> 
> We can try that as a compromise. One of main concerns with developing
> the core patches myself is that the .terminate_cookie() API still seems
> ill-defined to me, so it would be much more efficient if you translate
> the idea you have in your idea into code than trying to communicate it
> to me in all details (one of the grey areas is what should
> .terminate_cookie() do if the cookie passed to the function corresponds
> to an already terminated or, more tricky from a completion callback
> point of view, an issued but not-yet-started transfer, or also a
> submitted but not issued transfer). If you implement the core part, then
> that problem will go away.
> 
> How about the implementation in virt-dma.[ch] by the way ?
> 

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-03-06 14:49                                                     ` Peter Ujfalusi
@ 2020-03-11 23:15                                                       ` Laurent Pinchart
  0 siblings, 0 replies; 46+ messages in thread
From: Laurent Pinchart @ 2020-03-11 23:15 UTC (permalink / raw)
  To: Peter Ujfalusi
  Cc: Vinod Koul, dmaengine, Michal Simek, Hyun Kwon, Tejas Upadhyay,
	Satish Kumar Nagireddy

Hi Peter,

On Fri, Mar 06, 2020 at 04:49:01PM +0200, Peter Ujfalusi wrote:
> On 04/03/2020 18.00, Laurent Pinchart wrote:
> > I still think this is a different API. We'll have
> > 
> > 1. Existing .issue_pending(), queueing the next transfer for non-cyclic
> >    cases, and being a no-op for cyclic cases.
> > 2. New .terminate_cookie(AT_END_OF_TRANSFER), being a no-op for
> >    non-cyclic cases, and moving to the next transfer for cyclic cases.
> > 3. New .terminate_cookie(ABORT_IMMEDIATELY), applicable to both cyclic
> >    and non-cyclic cases.
> > 
> > 3. is an API I don't need, and can't easily test. I agree that it can
> > have use cases (provided the DMA device can abort an ongoing transfer
> > *and* still support DMA_RESIDUE_GRANULARITY_BURST in that case).
> > 
> > I'm troubled by my inability to convince you that 1. and 2. are really
> > the same, with 1. addressing the non-cyclic case and 2. addressing the
> > cyclic case :-) This is why I think they should both be implemeted using
> > .issue_pending() (no other option for 1., that's what it uses today).
> > This wouldn't prevent implementing 3. with a new .terminate_cookie()
> > operation, that wouldn't need to take a flag as it would always operate
> > in ABORT_IMMEDIATELY mode. There would also be no need to report a new
> > capability for 3., as the presence of the .terminate_cookie() handler
> > would be enough to tell clients that the API is supported. Only a new
> > capability for 2. would be needed.
> 
> Let's see the two cases, AT_END_OF_TRANSFER and ABORT_IMMEDIATELY
> against cyclic and slave for simplicity:
> - AT_END_OF_TRANSFER
> ...
> issue_pending(1)
> issue_pending(2)
> terminate_cookie(AT_END_OF_TRANSFER)
> 
> In case of cyclic:
> When cookie1 finishes a tx cookie2 will start.
> 
> Same sequence in case of slave:
> When cookie1 finishes a tx cookie2 will start.
>  Yes, terminate_cookie(AT_END_OF_TRANSFER) is NOP
> 
> - ABORT_IMMEDIATELY
> ...
> issue_pending(1)
> issue_pending(2)
> terminate_cookie(ABORT_IMMEDIATELY)
> 
> In case of cyclic and slave:
> Abort cookie1 right away and start cookie2.
> 
> In case of cyclic:
> When cookie1 finishes a tx cookie2 will start.

Is this paragraph a copy & paste leftover ?

> True, we have NOP operation, but as you can see the semantics of the two
> cases are well defined and consistent among different operations.

I'm not disputing that, but I still think that the semantics for the
proposal based solely on issue_pending() is well-defined too and
consistent among different operations :-) My point is that
terminate_cookie() is only required for the ABORT_IMMEDIATELY case,
which could be implemented on top of my proposal. Anyway, I seem to have
failed in my attempt to convincing Vinod, and he proposed providing the
implementation of terminate_cookie() in the DMA engine core and doc, so
I'll rebase the driver on top of that and submit the two together after
testing.

> Imho the only thing which is not really defined is the
> AT_END_OF_TRANSFER, is it after the current period, or when finishing
> the buffer / after a frame or all frames are consumed in the current tx
> for interleaved.

For 2D interleaved cyclic transfers, there's a single period, so that's
not an issue. For the existing cyclic API it's up to us to decide, and I
don't have enough insight on the expected usage and hardware features to
answer that question.

> >>>> And with this I think it would make sense to also add this to
> >>>> capabilities :)
> >>>
> >>> I'll repeat the comment I made to Peter: you want me to implement a
> >>> feature that you think would be useful, but is completely unrelated to
> >>> my use case, while there's a more natural way to handle my issue with
> >>> the current API, without precluding in any way the addition of your new
> >>> feature in the future. Not fair.
> >>
> >> So from API design pov, I would like this to support both the features.
> >> This helps us to not rework the API again for the immediate abort.
> >>
> >> I am not expecting this to be implemented by you if your hw doesn't
> >> support it. The core changes are pretty minimal and callback in the
> >> driver is the one which does the job and yours wont do this
> > 
> > Xilinx DMA drivers don't support DMA_RESIDUE_GRANULARITY_BURST so I
> > can't test this indeed.
> 
> All TI DMA supports it ;)

Great, so you can implement this feature ;-)

> >>>>>> So, the .terminate_cookie() should be a feature for all type of txn's.
> >>>>>> If for some reason (dont discount what hw designers can do) a controller
> >>>>>> supports this for some specific type(s), then they should return
> >>>>>> -ENOTSUPP for cookies that do not support and let the caller know.
> >>>>>
> >>>>> But then the caller can't know ahead of time, it will only find out when
> >>>>> it's too late, and can't decide not to use the DMA engine if it doesn't
> >>>>> support the feature. I don't think that's a very good option.
> >>>>
> >>>> Agreed so lets go with adding these in caps.
> >>>
> >>> So if there's a need for caps anyway, why not a cap that marks
> >>> .issue_pending() as moving from the current cyclic transfer to the next
> >>> one ? 
> >>
> >> Is the overhead really too much on that :) If you like I can send the
> >> core patches and you would need to implement the driver side?
> > 
> > We can try that as a compromise. One of main concerns with developing
> > the core patches myself is that the .terminate_cookie() API still seems
> > ill-defined to me, so it would be much more efficient if you translate
> > the idea you have in your idea into code than trying to communicate it
> > to me in all details (one of the grey areas is what should
> > .terminate_cookie() do if the cookie passed to the function corresponds
> > to an already terminated or, more tricky from a completion callback
> > point of view, an issued but not-yet-started transfer, or also a
> > submitted but not issued transfer). If you implement the core part, then
> > that problem will go away.
> > 
> > How about the implementation in virt-dma.[ch] by the way ?

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
       [not found]                                                       ` <20200311155248.GA4772@pendragon.ideasonboard.com>
@ 2020-03-18 15:14                                                         ` Laurent Pinchart
  2020-03-25 16:00                                                           ` Laurent Pinchart
  2020-03-26  7:02                                                         ` Vinod Koul
  1 sibling, 1 reply; 46+ messages in thread
From: Laurent Pinchart @ 2020-03-18 15:14 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Vinod,

On Wed, Mar 11, 2020 at 05:52:48PM +0200, Laurent Pinchart wrote:
> On Wed, Mar 04, 2020 at 09:54:26PM +0530, Vinod Koul wrote:
> > On 04-03-20, 18:00, Laurent Pinchart wrote:
> >> On Wed, Mar 04, 2020 at 09:07:18PM +0530, Vinod Koul wrote:
> >>> On 04-03-20, 10:01, Laurent Pinchart wrote:
> >>>> On Wed, Mar 04, 2020 at 10:43:01AM +0530, Vinod Koul wrote:
> >>>>> On 03-03-20, 21:22, Laurent Pinchart wrote:
> >>>>>> On Tue, Mar 03, 2020 at 10:02:54AM +0530, Vinod Koul wrote:
> >>>>>>> On 02-03-20, 09:37, Laurent Pinchart wrote:
> >>>>>>>>> I would be more comfortable in calling an API to do so :)
> >>>>>>>>> The flow I am thinking is:
> >>>>>>>>> 
> >>>>>>>>> - prep cyclic1 txn
> >>>>>>>>> - submit cyclic1 txn
> >>>>>>>>> - call issue_pending() (cyclic one starts)
> >>>>>>>>> 
> >>>>>>>>> - prep cyclic2 txn
> >>>>>>>>> - submit cyclic2 txn
> >>>>>>>>> - signal_cyclic1_txn aka terminate_cookie()
> >>>>>>>>> - cyclic1 completes, switch to cyclic2 (dmaengine driver)
> >>>>>>>>> - get callback for cyclic1 (optional)
> >>>>>>>>> 
> >>>>>>>>> To check if hw supports terminate_cookie() or not we can check if the
> >>>>>>>>> callback support is implemented
> >>>>>>>> 
> >>>>>>>> Two questions though:
> >>>>>>>> 
> >>>>>>>> - Where is .issue_pending() called for cyclic2 in your above sequence ?
> >>>>>>>>   Surely it should be called somewhere, as the DMA engine API requires
> >>>>>>>>   .issue_pending() to be called for a transfer to be executed, otherwise
> >>>>>>>>   it stays in the submitted but not pending queue.
> >>>>>>> 
> >>>>>>> Sorry missed that one, I would do that after submit cyclic2 txn step and
> >>>>>>> then signal signal_cyclic1_txn termination
> >>>>>> 
> >>>>>> OK, that matches my understanding, good :-)
> >>>>>> 
> >>>>>>>> - With the introduction of a new .terminate_cookie() operation, we need
> >>>>>>>>   to specify that operation for all transfer types. What's its
> >>>>>>> 
> >>>>>>> Correct
> >>>>>>> 
> >>>>>>>>   envisioned semantics for non-cyclic transfers ? And how do DMA engine
> >>>>>>>>   drivers report that they support .terminate_cookie() for cyclic
> >>>>>>>>   transfers but not for other transfer types (the counterpart of
> >>>>>>>>   reporting, in my proposition, that .issue_pending() isn't supported
> >>>>>>>>   replace the current cyclic transfer) ?
> >>>>>>> 
> >>>>>>> Typically for dmaengine controller cyclic is *not* a special mode, only
> >>>>>>> change is that a list provided to controller is circular.
> >>>>>> 
> >>>>>> I don't agree with this. For cyclic transfers to be replaceable in a
> >>>>>> clean way, the feature must be specifically implemented at the hardware
> >>>>>> level. A DMA engine that supports chaining transfers with an explicit
> >>>>>> way to override that chaining, and without the logic to report if the
> >>>>>> inherent race was lost or not, really can't support this API.
> >>>>> 
> >>>>> Well chaining is a typical feature in dmaengine and making last chain
> >>>>> point to first makes it circular. I have seen couple of engines and this
> >>>>> was the implementation in the hardware.
> >>>>> 
> >>>>> There can exist special hardware for this purposes as well, but the
> >>>>> point is that the cyclic can be treated as circular list.
> >>>>> 
> >>>>>> Furthemore, for non-cyclic transfers, what would .terminate_cookie() do
> >>>>>> ? I need it to be defined as terminating the current transfer when it
> >>>>>> ends for the cyclic case, not terminating it immediately. All non-cyclic
> >>>>>> transfers terminate by themselves when they end, so what would this new
> >>>>>> operation do ?
> >>>>> 
> >>>>> I would use it for two purposes, cancelling txn but at the end of
> >>>>> current txn. I have couple of usages where this would be helpful.
> >>>> 
> >>>> I fail to see how that would help. Non-cyclic transfers always stop at
> >>>> the end of the transfer. "Cancelling txn but at the end of current txn"
> >>>> is what DMA engine drivers already do if you call .terminate_cookie() on
> >>>> the ongoing transfer. It would thus be a no-op.
> >>> 
> >>> Well that actually depends on the hardware, some of them support abort
> >>> so people cancel it (terminate_all approach atm)
> >> 
> >> In that case it's not terminating at the end of the current transfer,
> >> but terminating immediately (a.k.a. aborting), right ? Cancelling at the
> >> end of the current transfer still seems to be a no-op to me for
> >> non-cyclic transfers, as that's what they do on their own already.
> > 
> > Correct, it is abort for current txn.
> > 
> >>>>> Second in error handling where some engines do not support
> >>>>> aborting (unless we reset the whole controller)
> >>>> 
> >>>> Could you explain that one ? I'm not sure to understand it.
> >>> 
> >>> So I have dma to a slow peripheral and it is stuck for some reason. I
> >>> want to abort the cookie and let subsequent ones runs (btw this is for
> >>> non cyclic case), so I would use that here. Today we terminate_all and
> >>> then resubmit...
> >> 
> >> That's also for immediate abort, right ?
> > 
> > Right
> > 
> >> For this to work properly we need very accurate residue reporting, as
> >> the client will usually need to know exactly what has been transferred.
> >> The device would need to support DMA_RESIDUE_GRANULARITY_BURST when
> >> aborting an ongoing transfer. What hardware supports this ?
> > 
> >  git grep DMA_RESIDUE_GRANULARITY_BURST drivers/dma/ |wc -l
> > 27
> > 
> > So it seems many do support the burst reporting.
> 
> Yes, but not all of those may support aborting a transfer *and*
> reporting the exact residue of cancelled transfers. We need both to
> implement your proposal.
> 
> >>>>> But yes the .terminate_cookie() semantics should indicate if the
> >>>>> termination should be immediate or end of current txn. I see people
> >>>>> using it for both.
> >>>> 
> >>>> Immediate termination is *not* something I'll implement as I have no
> >>>> good way to test that semantics. I assume you would be fine with leaving
> >>>> that for later, when someone will need it ?
> >>> 
> >>> Sure, if you have hw to support please test. If not, you will not
> >>> implement that.
> >>> 
> >>> The point is that API should support it and people can add support in
> >>> the controllers and test :)
> >> 
> >> I still think this is a different API. We'll have
> >> 
> >> 1. Existing .issue_pending(), queueing the next transfer for non-cyclic
> >>    cases, and being a no-op for cyclic cases.
> >> 2. New .terminate_cookie(AT_END_OF_TRANSFER), being a no-op for
> >>    non-cyclic cases, and moving to the next transfer for cyclic cases.
> >> 3. New .terminate_cookie(ABORT_IMMEDIATELY), applicable to both cyclic
> >>    and non-cyclic cases.
> >> 
> >> 3. is an API I don't need, and can't easily test. I agree that it can
> >> have use cases (provided the DMA device can abort an ongoing transfer
> >> *and* still support DMA_RESIDUE_GRANULARITY_BURST in that case).
> >> 
> >> I'm troubled by my inability to convince you that 1. and 2. are really
> >> the same, with 1. addressing the non-cyclic case and 2. addressing the
> >> cyclic case :-) This is why I think they should both be implemeted using
> >> .issue_pending() (no other option for 1., that's what it uses today).
> >> This wouldn't prevent implementing 3. with a new .terminate_cookie()
> >> operation, that wouldn't need to take a flag as it would always operate
> >> in ABORT_IMMEDIATELY mode. There would also be no need to report a new
> >> capability for 3., as the presence of the .terminate_cookie() handler
> >> would be enough to tell clients that the API is supported. Only a new
> >> capability for 2. would be needed.
> > 
> > Well I agree 1 & 2 seem similar but I would like to define the behaviour
> > not dependent on the txn being cyclic or not. That is my concern and
> > hence the idea that:
> > 
> > 1. .issue_pending() will push txn to pending_queue, you may have a case
> > where that is done only once (due to nature of txn), but no other
> > implication
> > 
> > 2. .terminate_cookie(EOT) will abort the transfer at the end. Maybe not
> > used for cyclic but irrespective of that, the behaviour would be abort
> > at end of cyclic
> 
> Did you mean "maybe not used for non-cyclic" ?
> 
> > 3. .terminate_cookie(IMMEDIATE) will abort immediately. If there is
> > anything in pending_queue that will get pushed to hardware.
> > 
> > 4. Cyclic by nature never completes
> >    - as a consequence needs to be stopped by terminate_all/terminate_cookie
> > 
> > Does these rules make sense :)
> 
> It's a set of rules that I think can handle my use case, but I still
> believe my proposal based on just .issue_pending() would be simpler, in
> line with the existing API concepts, and wouldn't preclude the addition
> of .terminate_cookie(IMMEDIATE) at a later point. It's your call though,
> especially if you provide the implementation :-) When do you think you
> will be able to do so ?

Gentle ping :-)

> >>>>> And with this I think it would make sense to also add this to
> >>>>> capabilities :)
> >>>> 
> >>>> I'll repeat the comment I made to Peter: you want me to implement a
> >>>> feature that you think would be useful, but is completely unrelated to
> >>>> my use case, while there's a more natural way to handle my issue with
> >>>> the current API, without precluding in any way the addition of your new
> >>>> feature in the future. Not fair.
> >>> 
> >>> So from API design pov, I would like this to support both the features.
> >>> This helps us to not rework the API again for the immediate abort.
> >>> 
> >>> I am not expecting this to be implemented by you if your hw doesn't
> >>> support it. The core changes are pretty minimal and callback in the
> >>> driver is the one which does the job and yours wont do this
> >> 
> >> Xilinx DMA drivers don't support DMA_RESIDUE_GRANULARITY_BURST so I
> >> can't test this indeed.
> > 
> > Sure I understand that! Am sure folks will respond to CFT and I guess
> > Peter will also be interested in testing.
> 
> s/testing/implementing it/ :-)
> 
> >>>>>>> So, the .terminate_cookie() should be a feature for all type of txn's.
> >>>>>>> If for some reason (dont discount what hw designers can do) a controller
> >>>>>>> supports this for some specific type(s), then they should return
> >>>>>>> -ENOTSUPP for cookies that do not support and let the caller know.
> >>>>>> 
> >>>>>> But then the caller can't know ahead of time, it will only find out when
> >>>>>> it's too late, and can't decide not to use the DMA engine if it doesn't
> >>>>>> support the feature. I don't think that's a very good option.
> >>>>> 
> >>>>> Agreed so lets go with adding these in caps.
> >>>> 
> >>>> So if there's a need for caps anyway, why not a cap that marks
> >>>> .issue_pending() as moving from the current cyclic transfer to the next
> >>>> one ? 
> >>> 
> >>> Is the overhead really too much on that :) If you like I can send the
> >>> core patches and you would need to implement the driver side?
> >> 
> >> We can try that as a compromise. One of main concerns with developing
> >> the core patches myself is that the .terminate_cookie() API still seems
> >> ill-defined to me, so it would be much more efficient if you translate
> > 
> > yeah lets take a stab at defining this and see if we come up with
> > something meaningful
> > 
> >> the idea you have in your idea into code than trying to communicate it
> >> to me in all details (one of the grey areas is what should
> >> .terminate_cookie() do if the cookie passed to the function corresponds
> >> to an already terminated or, more tricky from a completion callback
> >> point of view, an issued but not-yet-started transfer, or also a
> >> submitted but not issued transfer). If you implement the core part, then
> >> that problem will go away.
> >> 
> >> How about the implementation in virt-dma.[ch] by the way ?
> > 
> > It needs to be comprehended and tested as well.. since these are simple
> > callbacks to driver, we should not need huge changes here (i need to
> > double check though)

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-03-18 15:14                                                         ` Laurent Pinchart
@ 2020-03-25 16:00                                                           ` Laurent Pinchart
  0 siblings, 0 replies; 46+ messages in thread
From: Laurent Pinchart @ 2020-03-25 16:00 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Vinod,

On Wed, Mar 18, 2020 at 05:14:27PM +0200, Laurent Pinchart wrote:
> On Wed, Mar 11, 2020 at 05:52:48PM +0200, Laurent Pinchart wrote:
> > On Wed, Mar 04, 2020 at 09:54:26PM +0530, Vinod Koul wrote:
> >> On 04-03-20, 18:00, Laurent Pinchart wrote:
> >>> On Wed, Mar 04, 2020 at 09:07:18PM +0530, Vinod Koul wrote:
> >>>> On 04-03-20, 10:01, Laurent Pinchart wrote:
> >>>>> On Wed, Mar 04, 2020 at 10:43:01AM +0530, Vinod Koul wrote:
> >>>>>> On 03-03-20, 21:22, Laurent Pinchart wrote:
> >>>>>>> On Tue, Mar 03, 2020 at 10:02:54AM +0530, Vinod Koul wrote:
> >>>>>>>> On 02-03-20, 09:37, Laurent Pinchart wrote:
> >>>>>>>>>> I would be more comfortable in calling an API to do so :)
> >>>>>>>>>> The flow I am thinking is:
> >>>>>>>>>> 
> >>>>>>>>>> - prep cyclic1 txn
> >>>>>>>>>> - submit cyclic1 txn
> >>>>>>>>>> - call issue_pending() (cyclic one starts)
> >>>>>>>>>> 
> >>>>>>>>>> - prep cyclic2 txn
> >>>>>>>>>> - submit cyclic2 txn
> >>>>>>>>>> - signal_cyclic1_txn aka terminate_cookie()
> >>>>>>>>>> - cyclic1 completes, switch to cyclic2 (dmaengine driver)
> >>>>>>>>>> - get callback for cyclic1 (optional)
> >>>>>>>>>> 
> >>>>>>>>>> To check if hw supports terminate_cookie() or not we can check if the
> >>>>>>>>>> callback support is implemented
> >>>>>>>>> 
> >>>>>>>>> Two questions though:
> >>>>>>>>> 
> >>>>>>>>> - Where is .issue_pending() called for cyclic2 in your above sequence ?
> >>>>>>>>>   Surely it should be called somewhere, as the DMA engine API requires
> >>>>>>>>>   .issue_pending() to be called for a transfer to be executed, otherwise
> >>>>>>>>>   it stays in the submitted but not pending queue.
> >>>>>>>> 
> >>>>>>>> Sorry missed that one, I would do that after submit cyclic2 txn step and
> >>>>>>>> then signal signal_cyclic1_txn termination
> >>>>>>> 
> >>>>>>> OK, that matches my understanding, good :-)
> >>>>>>> 
> >>>>>>>>> - With the introduction of a new .terminate_cookie() operation, we need
> >>>>>>>>>   to specify that operation for all transfer types. What's its
> >>>>>>>> 
> >>>>>>>> Correct
> >>>>>>>> 
> >>>>>>>>>   envisioned semantics for non-cyclic transfers ? And how do DMA engine
> >>>>>>>>>   drivers report that they support .terminate_cookie() for cyclic
> >>>>>>>>>   transfers but not for other transfer types (the counterpart of
> >>>>>>>>>   reporting, in my proposition, that .issue_pending() isn't supported
> >>>>>>>>>   replace the current cyclic transfer) ?
> >>>>>>>> 
> >>>>>>>> Typically for dmaengine controller cyclic is *not* a special mode, only
> >>>>>>>> change is that a list provided to controller is circular.
> >>>>>>> 
> >>>>>>> I don't agree with this. For cyclic transfers to be replaceable in a
> >>>>>>> clean way, the feature must be specifically implemented at the hardware
> >>>>>>> level. A DMA engine that supports chaining transfers with an explicit
> >>>>>>> way to override that chaining, and without the logic to report if the
> >>>>>>> inherent race was lost or not, really can't support this API.
> >>>>>> 
> >>>>>> Well chaining is a typical feature in dmaengine and making last chain
> >>>>>> point to first makes it circular. I have seen couple of engines and this
> >>>>>> was the implementation in the hardware.
> >>>>>> 
> >>>>>> There can exist special hardware for this purposes as well, but the
> >>>>>> point is that the cyclic can be treated as circular list.
> >>>>>> 
> >>>>>>> Furthemore, for non-cyclic transfers, what would .terminate_cookie() do
> >>>>>>> ? I need it to be defined as terminating the current transfer when it
> >>>>>>> ends for the cyclic case, not terminating it immediately. All non-cyclic
> >>>>>>> transfers terminate by themselves when they end, so what would this new
> >>>>>>> operation do ?
> >>>>>> 
> >>>>>> I would use it for two purposes, cancelling txn but at the end of
> >>>>>> current txn. I have couple of usages where this would be helpful.
> >>>>> 
> >>>>> I fail to see how that would help. Non-cyclic transfers always stop at
> >>>>> the end of the transfer. "Cancelling txn but at the end of current txn"
> >>>>> is what DMA engine drivers already do if you call .terminate_cookie() on
> >>>>> the ongoing transfer. It would thus be a no-op.
> >>>> 
> >>>> Well that actually depends on the hardware, some of them support abort
> >>>> so people cancel it (terminate_all approach atm)
> >>> 
> >>> In that case it's not terminating at the end of the current transfer,
> >>> but terminating immediately (a.k.a. aborting), right ? Cancelling at the
> >>> end of the current transfer still seems to be a no-op to me for
> >>> non-cyclic transfers, as that's what they do on their own already.
> >> 
> >> Correct, it is abort for current txn.
> >> 
> >>>>>> Second in error handling where some engines do not support
> >>>>>> aborting (unless we reset the whole controller)
> >>>>> 
> >>>>> Could you explain that one ? I'm not sure to understand it.
> >>>> 
> >>>> So I have dma to a slow peripheral and it is stuck for some reason. I
> >>>> want to abort the cookie and let subsequent ones runs (btw this is for
> >>>> non cyclic case), so I would use that here. Today we terminate_all and
> >>>> then resubmit...
> >>> 
> >>> That's also for immediate abort, right ?
> >> 
> >> Right
> >> 
> >>> For this to work properly we need very accurate residue reporting, as
> >>> the client will usually need to know exactly what has been transferred.
> >>> The device would need to support DMA_RESIDUE_GRANULARITY_BURST when
> >>> aborting an ongoing transfer. What hardware supports this ?
> >> 
> >>  git grep DMA_RESIDUE_GRANULARITY_BURST drivers/dma/ |wc -l
> >> 27
> >> 
> >> So it seems many do support the burst reporting.
> > 
> > Yes, but not all of those may support aborting a transfer *and*
> > reporting the exact residue of cancelled transfers. We need both to
> > implement your proposal.
> > 
> >>>>>> But yes the .terminate_cookie() semantics should indicate if the
> >>>>>> termination should be immediate or end of current txn. I see people
> >>>>>> using it for both.
> >>>>> 
> >>>>> Immediate termination is *not* something I'll implement as I have no
> >>>>> good way to test that semantics. I assume you would be fine with leaving
> >>>>> that for later, when someone will need it ?
> >>>> 
> >>>> Sure, if you have hw to support please test. If not, you will not
> >>>> implement that.
> >>>> 
> >>>> The point is that API should support it and people can add support in
> >>>> the controllers and test :)
> >>> 
> >>> I still think this is a different API. We'll have
> >>> 
> >>> 1. Existing .issue_pending(), queueing the next transfer for non-cyclic
> >>>    cases, and being a no-op for cyclic cases.
> >>> 2. New .terminate_cookie(AT_END_OF_TRANSFER), being a no-op for
> >>>    non-cyclic cases, and moving to the next transfer for cyclic cases.
> >>> 3. New .terminate_cookie(ABORT_IMMEDIATELY), applicable to both cyclic
> >>>    and non-cyclic cases.
> >>> 
> >>> 3. is an API I don't need, and can't easily test. I agree that it can
> >>> have use cases (provided the DMA device can abort an ongoing transfer
> >>> *and* still support DMA_RESIDUE_GRANULARITY_BURST in that case).
> >>> 
> >>> I'm troubled by my inability to convince you that 1. and 2. are really
> >>> the same, with 1. addressing the non-cyclic case and 2. addressing the
> >>> cyclic case :-) This is why I think they should both be implemeted using
> >>> .issue_pending() (no other option for 1., that's what it uses today).
> >>> This wouldn't prevent implementing 3. with a new .terminate_cookie()
> >>> operation, that wouldn't need to take a flag as it would always operate
> >>> in ABORT_IMMEDIATELY mode. There would also be no need to report a new
> >>> capability for 3., as the presence of the .terminate_cookie() handler
> >>> would be enough to tell clients that the API is supported. Only a new
> >>> capability for 2. would be needed.
> >> 
> >> Well I agree 1 & 2 seem similar but I would like to define the behaviour
> >> not dependent on the txn being cyclic or not. That is my concern and
> >> hence the idea that:
> >> 
> >> 1. .issue_pending() will push txn to pending_queue, you may have a case
> >> where that is done only once (due to nature of txn), but no other
> >> implication
> >> 
> >> 2. .terminate_cookie(EOT) will abort the transfer at the end. Maybe not
> >> used for cyclic but irrespective of that, the behaviour would be abort
> >> at end of cyclic
> > 
> > Did you mean "maybe not used for non-cyclic" ?
> > 
> >> 3. .terminate_cookie(IMMEDIATE) will abort immediately. If there is
> >> anything in pending_queue that will get pushed to hardware.
> >> 
> >> 4. Cyclic by nature never completes
> >>    - as a consequence needs to be stopped by terminate_all/terminate_cookie
> >> 
> >> Does these rules make sense :)
> > 
> > It's a set of rules that I think can handle my use case, but I still
> > believe my proposal based on just .issue_pending() would be simpler, in
> > line with the existing API concepts, and wouldn't preclude the addition
> > of .terminate_cookie(IMMEDIATE) at a later point. It's your call though,
> > especially if you provide the implementation :-) When do you think you
> > will be able to do so ?
> 
> Gentle ping :-)

Any update ?

> >>>>>> And with this I think it would make sense to also add this to
> >>>>>> capabilities :)
> >>>>> 
> >>>>> I'll repeat the comment I made to Peter: you want me to implement a
> >>>>> feature that you think would be useful, but is completely unrelated to
> >>>>> my use case, while there's a more natural way to handle my issue with
> >>>>> the current API, without precluding in any way the addition of your new
> >>>>> feature in the future. Not fair.
> >>>> 
> >>>> So from API design pov, I would like this to support both the features.
> >>>> This helps us to not rework the API again for the immediate abort.
> >>>> 
> >>>> I am not expecting this to be implemented by you if your hw doesn't
> >>>> support it. The core changes are pretty minimal and callback in the
> >>>> driver is the one which does the job and yours wont do this
> >>> 
> >>> Xilinx DMA drivers don't support DMA_RESIDUE_GRANULARITY_BURST so I
> >>> can't test this indeed.
> >> 
> >> Sure I understand that! Am sure folks will respond to CFT and I guess
> >> Peter will also be interested in testing.
> > 
> > s/testing/implementing it/ :-)
> > 
> >>>>>>>> So, the .terminate_cookie() should be a feature for all type of txn's.
> >>>>>>>> If for some reason (dont discount what hw designers can do) a controller
> >>>>>>>> supports this for some specific type(s), then they should return
> >>>>>>>> -ENOTSUPP for cookies that do not support and let the caller know.
> >>>>>>> 
> >>>>>>> But then the caller can't know ahead of time, it will only find out when
> >>>>>>> it's too late, and can't decide not to use the DMA engine if it doesn't
> >>>>>>> support the feature. I don't think that's a very good option.
> >>>>>> 
> >>>>>> Agreed so lets go with adding these in caps.
> >>>>> 
> >>>>> So if there's a need for caps anyway, why not a cap that marks
> >>>>> .issue_pending() as moving from the current cyclic transfer to the next
> >>>>> one ? 
> >>>> 
> >>>> Is the overhead really too much on that :) If you like I can send the
> >>>> core patches and you would need to implement the driver side?
> >>> 
> >>> We can try that as a compromise. One of main concerns with developing
> >>> the core patches myself is that the .terminate_cookie() API still seems
> >>> ill-defined to me, so it would be much more efficient if you translate
> >> 
> >> yeah lets take a stab at defining this and see if we come up with
> >> something meaningful
> >> 
> >>> the idea you have in your idea into code than trying to communicate it
> >>> to me in all details (one of the grey areas is what should
> >>> .terminate_cookie() do if the cookie passed to the function corresponds
> >>> to an already terminated or, more tricky from a completion callback
> >>> point of view, an issued but not-yet-started transfer, or also a
> >>> submitted but not issued transfer). If you implement the core part, then
> >>> that problem will go away.
> >>> 
> >>> How about the implementation in virt-dma.[ch] by the way ?
> >> 
> >> It needs to be comprehended and tested as well.. since these are simple
> >> callbacks to driver, we should not need huge changes here (i need to
> >> double check though)

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
       [not found]                                                       ` <20200311155248.GA4772@pendragon.ideasonboard.com>
  2020-03-18 15:14                                                         ` Laurent Pinchart
@ 2020-03-26  7:02                                                         ` Vinod Koul
  2020-04-08 17:00                                                           ` Laurent Pinchart
  1 sibling, 1 reply; 46+ messages in thread
From: Vinod Koul @ 2020-03-26  7:02 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Laurent,

Sorry for delay in replying..

On 11-03-20, 17:52, Laurent Pinchart wrote:
> On Wed, Mar 04, 2020 at 09:54:26PM +0530, Vinod Koul wrote:
> > >>>> Second in error handling where some engines do not support
> > >>>> aborting (unless we reset the whole controller)
> > >>> 
> > >>> Could you explain that one ? I'm not sure to understand it.
> > >> 
> > >> So I have dma to a slow peripheral and it is stuck for some reason. I
> > >> want to abort the cookie and let subsequent ones runs (btw this is for
> > >> non cyclic case), so I would use that here. Today we terminate_all and
> > >> then resubmit...
> > > 
> > > That's also for immediate abort, right ?
> > 
> > Right
> > 
> > > For this to work properly we need very accurate residue reporting, as
> > > the client will usually need to know exactly what has been transferred.
> > > The device would need to support DMA_RESIDUE_GRANULARITY_BURST when
> > > aborting an ongoing transfer. What hardware supports this ?
> > 
> >  git grep DMA_RESIDUE_GRANULARITY_BURST drivers/dma/ |wc -l
> > 27
> > 
> > So it seems many do support the burst reporting.
> 
> Yes, but not all of those may support aborting a transfer *and*
> reporting the exact residue of cancelled transfers. We need both to
> implement your proposal.

Reporting residue is already implemented, please see  struct
dmaengine_result. This can be passed by a callback
dma_async_tx_callback_result() in struct dma_async_tx_descriptor.

> > >>>> But yes the .terminate_cookie() semantics should indicate if the
> > >>>> termination should be immediate or end of current txn. I see people
> > >>>> using it for both.
> > >>> 
> > >>> Immediate termination is *not* something I'll implement as I have no
> > >>> good way to test that semantics. I assume you would be fine with leaving
> > >>> that for later, when someone will need it ?
> > >> 
> > >> Sure, if you have hw to support please test. If not, you will not
> > >> implement that.
> > >> 
> > >> The point is that API should support it and people can add support in
> > >> the controllers and test :)
> > > 
> > > I still think this is a different API. We'll have
> > > 
> > > 1. Existing .issue_pending(), queueing the next transfer for non-cyclic
> > >    cases, and being a no-op for cyclic cases.
> > > 2. New .terminate_cookie(AT_END_OF_TRANSFER), being a no-op for
> > >    non-cyclic cases, and moving to the next transfer for cyclic cases.
> > > 3. New .terminate_cookie(ABORT_IMMEDIATELY), applicable to both cyclic
> > >    and non-cyclic cases.
> > > 
> > > 3. is an API I don't need, and can't easily test. I agree that it can
> > > have use cases (provided the DMA device can abort an ongoing transfer
> > > *and* still support DMA_RESIDUE_GRANULARITY_BURST in that case).
> > > 
> > > I'm troubled by my inability to convince you that 1. and 2. are really
> > > the same, with 1. addressing the non-cyclic case and 2. addressing the
> > > cyclic case :-) This is why I think they should both be implemeted using
> > > .issue_pending() (no other option for 1., that's what it uses today).
> > > This wouldn't prevent implementing 3. with a new .terminate_cookie()
> > > operation, that wouldn't need to take a flag as it would always operate
> > > in ABORT_IMMEDIATELY mode. There would also be no need to report a new
> > > capability for 3., as the presence of the .terminate_cookie() handler
> > > would be enough to tell clients that the API is supported. Only a new
> > > capability for 2. would be needed.
> > 
> > Well I agree 1 & 2 seem similar but I would like to define the behaviour
> > not dependent on the txn being cyclic or not. That is my concern and
> > hence the idea that:
> > 
> > 1. .issue_pending() will push txn to pending_queue, you may have a case
> > where that is done only once (due to nature of txn), but no other
> > implication
> > 
> > 2. .terminate_cookie(EOT) will abort the transfer at the end. Maybe not
> > used for cyclic but irrespective of that, the behaviour would be abort
> > at end of cyclic
> 
> Did you mean "maybe not used for non-cyclic" ?

Yes I think so..

> > 3. .terminate_cookie(IMMEDIATE) will abort immediately. If there is
> > anything in pending_queue that will get pushed to hardware.
> > 
> > 4. Cyclic by nature never completes
> >    - as a consequence needs to be stopped by terminate_all/terminate_cookie
> > 
> > Does these rules make sense :)
> 
> It's a set of rules that I think can handle my use case, but I still
> believe my proposal based on just .issue_pending() would be simpler, in
> line with the existing API concepts, and wouldn't preclude the addition
> of .terminate_cookie(IMMEDIATE) at a later point. It's your call though,
> especially if you provide the implementation :-) When do you think you
> will be able to do so ?

I will try to take a stab at it once merge window opens.. will let you
and Peter for sneak preview once I start on it :)

> > >>>> And with this I think it would make sense to also add this to
> > >>>> capabilities :)
> > >>> 
> > >>> I'll repeat the comment I made to Peter: you want me to implement a
> > >>> feature that you think would be useful, but is completely unrelated to
> > >>> my use case, while there's a more natural way to handle my issue with
> > >>> the current API, without precluding in any way the addition of your new
> > >>> feature in the future. Not fair.
> > >> 
> > >> So from API design pov, I would like this to support both the features.
> > >> This helps us to not rework the API again for the immediate abort.
> > >> 
> > >> I am not expecting this to be implemented by you if your hw doesn't
> > >> support it. The core changes are pretty minimal and callback in the
> > >> driver is the one which does the job and yours wont do this
> > > 
> > > Xilinx DMA drivers don't support DMA_RESIDUE_GRANULARITY_BURST so I
> > > can't test this indeed.
> > 
> > Sure I understand that! Am sure folks will respond to CFT and I guess
> > Peter will also be interested in testing.
> 
> s/testing/implementing it/ :-)

Even better :)

-- 
~Vinod

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-03-26  7:02                                                         ` Vinod Koul
@ 2020-04-08 17:00                                                           ` Laurent Pinchart
  2020-04-15 15:12                                                             ` Laurent Pinchart
  0 siblings, 1 reply; 46+ messages in thread
From: Laurent Pinchart @ 2020-04-08 17:00 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Vinod,

On Thu, Mar 26, 2020 at 12:32:34PM +0530, Vinod Koul wrote:
> On 11-03-20, 17:52, Laurent Pinchart wrote:
> > On Wed, Mar 04, 2020 at 09:54:26PM +0530, Vinod Koul wrote:
> >>>>>> Second in error handling where some engines do not support
> >>>>>> aborting (unless we reset the whole controller)
> >>>>> 
> >>>>> Could you explain that one ? I'm not sure to understand it.
> >>>> 
> >>>> So I have dma to a slow peripheral and it is stuck for some reason. I
> >>>> want to abort the cookie and let subsequent ones runs (btw this is for
> >>>> non cyclic case), so I would use that here. Today we terminate_all and
> >>>> then resubmit...
> >>> 
> >>> That's also for immediate abort, right ?
> >> 
> >> Right
> >> 
> >>> For this to work properly we need very accurate residue reporting, as
> >>> the client will usually need to know exactly what has been transferred.
> >>> The device would need to support DMA_RESIDUE_GRANULARITY_BURST when
> >>> aborting an ongoing transfer. What hardware supports this ?
> >> 
> >>  git grep DMA_RESIDUE_GRANULARITY_BURST drivers/dma/ |wc -l
> >> 27
> >> 
> >> So it seems many do support the burst reporting.
> > 
> > Yes, but not all of those may support aborting a transfer *and*
> > reporting the exact residue of cancelled transfers. We need both to
> > implement your proposal.
> 
> Reporting residue is already implemented, please see  struct
> dmaengine_result. This can be passed by a callback
> dma_async_tx_callback_result() in struct dma_async_tx_descriptor.

I mean that I don't know if the driver that support
DMA_RESIDUE_GRANULARITY_BURST only support reporting the residue when
the transfer is active, or also support reporting it when cancelling a
transfer. Maybe all of them do, maybe a subset of them do, so I can't
tell if this would be a feature that could be widely supported.

> >>>>>> But yes the .terminate_cookie() semantics should indicate if the
> >>>>>> termination should be immediate or end of current txn. I see people
> >>>>>> using it for both.
> >>>>> 
> >>>>> Immediate termination is *not* something I'll implement as I have no
> >>>>> good way to test that semantics. I assume you would be fine with leaving
> >>>>> that for later, when someone will need it ?
> >>>> 
> >>>> Sure, if you have hw to support please test. If not, you will not
> >>>> implement that.
> >>>> 
> >>>> The point is that API should support it and people can add support in
> >>>> the controllers and test :)
> >>> 
> >>> I still think this is a different API. We'll have
> >>> 
> >>> 1. Existing .issue_pending(), queueing the next transfer for non-cyclic
> >>>    cases, and being a no-op for cyclic cases.
> >>> 2. New .terminate_cookie(AT_END_OF_TRANSFER), being a no-op for
> >>>    non-cyclic cases, and moving to the next transfer for cyclic cases.
> >>> 3. New .terminate_cookie(ABORT_IMMEDIATELY), applicable to both cyclic
> >>>    and non-cyclic cases.
> >>> 
> >>> 3. is an API I don't need, and can't easily test. I agree that it can
> >>> have use cases (provided the DMA device can abort an ongoing transfer
> >>> *and* still support DMA_RESIDUE_GRANULARITY_BURST in that case).
> >>> 
> >>> I'm troubled by my inability to convince you that 1. and 2. are really
> >>> the same, with 1. addressing the non-cyclic case and 2. addressing the
> >>> cyclic case :-) This is why I think they should both be implemeted using
> >>> .issue_pending() (no other option for 1., that's what it uses today).
> >>> This wouldn't prevent implementing 3. with a new .terminate_cookie()
> >>> operation, that wouldn't need to take a flag as it would always operate
> >>> in ABORT_IMMEDIATELY mode. There would also be no need to report a new
> >>> capability for 3., as the presence of the .terminate_cookie() handler
> >>> would be enough to tell clients that the API is supported. Only a new
> >>> capability for 2. would be needed.
> >> 
> >> Well I agree 1 & 2 seem similar but I would like to define the behaviour
> >> not dependent on the txn being cyclic or not. That is my concern and
> >> hence the idea that:
> >> 
> >> 1. .issue_pending() will push txn to pending_queue, you may have a case
> >> where that is done only once (due to nature of txn), but no other
> >> implication
> >> 
> >> 2. .terminate_cookie(EOT) will abort the transfer at the end. Maybe not
> >> used for cyclic but irrespective of that, the behaviour would be abort
> >> at end of cyclic
> > 
> > Did you mean "maybe not used for non-cyclic" ?
> 
> Yes I think so..
> 
> >> 3. .terminate_cookie(IMMEDIATE) will abort immediately. If there is
> >> anything in pending_queue that will get pushed to hardware.
> >> 
> >> 4. Cyclic by nature never completes
> >>    - as a consequence needs to be stopped by terminate_all/terminate_cookie
> >> 
> >> Does these rules make sense :)
> > 
> > It's a set of rules that I think can handle my use case, but I still
> > believe my proposal based on just .issue_pending() would be simpler, in
> > line with the existing API concepts, and wouldn't preclude the addition
> > of .terminate_cookie(IMMEDIATE) at a later point. It's your call though,
> > especially if you provide the implementation :-) When do you think you
> > will be able to do so ?
> 
> I will try to take a stab at it once merge window opens.. will let you
> and Peter for sneak preview once I start on it :)

I started giving it a try as this has been blocked for two months and a
half now.

I very quickly ran into issues as the interface is ill-defined as it
stands.

- What should happen when .terminate_cookie(EOT) is called with no other
  transfer issued, and a new transfer is issued before the current
  transfer terminates ?

- I expect .terminate_cookie() to be asynchronous, as .terminate_all().
  This means that actual termination of cyclic transfers will actually
  be handled at end of transfer, in the interrupt handler. This creates
  race conditions with other operations. It would also make it much more
  difficult to support this feature for devices that require sleeping
  when stopping the DMA engine at the end of a cyclic transfer.

If we have to go forward with this new API, I need a detailed
explanation of how all this should be handled. I still truly believe
this is a case of yak shaving that introduces additional complexity for
absolutely no valid reason, when a solution that is aligned with the
existing API and its concepts exists already. It's your decision as the
subsystem maintainer, but if you want something more complex, please
provide it soon. I don't want to wait another three months to see
progress on this issue.

> >>>>>> And with this I think it would make sense to also add this to
> >>>>>> capabilities :)
> >>>>> 
> >>>>> I'll repeat the comment I made to Peter: you want me to implement a
> >>>>> feature that you think would be useful, but is completely unrelated to
> >>>>> my use case, while there's a more natural way to handle my issue with
> >>>>> the current API, without precluding in any way the addition of your new
> >>>>> feature in the future. Not fair.
> >>>> 
> >>>> So from API design pov, I would like this to support both the features.
> >>>> This helps us to not rework the API again for the immediate abort.
> >>>> 
> >>>> I am not expecting this to be implemented by you if your hw doesn't
> >>>> support it. The core changes are pretty minimal and callback in the
> >>>> driver is the one which does the job and yours wont do this
> >>> 
> >>> Xilinx DMA drivers don't support DMA_RESIDUE_GRANULARITY_BURST so I
> >>> can't test this indeed.
> >> 
> >> Sure I understand that! Am sure folks will respond to CFT and I guess
> >> Peter will also be interested in testing.
> > 
> > s/testing/implementing it/ :-)
> 
> Even better :)

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type
  2020-04-08 17:00                                                           ` Laurent Pinchart
@ 2020-04-15 15:12                                                             ` Laurent Pinchart
  0 siblings, 0 replies; 46+ messages in thread
From: Laurent Pinchart @ 2020-04-15 15:12 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Peter Ujfalusi, dmaengine, Michal Simek, Hyun Kwon,
	Tejas Upadhyay, Satish Kumar Nagireddy

Hi Vinod,

Ping. We need a solution to this problem, it's been way too long
already. If you don't want to accept my proposal, please provide me with
an implementation or a very detailed spec I can implement.

On Wed, Apr 08, 2020 at 08:00:49PM +0300, Laurent Pinchart wrote:
> On Thu, Mar 26, 2020 at 12:32:34PM +0530, Vinod Koul wrote:
> > On 11-03-20, 17:52, Laurent Pinchart wrote:
> >> On Wed, Mar 04, 2020 at 09:54:26PM +0530, Vinod Koul wrote:
> >>>>>>> Second in error handling where some engines do not support
> >>>>>>> aborting (unless we reset the whole controller)
> >>>>>> 
> >>>>>> Could you explain that one ? I'm not sure to understand it.
> >>>>> 
> >>>>> So I have dma to a slow peripheral and it is stuck for some reason. I
> >>>>> want to abort the cookie and let subsequent ones runs (btw this is for
> >>>>> non cyclic case), so I would use that here. Today we terminate_all and
> >>>>> then resubmit...
> >>>> 
> >>>> That's also for immediate abort, right ?
> >>> 
> >>> Right
> >>> 
> >>>> For this to work properly we need very accurate residue reporting, as
> >>>> the client will usually need to know exactly what has been transferred.
> >>>> The device would need to support DMA_RESIDUE_GRANULARITY_BURST when
> >>>> aborting an ongoing transfer. What hardware supports this ?
> >>> 
> >>> git grep DMA_RESIDUE_GRANULARITY_BURST drivers/dma/ |wc -l
> >>> 27
> >>> 
> >>> So it seems many do support the burst reporting.
> >> 
> >> Yes, but not all of those may support aborting a transfer *and*
> >> reporting the exact residue of cancelled transfers. We need both to
> >> implement your proposal.
> > 
> > Reporting residue is already implemented, please see  struct
> > dmaengine_result. This can be passed by a callback
> > dma_async_tx_callback_result() in struct dma_async_tx_descriptor.
> 
> I mean that I don't know if the driver that support
> DMA_RESIDUE_GRANULARITY_BURST only support reporting the residue when
> the transfer is active, or also support reporting it when cancelling a
> transfer. Maybe all of them do, maybe a subset of them do, so I can't
> tell if this would be a feature that could be widely supported.
> 
> >>>>>>> But yes the .terminate_cookie() semantics should indicate if the
> >>>>>>> termination should be immediate or end of current txn. I see people
> >>>>>>> using it for both.
> >>>>>> 
> >>>>>> Immediate termination is *not* something I'll implement as I have no
> >>>>>> good way to test that semantics. I assume you would be fine with leaving
> >>>>>> that for later, when someone will need it ?
> >>>>> 
> >>>>> Sure, if you have hw to support please test. If not, you will not
> >>>>> implement that.
> >>>>> 
> >>>>> The point is that API should support it and people can add support in
> >>>>> the controllers and test :)
> >>>> 
> >>>> I still think this is a different API. We'll have
> >>>> 
> >>>> 1. Existing .issue_pending(), queueing the next transfer for non-cyclic
> >>>>    cases, and being a no-op for cyclic cases.
> >>>> 2. New .terminate_cookie(AT_END_OF_TRANSFER), being a no-op for
> >>>>    non-cyclic cases, and moving to the next transfer for cyclic cases.
> >>>> 3. New .terminate_cookie(ABORT_IMMEDIATELY), applicable to both cyclic
> >>>>    and non-cyclic cases.
> >>>> 
> >>>> 3. is an API I don't need, and can't easily test. I agree that it can
> >>>> have use cases (provided the DMA device can abort an ongoing transfer
> >>>> *and* still support DMA_RESIDUE_GRANULARITY_BURST in that case).
> >>>> 
> >>>> I'm troubled by my inability to convince you that 1. and 2. are really
> >>>> the same, with 1. addressing the non-cyclic case and 2. addressing the
> >>>> cyclic case :-) This is why I think they should both be implemeted using
> >>>> .issue_pending() (no other option for 1., that's what it uses today).
> >>>> This wouldn't prevent implementing 3. with a new .terminate_cookie()
> >>>> operation, that wouldn't need to take a flag as it would always operate
> >>>> in ABORT_IMMEDIATELY mode. There would also be no need to report a new
> >>>> capability for 3., as the presence of the .terminate_cookie() handler
> >>>> would be enough to tell clients that the API is supported. Only a new
> >>>> capability for 2. would be needed.
> >>> 
> >>> Well I agree 1 & 2 seem similar but I would like to define the behaviour
> >>> not dependent on the txn being cyclic or not. That is my concern and
> >>> hence the idea that:
> >>> 
> >>> 1. .issue_pending() will push txn to pending_queue, you may have a case
> >>> where that is done only once (due to nature of txn), but no other
> >>> implication
> >>> 
> >>> 2. .terminate_cookie(EOT) will abort the transfer at the end. Maybe not
> >>> used for cyclic but irrespective of that, the behaviour would be abort
> >>> at end of cyclic
> >> 
> >> Did you mean "maybe not used for non-cyclic" ?
> > 
> > Yes I think so..
> > 
> >>> 3. .terminate_cookie(IMMEDIATE) will abort immediately. If there is
> >>> anything in pending_queue that will get pushed to hardware.
> >>> 
> >>> 4. Cyclic by nature never completes
> >>>    - as a consequence needs to be stopped by terminate_all/terminate_cookie
> >>> 
> >>> Does these rules make sense :)
> >> 
> >> It's a set of rules that I think can handle my use case, but I still
> >> believe my proposal based on just .issue_pending() would be simpler, in
> >> line with the existing API concepts, and wouldn't preclude the addition
> >> of .terminate_cookie(IMMEDIATE) at a later point. It's your call though,
> >> especially if you provide the implementation :-) When do you think you
> >> will be able to do so ?
> > 
> > I will try to take a stab at it once merge window opens.. will let you
> > and Peter for sneak preview once I start on it :)
> 
> I started giving it a try as this has been blocked for two months and a
> half now.
> 
> I very quickly ran into issues as the interface is ill-defined as it
> stands.
> 
> - What should happen when .terminate_cookie(EOT) is called with no other
>   transfer issued, and a new transfer is issued before the current
>   transfer terminates ?
> 
> - I expect .terminate_cookie() to be asynchronous, as .terminate_all().
>   This means that actual termination of cyclic transfers will actually
>   be handled at end of transfer, in the interrupt handler. This creates
>   race conditions with other operations. It would also make it much more
>   difficult to support this feature for devices that require sleeping
>   when stopping the DMA engine at the end of a cyclic transfer.
> 
> If we have to go forward with this new API, I need a detailed
> explanation of how all this should be handled. I still truly believe
> this is a case of yak shaving that introduces additional complexity for
> absolutely no valid reason, when a solution that is aligned with the
> existing API and its concepts exists already. It's your decision as the
> subsystem maintainer, but if you want something more complex, please
> provide it soon. I don't want to wait another three months to see
> progress on this issue.
> 
> >>>>>>> And with this I think it would make sense to also add this to
> >>>>>>> capabilities :)
> >>>>>> 
> >>>>>> I'll repeat the comment I made to Peter: you want me to implement a
> >>>>>> feature that you think would be useful, but is completely unrelated to
> >>>>>> my use case, while there's a more natural way to handle my issue with
> >>>>>> the current API, without precluding in any way the addition of your new
> >>>>>> feature in the future. Not fair.
> >>>>> 
> >>>>> So from API design pov, I would like this to support both the features.
> >>>>> This helps us to not rework the API again for the immediate abort.
> >>>>> 
> >>>>> I am not expecting this to be implemented by you if your hw doesn't
> >>>>> support it. The core changes are pretty minimal and callback in the
> >>>>> driver is the one which does the job and yours wont do this
> >>>> 
> >>>> Xilinx DMA drivers don't support DMA_RESIDUE_GRANULARITY_BURST so I
> >>>> can't test this indeed.
> >>> 
> >>> Sure I understand that! Am sure folks will respond to CFT and I guess
> >>> Peter will also be interested in testing.
> >> 
> >> s/testing/implementing it/ :-)
> > 
> > Even better :)

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2020-04-15 15:13 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-23  2:29 [PATCH v3 0/6] dma: Add Xilinx ZynqMP DPDMA driver Laurent Pinchart
2020-01-23  2:29 ` [PATCH v3 1/6] dt: bindings: dma: xilinx: dpdma: DT bindings for Xilinx DPDMA Laurent Pinchart
2020-01-23  2:29 ` [PATCH v3 2/6] dmaengine: Add interleaved cyclic transaction type Laurent Pinchart
2020-01-23  8:03   ` Peter Ujfalusi
2020-01-23  8:43     ` Vinod Koul
2020-01-23  8:51       ` Peter Ujfalusi
2020-01-23 12:23         ` Laurent Pinchart
2020-01-24  6:10           ` Vinod Koul
2020-01-24  8:50             ` Laurent Pinchart
2020-02-10 14:06               ` Laurent Pinchart
2020-02-13 13:29                 ` Vinod Koul
2020-02-13 13:48                   ` Laurent Pinchart
2020-02-13 14:07                     ` Vinod Koul
2020-02-13 14:15                       ` Peter Ujfalusi
2020-02-13 16:52                         ` Laurent Pinchart
2020-02-14  4:23                           ` Vinod Koul
2020-02-14 16:22                             ` Laurent Pinchart
2020-02-17 10:00                               ` Peter Ujfalusi
2020-02-19  9:25                                 ` Vinod Koul
2020-02-26 16:30                                   ` Laurent Pinchart
2020-03-02  3:47                                     ` Vinod Koul
2020-03-02  7:37                                       ` Laurent Pinchart
2020-03-03  4:32                                         ` Vinod Koul
2020-03-03 19:22                                           ` Laurent Pinchart
2020-03-04  5:13                                             ` Vinod Koul
2020-03-04  8:01                                               ` Laurent Pinchart
2020-03-04 15:37                                                 ` Vinod Koul
2020-03-04 16:00                                                   ` Laurent Pinchart
2020-03-04 16:24                                                     ` Vinod Koul
     [not found]                                                       ` <20200311155248.GA4772@pendragon.ideasonboard.com>
2020-03-18 15:14                                                         ` Laurent Pinchart
2020-03-25 16:00                                                           ` Laurent Pinchart
2020-03-26  7:02                                                         ` Vinod Koul
2020-04-08 17:00                                                           ` Laurent Pinchart
2020-04-15 15:12                                                             ` Laurent Pinchart
2020-03-06 14:49                                                     ` Peter Ujfalusi
2020-03-11 23:15                                                       ` Laurent Pinchart
2020-02-26 16:24                                 ` Laurent Pinchart
2020-03-02  3:42                                   ` Vinod Koul
2020-01-24  7:20           ` Peter Ujfalusi
2020-01-24  7:38             ` Peter Ujfalusi
2020-01-24  8:58               ` Laurent Pinchart
2020-01-24  8:56             ` Laurent Pinchart
2020-01-23  2:29 ` [PATCH v3 3/6] dmaengine: virt-dma: Use lockdep to check locking requirements Laurent Pinchart
2020-01-23  2:29 ` [PATCH v3 4/6] dmaengine: xilinx: dpdma: Add the Xilinx DisplayPort DMA engine driver Laurent Pinchart
2020-01-23  2:29 ` [PATCH v3 5/6] dmaengine: xilinx: dpdma: Add debugfs support Laurent Pinchart
2020-01-23  2:29 ` [PATCH v3 6/6] arm64: dts: zynqmp: Add DPDMA node Laurent Pinchart

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).