dmaengine.vger.kernel.org archive mirror
* [RFC v2 PATCH 0/7] Xilinx DMA enhancements and optimization
@ 2021-04-09 17:55 Radhey Shyam Pandey
  2021-04-09 17:55 ` [RFC v2 PATCH 1/7] dt-bindings: dmaengine: xilinx_dma: Add xlnx,axistream-connected property Radhey Shyam Pandey
                   ` (7 more replies)
  0 siblings, 8 replies; 20+ messages in thread
From: Radhey Shyam Pandey @ 2021-04-09 17:55 UTC (permalink / raw)
  To: vkoul, robh+dt, michal.simek
  Cc: dmaengine, devicetree, linux-arm-kernel, linux-kernel, git,
	Radhey Shyam Pandey


Some background on this patch series: the Xilinx AXI Ethernet device driver
(xilinx_axienet_main.c) currently carries its own axi-dma code. The goal
is to refactor the axiethernet driver so that it uses the existing AXI DMA
driver through the DMAEngine API.

This patchset adds features and optimizations to support axidma
integration with the axiethernet network driver. Once the axidma
version is accepted, mcdma-specific changes will be added in a
follow-up version.

This series is based on dmaengine tree commit: #a38fd8748464

Changes for v2:
- Use metadata API[1] for passing metadata from dma to netdev client.
- Read irq-delay from DT.
- Remove desc_callback_valid check.
- Addressed RFC v1 comments[2].
- Minor code refactoring.

Comments, suggestions are very welcome!


[1] https://www.spinics.net/lists/dmaengine/msg16583.html
[2] https://www.spinics.net/lists/dmaengine/msg15208.html

Radhey Shyam Pandey (7):
  dt-bindings: dmaengine: xilinx_dma: Add xlnx,axistream-connected
    property
  dt-bindings: dmaengine: xilinx_dma: Add xlnx,irq-delay property
  dmaengine: xilinx_dma: Pass AXI4-Stream control words to dma client
  dmaengine: xilinx_dma: Increase AXI DMA transaction segment count
  dmaengine: xilinx_dma: Freeup active list based on descriptor
    completion bit
  dmaengine: xilinx_dma: Use tasklet_hi_schedule for timing critical
    usecase
  dmaengine: xilinx_dma: Program interrupt delay timeout

 .../devicetree/bindings/dma/xilinx/xilinx_dma.txt  |  4 ++
 drivers/dma/xilinx/xilinx_dma.c                    | 68 ++++++++++++++++++----
 2 files changed, 61 insertions(+), 11 deletions(-)

-- 
2.7.4



* [RFC v2 PATCH 1/7] dt-bindings: dmaengine: xilinx_dma: Add xlnx,axistream-connected property
  2021-04-09 17:55 [RFC v2 PATCH 0/7] Xilinx DMA enhancements and optimization Radhey Shyam Pandey
@ 2021-04-09 17:55 ` Radhey Shyam Pandey
  2021-04-12 18:25   ` Rob Herring
  2021-04-09 17:56 ` [RFC v2 PATCH 2/7] dt-bindings: dmaengine: xilinx_dma: Add xlnx,irq-delay property Radhey Shyam Pandey
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Radhey Shyam Pandey @ 2021-04-09 17:55 UTC (permalink / raw)
  To: vkoul, robh+dt, michal.simek
  Cc: dmaengine, devicetree, linux-arm-kernel, linux-kernel, git,
	Radhey Shyam Pandey

Add an optional DMA property 'xlnx,axistream-connected'. It can be
specified to indicate that the DMA is connected to a streaming IP in the
hardware design and that the dma driver needs to do some additional
handling, i.e. pass metadata and perform streaming-IP-specific
configuration.

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
---
Changes for v2:
- Rename xlnx,axieth-connected to xlnx,axistream-connected to
  make it generic.
---
 Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt b/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt
index 325aca52cd43..f5f23a4a4467 100644
--- a/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt
+++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt
@@ -49,6 +49,8 @@ Optional properties for AXI DMA and MCDMA:
 	register as configured in h/w. Takes values {8...26}. If the property
 	is missing or invalid then the default value 23 is used. This is the
 	maximum value that is supported by all IP versions.
+- xlnx,axistream-connected: Tells whether DMA is connected to AXI stream IP.
+
 Optional properties for VDMA:
 - xlnx,flush-fsync: Tells which channel to Flush on Frame sync.
 	It takes following values:
-- 
2.7.4



* [RFC v2 PATCH 2/7] dt-bindings: dmaengine: xilinx_dma: Add xlnx,irq-delay property
  2021-04-09 17:55 [RFC v2 PATCH 0/7] Xilinx DMA enhancements and optimization Radhey Shyam Pandey
  2021-04-09 17:55 ` [RFC v2 PATCH 1/7] dt-bindings: dmaengine: xilinx_dma: Add xlnx,axistream-connected property Radhey Shyam Pandey
@ 2021-04-09 17:56 ` Radhey Shyam Pandey
  2021-04-12 18:25   ` Rob Herring
  2021-04-09 17:56 ` [RFC v2 PATCH 3/7] dmaengine: xilinx_dma: Pass AXI4-Stream control words to dma client Radhey Shyam Pandey
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Radhey Shyam Pandey @ 2021-04-09 17:56 UTC (permalink / raw)
  To: vkoul, robh+dt, michal.simek
  Cc: dmaengine, devicetree, linux-arm-kernel, linux-kernel, git,
	Radhey Shyam Pandey

Add an optional AXI DMA property 'xlnx,irq-delay'. It specifies the
interrupt timeout value and causes the DMA engine to generate an interrupt
after the delay time period has expired. The timer begins counting at the
end of a packet and resets when a new packet is received or a timeout
event occurs.

This property is useful when the AXI DMA is connected to a streaming IP,
e.g. axiethernet, where inter-packet latency is critical but interrupt
coalescing is still wanted.
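
For a rough sense of scale: the binding text added by this patch defines one
timeout interval as 125 SG clock periods. Below is a minimal sketch of that
arithmetic in plain C. The helper name irq_delay_ns() and the 100 MHz clock
used in the usage note are assumptions for illustration only and are not part
of the driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not driver code): convert an xlnx,irq-delay value
 * into nanoseconds for a given SG clock rate, using the relation stated
 * in the binding: 1 timeout interval = 125 * SG clock period.
 */
static uint64_t irq_delay_ns(uint8_t irq_delay, uint64_t sg_clk_hz)
{
	assert(sg_clk_hz != 0);	/* a zero clock rate has no defined period */
	return (uint64_t)irq_delay * 125u * 1000000000ull / sg_clk_hz;
}
```

With an assumed 100 MHz SG clock, irq_delay_ns(1, 100000000) is 1250 ns and
the maximum setting of 255 corresponds to 318750 ns; a value of 0 disables
the timer according to the binding text.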

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
---
Changes for v2:
- New patch. Introduce xlnx,irq-delay property for low latency usecases
---
 Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt b/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt
index f5f23a4a4467..96009ced7b29 100644
--- a/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt
+++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt
@@ -50,7 +50,9 @@ Optional properties for AXI DMA and MCDMA:
 	is missing or invalid then the default value 23 is used. This is the
 	maximum value that is supported by all IP versions.
 - xlnx,axistream-connected: Tells whether DMA is connected to AXI stream IP.
-
+- xlnx,irq-delay: Tells the interrupt delay timeout value. Valid range is from
+	0-255. Setting this value to zero disables the delay timer interrupt.
+	1 timeout interval = 125 * clock period of SG clock.
 Optional properties for VDMA:
 - xlnx,flush-fsync: Tells which channel to Flush on Frame sync.
 	It takes following values:
-- 
2.7.4



* [RFC v2 PATCH 3/7] dmaengine: xilinx_dma: Pass AXI4-Stream control words to dma client
  2021-04-09 17:55 [RFC v2 PATCH 0/7] Xilinx DMA enhancements and optimization Radhey Shyam Pandey
  2021-04-09 17:55 ` [RFC v2 PATCH 1/7] dt-bindings: dmaengine: xilinx_dma: Add xlnx,axistream-connected property Radhey Shyam Pandey
  2021-04-09 17:56 ` [RFC v2 PATCH 2/7] dt-bindings: dmaengine: xilinx_dma: Add xlnx,irq-delay property Radhey Shyam Pandey
@ 2021-04-09 17:56 ` Radhey Shyam Pandey
  2021-04-09 17:56 ` [RFC v2 PATCH 4/7] dmaengine: xilinx_dma: Increase AXI DMA transaction segment count Radhey Shyam Pandey
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 20+ messages in thread
From: Radhey Shyam Pandey @ 2021-04-09 17:56 UTC (permalink / raw)
  To: vkoul, robh+dt, michal.simek
  Cc: dmaengine, devicetree, linux-arm-kernel, linux-kernel, git,
	Radhey Shyam Pandey

Read the DT property to check whether the AXI DMA is connected to a
streaming IP, e.g. axiethernet. If it is, pass the AXI4-Stream control
words to the dma client using the metadata_ops dmaengine API.

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
---
Changes for v2:
- Use descriptor metadata API to pass control words to dma client.
- Rephrased commit description to be in line with implementation.
---
 drivers/dma/xilinx/xilinx_dma.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
index 3aded7861fef..14dcfc473e52 100644
--- a/drivers/dma/xilinx/xilinx_dma.c
+++ b/drivers/dma/xilinx/xilinx_dma.c
@@ -491,6 +491,7 @@ struct xilinx_dma_config {
  * @s2mm_chan_id: DMA s2mm channel identifier
  * @mm2s_chan_id: DMA mm2s channel identifier
  * @max_buffer_len: Max buffer length
+ * @has_axistream_connected: AXI DMA connected to AXI Stream IP
  */
 struct xilinx_dma_device {
 	void __iomem *regs;
@@ -509,6 +510,7 @@ struct xilinx_dma_device {
 	u32 s2mm_chan_id;
 	u32 mm2s_chan_id;
 	u32 max_buffer_len;
+	bool has_axistream_connected;
 };
 
 /* Macros */
@@ -621,6 +623,29 @@ static inline void xilinx_aximcdma_buf(struct xilinx_dma_chan *chan,
 	}
 }
 
+/**
+ * xilinx_dma_get_metadata_ptr - Populate metadata pointer and payload length
+ * @tx: async transaction descriptor
+ * @payload_len: metadata payload length
+ * @max_len: metadata max length
+ * Return: The app field pointer.
+ */
+static void *xilinx_dma_get_metadata_ptr(struct dma_async_tx_descriptor *tx,
+					 size_t *payload_len, size_t *max_len)
+{
+	struct xilinx_dma_tx_descriptor *desc = to_dma_tx_descriptor(tx);
+	struct xilinx_axidma_tx_segment *seg;
+
+	*max_len = *payload_len = sizeof(u32) * XILINX_DMA_NUM_APP_WORDS;
+	seg = list_first_entry(&desc->segments,
+			       struct xilinx_axidma_tx_segment, node);
+	return seg->hw.app;
+}
+
+static struct dma_descriptor_metadata_ops xilinx_dma_metadata_ops = {
+	.get_ptr = xilinx_dma_get_metadata_ptr,
+};
+
 /* -----------------------------------------------------------------------------
  * Descriptors and segments alloc and free
  */
@@ -2200,6 +2225,9 @@ static struct dma_async_tx_descriptor *xilinx_dma_prep_slave_sg(
 		segment->hw.control |= XILINX_DMA_BD_EOP;
 	}
 
+	if (chan->xdev->has_axistream_connected)
+		desc->async_tx.metadata_ops = &xilinx_dma_metadata_ops;
+
 	return &desc->async_tx;
 
 error:
@@ -3032,6 +3060,11 @@ static int xilinx_dma_probe(struct platform_device *pdev)
 		}
 	}
 
+	if (xdev->dma_config->dmatype == XDMA_TYPE_AXIDMA) {
+		xdev->has_axistream_connected =
+			of_property_read_bool(node, "xlnx,axistream-connected");
+	}
+
 	if (xdev->dma_config->dmatype == XDMA_TYPE_VDMA) {
 		err = of_property_read_u32(node, "xlnx,num-fstores",
 					   &num_frames);
@@ -3057,6 +3090,10 @@ static int xilinx_dma_probe(struct platform_device *pdev)
 	else
 		xdev->ext_addr = false;
 
+	/* Set metadata mode */
+	if (xdev->has_axistream_connected)
+		xdev->common.desc_metadata_modes = DESC_METADATA_ENGINE;
+
 	/* Set the dma mask bits */
 	dma_set_mask(xdev->dev, DMA_BIT_MASK(addr_width));
 
-- 
2.7.4



* [RFC v2 PATCH 4/7] dmaengine: xilinx_dma: Increase AXI DMA transaction segment count
  2021-04-09 17:55 [RFC v2 PATCH 0/7] Xilinx DMA enhancements and optimization Radhey Shyam Pandey
                   ` (2 preceding siblings ...)
  2021-04-09 17:56 ` [RFC v2 PATCH 3/7] dmaengine: xilinx_dma: Pass AXI4-Stream control words to dma client Radhey Shyam Pandey
@ 2021-04-09 17:56 ` Radhey Shyam Pandey
  2021-04-09 17:56 ` [RFC v2 PATCH 5/7] dmaengine: xilinx_dma: Freeup active list based on descriptor completion bit Radhey Shyam Pandey
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 20+ messages in thread
From: Radhey Shyam Pandey @ 2021-04-09 17:56 UTC (permalink / raw)
  To: vkoul, robh+dt, michal.simek
  Cc: dmaengine, devicetree, linux-arm-kernel, linux-kernel, git,
	Radhey Shyam Pandey

Increase the AXI DMA transaction segment count to ensure that, even under
high load, a free segment is always available when preparing a descriptor
for a DMA_SLAVE transaction.

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
---
Changes for v2:
- None
---
 drivers/dma/xilinx/xilinx_dma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
index 14dcfc473e52..890bf46b36e5 100644
--- a/drivers/dma/xilinx/xilinx_dma.c
+++ b/drivers/dma/xilinx/xilinx_dma.c
@@ -178,7 +178,7 @@
 #define XILINX_DMA_BD_SOP		BIT(27)
 #define XILINX_DMA_BD_EOP		BIT(26)
 #define XILINX_DMA_COALESCE_MAX		255
-#define XILINX_DMA_NUM_DESCS		255
+#define XILINX_DMA_NUM_DESCS		512
 #define XILINX_DMA_NUM_APP_WORDS	5
 
 /* AXI CDMA Specific Registers/Offsets */
-- 
2.7.4



* [RFC v2 PATCH 5/7] dmaengine: xilinx_dma: Freeup active list based on descriptor completion bit
  2021-04-09 17:55 [RFC v2 PATCH 0/7] Xilinx DMA enhancements and optimization Radhey Shyam Pandey
                   ` (3 preceding siblings ...)
  2021-04-09 17:56 ` [RFC v2 PATCH 4/7] dmaengine: xilinx_dma: Increase AXI DMA transaction segment count Radhey Shyam Pandey
@ 2021-04-09 17:56 ` Radhey Shyam Pandey
  2021-04-15  7:08   ` Lars-Peter Clausen
  2021-04-15  7:26   ` Lars-Peter Clausen
  2021-04-09 17:56 ` [RFC v2 PATCH 6/7] dmaengine: xilinx_dma: Use tasklet_hi_schedule for timing critical usecase Radhey Shyam Pandey
                   ` (2 subsequent siblings)
  7 siblings, 2 replies; 20+ messages in thread
From: Radhey Shyam Pandey @ 2021-04-09 17:56 UTC (permalink / raw)
  To: vkoul, robh+dt, michal.simek
  Cc: dmaengine, devicetree, linux-arm-kernel, linux-kernel, git,
	Radhey Shyam Pandey

The AXIDMA IP in SG mode sets the completion bit to 1 when the transfer
is completed. Read this bit to move the descriptor from the active list
to the done list. This is needed when both the interrupt delay timeout
and IRQThreshold are enabled, i.e. when Dly_IrqEn is triggered without
the interrupt threshold having been reached.

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
---
Changes for v2:
- Check BD completion bit only for SG mode.
- Modify the logic to have early return path.
---
 drivers/dma/xilinx/xilinx_dma.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
index 890bf46b36e5..f2305a73cb91 100644
--- a/drivers/dma/xilinx/xilinx_dma.c
+++ b/drivers/dma/xilinx/xilinx_dma.c
@@ -177,6 +177,7 @@
 #define XILINX_DMA_CR_COALESCE_SHIFT	16
 #define XILINX_DMA_BD_SOP		BIT(27)
 #define XILINX_DMA_BD_EOP		BIT(26)
+#define XILINX_DMA_BD_COMP_MASK		BIT(31)
 #define XILINX_DMA_COALESCE_MAX		255
 #define XILINX_DMA_NUM_DESCS		512
 #define XILINX_DMA_NUM_APP_WORDS	5
@@ -1683,12 +1684,18 @@ static void xilinx_dma_issue_pending(struct dma_chan *dchan)
 static void xilinx_dma_complete_descriptor(struct xilinx_dma_chan *chan)
 {
 	struct xilinx_dma_tx_descriptor *desc, *next;
+	struct xilinx_axidma_tx_segment *seg;
 
 	/* This function was invoked with lock held */
 	if (list_empty(&chan->active_list))
 		return;
 
 	list_for_each_entry_safe(desc, next, &chan->active_list, node) {
+		/* TODO: remove hardcoding for axidma_tx_segment */
+		seg = list_last_entry(&desc->segments,
+				      struct xilinx_axidma_tx_segment, node);
+		if (!(seg->hw.status & XILINX_DMA_BD_COMP_MASK) && chan->has_sg)
+			break;
 		if (chan->has_sg && chan->xdev->dma_config->dmatype !=
 		    XDMA_TYPE_VDMA)
 			desc->residue = xilinx_dma_get_residue(chan, desc);
-- 
2.7.4



* [RFC v2 PATCH 6/7] dmaengine: xilinx_dma: Use tasklet_hi_schedule for timing critical usecase
  2021-04-09 17:55 [RFC v2 PATCH 0/7] Xilinx DMA enhancements and optimization Radhey Shyam Pandey
                   ` (4 preceding siblings ...)
  2021-04-09 17:56 ` [RFC v2 PATCH 5/7] dmaengine: xilinx_dma: Freeup active list based on descriptor completion bit Radhey Shyam Pandey
@ 2021-04-09 17:56 ` Radhey Shyam Pandey
  2021-04-15  7:10   ` Lars-Peter Clausen
  2021-04-09 17:56 ` [RFC v2 PATCH 7/7] dmaengine: xilinx_dma: Program interrupt delay timeout Radhey Shyam Pandey
  2021-04-15  7:06 ` [RFC v2 PATCH 0/7] Xilinx DMA enhancements and optimization Lars-Peter Clausen
  7 siblings, 1 reply; 20+ messages in thread
From: Radhey Shyam Pandey @ 2021-04-09 17:56 UTC (permalink / raw)
  To: vkoul, robh+dt, michal.simek
  Cc: dmaengine, devicetree, linux-arm-kernel, linux-kernel, git,
	Radhey Shyam Pandey

Schedule tasklet with high priority to ensure that callback processing
is prioritized. It improves throughput for netdev dma clients.

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
---
Changes for v2:
- None
---
 drivers/dma/xilinx/xilinx_dma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
index f2305a73cb91..a2ea2d649332 100644
--- a/drivers/dma/xilinx/xilinx_dma.c
+++ b/drivers/dma/xilinx/xilinx_dma.c
@@ -1829,7 +1829,7 @@ static irqreturn_t xilinx_mcdma_irq_handler(int irq, void *data)
 		spin_unlock(&chan->lock);
 	}
 
-	tasklet_schedule(&chan->tasklet);
+	tasklet_hi_schedule(&chan->tasklet);
 	return IRQ_HANDLED;
 }
 
-- 
2.7.4



* [RFC v2 PATCH 7/7] dmaengine: xilinx_dma: Program interrupt delay timeout
  2021-04-09 17:55 [RFC v2 PATCH 0/7] Xilinx DMA enhancements and optimization Radhey Shyam Pandey
                   ` (5 preceding siblings ...)
  2021-04-09 17:56 ` [RFC v2 PATCH 6/7] dmaengine: xilinx_dma: Use tasklet_hi_schedule for timing critical usecase Radhey Shyam Pandey
@ 2021-04-09 17:56 ` Radhey Shyam Pandey
  2021-04-15  7:33   ` Lars-Peter Clausen
  2021-04-15  7:06 ` [RFC v2 PATCH 0/7] Xilinx DMA enhancements and optimization Lars-Peter Clausen
  7 siblings, 1 reply; 20+ messages in thread
From: Radhey Shyam Pandey @ 2021-04-09 17:56 UTC (permalink / raw)
  To: vkoul, robh+dt, michal.simek
  Cc: dmaengine, devicetree, linux-arm-kernel, linux-kernel, git,
	Radhey Shyam Pandey

Program IRQDelay for AXI DMA. The interrupt timeout mechanism causes
the DMA engine to generate an interrupt after the delay time period
has expired. It enables dmaengine to respond in real time even though
interrupt coalescing is configured. This patch also removes the
placeholder for the delay interrupt and merges its handling with the
frame completion interrupt. Since the interrupt delay timeout is
disabled by default, this feature addition has no functional impact on
the VDMA and CDMA IPs.
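
The register update in this patch is a read-modify-write of the delay field
in bits 31:24 of DMACR. Here is a minimal standalone sketch of that step. The
macro and function names are illustrative stand-ins that mirror
XILINX_DMA_CR_DELAY_MAX and XILINX_DMA_CR_DELAY_SHIFT from the diff; this is
not the driver code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for the DMACR delay field (bits 31:24). */
#define DMACR_DELAY_SHIFT	24
#define DMACR_DELAY_MASK	(0xFFu << DMACR_DELAY_SHIFT)

static_assert(DMACR_DELAY_MASK == 0xFF000000u, "delay field is bits 31:24");

/* Clear the old delay value, then place the new one in the field. */
static uint32_t dmacr_set_irq_delay(uint32_t reg, uint8_t irq_delay)
{
	reg &= ~DMACR_DELAY_MASK;
	/* Widen before shifting so values >= 128 never reach a sign bit. */
	reg |= (uint32_t)irq_delay << DMACR_DELAY_SHIFT;
	return reg;
}
```

All other DMACR bits, including the coalesce count in bits 23:16, are left
untouched. Writing 0 clears the field, which per the binding text disables
the delay timer interrupt.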

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
---
Changes for v2:
- Read irq delay timeout value from DT.
- Merge interrupt processing for frame done and delay interrupt.
---
 drivers/dma/xilinx/xilinx_dma.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
index a2ea2d649332..0c0dc9882a01 100644
--- a/drivers/dma/xilinx/xilinx_dma.c
+++ b/drivers/dma/xilinx/xilinx_dma.c
@@ -173,8 +173,10 @@
 #define XILINX_DMA_MAX_TRANS_LEN_MAX	23
 #define XILINX_DMA_V2_MAX_TRANS_LEN_MAX	26
 #define XILINX_DMA_CR_COALESCE_MAX	GENMASK(23, 16)
+#define XILINX_DMA_CR_DELAY_MAX		GENMASK(31, 24)
 #define XILINX_DMA_CR_CYCLIC_BD_EN_MASK	BIT(4)
 #define XILINX_DMA_CR_COALESCE_SHIFT	16
+#define XILINX_DMA_CR_DELAY_SHIFT	24
 #define XILINX_DMA_BD_SOP		BIT(27)
 #define XILINX_DMA_BD_EOP		BIT(26)
 #define XILINX_DMA_BD_COMP_MASK		BIT(31)
@@ -410,6 +412,7 @@ struct xilinx_dma_tx_descriptor {
  * @stop_transfer: Differentiate b/w DMA IP's quiesce
  * @tdest: TDEST value for mcdma
  * @has_vflip: S2MM vertical flip
+ * @irq_delay: Interrupt delay timeout
  */
 struct xilinx_dma_chan {
 	struct xilinx_dma_device *xdev;
@@ -447,6 +450,7 @@ struct xilinx_dma_chan {
 	int (*stop_transfer)(struct xilinx_dma_chan *chan);
 	u16 tdest;
 	bool has_vflip;
+	u8 irq_delay;
 };
 
 /**
@@ -1555,6 +1559,9 @@ static void xilinx_dma_start_transfer(struct xilinx_dma_chan *chan)
 	if (chan->has_sg)
 		xilinx_write(chan, XILINX_DMA_REG_CURDESC,
 			     head_desc->async_tx.phys);
+	reg &= ~XILINX_DMA_CR_DELAY_MAX;
+	reg |= chan->irq_delay << XILINX_DMA_CR_DELAY_SHIFT;
+	dma_ctrl_write(chan, XILINX_DMA_REG_DMACR, reg);
 
 	xilinx_dma_start(chan);
 
@@ -1877,15 +1884,8 @@ static irqreturn_t xilinx_dma_irq_handler(int irq, void *data)
 		}
 	}
 
-	if (status & XILINX_DMA_DMASR_DLY_CNT_IRQ) {
-		/*
-		 * Device takes too long to do the transfer when user requires
-		 * responsiveness.
-		 */
-		dev_dbg(chan->dev, "Inter-packet latency too long\n");
-	}
-
-	if (status & XILINX_DMA_DMASR_FRM_CNT_IRQ) {
+	if (status & (XILINX_DMA_DMASR_FRM_CNT_IRQ |
+		      XILINX_DMA_DMASR_DLY_CNT_IRQ)) {
 		spin_lock(&chan->lock);
 		xilinx_dma_complete_descriptor(chan);
 		chan->idle = true;
@@ -2802,6 +2802,8 @@ static int xilinx_dma_chan_probe(struct xilinx_dma_device *xdev,
 	/* Retrieve the channel properties from the device tree */
 	has_dre = of_property_read_bool(node, "xlnx,include-dre");
 
+	of_property_read_u8(node, "xlnx,irq-delay", &chan->irq_delay);
+
 	chan->genlock = of_property_read_bool(node, "xlnx,genlock-mode");
 
 	err = of_property_read_u32(node, "xlnx,datawidth", &value);
-- 
2.7.4



* Re: [RFC v2 PATCH 1/7] dt-bindings: dmaengine: xilinx_dma: Add xlnx,axistream-connected property
  2021-04-09 17:55 ` [RFC v2 PATCH 1/7] dt-bindings: dmaengine: xilinx_dma: Add xlnx,axistream-connected property Radhey Shyam Pandey
@ 2021-04-12 18:25   ` Rob Herring
  0 siblings, 0 replies; 20+ messages in thread
From: Rob Herring @ 2021-04-12 18:25 UTC (permalink / raw)
  To: Radhey Shyam Pandey
  Cc: vkoul, git, michal.simek, dmaengine, linux-arm-kernel,
	linux-kernel, robh+dt, devicetree

On Fri, 09 Apr 2021 23:25:59 +0530, Radhey Shyam Pandey wrote:
> Add an optional DMA property 'xlnx,axistream-connected'. This can be
> specified to indicate that DMA is connected to a streaming IP in the
> hardware design and dma driver needs to do some additional handling
> i.e pass metadata and perform streaming IP specific configuration.
> 
> Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
> ---
> Changes for v2:
> - Rename xlnx,axieth-connected to xlnx,axistream-connected to
>   make it generic.
> ---
>  Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt | 2 ++
>  1 file changed, 2 insertions(+)
> 

Acked-by: Rob Herring <robh@kernel.org>


* Re: [RFC v2 PATCH 2/7] dt-bindings: dmaengine: xilinx_dma: Add xlnx,irq-delay property
  2021-04-09 17:56 ` [RFC v2 PATCH 2/7] dt-bindings: dmaengine: xilinx_dma: Add xlnx,irq-delay property Radhey Shyam Pandey
@ 2021-04-12 18:25   ` Rob Herring
  0 siblings, 0 replies; 20+ messages in thread
From: Rob Herring @ 2021-04-12 18:25 UTC (permalink / raw)
  To: Radhey Shyam Pandey
  Cc: linux-arm-kernel, git, vkoul, michal.simek, robh+dt,
	linux-kernel, devicetree, dmaengine

On Fri, 09 Apr 2021 23:26:00 +0530, Radhey Shyam Pandey wrote:
> Add an optional AXI DMA property 'xlnx,irq-delay'. It specifies interrupt
> timeout value and causes the DMA engine to generate an interrupt after the
> delay time period has expired. Timer begins counting at the end of a packet
> and resets with receipt of a new packet or a timeout event occurs.
> 
> This property is useful when AXI DMA is connected to the streaming IP i.e
> axiethernet where inter packet latency is critical while still taking the
> benefit of interrupt coalescing.
> 
> Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
> ---
> Changes for v2:
> - New patch. Introduce xlnx,irq-delay property for low latency usecases
> ---
>  Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 

Acked-by: Rob Herring <robh@kernel.org>


* Re: [RFC v2 PATCH 0/7] Xilinx DMA enhancements and optimization
  2021-04-09 17:55 [RFC v2 PATCH 0/7] Xilinx DMA enhancements and optimization Radhey Shyam Pandey
                   ` (6 preceding siblings ...)
  2021-04-09 17:56 ` [RFC v2 PATCH 7/7] dmaengine: xilinx_dma: Program interrupt delay timeout Radhey Shyam Pandey
@ 2021-04-15  7:06 ` Lars-Peter Clausen
  2021-06-11 16:13   ` Radhey Shyam Pandey
  7 siblings, 1 reply; 20+ messages in thread
From: Lars-Peter Clausen @ 2021-04-15  7:06 UTC (permalink / raw)
  To: Radhey Shyam Pandey, vkoul, robh+dt, michal.simek
  Cc: dmaengine, devicetree, linux-arm-kernel, linux-kernel, git

On 4/9/21 7:55 PM, Radhey Shyam Pandey wrote:
> Some background about the patch series: Xilinx Axi Ethernet device driver
> (xilinx_axienet_main.c) currently has axi-dma code inside it. The goal
> is to refactor axiethernet driver and use existing AXI DMA driver using
> DMAEngine API.

This is pretty neat! Do you have the patches that modify the AXI 
Ethernet driver in a public tree somewhere, so this series can be seen 
in context?



* Re: [RFC v2 PATCH 5/7] dmaengine: xilinx_dma: Freeup active list based on descriptor completion bit
  2021-04-09 17:56 ` [RFC v2 PATCH 5/7] dmaengine: xilinx_dma: Freeup active list based on descriptor completion bit Radhey Shyam Pandey
@ 2021-04-15  7:08   ` Lars-Peter Clausen
  2021-06-11 16:16     ` Radhey Shyam Pandey
  2021-04-15  7:26   ` Lars-Peter Clausen
  1 sibling, 1 reply; 20+ messages in thread
From: Lars-Peter Clausen @ 2021-04-15  7:08 UTC (permalink / raw)
  To: Radhey Shyam Pandey, vkoul, robh+dt, michal.simek
  Cc: dmaengine, devicetree, linux-arm-kernel, linux-kernel, git

On 4/9/21 7:56 PM, Radhey Shyam Pandey wrote:
> AXIDMA IP in SG mode sets completion bit to 1 when the transfer is
> completed. Read this bit to move descriptor from active list to the
> done list. This feature is needed when interrupt delay timeout and
> IRQThreshold is enabled i.e Dly_IrqEn is triggered w/o completing
> interrupt threshold.
>
> Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
> ---
> - Check BD completion bit only for SG mode.
> - Modify the logic to have early return path.
> ---
>   drivers/dma/xilinx/xilinx_dma.c | 7 +++++++
>   1 file changed, 7 insertions(+)
>
> diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
> index 890bf46b36e5..f2305a73cb91 100644
> --- a/drivers/dma/xilinx/xilinx_dma.c
> +++ b/drivers/dma/xilinx/xilinx_dma.c
> @@ -177,6 +177,7 @@
>   #define XILINX_DMA_CR_COALESCE_SHIFT	16
>   #define XILINX_DMA_BD_SOP		BIT(27)
>   #define XILINX_DMA_BD_EOP		BIT(26)
> +#define XILINX_DMA_BD_COMP_MASK		BIT(31)
>   #define XILINX_DMA_COALESCE_MAX		255
>   #define XILINX_DMA_NUM_DESCS		512
>   #define XILINX_DMA_NUM_APP_WORDS	5
> @@ -1683,12 +1684,18 @@ static void xilinx_dma_issue_pending(struct dma_chan *dchan)
>   static void xilinx_dma_complete_descriptor(struct xilinx_dma_chan *chan)
>   {
>   	struct xilinx_dma_tx_descriptor *desc, *next;
> +	struct xilinx_axidma_tx_segment *seg;
>   
>   	/* This function was invoked with lock held */
>   	if (list_empty(&chan->active_list))
>   		return;
>   
>   	list_for_each_entry_safe(desc, next, &chan->active_list, node) {
> +		/* TODO: remove hardcoding for axidma_tx_segment */
> +		seg = list_last_entry(&desc->segments,
> +				      struct xilinx_axidma_tx_segment, node);
This needs to be fixed before the patch can be merged; as it stands, it
will break the non-AXIDMA variants.
> +		if (!(seg->hw.status & XILINX_DMA_BD_COMP_MASK) && chan->has_sg)
> +			break;
>   		if (chan->has_sg && chan->xdev->dma_config->dmatype !=
>   		    XDMA_TYPE_VDMA)
>   			desc->residue = xilinx_dma_get_residue(chan, desc);




* Re: [RFC v2 PATCH 6/7] dmaengine: xilinx_dma: Use tasklet_hi_schedule for timing critical usecase
  2021-04-09 17:56 ` [RFC v2 PATCH 6/7] dmaengine: xilinx_dma: Use tasklet_hi_schedule for timing critical usecase Radhey Shyam Pandey
@ 2021-04-15  7:10   ` Lars-Peter Clausen
  2021-06-11 18:30     ` Radhey Shyam Pandey
  0 siblings, 1 reply; 20+ messages in thread
From: Lars-Peter Clausen @ 2021-04-15  7:10 UTC (permalink / raw)
  To: Radhey Shyam Pandey, vkoul, robh+dt, michal.simek
  Cc: dmaengine, devicetree, linux-arm-kernel, linux-kernel, git

On 4/9/21 7:56 PM, Radhey Shyam Pandey wrote:
> Schedule tasklet with high priority to ensure that callback processing
> is prioritized. It improves throughput for netdev dma clients.
Do you have specific numbers on the throughput improvement?


* Re: [RFC v2 PATCH 5/7] dmaengine: xilinx_dma: Freeup active list based on descriptor completion bit
  2021-04-09 17:56 ` [RFC v2 PATCH 5/7] dmaengine: xilinx_dma: Freeup active list based on descriptor completion bit Radhey Shyam Pandey
  2021-04-15  7:08   ` Lars-Peter Clausen
@ 2021-04-15  7:26   ` Lars-Peter Clausen
  2021-06-11 18:58     ` Radhey Shyam Pandey
  1 sibling, 1 reply; 20+ messages in thread
From: Lars-Peter Clausen @ 2021-04-15  7:26 UTC (permalink / raw)
  To: Radhey Shyam Pandey, vkoul, robh+dt, michal.simek
  Cc: dmaengine, devicetree, linux-arm-kernel, linux-kernel, git

On 4/9/21 7:56 PM, Radhey Shyam Pandey wrote:
> AXIDMA IP in SG mode sets completion bit to 1 when the transfer is
> completed. Read this bit to move descriptor from active list to the
> done list. This feature is needed when interrupt delay timeout and
> IRQThreshold is enabled i.e Dly_IrqEn is triggered w/o completing
> interrupt threshold.
>
> Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
> ---
> - Check BD completion bit only for SG mode.
> - Modify the logic to have early return path.
> ---
>   drivers/dma/xilinx/xilinx_dma.c | 7 +++++++
>   1 file changed, 7 insertions(+)
>
> diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
> index 890bf46b36e5..f2305a73cb91 100644
> --- a/drivers/dma/xilinx/xilinx_dma.c
> +++ b/drivers/dma/xilinx/xilinx_dma.c
> @@ -177,6 +177,7 @@
>   #define XILINX_DMA_CR_COALESCE_SHIFT	16
>   #define XILINX_DMA_BD_SOP		BIT(27)
>   #define XILINX_DMA_BD_EOP		BIT(26)
> +#define XILINX_DMA_BD_COMP_MASK		BIT(31)
>   #define XILINX_DMA_COALESCE_MAX		255
>   #define XILINX_DMA_NUM_DESCS		512
>   #define XILINX_DMA_NUM_APP_WORDS	5
> @@ -1683,12 +1684,18 @@ static void xilinx_dma_issue_pending(struct dma_chan *dchan)
>   static void xilinx_dma_complete_descriptor(struct xilinx_dma_chan *chan)
>   {
>   	struct xilinx_dma_tx_descriptor *desc, *next;
> +	struct xilinx_axidma_tx_segment *seg;
>   
>   	/* This function was invoked with lock held */
>   	if (list_empty(&chan->active_list))
>   		return;
>   
>   	list_for_each_entry_safe(desc, next, &chan->active_list, node) {
> +		/* TODO: remove hardcoding for axidma_tx_segment */
> +		seg = list_last_entry(&desc->segments,
> +				      struct xilinx_axidma_tx_segment, node);
> +		if (!(seg->hw.status & XILINX_DMA_BD_COMP_MASK) && chan->has_sg)
> +			break;
>   		if (chan->has_sg && chan->xdev->dma_config->dmatype !=
>   		    XDMA_TYPE_VDMA)
>   			desc->residue = xilinx_dma_get_residue(chan, desc);

Since not all descriptors will be completed in this function the 
`chan->idle = true;` in xilinx_dma_irq_handler() needs to be gated on 
the active_list being empty.

xilinx_dma_complete_descriptor
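
The gating described in this reply can be sketched with a simplified
userspace model. Everything below (struct model_chan, the ring of completion
flags, the function names) is a hypothetical stand-in for the driver's
active_list handling, meant only to show the control flow: descriptors are
retired in order until the first incomplete one, and idle is set only when
nothing remains active.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_DESCS 8

/* Simplified stand-in for a channel: a ring of in-flight descriptors. */
struct model_chan {
	bool completed[MODEL_MAX_DESCS];	/* models the BD completion bit */
	size_t head;				/* oldest active descriptor */
	size_t count;				/* descriptors still active */
	bool idle;
};

/* Retire descriptors in order, stopping at the first incomplete one. */
static void model_complete_descriptors(struct model_chan *chan)
{
	assert(chan->count <= MODEL_MAX_DESCS);
	while (chan->count > 0 && chan->completed[chan->head]) {
		chan->completed[chan->head] = false;
		chan->head = (chan->head + 1) % MODEL_MAX_DESCS;
		chan->count--;
	}
}

/* Interrupt path: only report idle once nothing is left active. */
static void model_irq_done(struct model_chan *chan)
{
	model_complete_descriptors(chan);
	chan->idle = (chan->count == 0);
}
```

In the model, an interrupt that arrives while the second of two descriptors
is still in flight retires only the first and leaves idle false; idle becomes
true only after a later interrupt finds the list empty.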



* Re: [RFC v2 PATCH 7/7] dmaengine: xilinx_dma: Program interrupt delay timeout
  2021-04-09 17:56 ` [RFC v2 PATCH 7/7] dmaengine: xilinx_dma: Program interrupt delay timeout Radhey Shyam Pandey
@ 2021-04-15  7:33   ` Lars-Peter Clausen
  2021-06-11 19:33     ` Radhey Shyam Pandey
  0 siblings, 1 reply; 20+ messages in thread
From: Lars-Peter Clausen @ 2021-04-15  7:33 UTC (permalink / raw)
  To: Radhey Shyam Pandey, vkoul, robh+dt, michal.simek
  Cc: dmaengine, devicetree, linux-arm-kernel, linux-kernel, git

On 4/9/21 7:56 PM, Radhey Shyam Pandey wrote:
> Program IRQDelay for AXI DMA. The interrupt timeout mechanism causes
> the DMA engine to generate an interrupt after the delay time period
> has expired. It enables dmaengine to respond in real-time even though
> interrupt coalescing is configured. It also remove the placeholder
> for delay interrupt and merge it with frame completion interrupt.
> Since by default interrupt delay timeout is disabled this feature
> addition has no functional impact on VDMA and CDMA IP's.

In my opinion this should not come from the devicetree. This setting is 
application specific and should be configured through a runtime API.

For the VDMA there is already xilinx_vdma_channel_set_config(), which
allows configuring the maximum number of IRQs that can be coalesced and
the IRQ delay. Something similar is probably needed for the AXIDMA.
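
Whichever interface ends up supplying the value, the register update itself is a plain read-modify-write of the IRQDelay field. The following is a userspace sketch under that assumption: the helper name `dmacr_set_irq_delay` is hypothetical, and only the field position (bits 31:24, shift 24) comes from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout taken from the patch: IRQDelay occupies DMACR bits 31:24. */
#define DMACR_DELAY_SHIFT 24
#define DMACR_DELAY_MASK  (UINT32_C(0xff) << DMACR_DELAY_SHIFT)

/* Hypothetical helper: return DMACR with only the IRQDelay field replaced. */
uint32_t dmacr_set_irq_delay(uint32_t dmacr, uint8_t irq_delay)
{
	uint32_t out = dmacr & ~DMACR_DELAY_MASK;

	/* Widen before shifting so values >= 0x80 never reach an int sign bit. */
	out |= (uint32_t)irq_delay << DMACR_DELAY_SHIFT;

	/* The coalesce count (bits 23:16) and low control bits stay untouched. */
	assert((out & ~DMACR_DELAY_MASK) == (dmacr & ~DMACR_DELAY_MASK));
	return out;
}
```

The explicit cast is a deliberate choice in this sketch: a `u8` operand is promoted to `int` before the shift, so widening it first keeps the arithmetic unsigned for every delay value.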

>
> Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
> ---
> Changes for v2:
> - Read irq delay timeout value from DT.
> - Merge interrupt processing for frame done and delay interrupt.
> ---
>   drivers/dma/xilinx/xilinx_dma.c | 20 +++++++++++---------
>   1 file changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
> index a2ea2d649332..0c0dc9882a01 100644
> --- a/drivers/dma/xilinx/xilinx_dma.c
> +++ b/drivers/dma/xilinx/xilinx_dma.c
> @@ -173,8 +173,10 @@
>   #define XILINX_DMA_MAX_TRANS_LEN_MAX	23
>   #define XILINX_DMA_V2_MAX_TRANS_LEN_MAX	26
>   #define XILINX_DMA_CR_COALESCE_MAX	GENMASK(23, 16)
> +#define XILINX_DMA_CR_DELAY_MAX		GENMASK(31, 24)
>   #define XILINX_DMA_CR_CYCLIC_BD_EN_MASK	BIT(4)
>   #define XILINX_DMA_CR_COALESCE_SHIFT	16
> +#define XILINX_DMA_CR_DELAY_SHIFT	24
>   #define XILINX_DMA_BD_SOP		BIT(27)
>   #define XILINX_DMA_BD_EOP		BIT(26)
>   #define XILINX_DMA_BD_COMP_MASK		BIT(31)
> @@ -410,6 +412,7 @@ struct xilinx_dma_tx_descriptor {
>    * @stop_transfer: Differentiate b/w DMA IP's quiesce
>    * @tdest: TDEST value for mcdma
>    * @has_vflip: S2MM vertical flip
> + * @irq_delay: Interrupt delay timeout
>    */
>   struct xilinx_dma_chan {
>   	struct xilinx_dma_device *xdev;
> @@ -447,6 +450,7 @@ struct xilinx_dma_chan {
>   	int (*stop_transfer)(struct xilinx_dma_chan *chan);
>   	u16 tdest;
>   	bool has_vflip;
> +	u8 irq_delay;
>   };
>   
>   /**
> @@ -1555,6 +1559,9 @@ static void xilinx_dma_start_transfer(struct xilinx_dma_chan *chan)
>   	if (chan->has_sg)
>   		xilinx_write(chan, XILINX_DMA_REG_CURDESC,
>   			     head_desc->async_tx.phys);
> +	reg  &= ~XILINX_DMA_CR_DELAY_MAX;
> +	reg  |= chan->irq_delay << XILINX_DMA_CR_DELAY_SHIFT;
> +	dma_ctrl_write(chan, XILINX_DMA_REG_DMACR, reg);
>   
>   	xilinx_dma_start(chan);
>   
> @@ -1877,15 +1884,8 @@ static irqreturn_t xilinx_dma_irq_handler(int irq, void *data)
>   		}
>   	}
>   
> -	if (status & XILINX_DMA_DMASR_DLY_CNT_IRQ) {
> -		/*
> -		 * Device takes too long to do the transfer when user requires
> -		 * responsiveness.
> -		 */
> -		dev_dbg(chan->dev, "Inter-packet latency too long\n");
> -	}
> -
> -	if (status & XILINX_DMA_DMASR_FRM_CNT_IRQ) {
> +	if (status & (XILINX_DMA_DMASR_FRM_CNT_IRQ |
> +		      XILINX_DMA_DMASR_DLY_CNT_IRQ)) {
>   		spin_lock(&chan->lock);
>   		xilinx_dma_complete_descriptor(chan);
>   		chan->idle = true;
> @@ -2802,6 +2802,8 @@ static int xilinx_dma_chan_probe(struct xilinx_dma_device *xdev,
>   	/* Retrieve the channel properties from the device tree */
>   	has_dre = of_property_read_bool(node, "xlnx,include-dre");
>   
> +	of_property_read_u8(node, "xlnx,irq-delay", &chan->irq_delay);
> +
>   	chan->genlock = of_property_read_bool(node, "xlnx,genlock-mode");
>   
>   	err = of_property_read_u32(node, "xlnx,datawidth", &value);




* RE: [RFC v2 PATCH 0/7] Xilinx DMA enhancements and optimization
  2021-04-15  7:06 ` [RFC v2 PATCH 0/7] Xilinx DMA enhancements and optimization Lars-Peter Clausen
@ 2021-06-11 16:13   ` Radhey Shyam Pandey
  0 siblings, 0 replies; 20+ messages in thread
From: Radhey Shyam Pandey @ 2021-06-11 16:13 UTC (permalink / raw)
  To: Lars-Peter Clausen, vkoul, robh+dt, Michal Simek
  Cc: dmaengine, devicetree, linux-arm-kernel, linux-kernel, git

> -----Original Message-----
> From: Lars-Peter Clausen <lars@metafoo.de>
> Sent: Thursday, April 15, 2021 12:36 PM
> To: Radhey Shyam Pandey <radheys@xilinx.com>; vkoul@kernel.org;
> robh+dt@kernel.org; Michal Simek <michals@xilinx.com>
> Cc: dmaengine@vger.kernel.org; devicetree@vger.kernel.org; linux-arm-
> kernel@lists.infradead.org; linux-kernel@vger.kernel.org; git
> <git@xilinx.com>
> Subject: Re: [RFC v2 PATCH 0/7] Xilinx DMA enhancements and optimization
> 
> On 4/9/21 7:55 PM, Radhey Shyam Pandey wrote:
> > Some background about the patch series: Xilinx Axi Ethernet device
> > driver
> > (xilinx_axienet_main.c) currently has axi-dma code inside it. The goal
> > is to refactor axiethernet driver and use existing AXI DMA driver
> > using DMAEngine API.
> 
> This is pretty neat! Do you have the patches that modify the AXI Ethernet
> driver in a public tree somewhere, so this series can be seen in context?
Yes, I sent the axiethernet RFC series to the netdev mailing list. Here is
the link: https://www.spinics.net/lists/netdev/msg734173.html

Thanks,
Radhey


* RE: [RFC v2 PATCH 5/7] dmaengine: xilinx_dma: Freeup active list based on descriptor completion bit
  2021-04-15  7:08   ` Lars-Peter Clausen
@ 2021-06-11 16:16     ` Radhey Shyam Pandey
  0 siblings, 0 replies; 20+ messages in thread
From: Radhey Shyam Pandey @ 2021-06-11 16:16 UTC (permalink / raw)
  To: Lars-Peter Clausen, vkoul, robh+dt, Michal Simek
  Cc: dmaengine, devicetree, linux-arm-kernel, linux-kernel, git

> -----Original Message-----
> From: Lars-Peter Clausen <lars@metafoo.de>
> Sent: Thursday, April 15, 2021 12:39 PM
> To: Radhey Shyam Pandey <radheys@xilinx.com>; vkoul@kernel.org;
> robh+dt@kernel.org; Michal Simek <michals@xilinx.com>
> Cc: dmaengine@vger.kernel.org; devicetree@vger.kernel.org; linux-arm-
> kernel@lists.infradead.org; linux-kernel@vger.kernel.org; git
> <git@xilinx.com>
> Subject: Re: [RFC v2 PATCH 5/7] dmaengine: xilinx_dma: Freeup active list
> based on descriptor completion bit
> 
> On 4/9/21 7:56 PM, Radhey Shyam Pandey wrote:
> > AXIDMA IP in SG mode sets completion bit to 1 when the transfer is
> > completed. Read this bit to move descriptor from active list to the
> > done list. This feature is needed when interrupt delay timeout and
> > IRQThreshold is enabled i.e Dly_IrqEn is triggered w/o completing
> > interrupt threshold.
> >
> > Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
> > ---
> > - Check BD completion bit only for SG mode.
> > - Modify the logic to have early return path.
> > ---
> >   drivers/dma/xilinx/xilinx_dma.c | 7 +++++++
> >   1 file changed, 7 insertions(+)
> >
> > diff --git a/drivers/dma/xilinx/xilinx_dma.c
> > b/drivers/dma/xilinx/xilinx_dma.c index 890bf46b36e5..f2305a73cb91
> > 100644
> > --- a/drivers/dma/xilinx/xilinx_dma.c
> > +++ b/drivers/dma/xilinx/xilinx_dma.c
> > @@ -177,6 +177,7 @@
> >   #define XILINX_DMA_CR_COALESCE_SHIFT	16
> >   #define XILINX_DMA_BD_SOP		BIT(27)
> >   #define XILINX_DMA_BD_EOP		BIT(26)
> > +#define XILINX_DMA_BD_COMP_MASK		BIT(31)
> >   #define XILINX_DMA_COALESCE_MAX		255
> >   #define XILINX_DMA_NUM_DESCS		512
> >   #define XILINX_DMA_NUM_APP_WORDS	5
> > @@ -1683,12 +1684,18 @@ static void xilinx_dma_issue_pending(struct
> dma_chan *dchan)
> >   static void xilinx_dma_complete_descriptor(struct xilinx_dma_chan
> *chan)
> >   {
> >   	struct xilinx_dma_tx_descriptor *desc, *next;
> > +	struct xilinx_axidma_tx_segment *seg;
> >
> >   	/* This function was invoked with lock held */
> >   	if (list_empty(&chan->active_list))
> >   		return;
> >
> >   	list_for_each_entry_safe(desc, next, &chan->active_list, node) {
> > +		/* TODO: remove hardcoding for axidma_tx_segment */
> > +		seg = list_last_entry(&desc->segments,
> > +				      struct xilinx_axidma_tx_segment, node);
> This needs to be fixed before this can be merged as it right now will break
> the non AXIDMA variants.

I agree; the TODO is there to flag that the axidma-specific hardcoding
must be removed. I will fix it in the next version.
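
One possible shape for such a fix, purely as an illustration and not the planned change: select the completion check by IP type instead of always casting to the AXIDMA segment. Everything named `model_*` is hypothetical, the descriptor layouts are simplified placeholders, and treating VDMA as having no per-BD completion bit to poll is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BD_COMP_MASK (UINT32_C(1) << 31)

enum model_dmatype { MODEL_AXIDMA, MODEL_CDMA, MODEL_VDMA };

/* Hypothetical, simplified descriptors; field positions differ per IP. */
struct model_axidma_hw { uint32_t control; uint32_t status; };
struct model_cdma_hw   { uint32_t src; uint32_t dest; uint32_t status; };

struct model_segment {
	enum model_dmatype type;
	union {
		struct model_axidma_hw axidma;
		struct model_cdma_hw cdma;
	} hw;
};

/* Check completion through the layout that matches the segment's IP type. */
bool model_segment_done(const struct model_segment *seg)
{
	assert(seg);
	switch (seg->type) {
	case MODEL_AXIDMA:
		return (seg->hw.axidma.status & BD_COMP_MASK) != 0;
	case MODEL_CDMA:
		return (seg->hw.cdma.status & BD_COMP_MASK) != 0;
	case MODEL_VDMA:
	default:
		/* Assumed: nothing to poll here, so never block completion. */
		return true;
	}
}
```

The point of the sketch is only that the status word is read through the layout matching the channel's IP type, so a non-AXIDMA segment is never interpreted as an AXIDMA one.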

Thanks,
Radhey
> > +		if (!(seg->hw.status & XILINX_DMA_BD_COMP_MASK) &&
> chan->has_sg)
> > +			break;
> >   		if (chan->has_sg && chan->xdev->dma_config->dmatype !=
> >   		    XDMA_TYPE_VDMA)
> >   			desc->residue = xilinx_dma_get_residue(chan, desc);
> 



* RE: [RFC v2 PATCH 6/7] dmaengine: xilinx_dma: Use tasklet_hi_schedule for timing critical usecase
  2021-04-15  7:10   ` Lars-Peter Clausen
@ 2021-06-11 18:30     ` Radhey Shyam Pandey
  0 siblings, 0 replies; 20+ messages in thread
From: Radhey Shyam Pandey @ 2021-06-11 18:30 UTC (permalink / raw)
  To: Lars-Peter Clausen, vkoul, robh+dt, Michal Simek
  Cc: dmaengine, devicetree, linux-arm-kernel, linux-kernel, git

> -----Original Message-----
> From: Lars-Peter Clausen <lars@metafoo.de>
> Sent: Thursday, April 15, 2021 12:41 PM
> To: Radhey Shyam Pandey <radheys@xilinx.com>; vkoul@kernel.org;
> robh+dt@kernel.org; Michal Simek <michals@xilinx.com>
> Cc: dmaengine@vger.kernel.org; devicetree@vger.kernel.org; linux-arm-
> kernel@lists.infradead.org; linux-kernel@vger.kernel.org; git
> <git@xilinx.com>
> Subject: Re: [RFC v2 PATCH 6/7] dmaengine: xilinx_dma: Use
> tasklet_hi_schedule for timing critical usecase
> 
> On 4/9/21 7:56 PM, Radhey Shyam Pandey wrote:
> > Schedule tasklet with high priority to ensure that callback processing
> > is prioritized. It improves throughput for netdev dma clients.
> Do you have specific numbers on the throughput improvement?
IIRC there was a ~5% performance improvement, but I measured that a long
time ago on an older 4.8 kernel; since then I have only checked overall
performance (with all optimizations applied). In the next version I will
redo the incremental profiling and capture the improvement percentage in
the commit description.

Thanks,
Radhey


* RE: [RFC v2 PATCH 5/7] dmaengine: xilinx_dma: Freeup active list based on descriptor completion bit
  2021-04-15  7:26   ` Lars-Peter Clausen
@ 2021-06-11 18:58     ` Radhey Shyam Pandey
  0 siblings, 0 replies; 20+ messages in thread
From: Radhey Shyam Pandey @ 2021-06-11 18:58 UTC (permalink / raw)
  To: Lars-Peter Clausen, vkoul, robh+dt, Michal Simek
  Cc: dmaengine, devicetree, linux-arm-kernel, linux-kernel, git

> -----Original Message-----
> From: Lars-Peter Clausen <lars@metafoo.de>
> Sent: Thursday, April 15, 2021 12:56 PM
> To: Radhey Shyam Pandey <radheys@xilinx.com>; vkoul@kernel.org;
> robh+dt@kernel.org; Michal Simek <michals@xilinx.com>
> Cc: dmaengine@vger.kernel.org; devicetree@vger.kernel.org; linux-arm-
> kernel@lists.infradead.org; linux-kernel@vger.kernel.org; git
> <git@xilinx.com>
> Subject: Re: [RFC v2 PATCH 5/7] dmaengine: xilinx_dma: Freeup active list
> based on descriptor completion bit
> 
> On 4/9/21 7:56 PM, Radhey Shyam Pandey wrote:
> > AXIDMA IP in SG mode sets completion bit to 1 when the transfer is
> > completed. Read this bit to move descriptor from active list to the
> > done list. This feature is needed when interrupt delay timeout and
> > IRQThreshold is enabled i.e Dly_IrqEn is triggered w/o completing
> > interrupt threshold.
> >
> > Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
> > ---
> > - Check BD completion bit only for SG mode.
> > - Modify the logic to have early return path.
> > ---
> >   drivers/dma/xilinx/xilinx_dma.c | 7 +++++++
> >   1 file changed, 7 insertions(+)
> >
> > diff --git a/drivers/dma/xilinx/xilinx_dma.c
> > b/drivers/dma/xilinx/xilinx_dma.c index 890bf46b36e5..f2305a73cb91
> > 100644
> > --- a/drivers/dma/xilinx/xilinx_dma.c
> > +++ b/drivers/dma/xilinx/xilinx_dma.c
> > @@ -177,6 +177,7 @@
> >   #define XILINX_DMA_CR_COALESCE_SHIFT	16
> >   #define XILINX_DMA_BD_SOP		BIT(27)
> >   #define XILINX_DMA_BD_EOP		BIT(26)
> > +#define XILINX_DMA_BD_COMP_MASK		BIT(31)
> >   #define XILINX_DMA_COALESCE_MAX		255
> >   #define XILINX_DMA_NUM_DESCS		512
> >   #define XILINX_DMA_NUM_APP_WORDS	5
> > @@ -1683,12 +1684,18 @@ static void xilinx_dma_issue_pending(struct
> dma_chan *dchan)
> >   static void xilinx_dma_complete_descriptor(struct xilinx_dma_chan
> *chan)
> >   {
> >   	struct xilinx_dma_tx_descriptor *desc, *next;
> > +	struct xilinx_axidma_tx_segment *seg;
> >
> >   	/* This function was invoked with lock held */
> >   	if (list_empty(&chan->active_list))
> >   		return;
> >
> >   	list_for_each_entry_safe(desc, next, &chan->active_list, node) {
> > +		/* TODO: remove hardcoding for axidma_tx_segment */
> > +		seg = list_last_entry(&desc->segments,
> > +				      struct xilinx_axidma_tx_segment, node);
> > +		if (!(seg->hw.status & XILINX_DMA_BD_COMP_MASK) &&
> chan->has_sg)
> > +			break;
> >   		if (chan->has_sg && chan->xdev->dma_config->dmatype !=
> >   		    XDMA_TYPE_VDMA)
> >   			desc->residue = xilinx_dma_get_residue(chan, desc);
> 
> Since not all descriptors will be completed in this function the `chan->idle =
> true;` in xilinx_dma_irq_handler() needs to be gated on the active_list being
> empty.

Thanks for pointing it out. I agree and will fix it in the next version.
> 
> xilinx_dma_complete_descriptor



* RE: [RFC v2 PATCH 7/7] dmaengine: xilinx_dma: Program interrupt delay timeout
  2021-04-15  7:33   ` Lars-Peter Clausen
@ 2021-06-11 19:33     ` Radhey Shyam Pandey
  0 siblings, 0 replies; 20+ messages in thread
From: Radhey Shyam Pandey @ 2021-06-11 19:33 UTC (permalink / raw)
  To: Lars-Peter Clausen, vkoul, robh+dt, Michal Simek
  Cc: dmaengine, devicetree, linux-arm-kernel, linux-kernel, git

> -----Original Message-----
> From: Lars-Peter Clausen <lars@metafoo.de>
> Sent: Thursday, April 15, 2021 1:03 PM
> To: Radhey Shyam Pandey <radheys@xilinx.com>; vkoul@kernel.org;
> robh+dt@kernel.org; Michal Simek <michals@xilinx.com>
> Cc: dmaengine@vger.kernel.org; devicetree@vger.kernel.org; linux-arm-
> kernel@lists.infradead.org; linux-kernel@vger.kernel.org; git
> <git@xilinx.com>
> Subject: Re: [RFC v2 PATCH 7/7] dmaengine: xilinx_dma: Program interrupt
> delay timeout
> 
> On 4/9/21 7:56 PM, Radhey Shyam Pandey wrote:
> > Program IRQDelay for AXI DMA. The interrupt timeout mechanism causes
> > the DMA engine to generate an interrupt after the delay time period
> > has expired. It enables dmaengine to respond in real-time even though
> > interrupt coalescing is configured. It also remove the placeholder
> > for delay interrupt and merge it with frame completion interrupt.
> > Since by default interrupt delay timeout is disabled this feature
> > addition has no functional impact on VDMA and CDMA IP's.
> 
> In my opinion this should not come from the devicetree. This setting is
> application specific and should be configured through a runtime API.

The motivation for reading the IRQ delay from DT was to avoid creating a
custom interface for clients. For example, if the ethernet driver called
the xilinx_vdma_channel_set_config() API, it would no longer be generic
enough to be hooked up to any other dmaengine provider. Any thoughts on
this? With DT, the only limitation is that the value is not runtime
programmable.

Thanks,
Radhey
> 
> For the VDMA there is already xilinx_vdma_channel_set_config() which
> allows to configure the maximum number of IRQs that can be coalesced and
> the IRQ delay. Something similar is probably needed for the AXIDMA.
> 
> >
> > Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
> > ---
> > Changes for v2:
> > - Read irq delay timeout value from DT.
> > - Merge interrupt processing for frame done and delay interrupt.
> > ---
> >   drivers/dma/xilinx/xilinx_dma.c | 20 +++++++++++---------
> >   1 file changed, 11 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/dma/xilinx/xilinx_dma.c
> b/drivers/dma/xilinx/xilinx_dma.c
> > index a2ea2d649332..0c0dc9882a01 100644
> > --- a/drivers/dma/xilinx/xilinx_dma.c
> > +++ b/drivers/dma/xilinx/xilinx_dma.c
> > @@ -173,8 +173,10 @@
> >   #define XILINX_DMA_MAX_TRANS_LEN_MAX	23
> >   #define XILINX_DMA_V2_MAX_TRANS_LEN_MAX	26
> >   #define XILINX_DMA_CR_COALESCE_MAX	GENMASK(23, 16)
> > +#define XILINX_DMA_CR_DELAY_MAX		GENMASK(31, 24)
> >   #define XILINX_DMA_CR_CYCLIC_BD_EN_MASK	BIT(4)
> >   #define XILINX_DMA_CR_COALESCE_SHIFT	16
> > +#define XILINX_DMA_CR_DELAY_SHIFT	24
> >   #define XILINX_DMA_BD_SOP		BIT(27)
> >   #define XILINX_DMA_BD_EOP		BIT(26)
> >   #define XILINX_DMA_BD_COMP_MASK		BIT(31)
> > @@ -410,6 +412,7 @@ struct xilinx_dma_tx_descriptor {
> >    * @stop_transfer: Differentiate b/w DMA IP's quiesce
> >    * @tdest: TDEST value for mcdma
> >    * @has_vflip: S2MM vertical flip
> > + * @irq_delay: Interrupt delay timeout
> >    */
> >   struct xilinx_dma_chan {
> >   	struct xilinx_dma_device *xdev;
> > @@ -447,6 +450,7 @@ struct xilinx_dma_chan {
> >   	int (*stop_transfer)(struct xilinx_dma_chan *chan);
> >   	u16 tdest;
> >   	bool has_vflip;
> > +	u8 irq_delay;
> >   };
> >
> >   /**
> > @@ -1555,6 +1559,9 @@ static void xilinx_dma_start_transfer(struct
> xilinx_dma_chan *chan)
> >   	if (chan->has_sg)
> >   		xilinx_write(chan, XILINX_DMA_REG_CURDESC,
> >   			     head_desc->async_tx.phys);
> > +	reg  &= ~XILINX_DMA_CR_DELAY_MAX;
> > +	reg  |= chan->irq_delay << XILINX_DMA_CR_DELAY_SHIFT;
> > +	dma_ctrl_write(chan, XILINX_DMA_REG_DMACR, reg);
> >
> >   	xilinx_dma_start(chan);
> >
> > @@ -1877,15 +1884,8 @@ static irqreturn_t xilinx_dma_irq_handler(int
> irq, void *data)
> >   		}
> >   	}
> >
> > -	if (status & XILINX_DMA_DMASR_DLY_CNT_IRQ) {
> > -		/*
> > -		 * Device takes too long to do the transfer when user
> requires
> > -		 * responsiveness.
> > -		 */
> > -		dev_dbg(chan->dev, "Inter-packet latency too long\n");
> > -	}
> > -
> > -	if (status & XILINX_DMA_DMASR_FRM_CNT_IRQ) {
> > +	if (status & (XILINX_DMA_DMASR_FRM_CNT_IRQ |
> > +		      XILINX_DMA_DMASR_DLY_CNT_IRQ)) {
> >   		spin_lock(&chan->lock);
> >   		xilinx_dma_complete_descriptor(chan);
> >   		chan->idle = true;
> > @@ -2802,6 +2802,8 @@ static int xilinx_dma_chan_probe(struct
> xilinx_dma_device *xdev,
> >   	/* Retrieve the channel properties from the device tree */
> >   	has_dre = of_property_read_bool(node, "xlnx,include-dre");
> >
> > +	of_property_read_u8(node, "xlnx,irq-delay", &chan->irq_delay);
> > +
> >   	chan->genlock = of_property_read_bool(node, "xlnx,genlock-
> mode");
> >
> >   	err = of_property_read_u32(node, "xlnx,datawidth", &value);
> 




Thread overview: 20+ messages
2021-04-09 17:55 [RFC v2 PATCH 0/7] Xilinx DMA enhancements and optimization Radhey Shyam Pandey
2021-04-09 17:55 ` [RFC v2 PATCH 1/7] dt-bindings: dmaengine: xilinx_dma: Add xlnx,axistream-connected property Radhey Shyam Pandey
2021-04-12 18:25   ` Rob Herring
2021-04-09 17:56 ` [RFC v2 PATCH 2/7] dt-bindings: dmaengine: xilinx_dma: Add xlnx,irq-delay property Radhey Shyam Pandey
2021-04-12 18:25   ` Rob Herring
2021-04-09 17:56 ` [RFC v2 PATCH 3/7] dmaengine: xilinx_dma: Pass AXI4-Stream control words to dma client Radhey Shyam Pandey
2021-04-09 17:56 ` [RFC v2 PATCH 4/7] dmaengine: xilinx_dma: Increase AXI DMA transaction segment count Radhey Shyam Pandey
2021-04-09 17:56 ` [RFC v2 PATCH 5/7] dmaengine: xilinx_dma: Freeup active list based on descriptor completion bit Radhey Shyam Pandey
2021-04-15  7:08   ` Lars-Peter Clausen
2021-06-11 16:16     ` Radhey Shyam Pandey
2021-04-15  7:26   ` Lars-Peter Clausen
2021-06-11 18:58     ` Radhey Shyam Pandey
2021-04-09 17:56 ` [RFC v2 PATCH 6/7] dmaengine: xilinx_dma: Use tasklet_hi_schedule for timing critical usecase Radhey Shyam Pandey
2021-04-15  7:10   ` Lars-Peter Clausen
2021-06-11 18:30     ` Radhey Shyam Pandey
2021-04-09 17:56 ` [RFC v2 PATCH 7/7] dmaengine: xilinx_dma: Program interrupt delay timeout Radhey Shyam Pandey
2021-04-15  7:33   ` Lars-Peter Clausen
2021-06-11 19:33     ` Radhey Shyam Pandey
2021-04-15  7:06 ` [RFC v2 PATCH 0/7] Xilinx DMA enhancements and optimization Lars-Peter Clausen
2021-06-11 16:13   ` Radhey Shyam Pandey
