dmaengine.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 0/3] dmaengine: ti: edma: Polled completion support
@ 2019-07-16  8:26 Peter Ujfalusi
  2019-07-16  8:26 ` [PATCH v5 1/3] dmaengine: ti: edma: Clean up the 2x32bit array register accesses Peter Ujfalusi
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Peter Ujfalusi @ 2019-07-16  8:26 UTC (permalink / raw)
  To: vkoul; +Cc: dan.j.williams, dmaengine, linux-arm-kernel, linux-omap

Hi,

Changes since v4:
- Split the DMA_COMPLETE and !txstate check as Vinod suggested

Change since v3:
- fix DMA pointer tracking for memcpy
- completion polling is only done when it is asked by not providing
  DMA_PREP_INTERRUPT for memcpy

Changes since v2:
- Fix typo in the comment for patch 0

Changes since v1:
- Cleanup patch for the array register handling
- typo fixed in patch2 commit message

The code around the array register access was pretty confusing for the first
look, so clean them up first then use the cleaner way in the polled handling.

When a DMA client driver does not set the DMA_PREP_INTERRUPT because it
does not want to use interrupts for DMA completion or because it can not
rely on DMA interrupts due to executing the memcpy when interrupts are
disabled it will poll the status of the transfer.

Since we can not tell from any EDMA register that the transfer is
completed, we can only know that the paRAM set has been sent to TPTC for
processing we need to check the residue of the transfer, if it is 0 then
the transfer is completed.

The polled completion can bve tested by applying:
https://patchwork.kernel.org/patch/10966499/

Enabling the memcpy for EDMA and run the dmatest with polled = 1.

Or, enable the EDMA memcpy support and boot up any dra7 family device with
display enabled. The workaround for DMM errata i878 uses polled DMA memcpy.

Regards,
Peter
---
Peter Ujfalusi (3):
  dmaengine: ti: edma: Clean up the 2x32bit array register accesses
  dmaengine: ti: edma: Correct the residue calculation (fix for memcpy)
  dmaengine: ti: edma: Support for polled (memcpy) completion

 drivers/dma/ti/edma.c | 174 ++++++++++++++++++++++++++++--------------
 1 file changed, 117 insertions(+), 57 deletions(-)

-- 
Peter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v5 1/3] dmaengine: ti: edma: Clean up the 2x32bit array register accesses
  2019-07-16  8:26 [PATCH v5 0/3] dmaengine: ti: edma: Polled completion support Peter Ujfalusi
@ 2019-07-16  8:26 ` Peter Ujfalusi
  2019-07-16  8:26 ` [PATCH v5 2/3] dmaengine: ti: edma: Correct the residue calculation (fix for memcpy) Peter Ujfalusi
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Peter Ujfalusi @ 2019-07-16  8:26 UTC (permalink / raw)
  To: vkoul; +Cc: dan.j.williams, dmaengine, linux-arm-kernel, linux-omap

Introduce defines for getting the array index and the bit number within the
64bit array register pairs.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
---
 drivers/dma/ti/edma.c | 106 ++++++++++++++++++++++++------------------
 1 file changed, 61 insertions(+), 45 deletions(-)

diff --git a/drivers/dma/ti/edma.c b/drivers/dma/ti/edma.c
index ceabdea40ae0..a39f817b3888 100644
--- a/drivers/dma/ti/edma.c
+++ b/drivers/dma/ti/edma.c
@@ -133,6 +133,17 @@
 #define EDMA_CONT_PARAMS_FIXED_EXACT	 1002
 #define EDMA_CONT_PARAMS_FIXED_NOT_EXACT 1003
 
+/*
+ * 64bit array registers are split into two 32bit registers:
+ * reg0: channel/event 0-31
+ * reg1: channel/event 32-63
+ *
+ * bit 5 in the channel number tells the array index (0/1)
+ * bit 0-4 (0x1f) is the bit offset within the register
+ */
+#define EDMA_REG_ARRAY_INDEX(channel)	((channel) >> 5)
+#define EDMA_CHANNEL_BIT(channel)	(BIT((channel) & 0x1f))
+
 /* PaRAM slots are laid out like this */
 struct edmacc_param {
 	u32 opt;
@@ -441,15 +452,14 @@ static void edma_setup_interrupt(struct edma_chan *echan, bool enable)
 {
 	struct edma_cc *ecc = echan->ecc;
 	int channel = EDMA_CHAN_SLOT(echan->ch_num);
+	int idx = EDMA_REG_ARRAY_INDEX(channel);
+	int ch_bit = EDMA_CHANNEL_BIT(channel);
 
 	if (enable) {
-		edma_shadow0_write_array(ecc, SH_ICR, channel >> 5,
-					 BIT(channel & 0x1f));
-		edma_shadow0_write_array(ecc, SH_IESR, channel >> 5,
-					 BIT(channel & 0x1f));
+		edma_shadow0_write_array(ecc, SH_ICR, idx, ch_bit);
+		edma_shadow0_write_array(ecc, SH_IESR, idx, ch_bit);
 	} else {
-		edma_shadow0_write_array(ecc, SH_IECR, channel >> 5,
-					 BIT(channel & 0x1f));
+		edma_shadow0_write_array(ecc, SH_IECR, idx, ch_bit);
 	}
 }
 
@@ -587,26 +597,26 @@ static void edma_start(struct edma_chan *echan)
 {
 	struct edma_cc *ecc = echan->ecc;
 	int channel = EDMA_CHAN_SLOT(echan->ch_num);
-	int j = (channel >> 5);
-	unsigned int mask = BIT(channel & 0x1f);
+	int idx = EDMA_REG_ARRAY_INDEX(channel);
+	int ch_bit = EDMA_CHANNEL_BIT(channel);
 
 	if (!echan->hw_triggered) {
 		/* EDMA channels without event association */
-		dev_dbg(ecc->dev, "ESR%d %08x\n", j,
-			edma_shadow0_read_array(ecc, SH_ESR, j));
-		edma_shadow0_write_array(ecc, SH_ESR, j, mask);
+		dev_dbg(ecc->dev, "ESR%d %08x\n", idx,
+			edma_shadow0_read_array(ecc, SH_ESR, idx));
+		edma_shadow0_write_array(ecc, SH_ESR, idx, ch_bit);
 	} else {
 		/* EDMA channel with event association */
-		dev_dbg(ecc->dev, "ER%d %08x\n", j,
-			edma_shadow0_read_array(ecc, SH_ER, j));
+		dev_dbg(ecc->dev, "ER%d %08x\n", idx,
+			edma_shadow0_read_array(ecc, SH_ER, idx));
 		/* Clear any pending event or error */
-		edma_write_array(ecc, EDMA_ECR, j, mask);
-		edma_write_array(ecc, EDMA_EMCR, j, mask);
+		edma_write_array(ecc, EDMA_ECR, idx, ch_bit);
+		edma_write_array(ecc, EDMA_EMCR, idx, ch_bit);
 		/* Clear any SER */
-		edma_shadow0_write_array(ecc, SH_SECR, j, mask);
-		edma_shadow0_write_array(ecc, SH_EESR, j, mask);
-		dev_dbg(ecc->dev, "EER%d %08x\n", j,
-			edma_shadow0_read_array(ecc, SH_EER, j));
+		edma_shadow0_write_array(ecc, SH_SECR, idx, ch_bit);
+		edma_shadow0_write_array(ecc, SH_EESR, idx, ch_bit);
+		dev_dbg(ecc->dev, "EER%d %08x\n", idx,
+			edma_shadow0_read_array(ecc, SH_EER, idx));
 	}
 }
 
@@ -614,19 +624,19 @@ static void edma_stop(struct edma_chan *echan)
 {
 	struct edma_cc *ecc = echan->ecc;
 	int channel = EDMA_CHAN_SLOT(echan->ch_num);
-	int j = (channel >> 5);
-	unsigned int mask = BIT(channel & 0x1f);
+	int idx = EDMA_REG_ARRAY_INDEX(channel);
+	int ch_bit = EDMA_CHANNEL_BIT(channel);
 
-	edma_shadow0_write_array(ecc, SH_EECR, j, mask);
-	edma_shadow0_write_array(ecc, SH_ECR, j, mask);
-	edma_shadow0_write_array(ecc, SH_SECR, j, mask);
-	edma_write_array(ecc, EDMA_EMCR, j, mask);
+	edma_shadow0_write_array(ecc, SH_EECR, idx, ch_bit);
+	edma_shadow0_write_array(ecc, SH_ECR, idx, ch_bit);
+	edma_shadow0_write_array(ecc, SH_SECR, idx, ch_bit);
+	edma_write_array(ecc, EDMA_EMCR, idx, ch_bit);
 
 	/* clear possibly pending completion interrupt */
-	edma_shadow0_write_array(ecc, SH_ICR, j, mask);
+	edma_shadow0_write_array(ecc, SH_ICR, idx, ch_bit);
 
-	dev_dbg(ecc->dev, "EER%d %08x\n", j,
-		edma_shadow0_read_array(ecc, SH_EER, j));
+	dev_dbg(ecc->dev, "EER%d %08x\n", idx,
+		edma_shadow0_read_array(ecc, SH_EER, idx));
 
 	/* REVISIT:  consider guarding against inappropriate event
 	 * chaining by overwriting with dummy_paramset.
@@ -640,45 +650,49 @@ static void edma_stop(struct edma_chan *echan)
 static void edma_pause(struct edma_chan *echan)
 {
 	int channel = EDMA_CHAN_SLOT(echan->ch_num);
-	unsigned int mask = BIT(channel & 0x1f);
 
-	edma_shadow0_write_array(echan->ecc, SH_EECR, channel >> 5, mask);
+	edma_shadow0_write_array(echan->ecc, SH_EECR,
+				 EDMA_REG_ARRAY_INDEX(channel),
+				 EDMA_CHANNEL_BIT(channel));
 }
 
 /* Re-enable EDMA hardware events on the specified channel.  */
 static void edma_resume(struct edma_chan *echan)
 {
 	int channel = EDMA_CHAN_SLOT(echan->ch_num);
-	unsigned int mask = BIT(channel & 0x1f);
 
-	edma_shadow0_write_array(echan->ecc, SH_EESR, channel >> 5, mask);
+	edma_shadow0_write_array(echan->ecc, SH_EESR,
+				 EDMA_REG_ARRAY_INDEX(channel),
+				 EDMA_CHANNEL_BIT(channel));
 }
 
 static void edma_trigger_channel(struct edma_chan *echan)
 {
 	struct edma_cc *ecc = echan->ecc;
 	int channel = EDMA_CHAN_SLOT(echan->ch_num);
-	unsigned int mask = BIT(channel & 0x1f);
+	int idx = EDMA_REG_ARRAY_INDEX(channel);
+	int ch_bit = EDMA_CHANNEL_BIT(channel);
 
-	edma_shadow0_write_array(ecc, SH_ESR, (channel >> 5), mask);
+	edma_shadow0_write_array(ecc, SH_ESR, idx, ch_bit);
 
-	dev_dbg(ecc->dev, "ESR%d %08x\n", (channel >> 5),
-		edma_shadow0_read_array(ecc, SH_ESR, (channel >> 5)));
+	dev_dbg(ecc->dev, "ESR%d %08x\n", idx,
+		edma_shadow0_read_array(ecc, SH_ESR, idx));
 }
 
 static void edma_clean_channel(struct edma_chan *echan)
 {
 	struct edma_cc *ecc = echan->ecc;
 	int channel = EDMA_CHAN_SLOT(echan->ch_num);
-	int j = (channel >> 5);
-	unsigned int mask = BIT(channel & 0x1f);
+	int idx = EDMA_REG_ARRAY_INDEX(channel);
+	int ch_bit = EDMA_CHANNEL_BIT(channel);
 
-	dev_dbg(ecc->dev, "EMR%d %08x\n", j, edma_read_array(ecc, EDMA_EMR, j));
-	edma_shadow0_write_array(ecc, SH_ECR, j, mask);
+	dev_dbg(ecc->dev, "EMR%d %08x\n", idx,
+		edma_read_array(ecc, EDMA_EMR, idx));
+	edma_shadow0_write_array(ecc, SH_ECR, idx, ch_bit);
 	/* Clear the corresponding EMR bits */
-	edma_write_array(ecc, EDMA_EMCR, j, mask);
+	edma_write_array(ecc, EDMA_EMCR, idx, ch_bit);
 	/* Clear any SER */
-	edma_shadow0_write_array(ecc, SH_SECR, j, mask);
+	edma_shadow0_write_array(ecc, SH_SECR, idx, ch_bit);
 	edma_write(ecc, EDMA_CCERRCLR, BIT(16) | BIT(1) | BIT(0));
 }
 
@@ -708,7 +722,8 @@ static int edma_alloc_channel(struct edma_chan *echan,
 	int channel = EDMA_CHAN_SLOT(echan->ch_num);
 
 	/* ensure access through shadow region 0 */
-	edma_or_array2(ecc, EDMA_DRAE, 0, channel >> 5, BIT(channel & 0x1f));
+	edma_or_array2(ecc, EDMA_DRAE, 0, EDMA_REG_ARRAY_INDEX(channel),
+		       EDMA_CHANNEL_BIT(channel));
 
 	/* ensure no events are pending */
 	edma_stop(echan);
@@ -2482,8 +2497,9 @@ static int edma_pm_resume(struct device *dev)
 	for (i = 0; i < ecc->num_channels; i++) {
 		if (echan[i].alloced) {
 			/* ensure access through shadow region 0 */
-			edma_or_array2(ecc, EDMA_DRAE, 0, i >> 5,
-				       BIT(i & 0x1f));
+			edma_or_array2(ecc, EDMA_DRAE, 0,
+				       EDMA_REG_ARRAY_INDEX(i),
+				       EDMA_CHANNEL_BIT(i));
 
 			edma_setup_interrupt(&echan[i], true);
 
-- 
Peter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v5 2/3] dmaengine: ti: edma: Correct the residue calculation (fix for memcpy)
  2019-07-16  8:26 [PATCH v5 0/3] dmaengine: ti: edma: Polled completion support Peter Ujfalusi
  2019-07-16  8:26 ` [PATCH v5 1/3] dmaengine: ti: edma: Clean up the 2x32bit array register accesses Peter Ujfalusi
@ 2019-07-16  8:26 ` Peter Ujfalusi
  2019-07-25 13:46   ` Vinod Koul
  2019-07-16  8:26 ` [PATCH v5 3/3] dmaengine: ti: edma: Support for polled (memcpy) completion Peter Ujfalusi
  2019-07-29  6:42 ` [PATCH v5 0/3] dmaengine: ti: edma: Polled completion support Vinod Koul
  3 siblings, 1 reply; 10+ messages in thread
From: Peter Ujfalusi @ 2019-07-16  8:26 UTC (permalink / raw)
  To: vkoul; +Cc: dan.j.williams, dmaengine, linux-arm-kernel, linux-omap

For memcpy we never stored the start address of the transfer for the pset
which rendered the memcpy residue calculation completely broken.

In the edma_residue() function we also need to to some correction for the
calculations:
Instead waiting for all EDMA channels to be idle (in a busy system it can
take few iteration to hit a point when all queues are idle) wait for the
event pending on the given channel (SH_ER for hw synchronized channels,
SH_ESR for manually triggered channels).

If the position returned by EMDA is 0 it imiplies that the last paRAM set
has been consumed and we are at the closing dummy set, thus we can conclude
that the transfer is completed and we can return 0 as residue.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
---
 drivers/dma/ti/edma.c | 31 +++++++++++++++++++++++--------
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/drivers/dma/ti/edma.c b/drivers/dma/ti/edma.c
index a39f817b3888..5b8cbd6d7610 100644
--- a/drivers/dma/ti/edma.c
+++ b/drivers/dma/ti/edma.c
@@ -1026,6 +1026,7 @@ static int edma_config_pset(struct dma_chan *chan, struct edma_pset *epset,
 		src_cidx = cidx;
 		dst_bidx = acnt;
 		dst_cidx = cidx;
+		epset->addr = src_addr;
 	} else {
 		dev_err(dev, "%s: direction not implemented yet\n", __func__);
 		return -EINVAL;
@@ -1736,7 +1737,11 @@ static u32 edma_residue(struct edma_desc *edesc)
 	int loop_count = EDMA_MAX_TR_WAIT_LOOPS;
 	struct edma_chan *echan = edesc->echan;
 	struct edma_pset *pset = edesc->pset;
-	dma_addr_t done, pos;
+	dma_addr_t done, pos, pos_old;
+	int channel = EDMA_CHAN_SLOT(echan->ch_num);
+	int idx = EDMA_REG_ARRAY_INDEX(channel);
+	int ch_bit = EDMA_CHANNEL_BIT(channel);
+	int event_reg;
 	int i;
 
 	/*
@@ -1749,16 +1754,20 @@ static u32 edma_residue(struct edma_desc *edesc)
 	 * "pos" may represent a transfer request that is still being
 	 * processed by the EDMACC or EDMATC. We will busy wait until
 	 * any one of the situations occurs:
-	 *   1. the DMA hardware is idle
-	 *   2. a new transfer request is setup
+	 *   1. while and event is pending for the channel
+	 *   2. a position updated
 	 *   3. we hit the loop limit
 	 */
-	while (edma_read(echan->ecc, EDMA_CCSTAT) & EDMA_CCSTAT_ACTV) {
-		/* check if a new transfer request is setup */
-		if (edma_get_position(echan->ecc,
-				      echan->slot[0], dst) != pos) {
+	if (is_slave_direction(edesc->direction))
+		event_reg = SH_ER;
+	else
+		event_reg = SH_ESR;
+
+	pos_old = pos;
+	while (edma_shadow0_read_array(echan->ecc, event_reg, idx) & ch_bit) {
+		pos = edma_get_position(echan->ecc, echan->slot[0], dst);
+		if (pos != pos_old)
 			break;
-		}
 
 		if (!--loop_count) {
 			dev_dbg_ratelimited(echan->vchan.chan.device->dev,
@@ -1783,6 +1792,12 @@ static u32 edma_residue(struct edma_desc *edesc)
 		return edesc->residue_stat;
 	}
 
+	/*
+	 * If the position is 0, then EDMA loaded the closing dummy slot, the
+	 * transfer is completed
+	 */
+	if (!pos)
+		return 0;
 	/*
 	 * For SG operation we catch up with the last processed
 	 * status.
-- 
Peter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v5 3/3] dmaengine: ti: edma: Support for polled (memcpy) completion
  2019-07-16  8:26 [PATCH v5 0/3] dmaengine: ti: edma: Polled completion support Peter Ujfalusi
  2019-07-16  8:26 ` [PATCH v5 1/3] dmaengine: ti: edma: Clean up the 2x32bit array register accesses Peter Ujfalusi
  2019-07-16  8:26 ` [PATCH v5 2/3] dmaengine: ti: edma: Correct the residue calculation (fix for memcpy) Peter Ujfalusi
@ 2019-07-16  8:26 ` Peter Ujfalusi
  2019-07-29  6:42 ` [PATCH v5 0/3] dmaengine: ti: edma: Polled completion support Vinod Koul
  3 siblings, 0 replies; 10+ messages in thread
From: Peter Ujfalusi @ 2019-07-16  8:26 UTC (permalink / raw)
  To: vkoul; +Cc: dan.j.williams, dmaengine, linux-arm-kernel, linux-omap

When a DMA client driver does not set the DMA_PREP_INTERRUPT because it
does not want to use interrupts for DMA completion or because it can not
rely on DMA interrupts due to executing the memcpy when interrupts are
disabled it will poll the status of the transfer.

Since we can not tell from any EDMA register that the transfer is
completed, we can only know that the paRAM set has been sent to TPTC for
processing we need to check the residue of the transfer, if it is 0 then
the transfer is completed.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
---
 drivers/dma/ti/edma.c | 37 +++++++++++++++++++++++++++++++++----
 1 file changed, 33 insertions(+), 4 deletions(-)

diff --git a/drivers/dma/ti/edma.c b/drivers/dma/ti/edma.c
index 5b8cbd6d7610..bcd431283d8a 100644
--- a/drivers/dma/ti/edma.c
+++ b/drivers/dma/ti/edma.c
@@ -180,6 +180,7 @@ struct edma_desc {
 	struct list_head		node;
 	enum dma_transfer_direction	direction;
 	int				cyclic;
+	bool				polled;
 	int				absync;
 	int				pset_nr;
 	struct edma_chan		*echan;
@@ -1227,8 +1228,9 @@ static struct dma_async_tx_descriptor *edma_prep_dma_memcpy(
 
 	edesc->pset[0].param.opt |= ITCCHEN;
 	if (nslots == 1) {
-		/* Enable transfer complete interrupt */
-		edesc->pset[0].param.opt |= TCINTEN;
+		/* Enable transfer complete interrupt if requested */
+		if (tx_flags & DMA_PREP_INTERRUPT)
+			edesc->pset[0].param.opt |= TCINTEN;
 	} else {
 		/* Enable transfer complete chaining for the first slot */
 		edesc->pset[0].param.opt |= TCCHEN;
@@ -1255,9 +1257,14 @@ static struct dma_async_tx_descriptor *edma_prep_dma_memcpy(
 		}
 
 		edesc->pset[1].param.opt |= ITCCHEN;
-		edesc->pset[1].param.opt |= TCINTEN;
+		/* Enable transfer complete interrupt if requested */
+		if (tx_flags & DMA_PREP_INTERRUPT)
+			edesc->pset[1].param.opt |= TCINTEN;
 	}
 
+	if (!(tx_flags & DMA_PREP_INTERRUPT))
+		edesc->polled = true;
+
 	return vchan_tx_prep(&echan->vchan, &edesc->vdesc, tx_flags);
 }
 
@@ -1827,18 +1834,40 @@ static enum dma_status edma_tx_status(struct dma_chan *chan,
 {
 	struct edma_chan *echan = to_edma_chan(chan);
 	struct virt_dma_desc *vdesc;
+	struct dma_tx_state txstate_tmp;
 	enum dma_status ret;
 	unsigned long flags;
 
 	ret = dma_cookie_status(chan, cookie, txstate);
-	if (ret == DMA_COMPLETE || !txstate)
+
+	if (ret == DMA_COMPLETE)
 		return ret;
 
+	/* Provide a dummy dma_tx_state for completion checking */
+	if (!txstate)
+		txstate = &txstate_tmp;
+
+	txstate->residue = 0;
 	spin_lock_irqsave(&echan->vchan.lock, flags);
 	if (echan->edesc && echan->edesc->vdesc.tx.cookie == cookie)
 		txstate->residue = edma_residue(echan->edesc);
 	else if ((vdesc = vchan_find_desc(&echan->vchan, cookie)))
 		txstate->residue = to_edma_desc(&vdesc->tx)->residue;
+
+	/*
+	 * Mark the cookie completed if the residue is 0 for non cyclic
+	 * transfers
+	 */
+	if (ret != DMA_COMPLETE && !txstate->residue &&
+	    echan->edesc && echan->edesc->polled &&
+	    echan->edesc->vdesc.tx.cookie == cookie) {
+		edma_stop(echan);
+		vchan_cookie_complete(&echan->edesc->vdesc);
+		echan->edesc = NULL;
+		edma_execute(echan);
+		ret = DMA_COMPLETE;
+	}
+
 	spin_unlock_irqrestore(&echan->vchan.lock, flags);
 
 	return ret;
-- 
Peter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v5 2/3] dmaengine: ti: edma: Correct the residue calculation (fix for memcpy)
  2019-07-16  8:26 ` [PATCH v5 2/3] dmaengine: ti: edma: Correct the residue calculation (fix for memcpy) Peter Ujfalusi
@ 2019-07-25 13:46   ` Vinod Koul
  0 siblings, 0 replies; 10+ messages in thread
From: Vinod Koul @ 2019-07-25 13:46 UTC (permalink / raw)
  To: Peter Ujfalusi; +Cc: dan.j.williams, dmaengine, linux-arm-kernel, linux-omap

On 16-07-19, 11:26, Peter Ujfalusi wrote:
> For memcpy we never stored the start address of the transfer for the pset
> which rendered the memcpy residue calculation completely broken.
> 
> In the edma_residue() function we also need to to some correction for the
> calculations:
> Instead waiting for all EDMA channels to be idle (in a busy system it can
> take few iteration to hit a point when all queues are idle) wait for the
> event pending on the given channel (SH_ER for hw synchronized channels,
> SH_ESR for manually triggered channels).
> 
> If the position returned by EMDA is 0 it imiplies that the last paRAM set

s/imiplies/implies

> has been consumed and we are at the closing dummy set, thus we can conclude
> that the transfer is completed and we can return 0 as residue.
> 
> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
> ---
>  drivers/dma/ti/edma.c | 31 +++++++++++++++++++++++--------
>  1 file changed, 23 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/dma/ti/edma.c b/drivers/dma/ti/edma.c
> index a39f817b3888..5b8cbd6d7610 100644
> --- a/drivers/dma/ti/edma.c
> +++ b/drivers/dma/ti/edma.c
> @@ -1026,6 +1026,7 @@ static int edma_config_pset(struct dma_chan *chan, struct edma_pset *epset,
>  		src_cidx = cidx;
>  		dst_bidx = acnt;
>  		dst_cidx = cidx;
> +		epset->addr = src_addr;
>  	} else {
>  		dev_err(dev, "%s: direction not implemented yet\n", __func__);
>  		return -EINVAL;
> @@ -1736,7 +1737,11 @@ static u32 edma_residue(struct edma_desc *edesc)
>  	int loop_count = EDMA_MAX_TR_WAIT_LOOPS;
>  	struct edma_chan *echan = edesc->echan;
>  	struct edma_pset *pset = edesc->pset;
> -	dma_addr_t done, pos;
> +	dma_addr_t done, pos, pos_old;
> +	int channel = EDMA_CHAN_SLOT(echan->ch_num);
> +	int idx = EDMA_REG_ARRAY_INDEX(channel);
> +	int ch_bit = EDMA_CHANNEL_BIT(channel);
> +	int event_reg;
>  	int i;
>  
>  	/*
> @@ -1749,16 +1754,20 @@ static u32 edma_residue(struct edma_desc *edesc)
>  	 * "pos" may represent a transfer request that is still being
>  	 * processed by the EDMACC or EDMATC. We will busy wait until
>  	 * any one of the situations occurs:
> -	 *   1. the DMA hardware is idle
> -	 *   2. a new transfer request is setup
> +	 *   1. while and event is pending for the channel
> +	 *   2. a position updated
>  	 *   3. we hit the loop limit
>  	 */
> -	while (edma_read(echan->ecc, EDMA_CCSTAT) & EDMA_CCSTAT_ACTV) {
> -		/* check if a new transfer request is setup */
> -		if (edma_get_position(echan->ecc,
> -				      echan->slot[0], dst) != pos) {
> +	if (is_slave_direction(edesc->direction))
> +		event_reg = SH_ER;
> +	else
> +		event_reg = SH_ESR;
> +
> +	pos_old = pos;
> +	while (edma_shadow0_read_array(echan->ecc, event_reg, idx) & ch_bit) {
> +		pos = edma_get_position(echan->ecc, echan->slot[0], dst);
> +		if (pos != pos_old)
>  			break;
> -		}
>  
>  		if (!--loop_count) {
>  			dev_dbg_ratelimited(echan->vchan.chan.device->dev,
> @@ -1783,6 +1792,12 @@ static u32 edma_residue(struct edma_desc *edesc)
>  		return edesc->residue_stat;
>  	}
>  
> +	/*
> +	 * If the position is 0, then EDMA loaded the closing dummy slot, the
> +	 * transfer is completed
> +	 */
> +	if (!pos)
> +		return 0;
>  	/*
>  	 * For SG operation we catch up with the last processed
>  	 * status.
> -- 
> Peter
> 
> Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
> Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

-- 
~Vinod

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v5 0/3] dmaengine: ti: edma: Polled completion support
  2019-07-16  8:26 [PATCH v5 0/3] dmaengine: ti: edma: Polled completion support Peter Ujfalusi
                   ` (2 preceding siblings ...)
  2019-07-16  8:26 ` [PATCH v5 3/3] dmaengine: ti: edma: Support for polled (memcpy) completion Peter Ujfalusi
@ 2019-07-29  6:42 ` Vinod Koul
  2019-07-29  7:22   ` Peter Ujfalusi
  2019-07-29  7:24   ` Peter Ujfalusi
  3 siblings, 2 replies; 10+ messages in thread
From: Vinod Koul @ 2019-07-29  6:42 UTC (permalink / raw)
  To: Peter Ujfalusi; +Cc: dan.j.williams, dmaengine, linux-arm-kernel, linux-omap

On 16-07-19, 11:26, Peter Ujfalusi wrote:
> Hi,
> 
> Changes since v4:
> - Split the DMA_COMPLETE and !txstate check as Vinod suggested
> 
> Change since v3:
> - fix DMA pointer tracking for memcpy
> - completion polling is only done when it is asked by not providing
>   DMA_PREP_INTERRUPT for memcpy
> 
> Changes since v2:
> - Fix typo in the comment for patch 0
> 
> Changes since v1:
> - Cleanup patch for the array register handling
> - typo fixed in patch2 commit message
> 
> The code around the array register access was pretty confusing for the first
> look, so clean them up first then use the cleaner way in the polled handling.
> 
> When a DMA client driver does not set the DMA_PREP_INTERRUPT because it
> does not want to use interrupts for DMA completion or because it can not
> rely on DMA interrupts due to executing the memcpy when interrupts are
> disabled it will poll the status of the transfer.
> 
> Since we can not tell from any EDMA register that the transfer is
> completed, we can only know that the paRAM set has been sent to TPTC for
> processing we need to check the residue of the transfer, if it is 0 then
> the transfer is completed.
> 
> The polled completion can bve tested by applying:
> https://patchwork.kernel.org/patch/10966499/
> 
> Enabling the memcpy for EDMA and run the dmatest with polled = 1.
> 
> Or, enable the EDMA memcpy support and boot up any dra7 family device with
> display enabled. The workaround for DMM errata i878 uses polled DMA memcpy.

Applied, thanks. Fixed typo in 2nd patch while at it

-- 
~Vinod

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v5 0/3] dmaengine: ti: edma: Polled completion support
  2019-07-29  6:42 ` [PATCH v5 0/3] dmaengine: ti: edma: Polled completion support Vinod Koul
@ 2019-07-29  7:22   ` Peter Ujfalusi
  2019-07-29  7:33     ` Vinod Koul
  2019-07-29  7:24   ` Peter Ujfalusi
  1 sibling, 1 reply; 10+ messages in thread
From: Peter Ujfalusi @ 2019-07-29  7:22 UTC (permalink / raw)
  To: Vinod Koul; +Cc: dan.j.williams, dmaengine, linux-arm-kernel, linux-omap

Vinod,

On 29/07/2019 9.42, Vinod Koul wrote:
> On 16-07-19, 11:26, Peter Ujfalusi wrote:
>> Hi,
>>
>> Changes since v4:
>> - Split the DMA_COMPLETE and !txstate check as Vinod suggested
>>
>> Change since v3:
>> - fix DMA pointer tracking for memcpy
>> - completion polling is only done when it is asked by not providing
>>   DMA_PREP_INTERRUPT for memcpy
>>
>> Changes since v2:
>> - Fix typo in the comment for patch 0
>>
>> Changes since v1:
>> - Cleanup patch for the array register handling
>> - typo fixed in patch2 commit message
>>
>> The code around the array register access was pretty confusing for the first
>> look, so clean them up first then use the cleaner way in the polled handling.
>>
>> When a DMA client driver does not set the DMA_PREP_INTERRUPT because it
>> does not want to use interrupts for DMA completion or because it can not
>> rely on DMA interrupts due to executing the memcpy when interrupts are
>> disabled it will poll the status of the transfer.
>>
>> Since we can not tell from any EDMA register that the transfer is
>> completed, we can only know that the paRAM set has been sent to TPTC for
>> processing we need to check the residue of the transfer, if it is 0 then
>> the transfer is completed.
>>
>> The polled completion can bve tested by applying:
>> https://patchwork.kernel.org/patch/10966499/
>>
>> Enabling the memcpy for EDMA and run the dmatest with polled = 1.
>>
>> Or, enable the EDMA memcpy support and boot up any dra7 family device with
>> display enabled. The workaround for DMM errata i878 uses polled DMA memcpy.
> 
> Applied, thanks. Fixed typo in 2nd patch while at it

Thank you! I was about to send v6 with the fixed typo.

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v5 0/3] dmaengine: ti: edma: Polled completion support
  2019-07-29  6:42 ` [PATCH v5 0/3] dmaengine: ti: edma: Polled completion support Vinod Koul
  2019-07-29  7:22   ` Peter Ujfalusi
@ 2019-07-29  7:24   ` Peter Ujfalusi
  2019-07-29  7:33     ` Vinod Koul
  1 sibling, 1 reply; 10+ messages in thread
From: Peter Ujfalusi @ 2019-07-29  7:24 UTC (permalink / raw)
  To: Vinod Koul; +Cc: dan.j.williams, dmaengine, linux-arm-kernel, linux-omap

Vinod,

On 29/07/2019 9.42, Vinod Koul wrote:
> On 16-07-19, 11:26, Peter Ujfalusi wrote:
>> Hi,
>>
>> Changes since v4:
>> - Split the DMA_COMPLETE and !txstate check as Vinod suggested
>>
>> Change since v3:
>> - fix DMA pointer tracking for memcpy
>> - completion polling is only done when it is asked by not providing
>>   DMA_PREP_INTERRUPT for memcpy
>>
>> Changes since v2:
>> - Fix typo in the comment for patch 0
>>
>> Changes since v1:
>> - Cleanup patch for the array register handling
>> - typo fixed in patch2 commit message
>>
>> The code around the array register access was pretty confusing for the first
>> look, so clean them up first then use the cleaner way in the polled handling.
>>
>> When a DMA client driver does not set the DMA_PREP_INTERRUPT because it
>> does not want to use interrupts for DMA completion or because it can not
>> rely on DMA interrupts due to executing the memcpy when interrupts are
>> disabled it will poll the status of the transfer.
>>
>> Since we can not tell from any EDMA register that the transfer is
>> completed, we can only know that the paRAM set has been sent to TPTC for
>> processing we need to check the residue of the transfer, if it is 0 then
>> the transfer is completed.
>>
>> The polled completion can bve tested by applying:
>> https://patchwork.kernel.org/patch/10966499/

Should I resend this patch so the polled mode can be tested in upstream?

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v5 0/3] dmaengine: ti: edma: Polled completion support
  2019-07-29  7:24   ` Peter Ujfalusi
@ 2019-07-29  7:33     ` Vinod Koul
  0 siblings, 0 replies; 10+ messages in thread
From: Vinod Koul @ 2019-07-29  7:33 UTC (permalink / raw)
  To: Peter Ujfalusi; +Cc: dan.j.williams, dmaengine, linux-arm-kernel, linux-omap

On 29-07-19, 10:24, Peter Ujfalusi wrote:
> Vinod,
> 
> On 29/07/2019 9.42, Vinod Koul wrote:
> > On 16-07-19, 11:26, Peter Ujfalusi wrote:
> >> Hi,
> >>
> >> Changes since v4:
> >> - Split the DMA_COMPLETE and !txstate check as Vinod suggested
> >>
> >> Change since v3:
> >> - fix DMA pointer tracking for memcpy
> >> - completion polling is only done when it is asked by not providing
> >>   DMA_PREP_INTERRUPT for memcpy
> >>
> >> Changes since v2:
> >> - Fix typo in the comment for patch 0
> >>
> >> Changes since v1:
> >> - Cleanup patch for the array register handling
> >> - typo fixed in patch2 commit message
> >>
> >> The code around the array register access was pretty confusing for the first
> >> look, so clean them up first then use the cleaner way in the polled handling.
> >>
> >> When a DMA client driver does not set the DMA_PREP_INTERRUPT because it
> >> does not want to use interrupts for DMA completion or because it can not
> >> rely on DMA interrupts due to executing the memcpy when interrupts are
> >> disabled it will poll the status of the transfer.
> >>
> >> Since we can not tell from any EDMA register that the transfer is
> >> completed, we can only know that the paRAM set has been sent to TPTC for
> >> processing we need to check the residue of the transfer, if it is 0 then
> >> the transfer is completed.
> >>
> >> The polled completion can bve tested by applying:
> >> https://patchwork.kernel.org/patch/10966499/
> 
> Should I resend this patch so the polled mode can be tested in upstream?

Yes sure

-- 
~Vinod

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v5 0/3] dmaengine: ti: edma: Polled completion support
  2019-07-29  7:22   ` Peter Ujfalusi
@ 2019-07-29  7:33     ` Vinod Koul
  0 siblings, 0 replies; 10+ messages in thread
From: Vinod Koul @ 2019-07-29  7:33 UTC (permalink / raw)
  To: Peter Ujfalusi; +Cc: dan.j.williams, dmaengine, linux-arm-kernel, linux-omap

On 29-07-19, 10:22, Peter Ujfalusi wrote:
> Vinod,
> 
> On 29/07/2019 9.42, Vinod Koul wrote:
> > On 16-07-19, 11:26, Peter Ujfalusi wrote:
> >> Hi,
> >>
> >> Changes since v4:
> >> - Split the DMA_COMPLETE and !txstate check as Vinod suggested
> >>
> >> Change since v3:
> >> - fix DMA pointer tracking for memcpy
> >> - completion polling is only done when it is asked by not providing
> >>   DMA_PREP_INTERRUPT for memcpy
> >>
> >> Changes since v2:
> >> - Fix typo in the comment for patch 0
> >>
> >> Changes since v1:
> >> - Cleanup patch for the array register handling
> >> - typo fixed in patch2 commit message
> >>
> >> The code around the array register access was pretty confusing for the first
> >> look, so clean them up first then use the cleaner way in the polled handling.
> >>
> >> When a DMA client driver does not set the DMA_PREP_INTERRUPT because it
> >> does not want to use interrupts for DMA completion or because it can not
> >> rely on DMA interrupts due to executing the memcpy when interrupts are
> >> disabled it will poll the status of the transfer.
> >>
> >> Since we can not tell from any EDMA register that the transfer is
> >> completed, we can only know that the paRAM set has been sent to TPTC for
> >> processing we need to check the residue of the transfer, if it is 0 then
> >> the transfer is completed.
> >>
> >> The polled completion can bve tested by applying:
> >> https://patchwork.kernel.org/patch/10966499/
> >>
> >> Enabling the memcpy for EDMA and run the dmatest with polled = 1.
> >>
> >> Or, enable the EDMA memcpy support and boot up any dra7 family device with
> >> display enabled. The workaround for DMM errata i878 uses polled DMA memcpy.
> > 
> > Applied, thanks. Fixed typo in 2nd patch while at it
> 
> Thank you! I was about to send v6 with the fixed typo.

Simpler typos are ok to fix while applying

-- 
~Vinod

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-07-29  7:35 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-16  8:26 [PATCH v5 0/3] dmaengine: ti: edma: Polled completion support Peter Ujfalusi
2019-07-16  8:26 ` [PATCH v5 1/3] dmaengine: ti: edma: Clean up the 2x32bit array register accesses Peter Ujfalusi
2019-07-16  8:26 ` [PATCH v5 2/3] dmaengine: ti: edma: Correct the residue calculation (fix for memcpy) Peter Ujfalusi
2019-07-25 13:46   ` Vinod Koul
2019-07-16  8:26 ` [PATCH v5 3/3] dmaengine: ti: edma: Support for polled (memcpy) completion Peter Ujfalusi
2019-07-29  6:42 ` [PATCH v5 0/3] dmaengine: ti: edma: Polled completion support Vinod Koul
2019-07-29  7:22   ` Peter Ujfalusi
2019-07-29  7:33     ` Vinod Koul
2019-07-29  7:24   ` Peter Ujfalusi
2019-07-29  7:33     ` Vinod Koul

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).