dmaengine.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 0/3] Fix UART DMA freezes for i.MX SOCs
@ 2019-09-23 13:58 Philipp Puschmann
  2019-09-23 13:58 ` [PATCH v5 1/3] dmaengine: imx-sdma: fix buffer ownership Philipp Puschmann
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Philipp Puschmann @ 2019-09-23 13:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: jlu, yibin.gong, fugang.duan, l.stach, dan.j.williams, vkoul,
	shawnguo, s.hauer, kernel, festevam, linux-imx, dmaengine,
	linux-arm-kernel, Philipp Puschmann

For some years and since many kernel versions there are reports that
RX UART DMA channel stops working at one point. So far the usual
workaround was to disable RX DMA. This patches fix the underlying
problem.

When a running sdma script does not find any usable destination buffer
to put its data into it just leads to stopping the channel being
scheduled again. As solution we manually retrigger the sdma script for
this channel and by this dissolve the freeze.

While this seems to work fine so far, it may come to buffer overruns
when the channel - even temporary - is stopped. This case has to be
addressed by device drivers by increasing the number of DMA periods.

This patch series was tested with the current kernel and backported to
kernel 4.15 with a special use case using a WL1837MOD via UART and
provoking the hanging of UART RX DMA within seconds after starting a
test application. It resulted in well known
  "Bluetooth: hci0: command 0x0408 tx timeout"
errors and complete stop of UART data reception. Our Bluetooth traffic
consists of many independent small packets, mostly only a few bytes,
causing high usage of periods.

Signed-off-by: Philipp Puschmann <philipp.puschmann@emlix.com>
Reviewed-by: Fugang Duan <fugang.duan@nxp.com>

---

Changelog v5:
 - join with patch version from Jan Luebbe
 - adapt comments and patch descriptions
 - add Reviewed-by

Changelog v4:
 - fixed the fixes tags
 
Changelog v3:
 - fixes typo in dma_wmb
 - add fixes tags
 
Changelog v2:
 - adapt title (this patches are not only for i.MX6)
 - improve some comments and patch descriptions
 - add a dma_wb() around BD_DONE flag
 - add Reviewed-by tags
 - split off  "serial: imx: adapt rx buffer and dma periods"

Philipp Puschmann (3):
  dmaengine: imx-sdma: fix buffer ownership
  dmaengine: imx-sdma: fix dma freezes
  dmaengine: imx-sdma: drop redundant variable

 drivers/dma/imx-sdma.c | 37 +++++++++++++++++++++++++++----------
 1 file changed, 27 insertions(+), 10 deletions(-)

-- 
2.23.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v5 1/3] dmaengine: imx-sdma: fix buffer ownership
  2019-09-23 13:58 [PATCH v5 0/3] Fix UART DMA freezes for i.MX SOCs Philipp Puschmann
@ 2019-09-23 13:58 ` Philipp Puschmann
  2019-12-03  9:47   ` Lucas Stach
  2019-12-04  9:19   ` Robin Gong
  2019-09-23 13:58 ` [PATCH v5 2/3] dmaengine: imx-sdma: fix dma freezes Philipp Puschmann
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 12+ messages in thread
From: Philipp Puschmann @ 2019-09-23 13:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: jlu, yibin.gong, fugang.duan, l.stach, dan.j.williams, vkoul,
	shawnguo, s.hauer, kernel, festevam, linux-imx, dmaengine,
	linux-arm-kernel, Philipp Puschmann

BD_DONE flag marks ownership of the buffer. When 1 SDMA owns the
buffer, when 0 ARM owns it. When processing the buffers in
sdma_update_channel_loop the ownership of the currently processed
buffer was set to SDMA again before running the callback function of
the buffer and while the sdma script may be running in parallel. So
there was the possibility to get the buffer overwritten by SDMA before
it has been processed by kernel leading to kind of random errors in the
upper layers, e.g. bluetooth.

Fixes: 1ec1e82f2510 ("dmaengine: Add Freescale i.MX SDMA support")
Signed-off-by: Philipp Puschmann <philipp.puschmann@emlix.com>
---

Changelog v5:
 - no changes

Changelog v4:
 - fixed the fixes tag
 
Changelog v3:
 - use correct dma_wmb() instead of dma_wb()
 - add fixes tag

Changelog v2:
 - add dma_wb()
 
 drivers/dma/imx-sdma.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c
index 9ba74ab7e912..b42281604e54 100644
--- a/drivers/dma/imx-sdma.c
+++ b/drivers/dma/imx-sdma.c
@@ -802,7 +802,6 @@ static void sdma_update_channel_loop(struct sdma_channel *sdmac)
 		*/
 
 		desc->chn_real_count = bd->mode.count;
-		bd->mode.status |= BD_DONE;
 		bd->mode.count = desc->period_len;
 		desc->buf_ptail = desc->buf_tail;
 		desc->buf_tail = (desc->buf_tail + 1) % desc->num_bd;
@@ -817,6 +816,9 @@ static void sdma_update_channel_loop(struct sdma_channel *sdmac)
 		dmaengine_desc_get_callback_invoke(&desc->vd.tx, NULL);
 		spin_lock(&sdmac->vc.lock);
 
+		dma_wmb();
+		bd->mode.status |= BD_DONE;
+
 		if (error)
 			sdmac->status = old_status;
 	}
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 2/3] dmaengine: imx-sdma: fix dma freezes
  2019-09-23 13:58 [PATCH v5 0/3] Fix UART DMA freezes for i.MX SOCs Philipp Puschmann
  2019-09-23 13:58 ` [PATCH v5 1/3] dmaengine: imx-sdma: fix buffer ownership Philipp Puschmann
@ 2019-09-23 13:58 ` Philipp Puschmann
  2019-09-24  3:14   ` Robin Gong
  2019-09-23 13:58 ` [PATCH v5 3/3] dmaengine: imx-sdma: drop redundant variable Philipp Puschmann
  2019-09-23 14:55 ` [PATCH v5 0/3] Fix UART DMA freezes for i.MX SOCs Adam Ford
  3 siblings, 1 reply; 12+ messages in thread
From: Philipp Puschmann @ 2019-09-23 13:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: jlu, yibin.gong, fugang.duan, l.stach, dan.j.williams, vkoul,
	shawnguo, s.hauer, kernel, festevam, linux-imx, dmaengine,
	linux-arm-kernel, Philipp Puschmann

For some years and since many kernel versions there are reports that the
RX UART SDMA channel stops working at some point. The workaround was to
disable DMA for RX. This commit fixes the problem itself. Cyclic DMA
transfers are used by uart and other drivers and these can fail in at
least two cases where we can run out of descriptors available to the
engine:
- Interrupts are disabled for too long and all buffers are filled with
  data, especially in a setup where many small dma transfers are being
  executed only using a tiny part of a single buffer
- DMA errors (such as generated by baud rate mismatch with imx-uart)
  use up all descriptors before we can react.

In this case, SDMA stops the channel and no further transfers are done
until the respective channel is disabled and re-enabled. We can check
if the channel has been stopped and re-enable it then. To distinguish
from the the case that the channel was stopped by upper-level driver
we introduce new flag IMX_DMA_ACTIVE.

As sdmac->desc is constant we can move desc out of the loop.

Fixes: 1ec1e82f2510 ("dmaengine: Add Freescale i.MX SDMA support")
Signed-off-by: Philipp Puschmann <philipp.puschmann@emlix.com>
Signed-off-by: Jan Luebbe <jlu@pengutronix.de>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
---

Changelog v5:
 - join with patch version from Jan Luebbe
 - adapt comments and patch descriptions
 
Changelog v4:
 - fixed the fixes tag
 
Changelog v3:
 - use correct dma_wmb() instead of dma_wb()
 - add fixes tag
 
Changelog v2:
 - clarify comment and commit description

 drivers/dma/imx-sdma.c | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c
index b42281604e54..0b1d6a62423d 100644
--- a/drivers/dma/imx-sdma.c
+++ b/drivers/dma/imx-sdma.c
@@ -383,6 +383,7 @@ struct sdma_channel {
 };
 
 #define IMX_DMA_SG_LOOP		BIT(0)
+#define IMX_DMA_ACTIVE		BIT(1)
 
 #define MAX_DMA_CHANNELS 32
 #define MXC_SDMA_DEFAULT_PRIORITY 1
@@ -658,6 +659,9 @@ static int sdma_config_ownership(struct sdma_channel *sdmac,
 
 static void sdma_enable_channel(struct sdma_engine *sdma, int channel)
 {
+	struct sdma_channel *sdmac = &sdma->channel[channel];
+
+	sdmac->flags |= IMX_DMA_ACTIVE;
 	writel(BIT(channel), sdma->regs + SDMA_H_START);
 }
 
@@ -774,16 +778,17 @@ static void sdma_start_desc(struct sdma_channel *sdmac)
 
 static void sdma_update_channel_loop(struct sdma_channel *sdmac)
 {
+	struct sdma_engine *sdma = sdmac->sdma;
 	struct sdma_buffer_descriptor *bd;
+	struct sdma_desc *desc = sdmac->desc;
 	int error = 0;
-	enum dma_status	old_status = sdmac->status;
+	enum dma_status old_status = sdmac->status;
 
 	/*
 	 * loop mode. Iterate over descriptors, re-setup them and
 	 * call callback function.
 	 */
-	while (sdmac->desc) {
-		struct sdma_desc *desc = sdmac->desc;
+	while (desc) {
 
 		bd = &desc->bd[desc->buf_tail];
 
@@ -822,6 +827,18 @@ static void sdma_update_channel_loop(struct sdma_channel *sdmac)
 		if (error)
 			sdmac->status = old_status;
 	}
+
+	/* In some situations it may happen that the sdma does not find any
+	 * usable descriptor in the ring to put data into. The channel is
+	 * stopped then and after having freed some buffers we have to restart
+	 * it manually.
+	 */
+	if ((sdmac->flags & IMX_DMA_ACTIVE) &&
+	    !(readl_relaxed(sdma->regs + SDMA_H_STATSTOP) & BIT(sdmac->channel))) {
+		dev_err_ratelimited(sdma->dev, "SDMA channel %d: cyclic transfer disabled by HW, reenabling\n",
+				    sdmac->channel);
+			writel(BIT(sdmac->channel), sdma->regs + SDMA_H_START);
+	};
 }
 
 static void mxc_sdma_handle_channel_normal(struct sdma_channel *data)
@@ -1051,7 +1068,8 @@ static int sdma_disable_channel(struct dma_chan *chan)
 	struct sdma_engine *sdma = sdmac->sdma;
 	int channel = sdmac->channel;
 
-	writel_relaxed(BIT(channel), sdma->regs + SDMA_H_STATSTOP);
+	sdmac->flags &= ~IMX_DMA_ACTIVE;
+	writel(BIT(channel), sdma->regs + SDMA_H_STATSTOP);
 	sdmac->status = DMA_ERROR;
 
 	return 0;
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 3/3] dmaengine: imx-sdma: drop redundant variable
  2019-09-23 13:58 [PATCH v5 0/3] Fix UART DMA freezes for i.MX SOCs Philipp Puschmann
  2019-09-23 13:58 ` [PATCH v5 1/3] dmaengine: imx-sdma: fix buffer ownership Philipp Puschmann
  2019-09-23 13:58 ` [PATCH v5 2/3] dmaengine: imx-sdma: fix dma freezes Philipp Puschmann
@ 2019-09-23 13:58 ` Philipp Puschmann
  2019-09-23 14:55 ` [PATCH v5 0/3] Fix UART DMA freezes for i.MX SOCs Adam Ford
  3 siblings, 0 replies; 12+ messages in thread
From: Philipp Puschmann @ 2019-09-23 13:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: jlu, yibin.gong, fugang.duan, l.stach, dan.j.williams, vkoul,
	shawnguo, s.hauer, kernel, festevam, linux-imx, dmaengine,
	linux-arm-kernel, Philipp Puschmann

In sdma_prep_dma_cyclic buf is redundant. Drop it.

Signed-off-by: Philipp Puschmann <philipp.puschmann@emlix.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
---

Changelog v3,v4,v5:
 - no changes

Changelog v2:
 - add Reviewed-by tag

 drivers/dma/imx-sdma.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c
index 0b1d6a62423d..0d62664e534e 100644
--- a/drivers/dma/imx-sdma.c
+++ b/drivers/dma/imx-sdma.c
@@ -1549,7 +1549,7 @@ static struct dma_async_tx_descriptor *sdma_prep_dma_cyclic(
 	struct sdma_engine *sdma = sdmac->sdma;
 	int num_periods = buf_len / period_len;
 	int channel = sdmac->channel;
-	int i = 0, buf = 0;
+	int i;
 	struct sdma_desc *desc;
 
 	dev_dbg(sdma->dev, "%s channel: %d\n", __func__, channel);
@@ -1570,7 +1570,7 @@ static struct dma_async_tx_descriptor *sdma_prep_dma_cyclic(
 		goto err_bd_out;
 	}
 
-	while (buf < buf_len) {
+	for (i = 0; i < num_periods; i++) {
 		struct sdma_buffer_descriptor *bd = &desc->bd[i];
 		int param;
 
@@ -1597,9 +1597,6 @@ static struct dma_async_tx_descriptor *sdma_prep_dma_cyclic(
 		bd->mode.status = param;
 
 		dma_addr += period_len;
-		buf += period_len;
-
-		i++;
 	}
 
 	return vchan_tx_prep(&sdmac->vc, &desc->vd, flags);
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 0/3] Fix UART DMA freezes for i.MX SOCs
  2019-09-23 13:58 [PATCH v5 0/3] Fix UART DMA freezes for i.MX SOCs Philipp Puschmann
                   ` (2 preceding siblings ...)
  2019-09-23 13:58 ` [PATCH v5 3/3] dmaengine: imx-sdma: drop redundant variable Philipp Puschmann
@ 2019-09-23 14:55 ` Adam Ford
  2019-09-23 15:06   ` Philipp Puschmann
  3 siblings, 1 reply; 12+ messages in thread
From: Adam Ford @ 2019-09-23 14:55 UTC (permalink / raw)
  To: Philipp Puschmann
  Cc: Linux Kernel Mailing List, fugang.duan, jlu, Fabio Estevam,
	Sascha Hauer, vkoul, NXP Linux Team, Sascha Hauer, dmaengine,
	dan.j.williams, Robin Gong, Shawn Guo, arm-soc, Lucas Stach

On Mon, Sep 23, 2019 at 8:58 AM Philipp Puschmann
<philipp.puschmann@emlix.com> wrote:
>
> For some years and since many kernel versions there are reports that
> RX UART DMA channel stops working at one point. So far the usual
> workaround was to disable RX DMA. This patches fix the underlying
> problem.
>
> When a running sdma script does not find any usable destination buffer
> to put its data into it just leads to stopping the channel being
> scheduled again. As solution we manually retrigger the sdma script for
> this channel and by this dissolve the freeze.
>
> While this seems to work fine so far, it may come to buffer overruns
> when the channel - even temporary - is stopped. This case has to be
> addressed by device drivers by increasing the number of DMA periods.
>
> This patch series was tested with the current kernel and backported to
> kernel 4.15 with a special use case using a WL1837MOD via UART and
> provoking the hanging of UART RX DMA within seconds after starting a
> test application. It resulted in well known
>   "Bluetooth: hci0: command 0x0408 tx timeout"
> errors and complete stop of UART data reception. Our Bluetooth traffic
> consists of many independent small packets, mostly only a few bytes,
> causing high usage of periods.
>

Using the 4.19.y branch, this seems to working just fine for me with an i.MX6Q
with WL1837MOD Bluetooth connected to UART2.  I am still seeing some
timeouts with 5.3, but I'm going to continue to run some tests.

Tested-by: Adam Ford <aford173@gmail.com> #imx6q w/ 4.19 Kernel

> Signed-off-by: Philipp Puschmann <philipp.puschmann@emlix.com>
> Reviewed-by: Fugang Duan <fugang.duan@nxp.com>
>
> ---
>
> Changelog v5:
>  - join with patch version from Jan Luebbe
>  - adapt comments and patch descriptions
>  - add Reviewed-by
>
> Changelog v4:
>  - fixed the fixes tags
>
> Changelog v3:
>  - fixes typo in dma_wmb
>  - add fixes tags
>
> Changelog v2:
>  - adapt title (this patches are not only for i.MX6)
>  - improve some comments and patch descriptions
>  - add a dma_wb() around BD_DONE flag
>  - add Reviewed-by tags
>  - split off  "serial: imx: adapt rx buffer and dma periods"
>
> Philipp Puschmann (3):
>   dmaengine: imx-sdma: fix buffer ownership
>   dmaengine: imx-sdma: fix dma freezes
>   dmaengine: imx-sdma: drop redundant variable
>
>  drivers/dma/imx-sdma.c | 37 +++++++++++++++++++++++++++----------
>  1 file changed, 27 insertions(+), 10 deletions(-)
>
> --
> 2.23.0
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 0/3] Fix UART DMA freezes for i.MX SOCs
  2019-09-23 14:55 ` [PATCH v5 0/3] Fix UART DMA freezes for i.MX SOCs Adam Ford
@ 2019-09-23 15:06   ` Philipp Puschmann
  2019-09-23 15:09     ` Jan Lübbe
  0 siblings, 1 reply; 12+ messages in thread
From: Philipp Puschmann @ 2019-09-23 15:06 UTC (permalink / raw)
  To: Adam Ford
  Cc: Linux Kernel Mailing List, fugang.duan, jlu, Fabio Estevam,
	Sascha Hauer, vkoul, NXP Linux Team, Sascha Hauer, dmaengine,
	dan.j.williams, Robin Gong, Shawn Guo, arm-soc, Lucas Stach

Hi Adam,

Am 23.09.19 um 16:55 schrieb Adam Ford:
> On Mon, Sep 23, 2019 at 8:58 AM Philipp Puschmann
> <philipp.puschmann@emlix.com> wrote:
>>
>> For some years and since many kernel versions there are reports that
>> RX UART DMA channel stops working at one point. So far the usual
>> workaround was to disable RX DMA. This patches fix the underlying
>> problem.
>>
>> When a running sdma script does not find any usable destination buffer
>> to put its data into it just leads to stopping the channel being
>> scheduled again. As solution we manually retrigger the sdma script for
>> this channel and by this dissolve the freeze.
>>
>> While this seems to work fine so far, it may come to buffer overruns
>> when the channel - even temporary - is stopped. This case has to be
>> addressed by device drivers by increasing the number of DMA periods.
>>
>> This patch series was tested with the current kernel and backported to
>> kernel 4.15 with a special use case using a WL1837MOD via UART and
>> provoking the hanging of UART RX DMA within seconds after starting a
>> test application. It resulted in well known
>>   "Bluetooth: hci0: command 0x0408 tx timeout"
>> errors and complete stop of UART data reception. Our Bluetooth traffic
>> consists of many independent small packets, mostly only a few bytes,
>> causing high usage of periods.
>>
> 
> Using the 4.19.y branch, this seems to working just fine for me with an i.MX6Q
> with WL1837MOD Bluetooth connected to UART2.  I am still seeing some
> timeouts with 5.3, but I'm going to continue to run some tests.

Thanks for testing.
With my local setup i still have very few tx timeouts too. But i think they have a different
cause and especially different consequences. When the problem addressed by this series
appear you get a whole bunch of tx timeouts (and maybe errors from Bluetooth
layer) and monitoring received Bluetooth packets with hciconfig shows a
complete freeze of rx counter. Only resetting the hci_uart driver and the wl1837mon then helps.
With these patches applied the rx data shold still coming in even if a single or
multiple tx timeout error happen. I'm not sure where the error comes from and what the
consequences for the Bluetooth layer are.

Regards,
Philipp
> 
> Tested-by: Adam Ford <aford173@gmail.com> #imx6q w/ 4.19 Kernel
> 
>> Signed-off-by: Philipp Puschmann <philipp.puschmann@emlix.com>
>> Reviewed-by: Fugang Duan <fugang.duan@nxp.com>
>>
>> ---
>>
>> Changelog v5:
>>  - join with patch version from Jan Luebbe
>>  - adapt comments and patch descriptions
>>  - add Reviewed-by
>>
>> Changelog v4:
>>  - fixed the fixes tags
>>
>> Changelog v3:
>>  - fixes typo in dma_wmb
>>  - add fixes tags
>>
>> Changelog v2:
>>  - adapt title (this patches are not only for i.MX6)
>>  - improve some comments and patch descriptions
>>  - add a dma_wb() around BD_DONE flag
>>  - add Reviewed-by tags
>>  - split off  "serial: imx: adapt rx buffer and dma periods"
>>
>> Philipp Puschmann (3):
>>   dmaengine: imx-sdma: fix buffer ownership
>>   dmaengine: imx-sdma: fix dma freezes
>>   dmaengine: imx-sdma: drop redundant variable
>>
>>  drivers/dma/imx-sdma.c | 37 +++++++++++++++++++++++++++----------
>>  1 file changed, 27 insertions(+), 10 deletions(-)
>>
>> --
>> 2.23.0
>>
>>
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 0/3] Fix UART DMA freezes for i.MX SOCs
  2019-09-23 15:06   ` Philipp Puschmann
@ 2019-09-23 15:09     ` Jan Lübbe
  0 siblings, 0 replies; 12+ messages in thread
From: Jan Lübbe @ 2019-09-23 15:09 UTC (permalink / raw)
  To: Philipp Puschmann, Adam Ford
  Cc: fugang.duan, Shawn Guo, Sascha Hauer, Linux Kernel Mailing List,
	vkoul, NXP Linux Team, Sascha Hauer, dmaengine, dan.j.williams,
	Robin Gong, Fabio Estevam, arm-soc, Lucas Stach

On Mon, 2019-09-23 at 17:06 +0200, Philipp Puschmann wrote:
> Thanks for testing.
> With my local setup i still have very few tx timeouts too. But i think they have a different
> cause and especially different consequences. When the problem addressed by this series
> appear you get a whole bunch of tx timeouts (and maybe errors from Bluetooth
> layer) and monitoring received Bluetooth packets with hciconfig shows a
> complete freeze of rx counter. Only resetting the hci_uart driver and the wl1837mon then helps.
> With these patches applied the rx data shold still coming in even if a single or
> multiple tx timeout error happen. I'm not sure where the error comes from and what the
> consequences for the Bluetooth layer are.

For testing, I've used a UART connected to my development host and
configured *mismatching* baud rates. Sending /dev/urandom from the host
to the i.MX6 then triggered the DMA hang (because each character
triggers and error indication, which "uses" a full buffer).

Regards,
Jan
-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |


^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH v5 2/3] dmaengine: imx-sdma: fix dma freezes
  2019-09-23 13:58 ` [PATCH v5 2/3] dmaengine: imx-sdma: fix dma freezes Philipp Puschmann
@ 2019-09-24  3:14   ` Robin Gong
  0 siblings, 0 replies; 12+ messages in thread
From: Robin Gong @ 2019-09-24  3:14 UTC (permalink / raw)
  To: Philipp Puschmann, linux-kernel
  Cc: jlu, Andy Duan, l.stach, dan.j.williams, vkoul, shawnguo,
	s.hauer, kernel, festevam, dl-linux-imx, dmaengine,
	linux-arm-kernel

On 2019-9-23 21:58 Philipp Puschmann <philipp.puschmann@emlix.com> wrote:
> For some years and since many kernel versions there are reports that the RX
> UART SDMA channel stops working at some point. The workaround was to
> disable DMA for RX. This commit fixes the problem itself. Cyclic DMA transfers
> are used by uart and other drivers and these can fail in at least two cases
> where we can run out of descriptors available to the
> engine:
> - Interrupts are disabled for too long and all buffers are filled with
>   data, especially in a setup where many small dma transfers are being
>   executed only using a tiny part of a single buffer
> - DMA errors (such as generated by baud rate mismatch with imx-uart)
>   use up all descriptors before we can react.
> 
> In this case, SDMA stops the channel and no further transfers are done until
> the respective channel is disabled and re-enabled. We can check if the
> channel has been stopped and re-enable it then. To distinguish from the the
> case that the channel was stopped by upper-level driver we introduce new
> flag IMX_DMA_ACTIVE.
> 
> As sdmac->desc is constant we can move desc out of the loop.
> 
> Fixes: 1ec1e82f2510 ("dmaengine: Add Freescale i.MX SDMA support")
> Signed-off-by: Philipp Puschmann <philipp.puschmann@emlix.com>
> Signed-off-by: Jan Luebbe <jlu@pengutronix.de>
> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
> ---
> 
> Changelog v5:
>  - join with patch version from Jan Luebbe
>  - adapt comments and patch descriptions
> 
> Changelog v4:
>  - fixed the fixes tag
> 
> Changelog v3:
>  - use correct dma_wmb() instead of dma_wb()
>  - add fixes tag
> 
> Changelog v2:
>  - clarify comment and commit description
> 
>  drivers/dma/imx-sdma.c | 26 ++++++++++++++++++++++----
>  1 file changed, 22 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c index
> b42281604e54..0b1d6a62423d 100644
> --- a/drivers/dma/imx-sdma.c
> +++ b/drivers/dma/imx-sdma.c
> @@ -383,6 +383,7 @@ struct sdma_channel {  };
> 
>  #define IMX_DMA_SG_LOOP		BIT(0)
> +#define IMX_DMA_ACTIVE		BIT(1)
> 
>  #define MAX_DMA_CHANNELS 32
>  #define MXC_SDMA_DEFAULT_PRIORITY 1
> @@ -658,6 +659,9 @@ static int sdma_config_ownership(struct
> sdma_channel *sdmac,
> 
>  static void sdma_enable_channel(struct sdma_engine *sdma, int channel)  {
> +	struct sdma_channel *sdmac = &sdma->channel[channel];
> +
> +	sdmac->flags |= IMX_DMA_ACTIVE;
Add spin_lock_irq protect this flags.
>  	writel(BIT(channel), sdma->regs + SDMA_H_START);  }
> 
> @@ -774,16 +778,17 @@ static void sdma_start_desc(struct sdma_channel
> *sdmac)
> 
>  static void sdma_update_channel_loop(struct sdma_channel *sdmac)  {
> +	struct sdma_engine *sdma = sdmac->sdma;
>  	struct sdma_buffer_descriptor *bd;
> +	struct sdma_desc *desc = sdmac->desc;
>  	int error = 0;
> -	enum dma_status	old_status = sdmac->status;
> +	enum dma_status old_status = sdmac->status;
> 
>  	/*
>  	 * loop mode. Iterate over descriptors, re-setup them and
>  	 * call callback function.
>  	 */
> -	while (sdmac->desc) {
> -		struct sdma_desc *desc = sdmac->desc;
> +	while (desc) {
> 
>  		bd = &desc->bd[desc->buf_tail];
> 
> @@ -822,6 +827,18 @@ static void sdma_update_channel_loop(struct
> sdma_channel *sdmac)
>  		if (error)
>  			sdmac->status = old_status;
>  	}
> +
> +	/* In some situations it may happen that the sdma does not find any
> +	 * usable descriptor in the ring to put data into. The channel is
> +	 * stopped then and after having freed some buffers we have to restart
> +	 * it manually.
> +	 */
> +	if ((sdmac->flags & IMX_DMA_ACTIVE) &&
> +	    !(readl_relaxed(sdma->regs + SDMA_H_STATSTOP) &
> BIT(sdmac->channel))) {
Seems duplicate checking here, IMX_DMA_ACTIVE is enough.
> +		dev_err_ratelimited(sdma->dev, "SDMA channel %d: cyclic transfer
> disabled by HW, reenabling\n",
Would you change the print log to below:
"cyclic bds consumed all,reenableing".?
> +				    sdmac->channel);
> +			writel(BIT(sdmac->channel), sdma->regs + SDMA_H_START);
> +	};
>  }
> 
>  static void mxc_sdma_handle_channel_normal(struct sdma_channel *data)
> @@ -1051,7 +1068,8 @@ static int sdma_disable_channel(struct dma_chan
> *chan)
>  	struct sdma_engine *sdma = sdmac->sdma;
>  	int channel = sdmac->channel;
> 
> -	writel_relaxed(BIT(channel), sdma->regs + SDMA_H_STATSTOP);
> +	sdmac->flags &= ~IMX_DMA_ACTIVE;
> +	writel(BIT(channel), sdma->regs + SDMA_H_STATSTOP);
>  	sdmac->status = DMA_ERROR;
> 
>  	return 0;
> --
> 2.23.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 1/3] dmaengine: imx-sdma: fix buffer ownership
  2019-09-23 13:58 ` [PATCH v5 1/3] dmaengine: imx-sdma: fix buffer ownership Philipp Puschmann
@ 2019-12-03  9:47   ` Lucas Stach
  2019-12-04  9:19   ` Robin Gong
  1 sibling, 0 replies; 12+ messages in thread
From: Lucas Stach @ 2019-12-03  9:47 UTC (permalink / raw)
  To: Philipp Puschmann, linux-kernel
  Cc: jlu, yibin.gong, fugang.duan, dan.j.williams, vkoul, shawnguo,
	s.hauer, kernel, festevam, linux-imx, dmaengine,
	linux-arm-kernel

On Mo, 2019-09-23 at 15:58 +0200, Philipp Puschmann wrote:
> BD_DONE flag marks ownership of the buffer. When 1 SDMA owns the
> buffer, when 0 ARM owns it. When processing the buffers in
> sdma_update_channel_loop the ownership of the currently processed
> buffer was set to SDMA again before running the callback function of
> the buffer and while the sdma script may be running in parallel. So
> there was the possibility to get the buffer overwritten by SDMA
> before
> it has been processed by kernel leading to kind of random errors in
> the
> upper layers, e.g. bluetooth.
> 
> Fixes: 1ec1e82f2510 ("dmaengine: Add Freescale i.MX SDMA support")
> Signed-off-by: Philipp Puschmann <philipp.puschmann@emlix.com>

Reviewed-by: Lucas Stach <l.stach@pengutronix.de>

> ---
> 
> Changelog v5:
>  - no changes
> 
> Changelog v4:
>  - fixed the fixes tag
>  
> Changelog v3:
>  - use correct dma_wmb() instead of dma_wb()
>  - add fixes tag
> 
> Changelog v2:
>  - add dma_wb()
>  
>  drivers/dma/imx-sdma.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c
> index 9ba74ab7e912..b42281604e54 100644
> --- a/drivers/dma/imx-sdma.c
> +++ b/drivers/dma/imx-sdma.c
> @@ -802,7 +802,6 @@ static void sdma_update_channel_loop(struct
> sdma_channel *sdmac)
>  		*/
>  
>  		desc->chn_real_count = bd->mode.count;
> -		bd->mode.status |= BD_DONE;
>  		bd->mode.count = desc->period_len;
>  		desc->buf_ptail = desc->buf_tail;
>  		desc->buf_tail = (desc->buf_tail + 1) % desc->num_bd;
> @@ -817,6 +816,9 @@ static void sdma_update_channel_loop(struct
> sdma_channel *sdmac)
>  		dmaengine_desc_get_callback_invoke(&desc->vd.tx, NULL);
>  		spin_lock(&sdmac->vc.lock);
>  
> +		dma_wmb();
> +		bd->mode.status |= BD_DONE;
> +
>  		if (error)
>  			sdmac->status = old_status;
>  	}


^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH v5 1/3] dmaengine: imx-sdma: fix buffer ownership
  2019-09-23 13:58 ` [PATCH v5 1/3] dmaengine: imx-sdma: fix buffer ownership Philipp Puschmann
  2019-12-03  9:47   ` Lucas Stach
@ 2019-12-04  9:19   ` Robin Gong
  2019-12-10  9:44     ` Philipp Puschmann
  1 sibling, 1 reply; 12+ messages in thread
From: Robin Gong @ 2019-12-04  9:19 UTC (permalink / raw)
  To: Philipp Puschmann, linux-kernel
  Cc: jlu, Andy Duan, l.stach, dan.j.williams, vkoul, shawnguo,
	s.hauer, kernel, festevam, dl-linux-imx, dmaengine,
	linux-arm-kernel

On 2019-9-23 Philipp Puschmann <philipp.puschmann@emlix.com> wrote:
> BD_DONE flag marks ownership of the buffer. When 1 SDMA owns the buffer,
> when 0 ARM owns it. When processing the buffers in
> sdma_update_channel_loop the ownership of the currently processed buffer
> was set to SDMA again before running the callback function of the buffer and
> while the sdma script may be running in parallel. So there was the possibility to
> get the buffer overwritten by SDMA before it has been processed by kernel
Does this patch need indeed? I don't think any difference here move done flag
before callback or after callback, because callback never care this flag and actually
done flag is setup for next time rather than this time. Basically, this flag should be
set to 1 quickly asap so that sdma could use this bd asap. If delay the flag may cause
sdma channel stop since all BDs consumed. Could you try again your case without
this patch?
> leading to kind of random errors in the upper layers, e.g. bluetooth.
> 
> Fixes: 1ec1e82f2510 ("dmaengine: Add Freescale i.MX SDMA support")
> Signed-off-by: Philipp Puschmann <philipp.puschmann@emlix.com>
> ---
> 
> Changelog v5:
>  - no changes
> 
> Changelog v4:
>  - fixed the fixes tag
> 
> Changelog v3:
>  - use correct dma_wmb() instead of dma_wb()
>  - add fixes tag
> 
> Changelog v2:
>  - add dma_wb()
> 
>  drivers/dma/imx-sdma.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c index
> 9ba74ab7e912..b42281604e54 100644
> --- a/drivers/dma/imx-sdma.c
> +++ b/drivers/dma/imx-sdma.c
> @@ -802,7 +802,6 @@ static void sdma_update_channel_loop(struct
> sdma_channel *sdmac)
>  		*/
> 
>  		desc->chn_real_count = bd->mode.count;
> -		bd->mode.status |= BD_DONE;
>  		bd->mode.count = desc->period_len;
>  		desc->buf_ptail = desc->buf_tail;
>  		desc->buf_tail = (desc->buf_tail + 1) % desc->num_bd; @@ -817,6
> +816,9 @@ static void sdma_update_channel_loop(struct sdma_channel
> *sdmac)
>  		dmaengine_desc_get_callback_invoke(&desc->vd.tx, NULL);
>  		spin_lock(&sdmac->vc.lock);
> 
> +		dma_wmb();
> +		bd->mode.status |= BD_DONE;
> +
>  		if (error)
>  			sdmac->status = old_status;
>  	}
> --
> 2.23.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 1/3] dmaengine: imx-sdma: fix buffer ownership
  2019-12-04  9:19   ` Robin Gong
@ 2019-12-10  9:44     ` Philipp Puschmann
  2019-12-10 13:01       ` Robin Gong
  0 siblings, 1 reply; 12+ messages in thread
From: Philipp Puschmann @ 2019-12-10  9:44 UTC (permalink / raw)
  To: Robin Gong, linux-kernel
  Cc: jlu, Andy Duan, l.stach, dan.j.williams, vkoul, shawnguo,
	s.hauer, kernel, festevam, dl-linux-imx, dmaengine,
	linux-arm-kernel



Am 04.12.19 um 10:19 schrieb Robin Gong:
> On 2019-9-23 Philipp Puschmann <philipp.puschmann@emlix.com> wrote:
>> BD_DONE flag marks ownership of the buffer. When 1 SDMA owns the buffer,
>> when 0 ARM owns it. When processing the buffers in
>> sdma_update_channel_loop the ownership of the currently processed buffer
>> was set to SDMA again before running the callback function of the buffer and
>> while the sdma script may be running in parallel. So there was the possibility to
>> get the buffer overwritten by SDMA before it has been processed by kernel
> Does this patch need indeed? I don't think any difference here move done flag
> before callback or after callback, because callback never care this flag and actually
> done flag is setup for next time rather than this time.
The callback doesn't care, but the DMA controller cares about this flag. I see a possible race
condition here. If i set the DONE flag for a specific buffer descriptor before handling the
data belonging to this buffer descriptor (aka running the callback function) the DMA script running
at the same time could corrupt that data while being processed.
Or is there are mechanism that prevents this case, that i havn't considered here.

> Basically, this flag should be
> set to 1 quickly asap so that sdma could use this bd asap. If delay the flag may cause
> sdma channel stop since all BDs consumed.

> Could you try again your case without this patch?
I don't have the hw to reproduce this available at the moment but as i remember i did run it without
this patch successfully already. The problem i have described above was more a logical or theoretical
one than a problem that really occured with my setup.

>> leading to kind of random errors in the upper layers, e.g. bluetooth.
>>
>> Fixes: 1ec1e82f2510 ("dmaengine: Add Freescale i.MX SDMA support")
>> Signed-off-by: Philipp Puschmann <philipp.puschmann@emlix.com>
>> ---
>>
>> Changelog v5:
>>  - no changes
>>
>> Changelog v4:
>>  - fixed the fixes tag
>>
>> Changelog v3:
>>  - use correct dma_wmb() instead of dma_wb()
>>  - add fixes tag
>>
>> Changelog v2:
>>  - add dma_wb()
>>
>>  drivers/dma/imx-sdma.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c index
>> 9ba74ab7e912..b42281604e54 100644
>> --- a/drivers/dma/imx-sdma.c
>> +++ b/drivers/dma/imx-sdma.c
>> @@ -802,7 +802,6 @@ static void sdma_update_channel_loop(struct
>> sdma_channel *sdmac)
>>  		*/
>>
>>  		desc->chn_real_count = bd->mode.count;
>> -		bd->mode.status |= BD_DONE;
>>  		bd->mode.count = desc->period_len;
>>  		desc->buf_ptail = desc->buf_tail;
>>  		desc->buf_tail = (desc->buf_tail + 1) % desc->num_bd; @@ -817,6
>> +816,9 @@ static void sdma_update_channel_loop(struct sdma_channel
>> *sdmac)
>>  		dmaengine_desc_get_callback_invoke(&desc->vd.tx, NULL);
>>  		spin_lock(&sdmac->vc.lock);
>>
>> +		dma_wmb();
>> +		bd->mode.status |= BD_DONE;
>> +
>>  		if (error)
>>  			sdmac->status = old_status;
>>  	}
>> --
>> 2.23.0
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH v5 1/3] dmaengine: imx-sdma: fix buffer ownership
  2019-12-10  9:44     ` Philipp Puschmann
@ 2019-12-10 13:01       ` Robin Gong
  0 siblings, 0 replies; 12+ messages in thread
From: Robin Gong @ 2019-12-10 13:01 UTC (permalink / raw)
  To: Philipp Puschmann, linux-kernel
  Cc: jlu, Andy Duan, l.stach, dan.j.williams, vkoul, shawnguo,
	s.hauer, kernel, festevam, dl-linux-imx, dmaengine,
	linux-arm-kernel

On 2019/12/10 17:45 Philipp Puschmann <philipp.puschmann@emlix.com> wrote:
> Am 04.12.19 um 10:19 schrieb Robin Gong:
> > On 2019-9-23 Philipp Puschmann <philipp.puschmann@emlix.com> wrote:
> >> BD_DONE flag marks ownership of the buffer. When 1 SDMA owns the
> >> buffer, when 0 ARM owns it. When processing the buffers in
> >> sdma_update_channel_loop the ownership of the currently processed
> >> buffer was set to SDMA again before running the callback function of
> >> the buffer and while the sdma script may be running in parallel. So
> >> there was the possibility to get the buffer overwritten by SDMA
> >> before it has been processed by kernel
> > Does this patch need indeed? I don't think any difference here move
> > done flag before callback or after callback, because callback never
> > care this flag and actually done flag is setup for next time rather than this
> time.
> The callback doesn't care, but the DMA controller cares about this flag. I see a
> possible race condition here. If i set the DONE flag for a specific buffer
> descriptor before handling the data belonging to this buffer descriptor (aka
> running the callback function) the DMA script running at the same time could
> corrupt that data while being processed.
> Or is there are mechanism that prevents this case, that i havn't considered
> here.
In theory that may happen, but in real world that's not the case:
1. SDMA is running much slower than CPU, for example, on i.MX6Q SDMA is running
at 66MHz while CPU is running at 1GHz. Besides, SDMA transfer data depends on fifo
data output frequency, such as UART 4Mhz. So your case may not be caught unless
time-consuming flow involved in callback which is not right.
2. There are multi descriptors(BDs) setup for cyclic mode, so that SDMA controller and CPU could handle data in parallel without interactions by using BD_DONE. Client driver should choose proper BD number and transfer size of BD to make sure cyclic transfer running smoothly without stop. In your case, all BDs consumed by SDMA during the narrow timing window which is between BD_DONE set and callback done at CPU side(all in interrupt handler). That never happen unless very small BD size set wrongly, such as only 32 byte or 64 byte for one BD, but generally BD size is in KB unit.
> 
> > Basically, this flag should be
> > set to 1 quickly asap so that sdma could use this bd asap. If delay
> > the flag may cause sdma channel stop since all BDs consumed.
> 
> > Could you try again your case without this patch?
> I don't have the hw to reproduce this available at the moment but as i
> remember i did run it without this patch successfully already. The problem i
> have described above was more a logical or theoretical one than a problem
> that really occured with my setup.
> 
> >> leading to kind of random errors in the upper layers, e.g. bluetooth.
> >>
> >> Fixes: 1ec1e82f2510 ("dmaengine: Add Freescale i.MX SDMA support")
> >> Signed-off-by: Philipp Puschmann <philipp.puschmann@emlix.com>
> >> ---
> >>
> >> Changelog v5:
> >>  - no changes
> >>
> >> Changelog v4:
> >>  - fixed the fixes tag
> >>
> >> Changelog v3:
> >>  - use correct dma_wmb() instead of dma_wb()
> >>  - add fixes tag
> >>
> >> Changelog v2:
> >>  - add dma_wb()
> >>
> >>  drivers/dma/imx-sdma.c | 4 +++-
> >>  1 file changed, 3 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c index
> >> 9ba74ab7e912..b42281604e54 100644
> >> --- a/drivers/dma/imx-sdma.c
> >> +++ b/drivers/dma/imx-sdma.c
> >> @@ -802,7 +802,6 @@ static void sdma_update_channel_loop(struct
> >> sdma_channel *sdmac)
> >>  		*/
> >>
> >>  		desc->chn_real_count = bd->mode.count;
> >> -		bd->mode.status |= BD_DONE;
> >>  		bd->mode.count = desc->period_len;
> >>  		desc->buf_ptail = desc->buf_tail;
> >>  		desc->buf_tail = (desc->buf_tail + 1) % desc->num_bd; @@
> -817,6
> >> +816,9 @@ static void sdma_update_channel_loop(struct sdma_channel
> >> *sdmac)
> >>  		dmaengine_desc_get_callback_invoke(&desc->vd.tx, NULL);
> >>  		spin_lock(&sdmac->vc.lock);
> >>
> >> +		dma_wmb();
> >> +		bd->mode.status |= BD_DONE;
> >> +
> >>  		if (error)
> >>  			sdmac->status = old_status;
> >>  	}
> >> --
> >> 2.23.0
> >

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-12-10 13:01 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-23 13:58 [PATCH v5 0/3] Fix UART DMA freezes for i.MX SOCs Philipp Puschmann
2019-09-23 13:58 ` [PATCH v5 1/3] dmaengine: imx-sdma: fix buffer ownership Philipp Puschmann
2019-12-03  9:47   ` Lucas Stach
2019-12-04  9:19   ` Robin Gong
2019-12-10  9:44     ` Philipp Puschmann
2019-12-10 13:01       ` Robin Gong
2019-09-23 13:58 ` [PATCH v5 2/3] dmaengine: imx-sdma: fix dma freezes Philipp Puschmann
2019-09-24  3:14   ` Robin Gong
2019-09-23 13:58 ` [PATCH v5 3/3] dmaengine: imx-sdma: drop redundant variable Philipp Puschmann
2019-09-23 14:55 ` [PATCH v5 0/3] Fix UART DMA freezes for i.MX SOCs Adam Ford
2019-09-23 15:06   ` Philipp Puschmann
2019-09-23 15:09     ` Jan Lübbe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).