dmaengine Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v1] dmaengine: imx-sdma: remove BD_INTR for channel0
@ 2019-06-14  8:39 yibin.gong
  2019-06-14 10:49 ` Fabio Estevam
  2019-06-14 13:35 ` Sven Van Asbroeck
  0 siblings, 2 replies; 12+ messages in thread
From: yibin.gong @ 2019-06-14  8:39 UTC (permalink / raw)
  To: shawnguo, s.hauer, festevam, vkoul, dan.j.williams, thesven73
  Cc: linux-imx, linux-arm-kernel, dmaengine, kernel

From: Robin Gong <yibin.gong@nxp.com>

It is possible for an irq triggered by channel0 to be received later
after clks are disabled once firmware loaded during sdma probe. If
that happens then clearing them by writing to SDMA_H_INTR won't work
and the system will hang processing infinite interrupts. Actually,
don't need interrupt triggered on channel0 since it's pollling
SDMA_H_STATSTOP to know channel0 done rather than interrupt in
current code, just clear BD_INTR to disable channel0 interrupt to
avoid the above case.

Signed-off-by: Robin Gong <yibin.gong@nxp.com>
Reported-by: Sven Van Asbroeck <thesven73@gmail.com>
---
 drivers/dma/imx-sdma.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c
index deea9aa..b5a1ee2 100644
--- a/drivers/dma/imx-sdma.c
+++ b/drivers/dma/imx-sdma.c
@@ -742,7 +742,7 @@ static int sdma_load_script(struct sdma_engine *sdma, void *buf, int size,
 	spin_lock_irqsave(&sdma->channel_0_lock, flags);
 
 	bd0->mode.command = C0_SETPM;
-	bd0->mode.status = BD_DONE | BD_INTR | BD_WRAP | BD_EXTD;
+	bd0->mode.status = BD_DONE | BD_WRAP | BD_EXTD;
 	bd0->mode.count = size / 2;
 	bd0->buffer_addr = buf_phys;
 	bd0->ext_buffer_addr = address;
@@ -1064,7 +1064,7 @@ static int sdma_load_context(struct sdma_channel *sdmac)
 	context->gReg[7] = sdmac->watermark_level;
 
 	bd0->mode.command = C0_SETDM;
-	bd0->mode.status = BD_DONE | BD_INTR | BD_WRAP | BD_EXTD;
+	bd0->mode.status = BD_DONE | BD_WRAP | BD_EXTD;
 	bd0->mode.count = sizeof(*context) / 4;
 	bd0->buffer_addr = sdma->context_phys;
 	bd0->ext_buffer_addr = 2048 + (sizeof(*context) / 4) * channel;
-- 
2.7.4


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v1] dmaengine: imx-sdma: remove BD_INTR for channel0
  2019-06-14  8:39 [PATCH v1] dmaengine: imx-sdma: remove BD_INTR for channel0 yibin.gong
@ 2019-06-14 10:49 ` Fabio Estevam
  2019-06-14 13:25   ` Sven Van Asbroeck
  2019-06-14 13:35 ` Sven Van Asbroeck
  1 sibling, 1 reply; 12+ messages in thread
From: Fabio Estevam @ 2019-06-14 10:49 UTC (permalink / raw)
  To: Robin Gong
  Cc: Shawn Guo, Sascha Hauer, Vinod, Dan Williams, Sven Van Asbroeck,
	NXP Linux Team,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	dmaengine, Sascha Hauer

Hi Robin,

On Fri, Jun 14, 2019 at 5:38 AM <yibin.gong@nxp.com> wrote:
>
> From: Robin Gong <yibin.gong@nxp.com>
>
> It is possible for an irq triggered by channel0 to be received later
> after clks are disabled once firmware loaded during sdma probe. If
> that happens then clearing them by writing to SDMA_H_INTR won't work
> and the system will hang processing infinite interrupts. Actually,
> don't need interrupt triggered on channel0 since it's pollling
> SDMA_H_STATSTOP to know channel0 done rather than interrupt in
> current code, just clear BD_INTR to disable channel0 interrupt to
> avoid the above case.
>
> Signed-off-by: Robin Gong <yibin.gong@nxp.com>
> Reported-by: Sven Van Asbroeck <thesven73@gmail.com>

According to the original report from Sven the issue started to happen
on 5.0, so it would be good to add a Fixes tag and Cc stable so that
this fix could be backported to 5.0/5.1 stable trees.

Thanks

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v1] dmaengine: imx-sdma: remove BD_INTR for channel0
  2019-06-14 10:49 ` Fabio Estevam
@ 2019-06-14 13:25   ` Sven Van Asbroeck
  2019-06-14 18:09     ` Michael Olbrich
  2019-06-17  2:02     ` Robin Gong
  0 siblings, 2 replies; 12+ messages in thread
From: Sven Van Asbroeck @ 2019-06-14 13:25 UTC (permalink / raw)
  To: Fabio Estevam
  Cc: Robin Gong, Shawn Guo, Sascha Hauer, Vinod, Dan Williams,
	NXP Linux Team,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	dmaengine, Sascha Hauer, Michael Olbrich

On Fri, Jun 14, 2019 at 6:49 AM Fabio Estevam <festevam@gmail.com> wrote:
>
> According to the original report from Sven the issue started to happen
> on 5.0, so it would be good to add a Fixes tag and Cc stable so that
> this fix could be backported to 5.0/5.1 stable trees.

Good catch !

However, the issue is highly timing-dependent. It will come and go depending
on the kernel version, devicetree and defconfig. If it works for me on
4.19, that
doesn't mean the bug is gone on 4.19.

Looking at the commit history, I think the commit below possibly introduced the
issue. Until this commit, sdma_run_channel() would wait on the interrupt
before proceeding. It has been there since 4.8:

Fixes: 1d069bfa3c78 ("dmaengine: imx-sdma: ack channel 0 IRQ in the
interrupt handler")

But my knowledge of imx-sdma is non-existent, so I invite the more knowledgeable
people in this thread to take a look at this commit.

[Adding Michael Olbrich to the thread]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v1] dmaengine: imx-sdma: remove BD_INTR for channel0
  2019-06-14  8:39 [PATCH v1] dmaengine: imx-sdma: remove BD_INTR for channel0 yibin.gong
  2019-06-14 10:49 ` Fabio Estevam
@ 2019-06-14 13:35 ` Sven Van Asbroeck
  2019-06-17  1:42   ` Robin Gong
  1 sibling, 1 reply; 12+ messages in thread
From: Sven Van Asbroeck @ 2019-06-14 13:35 UTC (permalink / raw)
  To: Robin Gong
  Cc: Shawn Guo, Sascha Hauer, Fabio Estevam, Vinod, Dan Williams,
	NXP Linux Team,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	dmaengine, Sascha Hauer, Michael Olbrich

Hi Robin, see comments inline.

On Fri, Jun 14, 2019 at 4:38 AM <yibin.gong@nxp.com> wrote:
>
> diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c
> index deea9aa..b5a1ee2 100644
> --- a/drivers/dma/imx-sdma.c
> +++ b/drivers/dma/imx-sdma.c
> @@ -742,7 +742,7 @@ static int sdma_load_script(struct sdma_engine *sdma, void *buf, int size,
>         spin_lock_irqsave(&sdma->channel_0_lock, flags);
>
>         bd0->mode.command = C0_SETPM;
> -       bd0->mode.status = BD_DONE | BD_INTR | BD_WRAP | BD_EXTD;
> +       bd0->mode.status = BD_DONE | BD_WRAP | BD_EXTD;

I tested this change on its own, and it seemed sufficient to make the crash
disappear.

>         bd0->mode.count = size / 2;
>         bd0->buffer_addr = buf_phys;
>         bd0->ext_buffer_addr = address;
> @@ -1064,7 +1064,7 @@ static int sdma_load_context(struct sdma_channel *sdmac)
>         context->gReg[7] = sdmac->watermark_level;
>
>         bd0->mode.command = C0_SETDM;
> -       bd0->mode.status = BD_DONE | BD_INTR | BD_WRAP | BD_EXTD;
> +       bd0->mode.status = BD_DONE | BD_WRAP | BD_EXTD;

This function isn't part of the firmware load path, so how can it be related
to fixing the firmware crash?

If this is an unrelated efficiency saving, maybe it should go into its
own patch?
Maybe we want bugfix patches to be as small and specific as possible, so they
can more easily be backported to older kernels?

>         bd0->mode.count = sizeof(*context) / 4;
>         bd0->buffer_addr = sdma->context_phys;
>         bd0->ext_buffer_addr = 2048 + (sizeof(*context) / 4) * channel;
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v1] dmaengine: imx-sdma: remove BD_INTR for channel0
  2019-06-14 13:25   ` Sven Van Asbroeck
@ 2019-06-14 18:09     ` Michael Olbrich
  2019-06-17  2:14       ` Robin Gong
  2019-06-17  2:02     ` Robin Gong
  1 sibling, 1 reply; 12+ messages in thread
From: Michael Olbrich @ 2019-06-14 18:09 UTC (permalink / raw)
  To: Sven Van Asbroeck
  Cc: Fabio Estevam, Robin Gong, Shawn Guo, Sascha Hauer, Vinod,
	Dan Williams, NXP Linux Team,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	dmaengine, Sascha Hauer

On Fri, Jun 14, 2019 at 09:25:51AM -0400, Sven Van Asbroeck wrote:
> On Fri, Jun 14, 2019 at 6:49 AM Fabio Estevam <festevam@gmail.com> wrote:
> >
> > According to the original report from Sven the issue started to happen
> > on 5.0, so it would be good to add a Fixes tag and Cc stable so that
> > this fix could be backported to 5.0/5.1 stable trees.
> 
> Good catch !
> 
> However, the issue is highly timing-dependent. It will come and go depending
> on the kernel version, devicetree and defconfig. If it works for me on
> 4.19, that
> doesn't mean the bug is gone on 4.19.
> 
> Looking at the commit history, I think the commit below possibly introduced the
> issue. Until this commit, sdma_run_channel() would wait on the interrupt
> before proceeding. It has been there since 4.8:
> 
> Fixes: 1d069bfa3c78 ("dmaengine: imx-sdma: ack channel 0 IRQ in the
> interrupt handler")

I think this is correct. Starting with this commit, the interrupt status fr
channel 0 is no longer cleared in sdma_run_channel0() and
sdma_int_handler() is always called for channel 0.
During firmware loading the interrupts are enabled again just before the
clocks are disabled. The interrupt is pending at this moment so on a single
core system I think this will always work as expected. If the firmware
loading and the interrupt handler run on different cores then this is racy.
Maybe something else changed to make this more likely?

With this new change sdma_int_handler() is no longer called for channel 0
right, so you should also remove the special handling there.

Michael

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v1] dmaengine: imx-sdma: remove BD_INTR for channel0
  2019-06-14 13:35 ` Sven Van Asbroeck
@ 2019-06-17  1:42   ` Robin Gong
  0 siblings, 0 replies; 12+ messages in thread
From: Robin Gong @ 2019-06-17  1:42 UTC (permalink / raw)
  To: thesven73
  Cc: dl-linux-imx, festevam, dan.j.williams, dmaengine, m.olbrich,
	shawnguo, linux-arm-kernel, vkoul, kernel, s.hauer

On 2019-06-14 at 13:35 +0000, Sven Van Asbroeck wrote:
> Hi Robin, see comments inline.
> 
> On Fri, Jun 14, 2019 at 4:38 AM <yibin.gong@nxp.com> wrote:
> > 
> > 
> > diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c
> > index deea9aa..b5a1ee2 100644
> > --- a/drivers/dma/imx-sdma.c
> > +++ b/drivers/dma/imx-sdma.c
> > @@ -742,7 +742,7 @@ static int sdma_load_script(struct sdma_engine
> > *sdma, void *buf, int size,
> >         spin_lock_irqsave(&sdma->channel_0_lock, flags);
> > 
> >         bd0->mode.command = C0_SETPM;
> > -       bd0->mode.status = BD_DONE | BD_INTR | BD_WRAP | BD_EXTD;
> > +       bd0->mode.status = BD_DONE | BD_WRAP | BD_EXTD;
> I tested this change on its own, and it seemed sufficient to make the
> crash
> disappear.
> 
> > 
> >         bd0->mode.count = size / 2;
> >         bd0->buffer_addr = buf_phys;
> >         bd0->ext_buffer_addr = address;
> > @@ -1064,7 +1064,7 @@ static int sdma_load_context(struct
> > sdma_channel *sdmac)
> >         context->gReg[7] = sdmac->watermark_level;
> > 
> >         bd0->mode.command = C0_SETDM;
> > -       bd0->mode.status = BD_DONE | BD_INTR | BD_WRAP | BD_EXTD;
> > +       bd0->mode.status = BD_DONE | BD_WRAP | BD_EXTD;
> This function isn't part of the firmware load path, so how can it be
> related
> to fixing the firmware crash?
Yes, that's not in your firmware load case, but I want to keep the same
behavior for all channel0 conditions, don't worry, harmless here. 
> 
> If this is an unrelated efficiency saving, maybe it should go into
> its
> own patch?
> Maybe we want bugfix patches to be as small and specific as possible,
> so they
> can more easily be backported to older kernels?
> 
> > 
> >         bd0->mode.count = sizeof(*context) / 4;
> >         bd0->buffer_addr = sdma->context_phys;
> >         bd0->ext_buffer_addr = 2048 + (sizeof(*context) / 4) *
> > channel;
> > --
> > 2.7.4
> > 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v1] dmaengine: imx-sdma: remove BD_INTR for channel0
  2019-06-14 13:25   ` Sven Van Asbroeck
  2019-06-14 18:09     ` Michael Olbrich
@ 2019-06-17  2:02     ` Robin Gong
  2019-06-17 13:27       ` Sven Van Asbroeck
  1 sibling, 1 reply; 12+ messages in thread
From: Robin Gong @ 2019-06-17  2:02 UTC (permalink / raw)
  To: festevam, thesven73
  Cc: dl-linux-imx, dan.j.williams, dmaengine, m.olbrich, shawnguo,
	linux-arm-kernel, vkoul, kernel, s.hauer

On 2019-06-14 at 13:25 +0000, Sven Van Asbroeck wrote:
> On Fri, Jun 14, 2019 at 6:49 AM Fabio Estevam <festevam@gmail.com>
> wrote:
> > 
> > 
> > According to the original report from Sven the issue started to
> > happen
> > on 5.0, so it would be good to add a Fixes tag and Cc stable so
> > that
> > this fix could be backported to 5.0/5.1 stable trees.
> Good catch !
> 
> However, the issue is highly timing-dependent. It will come and go
> depending
> on the kernel version, devicetree and defconfig. If it works for me
> on
> 4.19, that
> doesn't mean the bug is gone on 4.19.
The default imx defconfig and dts should be ok, because firmware load
is delayed after rootfs mounted where firmware located in and before
that, some driver which use sdma such as spi/uart/audio may have
already enable sdma clock which means channel0 interrupt could be
cleared immediately without interrupt storm. That's why I can't
reproduce your issue at first, but catch it once I sync with your
directly firmware load defconfig. So seems not very must to CC stable
tree?
> 
> Looking at the commit history, I think the commit below possibly
> introduced the
> issue. Until this commit, sdma_run_channel() would wait on the
> interrupt
> before proceeding. It has been there since 4.8:
> 
> Fixes: 1d069bfa3c78 ("dmaengine: imx-sdma: ack channel 0 IRQ in the
> interrupt handler")
Yes, but Michael's patch is the right direction, at least it fix RT
case and meaningless channel0 interrupt storm coming before clearing
channel0 interrupt status in sdma_run_channel0(). Now, this patch could
fix its minor side-effect.
> 
> But my knowledge of imx-sdma is non-existent, so I invite the more
> knowledgeable
> people in this thread to take a look at this commit.
> 
> [Adding Michael Olbrich to the thread]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v1] dmaengine: imx-sdma: remove BD_INTR for channel0
  2019-06-14 18:09     ` Michael Olbrich
@ 2019-06-17  2:14       ` Robin Gong
  2019-06-17 10:15         ` m.olbrich
  0 siblings, 1 reply; 12+ messages in thread
From: Robin Gong @ 2019-06-17  2:14 UTC (permalink / raw)
  To: m.olbrich, thesven73
  Cc: dl-linux-imx, festevam, dan.j.williams, dmaengine, shawnguo,
	linux-arm-kernel, vkoul, kernel, s.hauer

On 2019-06-14 at 18:09 +0000, Michael Olbrich wrote:
> On Fri, Jun 14, 2019 at 09:25:51AM -0400, Sven Van Asbroeck wrote:
> > 
> > On Fri, Jun 14, 2019 at 6:49 AM Fabio Estevam <festevam@gmail.com>
> > wrote:
> > > 
> > > 
> > > According to the original report from Sven the issue started to
> > > happen
> > > on 5.0, so it would be good to add a Fixes tag and Cc stable so
> > > that
> > > this fix could be backported to 5.0/5.1 stable trees.
> > Good catch !
> > 
> > However, the issue is highly timing-dependent. It will come and go
> > depending
> > on the kernel version, devicetree and defconfig. If it works for me
> > on
> > 4.19, that
> > doesn't mean the bug is gone on 4.19.
> > 
> > Looking at the commit history, I think the commit below possibly
> > introduced the
> > issue. Until this commit, sdma_run_channel() would wait on the
> > interrupt
> > before proceeding. It has been there since 4.8:
> > 
> > Fixes: 1d069bfa3c78 ("dmaengine: imx-sdma: ack channel 0 IRQ in the
> > interrupt handler")
> I think this is correct. Starting with this commit, the interrupt
> status fr
> channel 0 is no longer cleared in sdma_run_channel0() and
> sdma_int_handler() is always called for channel 0.
> During firmware loading the interrupts are enabled again just before
> the
> clocks are disabled. The interrupt is pending at this moment so on a
> single
> core system I think this will always work as expected. If the
> firmware
> loading and the interrupt handler run on different cores then this is
> racy.
> Maybe something else changed to make this more likely?
> 
> With this new change sdma_int_handler() is no longer called for
> channel 0
> right, so you should also remove the special handling there.
What's 'special handling' should be removed here? Do you mean put below
pieces of your patch back?
 static int sdma_load_script(struct sdma_engine *sdma, void *buf, int
size,
@@ -727,9 +720,9 @@ static irqreturn_t sdma_int_handler(int irq, void
*dev_id)
        unsigned long stat;
 
        stat = readl_relaxed(sdma->regs + SDMA_H_INTR);
-       /* not interested in channel 0 interrupts */
-       stat &= ~1;
        writel_relaxed(stat, sdma->regs + SDMA_H_INTR);
+       /* channel 0 is special and not handled here, see
run_channel0() */
+       stat &= ~1;

> 
> Michael
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v1] dmaengine: imx-sdma: remove BD_INTR for channel0
  2019-06-17  2:14       ` Robin Gong
@ 2019-06-17 10:15         ` m.olbrich
  2019-06-18  6:08           ` Robin Gong
  0 siblings, 1 reply; 12+ messages in thread
From: m.olbrich @ 2019-06-17 10:15 UTC (permalink / raw)
  To: Robin Gong
  Cc: thesven73, dl-linux-imx, festevam, dan.j.williams, dmaengine,
	shawnguo, linux-arm-kernel, vkoul, kernel, s.hauer

On Mon, Jun 17, 2019 at 02:14:34AM +0000, Robin Gong wrote:
> On 2019-06-14 at 18:09 +0000, Michael Olbrich wrote:
> > On Fri, Jun 14, 2019 at 09:25:51AM -0400, Sven Van Asbroeck wrote:
> > > 
> > > On Fri, Jun 14, 2019 at 6:49 AM Fabio Estevam <festevam@gmail.com>
> > > wrote:
> > > > 
> > > > 
> > > > According to the original report from Sven the issue started to
> > > > happen
> > > > on 5.0, so it would be good to add a Fixes tag and Cc stable so
> > > > that
> > > > this fix could be backported to 5.0/5.1 stable trees.
> > > Good catch !
> > > 
> > > However, the issue is highly timing-dependent. It will come and go
> > > depending
> > > on the kernel version, devicetree and defconfig. If it works for me
> > > on
> > > 4.19, that
> > > doesn't mean the bug is gone on 4.19.
> > > 
> > > Looking at the commit history, I think the commit below possibly
> > > introduced the
> > > issue. Until this commit, sdma_run_channel() would wait on the
> > > interrupt
> > > before proceeding. It has been there since 4.8:
> > > 
> > > Fixes: 1d069bfa3c78 ("dmaengine: imx-sdma: ack channel 0 IRQ in the
> > > interrupt handler")
> > I think this is correct. Starting with this commit, the interrupt
> > status fr
> > channel 0 is no longer cleared in sdma_run_channel0() and
> > sdma_int_handler() is always called for channel 0.
> > During firmware loading the interrupts are enabled again just before
> > the
> > clocks are disabled. The interrupt is pending at this moment so on a
> > single
> > core system I think this will always work as expected. If the
> > firmware
> > loading and the interrupt handler run on different cores then this is
> > racy.
> > Maybe something else changed to make this more likely?
> > 
> > With this new change sdma_int_handler() is no longer called for
> > channel 0
> > right, so you should also remove the special handling there.
> What's 'special handling' should be removed here? Do you mean put below
> pieces of your patch back?
>  static int sdma_load_script(struct sdma_engine *sdma, void *buf, int
> size,
> @@ -727,9 +720,9 @@ static irqreturn_t sdma_int_handler(int irq, void
> *dev_id)
>         unsigned long stat;
>  
>         stat = readl_relaxed(sdma->regs + SDMA_H_INTR);
> -       /* not interested in channel 0 interrupts */
> -       stat &= ~1;
>         writel_relaxed(stat, sdma->regs + SDMA_H_INTR);
> +       /* channel 0 is special and not handled here, see
> run_channel0() */
> +       stat &= ~1;

I think the "stat &= ~1;" can be removed completely. This bit should never
be set, now that the interrupt for channel 0 is disabled.

Michael

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v1] dmaengine: imx-sdma: remove BD_INTR for channel0
  2019-06-17  2:02     ` Robin Gong
@ 2019-06-17 13:27       ` Sven Van Asbroeck
  2019-06-18  5:50         ` Robin Gong
  0 siblings, 1 reply; 12+ messages in thread
From: Sven Van Asbroeck @ 2019-06-17 13:27 UTC (permalink / raw)
  To: Robin Gong
  Cc: festevam, dl-linux-imx, dan.j.williams, dmaengine, m.olbrich,
	shawnguo, linux-arm-kernel, vkoul, kernel, s.hauer

Hello Robin,

On Sun, Jun 16, 2019 at 10:02 PM Robin Gong <yibin.gong@nxp.com> wrote:
>
> The default imx defconfig and dts should be ok, because firmware load
> is delayed after rootfs mounted where firmware located in and before
> that, some driver which use sdma such as spi/uart/audio may have
> already enable sdma clock which means channel0 interrupt could be
> cleared immediately without interrupt storm. That's why I can't
> reproduce your issue at first, but catch it once I sync with your
> directly firmware load defconfig. So seems not very must to CC stable
> tree?

As far as I know, the bug/crash does not happen if you're loading the
sdma firmware from a filesystem. So the vast majority of users would
never see the crash.

I agree that this is not a high-priority bugfix. But it's worthwhile for the
stable trees to have it.

> Yes, but Michael's patch is the right direction, at least it fix RT
> case and meaningless channel0 interrupt storm coming before clearing
> channel0 interrupt status in sdma_run_channel0(). Now, this patch could
> fix its minor side-effect.

I'm not suggesting that we should revert or change Michael's patch. Just
that it would be good for the v2 patch to contain:

Fixes: 1d069bfa3c78 ("dmaengine: imx-sdma: ack channel 0 IRQ in the
interrupt handler")

This should allow stable maintainers to pull in your patch if and only if
their release already contains 1d069bfa3c78.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v1] dmaengine: imx-sdma: remove BD_INTR for channel0
  2019-06-17 13:27       ` Sven Van Asbroeck
@ 2019-06-18  5:50         ` Robin Gong
  0 siblings, 0 replies; 12+ messages in thread
From: Robin Gong @ 2019-06-18  5:50 UTC (permalink / raw)
  To: thesven73
  Cc: dl-linux-imx, festevam, dan.j.williams, dmaengine, m.olbrich,
	shawnguo, linux-arm-kernel, vkoul, kernel, s.hauer

On 2019-06-17 at 09:27 -0400, Sven Van Asbroeck wrote:
> Hello Robin,
> 
> On Sun, Jun 16, 2019 at 10:02 PM Robin Gong <yibin.gong@nxp.com>
> wrote:
> > 
> > 
> > The default imx defconfig and dts should be ok, because firmware
> > load
> > is delayed after rootfs mounted where firmware located in and
> > before
> > that, some driver which use sdma such as spi/uart/audio may have
> > already enable sdma clock which means channel0 interrupt could be
> > cleared immediately without interrupt storm. That's why I can't
> > reproduce your issue at first, but catch it once I sync with your
> > directly firmware load defconfig. So seems not very must to CC
> > stable
> > tree?
> As far as I know, the bug/crash does not happen if you're loading the
> sdma firmware from a filesystem. So the vast majority of users would
> never see the crash.
> 
> I agree that this is not a high-priority bugfix. But it's worthwhile
> for the
> stable trees to have it.
> 
> > 
> > Yes, but Michael's patch is the right direction, at least it fix RT
> > case and meaningless channel0 interrupt storm coming before
> > clearing
> > channel0 interrupt status in sdma_run_channel0(). Now, this patch
> > could
> > fix its minor side-effect.
> I'm not suggesting that we should revert or change Michael's patch.
> Just
> that it would be good for the v2 patch to contain:
> 
> Fixes: 1d069bfa3c78 ("dmaengine: imx-sdma: ack channel 0 IRQ in the
> interrupt handler")
> 
> This should allow stable maintainers to pull in your patch if and
> only if
> their release already contains 1d069bfa3c78.
Okay, would add Fixes tag into v2.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v1] dmaengine: imx-sdma: remove BD_INTR for channel0
  2019-06-17 10:15         ` m.olbrich
@ 2019-06-18  6:08           ` Robin Gong
  0 siblings, 0 replies; 12+ messages in thread
From: Robin Gong @ 2019-06-18  6:08 UTC (permalink / raw)
  To: m.olbrich
  Cc: dl-linux-imx, thesven73, festevam, dan.j.williams, dmaengine,
	shawnguo, linux-arm-kernel, vkoul, kernel, s.hauer

On 2019-06-17 at 12:15 +0200, m.olbrich@pengutronix.de wrote:
> On Mon, Jun 17, 2019 at 02:14:34AM +0000, Robin Gong wrote:
> > 
> > On 2019-06-14 at 18:09 +0000, Michael Olbrich wrote:
> > > 
> > > On Fri, Jun 14, 2019 at 09:25:51AM -0400, Sven Van Asbroeck
> > > wrote:
> > > > 
> > > > 
> > > > On Fri, Jun 14, 2019 at 6:49 AM Fabio Estevam <festevam@gmail.c
> > > > om>
> > > > wrote:
> > > > > 
> > > > > 
> > > > > 
> > > > > According to the original report from Sven the issue started
> > > > > to
> > > > > happen
> > > > > on 5.0, so it would be good to add a Fixes tag and Cc stable
> > > > > so
> > > > > that
> > > > > this fix could be backported to 5.0/5.1 stable trees.
> > > > Good catch !
> > > > 
> > > > However, the issue is highly timing-dependent. It will come and
> > > > go
> > > > depending
> > > > on the kernel version, devicetree and defconfig. If it works
> > > > for me
> > > > on
> > > > 4.19, that
> > > > doesn't mean the bug is gone on 4.19.
> > > > 
> > > > Looking at the commit history, I think the commit below
> > > > possibly
> > > > introduced the
> > > > issue. Until this commit, sdma_run_channel() would wait on the
> > > > interrupt
> > > > before proceeding. It has been there since 4.8:
> > > > 
> > > > Fixes: 1d069bfa3c78 ("dmaengine: imx-sdma: ack channel 0 IRQ in
> > > > the
> > > > interrupt handler")
> > > I think this is correct. Starting with this commit, the interrupt
> > > status fr
> > > channel 0 is no longer cleared in sdma_run_channel0() and
> > > sdma_int_handler() is always called for channel 0.
> > > During firmware loading the interrupts are enabled again just
> > > before
> > > the
> > > clocks are disabled. The interrupt is pending at this moment so
> > > on a
> > > single
> > > core system I think this will always work as expected. If the
> > > firmware
> > > loading and the interrupt handler run on different cores then
> > > this is
> > > racy.
> > > Maybe something else changed to make this more likely?
> > > 
> > > With this new change sdma_int_handler() is no longer called for
> > > channel 0
> > > right, so you should also remove the special handling there.
> > What's 'special handling' should be removed here? Do you mean put
> > below
> > pieces of your patch back?
> >  static int sdma_load_script(struct sdma_engine *sdma, void *buf,
> > int
> > size,
> > @@ -727,9 +720,9 @@ static irqreturn_t sdma_int_handler(int irq,
> > void
> > *dev_id)
> >         unsigned long stat;
> >  
> >         stat = readl_relaxed(sdma->regs + SDMA_H_INTR);
> > -       /* not interested in channel 0 interrupts */
> > -       stat &= ~1;
> >         writel_relaxed(stat, sdma->regs + SDMA_H_INTR);
> > +       /* channel 0 is special and not handled here, see
> > run_channel0() */
> > +       stat &= ~1;
> I think the "stat &= ~1;" can be removed completely. This bit should
> never
> be set, now that the interrupt for channel 0 is disabled.
Okay, but that's harmless, moreover, I like your comment '/* channel 0
is special and not handled here, see run_channel0() */' which said
clearly channel0 interrupt is a special one and NOT handled in
sdma_int_handler. So I'd like to keep it...  
> 
> Michael
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, back to index

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-14  8:39 [PATCH v1] dmaengine: imx-sdma: remove BD_INTR for channel0 yibin.gong
2019-06-14 10:49 ` Fabio Estevam
2019-06-14 13:25   ` Sven Van Asbroeck
2019-06-14 18:09     ` Michael Olbrich
2019-06-17  2:14       ` Robin Gong
2019-06-17 10:15         ` m.olbrich
2019-06-18  6:08           ` Robin Gong
2019-06-17  2:02     ` Robin Gong
2019-06-17 13:27       ` Sven Van Asbroeck
2019-06-18  5:50         ` Robin Gong
2019-06-14 13:35 ` Sven Van Asbroeck
2019-06-17  1:42   ` Robin Gong

dmaengine Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/dmaengine/0 dmaengine/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dmaengine dmaengine/ https://lore.kernel.org/dmaengine \
		dmaengine@vger.kernel.org dmaengine@archiver.kernel.org
	public-inbox-index dmaengine


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.dmaengine


AGPL code for this site: git clone https://public-inbox.org/ public-inbox