linux-arm-kernel.lists.infradead.org archive mirror
* iMX6/UART imprecise external abort
@ 2019-12-02 20:40 Andre Renaud
  2019-12-02 20:56 ` Fabio Estevam
                   ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: Andre Renaud @ 2019-12-02 20:40 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,
I am working with an iMX6Q system that is exhibiting a crash when
using the serial ports.
We have /dev/ttymxc2 configured as an RS485 UART, and are seeing an
'imprecise external abort' after some time of use (panic listed
below).

We are able to replicate this both on the custom kernel we're using,
as well as on the 5.3.x+fslc image from
https://github.com/Freescale/linux-fslc

To replicate it we have the mx6 hooked up to a PC, with each end
sending short 3-4 character messages every second. The fault kicks in
after about 15-30 minutes. This seems similar to the fault described
here: https://lkml.org/lkml/2019/11/11/588. We have tried shutting
down DMA and various performance/cpuidle systems, but that doesn't
seem to have any impact.
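
A minimal sketch of the kind of traffic generator described above, for anyone who wants to try reproducing this. It assumes RS485 mode is enabled from userspace through the TIOCSRS485 ioctl (it could equally be set in the device tree), that RTS is asserted while sending, and that the line settings (baud rate etc.) are configured elsewhere; the function names and the "abc" payload are made up for the example.

```python
import fcntl
import os
import struct
import time

# ioctl numbers and flag bits as defined in <asm-generic/ioctls.h> and <linux/serial.h>
TIOCSRS485 = 0x542F
SER_RS485_ENABLED = 1 << 0
SER_RS485_RTS_ON_SEND = 1 << 1


def pack_rs485(flags, delay_before=0, delay_after=0):
    """Build the 32-byte struct serial_rs485: flags, two RTS delays, 5 padding words."""
    return struct.pack("8I", flags, delay_before, delay_after, 0, 0, 0, 0, 0)


def open_rs485(path="/dev/ttymxc2"):
    """Open the port and switch it to RS485 mode (RTS polarity is an assumption)."""
    fd = os.open(path, os.O_RDWR | os.O_NOCTTY)
    fcntl.ioctl(fd, TIOCSRS485, pack_rs485(SER_RS485_ENABLED | SER_RS485_RTS_ON_SEND))
    return fd


def send_forever(fd, payload=b"abc\n", period_s=1.0):
    """Write one short message per period and drain whatever the peer sent."""
    os.set_blocking(fd, False)
    while True:
        os.write(fd, payload)
        try:
            os.read(fd, 64)
        except BlockingIOError:
            pass  # nothing received during this period
        time.sleep(period_s)
```

Running send_forever(open_rs485()) on the board, with a similar loop on the PC side, approximates the test setup; the port is opened once and kept open for the whole run.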

Does anyone have any thoughts on how to solve this?

Regards,
Andre

[ 5047.074427] Unhandled fault: imprecise external abort (0x1406) at 0xb6e00f78
[ 5047.081498] pgd = c0004000
[ 5047.084213] [b6e00f78] *pgd=00000000
[ 5047.087813] Internal error: : 1406 [#1] SMP ARM
[ 5047.092348] Modules linked in:
[ 5047.095429] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.8.0+ #19
[ 5047.101440] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[ 5047.107974] task: ef0ecd00 task.stack: ef158000
[ 5047.112521] PC is at arch_cpu_idle+0x48/0x4c
[ 5047.116799] LR is at arch_cpu_idle+0x44/0x4c
[ 5047.121077] pc : [<c0108c70>]    lr : [<c0108c6c>]    psr: 60070013
[ 5047.121077] sp : ef159f98  ip : ef159fa8  fp : ef159fa4
[ 5047.132560] r10: 00000000  r9 : 00000002  r8 : c0d025dc
[ 5047.137791] r7 : c0d95448  r6 : ef158000  r5 : c0d02648  r4 : ef158000
[ 5047.144324] r3 : c011a140  r2 : 005bc18a  r1 : ef7ae3c0  r0 : 00000000
[ 5047.150858] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[ 5047.157998] Control: 10c5387d  Table: 3dd9804a  DAC: 00000051
[ 5047.163748] Process swapper/1 (pid: 0, stack limit = 0xef158210)
[ 5047.169759] Stack: (0xef159f98 to 0xef15a000)
[ 5047.174124] 9f80:                                                       ef159fb4 ef159fa8
[ 5047.182310] 9fa0: c0170b54 c0108c34 ef159fdc ef159fb8 c0170da4 c0170b30 c0a8fb48 c0d8b845
[ 5047.190496] 9fc0: c0d8c48c c0d025dc 10c0387d c0d95448 ef159ff4 ef159fe0 c010e6cc c0170b6c
[ 5047.198682] 9fe0: 3f0fc06a 00000051 00000000 ef159ff8 101018cc c010e580 edddf4eb ffeebffd
[ 5047.206887] [<c0108c70>] (arch_cpu_idle) from [<c0170b54>] (default_idle_call+0x30/0x3c)
[ 5047.214993] [<c0170b54>] (default_idle_call) from [<c0170da4>] (cpu_startup_entry+0x244/0x298)
[ 5047.223619] [<c0170da4>] (cpu_startup_entry) from [<c010e6cc>] (secondary_start_kernel+0x158/0x164)
[ 5047.232677] [<c010e6cc>] (secondary_start_kernel) from [<101018cc>] (0x101018cc)
[ 5047.240083] Code: e34c30d0 e5933014 e12fff33 f1080080 (e89da800)
[ 5047.246190] ---[ end trace 853e028df8c9b7cd ]---
[ 5047.250814] Kernel panic - not syncing: Attempted to kill the idle task!
[ 5047.257528] CPU2: stopping
[ 5047.260250] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G      D 4.8.0+ #19
[ 5047.267476] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[ 5047.274022] [<c0110cc4>] (unwind_backtrace) from [<c010c9d0>] (show_stack+0x20/0x24)
[ 5047.281783] [<c010c9d0>] (show_stack) from [<c04939c8>] (dump_stack+0x90/0xa4)
[ 5047.289019] [<c04939c8>] (dump_stack) from [<c010ebe8>] (handle_IPI+0x2a4/0x2c4)
[ 5047.296426] [<c010ebe8>] (handle_IPI) from [<c01014dc>] (gic_handle_irq+0x80/0x84)
[ 5047.304013] [<c01014dc>] (gic_handle_irq) from [<c080c8cc>] (__irq_svc+0x6c/0x90)
[ 5047.311500] Exception stack(0xef15bf48 to 0xef15bf90)
[ 5047.316561] bf40:                   00000000 ef7bd3c0 00729232 c011a140 ef15a000 c0d02648
[ 5047.324747] bf60: ef15a000 c0d95448 c0d025dc 00000004 00000000 ef15bfa4 ef15bfa8 ef15bf98
[ 5047.332930] bf80: c0108c6c c0108c70 60000013 ffffffff
[ 5047.337999] [<c080c8cc>] (__irq_svc) from [<c0108c70>] (arch_cpu_idle+0x48/0x4c)
[ 5047.345410] [<c0108c70>] (arch_cpu_idle) from [<c0170b54>] (default_idle_call+0x30/0x3c)
[ 5047.353514] [<c0170b54>] (default_idle_call) from [<c0170da4>] (cpu_startup_entry+0x244/0x298)
[ 5047.362137] [<c0170da4>] (cpu_startup_entry) from [<c010e6cc>] (secondary_start_kernel+0x158/0x164)
[ 5047.371191] [<c010e6cc>] (secondary_start_kernel) from [<101018cc>] (0x101018cc)
[ 5047.378593] CPU3: stopping
[ 5047.381314] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G      D 4.8.0+ #19
[ 5047.388540] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[ 5047.395085] [<c0110cc4>] (unwind_backtrace) from [<c010c9d0>] (show_stack+0x20/0x24)
[ 5047.402841] [<c010c9d0>] (show_stack) from [<c04939c8>] (dump_stack+0x90/0xa4)
[ 5047.410073] [<c04939c8>] (dump_stack) from [<c010ebe8>] (handle_IPI+0x2a4/0x2c4)
[ 5047.417479] [<c010ebe8>] (handle_IPI) from [<c01014dc>] (gic_handle_irq+0x80/0x84)
[ 5047.425060] [<c01014dc>] (gic_handle_irq) from [<c080c8cc>] (__irq_svc+0x6c/0x90)
[ 5047.432546] Exception stack(0xef15df48 to 0xef15df90)
[ 5047.437606] df40:                   00000000 ef7cc3c0 00512f92 c011a140 ef15c000 c0d02648
[ 5047.445792] df60: ef15c000 c0d95448 c0d025dc 00000008 00000000 ef15dfa4 ef15dfa8 ef15df98
[ 5047.453974] df80: c0108c6c c0108c70 600d0013 ffffffff
[ 5047.459041] [<c080c8cc>] (__irq_svc) from [<c0108c70>] (arch_cpu_idle+0x48/0x4c)
[ 5047.466452] [<c0108c70>] (arch_cpu_idle) from [<c0170b54>] (default_idle_call+0x30/0x3c)
[ 5047.474555] [<c0170b54>] (default_idle_call) from [<c0170da4>] (cpu_startup_entry+0x244/0x298)
[ 5047.483179] [<c0170da4>] (cpu_startup_entry) from [<c010e6cc>] (secondary_start_kernel+0x158/0x164)
[ 5047.492234] [<c010e6cc>] (secondary_start_kernel) from [<101018cc>] (0x101018cc)
[ 5047.499635] CPU0: stopping
[ 5047.502356] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G      D 4.8.0+ #19
[ 5047.509582] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[ 5047.516126] [<c0110cc4>] (unwind_backtrace) from [<c010c9d0>] (show_stack+0x20/0x24)
[ 5047.523883] [<c010c9d0>] (show_stack) from [<c04939c8>] (dump_stack+0x90/0xa4)
[ 5047.531116] [<c04939c8>] (dump_stack) from [<c010ebe8>] (handle_IPI+0x2a4/0x2c4)
[ 5047.538521] [<c010ebe8>] (handle_IPI) from [<c01014dc>] (gic_handle_irq+0x80/0x84)
[ 5047.546102] [<c01014dc>] (gic_handle_irq) from [<c080c8cc>] (__irq_svc+0x6c/0x90)
[ 5047.553588] Exception stack(0xc0d01d78 to 0xc0d01dc0)
[ 5047.558645] 1d60:                                                       ee4dfae8 20070113
[ 5047.566831] 1d80: 00000730 000049db ee4dfa80 ef3d9de0 ee4dfae8 20070113 00000020 00000000
[ 5047.575016] 1da0: c0d0799c c0d01dd4 c0d01dd8 c0d01dc8 c05fdd00 c080c26c 60070113 ffffffff
[ 5047.583205] [<c080c8cc>] (__irq_svc) from [<c080c26c>] (_raw_spin_unlock_irqrestore+0x30/0x34)
[ 5047.591839] [<c080c26c>] (_raw_spin_unlock_irqrestore) from [<c05fdd00>] (sdhci_tasklet_finish+0xfc/0x234)
[ 5047.601507] [<c05fdd00>] (sdhci_tasklet_finish) from [<c012dfe8>] (tasklet_action+0x68/0xf8)
[ 5047.609955] [<c012dfe8>] (tasklet_action) from [<c0101630>] (__do_softirq+0x150/0x34c)
[ 5047.617892] [<c0101630>] (__do_softirq) from [<c012d9e0>] (irq_exit+0xc8/0x128)
[ 5047.625218] [<c012d9e0>] (irq_exit) from [<c0181338>] (__handle_domain_irq+0x70/0xc4)
[ 5047.633060] [<c0181338>] (__handle_domain_irq) from [<c01014a4>] (gic_handle_irq+0x48/0x84)
[ 5047.641421] [<c01014a4>] (gic_handle_irq) from [<c080c8cc>] (__irq_svc+0x6c/0x90)
[ 5047.648907] Exception stack(0xc0d01f08 to 0xc0d01f50)
[ 5047.653965] 1f00:                   00000000 ef79f3c0 00449df4 c011a140 c0d00000 c0d02648
[ 5047.662151] 1f20: c0d00000 c0d02540 c0d025dc 00000001 ef7db300 c0d01f64 c0d01f68 c0d01f58
[ 5047.670333] 1f40: c0108c6c c0108c70 60070013 ffffffff
[ 5047.675399] [<c080c8cc>] (__irq_svc) from [<c0108c70>] (arch_cpu_idle+0x48/0x4c)
[ 5047.682809] [<c0108c70>] (arch_cpu_idle) from [<c0170b54>] (default_idle_call+0x30/0x3c)
[ 5047.690911] [<c0170b54>] (default_idle_call) from [<c0170da4>] (cpu_startup_entry+0x244/0x298)
[ 5047.699533] [<c0170da4>] (cpu_startup_entry) from [<c0806eb8>] (rest_init+0x84/0x88)
[ 5047.707292] [<c0806eb8>] (rest_init) from [<c0c00d34>] (start_kernel+0x390/0x39c)
[ 5047.714786] ---[ end Kernel panic - not syncing: Attempted to kill the idle task!

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: iMX6/UART imprecise external abort
  2019-12-02 20:40 iMX6/UART imprecise external abort Andre Renaud
@ 2019-12-02 20:56 ` Fabio Estevam
  2019-12-03  1:53   ` [EXT] " Andy Duan
                     ` (2 more replies)
  2019-12-02 21:29 ` Uwe Kleine-König
  2019-12-21  3:33 ` Andre Renaud
  2 siblings, 3 replies; 31+ messages in thread
From: Fabio Estevam @ 2019-12-02 20:56 UTC (permalink / raw)
  To: Andre Renaud, Fugang Duan
  Cc: moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

Hi Andre,

[Adding Andy]

On Mon, Dec 2, 2019 at 5:40 PM Andre Renaud
<arenaud@designa-electronics.com> wrote:
>
> Hello,
> I am working with an iMX6Q system that is exhibiting a crash when
> using the serial ports.
> We have /dev/ttymxc2 configured as an RS485 UART, and are seeing an
> 'imprecise external abort' after some time of use (panic listed
> below).
>
> We are able to replicate this both on the custom kernel we're using,
> as well as on the 5.3.x+fslc image from
> https://github.com/Freescale/linux-fslc

Could you please try a vanilla 5.4.1 kernel?

>
> To replicate it we have the mx6 hooked up to a PC, with each end
> sending short 3-4 character messages every second. The fault kicks in
> after about 15-30 minutes. This seems similar to the fault described
> here: https://lkml.org/lkml/2019/11/11/588. We have tried shutting
> down DMA and various performance/cpuidle systems, but that doesn't
> seem to have any impact.
>
> Does anyone have any thoughts on how to solve this?
>
> Regards,
> Andre

* Re: iMX6/UART imprecise external abort
  2019-12-02 20:40 iMX6/UART imprecise external abort Andre Renaud
  2019-12-02 20:56 ` Fabio Estevam
@ 2019-12-02 21:29 ` Uwe Kleine-König
  2019-12-05 19:29   ` Andre Renaud
  2019-12-21  3:33 ` Andre Renaud
  2 siblings, 1 reply; 31+ messages in thread
From: Uwe Kleine-König @ 2019-12-02 21:29 UTC (permalink / raw)
  To: Andre Renaud; +Cc: kernel, linux-arm-kernel

Hello Andre,

On Tue, Dec 03, 2019 at 09:40:28AM +1300, Andre Renaud wrote:
> I am working with an iMX6Q system that is exhibiting a crash when
> using the serial ports.
> We have /dev/ttymxc2 configured as an RS485 UART, and are seeing an
> 'imprecise external abort' after some time of use (panic listed
> below).
> 
> We are able to replicate this both on the custom kernel we're using,
> as well as on the 5.3.x+fslc image from
> https://github.com/Freescale/linux-fslc
> 
> To replicate it we have the mx6 hooked up to a PC, with each end
> sending short 3-4 character messages every second. The fault kicks in

Are you closing the UART between the sendings?

> after about 15-30 minutes. This seems similar to the fault described
> here: https://lkml.org/lkml/2019/11/11/588. We have tried shutting
> down DMA and various performance/cpuidle systems, but that doesn't
> seem to have any impact.
> 
> Does anyone have any thoughts on how to solve this?
> 
> Regards,
> Andre
> 
> [ 5047.074427] Unhandled fault: imprecise external abort (0x1406) at 0xb6e00f78
> [ 5047.081498] pgd = c0004000
> [ 5047.084213] [b6e00f78] *pgd=00000000
> [ 5047.087813] Internal error: : 1406 [#1] SMP ARM
> [ 5047.092348] Modules linked in:
> [ 5047.095429] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.8.0+ #19
> [ 5047.101440] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> [ 5047.107974] task: ef0ecd00 task.stack: ef158000
> [ 5047.112521] PC is at arch_cpu_idle+0x48/0x4c
> [ 5047.116799] LR is at arch_cpu_idle+0x44/0x4c
> [ 5047.121077] pc : [<c0108c70>]    lr : [<c0108c6c>]    psr: 60070013
> [ 5047.121077] sp : ef159f98  ip : ef159fa8  fp : ef159fa4
> [ 5047.132560] r10: 00000000  r9 : 00000002  r8 : c0d025dc
> [ 5047.137791] r7 : c0d95448  r6 : ef158000  r5 : c0d02648  r4 : ef158000
> [ 5047.144324] r3 : c011a140  r2 : 005bc18a  r1 : ef7ae3c0  r0 : 00000000
> [ 5047.150858] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> [ 5047.157998] Control: 10c5387d  Table: 3dd9804a  DAC: 00000051
> [ 5047.163748] Process swapper/1 (pid: 0, stack limit = 0xef158210)
> [ 5047.169759] Stack: (0xef159f98 to 0xef15a000)
> [ 5047.174124] 9f80:                                                       ef159fb4 ef159fa8
> [ 5047.182310] 9fa0: c0170b54 c0108c34 ef159fdc ef159fb8 c0170da4 c0170b30 c0a8fb48 c0d8b845
> [ 5047.190496] 9fc0: c0d8c48c c0d025dc 10c0387d c0d95448 ef159ff4 ef159fe0 c010e6cc c0170b6c
> [ 5047.198682] 9fe0: 3f0fc06a 00000051 00000000 ef159ff8 101018cc c010e580 edddf4eb ffeebffd
> [ 5047.206887] [<c0108c70>] (arch_cpu_idle) from [<c0170b54>] (default_idle_call+0x30/0x3c)
> [ 5047.214993] [<c0170b54>] (default_idle_call) from [<c0170da4>] (cpu_startup_entry+0x244/0x298)
> [ 5047.223619] [<c0170da4>] (cpu_startup_entry) from [<c010e6cc>] (secondary_start_kernel+0x158/0x164)
> [ 5047.232677] [<c010e6cc>] (secondary_start_kernel) from [<101018cc>] (0x101018cc)
> [ 5047.240083] Code: e34c30d0 e5933014 e12fff33 f1080080 (e89da800)
> [ 5047.246190] ---[ end trace 853e028df8c9b7cd ]---

I saw a similar problem some time ago on 4.14.69 and 4.19.72 with a
backport of the UART driver from a newer release (around 5.4-rc1)
plus some rs485 improvements and the -rt patch applied. I got some input
from RMK on that and the situation is difficult. For an imprecise
external abort, the address at which the fault is reported says nothing
about where it actually happened.
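
To make the point about the fault status concrete, here is a small sketch that decodes the 0x1406 from the report. It assumes the ARMv7 short-descriptor DFSR layout (FS[3:0] in bits 3:0, FS[4] in bit 10, domain in bits 7:4, WnR in bit 11, ExT in bit 12); decode_fsr and FS_NAMES are names made up for this example, and only the two status codes relevant here are listed.

```python
# Fault status names for the ARMv7 short-descriptor format. Only the
# external-abort entries relevant here are listed; anything else is "other".
FS_NAMES = {
    0x08: "synchronous external abort",
    0x16: "asynchronous (imprecise) external abort",
}


def decode_fsr(fsr):
    """Split a short-descriptor DFSR value into its fields."""
    # FS is a 5-bit code: the low four bits sit in bits 3:0, the top bit in bit 10.
    fs = (fsr & 0xF) | (((fsr >> 10) & 1) << 4)
    return {
        "fs": fs,
        "name": FS_NAMES.get(fs, "other"),
        "domain": (fsr >> 4) & 0xF,
        "write": bool(fsr & (1 << 11)),  # WnR: set if the access was a write
        "ext": bool(fsr & (1 << 12)),    # ExT: implementation-defined abort type
    }


print(decode_fsr(0x1406))
```

For 0x1406 the FS code comes out as 0x16, the asynchronous case. Such an abort is delivered some time after the bus access that caused it, so the PC in arch_cpu_idle and the address 0xb6e00f78 only show what the CPU happened to be doing when the abort arrived.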

On our end the problem has not been easy to reproduce so far.

I haven't gotten around to debugging this problem yet. I would do some
shooting in the dark and start with:

 - disable DMA (doesn't help according to your report)
 - reproduce without -rt (still happens according to your report)
 - keep the UART clocks on. (Try removing
   "clk_disable_unprepare(sport->clk_ipg);" from imx_uart_probe())
 - try to reproduce in rs232 mode
 - try to record some traces of the problem
   (i.e. add tracing_off() to the fault handler and enable ftrace with a
   large enough trace buffer.)
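
The ftrace part of the last item could be armed along these lines. This is only a sketch: it assumes tracefs is mounted at /sys/kernel/tracing (older setups use /sys/kernel/debug/tracing), that the function tracer is built into the kernel, and that 16 MiB per CPU counts as "large enough"; ftrace_plan and apply_plan are made-up helper names. The tracing_off() call in the fault handler still has to be patched into the kernel separately.

```python
import os

# Assumption: tracefs mount point; older kernels expose it under debugfs instead.
TRACEFS = "/sys/kernel/tracing"


def ftrace_plan(buffer_kb=16384, tracer="function"):
    """Return the (file, value) writes that arm ftrace, in the order to apply them."""
    return [
        ("tracing_on", "0"),                 # stop tracing while reconfiguring
        ("buffer_size_kb", str(buffer_kb)),  # per-CPU ring buffer size in KiB
        ("current_tracer", tracer),
        ("tracing_on", "1"),                 # start recording
    ]


def apply_plan(plan, root=TRACEFS):
    """Write each value to its control file under the tracefs root (needs root)."""
    for name, value in plan:
        with open(os.path.join(root, name), "w") as f:
            f.write(value)
```

Calling apply_plan(ftrace_plan()) before starting the serial test leaves the tracer running; once the patched fault handler stops it, the file "trace" under the same directory holds the events leading up to the abort, provided the system stays alive long enough to read it.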

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |


* RE: [EXT] Re: iMX6/UART imprecise external abort
  2019-12-02 20:56 ` Fabio Estevam
@ 2019-12-03  1:53   ` Andy Duan
  2019-12-03  2:01     ` Andre Renaud
  2019-12-03  6:24   ` Andre Renaud
  2019-12-10  4:03   ` Andre Renaud
  2 siblings, 1 reply; 31+ messages in thread
From: Andy Duan @ 2019-12-03  1:53 UTC (permalink / raw)
  To: Fabio Estevam, Andre Renaud
  Cc: moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

From: Fabio Estevam <festevam@gmail.com> Sent: Tuesday, December 3, 2019 4:56 AM
> Hi Andre,
> 
> [Adding Andy]
> 
> On Mon, Dec 2, 2019 at 5:40 PM Andre Renaud
> <arenaud@designa-electronics.com> wrote:
> >
> > Hello,
> > I am working with an iMX6Q system that is exhibiting a crash when
> > using the serial ports.
> > We have /dev/ttymxc2 configured as an RS485 UART, and are seeing an
> > 'imprecise external abort' after some time of use (panic listed
> > below).
> >
> > We are able to replicate this both on the custom kernel we're using,
> > as well as on the 5.3.x+fslc image from
> > https://github.com/Freescale/linux-fslc
> 
> Could you please try a vanilla 5.4.1 kernel?

Please try a 5.4 kernel first.

Can you remove the SDMA firmware "/lib/firmware/imx/sdma/sdma-imx6q.bin" and
try again?
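
One way to do that without losing the file is to rename it, roughly as sketched below. The path is the one given above; disabled_name and set_aside are made-up helper names, and the sketch assumes the firmware is loaded from the root filesystem at boot (if it is built into the kernel image or an initramfs, renaming it here changes nothing).

```python
from pathlib import Path

# Firmware path as quoted in the mail.
SDMA_FW = Path("/lib/firmware/imx/sdma/sdma-imx6q.bin")


def disabled_name(path):
    """Return the path the firmware file is renamed to when set aside."""
    path = Path(path)
    return path.with_name(path.name + ".disabled")


def set_aside(path=SDMA_FW):
    """Rename the firmware so the loader cannot find it; None if it is absent."""
    path = Path(path)
    if not path.exists():
        return None
    target = disabled_name(path)
    path.rename(target)
    return target
```

After set_aside() and a reboot, the boot log should show whether the SDMA driver still found external firmware; renaming the file back restores the original setup.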

Andy
> 
> >
> > To replicate it we have the mx6 hooked up to a PC, with each end
> > sending short 3-4 character messages every second. The fault kicks in
> > after about 15-30 minutes. This seems similar to the fault described
> > here:
> > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flkml
> > .org%2Flkml%2F2019%2F11%2F11%2F588&amp;data=02%7C01%7Cfugang
> .duan%40nxp.com%7C08c50f6afbd14be3017d08d7776a0a4e%7C686ea1d3b
> c2b4c6fa92cd99c5c301635%7C0%7C0%7C637109169631007605&amp;sdat
> a=aC45obTT2sYqoWhcdm%2F8qz%2BtKH4CZxAOb%2FwyjJ376Fw%3D&amp;
> reserved=0. We have tried shutting down DMA and various
> performance/cpuidle systems, but that doesn't seem to have any impact.
> >
> > Does anyone have any thoughts on how to solve this?
> >
> > Regards,
> > Andre
> >
> > [ 5047.074427] Unhandled fault: imprecise external abort (0x1406) at
> > 0xb6e00f78 [ 5047.081498] pgd = c0004000 [ 5047.084213] [b6e00f78]
> > *pgd=00000000 [ 5047.087813] Internal error: : 1406 [#1] SMP ARM [
> > 5047.092348] Modules linked in:
> > [ 5047.095429] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.8.0+ #19 [
> > 5047.101440] Hardware name: Freescale i.MX6 Quad/DualLite (Device
> > Tree) [ 5047.107974] task: ef0ecd00 task.stack: ef158000 [
> > 5047.112521] PC is at arch_cpu_idle+0x48/0x4c [ 5047.116799] LR is at
> > arch_cpu_idle+0x44/0x4c
> > [ 5047.121077] pc : [<c0108c70>]    lr : [<c0108c6c>]    psr: 60070013
> > [ 5047.121077] sp : ef159f98  ip : ef159fa8  fp : ef159fa4 [
> > 5047.132560] r10: 00000000  r9 : 00000002  r8 : c0d025dc [
> > 5047.137791] r7 : c0d95448  r6 : ef158000  r5 : c0d02648  r4 :
> > ef158000 [ 5047.144324] r3 : c011a140  r2 : 005bc18a  r1 : ef7ae3c0
> > r0 : 00000000 [ 5047.150858] Flags: nZCv  IRQs on  FIQs on  Mode
> > SVC_32  ISA ARM  Segment none [ 5047.157998] Control: 10c5387d
> Table:
> > 3dd9804a  DAC: 00000051 [ 5047.163748] Process swapper/1 (pid: 0,
> > stack limit = 0xef158210) [ 5047.169759] Stack: (0xef159f98 to
> > 0xef15a000) [ 5047.174124] 9f80:
> >     ef159fb4 ef159fa8
> > [ 5047.182310] 9fa0: c0170b54 c0108c34 ef159fdc ef159fb8 c0170da4
> > c0170b30 c0a8fb48 c0d8b845
> > [ 5047.190496] 9fc0: c0d8c48c c0d025dc 10c0387d c0d95448 ef159ff4
> > ef159fe0 c010e6cc c0170b6c
> > [ 5047.198682] 9fe0: 3f0fc06a 00000051 00000000 ef159ff8 101018cc
> > c010e580 edddf4eb ffeebffd
> > [ 5047.206887] [<c0108c70>] (arch_cpu_idle) from [<c0170b54>]
> > (default_idle_call+0x30/0x3c)
> > [ 5047.214993] [<c0170b54>] (default_idle_call) from [<c0170da4>]
> > (cpu_startup_entry+0x244/0x298)
> > [ 5047.223619] [<c0170da4>] (cpu_startup_entry) from [<c010e6cc>]
> > (secondary_start_kernel+0x158/0x164)
> > [ 5047.232677] [<c010e6cc>] (secondary_start_kernel) from [<101018cc>]
> > (0x101018cc)
> > [ 5047.240083] Code: e34c30d0 e5933014 e12fff33 f1080080 (e89da800) [
> > 5047.246190] ---[ end trace 853e028df8c9b7cd ]--- [ 5047.250814]
> > Kernel panic - not syncing: Attempted to kill the idle task!
> > [ 5047.257528] CPU2: stopping
> > [ 5047.260250] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G      D
> > 4.8.0+ #19
> > [ 5047.267476] Hardware name: Freescale i.MX6 Quad/DualLite (Device
> > Tree) [ 5047.274022] [<c0110cc4>] (unwind_backtrace) from [<c010c9d0>]
> > (show_stack+0x20/0x24)
> > [ 5047.281783] [<c010c9d0>] (show_stack) from [<c04939c8>]
> > (dump_stack+0x90/0xa4)
> > [ 5047.289019] [<c04939c8>] (dump_stack) from [<c010ebe8>]
> > (handle_IPI+0x2a4/0x2c4)
> > [ 5047.296426] [<c010ebe8>] (handle_IPI) from [<c01014dc>]
> > (gic_handle_irq+0x80/0x84)
> > [ 5047.304013] [<c01014dc>] (gic_handle_irq) from [<c080c8cc>]
> > (__irq_svc+0x6c/0x90)
> > [ 5047.311500] Exception stack(0xef15bf48 to 0xef15bf90)
> > [ 5047.316561] bf40:                   00000000 ef7bd3c0 00729232
> > c011a140 ef15a000 c0d02648
> > [ 5047.324747] bf60: ef15a000 c0d95448 c0d025dc 00000004 00000000
> > ef15bfa4 ef15bfa8 ef15bf98
> > [ 5047.332930] bf80: c0108c6c c0108c70 60000013 ffffffff [
> > 5047.337999] [<c080c8cc>] (__irq_svc) from [<c0108c70>]
> > (arch_cpu_idle+0x48/0x4c)
> > [ 5047.345410] [<c0108c70>] (arch_cpu_idle) from [<c0170b54>]
> > (default_idle_call+0x30/0x3c)
> > [ 5047.353514] [<c0170b54>] (default_idle_call) from [<c0170da4>]
> > (cpu_startup_entry+0x244/0x298)
> > [ 5047.362137] [<c0170da4>] (cpu_startup_entry) from [<c010e6cc>]
> > (secondary_start_kernel+0x158/0x164)
> > [ 5047.371191] [<c010e6cc>] (secondary_start_kernel) from [<101018cc>]
> > (0x101018cc)
> > [ 5047.378593] CPU3: stopping
> > [ 5047.381314] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G      D
> > 4.8.0+ #19
> > [ 5047.388540] Hardware name: Freescale i.MX6 Quad/DualLite (Device
> > Tree) [ 5047.395085] [<c0110cc4>] (unwind_backtrace) from [<c010c9d0>]
> > (show_stack+0x20/0x24)
> > [ 5047.402841] [<c010c9d0>] (show_stack) from [<c04939c8>]
> > (dump_stack+0x90/0xa4)
> > [ 5047.410073] [<c04939c8>] (dump_stack) from [<c010ebe8>]
> > (handle_IPI+0x2a4/0x2c4)
> > [ 5047.417479] [<c010ebe8>] (handle_IPI) from [<c01014dc>]
> > (gic_handle_irq+0x80/0x84)
> > [ 5047.425060] [<c01014dc>] (gic_handle_irq) from [<c080c8cc>]
> > (__irq_svc+0x6c/0x90)
> > [ 5047.432546] Exception stack(0xef15df48 to 0xef15df90)
> > [ 5047.437606] df40:                   00000000 ef7cc3c0 00512f92
> > c011a140 ef15c000 c0d02648
> > [ 5047.445792] df60: ef15c000 c0d95448 c0d025dc 00000008 00000000
> > ef15dfa4 ef15dfa8 ef15df98
> > [ 5047.453974] df80: c0108c6c c0108c70 600d0013 ffffffff [
> > 5047.459041] [<c080c8cc>] (__irq_svc) from [<c0108c70>]
> > (arch_cpu_idle+0x48/0x4c)
> > [ 5047.466452] [<c0108c70>] (arch_cpu_idle) from [<c0170b54>]
> > (default_idle_call+0x30/0x3c)
> > [ 5047.474555] [<c0170b54>] (default_idle_call) from [<c0170da4>]
> > (cpu_startup_entry+0x244/0x298)
> > [ 5047.483179] [<c0170da4>] (cpu_startup_entry) from [<c010e6cc>]
> > (secondary_start_kernel+0x158/0x164)
> > [ 5047.492234] [<c010e6cc>] (secondary_start_kernel) from [<101018cc>]
> > (0x101018cc)
> > [ 5047.499635] CPU0: stopping
> > [ 5047.502356] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G      D
> > 4.8.0+ #19
> > [ 5047.509582] Hardware name: Freescale i.MX6 Quad/DualLite (Device
> > Tree) [ 5047.516126] [<c0110cc4>] (unwind_backtrace) from [<c010c9d0>]
> > (show_stack+0x20/0x24)
> > [ 5047.523883] [<c010c9d0>] (show_stack) from [<c04939c8>]
> > (dump_stack+0x90/0xa4)
> > [ 5047.531116] [<c04939c8>] (dump_stack) from [<c010ebe8>]
> > (handle_IPI+0x2a4/0x2c4)
> > [ 5047.538521] [<c010ebe8>] (handle_IPI) from [<c01014dc>]
> > (gic_handle_irq+0x80/0x84)
> > [ 5047.546102] [<c01014dc>] (gic_handle_irq) from [<c080c8cc>]
> > (__irq_svc+0x6c/0x90)
> > [ 5047.553588] Exception stack(0xc0d01d78 to 0xc0d01dc0) [
> > 5047.558645] 1d60:
> >     ee4dfae8 20070113
> > [ 5047.566831] 1d80: 00000730 000049db ee4dfa80 ef3d9de0 ee4dfae8
> > 20070113 00000020 00000000
> > [ 5047.575016] 1da0: c0d0799c c0d01dd4 c0d01dd8 c0d01dc8 c05fdd00
> > c080c26c 60070113 ffffffff [ 5047.583205] [<c080c8cc>] (__irq_svc)
> > from [<c080c26c>]
> > (_raw_spin_unlock_irqrestore+0x30/0x34)
> > [ 5047.591839] [<c080c26c>] (_raw_spin_unlock_irqrestore) from
> > [<c05fdd00>] (sdhci_tasklet_finish+0xfc/0x234) [ 5047.601507]
> > [<c05fdd00>] (sdhci_tasklet_finish) from [<c012dfe8>]
> > (tasklet_action+0x68/0xf8)
> > [ 5047.609955] [<c012dfe8>] (tasklet_action) from [<c0101630>]
> > (__do_softirq+0x150/0x34c)
> > [ 5047.617892] [<c0101630>] (__do_softirq) from [<c012d9e0>]
> > (irq_exit+0xc8/0x128)
> > [ 5047.625218] [<c012d9e0>] (irq_exit) from [<c0181338>]
> > (__handle_domain_irq+0x70/0xc4)
> > [ 5047.633060] [<c0181338>] (__handle_domain_irq) from [<c01014a4>]
> > (gic_handle_irq+0x48/0x84)
> > [ 5047.641421] [<c01014a4>] (gic_handle_irq) from [<c080c8cc>]
> > (__irq_svc+0x6c/0x90)
> > [ 5047.648907] Exception stack(0xc0d01f08 to 0xc0d01f50)
> > [ 5047.653965] 1f00:                   00000000 ef79f3c0 00449df4
> > c011a140 c0d00000 c0d02648
> > [ 5047.662151] 1f20: c0d00000 c0d02540 c0d025dc 00000001 ef7db300
> > c0d01f64 c0d01f68 c0d01f58
> > [ 5047.670333] 1f40: c0108c6c c0108c70 60070013 ffffffff
> > [ 5047.675399] [<c080c8cc>] (__irq_svc) from [<c0108c70>]
> > (arch_cpu_idle+0x48/0x4c)
> > [ 5047.682809] [<c0108c70>] (arch_cpu_idle) from [<c0170b54>]
> > (default_idle_call+0x30/0x3c)
> > [ 5047.690911] [<c0170b54>] (default_idle_call) from [<c0170da4>]
> > (cpu_startup_entry+0x244/0x298)
> > [ 5047.699533] [<c0170da4>] (cpu_startup_entry) from [<c0806eb8>]
> > (rest_init+0x84/0x88)
> > [ 5047.707292] [<c0806eb8>] (rest_init) from [<c0c00d34>]
> > (start_kernel+0x390/0x39c)
> > [ 5047.714786] ---[ end Kernel panic - not syncing: Attempted to kill
> > the idle task!
> >
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [EXT] Re: iMX6/UART imprecise external abort
  2019-12-03  1:53   ` [EXT] " Andy Duan
@ 2019-12-03  2:01     ` Andre Renaud
  2019-12-03  2:06       ` Andy Duan
  2019-12-03  2:28       ` Fabio Estevam
  0 siblings, 2 replies; 31+ messages in thread
From: Andre Renaud @ 2019-12-03  2:01 UTC (permalink / raw)
  To: Andy Duan
  Cc: Fabio Estevam, moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

> > Could you please try a vanilla 5.4.1 kernel?
>
> Please try 5.4 kernel firstly.

We are trying 5.4.1 at the moment, but will try 5.4 after that if the
issue persists.

> Can you remove the sdma firmware "/lib/firmware/imx/sdma/sdma-imx6q.bin" and
> try it ?
At the moment we have CONFIG_IMX_SDMA not set, so I'm assuming this
binary wouldn't have any impact?

Regards,
Andre

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [EXT] Re: iMX6/UART imprecise external abort
  2019-12-03  2:01     ` Andre Renaud
@ 2019-12-03  2:06       ` Andy Duan
  2019-12-03  2:28       ` Fabio Estevam
  1 sibling, 0 replies; 31+ messages in thread
From: Andy Duan @ 2019-12-03  2:06 UTC (permalink / raw)
  To: Andre Renaud
  Cc: Fabio Estevam, moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

From: Andre Renaud <arenaud@designa-electronics.com> Sent: Tuesday, December 3, 2019 10:01 AM
> > > Could you please try a vanilla 5.4.1 kernel?
> >
> > Please try 5.4 kernel firstly.
> 
> We are trying 5.4.1 at the moment, but will try 5.4 after that if the issue
> persists.
> 
> > Can you remove the sdma firmware
> > "/lib/firmware/imx/sdma/sdma-imx6q.bin" and try it ?
> At the moment we have CONFIG_IMX_SDMA not set, so I'm assuming this
> binary wouldn't have any impact?

Yes, if the config is disabled, the SDMA driver is not loaded.

In general, accessing valid UART registers cannot cause an external data abort.
> 
> Regards,
> Andre
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [EXT] Re: iMX6/UART imprecise external abort
  2019-12-03  2:01     ` Andre Renaud
  2019-12-03  2:06       ` Andy Duan
@ 2019-12-03  2:28       ` Fabio Estevam
  1 sibling, 0 replies; 31+ messages in thread
From: Fabio Estevam @ 2019-12-03  2:28 UTC (permalink / raw)
  To: Andre Renaud
  Cc: Andy Duan, moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

On Mon, Dec 2, 2019 at 11:01 PM Andre Renaud
<arenaud@designa-electronics.com> wrote:
>
> > > Could you please try a vanilla 5.4.1 kernel?
> >
> > Please try 5.4 kernel firstly.
>
> We are trying 5.4.1 at the moment, but will try 5.4 after that if the
> issue persists.

5.4.1 is fine, thanks

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: iMX6/UART imprecise external abort
  2019-12-02 20:56 ` Fabio Estevam
  2019-12-03  1:53   ` [EXT] " Andy Duan
@ 2019-12-03  6:24   ` Andre Renaud
  2019-12-10  4:03   ` Andre Renaud
  2 siblings, 0 replies; 31+ messages in thread
From: Andre Renaud @ 2019-12-03  6:24 UTC (permalink / raw)
  To: Fabio Estevam
  Cc: Fugang Duan, moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

Hi Fabio,
> Could you please try a vanilla 5.4.1 kernel?

Yes, it continues to occur on a vanilla 5.4.1 kernel.
We have managed to speed up the replication of the fault with the
following fairly simple UART stress test:
https://gist.github.com/AndreRenaud/712abbef1340d907fca6bdadafb4ac92
Our kernel .config is here:
https://gist.github.com/AndreRenaud/39b1eee0ca9a6d2db2ee86f5d3cc6d4d

The unit is essentially idle - just a minimal buildroot filesystem.
Bootlog is available here:
https://gist.github.com/AndreRenaud/a4aef8a57cfe33b90f43a17bb0d2395b

We are starting to look at some of the other suggestions - disabling
the clocking in drivers/tty/serial/imx.c for instance.

Regards,
Andre

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: iMX6/UART imprecise external abort
  2019-12-02 21:29 ` Uwe Kleine-König
@ 2019-12-05 19:29   ` Andre Renaud
  0 siblings, 0 replies; 31+ messages in thread
From: Andre Renaud @ 2019-12-05 19:29 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: kernel, moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

On Tue, Dec 3, 2019 at 10:29 AM Uwe Kleine-König
<u.kleine-koenig@pengutronix.de> wrote:
> I saw a similar problem some time ago on a 4.14.69 and 4.19.72 with a
> backport of the UART driver from some newer release (around 5.4-rc1)
> plus some rs485 improvments and the -rt patch applied. I got some input
> by RMK on that and the situation is difficult.  The address where the
> fault is reported to have happend doesn't say anything for an imprecise
> external abort.
>
> On our end the problem doesn't reproduce so easily up to now.
>
> I didn't come around to debug this problem, yet. I would do some
> shooting in the dark and start with:
>
>  - disable DMA (doesn't help according to your report)
>  - reproduce without -rt (still happens according to your report)
>  - keep the UART clocks on. (Try removing
>    "clk_disable_unprepare(sport->clk_ipg);" from imx_uart_probe())

Changing the clocking made the issue harder to reproduce, but it did
still occur eventually.

>  - try to reproduce in rs232 mode
>  - try to record some traces of the problem
>    (i.e. add tracing_off() to the fault handler and enable ftrace with a
>    large enough trace buffer.)

We haven't tried these yet.

We did try adjusting the IRQ affinity: we moved the UART IRQ onto the
fourth core (mask 0x8, i.e. echo 8 > /proc/irq/67/smp_affinity), and
then ran our test program on the same core with taskset (i.e.
taskset 8 rs485_test).

Using this configuration I have not seen the issue reoccur on the
5.4.1 kernel, although the same settings do not resolve the issue on
our older 4.8 kernel.
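
For readers unfamiliar with the mask notation used above, here is a minimal C sketch of the arithmetic, assuming the usual convention that bit N of the hex mask selects zero-based CPU N. The helper names cpu_to_affinity_mask() and write_affinity_mask() are hypothetical and not part of the original test program; the same mask value is what taskset accepts.

```c
#include <stdio.h>

/* Bit N of the mask selects zero-based CPU N.
 * Returns 0 for CPUs outside the 32-bit range handled by this sketch. */
static unsigned long cpu_to_affinity_mask(unsigned int cpu)
{
    if (cpu >= 32)
        return 0;
    return 1UL << cpu;
}

/* Write the hex mask for one CPU to an smp_affinity-style file,
 * e.g. /proc/irq/67/smp_affinity. Returns 0 on success, -1 on failure. */
static int write_affinity_mask(const char *path, unsigned int cpu)
{
    unsigned long mask = cpu_to_affinity_mask(cpu);
    FILE *f;
    int ok;

    if (mask == 0)
        return -1;
    f = fopen(path, "w");
    if (!f)
        return -1;
    ok = fprintf(f, "%lx\n", mask) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

With this convention, write_affinity_mask("/proc/irq/67/smp_affinity", 3) would write the same value as the echo 8 command above (root privileges assumed).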

Regards,
Andre

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: iMX6/UART imprecise external abort
  2019-12-02 20:56 ` Fabio Estevam
  2019-12-03  1:53   ` [EXT] " Andy Duan
  2019-12-03  6:24   ` Andre Renaud
@ 2019-12-10  4:03   ` Andre Renaud
  2019-12-10  5:46     ` [EXT] " Andy Duan
  2 siblings, 1 reply; 31+ messages in thread
From: Andre Renaud @ 2019-12-10  4:03 UTC (permalink / raw)
  To: Fabio Estevam
  Cc: Fugang Duan, moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

Hello Fabio,

On Tue, Dec 3, 2019 at 9:56 AM Fabio Estevam <festevam@gmail.com> wrote:
> Could you please try a vanilla 5.4.1 kernel?

We have confirmed that this happens on the standard 5.4.1 kernel, but
only when the RS485 ioctls and corresponding device tree entries are
enabled.
We have tried it on different hardware (the Wandboard iMX6Q), with
just a minor change to the device tree - the addition of
uart-has-rtscts, and rts-gpio.
&uart3 {
    pinctrl-names = "default";
    pinctrl-0 = <&pinctrl_uart3>;
    uart-has-rtscts;
    rts-gpio = <&gpio3 23 GPIO_ACTIVE_LOW>;
    status = "okay";
};

Then if we enable RS485 mode using the ioctl:
    struct serial_rs485 rs485;
    rs485.flags = SER_RS485_ENABLED | SER_RS485_RTS_ON_SEND;
    ioctl(fd, TIOCSRS485, &rs485);

This will cause it to panic fairly quickly (in a few minutes).
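
A slightly fuller version of the ioctl fragment above might look like the following sketch. It assumes the tty has already been opened elsewhere, and it zero-initialises the structure so that the delay fields are not left with stack garbage (an assumption about intent; the fragment as posted does not show any initialisation). The helper names rs485_report_config() and rs485_enable() are hypothetical.

```c
#include <string.h>
#include <sys/ioctl.h>
#include <linux/serial.h>

/* Fill *cfg with the flags used in the report; all other fields are zeroed. */
static void rs485_report_config(struct serial_rs485 *cfg)
{
    memset(cfg, 0, sizeof(*cfg));
    cfg->flags = SER_RS485_ENABLED | SER_RS485_RTS_ON_SEND;
}

/* Apply the configuration to an already opened tty.
 * Returns the result of ioctl(): 0 on success, -1 with errno set on error. */
static int rs485_enable(int fd)
{
    struct serial_rs485 cfg;

    rs485_report_config(&cfg);
    return ioctl(fd, TIOCSRS485, &cfg);
}
```

Checking the return value matters here because a driver may reject or adjust the requested flags.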

This is with the SDMA peripheral disabled. If we enable the SDMA
peripheral, then after a while we stop receiving data, and the unit
locks up, although there is no panic output.

Regards,
Andre

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [EXT] Re: iMX6/UART imprecise external abort
  2019-12-10  4:03   ` Andre Renaud
@ 2019-12-10  5:46     ` Andy Duan
  2019-12-10 22:07       ` Andre Renaud
  0 siblings, 1 reply; 31+ messages in thread
From: Andy Duan @ 2019-12-10  5:46 UTC (permalink / raw)
  To: Andre Renaud, Fabio Estevam
  Cc: moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

From: Andre Renaud <arenaud@designa-electronics.com> Sent: Tuesday, December 10, 2019 12:04 PM
> Hello Fabio,
> 
> On Tue, Dec 3, 2019 at 9:56 AM Fabio Estevam <festevam@gmail.com>
> wrote:
> > Could you please try a vanilla 5.4.1 kernel?
> 
> We have confirmed that this happens on the standard 5.4.1 board, but only
> when the RS485 ioctls and corresponding device tree entries are enabled.
> We have tried it on different hardware (the Wandboard iMX6Q), with just a
> minor change to the device tree - the addition of uart-has-rtscts, and rts-gpio.
Documentation/devicetree/bindings/serial/fsl-imx-uart.txt
Note that for RS485 you must enable either the "uart-has-rtscts" or the "rts-gpios" properties.

And please note rts-gpio should be the pin for CTS_B.

> &uart3 {
>     pinctrl-names = "default";
>     pinctrl-0 = <&pinctrl_uart3>;
>     uart-has-rtscts;
>     rts-gpio = <&gpio3 23 GPIO_ACTIVE_LOW>;
>     status = "okay";
> };
> 
> Then if we enable RS485 mode using the ioctl:
>     struct serial_rs485 rs485;
>     rs485.flags = SER_RS485_ENABLED | SER_RS485_RTS_ON_SEND;
>     ioctl(fd, TIOCSRS485, &rs485);
> 
> This will cause it to panic fairly quickly (in a few minutes).
> 
> This is with the SDMA peripheral disabled. If we enable the SDMA peripheral,
> then after a while we stop receiving data, and the unit locks up, although
> there is no panic output.
> 
> Regards,
> Andre
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [EXT] Re: iMX6/UART imprecise external abort
  2019-12-10  5:46     ` [EXT] " Andy Duan
@ 2019-12-10 22:07       ` Andre Renaud
  2019-12-11  0:27         ` Fabio Estevam
  0 siblings, 1 reply; 31+ messages in thread
From: Andre Renaud @ 2019-12-10 22:07 UTC (permalink / raw)
  To: Andy Duan
  Cc: Fabio Estevam, moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

Hi Andy,

On Tue, Dec 10, 2019 at 6:46 PM Andy Duan <fugang.duan@nxp.com> wrote:
> > We have confirmed that this happens on the standard 5.4.1 board, but only
> > when the RS485 ioctls and corresponding device tree entries are enabled.
> > We have tried it on different hardware (the Wandboard iMX6Q), with just a
> > minor change to the device tree - the addition of uart-has-rtscts, and rts-gpio.
> Documentation/devicetree/bindings/serial/fsl-imx-uart.txt
> Note that for RS485 you must enable either the "uart-has-rtscts" or the "rts-gpios" properties.
>
> And please note rts-gpio should be the pin for CTS_B.

Thanks - we did have this wrong. However even after correcting this it
still fails in the same way. This is what we're trying:
diff --git a/arch/arm/boot/dts/imx6qdl-wandboard.dtsi b/arch/arm/boot/dts/imx6qdl-wandboard.dtsi
index 2cfb4112a467..5f2d3dfafcec 100644
--- a/arch/arm/boot/dts/imx6qdl-wandboard.dtsi
+++ b/arch/arm/boot/dts/imx6qdl-wandboard.dtsi
@@ -219,6 +219,14 @@
  >;
  };
+ pinctrl_uart2: uart2grp {
+ fsl,pins = <
+        MX6QDL_PAD_SD4_DAT7__UART2_TX_DATA 0x1b0b1
+        MX6QDL_PAD_SD4_DAT4__UART2_RX_DATA 0x1b0b1
+        MX6QDL_PAD_SD4_DAT5__UART2_CTS_B    0x1b0b1
+ >;
+ };
+
  pinctrl_uart3: uart3grp {
  fsl,pins = <
  MX6QDL_PAD_EIM_D24__UART3_TX_DATA 0x1b0b1
@@ -313,13 +321,18 @@
  status = "okay";
 };
-&uart3 {
+&uart2 {
  pinctrl-names = "default";
- pinctrl-0 = <&pinctrl_uart3>;
+ pinctrl-0 = <&pinctrl_uart2>;
  uart-has-rtscts;
+  linux,rs485-enabled-at-boot-time;
  status = "okay";
 };
+&uart3 {
+ status = "disabled";
+};
+
 &usbh1 {
  status = "okay";
 };

Regards,
Andre

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [EXT] Re: iMX6/UART imprecise external abort
  2019-12-10 22:07       ` Andre Renaud
@ 2019-12-11  0:27         ` Fabio Estevam
  2019-12-11  1:35           ` [EXT] " Andre Renaud
  0 siblings, 1 reply; 31+ messages in thread
From: Fabio Estevam @ 2019-12-11  0:27 UTC (permalink / raw)
  To: Andre Renaud
  Cc: Andy Duan, moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

Hi Andre,

On Tue, Dec 10, 2019 at 7:07 PM Andre Renaud
<arenaud@designa-electronics.com> wrote:

> Thanks - we did have this wrong. However even after correcting this it
> still fails in the same way. This is what we're trying:

I tried your patch on my imx6qp wandboard and did not reproduce the error.

As you activated uart2 I changed your code to use ttymxc1 instead.

Do you connect UART2 to the PC and also send commands from the PC to the board?

Thanks

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [EXT] iMX6/UART imprecise external abort
  2019-12-11  0:27         ` Fabio Estevam
@ 2019-12-11  1:35           ` Andre Renaud
  2019-12-11  1:59             ` Andre Renaud
  2019-12-12  1:35             ` Fabio Estevam
  0 siblings, 2 replies; 31+ messages in thread
From: Andre Renaud @ 2019-12-11  1:35 UTC (permalink / raw)
  To: Fabio Estevam
  Cc: Youxin Su, Andy Duan,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

Hi Fabio,
> On 11/12/2019, at 1:27 PM, Fabio Estevam <festevam@gmail.com> wrote:
>> Thanks - we did have this wrong. However even after correcting this it
>> still fails in the same way. This is what we're trying:
> 
> I tried your patch on my imx6qp wandboard and did not reproduce the error.
> 
> As you activated uart2 I changed your code to use ttymxc1 instead.
> 
> Do you connect UART2 to the PC and also send command from the PC to the board?

Yes, we connect it to a PC and basically stream data in both directions between the two.

On the PC it’s as simple as:
while : ; do echo test > /dev/ttyUSB1 ; sleep 0.05 ; done

It does need bi-directional traffic to trigger.

Regards,
Andre
  
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [EXT] iMX6/UART imprecise external abort
  2019-12-11  1:35           ` [EXT] " Andre Renaud
@ 2019-12-11  1:59             ` Andre Renaud
  2019-12-12  1:35             ` Fabio Estevam
  1 sibling, 0 replies; 31+ messages in thread
From: Andre Renaud @ 2019-12-11  1:59 UTC (permalink / raw)
  To: Fabio Estevam
  Cc: Youxin Su, Andy Duan,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE


> On 11/12/2019, at 2:35 PM, Andre Renaud <arenaud@designa-electronics.com> wrote:
> 
> Hi Fabio,
>> On 11/12/2019, at 1:27 PM, Fabio Estevam <festevam@gmail.com> wrote:
>>> Thanks - we did have this wrong. However even after correcting this it
>>> still fails in the same way. This is what we're trying:
>> 
>> I tried your patch on my imx6qp wandboard and did not reproduce the error.
>> 
>> As you activated uart2 I changed your code to use ttymxc1 instead.
>> 
>> Do you connect UART2 to the PC and also send command from the PC to the board?
> 
> Yes, we connect it to a PC and basically stream data in both directions between the two.
> 
> On the PC it’s as simple as:
> while : ; do echo test > /dev/ttyUSB1 ; sleep 0.05 ; done
> 
> It does need bi-directional traffic to trigger.

We also have SDMA disabled.

Regards,
Andre
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [EXT] iMX6/UART imprecise external abort
  2019-12-11  1:35           ` [EXT] " Andre Renaud
  2019-12-11  1:59             ` Andre Renaud
@ 2019-12-12  1:35             ` Fabio Estevam
  2019-12-15 23:41               ` Andre Renaud
  1 sibling, 1 reply; 31+ messages in thread
From: Fabio Estevam @ 2019-12-12  1:35 UTC (permalink / raw)
  To: Andre Renaud
  Cc: Youxin Su, Andy Duan,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

Hi Andre,

On Tue, Dec 10, 2019 at 10:36 PM Andre Renaud
<arenaud@designa-electronics.com> wrote:

> Yes, we connect it to a PC and basically stream data in both directions between the two.

Looks like I need to access the UART2 pins on the Wandboard and attach
an RS-232/USB converter.

I am not sure I can easily do such a setup at the moment.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [EXT] iMX6/UART imprecise external abort
  2019-12-12  1:35             ` Fabio Estevam
@ 2019-12-15 23:41               ` Andre Renaud
  0 siblings, 0 replies; 31+ messages in thread
From: Andre Renaud @ 2019-12-15 23:41 UTC (permalink / raw)
  To: Fabio Estevam
  Cc: Youxin Su, Andy Duan,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

[-- Attachment #1: Type: text/plain, Size: 1029 bytes --]

Hello Fabio,

On Thu, Dec 12, 2019 at 2:35 PM Fabio Estevam <festevam@gmail.com> wrote:
> > Yes, we connect it to a PC and basically stream data in both directions between the two.
>
> Looks like I need to access the UART2 pins in the wandboard and place
> a RS-232/USB converter.
>
> Not sure I can easily do such setup at the moment.

I have found one other issue relating to this crash - it seems to only
occur when we're in a bus-conflict situation, i.e. both parties on the
RS485 bus are transmitting at the same time. On the scope I am able to
see invalid bytes due to this contention.

In the attached screenshot you can see that there are odd bit patterns
visible on the RX line. Unfortunately I only have a digital trace of
it at the moment. This is a screenshot of the last block of bytes
transmitted by the IMX6 before the crash. Line 0 is the TX_EN signal
for the RS485 transceiver, line 1 is the TX signal from the MX6, and
line 2 is the RX signal. These are both on the MX6 side of the
transceiver.

Regards,
Andre

[-- Attachment #2: Screenshot from 2019-12-16 12-32-43.png --]
[-- Type: image/png, Size: 64026 bytes --]

[-- Attachment #3: Type: text/plain, Size: 176 bytes --]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: iMX6/UART imprecise external abort
  2019-12-02 20:40 iMX6/UART imprecise external abort Andre Renaud
  2019-12-02 20:56 ` Fabio Estevam
  2019-12-02 21:29 ` Uwe Kleine-König
@ 2019-12-21  3:33 ` Andre Renaud
  2019-12-21  7:31   ` [EXT] " Andy Duan
  2 siblings, 1 reply; 31+ messages in thread
From: Andre Renaud @ 2019-12-21  3:33 UTC (permalink / raw)
  To: moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
  Cc: Andy Duan, s.hauer, Fabio Estevam, Uwe Kleine-König

> On 3/12/2019, at 9:40 AM, Andre Renaud <arenaud@designa-electronics.com> wrote:
> I am working with an iMX6Q system that is exhibiting a crash when
> using the serial ports.
> We have /dev/ttymxc2 configured as an RS485 UART, and are seeing an
> 'imprecise external abort' after some time of use (panic listed
> below).

Following up on this. After various attempts to replicate it on different boards, we found that after enabling the SER_RS485_RX_DURING_TX flag, the issue went away. This in turn led us to discover that the issue we were seeing is the same one discussed in this prior discussion: https://www.spinics.net/lists/arm-kernel/msg564268.html

We took the patch there, modified it to suit our kernel, and have confirmed that this resolves the issue.

Can anyone comment on whether that patch, or some variant of it, should be forward ported to the latest kernel? It has a slightly odd timeout in there, with no error return value, but apart from that it does resolve our issue.
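
For anyone wanting to try the SER_RS485_RX_DURING_TX workaround from user space, a minimal sketch could read the current RS485 settings and add the flag, assuming the driver accepts it unchanged. The helper names below are hypothetical and this is not the code we used in our test program.

```c
#include <sys/ioctl.h>
#include <linux/serial.h>

/* Return the given RS485 flags with receive-during-transmit added;
 * all other bits are left as they were. */
static unsigned int rs485_flags_with_rx_during_tx(unsigned int flags)
{
    return flags | SER_RS485_RX_DURING_TX;
}

/* Read the current RS485 settings of an open tty, add the flag,
 * and write them back. Returns 0 on success, -1 on error. */
static int rs485_set_rx_during_tx(int fd)
{
    struct serial_rs485 cfg;

    if (ioctl(fd, TIOCGRS485, &cfg) < 0)
        return -1;
    cfg.flags = rs485_flags_with_rx_during_tx(cfg.flags);
    return ioctl(fd, TIOCSRS485, &cfg);
}
```

Reading the settings back with TIOCGRS485 afterwards is a cheap way to confirm which flags the driver actually kept.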

Regards,
Andre
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [EXT] Re: iMX6/UART imprecise external abort
  2019-12-21  3:33 ` Andre Renaud
@ 2019-12-21  7:31   ` Andy Duan
  2019-12-21 12:03     ` Fabio Estevam
  0 siblings, 1 reply; 31+ messages in thread
From: Andy Duan @ 2019-12-21  7:31 UTC (permalink / raw)
  To: Andre Renaud, moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
  Cc: s.hauer, Fabio Estevam, Uwe Kleine-König

From: Andre Renaud <arenaud@designa-electronics.com> Sent: Saturday, December 21, 2019 11:34 AM
> > On 3/12/2019, at 9:40 AM, Andre Renaud
> <arenaud@designa-electronics.com> wrote:
> > I am working with an iMX6Q system that is exhibiting a crash when
> > using the serial ports.
> > We have /dev/ttymxc2 configured as an RS485 UART, and are seeing an
> > 'imprecise external abort' after some time of use (panic listed
> > below).
> 
> Following up on this. After various attempts to replicate it on different boards,
> we found that after enabling the SER_RS485_RX_DURING_TX flag, the issue
> went away. This in turn led us to discover that the issue we were seeing is the
> same one discussed in this prior discussion:
> https://www.spinics.net/lists/arm-kernel/msg564268.html
> 
From the old thread discussion, the key point:
"If the receiver is disabled while there are characters in the FIFO
then the receive data ready interrupt is triggered (even when
UCR1_RRDYEN is cleared beforehand)."

We should ensure that data in the RX FIFO is not lost, since it is valid data.
Draining the whole RX FIFO when transmission starts, in a way that works for
both DMA and CPU PIO mode, would involve complex code logic.
So I suggest enabling the "SER_RS485_RX_DURING_TX" flag, and forcing it on
in the imx UART RS485 driver.

Andy
> We took the patch there, modified it to suite our kernel, and have confirmed
> that this resolves the issue.
> 
> Can anyone comment on whether that patch, or some variant on it, should be
> forward ported to the latest kernel? It has a slightly odd timeout in there, with
> no error return value, but apart from that it does resolve out issue.
> 
> Regards,
> Andre

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [EXT] Re: iMX6/UART imprecise external abort
  2019-12-21  7:31   ` [EXT] " Andy Duan
@ 2019-12-21 12:03     ` Fabio Estevam
  2019-12-23  1:53       ` Andy Duan
  0 siblings, 1 reply; 31+ messages in thread
From: Fabio Estevam @ 2019-12-21 12:03 UTC (permalink / raw)
  To: Andy Duan
  Cc: Andre Renaud, s.hauer,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	Uwe Kleine-König

On Sat, Dec 21, 2019 at 4:31 AM Andy Duan <fugang.duan@nxp.com> wrote:

> We should ensure the RX FIFO data are not missed since they are valid data.
> To compatible DMA and cpu PIO mode, to receive all RX FIFO data when start
> to send, it will involve complex code logic.
> So I suggest to enable the flag "SER_RS485_RX_DURING_TX", and force to enable
> the flag for imx uart RS485 driver.

Inside imx_uart_rs485_config() we have:

if (rs485conf->flags & SER_RS485_ENABLED) {
       /* Enable receiver if low-active RTS signal is requested */
       if (sport->have_rtscts &&  !sport->have_rtsgpio &&
           !(rs485conf->flags & SER_RS485_RTS_ON_SEND))
                    rs485conf->flags |= SER_RS485_RX_DURING_TX;

Maybe the if() logic needs to be changed so that the
SER_RS485_RX_DURING_TX flag could be set in Andre's case?
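
To make the effect of that condition concrete, here is a small user-space model of it. This is only a sketch that mirrors the quoted if() with plain parameters; model_rs485_flags() is a hypothetical name, and the real driver function does more than this.

```c
#include <stdbool.h>
#include <linux/serial.h>

/* User-space model of the quoted condition, not driver code.
 * Returns the flags as they would look after the check. */
static unsigned int model_rs485_flags(bool have_rtscts, bool have_rtsgpio,
                                      unsigned int flags)
{
    if (flags & SER_RS485_ENABLED) {
        /* RX_DURING_TX is forced only for a native, low-active RTS signal */
        if (have_rtscts && !have_rtsgpio &&
            !(flags & SER_RS485_RTS_ON_SEND))
            flags |= SER_RS485_RX_DURING_TX;
    }
    return flags;
}
```

With SER_RS485_ENABLED | SER_RS485_RTS_ON_SEND, as in Andre's ioctl, the model leaves the flags untouched whether or not an RTS GPIO is present, so SER_RS485_RX_DURING_TX is not set automatically.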

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [EXT] Re: iMX6/UART imprecise external abort
  2019-12-21 12:03     ` Fabio Estevam
@ 2019-12-23  1:53       ` Andy Duan
  2019-12-23 10:16         ` Uwe Kleine-König
  0 siblings, 1 reply; 31+ messages in thread
From: Andy Duan @ 2019-12-23  1:53 UTC (permalink / raw)
  To: Fabio Estevam
  Cc: Andre Renaud, s.hauer,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	Uwe Kleine-König

From: Fabio Estevam <festevam@gmail.com> Sent: Saturday, December 21, 2019 8:03 PM
> On Sat, Dec 21, 2019 at 4:31 AM Andy Duan <fugang.duan@nxp.com> wrote:
> 
> > We should ensure the RX FIFO data are not missed since they are valid data.
> > To compatible DMA and cpu PIO mode, to receive all RX FIFO data when
> > start to send, it will involve complex code logic.
> > So I suggest to enable the flag "SER_RS485_RX_DURING_TX", and force to
> > enable the flag for imx uart RS485 driver.
> 
> Inside imx_uart_rs485_config() we have:
> 
> if (rs485conf->flags & SER_RS485_ENABLED) {
>        /* Enable receiver if low-active RTS signal is requested */
>        if (sport->have_rtscts &&  !sport->have_rtsgpio &&
>            !(rs485conf->flags & SER_RS485_RTS_ON_SEND))
>                     rs485conf->flags |= SER_RS485_RX_DURING_TX;
> 
> Maybe the if() logic needs to be changed so that the
> SER_RS485_RX_DURING_TX flag could be set in Andre's case?

I think the flag should always be set unconditionally:
	rs485conf->flags |= SER_RS485_RX_DURING_TX;

Regards,
Andy
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [EXT] Re: iMX6/UART imprecise external abort
  2019-12-23  1:53       ` Andy Duan
@ 2019-12-23 10:16         ` Uwe Kleine-König
  2020-01-07 22:24           ` Uwe Kleine-König
  0 siblings, 1 reply; 31+ messages in thread
From: Uwe Kleine-König @ 2019-12-23 10:16 UTC (permalink / raw)
  To: Andy Duan
  Cc: Andre Renaud, s.hauer, Fabio Estevam,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

Hello,

On Mon, Dec 23, 2019 at 01:53:44AM +0000, Andy Duan wrote:
> From: Fabio Estevam <festevam@gmail.com> Sent: Saturday, December 21, 2019 8:03 PM
> > On Sat, Dec 21, 2019 at 4:31 AM Andy Duan <fugang.duan@nxp.com> wrote:
> > 
> > > We should ensure the RX FIFO data are not missed since they are valid data.
> > > To compatible DMA and cpu PIO mode, to receive all RX FIFO data when
> > > start to send, it will involve complex code logic.
> > > So I suggest to enable the flag "SER_RS485_RX_DURING_TX", and force to
> > > enable the flag for imx uart RS485 driver.
> > 
> > Inside imx_uart_rs485_config() we have:
> > 
> > if (rs485conf->flags & SER_RS485_ENABLED) {
> >        /* Enable receiver if low-active RTS signal is requested */
> >        if (sport->have_rtscts &&  !sport->have_rtsgpio &&
> >            !(rs485conf->flags & SER_RS485_RTS_ON_SEND))
> >                     rs485conf->flags |= SER_RS485_RX_DURING_TX;
> > 
> > Maybe the if() logic needs to be changed so that the
> > SER_RS485_RX_DURING_TX flag could be set in Andre's case?
> 
> I think let the config always is enabled unconditionally: 
> 	rs485conf->flags |= SER_RS485_RX_DURING_TX;

I think it should be possible to fix without forcing
SER_RS485_RX_DURING_TX (which might have surprising effects for
userspace). Actually I was convinced this problem was fixed in a
different way in the imx driver already since 76821e222c18 ("serial:
imx: ensure that RX irqs are off if RX is off").

The key idea is to disable the RX irq and DMA request first and only
then disable RX. This way the RX FIFO is not guaranteed to be empty on
disable, but the characters are not read and so the exception doesn't
happen.
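
The ordering can be illustrated with a small mock, offered as a sketch only: the struct, the helper names and the MOCK_ bit positions are made up for illustration and are not taken from the driver or the reference manual. The mock records whether the receiver was ever switched off while an RX interrupt or DMA request source was still armed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions for this mock only. */
#define MOCK_UCR1_RRDYEN  (1u << 9)  /* RX data ready interrupt enable */
#define MOCK_UCR1_RXDMAEN (1u << 8)  /* RX DMA request enable */
#define MOCK_UCR2_RXEN    (1u << 1)  /* receiver enable */

struct mock_uart {
    uint32_t ucr1;
    uint32_t ucr2;
    bool rx_off_with_sources_armed;  /* set if RXEN is cleared too early */
};

/* Mock register write that flags the problematic ordering. */
static void mock_write_ucr2(struct mock_uart *u, uint32_t val)
{
    bool disabling_rx = (u->ucr2 & MOCK_UCR2_RXEN) && !(val & MOCK_UCR2_RXEN);

    if (disabling_rx && (u->ucr1 & (MOCK_UCR1_RRDYEN | MOCK_UCR1_RXDMAEN)))
        u->rx_off_with_sources_armed = true;
    u->ucr2 = val;
}

static void mock_disable_rx(struct mock_uart *u)
{
    /* Step 1: mask the RX interrupt and the RX DMA request. */
    u->ucr1 &= ~(MOCK_UCR1_RRDYEN | MOCK_UCR1_RXDMAEN);
    /* Step 2: only now switch the receiver off. */
    mock_write_ucr2(u, u->ucr2 & ~MOCK_UCR2_RXEN);
}
```

Calling mock_disable_rx() never sets rx_off_with_sources_armed, whereas clearing MOCK_UCR2_RXEN directly while MOCK_UCR1_RRDYEN is still set does.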

I'll take a deeper look after my vacation in the new year, probably
some rs485 path was missed in the fix.

Merry Christmas (for those who care),
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [EXT] Re: iMX6/UART imprecise external abort
  2019-12-23 10:16         ` Uwe Kleine-König
@ 2020-01-07 22:24           ` Uwe Kleine-König
  2020-01-08  1:43             ` Andy Duan
  2020-01-08 11:23             ` Uwe Kleine-König
  0 siblings, 2 replies; 31+ messages in thread
From: Uwe Kleine-König @ 2020-01-07 22:24 UTC (permalink / raw)
  To: Andy Duan
  Cc: Andre Renaud, s.hauer, Fabio Estevam, kernel,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

On Mon, Dec 23, 2019 at 11:16:27AM +0100, Uwe Kleine-König wrote:
> Hello,
> 
> On Mon, Dec 23, 2019 at 01:53:44AM +0000, Andy Duan wrote:
> > From: Fabio Estevam <festevam@gmail.com> Sent: Saturday, December 21, 2019 8:03 PM
> > > On Sat, Dec 21, 2019 at 4:31 AM Andy Duan <fugang.duan@nxp.com> wrote:
> > > 
> > > > We should ensure the RX FIFO data are not missed since they are valid data.
> > > > To compatible DMA and cpu PIO mode, to receive all RX FIFO data when
> > > > start to send, it will involve complex code logic.
> > > > So I suggest to enable the flag "SER_RS485_RX_DURING_TX", and force to
> > > > enable the flag for imx uart RS485 driver.
> > > 
> > > Inside imx_uart_rs485_config() we have:
> > > 
> > > if (rs485conf->flags & SER_RS485_ENABLED) {
> > >        /* Enable receiver if low-active RTS signal is requested */
> > >        if (sport->have_rtscts &&  !sport->have_rtsgpio &&
> > >            !(rs485conf->flags & SER_RS485_RTS_ON_SEND))
> > >                     rs485conf->flags |= SER_RS485_RX_DURING_TX;
> > > 
> > > Maybe the if() logic needs to be changed so that the
> > > SER_RS485_RX_DURING_TX flag could be set in Andre's case?
> > 
> > I think let the config always is enabled unconditionally: 
> > 	rs485conf->flags |= SER_RS485_RX_DURING_TX;
> 
> I think it should be possible to fix without forcing
> SER_RS485_RX_DURING_TX (which might have surprising effects for
> userspace). Actually I was convinced this problem was fixed in a
> different way in the imx driver already since 76821e222c18 ("serial:
> imx: ensure that RX irqs are off if RX is off").
> 
> The key idea is to disable the RX irq and dma request and only then
> disable RX. This way it is not given that the RX FIFO is empty on
> disable, but the characters are not read and so the exception doesn't
> happen.
> 
> I'll take a deeper look after my vacations in the new year, probably
> some rx485 path was missed in the fix.

I took a look now and found a race condition that might trigger this
problem. The following can happen (in the non-DMA case):


	imx_uart_int()
	  usr1 = imx_uart_readl(sport, USR1);
	  ...
	  ucr1 = imx_uart_readl(sport, UCR1);
	  ucr2 = imx_uart_readl(sport, UCR2);
	  ...
	  if ((ucr1 & UCR1_RRDYEN) == 0)
	    usr1 &= ~USR1_RRDY;
	  if ((ucr2 & UCR2_ATEN) == 0)
	    usr1 &= ~USR1_AGTIM;
	    						imx_uart_start_tx()
							  imx_uart_stop_rx()
							    ...
							    ucr1 &= ~UCR1_RRDYEN;
							    ucr2 &= ~(UCR2_ATEN | UCR2_RXEN)
							    imx_uart_writel(sport, ucr1, UCR1);
							    imx_uart_writel(sport, ucr2, UCR2);

	  if (usr1 & (USR1_RRDY | USR1_AGTIM)) {
	    imx_uart_rxint(irq, dev_id);
	    ...
	  }

This results in the left execution thread reading from the RX register
while RXEN is off, which triggers the fault.

Currently imx_uart_rxint() grabs the port lock itself (and
imx_uart_start_tx() also runs with it held), so the decision to call
imx_uart_rxint() is made without holding the lock.

The fix is to do the check for UCR1_RRDYEN and UCR2_ATEN (and all the
other similar checks) under the port lock.
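
The interleaving described above can be replayed deterministically on a toy register model. This is only a sketch under stated assumptions: the struct, the helper names and the bit values are invented for illustration and are not the driver's definitions, and the "fault" is just a counter standing in for the external abort.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy bits; the values are illustrative, not the real UCR/USR layout. */
#define RRDY   (1u << 0)  /* status: RX data ready */
#define RRDYEN (1u << 1)  /* control: RX-ready irq enabled */
#define RXEN   (1u << 2)  /* control: receiver enabled */

struct toy_uart {
	unsigned int status;
	unsigned int ctrl;
	int faults;           /* RX reads done while the receiver was off */
};

/* IRQ handler, first half: snapshot the registers and decide. */
static bool irq_decide(const struct toy_uart *u)
{
	unsigned int st = u->status;

	if (!(u->ctrl & RRDYEN))
		st &= ~RRDY;
	return st & RRDY;
}

/* IRQ handler, second half: read the RX data register. */
static void irq_read_rx(struct toy_uart *u)
{
	assert(u != NULL);
	if (!(u->ctrl & RXEN))
		u->faults++;  /* real hardware would abort here */
	u->status &= ~RRDY;
}

/* TX path on the other CPU: turn off the RX irq and the receiver. */
static void stop_rx(struct toy_uart *u)
{
	u->ctrl &= ~(RRDYEN | RXEN);
}

/* Racy order: the decision is taken, then stop_rx() slips in before the read. */
static int replay_racy(void)
{
	struct toy_uart u = { .status = RRDY, .ctrl = RRDYEN | RXEN, .faults = 0 };
	bool service = irq_decide(&u);

	stop_rx(&u);
	if (service)
		irq_read_rx(&u);
	return u.faults;
}

/* With decision and read in one critical section, stop_rx() can only run
 * entirely before or entirely after the pair. */
static int replay_locked(bool stop_first)
{
	struct toy_uart u = { .status = RRDY, .ctrl = RRDYEN | RXEN, .faults = 0 };

	if (stop_first)
		stop_rx(&u);
	if (irq_decide(&u))
		irq_read_rx(&u);
	if (!stop_first)
		stop_rx(&u);
	return u.faults;
}
```

In this model replay_racy() counts one faulting read, while both orderings of replay_locked() count none, which is the property the proposed locking is meant to provide.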

So assuming the problem is indeed what we are experiencing here, the
workaround by Andre (i.e. run the UART user and the UART irq on the same
cpu) is a good one.

I will look into this again tomorrow when I'm well rested and create a
patch.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |


* RE: [EXT] Re: iMX6/UART imprecise external abort
  2020-01-07 22:24           ` Uwe Kleine-König
@ 2020-01-08  1:43             ` Andy Duan
  2020-01-08  4:03               ` Andre Renaud
  2020-01-08  6:40               ` Uwe Kleine-König
  2020-01-08 11:23             ` Uwe Kleine-König
  1 sibling, 2 replies; 31+ messages in thread
From: Andy Duan @ 2020-01-08  1:43 UTC (permalink / raw)
  To: Uwe Kleine-König, Andre Renaud
  Cc: s.hauer, Fabio Estevam, kernel,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Sent: Wednesday, January 8, 2020 6:24 AM
> On Mon, Dec 23, 2019 at 11:16:27AM +0100, Uwe Kleine-König wrote:
> > Hello,
> >
> > On Mon, Dec 23, 2019 at 01:53:44AM +0000, Andy Duan wrote:
> > > From: Fabio Estevam <festevam@gmail.com> Sent: Saturday, December 21, 2019 8:03 PM
> > > > On Sat, Dec 21, 2019 at 4:31 AM Andy Duan <fugang.duan@nxp.com> wrote:
> > > >
> > > > > We should ensure the RX FIFO data are not missed since they are valid data.
> > > > > To compatible DMA and cpu PIO mode, to receive all RX FIFO data
> > > > > when start to send, it will involve complex code logic.
> > > > > So I suggest to enable the flag "SER_RS485_RX_DURING_TX", and
> > > > > force to enable the flag for imx uart RS485 driver.
> > > >
> > > > Inside imx_uart_rs485_config() we have:
> > > >
> > > > if (rs485conf->flags & SER_RS485_ENABLED) {
> > > >        /* Enable receiver if low-active RTS signal is requested */
> > > >        if (sport->have_rtscts &&  !sport->have_rtsgpio &&
> > > >            !(rs485conf->flags & SER_RS485_RTS_ON_SEND))
> > > >                     rs485conf->flags |= SER_RS485_RX_DURING_TX;
> > > >
> > > > Maybe the if() logic needs to be changed so that the
> > > > SER_RS485_RX_DURING_TX flag could be set in Andre's case?
> > >
> > > I think let the config always is enabled unconditionally:
> > >     rs485conf->flags |= SER_RS485_RX_DURING_TX;
> >
> > I think it should be possible to fix without forcing
> > SER_RS485_RX_DURING_TX (which might have surprising effects for
> > userspace). Actually I was convinced this problem was fixed in a
> > different way in the imx driver already since 76821e222c18 ("serial:
> > imx: ensure that RX irqs are off if RX is off").
> >
> > The key idea is to disable the RX irq and dma request and only then
> > disable RX. This way it is not given that the RX FIFO is empty on
> > disable, but the characters are not read and so the exception doesn't
> > happen.
> >
> > I'll take a deeper look after my vacations in the new year, probably
> > some rx485 path was missed in the fix.
> 
> I took a look now and found a race condition that might trigger this problem.
> The following can happen (in the non-DMA case):
> 
> 
>         imx_uart_int()
>           usr1 = imx_uart_readl(sport, USR1);
>           ...
>           ucr1 = imx_uart_readl(sport, UCR1);
>           ucr2 = imx_uart_readl(sport, UCR2);
>           ...
>           if ((ucr1 & UCR1_RRDYEN) == 0)
>             usr1 &= ~USR1_RRDY;
>           if ((ucr2 & UCR2_ATEN) == 0)
>             usr1 &= ~USR1_AGTIM;
> 
>                                                 imx_uart_start_tx()
>                                                   imx_uart_stop_rx()
>                                                     ...
>                                                     ucr1 &= ~UCR1_RRDYEN;
>                                                     ucr2 &= ~(UCR2_ATEN | UCR2_RXEN)
>                                                     imx_uart_writel(sport, ucr1, UCR1);
>                                                     imx_uart_writel(sport, ucr2, UCR2);
> 
>           if (usr1 & (USR1_RRDY | USR1_AGTIM)) {
>             imx_uart_rxint(irq, dev_id);
>             ...
>           }
> 
> Which results in the left execution thread to read from the RX register while
> RXEN is off and so trigger the fault.
> 
> Currently imx_uart_rxint() grabs the port lock (and imx_uart_start_tx() also
> holds it), and so the decision to call imx_uart_rxint() is done without holding
> the lock.
> 
> The fix is to do the check for UCR1_RRDYEN and UCR2_ATEN (and all the
> other similar checks) under the port lock.

Add RXEN check before accessing RX register.

if ((usr1 & (USR1_RRDY | USR1_AGTIM)) &&
   ucr2 & UCR2_RXEN) {
	imx_uart_rxint(irq, dev_id);
	...
}

> 
> So assuming the problem is indeed what we are experiencing here, the
> workaround by Andre (i.e. run the UART user and the UART irq on the same
> cpu) is a good one.

@Andre Renaud, can you add "nosmp" to the kernel command line to check the issue?
I suppose it cannot be reproduced with a single core.

> 
> I will look into this again tomorrow when I'm well rested and create a patch.
> 
> Best regards
> Uwe
> 
> --
> Pengutronix e.K.                           | Uwe Kleine-König            |
> Industrial Linux Solutions                 | https://www.pengutronix.de/ |


* Re: [EXT] Re: iMX6/UART imprecise external abort
  2020-01-08  1:43             ` Andy Duan
@ 2020-01-08  4:03               ` Andre Renaud
  2020-01-08  5:11                 ` Andy Duan
  2020-01-08  6:43                 ` Uwe Kleine-König
  2020-01-08  6:40               ` Uwe Kleine-König
  1 sibling, 2 replies; 31+ messages in thread
From: Andre Renaud @ 2020-01-08  4:03 UTC (permalink / raw)
  To: Andy Duan
  Cc: s.hauer, Fabio Estevam, kernel,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	Uwe Kleine-König

On Wed, Jan 8, 2020 at 2:43 PM Andy Duan <fugang.duan@nxp.com> wrote:
> @Andre Renaud,  can you add kernel command line "nosmp" to check the issue ?
> Suppose one core cannot reproduce the issue.

We have done this test previously, and 'nosmp' does resolve the issue.

Regards,
Andre


* RE: [EXT] Re: iMX6/UART imprecise external abort
  2020-01-08  4:03               ` Andre Renaud
@ 2020-01-08  5:11                 ` Andy Duan
  2020-01-08  6:43                 ` Uwe Kleine-König
  1 sibling, 0 replies; 31+ messages in thread
From: Andy Duan @ 2020-01-08  5:11 UTC (permalink / raw)
  To: Andre Renaud
  Cc: s.hauer, Fabio Estevam, kernel,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	Uwe Kleine-König

From: Andre Renaud <arenaud@designa-electronics.com> Sent: Wednesday, January 8, 2020 12:03 PM
> On Wed, Jan 8, 2020 at 2:43 PM Andy Duan <fugang.duan@nxp.com> wrote:
> > @Andre Renaud,  can you add kernel command line "nosmp" to check the issue ?
> > Suppose one core cannot reproduce the issue.
> 
> We have done this test previously, and 'nosmp' does resolve the issue.
> 
> Regards,
> Andre

So it is more likely the cause explained by Uwe.

Regards,
Andy

* Re: [EXT] Re: iMX6/UART imprecise external abort
  2020-01-08  1:43             ` Andy Duan
  2020-01-08  4:03               ` Andre Renaud
@ 2020-01-08  6:40               ` Uwe Kleine-König
  1 sibling, 0 replies; 31+ messages in thread
From: Uwe Kleine-König @ 2020-01-08  6:40 UTC (permalink / raw)
  To: Andy Duan
  Cc: Andre Renaud, s.hauer, Fabio Estevam,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE, kernel

Hello,

On Wed, Jan 08, 2020 at 01:43:12AM +0000, Andy Duan wrote:
> From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Sent: Wednesday, January 8, 2020 6:24 AM
> > The fix is to do the check for UCR1_RRDYEN and UCR2_ATEN (and all the
> > other similar checks) under the port lock.
> 
> Add RXEN check before accessing RX register.
> 
> if ((usr1 & (USR1_RRDY | USR1_AGTIM)) &&
>    ucr2 & UCR2_RXEN) {
> 	imx_uart_rxint(irq, dev_id);
> 	...
> }

FTR: This doesn't fix the issue as RXEN might be disabled after the
check and before imx_uart_rxint() grabs the lock.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |


* Re: [EXT] Re: iMX6/UART imprecise external abort
  2020-01-08  4:03               ` Andre Renaud
  2020-01-08  5:11                 ` Andy Duan
@ 2020-01-08  6:43                 ` Uwe Kleine-König
  1 sibling, 0 replies; 31+ messages in thread
From: Uwe Kleine-König @ 2020-01-08  6:43 UTC (permalink / raw)
  To: Andre Renaud
  Cc: Fabio Estevam, s.hauer, Andy Duan,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE, kernel

Hello,

On Wed, Jan 08, 2020 at 05:03:18PM +1300, Andre Renaud wrote:
> On Wed, Jan 8, 2020 at 2:43 PM Andy Duan <fugang.duan@nxp.com> wrote:
> > @Andre Renaud,  can you add kernel command line "nosmp" to check the issue ?
> > Suppose one core cannot reproduce the issue.
> 
> We have done this test previously, and 'nosmp' does resolve the issue.

That is not surprising, as the problem already goes away when (only) the
irq handler and the UART user are made to run on the same CPU. Limiting
the whole system to a single CPU is just another way to ensure this.
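
For readers who want to try the narrower workaround instead of "nosmp", the sketch below pins the calling thread and an interrupt to one CPU. It assumes Linux with glibc; the helper names are hypothetical, the IRQ number must be looked up in /proc/interrupts, and writing to /proc/irq/<n>/smp_affinity normally needs root. It only narrows the setup to the same-CPU case discussed here and is not a fix for the race.

```c
#define _GNU_SOURCE         /* for cpu_set_t, CPU_* and sched_setaffinity() */
#include <assert.h>
#include <sched.h>
#include <stdio.h>

/* Return the lowest CPU the calling thread may run on, or -1 on error. */
static int first_allowed_cpu(void)
{
	cpu_set_t set;
	int cpu;

	if (sched_getaffinity(0, sizeof(set), &set) != 0)
		return -1;
	for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
		if (CPU_ISSET(cpu, &set))
			return cpu;
	return -1;
}

/* Pin the calling thread (the UART user) to a single CPU. */
static int pin_self_to_cpu(int cpu)
{
	cpu_set_t set;

	assert(cpu >= 0 && cpu < CPU_SETSIZE);
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	return sched_setaffinity(0, sizeof(set), &set);
}

/* Route an interrupt to a single CPU by writing a hex mask to procfs. */
static int pin_irq_to_cpu(int irq, int cpu)
{
	char path[64];
	FILE *f;
	int ok;

	assert(cpu >= 0 && cpu < 32);
	snprintf(path, sizeof(path), "/proc/irq/%d/smp_affinity", irq);
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fprintf(f, "%x\n", 1u << cpu) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Calling pin_self_to_cpu() and pin_irq_to_cpu() with the same CPU number gives the same-CPU arrangement while leaving the other cores available to the rest of the system.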

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |


* Re: [EXT] Re: iMX6/UART imprecise external abort
  2020-01-07 22:24           ` Uwe Kleine-König
  2020-01-08  1:43             ` Andy Duan
@ 2020-01-08 11:23             ` Uwe Kleine-König
  2020-01-09  1:33               ` Andy Duan
  1 sibling, 1 reply; 31+ messages in thread
From: Uwe Kleine-König @ 2020-01-08 11:23 UTC (permalink / raw)
  To: Andy Duan
  Cc: Andre Renaud, s.hauer, Fabio Estevam,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE, kernel

Hello,

On Tue, Jan 07, 2020 at 11:24:06PM +0100, Uwe Kleine-König wrote:
> I will look into this again tomorrow when I'm well rested and create a
> patch.

Here you go, for now without proper commit log etc.pp.

Please test if this fixes your problems.

I currently don't have the setup to trigger this bug, but normal console
usage still works for me.

Best regards
Uwe

-------->8--------
From 025a72c6de6df8b71414378a0297568df371bd73 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Uwe=20Kleine-K=C3=B6nig?= <u.kleine-koenig@pengutronix.de>
Date: Wed, 8 Jan 2020 09:47:20 +0100
Subject: [PATCH RFT] serial: imx: fix a race condition in receive path

---
 drivers/tty/serial/imx.c | 52 ++++++++++++++++++++++++++++++----------
 1 file changed, 39 insertions(+), 13 deletions(-)

diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index a9e20e6c63ad..679b2de27c4d 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -700,22 +700,33 @@ static void imx_uart_start_tx(struct uart_port *port)
 	}
 }
 
-static irqreturn_t imx_uart_rtsint(int irq, void *dev_id)
+static irqreturn_t __imx_uart_rtsint(int irq, void *dev_id)
 {
 	struct imx_port *sport = dev_id;
 	u32 usr1;
 
-	spin_lock(&sport->port.lock);
-
 	imx_uart_writel(sport, USR1_RTSD, USR1);
 	usr1 = imx_uart_readl(sport, USR1) & USR1_RTSS;
 	uart_handle_cts_change(&sport->port, !!usr1);
 	wake_up_interruptible(&sport->port.state->port.delta_msr_wait);
 
-	spin_unlock(&sport->port.lock);
 	return IRQ_HANDLED;
 }
 
+static irqreturn_t imx_uart_rtsint(int irq, void *dev_id)
+{
+	struct imx_port *sport = dev_id;
+	irqreturn_t ret;
+
+	spin_lock(&sport->port.lock);
+
+	ret = __imx_uart_rtsint(irq, dev_id);
+
+	spin_unlock(&sport->port.lock);
+
+	return ret;
+}
+
 static irqreturn_t imx_uart_txint(int irq, void *dev_id)
 {
 	struct imx_port *sport = dev_id;
@@ -726,14 +737,12 @@ static irqreturn_t imx_uart_txint(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
-static irqreturn_t imx_uart_rxint(int irq, void *dev_id)
+static irqreturn_t __imx_uart_rxint(int irq, void *dev_id)
 {
 	struct imx_port *sport = dev_id;
 	unsigned int rx, flg, ignored = 0;
 	struct tty_port *port = &sport->port.state->port;
 
-	spin_lock(&sport->port.lock);
-
 	while (imx_uart_readl(sport, USR2) & USR2_RDR) {
 		u32 usr2;
 
@@ -792,11 +801,26 @@ static irqreturn_t imx_uart_rxint(int irq, void *dev_id)
 	}
 
 out:
-	spin_unlock(&sport->port.lock);
 	tty_flip_buffer_push(port);
+
 	return IRQ_HANDLED;
 }
 
+static irqreturn_t imx_uart_rxint(int irq, void *dev_id)
+{
+	struct imx_port *sport = dev_id;
+	struct tty_port *port = &sport->port.state->port;
+	irqreturn_t ret;
+
+	spin_lock(&sport->port.lock);
+
+	ret = __imx_uart_rxint(irq, dev_id);
+
+	spin_unlock(&sport->port.lock);
+
+	return ret;
+}
+
 static void imx_uart_clear_rx_errors(struct imx_port *sport);
 
 /*
@@ -855,6 +879,8 @@ static irqreturn_t imx_uart_int(int irq, void *dev_id)
 	unsigned int usr1, usr2, ucr1, ucr2, ucr3, ucr4;
 	irqreturn_t ret = IRQ_NONE;
 
+	spin_lock(&sport->port.lock);
+
 	usr1 = imx_uart_readl(sport, USR1);
 	usr2 = imx_uart_readl(sport, USR2);
 	ucr1 = imx_uart_readl(sport, UCR1);
@@ -888,27 +914,25 @@ static irqreturn_t imx_uart_int(int irq, void *dev_id)
 		usr2 &= ~USR2_ORE;
 
 	if (usr1 & (USR1_RRDY | USR1_AGTIM)) {
-		imx_uart_rxint(irq, dev_id);
+		__imx_uart_rxint(irq, dev_id);
 		ret = IRQ_HANDLED;
 	}
 
 	if ((usr1 & USR1_TRDY) || (usr2 & USR2_TXDC)) {
-		imx_uart_txint(irq, dev_id);
+		imx_uart_transmit_buffer(sport);
 		ret = IRQ_HANDLED;
 	}
 
 	if (usr1 & USR1_DTRD) {
 		imx_uart_writel(sport, USR1_DTRD, USR1);
 
-		spin_lock(&sport->port.lock);
 		imx_uart_mctrl_check(sport);
-		spin_unlock(&sport->port.lock);
 
 		ret = IRQ_HANDLED;
 	}
 
 	if (usr1 & USR1_RTSD) {
-		imx_uart_rtsint(irq, dev_id);
+		__imx_uart_rtsint(irq, dev_id);
 		ret = IRQ_HANDLED;
 	}
 
@@ -923,6 +947,8 @@ static irqreturn_t imx_uart_int(int irq, void *dev_id)
 		ret = IRQ_HANDLED;
 	}
 
+	spin_unlock(&sport->port.lock);
+
 	return ret;
 }
 
-- 
2.24.0



-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |


* RE: [EXT] Re: iMX6/UART imprecise external abort
  2020-01-08 11:23             ` Uwe Kleine-König
@ 2020-01-09  1:33               ` Andy Duan
  2020-01-19 23:29                 ` Andre Renaud
  0 siblings, 1 reply; 31+ messages in thread
From: Andy Duan @ 2020-01-09  1:33 UTC (permalink / raw)
  To: Uwe Kleine-König, Andre Renaud
  Cc: s.hauer, Fabio Estevam,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE, kernel

From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Sent: Wednesday, January 8, 2020 7:23 PM
> Hello,
> 
> On Tue, Jan 07, 2020 at 11:24:06PM +0100, Uwe Kleine-König wrote:
> > I will look into this again tomorrow when I'm well rested and create a
> > patch.
> 
> Here you go, for now without proper commit log etc.pp.
> 
> Please test if this fixes your problems.
> 
> I currently don't have the setup to trigger this bug, but normal console usage
> still works for me.

I also don't have an environment to reproduce the issue.
@Andre Renaud, can you try Uwe's patch?


Regards,
Andy
> 
> Best regards
> Uwe
> 
> -------->8--------
> From 025a72c6de6df8b71414378a0297568df371bd73 Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Uwe=20Kleine-K=C3=B6nig?= <u.kleine-koenig@pengutronix.de>
> Date: Wed, 8 Jan 2020 09:47:20 +0100
> Subject: [PATCH RFT] serial: imx: fix a race condition in receive path
> 
> ---
>  drivers/tty/serial/imx.c | 52 ++++++++++++++++++++++++++++++----------
>  1 file changed, 39 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
> index a9e20e6c63ad..679b2de27c4d 100644
> --- a/drivers/tty/serial/imx.c
> +++ b/drivers/tty/serial/imx.c
> @@ -700,22 +700,33 @@ static void imx_uart_start_tx(struct uart_port *port)
>         }
>  }
> 
> -static irqreturn_t imx_uart_rtsint(int irq, void *dev_id)
> +static irqreturn_t __imx_uart_rtsint(int irq, void *dev_id)
>  {
>         struct imx_port *sport = dev_id;
>         u32 usr1;
> 
> -       spin_lock(&sport->port.lock);
> -
>         imx_uart_writel(sport, USR1_RTSD, USR1);
>         usr1 = imx_uart_readl(sport, USR1) & USR1_RTSS;
>         uart_handle_cts_change(&sport->port, !!usr1);
>         wake_up_interruptible(&sport->port.state->port.delta_msr_wait);
> 
> -       spin_unlock(&sport->port.lock);
>         return IRQ_HANDLED;
>  }
> 
> +static irqreturn_t imx_uart_rtsint(int irq, void *dev_id)
> +{
> +       struct imx_port *sport = dev_id;
> +       irqreturn_t ret;
> +
> +       spin_lock(&sport->port.lock);
> +
> +       ret = __imx_uart_rtsint(irq, dev_id);
> +
> +       spin_unlock(&sport->port.lock);
> +
> +       return ret;
> +}
> +
>  static irqreturn_t imx_uart_txint(int irq, void *dev_id)
>  {
>         struct imx_port *sport = dev_id;
> @@ -726,14 +737,12 @@ static irqreturn_t imx_uart_txint(int irq, void *dev_id)
>         return IRQ_HANDLED;
>  }
> 
> -static irqreturn_t imx_uart_rxint(int irq, void *dev_id)
> +static irqreturn_t __imx_uart_rxint(int irq, void *dev_id)
>  {
>         struct imx_port *sport = dev_id;
>         unsigned int rx, flg, ignored = 0;
>         struct tty_port *port = &sport->port.state->port;
> 
> -       spin_lock(&sport->port.lock);
> -
>         while (imx_uart_readl(sport, USR2) & USR2_RDR) {
>                 u32 usr2;
> 
> @@ -792,11 +801,26 @@ static irqreturn_t imx_uart_rxint(int irq, void *dev_id)
>         }
> 
>  out:
> -       spin_unlock(&sport->port.lock);
>         tty_flip_buffer_push(port);
> +
>         return IRQ_HANDLED;
>  }
> 
> +static irqreturn_t imx_uart_rxint(int irq, void *dev_id)
> +{
> +       struct imx_port *sport = dev_id;
> +       struct tty_port *port = &sport->port.state->port;
> +       irqreturn_t ret;
> +
> +       spin_lock(&sport->port.lock);
> +
> +       ret = __imx_uart_rxint(irq, dev_id);
> +
> +       spin_unlock(&sport->port.lock);
> +
> +       return ret;
> +}
> +
>  static void imx_uart_clear_rx_errors(struct imx_port *sport);
> 
>  /*
> @@ -855,6 +879,8 @@ static irqreturn_t imx_uart_int(int irq, void *dev_id)
>         unsigned int usr1, usr2, ucr1, ucr2, ucr3, ucr4;
>         irqreturn_t ret = IRQ_NONE;
> 
> +       spin_lock(&sport->port.lock);
> +
>         usr1 = imx_uart_readl(sport, USR1);
>         usr2 = imx_uart_readl(sport, USR2);
>         ucr1 = imx_uart_readl(sport, UCR1);
> @@ -888,27 +914,25 @@ static irqreturn_t imx_uart_int(int irq, void *dev_id)
>                 usr2 &= ~USR2_ORE;
> 
>         if (usr1 & (USR1_RRDY | USR1_AGTIM)) {
> -               imx_uart_rxint(irq, dev_id);
> +               __imx_uart_rxint(irq, dev_id);
>                 ret = IRQ_HANDLED;
>         }
> 
>         if ((usr1 & USR1_TRDY) || (usr2 & USR2_TXDC)) {
> -               imx_uart_txint(irq, dev_id);
> +               imx_uart_transmit_buffer(sport);
>                 ret = IRQ_HANDLED;
>         }
> 
>         if (usr1 & USR1_DTRD) {
>                 imx_uart_writel(sport, USR1_DTRD, USR1);
> 
> -               spin_lock(&sport->port.lock);
>                 imx_uart_mctrl_check(sport);
> -               spin_unlock(&sport->port.lock);
> 
>                 ret = IRQ_HANDLED;
>         }
> 
>         if (usr1 & USR1_RTSD) {
> -               imx_uart_rtsint(irq, dev_id);
> +               __imx_uart_rtsint(irq, dev_id);
>                 ret = IRQ_HANDLED;
>         }
> 
> @@ -923,6 +947,8 @@ static irqreturn_t imx_uart_int(int irq, void *dev_id)
>                 ret = IRQ_HANDLED;
>         }
> 
> +       spin_unlock(&sport->port.lock);
> +
>         return ret;
>  }
> 
> --
> 2.24.0
> 
> 
> 
> --
> Pengutronix e.K.                           | Uwe Kleine-König            |
> Industrial Linux Solutions                 | https://www.pengutronix.de/ |


* Re: [EXT] Re: iMX6/UART imprecise external abort
  2020-01-09  1:33               ` Andy Duan
@ 2020-01-19 23:29                 ` Andre Renaud
  0 siblings, 0 replies; 31+ messages in thread
From: Andre Renaud @ 2020-01-19 23:29 UTC (permalink / raw)
  To: Andy Duan
  Cc: s.hauer, Fabio Estevam,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE, kernel,
	Uwe Kleine-König

On Thu, Jan 9, 2020 at 2:34 PM Andy Duan <fugang.duan@nxp.com> wrote:
>
> From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Sent: Wednesday, January 8, 2020 7:23 PM
> > Hello,
> >
> > On Tue, Jan 07, 2020 at 11:24:06PM +0100, Uwe Kleine-König wrote:
> > > I will look into this again tomorrow when I'm well rested and create a
> > > patch.
> >
> > Here you go, for now without proper commit log etc.pp.
> >
> > Please test if this fixes your problems.
> >
> > I currently don't have the setup to trigger this bug, but normal console usage
> > still works for me.
>
> I also have not the environment to reproduce the issue.
> @Andre Renaud, can you try Uwe's patch ?

I can confirm that this appears to resolve our issue. It would be
great if this patch could make its way into mainline. I'm not sure if
it results in any performance issues (it doesn't appear to for us).

Thanks,
Andre


end of thread, other threads:[~2020-01-19 23:29 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-02 20:40 iMX6/UART imprecise external abort Andre Renaud
2019-12-02 20:56 ` Fabio Estevam
2019-12-03  1:53   ` [EXT] " Andy Duan
2019-12-03  2:01     ` Andre Renaud
2019-12-03  2:06       ` Andy Duan
2019-12-03  2:28       ` Fabio Estevam
2019-12-03  6:24   ` Andre Renaud
2019-12-10  4:03   ` Andre Renaud
2019-12-10  5:46     ` [EXT] " Andy Duan
2019-12-10 22:07       ` Andre Renaud
2019-12-11  0:27         ` Fabio Estevam
2019-12-11  1:35           ` [EXT] " Andre Renaud
2019-12-11  1:59             ` Andre Renaud
2019-12-12  1:35             ` Fabio Estevam
2019-12-15 23:41               ` Andre Renaud
2019-12-02 21:29 ` Uwe Kleine-König
2019-12-05 19:29   ` Andre Renaud
2019-12-21  3:33 ` Andre Renaud
2019-12-21  7:31   ` [EXT] " Andy Duan
2019-12-21 12:03     ` Fabio Estevam
2019-12-23  1:53       ` Andy Duan
2019-12-23 10:16         ` Uwe Kleine-König
2020-01-07 22:24           ` Uwe Kleine-König
2020-01-08  1:43             ` Andy Duan
2020-01-08  4:03               ` Andre Renaud
2020-01-08  5:11                 ` Andy Duan
2020-01-08  6:43                 ` Uwe Kleine-König
2020-01-08  6:40               ` Uwe Kleine-König
2020-01-08 11:23             ` Uwe Kleine-König
2020-01-09  1:33               ` Andy Duan
2020-01-19 23:29                 ` Andre Renaud
