* Regression: serial: imx: overrun errors on debug UART
@ 2023-03-24  8:57 ` Stefan Wahren
  0 siblings, 0 replies; 90+ messages in thread
From: Stefan Wahren @ 2023-03-24  8:57 UTC (permalink / raw)
  To: Tomasz Moń, Greg Kroah-Hartman, Jiri Slaby
  Cc: Uwe Kleine-König, Sergey Organov, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, NXP Linux Team,
	linux-serial, Linux ARM, Shawn Guo, Stefan Wahren

Hi,

after switching to Linux 6.1.21 on our Tarragon board (i.MX6ULL SoC), we 
experience the following issues with the debug UART (115200 baud, 8N1, 
no hardware flow control):

- overrun errors if we paste in multiple text lines while the system is idle
- no reaction to single key strokes while the system is under higher load

After reverting 7a637784d517 ("serial: imx: reduce RX interrupt
frequency") the issues disappear.

Maybe it's worth mentioning that the Tarragon board uses two additional
application UARTs with similar baud rates (9600 - 115200 baud, no
hardware flow control) for RS485 communication, but there are no overrun
errors (with and without the mentioned change).

Best regards


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 90+ messages in thread
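
For a rough sense of the timing involved in the report above, the following
sketch estimates how long the CPU has to drain the RX FIFO after the RX
interrupt fires before an overrun becomes possible. It is only an estimate
under stated assumptions, none of which come from the report itself: a
32-entry RX FIFO, 10 bit times per 8N1 character (start + 8 data + stop),
and no allowance for the receive shift register or the aging timer. The
function names are hypothetical and are not part of the imx driver.

```c
#include <assert.h>

/* Assumption: RX FIFO depth in characters (not stated in the report). */
#define RX_FIFO_DEPTH 32U
/* 8N1 framing: 1 start bit + 8 data bits + 1 stop bit. */
#define BITS_PER_CHAR_8N1 10ULL

/* Time one character occupies on the wire, in nanoseconds (rounded down). */
static unsigned long long char_time_ns(unsigned int baud)
{
	assert(baud > 0);
	return BITS_PER_CHAR_8N1 * 1000000000ULL / baud;
}

/*
 * Approximate time between the RX interrupt firing at 'rxtl' buffered
 * characters and the FIFO being completely full, in nanoseconds.
 */
static unsigned long long overrun_budget_ns(unsigned int baud, unsigned int rxtl)
{
	assert(rxtl <= RX_FIFO_DEPTH);
	return (RX_FIFO_DEPTH - rxtl) * char_time_ns(baud);
}
```

Under these assumptions one character at 115200 baud takes about 86.8 us.
overrun_budget_ns(115200, 8) comes to about 2.08 ms and
overrun_budget_ns(115200, 1) to about 2.69 ms, so the trigger level alone
changes the interrupt latency budget by well under a millisecond.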

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-24  8:57 ` Stefan Wahren
@ 2023-03-24 10:12   ` Linux regression tracking #adding (Thorsten Leemhuis)
  -1 siblings, 0 replies; 90+ messages in thread
From: Linux regression tracking #adding (Thorsten Leemhuis) @ 2023-03-24 10:12 UTC (permalink / raw)
  To: Stefan Wahren, Tomasz Moń, Greg Kroah-Hartman, Jiri Slaby
  Cc: Uwe Kleine-König, Sergey Organov, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, NXP Linux Team,
	linux-serial, Linux ARM, Shawn Guo, Stefan Wahren

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 24.03.23 09:57, Stefan Wahren wrote:
> Hi,
> 
> after switching to Linux 6.1.21 on our Tarragon board (i.MX6ULL SoC), we
> experience the following issues with the debug UART (115200 baud, 8N1,
> no hardware flow control):
> 
> - overrun errors if we paste in multiple text lines while the system is idle
> - no reaction to single key strokes while the system is under higher load
> 
> After reverting 7a637784d517 ("serial: imx: reduce RX interrupt
> frequency") the issues disappear.
> 
> Maybe it's worth mentioning that the Tarragon board uses two additional
> application UARTs with similar baud rates (9600 - 115200 baud, no
> hardware flow control) for RS485 communication, but there are no overrun
> errors (with and without the mentioned change).

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced 7a637784d517
#regzbot title serial: imx: overrun errors on debug UART
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-24  8:57 ` Stefan Wahren
@ 2023-03-24 11:47   ` Ilpo Järvinen
  -1 siblings, 0 replies; 90+ messages in thread
From: Ilpo Järvinen @ 2023-03-24 11:47 UTC (permalink / raw)
  To: Stefan Wahren
  Cc: Tomasz Moń,
	Greg Kroah-Hartman, Jiri Slaby, Uwe Kleine-König,
	Sergey Organov, Sascha Hauer, Pengutronix Kernel Team,
	Fabio Estevam, NXP Linux Team, linux-serial, Linux ARM,
	Shawn Guo, Stefan Wahren

On Fri, 24 Mar 2023, Stefan Wahren wrote:

> Hi,
> 
> after switching to Linux 6.1.21 on our Tarragon board (i.MX6ULL SoC), we
> experience the following issues with the debug UART (115200 baud, 8N1, no
> hardware flow control):
> 
> - overrun errors if we paste in multiple text lines while the system is idle
> - no reaction to single key strokes while the system is under higher load
> 
> After reverting 7a637784d517 ("serial: imx: reduce RX interrupt frequency")
> the issues disappear.
> 
> Maybe it's worth mentioning that the Tarragon board uses two additional
> application UARTs with similar baud rates (9600 - 115200 baud, no hardware
> flow control) for RS485 communication, but there are no overrun errors (with
> and without the mentioned change).

This has come up earlier, see e.g.:

https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/

My somewhat uninformed suggestion: if the overrun problems mostly show up 
with console ports, maybe the trigger level could depend on the port 
being a console or not?


-- 
 i.


^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-24 11:47   ` Ilpo Järvinen
@ 2023-03-24 12:26     ` Francesco Dolcini
  -1 siblings, 0 replies; 90+ messages in thread
From: Francesco Dolcini @ 2023-03-24 12:26 UTC (permalink / raw)
  To: Ilpo Järvinen, Stefan Wahren
  Cc: Tomasz Moń,
	Greg Kroah-Hartman, Jiri Slaby, Uwe Kleine-König,
	Sergey Organov, Sascha Hauer, Pengutronix Kernel Team,
	Fabio Estevam, NXP Linux Team, linux-serial, Linux ARM,
	Shawn Guo, Stefan Wahren

Hello

On Fri, Mar 24, 2023 at 01:47:59PM +0200, Ilpo Järvinen wrote:
> On Fri, 24 Mar 2023, Stefan Wahren wrote:
> > after switching to Linux 6.1.21 on our Tarragon board (i.MX6ULL SoC), we
> > experience the following issues with the debug UART (115200 baud, 8N1, no
> > hardware flow control):
> > 
> > - overrun errors if we paste in multiple text lines while the system is idle
> > - no reaction to single key strokes while the system is under higher load
> > 
> > After reverting 7a637784d517 ("serial: imx: reduce RX interrupt frequency")
> > the issues disappear.
> > 
> > Maybe it's worth mentioning that the Tarragon board uses two additional
> > application UARTs with similar baud rates (9600 - 115200 baud, no hardware
> > flow control) for RS485 communication, but there are no overrun errors (with
> > and without the mentioned change).
> 
> This has come up earlier, see e.g.:
> 
> https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/

Yep, it looks like exactly the same issue.

We did not verify whether this affects other UARTs. However, isn't RS485
half-duplex? This is very likely a difference compared to the RS232
console port.

I am also not really convinced this is a proper regression. While 7a637784d517
clearly makes the situation _worse_, we had some issues even before.
Unfortunately I do not have many more details available.

Francesco


^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-24 12:26     ` Francesco Dolcini
@ 2023-03-24 12:35       ` Ilpo Järvinen
  -1 siblings, 0 replies; 90+ messages in thread
From: Ilpo Järvinen @ 2023-03-24 12:35 UTC (permalink / raw)
  To: Francesco Dolcini
  Cc: Stefan Wahren, Tomasz Moń,
	Greg Kroah-Hartman, Jiri Slaby, Uwe Kleine-König,
	Sergey Organov, Sascha Hauer, Pengutronix Kernel Team,
	Fabio Estevam, NXP Linux Team, linux-serial, Linux ARM,
	Shawn Guo, Stefan Wahren

On Fri, 24 Mar 2023, Francesco Dolcini wrote:

> Hello
> 
> On Fri, Mar 24, 2023 at 01:47:59PM +0200, Ilpo Järvinen wrote:
> > On Fri, 24 Mar 2023, Stefan Wahren wrote:
> > > after switching to Linux 6.1.21 on our Tarragon board (i.MX6ULL SoC), we
> > > experience the following issues with the debug UART (115200 baud, 8N1, no
> > > hardware flow control):
> > > 
> > > - overrun errors if we paste in multiple text lines while the system is idle
> > > - no reaction to single key strokes while the system is under higher load
> > > 
> > > After reverting 7a637784d517 ("serial: imx: reduce RX interrupt frequency")
> > > the issues disappear.
> > > 
> > > Maybe it's worth mentioning that the Tarragon board uses two additional
> > > application UARTs with similar baud rates (9600 - 115200 baud, no hardware
> > > flow control) for RS485 communication, but there are no overrun errors (with
> > > and without the mentioned change).
> > 
> > This has come up earlier, see e.g.:
> > 
> > https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/
> 
> Yep, it looks like exactly the same issue.
> 
> We did not verify whether this affects other UARTs. However, isn't RS485
> half-duplex?

While half-duplex is by far more likely due to its simplicity, RS485 could
also be full-duplex. It seems the imx driver supports both modes.

-- 
 i.

> This is very likely a difference compared to the RS232
> console port.
> 
> I am also not really convinced this is a proper regression. While 7a637784d517
> clearly makes the situation _worse_, we had some issues even before.
> Unfortunately I do not have many more details available.


^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-24 12:35       ` Ilpo Järvinen
@ 2023-03-24 12:49         ` Stefan Wahren
  -1 siblings, 0 replies; 90+ messages in thread
From: Stefan Wahren @ 2023-03-24 12:49 UTC (permalink / raw)
  To: Ilpo Järvinen, Francesco Dolcini
  Cc: Tomasz Moń,
	Greg Kroah-Hartman, Jiri Slaby, Uwe Kleine-König,
	Sergey Organov, Sascha Hauer, Pengutronix Kernel Team,
	Fabio Estevam, NXP Linux Team, linux-serial, Linux ARM,
	Shawn Guo, Stefan Wahren

Am 24.03.23 um 13:35 schrieb Ilpo Järvinen:
> On Fri, 24 Mar 2023, Francesco Dolcini wrote:
>
>> Hello
>>
>> On Fri, Mar 24, 2023 at 01:47:59PM +0200, Ilpo Järvinen wrote:
>>> On Fri, 24 Mar 2023, Stefan Wahren wrote:
>>>> after switching to Linux 6.1.21 on our Tarragon board (i.MX6ULL SoC), we
>>>> experience the following issues with the debug UART (115200 baud, 8N1, no
>>>> hardware flow control):
>>>>
>>>> - overrun errors if we paste in multiple text lines while the system is idle
>>>> - no reaction to single key strokes while the system is under higher load
>>>>
>>>> After reverting 7a637784d517 ("serial: imx: reduce RX interrupt frequency")
>>>> the issues disappear.
>>>>
>>>> Maybe it's worth mentioning that the Tarragon board uses two additional
>>>> application UARTs with similar baud rates (9600 - 115200 baud, no hardware
>>>> flow control) for RS485 communication, but there are no overrun errors (with
>>>> and without the mentioned change).
>>> This has come up earlier, see e.g.:
>>>
>>> https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/
>> Yep, it looks like exactly the same issue.
>>
>> We did not verify whether this affects other UARTs. However, isn't RS485
>> half-duplex?
> While half-duplex is by far more likely due to its simplicity, RS485 could
> also be full-duplex. It seems the imx driver supports both modes.

The RS485 on Tarragon is half-duplex, but this is implemented in
external hardware. So from the Linux / driver point of view it's RS232.

To us the current behavior (overrun errors and no reaction under load)
is not acceptable. I agree that increasing the RX threshold isn't the
real issue. But I needed a starting point for a discussion.

So any ideas how to investigate this further are welcome.

>

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-24 11:47   ` Ilpo Järvinen
@ 2023-03-24 12:57     ` Fabio Estevam
  -1 siblings, 0 replies; 90+ messages in thread
From: Fabio Estevam @ 2023-03-24 12:57 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Stefan Wahren, Tomasz Moń,
	Greg Kroah-Hartman, Jiri Slaby, Uwe Kleine-König,
	Sergey Organov, Sascha Hauer, Pengutronix Kernel Team,
	NXP Linux Team, linux-serial, Linux ARM, Shawn Guo,
	Stefan Wahren

Hi Stefan,

On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen
<ilpo.jarvinen@linux.intel.com> wrote:

> This has come up earlier, see e.g.:
>
> https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/
>
> My somewhat uninformed suggestion: if the overrun problems mostly show up
> with console ports, maybe the trigger level could depend on the port
> being a console or not?

Does the change below help? Taking Ilpo's suggestion into account:

diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index 0fa1bd8cdec7..4d0aae38b7a5 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -233,6 +233,7 @@ struct imx_port {
        enum imx_tx_state       tx_state;
        struct hrtimer          trigger_start_tx;
        struct hrtimer          trigger_stop_tx;
+       unsigned int            rxtl;
 };

 struct imx_port_ucrs {
@@ -1309,6 +1310,7 @@ static void imx_uart_clear_rx_errors(struct imx_port *sport)
 }

 #define TXTL_DEFAULT 2 /* reset default */
+#define RXTL_DEFAULT_CONSOLE 1 /* 1 character or aging timer */
 #define RXTL_DEFAULT 8 /* 8 characters or aging timer */
 #define TXTL_DMA 8 /* DMA burst setting */
 #define RXTL_DMA 9 /* DMA burst setting */
@@ -1422,7 +1424,7 @@ static void imx_uart_disable_dma(struct imx_port *sport)
        ucr1 &= ~(UCR1_RXDMAEN | UCR1_TXDMAEN | UCR1_ATDMAEN);
        imx_uart_writel(sport, ucr1, UCR1);

-       imx_uart_setup_ufcr(sport, TXTL_DEFAULT, RXTL_DEFAULT);
+       imx_uart_setup_ufcr(sport, TXTL_DEFAULT, sport->rxtl);

        sport->dma_is_enabled = 0;
 }
@@ -1447,7 +1449,7 @@ static int imx_uart_startup(struct uart_port *port)
                return retval;
        }

-       imx_uart_setup_ufcr(sport, TXTL_DEFAULT, RXTL_DEFAULT);
+       imx_uart_setup_ufcr(sport, TXTL_DEFAULT, sport->rxtl);

        /* disable the DREN bit (Data Ready interrupt enable) before
         * requesting IRQs
@@ -1464,6 +1466,11 @@ static int imx_uart_startup(struct uart_port *port)
        if (!uart_console(port) && imx_uart_dma_init(sport) == 0)
                dma_is_inited = 1;

+       if (uart_console(port))
+               sport->rxtl = RXTL_DEFAULT_CONSOLE;
+       else
+               sport->rxtl = RXTL_DEFAULT;
+
        spin_lock_irqsave(&sport->port.lock, flags);

        /* Reset fifo's and state machines */
@@ -1863,7 +1870,7 @@ static int imx_uart_poll_init(struct uart_port *port)
        if (retval)
                clk_disable_unprepare(sport->clk_ipg);

-       imx_uart_setup_ufcr(sport, TXTL_DEFAULT, RXTL_DEFAULT);
+       imx_uart_setup_ufcr(sport, TXTL_DEFAULT, sport->rxtl);

        spin_lock_irqsave(&sport->port.lock, flags);

@@ -2139,7 +2146,7 @@ imx_uart_console_setup(struct console *co, char *options)
        else
                imx_uart_console_get_options(sport, &baud, &parity, &bits);

-       imx_uart_setup_ufcr(sport, TXTL_DEFAULT, RXTL_DEFAULT);
+       imx_uart_setup_ufcr(sport, TXTL_DEFAULT, sport->rxtl);

        retval = uart_set_options(&sport->port, co, baud, parity, bits, flow);

^ permalink raw reply related	[flat|nested] 90+ messages in thread
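
One detail that may be worth checking in the diff above is the order of
operations in imx_uart_startup(): the hunk at -1447 passes sport->rxtl to
imx_uart_setup_ufcr(), while the hunk at -1464 assigns sport->rxtl later in
the same function. The user-space model below illustrates that ordering. It
is only a sketch, not driver code: struct model_port and all model_*
functions are hypothetical stand-ins, and it assumes the port structure
starts out zero-initialized, which the diff does not show.

```c
#include <assert.h>
#include <stdbool.h>

#define RXTL_DEFAULT_CONSOLE 1 /* value taken from the patch */
#define RXTL_DEFAULT 8         /* value taken from the patch */

/* Hypothetical stand-in for the driver's per-port state. */
struct model_port {
	bool is_console;
	unsigned int rxtl;       /* cached RX trigger level */
	unsigned int programmed; /* last level "written to hardware" */
};

/* Console ports get the low trigger level, all others the default. */
static unsigned int model_pick_rxtl(const struct model_port *p)
{
	return p->is_console ? RXTL_DEFAULT_CONSOLE : RXTL_DEFAULT;
}

/* Stand-in for imx_uart_setup_ufcr(): records the level that would be set. */
static void model_setup_ufcr(struct model_port *p, unsigned int rxtl)
{
	assert(rxtl <= 32); /* assumed upper bound for a trigger level */
	p->programmed = rxtl;
}

/* Same order as the posted hunks: the level is used first, assigned later. */
static void model_startup_as_posted(struct model_port *p)
{
	model_setup_ufcr(p, p->rxtl);
	p->rxtl = model_pick_rxtl(p);
}

/* Possible alternative: choose the level once, before any user runs. */
static void model_init_early(struct model_port *p)
{
	p->rxtl = model_pick_rxtl(p);
}
```

In this model the first call to model_startup_as_posted() on a zeroed
console port programs a level of 0, and only later calls see the intended
value. Calling model_init_early() first, for example from a probe-time
path, avoids that. Whether the real driver behaves the same way depends on
how the port structure is allocated and on which path runs first.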

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-24 12:49         ` Stefan Wahren
@ 2023-03-24 13:06           ` Francesco Dolcini
  -1 siblings, 0 replies; 90+ messages in thread
From: Francesco Dolcini @ 2023-03-24 13:06 UTC (permalink / raw)
  To: Stefan Wahren
  Cc: Ilpo Järvinen, Francesco Dolcini, Tomasz Moń,
	Greg Kroah-Hartman, Jiri Slaby, Uwe Kleine-König,
	Sergey Organov, Sascha Hauer, Pengutronix Kernel Team,
	Fabio Estevam, NXP Linux Team, linux-serial, Linux ARM,
	Shawn Guo, Stefan Wahren

On Fri, Mar 24, 2023 at 01:49:56PM +0100, Stefan Wahren wrote:
> On 24.03.23 at 13:35, Ilpo Järvinen wrote:
> > On Fri, 24 Mar 2023, Francesco Dolcini wrote:
> > > On Fri, Mar 24, 2023 at 01:47:59PM +0200, Ilpo Järvinen wrote:
> > > > On Fri, 24 Mar 2023, Stefan Wahren wrote:
> > > > > after switching to Linux 6.1.21 on our Tarragon board (i.MX6ULL SoC), we
> > > > > experience the following issues with the debug UART (115200 baud, 8N1, no
> > > > > hardware flow control):
> > > > > 
> > > > > - overrun errors if we paste in multiple text lines while system is idle
> > > > > - no reaction to single key strokes while system is on higher load
> > > > > 
> > > > > > After reverting 7a637784d517 ("serial: imx: reduce RX interrupt frequency")
> > > > > > the issue disappears.
> > > > > > 
> > > > > > Maybe it's worth mentioning that the Tarragon board uses two additional
> > > > > > application UARTs with similar baud rates (9600 - 115200 baud, no hardware
> > > > > > flow control) for RS485 communication, but there are no overrun errors (with
> > > > > > and without the mentioned change).
> > > > This has come up earlier, see e.g.:
> > > > 
> > > > https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/
> > > yep, it looks like exactly the same issue.
> > > 
> > > We did not verify if this was affecting other UARTs. However, isn't RS485
> > > half-duplex?
> > While half-duplex is by far more likely due to its simplicity, RS485 could
> > also be full-duplex. It seems the imx driver supports both modes.
> 
> The RS485 on Tarragon is half-duplex, but this is implemented in external
> hardware. So from the Linux / driver point of view it's RS232.

To me this is an interesting difference that might be worth
investigating. The console is somewhat special, since you are going to
echo out the received chars most of the time.

Francesco


^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-24 12:57     ` Fabio Estevam
@ 2023-03-24 13:37       ` Uwe Kleine-König
  -1 siblings, 0 replies; 90+ messages in thread
From: Uwe Kleine-König @ 2023-03-24 13:37 UTC (permalink / raw)
  To: Fabio Estevam
  Cc: Ilpo Järvinen, Stefan Wahren, Stefan Wahren, linux-serial,
	Greg Kroah-Hartman, Tomasz Moń,
	Sergey Organov, NXP Linux Team, Pengutronix Kernel Team,
	Shawn Guo, Jiri Slaby, Sascha Hauer, Linux ARM

[-- Attachment #1: Type: text/plain, Size: 1049 bytes --]

On Fri, Mar 24, 2023 at 09:57:39AM -0300, Fabio Estevam wrote:
> Hi Stefan,
> 
> On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen
> <ilpo.jarvinen@linux.intel.com> wrote:
> 
> > This has come up earlier, see e.g.:
> >
> > https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/
> >
> > My somewhat uninformed suggestion: if the overrun problems mostly show up
> > with console ports, maybe the trigger level could depend on the port
> > being a console or not?
> 
> Does the change below help? Taking Ilpo's suggestion into account:

I wonder if it's a red herring that having the console on that port
makes a difference. If I understand correctly the problem is pasting
bigger amounts of data on a ttymxc after having logged in via a getty?

@Stefan: Can you try to reproduce with the port being also a console?

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-24 13:37       ` Uwe Kleine-König
@ 2023-03-24 14:19         ` Stefan Wahren
  -1 siblings, 0 replies; 90+ messages in thread
From: Stefan Wahren @ 2023-03-24 14:19 UTC (permalink / raw)
  To: Uwe Kleine-König, Fabio Estevam
  Cc: Ilpo Järvinen, Stefan Wahren, linux-serial,
	Greg Kroah-Hartman, Tomasz Moń,
	Sergey Organov, NXP Linux Team, Pengutronix Kernel Team,
	Shawn Guo, Jiri Slaby, Sascha Hauer, Linux ARM

Hi,

On 24.03.23 at 14:37, Uwe Kleine-König wrote:
> On Fri, Mar 24, 2023 at 09:57:39AM -0300, Fabio Estevam wrote:
>> Hi Stefan,
>>
>> On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen
>> <ilpo.jarvinen@linux.intel.com> wrote:
>>
>>> This has come up earlier, see e.g.:
>>>
>>> https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/
>>>
>>> My somewhat uninformed suggestion: if the overrun problems mostly show up
>>> with console ports, maybe the trigger level could depend on the port
>>> being a console or not?
>> Does the change below help? Taking Ilpo's suggestion into account:
> I wonder if it's a red herring that having the console on that port
> makes a difference. If I understand correctly the problem is pasting
> bigger amounts of data on a ttymxc after having logged in via a getty?
>
> @Stefan: Can you try to reproduce with the port being also a console?

Sorry for the confusion. Maybe I should have mentioned that the debug 
UART was configured as a console. Here is the output to be more specific 
(ttymxc0 and 4 are RS485, ttymxc3 is the debug console):

# cat /proc/tty/driver/IMX-uart

serinfo:1.0 driver revision:
0: uart:IMX mmio:0x02020000 irq:192 tx:285207 rx:2633621 fe:2 DSR|CD
3: uart:IMX mmio:0x021F0000 irq:193 tx:70502 rx:69 RTS|DTR|DSR
4: uart:IMX mmio:0x021F4000 irq:194 tx:300988 rx:677223 DSR|CD
5: uart:IMX mmio:0x021FC000 irq:195 tx:0 rx:0 DSR|CD
6: uart:IMX mmio:0x02018000 irq:191 tx:0 rx:0 DSR|CD
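
When comparing kernels, the per-port counters in a listing like the one above can be read programmatically. The following is only a sketch: uart_counter() is a hypothetical helper, and it assumes the usual serial core behaviour of printing error fields such as "fe:" or "oe:" only once the counter is non-zero, so a missing field is treated as 0.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return the value of a "key:value" field (for example
 * "oe", "fe", "rx") from one line of /proc/tty/driver/IMX-uart, or 0 if the
 * field is absent. Assumption: error counters that are still zero are not
 * printed, so "absent" and "zero" are treated the same here.
 */
static unsigned long uart_counter(const char *line, const char *key)
{
	size_t klen = strlen(key);
	const char *p = line;

	assert(klen > 0);
	while ((p = strstr(p, key)) != NULL) {
		/* Accept whole fields only: preceded by a space, followed by ':' */
		if (p != line && p[-1] == ' ' && p[klen] == ':')
			return strtoul(p + klen + 1, NULL, 10);
		p += klen;
	}
	return 0;
}
```

Feeding it each line read with fgets() before and after a paste test, and comparing uart_counter(line, "oe"), would show whether the overrun counter of a given port moved.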

Just for clarification: the Tarragon board is built into a charging 
station, so hardware access is limited.

@Uwe which port should be configured as a console?

>
> Best regards
> Uwe
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-24 14:19         ` Stefan Wahren
@ 2023-03-24 14:39           ` Uwe Kleine-König
  -1 siblings, 0 replies; 90+ messages in thread
From: Uwe Kleine-König @ 2023-03-24 14:39 UTC (permalink / raw)
  To: Stefan Wahren
  Cc: Fabio Estevam, Pengutronix Kernel Team, Jiri Slaby,
	Greg Kroah-Hartman, Tomasz Moń,
	Sergey Organov, NXP Linux Team, linux-serial, Stefan Wahren,
	Ilpo Järvinen, Shawn Guo, Sascha Hauer, Linux ARM

[-- Attachment #1: Type: text/plain, Size: 2111 bytes --]

On Fri, Mar 24, 2023 at 03:19:46PM +0100, Stefan Wahren wrote:
> Hi,
> 
> On 24.03.23 at 14:37, Uwe Kleine-König wrote:
> > On Fri, Mar 24, 2023 at 09:57:39AM -0300, Fabio Estevam wrote:
> > > Hi Stefan,
> > > 
> > > On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen
> > > <ilpo.jarvinen@linux.intel.com> wrote:
> > > 
> > > > This has come up earlier, see e.g.:
> > > > 
> > > > https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/
> > > > 
> > > > My somewhat uninformed suggestion: if the overrun problems mostly show up
> > > > with console ports, maybe the trigger level could depend on the port
> > > > being a console or not?
> > > Does the change below help? Taking Ilpo's suggestion into account:
> > I wonder if it's a red herring that having the console on that port
> > makes a difference. If I understand correctly the problem is pasting
> > bigger amounts of data on a ttymxc after having logged in via a getty?
> > 
> > @Stefan: Can you try to reproduce with the port being also a console?
> 
> Sorry for the confusion. Maybe I should have mentioned that the debug UART
> was configured as a console. Here is the output to be more specific (ttymxc0
> and 4 are RS485, ttymxc3 is the debug console):
> 
> # cat /proc/tty/driver/IMX-uart
> 
> serinfo:1.0 driver revision:
> 0: uart:IMX mmio:0x02020000 irq:192 tx:285207 rx:2633621 fe:2 DSR|CD
> 3: uart:IMX mmio:0x021F0000 irq:193 tx:70502 rx:69 RTS|DTR|DSR
> 4: uart:IMX mmio:0x021F4000 irq:194 tx:300988 rx:677223 DSR|CD
> 5: uart:IMX mmio:0x021FC000 irq:195 tx:0 rx:0 DSR|CD
> 6: uart:IMX mmio:0x02018000 irq:191 tx:0 rx:0 DSR|CD
> 
> Just for clarification: the Tarragon board is built into a charging station,
> so hardware access is limited.
> 
> @Uwe which port should be configured as a console?

I don't care as long as it's not the port that you do your test on. None
is fine.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-24 12:57     ` Fabio Estevam
@ 2023-03-24 15:00       ` Stefan Wahren
  -1 siblings, 0 replies; 90+ messages in thread
From: Stefan Wahren @ 2023-03-24 15:00 UTC (permalink / raw)
  To: Fabio Estevam, Ilpo Järvinen
  Cc: Tomasz Moń,
	Greg Kroah-Hartman, Jiri Slaby, Uwe Kleine-König,
	Sergey Organov, Sascha Hauer, Pengutronix Kernel Team,
	NXP Linux Team, linux-serial, Linux ARM, Shawn Guo,
	Stefan Wahren

Hi Fabio,

On 24.03.23 at 13:57, Fabio Estevam wrote:
> Hi Stefan,
>
> On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen
> <ilpo.jarvinen@linux.intel.com> wrote:
>
>> This has come up earlier, see e.g.:
>>
>> https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/
>>
>> My somewhat uninformed suggestion: if the overrun problems mostly show up
>> with console ports, maybe the trigger level could depend on the port
>> being a console or not?
> Does the change below help? Taking Ilpo's suggestion into account:
This breaks the boot / debug console completely, but I got the idea.
>
> diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
> index 0fa1bd8cdec7..4d0aae38b7a5 100644
> --- a/drivers/tty/serial/imx.c
> +++ b/drivers/tty/serial/imx.c
> @@ -233,6 +233,7 @@ struct imx_port {
>          enum imx_tx_state       tx_state;
>          struct hrtimer          trigger_start_tx;
>          struct hrtimer          trigger_stop_tx;
> +       unsigned int            rxtl;
>   };
>
>   struct imx_port_ucrs {
> @@ -1309,6 +1310,7 @@ static void imx_uart_clear_rx_errors(struct imx_port *sport)
>   }
>
>   #define TXTL_DEFAULT 2 /* reset default */
> +#define RXTL_DEFAULT_CONSOLE 1 /* 1 character or aging timer */
>   #define RXTL_DEFAULT 8 /* 8 characters or aging timer */
>   #define TXTL_DMA 8 /* DMA burst setting */
>   #define RXTL_DMA 9 /* DMA burst setting */
> @@ -1422,7 +1424,7 @@ static void imx_uart_disable_dma(struct imx_port *sport)
>          ucr1 &= ~(UCR1_RXDMAEN | UCR1_TXDMAEN | UCR1_ATDMAEN);
>          imx_uart_writel(sport, ucr1, UCR1);
>
> -       imx_uart_setup_ufcr(sport, TXTL_DEFAULT, RXTL_DEFAULT);
> +       imx_uart_setup_ufcr(sport, TXTL_DEFAULT, sport->rxtl);
>
>          sport->dma_is_enabled = 0;
>   }
> @@ -1447,7 +1449,7 @@ static int imx_uart_startup(struct uart_port *port)
>                  return retval;
>          }
>
> -       imx_uart_setup_ufcr(sport, TXTL_DEFAULT, RXTL_DEFAULT);
> +       imx_uart_setup_ufcr(sport, TXTL_DEFAULT, sport->rxtl);
I think at least at this point sport->rxtl is not properly initialized.
>
>          /* disable the DREN bit (Data Ready interrupt enable) before
>           * requesting IRQs
> @@ -1464,6 +1466,11 @@ static int imx_uart_startup(struct uart_port *port)
>          if (!uart_console(port) && imx_uart_dma_init(sport) == 0)
>                  dma_is_inited = 1;
>
> +       if (uart_console(port))
> +               sport->rxtl = RXTL_DEFAULT_CONSOLE;
> +       else
> +               sport->rxtl = RXTL_DEFAULT;
> +
>          spin_lock_irqsave(&sport->port.lock, flags);
>
>          /* Reset fifo's and state machines */
> @@ -1863,7 +1870,7 @@ static int imx_uart_poll_init(struct uart_port *port)
>          if (retval)
>                  clk_disable_unprepare(sport->clk_ipg);
>
> -       imx_uart_setup_ufcr(sport, TXTL_DEFAULT, RXTL_DEFAULT);
> +       imx_uart_setup_ufcr(sport, TXTL_DEFAULT, sport->rxtl);
>
>          spin_lock_irqsave(&sport->port.lock, flags);
>
> @@ -2139,7 +2146,7 @@ imx_uart_console_setup(struct console *co, char *options)
>          else
>                  imx_uart_console_get_options(sport, &baud, &parity, &bits);
>
> -       imx_uart_setup_ufcr(sport, TXTL_DEFAULT, RXTL_DEFAULT);
> +       imx_uart_setup_ufcr(sport, TXTL_DEFAULT, sport->rxtl);
>
>          retval = uart_set_options(&sport->port, co, baud, parity, bits, flow);
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
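
The ordering concern raised inline above can be illustrated in isolation. In the quoted patch, sport->rxtl is assigned in imx_uart_startup() only after the first imx_uart_setup_ufcr() call, and imx_uart_console_setup() reads it without assigning it first. The sketch below is not driver code: struct demo_port and demo_pick_rxtl() are hypothetical stand-ins, and it assumes the cached field starts out as 0. It shows one way to avoid depending on call order, by deriving the level from the port type at each call site.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RXTL_DEFAULT_CONSOLE 1	/* value taken from the quoted patch */
#define RXTL_DEFAULT         8	/* value taken from the quoted patch */

/* Hypothetical stand-in for the port state; NOT the real struct imx_port. */
struct demo_port {
	bool is_console;	/* stands in for uart_console(port) */
	unsigned int rxtl;	/* cached trigger level, assumed 0 until assigned */
};

/*
 * Hypothetical helper: compute the RX trigger level from the port type on
 * every call, so that no caller can observe a not-yet-assigned cached value.
 */
static unsigned int demo_pick_rxtl(const struct demo_port *p)
{
	assert(p != NULL);
	return p->is_console ? RXTL_DEFAULT_CONSOLE : RXTL_DEFAULT;
}
```

Whether an unassigned value is really what broke the console in the test is not confirmed in this thread; the sketch only shows why reading the field before the assignment is fragile.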

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-24 14:19         ` Stefan Wahren
@ 2023-03-24 21:57           ` Sergey Organov
  -1 siblings, 0 replies; 90+ messages in thread
From: Sergey Organov @ 2023-03-24 21:57 UTC (permalink / raw)
  To: Stefan Wahren
  Cc: Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen,
	Stefan Wahren, linux-serial, Greg Kroah-Hartman, Tomasz Moń,
	NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby,
	Sascha Hauer, Linux ARM

Stefan Wahren <stefan.wahren@i2se.com> writes:

> Hi,
>
> On 24.03.23 at 14:37, Uwe Kleine-König wrote:
>> On Fri, Mar 24, 2023 at 09:57:39AM -0300, Fabio Estevam wrote:
>>> Hi Stefan,
>>>
>>> On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen
>>> <ilpo.jarvinen@linux.intel.com> wrote:
>>>
>>>> This has come up earlier, see e.g.:
>>>>
>>>> https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/
>>>>
>>>> My somewhat uninformed suggestion: if the overrun problems mostly show up
>>>> with console ports, maybe the trigger level could depend on the port
>>>> being a console or not?
>>> Does the change below help? Taking Ilpo's suggestion into account:
>> I wonder if it's a red herring that having the console on that port
>> makes a difference. If I understand correctly the problem is pasting
>> bigger amounts of data on a ttymxc after having logged in via a getty?
>>
>> @Stefan: Can you try to reproduce with the port being also a console?
>
> Sorry for the confusion. Maybe I should have mentioned that the debug
> UART was configured as a console.

Chances are that you might experience the same problem that I've
described here:

https://marc.info/?l=linux-serial&m=158504064609504&w=2

Essentially, any serial console output from printk() could easily
cause 10 milliseconds, or even up to 1 second, of interrupt latency,
which will definitely cause overruns on serial ports and gosh knows what
other problems.

As far as I'm aware, this issue has not been resolved. For me it means
that I can't use the Linux serial console at all on my non-SMP system,
unless I remove the offending lock.

-- Sergey Organov
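
The latency figures above can be set against the receive FIFO headroom with some rough arithmetic. This is a sketch under stated assumptions: a 32-entry RX FIFO (IMX_RX_FIFO_DEPTH is an assumed constant, not taken from this thread), 10 bits on the wire per 8N1 character, and the aging timer and receive shift register ignored. rx_headroom_us() is a hypothetical helper name.

```c
#include <assert.h>

#define IMX_RX_FIFO_DEPTH 32U	/* assumption: 32-entry RX FIFO */

/*
 * Approximate time in microseconds between the RX trigger interrupt (raised
 * once rxtl characters are queued) and the FIFO becoming full, for a line
 * receiving back-to-back characters.
 */
static unsigned int rx_headroom_us(unsigned int baud,
				   unsigned int bits_per_char,
				   unsigned int rxtl)
{
	assert(baud > 0);
	assert(rxtl >= 1 && rxtl <= IMX_RX_FIFO_DEPTH);
	return (unsigned int)((1000000ULL * bits_per_char *
			       (IMX_RX_FIFO_DEPTH - rxtl)) / baud);
}
```

Under these assumptions, rx_headroom_us(115200, 10, 8) is 2083 and rx_headroom_us(115200, 10, 1) is 2690, so lowering the trigger level from 8 to 1 buys roughly 0.6 ms. Neither value comes close to a 10 ms interrupt latency, which suggests a lower trigger level alone would not help if latencies of that size occur.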

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
@ 2023-03-24 21:57           ` Sergey Organov
  0 siblings, 0 replies; 90+ messages in thread
From: Sergey Organov @ 2023-03-24 21:57 UTC (permalink / raw)
  To: Stefan Wahren
  Cc: Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen,
	Stefan Wahren, linux-serial, Greg Kroah-Hartman, Tomasz Moń,
	NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby,
	Sascha Hauer, Linux ARM

Stefan Wahren <stefan.wahren@i2se.com> writes:

> Hi,
>
> Am 24.03.23 um 14:37 schrieb Uwe Kleine-König:
>> On Fri, Mar 24, 2023 at 09:57:39AM -0300, Fabio Estevam wrote:
>>> Hi Stefan,
>>>
>>> On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen
>>> <ilpo.jarvinen@linux.intel.com> wrote:
>>>
>>>> This has come up earlier, see e.g.:
>>>>
>>>> https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/
>>>>
>>>> My somewhat uninformed suggestion: if the overrun problems mostly show up
>>>> with console ports, maybe the trigger level could depend on the port
>>>> being a console or not?
>>> Does the change below help? Taking Ilpo's suggestion into account:
>> I wonder if it's a red herring that having the console on that port
>> makes a difference. If I understand correctly the problem is pasting
>> bigger amounts of data on a ttymxc after having logged in via a getty?
>>
>> @Stefan: Can you try to reproduce with the port being also a console?
>
> Sorry, for the confusion. Maybe i should have mentioned that the debug
> UART was configured as a console.

Chances are that you might experience the same problem that I've
described here:

https://marc.info/?l=linux-serial&m=158504064609504&w=2

Essentially, any serial console output from printk() can easily cause
interrupt latency of 10 milliseconds or even up to 1 second, which
will definitely cause overruns on serial ports and gosh knows what other
problems.

As far as I'm aware, this issue has not been resolved. For me it means
that I can't use the Linux serial console at all on my non-SMP system
unless I remove the offending lock.

-- Sergey Organov

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-24 15:00       ` Stefan Wahren
@ 2023-03-25 11:31         ` Stefan Wahren
  -1 siblings, 0 replies; 90+ messages in thread
From: Stefan Wahren @ 2023-03-25 11:31 UTC (permalink / raw)
  To: Fabio Estevam, Ilpo Järvinen
  Cc: Tomasz Moń,
	Greg Kroah-Hartman, Jiri Slaby, Uwe Kleine-König,
	Sergey Organov, Sascha Hauer, Pengutronix Kernel Team,
	NXP Linux Team, linux-serial, Linux ARM, Shawn Guo,
	Stefan Wahren

Hi Fabio,

Am 24.03.23 um 16:00 schrieb Stefan Wahren:
> Hi Fabio,
> 
> Am 24.03.23 um 13:57 schrieb Fabio Estevam:
>> Hi Stefan,
>>
>> On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen
>> <ilpo.jarvinen@linux.intel.com> wrote:
>>
>>> This has come up earlier, see e.g.:
>>>
>>> https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/
>>>
>>> My somewhat uninformed suggestion: if the overrun problems mostly show up
>>> with console ports, maybe the trigger level could depend on the port
>>> being a console or not?
>> Does the change below help? Taking Ilpo's suggestion into account:
> this breaks the boot / debug console completely, but i got the idea.
>>

Based on your patch, I successfully tested this:

diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index f07c4f9ff13c..1aacaa637ede 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -1277,6 +1277,7 @@ static void imx_uart_clear_rx_errors(struct imx_port *sport)
  }

  #define TXTL_DEFAULT 2 /* reset default */
+#define RXTL_DEFAULT_CONSOLE 1 /* reset default */
  #define RXTL_DEFAULT 8 /* 8 characters or aging timer */
  #define TXTL_DMA 8 /* DMA burst setting */
  #define RXTL_DMA 9 /* DMA burst setting */
@@ -1286,6 +1287,9 @@ static void imx_uart_setup_ufcr(struct imx_port *sport,
  {
  	unsigned int val;

+	if (uart_console(&sport->port))
+		rxwl = RXTL_DEFAULT_CONSOLE; // fallback
+
  	/* set receiver / transmitter trigger level */
  	val = imx_uart_readl(sport, UFCR) & (UFCR_RFDIV | UFCR_DCEDTE);
  	val |= txwl << UFCR_TXTL_SHF | rxwl;

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-25 11:31         ` Stefan Wahren
@ 2023-03-25 12:23           ` Fabio Estevam
  -1 siblings, 0 replies; 90+ messages in thread
From: Fabio Estevam @ 2023-03-25 12:23 UTC (permalink / raw)
  To: Stefan Wahren
  Cc: Ilpo Järvinen, Tomasz Moń,
	Greg Kroah-Hartman, Jiri Slaby, Uwe Kleine-König,
	Sergey Organov, Sascha Hauer, Pengutronix Kernel Team,
	NXP Linux Team, linux-serial, Linux ARM, Shawn Guo,
	Stefan Wahren

Hi Stefan,

On Sat, Mar 25, 2023 at 8:31 AM Stefan Wahren <stefan.wahren@i2se.com> wrote:

> based on your patch, i successfully tested this:

Great, much simpler :-)

Please submit it as a formal patch so we can get feedback, thanks.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-25 11:31         ` Stefan Wahren
@ 2023-03-25 15:11           ` Uwe Kleine-König
  -1 siblings, 0 replies; 90+ messages in thread
From: Uwe Kleine-König @ 2023-03-25 15:11 UTC (permalink / raw)
  To: Stefan Wahren
  Cc: Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial,
	Greg Kroah-Hartman, Sascha Hauer, Sergey Organov, NXP Linux Team,
	Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń,
	Linux ARM

Hello,

On Sat, Mar 25, 2023 at 12:31:01PM +0100, Stefan Wahren wrote:
> Am 24.03.23 um 16:00 schrieb Stefan Wahren:
> > Am 24.03.23 um 13:57 schrieb Fabio Estevam:
> > > On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen
> > > <ilpo.jarvinen@linux.intel.com> wrote:
> > > 
> > > > This has come up earlier, see e.g.:
> > > > 
> > > > https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/
> > > > 
> > > > My somewhat uninformed suggestion: if the overrun problems
> > > > mostly show up
> > > > with console ports, maybe the trigger level could depend on the port
> > > > being a console or not?
> > > Does the change below help? Taking Ilpo's suggestion into account:
> > this breaks the boot / debug console completely, but i got the idea.
> > > 
> 
> based on your patch, i successfully tested this:
> 
> diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
> index f07c4f9ff13c..1aacaa637ede 100644
> --- a/drivers/tty/serial/imx.c
> +++ b/drivers/tty/serial/imx.c
> @@ -1277,6 +1277,7 @@ static void imx_uart_clear_rx_errors(struct imx_port
> *sport)
>  }
> 
>  #define TXTL_DEFAULT 2 /* reset default */
> +#define RXTL_DEFAULT_CONSOLE 1 /* reset default */
>  #define RXTL_DEFAULT 8 /* 8 characters or aging timer */
>  #define TXTL_DMA 8 /* DMA burst setting */
>  #define RXTL_DMA 9 /* DMA burst setting */
> @@ -1286,6 +1287,9 @@ static void imx_uart_setup_ufcr(struct imx_port
> *sport,
>  {
>  	unsigned int val;
> 
> +	if (uart_console(&sport->port))
> +		rxwl = RXTL_DEFAULT_CONSOLE; // fallback
> +
>  	/* set receiver / transmitter trigger level */
>  	val = imx_uart_readl(sport, UFCR) & (UFCR_RFDIV | UFCR_DCEDTE);
>  	val |= txwl << UFCR_TXTL_SHF | rxwl;

So the current theory is that the issue occurs because of a combination of:

 - With a higher watermark value the irq triggers later, so there is
   less time left for the ISR to run before an overrun happens; and

 - serial console activity disables irqs for a (relatively) long time

right?
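
As a rough illustration of the first point, the sketch below estimates how
much time the ISR has between the RX interrupt firing and the RX FIFO being
full, for a given trigger level. It assumes a 32-entry RX FIFO (an assumption
to check against the SoC reference manual) and 8N1 framing, i.e. 10 bit times
per character. The function names are made up for this example and are not
part of imx.c.

```c
/* Assumption: 32-entry RX FIFO; verify against the SoC reference manual. */
#define RX_FIFO_DEPTH 32

/* 8N1 framing: 1 start bit + 8 data bits + 1 stop bit. */
#define BITS_PER_CHAR 10

/* Time on the wire for one character, in microseconds. */
static double char_time_us(unsigned int baud)
{
	return (double)BITS_PER_CHAR * 1e6 / baud;
}

/*
 * Approximate time between the RX interrupt (raised once `rxtl`
 * characters are in the FIFO) and the FIFO being completely full,
 * assuming characters arrive back to back. If the ISR has not run
 * by then, further characters are at risk of being lost.
 */
static double irq_headroom_us(unsigned int baud, unsigned int rxtl)
{
	return (double)(RX_FIFO_DEPTH - rxtl) * char_time_us(baud);
}
```

Under these assumptions, at 115200 baud a trigger level of 1 leaves about
2.69 ms of headroom and a trigger level of 8 leaves about 2.08 ms, so the
higher watermark costs roughly 0.6 ms of tolerated irq-off time.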

So on a UP system the problem should also occur on a non-console port?
Local irqs are only disabled if some printk is about to be emitted,
aren't they? Does this match the error you're seeing?

That makes me wonder whether the error relates not to the UART being a
console port, but to the UART being used without DMA. (So the patch above
fixes the problem for you because no DMA is used on the console port?)

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-25 15:11           ` Uwe Kleine-König
@ 2023-03-25 17:05             ` Stefan Wahren
  -1 siblings, 0 replies; 90+ messages in thread
From: Stefan Wahren @ 2023-03-25 17:05 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial,
	Greg Kroah-Hartman, Sascha Hauer, Sergey Organov, NXP Linux Team,
	Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń,
	Linux ARM

Hi Uwe,

Am 25.03.23 um 16:11 schrieb Uwe Kleine-König:
> Hello,
> 
> On Sat, Mar 25, 2023 at 12:31:01PM +0100, Stefan Wahren wrote:
>> Am 24.03.23 um 16:00 schrieb Stefan Wahren:
>>> Am 24.03.23 um 13:57 schrieb Fabio Estevam:
>>>> On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen
>>>> <ilpo.jarvinen@linux.intel.com> wrote:
>>>>
>>>>> This has come up earlier, see e.g.:
>>>>>
>>>>> https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/
>>>>>
>>>>> My somewhat uninformed suggestion: if the overrun problems
>>>>> mostly show up
>>>>> with console ports, maybe the trigger level could depend on the port
>>>>> being a console or not?
>>>> Does the change below help? Taking Ilpo's suggestion into account:
>>> this breaks the boot / debug console completely, but i got the idea.
>>>>
>>
>> based on your patch, i successfully tested this:
>>
>> diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
>> index f07c4f9ff13c..1aacaa637ede 100644
>> --- a/drivers/tty/serial/imx.c
>> +++ b/drivers/tty/serial/imx.c
>> @@ -1277,6 +1277,7 @@ static void imx_uart_clear_rx_errors(struct imx_port
>> *sport)
>>   }
>>
>>   #define TXTL_DEFAULT 2 /* reset default */
>> +#define RXTL_DEFAULT_CONSOLE 1 /* reset default */
>>   #define RXTL_DEFAULT 8 /* 8 characters or aging timer */
>>   #define TXTL_DMA 8 /* DMA burst setting */
>>   #define RXTL_DMA 9 /* DMA burst setting */
>> @@ -1286,6 +1287,9 @@ static void imx_uart_setup_ufcr(struct imx_port
>> *sport,
>>   {
>>   	unsigned int val;
>>
>> +	if (uart_console(&sport->port))
>> +		rxwl = RXTL_DEFAULT_CONSOLE; // fallback
>> +
>>   	/* set receiver / transmitter trigger level */
>>   	val = imx_uart_readl(sport, UFCR) & (UFCR_RFDIV | UFCR_DCEDTE);
>>   	val |= txwl << UFCR_TXTL_SHF | rxwl;
> 
> So the current theory is that the issue occurs because of a combination of:
> 
>   - With a higher watermark value the irq triggers later, so there is
>     less time left for the ISR to run before an overrun happens; and
> 
>   - serial console activity disables irqs for a (relatively) long time
> 
> right?
> 
> So on a UP system the problem should also occur on a non-console port?

This is less likely, because UART applications usually need some kind of
flow control (either in hardware or at the protocol level). In a non-console
application the receiver usually waits until the end of the incoming message
and only then starts to transmit.

Sure, you can flood the UART with characters, and then it's only a question
of time until the RX FIFO is full and data gets lost. But I think we should
focus on the real use case and not try to find the perfect solution. In
the end it's always a compromise between latency and throughput.

> Local irqs are only disabled if some printk is about to be emitted,
> aren't they? Does this match the error you're seeing?

Yes, that's the typical "problem" of a console application.

> That makes me wonder whether the error relates not to the UART being a
> console port, but to the UART being used without DMA. (So the patch above
> fixes the problem for you because no DMA is used on the console port?)

As I said, the issue only occurred on the console. My problem is that the
other UARTs on Tarragon are used for RS485, which means they are only
half duplex.

According to these lines in imx.c, DMA is never used for the console:

   /* Can we enable the DMA support? */
   if (!uart_console(port) && imx_uart_dma_init(sport) == 0)
     dma_is_inited = 1;

In the end, the patch above only restores the old console behavior, but
keeps Tomasz Moń's optimization for non-console ports (which is what it
was intended for).
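
To make that concrete, here is a small user-space model of the trigger-level
selection the patch performs. It is only a sketch: the TXTL shift of 10 and
the 6-bit RXTL field are assumptions about the UFCR layout that should be
checked against imx.c and the reference manual, and pick_rxtl(),
ufcr_trigger_bits() and UFCR_RXTL_MASK are hypothetical names that do not
exist in the driver.

```c
#include <stdbool.h>
#include <stdint.h>

#define TXTL_DEFAULT          2  /* from the patch context */
#define RXTL_DEFAULT          8  /* 8 characters or aging timer */
#define RXTL_DEFAULT_CONSOLE  1  /* value the patch uses for consoles */

/* Assumed register layout, to be verified against imx.c. */
#define UFCR_TXTL_SHF        10
#define UFCR_RXTL_MASK     0x3f  /* hypothetical name for the RXTL field */

/* Console ports always get the low RX trigger level. */
static unsigned int pick_rxtl(bool is_console, unsigned int requested)
{
	return is_console ? RXTL_DEFAULT_CONSOLE : requested;
}

/* Trigger-level bits that would be OR-ed into UFCR. */
static uint32_t ufcr_trigger_bits(bool is_console, unsigned int txwl,
				  unsigned int rxwl)
{
	rxwl = pick_rxtl(is_console, rxwl);
	return ((uint32_t)txwl << UFCR_TXTL_SHF) | (rxwl & UFCR_RXTL_MASK);
}
```

With these assumptions, a console port asked for RXTL_DEFAULT ends up with
trigger bits 0x801 (TXTL 2, RXTL 1), while a non-console port keeps 0x808
(TXTL 2, RXTL 8).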

Best regards
Stefan

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-25 15:11           ` Uwe Kleine-König
@ 2023-03-25 18:30             ` Sergey Organov
  -1 siblings, 0 replies; 90+ messages in thread
From: Sergey Organov @ 2023-03-25 18:30 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: Stefan Wahren, Fabio Estevam, Ilpo Järvinen, Stefan Wahren,
	linux-serial, Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team,
	Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń,
	Linux ARM

Hello,

Uwe Kleine-König <u.kleine-koenig@pengutronix.de> writes:
> Hello,
>
> On Sat, Mar 25, 2023 at 12:31:01PM +0100, Stefan Wahren wrote:
[...]
>
> So the current theory is that the issue occurs because of a combination of:
>
>  - With a higher watermark value the irq triggers later, so there is
>    less time left for the ISR to run before an overrun happens; and
>
>  - serial console activity disables irqs for a (relatively) long time
>
> right?
>
> So on a UP system the problem should also occur on a non-console
> port?

That's exactly what I've experienced, especially when the console baud
rate was lower than that of the other port(s). I had the console at
115200 and got immediate problems on another port running at 460800
whenever relatively lengthy printk output was emitted (in my case it was
info from the wlan driver).
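
A back-of-the-envelope sketch of why this combination hurts: it compares how
long the console needs to push out a message with how quickly another port
fills its RX FIFO. It assumes 8N1 framing and, as hypothesised in this
thread, that irqs stay off for the whole console write. The 80-character
message length and the 32-entry FIFO depth used in the note below are
illustrative assumptions, and the function names are made up for this
example.

```c
#include <stdbool.h>

/* 8N1 framing: 1 start bit + 8 data bits + 1 stop bit. */
#define BITS_PER_CHAR 10

/* Microseconds needed to move `chars` characters at `baud`. */
static double transfer_time_us(unsigned int chars, unsigned int baud)
{
	return (double)chars * BITS_PER_CHAR * 1e6 / baud;
}

/*
 * Returns true if a port receiving back-to-back characters at
 * `port_baud` would fill `fifo_depth` entries while the console is
 * still busy emitting `msg_len` characters at `console_baud`.
 */
static bool fifo_fills_during_printk(unsigned int msg_len,
				     unsigned int console_baud,
				     unsigned int port_baud,
				     unsigned int fifo_depth)
{
	return transfer_time_us(msg_len, console_baud) >
	       transfer_time_us(fifo_depth, port_baud);
}
```

Under these assumptions an 80-character line on a 115200 baud console takes
about 6.9 ms, while a port receiving continuously at 460800 baud fills 32
FIFO entries in about 0.69 ms, i.e. roughly ten times faster.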

> Local irqs are only disabled if some printk is about to be emitted,
> aren't they?

Yep, and this allows an easy check of whether it's indeed printk that
causes this, by eliminating the output using

# echo 0 > /proc/sys/kernel/printk

> Does this match the error you're seeing?
>
> That makes me wonder whether the error relates not to the UART being a
> console port, but to the UART being used without DMA. (So the patch above
> fixes the problem for you because no DMA is used on the console port?)

Indeed DMA is likely to be able to hide the problem if the cause is
printk, though all my results were obtained on DMA-disabled ports, and I
never checked with DMA enabled, so unfortunately I have no tested
confirmation of this idea.

Best regards,
-- Sergey Organov

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-25 17:05             ` Stefan Wahren
@ 2023-03-25 19:00               ` Sergey Organov
  -1 siblings, 0 replies; 90+ messages in thread
From: Sergey Organov @ 2023-03-25 19:00 UTC (permalink / raw)
  To: Stefan Wahren
  Cc: Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen,
	Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer,
	NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby,
	Tomasz Moń,
	Linux ARM

Hello,

Stefan Wahren <stefan.wahren@i2se.com> writes:

> Hi Uwe,
>
> Am 25.03.23 um 16:11 schrieb Uwe Kleine-König:
>> Hello,
>> On Sat, Mar 25, 2023 at 12:31:01PM +0100, Stefan Wahren wrote:
>>> Am 24.03.23 um 16:00 schrieb Stefan Wahren:
>>>> Am 24.03.23 um 13:57 schrieb Fabio Estevam:
>>>>> On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen
>>>>> <ilpo.jarvinen@linux.intel.com> wrote:
>>>>>
>>>>>> This has come up earlier, see e.g.:
>>>>>>
>>>>>> https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/
>>>>>>
>>>>>> My somewhat uninformed suggestion: if the overrun problems
>>>>>> mostly show up
>>>>>> with console ports, maybe the trigger level could depend on the port
>>>>>> being a console or not?
>>>>> Does the change below help? Taking Ilpo's suggestion into account:
>>>> this breaks the boot / debug console completely, but i got the idea.
>>>>>
>>>
>>> based on your patch, i successfully tested this:
>>>
>>> diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
>>> index f07c4f9ff13c..1aacaa637ede 100644
>>> --- a/drivers/tty/serial/imx.c
>>> +++ b/drivers/tty/serial/imx.c
>>> @@ -1277,6 +1277,7 @@ static void imx_uart_clear_rx_errors(struct imx_port
>>> *sport)
>>>   }
>>>
>>>   #define TXTL_DEFAULT 2 /* reset default */
>>> +#define RXTL_DEFAULT_CONSOLE 1 /* reset default */
>>>   #define RXTL_DEFAULT 8 /* 8 characters or aging timer */
>>>   #define TXTL_DMA 8 /* DMA burst setting */
>>>   #define RXTL_DMA 9 /* DMA burst setting */
>>> @@ -1286,6 +1287,9 @@ static void imx_uart_setup_ufcr(struct imx_port
>>> *sport,
>>>   {
>>>   	unsigned int val;
>>>
>>> +	if (uart_console(&sport->port))
>>> +		rxwl = RXTL_DEFAULT_CONSOLE; // fallback
>>> +
>>>   	/* set receiver / transmitter trigger level */
>>>   	val = imx_uart_readl(sport, UFCR) & (UFCR_RFDIV | UFCR_DCEDTE);
>>>   	val |= txwl << UFCR_TXTL_SHF | rxwl;
>> So the current theory is that the issue occurs because of a combination of:
>>   - With a higher watermark value the irq triggers later, so there is
>>     less time left for the ISR to run before an overrun happens; and
>>   - serial console activity disables irqs for a (relatively) long time
>> right?
>> So on a UP system the problem should also occur on a non-console port?
>
> This is less likely, because UART applications usually need some kind
> of flow control (either in hardware or at the protocol level). In a
> non-console application the receiver usually waits until the end of the
> incoming message and only then starts to transmit.

Only a CTS/RTS hardware handshake could help. Otherwise printk() output
is typically entirely asynchronous with respect to transmissions on
another port, and software protocols are then irrelevant unless they
enforce extremely short chunks of data (smaller than the FIFO size).

> Sure, you can flood the UART with characters, and then it's only a
> question of time until the RX FIFO is full and data gets lost.

In a correctly working RT system this doesn't typically happen, as CPUs
are way faster than typical UART speeds and can handle the load easily,
provided the UART has a decent FIFO. It's disabling IRQs for prolonged
periods that makes shit happen.

[...]

>
> According to these lines in imx.c, DMA is never used for the console:
>
>   /* Can we enable the DMA support? */
>   if (!uart_console(port) && imx_uart_dma_init(sport) == 0)
>     dma_is_inited = 1;
>
> In the end, the patch above only restores the old console behavior, but
> keeps Tomasz Moń's optimization for non-console ports (which is what it
> was intended for).

So this will likely only help in this particular case and will leave the
problem in place on other DMA-disabled ports. To "fix" this, the old
threshold would have to be restored on all DMA-disabled ports, and then
Tomasz's original patch would effectively be entirely reverted, it seems.

Disclaimer: all of the above assumes that printk is the core cause of the
problem in this case, which has not yet been shown in testing, as far as
I know.

Best regards,
-- Sergey

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-25 19:00               ` Sergey Organov
@ 2023-03-26 18:21                 ` Francesco Dolcini
  -1 siblings, 0 replies; 90+ messages in thread
From: Francesco Dolcini @ 2023-03-26 18:21 UTC (permalink / raw)
  To: Sergey Organov
  Cc: Stefan Wahren, Uwe Kleine-König, Fabio Estevam,
	Ilpo Järvinen, Stefan Wahren, linux-serial,
	Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team,
	Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń,
	Linux ARM

On Sat, Mar 25, 2023 at 10:00:24PM +0300, Sergey Organov wrote:
> In a correctly working RT system this doesn't typically happen, as CPUs
> are way faster than typical UART speeds and are able to handle the
> load easily, provided the UART has a decent FIFO. It's disabling IRQs for
> prolonged times that makes shit happen.

We first looked into this issue before 7a637784d517, on a PREEMPT-RT
patched kernel (v5.4, if I remember correctly).

The system was not loaded at all, so the behavior was pretty
surprising, for the very reasons you just gave.

Francesco


^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-25 17:05             ` Stefan Wahren
@ 2023-03-27  8:07               ` Tomasz Moń
  -1 siblings, 0 replies; 90+ messages in thread
From: Tomasz Moń @ 2023-03-27  8:07 UTC (permalink / raw)
  To: Stefan Wahren, Uwe Kleine-König
  Cc: Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial,
	Greg Kroah-Hartman, Sascha Hauer, Sergey Organov, NXP Linux Team,
	Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń,
	Linux ARM

On Sat, 2023-03-25 at 18:05 +0100, Stefan Wahren wrote:
> Am 25.03.23 um 16:11 schrieb Uwe Kleine-König:
> > So on an UP system the problem should occur also on a non-console port?
> 
> This is less likely, because UART applications usually need some kind of
> flow control (either from hardware or protocol side). For a non-console
> application the receiver usually waits until the end and then starts to
> transmit.
>
> Sure you can flood the UART with characters and it's only a question of
> time until the RX FIFO is full and data gets lost. But I think we should
> focus on the real use case and not try to find the perfect solution. In
> the end it's always a compromise between latency and throughput.

If you enable DMA on the UART then you are extremely unlikely to hit
an overflow. To some degree the DMA can be seen as an "extended" RX FIFO.

Unfortunately DMA cannot be used for the imx console UART.

> > That makes me wonder if the error doesn't relate to the UART being a
> > console port, but the UART being used without DMA?! (So the patch above
> > fixes the problem for you because on the console port no DMA is used?)
> 
> As I said, the issue only occurred on the console. My problem is that the
> other UARTs on Tarragon are used for RS485, which means they are just
> half duplex.
> 
> According to these lines in imx.c DMA is never used for console:
> 
>    /* Can we enable the DMA support? */
>    if (!uart_console(port) && imx_uart_dma_init(sport) == 0)
>      dma_is_inited = 1;
> 
> In the end the patch above only restores the old console behavior, but
> keeps Tomasz Moń's optimization for non-console ports (which is what it
> was intended for).

Setting RXTL to 1 essentially raises the irq a bit earlier, i.e. when
the RX FIFO can still hold 31 more characters. With RXTL set to 8 and a
data burst, the irq is raised when the RX FIFO can hold 24 more
characters. Therefore with RXTL set to 1 (instead of 8) the maximum
acceptable RX interrupt latency (i.e. before you start losing incoming
characters) is 7 character times longer.
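
The arithmetic above can be sketched as follows. This is a rough
illustration, not driver code: the 32-entry FIFO depth is implied by the
numbers above, the 10 bits per character correspond to the 8N1 setting
from the original report, and all function names are hypothetical.

```c
#include <assert.h>

#define RX_FIFO_DEPTH 32	/* implied above: RXTL = 1 leaves room for 31 more */

/* Characters that can still arrive after the RX irq fires without an overrun. */
static unsigned int rx_headroom_chars(unsigned int rxtl)
{
	assert(rxtl >= 1 && rxtl <= RX_FIFO_DEPTH);
	return RX_FIFO_DEPTH - rxtl;
}

/* Wire time of one character in microseconds (start + data + stop bits). */
static double char_time_us(unsigned int baud, unsigned int bits_per_char)
{
	return 1e6 * bits_per_char / baud;
}

/* Longest irq latency the FIFO can absorb during a continuous 8N1 burst. */
static double rx_latency_budget_us(unsigned int rxtl, unsigned int baud)
{
	return rx_headroom_chars(rxtl) * char_time_us(baud, 10);
}
```

At 115200 baud one 8N1 character takes about 86.8 us, so the budget is
roughly 2.69 ms with RXTL = 1 and 2.08 ms with RXTL = 8, a difference of
about 0.6 ms.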

Best Regards,
Tomasz Moń

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-25 15:11           ` Uwe Kleine-König
@ 2023-03-27 14:42             ` Stefan Wahren
  -1 siblings, 0 replies; 90+ messages in thread
From: Stefan Wahren @ 2023-03-27 14:42 UTC (permalink / raw)
  To: Uwe Kleine-König, Sergey Organov
  Cc: Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial,
	Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team,
	Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń,
	Linux ARM

Hi,

Am 25.03.23 um 16:11 schrieb Uwe Kleine-König:
> Hello,
> 
> On Sat, Mar 25, 2023 at 12:31:01PM +0100, Stefan Wahren wrote:
>> Am 24.03.23 um 16:00 schrieb Stefan Wahren:
>>> Am 24.03.23 um 13:57 schrieb Fabio Estevam:
>>>> On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen
>>>> <ilpo.jarvinen@linux.intel.com> wrote:
>>>>
>>>>> This has come up earlier, see e.g.:
>>>>>
>>>>> https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/
>>>>>
>>>>> My somewhat uninformed suggestion: if the overrun problems
>>>>> mostly show up
>>>>> with console ports, maybe the trigger level could depend on the port
>>>>> being a console or not?
>>>> Does the change below help? Taking Ilpo's suggestion into account:
>>> this breaks the boot / debug console completely, but i got the idea.
>>>>
>>
>> based on your patch, i successfully tested this:
>>
>> diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
>> index f07c4f9ff13c..1aacaa637ede 100644
>> --- a/drivers/tty/serial/imx.c
>> +++ b/drivers/tty/serial/imx.c
>> @@ -1277,6 +1277,7 @@ static void imx_uart_clear_rx_errors(struct imx_port
>> *sport)
>>   }
>>
>>   #define TXTL_DEFAULT 2 /* reset default */
>> +#define RXTL_DEFAULT_CONSOLE 1 /* reset default */
>>   #define RXTL_DEFAULT 8 /* 8 characters or aging timer */
>>   #define TXTL_DMA 8 /* DMA burst setting */
>>   #define RXTL_DMA 9 /* DMA burst setting */
>> @@ -1286,6 +1287,9 @@ static void imx_uart_setup_ufcr(struct imx_port
>> *sport,
>>   {
>>   	unsigned int val;
>>
>> +	if (uart_console(&sport->port))
>> +		rxwl = RXTL_DEFAULT_CONSOLE; // fallback
>> +
>>   	/* set receiver / transmitter trigger level */
>>   	val = imx_uart_readl(sport, UFCR) & (UFCR_RFDIV | UFCR_DCEDTE);
>>   	val |= txwl << UFCR_TXTL_SHF | rxwl;
> 
> So the current theory is that the issue occurs because of a combination of:
> 
>   - With a higher watermark value the irq triggers later and so there is
>     less time until the ISR must run before an overflow happens; and
> 
>   - serial console activity disables irqs for a (relatively) long time
> 
> right?
> 
> So on an UP system the problem should occur also on a non-console port?
> Local irqs are only disabled if some printk is about to be emitted,
> isn't it? Does this match the error you're seeing?
> 
> That makes me wonder if the error doesn't relate to the UART being a
> console port, but the UART being used without DMA?! (So the patch above
> fixes the problem for you because on the console port no DMA is used?)

Today I had time to do some testing. First I tested with different
RXTL_DEFAULT values:

1 No overrun
2 No overrun
4 No overrun
8 Overruns
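
The message does not say how the overruns were observed. One possible
way to check from user space, sketched here, is the Linux TIOCGICOUNT
ioctl, which returns the serial core's cumulative error counters.
read_rx_overruns() is a hypothetical helper, and the device path
mentioned afterwards is an assumption about this board.

```c
#include <fcntl.h>
#include <linux/serial.h>	/* struct serial_icounter_struct */
#include <sys/ioctl.h>		/* ioctl(), TIOCGICOUNT */
#include <unistd.h>

/*
 * Read the cumulative RX overrun count of a serial port.
 * Returns 0 and stores the count in *overruns on success, -1 on failure
 * (for example when the path is not a serial device).
 */
static int read_rx_overruns(const char *path, int *overruns)
{
	struct serial_icounter_struct ic;
	int fd = open(path, O_RDONLY | O_NOCTTY | O_NONBLOCK);

	if (fd < 0)
		return -1;
	if (ioctl(fd, TIOCGICOUNT, &ic) < 0) {
		close(fd);
		return -1;
	}
	close(fd);
	*overruns = ic.overrun;
	return 0;
}
```

Calling it on the console device (for instance /dev/ttymxc0, if that is
the debug UART) before and after pasting text would show whether the
counter increased for a given RXTL_DEFAULT value.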

After that I looked at the # echo 0 > /proc/sys/kernel/printk approach,
but this didn't change anything. The kernel is usually silent about log
messages after boot and the console still works with echo. Forcing some
driver to call printk periodically would make the console unusable.

Finally I tried to disable the spin_lock in imx_uart_console_write:

diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index f07c4f9ff13c..c342559ff1a2 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -2007,14 +2007,12 @@ imx_uart_console_write(struct console *co, const 
char *s, unsigned int count)
  	struct imx_port_ucrs old_ucr;
  	unsigned long flags;
  	unsigned int ucr1;
-	int locked = 1;
+	int locked = 0;

  	if (sport->port.sysrq)
  		locked = 0;
  	else if (oops_in_progress)
  		locked = spin_trylock_irqsave(&sport->port.lock, flags);
-	else
-		spin_lock_irqsave(&sport->port.lock, flags);

  	/*
  	 *	First, save UCR1/2/3 and then disable interrupts

But the overruns still occurred. Is this because the serial core already
holds a lock?

> 
> Best regards
> Uwe
> 

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-27 14:42             ` Stefan Wahren
@ 2023-03-27 15:11               ` Sergey Organov
  -1 siblings, 0 replies; 90+ messages in thread
From: Sergey Organov @ 2023-03-27 15:11 UTC (permalink / raw)
  To: Stefan Wahren
  Cc: Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen,
	Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer,
	NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby,
	Tomasz Moń,
	Linux ARM

Stefan Wahren <stefan.wahren@i2se.com> writes:

> Hi,
>
> Am 25.03.23 um 16:11 schrieb Uwe Kleine-König:

[...]

> Today I had time to do some testing. First I tested with different RXTL_DEFAULT values.
>
> 1 No overrun
> 2 No overrun
> 4 No overrun
> 8 Overruns
>
> After that I looked at the # echo 0 > /proc/sys/kernel/printk approach,
> but this didn't change anything. The kernel is usually silent about
> log messages after boot and the console still works with echo.
> Forcing some driver to call printk periodically would make the
> console unusable.

Since you have established that printk() is not the cause, something else
must be causing the overruns, so there is no need to check the printk case
further.

>
> Finally I tried to disable the spin_lock in imx_uart_console_write:
>
> diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
> index f07c4f9ff13c..c342559ff1a2 100644
> --- a/drivers/tty/serial/imx.c
> +++ b/drivers/tty/serial/imx.c
> @@ -2007,14 +2007,12 @@ imx_uart_console_write(struct console *co, const char *s, unsigned int count)
>  	struct imx_port_ucrs old_ucr;
>  	unsigned long flags;
>  	unsigned int ucr1;
> -	int locked = 1;
> +	int locked = 0;
>
>  	if (sport->port.sysrq)
>  		locked = 0;
>  	else if (oops_in_progress)
>  		locked = spin_trylock_irqsave(&sport->port.lock, flags);
> -	else
> -		spin_lock_irqsave(&sport->port.lock, flags);
>
>  	/*
>  	 *	First, save UCR1/2/3 and then disable interrupts
>
> But the overruns still occurred. Is this because the serial core
> already holds a lock?

This function probably isn't even called when there is no printk() output,
as user-space writes to /dev/console go through the regular generic code
path, AFAIK.

Best regards,
-- Sergey Organov

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-27 15:11               ` Sergey Organov
@ 2023-03-27 15:30                 ` Russell King (Oracle)
  -1 siblings, 0 replies; 90+ messages in thread
From: Russell King (Oracle) @ 2023-03-27 15:30 UTC (permalink / raw)
  To: Sergey Organov
  Cc: Stefan Wahren, Uwe Kleine-König, Fabio Estevam,
	Ilpo Järvinen, Stefan Wahren, linux-serial,
	Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team,
	Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń,
	Linux ARM

On Mon, Mar 27, 2023 at 06:11:12PM +0300, Sergey Organov wrote:
> Stefan Wahren <stefan.wahren@i2se.com> writes:
> 
> > Hi,
> >
> > Am 25.03.23 um 16:11 schrieb Uwe Kleine-König:
> 
> [...]
> 
> > Today I had time to do some testing. First I tested with different RXTL_DEFAULT values.
> >
> > 1 No overrun
> > 2 No overrun
> > 4 No overrun
> > 8 Overruns
> >
> > After that I looked at the # echo 0 > /proc/sys/kernel/printk approach,
> > but this didn't change anything. The kernel is usually silent about
> > log messages after boot and the console still works with echo.
> > Forcing some driver to call printk periodically would make the
> > console unusable.
> 
> Since you have established that printk() is not the cause, something else
> must be causing the overruns, so there is no need to check the printk case
> further.
> 
> >
> > Finally I tried to disable the spin_lock in imx_uart_console_write:
> >
> > diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
> > index f07c4f9ff13c..c342559ff1a2 100644
> > --- a/drivers/tty/serial/imx.c
> > +++ b/drivers/tty/serial/imx.c
> > @@ -2007,14 +2007,12 @@ imx_uart_console_write(struct console *co, const char *s, unsigned int count)
> >  	struct imx_port_ucrs old_ucr;
> >  	unsigned long flags;
> >  	unsigned int ucr1;
> > -	int locked = 1;
> > +	int locked = 0;
> >
> >  	if (sport->port.sysrq)
> >  		locked = 0;
> >  	else if (oops_in_progress)
> >  		locked = spin_trylock_irqsave(&sport->port.lock, flags);
> > -	else
> > -		spin_lock_irqsave(&sport->port.lock, flags);
> >
> >  	/*
> >  	 *	First, save UCR1/2/3 and then disable interrupts
> >
> > But the overruns still occurred. Is this because the serial core
> > already holds a lock?
> 
> This function probably isn't even called when there is no printk() output,
> as user-space writes to /dev/console go through the regular generic code
> path, AFAIK.

Correct on both points.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-27 15:11               ` Sergey Organov
@ 2023-04-16 13:43                 ` Stefan Wahren
  -1 siblings, 0 replies; 90+ messages in thread
From: Stefan Wahren @ 2023-04-16 13:43 UTC (permalink / raw)
  To: Sergey Organov
  Cc: Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen,
	Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer,
	NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby,
	Tomasz Moń,
	Linux ARM

Hi Sergey,

Am 27.03.23 um 17:11 schrieb Sergey Organov:
> Stefan Wahren <stefan.wahren@i2se.com> writes:
> 
>> Hi,
>>
>> Am 25.03.23 um 16:11 schrieb Uwe Kleine-König:
> 
> [...]
> 
>> Today I had time to do some testing. First I tested with different RXTL_DEFAULT values.
>>
>> 1 No overrun
>> 2 No overrun
>> 4 No overrun
>> 8 Overruns
>>
>> After that I looked at the # echo 0 > /proc/sys/kernel/printk approach,
>> but this didn't change anything. The kernel is usually silent about
>> log messages after boot and the console still works with echo.
>> Forcing some driver to call printk periodically would make the
>> console unusable.
> 
> Since you have established that printk() is not the cause, something else
> must be causing the overruns, so there is no need to check the printk case
> further.
> 
>>
>> Finally I tried to disable the spin_lock in imx_uart_console_write:
>>
>> diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
>> index f07c4f9ff13c..c342559ff1a2 100644
>> --- a/drivers/tty/serial/imx.c
>> +++ b/drivers/tty/serial/imx.c
>> @@ -2007,14 +2007,12 @@ imx_uart_console_write(struct console *co, const char *s, unsigned int count)
>>   	struct imx_port_ucrs old_ucr;
>>   	unsigned long flags;
>>   	unsigned int ucr1;
>> -	int locked = 1;
>> +	int locked = 0;
>>
>>   	if (sport->port.sysrq)
>>   		locked = 0;
>>   	else if (oops_in_progress)
>>   		locked = spin_trylock_irqsave(&sport->port.lock, flags);
>> -	else
>> -		spin_lock_irqsave(&sport->port.lock, flags);
>>
>>   	/*
>>   	 *	First, save UCR1/2/3 and then disable interrupts
>>
>> But the overruns still occured. Is this because the serial core
>> already helds a lock?
> 
> This probably isn't even called when there is no printk() output, as
> user-space writes to /dev/console are rather performed through regular
> generic code, AFAIK.

I had some time today to investigate this a little. I thought it 
would be a good idea to use debugfs as an ugly quick hack:

diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index b8c817d26b00..d5bde4754004 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -30,6 +30,7 @@
  #include <linux/dma-mapping.h>

  #include <asm/irq.h>
+#include <linux/debugfs.h>
  #include <linux/dma/imx-dma.h>

  #include "serial_mctrl_gpio.h"
@@ -237,8 +238,19 @@ struct imx_port {
  	enum imx_tx_state	tx_state;
  	struct hrtimer		trigger_start_tx;
  	struct hrtimer		trigger_stop_tx;
+
+	struct dentry		*debugfs_dir;
+
+	/* stats exposed through debugfs */
+	s64			total_duration_us;
+	s64			rx_duration_us;
+	s64			tx_duration_us;
+	u32			received;
+	u32			send;
  };

+static struct dentry *imx_debugfs_root;
+
  struct imx_port_ucrs {
  	unsigned int	ucr1;
  	unsigned int	ucr2;
@@ -536,12 +548,15 @@ static void imx_uart_dma_tx(struct imx_port *sport);
  static inline void imx_uart_transmit_buffer(struct imx_port *sport)
  {
  	struct circ_buf *xmit = &sport->port.state->xmit;
+	u32 send = 0;

  	if (sport->port.x_char) {
  		/* Send next char */
  		imx_uart_writel(sport, sport->port.x_char, URTX0);
  		sport->port.icount.tx++;
  		sport->port.x_char = 0;
+		if (sport->send == 0)
+			sport->send = 1;
  		return;
  	}

@@ -576,8 +591,12 @@ static inline void imx_uart_transmit_buffer(struct imx_port *sport)
  		imx_uart_writel(sport, xmit->buf[xmit->tail], URTX0);
  		xmit->tail = (xmit->tail + 1) & (UART_XMIT_SIZE - 1);
  		sport->port.icount.tx++;
+		send++;
  	}

+	if (send > sport->send)
+		sport->send = send;
+
  	if (uart_circ_chars_pending(xmit) < WAKEUP_CHARS)
  		uart_write_wakeup(&sport->port);

@@ -808,6 +827,7 @@ static irqreturn_t __imx_uart_rxint(int irq, void *dev_id)
  {
  	struct imx_port *sport = dev_id;
  	unsigned int rx, flg, ignored = 0;
+	u32 received = 0;
  	struct tty_port *port = &sport->port.state->port;

  	while (imx_uart_readl(sport, USR2) & USR2_RDR) {
@@ -815,6 +835,7 @@ static irqreturn_t __imx_uart_rxint(int irq, void *dev_id)

  		flg = TTY_NORMAL;
  		sport->port.icount.rx++;
+		received++;

  		rx = imx_uart_readl(sport, URXD0);

@@ -868,6 +889,9 @@ static irqreturn_t __imx_uart_rxint(int irq, void *dev_id)
  out:
  	tty_flip_buffer_push(port);

+	if (received > sport->received)
+		sport->received = received;
+
  	return IRQ_HANDLED;
  }

@@ -942,6 +966,8 @@ static irqreturn_t imx_uart_int(int irq, void *dev_id)
  	struct imx_port *sport = dev_id;
  	unsigned int usr1, usr2, ucr1, ucr2, ucr3, ucr4;
  	irqreturn_t ret = IRQ_NONE;
+	ktime_t total_start = ktime_get();
+	s64 total_duration_us, rx_duration_us, tx_duration_us;

  	spin_lock(&sport->port.lock);

@@ -978,14 +1004,24 @@ static irqreturn_t imx_uart_int(int irq, void *dev_id)
  		usr2 &= ~USR2_ORE;

  	if (usr1 & (USR1_RRDY | USR1_AGTIM)) {
+		ktime_t rx_start = ktime_get();
  		imx_uart_writel(sport, USR1_AGTIM, USR1);

  		__imx_uart_rxint(irq, dev_id);
+		rx_duration_us = ktime_us_delta(ktime_get(), rx_start);
+		if (rx_duration_us > sport->rx_duration_us)
+			sport->rx_duration_us = rx_duration_us;
+
  		ret = IRQ_HANDLED;
  	}

  	if ((usr1 & USR1_TRDY) || (usr2 & USR2_TXDC)) {
+		ktime_t tx_start = ktime_get();
  		imx_uart_transmit_buffer(sport);
+		tx_duration_us = ktime_us_delta(ktime_get(), tx_start);
+		if (tx_duration_us > sport->tx_duration_us)
+			sport->tx_duration_us = tx_duration_us;
+
  		ret = IRQ_HANDLED;
  	}

@@ -1015,6 +1051,10 @@ static irqreturn_t imx_uart_int(int irq, void *dev_id)

  	spin_unlock(&sport->port.lock);

+	total_duration_us = ktime_us_delta(ktime_get(), total_start);
+	if (total_duration_us > sport->total_duration_us)
+		sport->total_duration_us = total_duration_us;
+
  	return ret;
  }

@@ -2233,6 +2273,26 @@ static const struct serial_rs485 imx_rs485_supported = {
  #define RX_DMA_PERIODS		16
  #define RX_DMA_PERIOD_LEN	(PAGE_SIZE / 4)

+static int debugfs_stats_show(struct seq_file *s, void *unused)
+{
+	struct imx_port *sport = s->private;
+
+	seq_printf(s, "total_duration_us:\t%lld\n", sport->total_duration_us);
+	seq_printf(s, "rx_duration_us:\t%lld\n", sport->rx_duration_us);
+	seq_printf(s, "tx_duration_us:\t%lld\n", sport->tx_duration_us);
+	seq_printf(s, "received:\t\t%u\n", sport->received);
+	seq_printf(s, "send:\t\t%u\n", sport->send);
+	return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(debugfs_stats);
+
+static void imx_init_debugfs(struct imx_port *sport, const char *device)
+{
+	sport->debugfs_dir = debugfs_create_dir(device, imx_debugfs_root);
+	debugfs_create_file("stats", 0444, sport->debugfs_dir, sport,
+			    &debugfs_stats_fops);
+}
+
  static int imx_uart_probe(struct platform_device *pdev)
  {
  	struct device_node *np = pdev->dev.of_node;
@@ -2485,6 +2545,7 @@ static int imx_uart_probe(struct platform_device *pdev)
  	imx_uart_ports[sport->port.line] = sport;

  	platform_set_drvdata(pdev, sport);
+	imx_init_debugfs(sport, dev_name(&pdev->dev));

  	return uart_add_one_port(&imx_uart_uart_driver, &sport->port);
  }
@@ -2678,9 +2739,14 @@ static int __init imx_uart_init(void)
  	if (ret)
  		return ret;

+	imx_debugfs_root = debugfs_create_dir(
+		imx_uart_platform_driver.driver.name, NULL);
+
  	ret = platform_driver_register(&imx_uart_platform_driver);
-	if (ret != 0)
+	if (ret != 0) {
+		debugfs_remove_recursive(imx_debugfs_root);
  		uart_unregister_driver(&imx_uart_uart_driver);
+	}

  	return ret;
  }
@@ -2688,6 +2754,7 @@ static int __init imx_uart_init(void)
  static void __exit imx_uart_exit(void)
  {
  	platform_driver_unregister(&imx_uart_platform_driver);
+	debugfs_remove_recursive(imx_debugfs_root);
  	uart_unregister_driver(&imx_uart_uart_driver);
  }

Using this I was able to better compare the behavior with RXTL_DEFAULT 1 
(without overrun errors) and RXTL_DEFAULT 8 (with overrun errors) on my 
i.MX6ULL test platform. After running my usual test scenario (pasting 
some text lines into the console) I got the following results:

RXTL_DEFAULT 1
21f0000.serial/stats:total_duration_us: 61
21f0000.serial/stats:rx_duration_us:    36
21f0000.serial/stats:tx_duration_us:    48
21f0000.serial/stats:received:          28
21f0000.serial/stats:send:              33

RXTL_DEFAULT 8
21f0000.serial/stats:total_duration_us: 78
21f0000.serial/stats:rx_duration_us:    46
21f0000.serial/stats:tx_duration_us:    47
21f0000.serial/stats:received:          33
21f0000.serial/stats:send:              33

So, based on the maximum number of characters received per RX interrupt, 
I think the root cause of this issue was already present before the 
change, because that number is close to the FIFO size (32 chars) even 
with RXTL_DEFAULT 1. Increasing RXTL_DEFAULT just makes the situation 
worse by adding enough latency to cause overrun errors.
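
To put rough numbers on that, here is a small sketch (not part of the patch above) of the latency budget left once the RX interrupt has been raised. It assumes the interrupt fires as soon as the FIFO holds RXTL characters, a 32-entry FIFO, and 10 bits on the wire per character for 8N1. The helper name rx_headroom_us() is made up for illustration.

```c
#include <stdint.h>

/*
 * Time in microseconds until the RX FIFO is full, counted from the
 * moment the RX interrupt is raised at 'rxtl' characters, assuming a
 * continuous input stream at 'baud' with 'bits_per_char' bits on the
 * wire per character (10 for 8N1: start + 8 data + stop).
 * Hypothetical helper, for illustration only.
 */
static uint32_t rx_headroom_us(uint32_t fifo_depth, uint32_t rxtl,
			       uint32_t baud, uint32_t bits_per_char)
{
	uint64_t free_slots;

	if (baud == 0 || rxtl >= fifo_depth)
		return 0;

	free_slots = fifo_depth - rxtl;

	return (uint32_t)(free_slots * bits_per_char * 1000000ULL / baud);
}
```

Under these assumptions, rx_headroom_us(32, 1, 115200, 10) gives about 2690 µs and rx_headroom_us(32, 8, 115200, 10) about 2083 µs, so RXTL_DEFAULT 8 leaves roughly 0.6 ms less slack before an overrun.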

Best regards

>  
> Best regards,
> -- Sergey Organov

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-04-16 13:43                 ` Stefan Wahren
@ 2023-04-17 16:50                   ` Sergey Organov
  -1 siblings, 0 replies; 90+ messages in thread
From: Sergey Organov @ 2023-04-17 16:50 UTC (permalink / raw)
  To: Stefan Wahren
  Cc: Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen,
	Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer,
	NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby,
	Tomasz Moń,
	Linux ARM

Hi Stefan,

Stefan Wahren <stefan.wahren@i2se.com> writes:

> Hi Sergey,
>

[...]

> i had some time today to investigate this a little bit. I thought it
> would be a good idea to use debugfs as a ugly quick hack:
>

[...]

> Using this i was able to better compare the behavior with RXTL_DEFAULT
> 1 (without overrun errors) and RXTL_DEFAULT 8 (with overrun errors) on
> my i.MX6ULL test platform. After doing my usual test scenario (copy
> some text lines to console) i got the following results:
>
> RXTL_DEFAULT 1
> 21f0000.serial/stats:total_duration_us: 61
> 21f0000.serial/stats:rx_duration_us:    36
> 21f0000.serial/stats:tx_duration_us:    48
> 21f0000.serial/stats:received:          28
> 21f0000.serial/stats:send:              33
>
> RXTL_DEFAULT 8
> 21f0000.serial/stats:total_duration_us: 78
> 21f0000.serial/stats:rx_duration_us:    46
> 21f0000.serial/stats:tx_duration_us:    47
> 21f0000.serial/stats:received:          33
> 21f0000.serial/stats:send:              33
>
> So based on the maximum of received characters on RX interrupt, i
> consider the root cause of this issue has already been there because
> the amount is near to the maximum of the FIFO (32 chars). So finally
> increasing RXTL_DEFAULT makes the situation even worse by adding
> enough latency for overrun errors.

Yep, looks like an issue.

What's the baud rate? 115200? If so, it means that interrupts are
apparently blocked in your system for up to about 28/(115200/10)=2.4
milliseconds. This is a very large number, and it may negatively affect
system performance in other places as well, I'm afraid.
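
The estimate above can be reproduced with a short sketch. It assumes 10 bits on the wire per character (8N1) and a continuous input stream. The function name burst_duration_us() is invented for this example.

```c
#include <stdint.h>

/*
 * Time in microseconds needed to receive 'chars' characters back to
 * back at 'baud', with 'bits_per_char' bits on the wire per character.
 * If that many characters are found in the FIFO by a single interrupt
 * handler run, the handler was delayed by roughly this long.
 * Hypothetical helper, for illustration only.
 */
static uint32_t burst_duration_us(uint32_t chars, uint32_t baud,
				  uint32_t bits_per_char)
{
	if (baud == 0)
		return 0;

	return (uint32_t)((uint64_t)chars * bits_per_char * 1000000ULL / baud);
}
```

For 28 characters at 115200 baud this yields about 2430 µs, matching the 2.4 ms figure.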

Best regards,
-- Sergey Organov

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-04-17 16:50                   ` Sergey Organov
@ 2023-04-17 18:40                     ` Stefan Wahren
  -1 siblings, 0 replies; 90+ messages in thread
From: Stefan Wahren @ 2023-04-17 18:40 UTC (permalink / raw)
  To: Sergey Organov
  Cc: Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen,
	Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer,
	NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby,
	Tomasz Moń,
	Linux ARM

Am 17.04.23 um 18:50 schrieb Sergey Organov:
> Hi Stefan,
> 
> Stefan Wahren <stefan.wahren@i2se.com> writes:
> 
>> Hi Sergey,
>>
> 
> [...]
> 
>> i had some time today to investigate this a little bit. I thought it
>> would be a good idea to use debugfs as a ugly quick hack:
>>
> 
> [...]
> 
>> Using this i was able to better compare the behavior with RXTL_DEFAULT
>> 1 (without overrun errors) and RXTL_DEFAULT 8 (with overrun errors) on
>> my i.MX6ULL test platform. After doing my usual test scenario (copy
>> some text lines to console) i got the following results:
>>
>> RXTL_DEFAULT 1
>> 21f0000.serial/stats:total_duration_us: 61
>> 21f0000.serial/stats:rx_duration_us:    36
>> 21f0000.serial/stats:tx_duration_us:    48
>> 21f0000.serial/stats:received:          28
>> 21f0000.serial/stats:send:              33
>>
>> RXTL_DEFAULT 8
>> 21f0000.serial/stats:total_duration_us: 78
>> 21f0000.serial/stats:rx_duration_us:    46
>> 21f0000.serial/stats:tx_duration_us:    47
>> 21f0000.serial/stats:received:          33
>> 21f0000.serial/stats:send:              33
>>
>> So based on the maximum of received characters on RX interrupt, i
>> consider the root cause of this issue has already been there because
>> the amount is near to the maximum of the FIFO (32 chars). So finally
>> increasing RXTL_DEFAULT makes the situation even worse by adding
>> enough latency for overrun errors.
> 
> Yep, looks like an issue.
> 
> What's the baud rate? 115200?

Correct

> If so, it means that interrupts are
> apparently blocked in your system for up to about 28/(115200/10)=2.4
> milliseconds. This is very large number, and it may negatively affect
> system performance in other places as well, I'm afraid.
> 
> Best regards,
> -- Sergey Organov

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-04-17 16:50                   ` Sergey Organov
@ 2023-04-18 16:16                     ` Stefan Wahren
  -1 siblings, 0 replies; 90+ messages in thread
From: Stefan Wahren @ 2023-04-18 16:16 UTC (permalink / raw)
  To: Sergey Organov
  Cc: Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen,
	Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer,
	NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby,
	Tomasz Moń,
	Linux ARM

Hi Sergey,

Am 17.04.23 um 18:50 schrieb Sergey Organov:
> Hi Stefan,
> 
> Stefan Wahren <stefan.wahren@i2se.com> writes:
> 
>> Hi Sergey,
>>
> 
> [...]
> 
>> i had some time today to investigate this a little bit. I thought it
>> would be a good idea to use debugfs as a ugly quick hack:
>>
> 
> [...]
> 
>> Using this i was able to better compare the behavior with RXTL_DEFAULT
>> 1 (without overrun errors) and RXTL_DEFAULT 8 (with overrun errors) on
>> my i.MX6ULL test platform. After doing my usual test scenario (copy
>> some text lines to console) i got the following results:
>>
>> RXTL_DEFAULT 1
>> 21f0000.serial/stats:total_duration_us: 61
>> 21f0000.serial/stats:rx_duration_us:    36
>> 21f0000.serial/stats:tx_duration_us:    48
>> 21f0000.serial/stats:received:          28
>> 21f0000.serial/stats:send:              33
>>
>> RXTL_DEFAULT 8
>> 21f0000.serial/stats:total_duration_us: 78
>> 21f0000.serial/stats:rx_duration_us:    46
>> 21f0000.serial/stats:tx_duration_us:    47
>> 21f0000.serial/stats:received:          33
>> 21f0000.serial/stats:send:              33
>>
>> So based on the maximum of received characters on RX interrupt, i
>> consider the root cause of this issue has already been there because
>> the amount is near to the maximum of the FIFO (32 chars). So finally
>> increasing RXTL_DEFAULT makes the situation even worse by adding
>> enough latency for overrun errors.
> 
> Yep, looks like an issue.
> 
> What's the baud rate? 115200? If so, it means that interrupts are
> apparently blocked in your system for up to about 28/(115200/10)=2.4
> milliseconds. This is very large number, and it may negatively affect
> system performance in other places as well, I'm afraid.

I forgot to mention that I also measured the time spent between 
printk_safe_enter_irqsave() and printk_safe_exit_irqrestore() in 
console_emit_next_record(), which had a maximum of 24721 µs. But 
commenting out these calls didn't fix the problem. This path seems to be 
used only by printk.

Best regards

> 
> Best regards,
> -- Sergey Organov

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-04-18 16:16                     ` Stefan Wahren
@ 2023-05-22  9:25                       ` Linux regression tracking (Thorsten Leemhuis)
  -1 siblings, 0 replies; 90+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-05-22  9:25 UTC (permalink / raw)
  To: Stefan Wahren, Sergey Organov
  Cc: Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen,
	Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer,
	NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby,
	Tomasz Moń,
	Linux ARM, Linux kernel regressions list

Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.

Stefan, was this regression ever solved? It doesn't look like it, but
maybe I'm missing something.

If it wasn't solved: what needs to be done to get this rolling again?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

On 18.04.23 18:16, Stefan Wahren wrote:
> Hi Sergey,
> 
> Am 17.04.23 um 18:50 schrieb Sergey Organov:
>> Hi Stefan,
>>
>> Stefan Wahren <stefan.wahren@i2se.com> writes:
>>
>>> Hi Sergey,
>>>
>>
>> [...]
>>
>>> i had some time today to investigate this a little bit. I thought it
>>> would be a good idea to use debugfs as a ugly quick hack:
>>>
>>
>> [...]
>>
>>> Using this i was able to better compare the behavior with RXTL_DEFAULT
>>> 1 (without overrun errors) and RXTL_DEFAULT 8 (with overrun errors) on
>>> my i.MX6ULL test platform. After doing my usual test scenario (copy
>>> some text lines to console) i got the following results:
>>>
>>> RXTL_DEFAULT 1
>>> 21f0000.serial/stats:total_duration_us: 61
>>> 21f0000.serial/stats:rx_duration_us:    36
>>> 21f0000.serial/stats:tx_duration_us:    48
>>> 21f0000.serial/stats:received:          28
>>> 21f0000.serial/stats:send:              33
>>>
>>> RXTL_DEFAULT 8
>>> 21f0000.serial/stats:total_duration_us: 78
>>> 21f0000.serial/stats:rx_duration_us:    46
>>> 21f0000.serial/stats:tx_duration_us:    47
>>> 21f0000.serial/stats:received:          33
>>> 21f0000.serial/stats:send:              33
>>>
>>> So based on the maximum of received characters on RX interrupt, i
>>> consider the root cause of this issue has already been there because
>>> the amount is near to the maximum of the FIFO (32 chars). So finally
>>> increasing RXTL_DEFAULT makes the situation even worse by adding
>>> enough latency for overrun errors.
>>
>> Yep, looks like an issue.
>>
>> What's the baud rate? 115200? If so, it means that interrupts are
>> apparently blocked in your system for up to about 28/(115200/10)=2.4
>> milliseconds. This is very large number, and it may negatively affect
>> system performance in other places as well, I'm afraid.
> 
> i forgot to mention that i also measured the time around
> printk_safe_(enter|exit)_irqsave in console_emit_next_record() which had
> a maximum of 24721 µs. But uncommenting these functions doesn't fixed
> the problem. This seems to be used only by printk.
> 
> Best regards
> 
>>
>> Best regards,
>> -- Sergey Organov

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-05-22  9:25                       ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-05-23 15:12                         ` Stefan Wahren
  -1 siblings, 0 replies; 90+ messages in thread
From: Stefan Wahren @ 2023-05-23 15:12 UTC (permalink / raw)
  To: Linux regressions mailing list
  Cc: Uwe Kleine-König, Sergey Organov, Fabio Estevam,
	Ilpo Järvinen, Stefan Wahren, linux-serial,
	Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team,
	Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń,
	Linux ARM

Hi Thorsten,

Am 22.05.23 um 11:25 schrieb Linux regression tracking (Thorsten Leemhuis):
> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> for once, to make this easily accessible to everyone.
> 
> Stefan, was this regression ever solved? It doesn't look like it, but
> maybe I'm missing something.
> 
> If it wasn't solved: what needs to be done to get this rolling again?

Thanks for the reminder. From a user's point of view this issue hasn't
been fixed so far. For our product we just reverted the commit in a
downstream repo.

From my understanding there was already an issue there, and the
optimizing commit by Tomasz just made the situation worse. Unfortunately
my time budget to investigate this issue further is exhausted, so I
stopped working on this.

In case someone can give clear instructions to investigate this further,
I will try to look at it in my spare time. But I cannot make any promises.

Best regards

> 
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.
> 
> #regzbot poke
> 

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-05-22  9:25                       ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-05-23 19:44                         ` Sergey Organov
  -1 siblings, 0 replies; 90+ messages in thread
From: Sergey Organov @ 2023-05-23 19:44 UTC (permalink / raw)
  To: Linux regression tracking (Thorsten Leemhuis)
  Cc: Stefan Wahren, Linux regressions mailing list,
	Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen,
	Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer,
	NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby,
	Tomasz Moń,
	Linux ARM

"Linux regression tracking (Thorsten Leemhuis)"
<regressions@leemhuis.info> writes:

> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> for once, to make this easily accessible to everyone.
>
> Stefan, was this regression ever solved? It doesn't look like it, but
> maybe I'm missing something.
>
> If it wasn't solved: what needs to be done to get this rolling again?

Hi Thorsten,

Not Stefan, but as far as I can tell, the problem is that on Stefan's
build the kernel has rather long periods with interrupts disabled,
so any attempt to decrease the UART IRQ frequency by raising the FIFO IRQ
threshold causes a "regression" that manifests itself as missing
characters on receive. I'm not sure that tuning the FIFO level is in
fact a regression in this case.

Solving this would require identifying the cause of interrupts being
disabled for prolonged periods, and nobody has volunteered to investigate
this further. One suspect, the Linux serial console, has likely been
excluded already though, as it is not actually in use for printk() output.

-- 
Sergey Organov

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-05-23 19:44                         ` Sergey Organov
@ 2023-05-24 10:48                           ` Thorsten Leemhuis
  -1 siblings, 0 replies; 90+ messages in thread
From: Thorsten Leemhuis @ 2023-05-24 10:48 UTC (permalink / raw)
  To: Sergey Organov
  Cc: Stefan Wahren, Linux regressions mailing list,
	Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen,
	Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer,
	NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby,
	Tomasz Moń,
	Linux ARM

On 23.05.23 21:44, Sergey Organov wrote:
> "Linux regression tracking (Thorsten Leemhuis)"
> <regressions@leemhuis.info> writes:
> 
>> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
>> for once, to make this easily accessible to everyone.
>>
>> Stefan, was this regression ever solved? It doesn't look like it, but
>> maybe I'm missing something.
>>
>> If it wasn't solved: what needs to be done to get this rolling again?
> 
> Not Stefan,

Thx to both you and Stefan for the update.

> but as far as I can tell, the problem is that on Stefan's
> build the kernel has rather large periods of interrupts being disabled,
> so any attempt to decrease IRQs frequency from UART by raising FIFO IRQ
> threshold causes "regression" that manifests itself as missing
> characters on receive. I'm not sure if it's tuning FIFO level that is in
> fact a regression in this case.

Not totally sure, but I guess Linus' stance in this case would be along
the lines of "commit 7a637784d517 made an existing issue worse; either
the people involved in it fix it, or we revert that commit[1], as it's
causing a regression". At least we *iirc* had situations he handled like
that.

[1] of course unless a revert would cause regressions for others --
which I guess might be the case here, as that was added in 5.18 already.
So let's not bring Linus in.

> Solving this would need to identify the cause of interrupts being
> disabled for prolonged times, and nobody volunteered to investigate this
> further. 

Well, Stefan kind of offered to do so in his spare time, but asked for
"clear instructions to investigate this further". Could you maybe
provide those? If not: who could?

> One suspect, the Linux serial console, has been likely excluded
> already though, as not actually being in use for printk() output.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-05-24 10:48                           ` Thorsten Leemhuis
@ 2023-05-24 12:41                             ` Uwe Kleine-König
  -1 siblings, 0 replies; 90+ messages in thread
From: Uwe Kleine-König @ 2023-05-24 12:41 UTC (permalink / raw)
  To: Thorsten Leemhuis
  Cc: Sergey Organov, Stefan Wahren, Linux regressions mailing list,
	Jiri Slaby, Greg Kroah-Hartman, Stefan Wahren, Sascha Hauer,
	Shawn Guo, Tomasz Moń,
	NXP Linux Team, linux-serial, Ilpo Järvinen, Fabio Estevam,
	Pengutronix Kernel Team, Linux ARM

[-- Attachment #1: Type: text/plain, Size: 2077 bytes --]

On Wed, May 24, 2023 at 12:48:51PM +0200, Thorsten Leemhuis wrote:
> On 23.05.23 21:44, Sergey Organov wrote:
> > "Linux regression tracking (Thorsten Leemhuis)"
> > <regressions@leemhuis.info> writes:
> > 
> >> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> >> for once, to make this easily accessible to everyone.
> >>
> >> Stefan, was this regression ever solved? It doesn't look like it, but
> >> maybe I'm missing something.
> >>
> >> If it wasn't solved: what needs to be done to get this rolling again?
> > 
> > Not Stefan,
> 
> Thx to both you and Stefan for the update.
> 
> > but as far as I can tell, the problem is that on Stefan's
> > build the kernel has rather large periods of interrupts being disabled,
> > so any attempt to decrease IRQs frequency from UART by raising FIFO IRQ
> > threshold causes "regression" that manifests itself as missing
> > characters on receive. I'm not sure if it's tuning FIFO level that is in
> > fact a regression in this case.
> 
> Not totally sure, but I guess Linus stance in this case would be along
> the lines of "commit 7a637784d517 made an existing issue worse; either
> the people involved in it fix it, or we revert that commit[1], as it's
> causing a regression". At least we *iirc* had situations he handled like
> that.
> 
> [1] of course unless a revert would cause regressions for others --
> which i guess might be the case here, as that was added in 5.18 already.
> So let's not bring Linus in.

Well, in my eyes this regression is in the same league as: That patch
over there made a driver use some more memory, and on my (memory-limited)
machine this makes the difference that triggers an OOM. You could apply
this to pretty much any patch that increases the memory footprint /
latency / CPU usage.
(TL;DR: I agree to not revert the patch under discussion for this
reason.)

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-05-23 19:44                         ` Sergey Organov
@ 2023-05-24 13:07                           ` Stefan Wahren
  -1 siblings, 0 replies; 90+ messages in thread
From: Stefan Wahren @ 2023-05-24 13:07 UTC (permalink / raw)
  To: Sergey Organov
  Cc: Linux regressions mailing list, Uwe Kleine-König,
	Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial,
	Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team,
	Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń,
	Linux ARM, Linux regression tracking (Thorsten Leemhuis)

Hi Sergey,

Am 23.05.23 um 21:44 schrieb Sergey Organov:
> "Linux regression tracking (Thorsten Leemhuis)"
> <regressions@leemhuis.info> writes:
> 
...
> 
> Solving this would need to identify the cause of interrupts being
> disabled for prolonged times, and nobody volunteered to investigate this
> further. One suspect, the Linux serial console, has been likely excluded
> already though, as not actually being in use for printk() output.
> 

I don't think that we can exclude the serial console as a whole; I never
made such an observation. But at least we can exclude kernel logging on
the debug UART.

Best regards

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-05-24 10:48                           ` Thorsten Leemhuis
@ 2023-05-24 13:45                             ` Sergey Organov
  -1 siblings, 0 replies; 90+ messages in thread
From: Sergey Organov @ 2023-05-24 13:45 UTC (permalink / raw)
  To: Thorsten Leemhuis
  Cc: Stefan Wahren, Linux regressions mailing list,
	Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen,
	Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer,
	NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby,
	Tomasz Moń,
	Linux ARM

Thorsten Leemhuis <regressions@leemhuis.info> writes:

> On 23.05.23 21:44, Sergey Organov wrote:
>> "Linux regression tracking (Thorsten Leemhuis)"
>> <regressions@leemhuis.info> writes:
>> 
>>> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
>>> for once, to make this easily accessible to everyone.
>>>
>>> Stefan, was this regression ever solved? It doesn't look like it, but
>>> maybe I'm missing something.
>>>
>>> If it wasn't solved: what needs to be done to get this rolling again?
>> 
>> Not Stefan,
>
> Thx to both you and Stefan for the update.
>
>> but as far as I can tell, the problem is that on Stefan's
>> build the kernel has rather large periods of interrupts being disabled,
>> so any attempt to decrease IRQs frequency from UART by raising FIFO IRQ
>> threshold causes "regression" that manifests itself as missing
>> characters on receive. I'm not sure if it's tuning FIFO level that is in
>> fact a regression in this case.
>
> Not totally sure, but I guess Linus stance in this case would be along
> the lines of "commit 7a637784d517 made an existing issue worse; either
> the people involved in it fix it, or we revert that commit[1], as it's
> causing a regression". At least we *iirc* had situations he handled like
> that.

From Stefan's investigations it follows that the kernel has interrupts
disabled for about 2.5 milliseconds! If that's an acceptable value for
the Linux kernel, then the commit in question is a regression. If not, and
in my opinion that's too high a number, then it's not a regression at
all, but rather a manifestation of a problem (bug?) elsewhere.

>
> [1] of course unless a revert would cause regressions for others --
> which i guess might be the case here, as that was added in 5.18 already.
> So let's not bring Linus in.
>
>> Solving this would need to identify the cause of interrupts being
>> disabled for prolonged times, and nobody volunteered to investigate this
>> further. 
>
> Well, Stefan kind of did to do so in his spare time, but asked for
> "clear instructions to investigate this further". Could you maybe
> provide those? If not: who could?

There should be somebody who is familiar with methods to isolate the
cause of abnormal interrupt latencies, but I'm not the one, sorry.

Thanks,
-- Sergey Organov

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-05-24 13:07                           ` Stefan Wahren
@ 2023-06-20 14:47                             ` Linux regression tracking (Thorsten Leemhuis)
  -1 siblings, 0 replies; 90+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-06-20 14:47 UTC (permalink / raw)
  To: Stefan Wahren, Sergey Organov
  Cc: Linux regressions mailing list, Uwe Kleine-König,
	Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial,
	Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team,
	Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń,
	Linux ARM

On 24.05.23 15:07, Stefan Wahren wrote:
> 
> Am 23.05.23 um 21:44 schrieb Sergey Organov:
>> "Linux regression tracking (Thorsten Leemhuis)"
>> <regressions@leemhuis.info> writes:
>>
>> Solving this would need to identify the cause of interrupts being
>> disabled for prolonged times, and nobody volunteered to investigate this
>> further. One suspect, the Linux serial console, has been likely excluded
>> already though, as not actually being in use for printk() output.
>>
> 
> I don't think that we can exclude the serial console as a whole, I never
> made such an observation. But at least we can exclude kernel logging on
> the debug UART.

Stefan, just wondering: was this ever addressed upstream? I assume it
wasn't, just wanted to be sure.

I'm a bit unsure what to do with this and am considering asking Greg for
advice, as he applied the patch. On one hand it's *IMHO* clearly a
regression (but for the record, some people involved in the discussion
claim it's not). OTOH the culprit was applied more than a year ago now,
so reverting it might cause more trouble than it's worth at this point,
as that could lead to regressions for other users.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-06-20 14:47                             ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-06-20 14:59                               ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 90+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-20 14:59 UTC (permalink / raw)
  To: Linux regressions mailing list
  Cc: Stefan Wahren, Sergey Organov, Uwe Kleine-König,
	Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial,
	Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo,
	Jiri Slaby, Tomasz Moń,
	Linux ARM

On Tue, Jun 20, 2023 at 04:47:10PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 24.05.23 15:07, Stefan Wahren wrote:
> > 
> > Am 23.05.23 um 21:44 schrieb Sergey Organov:
> >> "Linux regression tracking (Thorsten Leemhuis)"
> >> <regressions@leemhuis.info> writes:
> >>
> >> Solving this would need to identify the cause of interrupts being
> >> disabled for prolonged times, and nobody volunteered to investigate this
> >> further. One suspect, the Linux serial console, has been likely excluded
> >> already though, as not actually being in use for printk() output.
> >>
> > 
> > I don't think that we can exclude the serial console as a whole, I never
> > made such an observation. But at least we can exclude kernel logging on
> > the debug UART.
> 
> Stefan, just wondering: was this ever addressed upstream? I assume it's
> not, just wanted to be sure.
> 
> I'm a bit unsure what to do with this and consider asking Greg for
> advice, as he applied the patch. On one hand it's *IMHO* clearly a
> regression (but for the record,  some people involved in the discussion
> claim it's not). OTOH the culprit was applied more than a year ago now,
> so reverting it might cause more trouble than it's worth at this point,
> as that could lead to regressions for other users.

I'll be glad to revert this, but for some reason I thought that someone
was working on a "real fix" here.  Stefan, is that not the case?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-06-20 14:59                               ` Greg Kroah-Hartman
@ 2023-06-20 15:34                                 ` Sergey Organov
  -1 siblings, 0 replies; 90+ messages in thread
From: Sergey Organov @ 2023-06-20 15:34 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Linux regressions mailing list, Stefan Wahren,
	Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen,
	Stefan Wahren, linux-serial, Sascha Hauer, NXP Linux Team,
	Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń,
	Linux ARM

Greg Kroah-Hartman <gregkh@linuxfoundation.org> writes:

> On Tue, Jun 20, 2023 at 04:47:10PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 24.05.23 15:07, Stefan Wahren wrote:
>> > 
>> > Am 23.05.23 um 21:44 schrieb Sergey Organov:
>> >> "Linux regression tracking (Thorsten Leemhuis)"
>> >> <regressions@leemhuis.info> writes:
>> >>
>> >> Solving this would need to identify the cause of interrupts being
>> >> disabled for prolonged times, and nobody volunteered to investigate this
>> >> further. One suspect, the Linux serial console, has been likely excluded
>> >> already though, as not actually being in use for printk() output.
>> >>
>> > 
>> > I don't think that we can exclude the serial console as a whole, I never
>> > made such an observation. But at least we can exclude kernel logging on
>> > the debug UART.
>> 
>> Stefan, just wondering: was this ever addressed upstream? I assume it's
>> not, just wanted to be sure.
>> 
>> I'm a bit unsure what to do with this and consider asking Greg for
>> advice, as he applied the patch. On one hand it's *IMHO* clearly a
>> regression (but for the record,  some people involved in the discussion
>> claim it's not). OTOH the culprit was applied more than a year ago now,
>> so reverting it might cause more trouble than it's worth at this point,
>> as that could lead to regressions for other users.
>
> I'll be glad to revert this, but for some reason I thought that someone
> was working on a "real fix" here.  Stefan, is that not the case?

As far as I understand, the "real fix" is to find where interrupts are
being disabled for prolonged times in this specific kernel build, and
nobody is looking for that place.

In other words, I'm one of those who think the commit in question is not
a regression per se, so I'm not sure it should be reverted.

Thanks,
Sergey Organov

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-06-20 14:59                               ` Greg Kroah-Hartman
@ 2023-06-20 16:30                                 ` Stefan Wahren
  -1 siblings, 0 replies; 90+ messages in thread
From: Stefan Wahren @ 2023-06-20 16:30 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Linux regressions mailing list
  Cc: Sergey Organov, Uwe Kleine-König, Fabio Estevam,
	Ilpo Järvinen, Stefan Wahren, linux-serial, Sascha Hauer,
	NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby,
	Tomasz Moń,
	Linux ARM

Hi Greg,

Am 20.06.23 um 16:59 schrieb Greg Kroah-Hartman:
> On Tue, Jun 20, 2023 at 04:47:10PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 24.05.23 15:07, Stefan Wahren wrote:
>>>
>>> Am 23.05.23 um 21:44 schrieb Sergey Organov:
>>>> "Linux regression tracking (Thorsten Leemhuis)"
>>>> <regressions@leemhuis.info> writes:
>>>>
>>>> Solving this would need to identify the cause of interrupts being
>>>> disabled for prolonged times, and nobody volunteered to investigate this
>>>> further. One suspect, the Linux serial console, has been likely excluded
>>>> already though, as not actually being in use for printk() output.
>>>>
>>>
>>> I don't think that we can exclude the serial console as a whole, I never
>>> made such an observation. But at least we can exclude kernel logging on
>>> the debug UART.
>>
>> Stefan, just wondering: was this ever addressed upstream? I assume it's
>> not, just wanted to be sure.
>>
>> I'm a bit unsure what to do with this and consider asking Greg for
>> advice, as he applied the patch. On one hand it's *IMHO* clearly a
>> regression (but for the record,  some people involved in the discussion
>> claim it's not). OTOH the culprit was applied more than a year ago now,
>> so reverting it might cause more trouble than it's worth at this point,
>> as that could lead to regressions for other users.
> 
> I'll be glad to revert this, but for some reason I thought that someone
> was working on a "real fix" here.  Stefan, is that not the case?

I can only repeat my statements from 23.5.:

Unfortunately my time budget to investigate this issue further is
exhausted, so I stopped working on this.

In case someone can give clear instructions to investigate this further,
I will try to look at it in my spare time. But I cannot make any promises.

I'm not aware that someone else is working on this.

Best regards

> 
> thanks,
> 
> greg k-h

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-06-20 16:30                                 ` Stefan Wahren
@ 2023-06-20 16:40                                   ` Lucas Stach
  -1 siblings, 0 replies; 90+ messages in thread
From: Lucas Stach @ 2023-06-20 16:40 UTC (permalink / raw)
  To: Stefan Wahren, Greg Kroah-Hartman, Linux regressions mailing list
  Cc: Pengutronix Kernel Team, Jiri Slaby, Stefan Wahren, Sascha Hauer,
	Shawn Guo, Sergey Organov, NXP Linux Team, linux-serial,
	Uwe Kleine-König, Ilpo Järvinen, Fabio Estevam,
	Tomasz Moń,
	Linux ARM

Am Dienstag, dem 20.06.2023 um 18:30 +0200 schrieb Stefan Wahren:
> Hi Greg,
> 
> Am 20.06.23 um 16:59 schrieb Greg Kroah-Hartman:
> > On Tue, Jun 20, 2023 at 04:47:10PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
> > > On 24.05.23 15:07, Stefan Wahren wrote:
> > > > 
> > > > Am 23.05.23 um 21:44 schrieb Sergey Organov:
> > > > > "Linux regression tracking (Thorsten Leemhuis)"
> > > > > <regressions@leemhuis.info> writes:
> > > > > 
> > > > > Solving this would need to identify the cause of interrupts being
> > > > > disabled for prolonged times, and nobody volunteered to investigate this
> > > > > further. One suspect, the Linux serial console, has been likely excluded
> > > > > already though, as not actually being in use for printk() output.
> > > > > 
> > > > 
> > > > I don't think that we can exclude the serial console as a whole, I never
> > > > made such an observation. But at least we can exclude kernel logging on
> > > > the debug UART.
> > > 
> > > Stefan, just wondering: was this ever addressed upstream? I assume it's
> > > not, just wanted to be sure.
> > > 
> > > I'm a bit unsure what to do with this and consider asking Greg for
> > > advice, as he applied the patch. On one hand it's *IMHO* clearly a
> > > regression (but for the record,  some people involved in the discussion
> > > claim it's not). OTOH the culprit was applied more than a year ago now,
> > > so reverting it might cause more trouble than it's worth at this point,
> > > as that could lead to regressions for other users.
> > 
> > I'll be glad to revert this, but for some reason I thought that someone
> > was working on a "real fix" here.  Stefan, is that not the case?
> 
> i can only repeat the statements from 23.5.:
> 
> Unfortunately my time budget to investigate this issue further is 
> exhausted, so i stopped working at this.
> 
> In case someone can give clear instructions to investigate this further, 
> i will try to look at it in my spare time. But i cannot make any promises.
> 
If the cause is simply interrupts not being serviced for a long period
of time, the irqsoff tracer is usually a very good start to investigate
the issue. It might point to a smoking gun already.
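
A minimal sketch of such an irqsoff session is shown below. It assumes
tracefs is mounted at /sys/kernel/tracing (older setups use
/sys/kernel/debug/tracing), that the kernel was built with
CONFIG_IRQSOFF_TRACER, and that the script runs as root. The helper name
capture_irqsoff is hypothetical; the file names are the standard ftrace
control files.

```python
"""Record the longest interrupts-off section seen by ftrace's irqsoff tracer."""
import time
from pathlib import Path
from typing import Tuple

# Assumption: default tracefs mount point on recent kernels.
TRACEFS = Path("/sys/kernel/tracing")


def write(root: Path, name: str, value: str) -> None:
    """Write a value to one of the tracefs control files."""
    (root / name).write_text(value)


def capture_irqsoff(root: Path = TRACEFS, seconds: float = 30.0) -> Tuple[int, str]:
    """Trace for `seconds`; return (max irqs-off latency in us, trace text)."""
    if "irqsoff" not in (root / "available_tracers").read_text().split():
        raise RuntimeError("irqsoff tracer not available in this kernel")
    write(root, "tracing_on", "0")
    write(root, "current_tracer", "irqsoff")
    write(root, "tracing_max_latency", "0")   # reset the recorded maximum
    write(root, "tracing_on", "1")
    time.sleep(seconds)   # reproduce the problem now, e.g. paste text into the console
    write(root, "tracing_on", "0")
    max_us = int((root / "tracing_max_latency").read_text().strip())
    trace = (root / "trace").read_text()      # read before switching tracers
    write(root, "current_tracer", "nop")
    return max_us, trace
```

Calling capture_irqsoff(seconds=30) while reproducing the overruns and
then printing both return values should show the worst interrupts-off
window and the call path that held interrupts disabled.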

Regards,
Lucas

> I'm not aware that someone else is working on this.
> 
> Best regards
> 
> > 
> > thanks,
> > 
> > greg k-h
> 


^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-06-20 16:40                                   ` Lucas Stach
@ 2023-06-20 16:55                                     ` Stefan Wahren
  -1 siblings, 0 replies; 90+ messages in thread
From: Stefan Wahren @ 2023-06-20 16:55 UTC (permalink / raw)
  To: Lucas Stach, Greg Kroah-Hartman, Linux regressions mailing list
  Cc: Pengutronix Kernel Team, Jiri Slaby, Stefan Wahren, Sascha Hauer,
	Shawn Guo, Sergey Organov, NXP Linux Team, linux-serial,
	Uwe Kleine-König, Ilpo Järvinen, Fabio Estevam,
	Tomasz Moń,
	Linux ARM

Hi Lucas,

Am 20.06.23 um 18:40 schrieb Lucas Stach:
> Am Dienstag, dem 20.06.2023 um 18:30 +0200 schrieb Stefan Wahren:
>> Hi Greg,
>>
>> Am 20.06.23 um 16:59 schrieb Greg Kroah-Hartman:
>>> On Tue, Jun 20, 2023 at 04:47:10PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>> On 24.05.23 15:07, Stefan Wahren wrote:
>>>>>
>>>>> Am 23.05.23 um 21:44 schrieb Sergey Organov:
>>>>>> "Linux regression tracking (Thorsten Leemhuis)"
>>>>>> <regressions@leemhuis.info> writes:
>>>>>>
>>>>>> Solving this would need to identify the cause of interrupts being
>>>>>> disabled for prolonged times, and nobody volunteered to investigate this
>>>>>> further. One suspect, the Linux serial console, has been likely excluded
>>>>>> already though, as not actually being in use for printk() output.
>>>>>>
>>>>>
>>>>> I don't think that we can exclude the serial console as a whole, I never
>>>>> made such an observation. But at least we can exclude kernel logging on
>>>>> the debug UART.
>>>>
>>>> Stefan, just wondering: was this ever addressed upstream? I assume it's
>>>> not, just wanted to be sure.
>>>>
>>>> I'm a bit unsure what to do with this and consider asking Greg for
>>>> advice, as he applied the patch. On one hand it's *IMHO* clearly a
>>>> regression (but for the record,  some people involved in the discussion
>>>> claim it's not). OTOH the culprit was applied more than a year ago now,
>>>> so reverting it might cause more trouble than it's worth at this point,
>>>> as that could lead to regressions for other users.
>>>
>>> I'll be glad to revert this, but for some reason I thought that someone
>>> was working on a "real fix" here.  Stefan, is that not the case?
>>
>> i can only repeat the statements from 23.5.:
>>
>> Unfortunately my time budget to investigate this issue further is
>> exhausted, so i stopped working at this.
>>
>> In case someone can give clear instructions to investigate this further,
>> i will try to look at it in my spare time. But i cannot make any promises.
>>
> If the cause is simply interrupts not being serviced for a long period
> of time, the irqsoff tracer is usually a very good start to investigate
> the issue. It might point to a smoking gun already.

Thanks for the hint, I can try that.

AFAIR there was a kernel comment which pointed out that console IO (or
at least parts of it) is excluded from the irqsoff tracer?

> 
> Regards,
> Lucas
> 
>> I'm not aware that someone else is working on this.
>>
>> Best regards
>>
>>>
>>> thanks,
>>>
>>> greg k-h
>>
> 

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-06-20 14:59                               ` Greg Kroah-Hartman
@ 2023-06-20 19:27                                 ` Uwe Kleine-König
  -1 siblings, 0 replies; 90+ messages in thread
From: Uwe Kleine-König @ 2023-06-20 19:27 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Linux regressions mailing list, Stefan Wahren,
	Pengutronix Kernel Team, Jiri Slaby, Stefan Wahren, Sascha Hauer,
	Shawn Guo, Sergey Organov, NXP Linux Team, linux-serial,
	Ilpo Järvinen, Fabio Estevam, Tomasz Moń,
	Linux ARM

Hello Greg,

On Tue, Jun 20, 2023 at 04:59:18PM +0200, Greg Kroah-Hartman wrote:
> On Tue, Jun 20, 2023 at 04:47:10PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
> > On 24.05.23 15:07, Stefan Wahren wrote:
> > > 
> > > Am 23.05.23 um 21:44 schrieb Sergey Organov:
> > >> "Linux regression tracking (Thorsten Leemhuis)"
> > >> <regressions@leemhuis.info> writes:
> > >>
> > >> Solving this would need to identify the cause of interrupts being
> > >> disabled for prolonged times, and nobody volunteered to investigate this
> > >> further. One suspect, the Linux serial console, has been likely excluded
> > >> already though, as not actually being in use for printk() output.
> > >>
> > > 
> > > I don't think that we can exclude the serial console as a whole, i never
> > > made such a observation. But at least we can exclude kernel logging on
> > > the debug UART.
> > 
> > Stefan, just wondering: was this ever addressed upstream? I assume it's
> > not, just wanted to be sure.
> > 
> > I'm a bit unsure what to do with this and consider asking Greg for
> > advice, as he applied the patch. On one hand it's *IMHO* clearly a
> > regression (but for the record,  some people involved in the discussion
> > claim it's not). OTOH the culprit was applied more than a year ago now,
> > so reverting it might cause more trouble than it's worth at this point,
> > as that could lead to regressions for other users.
> 
> I'll be glad to revert this, but for some reason I thought that someone
> was working on a "real fix" here.  Stefan, is that not the case?

Sergey Organov already said something similar, but not very explicitly:
With the current understanding, reverting said commit is wrong. It is
expected that the commit increases irq latency for imx-serial a bit for
the benefit of fewer interrupts, and so serves the overall system
performance. That this poses a problem only means that on the reporter's
machine there is already an issue that results in a longer period with
disabled irqs. While reverting the imx-serial commit would (maybe) solve
that, the actual problem is the other issue that disables preemption for
a longer timespan.
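
To put rough numbers on that trade-off, here is a small back-of-the-envelope sketch. The 115200 baud 8N1 setting is from the original report; the 32-entry RX FIFO and the change of the RX trigger level from 1 to 8 characters are assumptions not stated in this message, and the function names are purely illustrative.

```python
def char_time_us(baud: int, bits_per_char: int = 10) -> float:
    """Time one character occupies the wire, in microseconds (8N1 = 10 bits)."""
    return bits_per_char * 1_000_000 / baud


def time_until_fifo_full_us(baud: int, fifo_depth: int, trigger_level: int) -> float:
    """Time from the RX interrupt being raised until the RX FIFO is full.

    The interrupt fires once `trigger_level` characters are queued, so only
    the remaining free slots are left to absorb interrupt latency.
    """
    free_slots = fifo_depth - trigger_level
    return free_slots * char_time_us(baud)


if __name__ == "__main__":
    BAUD = 115200     # from the report
    FIFO_DEPTH = 32   # assumption: i.MX UART RX FIFO depth
    for level in (1, 8):  # assumption: trigger level before and after the commit
        budget = time_until_fifo_full_us(BAUD, FIFO_DEPTH, level)
        print(f"trigger level {level}: about {budget:.0f} us until the FIFO is full")
```

Under these assumptions the budget for servicing the interrupt shrinks from about 2.7 ms to about 2.1 ms, so an overrun would still need interrupts to go unserviced for roughly two milliseconds.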

So TL;DR: Please don't revert the imx-serial patch.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-06-20 14:47                             ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-06-21  6:23                               ` Stefan Wahren
  -1 siblings, 0 replies; 90+ messages in thread
From: Stefan Wahren @ 2023-06-21  6:23 UTC (permalink / raw)
  To: Linux regressions mailing list, Sergey Organov
  Cc: Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen,
	Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer,
	NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby,
	Tomasz Moń,
	Linux ARM

Hi Thorsten,

Am 20.06.23 um 16:47 schrieb Linux regression tracking (Thorsten Leemhuis):
> On 24.05.23 15:07, Stefan Wahren wrote:
>>
>> Am 23.05.23 um 21:44 schrieb Sergey Organov:
>>> "Linux regression tracking (Thorsten Leemhuis)"
>>> <regressions@leemhuis.info> writes:
>>>
>>> Solving this would need to identify the cause of interrupts being
>>> disabled for prolonged times, and nobody volunteered to investigate this
>>> further. One suspect, the Linux serial console, has been likely excluded
>>> already though, as not actually being in use for printk() output.
>>>
>>
>> I don't think that we can exclude the serial console as a whole, i never
>> made such a observation. But at least we can exclude kernel logging on
>> the debug UART.
> 
> Stefan, just wondering: was this ever addressed upstream? I assume it's
> not, just wanted to be sure.
> 
> I'm a bit unsure what to do with this and consider asking Greg for
> advice, as he applied the patch. On one hand it's *IMHO* clearly a
> regression (but for the record,  some people involved in the discussion
> claim it's not). OTOH the culprit was applied more than a year ago now,
> so reverting it might cause more trouble than it's worth at this point,
> as that could lead to regressions for other users.

Thanks for tracking this issue, but in my opinion the discussion is going in
circles, so I don't see a point in reviving it again.

Articles like [1] suggest to me that this is a general issue.

Best regards

[1] - https://www.phoronix.com/news/Printk-Threaded-Atomic-v1

> 
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.
> 
> #regzbot poke

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-06-20 19:27                                 ` Uwe Kleine-König
@ 2023-06-21  8:43                                   ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 90+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-21  8:43 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: Linux regressions mailing list, Stefan Wahren,
	Pengutronix Kernel Team, Jiri Slaby, Stefan Wahren, Sascha Hauer,
	Shawn Guo, Sergey Organov, NXP Linux Team, linux-serial,
	Ilpo Järvinen, Fabio Estevam, Tomasz Moń,
	Linux ARM

On Tue, Jun 20, 2023 at 09:27:48PM +0200, Uwe Kleine-König wrote:
> Hello Greg,
> 
> On Tue, Jun 20, 2023 at 04:59:18PM +0200, Greg Kroah-Hartman wrote:
> > On Tue, Jun 20, 2023 at 04:47:10PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
> > > On 24.05.23 15:07, Stefan Wahren wrote:
> > > > 
> > > > Am 23.05.23 um 21:44 schrieb Sergey Organov:
> > > >> "Linux regression tracking (Thorsten Leemhuis)"
> > > >> <regressions@leemhuis.info> writes:
> > > >>
> > > >> Solving this would need to identify the cause of interrupts being
> > > >> disabled for prolonged times, and nobody volunteered to investigate this
> > > >> further. One suspect, the Linux serial console, has been likely excluded
> > > >> already though, as not actually being in use for printk() output.
> > > >>
> > > > 
> > > > I don't think that we can exclude the serial console as a whole, i never
> > > > made such a observation. But at least we can exclude kernel logging on
> > > > the debug UART.
> > > 
> > > Stefan, just wondering: was this ever addressed upstream? I assume it's
> > > not, just wanted to be sure.
> > > 
> > > I'm a bit unsure what to do with this and consider asking Greg for
> > > advice, as he applied the patch. On one hand it's *IMHO* clearly a
> > > regression (but for the record,  some people involved in the discussion
> > > claim it's not). OTOH the culprit was applied more than a year ago now,
> > > so reverting it might cause more trouble than it's worth at this point,
> > > as that could lead to regressions for other users.
> > 
> > I'll be glad to revert this, but for some reason I thought that someone
> > was working on a "real fix" here.  Stefan, is that not the case?
> 
> Sergey Organov already said something similar, but not very explicit:
> With the current understanding reverting said commit is wrong. It is
> expected that the commit increases irq latency for imx-serial a bit for
> the benefit of less interrupts and so serves the overall system
> performance. That this poses a problem only means that on the reporter's
> machine there is already an issue that results in a longer period with
> disabled irqs. While reverting the imx-serial commit would (maybe) solve
> that, the actual problem is the other issue that disables preemption for
> a longer timespan.
> 
> So TL;DR: Please don't revert the imx-serial patch.

Ok, will leave this alone, it shouldn't be marked as a regression.

greg k-h

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Regression: serial: imx: overrun errors on debug UART
  2023-06-21  6:23                               ` Stefan Wahren
@ 2023-06-21 13:42                                 ` Linux regression tracking (Thorsten Leemhuis)
  -1 siblings, 0 replies; 90+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-06-21 13:42 UTC (permalink / raw)
  To: Stefan Wahren, Linux regressions mailing list, Sergey Organov
  Cc: Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen,
	Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer,
	NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby,
	Tomasz Moń,
	Linux ARM

On 21.06.23 08:23, Stefan Wahren wrote:
> Am 20.06.23 um 16:47 schrieb Linux regression tracking (Thorsten Leemhuis):
>> On 24.05.23 15:07, Stefan Wahren wrote:
>>>
>>> Am 23.05.23 um 21:44 schrieb Sergey Organov:
>>>> "Linux regression tracking (Thorsten Leemhuis)"
>>>> <regressions@leemhuis.info> writes:
>>>>
>>>> Solving this would need to identify the cause of interrupts being
>>>> disabled for prolonged times, and nobody volunteered to investigate
>>>> this
>>>> further. One suspect, the Linux serial console, has been likely
>>>> excluded
>>>> already though, as not actually being in use for printk() output.
>>>
>>> I don't think that we can exclude the serial console as a whole, i never
>>> made such a observation. But at least we can exclude kernel logging on
>>> the debug UART.
>>
>> Stefan, just wondering: was this ever addressed upstream? I assume it's
>> not, just wanted to be sure.
>>
>> I'm a bit unsure what to do with this and consider asking Greg for
>> advice, as he applied the patch. On one hand it's *IMHO* clearly a
>> regression (but for the record,  some people involved in the discussion
>> claim it's not). OTOH the culprit was applied more than a year ago now,
>> so reverting it might cause more trouble than it's worth at this point,
>> as that could lead to regressions for other users.
> 
> thanks for tracking this issue, but in my opinion the discussion goes in
> circles. So i don't see a point in reanimating this again.  [...]

Yup. Sadly. I don't think Linus would agree with the "this is not a
regression" claim from various people here. But well, due to your
statement, Greg's mail from earlier today, and the fact that reverting
the culprit at this point might lead to regressions that would hit more
people, I agree:

#regzbot inconclusive: unresolved, but stuck, and revert likely a bad
option - and reporter is fine with not pursuing this further

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

^ permalink raw reply	[flat|nested] 90+ messages in thread

end of thread, other threads:[~2023-06-21 13:43 UTC | newest]

Thread overview: 90+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-24  8:57 Regression: serial: imx: overrun errors on debug UART Stefan Wahren
2023-03-24  8:57 ` Stefan Wahren
2023-03-24 10:12 ` Linux regression tracking #adding (Thorsten Leemhuis)
2023-03-24 10:12   ` Linux regression tracking #adding (Thorsten Leemhuis)
2023-03-24 11:47 ` Ilpo Järvinen
2023-03-24 11:47   ` Ilpo Järvinen
2023-03-24 12:26   ` Francesco Dolcini
2023-03-24 12:26     ` Francesco Dolcini
2023-03-24 12:35     ` Ilpo Järvinen
2023-03-24 12:35       ` Ilpo Järvinen
2023-03-24 12:49       ` Stefan Wahren
2023-03-24 12:49         ` Stefan Wahren
2023-03-24 13:06         ` Francesco Dolcini
2023-03-24 13:06           ` Francesco Dolcini
2023-03-24 12:57   ` Fabio Estevam
2023-03-24 12:57     ` Fabio Estevam
2023-03-24 13:37     ` Uwe Kleine-König
2023-03-24 13:37       ` Uwe Kleine-König
2023-03-24 14:19       ` Stefan Wahren
2023-03-24 14:19         ` Stefan Wahren
2023-03-24 14:39         ` Uwe Kleine-König
2023-03-24 14:39           ` Uwe Kleine-König
2023-03-24 21:57         ` Sergey Organov
2023-03-24 21:57           ` Sergey Organov
2023-03-24 15:00     ` Stefan Wahren
2023-03-24 15:00       ` Stefan Wahren
2023-03-25 11:31       ` Stefan Wahren
2023-03-25 11:31         ` Stefan Wahren
2023-03-25 12:23         ` Fabio Estevam
2023-03-25 12:23           ` Fabio Estevam
2023-03-25 15:11         ` Uwe Kleine-König
2023-03-25 15:11           ` Uwe Kleine-König
2023-03-25 17:05           ` Stefan Wahren
2023-03-25 17:05             ` Stefan Wahren
2023-03-25 19:00             ` Sergey Organov
2023-03-25 19:00               ` Sergey Organov
2023-03-26 18:21               ` Francesco Dolcini
2023-03-26 18:21                 ` Francesco Dolcini
2023-03-27  8:07             ` Tomasz Moń
2023-03-27  8:07               ` Tomasz Moń
2023-03-25 18:30           ` Sergey Organov
2023-03-25 18:30             ` Sergey Organov
2023-03-27 14:42           ` Stefan Wahren
2023-03-27 14:42             ` Stefan Wahren
2023-03-27 15:11             ` Sergey Organov
2023-03-27 15:11               ` Sergey Organov
2023-03-27 15:30               ` Russell King (Oracle)
2023-03-27 15:30                 ` Russell King (Oracle)
2023-04-16 13:43               ` Stefan Wahren
2023-04-16 13:43                 ` Stefan Wahren
2023-04-17 16:50                 ` Sergey Organov
2023-04-17 16:50                   ` Sergey Organov
2023-04-17 18:40                   ` Stefan Wahren
2023-04-17 18:40                     ` Stefan Wahren
2023-04-18 16:16                   ` Stefan Wahren
2023-04-18 16:16                     ` Stefan Wahren
2023-05-22  9:25                     ` Linux regression tracking (Thorsten Leemhuis)
2023-05-22  9:25                       ` Linux regression tracking (Thorsten Leemhuis)
2023-05-23 15:12                       ` Stefan Wahren
2023-05-23 15:12                         ` Stefan Wahren
2023-05-23 19:44                       ` Sergey Organov
2023-05-23 19:44                         ` Sergey Organov
2023-05-24 10:48                         ` Thorsten Leemhuis
2023-05-24 10:48                           ` Thorsten Leemhuis
2023-05-24 12:41                           ` Uwe Kleine-König
2023-05-24 12:41                             ` Uwe Kleine-König
2023-05-24 13:45                           ` Sergey Organov
2023-05-24 13:45                             ` Sergey Organov
2023-05-24 13:07                         ` Stefan Wahren
2023-05-24 13:07                           ` Stefan Wahren
2023-06-20 14:47                           ` Linux regression tracking (Thorsten Leemhuis)
2023-06-20 14:47                             ` Linux regression tracking (Thorsten Leemhuis)
2023-06-20 14:59                             ` Greg Kroah-Hartman
2023-06-20 14:59                               ` Greg Kroah-Hartman
2023-06-20 15:34                               ` Sergey Organov
2023-06-20 15:34                                 ` Sergey Organov
2023-06-20 16:30                               ` Stefan Wahren
2023-06-20 16:30                                 ` Stefan Wahren
2023-06-20 16:40                                 ` Lucas Stach
2023-06-20 16:40                                   ` Lucas Stach
2023-06-20 16:55                                   ` Stefan Wahren
2023-06-20 16:55                                     ` Stefan Wahren
2023-06-20 19:27                               ` Uwe Kleine-König
2023-06-20 19:27                                 ` Uwe Kleine-König
2023-06-21  8:43                                 ` Greg Kroah-Hartman
2023-06-21  8:43                                   ` Greg Kroah-Hartman
2023-06-21  6:23                             ` Stefan Wahren
2023-06-21  6:23                               ` Stefan Wahren
2023-06-21 13:42                               ` Linux regression tracking (Thorsten Leemhuis)
2023-06-21 13:42                                 ` Linux regression tracking (Thorsten Leemhuis)
