* Regression: serial: imx: overrun errors on debug UART
@ 2023-03-24  8:57 Stefan Wahren
  2023-03-24 10:12 ` Linux regression tracking #adding (Thorsten Leemhuis)
  2023-03-24 11:47 ` Ilpo Järvinen
  0 siblings, 2 replies; 45+ messages in thread
From: Stefan Wahren @ 2023-03-24  8:57 UTC (permalink / raw)
  To: Tomasz Moń, Greg Kroah-Hartman, Jiri Slaby
  Cc: Uwe Kleine-König, Sergey Organov, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, NXP Linux Team,
	linux-serial, Linux ARM, Shawn Guo, Stefan Wahren

Hi,

after switching to Linux 6.1.21 on our Tarragon board (i.MX6ULL SoC), we
experience the following issues with the debug UART (115200 baud, 8N1,
no hardware flow control):

- overrun errors if we paste in multiple text lines while the system is idle
- no reaction to single keystrokes while the system is under higher load

After reverting 7a637784d517 ("serial: imx: reduce RX interrupt
frequency") the issues disappear.

It may be worth mentioning that the Tarragon board uses two additional
application UARTs with similar baud rates (9600 - 115200 baud, no
hardware flow control) for RS485 communication, but there are no overrun
errors there (with and without the mentioned change).

Best regards

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* Re: Regression: serial: imx: overrun errors on debug UART 2023-03-24 8:57 Regression: serial: imx: overrun errors on debug UART Stefan Wahren @ 2023-03-24 10:12 ` Linux regression tracking #adding (Thorsten Leemhuis) 2023-03-24 11:47 ` Ilpo Järvinen 1 sibling, 0 replies; 45+ messages in thread From: Linux regression tracking #adding (Thorsten Leemhuis) @ 2023-03-24 10:12 UTC (permalink / raw) To: Stefan Wahren, Tomasz Moń, Greg Kroah-Hartman, Jiri Slaby Cc: Uwe Kleine-König, Sergey Organov, Sascha Hauer, Pengutronix Kernel Team, Fabio Estevam, NXP Linux Team, linux-serial, Linux ARM, Shawn Guo, Stefan Wahren [TLDR: I'm adding this report to the list of tracked Linux kernel regressions; the text you find below is based on a few templates paragraphs you might have encountered already in similar form. See link in footer if these mails annoy you.] On 24.03.23 09:57, Stefan Wahren wrote: > Hi, > > after switching to Linux 6.1.21 on our Tarragon board (i.MX6ULL SoC), we > experience the following issues with the debug UART (115200 baud, 8N1, > no hardware flow control): > > - overrun errors if we paste in multiple text lines while system is idle > - no reaction to single key strokes while system is on higher load > > After reverting 7a637784d517 ("serial: imx: reduce RX interrupt > frequency") the issue disappear. > > Maybe it's worth to mention that the Tarragon board uses two additional > application UARTs with similiar baud rates (9600 - 115200 baud, no > hardware flow control) for RS485 communication, but there are no overrun > errors (with and without the mention change). Thanks for the report. To be sure the issue doesn't fall through the cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression tracking bot: #regzbot ^introduced 7a637784d517 #regzbot title serial: imx: overrun errors on debug UART #regzbot ignore-activity This isn't a regression? This issue or a fix for it are already discussed somewhere else? It was fixed already? 
You want to clarify when the regression started to happen? Or point out I got the title or something else totally wrong? Then just reply and tell me -- ideally while also telling regzbot about it, as explained by the page listed in the footer of this mail. Developers: When fixing the issue, remember to add 'Link:' tags pointing to the report (the parent of this mail). See page linked in footer for details. Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat) -- Everything you wanna know about Linux kernel regression tracking: https://linux-regtracking.leemhuis.info/about/#tldr That page also explains what to do if mails like this annoy you. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Regression: serial: imx: overrun errors on debug UART 2023-03-24 8:57 Regression: serial: imx: overrun errors on debug UART Stefan Wahren 2023-03-24 10:12 ` Linux regression tracking #adding (Thorsten Leemhuis) @ 2023-03-24 11:47 ` Ilpo Järvinen 2023-03-24 12:26 ` Francesco Dolcini 2023-03-24 12:57 ` Fabio Estevam 1 sibling, 2 replies; 45+ messages in thread From: Ilpo Järvinen @ 2023-03-24 11:47 UTC (permalink / raw) To: Stefan Wahren Cc: Tomasz Moń, Greg Kroah-Hartman, Jiri Slaby, Uwe Kleine-König, Sergey Organov, Sascha Hauer, Pengutronix Kernel Team, Fabio Estevam, NXP Linux Team, linux-serial, Linux ARM, Shawn Guo, Stefan Wahren On Fri, 24 Mar 2023, Stefan Wahren wrote: > Hi, > > after switching to Linux 6.1.21 on our Tarragon board (i.MX6ULL SoC), we > experience the following issues with the debug UART (115200 baud, 8N1, no > hardware flow control): > > - overrun errors if we paste in multiple text lines while system is idle > - no reaction to single key strokes while system is on higher load > > After reverting 7a637784d517 ("serial: imx: reduce RX interrupt frequency") > the issue disappear. > > Maybe it's worth to mention that the Tarragon board uses two additional > application UARTs with similiar baud rates (9600 - 115200 baud, no hardware > flow control) for RS485 communication, but there are no overrun errors (with > and without the mention change). This has come up earlier, see e.g.: https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/ My somewhat uninformed suggestion: if the overrun problems mostly show up with console ports, maybe the trigger level could depend on the port being a console or not? -- i. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-24 11:47 ` Ilpo Järvinen
@ 2023-03-24 12:26   ` Francesco Dolcini
  2023-03-24 12:35     ` Ilpo Järvinen
  2023-03-24 12:57   ` Fabio Estevam
  1 sibling, 1 reply; 45+ messages in thread
From: Francesco Dolcini @ 2023-03-24 12:26 UTC (permalink / raw)
  To: Ilpo Järvinen, Stefan Wahren
  Cc: Tomasz Moń, Greg Kroah-Hartman, Jiri Slaby, Uwe Kleine-König,
	Sergey Organov, Sascha Hauer, Pengutronix Kernel Team,
	Fabio Estevam, NXP Linux Team, linux-serial, Linux ARM,
	Shawn Guo, Stefan Wahren

Hello

On Fri, Mar 24, 2023 at 01:47:59PM +0200, Ilpo Järvinen wrote:
> On Fri, 24 Mar 2023, Stefan Wahren wrote:
> > after switching to Linux 6.1.21 on our Tarragon board (i.MX6ULL SoC), we
> > experience the following issues with the debug UART (115200 baud, 8N1, no
> > hardware flow control):
> >
> > - overrun errors if we paste in multiple text lines while system is idle
> > - no reaction to single key strokes while system is on higher load
> >
> > After reverting 7a637784d517 ("serial: imx: reduce RX interrupt frequency")
> > the issue disappear.
> >
> > Maybe it's worth to mention that the Tarragon board uses two additional
> > application UARTs with similiar baud rates (9600 - 115200 baud, no hardware
> > flow control) for RS485 communication, but there are no overrun errors (with
> > and without the mention change).
>
> This has come up earlier, see e.g.:
>
> https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/

yep, it looks like exactly the same issue.

We did not verify whether this was affecting other UARTs. However, isn't
RS485 half-duplex? This is very likely a difference compared to the RS232
console port.

I am also not really convinced this is a proper regression: while
7a637784d517 is clearly making the situation _worse_, we had some issues
even before. Unfortunately I do not have many more details available.

Francesco
* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-24 12:26   ` Francesco Dolcini
@ 2023-03-24 12:35     ` Ilpo Järvinen
  2023-03-24 12:49       ` Stefan Wahren
  0 siblings, 1 reply; 45+ messages in thread
From: Ilpo Järvinen @ 2023-03-24 12:35 UTC (permalink / raw)
  To: Francesco Dolcini
  Cc: Stefan Wahren, Tomasz Moń, Greg Kroah-Hartman, Jiri Slaby,
	Uwe Kleine-König, Sergey Organov, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, NXP Linux Team,
	linux-serial, Linux ARM, Shawn Guo, Stefan Wahren

On Fri, 24 Mar 2023, Francesco Dolcini wrote:

> Hello
>
> On Fri, Mar 24, 2023 at 01:47:59PM +0200, Ilpo Järvinen wrote:
> > On Fri, 24 Mar 2023, Stefan Wahren wrote:
> > > after switching to Linux 6.1.21 on our Tarragon board (i.MX6ULL SoC), we
> > > experience the following issues with the debug UART (115200 baud, 8N1, no
> > > hardware flow control):
> > >
> > > - overrun errors if we paste in multiple text lines while system is idle
> > > - no reaction to single key strokes while system is on higher load
> > >
> > > After reverting 7a637784d517 ("serial: imx: reduce RX interrupt frequency")
> > > the issue disappear.
> > >
> > > Maybe it's worth to mention that the Tarragon board uses two additional
> > > application UARTs with similiar baud rates (9600 - 115200 baud, no hardware
> > > flow control) for RS485 communication, but there are no overrun errors (with
> > > and without the mention change).
> >
> > This has come up earlier, see e.g.:
> >
> > https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/
>
> yep, it looks exactly the same issue.
>
> We did not verify if this was affecting other UARTs. However, isn't RS485
> half-duplex?

While half-duplex is by far more likely due to its simplicity, RS485 could
also be full-duplex. It seems the imx driver supports both modes.

--
 i.

> This is very likely a difference compared to the RS232
> console port.
>
> I am also not really convinced this is a proper regression, while 7a637784d517
> clearly is making the situation _worst_, we had some issues even before -
> unfortunately I have no much more details available.
* Re: Regression: serial: imx: overrun errors on debug UART 2023-03-24 12:35 ` Ilpo Järvinen @ 2023-03-24 12:49 ` Stefan Wahren 2023-03-24 13:06 ` Francesco Dolcini 0 siblings, 1 reply; 45+ messages in thread From: Stefan Wahren @ 2023-03-24 12:49 UTC (permalink / raw) To: Ilpo Järvinen, Francesco Dolcini Cc: Tomasz Moń, Greg Kroah-Hartman, Jiri Slaby, Uwe Kleine-König, Sergey Organov, Sascha Hauer, Pengutronix Kernel Team, Fabio Estevam, NXP Linux Team, linux-serial, Linux ARM, Shawn Guo, Stefan Wahren Am 24.03.23 um 13:35 schrieb Ilpo Järvinen: > On Fri, 24 Mar 2023, Francesco Dolcini wrote: > >> Hello >> >> On Fri, Mar 24, 2023 at 01:47:59PM +0200, Ilpo Järvinen wrote: >>> On Fri, 24 Mar 2023, Stefan Wahren wrote: >>>> after switching to Linux 6.1.21 on our Tarragon board (i.MX6ULL SoC), we >>>> experience the following issues with the debug UART (115200 baud, 8N1, no >>>> hardware flow control): >>>> >>>> - overrun errors if we paste in multiple text lines while system is idle >>>> - no reaction to single key strokes while system is on higher load >>>> >>>> After reverting 7a637784d517 ("serial: imx: reduce RX interrupt frequency") >>>> the issue disappear. >>>> >>>> Maybe it's worth to mention that the Tarragon board uses two additional >>>> application UARTs with similiar baud rates (9600 - 115200 baud, no hardware >>>> flow control) for RS485 communication, but there are no overrun errors (with >>>> and without the mention change). >>> This has come up earlier, see e.g.: >>> >>> https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/ >> yep, it looks exactly the same issue. >> >> We did not verify if this was affecting other UARTs. However, isn't RS485 >> half-duplex? > While half-duplex is more likely by far due simplicity, RS485 could also > be full-duplex. It seems imx driver supports for both modes. The RS485 on Tarragon is half-duplex, but this is implemented in external hardware. 
So from the Linux / driver point of view it's an RS232 port.

To us the current behavior (overrun errors and no reaction under load) is
not acceptable. I agree that increasing the RX threshold isn't the real
issue, but I needed a starting point for a discussion. So any ideas on how
to investigate this further are welcome.
* Re: Regression: serial: imx: overrun errors on debug UART 2023-03-24 12:49 ` Stefan Wahren @ 2023-03-24 13:06 ` Francesco Dolcini 0 siblings, 0 replies; 45+ messages in thread From: Francesco Dolcini @ 2023-03-24 13:06 UTC (permalink / raw) To: Stefan Wahren Cc: Ilpo Järvinen, Francesco Dolcini, Tomasz Moń, Greg Kroah-Hartman, Jiri Slaby, Uwe Kleine-König, Sergey Organov, Sascha Hauer, Pengutronix Kernel Team, Fabio Estevam, NXP Linux Team, linux-serial, Linux ARM, Shawn Guo, Stefan Wahren On Fri, Mar 24, 2023 at 01:49:56PM +0100, Stefan Wahren wrote: > Am 24.03.23 um 13:35 schrieb Ilpo Järvinen: > > On Fri, 24 Mar 2023, Francesco Dolcini wrote: > > > On Fri, Mar 24, 2023 at 01:47:59PM +0200, Ilpo Järvinen wrote: > > > > On Fri, 24 Mar 2023, Stefan Wahren wrote: > > > > > after switching to Linux 6.1.21 on our Tarragon board (i.MX6ULL SoC), we > > > > > experience the following issues with the debug UART (115200 baud, 8N1, no > > > > > hardware flow control): > > > > > > > > > > - overrun errors if we paste in multiple text lines while system is idle > > > > > - no reaction to single key strokes while system is on higher load > > > > > > > > > > After reverting 7a637784d517 ("serial: imx: reduce RX interrupt frequency") > > > > > the issue disappear. > > > > > > > > > > Maybe it's worth to mention that the Tarragon board uses two additional > > > > > application UARTs with similiar baud rates (9600 - 115200 baud, no hardware > > > > > flow control) for RS485 communication, but there are no overrun errors (with > > > > > and without the mention change). > > > > This has come up earlier, see e.g.: > > > > > > > > https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/ > > > yep, it looks exactly the same issue. > > > > > > We did not verify if this was affecting other UARTs. However, isn't RS485 > > > half-duplex? > > While half-duplex is more likely by far due simplicity, RS485 could also > > be full-duplex. 
> > It seems imx driver supports for both modes.
>
> The RS485 on Tarragon is half-duplex, but this is implemented in external
> hardware. So from Linux / driver point of view it's a RS232.

To me this is an interesting difference that might be worth investigating.
The console is somewhat special, since you are going to echo out the
received chars most of the time.

Francesco
* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-24 11:47 ` Ilpo Järvinen
  2023-03-24 12:26   ` Francesco Dolcini
@ 2023-03-24 12:57   ` Fabio Estevam
  2023-03-24 13:37     ` Uwe Kleine-König
  2023-03-24 15:00     ` Stefan Wahren
  1 sibling, 2 replies; 45+ messages in thread
From: Fabio Estevam @ 2023-03-24 12:57 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Stefan Wahren, Tomasz Moń, Greg Kroah-Hartman, Jiri Slaby,
	Uwe Kleine-König, Sergey Organov, Sascha Hauer,
	Pengutronix Kernel Team, NXP Linux Team, linux-serial,
	Linux ARM, Shawn Guo, Stefan Wahren

Hi Stefan,

On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen
<ilpo.jarvinen@linux.intel.com> wrote:

> This has come up earlier, see e.g.:
>
> https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/
>
> My somewhat uninformed suggestion: if the overrun problems mostly show up
> with console ports, maybe the trigger level could depend on the port
> being a console or not?

Does the change below help? Taking Ilpo's suggestion into account:

diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index 0fa1bd8cdec7..4d0aae38b7a5 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -233,6 +233,7 @@ struct imx_port {
 	enum imx_tx_state tx_state;
 	struct hrtimer trigger_start_tx;
 	struct hrtimer trigger_stop_tx;
+	unsigned int rxtl;
 };

 struct imx_port_ucrs {
@@ -1309,6 +1310,7 @@ static void imx_uart_clear_rx_errors(struct imx_port *sport)
 }

 #define TXTL_DEFAULT 2 /* reset default */
+#define RXTL_DEFAULT_CONSOLE 1 /* 1 character or aging timer */
 #define RXTL_DEFAULT 8 /* 8 characters or aging timer */
 #define TXTL_DMA 8 /* DMA burst setting */
 #define RXTL_DMA 9 /* DMA burst setting */
@@ -1422,7 +1424,7 @@ static void imx_uart_disable_dma(struct imx_port *sport)
 	ucr1 &= ~(UCR1_RXDMAEN | UCR1_TXDMAEN | UCR1_ATDMAEN);
 	imx_uart_writel(sport, ucr1, UCR1);

-	imx_uart_setup_ufcr(sport, TXTL_DEFAULT, RXTL_DEFAULT);
+	imx_uart_setup_ufcr(sport, TXTL_DEFAULT, sport->rxtl);

 	sport->dma_is_enabled = 0;
 }
@@ -1447,7 +1449,7 @@ static int imx_uart_startup(struct uart_port *port)
 		return retval;
 	}

-	imx_uart_setup_ufcr(sport, TXTL_DEFAULT, RXTL_DEFAULT);
+	imx_uart_setup_ufcr(sport, TXTL_DEFAULT, sport->rxtl);

 	/* disable the DREN bit (Data Ready interrupt enable) before
 	 * requesting IRQs
@@ -1464,6 +1466,11 @@ static int imx_uart_startup(struct uart_port *port)
 	if (!uart_console(port) && imx_uart_dma_init(sport) == 0)
 		dma_is_inited = 1;

+	if (uart_console(port))
+		sport->rxtl = RXTL_DEFAULT_CONSOLE;
+	else
+		sport->rxtl = RXTL_DEFAULT;
+
 	spin_lock_irqsave(&sport->port.lock, flags);

 	/* Reset fifo's and state machines */
@@ -1863,7 +1870,7 @@ static int imx_uart_poll_init(struct uart_port *port)
 	if (retval)
 		clk_disable_unprepare(sport->clk_ipg);

-	imx_uart_setup_ufcr(sport, TXTL_DEFAULT, RXTL_DEFAULT);
+	imx_uart_setup_ufcr(sport, TXTL_DEFAULT, sport->rxtl);

 	spin_lock_irqsave(&sport->port.lock, flags);

@@ -2139,7 +2146,7 @@ imx_uart_console_setup(struct console *co, char *options)
 	else
 		imx_uart_console_get_options(sport, &baud, &parity, &bits);

-	imx_uart_setup_ufcr(sport, TXTL_DEFAULT, RXTL_DEFAULT);
+	imx_uart_setup_ufcr(sport, TXTL_DEFAULT, sport->rxtl);

 	retval = uart_set_options(&sport->port, co, baud, parity, bits, flow);
* Re: Regression: serial: imx: overrun errors on debug UART 2023-03-24 12:57 ` Fabio Estevam @ 2023-03-24 13:37 ` Uwe Kleine-König 2023-03-24 14:19 ` Stefan Wahren 2023-03-24 15:00 ` Stefan Wahren 1 sibling, 1 reply; 45+ messages in thread From: Uwe Kleine-König @ 2023-03-24 13:37 UTC (permalink / raw) To: Fabio Estevam Cc: Ilpo Järvinen, Stefan Wahren, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Tomasz Moń, Sergey Organov, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Sascha Hauer, Linux ARM [-- Attachment #1.1: Type: text/plain, Size: 1049 bytes --] On Fri, Mar 24, 2023 at 09:57:39AM -0300, Fabio Estevam wrote: > Hi Stefan, > > On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen > <ilpo.jarvinen@linux.intel.com> wrote: > > > This has come up earlier, see e.g.: > > > > https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/ > > > > My somewhat uninformed suggestion: if the overrun problems mostly show up > > with console ports, maybe the trigger level could depend on the port > > being a console or not? > > Does the change below help? Taking Ilpo's suggestion into account: I wonder if it's a red herring that having the console on that port makes a difference. If I understand correctly the problem is pasting bigger amounts of data on a ttymxc after having logged in via a getty? @Stefan: Can you try to reproduce with the port being also a console? Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König | Industrial Linux Solutions | https://www.pengutronix.de/ | [-- Attachment #1.2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] [-- Attachment #2: Type: text/plain, Size: 176 bytes --] _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-24 13:37     ` Uwe Kleine-König
@ 2023-03-24 14:19       ` Stefan Wahren
  2023-03-24 14:39         ` Uwe Kleine-König
  2023-03-24 21:57         ` Sergey Organov
  0 siblings, 2 replies; 45+ messages in thread
From: Stefan Wahren @ 2023-03-24 14:19 UTC (permalink / raw)
  To: Uwe Kleine-König, Fabio Estevam
  Cc: Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman,
	Tomasz Moń, Sergey Organov, NXP Linux Team,
	Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Sascha Hauer,
	Linux ARM

Hi,

Am 24.03.23 um 14:37 schrieb Uwe Kleine-König:
> On Fri, Mar 24, 2023 at 09:57:39AM -0300, Fabio Estevam wrote:
>> Hi Stefan,
>>
>> On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen
>> <ilpo.jarvinen@linux.intel.com> wrote:
>>
>>> This has come up earlier, see e.g.:
>>>
>>> https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/
>>>
>>> My somewhat uninformed suggestion: if the overrun problems mostly show up
>>> with console ports, maybe the trigger level could depend on the port
>>> being a console or not?
>> Does the change below help? Taking Ilpo's suggestion into account:
> I wonder if it's a red herring that having the console on that port
> makes a difference. If I understand correctly the problem is pasting
> bigger amounts of data on a ttymxc after having logged in via a getty?
>
> @Stefan: Can you try to reproduce with the port being also a console?

Sorry for the confusion. Maybe I should have mentioned that the debug
UART was configured as a console. To be more specific, here is the output
(ttymxc0 and ttymxc4 are RS485, ttymxc3 is the debug console):

# cat /proc/tty/driver/IMX-uart

serinfo:1.0 driver revision:
0: uart:IMX mmio:0x02020000 irq:192 tx:285207 rx:2633621 fe:2 DSR|CD
3: uart:IMX mmio:0x021F0000 irq:193 tx:70502 rx:69 RTS|DTR|DSR
4: uart:IMX mmio:0x021F4000 irq:194 tx:300988 rx:677223 DSR|CD
5: uart:IMX mmio:0x021FC000 irq:195 tx:0 rx:0 DSR|CD
6: uart:IMX mmio:0x02018000 irq:191 tx:0 rx:0 DSR|CD

Just for clarification: the Tarragon board is built into a charging
station, so hardware access is limited.

@Uwe which port should be configured as a console?

>
> Best regards
> Uwe
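For reading the counters in the listing above, a minimal C sketch may help. It assumes the usual serial core line format, in which every port line carries `tx:` and `rx:` totals and an error counter such as `fe:` (framing) or `oe:` (overrun) is only appended when it is nonzero. The helper name `serinfo_counter()` is hypothetical and not part of any kernel or libc API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the numeric value of "<key>:<number>" in one line of
 * /proc/tty/driver/IMX-uart, or 0 if the key is absent (assumption:
 * the serial core omits error counters that are still zero).
 * Hypothetical helper for illustration only.
 */
long serinfo_counter(const char *line, const char *key)
{
	assert(line != NULL && key != NULL);

	size_t klen = strlen(key);
	const char *p = line;

	if (klen == 0)
		return 0;

	while ((p = strstr(p, key)) != NULL) {
		/* Accept the match only at the start of a space-separated token. */
		int at_token_start = (p == line) || (p[-1] == ' ');

		if (at_token_start && p[klen] == ':')
			return strtol(p + klen + 1, NULL, 10);
		p += klen;
	}
	return 0;
}
```

Called on the ttymxc0 line with the key "oe", the helper returns 0 because no overrun counter is printed there, while "fe" yields 2. Sampling the console port's line before and after pasting text would show whether `oe:` appears or grows.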
* Re: Regression: serial: imx: overrun errors on debug UART 2023-03-24 14:19 ` Stefan Wahren @ 2023-03-24 14:39 ` Uwe Kleine-König 2023-03-24 21:57 ` Sergey Organov 1 sibling, 0 replies; 45+ messages in thread From: Uwe Kleine-König @ 2023-03-24 14:39 UTC (permalink / raw) To: Stefan Wahren Cc: Fabio Estevam, Pengutronix Kernel Team, Jiri Slaby, Greg Kroah-Hartman, Tomasz Moń, Sergey Organov, NXP Linux Team, linux-serial, Stefan Wahren, Ilpo Järvinen, Shawn Guo, Sascha Hauer, Linux ARM [-- Attachment #1.1: Type: text/plain, Size: 2111 bytes --] On Fri, Mar 24, 2023 at 03:19:46PM +0100, Stefan Wahren wrote: > Hi, > > Am 24.03.23 um 14:37 schrieb Uwe Kleine-König: > > On Fri, Mar 24, 2023 at 09:57:39AM -0300, Fabio Estevam wrote: > > > Hi Stefan, > > > > > > On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen > > > <ilpo.jarvinen@linux.intel.com> wrote: > > > > > > > This has come up earlier, see e.g.: > > > > > > > > https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/ > > > > > > > > My somewhat uninformed suggestion: if the overrun problems mostly show up > > > > with console ports, maybe the trigger level could depend on the port > > > > being a console or not? > > > Does the change below help? Taking Ilpo's suggestion into account: > > I wonder if it's a red herring that having the console on that port > > makes a difference. If I understand correctly the problem is pasting > > bigger amounts of data on a ttymxc after having logged in via a getty? > > > > @Stefan: Can you try to reproduce with the port being also a console? > > Sorry, for the confusion. Maybe i should have mentioned that the debug UART > was configured as a console. 
> Here is the output to be more specific (ttymxc0
> and 4 are RS485, ttymxc3 is the debug console):
>
> # cat /proc/tty/driver/IMX-uart
>
> serinfo:1.0 driver revision:
> 0: uart:IMX mmio:0x02020000 irq:192 tx:285207 rx:2633621 fe:2 DSR|CD
> 3: uart:IMX mmio:0x021F0000 irq:193 tx:70502 rx:69 RTS|DTR|DSR
> 4: uart:IMX mmio:0x021F4000 irq:194 tx:300988 rx:677223 DSR|CD
> 5: uart:IMX mmio:0x021FC000 irq:195 tx:0 rx:0 DSR|CD
> 6: uart:IMX mmio:0x02018000 irq:191 tx:0 rx:0 DSR|CD
>
> Just for clarification the Tarragon board is build in a charging station. So
> hardware access is limited.
>
> @Uwe which port should be configured as a console?

I don't care as long as it's not the port that you do your test on. None
is fine.

Best regards
Uwe

--
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |
* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-24 14:19       ` Stefan Wahren
  2023-03-24 14:39         ` Uwe Kleine-König
@ 2023-03-24 21:57         ` Sergey Organov
  1 sibling, 0 replies; 45+ messages in thread
From: Sergey Organov @ 2023-03-24 21:57 UTC (permalink / raw)
  To: Stefan Wahren
  Cc: Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen, Stefan Wahren,
	linux-serial, Greg Kroah-Hartman, Tomasz Moń, NXP Linux Team,
	Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Sascha Hauer,
	Linux ARM

Stefan Wahren <stefan.wahren@i2se.com> writes:

> Hi,
>
> Am 24.03.23 um 14:37 schrieb Uwe Kleine-König:
>> On Fri, Mar 24, 2023 at 09:57:39AM -0300, Fabio Estevam wrote:
>>> Hi Stefan,
>>>
>>> On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen
>>> <ilpo.jarvinen@linux.intel.com> wrote:
>>>
>>>> This has come up earlier, see e.g.:
>>>>
>>>> https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/
>>>>
>>>> My somewhat uninformed suggestion: if the overrun problems mostly show up
>>>> with console ports, maybe the trigger level could depend on the port
>>>> being a console or not?
>>> Does the change below help? Taking Ilpo's suggestion into account:
>> I wonder if it's a red herring that having the console on that port
>> makes a difference. If I understand correctly the problem is pasting
>> bigger amounts of data on a ttymxc after having logged in via a getty?
>>
>> @Stefan: Can you try to reproduce with the port being also a console?
>
> Sorry, for the confusion. Maybe i should have mentioned that the debug
> UART was configured as a console.

Chances are that you are experiencing the same problem that I've described
here:

https://marc.info/?l=linux-serial&m=158504064609504&w=2

Essentially, any serial console output from printk() can easily cause
10 milliseconds, or even up to 1 second, of interrupt latency, which will
definitely cause overruns on serial ports and gosh knows what other
problems.

As far as I'm aware, this issue hasn't been resolved. To me it means that
I can't use the Linux serial console at all on my non-SMP system unless I
remove the offending lock.

-- Sergey Organov
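To relate the latency figures above to the UART, here is a rough C sketch of the arithmetic. It rests on assumptions that are not stated in this thread: a 32-character RX FIFO (my understanding of the i.MX UART, to be checked against the reference manual) and 10 bits on the wire per 8N1 character. The RX trigger levels 1 and 8 are the ones quoted in the patches in this thread. All function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Time one character occupies the line, in nanoseconds (rounded down). */
uint64_t char_time_ns(uint32_t baud, uint32_t bits_per_char)
{
	assert(baud > 0);
	return (uint64_t)bits_per_char * 1000000000ULL / baud;
}

/*
 * Characters that can still arrive after the RX interrupt fires at
 * trigger_level before the FIFO is full (assumed model, ignores the
 * shift register and the aging timer).
 */
uint32_t rx_slack_chars(uint32_t fifo_depth, uint32_t trigger_level)
{
	assert(trigger_level >= 1 && trigger_level <= fifo_depth);
	return fifo_depth - trigger_level;
}

/* Approximate time budget for the interrupt handler before an overrun. */
uint64_t rx_slack_ns(uint32_t fifo_depth, uint32_t trigger_level,
		     uint32_t baud, uint32_t bits_per_char)
{
	return rx_slack_chars(fifo_depth, trigger_level) *
	       char_time_ns(baud, bits_per_char);
}

/* Characters arriving back to back while interrupts are stalled. */
uint64_t chars_during_ns(uint64_t latency_ns, uint32_t baud,
			 uint32_t bits_per_char)
{
	assert(bits_per_char > 0);
	return latency_ns * baud / ((uint64_t)bits_per_char * 1000000000ULL);
}
```

Under these assumptions, 115200 baud gives roughly 86.8 us per character, so a trigger level of 8 leaves about 2.08 ms before the FIFO fills and a trigger level of 1 about 2.69 ms. A 10 ms stall corresponds to roughly 115 back-to-back characters, well beyond an assumed 32-character FIFO at either trigger level.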
* Re: Regression: serial: imx: overrun errors on debug UART
  2023-03-24 12:57   ` Fabio Estevam
  2023-03-24 13:37     ` Uwe Kleine-König
@ 2023-03-24 15:00     ` Stefan Wahren
  2023-03-25 11:31       ` Stefan Wahren
  1 sibling, 1 reply; 45+ messages in thread
From: Stefan Wahren @ 2023-03-24 15:00 UTC (permalink / raw)
  To: Fabio Estevam, Ilpo Järvinen
  Cc: Tomasz Moń, Greg Kroah-Hartman, Jiri Slaby, Uwe Kleine-König,
	Sergey Organov, Sascha Hauer, Pengutronix Kernel Team,
	NXP Linux Team, linux-serial, Linux ARM, Shawn Guo, Stefan Wahren

Hi Fabio,

Am 24.03.23 um 13:57 schrieb Fabio Estevam:
> Hi Stefan,
>
> On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen
> <ilpo.jarvinen@linux.intel.com> wrote:
>
>> This has come up earlier, see e.g.:
>>
>> https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/
>>
>> My somewhat uninformed suggestion: if the overrun problems mostly show up
>> with console ports, maybe the trigger level could depend on the port
>> being a console or not?
> Does the change below help? Taking Ilpo's suggestion into account:

this breaks the boot / debug console completely, but I got the idea.

>
> diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
> index 0fa1bd8cdec7..4d0aae38b7a5 100644
> --- a/drivers/tty/serial/imx.c
> +++ b/drivers/tty/serial/imx.c
> @@ -233,6 +233,7 @@ struct imx_port {
>  	enum imx_tx_state tx_state;
>  	struct hrtimer trigger_start_tx;
>  	struct hrtimer trigger_stop_tx;
> +	unsigned int rxtl;
>  };
>
>  struct imx_port_ucrs {
> @@ -1309,6 +1310,7 @@ static void imx_uart_clear_rx_errors(struct imx_port *sport)
>  }
>
>  #define TXTL_DEFAULT 2 /* reset default */
> +#define RXTL_DEFAULT_CONSOLE 1 /* 1 character or aging timer */
>  #define RXTL_DEFAULT 8 /* 8 characters or aging timer */
>  #define TXTL_DMA 8 /* DMA burst setting */
>  #define RXTL_DMA 9 /* DMA burst setting */
> @@ -1422,7 +1424,7 @@ static void imx_uart_disable_dma(struct imx_port *sport)
>  	ucr1 &= ~(UCR1_RXDMAEN | UCR1_TXDMAEN | UCR1_ATDMAEN);
>  	imx_uart_writel(sport, ucr1, UCR1);
>
> -	imx_uart_setup_ufcr(sport, TXTL_DEFAULT, RXTL_DEFAULT);
> +	imx_uart_setup_ufcr(sport, TXTL_DEFAULT, sport->rxtl);
>
>  	sport->dma_is_enabled = 0;
>  }
> @@ -1447,7 +1449,7 @@ static int imx_uart_startup(struct uart_port *port)
>  		return retval;
>  	}
>
> -	imx_uart_setup_ufcr(sport, TXTL_DEFAULT, RXTL_DEFAULT);
> +	imx_uart_setup_ufcr(sport, TXTL_DEFAULT, sport->rxtl);

I think that, at least at this point, sport->rxtl is not yet properly
initialized.

>
>  	/* disable the DREN bit (Data Ready interrupt enable) before
>  	 * requesting IRQs
> @@ -1464,6 +1466,11 @@ static int imx_uart_startup(struct uart_port *port)
>  	if (!uart_console(port) && imx_uart_dma_init(sport) == 0)
>  		dma_is_inited = 1;
>
> +	if (uart_console(port))
> +		sport->rxtl = RXTL_DEFAULT_CONSOLE;
> +	else
> +		sport->rxtl = RXTL_DEFAULT;
> +
>  	spin_lock_irqsave(&sport->port.lock, flags);
>
>  	/* Reset fifo's and state machines */
> @@ -1863,7 +1870,7 @@ static int imx_uart_poll_init(struct uart_port *port)
>  	if (retval)
>  		clk_disable_unprepare(sport->clk_ipg);
>
> -	imx_uart_setup_ufcr(sport, TXTL_DEFAULT, RXTL_DEFAULT);
> +	imx_uart_setup_ufcr(sport, TXTL_DEFAULT, sport->rxtl);
>
>  	spin_lock_irqsave(&sport->port.lock, flags);
>
> @@ -2139,7 +2146,7 @@ imx_uart_console_setup(struct console *co, char *options)
>  	else
>  		imx_uart_console_get_options(sport, &baud, &parity, &bits);
>
> -	imx_uart_setup_ufcr(sport, TXTL_DEFAULT, RXTL_DEFAULT);
> +	imx_uart_setup_ufcr(sport, TXTL_DEFAULT, sport->rxtl);
>
>  	retval = uart_set_options(&sport->port, co, baud, parity, bits, flow);
* Re: Regression: serial: imx: overrun errors on debug UART 2023-03-24 15:00 ` Stefan Wahren @ 2023-03-25 11:31 ` Stefan Wahren 2023-03-25 12:23 ` Fabio Estevam 2023-03-25 15:11 ` Uwe Kleine-König 0 siblings, 2 replies; 45+ messages in thread From: Stefan Wahren @ 2023-03-25 11:31 UTC (permalink / raw) To: Fabio Estevam, Ilpo Järvinen Cc: Tomasz Moń, Greg Kroah-Hartman, Jiri Slaby, Uwe Kleine-König, Sergey Organov, Sascha Hauer, Pengutronix Kernel Team, NXP Linux Team, linux-serial, Linux ARM, Shawn Guo, Stefan Wahren Hi Fabio, Am 24.03.23 um 16:00 schrieb Stefan Wahren: > Hi Fabio, > > Am 24.03.23 um 13:57 schrieb Fabio Estevam: >> Hi Stefan, >> >> On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen >> <ilpo.jarvinen@linux.intel.com> wrote: >> >>> This has come up earlier, see e.g.: >>> >>> https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/ >>> >>> My somewhat uninformed suggestion: if the overrun problems mostly >>> show up >>> with console ports, maybe the trigger level could depend on the port >>> being a console or not? >> Does the change below help? Taking Ilpo's suggestion into account: > this breaks the boot / debug console completely, but i got the idea. 
>> based on your patch, i successfully tested this: diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c index f07c4f9ff13c..1aacaa637ede 100644 --- a/drivers/tty/serial/imx.c +++ b/drivers/tty/serial/imx.c @@ -1277,6 +1277,7 @@ static void imx_uart_clear_rx_errors(struct imx_port *sport) } #define TXTL_DEFAULT 2 /* reset default */ +#define RXTL_DEFAULT_CONSOLE 1 /* reset default */ #define RXTL_DEFAULT 8 /* 8 characters or aging timer */ #define TXTL_DMA 8 /* DMA burst setting */ #define RXTL_DMA 9 /* DMA burst setting */ @@ -1286,6 +1287,9 @@ static void imx_uart_setup_ufcr(struct imx_port *sport, { unsigned int val; + if (uart_console(&sport->port)) + rxwl = RXTL_DEFAULT_CONSOLE; // fallback + /* set receiver / transmitter trigger level */ val = imx_uart_readl(sport, UFCR) & (UFCR_RFDIV | UFCR_DCEDTE); val |= txwl << UFCR_TXTL_SHF | rxwl; _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: Regression: serial: imx: overrun errors on debug UART 2023-03-25 11:31 ` Stefan Wahren @ 2023-03-25 12:23 ` Fabio Estevam 2023-03-25 15:11 ` Uwe Kleine-König 1 sibling, 0 replies; 45+ messages in thread From: Fabio Estevam @ 2023-03-25 12:23 UTC (permalink / raw) To: Stefan Wahren Cc: Ilpo Järvinen, Tomasz Moń, Greg Kroah-Hartman, Jiri Slaby, Uwe Kleine-König, Sergey Organov, Sascha Hauer, Pengutronix Kernel Team, NXP Linux Team, linux-serial, Linux ARM, Shawn Guo, Stefan Wahren Hi Stefan, On Sat, Mar 25, 2023 at 8:31 AM Stefan Wahren <stefan.wahren@i2se.com> wrote: > based on your patch, i successfully tested this: Great, much simpler :-) Please submit it as a formal patch so we can get feedback, thanks. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Regression: serial: imx: overrun errors on debug UART 2023-03-25 11:31 ` Stefan Wahren 2023-03-25 12:23 ` Fabio Estevam @ 2023-03-25 15:11 ` Uwe Kleine-König 2023-03-25 17:05 ` Stefan Wahren ` (2 more replies) 1 sibling, 3 replies; 45+ messages in thread From: Uwe Kleine-König @ 2023-03-25 15:11 UTC (permalink / raw) To: Stefan Wahren Cc: Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer, Sergey Organov, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM [-- Attachment #1.1: Type: text/plain, Size: 2666 bytes --] Hello, On Sat, Mar 25, 2023 at 12:31:01PM +0100, Stefan Wahren wrote: > Am 24.03.23 um 16:00 schrieb Stefan Wahren: > > Am 24.03.23 um 13:57 schrieb Fabio Estevam: > > > On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen > > > <ilpo.jarvinen@linux.intel.com> wrote: > > > > > > > This has come up earlier, see e.g.: > > > > > > > > https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/ > > > > > > > > My somewhat uninformed suggestion: if the overrun problems > > > > mostly show up > > > > with console ports, maybe the trigger level could depend on the port > > > > being a console or not? > > > Does the change below help? Taking Ilpo's suggestion into account: > > this breaks the boot / debug console completely, but i got the idea. 
> > > > > based on your patch, i successfully tested this: > > diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c > index f07c4f9ff13c..1aacaa637ede 100644 > --- a/drivers/tty/serial/imx.c > +++ b/drivers/tty/serial/imx.c > @@ -1277,6 +1277,7 @@ static void imx_uart_clear_rx_errors(struct imx_port > *sport) > } > > #define TXTL_DEFAULT 2 /* reset default */ > +#define RXTL_DEFAULT_CONSOLE 1 /* reset default */ > #define RXTL_DEFAULT 8 /* 8 characters or aging timer */ > #define TXTL_DMA 8 /* DMA burst setting */ > #define RXTL_DMA 9 /* DMA burst setting */ > @@ -1286,6 +1287,9 @@ static void imx_uart_setup_ufcr(struct imx_port > *sport, > { > unsigned int val; > > + if (uart_console(&sport->port)) > + rxwl = RXTL_DEFAULT_CONSOLE; // fallback > + > /* set receiver / transmitter trigger level */ > val = imx_uart_readl(sport, UFCR) & (UFCR_RFDIV | UFCR_DCEDTE); > val |= txwl << UFCR_TXTL_SHF | rxwl; So the current theory is that the issue occurs because of a combination of: - With a higher watermark value the irq triggers later and so there is less time until the ISR must run before an overflow happens; and - serial console activity disables irqs for a (relatively) long time right? So on a UP system the problem should also occur on a non-console port? Local irqs are only disabled if some printk is about to be emitted, aren't they? Does this match the error you're seeing? That makes me wonder if the error doesn't relate to the UART being a console port, but to the UART being used without DMA?! (So the patch above fixes the problem for you because on the console port no DMA is used?) Best regards Uwe -- Pengutronix e.K.
| Uwe Kleine-König | Industrial Linux Solutions | https://www.pengutronix.de/ |
* Re: Regression: serial: imx: overrun errors on debug UART 2023-03-25 15:11 ` Uwe Kleine-König @ 2023-03-25 17:05 ` Stefan Wahren 2023-03-25 19:00 ` Sergey Organov 2023-03-27 8:07 ` Tomasz Moń 2023-03-25 18:30 ` Sergey Organov 2023-03-27 14:42 ` Stefan Wahren 2 siblings, 2 replies; 45+ messages in thread From: Stefan Wahren @ 2023-03-25 17:05 UTC (permalink / raw) To: Uwe Kleine-König Cc: Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer, Sergey Organov, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM Hi Uwe, Am 25.03.23 um 16:11 schrieb Uwe Kleine-König: > Hello, > > On Sat, Mar 25, 2023 at 12:31:01PM +0100, Stefan Wahren wrote: >> Am 24.03.23 um 16:00 schrieb Stefan Wahren: >>> Am 24.03.23 um 13:57 schrieb Fabio Estevam: >>>> On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen >>>> <ilpo.jarvinen@linux.intel.com> wrote: >>>> >>>>> This has come up earlier, see e.g.: >>>>> >>>>> https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/ >>>>> >>>>> My somewhat uninformed suggestion: if the overrun problems >>>>> mostly show up >>>>> with console ports, maybe the trigger level could depend on the port >>>>> being a console or not? >>>> Does the change below help? Taking Ilpo's suggestion into account: >>> this breaks the boot / debug console completely, but i got the idea. 
>>>> >> >> based on your patch, i successfully tested this: >> >> diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c >> index f07c4f9ff13c..1aacaa637ede 100644 >> --- a/drivers/tty/serial/imx.c >> +++ b/drivers/tty/serial/imx.c >> @@ -1277,6 +1277,7 @@ static void imx_uart_clear_rx_errors(struct imx_port >> *sport) >> } >> >> #define TXTL_DEFAULT 2 /* reset default */ >> +#define RXTL_DEFAULT_CONSOLE 1 /* reset default */ >> #define RXTL_DEFAULT 8 /* 8 characters or aging timer */ >> #define TXTL_DMA 8 /* DMA burst setting */ >> #define RXTL_DMA 9 /* DMA burst setting */ >> @@ -1286,6 +1287,9 @@ static void imx_uart_setup_ufcr(struct imx_port >> *sport, >> { >> unsigned int val; >> >> + if (uart_console(&sport->port)) >> + rxwl = RXTL_DEFAULT_CONSOLE; // fallback >> + >> /* set receiver / transmitter trigger level */ >> val = imx_uart_readl(sport, UFCR) & (UFCR_RFDIV | UFCR_DCEDTE); >> val |= txwl << UFCR_TXTL_SHF | rxwl; > > So the current theory that the issue occurs because of a combination of: > > - With a higher watermark value the irq triggers later and so there is > less time the until the ISR must run before an overflow happens; and > > - serial console activity disables irqs for a (relative) long time > > right? > > So on an UP system the problem should occur also on a non-console port? This is less likely, because UART applications usually need some kind of flow control (either from the hardware or from the protocol side). For a non-console application the receiver usually waits until the end and then starts to transmit. Sure, you can flood the UART with characters and it's only a question of time until the RX FIFO is full and data gets lost. But i think we should focus on the real use case and not try to find the perfect solution. In the end it's always a compromise between latency and throughput. > Local irqs are only disabled if some printk is about to be emitted, > isn't it? Does this match the error you're seeing?
Yes, that's the typical "problem" of a console application. > That makes me wonder if the error doesn't relate to the UART being a > console port, but the UART being used without DMA?! (So the patch above > fixes the problem for you because on the console port no DMA is used?) As i said, the issue only occurred on the console. My problem is that the other UARTs on Tarragon are used for RS485, which means they are just half duplex. According to these lines in imx.c, DMA is never used for the console: /* Can we enable the DMA support? */ if (!uart_console(port) && imx_uart_dma_init(sport) == 0) dma_is_inited = 1; In the end the patch above only restores the old console behavior, but keeps Tomasz Moń's optimization for non-console ports (which it was intended for). Best regards Stefan > > Best regards > Uwe > >
* Re: Regression: serial: imx: overrun errors on debug UART 2023-03-25 17:05 ` Stefan Wahren @ 2023-03-25 19:00 ` Sergey Organov 2023-03-26 18:21 ` Francesco Dolcini 2023-03-27 8:07 ` Tomasz Moń 1 sibling, 1 reply; 45+ messages in thread From: Sergey Organov @ 2023-03-25 19:00 UTC (permalink / raw) To: Stefan Wahren Cc: Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM Hello, Stefan Wahren <stefan.wahren@i2se.com> writes: > Hi Uwe, > > Am 25.03.23 um 16:11 schrieb Uwe Kleine-König: >> Hello, >> On Sat, Mar 25, 2023 at 12:31:01PM +0100, Stefan Wahren wrote: >>> Am 24.03.23 um 16:00 schrieb Stefan Wahren: >>>> Am 24.03.23 um 13:57 schrieb Fabio Estevam: >>>>> On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen >>>>> <ilpo.jarvinen@linux.intel.com> wrote: >>>>> >>>>>> This has come up earlier, see e.g.: >>>>>> >>>>>> https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/ >>>>>> >>>>>> My somewhat uninformed suggestion: if the overrun problems >>>>>> mostly show up >>>>>> with console ports, maybe the trigger level could depend on the port >>>>>> being a console or not? >>>>> Does the change below help? Taking Ilpo's suggestion into account: >>>> this breaks the boot / debug console completely, but i got the idea. 
>>>>> >>> >>> based on your patch, i successfully tested this: >>> >>> diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c >>> index f07c4f9ff13c..1aacaa637ede 100644 >>> --- a/drivers/tty/serial/imx.c >>> +++ b/drivers/tty/serial/imx.c >>> @@ -1277,6 +1277,7 @@ static void imx_uart_clear_rx_errors(struct imx_port >>> *sport) >>> } >>> >>> #define TXTL_DEFAULT 2 /* reset default */ >>> +#define RXTL_DEFAULT_CONSOLE 1 /* reset default */ >>> #define RXTL_DEFAULT 8 /* 8 characters or aging timer */ >>> #define TXTL_DMA 8 /* DMA burst setting */ >>> #define RXTL_DMA 9 /* DMA burst setting */ >>> @@ -1286,6 +1287,9 @@ static void imx_uart_setup_ufcr(struct imx_port >>> *sport, >>> { >>> unsigned int val; >>> >>> + if (uart_console(&sport->port)) >>> + rxwl = RXTL_DEFAULT_CONSOLE; // fallback >>> + >>> /* set receiver / transmitter trigger level */ >>> val = imx_uart_readl(sport, UFCR) & (UFCR_RFDIV | UFCR_DCEDTE); >>> val |= txwl << UFCR_TXTL_SHF | rxwl; >> So the current theory that the issue occurs because of a combination of: >> - With a higher watermark value the irq triggers later and so there is >> less time the until the ISR must run before an overflow happens; and >> - serial console activity disables irqs for a (relative) long time >> right? >> So on an UP system the problem should occur also on a non-console port? > > This is less likely, because UART applications usually need some kind > of flow control (either from hardware or protocol side). For a > non-console application the receiver usually wait until the end and > then starts to transmit. Only CTS/RTS hardware handshake could help, as otherwise printk() output is typically entirely async with respect to transmissions on another port, and software protocol(s) then are irrelevant, unless they enforce extremely short chunks of data (less than FIFO size). > Sure you can flood the UART with characters and it's only a question > of time until the RX FIFO is full and data get lost. 
In a correctly working RT system this doesn't typically happen, as CPUs are way faster than typical UART speeds and are able to handle the load easily, provided the UART has a decent FIFO. It's disabling IRQs for prolonged times that makes shit happen. [...] > > According to these lines in imx.c DMA is never used for console: > > /* Can we enable the DMA support? */ > if (!uart_console(port) && imx_uart_dma_init(sport) == 0) > dma_is_inited = 1; > > At the end the patch above only restores the old console behavior, but > keep Tomasz Moń's optimization for non-console (which was indented > for). So this will likely only help in this particular case, and will leave the problem in place on other DMA-disabled ports. To "fix" this, the old threshold would have to be restored on all DMA-disabled ports, and then Tomasz's original patch would be entirely reverted, it seems. Disclaimer: all of the above assumes that printk is the core cause of the problem in this case, which has not yet been shown in testing, as far as I know. Best regards, -- Sergey
* Re: Regression: serial: imx: overrun errors on debug UART 2023-03-25 19:00 ` Sergey Organov @ 2023-03-26 18:21 ` Francesco Dolcini 0 siblings, 0 replies; 45+ messages in thread From: Francesco Dolcini @ 2023-03-26 18:21 UTC (permalink / raw) To: Sergey Organov Cc: Stefan Wahren, Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM On Sat, Mar 25, 2023 at 10:00:24PM +0300, Sergey Organov wrote: > In correctly working RT system this doesn't typically happen, as CPUs > are way faster than typical UART speeds, and are able to handle the > loads easily, provided UART has decent FIFO. It's disabling IRQs for > prolonged times that makes shit happen. The first time we looked into this issue was before 7a637784d517 and with a PREEMPT-RT patched kernel (if I remember correctly it was a v5.4). The system was not loaded at all and the behavior was pretty surprising, for the reasons you just wrote here. Francesco
* Re: Regression: serial: imx: overrun errors on debug UART 2023-03-25 17:05 ` Stefan Wahren 2023-03-25 19:00 ` Sergey Organov @ 2023-03-27 8:07 ` Tomasz Moń 1 sibling, 0 replies; 45+ messages in thread From: Tomasz Moń @ 2023-03-27 8:07 UTC (permalink / raw) To: Stefan Wahren, Uwe Kleine-König Cc: Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer, Sergey Organov, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM On Sat, 2023-03-25 at 18:05 +0100, Stefan Wahren wrote: > Am 25.03.23 um 16:11 schrieb Uwe Kleine-König: > > So on an UP system the problem should occur also on a non-console port? > > This is less likely, because UART applications usually need some kind of > flow control (either from hardware or protocol side). For a non-console > application the receiver usually wait until the end and then starts to > transmit. > > Sure you can flood the UART with characters and it's only a question of > time until the RX FIFO is full and data get lost. But i think we should > focus on the real use case and don't try find the perfect solution. At > the end it's always a compromise between latency and throughput. If you enable DMA on the UART then you are extremely unlikely to hit overflow. To some degree the DMA can be seen as "extended" RX FIFO. Unfortunately DMA cannot be used for imx console UART. > > That makes me wonder if the error doesn't relate to the UART being a > > console port, but the UART being used without DMA?! (So the patch above > > fixes the problem for you because on the console port no DMA is used?) > > As i said the issue only occured on the console. My problem is that the > other UARTs on Tarragon are used for RS485 which means they are just > half duplex. > > According to these lines in imx.c DMA is never used for console: > > /* Can we enable the DMA support? 
*/ > if (!uart_console(port) && imx_uart_dma_init(sport) == 0) > dma_is_inited = 1; > > At the end the patch above only restores the old console behavior, but > keep Tomasz Moń's optimization for non-console (which was indented for). Setting RXTL to 1 essentially raises the irq a bit earlier, i.e. when the RX FIFO can hold 31 more characters. With RXTL set to 8 and a data burst, the irq is raised when the RX FIFO can hold 24 more characters. Therefore with RXTL set to 1 (instead of 8) the maximum acceptable RX interrupt latency (i.e. before you start losing incoming characters) is 7 character times longer. Best Regards, Tomasz Moń
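The headroom arithmetic in the message above can be turned into a latency budget with a small sketch. It is not driver code: the helper names are hypothetical, the 32-entry FIFO depth is inferred from the "31 more characters" figure at RXTL = 1, and 8N1 framing is assumed to cost 10 bits per character on the wire.

```c
/*
 * Illustrative sketch only; rx_headroom_chars() and chars_to_usec() are
 * hypothetical helpers, not functions from drivers/tty/serial/imx.c.
 */

/*
 * Number of characters that can still arrive after the RX interrupt is
 * raised at trigger level rxtl before a FIFO of fifo_depth entries is full.
 */
static unsigned int rx_headroom_chars(unsigned int fifo_depth, unsigned int rxtl)
{
	return rxtl >= fifo_depth ? 0 : fifo_depth - rxtl;
}

/*
 * Time needed to receive n_chars characters, in microseconds.
 * Assumption: 8N1 framing, i.e. 1 start + 8 data + 1 stop = 10 bits.
 */
static double chars_to_usec(unsigned int n_chars, unsigned int baud)
{
	const double bits_per_char = 10.0;

	return n_chars * bits_per_char * 1e6 / baud;
}
```

Under these assumptions, at 115200 baud one character takes roughly 87 us, so `chars_to_usec(rx_headroom_chars(32, 1), 115200)` gives a budget of about 2.7 ms, while RXTL = 8 leaves about 2.1 ms. The 7 character difference is roughly 0.6 ms of additional tolerated interrupt latency.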
* Re: Regression: serial: imx: overrun errors on debug UART 2023-03-25 15:11 ` Uwe Kleine-König 2023-03-25 17:05 ` Stefan Wahren @ 2023-03-25 18:30 ` Sergey Organov 2023-03-27 14:42 ` Stefan Wahren 2 siblings, 0 replies; 45+ messages in thread From: Sergey Organov @ 2023-03-25 18:30 UTC (permalink / raw) To: Uwe Kleine-König Cc: Stefan Wahren, Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM Hello, Uwe Kleine-König <u.kleine-koenig@pengutronix.de> writes: > Hello, > > On Sat, Mar 25, 2023 at 12:31:01PM +0100, Stefan Wahren wrote: [...] > > So the current theory that the issue occurs because of a combination of: > > - With a higher watermark value the irq triggers later and so there is > less time the until the ISR must run before an overflow happens; and > > - serial console activity disables irqs for a (relative) long time > > right? > > So on an UP system the problem should occur also on a non-console > port? That's exactly what I've experienced, especially when console baud rate was lower than that of other port(s). I had console at 115200, and got immediate problems on another port working at 460800 whenever relatively lengthy printk output has been emitted (in my case it was info from wlan driver.) > Local irqs are only disabled if some printk is about to be emitted, > isn't it? Yep, and this allows for easy check if it's indeed printk that causes this by eliminating the output using # echo 0 > /proc/sys/kernel/printk > Does this match the error you're seeing? > > That makes me wonder if the error doesn't relate to the UART being a > console port, but the UART being used without DMA?! (So the patch above > fixes the problem for you because on the console port no DMA is used?) 
Indeed DMA is likely to be able to hide the problem if the cause is printk, though all my results were obtained on DMA-disabled ports, and I never checked with DMA enabled, so unfortunately I have no tested confirmation of this idea. Best regards, -- Sergey Organov _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Regression: serial: imx: overrun errors on debug UART 2023-03-25 15:11 ` Uwe Kleine-König 2023-03-25 17:05 ` Stefan Wahren 2023-03-25 18:30 ` Sergey Organov @ 2023-03-27 14:42 ` Stefan Wahren 2023-03-27 15:11 ` Sergey Organov 2 siblings, 1 reply; 45+ messages in thread From: Stefan Wahren @ 2023-03-27 14:42 UTC (permalink / raw) To: Uwe Kleine-König, Sergey Organov Cc: Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM Hi, Am 25.03.23 um 16:11 schrieb Uwe Kleine-König: > Hello, > > On Sat, Mar 25, 2023 at 12:31:01PM +0100, Stefan Wahren wrote: >> Am 24.03.23 um 16:00 schrieb Stefan Wahren: >>> Am 24.03.23 um 13:57 schrieb Fabio Estevam: >>>> On Fri, Mar 24, 2023 at 8:48 AM Ilpo Järvinen >>>> <ilpo.jarvinen@linux.intel.com> wrote: >>>> >>>>> This has come up earlier, see e.g.: >>>>> >>>>> https://lore.kernel.org/linux-serial/20221003110850.GA28338@francesco-nb.int.toradex.com/ >>>>> >>>>> My somewhat uninformed suggestion: if the overrun problems >>>>> mostly show up >>>>> with console ports, maybe the trigger level could depend on the port >>>>> being a console or not? >>>> Does the change below help? Taking Ilpo's suggestion into account: >>> this breaks the boot / debug console completely, but i got the idea. 
>>>> >> >> based on your patch, i successfully tested this: >> >> diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c >> index f07c4f9ff13c..1aacaa637ede 100644 >> --- a/drivers/tty/serial/imx.c >> +++ b/drivers/tty/serial/imx.c >> @@ -1277,6 +1277,7 @@ static void imx_uart_clear_rx_errors(struct imx_port >> *sport) >> } >> >> #define TXTL_DEFAULT 2 /* reset default */ >> +#define RXTL_DEFAULT_CONSOLE 1 /* reset default */ >> #define RXTL_DEFAULT 8 /* 8 characters or aging timer */ >> #define TXTL_DMA 8 /* DMA burst setting */ >> #define RXTL_DMA 9 /* DMA burst setting */ >> @@ -1286,6 +1287,9 @@ static void imx_uart_setup_ufcr(struct imx_port >> *sport, >> { >> unsigned int val; >> >> + if (uart_console(&sport->port)) >> + rxwl = RXTL_DEFAULT_CONSOLE; // fallback >> + >> /* set receiver / transmitter trigger level */ >> val = imx_uart_readl(sport, UFCR) & (UFCR_RFDIV | UFCR_DCEDTE); >> val |= txwl << UFCR_TXTL_SHF | rxwl; > > So the current theory that the issue occurs because of a combination of: > > - With a higher watermark value the irq triggers later and so there is > less time the until the ISR must run before an overflow happens; and > > - serial console activity disables irqs for a (relative) long time > > right? > > So on an UP system the problem should occur also on a non-console port? > Local irqs are only disabled if some printk is about to be emitted, > isn't it? Does this match the error you're seeing? > > That makes me wonder if the error doesn't relate to the UART being a > console port, but the UART being used without DMA?! (So the patch above > fixes the problem for you because on the console port no DMA is used?) today i had time to do some testing. At first i tested with different RXTL_DEFAULT values. 1 No overrun 2 No overrun 4 No overrun 8 Overruns After that i look at the # echo 0 > /proc/sys/kernel/printk approach, but this didn't change anything. 
The kernel is usually silent about log messages after boot and the console still works with echo. Forcing some driver to call printk periodically would make the console unusable. Finally i tried to disable the spin_lock in imx_uart_console_write: diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c index f07c4f9ff13c..c342559ff1a2 100644 --- a/drivers/tty/serial/imx.c +++ b/drivers/tty/serial/imx.c @@ -2007,14 +2007,12 @@ imx_uart_console_write(struct console *co, const char *s, unsigned int count) struct imx_port_ucrs old_ucr; unsigned long flags; unsigned int ucr1; - int locked = 1; + int locked = 0; if (sport->port.sysrq) locked = 0; else if (oops_in_progress) locked = spin_trylock_irqsave(&sport->port.lock, flags); - else - spin_lock_irqsave(&sport->port.lock, flags); /* * First, save UCR1/2/3 and then disable interrupts But the overruns still occurred. Is this because the serial core already holds a lock? > > Best regards > Uwe >
* Re: Regression: serial: imx: overrun errors on debug UART 2023-03-27 14:42 ` Stefan Wahren @ 2023-03-27 15:11 ` Sergey Organov 2023-03-27 15:30 ` Russell King (Oracle) 2023-04-16 13:43 ` Stefan Wahren 0 siblings, 2 replies; 45+ messages in thread From: Sergey Organov @ 2023-03-27 15:11 UTC (permalink / raw) To: Stefan Wahren Cc: Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM Stefan Wahren <stefan.wahren@i2se.com> writes: > Hi, > > Am 25.03.23 um 16:11 schrieb Uwe Kleine-König: [...] > today i had time to do some testing. At first i tested with different RXTL_DEFAULT values. > > 1 No overrun > 2 No overrun > 4 No overrun > 8 Overruns > > After that i look at the # echo 0 > /proc/sys/kernel/printk approach, > but this didn't change anything. The kernel is usually silent about > log message after boot and the console works still with echo. > Enforcing some driver to call printk periodically would make the > console unusuable. As you figured that printk() is not the cause, it must be something else that causes overruns, so there is no need to check printk case further. > > Finally i tried to disabled the spin_lock in imx_uart_console_write: > > diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c > index f07c4f9ff13c..c342559ff1a2 100644 > --- a/drivers/tty/serial/imx.c > +++ b/drivers/tty/serial/imx.c > @@ -2007,14 +2007,12 @@ imx_uart_console_write(struct console *co, const char *s, unsigned int count) > struct imx_port_ucrs old_ucr; > unsigned long flags; > unsigned int ucr1; > - int locked = 1; > + int locked = 0; > > if (sport->port.sysrq) > locked = 0; > else if (oops_in_progress) > locked = spin_trylock_irqsave(&sport->port.lock, flags); > - else > - spin_lock_irqsave(&sport->port.lock, flags); > > /* > * First, save UCR1/2/3 and then disable interrupts > > But the overruns still occured. 
Is this because the serial core > already helds a lock? This probably isn't even called when there is no printk() output, as user-space writes to /dev/console are rather performed through regular generic code, AFAIK. Best regards, -- Sergey Organov _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Regression: serial: imx: overrun errors on debug UART 2023-03-27 15:11 ` Sergey Organov @ 2023-03-27 15:30 ` Russell King (Oracle) 2023-04-16 13:43 ` Stefan Wahren 1 sibling, 0 replies; 45+ messages in thread From: Russell King (Oracle) @ 2023-03-27 15:30 UTC (permalink / raw) To: Sergey Organov Cc: Stefan Wahren, Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM On Mon, Mar 27, 2023 at 06:11:12PM +0300, Sergey Organov wrote: > Stefan Wahren <stefan.wahren@i2se.com> writes: > > > Hi, > > > > Am 25.03.23 um 16:11 schrieb Uwe Kleine-König: > > [...] > > > today i had time to do some testing. At first i tested with different RXTL_DEFAULT values. > > > > 1 No overrun > > 2 No overrun > > 4 No overrun > > 8 Overruns > > > > After that i look at the # echo 0 > /proc/sys/kernel/printk approach, > > but this didn't change anything. The kernel is usually silent about > > log message after boot and the console works still with echo. > > Enforcing some driver to call printk periodically would make the > > console unusuable. > > As you figured that printk() is not the cause, it must be something else > that causes overruns, so there is no need to check printk case further. 
> > > > > Finally i tried to disabled the spin_lock in imx_uart_console_write: > > > > diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c > > index f07c4f9ff13c..c342559ff1a2 100644 > > --- a/drivers/tty/serial/imx.c > > +++ b/drivers/tty/serial/imx.c > > @@ -2007,14 +2007,12 @@ imx_uart_console_write(struct console *co, const char *s, unsigned int count) > > struct imx_port_ucrs old_ucr; > > unsigned long flags; > > unsigned int ucr1; > > - int locked = 1; > > + int locked = 0; > > > > if (sport->port.sysrq) > > locked = 0; > > else if (oops_in_progress) > > locked = spin_trylock_irqsave(&sport->port.lock, flags); > > - else > > - spin_lock_irqsave(&sport->port.lock, flags); > > > > /* > > * First, save UCR1/2/3 and then disable interrupts > > > > But the overruns still occured. Is this because the serial core > > already helds a lock? > > This probably isn't even called when there is no printk() output, as > user-space writes to /dev/console are rather performed through regular > generic code, AFAIK. Correct on both points. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last! _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Regression: serial: imx: overrun errors on debug UART 2023-03-27 15:11 ` Sergey Organov 2023-03-27 15:30 ` Russell King (Oracle) @ 2023-04-16 13:43 ` Stefan Wahren 2023-04-17 16:50 ` Sergey Organov 1 sibling, 1 reply; 45+ messages in thread From: Stefan Wahren @ 2023-04-16 13:43 UTC (permalink / raw) To: Sergey Organov Cc: Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM Hi Sergey, Am 27.03.23 um 17:11 schrieb Sergey Organov: > Stefan Wahren <stefan.wahren@i2se.com> writes: > >> Hi, >> >> Am 25.03.23 um 16:11 schrieb Uwe Kleine-König: > > [...] > >> today i had time to do some testing. At first i tested with different RXTL_DEFAULT values. >> >> 1 No overrun >> 2 No overrun >> 4 No overrun >> 8 Overruns >> >> After that i look at the # echo 0 > /proc/sys/kernel/printk approach, >> but this didn't change anything. The kernel is usually silent about >> log message after boot and the console works still with echo. >> Enforcing some driver to call printk periodically would make the >> console unusuable. > > As you figured that printk() is not the cause, it must be something else > that causes overruns, so there is no need to check printk case further. 
> >> >> Finally i tried to disabled the spin_lock in imx_uart_console_write: >> >> diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c >> index f07c4f9ff13c..c342559ff1a2 100644 >> --- a/drivers/tty/serial/imx.c >> +++ b/drivers/tty/serial/imx.c >> @@ -2007,14 +2007,12 @@ imx_uart_console_write(struct console *co, const char *s, unsigned int count) >> struct imx_port_ucrs old_ucr; >> unsigned long flags; >> unsigned int ucr1; >> - int locked = 1; >> + int locked = 0; >> >> if (sport->port.sysrq) >> locked = 0; >> else if (oops_in_progress) >> locked = spin_trylock_irqsave(&sport->port.lock, flags); >> - else >> - spin_lock_irqsave(&sport->port.lock, flags); >> >> /* >> * First, save UCR1/2/3 and then disable interrupts >> >> But the overruns still occured. Is this because the serial core >> already helds a lock? > > This probably isn't even called when there is no printk() output, as > user-space writes to /dev/console are rather performed through regular > generic code, AFAIK. i had some time today to investigate this a little bit. 
I thought it would be a good idea to use debugfs as an ugly quick hack:

diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index b8c817d26b00..d5bde4754004 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -30,6 +30,7 @@
 #include <linux/dma-mapping.h>
 
 #include <asm/irq.h>
+#include <linux/debugfs.h>
 #include <linux/dma/imx-dma.h>
 
 #include "serial_mctrl_gpio.h"
@@ -237,8 +238,19 @@ struct imx_port {
 	enum imx_tx_state tx_state;
 	struct hrtimer trigger_start_tx;
 	struct hrtimer trigger_stop_tx;
+
+	struct dentry *debugfs_dir;
+
+	/* stats exposed through debugfs */
+	s64 total_duration_us;
+	s64 rx_duration_us;
+	s64 tx_duration_us;
+	u32 received;
+	u32 send;
 };
 
+static struct dentry *imx_debugfs_root;
+
 struct imx_port_ucrs {
 	unsigned int ucr1;
 	unsigned int ucr2;
@@ -536,12 +548,15 @@ static void imx_uart_dma_tx(struct imx_port *sport);
 static inline void imx_uart_transmit_buffer(struct imx_port *sport)
 {
 	struct circ_buf *xmit = &sport->port.state->xmit;
+	u32 send = 0;
 
 	if (sport->port.x_char) {
 		/* Send next char */
 		imx_uart_writel(sport, sport->port.x_char, URTX0);
 		sport->port.icount.tx++;
 		sport->port.x_char = 0;
+		if (sport->send == 0)
+			sport->send = 1;
 		return;
 	}
 
@@ -576,8 +591,12 @@ static inline void imx_uart_transmit_buffer(struct imx_port *sport)
 		imx_uart_writel(sport, xmit->buf[xmit->tail], URTX0);
 		xmit->tail = (xmit->tail + 1) & (UART_XMIT_SIZE - 1);
 		sport->port.icount.tx++;
+		send++;
 	}
 
+	if (send > sport->send)
+		sport->send = send;
+
 	if (uart_circ_chars_pending(xmit) < WAKEUP_CHARS)
 		uart_write_wakeup(&sport->port);
 
@@ -808,6 +827,7 @@ static irqreturn_t __imx_uart_rxint(int irq, void *dev_id)
 {
 	struct imx_port *sport = dev_id;
 	unsigned int rx, flg, ignored = 0;
+	u32 received = 0;
 	struct tty_port *port = &sport->port.state->port;
 
 	while (imx_uart_readl(sport, USR2) & USR2_RDR) {
@@ -815,6 +835,7 @@ static irqreturn_t __imx_uart_rxint(int irq, void *dev_id)
 
 		flg = TTY_NORMAL;
 		sport->port.icount.rx++;
+		received++;
 
 		rx = imx_uart_readl(sport, URXD0);
 
@@ -868,6 +889,9 @@ static irqreturn_t __imx_uart_rxint(int irq, void *dev_id)
 
 out:
 	tty_flip_buffer_push(port);
 
+	if (received > sport->received)
+		sport->received = received;
+
 	return IRQ_HANDLED;
 }
@@ -942,6 +966,8 @@ static irqreturn_t imx_uart_int(int irq, void *dev_id)
 	struct imx_port *sport = dev_id;
 	unsigned int usr1, usr2, ucr1, ucr2, ucr3, ucr4;
 	irqreturn_t ret = IRQ_NONE;
+	ktime_t total_start = ktime_get();
+	s64 total_duration_us, rx_duration_us, tx_duration_us;
 
 	spin_lock(&sport->port.lock);
 
@@ -978,14 +1004,24 @@ static irqreturn_t imx_uart_int(int irq, void *dev_id)
 		usr2 &= ~USR2_ORE;
 
 	if (usr1 & (USR1_RRDY | USR1_AGTIM)) {
+		ktime_t rx_start = ktime_get();
 		imx_uart_writel(sport, USR1_AGTIM, USR1);
 
 		__imx_uart_rxint(irq, dev_id);
+		rx_duration_us = ktime_us_delta(ktime_get(), rx_start);
+		if (rx_duration_us > sport->rx_duration_us)
+			sport->rx_duration_us = rx_duration_us;
+
 		ret = IRQ_HANDLED;
 	}
 
 	if ((usr1 & USR1_TRDY) || (usr2 & USR2_TXDC)) {
+		ktime_t tx_start = ktime_get();
 		imx_uart_transmit_buffer(sport);
+		tx_duration_us = ktime_us_delta(ktime_get(), tx_start);
+		if (tx_duration_us > sport->tx_duration_us)
+			sport->tx_duration_us = tx_duration_us;
+
 		ret = IRQ_HANDLED;
 	}
 
@@ -1015,6 +1051,10 @@ static irqreturn_t imx_uart_int(int irq, void *dev_id)
 
 	spin_unlock(&sport->port.lock);
 
+	total_duration_us = ktime_us_delta(ktime_get(), total_start);
+	if (total_duration_us > sport->total_duration_us)
+		sport->total_duration_us = total_duration_us;
+
 	return ret;
 }
 
@@ -2233,6 +2273,26 @@ static const struct serial_rs485 imx_rs485_supported = {
 #define RX_DMA_PERIODS 16
 #define RX_DMA_PERIOD_LEN (PAGE_SIZE / 4)
 
+static int debugfs_stats_show(struct seq_file *s, void *unused)
+{
+	struct imx_port *sport = s->private;
+
+	seq_printf(s, "total_duration_us:\t%lld\n", sport->total_duration_us);
+	seq_printf(s, "rx_duration_us:\t%lld\n", sport->rx_duration_us);
+	seq_printf(s, "tx_duration_us:\t%lld\n", sport->tx_duration_us);
+	seq_printf(s, "received:\t\t%u\n", sport->received);
+	seq_printf(s, "send:\t\t%u\n", sport->send);
+	return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(debugfs_stats);
+
+static void imx_init_debugfs(struct imx_port *sport, const char *device)
+{
+	sport->debugfs_dir = debugfs_create_dir(device, imx_debugfs_root);
+	debugfs_create_file("stats", 0444, sport->debugfs_dir, sport,
+			    &debugfs_stats_fops);
+}
+
 static int imx_uart_probe(struct platform_device *pdev)
 {
 	struct device_node *np = pdev->dev.of_node;
@@ -2485,6 +2545,7 @@ static int imx_uart_probe(struct platform_device *pdev)
 	imx_uart_ports[sport->port.line] = sport;
 
 	platform_set_drvdata(pdev, sport);
+	imx_init_debugfs(sport, dev_name(&pdev->dev));
 
 	return uart_add_one_port(&imx_uart_uart_driver, &sport->port);
 }
@@ -2678,9 +2739,14 @@ static int __init imx_uart_init(void)
 	if (ret)
 		return ret;
 
+	imx_debugfs_root = debugfs_create_dir(
+		imx_uart_platform_driver.driver.name, NULL);
+
 	ret = platform_driver_register(&imx_uart_platform_driver);
-	if (ret != 0)
+	if (ret != 0) {
+		debugfs_remove_recursive(imx_debugfs_root);
 		uart_unregister_driver(&imx_uart_uart_driver);
+	}
 
 	return ret;
 }
@@ -2688,6 +2754,7 @@ static int __init imx_uart_init(void)
 static void __exit imx_uart_exit(void)
 {
 	platform_driver_unregister(&imx_uart_platform_driver);
+	debugfs_remove_recursive(imx_debugfs_root);
 	uart_unregister_driver(&imx_uart_uart_driver);
 }
 

Using this I was able to better compare the behavior with RXTL_DEFAULT 1 (without overrun errors) and RXTL_DEFAULT 8 (with overrun errors) on my i.MX6ULL test platform.
After doing my usual test scenario (copying some text lines to the console) I got the following results:

RXTL_DEFAULT 1
21f0000.serial/stats:total_duration_us: 61
21f0000.serial/stats:rx_duration_us: 36
21f0000.serial/stats:tx_duration_us: 48
21f0000.serial/stats:received: 28
21f0000.serial/stats:send: 33

RXTL_DEFAULT 8
21f0000.serial/stats:total_duration_us: 78
21f0000.serial/stats:rx_duration_us: 46
21f0000.serial/stats:tx_duration_us: 47
21f0000.serial/stats:received: 33
21f0000.serial/stats:send: 33

So, based on the maximum number of characters received per RX interrupt, I think the root cause of this issue was already there before the change, because that number is close to the FIFO capacity (32 chars). Increasing RXTL_DEFAULT then makes the situation even worse by adding enough latency to cause overrun errors.

Best regards

>
> Best regards,
> -- Sergey Organov
* Re: Regression: serial: imx: overrun errors on debug UART 2023-04-16 13:43 ` Stefan Wahren @ 2023-04-17 16:50 ` Sergey Organov 2023-04-17 18:40 ` Stefan Wahren 2023-04-18 16:16 ` Stefan Wahren 0 siblings, 2 replies; 45+ messages in thread From: Sergey Organov @ 2023-04-17 16:50 UTC (permalink / raw) To: Stefan Wahren Cc: Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM Hi Stefan, Stefan Wahren <stefan.wahren@i2se.com> writes: > Hi Sergey, > [...] > i had some time today to investigate this a little bit. I thought it > would be a good idea to use debugfs as a ugly quick hack: > [...] > Using this i was able to better compare the behavior with RXTL_DEFAULT > 1 (without overrun errors) and RXTL_DEFAULT 8 (with overrun errors) on > my i.MX6ULL test platform. After doing my usual test scenario (copy > some text lines to console) i got the following results: > > RXTL_DEFAULT 1 > 21f0000.serial/stats:total_duration_us: 61 > 21f0000.serial/stats:rx_duration_us: 36 > 21f0000.serial/stats:tx_duration_us: 48 > 21f0000.serial/stats:received: 28 > 21f0000.serial/stats:send: 33 > > RXTL_DEFAULT 8 > 21f0000.serial/stats:total_duration_us: 78 > 21f0000.serial/stats:rx_duration_us: 46 > 21f0000.serial/stats:tx_duration_us: 47 > 21f0000.serial/stats:received: 33 > 21f0000.serial/stats:send: 33 > > So based on the maximum of received characters on RX interrupt, i > consider the root cause of this issue has already been there because > the amount is near to the maximum of the FIFO (32 chars). So finally > increasing RXTL_DEFAULT makes the situation even worse by adding > enough latency for overrun errors. Yep, looks like an issue. What's the baud rate? 115200? If so, it means that interrupts are apparently blocked in your system for up to about 28/(115200/10)=2.4 milliseconds. 
This is a very large number, and it may negatively affect system performance in other places as well, I'm afraid.

Best regards,
-- Sergey Organov
* Re: Regression: serial: imx: overrun errors on debug UART 2023-04-17 16:50 ` Sergey Organov @ 2023-04-17 18:40 ` Stefan Wahren 2023-04-18 16:16 ` Stefan Wahren 1 sibling, 0 replies; 45+ messages in thread From: Stefan Wahren @ 2023-04-17 18:40 UTC (permalink / raw) To: Sergey Organov Cc: Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM Am 17.04.23 um 18:50 schrieb Sergey Organov: > Hi Stefan, > > Stefan Wahren <stefan.wahren@i2se.com> writes: > >> Hi Sergey, >> > > [...] > >> i had some time today to investigate this a little bit. I thought it >> would be a good idea to use debugfs as a ugly quick hack: >> > > [...] > >> Using this i was able to better compare the behavior with RXTL_DEFAULT >> 1 (without overrun errors) and RXTL_DEFAULT 8 (with overrun errors) on >> my i.MX6ULL test platform. After doing my usual test scenario (copy >> some text lines to console) i got the following results: >> >> RXTL_DEFAULT 1 >> 21f0000.serial/stats:total_duration_us: 61 >> 21f0000.serial/stats:rx_duration_us: 36 >> 21f0000.serial/stats:tx_duration_us: 48 >> 21f0000.serial/stats:received: 28 >> 21f0000.serial/stats:send: 33 >> >> RXTL_DEFAULT 8 >> 21f0000.serial/stats:total_duration_us: 78 >> 21f0000.serial/stats:rx_duration_us: 46 >> 21f0000.serial/stats:tx_duration_us: 47 >> 21f0000.serial/stats:received: 33 >> 21f0000.serial/stats:send: 33 >> >> So based on the maximum of received characters on RX interrupt, i >> consider the root cause of this issue has already been there because >> the amount is near to the maximum of the FIFO (32 chars). So finally >> increasing RXTL_DEFAULT makes the situation even worse by adding >> enough latency for overrun errors. > > Yep, looks like an issue. > > What's the baud rate? 115200? 
Correct

> If so, it means that interrupts are
> apparently blocked in your system for up to about 28/(115200/10)=2.4
> milliseconds. This is very large number, and it may negatively affect
> system performance in other places as well, I'm afraid.
>
> Best regards,
> -- Sergey Organov
* Re: Regression: serial: imx: overrun errors on debug UART 2023-04-17 16:50 ` Sergey Organov 2023-04-17 18:40 ` Stefan Wahren @ 2023-04-18 16:16 ` Stefan Wahren 2023-05-22 9:25 ` Linux regression tracking (Thorsten Leemhuis) 1 sibling, 1 reply; 45+ messages in thread From: Stefan Wahren @ 2023-04-18 16:16 UTC (permalink / raw) To: Sergey Organov Cc: Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM Hi Sergey, Am 17.04.23 um 18:50 schrieb Sergey Organov: > Hi Stefan, > > Stefan Wahren <stefan.wahren@i2se.com> writes: > >> Hi Sergey, >> > > [...] > >> i had some time today to investigate this a little bit. I thought it >> would be a good idea to use debugfs as a ugly quick hack: >> > > [...] > >> Using this i was able to better compare the behavior with RXTL_DEFAULT >> 1 (without overrun errors) and RXTL_DEFAULT 8 (with overrun errors) on >> my i.MX6ULL test platform. After doing my usual test scenario (copy >> some text lines to console) i got the following results: >> >> RXTL_DEFAULT 1 >> 21f0000.serial/stats:total_duration_us: 61 >> 21f0000.serial/stats:rx_duration_us: 36 >> 21f0000.serial/stats:tx_duration_us: 48 >> 21f0000.serial/stats:received: 28 >> 21f0000.serial/stats:send: 33 >> >> RXTL_DEFAULT 8 >> 21f0000.serial/stats:total_duration_us: 78 >> 21f0000.serial/stats:rx_duration_us: 46 >> 21f0000.serial/stats:tx_duration_us: 47 >> 21f0000.serial/stats:received: 33 >> 21f0000.serial/stats:send: 33 >> >> So based on the maximum of received characters on RX interrupt, i >> consider the root cause of this issue has already been there because >> the amount is near to the maximum of the FIFO (32 chars). So finally >> increasing RXTL_DEFAULT makes the situation even worse by adding >> enough latency for overrun errors. > > Yep, looks like an issue. > > What's the baud rate? 115200? 
If so, it means that interrupts are
> apparently blocked in your system for up to about 28/(115200/10)=2.4
> milliseconds. This is very large number, and it may negatively affect
> system performance in other places as well, I'm afraid.

I forgot to mention that I also measured the time spent between printk_safe_enter_irqsave() and printk_safe_exit_irqrestore() in console_emit_next_record(), which had a maximum of 24721 µs. But commenting out these calls didn't fix the problem. This path seems to be used only by printk.

Best regards

>
> Best regards,
> -- Sergey Organov
* Re: Regression: serial: imx: overrun errors on debug UART 2023-04-18 16:16 ` Stefan Wahren @ 2023-05-22 9:25 ` Linux regression tracking (Thorsten Leemhuis) 2023-05-23 15:12 ` Stefan Wahren 2023-05-23 19:44 ` Sergey Organov 0 siblings, 2 replies; 45+ messages in thread From: Linux regression tracking (Thorsten Leemhuis) @ 2023-05-22 9:25 UTC (permalink / raw) To: Stefan Wahren, Sergey Organov Cc: Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM, Linux kernel regressions list Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting for once, to make this easily accessible to everyone. Stefan, was this regression ever solved? It doesn't look like it, but maybe I'm missing something. If it wasn't solved: what needs to be done to get this rolling again? Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat) -- Everything you wanna know about Linux kernel regression tracking: https://linux-regtracking.leemhuis.info/about/#tldr If I did something stupid, please tell me, as explained on that page. #regzbot poke On 18.04.23 18:16, Stefan Wahren wrote: > Hi Sergey, > > Am 17.04.23 um 18:50 schrieb Sergey Organov: >> Hi Stefan, >> >> Stefan Wahren <stefan.wahren@i2se.com> writes: >> >>> Hi Sergey, >>> >> >> [...] >> >>> i had some time today to investigate this a little bit. I thought it >>> would be a good idea to use debugfs as a ugly quick hack: >>> >> >> [...] >> >>> Using this i was able to better compare the behavior with RXTL_DEFAULT >>> 1 (without overrun errors) and RXTL_DEFAULT 8 (with overrun errors) on >>> my i.MX6ULL test platform. 
After doing my usual test scenario (copy
>>> some text lines to console) i got the following results:
>>>
>>> RXTL_DEFAULT 1
>>> 21f0000.serial/stats:total_duration_us: 61
>>> 21f0000.serial/stats:rx_duration_us: 36
>>> 21f0000.serial/stats:tx_duration_us: 48
>>> 21f0000.serial/stats:received: 28
>>> 21f0000.serial/stats:send: 33
>>>
>>> RXTL_DEFAULT 8
>>> 21f0000.serial/stats:total_duration_us: 78
>>> 21f0000.serial/stats:rx_duration_us: 46
>>> 21f0000.serial/stats:tx_duration_us: 47
>>> 21f0000.serial/stats:received: 33
>>> 21f0000.serial/stats:send: 33
>>>
>>> So based on the maximum of received characters on RX interrupt, i
>>> consider the root cause of this issue has already been there because
>>> the amount is near to the maximum of the FIFO (32 chars). So finally
>>> increasing RXTL_DEFAULT makes the situation even worse by adding
>>> enough latency for overrun errors.
>>
>> Yep, looks like an issue.
>>
>> What's the baud rate? 115200? If so, it means that interrupts are
>> apparently blocked in your system for up to about 28/(115200/10)=2.4
>> milliseconds. This is very large number, and it may negatively affect
>> system performance in other places as well, I'm afraid.
>
> i forgot to mention that i also measured the time around
> printk_safe_(enter|exit)_irqsave in console_emit_next_record() which had
> a maximum of 24721 µs. But uncommenting these functions doesn't fixed
> the problem. This seems to be used only by printk.
>
> Best regards
>
>>
>> Best regards,
>> -- Sergey Organov
* Re: Regression: serial: imx: overrun errors on debug UART
  2023-05-22 9:25 ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-05-23 15:12 ` Stefan Wahren
  2023-05-23 19:44 ` Sergey Organov
  1 sibling, 0 replies; 45+ messages in thread
From: Stefan Wahren @ 2023-05-23 15:12 UTC (permalink / raw)
To: Linux regressions mailing list
Cc: Uwe Kleine-König, Sergey Organov, Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM

Hi Thorsten,

On 22.05.23 at 11:25, Linux regression tracking (Thorsten Leemhuis) wrote:
> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> for once, to make this easily accessible to everyone.
>
> Stefan, was this regression ever solved? It doesn't look like it, but
> maybe I'm missing something.
>
> If it wasn't solved: what needs to be done to get this rolling again?

Thanks for the reminder. From a user's point of view this issue hasn't been fixed so far. For our product we just reverted the commit in a downstream repo. As I understand it, the underlying issue was already there, and the optimizing commit by Tomasz just made the situation worse.

Unfortunately my time budget to investigate this issue further is exhausted, so I stopped working on this. In case someone can give clear instructions to investigate this further, I will try to look at it in my spare time. But I cannot make any promises.

Best regards

>
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.
>
> #regzbot poke
>
* Re: Regression: serial: imx: overrun errors on debug UART 2023-05-22 9:25 ` Linux regression tracking (Thorsten Leemhuis) 2023-05-23 15:12 ` Stefan Wahren @ 2023-05-23 19:44 ` Sergey Organov 2023-05-24 10:48 ` Thorsten Leemhuis 2023-05-24 13:07 ` Stefan Wahren 1 sibling, 2 replies; 45+ messages in thread From: Sergey Organov @ 2023-05-23 19:44 UTC (permalink / raw) To: Linux regression tracking (Thorsten Leemhuis) Cc: Stefan Wahren, Linux regressions mailing list, Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM "Linux regression tracking (Thorsten Leemhuis)" <regressions@leemhuis.info> writes: > Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting > for once, to make this easily accessible to everyone. > > Stefan, was this regression ever solved? It doesn't look like it, but > maybe I'm missing something. > > If it wasn't solved: what needs to be done to get this rolling again? Hi Thorsten, Not Stefan, but as far as I can tell, the problem is that on Stefan's build the kernel has rather large periods of interrupts being disabled, so any attempt to decrease IRQs frequency from UART by raising FIFO IRQ threshold causes "regression" that manifests itself as missing characters on receive. I'm not sure if it's tuning FIFO level that is in fact a regression in this case. Solving this would need to identify the cause of interrupts being disabled for prolonged times, and nobody volunteered to investigate this further. One suspect, the Linux serial console, has been likely excluded already though, as not actually being in use for printk() output. -- Sergey Organov _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: Regression: serial: imx: overrun errors on debug UART 2023-05-23 19:44 ` Sergey Organov @ 2023-05-24 10:48 ` Thorsten Leemhuis 2023-05-24 12:41 ` Uwe Kleine-König 2023-05-24 13:45 ` Sergey Organov 2023-05-24 13:07 ` Stefan Wahren 1 sibling, 2 replies; 45+ messages in thread From: Thorsten Leemhuis @ 2023-05-24 10:48 UTC (permalink / raw) To: Sergey Organov Cc: Stefan Wahren, Linux regressions mailing list, Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM On 23.05.23 21:44, Sergey Organov wrote: > "Linux regression tracking (Thorsten Leemhuis)" > <regressions@leemhuis.info> writes: > >> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting >> for once, to make this easily accessible to everyone. >> >> Stefan, was this regression ever solved? It doesn't look like it, but >> maybe I'm missing something. >> >> If it wasn't solved: what needs to be done to get this rolling again? > > Not Stefan, Thx to both you and Stefan for the update. > but as far as I can tell, the problem is that on Stefan's > build the kernel has rather large periods of interrupts being disabled, > so any attempt to decrease IRQs frequency from UART by raising FIFO IRQ > threshold causes "regression" that manifests itself as missing > characters on receive. I'm not sure if it's tuning FIFO level that is in > fact a regression in this case. Not totally sure, but I guess Linus stance in this case would be along the lines of "commit 7a637784d517 made an existing issue worse; either the people involved in it fix it, or we revert that commit[1], as it's causing a regression". At least we *iirc* had situations he handled like that. [1] of course unless a revert would cause regressions for others -- which i guess might be the case here, as that was added in 5.18 already. So let's not bring Linus in. 
> Solving this would need to identify the cause of interrupts being
> disabled for prolonged times, and nobody volunteered to investigate this
> further.

Well, Stefan kind of did offer to do so in his spare time, but asked for "clear instructions to investigate this further". Could you maybe provide those? If not: who could?

> One suspect, the Linux serial console, has been likely excluded
> already though, as not actually being in use for printk() output.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
* Re: Regression: serial: imx: overrun errors on debug UART 2023-05-24 10:48 ` Thorsten Leemhuis @ 2023-05-24 12:41 ` Uwe Kleine-König 2023-05-24 13:45 ` Sergey Organov 1 sibling, 0 replies; 45+ messages in thread From: Uwe Kleine-König @ 2023-05-24 12:41 UTC (permalink / raw) To: Thorsten Leemhuis Cc: Sergey Organov, Stefan Wahren, Linux regressions mailing list, Jiri Slaby, Greg Kroah-Hartman, Stefan Wahren, Sascha Hauer, Shawn Guo, Tomasz Moń, NXP Linux Team, linux-serial, Ilpo Järvinen, Fabio Estevam, Pengutronix Kernel Team, Linux ARM [-- Attachment #1.1: Type: text/plain, Size: 2077 bytes --] On Wed, May 24, 2023 at 12:48:51PM +0200, Thorsten Leemhuis wrote: > On 23.05.23 21:44, Sergey Organov wrote: > > "Linux regression tracking (Thorsten Leemhuis)" > > <regressions@leemhuis.info> writes: > > > >> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting > >> for once, to make this easily accessible to everyone. > >> > >> Stefan, was this regression ever solved? It doesn't look like it, but > >> maybe I'm missing something. > >> > >> If it wasn't solved: what needs to be done to get this rolling again? > > > > Not Stefan, > > Thx to both you and Stefan for the update. > > > but as far as I can tell, the problem is that on Stefan's > > build the kernel has rather large periods of interrupts being disabled, > > so any attempt to decrease IRQs frequency from UART by raising FIFO IRQ > > threshold causes "regression" that manifests itself as missing > > characters on receive. I'm not sure if it's tuning FIFO level that is in > > fact a regression in this case. > > Not totally sure, but I guess Linus stance in this case would be along > the lines of "commit 7a637784d517 made an existing issue worse; either > the people involved in it fix it, or we revert that commit[1], as it's > causing a regression". At least we *iirc* had situations he handled like > that. 
>
> [1] of course unless a revert would cause regressions for others --
> which i guess might be the case here, as that was added in 5.18 already.
> So let's not bring Linus in.

Well, in my eyes this regression is in the same league as: "That patch over there made a driver use some more memory, and on my (memory-limited) machine this makes the difference that triggers an OOM." You could apply this to pretty much any patch that increases the memory footprint / latency / CPU usage.

(TL;DR: I agree with not reverting the patch under discussion, for this reason.)

Best regards
Uwe

-- 
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | https://www.pengutronix.de/ |
* Re: Regression: serial: imx: overrun errors on debug UART 2023-05-24 10:48 ` Thorsten Leemhuis 2023-05-24 12:41 ` Uwe Kleine-König @ 2023-05-24 13:45 ` Sergey Organov 1 sibling, 0 replies; 45+ messages in thread From: Sergey Organov @ 2023-05-24 13:45 UTC (permalink / raw) To: Thorsten Leemhuis Cc: Stefan Wahren, Linux regressions mailing list, Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM Thorsten Leemhuis <regressions@leemhuis.info> writes: > On 23.05.23 21:44, Sergey Organov wrote: >> "Linux regression tracking (Thorsten Leemhuis)" >> <regressions@leemhuis.info> writes: >> >>> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting >>> for once, to make this easily accessible to everyone. >>> >>> Stefan, was this regression ever solved? It doesn't look like it, but >>> maybe I'm missing something. >>> >>> If it wasn't solved: what needs to be done to get this rolling again? >> >> Not Stefan, > > Thx to both you and Stefan for the update. > >> but as far as I can tell, the problem is that on Stefan's >> build the kernel has rather large periods of interrupts being disabled, >> so any attempt to decrease IRQs frequency from UART by raising FIFO IRQ >> threshold causes "regression" that manifests itself as missing >> characters on receive. I'm not sure if it's tuning FIFO level that is in >> fact a regression in this case. > > Not totally sure, but I guess Linus stance in this case would be along > the lines of "commit 7a637784d517 made an existing issue worse; either > the people involved in it fix it, or we revert that commit[1], as it's > causing a regression". At least we *iirc* had situations he handled like > that. From Stefan's investigations it follows that the kernel has interrupts disabled for about 2.5 milliseconds! 
If that's an acceptable value for the Linux kernel, then the commit in question is a regression. If not, and in my opinion that's too high a number, then it's not a regression at all, but rather a manifestation of a problem (bug?) elsewhere.

>
> [1] of course unless a revert would cause regressions for others --
> which i guess might be the case here, as that was added in 5.18 already.
> So let's not bring Linus in.
>
>> Solving this would need to identify the cause of interrupts being
>> disabled for prolonged times, and nobody volunteered to investigate this
>> further.
>
> Well, Stefan kind of did to do so in his spare time, but asked for
> "clear instructions to investigate this further". Could you maybe
> provide those? If not: who could?

There should be somebody who is familiar with methods to isolate the culprit behind abnormal interrupt latencies, but I'm not the one, sorry.

Thanks,
-- Sergey Organov
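To put the 2.5 ms figure into perspective, here is a rough back-of-the-envelope sketch of how long the RX interrupt may be delayed before the receive FIFO overruns. The baud rate and 8N1 framing come from the original report; the FIFO depth of 32 characters and the threshold values 1 and 8 (before and after the commit in question) are assumptions made for illustration and should be checked against the SoC reference manual and the driver source.

```python
# Rough estimate of the tolerable RX interrupt latency on a UART.
# Hardware numbers below are assumptions, not taken from this thread.

BAUD = 115200          # from the original report
BITS_PER_CHAR = 10     # 8N1: 1 start bit + 8 data bits + 1 stop bit
FIFO_DEPTH = 32        # assumed RX FIFO depth in characters


def char_time_us(baud=BAUD, bits_per_char=BITS_PER_CHAR):
    """Time on the wire for one character, in microseconds."""
    return bits_per_char * 1_000_000 / baud


def headroom_ms(threshold, fifo_depth=FIFO_DEPTH):
    """Approximate latency budget once the RX interrupt has fired.

    The interrupt is assumed to fire when `threshold` characters are
    queued, so `fifo_depth - threshold` more characters still fit
    before the FIFO is full.
    """
    return (fifo_depth - threshold) * char_time_us() / 1000


# Assumed threshold values: 1 (old) and 8 (new).
for threshold in (1, 8):
    print(f"threshold {threshold}: ~{headroom_ms(threshold):.2f} ms")
```

Under these assumptions the budget shrinks from roughly 2.69 ms to roughly 2.08 ms, which would place a 2.5 ms interrupts-off period on the safe side before the change and on the overrun side after it. This is only an estimate for continuous input at full line rate.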
* Re: Regression: serial: imx: overrun errors on debug UART
  2023-05-23 19:44 ` Sergey Organov
  2023-05-24 10:48 ` Thorsten Leemhuis
@ 2023-05-24 13:07 ` Stefan Wahren
  2023-06-20 14:47 ` Linux regression tracking (Thorsten Leemhuis)
  1 sibling, 1 reply; 45+ messages in thread
From: Stefan Wahren @ 2023-05-24 13:07 UTC
To: Sergey Organov
Cc: Linux regressions mailing list, Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM, Linux regression tracking (Thorsten Leemhuis)

Hi Sergey,

On 23.05.23 at 21:44, Sergey Organov wrote:
> "Linux regression tracking (Thorsten Leemhuis)"
> <regressions@leemhuis.info> writes:
> ...
>
> Solving this would need to identify the cause of interrupts being
> disabled for prolonged times, and nobody volunteered to investigate this
> further. One suspect, the Linux serial console, has been likely excluded
> already though, as not actually being in use for printk() output.
>

I don't think that we can exclude the serial console as a whole; I never made such an observation. But at least we can exclude kernel logging on the debug UART.

Best regards
* Re: Regression: serial: imx: overrun errors on debug UART 2023-05-24 13:07 ` Stefan Wahren @ 2023-06-20 14:47 ` Linux regression tracking (Thorsten Leemhuis) 2023-06-20 14:59 ` Greg Kroah-Hartman 2023-06-21 6:23 ` Stefan Wahren 0 siblings, 2 replies; 45+ messages in thread From: Linux regression tracking (Thorsten Leemhuis) @ 2023-06-20 14:47 UTC (permalink / raw) To: Stefan Wahren, Sergey Organov Cc: Linux regressions mailing list, Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM On 24.05.23 15:07, Stefan Wahren wrote: > > Am 23.05.23 um 21:44 schrieb Sergey Organov: >> "Linux regression tracking (Thorsten Leemhuis)" >> <regressions@leemhuis.info> writes: >> >> Solving this would need to identify the cause of interrupts being >> disabled for prolonged times, and nobody volunteered to investigate this >> further. One suspect, the Linux serial console, has been likely excluded >> already though, as not actually being in use for printk() output. >> > > I don't think that we can exclude the serial console as a whole, i never > made such a observation. But at least we can exclude kernel logging on > the debug UART. Stefan, just wondering: was this ever addressed upstream? I assume it's not, just wanted to be sure. I'm a bit unsure what to do with this and consider asking Greg for advice, as he applied the patch. On one hand it's *IMHO* clearly a regression (but for the record, some people involved in the discussion claim it's not). OTOH the culprit was applied more than a year ago now, so reverting it might cause more trouble than it's worth at this point, as that could lead to regressions for other users. 
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke
* Re: Regression: serial: imx: overrun errors on debug UART 2023-06-20 14:47 ` Linux regression tracking (Thorsten Leemhuis) @ 2023-06-20 14:59 ` Greg Kroah-Hartman 2023-06-20 15:34 ` Sergey Organov ` (2 more replies) 2023-06-21 6:23 ` Stefan Wahren 1 sibling, 3 replies; 45+ messages in thread From: Greg Kroah-Hartman @ 2023-06-20 14:59 UTC (permalink / raw) To: Linux regressions mailing list Cc: Stefan Wahren, Sergey Organov, Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM On Tue, Jun 20, 2023 at 04:47:10PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote: > On 24.05.23 15:07, Stefan Wahren wrote: > > > > Am 23.05.23 um 21:44 schrieb Sergey Organov: > >> "Linux regression tracking (Thorsten Leemhuis)" > >> <regressions@leemhuis.info> writes: > >> > >> Solving this would need to identify the cause of interrupts being > >> disabled for prolonged times, and nobody volunteered to investigate this > >> further. One suspect, the Linux serial console, has been likely excluded > >> already though, as not actually being in use for printk() output. > >> > > > > I don't think that we can exclude the serial console as a whole, i never > > made such a observation. But at least we can exclude kernel logging on > > the debug UART. > > Stefan, just wondering: was this ever addressed upstream? I assume it's > not, just wanted to be sure. > > I'm a bit unsure what to do with this and consider asking Greg for > advice, as he applied the patch. On one hand it's *IMHO* clearly a > regression (but for the record, some people involved in the discussion > claim it's not). OTOH the culprit was applied more than a year ago now, > so reverting it might cause more trouble than it's worth at this point, > as that could lead to regressions for other users. 
I'll be glad to revert this, but for some reason I thought that someone was working on a "real fix" here. Stefan, is that not the case?

thanks,

greg k-h
* Re: Regression: serial: imx: overrun errors on debug UART 2023-06-20 14:59 ` Greg Kroah-Hartman @ 2023-06-20 15:34 ` Sergey Organov 2023-06-20 16:30 ` Stefan Wahren 2023-06-20 19:27 ` Uwe Kleine-König 2 siblings, 0 replies; 45+ messages in thread From: Sergey Organov @ 2023-06-20 15:34 UTC (permalink / raw) To: Greg Kroah-Hartman Cc: Linux regressions mailing list, Stefan Wahren, Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM Greg Kroah-Hartman <gregkh@linuxfoundation.org> writes: > On Tue, Jun 20, 2023 at 04:47:10PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote: >> On 24.05.23 15:07, Stefan Wahren wrote: >> > >> > Am 23.05.23 um 21:44 schrieb Sergey Organov: >> >> "Linux regression tracking (Thorsten Leemhuis)" >> >> <regressions@leemhuis.info> writes: >> >> >> >> Solving this would need to identify the cause of interrupts being >> >> disabled for prolonged times, and nobody volunteered to investigate this >> >> further. One suspect, the Linux serial console, has been likely excluded >> >> already though, as not actually being in use for printk() output. >> >> >> > >> > I don't think that we can exclude the serial console as a whole, i never >> > made such a observation. But at least we can exclude kernel logging on >> > the debug UART. >> >> Stefan, just wondering: was this ever addressed upstream? I assume it's >> not, just wanted to be sure. >> >> I'm a bit unsure what to do with this and consider asking Greg for >> advice, as he applied the patch. On one hand it's *IMHO* clearly a >> regression (but for the record, some people involved in the discussion >> claim it's not). OTOH the culprit was applied more than a year ago now, >> so reverting it might cause more trouble than it's worth at this point, >> as that could lead to regressions for other users. 
>
> I'll be glad to revert this, but for some reason I thought that someone
> was working on a "real fix" here. Stefan, is that not the case?

As far as I understand, the "real fix" belongs wherever interrupts are being disabled for prolonged times in this specific kernel build, and nobody is looking for that place.

In other words, I'm one of those who think the commit in question is not a regression per se, so I'm not sure it should be reverted.

Thanks,
Sergey Organov
* Re: Regression: serial: imx: overrun errors on debug UART 2023-06-20 14:59 ` Greg Kroah-Hartman 2023-06-20 15:34 ` Sergey Organov @ 2023-06-20 16:30 ` Stefan Wahren 2023-06-20 16:40 ` Lucas Stach 2023-06-20 19:27 ` Uwe Kleine-König 2 siblings, 1 reply; 45+ messages in thread From: Stefan Wahren @ 2023-06-20 16:30 UTC (permalink / raw) To: Greg Kroah-Hartman, Linux regressions mailing list Cc: Sergey Organov, Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM Hi Greg, Am 20.06.23 um 16:59 schrieb Greg Kroah-Hartman: > On Tue, Jun 20, 2023 at 04:47:10PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote: >> On 24.05.23 15:07, Stefan Wahren wrote: >>> >>> Am 23.05.23 um 21:44 schrieb Sergey Organov: >>>> "Linux regression tracking (Thorsten Leemhuis)" >>>> <regressions@leemhuis.info> writes: >>>> >>>> Solving this would need to identify the cause of interrupts being >>>> disabled for prolonged times, and nobody volunteered to investigate this >>>> further. One suspect, the Linux serial console, has been likely excluded >>>> already though, as not actually being in use for printk() output. >>>> >>> >>> I don't think that we can exclude the serial console as a whole, i never >>> made such a observation. But at least we can exclude kernel logging on >>> the debug UART. >> >> Stefan, just wondering: was this ever addressed upstream? I assume it's >> not, just wanted to be sure. >> >> I'm a bit unsure what to do with this and consider asking Greg for >> advice, as he applied the patch. On one hand it's *IMHO* clearly a >> regression (but for the record, some people involved in the discussion >> claim it's not). OTOH the culprit was applied more than a year ago now, >> so reverting it might cause more trouble than it's worth at this point, >> as that could lead to regressions for other users. 
>
> I'll be glad to revert this, but for some reason I thought that someone
> was working on a "real fix" here. Stefan, is that not the case?

I can only repeat my statements from 23 May:

Unfortunately my time budget to investigate this issue further is exhausted, so I stopped working on this.

In case someone can give clear instructions to investigate this further, I will try to look at it in my spare time. But I cannot make any promises.

I'm not aware that someone else is working on this.

Best regards

>
> thanks,
>
> greg k-h
* Re: Regression: serial: imx: overrun errors on debug UART 2023-06-20 16:30 ` Stefan Wahren @ 2023-06-20 16:40 ` Lucas Stach 2023-06-20 16:55 ` Stefan Wahren 0 siblings, 1 reply; 45+ messages in thread From: Lucas Stach @ 2023-06-20 16:40 UTC (permalink / raw) To: Stefan Wahren, Greg Kroah-Hartman, Linux regressions mailing list Cc: Pengutronix Kernel Team, Jiri Slaby, Stefan Wahren, Sascha Hauer, Shawn Guo, Sergey Organov, NXP Linux Team, linux-serial, Uwe Kleine-König, Ilpo Järvinen, Fabio Estevam, Tomasz Moń, Linux ARM Am Dienstag, dem 20.06.2023 um 18:30 +0200 schrieb Stefan Wahren: > Hi Greg, > > Am 20.06.23 um 16:59 schrieb Greg Kroah-Hartman: > > On Tue, Jun 20, 2023 at 04:47:10PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote: > > > On 24.05.23 15:07, Stefan Wahren wrote: > > > > > > > > Am 23.05.23 um 21:44 schrieb Sergey Organov: > > > > > "Linux regression tracking (Thorsten Leemhuis)" > > > > > <regressions@leemhuis.info> writes: > > > > > > > > > > Solving this would need to identify the cause of interrupts being > > > > > disabled for prolonged times, and nobody volunteered to investigate this > > > > > further. One suspect, the Linux serial console, has been likely excluded > > > > > already though, as not actually being in use for printk() output. > > > > > > > > > > > > > I don't think that we can exclude the serial console as a whole, i never > > > > made such a observation. But at least we can exclude kernel logging on > > > > the debug UART. > > > > > > Stefan, just wondering: was this ever addressed upstream? I assume it's > > > not, just wanted to be sure. > > > > > > I'm a bit unsure what to do with this and consider asking Greg for > > > advice, as he applied the patch. On one hand it's *IMHO* clearly a > > > regression (but for the record, some people involved in the discussion > > > claim it's not). 
OTOH the culprit was applied more than a year ago now,
> > > so reverting it might cause more trouble than it's worth at this point,
> > > as that could lead to regressions for other users.
> >
> > I'll be glad to revert this, but for some reason I thought that someone
> > was working on a "real fix" here. Stefan, is that not the case?
>
> i can only repeat the statements from 23.5.:
>
> Unfortunately my time budget to investigate this issue further is
> exhausted, so i stopped working at this.
>
> In case someone can give clear instructions to investigate this further,
> i will try to look at it in my spare time. But i cannot make any promises.
>

If the cause is simply interrupts not being serviced for a long period of time, the irqsoff tracer is usually a very good start to investigate the issue. It might point to a smoking gun already.

Regards,
Lucas

> I'm not aware that some else is working on this.
>
> Best regards
>
> >
> > thanks,
> >
> > greg k-h
>
* Re: Regression: serial: imx: overrun errors on debug UART 2023-06-20 16:40 ` Lucas Stach @ 2023-06-20 16:55 ` Stefan Wahren 0 siblings, 0 replies; 45+ messages in thread From: Stefan Wahren @ 2023-06-20 16:55 UTC (permalink / raw) To: Lucas Stach, Greg Kroah-Hartman, Linux regressions mailing list Cc: Pengutronix Kernel Team, Jiri Slaby, Stefan Wahren, Sascha Hauer, Shawn Guo, Sergey Organov, NXP Linux Team, linux-serial, Uwe Kleine-König, Ilpo Järvinen, Fabio Estevam, Tomasz Moń, Linux ARM Hi Lucas, Am 20.06.23 um 18:40 schrieb Lucas Stach: > Am Dienstag, dem 20.06.2023 um 18:30 +0200 schrieb Stefan Wahren: >> Hi Greg, >> >> Am 20.06.23 um 16:59 schrieb Greg Kroah-Hartman: >>> On Tue, Jun 20, 2023 at 04:47:10PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote: >>>> On 24.05.23 15:07, Stefan Wahren wrote: >>>>> >>>>> Am 23.05.23 um 21:44 schrieb Sergey Organov: >>>>>> "Linux regression tracking (Thorsten Leemhuis)" >>>>>> <regressions@leemhuis.info> writes: >>>>>> >>>>>> Solving this would need to identify the cause of interrupts being >>>>>> disabled for prolonged times, and nobody volunteered to investigate this >>>>>> further. One suspect, the Linux serial console, has been likely excluded >>>>>> already though, as not actually being in use for printk() output. >>>>>> >>>>> >>>>> I don't think that we can exclude the serial console as a whole, i never >>>>> made such a observation. But at least we can exclude kernel logging on >>>>> the debug UART. >>>> >>>> Stefan, just wondering: was this ever addressed upstream? I assume it's >>>> not, just wanted to be sure. >>>> >>>> I'm a bit unsure what to do with this and consider asking Greg for >>>> advice, as he applied the patch. On one hand it's *IMHO* clearly a >>>> regression (but for the record, some people involved in the discussion >>>> claim it's not). 
OTOH the culprit was applied more than a year ago now,
>>>> so reverting it might cause more trouble than it's worth at this point,
>>>> as that could lead to regressions for other users.
>>>
>>> I'll be glad to revert this, but for some reason I thought that someone
>>> was working on a "real fix" here. Stefan, is that not the case?
>>
>> i can only repeat the statements from 23.5.:
>>
>> Unfortunately my time budget to investigate this issue further is
>> exhausted, so i stopped working at this.
>>
>> In case someone can give clear instructions to investigate this further,
>> i will try to look at it in my spare time. But i cannot make any promises.
>>
> If the cause is simply interrupts not being serviced for a long period
> of time, the irqsoff tracer is usually a very good start to investigate
> the issue. It might point to a smoking gun already.

Thanks for the hint, I can try that. AFAIR there was a kernel comment which pointed out that console I/O (or at least parts of it) is excluded from the irqsoff tracer?

>
> Regards,
> Lucas
>
>> I'm not aware that some else is working on this.
>>
>> Best regards
>>
>>>
>>> thanks,
>>>
>>> greg k-h
>>
>
* Re: Regression: serial: imx: overrun errors on debug UART 2023-06-20 14:59 ` Greg Kroah-Hartman 2023-06-20 15:34 ` Sergey Organov 2023-06-20 16:30 ` Stefan Wahren @ 2023-06-20 19:27 ` Uwe Kleine-König 2023-06-21 8:43 ` Greg Kroah-Hartman 2 siblings, 1 reply; 45+ messages in thread From: Uwe Kleine-König @ 2023-06-20 19:27 UTC (permalink / raw) To: Greg Kroah-Hartman Cc: Linux regressions mailing list, Stefan Wahren, Pengutronix Kernel Team, Jiri Slaby, Stefan Wahren, Sascha Hauer, Shawn Guo, Sergey Organov, NXP Linux Team, linux-serial, Ilpo Järvinen, Fabio Estevam, Tomasz Moń, Linux ARM [-- Attachment #1.1: Type: text/plain, Size: 2433 bytes --] Hello Greg, On Tue, Jun 20, 2023 at 04:59:18PM +0200, Greg Kroah-Hartman wrote: > On Tue, Jun 20, 2023 at 04:47:10PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote: > > On 24.05.23 15:07, Stefan Wahren wrote: > > > > > > Am 23.05.23 um 21:44 schrieb Sergey Organov: > > >> "Linux regression tracking (Thorsten Leemhuis)" > > >> <regressions@leemhuis.info> writes: > > >> > > >> Solving this would need to identify the cause of interrupts being > > >> disabled for prolonged times, and nobody volunteered to investigate this > > >> further. One suspect, the Linux serial console, has been likely excluded > > >> already though, as not actually being in use for printk() output. > > >> > > > > > > I don't think that we can exclude the serial console as a whole, i never > > > made such a observation. But at least we can exclude kernel logging on > > > the debug UART. > > > > Stefan, just wondering: was this ever addressed upstream? I assume it's > > not, just wanted to be sure. > > > > I'm a bit unsure what to do with this and consider asking Greg for > > advice, as he applied the patch. On one hand it's *IMHO* clearly a > > regression (but for the record, some people involved in the discussion > > claim it's not). 
OTOH the culprit was applied more than a year ago now,
> > so reverting it might cause more trouble than it's worth at this point,
> > as that could lead to regressions for other users.
>
> I'll be glad to revert this, but for some reason I thought that someone
> was working on a "real fix" here. Stefan, is that not the case?

Sergey Organov already said something similar, but not very explicitly: with the current understanding, reverting said commit is wrong. It is expected that the commit increases IRQ latency for imx-serial a bit for the benefit of fewer interrupts, and so serves the overall system performance. That this poses a problem only means that on the reporter's machine there is already an issue that results in a longer period with disabled IRQs. While reverting the imx-serial commit would (maybe) solve that, the actual problem is the other issue that disables preemption for a longer timespan.

So TL;DR: Please don't revert the imx-serial patch.

Best regards
Uwe

--
Pengutronix e.K.           | Uwe Kleine-König            |
Industrial Linux Solutions | https://www.pengutronix.de/ |
* Re: Regression: serial: imx: overrun errors on debug UART 2023-06-20 19:27 ` Uwe Kleine-König @ 2023-06-21 8:43 ` Greg Kroah-Hartman 0 siblings, 0 replies; 45+ messages in thread From: Greg Kroah-Hartman @ 2023-06-21 8:43 UTC (permalink / raw) To: Uwe Kleine-König Cc: Linux regressions mailing list, Stefan Wahren, Pengutronix Kernel Team, Jiri Slaby, Stefan Wahren, Sascha Hauer, Shawn Guo, Sergey Organov, NXP Linux Team, linux-serial, Ilpo Järvinen, Fabio Estevam, Tomasz Moń, Linux ARM On Tue, Jun 20, 2023 at 09:27:48PM +0200, Uwe Kleine-König wrote: > Hello Greg, > > On Tue, Jun 20, 2023 at 04:59:18PM +0200, Greg Kroah-Hartman wrote: > > On Tue, Jun 20, 2023 at 04:47:10PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote: > > > On 24.05.23 15:07, Stefan Wahren wrote: > > > > > > > > Am 23.05.23 um 21:44 schrieb Sergey Organov: > > > >> "Linux regression tracking (Thorsten Leemhuis)" > > > >> <regressions@leemhuis.info> writes: > > > >> > > > >> Solving this would need to identify the cause of interrupts being > > > >> disabled for prolonged times, and nobody volunteered to investigate this > > > >> further. One suspect, the Linux serial console, has been likely excluded > > > >> already though, as not actually being in use for printk() output. > > > >> > > > > > > > > I don't think that we can exclude the serial console as a whole, i never > > > > made such a observation. But at least we can exclude kernel logging on > > > > the debug UART. > > > > > > Stefan, just wondering: was this ever addressed upstream? I assume it's > > > not, just wanted to be sure. > > > > > > I'm a bit unsure what to do with this and consider asking Greg for > > > advice, as he applied the patch. On one hand it's *IMHO* clearly a > > > regression (but for the record, some people involved in the discussion > > > claim it's not). 
OTOH the culprit was applied more than a year ago now,
> > > so reverting it might cause more trouble than it's worth at this point,
> > > as that could lead to regressions for other users.
> >
> > I'll be glad to revert this, but for some reason I thought that someone
> > was working on a "real fix" here. Stefan, is that not the case?
>
> Sergey Organov already said something similar, but not very explicit:
> With the current understanding reverting said commit is wrong. It is
> expected that the commit increases irq latency for imx-serial a bit for
> the benefit of less interrupts and so serves the overall system
> performance. That this poses a problem only means that on the reporter's
> machine there is already an issue that results in a longer period with
> disabled irqs. While reverting the imx-serial commit would (maybe) solve
> that, the actual problem is the other issue that disables preemption for
> a longer timespan.
>
> So TL;DR: Please don't revert the imx-serial patch.

Ok, will leave this alone, it shouldn't be marked as a regression.

greg k-h
* Re: Regression: serial: imx: overrun errors on debug UART 2023-06-20 14:47 ` Linux regression tracking (Thorsten Leemhuis) 2023-06-20 14:59 ` Greg Kroah-Hartman @ 2023-06-21 6:23 ` Stefan Wahren 2023-06-21 13:42 ` Linux regression tracking (Thorsten Leemhuis) 1 sibling, 1 reply; 45+ messages in thread From: Stefan Wahren @ 2023-06-21 6:23 UTC (permalink / raw) To: Linux regressions mailing list, Sergey Organov Cc: Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM Hi Thorsten, Am 20.06.23 um 16:47 schrieb Linux regression tracking (Thorsten Leemhuis): > On 24.05.23 15:07, Stefan Wahren wrote: >> >> Am 23.05.23 um 21:44 schrieb Sergey Organov: >>> "Linux regression tracking (Thorsten Leemhuis)" >>> <regressions@leemhuis.info> writes: >>> >>> Solving this would need to identify the cause of interrupts being >>> disabled for prolonged times, and nobody volunteered to investigate this >>> further. One suspect, the Linux serial console, has been likely excluded >>> already though, as not actually being in use for printk() output. >>> >> >> I don't think that we can exclude the serial console as a whole, i never >> made such a observation. But at least we can exclude kernel logging on >> the debug UART. > > Stefan, just wondering: was this ever addressed upstream? I assume it's > not, just wanted to be sure. > > I'm a bit unsure what to do with this and consider asking Greg for > advice, as he applied the patch. On one hand it's *IMHO* clearly a > regression (but for the record, some people involved in the discussion > claim it's not). OTOH the culprit was applied more than a year ago now, > so reverting it might cause more trouble than it's worth at this point, > as that could lead to regressions for other users. thanks for tracking this issue, but in my opinion the discussion goes in circles. 
So I don't see a point in reviving this again. Articles like [1] suggest to me that this is a general issue.

Best regards

[1] - https://www.phoronix.com/news/Printk-Threaded-Atomic-v1

>
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.
>
> #regzbot poke
* Re: Regression: serial: imx: overrun errors on debug UART 2023-06-21 6:23 ` Stefan Wahren @ 2023-06-21 13:42 ` Linux regression tracking (Thorsten Leemhuis) 0 siblings, 0 replies; 45+ messages in thread From: Linux regression tracking (Thorsten Leemhuis) @ 2023-06-21 13:42 UTC (permalink / raw) To: Stefan Wahren, Linux regressions mailing list, Sergey Organov Cc: Uwe Kleine-König, Fabio Estevam, Ilpo Järvinen, Stefan Wahren, linux-serial, Greg Kroah-Hartman, Sascha Hauer, NXP Linux Team, Pengutronix Kernel Team, Shawn Guo, Jiri Slaby, Tomasz Moń, Linux ARM On 21.06.23 08:23, Stefan Wahren wrote: > Am 20.06.23 um 16:47 schrieb Linux regression tracking (Thorsten Leemhuis): >> On 24.05.23 15:07, Stefan Wahren wrote: >>> >>> Am 23.05.23 um 21:44 schrieb Sergey Organov: >>>> "Linux regression tracking (Thorsten Leemhuis)" >>>> <regressions@leemhuis.info> writes: >>>> >>>> Solving this would need to identify the cause of interrupts being >>>> disabled for prolonged times, and nobody volunteered to investigate >>>> this >>>> further. One suspect, the Linux serial console, has been likely >>>> excluded >>>> already though, as not actually being in use for printk() output. >>> >>> I don't think that we can exclude the serial console as a whole, i never >>> made such a observation. But at least we can exclude kernel logging on >>> the debug UART. >> >> Stefan, just wondering: was this ever addressed upstream? I assume it's >> not, just wanted to be sure. >> >> I'm a bit unsure what to do with this and consider asking Greg for >> advice, as he applied the patch. On one hand it's *IMHO* clearly a >> regression (but for the record, some people involved in the discussion >> claim it's not). OTOH the culprit was applied more than a year ago now, >> so reverting it might cause more trouble than it's worth at this point, >> as that could lead to regressions for other users. > > thanks for tracking this issue, but in my opinion the discussion goes in > circles. 
So i don't see a point in reanimating this again.
[...]

Yup. Sadly. I don't think Linus would agree with the "this is not a regression" claim from various people here. But well, due to your statement, Greg's mail from earlier today, and the fact that reverting the culprit at this point might lead to regressions that would hit more people, I agree:

#regzbot inconclusive: unresolved, but stuck, and revert likely a bad option - and reporter is fine with not pursuing this further

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
end of thread (newest message: ~2023-06-21 13:43 UTC)

Thread overview: 45+ messages
2023-03-24  8:57 Regression: serial: imx: overrun errors on debug UART Stefan Wahren
2023-03-24 10:12 ` Linux regression tracking #adding (Thorsten Leemhuis)
2023-03-24 11:47 ` Ilpo Järvinen
2023-03-24 12:26 ` Francesco Dolcini
2023-03-24 12:35 ` Ilpo Järvinen
2023-03-24 12:49 ` Stefan Wahren
2023-03-24 13:06 ` Francesco Dolcini
2023-03-24 12:57 ` Fabio Estevam
2023-03-24 13:37 ` Uwe Kleine-König
2023-03-24 14:19 ` Stefan Wahren
2023-03-24 14:39 ` Uwe Kleine-König
2023-03-24 21:57 ` Sergey Organov
2023-03-24 15:00 ` Stefan Wahren
2023-03-25 11:31 ` Stefan Wahren
2023-03-25 12:23 ` Fabio Estevam
2023-03-25 15:11 ` Uwe Kleine-König
2023-03-25 17:05 ` Stefan Wahren
2023-03-25 19:00 ` Sergey Organov
2023-03-26 18:21 ` Francesco Dolcini
2023-03-27  8:07 ` Tomasz Moń
2023-03-25 18:30 ` Sergey Organov
2023-03-27 14:42 ` Stefan Wahren
2023-03-27 15:11 ` Sergey Organov
2023-03-27 15:30 ` Russell King (Oracle)
2023-04-16 13:43 ` Stefan Wahren
2023-04-17 16:50 ` Sergey Organov
2023-04-17 18:40 ` Stefan Wahren
2023-04-18 16:16 ` Stefan Wahren
2023-05-22  9:25 ` Linux regression tracking (Thorsten Leemhuis)
2023-05-23 15:12 ` Stefan Wahren
2023-05-23 19:44 ` Sergey Organov
2023-05-24 10:48 ` Thorsten Leemhuis
2023-05-24 12:41 ` Uwe Kleine-König
2023-05-24 13:45 ` Sergey Organov
2023-05-24 13:07 ` Stefan Wahren
2023-06-20 14:47 ` Linux regression tracking (Thorsten Leemhuis)
2023-06-20 14:59 ` Greg Kroah-Hartman
2023-06-20 15:34 ` Sergey Organov
2023-06-20 16:30 ` Stefan Wahren
2023-06-20 16:40 ` Lucas Stach
2023-06-20 16:55 ` Stefan Wahren
2023-06-20 19:27 ` Uwe Kleine-König
2023-06-21  8:43 ` Greg Kroah-Hartman
2023-06-21  6:23 ` Stefan Wahren
2023-06-21 13:42 ` Linux regression tracking (Thorsten Leemhuis)