linux-serial.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC] serial: 8250: Handle invalid UART state
@ 2015-08-19 20:12 california.l.sullivan
  2015-08-24 13:31 ` Peter Hurley
  0 siblings, 1 reply; 2+ messages in thread
From: california.l.sullivan @ 2015-08-19 20:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-serial, alan, darren.hart, gregkh, jslaby,
	california.l.sullivan, heikki.krogerus, andriy.shevchenko, arjan

From: California Sullivan <california.l.sullivan@intel.com>

The debug UART on the Bay Trail is buggy and will sometimes enter a
state where there is a Character Timeout interrupt, but the Data
Ready bit in the Line Status Register is not set, despite there
being data available in the FIFO. It stays in this state until the
Receive Buffer is read. Because the 8250 driver does not read the
Receive Buffer without the Data Ready bit being set, the UART is
never read once we reach this point, and thus the Character Timeout
interrupt is never cleared. This makes the driver spin in a loop
causing multiple "too much work for irq" errors and eventually
locking up entirely.

This patch handles the invalid state by setting the Data Ready bit
in the 'status' variable when the invalid state occurs. This allows
the receive buffer to be read, which clears the interrupt
identification register in the UART and brings us back to a correct
state.

After this patch was submitted for internal comment, a similar bug on
the HSUARTs of the Bay Trail and Braswell platforms was pointed out.
On those UARTs, the invalid state mentioned previously is reached for
some amount of time, cauing the same "too much work for irq" messages,
but not permanently locking up the UART. This patch has been confirmed
to solve that issue as well.

We considered solving this by adding a new UART_BUG_ define and
detecting the buggy UART by either identifying the processor or by
adding a new PNP device ID to firmware. However, this solution
would be more complex and have zero performance benefits, as the
ISR would require a similar 'if' condition to detect and handle the
bug.

Our main concern with this fix is whether it having a Character
Timeout with no Data Ready is an invalid state for all UARTs or
just some. If other UARTs have a Character Timeout without
immediately flipping the Data Ready bit in the Line Status Register
for a good reason, setting the Data Ready bit in the 'status'
variable could have unintended consequences.

Signed-off-by: California Sullivan <california.l.sullivan@intel.com>
---
 drivers/tty/serial/8250/8250_core.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c
index ea1a8da..078b118 100644
--- a/drivers/tty/serial/8250/8250_core.c
+++ b/drivers/tty/serial/8250/8250_core.c
@@ -1604,6 +1604,14 @@ int serial8250_handle_irq(struct uart_port *port, unsigned int iir)
 
 	DEBUG_INTR("status = %x...", status);
 
+	/*
+	 * Workaround for when there is a character timeout interrupt
+	 * but the data ready bit is not set in the Line Status Register.
+	 */
+	if ((iir & UART_IIR_RX_TIMEOUT) &&
+	    !(status & (UART_LSR_DR | UART_LSR_BI)))
+		status |= UART_LSR_DR;
+
 	if (status & (UART_LSR_DR | UART_LSR_BI)) {
 		if (up->dma)
 			dma_err = up->dma->rx_dma(up, iir);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH RFC] serial: 8250: Handle invalid UART state
  2015-08-19 20:12 [PATCH RFC] serial: 8250: Handle invalid UART state california.l.sullivan
@ 2015-08-24 13:31 ` Peter Hurley
  0 siblings, 0 replies; 2+ messages in thread
From: Peter Hurley @ 2015-08-24 13:31 UTC (permalink / raw)
  To: california.l.sullivan, linux-kernel
  Cc: linux-serial, alan, darren.hart, gregkh, jslaby, heikki.krogerus,
	andriy.shevchenko, arjan

On 08/19/2015 04:12 PM, california.l.sullivan@intel.com wrote:
> From: California Sullivan <california.l.sullivan@intel.com>
> 
> The debug UART on the Bay Trail is buggy and will sometimes enter a
> state where there is a Character Timeout interrupt, but the Data
> Ready bit in the Line Status Register is not set, despite there
> being data available in the FIFO. It stays in this state until the
> Receive Buffer is read. Because the 8250 driver does not read the
> Receive Buffer without the Data Ready bit being set, the UART is
> never read once we reach this point, and thus the Character Timeout
> interrupt is never cleared. This makes the driver spin in a loop
> causing multiple "too much work for irq" errors and eventually
> locking up entirely.

LSR doesn't reflect the actual state of the fifo?
So kgdb (CONSOLE_POLL) won't work on these platforms either, right?

Does this happen with only a single byte in the fifo or is it any
number of bytes under the fifo trigger?


> This patch handles the invalid state by setting the Data Ready bit
> in the 'status' variable when the invalid state occurs. This allows
> the receive buffer to be read, which clears the interrupt
> identification register in the UART and brings us back to a correct
> state.
> 
> After this patch was submitted for internal comment, a similar bug on
> the HSUARTs of the Bay Trail and Braswell platforms was pointed out.
> On those UARTs, the invalid state mentioned previously is reached for
> some amount of time, cauing the same "too much work for irq" messages,
> but not permanently locking up the UART. This patch has been confirmed
> to solve that issue as well.
> 
> We considered solving this by adding a new UART_BUG_ define and
> detecting the buggy UART by either identifying the processor or by
> adding a new PNP device ID to firmware. However, this solution
> would be more complex and have zero performance benefits, as the
> ISR would require a similar 'if' condition to detect and handle the
> bug.
> 
> Our main concern with this fix is whether it having a Character
> Timeout with no Data Ready is an invalid state for all UARTs or
> just some. If other UARTs have a Character Timeout without
> immediately flipping the Data Ready bit in the Line Status Register
> for a good reason, setting the Data Ready bit in the 'status'
> variable could have unintended consequences.

I think there is every likelihood of spurious RX timeout interrupts
tripping this patch, sorry.

Unfortunately, I think UART_BUG_ is the only viable possibility.
Or perhaps fixing the port type as PORT_8250 (thus disabling the fifos).

Regards,
Peter Hurley


> Signed-off-by: California Sullivan <california.l.sullivan@intel.com>
> ---
>  drivers/tty/serial/8250/8250_core.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c
> index ea1a8da..078b118 100644
> --- a/drivers/tty/serial/8250/8250_core.c
> +++ b/drivers/tty/serial/8250/8250_core.c
> @@ -1604,6 +1604,14 @@ int serial8250_handle_irq(struct uart_port *port, unsigned int iir)
>  
>  	DEBUG_INTR("status = %x...", status);
>  
> +	/*
> +	 * Workaround for when there is a character timeout interrupt
> +	 * but the data ready bit is not set in the Line Status Register.
> +	 */
> +	if ((iir & UART_IIR_RX_TIMEOUT) &&
> +	    !(status & (UART_LSR_DR | UART_LSR_BI)))
> +		status |= UART_LSR_DR;
> +
>  	if (status & (UART_LSR_DR | UART_LSR_BI)) {
>  		if (up->dma)
>  			dma_err = up->dma->rx_dma(up, iir);
> 

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2015-08-24 13:31 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-08-19 20:12 [PATCH RFC] serial: 8250: Handle invalid UART state california.l.sullivan
2015-08-24 13:31 ` Peter Hurley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).