All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] IRQ handling race and spurious IIR read in serial/8250.c
@ 2009-03-12 17:57 ` Markus Armbruster
  0 siblings, 0 replies; 10+ messages in thread
From: Markus Armbruster @ 2009-03-12 17:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-serial, xen-devel, Ian Jackson, Anders Kaseorg,
	Jeremy Fitzhardinge, Andrew Morton

From: Ian Jackson <ian.jackson@eu.citrix.com>

Do not read IIR in serial8250_start_tx when UART_BUG_TXEN

Reading the IIR clears some oustanding interrupts so it is not safe.
Instead, simply transmit immediately if the buffer is empty without
regard to IIR.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>

---
Ian Jackson recently debugged a problem reported by Anders Kaseorg, and
posted a fix (see http://lkml.org/lkml/2009/2/11/240).  As far as I can
see, the patch has been dropped on the floor.

I also experienced the problem, and Ian's patch fixes it for me.

The bug bites Xen HVM guests.  Output to the serial console stalls until
some input happens.  As Ian's analysis quoted below shows, it is a race
condition that could theoretically bite elsewhere as well.

Ian Jackson explained:
> The bugs in detail (this discussion applies to 2.6.20 and also to
> 2.6.28.4):
>
>  1. The hunk of serial8250_startup I quote below attempts to discover
>     whether writing the IER re-asserts the THRI (transmit ready)
>     interrupt.  However the spinlock that it has taken out,
>     port->lock, is not the one that the IRQ service routine takes
>     before reading the IIR (i->lock).  As a result, on an SMP system
>     the generated interrupt races with the straight-line code in
>     serial8250_startup.
>
>     If serial8250_startup loses the race (perhaps because the system
>     is a VM and its VCPU got preempted), UART_BUG_TXEN is spuriously
>     added to bugs.  This is quite unlikely in a normal system but in
>     certain Xen configurations, particularly ones where there is CPU
>     pressure, we may lose the race every time.
>
>     It is not exactly clear to me how this ought to be resolved.  One
>     possibility is that the UART_BUG_TXEN problem might be worked
>     around perfectly well by the new and very similar workaround
>     UART_BUG_THRE[1] in 2.6.21ish in which case it could just be
>     removed.
>      2. UART_BUG_TXEN's workaround appears to be intended to be
> harmless.
>     However what it actually does is to read the IIR, thus clearing
>     any actual interrupt (including incidentally non-THRI), and then
>     only perform the intended servicing if the interrupt was _not_
>     asserted.  That is, it breaks on any serial port with the bug.
>
>     As far as I can see there is not much use in UART_BUG_TXEN reading
>     IIR at all, so a suitable change if we want to keep UART_BUG_TXEN
>     might be the first patch I enclose below (again, not compiled
>     or tested).
>
>     If UART_BUG_TXEN is retained something along these lines should be
>     done at the very least.
>
> Ian.
>
> [1] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=40b36daad0ac704e6d5c1b75789f371ef5b053c1
>     in which case UART

--- ../linux-2.6.28.4/drivers/serial/8250.c~	2009-02-06 21:47:45.000000000 +0000
+++ ../linux-2.6.28.4/drivers/serial/8250.c	2009-02-11 15:55:24.000000000 +0000
@@ -1257,14 +1257,12 @@
 		serial_out(up, UART_IER, up->ier);
 
 		if (up->bugs & UART_BUG_TXEN) {
-			unsigned char lsr, iir;
+			unsigned char lsr;
 			lsr = serial_in(up, UART_LSR);
 			up->lsr_saved_flags |= lsr & LSR_SAVE_FLAGS;
-			iir = serial_in(up, UART_IIR) & 0x0f;
 			if ((up->port.type == PORT_RM9000) ?
-				(lsr & UART_LSR_THRE &&
-				(iir == UART_IIR_NO_INT || iir == UART_IIR_THRI)) :
-				(lsr & UART_LSR_TEMT && iir & UART_IIR_NO_INT))
+				(lsr & UART_LSR_THRE) :
+				(lsr & UART_LSR_TEMT))
 				transmit_chars(up);
 		}
 	}



^ permalink raw reply	[flat|nested] 10+ messages in thread
* Serial console hangs with Linux 2.6.20 HVM guest
@ 2009-02-05  2:23 Anders Kaseorg
  2009-02-05 17:04 ` Ian Jackson
  0 siblings, 1 reply; 10+ messages in thread
From: Anders Kaseorg @ 2009-02-05  2:23 UTC (permalink / raw)
  To: xen-devel

I am seeing a problem with the Xen emulated serial console.  When
running a Linux 2.6.20 HVM guest that has CONFIG_HOTPLUG_CPU=n, the
guest blocks on output to the console until it receives input keypresses
from `xm console`.  This prevents the guest from booting up without
banging on some keys, and makes interactive use of the console
difficult.

By bisecting Linux kernel commits, I found that the bug goes away in
commit 40b36daad0ac704e6d5c1b75789f371ef5b053c1 (v2.6.21-rc1~261), which
is a workaround for buggy UARTs on certain HP machines.

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=40b36daad0ac704e6d5c1b75789f371ef5b053c1

“The patch below works around a minor bug found in the UART of the
remote management card used in many HP ia64 and parisc servers (aka the
Diva UARTs).  The problem is that the UART does not reassert the THRE
interrupt if it has been previously cleared and the IIR THRI bit is
re-enabled.  This can produce a very annoying failure mode when used as
a serial console, allowing a boot/reboot to hang indefinitely until an
RX interrupt kicks it into working again (ie. an unattended reboot could
stall).”

That matches my symptoms exactly, which suggests that the Xen UART
probably has a similar bug.

I’ve seen this in Xen 3.2.1 and 3.3.1.  (My host is running Debian Lenny
amd64, with the Xen dom0 kernel 2.6.24-23-xen from Ubuntu Hardy, on a
server with two quad-core Xeons.)

Anders

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2009-06-19  1:39 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-03-12 17:57 [PATCH] IRQ handling race and spurious IIR read in serial/8250.c Markus Armbruster
2009-03-12 17:57 ` Markus Armbruster
2009-03-12 18:09 ` Alan Cox
2009-03-12 18:09   ` Alan Cox
2009-03-12 19:30   ` Ian Jackson
2009-03-12 19:30     ` Ian Jackson
2009-03-12 21:38     ` [Xen-devel] " Alan Cox
2009-03-12 21:38       ` Alan Cox
2009-06-19  1:39       ` [Xen-devel] " Robert Evans
  -- strict thread matches above, loose matches on Subject: below --
2009-02-05  2:23 Serial console hangs with Linux 2.6.20 HVM guest Anders Kaseorg
2009-02-05 17:04 ` Ian Jackson
2009-02-05 19:34   ` Anders Kaseorg
2009-02-05 21:52     ` Anders Kaseorg
2009-02-10 15:34       ` Ian Jackson
2009-02-10 18:20         ` Anders Kaseorg
2009-02-11 16:08           ` [PATCH] IRQ handling race and spurious IIR read in serial/8250.c Ian Jackson
2009-02-19 17:52             ` Ian Jackson
2009-02-19 18:37               ` [Xen-devel] " Markus Armbruster

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.