All of lore.kernel.org
 help / color / mirror / Atom feed
* tty_lock held during transmit wait in close: still unresolved
@ 2011-05-25 23:59 Andreas Bombe
  2011-05-26  7:11 ` Greg KH
  2011-05-26  8:17 ` Alan Cox
  0 siblings, 2 replies; 8+ messages in thread
From: Andreas Bombe @ 2011-05-25 23:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg KH, Alan Cox, Arnd Bergmann

Short reminder/summary: Unlike the BKL, the tty_lock mechanism that
replaced it does not release the lock while sleeping. This causes the
tty_lock to be held for extended periods of time when uart_close()
(running with tty_lock held) tries to flush the transmit buffer but is
unable to do so.

The result is that until the transmit flush completes by the remote end
finally accepting the data or the driver timing out and giving up,
pretty much all tty operations freeze. No programs can be started
and for some reason X freezes for me. From a user viewpoint, the
computer effectively locks up for that time.

After I tracked the issue down to the tty_lock mechanism, I found some
discussion on LKML about that from last November (thread starting here:
http://lkml.kernel.org/r/4CCBCD8E.1020601@billauer.co.il ). However
nothing has come of it, it appears, at least not in the official kernel.

A minimalist way to trigger this issue is with a serial port that has
nothing attached:

stty -F /dev/ttyS0 crtscts
echo >/dev/ttyS0

The echo triggers a 30 second timeout on close while the driver is
trying to flush the newline. Another way is developing an USB device
with a virtual serial port and having it stop (by debugger or
crash/lockup)...

Any ideas on how to progress?

-- 
Andreas Bombe

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: tty_lock held during transmit wait in close: still unresolved
  2011-05-25 23:59 tty_lock held during transmit wait in close: still unresolved Andreas Bombe
@ 2011-05-26  7:11 ` Greg KH
  2011-05-27  0:29   ` Andreas Bombe
  2011-05-26  8:17 ` Alan Cox
  1 sibling, 1 reply; 8+ messages in thread
From: Greg KH @ 2011-05-26  7:11 UTC (permalink / raw)
  To: Andreas Bombe; +Cc: linux-kernel, Alan Cox, Arnd Bergmann

On Thu, May 26, 2011 at 01:59:22AM +0200, Andreas Bombe wrote:
> Short reminder/summary: Unlike the BKL, the tty_lock mechanism that
> replaced it does not release the lock while sleeping. This causes the
> tty_lock to be held for extended periods of time when uart_close()
> (running with tty_lock held) tries to flush the transmit buffer but is
> unable to do so.
> 
> The result is that until the transmit flush completes by the remote end
> finally accepting the data or the driver timing out and giving up,
> pretty much all tty operations freeze. No programs can be started
> and for some reason X freezes for me. From a user viewpoint, the
> computer effectively locks up for that time.
> 
> After I tracked the issue down to the tty_lock mechanism, I found some
> discussion on LKML about that from last November (thread starting here:
> http://lkml.kernel.org/r/4CCBCD8E.1020601@billauer.co.il ). However
> nothing has come of it, it appears, at least not in the official kernel.
> 
> A minimalist way to trigger this issue is with a serial port that has
> nothing attached:
> 
> stty -F /dev/ttyS0 crtscts
> echo >/dev/ttyS0
> 
> The echo triggers a 30 second timeout on close while the driver is
> trying to flush the newline. Another way is developing an USB device
> with a virtual serial port and having it stop (by debugger or
> crash/lockup)...
> 
> Any ideas on how to progress?

Can you try Linus's tree right now?  A number of changes went in during
this merge window that might help out here I think.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: tty_lock held during transmit wait in close: still unresolved
  2011-05-25 23:59 tty_lock held during transmit wait in close: still unresolved Andreas Bombe
  2011-05-26  7:11 ` Greg KH
@ 2011-05-26  8:17 ` Alan Cox
  2011-05-27  0:41   ` Andreas Bombe
  1 sibling, 1 reply; 8+ messages in thread
From: Alan Cox @ 2011-05-26  8:17 UTC (permalink / raw)
  To: Andreas Bombe; +Cc: linux-kernel, Greg KH, Arnd Bergmann

> A minimalist way to trigger this issue is with a serial port that has
> nothing attached:
> 
> stty -F /dev/ttyS0 crtscts
> echo >/dev/ttyS0
> 
> The echo triggers a 30 second timeout on close while the driver is
> trying to flush the newline. Another way is developing an USB device
> with a virtual serial port and having it stop (by debugger or
> crash/lockup)...
> 
> Any ideas on how to progress?

Send patches.

At any point you can show the code sleeps you can drop and retake the
tty mutex either side of it, so you should be able to do that in the
close timeout case. You may need to think about the order of locking with
the port mutex but I suspect you can drop and retake both there.

Most use of the mutex (but not quite all) could also be moved to a tty
specific mutex fairly easily once someone has time, but that alone won't
entirely fix your case.

Alan

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: tty_lock held during transmit wait in close: still unresolved
  2011-05-26  7:11 ` Greg KH
@ 2011-05-27  0:29   ` Andreas Bombe
  0 siblings, 0 replies; 8+ messages in thread
From: Andreas Bombe @ 2011-05-27  0:29 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel, Alan Cox, Arnd Bergmann

On Thu, May 26, 2011 at 12:11:04AM -0700, Greg KH wrote:
> On Thu, May 26, 2011 at 01:59:22AM +0200, Andreas Bombe wrote:
> > A minimalist way to trigger this issue is with a serial port that has
> > nothing attached:
> > 
> > stty -F /dev/ttyS0 crtscts
> > echo >/dev/ttyS0
> > 
> > The echo triggers a 30 second timeout on close while the driver is
> > trying to flush the newline. Another way is developing an USB device
> > with a virtual serial port and having it stop (by debugger or
> > crash/lockup)...
> > 
> > Any ideas on how to progress?
> 
> Can you try Linus's tree right now?  A number of changes went in during
> this merge window that might help out here I think.

I think I had those changes already in the last try (Wednesday's git)
but I tried again with the very latest. No noticeable change.

-- 
Andreas Bombe

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: tty_lock held during transmit wait in close: still unresolved
  2011-05-26  8:17 ` Alan Cox
@ 2011-05-27  0:41   ` Andreas Bombe
  2011-05-27 11:38     ` Arnd Bergmann
  2011-05-27 12:11     ` Jiri Slaby
  0 siblings, 2 replies; 8+ messages in thread
From: Andreas Bombe @ 2011-05-27  0:41 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel, Greg KH, Arnd Bergmann

On Thu, May 26, 2011 at 09:17:26AM +0100, Alan Cox wrote:
> > A minimalist way to trigger this issue is with a serial port that has
> > nothing attached:
> > 
> > stty -F /dev/ttyS0 crtscts
> > echo >/dev/ttyS0
> > 
> > The echo triggers a 30 second timeout on close while the driver is
> > trying to flush the newline. Another way is developing an USB device
> > with a virtual serial port and having it stop (by debugger or
> > crash/lockup)...
> > 
> > Any ideas on how to progress?
> 
> Send patches.

I would have, but I'm not exactly familiar with the tty code and I
thought messing around with the locking is probably not the most
promising way to start.

However…

> At any point you can show the code sleeps you can drop and retake the
> tty mutex either side of it, so you should be able to do that in the
> close timeout case. You may need to think about the order of locking with
> the port mutex but I suspect you can drop and retake both there.

…basically emulating the BKL semantics? Sounds more doable. I'll look
into it.

Of course that means it has to be done individually in all drivers.

> Most use of the mutex (but not quite all) could also be moved to a tty
> specific mutex fairly easily once someone has time, but that alone won't
> entirely fix your case.
> 
> Alan

-- 
Andreas Bombe

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: tty_lock held during transmit wait in close: still unresolved
  2011-05-27  0:41   ` Andreas Bombe
@ 2011-05-27 11:38     ` Arnd Bergmann
  2011-05-27 12:11     ` Jiri Slaby
  1 sibling, 0 replies; 8+ messages in thread
From: Arnd Bergmann @ 2011-05-27 11:38 UTC (permalink / raw)
  To: Andreas Bombe; +Cc: Alan Cox, linux-kernel, Greg KH

On Friday 27 May 2011, Andreas Bombe wrote:
> > At any point you can show the code sleeps you can drop and retake the
> > tty mutex either side of it, so you should be able to do that in the
> > close timeout case. You may need to think about the order of locking with
> > the port mutex but I suspect you can drop and retake both there.
> 
> …basically emulating the BKL semantics? Sounds more doable. I'll look
> into it.

If I understand it correctly, the problem is the msleep_interruptible()
in __uart_wait_until_sent(), right?

Note that this function may be called with or without the port mutex
held, depending on whether the caller is uart_close or uart_wait_until_sent.
The tricky part here will be making sure that you hold neither the
port mutex nor the tty_mutex while sleeping, and to always retake them
in the correct order (tty_mutex before port mutex).

My mistake here must have been that I assumed the timeout was relatively
short to not hurt when holding a mutex, since we already hold the port
mutex. I expected the wait time to be a fraction of a second as in the time
that it takes to send a few remaining characters, which would be acceptable,
unlike the 30 second sleep that you are seeing.

> Of course that means it has to be done individually in all drivers.

Right. Fortunately, we have now reduced the number of drivers a bit, by
moving some of them to staging or completely out of the kernel.

Some drivers call their wait_until_sent function directly from their
close function, some call it through tty_wait_until_sent, and some
actually do both.

Further, some of the drivers have a rather ugly part in them where we take
tty_lock() conditionally in wait_until_sent(), depending on whether the
current thread already holds it (i.e. when coming from ->close, not when
coming from ioctl).

	Arnd

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: tty_lock held during transmit wait in close: still unresolved
  2011-05-27  0:41   ` Andreas Bombe
  2011-05-27 11:38     ` Arnd Bergmann
@ 2011-05-27 12:11     ` Jiri Slaby
  2011-05-27 13:53       ` Arnd Bergmann
  1 sibling, 1 reply; 8+ messages in thread
From: Jiri Slaby @ 2011-05-27 12:11 UTC (permalink / raw)
  To: Andreas Bombe; +Cc: alan, linux-kernel, greg, arnd, jirislaby

> > At any point you can show the code sleeps you can drop and retake the
> > tty mutex either side of it, so you should be able to do that in the
> > close timeout case. You may need to think about the order of locking with
> > the port mutex but I suspect you can drop and retake both there.
>
> #basically emulating the BKL semantics? Sounds more doable. I'll look
> into it.

Actually the best would be to get rid of tty_lock completely. Here I
have a patch removing tty_lock from wait_until_sent path. I've been
using a kernel with the patch applied for about month without any
problems. But I have only 8250-based serial.

Why I ahven't sent it is that I need to re-investigate users whether
this will hurt. (I already did, but stopped in the middle to do other
things.) Mostly wait_until_sent hooks in uart drivers do simple reads
from device.

The patch is only the first step. When this patch is OK and works, we
can move the whole uart thing to use tty_port_open/close helpers.

Does the patch below fix the issue?

---
 drivers/tty/serial/serial_core.c |   30 +++++++-----------------------
 1 files changed, 7 insertions(+), 23 deletions(-)

diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index db7912c..2cbf1bd 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -57,7 +57,7 @@ static struct lock_class_key port_lock_key;
 
 static void uart_change_speed(struct tty_struct *tty, struct uart_state *state,
 					struct ktermios *old_termios);
-static void __uart_wait_until_sent(struct uart_port *port, int timeout);
+static void uart_wait_until_sent(struct tty_struct *tty, int timeout);
 static void uart_change_pm(struct uart_state *state, int pm_state);
 
 /*
@@ -1304,16 +1304,8 @@ static void uart_close(struct tty_struct *tty, struct file *filp)
 	tty->closing = 1;
 	spin_unlock_irqrestore(&port->lock, flags);
 
-	if (port->closing_wait != ASYNC_CLOSING_WAIT_NONE) {
-		/*
-		 * hack: open-coded tty_wait_until_sent to avoid
-		 * recursive tty_lock
-		 */
-		long timeout = msecs_to_jiffies(port->closing_wait);
-		if (wait_event_interruptible_timeout(tty->write_wait,
-				!tty_chars_in_buffer(tty), timeout) >= 0)
-			__uart_wait_until_sent(uport, timeout);
-	}
+	if (port->closing_wait != ASYNC_CLOSING_WAIT_NONE)
+		tty_wait_until_sent(tty, msecs_to_jiffies(port->closing_wait));
 
 	/*
 	 * At this point, we stop accepting input.  To do this, we
@@ -1329,7 +1321,7 @@ static void uart_close(struct tty_struct *tty, struct file *filp)
 		 * has completely drained; this is especially
 		 * important if there is a transmit FIFO!
 		 */
-		__uart_wait_until_sent(uport, uport->timeout);
+		uart_wait_until_sent(tty, uport->timeout);
 	}
 
 	uart_shutdown(tty, state);
@@ -1363,8 +1355,10 @@ done:
 	mutex_unlock(&port->mutex);
 }
 
-static void __uart_wait_until_sent(struct uart_port *port, int timeout)
+static void uart_wait_until_sent(struct tty_struct *tty, int timeout)
 {
+	struct uart_state *state = tty->driver_data;
+	struct uart_port *port = state->uart_port;
 	unsigned long char_time, expire;
 
 	if (port->type == PORT_UNKNOWN || port->fifosize == 0)
@@ -1416,16 +1410,6 @@ static void __uart_wait_until_sent(struct uart_port *port, int timeout)
 	}
 }
 
-static void uart_wait_until_sent(struct tty_struct *tty, int timeout)
-{
-	struct uart_state *state = tty->driver_data;
-	struct uart_port *port = state->uart_port;
-
-	tty_lock();
-	__uart_wait_until_sent(port, timeout);
-	tty_unlock();
-}
-
 /*
  * This is called with the BKL held in
  *  linux/drivers/char/tty_io.c:do_tty_hangup()
-- 
1.7.4.2



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: tty_lock held during transmit wait in close: still unresolved
  2011-05-27 12:11     ` Jiri Slaby
@ 2011-05-27 13:53       ` Arnd Bergmann
  0 siblings, 0 replies; 8+ messages in thread
From: Arnd Bergmann @ 2011-05-27 13:53 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Andreas Bombe, alan, linux-kernel, greg, jirislaby

On Friday 27 May 2011, Jiri Slaby wrote:
> Actually the best would be to get rid of tty_lock completely. Here I
> have a patch removing tty_lock from wait_until_sent path. I've been
> using a kernel with the patch applied for about month without any
> problems. But I have only 8250-based serial.

I agree that this would be the best option. The only reason I had
for holding the tty_lock in wait_until_sent is that it the BKL
has been held there tradiditonally. I could not find anything that
would break without holding it, but I also didn't look very hard.

	Arnd

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2011-05-27 13:53 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-05-25 23:59 tty_lock held during transmit wait in close: still unresolved Andreas Bombe
2011-05-26  7:11 ` Greg KH
2011-05-27  0:29   ` Andreas Bombe
2011-05-26  8:17 ` Alan Cox
2011-05-27  0:41   ` Andreas Bombe
2011-05-27 11:38     ` Arnd Bergmann
2011-05-27 12:11     ` Jiri Slaby
2011-05-27 13:53       ` Arnd Bergmann

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.