* tty_lock held during transmit wait in close: still unresolved @ 2011-05-25 23:59 Andreas Bombe 2011-05-26 7:11 ` Greg KH 2011-05-26 8:17 ` Alan Cox 0 siblings, 2 replies; 8+ messages in thread From: Andreas Bombe @ 2011-05-25 23:59 UTC (permalink / raw) To: linux-kernel; +Cc: Greg KH, Alan Cox, Arnd Bergmann Short reminder/summary: Unlike the BKL, the tty_lock mechanism that replaced it does not release the lock while sleeping. This causes the tty_lock to be held for extended periods of time when uart_close() (running with tty_lock held) tries to flush the transmit buffer but is unable to do so. The result is that until the transmit flush completes by the remote end finally accepting the data or the driver timing out and giving up, pretty much all tty operations freeze. No programs can be started and for some reason X freezes for me. From a user viewpoint, the computer effectively locks up for that time. After I tracked the issue down to the tty_lock mechanism, I found some discussion on LKML about that from last November (thread starting here: http://lkml.kernel.org/r/4CCBCD8E.1020601@billauer.co.il ). However nothing has come of it, it appears, at least not in the official kernel. A minimalist way to trigger this issue is with a serial port that has nothing attached: stty -F /dev/ttyS0 crtscts echo >/dev/ttyS0 The echo triggers a 30 second timeout on close while the driver is trying to flush the newline. Another way is developing an USB device with a virtual serial port and having it stop (by debugger or crash/lockup)... Any ideas on how to progress? -- Andreas Bombe ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: tty_lock held during transmit wait in close: still unresolved 2011-05-25 23:59 tty_lock held during transmit wait in close: still unresolved Andreas Bombe @ 2011-05-26 7:11 ` Greg KH 2011-05-27 0:29 ` Andreas Bombe 2011-05-26 8:17 ` Alan Cox 1 sibling, 1 reply; 8+ messages in thread From: Greg KH @ 2011-05-26 7:11 UTC (permalink / raw) To: Andreas Bombe; +Cc: linux-kernel, Alan Cox, Arnd Bergmann On Thu, May 26, 2011 at 01:59:22AM +0200, Andreas Bombe wrote: > Short reminder/summary: Unlike the BKL, the tty_lock mechanism that > replaced it does not release the lock while sleeping. This causes the > tty_lock to be held for extended periods of time when uart_close() > (running with tty_lock held) tries to flush the transmit buffer but is > unable to do so. > > The result is that until the transmit flush completes by the remote end > finally accepting the data or the driver timing out and giving up, > pretty much all tty operations freeze. No programs can be started > and for some reason X freezes for me. From a user viewpoint, the > computer effectively locks up for that time. > > After I tracked the issue down to the tty_lock mechanism, I found some > discussion on LKML about that from last November (thread starting here: > http://lkml.kernel.org/r/4CCBCD8E.1020601@billauer.co.il ). However > nothing has come of it, it appears, at least not in the official kernel. > > A minimalist way to trigger this issue is with a serial port that has > nothing attached: > > stty -F /dev/ttyS0 crtscts > echo >/dev/ttyS0 > > The echo triggers a 30 second timeout on close while the driver is > trying to flush the newline. Another way is developing an USB device > with a virtual serial port and having it stop (by debugger or > crash/lockup)... > > Any ideas on how to progress? Can you try Linus's tree right now? A number of changes went in during this merge window that might help out here I think. thanks, greg k-h ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: tty_lock held during transmit wait in close: still unresolved 2011-05-26 7:11 ` Greg KH @ 2011-05-27 0:29 ` Andreas Bombe 0 siblings, 0 replies; 8+ messages in thread From: Andreas Bombe @ 2011-05-27 0:29 UTC (permalink / raw) To: Greg KH; +Cc: linux-kernel, Alan Cox, Arnd Bergmann On Thu, May 26, 2011 at 12:11:04AM -0700, Greg KH wrote: > On Thu, May 26, 2011 at 01:59:22AM +0200, Andreas Bombe wrote: > > A minimalist way to trigger this issue is with a serial port that has > > nothing attached: > > > > stty -F /dev/ttyS0 crtscts > > echo >/dev/ttyS0 > > > > The echo triggers a 30 second timeout on close while the driver is > > trying to flush the newline. Another way is developing an USB device > > with a virtual serial port and having it stop (by debugger or > > crash/lockup)... > > > > Any ideas on how to progress? > > Can you try Linus's tree right now? A number of changes went in during > this merge window that might help out here I think. I think I had those changes already in the last try (Wednesday's git) but I tried again with the very latest. No noticeable change. -- Andreas Bombe ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: tty_lock held during transmit wait in close: still unresolved 2011-05-25 23:59 tty_lock held during transmit wait in close: still unresolved Andreas Bombe 2011-05-26 7:11 ` Greg KH @ 2011-05-26 8:17 ` Alan Cox 2011-05-27 0:41 ` Andreas Bombe 1 sibling, 1 reply; 8+ messages in thread From: Alan Cox @ 2011-05-26 8:17 UTC (permalink / raw) To: Andreas Bombe; +Cc: linux-kernel, Greg KH, Arnd Bergmann > A minimalist way to trigger this issue is with a serial port that has > nothing attached: > > stty -F /dev/ttyS0 crtscts > echo >/dev/ttyS0 > > The echo triggers a 30 second timeout on close while the driver is > trying to flush the newline. Another way is developing an USB device > with a virtual serial port and having it stop (by debugger or > crash/lockup)... > > Any ideas on how to progress? Send patches. At any point you can show the code sleeps you can drop and retake the tty mutex either side of it, so you should be able to do that in the close timeout case. You may need to think about the order of locking with the port mutex but I suspect you can drop and retake both there. Most use of the mutex (but not quite all) could also be moved to a tty specific mutex fairly easily once someone has time, but that alone won't entirely fix your case. Alan ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: tty_lock held during transmit wait in close: still unresolved 2011-05-26 8:17 ` Alan Cox @ 2011-05-27 0:41 ` Andreas Bombe 2011-05-27 11:38 ` Arnd Bergmann 2011-05-27 12:11 ` Jiri Slaby 0 siblings, 2 replies; 8+ messages in thread From: Andreas Bombe @ 2011-05-27 0:41 UTC (permalink / raw) To: Alan Cox; +Cc: linux-kernel, Greg KH, Arnd Bergmann On Thu, May 26, 2011 at 09:17:26AM +0100, Alan Cox wrote: > > A minimalist way to trigger this issue is with a serial port that has > > nothing attached: > > > > stty -F /dev/ttyS0 crtscts > > echo >/dev/ttyS0 > > > > The echo triggers a 30 second timeout on close while the driver is > > trying to flush the newline. Another way is developing an USB device > > with a virtual serial port and having it stop (by debugger or > > crash/lockup)... > > > > Any ideas on how to progress? > > Send patches. I would have, but I'm not exactly familiar with the tty code and I thought messing around with the locking is probably not the most promising way to start. However… > At any point you can show the code sleeps you can drop and retake the > tty mutex either side of it, so you should be able to do that in the > close timeout case. You may need to think about the order of locking with > the port mutex but I suspect you can drop and retake both there. …basically emulating the BKL semantics? Sounds more doable. I'll look into it. Of course that means it has to be done individually in all drivers. > Most use of the mutex (but not quite all) could also be moved to a tty > specific mutex fairly easily once someone has time, but that alone won't > entirely fix your case. > > Alan -- Andreas Bombe ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: tty_lock held during transmit wait in close: still unresolved 2011-05-27 0:41 ` Andreas Bombe @ 2011-05-27 11:38 ` Arnd Bergmann 2011-05-27 12:11 ` Jiri Slaby 1 sibling, 0 replies; 8+ messages in thread From: Arnd Bergmann @ 2011-05-27 11:38 UTC (permalink / raw) To: Andreas Bombe; +Cc: Alan Cox, linux-kernel, Greg KH On Friday 27 May 2011, Andreas Bombe wrote: > > At any point you can show the code sleeps you can drop and retake the > > tty mutex either side of it, so you should be able to do that in the > > close timeout case. You may need to think about the order of locking with > > the port mutex but I suspect you can drop and retake both there. > > …basically emulating the BKL semantics? Sounds more doable. I'll look > into it. If I understand it correctly, the problem is the msleep_interruptible() in __uart_wait_until_sent(), right? Note that this function may be called with or without the port mutex held, depending on whether the caller is uart_close or uart_wait_until_sent. The tricky part here will be making sure that you hold neither the port mutex nor the tty_mutex while sleeping, and to always retake them in the correct order (tty_mutex before port mutex). My mistake here must have been that I assumed the timeout was relatively short to not hurt when holding a mutex, since we already hold the port mutex. I expected the wait time to be a fraction of a second as in the time that it takes to send a few remaining characters, which would be acceptable, unlike the 30 second sleep that you are seeing. > Of course that means it has to be done individually in all drivers. Right. Fortunately, we have now reduced the number of drivers a bit, by moving some of them to staging or completely out of the kernel. Some drivers call their wait_until_sent function directly from their close function, some call it through tty_wait_until_sent, and some actually do both. Further, some of the drivers have a rather ugly part in them where we take tty_lock() conditionally in wait_until_sent(), depending on whether the current thread already holds it (i.e. when coming from ->close, not when coming from ioctl). Arnd ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: tty_lock held during transmit wait in close: still unresolved 2011-05-27 0:41 ` Andreas Bombe 2011-05-27 11:38 ` Arnd Bergmann @ 2011-05-27 12:11 ` Jiri Slaby 2011-05-27 13:53 ` Arnd Bergmann 1 sibling, 1 reply; 8+ messages in thread From: Jiri Slaby @ 2011-05-27 12:11 UTC (permalink / raw) To: Andreas Bombe; +Cc: alan, linux-kernel, greg, arnd, jirislaby > > At any point you can show the code sleeps you can drop and retake the > > tty mutex either side of it, so you should be able to do that in the > > close timeout case. You may need to think about the order of locking with > > the port mutex but I suspect you can drop and retake both there. > > #basically emulating the BKL semantics? Sounds more doable. I'll look > into it. Actually the best would be to get rid of tty_lock completely. Here I have a patch removing tty_lock from wait_until_sent path. I've been using a kernel with the patch applied for about month without any problems. But I have only 8250-based serial. Why I ahven't sent it is that I need to re-investigate users whether this will hurt. (I already did, but stopped in the middle to do other things.) Mostly wait_until_sent hooks in uart drivers do simple reads from device. The patch is only the first step. When this patch is OK and works, we can move the whole uart thing to use tty_port_open/close helpers. Does the patch below fix the issue? --- drivers/tty/serial/serial_core.c | 30 +++++++----------------------- 1 files changed, 7 insertions(+), 23 deletions(-) diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c index db7912c..2cbf1bd 100644 --- a/drivers/tty/serial/serial_core.c +++ b/drivers/tty/serial/serial_core.c @@ -57,7 +57,7 @@ static struct lock_class_key port_lock_key; static void uart_change_speed(struct tty_struct *tty, struct uart_state *state, struct ktermios *old_termios); -static void __uart_wait_until_sent(struct uart_port *port, int timeout); +static void uart_wait_until_sent(struct tty_struct *tty, int timeout); static void uart_change_pm(struct uart_state *state, int pm_state); /* @@ -1304,16 +1304,8 @@ static void uart_close(struct tty_struct *tty, struct file *filp) tty->closing = 1; spin_unlock_irqrestore(&port->lock, flags); - if (port->closing_wait != ASYNC_CLOSING_WAIT_NONE) { - /* - * hack: open-coded tty_wait_until_sent to avoid - * recursive tty_lock - */ - long timeout = msecs_to_jiffies(port->closing_wait); - if (wait_event_interruptible_timeout(tty->write_wait, - !tty_chars_in_buffer(tty), timeout) >= 0) - __uart_wait_until_sent(uport, timeout); - } + if (port->closing_wait != ASYNC_CLOSING_WAIT_NONE) + tty_wait_until_sent(tty, msecs_to_jiffies(port->closing_wait)); /* * At this point, we stop accepting input. To do this, we @@ -1329,7 +1321,7 @@ static void uart_close(struct tty_struct *tty, struct file *filp) * has completely drained; this is especially * important if there is a transmit FIFO! */ - __uart_wait_until_sent(uport, uport->timeout); + uart_wait_until_sent(tty, uport->timeout); } uart_shutdown(tty, state); @@ -1363,8 +1355,10 @@ done: mutex_unlock(&port->mutex); } -static void __uart_wait_until_sent(struct uart_port *port, int timeout) +static void uart_wait_until_sent(struct tty_struct *tty, int timeout) { + struct uart_state *state = tty->driver_data; + struct uart_port *port = state->uart_port; unsigned long char_time, expire; if (port->type == PORT_UNKNOWN || port->fifosize == 0) @@ -1416,16 +1410,6 @@ static void __uart_wait_until_sent(struct uart_port *port, int timeout) } } -static void uart_wait_until_sent(struct tty_struct *tty, int timeout) -{ - struct uart_state *state = tty->driver_data; - struct uart_port *port = state->uart_port; - - tty_lock(); - __uart_wait_until_sent(port, timeout); - tty_unlock(); -} - /* * This is called with the BKL held in * linux/drivers/char/tty_io.c:do_tty_hangup() -- 1.7.4.2 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: tty_lock held during transmit wait in close: still unresolved 2011-05-27 12:11 ` Jiri Slaby @ 2011-05-27 13:53 ` Arnd Bergmann 0 siblings, 0 replies; 8+ messages in thread From: Arnd Bergmann @ 2011-05-27 13:53 UTC (permalink / raw) To: Jiri Slaby; +Cc: Andreas Bombe, alan, linux-kernel, greg, jirislaby On Friday 27 May 2011, Jiri Slaby wrote: > Actually the best would be to get rid of tty_lock completely. Here I > have a patch removing tty_lock from wait_until_sent path. I've been > using a kernel with the patch applied for about month without any > problems. But I have only 8250-based serial. I agree that this would be the best option. The only reason I had for holding the tty_lock in wait_until_sent is that it the BKL has been held there tradiditonally. I could not find anything that would break without holding it, but I also didn't look very hard. Arnd ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2011-05-27 13:53 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2011-05-25 23:59 tty_lock held during transmit wait in close: still unresolved Andreas Bombe 2011-05-26 7:11 ` Greg KH 2011-05-27 0:29 ` Andreas Bombe 2011-05-26 8:17 ` Alan Cox 2011-05-27 0:41 ` Andreas Bombe 2011-05-27 11:38 ` Arnd Bergmann 2011-05-27 12:11 ` Jiri Slaby 2011-05-27 13:53 ` Arnd Bergmann
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.