From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753386AbbLJWtB (ORCPT ); Thu, 10 Dec 2015 17:49:01 -0500 Received: from smtp-out-so.shaw.ca ([64.59.136.138]:38003 "EHLO smtp-out-so.shaw.ca" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751991AbbLJWs7 (ORCPT ); Thu, 10 Dec 2015 17:48:59 -0500 X-Authority-Analysis: v=2.1 cv=He60Nnw8 c=1 sm=1 tr=0 a=qZxK3cM5tHtOUZVZkOzy1Q==:117 a=qZxK3cM5tHtOUZVZkOzy1Q==:17 a=3I1X_3ewAAAA:8 a=kj9zAlcOel0A:10 a=hlA1aZlqE3UKVm7xPq4A:9 a=K3_39NgPM2eG6TxB:21 a=OBfKLxK9caXJfj_h:21 a=CjuIK1q_8ugA:10 Date: Thu, 10 Dec 2015 15:48:55 -0700 (MST) From: Marc Aurele La France To: Peter Hurley cc: Greg Kroah-Hartman , Jiri Slaby , linux-kernel@vger.kernel.org, Volth , Damien Miller Subject: Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read() In-Reply-To: <56699356.8040802@hurleysoftware.com> Message-ID: References: <56699356.8040802@hurleysoftware.com> User-Agent: Alpine 2.00 (LNX 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-CMAE-Envelope: MS4wfA2x1YELjA5YBW8LRn6Zquvgws2sj9NUlaT/KBp2stcu70avjIFpVm9a5Fk7wJEyY+tmqXxMZnIosKfnz8bOZTvGRIhe7SbV09bk3mMSEumPL5M11eTf IHyOW4tlMMJHuImTOIn7tl3Y58OaFrvMtwTFc3t5xTSXPd4BpsGNODZ4JeLm078ZVfoR04hnrwBz7Lx8/GuUSFtC6CXTpAgnq5/bO1Gn3dncmBPPZQtTURAJ Hb/MyljLm8reZuRQABvTCYLQvUwsITBcELfFVqcpHr993VC0qLTWWJo2+SNH4dRHRp4PBpqVeuT5NBJoaOTStpCp+La4ME07PiuPPXG8Uto= Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 10 Dec 2015, Peter Hurley wrote: > On 12/09/2015 01:06 PM, Marc Aurele La France wrote: >> The following four commits are some of the changes that have been made >> to the tty layer since kernel version 3.11: >> 1) f95499c3030fe1bfad57745f2db1959c5b43dca8 >> n_tty: Don't wait for buffer work in read() loop >> 2) f8747d4a466ab2cafe56112c51b3379f9fdb7a12 >> tty: Fix pty master read() after slave closes >> 3) 52bce7f8d4fc633c9a9d0646eef58ba6ae9a3b73 >> pty, n_tty: Simplify input processing on final close >> 4) 1a48632ffed61352a7810ce089dc5a8bcd505a60 >> pty: Fix input race when closing >> Indeed, the previous code (before commit "1)") checked the other end >> of the pty pair for any data before returning EAGAIN. This mimics the >> behaviour of other System-V variants (Solaris, AIX, etc.) > What other SysV systems were tested? The fix for mindrot bug 52 was verified to correct the issue reported there on various versions of SunOS, OSF1, HP-UX, IRIX, Tru64 UNIX, AIX, OpenSolaris, and a number of Linux distributions. There was a no-go report regarding HP-UX 11.11 that was never followed up on, but I believe it to have been resolved by the fix for mindrot bug 1467 (for systems where EAGAIN != EWOULDBLOCK). >> in this >> regard and ensured that EAGAIN really did mean no data was available >> at the time of the call. >> After sshd has been SIGCHLD'ed about the shell's termination, it >> continues to read the master pty until an error occurs. This error >> will be EIO if no process has the slave pty open. Otherwise (for >> example when the shell spawned long-running processes in the >> background before terminating), that error is expected to be EAGAIN. >> sshd cannot continue to read until an EIO in all cases, because doing >> so causes the session to hang until all processes have closed the >> slave pty, which is not the desired behaviour. Thus a spurious EAGAIN >> return causes sshd to lose data, whether or not the slave pty is >> completely closed. > Ah, the games userspace will be up to :) Not really. The fact different OSes behave differently in this regard can hardly be said to be userland's fault. The lower the number of distinct behaviours userland needs to deal with, the better. Furthermore, sshd "knows" there should be data there, so it makes no sense to befuddle it with false EAGAIN returns. >> --- a/drivers/tty/n_tty.c >> +++ a/drivers/tty/n_tty.c >> @@ -2281,20 +2281,26 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file, >> if (!timeout) >> break; >> if (file->f_flags & O_NONBLOCK) { >> - retval = -EAGAIN; >> - break; >> - } >> - if (signal_pending(current)) { >> - retval = -ERESTARTSYS; >> - break; >> - } >> - up_read(&tty->termios_rwsem); >> + up_read(&tty->termios_rwsem); >> + flush_work(&tty->port->buf.work); >> + down_read(&tty->termios_rwsem); >> + if (!input_available_p(tty, 0)) { >> + retval = -EAGAIN; >> + break; >> + } >> + } else { >> + if (signal_pending(current)) { >> + retval = -ERESTARTSYS; >> + break; >> + } >> + up_read(&tty->termios_rwsem); > No sense in doing this just for O_NONBLOCK; might as well do it before > all the condition tests. > Which renders the earlier fixes for the slave end closing superfluous, > so might as well rip those out. This strikes me as somewhat draconian. It seems to me the intent behind commit "1)" was in part to speed things up a bit. I beleive that goal to have been largely attained. It's just a matter of ensuring this speedup doesn't change kernel<->userland semantics. I see your EIO fix and the eventual solution to the problem here in that light. > n_tty_poll() will need to be fixed as well, because if one application > used read() with O_NONBLOCK to expect to block until i/o became available, > then I guarantee some other application uses poll() with no timeout > for the same purpose. Agreed. which is another reason why I don't believe my suggestion to be the correct fix. Also, I took somewhat of a hammer approach, meaning that, for OpenSSH's purposes, it would be sufficient to remove spurious EAGAIN only after disassociation of the slave pty. >> >> - timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, >> - timeout); >> + timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, >> + timeout); >> >> - down_read(&tty->termios_rwsem); >> - continue; >> + down_read(&tty->termios_rwsem); >> + continue; >> + } >> } >> >> if (ldata->icanon && !L_EXTPROC(tty)) { >> Thanks. Marc.