All of lore.kernel.org
 help / color / mirror / Atom feed
* n_tty:  Check the other end of pty pair before returning EAGAIN on a read()
@ 2015-12-09 21:06 Marc Aurele La France
  2015-12-10 14:59 ` Peter Hurley
  0 siblings, 1 reply; 14+ messages in thread
From: Marc Aurele La France @ 2015-12-09 21:06 UTC (permalink / raw)
  To: Peter Hurley, Greg Kroah-Hartman, Jiri Slaby, linux-kernel
  Cc: Volth, Damien Miller

Greetings.

The following four commits are some of the changes that have been made
to the tty layer since kernel version 3.11:

1) f95499c3030fe1bfad57745f2db1959c5b43dca8
    n_tty: Don't wait for buffer work in read() loop

2) f8747d4a466ab2cafe56112c51b3379f9fdb7a12
    tty: Fix pty master read() after slave closes

3) 52bce7f8d4fc633c9a9d0646eef58ba6ae9a3b73
    pty, n_tty: Simplify input processing on final close

4) 1a48632ffed61352a7810ce089dc5a8bcd505a60
    pty: Fix input race when closing

Commit "4)" corrected an issue whereby EIO could be prematurely
returned on a read of one end of a master/slave pty pair after the
other had been completely closed.  Yet, I would argue that EAGAIN
should not be returned either when there actually is data to be
returned.  This whether or not the other end has been completely
closed.

Indeed, the previous code (before commit "1)") checked the other end
of the pty pair for any data before returning EAGAIN.  This mimics the
behaviour of other System-V variants (Solaris, AIX, etc.) in this
regard and ensured that EAGAIN really did mean no data was available
at the time of the call.

Portable OpenSSH, since release 4.6p1 in 2007, relies on this being
the case and has been broken since commit "1)" introduced spurious
EAGAIN returns (i.e. as of 3.12 kernels).  The scenario at hand is
as follows.

After sshd has been SIGCHLD'ed about the shell's termination, it
continues to read the master pty until an error occurs.  This error
will be EIO if no process has the slave pty open.  Otherwise (for
example when the shell spawned long-running processes in the
background before terminating), that error is expected to be EAGAIN.
sshd cannot continue to read until an EIO in all cases, because doing
so causes the session to hang until all processes have closed the
slave pty, which is not the desired behaviour.  Thus a spurious EAGAIN
return causes sshd to lose data, whether or not the slave pty is
completely closed.

I've been using the following script to reproduce the problem.  It
loops until the issue is detected.

	#! /bin/bash

	LOG=sshlog-`date "+%F.%T"`

	touch ${LOG}

	while test -z "`grep -n '^Connection' ${LOG} | grep -v '0:Connection'`"
	do
	 ssh -p 22 -tt root@localhost \
	  '/bin/bash -c "/bin/ping -c4 8.8.8.8"' 2>&1 | \
	  tee -a ${LOG}
	done

It should be noted that the problem is extremely rare, but still
occurs, on real hardware.  This bug is easier to replicate in a
virtual machine such as those that can be created using Google Cloud.

The patch below is a suggested fix.  It was developed using a 4.3.0
kernel and should apply, modulo fuzz, to any release >= 4.0.5.  My
suggested fix is modeled after commit "2)" mentionned above.  Given
commit "2)" was later reworked by commit "3)", I fully expect my fix
to be reworked as well.

I volunteer to backport the fix this ends up being to any stable
release >= 3.12 deemed needed.

Please Reply-To-All.

Thanks and have a great day.

Marc.

Reported-by: Volth <openssh@volth.com>
BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=52
BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=2492
Signed-off-by: Marc Aurele La France <tsi@tuyoix.net>

--- a/drivers/tty/n_tty.c
+++ a/drivers/tty/n_tty.c
@@ -2281,20 +2281,26 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
 			if (!timeout)
 				break;
 			if (file->f_flags & O_NONBLOCK) {
-				retval = -EAGAIN;
-				break;
-			}
-			if (signal_pending(current)) {
-				retval = -ERESTARTSYS;
-				break;
-			}
-			up_read(&tty->termios_rwsem);
+				up_read(&tty->termios_rwsem);
+				flush_work(&tty->port->buf.work);
+				down_read(&tty->termios_rwsem);
+				if (!input_available_p(tty, 0)) {
+					retval = -EAGAIN;
+					break;
+				}
+			} else {
+				if (signal_pending(current)) {
+					retval = -ERESTARTSYS;
+					break;
+				}
+				up_read(&tty->termios_rwsem);

-			timeout = wait_woken(&wait, TASK_INTERRUPTIBLE,
-					     timeout);
+				timeout = wait_woken(&wait, TASK_INTERRUPTIBLE,
+						     timeout);

-			down_read(&tty->termios_rwsem);
-			continue;
+				down_read(&tty->termios_rwsem);
+				continue;
+			}
 		}

 		if (ldata->icanon && !L_EXTPROC(tty)) {

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()
  2015-12-09 21:06 n_tty: Check the other end of pty pair before returning EAGAIN on a read() Marc Aurele La France
@ 2015-12-10 14:59 ` Peter Hurley
  2015-12-10 22:48   ` Marc Aurele La France
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Hurley @ 2015-12-10 14:59 UTC (permalink / raw)
  To: Marc Aurele La France, Greg Kroah-Hartman, Jiri Slaby, linux-kernel
  Cc: Volth, Damien Miller

On 12/09/2015 01:06 PM, Marc Aurele La France wrote:
> Greetings.
> 
> The following four commits are some of the changes that have been made
> to the tty layer since kernel version 3.11:
> 
> 1) f95499c3030fe1bfad57745f2db1959c5b43dca8
>     n_tty: Don't wait for buffer work in read() loop
> 
> 2) f8747d4a466ab2cafe56112c51b3379f9fdb7a12
>     tty: Fix pty master read() after slave closes
> 
> 3) 52bce7f8d4fc633c9a9d0646eef58ba6ae9a3b73
>     pty, n_tty: Simplify input processing on final close
> 
> 4) 1a48632ffed61352a7810ce089dc5a8bcd505a60
>     pty: Fix input race when closing
> 
> Commit "4)" corrected an issue whereby EIO could be prematurely
> returned on a read of one end of a master/slave pty pair after the
> other had been completely closed.  Yet, I would argue that EAGAIN
> should not be returned either when there actually is data to be
> returned.  This whether or not the other end has been completely
> closed.
> 
> Indeed, the previous code (before commit "1)") checked the other end
> of the pty pair for any data before returning EAGAIN.  This mimics the
> behaviour of other System-V variants (Solaris, AIX, etc.)
                                                      ^^^^
What other SysV systems were tested?

> in this
> regard and ensured that EAGAIN really did mean no data was available
> at the time of the call.
> 
> Portable OpenSSH, since release 4.6p1 in 2007, relies on this being
> the case and has been broken since commit "1)" introduced spurious
> EAGAIN returns (i.e. as of 3.12 kernels).  The scenario at hand is
> as follows.
> 
> After sshd has been SIGCHLD'ed about the shell's termination, it
> continues to read the master pty until an error occurs.  This error
> will be EIO if no process has the slave pty open.  Otherwise (for
> example when the shell spawned long-running processes in the
> background before terminating), that error is expected to be EAGAIN.
> sshd cannot continue to read until an EIO in all cases, because doing
> so causes the session to hang until all processes have closed the
> slave pty, which is not the desired behaviour.  Thus a spurious EAGAIN
> return causes sshd to lose data, whether or not the slave pty is
> completely closed.

Ah, the games userspace will be up to :)


> I've been using the following script to reproduce the problem.  It
> loops until the issue is detected.
> 
> 	#! /bin/bash
> 
> 	LOG=sshlog-`date "+%F.%T"`
> 
> 	touch ${LOG}
> 
> 	while test -z "`grep -n '^Connection' ${LOG} | grep -v '0:Connection'`"
> 	do
> 	 ssh -p 22 -tt root@localhost \
> 	  '/bin/bash -c "/bin/ping -c4 8.8.8.8"' 2>&1 | \
> 	  tee -a ${LOG}
> 	done
> 
> It should be noted that the problem is extremely rare, but still
> occurs, on real hardware.  This bug is easier to replicate in a
> virtual machine such as those that can be created using Google Cloud.
> 
> The patch below is a suggested fix.  It was developed using a 4.3.0
> kernel and should apply, modulo fuzz, to any release >= 4.0.5.  My
> suggested fix is modeled after commit "2)" mentionned above.  Given
> commit "2)" was later reworked by commit "3)", I fully expect my fix
> to be reworked as well.
> 
> I volunteer to backport the fix this ends up being to any stable
> release >= 3.12 deemed needed.
> 
> Please Reply-To-All.
> 
> Thanks and have a great day.
> 
> Marc.
> 
> Reported-by: Volth <openssh@volth.com>
> BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=52
> BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=2492
> Signed-off-by: Marc Aurele La France <tsi@tuyoix.net>
> 
> --- a/drivers/tty/n_tty.c
> +++ a/drivers/tty/n_tty.c
> @@ -2281,20 +2281,26 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
>  			if (!timeout)
>  				break;
>  			if (file->f_flags & O_NONBLOCK) {
> -				retval = -EAGAIN;
> -				break;
> -			}
> -			if (signal_pending(current)) {
> -				retval = -ERESTARTSYS;
> -				break;
> -			}
> -			up_read(&tty->termios_rwsem);
> +				up_read(&tty->termios_rwsem);
> +				flush_work(&tty->port->buf.work);
> +				down_read(&tty->termios_rwsem);
> +				if (!input_available_p(tty, 0)) {
> +					retval = -EAGAIN;
> +					break;
> +				}
> +			} else {
> +				if (signal_pending(current)) {
> +					retval = -ERESTARTSYS;
> +					break;
> +				}
> +				up_read(&tty->termios_rwsem);

No sense in doing this just for O_NONBLOCK; might as well do it before
all the condition tests.

Which renders the earlier fixes for the slave end closing superfluous,
so might as well rip those out.

n_tty_poll() will need to be fixed as well, because if one application
used read() with O_NONBLOCK to expect to block until i/o became available,
then I guarantee some other application uses poll() with no timeout
for the same purpose.

Regards,
Peter Hurley

> 
> -			timeout = wait_woken(&wait, TASK_INTERRUPTIBLE,
> -					     timeout);
> +				timeout = wait_woken(&wait, TASK_INTERRUPTIBLE,
> +						     timeout);
> 
> -			down_read(&tty->termios_rwsem);
> -			continue;
> +				down_read(&tty->termios_rwsem);
> +				continue;
> +			}
>  		}
> 
>  		if (ldata->icanon && !L_EXTPROC(tty)) {
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()
  2015-12-10 14:59 ` Peter Hurley
@ 2015-12-10 22:48   ` Marc Aurele La France
  2015-12-11  0:07     ` Peter Hurley
  0 siblings, 1 reply; 14+ messages in thread
From: Marc Aurele La France @ 2015-12-10 22:48 UTC (permalink / raw)
  To: Peter Hurley
  Cc: Greg Kroah-Hartman, Jiri Slaby, linux-kernel, Volth, Damien Miller

On Thu, 10 Dec 2015, Peter Hurley wrote:
> On 12/09/2015 01:06 PM, Marc Aurele La France wrote:
>> The following four commits are some of the changes that have been made
>> to the tty layer since kernel version 3.11:

>> 1) f95499c3030fe1bfad57745f2db1959c5b43dca8
>>     n_tty: Don't wait for buffer work in read() loop

>> 2) f8747d4a466ab2cafe56112c51b3379f9fdb7a12
>>     tty: Fix pty master read() after slave closes

>> 3) 52bce7f8d4fc633c9a9d0646eef58ba6ae9a3b73
>>     pty, n_tty: Simplify input processing on final close

>> 4) 1a48632ffed61352a7810ce089dc5a8bcd505a60
>>     pty: Fix input race when closing

>> Indeed, the previous code (before commit "1)") checked the other end
>> of the pty pair for any data before returning EAGAIN.  This mimics the
>> behaviour of other System-V variants (Solaris, AIX, etc.)

> What other SysV systems were tested?

The fix for mindrot bug 52 was verified to correct the issue reported
there on various versions of SunOS, OSF1, HP-UX, IRIX, Tru64 UNIX, AIX,
OpenSolaris, and a number of Linux distributions.  There was a no-go
report regarding HP-UX 11.11 that was never followed up on, but I
believe it to have been resolved by the fix for mindrot bug 1467 (for
systems where EAGAIN != EWOULDBLOCK).

>> in this
>> regard and ensured that EAGAIN really did mean no data was available
>> at the time of the call.

>> After sshd has been SIGCHLD'ed about the shell's termination, it
>> continues to read the master pty until an error occurs.  This error
>> will be EIO if no process has the slave pty open.  Otherwise (for
>> example when the shell spawned long-running processes in the
>> background before terminating), that error is expected to be EAGAIN.
>> sshd cannot continue to read until an EIO in all cases, because doing
>> so causes the session to hang until all processes have closed the
>> slave pty, which is not the desired behaviour.  Thus a spurious EAGAIN
>> return causes sshd to lose data, whether or not the slave pty is
>> completely closed.

> Ah, the games userspace will be up to :)

Not really.  The fact different OSes behave differently in this regard can
hardly be said to be userland's fault.  The lower the number of distinct
behaviours userland needs to deal with, the better.  Furthermore, sshd
"knows" there should be data there, so it makes no sense to befuddle it
with false EAGAIN returns.

>> --- a/drivers/tty/n_tty.c
>> +++ a/drivers/tty/n_tty.c
>> @@ -2281,20 +2281,26 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
>>  			if (!timeout)
>>  				break;
>>  			if (file->f_flags & O_NONBLOCK) {
>> -				retval = -EAGAIN;
>> -				break;
>> -			}
>> -			if (signal_pending(current)) {
>> -				retval = -ERESTARTSYS;
>> -				break;
>> -			}
>> -			up_read(&tty->termios_rwsem);
>> +				up_read(&tty->termios_rwsem);
>> +				flush_work(&tty->port->buf.work);
>> +				down_read(&tty->termios_rwsem);
>> +				if (!input_available_p(tty, 0)) {
>> +					retval = -EAGAIN;
>> +					break;
>> +				}
>> +			} else {
>> +				if (signal_pending(current)) {
>> +					retval = -ERESTARTSYS;
>> +					break;
>> +				}
>> +				up_read(&tty->termios_rwsem);

> No sense in doing this just for O_NONBLOCK; might as well do it before
> all the condition tests.

> Which renders the earlier fixes for the slave end closing superfluous,
> so might as well rip those out.

This strikes me as somewhat draconian.  It seems to me the intent behind
commit "1)" was in part to speed things up a bit.  I beleive that goal to
have been largely attained.  It's just a matter of ensuring this speedup
doesn't change kernel<->userland semantics.  I see your EIO fix and the
eventual solution to the problem here in that light.

> n_tty_poll() will need to be fixed as well, because if one application
> used read() with O_NONBLOCK to expect to block until i/o became available,
> then I guarantee some other application uses poll() with no timeout
> for the same purpose.

Agreed. which is another reason why I don't believe my suggestion to be
the correct fix.  Also, I took somewhat of a hammer approach, meaning
that, for OpenSSH's purposes, it would be sufficient to remove spurious
EAGAIN only after disassociation of the slave pty.

>>
>> -			timeout = wait_woken(&wait, TASK_INTERRUPTIBLE,
>> -					     timeout);
>> +				timeout = wait_woken(&wait, TASK_INTERRUPTIBLE,
>> +						     timeout);
>>
>> -			down_read(&tty->termios_rwsem);
>> -			continue;
>> +				down_read(&tty->termios_rwsem);
>> +				continue;
>> +			}
>>  		}
>>
>>  		if (ldata->icanon && !L_EXTPROC(tty)) {
>>

Thanks.

Marc.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()
  2015-12-10 22:48   ` Marc Aurele La France
@ 2015-12-11  0:07     ` Peter Hurley
  2015-12-11 13:37       ` Marc Aurele La France
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Hurley @ 2015-12-11  0:07 UTC (permalink / raw)
  To: Marc Aurele La France
  Cc: Greg Kroah-Hartman, Jiri Slaby, linux-kernel, Volth, Damien Miller

On 12/10/2015 02:48 PM, Marc Aurele La France wrote:
> On Thu, 10 Dec 2015, Peter Hurley wrote:
>> On 12/09/2015 01:06 PM, Marc Aurele La France wrote:
>>> The following four commits are some of the changes that have been made
>>> to the tty layer since kernel version 3.11:
> 
>>> 1) f95499c3030fe1bfad57745f2db1959c5b43dca8
>>>     n_tty: Don't wait for buffer work in read() loop
> 
>>> 2) f8747d4a466ab2cafe56112c51b3379f9fdb7a12
>>>     tty: Fix pty master read() after slave closes
> 
>>> 3) 52bce7f8d4fc633c9a9d0646eef58ba6ae9a3b73
>>>     pty, n_tty: Simplify input processing on final close
> 
>>> 4) 1a48632ffed61352a7810ce089dc5a8bcd505a60
>>>     pty: Fix input race when closing
> 
>>> Indeed, the previous code (before commit "1)") checked the other end
>>> of the pty pair for any data before returning EAGAIN.  This mimics the
>>> behaviour of other System-V variants (Solaris, AIX, etc.)
> 
>> What other SysV systems were tested?
> 
> The fix for mindrot bug 52 was verified to correct the issue reported
> there on various versions of SunOS, OSF1, HP-UX, IRIX, Tru64 UNIX, AIX,
> OpenSolaris, and a number of Linux distributions.  There was a no-go
> report regarding HP-UX 11.11 that was never followed up on, but I
> believe it to have been resolved by the fix for mindrot bug 1467 (for
> systems where EAGAIN != EWOULDBLOCK).

Ok.

>>> in this
>>> regard and ensured that EAGAIN really did mean no data was available
>>> at the time of the call.
> 
>>> After sshd has been SIGCHLD'ed about the shell's termination, it
>>> continues to read the master pty until an error occurs.  This error
>>> will be EIO if no process has the slave pty open.  Otherwise (for
>>> example when the shell spawned long-running processes in the
>>> background before terminating), that error is expected to be EAGAIN.
>>> sshd cannot continue to read until an EIO in all cases, because doing
>>> so causes the session to hang until all processes have closed the
>>> slave pty, which is not the desired behaviour.  Thus a spurious EAGAIN
>>> return causes sshd to lose data, whether or not the slave pty is
>>> completely closed.
> 
>> Ah, the games userspace will be up to :)
> 
> Not really.

Definitely.

The idea that a read with O_NONBLOCK set should have synchronous behavior
is ridiculous.

> The fact different OSes behave differently in this regard can
> hardly be said to be userland's fault.  The lower the number of distinct
> behaviours userland needs to deal with, the better.  Furthermore, sshd
> "knows" there should be data there, so it makes no sense to befuddle it
> with false EAGAIN returns.

But sshd doesn't "know". sshd "knows" the data has been sent and that's all.
sshd is extrapolating from one known condition to another unknown condition,
and assuming it "should" be that way because it has been.

For example, try the same idea with real ttys on loopback. Wouldn't work,
because it's asynchronous.

The only reason this needs fixing is because it's a userspace regression.


>>> --- a/drivers/tty/n_tty.c
>>> +++ a/drivers/tty/n_tty.c
>>> @@ -2281,20 +2281,26 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
>>>  			if (!timeout)
>>>  				break;
>>>  			if (file->f_flags & O_NONBLOCK) {
>>> -				retval = -EAGAIN;
>>> -				break;
>>> -			}
>>> -			if (signal_pending(current)) {
>>> -				retval = -ERESTARTSYS;
>>> -				break;
>>> -			}
>>> -			up_read(&tty->termios_rwsem);
>>> +				up_read(&tty->termios_rwsem);
>>> +				flush_work(&tty->port->buf.work);
>>> +				down_read(&tty->termios_rwsem);
>>> +				if (!input_available_p(tty, 0)) {
>>> +					retval = -EAGAIN;
>>> +					break;
>>> +				}
>>> +			} else {
>>> +				if (signal_pending(current)) {
>>> +					retval = -ERESTARTSYS;
>>> +					break;
>>> +				}
>>> +				up_read(&tty->termios_rwsem);
> 
>> No sense in doing this just for O_NONBLOCK; might as well do it before
>> all the condition tests.
> 
>> Which renders the earlier fixes for the slave end closing superfluous,
>> so might as well rip those out.
> 
> This strikes me as somewhat draconian.  It seems to me the intent behind
> commit "1)" was in part to speed things up a bit.  I beleive that goal to
> have been largely attained.  It's just a matter of ensuring this speedup
> doesn't change kernel<->userland semantics.  I see your EIO fix and the
> eventual solution to the problem here in that light.

When I first saw this report, I considered pipelining the status change
using new tty flags, but then reconsidered. It's one thing to operate within
the confines of VFS but a whole different situation to workaround this
problem with some kind of process exit hack.

Unfortunately, this condition lands squarely in the path I wanted to
avoid; O_NONBLOCK and poll(). I see no clear solution other than that
presented to prevent the regression. And the existing work since 3.12
is really pointless if we have to wait in O_NONBLOCK and poll anyway.

This is just one of those unfortunate situations where userspace has come
to rely on an unspecified behavior because it worked.

>> n_tty_poll() will need to be fixed as well, because if one application
>> used read() with O_NONBLOCK to expect to block until i/o became available,
>> then I guarantee some other application uses poll() with no timeout
>> for the same purpose.
> 
> Agreed. which is another reason why I don't believe my suggestion to be
> the correct fix.  Also, I took somewhat of a hammer approach, meaning
> that, for OpenSSH's purposes, it would be sufficient to remove spurious
> EAGAIN only after disassociation of the slave pty.

I don't see another solution.

Regards,
Peter Hurley

>>>
>>> -			timeout = wait_woken(&wait, TASK_INTERRUPTIBLE,
>>> -					     timeout);
>>> +				timeout = wait_woken(&wait, TASK_INTERRUPTIBLE,
>>> +						     timeout);
>>>
>>> -			down_read(&tty->termios_rwsem);
>>> -			continue;
>>> +				down_read(&tty->termios_rwsem);
>>> +				continue;
>>> +			}
>>>  		}
>>>
>>>  		if (ldata->icanon && !L_EXTPROC(tty)) {
>>>
> 
> Thanks.
> 
> Marc.
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()
  2015-12-11  0:07     ` Peter Hurley
@ 2015-12-11 13:37       ` Marc Aurele La France
  2015-12-11 13:56         ` Peter Hurley
  0 siblings, 1 reply; 14+ messages in thread
From: Marc Aurele La France @ 2015-12-11 13:37 UTC (permalink / raw)
  To: Peter Hurley
  Cc: Greg Kroah-Hartman, Jiri Slaby, linux-kernel, Volth, Damien Miller

On Thu, 10 Dec 2015, Peter Hurley wrote:
> On 12/10/2015 02:48 PM, Marc Aurele La France wrote:
>> On Thu, 10 Dec 2015, Peter Hurley wrote:
>>> On 12/09/2015 01:06 PM, Marc Aurele La France wrote:

>>>> After sshd has been SIGCHLD'ed about the shell's termination, it
>>>> continues to read the master pty until an error occurs.  This error
>>>> will be EIO if no process has the slave pty open.  Otherwise (for
>>>> example when the shell spawned long-running processes in the
>>>> background before terminating), that error is expected to be EAGAIN.
>>>> sshd cannot continue to read until an EIO in all cases, because doing
>>>> so causes the session to hang until all processes have closed the
>>>> slave pty, which is not the desired behaviour.  Thus a spurious EAGAIN
>>>> return causes sshd to lose data, whether or not the slave pty is
>>>> completely closed.

>>> Ah, the games userspace will be up to :)

>> Not really.

> Definitely.

> The idea that a read with O_NONBLOCK set should have synchronous behavior
> is ridiculous.

>> The fact different OSes behave differently in this regard can
>> hardly be said to be userland's fault.  The lower the number of distinct
>> behaviours userland needs to deal with, the better.  Furthermore, sshd
>> "knows" there should be data there, so it makes no sense to befuddle it
>> with false EAGAIN returns.

> But sshd doesn't "know". sshd "knows" the data has been sent and that's all.
> sshd is extrapolating from one known condition to another unknown condition,
> and assuming it "should" be that way because it has been.

> For example, try the same idea with real ttys on loopback. Wouldn't work,
> because it's asynchronous.

> The only reason this needs fixing is because it's a userspace regression.

It's the kernel that introduced this regression, not OpenSSH.

I am not asking to read data before it has been produced.  I am puzzled 
that despite knowing that the data exists, I can now be lied to when I 
try to retrieve it, when I wasn't before.  We are talking about what is 
essentially a two-way pipe, not some network or serial connection with 
transmission delays userland has long experience in dealing with.

These previously internal additional delays, that are now exposed to 
userland, are simply an implementation detail that userland did not, and 
should not, need to worry about.

> This is just one of those unfortunate situations where userspace has come
> to rely on an unspecified behavior because it worked.

Whether the behaviour is specified or not is irrelevent.  This simply 
means there is no standard to debunk the fact that the kernel's previous 
behaviour mimics that of other systems.

So, how am I supposed to avoid these spurious EAGAINs and finally be 
allowed to read the data I know exists?  How long do I have to wait?  Do I 
have to run a calibration loop to figure that out?  Why should I need to 
do that only on Linux?

I don't know, but there's nonsense in here somewhere.

Marc.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()
  2015-12-11 13:37       ` Marc Aurele La France
@ 2015-12-11 13:56         ` Peter Hurley
  2015-12-18 14:26           ` Marc Aurele La France
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Hurley @ 2015-12-11 13:56 UTC (permalink / raw)
  To: Marc Aurele La France
  Cc: Greg Kroah-Hartman, Jiri Slaby, linux-kernel, Volth, Damien Miller

On 12/11/2015 05:37 AM, Marc Aurele La France wrote:
> On Thu, 10 Dec 2015, Peter Hurley wrote:
>> On 12/10/2015 02:48 PM, Marc Aurele La France wrote:
>>> On Thu, 10 Dec 2015, Peter Hurley wrote:
>>>> On 12/09/2015 01:06 PM, Marc Aurele La France wrote:
> 
>>>>> After sshd has been SIGCHLD'ed about the shell's termination, it
>>>>> continues to read the master pty until an error occurs.  This error
>>>>> will be EIO if no process has the slave pty open.  Otherwise (for
>>>>> example when the shell spawned long-running processes in the
>>>>> background before terminating), that error is expected to be EAGAIN.
>>>>> sshd cannot continue to read until an EIO in all cases, because doing
>>>>> so causes the session to hang until all processes have closed the
>>>>> slave pty, which is not the desired behaviour.  Thus a spurious EAGAIN
>>>>> return causes sshd to lose data, whether or not the slave pty is
>>>>> completely closed.
> 
>>>> Ah, the games userspace will be up to :)
> 
>>> Not really.
> 
>> Definitely.
> 
>> The idea that a read with O_NONBLOCK set should have synchronous behavior
>> is ridiculous.
> 
>>> The fact different OSes behave differently in this regard can
>>> hardly be said to be userland's fault.  The lower the number of distinct
>>> behaviours userland needs to deal with, the better.  Furthermore, sshd
>>> "knows" there should be data there, so it makes no sense to befuddle it
>>> with false EAGAIN returns.
> 
>> But sshd doesn't "know". sshd "knows" the data has been sent and that's all.
>> sshd is extrapolating from one known condition to another unknown condition,
>> and assuming it "should" be that way because it has been.
> 
>> For example, try the same idea with real ttys on loopback. Wouldn't work,
>> because it's asynchronous.
> 
>> The only reason this needs fixing is because it's a userspace regression.

Misunderstanding.

"userspace regression" = kernel regression observable by userspace

> It's the kernel that introduced this regression, not OpenSSH.
> 
> I am not asking to read data before it has been produced.  I am puzzled that despite knowing that the data exists, I can now be lied to when I try to retrieve it, when I wasn't before.  We are talking about what is essentially a two-way pipe, not some network or serial connection with transmission delays userland has long experience in dealing with.
> 
> These previously internal additional delays, that are now exposed to userland, are simply an implementation detail that userland did not, and should not, need to worry about.

Your mental model is that pseudo-terminals are a synchronous pipe, which
is not true.

But this argument is pointless because the regression needs to be fixed
regardless of the merits.

Regards,
Peter Hurley

>> This is just one of those unfortunate situations where userspace has come
>> to rely on an unspecified behavior because it worked.
> 
> Whether the behaviour is specified or not is irrelevent.  This simply means there is no standard to debunk the fact that the kernel's previous behaviour mimics that of other systems.
> 
> So, how am I supposed to avoid these spurious EAGAINs and finally be allowed to read the data I know exists?  How long do I have to wait?  Do I have to run a calibration loop to figure that out?  Why should I need to do that only on Linux?
> 
> I don't know, but there's nonsense in here somewhere.
> 
> Marc.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()
  2015-12-11 13:56         ` Peter Hurley
@ 2015-12-18 14:26           ` Marc Aurele La France
  2015-12-18 16:39             ` Peter Hurley
  0 siblings, 1 reply; 14+ messages in thread
From: Marc Aurele La France @ 2015-12-18 14:26 UTC (permalink / raw)
  To: Peter Hurley
  Cc: Greg Kroah-Hartman, Jiri Slaby, linux-kernel, Volth, Damien Miller

On Fri, 11 Dec 2015, Peter Hurley wrote:
> On 12/11/2015 05:37 AM, Marc Aurele La France wrote:

>> I am not asking to read data before it has been produced.  I am puzzled
>> that despite knowing that the data exists, I can now be lied to when I
>> try to retrieve it, when I wasn't before.  We are talking about what is
>> essentially a two-way pipe, not some network or serial connection with
>> transmission delays userland has long experience in dealing with.

>> These previously internal additional delays, that are now exposed to
>> userland, are simply an implementation detail that userland did not,
>> and should not, need to worry about.
>
> Your mental model is that pseudo-terminals are a synchronous pipe, which
> is not true.
>
> But this argument is pointless because the regression needs to be fixed
> regardless of the merits.

Fair enough.

Anything new on this?

Thanks.

Marc.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()
  2015-12-18 14:26           ` Marc Aurele La France
@ 2015-12-18 16:39             ` Peter Hurley
  2015-12-18 17:23               ` Marc Aurele La France
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Hurley @ 2015-12-18 16:39 UTC (permalink / raw)
  To: Marc Aurele La France
  Cc: Greg Kroah-Hartman, Jiri Slaby, linux-kernel, Volth, Damien Miller

Hi Marc,

On 12/18/2015 06:26 AM, Marc Aurele La France wrote:
> On Fri, 11 Dec 2015, Peter Hurley wrote:
>> On 12/11/2015 05:37 AM, Marc Aurele La France wrote:
> 
>>> I am not asking to read data before it has been produced.  I am puzzled
>>> that despite knowing that the data exists, I can now be lied to when I
>>> try to retrieve it, when I wasn't before.  We are talking about what is
>>> essentially a two-way pipe, not some network or serial connection with
>>> transmission delays userland has long experience in dealing with.
> 
>>> These previously internal additional delays, that are now exposed to
>>> userland, are simply an implementation detail that userland did not,
>>> and should not, need to worry about.
>>
>> Your mental model is that pseudo-terminals are a synchronous pipe, which
>> is not true.
>>
>> But this argument is pointless because the regression needs to be fixed
>> regardless of the merits.
> 
> Fair enough.
> 
> Anything new on this?

It's on my todo list.

While considering this issue further, I was curious what ssh does
regarding the entire foreground process group and its output?

If ssh only knows that the child has terminated, how does it wait
for the rest of the foreground process group's output since those
processes may not yet have received their SIGHUP/SIGCONT signals
yet?

Regards,
Peter Hurley


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()
  2015-12-18 16:39             ` Peter Hurley
@ 2015-12-18 17:23               ` Marc Aurele La France
  2016-01-14 21:50                 ` Marc Aurele La France
  0 siblings, 1 reply; 14+ messages in thread
From: Marc Aurele La France @ 2015-12-18 17:23 UTC (permalink / raw)
  To: Peter Hurley
  Cc: Greg Kroah-Hartman, Jiri Slaby, linux-kernel, Volth, Damien Miller

On Fri, 18 Dec 2015, Peter Hurley wrote:
> On 12/18/2015 06:26 AM, Marc Aurele La France wrote:
>> On Fri, 11 Dec 2015, Peter Hurley wrote:
>>> On 12/11/2015 05:37 AM, Marc Aurele La France wrote:

>>>> I am not asking to read data before it has been produced.  I am puzzled
>>>> that despite knowing that the data exists, I can now be lied to when I
>>>> try to retrieve it, when I wasn't before.  We are talking about what is
>>>> essentially a two-way pipe, not some network or serial connection with
>>>> transmission delays userland has long experience in dealing with.

>>>> These previously internal additional delays, that are now exposed to
>>>> userland, are simply an implementation detail that userland did not,
>>>> and should not, need to worry about.

>>> Your mental model is that pseudo-terminals are a synchronous pipe, which
>>> is not true.

>>> But this argument is pointless because the regression needs to be fixed
>>> regardless of the merits.

>> Fair enough.

>> Anything new on this?

> It's on my todo list.

> While considering this issue further, I was curious what ssh does
> regarding the entire foreground process group and its output?

> If ssh only knows that the child has terminated, how does it wait
> for the rest of the foreground process group's output since those
> processes may not yet have received their SIGHUP/SIGCONT signals
> yet?

sshd cannot know about the termination of any process other than the
session leader because any of the session leader's children are
re-parented to init.  The idea is to, at minimum, collect any output the
session leader might have left behind.  Yes, this could entail also
collecting output from its children that might have squeaked in, but
that's gravy that can't be avoided.

This situation is much simpler on the *BSDs.  There, both ends of the
pty pair are, in effect, completely closed after disassociation of either
end, preventing (with EIO) any further output (but still allowing data
already collected to be read, after which an EIO occurs).  It's
unfortunate System V variants don't do this, but that's crying over spilt
milk.

Marc.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()
  2015-12-18 17:23               ` Marc Aurele La France
@ 2016-01-14 21:50                 ` Marc Aurele La France
  0 siblings, 0 replies; 14+ messages in thread
From: Marc Aurele La France @ 2016-01-14 21:50 UTC (permalink / raw)
  To: Peter Hurley
  Cc: Greg Kroah-Hartman, Jiri Slaby, linux-kernel, Volth, Damien Miller

On Fri, 18 Dec 2015, Marc Aurele La France wrote:
> On Fri, 18 Dec 2015, Peter Hurley wrote:
>> On 12/18/2015 06:26 AM, Marc Aurele La France wrote:
>>> On Fri, 11 Dec 2015, Peter Hurley wrote:
>>>> On 12/11/2015 05:37 AM, Marc Aurele La France wrote:

>>>>> I am not asking to read data before it has been produced.  I am puzzled
>>>>> that despite knowing that the data exists, I can now be lied to when I
>>>>> try to retrieve it, when I wasn't before.  We are talking about what is
>>>>> essentially a two-way pipe, not some network or serial connection with
>>>>> transmission delays userland has long experience in dealing with.

>>>>> These previously internal additional delays, that are now exposed to
>>>>> userland, are simply an implementation detail that userland did not,
>>>>> and should not, need to worry about.

>>>> Your mental model is that pseudo-terminals are a synchronous pipe, which
>>>> is not true.

>>>> But this argument is pointless because the regression needs to be fixed
>>>> regardless of the merits.

>>> Fair enough.

>>> Anything new on this?

>> It's on my todo list.

>> While considering this issue further, I was curious what ssh does
>> regarding the entire foreground process group and its output?

>> If ssh only knows that the child has terminated, how does it wait
>> for the rest of the foreground process group's output since those
>> processes may not yet have received their SIGHUP/SIGCONT signals
>> yet?

> sshd cannot know about the termination of any process other than the
> session leader because any of the session leader's children are
> re-parented to init.  The idea is to, at minimum, collect any output the
> session leader might have left behind.  Yes, this could entail also
> collecting output from its children that might have squeaked in, but
> that's gravy that can't be avoided.

> This situation is much simpler on the *BSDs.  There, both ends of the
> pty pair are, in effect, completely closed after disassociation of either
> end, preventing (with EIO) any further output (but still allowing data
> already collected to be read, after which an EIO occurs).  It's
> unfortunate System V variants don't do this, but that's crying over spilt
> milk.

Anything more on this?

Thanks.

Marc.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()
  2016-02-29  3:56 Brian Bloniarz
@ 2016-03-01  4:30 ` Peter Hurley
  0 siblings, 0 replies; 14+ messages in thread
From: Peter Hurley @ 2016-03-01  4:30 UTC (permalink / raw)
  To: Brian Bloniarz; +Cc: linux-kernel, tsi

Hi Brian,

On 02/28/2016 07:56 PM, Brian Bloniarz wrote:
> (Take 3, fix compile error in n_hdlc.c)
> 
> Hi Peter, I saw Marc Aurele La France's proposed patch to n_tty to fix
> OpenSSH, and your feedback. Patch below is an attempt to address that
> feedback. Please let me know if this is the change you envisioned;
> (see Marc's excellent original writeup for details on the issue).
> 
> [PATCH] n_tty: wait for buffer work in read() and poll().
> 
> Undoes the following four changes:
> 
> 1) f95499c3030fe1bfad57745f2db1959c5b43dca8
>     n_tty: Don't wait for buffer work in read() loop
> 
> 2) f8747d4a466ab2cafe56112c51b3379f9fdb7a12
>     tty: Fix pty master read() after slave closes
> 
> 3) 52bce7f8d4fc633c9a9d0646eef58ba6ae9a3b73
>     pty, n_tty: Simplify input processing on final close
> 
> 4) 1a48632ffed61352a7810ce089dc5a8bcd505a60
>     pty: Fix input race when closing
> 
> These changes caused a regression in OpenSSH, as it assumes that the
> first read() to return EAGAIN after a SIGCHLD means that all the child's
> output has been returned.


This is not descriptive enough of the requirements of OpenSSH.
For example, it fails to note that the attempted read() is non-blocking
for which EAGAIN is an expected result that doesn't not mean
the slave-side has been closed.

Something like:

Fix OpenSSH pty regression on close

OpenSSH expects the (non-blocking) read() of pty master to return
EAGAIN only if it has received all of the slave-side output after
it has received SIGCHLD. This used to work on pre-3.12 kernels.

This fix effectively forces non-blocking read() and poll() to
block for parallel i/o to complete for all ttys.

...


> Inspired by analysis and patch from Marc Aurele La France <tsi@tuyoix.net>
> 
> Reported-by: Volth <openssh@volth.com>
> Reported-by: Marc Aurele La France <tsi@tuyoix.net>
> BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=52
> BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=2492
> Signed-off-by: Brian Bloniarz <brian.bloniarz@gmail.com>
> ---
>  Documentation/serial/tty.txt |  3 ---
>  drivers/tty/n_hdlc.c         |  4 ++--
>  drivers/tty/n_tty.c          | 34 +++++++++++++++-------------------
>  drivers/tty/pty.c            |  4 +---
>  drivers/tty/tty_buffer.c     | 29 +----------------------------
>  include/linux/tty.h          |  1 -
>  6 files changed, 19 insertions(+), 56 deletions(-)
> 
> diff --git a/Documentation/serial/tty.txt b/Documentation/serial/tty.txt
> index bc3842d..e2dea3d 100644
> --- a/Documentation/serial/tty.txt
> +++ b/Documentation/serial/tty.txt
> @@ -213,9 +213,6 @@ TTY_IO_ERROR		If set, causes all subsequent userspace read/write
>  
>  TTY_OTHER_CLOSED	Device is a pty and the other side has closed.
>  
> -TTY_OTHER_DONE		Device is a pty and the other side has closed and
> -			all pending input processing has been completed.
> -
>  TTY_NO_WRITE_SPLIT	Prevent driver from splitting up writes into
>  			smaller chunks.
>  
> diff --git a/drivers/tty/n_hdlc.c b/drivers/tty/n_hdlc.c
> index bbc4ce6..644ddb8 100644
> --- a/drivers/tty/n_hdlc.c
> +++ b/drivers/tty/n_hdlc.c
> @@ -600,7 +600,7 @@ static ssize_t n_hdlc_tty_read(struct tty_struct *tty, struct file *file,
>  	add_wait_queue(&tty->read_wait, &wait);
>  
>  	for (;;) {
> -		if (test_bit(TTY_OTHER_DONE, &tty->flags)) {
> +		if (test_bit(TTY_OTHER_CLOSED, &tty->flags)) {
>  			ret = -EIO;
>  			break;
>  		}
> @@ -828,7 +828,7 @@ static unsigned int n_hdlc_tty_poll(struct tty_struct *tty, struct file *filp,
>  		/* set bits for operations that won't block */
>  		if (n_hdlc->rx_buf_list.head)
>  			mask |= POLLIN | POLLRDNORM;	/* readable */
> -		if (test_bit(TTY_OTHER_DONE, &tty->flags))
> +		if (test_bit(TTY_OTHER_CLOSED, &tty->flags))
>  			mask |= POLLHUP;
>  		if (tty_hung_up_p(filp))
>  			mask |= POLLHUP;
> diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
> index b280abaa..fc04011 100644
> --- a/drivers/tty/n_tty.c
> +++ b/drivers/tty/n_tty.c
> @@ -1952,10 +1952,20 @@ err:
>  	return -ENOMEM;
>  }
>  
> +/**
> + *	Synchronously pushes the terminal flip buffers to the line discipline
> + *	and checks for available data.

No, see below.

> + *
> + *	Must not be called from IRQ context.
> + */
>  static inline int input_available_p(struct tty_struct *tty, int poll)
>  {
>  	struct n_tty_data *ldata = tty->disc_data;
> -	int amt = poll && !TIME_CHAR(tty) && MIN_CHAR(tty) ? MIN_CHAR(tty) : 1;
> +	int amt;
> +
> +	flush_work(&tty->port->buf.work);

This is not necessary and can deadlock.

The wait for buffer work to finish is only necessary input_available_p()
returns false. Then in n_tty_read() the termios lock needs to be dropped
before waiting for buffer work and rechecking input_available_p().

Please test your patches with lockdep enabled.

Also, abstract the naked flush_work(), similar to tty_buffer_cancel_work().


> +
> +	amt = poll && !TIME_CHAR(tty) && MIN_CHAR(tty) ? MIN_CHAR(tty) : 1;
>  
>  	if (ldata->icanon && !L_EXTPROC(tty))
>  		return ldata->canon_head != ldata->read_tail;
> @@ -1963,18 +1973,6 @@ static inline int input_available_p(struct tty_struct *tty, int poll)
>  		return ldata->commit_head - ldata->read_tail >= amt;
>  }
>  
> -static inline int check_other_done(struct tty_struct *tty)
> -{
> -	int done = test_bit(TTY_OTHER_DONE, &tty->flags);
> -	if (done) {
> -		/* paired with cmpxchg() in check_other_closed(); ensures
> -		 * read buffer head index is not stale
> -		 */
> -		smp_mb__after_atomic();
> -	}
> -	return done;
> -}
> -
>  /**
>   *	copy_from_read_buf	-	copy read data directly
>   *	@tty: terminal device
> @@ -2170,7 +2168,7 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
>  	struct n_tty_data *ldata = tty->disc_data;
>  	unsigned char __user *b = buf;
>  	DEFINE_WAIT_FUNC(wait, woken_wake_function);
> -	int c, done;
> +	int c;
>  	int minimum, time;
>  	ssize_t retval = 0;
>  	long timeout;
> @@ -2238,10 +2236,8 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
>  		    ((minimum - (b - buf)) >= 1))
>  			ldata->minimum_to_wake = (minimum - (b - buf));
>  
> -		done = check_other_done(tty);
> -
>  		if (!input_available_p(tty, 0)) {
> -			if (done) {
> +			if (test_bit(TTY_OTHER_CLOSED, &tty->flags)) {
>  				retval = -EIO;
>  				break;
>  			}
> @@ -2445,12 +2441,12 @@ static unsigned int n_tty_poll(struct tty_struct *tty, struct file *file,
>  
>  	poll_wait(file, &tty->read_wait, wait);
>  	poll_wait(file, &tty->write_wait, wait);
> -	if (check_other_done(tty))
> -		mask |= POLLHUP;
>  	if (input_available_p(tty, 1))
>  		mask |= POLLIN | POLLRDNORM;
>  	if (tty->packet && tty->link->ctrl_status)
>  		mask |= POLLPRI | POLLIN | POLLRDNORM;
> +	if (test_bit(TTY_OTHER_CLOSED, &tty->flags))
> +		mask |= POLLHUP;
>  	if (tty_hung_up_p(file))
>  		mask |= POLLHUP;
>  	if (!(mask & (POLLHUP | POLLIN | POLLRDNORM))) {
> diff --git a/drivers/tty/pty.c b/drivers/tty/pty.c
> index 2348fa6..6427a39 100644
> --- a/drivers/tty/pty.c
> +++ b/drivers/tty/pty.c
> @@ -59,7 +59,7 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
>  	if (!tty->link)
>  		return;
>  	set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
> -	tty_flip_buffer_push(tty->link->port);
> +	wake_up_interruptible(&tty->link->read_wait);
>  	wake_up_interruptible(&tty->link->write_wait);
>  	if (tty->driver->subtype == PTY_TYPE_MASTER) {
>  		set_bit(TTY_OTHER_CLOSED, &tty->flags);
> @@ -247,9 +247,7 @@ static int pty_open(struct tty_struct *tty, struct file *filp)
>  		goto out;
>  
>  	clear_bit(TTY_IO_ERROR, &tty->flags);
> -	/* TTY_OTHER_CLOSED must be cleared before TTY_OTHER_DONE */
>  	clear_bit(TTY_OTHER_CLOSED, &tty->link->flags);
> -	clear_bit(TTY_OTHER_DONE, &tty->link->flags);
>  	set_bit(TTY_THROTTLED, &tty->flags);
>  	return 0;
>  
> diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
> index 3cd31e0..3c0e914 100644
> --- a/drivers/tty/tty_buffer.c
> +++ b/drivers/tty/tty_buffer.c
> @@ -37,29 +37,6 @@
>  
>  #define TTY_BUFFER_PAGE	(((PAGE_SIZE - sizeof(struct tty_buffer)) / 2) & ~0xFF)
>  
> -/*
> - * If all tty flip buffers have been processed by flush_to_ldisc() or
> - * dropped by tty_buffer_flush(), check if the linked pty has been closed.
> - * If so, wake the reader/poll to process
> - */
> -static inline void check_other_closed(struct tty_struct *tty)
> -{
> -	unsigned long flags, old;
> -
> -	/* transition from TTY_OTHER_CLOSED => TTY_OTHER_DONE must be atomic */
> -	for (flags = ACCESS_ONCE(tty->flags);
> -	     test_bit(TTY_OTHER_CLOSED, &flags);
> -	     ) {
> -		old = flags;
> -		__set_bit(TTY_OTHER_DONE, &flags);
> -		flags = cmpxchg(&tty->flags, old, flags);
> -		if (old == flags) {
> -			wake_up_interruptible(&tty->read_wait);
> -			break;
> -		}
> -	}
> -}
> -
>  /**
>   *	tty_buffer_lock_exclusive	-	gain exclusive access to buffer
>   *	tty_buffer_unlock_exclusive	-	release exclusive access
> @@ -254,8 +231,6 @@ void tty_buffer_flush(struct tty_struct *tty, struct tty_ldisc *ld)
>  	if (ld && ld->ops->flush_buffer)
>  		ld->ops->flush_buffer(tty);
>  
> -	check_other_closed(tty);
> -
>  	atomic_dec(&buf->priority);
>  	mutex_unlock(&buf->lock);
>  }
> @@ -505,10 +480,8 @@ static void flush_to_ldisc(struct work_struct *work)
>  		 */
>  		count = smp_load_acquire(&head->commit) - head->read;
>  		if (!count) {
> -			if (next == NULL) {
> -				check_other_closed(tty);
> +			if (next == NULL)
>  				break;
> -			}
>  			buf->head = next;
>  			tty_buffer_free(port, head);
>  			continue;
> diff --git a/include/linux/tty.h b/include/linux/tty.h
> index d9fb4b0..af18365 100644
> --- a/include/linux/tty.h
> +++ b/include/linux/tty.h
> @@ -338,7 +338,6 @@ struct tty_file_private {
>  #define TTY_EXCLUSIVE 		3	/* Exclusive open mode */
>  #define TTY_DEBUG 		4	/* Debugging */
>  #define TTY_DO_WRITE_WAKEUP 	5	/* Call write_wakeup after queuing new */
> -#define TTY_OTHER_DONE		6	/* Closed pty has completed input processing */
>  #define TTY_LDISC_OPEN	 	11	/* Line discipline is open */
>  #define TTY_PTY_LOCK 		16	/* pty private */
>  #define TTY_NO_WRITE_SPLIT 	17	/* Preserve write boundaries to driver */
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()
@ 2016-02-29  3:56 Brian Bloniarz
  2016-03-01  4:30 ` Peter Hurley
  0 siblings, 1 reply; 14+ messages in thread
From: Brian Bloniarz @ 2016-02-29  3:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: peter, tsi

(Take 3, fix compile error in n_hdlc.c)

Hi Peter, I saw Marc Aurele La France's proposed patch to n_tty to fix
OpenSSH, and your feedback. Patch below is an attempt to address that
feedback. Please let me know if this is the change you envisioned;
(see Marc's excellent original writeup for details on the issue).

[PATCH] n_tty: wait for buffer work in read() and poll().

Undoes the following four changes:

1) f95499c3030fe1bfad57745f2db1959c5b43dca8
    n_tty: Don't wait for buffer work in read() loop

2) f8747d4a466ab2cafe56112c51b3379f9fdb7a12
    tty: Fix pty master read() after slave closes

3) 52bce7f8d4fc633c9a9d0646eef58ba6ae9a3b73
    pty, n_tty: Simplify input processing on final close

4) 1a48632ffed61352a7810ce089dc5a8bcd505a60
    pty: Fix input race when closing

These changes caused a regression in OpenSSH, as it assumes that the
first read() to return EAGAIN after a SIGCHLD means that all the child's
output has been returned.

Inspired by analysis and patch from Marc Aurele La France <tsi@tuyoix.net>

Reported-by: Volth <openssh@volth.com>
Reported-by: Marc Aurele La France <tsi@tuyoix.net>
BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=52
BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=2492
Signed-off-by: Brian Bloniarz <brian.bloniarz@gmail.com>
---
 Documentation/serial/tty.txt |  3 ---
 drivers/tty/n_hdlc.c         |  4 ++--
 drivers/tty/n_tty.c          | 34 +++++++++++++++-------------------
 drivers/tty/pty.c            |  4 +---
 drivers/tty/tty_buffer.c     | 29 +----------------------------
 include/linux/tty.h          |  1 -
 6 files changed, 19 insertions(+), 56 deletions(-)

diff --git a/Documentation/serial/tty.txt b/Documentation/serial/tty.txt
index bc3842d..e2dea3d 100644
--- a/Documentation/serial/tty.txt
+++ b/Documentation/serial/tty.txt
@@ -213,9 +213,6 @@ TTY_IO_ERROR		If set, causes all subsequent userspace read/write
 
 TTY_OTHER_CLOSED	Device is a pty and the other side has closed.
 
-TTY_OTHER_DONE		Device is a pty and the other side has closed and
-			all pending input processing has been completed.
-
 TTY_NO_WRITE_SPLIT	Prevent driver from splitting up writes into
 			smaller chunks.
 
diff --git a/drivers/tty/n_hdlc.c b/drivers/tty/n_hdlc.c
index bbc4ce6..644ddb8 100644
--- a/drivers/tty/n_hdlc.c
+++ b/drivers/tty/n_hdlc.c
@@ -600,7 +600,7 @@ static ssize_t n_hdlc_tty_read(struct tty_struct *tty, struct file *file,
 	add_wait_queue(&tty->read_wait, &wait);
 
 	for (;;) {
-		if (test_bit(TTY_OTHER_DONE, &tty->flags)) {
+		if (test_bit(TTY_OTHER_CLOSED, &tty->flags)) {
 			ret = -EIO;
 			break;
 		}
@@ -828,7 +828,7 @@ static unsigned int n_hdlc_tty_poll(struct tty_struct *tty, struct file *filp,
 		/* set bits for operations that won't block */
 		if (n_hdlc->rx_buf_list.head)
 			mask |= POLLIN | POLLRDNORM;	/* readable */
-		if (test_bit(TTY_OTHER_DONE, &tty->flags))
+		if (test_bit(TTY_OTHER_CLOSED, &tty->flags))
 			mask |= POLLHUP;
 		if (tty_hung_up_p(filp))
 			mask |= POLLHUP;
diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index b280abaa..fc04011 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -1952,10 +1952,20 @@ err:
 	return -ENOMEM;
 }
 
+/**
+ *	Synchronously pushes the terminal flip buffers to the line discipline
+ *	and checks for available data.
+ *
+ *	Must not be called from IRQ context.
+ */
 static inline int input_available_p(struct tty_struct *tty, int poll)
 {
 	struct n_tty_data *ldata = tty->disc_data;
-	int amt = poll && !TIME_CHAR(tty) && MIN_CHAR(tty) ? MIN_CHAR(tty) : 1;
+	int amt;
+
+	flush_work(&tty->port->buf.work);
+
+	amt = poll && !TIME_CHAR(tty) && MIN_CHAR(tty) ? MIN_CHAR(tty) : 1;
 
 	if (ldata->icanon && !L_EXTPROC(tty))
 		return ldata->canon_head != ldata->read_tail;
@@ -1963,18 +1973,6 @@ static inline int input_available_p(struct tty_struct *tty, int poll)
 		return ldata->commit_head - ldata->read_tail >= amt;
 }
 
-static inline int check_other_done(struct tty_struct *tty)
-{
-	int done = test_bit(TTY_OTHER_DONE, &tty->flags);
-	if (done) {
-		/* paired with cmpxchg() in check_other_closed(); ensures
-		 * read buffer head index is not stale
-		 */
-		smp_mb__after_atomic();
-	}
-	return done;
-}
-
 /**
  *	copy_from_read_buf	-	copy read data directly
  *	@tty: terminal device
@@ -2170,7 +2168,7 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
 	struct n_tty_data *ldata = tty->disc_data;
 	unsigned char __user *b = buf;
 	DEFINE_WAIT_FUNC(wait, woken_wake_function);
-	int c, done;
+	int c;
 	int minimum, time;
 	ssize_t retval = 0;
 	long timeout;
@@ -2238,10 +2236,8 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
 		    ((minimum - (b - buf)) >= 1))
 			ldata->minimum_to_wake = (minimum - (b - buf));
 
-		done = check_other_done(tty);
-
 		if (!input_available_p(tty, 0)) {
-			if (done) {
+			if (test_bit(TTY_OTHER_CLOSED, &tty->flags)) {
 				retval = -EIO;
 				break;
 			}
@@ -2445,12 +2441,12 @@ static unsigned int n_tty_poll(struct tty_struct *tty, struct file *file,
 
 	poll_wait(file, &tty->read_wait, wait);
 	poll_wait(file, &tty->write_wait, wait);
-	if (check_other_done(tty))
-		mask |= POLLHUP;
 	if (input_available_p(tty, 1))
 		mask |= POLLIN | POLLRDNORM;
 	if (tty->packet && tty->link->ctrl_status)
 		mask |= POLLPRI | POLLIN | POLLRDNORM;
+	if (test_bit(TTY_OTHER_CLOSED, &tty->flags))
+		mask |= POLLHUP;
 	if (tty_hung_up_p(file))
 		mask |= POLLHUP;
 	if (!(mask & (POLLHUP | POLLIN | POLLRDNORM))) {
diff --git a/drivers/tty/pty.c b/drivers/tty/pty.c
index 2348fa6..6427a39 100644
--- a/drivers/tty/pty.c
+++ b/drivers/tty/pty.c
@@ -59,7 +59,7 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
 	if (!tty->link)
 		return;
 	set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
-	tty_flip_buffer_push(tty->link->port);
+	wake_up_interruptible(&tty->link->read_wait);
 	wake_up_interruptible(&tty->link->write_wait);
 	if (tty->driver->subtype == PTY_TYPE_MASTER) {
 		set_bit(TTY_OTHER_CLOSED, &tty->flags);
@@ -247,9 +247,7 @@ static int pty_open(struct tty_struct *tty, struct file *filp)
 		goto out;
 
 	clear_bit(TTY_IO_ERROR, &tty->flags);
-	/* TTY_OTHER_CLOSED must be cleared before TTY_OTHER_DONE */
 	clear_bit(TTY_OTHER_CLOSED, &tty->link->flags);
-	clear_bit(TTY_OTHER_DONE, &tty->link->flags);
 	set_bit(TTY_THROTTLED, &tty->flags);
 	return 0;
 
diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
index 3cd31e0..3c0e914 100644
--- a/drivers/tty/tty_buffer.c
+++ b/drivers/tty/tty_buffer.c
@@ -37,29 +37,6 @@
 
 #define TTY_BUFFER_PAGE	(((PAGE_SIZE - sizeof(struct tty_buffer)) / 2) & ~0xFF)
 
-/*
- * If all tty flip buffers have been processed by flush_to_ldisc() or
- * dropped by tty_buffer_flush(), check if the linked pty has been closed.
- * If so, wake the reader/poll to process
- */
-static inline void check_other_closed(struct tty_struct *tty)
-{
-	unsigned long flags, old;
-
-	/* transition from TTY_OTHER_CLOSED => TTY_OTHER_DONE must be atomic */
-	for (flags = ACCESS_ONCE(tty->flags);
-	     test_bit(TTY_OTHER_CLOSED, &flags);
-	     ) {
-		old = flags;
-		__set_bit(TTY_OTHER_DONE, &flags);
-		flags = cmpxchg(&tty->flags, old, flags);
-		if (old == flags) {
-			wake_up_interruptible(&tty->read_wait);
-			break;
-		}
-	}
-}
-
 /**
  *	tty_buffer_lock_exclusive	-	gain exclusive access to buffer
  *	tty_buffer_unlock_exclusive	-	release exclusive access
@@ -254,8 +231,6 @@ void tty_buffer_flush(struct tty_struct *tty, struct tty_ldisc *ld)
 	if (ld && ld->ops->flush_buffer)
 		ld->ops->flush_buffer(tty);
 
-	check_other_closed(tty);
-
 	atomic_dec(&buf->priority);
 	mutex_unlock(&buf->lock);
 }
@@ -505,10 +480,8 @@ static void flush_to_ldisc(struct work_struct *work)
 		 */
 		count = smp_load_acquire(&head->commit) - head->read;
 		if (!count) {
-			if (next == NULL) {
-				check_other_closed(tty);
+			if (next == NULL)
 				break;
-			}
 			buf->head = next;
 			tty_buffer_free(port, head);
 			continue;
diff --git a/include/linux/tty.h b/include/linux/tty.h
index d9fb4b0..af18365 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -338,7 +338,6 @@ struct tty_file_private {
 #define TTY_EXCLUSIVE 		3	/* Exclusive open mode */
 #define TTY_DEBUG 		4	/* Debugging */
 #define TTY_DO_WRITE_WAKEUP 	5	/* Call write_wakeup after queuing new */
-#define TTY_OTHER_DONE		6	/* Closed pty has completed input processing */
 #define TTY_LDISC_OPEN	 	11	/* Line discipline is open */
 #define TTY_PTY_LOCK 		16	/* pty private */
 #define TTY_NO_WRITE_SPLIT 	17	/* Preserve write boundaries to driver */

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: n_tty:  Check the other end of pty pair before returning EAGAIN on a read()
@ 2016-02-28 23:02 Brian Bloniarz
  0 siblings, 0 replies; 14+ messages in thread
From: Brian Bloniarz @ 2016-02-28 23:02 UTC (permalink / raw)
  To: linux-kernel; +Cc: peter, tsi

(Take 2, email client woes)

Peter, I saw Marc Aurele La France's proposed patch to n_tty to fix
OpenSSH, and your feedback. Patch below is an attempt to address that
feedback. Please let me know if this is the change you envisioned.
See Marc's excellent original writeup for details on the issue.

[PATCH] n_tty: wait for buffer work in read() and poll().

Undoes the following four changes:

1) f95499c3030fe1bfad57745f2db1959c5b43dca8
    n_tty: Don't wait for buffer work in read() loop

2) f8747d4a466ab2cafe56112c51b3379f9fdb7a12
    tty: Fix pty master read() after slave closes

3) 52bce7f8d4fc633c9a9d0646eef58ba6ae9a3b73
    pty, n_tty: Simplify input processing on final close

4) 1a48632ffed61352a7810ce089dc5a8bcd505a60
    pty: Fix input race when closing

These changes caused a regression in OpenSSH, as it assumes that the
first read() to return EAGAIN after a SIGCHLD means that all the child's
output has been returned.

Inspired by analysis and patch from Marc Aurele La France <tsi@tuyoix.net>

Reported-by: Volth <openssh@volth.com>
Reported-by: Marc Aurele La France <tsi@tuyoix.net>
BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=52
BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=2492
Signed-off-by: Brian Bloniarz <brian.bloniarz@gmail.com>
---
 Documentation/serial/tty.txt |  3 ---
 drivers/tty/n_hdlc.c         |  2 +-
 drivers/tty/n_tty.c          | 34 +++++++++++++++-------------------
 drivers/tty/pty.c            |  4 +---
 drivers/tty/tty_buffer.c     | 29 +----------------------------
 include/linux/tty.h          |  1 -
 6 files changed, 18 insertions(+), 55 deletions(-)

diff --git a/Documentation/serial/tty.txt b/Documentation/serial/tty.txt
index bc3842d..e2dea3d 100644
--- a/Documentation/serial/tty.txt
+++ b/Documentation/serial/tty.txt
@@ -213,9 +213,6 @@ TTY_IO_ERROR		If set, causes all subsequent userspace read/write
 
 TTY_OTHER_CLOSED	Device is a pty and the other side has closed.
 
-TTY_OTHER_DONE		Device is a pty and the other side has closed and
-			all pending input processing has been completed.
-
 TTY_NO_WRITE_SPLIT	Prevent driver from splitting up writes into
 			smaller chunks.
 
diff --git a/drivers/tty/n_hdlc.c b/drivers/tty/n_hdlc.c
index bbc4ce6..c35abef 100644
--- a/drivers/tty/n_hdlc.c
+++ b/drivers/tty/n_hdlc.c
@@ -600,7 +600,7 @@ static ssize_t n_hdlc_tty_read(struct tty_struct *tty, struct file *file,
 	add_wait_queue(&tty->read_wait, &wait);
 
 	for (;;) {
-		if (test_bit(TTY_OTHER_DONE, &tty->flags)) {
+		if (test_bit(TTY_OTHER_CLOSED, &tty->flags)) {
 			ret = -EIO;
 			break;
 		}
diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index b280abaa..fc04011 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -1952,10 +1952,20 @@ err:
 	return -ENOMEM;
 }
 
+/**
+ *	Synchronously pushes the terminal flip buffers to the line discipline
+ *	and checks for available data.
+ *
+ *	Must not be called from IRQ context.
+ */
 static inline int input_available_p(struct tty_struct *tty, int poll)
 {
 	struct n_tty_data *ldata = tty->disc_data;
-	int amt = poll && !TIME_CHAR(tty) && MIN_CHAR(tty) ? MIN_CHAR(tty) : 1;
+	int amt;
+
+	flush_work(&tty->port->buf.work);
+
+	amt = poll && !TIME_CHAR(tty) && MIN_CHAR(tty) ? MIN_CHAR(tty) : 1;
 
 	if (ldata->icanon && !L_EXTPROC(tty))
 		return ldata->canon_head != ldata->read_tail;
@@ -1963,18 +1973,6 @@ static inline int input_available_p(struct tty_struct *tty, int poll)
 		return ldata->commit_head - ldata->read_tail >= amt;
 }
 
-static inline int check_other_done(struct tty_struct *tty)
-{
-	int done = test_bit(TTY_OTHER_DONE, &tty->flags);
-	if (done) {
-		/* paired with cmpxchg() in check_other_closed(); ensures
-		 * read buffer head index is not stale
-		 */
-		smp_mb__after_atomic();
-	}
-	return done;
-}
-
 /**
  *	copy_from_read_buf	-	copy read data directly
  *	@tty: terminal device
@@ -2170,7 +2168,7 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
 	struct n_tty_data *ldata = tty->disc_data;
 	unsigned char __user *b = buf;
 	DEFINE_WAIT_FUNC(wait, woken_wake_function);
-	int c, done;
+	int c;
 	int minimum, time;
 	ssize_t retval = 0;
 	long timeout;
@@ -2238,10 +2236,8 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
 		    ((minimum - (b - buf)) >= 1))
 			ldata->minimum_to_wake = (minimum - (b - buf));
 
-		done = check_other_done(tty);
-
 		if (!input_available_p(tty, 0)) {
-			if (done) {
+			if (test_bit(TTY_OTHER_CLOSED, &tty->flags)) {
 				retval = -EIO;
 				break;
 			}
@@ -2445,12 +2441,12 @@ static unsigned int n_tty_poll(struct tty_struct *tty, struct file *file,
 
 	poll_wait(file, &tty->read_wait, wait);
 	poll_wait(file, &tty->write_wait, wait);
-	if (check_other_done(tty))
-		mask |= POLLHUP;
 	if (input_available_p(tty, 1))
 		mask |= POLLIN | POLLRDNORM;
 	if (tty->packet && tty->link->ctrl_status)
 		mask |= POLLPRI | POLLIN | POLLRDNORM;
+	if (test_bit(TTY_OTHER_CLOSED, &tty->flags))
+		mask |= POLLHUP;
 	if (tty_hung_up_p(file))
 		mask |= POLLHUP;
 	if (!(mask & (POLLHUP | POLLIN | POLLRDNORM))) {
diff --git a/drivers/tty/pty.c b/drivers/tty/pty.c
index 2348fa6..6427a39 100644
--- a/drivers/tty/pty.c
+++ b/drivers/tty/pty.c
@@ -59,7 +59,7 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
 	if (!tty->link)
 		return;
 	set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
-	tty_flip_buffer_push(tty->link->port);
+	wake_up_interruptible(&tty->link->read_wait);
 	wake_up_interruptible(&tty->link->write_wait);
 	if (tty->driver->subtype == PTY_TYPE_MASTER) {
 		set_bit(TTY_OTHER_CLOSED, &tty->flags);
@@ -247,9 +247,7 @@ static int pty_open(struct tty_struct *tty, struct file *filp)
 		goto out;
 
 	clear_bit(TTY_IO_ERROR, &tty->flags);
-	/* TTY_OTHER_CLOSED must be cleared before TTY_OTHER_DONE */
 	clear_bit(TTY_OTHER_CLOSED, &tty->link->flags);
-	clear_bit(TTY_OTHER_DONE, &tty->link->flags);
 	set_bit(TTY_THROTTLED, &tty->flags);
 	return 0;
 
diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
index 3cd31e0..3c0e914 100644
--- a/drivers/tty/tty_buffer.c
+++ b/drivers/tty/tty_buffer.c
@@ -37,29 +37,6 @@
 
 #define TTY_BUFFER_PAGE	(((PAGE_SIZE - sizeof(struct tty_buffer)) / 2) & ~0xFF)
 
-/*
- * If all tty flip buffers have been processed by flush_to_ldisc() or
- * dropped by tty_buffer_flush(), check if the linked pty has been closed.
- * If so, wake the reader/poll to process
- */
-static inline void check_other_closed(struct tty_struct *tty)
-{
-	unsigned long flags, old;
-
-	/* transition from TTY_OTHER_CLOSED => TTY_OTHER_DONE must be atomic */
-	for (flags = ACCESS_ONCE(tty->flags);
-	     test_bit(TTY_OTHER_CLOSED, &flags);
-	     ) {
-		old = flags;
-		__set_bit(TTY_OTHER_DONE, &flags);
-		flags = cmpxchg(&tty->flags, old, flags);
-		if (old == flags) {
-			wake_up_interruptible(&tty->read_wait);
-			break;
-		}
-	}
-}
-
 /**
  *	tty_buffer_lock_exclusive	-	gain exclusive access to buffer
  *	tty_buffer_unlock_exclusive	-	release exclusive access
@@ -254,8 +231,6 @@ void tty_buffer_flush(struct tty_struct *tty, struct tty_ldisc *ld)
 	if (ld && ld->ops->flush_buffer)
 		ld->ops->flush_buffer(tty);
 
-	check_other_closed(tty);
-
 	atomic_dec(&buf->priority);
 	mutex_unlock(&buf->lock);
 }
@@ -505,10 +480,8 @@ static void flush_to_ldisc(struct work_struct *work)
 		 */
 		count = smp_load_acquire(&head->commit) - head->read;
 		if (!count) {
-			if (next == NULL) {
-				check_other_closed(tty);
+			if (next == NULL)
 				break;
-			}
 			buf->head = next;
 			tty_buffer_free(port, head);
 			continue;
diff --git a/include/linux/tty.h b/include/linux/tty.h
index d9fb4b0..af18365 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -338,7 +338,6 @@ struct tty_file_private {
 #define TTY_EXCLUSIVE 		3	/* Exclusive open mode */
 #define TTY_DEBUG 		4	/* Debugging */
 #define TTY_DO_WRITE_WAKEUP 	5	/* Call write_wakeup after queuing new */
-#define TTY_OTHER_DONE		6	/* Closed pty has completed input processing */
 #define TTY_LDISC_OPEN	 	11	/* Line discipline is open */
 #define TTY_PTY_LOCK 		16	/* pty private */
 #define TTY_NO_WRITE_SPLIT 	17	/* Preserve write boundaries to driver */

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()
@ 2016-02-28 22:53 Brian Bloniarz
  0 siblings, 0 replies; 14+ messages in thread
From: Brian Bloniarz @ 2016-02-28 22:53 UTC (permalink / raw)
  To: linux-kernel; +Cc: peter, tsi

Hi Peter, I saw Marc Aurele La France's proposed patch to n_tty to fix
OpenSSH, and your feedback. Patch below is an attempt to address that
feedback. Please let me know if this is the change you envisioned;
(see Marc's excellent original writeup for details on the issue).

[PATCH] n_tty: wait for buffer work in read() and poll().

Undoes the following four changes:

1) f95499c3030fe1bfad57745f2db1959c5b43dca8
    n_tty: Don't wait for buffer work in read() loop

2) f8747d4a466ab2cafe56112c51b3379f9fdb7a12
    tty: Fix pty master read() after slave closes

3) 52bce7f8d4fc633c9a9d0646eef58ba6ae9a3b73
    pty, n_tty: Simplify input processing on final close

4) 1a48632ffed61352a7810ce089dc5a8bcd505a60
    pty: Fix input race when closing

These changes caused a regression in OpenSSH, as it assumes that the
first read() to return EAGAIN after a SIGCHLD means that all the child's
output has been returned.

Inspired by analysis and patch from Marc Aurele La France <tsi@tuyoix.net>

Reported-by: Volth <openssh@volth.com>
Reported-by: Marc Aurele La France <tsi@tuyoix.net>
BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=52
BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=2492
Signed-off-by: Brian Bloniarz <brian.bloniarz@gmail.com>
---
 Documentation/serial/tty.txt |  3 ---
 drivers/tty/n_hdlc.c         |  2 +-
 drivers/tty/n_tty.c          | 34 +++++++++++++++-------------------
 drivers/tty/pty.c            |  4 +---
 drivers/tty/tty_buffer.c     | 29 +----------------------------
 include/linux/tty.h          |  1 -
 6 files changed, 18 insertions(+), 55 deletions(-)

diff --git a/Documentation/serial/tty.txt b/Documentation/serial/tty.txt
index bc3842d..e2dea3d 100644
--- a/Documentation/serial/tty.txt
+++ b/Documentation/serial/tty.txt
@@ -213,9 +213,6 @@ TTY_IO_ERROR                If set, causes all subsequent userspace read/write
 
 TTY_OTHER_CLOSED       Device is a pty and the other side has closed.
 
-TTY_OTHER_DONE         Device is a pty and the other side has closed and
-                       all pending input processing has been completed.
-
 TTY_NO_WRITE_SPLIT     Prevent driver from splitting up writes into
                        smaller chunks.
 
diff --git a/drivers/tty/n_hdlc.c b/drivers/tty/n_hdlc.c
index bbc4ce6..c35abef 100644
--- a/drivers/tty/n_hdlc.c
+++ b/drivers/tty/n_hdlc.c
@@ -600,7 +600,7 @@ static ssize_t n_hdlc_tty_read(struct tty_struct *tty, struct file *file,
        add_wait_queue(&tty->read_wait, &wait);
 
        for (;;) {
-               if (test_bit(TTY_OTHER_DONE, &tty->flags)) {
+               if (test_bit(TTY_OTHER_CLOSED, &tty->flags)) {
                        ret = -EIO;
                        break;
                }
diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index b280abaa..fc04011 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -1952,10 +1952,20 @@ err:
        return -ENOMEM;
 }
 
+/**
+ *     Synchronously pushes the terminal flip buffers to the line discipline
+ *     and checks for available data.
+ *
+ *     Must not be called from IRQ context.
+ */
 static inline int input_available_p(struct tty_struct *tty, int poll)
 {
        struct n_tty_data *ldata = tty->disc_data;
-       int amt = poll && !TIME_CHAR(tty) && MIN_CHAR(tty) ? MIN_CHAR(tty) : 1;
+       int amt;
+
+       flush_work(&tty->port->buf.work);
+
+       amt = poll && !TIME_CHAR(tty) && MIN_CHAR(tty) ? MIN_CHAR(tty) : 1;
 
        if (ldata->icanon && !L_EXTPROC(tty))
                return ldata->canon_head != ldata->read_tail;
@@ -1963,18 +1973,6 @@ static inline int input_available_p(struct tty_struct *tty, int poll)
                return ldata->commit_head - ldata->read_tail >= amt;
 }
 
-static inline int check_other_done(struct tty_struct *tty)
-{
-       int done = test_bit(TTY_OTHER_DONE, &tty->flags);
-       if (done) {
-               /* paired with cmpxchg() in check_other_closed(); ensures
-                * read buffer head index is not stale
-                */
-               smp_mb__after_atomic();
-       }
-       return done;
-}
-
 /**
  *     copy_from_read_buf      -       copy read data directly
  *     @tty: terminal device
@@ -2170,7 +2168,7 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
        struct n_tty_data *ldata = tty->disc_data;
        unsigned char __user *b = buf;
        DEFINE_WAIT_FUNC(wait, woken_wake_function);
-       int c, done;
+       int c;
        int minimum, time;
        ssize_t retval = 0;
        long timeout;
@@ -2238,10 +2236,8 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
                    ((minimum - (b - buf)) >= 1))
                        ldata->minimum_to_wake = (minimum - (b - buf));
 
-               done = check_other_done(tty);
-
                if (!input_available_p(tty, 0)) {
-                       if (done) {
+                       if (test_bit(TTY_OTHER_CLOSED, &tty->flags)) {
                                retval = -EIO;
                                break;
                        }
@@ -2445,12 +2441,12 @@ static unsigned int n_tty_poll(struct tty_struct *tty, struct file *file,
 
        poll_wait(file, &tty->read_wait, wait);
        poll_wait(file, &tty->write_wait, wait);
-       if (check_other_done(tty))
-               mask |= POLLHUP;
        if (input_available_p(tty, 1))
                mask |= POLLIN | POLLRDNORM;
        if (tty->packet && tty->link->ctrl_status)
                mask |= POLLPRI | POLLIN | POLLRDNORM;
+       if (test_bit(TTY_OTHER_CLOSED, &tty->flags))
+               mask |= POLLHUP;
        if (tty_hung_up_p(file))
                mask |= POLLHUP;
        if (!(mask & (POLLHUP | POLLIN | POLLRDNORM))) {
diff --git a/drivers/tty/pty.c b/drivers/tty/pty.c
index 2348fa6..6427a39 100644
--- a/drivers/tty/pty.c
+++ b/drivers/tty/pty.c
@@ -59,7 +59,7 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
        if (!tty->link)
                return;
        set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
-       tty_flip_buffer_push(tty->link->port);
+       wake_up_interruptible(&tty->link->read_wait);
        wake_up_interruptible(&tty->link->write_wait);
        if (tty->driver->subtype == PTY_TYPE_MASTER) {
                set_bit(TTY_OTHER_CLOSED, &tty->flags);
@@ -247,9 +247,7 @@ static int pty_open(struct tty_struct *tty, struct file *filp)
                goto out;
 
        clear_bit(TTY_IO_ERROR, &tty->flags);
-       /* TTY_OTHER_CLOSED must be cleared before TTY_OTHER_DONE */
        clear_bit(TTY_OTHER_CLOSED, &tty->link->flags);
-       clear_bit(TTY_OTHER_DONE, &tty->link->flags);
        set_bit(TTY_THROTTLED, &tty->flags);
        return 0;
 
diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
index 3cd31e0..3c0e914 100644
--- a/drivers/tty/tty_buffer.c
+++ b/drivers/tty/tty_buffer.c
@@ -37,29 +37,6 @@
 
 #define TTY_BUFFER_PAGE        (((PAGE_SIZE - sizeof(struct tty_buffer)) / 2) & ~0xFF)
 
-/*
- * If all tty flip buffers have been processed by flush_to_ldisc() or
- * dropped by tty_buffer_flush(), check if the linked pty has been closed.
- * If so, wake the reader/poll to process
- */
-static inline void check_other_closed(struct tty_struct *tty)
-{
-       unsigned long flags, old;
-
-       /* transition from TTY_OTHER_CLOSED => TTY_OTHER_DONE must be atomic */
-       for (flags = ACCESS_ONCE(tty->flags);
-            test_bit(TTY_OTHER_CLOSED, &flags);
-            ) {
-               old = flags;
-               __set_bit(TTY_OTHER_DONE, &flags);
-               flags = cmpxchg(&tty->flags, old, flags);
-               if (old == flags) {
-                       wake_up_interruptible(&tty->read_wait);
-                       break;
-               }
-       }
-}
-
 /**
  *     tty_buffer_lock_exclusive       -       gain exclusive access to buffer
  *     tty_buffer_unlock_exclusive     -       release exclusive access
@@ -254,8 +231,6 @@ void tty_buffer_flush(struct tty_struct *tty, struct tty_ldisc *ld)
        if (ld && ld->ops->flush_buffer)
                ld->ops->flush_buffer(tty);
 
-       check_other_closed(tty);
-
        atomic_dec(&buf->priority);
        mutex_unlock(&buf->lock);
 }
@@ -505,10 +480,8 @@ static void flush_to_ldisc(struct work_struct *work)
                 */
                count = smp_load_acquire(&head->commit) - head->read;
                if (!count) {
-                       if (next == NULL) {
-                               check_other_closed(tty);
+                       if (next == NULL)
                                break;
-                       }
                        buf->head = next;
                        tty_buffer_free(port, head);
                        continue;
diff --git a/include/linux/tty.h b/include/linux/tty.h
index d9fb4b0..af18365 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -338,7 +338,6 @@ struct tty_file_private {
 #define TTY_EXCLUSIVE          3       /* Exclusive open mode */
 #define TTY_DEBUG              4       /* Debugging */
 #define TTY_DO_WRITE_WAKEUP    5       /* Call write_wakeup after queuing new */
-#define TTY_OTHER_DONE         6       /* Closed pty has completed input processing */
 #define TTY_LDISC_OPEN         11      /* Line discipline is open */
 #define TTY_PTY_LOCK           16      /* pty private */
 #define TTY_NO_WRITE_SPLIT     17      /* Preserve write boundaries to driver */

^ permalink raw reply related	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2016-03-01  4:30 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-09 21:06 n_tty: Check the other end of pty pair before returning EAGAIN on a read() Marc Aurele La France
2015-12-10 14:59 ` Peter Hurley
2015-12-10 22:48   ` Marc Aurele La France
2015-12-11  0:07     ` Peter Hurley
2015-12-11 13:37       ` Marc Aurele La France
2015-12-11 13:56         ` Peter Hurley
2015-12-18 14:26           ` Marc Aurele La France
2015-12-18 16:39             ` Peter Hurley
2015-12-18 17:23               ` Marc Aurele La France
2016-01-14 21:50                 ` Marc Aurele La France
2016-02-28 22:53 Brian Bloniarz
2016-02-28 23:02 Brian Bloniarz
2016-02-29  3:56 Brian Bloniarz
2016-03-01  4:30 ` Peter Hurley

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.