linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [REGRESSION][BISECTED] 3.12-rc "n_tty: Don't wait for buffer work in read() loop" patch breaks gcc's testsuite
@ 2013-09-25 12:18 Mikael Pettersson
  2013-09-25 13:50 ` Peter Hurley
  0 siblings, 1 reply; 7+ messages in thread
From: Mikael Pettersson @ 2013-09-25 12:18 UTC (permalink / raw)
  To: Peter Hurley; +Cc: linux-kernel

With 3.12-rc[12] I see unexpected failures in gcc's Ada acats testsuite, e.g.

                === acats tests ===
FAIL:   a83009b
FAIL:   c37209a
FAIL:   c45531e
FAIL:   c45614a
FAIL:   c67005d
FAIL:   c730a01
FAIL:   c74302b
FAIL:   cc3004a
FAIL:   cd2a24j
FAIL:   cd2a53a
FAIL:   cxa3001
FAIL:   cxf3a07
FAIL:   cxf3a08

                === acats Summary ===
# of expected passes            2307
# of unexpected failures        13
Native configuration is x86_64-unknown-linux-gnu

The exact failures vary from run to run, but some failures always occur on my
x86_64 machines, and all three open gcc branches (trunk, 4.8, 4.7) are affected.
With a 3.11 kernel the acats testsuite is always clean.

A bisection identified:

>From f95499c3030fe1bfad57745f2db1959c5b43dca8 Mon Sep 17 00:00:00 2001
From: Peter Hurley <peter@hurleysoftware.com>
Date: Sat, 15 Jun 2013 13:14:29 +0000
Subject: n_tty: Don't wait for buffer work in read() loop

User-space read() can run concurrently with receiving from device;
waiting for receive_buf() to complete is not required.

Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index fe1c399..a6eea30 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -1724,7 +1724,6 @@ static inline int input_available_p(struct tty_struct *tty, int amt)
 {
        struct n_tty_data *ldata = tty->disc_data;
 
-       tty_flush_to_ldisc(tty);
        if (ldata->icanon && !L_EXTPROC(tty)) {
                if (ldata->canon_head != ldata->read_tail)
                        return 1;

as the culprit.  Reverting that from 3.12-rc2 eliminates the acats failures
and brings the gcc testsuite results to what one gets with 3.11.

I can't pretend to understand exactly what goes wrong, suffice it to say that
the gcc testsuite harness uses a combination of shell, expect, and tcl.  I
suspect ptys are also involved.

To repeat, bootstrap a recent gcc 4.8 snapshot w/ ada in --enable-languages,
then run the test suite with "make -j6 -k check; make mail-report.log".
(Adjust -jN as appropriate, but -j6 is what I'm using on my quad-core i7s.)

Please consider reverting or fixing this patch.

/Mikael

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [REGRESSION][BISECTED] 3.12-rc "n_tty: Don't wait for buffer work in read() loop" patch breaks gcc's testsuite
  2013-09-25 12:18 [REGRESSION][BISECTED] 3.12-rc "n_tty: Don't wait for buffer work in read() loop" patch breaks gcc's testsuite Mikael Pettersson
@ 2013-09-25 13:50 ` Peter Hurley
  2013-09-25 13:52   ` Peter Hurley
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Hurley @ 2013-09-25 13:50 UTC (permalink / raw)
  To: Mikael Pettersson; +Cc: linux-kernel

On 09/25/2013 08:18 AM, Mikael Pettersson wrote:
> With 3.12-rc[12] I see unexpected failures in gcc's Ada acats testsuite, e.g.
>
>                  === acats tests ===
> FAIL:   a83009b
> FAIL:   c37209a
> FAIL:   c45531e
> FAIL:   c45614a
> FAIL:   c67005d
> FAIL:   c730a01
> FAIL:   c74302b
> FAIL:   cc3004a
> FAIL:   cd2a24j
> FAIL:   cd2a53a
> FAIL:   cxa3001
> FAIL:   cxf3a07
> FAIL:   cxf3a08
>
>                  === acats Summary ===
> # of expected passes            2307
> # of unexpected failures        13
> Native configuration is x86_64-unknown-linux-gnu

Thanks for the report.
Would you please send me the acats.log file from a failed testsuite run with its
matching screen output?

Regards,
Peter Hurley


> The exact failures vary from run to run, but some failures always occur on my
> x86_64 machines, and all three open gcc branches (trunk, 4.8, 4.7) are affected.
> With a 3.11 kernel the acats testsuite is always clean.
>
> A bisection identified:
>
>  From f95499c3030fe1bfad57745f2db1959c5b43dca8 Mon Sep 17 00:00:00 2001
> From: Peter Hurley <peter@hurleysoftware.com>
> Date: Sat, 15 Jun 2013 13:14:29 +0000
> Subject: n_tty: Don't wait for buffer work in read() loop
>
> User-space read() can run concurrently with receiving from device;
> waiting for receive_buf() to complete is not required.
>
> Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
> diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
> index fe1c399..a6eea30 100644
> --- a/drivers/tty/n_tty.c
> +++ b/drivers/tty/n_tty.c
> @@ -1724,7 +1724,6 @@ static inline int input_available_p(struct tty_struct *tty, int amt)
>   {
>          struct n_tty_data *ldata = tty->disc_data;
>
> -       tty_flush_to_ldisc(tty);
>          if (ldata->icanon && !L_EXTPROC(tty)) {
>                  if (ldata->canon_head != ldata->read_tail)
>                          return 1;
>
> as the culprit.  Reverting that from 3.12-rc2 eliminates the acats failures
> and brings the gcc testsuite results to what one gets with 3.11.
>
> I can't pretend to understand exactly what goes wrong, suffice it to say that
> the gcc testsuite harness uses a combination of shell, expect, and tcl.  I
> suspect ptys are also involved.
>
> To repeat, bootstrap a recent gcc 4.8 snapshot w/ ada in --enable-languages,
> then run the test suite with "make -j6 -k check; make mail-report.log".
> (Adjust -jN as appropriate, but -j6 is what I'm using on my quad-core i7s.)
>
> Please consider reverting or fixing this patch.
>
> /Mikael
>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [REGRESSION][BISECTED] 3.12-rc "n_tty: Don't wait for buffer work in read() loop" patch breaks gcc's testsuite
  2013-09-25 13:50 ` Peter Hurley
@ 2013-09-25 13:52   ` Peter Hurley
  2013-09-26 20:07     ` Peter Hurley
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Hurley @ 2013-09-25 13:52 UTC (permalink / raw)
  To: Mikael Pettersson; +Cc: linux-kernel, Greg KH

[ +cc Greg Kroah-Hartman ]

On 09/25/2013 09:50 AM, Peter Hurley wrote:
> On 09/25/2013 08:18 AM, Mikael Pettersson wrote:
>> With 3.12-rc[12] I see unexpected failures in gcc's Ada acats testsuite, e.g.
>>
>>                  === acats tests ===
>> FAIL:   a83009b
>> FAIL:   c37209a
>> FAIL:   c45531e
>> FAIL:   c45614a
>> FAIL:   c67005d
>> FAIL:   c730a01
>> FAIL:   c74302b
>> FAIL:   cc3004a
>> FAIL:   cd2a24j
>> FAIL:   cd2a53a
>> FAIL:   cxa3001
>> FAIL:   cxf3a07
>> FAIL:   cxf3a08
>>
>>                  === acats Summary ===
>> # of expected passes            2307
>> # of unexpected failures        13
>> Native configuration is x86_64-unknown-linux-gnu
>
> Thanks for the report.
> Would you please send me the acats.log file from a failed testsuite run with its
> matching screen output?
>
> Regards,
> Peter Hurley
>
>
>> The exact failures vary from run to run, but some failures always occur on my
>> x86_64 machines, and all three open gcc branches (trunk, 4.8, 4.7) are affected.
>> With a 3.11 kernel the acats testsuite is always clean.
>>
>> A bisection identified:
>>
>>  From f95499c3030fe1bfad57745f2db1959c5b43dca8 Mon Sep 17 00:00:00 2001
>> From: Peter Hurley <peter@hurleysoftware.com>
>> Date: Sat, 15 Jun 2013 13:14:29 +0000
>> Subject: n_tty: Don't wait for buffer work in read() loop
>>
>> User-space read() can run concurrently with receiving from device;
>> waiting for receive_buf() to complete is not required.
>>
>> Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
>> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> ---
>> diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
>> index fe1c399..a6eea30 100644
>> --- a/drivers/tty/n_tty.c
>> +++ b/drivers/tty/n_tty.c
>> @@ -1724,7 +1724,6 @@ static inline int input_available_p(struct tty_struct *tty, int amt)
>>   {
>>          struct n_tty_data *ldata = tty->disc_data;
>>
>> -       tty_flush_to_ldisc(tty);
>>          if (ldata->icanon && !L_EXTPROC(tty)) {
>>                  if (ldata->canon_head != ldata->read_tail)
>>                          return 1;
>>
>> as the culprit.  Reverting that from 3.12-rc2 eliminates the acats failures
>> and brings the gcc testsuite results to what one gets with 3.11.
>>
>> I can't pretend to understand exactly what goes wrong, suffice it to say that
>> the gcc testsuite harness uses a combination of shell, expect, and tcl.  I
>> suspect ptys are also involved.
>>
>> To repeat, bootstrap a recent gcc 4.8 snapshot w/ ada in --enable-languages,
>> then run the test suite with "make -j6 -k check; make mail-report.log".
>> (Adjust -jN as appropriate, but -j6 is what I'm using on my quad-core i7s.)
>>
>> Please consider reverting or fixing this patch.
>>
>> /Mikael



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [REGRESSION][BISECTED] 3.12-rc "n_tty: Don't wait for buffer work in read() loop" patch breaks gcc's testsuite
  2013-09-25 13:52   ` Peter Hurley
@ 2013-09-26 20:07     ` Peter Hurley
  2013-09-27 17:27       ` [PATCH] tty: Fix pty master read() after slave closes Peter Hurley
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Hurley @ 2013-09-26 20:07 UTC (permalink / raw)
  To: Mikael Pettersson; +Cc: linux-kernel, Greg KH

On 09/25/2013 09:52 AM, Peter Hurley wrote:
> [ +cc Greg Kroah-Hartman ]
>
> On 09/25/2013 09:50 AM, Peter Hurley wrote:
>> On 09/25/2013 08:18 AM, Mikael Pettersson wrote:
>>> With 3.12-rc[12] I see unexpected failures in gcc's Ada acats testsuite, e.g.
>>>
>>>                  === acats tests ===
>>> FAIL:   a83009b
>>> FAIL:   c37209a
>>> FAIL:   c45531e
>>> FAIL:   c45614a
>>> FAIL:   c67005d
>>> FAIL:   c730a01
>>> FAIL:   c74302b
>>> FAIL:   cc3004a
>>> FAIL:   cd2a24j
>>> FAIL:   cd2a53a
>>> FAIL:   cxa3001
>>> FAIL:   cxf3a07
>>> FAIL:   cxf3a08
>>>
>>>                  === acats Summary ===
>>> # of expected passes            2307
>>> # of unexpected failures        13
>>> Native configuration is x86_64-unknown-linux-gnu
>>
>> Thanks for the report.
>> Would you please send me the acats.log file from a failed testsuite run with its
>> matching screen output?
>>
>> Regards,
>> Peter Hurley
>>
>>
>>> The exact failures vary from run to run, but some failures always occur on my
>>> x86_64 machines, and all three open gcc branches (trunk, 4.8, 4.7) are affected.
>>> With a 3.11 kernel the acats testsuite is always clean.
>>>
>>> A bisection identified:
>>>
>>>  From f95499c3030fe1bfad57745f2db1959c5b43dca8 Mon Sep 17 00:00:00 2001
>>> From: Peter Hurley <peter@hurleysoftware.com>
>>> Date: Sat, 15 Jun 2013 13:14:29 +0000
>>> Subject: n_tty: Don't wait for buffer work in read() loop
>>>
>>> User-space read() can run concurrently with receiving from device;
>>> waiting for receive_buf() to complete is not required.
>>>
>>> Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
>>> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>> ---
>>> diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
>>> index fe1c399..a6eea30 100644
>>> --- a/drivers/tty/n_tty.c
>>> +++ b/drivers/tty/n_tty.c
>>> @@ -1724,7 +1724,6 @@ static inline int input_available_p(struct tty_struct *tty, int amt)
>>>   {
>>>          struct n_tty_data *ldata = tty->disc_data;
>>>
>>> -       tty_flush_to_ldisc(tty);
>>>          if (ldata->icanon && !L_EXTPROC(tty)) {
>>>                  if (ldata->canon_head != ldata->read_tail)
>>>                          return 1;
>>>
>>> as the culprit.  Reverting that from 3.12-rc2 eliminates the acats failures
>>> and brings the gcc testsuite results to what one gets with 3.11.
>>>
>>> I can't pretend to understand exactly what goes wrong, suffice it to say that
>>> the gcc testsuite harness uses a combination of shell, expect, and tcl.  I
>>> suspect ptys are also involved.
>>>
>>> To repeat, bootstrap a recent gcc 4.8 snapshot w/ ada in --enable-languages,
>>> then run the test suite with "make -j6 -k check; make mail-report.log".
>>> (Adjust -jN as appropriate, but -j6 is what I'm using on my quad-core i7s.)

Ok, I've managed to reproduce this (epic adventure).

What happens is the child process (the test) writes to its stdout (which is
the slave end of a pty pair) and exits.

Then, the parent (expect), waiting for output from the child, is scheduled
and run before the tty buffer i/o loop has pushed any data to the pty master
read buffer.

IOW, at that particular instant, the pty appears to be closed (read return -EIO).

(The converse is also possible: ie., writing to the master and then closing
the master may not be read by the slave.)

>>> Please consider reverting or fixing this patch.

I need to think a little on the right way to fix this.

Regards,
Peter Hurley

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] tty: Fix pty master read() after slave closes
  2013-09-26 20:07     ` Peter Hurley
@ 2013-09-27 17:27       ` Peter Hurley
  2013-09-27 17:32         ` Peter Hurley
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Hurley @ 2013-09-27 17:27 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Mikael Pettersson; +Cc: linux-kernel, Peter Hurley

Commit f95499c3030fe1bfad57745f2db1959c5b43dca8,
  n_tty: Don't wait for buffer work in read() loop
creates a race window which can cause a pty master read()
to miss the last pty slave write(s) and return -EIO instead,
thus signalling the pty slave is closed. This can happen when
the pty slave is written and immediately closed but before the
tty buffer i/o loop receives the new input; the pty master
read() is scheduled, sees its read buffer is empty and the
pty slave has been closed, and exits.

Because tty_flush_to_ldisc() has significant performance impact
for parallel i/o, rather than revert the commit, special case this
condition (ie., when the read buffer is empty and the 'other' pty
has been closed) and, only then, wait for buffer work to complete
before re-testing if the read buffer is still empty.

As before, subsequent pty master reads return any available data
until no more data is available, and then returns -EIO to
indicate the pty slave has closed.

Reported-by: Mikael Pettersson <mikpelinux@gmail.com>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
---
 drivers/tty/n_tty.c | 46 ++++++++++++++++++++++++++--------------------
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index 09505ff..4e69f3b 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -2176,28 +2176,34 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
 
 		if (!input_available_p(tty, 0)) {
 			if (test_bit(TTY_OTHER_CLOSED, &tty->flags)) {
-				retval = -EIO;
-				break;
-			}
-			if (tty_hung_up_p(file))
-				break;
-			if (!timeout)
-				break;
-			if (file->f_flags & O_NONBLOCK) {
-				retval = -EAGAIN;
-				break;
-			}
-			if (signal_pending(current)) {
-				retval = -ERESTARTSYS;
-				break;
-			}
-			n_tty_set_room(tty);
-			up_read(&tty->termios_rwsem);
+				up_read(&tty->termios_rwsem);
+				tty_flush_to_ldisc(tty);
+				down_read(&tty->termios_rwsem);
+				if (!input_available_p(tty, 0)) {
+					retval = -EIO;
+					break;
+				}
+			} else {
+				if (tty_hung_up_p(file))
+					break;
+				if (!timeout)
+					break;
+				if (file->f_flags & O_NONBLOCK) {
+					retval = -EAGAIN;
+					break;
+				}
+				if (signal_pending(current)) {
+					retval = -ERESTARTSYS;
+					break;
+				}
+				n_tty_set_room(tty);
+				up_read(&tty->termios_rwsem);
 
-			timeout = schedule_timeout(timeout);
+				timeout = schedule_timeout(timeout);
 
-			down_read(&tty->termios_rwsem);
-			continue;
+				down_read(&tty->termios_rwsem);
+				continue;
+			}
 		}
 		__set_current_state(TASK_RUNNING);
 
-- 
1.8.1.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] tty: Fix pty master read() after slave closes
  2013-09-27 17:27       ` [PATCH] tty: Fix pty master read() after slave closes Peter Hurley
@ 2013-09-27 17:32         ` Peter Hurley
  2013-09-28 17:36           ` Mikael Pettersson
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Hurley @ 2013-09-27 17:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Mikael Pettersson; +Cc: linux-kernel

On 09/27/2013 01:27 PM, Peter Hurley wrote:
> Commit f95499c3030fe1bfad57745f2db1959c5b43dca8,
>    n_tty: Don't wait for buffer work in read() loop
> creates a race window which can cause a pty master read()
> to miss the last pty slave write(s) and return -EIO instead,
> thus signalling the pty slave is closed. This can happen when
> the pty slave is written and immediately closed but before the
> tty buffer i/o loop receives the new input; the pty master
> read() is scheduled, sees its read buffer is empty and the
> pty slave has been closed, and exits.
>
> Because tty_flush_to_ldisc() has significant performance impact
> for parallel i/o, rather than revert the commit, special case this
> condition (ie., when the read buffer is empty and the 'other' pty
> has been closed) and, only then, wait for buffer work to complete
> before re-testing if the read buffer is still empty.
>
> As before, subsequent pty master reads return any available data
> until no more data is available, and then returns -EIO to
> indicate the pty slave has closed.
>
> Reported-by: Mikael Pettersson <mikpelinux@gmail.com>
> Signed-off-by: Peter Hurley <peter@hurleysoftware.com>

I tested this patch with the gcc ada ACATS testsuite and got this
summary so all seems ok with this patch.

		=== acats Summary ===
# of expected passes		805
# of unexpected failures	0

...

		=== acats Summary ===
# of expected passes		772
# of unexpected failures	0

...

		=== acats Summary ===
# of expected passes		743
# of unexpected failures	0

Mikael,

I'd appreciate if you could re-test with this patch and
confirm the issue is fixed.

Regards,
Peter Hurley

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tty: Fix pty master read() after slave closes
  2013-09-27 17:32         ` Peter Hurley
@ 2013-09-28 17:36           ` Mikael Pettersson
  0 siblings, 0 replies; 7+ messages in thread
From: Mikael Pettersson @ 2013-09-28 17:36 UTC (permalink / raw)
  To: Peter Hurley; +Cc: Greg Kroah-Hartman, Mikael Pettersson, linux-kernel

Peter Hurley writes:
 > On 09/27/2013 01:27 PM, Peter Hurley wrote:
 > > Commit f95499c3030fe1bfad57745f2db1959c5b43dca8,
 > >    n_tty: Don't wait for buffer work in read() loop
 > > creates a race window which can cause a pty master read()
 > > to miss the last pty slave write(s) and return -EIO instead,
 > > thus signalling the pty slave is closed. This can happen when
 > > the pty slave is written and immediately closed but before the
 > > tty buffer i/o loop receives the new input; the pty master
 > > read() is scheduled, sees its read buffer is empty and the
 > > pty slave has been closed, and exits.
 > >
 > > Because tty_flush_to_ldisc() has significant performance impact
 > > for parallel i/o, rather than revert the commit, special case this
 > > condition (ie., when the read buffer is empty and the 'other' pty
 > > has been closed) and, only then, wait for buffer work to complete
 > > before re-testing if the read buffer is still empty.
 > >
 > > As before, subsequent pty master reads return any available data
 > > until no more data is available, and then returns -EIO to
 > > indicate the pty slave has closed.
 > >
 > > Reported-by: Mikael Pettersson <mikpelinux@gmail.com>
 > > Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
 > 
 > I tested this patch with the gcc ada ACATS testsuite and got this
 > summary so all seems ok with this patch.
 > 
 > 		=== acats Summary ===
 > # of expected passes		805
 > # of unexpected failures	0
 > 
 > ...
 > 
 > 		=== acats Summary ===
 > # of expected passes		772
 > # of unexpected failures	0
 > 
 > ...
 > 
 > 		=== acats Summary ===
 > # of expected passes		743
 > # of unexpected failures	0
 > 
 > Mikael,
 > 
 > I'd appreciate if you could re-test with this patch and
 > confirm the issue is fixed.

I did a big pile of gcc bootstrap + regression test runs today with
this patch, and everything looks good.  Thus:

Tested-by: Mikael Pettersson <mikpelinux@gmail.com>

Thanks,

/Mikael

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2013-09-28 17:37 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-09-25 12:18 [REGRESSION][BISECTED] 3.12-rc "n_tty: Don't wait for buffer work in read() loop" patch breaks gcc's testsuite Mikael Pettersson
2013-09-25 13:50 ` Peter Hurley
2013-09-25 13:52   ` Peter Hurley
2013-09-26 20:07     ` Peter Hurley
2013-09-27 17:27       ` [PATCH] tty: Fix pty master read() after slave closes Peter Hurley
2013-09-27 17:32         ` Peter Hurley
2013-09-28 17:36           ` Mikael Pettersson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).