All of lore.kernel.org
 help / color / mirror / Atom feed
* [Regression] kdesu broken
@ 2009-07-23 23:45 Rafael J. Wysocki
  2009-07-24  0:21 ` Ray Lee
  0 siblings, 1 reply; 104+ messages in thread
From: Rafael J. Wysocki @ 2009-07-23 23:45 UTC (permalink / raw)
  To: LKML; +Cc: Andrew Morton

Hi,

A recent kernel change broke kdesu (from KDE 4.2) on my test boxes.  ISTR a
discussion about that, but I can't find it right now.  Any clues?

Rafael

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-23 23:45 [Regression] kdesu broken Rafael J. Wysocki
@ 2009-07-24  0:21 ` Ray Lee
  2009-07-24 15:21   ` Rafael J. Wysocki
  2009-07-24 18:25   ` Aneesh Kumar K.V
  0 siblings, 2 replies; 104+ messages in thread
From: Ray Lee @ 2009-07-24  0:21 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: LKML, Andrew Morton

On Thu, Jul 23, 2009 at 4:45 PM, Rafael J. Wysocki<rjw@sisk.pl> wrote:
> A recent kernel change broke kdesu (from KDE 4.2) on my test boxes.  ISTR a
> discussion about that, but I can't find it right now.  Any clues?

See the thread starting here: ("possible regression with pty.c commit")
   http://lkml.org/lkml/2009/7/11/125

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-24  0:21 ` Ray Lee
@ 2009-07-24 15:21   ` Rafael J. Wysocki
  2009-07-24 15:40     ` Alan Cox
  2009-07-24 18:25   ` Aneesh Kumar K.V
  1 sibling, 1 reply; 104+ messages in thread
From: Rafael J. Wysocki @ 2009-07-24 15:21 UTC (permalink / raw)
  To: Ray Lee, LKML; +Cc: Andrew Morton, Alan Cox, Linus Torvalds

On Friday 24 July 2009, Ray Lee wrote:
> On Thu, Jul 23, 2009 at 4:45 PM, Rafael J. Wysocki<rjw@sisk.pl> wrote:
> > A recent kernel change broke kdesu (from KDE 4.2) on my test boxes.  ISTR a
> > discussion about that, but I can't find it right now.  Any clues?
> 
> See the thread starting here: ("possible regression with pty.c commit")
>    http://lkml.org/lkml/2009/7/11/125

Thanks for the pointer.

Well, I thought we were expected to avoid breaking existing user space, even
if that were buggy etc.  

Best,
Rafael

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-24 15:21   ` Rafael J. Wysocki
@ 2009-07-24 15:40     ` Alan Cox
  2009-07-24 16:34       ` Linus Torvalds
  0 siblings, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-24 15:40 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Ray Lee, LKML, Andrew Morton, Linus Torvalds

On Fri, 24 Jul 2009 17:21:45 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Friday 24 July 2009, Ray Lee wrote:
> > On Thu, Jul 23, 2009 at 4:45 PM, Rafael J. Wysocki<rjw@sisk.pl> wrote:
> > > A recent kernel change broke kdesu (from KDE 4.2) on my test boxes.  ISTR a
> > > discussion about that, but I can't find it right now.  Any clues?
> > 
> > See the thread starting here: ("possible regression with pty.c commit")
> >    http://lkml.org/lkml/2009/7/11/125
> 
> Thanks for the pointer.
> 
> Well, I thought we were expected to avoid breaking existing user space, even
> if that were buggy etc.

I don't know where you got that idea from. Avoiding breaking user space
unneccessarily is good but if its buggy you often can't do anything about
it.

Alan

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-24 15:40     ` Alan Cox
@ 2009-07-24 16:34       ` Linus Torvalds
  2009-07-25  6:04         ` OGAWA Hirofumi
  2009-07-25 11:48         ` Alan Cox
  0 siblings, 2 replies; 104+ messages in thread
From: Linus Torvalds @ 2009-07-24 16:34 UTC (permalink / raw)
  To: Alan Cox; +Cc: Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton



On Fri, 24 Jul 2009, Alan Cox wrote:

> On Fri, 24 Jul 2009 17:21:45 +0200
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > On Friday 24 July 2009, Ray Lee wrote:
> > > On Thu, Jul 23, 2009 at 4:45 PM, Rafael J. Wysocki<rjw@sisk.pl> wrote:
> > > > A recent kernel change broke kdesu (from KDE 4.2) on my test boxes.  ISTR a
> > > > discussion about that, but I can't find it right now.  Any clues?
> > > 
> > > See the thread starting here: ("possible regression with pty.c commit")
> > >    http://lkml.org/lkml/2009/7/11/125
> > 
> > Thanks for the pointer.
> > 
> > Well, I thought we were expected to avoid breaking existing user space, even
> > if that were buggy etc.
> 
> I don't know where you got that idea from. Avoiding breaking user space
> unneccessarily is good but if its buggy you often can't do anything about
> it.

Alan, he got that idea from me.

We don't do regressions. If user space depended on old behavior, we don't 
change behavior.

And regardless of that, I do not think EIO is the right thing to return at 
all. If the other side of a pty went away, return 0 and possibly send a 
HUP, or whatever. What did we do before?

			Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-24  0:21 ` Ray Lee
  2009-07-24 15:21   ` Rafael J. Wysocki
@ 2009-07-24 18:25   ` Aneesh Kumar K.V
  2009-07-25 12:07     ` Alan Cox
  2009-07-29 19:09     ` [Regression] kdesu broken, now usb fixed in current git pull Gene Heskett
  1 sibling, 2 replies; 104+ messages in thread
From: Aneesh Kumar K.V @ 2009-07-24 18:25 UTC (permalink / raw)
  To: Ray Lee; +Cc: Rafael J. Wysocki, LKML, Andrew Morton

On Thu, Jul 23, 2009 at 05:21:36PM -0700, Ray Lee wrote:
> On Thu, Jul 23, 2009 at 4:45 PM, Rafael J. Wysocki<rjw@sisk.pl> wrote:
> > A recent kernel change broke kdesu (from KDE 4.2) on my test boxes.  ISTR a
> > discussion about that, but I can't find it right now.  Any clues?
> 
> See the thread starting here: ("possible regression with pty.c commit")
>    http://lkml.org/lkml/2009/7/11/125

I am also facing a similar problem.

http://bugzilla.kernel.org/show_bug.cgi?id=13815

-aneesh

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-24 16:34       ` Linus Torvalds
@ 2009-07-25  6:04         ` OGAWA Hirofumi
  2009-07-25 13:31           ` Alan Cox
  2009-07-25 14:05           ` Alan Cox
  2009-07-25 11:48         ` Alan Cox
  1 sibling, 2 replies; 104+ messages in thread
From: OGAWA Hirofumi @ 2009-07-25  6:04 UTC (permalink / raw)
  To: Alan Cox; +Cc: Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

Linus Torvalds <torvalds@linux-foundation.org> writes:

>> I don't know where you got that idea from. Avoiding breaking user space
>> unneccessarily is good but if its buggy you often can't do anything about
>> it.
>
> Alan, he got that idea from me.
>
> We don't do regressions. If user space depended on old behavior, we don't 
> change behavior.
>
> And regardless of that, I do not think EIO is the right thing to return at 
> all. If the other side of a pty went away, return 0 and possibly send a 
> HUP, or whatever. What did we do before?

I also was seeing this. I hope the attached test code shows the problem.

The problem seems to be complex. And before change, write() seems to
send buffer to ldisc directly. After change, write() seems to send
buffer to tty buffer. With some debug, I'm not sure though, I guess the
following

          slave                                 master
     write()
         write to buffer
             tty_flip_buffer_push()
             schedule_delayed_work()
     close()
         set_bit(TTY_OTHER_CLOSED)
                                            read()
                                                input_available_p()
                                                # buffer was not received yet
                                                    test_bit(TTY_OTHER_CLOSED)
                                                        return -EIO

                                            flush_to_ldisc()
                                                ->receive_buf()

master is having the input data in tty->buf, but ->receive_buf() is not
called yet. So, it seems to return -EIO before handling input data in
tty->buf.

Thanks.
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <error.h>
#include <limits.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/wait.h>
#include <unistd.h>

static char pts_name[PATH_MAX];

static int open_pty(void)
{
	int master;
	char *name;

	master = getpt();
	if (master < 0)
		return -1;

	if (grantpt(master) < 0 || unlockpt(master) < 0)
		goto close_master;
#if 0
	{
		int on = 1;
		ioctl(master, FIONBIO, &on);
	}
#endif
	name = ptsname(master);
	if (name == NULL)
		goto close_master;

	strcpy(pts_name, name);

	return master;

close_master:
	close(master);
	return -1;
}

static pid_t child(int master)
{
	pid_t pid;
	int slave;

	pid = fork();
	if (pid < 0)
		error(1, errno, "%s: fork", __func__);

	if (pid == 0) {
		slave = open(pts_name, O_RDWR);
		if (slave < 0)
			error(1, errno, "%s: open", __func__);

		close(master);
		dup2(slave, 0);
		dup2(slave, 1);
		dup2(slave, 2);
		close(slave);

		printf("1-----------------------------------------------\n");
		printf("2-----------------------------------------------\n");
		printf("3-----------------------------------------------\n");
		printf("4-----------------------------------------------\n");
		printf("5-----------------------------------------------\n");
		printf("6-----------------------------------------------\n");
		printf("7-----------------------------------------------\n");
		printf("8-----------------------------------------------\n");
		printf("9-----------------------------------------------\n");
		exit(0);
	}

	return pid;
}

int main()
{
	pid_t pid;
	int master;

	master = open_pty();
	if (master < 0)
		error(1, errno, "%s: open_pty", __func__);

	pid = child(master);

	waitpid(pid, NULL, 0);
	while (1) {
		char buf[4096];
		ssize_t size;

		size = read(master, buf, sizeof(buf));
		if (size < 0) {
			if (errno == EAGAIN) {
				printf("EAGAIN\n");
				continue;
			}
			error(1, errno, "%s: read", __func__);
		}
		if (size == 0)
			break;
		write(STDOUT_FILENO, buf, size);
	}
	
	return 0;
}

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-24 16:34       ` Linus Torvalds
  2009-07-25  6:04         ` OGAWA Hirofumi
@ 2009-07-25 11:48         ` Alan Cox
  2009-07-25 14:02           ` Frans Pop
  1 sibling, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-25 11:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

> > I don't know where you got that idea from. Avoiding breaking user space
> > unneccessarily is good but if its buggy you often can't do anything about
> > it.
> 
> Alan, he got that idea from me.
> 
> We don't do regressions. If user space depended on old behavior, we don't 
> change behavior

I think there are two bugs being confused here. The first kdesu bug isn't
the EIO one. It's just busted userspace code that happened to be lucky.
It's also impossible to keep that "luck" going. The other problem is the
EIO - which is a bug.

> And regardless of that, I do not think EIO is the right thing to return at 
> all. If the other side of a pty went away, return 0 and possibly send a 
> HUP, or whatever. What did we do before?

If the other side of a pty goes away we send SIGHUP and at that point
return -EIO as required by POSIX, SuS and various other things. You
cannot return "EOF" as that means the user caused an EOF event (eg hit
^D).

There is a bug going on here (the one that bites emacs meta-whatever
compile) where you get an -EIO and that seems to be timing related due to
flushing. I'm currently working on that one. I suspect we will need to
make the close on the writer wait for the data to be flushed through the
reader side and I'm testing some fixes for that.

So there are two things here that need not to be muddled up:

1. 	Case where kdesu from logging what is going on and code
inspection does:

	read
	not what I wanted
	discard the lot
	read
	oh damn I missed the bit I wanted

If we go back to the old pty approach we break ppp, locking rules and
everything else, and re-introduce DoS attacks (and some worse ones) going
back to 2.0 if not earlier.

2.	A case where previously it happened that

	write(pty, "Wibble", 6);
	close(pty);

	would *usually* block on close until the "Wibble" was consumed the
	other end. (ie logically speaking the buffering is on the writer
	not the reader)

	We can implement that (although probably a timeout to avoid
	certain deadlocks is needed)

I plan to fix #2 but not #1. There really isn't any way to keep #1 true.
Although the spec has nothing to say about #2 every implementation I've
you get the wibble reliably and we used to, so it needs fixing.

Alan

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-24 18:25   ` Aneesh Kumar K.V
@ 2009-07-25 12:07     ` Alan Cox
  2009-07-25 16:18       ` Aneesh Kumar K.V
  2009-07-29 19:09     ` [Regression] kdesu broken, now usb fixed in current git pull Gene Heskett
  1 sibling, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-25 12:07 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: Ray Lee, Rafael J. Wysocki, LKML, Andrew Morton

> > See the thread starting here: ("possible regression with pty.c commit")
> >    http://lkml.org/lkml/2009/7/11/125
> 
> I am also facing a similar problem.
> 
> http://bugzilla.kernel.org/show_bug.cgi?id=13815

Probably something like this fixes it. I'll be working on that Monday/Tuesday
along with various other bugs that need a review.

--


pty: ensure writes hit the reader before close

From: Alan Cox <alan@linux.intel.com>

Logically we move the buffering from one side to the other
---

 drivers/char/pty.c |   11 +++++++++++
 1 files changed, 11 insertions(+), 0 deletions(-)


diff --git a/drivers/char/pty.c b/drivers/char/pty.c
index 6e6942c..7555890 100644
--- a/drivers/char/pty.c
+++ b/drivers/char/pty.c
@@ -36,6 +36,15 @@ static struct tty_driver *ptm_driver;
 static struct tty_driver *pts_driver;
 #endif
 
+static int pty_empty(struct tty_struct *tty)
+{
+	if (tty->buf.memory_used == 0)
+		return 1;
+	if (test_bit(TTY_HUPPED, &tty->flags))
+		return 1;
+	return 0;
+}
+
 static void pty_close(struct tty_struct *tty, struct file *filp)
 {
 	BUG_ON(!tty);
@@ -47,9 +56,11 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
 	}
 	wake_up_interruptible(&tty->read_wait);
 	wake_up_interruptible(&tty->write_wait);
+	
 	tty->packet = 0;
 	if (!tty->link)
 		return;
+	wait_event_interruptible(tty->write_wait, pty_empty(tty->link));
 	tty->link->packet = 0;
 	set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
 	wake_up_interruptible(&tty->link->read_wait);

^ permalink raw reply related	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-25  6:04         ` OGAWA Hirofumi
@ 2009-07-25 13:31           ` Alan Cox
  2009-07-25 14:05           ` Alan Cox
  1 sibling, 0 replies; 104+ messages in thread
From: Alan Cox @ 2009-07-25 13:31 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Sat, 25 Jul 2009 15:04:06 +0900
> I also was seeing this. I hope the attached test code shows the problem.

It does for me: I've been using the test attached below from the Emacs 21
for Mac OS X web site where MacOS developed the same behaviour

>                                                 input_available_p()
>                                                 # buffer was not received yet
>                                                     test_bit(TTY_OTHER_CLOSED)
>                                                         return -EIO
> 
>                                             flush_to_ldisc()
>                                                 ->receive_buf()
> 
> master is having the input data in tty->buf, but ->receive_buf() is not
> called yet. So, it seems to return -EIO before handling input data in
> tty->buf.

Would make sense. Just investigating that now.

-----------

#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv) {
      pid_t   pid;
      int     master;
      char    buf[101];
      int     n;

      pid = forkpty(&master, NULL, NULL, NULL);
      if(pid < 0) {
          perror("fork error");
          exit(-1);
      } else if(pid == 0) {
          printf("### This is the child process ###\n");  // To be read by parent
          fflush(stdout);     // Doesn't help.
       sleep(1);       // Shouldn't be needed, but it makes things work.
          return(0);
      } else {
          while(n = read(master, buf, 100)) {
              if(n < 0) {
                  perror("read error");
                  exit(-1);
              }
              buf[n] = 0;     // Make a string out of our data.
              printf("Read %d bytes: %s", n, buf);
          }
      }
      exit(0);
}

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-25 11:48         ` Alan Cox
@ 2009-07-25 14:02           ` Frans Pop
  2009-07-25 20:16             ` Rafael J. Wysocki
  2009-07-26 15:51             ` Sergey Senozhatsky
  0 siblings, 2 replies; 104+ messages in thread
From: Frans Pop @ 2009-07-25 14:02 UTC (permalink / raw)
  To: Alan Cox; +Cc: torvalds, rjw, ray-lk, linux-kernel, akpm, Sergey Senozhatsky

Alan Cox wrote:
> Linus Torvalds wrote:
>> We don't do regressions. If user space depended on old behavior, we
>> don't change behavior
> 
> I think there are two bugs being confused here. The first kdesu bug
> isn't the EIO one. It's just busted userspace code that happened to be
> lucky. 
> It's also impossible to keep that "luck" going. The other problem is the
> EIO - which is a bug.
 
Just out of curiosity ad for reference.
Has this issue already been reported to the KDE people? Any links to a bug 
report or discussions there?

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-25  6:04         ` OGAWA Hirofumi
  2009-07-25 13:31           ` Alan Cox
@ 2009-07-25 14:05           ` Alan Cox
  2009-07-25 14:55             ` OGAWA Hirofumi
                               ` (3 more replies)
  1 sibling, 4 replies; 104+ messages in thread
From: Alan Cox @ 2009-07-25 14:05 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

Actually try this:


commit b0e6bdde87725a5d46273ecc4bd00c54bd675848
Author: Alan Cox <alan@linux.intel.com>
Date:   Sat Jul 25 15:00:04 2009 +0100

    pty: ensure writes hit the reader before close
    
    This is elegant in all the wrong ways. Put the pty into low latency mode (which
    we can do as we always post bytes from user context). The tty_flip_buffer_push
    then always calls into the ldisc which means we clear the ldisc buffer before
    we set the TTY_OTHER_CLOSED flag.
    
    Means pty has subtle knowledge of tty internals we really don't want it to, but
    it fixes the problem for the moment.
    
    Signed-off-by: Alan Cox <alan@linux.intel.com>

diff --git a/drivers/char/pty.c b/drivers/char/pty.c
index 6e6942c..87d729b 100644
--- a/drivers/char/pty.c
+++ b/drivers/char/pty.c
@@ -47,10 +47,12 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
 	}
 	wake_up_interruptible(&tty->read_wait);
 	wake_up_interruptible(&tty->write_wait);
+	
 	tty->packet = 0;
 	if (!tty->link)
 		return;
 	tty->link->packet = 0;
+	tty_flip_buffer_push(tty->link);
 	set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
 	wake_up_interruptible(&tty->link->read_wait);
 	wake_up_interruptible(&tty->link->write_wait);
@@ -207,6 +209,7 @@ static int pty_open(struct tty_struct *tty, struct file *filp)
 	clear_bit(TTY_OTHER_CLOSED, &tty->link->flags);
 	set_bit(TTY_THROTTLED, &tty->flags);
 	retval = 0;
+	tty->low_latency = 1;
 out:
 	return retval;
 }

^ permalink raw reply related	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-25 14:05           ` Alan Cox
@ 2009-07-25 14:55             ` OGAWA Hirofumi
  2009-07-25 15:32               ` Alan Cox
  2009-07-25 20:12             ` [Regression] " Rafael J. Wysocki
                               ` (2 subsequent siblings)
  3 siblings, 1 reply; 104+ messages in thread
From: OGAWA Hirofumi @ 2009-07-25 14:55 UTC (permalink / raw)
  To: Alan Cox; +Cc: Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

> Actually try this:

Thanks. This patch improved situation. However, if slave writes big data
to buffer, it seems we still have the problem.

> +	tty_flip_buffer_push(tty->link);

This is handling the pending buffer, but in flush_to_ldisc(), if
!tty->receive_room, it seems still delay the ->receive_buf().

>  	set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
>  	wake_up_interruptible(&tty->link->read_wait);
>  	wake_up_interruptible(&tty->link->write_wait);

Thanks.
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>


#define _GNU_SOURCE
#define BIG_BUF
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <error.h>
#include <limits.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/wait.h>
#include <unistd.h>

static char pts_name[PATH_MAX];

static int open_pty(void)
{
	int master;
	char *name;

	master = getpt();
	if (master < 0)
		return -1;

	if (grantpt(master) < 0 || unlockpt(master) < 0)
		goto close_master;
#if 0
	{
		int on = 1;
		ioctl(master, FIONBIO, &on);
	}
#endif
	name = ptsname(master);
	if (name == NULL)
		goto close_master;

	strcpy(pts_name, name);

	return master;

close_master:
	close(master);
	return -1;
}

static pid_t child(int master)
{
	pid_t pid;
	int slave;

	pid = fork();
	if (pid < 0)
		error(1, errno, "%s: fork", __func__);

	if (pid == 0) {
		slave = open(pts_name, O_RDWR);
		if (slave < 0)
			error(1, errno, "%s: open", __func__);

		close(master);
		dup2(slave, 0);
		dup2(slave, 1);
		dup2(slave, 2);
		close(slave);

#ifdef BIG_BUF
	{
		char buf[4096];
		size_t size;

		memset(buf, '-', sizeof(buf));
		size = 0;
		while (size < 8192) {
			ssize_t r = write(STDOUT_FILENO, buf, sizeof(buf));
			if (r < 0)
				error(1, errno, "%s: write", __func__);
			size += r;
		}
	}
#else
		printf("1-----------------------------------------------\n");
		printf("2-----------------------------------------------\n");
		printf("3-----------------------------------------------\n");
		printf("4-----------------------------------------------\n");
		printf("5-----------------------------------------------\n");
		printf("6-----------------------------------------------\n");
		printf("7-----------------------------------------------\n");
		printf("8-----------------------------------------------\n");
		printf("9-----------------------------------------------\n");
#endif
		exit(0);
	}

	return pid;
}

int main()
{
	pid_t pid;
	int master;

	master = open_pty();
	if (master < 0)
		error(1, errno, "%s: open_pty", __func__);

	pid = child(master);

	waitpid(pid, NULL, 0);
	while (1) {
		char buf[4096];
		ssize_t size;

		size = read(master, buf, sizeof(buf));
		if (size < 0) {
			if (errno == EAGAIN) {
				printf("EAGAIN\n");
				continue;
			}
			error(1, errno, "%s: read", __func__);
		}
		if (size == 0)
			break;
#ifdef BIG_BUF
		printf("size %zd\n", size);
#else
		write(STDOUT_FILENO, buf, size);
#endif
	}
	
	return 0;
}

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-25 14:55             ` OGAWA Hirofumi
@ 2009-07-25 15:32               ` Alan Cox
  2009-07-26 11:51                 ` OGAWA Hirofumi
  0 siblings, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-25 15:32 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Sat, 25 Jul 2009 23:55:39 +0900
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> wrote:

> Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
> 
> > Actually try this:
> 
> Thanks. This patch improved situation. However, if slave writes big data
> to buffer, it seems we still have the problem.
> 
> > +	tty_flip_buffer_push(tty->link);
> 
> This is handling the pending buffer, but in flush_to_ldisc(), if
> !tty->receive_room, it seems still delay the ->receive_buf().

Agreed - we could

	wait_event_interruptible(tty->write_wait,
			tty->link->receive_room);

or similar.

Good to know the initial fix works. To actually do it cleanly probably
wants a way to pass a logical channel close through the tty layer which
isn't I think too hard

	set a new TTY_OTHER_CLOSING in the pty code
	set TTY_OTHER_CLOSED in the flip_buffer_push code if we empty the
	buffer queue and TTY_OTHER_CLOSING is set.

That would also avoid using tty->low_latency=1 in the pty layer which I
worry may harm PPP gateway performance and the like.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-25 12:07     ` Alan Cox
@ 2009-07-25 16:18       ` Aneesh Kumar K.V
  2009-07-25 17:06         ` Aneesh Kumar K.V
  0 siblings, 1 reply; 104+ messages in thread
From: Aneesh Kumar K.V @ 2009-07-25 16:18 UTC (permalink / raw)
  To: Alan Cox; +Cc: Ray Lee, Rafael J. Wysocki, LKML, Andrew Morton

On Sat, Jul 25, 2009 at 01:07:03PM +0100, Alan Cox wrote:
> > > See the thread starting here: ("possible regression with pty.c commit")
> > >    http://lkml.org/lkml/2009/7/11/125
> > 
> > I am also facing a similar problem.
> > 
> > http://bugzilla.kernel.org/show_bug.cgi?id=13815
> 
> Probably something like this fixes it. I'll be working on that Monday/Tuesday
> along with various other bugs that need a review.
> 
> --
> 
> 
> pty: ensure writes hit the reader before close
> 
> From: Alan Cox <alan@linux.intel.com>
> 
> Logically we move the buffering from one side to the other
> ---
> 
>  drivers/char/pty.c |   11 +++++++++++
>  1 files changed, 11 insertions(+), 0 deletions(-)
> 
> 
> diff --git a/drivers/char/pty.c b/drivers/char/pty.c
> index 6e6942c..7555890 100644
> --- a/drivers/char/pty.c
> +++ b/drivers/char/pty.c
> @@ -36,6 +36,15 @@ static struct tty_driver *ptm_driver;
>  static struct tty_driver *pts_driver;
>  #endif
> 
> +static int pty_empty(struct tty_struct *tty)
> +{
> +	if (tty->buf.memory_used == 0)
> +		return 1;
> +	if (test_bit(TTY_HUPPED, &tty->flags))
> +		return 1;
> +	return 0;
> +}
> +
>  static void pty_close(struct tty_struct *tty, struct file *filp)
>  {
>  	BUG_ON(!tty);
> @@ -47,9 +56,11 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
>  	}
>  	wake_up_interruptible(&tty->read_wait);
>  	wake_up_interruptible(&tty->write_wait);
> +	
>  	tty->packet = 0;
>  	if (!tty->link)
>  		return;
> +	wait_event_interruptible(tty->write_wait, pty_empty(tty->link));
>  	tty->link->packet = 0;
>  	set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
>  	wake_up_interruptible(&tty->link->read_wait);


After applying the patch on Fedora 10 the system bootup hangs after modrpobe.
on ubuntu jaunty i have the system booting fine. But trying to recompile
the test prg on emacs gives me the error message 

-UUU:%*--F1  *compilation*   All (1,0)      (Compilation:run Compiling)----<E> -------
A compilation process is running; kill it? (yes or no)

So i guess even though it gets the error information it cause emacs to think that
the compliation process is still running. The process details listed by emacs
Proc        Status   Buffer        Tty        Command 
----        ------   ------        ---        -------
compilation run      *compilation* /dev/pts/2 /bin/bash -c cc -g a.c                                                                     

But a ps -eaf doesn't list the command running. So something more  is going on. 

-aneesh

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-25 16:18       ` Aneesh Kumar K.V
@ 2009-07-25 17:06         ` Aneesh Kumar K.V
  0 siblings, 0 replies; 104+ messages in thread
From: Aneesh Kumar K.V @ 2009-07-25 17:06 UTC (permalink / raw)
  To: Alan Cox; +Cc: Ray Lee, Rafael J. Wysocki, LKML, Andrew Morton

On Sat, Jul 25, 2009 at 09:48:44PM +0530, Aneesh Kumar K.V wrote:
> On Sat, Jul 25, 2009 at 01:07:03PM +0100, Alan Cox wrote:
> > > > See the thread starting here: ("possible regression with pty.c commit")
> > > >    http://lkml.org/lkml/2009/7/11/125
> > > 
> > > I am also facing a similar problem.
> > > 
> > > http://bugzilla.kernel.org/show_bug.cgi?id=13815
> > 
> > Probably something like this fixes it. I'll be working on that Monday/Tuesday
> > along with various other bugs that need a review.
> > 
> > --
> > 
> > 
> > pty: ensure writes hit the reader before close
> > 
> > From: Alan Cox <alan@linux.intel.com>
> > 
> > Logically we move the buffering from one side to the other
> > ---
> > 
> >  drivers/char/pty.c |   11 +++++++++++
> >  1 files changed, 11 insertions(+), 0 deletions(-)
> > 
> > 
> > diff --git a/drivers/char/pty.c b/drivers/char/pty.c
> > index 6e6942c..7555890 100644
> > --- a/drivers/char/pty.c
> > +++ b/drivers/char/pty.c
> > @@ -36,6 +36,15 @@ static struct tty_driver *ptm_driver;
> >  static struct tty_driver *pts_driver;
> >  #endif
> > 
> > +static int pty_empty(struct tty_struct *tty)
> > +{
> > +	if (tty->buf.memory_used == 0)
> > +		return 1;
> > +	if (test_bit(TTY_HUPPED, &tty->flags))
> > +		return 1;
> > +	return 0;
> > +}
> > +
> >  static void pty_close(struct tty_struct *tty, struct file *filp)
> >  {
> >  	BUG_ON(!tty);
> > @@ -47,9 +56,11 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
> >  	}
> >  	wake_up_interruptible(&tty->read_wait);
> >  	wake_up_interruptible(&tty->write_wait);
> > +	
> >  	tty->packet = 0;
> >  	if (!tty->link)
> >  		return;
> > +	wait_event_interruptible(tty->write_wait, pty_empty(tty->link));
> >  	tty->link->packet = 0;
> >  	set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
> >  	wake_up_interruptible(&tty->link->read_wait);
> 
> 
> After applying the patch on Fedora 10 the system bootup hangs after modrpobe.
> on ubuntu jaunty i have the system booting fine. But trying to recompile
> the test prg on emacs gives me the error message 
> 
> -UUU:%*--F1  *compilation*   All (1,0)      (Compilation:run Compiling)----<E> -------
> A compilation process is running; kill it? (yes or no)
> 
> So i guess even though it gets the error information it cause emacs to think that
> the compliation process is still running. The process details listed by emacs
> Proc        Status   Buffer        Tty        Command 
> ----        ------   ------        ---        -------
> compilation run      *compilation* /dev/pts/2 /bin/bash -c cc -g a.c                                                                     
> 
> But a ps -eaf doesn't list the command running. So something more  is going on. 
> 

ok i have in the process list

kvaneesh  6584  6583  0 22:34 pts/6    00:00:00 [cc]

-aneesh

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-25 14:05           ` Alan Cox
  2009-07-25 14:55             ` OGAWA Hirofumi
@ 2009-07-25 20:12             ` Rafael J. Wysocki
  2009-07-26 17:41             ` Aneesh Kumar K.V
  2009-07-29  2:20             ` Gene Heskett
  3 siblings, 0 replies; 104+ messages in thread
From: Rafael J. Wysocki @ 2009-07-25 20:12 UTC (permalink / raw)
  To: Alan Cox; +Cc: OGAWA Hirofumi, Linus Torvalds, Ray Lee, LKML, Andrew Morton

On Saturday 25 July 2009, Alan Cox wrote:
> Actually try this:
> 
> 
> commit b0e6bdde87725a5d46273ecc4bd00c54bd675848
> Author: Alan Cox <alan@linux.intel.com>
> Date:   Sat Jul 25 15:00:04 2009 +0100
> 
>     pty: ensure writes hit the reader before close
>     
>     This is elegant in all the wrong ways. Put the pty into low latency mode (which
>     we can do as we always post bytes from user context). The tty_flip_buffer_push
>     then always calls into the ldisc which means we clear the ldisc buffer before
>     we set the TTY_OTHER_CLOSED flag.
>     
>     Means pty has subtle knowledge of tty internals we really don't want it to, but
>     it fixes the problem for the moment.
>     
>     Signed-off-by: Alan Cox <alan@linux.intel.com>

Works for me, thanks!

Best,
Rafael

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-25 14:02           ` Frans Pop
@ 2009-07-25 20:16             ` Rafael J. Wysocki
  2009-07-25 21:03               ` Alan Cox
  2009-07-26 15:51             ` Sergey Senozhatsky
  1 sibling, 1 reply; 104+ messages in thread
From: Rafael J. Wysocki @ 2009-07-25 20:16 UTC (permalink / raw)
  To: Frans Pop
  Cc: Alan Cox, torvalds, ray-lk, linux-kernel, akpm, Sergey Senozhatsky

On Saturday 25 July 2009, Frans Pop wrote:
> Alan Cox wrote:
> > Linus Torvalds wrote:
> >> We don't do regressions. If user space depended on old behavior, we
> >> don't change behavior
> > 
> > I think there are two bugs being confused here. The first kdesu bug
> > isn't the EIO one. It's just busted userspace code that happened to be
> > lucky. 
> > It's also impossible to keep that "luck" going. The other problem is the
> > EIO - which is a bug.
>  
> Just out of curiosity ad for reference.
> Has this issue already been reported to the KDE people? Any links to a bug 
> report or discussions there?

Not that I know of.  It's probably worth reporting, though.

Best,
Rafael

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-25 20:16             ` Rafael J. Wysocki
@ 2009-07-25 21:03               ` Alan Cox
  0 siblings, 0 replies; 104+ messages in thread
From: Alan Cox @ 2009-07-25 21:03 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Frans Pop, torvalds, ray-lk, linux-kernel, akpm, Sergey Senozhatsky

> > Just out of curiosity ad for reference.
> > Has this issue already been reported to the KDE people? Any links to a bug 
> > report or discussions there?
> 
> Not that I know of.  It's probably worth reporting, though.

The EIO one hasn't because it isn't their bug, the other one has.

Alan

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-25 15:32               ` Alan Cox
@ 2009-07-26 11:51                 ` OGAWA Hirofumi
  2009-07-27 10:57                   ` Alan Cox
  0 siblings, 1 reply; 104+ messages in thread
From: OGAWA Hirofumi @ 2009-07-26 11:51 UTC (permalink / raw)
  To: Alan Cox; +Cc: Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

> Good to know the initial fix works. To actually do it cleanly probably
> wants a way to pass a logical channel close through the tty layer which
> isn't I think too hard
>
> 	set a new TTY_OTHER_CLOSING in the pty code
> 	set TTY_OTHER_CLOSED in the flip_buffer_push code if we empty the
> 	buffer queue and TTY_OTHER_CLOSING is set.
>
> That would also avoid using tty->low_latency=1 in the pty layer which I
> worry may harm PPP gateway performance and the like.

I see. It sounds like good thing. The attached patch or something?
Anyway, I'm not familiar with the tty stuff obviously, so, I'm not sure
whether this patch is right or not.

If needed, I'll test the new patch.

Thanks.


Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
---

 drivers/char/pty.c        |   19 ++++++++++++++++---
 drivers/char/tty_buffer.c |   28 ++++++++++++++++++++++++++--
 include/linux/tty.h       |   13 +++++++------
 3 files changed, 49 insertions(+), 11 deletions(-)

diff -puN drivers/char/pty.c~pty-fixes2 drivers/char/pty.c
--- linux-2.6/drivers/char/pty.c~pty-fixes2	2009-07-26 20:04:17.000000000 +0900
+++ linux-2.6-hirofumi/drivers/char/pty.c	2009-07-26 20:06:17.000000000 +0900
@@ -51,11 +51,19 @@ static void pty_close(struct tty_struct 
 	tty->packet = 0;
 	if (!tty->link)
 		return;
-	tty->link->packet = 0;
+
+	/*
+	 * This try to flush pending tty->buf. And after flushed all
+	 * pending tty->buf, TTY_OTHER_CLOSED will be set.
+	 */
+	set_bit(TTY_OTHER_CLOSING, &tty->link->flags);
 	tty_flip_buffer_push(tty->link);
+#if 0
+	tty->link->packet = 0;
 	set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
 	wake_up_interruptible(&tty->link->read_wait);
 	wake_up_interruptible(&tty->link->write_wait);
+#endif
 	if (tty->driver->subtype == PTY_TYPE_MASTER) {
 		set_bit(TTY_OTHER_CLOSED, &tty->flags);
 #ifdef CONFIG_UNIX98_PTYS
@@ -199,17 +207,22 @@ static int pty_open(struct tty_struct *t
 		goto out;
 
 	retval = -EIO;
-	if (test_bit(TTY_OTHER_CLOSED, &tty->flags))
+	if (test_bit(TTY_OTHER_CLOSING, &tty->flags) ||
+	    test_bit(TTY_OTHER_CLOSED, &tty->flags))
 		goto out;
 	if (test_bit(TTY_PTY_LOCK, &tty->link->flags))
 		goto out;
 	if (tty->link->count != 1)
 		goto out;
 
+	spin_lock_irq(&tty->link->buf.lock);
+	clear_bit(TTY_OTHER_CLOSING, &tty->link->flags);
 	clear_bit(TTY_OTHER_CLOSED, &tty->link->flags);
+	spin_unlock_irq(&tty->link->buf.lock);
+
 	set_bit(TTY_THROTTLED, &tty->flags);
 	retval = 0;
-	tty->low_latency = 1;
+//	tty->low_latency = 1;
 out:
 	return retval;
 }
diff -puN drivers/char/tty_buffer.c~pty-fixes2 drivers/char/tty_buffer.c
--- linux-2.6/drivers/char/tty_buffer.c~pty-fixes2	2009-07-26 20:04:17.000000000 +0900
+++ linux-2.6-hirofumi/drivers/char/tty_buffer.c	2009-07-26 20:04:46.000000000 +0900
@@ -74,6 +74,20 @@ static struct tty_buffer *tty_buffer_all
 	return p;
 }
 
+/* must hold tty->buf.lock */
+static void tty_check_other_closing(struct tty_struct *tty)
+{
+	if (test_bit(TTY_OTHER_CLOSING, &tty->flags)) {
+		printk("%s: tty %p, closed\n", __func__, tty);
+		tty->link->packet = 0;
+		set_bit(TTY_OTHER_CLOSED, &tty->flags);
+		wake_up_interruptible(&tty->read_wait);
+		wake_up_interruptible(&tty->write_wait);
+		/* Clear TTY_OTHER_CLOSING after set TTY_OTHER_CLOSED */
+		clear_bit(TTY_OTHER_CLOSING, &tty->flags);
+	}
+}
+
 /**
  *	tty_buffer_free		-	free a tty buffer
  *	@tty: tty owning the buffer
@@ -119,6 +133,7 @@ static void __tty_buffer_flush(struct tt
 		tty_buffer_free(tty, thead);
 	}
 	tty->buf.tail = NULL;
+	tty_check_other_closing(tty);
 }
 
 /**
@@ -407,8 +422,12 @@ static void flush_to_ldisc(struct work_s
 	unsigned char *flag_buf;
 
 	disc = tty_ldisc_ref(tty);
-	if (disc == NULL)	/*  !TTY_LDISC */
+	if (disc == NULL) {	/*  !TTY_LDISC */
+		spin_lock_irqsave(&tty->buf.lock, flags);
+		tty_check_other_closing(tty);
+		spin_unlock_irqrestore(&tty->buf.lock, flags);
 		return;
+	}
 
 	spin_lock_irqsave(&tty->buf.lock, flags);
 	/* So we know a flush is running */
@@ -419,8 +438,11 @@ static void flush_to_ldisc(struct work_s
 		for (;;) {
 			int count = head->commit - head->read;
 			if (!count) {
-				if (head->next == NULL)
+				if (head->next == NULL) {
+					printk("%s: tty %p, next == NULL\n", __func__, tty);
+					tty_check_other_closing(tty);
 					break;
+				}
 				tbuf = head;
 				head = head->next;
 				tty_buffer_free(tty, tbuf);
@@ -448,9 +470,11 @@ static void flush_to_ldisc(struct work_s
 		/* Restore the queue head */
 		tty->buf.head = head;
 	}
+
 	/* We may have a deferred request to flush the input buffer,
 	   if so pull the chain under the lock and empty the queue */
 	if (test_bit(TTY_FLUSHPENDING, &tty->flags)) {
+		printk("%s: tty %p, flushing\n", __func__, tty);
 		__tty_buffer_flush(tty);
 		clear_bit(TTY_FLUSHPENDING, &tty->flags);
 		wake_up(&tty->read_wait);
diff -puN include/linux/tty.h~pty-fixes2 include/linux/tty.h
--- linux-2.6/include/linux/tty.h~pty-fixes2	2009-07-26 20:04:17.000000000 +0900
+++ linux-2.6-hirofumi/include/linux/tty.h	2009-07-26 20:04:17.000000000 +0900
@@ -309,12 +309,13 @@ struct tty_struct {
  */
 #define TTY_THROTTLED 		0	/* Call unthrottle() at threshold min */
 #define TTY_IO_ERROR 		1	/* Cause an I/O error (may be no ldisc too) */
-#define TTY_OTHER_CLOSED 	2	/* Other side (if any) has closed */
-#define TTY_EXCLUSIVE 		3	/* Exclusive open mode */
-#define TTY_DEBUG 		4	/* Debugging */
-#define TTY_DO_WRITE_WAKEUP 	5	/* Call write_wakeup after queuing new */
-#define TTY_PUSH 		6	/* n_tty private */
-#define TTY_CLOSING 		7	/* ->close() in progress */
+#define TTY_OTHER_CLOSING 	2	/* Other side (if any) is closing */
+#define TTY_OTHER_CLOSED 	3	/* Other side (if any) has closed */
+#define TTY_EXCLUSIVE 		4	/* Exclusive open mode */
+#define TTY_DEBUG 		5	/* Debugging */
+#define TTY_DO_WRITE_WAKEUP 	6	/* Call write_wakeup after queuing new */
+#define TTY_PUSH 		7	/* n_tty private */
+#define TTY_CLOSING 		8	/* ->close() in progress */
 #define TTY_LDISC 		9	/* Line discipline attached */
 #define TTY_LDISC_CHANGING 	10	/* Line discipline changing */
 #define TTY_LDISC_OPEN	 	11	/* Line discipline is open */
_
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-25 14:02           ` Frans Pop
  2009-07-25 20:16             ` Rafael J. Wysocki
@ 2009-07-26 15:51             ` Sergey Senozhatsky
  1 sibling, 0 replies; 104+ messages in thread
From: Sergey Senozhatsky @ 2009-07-26 15:51 UTC (permalink / raw)
  To: Frans Pop
  Cc: Alan Cox, torvalds, rjw, ray-lk, linux-kernel, akpm, Sergey Senozhatsky

[-- Attachment #1: Type: text/plain, Size: 1025 bytes --]

On (07/25/09 16:02), Frans Pop wrote:
> Date: Sat, 25 Jul 2009 16:02:48 +0200
> From: Frans Pop <elendil@planet.nl>
> To: Alan Cox <alan@lxorguk.ukuu.org.uk>
> Cc: torvalds@linux-foundation.org, rjw@sisk.pl, ray-lk@madrabbit.org,
> 	linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
> 	Sergey Senozhatsky <sergey.senozhatsky@mail.by>
> Subject: Re: [Regression] kdesu broken
> User-Agent: KMail/1.9.9
> 
> Alan Cox wrote:
> > Linus Torvalds wrote:
> >> We don't do regressions. If user space depended on old behavior, we
> >> don't change behavior
> > 
> > I think there are two bugs being confused here. The first kdesu bug
> > isn't the EIO one. It's just busted userspace code that happened to be
> > lucky. 
> > It's also impossible to keep that "luck" going. The other problem is the
> > EIO - which is a bug.
>  
> Just out of curiosity ad for reference.
> Has this issue already been reported to the KDE people? Any links to a bug 
> report or discussions there?
> 

Not yet.

Sergey

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 315 bytes --]

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-25 14:05           ` Alan Cox
  2009-07-25 14:55             ` OGAWA Hirofumi
  2009-07-25 20:12             ` [Regression] " Rafael J. Wysocki
@ 2009-07-26 17:41             ` Aneesh Kumar K.V
  2009-07-29  2:20             ` Gene Heskett
  3 siblings, 0 replies; 104+ messages in thread
From: Aneesh Kumar K.V @ 2009-07-26 17:41 UTC (permalink / raw)
  To: Alan Cox
  Cc: OGAWA Hirofumi, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML,
	Andrew Morton

On Sat, Jul 25, 2009 at 03:05:10PM +0100, Alan Cox wrote:
> Actually try this:
> 
> 
> commit b0e6bdde87725a5d46273ecc4bd00c54bd675848
> Author: Alan Cox <alan@linux.intel.com>
> Date:   Sat Jul 25 15:00:04 2009 +0100
> 
>     pty: ensure writes hit the reader before close
>     
>     This is elegant in all the wrong ways. Put the pty into low latency mode (which
>     we can do as we always post bytes from user context). The tty_flip_buffer_push
>     then always calls into the ldisc which means we clear the ldisc buffer before
>     we set the TTY_OTHER_CLOSED flag.
>     
>     Means pty has subtle knowledge of tty internals we really don't want it to, but
>     it fixes the problem for the moment.
>     
>     Signed-off-by: Alan Cox <alan@linux.intel.com>
> 
> diff --git a/drivers/char/pty.c b/drivers/char/pty.c
> index 6e6942c..87d729b 100644
> --- a/drivers/char/pty.c
> +++ b/drivers/char/pty.c
> @@ -47,10 +47,12 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
>  	}
>  	wake_up_interruptible(&tty->read_wait);
>  	wake_up_interruptible(&tty->write_wait);
> +	
>  	tty->packet = 0;
>  	if (!tty->link)
>  		return;
>  	tty->link->packet = 0;
> +	tty_flip_buffer_push(tty->link);
>  	set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
>  	wake_up_interruptible(&tty->link->read_wait);
>  	wake_up_interruptible(&tty->link->write_wait);
> @@ -207,6 +209,7 @@ static int pty_open(struct tty_struct *tty, struct file *filp)
>  	clear_bit(TTY_OTHER_CLOSED, &tty->link->flags);
>  	set_bit(TTY_THROTTLED, &tty->flags);
>  	retval = 0;
> +	tty->low_latency = 1;
>  out:
>  	return retval;
>  }

This one fixes "the compile in emacs bug" for me.

http://bugzilla.kernel.org/show_bug.cgi?id=13815

-aneesh

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-26 11:51                 ` OGAWA Hirofumi
@ 2009-07-27 10:57                   ` Alan Cox
  2009-07-27 12:07                     ` OGAWA Hirofumi
  0 siblings, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-27 10:57 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

> I see. It sounds like good thing. The attached patch or something?
> Anyway, I'm not familiar with the tty stuff obviously, so, I'm not sure
> whether this patch is right or not.

It turns out to be a little bit trickier than I thought because of open
v close v flush_to_ldisc races (some of the open/close ones being long
standing).

We now use tty->buf.lock to keep EOF/EOFPENDING/OTHER_CLOSED all in order
together. That has no real cost as we already hold the buf.lock in the hot
path which is flush_to_ldisc.

Anyway this is my current thoughts (not yet given a testing)


diff --git a/drivers/char/n_tty.c b/drivers/char/n_tty.c
index ff47907..acae995 100644
--- a/drivers/char/n_tty.c
+++ b/drivers/char/n_tty.c
@@ -1777,7 +1777,8 @@ do_it_again:
 			tty->minimum_to_wake = (minimum - (b - buf));
 
 		if (!input_available_p(tty, 0)) {
-			if (test_bit(TTY_OTHER_CLOSED, &tty->flags)) {
+			if (test_bit(TTY_EOF, &tty->flags)) {
+				/* PTY pair closed and all data consumed */
 				retval = -EIO;
 				break;
 			}
diff --git a/drivers/char/pty.c b/drivers/char/pty.c
index 6e6942c..de10cc0 100644
--- a/drivers/char/pty.c
+++ b/drivers/char/pty.c
@@ -38,6 +38,9 @@ static struct tty_driver *pts_driver;
 
 static void pty_close(struct tty_struct *tty, struct file *filp)
 {
+	struct tty_struct *pair;
+	unsigned long flags;
+
 	BUG_ON(!tty);
 	if (tty->driver->subtype == PTY_TYPE_MASTER)
 		WARN_ON(tty->count > 1);
@@ -47,13 +50,22 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
 	}
 	wake_up_interruptible(&tty->read_wait);
 	wake_up_interruptible(&tty->write_wait);
+	
 	tty->packet = 0;
-	if (!tty->link)
+	pair = tty->link;
+	if (!pair)
 		return;
-	tty->link->packet = 0;
-	set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
-	wake_up_interruptible(&tty->link->read_wait);
-	wake_up_interruptible(&tty->link->write_wait);
+
+	spin_lock_irqsave(&pair->buf.lock, flags);
+	pair->packet = 0;
+	/* Indicate that the other end is now closed, set the
+	   ENDPENDING marker so that the true end can be processed by
+	   the line discipline */
+	set_bit(TTY_EOFPENDING, &pair->flags);
+	set_bit(TTY_OTHER_CLOSED, &pair->flags);
+	spin_unlock_irqrestore(&pair->buf.lock, flags);
+	wake_up_interruptible(&pair->read_wait);
+	wake_up_interruptible(&pair->write_wait);
 	if (tty->driver->subtype == PTY_TYPE_MASTER) {
 		set_bit(TTY_OTHER_CLOSED, &tty->flags);
 #ifdef CONFIG_UNIX98_PTYS
@@ -180,7 +192,6 @@ static void pty_flush_buffer(struct tty_struct *tty)
 
 	if (!to)
 		return;
-	/* tty_buffer_flush(to); FIXME */
 	if (to->packet) {
 		spin_lock_irqsave(&tty->ctrl_lock, flags);
 		tty->ctrl_status |= TIOCPKT_FLUSHWRITE;
@@ -191,23 +202,30 @@ static void pty_flush_buffer(struct tty_struct *tty)
 
 static int pty_open(struct tty_struct *tty, struct file *filp)
 {
-	int	retval = -ENODEV;
+	int	retval = -EIO;
+	unsigned long flags;
+	struct tty_struct *pair;
 
-	if (!tty || !tty->link)
-		goto out;
+	if (tty == NULL || (pair = tty->link) == NULL)
+		return -ENODEV;
 
-	retval = -EIO;
 	if (test_bit(TTY_OTHER_CLOSED, &tty->flags))
+		return -EIO;
+	spin_lock_irqsave(&pair->buf.lock, flags);
+	if (test_bit(TTY_PTY_LOCK, &pair->flags))
 		goto out;
-	if (test_bit(TTY_PTY_LOCK, &tty->link->flags))
-		goto out;
-	if (tty->link->count != 1)
+	if (pair->count != 1)
 		goto out;
 
-	clear_bit(TTY_OTHER_CLOSED, &tty->link->flags);
+	clear_bit(TTY_OTHER_CLOSED, &pair->flags);
+	/* The buf.lock stops this racing a flush_to_ldisc from
+	   the other end */
+	clear_bit(TTY_EOFPENDING, &pair->flags);
+	clear_bit(TTY_EOF, &pair->flags);
 	set_bit(TTY_THROTTLED, &tty->flags);
 	retval = 0;
 out:
+	spin_unlock_irqrestore(&pair->buf.lock, flags);
 	return retval;
 }
 
diff --git a/drivers/char/tty_buffer.c b/drivers/char/tty_buffer.c
index 810ee25..448cd1d 100644
--- a/drivers/char/tty_buffer.c
+++ b/drivers/char/tty_buffer.c
@@ -119,6 +119,12 @@ static void __tty_buffer_flush(struct tty_struct *tty)
 		tty_buffer_free(tty, thead);
 	}
 	tty->buf.tail = NULL;
+	/* We can EOF a pty/tty pair with a flush as well as by consuming
+	   all the data left over */
+	if (test_bit(TTY_EOFPENDING, &tty->flags)) {
+		set_bit(TTY_EOF, &tty->flags);
+		wake_up(&tty->read_wait);
+	}
 }
 
 /**
@@ -455,6 +461,10 @@ static void flush_to_ldisc(struct work_struct *work)
 		clear_bit(TTY_FLUSHPENDING, &tty->flags);
 		wake_up(&tty->read_wait);
 	}
+	if (tty->buf.head == NULL && test_bit(TTY_EOFPENDING, &tty->flags)) {
+		wake_up(&tty->read_wait);
+		set_bit(TTY_EOF, &tty->flags);
+	}
 	clear_bit(TTY_FLUSHING, &tty->flags);
 	spin_unlock_irqrestore(&tty->buf.lock, flags);
 
diff --git a/include/linux/tty.h b/include/linux/tty.h
index 85aa525..427d107 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -321,6 +321,8 @@ struct tty_struct {
 #define TTY_LDISC 		9	/* Line discipline attached */
 #define TTY_LDISC_CHANGING 	10	/* Line discipline changing */
 #define TTY_LDISC_OPEN	 	11	/* Line discipline is open */
+#define TTY_EOF			12	/* TTY/PTY pair at EOF */
+#define TTY_EOFPENDING		13	/* TTY/PTY pair EOF when data emptied */
 #define TTY_HW_COOK_OUT 	14	/* Hardware can do output cooking */
 #define TTY_HW_COOK_IN 		15	/* Hardware can do input cooking */
 #define TTY_PTY_LOCK 		16	/* pty private */

^ permalink raw reply related	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-27 10:57                   ` Alan Cox
@ 2009-07-27 12:07                     ` OGAWA Hirofumi
  2009-07-27 12:46                       ` OGAWA Hirofumi
  2009-07-27 13:23                       ` [PATCH] " Alan Cox
  0 siblings, 2 replies; 104+ messages in thread
From: OGAWA Hirofumi @ 2009-07-27 12:07 UTC (permalink / raw)
  To: Alan Cox; +Cc: Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

>> I see. It sounds like good thing. The attached patch or something?
>> Anyway, I'm not familiar with the tty stuff obviously, so, I'm not sure
>> whether this patch is right or not.
>
> It turns out to be a little bit trickier than I thought because of open
> v close v flush_to_ldisc races (some of the open/close ones being long
> standing).
>
> We now use tty->buf.lock to keep EOF/EOFPENDING/OTHER_CLOSED all in order
> together. That has no real cost as we already hold the buf.lock in the hot
> path which is flush_to_ldisc.
>
> Anyway this is my current thoughts (not yet given a testing)

I see. Looks like clean to me.

> +	spin_lock_irqsave(&pair->buf.lock, flags);
> +	pair->packet = 0;

I worried "pair->packet = 0" when I'm thinking this. I guess it would be
changed more early than before. Is it ok?

Thanks.
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-27 12:07                     ` OGAWA Hirofumi
@ 2009-07-27 12:46                       ` OGAWA Hirofumi
  2009-07-27 13:23                       ` [PATCH] " Alan Cox
  1 sibling, 0 replies; 104+ messages in thread
From: OGAWA Hirofumi @ 2009-07-27 12:46 UTC (permalink / raw)
  To: Alan Cox; +Cc: Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> writes:

>> +	spin_lock_irqsave(&pair->buf.lock, flags);
>> +	pair->packet = 0;
>
> I worried "pair->packet = 0" when I'm thinking this. I guess it would be
> changed more early than before. Is it ok?

Ah, maybe, this is ok. Because n_tty_read() checks it, so, I guess there
is no big difference with before change.

Thanks.
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

^ permalink raw reply	[flat|nested] 104+ messages in thread

* [PATCH] kdesu broken
  2009-07-27 12:07                     ` OGAWA Hirofumi
  2009-07-27 12:46                       ` OGAWA Hirofumi
@ 2009-07-27 13:23                       ` Alan Cox
  2009-07-27 13:50                         ` OGAWA Hirofumi
  2009-07-27 13:58                         ` Aneesh Kumar K.V
  1 sibling, 2 replies; 104+ messages in thread
From: Alan Cox @ 2009-07-27 13:23 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

> I worried "pair->packet = 0" when I'm thinking this. I guess it would be
> changed more early than before. Is it ok?

I think so, and we can get stuck otherwise.


Tested patch:

commit 70325fd1d4341896c17b6f1f1965370b5258d0b1
Author: Alan Cox <alan@linux.intel.com>
Date:   Mon Jul 27 14:18:52 2009 +0100

    pty: ensure writes hit the reader before close
    
    Implement TTY_EOF/EOFPENDING flags so that we can propogate the close of the
    pty through the buffering correctly. The new flag state is locked but the
    tty buffer lock as it relates to buffers, and also because the buffer
    lock is already held in the hot path.
    
    Signed-off-by: Alan Cox <alan@linux.intel.com>

diff --git a/drivers/char/n_tty.c b/drivers/char/n_tty.c
index ff47907..acae995 100644
--- a/drivers/char/n_tty.c
+++ b/drivers/char/n_tty.c
@@ -1777,7 +1777,8 @@ do_it_again:
 			tty->minimum_to_wake = (minimum - (b - buf));
 
 		if (!input_available_p(tty, 0)) {
-			if (test_bit(TTY_OTHER_CLOSED, &tty->flags)) {
+			if (test_bit(TTY_EOF, &tty->flags)) {
+				/* PTY pair closed and all data consumed */
 				retval = -EIO;
 				break;
 			}
diff --git a/drivers/char/pty.c b/drivers/char/pty.c
index 6e6942c..de10cc0 100644
--- a/drivers/char/pty.c
+++ b/drivers/char/pty.c
@@ -38,6 +38,9 @@ static struct tty_driver *pts_driver;
 
 static void pty_close(struct tty_struct *tty, struct file *filp)
 {
+	struct tty_struct *pair;
+	unsigned long flags;
+
 	BUG_ON(!tty);
 	if (tty->driver->subtype == PTY_TYPE_MASTER)
 		WARN_ON(tty->count > 1);
@@ -47,13 +50,22 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
 	}
 	wake_up_interruptible(&tty->read_wait);
 	wake_up_interruptible(&tty->write_wait);
+	
 	tty->packet = 0;
-	if (!tty->link)
+	pair = tty->link;
+	if (!pair)
 		return;
-	tty->link->packet = 0;
-	set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
-	wake_up_interruptible(&tty->link->read_wait);
-	wake_up_interruptible(&tty->link->write_wait);
+
+	spin_lock_irqsave(&pair->buf.lock, flags);
+	pair->packet = 0;
+	/* Indicate that the other end is now closed, set the
+	   ENDPENDING marker so that the true end can be processed by
+	   the line discipline */
+	set_bit(TTY_EOFPENDING, &pair->flags);
+	set_bit(TTY_OTHER_CLOSED, &pair->flags);
+	spin_unlock_irqrestore(&pair->buf.lock, flags);
+	wake_up_interruptible(&pair->read_wait);
+	wake_up_interruptible(&pair->write_wait);
 	if (tty->driver->subtype == PTY_TYPE_MASTER) {
 		set_bit(TTY_OTHER_CLOSED, &tty->flags);
 #ifdef CONFIG_UNIX98_PTYS
@@ -180,7 +192,6 @@ static void pty_flush_buffer(struct tty_struct *tty)
 
 	if (!to)
 		return;
-	/* tty_buffer_flush(to); FIXME */
 	if (to->packet) {
 		spin_lock_irqsave(&tty->ctrl_lock, flags);
 		tty->ctrl_status |= TIOCPKT_FLUSHWRITE;
@@ -191,23 +202,30 @@ static void pty_flush_buffer(struct tty_struct *tty)
 
 static int pty_open(struct tty_struct *tty, struct file *filp)
 {
-	int	retval = -ENODEV;
+	int	retval = -EIO;
+	unsigned long flags;
+	struct tty_struct *pair;
 
-	if (!tty || !tty->link)
-		goto out;
+	if (tty == NULL || (pair = tty->link) == NULL)
+		return -ENODEV;
 
-	retval = -EIO;
 	if (test_bit(TTY_OTHER_CLOSED, &tty->flags))
+		return -EIO;
+	spin_lock_irqsave(&pair->buf.lock, flags);
+	if (test_bit(TTY_PTY_LOCK, &pair->flags))
 		goto out;
-	if (test_bit(TTY_PTY_LOCK, &tty->link->flags))
-		goto out;
-	if (tty->link->count != 1)
+	if (pair->count != 1)
 		goto out;
 
-	clear_bit(TTY_OTHER_CLOSED, &tty->link->flags);
+	clear_bit(TTY_OTHER_CLOSED, &pair->flags);
+	/* The buf.lock stops this racing a flush_to_ldisc from
+	   the other end */
+	clear_bit(TTY_EOFPENDING, &pair->flags);
+	clear_bit(TTY_EOF, &pair->flags);
 	set_bit(TTY_THROTTLED, &tty->flags);
 	retval = 0;
 out:
+	spin_unlock_irqrestore(&pair->buf.lock, flags);
 	return retval;
 }
 
diff --git a/drivers/char/tty_buffer.c b/drivers/char/tty_buffer.c
index 810ee25..19a7ced 100644
--- a/drivers/char/tty_buffer.c
+++ b/drivers/char/tty_buffer.c
@@ -119,6 +119,12 @@ static void __tty_buffer_flush(struct tty_struct *tty)
 		tty_buffer_free(tty, thead);
 	}
 	tty->buf.tail = NULL;
+	/* We can EOF a pty/tty pair with a flush as well as by consuming
+	   all the data left over */
+	if (test_bit(TTY_EOFPENDING, &tty->flags)) {
+		set_bit(TTY_EOF, &tty->flags);
+		wake_up(&tty->read_wait);
+	}
 }
 
 /**
@@ -405,6 +411,7 @@ static void flush_to_ldisc(struct work_struct *work)
 	struct tty_buffer *tbuf, *head;
 	char *char_buf;
 	unsigned char *flag_buf;
+	int done = 1;
 
 	disc = tty_ldisc_ref(tty);
 	if (disc == NULL)	/*  !TTY_LDISC */
@@ -433,10 +440,13 @@ static void flush_to_ldisc(struct work_struct *work)
 				break;
 			if (!tty->receive_room) {
 				schedule_delayed_work(&tty->buf.work, 1);
+				done = 0;
 				break;
 			}
-			if (count > tty->receive_room)
+			if (count > tty->receive_room) {
 				count = tty->receive_room;
+				done = 0;
+			}
 			char_buf = head->char_buf_ptr + head->read;
 			flag_buf = head->flag_buf_ptr + head->read;
 			head->read += count;
@@ -454,6 +464,10 @@ static void flush_to_ldisc(struct work_struct *work)
 		__tty_buffer_flush(tty);
 		clear_bit(TTY_FLUSHPENDING, &tty->flags);
 		wake_up(&tty->read_wait);
+	} else if (done && test_bit(TTY_EOFPENDING, &tty->flags)) {
+		/* The last bits hit the ldisc so set EOF */
+		wake_up(&tty->read_wait);
+		set_bit(TTY_EOF, &tty->flags);
 	}
 	clear_bit(TTY_FLUSHING, &tty->flags);
 	spin_unlock_irqrestore(&tty->buf.lock, flags);
diff --git a/include/linux/tty.h b/include/linux/tty.h
index 85aa525..427d107 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -321,6 +321,8 @@ struct tty_struct {
 #define TTY_LDISC 		9	/* Line discipline attached */
 #define TTY_LDISC_CHANGING 	10	/* Line discipline changing */
 #define TTY_LDISC_OPEN	 	11	/* Line discipline is open */
+#define TTY_EOF			12	/* TTY/PTY pair at EOF */
+#define TTY_EOFPENDING		13	/* TTY/PTY pair EOF when data emptied */
 #define TTY_HW_COOK_OUT 	14	/* Hardware can do output cooking */
 #define TTY_HW_COOK_IN 		15	/* Hardware can do input cooking */
 #define TTY_PTY_LOCK 		16	/* pty private */



^ permalink raw reply related	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-27 13:23                       ` [PATCH] " Alan Cox
@ 2009-07-27 13:50                         ` OGAWA Hirofumi
  2009-07-27 13:58                           ` Alan Cox
  2009-07-27 13:58                         ` Aneesh Kumar K.V
  1 sibling, 1 reply; 104+ messages in thread
From: OGAWA Hirofumi @ 2009-07-27 13:50 UTC (permalink / raw)
  To: Alan Cox; +Cc: Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

> commit 70325fd1d4341896c17b6f1f1965370b5258d0b1
> Author: Alan Cox <alan@linux.intel.com>
> Date:   Mon Jul 27 14:18:52 2009 +0100
>
>     pty: ensure writes hit the reader before close
>     
>     Implement TTY_EOF/EOFPENDING flags so that we can propogate the close of the
>     pty through the buffering correctly. The new flag state is locked but the
>     tty buffer lock as it relates to buffers, and also because the buffer
>     lock is already held in the hot path.
>     
>     Signed-off-by: Alan Cox <alan@linux.intel.com>

I also tested this patch, and it seems to work in some case. However, in
some case, n_tty_read() didn't return -EIO for read() of master.

> +	spin_lock_irqsave(&pair->buf.lock, flags);
> +	pair->packet = 0;
> +	/* Indicate that the other end is now closed, set the
> +	   ENDPENDING marker so that the true end can be processed by

This seems typo s/ENDPENDING/EOFPENDING/


> @@ -47,13 +50,22 @@ static void pty_close(struct tty_struct *tty, struct file *filp)

[...]

> +	set_bit(TTY_EOFPENDING, &pair->flags);
> +	set_bit(TTY_OTHER_CLOSED, &pair->flags);
> +	spin_unlock_irqrestore(&pair->buf.lock, flags);

	tty_flip_buffer_push() or something?

> +	wake_up_interruptible(&pair->read_wait);
> +	wake_up_interruptible(&pair->write_wait);

It seems, this sets TTY_EOFPENDING, but if flush_to_ldisc() was done
already here, anybody doesn't set TTY_EOF for master.

Thanks.
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-27 13:50                         ` OGAWA Hirofumi
@ 2009-07-27 13:58                           ` Alan Cox
  2009-07-27 15:04                             ` OGAWA Hirofumi
  0 siblings, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-27 13:58 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

> > +	/* Indicate that the other end is now closed, set the
> > +	   ENDPENDING marker so that the true end can be processed by
> 
> This seems typo s/ENDPENDING/EOFPENDING/

I renamed it but missed that.

> > @@ -47,13 +50,22 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
> 
> [...]
> 
> > +	set_bit(TTY_EOFPENDING, &pair->flags);
> > +	set_bit(TTY_OTHER_CLOSED, &pair->flags);
> > +	spin_unlock_irqrestore(&pair->buf.lock, flags);
> 
> 	tty_flip_buffer_push() or something?
> 
> > +	wake_up_interruptible(&pair->read_wait);
> > +	wake_up_interruptible(&pair->write_wait);
> 
> It seems, this sets TTY_EOFPENDING, but if flush_to_ldisc() was done
> already here, anybody doesn't set TTY_EOF for master.

Does putting a tty_flip_buffer_push() at that point fix it. I had thought
I could remove it but you are right that creates a situation where EOF
may never get set.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-27 13:23                       ` [PATCH] " Alan Cox
  2009-07-27 13:50                         ` OGAWA Hirofumi
@ 2009-07-27 13:58                         ` Aneesh Kumar K.V
  1 sibling, 0 replies; 104+ messages in thread
From: Aneesh Kumar K.V @ 2009-07-27 13:58 UTC (permalink / raw)
  To: Alan Cox
  Cc: OGAWA Hirofumi, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML,
	Andrew Morton

On Mon, Jul 27, 2009 at 02:23:03PM +0100, Alan Cox wrote:
> > I worried "pair->packet = 0" when I'm thinking this. I guess it would be
> > changed more early than before. Is it ok?
> 
> I think so, and we can get stuck otherwise.
> 
> 
> Tested patch:
> 
> commit 70325fd1d4341896c17b6f1f1965370b5258d0b1
> Author: Alan Cox <alan@linux.intel.com>
> Date:   Mon Jul 27 14:18:52 2009 +0100
> 
>     pty: ensure writes hit the reader before close
>     
>     Implement TTY_EOF/EOFPENDING flags so that we can propogate the close of the
>     pty through the buffering correctly. The new flag state is locked but the
>     tty buffer lock as it relates to buffers, and also because the buffer
>     lock is already held in the hot path.
>     
>     Signed-off-by: Alan Cox <alan@linux.intel.com>
> 
> diff --git a/drivers/char/n_tty.c b/drivers/char/n_tty.c
> index ff47907..acae995 100644
> --- a/drivers/char/n_tty.c
> +++ b/drivers/char/n_tty.c
> @@ -1777,7 +1777,8 @@ do_it_again:
>  			tty->minimum_to_wake = (minimum - (b - buf));
> 
>  		if (!input_available_p(tty, 0)) {
> -			if (test_bit(TTY_OTHER_CLOSED, &tty->flags)) {
> +			if (test_bit(TTY_EOF, &tty->flags)) {
> +				/* PTY pair closed and all data consumed */
>  				retval = -EIO;
>  				break;
>  			}
> diff --git a/drivers/char/pty.c b/drivers/char/pty.c
> index 6e6942c..de10cc0 100644
> --- a/drivers/char/pty.c
> +++ b/drivers/char/pty.c
> @@ -38,6 +38,9 @@ static struct tty_driver *pts_driver;
> 
>  static void pty_close(struct tty_struct *tty, struct file *filp)
>  {
> +	struct tty_struct *pair;
> +	unsigned long flags;
> +
>  	BUG_ON(!tty);
>  	if (tty->driver->subtype == PTY_TYPE_MASTER)
>  		WARN_ON(tty->count > 1);
> @@ -47,13 +50,22 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
>  	}
>  	wake_up_interruptible(&tty->read_wait);
>  	wake_up_interruptible(&tty->write_wait);
> +	
>  	tty->packet = 0;
> -	if (!tty->link)
> +	pair = tty->link;
> +	if (!pair)
>  		return;
> -	tty->link->packet = 0;
> -	set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
> -	wake_up_interruptible(&tty->link->read_wait);
> -	wake_up_interruptible(&tty->link->write_wait);
> +
> +	spin_lock_irqsave(&pair->buf.lock, flags);
> +	pair->packet = 0;
> +	/* Indicate that the other end is now closed, set the
> +	   ENDPENDING marker so that the true end can be processed by
> +	   the line discipline */
> +	set_bit(TTY_EOFPENDING, &pair->flags);
> +	set_bit(TTY_OTHER_CLOSED, &pair->flags);
> +	spin_unlock_irqrestore(&pair->buf.lock, flags);
> +	wake_up_interruptible(&pair->read_wait);
> +	wake_up_interruptible(&pair->write_wait);
>  	if (tty->driver->subtype == PTY_TYPE_MASTER) {
>  		set_bit(TTY_OTHER_CLOSED, &tty->flags);
>  #ifdef CONFIG_UNIX98_PTYS
> @@ -180,7 +192,6 @@ static void pty_flush_buffer(struct tty_struct *tty)
> 
>  	if (!to)
>  		return;
> -	/* tty_buffer_flush(to); FIXME */
>  	if (to->packet) {
>  		spin_lock_irqsave(&tty->ctrl_lock, flags);
>  		tty->ctrl_status |= TIOCPKT_FLUSHWRITE;
> @@ -191,23 +202,30 @@ static void pty_flush_buffer(struct tty_struct *tty)
> 
>  static int pty_open(struct tty_struct *tty, struct file *filp)
>  {
> -	int	retval = -ENODEV;
> +	int	retval = -EIO;
> +	unsigned long flags;
> +	struct tty_struct *pair;
> 
> -	if (!tty || !tty->link)
> -		goto out;
> +	if (tty == NULL || (pair = tty->link) == NULL)
> +		return -ENODEV;
> 
> -	retval = -EIO;
>  	if (test_bit(TTY_OTHER_CLOSED, &tty->flags))
> +		return -EIO;
> +	spin_lock_irqsave(&pair->buf.lock, flags);
> +	if (test_bit(TTY_PTY_LOCK, &pair->flags))
>  		goto out;
> -	if (test_bit(TTY_PTY_LOCK, &tty->link->flags))
> -		goto out;
> -	if (tty->link->count != 1)
> +	if (pair->count != 1)
>  		goto out;
> 
> -	clear_bit(TTY_OTHER_CLOSED, &tty->link->flags);
> +	clear_bit(TTY_OTHER_CLOSED, &pair->flags);
> +	/* The buf.lock stops this racing a flush_to_ldisc from
> +	   the other end */
> +	clear_bit(TTY_EOFPENDING, &pair->flags);
> +	clear_bit(TTY_EOF, &pair->flags);
>  	set_bit(TTY_THROTTLED, &tty->flags);
>  	retval = 0;
>  out:
> +	spin_unlock_irqrestore(&pair->buf.lock, flags);
>  	return retval;
>  }
> 
> diff --git a/drivers/char/tty_buffer.c b/drivers/char/tty_buffer.c
> index 810ee25..19a7ced 100644
> --- a/drivers/char/tty_buffer.c
> +++ b/drivers/char/tty_buffer.c
> @@ -119,6 +119,12 @@ static void __tty_buffer_flush(struct tty_struct *tty)
>  		tty_buffer_free(tty, thead);
>  	}
>  	tty->buf.tail = NULL;
> +	/* We can EOF a pty/tty pair with a flush as well as by consuming
> +	   all the data left over */
> +	if (test_bit(TTY_EOFPENDING, &tty->flags)) {
> +		set_bit(TTY_EOF, &tty->flags);
> +		wake_up(&tty->read_wait);
> +	}
>  }
> 
>  /**
> @@ -405,6 +411,7 @@ static void flush_to_ldisc(struct work_struct *work)
>  	struct tty_buffer *tbuf, *head;
>  	char *char_buf;
>  	unsigned char *flag_buf;
> +	int done = 1;
> 
>  	disc = tty_ldisc_ref(tty);
>  	if (disc == NULL)	/*  !TTY_LDISC */
> @@ -433,10 +440,13 @@ static void flush_to_ldisc(struct work_struct *work)
>  				break;
>  			if (!tty->receive_room) {
>  				schedule_delayed_work(&tty->buf.work, 1);
> +				done = 0;
>  				break;
>  			}
> -			if (count > tty->receive_room)
> +			if (count > tty->receive_room) {
>  				count = tty->receive_room;
> +				done = 0;
> +			}
>  			char_buf = head->char_buf_ptr + head->read;
>  			flag_buf = head->flag_buf_ptr + head->read;
>  			head->read += count;
> @@ -454,6 +464,10 @@ static void flush_to_ldisc(struct work_struct *work)
>  		__tty_buffer_flush(tty);
>  		clear_bit(TTY_FLUSHPENDING, &tty->flags);
>  		wake_up(&tty->read_wait);
> +	} else if (done && test_bit(TTY_EOFPENDING, &tty->flags)) {
> +		/* The last bits hit the ldisc so set EOF */
> +		wake_up(&tty->read_wait);
> +		set_bit(TTY_EOF, &tty->flags);
>  	}
>  	clear_bit(TTY_FLUSHING, &tty->flags);
>  	spin_unlock_irqrestore(&tty->buf.lock, flags);
> diff --git a/include/linux/tty.h b/include/linux/tty.h
> index 85aa525..427d107 100644
> --- a/include/linux/tty.h
> +++ b/include/linux/tty.h
> @@ -321,6 +321,8 @@ struct tty_struct {
>  #define TTY_LDISC 		9	/* Line discipline attached */
>  #define TTY_LDISC_CHANGING 	10	/* Line discipline changing */
>  #define TTY_LDISC_OPEN	 	11	/* Line discipline is open */
> +#define TTY_EOF			12	/* TTY/PTY pair at EOF */
> +#define TTY_EOFPENDING		13	/* TTY/PTY pair EOF when data emptied */
>  #define TTY_HW_COOK_OUT 	14	/* Hardware can do output cooking */
>  #define TTY_HW_COOK_IN 		15	/* Hardware can do input cooking */
>  #define TTY_PTY_LOCK 		16	/* pty private */
> 
> 

With this patch i still have "the compile in emacs" problem. But it is less frequent.
1 in 5 tries fail now

-aneesh

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-27 13:58                           ` Alan Cox
@ 2009-07-27 15:04                             ` OGAWA Hirofumi
  2009-07-27 16:14                               ` Aneesh Kumar K.V
  0 siblings, 1 reply; 104+ messages in thread
From: OGAWA Hirofumi @ 2009-07-27 15:04 UTC (permalink / raw)
  To: Alan Cox; +Cc: Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

>> > +	set_bit(TTY_EOFPENDING, &pair->flags);
>> > +	set_bit(TTY_OTHER_CLOSED, &pair->flags);
>> > +	spin_unlock_irqrestore(&pair->buf.lock, flags);
>> 
>> 	tty_flip_buffer_push() or something?
>> 
>> > +	wake_up_interruptible(&pair->read_wait);
>> > +	wake_up_interruptible(&pair->write_wait);
>> 
>> It seems, this sets TTY_EOFPENDING, but if flush_to_ldisc() was done
>> already here, anybody doesn't set TTY_EOF for master.
>
> Does putting a tty_flip_buffer_push() at that point fix it. I had thought
> I could remove it but you are right that creates a situation where EOF
> may never get set.

With this patch, test program seems to work.

It seems "done = 0" hunk was needed for correct "done".

Thanks.


Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
---

 drivers/char/pty.c        |    1 +
 drivers/char/tty_buffer.c |    4 +---
 2 files changed, 2 insertions(+), 3 deletions(-)

diff -puN drivers/char/pty.c~pty-fixes4-fix drivers/char/pty.c
--- linux-2.6/drivers/char/pty.c~pty-fixes4-fix	2009-07-27 23:56:27.000000000 +0900
+++ linux-2.6-hirofumi/drivers/char/pty.c	2009-07-27 23:56:40.000000000 +0900
@@ -64,6 +64,7 @@ static void pty_close(struct tty_struct 
 	set_bit(TTY_EOFPENDING, &pair->flags);
 	set_bit(TTY_OTHER_CLOSED, &pair->flags);
 	spin_unlock_irqrestore(&pair->buf.lock, flags);
+	tty_flip_buffer_push(pair);
 	wake_up_interruptible(&pair->read_wait);
 	wake_up_interruptible(&pair->write_wait);
 	if (tty->driver->subtype == PTY_TYPE_MASTER) {
diff -puN drivers/char/tty_buffer.c~pty-fixes4-fix drivers/char/tty_buffer.c
--- linux-2.6/drivers/char/tty_buffer.c~pty-fixes4-fix	2009-07-27 23:57:27.000000000 +0900
+++ linux-2.6-hirofumi/drivers/char/tty_buffer.c	2009-07-27 23:57:30.000000000 +0900
@@ -443,10 +443,8 @@ static void flush_to_ldisc(struct work_s
 				done = 0;
 				break;
 			}
-			if (count > tty->receive_room) {
+			if (count > tty->receive_room)
 				count = tty->receive_room;
-				done = 0;
-			}
 			char_buf = head->char_buf_ptr + head->read;
 			flag_buf = head->flag_buf_ptr + head->read;
 			head->read += count;
_

-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-27 15:04                             ` OGAWA Hirofumi
@ 2009-07-27 16:14                               ` Aneesh Kumar K.V
  2009-07-27 16:42                                 ` Alan Cox
  2009-07-27 18:28                                 ` Andreas Schwab
  0 siblings, 2 replies; 104+ messages in thread
From: Aneesh Kumar K.V @ 2009-07-27 16:14 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Alan Cox, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML,
	Andrew Morton

On Tue, Jul 28, 2009 at 12:04:42AM +0900, OGAWA Hirofumi wrote:
> Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
> 
> >> > +	set_bit(TTY_EOFPENDING, &pair->flags);
> >> > +	set_bit(TTY_OTHER_CLOSED, &pair->flags);
> >> > +	spin_unlock_irqrestore(&pair->buf.lock, flags);
> >> 
> >> 	tty_flip_buffer_push() or something?
> >> 
> >> > +	wake_up_interruptible(&pair->read_wait);
> >> > +	wake_up_interruptible(&pair->write_wait);
> >> 
> >> It seems, this sets TTY_EOFPENDING, but if flush_to_ldisc() was done
> >> already here, anybody doesn't set TTY_EOF for master.
> >
> > Does putting a tty_flip_buffer_push() at that point fix it. I had thought
> > I could remove it but you are right that creates a situation where EOF
> > may never get set.
> 
> With this patch, test program seems to work.
> 
> It seems "done = 0" hunk was needed for correct "done".
> 
> Thanks.
> 
> 
> Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
> ---
> 
>  drivers/char/pty.c        |    1 +
>  drivers/char/tty_buffer.c |    4 +---
>  2 files changed, 2 insertions(+), 3 deletions(-)
> 
> diff -puN drivers/char/pty.c~pty-fixes4-fix drivers/char/pty.c
> --- linux-2.6/drivers/char/pty.c~pty-fixes4-fix	2009-07-27 23:56:27.000000000 +0900
> +++ linux-2.6-hirofumi/drivers/char/pty.c	2009-07-27 23:56:40.000000000 +0900
> @@ -64,6 +64,7 @@ static void pty_close(struct tty_struct 
>  	set_bit(TTY_EOFPENDING, &pair->flags);
>  	set_bit(TTY_OTHER_CLOSED, &pair->flags);
>  	spin_unlock_irqrestore(&pair->buf.lock, flags);
> +	tty_flip_buffer_push(pair);
>  	wake_up_interruptible(&pair->read_wait);
>  	wake_up_interruptible(&pair->write_wait);
>  	if (tty->driver->subtype == PTY_TYPE_MASTER) {
> diff -puN drivers/char/tty_buffer.c~pty-fixes4-fix drivers/char/tty_buffer.c
> --- linux-2.6/drivers/char/tty_buffer.c~pty-fixes4-fix	2009-07-27 23:57:27.000000000 +0900
> +++ linux-2.6-hirofumi/drivers/char/tty_buffer.c	2009-07-27 23:57:30.000000000 +0900
> @@ -443,10 +443,8 @@ static void flush_to_ldisc(struct work_s
>  				done = 0;
>  				break;
>  			}
> -			if (count > tty->receive_room) {
> +			if (count > tty->receive_room)
>  				count = tty->receive_room;
> -				done = 0;
> -			}
>  			char_buf = head->char_buf_ptr + head->read;
>  			flag_buf = head->flag_buf_ptr + head->read;
>  			head->read += count;
> _


I still have the "compile in emacs" bug. So this patch along with
http://article.gmane.org/gmane.linux.kernel/869824 doesn't fix the
bug for me.

-aneesh

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-27 16:14                               ` Aneesh Kumar K.V
@ 2009-07-27 16:42                                 ` Alan Cox
  2009-07-27 17:12                                   ` Aneesh Kumar K.V
  2009-07-27 18:28                                 ` Andreas Schwab
  1 sibling, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-27 16:42 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: OGAWA Hirofumi, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML,
	Andrew Morton


> > -			if (count > tty->receive_room) {
> > +			if (count > tty->receive_room)
> >  				count = tty->receive_room;
> > -				done = 0;
> > -			}
> >  			char_buf = head->char_buf_ptr + head->read;
> >  			flag_buf = head->flag_buf_ptr + head->read;
> >  			head->read += count;
> > _
> 
> 
> I still have the "compile in emacs" bug. So this patch along with
> http://article.gmane.org/gmane.linux.kernel/869824 doesn't fix the
> bug for me.

Can you stick some printk calls in and debug it further then because at
with the fixes Ogawa provided I'm finding it impossible to reproduce even
on an 8 way x86.

What hardware are you using ?


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-27 16:42                                 ` Alan Cox
@ 2009-07-27 17:12                                   ` Aneesh Kumar K.V
  2009-07-27 19:28                                     ` OGAWA Hirofumi
  0 siblings, 1 reply; 104+ messages in thread
From: Aneesh Kumar K.V @ 2009-07-27 17:12 UTC (permalink / raw)
  To: Alan Cox
  Cc: OGAWA Hirofumi, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML,
	Andrew Morton

On Mon, Jul 27, 2009 at 05:42:52PM +0100, Alan Cox wrote:
> 
> > > -			if (count > tty->receive_room) {
> > > +			if (count > tty->receive_room)
> > >  				count = tty->receive_room;
> > > -				done = 0;
> > > -			}
> > >  			char_buf = head->char_buf_ptr + head->read;
> > >  			flag_buf = head->flag_buf_ptr + head->read;
> > >  			head->read += count;
> > > _
> > 
> > 
> > I still have the "compile in emacs" bug. So this patch along with
> > http://article.gmane.org/gmane.linux.kernel/869824 doesn't fix the
> > bug for me.
> 
> Can you stick some printk calls in and debug it further then because at
> with the fixes Ogawa provided I'm finding it impossible to reproduce even
> on an 8 way x86.
> 
> What hardware are you using ?

My T60p lenovo laptop. If you can send me a debug prink patch i can redo the
test with the patches. I don't know what data i should start collecting so
that it make sense. This is the patch i tried

diff -ru /home/opensource/sources/linux-2.6/drivers/char/n_tty.c /home/opensource/build/linux-2.6-distro/drivers/char/n_tty.c
--- /home/opensource/sources/linux-2.6/drivers/char/n_tty.c	2009-07-17 10:07:40.210991073 +0530
+++ /home/opensource/build/linux-2.6-distro/drivers/char/n_tty.c	2009-07-27 18:56:14.518330502 +0530
@@ -1777,7 +1777,8 @@
 			tty->minimum_to_wake = (minimum - (b - buf));
 
 		if (!input_available_p(tty, 0)) {
-			if (test_bit(TTY_OTHER_CLOSED, &tty->flags)) {
+			if (test_bit(TTY_EOF, &tty->flags)) {
+				/* PTY pair closed and all data consumed */
 				retval = -EIO;
 				break;
 			}
diff -ru /home/opensource/sources/linux-2.6/drivers/char/pty.c /home/opensource/build/linux-2.6-distro/drivers/char/pty.c
--- /home/opensource/sources/linux-2.6/drivers/char/pty.c	2009-07-26 22:45:50.354419907 +0530
+++ /home/opensource/build/linux-2.6-distro/drivers/char/pty.c	2009-07-27 21:29:26.217922778 +0530
@@ -38,6 +38,9 @@
 
 static void pty_close(struct tty_struct *tty, struct file *filp)
 {
+	struct tty_struct *pair;
+	unsigned long flags;
+
 	BUG_ON(!tty);
 	if (tty->driver->subtype == PTY_TYPE_MASTER)
 		WARN_ON(tty->count > 1);
@@ -47,13 +50,23 @@
 	}
 	wake_up_interruptible(&tty->read_wait);
 	wake_up_interruptible(&tty->write_wait);
+	
 	tty->packet = 0;
-	if (!tty->link)
+	pair = tty->link;
+	if (!pair)
 		return;
-	tty->link->packet = 0;
-	set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
-	wake_up_interruptible(&tty->link->read_wait);
-	wake_up_interruptible(&tty->link->write_wait);
+
+	spin_lock_irqsave(&pair->buf.lock, flags);
+	pair->packet = 0;
+	/* Indicate that the other end is now closed, set the
+	   ENDPENDING marker so that the true end can be processed by
+	   the line discipline */
+	set_bit(TTY_EOFPENDING, &pair->flags);
+	set_bit(TTY_OTHER_CLOSED, &pair->flags);
+	spin_unlock_irqrestore(&pair->buf.lock, flags);
+	tty_flip_buffer_push(pair);
+	wake_up_interruptible(&pair->read_wait);
+	wake_up_interruptible(&pair->write_wait);
 	if (tty->driver->subtype == PTY_TYPE_MASTER) {
 		set_bit(TTY_OTHER_CLOSED, &tty->flags);
 #ifdef CONFIG_UNIX98_PTYS
@@ -180,7 +193,6 @@
 
 	if (!to)
 		return;
-	/* tty_buffer_flush(to); FIXME */
 	if (to->packet) {
 		spin_lock_irqsave(&tty->ctrl_lock, flags);
 		tty->ctrl_status |= TIOCPKT_FLUSHWRITE;
@@ -191,23 +203,30 @@
 
 static int pty_open(struct tty_struct *tty, struct file *filp)
 {
-	int	retval = -ENODEV;
+	int	retval = -EIO;
+	unsigned long flags;
+	struct tty_struct *pair;
 
-	if (!tty || !tty->link)
-		goto out;
+	if (tty == NULL || (pair = tty->link) == NULL)
+		return -ENODEV;
 
-	retval = -EIO;
 	if (test_bit(TTY_OTHER_CLOSED, &tty->flags))
+		return -EIO;
+	spin_lock_irqsave(&pair->buf.lock, flags);
+	if (test_bit(TTY_PTY_LOCK, &pair->flags))
 		goto out;
-	if (test_bit(TTY_PTY_LOCK, &tty->link->flags))
-		goto out;
-	if (tty->link->count != 1)
+	if (pair->count != 1)
 		goto out;
 
-	clear_bit(TTY_OTHER_CLOSED, &tty->link->flags);
+	clear_bit(TTY_OTHER_CLOSED, &pair->flags);
+	/* The buf.lock stops this racing a flush_to_ldisc from
+	   the other end */
+	clear_bit(TTY_EOFPENDING, &pair->flags);
+	clear_bit(TTY_EOF, &pair->flags);
 	set_bit(TTY_THROTTLED, &tty->flags);
 	retval = 0;
 out:
+	spin_unlock_irqrestore(&pair->buf.lock, flags);
 	return retval;
 }
 
diff -ru /home/opensource/sources/linux-2.6/drivers/char/tty_buffer.c /home/opensource/build/linux-2.6-distro/drivers/char/tty_buffer.c
--- /home/opensource/sources/linux-2.6/drivers/char/tty_buffer.c	2009-04-28 21:21:06.964692375 +0530
+++ /home/opensource/build/linux-2.6-distro/drivers/char/tty_buffer.c	2009-07-27 21:29:26.229922140 +0530
@@ -119,6 +119,12 @@
 		tty_buffer_free(tty, thead);
 	}
 	tty->buf.tail = NULL;
+	/* We can EOF a pty/tty pair with a flush as well as by consuming
+	   all the data left over */
+	if (test_bit(TTY_EOFPENDING, &tty->flags)) {
+		set_bit(TTY_EOF, &tty->flags);
+		wake_up(&tty->read_wait);
+	}
 }
 
 /**
@@ -405,6 +411,7 @@
 	struct tty_buffer *tbuf, *head;
 	char *char_buf;
 	unsigned char *flag_buf;
+	int done = 1;
 
 	disc = tty_ldisc_ref(tty);
 	if (disc == NULL)	/*  !TTY_LDISC */
@@ -433,6 +440,7 @@
 				break;
 			if (!tty->receive_room) {
 				schedule_delayed_work(&tty->buf.work, 1);
+				done = 0;
 				break;
 			}
 			if (count > tty->receive_room)
@@ -454,6 +462,10 @@
 		__tty_buffer_flush(tty);
 		clear_bit(TTY_FLUSHPENDING, &tty->flags);
 		wake_up(&tty->read_wait);
+	} else if (done && test_bit(TTY_EOFPENDING, &tty->flags)) {
+		/* The last bits hit the ldisc so set EOF */
+		wake_up(&tty->read_wait);
+		set_bit(TTY_EOF, &tty->flags);
 	}
 	clear_bit(TTY_FLUSHING, &tty->flags);
 	spin_unlock_irqrestore(&tty->buf.lock, flags);

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-27 16:14                               ` Aneesh Kumar K.V
  2009-07-27 16:42                                 ` Alan Cox
@ 2009-07-27 18:28                                 ` Andreas Schwab
  1 sibling, 0 replies; 104+ messages in thread
From: Andreas Schwab @ 2009-07-27 18:28 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: OGAWA Hirofumi, Alan Cox, Linus Torvalds, Rafael J. Wysocki,
	Ray Lee, LKML, Andrew Morton

"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:

> I still have the "compile in emacs" bug. So this patch along with
> http://article.gmane.org/gmane.linux.kernel/869824 doesn't fix the
> bug for me.

The expect testcase still fails, too.

Andreas.

$ cat pty.tcl
#!/usr/bin/expect -f
spawn sh -c "echo foo bar"
expect {
    "foo bar" {puts $expect_out(buffer)}
}
$ expect -f pty.tcl
spawn echo foo bar

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-27 17:12                                   ` Aneesh Kumar K.V
@ 2009-07-27 19:28                                     ` OGAWA Hirofumi
  2009-07-27 19:40                                       ` Linus Torvalds
  2009-07-27 21:20                                       ` Alan Cox
  0 siblings, 2 replies; 104+ messages in thread
From: OGAWA Hirofumi @ 2009-07-27 19:28 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: Alan Cox, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML,
	Andrew Morton

"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:

> On Mon, Jul 27, 2009 at 05:42:52PM +0100, Alan Cox wrote:
>> 
>> > > -			if (count > tty->receive_room) {
>> > > +			if (count > tty->receive_room)
>> > >  				count = tty->receive_room;
>> > > -				done = 0;
>> > > -			}
>> > >  			char_buf = head->char_buf_ptr + head->read;
>> > >  			flag_buf = head->flag_buf_ptr + head->read;
>> > >  			head->read += count;
>> > > _
>> > 
>> > 
>> > I still have the "compile in emacs" bug. So this patch along with
>> > http://article.gmane.org/gmane.linux.kernel/869824 doesn't fix the
>> > bug for me.
>> 
>> Can you stick some printk calls in and debug it further then because at
>> with the fixes Ogawa provided I'm finding it impossible to reproduce even
>> on an 8 way x86.
>> 
>> What hardware are you using ?
>
> My T60p lenovo laptop. If you can send me a debug prink patch i can redo the
> test with the patches. I don't know what data i should start collecting so
> that it make sense. This is the patch i tried

If I read that part of emacs correctly, it seems to be assuming the data
was already sent to master side if the child process was exited.

And if it's right, unfortunately, I guess we can't return -EAGAIN in
this case to preserve behavior.

Aneesh, can you get the log of strace of emacs error case?

Thanks.
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

#define _GNU_SOURCE
#define BIG_BUF
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <error.h>
#include <limits.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/wait.h>
#include <unistd.h>

static char pts_name[PATH_MAX];

static int open_pty(void)
{
	int master;
	char *name;

	master = getpt();
	if (master < 0)
		return -1;

	if (grantpt(master) < 0 || unlockpt(master) < 0)
		goto close_master;

	{
		int on = 1;
		ioctl(master, FIONBIO, &on);
	}

	name = ptsname(master);
	if (name == NULL)
		goto close_master;

	strcpy(pts_name, name);

	return master;

close_master:
	close(master);
	return -1;
}

static pid_t child(int master)
{
	pid_t pid;
	int slave;

	pid = fork();
	if (pid < 0)
		error(1, errno, "%s: fork", __func__);

	if (pid == 0) {
		slave = open(pts_name, O_RDWR);
		if (slave < 0)
			error(1, errno, "%s: open", __func__);

		close(master);
		dup2(slave, 0);
		dup2(slave, 1);
		dup2(slave, 2);
		close(slave);

#ifdef BIG_BUF
	{
		char buf[4096];
		size_t size;

		memset(buf, '-', sizeof(buf));
		size = 0;
		while (size < 8192) {
			ssize_t r = write(STDOUT_FILENO, buf, sizeof(buf));
			if (r < 0)
				error(1, errno, "%s: write", __func__);
			size += r;
		}
	}
#else
		printf("1-----------------------------------------------\n");
		printf("2-----------------------------------------------\n");
		printf("3-----------------------------------------------\n");
		printf("4-----------------------------------------------\n");
		printf("5-----------------------------------------------\n");
		printf("6-----------------------------------------------\n");
		printf("7-----------------------------------------------\n");
		printf("8-----------------------------------------------\n");
		printf("9-----------------------------------------------\n");
#endif
		exit(0);
	}

	return pid;
}

int main()
{
	pid_t pid;
	int master;

	master = open_pty();
	if (master < 0)
		error(1, errno, "%s: open_pty", __func__);

	pid = child(master);

	waitpid(pid, NULL, 0);
	while (1) {
		char buf[4096];
		ssize_t size;

		size = read(master, buf, sizeof(buf));
		if (size < 0) {
			if (errno == EAGAIN) {
				printf("some app doesn't expect EAGAIN\n");
				break;
			}
			error(1, errno, "%s: read", __func__);
		}
		if (size == 0)
			break;
#ifdef BIG_BUF
		printf("size %zd\n", size);
#else
		write(STDOUT_FILENO, buf, size);
#endif
	}
	
	return 0;
}

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-27 19:28                                     ` OGAWA Hirofumi
@ 2009-07-27 19:40                                       ` Linus Torvalds
  2009-07-27 20:38                                         ` OGAWA Hirofumi
  2009-07-27 20:52                                         ` Alan Cox
  2009-07-27 21:20                                       ` Alan Cox
  1 sibling, 2 replies; 104+ messages in thread
From: Linus Torvalds @ 2009-07-27 19:40 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Aneesh Kumar K.V, Alan Cox, Rafael J. Wysocki, Ray Lee, LKML,
	Andrew Morton



On Tue, 28 Jul 2009, OGAWA Hirofumi wrote:
> 
> If I read that part of emacs correctly, it seems to be assuming the data
> was already sent to master side if the child process was exited.

That sounds like a rather obvious assumption.

Aren't pty's flushing the data at flush() time? Which should be happening 
when the child process exits and closes the pty slave.

So at what point do we just admit that the commit that caused all this was 
a buggy pile of sh*t and just revert it?

		Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-27 19:40                                       ` Linus Torvalds
@ 2009-07-27 20:38                                         ` OGAWA Hirofumi
  2009-07-27 20:45                                           ` Linus Torvalds
  2009-07-27 21:42                                           ` Alan Cox
  2009-07-27 20:52                                         ` Alan Cox
  1 sibling, 2 replies; 104+ messages in thread
From: OGAWA Hirofumi @ 2009-07-27 20:38 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Aneesh Kumar K.V, Alan Cox, Rafael J. Wysocki, Ray Lee, LKML,
	Andrew Morton

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Tue, 28 Jul 2009, OGAWA Hirofumi wrote:
>> 
>> If I read that part of emacs correctly, it seems to be assuming the data
>> was already sent to master side if the child process was exited.
>
> That sounds like a rather obvious assumption.
>
> Aren't pty's flushing the data at flush() time? Which should be happening 
> when the child process exits and closes the pty slave.

For tty, I guess yes. However, now, pty is pushing the data to other
side by background, I'm not sure at all though, I guess ppp is requiring it.

> So at what point do we just admit that the commit that caused all this was 
> a buggy pile of sh*t and just revert it?

Also, I'm not sure though, I guess it depends on the bug which was fixed
by the commit.

I don't know about ppp problem at all. So, personally I'm ok either way
- we revert it and try to fix this for next merge window, or continue to
fix this for a while.

Alan-san?

Thanks.
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-27 20:38                                         ` OGAWA Hirofumi
@ 2009-07-27 20:45                                           ` Linus Torvalds
  2009-07-27 21:42                                           ` Alan Cox
  1 sibling, 0 replies; 104+ messages in thread
From: Linus Torvalds @ 2009-07-27 20:45 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Aneesh Kumar K.V, Alan Cox, Rafael J. Wysocki, Ray Lee, LKML,
	Andrew Morton



On Tue, 28 Jul 2009, OGAWA Hirofumi wrote:
> 
> > So at what point do we just admit that the commit that caused all this was 
> > a buggy pile of sh*t and just revert it?
> 
> Also, I'm not sure though, I guess it depends on the bug which was fixed
> by the commit.

It doesn't really.

The fact is, we don't accept "fix one bug, introduce another" code. We're 
much better off with _old_ bugs than with new ones.

I'd love to have both the old and the new fixed, but I'm not seeing a lot 
of progress on this new bug, and at some point I'm going to have to decide 
to just revert the commit that caused this whole breakage, until a fix 
that doesn't cause new problems can be worked out.

			Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-27 19:40                                       ` Linus Torvalds
  2009-07-27 20:38                                         ` OGAWA Hirofumi
@ 2009-07-27 20:52                                         ` Alan Cox
  2009-07-27 21:22                                           ` Linus Torvalds
  1 sibling, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-27 20:52 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton

On Mon, 27 Jul 2009 12:40:11 -0700 (PDT)
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Tue, 28 Jul 2009, OGAWA Hirofumi wrote:
> > 
> > If I read that part of emacs correctly, it seems to be assuming the data
> > was already sent to master side if the child process was exited.
> 
> That sounds like a rather obvious assumption.

> Aren't pty's flushing the data at flush() time? Which should be happening 
> when the child process exits and closes the pty slave.

When you close the slave the device all the data has been queued to the
master

> So at what point do we just admit that the commit that caused all this was 
> a buggy pile of sh*t and just revert it?

If we could "just revert it" and get a sane tree I'd have asked you to do
so quite some time ago and readdressed it later.

We can't "just revert it" or I'd have deferred it for the next release. If
you revert it you

- introduce a DoS attack
- break ppp
- introduce a pile of other tty races including at least one where the
  right timings should let you jump through null pointers
- put back all sorts of random obscure hangs caused by all the lock
  violations

and you'll note I pointed out this was a late change I was forced to make
and really didn't want to in the original commit.

Nor can we revert several patches because the ppp stuff means going back
to about 2.6.28 or so for the entire tty layer plus some of the DoS and
null pointer races go back to 2.2 or 2.0 8(.

We can use the two line slightly imperfect quickfix which people reported
does fix their problem and I'm tempted to go with that for 2.6.31 because
it works for the real world cases that matter.

Alan

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-27 19:28                                     ` OGAWA Hirofumi
  2009-07-27 19:40                                       ` Linus Torvalds
@ 2009-07-27 21:20                                       ` Alan Cox
  2009-07-28  5:33                                         ` OGAWA Hirofumi
  1 sibling, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-27 21:20 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Aneesh Kumar K.V, Linus Torvalds, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton

> > My T60p lenovo laptop. If you can send me a debug prink patch i can redo the
> > test with the patches. I don't know what data i should start collecting so
> > that it make sense. This is the patch i tried
> 
> If I read that part of emacs correctly, it seems to be assuming the data
> was already sent to master side if the child process was exited.

Oops.. that would be rather buggy.

It is true that the data has left the slave. It is not neccessarily true
the data has arrived on the master before the call to close() completes.

We can fake that but what the hell are the semantics if the process on
the master blocks or if both processes each end fill the queue and then
close so no data can be consumed.

> And if it's right, unfortunately, I guess we can't return -EAGAIN in
> this case to preserve behavior.

To avoid the EAGAIN the close needs to block until all queued I/O has
been processed by the ldisc the other end. That's not standards
guaranteed or even what happens with a non pty port, but it is doable
unless you take signals - we can't block signals or it can all deadlock.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-27 20:52                                         ` Alan Cox
@ 2009-07-27 21:22                                           ` Linus Torvalds
  2009-07-27 21:54                                             ` Alan Cox
  0 siblings, 1 reply; 104+ messages in thread
From: Linus Torvalds @ 2009-07-27 21:22 UTC (permalink / raw)
  To: Alan Cox
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton



On Mon, 27 Jul 2009, Alan Cox wrote:
>
> Nor can we revert several patches because the ppp stuff means going back
> to about 2.6.28 or so for the entire tty layer plus some of the DoS and
> null pointer races go back to 2.2 or 2.0 8(.

The thing is, breaking ppp and reverting at least to 2.6.30 behavior. 
Since it's better than the _current_ breakage.

> We can use the two line slightly imperfect quickfix which people reported
> does fix their problem and I'm tempted to go with that for 2.6.31 because
> it works for the real world cases that matter.

Umm. Which ones? People have reported kdesu and emacs breaking. Last I 
saw, the emacs breakage wasn't fixed by any of the patches seen so far.

		Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-27 20:38                                         ` OGAWA Hirofumi
  2009-07-27 20:45                                           ` Linus Torvalds
@ 2009-07-27 21:42                                           ` Alan Cox
  2009-07-27 22:04                                             ` Linus Torvalds
  1 sibling, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-27 21:42 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Linus Torvalds, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton

I'd favour we go with this which passes t1, expect, emacs and a corrected
t3

(test t3 is buggy as it has no \n and leaves the tty in line by line mode
so its typing a long line at the pty but never hits return to finish the
input)


It's theoretically imperfect in the case where you write a vast amount of
output in one go, the tty is blocked the other end and then you close.
However in practice that doesn't happen because with tty->low_latency = 1
we run the pty received n_tty ldisc code in our context so each write
fires through the entire n_tty ldisc and does flow control synchronously.

It needs re-addressing but its simple which at this point wins over
everything else and its one people tested before we tried to fix the hard
corner cases

commit aaf9da79c95a32fc5286fb851632baf09dc6134b
Author: Alan Cox <alan@linux.intel.com>
Date:   Mon Jul 27 22:17:51 2009 +0100

    pty: quickfix for the pty ENXIO timing problems
    
    This also makes close stall in the normal case which is apparently needed
    to fix emacs
    
    Signed-off-by: Alan Cox <alan@linux.intel.com>

diff --git a/drivers/char/pty.c b/drivers/char/pty.c
index 6e6942c..3850a68 100644
--- a/drivers/char/pty.c
+++ b/drivers/char/pty.c
@@ -52,6 +52,7 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
 		return;
 	tty->link->packet = 0;
 	set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
+	tty_flip_buffer_push(tty->link);
 	wake_up_interruptible(&tty->link->read_wait);
 	wake_up_interruptible(&tty->link->write_wait);
 	if (tty->driver->subtype == PTY_TYPE_MASTER) {
@@ -207,6 +208,7 @@ static int pty_open(struct tty_struct *tty, struct file *filp)
 	clear_bit(TTY_OTHER_CLOSED, &tty->link->flags);
 	set_bit(TTY_THROTTLED, &tty->flags);
 	retval = 0;
+	tty->low_latency = 1;
 out:
 	return retval;
 }

^ permalink raw reply related	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-27 21:22                                           ` Linus Torvalds
@ 2009-07-27 21:54                                             ` Alan Cox
  0 siblings, 0 replies; 104+ messages in thread
From: Alan Cox @ 2009-07-27 21:54 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton

On Mon, 27 Jul 2009 14:22:40 -0700 (PDT)
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Mon, 27 Jul 2009, Alan Cox wrote:
> >
> > Nor can we revert several patches because the ppp stuff means going back
> > to about 2.6.28 or so for the entire tty layer plus some of the DoS and
> > null pointer races go back to 2.2 or 2.0 8(.
> 
> The thing is, breaking ppp and reverting at least to 2.6.30 behavior. 
> Since it's better than the _current_ breakage.
> 
> > We can use the two line slightly imperfect quickfix which people reported
> > does fix their problem and I'm tempted to go with that for 2.6.31 because
> > it works for the real world cases that matter.
> 
> Umm. Which ones? People have reported kdesu and emacs breaking. Last I 
> saw, the emacs breakage wasn't fixed by any of the patches seen so far.

Aneesh verified the tty->low_latency patch fixed emacs (Sunday mail in
the thread "Re: [Regression] kdesu broken")


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-27 21:42                                           ` Alan Cox
@ 2009-07-27 22:04                                             ` Linus Torvalds
  2009-07-27 22:41                                               ` Alan Cox
  0 siblings, 1 reply; 104+ messages in thread
From: Linus Torvalds @ 2009-07-27 22:04 UTC (permalink / raw)
  To: Alan Cox
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton



On Mon, 27 Jul 2009, Alan Cox wrote:
> 
> It's theoretically imperfect in the case where you write a vast amount of
> output in one go, the tty is blocked the other end and then you close.
> However in practice that doesn't happen because with tty->low_latency = 1
> we run the pty received n_tty ldisc code in our context so each write
> fires through the entire n_tty ldisc and does flow control synchronously.

An alternative might be something like this.

THIS IS TOTALLY UNTESTED.

This is just meant as a "this kind of approach may be a good idea", and 
just tries to make sure that any delayed work is flushed by closing the 
tty.

No guarantees. It may or may not compile. It may or may not make any 
difference. It might do unspeakable things to your pets. It really is 
meant to be an example of what a "flush" function could/should do. There 
are probably other things/buffers that may need flushing.

		Linus

---
 drivers/char/tty_io.c |   33 +++++++++++++++++++++++++++++++++
 1 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
index a3afa0c..1be69af 100644
--- a/drivers/char/tty_io.c
+++ b/drivers/char/tty_io.c
@@ -143,6 +143,7 @@ ssize_t redirected_tty_write(struct file *, const char __user *,
 static unsigned int tty_poll(struct file *, poll_table *);
 static int tty_open(struct inode *, struct file *);
 static int tty_release(struct inode *, struct file *);
+static int tty_flush(struct file *filp, fl_owner_t id);
 long tty_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
 #ifdef CONFIG_COMPAT
 static long tty_compat_ioctl(struct file *file, unsigned int cmd,
@@ -415,6 +416,7 @@ static const struct file_operations tty_fops = {
 	.unlocked_ioctl	= tty_ioctl,
 	.compat_ioctl	= tty_compat_ioctl,
 	.open		= tty_open,
+	.flush		= tty_flush,
 	.release	= tty_release,
 	.fasync		= tty_fasync,
 };
@@ -1831,7 +1833,38 @@ static int tty_open(struct inode *inode, struct file *filp)
 	return ret;
 }
 
+/* This should probably be a generic function */
+static void flush_delayed_work(struct delayed_work *work)
+{
+	if (cancel_delayed_work(work)) {
+		schedule_delayed_work(work, 0);
+		flush_scheduled_work();
+	}
+}
 
+/**
+ *	tty_flush		-	vfs callback for close
+ *	@filp: file pointer for handle to tty
+ *	@id: struct files_struct of owner.
+ *
+ *	Called for every close(), whether the last or not
+ */
+static int tty_flush(struct file *filp, fl_owner_t id)
+{
+	struct tty_struct *tty, *o_tty;
+	struct inode *inode;
+
+	inode = filp->f_path.dentry->d_inode;
+	tty = (struct tty_struct *)filp->private_data;
+
+	if (tty_paranoia_check(tty, inode, "tty_flush"))
+		return 0;
+
+	o_tty = tty->link;
+	if (o_tty)
+		flush_delayed_work(&o_tty->buf.work);
+	return 0;
+}
 
 
 /**


^ permalink raw reply related	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-27 22:04                                             ` Linus Torvalds
@ 2009-07-27 22:41                                               ` Alan Cox
  0 siblings, 0 replies; 104+ messages in thread
From: Alan Cox @ 2009-07-27 22:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton

> No guarantees. It may or may not compile. It may or may not make any 
> difference. It might do unspeakable things to your pets. It really is 
> meant to be an example of what a "flush" function could/should do. There 
> are probably other things/buffers that may need flushing.

Wrong end of the stick. close() already flushes the transmit buffers
properly (and passes the SuS test suite tests on behaviour). Its the
receive the other end you need to sync which isn't an assumption made or
something with any meaning on real hardware.

> +/* This should probably be a generic function */
> +static void flush_delayed_work(struct delayed_work *work)
> +{
> +	if (cancel_delayed_work(work)) {
> +		schedule_delayed_work(work, 0);
> +		flush_scheduled_work();
> +	}

If the ldisc is busy the scheduled work will not flush the data to the
other tty. The tty->low_latency fix actually works better than this
because it makes tty_flip_buffer_push() force the work through that can
be done as it calls the ldisc work synchronously.

> + *	Called for every close(), whether the last or not

The pty case is "special" and seems to be an accident of implementations
in history. The problem on the pty side is on the input not the output
side.

A normal tty write goes from the ldisc to the device, and in that case
the close behaviour is defined by the close delay and FIFO flush in the
driver. In other words it is already done correctly, only done for the
last close (some apps rely on that as a sneaky way to keep the main
thread from blocking).

The pty race is quite different

A write to the pty is guaranteed to correctly have completed on the
close(). The data may however not have been received by the other end as
the ldisc could be busy or not yet scheduled. If you like the data is
sitting on the virtual wire". Or more exactly its in the "not yet fed to
the ldisc" buffers on the other end.

Adding the EOFPENDING/EOF flag fixes the detection of the EOF by the
receiver but doesn't guarantee the sender blocks. If the sender also
waits for the EOF flag to be set then the sender knows the receiving
ldisc has got all the data. Thats a fairly small mod from the existing
long patch.

Except: The other end could also be in EOFPENDING/EOF wait with both ends
blocked by the ldisc and not accepting data. In which case you just
deadlocked. The close() blocks forever waiting for another end which may
also be close() blocked in the other direction isn't a viable
implementation although one I'm quite keen to try out on a Mac or BSD box
out of curiosity.


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-27 21:20                                       ` Alan Cox
@ 2009-07-28  5:33                                         ` OGAWA Hirofumi
  2009-07-28 10:22                                           ` Alan Cox
  2009-07-28 15:48                                           ` Linus Torvalds
  0 siblings, 2 replies; 104+ messages in thread
From: OGAWA Hirofumi @ 2009-07-28  5:33 UTC (permalink / raw)
  To: Alan Cox
  Cc: Aneesh Kumar K.V, Linus Torvalds, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

>> And if it's right, unfortunately, I guess we can't return -EAGAIN in
>> this case to preserve behavior.
>
> To avoid the EAGAIN the close needs to block until all queued I/O has
> been processed by the ldisc the other end. That's not standards
> guaranteed or even what happens with a non pty port, but it is doable
> unless you take signals - we can't block signals or it can all deadlock.

Just a quick hack though. Is this wrong/unpreferable way?

n_tty_read() checks the pending buffer and consume it before
input_available_p().

Thanks.
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>



Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
---

 drivers/char/n_tty.c      |    1 +
 drivers/char/pty.c        |    1 -
 drivers/char/tty_buffer.c |    7 +++++--
 include/linux/tty.h       |    1 +
 4 files changed, 7 insertions(+), 3 deletions(-)

diff -puN drivers/char/pty.c~pty-fixes4-fix2 drivers/char/pty.c
--- linux-2.6/drivers/char/pty.c~pty-fixes4-fix2	2009-07-28 05:00:45.000000000 +0900
+++ linux-2.6-hirofumi/drivers/char/pty.c	2009-07-28 05:00:48.000000000 +0900
@@ -64,7 +64,6 @@ static void pty_close(struct tty_struct 
 	set_bit(TTY_EOFPENDING, &pair->flags);
 	set_bit(TTY_OTHER_CLOSED, &pair->flags);
 	spin_unlock_irqrestore(&pair->buf.lock, flags);
-	tty_flip_buffer_push(pair);
 	wake_up_interruptible(&pair->read_wait);
 	wake_up_interruptible(&pair->write_wait);
 	if (tty->driver->subtype == PTY_TYPE_MASTER) {
diff -puN drivers/char/n_tty.c~pty-fixes4-fix2 drivers/char/n_tty.c
--- linux-2.6/drivers/char/n_tty.c~pty-fixes4-fix2	2009-07-28 05:01:10.000000000 +0900
+++ linux-2.6-hirofumi/drivers/char/n_tty.c	2009-07-28 05:07:49.000000000 +0900
@@ -1776,6 +1776,7 @@ do_it_again:
 		    ((minimum - (b - buf)) >= 1))
 			tty->minimum_to_wake = (minimum - (b - buf));
 
+		tty_flush_to_ldisc(tty);
 		if (!input_available_p(tty, 0)) {
 			if (test_bit(TTY_EOF, &tty->flags)) {
 				/* PTY pair closed and all data consumed */
diff -puN include/linux/tty.h~pty-fixes4-fix2 include/linux/tty.h
--- linux-2.6/include/linux/tty.h~pty-fixes4-fix2	2009-07-28 05:07:07.000000000 +0900
+++ linux-2.6-hirofumi/include/linux/tty.h	2009-07-28 05:07:30.000000000 +0900
@@ -396,6 +396,7 @@ extern void __do_SAK(struct tty_struct *
 extern void disassociate_ctty(int priv);
 extern void no_tty(void);
 extern void tty_flip_buffer_push(struct tty_struct *tty);
+extern void tty_flush_to_ldisc(struct tty_struct *tty);
 extern void tty_buffer_free_all(struct tty_struct *tty);
 extern void tty_buffer_flush(struct tty_struct *tty);
 extern void tty_buffer_init(struct tty_struct *tty);
diff -puN drivers/char/tty_buffer.c~pty-fixes4-fix2 drivers/char/tty_buffer.c
--- linux-2.6/drivers/char/tty_buffer.c~pty-fixes4-fix2	2009-07-28 05:41:12.000000000 +0900
+++ linux-2.6-hirofumi/drivers/char/tty_buffer.c	2009-07-28 14:24:48.000000000 +0900
@@ -388,8 +388,6 @@ int tty_prepare_flip_string_flags(struct
 }
 EXPORT_SYMBOL_GPL(tty_prepare_flip_string_flags);
 
-
-
 /**
  *	flush_to_ldisc
  *	@work: tty structure passed from work queue.
@@ -473,6 +471,11 @@ static void flush_to_ldisc(struct work_s
 	tty_ldisc_deref(disc);
 }
 
+void tty_flush_to_ldisc(struct tty_struct *tty)
+{
+	flush_to_ldisc(&tty->buf.work.work);
+}
+
 /**
  *	tty_flip_buffer_push	-	terminal
  *	@tty: tty to push
_

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28  5:33                                         ` OGAWA Hirofumi
@ 2009-07-28 10:22                                           ` Alan Cox
  2009-07-28 10:42                                             ` OGAWA Hirofumi
  2009-07-28 15:49                                             ` Linus Torvalds
  2009-07-28 15:48                                           ` Linus Torvalds
  1 sibling, 2 replies; 104+ messages in thread
From: Alan Cox @ 2009-07-28 10:22 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Aneesh Kumar K.V, Linus Torvalds, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton

On Tue, 28 Jul 2009 14:33:40 +0900
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> wrote:

> Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
> 
> >> And if it's right, unfortunately, I guess we can't return -EAGAIN in
> >> this case to preserve behavior.
> >
> > To avoid the EAGAIN the close needs to block until all queued I/O has
> > been processed by the ldisc the other end. That's not standards
> > guaranteed or even what happens with a non pty port, but it is doable
> > unless you take signals - we can't block signals or it can all deadlock.
> 
> Just a quick hack though. Is this wrong/unpreferable way?
> 
> n_tty_read() checks the pending buffer and consume it before
> input_available_p().

That won't change the fact that the process could have exited. You can
fix the -ENXIO reporting that way (and it is basically what the
EOFPENDING/EOF patch did), but the only way I can see to fix the
assumption that the process exit means all the data is in the ldisc the
other end ready to use is to actually to make the close() path block -
but with some kind of limits to prevent deadlocks.

Given the assumptions in emacs are wrong, low_latency fixes the real
world cases and we are standards compliant perhaps we are trying too
hard ?


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 10:22                                           ` Alan Cox
@ 2009-07-28 10:42                                             ` OGAWA Hirofumi
  2009-07-28 15:49                                             ` Linus Torvalds
  1 sibling, 0 replies; 104+ messages in thread
From: OGAWA Hirofumi @ 2009-07-28 10:42 UTC (permalink / raw)
  To: Alan Cox
  Cc: Aneesh Kumar K.V, Linus Torvalds, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

>> Just a quick hack though. Is this wrong/unpreferable way?
>> 
>> n_tty_read() checks the pending buffer and consume it before
>> input_available_p().
>
> That won't change the fact that the process could have exited.

Yes.

> You can fix the -ENXIO reporting that way (and it is basically what
> the EOFPENDING/EOF patch did), but the only way I can see to fix the
> assumption that the process exit means all the data is in the ldisc
> the other end ready to use is to actually to make the close() path
> block - but with some kind of limits to prevent deadlocks.

I thought, to check in n_tty_read() may guarantee that tty->buf (slave
guarantee to sent to tty->buf) is consumed by master side.

I hoped this tty->buf works as the pending data in ldisc.

Thanks.
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28  5:33                                         ` OGAWA Hirofumi
  2009-07-28 10:22                                           ` Alan Cox
@ 2009-07-28 15:48                                           ` Linus Torvalds
  2009-07-28 16:16                                             ` OGAWA Hirofumi
  1 sibling, 1 reply; 104+ messages in thread
From: Linus Torvalds @ 2009-07-28 15:48 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Alan Cox, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML,
	Andrew Morton



On Tue, 28 Jul 2009, OGAWA Hirofumi wrote:
> 
> Just a quick hack though. Is this wrong/unpreferable way?

That seems to be right regardless of any other issues.

> n_tty_read() checks the pending buffer and consume it before
> input_available_p().

Why not move this _inside_ "input_available_p()"? There are only two 
call-sites, and strictly speaking they both want it.

Look at n_tty_poll(), for example:

        if (input_available_p(tty, TIME_CHAR(tty) ? 0 : MIN_CHAR(tty)))
                mask |= POLLIN | POLLRDNORM;
        if (tty->packet && tty->link->ctrl_status)
                mask |= POLLPRI | POLLIN | POLLRDNORM;
        if (test_bit(TTY_OTHER_CLOSED, &tty->flags))
                mask |= POLLHUP;
        if (tty_hung_up_p(file))
                mask |= POLLHUP;

and notice what happens to somebody who uses select/poll when there might 
be pending data that hasn't been handled yet, and the tty has been marked 
TTY_OTHER_CLOSED or hung up. It would return only POLLHUP, and as a 
result, that side would never even try to read the pending data because 
poll implies that there is no data and it's EOF. Which is just wrong.

So _any_ time you check "is there input available?" you should always 
check if there are other buffers. No?

		Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 10:22                                           ` Alan Cox
  2009-07-28 10:42                                             ` OGAWA Hirofumi
@ 2009-07-28 15:49                                             ` Linus Torvalds
  2009-07-28 16:42                                               ` Alan Cox
  1 sibling, 1 reply; 104+ messages in thread
From: Linus Torvalds @ 2009-07-28 15:49 UTC (permalink / raw)
  To: Alan Cox
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton



On Tue, 28 Jul 2009, Alan Cox wrote:
> 
> Given the assumptions in emacs are wrong, low_latency fixes the real
> world cases and we are standards compliant perhaps we are trying too
> hard ?

Alan, I really don't understand why you seem to argue _for_ bad code, and 
not just fixing it. You seem to think that it's ok to leave stuff in 
buffers and ignore it, just because the other end might have exited. 

That seems disingenious and stupid. Why argue for crap?

		Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 15:48                                           ` Linus Torvalds
@ 2009-07-28 16:16                                             ` OGAWA Hirofumi
  0 siblings, 0 replies; 104+ messages in thread
From: OGAWA Hirofumi @ 2009-07-28 16:16 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alan Cox, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML,
	Andrew Morton

Linus Torvalds <torvalds@linux-foundation.org> writes:

>> n_tty_read() checks the pending buffer and consume it before
>> input_available_p().
>
> Why not move this _inside_ "input_available_p()"? There are only two 
> call-sites, and strictly speaking they both want it.

The only reason for me is this is just quick hack.

Yes, I guess the input_available_p() is preferable place than my patch.
I'm still not checking the related path etc. deeply though.

Thanks.
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 15:49                                             ` Linus Torvalds
@ 2009-07-28 16:42                                               ` Alan Cox
  2009-07-28 16:49                                                 ` Linus Torvalds
  0 siblings, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-28 16:42 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton

On Tue, 28 Jul 2009 08:49:29 -0700 (PDT)
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Tue, 28 Jul 2009, Alan Cox wrote:
> > 
> > Given the assumptions in emacs are wrong, low_latency fixes the real
> > world cases and we are standards compliant perhaps we are trying too
> > hard ?
> 
> Alan, I really don't understand why you seem to argue _for_ bad code, and 
> not just fixing it. You seem to think that it's ok to leave stuff in 
> buffers and ignore it, just because the other end might have exited. 
> 
> That seems disingenious and stupid. Why argue for crap?

We don't ignore it. What goes in one end comes out the other after tty
processing (ldisc, echo etc). Reliably. Both the EOF fix and the
tty->low_latency fix cure that. [The low latency one also provides the
*exact* same semantics as we had prior to 2.6.31-rc as well]

If I understand Ogawa correctly then what emacs thinks is true is quite
different: Emacs thinks that

	write(pty, "data", length)
	close(pty)
	exit()

will always ensure that the other end has already got the data before
close() completes - or to be more exact before the parent receives SIGCLD.

Unfortunately as both ends could do

	write(pty, "enough to fill all the buffers but not block")
	close(pty)

at the same time it doesn't strike me as a good idea because it will
deadlock.

Ogawa's latest experiment is actually quite interesting which is to also
run the ldisc processing off the back of the user context. Unfortunately
that breaks assorted assumptions in the kernel because the
close/hangup/ldisc change paths believe that if they kill the work queue
for feeding data into the ldisc then the ldisc will stop receiving
data. It believes that therefore the ldisc will not do things like call
the write method to echo data back to a device which has gone away, which
generally involves jumping through null pointers.

So to be clear: WE DO NOT LOSE DATA.



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 16:42                                               ` Alan Cox
@ 2009-07-28 16:49                                                 ` Linus Torvalds
  2009-07-28 16:52                                                   ` Linus Torvalds
  2009-07-28 17:06                                                   ` Alan Cox
  0 siblings, 2 replies; 104+ messages in thread
From: Linus Torvalds @ 2009-07-28 16:49 UTC (permalink / raw)
  To: Alan Cox
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton



On Tue, 28 Jul 2009, Alan Cox wrote:
> 
> We don't ignore it.

Yes we do. Look at Ogawa-san's patch. And read my email answer to it.

> What goes in one end comes out the other after tty
> processing (ldisc, echo etc). Reliably. Both the EOF fix and the
> tty->low_latency fix cure that. [The low latency one also provides the
> *exact* same semantics as we had prior to 2.6.31-rc as well]

I agree that the low-latency thing should fix things, and I applied it. 
But I think that Ogawa's patch is fundamentally "correct" at a much higher 
level. Rather than depend on low-latency being set, it just "Does The 
Right Thing(tm)", by making sure that readers never even look at the EOF 
bits etc until they have flushed the tty ldisc state.

> If I understand Ogawa correctly then what emacs thinks is true is quite
> different: Emacs thinks that
> 
> 	write(pty, "data", length)
> 	close(pty)
> 	exit()
> 
> will always ensure that the other end has already got the data before
> close() completes - or to be more exact before the parent receives SIGCLD.

.. and depending on what emacs does with signals and it's select() loop, 
that may actually be _entirely_reasonable_.

Imagine being in 'select()' (or read, for that matter), and getting EINTR 
due to SIGCHLD. What is the correct expectations?

The correct expectation is that the select() (or read()) should have 
returned any data that it saw _before_ it returns EINTR.

		Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 16:49                                                 ` Linus Torvalds
@ 2009-07-28 16:52                                                   ` Linus Torvalds
  2009-07-28 17:09                                                     ` Alan Cox
  2009-07-28 17:06                                                   ` Alan Cox
  1 sibling, 1 reply; 104+ messages in thread
From: Linus Torvalds @ 2009-07-28 16:52 UTC (permalink / raw)
  To: Alan Cox
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton



On Tue, 28 Jul 2009, Linus Torvalds wrote:
> 
> The correct expectation is that the select() (or read()) should have 
> returned any data that it saw _before_ it returns EINTR.

Put another way: our pty code is simply _buggy_ if it returns EINTR when 
there is actually data pending on a pty.

Yes, it's "buggy" only in a QoI sense - I'm sure that if you read POSIX 
and SuS like a language lawyer, there is absolutely zero that says that it 
can't return EINTR at any random time, or that the SIGCHLD should be 
ordered wrt the other side of the pty having done a write.

But we don't do language-lawyering based on standards that inevitably 
never really delve into all the nitty-gritty details. We are simply better 
than that. Leave the language-lawyering to the people who can't do things 
well, and then whine about their crap being "technically correct".

			Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 16:49                                                 ` Linus Torvalds
  2009-07-28 16:52                                                   ` Linus Torvalds
@ 2009-07-28 17:06                                                   ` Alan Cox
  2009-07-28 18:44                                                     ` Linus Torvalds
  1 sibling, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-28 17:06 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton

> Imagine being in 'select()' (or read, for that matter), and getting EINTR 
> due to SIGCHLD. What is the correct expectations?

If you are asking that question I don't think you understand the bug
report.

> The correct expectation is that the select() (or read()) should have 
> returned any data that it saw _before_ it returns EINTR.

read() handles that correctly, has always done so.

emacs from the traces does this

	set O_NDELAY
	wait for SIGCLD
	read()
		EAGAIN
		shit_myself();

The EWOULDBLOCK is perfectly correctly reporting that at that precise
instant no data is available.

Correctly written code does this

	loop: 	read()
		EAGAIN
		poll
		goto loop

and in that case our code all works correctly.

If you put the whole thing on a timeline it looks like this (without
lowlatency being enabled but with the EOF thing fixed)


Slave			Emacs		Kernel

write "errors"
					Queue to tty->buf
close
					This is all the data
					Other end closed
exit
			SIGCLD
			read pty
			EWOULDBLOCK
			shit myself
					schedule n_tty ldisc
					queue bytes to parent

			died rather than waited

Had emacs used poll() properly then it ought to have all worked, although
a spot of review of that wouldn't be a bad thing.

Also calling into the n_tty ldisc side processing in the receiver
unfortunately opens up this stuff


	** interrupt path **
	data received
	queue to tty->buf, and wake

	** hangup **
	closes the ldisc down
	waits for the workqueue to complete
	physical instance goes away

	** a read already in progress **
	(new reads will go via tty_hung_up_write())
	process n_tty ldisc
		(which is already closed)
	echo char
		write to non-existant device
		jump to fishkill (or exploitville or wherever)

Because the hangup code, ldisc and open/close code all work on the basis
that

	- a hang up means the caller will not call tty_flip_buffer_push
	  after calling tty_hangup.
	- the only other path (the non low latency path) is the workqueue
	  which it takes care to kill off


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 16:52                                                   ` Linus Torvalds
@ 2009-07-28 17:09                                                     ` Alan Cox
  2009-07-28 18:45                                                       ` Linus Torvalds
  0 siblings, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-28 17:09 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton

> Put another way: our pty code is simply _buggy_ if it returns EINTR when 
> there is actually data pending on a pty.

Good job it doesn't do that then - although be careful what "data
pending" means. If the buffer contains "wombat" and you are in ICANON
mode then there is no data pending, and poll() likewise will say there is
no data pending. Only when newline is hit do you have data pending (which
is why test t3 is buggy)

Alan

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 17:06                                                   ` Alan Cox
@ 2009-07-28 18:44                                                     ` Linus Torvalds
  2009-07-28 18:56                                                       ` Alan Cox
  0 siblings, 1 reply; 104+ messages in thread
From: Linus Torvalds @ 2009-07-28 18:44 UTC (permalink / raw)
  To: Alan Cox
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton



On Tue, 28 Jul 2009, Alan Cox wrote:
> 
> If you are asking that question I don't think you understand the bug
> report.

I don't think YOU understand what I'm saying.

> > The correct expectation is that the select() (or read()) should have 
> > returned any data that it saw _before_ it returns EINTR.
> 
> read() handles that correctly, has always done so.
> 
> emacs from the traces does this
> 
> 	set O_NDELAY
> 	wait for SIGCLD
> 	read()
> 		EAGAIN
> 		shit_myself();

Go back and read my email.

Emacs is ENTIRELY PROPER in doing that. If it has gotten the SIGCHLD, then 
it damn well should know that the data is buffered already, since the 
child sure as hell isn't writing any more. So if it gets EAGAIN due to the 
SIGCHLD, it can assume that there isn't going to be any more data.

My point is that a program _should_ be able to depend on simple causality 
when it comes to ordering rules. If the child did a write() before 
exiting, then we should see the data before SIGCHLD.

It's really that simple.

		Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 17:09                                                     ` Alan Cox
@ 2009-07-28 18:45                                                       ` Linus Torvalds
  0 siblings, 0 replies; 104+ messages in thread
From: Linus Torvalds @ 2009-07-28 18:45 UTC (permalink / raw)
  To: Alan Cox
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton



On Tue, 28 Jul 2009, Alan Cox wrote:

> > Put another way: our pty code is simply _buggy_ if it returns EINTR when 
> > there is actually data pending on a pty.
> 
> Good job it doesn't do that then - although be careful what "data
> pending" means. If the buffer contains "wombat" and you are in ICANON
> mode then there is no data pending, and poll() likewise will say there is
> no data pending. Only when newline is hit do you have data pending (which
> is why test t3 is buggy)

Alan, that's a total red herring. We're not talking t3. We're talking 
emacs, and the newline is there.

You claim that emacs sh*ts itself when it gets EAGAIN, and you think 
that's an emacs bug. And I think you're full of crap. We should NEVER EVER 
get EAGAIN (due to the SIGCHLD, at least) if the app on the other side 
wrote data that could be read.

			Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 18:44                                                     ` Linus Torvalds
@ 2009-07-28 18:56                                                       ` Alan Cox
  2009-07-28 19:08                                                         ` Linus Torvalds
  0 siblings, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-28 18:56 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton

> My point is that a program _should_ be able to depend on simple causality 
> when it comes to ordering rules. If the child did a write() before 
> exiting, then we should see the data before SIGCHLD.
> 
> It's really that simple.

In which case you need the write to be synchronous to the ldisc
processing. Which is what tty->low_latency = 1 did. That provides your
required causality.

So what exactly is the problem. Setting ->low_latency has a small
performance impact which was why I was trying the other stuff, but that
is all.

As I said tty->low_latency = 1 gives you the previous semantics.

Alan

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 18:56                                                       ` Alan Cox
@ 2009-07-28 19:08                                                         ` Linus Torvalds
  2009-07-28 19:15                                                           ` Alan Cox
  2009-07-28 23:46                                                           ` Alan Cox
  0 siblings, 2 replies; 104+ messages in thread
From: Linus Torvalds @ 2009-07-28 19:08 UTC (permalink / raw)
  To: Alan Cox
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton



On Tue, 28 Jul 2009, Alan Cox wrote:
>
> In which case you need the write to be synchronous to the ldisc
> processing. Which is what tty->low_latency = 1 did. That provides your
> required causality.
> 
> So what exactly is the problem.

The problem is that you seem to be arguing against the _nicer_ fix, that 
also makes more conceptual sense, and that doesn't even depend on the 
whole low-latency hackery.

And I don't see why you argue that.

Furthermore, you have been CONTINUALLY arguing that emacs is buggy. 
Without any logic to back that up what-so-ever. You argued that just a few 
minutes ago. 

Why? Why are you making those outlandish claims? What I'm so unhappy about 
is that your whole reaction to it - from the beginning - was to blame 
anything but the patch that caused it, and that it was bisected down to.

Why? Why blame emacs? Why call user land buggy, when the bug was 
introduced by you, and was in the kernel? Why are you fighting it? Why did 
it take so long to admit that all the regressions were kernel problems? 
Why were you trying to dismiss the regression report as a user-land bug 
from the very beginning?

Let me quote one of your first emails as I got cc'd on this thread:

    > > See the thread starting here: ("possible regression with pty.c commit")
    > >    http://lkml.org/lkml/2009/7/11/125
    > 
    > Thanks for the pointer.
    > 
    > Well, I thought we were expected to avoid breaking existing user space, even
    > if that were buggy etc.

    I don't know where you got that idea from. Avoiding breaking user space
    unneccessarily is good but if its buggy you often can't do anything about
    it.

That was you talking to Rafael, who tracks regressions, and trying to
argue that it wasn't a kernel bug (both in the double quoted thing and
in the final one).

And no, that was not a fluke. TODAY, you sent out an email saying that 
EWOUDLBLOCK/EAGAIN would be "correct". Despite me having told you that it 
_clearly_ is not correct, and despite you knowing that it breaks user 
space.

WHY?

Quite frankly, I don't understand why I should even have to bring these 
issues up. You should have tried to fix the problem immediately, without 
arguing against fixing the kernel. Without blaming user space. Without 
making idiotic excuses for bad kernel behavior.

The fact is, breaking regular user applications is simply not acceptable. 
Trying to blame kernel breakage on the app being "buggy" is not ok. And 
arguing for almost a week against fixing it - that's just crazy.

		Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 19:08                                                         ` Linus Torvalds
@ 2009-07-28 19:15                                                           ` Alan Cox
  2009-07-28 19:56                                                             ` Greg KH
  2009-07-28 23:46                                                           ` Alan Cox
  1 sibling, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-28 19:15 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton


> Quite frankly, I don't understand why I should even have to bring these 
> issues up. You should have tried to fix the problem immediately, without 
> arguing against fixing the kernel. Without blaming user space. Without 
> making idiotic excuses for bad kernel behavior.
> 
> The fact is, breaking regular user applications is simply not acceptable. 
> Trying to blame kernel breakage on the app being "buggy" is not ok. And 
> arguing for almost a week against fixing it - that's just crazy.

I've been working on fixing it. I have spent a huge amount of time
working on the tty stuff trying to gradually get it sane without breaking
anything and fixing security holes along the way as they came up. I spent
the past two evenings working on the tty regressions.

However I've had enough. If you think that problem is easy to fix you fix
it.

Have fun.

I've zapped the tty merge queue so anyone with patches for the tty layer
can send them to the new maintainer.

--- MAINTAINERS~	2009-07-23 15:36:41.000000000 +0100
+++ MAINTAINERS	2009-07-28 20:09:32.200685827 +0100
@@ -5815,10 +5815,7 @@
 S:	Maintained
 
 TTY LAYER
-P:	Alan Cox
-M:	alan@lxorguk.ukuu.org.uk
-S:	Maintained
-T:	stgit http://zeniv.linux.org.uk/~alan/ttydev/
+S:	Unmaintained
 F:	drivers/char/tty_*
 F:	drivers/serial/serial_core.c
 F:	include/linux/serial_core.h

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 19:15                                                           ` Alan Cox
@ 2009-07-28 19:56                                                             ` Greg KH
  2009-07-28 20:47                                                               ` Theodore Tso
  0 siblings, 1 reply; 104+ messages in thread
From: Greg KH @ 2009-07-28 19:56 UTC (permalink / raw)
  To: Alan Cox
  Cc: Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V,
	Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Tue, Jul 28, 2009 at 08:15:53PM +0100, Alan Cox wrote:
> 
> > Quite frankly, I don't understand why I should even have to bring these 
> > issues up. You should have tried to fix the problem immediately, without 
> > arguing against fixing the kernel. Without blaming user space. Without 
> > making idiotic excuses for bad kernel behavior.
> > 
> > The fact is, breaking regular user applications is simply not acceptable. 
> > Trying to blame kernel breakage on the app being "buggy" is not ok. And 
> > arguing for almost a week against fixing it - that's just crazy.
> 
> I've been working on fixing it. I have spent a huge amount of time
> working on the tty stuff trying to gradually get it sane without breaking
> anything and fixing security holes along the way as they came up. I spent
> the past two evenings working on the tty regressions.
> 
> However I've had enough. If you think that problem is easy to fix you fix
> it.
> 
> Have fun.
> 
> I've zapped the tty merge queue so anyone with patches for the tty layer
> can send them to the new maintainer.

Do you have a pointer to your previous queue before you zapped it so
that others might be able to pick it up?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 19:56                                                             ` Greg KH
@ 2009-07-28 20:47                                                               ` Theodore Tso
  2009-07-28 21:01                                                                 ` Greg KH
  0 siblings, 1 reply; 104+ messages in thread
From: Theodore Tso @ 2009-07-28 20:47 UTC (permalink / raw)
  To: Greg KH
  Cc: Alan Cox, Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V,
	Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Tue, Jul 28, 2009 at 12:56:48PM -0700, Greg KH wrote:
> 
> Do you have a pointer to your previous queue before you zapped it so
> that others might be able to pick it up?
> 

It's also available from linux-next as commit:

     024ea7873598686f3e02002ca23c2e8fa3fddd30

						- Ted

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 20:47                                                               ` Theodore Tso
@ 2009-07-28 21:01                                                                 ` Greg KH
  2009-07-28 22:02                                                                   ` Theodore Tso
  0 siblings, 1 reply; 104+ messages in thread
From: Greg KH @ 2009-07-28 21:01 UTC (permalink / raw)
  To: Theodore Tso, Alan Cox, Linus Torvalds, OGAWA Hirofumi,
	Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML,
	Andrew Morton

On Tue, Jul 28, 2009 at 04:47:20PM -0400, Theodore Tso wrote:
> On Tue, Jul 28, 2009 at 12:56:48PM -0700, Greg KH wrote:
> > 
> > Do you have a pointer to your previous queue before you zapped it so
> > that others might be able to pick it up?
> > 
> 
> It's also available from linux-next as commit:
> 
>      024ea7873598686f3e02002ca23c2e8fa3fddd30

Hm, that's the wrong end of the tree, I think you mean
fbfa97a856a4ce2fa86058cff84f589eb7f1364e, right?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 21:01                                                                 ` Greg KH
@ 2009-07-28 22:02                                                                   ` Theodore Tso
  2009-07-28 23:49                                                                     ` Alan Cox
  0 siblings, 1 reply; 104+ messages in thread
From: Theodore Tso @ 2009-07-28 22:02 UTC (permalink / raw)
  To: Greg KH
  Cc: Alan Cox, Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V,
	Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Tue, Jul 28, 2009 at 02:01:21PM -0700, Greg KH wrote:
> On Tue, Jul 28, 2009 at 04:47:20PM -0400, Theodore Tso wrote:
> > On Tue, Jul 28, 2009 at 12:56:48PM -0700, Greg KH wrote:
> > > 
> > > Do you have a pointer to your previous queue before you zapped it so
> > > that others might be able to pick it up?
> > > 
> > 
> > It's also available from linux-next as commit:
> > 
> >      024ea7873598686f3e02002ca23c2e8fa3fddd30
> 
> Hm, that's the wrong end of the tree, I think you mean
> fbfa97a856a4ce2fa86058cff84f589eb7f1364e, right?

It all depends on your point of view, I guess.  The "last" commit in
the branch is 024ea78, so you can see the whole queue if you use the
command "git log 024ea78"; hence, it's the most useful if you're
trying to find the patch series via git.

The whole set of tty patches can be found via:

git log e00b95d..024ea78

(where e00b95d is the parent of fbfa978).

       	       	  					- Ted

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 19:08                                                         ` Linus Torvalds
  2009-07-28 19:15                                                           ` Alan Cox
@ 2009-07-28 23:46                                                           ` Alan Cox
  2009-07-29  0:10                                                             ` Linus Torvalds
  1 sibling, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-28 23:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton

> Let me quote one of your first emails as I got cc'd on this thread:

Which is not about the emacs bug but old versions of kdesu relying upon
data boundaries happening to occur in the "right" places.

> And no, that was not a fluke. TODAY, you sent out an email saying that 
> EWOUDLBLOCK/EAGAIN would be "correct". Despite me having told you that it 
> _clearly_ is not correct, and despite you knowing that it breaks user 
> space.

I did not as you seem to think ever advocate not fixing the emacs problem.
Clearly we need to fix the behaviour because apps rely on it and its not
an outright "duh.." on the app side its a reasonable expectation in the
simple case.

Instead I got a whole collection of mail from you most of which indicated
you hadn't even understood the problem.

BTW: The tty->low_latency fix doesn't work, because the ->write method
can be called from an IRQ and that means we can't use ->low_latency=1 as
we take mutexes.

If you don't take the mutexes you revert the fixes that stop high speed
data sessions such as 3G Modem randomly hanging until you disconnect from
your provider and try again.

So the current state of play is
	as is breaks emacs
	revert pty changes breaks ppp, 
	tty->low_latency breaks ppp
	Ogawa's patch can lock up and races including exploitable jumps
		into freed memory
	blocking in close deadlocks if both ends get stuck that way

The best suggestion I can come up with for the new maintainer is to
implement a block in close() on the slave side only, and probably with an
interruptible wait. Set TTY_EOFPENDING when you enter close, sleep on
TTY_EOF being set when the workqueue flushing data into the ldisc
finishes shovelling. Providing exit() closes the file handles before
sending the signal that'll then give expected semantics. A pty master
close is "magic" anyway as it triggers a hangup event.

> The fact is, breaking regular user applications is simply not acceptable. 
> Trying to blame kernel breakage on the app being "buggy" is not ok. And 
> arguing for almost a week against fixing it - that's just crazy.

The security exploits on 2.6.27 don't work on 2.6.30, please fix
the regression so I get root again. Get real, that is what you just
claimed and its donkey poo. There is a point at which you fix stuff and a
point at which you don't.

The thing I advocated not fixing is that in some cases kdesu fails
because some old versions at least do

	read(fd, buf, lots);
	if (buf starts with something I want) {
		while(read more data)
			look for pattern
	}

but never looks for pattern in the first read data after the start it
wanted - ie it relied upon magic happy chance buffer boundary
preservation. Thats well beyond what is sane to try and preserve unless
it is easy to do which its not.

The emacs assumption is wrong, actually always untrue for some cases
(fork, child inherits pty, parent exits for one) but worth preserving for
the cases it happens to work. I've never argued otherwise.

Alan

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 22:02                                                                   ` Theodore Tso
@ 2009-07-28 23:49                                                                     ` Alan Cox
  2009-07-29  0:12                                                                       ` Greg KH
                                                                                         ` (2 more replies)
  0 siblings, 3 replies; 104+ messages in thread
From: Alan Cox @ 2009-07-28 23:49 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Greg KH, Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V,
	Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Tue, 28 Jul 2009 18:02:03 -0400
Theodore Tso <tytso@mit.edu> wrote:

> On Tue, Jul 28, 2009 at 02:01:21PM -0700, Greg KH wrote:
> > On Tue, Jul 28, 2009 at 04:47:20PM -0400, Theodore Tso wrote:
> > > On Tue, Jul 28, 2009 at 12:56:48PM -0700, Greg KH wrote:
> > > > 
> > > > Do you have a pointer to your previous queue before you zapped it so
> > > > that others might be able to pick it up?
> > > > 
> > > 
> > > It's also available from linux-next as commit:
> > > 
> > >      024ea7873598686f3e02002ca23c2e8fa3fddd30
> > 
> > Hm, that's the wrong end of the tree, I think you mean
> > fbfa97a856a4ce2fa86058cff84f589eb7f1364e, right?
> 
> It all depends on your point of view, I guess.  The "last" commit in
> the branch is 024ea78, so you can see the whole queue if you use the
> command "git log 024ea78"; hence, it's the most useful if you're
> trying to find the patch series via git.
> 
> The whole set of tty patches can be found via:
> 
> git log e00b95d..024ea78
> 
> (where e00b95d is the parent of fbfa978).

I'll fish them out tomorrow. What is in the -next tree is only some of
the bits.

We still have races on the USB side too btw I put a frequency generator
on the carrier line of my pl2303 with an app continually opening the
port and it oopses after about 10 minutes. On the bright side so does
2.6.29, 2.6.27 ....

Alan

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 23:46                                                           ` Alan Cox
@ 2009-07-29  0:10                                                             ` Linus Torvalds
  2009-07-29  0:26                                                               ` Linus Torvalds
                                                                                 ` (2 more replies)
  0 siblings, 3 replies; 104+ messages in thread
From: Linus Torvalds @ 2009-07-29  0:10 UTC (permalink / raw)
  To: Alan Cox
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton



On Wed, 29 Jul 2009, Alan Cox wrote:
> 
> BTW: The tty->low_latency fix doesn't work, because the ->write method
> can be called from an IRQ and that means we can't use ->low_latency=1 as
> we take mutexes.

Ok. So the end result is that Ogawa-san's fix is the right one. Then we 
can revert the low_latency=1 thing for pty's entirely. No?

And this is also the one that _looks_ the sanest - ie we do it on the read 
side, where it matters, rather than on the write side or the flush side 
(where the proper flushing can _also_ fix the problem, but where it's much 
more problematic, and where it's a lot less direct about what we care 
about).

This is just Ogawa's patch, redone against current -git, and with commit 
3a54297478e6578f96fd54bf4daa1751130aca86 reverted (Ogawa's patch already 
undid the non-low_latency part of it).

Now, I wonder if some _other_ line discipline might want to do that same 
tty_flush_to_ldisc thing in their "do I have data" logic, but I didn't 
look any closer.

So does this work for everyone? I haven't tested it yet myself, but this 
is the patch that "looks" right.

		Linus

---
 drivers/char/n_tty.c      |    1 +
 drivers/char/pty.c        |    2 --
 drivers/char/tty_buffer.c |   13 +++++++++++++
 include/linux/tty.h       |    1 +
 4 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/char/n_tty.c b/drivers/char/n_tty.c
index ff47907..973be2f 100644
--- a/drivers/char/n_tty.c
+++ b/drivers/char/n_tty.c
@@ -1583,6 +1583,7 @@ static int n_tty_open(struct tty_struct *tty)
 
 static inline int input_available_p(struct tty_struct *tty, int amt)
 {
+	tty_flush_to_ldisc(tty);
 	if (tty->icanon) {
 		if (tty->canon_data)
 			return 1;
diff --git a/drivers/char/pty.c b/drivers/char/pty.c
index 3850a68..6e6942c 100644
--- a/drivers/char/pty.c
+++ b/drivers/char/pty.c
@@ -52,7 +52,6 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
 		return;
 	tty->link->packet = 0;
 	set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
-	tty_flip_buffer_push(tty->link);
 	wake_up_interruptible(&tty->link->read_wait);
 	wake_up_interruptible(&tty->link->write_wait);
 	if (tty->driver->subtype == PTY_TYPE_MASTER) {
@@ -208,7 +207,6 @@ static int pty_open(struct tty_struct *tty, struct file *filp)
 	clear_bit(TTY_OTHER_CLOSED, &tty->link->flags);
 	set_bit(TTY_THROTTLED, &tty->flags);
 	retval = 0;
-	tty->low_latency = 1;
 out:
 	return retval;
 }
diff --git a/drivers/char/tty_buffer.c b/drivers/char/tty_buffer.c
index 810ee25..3108991 100644
--- a/drivers/char/tty_buffer.c
+++ b/drivers/char/tty_buffer.c
@@ -462,6 +462,19 @@ static void flush_to_ldisc(struct work_struct *work)
 }
 
 /**
+ *	tty_flush_to_ldisc
+ *	@tty: tty to push
+ *
+ *	Push the terminal flip buffers to the line discipline.
+ *
+ *	Must not be called from IRQ context.
+ */
+void tty_flush_to_ldisc(struct tty_struct *tty)
+{
+	flush_to_ldisc(&tty->buf.work.work);
+}
+
+/**
  *	tty_flip_buffer_push	-	terminal
  *	@tty: tty to push
  *
diff --git a/include/linux/tty.h b/include/linux/tty.h
index 1488d8c..e8c6c91 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -394,6 +394,7 @@ extern void __do_SAK(struct tty_struct *tty);
 extern void disassociate_ctty(int priv);
 extern void no_tty(void);
 extern void tty_flip_buffer_push(struct tty_struct *tty);
+extern void tty_flush_to_ldisc(struct tty_struct *tty);
 extern void tty_buffer_free_all(struct tty_struct *tty);
 extern void tty_buffer_flush(struct tty_struct *tty);
 extern void tty_buffer_init(struct tty_struct *tty);

^ permalink raw reply related	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 23:49                                                                     ` Alan Cox
@ 2009-07-29  0:12                                                                       ` Greg KH
  2009-07-30 23:16                                                                       ` Alan Cox
  2009-07-31 13:49                                                                       ` Alan Cox
  2 siblings, 0 replies; 104+ messages in thread
From: Greg KH @ 2009-07-29  0:12 UTC (permalink / raw)
  To: Alan Cox
  Cc: Theodore Tso, Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V,
	Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Wed, Jul 29, 2009 at 12:49:41AM +0100, Alan Cox wrote:
> On Tue, 28 Jul 2009 18:02:03 -0400
> Theodore Tso <tytso@mit.edu> wrote:
> 
> > On Tue, Jul 28, 2009 at 02:01:21PM -0700, Greg KH wrote:
> > > On Tue, Jul 28, 2009 at 04:47:20PM -0400, Theodore Tso wrote:
> > > > On Tue, Jul 28, 2009 at 12:56:48PM -0700, Greg KH wrote:
> > > > > 
> > > > > Do you have a pointer to your previous queue before you zapped it so
> > > > > that others might be able to pick it up?
> > > > > 
> > > > 
> > > > It's also available from linux-next as commit:
> > > > 
> > > >      024ea7873598686f3e02002ca23c2e8fa3fddd30
> > > 
> > > Hm, that's the wrong end of the tree, I think you mean
> > > fbfa97a856a4ce2fa86058cff84f589eb7f1364e, right?
> > 
> > It all depends on your point of view, I guess.  The "last" commit in
> > the branch is 024ea78, so you can see the whole queue if you use the
> > command "git log 024ea78"; hence, it's the most useful if you're
> > trying to find the patch series via git.
> > 
> > The whole set of tty patches can be found via:
> > 
> > git log e00b95d..024ea78
> > 
> > (where e00b95d is the parent of fbfa978).
> 
> I'll fish them out tomorrow. What is in the -next tree is only some of
> the bits.
> 
> We still have races on the USB side too btw I put a frequency generator
> on the carrier line of my pl2303 with an app continually opening the
> port and it oopses after about 10 minutes. On the bright side so does
> 2.6.29, 2.6.27 ....

Yeah, that's a scary test :)

I've gotten everything from linux-next right now, if you could be so
kind as to send me the rest of the bits you have pending, I'll be glad
to take them over and shephard them forward over time.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29  0:10                                                             ` Linus Torvalds
@ 2009-07-29  0:26                                                               ` Linus Torvalds
  2009-07-29  7:01                                                                 ` Aneesh Kumar K.V
  2009-07-29  0:34                                                               ` Alan Cox
  2009-07-29  2:50                                                               ` Gene Heskett
  2 siblings, 1 reply; 104+ messages in thread
From: Linus Torvalds @ 2009-07-29  0:26 UTC (permalink / raw)
  To: Alan Cox
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton



On Tue, 28 Jul 2009, Linus Torvalds wrote:
> 
> So does this work for everyone? I haven't tested it yet myself, but this 
> is the patch that "looks" right.

This thing (on top of current -git) seems to pass Ogawa's tests at least 
on my machine (both on SMP, and when the processes are limited to just a 
single CPU).

Of course, since he wrote both the patch _and_ the tests, that doesn't 
come as a huge surprise - you'd expect Ogawa's patch to fix the testcase 
he has.

I use neither kdesu nor GNU emacs, so people who see the problem should 
test this patch, and maybe we can put _this_ particular issue behind us.

		Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29  0:10                                                             ` Linus Torvalds
  2009-07-29  0:26                                                               ` Linus Torvalds
@ 2009-07-29  0:34                                                               ` Alan Cox
  2009-07-29  1:04                                                                 ` Linus Torvalds
  2009-07-29  2:50                                                               ` Gene Heskett
  2 siblings, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-29  0:34 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton

> Ok. So the end result is that Ogawa-san's fix is the right one. Then we 
> can revert the low_latency=1 thing for pty's entirely. No?

No

> So does this work for everyone? I haven't tested it yet myself, but this 
> is the patch that "looks" right.

Looks pretty but breaks hangup events and hotplug.

The rules for hangup and thus hot unplug etc are

- The driver ensures that it will not call
  tty_flip_buffer_push/flush_to_ldisc again for this port until re-opened
- The driver calls tty_hangup
- tty_hangup ensures that tty_flip_buffer_push cannot occur again
  (by killing the workqueue)
- resources may well then get freed before close()

The same rules apply for an ldisc change via TIOCSETD

Ogawa's patch violates this by calling tty_flush_to_ldisc. If that
bridges a hangup then it will call into the ldisc for the dead port and
that in turn will call the write method of the driver which will in some
cases jump to free memory.

To make that work (and I agree its a very sane looking expression of the
intent) you would need to ensure that your new tty_flush_to_ldisc routine
was interlocked against already being hung up, tty_hangup and against an
ldisc change. I think (haven't verified) you get ldisc_change for free as
you are in the ldisc code so the ldisc ref is held and a set_ldisc will
block.

The hangup is tty_io:do_tty_hangup which calls tty_ldisc:
tty_ldisc_hangup which calls ld->ops->hangup which n_tty doesn't
currently have but which it could grow and which may block.

So my guess is you need to make your new flush to ldisc routine

	interlock versus do_tty_hangup
	check if the TTY_HUPPED flag is clear
	only if so and do_tty_hangup cannot be running actually run the
		ldisc code


the hangup code is still evil lock_kernel (and in parts "prayer_lock()"
code)


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29  0:34                                                               ` Alan Cox
@ 2009-07-29  1:04                                                                 ` Linus Torvalds
  2009-07-29  1:23                                                                   ` Linus Torvalds
  2009-07-29  8:59                                                                   ` Alan Cox
  0 siblings, 2 replies; 104+ messages in thread
From: Linus Torvalds @ 2009-07-29  1:04 UTC (permalink / raw)
  To: Alan Cox
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton



On Wed, 29 Jul 2009, Alan Cox wrote:
> 
> The rules for hangup and thus hot unplug etc are
> 
> - The driver ensures that it will not call
>   tty_flip_buffer_push/flush_to_ldisc again for this port until re-opened

That's just bogus.

The work is already scheduled to be flushed by a timer. The only thing we 
do is to make it happen _immediately_ (rather than later) when we do that 
tty_flush_to_ldisc().

IOW, it's not changing what the kernel _does_. It's just moving something 
that will be done to a slightly earlier point.

If that is wrong as per hangup code, then the bug is in the hangup 
handling, not in the tty_flush_to_ldisc().

> - The driver calls tty_hangup
> - tty_hangup ensures that tty_flip_buffer_push cannot occur again
>   (by killing the workqueue)
> - resources may well then get freed before close()

They had better not be, since all the data structures touched are inside 
the 'tty_struct' (which we're dereferencing in other ways anyway in that 
whole routine).

So the only thing that the hangup code needs to do is to make that the 
"tty->buf.work.work" function pointer is a nop. And it does, as far as I
can tell.

> The same rules apply for an ldisc change via TIOCSETD
> 
> Ogawa's patch violates this by calling tty_flush_to_ldisc. If that
> bridges a hangup then it will call into the ldisc for the dead port and
> that in turn will call the write method of the driver which will in some
> cases jump to free memory.

What you describe is just crazy talk. If Ogawa's fix really causes 
problems, then the hangup code is broken, not Ogawa's trivial patch to 
make sure the work is done when trying to read a tty.

So regardless, by now we have moved from "trivial bug that bites people in 
real life" to "theoretical bug that looks impossible to trigger".

That's already a big step forward - along with making the code make more 
sense. Which is always good in itself.

> The hangup is tty_io:do_tty_hangup which calls tty_ldisc:
> tty_ldisc_hangup which calls ld->ops->hangup which n_tty doesn't
> currently have but which it could grow and which may block.

Well, put this way: the only thing that actually stops the outstanding 
timer (for the delayed work) is the tty_ldisc_halt() call in 
tty_ldisc_hangup(). If that _isn't_ called, then your argument is 
pointless, since the tty_flush_to_ldisc() will be done by a timer later 
(and Ogawa's patch thus clearly introduces nothing new).

And when it _is_ called, it also clears TTY_LDISC, so now tty_ldisc_ref() 
will return NULL, so then flush_to_ldisc() will be a no-op.

So as far as I can tell, we already protect against this whole case: if we 
call flush_to_ldisc() after the tty has been HUP'ed, it just won't _do_ 
anything (unless the work hasn't been canceled, but in that case the timer 
would have done the same thing, so nothing new is introduced).

			Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29  1:04                                                                 ` Linus Torvalds
@ 2009-07-29  1:23                                                                   ` Linus Torvalds
  2009-07-29 11:17                                                                     ` Alan Cox
  2009-07-29  8:59                                                                   ` Alan Cox
  1 sibling, 1 reply; 104+ messages in thread
From: Linus Torvalds @ 2009-07-29  1:23 UTC (permalink / raw)
  To: Alan Cox
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton



On Tue, 28 Jul 2009, Linus Torvalds wrote:
> 
> And when it _is_ called, it also clears TTY_LDISC, so now tty_ldisc_ref() 
> will return NULL, so then flush_to_ldisc() will be a no-op.
> 
> So as far as I can tell, we already protect against this whole case: if we 
> call flush_to_ldisc() after the tty has been HUP'ed, it just won't _do_ 
> anything (unless the work hasn't been canceled, but in that case the timer 
> would have done the same thing, so nothing new is introduced).

Btw, if you worry about the fact that this all could happen concurrently 
(ie the hangup is done on one cpu, just as the other cpu is doing that 
flush_to_ldisc() thing), then again, Ogawa's patch doesn't actually change 
anything. The synchronous flush_to_ldisc() (done by Ogawa's patch) could 
equally have been an asynchronous (done by a timer), and the timer may 
already have triggered - so 'tty_ldisc_halt()' doing a cancel on the 
delayed work is too late.

So I don't think there are any new races wrt concurrency there either, 
even though we do not take any locks in the tty_flush_to_ldisc() case. 
Because the timer wouldn't have taken any locks either..

Of course, if "tty_ldisc_halt()" (to remove any pending timer) and 
"tty_ldisc_wait_idle()" (to make sure nothing else is executing right 
then) is not sufficient, then there's a possible problem there if you hit 
the timing just right, but again, that would be true _regardless_ of 
Ogawa's changes as far as I can tell. 

IOW, the whole argument really hinges on the fact that calling 
flush_to_ldisc() manually (without any locking) is really not 
fundamentally any different from the delayed work doing it from a timer. 

And when we _do_ disable the timer, we also make that flush_to_ldisc() 
function be a no-op, so the "tty_ldisc_halt()" effectively stops both the 
timer and (conceptually) the manual call case too. 

So now we have one remaining case, namely the case of the ldisc then being 
re-initialized and TTY_LDISC is set again. But at that point, calling 
flush_to_ldisc() had better be ok again, wouldn't you agree?

		Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken
  2009-07-25 14:05           ` Alan Cox
                               ` (2 preceding siblings ...)
  2009-07-26 17:41             ` Aneesh Kumar K.V
@ 2009-07-29  2:20             ` Gene Heskett
  3 siblings, 0 replies; 104+ messages in thread
From: Gene Heskett @ 2009-07-29  2:20 UTC (permalink / raw)
  To: Alan Cox
  Cc: OGAWA Hirofumi, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML,
	Andrew Morton

On Saturday 25 July 2009, Alan Cox wrote:
>Actually try this:
>
>
>commit b0e6bdde87725a5d46273ecc4bd00c54bd675848
>Author: Alan Cox <alan@linux.intel.com>
>Date:   Sat Jul 25 15:00:04 2009 +0100
>
>    pty: ensure writes hit the reader before close
>
>    This is elegant in all the wrong ways. Put the pty into low latency mode
> (which we can do as we always post bytes from user context). The
> tty_flip_buffer_push then always calls into the ldisc which means we clear
> the ldisc buffer before we set the TTY_OTHER_CLOSED flag.
>
>    Means pty has subtle knowledge of tty internals we really don't want it
> to, but it fixes the problem for the moment.
>
>    Signed-off-by: Alan Cox <alan@linux.intel.com>
>
>diff --git a/drivers/char/pty.c b/drivers/char/pty.c
>index 6e6942c..87d729b 100644
>--- a/drivers/char/pty.c
>+++ b/drivers/char/pty.c
>@@ -47,10 +47,12 @@ static void pty_close(struct tty_struct *tty, struct
> file *filp) }
> 	wake_up_interruptible(&tty->read_wait);
> 	wake_up_interruptible(&tty->write_wait);
>+
> 	tty->packet = 0;
> 	if (!tty->link)
> 		return;
> 	tty->link->packet = 0;
>+	tty_flip_buffer_push(tty->link);
> 	set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
> 	wake_up_interruptible(&tty->link->read_wait);
> 	wake_up_interruptible(&tty->link->write_wait);
>@@ -207,6 +209,7 @@ static int pty_open(struct tty_struct *tty, struct file
> *filp) clear_bit(TTY_OTHER_CLOSED, &tty->link->flags);
> 	set_bit(TTY_THROTTLED, &tty->flags);
> 	retval = 0;
>+	tty->low_latency = 1;
> out:
> 	return retval;
> }

Tracing this kdesu thread down per advice from Rafael W.  This patch doesn't 
seem to do anything about bz 13841. There are more patches in this thread, 
I'll give them some testing too.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

"Microsoft is the epitome of innovation and product quality."

   -- This testimonial paid for by Microsoft. 


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29  0:10                                                             ` Linus Torvalds
  2009-07-29  0:26                                                               ` Linus Torvalds
  2009-07-29  0:34                                                               ` Alan Cox
@ 2009-07-29  2:50                                                               ` Gene Heskett
  2009-07-29  4:49                                                                 ` Linus Torvalds
  2 siblings, 1 reply; 104+ messages in thread
From: Gene Heskett @ 2009-07-29  2:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki,
	Ray Lee, LKML, Andrew Morton

On Tuesday 28 July 2009, Linus Torvalds wrote:
>On Wed, 29 Jul 2009, Alan Cox wrote:
>> BTW: The tty->low_latency fix doesn't work, because the ->write method
>> can be called from an IRQ and that means we can't use ->low_latency=1 as
>> we take mutexes.
>
>Ok. So the end result is that Ogawa-san's fix is the right one. Then we
>can revert the low_latency=1 thing for pty's entirely. No?
>
>And this is also the one that _looks_ the sanest - ie we do it on the read
>side, where it matters, rather than on the write side or the flush side
>(where the proper flushing can _also_ fix the problem, but where it's much
>more problematic, and where it's a lot less direct about what we care
>about).
>
>This is just Ogawa's patch, redone against current -git, and with commit
>3a54297478e6578f96fd54bf4daa1751130aca86 reverted (Ogawa's patch already
>undid the non-low_latency part of it).
>
>Now, I wonder if some _other_ line discipline might want to do that same
>tty_flush_to_ldisc thing in their "do I have data" logic, but I didn't
>look any closer.
>
>So does this work for everyone? I haven't tested it yet myself, but this
>is the patch that "looks" right.
>
>		Linus

It doesn't seem to do anything for bz 13841 here.  Sorry, I wish it did.

Thanks Linus.
>
>---
> drivers/char/n_tty.c      |    1 +
> drivers/char/pty.c        |    2 --
> drivers/char/tty_buffer.c |   13 +++++++++++++
> include/linux/tty.h       |    1 +
> 4 files changed, 15 insertions(+), 2 deletions(-)
>
>diff --git a/drivers/char/n_tty.c b/drivers/char/n_tty.c
>index ff47907..973be2f 100644
>--- a/drivers/char/n_tty.c
>+++ b/drivers/char/n_tty.c
>@@ -1583,6 +1583,7 @@ static int n_tty_open(struct tty_struct *tty)
>
> static inline int input_available_p(struct tty_struct *tty, int amt)
> {
>+	tty_flush_to_ldisc(tty);
> 	if (tty->icanon) {
> 		if (tty->canon_data)
> 			return 1;
>diff --git a/drivers/char/pty.c b/drivers/char/pty.c
>index 3850a68..6e6942c 100644
>--- a/drivers/char/pty.c
>+++ b/drivers/char/pty.c
>@@ -52,7 +52,6 @@ static void pty_close(struct tty_struct *tty, struct file
> *filp) return;
> 	tty->link->packet = 0;
> 	set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
>-	tty_flip_buffer_push(tty->link);
> 	wake_up_interruptible(&tty->link->read_wait);
> 	wake_up_interruptible(&tty->link->write_wait);
> 	if (tty->driver->subtype == PTY_TYPE_MASTER) {
>@@ -208,7 +207,6 @@ static int pty_open(struct tty_struct *tty, struct file
> *filp) clear_bit(TTY_OTHER_CLOSED, &tty->link->flags);
> 	set_bit(TTY_THROTTLED, &tty->flags);
> 	retval = 0;
>-	tty->low_latency = 1;
> out:
> 	return retval;
> }
>diff --git a/drivers/char/tty_buffer.c b/drivers/char/tty_buffer.c
>index 810ee25..3108991 100644
>--- a/drivers/char/tty_buffer.c
>+++ b/drivers/char/tty_buffer.c
>@@ -462,6 +462,19 @@ static void flush_to_ldisc(struct work_struct *work)
> }
>
> /**
>+ *	tty_flush_to_ldisc
>+ *	@tty: tty to push
>+ *
>+ *	Push the terminal flip buffers to the line discipline.
>+ *
>+ *	Must not be called from IRQ context.
>+ */
>+void tty_flush_to_ldisc(struct tty_struct *tty)
>+{
>+	flush_to_ldisc(&tty->buf.work.work);
>+}
>+
>+/**
>  *	tty_flip_buffer_push	-	terminal
>  *	@tty: tty to push
>  *
>diff --git a/include/linux/tty.h b/include/linux/tty.h
>index 1488d8c..e8c6c91 100644
>--- a/include/linux/tty.h
>+++ b/include/linux/tty.h
>@@ -394,6 +394,7 @@ extern void __do_SAK(struct tty_struct *tty);
> extern void disassociate_ctty(int priv);
> extern void no_tty(void);
> extern void tty_flip_buffer_push(struct tty_struct *tty);
>+extern void tty_flush_to_ldisc(struct tty_struct *tty);
> extern void tty_buffer_free_all(struct tty_struct *tty);
> extern void tty_buffer_flush(struct tty_struct *tty);
> extern void tty_buffer_init(struct tty_struct *tty);
>--
>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at  http://www.tux.org/lkml/


-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

Small animal kamikaze attack on power supplies


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29  2:50                                                               ` Gene Heskett
@ 2009-07-29  4:49                                                                 ` Linus Torvalds
  2009-07-29  4:54                                                                   ` Linus Torvalds
                                                                                     ` (2 more replies)
  0 siblings, 3 replies; 104+ messages in thread
From: Linus Torvalds @ 2009-07-29  4:49 UTC (permalink / raw)
  To: Gene Heskett
  Cc: Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki,
	Ray Lee, LKML, Andrew Morton



On Tue, 28 Jul 2009, Gene Heskett wrote:
> 
> It doesn't seem to do anything for bz 13841 here.  Sorry, I wish it did.

No, bz 13841 isn't about pty's, it's about usb serial. The patches in this 
thread would mainly be pty-related, and would affect other tty drivers 
(like your usb one) only purely by chance.

could you bisect the 13841 behavior?

		Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29  4:49                                                                 ` Linus Torvalds
@ 2009-07-29  4:54                                                                   ` Linus Torvalds
  2009-07-29  5:04                                                                     ` Gene Heskett
  2009-07-29  5:00                                                                   ` Gene Heskett
  2009-07-29 11:07                                                                   ` Alan Cox
  2 siblings, 1 reply; 104+ messages in thread
From: Linus Torvalds @ 2009-07-29  4:54 UTC (permalink / raw)
  To: Gene Heskett
  Cc: Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki,
	Ray Lee, LKML, Andrew Morton



On Tue, 28 Jul 2009, Linus Torvalds wrote:
> 
> could you bisect the 13841 behavior?

Oh - and have you checked current -git? There's commit c56d300086, which 
may well have fixed some problems with openign a serial USB device.

		Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29  4:49                                                                 ` Linus Torvalds
  2009-07-29  4:54                                                                   ` Linus Torvalds
@ 2009-07-29  5:00                                                                   ` Gene Heskett
  2009-07-29  5:08                                                                     ` Andrew Morton
  2009-07-29 11:07                                                                   ` Alan Cox
  2 siblings, 1 reply; 104+ messages in thread
From: Gene Heskett @ 2009-07-29  5:00 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki,
	Ray Lee, LKML, Andrew Morton

On Wednesday 29 July 2009, Linus Torvalds wrote:
>On Tue, 28 Jul 2009, Gene Heskett wrote:
>> It doesn't seem to do anything for bz 13841 here.  Sorry, I wish it did.
>
>No, bz 13841 isn't about pty's, it's about usb serial. The patches in this
>thread would mainly be pty-related, and would affect other tty drivers
>(like your usb one) only purely by chance.
>
>could you bisect the 13841 behavior?
>
>		Linus

I may have to bite the bullet and spend a couple days doing that.  I don't 
know as my f10 installed git is new enough to work with the recent changes in 
it, I've rx'd several notices from Junio about new versions that fedora hasn't 
offered as updates.

Link to a tutorial plz?

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

One picture is worth 128K words.


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29  4:54                                                                   ` Linus Torvalds
@ 2009-07-29  5:04                                                                     ` Gene Heskett
  0 siblings, 0 replies; 104+ messages in thread
From: Gene Heskett @ 2009-07-29  5:04 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki,
	Ray Lee, LKML, Andrew Morton

On Wednesday 29 July 2009, Linus Torvalds wrote:
>On Tue, 28 Jul 2009, Linus Torvalds wrote:
>> could you bisect the 13841 behavior?
>
>Oh - and have you checked current -git? There's commit c56d300086, which
>may well have fixed some problems with openign a serial USB device.
>
>		Linus

No I haven't Linus, been running on your vanilla releases.  I'd say I know 
just enough about this to be dangerous (to myself) :)

Syntax to do that if my git is new enough?

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

The average Ph.D thesis is nothing but the transference of bones from
one graveyard to another.
		-- J. Frank Dobie, "A Texan in England"


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29  5:00                                                                   ` Gene Heskett
@ 2009-07-29  5:08                                                                     ` Andrew Morton
  2009-07-29  7:46                                                                       ` Gene Heskett
  0 siblings, 1 reply; 104+ messages in thread
From: Andrew Morton @ 2009-07-29  5:08 UTC (permalink / raw)
  To: Gene Heskett
  Cc: Linus Torvalds, Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V,
	Rafael J. Wysocki, Ray Lee, LKML

On Wed, 29 Jul 2009 01:00:45 -0400 Gene Heskett <gene.heskett@verizon.net> wrote:

> Link to a tutorial plz?

http://landley.net/writing/git-quick.html

That used to be at http://www.kernel.org/doc/local/git-quick.html but some
dope removed it.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29  0:26                                                               ` Linus Torvalds
@ 2009-07-29  7:01                                                                 ` Aneesh Kumar K.V
  0 siblings, 0 replies; 104+ messages in thread
From: Aneesh Kumar K.V @ 2009-07-29  7:01 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alan Cox, OGAWA Hirofumi, Rafael J. Wysocki, Ray Lee, LKML,
	Andrew Morton

On Tue, Jul 28, 2009 at 05:26:16PM -0700, Linus Torvalds wrote:
> 
> 
> On Tue, 28 Jul 2009, Linus Torvalds wrote:
> > 
> > So does this work for everyone? I haven't tested it yet myself, but this 
> > is the patch that "looks" right.
> 
> This thing (on top of current -git) seems to pass Ogawa's tests at least 
> on my machine (both on SMP, and when the processes are limited to just a 
> single CPU).
> 
> Of course, since he wrote both the patch _and_ the tests, that doesn't 
> come as a huge surprise - you'd expect Ogawa's patch to fix the testcase 
> he has.
> 
> I use neither kdesu nor GNU emacs, so people who see the problem should 
> test this patch, and maybe we can put _this_ particular issue behind us.
> 

The patch also fix the "compile in emacs" bug.
http://bugzilla.kernel.org/show_bug.cgi?id=13815

-aneesh

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29  5:08                                                                     ` Andrew Morton
@ 2009-07-29  7:46                                                                       ` Gene Heskett
  0 siblings, 0 replies; 104+ messages in thread
From: Gene Heskett @ 2009-07-29  7:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V,
	Rafael J. Wysocki, Ray Lee, LKML

On Wednesday 29 July 2009, Andrew Morton wrote:
>On Wed, 29 Jul 2009 01:00:45 -0400 Gene Heskett <gene.heskett@verizon.net> 
wrote:
>> Link to a tutorial plz?
>
>http://landley.net/writing/git-quick.html
>
>That used to be at http://www.kernel.org/doc/local/git-quick.html but some
>dope removed it.

The ./configure line fails, on page 2 of that document, so I moved the .config 
files from 31-rc4 into that tree, and I'm using my own script to do the build 
and install, fiddling with the version numbers to track, calling the first 
build 2.6.31-rc3.5.  The build seems to be going ok, but I had to pull in the 
firmware/radeon tree from my own 31-rc4 in order to get the firmware my script 
& the .config expects.

As my script bounces in and out of the src tree, it needs the src tree named 
consistently.  So I had to restart it a couple of times to fix all those 
gotcha's.  But now I am ready to reboot, starting with me some zz's.  The 
build didn't spit any more stuff than usual.  5 section miss-matches I've been 
looking at for over a year, and some recent adds of rnonce.data stuff that are 
fussing about not being initialized.  I'll test for good/bad in the morning.


-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

The new Congressmen say they're going to turn the government around.  I
hope I don't get run over again.


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29  1:04                                                                 ` Linus Torvalds
  2009-07-29  1:23                                                                   ` Linus Torvalds
@ 2009-07-29  8:59                                                                   ` Alan Cox
  2009-07-29 15:48                                                                     ` Linus Torvalds
  1 sibling, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-29  8:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton

> > - The driver ensures that it will not call
> >   tty_flip_buffer_push/flush_to_ldisc again for this port until re-opened
> 
> That's just bogus.

I didn't invent it, thats how it works and its not an area I've touched.

> If that is wrong as per hangup code, then the bug is in the hangup 
> handling, not in the tty_flush_to_ldisc().

I wouldn't argue with that - I merely pointed out they need synchronizing
> 
> > - The driver calls tty_hangup
> > - tty_hangup ensures that tty_flip_buffer_push cannot occur again
> >   (by killing the workqueue)
> > - resources may well then get freed before close()
> 
> They had better not be, since all the data structures touched are inside 
> the 'tty_struct' (which we're dereferencing in other ways anyway in that 
> whole routine).

You are only looking at pty. That code is used for all the real physical
tty devices too. With real devices the underlying physical device and its
structures can get dumped. When you run the n_tty ldisc you call back out
to the drivers for echo etc.

> So the only thing that the hangup code needs to do is to make that the 
> "tty->buf.work.work" function pointer is a nop. And it does, as far as I
> can tell.

What happens if the hangup occurs just after you start running the ldisc
on another CPU ?

> So regardless, by now we have moved from "trivial bug that bites people in 
> real life" to "theoretical bug that looks impossible to trigger".

Actually all the hangup races turn up for people. Not often but now and
then. Also because you have vhangup() you can cause them in software by
leaving one app spinning in a loop hanging up and opening stuff while you
try and make it break.

> Well, put this way: the only thing that actually stops the outstanding 
> timer (for the delayed work) is the tty_ldisc_halt() call in 
> tty_ldisc_hangup(). If that _isn't_ called, then your argument is 
> pointless, since the tty_flush_to_ldisc() will be done by a timer later 
> (and Ogawa's patch thus clearly introduces nothing new).
> 
> And when it _is_ called, it also clears TTY_LDISC, so now tty_ldisc_ref() 
> will return NULL, so then flush_to_ldisc() will be a no-op.

IFF the hangup doesn't occur while you are entering flush_to_ldisc()

Consider a real tty for a bit

	CPU0					CPU1
n_tty methods
	flush_to_ldisc
	get ldisc ref
						INTERRUPT
						tty_hangup
						do_tty_hangup
	ldisc work				tty_ldisc_hangup
						RESET_TERMIOS is false
						tty->ops->hangup()
						[usb]serial_hangup()
						[usb]serial_do_down()
							close physical
								driver

	tty->ops->write
						[usb]serial_write
						WARN()


So as I said before you need to fix flush_to_ldisc and the hangup running
against one another. At the very least I think you need a
tty_ldisc_wait_idle(tty); just before

        if (tty->driver->flags & TTY_DRIVER_RESET_TERMIOS) {

so that you stall the hangup until n_tty exits the ldisc.


(The other case is calling ld->ops->hangup then calling ld->ops->write
which no ldisc seems to care about)


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29  4:49                                                                 ` Linus Torvalds
  2009-07-29  4:54                                                                   ` Linus Torvalds
  2009-07-29  5:00                                                                   ` Gene Heskett
@ 2009-07-29 11:07                                                                   ` Alan Cox
  2009-07-29 17:40                                                                     ` Gene Heskett
  2 siblings, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-29 11:07 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Gene Heskett, OGAWA Hirofumi, Aneesh Kumar K.V,
	Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton, Alan Stern

On Tue, 28 Jul 2009 21:49:05 -0700 (PDT)
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Tue, 28 Jul 2009, Gene Heskett wrote:
> > 
> > It doesn't seem to do anything for bz 13841 here.  Sorry, I wish it did.
> 
> No, bz 13841 isn't about pty's, it's about usb serial. The patches in this 
> thread would mainly be pty-related, and would affect other tty drivers 
> (like your usb one) only purely by chance.
> 
> could you bisect the 13841 behavior?

See the thread in 13821 on linux-usb list which extends from the "why do
we leak ttyUSB numbers" into this as well. Its been pinned down and the in
tree fix is confirmed to do the job for that case. Not perfectly because
it needs a further small fix as

a) it should call drv->close() within the mutex on the error path if the
tty_block_til_ready() fails without which you get a leak on a few drivers
but nothing too harmful. (Noted by Alan Stern inspecting the patch)

b) it fails the 'variable frequency generator on carrier pin' test - but
that isn't a regression as such as all 2.6 kernels I've tried crash
during that test with USB.  (Running my 'extreme violence' test setup)

I suspect that doing

        retval = tty_port_block_til_ready(&port->port, tty, filp);
        if (retval == 0) {
                if (!first)
                        usb_serial_put(serial);
                return 0;
        }
        mutex_lock(&port->mutex);


	if (tty_hung_up_p(filp)) {
		mutex_unlock(&port->mutex);
		return retval;
	}
	dev->close(port);

	...

fixes both. The race basically is

		open
		drop mutex so we can wait for the port
		wait for the port
						[hangup]
						[serial_do_down]
		block_til_ready reports an error
		take mutex
			duplicate serial_do_down


Most serial drivers don't try and do open clean up in open() instead they
do it in close() which is called by the tty layer on an open failure
(always has been) and which turns out to be a useful design
decision as it means the driver doesn't have to tie itself in knots and
tty_port_close_start() handles all the mechanics. Plus they use
ASYNC_INITIALIZED as flag to avoid double shutdowns on a device.

Simply deleting most of the clean up from serial_open and letting close()
do its job would I think clean that up no end but not in an -rc perhaps.



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29  1:23                                                                   ` Linus Torvalds
@ 2009-07-29 11:17                                                                     ` Alan Cox
  0 siblings, 0 replies; 104+ messages in thread
From: Alan Cox @ 2009-07-29 11:17 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton

> IOW, the whole argument really hinges on the fact that calling 
> flush_to_ldisc() manually (without any locking) is really not 
> fundamentally any different from the delayed work doing it from a timer. 

Which means the hangup path should be doing a cancel_delayed_work() in
the case where its not resetting the termios as well and since it isn't
you can't actually (fingers crossed) make the problem worse, its merely as
broken as before.

Ok

> And when we _do_ disable the timer, we also make that flush_to_ldisc() 
> function be a no-op, so the "tty_ldisc_halt()" effectively stops both the 
> timer and (conceptually) the manual call case too. 
> 
> So now we have one remaining case, namely the case of the ldisc then being 
> re-initialized and TTY_LDISC is set again. But at that point, calling 
> flush_to_ldisc() had better be ok again, wouldn't you agree?

The ldisc re-init via SETD should be fine as it locks versus hangup
anyway and always has to.


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29  8:59                                                                   ` Alan Cox
@ 2009-07-29 15:48                                                                     ` Linus Torvalds
  2009-07-29 15:55                                                                       ` Alan Cox
  0 siblings, 1 reply; 104+ messages in thread
From: Linus Torvalds @ 2009-07-29 15:48 UTC (permalink / raw)
  To: Alan Cox
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton



On Wed, 29 Jul 2009, Alan Cox wrote:
> > 
> > > - The driver calls tty_hangup
> > > - tty_hangup ensures that tty_flip_buffer_push cannot occur again
> > >   (by killing the workqueue)
> > > - resources may well then get freed before close()
> > 
> > They had better not be, since all the data structures touched are inside 
> > the 'tty_struct' (which we're dereferencing in other ways anyway in that 
> > whole routine).
> 
> You are only looking at pty. That code is used for all the real physical
> tty devices too. With real devices the underlying physical device and its
> structures can get dumped. When you run the n_tty ldisc you call back out
> to the drivers for echo etc.

I actually only meant the "flush_to_ldisc()" part. We'll never get any 
further than that due to the reasons I outlined.

> > So the only thing that the hangup code needs to do is to make that the 
> > "tty->buf.work.work" function pointer is a nop. And it does, as far as I
> > can tell.
> 
> What happens if the hangup occurs just after you start running the ldisc
> on another CPU ?

But Alan, that was my point: Ogawa's patch in no way changes any existing 
behavior. The case you mention ALREADY HAPPENS, regardless of Ogawa's 
patch.

If a timer goes off at the same time hangup happens, you have the exact 
same situation. You can have one CPU doing hangup processing, and one CPU 
having just triggered the timeout and doing flush_to_ldisc.

> Consider a real tty for a bit
> 
> 	CPU0					CPU1
> n_tty methods
> 	flush_to_ldisc
> 	get ldisc ref
> 						INTERRUPT
> 						tty_hangup
> 						do_tty_hangup

Yes. Consider exactly that. And notice how it happens with or without 
Ogawa's patch. Just instead of "n_tty methods", you have "delayed-work 
timer".

> So as I said before you need to fix flush_to_ldisc and the hangup running
> against one another. At the very least I think you need a
> tty_ldisc_wait_idle(tty); just before
> 
>         if (tty->driver->flags & TTY_DRIVER_RESET_TERMIOS) {
> 
> so that you stall the hangup until n_tty exits the ldisc.

The tty_ldisc_hangup() already does tty_ldisc_wait_idle() after disabling 
the timer (and clearing the TTY_LDISC bit), so that all seems fine 
already. No?

				Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29 15:48                                                                     ` Linus Torvalds
@ 2009-07-29 15:55                                                                       ` Alan Cox
  2009-07-29 16:05                                                                         ` Linus Torvalds
  0 siblings, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-29 15:55 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton

> The tty_ldisc_hangup() already does tty_ldisc_wait_idle() after disabling 
> the timer (and clearing the TTY_LDISC bit), so that all seems fine 
> already. No?

We only do tty_ldisc_wait_idle if tty->driver->flags & TTY_RESET_TERMIOS
is set that I can see ?

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29 15:55                                                                       ` Alan Cox
@ 2009-07-29 16:05                                                                         ` Linus Torvalds
  2009-07-29 16:39                                                                           ` Alan Cox
  0 siblings, 1 reply; 104+ messages in thread
From: Linus Torvalds @ 2009-07-29 16:05 UTC (permalink / raw)
  To: Alan Cox
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton



On Wed, 29 Jul 2009, Alan Cox wrote:

> > The tty_ldisc_hangup() already does tty_ldisc_wait_idle() after disabling 
> > the timer (and clearing the TTY_LDISC bit), so that all seems fine 
> > already. No?
> 
> We only do tty_ldisc_wait_idle if tty->driver->flags & TTY_RESET_TERMIOS
> is set that I can see ?

I agree, but that's also the only time we do that 'tty_ldisc_halt()' that 
cancels the timer. So if there is something that depends on the 
flush_to_ldisc() not happening, the bug was pre-existing since it never 
got rid of the flush to begin with, so the flush_to_ldisc() would happen 
at some random later time as a result of the delayed-work timer.

That said, I suspect that from a sanity standpoint, I suspect something 
like the appended migth be a good idea. But I think it's an independent 
issue (and unless somebody can actually point to an actual problem, I'd 
only apply something like this during the merge window, not after -rc4).

		Linus
---
 drivers/char/tty_ldisc.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/char/tty_ldisc.c b/drivers/char/tty_ldisc.c
index acd76b7..640b100 100644
--- a/drivers/char/tty_ldisc.c
+++ b/drivers/char/tty_ldisc.c
@@ -778,6 +778,8 @@ void tty_ldisc_hangup(struct tty_struct *tty)
 		if (ld->ops->hangup)
 			ld->ops->hangup(tty);
 		tty_ldisc_deref(ld);
+		tty_ldisc_halt(tty);
+		tty_ldisc_wait_idle(tty);
 	}
 	/*
 	 * FIXME: Once we trust the LDISC code better we can wait here for
@@ -794,8 +796,6 @@ void tty_ldisc_hangup(struct tty_struct *tty)
 		mutex_lock(&tty->ldisc_mutex);
 		if (tty->ldisc) {	/* Not yet closed */
 			/* Switch back to N_TTY */
-			tty_ldisc_halt(tty);
-			tty_ldisc_wait_idle(tty);
 			tty_ldisc_reinit(tty);
 			/* At this point we have a closed ldisc and we want to
 			   reopen it. We could defer this to the next open but

^ permalink raw reply related	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29 16:05                                                                         ` Linus Torvalds
@ 2009-07-29 16:39                                                                           ` Alan Cox
  2009-07-29 19:07                                                                             ` Linus Torvalds
  0 siblings, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-29 16:39 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton

>  		tty_ldisc_deref(ld);
> +		tty_ldisc_halt(tty);
> +		tty_ldisc_wait_idle(tty);

Needs to be
	mutex_lock(&tty->ldisc_mutex);
	if (tty->ldisc)
		....
	}
	mutex_unlock(&tty->ldisc_mutex)

We can't wait for the reference to go away if we hold one, and if we
don't hold one the ldisc reference might go away during the hangup if we
close()

Odds of hitting it minimal however. Or maybe we need a smarter
tty_ldisc_wait_self_idle(ld) ?


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29 11:07                                                                   ` Alan Cox
@ 2009-07-29 17:40                                                                     ` Gene Heskett
  2009-07-29 18:28                                                                       ` Frans Pop
  0 siblings, 1 reply; 104+ messages in thread
From: Gene Heskett @ 2009-07-29 17:40 UTC (permalink / raw)
  To: Alan Cox
  Cc: Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V,
	Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton, Alan Stern

On Wednesday 29 July 2009, Alan Cox wrote:
>On Tue, 28 Jul 2009 21:49:05 -0700 (PDT)
>
>Linus Torvalds <torvalds@linux-foundation.org> wrote:
>> On Tue, 28 Jul 2009, Gene Heskett wrote:
>> > It doesn't seem to do anything for bz 13841 here.  Sorry, I wish it did.
>>
>> No, bz 13841 isn't about pty's, it's about usb serial. The patches in this
>> thread would mainly be pty-related, and would affect other tty drivers
>> (like your usb one) only purely by chance.
>>
>> could you bisect the 13841 behavior?
>
>See the thread in 13821 on linux-usb list which extends from the "why do
>we leak ttyUSB numbers" into this as well. Its been pinned down and the in
>tree fix is confirmed to do the job for that case. Not perfectly because
>it needs a further small fix as
>
>a) it should call drv->close() within the mutex on the error path if the
>tty_block_til_ready() fails without which you get a leak on a few drivers
>but nothing too harmful. (Noted by Alan Stern inspecting the patch)
>
>b) it fails the 'variable frequency generator on carrier pin' test - but
>that isn't a regression as such as all 2.6 kernels I've tried crash
>during that test with USB.  (Running my 'extreme violence' test setup)
>
>I suspect that doing
>
>        retval = tty_port_block_til_ready(&port->port, tty, filp);
>        if (retval == 0) {
>                if (!first)
>                        usb_serial_put(serial);
>                return 0;
>        }
>        mutex_lock(&port->mutex);
>
>
>	if (tty_hung_up_p(filp)) {
>		mutex_unlock(&port->mutex);
>		return retval;
>	}
>	dev->close(port);
>
>	...
>
>fixes both. The race basically is
>
>		open
>		drop mutex so we can wait for the port
>		wait for the port
>						[hangup]
>						[serial_do_down]
>		block_til_ready reports an error
>		take mutex
>			duplicate serial_do_down
>
>
>Most serial drivers don't try and do open clean up in open() instead they
>do it in close() which is called by the tty layer on an open failure
>(always has been) and which turns out to be a useful design
>decision as it means the driver doesn't have to tie itself in knots and
>tty_port_close_start() handles all the mechanics. Plus they use
>ASYNC_INITIALIZED as flag to avoid double shutdowns on a device.
>
>Simply deleting most of the clean up from serial_open and letting close()
>do its job would I think clean that up no end but not in an -rc perhaps.

I have now done a bisect, but I'm not convinced its correct.  Since the 
./configure line didn't work in that tutorial, I fell back to using my own 
"makeit" script which largely automates the build and installation of a 
kernel, all I have to do is make sure its internal $VER matches what is in the 
Makefile.

And there by hangs the tail of some mistakes.  As I ran the bisect, the 
Makefile regressed all the way to 2.6.30 and screwed up my make a few times 
because the Makefile version was not stable from bisect run to bisect run.
The final, no choices left reboot was bad cuz it Ooops when that script was 
run the last 3 times, but there were no choices left to try.
Here is that 'git bisect log >../bisect-final.log'

git bisect start
# good: [6847e154e3cd74fca6084124c097980a7634285a] Linux 2.6.31-rc3
git bisect good 6847e154e3cd74fca6084124c097980a7634285a
# bad: [4be3bd7849165e7efa6b0b35a23d6a3598d97465] Linux 2.6.31-rc4
git bisect bad 4be3bd7849165e7efa6b0b35a23d6a3598d97465
# bad: [ae42b9e1ca8969d52e51f5e461b2e89e180943dd] Merge branch 'for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/vapier/blackfin
git bisect bad ae42b9e1ca8969d52e51f5e461b2e89e180943dd
# bad: [301d95c4dade09388f94258ee797d2d650dc00b5] Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
git bisect bad 301d95c4dade09388f94258ee797d2d650dc00b5
# bad: [301d95c4dade09388f94258ee797d2d650dc00b5] Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
git bisect bad 301d95c4dade09388f94258ee797d2d650dc00b5
# good: [e9e961c9a818a2f24711af493b907a8e40a69efc] Merge branch 'i2c-for-2631-
rc3' of git://aeryn.fluff.org.uk/bjdooks/linux
git bisect good e9e961c9a818a2f24711af493b907a8e40a69efc
# good: [63f7a330014ad29b662638caabd8e96fe945b9ed] Merge branch 'timers-fixes-
for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect good 63f7a330014ad29b662638caabd8e96fe945b9ed
# bad: [b983d0deb0e28f8880cdea79def575d96a27e603] Merge branch 'tracing-fixes-
for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect bad b983d0deb0e28f8880cdea79def575d96a27e603
# bad: [c20b08e3986c2dbfa6df1e880bf4f7159994d199] sched: Fix rt_rq-
>pushable_tasks initialization in init_rt_rq()
git bisect bad c20b08e3986c2dbfa6df1e880bf4f7159994d199
# bad: [a1ba4d8ba9f06a397e97cbd67a93ee306860b40a] sched_rt: Fix overload bug 
on rt group scheduling
git bisect bad a1ba4d8ba9f06a397e97cbd67a93ee306860b40a

What is needed for a mistake free bisect is a 'doesn't matter as long as it 
matches' Makefile version that survives all the way through a bisect run.  
When I was done, I had to go back to my own 2.6.30 and 2.6.31-rc3 trees and do 
a full rebuild and reinstall in order to have a clean boot newer that 2.6.29.

Because of these errors, and the fact that the final 3 2.6.30's it built were 
bad, I am not convinced I have a valid bisect, particularly since the final 3 
builds were called 2.6.30, and they were all=bad because of an Oopsen at the 
scripts invocation from rc.local, not the i/o error that makes it loop forever 
using 98+% of all 4 cores in my phenom and thrashing the disk a bit.

If this version thing can be handled in a cleaner manner, I'll be glad to 
repeat the bisect from scratch, it turns out not to be as complex as I feared.

As it stands, its a pita because its destroying perfect good fallback boot 
setups.  I should be able to set the extraversion to -rc3.5 in both the 
Makefile and in my makeit script, and run the whole bisect without molesting 
the rest of the system, creating one set of bootfiles that are simply 
overwritten with the next bisect run. 

Thanks for listening to the rant.  What do I do next?

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

Every program has (at least) two purposes:
	the one for which it was written and another for which it wasn't.


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29 17:40                                                                     ` Gene Heskett
@ 2009-07-29 18:28                                                                       ` Frans Pop
  2009-07-29 18:43                                                                         ` Gene Heskett
  0 siblings, 1 reply; 104+ messages in thread
From: Frans Pop @ 2009-07-29 18:28 UTC (permalink / raw)
  To: Gene Heskett
  Cc: alan, torvalds, hirofumi, aneesh.kumar, rjw, ray-lk,
	linux-kernel, akpm, stern

> What is needed for a mistake free bisect is a 'doesn't matter as long
> as it matches' Makefile version that survives all the way through a
> bisect run.

I have a wrapper script I use for kernel builds that takes care of that 
(it also supports cross building and building some out-of-tree modules). 
Some snippets from that script below.

BISECTING=
if [ -e .git/BISECT_LOG ]; then
        BISECTING=1
fi
[...]
if [ "$BISECTING" ]; then
        # The version in the next line may need updating before a bisect
        sed -i "s/^SUBLEVEL = .*/SUBLEVEL = 31/" Makefile
        sed -i "s/^EXTRAVERSION =.*/EXTRAVERSION = -bisect/" Makefile
fi
[...]
make ...
[...]
if [ "$BISECTING" ]; then
	# Revert Makefile to avoid errors on 'git bisect good/bad'
        git checkout Makefile
fi

I use the deb-pkg target and also set the .deb package version in the 
second hunk:
KERNELDEBREVISION=$(grep "^git[- ]bisect" .git/BISECT_LOG | wc -l)

This way I end up with a nice series of packages whose numbering matches 
the steps in .git/BISECT_LOG:
linux-image-2.6.31-bisect_1_amd64.deb
linux-image-2.6.31-bisect_2_amd64.deb
linux-image-2.6.31-bisect_3_amd64.deb
...

Hope that help.

Cheers,
FJP

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29 18:28                                                                       ` Frans Pop
@ 2009-07-29 18:43                                                                         ` Gene Heskett
  2009-07-29 19:08                                                                           ` Frans Pop
  0 siblings, 1 reply; 104+ messages in thread
From: Gene Heskett @ 2009-07-29 18:43 UTC (permalink / raw)
  To: Frans Pop
  Cc: alan, torvalds, hirofumi, aneesh.kumar, rjw, ray-lk,
	linux-kernel, akpm, stern

On Wednesday 29 July 2009, Frans Pop wrote:
>> What is needed for a mistake free bisect is a 'doesn't matter as long
>> as it matches' Makefile version that survives all the way through a
>> bisect run.
>
>I have a wrapper script I use for kernel builds that takes care of that
>(it also supports cross building and building some out-of-tree modules).
>Some snippets from that script below.
>
>BISECTING=
>if [ -e .git/BISECT_LOG ]; then
>        BISECTING=1
>fi
>[...]
>if [ "$BISECTING" ]; then
>        # The version in the next line may need updating before a bisect
>        sed -i "s/^SUBLEVEL = .*/SUBLEVEL = 31/" Makefile
>        sed -i "s/^EXTRAVERSION =.*/EXTRAVERSION = -bisect/" Makefile
>fi
>[...]
>make ...
>[...]
>if [ "$BISECTING" ]; then
>	# Revert Makefile to avoid errors on 'git bisect good/bad'
>        git checkout Makefile
Ahh, I see that now, which I was objecting to below.  I'll go quietly. :)
>fi
>
>I use the deb-pkg target and also set the .deb package version in the
>second hunk:
>KERNELDEBREVISION=$(grep "^git[- ]bisect" .git/BISECT_LOG | wc -l)
>
>This way I end up with a nice series of packages whose numbering matches
>the steps in .git/BISECT_LOG:
>linux-image-2.6.31-bisect_1_amd64.deb
>linux-image-2.6.31-bisect_2_amd64.deb
>linux-image-2.6.31-bisect_3_amd64.deb
>...
>
>Hope that help.
>
>Cheers,
>FJP

Yes, some of it will.  But thanks to fedora's broken disk partitioner, 
something I've been screaming about for a damned decade, my /boot partition 
isn't big enough to absorb a whole chain of those, hence the fixed version 
request.

This script would appear to need a restore function for the Makefile version 
because one of my 'git bisect bad's returned that it couldn't switch branches 
because of the handmade Makefile changes I'd done on the first build.  How 
have you been handling that?

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

When you don't know what to do, walk fast and look worried.


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29 16:39                                                                           ` Alan Cox
@ 2009-07-29 19:07                                                                             ` Linus Torvalds
  0 siblings, 0 replies; 104+ messages in thread
From: Linus Torvalds @ 2009-07-29 19:07 UTC (permalink / raw)
  To: Alan Cox
  Cc: OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee,
	LKML, Andrew Morton



On Wed, 29 Jul 2009, Alan Cox wrote:
> 
> Odds of hitting it minimal however. Or maybe we need a smarter
> tty_ldisc_wait_idle(ld) ?

Just adding the ldisc_mutex around the call sounds like the simplest 
solution. 

That said, looking at the callers of tty_ldisc_wait_idle(), it looks like 
we have other similar problems already in tty_ldisc_release(), which also 
calls it without holding the lock, both for the "self" case and then 
recursively for o_tty.

Moving the mutex_lock() up a bit in tty_ldisc_release() looks like the 
trivial solution, although I suspect there are any deadlock issues there 
(ie refs that won't go away because we hold the lock and the thing we are 
waiting for needs the lock to release the ldisc!).

So making tty_ldisc_wait_idle() more careful adn work without the lock 
would definitely be the safer thing to do. Requiring the lock looks 
potentially pretty dangerous.

		Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29 18:43                                                                         ` Gene Heskett
@ 2009-07-29 19:08                                                                           ` Frans Pop
  2009-07-29 19:19                                                                             ` Gene Heskett
  0 siblings, 1 reply; 104+ messages in thread
From: Frans Pop @ 2009-07-29 19:08 UTC (permalink / raw)
  To: Gene Heskett
  Cc: alan, torvalds, hirofumi, aneesh.kumar, rjw, ray-lk,
	linux-kernel, akpm, stern

On Wednesday 29 July 2009, Gene Heskett wrote:
> Yes, some of it will.  But thanks to fedora's broken disk partitioner,
> something I've been screaming about for a damned decade, my /boot
> partition isn't big enough to absorb a whole chain of those, hence the
> fixed version request.

My /boot isn't big enough for a full series either, but my $HOME is :-)

The fact that the packages I create all have the same package name and 
only the package version differs, means that I do get the complete series 
but only have one version installed in /boot at any time (as installing a 
new version overwrites the version from the previous bisect iteration). 

No idea if you can do the same on an RPM-based system easily.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Regression] kdesu broken, now usb fixed in current git pull
  2009-07-24 18:25   ` Aneesh Kumar K.V
  2009-07-25 12:07     ` Alan Cox
@ 2009-07-29 19:09     ` Gene Heskett
  1 sibling, 0 replies; 104+ messages in thread
From: Gene Heskett @ 2009-07-29 19:09 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: Ray Lee, Rafael J. Wysocki, LKML, Andrew Morton, torvalds, alan

On Friday 24 July 2009, Aneesh Kumar K.V wrote:
>On Thu, Jul 23, 2009 at 05:21:36PM -0700, Ray Lee wrote:
>> On Thu, Jul 23, 2009 at 4:45 PM, Rafael J. Wysocki<rjw@sisk.pl> wrote:
>> > A recent kernel change broke kdesu (from KDE 4.2) on my test boxes.
>> >  ISTR a discussion about that, but I can't find it right now.  Any
>> > clues?
>>
>> See the thread starting here: ("possible regression with pty.c commit")
>>    http://lkml.org/lkml/2009/7/11/125
>
>I am also facing a similar problem.
>
>http://bugzilla.kernel.org/show_bug.cgi?id=13815
>
>-aneesh

I just did a git bisect reset master;git pull and built a new 2.6.31-rc4.  The 
problem with my bash script has now been resolved.  Many thanks.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

Nothing cures insomnia like the realization that it's time to get up.


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29 19:08                                                                           ` Frans Pop
@ 2009-07-29 19:19                                                                             ` Gene Heskett
  2009-07-30 12:43                                                                               ` Valdis.Kletnieks
  0 siblings, 1 reply; 104+ messages in thread
From: Gene Heskett @ 2009-07-29 19:19 UTC (permalink / raw)
  To: Frans Pop
  Cc: alan, torvalds, hirofumi, aneesh.kumar, rjw, ray-lk,
	linux-kernel, akpm, stern

On Wednesday 29 July 2009, Frans Pop wrote:
>On Wednesday 29 July 2009, Gene Heskett wrote:
>> Yes, some of it will.  But thanks to fedora's broken disk partitioner,
>> something I've been screaming about for a damned decade, my /boot
>> partition isn't big enough to absorb a whole chain of those, hence the
>> fixed version request.
>
>My /boot isn't big enough for a full series either, but my $HOME is :-)
>
>The fact that the packages I create all have the same package name and
>only the package version differs, means that I do get the complete series
>but only have one version installed in /boot at any time (as installing a
>new version overwrites the version from the previous bisect iteration).
>
>No idea if you can do the same on an RPM-based system easily.

I do not normally use rpms to do that.  My makeit script handles all of that, 
keeping one old version of everything in /boot and /lib/modules that I can 
revert by hand if the new kernel is a showstopper.

Whenever I let rpm get access to my grub.conf, I always have to go back in 
with vim & fix the stanza numbers I keep track of the boots with.  Grub 
doesn't really care, but I do. :)  rpm should be so neat...

Anybody that wants this script (actually there are 2 of them), I can attach 
them, makeit and buildit26 (which unpacks and applies the patches plus running 
a make oldconfig) are about 100 lines each.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

I just got out of the hospital after a speed reading accident.
I hit a bookmark.
		-- Steven Wright


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-29 19:19                                                                             ` Gene Heskett
@ 2009-07-30 12:43                                                                               ` Valdis.Kletnieks
  2009-07-30 15:35                                                                                 ` Gene Heskett
  0 siblings, 1 reply; 104+ messages in thread
From: Valdis.Kletnieks @ 2009-07-30 12:43 UTC (permalink / raw)
  To: Gene Heskett
  Cc: Frans Pop, alan, torvalds, hirofumi, aneesh.kumar, rjw, ray-lk,
	linux-kernel, akpm, stern

[-- Attachment #1: Type: text/plain, Size: 576 bytes --]

On Wed, 29 Jul 2009 15:19:19 EDT, Gene Heskett said:

> Whenever I let rpm get access to my grub.conf, I always have to go back in 
> with vim & fix the stanza numbers I keep track of the boots with.  Grub 
> doesn't really care, but I do. :)  rpm should be so neat...

Just to be accurate - this isn't rpm's fault, it's a program called 'grubby'.

I understand the reasoning - if you just installed a new kernel, you probably
want it to be the one started at next boot.  But it would be nice if grubby
had a configure option for that in /etc/sysconfig/grubby or someplace...

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-30 12:43                                                                               ` Valdis.Kletnieks
@ 2009-07-30 15:35                                                                                 ` Gene Heskett
  2009-07-30 18:39                                                                                   ` Valdis.Kletnieks
  2009-07-31  2:01                                                                                   ` H. Peter Anvin
  0 siblings, 2 replies; 104+ messages in thread
From: Gene Heskett @ 2009-07-30 15:35 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: Frans Pop, alan, torvalds, hirofumi, aneesh.kumar, rjw, ray-lk,
	linux-kernel, akpm, stern

On Thursday 30 July 2009, Valdis.Kletnieks@vt.edu wrote:
>On Wed, 29 Jul 2009 15:19:19 EDT, Gene Heskett said:
>> Whenever I let rpm get access to my grub.conf, I always have to go back in
>> with vim & fix the stanza numbers I keep track of the boots with.  Grub
>> doesn't really care, but I do. :)  rpm should be so neat...
>
>Just to be accurate - this isn't rpm's fault, it's a program called
> 'grubby'.
>
>I understand the reasoning - if you just installed a new kernel, you
> probably want it to be the one started at next boot.  But it would be nice
> if grubby had a configure option for that in /etc/sysconfig/grubby or
> someplace...

Humm, kewl.  And the src's for grubby are where?  Hopefully in C so thois old 
fart can grok...

Thanks.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

The best laid plans of mice and men are held up in the legal department.


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-30 15:35                                                                                 ` Gene Heskett
@ 2009-07-30 18:39                                                                                   ` Valdis.Kletnieks
  2009-07-31  2:01                                                                                   ` H. Peter Anvin
  1 sibling, 0 replies; 104+ messages in thread
From: Valdis.Kletnieks @ 2009-07-30 18:39 UTC (permalink / raw)
  To: Gene Heskett
  Cc: Frans Pop, alan, torvalds, hirofumi, aneesh.kumar, rjw, ray-lk,
	linux-kernel, akpm, stern

[-- Attachment #1: Type: text/plain, Size: 364 bytes --]

On Thu, 30 Jul 2009 11:35:03 EDT, Gene Heskett said:
> Humm, kewl.  And the src's for grubby are where?  Hopefully in C so thois old
> fart can grok...

% rpm -qi grubby

should provide pointers for the .src.rpm and the upstream source...

(That trick should work for *every* RPM - if you find one it doesn't work
for, that's probably a reportable bug to RedHat.)

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 23:49                                                                     ` Alan Cox
  2009-07-29  0:12                                                                       ` Greg KH
@ 2009-07-30 23:16                                                                       ` Alan Cox
  2009-07-30 23:24                                                                         ` Greg KH
  2009-07-31 13:49                                                                       ` Alan Cox
  2 siblings, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-30 23:16 UTC (permalink / raw)
  To: Alan Cox
  Cc: Theodore Tso, Greg KH, Linus Torvalds, OGAWA Hirofumi,
	Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML,
	Andrew Morton

> I'll fish them out tomorrow. What is in the -next tree is only some of
> the bits.
> 
> We still have races on the USB side too btw I put a frequency generator
> on the carrier line of my pl2303 with an app continually opening the
> port and it oopses after about 10 minutes. On the bright side so does
> 2.6.29, 2.6.27 ....

Just an update - I've not posted them yet as I've been putting the
various branches into some kind of sane order with the tested stuff first
and the "in progress" serial -> tty_port stuff last.


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-30 23:16                                                                       ` Alan Cox
@ 2009-07-30 23:24                                                                         ` Greg KH
  0 siblings, 0 replies; 104+ messages in thread
From: Greg KH @ 2009-07-30 23:24 UTC (permalink / raw)
  To: Alan Cox
  Cc: Theodore Tso, Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V,
	Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Fri, Jul 31, 2009 at 12:16:21AM +0100, Alan Cox wrote:
> > I'll fish them out tomorrow. What is in the -next tree is only some of
> > the bits.
> > 
> > We still have races on the USB side too btw I put a frequency generator
> > on the carrier line of my pl2303 with an app continually opening the
> > port and it oopses after about 10 minutes. On the bright side so does
> > 2.6.29, 2.6.27 ....
> 
> Just an update - I've not posted them yet as I've been putting the
> various branches into some kind of sane order with the tested stuff first
> and the "in progress" serial -> tty_port stuff last.

Thanks for the update, I appreciate it.

greg k-h

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-30 15:35                                                                                 ` Gene Heskett
  2009-07-30 18:39                                                                                   ` Valdis.Kletnieks
@ 2009-07-31  2:01                                                                                   ` H. Peter Anvin
  1 sibling, 0 replies; 104+ messages in thread
From: H. Peter Anvin @ 2009-07-31  2:01 UTC (permalink / raw)
  To: Gene Heskett
  Cc: Valdis.Kletnieks, Frans Pop, alan, torvalds, hirofumi,
	aneesh.kumar, rjw, ray-lk, linux-kernel, akpm, stern

On 07/30/2009 08:35 AM, Gene Heskett wrote:
> 
> Humm, kewl.  And the src's for grubby are where?  Hopefully in C so thois old 
> fart can grok...
> 

They're part of the Fedora mkinitrd package.  They're in C, but they'll
make you go blind.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-28 23:49                                                                     ` Alan Cox
  2009-07-29  0:12                                                                       ` Greg KH
  2009-07-30 23:16                                                                       ` Alan Cox
@ 2009-07-31 13:49                                                                       ` Alan Cox
  2009-07-31 14:17                                                                         ` Greg KH
  2 siblings, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-07-31 13:49 UTC (permalink / raw)
  To: Alan Cox
  Cc: Theodore Tso, Greg KH, Linus Torvalds, OGAWA Hirofumi,
	Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML,
	Andrew Morton

tar ball at

http://zeniv.linux.org.uk/~alan/ttydev.tar.gz

Series is in order of degree of working. It should all be sane down to
vt-lockactive. The stuff down to tty-usb-serial-termiosbits has had some
basic testing.

tty-usb-cleanup-open hangs still on the second open and is a work in
progress but shows where the tty close side interface was going in order
to remove all the posix logic from drivers.

beyond that is the last bits of the serial cleanup which are definitely
"work in progress", and the f81216 helper which needs reworking into the
drivers proper.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] kdesu broken
  2009-07-31 13:49                                                                       ` Alan Cox
@ 2009-07-31 14:17                                                                         ` Greg KH
  0 siblings, 0 replies; 104+ messages in thread
From: Greg KH @ 2009-07-31 14:17 UTC (permalink / raw)
  To: Alan Cox
  Cc: Theodore Tso, Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V,
	Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Fri, Jul 31, 2009 at 02:49:01PM +0100, Alan Cox wrote:
> tar ball at
> 
> http://zeniv.linux.org.uk/~alan/ttydev.tar.gz
> 
> Series is in order of degree of working. It should all be sane down to
> vt-lockactive. The stuff down to tty-usb-serial-termiosbits has had some
> basic testing.
> 
> tty-usb-cleanup-open hangs still on the second open and is a work in
> progress but shows where the tty close side interface was going in order
> to remove all the posix logic from drivers.
> 
> beyond that is the last bits of the serial cleanup which are definitely
> "work in progress", and the f81216 helper which needs reworking into the
> drivers proper.

Thanks for providing this.  I'll pull these into my tree and then tell
Stephen about it for linux-next.

Thanks again for doing the tty work, it is much appreciated.

greg k-h

^ permalink raw reply	[flat|nested] 104+ messages in thread

end of thread, other threads:[~2009-07-31 14:21 UTC | newest]

Thread overview: 104+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-07-23 23:45 [Regression] kdesu broken Rafael J. Wysocki
2009-07-24  0:21 ` Ray Lee
2009-07-24 15:21   ` Rafael J. Wysocki
2009-07-24 15:40     ` Alan Cox
2009-07-24 16:34       ` Linus Torvalds
2009-07-25  6:04         ` OGAWA Hirofumi
2009-07-25 13:31           ` Alan Cox
2009-07-25 14:05           ` Alan Cox
2009-07-25 14:55             ` OGAWA Hirofumi
2009-07-25 15:32               ` Alan Cox
2009-07-26 11:51                 ` OGAWA Hirofumi
2009-07-27 10:57                   ` Alan Cox
2009-07-27 12:07                     ` OGAWA Hirofumi
2009-07-27 12:46                       ` OGAWA Hirofumi
2009-07-27 13:23                       ` [PATCH] " Alan Cox
2009-07-27 13:50                         ` OGAWA Hirofumi
2009-07-27 13:58                           ` Alan Cox
2009-07-27 15:04                             ` OGAWA Hirofumi
2009-07-27 16:14                               ` Aneesh Kumar K.V
2009-07-27 16:42                                 ` Alan Cox
2009-07-27 17:12                                   ` Aneesh Kumar K.V
2009-07-27 19:28                                     ` OGAWA Hirofumi
2009-07-27 19:40                                       ` Linus Torvalds
2009-07-27 20:38                                         ` OGAWA Hirofumi
2009-07-27 20:45                                           ` Linus Torvalds
2009-07-27 21:42                                           ` Alan Cox
2009-07-27 22:04                                             ` Linus Torvalds
2009-07-27 22:41                                               ` Alan Cox
2009-07-27 20:52                                         ` Alan Cox
2009-07-27 21:22                                           ` Linus Torvalds
2009-07-27 21:54                                             ` Alan Cox
2009-07-27 21:20                                       ` Alan Cox
2009-07-28  5:33                                         ` OGAWA Hirofumi
2009-07-28 10:22                                           ` Alan Cox
2009-07-28 10:42                                             ` OGAWA Hirofumi
2009-07-28 15:49                                             ` Linus Torvalds
2009-07-28 16:42                                               ` Alan Cox
2009-07-28 16:49                                                 ` Linus Torvalds
2009-07-28 16:52                                                   ` Linus Torvalds
2009-07-28 17:09                                                     ` Alan Cox
2009-07-28 18:45                                                       ` Linus Torvalds
2009-07-28 17:06                                                   ` Alan Cox
2009-07-28 18:44                                                     ` Linus Torvalds
2009-07-28 18:56                                                       ` Alan Cox
2009-07-28 19:08                                                         ` Linus Torvalds
2009-07-28 19:15                                                           ` Alan Cox
2009-07-28 19:56                                                             ` Greg KH
2009-07-28 20:47                                                               ` Theodore Tso
2009-07-28 21:01                                                                 ` Greg KH
2009-07-28 22:02                                                                   ` Theodore Tso
2009-07-28 23:49                                                                     ` Alan Cox
2009-07-29  0:12                                                                       ` Greg KH
2009-07-30 23:16                                                                       ` Alan Cox
2009-07-30 23:24                                                                         ` Greg KH
2009-07-31 13:49                                                                       ` Alan Cox
2009-07-31 14:17                                                                         ` Greg KH
2009-07-28 23:46                                                           ` Alan Cox
2009-07-29  0:10                                                             ` Linus Torvalds
2009-07-29  0:26                                                               ` Linus Torvalds
2009-07-29  7:01                                                                 ` Aneesh Kumar K.V
2009-07-29  0:34                                                               ` Alan Cox
2009-07-29  1:04                                                                 ` Linus Torvalds
2009-07-29  1:23                                                                   ` Linus Torvalds
2009-07-29 11:17                                                                     ` Alan Cox
2009-07-29  8:59                                                                   ` Alan Cox
2009-07-29 15:48                                                                     ` Linus Torvalds
2009-07-29 15:55                                                                       ` Alan Cox
2009-07-29 16:05                                                                         ` Linus Torvalds
2009-07-29 16:39                                                                           ` Alan Cox
2009-07-29 19:07                                                                             ` Linus Torvalds
2009-07-29  2:50                                                               ` Gene Heskett
2009-07-29  4:49                                                                 ` Linus Torvalds
2009-07-29  4:54                                                                   ` Linus Torvalds
2009-07-29  5:04                                                                     ` Gene Heskett
2009-07-29  5:00                                                                   ` Gene Heskett
2009-07-29  5:08                                                                     ` Andrew Morton
2009-07-29  7:46                                                                       ` Gene Heskett
2009-07-29 11:07                                                                   ` Alan Cox
2009-07-29 17:40                                                                     ` Gene Heskett
2009-07-29 18:28                                                                       ` Frans Pop
2009-07-29 18:43                                                                         ` Gene Heskett
2009-07-29 19:08                                                                           ` Frans Pop
2009-07-29 19:19                                                                             ` Gene Heskett
2009-07-30 12:43                                                                               ` Valdis.Kletnieks
2009-07-30 15:35                                                                                 ` Gene Heskett
2009-07-30 18:39                                                                                   ` Valdis.Kletnieks
2009-07-31  2:01                                                                                   ` H. Peter Anvin
2009-07-28 15:48                                           ` Linus Torvalds
2009-07-28 16:16                                             ` OGAWA Hirofumi
2009-07-27 18:28                                 ` Andreas Schwab
2009-07-27 13:58                         ` Aneesh Kumar K.V
2009-07-25 20:12             ` [Regression] " Rafael J. Wysocki
2009-07-26 17:41             ` Aneesh Kumar K.V
2009-07-29  2:20             ` Gene Heskett
2009-07-25 11:48         ` Alan Cox
2009-07-25 14:02           ` Frans Pop
2009-07-25 20:16             ` Rafael J. Wysocki
2009-07-25 21:03               ` Alan Cox
2009-07-26 15:51             ` Sergey Senozhatsky
2009-07-24 18:25   ` Aneesh Kumar K.V
2009-07-25 12:07     ` Alan Cox
2009-07-25 16:18       ` Aneesh Kumar K.V
2009-07-25 17:06         ` Aneesh Kumar K.V
2009-07-29 19:09     ` [Regression] kdesu broken, now usb fixed in current git pull Gene Heskett

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.