All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] tty reference count fix
@ 2006-02-13 20:13 Paul Fulghum
  2006-02-13 22:15 ` Jesper Juhl
  2006-02-13 22:51 ` Jason Baron
  0 siblings, 2 replies; 7+ messages in thread
From: Paul Fulghum @ 2006-02-13 20:13 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux Kernel Mailing List, jesper.juhl, Alan Cox

Fix hole where tty structure can be released when reference
count is non zero. Existing code can sleep without tty_sem
protection between deciding to release the tty structure
(setting local variables tty_closing and otty_closing)
and setting TTY_CLOSING to prevent further opens.
An open can occur during this interval causing release_dev()
to free the tty structure while it is still referenced.

This should fix bugzilla.kernel.org
[Bug 6041] New: Unable to handle kernel paging request

In Bug 6041, tty_open() oopes on accessing the tty structure
it has successfully claimed. Bug was on SMP machine
with the same tty being opened and closed by
multiple processes, and DEBUG_PAGEALLOC enabled.

Signed-off-by: Paul Fulghum <paulkf@microgate.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Jesper Juhl <jesper.juhl@gmail.com>

--- linux/drivers/char/tty_io.c	2006-02-10 15:54:00.000000000 -0600
+++ b/drivers/char/tty_io.c	2006-02-13 13:14:33.000000000 -0600
@@ -1841,7 +1841,6 @@ static void release_dev(struct file * fi
 		tty_closing = tty->count <= 1;
 		o_tty_closing = o_tty &&
 			(o_tty->count <= (pty_master ? 1 : 0));
-		up(&tty_sem);
 		do_sleep = 0;
 
 		if (tty_closing) {
@@ -1869,6 +1868,7 @@ static void release_dev(struct file * fi
 
 		printk(KERN_WARNING "release_dev: %s: read/write wait queue "
 				    "active!\n", tty_name(tty, buf));
+		up(&tty_sem);
 		schedule();
 	}	
 
@@ -1877,8 +1877,6 @@ static void release_dev(struct file * fi
 	 * both sides, and we've completed the last operation that could 
 	 * block, so it's safe to proceed with closing.
 	 */
-	 
-	down(&tty_sem);
 	if (pty_master) {
 		if (--o_tty->count < 0) {
 			printk(KERN_WARNING "release_dev: bad pty slave count "
@@ -1892,7 +1890,6 @@ static void release_dev(struct file * fi
 		       tty->count, tty_name(tty, buf));
 		tty->count = 0;
 	}
-	up(&tty_sem);
 	
 	/*
 	 * We've decremented tty->count, so we need to remove this file
@@ -1937,6 +1934,8 @@ static void release_dev(struct file * fi
 		read_unlock(&tasklist_lock);
 	}
 
+	up(&tty_sem);
+
 	/* check whether both sides are closing ... */
 	if (!tty_closing || (o_tty && !o_tty_closing))
 		return;



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tty reference count fix
  2006-02-13 20:13 [PATCH] tty reference count fix Paul Fulghum
@ 2006-02-13 22:15 ` Jesper Juhl
  2006-02-13 23:21   ` Paul Fulghum
  2006-02-13 22:51 ` Jason Baron
  1 sibling, 1 reply; 7+ messages in thread
From: Jesper Juhl @ 2006-02-13 22:15 UTC (permalink / raw)
  To: Paul Fulghum; +Cc: Andrew Morton, Linux Kernel Mailing List, Alan Cox

On 2/13/06, Paul Fulghum <paulkf@microgate.com> wrote:
> Fix hole where tty structure can be released when reference
> count is non zero. Existing code can sleep without tty_sem
> protection between deciding to release the tty structure
> (setting local variables tty_closing and otty_closing)
> and setting TTY_CLOSING to prevent further opens.
> An open can occur during this interval causing release_dev()
> to free the tty structure while it is still referenced.
>
> This should fix bugzilla.kernel.org
> [Bug 6041] New: Unable to handle kernel paging request
>
> In Bug 6041, tty_open() oopes on accessing the tty structure
> it has successfully claimed. Bug was on SMP machine
> with the same tty being opened and closed by
> multiple processes, and DEBUG_PAGEALLOC enabled.
>
> Signed-off-by: Paul Fulghum <paulkf@microgate.com>
> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
> Cc: Jesper Juhl <jesper.juhl@gmail.com>
>

I just applied the patch to 2.6.16-rc3 and booted the patched kernel.
Unfortunately I can't tell you if it fixes the bug since I never
successfully reproduced it, it just happened once out of the blue.
What I can tell you though is that the patched kernel seems to behave
just fine and doesn't seem to introduce any regressions on my system -
but my testing has been quite limited so far.

Not the best feedback, I know, but it's the best I can give you at the moment.


--
Jesper Juhl <jesper.juhl@gmail.com>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please      http://www.expita.com/nomime.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tty reference count fix
  2006-02-13 20:13 [PATCH] tty reference count fix Paul Fulghum
  2006-02-13 22:15 ` Jesper Juhl
@ 2006-02-13 22:51 ` Jason Baron
  2006-02-13 23:45   ` Paul Fulghum
  2006-02-14 21:46   ` Arjan van de Ven
  1 sibling, 2 replies; 7+ messages in thread
From: Jason Baron @ 2006-02-13 22:51 UTC (permalink / raw)
  To: Paul Fulghum
  Cc: Andrew Morton, Linux Kernel Mailing List, jesper.juhl, Alan Cox


On Mon, 13 Feb 2006, Paul Fulghum wrote:

> Fix hole where tty structure can be released when reference
> count is non zero. Existing code can sleep without tty_sem
> protection between deciding to release the tty structure
> (setting local variables tty_closing and otty_closing)
> and setting TTY_CLOSING to prevent further opens.
> An open can occur during this interval causing release_dev()
> to free the tty structure while it is still referenced.
> 
> This should fix bugzilla.kernel.org
> [Bug 6041] New: Unable to handle kernel paging request
> 
> In Bug 6041, tty_open() oopes on accessing the tty structure
> it has successfully claimed. Bug was on SMP machine
> with the same tty being opened and closed by
> multiple processes, and DEBUG_PAGEALLOC enabled.
> 

patch looks good to me. Or you could even drop the tty_sem completely from 
the release_dev path (patch below) since lock_kernel() is held in both 
tty_open() and the release_dev paths() (and there is no sleeping b/w 
setting the tty_closing flag and setting the TTY_CLOSING bit).

-Jason


Signed-off-by: Jason Baron <jbaron@redhat.com>

--- linux-2.6-working/drivers/char/tty_io.c.bak
+++ linux-2.6-working/drivers/char/tty_io.c
@@ -1829,14 +1829,9 @@ static void release_dev(struct file * fi
 	 * each iteration we avoid any problems.
 	 */
 	while (1) {
-		/* Guard against races with tty->count changes elsewhere and
-		   opens on /dev/tty */
-		   
-		down(&tty_sem);
 		tty_closing = tty->count <= 1;
 		o_tty_closing = o_tty &&
 			(o_tty->count <= (pty_master ? 1 : 0));
-		up(&tty_sem);
 		do_sleep = 0;
 
 		if (tty_closing) {
@@ -1873,7 +1868,6 @@ static void release_dev(struct file * fi
 	 * block, so it's safe to proceed with closing.
 	 */
 	 
-	down(&tty_sem);
 	if (pty_master) {
 		if (--o_tty->count < 0) {
 			printk(KERN_WARNING "release_dev: bad pty slave count "
@@ -1887,7 +1881,6 @@ static void release_dev(struct file * fi
 		       tty->count, tty_name(tty, buf));
 		tty->count = 0;
 	}
-	up(&tty_sem);
 	
 	/*
 	 * We've decremented tty->count, so we need to remove this file

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tty reference count fix
  2006-02-13 22:15 ` Jesper Juhl
@ 2006-02-13 23:21   ` Paul Fulghum
  0 siblings, 0 replies; 7+ messages in thread
From: Paul Fulghum @ 2006-02-13 23:21 UTC (permalink / raw)
  To: Jesper Juhl; +Cc: Andrew Morton, Linux Kernel Mailing List, Alan Cox

Jesper Juhl wrote:
> I just applied the patch to 2.6.16-rc3 and booted the patched kernel.
> Unfortunately I can't tell you if it fixes the bug since I never
> successfully reproduced it, it just happened once out of the blue.
> What I can tell you though is that the patched kernel seems to behave
> just fine and doesn't seem to introduce any regressions on my system -
> but my testing has been quite limited so far.
> 
> Not the best feedback, I know, but it's the best I can give you at the moment.

Any feedback is good feedback. I'm not suprised it has not
happened again. The necessary conditions are not common and
the timing window is small. As Andrew pointed out,
DEBUG_PAGEALLOC likely helped uncover this.

Decoding the 2 oops to specific lines of code strongly
suggests this is what happened to you. The decoding showed
the tty open and close (release) paths from different
processes going after the same tty structure at the same time.

Thanks,
Paul

--
Paul Fulghum
Microgate Systems, Ltd





^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tty reference count fix
  2006-02-13 22:51 ` Jason Baron
@ 2006-02-13 23:45   ` Paul Fulghum
  2006-02-14  0:20     ` Paul Fulghum
  2006-02-14 21:46   ` Arjan van de Ven
  1 sibling, 1 reply; 7+ messages in thread
From: Paul Fulghum @ 2006-02-13 23:45 UTC (permalink / raw)
  To: Jason Baron
  Cc: Andrew Morton, Linux Kernel Mailing List, jesper.juhl, Alan Cox

Jason Baron wrote:
> patch looks good to me. Or you could even drop the tty_sem completely from 
> the release_dev path (patch below) since lock_kernel() is held in both 
> tty_open() and the release_dev paths()

The historical posts on locking in release_dev()
I looked at suggest that a move from BKL to more
fine grained synchronization (tty_sem) was a goal.

The code looks like it originally relied on the BKL.
The later addition of tty_sem was done in a way
that allowed sleeping between setting tty_closing
and TTY_CLOSING flag:

down(&tty_sem)
if (tty->count == 1)
	tty_closing = 1;
up(&tty_sem)
...
down(&tty_sem) <<<< can sleep, open can increase count
tty->count--
up(&tty_sem)
...
if (tty_closing)
	set_bit(TTY_CLOSING)

 > (and there is no sleeping b/w setting the tty_closing
 > flag and setting the TTY_CLOSING bit).

Your patch leaves the schedule() call at the bottom of
the while loop between setting tty_closing and
setting TTY_CLOSING flag.

Thanks,
Paul

--
Paul Fulghum
Microgate Systems, Ltd


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tty reference count fix
  2006-02-13 23:45   ` Paul Fulghum
@ 2006-02-14  0:20     ` Paul Fulghum
  0 siblings, 0 replies; 7+ messages in thread
From: Paul Fulghum @ 2006-02-14  0:20 UTC (permalink / raw)
  To: Paul Fulghum
  Cc: Jason Baron, Andrew Morton, Linux Kernel Mailing List,
	jesper.juhl, Alan Cox

Paul Fulghum wrote:
> Your patch leaves the schedule() call at the bottom of
> the while loop between setting tty_closing and
> setting TTY_CLOSING flag.

Nevermind. After schedule() the count is reread,
so you are right. Dropping tty_sem altogether
would work.

It is a matter of where you ultimately wish to
push the locking to BKL or tty_sem.

-- 
Paul Fulghum
Microgate Systems, Ltd

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tty reference count fix
  2006-02-13 22:51 ` Jason Baron
  2006-02-13 23:45   ` Paul Fulghum
@ 2006-02-14 21:46   ` Arjan van de Ven
  1 sibling, 0 replies; 7+ messages in thread
From: Arjan van de Ven @ 2006-02-14 21:46 UTC (permalink / raw)
  To: Jason Baron
  Cc: Paul Fulghum, Andrew Morton, Linux Kernel Mailing List,
	jesper.juhl, Alan Cox


> 
> patch looks good to me. Or you could even drop the tty_sem completely from 
> the release_dev path (patch below) since lock_kernel() is held in both 
> tty_open() and the release_dev paths() (and there is no sleeping b/w 
> setting the tty_closing flag and setting the TTY_CLOSING bit).


that's the wrong direction.. the idea is to first put new locking under
the BKL layer.. and at the end pull the BKL entirely.. The BKL isn't a
good lock.


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2006-02-14 21:46 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-02-13 20:13 [PATCH] tty reference count fix Paul Fulghum
2006-02-13 22:15 ` Jesper Juhl
2006-02-13 23:21   ` Paul Fulghum
2006-02-13 22:51 ` Jason Baron
2006-02-13 23:45   ` Paul Fulghum
2006-02-14  0:20     ` Paul Fulghum
2006-02-14 21:46   ` Arjan van de Ven

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.