All of lore.kernel.org
 help / color / mirror / Atom feed
* Patch for panic in n_tty_read()
@ 2012-06-25 15:41 Stanislav Kozina
  2012-06-26 14:21 ` Alan Cox
  0 siblings, 1 reply; 25+ messages in thread
From: Stanislav Kozina @ 2012-06-25 15:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-serial

[-- Attachment #1: Type: text/plain, Size: 2781 bytes --]

Greg,

(my first suggested patch for Linux kernel, so please bear with me if I 
don't follow process correctly. Thank you.)

We had few customers who met panics in n_tty_read() with following 
backtrace:

      #8 [ffff880018b8dcd0] page_fault at ffffffff814ddfe5
         [exception RIP: n_tty_read+0x2c9]
<register output removed>
      #9 [ffff880018b8dea0] tty_read at ffffffff81300b16
     #10 [ffff880018b8def0] vfs_read at ffffffff81172f85
     #11 [ffff880018b8df30] sys_read at ffffffff811730c1
     #12 [ffff880018b8df80] system_call_fastpath at ffffffff8100b172

My patch for this panic is attached (tty_panic.patch), in short - I 
believe that we need to hold &tty->read_lock while checking 
tty->read_cnt in while-loop condition in n_tty_read() here:

1835                         while (nr && tty->read_cnt) {
1836                                 int eol;
1837
1838                                 eol = 
test_and_clear_bit(tty->read_tail,
1839                                                 tty->read_flags);
1840                                 c = tty->read_buf[tty->read_tail];
1841                                 spin_lock_irqsave(&tty->read_lock, 
flags);

We gave this patch to the customers, they were testing it for a month on 
several tens of machines without being able to reproduce the problem.

Please can you integrate the patch into the kernel?

My testing
====
The patch pulls the spinlock out of the while loop. This makes 
reset_buffer_flags() and others wait before changing either read_buf or 
read_cnt. So this should solve the issue - the question is if this can 
cause any deadlock.

I inserted the msleep(2000) when the lock is held. This should trigger 
any deadlock of possible.

I found out that:
1) n_tty_read() runs when I e.g. log on the serial console
2) reset_buffer_flags() is running when I push CTRL+C on any terminal

That's why my plan was to log on serial console, and than (in shorter 
time than 2 seconds) press CTRL+C on any terminal. I wrote a stap script 
to verify that both functions were running, the stap script is attached.
And this is part of the output from attached stap script:

1336070859652 -> reset_buffer_flags (PID: 225)
1336070859652 <- reset_buffer_flags (PID: 225)
1336070859652 <- n_tty_read (PID: 7527, retval: -512)
1336070859652 -> n_tty_read (PID: 7527)
1336070859654 -> n_tty_read (PID: 7502)
1336070859654 <- n_tty_read (PID: 7502, retval: 53)
1336070867135 -> n_tty_read (PID: 7498)
1336070867135 <- n_tty_read (PID: 7498, retval: 1)
1336070868260 -> reset_buffer_flags (PID: 237)
1336070868260 <- reset_buffer_flags (PID: 237)

It's clear to see when we were in the msleep(2000) - it forced 
n_tty_read to be delayed by 2 seconds in PID 7498. That's why even 
reset_buffer_flags was not running.

[-- Attachment #2: tty_panic.patch --]
[-- Type: text/plain, Size: 987 bytes --]

diff --git a/drivers/char/n_tty.c b/drivers/char/n_tty.c
index 2e50f4d..ace0c19 100644
--- a/drivers/char/n_tty.c
+++ b/drivers/char/n_tty.c
@@ -1813,13 +1813,13 @@ do_it_again:
 
 		if (tty->icanon) {
 			/* N.B. avoid overrun if nr == 0 */
+			spin_lock_irqsave(&tty->read_lock, flags);
 			while (nr && tty->read_cnt) {
 				int eol;
 
 				eol = test_and_clear_bit(tty->read_tail,
 						tty->read_flags);
 				c = tty->read_buf[tty->read_tail];
-				spin_lock_irqsave(&tty->read_lock, flags);
 				tty->read_tail = ((tty->read_tail+1) &
 						  (N_TTY_BUF_SIZE-1));
 				tty->read_cnt--;
@@ -1831,7 +1831,6 @@ do_it_again:
 					if (--tty->canon_data < 0)
 						tty->canon_data = 0;
 				}
-				spin_unlock_irqrestore(&tty->read_lock, flags);
 
 				if (!eol || (c != __DISABLED_CHAR)) {
 					if (tty_put_user(tty, c, b++)) {
@@ -1846,6 +1845,7 @@ do_it_again:
 					break;
 				}
 			}
+			spin_unlock_irqrestore(&tty->read_lock, flags);
 			if (retval)
 				break;
 		} else {

[-- Attachment #3: trace.stp --]
[-- Type: text/plain, Size: 511 bytes --]

probe kernel.function("reset_buffer_flags").call
{
	printf("%d -> %s (PID: %d)\n", gettimeofday_ms(), probefunc(), pid());
}

probe kernel.function("reset_buffer_flags").return
{
	printf("%d <- %s (PID: %d)\n", gettimeofday_ms(), probefunc(), pid());
}

probe kernel.function("n_tty_read").call
{
	printf("%d -> %s (PID: %d)\n", gettimeofday_ms(), probefunc(), pid());
}

probe kernel.function("n_tty_read").return
{
	printf("%d <- %s (PID: %d, retval: %d)\n", gettimeofday_ms(), probefunc(), pid(), $return)
}

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: Patch for panic in n_tty_read()
  2012-06-25 15:41 Patch for panic in n_tty_read() Stanislav Kozina
@ 2012-06-26 14:21 ` Alan Cox
  2012-07-20 12:18   ` Stanislav Kozina
  0 siblings, 1 reply; 25+ messages in thread
From: Alan Cox @ 2012-06-26 14:21 UTC (permalink / raw)
  To: Stanislav Kozina; +Cc: Greg Kroah-Hartman, linux-serial

> We had few customers who met panics in n_tty_read() with following 
> backtrace:
> 
>       #8 [ffff880018b8dcd0] page_fault at ffffffff814ddfe5
>          [exception RIP: n_tty_read+0x2c9]
> <register output removed>
>       #9 [ffff880018b8dea0] tty_read at ffffffff81300b16
>      #10 [ffff880018b8def0] vfs_read at ffffffff81172f85
>      #11 [ffff880018b8df30] sys_read at ffffffff811730c1
>      #12 [ffff880018b8df80] system_call_fastpath at ffffffff8100b172

What serial driver are they using when they hit this ?

> 
> My patch for this panic is attached (tty_panic.patch), in short - I 
> believe that we need to hold &tty->read_lock while checking 
> tty->read_cnt in while-loop condition in n_tty_read() here:

We can't hold the lock for the loop because we touch user space memory.

Alan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Patch for panic in n_tty_read()
  2012-06-26 14:21 ` Alan Cox
@ 2012-07-20 12:18   ` Stanislav Kozina
  2012-07-20 15:11     ` Alan Cox
  0 siblings, 1 reply; 25+ messages in thread
From: Stanislav Kozina @ 2012-07-20 12:18 UTC (permalink / raw)
  To: Alan Cox; +Cc: Greg Kroah-Hartman, linux-serial

Alan,

My apologizes for really late response, I completely missed your email:-/
>> We had few customers who met panics in n_tty_read() with following
>> backtrace:
>>
>>        #8 [ffff880018b8dcd0] page_fault at ffffffff814ddfe5
>>           [exception RIP: n_tty_read+0x2c9]
>> <register output removed>
>>        #9 [ffff880018b8dea0] tty_read at ffffffff81300b16
>>       #10 [ffff880018b8def0] vfs_read at ffffffff81172f85
>>       #11 [ffff880018b8df30] sys_read at ffffffff811730c1
>>       #12 [ffff880018b8df80] system_call_fastpath at ffffffff8100b172
> What serial driver are they using when they hit this ?

 From kernel log:
kernel: Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
kernel: serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
kernel: serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
kernel: 00:09: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
kernel: 00:0a: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A

>
>> My patch for this panic is attached (tty_panic.patch), in short - I
>> believe that we need to hold&tty->read_lock while checking
>> tty->read_cnt in while-loop condition in n_tty_read() here:
> We can't hold the lock for the loop because we touch user space memory.

You mean call to tty_put_user(), correct? Thanks for this catch.
So what about to unlock the lock for this time? Because we need to hold 
the lock while checking tty->read_cnt in the while loop condition, correct?

Thanks and regards,
-Stanislav Kozina

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Patch for panic in n_tty_read()
  2012-07-20 12:18   ` Stanislav Kozina
@ 2012-07-20 15:11     ` Alan Cox
  2012-07-27 12:05       ` Stanislav Kozina
  2012-08-08  7:58       ` Stanislav Kozina
  0 siblings, 2 replies; 25+ messages in thread
From: Alan Cox @ 2012-07-20 15:11 UTC (permalink / raw)
  To: Stanislav Kozina; +Cc: Greg Kroah-Hartman, linux-serial

> You mean call to tty_put_user(), correct? Thanks for this catch.
> So what about to unlock the lock for this time? Because we need to hold 
> the lock while checking tty->read_cnt in the while loop condition, correct?

I think you are right on that yes.

Alan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Patch for panic in n_tty_read()
  2012-07-20 15:11     ` Alan Cox
@ 2012-07-27 12:05       ` Stanislav Kozina
  2012-07-27 12:50         ` Alan Cox
  2012-08-08  7:58       ` Stanislav Kozina
  1 sibling, 1 reply; 25+ messages in thread
From: Stanislav Kozina @ 2012-07-27 12:05 UTC (permalink / raw)
  To: Alan Cox; +Cc: Greg Kroah-Hartman, linux-serial

[-- Attachment #1: Type: text/plain, Size: 448 bytes --]

Alan,

Thank you, updated fix (tested on bits based on commit 
bdc0077af574800d24318b6945cf2344e8dbb050) is attached.
Is this correct now?

Thanks and regards,
-Stanislav Kozina

>> You mean call to tty_put_user(), correct? Thanks for this catch.
>> So what about to unlock the lock for this time? Because we need to hold
>> the lock while checking tty->read_cnt in the while loop condition, correct?
> I think you are right on that yes.
>
> Alan


[-- Attachment #2: tty_panic_2.patch --]
[-- Type: text/plain, Size: 773 bytes --]

diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index ee1c268..54d1fc5 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -1832,13 +1832,13 @@ do_it_again:
 
 		if (tty->icanon && !L_EXTPROC(tty)) {
 			/* N.B. avoid overrun if nr == 0 */
+			spin_lock_irqsave(&tty->read_lock, flags);
 			while (nr && tty->read_cnt) {
 				int eol;
 
 				eol = test_and_clear_bit(tty->read_tail,
 						tty->read_flags);
 				c = tty->read_buf[tty->read_tail];
-				spin_lock_irqsave(&tty->read_lock, flags);
 				tty->read_tail = ((tty->read_tail+1) &
 						  (N_TTY_BUF_SIZE-1));
 				tty->read_cnt--;
@@ -1864,6 +1864,7 @@ do_it_again:
 					tty_audit_push(tty);
 					break;
 				}
+				spin_lock_irqsave(&tty->read_lock, flags);
 			}
 			if (retval)
 				break;

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: Patch for panic in n_tty_read()
  2012-07-27 12:05       ` Stanislav Kozina
@ 2012-07-27 12:50         ` Alan Cox
  2012-07-30 11:58           ` Stanislav Kozina
  0 siblings, 1 reply; 25+ messages in thread
From: Alan Cox @ 2012-07-27 12:50 UTC (permalink / raw)
  To: Stanislav Kozina; +Cc: Greg Kroah-Hartman, linux-serial

On Fri, 27 Jul 2012 14:05:16 +0200
Stanislav Kozina <skozina@redhat.com> wrote:

> Alan,
> 
> Thank you, updated fix (tested on bits based on commit 
> bdc0077af574800d24318b6945cf2344e8dbb050) is attached.
> Is this correct now?

Looks good to me. However it changes the locking rules on
tty_audit_push() so please check the audit folks are ok with it. I don't
think that causes any problems.

Alan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Patch for panic in n_tty_read()
  2012-07-27 12:50         ` Alan Cox
@ 2012-07-30 11:58           ` Stanislav Kozina
  0 siblings, 0 replies; 25+ messages in thread
From: Stanislav Kozina @ 2012-07-30 11:58 UTC (permalink / raw)
  To: Alan Cox; +Cc: Greg Kroah-Hartman, linux-serial

Alan,

I am very sorry, but I don't see it. We didn't held the lock while 
calling tty_audit_push() before, and we don't hold it after the patch 
neither.
So what's the locking scheme change here? Is there some binding between 
n_tty_read() and tty_audit_push() I just don't see?

Thank you,
-Stanislav

> Looks good to me. However it changes the locking rules on
> tty_audit_push() so please check the audit folks are ok with it. I don't
> think that causes any problems.
>
> Alan


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Patch for panic in n_tty_read()
  2012-07-20 15:11     ` Alan Cox
  2012-07-27 12:05       ` Stanislav Kozina
@ 2012-08-08  7:58       ` Stanislav Kozina
  2012-08-08  9:00         ` Alan Cox
  1 sibling, 1 reply; 25+ messages in thread
From: Stanislav Kozina @ 2012-08-08  7:58 UTC (permalink / raw)
  To: Alan Cox; +Cc: Greg Kroah-Hartman, linux-serial

[-- Attachment #1: Type: text/plain, Size: 641 bytes --]

Alan,

I am not sure if you got my last email, so I'm resending it now:

 > I am very sorry, but I don't see it. We didn't held the lock while 
calling tty_audit_push() before, and we don't hold it after the patch 
neither.
 > So what's the locking scheme change here? Is there some binding 
between n_tty_read() and tty_audit_push() I just don't see?

Please can you advice me why I should check this patch with audit folks?

Thanks a lot,
-Stanislav

 >> Looks good to me. However it changes the locking rules on
 >> tty_audit_push() so please check the audit folks are ok with it. I don't
 >> think that causes any problems.
 >>
 >> Alan

[-- Attachment #2: tty_panic_2.patch --]
[-- Type: text/plain, Size: 774 bytes --]

diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index ee1c268..54d1fc5 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -1832,13 +1832,13 @@ do_it_again:
 
 		if (tty->icanon && !L_EXTPROC(tty)) {
 			/* N.B. avoid overrun if nr == 0 */
+			spin_lock_irqsave(&tty->read_lock, flags);
 			while (nr && tty->read_cnt) {
 				int eol;
 
 				eol = test_and_clear_bit(tty->read_tail,
 						tty->read_flags);
 				c = tty->read_buf[tty->read_tail];
-				spin_lock_irqsave(&tty->read_lock, flags);
 				tty->read_tail = ((tty->read_tail+1) &
 						  (N_TTY_BUF_SIZE-1));
 				tty->read_cnt--;
@@ -1864,6 +1864,7 @@ do_it_again:
 					tty_audit_push(tty);
 					break;
 				}
+				spin_lock_irqsave(&tty->read_lock, flags);
 			}
 			if (retval)
 				break;


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: Patch for panic in n_tty_read()
  2012-08-08  7:58       ` Stanislav Kozina
@ 2012-08-08  9:00         ` Alan Cox
  2012-08-08 12:09           ` Stanislav Kozina
  0 siblings, 1 reply; 25+ messages in thread
From: Alan Cox @ 2012-08-08  9:00 UTC (permalink / raw)
  To: Stanislav Kozina; +Cc: Greg Kroah-Hartman, linux-serial

On Wed, 08 Aug 2012 09:58:40 +0200
Stanislav Kozina <skozina@redhat.com> wrote:

> Alan,
> 
> I am not sure if you got my last email, so I'm resending it now:
> 
>  > I am very sorry, but I don't see it. We didn't held the lock while 
> calling tty_audit_push() before, and we don't hold it after the patch 
> neither.
>  > So what's the locking scheme change here? Is there some binding 
> between n_tty_read() and tty_audit_push() I just don't see?
> 

No - looking at it again you are right, the lock is still dropped during
the audit call.

Alan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Patch for panic in n_tty_read()
  2012-08-08  9:00         ` Alan Cox
@ 2012-08-08 12:09           ` Stanislav Kozina
  2012-08-08 12:26             ` Alan Cox
  2012-08-08 14:28             ` [PATCH V2] [tty] Fix possible race " Stanislav Kozina
  0 siblings, 2 replies; 25+ messages in thread
From: Stanislav Kozina @ 2012-08-08 12:09 UTC (permalink / raw)
  To: Alan Cox; +Cc: Greg Kroah-Hartman, linux-serial

Alan,

Thanks a lot. So can you integrate the patch in next patchset?

-Stanislav

> No - looking at it again you are right, the lock is still dropped during
> the audit call.
>
> Alan


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Patch for panic in n_tty_read()
  2012-08-08 12:09           ` Stanislav Kozina
@ 2012-08-08 12:26             ` Alan Cox
  2012-08-08 14:32               ` Stanislav Kozina
  2012-08-08 14:28             ` [PATCH V2] [tty] Fix possible race " Stanislav Kozina
  1 sibling, 1 reply; 25+ messages in thread
From: Alan Cox @ 2012-08-08 12:26 UTC (permalink / raw)
  To: Stanislav Kozina; +Cc: Greg Kroah-Hartman, linux-serial

On Wed, 08 Aug 2012 14:09:12 +0200
Stanislav Kozina <skozina@redhat.com> wrote:

> Alan,
> 
> Thanks a lot. So can you integrate the patch in next patchset?

Will do - I need to resend the patch queue to Greg so I'll add it before
that.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH V2] [tty] Fix possible race in n_tty_read()
  2012-08-08 12:09           ` Stanislav Kozina
  2012-08-08 12:26             ` Alan Cox
@ 2012-08-08 14:28             ` Stanislav Kozina
  2012-08-08 15:27               ` Alan Cox
  1 sibling, 1 reply; 25+ messages in thread
From: Stanislav Kozina @ 2012-08-08 14:28 UTC (permalink / raw)
  To: Alan Cox; +Cc: Greg Kroah-Hartman, linux-serial, Stanislaw Gruszka

Fix possible panic caused by unlocked access to tty->read_cnt in 
while-loop condition in n_tty_read().

Signed-off-by: Stanislav Kozina <skozina@redhat.com>
---
  drivers/tty/n_tty.c |    6 +++++-
  1 files changed, 5 insertions(+), 1 deletions(-)
  v1->v2: Add spin_unlock_irqrestore() call after the while loop

diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index ee1c268..df21f39 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -1832,13 +1832,13 @@ do_it_again:

  		if (tty->icanon && !L_EXTPROC(tty)) {
  			/* N.B. avoid overrun if nr == 0 */
+			spin_lock_irqsave(&tty->read_lock, flags);
  			while (nr && tty->read_cnt) {
  				int eol;

  				eol = test_and_clear_bit(tty->read_tail,
  						tty->read_flags);
  				c = tty->read_buf[tty->read_tail];
-				spin_lock_irqsave(&tty->read_lock, flags);
  				tty->read_tail = ((tty->read_tail+1) &
  						  (N_TTY_BUF_SIZE-1));
  				tty->read_cnt--;
@@ -1856,15 +1856,19 @@ do_it_again:
  					if (tty_put_user(tty, c, b++)) {
  						retval = -EFAULT;
  						b--;
+						spin_lock_irqsave(&tty->read_lock, flags);
  						break;
  					}
  					nr--;
  				}
  				if (eol) {
  					tty_audit_push(tty);
+					spin_lock_irqsave(&tty->read_lock, flags);
  					break;
  				}
+				spin_lock_irqsave(&tty->read_lock, flags);
  			}
+			spin_unlock_irqrestore(&tty->read_lock, flags);
  			if (retval)
  				break;
  		} else {
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: Patch for panic in n_tty_read()
  2012-08-08 12:26             ` Alan Cox
@ 2012-08-08 14:32               ` Stanislav Kozina
  0 siblings, 0 replies; 25+ messages in thread
From: Stanislav Kozina @ 2012-08-08 14:32 UTC (permalink / raw)
  To: Alan Cox; +Cc: Greg Kroah-Hartman, linux-serial

Alan,

Thanks.
We realized that the lock is not being unlocked:-/ That's why I sent a 
V2 patch.

My apologizes,
-Stanislav

> On Wed, 08 Aug 2012 14:09:12 +0200
> Stanislav Kozina<skozina@redhat.com>  wrote:
>
>> Alan,
>>
>> Thanks a lot. So can you integrate the patch in next patchset?
> Will do - I need to resend the patch queue to Greg so I'll add it before
> that.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-serial" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH V2] [tty] Fix possible race in n_tty_read()
  2012-08-08 14:28             ` [PATCH V2] [tty] Fix possible race " Stanislav Kozina
@ 2012-08-08 15:27               ` Alan Cox
  2012-08-09 11:16                 ` Stanislaw Gruszka
  2012-08-09 11:24                 ` Stanislav Kozina
  0 siblings, 2 replies; 25+ messages in thread
From: Alan Cox @ 2012-08-08 15:27 UTC (permalink / raw)
  To: Stanislav Kozina; +Cc: Greg Kroah-Hartman, linux-serial, Stanislaw Gruszka

On Wed, 08 Aug 2012 16:28:47 +0200
Stanislav Kozina <skozina@redhat.com> wrote:

> Fix possible panic caused by unlocked access to tty->read_cnt in 
> while-loop condition in n_tty_read().

Should this also be removing the BUG_ON check you noted in the other
email was not valid now ?

Alan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH V2] [tty] Fix possible race in n_tty_read()
  2012-08-08 15:27               ` Alan Cox
@ 2012-08-09 11:16                 ` Stanislaw Gruszka
  2012-08-13 15:26                   ` Stanislaw Gruszka
  2012-08-09 11:24                 ` Stanislav Kozina
  1 sibling, 1 reply; 25+ messages in thread
From: Stanislaw Gruszka @ 2012-08-09 11:16 UTC (permalink / raw)
  To: Alan Cox; +Cc: Stanislav Kozina, Greg Kroah-Hartman, linux-serial

On Wed, Aug 08, 2012 at 04:27:25PM +0100, Alan Cox wrote:
> On Wed, 08 Aug 2012 16:28:47 +0200
> Stanislav Kozina <skozina@redhat.com> wrote:
> 
> > Fix possible panic caused by unlocked access to tty->read_cnt in 
> > while-loop condition in n_tty_read().
> 
> Should this also be removing the BUG_ON check you noted in the other
> email was not valid now ?

You talk about 
http://marc.info/?l=linux-serial&m=134318985920881&w=2

Is possible that we can call n_tty_read() after n_tty_close() ? How oterwise
tty->read_buf could become NULL?

If I understand correctly Stanislav's patch solve below race condtion:

CPU0					CPU1
n_tty_read:				reset_buffer_flags:

while (nr && tty->read_cnt) {

  					spin_lock_irqsave(&tty->read_lock, flags);
					tty->read_head = tty->read_tail = tty->read_cnt = 0;
  					spin_lock_irqsave(&tty->read_lock, flags);

  spin_lock_irqsave(&tty->read_lock, flags);
  
  tty->read_cnt--;

  spin_lock_irqsave(&tty->read_lock, flags);

  /* Now tty->read_cnt is negative */

}

what itself could have varsious nasty consequences, i.e. ininite
loop. Is also possible that negative tty->read_cnt would result in
tty->read_buf == NULL ? If so, I'm not quite understand that.

Stanislaw

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH V2] [tty] Fix possible race in n_tty_read()
  2012-08-08 15:27               ` Alan Cox
  2012-08-09 11:16                 ` Stanislaw Gruszka
@ 2012-08-09 11:24                 ` Stanislav Kozina
  2012-08-09 12:35                   ` Alan Cox
  2012-08-10 10:51                   ` [PATCH] Remove BUG_ON from n_tty_read() Stanislav Kozina
  1 sibling, 2 replies; 25+ messages in thread
From: Stanislav Kozina @ 2012-08-09 11:24 UTC (permalink / raw)
  To: Alan Cox; +Cc: Greg Kroah-Hartman, linux-serial, Stanislaw Gruszka

I believe that BUG_ON is different issue.

I'm not sure what's the correct locking scheme for tty->read_buf. I 
guess that it's allocated as part of tty initialization, and only 
removed during n_tty_close(), correct? So we should not need any lock there.
Perhaps we should change that BUG_ON(!tty->read_buf); to just return 
-EAGAIN, as e.g. n_tty_receive_buf does?

         if (!tty->read_buf)
                 return;

I will submit a separate patch changing the BUG_ON to return, I believe 
it's still better to drop a n_tty_read() rather than panic whole system.
But I still don't understand how this can happen - that we have a tty 
operational without a buffer allocated. There still must be some other race.

-Stanislav

> Should this also be removing the BUG_ON check you noted in the other
> email was not valid now ?
>
> Alan


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH V2] [tty] Fix possible race in n_tty_read()
  2012-08-09 11:24                 ` Stanislav Kozina
@ 2012-08-09 12:35                   ` Alan Cox
  2012-08-10 10:52                     ` Stanislav Kozina
  2012-08-10 10:51                   ` [PATCH] Remove BUG_ON from n_tty_read() Stanislav Kozina
  1 sibling, 1 reply; 25+ messages in thread
From: Alan Cox @ 2012-08-09 12:35 UTC (permalink / raw)
  To: Stanislav Kozina; +Cc: Greg Kroah-Hartman, linux-serial, Stanislaw Gruszka

On Thu, 09 Aug 2012 13:24:42 +0200
Stanislav Kozina <skozina@redhat.com> wrote:

> I believe that BUG_ON is different issue.

Ok thats what I was thinking.. I was a bit confused about that.

> I will submit a separate patch changing the BUG_ON to return, I believe 
> it's still better to drop a n_tty_read() rather than panic whole system.
> But I still don't understand how this can happen - that we have a tty 
> operational without a buffer allocated. There still must be some other race.

Nor me - a WARN_ON and return would be better perhaps - it will at least
then spew an error into the logs

Alan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] Remove BUG_ON from n_tty_read()
  2012-08-09 11:24                 ` Stanislav Kozina
  2012-08-09 12:35                   ` Alan Cox
@ 2012-08-10 10:51                   ` Stanislav Kozina
  2012-08-10 12:29                     ` Stanislaw Gruszka
  2012-08-10 14:38                     ` [PATCH V2] " Stanislav Kozina
  1 sibling, 2 replies; 25+ messages in thread
From: Stanislav Kozina @ 2012-08-10 10:51 UTC (permalink / raw)
  To: Alan Cox; +Cc: Greg Kroah-Hartman, linux-serial, Stanislaw Gruszka

Change the BUG_ON to WARN_ON and return in case of tty->read_buf==NULL

Signed-off-by: Stanislav Kozina <skozina@redhat.com>
---
  drivers/tty/n_tty.c |    5 ++++-
  1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index df21f39..39c6202 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -1728,7 +1728,10 @@ static ssize_t n_tty_read(struct tty_struct *tty, 
struct file *file,

  do_it_again:

-	BUG_ON(!tty->read_buf);
+	if (!tty->read_buf) {
+		WARN_ON(!tty->read_buf);
+		return -EAGAIN;
+	}

  	c = job_control(tty, file);
  	if (c < 0)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH V2] [tty] Fix possible race in n_tty_read()
  2012-08-09 12:35                   ` Alan Cox
@ 2012-08-10 10:52                     ` Stanislav Kozina
  0 siblings, 0 replies; 25+ messages in thread
From: Stanislav Kozina @ 2012-08-10 10:52 UTC (permalink / raw)
  To: Alan Cox; +Cc: Greg Kroah-Hartman, linux-serial, Stanislaw Gruszka


> Nor me - a WARN_ON and return would be better perhaps - it will at least
> then spew an error into the logs

Good idea, thanks. Patch sent.

-Stanislav

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] Remove BUG_ON from n_tty_read()
  2012-08-10 10:51                   ` [PATCH] Remove BUG_ON from n_tty_read() Stanislav Kozina
@ 2012-08-10 12:29                     ` Stanislaw Gruszka
  2012-08-10 14:53                       ` Stanislav Kozina
  2012-08-10 14:38                     ` [PATCH V2] " Stanislav Kozina
  1 sibling, 1 reply; 25+ messages in thread
From: Stanislaw Gruszka @ 2012-08-10 12:29 UTC (permalink / raw)
  To: Stanislav Kozina; +Cc: Alan Cox, Greg Kroah-Hartman, linux-serial

On Fri, Aug 10, 2012 at 12:51:30PM +0200, Stanislav Kozina wrote:
> Change the BUG_ON to WARN_ON and return in case of tty->read_buf==NULL

A few small nitpicks.

Subject should include "tty:" prefix.

> -	BUG_ON(!tty->read_buf);
> +	if (!tty->read_buf) {
> +		WARN_ON(!tty->read_buf);
> +		return -EAGAIN;
> +	}

This usually is done in two lines:

if (WARN_ON())
	return;

Additionally, second BUG_ON(!tty->read_buf) in n_tty_read() should be
probably replaced too.

Thanks
Stanislaw

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH V2] Remove BUG_ON from n_tty_read()
  2012-08-10 10:51                   ` [PATCH] Remove BUG_ON from n_tty_read() Stanislav Kozina
  2012-08-10 12:29                     ` Stanislaw Gruszka
@ 2012-08-10 14:38                     ` Stanislav Kozina
  2012-08-16  7:52                       ` Stanislav Kozina
  1 sibling, 1 reply; 25+ messages in thread
From: Stanislav Kozina @ 2012-08-10 14:38 UTC (permalink / raw)
  To: Alan Cox; +Cc: Greg Kroah-Hartman, linux-serial, Stanislaw Gruszka

Change the BUG_ON to WARN_ON and return in case of tty->read_buf==NULL

Signed-off-by: Stanislav Kozina <skozina@redhat.com>
---
  drivers/tty/n_tty.c |    3 ++-
  1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index df21f39..6b9b5e0 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -1728,7 +1728,8 @@ static ssize_t n_tty_read(struct tty_struct *tty, 
struct file *file,

  do_it_again:

-    BUG_ON(!tty->read_buf);
+    if (WARN_ON(!tty->read_buf))
+        return -EAGAIN;

      c = job_control(tty, file);
      if (c < 0)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH] Remove BUG_ON from n_tty_read()
  2012-08-10 12:29                     ` Stanislaw Gruszka
@ 2012-08-10 14:53                       ` Stanislav Kozina
  0 siblings, 0 replies; 25+ messages in thread
From: Stanislav Kozina @ 2012-08-10 14:53 UTC (permalink / raw)
  To: Stanislaw Gruszka; +Cc: Alan Cox, Greg Kroah-Hartman, linux-serial


> Subject should include "tty:" prefix.

Sorry, just forgot about it for the second time;-/
Can I force git format-patch to add it somehow?

> This usually is done in two lines:
>
> if (WARN_ON())
> 	return;

Thanks, good point, new patch sent.

> Additionally, second BUG_ON(!tty->read_buf) in n_tty_read() should be
> probably replaced too.

I don't think so.
In the first case, we are not running, so there might be no operation on 
tty running. We still don't understand how it can happen that someone 
closes the tty, and then read operation is called, however at least 
there is no (known) running operation.
In the other case, we have just be rescheduled - so we are definitely in 
operation, and if tty is closed while we are just sleeping, I would 
still keep BUG_ON() there. Despite we haven't seen any panics on that 
place, have we?

Thanks,
-Stanislav

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH V2] [tty] Fix possible race in n_tty_read()
  2012-08-09 11:16                 ` Stanislaw Gruszka
@ 2012-08-13 15:26                   ` Stanislaw Gruszka
  2012-08-14 11:15                     ` Stanislav Kozina
  0 siblings, 1 reply; 25+ messages in thread
From: Stanislaw Gruszka @ 2012-08-13 15:26 UTC (permalink / raw)
  To: Alan Cox; +Cc: Stanislav Kozina, Greg Kroah-Hartman, linux-serial

On Thu, Aug 09, 2012 at 01:16:20PM +0200, Stanislaw Gruszka wrote:
> On Wed, Aug 08, 2012 at 04:27:25PM +0100, Alan Cox wrote:
> > On Wed, 08 Aug 2012 16:28:47 +0200
> > Stanislav Kozina <skozina@redhat.com> wrote:
> > 
> > > Fix possible panic caused by unlocked access to tty->read_cnt in 
> > > while-loop condition in n_tty_read().
> > 
> > Should this also be removing the BUG_ON check you noted in the other
> > email was not valid now ?
> 
> You talk about 
> http://marc.info/?l=linux-serial&m=134318985920881&w=2
> 
> Is possible that we can call n_tty_read() after n_tty_close() ? How oterwise
> tty->read_buf could become NULL?
> 
> If I understand correctly Stanislav's patch solve below race condtion:
> 
> CPU0					CPU1
> n_tty_read:				reset_buffer_flags:
> 
> while (nr && tty->read_cnt) {
> 
>   					spin_lock_irqsave(&tty->read_lock, flags);
> 					tty->read_head = tty->read_tail = tty->read_cnt = 0;
>   					spin_lock_irqsave(&tty->read_lock, flags);
> 
>   spin_lock_irqsave(&tty->read_lock, flags);
>   
>   tty->read_cnt--;
> 
>   spin_lock_irqsave(&tty->read_lock, flags);
> 
>   /* Now tty->read_cnt is negative */
> 
> }
> 
> what itself could have varsious nasty consequences, i.e. ininite
> loop. Is also possible that negative tty->read_cnt would result in
> tty->read_buf == NULL ? If so, I'm not quite understand that.

I looked a bit more at this. Excluding memory corruption which could
zero tty struct, the only possibility to nullify tty->read_buf is
call to n_tty_close(). So NULL pointer dereference on n_tty_read,
in "while (nr && tty->read_cnt)" loop can only be caused by calling
n_tty_close(), while performing n_tty_read().

Stanislav patch solve that problem because we do not touch tty->read_buf
any longer once tty->read_cnt become 0, and because n_tty_close() clear
tty->read_cnt (by calling n_tty_flush_buffer() -> reset_buffer_flags()).

However looks like main problem persist, we should never do
n_tty_read() after/during n_tty_close() and before n_tty_open(). That
must be an issue in upper layer i.e. tty_io and tty_ldisc, which should
give guarantee about ->close(), ->read(), ->open() ordering. I'm going
to look at that more closely.

Stanislaw

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH V2] [tty] Fix possible race in n_tty_read()
  2012-08-13 15:26                   ` Stanislaw Gruszka
@ 2012-08-14 11:15                     ` Stanislav Kozina
  0 siblings, 0 replies; 25+ messages in thread
From: Stanislav Kozina @ 2012-08-14 11:15 UTC (permalink / raw)
  To: Stanislaw Gruszka; +Cc: Alan Cox, Greg Kroah-Hartman, linux-serial


> I looked a bit more at this. Excluding memory corruption which could
> zero tty struct, the only possibility to nullify tty->read_buf is
> call to n_tty_close(). So NULL pointer dereference on n_tty_read,
> in "while (nr&&  tty->read_cnt)" loop can only be caused by calling
> n_tty_close(), while performing n_tty_read().

Correct.

> Stanislav patch solve that problem because we do not touch tty->read_buf
> any longer once tty->read_cnt become 0, and because n_tty_close() clear
> tty->read_cnt (by calling n_tty_flush_buffer() ->  reset_buffer_flags()).

Correct.

> However looks like main problem persist, we should never do
> n_tty_read() after/during n_tty_close() and before n_tty_open(). That
> must be an issue in upper layer i.e. tty_io and tty_ldisc, which should
> give guarantee about ->close(), ->read(), ->open() ordering.

Correct.

> I'm going
> to look at that more closely.

Thanks a lot;-)

Regards,
-Stanislav

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH V2] Remove BUG_ON from n_tty_read()
  2012-08-10 14:38                     ` [PATCH V2] " Stanislav Kozina
@ 2012-08-16  7:52                       ` Stanislav Kozina
  0 siblings, 0 replies; 25+ messages in thread
From: Stanislav Kozina @ 2012-08-16  7:52 UTC (permalink / raw)
  To: Alan Cox; +Cc: Greg Kroah-Hartman, linux-serial, Stanislaw Gruszka

Alan,

What about integration of this patch?
I will try to do some testing to understand how it can happen that we 
read from tty already closed.

Thanks and regards,
-Stanislav

> Change the BUG_ON to WARN_ON and return in case of tty->read_buf==NULL
>
> Signed-off-by: Stanislav Kozina <skozina@redhat.com>
> ---
>  drivers/tty/n_tty.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
> index df21f39..6b9b5e0 100644
> --- a/drivers/tty/n_tty.c
> +++ b/drivers/tty/n_tty.c
> @@ -1728,7 +1728,8 @@ static ssize_t n_tty_read(struct tty_struct 
> *tty, struct file *file,
>
>  do_it_again:
>
> -    BUG_ON(!tty->read_buf);
> +    if (WARN_ON(!tty->read_buf))
> +        return -EAGAIN;
>
>      c = job_control(tty, file);
>      if (c < 0)


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2012-08-16  7:52 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-25 15:41 Patch for panic in n_tty_read() Stanislav Kozina
2012-06-26 14:21 ` Alan Cox
2012-07-20 12:18   ` Stanislav Kozina
2012-07-20 15:11     ` Alan Cox
2012-07-27 12:05       ` Stanislav Kozina
2012-07-27 12:50         ` Alan Cox
2012-07-30 11:58           ` Stanislav Kozina
2012-08-08  7:58       ` Stanislav Kozina
2012-08-08  9:00         ` Alan Cox
2012-08-08 12:09           ` Stanislav Kozina
2012-08-08 12:26             ` Alan Cox
2012-08-08 14:32               ` Stanislav Kozina
2012-08-08 14:28             ` [PATCH V2] [tty] Fix possible race " Stanislav Kozina
2012-08-08 15:27               ` Alan Cox
2012-08-09 11:16                 ` Stanislaw Gruszka
2012-08-13 15:26                   ` Stanislaw Gruszka
2012-08-14 11:15                     ` Stanislav Kozina
2012-08-09 11:24                 ` Stanislav Kozina
2012-08-09 12:35                   ` Alan Cox
2012-08-10 10:52                     ` Stanislav Kozina
2012-08-10 10:51                   ` [PATCH] Remove BUG_ON from n_tty_read() Stanislav Kozina
2012-08-10 12:29                     ` Stanislaw Gruszka
2012-08-10 14:53                       ` Stanislav Kozina
2012-08-10 14:38                     ` [PATCH V2] " Stanislav Kozina
2012-08-16  7:52                       ` Stanislav Kozina

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.