LKML Archive on lore.kernel.org
 help / Atom feed
* [PATCH] uart: fix race between uart_put_char() and uart_shutdown()
@ 2018-06-05  0:01 Tycho Andersen
  2018-06-05  3:59 ` Serge E. Hallyn
  2018-06-28 12:05 ` Greg Kroah-Hartman
  0 siblings, 2 replies; 27+ messages in thread
From: Tycho Andersen @ 2018-06-05  0:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby
  Cc: linux-serial, linux-kernel, Serge E . Hallyn, Tycho Andersen

We have reports of the following crash:

    PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0"
    #0 [ffff88085c6db710] machine_kexec at ffffffff81046239
    #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248
    #2 [ffff88085c6db830] oops_end at ffffffff81008ae7
    #3 [ffff88085c6db860] no_context at ffffffff81050b8f
    #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75
    #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83
    #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e
    #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c
    #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122
    [exception RIP: uart_put_char+149]
    RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006
    RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081
    RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120
    RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320
    R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000
    R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
    #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544
    #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c
    #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b
    #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2
    #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b
    #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a
    #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016
    #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194
    #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a
    #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2
    #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d
    #20 [ffff88085c6dbeb0] kthread at ffffffff81096384
    #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f​

after slogging through some dissasembly:

ffffffff814b6720 <uart_put_char>:
ffffffff814b6720:	55                   	push   %rbp
ffffffff814b6721:	48 89 e5             	mov    %rsp,%rbp
ffffffff814b6724:	48 83 ec 20          	sub    $0x20,%rsp
ffffffff814b6728:	48 89 1c 24          	mov    %rbx,(%rsp)
ffffffff814b672c:	4c 89 64 24 08       	mov    %r12,0x8(%rsp)
ffffffff814b6731:	4c 89 6c 24 10       	mov    %r13,0x10(%rsp)
ffffffff814b6736:	4c 89 74 24 18       	mov    %r14,0x18(%rsp)
ffffffff814b673b:	e8 b0 8e 58 00       	callq  ffffffff81a3f5f0 <mcount>
ffffffff814b6740:	4c 8b a7 88 02 00 00 	mov    0x288(%rdi),%r12
ffffffff814b6747:	45 31 ed             	xor    %r13d,%r13d
ffffffff814b674a:	41 89 f6             	mov    %esi,%r14d
ffffffff814b674d:	49 83 bc 24 70 01 00 	cmpq   $0x0,0x170(%r12)
ffffffff814b6754:	00 00
ffffffff814b6756:	49 8b 9c 24 80 01 00 	mov    0x180(%r12),%rbx
ffffffff814b675d:	00
ffffffff814b675e:	74 2f                	je     ffffffff814b678f <uart_put_char+0x6f>
ffffffff814b6760:	48 89 df             	mov    %rbx,%rdi
ffffffff814b6763:	e8 a8 67 58 00       	callq  ffffffff81a3cf10 <_raw_spin_lock_irqsave>
ffffffff814b6768:	41 8b 8c 24 78 01 00 	mov    0x178(%r12),%ecx
ffffffff814b676f:	00
ffffffff814b6770:	89 ca                	mov    %ecx,%edx
ffffffff814b6772:	f7 d2                	not    %edx
ffffffff814b6774:	41 03 94 24 7c 01 00 	add    0x17c(%r12),%edx
ffffffff814b677b:	00
ffffffff814b677c:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b6782:	75 23                	jne    ffffffff814b67a7 <uart_put_char+0x87>
ffffffff814b6784:	48 89 c6             	mov    %rax,%rsi
ffffffff814b6787:	48 89 df             	mov    %rbx,%rdi
ffffffff814b678a:	e8 e1 64 58 00       	callq  ffffffff81a3cc70 <_raw_spin_unlock_irqrestore>
ffffffff814b678f:	44 89 e8             	mov    %r13d,%eax
ffffffff814b6792:	48 8b 1c 24          	mov    (%rsp),%rbx
ffffffff814b6796:	4c 8b 64 24 08       	mov    0x8(%rsp),%r12
ffffffff814b679b:	4c 8b 6c 24 10       	mov    0x10(%rsp),%r13
ffffffff814b67a0:	4c 8b 74 24 18       	mov    0x18(%rsp),%r14
ffffffff814b67a5:	c9                   	leaveq
ffffffff814b67a6:	c3                   	retq
ffffffff814b67a7:	49 8b 94 24 70 01 00 	mov    0x170(%r12),%rdx
ffffffff814b67ae:	00
ffffffff814b67af:	48 63 c9             	movslq %ecx,%rcx
ffffffff814b67b2:	41 b5 01             	mov    $0x1,%r13b
ffffffff814b67b5:	44 88 34 0a          	mov    %r14b,(%rdx,%rcx,1)
ffffffff814b67b9:	41 8b 94 24 78 01 00 	mov    0x178(%r12),%edx
ffffffff814b67c0:	00
ffffffff814b67c1:	83 c2 01             	add    $0x1,%edx
ffffffff814b67c4:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b67ca:	41 89 94 24 78 01 00 	mov    %edx,0x178(%r12)
ffffffff814b67d1:	00
ffffffff814b67d2:	eb b0                	jmp    ffffffff814b6784 <uart_put_char+0x64>
ffffffff814b67d4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
ffffffff814b67db:	00 00 00 00 00

for our build, this is crashing at:

    circ->buf[circ->head] = c;

Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
protected by the "per-port mutex", which based on uart_port_check() is
state->port.mutex. Indeed, the lock acquired in uart_put_char() is
uport->lock, i.e. not the same lock.

Anyway, since the lock is not acquired, if uart_shutdown() is called, the
last chunk of that function may release state->xmit.buf before its assigned
to null, and cause the race above.

To fix it, we simply also acquire state->port.mutex.

Unfortunately, I don't have any insightful thoughts about how to test this.
Ideas are appreciated :)

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
---
 drivers/tty/serial/serial_core.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index 0466f9f08a91..883a8c15510c 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -532,9 +532,15 @@ static int uart_put_char(struct tty_struct *tty, unsigned char c)
 	unsigned long flags;
 	int ret = 0;
 
+	/*
+	 * state->xmit.buf is protected by state->port.mutex, see the note in
+	 * uart_port_startup()
+	 */
+	mutex_lock(&state->port.mutex);
+
 	circ = &state->xmit;
 	if (!circ->buf)
-		return 0;
+		goto out;
 
 	port = uart_port_lock(state, flags);
 	if (port && uart_circ_chars_free(circ) != 0) {
@@ -543,6 +549,9 @@ static int uart_put_char(struct tty_struct *tty, unsigned char c)
 		ret = 1;
 	}
 	uart_port_unlock(port, flags);
+
+out:
+	mutex_unlock(&state->port.mutex);
 	return ret;
 }
 
-- 
2.17.0

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] uart: fix race between uart_put_char() and uart_shutdown()
  2018-06-05  0:01 [PATCH] uart: fix race between uart_put_char() and uart_shutdown() Tycho Andersen
@ 2018-06-05  3:59 ` Serge E. Hallyn
  2018-06-06 21:42   ` Tycho Andersen
  2018-06-28 12:05 ` Greg Kroah-Hartman
  1 sibling, 1 reply; 27+ messages in thread
From: Serge E. Hallyn @ 2018-06-05  3:59 UTC (permalink / raw)
  To: Tycho Andersen
  Cc: Greg Kroah-Hartman, Jiri Slaby, linux-serial, linux-kernel,
	Serge E . Hallyn

Quoting Tycho Andersen (tycho@tycho.ws):
> We have reports of the following crash:
> 
>     PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0"
>     #0 [ffff88085c6db710] machine_kexec at ffffffff81046239
>     #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248
>     #2 [ffff88085c6db830] oops_end at ffffffff81008ae7
>     #3 [ffff88085c6db860] no_context at ffffffff81050b8f
>     #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75
>     #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83
>     #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e
>     #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c
>     #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122
>     [exception RIP: uart_put_char+149]
>     RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006
>     RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081
>     RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120
>     RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320
>     R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000
>     R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba
>     ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
>     #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544
>     #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c
>     #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b
>     #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2
>     #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b
>     #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a
>     #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016
>     #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194
>     #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a
>     #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2
>     #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d
>     #20 [ffff88085c6dbeb0] kthread at ffffffff81096384
>     #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f​
> 
> after slogging through some dissasembly:
> 
> ffffffff814b6720 <uart_put_char>:
> ffffffff814b6720:	55                   	push   %rbp
> ffffffff814b6721:	48 89 e5             	mov    %rsp,%rbp
> ffffffff814b6724:	48 83 ec 20          	sub    $0x20,%rsp
> ffffffff814b6728:	48 89 1c 24          	mov    %rbx,(%rsp)
> ffffffff814b672c:	4c 89 64 24 08       	mov    %r12,0x8(%rsp)
> ffffffff814b6731:	4c 89 6c 24 10       	mov    %r13,0x10(%rsp)
> ffffffff814b6736:	4c 89 74 24 18       	mov    %r14,0x18(%rsp)
> ffffffff814b673b:	e8 b0 8e 58 00       	callq  ffffffff81a3f5f0 <mcount>
> ffffffff814b6740:	4c 8b a7 88 02 00 00 	mov    0x288(%rdi),%r12
> ffffffff814b6747:	45 31 ed             	xor    %r13d,%r13d
> ffffffff814b674a:	41 89 f6             	mov    %esi,%r14d
> ffffffff814b674d:	49 83 bc 24 70 01 00 	cmpq   $0x0,0x170(%r12)
> ffffffff814b6754:	00 00
> ffffffff814b6756:	49 8b 9c 24 80 01 00 	mov    0x180(%r12),%rbx
> ffffffff814b675d:	00
> ffffffff814b675e:	74 2f                	je     ffffffff814b678f <uart_put_char+0x6f>
> ffffffff814b6760:	48 89 df             	mov    %rbx,%rdi
> ffffffff814b6763:	e8 a8 67 58 00       	callq  ffffffff81a3cf10 <_raw_spin_lock_irqsave>
> ffffffff814b6768:	41 8b 8c 24 78 01 00 	mov    0x178(%r12),%ecx
> ffffffff814b676f:	00
> ffffffff814b6770:	89 ca                	mov    %ecx,%edx
> ffffffff814b6772:	f7 d2                	not    %edx
> ffffffff814b6774:	41 03 94 24 7c 01 00 	add    0x17c(%r12),%edx
> ffffffff814b677b:	00
> ffffffff814b677c:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
> ffffffff814b6782:	75 23                	jne    ffffffff814b67a7 <uart_put_char+0x87>
> ffffffff814b6784:	48 89 c6             	mov    %rax,%rsi
> ffffffff814b6787:	48 89 df             	mov    %rbx,%rdi
> ffffffff814b678a:	e8 e1 64 58 00       	callq  ffffffff81a3cc70 <_raw_spin_unlock_irqrestore>
> ffffffff814b678f:	44 89 e8             	mov    %r13d,%eax
> ffffffff814b6792:	48 8b 1c 24          	mov    (%rsp),%rbx
> ffffffff814b6796:	4c 8b 64 24 08       	mov    0x8(%rsp),%r12
> ffffffff814b679b:	4c 8b 6c 24 10       	mov    0x10(%rsp),%r13
> ffffffff814b67a0:	4c 8b 74 24 18       	mov    0x18(%rsp),%r14
> ffffffff814b67a5:	c9                   	leaveq
> ffffffff814b67a6:	c3                   	retq
> ffffffff814b67a7:	49 8b 94 24 70 01 00 	mov    0x170(%r12),%rdx
> ffffffff814b67ae:	00
> ffffffff814b67af:	48 63 c9             	movslq %ecx,%rcx
> ffffffff814b67b2:	41 b5 01             	mov    $0x1,%r13b
> ffffffff814b67b5:	44 88 34 0a          	mov    %r14b,(%rdx,%rcx,1)
> ffffffff814b67b9:	41 8b 94 24 78 01 00 	mov    0x178(%r12),%edx
> ffffffff814b67c0:	00
> ffffffff814b67c1:	83 c2 01             	add    $0x1,%edx
> ffffffff814b67c4:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
> ffffffff814b67ca:	41 89 94 24 78 01 00 	mov    %edx,0x178(%r12)
> ffffffff814b67d1:	00
> ffffffff814b67d2:	eb b0                	jmp    ffffffff814b6784 <uart_put_char+0x64>
> ffffffff814b67d4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
> ffffffff814b67db:	00 00 00 00 00
> 
> for our build, this is crashing at:
> 
>     circ->buf[circ->head] = c;
> 
> Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
> protected by the "per-port mutex", which based on uart_port_check() is
> state->port.mutex. Indeed, the lock acquired in uart_put_char() is
> uport->lock, i.e. not the same lock.
> 
> Anyway, since the lock is not acquired, if uart_shutdown() is called, the
> last chunk of that function may release state->xmit.buf before its assigned
> to null, and cause the race above.
> 
> To fix it, we simply also acquire state->port.mutex.
> 
> Unfortunately, I don't have any insightful thoughts about how to test this.
> Ideas are appreciated :)

I wonder whether there is something we can do with qemu -serial pipe: ?

> Signed-off-by: Tycho Andersen <tycho@tycho.ws>

Acked-by: Serge Hallyn <serge@hallyn.com>

> ---
>  drivers/tty/serial/serial_core.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
> index 0466f9f08a91..883a8c15510c 100644
> --- a/drivers/tty/serial/serial_core.c
> +++ b/drivers/tty/serial/serial_core.c
> @@ -532,9 +532,15 @@ static int uart_put_char(struct tty_struct *tty, unsigned char c)
>  	unsigned long flags;
>  	int ret = 0;
>  
> +	/*
> +	 * state->xmit.buf is protected by state->port.mutex, see the note in
> +	 * uart_port_startup()
> +	 */
> +	mutex_lock(&state->port.mutex);
> +
>  	circ = &state->xmit;
>  	if (!circ->buf)
> -		return 0;
> +		goto out;
>  
>  	port = uart_port_lock(state, flags);
>  	if (port && uart_circ_chars_free(circ) != 0) {
> @@ -543,6 +549,9 @@ static int uart_put_char(struct tty_struct *tty, unsigned char c)
>  		ret = 1;
>  	}
>  	uart_port_unlock(port, flags);
> +
> +out:
> +	mutex_unlock(&state->port.mutex);
>  	return ret;
>  }
>  
> -- 
> 2.17.0

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] uart: fix race between uart_put_char() and uart_shutdown()
  2018-06-05  3:59 ` Serge E. Hallyn
@ 2018-06-06 21:42   ` Tycho Andersen
  0 siblings, 0 replies; 27+ messages in thread
From: Tycho Andersen @ 2018-06-06 21:42 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Greg Kroah-Hartman, Jiri Slaby, linux-serial, linux-kernel

On Mon, Jun 04, 2018 at 10:59:36PM -0500, Serge E. Hallyn wrote:
> Quoting Tycho Andersen (tycho@tycho.ws):
> > Unfortunately, I don't have any insightful thoughts about how to test this.
> > Ideas are appreciated :)
> 
> I wonder whether there is something we can do with qemu -serial pipe: ?

Good idea. I couldn't get tty_put_char() to trigger nicely, but I just
hard coded one to occur, so at least now I know that this doesn't
deadlock when called normally.

Another suggestion Serge gave off list was to write a kernel module
that implemented a driver. I'll see about doing that to see if I can
force the original crash.

> > Signed-off-by: Tycho Andersen <tycho@tycho.ws>
> 
> Acked-by: Serge Hallyn <serge@hallyn.com>

Thanks!

Tycho

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] uart: fix race between uart_put_char() and uart_shutdown()
  2018-06-05  0:01 [PATCH] uart: fix race between uart_put_char() and uart_shutdown() Tycho Andersen
  2018-06-05  3:59 ` Serge E. Hallyn
@ 2018-06-28 12:05 ` Greg Kroah-Hartman
  2018-06-29 10:24   ` [PATCH v2] " Tycho Andersen
  1 sibling, 1 reply; 27+ messages in thread
From: Greg Kroah-Hartman @ 2018-06-28 12:05 UTC (permalink / raw)
  To: Tycho Andersen; +Cc: Jiri Slaby, linux-serial, linux-kernel, Serge E . Hallyn

On Mon, Jun 04, 2018 at 06:01:27PM -0600, Tycho Andersen wrote:
> We have reports of the following crash:
> 
>     PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0"
>     #0 [ffff88085c6db710] machine_kexec at ffffffff81046239
>     #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248
>     #2 [ffff88085c6db830] oops_end at ffffffff81008ae7
>     #3 [ffff88085c6db860] no_context at ffffffff81050b8f
>     #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75
>     #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83
>     #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e
>     #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c
>     #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122
>     [exception RIP: uart_put_char+149]
>     RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006
>     RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081
>     RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120
>     RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320
>     R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000
>     R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba
>     ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
>     #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544
>     #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c
>     #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b
>     #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2
>     #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b
>     #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a
>     #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016
>     #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194
>     #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a
>     #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2
>     #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d
>     #20 [ffff88085c6dbeb0] kthread at ffffffff81096384
>     #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f​
> 
> after slogging through some dissasembly:
> 
> ffffffff814b6720 <uart_put_char>:
> ffffffff814b6720:	55                   	push   %rbp
> ffffffff814b6721:	48 89 e5             	mov    %rsp,%rbp
> ffffffff814b6724:	48 83 ec 20          	sub    $0x20,%rsp
> ffffffff814b6728:	48 89 1c 24          	mov    %rbx,(%rsp)
> ffffffff814b672c:	4c 89 64 24 08       	mov    %r12,0x8(%rsp)
> ffffffff814b6731:	4c 89 6c 24 10       	mov    %r13,0x10(%rsp)
> ffffffff814b6736:	4c 89 74 24 18       	mov    %r14,0x18(%rsp)
> ffffffff814b673b:	e8 b0 8e 58 00       	callq  ffffffff81a3f5f0 <mcount>
> ffffffff814b6740:	4c 8b a7 88 02 00 00 	mov    0x288(%rdi),%r12
> ffffffff814b6747:	45 31 ed             	xor    %r13d,%r13d
> ffffffff814b674a:	41 89 f6             	mov    %esi,%r14d
> ffffffff814b674d:	49 83 bc 24 70 01 00 	cmpq   $0x0,0x170(%r12)
> ffffffff814b6754:	00 00
> ffffffff814b6756:	49 8b 9c 24 80 01 00 	mov    0x180(%r12),%rbx
> ffffffff814b675d:	00
> ffffffff814b675e:	74 2f                	je     ffffffff814b678f <uart_put_char+0x6f>
> ffffffff814b6760:	48 89 df             	mov    %rbx,%rdi
> ffffffff814b6763:	e8 a8 67 58 00       	callq  ffffffff81a3cf10 <_raw_spin_lock_irqsave>
> ffffffff814b6768:	41 8b 8c 24 78 01 00 	mov    0x178(%r12),%ecx
> ffffffff814b676f:	00
> ffffffff814b6770:	89 ca                	mov    %ecx,%edx
> ffffffff814b6772:	f7 d2                	not    %edx
> ffffffff814b6774:	41 03 94 24 7c 01 00 	add    0x17c(%r12),%edx
> ffffffff814b677b:	00
> ffffffff814b677c:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
> ffffffff814b6782:	75 23                	jne    ffffffff814b67a7 <uart_put_char+0x87>
> ffffffff814b6784:	48 89 c6             	mov    %rax,%rsi
> ffffffff814b6787:	48 89 df             	mov    %rbx,%rdi
> ffffffff814b678a:	e8 e1 64 58 00       	callq  ffffffff81a3cc70 <_raw_spin_unlock_irqrestore>
> ffffffff814b678f:	44 89 e8             	mov    %r13d,%eax
> ffffffff814b6792:	48 8b 1c 24          	mov    (%rsp),%rbx
> ffffffff814b6796:	4c 8b 64 24 08       	mov    0x8(%rsp),%r12
> ffffffff814b679b:	4c 8b 6c 24 10       	mov    0x10(%rsp),%r13
> ffffffff814b67a0:	4c 8b 74 24 18       	mov    0x18(%rsp),%r14
> ffffffff814b67a5:	c9                   	leaveq
> ffffffff814b67a6:	c3                   	retq
> ffffffff814b67a7:	49 8b 94 24 70 01 00 	mov    0x170(%r12),%rdx
> ffffffff814b67ae:	00
> ffffffff814b67af:	48 63 c9             	movslq %ecx,%rcx
> ffffffff814b67b2:	41 b5 01             	mov    $0x1,%r13b
> ffffffff814b67b5:	44 88 34 0a          	mov    %r14b,(%rdx,%rcx,1)
> ffffffff814b67b9:	41 8b 94 24 78 01 00 	mov    0x178(%r12),%edx
> ffffffff814b67c0:	00
> ffffffff814b67c1:	83 c2 01             	add    $0x1,%edx
> ffffffff814b67c4:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
> ffffffff814b67ca:	41 89 94 24 78 01 00 	mov    %edx,0x178(%r12)
> ffffffff814b67d1:	00
> ffffffff814b67d2:	eb b0                	jmp    ffffffff814b6784 <uart_put_char+0x64>
> ffffffff814b67d4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
> ffffffff814b67db:	00 00 00 00 00
> 
> for our build, this is crashing at:
> 
>     circ->buf[circ->head] = c;
> 
> Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
> protected by the "per-port mutex", which based on uart_port_check() is
> state->port.mutex. Indeed, the lock acquired in uart_put_char() is
> uport->lock, i.e. not the same lock.
> 
> Anyway, since the lock is not acquired, if uart_shutdown() is called, the
> last chunk of that function may release state->xmit.buf before its assigned
> to null, and cause the race above.
> 
> To fix it, we simply also acquire state->port.mutex.

Ick.  Can't we just grab the same lock in the other place?  Grabbing 2
locks for every individual character seems _really_ heavy, don't you
think?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2] uart: fix race between uart_put_char() and uart_shutdown()
  2018-06-28 12:05 ` Greg Kroah-Hartman
@ 2018-06-29 10:24   ` " Tycho Andersen
  2018-06-29 16:43     ` Tycho Andersen
  0 siblings, 1 reply; 27+ messages in thread
From: Tycho Andersen @ 2018-06-29 10:24 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby
  Cc: linux-serial, linux-kernel, Serge E . Hallyn, Tycho Andersen

We have reports of the following crash:

    PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0"
    #0 [ffff88085c6db710] machine_kexec at ffffffff81046239
    #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248
    #2 [ffff88085c6db830] oops_end at ffffffff81008ae7
    #3 [ffff88085c6db860] no_context at ffffffff81050b8f
    #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75
    #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83
    #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e
    #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c
    #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122
    [exception RIP: uart_put_char+149]
    RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006
    RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081
    RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120
    RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320
    R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000
    R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
    #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544
    #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c
    #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b
    #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2
    #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b
    #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a
    #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016
    #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194
    #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a
    #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2
    #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d
    #20 [ffff88085c6dbeb0] kthread at ffffffff81096384
    #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f​

after slogging through some dissasembly:

ffffffff814b6720 <uart_put_char>:
ffffffff814b6720:	55                   	push   %rbp
ffffffff814b6721:	48 89 e5             	mov    %rsp,%rbp
ffffffff814b6724:	48 83 ec 20          	sub    $0x20,%rsp
ffffffff814b6728:	48 89 1c 24          	mov    %rbx,(%rsp)
ffffffff814b672c:	4c 89 64 24 08       	mov    %r12,0x8(%rsp)
ffffffff814b6731:	4c 89 6c 24 10       	mov    %r13,0x10(%rsp)
ffffffff814b6736:	4c 89 74 24 18       	mov    %r14,0x18(%rsp)
ffffffff814b673b:	e8 b0 8e 58 00       	callq  ffffffff81a3f5f0 <mcount>
ffffffff814b6740:	4c 8b a7 88 02 00 00 	mov    0x288(%rdi),%r12
ffffffff814b6747:	45 31 ed             	xor    %r13d,%r13d
ffffffff814b674a:	41 89 f6             	mov    %esi,%r14d
ffffffff814b674d:	49 83 bc 24 70 01 00 	cmpq   $0x0,0x170(%r12)
ffffffff814b6754:	00 00
ffffffff814b6756:	49 8b 9c 24 80 01 00 	mov    0x180(%r12),%rbx
ffffffff814b675d:	00
ffffffff814b675e:	74 2f                	je     ffffffff814b678f <uart_put_char+0x6f>
ffffffff814b6760:	48 89 df             	mov    %rbx,%rdi
ffffffff814b6763:	e8 a8 67 58 00       	callq  ffffffff81a3cf10 <_raw_spin_lock_irqsave>
ffffffff814b6768:	41 8b 8c 24 78 01 00 	mov    0x178(%r12),%ecx
ffffffff814b676f:	00
ffffffff814b6770:	89 ca                	mov    %ecx,%edx
ffffffff814b6772:	f7 d2                	not    %edx
ffffffff814b6774:	41 03 94 24 7c 01 00 	add    0x17c(%r12),%edx
ffffffff814b677b:	00
ffffffff814b677c:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b6782:	75 23                	jne    ffffffff814b67a7 <uart_put_char+0x87>
ffffffff814b6784:	48 89 c6             	mov    %rax,%rsi
ffffffff814b6787:	48 89 df             	mov    %rbx,%rdi
ffffffff814b678a:	e8 e1 64 58 00       	callq  ffffffff81a3cc70 <_raw_spin_unlock_irqrestore>
ffffffff814b678f:	44 89 e8             	mov    %r13d,%eax
ffffffff814b6792:	48 8b 1c 24          	mov    (%rsp),%rbx
ffffffff814b6796:	4c 8b 64 24 08       	mov    0x8(%rsp),%r12
ffffffff814b679b:	4c 8b 6c 24 10       	mov    0x10(%rsp),%r13
ffffffff814b67a0:	4c 8b 74 24 18       	mov    0x18(%rsp),%r14
ffffffff814b67a5:	c9                   	leaveq
ffffffff814b67a6:	c3                   	retq
ffffffff814b67a7:	49 8b 94 24 70 01 00 	mov    0x170(%r12),%rdx
ffffffff814b67ae:	00
ffffffff814b67af:	48 63 c9             	movslq %ecx,%rcx
ffffffff814b67b2:	41 b5 01             	mov    $0x1,%r13b
ffffffff814b67b5:	44 88 34 0a          	mov    %r14b,(%rdx,%rcx,1)
ffffffff814b67b9:	41 8b 94 24 78 01 00 	mov    0x178(%r12),%edx
ffffffff814b67c0:	00
ffffffff814b67c1:	83 c2 01             	add    $0x1,%edx
ffffffff814b67c4:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b67ca:	41 89 94 24 78 01 00 	mov    %edx,0x178(%r12)
ffffffff814b67d1:	00
ffffffff814b67d2:	eb b0                	jmp    ffffffff814b6784 <uart_put_char+0x64>
ffffffff814b67d4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
ffffffff814b67db:	00 00 00 00 00

for our build, this is crashing at:

    circ->buf[circ->head] = c;

Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
protected by the "per-port mutex", which based on uart_port_check() is
state->port.mutex. Indeed, the lock acquired in uart_put_char() is
uport->lock, i.e. not the same lock.

Anyway, since the lock is not acquired, if uart_shutdown() is called, the
last chunk of that function may release state->xmit.buf before its assigned
to null, and cause the race above.

To fix it, let's lock uport->lock when allocating/deallocating
state->xmit.buf in addition to the per-port mutex.

v2: switch to locking uport->lock on allocation/deallocation instead of
    locking the per-port mutex in uart_put_char. Note that since
    uport->lock is a spin lock, we have to switch the allocation to
    GFP_ATOMIC.

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
---
 drivers/tty/serial/serial_core.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index 9c14a453f73c..790f3ea7ffca 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -181,7 +181,7 @@ static int uart_port_startup(struct tty_struct *tty, struct uart_state *state,
 		int init_hw)
 {
 	struct uart_port *uport = uart_port_check(state);
-	unsigned long page;
+	unsigned long page, flags = 0;
 	int retval = 0;
 
 	if (uport->type == PORT_UNKNOWN)
@@ -196,15 +196,19 @@ static int uart_port_startup(struct tty_struct *tty, struct uart_state *state,
 	 * Initialise and allocate the transmit and temporary
 	 * buffer.
 	 */
+	uart_port_lock(state, flags);
 	if (!state->xmit.buf) {
-		/* This is protected by the per port mutex */
-		page = get_zeroed_page(GFP_KERNEL);
-		if (!page)
+		page = get_zeroed_page(GFP_ATOMIC);
+		if (!page) {
+			uart_port_unlock(uport, flags);
 			return -ENOMEM;
+		}
 
 		state->xmit.buf = (unsigned char *) page;
 		uart_circ_clear(&state->xmit);
 	}
+	uart_port_unlock(uport, flags);
+
 
 	retval = uport->ops->startup(uport);
 	if (retval == 0) {
@@ -263,6 +267,7 @@ static void uart_shutdown(struct tty_struct *tty, struct uart_state *state)
 {
 	struct uart_port *uport = uart_port_check(state);
 	struct tty_port *port = &state->port;
+	unsigned long flags = 0;
 
 	/*
 	 * Set the TTY IO error marker
@@ -295,10 +300,12 @@ static void uart_shutdown(struct tty_struct *tty, struct uart_state *state)
 	/*
 	 * Free the transmit buffer page.
 	 */
+	uart_port_lock(state, flags);
 	if (state->xmit.buf) {
 		free_page((unsigned long)state->xmit.buf);
 		state->xmit.buf = NULL;
 	}
+	uart_port_unlock(uport, flags);
 }
 
 /**
-- 
2.17.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2] uart: fix race between uart_put_char() and uart_shutdown()
  2018-06-29 10:24   ` [PATCH v2] " Tycho Andersen
@ 2018-06-29 16:43     ` Tycho Andersen
  2018-07-06 14:39       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 27+ messages in thread
From: Tycho Andersen @ 2018-06-29 16:43 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby
  Cc: linux-serial, linux-kernel, Serge E . Hallyn

On Fri, Jun 29, 2018 at 04:24:46AM -0600, Tycho Andersen wrote:
> v2: switch to locking uport->lock on allocation/deallocation instead of
>     locking the per-port mutex in uart_put_char. Note that since
>     uport->lock is a spin lock, we have to switch the allocation to
>     GFP_ATOMIC.

Serge pointed out off-list that we may want to do the allocation
before the lock so that it's more likely to be successful. I'm happy
to send that change to this if it's what we want to do, I don't have a
strong preference.

Tycho

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2] uart: fix race between uart_put_char() and uart_shutdown()
  2018-06-29 16:43     ` Tycho Andersen
@ 2018-07-06 14:39       ` Greg Kroah-Hartman
  2018-07-06 16:24         ` [PATCH v3] " Tycho Andersen
  0 siblings, 1 reply; 27+ messages in thread
From: Greg Kroah-Hartman @ 2018-07-06 14:39 UTC (permalink / raw)
  To: Tycho Andersen; +Cc: Jiri Slaby, linux-serial, linux-kernel, Serge E . Hallyn

On Fri, Jun 29, 2018 at 10:43:30AM -0600, Tycho Andersen wrote:
> On Fri, Jun 29, 2018 at 04:24:46AM -0600, Tycho Andersen wrote:
> > v2: switch to locking uport->lock on allocation/deallocation instead of
> >     locking the per-port mutex in uart_put_char. Note that since
> >     uport->lock is a spin lock, we have to switch the allocation to
> >     GFP_ATOMIC.
> 
> Serge pointed out off-list that we may want to do the allocation
> before the lock so that it's more likely to be successful. I'm happy
> to send that change to this if it's what we want to do, I don't have a
> strong preference.

That sounds like a much better thing to do.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v3] uart: fix race between uart_put_char() and uart_shutdown()
  2018-07-06 14:39       ` Greg Kroah-Hartman
@ 2018-07-06 16:24         ` " Tycho Andersen
  2018-07-06 16:49           ` Andy Shevchenko
  0 siblings, 1 reply; 27+ messages in thread
From: Tycho Andersen @ 2018-07-06 16:24 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby
  Cc: linux-serial, linux-kernel, Serge E . Hallyn, Tycho Andersen

We have reports of the following crash:

    PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0"
    #0 [ffff88085c6db710] machine_kexec at ffffffff81046239
    #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248
    #2 [ffff88085c6db830] oops_end at ffffffff81008ae7
    #3 [ffff88085c6db860] no_context at ffffffff81050b8f
    #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75
    #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83
    #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e
    #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c
    #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122
    [exception RIP: uart_put_char+149]
    RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006
    RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081
    RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120
    RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320
    R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000
    R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
    #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544
    #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c
    #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b
    #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2
    #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b
    #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a
    #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016
    #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194
    #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a
    #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2
    #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d
    #20 [ffff88085c6dbeb0] kthread at ffffffff81096384
    #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f​

after slogging through some dissasembly:

ffffffff814b6720 <uart_put_char>:
ffffffff814b6720:	55                   	push   %rbp
ffffffff814b6721:	48 89 e5             	mov    %rsp,%rbp
ffffffff814b6724:	48 83 ec 20          	sub    $0x20,%rsp
ffffffff814b6728:	48 89 1c 24          	mov    %rbx,(%rsp)
ffffffff814b672c:	4c 89 64 24 08       	mov    %r12,0x8(%rsp)
ffffffff814b6731:	4c 89 6c 24 10       	mov    %r13,0x10(%rsp)
ffffffff814b6736:	4c 89 74 24 18       	mov    %r14,0x18(%rsp)
ffffffff814b673b:	e8 b0 8e 58 00       	callq  ffffffff81a3f5f0 <mcount>
ffffffff814b6740:	4c 8b a7 88 02 00 00 	mov    0x288(%rdi),%r12
ffffffff814b6747:	45 31 ed             	xor    %r13d,%r13d
ffffffff814b674a:	41 89 f6             	mov    %esi,%r14d
ffffffff814b674d:	49 83 bc 24 70 01 00 	cmpq   $0x0,0x170(%r12)
ffffffff814b6754:	00 00
ffffffff814b6756:	49 8b 9c 24 80 01 00 	mov    0x180(%r12),%rbx
ffffffff814b675d:	00
ffffffff814b675e:	74 2f                	je     ffffffff814b678f <uart_put_char+0x6f>
ffffffff814b6760:	48 89 df             	mov    %rbx,%rdi
ffffffff814b6763:	e8 a8 67 58 00       	callq  ffffffff81a3cf10 <_raw_spin_lock_irqsave>
ffffffff814b6768:	41 8b 8c 24 78 01 00 	mov    0x178(%r12),%ecx
ffffffff814b676f:	00
ffffffff814b6770:	89 ca                	mov    %ecx,%edx
ffffffff814b6772:	f7 d2                	not    %edx
ffffffff814b6774:	41 03 94 24 7c 01 00 	add    0x17c(%r12),%edx
ffffffff814b677b:	00
ffffffff814b677c:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b6782:	75 23                	jne    ffffffff814b67a7 <uart_put_char+0x87>
ffffffff814b6784:	48 89 c6             	mov    %rax,%rsi
ffffffff814b6787:	48 89 df             	mov    %rbx,%rdi
ffffffff814b678a:	e8 e1 64 58 00       	callq  ffffffff81a3cc70 <_raw_spin_unlock_irqrestore>
ffffffff814b678f:	44 89 e8             	mov    %r13d,%eax
ffffffff814b6792:	48 8b 1c 24          	mov    (%rsp),%rbx
ffffffff814b6796:	4c 8b 64 24 08       	mov    0x8(%rsp),%r12
ffffffff814b679b:	4c 8b 6c 24 10       	mov    0x10(%rsp),%r13
ffffffff814b67a0:	4c 8b 74 24 18       	mov    0x18(%rsp),%r14
ffffffff814b67a5:	c9                   	leaveq
ffffffff814b67a6:	c3                   	retq
ffffffff814b67a7:	49 8b 94 24 70 01 00 	mov    0x170(%r12),%rdx
ffffffff814b67ae:	00
ffffffff814b67af:	48 63 c9             	movslq %ecx,%rcx
ffffffff814b67b2:	41 b5 01             	mov    $0x1,%r13b
ffffffff814b67b5:	44 88 34 0a          	mov    %r14b,(%rdx,%rcx,1)
ffffffff814b67b9:	41 8b 94 24 78 01 00 	mov    0x178(%r12),%edx
ffffffff814b67c0:	00
ffffffff814b67c1:	83 c2 01             	add    $0x1,%edx
ffffffff814b67c4:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b67ca:	41 89 94 24 78 01 00 	mov    %edx,0x178(%r12)
ffffffff814b67d1:	00
ffffffff814b67d2:	eb b0                	jmp    ffffffff814b6784 <uart_put_char+0x64>
ffffffff814b67d4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
ffffffff814b67db:	00 00 00 00 00

for our build, this is crashing at:

    circ->buf[circ->head] = c;

Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
protected by the "per-port mutex", which based on uart_port_check() is
state->port.mutex. Indeed, the lock acquired in uart_put_char() is
uport->lock, i.e. not the same lock.

Anyway, since the lock is not acquired, if uart_shutdown() is called, the
last chunk of that function may release state->xmit.buf before its assigned
to null, and cause the race above.

To fix it, let's lock uport->lock when allocating/deallocating
state->xmit.buf in addition to the per-port mutex.

v2: switch to locking uport->lock on allocation/deallocation instead of
    locking the per-port mutex in uart_put_char. Note that since
    uport->lock is a spin lock, we have to switch the allocation to
    GFP_ATOMIC.
v3: move the allocation outside the lock, so we can switch back to
    GFP_KERNEL

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
---
 drivers/tty/serial/serial_core.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index 9c14a453f73c..46bc97121e49 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -181,7 +181,7 @@ static int uart_port_startup(struct tty_struct *tty, struct uart_state *state,
 		int init_hw)
 {
 	struct uart_port *uport = uart_port_check(state);
-	unsigned long page;
+	unsigned long page, flags = 0;
 	int retval = 0;
 
 	if (uport->type == PORT_UNKNOWN)
@@ -196,15 +196,18 @@ static int uart_port_startup(struct tty_struct *tty, struct uart_state *state,
 	 * Initialise and allocate the transmit and temporary
 	 * buffer.
 	 */
-	if (!state->xmit.buf) {
-		/* This is protected by the per port mutex */
-		page = get_zeroed_page(GFP_KERNEL);
-		if (!page)
-			return -ENOMEM;
+	page = get_zeroed_page(GFP_KERNEL);
+	if (!page)
+		return -ENOMEM;
 
+	uart_port_lock(state, flags);
+	if (!state->xmit.buf) {
 		state->xmit.buf = (unsigned char *) page;
 		uart_circ_clear(&state->xmit);
+	} else {
+		free_page(page);
 	}
+	uart_port_unlock(uport, flags);
 
 	retval = uport->ops->startup(uport);
 	if (retval == 0) {
@@ -263,6 +266,7 @@ static void uart_shutdown(struct tty_struct *tty, struct uart_state *state)
 {
 	struct uart_port *uport = uart_port_check(state);
 	struct tty_port *port = &state->port;
+	unsigned long flags = 0;
 
 	/*
 	 * Set the TTY IO error marker
@@ -295,10 +299,12 @@ static void uart_shutdown(struct tty_struct *tty, struct uart_state *state)
 	/*
 	 * Free the transmit buffer page.
 	 */
+	uart_port_lock(state, flags);
 	if (state->xmit.buf) {
 		free_page((unsigned long)state->xmit.buf);
 		state->xmit.buf = NULL;
 	}
+	uart_port_unlock(uport, flags);
 }
 
 /**
-- 
2.17.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] uart: fix race between uart_put_char() and uart_shutdown()
  2018-07-06 16:24         ` [PATCH v3] " Tycho Andersen
@ 2018-07-06 16:49           ` Andy Shevchenko
  2018-07-06 18:39             ` Tycho Andersen
  0 siblings, 1 reply; 27+ messages in thread
From: Andy Shevchenko @ 2018-07-06 16:49 UTC (permalink / raw)
  To: Tycho Andersen
  Cc: Greg Kroah-Hartman, Jiri Slaby, open list:SERIAL DRIVERS,
	Linux Kernel Mailing List, Serge E . Hallyn

On Fri, Jul 6, 2018 at 7:24 PM, Tycho Andersen <tycho@tycho.ws> wrote:

> Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
> protected by the "per-port mutex", which based on uart_port_check() is
> state->port.mutex. Indeed, the lock acquired in uart_put_char() is
> uport->lock, i.e. not the same lock.
>
> Anyway, since the lock is not acquired, if uart_shutdown() is called, the
> last chunk of that function may release state->xmit.buf before its assigned
> to null, and cause the race above.
>
> To fix it, let's lock uport->lock when allocating/deallocating
> state->xmit.buf in addition to the per-port mutex.

Thanks for fixing this!

Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>

Some nitpicks though.

> +       unsigned long page, flags = 0;

I would rather put on separate lines and btw assignment is not needed.
It all goes through macros.

> -       if (!state->xmit.buf) {
> -               /* This is protected by the per port mutex */
> -               page = get_zeroed_page(GFP_KERNEL);
> -               if (!page)
> -                       return -ENOMEM;
> +       page = get_zeroed_page(GFP_KERNEL);
> +       if (!page)
> +               return -ENOMEM;
> +       if (!state->xmit.buf) {
>                 state->xmit.buf = (unsigned char *) page;
>                 uart_circ_clear(&state->xmit);
> +       } else {
> +               free_page(page);
>         }

I see original code, but since you are adding else, does it make sense
to switch to positive condition?

> +       unsigned long flags = 0;

Ditto about assignment.

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] uart: fix race between uart_put_char() and uart_shutdown()
  2018-07-06 16:49           ` Andy Shevchenko
@ 2018-07-06 18:39             ` Tycho Andersen
  2018-07-06 20:48               ` Andy Shevchenko
  0 siblings, 1 reply; 27+ messages in thread
From: Tycho Andersen @ 2018-07-06 18:39 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Greg Kroah-Hartman, Jiri Slaby, open list:SERIAL DRIVERS,
	Linux Kernel Mailing List, Serge E . Hallyn

On Fri, Jul 06, 2018 at 07:49:09PM +0300, Andy Shevchenko wrote:
> On Fri, Jul 6, 2018 at 7:24 PM, Tycho Andersen <tycho@tycho.ws> wrote:
> 
> > Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
> > protected by the "per-port mutex", which based on uart_port_check() is
> > state->port.mutex. Indeed, the lock acquired in uart_put_char() is
> > uport->lock, i.e. not the same lock.
> >
> > Anyway, since the lock is not acquired, if uart_shutdown() is called, the
> > last chunk of that function may release state->xmit.buf before its assigned
> > to null, and cause the race above.
> >
> > To fix it, let's lock uport->lock when allocating/deallocating
> > state->xmit.buf in addition to the per-port mutex.
> 
> Thanks for fixing this!
> 
> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
> 
> Some nitpicks though.
> 
> > +       unsigned long page, flags = 0;
> 
> I would rather put on separate lines and btw assignment is not needed.
> It all goes through macros.

Sure, I can split it up, but without the initialization I get,

  CC      drivers/tty/serial/serial_core.o
In file included from ./include/linux/seqlock.h:36:0,
                 from ./include/linux/time.h:6,
                 from ./include/linux/stat.h:19,
                 from ./include/linux/module.h:10,
                 from drivers/tty/serial/serial_core.c:10:
drivers/tty/serial/serial_core.c: In function ‘uart_startup.part.20’:
./include/linux/spinlock.h:260:3: warning: ‘flags’ may be used uninitialized in this function  -Wmaybe-uninitialized]
   _raw_spin_unlock_irqrestore(lock, flags); \
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/tty/serial/serial_core.c:184:22: note: ‘flags’ was declared here
  unsigned long page, flags;
                      ^~~~~

> > -       if (!state->xmit.buf) {
> > -               /* This is protected by the per port mutex */
> > -               page = get_zeroed_page(GFP_KERNEL);
> > -               if (!page)
> > -                       return -ENOMEM;
> > +       page = get_zeroed_page(GFP_KERNEL);
> > +       if (!page)
> > +               return -ENOMEM;
> > +       if (!state->xmit.buf) {
> >                 state->xmit.buf = (unsigned char *) page;
> >                 uart_circ_clear(&state->xmit);
> > +       } else {
> > +               free_page(page);
> >         }
> 
> I see original code, but since you are adding else, does it make sense
> to switch to positive condition?

Sure, I'll switch it.

> > +       unsigned long flags = 0;
> 
> Ditto about assignment.

And in this case too,

drivers/tty/serial/serial_core.c:184:22: note: ‘flags’ was declared here
  unsigned long page, flags;
                      ^~~~~
In file included from ./include/linux/seqlock.h:36:0,
                 from ./include/linux/time.h:6,
                 from ./include/linux/stat.h:19,
                 from ./include/linux/module.h:10,
                 from drivers/tty/serial/serial_core.c:10:
drivers/tty/serial/serial_core.c: In function ‘uart_shutdown’:
./include/linux/spinlock.h:260:3: warning: ‘flags’ may be used uninitialized in this function  -Wmaybe-uninitialized]
   _raw_spin_unlock_irqrestore(lock, flags); \
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/tty/serial/serial_core.c:269:16: note: ‘flags’ was declared here
  unsigned long flags;
                ^~~~~

Tycho

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] uart: fix race between uart_put_char() and uart_shutdown()
  2018-07-06 18:39             ` Tycho Andersen
@ 2018-07-06 20:48               ` Andy Shevchenko
  2018-07-06 21:22                 ` Tycho Andersen
  0 siblings, 1 reply; 27+ messages in thread
From: Andy Shevchenko @ 2018-07-06 20:48 UTC (permalink / raw)
  To: Tycho Andersen
  Cc: Greg Kroah-Hartman, Jiri Slaby, open list:SERIAL DRIVERS,
	Linux Kernel Mailing List, Serge E . Hallyn

On Fri, Jul 6, 2018 at 9:39 PM, Tycho Andersen <tycho@tycho.ws> wrote:
> On Fri, Jul 06, 2018 at 07:49:09PM +0300, Andy Shevchenko wrote:
>> On Fri, Jul 6, 2018 at 7:24 PM, Tycho Andersen <tycho@tycho.ws> wrote:

>  but without the initialization I get,
>
>   CC      drivers/tty/serial/serial_core.o
> In file included from ./include/linux/seqlock.h:36:0,
>                  from ./include/linux/time.h:6,
>                  from ./include/linux/stat.h:19,
>                  from ./include/linux/module.h:10,
>                  from drivers/tty/serial/serial_core.c:10:
> drivers/tty/serial/serial_core.c: In function ‘uart_startup.part.20’:
> ./include/linux/spinlock.h:260:3: warning: ‘flags’ may be used uninitialized in this function  -Wmaybe-uninitialized]
>    _raw_spin_unlock_irqrestore(lock, flags); \
>    ^~~~~~~~~~~~~~~~~~~~~~~~~~~
> drivers/tty/serial/serial_core.c:184:22: note: ‘flags’ was declared here
>   unsigned long page, flags;
>                       ^~~~~

Hmm... I didn't see such warning. How you run make?

Btw, you adding the only places with such assignments in this file.
So, I would not do in your case, until entire file would be fixed.

(But warning looks bogus, or you have some patches on top of current
vanilla / next)

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] uart: fix race between uart_put_char() and uart_shutdown()
  2018-07-06 20:48               ` Andy Shevchenko
@ 2018-07-06 21:22                 ` Tycho Andersen
  2018-07-11 16:07                   ` [PATCH v4] " Tycho Andersen
  0 siblings, 1 reply; 27+ messages in thread
From: Tycho Andersen @ 2018-07-06 21:22 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Greg Kroah-Hartman, Jiri Slaby, open list:SERIAL DRIVERS,
	Linux Kernel Mailing List, Serge E . Hallyn

On Fri, Jul 06, 2018 at 11:48:58PM +0300, Andy Shevchenko wrote:
> On Fri, Jul 6, 2018 at 9:39 PM, Tycho Andersen <tycho@tycho.ws> wrote:
> > On Fri, Jul 06, 2018 at 07:49:09PM +0300, Andy Shevchenko wrote:
> >> On Fri, Jul 6, 2018 at 7:24 PM, Tycho Andersen <tycho@tycho.ws> wrote:
> 
> >  but without the initialization I get,
> >
> >   CC      drivers/tty/serial/serial_core.o
> > In file included from ./include/linux/seqlock.h:36:0,
> >                  from ./include/linux/time.h:6,
> >                  from ./include/linux/stat.h:19,
> >                  from ./include/linux/module.h:10,
> >                  from drivers/tty/serial/serial_core.c:10:
> > drivers/tty/serial/serial_core.c: In function ‘uart_startup.part.20’:
> > ./include/linux/spinlock.h:260:3: warning: ‘flags’ may be used uninitialized in this function  -Wmaybe-uninitialized]
> >    _raw_spin_unlock_irqrestore(lock, flags); \
> >    ^~~~~~~~~~~~~~~~~~~~~~~~~~~
> > drivers/tty/serial/serial_core.c:184:22: note: ‘flags’ was declared here
> >   unsigned long page, flags;
> >                       ^~~~~
> 
> Hmm... I didn't see such warning. How you run make?

Just with `make`, although using the specific object file works too.
Perhaps it's gcc versions?

~/packages/linux uart-fix-v4 make drivers/tty/serial/serial_core.o
  CALL    scripts/checksyscalls.sh
  DESCEND  objtool
  CC      drivers/tty/serial/serial_core.o
In file included from ./include/linux/seqlock.h:36:0,
                 from ./include/linux/time.h:6,
                 from ./include/linux/stat.h:19,
                 from ./include/linux/module.h:10,
                 from drivers/tty/serial/serial_core.c:10:
drivers/tty/serial/serial_core.c: In function ‘uart_startup.part.20’:
./include/linux/spinlock.h:260:3: warning: ‘flags’ may be used uninitialized in this function  -Wmaybe-uninitialized]
   _raw_spin_unlock_irqrestore(lock, flags); \
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/tty/serial/serial_core.c:184:22: note: ‘flags’ was declared here
  unsigned long page, flags;
                      ^~~~~
In file included from ./include/linux/seqlock.h:36:0,
                 from ./include/linux/time.h:6,
                 from ./include/linux/stat.h:19,
                 from ./include/linux/module.h:10,
                 from drivers/tty/serial/serial_core.c:10:
drivers/tty/serial/serial_core.c: In function ‘uart_shutdown’:
./include/linux/spinlock.h:260:3: warning: ‘flags’ may be used uninitialized in this function  -Wmaybe-uninitialized]
   _raw_spin_unlock_irqrestore(lock, flags); \
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/tty/serial/serial_core.c:269:16: note: ‘flags’ was declared here
  unsigned long flags;
                ^~~~~
~/packages/linux uart-fix-v4 gcc --version
gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

> Btw, you adding the only places with such assignments in this file.
> So, I would not do in your case, until entire file would be fixed.
> 
> (But warning looks bogus, or you have some patches on top of current
> vanilla / next)

I don't have any patches, but I do admit to not thinking about it very hard and
adding initializations. I'll see if I can figure out what's going on.

Thanks,

Tycho

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v4] uart: fix race between uart_put_char() and uart_shutdown()
  2018-07-06 21:22                 ` Tycho Andersen
@ 2018-07-11 16:07                   ` " Tycho Andersen
  2018-07-11 19:24                     ` Serge E. Hallyn
                                       ` (3 more replies)
  0 siblings, 4 replies; 27+ messages in thread
From: Tycho Andersen @ 2018-07-11 16:07 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby
  Cc: linux-serial, linux-kernel, Serge E . Hallyn, Tycho Andersen

We have reports of the following crash:

    PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0"
    #0 [ffff88085c6db710] machine_kexec at ffffffff81046239
    #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248
    #2 [ffff88085c6db830] oops_end at ffffffff81008ae7
    #3 [ffff88085c6db860] no_context at ffffffff81050b8f
    #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75
    #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83
    #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e
    #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c
    #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122
    [exception RIP: uart_put_char+149]
    RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006
    RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081
    RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120
    RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320
    R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000
    R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
    #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544
    #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c
    #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b
    #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2
    #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b
    #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a
    #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016
    #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194
    #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a
    #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2
    #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d
    #20 [ffff88085c6dbeb0] kthread at ffffffff81096384
    #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f​

after slogging through some dissasembly:

ffffffff814b6720 <uart_put_char>:
ffffffff814b6720:	55                   	push   %rbp
ffffffff814b6721:	48 89 e5             	mov    %rsp,%rbp
ffffffff814b6724:	48 83 ec 20          	sub    $0x20,%rsp
ffffffff814b6728:	48 89 1c 24          	mov    %rbx,(%rsp)
ffffffff814b672c:	4c 89 64 24 08       	mov    %r12,0x8(%rsp)
ffffffff814b6731:	4c 89 6c 24 10       	mov    %r13,0x10(%rsp)
ffffffff814b6736:	4c 89 74 24 18       	mov    %r14,0x18(%rsp)
ffffffff814b673b:	e8 b0 8e 58 00       	callq  ffffffff81a3f5f0 <mcount>
ffffffff814b6740:	4c 8b a7 88 02 00 00 	mov    0x288(%rdi),%r12
ffffffff814b6747:	45 31 ed             	xor    %r13d,%r13d
ffffffff814b674a:	41 89 f6             	mov    %esi,%r14d
ffffffff814b674d:	49 83 bc 24 70 01 00 	cmpq   $0x0,0x170(%r12)
ffffffff814b6754:	00 00
ffffffff814b6756:	49 8b 9c 24 80 01 00 	mov    0x180(%r12),%rbx
ffffffff814b675d:	00
ffffffff814b675e:	74 2f                	je     ffffffff814b678f <uart_put_char+0x6f>
ffffffff814b6760:	48 89 df             	mov    %rbx,%rdi
ffffffff814b6763:	e8 a8 67 58 00       	callq  ffffffff81a3cf10 <_raw_spin_lock_irqsave>
ffffffff814b6768:	41 8b 8c 24 78 01 00 	mov    0x178(%r12),%ecx
ffffffff814b676f:	00
ffffffff814b6770:	89 ca                	mov    %ecx,%edx
ffffffff814b6772:	f7 d2                	not    %edx
ffffffff814b6774:	41 03 94 24 7c 01 00 	add    0x17c(%r12),%edx
ffffffff814b677b:	00
ffffffff814b677c:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b6782:	75 23                	jne    ffffffff814b67a7 <uart_put_char+0x87>
ffffffff814b6784:	48 89 c6             	mov    %rax,%rsi
ffffffff814b6787:	48 89 df             	mov    %rbx,%rdi
ffffffff814b678a:	e8 e1 64 58 00       	callq  ffffffff81a3cc70 <_raw_spin_unlock_irqrestore>
ffffffff814b678f:	44 89 e8             	mov    %r13d,%eax
ffffffff814b6792:	48 8b 1c 24          	mov    (%rsp),%rbx
ffffffff814b6796:	4c 8b 64 24 08       	mov    0x8(%rsp),%r12
ffffffff814b679b:	4c 8b 6c 24 10       	mov    0x10(%rsp),%r13
ffffffff814b67a0:	4c 8b 74 24 18       	mov    0x18(%rsp),%r14
ffffffff814b67a5:	c9                   	leaveq
ffffffff814b67a6:	c3                   	retq
ffffffff814b67a7:	49 8b 94 24 70 01 00 	mov    0x170(%r12),%rdx
ffffffff814b67ae:	00
ffffffff814b67af:	48 63 c9             	movslq %ecx,%rcx
ffffffff814b67b2:	41 b5 01             	mov    $0x1,%r13b
ffffffff814b67b5:	44 88 34 0a          	mov    %r14b,(%rdx,%rcx,1)
ffffffff814b67b9:	41 8b 94 24 78 01 00 	mov    0x178(%r12),%edx
ffffffff814b67c0:	00
ffffffff814b67c1:	83 c2 01             	add    $0x1,%edx
ffffffff814b67c4:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b67ca:	41 89 94 24 78 01 00 	mov    %edx,0x178(%r12)
ffffffff814b67d1:	00
ffffffff814b67d2:	eb b0                	jmp    ffffffff814b6784 <uart_put_char+0x64>
ffffffff814b67d4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
ffffffff814b67db:	00 00 00 00 00

for our build, this is crashing at:

    circ->buf[circ->head] = c;

Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
protected by the "per-port mutex", which based on uart_port_check() is
state->port.mutex. Indeed, the lock acquired in uart_put_char() is
uport->lock, i.e. not the same lock.

Anyway, since the lock is not acquired, if uart_shutdown() is called, the
last chunk of that function may release state->xmit.buf before its assigned
to null, and cause the race above.

To fix it, let's lock uport->lock when allocating/deallocating
state->xmit.buf in addition to the per-port mutex.

v2: switch to locking uport->lock on allocation/deallocation instead of
    locking the per-port mutex in uart_put_char. Note that since
    uport->lock is a spin lock, we have to switch the allocation to
    GFP_ATOMIC.
v3: move the allocation outside the lock, so we can switch back to
    GFP_KERNEL
v4: * switch to positive condition of state->xmit.buf in
      uart_port_startup()
    * declare `flags` on its own line
    * use the result of uart_port_lock() in uart_shutdown() to avoid
      uninitialized warning
    * don't use the uart_port_lock/unlock macros in uart_port_startup,
      instead test against uport directly; the compiler can't seem to "see"
      through the macros/ref/unref calls to not warn about uninitialized
      flags. We don't need to do a ref here since we hold the per-port
      mutex anyway.

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
---
 drivers/tty/serial/serial_core.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index 9c14a453f73c..406e8382d3de 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -182,6 +182,7 @@ static int uart_port_startup(struct tty_struct *tty, struct uart_state *state,
 {
 	struct uart_port *uport = uart_port_check(state);
 	unsigned long page;
+	unsigned long flags;
 	int retval = 0;
 
 	if (uport->type == PORT_UNKNOWN)
@@ -196,15 +197,20 @@ static int uart_port_startup(struct tty_struct *tty, struct uart_state *state,
 	 * Initialise and allocate the transmit and temporary
 	 * buffer.
 	 */
-	if (!state->xmit.buf) {
-		/* This is protected by the per port mutex */
-		page = get_zeroed_page(GFP_KERNEL);
-		if (!page)
-			return -ENOMEM;
+	page = get_zeroed_page(GFP_KERNEL);
+	if (!page)
+		return -ENOMEM;
 
+	if (uport)
+		spin_lock_irqsave(&uport->lock, flags);
+	if (state->xmit.buf) {
+		free_page(page);
+	} else {
 		state->xmit.buf = (unsigned char *) page;
 		uart_circ_clear(&state->xmit);
 	}
+	if (uport)
+		spin_unlock_irqrestore(&uport->lock, flags);
 
 	retval = uport->ops->startup(uport);
 	if (retval == 0) {
@@ -263,6 +269,7 @@ static void uart_shutdown(struct tty_struct *tty, struct uart_state *state)
 {
 	struct uart_port *uport = uart_port_check(state);
 	struct tty_port *port = &state->port;
+	unsigned long flags;
 
 	/*
 	 * Set the TTY IO error marker
@@ -295,10 +302,12 @@ static void uart_shutdown(struct tty_struct *tty, struct uart_state *state)
 	/*
 	 * Free the transmit buffer page.
 	 */
+	uport = uart_port_lock(state, flags);
 	if (state->xmit.buf) {
 		free_page((unsigned long)state->xmit.buf);
 		state->xmit.buf = NULL;
 	}
+	uart_port_unlock(uport, flags);
 }
 
 /**
-- 
2.17.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4] uart: fix race between uart_put_char() and uart_shutdown()
  2018-07-11 16:07                   ` [PATCH v4] " Tycho Andersen
@ 2018-07-11 19:24                     ` Serge E. Hallyn
  2018-07-11 19:49                     ` Serge E. Hallyn
                                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 27+ messages in thread
From: Serge E. Hallyn @ 2018-07-11 19:24 UTC (permalink / raw)
  To: Tycho Andersen
  Cc: Greg Kroah-Hartman, Jiri Slaby, linux-serial, linux-kernel,
	Serge E . Hallyn

Quoting Tycho Andersen (tycho@tycho.ws):
> We have reports of the following crash:
> 
>     PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0"
>     #0 [ffff88085c6db710] machine_kexec at ffffffff81046239
>     #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248
>     #2 [ffff88085c6db830] oops_end at ffffffff81008ae7
>     #3 [ffff88085c6db860] no_context at ffffffff81050b8f
>     #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75
>     #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83
>     #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e
>     #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c
>     #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122
>     [exception RIP: uart_put_char+149]
>     RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006
>     RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081
>     RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120
>     RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320
>     R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000
>     R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba
>     ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
>     #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544
>     #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c
>     #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b
>     #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2
>     #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b
>     #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a
>     #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016
>     #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194
>     #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a
>     #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2
>     #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d
>     #20 [ffff88085c6dbeb0] kthread at ffffffff81096384
>     #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f​
> 
> after slogging through some dissasembly:
> 
> ffffffff814b6720 <uart_put_char>:
> ffffffff814b6720:	55                   	push   %rbp
> ffffffff814b6721:	48 89 e5             	mov    %rsp,%rbp
> ffffffff814b6724:	48 83 ec 20          	sub    $0x20,%rsp
> ffffffff814b6728:	48 89 1c 24          	mov    %rbx,(%rsp)
> ffffffff814b672c:	4c 89 64 24 08       	mov    %r12,0x8(%rsp)
> ffffffff814b6731:	4c 89 6c 24 10       	mov    %r13,0x10(%rsp)
> ffffffff814b6736:	4c 89 74 24 18       	mov    %r14,0x18(%rsp)
> ffffffff814b673b:	e8 b0 8e 58 00       	callq  ffffffff81a3f5f0 <mcount>
> ffffffff814b6740:	4c 8b a7 88 02 00 00 	mov    0x288(%rdi),%r12
> ffffffff814b6747:	45 31 ed             	xor    %r13d,%r13d
> ffffffff814b674a:	41 89 f6             	mov    %esi,%r14d
> ffffffff814b674d:	49 83 bc 24 70 01 00 	cmpq   $0x0,0x170(%r12)
> ffffffff814b6754:	00 00
> ffffffff814b6756:	49 8b 9c 24 80 01 00 	mov    0x180(%r12),%rbx
> ffffffff814b675d:	00
> ffffffff814b675e:	74 2f                	je     ffffffff814b678f <uart_put_char+0x6f>
> ffffffff814b6760:	48 89 df             	mov    %rbx,%rdi
> ffffffff814b6763:	e8 a8 67 58 00       	callq  ffffffff81a3cf10 <_raw_spin_lock_irqsave>
> ffffffff814b6768:	41 8b 8c 24 78 01 00 	mov    0x178(%r12),%ecx
> ffffffff814b676f:	00
> ffffffff814b6770:	89 ca                	mov    %ecx,%edx
> ffffffff814b6772:	f7 d2                	not    %edx
> ffffffff814b6774:	41 03 94 24 7c 01 00 	add    0x17c(%r12),%edx
> ffffffff814b677b:	00
> ffffffff814b677c:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
> ffffffff814b6782:	75 23                	jne    ffffffff814b67a7 <uart_put_char+0x87>
> ffffffff814b6784:	48 89 c6             	mov    %rax,%rsi
> ffffffff814b6787:	48 89 df             	mov    %rbx,%rdi
> ffffffff814b678a:	e8 e1 64 58 00       	callq  ffffffff81a3cc70 <_raw_spin_unlock_irqrestore>
> ffffffff814b678f:	44 89 e8             	mov    %r13d,%eax
> ffffffff814b6792:	48 8b 1c 24          	mov    (%rsp),%rbx
> ffffffff814b6796:	4c 8b 64 24 08       	mov    0x8(%rsp),%r12
> ffffffff814b679b:	4c 8b 6c 24 10       	mov    0x10(%rsp),%r13
> ffffffff814b67a0:	4c 8b 74 24 18       	mov    0x18(%rsp),%r14
> ffffffff814b67a5:	c9                   	leaveq
> ffffffff814b67a6:	c3                   	retq
> ffffffff814b67a7:	49 8b 94 24 70 01 00 	mov    0x170(%r12),%rdx
> ffffffff814b67ae:	00
> ffffffff814b67af:	48 63 c9             	movslq %ecx,%rcx
> ffffffff814b67b2:	41 b5 01             	mov    $0x1,%r13b
> ffffffff814b67b5:	44 88 34 0a          	mov    %r14b,(%rdx,%rcx,1)
> ffffffff814b67b9:	41 8b 94 24 78 01 00 	mov    0x178(%r12),%edx
> ffffffff814b67c0:	00
> ffffffff814b67c1:	83 c2 01             	add    $0x1,%edx
> ffffffff814b67c4:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
> ffffffff814b67ca:	41 89 94 24 78 01 00 	mov    %edx,0x178(%r12)
> ffffffff814b67d1:	00
> ffffffff814b67d2:	eb b0                	jmp    ffffffff814b6784 <uart_put_char+0x64>
> ffffffff814b67d4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
> ffffffff814b67db:	00 00 00 00 00
> 
> for our build, this is crashing at:
> 
>     circ->buf[circ->head] = c;
> 
> Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
> protected by the "per-port mutex", which based on uart_port_check() is
> state->port.mutex. Indeed, the lock acquired in uart_put_char() is
> uport->lock, i.e. not the same lock.
> 
> Anyway, since the lock is not acquired, if uart_shutdown() is called, the
> last chunk of that function may release state->xmit.buf before its assigned
> to null, and cause the race above.
> 
> To fix it, let's lock uport->lock when allocating/deallocating
> state->xmit.buf in addition to the per-port mutex.
> 
> v2: switch to locking uport->lock on allocation/deallocation instead of
>     locking the per-port mutex in uart_put_char. Note that since
>     uport->lock is a spin lock, we have to switch the allocation to
>     GFP_ATOMIC.
> v3: move the allocation outside the lock, so we can switch back to
>     GFP_KERNEL
> v4: * switch to positive condition of state->xmit.buf in
>       uart_port_startup()
>     * declare `flags` on its own line
>     * use the result of uart_port_lock() in uart_shutdown() to avoid
>       uninitialized warning
>     * don't use the uart_port_lock/unlock macros in uart_port_startup,
>       instead test against uport directly; the compiler can't seem to "see"
>       through the macros/ref/unref calls to not warn about uninitialized
>       flags. We don't need to do a ref here since we hold the per-port
>       mutex anyway.
> 
> Signed-off-by: Tycho Andersen <tycho@tycho.ws>

Reviewed-by: Serge Hallyn <serge@hallyn.com>

Thanks, Tycho.

> ---
>  drivers/tty/serial/serial_core.c | 19 ++++++++++++++-----
>  1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
> index 9c14a453f73c..406e8382d3de 100644
> --- a/drivers/tty/serial/serial_core.c
> +++ b/drivers/tty/serial/serial_core.c
> @@ -182,6 +182,7 @@ static int uart_port_startup(struct tty_struct *tty, struct uart_state *state,
>  {
>  	struct uart_port *uport = uart_port_check(state);
>  	unsigned long page;
> +	unsigned long flags;
>  	int retval = 0;
>  
>  	if (uport->type == PORT_UNKNOWN)
> @@ -196,15 +197,20 @@ static int uart_port_startup(struct tty_struct *tty, struct uart_state *state,
>  	 * Initialise and allocate the transmit and temporary
>  	 * buffer.
>  	 */
> -	if (!state->xmit.buf) {
> -		/* This is protected by the per port mutex */
> -		page = get_zeroed_page(GFP_KERNEL);
> -		if (!page)
> -			return -ENOMEM;
> +	page = get_zeroed_page(GFP_KERNEL);
> +	if (!page)
> +		return -ENOMEM;
>  
> +	if (uport)
> +		spin_lock_irqsave(&uport->lock, flags);
> +	if (state->xmit.buf) {
> +		free_page(page);
> +	} else {
>  		state->xmit.buf = (unsigned char *) page;
>  		uart_circ_clear(&state->xmit);
>  	}
> +	if (uport)
> +		spin_unlock_irqrestore(&uport->lock, flags);
>  
>  	retval = uport->ops->startup(uport);
>  	if (retval == 0) {
> @@ -263,6 +269,7 @@ static void uart_shutdown(struct tty_struct *tty, struct uart_state *state)
>  {
>  	struct uart_port *uport = uart_port_check(state);
>  	struct tty_port *port = &state->port;
> +	unsigned long flags;
>  
>  	/*
>  	 * Set the TTY IO error marker
> @@ -295,10 +302,12 @@ static void uart_shutdown(struct tty_struct *tty, struct uart_state *state)
>  	/*
>  	 * Free the transmit buffer page.
>  	 */
> +	uport = uart_port_lock(state, flags);
>  	if (state->xmit.buf) {
>  		free_page((unsigned long)state->xmit.buf);
>  		state->xmit.buf = NULL;
>  	}
> +	uart_port_unlock(uport, flags);
>  }
>  
>  /**
> -- 
> 2.17.1

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4] uart: fix race between uart_put_char() and uart_shutdown()
  2018-07-11 16:07                   ` [PATCH v4] " Tycho Andersen
  2018-07-11 19:24                     ` Serge E. Hallyn
@ 2018-07-11 19:49                     ` Serge E. Hallyn
  2018-07-11 20:00                       ` Tycho Andersen
  2018-07-12  9:03                     ` Andy Shevchenko
  2018-07-12 15:04                     ` Greg Kroah-Hartman
  3 siblings, 1 reply; 27+ messages in thread
From: Serge E. Hallyn @ 2018-07-11 19:49 UTC (permalink / raw)
  To: Tycho Andersen
  Cc: Greg Kroah-Hartman, Jiri Slaby, linux-serial, linux-kernel,
	Serge E . Hallyn

Quoting Tycho Andersen (tycho@tycho.ws):
> We have reports of the following crash:
> 
>     PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0"
>     #0 [ffff88085c6db710] machine_kexec at ffffffff81046239
>     #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248
>     #2 [ffff88085c6db830] oops_end at ffffffff81008ae7
>     #3 [ffff88085c6db860] no_context at ffffffff81050b8f
>     #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75
>     #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83
>     #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e
>     #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c
>     #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122
>     [exception RIP: uart_put_char+149]
>     RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006
>     RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081
>     RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120
>     RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320
>     R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000
>     R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba
>     ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
>     #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544
>     #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c
>     #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b
>     #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2
>     #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b
>     #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a
>     #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016
>     #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194
>     #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a
>     #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2
>     #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d
>     #20 [ffff88085c6dbeb0] kthread at ffffffff81096384
>     #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f​
> 
> after slogging through some dissasembly:
> 
> ffffffff814b6720 <uart_put_char>:
> ffffffff814b6720:	55                   	push   %rbp
> ffffffff814b6721:	48 89 e5             	mov    %rsp,%rbp
> ffffffff814b6724:	48 83 ec 20          	sub    $0x20,%rsp
> ffffffff814b6728:	48 89 1c 24          	mov    %rbx,(%rsp)
> ffffffff814b672c:	4c 89 64 24 08       	mov    %r12,0x8(%rsp)
> ffffffff814b6731:	4c 89 6c 24 10       	mov    %r13,0x10(%rsp)
> ffffffff814b6736:	4c 89 74 24 18       	mov    %r14,0x18(%rsp)
> ffffffff814b673b:	e8 b0 8e 58 00       	callq  ffffffff81a3f5f0 <mcount>
> ffffffff814b6740:	4c 8b a7 88 02 00 00 	mov    0x288(%rdi),%r12
> ffffffff814b6747:	45 31 ed             	xor    %r13d,%r13d
> ffffffff814b674a:	41 89 f6             	mov    %esi,%r14d
> ffffffff814b674d:	49 83 bc 24 70 01 00 	cmpq   $0x0,0x170(%r12)
> ffffffff814b6754:	00 00
> ffffffff814b6756:	49 8b 9c 24 80 01 00 	mov    0x180(%r12),%rbx
> ffffffff814b675d:	00
> ffffffff814b675e:	74 2f                	je     ffffffff814b678f <uart_put_char+0x6f>
> ffffffff814b6760:	48 89 df             	mov    %rbx,%rdi
> ffffffff814b6763:	e8 a8 67 58 00       	callq  ffffffff81a3cf10 <_raw_spin_lock_irqsave>
> ffffffff814b6768:	41 8b 8c 24 78 01 00 	mov    0x178(%r12),%ecx
> ffffffff814b676f:	00
> ffffffff814b6770:	89 ca                	mov    %ecx,%edx
> ffffffff814b6772:	f7 d2                	not    %edx
> ffffffff814b6774:	41 03 94 24 7c 01 00 	add    0x17c(%r12),%edx
> ffffffff814b677b:	00
> ffffffff814b677c:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
> ffffffff814b6782:	75 23                	jne    ffffffff814b67a7 <uart_put_char+0x87>
> ffffffff814b6784:	48 89 c6             	mov    %rax,%rsi
> ffffffff814b6787:	48 89 df             	mov    %rbx,%rdi
> ffffffff814b678a:	e8 e1 64 58 00       	callq  ffffffff81a3cc70 <_raw_spin_unlock_irqrestore>
> ffffffff814b678f:	44 89 e8             	mov    %r13d,%eax
> ffffffff814b6792:	48 8b 1c 24          	mov    (%rsp),%rbx
> ffffffff814b6796:	4c 8b 64 24 08       	mov    0x8(%rsp),%r12
> ffffffff814b679b:	4c 8b 6c 24 10       	mov    0x10(%rsp),%r13
> ffffffff814b67a0:	4c 8b 74 24 18       	mov    0x18(%rsp),%r14
> ffffffff814b67a5:	c9                   	leaveq
> ffffffff814b67a6:	c3                   	retq
> ffffffff814b67a7:	49 8b 94 24 70 01 00 	mov    0x170(%r12),%rdx
> ffffffff814b67ae:	00
> ffffffff814b67af:	48 63 c9             	movslq %ecx,%rcx
> ffffffff814b67b2:	41 b5 01             	mov    $0x1,%r13b
> ffffffff814b67b5:	44 88 34 0a          	mov    %r14b,(%rdx,%rcx,1)
> ffffffff814b67b9:	41 8b 94 24 78 01 00 	mov    0x178(%r12),%edx
> ffffffff814b67c0:	00
> ffffffff814b67c1:	83 c2 01             	add    $0x1,%edx
> ffffffff814b67c4:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
> ffffffff814b67ca:	41 89 94 24 78 01 00 	mov    %edx,0x178(%r12)
> ffffffff814b67d1:	00
> ffffffff814b67d2:	eb b0                	jmp    ffffffff814b6784 <uart_put_char+0x64>
> ffffffff814b67d4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
> ffffffff814b67db:	00 00 00 00 00
> 
> for our build, this is crashing at:
> 
>     circ->buf[circ->head] = c;
> 
> Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
> protected by the "per-port mutex", which based on uart_port_check() is
> state->port.mutex. Indeed, the lock acquired in uart_put_char() is
> uport->lock, i.e. not the same lock.
> 
> Anyway, since the lock is not acquired, if uart_shutdown() is called, the
> last chunk of that function may release state->xmit.buf before its assigned
> to null, and cause the race above.
> 
> To fix it, let's lock uport->lock when allocating/deallocating
> state->xmit.buf in addition to the per-port mutex.
> 
> v2: switch to locking uport->lock on allocation/deallocation instead of
>     locking the per-port mutex in uart_put_char. Note that since
>     uport->lock is a spin lock, we have to switch the allocation to
>     GFP_ATOMIC.
> v3: move the allocation outside the lock, so we can switch back to
>     GFP_KERNEL
> v4: * switch to positive condition of state->xmit.buf in
>       uart_port_startup()
>     * declare `flags` on its own line
>     * use the result of uart_port_lock() in uart_shutdown() to avoid
>       uninitialized warning
>     * don't use the uart_port_lock/unlock macros in uart_port_startup,
>       instead test against uport directly; the compiler can't seem to "see"
>       through the macros/ref/unref calls to not warn about uninitialized
>       flags. We don't need to do a ref here since we hold the per-port
>       mutex anyway.
> 
> Signed-off-by: Tycho Andersen <tycho@tycho.ws>

One question: would it be worth doing:

> ---
>  drivers/tty/serial/serial_core.c | 19 ++++++++++++++-----
>  1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
> index 9c14a453f73c..406e8382d3de 100644
> --- a/drivers/tty/serial/serial_core.c
> +++ b/drivers/tty/serial/serial_core.c
> @@ -182,6 +182,7 @@ static int uart_port_startup(struct tty_struct *tty, struct uart_state *state,
>  {
>  	struct uart_port *uport = uart_port_check(state);
>  	unsigned long page;
> +	unsigned long flags;
>  	int retval = 0;
>  
>  	if (uport->type == PORT_UNKNOWN)
> @@ -196,15 +197,20 @@ static int uart_port_startup(struct tty_struct *tty, struct uart_state *state,
>  	 * Initialise and allocate the transmit and temporary
>  	 * buffer.
>  	 */
> -	if (!state->xmit.buf) {
> -		/* This is protected by the per port mutex */
> -		page = get_zeroed_page(GFP_KERNEL);
> -		if (!page)
> -			return -ENOMEM;
> +	page = get_zeroed_page(GFP_KERNEL);

ignoring the alloc failure here.  (leave a comment for worried
future reviewers)  Then,

> +	if (!page)
> +		return -ENOMEM;
>  
> +	if (uport)
> +		spin_lock_irqsave(&uport->lock, flags);
> +	if (state->xmit.buf) {
> +		free_page(page);
> +	} else {

here if page is NULL (unlock and) return ENOMEM.

Since it looks like this fn is called on every device open, that
would seem to minimize getting lots of needless ENOMEMs.

Another alternative is to just wait to do the alloc until we
get here, and drop the spinlock here if we need to alloc.  Then
retake the lock, re-check state->xmit.buf;  If that is now not
null then free_page(page), or if it is still NULL and page alloc
failed, then return ENOEMEM). That is uglier code, but is
probably the best behavior.

Still the original patch fixes the real bug so I'm fine with it
as well.

>  		state->xmit.buf = (unsigned char *) page;
>  		uart_circ_clear(&state->xmit);
>  	}
> +	if (uport)
> +		spin_unlock_irqrestore(&uport->lock, flags);
>  
>  	retval = uport->ops->startup(uport);
>  	if (retval == 0) {
> @@ -263,6 +269,7 @@ static void uart_shutdown(struct tty_struct *tty, struct uart_state *state)
>  {
>  	struct uart_port *uport = uart_port_check(state);
>  	struct tty_port *port = &state->port;
> +	unsigned long flags;
>  
>  	/*
>  	 * Set the TTY IO error marker
> @@ -295,10 +302,12 @@ static void uart_shutdown(struct tty_struct *tty, struct uart_state *state)
>  	/*
>  	 * Free the transmit buffer page.
>  	 */
> +	uport = uart_port_lock(state, flags);
>  	if (state->xmit.buf) {
>  		free_page((unsigned long)state->xmit.buf);
>  		state->xmit.buf = NULL;
>  	}
> +	uart_port_unlock(uport, flags);
>  }
>  
>  /**
> -- 
> 2.17.1

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4] uart: fix race between uart_put_char() and uart_shutdown()
  2018-07-11 19:49                     ` Serge E. Hallyn
@ 2018-07-11 20:00                       ` Tycho Andersen
  2018-07-12 15:05                         ` Greg Kroah-Hartman
  0 siblings, 1 reply; 27+ messages in thread
From: Tycho Andersen @ 2018-07-11 20:00 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Greg Kroah-Hartman, Jiri Slaby, linux-serial, linux-kernel

On Wed, Jul 11, 2018 at 02:49:08PM -0500, Serge E. Hallyn wrote:
> Quoting Tycho Andersen (tycho@tycho.ws):
> > We have reports of the following crash:
> > 
> >     PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0"
> >     #0 [ffff88085c6db710] machine_kexec at ffffffff81046239
> >     #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248
> >     #2 [ffff88085c6db830] oops_end at ffffffff81008ae7
> >     #3 [ffff88085c6db860] no_context at ffffffff81050b8f
> >     #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75
> >     #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83
> >     #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e
> >     #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c
> >     #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122
> >     [exception RIP: uart_put_char+149]
> >     RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006
> >     RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081
> >     RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120
> >     RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320
> >     R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000
> >     R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba
> >     ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> >     #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544
> >     #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c
> >     #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b
> >     #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2
> >     #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b
> >     #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a
> >     #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016
> >     #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194
> >     #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a
> >     #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2
> >     #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d
> >     #20 [ffff88085c6dbeb0] kthread at ffffffff81096384
> >     #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f​
> > 
> > after slogging through some dissasembly:
> > 
> > ffffffff814b6720 <uart_put_char>:
> > ffffffff814b6720:	55                   	push   %rbp
> > ffffffff814b6721:	48 89 e5             	mov    %rsp,%rbp
> > ffffffff814b6724:	48 83 ec 20          	sub    $0x20,%rsp
> > ffffffff814b6728:	48 89 1c 24          	mov    %rbx,(%rsp)
> > ffffffff814b672c:	4c 89 64 24 08       	mov    %r12,0x8(%rsp)
> > ffffffff814b6731:	4c 89 6c 24 10       	mov    %r13,0x10(%rsp)
> > ffffffff814b6736:	4c 89 74 24 18       	mov    %r14,0x18(%rsp)
> > ffffffff814b673b:	e8 b0 8e 58 00       	callq  ffffffff81a3f5f0 <mcount>
> > ffffffff814b6740:	4c 8b a7 88 02 00 00 	mov    0x288(%rdi),%r12
> > ffffffff814b6747:	45 31 ed             	xor    %r13d,%r13d
> > ffffffff814b674a:	41 89 f6             	mov    %esi,%r14d
> > ffffffff814b674d:	49 83 bc 24 70 01 00 	cmpq   $0x0,0x170(%r12)
> > ffffffff814b6754:	00 00
> > ffffffff814b6756:	49 8b 9c 24 80 01 00 	mov    0x180(%r12),%rbx
> > ffffffff814b675d:	00
> > ffffffff814b675e:	74 2f                	je     ffffffff814b678f <uart_put_char+0x6f>
> > ffffffff814b6760:	48 89 df             	mov    %rbx,%rdi
> > ffffffff814b6763:	e8 a8 67 58 00       	callq  ffffffff81a3cf10 <_raw_spin_lock_irqsave>
> > ffffffff814b6768:	41 8b 8c 24 78 01 00 	mov    0x178(%r12),%ecx
> > ffffffff814b676f:	00
> > ffffffff814b6770:	89 ca                	mov    %ecx,%edx
> > ffffffff814b6772:	f7 d2                	not    %edx
> > ffffffff814b6774:	41 03 94 24 7c 01 00 	add    0x17c(%r12),%edx
> > ffffffff814b677b:	00
> > ffffffff814b677c:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
> > ffffffff814b6782:	75 23                	jne    ffffffff814b67a7 <uart_put_char+0x87>
> > ffffffff814b6784:	48 89 c6             	mov    %rax,%rsi
> > ffffffff814b6787:	48 89 df             	mov    %rbx,%rdi
> > ffffffff814b678a:	e8 e1 64 58 00       	callq  ffffffff81a3cc70 <_raw_spin_unlock_irqrestore>
> > ffffffff814b678f:	44 89 e8             	mov    %r13d,%eax
> > ffffffff814b6792:	48 8b 1c 24          	mov    (%rsp),%rbx
> > ffffffff814b6796:	4c 8b 64 24 08       	mov    0x8(%rsp),%r12
> > ffffffff814b679b:	4c 8b 6c 24 10       	mov    0x10(%rsp),%r13
> > ffffffff814b67a0:	4c 8b 74 24 18       	mov    0x18(%rsp),%r14
> > ffffffff814b67a5:	c9                   	leaveq
> > ffffffff814b67a6:	c3                   	retq
> > ffffffff814b67a7:	49 8b 94 24 70 01 00 	mov    0x170(%r12),%rdx
> > ffffffff814b67ae:	00
> > ffffffff814b67af:	48 63 c9             	movslq %ecx,%rcx
> > ffffffff814b67b2:	41 b5 01             	mov    $0x1,%r13b
> > ffffffff814b67b5:	44 88 34 0a          	mov    %r14b,(%rdx,%rcx,1)
> > ffffffff814b67b9:	41 8b 94 24 78 01 00 	mov    0x178(%r12),%edx
> > ffffffff814b67c0:	00
> > ffffffff814b67c1:	83 c2 01             	add    $0x1,%edx
> > ffffffff814b67c4:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
> > ffffffff814b67ca:	41 89 94 24 78 01 00 	mov    %edx,0x178(%r12)
> > ffffffff814b67d1:	00
> > ffffffff814b67d2:	eb b0                	jmp    ffffffff814b6784 <uart_put_char+0x64>
> > ffffffff814b67d4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
> > ffffffff814b67db:	00 00 00 00 00
> > 
> > for our build, this is crashing at:
> > 
> >     circ->buf[circ->head] = c;
> > 
> > Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
> > protected by the "per-port mutex", which based on uart_port_check() is
> > state->port.mutex. Indeed, the lock acquired in uart_put_char() is
> > uport->lock, i.e. not the same lock.
> > 
> > Anyway, since the lock is not acquired, if uart_shutdown() is called, the
> > last chunk of that function may release state->xmit.buf before its assigned
> > to null, and cause the race above.
> > 
> > To fix it, let's lock uport->lock when allocating/deallocating
> > state->xmit.buf in addition to the per-port mutex.
> > 
> > v2: switch to locking uport->lock on allocation/deallocation instead of
> >     locking the per-port mutex in uart_put_char. Note that since
> >     uport->lock is a spin lock, we have to switch the allocation to
> >     GFP_ATOMIC.
> > v3: move the allocation outside the lock, so we can switch back to
> >     GFP_KERNEL
> > v4: * switch to positive condition of state->xmit.buf in
> >       uart_port_startup()
> >     * declare `flags` on its own line
> >     * use the result of uart_port_lock() in uart_shutdown() to avoid
> >       uninitialized warning
> >     * don't use the uart_port_lock/unlock macros in uart_port_startup,
> >       instead test against uport directly; the compiler can't seem to "see"
> >       through the macros/ref/unref calls to not warn about uninitialized
> >       flags. We don't need to do a ref here since we hold the per-port
> >       mutex anyway.
> > 
> > Signed-off-by: Tycho Andersen <tycho@tycho.ws>
> 
> One question: would it be worth doing:
> 
> > ---
> >  drivers/tty/serial/serial_core.c | 19 ++++++++++++++-----
> >  1 file changed, 14 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
> > index 9c14a453f73c..406e8382d3de 100644
> > --- a/drivers/tty/serial/serial_core.c
> > +++ b/drivers/tty/serial/serial_core.c
> > @@ -182,6 +182,7 @@ static int uart_port_startup(struct tty_struct *tty, struct uart_state *state,
> >  {
> >  	struct uart_port *uport = uart_port_check(state);
> >  	unsigned long page;
> > +	unsigned long flags;
> >  	int retval = 0;
> >  
> >  	if (uport->type == PORT_UNKNOWN)
> > @@ -196,15 +197,20 @@ static int uart_port_startup(struct tty_struct *tty, struct uart_state *state,
> >  	 * Initialise and allocate the transmit and temporary
> >  	 * buffer.
> >  	 */
> > -	if (!state->xmit.buf) {
> > -		/* This is protected by the per port mutex */
> > -		page = get_zeroed_page(GFP_KERNEL);
> > -		if (!page)
> > -			return -ENOMEM;
> > +	page = get_zeroed_page(GFP_KERNEL);
> 
> ignoring the alloc failure here.  (leave a comment for worried
> future reviewers)  Then,
> 
> > +	if (!page)
> > +		return -ENOMEM;
> >  
> > +	if (uport)
> > +		spin_lock_irqsave(&uport->lock, flags);
> > +	if (state->xmit.buf) {
> > +		free_page(page);
> > +	} else {
> 
> here if page is NULL (unlock and) return ENOMEM.
> 
> Since it looks like this fn is called on every device open, that
> would seem to minimize getting lots of needless ENOMEMs.
> 
> Another alternative is to just wait to do the alloc until we
> get here, and drop the spinlock here if we need to alloc.  Then
> retake the lock, re-check state->xmit.buf;  If that is now not
> null then free_page(page), or if it is still NULL and page alloc
> failed, then return ENOEMEM). That is uglier code, but is
> probably the best behavior.
> 
> Still the original patch fixes the real bug so I'm fine with it
> as well.

Sure, I'm happy to implement whichever of these we think is best.
Greg?

Thanks,

Tycho

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4] uart: fix race between uart_put_char() and uart_shutdown()
  2018-07-11 16:07                   ` [PATCH v4] " Tycho Andersen
  2018-07-11 19:24                     ` Serge E. Hallyn
  2018-07-11 19:49                     ` Serge E. Hallyn
@ 2018-07-12  9:03                     ` Andy Shevchenko
  2018-07-12 14:13                       ` Tycho Andersen
  2018-07-12 15:04                     ` Greg Kroah-Hartman
  3 siblings, 1 reply; 27+ messages in thread
From: Andy Shevchenko @ 2018-07-12  9:03 UTC (permalink / raw)
  To: Tycho Andersen
  Cc: Greg Kroah-Hartman, Jiri Slaby, open list:SERIAL DRIVERS,
	Linux Kernel Mailing List, Serge E . Hallyn

On Wed, Jul 11, 2018 at 7:07 PM, Tycho Andersen <tycho@tycho.ws> wrote:

> Anyway, since the lock is not acquired, if uart_shutdown() is called, the
> last chunk of that function may release state->xmit.buf before its assigned
> to null, and cause the race above.
>
> To fix it, let's lock uport->lock when allocating/deallocating
> state->xmit.buf in addition to the per-port mutex.

>     * use the result of uart_port_lock() in uart_shutdown() to avoid
>       uninitialized warning
>     * don't use the uart_port_lock/unlock macros in uart_port_startup,
>       instead test against uport directly; the compiler can't seem to "see"
>       through the macros/ref/unref calls to not warn about uninitialized
>       flags. We don't need to do a ref here since we hold the per-port
>       mutex anyway.

> +       if (uport)
> +               spin_lock_irqsave(&uport->lock, flags);

> +       if (uport)
> +               spin_unlock_irqrestore(&uport->lock, flags);

At some point it It was uart_port_lock()/uart_port_unlock(), and you
changed to simple spin lock. The macro also take reference to the
port. Do we aware about that here?

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4] uart: fix race between uart_put_char() and uart_shutdown()
  2018-07-12  9:03                     ` Andy Shevchenko
@ 2018-07-12 14:13                       ` Tycho Andersen
  0 siblings, 0 replies; 27+ messages in thread
From: Tycho Andersen @ 2018-07-12 14:13 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Greg Kroah-Hartman, Jiri Slaby, open list:SERIAL DRIVERS,
	Linux Kernel Mailing List, Serge E . Hallyn

Hi Andy,

On Thu, Jul 12, 2018 at 12:03:08PM +0300, Andy Shevchenko wrote:
> On Wed, Jul 11, 2018 at 7:07 PM, Tycho Andersen <tycho@tycho.ws> wrote:
> 
> > Anyway, since the lock is not acquired, if uart_shutdown() is called, the
> > last chunk of that function may release state->xmit.buf before its assigned
> > to null, and cause the race above.
> >
> > To fix it, let's lock uport->lock when allocating/deallocating
> > state->xmit.buf in addition to the per-port mutex.
> 
> >     * use the result of uart_port_lock() in uart_shutdown() to avoid
> >       uninitialized warning
> >     * don't use the uart_port_lock/unlock macros in uart_port_startup,
> >       instead test against uport directly; the compiler can't seem to "see"
> >       through the macros/ref/unref calls to not warn about uninitialized
> >       flags. We don't need to do a ref here since we hold the per-port
> >       mutex anyway.
> 
> > +       if (uport)
> > +               spin_lock_irqsave(&uport->lock, flags);
> 
> > +       if (uport)
> > +               spin_unlock_irqrestore(&uport->lock, flags);
> 
> At some point it It was uart_port_lock()/uart_port_unlock(), and you
> changed to simple spin lock. The macro also take reference to the
> port. Do we aware about that here?

I don't think so, the commit message you quoted above says,

> We don't need to do a ref here since we hold the per-port mutex
> anyway.

Cheers,

Tycho

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4] uart: fix race between uart_put_char() and uart_shutdown()
  2018-07-11 16:07                   ` [PATCH v4] " Tycho Andersen
                                       ` (2 preceding siblings ...)
  2018-07-12  9:03                     ` Andy Shevchenko
@ 2018-07-12 15:04                     ` Greg Kroah-Hartman
  2018-07-12 15:08                       ` Tycho Andersen
  3 siblings, 1 reply; 27+ messages in thread
From: Greg Kroah-Hartman @ 2018-07-12 15:04 UTC (permalink / raw)
  To: Tycho Andersen; +Cc: Jiri Slaby, linux-serial, linux-kernel, Serge E . Hallyn

On Wed, Jul 11, 2018 at 10:07:44AM -0600, Tycho Andersen wrote:
> +	if (uport)
> +		spin_lock_irqsave(&uport->lock, flags);

That's the same thing as just calling uart_port_lock(), why aren't you
doing that?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4] uart: fix race between uart_put_char() and uart_shutdown()
  2018-07-11 20:00                       ` Tycho Andersen
@ 2018-07-12 15:05                         ` Greg Kroah-Hartman
  0 siblings, 0 replies; 27+ messages in thread
From: Greg Kroah-Hartman @ 2018-07-12 15:05 UTC (permalink / raw)
  To: Tycho Andersen; +Cc: Serge E. Hallyn, Jiri Slaby, linux-serial, linux-kernel

On Wed, Jul 11, 2018 at 02:00:56PM -0600, Tycho Andersen wrote:
> On Wed, Jul 11, 2018 at 02:49:08PM -0500, Serge E. Hallyn wrote:
> > Quoting Tycho Andersen (tycho@tycho.ws):
> > > We have reports of the following crash:
> > > 
> > >     PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0"
> > >     #0 [ffff88085c6db710] machine_kexec at ffffffff81046239
> > >     #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248
> > >     #2 [ffff88085c6db830] oops_end at ffffffff81008ae7
> > >     #3 [ffff88085c6db860] no_context at ffffffff81050b8f
> > >     #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75
> > >     #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83
> > >     #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e
> > >     #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c
> > >     #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122
> > >     [exception RIP: uart_put_char+149]
> > >     RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006
> > >     RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081
> > >     RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120
> > >     RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320
> > >     R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000
> > >     R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba
> > >     ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> > >     #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544
> > >     #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c
> > >     #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b
> > >     #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2
> > >     #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b
> > >     #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a
> > >     #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016
> > >     #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194
> > >     #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a
> > >     #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2
> > >     #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d
> > >     #20 [ffff88085c6dbeb0] kthread at ffffffff81096384
> > >     #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f​
> > > 
> > > after slogging through some dissasembly:
> > > 
> > > ffffffff814b6720 <uart_put_char>:
> > > ffffffff814b6720:	55                   	push   %rbp
> > > ffffffff814b6721:	48 89 e5             	mov    %rsp,%rbp
> > > ffffffff814b6724:	48 83 ec 20          	sub    $0x20,%rsp
> > > ffffffff814b6728:	48 89 1c 24          	mov    %rbx,(%rsp)
> > > ffffffff814b672c:	4c 89 64 24 08       	mov    %r12,0x8(%rsp)
> > > ffffffff814b6731:	4c 89 6c 24 10       	mov    %r13,0x10(%rsp)
> > > ffffffff814b6736:	4c 89 74 24 18       	mov    %r14,0x18(%rsp)
> > > ffffffff814b673b:	e8 b0 8e 58 00       	callq  ffffffff81a3f5f0 <mcount>
> > > ffffffff814b6740:	4c 8b a7 88 02 00 00 	mov    0x288(%rdi),%r12
> > > ffffffff814b6747:	45 31 ed             	xor    %r13d,%r13d
> > > ffffffff814b674a:	41 89 f6             	mov    %esi,%r14d
> > > ffffffff814b674d:	49 83 bc 24 70 01 00 	cmpq   $0x0,0x170(%r12)
> > > ffffffff814b6754:	00 00
> > > ffffffff814b6756:	49 8b 9c 24 80 01 00 	mov    0x180(%r12),%rbx
> > > ffffffff814b675d:	00
> > > ffffffff814b675e:	74 2f                	je     ffffffff814b678f <uart_put_char+0x6f>
> > > ffffffff814b6760:	48 89 df             	mov    %rbx,%rdi
> > > ffffffff814b6763:	e8 a8 67 58 00       	callq  ffffffff81a3cf10 <_raw_spin_lock_irqsave>
> > > ffffffff814b6768:	41 8b 8c 24 78 01 00 	mov    0x178(%r12),%ecx
> > > ffffffff814b676f:	00
> > > ffffffff814b6770:	89 ca                	mov    %ecx,%edx
> > > ffffffff814b6772:	f7 d2                	not    %edx
> > > ffffffff814b6774:	41 03 94 24 7c 01 00 	add    0x17c(%r12),%edx
> > > ffffffff814b677b:	00
> > > ffffffff814b677c:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
> > > ffffffff814b6782:	75 23                	jne    ffffffff814b67a7 <uart_put_char+0x87>
> > > ffffffff814b6784:	48 89 c6             	mov    %rax,%rsi
> > > ffffffff814b6787:	48 89 df             	mov    %rbx,%rdi
> > > ffffffff814b678a:	e8 e1 64 58 00       	callq  ffffffff81a3cc70 <_raw_spin_unlock_irqrestore>
> > > ffffffff814b678f:	44 89 e8             	mov    %r13d,%eax
> > > ffffffff814b6792:	48 8b 1c 24          	mov    (%rsp),%rbx
> > > ffffffff814b6796:	4c 8b 64 24 08       	mov    0x8(%rsp),%r12
> > > ffffffff814b679b:	4c 8b 6c 24 10       	mov    0x10(%rsp),%r13
> > > ffffffff814b67a0:	4c 8b 74 24 18       	mov    0x18(%rsp),%r14
> > > ffffffff814b67a5:	c9                   	leaveq
> > > ffffffff814b67a6:	c3                   	retq
> > > ffffffff814b67a7:	49 8b 94 24 70 01 00 	mov    0x170(%r12),%rdx
> > > ffffffff814b67ae:	00
> > > ffffffff814b67af:	48 63 c9             	movslq %ecx,%rcx
> > > ffffffff814b67b2:	41 b5 01             	mov    $0x1,%r13b
> > > ffffffff814b67b5:	44 88 34 0a          	mov    %r14b,(%rdx,%rcx,1)
> > > ffffffff814b67b9:	41 8b 94 24 78 01 00 	mov    0x178(%r12),%edx
> > > ffffffff814b67c0:	00
> > > ffffffff814b67c1:	83 c2 01             	add    $0x1,%edx
> > > ffffffff814b67c4:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
> > > ffffffff814b67ca:	41 89 94 24 78 01 00 	mov    %edx,0x178(%r12)
> > > ffffffff814b67d1:	00
> > > ffffffff814b67d2:	eb b0                	jmp    ffffffff814b6784 <uart_put_char+0x64>
> > > ffffffff814b67d4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
> > > ffffffff814b67db:	00 00 00 00 00
> > > 
> > > for our build, this is crashing at:
> > > 
> > >     circ->buf[circ->head] = c;
> > > 
> > > Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
> > > protected by the "per-port mutex", which based on uart_port_check() is
> > > state->port.mutex. Indeed, the lock acquired in uart_put_char() is
> > > uport->lock, i.e. not the same lock.
> > > 
> > > Anyway, since the lock is not acquired, if uart_shutdown() is called, the
> > > last chunk of that function may release state->xmit.buf before its assigned
> > > to null, and cause the race above.
> > > 
> > > To fix it, let's lock uport->lock when allocating/deallocating
> > > state->xmit.buf in addition to the per-port mutex.
> > > 
> > > v2: switch to locking uport->lock on allocation/deallocation instead of
> > >     locking the per-port mutex in uart_put_char. Note that since
> > >     uport->lock is a spin lock, we have to switch the allocation to
> > >     GFP_ATOMIC.
> > > v3: move the allocation outside the lock, so we can switch back to
> > >     GFP_KERNEL
> > > v4: * switch to positive condition of state->xmit.buf in
> > >       uart_port_startup()
> > >     * declare `flags` on its own line
> > >     * use the result of uart_port_lock() in uart_shutdown() to avoid
> > >       uninitialized warning
> > >     * don't use the uart_port_lock/unlock macros in uart_port_startup,
> > >       instead test against uport directly; the compiler can't seem to "see"
> > >       through the macros/ref/unref calls to not warn about uninitialized
> > >       flags. We don't need to do a ref here since we hold the per-port
> > >       mutex anyway.
> > > 
> > > Signed-off-by: Tycho Andersen <tycho@tycho.ws>
> > 
> > One question: would it be worth doing:
> > 
> > > ---
> > >  drivers/tty/serial/serial_core.c | 19 ++++++++++++++-----
> > >  1 file changed, 14 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
> > > index 9c14a453f73c..406e8382d3de 100644
> > > --- a/drivers/tty/serial/serial_core.c
> > > +++ b/drivers/tty/serial/serial_core.c
> > > @@ -182,6 +182,7 @@ static int uart_port_startup(struct tty_struct *tty, struct uart_state *state,
> > >  {
> > >  	struct uart_port *uport = uart_port_check(state);
> > >  	unsigned long page;
> > > +	unsigned long flags;
> > >  	int retval = 0;
> > >  
> > >  	if (uport->type == PORT_UNKNOWN)
> > > @@ -196,15 +197,20 @@ static int uart_port_startup(struct tty_struct *tty, struct uart_state *state,
> > >  	 * Initialise and allocate the transmit and temporary
> > >  	 * buffer.
> > >  	 */
> > > -	if (!state->xmit.buf) {
> > > -		/* This is protected by the per port mutex */
> > > -		page = get_zeroed_page(GFP_KERNEL);
> > > -		if (!page)
> > > -			return -ENOMEM;
> > > +	page = get_zeroed_page(GFP_KERNEL);
> > 
> > ignoring the alloc failure here.  (leave a comment for worried
> > future reviewers)  Then,
> > 
> > > +	if (!page)
> > > +		return -ENOMEM;
> > >  
> > > +	if (uport)
> > > +		spin_lock_irqsave(&uport->lock, flags);
> > > +	if (state->xmit.buf) {
> > > +		free_page(page);
> > > +	} else {
> > 
> > here if page is NULL (unlock and) return ENOMEM.
> > 
> > Since it looks like this fn is called on every device open, that
> > would seem to minimize getting lots of needless ENOMEMs.
> > 
> > Another alternative is to just wait to do the alloc until we
> > get here, and drop the spinlock here if we need to alloc.  Then
> > retake the lock, re-check state->xmit.buf;  If that is now not
> > null then free_page(page), or if it is still NULL and page alloc
> > failed, then return ENOEMEM). That is uglier code, but is
> > probably the best behavior.
> > 
> > Still the original patch fixes the real bug so I'm fine with it
> > as well.
> 
> Sure, I'm happy to implement whichever of these we think is best.
> Greg?

Let's fix this issue first please, then I'll be glad to review other
changes based on micro optimizations :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4] uart: fix race between uart_put_char() and uart_shutdown()
  2018-07-12 15:04                     ` Greg Kroah-Hartman
@ 2018-07-12 15:08                       ` Tycho Andersen
  2018-07-12 15:40                         ` Greg Kroah-Hartman
  0 siblings, 1 reply; 27+ messages in thread
From: Tycho Andersen @ 2018-07-12 15:08 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Jiri Slaby, linux-serial, linux-kernel, Serge E . Hallyn

On Thu, Jul 12, 2018 at 05:04:38PM +0200, Greg Kroah-Hartman wrote:
> On Wed, Jul 11, 2018 at 10:07:44AM -0600, Tycho Andersen wrote:
> > +	if (uport)
> > +		spin_lock_irqsave(&uport->lock, flags);
> 
> That's the same thing as just calling uart_port_lock(), why aren't you
> doing that?

Because the compiler can't seem to "see" through the macros/ref calls,
and I get the warning I mentioned here if I use them:

https://lkml.org/lkml/2018/7/6/840

Tycho

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4] uart: fix race between uart_put_char() and uart_shutdown()
  2018-07-12 15:08                       ` Tycho Andersen
@ 2018-07-12 15:40                         ` Greg Kroah-Hartman
  2018-07-12 18:18                           ` Tycho Andersen
  0 siblings, 1 reply; 27+ messages in thread
From: Greg Kroah-Hartman @ 2018-07-12 15:40 UTC (permalink / raw)
  To: Tycho Andersen; +Cc: Jiri Slaby, linux-serial, linux-kernel, Serge E . Hallyn

On Thu, Jul 12, 2018 at 09:08:22AM -0600, Tycho Andersen wrote:
> On Thu, Jul 12, 2018 at 05:04:38PM +0200, Greg Kroah-Hartman wrote:
> > On Wed, Jul 11, 2018 at 10:07:44AM -0600, Tycho Andersen wrote:
> > > +	if (uport)
> > > +		spin_lock_irqsave(&uport->lock, flags);
> > 
> > That's the same thing as just calling uart_port_lock(), why aren't you
> > doing that?
> 
> Because the compiler can't seem to "see" through the macros/ref calls,
> and I get the warning I mentioned here if I use them:
> 
> https://lkml.org/lkml/2018/7/6/840

What horrible version of gcc are you using that give you that?  Don't
open-code things just because of a broken compiler.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4] uart: fix race between uart_put_char() and uart_shutdown()
  2018-07-12 15:40                         ` Greg Kroah-Hartman
@ 2018-07-12 18:18                           ` Tycho Andersen
  2018-07-12 18:25                             ` Greg Kroah-Hartman
  0 siblings, 1 reply; 27+ messages in thread
From: Tycho Andersen @ 2018-07-12 18:18 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Jiri Slaby, linux-serial, linux-kernel, Serge E . Hallyn

On Thu, Jul 12, 2018 at 05:40:15PM +0200, Greg Kroah-Hartman wrote:
> On Thu, Jul 12, 2018 at 09:08:22AM -0600, Tycho Andersen wrote:
> > On Thu, Jul 12, 2018 at 05:04:38PM +0200, Greg Kroah-Hartman wrote:
> > > On Wed, Jul 11, 2018 at 10:07:44AM -0600, Tycho Andersen wrote:
> > > > +	if (uport)
> > > > +		spin_lock_irqsave(&uport->lock, flags);
> > > 
> > > That's the same thing as just calling uart_port_lock(), why aren't you
> > > doing that?
> > 
> > Because the compiler can't seem to "see" through the macros/ref calls,
> > and I get the warning I mentioned here if I use them:
> > 
> > https://lkml.org/lkml/2018/7/6/840
> 
> What horrible version of gcc are you using that give you that?  Don't
> open-code things just because of a broken compiler.

I've tried with both 7.3.0 and 5.4.0. I think the reason we see this
here but not elsewhere in the file is because there's an actual
function call (free_page()) in the critical section.

If we move that out, something like the below patch, it all works for
me.

Tycho


From 532e82ceec67142b71c60ce74ce08c5339195a94 Mon Sep 17 00:00:00 2001
From: Tycho Andersen <tycho@tycho.ws>
Date: Mon, 4 Jun 2018 09:55:09 -0600
Subject: [PATCH] uart: fix race between uart_put_char() and uart_shutdown()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

We have reports of the following crash:

    PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0"
    #0 [ffff88085c6db710] machine_kexec at ffffffff81046239
    #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248
    #2 [ffff88085c6db830] oops_end at ffffffff81008ae7
    #3 [ffff88085c6db860] no_context at ffffffff81050b8f
    #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75
    #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83
    #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e
    #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c
    #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122
    [exception RIP: uart_put_char+149]
    RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006
    RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081
    RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120
    RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320
    R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000
    R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
    #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544
    #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c
    #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b
    #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2
    #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b
    #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a
    #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016
    #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194
    #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a
    #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2
    #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d
    #20 [ffff88085c6dbeb0] kthread at ffffffff81096384
    #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f​

after slogging through some dissasembly:

ffffffff814b6720 <uart_put_char>:
ffffffff814b6720:	55                   	push   %rbp
ffffffff814b6721:	48 89 e5             	mov    %rsp,%rbp
ffffffff814b6724:	48 83 ec 20          	sub    $0x20,%rsp
ffffffff814b6728:	48 89 1c 24          	mov    %rbx,(%rsp)
ffffffff814b672c:	4c 89 64 24 08       	mov    %r12,0x8(%rsp)
ffffffff814b6731:	4c 89 6c 24 10       	mov    %r13,0x10(%rsp)
ffffffff814b6736:	4c 89 74 24 18       	mov    %r14,0x18(%rsp)
ffffffff814b673b:	e8 b0 8e 58 00       	callq  ffffffff81a3f5f0 <mcount>
ffffffff814b6740:	4c 8b a7 88 02 00 00 	mov    0x288(%rdi),%r12
ffffffff814b6747:	45 31 ed             	xor    %r13d,%r13d
ffffffff814b674a:	41 89 f6             	mov    %esi,%r14d
ffffffff814b674d:	49 83 bc 24 70 01 00 	cmpq   $0x0,0x170(%r12)
ffffffff814b6754:	00 00
ffffffff814b6756:	49 8b 9c 24 80 01 00 	mov    0x180(%r12),%rbx
ffffffff814b675d:	00
ffffffff814b675e:	74 2f                	je     ffffffff814b678f <uart_put_char+0x6f>
ffffffff814b6760:	48 89 df             	mov    %rbx,%rdi
ffffffff814b6763:	e8 a8 67 58 00       	callq  ffffffff81a3cf10 <_raw_spin_lock_irqsave>
ffffffff814b6768:	41 8b 8c 24 78 01 00 	mov    0x178(%r12),%ecx
ffffffff814b676f:	00
ffffffff814b6770:	89 ca                	mov    %ecx,%edx
ffffffff814b6772:	f7 d2                	not    %edx
ffffffff814b6774:	41 03 94 24 7c 01 00 	add    0x17c(%r12),%edx
ffffffff814b677b:	00
ffffffff814b677c:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b6782:	75 23                	jne    ffffffff814b67a7 <uart_put_char+0x87>
ffffffff814b6784:	48 89 c6             	mov    %rax,%rsi
ffffffff814b6787:	48 89 df             	mov    %rbx,%rdi
ffffffff814b678a:	e8 e1 64 58 00       	callq  ffffffff81a3cc70 <_raw_spin_unlock_irqrestore>
ffffffff814b678f:	44 89 e8             	mov    %r13d,%eax
ffffffff814b6792:	48 8b 1c 24          	mov    (%rsp),%rbx
ffffffff814b6796:	4c 8b 64 24 08       	mov    0x8(%rsp),%r12
ffffffff814b679b:	4c 8b 6c 24 10       	mov    0x10(%rsp),%r13
ffffffff814b67a0:	4c 8b 74 24 18       	mov    0x18(%rsp),%r14
ffffffff814b67a5:	c9                   	leaveq
ffffffff814b67a6:	c3                   	retq
ffffffff814b67a7:	49 8b 94 24 70 01 00 	mov    0x170(%r12),%rdx
ffffffff814b67ae:	00
ffffffff814b67af:	48 63 c9             	movslq %ecx,%rcx
ffffffff814b67b2:	41 b5 01             	mov    $0x1,%r13b
ffffffff814b67b5:	44 88 34 0a          	mov    %r14b,(%rdx,%rcx,1)
ffffffff814b67b9:	41 8b 94 24 78 01 00 	mov    0x178(%r12),%edx
ffffffff814b67c0:	00
ffffffff814b67c1:	83 c2 01             	add    $0x1,%edx
ffffffff814b67c4:	81 e2 ff 0f 00 00    	and    $0xfff,%edx
ffffffff814b67ca:	41 89 94 24 78 01 00 	mov    %edx,0x178(%r12)
ffffffff814b67d1:	00
ffffffff814b67d2:	eb b0                	jmp    ffffffff814b6784 <uart_put_char+0x64>
ffffffff814b67d4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
ffffffff814b67db:	00 00 00 00 00

for our build, this is crashing at:

    circ->buf[circ->head] = c;

Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf)
protected by the "per-port mutex", which based on uart_port_check() is
state->port.mutex. Indeed, the lock acquired in uart_put_char() is
uport->lock, i.e. not the same lock.

Anyway, since the lock is not acquired, if uart_shutdown() is called, the
last chunk of that function may release state->xmit.buf before its assigned
to null, and cause the race above.

To fix it, let's lock uport->lock when allocating/deallocating
state->xmit.buf in addition to the per-port mutex.

v2: switch to locking uport->lock on allocation/deallocation instead of
    locking the per-port mutex in uart_put_char. Note that since
    uport->lock is a spin lock, we have to switch the allocation to
    GFP_ATOMIC.
v3: move the allocation outside the lock, so we can switch back to
    GFP_KERNEL
v4: * switch to positive condition of state->xmit.buf in
      uart_port_startup()
    * declare `flags` on its own line
    * use the result of uart_port_lock() in uart_shutdown() to avoid
      uninitialized warning
    * don't use the uart_port_lock/unlock macros in uart_port_startup,
      instead test against uport directly; the compiler can't seem to "see"
      through the macros/ref/unref calls to not warn about uninitialized
      flags. We don't need to do a ref here since we hold the per-port
      mutex anyway.
v5: use uart_port_lock/unlock, but free the page outside the critical
    section if it is unused

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
---
 drivers/tty/serial/serial_core.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index 9c14a453f73c..91b292bbda91 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -182,6 +182,8 @@ static int uart_port_startup(struct tty_struct *tty, struct uart_state *state,
 {
 	struct uart_port *uport = uart_port_check(state);
 	unsigned long page;
+	unsigned long flags;
+	bool page_used = false;
 	int retval = 0;
 
 	if (uport->type == PORT_UNKNOWN)
@@ -196,15 +198,20 @@ static int uart_port_startup(struct tty_struct *tty, struct uart_state *state,
 	 * Initialise and allocate the transmit and temporary
 	 * buffer.
 	 */
-	if (!state->xmit.buf) {
-		/* This is protected by the per port mutex */
-		page = get_zeroed_page(GFP_KERNEL);
-		if (!page)
-			return -ENOMEM;
+	page = get_zeroed_page(GFP_KERNEL);
+	if (!page)
+		return -ENOMEM;
 
+	uport = uart_port_lock(state, flags);
+	if (!state->xmit.buf) {
 		state->xmit.buf = (unsigned char *) page;
 		uart_circ_clear(&state->xmit);
+		page_used = true;
 	}
+	uart_port_unlock(uport, flags);
+
+	if (!page_used)
+		free_page(page);
 
 	retval = uport->ops->startup(uport);
 	if (retval == 0) {
@@ -263,6 +270,7 @@ static void uart_shutdown(struct tty_struct *tty, struct uart_state *state)
 {
 	struct uart_port *uport = uart_port_check(state);
 	struct tty_port *port = &state->port;
+	unsigned long flags;
 
 	/*
 	 * Set the TTY IO error marker
@@ -295,10 +303,12 @@ static void uart_shutdown(struct tty_struct *tty, struct uart_state *state)
 	/*
 	 * Free the transmit buffer page.
 	 */
+	uport = uart_port_lock(state, flags);
 	if (state->xmit.buf) {
 		free_page((unsigned long)state->xmit.buf);
 		state->xmit.buf = NULL;
 	}
+	uart_port_unlock(uport, flags);
 }
 
 /**
-- 
2.17.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4] uart: fix race between uart_put_char() and uart_shutdown()
  2018-07-12 18:18                           ` Tycho Andersen
@ 2018-07-12 18:25                             ` Greg Kroah-Hartman
  2018-07-12 18:30                               ` Tycho Andersen
  0 siblings, 1 reply; 27+ messages in thread
From: Greg Kroah-Hartman @ 2018-07-12 18:25 UTC (permalink / raw)
  To: Tycho Andersen; +Cc: Jiri Slaby, linux-serial, linux-kernel, Serge E . Hallyn

On Thu, Jul 12, 2018 at 12:18:46PM -0600, Tycho Andersen wrote:
> On Thu, Jul 12, 2018 at 05:40:15PM +0200, Greg Kroah-Hartman wrote:
> > On Thu, Jul 12, 2018 at 09:08:22AM -0600, Tycho Andersen wrote:
> > > On Thu, Jul 12, 2018 at 05:04:38PM +0200, Greg Kroah-Hartman wrote:
> > > > On Wed, Jul 11, 2018 at 10:07:44AM -0600, Tycho Andersen wrote:
> > > > > +	if (uport)
> > > > > +		spin_lock_irqsave(&uport->lock, flags);
> > > > 
> > > > That's the same thing as just calling uart_port_lock(), why aren't you
> > > > doing that?
> > > 
> > > Because the compiler can't seem to "see" through the macros/ref calls,
> > > and I get the warning I mentioned here if I use them:
> > > 
> > > https://lkml.org/lkml/2018/7/6/840
> > 
> > What horrible version of gcc are you using that give you that?  Don't
> > open-code things just because of a broken compiler.
> 
> I've tried with both 7.3.0 and 5.4.0. I think the reason we see this
> here but not elsewhere in the file is because there's an actual
> function call (free_page()) in the critical section.
> 
> If we move that out, something like the below patch, it all works for
> me.

Ick.  Which version of this series had the problem?  Let me test it out
here...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4] uart: fix race between uart_put_char() and uart_shutdown()
  2018-07-12 18:25                             ` Greg Kroah-Hartman
@ 2018-07-12 18:30                               ` Tycho Andersen
  2018-07-13  9:28                                 ` Greg Kroah-Hartman
  0 siblings, 1 reply; 27+ messages in thread
From: Tycho Andersen @ 2018-07-12 18:30 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Jiri Slaby, linux-serial, linux-kernel, Serge E . Hallyn

On Thu, Jul 12, 2018 at 08:25:45PM +0200, Greg Kroah-Hartman wrote:
> On Thu, Jul 12, 2018 at 12:18:46PM -0600, Tycho Andersen wrote:
> > On Thu, Jul 12, 2018 at 05:40:15PM +0200, Greg Kroah-Hartman wrote:
> > > On Thu, Jul 12, 2018 at 09:08:22AM -0600, Tycho Andersen wrote:
> > > > On Thu, Jul 12, 2018 at 05:04:38PM +0200, Greg Kroah-Hartman wrote:
> > > > > On Wed, Jul 11, 2018 at 10:07:44AM -0600, Tycho Andersen wrote:
> > > > > > +	if (uport)
> > > > > > +		spin_lock_irqsave(&uport->lock, flags);
> > > > > 
> > > > > That's the same thing as just calling uart_port_lock(), why aren't you
> > > > > doing that?
> > > > 
> > > > Because the compiler can't seem to "see" through the macros/ref calls,
> > > > and I get the warning I mentioned here if I use them:
> > > > 
> > > > https://lkml.org/lkml/2018/7/6/840
> > > 
> > > What horrible version of gcc are you using that give you that?  Don't
> > > open-code things just because of a broken compiler.
> > 
> > I've tried with both 7.3.0 and 5.4.0. I think the reason we see this
> > here but not elsewhere in the file is because there's an actual
> > function call (free_page()) in the critical section.
> > 
> > If we move that out, something like the below patch, it all works for
> > me.
> 
> Ick.  Which version of this series had the problem?  Let me test it out
> here...

v3, if you remove the initialization of flags from both functions you
should see it.

Tycho

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4] uart: fix race between uart_put_char() and uart_shutdown()
  2018-07-12 18:30                               ` Tycho Andersen
@ 2018-07-13  9:28                                 ` Greg Kroah-Hartman
  2018-07-13 14:01                                   ` Tycho Andersen
  0 siblings, 1 reply; 27+ messages in thread
From: Greg Kroah-Hartman @ 2018-07-13  9:28 UTC (permalink / raw)
  To: Tycho Andersen; +Cc: Jiri Slaby, linux-serial, linux-kernel, Serge E . Hallyn

On Thu, Jul 12, 2018 at 12:30:01PM -0600, Tycho Andersen wrote:
> On Thu, Jul 12, 2018 at 08:25:45PM +0200, Greg Kroah-Hartman wrote:
> > On Thu, Jul 12, 2018 at 12:18:46PM -0600, Tycho Andersen wrote:
> > > On Thu, Jul 12, 2018 at 05:40:15PM +0200, Greg Kroah-Hartman wrote:
> > > > On Thu, Jul 12, 2018 at 09:08:22AM -0600, Tycho Andersen wrote:
> > > > > On Thu, Jul 12, 2018 at 05:04:38PM +0200, Greg Kroah-Hartman wrote:
> > > > > > On Wed, Jul 11, 2018 at 10:07:44AM -0600, Tycho Andersen wrote:
> > > > > > > +	if (uport)
> > > > > > > +		spin_lock_irqsave(&uport->lock, flags);
> > > > > > 
> > > > > > That's the same thing as just calling uart_port_lock(), why aren't you
> > > > > > doing that?
> > > > > 
> > > > > Because the compiler can't seem to "see" through the macros/ref calls,
> > > > > and I get the warning I mentioned here if I use them:
> > > > > 
> > > > > https://lkml.org/lkml/2018/7/6/840
> > > > 
> > > > What horrible version of gcc are you using that give you that?  Don't
> > > > open-code things just because of a broken compiler.
> > > 
> > > I've tried with both 7.3.0 and 5.4.0. I think the reason we see this
> > > here but not elsewhere in the file is because there's an actual
> > > function call (free_page()) in the critical section.
> > > 
> > > If we move that out, something like the below patch, it all works for
> > > me.
> > 
> > Ick.  Which version of this series had the problem?  Let me test it out
> > here...
> 
> v3, if you remove the initialization of flags from both functions you
> should see it.

Ok, I tried v3 out and yes, you are right, removing the "= 0" causes gcc
to complain.  The compiler is being dumb here, so I'll just leave it
as-is.  I've queued up the v3 version now, thanks for sticking with
this.

greg k-h

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4] uart: fix race between uart_put_char() and uart_shutdown()
  2018-07-13  9:28                                 ` Greg Kroah-Hartman
@ 2018-07-13 14:01                                   ` Tycho Andersen
  0 siblings, 0 replies; 27+ messages in thread
From: Tycho Andersen @ 2018-07-13 14:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Jiri Slaby, linux-serial, linux-kernel, Serge E . Hallyn

On Fri, Jul 13, 2018 at 11:28:28AM +0200, Greg Kroah-Hartman wrote:
> On Thu, Jul 12, 2018 at 12:30:01PM -0600, Tycho Andersen wrote:
> > On Thu, Jul 12, 2018 at 08:25:45PM +0200, Greg Kroah-Hartman wrote:
> > > On Thu, Jul 12, 2018 at 12:18:46PM -0600, Tycho Andersen wrote:
> > > > On Thu, Jul 12, 2018 at 05:40:15PM +0200, Greg Kroah-Hartman wrote:
> > > > > On Thu, Jul 12, 2018 at 09:08:22AM -0600, Tycho Andersen wrote:
> > > > > > On Thu, Jul 12, 2018 at 05:04:38PM +0200, Greg Kroah-Hartman wrote:
> > > > > > > On Wed, Jul 11, 2018 at 10:07:44AM -0600, Tycho Andersen wrote:
> > > > > > > > +	if (uport)
> > > > > > > > +		spin_lock_irqsave(&uport->lock, flags);
> > > > > > > 
> > > > > > > That's the same thing as just calling uart_port_lock(), why aren't you
> > > > > > > doing that?
> > > > > > 
> > > > > > Because the compiler can't seem to "see" through the macros/ref calls,
> > > > > > and I get the warning I mentioned here if I use them:
> > > > > > 
> > > > > > https://lkml.org/lkml/2018/7/6/840
> > > > > 
> > > > > What horrible version of gcc are you using that give you that?  Don't
> > > > > open-code things just because of a broken compiler.
> > > > 
> > > > I've tried with both 7.3.0 and 5.4.0. I think the reason we see this
> > > > here but not elsewhere in the file is because there's an actual
> > > > function call (free_page()) in the critical section.
> > > > 
> > > > If we move that out, something like the below patch, it all works for
> > > > me.
> > > 
> > > Ick.  Which version of this series had the problem?  Let me test it out
> > > here...
> > 
> > v3, if you remove the initialization of flags from both functions you
> > should see it.
> 
> Ok, I tried v3 out and yes, you are right, removing the "= 0" causes gcc
> to complain.  The compiler is being dumb here, so I'll just leave it
> as-is.  I've queued up the v3 version now, thanks for sticking with
> this.

Great, thanks!

Tycho

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, back to index

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-05  0:01 [PATCH] uart: fix race between uart_put_char() and uart_shutdown() Tycho Andersen
2018-06-05  3:59 ` Serge E. Hallyn
2018-06-06 21:42   ` Tycho Andersen
2018-06-28 12:05 ` Greg Kroah-Hartman
2018-06-29 10:24   ` [PATCH v2] " Tycho Andersen
2018-06-29 16:43     ` Tycho Andersen
2018-07-06 14:39       ` Greg Kroah-Hartman
2018-07-06 16:24         ` [PATCH v3] " Tycho Andersen
2018-07-06 16:49           ` Andy Shevchenko
2018-07-06 18:39             ` Tycho Andersen
2018-07-06 20:48               ` Andy Shevchenko
2018-07-06 21:22                 ` Tycho Andersen
2018-07-11 16:07                   ` [PATCH v4] " Tycho Andersen
2018-07-11 19:24                     ` Serge E. Hallyn
2018-07-11 19:49                     ` Serge E. Hallyn
2018-07-11 20:00                       ` Tycho Andersen
2018-07-12 15:05                         ` Greg Kroah-Hartman
2018-07-12  9:03                     ` Andy Shevchenko
2018-07-12 14:13                       ` Tycho Andersen
2018-07-12 15:04                     ` Greg Kroah-Hartman
2018-07-12 15:08                       ` Tycho Andersen
2018-07-12 15:40                         ` Greg Kroah-Hartman
2018-07-12 18:18                           ` Tycho Andersen
2018-07-12 18:25                             ` Greg Kroah-Hartman
2018-07-12 18:30                               ` Tycho Andersen
2018-07-13  9:28                                 ` Greg Kroah-Hartman
2018-07-13 14:01                                   ` Tycho Andersen

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org linux-kernel@archiver.kernel.org
	public-inbox-index lkml


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox