netdev.vger.kernel.org archive mirror
* NULL pointer dereferences with 4.14.27
From: Carlos Carvalho @ 2018-03-17 18:41 UTC
  To: linux-kernel, netdev

I installed 4.14.27 on this machine this morning, and after about 2h it started
showing NULL pointer dereferences identical to the one below. There were several
of them, about half an hour apart. Strangely, the machine kept working and I saw
no other anomalies. I've just reverted to 4.14.26.

It only happened on this machine, which carries several Gb/s of network traffic
and thousands of simultaneous connections.

Mar 17 13:29:21 sagres kernel: : BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
Mar 17 13:29:21 sagres kernel: : IP: tcp_push+0x4e/0xe7
Mar 17 13:29:21 sagres kernel: : PGD 0 P4D 0 
Mar 17 13:29:21 sagres kernel: : Oops: 0002 [#1] SMP PTI
Mar 17 13:29:21 sagres kernel: : CPU: 55 PID: 2658 Comm: apache2 Not tainted 4.14.27 #4
Mar 17 13:29:21 sagres kernel: : task: ffff89791cf7e600 task.stack: ffffabdd91db8000
Mar 17 13:29:21 sagres kernel: : RIP: 0010:tcp_push+0x4e/0xe7
Mar 17 13:29:21 sagres kernel: : RSP: 0018:ffffabdd91dbbc10 EFLAGS: 00010246
Mar 17 13:29:21 sagres kernel: : RAX: 0000000000000000 RBX: 00000000000004c4 RCX: 0000000000000001
Mar 17 13:29:21 sagres kernel: : RDX: 0000000000000001 RSI: 0000000000000040 RDI: ffff89968330a100
Mar 17 13:29:21 sagres kernel: : RBP: ffff89968330a250 R08: 0000000000007be8 R09: ffffe77cbfc4ab00
Mar 17 13:29:21 sagres kernel: : R10: ffff89968330a250 R11: 0000000000000000 R12: ffff8987aab3bb80
Mar 17 13:29:21 sagres kernel: : R13: ffff89968330a100 R14: ffff89791cf7e930 R15: 00000000ffffffe0
Mar 17 13:29:21 sagres kernel: : FS:  00007f0bd67d4700(0000) GS:ffff89993f4c0000(0000) knlGS:0000000000000000
Mar 17 13:29:21 sagres kernel: : CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 17 13:29:21 sagres kernel: : CR2: 0000000000000038 CR3: 0000003ff4842006 CR4: 00000000003606e0
Mar 17 13:29:21 sagres kernel: : DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 17 13:29:21 sagres kernel: : DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Mar 17 13:29:21 sagres kernel: : Call Trace:
Mar 17 13:29:21 sagres kernel: : tcp_sendmsg_locked+0xac6/0xc1e
Mar 17 13:29:21 sagres kernel: : tcp_sendmsg+0x23/0x35
Mar 17 13:29:21 sagres kernel: : sock_sendmsg+0x11/0x1b
Mar 17 13:29:21 sagres kernel: : sock_write_iter+0x71/0x87
Mar 17 13:29:21 sagres kernel: : do_iter_readv_writev+0xf0/0x111
Mar 17 13:29:21 sagres kernel: : do_iter_write+0x84/0xf0
Mar 17 13:29:21 sagres kernel: : vfs_writev+0xad/0xfb
Mar 17 13:29:21 sagres kernel: : ? do_writev+0x56/0x92
Mar 17 13:29:21 sagres kernel: : do_writev+0x56/0x92
Mar 17 13:29:21 sagres kernel: : do_syscall_64+0x181/0x210
Mar 17 13:29:21 sagres kernel: : entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Mar 17 13:29:21 sagres kernel: : RIP: 0033:0x7f13f1264017
Mar 17 13:29:21 sagres kernel: : RSP: 002b:00007f0bd67d2810 EFLAGS: 00000293 ORIG_RAX: 0000000000000014
Mar 17 13:29:21 sagres kernel: : RAX: ffffffffffffffda RBX: 0000000000000079 RCX: 00007f13f1264017
Mar 17 13:29:21 sagres kernel: : RDX: 000000000000000a RSI: 00007f0bd67d2970 RDI: 0000000000000079
Mar 17 13:29:21 sagres kernel: : RBP: 00007f0bd67d2970 R08: 0000000000000000 R09: 00007f13680762c8
Mar 17 13:29:21 sagres kernel: : R10: 0000556029c85dd4 R11: 0000000000000293 R12: 000000000000000a
Mar 17 13:29:21 sagres kernel: : R13: 00007f0bd67d2970 R14: 00007f0bd67d28d0 R15: 0000556029ea1440
Mar 17 13:29:21 sagres kernel: : Code: d0 75 02 31 c0 41 89 f3 41 81 e3 00 80 00 00 74 1a 44 8b 8f 58 05 00 00 41 d1 e9 44 2b 8f 5c 06 00 00 44 03 8f 64 06 00 00 79 10 <80> 48 38 08 8b 8f 5c 06 00 00 89 8f 64 06 00 00 40 80 e6 01 74 
Mar 17 13:29:21 sagres kernel: : RIP: tcp_push+0x4e/0xe7 RSP: ffffabdd91dbbc10
Mar 17 13:29:21 sagres kernel: : CR2: 0000000000000038
Mar 17 13:29:21 sagres kernel: : ---[ end trace f9a8f71d250d2782 ]---
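
(Editorial note, not part of the original report.) The oops above already
contains enough to see what the faulting instruction tried to do. The sketch
below is one way to read it, under stated assumptions: the helper names
(hex_field, analyze) are hypothetical, the input is a hand-copied subset of the
log lines with the syslog prefix dropped, and the decoder handles only the single
x86-64 instruction form that appears here (opcode 0x80 with an 8-bit
displacement off RAX). It is not a general disassembler; the kernel tree's
scripts/decodecode is the usual tool for that.

```python
import re

# Lines copied from the oops above (syslog prefix dropped).
OOPS = """\
BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
IP: tcp_push+0x4e/0xe7
Oops: 0002 [#1] SMP PTI
RAX: 0000000000000000 RBX: 00000000000004c4 RCX: 0000000000000001
Code: d0 75 02 31 c0 41 89 f3 41 81 e3 00 80 00 00 74 1a 44 8b 8f 58 05 00 00 41 d1 e9 44 2b 8f 5c 06 00 00 44 03 8f 64 06 00 00 79 10 <80> 48 38 08 8b 8f 5c 06 00 00 89 8f 64 06 00 00 40 80 e6 01 74
"""

# Opcode 0x80 is the x86 "group 1" byte form; the ModRM reg field picks the op.
GROUP1_OPS = ["add", "or", "adc", "sbb", "and", "sub", "xor", "cmp"]


def hex_field(pattern, text):
    """Return the first capture group of `pattern`, parsed as hexadecimal."""
    return int(re.search(pattern, text).group(1), 16)


def analyze(text):
    """Extract the fault details and decode the faulting instruction."""
    fault_addr = hex_field(r"dereference at ([0-9a-f]+)", text)
    err = hex_field(r"Oops: ([0-9a-f]+)", text)
    rax = hex_field(r"RAX: ([0-9a-f]+)", text)

    # The byte wrapped in <..> is the first byte of the faulting instruction.
    code = re.search(r"Code: (.*)", text).group(1)
    insn = [int(b.strip("<>"), 16) for b in code[code.index("<"):].split()]

    opcode, modrm, disp8, imm8 = insn[:4]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # Only this one form is handled: 0x80, mod=01 (disp8 follows), rm=000 (RAX).
    if not (opcode == 0x80 and mod == 1 and rm == 0):
        raise ValueError("instruction form not covered by this sketch")
    if disp8 >= 0x80:  # disp8 is a signed 8-bit displacement
        disp8 -= 0x100

    return {
        "fault_addr": fault_addr,
        # Low bits of the x86 page-fault error code.
        "page_present": bool(err & 0x1),
        "write": bool(err & 0x2),
        "user_mode": bool(err & 0x4),
        "op": GROUP1_OPS[reg],
        "imm8": imm8,
        "effective_addr": rax + disp8,
    }


if __name__ == "__main__":
    for key, value in analyze(OOPS).items():
        print(key, value)
```

Read this way, the instruction is a byte-wide OR of 0x08 into the address
RAX + 0x38. With RAX equal to zero, that address is 0x38, which matches CR2, and
the error code 0002 marks a kernel-mode write to a page that is not present.
Which structure field lives at offset 0x38 cannot be determined from the oops
alone.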


* Re: NULL pointer dereferences with 4.14.27
From: Holger Hoffstätte @ 2018-03-17 19:12 UTC
  To: Carlos Carvalho, linux-kernel, netdev

On 03/17/18 19:41, Carlos Carvalho wrote:
> I installed 4.14.27 on this machine this morning, and after about 2h it started
> showing NULL pointer dereferences identical to the one below. There were several
> of them, about half an hour apart. Strangely, the machine kept working and I saw
> no other anomalies. I've just reverted to 4.14.26.
> 
> It only happened on this machine, which carries several Gb/s of network traffic
> and thousands of simultaneous connections.
> 
(snip)
> 

Fixed by: https://www.spinics.net/lists/netdev/msg489445.html

-h


* Re: NULL pointer dereferences with 4.14.27
From: Holger Hoffstätte @ 2018-03-19 15:57 UTC
  To: Carlos Carvalho, Soheil Hassas Yeganeh, David S. Miller, Greg KH,
	netdev, stable


(CC: davem, soheil & gregkh)

On 03/17/18 20:12, Holger Hoffstätte wrote:
> On 03/17/18 19:41, Carlos Carvalho wrote:
>> I installed 4.14.27 on this machine this morning, and after about 2h it started
>> showing NULL pointer dereferences identical to the one below. There were several
>> of them, about half an hour apart. Strangely, the machine kept working and I saw
>> no other anomalies. I've just reverted to 4.14.26.
>>
>> It only happened on this machine, which carries several Gb/s of network traffic
>> and thousands of simultaneous connections.
>>
>> Mar 17 13:29:21 sagres kernel: : BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
>> Mar 17 13:29:21 sagres kernel: : IP: tcp_push+0x4e/0xe7
>> Mar 17 13:29:21 sagres kernel: : PGD 0 P4D 0 
>> Mar 17 13:29:21 sagres kernel: : Oops: 0002 [#1] SMP PTI
>> Mar 17 13:29:21 sagres kernel: : CPU: 55 PID: 2658 Comm: apache2 Not tainted 4.14.27 #4
(snip)
> 
> Fixed by: https://www.spinics.net/lists/netdev/msg489445.html
> 
> -h
> 

This patch is in the netdev patchwork at https://patchwork.ozlabs.org/patch/886324/
but has been marked as "not applicable" without further queued/rejected comment
from Dave, so I believe it became a victim of email lossage.
As the patch says, it doesn't apply to anything older than 4.14, but several
people have tested it and reported that it fixes the problem, and it indeed
works fine. Since GregKH only accepts net patches from Dave, I wanted to make
sure it gets queued up for 4.14.

Thanks,
Holger


* Re: NULL pointer dereferences with 4.14.27
From: David Miller @ 2018-03-19 16:33 UTC
  To: holger; +Cc: carlos, soheil, gregkh, netdev, stable

From: Holger Hoffstätte <holger@applied-asynchrony.com>
Date: Mon, 19 Mar 2018 16:57:48 +0100

> This patch is in the netdev patchwork at https://patchwork.ozlabs.org/patch/886324/
> but has been marked as "not applicable" without further queued/rejected comment
> from Dave, so I believe it became a victim of email lossage.

It is not a victim of email lossage.

When someone posts a backport for -stable that is not intended to be
applied upstream (because it's already there), I add the patch to the
stable bundle and mark it as "Not applicable", because it is not
applicable to upstream.
