linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* NULL ptr deref in wq_worker_sleeping on 4.19
@ 2019-07-19 13:53 Sasha Levin
  2019-08-07 11:46 ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 6+ messages in thread
From: Sasha Levin @ 2019-07-19 13:53 UTC (permalink / raw)
  To: tglx, bigeasy, peterz, mingo, tj, jiangshanlai; +Cc: linux-kernel, stable

Hi folks,

We're seeing a rare panic on boot in wq_worker_sleeping() on boot in
4.19 kernels. I wasn't able to reproduce this with 5.2, but I'm not sure
whether it's because the issue is fixed, or I was just unlucky.

The panic looks like this:

[    0.852791] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[    0.853260] PGD 0 P4D 0 
[    0.853260] Oops: 0000 [#1] SMP PTI
[    0.853260] CPU: 7 PID: 49 Comm:  Not tainted 4.19.52-9858d02fd940 #1
[    0.853260] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007  06/02/2017
[    0.853260] RIP: 0010:kthread_data+0x12/0x30
[    0.853260] Code: 83 7f 58 00 74 02 0f 0b e9 bb 2d 19 00 0f 0b eb e2 0f 1f 80 00 00 00 00 0f 1f 44 00 00 f6 47 26 20 74 0c 48 8b 87 98 05 00 00 <48> 8b 40 10 c3 0f 0b 48 8b 87 98 05 00 00 48 8b 40 10 c3 90 66 2e
[    0.853260] RSP: 0000:ffffc900036abe38 EFLAGS: 00010002
[    0.853260] RAX: 0000000000000000 RBX: ffff8887bfbe17c0 RCX: 0000000000000000
[    0.853260] RDX: 0000000000000001 RSI: 000000000000000a RDI: ffff8887bbb4bb00
[    0.853260] RBP: ffffc900036abea0 R08: 0000000000000000 R09: 0000000000000000
[    0.853260] R10: ffffc9000368bd90 R11: 0000000000000000 R12: ffff8887bbb4bb00
[    0.853260] R13: 0000000000000000 R14: ffffc900036abe60 R15: 0000000000000000
[    0.853260] FS:  0000000000000000(0000) GS:ffff8887bfbc0000(0000) knlGS:0000000000000000
[    0.853260] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.853260] CR2: 0000000000000068 CR3: 00000007df40a000 CR4: 00000000001406e0
[    0.853260] Call Trace:
[    0.853260]  wq_worker_sleeping+0xa/0x60
[    0.853260]  __schedule+0x571/0x8c0
[    0.853260]  schedule+0x32/0x80
[    0.853260]  worker_thread+0xc7/0x440
[    0.853260]  kthread+0xf8/0x130
[    0.853260]  ret_from_fork+0x35/0x40
[    0.853260] Modules linked in:
[    0.853260] CR2: 0000000000000010
[    0.853260] ---[ end trace 160fda44361ab977 ]---

I see that this area was recently touched by 6d25be5782e4 ("sched/core,
workqueues: Distangle worker accounting from rq lock") but I'm not sure
if it's related.

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: NULL ptr deref in wq_worker_sleeping on 4.19
  2019-07-19 13:53 NULL ptr deref in wq_worker_sleeping on 4.19 Sasha Levin
@ 2019-08-07 11:46 ` Sebastian Andrzej Siewior
  2019-08-07 12:24   ` Austin Kim
  0 siblings, 1 reply; 6+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-08-07 11:46 UTC (permalink / raw)
  To: Sasha Levin; +Cc: tglx, peterz, mingo, tj, jiangshanlai, linux-kernel, stable

On 2019-07-19 09:53:52 [-0400], Sasha Levin wrote:
> Hi folks,
Hi,

> We're seeing a rare panic on boot in wq_worker_sleeping() on boot in
> 4.19 kernels. I wasn't able to reproduce this with 5.2, but I'm not sure
> whether it's because the issue is fixed, or I was just unlucky.
> 
> The panic looks like this:
> 
> [    0.852791] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
> [    0.853260] PGD 0 P4D 0 [    0.853260] Oops: 0000 [#1] SMP PTI
> [    0.853260] CPU: 7 PID: 49 Comm:  Not tainted 4.19.52-9858d02fd940 #1
> [    0.853260] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007  06/02/2017
> [    0.853260] RIP: 0010:kthread_data+0x12/0x30
> [    0.853260] Code: 83 7f 58 00 74 02 0f 0b e9 bb 2d 19 00 0f 0b eb e2 0f 1f 80 00 00 00 00 0f 1f 44 00 00 f6 47 26 20 74 0c 48 8b 87 98 05 00 00 <48> 8b 40 10 c3 0f 0b 48 8b 87 98 05 00 00 48 8b 40 10 c3 90 66 2e
> [    0.853260] RSP: 0000:ffffc900036abe38 EFLAGS: 00010002
> [    0.853260] RAX: 0000000000000000 RBX: ffff8887bfbe17c0 RCX: 0000000000000000
> [    0.853260] RDX: 0000000000000001 RSI: 000000000000000a RDI: ffff8887bbb4bb00
> [    0.853260] RBP: ffffc900036abea0 R08: 0000000000000000 R09: 0000000000000000
> [    0.853260] R10: ffffc9000368bd90 R11: 0000000000000000 R12: ffff8887bbb4bb00
> [    0.853260] R13: 0000000000000000 R14: ffffc900036abe60 R15: 0000000000000000
> [    0.853260] FS:  0000000000000000(0000) GS:ffff8887bfbc0000(0000) knlGS:0000000000000000
> [    0.853260] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    0.853260] CR2: 0000000000000068 CR3: 00000007df40a000 CR4: 00000000001406e0
> [    0.853260] Call Trace:
> [    0.853260]  wq_worker_sleeping+0xa/0x60
> [    0.853260]  __schedule+0x571/0x8c0
> [    0.853260]  schedule+0x32/0x80
> [    0.853260]  worker_thread+0xc7/0x440
> [    0.853260]  kthread+0xf8/0x130
> [    0.853260]  ret_from_fork+0x35/0x40
> [    0.853260] Modules linked in:
> [    0.853260] CR2: 0000000000000010
> [    0.853260] ---[ end trace 160fda44361ab977 ]---
> 
> I see that this area was recently touched by 6d25be5782e4 ("sched/core,
> workqueues: Distangle worker accounting from rq lock") but I'm not sure
> if it's related.

The change should just move code outside of the scheduler and not lead
to any changed behaviour (except the small detail mentioned in the
changelog, nothing explaining what you have here).

The way the call chain looks is, kthread() allocated the struct kthread
(self) and saved it in current->set_child_tid and this pointer is not
NULL. Everything works out and `threadfn' is invoked which is
worker_thread().
The first thing it does, is to set the special kworker flag via
set_pf_worker() which enables the additional code in the scheduler. Then
it has nothing to do and invokes schedule() which then gets us to
wq_worker_sleeping(). Here it invokes wq_worker_sleeping() which is what
explodes.
Based on the register dump and code dump, RAX is NULL which is
current->set_child_tid (from the begin of ktread()). It adds 0x10 for
the ->date pointer and OOPSes while reading from 0x10.

So everything looks fine, except that `set_child_tid' seems to be zeroed
out. Also, task_struct has a few lines after `set_child_tid' the `comm'
member which seems to contain also 0x00 because the trace contains no
task name. At this point I would have expected "kworker/…".

Based on this two hints it looks like something zeroed that memory area
shortly after it was occupied by the task (aka use after free).

Sebastian

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: NULL ptr deref in wq_worker_sleeping on 4.19
  2019-08-07 11:46 ` Sebastian Andrzej Siewior
@ 2019-08-07 12:24   ` Austin Kim
  2019-08-07 12:56     ` Thomas Gleixner
  0 siblings, 1 reply; 6+ messages in thread
From: Austin Kim @ 2019-08-07 12:24 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Sasha Levin, tglx, peterz, mingo, tj, jiangshanlai, linux-kernel, stable

I wonder what kinds of workqueue is used in case of this panic.

If system workqueue(system_wq) is used for this case, it would be a
help to replace it with high priority workqueue(system_highpri_wq). If
panic disappers with high priority workqueue(system_highpri_wq), we
would think about another scenario.

BR,
Guillermo Austin Kim

2019년 8월 7일 (수) 오후 8:48, Sebastian Andrzej Siewior <bigeasy@linutronix.de>님이 작성:
>
> On 2019-07-19 09:53:52 [-0400], Sasha Levin wrote:
> > Hi folks,
> Hi,
>
> > We're seeing a rare panic on boot in wq_worker_sleeping() on boot in
> > 4.19 kernels. I wasn't able to reproduce this with 5.2, but I'm not sure
> > whether it's because the issue is fixed, or I was just unlucky.
> >
> > The panic looks like this:
> >
> > [    0.852791] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
> > [    0.853260] PGD 0 P4D 0 [    0.853260] Oops: 0000 [#1] SMP PTI
> > [    0.853260] CPU: 7 PID: 49 Comm:  Not tainted 4.19.52-9858d02fd940 #1
> > [    0.853260] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007  06/02/2017
> > [    0.853260] RIP: 0010:kthread_data+0x12/0x30
> > [    0.853260] Code: 83 7f 58 00 74 02 0f 0b e9 bb 2d 19 00 0f 0b eb e2 0f 1f 80 00 00 00 00 0f 1f 44 00 00 f6 47 26 20 74 0c 48 8b 87 98 05 00 00 <48> 8b 40 10 c3 0f 0b 48 8b 87 98 05 00 00 48 8b 40 10 c3 90 66 2e
> > [    0.853260] RSP: 0000:ffffc900036abe38 EFLAGS: 00010002
> > [    0.853260] RAX: 0000000000000000 RBX: ffff8887bfbe17c0 RCX: 0000000000000000
> > [    0.853260] RDX: 0000000000000001 RSI: 000000000000000a RDI: ffff8887bbb4bb00
> > [    0.853260] RBP: ffffc900036abea0 R08: 0000000000000000 R09: 0000000000000000
> > [    0.853260] R10: ffffc9000368bd90 R11: 0000000000000000 R12: ffff8887bbb4bb00
> > [    0.853260] R13: 0000000000000000 R14: ffffc900036abe60 R15: 0000000000000000
> > [    0.853260] FS:  0000000000000000(0000) GS:ffff8887bfbc0000(0000) knlGS:0000000000000000
> > [    0.853260] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [    0.853260] CR2: 0000000000000068 CR3: 00000007df40a000 CR4: 00000000001406e0
> > [    0.853260] Call Trace:
> > [    0.853260]  wq_worker_sleeping+0xa/0x60
> > [    0.853260]  __schedule+0x571/0x8c0
> > [    0.853260]  schedule+0x32/0x80
> > [    0.853260]  worker_thread+0xc7/0x440
> > [    0.853260]  kthread+0xf8/0x130
> > [    0.853260]  ret_from_fork+0x35/0x40
> > [    0.853260] Modules linked in:
> > [    0.853260] CR2: 0000000000000010
> > [    0.853260] ---[ end trace 160fda44361ab977 ]---
> >
> > I see that this area was recently touched by 6d25be5782e4 ("sched/core,
> > workqueues: Distangle worker accounting from rq lock") but I'm not sure
> > if it's related.
>
> The change should just move code outside of the scheduler and not lead
> to any changed behaviour (except the small detail mentioned in the
> changelog, nothing explaining what you have here).
>
> The way the call chain looks is, kthread() allocated the struct kthread
> (self) and saved it in current->set_child_tid and this pointer is not
> NULL. Everything works out and `threadfn' is invoked which is
> worker_thread().
> The first thing it does, is to set the special kworker flag via
> set_pf_worker() which enables the additional code in the scheduler. Then
> it has nothing to do and invokes schedule() which then gets us to
> wq_worker_sleeping(). Here it invokes wq_worker_sleeping() which is what
> explodes.
> Based on the register dump and code dump, RAX is NULL which is
> current->set_child_tid (from the begin of ktread()). It adds 0x10 for
> the ->date pointer and OOPSes while reading from 0x10.
>
> So everything looks fine, except that `set_child_tid' seems to be zeroed
> out. Also, task_struct has a few lines after `set_child_tid' the `comm'
> member which seems to contain also 0x00 because the trace contains no
> task name. At this point I would have expected "kworker/…".
>
> Based on this two hints it looks like something zeroed that memory area
> shortly after it was occupied by the task (aka use after free).
>
> Sebastian

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: NULL ptr deref in wq_worker_sleeping on 4.19
  2019-08-07 12:24   ` Austin Kim
@ 2019-08-07 12:56     ` Thomas Gleixner
  2019-08-07 13:33       ` Austin Kim
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Gleixner @ 2019-08-07 12:56 UTC (permalink / raw)
  To: Austin Kim
  Cc: Sebastian Andrzej Siewior, Sasha Levin, peterz, mingo, tj,
	jiangshanlai, linux-kernel, stable

On Wed, 7 Aug 2019, Austin Kim wrote:

  A: Because it messes up the order in which people normally read text.
  Q: Why is top-posting such a bad thing?
  A: Top-posting.
  Q: What is the most annoying thing in e-mail?

  A: No.
  Q: Should I include quotations after my reply?

  http://daringfireball.net/2007/07/on_top

> I wonder what kinds of workqueue is used in case of this panic.
> 
> If system workqueue(system_wq) is used for this case, it would be a
> help to replace it with high priority workqueue(system_highpri_wq). If
> panic disappers with high priority workqueue(system_highpri_wq), we
> would think about another scenario.

How would that help? As Sebastian explained, something overwrote memory or
it is a Use After Free. How would a high priority workqueue 'fix' that?

You need to find the root cause, which is either memory corruption or a use
after free.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: NULL ptr deref in wq_worker_sleeping on 4.19
  2019-08-07 12:56     ` Thomas Gleixner
@ 2019-08-07 13:33       ` Austin Kim
  2019-08-07 20:52         ` Thomas Gleixner
  0 siblings, 1 reply; 6+ messages in thread
From: Austin Kim @ 2019-08-07 13:33 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Sebastian Andrzej Siewior, Sasha Levin, peterz, mingo, tj,
	jiangshanlai, linux-kernel, stable, #서형진

Is it possible for you to generate kernel patch only with **kernel
log** for 'use-after-free' case?
With kernel log, we only have limited information, so it is hard to
debug 'use-after-free' case with this signature.

For this matter, kernel should be compiled with below config.
CONFIG_SLUB_DEBUG=y
CONFIG_SLUB_DEBUG_ON=y

2019년 8월 7일 (수) 오후 9:56, Thomas Gleixner <tglx@linutronix.de>님이 작성:
>
> On Wed, 7 Aug 2019, Austin Kim wrote:
>
>   A: Because it messes up the order in which people normally read text.
>   Q: Why is top-posting such a bad thing?
>   A: Top-posting.
>   Q: What is the most annoying thing in e-mail?
>
>   A: No.
>   Q: Should I include quotations after my reply?
>
>   http://daringfireball.net/2007/07/on_top
>
> > I wonder what kinds of workqueue is used in case of this panic.
> >
> > If system workqueue(system_wq) is used for this case, it would be a
> > help to replace it with high priority workqueue(system_highpri_wq). If
> > panic disappers with high priority workqueue(system_highpri_wq), we
> > would think about another scenario.
>
> How would that help? As Sebastian explained, something overwrote memory or
> it is a Use After Free. How would a high priority workqueue 'fix' that?
>
> You need to find the root cause, which is either memory corruption or a use
> after free.
>
> Thanks,
>
>         tglx

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: NULL ptr deref in wq_worker_sleeping on 4.19
  2019-08-07 13:33       ` Austin Kim
@ 2019-08-07 20:52         ` Thomas Gleixner
  0 siblings, 0 replies; 6+ messages in thread
From: Thomas Gleixner @ 2019-08-07 20:52 UTC (permalink / raw)
  To: Austin Kim
  Cc: Sebastian Andrzej Siewior, Sasha Levin, peterz, mingo, tj,
	jiangshanlai, linux-kernel, stable, #서형진

On Wed, 7 Aug 2019, Austin Kim wrote:

Again:

 A: Because it messes up the order in which people normally read text.
 Q: Why is top-posting such a bad thing?
 A: Top-posting.
 Q: What is the most annoying thing in e-mail?

 A: No.
 Q: Should I include quotations after my reply?

 http://daringfireball.net/2007/07/on_top

> Is it possible for you to generate kernel patch only with **kernel
> log** for 'use-after-free' case?

What? I have no idea how to reproduce that issue so how should I create a
kernel patch for something which can happen at some random place in several
million lines of code?

> With kernel log, we only have limited information, so it is hard to
> debug 'use-after-free' case with this signature.

My information is even more limited...
 
Thanks,

	tglx

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-08-07 20:52 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-19 13:53 NULL ptr deref in wq_worker_sleeping on 4.19 Sasha Levin
2019-08-07 11:46 ` Sebastian Andrzej Siewior
2019-08-07 12:24   ` Austin Kim
2019-08-07 12:56     ` Thomas Gleixner
2019-08-07 13:33       ` Austin Kim
2019-08-07 20:52         ` Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).