* corruption causing crash in __queue_work
@ 2015-12-09 12:08 Nikolay Borisov
  2015-12-09 16:08 ` Tejun Heo
  0 siblings, 1 reply; 24+ messages in thread
From: Nikolay Borisov @ 2015-12-09 12:08 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Linux-Kernel@Vger. Kernel. Org, SiteGround Operations

Hello Tejun, 

I've been observing the following crashes on kernel 4.2.6:

[73309.529940] BUG: unable to handle kernel NULL pointer dereference at           (null)
[73309.530238] IP: [<ffffffff8106b663>] __queue_work+0xb3/0x390
[73309.530466] PGD 0 
[73309.530681] Oops: 0000 [#1] SMP 
[73309.530947] Modules linked in: dm_snapshot dm_thin_pool dm_bio_prison dm_persistent_data dm_bufio libcrc32c ipv6 xt_multiport iptable_filter xt_nat iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables ext2 dm_mirror dm_region_hash dm_log iTCO_wdt iTCO_vendor_support sb_edac edac_core i2c_i801 igb i2c_algo_bit i2c_core lpc_ich mfd_core ipmi_devintf ipmi_si ipmi_msghandler ioatdma dca
[73309.533556] CPU: 19 PID: 0 Comm: swapper/19 Not tainted 4.2.6-wbpatch-qib #1
[73309.533734] Hardware name: Supermicro X9DRD-iF/LF/X9DRD-iF, BIOS 3.0b 12/05/2013
[73309.533911] task: ffff880276501b80 ti: ffff880276510000 task.ti: ffff880276510000
[73309.534093] RIP: 0010:[<ffffffff8106b663>]  [<ffffffff8106b663>] __queue_work+0xb3/0x390
[73309.534321] RSP: 0018:ffff88047fce3d58  EFLAGS: 00010086
[73309.534495] RAX: ffff880277812400 RBX: ffff8801e53e24c0 RCX: 00000000000100f0
[73309.534672] RDX: 0000000000000000 RSI: 0000000000000030 RDI: ffff8801e53e24c0
[73309.534849] RBP: ffff88047fce3de8 R08: 000042ad628a3480 R09: 0000000000000000
[73309.535023] R10: ffffffff816099d5 R11: 0000000000000000 R12: ffffffff8106b940
[73309.535196] R13: 0000000000000013 R14: ffff8803df464c00 R15: 0000000000000013
[73309.535370] FS:  0000000000000000(0000) GS:ffff88047fce0000(0000) knlGS:0000000000000000
[73309.535544] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[73309.535714] CR2: 0000000000000000 CR3: 0000000001a0e000 CR4: 00000000000406e0
[73309.535886] Stack:
[73309.536049]  ffff88047fcefcd8 0000000000000092 0000000000000000 ffff8803df464d10
[73309.536415]  0000000000000032 00000000000100f0 0000000000000000 ffff88047fcf4a00
[73309.536785]  ffff88047fcf4a00 0000000000000013 0000000000000000 ffff880276501b80
[73309.537152] Call Trace:
[73309.537319]  <IRQ> 
[73309.537373]  [<ffffffff8106b940>] ? __queue_work+0x390/0x390
[73309.537714]  [<ffffffff8106b958>] delayed_work_timer_fn+0x18/0x20
[73309.537891]  [<ffffffff810ad1d7>] call_timer_fn+0x47/0x110
[73309.538071]  [<ffffffff810be302>] ? tick_sched_timer+0x52/0xa0
[73309.538249]  [<ffffffff810adb6f>] run_timer_softirq+0x17f/0x2b0
[73309.538425]  [<ffffffff8106b940>] ? __queue_work+0x390/0x390
[73309.538604]  [<ffffffff81057f40>] __do_softirq+0xe0/0x290
[73309.538778]  [<ffffffff810581e6>] irq_exit+0xa6/0xb0
[73309.538952]  [<ffffffff8159413a>] smp_apic_timer_interrupt+0x4a/0x59
[73309.539128]  [<ffffffff815926bb>] apic_timer_interrupt+0x6b/0x70
[73309.539300]  <EOI> 
[73309.539355]  [<ffffffff8148b136>] ? cpuidle_enter_state+0x136/0x290
[73309.539694]  [<ffffffff8148b12d>] ? cpuidle_enter_state+0x12d/0x290
[73309.539870]  [<ffffffff8158d9ed>] ? __schedule+0x37d/0x840
[73309.540045]  [<ffffffff8148b2a7>] cpuidle_enter+0x17/0x20
[73309.540222]  [<ffffffff810936c5>] cpuidle_idle_call+0x95/0x140
[73309.540398]  [<ffffffff81072766>] ? atomic_notifier_call_chain+0x16/0x20
[73309.540574]  [<ffffffff810938b5>] cpu_idle_loop+0x145/0x200
[73309.540748]  [<ffffffff8109398b>] ? cpu_startup_entry+0x1b/0x70
[73309.540924]  [<ffffffff813a1948>] ? get_random_bytes+0x48/0x90
[73309.541098]  [<ffffffff810939cf>] cpu_startup_entry+0x5f/0x70
[73309.541274]  [<ffffffff81033832>] start_secondary+0xc2/0xd0
[73309.541446] Code: 49 8b 96 08 01 00 00 49 63 c7 48 03 14 c5 e0 af ab 81 48 89 55 80 48 89 df e8 0a ee ff ff 48 8b 55 80 48 85 c0 0f 84 3e 01 00 00 <48> 8b 3a 48 39 f8 0f 84 35 01 00 00 48 89 c7 48 89 85 78 ff ff 
[73309.545008] RIP  [<ffffffff8106b663>] __queue_work+0xb3/0x390
[73309.545231]  RSP <ffff88047fce3d58>
[73309.545399] CR2: 0000000000000000

The gist is that this fails on the following line:

if (last_pool && last_pool != pwq->pool) {

The pointer 'pwq' (loaded into %rdx) is bogus, in this case
0000000000000000. Looking at the function's source, pwq would be loaded
by per_cpu_ptr() only when the (!(wq->flags & WQ_UNBOUND)) check is
true; here it is false (the workqueue is unbound, see the flags below),
so pwq is the result of unbound_pwq_by_node(wq, cpu_to_node(cpu));
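
For reference, the relevant part of __queue_work() looks roughly like
this in the 4.2-era kernel/workqueue.c (paraphrased from the source, so
treat it as a sketch):

	/* pwq which will be used unless @work is executing elsewhere */
	if (!(wq->flags & WQ_UNBOUND))
		pwq = per_cpu_ptr(wq->cpu_pwqs, cpu);
	else
		pwq = unbound_pwq_by_node(wq, cpu_to_node(cpu));

	/*
	 * If @work was previously on a different pool, it might still be
	 * running there, in which case the work needs to be queued on that
	 * pool to guarantee non-reentrancy.
	 */
	last_pool = get_work_pool(work);
	if (last_pool && last_pool != pwq->pool) {	/* <- faulting dereference */
		...
	}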

Here are the flags of the workqueue: 
crash> struct workqueue_struct.flags 0xffff8803df464c00
  flags = 131082

(0xffff8803df464c00 is indeed the pointer to the workqueue struct, 
so the flags aren't bogus).

So reading the numa_pwq_tbl it seems that it's uninitialised: 

crash> struct workqueue_struct.numa_pwq_tbl 0xffff8803df464c00
  numa_pwq_tbl = 0xffff8803df464d10
crash> rd -64 0xffff8803df464d10 3
ffff8803df464d10:  0000000000000000 0000000000000000   ................
ffff8803df464d20:  0000000000000000                    ........

The machine where the crash occurred has a single NUMA node, so at the 
very least I would have expected to have a pointer, rather than NULL ptr. 
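
unbound_pwq_by_node() itself is essentially just a lookup in that table,
roughly:

static struct pool_workqueue *unbound_pwq_by_node(struct workqueue_struct *wq,
						  int node)
{
	/* called under sched-RCU or wq->mutex; just a per-node table lookup */
	return rcu_dereference_raw(wq->numa_pwq_tbl[node]);
}

so a NULL slot in numa_pwq_tbl translates directly into the NULL pwq
dereferenced in the oops above.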

Also this crash is not isolated in that I have observed it on multiple
other nodes running vanilla 4.2.5/4.2.6 kernels. 

Any advice on how to debug this further?




* Re: corruption causing crash in __queue_work
  2015-12-09 12:08 corruption causing crash in __queue_work Nikolay Borisov
@ 2015-12-09 16:08 ` Tejun Heo
  2015-12-09 16:23   ` Nikolay Borisov
  0 siblings, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2015-12-09 16:08 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: Linux-Kernel@Vger. Kernel. Org, SiteGround Operations

Hello, Nikolay.

On Wed, Dec 09, 2015 at 02:08:56PM +0200, Nikolay Borisov wrote:
> 73309.529940] BUG: unable to handle kernel NULL pointer dereference at           (null)
> [73309.530238] IP: [<ffffffff8106b663>] __queue_work+0xb3/0x390
...
> [73309.537319]  <IRQ> 
> [73309.537373]  [<ffffffff8106b940>] ? __queue_work+0x390/0x390
> [73309.537714]  [<ffffffff8106b958>] delayed_work_timer_fn+0x18/0x20
> [73309.537891]  [<ffffffff810ad1d7>] call_timer_fn+0x47/0x110
> [73309.538071]  [<ffffffff810be302>] ? tick_sched_timer+0x52/0xa0
> [73309.538249]  [<ffffffff810adb6f>] run_timer_softirq+0x17f/0x2b0
> [73309.538425]  [<ffffffff8106b940>] ? __queue_work+0x390/0x390
> [73309.538604]  [<ffffffff81057f40>] __do_softirq+0xe0/0x290
> [73309.538778]  [<ffffffff810581e6>] irq_exit+0xa6/0xb0
> [73309.538952]  [<ffffffff8159413a>] smp_apic_timer_interrupt+0x4a/0x59
> [73309.539128]  [<ffffffff815926bb>] apic_timer_interrupt+0x6b/0x70
...
> The gist is that this fail on the following line: 
> 
> if (last_pool && last_pool != pwq->pool) {

That's new.

> Since the pointer 'pwq' is wrong (it is loaded in %rdx) which in this 
> case is 0000000000000000. Looking at the function's source pwq should 
> be loaded by per_cpu_ptr since the  if (!(wq->flags & WQ_UNBOUND)) 
> check should evaluate to false. So pwq is loaded as the result from 
> unbound_pwq_by_node(wq, cpu_to_node(cpu));
> 
> Here are the flags of the workqueue: 
> crash> struct workqueue_struct.flags 0xffff8803df464c00
>   flags = 131082

That's ordered unbound workqueue w/ a rescuer.
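
(For the record, assuming the 4.2-era flag values in
include/linux/workqueue.h: 131082 == 0x2000a == __WQ_ORDERED (1 << 17) |
WQ_MEM_RECLAIM (1 << 3) | WQ_UNBOUND (1 << 1); WQ_MEM_RECLAIM is what
gives the workqueue its rescuer.)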

> (0xffff8803df464c00 is indeed the pointer to the workqueue struct, 
> so the flags aren't bogus).
> 
> So reading the numa_pwq_tbl it seems that it's uninitialised: 
> 
> crash> struct workqueue_struct.numa_pwq_tbl 0xffff8803df464c00
>   numa_pwq_tbl = 0xffff8803df464d10
> crash> rd -64 0xffff8803df464d10 3
> ffff8803df464d10:  0000000000000000 0000000000000000   ................
> ffff8803df464d20:  0000000000000000                    ........
> 
> The machine where the crash occurred has a single NUMA node, so at the 
> very least I would have expected to have a pointer, rather than NULL ptr. 
> 
> Also this crash is not isolated in that I have observed it on multiple
> other nodes running vanilla 4.2.5/4.2.6 kernels. 
> 
> Any advice how to further debug that?

Adding printk or tracepoints at numa_pwq_tbl_install() to dump what's
being installed would be helpful.  It should at least tell us whether
it's the table being corrupted by something else or workqueue failing
to set it up correctly to begin with.  How reproducible is the
problem?

Thanks.

-- 
tejun


* Re: corruption causing crash in __queue_work
  2015-12-09 16:08 ` Tejun Heo
@ 2015-12-09 16:23   ` Nikolay Borisov
  2015-12-09 16:27     ` Tejun Heo
  0 siblings, 1 reply; 24+ messages in thread
From: Nikolay Borisov @ 2015-12-09 16:23 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Linux-Kernel@Vger. Kernel. Org, SiteGround Operations



On 12/09/2015 06:08 PM, Tejun Heo wrote:
> Hello, Nikolay.
> 
> On Wed, Dec 09, 2015 at 02:08:56PM +0200, Nikolay Borisov wrote:
>> 73309.529940] BUG: unable to handle kernel NULL pointer dereference at           (null)
>> [73309.530238] IP: [<ffffffff8106b663>] __queue_work+0xb3/0x390
> ...
>> [73309.537319]  <IRQ> 
>> [73309.537373]  [<ffffffff8106b940>] ? __queue_work+0x390/0x390
>> [73309.537714]  [<ffffffff8106b958>] delayed_work_timer_fn+0x18/0x20
>> [73309.537891]  [<ffffffff810ad1d7>] call_timer_fn+0x47/0x110
>> [73309.538071]  [<ffffffff810be302>] ? tick_sched_timer+0x52/0xa0
>> [73309.538249]  [<ffffffff810adb6f>] run_timer_softirq+0x17f/0x2b0
>> [73309.538425]  [<ffffffff8106b940>] ? __queue_work+0x390/0x390
>> [73309.538604]  [<ffffffff81057f40>] __do_softirq+0xe0/0x290
>> [73309.538778]  [<ffffffff810581e6>] irq_exit+0xa6/0xb0
>> [73309.538952]  [<ffffffff8159413a>] smp_apic_timer_interrupt+0x4a/0x59
>> [73309.539128]  [<ffffffff815926bb>] apic_timer_interrupt+0x6b/0x70
> ...
>> The gist is that this fail on the following line: 
>>
>> if (last_pool && last_pool != pwq->pool) {
> 
> That's new.
> 
>> Since the pointer 'pwq' is wrong (it is loaded in %rdx) which in this 
>> case is 0000000000000000. Looking at the function's source pwq should 
>> be loaded by per_cpu_ptr since the  if (!(wq->flags & WQ_UNBOUND)) 
>> check should evaluate to false. So pwq is loaded as the result from 
>> unbound_pwq_by_node(wq, cpu_to_node(cpu));
>>
>> Here are the flags of the workqueue: 
>> crash> struct workqueue_struct.flags 0xffff8803df464c00
>>   flags = 131082
> 
> That's ordered unbound workqueue w/ a rescuer.

So the name of the queue is 'dm-thin'. Looking at the sources of
dm-thin, the only place where a workqueue is allocated is here:

pool->wq = alloc_ordered_workqueue("dm-" DM_MSG_PREFIX, WQ_MEM_RECLAIM);

But in this case I guess the caller can't be the culprit? I'm biased wrt
dm-thin because in the past few months I've hit multiple bugs in it.
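
(For what it's worth, alloc_ordered_workqueue() appears to be a macro
along these lines in include/linux/workqueue.h:

#define alloc_ordered_workqueue(fmt, flags, args...)			\
	alloc_workqueue(fmt, WQ_UNBOUND | __WQ_ORDERED | (flags), 1, ##args)

so the WQ_MEM_RECLAIM call above would indeed yield the flags value
131082 seen in the dump.)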

> 
>> (0xffff8803df464c00 is indeed the pointer to the workqueue struct, 
>> so the flags aren't bogus).
>>
>> So reading the numa_pwq_tbl it seems that it's uninitialised: 
>>
>> crash> struct workqueue_struct.numa_pwq_tbl 0xffff8803df464c00
>>   numa_pwq_tbl = 0xffff8803df464d10
>> crash> rd -64 0xffff8803df464d10 3
>> ffff8803df464d10:  0000000000000000 0000000000000000   ................
>> ffff8803df464d20:  0000000000000000                    ........
>>
>> The machine where the crash occurred has a single NUMA node, so at the 
>> very least I would have expected to have a pointer, rather than NULL ptr. 
>>
>> Also this crash is not isolated in that I have observed it on multiple
>> other nodes running vanilla 4.2.5/4.2.6 kernels. 
>>
>> Any advice how to further debug that?
> 
> Adding printk or tracepoints at numa_pwq_tbl_install() to dump what's
> being installed would be helpful.  It should at least tell us whether
> it's the table being corrupted by something else or workqueue failing
> to set it up correctly to begin with.  How reproducible is the
> problem?

I think we are seeing this at least daily on at least 1 server (we have
multiple servers like that). So adding printk's would likely be the way
to go; is there anything in particular you would be interested in knowing?
I see RCU stuff around, so it might be a tricky race condition.


> 
> Thanks.
> 


* Re: corruption causing crash in __queue_work
  2015-12-09 16:23   ` Nikolay Borisov
@ 2015-12-09 16:27     ` Tejun Heo
  2015-12-10  9:28       ` Nikolay Borisov
  0 siblings, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2015-12-09 16:27 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: Linux-Kernel@Vger. Kernel. Org, SiteGround Operations

Hello,

On Wed, Dec 09, 2015 at 06:23:15PM +0200, Nikolay Borisov wrote:
> I think we are seeing this at least daily on at least 1 server (we have
> multiple servers like that). So adding printk's would likely be the way
> to go, anything in particular you might be interested in knowing? I see
> RCU stuff around so might be tricky race condition.

Printing out the workqueue's pointer, name, pwq's pointer, the node
being installed for and the installed pointer should give us enough
clues.  There's RCU involved but the pointers shouldn't be becoming
NULLs unless we're installing NULL ptrs.
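
Something along these lines around the pointer swap in
numa_pwq_tbl_install() should do; this is only a rough sketch, not an
exact patch:

	old_pwq = rcu_access_pointer(wq->numa_pwq_tbl[node]);
	pr_info("WQ: %p (%s) old_pwq: %p new_pwq: %p node: %d\n",
		wq, wq->name, old_pwq, pwq, node);
	rcu_assign_pointer(wq->numa_pwq_tbl[node], pwq);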

Thanks.

-- 
tejun


* Re: corruption causing crash in __queue_work
  2015-12-09 16:27     ` Tejun Heo
@ 2015-12-10  9:28       ` Nikolay Borisov
  2015-12-10 15:29         ` Tejun Heo
  0 siblings, 1 reply; 24+ messages in thread
From: Nikolay Borisov @ 2015-12-10  9:28 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Linux-Kernel@Vger. Kernel. Org, SiteGround Operations



On 12/09/2015 06:27 PM, Tejun Heo wrote:
> Hello,
> 
> On Wed, Dec 09, 2015 at 06:23:15PM +0200, Nikolay Borisov wrote:
>> I think we are seeing this at least daily on at least 1 server (we have
>> multiple servers like that). So adding printk's would likely be the way
>> to go, anything in particular you might be interested in knowing? I see
>> RCU stuff around so might be tricky race condition.
> 
> Printing out the workqueue's pointer, name, pwq's pointer, the node
> being installed for and the installed pointer should give us enough
> clues.  There's RCU involved but the pointers shouldn't be becoming
> NULLs unless we're installing NULL ptrs.

So the debug patch has been rolled out on 1 server and several more
are in the process; here is what it prints:

WQ: ffff88046f00ba00 (events_unbound) old_pwq:           (null) new_pwq: ffff88046f00d300 node: 0
WQ: ffff88046f00be00 (events_power_efficient) old_pwq:           (null) new_pwq: ffff88046f00d400 node: 0
WQ: ffff88046d71c000 (events_freezable_power_) old_pwq:           (null) new_pwq: ffff88046f00d500 node: 0
WQ: ffff88046ce9ca00 (khelper) old_pwq:           (null) new_pwq: ffff88046f00d600 node: 0
WQ: ffff88046ce9c000 (netns) old_pwq:           (null) new_pwq: ffff88046f00d700 node: 0
WQ: ffff88046ce9d400 (perf) old_pwq:           (null) new_pwq: ffff88046f00d800 node: 0
WQ: ffff88046c408000 (writeback) old_pwq:           (null) new_pwq: ffff88046c800000 node: 0
WQ: ffff88046c409200 (kacpi_hotplug) old_pwq:           (null) new_pwq: ffff88046c42e200 node: 0
WQ: ffff880468455600 (scsi_tmf_0) old_pwq:           (null) new_pwq: ffff88046c801f00 node: 0
WQ: ffff8804687f4400 (scsi_tmf_1) old_pwq:           (null) new_pwq: ffff88046caa6700 node: 0
WQ: ffff8804687f4c00 (scsi_tmf_2) old_pwq:           (null) new_pwq: ffff88046caa6900 node: 0
WQ: ffff8804687f5400 (scsi_tmf_3) old_pwq:           (null) new_pwq: ffff88046caa6b00 node: 0
WQ: ffff8804687f5c00 (scsi_tmf_4) old_pwq:           (null) new_pwq: ffff88046caa6d00 node: 0
WQ: ffff8804687f6400 (scsi_tmf_5) old_pwq:           (null) new_pwq: ffff88046caa7000 node: 0
WQ: ffff8804687f6c00 (scsi_tmf_6) old_pwq:           (null) new_pwq: ffff88046caa7300 node: 0
WQ: ffff880467964000 (kdmremove) old_pwq:           (null) new_pwq: ffff880467a3c800 node: 0
WQ: ffff880467965000 (deferwq) old_pwq:           (null) new_pwq: ffff880467a3c100 node: 0
WQ: ffff8804669bc600 (ib_addr) old_pwq:           (null) new_pwq: ffff88046845a600 node: 0
WQ: ffff88007d167e00 (qib0_0) old_pwq:           (null) new_pwq: ffff880466c19800 node: 0
WQ: ffff88007d165a00 (qib0_1) old_pwq:           (null) new_pwq: ffff880466c18e00 node: 0
WQ: ffff88007d165200 (ib_mad1) old_pwq:           (null) new_pwq: ffff880466c19d00 node: 0
WQ: ffff8804665d2000 (ib_mad2) old_pwq:           (null) new_pwq: ffff880466c18a00 node: 0
WQ: ffff8804667d7600 (ext4-rsv-conversion) old_pwq:           (null) new_pwq: ffff880469806100 node: 0
WQ: ffff880079a9fc00 (edac-poller) old_pwq:           (null) new_pwq: ffff88007d5ebf00 node: 0
WQ: ffff88046b47cc00 (kvm-irqfd-cleanup) old_pwq:           (null) new_pwq: ffff8804651f0f00 node: 0
WQ: ffff8804694baa00 (kloopd0) old_pwq:           (null) new_pwq: ffff88046949d100 node: 0
WQ: ffff880079a9cc00 (kloopd1) old_pwq:           (null) new_pwq: ffff8804698cb900 node: 0
WQ: ffff88046809dc00 (kloopd2) old_pwq:           (null) new_pwq: ffff88046957aa00 node: 0
WQ: ffff88046809c000 (kloopd3) old_pwq:           (null) new_pwq: ffff8804650acc00 node: 0
WQ: ffff880466f3b000 (kloopd4) old_pwq:           (null) new_pwq: ffff880469575900 node: 0
WQ: ffff88046809e800 (kloopd5) old_pwq:           (null) new_pwq: ffff880469888200 node: 0
WQ: ffff88046809de00 (kloopd6) old_pwq:           (null) new_pwq: ffff880469827400 node: 0
WQ: ffff88007d5f1c00 (dm_bufio_cache) old_pwq:           (null) new_pwq: ffff8804673dda00 node: 0
WQ: ffff88046c42a400 (dm-thin) old_pwq:           (null) new_pwq: ffff880079955100 node: 0
WQ: ffff8804672d0800 (dm-thin) old_pwq:           (null) new_pwq: ffff88046baed800 node: 0
WQ: ffff88046993fa00 (dm-thin) old_pwq:           (null) new_pwq: ffff8804650ff100 node: 0
WQ: ffff88046993d400 (dm-thin) old_pwq:           (null) new_pwq: ffff88046949d600 node: 0
WQ: ffff88046993e400 (dm-thin) old_pwq:           (null) new_pwq: ffff88046b833000 node: 0
WQ: ffff880466466400 (dm-thin) old_pwq:           (null) new_pwq: ffff88007da60d00 node: 0
WQ: ffff88046b3eb200 (dm-thin) old_pwq:           (null) new_pwq: ffff88046633d200 node: 0
WQ: ffff8804672d0600 (ext4-rsv-conversion) old_pwq:           (null) new_pwq: ffff880079955400 node: 0
WQ: ffff88046b3eb600 (ext4-rsv-conversion) old_pwq:           (null) new_pwq: ffff880465684900 node: 0
WQ: ffff88046c42a400 (dm-thin) old_pwq:           (null) new_pwq: ffff8800799ee900 node: 0
WQ: ffff880466f39a00 (ext4-rsv-conversion) old_pwq:           (null) new_pwq: ffff880469849e00 node: 0
WQ: ffff880467b0cc00 (dm-thin) old_pwq:           (null) new_pwq: ffff88007d52fa00 node: 0
WQ: ffff8804672d4e00 (ext4-rsv-conversion) old_pwq:           (null) new_pwq: ffff88046ca07f00 node: 0
WQ: ffff880079a9ca00 (dm-thin) old_pwq:           (null) new_pwq: ffff8802d1be9e00 node: 0
WQ: ffff880466175000 (dm-thin) old_pwq:           (null) new_pwq: ffff8802d8efec00 node: 0
WQ: ffff880403f28400 (ext4-rsv-conversion) old_pwq:           (null) new_pwq: ffff8802e224dd00 node: 0
WQ: ffff880403f29a00 (dm-thin) old_pwq:           (null) new_pwq: ffff880465685300 node: 0
WQ: ffff8804672d6c00 (ext4-rsv-conversion) old_pwq:           (null) new_pwq: ffff880466d69300 node: 0
WQ: ffff880466f3ba00 (dm-thin) old_pwq:           (null) new_pwq: ffff880469576500 node: 0
WQ: ffff8804672d4600 (dm-thin) old_pwq:           (null) new_pwq: ffff8802d1a1ee00 node: 0
WQ: ffff8803ccf5c200 (ext4-rsv-conversion) old_pwq:           (null) new_pwq: ffff8804657b3200 node: 0

Is this format ok? Also, I observed the exact same crash
on a machine running a 4.1.12 kernel as well.

> 
> Thanks.
> 


* Re: corruption causing crash in __queue_work
  2015-12-10  9:28       ` Nikolay Borisov
@ 2015-12-10 15:29         ` Tejun Heo
  2015-12-11 15:57           ` Nikolay Borisov
  0 siblings, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2015-12-10 15:29 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: Linux-Kernel@Vger. Kernel. Org, SiteGround Operations

On Thu, Dec 10, 2015 at 11:28:02AM +0200, Nikolay Borisov wrote:
> On 12/09/2015 06:27 PM, Tejun Heo wrote:
> > Hello,
> > 
> > On Wed, Dec 09, 2015 at 06:23:15PM +0200, Nikolay Borisov wrote:
> >> I think we are seeing this at least daily on at least 1 server (we have
> >> multiple servers like that). So adding printk's would likely be the way
> >> to go, anything in particular you might be interested in knowing? I see
> >> RCU stuff around so might be tricky race condition.
> > 
> > Printing out the workqueue's pointer, name, pwq's pointer, the node
> > being installed for and the installed pointer should give us enough
> > clues.  There's RCU involved but the pointers shouldn't be becoming
> > NULLs unless we're installing NULL ptrs.
> 
> So the debug patch has been rolled on 1 server and several more 
> are in the process, here it is what it prints: 
> 
> WQ: ffff88046f00ba00 (events_unbound) old_pwq:           (null) new_pwq: ffff88046f00d300 node: 0
...
> Is this format ok? Also I observed the exact same crash
> on a machine running 4.1.12 kernel as well. 

Yeah, I think it can be a good starting point.

Thanks.

-- 
tejun


* Re: corruption causing crash in __queue_work
  2015-12-10 15:29         ` Tejun Heo
@ 2015-12-11 15:57           ` Nikolay Borisov
  2015-12-11 17:08             ` Tejun Heo
  0 siblings, 1 reply; 24+ messages in thread
From: Nikolay Borisov @ 2015-12-11 15:57 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Linux-Kernel@Vger. Kernel. Org, SiteGround Operations



On 12/10/2015 05:29 PM, Tejun Heo wrote:
> On Thu, Dec 10, 2015 at 11:28:02AM +0200, Nikolay Borisov wrote:
>> On 12/09/2015 06:27 PM, Tejun Heo wrote:
>>> Hello,
>>>
>>> On Wed, Dec 09, 2015 at 06:23:15PM +0200, Nikolay Borisov wrote:
>>>> I think we are seeing this at least daily on at least 1 server (we have
>>>> multiple servers like that). So adding printk's would likely be the way
>>>> to go, anything in particular you might be interested in knowing? I see
>>>> RCU stuff around so might be tricky race condition.
>>>
>>> Printing out the workqueue's pointer, name, pwq's pointer, the node
>>> being installed for and the installed pointer should give us enough
>>> clues.  There's RCU involved but the pointers shouldn't be becoming
>>> NULLs unless we're installing NULL ptrs.
>>
>> So the debug patch has been rolled on 1 server and several more 
>> are in the process, here it is what it prints: 
>>
>> WQ: ffff88046f00ba00 (events_unbound) old_pwq:           (null) new_pwq: ffff88046f00d300 node: 0
> ...
>> Is this format ok? Also I observed the exact same crash
>> on a machine running 4.1.12 kernel as well. 
> 
> Yeah, I think it can be a good starting point.

So I had a server with the patch just crash on me.

Here is what the queue looks like:
 crash> struct workqueue_struct 0xffff8802420a4a00
struct workqueue_struct {
  pwqs = {
    next = 0xffff8802420a4c00,
    prev = 0xffff8802420a4a00
  },
  list = {
    next = 0xffff880351f9b210,
    prev = 0xdead000000200200
  },
  mutex = {
    count = {
      counter = 1
    },
    wait_lock = {
      {
        rlock = {
          raw_lock = {
            val = {
              counter = 0
            }
          }
        }
      }
    },
    wait_list = {
      next = 0xffff8802420a4a28,
      prev = 0xffff8802420a4a28
    },
    owner = 0x0,
    osq = {
      tail = {
        counter = 0
      }
    }
  },
  work_color = 3,
  flush_color = 3,
  nr_pwqs_to_flush = {
    counter = 0
  },
  first_flusher = 0x0,
  flusher_queue = {
    next = 0xffff8802420a4a60,
    prev = 0xffff8802420a4a60
  },
  flusher_overflow = {
    next = 0xffff8802420a4a70,
    prev = 0xffff8802420a4a70
  },
  maydays = {
    next = 0xffff8802420a4a80,
    prev = 0xffff8802420a4a80
  },
  rescuer = 0xffff88046932ce40,
  nr_drainers = 0,
  saved_max_active = 1,
  unbound_attrs = 0xffff8801a76c1f00,
  dfl_pwq = 0x0,
  wq_dev = 0x0,
  name =
"dm-thin\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
  rcu = {
    next = 0xffff8802531c4c20,
    func = 0xffffffff810692e0 <rcu_free_wq>
  },
  flags = 131082,
  cpu_pwqs = 0x0,
  numa_pwq_tbl = 0xffff8802420a4b10
}

crash> rd 0xffff8802420a4b10 2 (the machine has 2 NUMA nodes hence the
'2' argument)
ffff8802420a4b10:  0000000000000000 0000000000000000   ................

At the same time, searching for 0xffff8802420a4a00 in the debug output
shows nothing, IOW it seems that the numa_pwq_tbl was never installed for
this workqueue:

[root@smallvault8 ~]# grep 0xffff8802420a4a00 /var/log/messages

Also, dumping all the dmesg logs contained in the vmcore image I find
nothing, and when I do the following correlation:
[root@smallvault8 ~]# grep \(null\) wq.log | wc -l
1940
[root@smallvault8 ~]# wc -l wq.log
1940 wq.log

It seems that the numa_pwq_tbl is only ever set on workqueue creation,
i.e. it is never re-assigned afterwards. So at this point it seems that
there is a situation where the wq attrs are not being applied at all.



> 
> Thanks.
> 


* Re: corruption causing crash in __queue_work
  2015-12-11 15:57           ` Nikolay Borisov
@ 2015-12-11 17:08             ` Tejun Heo
  2015-12-11 18:00               ` Nikolay Borisov
  2015-12-14  8:41                 ` Nikolay Borisov
  0 siblings, 2 replies; 24+ messages in thread
From: Tejun Heo @ 2015-12-11 17:08 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: Linux-Kernel@Vger. Kernel. Org, SiteGround Operations,
	Alasdair Kergon, Mike Snitzer, dm-devel

Hello, Nikolay.

On Fri, Dec 11, 2015 at 05:57:22PM +0200, Nikolay Borisov wrote:
> So I had a server with the patch just crash on me:
> 
> Here is how the queue looks like:
>  crash> struct workqueue_struct 0xffff8802420a4a00
> struct workqueue_struct {
>   pwqs = {
>     next = 0xffff8802420a4c00,
>     prev = 0xffff8802420a4a00

Hmmm... pwq list is already corrupt.  ->prev is terminated but ->next
isn't.

>   },
>   list = {
>     next = 0xffff880351f9b210,
>     prev = 0xdead000000200200

Followed by 0xdead000000200200 which is likely from
CONFIG_ILLEGAL_POINTER_VALUE.

...
>   name =
> "dm-thin\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
>   rcu = {
>     next = 0xffff8802531c4c20,
>     func = 0xffffffff810692e0 <rcu_free_wq>

and call_rcu_sched() already called.  The workqueue has already been
destroyed.

>   },
>   flags = 131082,
>   cpu_pwqs = 0x0,
>   numa_pwq_tbl = 0xffff8802420a4b10
> }
> 
> crash> rd 0xffff8802420a4b10 2 (the machine has 2 NUMA nodes hence the
> '2' argument)
> ffff8802420a4b10:  0000000000000000 0000000000000000   ................
> 
> At the same time searching for 0xffff8802420a4a00 in the debug output
> shows nothing IOW it seems that the numa_pwq_tbl is never installed for
> this workqueue apparently:
> 
> [root@smallvault8 ~]# grep 0xffff8802420a4a00 /var/log/messages
> 
> Also dumping all the logs from the dmesg contained in the vmcore image I
> find nothing and when I do the following correlation:
> [root@smallvault8 ~]# grep \(null\) wq.log | wc -l
> 1940
> [root@smallvault8 ~]# wc -l wq.log
> 1940 wq.log
> 
> It seems what's happening is really just changing the numa_pwq_tbl on
> workqueue creation i.e. it is never re-assigned. So at this point I
> think it seems that there is a situation where the wqattr are not being
> applied at all.

Hmmm... No idea why it didn't show up in the debug log, but the only
way a workqueue could be in the above state is that either it got
explicitly destroyed or somehow pwq refcnting is messed up; in both
cases it should have shown up in the log.

cc'ing dm people.  Is there any chance dm-thinp could be using the
workqueue after destroying it?

Thanks.

-- 
tejun


* Re: corruption causing crash in __queue_work
  2015-12-11 17:08             ` Tejun Heo
@ 2015-12-11 18:00               ` Nikolay Borisov
  2015-12-11 19:14                 ` Mike Snitzer
  2015-12-14  8:41                 ` Nikolay Borisov
  1 sibling, 1 reply; 24+ messages in thread
From: Nikolay Borisov @ 2015-12-11 18:00 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nikolay Borisov, Linux-Kernel@Vger. Kernel. Org,
	SiteGround Operations, Alasdair Kergon, Mike Snitzer,
	device-mapper development

On Fri, Dec 11, 2015 at 7:08 PM, Tejun Heo <tj@kernel.org> wrote:
> Hello, Nikolay.
>
> On Fri, Dec 11, 2015 at 05:57:22PM +0200, Nikolay Borisov wrote:
>> So I had a server with the patch just crash on me:
>>
>> Here is how the queue looks like:
>>  crash> struct workqueue_struct 0xffff8802420a4a00
>> struct workqueue_struct {
>>   pwqs = {
>>     next = 0xffff8802420a4c00,
>>     prev = 0xffff8802420a4a00
>
> Hmmm... pwq list is already corrupt.  ->prev is terminated but ->next
> isn't.
>
>>   },
>>   list = {
>>     next = 0xffff880351f9b210,
>>     prev = 0xdead000000200200
>
> Followed by by 0xdead000000200200 which is likely from
> CONFIG_ILLEGAL_POINTER_VALUE.
>
> ...
>>   name =
>> "dm-thin\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
>>   rcu = {
>>     next = 0xffff8802531c4c20,
>>     func = 0xffffffff810692e0 <rcu_free_wq>
>
> and call_rcu_sched() already called.  The workqueue has already been
> destroyed.
>
>>   },
>>   flags = 131082,
>>   cpu_pwqs = 0x0,
>>   numa_pwq_tbl = 0xffff8802420a4b10
>> }
>>
>> crash> rd 0xffff8802420a4b10 2 (the machine has 2 NUMA nodes hence the
>> '2' argument)
>> ffff8802420a4b10:  0000000000000000 0000000000000000   ................
>>
>> At the same time searching for 0xffff8802420a4a00 in the debug output
>> shows nothing IOW it seems that the numa_pwq_tbl is never installed for
>> this workqueue apparently:
>>
>> [root@smallvault8 ~]# grep 0xffff8802420a4a00 /var/log/messages
>>
>> Also dumping all the logs from the dmesg contained in the vmcore image I
>> find nothing and when I do the following correlation:
>> [root@smallvault8 ~]# grep \(null\) wq.log | wc -l
>> 1940
>> [root@smallvault8 ~]# wc -l wq.log
>> 1940 wq.log
>>
>> It seems what's happening is really just changing the numa_pwq_tbl on
>> workqueue creation i.e. it is never re-assigned. So at this point I
>> think it seems that there is a situation where the wqattr are not being
>> applied at all.
>
> Hmmm... No idea why it didn't show up in the debug log but the only
> way a workqueue could be in the above state is either it got
> explicitly destroyed or somehow pwq refcnting is messed up, in both
> cases it should have shown up in the log.
>
> cc'ing dm people.  Is there any chance dm-think could be using
> workqueue after destroying it?

In __pool_destroy in dm-thin.c I don't see a call to
cancel_delayed_work before destroying the workqueue. Is it possible
that this is the cause?


>
> Thanks.
>
> --
> tejun


* Re: corruption causing crash in __queue_work
  2015-12-11 18:00               ` Nikolay Borisov
@ 2015-12-11 19:14                 ` Mike Snitzer
  2015-12-12 11:49                   ` Nikolay Borisov
  0 siblings, 1 reply; 24+ messages in thread
From: Mike Snitzer @ 2015-12-11 19:14 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: Tejun Heo, Nikolay Borisov, Linux-Kernel@Vger. Kernel. Org,
	SiteGround Operations, Alasdair Kergon,
	device-mapper development

On Fri, Dec 11 2015 at  1:00pm -0500,
Nikolay Borisov <n.borisov@siteground.com> wrote:

> On Fri, Dec 11, 2015 at 7:08 PM, Tejun Heo <tj@kernel.org> wrote:
> >
> > Hmmm... No idea why it didn't show up in the debug log but the only
> > way a workqueue could be in the above state is either it got
> > explicitly destroyed or somehow pwq refcnting is messed up, in both
> > cases it should have shown up in the log.
> >
> > cc'ing dm people.  Is there any chance dm-thinp could be using
> > workqueue after destroying it?

Not that I'm aware of.  But never say never?

Plus I'd think we'd see other dm-thinp specific use-after-free issues
aside from the thin-pool's workqueue.

> In __pool_destroy in dm-thin.c I don't see a call to
> cancel_delayed_work before destroying the workqueue. Is it possible
> that this is the causeI

Cannot see how, __pool_destroy()'s destroy_workqueue() would spew a
bunch of WARN_ONs (and the wq wouldn't be destroyed) if the workqueue
had outstanding work.

__pool_destroy() is called once the thin-pool's ref count drops to 0
(see __pool_dec which is called when the thin-pool is removed --
e.g. with 'dmsetup remove').  This code is only reachable when nothing
else is using the thin-pool.

And the thin-pool is only able to be removed if all thin devices that
depend on it have first been removed.  And each individual thin device
waits for all outstanding IO before it can be removed.


* Re: corruption causing crash in __queue_work
  2015-12-11 19:14                 ` Mike Snitzer
@ 2015-12-12 11:49                   ` Nikolay Borisov
  0 siblings, 0 replies; 24+ messages in thread
From: Nikolay Borisov @ 2015-12-12 11:49 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Tejun Heo, Nikolay Borisov, Linux-Kernel@Vger. Kernel. Org,
	SiteGround Operations, Alasdair Kergon,
	device-mapper development



On 12/11/2015 09:14 PM, Mike Snitzer wrote:
> On Fri, Dec 11 2015 at  1:00pm -0500,
> Nikolay Borisov <n.borisov@siteground.com> wrote:
> 
>> On Fri, Dec 11, 2015 at 7:08 PM, Tejun Heo <tj@kernel.org> wrote:
>>>
>>> Hmmm... No idea why it didn't show up in the debug log but the only
>>> way a workqueue could be in the above state is either it got
>>> explicitly destroyed or somehow pwq refcnting is messed up, in both
>>> cases it should have shown up in the log.
>>>
>>> cc'ing dm people.  Is there any chance dm-thinp could be using
>>> workqueue after destroying it?
> 
> Not that I'm aware of.  But never say never?
> 
> Plus I'd think we'd see other dm-thinp specific use-after-free issues
> aside from the thin-pool's workqueue.
> 
>> In __pool_destroy in dm-thin.c I don't see a call to
>> cancel_delayed_work before destroying the workqueue. Is it possible
>> that this is the causeI
> 
> Cannot see how, __pool_destroy()'s destroy_workqueue() would spew a
> bunch of WARN_ONs (and the wq wouldn't be destroyed) if the workqueue
> had outstanding work.
> 
> __pool_destroy() is called once the thin-pool's ref count drops to 0
> (see __pool_dec which is called when the thin-pool is removed --
> e.g. with 'dmsetup remove').  This code is only reachable when nothing
> else is using the thin-pool.
> 
> And the thin-pool is only able to be removed if all thin devices that
> depend on it have first been removed.  And each individual thin device
> waits for all outstanding IO before they can be removed.

Ok, I had a closer look at the code now and it indeed seems that when
the pool is suspended, its postsuspend callback does cancel the delayed
work and flush the workqueue. But given that I see these failures on at
least 2-3 servers per day, I doubt it is a hardware/machine-specific
issue. Furthermore, the fact that it is always a dm-thin queue that's
being referenced points in the direction of dm-thin, even though the
code looks solid in that regard.
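
For reference, the postsuspend path I mean looks roughly like this in
the 4.2-era drivers/md/dm-thin.c (a sketch, details may differ):

static void pool_postsuspend(struct dm_target *ti)
{
	struct pool_c *pt = ti->private;
	struct pool *pool = pt->pool;

	/* stop the periodic waker and the no-space timeout ... */
	cancel_delayed_work(&pool->waker);
	cancel_delayed_work(&pool->no_space_timeout);
	/* ... and drain anything already queued on the pool workqueue */
	flush_workqueue(pool->wq);
	(void) commit(pool);
}

(Note that cancel_delayed_work(), unlike cancel_delayed_work_sync(),
does not wait for a timer callback that is already executing.)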

Regards,
Nikolay


* Re: corruption causing crash in __queue_work
  2015-12-11 17:08             ` Tejun Heo
@ 2015-12-14  8:41                 ` Nikolay Borisov
  1 sibling, 0 replies; 24+ messages in thread
From: Nikolay Borisov @ 2015-12-14  8:41 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Linux-Kernel@Vger. Kernel. Org, SiteGround Operations,
	Alasdair Kergon, Mike Snitzer, dm-devel



On 12/11/2015 07:08 PM, Tejun Heo wrote:
> Hello, Nikolay.
> 
> On Fri, Dec 11, 2015 at 05:57:22PM +0200, Nikolay Borisov wrote:
>> So I had a server with the patch just crash on me:
>>
>> Here is how the queue looks like:
>>  crash> struct workqueue_struct 0xffff8802420a4a00
>> struct workqueue_struct {
>>   pwqs = {
>>     next = 0xffff8802420a4c00,
>>     prev = 0xffff8802420a4a00
> 
> Hmmm... pwq list is already corrupt.  ->prev is terminated but ->next
> isn't.
> 
>>   },
>>   list = {
>>     next = 0xffff880351f9b210,
>>     prev = 0xdead000000200200
> 
> Followed by by 0xdead000000200200 which is likely from
> CONFIG_ILLEGAL_POINTER_VALUE.
> 
> ...
>>   name =
>> "dm-thin\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
>>   rcu = {
>>     next = 0xffff8802531c4c20,
>>     func = 0xffffffff810692e0 <rcu_free_wq>
> 
> and call_rcu_sched() already called.  The workqueue has already been
> destroyed.
> 
>>   },
>>   flags = 131082,
>>   cpu_pwqs = 0x0,
>>   numa_pwq_tbl = 0xffff8802420a4b10
>> }
>>
>> crash> rd 0xffff8802420a4b10 2 (the machine has 2 NUMA nodes hence the
>> '2' argument)
>> ffff8802420a4b10:  0000000000000000 0000000000000000   ................
>>
>> At the same time searching for 0xffff8802420a4a00 in the debug output
>> shows nothing IOW it seems that the numa_pwq_tbl is never installed for
>> this workqueue apparently:
>>
>> [root@smallvault8 ~]# grep 0xffff8802420a4a00 /var/log/messages
>>
>> Also dumping all the logs from the dmesg contained in the vmcore image I
>> find nothing and when I do the following correlation:
>> [root@smallvault8 ~]# grep \(null\) wq.log | wc -l
>> 1940
>> [root@smallvault8 ~]# wc -l wq.log
>> 1940 wq.log
>>
>> It seems what's happening is really just changing the numa_pwq_tbl on
>> workqueue creation i.e. it is never re-assigned. So at this point I
>> think it seems that there is a situation where the wqattr are not being
>> applied at all.
> 
> Hmmm... No idea why it didn't show up in the debug log but the only
> way a workqueue could be in the above state is either it got
> explicitly destroyed or somehow pwq refcnting is messed up, in both
> cases it should have shown up in the log.

Had another poke at the backtrace that is produced, and here is what
the delayed_work looks like:

crash> struct delayed_work ffff88036772c8c0
struct delayed_work {
  work = {
    data = {
      counter = 1537
    },
    entry = {
      next = 0xffff88036772c8c8,
      prev = 0xffff88036772c8c8
    },
    func = 0xffffffffa0211a30 <do_waker>
  },
  timer = {
    entry = {
      next = 0x0,
      prev = 0xdead000000200200
    },
    expires = 4349463655,
    base = 0xffff88047fd2d602,
    function = 0xffffffff8106da40 <delayed_work_timer_fn>,
    data = 18446612146934696128,
    slack = -1,
    start_pid = -1,
    start_site = 0x0,
    start_comm =
"\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"
  },
  wq = 0xffff88030cf65400,
  cpu = 21
}

From this it seems that the timer is also cancelled/expired, judging by
the values in timer->entry. But then again, in dm-thin the pool is
first suspended, which implies the following functions were called:

cancel_delayed_work(&pool->waker);
cancel_delayed_work(&pool->no_space_timeout);
flush_workqueue(pool->wq);

so at that point dm-thin's workqueue should be empty and it shouldn't be
possible to queue any more delayed work. But the crashdump clearly shows
that the opposite is happening. So far all of this points to a race
condition; inserting some sleeps after umount and after vgchange -Kan
(the command to deactivate the volume group and suspend, so that
cancel_delayed_work is invoked) seems to reduce the frequency of crashes,
though it doesn't eliminate them.

> 
> cc'ing dm people.  Is there any chance dm-think could be using
> workqueue after destroying it?
> 
> Thanks.
> 



* Re: corruption causing crash in __queue_work
  2015-12-14  8:41                 ` Nikolay Borisov
@ 2015-12-14 15:31                 ` Mike Snitzer
  2015-12-14 20:11                   ` Nikolay Borisov
  -1 siblings, 1 reply; 24+ messages in thread
From: Mike Snitzer @ 2015-12-14 15:31 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: Tejun Heo, Linux-Kernel@Vger. Kernel. Org, SiteGround Operations,
	Alasdair Kergon, dm-devel

On Mon, Dec 14 2015 at  3:41P -0500,
Nikolay Borisov <kernel@kyup.com> wrote:
 
> Had another poke at the backtrace that is produced and here what the
> delayed_work looks like:
> 
> crash> struct delayed_work ffff88036772c8c0
> struct delayed_work {
>   work = {
>     data = {
>       counter = 1537
>     },
>     entry = {
>       next = 0xffff88036772c8c8,
>       prev = 0xffff88036772c8c8
>     },
>     func = 0xffffffffa0211a30 <do_waker>
>   },
>   timer = {
>     entry = {
>       next = 0x0,
>       prev = 0xdead000000200200
>     },
>     expires = 4349463655,
>     base = 0xffff88047fd2d602,
>     function = 0xffffffff8106da40 <delayed_work_timer_fn>,
>     data = 18446612146934696128,
>     slack = -1,
>     start_pid = -1,
>     start_site = 0x0,
>     start_comm =
> "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"
>   },
>   wq = 0xffff88030cf65400,
>   cpu = 21
> }
> 
> From this it seems that the timer is also cancelled/expired judging by
> the values in timer -> entry. But then again in dm-thin the pool is
> first suspended, which implies the following functions were called:
> 
> cancel_delayed_work(&pool->waker);
> cancel_delayed_work(&pool->no_space_timeout);
> flush_workqueue(pool->wq);
> 
> so at that point dm-thin's workqueue should be empty and it shouldn't be
> possible to queue any more delayed work. But the crashdump clearly shows
> that the opposite is happening. So far all of this points to a race
> condition and inserting some sleeps after umount and after vgchange -Kan
> (command to disable volume group and suspend, so the cancel_delayed_work
> is invoked) seems to reduce the frequency of crashes, though it doesn't
> eliminate them.

'vgchange -Kan' doesn't suspend the pool before it destroys the device.
So the cancel_delayed_work()s you referenced aren't applicable.

Can you try this patch?

diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index 63903a5..b201d887 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -2750,8 +2750,11 @@ static void __pool_destroy(struct pool *pool)
 	dm_bio_prison_destroy(pool->prison);
 	dm_kcopyd_client_destroy(pool->copier);
 
-	if (pool->wq)
+	if (pool->wq) {
+		cancel_delayed_work(&pool->waker);
+		cancel_delayed_work(&pool->no_space_timeout);
 		destroy_workqueue(pool->wq);
+	}
 
 	if (pool->next_mapping)
 		mempool_free(pool->next_mapping, pool->mapping_pool);


* Re: corruption causing crash in __queue_work
  2015-12-14 15:31                 ` Mike Snitzer
@ 2015-12-14 20:11                   ` Nikolay Borisov
  2015-12-14 20:31                     ` Mike Snitzer
  0 siblings, 1 reply; 24+ messages in thread
From: Nikolay Borisov @ 2015-12-14 20:11 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Nikolay Borisov, Tejun Heo, Linux-Kernel@Vger. Kernel. Org,
	SiteGround Operations, Alasdair Kergon,
	device-mapper development

On Mon, Dec 14, 2015 at 5:31 PM, Mike Snitzer <snitzer@redhat.com> wrote:
> On Mon, Dec 14 2015 at  3:41P -0500,
> Nikolay Borisov <kernel@kyup.com> wrote:
>
>> Had another poke at the backtrace that is produced and here what the
>> delayed_work looks like:
>>
>> crash> struct delayed_work ffff88036772c8c0
>> struct delayed_work {
>>   work = {
>>     data = {
>>       counter = 1537
>>     },
>>     entry = {
>>       next = 0xffff88036772c8c8,
>>       prev = 0xffff88036772c8c8
>>     },
>>     func = 0xffffffffa0211a30 <do_waker>
>>   },
>>   timer = {
>>     entry = {
>>       next = 0x0,
>>       prev = 0xdead000000200200
>>     },
>>     expires = 4349463655,
>>     base = 0xffff88047fd2d602,
>>     function = 0xffffffff8106da40 <delayed_work_timer_fn>,
>>     data = 18446612146934696128,
>>     slack = -1,
>>     start_pid = -1,
>>     start_site = 0x0,
>>     start_comm =
>> "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"
>>   },
>>   wq = 0xffff88030cf65400,
>>   cpu = 21
>> }
>>
>> From this it seems that the timer is also cancelled/expired judging by
>> the values in timer -> entry. But then again in dm-thin the pool is
>> first suspended, which implies the following functions were called:
>>
>> cancel_delayed_work(&pool->waker);
>> cancel_delayed_work(&pool->no_space_timeout);
>> flush_workqueue(pool->wq);
>>
>> so at that point dm-thin's workqueue should be empty and it shouldn't be
>> possible to queue any more delayed work. But the crashdump clearly shows
>> that the opposite is happening. So far all of this points to a race
>> condition and inserting some sleeps after umount and after vgchange -Kan
>> (command to disable volume group and suspend, so the cancel_delayed_work
>> is invoked) seems to reduce the frequency of crashes, though it doesn't
>> eliminate them.
>
> 'vgchange -Kan' doesn't suspend the pool before it destroys the device.
> So the cancel_delayed_work()s you referenced aren't applicable.

Hm, but doesn't it in fact suspend it before destroying it? Using the
following simple stap script seems to prove so:


probe module("dm_thin_pool").function("__pool_destroy") {
    print("=========__pool_destroy======");
    print_backtrace();

}

probe module("dm_thin_pool").function("pool_postsuspend") {

    printf("==== POOL_POSTSUSPEND =====\n");
    print_backtrace();

}

Produces the following backtraces:

==== POOL_POSTSUSPEND =====
 0xffffffffa033ad40 : pool_postsuspend+0x0/0x50 [dm_thin_pool]
 0xffffffff8148a5bf : suspend_targets+0x3f/0x90 [kernel]
 0xffffffff8148a668 : dm_table_postsuspend_targets+0x18/0x20 [kernel]
 0xffffffff814886dc : __dm_destroy+0x17c/0x190 [kernel]
 0xffffffff81488723 : dm_destroy+0x13/0x20 [kernel]
 0xffffffff8148f55a : dev_remove+0xfa/0x130 [kernel]
 0xffffffff8148fe94 : ctl_ioctl+0x1d4/0x2e0 [kernel]
 0xffffffff8148ffb3 : dm_ctl_ioctl+0x13/0x20 [kernel]
 0xffffffff811af3f3 : do_vfs_ioctl+0x73/0x380 [kernel]
 0xffffffff811af792 : sys_ioctl+0x92/0xa0 [kernel]
 0xffffffff8159ae2e : entry_SYSCALL_64_fastpath+0x12/0x71 [kernel]
=========__pool_destroy======
 0xffffffffa033ae20 : __pool_destroy+0x0/0x110 [dm_thin_pool]
 0xffffffffa033af61 : __pool_dec+0x31/0x50 [dm_thin_pool]
 0xffffffffa033afae : pool_dtr+0x2e/0x70 [dm_thin_pool]
 0xffffffff8148c085 : dm_table_destroy+0x65/0x120 [kernel]
 0xffffffff8148868a : __dm_destroy+0x12a/0x190 [kernel]
 0xffffffff81488723 : dm_destroy+0x13/0x20 [kernel]
 0xffffffff8148f55a : dev_remove+0xfa/0x130 [kernel]
 0xffffffff8148fe94 : ctl_ioctl+0x1d4/0x2e0 [kernel]
 0xffffffff8148ffb3 : dm_ctl_ioctl+0x13/0x20 [kernel]
 0xffffffff811af3f3 : do_vfs_ioctl+0x73/0x380 [kernel]
 0xffffffff811af792 : sys_ioctl+0x92/0xa0 [kernel]
 0xffffffff8159ae2e : entry_SYSCALL_64_fastpath+0x12/0x71 [kernel]

These are produced when I run vgchange -Kan on a volume group. So in
__dm_destroy, before dm_table_destroy (which calls pool_dtr), the device
is checked to see whether it is suspended, and if not, dm core invokes
the pre/post suspend hooks, which should cause the workqueue to be
flushed and left in a quiescent state. No?

What am I missing?
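
For reference, the check I'm referring to is roughly this part of
__dm_destroy() in drivers/md/dm.c (sketched, not exact):

	if (!dm_suspended_md(md)) {
		/* not suspended yet: run the target pre/post suspend hooks */
		dm_table_presuspend_targets(map);
		dm_table_postsuspend_targets(map);	/* -> pool_postsuspend() */
	}
	...
	dm_table_destroy(__unbind(md));	/* -> pool_dtr() -> __pool_destroy() */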

>
> Can you try this patch?

I've scheduled some machines to go online with this patch and
will report back if it changes the situation. Thanks a lot!

>
> diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
> index 63903a5..b201d887 100644
> --- a/drivers/md/dm-thin.c
> +++ b/drivers/md/dm-thin.c
> @@ -2750,8 +2750,11 @@ static void __pool_destroy(struct pool *pool)
>         dm_bio_prison_destroy(pool->prison);
>         dm_kcopyd_client_destroy(pool->copier);
>
> -       if (pool->wq)
> +       if (pool->wq) {
> +               cancel_delayed_work(&pool->waker);
> +               cancel_delayed_work(&pool->no_space_timeout);
>                 destroy_workqueue(pool->wq);
> +       }
>
>         if (pool->next_mapping)
>                 mempool_free(pool->next_mapping, pool->mapping_pool);


* Re: corruption causing crash in __queue_work
  2015-12-14 20:11                   ` Nikolay Borisov
@ 2015-12-14 20:31                     ` Mike Snitzer
  2015-12-17 10:46                       ` Nikolay Borisov
  0 siblings, 1 reply; 24+ messages in thread
From: Mike Snitzer @ 2015-12-14 20:31 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: Tejun Heo, Linux-Kernel@Vger. Kernel. Org, SiteGround Operations,
	Alasdair Kergon, device-mapper development

On Mon, Dec 14 2015 at  3:11pm -0500,
Nikolay Borisov <kernel@kyup.com> wrote:

> On Mon, Dec 14, 2015 at 5:31 PM, Mike Snitzer <snitzer@redhat.com> wrote:
> > On Mon, Dec 14 2015 at  3:41P -0500,
> > Nikolay Borisov <kernel@kyup.com> wrote:
> >
> >> Had another poke at the backtrace that is produced and here what the
> >> delayed_work looks like:
> >>
> >> crash> struct delayed_work ffff88036772c8c0
> >> struct delayed_work {
> >>   work = {
> >>     data = {
> >>       counter = 1537
> >>     },
> >>     entry = {
> >>       next = 0xffff88036772c8c8,
> >>       prev = 0xffff88036772c8c8
> >>     },
> >>     func = 0xffffffffa0211a30 <do_waker>
> >>   },
> >>   timer = {
> >>     entry = {
> >>       next = 0x0,
> >>       prev = 0xdead000000200200
> >>     },
> >>     expires = 4349463655,
> >>     base = 0xffff88047fd2d602,
> >>     function = 0xffffffff8106da40 <delayed_work_timer_fn>,
> >>     data = 18446612146934696128,
> >>     slack = -1,
> >>     start_pid = -1,
> >>     start_site = 0x0,
> >>     start_comm =
> >> "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"
> >>   },
> >>   wq = 0xffff88030cf65400,
> >>   cpu = 21
> >> }
> >>
> >> From this it seems that the timer is also cancelled/expired judging by
> >> the values in timer -> entry. But then again in dm-thin the pool is
> >> first suspended, which implies the following functions were called:
> >>
> >> cancel_delayed_work(&pool->waker);
> >> cancel_delayed_work(&pool->no_space_timeout);
> >> flush_workqueue(pool->wq);
> >>
> >> so at that point dm-thin's workqueue should be empty and it shouldn't be
> >> possible to queue any more delayed work. But the crashdump clearly shows
> >> that the opposite is happening. So far all of this points to a race
> >> condition and inserting some sleeps after umount and after vgchange -Kan
> >> (command to disable volume group and suspend, so the cancel_delayed_work
> >> is invoked) seems to reduce the frequency of crashes, though it doesn't
> >> eliminate them.
> >
> > 'vgchange -Kan' doesn't suspend the pool before it destroys the device.
> > So the cancel_delayed_work()s you referenced aren't applicable.
> 
> Hm, but does it not in fact destroy it. Using the following simple
> stap script proves so:
> 
> 
> probe module("dm_thin_pool").function("__pool_destroy") {
>     print("=========__pool_destroy======");
>     print_backtrace();
> 
> }
> 
> probe module("dm_thin_pool").function("pool_postsuspend") {
> 
>     printf("==== POOL_POSTSUSPEND =====\n");
>     print_backtrace();
> 
> }
> 
> Produces the following backtraces:
> 
> ==== POOL_POSTSUSPEND =====
>  0xffffffffa033ad40 : pool_postsuspend+0x0/0x50 [dm_thin_pool]
>  0xffffffff8148a5bf : suspend_targets+0x3f/0x90 [kernel]
>  0xffffffff8148a668 : dm_table_postsuspend_targets+0x18/0x20 [kernel]
>  0xffffffff814886dc : __dm_destroy+0x17c/0x190 [kernel]
>  0xffffffff81488723 : dm_destroy+0x13/0x20 [kernel]
>  0xffffffff8148f55a : dev_remove+0xfa/0x130 [kernel]
>  0xffffffff8148fe94 : ctl_ioctl+0x1d4/0x2e0 [kernel]
>  0xffffffff8148ffb3 : dm_ctl_ioctl+0x13/0x20 [kernel]
>  0xffffffff811af3f3 : do_vfs_ioctl+0x73/0x380 [kernel]
>  0xffffffff811af792 : sys_ioctl+0x92/0xa0 [kernel]
>  0xffffffff8159ae2e : entry_SYSCALL_64_fastpath+0x12/0x71 [kernel]
> =========__pool_destroy====== 0xffffffffa033ae20 :
> __pool_destroy+0x0/0x110 [dm_thin_pool]
>  0xffffffffa033af61 : __pool_dec+0x31/0x50 [dm_thin_pool]
>  0xffffffffa033afae : pool_dtr+0x2e/0x70 [dm_thin_pool]
>  0xffffffff8148c085 : dm_table_destroy+0x65/0x120 [kernel]
>  0xffffffff8148868a : __dm_destroy+0x12a/0x190 [kernel]
>  0xffffffff81488723 : dm_destroy+0x13/0x20 [kernel]
>  0xffffffff8148f55a : dev_remove+0xfa/0x130 [kernel]
>  0xffffffff8148fe94 : ctl_ioctl+0x1d4/0x2e0 [kernel]
>  0xffffffff8148ffb3 : dm_ctl_ioctl+0x13/0x20 [kernel]
>  0xffffffff811af3f3 : do_vfs_ioctl+0x73/0x380 [kernel]
>  0xffffffff811af792 : sys_ioctl+0x92/0xa0 [kernel]
>  0xffffffff8159ae2e : entry_SYSCALL_64_fastpath+0x12/0x71 [kernel]
> 
> These appear when I run vgchange -Kan on a volume group. So in
> __dm_destroy, before dm_table_destroy (which calls pool_dtr), the device
> is checked to see whether it is suspended, and if not, dm core invokes
> the pre/post suspend hooks; this should leave the workqueue flushed and
> in a quiescent state. No?
> 
> What am I missing?

Nothing, clearly you're right!
 
> >
> > Can you try this patch?
> 
> I've scheduled some machines to go online with this patch and
> will report back if it changes the situation. Thanks a lot!

Shouldn't make any difference given the above.

But the fact that the suspend hooks are used during destroy (if the device
isn't already suspended) makes this report all the more bizarre.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: corruption causing crash in __queue_work
  2015-12-14 20:31                     ` Mike Snitzer
@ 2015-12-17 10:46                       ` Nikolay Borisov
  2015-12-17 15:33                         ` Tejun Heo
  0 siblings, 1 reply; 24+ messages in thread
From: Nikolay Borisov @ 2015-12-17 10:46 UTC (permalink / raw)
  To: Mike Snitzer, Tejun Heo
  Cc: Linux-Kernel@Vger. Kernel. Org, SiteGround Operations,
	Alasdair Kergon, device-mapper development



On 12/14/2015 10:31 PM, Mike Snitzer wrote:
> On Mon, Dec 14 2015 at  3:11pm -0500,
> Nikolay Borisov <kernel@kyup.com> wrote:
> 
>> On Mon, Dec 14, 2015 at 5:31 PM, Mike Snitzer <snitzer@redhat.com> wrote:
>>> On Mon, Dec 14 2015 at  3:41P -0500,
>>> Nikolay Borisov <kernel@kyup.com> wrote:
>>>
>>>> Had another poke at the backtrace that is produced and here what the
>>>> delayed_work looks like:
>>>>
>>>> crash> struct delayed_work ffff88036772c8c0
>>>> struct delayed_work {
>>>>   work = {
>>>>     data = {
>>>>       counter = 1537
>>>>     },
>>>>     entry = {
>>>>       next = 0xffff88036772c8c8,
>>>>       prev = 0xffff88036772c8c8
>>>>     },
>>>>     func = 0xffffffffa0211a30 <do_waker>
>>>>   },
>>>>   timer = {
>>>>     entry = {
>>>>       next = 0x0,
>>>>       prev = 0xdead000000200200
>>>>     },
>>>>     expires = 4349463655,
>>>>     base = 0xffff88047fd2d602,
>>>>     function = 0xffffffff8106da40 <delayed_work_timer_fn>,
>>>>     data = 18446612146934696128,
>>>>     slack = -1,
>>>>     start_pid = -1,
>>>>     start_site = 0x0,
>>>>     start_comm =
>>>> "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"
>>>>   },
>>>>   wq = 0xffff88030cf65400,
>>>>   cpu = 21
>>>> }
>>>>
>>>> From this it seems that the timer is also cancelled/expired judging by
>>>> the values in timer -> entry. But then again in dm-thin the pool is
>>>> first suspended, which implies the following functions were called:
>>>>
>>>> cancel_delayed_work(&pool->waker);
>>>> cancel_delayed_work(&pool->no_space_timeout);
>>>> flush_workqueue(pool->wq);
>>>>
>>>> so at that point dm-thin's workqueue should be empty and it shouldn't be
>>>> possible to queue any more delayed work. But the crashdump clearly shows
>>>> that the opposite is happening. So far all of this points to a race
>>>> condition and inserting some sleeps after umount and after vgchange -Kan
>>>> (command to disable volume group and suspend, so the cancel_delayed_work
>>>> is invoked) seems to reduce the frequency of crashes, though it doesn't
>>>> eliminate them.
>>>
>>> 'vgchange -Kan' doesn't suspend the pool before it destroys the device.
>>> So the cancel_delayed_work()s you referenced aren't applicable.
>>
>> Hm, but doesn't it in fact suspend it before destroying it? Using the
>> following simple stap script seems to show that it does:
>>
>>
>> probe module("dm_thin_pool").function("__pool_destroy") {
>>     print("=========__pool_destroy======");
>>     print_backtrace();
>>
>> }
>>
>> probe module("dm_thin_pool").function("pool_postsuspend") {
>>
>>     printf("==== POOL_POSTSUSPEND =====\n");
>>     print_backtrace();
>>
>> }
>>
>> Produces the following backtraces:
>>
>> ==== POOL_POSTSUSPEND =====
>>  0xffffffffa033ad40 : pool_postsuspend+0x0/0x50 [dm_thin_pool]
>>  0xffffffff8148a5bf : suspend_targets+0x3f/0x90 [kernel]
>>  0xffffffff8148a668 : dm_table_postsuspend_targets+0x18/0x20 [kernel]
>>  0xffffffff814886dc : __dm_destroy+0x17c/0x190 [kernel]
>>  0xffffffff81488723 : dm_destroy+0x13/0x20 [kernel]
>>  0xffffffff8148f55a : dev_remove+0xfa/0x130 [kernel]
>>  0xffffffff8148fe94 : ctl_ioctl+0x1d4/0x2e0 [kernel]
>>  0xffffffff8148ffb3 : dm_ctl_ioctl+0x13/0x20 [kernel]
>>  0xffffffff811af3f3 : do_vfs_ioctl+0x73/0x380 [kernel]
>>  0xffffffff811af792 : sys_ioctl+0x92/0xa0 [kernel]
>>  0xffffffff8159ae2e : entry_SYSCALL_64_fastpath+0x12/0x71 [kernel]
>> =========__pool_destroy====== 0xffffffffa033ae20 :
>> __pool_destroy+0x0/0x110 [dm_thin_pool]
>>  0xffffffffa033af61 : __pool_dec+0x31/0x50 [dm_thin_pool]
>>  0xffffffffa033afae : pool_dtr+0x2e/0x70 [dm_thin_pool]
>>  0xffffffff8148c085 : dm_table_destroy+0x65/0x120 [kernel]
>>  0xffffffff8148868a : __dm_destroy+0x12a/0x190 [kernel]
>>  0xffffffff81488723 : dm_destroy+0x13/0x20 [kernel]
>>  0xffffffff8148f55a : dev_remove+0xfa/0x130 [kernel]
>>  0xffffffff8148fe94 : ctl_ioctl+0x1d4/0x2e0 [kernel]
>>  0xffffffff8148ffb3 : dm_ctl_ioctl+0x13/0x20 [kernel]
>>  0xffffffff811af3f3 : do_vfs_ioctl+0x73/0x380 [kernel]
>>  0xffffffff811af792 : sys_ioctl+0x92/0xa0 [kernel]
>>  0xffffffff8159ae2e : entry_SYSCALL_64_fastpath+0x12/0x71 [kernel]
>>
>> These appear when I run vgchange -Kan on a volume group. So in
>> __dm_destroy, before dm_table_destroy (which calls pool_dtr), the device
>> is checked to see whether it is suspended, and if not, dm core invokes
>> the pre/post suspend hooks; this should leave the workqueue flushed and
>> in a quiescent state. No?
>>
>> What am I missing?
> 
> Nothing, clearly you're right!
>  
>>>
>>> Can you try this patch?
>>
>> I've scheduled some machines to go online with this patch and
>> will report back if it changes the situation. Thanks a lot!
> 
> Shouldn't make any difference given the above.
> 
> But the fact that the suspend hooks are used during destroy (if the device
> isn't already suspended) makes this report all the more bizarre.

I applied the following patch:

diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index 493c38e08bd2..ccbbf7823cf3 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -3506,8 +3506,8 @@ static void pool_postsuspend(struct dm_target *ti)
        struct pool_c *pt = ti->private;
        struct pool *pool = pt->pool;

-       cancel_delayed_work(&pool->waker);
-       cancel_delayed_work(&pool->no_space_timeout);
+       cancel_delayed_work_sync(&pool->waker);
+       cancel_delayed_work_sync(&pool->no_space_timeout);
        flush_workqueue(pool->wq);
        (void) commit(pool);
 }

And this seems to have resolved the crashes. For the past 24 hours I
haven't seen a single server crash whereas before at least 3-5 servers
would crash.

Given that, it seems like a race condition between destroying the
workqueue from dm-thin and cancelling all the delayed work.

Tejun, I've looked at cancel_delayed_work/cancel_delayed_work_sync and
they both call try_to_grab_pending and then their function diverges. Is
it possible that there is a latent race condition between canceling the
delayed work and the subsequent re-scheduling of the work item?
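
To make the suspected window concrete, here is a minimal self-contained
sketch of the self-rearming pattern the waker uses. This is not the actual
dm-thin code: the names (waker_fn, waker_demo) are made up, and the suspend
and destroy steps are compressed into single teardown functions purely for
illustration.

#include <linux/jiffies.h>
#include <linux/module.h>
#include <linux/workqueue.h>

static struct workqueue_struct *wq;
static struct delayed_work waker;

/* Re-arms itself every second, the same pattern as dm-thin's do_waker(). */
static void waker_fn(struct work_struct *ws)
{
	/* ... periodic work would go here ... */
	queue_delayed_work(wq, &waker, HZ);
}

/*
 * Racy teardown: if waker_fn() is executing when cancel_delayed_work()
 * runs, nothing is pending, so the cancel is a no-op. waker_fn() may then
 * re-arm the timer outside the scope of flush_workqueue(), and that timer
 * later fires delayed_work_timer_fn() -> __queue_work() against an
 * already-destroyed workqueue.
 */
static void __maybe_unused teardown_racy(void)
{
	cancel_delayed_work(&waker);
	flush_workqueue(wq);
	destroy_workqueue(wq);
}

/*
 * Safe teardown: cancel_delayed_work_sync() waits for a running instance
 * and keeps it from re-queueing itself while the cancel is in progress.
 */
static void teardown_safe(void)
{
	cancel_delayed_work_sync(&waker);
	flush_workqueue(wq);
	destroy_workqueue(wq);
}

static int __init waker_demo_init(void)
{
	wq = alloc_workqueue("waker_demo", WQ_MEM_RECLAIM, 0);
	if (!wq)
		return -ENOMEM;
	INIT_DELAYED_WORK(&waker, waker_fn);
	queue_delayed_work(wq, &waker, HZ);
	return 0;
}

static void __exit waker_demo_exit(void)
{
	teardown_safe();
}

module_init(waker_demo_init);
module_exit(waker_demo_exit);
MODULE_LICENSE("GPL");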


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: corruption causing crash in __queue_work
  2015-12-17 10:46                       ` Nikolay Borisov
@ 2015-12-17 15:33                         ` Tejun Heo
  2015-12-17 15:43                           ` Nikolay Borisov
  0 siblings, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2015-12-17 15:33 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: Mike Snitzer, Linux-Kernel@Vger. Kernel. Org,
	SiteGround Operations, Alasdair Kergon,
	device-mapper development

Hello, Nikolay.

On Thu, Dec 17, 2015 at 12:46:10PM +0200, Nikolay Borisov wrote:
> diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
> index 493c38e08bd2..ccbbf7823cf3 100644
> --- a/drivers/md/dm-thin.c
> +++ b/drivers/md/dm-thin.c
> @@ -3506,8 +3506,8 @@ static void pool_postsuspend(struct dm_target *ti)
>         struct pool_c *pt = ti->private;
>         struct pool *pool = pt->pool;
> 
> -       cancel_delayed_work(&pool->waker);
> -       cancel_delayed_work(&pool->no_space_timeout);
> +       cancel_delayed_work_sync(&pool->waker);
> +       cancel_delayed_work_sync(&pool->no_space_timeout);
>         flush_workqueue(pool->wq);
>         (void) commit(pool);
>  }
> 
> And this seems to have resolved the crashes. For the past 24 hours I
> haven't seen a single server crash whereas before at least 3-5 servers
> would crash.

So, that's an obvious bug on dm-thin side.

> Given that, it seems like a race condition between destroying the
> workqueue from dm-thin and cancelling all the delayed work.
> 
> Tejun, I've looked at cancel_delayed_work/cancel_delayed_work_sync and
> they both call try_to_grab_pending and then their function diverges. Is
> it possible that there is a latent race condition between canceling the
> delayed work and the subsequent re-scheduling of the work item?

It's just the wrong variant being used.  cancel_delayed_work() doesn't
guarantee that the work item isn't running on return.  If the work
item was running and the workqueue is destroyed afterwards, it may end
up trying to requeue itself on a destroyed workqueue.
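
For reference, the requeue comes back through the delayed work timer. In
that era's kernel/workqueue.c the timer callback is roughly the following
(quoted from memory, so treat the details as approximate):

void delayed_work_timer_fn(unsigned long __data)
{
	struct delayed_work *dwork = (struct delayed_work *)__data;

	/* should have been called from irqsafe timer with irq already off */
	__queue_work(dwork->cpu, dwork->wq, &dwork->work);
}

So a timer armed by a late self-requeue simply hands the stale dwork->wq to
__queue_work() when it fires, which is exactly the path in the original oops.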

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: corruption causing crash in __queue_work
  2015-12-17 15:33                         ` Tejun Heo
@ 2015-12-17 15:43                           ` Nikolay Borisov
  2015-12-17 15:50                             ` Tejun Heo
  0 siblings, 1 reply; 24+ messages in thread
From: Nikolay Borisov @ 2015-12-17 15:43 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Mike Snitzer, Linux-Kernel@Vger. Kernel. Org,
	SiteGround Operations, Alasdair Kergon,
	device-mapper development



On 12/17/2015 05:33 PM, Tejun Heo wrote:
> Hello, Nikolay.
> 
> On Thu, Dec 17, 2015 at 12:46:10PM +0200, Nikolay Borisov wrote:
>> diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
>> index 493c38e08bd2..ccbbf7823cf3 100644
>> --- a/drivers/md/dm-thin.c
>> +++ b/drivers/md/dm-thin.c
>> @@ -3506,8 +3506,8 @@ static void pool_postsuspend(struct dm_target *ti)
>>         struct pool_c *pt = ti->private;
>>         struct pool *pool = pt->pool;
>>
>> -       cancel_delayed_work(&pool->waker);
>> -       cancel_delayed_work(&pool->no_space_timeout);
>> +       cancel_delayed_work_sync(&pool->waker);
>> +       cancel_delayed_work_sync(&pool->no_space_timeout);
>>         flush_workqueue(pool->wq);
>>         (void) commit(pool);
>>  }
>>
>> And this seems to have resolved the crashes. For the past 24 hours I
>> haven't seen a single server crash whereas before at least 3-5 servers
>> would crash.
> 
> So, that's an obvious bug on dm-thin side.


Mike, if you are OK with this, I will submit a proper patch.
> 
>> Given that, it seems like a race condition between destroying the
>> workqueue from dm-thin and cancelling all the delayed work.
>>
>> Tejun, I've looked at cancel_delayed_work/cancel_delayed_work_sync and
>> they both call try_to_grab_pending and then their function diverges. Is
>> it possible that there is a latent race condition between canceling the
>> delayed work and the subsequent re-scheduling of the work item?
> 
> It's just the wrong variant being used.  cancel_delayed_work() doesn't
> guarantee that the work item isn't running on return.  If the work
> item was running and the workqueue is destroyed afterwards, it may end
> up trying to requeue itself on a destroyed workqueue.

Right, but my initial understanding was that canceling the delayed work
and then issuing flush_workqueue would act the same way as calling
cancel_delayed_work_sync with respect to this particular delayed item, no?



> 
> Thanks.
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: corruption causing crash in __queue_work
  2015-12-17 15:43                           ` Nikolay Borisov
@ 2015-12-17 15:50                             ` Tejun Heo
  2015-12-17 17:15                               ` Mike Snitzer
  0 siblings, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2015-12-17 15:50 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: Mike Snitzer, Linux-Kernel@Vger. Kernel. Org,
	SiteGround Operations, Alasdair Kergon,
	device-mapper development

Hello, Nikolay.

On Thu, Dec 17, 2015 at 05:43:12PM +0200, Nikolay Borisov wrote:
> Right, but my initial understanding was that canceling the delayed work
> and then issuing flush_workqueue would act the same way as calling
> cancel_delayed_work_sync with respect to this particular delayed item, no?

Not necessarily.  cancel_delayed_work() cancels whatever is currently
pending.  flush_workqueue() flushes whatever is pending and in flight
at the time of invocation.  Imagine the following scenario.

1. Work item is running but hasn't requeued itself yet.

2. cancel_delayed_work_sync() doesn't do anything as it's not pending.

3. flush_workqueue() starts and waits for the running instance.

4. The running instance requeues itself but this isn't included in the
   scope of the above flush_workqueue().

5. flush_workqueue() returns when the work item is finished (but it's
   still queued).
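
Concretely, mapping those steps onto the pre-patch pool_postsuspend()
sequence quoted earlier in the thread (annotated here only to line the
steps up with the calls):

	cancel_delayed_work(&pool->waker);	/* 1+2: do_waker may be mid-run, so nothing
						 *      is pending and the cancel is a no-op */
	cancel_delayed_work(&pool->no_space_timeout);
	flush_workqueue(pool->wq);		/* 3+5: waits for the running do_waker, then returns */
						/* 4:   meanwhile do_waker has re-armed pool->waker;
						 *      that timer fires after the pool's wq is gone */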

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: corruption causing crash in __queue_work
  2015-12-17 15:50                             ` Tejun Heo
@ 2015-12-17 17:15                               ` Mike Snitzer
  2015-12-19 13:34                                 ` Nikolay Borisov
  0 siblings, 1 reply; 24+ messages in thread
From: Mike Snitzer @ 2015-12-17 17:15 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nikolay Borisov, Linux-Kernel@Vger. Kernel. Org,
	SiteGround Operations, Alasdair Kergon,
	device-mapper development

On Thu, Dec 17 2015 at 10:50am -0500,
Tejun Heo <tj@kernel.org> wrote:

> Hello, Nikolay.
> 
> On Thu, Dec 17, 2015 at 05:43:12PM +0200, Nikolay Borisov wrote:
> > Right, but my initial understanding was that canceling the delayed work
> > and then issuing flush_workqueue would act the same way as calling
> > cancel_delayed_work_sync with respect to this particular delayed item, no?
> 
> Not necessarily.  cancel_delayed_work() cancels whatever is currently
> pending.  flush_workqueue() flushes whatever is pending and in flight
> at the time of invocation.  Imagine the following scenario.
> 
> 1. Work item is running but hasn't requeued itself yet.
> 
> 2. cancel_delayed_work_sync() doesn't do anything as it's not pending.

Did you mean cancel_delayed_work()?

> 3. flush_workqueue() starts and waits for the running instance.
> 
> 4. The running instance requeues itself but this isn't included in the
>    scope of the above flush_workqueue().
> 
> 5. flush_workqueue() returns when the work item is finished (but it's
>    still queued).

Hmm, the comment above cancel_delayed_work() is pretty misleading then:

 * Note:
 * The work callback function may still be running on return, unless
 * it returns %true and the work doesn't re-arm itself.  Explicitly flush or
 * use cancel_delayed_work_sync() to wait on it.

Given dm-thin.c:pool_postsuspend() does:

        cancel_delayed_work(&pool->waker);
        cancel_delayed_work(&pool->no_space_timeout);
        flush_workqueue(pool->wq);

I wouldn't have thought cancel_delayed_work_sync() was needed.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: corruption causing crash in __queue_work
  2015-12-17 17:15                               ` Mike Snitzer
@ 2015-12-19 13:34                                 ` Nikolay Borisov
  2015-12-21 21:44                                   ` Tejun Heo
  0 siblings, 1 reply; 24+ messages in thread
From: Nikolay Borisov @ 2015-12-19 13:34 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Mike Snitzer, device-mapper development,
	Linux-Kernel@Vger. Kernel. Org, SiteGround Operations,
	Nikolay Borisov, Alasdair Kergon



On Thursday, December 17, 2015, Mike Snitzer <snitzer@redhat.com> wrote:

> On Thu, Dec 17 2015 at 10:50am -0500,
> Tejun Heo <tj@kernel.org> wrote:
>
> > Hello, Nikolay.
> >
> > On Thu, Dec 17, 2015 at 05:43:12PM +0200, Nikolay Borisov wrote:
> > > Right, but my initial understanding was that canceling the delayed work
> > > and then issuing flush_workqueue would act the same way as calling
> > > cancel_delayed_work_sync with respect to this particular delayed item, no?
> >
> > Not necessarily.  cancel_delayed_work() cancels whatever is currently
> > pending.  flush_workqueue() flushes whatever is pending and in flight
> > at the time of invocation.  Imagine the following scenario.
> >
> > 1. Work item is running but hasn't requeued itself yet.
> >
> > 2. cancel_delayed_work_sync() doesn't do anything as it's not pending.
>
> Did you mean cancel_delayed_work()?
>
> > 3. flush_workqueue() starts and waits for the running instance.
> >
> > 4. The running instance requeues itself but this isn't included in the
> >    scope of the above flush_workqueue().
> >
> > 5. flush_workqueue() returns when the work item is finished (but it's
> >    still queued).
>
> Hmm, the comment above cancel_delayed_work() is pretty misleading then:
>
>  * Note:
>  * The work callback function may still be running on return, unless
>  * it returns %true and the work doesn't re-arm itself.  Explicitly flush or
>  * use cancel_delayed_work_sync() to wait on it.
>
> Given dm-thin.c:pool_postsuspend() does:
>
>         cancel_delayed_work(&pool->waker);
>         cancel_delayed_work(&pool->no_space_timeout);
>         flush_workqueue(pool->wq);
>
> I wouldn't have thought cancel_delayed_work_sync() was needed.
>

Ping, as Tejun might have missed this email. I'm also interested in knowing
the logic behind the comment.




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: corruption causing crash in __queue_work
  2015-12-19 13:34                                 ` Nikolay Borisov
@ 2015-12-21 21:44                                   ` Tejun Heo
  2015-12-21 21:45                                     ` Tejun Heo
  0 siblings, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2015-12-21 21:44 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: Nikolay Borisov, Linux-Kernel@Vger. Kernel. Org,
	SiteGround Operations, Alasdair Kergon,
	device-mapper development, Mike Snitzer

On Sat, Dec 19, 2015 at 03:34:45PM +0200, Nikolay Borisov wrote:
> Ping as Tejun might have missed this email. I'm also interested in knowing
> the logic behind the comment.

Didn't I already reply to that?

http://thread.gmane.org/gmane.linux.kernel/2104051

-- 
tejun

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: corruption causing crash in __queue_work
  2015-12-21 21:44                                   ` Tejun Heo
@ 2015-12-21 21:45                                     ` Tejun Heo
  0 siblings, 0 replies; 24+ messages in thread
From: Tejun Heo @ 2015-12-21 21:45 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: Nikolay Borisov, Linux-Kernel@Vger. Kernel. Org,
	SiteGround Operations, Alasdair Kergon,
	device-mapper development, Mike Snitzer

On Mon, Dec 21, 2015 at 4:44 PM, Tejun Heo <tj@kernel.org> wrote:
> On Sat, Dec 19, 2015 at 03:34:45PM +0200, Nikolay Borisov wrote:
>> Ping as Tejun might have missed this email. I'm also interested in knowing
>> the logic behind the comment.
>
> Didn't I already reply to that?
>
> http://thread.gmane.org/gmane.linux.kernel/2104051

Oops, wrong link.

http://thread.gmane.org/gmane.linux.kernel/2104051/focus=2110844

-- 
tejun

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2015-12-21 21:45 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-09 12:08 corruption causing crash in __queue_work Nikolay Borisov
2015-12-09 16:08 ` Tejun Heo
2015-12-09 16:23   ` Nikolay Borisov
2015-12-09 16:27     ` Tejun Heo
2015-12-10  9:28       ` Nikolay Borisov
2015-12-10 15:29         ` Tejun Heo
2015-12-11 15:57           ` Nikolay Borisov
2015-12-11 17:08             ` Tejun Heo
2015-12-11 18:00               ` Nikolay Borisov
2015-12-11 19:14                 ` Mike Snitzer
2015-12-12 11:49                   ` Nikolay Borisov
2015-12-14  8:41               ` Nikolay Borisov
2015-12-14  8:41                 ` Nikolay Borisov
2015-12-14 15:31                 ` Mike Snitzer
2015-12-14 20:11                   ` Nikolay Borisov
2015-12-14 20:31                     ` Mike Snitzer
2015-12-17 10:46                       ` Nikolay Borisov
2015-12-17 15:33                         ` Tejun Heo
2015-12-17 15:43                           ` Nikolay Borisov
2015-12-17 15:50                             ` Tejun Heo
2015-12-17 17:15                               ` Mike Snitzer
2015-12-19 13:34                                 ` Nikolay Borisov
2015-12-21 21:44                                   ` Tejun Heo
2015-12-21 21:45                                     ` Tejun Heo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.