Dear stable kernel contributors,
I observed a futex kernel crash while using a navigation app on a Broxton device flashed with a normal 4.9.x kernel.
The futex crash details are given below.
{{
<1>[ 1383.591633] Time of kernel crash: (2021-02-16 12:04:19)
<1>[ 1383.597480] BUG: unable to handle kernel NULL pointer dereference at (null)
<1>[ 1383.606247] IP: [<ffffffffa211c271>] futex_wake+0xe1/0x180
<4>[ 1383.612386] PGD 130f62067
<4>[ 1383.615209] PUD 130f61067
<4>[ 1383.618230] PMD 0
<4>[ 1383.620275]
<4>[ 1383.621926] Oops: 0000 [#1] PREEMPT SMP
<4>[ 1383.626211] Modules linked in: bcmdhd(O) sxmio(C) rfkill_gpio cfg80211 ehset dwc3_pci dwc3 ishtp_tty_client dabridge camera_status mei_me anc_ipc igb_avb(O) mei xhci_pci xhci_hcd intel_ish_ipc intel_ishtp snd_soc_bxt_ivi_ull trusty_timer trusty_wall trusty_log trusty_virtio trusty_ipc dcsd_ts trusty_mem cyttsp6_i2c snd_soc_skl trusty snd_soc_skl_ipc snd_soc_sst_ipc cyttsp6_device_access snd_soc_sst_dsp snd_soc_sst_acpi virtio_ring snd_soc_sst_match snd_hda_ext_core cyttsp6_debug snd_hda_core dcsd_display virtio cyttsp6 [last unloaded: bcmdhd]
<4>[ 1383.680139] CPU: 2 PID: 7292 Comm: Thread-48 Tainted: G U C O 4.9.232-quilt-2e5dc0ac-g33302ae #1
<4>[ 1383.690832] task: ffff8cf005907040 task.stack: ffff9e25a64a0000
<4>[ 1383.697445] RIP: 0010:[<ffffffffa211c271>] [<ffffffffa211c271>] futex_wake+0xe1/0x180
<4>[ 1383.706302] RSP: 0018:ffff9e25a64a3d58 EFLAGS: 00010287
<4>[ 1383.712234] RAX: 000079068685e000 RBX: 0000000000000000 RCX: ffff9e258eb33cd8
<4>[ 1383.720196] RDX: ffffffffffffffe8 RSI: ffff9e258eb33cc0 RDI: 0000000000000000
<4>[ 1383.728165] RBP: ffff9e25a64a3dc0 R08: ffff8cf0b7c5cac8 R09: 0000000000000000
<4>[ 1383.736137] R10: 000000007fffffff R11: 0000000000000000 R12: ffff9e25a64a3d68
<4>[ 1383.744108] R13: 00000000ffffffff R14: 000000007fffffff R15: ffff8cf0b7c5cac4
<4>[ 1383.752082] FS: 0000790670203588(0000) GS:ffff8cf0bfd00000(0000) knlGS:000079066c642a00
<4>[ 1383.761125] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 1383.767556] CR2: 0000000000000000 CR3: 0000000130f63000 CR4: 00000000003406f0
<4>[ 1383.775530] Stack:
<4>[ 1383.777772] ffffffffa20cd4d1 ffff8cf0b7c5cac0 0000000000000001 ffff9e25a64a3d68
<4>[ 1383.786042] ffff8cf04c4d70c0 000079068685e000 0000000000000280 3d49d5e64c9b1e3b
<4>[ 1383.794328] 0000000000000000 0000000000000000 000079068685e280 000000007fffffff
<4>[ 1383.802614] Call Trace:
<4>[ 1383.805345] [<ffffffffa20cd4d1>] ? ttwu_do_wakeup+0xd1/0x100
<4>[ 1383.811764] [<ffffffffa211e638>] do_futex+0x658/0xbf0
<4>[ 1383.817506] [<ffffffffa214496d>] ? __seccomp_filter+0x6d/0x290
<4>[ 1383.824122] [<ffffffffa211ed0d>] SyS_futex+0x13d/0x190
<4>[ 1383.829960] [<ffffffffa200204e>] do_syscall_64+0x6e/0xe0
<4>[ 1383.835993] [<ffffffffa2a95220>] entry_SYSCALL_64_after_swapgs+0x5d/0xd7
<4>[ 1383.843578] Code: 04 48 89 45 a0 4c 89 ff e8 8d 8d 97 00 48 8b 45 a0 48 8b 48 08 4c 8d 40 08 48 8b 39 48 8d 71 e8 49 39 c8 48 8d 57 e8 75 16 eb 6a <48> 8b 4a 18 48 8d 42 18 48 89 d6 4c 39 c0 48 8d 51 e8 74 56 48
<1>[ 1383.865005] RIP [<ffffffffa211c271>] futex_wake+0xe1/0x180
<4>[ 1383.871238] RSP <ffff9e25a64a3d58>
<4>[ 1383.875122] CR2: 0000000000000000
}}
Using GDB, I identified the crash location in the code as shown below.
{{
(gdb) list *(futex_wake+0xe1)
0xffffffff812cce51 is in futex_wake (../../../../../../kernel/bxt/kernel/futex.c:1445).
1440
1441 ret = get_futex_key(uaddr, flags & FLAGS_SHARED, &key, VERIFY_READ);
1442 if (unlikely(ret != 0))
1443 goto out;
1444
1445 hb = hash_futex(&key); // crash in hash_futex() execution
1446
1447 /* Make sure we really have tasks to wakeup */
1448 if (!hb_waiters_pending(hb))
1449 goto out_put_key;
(gdb)
}}
I have not made any changes to the futex code.
I checked git.kernel.org and confirmed that the commits up to the commit ID below are present in the kernel source that produced the above crash.
{{
}}
No real-time (RT) kernel config is enabled; the above crash was seen on the normal 4.9 kernel. The disassembler output is as follows.
{{
(gdb) disassemble /s futex_wake+0xe1 // 0xe1 is 225 in decimal.
….
1445 hb = hash_futex(&key);
   0xffffffff812cce47 <+215>: lea -0x68(%rbp),%rdi
   0xffffffff812cce4b <+219>: callq 0xffffffff812ca980 <hash_futex>
   0xffffffff812cce50 <+224>: lea -0x28(%rbp),%r15
../../../../../../kernel/bxt/include/linux/compiler.h:
264 __READ_ONCE_SIZE;
   0xffffffff812cce54 <+228>: mov %rax,%rdx
   0xffffffff812cce57 <+231>: shr $0x3,%rdx
   0xffffffff812cce5b <+235>: movzbl (%rdx,%r13,1),%ecx
..
}}
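The offset arithmetic behind this disassembly can be checked with a few lines. The sketch below uses only the addresses printed by gdb above; it derives the start of futex_wake from the listed address and the 0xe1 offset, then checks whether offset 225 lands on an instruction boundary.

```python
# Addresses copied from the gdb output above.
LISTED_ADDR = 0xffffffff812cce51   # address gdb resolved for futex_wake+0xe1
FAULT_OFFSET = 0xe1                # 225 in decimal
INSN_ADDRS = [
    0xffffffff812cce47,  # lea   -0x68(%rbp),%rdi
    0xffffffff812cce4b,  # callq hash_futex
    0xffffffff812cce50,  # lea   -0x28(%rbp),%r15
    0xffffffff812cce54,  # mov   %rax,%rdx
    0xffffffff812cce57,  # shr   $0x3,%rdx
    0xffffffff812cce5b,  # movzbl (%rdx,%r13,1),%ecx
]

# Start of futex_wake in the image that gdb has loaded.
base = LISTED_ADDR - FAULT_OFFSET

# Offset of each listed instruction relative to the function start.
offsets = [addr - base for addr in INSN_ADDRS]

# Does the faulting offset coincide with the start of an instruction?
on_boundary = FAULT_OFFSET in offsets

# Start offset of the instruction that contains the faulting offset.
containing = max(o for o in offsets if o <= FAULT_OFFSET)

print("function base: 0x%x" % base)
print("instruction offsets:", offsets)
print("offset %d on an instruction boundary: %s" % (FAULT_OFFSET, on_boundary))
print("offset %d lies inside the instruction at +%d" % (FAULT_OFFSET, containing))
```

In this listing, offset 225 falls inside the instruction that starts at +224 rather than at the start of an instruction. One possible explanation, which is an assumption and not something the log confirms, is that the vmlinux loaded in gdb was not built with the same configuration as the kernel that crashed, in which case futex_wake+0xe1 would map to a different source line in the crashing kernel.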
In the disassembly above, decimal offset 225 (0xe1) appears to point at the READ_ONCE() macro (__READ_ONCE_SIZE in compiler.h) that follows the call to hash_futex(). Was the crash triggered by one of the functions nested in hash_futex()?
Could you please provide input for further analysis of this crash? The crash log is attached for reference.
Regards
Koteswara