[PATCH] x86/mce: fix mce_restart() race with CPU hotplug operation

From: Ethan Zhao @ 2015-04-29 15:04 UTC
  To: tony.luck, bp, tglx
  Cc: mingo, hpa, x86, linux-edac, linux-kernel, ethan.kernel,
	tim.uglow, Ethan Zhao

While testing CPU hotplug together with MCE, we ran the following two scripts:

script 1:

 for i in {1..30}; do
   while :; do
     ((a=$RANDOM%160))
     echo 0 >> /sys/devices/system/cpu/cpu${i}/online
     echo 1 >> /sys/devices/system/cpu/cpu${i}/online
   done &
 done

script 2:

 while :; do
   for i in /sys/devices/system/machinecheck/machinecheck*/check_interval; do
     echo 1 >> "$i"
   done
 done

The kernel panicked with the following call trace:

------------[ cut here ]------------
 kernel BUG at kernel/timer.c:929!
 invalid opcode: 0000 [#1] SMP
 Modules linked in: fuse tun coretemp acpi_cpufreq mperf freq_table
 intel_powerclampsmpboot: CPU 29 is now offline
  kvm_intel kvm crc32c_intel ghash_clmulni_intel aesni_intel xts aes_x86_64
 lrw gf128mul ablk_helper cryptd iTCO_wdt iTCO_vendor_support ses microcode
 pcspkr enclosure i2c_i801 i2c_core lpc_ich i7core_edac mfd_core edac_core
 shpchp ext3 mbcache jbd sd_mod crc_t10dif ixgbe ptp igb pps_core ahci libahci
 dca megaraid_sas hwmon ipv6 autofs4
 CPU 101
 Pid: 0, comm: swapper/101 Tainted: G        W    3.8.13
 #2 Oracle Corporation  Sun Fire X4800 M2 /
 RIP: 0010:[<ffffffff8106bb92>]  [<ffffffff8106bb92>] add_timer_on+0xe2/0xf0
 RSP: 0000:ffff88303f843de8  EFLAGS: 00010282
 RAX: 0000000000000000 RBX: ffff88303f84c920 RCX: 000000011eb1d709
 RDX: ffff88303f840000 RSI: 0000000000000065 RDI: ffff88303f84c920
 RBP: ffff88303f843e18 R08: 000000011eb1d03a R09: ffff88303f843d68
 R10: ffff88303f843d6c R11: 0000000000000006 R12: 00000000000007d0
 R13: ffff883029710000 R14: 0000000000000065 R15: 0000000000000066
 FS:  0000000000000000(0000) GS:ffff88303f840000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 00007f7e1b7b7000 CR3: 0000000001886000 CR4: 00000000000007e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process swapper/101 (pid: 0, threadinfo ffff881028f94000, task
 ffff881028f92500)
 Stack:
  0000000001000c18 ffff88303f84c920 00000000000007d0 0000000000000100
  ffffffff81030bb0 0000000000000066 ffff88303f843e38 ffffffff81030c44
  ffff88303f88c920 ffff88303f88c920 ffff88303f843e70 ffffffff8106acbb
 Call Trace:
  <IRQ>
  [<ffffffff81030bb0>] ? mce_cpu_restart+0x40/0x40
  [<ffffffff81030c44>] mce_timer_fn+0x94/0x130
  [<ffffffff8106acbb>] call_timer_fn+0x3b/0x110
  [<ffffffff81030bb0>] ? mce_cpu_restart+0x40/0x40
  [<ffffffff8106b8dd>] run_timer_softirq+0x1cd/0x2b0
  [<ffffffff81063108>] __do_softirq+0xd8/0x210
  [<ffffffff8144f7c0>] ? intel_pstate_timer_func+0x3a0/0x3a0
  [<ffffffff8157e15c>] call_softirq+0x1c/0x30
  [<ffffffff81017185>] do_softirq+0x65/0xa0
  [<ffffffff810633b5>] irq_exit+0xa5/0xb0
  [<ffffffff8157eede>] smp_apic_timer_interrupt+0x6e/0x9c
  [<ffffffff8157da1d>] apic_timer_interrupt+0x6d/0x80
  <EOI>
  [<ffffffff81450131>] ? cpuidle_wrap_enter+0x41/0x80
  [<ffffffff81450180>] cpuidle_enter_tk+0x10/0x20
  [<ffffffff8144fef7>] cpuidle_idle_call+0xb7/0x1e0
  [<ffffffff8101dd75>] cpu_idle+0xe5/0x140
  [<ffffffff815610b4>] start_secondary+0x24e/0x250
 Code: 90 00 4d 85 ff 74 22 49 8b 0f 0f 1f 80 00 00 00 00 49 8b 7f 08 49 83 c7
 10 4c 89 e2 48 89 de ff d1 49 8b 0f 48 85 c9 75 e8 eb 97 <0f> 0b 66 66 66 2e
 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48
 RIP  [<ffffffff8106bb92>] add_timer_on+0xe2/0xf0
  RSP <ffff88303f843de8>

The panic is caused by a race between mce_restart() and CPU hotplug
operations. Protect mce_restart() with the

get_online_cpus();
put_online_cpus();

pair, as other subsystems do when they iterate over cpu_online_mask.

This bug affects the 3.8, 3.19 and 4.0 stable branches (others were not
checked). The patch has been verified on the 4.0 stable branch.

Reported-by: Tim Uglow <tim.uglow@oracle.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
---
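Illustration only: the intent of the get_online_cpus()/put_online_cpus()
pair can be sketched in user space. This is a minimal model under stated
assumptions, not the kernel implementation: every sim_* name is hypothetical,
the hotplug lock is modelled as a small reader count built on C11 atomics,
and a "timer" is just a per-CPU flag. The sketch mirrors the shape of the
change (one read-side section in the delete pass, one in the re-arm pass)
and shows that a hotplug writer is refused while a reader pins the online set.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SIM_NR_CPUS 4

/* Hypothetical hotplug lock: >0 = number of readers, 0 = free, -1 = writer. */
static atomic_int sim_hotplug_state;

/* Hypothetical per-CPU state standing in for cpu_online_mask and mce_timer. */
static bool sim_cpu_online[SIM_NR_CPUS] = { true, true, true, true };
static bool sim_timer_pending[SIM_NR_CPUS];

/* Reader side: pin the set of online CPUs (spins while a writer is active). */
static void sim_get_online_cpus(void)
{
	int old = atomic_load(&sim_hotplug_state);

	for (;;) {
		if (old < 0) {
			old = atomic_load(&sim_hotplug_state);
			continue;
		}
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&sim_hotplug_state, &old, old + 1))
			return;
	}
}

/* Reader side: release the pin taken by sim_get_online_cpus(). */
static void sim_put_online_cpus(void)
{
	atomic_fetch_sub(&sim_hotplug_state, 1);
}

/*
 * Writer side: a simulated hotplug operation. Returns 0 on success or
 * -EBUSY if a reader or another writer holds the lock. This model refuses
 * instead of waiting so that the exclusion is easy to observe.
 */
static int sim_try_cpu_set_online(int cpu, bool online)
{
	int expected = 0;

	assert(cpu >= 0 && cpu < SIM_NR_CPUS);
	if (!atomic_compare_exchange_strong(&sim_hotplug_state, &expected, -1))
		return -EBUSY;

	sim_cpu_online[cpu] = online;
	if (!online)
		sim_timer_pending[cpu] = false;	/* offlining drops the timer */

	atomic_store(&sim_hotplug_state, 0);
	return 0;
}

/* Delete pass: clear the timer flag of every online CPU. */
static void sim_timer_delete_all(void)
{
	int cpu;

	sim_get_online_cpus();
	for (cpu = 0; cpu < SIM_NR_CPUS; cpu++)
		if (sim_cpu_online[cpu])
			sim_timer_pending[cpu] = false;
	sim_put_online_cpus();
}

/* Restart: delete pass, then re-arm pass. Returns the number of timers armed. */
static int sim_restart(void)
{
	int cpu, armed = 0;

	sim_timer_delete_all();
	sim_get_online_cpus();
	for (cpu = 0; cpu < SIM_NR_CPUS; cpu++) {
		if (sim_cpu_online[cpu]) {
			sim_timer_pending[cpu] = true;
			armed++;
		}
	}
	sim_put_online_cpus();
	return armed;
}
```

With all four simulated CPUs online, sim_restart() arms four timers. While a
caller sits between sim_get_online_cpus() and sim_put_online_cpus(),
sim_try_cpu_set_online() returns -EBUSY, so the online set cannot change
under the iteration.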
 arch/x86/kernel/cpu/mcheck/mce.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 3c036cb..fcc2794 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1338,8 +1338,10 @@ static void mce_timer_delete_all(void)
 {
 	int cpu;
 
+	get_online_cpus();
 	for_each_online_cpu(cpu)
 		del_timer_sync(&per_cpu(mce_timer, cpu));
+	put_online_cpus();
 }
 
 static void mce_do_trigger(struct work_struct *work)
@@ -2085,7 +2087,9 @@ static void mce_cpu_restart(void *data)
 static void mce_restart(void)
 {
 	mce_timer_delete_all();
+	get_online_cpus();
 	on_each_cpu(mce_cpu_restart, NULL, 1);
+	put_online_cpus();
 }
 
 /* Toggle features for corrected errors */
-- 
1.8.3.1

