boot panic regression introduced in 3.5-rc7

* boot panic regression introduced in 3.5-rc7
       [not found] <1107100725.1273654.1343619920692.JavaMail.root@redhat.com>
@ 2012-07-30  3:51 ` CAI Qian
  2012-07-30 17:53   ` John Stultz
                     ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: CAI Qian @ 2012-07-30  3:51 UTC (permalink / raw)
  To: linux-kernel 
  Cc: John Stultz, Ingo Molnar, Peter Zijlstra, Prarit Bhargava,
	Thomas Gleixner, Zhouping Liu

The bisecting pointed out this patch caused one of dell servers boot panic.

  5baefd6d84163443215f4a99f6a20f054ef11236
  hrtimer: Update hrtimer base offsets each hrtimer_interrupt

[    2.971092] WARNING: at kernel/time/clockevents.c:209 clockevents_program_event+0x10a/0x120()
[    2.971092] Hardware name: PowerEdge M605
[    2.971092] Modules linked in:
[    2.971092] Pid: 1, comm: swapper/0 Not tainted 3.5.0-rc6+ #8
[    2.971092] Call Trace:
[    2.971092]  <IRQ>  [<ffffffff81065a6f>] warn_slowpath_common+0x7f/0xc0
[    2.971092]  [<ffffffff81065aca>] warn_slowpath_null+0x1a/0x20
[    2.971092]  [<ffffffff810cb03a>] clockevents_program_event+0x10a/0x120
[    2.971092]  [<ffffffff810cc554>] tick_program_event+0x24/0x30
[    2.971092]  [<ffffffff8109671d>] hrtimer_interrupt+0x1ad/0x230
[    2.971092]  [<ffffffff816884e9>] smp_apic_timer_interrupt+0x69/0x99
[    2.971092]  [<ffffffff8168716f>] apic_timer_interrupt+0x6f/0x80
[    2.971092]  <EOI>  [<ffffffff8167dc30>] ? retint_restore_args+0x13/0x13
[    2.971092]  [<ffffffff81348563>] ? inflate_fast+0x2c3/0x5f0
[    2.971092]  [<ffffffff81349cec>] zlib_inflate+0x123c/0x1970
[    2.971092]  [<ffffffff811c1f0b>] ? vfs_write+0x10b/0x190
[    2.971092]  [<ffffffff81c86144>] ? bunzip2+0x472/0x472
[    2.971092]  [<ffffffff811c21ed>] ? sys_write+0x4d/0x90
[    2.971092]  [<ffffffff81c5aebb>] ? do_copy+0x87/0x8e
[    2.971092]  [<ffffffff81c5ac5c>] ? write_buffer+0x22/0x32
[    2.971092]  [<ffffffff81c5ac6c>] ? write_buffer+0x32/0x32
[    2.971092]  [<ffffffff81c86144>] ? bunzip2+0x472/0x472
[    2.971092]  [<ffffffff81c863c8>] gunzip+0x27b/0x349
[    2.971092]  [<ffffffff81c5b18e>] unpack_to_rootfs+0x159/0x267
[    2.971092]  [<ffffffff81c5abc0>] ? md_run_setup+0x9c/0x9c
[    2.971092]  [<ffffffff81c5b3f2>] ? maybe_link.part.2+0x111/0x111
[    2.971092]  [<ffffffff81c5b44a>] populate_rootfs+0x58/0x109
[    2.971092]  [<ffffffff8100212a>] do_one_initcall+0x12a/0x180
[    2.971092]  [<ffffffff81c59dd1>] kernel_init+0x15d/0x1e1
[    2.971092]  [<ffffffff81c59614>] ? do_early_param+0x8c/0x8c
[    2.971092]  [<ffffffff81687a74>] kernel_thread_helper+0x4/0x10
[    2.971092]  [<ffffffff8167dc30>] ? retint_restore_args+0x13/0x13
[    2.971092]  [<ffffffff81c59c74>] ? start_kernel+0x401/0x401
[    2.971092]  [<ffffffff81687a70>] ? gs_change+0x13/0x13
[    2.971092] ---[ end trace 319c95c486d7d9cd ]---
[    2.971092] Freeing initrd memory: 23412k freed
[    2.971092] DMA-API: preallocated 32768 debug entries
[    2.971092] DMA-API: debugging enabled by kernel config
[    6.034660] work_for_cpu (34) used greatest stack depth: 6520 bytes left
[    6.042463] work_for_cpu (35) used greatest stack depth: 6248 bytes left
[    6.098917] PCI-DMA: Disabling AGP.
[    6.098917] PCI-DMA: aperture base @ f4000000 size 65536 KB
[    6.098917] init_memory_mapping: [mem 0xf4000000-0xf7ffffff]
[    6.098917] PCI-DMA: using GART IOMMU.
[    6.098917] PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture
[    6.593402] cryptomgr_test (59) used greatest stack depth: 6200 bytes left
[    6.601692] audit: initializing netlink socket (disabled)
[    6.601692] type=2000 audit(196996417573.616:1): initialized
[    7.176075] HugeTLB registered 2 MB page size, pre-allocated 0 pages
[    7.559238] VFS: Disk quotas dquot_6.5.2
[    7.559238] Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[    7.663975] msgmni has been set to 7965
[    7.702014] cryptomgr_test (68) used greatest stack depth: 5576 bytes left
[    7.710867] cryptomgr_test (70) used greatest stack depth: 5416 bytes left
[    7.720367] alg: No test for stdrng (krng)
[    7.724763] NET: Registered protocol family 38
[    7.725463] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
[    7.743107] io scheduler noop registered
[    7.743559] io scheduler deadline registered (default)
[    7.743559] io scheduler cfq registered
[    7.743559] start plist test
[    7.743559] end plist test
[    7.896741] work_for_cpu (77) used greatest stack depth: 5288 bytes left
[    7.917588] work_for_cpu (79) used greatest stack depth: 5112 bytes left
[    7.943340] pcieport 0000:00:0a.0: Signaling PME through PCIe PME interrupt
[    7.943340] pcieport 0000:00:0b.0: Signaling PME through PCIe PME interrupt
[    7.943340] pci 0000:02:00.0: Signaling PME through PCIe PME interrupt
[    7.943340] pcieport 0000:00:0c.0: Signaling PME through PCIe PME interrupt
[    7.943340] pci 0000:03:00.0: Signaling PME through PCIe PME interrupt
[    7.943340] pci 0000:04:00.0: Signaling PME through PCIe PME interrupt
[    7.943340] pcieport 0000:00:0d.0: Signaling PME through PCIe PME interrupt
[    7.943340] pci 0000:05:00.0: Signaling PME through PCIe PME interrupt
[    7.943340] pci 0000:06:00.0: Signaling PME through PCIe PME interrupt
[    7.943340] pcieport 0000:00:0f.0: Signaling PME through PCIe PME interrupt
[    7.943340] pci_hotplug: PCI Hot Plug PCI Core version: 0.5
[    7.943340] pciehp: PCI Express Hot Plug Controller Driver version: 0.4
[    7.943340] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[    7.943340] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
[    7.943340] ACPI: Power Button [PWRF]
[    9.001224] ERST: Failed to get Error Log Address Range.
[   19.807939] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 0
[   19.808927] Pid: 0, comm: swapper/0 Tainted: G        W    3.5.0-rc6+ #8
[   19.808927] Call Trace:
[   19.808927]  <NMI>  [<ffffffff8166dd25>] panic+0xba/0x1ce
[   19.808927]  [<ffffffff81110a30>] ? touch_nmi_watchdog+0x90/0x90
[   19.808927]  [<ffffffff81110ae5>] watchdog_overflow_callback+0xb5/0xc0
[   19.808927]  [<ffffffff8114ed9d>] __perf_event_overflow+0x9d/0x320
[   19.808927]  [<ffffffff8114aecd>] ? perf_event_update_userpage+0x16d/0x2c0
[   19.808927]  [<ffffffff8114ad60>] ? perf_event_task_disable+0x90/0x90
[   19.808927]  [<ffffffff8114fff4>] perf_event_overflow+0x14/0x20
[   19.808927]  [<ffffffff8102b833>] x86_pmu_handle_irq+0xe3/0x130
[   19.808927]  [<ffffffff8167f81d>] perf_event_nmi_handler+0x1d/0x20
[   19.808927]  [<ffffffff8167ed4e>] nmi_handle.isra.2+0xbe/0x210
[   19.808927]  [<ffffffff8167ec90>] ? __die+0xf0/0xf0
[   19.808927]  [<ffffffff8167efa8>] do_nmi+0x108/0x380
[   19.808927]  [<ffffffff8167e2cc>] end_repeat_nmi+0x1a/0x1e
[   19.808927]  [<ffffffff81096743>] ? hrtimer_interrupt+0x1d3/0x230
[   19.808927]  [<ffffffff81096743>] ? hrtimer_interrupt+0x1d3/0x230
[   19.808927]  [<ffffffff81096743>] ? hrtimer_interrupt+0x1d3/0x230
[   19.808927]  <<EOE>>  <IRQ>  [<ffffffff810cceae>] ? tick_nohz_stop_idle+0x3e/0x50
[   19.808927]  [<ffffffff816884e9>] smp_apic_timer_interrupt+0x69/0x99
[   19.808927]  [<ffffffff8168716f>] apic_timer_interrupt+0x6f/0x80
[   19.808927]  <EOI>  [<ffffffff8167dc30>] ? retint_restore_args+0x13/0x13
[   19.808927]  [<ffffffff810cd0cb>] ? tick_nohz_idle_enter+0x4b/0x80
[   19.808927]  [<ffffffff810cd0c7>] ? tick_nohz_idle_enter+0x47/0x80
[   19.808927]  [<ffffffff81023462>] cpu_idle+0x72/0x140
[   19.808927]  [<ffffffff816515e5>] rest_init+0x149/0x154
[   19.808927]  [<ffffffff8165149c>] ? csum_partial_copy_generic+0x16c/0x16c
[   19.808927]  [<ffffffff81c59c67>] start_kernel+0x3f4/0x401
[   19.808927]  [<ffffffff81c59672>] ? repair_env_string+0x5e/0x5e
[   19.808927]  [<ffffffff81c59356>] x86_64_start_reservations+0x131/0x135
[   19.808927]  [<ffffffff81c5945a>] x86_64_start_kernel+0x100/0x10f

Kernel config and full dmesg are here.
http://people.redhat.com/qcai/config
http://people.redhat.com/qcai/dmesg

CAI Qian

^ permalink raw reply	[flat|nested] 6+ messages in thread