From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756733AbZCCFyh (ORCPT ); Tue, 3 Mar 2009 00:54:37 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1751399AbZCCFy1 (ORCPT ); Tue, 3 Mar 2009 00:54:27 -0500
Received: from tomts13.bellnexxia.net ([209.226.175.34]:41690 "EHLO
	tomts13-srv.bellnexxia.net" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1751139AbZCCFyU (ORCPT );
	Tue, 3 Mar 2009 00:54:20 -0500
Date: Tue, 3 Mar 2009 00:54:11 -0500
From: Mathieu Desnoyers
To: Ingo Molnar
Cc: "Rafael J. Wysocki" , Linux Kernel Mailing List , Kernel Testers List
Subject: Re: [Bug #12660] Linux 2.6.28.7 freezing on a 32-bits x86 Thinkpad T43p
Message-ID: <20090303055410.GA14584@Krystal>
References: <20090224072902.GA20098@elte.hu> <20090224161324.GB2803@Krystal> <20090224204502.GC15161@elte.hu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
In-Reply-To: <20090224204502.GC15161@elte.hu>
X-Editor: vi
X-Info: http://krystal.dyndns.org:8080
X-Operating-System: Linux/2.6.21.3-grsec (i686)
X-Uptime: 00:47:04 up 3 days, 2:13, 2 users, load average: 0.63, 0.66, 0.49
User-Agent: Mutt/1.5.18 (2008-05-17)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

* Ingo Molnar (mingo@elte.hu) wrote:
>
> * Mathieu Desnoyers wrote:
> > * Ingo Molnar (mingo@elte.hu) wrote:
> > >
> > > * Rafael J. Wysocki wrote:
> > >
> > > > This message has been generated automatically as a part of a report
> > > > of recent regressions.
> > > >
> > > > The following bug entry is on the current list of known regressions
> > > > from 2.6.28. Please verify if it still should be listed and let me know
> > > > (either way).
> > > >
> > > >
> > > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=12660
> > > > Subject : Linux 2.6.28.3 freezing on a 32-bits x86 Thinkpad T43p
> > > > Submitter : Mathieu Desnoyers
> > > > Date : 2009-02-04 21:11 (20 days old)
> > > > References : http://marc.info/?l=linux-kernel&m=123378196022258&w=4
> > > > Handled-By : Ingo Molnar
> > >
> > > Mathieu, this bug is very weird and makes little sense. Could
> > > you please reproduce it with vanilla -git too (without any LTT
> > > patches applied) and send the full boot+crash log?
> > >
> > > Ingo
> >
> > Hi Ingo,
> >
> > The last time I reproduced this bug (before going back to 2.6.27 on this
> > machine) was with a vanilla 2.6.28.5 kernel with the following patch
> > applied. So maybe this patch is actually causing the problem now that
> > other memory problems have been fixed since 2.6.28.3. I'll try without
> > it, but it can take a while before the bug reappears, we'll see.
> >
> > Mathieu
> >
> >

Hi Ingo,

I got a similar problem with 2.6.28.7 (vanilla, there is no other patch
applied). It just crashed. The oopses follow. Maybe I could try disabling
NO_HZ and see if it helps...?
BUG: unable to handle kernel NULL pointer dereference at 00000000
IP: [] get_next_timer_interrupt+0x4a/0x220
*pde = 00000000
Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
Modules linked in: nfs lockd sunrpc af_packet binfmt_misc ppdev lp ipv6 iptable_filter ntfs nls_iso8859_1 nls_cp437 vfat fat dm_snapshot dm_mirror dm_region_hash dm_log dm_mod acpi_cpufreq edd ide_generic ide_cd_mod aes_i586 blowfish cryptoloop loop ip_tables x_tables radeon drm hid_logitech pcmcia joydev sn060:[] EFLAGS: 00010002 CPU: 0
EIP is at get_next_timer_interrupt+0x4a/0x220
EAX: 00000080 EBX: c14fbc24 ECX: 00000000 EDX: 00000000
ESI: c14fb800 EDI: 0000007f EBP: c1491ec8 ESP: c1491e90
 DS: 007b ES: 007b FS: 04513a0 task.ti=c1490000)
Stack:
 ffffaf7f ffffaf7e c14fb800 00000015 00000000 00000030 00000000 c1491ec0
 c1059216 00000000 00000030 00000015 00000001 ffffaf7e c1491f10 c105f758
 c1491ed8 c1043bc2 c15 dce04b00 00000015
Call Trace:
 [] ? sched_clock_cpu+0xc6/0x120
 [] ? tick_nohz_stop_sched_tick+0x158/0x370
 [] ? _local_bh_enable+0x52/0xb0
 [] ? __
 [] ? lockdep_init+0xb/0x70
 [] ? cpuidle_idle_call+0x6d/0xb0
 [] ? cpu_idle+0x55/0xb0
 [] ? rest_init+0x61/0x70
Code: 0f b6 f9 89 4 01 75 ea 85

Another one, same kernel version, different build (random bluetooth module
changes) :

BUG: unable to handle kernel NULL pointer dereference at 00000000
IP: [] get_next_timer_interrupt+0xfa/0x220
*pde = 00000000
Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
Modules linked in: parport_pc michael_mic ieee80211_crypt_tkip ieee80211_crypt_ccmp nfs lockd sunrpc af_packet binfmt_misc ppdev ipv6 iptable_filter ntfs nls_iso8859_1 nls_cp437 vfat fat dm_snapshot dm_mirror dm_region_hash dm_log dm_mod acpi_cpufreq edd ide_generic ide_cd_mod aes_i586 blowfish cryptoloop loop ip_tables x_tables radeon drm hid_logitech pcmcia joydev snd_intel8x0 snd_seq_dummy irtty_sir snd_seq_oss sir_dev psmouse snd_intel8x0m floppy cdc_ether usbnet serio_raw snd_ac97_codec snd_seq_midi i2c_i801 yenta_socket rsrc_nonstatic usb_storage mii ac97_bus snd_pcm_oss snd_mixer_oss snd_rawmidi snd_seq_midi_event pcmcia_core usbhid ipw2200 snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc tg3 libphy battery ac video output nsc_ircc parport irda crc_ccitt thermal button intel_agp agpgart thinkpad_acpi rfkill led_class evdev nvram unix [last unloaded: parport_pc]

Pid: 0, comm: swapper Not tainted (2.6.28.7-trace #23) 2687D5U
EIP: 0060:[] EFLAGS: 00010002 CPU: 0
EIP is at get_next_timer_interrupt+0xfa/0x220
EAX: 00000000 EBX: 00013501 ECX: c14fc1cc EDX: 00000000
ESI: 00000035 EDI: c14fc024 EBP: c1491ec8 ESP: c1491e90
 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
Process swapper (pid: 0, ti=c1490000 task=c14513a0 task.ti=c1490000)
Stack:
 00013500 000134fc c14fb800 00000001 c1491eac 00000135 00000035 c14fc024
 c14fc224 c14fc424 c14fc624 000000fe 00000001 000134fc c1491f10 c105f758
 c1491ed8 c1043bc2 c1491f04 c1043d00 04ab55bd 000000fe 04aade00 000000fe
Call Trace:
 [] ? tick_nohz_stop_sched_tick+0x158/0x370
 [] ? _local_bh_enable+0x52/0xb0
 [] ? __do_softirq+0xe0/0x130
 [] ? irq_exit+0x7e/0x90
 [] ? do_IRQ+0x7d/0x90
 [] ? common_interrupt+0x28/0x30
 [] ? lockdep_init+0xb/0x70
 [] ? acpi_idle_enter_simple+0x175/0x1e2
 [] ? cpuidle_idle_call+0x6d/0xb0
 [] ? cpu_idle+0x55/0xb0
 [] ? rest_init+0x61/0x70
Code: 90 8b 04 f7 8b 10 0f 18 02 90 8d 0c f7 39 c8 0f 84 91 00 00 00 8d b6 00 00 00 00 8d bf 00 00 00 00 8b 40 08 39 d8 0f 48 d8 89 d0 <8b> 12 0f 18 02 90 39 c1 75 ec 8b 45 e0 85 c0 74 05 39 75 e0 7e

3rd crash :

BUG: unable to handle kernel NULL pointer dereference at 00000000
IP: [] get_next_timer_interrupt+0x4a/0x220
*pde = 00000000
Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:0b:02.0/resource
Modules linked in: nfs lockd sunrpc rfcomm hidp l2cap af_packet binfmt_misc ppdev lp ipv6 iptable_filter ntfs nls_iso8859_1 nls_cp437 vfat fat dm_snapshot dm_mirror dm_region_hash dm_log dm_mod acpi_cpufreq edd ide_generic ide_cd_mod aes_i586 blowfish cryptoloop loop ip_tables x_tables radeon drm bluetooth pcmcia joydev hid_logitech snd_intel8x0m snd_seq_dummy irtty_sir snd_intel8x0 snd_sseq_midi_event snd_seq i2c_i801 pcmcia_core ipw2200 snd_timer snd_seq_device snd soundcore snd_page_alloc cdc_ether usbnet usbhccitt thermal button intel_agp agpgart thinkpad_acpi rfkill led_class evdev nvram unix

Pid: 0, comm: swapper Not tain0x220
EAX: 0000006b EBX: c14d6b7c ECX: 00000000 EDX: 0000000+0xe0/0x130
 [] ? irq_exit+0x7e/0x90
 [] ? do_IRQ+0x7d/0x90
 [] ? lockdep_init+0xb/0x70
 [] ? acpi_idle_enter_simple+0x175/0x1e2
 [