* [BUG] kacpi_notify goes into an infinite loop (luckly it calls cond_resched)
From: Steven Rostedt @ 2010-05-29 1:01 UTC
To: LKML; +Cc: Thomas Renninger, clarkt, Zhang Rui, Len Brown, Andrew Morton
I just replaced Windows with F12 on my wife's computer, to have nothing
but issues. But anyway, one of the issues I had on this box was in
vanilla linux kernel 2.6.34 (all the fedora kernels had other issues),
the kacpi_notify would go into an infinite loop.
I debugged it a bit with ftrace and saw that the kacpi_notify workqueue
was constantly requeuing itself (thanks to the workqueue trace events).
I bisected this, and it came down to this change:
commit fa80945269f312bc609e8384302f58b03c916e12
Author: Thomas Renninger <trenn@suse.de>
Date: Sat Feb 20 11:44:27 2010 +0100
ACPI thermal: Don't invalidate thermal zone if critical trip point is bad
I checked out 2.6.34 again, and reverted this patch, and the problem
went away.
The box is an old Compaq Presario that I bought in 2003 or 2004.
-- Steve
* Re: [BUG] kacpi_notify goes into an infinite loop (luckly it calls cond_resched)
From: Zhang Rui @ 2010-05-31 2:57 UTC
To: rostedt; +Cc: LKML, Thomas Renninger, clarkt, Brown, Len, Andrew Morton
Hi, Steve,
On Sat, 2010-05-29 at 09:01 +0800, Steven Rostedt wrote:
> I just replaced Windows with F12 on my wife's computer, to have nothing
> but issues. But anyway, one of the issues I had on this box was in
> vanilla linux kernel 2.6.34 (all the fedora kernels had other issues),
> the kacpi_notify would go into an infinite loop.
>
> I debugged it a bit with ftrace and saw that the kacpi_notify workqueue
> was constantly requeuing itself (thanks to the workqueue trace events).
>
> I bisected this, and it came down to this change:
>
> commit fa80945269f312bc609e8384302f58b03c916e12
> Author: Thomas Renninger <trenn@suse.de>
> Date: Sat Feb 20 11:44:27 2010 +0100
>
> ACPI thermal: Don't invalidate thermal zone if critical trip point is bad
>
>
This patch enables the ACPI thermal control, which used to be disabled
on some laptops.
Maybe this triggers an interrupt storm on this box.
please attach the output of "grep . /sys/firmware/acpi/interrupts/*".
please attach the acpidump output of your laptop as well.
thanks,
rui
* Re: [BUG] kacpi_notify goes into an infinite loop (luckly it calls cond_resched)
From: Steven Rostedt @ 2010-06-30 13:40 UTC
To: Zhang Rui; +Cc: LKML, Thomas Renninger, clarkt, Brown, Len, Andrew Morton
On Mon, 2010-05-31 at 10:57 +0800, Zhang Rui wrote:
> Hi, Steve,
>
> On Sat, 2010-05-29 at 09:01 +0800, Steven Rostedt wrote:
> > I just replaced Windows with F12 on my wife's computer, to have nothing
> > but issues. But anyway, one of the issues I had on this box was in
> > vanilla linux kernel 2.6.34 (all the fedora kernels had other issues),
> > the kacpi_notify would go into an infinite loop.
> >
> > I debugged it a bit with ftrace and saw that the kacpi_notify workqueue
> > was constantly requeuing itself (thanks to the workqueue trace events).
> >
> > I bisected this, and it came down to this change:
> >
> > commit fa80945269f312bc609e8384302f58b03c916e12
> > Author: Thomas Renninger <trenn@suse.de>
> > Date: Sat Feb 20 11:44:27 2010 +0100
> >
> > ACPI thermal: Don't invalidate thermal zone if critical trip point is bad
> >
> >
> This patch enables the ACPI thermal control, which used to be disabled
> on some laptops.
>
> Maybe this triggers an interrupt storm on this box.
>
> please attach the output of "grep . /sys/firmware/acpi/interrupts/*".
> please attach the acpidump output of your laptop as well.
Sorry for the very late reply, I've been on the Cc of two kacpi_notify
bugs and one was not the problem, and I was deleting both emails as they
came in thinking they were the one I was ignoring. I just noticed that
the bug I had was closed due to my unresponsiveness. Sorry about that,
this was my wife's desktop and once I got it working, it became very low
priority (and I basically forgot about it).
Anyway, here's the info on the box if you are still interested:
# grep . /sys/firmware/acpi/interrupts/*
/sys/firmware/acpi/interrupts/error: 0
/sys/firmware/acpi/interrupts/ff_gbl_lock: 0 disabled
/sys/firmware/acpi/interrupts/ff_pmtimer: 0 invalid
/sys/firmware/acpi/interrupts/ff_pwr_btn: 0 enabled
/sys/firmware/acpi/interrupts/ff_rt_clk: 0 disabled
/sys/firmware/acpi/interrupts/ff_slp_btn: 0 invalid
/sys/firmware/acpi/interrupts/gpe00: 0 invalid
/sys/firmware/acpi/interrupts/gpe01: 0 invalid
/sys/firmware/acpi/interrupts/gpe02: 0 invalid
/sys/firmware/acpi/interrupts/gpe03: 0 disabled
/sys/firmware/acpi/interrupts/gpe04: 0 disabled
/sys/firmware/acpi/interrupts/gpe05: 0 invalid
/sys/firmware/acpi/interrupts/gpe06: 0 invalid
/sys/firmware/acpi/interrupts/gpe07: 0 invalid
/sys/firmware/acpi/interrupts/gpe08: 0 invalid
/sys/firmware/acpi/interrupts/gpe09: 0 invalid
/sys/firmware/acpi/interrupts/gpe0A: 0 invalid
/sys/firmware/acpi/interrupts/gpe0B: 0 disabled
/sys/firmware/acpi/interrupts/gpe0C: 0 disabled
/sys/firmware/acpi/interrupts/gpe0D: 0 disabled
/sys/firmware/acpi/interrupts/gpe0E: 0 invalid
/sys/firmware/acpi/interrupts/gpe0F: 0 invalid
/sys/firmware/acpi/interrupts/gpe10: 0 invalid
/sys/firmware/acpi/interrupts/gpe11: 0 invalid
/sys/firmware/acpi/interrupts/gpe12: 0 invalid
/sys/firmware/acpi/interrupts/gpe13: 0 invalid
/sys/firmware/acpi/interrupts/gpe14: 0 invalid
/sys/firmware/acpi/interrupts/gpe15: 0 invalid
/sys/firmware/acpi/interrupts/gpe16: 0 invalid
/sys/firmware/acpi/interrupts/gpe17: 0 invalid
/sys/firmware/acpi/interrupts/gpe18: 0 enabled
/sys/firmware/acpi/interrupts/gpe19: 0 invalid
/sys/firmware/acpi/interrupts/gpe1A: 0 invalid
/sys/firmware/acpi/interrupts/gpe1B: 0 invalid
/sys/firmware/acpi/interrupts/gpe1C: 0 invalid
/sys/firmware/acpi/interrupts/gpe1D: 0 invalid
/sys/firmware/acpi/interrupts/gpe1E: 0 invalid
/sys/firmware/acpi/interrupts/gpe1F: 0 invalid
/sys/firmware/acpi/interrupts/gpe_all: 0
/sys/firmware/acpi/interrupts/sci: 0
/sys/firmware/acpi/interrupts/sci_not: 1
Note, this is a desktop not a laptop.
I don't see an acpidump utility installed, nor do I see any package in
Fedora 12 that would install it.
After reverting the one patch, everything seemed to work (although it
still crashes here and there, but I think that's a video driver bug).
I'll help investigate this if you want. But it may take time. The box
does belong to my wife and I need to wait for her to finish with it
before I can take a look ;-) Well, I can ssh in, but to try patches or
anything else requiring reboots, will have to wait till she's off of it.
-- Steve
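
For readers following along, a counter dump like the one in this message can be summarized with a short script. This is only a sketch: the names `parse_acpi_interrupts` and `nonzero_counters` are hypothetical, and the code assumes every line has the `path: count [status]` shape seen in the `grep . /sys/firmware/acpi/interrupts/*` output above. The sample text is a subset of that output.

```python
# Sketch: summarize "grep . /sys/firmware/acpi/interrupts/*" output.
# Function names are illustrative, not part of any kernel tooling.

SAMPLE = """\
/sys/firmware/acpi/interrupts/error: 0
/sys/firmware/acpi/interrupts/ff_pwr_btn: 0 enabled
/sys/firmware/acpi/interrupts/gpe18: 0 enabled
/sys/firmware/acpi/interrupts/gpe_all: 0
/sys/firmware/acpi/interrupts/sci: 0
/sys/firmware/acpi/interrupts/sci_not: 1
"""


def parse_acpi_interrupts(text):
    """Map counter name -> (count, status or None)."""
    counters = {}
    for line in text.splitlines():
        path, sep, value = line.partition(":")
        if not sep:
            continue  # not a "path: value" line (e.g. the shell prompt)
        fields = value.split()
        if not fields or not fields[0].isdigit():
            continue  # value does not start with a count
        name = path.rsplit("/", 1)[-1]
        status = fields[1] if len(fields) > 1 else None
        counters[name] = (int(fields[0]), status)
    return counters


def nonzero_counters(counters):
    """Return only the counters that have fired at least once."""
    return {name: count for name, (count, _) in counters.items() if count > 0}


if __name__ == "__main__":
    print(nonzero_counters(parse_acpi_interrupts(SAMPLE)))
```

On the data posted in this thread, the only counter that is not zero is `sci_not`, which is the line the follow-up reply picks up on.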
* Re: [BUG] kacpi_notify goes into an infinite loop (luckly it calls cond_resched)
From: Len Brown @ 2010-06-30 15:17 UTC
To: Steven Rostedt; +Cc: Zhang Rui, LKML, Thomas Renninger, clarkt, Andrew Morton
> /sys/firmware/acpi/interrupts/sci_not: 1
This means that acpi_irq was invoked and could not find a cause for the
interrupt.
Does /proc/interrupts show that the acpi interrupt shares an IRQ
with another device, or is it alone?
thanks,
Len Brown, Intel Open Source Technology Center
* Re: [BUG] kacpi_notify goes into an infinite loop (luckly it calls cond_resched)
From: Steven Rostedt @ 2010-06-30 16:06 UTC
To: Len Brown; +Cc: Zhang Rui, LKML, Thomas Renninger, clarkt, Andrew Morton
On Wed, 2010-06-30 at 11:17 -0400, Len Brown wrote:
> > /sys/firmware/acpi/interrupts/sci_not: 1
>
> This means that acpi_irq was invoked and could not find a cause for the
> interrupt.
>
> Does /proc/interrupts show that the acpi interrupt shares an IRQ
> with another device, or is it alone?
>
It looks to be alone:
# cat /proc/interrupts
CPU0
0: 158 IO-APIC-edge timer
1: 334 IO-APIC-edge i8042
4: 4 IO-APIC-edge
6: 5 IO-APIC-edge floppy
7: 0 IO-APIC-edge parport0
8: 0 IO-APIC-edge rtc0
9: 0 IO-APIC-fasteoi acpi
12: 238565 IO-APIC-edge i8042
14: 158753 IO-APIC-edge ata_piix
15: 604816 IO-APIC-edge ata_piix
16: 38352 IO-APIC-fasteoi uhci_hcd:usb2, i915@pci:0000:00:02.0
17: 3164 IO-APIC-fasteoi Intel 82801DB-ICH4, eth0
18: 0 IO-APIC-fasteoi uhci_hcd:usb4
19: 1652995 IO-APIC-fasteoi uhci_hcd:usb3
23: 3718117 IO-APIC-fasteoi ehci_hcd:usb1
NMI: 16288 Non-maskable interrupts
LOC: 7599309 Local timer interrupts
SPU: 0 Spurious interrupts
PMI: 0 Performance monitoring interrupts
PND: 0 Performance pending work
RES: 0 Rescheduling interrupts
CAL: 0 Function call interrupts
TLB: 0 TLB shootdowns
TRM: 0 Thermal event interrupts
THR: 0 Threshold APIC interrupts
MCE: 0 Machine check exceptions
MCP: 523 Machine check polls
ERR: 6
MIS: 0
-- Steve
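
The question of whether the acpi interrupt shares its IRQ line can also be checked mechanically. The following is a minimal sketch, assuming the `/proc/interrupts` layout shown in this message (one count column per CPU, then the interrupt chip type, then a comma-separated device list); newer kernels print extra columns and would need a different slice. The names `parse_proc_interrupts` and `find_irq` are hypothetical, and the sample is a subset of the output above.

```python
# Sketch: find which devices share an IRQ line in /proc/interrupts.
# Assumes the 2.6.34-era column layout quoted in this thread.

SAMPLE = """\
CPU0
0: 158 IO-APIC-edge timer
4: 4 IO-APIC-edge
9: 0 IO-APIC-fasteoi acpi
16: 38352 IO-APIC-fasteoi uhci_hcd:usb2, i915@pci:0000:00:02.0
17: 3164 IO-APIC-fasteoi Intel 82801DB-ICH4, eth0
NMI: 16288 Non-maskable interrupts
"""


def parse_proc_interrupts(text):
    """Map numeric IRQ -> list of device names registered on it."""
    lines = [line for line in text.splitlines() if line.strip()]
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        label = label.strip()
        if not sep or not label.isdigit():
            continue  # skip NMI, LOC and other non-numeric rows
        fields = rest.split()
        # Skip the per-CPU counts and the chip type; the rest is the device list.
        names = " ".join(fields[ncpu + 1:])
        table[int(label)] = [n.strip() for n in names.split(",") if n.strip()]
    return table


def find_irq(text, device):
    """Return (irq, devices on that irq) for a device name, or None."""
    for irq, devices in parse_proc_interrupts(text).items():
        if device in devices:
            return irq, devices
    return None


if __name__ == "__main__":
    print(find_irq(SAMPLE, "acpi"))
```

A device list with a single entry means the line is not shared, which matches the "It looks to be alone" reading of IRQ 9 in this message.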