* [BUG] kacpi_notify goes into an infinite loop (luckly it calls cond_resched)
@ 2010-05-29  1:01 Steven Rostedt
  2010-05-31  2:57 ` Zhang Rui
  0 siblings, 1 reply; 5+ messages in thread
From: Steven Rostedt @ 2010-05-29  1:01 UTC
  To: LKML; +Cc: Thomas Renninger, clarkt, Zhang Rui, Len Brown, Andrew Morton

I just replaced Windows with F12 on my wife's computer, to have nothing
but issues. But anyway, one of the issues I had on this box was in
vanilla linux kernel 2.6.34 (all the fedora kernels had other issues),
the kacpi_notify would go into an infinite loop.

I debugged it a bit with ftrace and saw that the kacpi_notify workqueue
was constantly requeuing itself (thanks to the workqueue trace events).

I bisected this, and it came down to this change:

commit fa80945269f312bc609e8384302f58b03c916e12
Author: Thomas Renninger <trenn@suse.de>
Date:   Sat Feb 20 11:44:27 2010 +0100

    ACPI thermal: Don't invalidate thermal zone if critical trip point is bad


I checked out 2.6.34 again, and reverted this patch, and the problem
went away.

The box is an old Compaq Presario that I bought in 2003 or 2004.

-- Steve




* Re: [BUG] kacpi_notify goes into an infinite loop (luckly it calls cond_resched)
  2010-05-29  1:01 [BUG] kacpi_notify goes into an infinite loop (luckly it calls cond_resched) Steven Rostedt
@ 2010-05-31  2:57 ` Zhang Rui
  2010-06-30 13:40   ` Steven Rostedt
  0 siblings, 1 reply; 5+ messages in thread
From: Zhang Rui @ 2010-05-31  2:57 UTC
  To: rostedt; +Cc: LKML, Thomas Renninger, clarkt, Brown, Len, Andrew Morton

Hi, Steve,

On Sat, 2010-05-29 at 09:01 +0800, Steven Rostedt wrote:
> I just replaced Windows with F12 on my wife's computer, to have nothing
> but issues. But anyway, one of the issues I had on this box was in
> vanilla linux kernel 2.6.34 (all the fedora kernels had other issues),
> the kacpi_notify would go into an infinite loop.
> 
> I debugged it a bit with ftrace and saw that the kacpi_notify workqueue
> was constantly requeuing itself (thanks to the workqueue trace events).
> 
> I bisected this, and it came down to this change:
> 
> commit fa80945269f312bc609e8384302f58b03c916e12
> Author: Thomas Renninger <trenn@suse.de>
> Date:   Sat Feb 20 11:44:27 2010 +0100
> 
>     ACPI thermal: Don't invalidate thermal zone if critical trip point is bad
> 
> 
This patch enables ACPI thermal control, which used to be disabled
on some laptops.

Maybe this triggers an interrupt storm on this box.

Please attach the output of "grep . /sys/firmware/acpi/interrupts/*".
Please attach the acpidump output of your laptop as well.

thanks,
rui




* Re: [BUG] kacpi_notify goes into an infinite loop (luckly it calls cond_resched)
  2010-05-31  2:57 ` Zhang Rui
@ 2010-06-30 13:40   ` Steven Rostedt
  2010-06-30 15:17     ` Len Brown
  0 siblings, 1 reply; 5+ messages in thread
From: Steven Rostedt @ 2010-06-30 13:40 UTC
  To: Zhang Rui; +Cc: LKML, Thomas Renninger, clarkt, Brown, Len, Andrew Morton

On Mon, 2010-05-31 at 10:57 +0800, Zhang Rui wrote:
> Hi, Steve,
> 
> On Sat, 2010-05-29 at 09:01 +0800, Steven Rostedt wrote:
> > I just replaced Windows with F12 on my wife's computer, to have nothing
> > but issues. But anyway, one of the issues I had on this box was in
> > vanilla linux kernel 2.6.34 (all the fedora kernels had other issues),
> > the kacpi_notify would go into an infinite loop.
> > 
> > I debugged it a bit with ftrace and saw that the kacpi_notify workqueue
> > was constantly requeuing itself (thanks to the workqueue trace events).
> > 
> > I bisected this, and it came down to this change:
> > 
> > commit fa80945269f312bc609e8384302f58b03c916e12
> > Author: Thomas Renninger <trenn@suse.de>
> > Date:   Sat Feb 20 11:44:27 2010 +0100
> > 
> >     ACPI thermal: Don't invalidate thermal zone if critical trip point is bad
> > 
> > 
> This patch enables ACPI thermal control, which used to be disabled
> on some laptops.
> 
> Maybe this triggers an interrupt storm on this box.
> 
> please attach the output of "grep . /sys/firmware/acpi/interrupts/*".
> please attach the acpidump output of your laptop as well.

Sorry for the very late reply. I've been on the Cc of two kacpi_notify
bugs, one of which was not my problem, and I was deleting the emails for
both as they came in, thinking they all belonged to the one I was
ignoring. I just noticed that the bug I reported was closed due to my
unresponsiveness. Sorry about that; this is my wife's desktop, and once
I got it working it became very low priority (and I basically forgot
about it).


Anyway, here's the info on the box if you are still interested:

# grep . /sys/firmware/acpi/interrupts/*
/sys/firmware/acpi/interrupts/error:       0
/sys/firmware/acpi/interrupts/ff_gbl_lock:       0	disabled
/sys/firmware/acpi/interrupts/ff_pmtimer:       0	invalid
/sys/firmware/acpi/interrupts/ff_pwr_btn:       0	enabled
/sys/firmware/acpi/interrupts/ff_rt_clk:       0	disabled
/sys/firmware/acpi/interrupts/ff_slp_btn:       0	invalid
/sys/firmware/acpi/interrupts/gpe00:       0	invalid
/sys/firmware/acpi/interrupts/gpe01:       0	invalid
/sys/firmware/acpi/interrupts/gpe02:       0	invalid
/sys/firmware/acpi/interrupts/gpe03:       0	disabled
/sys/firmware/acpi/interrupts/gpe04:       0	disabled
/sys/firmware/acpi/interrupts/gpe05:       0	invalid
/sys/firmware/acpi/interrupts/gpe06:       0	invalid
/sys/firmware/acpi/interrupts/gpe07:       0	invalid
/sys/firmware/acpi/interrupts/gpe08:       0	invalid
/sys/firmware/acpi/interrupts/gpe09:       0	invalid
/sys/firmware/acpi/interrupts/gpe0A:       0	invalid
/sys/firmware/acpi/interrupts/gpe0B:       0	disabled
/sys/firmware/acpi/interrupts/gpe0C:       0	disabled
/sys/firmware/acpi/interrupts/gpe0D:       0	disabled
/sys/firmware/acpi/interrupts/gpe0E:       0	invalid
/sys/firmware/acpi/interrupts/gpe0F:       0	invalid
/sys/firmware/acpi/interrupts/gpe10:       0	invalid
/sys/firmware/acpi/interrupts/gpe11:       0	invalid
/sys/firmware/acpi/interrupts/gpe12:       0	invalid
/sys/firmware/acpi/interrupts/gpe13:       0	invalid
/sys/firmware/acpi/interrupts/gpe14:       0	invalid
/sys/firmware/acpi/interrupts/gpe15:       0	invalid
/sys/firmware/acpi/interrupts/gpe16:       0	invalid
/sys/firmware/acpi/interrupts/gpe17:       0	invalid
/sys/firmware/acpi/interrupts/gpe18:       0	enabled
/sys/firmware/acpi/interrupts/gpe19:       0	invalid
/sys/firmware/acpi/interrupts/gpe1A:       0	invalid
/sys/firmware/acpi/interrupts/gpe1B:       0	invalid
/sys/firmware/acpi/interrupts/gpe1C:       0	invalid
/sys/firmware/acpi/interrupts/gpe1D:       0	invalid
/sys/firmware/acpi/interrupts/gpe1E:       0	invalid
/sys/firmware/acpi/interrupts/gpe1F:       0	invalid
/sys/firmware/acpi/interrupts/gpe_all:       0
/sys/firmware/acpi/interrupts/sci:       0
/sys/firmware/acpi/interrupts/sci_not:       1
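
For readers skimming the listing above, a minimal sketch (not part of the original message) of how that grep output could be filtered down to the counters that actually fired. The names `AcpiCounter`, `parse_acpi_interrupts` and `nonzero_counters` are hypothetical helpers, and the parser assumes the "path: count status" layout shown in this thread; other kernels may print the status column differently.

```python
from typing import List, NamedTuple, Optional


class AcpiCounter(NamedTuple):
    name: str               # last path component, e.g. "gpe18" or "sci_not"
    count: int              # number of times the event was counted
    status: Optional[str]   # e.g. "enabled", "invalid"; None if no status column


def parse_acpi_interrupts(text: str) -> List[AcpiCounter]:
    """Parse the output of `grep . /sys/firmware/acpi/interrupts/*`.

    Assumes each line looks like "<path>: <count> [<status>]",
    as in the listing quoted in the thread.
    """
    counters = []
    for line in text.splitlines():
        path, sep, rest = line.rpartition(":")
        fields = rest.split()
        if not sep or not fields or not fields[0].isdigit():
            continue  # skip blank or unrecognised lines
        name = path.strip().rsplit("/", 1)[-1]
        status = " ".join(fields[1:]) or None
        counters.append(AcpiCounter(name, int(fields[0]), status))
    return counters


def nonzero_counters(counters: List[AcpiCounter]) -> List[AcpiCounter]:
    """Return only the counters that have fired at least once."""
    return [c for c in counters if c.count > 0]


# A few lines copied from the listing above.
sample = (
    "/sys/firmware/acpi/interrupts/ff_pwr_btn:       0\tenabled\n"
    "/sys/firmware/acpi/interrupts/gpe18:       0\tenabled\n"
    "/sys/firmware/acpi/interrupts/gpe_all:       0\n"
    "/sys/firmware/acpi/interrupts/sci:       0\n"
    "/sys/firmware/acpi/interrupts/sci_not:       1\n"
)

for counter in nonzero_counters(parse_acpi_interrupts(sample)):
    print(counter.name, counter.count)  # prints "sci_not 1"
```

On the numbers posted here, the only counter that is not zero is sci_not, so the listing by itself does not show a GPE or SCI storm.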

Note, this is a desktop not a laptop.

I don't see an acpidump utility installed, nor do I see any package in
Fedora 12 that would install it.

After reverting the one patch, everything seemed to work (although it
still crashes here and there, but I think that's a video driver bug).

I'll help investigate this if you want. But it may take time. The box
does belong to my wife and I need to wait for her to finish with it
before I can take a look ;-) Well, I can ssh in, but to try patches or
anything else requiring reboots, will have to wait till she's off of it.


-- Steve




* Re: [BUG] kacpi_notify goes into an infinite loop (luckly it calls cond_resched)
  2010-06-30 13:40   ` Steven Rostedt
@ 2010-06-30 15:17     ` Len Brown
  2010-06-30 16:06       ` Steven Rostedt
  0 siblings, 1 reply; 5+ messages in thread
From: Len Brown @ 2010-06-30 15:17 UTC
  To: Steven Rostedt; +Cc: Zhang Rui, LKML, Thomas Renninger, clarkt, Andrew Morton


> /sys/firmware/acpi/interrupts/sci_not:       1

This means that acpi_irq was invoked and could not find a cause for the 
interrupt.

Does /proc/interrupts show that the acpi interrupt shares an IRQ
with another device, or is it alone?

thanks,
Len Brown, Intel Open Source Technology Center



* Re: [BUG] kacpi_notify goes into an infinite loop (luckly it calls cond_resched)
  2010-06-30 15:17     ` Len Brown
@ 2010-06-30 16:06       ` Steven Rostedt
  0 siblings, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2010-06-30 16:06 UTC
  To: Len Brown; +Cc: Zhang Rui, LKML, Thomas Renninger, clarkt, Andrew Morton

On Wed, 2010-06-30 at 11:17 -0400, Len Brown wrote:
> > /sys/firmware/acpi/interrupts/sci_not:       1
> 
> This means that acpi_irq was invoked and could not find a cause for the 
> interrupt.
> 
> Does /proc/interrupts show that the acpi interrupt shares an IRQ
> with another device, or is it alone?
> 

It looks to be alone:

# cat /proc/interrupts 
            CPU0       
   0:        158   IO-APIC-edge      timer
   1:        334   IO-APIC-edge      i8042
   4:          4   IO-APIC-edge    
   6:          5   IO-APIC-edge      floppy
   7:          0   IO-APIC-edge      parport0
   8:          0   IO-APIC-edge      rtc0
   9:          0   IO-APIC-fasteoi   acpi
  12:     238565   IO-APIC-edge      i8042
  14:     158753   IO-APIC-edge      ata_piix
  15:     604816   IO-APIC-edge      ata_piix
  16:      38352   IO-APIC-fasteoi   uhci_hcd:usb2, i915@pci:0000:00:02.0
  17:       3164   IO-APIC-fasteoi   Intel 82801DB-ICH4, eth0
  18:          0   IO-APIC-fasteoi   uhci_hcd:usb4
  19:    1652995   IO-APIC-fasteoi   uhci_hcd:usb3
  23:    3718117   IO-APIC-fasteoi   ehci_hcd:usb1
 NMI:      16288   Non-maskable interrupts
 LOC:    7599309   Local timer interrupts
 SPU:          0   Spurious interrupts
 PMI:          0   Performance monitoring interrupts
 PND:          0   Performance pending work
 RES:          0   Rescheduling interrupts
 CAL:          0   Function call interrupts
 TLB:          0   TLB shootdowns
 TRM:          0   Thermal event interrupts
 THR:          0   Threshold APIC interrupts
 MCE:          0   Machine check exceptions
 MCP:        523   Machine check polls
 ERR:          6
 MIS:          0
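
As an aside for readers, a rough sketch (not part of the original message) of how the "shared or alone" question could be checked from this output. `parse_proc_interrupts` and `find_irq` are hypothetical helpers, and the parsing assumes the column layout shown above: one count per CPU, one controller token, then a comma-separated device list. Newer kernels add extra columns, so treat this as illustrative only.

```python
from typing import Dict, List, Optional, Tuple


def parse_proc_interrupts(text: str) -> Dict[int, List[str]]:
    """Map each numbered IRQ to the list of devices registered on it.

    Assumes the layout shown in the thread: a header naming the CPUs,
    then "<irq>: <count per CPU> <controller> <dev>[, <dev>...]".
    """
    lines = [line for line in text.splitlines() if line.strip()]
    ncpus = len(lines[0].split())  # header is "CPU0 CPU1 ..."
    table: Dict[int, List[str]] = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():
            continue  # skip NMI, LOC, ERR and the other summary rows
        # Split off the per-CPU counts and the controller name; whatever
        # remains is the device list (it may be missing entirely).
        fields = rest.split(None, ncpus + 1)
        devices_str = fields[ncpus + 1] if len(fields) > ncpus + 1 else ""
        table[int(label)] = [d.strip() for d in devices_str.split(",") if d.strip()]
    return table


def find_irq(table: Dict[int, List[str]], device: str) -> Optional[Tuple[int, List[str]]]:
    """Return (irq, other devices on that irq) for `device`, or None."""
    for irq, devices in table.items():
        if device in devices:
            return irq, [d for d in devices if d != device]
    return None


# A few rows copied from the output above.
sample = """\
            CPU0
   4:          4   IO-APIC-edge
   9:          0   IO-APIC-fasteoi   acpi
  16:      38352   IO-APIC-fasteoi   uhci_hcd:usb2, i915@pci:0000:00:02.0
 NMI:      16288   Non-maskable interrupts
"""

table = parse_proc_interrupts(sample)
print(find_irq(table, "acpi"))  # prints (9, [])
```

An empty list of other devices corresponds to the acpi line being alone on IRQ 9, as reported above.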

-- Steve


