* Re: 2.5.68-mmX: Drowning in irq 7: nobody cared!
@ 2003-05-06 21:44 Chuck Ebbert
From: Chuck Ebbert @ 2003-05-06 21:44 UTC
To: Andrew Morton; +Cc: linux-kernel, Alan Cox
Andrew Morton wrote:
> Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> >
> > It seems the heuristic is more complicated
>
> Any suggestions?
Does this pseudocode look like it would work? It should make it
only complain if two or more interrupts in a row go unhandled.
int last_irq_was_dropped[NR_IRQS];

/* call each handler in turn for this irq */
for (each_driver(irq)) {
	ret = call_driver();
	if (ret == irq_handled) {
		/* a handler claimed it: clear the "dropped" flag */
		if (unlikely(last_irq_was_dropped[irq]))
			last_irq_was_dropped[irq] = 0;
		break;
	}
}

/* nobody claimed it: complain only on the second miss in a row */
if (ret != irq_handled) {
	if (unlikely(last_irq_was_dropped[irq]))
		complain();
	else
		last_irq_was_dropped[irq] = 1;
}
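For anyone who wants to experiment with the idea outside the kernel, here is a minimal standalone C sketch of the same check. It is only an illustration: `note_irq_result()` and the table size of 16 are hypothetical names and values chosen for this sketch, not anything in the kernel tree, and the handler loop is reduced to a single "was it handled" flag.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_IRQS 16	/* hypothetical table size for this sketch */

/* One flag per IRQ line: did the previous interrupt go unhandled? */
static bool last_irq_was_dropped[NR_IRQS];

/*
 * Record the outcome of one interrupt on `irq`.
 * `handled` is true if any handler on the line claimed it.
 * Returns true when the caller should complain, i.e. this is the
 * second (or later) unhandled interrupt in a row on that line.
 */
static bool note_irq_result(unsigned int irq, bool handled)
{
	assert(irq < NR_IRQS);

	if (handled) {
		last_irq_was_dropped[irq] = false;
		return false;
	}
	if (last_irq_was_dropped[irq])
		return true;
	last_irq_was_dropped[irq] = true;
	return false;
}
```

With this rule an alternating handled/unhandled pattern never complains, because every handled interrupt clears the flag before the next miss arrives.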
* 2.5.68-mmX: Drowning in irq 7: nobody cared!
@ 2003-05-05 13:23 Shane Shrybman
From: Shane Shrybman @ 2003-05-05 13:23 UTC
To: linux-kernel

Hi,

I am getting a lot of these in the logs. This is with the ALSA emu10k1
driver for a SB live card. This is an x86, UP, KT133 system with preempt
enabled. The system seems to be running fine.

handlers:
[<d8986540>] (gcc2_compiled.+0x0/0x390 [snd_emu10k1])
irq 7: nobody cared!
Call Trace:
 [<c010c5c2>] handle_IRQ_event+0xa2/0x110
 [<c010c7c0>] do_IRQ+0xa0/0x130
 [<c010b14c>] common_interrupt+0x18/0x20
 [<c012048c>] do_softirq+0x3c/0xa0
 [<c010c826>] do_IRQ+0x106/0x130
 [<c010b14c>] common_interrupt+0x18/0x20
 [<c0108884>] default_idle+0x24/0x30
 [<c0114e55>] apm_cpu_idle+0x125/0x170
 [<c0114d30>] apm_cpu_idle+0x0/0x170
 [<c0108860>] default_idle+0x0/0x30
 [<c0108902>] cpu_idle+0x32/0x50
 [<c0105000>] _stext+0x0/0x60
 [<c02c46be>] start_kernel+0x15e/0x170

handlers:
[<d8986540>] (gcc2_compiled.+0x0/0x390 [snd_emu10k1])
irq 7: nobody cared!
Call Trace:
 [<c010c5c2>] handle_IRQ_event+0xa2/0x110
 [<c010c7c0>] do_IRQ+0xa0/0x130
 [<c010b14c>] common_interrupt+0x18/0x20
 [<c012048c>] do_softirq+0x3c/0xa0
 [<c010c826>] do_IRQ+0x106/0x130
 [<c010b14c>] common_interrupt+0x18/0x20
 [<c01203dd>] current_kernel_time+0xd/0x40
 [<c015f4d5>] inode_update_time+0x15/0x90
 [<c01fd188>] memcpy_toiovec+0x68/0xb0
 [<c0132af0>] generic_file_aio_write_nolock+0x390/0x9c0
 [<c01fb9f7>] kfree_skbmem+0x17/0x20
 [<c011b0f0>] autoremove_wake_function+0x0/0x40
 [<c01f86dc>] sock_recvmsg+0x8c/0xb0
 [<c013318f>] generic_file_write_nolock+0x6f/0x90
 [<c0119cec>] __wake_up+0x1c/0x40
 [<c0134737>] __alloc_pages+0x97/0x3a0
 [<c0134a33>] __alloc_pages+0x393/0x3a0
 [<c01f97f2>] sys_recvfrom+0xa2/0x100
 [<c01f9836>] sys_recvfrom+0xe6/0x100
 [<c0134a5a>] __get_free_pages+0x1a/0x50
 [<c0134481>] free_hot_cold_page+0x21/0xf0
 [<c0133350>] generic_file_writev+0x30/0x50
 [<c014933d>] do_readv_writev+0x1bd/0x260
 [<c0148e30>] do_sync_write+0x0/0xb0
 [<c01f9ecf>] sys_socketcall+0x15f/0x1f0
 [<c0149474>] vfs_writev+0x44/0x50
 [<c01494e8>] sys_writev+0x28/0x40
 [<c010a7df>] syscall_call+0x7/0xb

           CPU0
  0:   45130454          XT-PIC  timer
  1:       6730          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  5:     278881          XT-PIC  uhci-hcd, uhci-hcd
  7:     128713          XT-PIC  EMU10K1
  8:          1          XT-PIC  rtc
 10:     983196          XT-PIC  ide2, ide3, bttv0
 11:    3031816          XT-PIC  eth0
 12:         60          XT-PIC  i8042, i8042, i8042, i8042
 14:      67179          XT-PIC  ide0
 15:        520          XT-PIC  ide1
NMI:          0
LOC:   45131802
ERR:        182
MIS:          0

BTW: What about the 4 i8042's on irq 12? Is this normal/OK?

Regards,

Shane
* Re: 2.5.68-mmX: Drowning in irq 7: nobody cared!
@ 2003-05-05 21:30 Andrew Morton
From: Andrew Morton @ 2003-05-05 21:30 UTC
To: Shane Shrybman; +Cc: linux-kernel

Shane Shrybman <shrybman@sympatico.ca> wrote:
>
> Hi,
>
> I am getting a lot of these in the logs. This is with the ALSA emu10k1
> driver for a SB live card. This is an x86, UP, KT133 system with preempt
> enabled. The system seems to be running fine.
>
> handlers:
> [<d8986540>] (gcc2_compiled.+0x0/0x390 [snd_emu10k1])
> irq 7: nobody cared!

Beats me. Does this fix it up?

diff -puN sound/pci/emu10k1/irq.c~sound-irq-hack sound/pci/emu10k1/irq.c
--- 25/sound/pci/emu10k1/irq.c~sound-irq-hack	Mon May  5 14:28:58 2003
+++ 25-akpm/sound/pci/emu10k1/irq.c	Mon May  5 14:29:17 2003
@@ -147,5 +147,5 @@ irqreturn_t snd_emu10k1_interrupt(int ir
 			outl(IPR_FXDSP, emu->port + IPR);
 		}
 	}
-	return IRQ_RETVAL(handled);
+	return IRQ_HANDLED;
 }

_
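To make the effect of that one-liner concrete, here is a small standalone C sketch. It assumes `IRQ_RETVAL(x)` evaluates to "handled" exactly when `x` is nonzero, which is my understanding of the 2.5-era definition; the `SKETCH_*` macros and both handler functions are local stand-ins written for this illustration and are not the kernel's headers or the emu10k1 driver code.

```c
/* Local stand-ins for the kernel's return-value macros (assumed semantics). */
typedef int irqreturn_sketch_t;
#define SKETCH_IRQ_NONE		0
#define SKETCH_IRQ_HANDLED	1
#define SKETCH_IRQ_RETVAL(x)	((x) != 0)

/* Before the patch: claim the interrupt only if a status bit was serviced. */
static irqreturn_sketch_t handler_before(unsigned int status_bits)
{
	int handled = 0;

	if (status_bits != 0)
		handled = 1;	/* stands in for the per-source service loop */
	return SKETCH_IRQ_RETVAL(handled);
}

/* After the patch: always claim the interrupt, even with nothing to do. */
static irqreturn_sketch_t handler_after(unsigned int status_bits)
{
	(void)status_bits;
	return SKETCH_IRQ_HANDLED;
}
```

With no status bits set, `handler_before(0)` reports the interrupt as unclaimed and `handler_after(0)` claims it. That silences the "nobody cared" report, but it also hides any interrupt that really is stray.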
* Re: 2.5.68-mmX: Drowning in irq 7: nobody cared!
@ 2003-05-06 9:35 Alan Cox
From: Alan Cox @ 2003-05-06 9:35 UTC
To: Andrew Morton; +Cc: Shane Shrybman, Linux Kernel Mailing List

On Mon, 2003-05-05 at 22:30, Andrew Morton wrote:
> Shane Shrybman <shrybman@sympatico.ca> wrote:
> >
> > Hi,
> >
> > I am getting a lot of these in the logs. This is with the ALSA emu10k1
> > driver for a SB live card. This is an x86, UP, KT133 system with preempt
> > enabled. The system seems to be running fine.
> >
> > handlers:
> > [<d8986540>] (gcc2_compiled.+0x0/0x390 [snd_emu10k1])
> > irq 7: nobody cared!
>
> Beats me. Does this fix it up?

With APIC at least it doesn't surprise me in the least. The IRQ hack seems
extremely racy. Remember on most systems (especially with PIII type
APIC) IRQ delivery is asynchronous to the bus so you get

IRQ arrives
	sound card
		loop
			clean up IRQ
					IRQ sent
			still more work, do it
			done
		HANDLED

IRQ arrives
	sound card
		Umm duh no work for me
		NOT HANDLED

Whine

For anything where you get pairs of close IRQ's
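The interleaving described above can be reproduced with a toy model in plain C. This is a sketch under simplifying assumptions, not a description of real APIC behaviour: `struct fake_dev`, `dev_raise()` and `deliver_irq()` are hypothetical names, and "an interrupt is in flight" is reduced to a single latched flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model: a work counter plus one latched IRQ. */
struct fake_dev {
	int  pending_work;	/* units of work the device has queued */
	bool irq_latched;	/* an interrupt is in flight to the CPU */
};

/* Device side: queue one unit of work and signal an interrupt. */
static void dev_raise(struct fake_dev *d)
{
	d->pending_work++;
	d->irq_latched = true;
}

/*
 * Driver side: take the latched interrupt and drain all queued work.
 * `arrivals_during_handler` is how many extra events land while the
 * handler is still looping. Returns true if any work was done.
 */
static bool deliver_irq(struct fake_dev *d, int arrivals_during_handler)
{
	bool handled = false;

	assert(arrivals_during_handler >= 0);
	d->irq_latched = false;		/* CPU accepts the interrupt */
	while (d->pending_work > 0) {
		d->pending_work--;
		handled = true;
		if (arrivals_during_handler > 0) {
			arrivals_during_handler--;
			dev_raise(d);	/* second event lands mid-loop */
		}
	}
	return handled;
}
```

If one event arrives during the first call, that call returns true with the work already drained but `irq_latched` set again. The next `deliver_irq()` then finds nothing to do and returns false, which is the "NOT HANDLED" step in the timeline.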
* Re: 2.5.68-mmX: Drowning in irq 7: nobody cared!
@ 2003-05-06 10:41 Zwane Mwaikambo
From: Zwane Mwaikambo @ 2003-05-06 10:41 UTC
To: Alan Cox; +Cc: Andrew Morton, Shane Shrybman, Linux Kernel Mailing List

On Tue, 6 May 2003, Alan Cox wrote:

> With APIC at least it doesn't surprise me in the least. The IRQ hack seems
> extremely racy. Remember on most systems (especially with PIII type
> APIC) IRQ delivery is asynchronous to the bus so you get
>
> IRQ arrives
> 	sound card
> 		loop
> 			clean up IRQ
> 					IRQ sent
> 			still more work, do it
> 			done
> 		HANDLED
>
> IRQ arrives
> 	sound card
> 		Umm duh no work for me
> 		NOT HANDLED
>
> Whine
>
> For anything where you get pairs of close IRQ's

Shouldn't this also be observed more easily on P4/xAPIC since you can have
a pending vector in the IRR and ISR whilst the core processes one?

	Zwane
--
function.linuxpower.ca
* Re: 2.5.68-mmX: Drowning in irq 7: nobody cared!
@ 2003-05-06 11:17 Alan Cox
From: Alan Cox @ 2003-05-06 11:17 UTC
To: Zwane Mwaikambo; +Cc: Andrew Morton, Shane Shrybman, Linux Kernel Mailing List

On Tue, 2003-05-06 at 11:41, Zwane Mwaikambo wrote:
> > For anything where you get pairs of close IRQ's
>
> Shouldn't this also be observed more easily on P4/xAPIC since you can have
> a pending vector in the IRR and ISR whilst the core processes one?

I don't know enough about the pending vector stuff. For the older APIC
the IRQ's go via a surprisingly slow separate APIC bus (4 wire if I
remember rightly).
* Re: 2.5.68-mmX: Drowning in irq 7: nobody cared!
@ 2003-05-06 15:17 Andrew Morton
From: Andrew Morton @ 2003-05-06 15:17 UTC
To: Alan Cox; +Cc: shrybman, linux-kernel

Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
>
> With APIC at least it doesn't surprise me in the least. The IRQ hack seems
> extremely racy.

Good point. How about we do something like "if half of the past 1000
interrupts weren't handled then try to kill the IRQ"?


 arch/i386/kernel/irq.c |   55 +++++++++++++++++++++++++++++++++++++++----------
 include/linux/irq.h    |    2 +
 2 files changed, 46 insertions(+), 11 deletions(-)

diff -puN arch/i386/kernel/irq.c~irq-check-rate-limit arch/i386/kernel/irq.c
--- 25/arch/i386/kernel/irq.c~irq-check-rate-limit	2003-05-06 07:54:17.000000000 -0700
+++ 25-akpm/arch/i386/kernel/irq.c	2003-05-06 08:16:15.000000000 -0700
@@ -66,8 +66,12 @@
 /*
  * Controller mappings for all interrupt sources:
  */
-irq_desc_t irq_desc[NR_IRQS] __cacheline_aligned =
-	{ [0 ... NR_IRQS-1] = { 0, &no_irq_type, NULL, 0, SPIN_LOCK_UNLOCKED}};
+irq_desc_t irq_desc[NR_IRQS] __cacheline_aligned = {
+	[0 ... NR_IRQS-1] = {
+		.handler = &no_irq_type,
+		.lock = SPIN_LOCK_UNLOCKED
+	}
+};
 
 static void register_irq_proc (unsigned int irq);
 
@@ -209,7 +213,6 @@ int handle_IRQ_event(unsigned int irq,
 {
 	int status = 1;	/* Force the "do bottom halves" bit */
 	int retval = 0;
-	struct irqaction *first_action = action;
 
 	if (!(action->flags & SA_INTERRUPT))
 		local_irq_enable();
@@ -222,19 +225,43 @@ int handle_IRQ_event(unsigned int irq,
 	if (status & SA_SAMPLE_RANDOM)
 		add_interrupt_randomness(irq);
 	local_irq_disable();
-	if (retval != 1) {
+	return status;
+}
+
+/*
+ * If 500 of the previous 1000 interrupts have not been handled then assume
+ * that the IRQ is stuck in some manner.  Drop a diagnostic and try to turn the
+ * IRQ off.
+ *
+ * Called under desc->lock
+ */
+static void note_interrupt(irq_desc_t *desc, int irq, int status)
+{
+	if (status != IRQ_HANDLED)
+		desc->irqs_unhandled++;
+	desc->irq_count++;
+	if (desc->irq_count < 1000)
+		return;
+
+	desc->irq_count = 0;
+	if (desc->irqs_unhandled > 500) {
+		/*
+		 * The interrupt is stuck
+		 */
 		static int count = 100;
+		struct irqaction *action;
+
 		if (count) {
 			count--;
-			if (retval) {
+			if (status) {
 				printk("irq event %d: bogus retval mask %x\n",
-					irq, retval);
+					irq, status);
 			} else {
 				printk("irq %d: nobody cared!\n", irq);
 			}
 			dump_stack();
 			printk("handlers:\n");
-			action = first_action;
+			action = desc->action;
 			do {
 				printk("[<%p>]", action->handler);
 				print_symbol(" (%s)",
@@ -243,9 +270,13 @@ int handle_IRQ_event(unsigned int irq,
 				action = action->next;
 			} while (action);
 		}
+		/*
+		 * Now kill the IRQ
+		 */
+		desc->status |= IRQ_DISABLED;
+		desc->handler->disable(irq);
 	}
-
-	return status;
+	desc->irqs_unhandled = 0;
 }
 
 /*
@@ -418,10 +449,12 @@ asmlinkage unsigned int do_IRQ(struct pt
 		 * SMP environment.
 		 */
 		for (;;) {
+			int status;
+
 			spin_unlock(&desc->lock);
-			handle_IRQ_event(irq, &regs, action);
+			status = handle_IRQ_event(irq, &regs, action);
 			spin_lock(&desc->lock);
-
+			note_interrupt(desc, irq, status);
 			if (likely(!(desc->status & IRQ_PENDING)))
 				break;
 			desc->status &= ~IRQ_PENDING;
diff -puN include/linux/irq.h~irq-check-rate-limit include/linux/irq.h
--- 25/include/linux/irq.h~irq-check-rate-limit	2003-05-06 07:56:03.000000000 -0700
+++ 25-akpm/include/linux/irq.h	2003-05-06 08:05:17.000000000 -0700
@@ -61,6 +61,8 @@ typedef struct {
 	hw_irq_controller *handler;
 	struct irqaction *action;	/* IRQ action list */
 	unsigned int depth;		/* nested irq disables */
+	unsigned int irq_count;		/* For detecting broken interrupts */
+	unsigned int irqs_unhandled;
 	spinlock_t lock;
 } ____cacheline_aligned irq_desc_t;
 

_
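Stripped of the kernel plumbing, the counting logic of `note_interrupt()` in the patch above can be tried out as a standalone C function. This is a sketch, not the patch itself: `struct irq_stats`, `note_interrupt_sketch()` and the two macro names are hypothetical, the diagnostics and the IRQ disable are replaced by a boolean return value, and the 1000/500 figures are copied from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define IRQ_WINDOW		1000	/* interrupts per sampling window */
#define IRQ_UNHANDLED_MAX	500	/* stuck if more than this were unhandled */

/* Per-line counters, mirroring the two fields the patch adds to irq_desc_t. */
struct irq_stats {
	unsigned int irq_count;
	unsigned int irqs_unhandled;
};

/*
 * Account for one interrupt. Returns true only when a full window has
 * just closed and more than IRQ_UNHANDLED_MAX of its interrupts went
 * unhandled; both counters restart at the end of every window.
 */
static bool note_interrupt_sketch(struct irq_stats *s, bool handled)
{
	bool stuck;

	assert(s->irq_count < IRQ_WINDOW);

	if (!handled)
		s->irqs_unhandled++;
	s->irq_count++;
	if (s->irq_count < IRQ_WINDOW)
		return false;

	stuck = s->irqs_unhandled > IRQ_UNHANDLED_MAX;
	s->irq_count = 0;
	s->irqs_unhandled = 0;
	return stuck;
}
```

A strictly alternating handled/unhandled stream gives exactly 500 misses per window and stays just under the threshold, while 501 misses trip it. A device that often produces close pairs of interrupts therefore sits right at the edge of this check.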
* Re: 2.5.68-mmX: Drowning in irq 7: nobody cared!
@ 2003-05-06 15:07 Alan Cox
From: Alan Cox @ 2003-05-06 15:07 UTC
To: Andrew Morton; +Cc: shrybman, Linux Kernel Mailing List

On Tue, 2003-05-06 at 16:17, Andrew Morton wrote:
> Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> >
> > With APIC at least it doesn't surprise me in the least. The IRQ hack seems
> > extremely racy.
>
> Good point. How about we do something like "if half of the past 1000
> interrupts weren't handled then try to kill the IRQ"?

And if it's a sound card generating close pairs of IRQs you might still
trip. It seems the heuristic is more complicated
* Re: 2.5.68-mmX: Drowning in irq 7: nobody cared!
@ 2003-05-06 16:12 Andrew Morton
From: Andrew Morton @ 2003-05-06 16:12 UTC
To: Alan Cox; +Cc: shrybman, linux-kernel

Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
>
> It seems the heuristic is more complicated

Any suggestions?
* Re: 2.5.68-mmX: Drowning in irq 7: nobody cared!
@ 2003-05-07 22:10 Shane Shrybman
From: Shane Shrybman @ 2003-05-07 22:10 UTC
To: Andrew Morton; +Cc: linux-kernel

Hi Andrew & Alan,

Sorry for the delay, but the one-liner below does seem to have cleared up
the issue. I have been running it for about eight hours, with some sound
on all the time, and haven't seen any 'nobody cared' messages.

BTW: I hand applied this to 2.5.69-mm1.

So I am confident that this one-liner did fix it.

On Mon, 2003-05-05 at 17:30, Andrew Morton wrote:
> Shane Shrybman <shrybman@sympatico.ca> wrote:
> >
> > Hi,
> >
> > I am getting a lot of these in the logs. This is with the ALSA emu10k1
> > driver for a SB live card. This is an x86, UP, KT133 system with preempt
> > enabled. The system seems to be running fine.
> >
> > handlers:
> > [<d8986540>] (gcc2_compiled.+0x0/0x390 [snd_emu10k1])
> > irq 7: nobody cared!
>
> Beats me. Does this fix it up?
>
> diff -puN sound/pci/emu10k1/irq.c~sound-irq-hack sound/pci/emu10k1/irq.c
> --- 25/sound/pci/emu10k1/irq.c~sound-irq-hack	Mon May  5 14:28:58 2003
> +++ 25-akpm/sound/pci/emu10k1/irq.c	Mon May  5 14:29:17 2003
> @@ -147,5 +147,5 @@ irqreturn_t snd_emu10k1_interrupt(int ir
>  			outl(IPR_FXDSP, emu->port + IPR);
>  		}
>  	}
> -	return IRQ_RETVAL(handled);
> +	return IRQ_HANDLED;
>  }

Regards,

Shane