* Catching NForce2 lockup with NMI watchdog @ 2003-12-05 4:54 Jesse Allen 2003-12-05 7:40 ` Mikael Pettersson 0 siblings, 1 reply; 62+ messages in thread From: Jesse Allen @ 2003-12-05 4:54 UTC (permalink / raw) To: linux-kernel Hi, I have a NForce2 board and can easily reproduce a lockup with grep on an IDE hard disk at UDMA 100. The lockup occurs when both Local APIC + IO-APIC are enabled. It was suggested to me to use NMI watchdog to catch it. However, the NMI watchdog doesn't seem to work. When I set the kernel parameter "nmi_watchdog=1" I get this message in /var/log/syslog: Dec 4 20:10:30 tesore kernel: ..MP-BIOS bug: 8254 timer not connected to IO-APIC Dec 4 20:10:30 tesore kernel: timer doesn't work through the IO-APIC - disabling NMI Watchdog! "nmi_watchdog=2" seems to work at first, In /var/log/messages: Dec 4 20:13:11 tesore kernel: testing NMI watchdog ... OK. but it still locks up. I have the complete logs when running with nmi_watchdog, kernel config, and more here: http://www.chez.com/alors/nforce-lockup-logs.tar.gz If you have any ideas please give them =) Jesse ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-05 4:54 Catching NForce2 lockup with NMI watchdog Jesse Allen @ 2003-12-05 7:40 ` Mikael Pettersson 2003-12-05 8:33 ` Josh McKinney 2003-12-05 8:58 ` Mike Fedyk 0 siblings, 2 replies; 62+ messages in thread From: Mikael Pettersson @ 2003-12-05 7:40 UTC (permalink / raw) To: Jesse Allen; +Cc: linux-kernel Jesse Allen writes: > Hi, > > I have a NForce2 board and can easily reproduce a lockup with grep on an IDE > hard disk at UDMA 100. The lockup occurs when both Local APIC + IO-APIC are > enabled. It was suggested to me to use NMI watchdog to catch it. However, the > NMI watchdog doesn't seem to work. > > When I set the kernel parameter "nmi_watchdog=1" I get this message in > /var/log/syslog: > Dec 4 20:10:30 tesore kernel: ..MP-BIOS bug: 8254 timer not connected to > IO-APIC > Dec 4 20:10:30 tesore kernel: timer doesn't work through the IO-APIC - > disabling NMI Watchdog! > > "nmi_watchdog=2" seems to work at first, In /var/log/messages: > Dec 4 20:13:11 tesore kernel: testing NMI watchdog ... OK. > but it still locks up. The NMI watchdog can only handle software lockups, since it relies on the CPU, and for nmi_watchdog=1 the I/O-APIC + bus, still running. Hardware lockups result in, well, hardware lockups :-( ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-05 7:40 ` Mikael Pettersson @ 2003-12-05 8:33 ` Josh McKinney 2003-12-05 12:14 ` Mikael Pettersson 2003-12-05 8:58 ` Mike Fedyk 1 sibling, 1 reply; 62+ messages in thread From: Josh McKinney @ 2003-12-05 8:33 UTC (permalink / raw) To: linux-kernel On approximately Fri, Dec 05, 2003 at 08:40:58AM +0100, Mikael Pettersson wrote: > Jesse Allen writes: > > Hi, > > > > I have a NForce2 board and can easily reproduce a lockup with grep on an IDE > > hard disk at UDMA 100. The lockup occurs when both Local APIC + IO-APIC are > > enabled. It was suggested to me to use NMI watchdog to catch it. However, the > > NMI watchdog doesn't seem to work. > > > > When I set the kernel parameter "nmi_watchdog=1" I get this message in > > /var/log/syslog: > > Dec 4 20:10:30 tesore kernel: ..MP-BIOS bug: 8254 timer not connected to > > IO-APIC > > Dec 4 20:10:30 tesore kernel: timer doesn't work through the IO-APIC - > > disabling NMI Watchdog! > > > > "nmi_watchdog=2" seems to work at first, In /var/log/messages: > > Dec 4 20:13:11 tesore kernel: testing NMI watchdog ... OK. > > but it still locks up. > > The NMI watchdog can only handle software lockups, since it relies on > the CPU, and for nmi_watchdog=1 the I/O-APIC + bus, still running. > Hardware lockups result in, well, hardware lockups :-( So does this confirm that the lockups with nforce2 chipsets and apic is actually a hardware problem after all? -- Josh McKinney | Webmaster: http://joshandangie.org -------------------------------------------------------------------------- | They that can give up essential liberty Linux, the choice -o) | to obtain a little temporary safety deserve of the GNU generation /\ | neither liberty or safety. _\_v | -Benjamin Franklin ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-05 8:33 ` Josh McKinney @ 2003-12-05 12:14 ` Mikael Pettersson 2003-12-05 14:19 ` Craig Bradney 0 siblings, 1 reply; 62+ messages in thread From: Mikael Pettersson @ 2003-12-05 12:14 UTC (permalink / raw) To: Josh McKinney; +Cc: linux-kernel Josh McKinney writes: > On approximately Fri, Dec 05, 2003 at 08:40:58AM +0100, Mikael Pettersson wrote: > > Jesse Allen writes: > > > Hi, > > > > > > I have a NForce2 board and can easily reproduce a lockup with grep on an IDE > > > hard disk at UDMA 100. The lockup occurs when both Local APIC + IO-APIC are > > > enabled. It was suggested to me to use NMI watchdog to catch it. However, the > > > NMI watchdog doesn't seem to work. > > > > > > When I set the kernel parameter "nmi_watchdog=1" I get this message in > > > /var/log/syslog: > > > Dec 4 20:10:30 tesore kernel: ..MP-BIOS bug: 8254 timer not connected to > > > IO-APIC > > > Dec 4 20:10:30 tesore kernel: timer doesn't work through the IO-APIC - > > > disabling NMI Watchdog! > > > > > > "nmi_watchdog=2" seems to work at first, In /var/log/messages: > > > Dec 4 20:13:11 tesore kernel: testing NMI watchdog ... OK. > > > but it still locks up. > > > > The NMI watchdog can only handle software lockups, since it relies on > > the CPU, and for nmi_watchdog=1 the I/O-APIC + bus, still running. > > Hardware lockups result in, well, hardware lockups :-( > > So does this confirm that the lockups with nforce2 chipsets and apic > is actually a hardware problem after all? Confirm with very high probability. There may be quirks in nVidia's chipset that we (unlike their Windoze drivers) don't know about. Ask nVidia for detailed chipset documentation. Then maybe we can fix this. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-05 12:14 ` Mikael Pettersson @ 2003-12-05 14:19 ` Craig Bradney 2003-12-05 17:05 ` Craig Bradney 2003-12-05 18:11 ` Josh McKinney 0 siblings, 2 replies; 62+ messages in thread From: Craig Bradney @ 2003-12-05 14:19 UTC (permalink / raw) To: Mikael Pettersson; +Cc: Josh McKinney, linux-kernel I'm getting those in dmesg too... ..MP-BIOS bug: 8254 timer not connected to IO-APIC ...trying to set up timer (IRQ0) through the 8259A ... failed. ...trying to set up timer as Virtual Wire IRQ... failed. ...trying to set up timer as ExtINT IRQ... works. Do you really think this could be the problem? If so, any ideas why I am relatively lucky to not have the crashes people are having? 5.5 days, then 5 hours, and now Im up to 17 hours... with a decent amount of use combined with idle time. Craig On Fri, 2003-12-05 at 13:14, Mikael Pettersson wrote: > Josh McKinney writes: > > On approximately Fri, Dec 05, 2003 at 08:40:58AM +0100, Mikael Pettersson wrote: > > > Jesse Allen writes: > > > > Hi, > > > > > > > > I have a NForce2 board and can easily reproduce a lockup with grep on an IDE > > > > hard disk at UDMA 100. The lockup occurs when both Local APIC + IO-APIC are > > > > enabled. It was suggested to me to use NMI watchdog to catch it. However, the > > > > NMI watchdog doesn't seem to work. > > > > > > > > When I set the kernel parameter "nmi_watchdog=1" I get this message in > > > > /var/log/syslog: > > > > Dec 4 20:10:30 tesore kernel: ..MP-BIOS bug: 8254 timer not connected to > > > > IO-APIC > > > > Dec 4 20:10:30 tesore kernel: timer doesn't work through the IO-APIC - > > > > disabling NMI Watchdog! > > > > > > > > "nmi_watchdog=2" seems to work at first, In /var/log/messages: > > > > Dec 4 20:13:11 tesore kernel: testing NMI watchdog ... OK. > > > > but it still locks up. 
> > > > > > The NMI watchdog can only handle software lockups, since it relies on > > > the CPU, and for nmi_watchdog=1 the I/O-APIC + bus, still running. > > > Hardware lockups result in, well, hardware lockups :-( > > > > So does this confirm that the lockups with nforce2 chipsets and apic > > is actually a hardware problem after all? > > Confirm with very high probability. There may be quirks in nVidia's > chipset that we (unlike their Windoze drivers) don't know about. > > Ask nVidia for detailed chipset documentation. Then maybe we can fix this.
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-05 14:19 ` Craig Bradney @ 2003-12-05 17:05 ` Craig Bradney 2003-12-05 18:11 ` Josh McKinney 1 sibling, 0 replies; 62+ messages in thread From: Craig Bradney @ 2003-12-05 17:05 UTC (permalink / raw) To: Mikael Pettersson; +Cc: Josh McKinney, linux-kernel Having just had another hang.. I tried booting with nmi-watchdog=1 and then with 2. I am currently running from the boot with 2 selected. In my current dmesg I have these which dont normally appear and didnt appear in the boot with 1 set. Any ideas? hda: IRQ probe failed (0xfffffcfa) hdb: IRQ probe failed (0xfffffcfa) hdb: IRQ probe failed (0xfffffcfa) Craig On Fri, 2003-12-05 at 15:19, Craig Bradney wrote: > I'm getting those in dmesg too... > > ..MP-BIOS bug: 8254 timer not connected to IO-APIC > ...trying to set up timer (IRQ0) through the 8259A ... failed. > ...trying to set up timer as Virtual Wire IRQ... failed. > ...trying to set up timer as ExtINT IRQ... works. > > > Do you really think this could be the problem? > > If so, any ideas why I am relatively lucky to not have the crashes > people are having? 5.5 days, then 5 hours, and now Im up to 17 hours... > with a decent amount of use combined with idle time. > > Craig > > > On Fri, 2003-12-05 at 13:14, Mikael Pettersson wrote: > > Josh McKinney writes: > > > On approximately Fri, Dec 05, 2003 at 08:40:58AM +0100, Mikael Pettersson wrote: > > > > Jesse Allen writes: > > > > > Hi, > > > > > > > > > > I have a NForce2 board and can easily reproduce a lockup with grep on an IDE > > > > > hard disk at UDMA 100. The lockup occurs when both Local APIC + IO-APIC are > > > > > enabled. It was suggested to me to use NMI watchdog to catch it. However, the > > > > > NMI watchdog doesn't seem to work. 
> > > > > > > > > > When I set the kernel parameter "nmi_watchdog=1" I get this message in > > > > > /var/log/syslog: > > > > > Dec 4 20:10:30 tesore kernel: ..MP-BIOS bug: 8254 timer not connected to > > > > > IO-APIC > > > > > Dec 4 20:10:30 tesore kernel: timer doesn't work through the IO-APIC - > > > > > disabling NMI Watchdog! > > > > > > > > > > "nmi_watchdog=2" seems to work at first, In /var/log/messages: > > > > > Dec 4 20:13:11 tesore kernel: testing NMI watchdog ... OK. > > > > > but it still locks up. > > > > > > > > The NMI watchdog can only handle software lockups, since it relies on > > > > the CPU, and for nmi_watchdog=1 the I/O-APIC + bus, still running. > > > > Hardware lockups result in, well, hardware lockups :-( > > > > > > So does this confirm that the lockups with nforce2 chipsets and apic > > > is actually a hardware problem after all? > > > > Confirm with very high probability. There may be quirks in nVidia's > > chipset that we (unlike their Windoze drivers) don't know about. > > > > Ask nVidia for detailed chipset documentation. Then maybe we can fix this.
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-05 14:19 ` Craig Bradney 2003-12-05 17:05 ` Craig Bradney @ 2003-12-05 18:11 ` Josh McKinney 1 sibling, 0 replies; 62+ messages in thread From: Josh McKinney @ 2003-12-05 18:11 UTC (permalink / raw) To: linux-kernel Please don't CC me, I am on the list, thanks. On approximately Fri, Dec 05, 2003 at 03:19:33PM +0100, Craig Bradney wrote: > I'm getting those in dmesg too... > > ..MP-BIOS bug: 8254 timer not connected to IO-APIC > ...trying to set up timer (IRQ0) through the 8259A ... failed. > ...trying to set up timer as Virtual Wire IRQ... failed. > ...trying to set up timer as ExtINT IRQ... works. > > > Do you really think this could be the problem? > > If so, any ideas why I am relatively lucky to not have the crashes > people are having? 5.5 days, then 5 hours, and now Im up to 17 hours... > with a decent amount of use combined with idle time. > > Craig > At least two of us are lucky. I can't reproduce the crashes "anymore" either. I am up to 2 days now, was up to 3 or 4 before I booted 2.4.23 for a while to see if I could make that kernel crash, which I couldn't. I will see how long I can go, since 5 days or so seems to be the top uptime. -- Josh McKinney | Webmaster: http://joshandangie.org -------------------------------------------------------------------------- | They that can give up essential liberty Linux, the choice -o) | to obtain a little temporary safety deserve of the GNU generation /\ | neither liberty or safety. _\_v | -Benjamin Franklin ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-05 7:40 ` Mikael Pettersson 2003-12-05 8:33 ` Josh McKinney @ 2003-12-05 8:58 ` Mike Fedyk 2003-12-05 12:06 ` Mikael Pettersson 2003-12-08 2:20 ` Bob 1 sibling, 2 replies; 62+ messages in thread From: Mike Fedyk @ 2003-12-05 8:58 UTC (permalink / raw) To: Mikael Pettersson; +Cc: Jesse Allen, linux-kernel On Fri, Dec 05, 2003 at 08:40:58AM +0100, Mikael Pettersson wrote: > Jesse Allen writes: > > Hi, > > > > I have a NForce2 board and can easily reproduce a lockup with grep on an IDE > > hard disk at UDMA 100. The lockup occurs when both Local APIC + IO-APIC are > > enabled. It was suggested to me to use NMI watchdog to catch it. However, the > > NMI watchdog doesn't seem to work. > > > > When I set the kernel parameter "nmi_watchdog=1" I get this message in > > /var/log/syslog: > > Dec 4 20:10:30 tesore kernel: ..MP-BIOS bug: 8254 timer not connected to > > IO-APIC > > Dec 4 20:10:30 tesore kernel: timer doesn't work through the IO-APIC - > > disabling NMI Watchdog! > > > > "nmi_watchdog=2" seems to work at first, In /var/log/messages: > > Dec 4 20:13:11 tesore kernel: testing NMI watchdog ... OK. > > but it still locks up. > > The NMI watchdog can only handle software lockups, since it relies on > the CPU, and for nmi_watchdog=1 the I/O-APIC + bus, still running. > Hardware lockups result in, well, hardware lockups :-( But nmi_watchdog=1 is supposed to work with APIC, or IO-APIC, and it isn't for his motherboard. It doesn't increment NMI in /proc/interrupts. And it gives the above error message. Isn't that a bug? ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-05 8:58 ` Mike Fedyk @ 2003-12-05 12:06 ` Mikael Pettersson 2003-12-08 2:20 ` Bob 1 sibling, 0 replies; 62+ messages in thread From: Mikael Pettersson @ 2003-12-05 12:06 UTC (permalink / raw) To: Mike Fedyk; +Cc: Jesse Allen, linux-kernel Mike Fedyk writes: > On Fri, Dec 05, 2003 at 08:40:58AM +0100, Mikael Pettersson wrote: > > Jesse Allen writes: > > > Hi, > > > > > > I have a NForce2 board and can easily reproduce a lockup with grep on an IDE > > > hard disk at UDMA 100. The lockup occurs when both Local APIC + IO-APIC are > > > enabled. It was suggested to me to use NMI watchdog to catch it. However, the > > > NMI watchdog doesn't seem to work. > > > > > > When I set the kernel parameter "nmi_watchdog=1" I get this message in > > > /var/log/syslog: > > > Dec 4 20:10:30 tesore kernel: ..MP-BIOS bug: 8254 timer not connected to > > > IO-APIC > > > Dec 4 20:10:30 tesore kernel: timer doesn't work through the IO-APIC - > > > disabling NMI Watchdog! > > > > > > "nmi_watchdog=2" seems to work at first, In /var/log/messages: > > > Dec 4 20:13:11 tesore kernel: testing NMI watchdog ... OK. > > > but it still locks up. > > > > The NMI watchdog can only handle software lockups, since it relies on > > the CPU, and for nmi_watchdog=1 the I/O-APIC + bus, still running. > > Hardware lockups result in, well, hardware lockups :-( > > But nmi_watchdog=1 is supposed to work with APIC, or IO-APIC, and it isn't > for his motherboard. It doesn't increment NMI in /proc/interrupts. And it > gives the above error message. Isn't that a bug? nmi_watchdog=1 only falls back to nmi_watchdog=2 if no SMP is detected. If the I/O-APIC is detected but doesn't work, then the fallback does not happen, and you need to set nmi_watchdog=2 explicitly. ^ permalink raw reply [flat|nested] 62+ messages in thread
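The fallback rule Mikael describes can be written down as a small decision function. This is only a model of his description, not the kernel's code: the function name, the two boolean inputs and the use of 0 for "no watchdog running" are made up for illustration, and mode 2 is assumed to be supported by the CPU.

```python
def effective_watchdog(requested: int, smp_detected: bool,
                       ioapic_timer_works: bool) -> int:
    """Return the watchdog mode that ends up active.

    0 = no watchdog, 1 = I/O-APIC watchdog, 2 = local APIC watchdog.
    Hypothetical model of the rule quoted above, not kernel code.
    """
    if requested == 2:
        # Explicit request for the local APIC watchdog
        # (assumption: the processor supports it).
        return 2
    if requested == 1:
        if not smp_detected:
            # The only case in which =1 falls back to =2.
            return 2
        # I/O-APIC detected: no fallback, even if the timer
        # does not work through it.
        return 1 if ioapic_timer_works else 0
    return 0


if __name__ == "__main__":
    # Jesse's case: I/O-APIC detected, timer not routed through it.
    print(effective_watchdog(1, smp_detected=True, ioapic_timer_works=False))
```

In this model Jesse's board ends up with no watchdog at all under nmi_watchdog=1, which is why nmi_watchdog=2 has to be given explicitly.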
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-05 8:58 ` Mike Fedyk 2003-12-05 12:06 ` Mikael Pettersson @ 2003-12-08 2:20 ` Bob 2003-12-09 14:21 ` Maciej W. Rozycki 1 sibling, 1 reply; 62+ messages in thread
From: Bob @ 2003-12-08 2:20 UTC
To: linux-kernel

Mike Fedyk wrote:

>But nmi_watchdog=1 is supposed to work with APIC, or IO-APIC, and it isn't
>for his motherboard. It doesn't increment NMI in /proc/interrupts. And it
>gives the above error message. Isn't that a bug?

Do you mean like this, with an MSI K7N2 Delta MCP2-T mboard, nmi in
kernel, append="nmi_watchdog=1" in /etc/lilo.conf, and the
/proc/interrupts output below? Nothing "nmi" or "NMI" is logged.

cat /proc/interrupts
           CPU0
  0:  241105839          XT-PIC  timer
  1:      27337    IO-APIC-edge  i8042
  2:          0          XT-PIC  cascade
  8:          1    IO-APIC-edge  rtc
  9:          0   IO-APIC-level  acpi
 12:     217952    IO-APIC-edge  i8042
 14:         22    IO-APIC-edge  ide0
 15:         24    IO-APIC-edge  ide1
 16:    4245875   IO-APIC-level  3ware Storage Controller, yenta, yenta
 17:    5428737   IO-APIC-level  eth0
 21:          0   IO-APIC-level  NVidia nForce2
NMI:          0
LOC:  241091187
ERR:          0
MIS:          6
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-08 2:20 ` Bob @ 2003-12-09 14:21 ` Maciej W. Rozycki 2003-12-09 16:35 ` Bob 0 siblings, 1 reply; 62+ messages in thread
From: Maciej W. Rozycki @ 2003-12-09 14:21 UTC
To: Bob; +Cc: linux-kernel

On Sun, 7 Dec 2003, Bob wrote:

> I have append="nmi_watchdog=1" ? Nothing "nmi" or "NMI" is logged.
>
> cat /proc/interrupts
>            CPU0
>   0:  241105839          XT-PIC  timer
>   1:      27337    IO-APIC-edge  i8042
>   2:          0          XT-PIC  cascade
>   8:          1    IO-APIC-edge  rtc
>   9:          0   IO-APIC-level  acpi
>  12:     217952    IO-APIC-edge  i8042
>  14:         22    IO-APIC-edge  ide0
>  15:         24    IO-APIC-edge  ide1
>  16:    4245875   IO-APIC-level  3ware Storage Controller, yenta, yenta
>  17:    5428737   IO-APIC-level  eth0
>  21:          0   IO-APIC-level  NVidia nForce2
> NMI:          0
> LOC:  241091187
> ERR:          0
> MIS:          6

 You don't have the NMI watchdog working, because the timer interrupt is
configured as an 8259A interrupt ("XT-PIC" for IRQ 0 in the output above).
This usually means the wiring of a particular system doesn't provide any
other alternative or configuration data provided by the BIOS is broken.
The timer interrupt has to be configured as an I/O APIC interrupt for the
watchdog to work, or you can select "nmi_watchdog=2" for an alternative
watchdog internal to processors if they support it.

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +
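The check Maciej does by eye on Bob's listing (which controller handles IRQ 0, and whether the NMI counter is non-zero) can be sketched in a few lines of Python. This is an illustrative helper, not an existing tool: the function name and the returned keys are invented here, and the sample text is an excerpt of Bob's /proc/interrupts output.

```python
SAMPLE = """\
           CPU0
  0:  241105839          XT-PIC  timer
  1:      27337    IO-APIC-edge  i8042
 14:         22    IO-APIC-edge  ide0
NMI:          0
LOC:  241091187
"""


def watchdog_hints(interrupts_text: str) -> dict:
    """Extract the IRQ 0 controller and the summed NMI count.

    Expects text in /proc/interrupts layout: a header naming the CPUs,
    then "label: count-per-CPU [controller] [device]" lines.
    """
    lines = interrupts_text.splitlines()
    ncpus = len(lines[0].split()) if lines else 0
    timer_controller = None
    nmi_total = None
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        label = label.strip()
        fields = rest.split()
        if label == "0" and len(fields) > ncpus:
            # First field after the per-CPU counts names the controller.
            timer_controller = fields[ncpus]
        elif label == "NMI":
            nmi_total = sum(int(f) for f in fields[:ncpus])
    return {
        "timer_controller": timer_controller,
        "timer_via_ioapic": (timer_controller or "").startswith("IO-APIC"),
        "nmi_total": nmi_total,
    }


if __name__ == "__main__":
    print(watchdog_hints(SAMPLE))
```

On a live system the same function could be fed the contents of /proc/interrupts twice, a few seconds apart; an NMI total that does not grow between the two samples is the symptom Mike Fedyk mentioned earlier in the thread.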
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-09 14:21 ` Maciej W. Rozycki @ 2003-12-09 16:35 ` Bob 2003-12-10 13:41 ` Maciej W. Rozycki 0 siblings, 1 reply; 62+ messages in thread
From: Bob @ 2003-12-09 16:35 UTC
To: linux-kernel

Maciej W. Rozycki wrote:

>On Sun, 7 Dec 2003, Bob wrote:
>
>>I have append="nmi_watchdog=1" ? Nothing "nmi" or "NMI" is logged.
>>
>> cat /proc/interrupts
>>            CPU0
>>   0:  241105839          XT-PIC  timer...................
>>NMI:          0...........
>>
> You don't have the NMI watchdog working, because the timer interrupt is
>configured as an 8259A interrupt ("XT-PIC" for IRQ 0 in the output above).
>This usually means the wiring of a particular system doesn't provide any
>other alternative or configuration data provided by the BIOS is broken.
>The timer interrupt has to be configured as an I/O APIC interrupt for the
>watchdog to work, or you can select "nmi_watchdog=2" for an alternative
>watchdog internal to processors if they support it.
>

Using a patch that fixes a number of people's nforce2
lockups while enabling the io-apic edge timer, I can now
use nmi_watchdog=2 but not =1.

Turn on the ioapic edge timer:
http://www.kernel.org/pub/linux/kernel/people/bart/2.6.0-test11-bart1/broken-out/nforce2-apic.patch

We're all trying to get acpi, apic, lapic and io-apic working
when turned on in cmos/bios and kernel. The three things
that have each, on their own, achieved stability on somebody's
system here are:

1) bios update
2) cpu disconnect off, either in cmos if available, or by
   athcool, or by a kernel patch that does the same
3) timing delay patch

For CPU disconnect you still need athcool or this one:
http://www.kernel.org/pub/linux/kernel/people/bart/2.6.0-test11-bart1/broken-out/nforce2-disconnect-quirk.patch

Both patches are for the 2.6.0-test11 kernel.
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-09 16:35 ` Bob @ 2003-12-10 13:41 ` Maciej W. Rozycki 2003-12-12 16:01 ` bill davidsen 0 siblings, 1 reply; 62+ messages in thread
From: Maciej W. Rozycki @ 2003-12-10 13:41 UTC
To: Bob; +Cc: linux-kernel

On Tue, 9 Dec 2003, Bob wrote:

> > You don't have the NMI watchdog working, because the timer interrupt is
> >configured as an 8259A interrupt ("XT-PIC" for IRQ 0 in the output above).
> >This usually means the wiring of a particular system doesn't provide any
> >other alternative or configuration data provided by the BIOS is broken.
> >The timer interrupt has to be configured as an I/O APIC interrupt for the
> >watchdog to work, or you can select "nmi_watchdog=2" for an alternative
> >watchdog internal to processors if they support it.
> >
> Using a patch that fixes a number of people's nforce2
> lockups while enabling io-apic edge timer, I can now
> use nmi_watchdog=2 but not =1

 The I/O APIC NMI watchdog utilizes the property of being transparent to a
single IRQ source of a specially reconfigured 8259A PIC (the master one in
the IA32 PC architecture). There are more prerequisites that have to be
met and all indeed are for a 100% compatible PC as specified by Intel's
Multiprocessor Specification.

1. The INT output of the master 8259A PIC has to be connected to the LINT0
(or LINTIN0; the name varies by implementations) inputs of all local APICs
in the system.

2a. The OUT0 output of the 8254 PIT (IOW the timer source) has to be
directly connected to the INTIN2 input of the first I/O APIC.

2b. Alternatively the INT output of the master 8259A PIC has to be
connected to the INTIN0 input of the first I/O APIC.

3. There must be no glue logic that would change logical properties of the
signal between the INT output of the master 8259A PIC and the respective
APIC interrupt inputs.

 In practice, assuming the MP IRQ routing information provided by the BIOS
has been correct (which is not always the case), prerequisites #1 and #2
have been met so far, but #3 has proved to be occasionally problematic.

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +
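The prerequisites listed above combine as "1 and (2a or 2b) and 3". A minimal sketch of that combination follows; the class and field names are invented for illustration, and nothing here probes real hardware (the wiring facts would have to come from board documentation or the MP table).

```python
from dataclasses import dataclass


@dataclass
class Wiring:
    """Hypothetical description of how a board routes the timer signal."""
    pic_int_to_all_lint0: bool        # prerequisite 1
    pit_out0_to_intin2: bool          # prerequisite 2a
    pic_int_to_intin0: bool           # prerequisite 2b
    glue_logic_alters_signal: bool    # True violates prerequisite 3


def ioapic_watchdog_prerequisites_met(w: Wiring) -> bool:
    """Combine the prerequisites: 1 and (2a or 2b) and 3."""
    return (
        w.pic_int_to_all_lint0
        and (w.pit_out0_to_intin2 or w.pic_int_to_intin0)
        and not w.glue_logic_alters_signal
    )


if __name__ == "__main__":
    # A board meeting #1 and #2b but with interfering glue logic (#3 fails).
    board = Wiring(True, False, True, True)
    print(ioapic_watchdog_prerequisites_met(board))
```

The case that returns False only because of the last field corresponds to the situation described as "occasionally problematic" above.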
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-10 13:41 ` Maciej W. Rozycki @ 2003-12-12 16:01 ` bill davidsen 2003-12-12 16:47 ` Maciej W. Rozycki 2003-12-12 22:27 ` George Anzinger 0 siblings, 2 replies; 62+ messages in thread
From: bill davidsen @ 2003-12-12 16:01 UTC
To: linux-kernel

In article <Pine.LNX.4.55.0312101421540.31543@jurand.ds.pg.gda.pl>,
Maciej W. Rozycki <macro@ds2.pg.gda.pl> wrote:

| The I/O APIC NMI watchdog utilizes the property of being transparent to a
| single IRQ source of a specially reconfigured 8259A PIC (the master one in
| the IA32 PC architecture). There are more prerequisites that have to be
| met and all indeed are for a 100% compatible PC as specified by the
| Intel's Multiprocessor Specification.
|
| 1. The INT output of the master 8259A PIC has to be connected to the LINT0
| (or LINTIN0; the name varies by implementations) inputs of all local APICs
| in the system.
|
| 2a. The OUT0 output of the 8254 PIT (IOW the timer source) has to be
| directly connected to the INTIN2 input of the first I/O APIC.
|
| 2b. Alternatively the INT output of the master 8259A PIC has to be
| connected to the INTIN0 input of the first I/O APIC.
|
| 3. There must be no glue logic that would change logical properties of the
| signal between the INT output of the master 8259A PIC and the respective
| APIC interrupt inputs.
|
| In practice, assuming the MP IRQ routing information provided the BIOS has
| been correct (which is not always the case), prerequisites #1 and #2 have
| been met so far, but #3 has proved to be occasionally problematic.

In practice many systems seem to take a good bit of guessing and testing.
I have an old P-II which only works with acpi=force and nmi_watchdog=2,
for instance.

It would be nice if there were a program which could poke at the
hardware and suggest options which might work, as in eliminating the
ones which can be determined not to work. Absent that, trial and error
rule, unfortunately.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-12 16:01 ` bill davidsen @ 2003-12-12 16:47 ` Maciej W. Rozycki 2003-12-12 16:57 ` Richard B. Johnson 2003-12-13 5:16 ` Bill Davidsen 2003-12-12 22:27 ` George Anzinger 1 sibling, 2 replies; 62+ messages in thread
From: Maciej W. Rozycki @ 2003-12-12 16:47 UTC
To: bill davidsen; +Cc: linux-kernel

On Fri, 12 Dec 2003, bill davidsen wrote:

> | In practice, assuming the MP IRQ routing information provided the BIOS has
> | been correct (which is not always the case), prerequisites #1 and #2 have
> | been met so far, but #3 has proved to be occasionally problematic.
>
> In practice many system seem to take a good bit of guessing and testing.
> I have an old P-II which only works with acpi=force and nmi_watchdog=2,
> for instance.

 Well, the NMI watchdog is a side-effect feature that works by chance
rather than by design. So you can't really complain it doesn't work
somewhere, although I wouldn't mind if new hardware was designed such that
it works. You shouldn't have to use "acpi=force" for the watchdog to work
though, and for a PII system, if "nmi_watchdog=1" doesn't work, then I
suspect a BIOS bug (set APIC_DEBUG to 1 in asm-i386/apic.h and send me the
bootstrap log and a dump from `mptable' for a diagnosis, if interested).

> It would be nice if there were a program which could poke at the
> hardware and suggest options which might work, as in eliminating the
> ones which can be determined not to work. Absent that trial and error
> rule, unfortunately.

 Linux has all appropriate bits to set up hardware reasonably as long as
the BIOS provides accurate information. The only case our code fails is
when the BIOS tells us lies, and then there's little we can do about it.
Actually we are doing hardware manufacturers a favor when we try to handle
some cases at all -- it's the BIOS that should be fixed instead and it is
software and it is stored in Flash memories these days, so there's no
excuse. So if there's a problem with running Linux because of BIOS bugs,
then please bug the manufacturer in the first place (and avoid the company
in the future if they don't support Linux).

 Sometimes the NMI watchdog works in principle, but its activation leads
to system instability -- almost always this is a symptom of buggy SMM code
executed by the BIOS behind our back (NMIs are disabled by default in the
SMM, but careless code may enable them by accident).

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-12 16:47 ` Maciej W. Rozycki @ 2003-12-12 16:57 ` Richard B. Johnson 2003-12-12 17:21 ` Maciej W. Rozycki 2003-12-13 5:16 ` Bill Davidsen 1 sibling, 1 reply; 62+ messages in thread From: Richard B. Johnson @ 2003-12-12 16:57 UTC (permalink / raw) To: Maciej W. Rozycki; +Cc: bill davidsen, linux-kernel On Fri, 12 Dec 2003, Maciej W. Rozycki wrote: > On Fri, 12 Dec 2003, bill davidsen wrote: > > > | In practice, assuming the MP IRQ routing information provided the BIOS has > > | been correct (which is not always the case), prerequisites #1 and #2 have > > | been met so far, but #3 has proved to be occasionally problematic. > > > > In practice many system seem to take a good bit of guessing and testing. > > I have an old P-II which only works with acpi=force and nmi_watchdog=2, > > for instance. > > Well, the NMI watchdog is a side-effect feature that works by chance > rather than by design. So you can't really complain it doesn't work > somewhere, although I wouldn't mind if new hardware was designed such that > it works. You shouldn't have to use "acpi=force" for the watchdog to work > though and for a PII system if "nmi_watchdog=1" doesn't work, then I > suspect a BIOS bug (set APIC_DEBUG to 1 in asm-i386/apic.h and send me the > bootstrap log and a dump from `mptable' for a diagnosis, if interested). > > > It would be nice if there were a program which could poke at the > > hardware and suggest options which might work, as in eliminating the > > ones which can be determined not to work. Absent that trial and error > > rule, unfortunately. > > Linux has all appropriate bits to set up hardware reasonably as long as > BIOS provides accurate information. The only case our code fails is when > BIOS tells us lies and the there's little we can do about it. 
Actually we
> are doing hardware manufacturers a favor we try to handle some cases at
> all -- it's the BIOS that should be fixed instead and it is software and
> it is stored in Flash memories these days, so there's no excuse. So if
> there's a problem with running Linux because of BIOS bugs, then please
> bugger the manufacturer in the first place (and avoid the company in the
> future if they don't support Linux).
>
> Sometimes the NMI watchdog works in principle, but its activation leads
> to system instability -- almost always this is a symptom of buggy SMM code
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> executed by the BIOS behind our back (NMIs are disabled by default in the
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> SMM, but careless code may enable them by accident).

The NMI vector goes to Linux code. In fact all interrupt vectors
go to Linux code. There is no way that some BIOS code could possibly
be accidentally executed here. Some Linux code would have to
call some 16-bit BIOS code somewhere, and it doesn't even know
where..........

Cheers,
Dick Johnson
Penguin : Linux version 2.4.22 on an i686 machine (797.90 BogoMips).
            Note 96.31% of all statistics are fiction.
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-12 16:57 ` Richard B. Johnson @ 2003-12-12 17:21 ` Maciej W. Rozycki 0 siblings, 0 replies; 62+ messages in thread From: Maciej W. Rozycki @ 2003-12-12 17:21 UTC (permalink / raw) To: Richard B. Johnson; +Cc: bill davidsen, linux-kernel On Fri, 12 Dec 2003, Richard B. Johnson wrote: > > Sometimes the NMI watchdog works in principle, but its activation leads > > to system instability -- almost always this is a symptom of buggy SMM code > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > executed by the BIOS behind our back (NMIs are disabled by default in the > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > SMM, but careless code may enable them by accident). > > The NMI vector goes to Linux code. In fact all interrupt vectors > go to Linux code. There is no way that some BIOS code could possibly > be accidentally executed here. Some Linux code would have to > call some 16-bit BIOS code somewhere, and it doesn't even know > where.......... The problem happens when the SMM is active (i.e. the BIOS code is being executed) after an SMI has been received during Linux operation (SMIs may get triggered due to various reasons -- a parity/ECC error caught by the chipset, an access to an emulated 8042 controller, a power failure in a notebook, etc.) and an NMI arrives. When in the SMM, no interrupt (including the NMI) causes a switch back into the protected mode (and the processor expects real-mode style interrupt vectors), so the Linux's NMI handler is never reached and the SMM's NMI handler (if at all initialized) isn't appropriate for handling the NMI watchdog. Since the SMM cannot know what NMIs are used for in a particular OS, the code should best keep NMIs disabled -- then an arriving NMI event is latched and postponed until after the RSM instruction is executed. The SMM was invented to be transparent to a running OS, but care has to be taken for this to be true and firmware bugs sometimes make the SMM activity visible. 
-- + Maciej W. Rozycki, Technical University of Gdansk, Poland + +--------------------------------------------------------------+ + e-mail: macro@ds2.pg.gda.pl, PGP key available + ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog
  2003-12-12 16:47 ` Maciej W. Rozycki
  2003-12-12 16:57 ` Richard B. Johnson
@ 2003-12-13  5:16 ` Bill Davidsen
  2003-12-15 13:23 ` Maciej W. Rozycki
  1 sibling, 1 reply; 62+ messages in thread
From: Bill Davidsen @ 2003-12-13  5:16 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: linux-kernel

On Fri, 12 Dec 2003, Maciej W. Rozycki wrote:

> On Fri, 12 Dec 2003, bill davidsen wrote:
>
> > | In practice, assuming the MP IRQ routing information provided the BIOS has
> > | been correct (which is not always the case), prerequisites #1 and #2 have
> > | been met so far, but #3 has proved to be occasionally problematic.
> >
> > In practice many systems seem to take a good bit of guessing and testing.
> > I have an old P-II which only works with acpi=force and nmi_watchdog=2,
> > for instance.
>
>  Well, the NMI watchdog is a side-effect feature that works by chance
> rather than by design.  So you can't really complain it doesn't work
> somewhere, although I wouldn't mind if new hardware was designed such that
> it works.  You shouldn't have to use "acpi=force" for the watchdog to work
> though and for a PII system if "nmi_watchdog=1" doesn't work, then I
> suspect a BIOS bug (set APIC_DEBUG to 1 in asm-i386/apic.h and send me the
> bootstrap log and a dump from `mptable' for a diagnosis, if interested).

Has the check to see if the BIOS is older than very recent been removed? I
used to get a message that the BIOS was too old, I believe that's what
prompted the acpi to enable the local apic. Sorry, I've been running that
feature since 2.5.3x or so and I just carried it forward.

>
> > It would be nice if there were a program which could poke at the
> > hardware and suggest options which might work, as in eliminating the
> > ones which can be determined not to work. Absent that trial and error
> > rule, unfortunately.
>
>  Linux has all appropriate bits to set up hardware reasonably as long as
> BIOS provides accurate information.  The only case our code fails is when
> BIOS tells us lies and then there's little we can do about it.  Actually we
> are doing hardware manufacturers a favor we try to handle some cases at
> all -- it's the BIOS that should be fixed instead and it is software and
> it is stored in Flash memories these days, so there's no excuse.  So if
> there's a problem with running Linux because of BIOS bugs, then please
> bugger the manufacturer in the first place (and avoid the company in the
> future if they don't support Linux).
>
>  Sometimes the NMI watchdog works in principle, but its activation leads
> to system instability -- almost always this is a symptom of buggy SMM code
> executed by the BIOS behind our back (NMIs are disabled by default in the
> SMM, but careless code may enable them by accident).

Works fine for me, system stays up for 30-40 days when I let it... I also
run softdog to catch hangs in user mode but not in the kernel. That also
works.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

^ permalink raw reply	[flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog
  2003-12-13  5:16 ` Bill Davidsen
@ 2003-12-15 13:23 ` Maciej W. Rozycki
  0 siblings, 0 replies; 62+ messages in thread
From: Maciej W. Rozycki @ 2003-12-15 13:23 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: linux-kernel

On Sat, 13 Dec 2003, Bill Davidsen wrote:

> >  Well, the NMI watchdog is a side-effect feature that works by chance
> > rather than by design.  So you can't really complain it doesn't work
> > somewhere, although I wouldn't mind if new hardware was designed such that
> > it works.  You shouldn't have to use "acpi=force" for the watchdog to work
> > though and for a PII system if "nmi_watchdog=1" doesn't work, then I
> > suspect a BIOS bug (set APIC_DEBUG to 1 in asm-i386/apic.h and send me the
> > bootstrap log and a dump from `mptable' for a diagnosis, if interested).
>
> Has the check to see if the BIOS is older than very recent been removed? I
> used to get a message that the BIOS was too old, I believe that's what
> prompted the acpi to enable the local apic. Sorry, I've been running that
> feature since 2.5.3x or so and I just carried it forward.

 I don't know what check you refer to, sorry.  I don't think we do any
version checks in the APIC code.  Perhaps ACPI does some, but having no
use for it anywhere I'm not familiar with that area.

 If the "nmi_watchdog=1" option doesn't work for a PII system, then it's
most likely a bug in BIOS IRQ routing tables -- either missing or broken
entries for the 8254 timer and/or the 8259A ExtINTA source.

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

^ permalink raw reply	[flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-12 16:01 ` bill davidsen 2003-12-12 16:47 ` Maciej W. Rozycki @ 2003-12-12 22:27 ` George Anzinger 2003-12-15 13:13 ` Maciej W. Rozycki 1 sibling, 1 reply; 62+ messages in thread From: George Anzinger @ 2003-12-12 22:27 UTC (permalink / raw) To: macro; +Cc: bill davidsen, linux-kernel Having had cause to try and figure out all this, I vote for the following being included in the source somewhere... -g bill davidsen wrote: > In article <Pine.LNX.4.55.0312101421540.31543@jurand.ds.pg.gda.pl>, > Maciej W. Rozycki <macro@ds2.pg.gda.pl> wrote: > > | The I/O APIC NMI watchdog utilizes the property of being transparent to a > | single IRQ source of a specially reconfigured 8259A PIC (the master one in > | the IA32 PC architecture). There are more prerequisites that have to be > | met and all indeed are for a 100% compatible PC as specified by the > | Intel's Multiprocessor Specification. > | > | 1. The INT output of the master 8259A PIC has to be connected to the LINT0 > | (or LINTIN0; the name varies by implementations) inputs of all local APICs > | in the system. > | > | 2a. The OUT0 output of the 8254 PIT (IOW the timer source) has to be > | directly connected to the INTIN2 input of the first I/O APIC. > | > | 2b. Alternatively the INT output of the master 8259A PIC has to be > | connected to the INTIN0 input of the first I/O APIC. > | > | 3. There must be no glue logic that would change logical properties of the > | signal between the INT output of the master 8259A PIC and the respective > | APIC interrupt inputs. > | > | In practice, assuming the MP IRQ routing information provided the BIOS has > | been correct (which is not always the case), prerequisites #1 and #2 have > | been met so far, but #3 has proved to be occasionally problematic. > > In practice many system seem to take a good bit of guessing and testing. > I have an old P-II which only works with acpi=force and nmi_watchdog=2, > for instance. 
> > It would be nice if there were a program which could poke at the > hardware and suggest options which might work, as in eliminating the > ones which can be determined not to work. Absent that trial and error > rule, unfortunately. -- George Anzinger george@mvista.com High-res-timers: http://sourceforge.net/projects/high-res-timers/ Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-12 22:27 ` George Anzinger @ 2003-12-15 13:13 ` Maciej W. Rozycki 2003-12-15 21:42 ` George Anzinger 0 siblings, 1 reply; 62+ messages in thread From: Maciej W. Rozycki @ 2003-12-15 13:13 UTC (permalink / raw) To: George Anzinger; +Cc: linux-kernel On Fri, 12 Dec 2003, George Anzinger wrote: > Having had cause to try and figure out all this, I vote for the following being > included in the source somewhere... Hmm, you could have simply asked... ;-) Anyway, an inclusion is doable, I guess. -- + Maciej W. Rozycki, Technical University of Gdansk, Poland + +--------------------------------------------------------------+ + e-mail: macro@ds2.pg.gda.pl, PGP key available + ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog
  2003-12-15 13:13 ` Maciej W. Rozycki
@ 2003-12-15 21:42 ` George Anzinger
  2003-12-16 13:37 ` Maciej W. Rozycki
  0 siblings, 1 reply; 62+ messages in thread
From: George Anzinger @ 2003-12-15 21:42 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: linux-kernel

Maciej W. Rozycki wrote:
> On Fri, 12 Dec 2003, George Anzinger wrote:
>
>>Having had cause to try and figure out all this, I vote for the following being
>>included in the source somewhere...
>
>  Hmm, you could have simply asked... ;-)  Anyway, an inclusion is doable,
> I guess.
>

I suspect I did, but most likely in the wrong place.  In any case, I would
like to think that "read the source, Luke" is the right answer.

So, while I am in the asking mode, is there a simple way to turn off the PIT
interrupt without changing the PIT program?  I would like a way to stop the
interrupts AND also stop the NMIs that it generates for the watchdog.  I
suspect that this is a bit more complex than it would appear, due to how
it's wired.

-- 
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml

^ permalink raw reply	[flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-15 21:42 ` George Anzinger @ 2003-12-16 13:37 ` Maciej W. Rozycki 2003-12-16 13:57 ` Richard B. Johnson 2003-12-16 17:26 ` George Anzinger 0 siblings, 2 replies; 62+ messages in thread From: Maciej W. Rozycki @ 2003-12-16 13:37 UTC (permalink / raw) To: George Anzinger; +Cc: linux-kernel On Mon, 15 Dec 2003, George Anzinger wrote: > > Hmm, you could have simply asked... ;-) Anyway, an inclusion is doable, > > I guess. > > I suspect I did, but most likey the wrong place. In any case, I would like to > think that "read the source, Luke" is the right answer. Certainly it is, but not necessarily the only one. ;-) > So, while I am in the asking mode, is there a simple way to turn off the PIT > interrupt without changing the PIT program? I would like a way to stop the > interrupts AND also stop the NMIs that it generates for the watchdog. I suspect > that this is a bit more complex that it would appear, due to how its wired. Well, in PC/AT compatible implementations, the counter #0 of the PIT has its gate hardwired to active, so you cannot mask the PIT output itself. So the only other choices are either reprogramming the counter to a mode that won't cause periodic triggers (which is probably the easiest way, but you don't want to do that for some purpose, right?) or reprogramming interrupt controllers not to accept interrupts arriving from the PIT. Note that Linux may behave strangely then. ;-) -- + Maciej W. Rozycki, Technical University of Gdansk, Poland + +--------------------------------------------------------------+ + e-mail: macro@ds2.pg.gda.pl, PGP key available + ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-16 13:37 ` Maciej W. Rozycki @ 2003-12-16 13:57 ` Richard B. Johnson 2003-12-16 15:47 ` Maciej W. Rozycki 2003-12-16 17:26 ` George Anzinger 1 sibling, 1 reply; 62+ messages in thread From: Richard B. Johnson @ 2003-12-16 13:57 UTC (permalink / raw) To: Maciej W. Rozycki; +Cc: George Anzinger, linux-kernel On Tue, 16 Dec 2003, Maciej W. Rozycki wrote: > On Mon, 15 Dec 2003, George Anzinger wrote: > > > > Hmm, you could have simply asked... ;-) Anyway, an inclusion is doable, > > > I guess. > > > > I suspect I did, but most likey the wrong place. In any case, I would like to > > think that "read the source, Luke" is the right answer. > > Certainly it is, but not necessarily the only one. ;-) > > > So, while I am in the asking mode, is there a simple way to turn off the PIT > > interrupt without changing the PIT program? I would like a way to stop the > > interrupts AND also stop the NMIs that it generates for the watchdog. I suspect > > that this is a bit more complex that it would appear, due to how its wired. > > Well, in PC/AT compatible implementations, the counter #0 of the PIT has > its gate hardwired to active, so you cannot mask the PIT output itself. > So the only other choices are either reprogramming the counter to a mode > that won't cause periodic triggers (which is probably the easiest way, but > you don't want to do that for some purpose, right?) or reprogramming > interrupt controllers not to accept interrupts arriving from the PIT. > > Note that Linux may behave strangely then. ;-) > Masking OFF the timer channel 0 in the interrupt controller is probably the easiest thing to do. The port is read-write, and the OCW default to having it accessible. 
        movw    $0x21, %dx      # Controller 0, mask register
        inb     %dx, %al        # Get mask
        orb     $1, %al         # Mask off bit 0
        outb    %al, %dx        # Write it back

You can reenable by:

        movw    $0x21, %dx
        inb     %dx, %al
        andb    $~1, %al
        outb    %al, %dx

With port numbers less than 256, you actually don't need the
DX register but I forget if the AT&T assembler needs a $ before
the port number when doing this.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.22 on an i686 machine (797.90 BogoMips).
            Note 96.31% of all statistics are fiction.

^ permalink raw reply	[flat|nested] 62+ messages in thread
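For reference, the mask/unmask sequence above comes down to setting and clearing one bit of the master 8259A's interrupt mask register at port 0x21. What follows is only a minimal C sketch of that bit manipulation: the helper names are made up for illustration, and the actual port access (inb/outb on port 0x21, which needs ring 0 or I/O permission) is deliberately left out. As the replies in this thread point out, this covers the legacy PIC path only, not an I/O APIC routing.

```c
#include <assert.h>
#include <stdint.h>

/* Port of the master 8259A interrupt mask register, as used in the
 * assembly above.  Shown for documentation only; no port I/O is done here. */
#define PIC_MASTER_IMR 0x21

/* Hypothetical helper: IMR value with the given IRQ line masked (bit set). */
static uint8_t pic_imr_mask(uint8_t imr, unsigned irq)
{
    assert(irq < 8);                    /* the master PIC has inputs IR0..IR7 */
    return (uint8_t)(imr | (1u << irq));
}

/* Hypothetical helper: IMR value with the given IRQ line unmasked (bit clear). */
static uint8_t pic_imr_unmask(uint8_t imr, unsigned irq)
{
    assert(irq < 8);
    return (uint8_t)(imr & ~(1u << irq));
}
```

With irq = 0 these correspond to the "orb $1" and "andb $~1" steps; a real caller would read the current value from the port, apply one of the helpers, and write the result back.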
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-16 13:57 ` Richard B. Johnson @ 2003-12-16 15:47 ` Maciej W. Rozycki 2003-12-16 16:44 ` Richard B. Johnson 0 siblings, 1 reply; 62+ messages in thread From: Maciej W. Rozycki @ 2003-12-16 15:47 UTC (permalink / raw) To: Richard B. Johnson; +Cc: George Anzinger, linux-kernel On Tue, 16 Dec 2003, Richard B. Johnson wrote: > Masking OFF the timer channel 0 in the interrupt controller > is probably the easiest thing to do. The port is read-write, > and the OCW default to having it accessible. Note we are writing about configurations involving an I/O APIC, so things are not that easy -- the 8254 timer IRQ may be wired in different ways. -- + Maciej W. Rozycki, Technical University of Gdansk, Poland + +--------------------------------------------------------------+ + e-mail: macro@ds2.pg.gda.pl, PGP key available + ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog
  2003-12-16 15:47 ` Maciej W. Rozycki
@ 2003-12-16 16:44 ` Richard B. Johnson
  2003-12-16 16:50 ` Maciej W. Rozycki
  0 siblings, 1 reply; 62+ messages in thread
From: Richard B. Johnson @ 2003-12-16 16:44 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: George Anzinger, Linux kernel

On Tue, 16 Dec 2003, Maciej W. Rozycki wrote:

> On Tue, 16 Dec 2003, Richard B. Johnson wrote:
>
> > Masking OFF the timer channel 0 in the interrupt controller
> > is probably the easiest thing to do. The port is read-write,
> > and the OCW default to having it accessible.
>
>  Note we are writing about configurations involving an I/O APIC, so things
> are not that easy -- the 8254 timer IRQ may be wired in different ways.
>
> --
> +  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
> +--------------------------------------------------------------+
> +        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

Well if I was trying to isolate a problem, I would make it that
easy. You boot the machine in its simplest configuration and
work "up" from there.

Although I haven't looked at recent source-code, with APIC, the
problem is even simpler. If you booted with APIC, just set
the global "using_apic_timer" to zero and, voila`, timer-ticks
stop.

In any event, the caller needs to know that if there is any
code executing anywhere that does the equivalent of

        for(;;)
            ;

...the machine will lock-up forever because without that timer,
there will be no preemption. Once a CPU-hog gets the CPU, only
an interrupt can get it away.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.22 on an i686 machine (797.90 BogoMips).
            Note 96.31% of all statistics are fiction.

^ permalink raw reply	[flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-16 16:44 ` Richard B. Johnson @ 2003-12-16 16:50 ` Maciej W. Rozycki 0 siblings, 0 replies; 62+ messages in thread From: Maciej W. Rozycki @ 2003-12-16 16:50 UTC (permalink / raw) To: Richard B. Johnson; +Cc: George Anzinger, Linux kernel On Tue, 16 Dec 2003, Richard B. Johnson wrote: > Although I haven't looked at recent source-code, with APIC, the > problem is even simpler. If you booted with APIC, just set > the global "using_apic_timer" to zero and, voila`, timer-ticks > stop. Except we are writing of the 8254 timer, not the local APIC one... > ...the machine will lock-up forever because without that timer, > there will be no preemption. Once a CPU-hog gets the CPU, only > and interrupt can get it away. And the 8254 timer isn't used for preemption when local APICs are used, so disabling it won't break the whole system, only the timekeeping. -- + Maciej W. Rozycki, Technical University of Gdansk, Poland + +--------------------------------------------------------------+ + e-mail: macro@ds2.pg.gda.pl, PGP key available + ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-16 13:37 ` Maciej W. Rozycki 2003-12-16 13:57 ` Richard B. Johnson @ 2003-12-16 17:26 ` George Anzinger 2003-12-16 20:54 ` Maciej W. Rozycki 1 sibling, 1 reply; 62+ messages in thread From: George Anzinger @ 2003-12-16 17:26 UTC (permalink / raw) To: Maciej W. Rozycki; +Cc: linux-kernel Maciej W. Rozycki wrote: > On Mon, 15 Dec 2003, George Anzinger wrote: > > >>> Hmm, you could have simply asked... ;-) Anyway, an inclusion is doable, >>>I guess. >> >>I suspect I did, but most likey the wrong place. In any case, I would like to >>think that "read the source, Luke" is the right answer. > > > Certainly it is, but not necessarily the only one. ;-) > > >>So, while I am in the asking mode, is there a simple way to turn off the PIT >>interrupt without changing the PIT program? I would like a way to stop the >>interrupts AND also stop the NMIs that it generates for the watchdog. I suspect >>that this is a bit more complex that it would appear, due to how its wired. > > > Well, in PC/AT compatible implementations, the counter #0 of the PIT has > its gate hardwired to active, so you cannot mask the PIT output itself. > So the only other choices are either reprogramming the counter to a mode > that won't cause periodic triggers (which is probably the easiest way, but > you don't want to do that for some purpose, right?) or reprogramming > interrupt controllers not to accept interrupts arriving from the PIT. > > Note that Linux may behave strangely then. ;-) This is for the VST code where we want to stop the timer interrupts for a bit IF and only if we are in the idle task AND there are no timers to service, i.e. the interrupt would be useless. We don't want to mess with the PIT program as that would mess up the time when we turn it on again. So we just want to stop a few interrupts from time to time. 
We catch up after turning the PIT back on by using the TSC or pm_timer or some other source that keeps something close to reasonable time. > -- George Anzinger george@mvista.com High-res-timers: http://sourceforge.net/projects/high-res-timers/ Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml ^ permalink raw reply [flat|nested] 62+ messages in thread
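The catch-up step described above can be pictured as converting the time that passed on a free-running counter into the number of timer ticks that were skipped. The sketch below is not the VST code: the function name, the 64-bit cycle counter and the cycles_per_tick calibration value are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: given two readings of a free-running cycle counter
 * (for example the TSC), taken when the tick was stopped and when it was
 * restarted, return how many whole timer ticks elapsed in between.
 * cycles_per_tick is assumed to have been calibrated beforehand.
 */
static uint64_t missed_ticks(uint64_t start_cycles, uint64_t end_cycles,
                             uint64_t cycles_per_tick)
{
    assert(cycles_per_tick != 0);
    /* Unsigned subtraction yields the right delta even if the counter
     * wrapped around once between the two readings. */
    return (end_cycles - start_cycles) / cycles_per_tick;
}
```

A caller would add the result to its tick count when the PIT interrupt is turned back on; how the leftover fraction of a tick is handled is left open here.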
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-16 17:26 ` George Anzinger @ 2003-12-16 20:54 ` Maciej W. Rozycki 2003-12-16 21:53 ` George Anzinger 0 siblings, 1 reply; 62+ messages in thread From: Maciej W. Rozycki @ 2003-12-16 20:54 UTC (permalink / raw) To: George Anzinger; +Cc: linux-kernel On Tue, 16 Dec 2003, George Anzinger wrote: > This is for the VST code where we want to stop the timer interrupts for a bit IF > and only if we are in the idle task AND there are no timers to service, i.e. the > interrupt would be useless. We don't want to mess with the PIT program as that > would mess up the time when we turn it on again. So we just want to stop a few > interrupts from time to time. We catch up after turning the PIT back on by > using the TSC or pm_timer or some other source that keeps something close to > reasonable time. I see. Well, then disable_irq(0) may be the easiest way to do that for the regular timer interrupt. For the NMI watchdog from the I/O APIC you'd use disable_8259A_irq(0) and for one from the local APIC -- just mask the APIC_LVTPC interrupt (there's no wrapper function, but that's easy). -- + Maciej W. Rozycki, Technical University of Gdansk, Poland + +--------------------------------------------------------------+ + e-mail: macro@ds2.pg.gda.pl, PGP key available + ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog
  2003-12-16 20:54 ` Maciej W. Rozycki
@ 2003-12-16 21:53 ` George Anzinger
  2003-12-17 14:03 ` Maciej W. Rozycki
  0 siblings, 1 reply; 62+ messages in thread
From: George Anzinger @ 2003-12-16 21:53 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: linux-kernel

Maciej W. Rozycki wrote:
> On Tue, 16 Dec 2003, George Anzinger wrote:
>
>>This is for the VST code where we want to stop the timer interrupts for a bit IF
>>and only if we are in the idle task AND there are no timers to service, i.e. the
>>interrupt would be useless.  We don't want to mess with the PIT program as that
>>would mess up the time when we turn it on again.  So we just want to stop a few
>>interrupts from time to time.  We catch up after turning the PIT back on by
>>using the TSC or pm_timer or some other source that keeps something close to
>>reasonable time.
>
>  I see.  Well, then disable_irq(0) may be the easiest way to do that for
> the regular timer interrupt.  For the NMI watchdog from the I/O APIC you'd
> use disable_8259A_irq(0) and for one from the local APIC -- just mask the
> APIC_LVTPC interrupt (there's no wrapper function, but that's easy).

How confusing :(  Could you give me some idea how this works?  I have tried
disable_irq(0) and, as best as I can tell, it does not do the trick.  The
confusion I have is understanding where in the chain of hardware each of these
things is taking place.

For example, it would be "nice" if I could just turn off the PIT interrupt line
so that both the NMI (PIT generated) and the PIT interrupt would be put on hold.

Your answer seems to indicate that disable_irq() is working down stream from
where the NMI signal is connected to the PIT interrupt line, so we need to turn
off the NMI as well.  A picture would be nice here :)

-- 
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml

^ permalink raw reply	[flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog
  2003-12-16 21:53 ` George Anzinger
@ 2003-12-17 14:03 ` Maciej W. Rozycki
  0 siblings, 0 replies; 62+ messages in thread
From: Maciej W. Rozycki @ 2003-12-17 14:03 UTC (permalink / raw)
  To: George Anzinger; +Cc: linux-kernel

On Tue, 16 Dec 2003, George Anzinger wrote:

> How confusing :(  Could you give me some idea how this works?  I have tried
> disable_irq(0) and, as best as I can tell, it does not do the trick.  The
> confusion I have is understanding where in the chain of hardware each of these
> things is taking place.

 Well, strange -- it should mask the timer interrupt.  But I've never
tried that and have proposed it based on a source study only -- perhaps it
needs to be further investigated.

> For example, it would be "nice" if I could just turn off the PIT interrupt line
> so that both the NMI (PIT generated) and the PIT interrupt would be put on hold.

 The counter gate of the 8254 chip is designed to do just that -- it's a
pity it's hardwired, but I can understand another SSI TTL latch of a
dubious utility was just too costly for the original PC in 1981.

> Your answer seems to indicate that disable_irq() is working down stream from
> where the NMI signal is connected to the PIT interrupt line, so we need to turn
> off the NMI as well.  A picture would be nice here :)

 I'll try my best:

+------+ OUT0                                INTIN2 +--------+
| 8254 +--+-----------------------------------------+        |
+------+  |                                         |  I/O   |
          | IR0 +------+ INT +------+        INTIN0 |  APIC  |
          +-----+ 8259 +-----+ glue +-+-------------+        |
                +------+     +------+ |             +---++---+
                                      |                 ||
                                      |                 ||
                                      |                 ||
                          +-----------+---------+-...   ||
                          |                     |       ||
       +--------+         |  +--------+         |       ||
       | CPU #0 |         |  | CPU #1 |         |       ||
       +--------+         |  +--------+         |       ||
       |        | LINT0   |  |        | LINT0   | ...   ||
       | local  +---------+  | local  +---------+       ||
       |  APIC  |            |  APIC  |                 ||
       |        |            |        |                 ||
       +---++---+            +---++---+                 ||
           ||                    ||                     ||
           ||   inter-APIC bus   ||                     ||
           ++====================++===============...===++

The system is a traditional i82489DX/Pentium/P6-style virtual-wire setup
with a serial inter-APIC bus and a full MP-spec feature set.  More limited
systems may miss the OUT0->INTIN2 line and/or one or more of the
INT->INTIN0 or INT->LINT0 -- only one of them needs to exist.  If any
INT->sth connections are missing then either the INT->LINT0 one for the
bootstrap processor (BSP) or the INT->INTIN0 has to exist; others are
optional.

 For the system above the path for the 8254 timer interrupt is via INTIN2
and the inter-APIC bus as a LoPri APIC interrupt.  The path for the NMI
watchdog is via the 8259 reconfigured to pass IR0 transparently to INT and
then LINT0 inputs of all processors, reconfigured for a NMI APIC
interrupt.  Some glue at the INT output may prevent the NMI watchdog from
working -- the LINT0 inputs may not toggle back and forth.

 If the OUT0->INTIN2 line is missing, the path for the 8254 timer
interrupt is via the 8259 reconfigured to pass IR0 transparently to INT,
then INTIN0 and the inter-APIC bus as a LoPri APIC interrupt.  The path
for the NMI watchdog is also via the 8259 and then LINT0 inputs of all
processors, reconfigured for a NMI APIC interrupt.  Again, some glue at
the INT output may prevent this setup from working, but if it does work,
then both the timer interrupt and the NMI watchdog do -- I've not heard of
a system having different glue logic for INTIN0 and LINT0.

 If the above variant does not work, as a last resort, the path for the
8254 timer interrupt is via the 8259 reconfigured back into its usual mode
and then LINT0 of the BSP reconfigured for an ExtINTA APIC interrupt.
Additionally, since at this point the glue logic has probably already
locked up due to the messing done above, a few artificial sets of double
INTA cycles are sent to the system bus using the RTC chip and INTIN8
reconfigured temporarily to send ExtINTA APIC interrupts via the
inter-APIC bus.

 I do hope a thorough read of the description will make the available
variants clear.  The I/O APIC input numbers may differ but so far they are
almost always as noted above.

  Maciej

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

^ permalink raw reply	[flat|nested] 62+ messages in thread
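The three timer routing variants described above can be summarised as a small decision function. This is a sketch for illustration only, not kernel code: the struct, enum and function names are hypothetical, the flags stand for "this wire exists and works", and glue logic problems are ignored. The watchdog check reflects the reading that the I/O APIC NMI watchdog needs LINT0 for NMI delivery and is therefore unavailable once the timer falls back to the ExtINTA variant, which matches the "timer doesn't work through the IO-APIC - disabling NMI Watchdog!" message quoted at the start of the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which wires from the diagram are present and usable (hypothetical type). */
struct apic_wiring {
    bool out0_to_intin2;    /* 8254 OUT0 -> I/O APIC INTIN2 */
    bool int_to_intin0;     /* 8259 INT  -> I/O APIC INTIN0 */
    bool int_to_lint0;      /* 8259 INT  -> local APIC LINT0 */
};

enum timer_path {
    TIMER_VIA_INTIN2,       /* direct, LoPri via the inter-APIC bus */
    TIMER_VIA_8259_INTIN0,  /* 8259 passes IR0 transparently, then INTIN0 */
    TIMER_VIA_8259_EXTINT,  /* last resort: 8259 in usual mode, LINT0 as ExtINTA */
    TIMER_NONE              /* no usable path in this simplified model */
};

/* Pick the timer interrupt path in the order the description gives. */
static enum timer_path pick_timer_path(const struct apic_wiring *w)
{
    assert(w != NULL);
    if (w->out0_to_intin2)
        return TIMER_VIA_INTIN2;
    if (w->int_to_intin0)
        return TIMER_VIA_8259_INTIN0;
    if (w->int_to_lint0)
        return TIMER_VIA_8259_EXTINT;
    return TIMER_NONE;
}

/* The I/O APIC NMI watchdog needs INT -> LINT0 free for NMI delivery,
 * which in this model is the case only for the first two variants. */
static bool ioapic_nmi_watchdog_possible(const struct apic_wiring *w)
{
    enum timer_path p = pick_timer_path(w);
    return w->int_to_lint0 &&
           (p == TIMER_VIA_INTIN2 || p == TIMER_VIA_8259_INTIN0);
}
```

The kernel itself probes whether each variant actually delivers timer interrupts rather than consulting flags like these, so the sketch only mirrors the order of the fallbacks.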
* RE: Catching NForce2 lockup with NMI watchdog
@ 2003-12-05 19:11 Allen Martin
2003-12-05 20:18 ` cheuche+lkml
` (2 more replies)
0 siblings, 3 replies; 62+ messages in thread
From: Allen Martin @ 2003-12-05 19:11 UTC (permalink / raw)
To: 'Mikael Pettersson', Josh McKinney; +Cc: linux-kernel
> -----Original Message-----
> From: Mikael Pettersson [mailto:mikpe@csd.uu.se]
> Sent: Friday, December 05, 2003 4:15 AM
>
> > So does this confirm that the lockups with nforce2
> chipsets and apic
> > is actually a hardware problem after all?
>
> Confirm with very high probability. There may be quirks in nVidia's
> chipset that we (unlike their Windoze drivers) don't know about.
>
> Ask nVidia for detailed chipset documentation. Then maybe we
> can fix this.
NVIDIA doesn't provide a windows driver to setup APIC interrupts. APIC
functionality is exported through the ACPI methods and MP table in the
system BIOS which the motherboard vendors supply.
Likely the root of the problem has to do with the way the Linux kernel is
using the ACPI methods to setup the interrupts which is different from win
9x/2k/XP. I can help track this down, unfortunately so far I've been unable
to reproduce the hangs on any of the boards I have.
-Allen
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog
  2003-12-05 19:11 Allen Martin
@ 2003-12-05 20:18 ` cheuche+lkml
  2003-12-05 20:34   ` Prakash K. Cheemplavam
                     ` (2 more replies)
  2003-12-05 20:36 ` Jesse Allen
  2003-12-05 22:55 ` Mike Fedyk
2 siblings, 3 replies; 62+ messages in thread
From: cheuche+lkml @ 2003-12-05 20:18 UTC (permalink / raw)
To: linux-kernel

On Fri, Dec 05, 2003 at 11:11:39AM -0800, Allen Martin wrote:
>
> Likely the root of the problem has to do with the way the Linux kernel is
> using the ACPI methods to setup the interrupts which is different from win
> 9x/2k/XP. I can help track this down, unfortunately so far I've been unable
> to reproduce the hangs on any of the boards I have.
>

With a little patch in arch/i386/kernel/mpparse.c in the acpi section, I
managed to get the timer interrupt back on IO-APIC-edge, maybe the nmi
watchdog could work with the ioapic then ?

With the patch, the interrupt flood on IRQ7 I reported on the nvidia2
lockups thread also disappeared, but then I noticed something odd when
there is ide activity :

With amd74xx/nforce driver, I can almost instantly hang the machine
(nothing new there), but with the generic ide driver and the IO load
a cat /dev/hda > /dev/null can do, timer interrupts don't seem to get
through easily. I first thought the box froze but I realized the
software cursor was blinking *very* slowly. In fact 1 second for the
kernel took about 12 seconds. Stopping the IO load on ide and
everything seems back to normal.

There may be something wrong with the timer using apic and the
amd/nforce ide driver does not handle this situation that should not
occur and just freezes. This is pure speculation of course.

I looked in mpparse.c because this is where I noticed the difference
about the timer interrupt setup with apic between 2.4.22 and 2.4.23.
However it is in the path of ACPI source interrupt override, maybe the
modification I made just overrides the override (sigh).

*Disclaimer*
The modification is certainly not the proper fix, does a wrong thing,
but it shows an interesting behavior, especially it fixed the
interrupt flood on IRQ7 I and some others are able to see.

Here is the little patch of arch/i386/kernel/mpparse.c I used :

--- mpparse.c.old	2003-12-05 14:42:10.000000000 +0100
+++ mpparse.c	2003-12-05 14:43:41.000000000 +0100
@@ -962,7 +962,8 @@
 	 */
 	for (i = 0; i < mp_irq_entries; i++) {
 		if ((mp_irqs[i].mpc_dstapic == intsrc.mpc_dstapic)
-			&& (mp_irqs[i].mpc_srcbusirq == intsrc.mpc_srcbusirq)) {
+			&& (mp_irqs[i].mpc_srcbusirq == intsrc.mpc_srcbusirq)
+			&& (mp_irqs[i].mpc_irqtype == intsrc.mpc_irqtype)) {
 			mp_irqs[i] = intsrc;
 			found = 1;
 			break;

I hope this helps,

Mathieu

^ permalink raw reply [flat|nested] 62+ messages in thread
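[Editorial sketch, not part of the thread] The logical effect of the extra comparison in Mathieu's patch can be tried out in user space. The block below is a toy model under loud assumptions: `struct irq_entry`, `struct irq_table` and `apply_override()` are hypothetical stand-ins for the kernel's `mp_irqs[]` handling, and the entry values used with it are made up, not taken from any real board. It only shows that once the interrupt type is part of the match, an override whose type differs from the existing entry is appended instead of replacing that entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for one MP interrupt source entry. */
struct irq_entry {
    int irqtype;    /* interrupt type code (values here are illustrative) */
    int dstapic;    /* destination IO-APIC id */
    int srcbusirq;  /* source bus IRQ */
    int dstirq;     /* destination IO-APIC pin */
};

#define MAX_ENTRIES 16

struct irq_table {
    struct irq_entry e[MAX_ENTRIES];
    size_t n;       /* number of valid entries */
};

/*
 * Apply an override entry to the table.
 * match_irqtype == 0 models the original test (APIC id + bus IRQ only);
 * match_irqtype == 1 models the test with the extra irqtype comparison.
 * Returns 1 if an existing entry was replaced, 0 if the override was appended.
 */
int apply_override(struct irq_table *t, struct irq_entry src, int match_irqtype)
{
    for (size_t i = 0; i < t->n; i++) {
        if (t->e[i].dstapic == src.dstapic
            && t->e[i].srcbusirq == src.srcbusirq
            && (!match_irqtype || t->e[i].irqtype == src.irqtype)) {
            t->e[i] = src;      /* override replaces the existing entry */
            return 1;
        }
    }
    assert(t->n < MAX_ENTRIES); /* toy table has a fixed capacity */
    t->e[t->n++] = src;         /* no match: keep the old entry, add a new one */
    return 0;
}
```

Calling `apply_override()` on a one-entry table with an override that shares the APIC id and bus IRQ but has a different type replaces the entry when `match_irqtype` is 0 and leaves two entries when it is 1. Whether that is what actually happened on the affected boards is not established anywhere in this thread.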
* Re: Catching NForce2 lockup with NMI watchdog
  2003-12-05 20:18 ` cheuche+lkml
@ 2003-12-05 20:34   ` Prakash K. Cheemplavam
  2003-12-05 21:02     ` Mike Fedyk
  2003-12-05 20:55   ` Jesse Allen
  2003-12-06 3:20   ` Jesse Allen
2 siblings, 1 reply; 62+ messages in thread
From: Prakash K. Cheemplavam @ 2003-12-05 20:34 UTC (permalink / raw)
To: cheuche+lkml; +Cc: linux-kernel

> through easily. I first thought the box froze but I realized the
> software cursor was blinking *very* slowly. In fact 1 second for the
> kernel took about 12 seconds. Stopping the IO load on ide and
> everything seems back to normal.

Hmm, interesting observation. This makes me remember something: When my
machine freezes doing hdparm, the cursor still blinks, but I can't do
anything anymore. Maybe a connection to your observation? I haven't
tried to run the NMI watchdog, as you guys haven't had success with it yet.

Prakash

^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog
  2003-12-05 20:34 ` Prakash K. Cheemplavam
@ 2003-12-05 21:02   ` Mike Fedyk
0 siblings, 0 replies; 62+ messages in thread
From: Mike Fedyk @ 2003-12-05 21:02 UTC (permalink / raw)
To: Prakash K. Cheemplavam; +Cc: cheuche+lkml, linux-kernel

On Fri, Dec 05, 2003 at 09:34:46PM +0100, Prakash K. Cheemplavam wrote:
> Hmm, interesting observation. This makes me remember something: When my
> machine freezes doing hdparm, the cursor still blinks, but I can't do
> anything anymore. Maybe a connection to your observation? I haven't
> tried to run the NMI watchdog, as you guys haven't had success with it yet.

Everyone with this problem should turn on the nmi_watchdog, as someone may
have the right circumstances to produce an oops where the others didn't.

I say that you're not serious about getting this fixed unless you're going
to do all of:

 o turn on nmi_watchdog
 o try the patches posted[1]
 o contact nvidia or your motherboard manufacturer saying you need linux
   support, and return the board if they don't. (phone, fax, email, or even
   local office if there is one)

I bought a VIA board to avoid the problems I expected from the nforce, and I
needed a system (server) that would *work* now.

[1] If you're worried about your filesystem, just boot the patched kernel in
single mode, and that will mount all of your filesystems read-only so there
will be little chance of corruption.

Mike

^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-05 20:18 ` cheuche+lkml 2003-12-05 20:34 ` Prakash K. Cheemplavam @ 2003-12-05 20:55 ` Jesse Allen 2003-12-06 3:20 ` Jesse Allen 2 siblings, 0 replies; 62+ messages in thread From: Jesse Allen @ 2003-12-05 20:55 UTC (permalink / raw) To: linux-kernel On Fri, Dec 05, 2003 at 09:18:12PM +0100, cheuche+lkml@free.fr wrote: > With a little patch in arch/i386/kernel/mpparse.c in the acpi section, I > managed to get the timer interrupt back on IO-APIC-edge, maybe the nmi > watchdog could work with the ioapic then ? Maybe! thanks! > > With the patch, the interrupt flood on IRQ7 I reported on the nvidia2 > lockups thread also disappeared, but then I noticed something odd when > there is ide activity : Yeah, I have been writing trace code to try to identify where it fails. Somehow what I did seem to have made IRQ 7 less noisy but I have no idea why? =) So I do think the IRQ is related somehow... > > There may be something wrong with the timer using apic and the > amd/nforce ide driver does not handle this situation that should not > occur and juste freezes. This is pure speculation of course. > > *Disclaimer* > The modification is certainly not the proper fix, does a wrong thing, > but it shows an interesting behavior, especially it fixed the > interrupt flood on IRQ7 I and some others are able to see. > > Here the little patch of arch/i386/kernel/mpparse.c I used : > I'll check it out. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-05 20:18 ` cheuche+lkml 2003-12-05 20:34 ` Prakash K. Cheemplavam 2003-12-05 20:55 ` Jesse Allen @ 2003-12-06 3:20 ` Jesse Allen 2 siblings, 0 replies; 62+ messages in thread From: Jesse Allen @ 2003-12-06 3:20 UTC (permalink / raw) To: linux-kernel On Fri, Dec 05, 2003 at 09:18:12PM +0100, cheuche+lkml@free.fr wrote: > On Fri, Dec 05, 2003 at 11:11:39AM -0800, Allen Martin wrote: > With a little patch in arch/i386/kernel/mpparse.c in the acpi section, I > managed to get the timer interrupt back on IO-APIC-edge, maybe the nmi > watchdog could work with the ioapic then ? > Like reported, with the patch the timer uses IO-APIC-edge, and the noise on IRQ 7 is gone, but still unable to catch a lockup with nmi_watchdog. =( Jesse ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-05 19:11 Allen Martin 2003-12-05 20:18 ` cheuche+lkml @ 2003-12-05 20:36 ` Jesse Allen 2003-12-05 22:55 ` Mike Fedyk 2 siblings, 0 replies; 62+ messages in thread From: Jesse Allen @ 2003-12-05 20:36 UTC (permalink / raw) To: Allen Martin; +Cc: linux-kernel On Fri, Dec 05, 2003 at 11:11:39AM -0800, Allen Martin wrote: > > -----Original Message----- > > From: Mikael Pettersson [mailto:mikpe@csd.uu.se] > > Sent: Friday, December 05, 2003 4:15 AM > > > > > So does this confirm that the lockups with nforce2 > > chipsets and apic > > > is actually a hardware problem after all? > > > > Confirm with very high probability. There may be quirks in nVidia's > > chipset that we (unlike their Windoze drivers) don't know about. > > > > Ask nVidia for detailed chipset documentation. Then maybe we > > can fix this. > > NVIDIA doesn't provide a windows driver to setup APIC interrupts. APIC > functionality is exported through the ACPI methods and MP table in the > system BIOS which the motherboard vendors supply. > > Likely the root of the problem has to do with the way the Linux kernel is > using the ACPI methods to setup the interrupts which is different from win > 9x/2k/XP. I can help track this down, unfortunately so far I've been unable > to reproduce the hangs on any of the boards I have. > Do you know whether the nforce2's with apic support the timer (IRQ 0) in IO-APIC mode? To me, it seems like a bug: "Dec 4 20:13:11 tesore kernel: ..MP-BIOS bug: 8254 timer not connected to IO-APIC" (This message originates in arch/i386/kernel/io_apic.c) nmi_watchdog doesn't seem to work at all because of this. If it was working, then maybe I can catch the lockup, because if it's like you say, it's probably the kernel not hardware. Jesse ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-05 19:11 Allen Martin 2003-12-05 20:18 ` cheuche+lkml 2003-12-05 20:36 ` Jesse Allen @ 2003-12-05 22:55 ` Mike Fedyk 2003-12-05 23:11 ` Craig Bradney 2 siblings, 1 reply; 62+ messages in thread From: Mike Fedyk @ 2003-12-05 22:55 UTC (permalink / raw) To: Allen Martin; +Cc: 'Mikael Pettersson', Josh McKinney, linux-kernel On Fri, Dec 05, 2003 at 11:11:39AM -0800, Allen Martin wrote: > NVIDIA doesn't provide a windows driver to setup APIC interrupts. APIC > functionality is exported through the ACPI methods and MP table in the > system BIOS which the motherboard vendors supply. > > Likely the root of the problem has to do with the way the Linux kernel is > using the ACPI methods to setup the interrupts which is different from win > 9x/2k/XP. I can help track this down, unfortunately so far I've been unable > to reproduce the hangs on any of the boards I have. Can the people with nforce chips run a command that will show the chipset config space like was done back when there were problems with via chipsets (before via released the specs on how to set the bits correctly). Maybe you'll see some correlation between the boards that are crashing, and a few bits that are different for the boards that aren't crashing. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-05 22:55 ` Mike Fedyk @ 2003-12-05 23:11 ` Craig Bradney 0 siblings, 0 replies; 62+ messages in thread From: Craig Bradney @ 2003-12-05 23:11 UTC (permalink / raw) To: linux-kernel On Fri, 2003-12-05 at 23:55, Mike Fedyk wrote: > On Fri, Dec 05, 2003 at 11:11:39AM -0800, Allen Martin wrote: > > NVIDIA doesn't provide a windows driver to setup APIC interrupts. APIC > > functionality is exported through the ACPI methods and MP table in the > > system BIOS which the motherboard vendors supply. > > > > Likely the root of the problem has to do with the way the Linux kernel is > > using the ACPI methods to setup the interrupts which is different from win > > 9x/2k/XP. I can help track this down, unfortunately so far I've been unable > > to reproduce the hangs on any of the boards I have. > > Can the people with nforce chips run a command that will show the chipset > config space like was done back when there were problems with via chipsets > (before via released the specs on how to set the bits correctly). > > Maybe you'll see some correlation between the boards that are crashing, and > a few bits that are different for the boards that aren't crashing. > - Is there such a command? or is that your question? Ready to run it as soon as someone lets me know. Craig Uptime: 6.5 hours ^ permalink raw reply [flat|nested] 62+ messages in thread
* RE: Catching NForce2 lockup with NMI watchdog
@ 2003-12-05 20:56 Allen Martin
0 siblings, 0 replies; 62+ messages in thread
From: Allen Martin @ 2003-12-05 20:56 UTC (permalink / raw)
To: 'Jesse Allen'; +Cc: linux-kernel
> -----Original Message-----
> From: Jesse Allen [mailto:the3dfxdude@hotmail.com]
> Sent: Friday, December 05, 2003 12:36 PM
>
> Do you know whether the nforce2's with apic support the timer
> (IRQ 0) in
> IO-APIC mode? To me, it seems like a bug:
> "Dec 4 20:13:11 tesore kernel: ..MP-BIOS bug: 8254 timer not
> connected to
> IO-APIC"
> (This message originates in arch/i386/kernel/io_apic.c)
>
Yes, Win 9x/2k/XP use the system timer on irq0 and have no problem. I
haven't looked at this yet.
-Allen
^ permalink raw reply [flat|nested] 62+ messages in thread
* RE: Catching NForce2 lockup with NMI watchdog
@ 2003-12-05 22:41 b
0 siblings, 0 replies; 62+ messages in thread
From: b @ 2003-12-05 22:41 UTC (permalink / raw)
To: mfedyk; +Cc: linux-kernel

>Everyone with this problem should turn on the nmi_watchdog, as someone may
>have the right circumstances to produce an oops where the others didn't.
>
>I say that you're not serious about getting this fixed unless you're going
>to do all of:

To quote Allen Martin:

>NVIDIA doesn't provide a windows driver to setup APIC interrupts.
>
>APIC functionality is exported through the ACPI methods and MP
>table in the system BIOS which the motherboard vendors supply.
>Likely the root of the problem has to do with the way the Linux
>kernel is using the ACPI methods to setup the interrupts which
>is different from win 9x/2k/XP. I can help track this down,
>unfortunately so far I've been unable to reproduce the hangs
>on any of the boards I have.

and

>> Do you know whether the nforce2's with apic support the timer
>> (IRQ 0) in IO-APIC mode? To me, it seems like a bug:
>> "Dec 4 20:13:11 tesore kernel: ..MP-BIOS bug: 8254 timer not
>> connected to IO-APIC"
>> (This message originates in arch/i386/kernel/io_apic.c)
>>
>
>Yes, Win 9x/2k/XP use the system timer on irq0 and have no problem. I
>haven't looked at this yet.
>

Is it not possible that Linux could be made to handle this hardware
correctly?

>
> o turn on nmi_watchdog
> o try the patches posted[1]
> o contact nvidia or your motherboard manufacturer saying you need linux
>   support, and return the board if they don't. (phone, fax, email, or even
>   local office if there is one)
>
>I bought a VIA board to avoid the problems I expected from the nforce, and I
>needed a system (server) that would *work* now.
>
>[1] If you're worried about your file system, just boot the patched kernel in
>single mode, and that will mount all of your file systems read-only so there
>will be little chance of corruption.
>
>Mike

^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Fixes for nforce2 hard lockup, apic, io-apic, udma133 covered
@ 2003-12-07 19:58 Ian Kumlien
  2003-12-08 2:07 ` Ross Dickson
0 siblings, 1 reply; 62+ messages in thread
From: Ian Kumlien @ 2003-12-07 19:58 UTC (permalink / raw)
To: linux-kernel; +Cc: ross

[-- Attachment #1: Type: text/plain, Size: 1985 bytes --]

> I have monitored list and know my nforce2 experiences have been
> common.

Hell yeah =)

> When I enabled either apic or io-apic in kern config, lockups came
> hard and fast. Particularly bad under hard disk load. Heaps of lost
> ints on irq7 in apic and ioapic mode. Lockups disappeared when I
> lowered the ide hda udma speed to mode 3 with hdparm so I went looking
> for answers which now follow.

Good job =)

> There are three parts to this email.
> a) apic mods.
> Lockups are due to too fast an apic acknowledge of apic timer int.
> Apic hard locked up the system - no nmi debug available.
> Fixed it by introducing a delay of at least 500ns into
> smp_apic_timer_interrupt() just prior to ack_APIC_irq().

I find this really odd... It works just fine...
As did disabling what's now active, i.e.:
'Halt Disconnect and Stop Grant Disconnect' bit is enabled.

So it seems like these are the two most important factors, at least from
where i stand. Both enabled me to actually use my machine with IO-APIC.
(1, disabling Halt Disconnect and Stop Grant Disconnect bit or 2, Add a
delay on the irq ack.)

Anyone that has any clues?

> b) io-apic mods
> So I have fixed it too (tested on both my epox and albatron MOBOs).
> Firstly I found 8254 connected directly to pin 0 not pin 2 of io-apic.
> I have modified check_timer() in io_apic.c to trial connect pin and
> test for it after the existing test for connection to io-apic.

Good job, i wonder if it could be more generalized and integrated with
the rest of the code (i haven't even checked the rest of the code, but
this seemed separated).

One thing though, I get a lot more NMI's now than with nmi_watchdog=2...
NMI:      85520
LOC:      85477

I usually had a 3 figure number by now... but.. =)

> c) ide driver mods

Cool..

I applied all patches and it survived my grep test so i think it works.

--
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Fixes for nforce2 hard lockup, apic, io-apic, udma133 covered
  2003-12-07 19:58 Fixes for nforce2 hard lockup, apic, io-apic, udma133 covered Ian Kumlien
@ 2003-12-08 2:07 ` Ross Dickson
  2003-12-09 18:12   ` Catching NForce2 lockup with NMI watchdog Ian Kumlien
0 siblings, 1 reply; 62+ messages in thread
From: Ross Dickson @ 2003-12-08 2:07 UTC (permalink / raw)
To: Ian Kumlien; +Cc: linux-kernel, ross

On Monday 08 December 2003 05:58, you wrote:
> > I have monitored list and know my nforce2 experiences have been
> > common.
>
> Hell yeah =)
>
> > When I enabled either apic or io-apic in kern config, lockups came
> > hard and fast. Particularly bad under hard disk load. Heaps of lost
> > ints on irq7 in apic and ioapic mode. Lockups disappeared when I
> > lowered the ide hda udma speed to mode 3 with hdparm so I went looking
> > for answers which now follow.
>
> Good job =)

Thanks.

>
> > There are three parts to this email.
> > a) apic mods.
> > Lockups are due to too fast an apic acknowledge of apic timer int.
> > Apic hard locked up the system - no nmi debug available.
> > Fixed it by introducing a delay of at least 500ns into
> > smp_apic_timer_interrupt() just prior to ack_APIC_irq().
>
> I find this really odd... It works just fine...
> As did disabling whats now active ie:
> 'Halt Disconnect and Stop Grant Disconnect' bit is enabled.
>
> So it seems like these are the two most important factors, at least from
> where i stand. Both enabled me to actually use my machine with IO-APIC.
> (1, disabling Halt Disconnect and Stop Grant Disconnect bit or 2, Add a
> delay on the irq ack.)
> Anyone that has any clues?

I started work on this about 2 weeks ago and have not yet tried the
"Halt Disconnect patch and Stop Grant Disconnect bit or 2" patch.
My lockups ceased with just the apic time delay.
I agree the delay is wasteful but the code branches into other servicing
routines which I did not want to try to rearrange as yet.

Given the infrequent nature of the lockup in CPU cycles maybe we can get
smarter and read the timer register and see if enough time has expired to
safely ack it. Is the Apic read cycle fast like cache ram (assume as fast
as bus cycles?) or slow like 8259?

Doing something like,
=
= loop:	Read Apic timer count
=	if count is xx cycles since rollover then
=		Ack apic
=	else
=		loop
=
After all it counts bus cycles, doesn't it?

Alternately perhaps there is a status bit in the apic somewhere to check
against after we ack to ensure that it did its job, although we would not
want to hammer the apic with writes it cannot accept?

Or maybe it is not too bad for now, I note that there is an existing fixed
400ns delay in the IDE command routines ide-iops.c ide_execute_command()
which we currently tolerate.

>
> > b) io-apic mods
> > So I have fixed it too (tested on both my epox and albatron MOBOs).
> > Firstly I found 8254 connected directly to pin 0 not pin 2 of io-apic.
> > I have modified check_timer() in io_apic.c to trial connect pin and
> > test for it after the existing test for connection to io-apic.
>
> Good job, i wonder if it could be more generalized and integrated with
> the rest of the code (i haven't even checked the rest of the code, but
> this seemed separated).
>
> One thing though, I get a lot more NMI's now than with nmi_watchdog=2...
> NMI: 85520
> LOC: 85477
>
> I usually had a 3 figure number by now... but.. =)

I have not tested the nmi against the b) io-apic mods. We may have a
vector clash? Perhaps the new apic mpparse.c patch lets the existing
check_timer() routines work properly? I have not yet tried it.

>
> > c) ide driver mods
>
> Cool..
>
> I applied all patches and it survived my grep test so i think it works.
>
> --
> Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net
>

^ permalink raw reply [flat|nested] 62+ messages in thread
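[Editorial sketch, not part of the thread] Ross's "read the timer count, ack only once enough cycles have passed since rollover" loop can be modelled in user space. Everything named here is an assumption made for illustration: `ticks_since_reload()`, `wait_before_ack()`, `struct fake_timer` and `fake_read()` are hypothetical, and the counter is simulated rather than read from hardware. The model relies only on the fact that the local APIC timer counts down from its initial count and reloads in periodic mode, so the ticks elapsed since the last reload are the initial count minus the current count. This is not Ross's patch, just one way the idea might look.

```c
#include <assert.h>
#include <stdint.h>

/* Ticks elapsed since the last reload of a down-counting periodic timer. */
uint32_t ticks_since_reload(uint32_t initial_count, uint32_t current_count)
{
    assert(current_count <= initial_count);
    return initial_count - current_count;
}

/* Callback that returns the timer's current count (hardware or simulated). */
typedef uint32_t (*read_count_fn)(void *ctx);

/*
 * Spin until at least min_ticks have elapsed since the reload, after which
 * the caller would acknowledge the interrupt. Returns how many polls were
 * spent waiting. Caveat: min_ticks must be much smaller than one timer
 * period, otherwise the counter can reload while we are still waiting.
 */
unsigned wait_before_ack(read_count_fn read_count, void *ctx,
                         uint32_t initial_count, uint32_t min_ticks)
{
    unsigned polls = 0;
    while (ticks_since_reload(initial_count, read_count(ctx)) < min_ticks)
        polls++;
    return polls;
}

/* Simulated counter: every read returns the count, then drops it by step. */
struct fake_timer {
    uint32_t count;
    uint32_t step;
};

uint32_t fake_read(void *ctx)
{
    struct fake_timer *t = ctx;
    uint32_t value = t->count;
    t->count = (t->count > t->step) ? t->count - t->step : 0;
    return value;
}
```

With a simulated timer such as `struct fake_timer t = { 1000, 10 };`, calling `wait_before_ack(fake_read, &t, 1000, 50)` polls until 50 ticks have gone by. The cost of each real register read, which Ross asks about above, would decide whether such polling is any cheaper than the fixed delay.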
* Re: Catching NForce2 lockup with NMI watchdog
  2003-12-08 2:07 ` Ross Dickson
@ 2003-12-09 18:12   ` Ian Kumlien
  2003-12-09 22:04     ` Craig Bradney
0 siblings, 1 reply; 62+ messages in thread
From: Ian Kumlien @ 2003-12-09 18:12 UTC (permalink / raw)
To: ross; +Cc: linux-kernel, recbo

[-- Attachment #1: Type: text/plain, Size: 1016 bytes --]

Bob wrote:
> Using a patch that fixes a number of people's nforce2
> lockups while enabling io-apic edge timer, I can now
> use nmi_watchdog=2 but not =1

Why regurgitate patches that are outdated? Personally i find it
outdated after Ross made his patches available and they DO enable
nmi_watchdog=1. (I have seen the old patches mentioned more than once,
if something better comes along, please move to that instead.)

http://marc.theaimsgroup.com/?l=linux-kernel&m=107080280512734&w=2

Anyways, Is there any way to detect if the cpu is "disconnected" or, is
there any way to see when the kernel sends its halts that trigger the
disconnect? (or is it automagic?)

If there was a way to check, then that's all that's needed, all delays can
be removed and the code can be more generalized.

(Since I doubt that this is apic torment. It's more apic trying to talk to
a disconnected cpu... (which both approaches hint at imho))

--
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-09 18:12 ` Catching NForce2 lockup with NMI watchdog Ian Kumlien @ 2003-12-09 22:04 ` Craig Bradney 2003-12-09 23:13 ` Ian Kumlien 2003-12-10 6:14 ` Bob 0 siblings, 2 replies; 62+ messages in thread From: Craig Bradney @ 2003-12-09 22:04 UTC (permalink / raw) To: Ian Kumlien; +Cc: ross, linux-kernel, recbo On Tue, 2003-12-09 at 19:12, Ian Kumlien wrote: > Bob wrote: > > Using a patch that fixes a number of people's nforce2 > > lockups while enabling io-apic edge timer, I can now > > use nmi_watchdog=2 but not =1 > > Why regurgitate patches that are outdated, Personally i find int > outdated after Ross made his patches available and they DO enable > nmi_watchdog=1. (I have seen the old patches mentioned more than once, > if something better comes along, please move to that instead.) > > http://marc.theaimsgroup.com/?l=linux-kernel&m=107080280512734&w=2 > > Anyways, Is there anyway to detect if the cpu is "disconnected" or, is > there anyway to see when the kernel sends it's halts that triggers the > disconnect? (or is it automagic?) > > If there was a way to check, then thats all thats needed, all delays can > be removed and the code can be more generalized. > > (Since doubt that this is apic torment. It's more apic trying to talk to > a disconnected cpu... (which both approaches hints at imho)) Have these patches been submitted for review for inclusion into the main kernel? I'm still running the old IO-APIC patch (Uptime 3d 20h) and having no issues whatsoever. Are all of the patches at that address you provide necessary? What do the IDE ones claim to fix? I have had no real issue with IDE at all.. being able to burn CDs, DVDs, use my ATA133 drive for hdparm, greps, compilation, and general use..... Craig ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-09 22:04 ` Craig Bradney @ 2003-12-09 23:13 ` Ian Kumlien 2003-12-10 6:14 ` Bob 1 sibling, 0 replies; 62+ messages in thread From: Ian Kumlien @ 2003-12-09 23:13 UTC (permalink / raw) To: Craig Bradney; +Cc: ross, linux-kernel, recbo [-- Attachment #1: Type: text/plain, Size: 2039 bytes --] On Tue, 2003-12-09 at 23:04, Craig Bradney wrote: > On Tue, 2003-12-09 at 19:12, Ian Kumlien wrote: > > Bob wrote: > > > Using a patch that fixes a number of people's nforce2 > > > lockups while enabling io-apic edge timer, I can now > > > use nmi_watchdog=2 but not =1 > > > > Why regurgitate patches that are outdated, Personally i find int > > outdated after Ross made his patches available and they DO enable > > nmi_watchdog=1. (I have seen the old patches mentioned more than once, > > if something better comes along, please move to that instead.) > > > > http://marc.theaimsgroup.com/?l=linux-kernel&m=107080280512734&w=2 > > > > Anyways, Is there anyway to detect if the cpu is "disconnected" or, is > > there anyway to see when the kernel sends it's halts that triggers the > > disconnect? (or is it automagic?) > > > > If there was a way to check, then thats all thats needed, all delays can > > be removed and the code can be more generalized. > > > > (Since doubt that this is apic torment. It's more apic trying to talk to > > a disconnected cpu... (which both approaches hints at imho)) > > Have these patches been submitted for review for inclusion into the main > kernel? No, there is no final patch in anyway, there are just dodgy workarounds. I just deem this better with working nmi_watchdog=1 > I'm still running the old IO-APIC patch (Uptime 3d 20h) and having no > issues whatsoever. They fix the same problem.. > Are all of the patches at that address you provide necessary? nope, but they are all nforce2 related. > What do the IDE ones claim to fix? I have had no real issue with IDE at > all.. 
being able to burn CDs, DVDs, use my ATA133 drive for hdparm, > greps, compilation, and general use..... it's just a cleanup afair. Anyways, I think that if we find someway to detect cpu disconnect, then we just need that "detection" prior to the apic ack... (just a guess though) -- Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog
  2003-12-09 22:04 ` Craig Bradney
  2003-12-09 23:13   ` Ian Kumlien
@ 2003-12-10 6:14   ` Bob
  2003-12-10 7:51     ` Craig Bradney
1 sibling, 1 reply; 62+ messages in thread
From: Bob @ 2003-12-10 6:14 UTC (permalink / raw)
To: linux-kernel

Craig Bradney wrote:

>What do the IDE ones[patches] claim to fix? I have had no real issue with IDE at
>all.. being able to burn CDs, DVDs, use my ATA133 drive for hdparm,
>greps, compilation, and general use.....
>
>Craig
>

These patches belong together because the same
necessity is the mother of their invention.

You may not have an offboard promise or sis hd
controller.

Alan Cox looked at "nforce2 irq storm" and the
offboard promise and sis controllers exposing
that dma operations might be running out of
time (time? timing..."timer"? a timer is a given
so "timer" was unthinkable!) waiting for irq
availability. That was months ago. It was only
evident that giving a "bight of slack(1)" to those
ops could help slightly, but we have a timer in
any case, don't we?

One person with a timer patch may have backed into
the nforce2 solution while just trying to get
nmi_watchdog to work, right?

Ian Kumlien looks most likely to reason the problem
all the way through(2).

-Bob D

(1) "give me a bight of slack"
"ah, for a bitty byte of pre-unicode slack loop"
http://www.bartleby.com/61/13/B0241300.html

*bight*

PRONUNCIATION <http://www.bartleby.com/61/12.html>:
<http://www.bartleby.com/61/wavs/13/B0241300.wav> bīt
NOUN: *1a.* A loop in a rope. *b.* The middle or slack part of an
extended rope. *2a.* A bend or curve, especially in a shoreline. *b.*
A wide bay formed by such a bend or curve.
ETYMOLOGY: Middle English, bend, angle, from Old English /byht/. See
*bheug- <http://www.bartleby.com/61/roots/IE63.html>* in Appendix I.

(2) voted most likely to finesse through on a level above monkeys

From Ian Kumlien:

I did some reading on amd's site, and if the disconnect + apic fixed the
same problem as the ~500ns delay, then it could be as i suspect...

I suspect that something goes wrong with apic ack when the cpu is
disconnected and according to the amd docs we could check the
Northbridge's CLKFWDRST or isn't that avail on the outside?
(It would be interesting to see if that fixes the problem as well.)

http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/26237.PDF

I don't really have the knowledge but it would sure be nicer to fix this
by checking this than to just disable it. I dunno if there is something
we could do from within the kernel as well with the sending of HLT but i
doubt it.

Anyways, we need a generalized patch that does better checking on the
NMI bit (like Ross' patch).

PS. Anyone that can point me to northbridge tech docs? and CC

-- Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net

^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog
  2003-12-10 6:14 ` Bob
@ 2003-12-10 7:51   ` Craig Bradney
0 siblings, 0 replies; 62+ messages in thread
From: Craig Bradney @ 2003-12-10 7:51 UTC (permalink / raw)
To: Bob; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4535 bytes --]

Hi,

Thanks to all for their replies..

Of course when I got to the PC this morning.. hung. About 4 days
uptime with the old IRQ0 patch it was ok until 2am this morning.

So.. I have enabled preempt now.. and as for the patches I have put the
two 2.6test11 patches in that Jesse Allen attached for APIC and IO_APIC
that were I think originally created by Ross Dickson for 2.4.2x.

Should I also be adding in a CPU Disconnect patch (or running athcool as
there's nothing in my ASUS BIOS)?

Should I be running an ATA133 patch (previous emails indicate yes), and
if so, is there a 2.6test11 patch?

When I boot with nmi_watchdog=1 I get NMI values of about the same as
IRQ0, just a bit less (1500 less at this point). With nmi_watchdog=2, I
get barely any compared to IRQ0. IRQ0/timer is on IO-APIC-edge.

This is from my current boot up, with nmi 1,

           CPU0
  0:     344998    IO-APIC-edge  timer
  1:       1517    IO-APIC-edge  i8042
  2:          0          XT-PIC  cascade
  8:          2    IO-APIC-edge  rtc
  9:          0   IO-APIC-level  acpi
 12:       5313    IO-APIC-edge  i8042
 14:      10179    IO-APIC-edge  ide0
 15:        927    IO-APIC-edge  ide1
 19:      23551   IO-APIC-level  radeon@PCI:3:0:0
 21:       3882   IO-APIC-level  ehci_hcd, NVidia nForce2, eth0
 22:          3   IO-APIC-level  ohci1394
NMI:     343501
LOC:     343354
ERR:          0
MIS:          0

I have attached my dmesg outputs from the startups with the two nmi
options.

regards
Craig

On Wed, 2003-12-10 at 07:14, Bob wrote:
> Craig Bradney wrote:
>
> >What do the IDE ones[patches] claim to fix? I have had no real issue with IDE at
> >all.. being able to burn CDs, DVDs, use my ATA133 drive for hdparm,
> >greps, compilation, and general use.....
> >
> >Craig
> >
>
> These patches belong together because the same
> necessity is the mother of their invention.
>
> You may not have an offboard promise or sis hd
> controller.
>
> Alan Cox looked at "nforce2 irq storm" and the
> offboard promise and sis controllers exposing
> that dma operations might be running out of
> time (time? timing..."timer"? a timer is a given
> so "timer" was unthinkable!) waiting for irq
> availability. That was months ago. It was only
> evident that giving a "bight of slack(1)" to those
> ops could help slightly, but we have a timer in
> any case, don't we?
>
> One person with a timer patch may have backed into
> the nforce2 solution while just trying to get
> nmi_watchdog to work, right?
>
> Ian Kumlien looks most likely to reason the problem
> all the way through(2).
>
> -Bob D
>
> (1) "give me a bight of slack"
> "ah, for a bitty byte of pre-unicode slack loop"
> http://www.bartleby.com/61/13/B0241300.html
>
> *bight*
>
>
> PRONUNCIATION <http://www.bartleby.com/61/12.html>:
> <http://www.bartleby.com/61/wavs/13/B0241300.wav> bīt
> NOUN: *1a.* A loop in a rope. *b.* The middle or slack part of an
> extended rope. *2a.* A bend or curve, especially in a shoreline. *b.*
> A wide bay formed by such a bend or curve.
> ETYMOLOGY: Middle English, bend, angle, from Old English /byht/. See
> *bheug- <http://www.bartleby.com/61/roots/IE63.html>* in Appendix I.
>
>
> (2) voted most likely to finesse through on a level above monkeys
>
> From Ian Kumlien:
>
> I did some reading on amd's site, and if the disconnect + apic fixed the
> same problem as the ~500ns delay, then it could be as i suspect...
>
> I suspect that something goes wrong with apic ack when the cpu is
> disconnected and according to the amd docs we could check the
> Northbridge's CLKFWDRST or isn't that avail on the outside?
> (It would be interesting to see if that fixes the problem as well.)
>
> http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/26237.PDF
>
> I don't really have the knowledge but it would sure be nicer to fix this
> by checking this than to just disable it. I dunno if there is something
> we could do from within the kernel as well with the sending of HLT but i
> doubt it.
>
> Anyways, we need a generalized patch that does better checking on the
> NMI bit (like Ross' patch).
>
> PS. Anyone that can point me to northbridge tech docs? and CC
>
> -- Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

[-- Attachment #2: dmesg_afterpatch_r3_nmi_1 --]
[-- Type: text/plain, Size: 15364 bytes --]

000 Nvidia ) @ 0x000f75e0
ACPI: RSDT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x1fff3000
ACPI: FADT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x1fff3040
ACPI: MADT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x1fff74c0
ACPI: DSDT (v001 NVIDIA AWRDACPI 0x00001000 MSFT 0x0100000e) @ 0x00000000
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 6:10 APIC version 16
ACPI: LAPIC_NMI (acpi_id[0x00] polarity[0x1] trigger[0x1] lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec00000] global_irq_base[0x0])
IOAPIC[0]: Assigned apic_id 2
IOAPIC[0]: apic_id 2, version 17, address 0xfec00000, IRQ 0-23
ACPI: INT_SRC_OVR (bus[0] irq[0x0] global_irq[0x2] polarity[0x0] trigger[0x0])
ACPI: INT_SRC_OVR (bus[0] irq[0x9] global_irq[0x9] polarity[0x1] trigger[0x3])
ACPI: INT_SRC_OVR (bus[0] irq[0xe] global_irq[0xe] polarity[0x1] trigger[0x1])
ACPI: INT_SRC_OVR (bus[0] irq[0xf] global_irq[0xf] polarity[0x1] trigger[0x1])
Enabling APIC mode: Flat. Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Building zonelist for node : 0
Kernel command line: root=/dev/hda6 nmi_watchdog=1
Initializing CPU#0
PID hash table entries: 2048 (order 11: 16384 bytes)
Detected 1913.382 MHz processor.
Console: colour VGA+ 80x25
Memory: 514424k/524224k available (2563k kernel code, 9052k reserved, 933k data, 168k init, 0k highmem)
Calibrating delay loop... 3784.70 BogoMIPS
Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
CPU: After generic identify, caps: 0383fbff c1c3fbff 00000000 00000000
CPU: After vendor identify, caps: 0383fbff c1c3fbff 00000000 00000000
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU: After all inits, caps: 0383fbff c1c3fbff 00000000 00000020
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: AMD Athlon(tm) XP 2600+ stepping 00
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
enabled ExtINT on CPU#0
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
ENABLING IO-APIC IRQs
init IO_APIC IRQs
 IO-APIC (apicid-pin) 2-0, 2-16, 2-17, 2-18, 2-19, 2-20, 2-21, 2-22, 2-23 not connected.
..TIMER: vector=0x31 pin1=2 pin2=-1
..MP-BIOS bug: 8254 timer not connected to IO-APIC
..TIMER: Is timer irq0 connected to IOAPIC Pin0? ...
IOAPIC[0]: Set PCI routing entry (2-0 -> 0x31 -> IRQ 0 Mode:0 Active:0)
activating NMI Watchdog ... done.
testing NMI watchdog ... OK.
..TIMER: works OK on apic pin0 irq0
number of MP IRQ sources: 15.
number of IO-APIC #2 registers: 24.
testing the IO APIC.......................
IO APIC #2......
.... register #00: 02000000
.......    : physical APIC id: 02
.......    : Delivery Type: 0
.......    : LTS          : 0
.... register #01: 00170011
.......     : max redirection entries: 0017
.......     : PRQ implemented: 0
.......     : IO APIC version: 0011
.... register #02: 00000000
.......     : arbitration: 00
.... IRQ redirection table:
 NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
 00 001 01  0    0    0   0   0    1    1    31
 01 001 01  0    0    0   0   0    1    1    39
 02 000 00  0    0    0   0   0    0    0    00
 03 001 01  0    0    0   0   0    1    1    41
 04 001 01  0    0    0   0   0    1    1    49
 05 001 01  0    0    0   0   0    1    1    51
 06 001 01  0    0    0   0   0    1    1    59
 07 001 01  0    0    0   0   0    1    1    61
 08 001 01  0    0    0   0   0    1    1    69
 09 001 01  1    1    0   0   0    1    1    71
 0a 001 01  0    0    0   0   0    1    1    79
 0b 001 01  0    0    0   0   0    1    1    81
 0c 001 01  0    0    0   0   0    1    1    89
 0d 001 01  0    0    0   0   0    1    1    91
 0e 001 01  0    0    0   0   0    1    1    99
 0f 001 01  0    0    0   0   0    1    1    A1
 10 000 00  1    0    0   0   0    0    0    00
 11 000 00  1    0    0   0   0    0    0    00
 12 000 00  1    0    0   0   0    0    0    00
 13 000 00  1    0    0   0   0    0    0    00
 14 000 00  1    0    0   0   0    0    0    00
 15 000 00  1    0    0   0   0    0    0    00
 16 000 00  1    0    0   0   0    0    0    00
 17 000 00  1    0    0   0   0    0    0    00
IRQ to pin mappings:
IRQ0 -> 0:2-> 0:0
IRQ1 -> 0:1
IRQ3 -> 0:3
IRQ4 -> 0:4
IRQ5 -> 0:5
IRQ6 -> 0:6
IRQ7 -> 0:7
IRQ8 -> 0:8
IRQ9 -> 0:9
IRQ10 -> 0:10
IRQ11 -> 0:11
IRQ12 -> 0:12
IRQ13 -> 0:13
IRQ14 -> 0:14
IRQ15 -> 0:15
.................................... done.
Using local APIC timer interrupts.
calibrating APIC timer ...
..... CPU clock speed is 1912.0876 MHz.
..... host bus clock speed is 332.0674 MHz.
NET: Registered protocol family 16 PCI: PCI BIOS revision 2.10 entry at 0xfb490, last bus=3 PCI: Using configuration type 1 mtrr: v2.0 (20020519) ACPI: Subsystem revision 20031002 IOAPIC[0]: Set PCI routing entry (2-9 -> 0x71 -> IRQ 9 Mode:1 Active:0) ACPI: Interpreter enabled ACPI: Using IOAPIC for interrupt routing ACPI: PCI Root Bridge [PCI0] (00:00) PCI: Probing PCI hardware (bus 00) ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.HUB0._PRT] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.AGPB._PRT] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.HUB1._PRT] ACPI: PCI Interrupt Link [LNK1] (IRQs 3 4 5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LNK2] (IRQs 3 4 5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LNK3] (IRQs 3 4 5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LNK4] (IRQs 3 4 *5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LNK5] (IRQs 3 4 5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LUBA] (IRQs 3 4 5 6 7 10 *11 12 14 15) ACPI: PCI Interrupt Link [LUBB] (IRQs 3 4 5 6 7 10 *11 12 14 15) ACPI: PCI Interrupt Link [LMAC] (IRQs 3 4 5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LAPU] (IRQs 3 4 *5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LACI] (IRQs 3 4 5 6 7 10 *11 12 14 15) ACPI: PCI Interrupt Link [LMCI] (IRQs 3 4 5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LSMB] (IRQs 3 4 *5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LUB2] (IRQs 3 4 *5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LFIR] (IRQs 3 4 5 6 7 10 *11 12 14 15) ACPI: PCI Interrupt Link [L3CM] (IRQs 3 4 5 6 7 10 *11 12 14 15) ACPI: PCI Interrupt Link [LIDE] (IRQs 3 4 5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [APC1] (IRQs 16) ACPI: PCI Interrupt Link [APC2] (IRQs 17) ACPI: PCI Interrupt Link [APC3] (IRQs 18) ACPI: PCI Interrupt Link [APC4] (IRQs *19) ACPI: PCI Interrupt Link [APC5] (IRQs 16) ACPI: PCI Interrupt Link [APCF] (IRQs 20 21 22) ACPI: PCI Interrupt Link [APCG] (IRQs 20 21 22) ACPI: PCI Interrupt Link [APCH] (IRQs 20 21 
22) ACPI: PCI Interrupt Link [APCI] (IRQs 20 21 22) ACPI: PCI Interrupt Link [APCJ] (IRQs 20 21 22) ACPI: PCI Interrupt Link [APCK] (IRQs 20 21 22) ACPI: PCI Interrupt Link [APCS] (IRQs *23) ACPI: PCI Interrupt Link [APCL] (IRQs 20 21 22) ACPI: PCI Interrupt Link [APCM] (IRQs 20 21 22) ACPI: PCI Interrupt Link [AP3C] (IRQs 20 21 22) ACPI: PCI Interrupt Link [APCZ] (IRQs 20 21 22) Linux Plug and Play Support v0.97 (c) Adam Belay SCSI subsystem initialized drivers/usb/core/usb.c: registered new driver usbfs drivers/usb/core/usb.c: registered new driver hub ACPI: PCI Interrupt Link [APCS] enabled at IRQ 23 IOAPIC[0]: Set PCI routing entry (2-23 -> 0xa9 -> IRQ 23 Mode:1 Active:0) 00:00:01[A] -> 2-23 -> IRQ 23 Pin 2-23 already programmed ACPI: PCI Interrupt Link [APCF] enabled at IRQ 20 IOAPIC[0]: Set PCI routing entry (2-20 -> 0xb1 -> IRQ 20 Mode:1 Active:0) 00:00:02[A] -> 2-20 -> IRQ 20 ACPI: PCI Interrupt Link [APCG] enabled at IRQ 22 IOAPIC[0]: Set PCI routing entry (2-22 -> 0xb9 -> IRQ 22 Mode:1 Active:0) 00:00:02[B] -> 2-22 -> IRQ 22 ACPI: PCI Interrupt Link [APCL] enabled at IRQ 21 IOAPIC[0]: Set PCI routing entry (2-21 -> 0xc1 -> IRQ 21 Mode:1 Active:0) 00:00:02[C] -> 2-21 -> IRQ 21 ACPI: PCI Interrupt Link [APCH] enabled at IRQ 20 Pin 2-20 already programmed ACPI: PCI Interrupt Link [APCI] enabled at IRQ 22 Pin 2-22 already programmed ACPI: PCI Interrupt Link [APCJ] enabled at IRQ 21 Pin 2-21 already programmed ACPI: PCI Interrupt Link [APCK] enabled at IRQ 20 Pin 2-20 already programmed ACPI: PCI Interrupt Link [APCM] enabled at IRQ 22 Pin 2-22 already programmed ACPI: PCI Interrupt Link [AP3C] enabled at IRQ 21 Pin 2-21 already programmed ACPI: PCI Interrupt Link [APCZ] enabled at IRQ 20 Pin 2-20 already programmed ACPI: PCI Interrupt Link [APC1] enabled at IRQ 16 IOAPIC[0]: Set PCI routing entry (2-16 -> 0xc9 -> IRQ 16 Mode:1 Active:0) 00:01:06[A] -> 2-16 -> IRQ 16 ACPI: PCI Interrupt Link [APC2] enabled at IRQ 17 IOAPIC[0]: Set PCI routing entry (2-17 -> 
0xd1 -> IRQ 17 Mode:1 Active:0) 00:01:06[B] -> 2-17 -> IRQ 17 ACPI: PCI Interrupt Link [APC3] enabled at IRQ 18 IOAPIC[0]: Set PCI routing entry (2-18 -> 0xd9 -> IRQ 18 Mode:1 Active:0) 00:01:06[C] -> 2-18 -> IRQ 18 ACPI: PCI Interrupt Link [APC4] enabled at IRQ 19 IOAPIC[0]: Set PCI routing entry (2-19 -> 0xe1 -> IRQ 19 Mode:1 Active:0) 00:01:06[D] -> 2-19 -> IRQ 19 Pin 2-19 already programmed Pin 2-16 already programmed Pin 2-17 already programmed Pin 2-18 already programmed Pin 2-18 already programmed Pin 2-19 already programmed Pin 2-16 already programmed Pin 2-17 already programmed Pin 2-17 already programmed Pin 2-18 already programmed Pin 2-19 already programmed Pin 2-16 already programmed Pin 2-16 already programmed Pin 2-17 already programmed Pin 2-18 already programmed Pin 2-19 already programmed Pin 2-18 already programmed Pin 2-18 already programmed Pin 2-18 already programmed Pin 2-18 already programmed Pin 2-19 already programmed Pin 2-21 already programmed Pin 2-21 already programmed Pin 2-21 already programmed Pin 2-21 already programmed PCI: Using ACPI for IRQ routing PCI: if you experience problems, try using option 'pci=noacpi' or even 'acpi=off' Machine check exception polling timer started. devfs: v1.22 (20021013) Richard Gooch (rgooch@atnf.csiro.au) devfs: boot_options: 0x1 Installing knfsd (copyright (C) 1996 okir@monad.swb.de). udf: registering filesystem Supermount version 2.0.2a for kernel 2.6 ACPI: Power Button (FF) [PWRF] ACPI: Processor [CPU0] (supports C1) pty: 256 Unix98 ptys configured request_module: failed /sbin/modprobe -- parport_lowlevel. 
error = -16 lp: driver loaded but no devices found Real Time Clock Driver v1.12 Linux agpgart interface v0.100 (c) Dave Jones agpgart: Detected NVIDIA nForce2 chipset agpgart: Maximum main memory to use for agp memory: 439M agpgart: AGP aperture is 64M @ 0xd0000000 [drm] Initialized radeon 1.9.0 20020828 on minor 0 Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A parport0: PC-style at 0x378 (0x778) [PCSPP(,...)] parport0: irq 7 detected lp0: using parport0 (polling). Using anticipatory io scheduler Floppy drive(s): fd0 is 1.44M FDC 0 is a post-1991 82077 loop: loaded (max 8 devices) 3c59x: Donald Becker and others. www.scyld.com/network/vortex.html 0000:02:01.0: 3Com PCI 3c920 Tornado at 0x9000. Vers LK1.1.19 Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx NFORCE2: IDE controller at PCI slot 0000:00:09.0 NFORCE2: chipset revision 162 NFORCE2: not 100% native mode: will probe irqs later NFORCE2: BIOS didn't set cable bits correctly. Enabling workaround. 
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx NFORCE2: 0000:00:09.0 (rev a2) UDMA133 controller ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:DMA ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:DMA hda: Maxtor 6Y080P0, ATA DISK drive ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 hdc: SONY DVD RW DRU-510A, ATAPI CD/DVD-ROM drive hdd: SAMSUNG CD-ROM SC-152C, ATAPI CD/DVD-ROM drive ide1 at 0x170-0x177,0x376 on irq 15 hda: max request size: 128KiB hda: 160086528 sectors (81964 MB) w/7936KiB Cache, CHS=65535/16/63, UDMA(133) /dev/ide/host0/bus0/target0/lun0: p1 p2 < p5 p6 p7 p8 > hdc: ATAPI 32X DVD-ROM DVD-R CD-R/RW drive, 8192kB Cache, UDMA(33) Uniform CD-ROM driver Revision: 3.12 hdd: ATAPI 52X CD-ROM drive, 128kB Cache, DMA ohci1394: $Rev: 1045 $ Ben Collins <bcollins@debian.org> PCI: Setting latency timer of device 0000:00:0d.0 to 64 ohci1394_0: OHCI-1394 1.1 (PCI): IRQ=[22] MMIO=[e0083000-e00837ff] Max Packet=[2048] ohci1394_0: SelfID received outside of bus reset sequence ehci_hcd 0000:00:02.2: EHCI Host Controller PCI: Setting latency timer of device 0000:00:02.2 to 64 ehci_hcd 0000:00:02.2: irq 21, pci mem e0848000 ehci_hcd 0000:00:02.2: new USB bus registered, assigned bus number 1 PCI: cache line size of 64 is not supported by device 0000:00:02.2 ehci_hcd 0000:00:02.2: USB 2.0 enabled, EHCI 1.00, driver 2003-Jun-13 hub 1-0:1.0: USB hub found hub 1-0:1.0: 6 ports detected drivers/usb/host/uhci-hcd.c: USB Universal Host Controller Interface driver v2.1 drivers/usb/core/usb.c: registered new driver usblp drivers/usb/class/usblp.c: v0.13: USB Printer Device Class driver Initializing USB Mass Storage driver... drivers/usb/core/usb.c: registered new driver usb-storage USB Mass Storage support registered. 
drivers/usb/core/usb.c: registered new driver hid drivers/usb/input/hid-core.c: v2.0:USB HID core driver mice: PS/2 mouse device common for all mice input: ImExPS/2 Logitech Explorer Mouse on isa0060/serio1 serio: i8042 AUX port at 0x60,0x64 irq 12 input: AT Translated Set 2 keyboard on isa0060/serio0 serio: i8042 KBD port at 0x60,0x64 irq 1 Advanced Linux Sound Architecture Driver Version 0.9.7 (Thu Sep 25 19:16:36 2003 UTC). request_module: failed /sbin/modprobe -- snd-card-0. error = -16 ALSA device list: No soundcards found. NET: Registered protocol family 2 IP: routing cache hash table of 4096 buckets, 32Kbytes TCP: Hash tables configured (established 32768 bind 65536) NET: Registered protocol family 1 NET: Registered protocol family 17 kjournald starting. Commit interval 5 seconds EXT3-fs: mounted filesystem with ordered data mode. VFS: Mounted root (ext3 filesystem) readonly. Mounted devfs on /dev Freeing unused kernel memory: 168k freed ieee1394: Host added: ID:BUS[0-00:1023] GUID[00e018000044dec8] Adding 2008084k swap on /dev/hda5. Priority:-1 extents:1 EXT3 FS on hda6, internal journal i2c_adapter i2c-0: nForce2 SMBus adapter at 0x5000 i2c_adapter i2c-1: nForce2 SMBus adapter at 0x5500 registering 1-002d registering 1-0049 registering 1-0048 kjournald starting. Commit interval 5 seconds EXT3 FS on hda7, internal journal EXT3-fs: mounted filesystem with ordered data mode. kjournald starting. Commit interval 5 seconds EXT3 FS on hda8, internal journal EXT3-fs: mounted filesystem with ordered data mode. PCI: Setting latency timer of device 0000:00:06.0 to 64 intel8x0: clocking to 47450 agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0. agpgart: Putting AGP V2 device at 0000:00:00.0 into 1x mode agpgart: Putting AGP V2 device at 0000:03:00.0 into 1x mode agpgart: Putting AGP V2 device at 0000:03:00.1 into 1x mode [drm] Loading R200 Microcode [-- Attachment #3: dmesg_afterpatch_r3_nmi_2 --] [-- Type: text/plain, Size: 15361 bytes --] DMI 2.2 present. 
ACPI: RSDP (v000 Nvidia ) @ 0x000f75e0 ACPI: RSDT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x1fff3000 ACPI: FADT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x1fff3040 ACPI: MADT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x1fff74c0 ACPI: DSDT (v001 NVIDIA AWRDACPI 0x00001000 MSFT 0x0100000e) @ 0x00000000 ACPI: Local APIC address 0xfee00000 ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled) Processor #0 6:10 APIC version 16 ACPI: LAPIC_NMI (acpi_id[0x00] polarity[0x1] trigger[0x1] lint[0x1]) ACPI: IOAPIC (id[0x02] address[0xfec00000] global_irq_base[0x0]) IOAPIC[0]: Assigned apic_id 2 IOAPIC[0]: apic_id 2, version 17, address 0xfec00000, IRQ 0-23 ACPI: INT_SRC_OVR (bus[0] irq[0x0] global_irq[0x2] polarity[0x0] trigger[0x0]) ACPI: INT_SRC_OVR (bus[0] irq[0x9] global_irq[0x9] polarity[0x1] trigger[0x3]) ACPI: INT_SRC_OVR (bus[0] irq[0xe] global_irq[0xe] polarity[0x1] trigger[0x1]) ACPI: INT_SRC_OVR (bus[0] irq[0xf] global_irq[0xf] polarity[0x1] trigger[0x1]) Enabling APIC mode: Flat. Using 1 I/O APICs Using ACPI (MADT) for SMP configuration information Building zonelist for node : 0 Kernel command line: root=/dev/hda6 nmi_watchdog=2 Initializing CPU#0 PID hash table entries: 2048 (order 11: 16384 bytes) Detected 1913.393 MHz processor. Console: colour VGA+ 80x25 Memory: 514424k/524224k available (2563k kernel code, 9052k reserved, 933k data, 168k init, 0k highmem) Calibrating delay loop... 3784.70 BogoMIPS Dentry cache hash table entries: 65536 (order: 6, 262144 bytes) Inode-cache hash table entries: 32768 (order: 5, 131072 bytes) Mount-cache hash table entries: 512 (order: 0, 4096 bytes) CPU: After generic identify, caps: 0383fbff c1c3fbff 00000000 00000000 CPU: After vendor identify, caps: 0383fbff c1c3fbff 00000000 00000000 CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line) CPU: L2 Cache: 512K (64 bytes/line) CPU: After all inits, caps: 0383fbff c1c3fbff 00000000 00000020 Intel machine check architecture supported. 
Intel machine check reporting enabled on CPU#0. CPU: AMD Athlon(tm) XP 2600+ stepping 00 Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Checking 'hlt' instruction... OK. POSIX conformance testing by UNIFIX enabled ExtINT on CPU#0 ESR value before enabling vector: 00000000 ESR value after enabling vector: 00000000 testing NMI watchdog ... OK. ENABLING IO-APIC IRQs init IO_APIC IRQs IO-APIC (apicid-pin) 2-0, 2-16, 2-17, 2-18, 2-19, 2-20, 2-21, 2-22, 2-23 not connected. ..TIMER: vector=0x31 pin1=2 pin2=-1 ..MP-BIOS bug: 8254 timer not connected to IO-APIC ..TIMER: Is timer irq0 connected to IOAPIC Pin0? ... IOAPIC[0]: Set PCI routing entry (2-0 -> 0x31 -> IRQ 0 Mode:0 Active:0) ..TIMER: works OK on apic pin0 irq0 number of MP IRQ sources: 15. number of IO-APIC #2 registers: 24. testing the IO APIC....................... IO APIC #2...... .... register #00: 02000000 ....... : physical APIC id: 02 ....... : Delivery Type: 0 ....... : LTS : 0 .... register #01: 00170011 ....... : max redirection entries: 0017 ....... : PRQ implemented: 0 ....... : IO APIC version: 0011 .... register #02: 00000000 ....... : arbitration: 00 .... 
IRQ redirection table: NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect: 00 001 01 0 0 0 0 0 1 1 31 01 001 01 0 0 0 0 0 1 1 39 02 000 00 0 0 0 0 0 0 0 00 03 001 01 0 0 0 0 0 1 1 41 04 001 01 0 0 0 0 0 1 1 49 05 001 01 0 0 0 0 0 1 1 51 06 001 01 0 0 0 0 0 1 1 59 07 001 01 0 0 0 0 0 1 1 61 08 001 01 0 0 0 0 0 1 1 69 09 001 01 1 1 0 0 0 1 1 71 0a 001 01 0 0 0 0 0 1 1 79 0b 001 01 0 0 0 0 0 1 1 81 0c 001 01 0 0 0 0 0 1 1 89 0d 001 01 0 0 0 0 0 1 1 91 0e 001 01 0 0 0 0 0 1 1 99 0f 001 01 0 0 0 0 0 1 1 A1 10 000 00 1 0 0 0 0 0 0 00 11 000 00 1 0 0 0 0 0 0 00 12 000 00 1 0 0 0 0 0 0 00 13 000 00 1 0 0 0 0 0 0 00 14 000 00 1 0 0 0 0 0 0 00 15 000 00 1 0 0 0 0 0 0 00 16 000 00 1 0 0 0 0 0 0 00 17 000 00 1 0 0 0 0 0 0 00 IRQ to pin mappings: IRQ0 -> 0:2-> 0:0 IRQ1 -> 0:1 IRQ3 -> 0:3 IRQ4 -> 0:4 IRQ5 -> 0:5 IRQ6 -> 0:6 IRQ7 -> 0:7 IRQ8 -> 0:8 IRQ9 -> 0:9 IRQ10 -> 0:10 IRQ11 -> 0:11 IRQ12 -> 0:12 IRQ13 -> 0:13 IRQ14 -> 0:14 IRQ15 -> 0:15 .................................... done. Using local APIC timer interrupts. calibrating APIC timer ... ..... CPU clock speed is 1912.0941 MHz. ..... host bus clock speed is 332.0685 MHz. 
NET: Registered protocol family 16 PCI: PCI BIOS revision 2.10 entry at 0xfb490, last bus=3 PCI: Using configuration type 1 mtrr: v2.0 (20020519) ACPI: Subsystem revision 20031002 IOAPIC[0]: Set PCI routing entry (2-9 -> 0x71 -> IRQ 9 Mode:1 Active:0) ACPI: Interpreter enabled ACPI: Using IOAPIC for interrupt routing ACPI: PCI Root Bridge [PCI0] (00:00) PCI: Probing PCI hardware (bus 00) ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.HUB0._PRT] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.AGPB._PRT] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.HUB1._PRT] ACPI: PCI Interrupt Link [LNK1] (IRQs 3 4 5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LNK2] (IRQs 3 4 5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LNK3] (IRQs 3 4 5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LNK4] (IRQs 3 4 *5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LNK5] (IRQs 3 4 5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LUBA] (IRQs 3 4 5 6 7 10 *11 12 14 15) ACPI: PCI Interrupt Link [LUBB] (IRQs 3 4 5 6 7 10 *11 12 14 15) ACPI: PCI Interrupt Link [LMAC] (IRQs 3 4 5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LAPU] (IRQs 3 4 *5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LACI] (IRQs 3 4 5 6 7 10 *11 12 14 15) ACPI: PCI Interrupt Link [LMCI] (IRQs 3 4 5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LSMB] (IRQs 3 4 *5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LUB2] (IRQs 3 4 *5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LFIR] (IRQs 3 4 5 6 7 10 *11 12 14 15) ACPI: PCI Interrupt Link [L3CM] (IRQs 3 4 5 6 7 10 *11 12 14 15) ACPI: PCI Interrupt Link [LIDE] (IRQs 3 4 5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [APC1] (IRQs 16) ACPI: PCI Interrupt Link [APC2] (IRQs 17) ACPI: PCI Interrupt Link [APC3] (IRQs 18) ACPI: PCI Interrupt Link [APC4] (IRQs *19) ACPI: PCI Interrupt Link [APC5] (IRQs 16) ACPI: PCI Interrupt Link [APCF] (IRQs 20 21 22) ACPI: PCI Interrupt Link [APCG] (IRQs 20 21 22) ACPI: PCI Interrupt Link [APCH] (IRQs 20 21 
22) ACPI: PCI Interrupt Link [APCI] (IRQs 20 21 22) ACPI: PCI Interrupt Link [APCJ] (IRQs 20 21 22) ACPI: PCI Interrupt Link [APCK] (IRQs 20 21 22) ACPI: PCI Interrupt Link [APCS] (IRQs *23) ACPI: PCI Interrupt Link [APCL] (IRQs 20 21 22) ACPI: PCI Interrupt Link [APCM] (IRQs 20 21 22) ACPI: PCI Interrupt Link [AP3C] (IRQs 20 21 22) ACPI: PCI Interrupt Link [APCZ] (IRQs 20 21 22) Linux Plug and Play Support v0.97 (c) Adam Belay SCSI subsystem initialized drivers/usb/core/usb.c: registered new driver usbfs drivers/usb/core/usb.c: registered new driver hub ACPI: PCI Interrupt Link [APCS] enabled at IRQ 23 IOAPIC[0]: Set PCI routing entry (2-23 -> 0xa9 -> IRQ 23 Mode:1 Active:0) 00:00:01[A] -> 2-23 -> IRQ 23 Pin 2-23 already programmed ACPI: PCI Interrupt Link [APCF] enabled at IRQ 20 IOAPIC[0]: Set PCI routing entry (2-20 -> 0xb1 -> IRQ 20 Mode:1 Active:0) 00:00:02[A] -> 2-20 -> IRQ 20 ACPI: PCI Interrupt Link [APCG] enabled at IRQ 22 IOAPIC[0]: Set PCI routing entry (2-22 -> 0xb9 -> IRQ 22 Mode:1 Active:0) 00:00:02[B] -> 2-22 -> IRQ 22 ACPI: PCI Interrupt Link [APCL] enabled at IRQ 21 IOAPIC[0]: Set PCI routing entry (2-21 -> 0xc1 -> IRQ 21 Mode:1 Active:0) 00:00:02[C] -> 2-21 -> IRQ 21 ACPI: PCI Interrupt Link [APCH] enabled at IRQ 20 Pin 2-20 already programmed ACPI: PCI Interrupt Link [APCI] enabled at IRQ 22 Pin 2-22 already programmed ACPI: PCI Interrupt Link [APCJ] enabled at IRQ 21 Pin 2-21 already programmed ACPI: PCI Interrupt Link [APCK] enabled at IRQ 20 Pin 2-20 already programmed ACPI: PCI Interrupt Link [APCM] enabled at IRQ 22 Pin 2-22 already programmed ACPI: PCI Interrupt Link [AP3C] enabled at IRQ 21 Pin 2-21 already programmed ACPI: PCI Interrupt Link [APCZ] enabled at IRQ 20 Pin 2-20 already programmed ACPI: PCI Interrupt Link [APC1] enabled at IRQ 16 IOAPIC[0]: Set PCI routing entry (2-16 -> 0xc9 -> IRQ 16 Mode:1 Active:0) 00:01:06[A] -> 2-16 -> IRQ 16 ACPI: PCI Interrupt Link [APC2] enabled at IRQ 17 IOAPIC[0]: Set PCI routing entry (2-17 -> 
0xd1 -> IRQ 17 Mode:1 Active:0) 00:01:06[B] -> 2-17 -> IRQ 17 ACPI: PCI Interrupt Link [APC3] enabled at IRQ 18 IOAPIC[0]: Set PCI routing entry (2-18 -> 0xd9 -> IRQ 18 Mode:1 Active:0) 00:01:06[C] -> 2-18 -> IRQ 18 ACPI: PCI Interrupt Link [APC4] enabled at IRQ 19 IOAPIC[0]: Set PCI routing entry (2-19 -> 0xe1 -> IRQ 19 Mode:1 Active:0) 00:01:06[D] -> 2-19 -> IRQ 19 Pin 2-19 already programmed Pin 2-16 already programmed Pin 2-17 already programmed Pin 2-18 already programmed Pin 2-18 already programmed Pin 2-19 already programmed Pin 2-16 already programmed Pin 2-17 already programmed Pin 2-17 already programmed Pin 2-18 already programmed Pin 2-19 already programmed Pin 2-16 already programmed Pin 2-16 already programmed Pin 2-17 already programmed Pin 2-18 already programmed Pin 2-19 already programmed Pin 2-18 already programmed Pin 2-18 already programmed Pin 2-18 already programmed Pin 2-18 already programmed Pin 2-19 already programmed Pin 2-21 already programmed Pin 2-21 already programmed Pin 2-21 already programmed Pin 2-21 already programmed PCI: Using ACPI for IRQ routing PCI: if you experience problems, try using option 'pci=noacpi' or even 'acpi=off' Machine check exception polling timer started. devfs: v1.22 (20021013) Richard Gooch (rgooch@atnf.csiro.au) devfs: boot_options: 0x1 Installing knfsd (copyright (C) 1996 okir@monad.swb.de). udf: registering filesystem Supermount version 2.0.2a for kernel 2.6 ACPI: Power Button (FF) [PWRF] ACPI: Processor [CPU0] (supports C1) pty: 256 Unix98 ptys configured request_module: failed /sbin/modprobe -- parport_lowlevel. 
error = -16 lp: driver loaded but no devices found Real Time Clock Driver v1.12 Linux agpgart interface v0.100 (c) Dave Jones agpgart: Detected NVIDIA nForce2 chipset agpgart: Maximum main memory to use for agp memory: 439M agpgart: AGP aperture is 64M @ 0xd0000000 [drm] Initialized radeon 1.9.0 20020828 on minor 0 Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A parport0: PC-style at 0x378 (0x778) [PCSPP(,...)] parport0: irq 7 detected lp0: using parport0 (polling). Using anticipatory io scheduler Floppy drive(s): fd0 is 1.44M FDC 0 is a post-1991 82077 loop: loaded (max 8 devices) 3c59x: Donald Becker and others. www.scyld.com/network/vortex.html 0000:02:01.0: 3Com PCI 3c920 Tornado at 0x9000. Vers LK1.1.19 Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx NFORCE2: IDE controller at PCI slot 0000:00:09.0 NFORCE2: chipset revision 162 NFORCE2: not 100% native mode: will probe irqs later NFORCE2: BIOS didn't set cable bits correctly. Enabling workaround. 
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx NFORCE2: 0000:00:09.0 (rev a2) UDMA133 controller ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:DMA ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:DMA hda: Maxtor 6Y080P0, ATA DISK drive ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 hdc: SONY DVD RW DRU-510A, ATAPI CD/DVD-ROM drive hdd: SAMSUNG CD-ROM SC-152C, ATAPI CD/DVD-ROM drive ide1 at 0x170-0x177,0x376 on irq 15 hda: max request size: 128KiB hda: 160086528 sectors (81964 MB) w/7936KiB Cache, CHS=65535/16/63, UDMA(133) /dev/ide/host0/bus0/target0/lun0: p1 p2 < p5 p6 p7 p8 > hdc: ATAPI 32X DVD-ROM DVD-R CD-R/RW drive, 8192kB Cache, UDMA(33) Uniform CD-ROM driver Revision: 3.12 hdd: ATAPI 52X CD-ROM drive, 128kB Cache, DMA ohci1394: $Rev: 1045 $ Ben Collins <bcollins@debian.org> PCI: Setting latency timer of device 0000:00:0d.0 to 64 ohci1394_0: OHCI-1394 1.1 (PCI): IRQ=[22] MMIO=[e0083000-e00837ff] Max Packet=[2048] ohci1394_0: SelfID received outside of bus reset sequence ehci_hcd 0000:00:02.2: EHCI Host Controller PCI: Setting latency timer of device 0000:00:02.2 to 64 ehci_hcd 0000:00:02.2: irq 21, pci mem e0848000 ehci_hcd 0000:00:02.2: new USB bus registered, assigned bus number 1 PCI: cache line size of 64 is not supported by device 0000:00:02.2 ehci_hcd 0000:00:02.2: USB 2.0 enabled, EHCI 1.00, driver 2003-Jun-13 hub 1-0:1.0: USB hub found hub 1-0:1.0: 6 ports detected drivers/usb/host/uhci-hcd.c: USB Universal Host Controller Interface driver v2.1 drivers/usb/core/usb.c: registered new driver usblp drivers/usb/class/usblp.c: v0.13: USB Printer Device Class driver Initializing USB Mass Storage driver... drivers/usb/core/usb.c: registered new driver usb-storage USB Mass Storage support registered. 
drivers/usb/core/usb.c: registered new driver hid drivers/usb/input/hid-core.c: v2.0:USB HID core driver mice: PS/2 mouse device common for all mice input: ImExPS/2 Logitech Explorer Mouse on isa0060/serio1 serio: i8042 AUX port at 0x60,0x64 irq 12 input: AT Translated Set 2 keyboard on isa0060/serio0 serio: i8042 KBD port at 0x60,0x64 irq 1 Advanced Linux Sound Architecture Driver Version 0.9.7 (Thu Sep 25 19:16:36 2003 UTC). request_module: failed /sbin/modprobe -- snd-card-0. error = -16 ALSA device list: No soundcards found. NET: Registered protocol family 2 IP: routing cache hash table of 4096 buckets, 32Kbytes TCP: Hash tables configured (established 32768 bind 65536) NET: Registered protocol family 1 NET: Registered protocol family 17 kjournald starting. Commit interval 5 seconds EXT3-fs: mounted filesystem with ordered data mode. VFS: Mounted root (ext3 filesystem) readonly. Mounted devfs on /dev Freeing unused kernel memory: 168k freed ieee1394: Host added: ID:BUS[0-00:1023] GUID[00e018000044dec8] Adding 2008084k swap on /dev/hda5. Priority:-1 extents:1 EXT3 FS on hda6, internal journal i2c_adapter i2c-0: nForce2 SMBus adapter at 0x5000 i2c_adapter i2c-1: nForce2 SMBus adapter at 0x5500 registering 1-002d registering 1-0049 registering 1-0048 kjournald starting. Commit interval 5 seconds EXT3 FS on hda7, internal journal EXT3-fs: mounted filesystem with ordered data mode. kjournald starting. Commit interval 5 seconds EXT3 FS on hda8, internal journal EXT3-fs: mounted filesystem with ordered data mode. PCI: Setting latency timer of device 0000:00:06.0 to 64 intel8x0: clocking to 49371 agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0. agpgart: Putting AGP V2 device at 0000:00:00.0 into 1x mode agpgart: Putting AGP V2 device at 0000:03:00.0 into 1x mode agpgart: Putting AGP V2 device at 0000:03:00.1 into 1x mode [drm] Loading R200 Microcode ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog @ 2003-12-13 3:56 Ross Dickson 2003-12-15 13:16 ` Maciej W. Rozycki 0 siblings, 1 reply; 62+ messages in thread From: Ross Dickson @ 2003-12-13 3:56 UTC (permalink / raw) To: george; +Cc: Maciej W. Rozycki, linux-kernel >Having had cause to try and figure out all this, I vote for the following being > included in the source somewhere... >-g Please consider adding 2c. Alternatively the OUT0 output of the 8254 PIT (IOW the timer source) may be directly connected to the INTIN0 input of the first I/O APIC. which we have found for nforce2 boards. ref: http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-12/2375.html Ross Dickson >bill davidsen wrote: > > In article <Pine.LNX.4.55.0312101421540.31543@jurand.ds.pg.gda.pl>, > > Maciej W. Rozycki <macro@ds2.pg.gda.pl> wrote: > > > > | The I/O APIC NMI watchdog utilizes the property of being transparent to a > > | single IRQ source of a specially reconfigured 8259A PIC (the master one in > > | the IA32 PC architecture). There are more prerequisites that have to be > > | met and all indeed are for a 100% compatible PC as specified by the > > | Intel's Multiprocessor Specification. > > | > > | 1. The INT output of the master 8259A PIC has to be connected to the LINT0 > > | (or LINTIN0; the name varies by implementations) inputs of all local APICs > > | in the system. > > | > > | 2a. The OUT0 output of the 8254 PIT (IOW the timer source) has to be > > | directly connected to the INTIN2 input of the first I/O APIC. > > | > > | 2b. Alternatively the INT output of the master 8259A PIC has to be > > | connected to the INTIN0 input of the first I/O APIC. > > | > > | 3. There must be no glue logic that would change logical properties of the > > | signal between the INT output of the master 8259A PIC and the respective > > | APIC interrupt inputs. 
> > | > > | In practice, assuming the MP IRQ routing information provided by the BIOS has > > | been correct (which is not always the case), prerequisites #1 and #2 have > > | been met so far, but #3 has proved to be occasionally problematic. > > > > In practice many systems seem to take a good bit of guessing and testing. > > I have an old P-II which only works with acpi=force and nmi_watchdog=2, > > for instance. > > > > It would be nice if there were a program which could poke at the > > hardware and suggest options which might work, as in eliminating the > > ones which can be determined not to work. Absent that, trial and error > > rule, unfortunately. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-13 3:56 Ross Dickson @ 2003-12-15 13:16 ` Maciej W. Rozycki 0 siblings, 0 replies; 62+ messages in thread From: Maciej W. Rozycki @ 2003-12-15 13:16 UTC (permalink / raw) To: Ross Dickson; +Cc: george, linux-kernel On Sat, 13 Dec 2003, Ross Dickson wrote: > Please consider adding > > 2c. Alternatively the OUT0 output of the 8254 PIT (IOW the timer source) may be > directly connected to the INTIN0 input of the first I/O APIC. > > which we have found for nforce2 boards. Actually the code can handle routing to any INTIN pins, so the whole text needs to be reworded. It's just that I've got used to INTIN0 and INTIN2 after that many years. ;-) -- + Maciej W. Rozycki, Technical University of Gdansk, Poland + +--------------------------------------------------------------+ + e-mail: macro@ds2.pg.gda.pl, PGP key available + ^ permalink raw reply [flat|nested] 62+ messages in thread
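Editor's sketch: since the timer may be routed to different INTIN pins on different boards, it can help to read back what a given boot actually reported. The Python below is a minimal sketch that picks apart the "..TIMER" lines as they appear in the dmesg attachments earlier in this thread. The function name and the dictionary keys are hypothetical. The "works OK on apic pin0 irq0" line comes from the patched kernel used for those logs, so other kernels may print something different or nothing at all.

```python
import re

# Sample lines copied from the dmesg attachments earlier in this thread.
SAMPLE_DMESG = """\
..TIMER: vector=0x31 pin1=2 pin2=-1
..MP-BIOS bug: 8254 timer not connected to IO-APIC
..TIMER: Is timer irq0 connected to IOAPIC Pin0? ...
..TIMER: works OK on apic pin0 irq0
"""


def summarize_timer_routing(dmesg_text):
    """Collect what the boot log says about IRQ0 (timer) routing.

    The key names are illustrative only; they are not kernel terminology.
    """
    summary = {"pin1": None, "pin2": None, "bios_bug": False, "working_pin": None}
    for line in dmesg_text.splitlines():
        # Pins the kernel looked up for IRQ0 (-1 means "none found").
        m = re.search(r"\.\.TIMER: vector=0x[0-9a-fA-F]+ pin1=(-?\d+) pin2=(-?\d+)", line)
        if m:
            summary["pin1"] = int(m.group(1))
            summary["pin2"] = int(m.group(2))
            continue
        # The complaint Jesse quoted in the first message of the thread.
        if "8254 timer not connected to IO-APIC" in line:
            summary["bios_bug"] = True
            continue
        # Message printed by the patched kernel when the pin0 fallback works.
        m = re.search(r"\.\.TIMER: works OK on apic pin(\d+) irq0", line)
        if m:
            summary["working_pin"] = int(m.group(1))
    return summary


if __name__ == "__main__":
    print(summarize_timer_routing(SAMPLE_DMESG))
```

Run against the sample lines, this reports pin1=2, pin2=-1, the MP-BIOS complaint, and pin 0 as the pin that finally worked, which matches the INTIN0 routing Ross describes for nforce2 boards.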
* Re: Catching NForce2 lockup with NMI watchdog @ 2003-12-17 18:14 Ross Dickson 2003-12-17 21:41 ` George Anzinger ` (2 more replies) 0 siblings, 3 replies; 62+ messages in thread From: Ross Dickson @ 2003-12-17 18:14 UTC (permalink / raw) To: Maciej W. Rozycki, george; +Cc: linux-kernel >On Tue, 16 Dec 2003, George Anzinger wrote: >> How confusing :( Could you give me some idea how this works? I have tried > > disable_irq(0) and, as best as I can tell, it does not do the trick. The > > confusion I have is understanding where in the chain of hardware each of these > > things is taking place. Here is where to find Intel's MP arch spec Maciej mentions. I had to find it recently wrt nforce2 issues http://www.intel.com/design/pentium/datashts/24201606.pdf Section 3.6.1 Apic Architecture is relevant, particularly Section 3.6.2.2 Virtual Wire Mode <snip> great diagram! <snip> > If the above variant does not work, as a last resort, the path for the > 8254 timer interrupt is via the 8259 reconfigured back into its usual mode > and then LINT0 of the BSP reconfigured for an ExtINTA APIC interrupt. > Additionally, since at this point the glue logic has probably already > locked up due to the messing done above, a few artificial sets of double > INTA cycles are sent to the system bus using the RTC chip and INTIN8 > reconfigured temporarily to send ExtINTA APIC interrupts via the > inter-APIC bus. > I do hope a thorough read of the description will make the available > variants clear. The I/O APIC input numbers may differ but so far they are > almost always as noted above. > Maciej All good. I would like to add a footnote to highlight a potential gotcha as I understand it. To clarify, the xt pic 8259A does not in itself have a transparent mode as would a logic buffer or inverter. It always needs inta cycles to function. In PIC mode it is wired to processor pins as per old 8086 and the original cpu architecture provides the inta cycles to it (bypasses apic, apic seems off). 
In virtual wire mode the 8259A output is wired either to a local apic pin on the cpu or through the io-apic. In this mode it is the local apic which has to provide the inta cycles on the bus back to the 8259A for it to function correctly. The delivery mode has to be set to ExtInt for the register associated with the pin that the 8259A output (int on Maciej's diagram) is connected to. This is the only way to force the apic to deliver the inta cycles to the 8259A and that is how it appears transparent to the system. The spec says there can only be one source register (local apic) or redirection register (ioapic) of mode ExtInt per system, regardless of how many local apic and io-apic pins it (int on Maciej's diagram) is connected to. Gotcha: If none are set to ExtInt then the 8259A will hang for lack of inta cycles. Section 7.5.11 of 24319202.pdf covers it; the manual is available here: http://www.intel.com/design/pentiumii/manuals/243192.htm Why only one ExtInt source in virtual wire mode?: The 8259A in X86 architecture systems needs two inta cycles per interrupt event. Do not confuse them with the EOI, which is software; the inta is purely hardware. It only works properly with one source causing inta cycles. The docs I have do not say what happens with more than one source. How the 8259A works in a nutshell (it is more complex in cascade mode). First the 8259A gets a request from H/ware and, if unmasked etc, generates its int (int on Maciej's diagram) out. The 8259A then sits there waiting for inta from the cpu (PIC mode) or local apic (virtual wire mode). When the inta arrives the 8259A latches its internal ISR bit and waits for the second inta. When the second inta arrives it outputs a vector onto the data bus indicating which ISR bit was set. If the request from H/ware is still active when the first inta arrives then we get the correct vector number. 
If it is NOT still asserted then it's tough luck: the vector we get is 7 for the first 8259A or 15 for the second 8259A, and it is too late to try and find out where the real source was, hence the spurious irq7 messages and the corresponding irq 7 count increase. It is pretty bad when the apic system that is handling the 8259A in virtual wire mode cannot get the inta to the 8259A in time, while the requesting hardware is still asserting its int request, but it happens. I certainly agree with Maciej's comments that mixed mode of having the 8254 PIT routed via the 8259A was never meant to occur alongside ioapic handling of the other interrupts. It is very problematic, not to mention confusing. I do not know how smoothly the apic handles the 8259A if you turn that source on and off frequently. Regards Ross Dickson ^ permalink raw reply [flat|nested] 62+ messages in thread
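The two-INTA handshake described in the message above can be sketched as a toy model. This is only a minimal illustration under stated assumptions, not a register-accurate 8259A emulator: the class name `Pic8259A`, the `vector_base` parameter and the `deliver` helper are all hypothetical, masking and cascade mode are left out, and IR0 is simply treated as the highest priority input.

```python
class Pic8259A:
    """Toy model of a single 8259A answering two INTA cycles (hypothetical names)."""

    def __init__(self, vector_base=0):
        self.vector_base = vector_base  # assumption: vector = base + IR number
        self.ir = [False] * 8           # request inputs IR0..IR7 as driven by hardware
        self.isr = [False] * 8          # in-service bits
        self._selected = None           # IR number chosen at the first INTA

    def int_output(self):
        # INT is asserted while any request input is active (masking not modelled).
        return any(self.ir)

    def first_inta(self):
        # The highest priority request still active is put in service.
        for number, active in enumerate(self.ir):
            if active:
                self.isr[number] = True
                self._selected = number
                return
        # The request went away before the first INTA: level 7 is reported
        # and no in-service bit is set (the "spurious irq7" case).
        self._selected = 7

    def second_inta(self):
        # The vector is driven onto the data bus on the second INTA.
        number, self._selected = self._selected, None
        return self.vector_base + number


def deliver(pic, request_drops_early):
    """Raise IR0, optionally drop it before the first INTA, return the vector."""
    pic.ir[0] = True
    assert pic.int_output()
    if request_drops_early:
        pic.ir[0] = False
    pic.first_inta()
    return pic.second_inta()


if __name__ == "__main__":
    print(deliver(Pic8259A(), request_drops_early=False))              # timely INTA
    print(deliver(Pic8259A(), request_drops_early=True))               # late INTA, master
    print(deliver(Pic8259A(vector_base=8), request_drops_early=True))  # late INTA, slave
```

With a timely INTA the model returns the vector of the real source; when the request has already gone away it returns level 7 (or 15 with a base of 8, standing in for the second 8259A) and leaves every in-service bit clear, which is why the real source can no longer be identified.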
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-17 18:14 Ross Dickson @ 2003-12-17 21:41 ` George Anzinger 2003-12-17 21:48 ` George Anzinger 2003-12-18 14:04 ` Maciej W. Rozycki 2 siblings, 0 replies; 62+ messages in thread From: George Anzinger @ 2003-12-17 21:41 UTC (permalink / raw) To: ross, Maciej W. Rozycki; +Cc: linux-kernel I really want to thank you both for all this information. It will make the code and comments much easier to understand. Thanks again George Ross Dickson wrote: >>On Tue, 16 Dec 2003, George Anzinger wrote: > > > > > >>>How confusing :( Could you give me some idea how this works? I have tried >>>disable_irq(0) and, as best as I can tell, it does not do the trick. The >>>confusion I have is understanding where in the chain of hardware each of these >>>thing is taking place. > > > Here is where to find Intel's MP arch spec Maceij mentions. > I had to find it recently wrt nforce2 issues > > http://www.intel.com/design/pentium/datashts/24201606.pdf > > Section 3.6.1 Apic Architecture is relevant > particularly > Section 3.6.2.2 Virtual Wire Mode > > <snip> > great diagram! > > <snip> > >>If the above variant does not work, as a last resort, the path for the >>8254 timer interrupt is via the 8259 reconfigured back into its usual mode >>and then LINT0 of the BSP reconfigured for an ExtINTA APIC interrupt. >>Additionally, since at this point the glue logic has probably already >>locked up due to the messing done above, a few artiffical sets of double >>INTA cycles are sent to the system bus using the RTC chip and INTIN8 >>reconfigured temporarily to send ExtINTA APIC interrupts via the >>inter-APIC bus. > > > >>I do hope a thorough read of the description will make the available >>variants clear. The I/O APIC input numbers may differ but so far they are >>almost always as noted above. > > > >> Maciej > > > All good. > > I would like to add a footnote to highlight a potential gotcha as I understand it. 
> > To clarify, the xt pic 8259A does not in itself have a transparent mode as would > a logic buffer or inverter. It always needs inta cycles to function. In PIC mode > it is wired to processor pins as per old 8086 and original cpu architecture > provides the inta cycles to it (bypasses apic, apic seems off). > > In virtual wire mode with the 8259A output wired either to a local apic pin on cpu > or through the io-apic. In this mode it is the local apic which has to provide the > inta cycles on the bus back to the 8259A for it to function correctly. > > The delivery mode has to be set to ExtInt for the register associated with the pin > that the 8259A output (int on Maceij diagram) is connected to. This is the only > way to force the apic to deliver the inta cycles to the 8259A and that is how it > appears transparent to the system. Spec says there can only be one source > register (local apic) or redirection register (ioapic) of mode ExtInt per system > regardless of how many local apic and io-apic pins it (int on Maceij diagram) > is connected to. > > Gotcha: If none are set to ExtInt then the 8259A will hang for lack of IntA > cycles. > > Section 7.5.11 covers it > 24319202.pdf available here > > http://www.intel.com/design/pentiumii/manuals/243192.htm > > Why only one Extint source in virtual wire mode?: > > The 8259A in X86 architecture systems needs two inta cycles per interrupt event. > Do not confuse them with the EOI which is software, the inta is purely hardware. > It only works properly with one source causing inta cycles. Docs I have do not > say what happens with more than one source. > > How 8259A works in a nutshell (it is more complex in cascade mode). > > First the 8259A gets a request from H/ware and if unmasked etc generates its int > (int on Maceij diagram) out. 8259A then sits there waiting for Inta from cpu > (PIC mode) or local apic (Virtual wire mode). When the inta arrives the 8259A > latches its internal ISR bit and waits for second inta. 
When second inta arrives > it outputs a vector onto the data bus indicating which ISR bit was set. > > If the request from H/ware is still active when the first inta arrives then we get > the correct vector number. > > If it is NOT still exerted then its tough luck and the vector we get 7 for the first > 8259A or 15 for the second 8259A and it is too late to try and find out where > the real source was, hence the spurious irq7 messages and corresponding > irq 7 count increase. > > It is pretty bad when the apic system that is handling the 8259A in virtual > wire mode cannot get the inta to the 8259A in time while the int request > hardware is still exerting but it happens. > > I certainly agree with Marceij's comments that mixed mode of having 8254 PIT > routed via the 8259A was never meant to occur alongside ioapic handling of > the other interrupts. It is very problematic not to mention confusing. > > I do not know how smoothly the apic handles the 8259A if you would be turning > that source on and off frequently. > > Regards > Ross Dickson > > > > -- George Anzinger george@mvista.com High-res-timers: http://sourceforge.net/projects/high-res-timers/ Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml ^ permalink raw reply [flat|nested] 62+ messages in thread
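The "exactly one ExtInt source" rule quoted above lends itself to a small consistency check. The sketch below assumes the usual layout of the low 32 bits of a local APIC LVT entry or an I/O APIC redirection entry (delivery mode in bits 8-10, with 111b meaning ExtINT, and the mask bit at bit 16). The entry names and register values in the example are made up for illustration; how the values would be read from real hardware is deliberately left out.

```python
DELIVERY_MODE_SHIFT = 8
DELIVERY_MODE_MASK = 0x7
DELIVERY_EXTINT = 0b111   # ExtINT delivery mode
MASKED_BIT = 1 << 16      # entry is masked when this bit is set


def delivery_mode(entry_low):
    """Extract the delivery mode field from the low 32 bits of an entry."""
    return (entry_low >> DELIVERY_MODE_SHIFT) & DELIVERY_MODE_MASK


def active_extint(entries):
    """Return the names of unmasked entries programmed for ExtINT delivery.

    `entries` maps a descriptive name (hypothetical) to the low 32 bits of
    an LVT or redirection table entry.
    """
    return [name for name, low in entries.items()
            if delivery_mode(low) == DELIVERY_EXTINT and not (low & MASKED_BIT)]


def check_extint(entries):
    """Summarise whether the 8259A would see INTA cycles from exactly one source."""
    active = active_extint(entries)
    if not active:
        return "no ExtINT source: the 8259A gets no INTA cycles"
    if len(active) > 1:
        return "more than one ExtINT source: " + ", ".join(active)
    return "single ExtINT source: " + active[0]


if __name__ == "__main__":
    example = {
        "BSP LINT0": 0x00000700,        # ExtINT, unmasked
        "IOAPIC0 INTIN0": 0x00010700,   # ExtINT, but masked
        "IOAPIC0 INTIN2": 0x00000031,   # Fixed delivery, vector 0x31
    }
    print(check_extint(example))
```

Masked entries are ignored on purpose, since a masked ExtINT entry does not generate INTA cycles.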
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-17 18:14 Ross Dickson 2003-12-17 21:41 ` George Anzinger @ 2003-12-17 21:48 ` George Anzinger 2003-12-18 1:30 ` Ross Dickson 2003-12-18 14:04 ` Maciej W. Rozycki 2 siblings, 1 reply; 62+ messages in thread From: George Anzinger @ 2003-12-17 21:48 UTC (permalink / raw) To: ross; +Cc: Maciej W. Rozycki, linux-kernel Ross Dickson wrote: > > Section 7.5.11 covers it > 24319202.pdf available here I wonder if you might know the difference between the 243190/1/2 and the 245470/1/2 manuals. I have hard copies of the latter. -- George Anzinger george@mvista.com High-res-timers: http://sourceforge.net/projects/high-res-timers/ Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-17 21:48 ` George Anzinger @ 2003-12-18 1:30 ` Ross Dickson 2003-12-18 14:32 ` Maciej W. Rozycki 0 siblings, 1 reply; 62+ messages in thread From: Ross Dickson @ 2003-12-18 1:30 UTC (permalink / raw) To: George Anzinger; +Cc: Maciej W. Rozycki, linux-kernel On Thursday 18 December 2003 07:48, George Anzinger wrote: > Ross Dickson wrote: > > > > Section 7.5.11 covers it > > 24319202.pdf available here > > I wonder if you might know the difference between the 243190/1/2 and the > 245470/1/2 manuals. I have hard copies of the latter. > I grabbed the manuals that a Google search found. By the look of it, what I had covered P3 and earlier. Yours are more up to date and cover P4. I have since found them on the web: http://www.intel.com/design/pentium4/manuals/245470.htm Regards Ross. > > > -- > George Anzinger george@mvista.com > High-res-timers: http://sourceforge.net/projects/high-res-timers/ > Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml > > > > ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-18 1:30 ` Ross Dickson @ 2003-12-18 14:32 ` Maciej W. Rozycki 2003-12-19 4:17 ` Ross Dickson 0 siblings, 1 reply; 62+ messages in thread From: Maciej W. Rozycki @ 2003-12-18 14:32 UTC (permalink / raw) To: Ross Dickson; +Cc: George Anzinger, linux-kernel On Thu, 18 Dec 2003, Ross Dickson wrote: > I grabbed the manuals that google search found. By the look of it what I had > covered P3 and earlier. Yours are more up to date and cover P4. Newer manuals sometimes lack details that are present in older ones. If you want to have a thorough view of the APIC, you certainly want to have all four variations of processor manuals, i.e. the one for P4+, the one for PII+, the one for PPro and the one for Pentium. Plus manuals for the I/O APIC, e.g. the one for the i82093AA and perhaps for ones embedded into various chipsets. All of them are or used to be available online. If you want to go back to the i82489DX, there is a datasheet and a programming manual for the part, which are IMO the most exhaustive descriptions, though the implementation differed a bit from newer ones (the chip was so far the most powerful implementation of the APIC). These were unfortunately never available online. -- + Maciej W. Rozycki, Technical University of Gdansk, Poland + +--------------------------------------------------------------+ + e-mail: macro@ds2.pg.gda.pl, PGP key available + ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-18 14:32 ` Maciej W. Rozycki @ 2003-12-19 4:17 ` Ross Dickson 2003-12-19 15:35 ` Maciej W. Rozycki 0 siblings, 1 reply; 62+ messages in thread From: Ross Dickson @ 2003-12-19 4:17 UTC (permalink / raw) To: Maciej W. Rozycki; +Cc: George Anzinger, linux-kernel On Friday 19 December 2003 00:32, Maciej W. Rozycki wrote: > On Thu, 18 Dec 2003, Ross Dickson wrote: > > > I grabbed the manuals that google search found. By the look of it what I had > > covered P3 and earlier. Yours are more up to date and cover P4. > > Newer manuals sometimes lack details that are present in older ones. If > you want to have a thorough view of the APIC, you certainly want to have > all four variations of processor manuals, i.e. the one for P4+, the one > for PII+, the one for PPro and the one for Pentium. Plus manuals for the > I/O APIC, e.g. the one for the i82093AA and perhaps for ones embedded into > various chipsets. All of them are or used to be available online. If you > want to go back to the i82489DX, there is a datasheet and a programming > manual for the part, which are IMO the most exhaustive descriptions, > though the implementation differed a bit from newer ones (the chip was so > far the most powerful implementation of the APIC). These were > unfortunately never available online. > Point taken, I generally play embedded MPU where the codebase matches the specific hardware version and one set of docs suffice, although it is not uncommon to rediscover an unpublished bug. This one codebase for all hardware certainly is a lot more work! Do you know if the Athlon apic programming docs are available or under NDA? I do not even want to ask about the nvidia nforce2 chipset. Regards Ross. > -- > + Maciej W. 
Rozycki, Technical University of Gdansk, Poland + > +--------------------------------------------------------------+ > + e-mail: macro@ds2.pg.gda.pl, PGP key available + > > > ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-19 4:17 ` Ross Dickson @ 2003-12-19 15:35 ` Maciej W. Rozycki 0 siblings, 0 replies; 62+ messages in thread From: Maciej W. Rozycki @ 2003-12-19 15:35 UTC (permalink / raw) To: Ross Dickson; +Cc: George Anzinger, linux-kernel On Fri, 19 Dec 2003, Ross Dickson wrote: > Do you know if the Athlon apic programming docs are available or under NDA? No idea -- I've been only loosely interested in IA-32-based hardware recently. -- + Maciej W. Rozycki, Technical University of Gdansk, Poland + +--------------------------------------------------------------+ + e-mail: macro@ds2.pg.gda.pl, PGP key available + ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-17 18:14 Ross Dickson 2003-12-17 21:41 ` George Anzinger 2003-12-17 21:48 ` George Anzinger @ 2003-12-18 14:04 ` Maciej W. Rozycki 2003-12-18 14:22 ` Craig Bradney 2003-12-19 4:06 ` Ross Dickson 2 siblings, 2 replies; 62+ messages in thread From: Maciej W. Rozycki @ 2003-12-18 14:04 UTC (permalink / raw) To: Ross Dickson; +Cc: george, linux-kernel On Thu, 18 Dec 2003, Ross Dickson wrote: > Here is where to find Intel's MP arch spec Maceij mentions. > I had to find it recently wrt nforce2 issues > > http://www.intel.com/design/pentium/datashts/24201606.pdf > > Section 3.6.1 Apic Architecture is relevant > particularly > Section 3.6.2.2 Virtual Wire Mode BTW, I have revision 1.1 as well in case anyone was interested in the differences. > I would like to add a footnote to highlight a potential gotcha as I > understand it. > > To clarify, the xt pic 8259A does not in itself have a transparent mode > as would a logic buffer or inverter. It always needs inta cycles to > function. In PIC mode it is wired to processor pins as per old 8086 and > original cpu architecture provides the inta cycles to it (bypasses apic, > apic seems off). It does have such a mode. ;-) You just have not to ack a pending interrupt -- if a request goes away, the INT output gets deasserted as well. We are super cautious though and we reprogram the 8259A into the AEOI mode to prevent a lockup in case INTA cycles escape to the 8259A (which is theoretically possible for a broken design of an i82489DX-based system). See the 8259A datasheet for details. > I certainly agree with Marceij's comments that mixed mode of having 8254 PIT > routed via the 8259A was never meant to occur alongside ioapic handling of > the other interrupts. It is very problematic not to mention confusing. Well, the true "mixed mode", i.e. 
where certain interrupts are delivered as I/O APIC (either LoPri or Fixed) interrupts and others are routed through an 8259A controller and delivered as ExtINTA interrupts, was specifically designed to work since the i82489DX APIC. What wasn't designed but works by the properties of the 8259A PIC is the transparent "through-8259A" mode. Maciej -- + Maciej W. Rozycki, Technical University of Gdansk, Poland + +--------------------------------------------------------------+ + e-mail: macro@ds2.pg.gda.pl, PGP key available + ^ permalink raw reply [flat|nested] 62+ messages in thread
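The "transparent" behaviour described in the message above (no INTA is ever delivered, so the INT output simply follows the request) can be illustrated with a very small sketch. It assumes only what the message states, namely that INT is deasserted when an un-acked request goes away. The function names and the choice of leaving only IR0 unmasked are illustrative assumptions, not a description of the kernel code.

```python
def int_output(ir_levels, masked):
    """INT level of an 8259A that never receives INTA: high while any unmasked IR is high."""
    return any(level and not is_masked
               for level, is_masked in zip(ir_levels, masked))


def trace_through(ir0_samples):
    """Feed a sequence of IR0 levels through the model and record the INT output."""
    masked = [False] + [True] * 7          # assumption: only IR0 (the 8254 timer) is unmasked
    output = []
    for level in ir0_samples:
        ir_levels = [level] + [False] * 7  # the other inputs stay idle
        output.append(int_output(ir_levels, masked))
    return output


if __name__ == "__main__":
    samples = [False, True, True, False, True]
    print(trace_through(samples) == samples)   # INT mirrors IR0 sample by sample
```

Seen from the APIC input, the PIC then behaves like a plain wire for that one source, which is the sense in which it is "transparent".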
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-18 14:04 ` Maciej W. Rozycki @ 2003-12-18 14:22 ` Craig Bradney 2003-12-19 5:38 ` Ross Dickson 2003-12-19 4:06 ` Ross Dickson 1 sibling, 1 reply; 62+ messages in thread From: Craig Bradney @ 2003-12-18 14:22 UTC (permalink / raw) To: Maciej W. Rozycki; +Cc: Ross Dickson, george, linux-kernel Just as an FYI, still going strong here with the old api and ioapic patches. 5d 20h now. When the official 2.6.0 comes to Gentoo Linux I can try that with whatever patches people are finding stable for these nforce fixes. Anyone had any luck in talking to ASUS re a BIOS update? Craig On Thu, 2003-12-18 at 15:04, Maciej W. Rozycki wrote: > On Thu, 18 Dec 2003, Ross Dickson wrote: > > > Here is where to find Intel's MP arch spec Maceij mentions. > > I had to find it recently wrt nforce2 issues > > > > http://www.intel.com/design/pentium/datashts/24201606.pdf > > > > Section 3.6.1 Apic Architecture is relevant > > particularly > > Section 3.6.2.2 Virtual Wire Mode > > BTW, I have revision 1.1 as well in case anyone was interested in the > differences. > > > I would like to add a footnote to highlight a potential gotcha as I > > understand it. > > > > To clarify, the xt pic 8259A does not in itself have a transparent mode > > as would a logic buffer or inverter. It always needs inta cycles to > > function. In PIC mode it is wired to processor pins as per old 8086 and > > original cpu architecture provides the inta cycles to it (bypasses apic, > > apic seems off). > > It does have such a mode. ;-) You just have not to ack a pending > interrupt -- if a request goes away, the INT output gets deasserted as > well. We are super cautious though and we reprogram the 8259A into the > AEOI mode to prevent a lockup in case INTA cycles escape to the 8259A > (which is theoretically possible for a broken design of an i82489DX-based > system). See the 8259A datasheet for details. 
> > > I certainly agree with Marceij's comments that mixed mode of having 8254 PIT > > routed via the 8259A was never meant to occur alongside ioapic handling of > > the other interrupts. It is very problematic not to mention confusing. > > Well, the true "mixed mode", i.e. where certain interrupts are delivered > as I/O APIC (either LoPri or Fixed) interrupts and others are routed > through an 8259A controller and delivered as ExtINTA interrupts was > specifically designed to work since the i8248DX APIC. What wasn't > designed but works by the properties of the 8259A PIC is the transparent > "through-8259A" mode. > > Maciej ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-18 14:22 ` Craig Bradney @ 2003-12-19 5:38 ` Ross Dickson 2003-12-19 10:36 ` Craig Bradney 0 siblings, 1 reply; 62+ messages in thread From: Ross Dickson @ 2003-12-19 5:38 UTC (permalink / raw) To: Craig Bradney, Maciej W. Rozycki; +Cc: george, linux-kernel On Friday 19 December 2003 00:22, Craig Bradney wrote: > Just as an FYI, still going strong here with the old api and ioapic > patches. 5d 20h now. > > When the official 2.6.0 comes to Gentoo Linux I can try that with > whatever patches people are finding stable for these nforce fixes. > > Anyone had any luck in talking to ASUS re a BIOS update? > > Craig > I have not talked to ASUS. I note from people's postings that with the latest Award BIOS we may need no apic patches (C1 disconnect auto), just an ioapic one to work round a buggy BIOS. I don't think you can run nmi_watchdog=1 with the old io-apic (not of my doing) patch. I have Phoenix BIOS motherboards from Albatron and Epox, so the Award BIOS doesn't help me. No disconnect options are available in setup. My apic ack delay patch lets the BIOS have its disconnect on and keeps the cpu a few degrees cooler, besides whatever else it and the nforce2 chipset might want to control it for. I have been advised my query wrt my apic ack delay patch is progressing with AMD but I have nothing technical to report on it. I have made and am trialling, but have not yet posted, a kernel-arg-controlled version combining my v1 and v2 apic ack delay patches. This would be better than what I have released in the past because people can fix BIOSes as the fixes become available and use the timer ack delay in the meantime. Of course there is still athcool and the earlier disconnect patch to force things if desired. Regards Ross. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-19 5:38 ` Ross Dickson @ 2003-12-19 10:36 ` Craig Bradney 0 siblings, 0 replies; 62+ messages in thread From: Craig Bradney @ 2003-12-19 10:36 UTC (permalink / raw) To: ross; +Cc: Maciej W. Rozycki, george, linux-kernel On Fri, 2003-12-19 at 06:38, Ross Dickson wrote: > On Friday 19 December 2003 00:22, Craig Bradney wrote: > > Just as an FYI, still going strong here with the old api and ioapic > > patches. 5d 20h now. > > > > When the official 2.6.0 comes to Gentoo Linux I can try that with > > whatever patches people are finding stable for these nforce fixes. > > > > Anyone had any luck in talking to ASUS re a BIOS update? > > > > Craig > > > > I have not talked to ASUS. I note from peoples postings that with the > latest award bios we may need no apic patches (C1 disconnect auto), > just an ioapic one to work round a buggy bios. I don't think you can run > nmi_watchdog=1 with the old io-apic (not of my doing) patch. > > I have pheonix bios MOBOS from albatron and epox so award bios doesn't help me. > No disconnect options available in setup. > My apic ack delay patch lets the bios have its disconnect on and keep the cpu a > few degrees cooler besides whatever else it and the nforce2 chipset might want > to control it for. > > I have been advised my query wrt my apic ack delay patch is progressing > with AMD but I have nothing technical to report on it. > > I have made and am trialling, but have not yet posted a kernel arg controlled > version combining my v1 and v2 apic ack delay patches. This would be better > than what I have released in the past because people can fix bioses as the > fixes become available and use timer ack delay in the mean time. > Of course there is still athcool and the earlier disconnect patch to force > things if desired. > > Regards > Ross. Ok Ross. Well, Gentoo's 2.6 is out now so whenever you want me to test your new patch I can try it. 
I've been looking back through the list for the updated patches but things seem to have changed here and there even for the v2 patches, so I think I'll wait for the next round of patches, as things seem a little confusing. 2.6test11 is still running happily.. 6d15h now. Craig ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-18 14:04 ` Maciej W. Rozycki 2003-12-18 14:22 ` Craig Bradney @ 2003-12-19 4:06 ` Ross Dickson 2003-12-19 15:33 ` Maciej W. Rozycki 1 sibling, 1 reply; 62+ messages in thread From: Ross Dickson @ 2003-12-19 4:06 UTC (permalink / raw) To: Maciej W. Rozycki; +Cc: george, linux-kernel On Friday 19 December 2003 00:04, Maciej W. Rozycki wrote: > On Thu, 18 Dec 2003, Ross Dickson wrote: > > > Here is where to find Intel's MP arch spec Maceij mentions. > > I had to find it recently wrt nforce2 issues > > > > http://www.intel.com/design/pentium/datashts/24201606.pdf > > > > Section 3.6.1 Apic Architecture is relevant > > particularly > > Section 3.6.2.2 Virtual Wire Mode > > BTW, I have revision 1.1 as well in case anyone was interested in the > differences. Yes please if it is forwardable or downloadable. > > > I would like to add a footnote to highlight a potential gotcha as I > > understand it. > > > > To clarify, the xt pic 8259A does not in itself have a transparent mode > > as would a logic buffer or inverter. It always needs inta cycles to > > function. In PIC mode it is wired to processor pins as per old 8086 and > > original cpu architecture provides the inta cycles to it (bypasses apic, > > apic seems off). > > It does have such a mode. ;-) You just have not to ack a pending > interrupt -- if a request goes away, the INT output gets deasserted as > well. We are super cautious though and we reprogram the 8259A into the > AEOI mode to prevent a lockup in case INTA cycles escape to the 8259A > (which is theoretically possible for a broken design of an i82489DX-based > system). See the 8259A datasheet for details. > I believe what you have written because you say it is how the code works. I take it you mean that the INT is either never latched? or only latched with IS bit after receipt of first INTA? 
It is not obvious in the 8259A datasheet published in the Intel Microsystem Components Handbook Volume 1 1983, nor in the datasheet of December 1988. I note the data sheet is almost silent on the topic of INT behaviour without an INTA cycle. Almost, as the WAVEFORMS diagrams which have INT displayed only show it in conjunction with the INTA, and under said diagram I read "NOTES: Interrupt output must remain HIGH at least until leading edge of first INTA.", implying it can go low for some reason? And the note "1. Cycle 1 in iAPX 86, ...." only indicates that its trailing edge is synchronous with the machine cycle. Figure 10 in the data sheet does not help either, as when the IR goes low it has a LATCH* ARMED notation which I took to mean the INT output was then latched. I now think it was referring to the transparent D type latch "REQUEST LATCH" in the priority cell diagram but I cannot see a footnote to the *. Could you please point me to the document where it is made clear? It may be in the i82489DX docs, as I do not have them, or in a later 8259A data sheet revision? Thanks, Ross. > > I certainly agree with Maciej's comments that mixed mode of having 8254 PIT > > routed via the 8259A was never meant to occur alongside ioapic handling of > > the other interrupts. It is very problematic not to mention confusing. > > Well, the true "mixed mode", i.e. where certain interrupts are delivered > as I/O APIC (either LoPri or Fixed) interrupts and others are routed > through an 8259A controller and delivered as ExtINTA interrupts was > specifically designed to work since the i82489DX APIC. What wasn't > designed but works by the properties of the 8259A PIC is the transparent > "through-8259A" mode. Clarified thanks. > > Maciej > > -- > + Maciej W. Rozycki, Technical University of Gdansk, Poland + > +--------------------------------------------------------------+ > + e-mail: macro@ds2.pg.gda.pl, PGP key available + > > > ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: Catching NForce2 lockup with NMI watchdog 2003-12-19 4:06 ` Ross Dickson @ 2003-12-19 15:33 ` Maciej W. Rozycki 0 siblings, 0 replies; 62+ messages in thread From: Maciej W. Rozycki @ 2003-12-19 15:33 UTC (permalink / raw) To: Ross Dickson; +Cc: george, linux-kernel On Fri, 19 Dec 2003, Ross Dickson wrote: > > It does have such a mode. ;-) You just have not to ack a pending > > interrupt -- if a request goes away, the INT output gets deasserted as > > well. We are super cautious though and we reprogram the 8259A into the > > AEOI mode to prevent a lockup in case INTA cycles escape to the 8259A > > (which is theoretically possible for a broken design of an i82489DX-based > > system). See the 8259A datasheet for details. > > I believe what you have written because you say it is how the code works. Well, since I'm actually the author of the relevant bits (though Ingo did some clean-ups before applying them), I must have been completely sure the assumptions are valid. > I take it you mean that the INT is either never latched? or only latched with IS bit > after receipt of first INTA? Yes, one of these conditions is true, although I've never bothered to investigate exactly which one. ;-) > It is not obvious in 8259A Datasheet published in Intel Microsystem Components > Handbook Volume 1 1983 nor in datasheet December 1988. Yep, the datasheet is indeed not that clear on the matter. The latest version (version 3, dated Nov 1988) used to be available at Intel's FTP site, but I can't find it anymore. The 8259A core is documented in many other datasheets, perhaps more clearly -- e.g. I've found at least one Intel datasheet providing an unambiguous explanation of how the SFNM mode works. I have pretty much always known of the volatile property of the INT output and it can be quite easily verified with hardware. Given this property some people find the way Intel defines edge-triggered interrupts quite surprising. 
> Could you please point me to the document where it is made clear? It may be > in the i82489DX docs as I do not have them or in a later 8259A data sheet > revision? Well, there is actually a hint on how this "transparent" property of the 8259A PIC can be used for delivery of EISA chaining interrupts as APIC interrupts in the i82489DX datasheet. The problem with these interrupts appears with the 82357 ISP EISA component that has a pair of 8259A PICs embedded and does not provide the interrupt line externally, only wiring it to IRQ 13 (IR5 of the slave 8259A -- so both 8259A cores have to be treated as "transparent"!) internally. The same problem exists with the 8254 interrupt in this chip, but the datasheet disregards it, assuming the local APIC timer will be used for periodic interrupts exclusively. Linux would use IRQ 0 in the "transparent" 8259A mode with this chip and if that failed (which would be quite possible, since an ISP erratum required glue logic in the 8259A path when used with an APIC and Intel's suggestion wasn't the most fortunate) the mixed mode with ExtINTA interrupts would be configured. Of course the mixed mode would also permit simultaneous use of IRQ 0 and IRQ 13 with ISP -- while the "transparent" 8259A mode can support only a single interrupt source. Note the interesting internal inconsistency of the document -- implementation of the erratum workaround as proposed by Intel would make the suggested "transparent" 8259A mode inoperable. ;-) Maciej -- + Maciej W. Rozycki, Technical University of Gdansk, Poland + +--------------------------------------------------------------+ + e-mail: macro@ds2.pg.gda.pl, PGP key available + ^ permalink raw reply [flat|nested] 62+ messages in thread