* 2.4.21-pre4: tg3 driver problems with shared interrupts
@ 2003-02-02 15:18 Stephan von Krawczynski
  2003-02-02 16:49 ` Jeff Garzik
  0 siblings, 1 reply; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-02 15:18 UTC
To: linux-kernel; +Cc: davem, jgarzik

Hello Dave, Jeff, all

I just started experiments with a new setup consisting of:

00:00.0 Host bridge: ServerWorks CNB20HE Host Bridge (rev 23)
00:00.1 Host bridge: ServerWorks CNB20HE Host Bridge (rev 01)
00:00.2 Host bridge: ServerWorks: Unknown device 0006 (rev 01)
00:00.3 Host bridge: ServerWorks: Unknown device 0006 (rev 01)
00:02.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 0d)
00:03.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 0d)
00:04.0 Network controller: Elsa AG QuickStep 1000 (rev 01)
00:05.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 07)
00:05.1 Input device controller: Creative Labs SB Live! MIDI/Game Port (rev 07)
00:07.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
00:0f.0 ISA bridge: ServerWorks CSB5 South Bridge (rev 93)
00:0f.1 IDE interface: ServerWorks CSB5 IDE Controller (rev 93)
00:0f.2 USB Controller: ServerWorks OSB4/CSB5 USB Controller (rev 05)
00:0f.3 Host bridge: ServerWorks: Unknown device 0225
01:02.0 Unknown mass storage controller: Promise Technology, Inc. 20268 (rev 01)
01:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5701 Gigabit Ethernet (rev 15)
02:02.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5701 Gigabit Ethernet (rev 15)
02:03.0 SCSI storage controller: Adaptec AIC-7899P U160/m (rev 01)
02:03.1 SCSI storage controller: Adaptec AIC-7899P U160/m (rev 01)

I found out within minutes that this setup does not survive if you let the
Broadcom cards share interrupts with anything else.
It works ok now like this (eth2 is tg3):

           CPU0
  0:     343269          XT-PIC  timer
  1:       6804          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:      37952          XT-PIC  EMU10K1
  7:        515          XT-PIC  HiSax
  9:     711212          XT-PIC  aic7xxx, aic7xxx, eth0, eth1
 10:    4710570          XT-PIC  eth2
 11:     639316          XT-PIC  ide2, ide3
 12:     107821          XT-PIC  PS/2 Mouse
 15:      69222          XT-PIC  ide1
NMI:          0
LOC:          0
ERR:          0
MIS:          0

But it failed horribly in such a setup:

  0:   XT-PIC  timer
  1:   XT-PIC  keyboard
  2:   XT-PIC  cascade
  5:   XT-PIC  EMU10K1
  7:   XT-PIC  HiSax
  9:   XT-PIC  eth2, aic7xxx, aic7xxx, eth0, eth1, ide2, ide3
 12:   XT-PIC  PS/2 Mouse
 15:   XT-PIC  ide1

I cannot even produce a "cat /proc/interrupts" for this because I am not fast
enough at login (the network at eth2 is heavily loaded). I sometimes read about
problems here with tg3 drivers, and I just wanted to point you to the shared
case; maybe it has to do with this special case rather than with the driver's
internals.
(PS: it's not the ide2-3, I checked that out)

--
Regards,
Stephan
* Re: 2.4.21-pre4: tg3 driver problems with shared interrupts 2003-02-02 15:18 2.4.21-pre4: tg3 driver problems with shared interrupts Stephan von Krawczynski @ 2003-02-02 16:49 ` Jeff Garzik 2003-02-02 17:09 ` Stephan von Krawczynski 2003-02-02 17:52 ` Stephan von Krawczynski 0 siblings, 2 replies; 27+ messages in thread From: Jeff Garzik @ 2003-02-02 16:49 UTC (permalink / raw) To: Stephan von Krawczynski; +Cc: linux-kernel, davem [-- Attachment #1: Type: text/plain, Size: 969 bytes --] Stephan von Krawczynski wrote: > I found out within minutes that this setup does not survive if you let the > Broadcom cards share interrupts with anything else. It works ok now like > this (eth2 is tg3): [...] > But horribly failed in such a setup: [...] > I cannot even produce a "cat /proc/interrupts" for this because I am not fast > enough at login (the network at eth2 is heavy loaded). I sometimes read about > problems here with tg3-drivers, and I just wanted to point you to the shared > case, maybe it has to do with this special case rather than with the drivers > internals itself. > (PS: its not the ide2-3, I checked that out) hmmm. I've attached the latest tg3, version 1.4, which I just sent off to Marcelo. It includes some fixes that may affect your 5701. Can you try two things? 1) 2.4.21-pre4 + tg3 v1.4 2) 2.4.20 + tg3 v1.4 I'm interested to know if you can reproduce with the latest driver in either of these two scenarios... Jeff [-- Attachment #2: tg3.tar.bz2 --] [-- Type: application/x-bzip2, Size: 49437 bytes --] ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: 2.4.21-pre4: tg3 driver problems with shared interrupts 2003-02-02 16:49 ` Jeff Garzik @ 2003-02-02 17:09 ` Stephan von Krawczynski 2003-02-02 17:15 ` Jeff Garzik 2003-02-02 17:52 ` Stephan von Krawczynski 1 sibling, 1 reply; 27+ messages in thread From: Stephan von Krawczynski @ 2003-02-02 17:09 UTC (permalink / raw) To: Jeff Garzik; +Cc: linux-kernel, davem On Sun, 02 Feb 2003 11:49:12 -0500 Jeff Garzik <jgarzik@pobox.com> wrote: > Stephan von Krawczynski wrote: > > I found out within minutes that this setup does not survive if you let the > > Broadcom cards share interrupts with anything else. It works ok now like > > this (eth2 is tg3): > > [...] > > But horribly failed in such a setup: > > [...] > > I cannot even produce a "cat /proc/interrupts" for this because I am not fast > > enough at login (the network at eth2 is heavy loaded). I sometimes read about > > problems here with tg3-drivers, and I just wanted to point you to the shared > > case, maybe it has to do with this special case rather than with the drivers > > internals itself. > > (PS: its not the ide2-3, I checked that out) > > > > hmmm. I've attached the latest tg3, version 1.4, which I just sent off > to Marcelo. It includes some fixes that may affect your 5701. 
Sorry, putting it into pre4 results in:

gcc -D__KERNEL__ -I/usr/src/linux-2.4.21-pre4-patch/include -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fno-strict-aliasing -fno-common -fomit-frame-pointer -pipe -mpreferred-stack-boundary=2 -march=i686 -nostdinc -iwithprefix include -DKBUILD_BASENAME=tg3 -c -o tg3.o tg3.c
tg3.c:140: `PCI_DEVICE_ID_TIGON3_5704S' undeclared here (not in a function)
tg3.c:140: initializer element is not constant
tg3.c:140: (near initialization for `tg3_pci_tbl[8].device')
tg3.c:141: initializer element is not constant
tg3.c:141: (near initialization for `tg3_pci_tbl[8]')
tg3.c:142: `PCI_DEVICE_ID_TIGON3_5702A3' undeclared here (not in a function)
tg3.c:142: initializer element is not constant
tg3.c:142: (near initialization for `tg3_pci_tbl[9].device')
tg3.c:143: initializer element is not constant
tg3.c:143: (near initialization for `tg3_pci_tbl[9]')
tg3.c:144: `PCI_DEVICE_ID_TIGON3_5703A3' undeclared here (not in a function)
tg3.c:144: initializer element is not constant
tg3.c:144: (near initialization for `tg3_pci_tbl[10].device')
tg3.c:145: initializer element is not constant
tg3.c:145: (near initialization for `tg3_pci_tbl[10]')
tg3.c:147: initializer element is not constant
tg3.c:147: (near initialization for `tg3_pci_tbl[11]')
tg3.c:149: initializer element is not constant
tg3.c:149: (near initialization for `tg3_pci_tbl[12]')
tg3.c:151: initializer element is not constant
tg3.c:151: (near initialization for `tg3_pci_tbl[13]')
tg3.c:152: initializer element is not constant
tg3.c:152: (near initialization for `tg3_pci_tbl[14]')
make[3]: *** [tg3.o] Error 1

Shall I patch it, or do you?

--
Regards,
Stephan
* Re: 2.4.21-pre4: tg3 driver problems with shared interrupts 2003-02-02 17:09 ` Stephan von Krawczynski @ 2003-02-02 17:15 ` Jeff Garzik 0 siblings, 0 replies; 27+ messages in thread From: Jeff Garzik @ 2003-02-02 17:15 UTC (permalink / raw) To: Stephan von Krawczynski; +Cc: linux-kernel, davem, manish [-- Attachment #1: Type: text/plain, Size: 499 bytes --] Stephan von Krawczynski wrote: > Sorry, putting it into pre4 results in: > > gcc -D__KERNEL__ -I/usr/src/linux-2.4.21-pre4-patch/include -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fno-strict-aliasing -fno-common -fomit-frame-pointer -pipe -mpreferred-stack-boundary=2 -march=i686 -nostdinc -iwithprefix include -DKBUILD_BASENAME=tg3 -c -o tg3.o tg3.c > tg3.c:140: `PCI_DEVICE_ID_TIGON3_5704S' undeclared here (not in a function) sorry! you need updated include/linux/pci_ids.h (attached). [-- Attachment #2: tg3.tar.bz2 --] [-- Type: application/x-bzip2, Size: 62620 bytes --] ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: 2.4.21-pre4: tg3 driver problems with shared interrupts 2003-02-02 16:49 ` Jeff Garzik 2003-02-02 17:09 ` Stephan von Krawczynski @ 2003-02-02 17:52 ` Stephan von Krawczynski 2003-02-02 18:28 ` Jeff Garzik 1 sibling, 1 reply; 27+ messages in thread From: Stephan von Krawczynski @ 2003-02-02 17:52 UTC (permalink / raw) To: Jeff Garzik; +Cc: linux-kernel, davem On Sun, 02 Feb 2003 11:49:12 -0500 Jeff Garzik <jgarzik@pobox.com> wrote: > Stephan von Krawczynski wrote: > > I found out within minutes that this setup does not survive if you let the > > Broadcom cards share interrupts with anything else. It works ok now like > > this (eth2 is tg3): > > [...] > > But horribly failed in such a setup: > > [...] > > I cannot even produce a "cat /proc/interrupts" for this because I am not > > fast enough at login (the network at eth2 is heavy loaded). I sometimes > > read about problems here with tg3-drivers, and I just wanted to point you > > to the shared case, maybe it has to do with this special case rather than > > with the drivers internals itself. > > (PS: its not the ide2-3, I checked that out) > > > > hmmm. I've attached the latest tg3, version 1.4, which I just sent off > to Marcelo. It includes some fixes that may affect your 5701. > > Can you try two things? > > 1) 2.4.21-pre4 + tg3 v1.4 Ok. With the latest version you sent I got it compiled and working. As far as I can tell from short tests it looks good (eth2 is tg3): CPU0 0: 79344 XT-PIC timer 1: 2428 XT-PIC keyboard 2: 0 XT-PIC cascade 5: 0 XT-PIC EMU10K1 7: 81 XT-PIC HiSax 9: 387286 XT-PIC aic7xxx, aic7xxx, eth0, eth1, eth2 11: 75780 XT-PIC ide2, ide3 12: 17740 XT-PIC PS/2 Mouse 15: 2 XT-PIC ide1 NMI: 0 LOC: 0 ERR: 0 MIS: 0 To make sure I will let it stress-test overnight and send you the results in about 15 hours from now on. If everything does fine I will redo with ide2,ide3 on same interrupt, too. Just to see what happens with these Promise things... 
--
Thanks a lot, I'll be back,
Stephan
* Re: 2.4.21-pre4: tg3 driver problems with shared interrupts 2003-02-02 17:52 ` Stephan von Krawczynski @ 2003-02-02 18:28 ` Jeff Garzik 2003-02-02 18:31 ` Stephan von Krawczynski ` (2 more replies) 0 siblings, 3 replies; 27+ messages in thread From: Jeff Garzik @ 2003-02-02 18:28 UTC (permalink / raw) To: Stephan von Krawczynski; +Cc: linux-kernel, davem Stephan von Krawczynski wrote: > On Sun, 02 Feb 2003 11:49:12 -0500 > Jeff Garzik <jgarzik@pobox.com> wrote: >>Can you try two things? >> >>1) 2.4.21-pre4 + tg3 v1.4 > > > Ok. With the latest version you sent I got it compiled and working. As far as I > can tell from short tests it looks good (eth2 is tg3): cool beans, thanks! > To make sure I will let it stress-test overnight and send you the results in > about 15 hours from now on. If everything does fine I will redo with ide2,ide3 > on same interrupt, too. Just to see what happens with these Promise things... great. though who knows with the Promise stuff... :) I hope it's not their binary-only junk... Jeff ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: 2.4.21-pre4: tg3 driver problems with shared interrupts 2003-02-02 18:28 ` Jeff Garzik @ 2003-02-02 18:31 ` Stephan von Krawczynski 2003-02-03 10:25 ` Stephan von Krawczynski 2003-02-05 9:48 ` 2.4.21-pre4: PDC ide " Stephan von Krawczynski 2 siblings, 0 replies; 27+ messages in thread From: Stephan von Krawczynski @ 2003-02-02 18:31 UTC (permalink / raw) To: Jeff Garzik; +Cc: linux-kernel, davem On Sun, 02 Feb 2003 13:28:55 -0500 Jeff Garzik <jgarzik@pobox.com> wrote: > Stephan von Krawczynski wrote: > > On Sun, 02 Feb 2003 11:49:12 -0500 > > Jeff Garzik <jgarzik@pobox.com> wrote: > >>Can you try two things? > >> > >>1) 2.4.21-pre4 + tg3 v1.4 > > > > > > Ok. With the latest version you sent I got it compiled and working. As far as I > > can tell from short tests it looks good (eth2 is tg3): > > cool beans, thanks! > > > > To make sure I will let it stress-test overnight and send you the results in > > about 15 hours from now on. If everything does fine I will redo with ide2,ide3 > > on same interrupt, too. Just to see what happens with these Promise things... > > > great. > > though who knows with the Promise stuff... :) I hope it's not their > binary-only junk... I am using "PROMISE PDC202{68|69|70|71|75|76|77} support" from standard kernel. -- Regards, Stephan ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: 2.4.21-pre4: tg3 driver problems with shared interrupts
From: Stephan von Krawczynski @ 2003-02-03 10:25 UTC
To: Jeff Garzik; +Cc: linux-kernel, davem

On Sun, 02 Feb 2003 13:28:55 -0500 Jeff Garzik <jgarzik@pobox.com> wrote:

> > To make sure I will let it stress-test overnight and send you the results
> > in about 15 hours from now on. If everything does fine I will redo with
> > ide2,ide3 on same interrupt, too. Just to see what happens with these
> > Promise things...

Hi Jeff,

I can tell you for sure now that your tg3 driver does very well in a shared
interrupt config (eth2 is tg3):

           CPU0
  0:    6052783          XT-PIC  timer
  1:       8618          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:        581          XT-PIC  EMU10K1
  7:       6842          XT-PIC  HiSax
  9:   30245790          XT-PIC  aic7xxx, aic7xxx, eth0, eth1, eth2
 11:    7324397          XT-PIC  ide2, ide3
 12:     196375          XT-PIC  PS/2 Mouse
 15:          2          XT-PIC  ide1
NMI:          0
LOC:          0
ERR:         52
MIS:          0

Though I don't know exactly what "ERR:" means in this context, the machine is
alive and performing well.

Now I go again for the Promise stuff ...

--
Regards,
Stephan
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts 2003-02-02 18:28 ` Jeff Garzik 2003-02-02 18:31 ` Stephan von Krawczynski 2003-02-03 10:25 ` Stephan von Krawczynski @ 2003-02-05 9:48 ` Stephan von Krawczynski 2003-02-05 11:16 ` Benjamin Herrenschmidt 2 siblings, 1 reply; 27+ messages in thread From: Stephan von Krawczynski @ 2003-02-05 9:48 UTC (permalink / raw) To: Jeff Garzik; +Cc: linux-kernel, davem On Sun, 02 Feb 2003 13:28:55 -0500 Jeff Garzik <jgarzik@pobox.com> wrote: > > To make sure I will let it stress-test overnight and send you the results in > > about 15 hours from now on. If everything does fine I will redo with ide2,ide3 > > on same interrupt, too. Just to see what happens with these Promise things... > > > great. > > though who knows with the Promise stuff... :) I hope it's not their > binary-only junk... > > Jeff Okay, I had to watch for it a bit longer and it turns out that the kernel PDC driver has a problem in this shared interrupt setup. When loads get high it seems to run into some timing problem which causes things like: Feb 4 01:02:22 admin kernel: hde: dma_intr: status=0xd0 { Busy } Feb 4 01:02:22 admin kernel: Feb 4 01:02:22 admin kernel: PDC202XX: Primary channel reset. Feb 4 01:02:22 admin kernel: ide2: reset: success Feb 4 01:02:23 admin kernel: hde: status error: status=0x58 { DriveReady SeekComplete DataRequest } Feb 4 01:02:23 admin kernel: Feb 4 01:02:23 admin kernel: hde: drive not ready for command Feb 4 01:02:23 admin kernel: hde: status error: status=0x50 { DriveReady SeekComplete } Feb 4 01:02:23 admin kernel: Feb 4 01:02:23 admin kernel: hde: no DRQ after issuing WRITE Feb 4 01:02:23 admin kernel: hde: status timeout: status=0xd0 { Busy } Feb 4 01:02:23 admin kernel: Results are that the drive itself just hangs and has to be powered off (resetting the box does not work). The drives worked (and works) fine in non shared-interrupt context. 
Controller is:

<6>PDC20268: IDE controller at PCI slot 01:02.0
<6>PCI: Found IRQ 11 for device 01:02.0
<6>PDC20268: chipset revision 1
<6>PDC20268: not 100%% native mode: will probe irqs later
<6>    ide2: BM-DMA at 0x7400-0x7407, BIOS settings: hde:pio, hdf:pio
<6>    ide3: BM-DMA at 0x7408-0x740f, BIOS settings: hdg:pio, hdh:pio
<4>hdc: AOPEN CD-RW CRW2440, ATAPI CD/DVD-ROM drive
<4>hde: ST380021A, ATA DISK drive
<4>blk: queue c034e1f8, I/O limit 4095Mb (mask 0xffffffff)
<4>hdg: IC35L060AVER07-0, ATA DISK drive
<4>blk: queue c034e664, I/O limit 4095Mb (mask 0xffffffff)
<4>ide1 at 0x170-0x177,0x376 on irq 15
<4>ide2 at 0x8800-0x8807,0x8402 on irq 11
<4>ide3 at 0x8000-0x8007,0x7802 on irq 11
<4>hde: host protected area => 1
<6>hde: 156301488 sectors (80026 MB) w/2048KiB Cache, CHS=155061/16/63, UDMA(100)
<4>hdg: host protected area => 1
<6>hdg: 120103200 sectors (61493 MB) w/1916KiB Cache, CHS=119150/16/63, UDMA(100)
<6>Partition check:
<6> hde: hde1
<6> hdg:<6> [PTBL] [7476/255/63] hdg1

Regards,
Stephan

PS: tg3 does great! Good job, Jeff...
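For readers following the hde log lines in this report, the status bytes can be decoded by hand. The following is only a minimal sketch, assuming the standard ATA status register bit layout; the macro names and the helper decode_status() are hypothetical and are not taken from the kernel source. It reports only "Busy" when the busy bit is set, since the remaining bits are not meaningful while the drive is busy, and it omits the rarely seen corrected-data and index bits.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Standard ATA status register bits (names here are illustrative). */
#define ST_BUSY   0x80  /* BSY: drive is still processing a command */
#define ST_READY  0x40  /* DRDY: drive ready */
#define ST_FAULT  0x20  /* DF: device/write fault */
#define ST_SEEK   0x10  /* DSC: seek complete */
#define ST_DRQ    0x08  /* DRQ: drive wants to transfer data */
#define ST_ERR    0x01  /* ERR: error register holds details */

/*
 * Hypothetical helper: write the names of the set flags into buf,
 * separated by spaces, and return buf.
 */
static const char *decode_status(uint8_t stat, char *buf, size_t len)
{
    static const struct { uint8_t bit; const char *name; } flags[] = {
        { ST_READY, "DriveReady" },
        { ST_FAULT, "DeviceFault" },
        { ST_SEEK,  "SeekComplete" },
        { ST_DRQ,   "DataRequest" },
        { ST_ERR,   "Error" },
    };
    size_t used = 0;

    if (len == 0)
        return buf;
    buf[0] = '\0';

    /* While BSY is set the other bits carry no information. */
    if (stat & ST_BUSY) {
        snprintf(buf, len, "Busy");
        return buf;
    }

    for (size_t i = 0; i < sizeof flags / sizeof flags[0]; i++) {
        if ((stat & flags[i].bit) && used < len) {
            int n = snprintf(buf + used, len - used, "%s%s",
                             used ? " " : "", flags[i].name);
            if (n < 0)
                break;
            used += (size_t)n;
        }
    }
    return buf;
}
```

Read this way, 0xd0 in the dma_intr line means the drive was still busy when the interrupt handler looked at it, while 0x58 and 0x50 show a drive that is ready, with and without a pending data request.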
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts 2003-02-05 9:48 ` 2.4.21-pre4: PDC ide " Stephan von Krawczynski @ 2003-02-05 11:16 ` Benjamin Herrenschmidt 2003-02-05 11:39 ` Stephan von Krawczynski ` (2 more replies) 0 siblings, 3 replies; 27+ messages in thread From: Benjamin Herrenschmidt @ 2003-02-05 11:16 UTC (permalink / raw) To: alan; +Cc: Stephan von Krawczynski, linux-kernel > Okay, I had to watch for it a bit longer and it turns out that the kernel PDC driver has a problem in this shared interrupt setup. When loads get high it seems to run into some timing problem which causes things like: > > Feb 4 01:02:22 admin kernel: hde: dma_intr: status=0xd0 { Busy } > Feb 4 01:02:22 admin kernel: > Feb 4 01:02:22 admin kernel: PDC202XX: Primary channel reset. > Feb 4 01:02:22 admin kernel: ide2: reset: success > Feb 4 01:02:23 admin kernel: hde: status error: status=0x58 { DriveReady SeekComplete DataRequest } > Feb 4 01:02:23 admin kernel: > Feb 4 01:02:23 admin kernel: hde: drive not ready for command > Feb 4 01:02:23 admin kernel: hde: status error: status=0x50 { DriveReady SeekComplete } > Feb 4 01:02:23 admin kernel: > Feb 4 01:02:23 admin kernel: hde: no DRQ after issuing WRITE > Feb 4 01:02:23 admin kernel: hde: status timeout: status=0xd0 { Busy } > Feb 4 01:02:23 admin kernel: Hi Alan ! I'm trying to get some sense out of the above report as it seem to match a problem a user reported me as well. It's interesting because it's apparently running UP, so it's not the SMP race found by Ross. It's definitely a problem with shared interrupt though. I've followed the code path involved here. Basically, we had ide_dma_intr() called while the drive was busy. This is strange for a simple reason: Even if we got the wrong interrupt (shared interrupt), __ide_dma_test_irq() should have returned 0, and so ide_dma_intr shouldn't have been called. 
Assuming the driver was doing basic read/write operations, I checked the code, and while it seems that do_rw_disk() is called without the lock held nor interrupts masked, I see no obvious race. drive->waiting_for_dma is set before setting up the handler and issuing the command, and while the DMA engine is indeed started only after sending the command, it's INTR bit have been cleared previously (I suppose it can't be stale, or can it while the DMA haven't been started yet ? In this case we would need to take the lock here). > Results are that the drive itself just hangs and has to be powered off (resetting the box does not work). The drives worked (and works) fine in non shared-interrupt context. Controller is: > > <6>PDC20268: IDE controller at PCI slot 01:02.0 > <6>PCI: Found IRQ 11 for device 01:02.0 > <6>PDC20268: chipset revision 1 > <6>PDC20268: not 100%% native mode: will probe irqs later > <6> ide2: BM-DMA at 0x7400-0x7407, BIOS settings: hde:pio, hdf:pio > <6> ide3: BM-DMA at 0x7408-0x740f, BIOS settings: hdg:pio, hdh:pio > <4>hdc: AOPEN CD-RW CRW2440, ATAPI CD/DVD-ROM drive > <4>hde: ST380021A, ATA DISK drive > <4>blk: queue c034e1f8, I/O limit 4095Mb (mask 0xffffffff) > <4>hdg: IC35L060AVER07-0, ATA DISK drive > <4>blk: queue c034e664, I/O limit 4095Mb (mask 0xffffffff) > <4>ide1 at 0x170-0x177,0x376 on irq 15 > <4>ide2 at 0x8800-0x8807,0x8402 on irq 11 > <4>ide3 at 0x8000-0x8007,0x7802 on irq 11 > <4>hde: host protected area => 1 > <6>hde: 156301488 sectors (80026 MB) w/2048KiB Cache, CHS=155061/16/63, UDMA(100) > <4>hdg: host protected area => 1 > <6>hdg: 120103200 sectors (61493 MB) w/1916KiB Cache, CHS=119150/16/63, UDMA(100) > <6>Partition check: > <6> hde: hde1 > <6> hdg:<6> [PTBL] [7476/255/63] hdg1 > > Regards, > Stephan > > PS: tg3 does great! Good job, Jeff... 
--
Benjamin Herrenschmidt <benh@kernel.crashing.org>
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts 2003-02-05 11:16 ` Benjamin Herrenschmidt @ 2003-02-05 11:39 ` Stephan von Krawczynski 2003-02-05 12:21 ` Alan Cox 2003-02-05 12:22 ` Benjamin Herrenschmidt 2003-02-05 12:24 ` Alan Cox 2003-02-05 16:56 ` Ross Biro 2 siblings, 2 replies; 27+ messages in thread From: Stephan von Krawczynski @ 2003-02-05 11:39 UTC (permalink / raw) To: Benjamin Herrenschmidt; +Cc: alan, linux-kernel On 05 Feb 2003 12:16:02 +0100 Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote: > > > Okay, I had to watch for it a bit longer and it turns out that the kernel > > PDC driver has a problem in this shared interrupt setup. When loads get > > high it seems to run into some timing problem which causes things like: > > > > Feb 4 01:02:22 admin kernel: hde: dma_intr: status=0xd0 { Busy } > > Feb 4 01:02:22 admin kernel: > > Feb 4 01:02:22 admin kernel: PDC202XX: Primary channel reset. > > Feb 4 01:02:22 admin kernel: ide2: reset: success > > Feb 4 01:02:23 admin kernel: hde: status error: status=0x58 { DriveReady > > SeekComplete DataRequest } Feb 4 01:02:23 admin kernel: > > Feb 4 01:02:23 admin kernel: hde: drive not ready for command > > Feb 4 01:02:23 admin kernel: hde: status error: status=0x50 { DriveReady > > SeekComplete } Feb 4 01:02:23 admin kernel: > > Feb 4 01:02:23 admin kernel: hde: no DRQ after issuing WRITE > > Feb 4 01:02:23 admin kernel: hde: status timeout: status=0xd0 { Busy } > > Feb 4 01:02:23 admin kernel: > > Hi Alan ! > > I'm trying to get some sense out of the above report as it seem to match > a problem a user reported me as well. It's interesting because it's > apparently running UP, so it's not the SMP race found by Ross. It's > definitely a problem with shared interrupt though. Hello Benjamin, I have to give a short note on this one: indeed is the system currently running with a single CPU, _but_ since it is a dual-mb the kernel is already compiled for SMP. 
It is started with the "nosmp" option, though. I wanted to mention this, not
knowing whether it matters for the code path.

--
Regards,
Stephan
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
From: Alan Cox @ 2003-02-05 12:21 UTC
To: Stephan von Krawczynski; +Cc: Benjamin Herrenschmidt, alan, linux-kernel

> > a problem a user reported me as well. It's interesting because it's
> > apparently running UP, so it's not the SMP race found by Ross. It's
> > definitely a problem with shared interrupt though.

The race Ross found can bite you on a uniprocessor box as well I think.
It just needs the irq to hit in the right spot. The more interesting question
is how the current -ac behaves under the same treatment.
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
From: Benjamin Herrenschmidt @ 2003-02-05 12:22 UTC
To: Stephan von Krawczynski; +Cc: alan, linux-kernel

> I have to give a short note on this one:
> indeed is the system currently running with a single CPU, _but_ since it is a
> dual-mb the kernel is already compiled for SMP. It is started with "nosmp"
> option though. I wanted to mention this not knowing if it is important for the
> codepath.

Shouldn't be an issue. I suppose you don't use fancy stuff like preempt or
IDE taskfile IO, right?

Ben.
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
From: Alan Cox @ 2003-02-05 12:50 UTC
To: Benjamin Herrenschmidt; +Cc: Stephan von Krawczynski, alan, linux-kernel

> > dual-mb the kernel is already compiled for SMP. It is started with "nosmp"
> > option though. I wanted to mention this not knowing if it is important for the
> > codepath.
>
> Shouldn't be an issue. I suppose you don't use fancy stuff like preempt or
> IDE taskfile IO, right ?

IDE taskfile I/O is disabled. Pre-empt and 2.4 IDE don't work together at
all yet, and probably never will (see the /proc code for why it's basically
unfixable in 2.4).
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts 2003-02-05 12:22 ` Benjamin Herrenschmidt 2003-02-05 12:50 ` Alan Cox @ 2003-02-05 13:19 ` Stephan von Krawczynski 1 sibling, 0 replies; 27+ messages in thread From: Stephan von Krawczynski @ 2003-02-05 13:19 UTC (permalink / raw) To: Benjamin Herrenschmidt; +Cc: alan, linux-kernel On 05 Feb 2003 13:22:19 +0100 Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote: > > > I have to give a short note on this one: > > indeed is the system currently running with a single CPU, _but_ since it is a > > dual-mb the kernel is already compiled for SMP. It is started with "nosmp" > > option though. I wanted to mention this not knowing if it is important for the > > codepath. > > Shouldn be an issue. I suppose you don't use fancy stuff like preempt or > IDE taskfile IO, right ? No, not at all. Pure and simple filesystem-I/O (reiserfs). -- Regards, Stephan ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
From: Alan Cox @ 2003-02-05 12:24 UTC
To: Benjamin Herrenschmidt; +Cc: alan, Stephan von Krawczynski, linux-kernel

> ide_dma_intr() called while the drive was busy. This is strange for a
> simple reason: Even if we got the wrong interrupt (shared interrupt),
> __ide_dma_test_irq() should have returned 0, and so ide_dma_intr
> shouldn't have been called.

Ok the other mail makes more sense now 8)

> drive->waiting_for_dma is set before setting up the handler and
> issuing the command, and while the DMA engine is indeed started
> only after sending the command, it's INTR bit have been cleared
> previously (I suppose it can't be stale, or can it while the DMA
> haven't been started yet ? In this case we would need to take
> the lock here).

I'd have to go digging. I think that can occur however. Andre ?
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
From: Ross Biro @ 2003-02-05 16:56 UTC
To: Benjamin Herrenschmidt; +Cc: alan, Stephan von Krawczynski, linux-kernel

Benjamin Herrenschmidt wrote:

>>Okay, I had to watch for it a bit longer and it turns out that the kernel PDC driver has a problem in this shared interrupt setup. When loads get high it seems to run into some timing problem which causes things like:
>>
>>Feb 4 01:02:22 admin kernel: hde: dma_intr: status=0xd0 { Busy }

Since the busy bit is set, we know the drive must have received a
command. Since dma_intr thought the drive was not busy, an interrupt
must have snuck through between the command being issued and the dma
being started. I think in my original patch, I had the dma start
outside of the spinlock, that is a bug. The command to the controller
to start the dma must be inside of the spinlock.

I have not looked at 2.4.21-pre4 at all, so I could be entirely off base
here.

Ross
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
From: Benjamin Herrenschmidt @ 2003-02-05 17:12 UTC
To: Ross Biro; +Cc: alan, Stephan von Krawczynski, linux-kernel

On Wed, 2003-02-05 at 17:56, Ross Biro wrote:
> Benjamin Herrenschmidt wrote:
>
> >>Okay, I had to watch for it a bit longer and it turns out that the kernel PDC driver has a problem in this shared interrupt setup. When loads get high it seems to run into some timing problem which causes things like:
> >>
> >>Feb 4 01:02:22 admin kernel: hde: dma_intr: status=0xd0 { Busy }
>
> Since the busy bit is set, we know the drive must have received a
> command. Since dma_intr thought the drive was not busy, an interrupt
> must have snuck through between the command being issued and the dma
> being started. I think in my original patch, I had the dma start
> outside of the spinlock, that is a bug. The command to the controller
> to start the dma must be inside of the spinlock.

While I agree with you here, I don't think it's what's happening.

In ide-disk, do_rw_disk sets up the taskfile, then basically calls
hwif->ide_dma_read/write to start the command. In ide-dma.c in
2.4.21-pre4, what happens is:

	/* PRD table */
	hwif->OUTL(hwif->dmatable_dma, hwif->dma_prdtable);
	/* specify r/w */
	hwif->OUTB(reading, hwif->dma_command);
	/* read dma_status for INTR & ERROR flags */
	dma_stat = hwif->INB(hwif->dma_status);
	/* clear INTR & ERROR flags */
	hwif->OUTB(dma_stat|6, hwif->dma_status);
	drive->waiting_for_dma = 1;
	if (drive->media != ide_disk)
		return 0;
	.../...

Then issue command byte.

Below we clear the DMA status _and_ set waiting_for_dma to 1. That means
that if an IRQ sneaks in, we will call drive_is_ready(), which shouldn't
return INTR 1 since we just cleared it.
I don't see how a race could happen here, but I might have missed something.
Even if, on SMP, the code below executes _simultaneously_ with ide_intr,
the latter will check for the handler being non-NULL before checking
waiting_for_dma (drive_is_ready), and thus will not race since we set the
handler after.

The only thing I see is a possible wraparound of waiting_for_dma. It's
a u8, so it wraps at 255. However, it's incremented in each
__ide_dma_test_irq call. So if you get more than 255 shared (network in
your case) interrupts before the end of the command, you die.

Alan: you can safely remove the waiting_for_dma++, I believe, in
drive_is_ready(). I don't know how that code sneaked into ide-dma. I indeed
do that in ppc/pmac.c for other reasons (sort of a timeout condition on the
DMA controller that happens when I get an initial error), but this is
totally unrelated HW on which I know I have no shared IRQ.

Stephan: Can you try editing ide-dma.c, function __ide_dma_test_irq(),
and remove that line:

-	drive->waiting_for_dma++;

And tell us if it helps in any way.

Ben.
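The wraparound described in the message above can be reproduced outside the kernel. What follows is only an illustrative userspace sketch, not kernel code: struct fake_drive, foreign_irq() and irqs_until_flag_clears() are hypothetical names, and the model assumes only what the message states, namely that waiting_for_dma is a u8, is set to 1 when the command is set up, and is incremented once per test of a shared interrupt that belongs to another device.

```c
#include <stdint.h>

/* Hypothetical stand-in for the drive state; only this one field matters. */
struct fake_drive {
    uint8_t waiting_for_dma;    /* 8-bit counter, so 255 + 1 wraps to 0 */
};

/*
 * Models one test of an interrupt that was raised by some other device
 * on the shared line: the flag is bumped although our DMA is not done.
 */
static void foreign_irq(struct fake_drive *d)
{
    d->waiting_for_dma++;
}

/*
 * Counts how many foreign interrupts it takes until the flag reads as 0,
 * i.e. until the driver would wrongly conclude it is no longer waiting.
 */
static unsigned irqs_until_flag_clears(void)
{
    struct fake_drive d = { .waiting_for_dma = 1 };  /* set at command setup */
    unsigned n = 0;

    while (d.waiting_for_dma != 0) {
        foreign_irq(&d);
        n++;
    }
    return n;
}
```

Under these assumptions the function returns 255, which is easy to reach for a busy gigabit NIC sharing the line while a single disk command is still in flight.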
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-05 17:12 ` Benjamin Herrenschmidt
@ 2003-02-05 17:19 ` Ross Biro
  2003-02-05 17:34   ` Benjamin Herrenschmidt
  2003-02-05 19:10   ` Alan Cox
  2003-02-06 12:20 ` Stephan von Krawczynski
  1 sibling, 2 replies; 27+ messages in thread
From: Ross Biro @ 2003-02-05 17:19 UTC
To: Benjamin Herrenschmidt; +Cc: alan, Stephan von Krawczynski, linux-kernel

Benjamin Herrenschmidt wrote:

>While I agree with you here, I don't think it's what's happening.
>        /* clear INTR & ERROR flags */
>        hwif->OUTB(dma_stat|6, hwif->dma_status);
>

You have way too much faith in the hardware. Promise is especially known
for not keeping to the spec. I wouldn't trust the interrupt bit to be
valid unless a dma is actually active, i.e. that

        hwif->OUTB(hwif->INB(dma_base)|1, dma_base);

has actually been written.

I've actually had a manufacturer tell me that they don't worry about the
spec, just making things work with Windows.

Ross
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-05 17:19 ` Ross Biro
@ 2003-02-05 17:34 ` Benjamin Herrenschmidt
  2003-02-05 17:38   ` Stephan von Krawczynski
  2003-02-05 19:10 ` Alan Cox
  1 sibling, 1 reply; 27+ messages in thread
From: Benjamin Herrenschmidt @ 2003-02-05 17:34 UTC
To: Ross Biro; +Cc: alan, Stephan von Krawczynski, linux-kernel

On Wed, 2003-02-05 at 18:19, Ross Biro wrote:
> Benjamin Herrenschmidt wrote:
>
> >While I agree with you here, I don't think it's what's happening.
> >        /* clear INTR & ERROR flags */
> >        hwif->OUTB(dma_stat|6, hwif->dma_status);
> >
>
> You have way too much faith in the hardware. Promise is especially known
> for not keeping to the spec. I wouldn't trust the interrupt bit to be
> valid unless a dma is actually active, i.e. that
>
>         hwif->OUTB(hwif->INB(dma_base)|1, dma_base);
>
> has actually been written.
>
> I've actually had a manufacturer tell me that they don't worry about the
> spec, just making things work with Windows.

Ok, so that gives us 2 possibilities. The above problem, which would be
fixed by locking all around ide_dma_read/write (or rather in the
_caller_, which seems better so we don't have to drop the lock for ATAPI).

And a possible wraparound of waiting_for_dma if 255 IRQs come in from
whatever device we share the IRQ line with.

I believe both need fixing...

Ben.
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-05 17:34 ` Benjamin Herrenschmidt
@ 2003-02-05 17:38 ` Stephan von Krawczynski
  [not found]        ` <1044467091.685.155.camel@zion.wanadoo.fr>
  2003-02-05 20:00   ` Bryan Andersen
  0 siblings, 2 replies; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-05 17:38 UTC
To: Benjamin Herrenschmidt; +Cc: rossb, alan, linux-kernel

On 05 Feb 2003 18:34:55 +0100
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> On Wed, 2003-02-05 at 18:19, Ross Biro wrote:
> > Benjamin Herrenschmidt wrote:
> >
> > >While I agree with you here, I don't think it's what's happening.
> > >        /* clear INTR & ERROR flags */
> > >        hwif->OUTB(dma_stat|6, hwif->dma_status);
> > >
> >
> > You have way too much faith in the hardware. Promise is especially known
> > for not keeping to the spec. I wouldn't trust the interrupt bit to be
> > valid unless a dma is actually active, i.e. that
> >
> >         hwif->OUTB(hwif->INB(dma_base)|1, dma_base);
> >
> > has actually been written.
> >
> > I've actually had a manufacturer tell me that they don't worry about the
> > spec, just making things work with Windows.
>
> Ok, so that gives us 2 possibilities. The above problem, which would be
> fixed by locking all around ide_dma_read/write (or rather in the
> _caller_, seems better so we don't have to drop the lock for ATAPI).
>
> And a possible wraparound of waiting_for_dma if 255 IRQs come in from
> whatever device we share the IRQ line with.
>
> I believe both need fixing...
>
> Ben.

Ok, yet another small brick in the wall: this mb has 64bit/66MHz PCI slots. PDC
is only 32bit/33MHz PCI. So it may well be that others are in fact _able_ to
produce a damn lot more data/interrupts than the PDC. I am pretty astonished by
the number of interrupts created by the 3com tg3 cards anyways...

--
Regards,
Stephan
[parent not found: <1044467091.685.155.camel@zion.wanadoo.fr>]
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  [not found]        ` <1044467091.685.155.camel@zion.wanadoo.fr>
@ 2003-02-05 17:58 ` Stephan von Krawczynski
  0 siblings, 0 replies; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-05 17:58 UTC
To: Benjamin Herrenschmidt; +Cc: alan, linux-kernel, rossb

On 05 Feb 2003 18:44:51 +0100
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> On Wed, 2003-02-05 at 18:38, Stephan von Krawczynski wrote:
> > On 05 Feb 2003 18:34:55 +0100
> > Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> >
> > > On Wed, 2003-02-05 at 18:19, Ross Biro wrote:
> > > > Benjamin Herrenschmidt wrote:
> > > >
> > > > >While I agree with you here, I don't think it's what's happening.
> > > > >        /* clear INTR & ERROR flags */
> > > > >        hwif->OUTB(dma_stat|6, hwif->dma_status);
> > > > >
> > > >
> > > > You have way too much faith in the hardware. Promise is especially
> > > > known for not keeping to the spec. I wouldn't trust the interrupt bit
> > > > to be valid unless a dma is actually active, i.e. that
> > > >
> > > >         hwif->OUTB(hwif->INB(dma_base)|1, dma_base);
> > > >
> > > > has actually been written.
> > > >
> > > > I've actually had a manufacturer tell me that they don't worry about
> > > > the spec, just making things work with Windows.
> > >
> > > Ok, so that gives us 2 possibilities. The above problem, which would be
> > > fixed by locking all around ide_dma_read/write (or rather in the
> > > _caller_, seems better so we don't have to drop the lock for ATAPI).
> > >
> > > And a possible wraparound of waiting_for_dma if 255 IRQs come in from
> > > whatever device we share the IRQ line with.
> > >
> > > I believe both need fixing...
> > >
> > > Ben.
> >
> > Ok, yet another small brick in the wall: this mb has 64bit/66MHz PCI slots.
> > PDC is only 32bit/33MHz PCI. So it may well be that others are in fact
> > _able_ to produce a damn lot more data/interrupts than the PDC. I am pretty
> > astonished by the number of interrupts created by the 3com tg3 cards
> > anyways...
>
> Ok, then please try my "fix" to remove the increment of waiting_for_dma
> and let us know if it helps.

I will try. In the meantime, can any kind soul please give me a hint why I
cannot see the interrupts distributed among the CPUs when enabling smp and apic
on this very same box:

           CPU0       CPU1
  0:      71158          0    IO-APIC-edge  timer
  1:        941          0    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
 12:      33166          0    IO-APIC-edge  PS/2 Mouse
 15:          4          0    IO-APIC-edge  ide1
 17:       1732          0   IO-APIC-level  ide2, ide3
 18:       3423          0   IO-APIC-level  eth0, eth1
 21:       8177          0   IO-APIC-level  eth2
 22:     112943          0   IO-APIC-level  aic7xxx
 23:         16          0   IO-APIC-level  aic7xxx
 25:         74          0   IO-APIC-level  HiSax
 26:          0          0   IO-APIC-level  EMU10K1
NMI:          0          0
LOC:      71085      71059
ERR:          0
MIS:          0

?? (kernel 2.4.21-pre4)

--
Regards,
Stephan
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-05 17:38 ` Stephan von Krawczynski
  [not found]        ` <1044467091.685.155.camel@zion.wanadoo.fr>
@ 2003-02-05 20:00 ` Bryan Andersen
  1 sibling, 0 replies; 27+ messages in thread
From: Bryan Andersen @ 2003-02-05 20:00 UTC
To: Stephan von Krawczynski; +Cc: Benjamin Herrenschmidt, rossb, alan, linux-kernel

> Ok, yet another small brick in the wall: this mb has 64bit/66MHz PCI slots. PDC
> is only 32bit/33MHz PCI. So it may well be that others are in fact _able_ to
> produce a damn lot more data/interrupts than the PDC. I am pretty astonished by
> the number of interrupts created by the 3com tg3 cards anyways...

On my box one of the devices sharing the interrupt with the disk is the
display; the USB is quiet, with no devices connected. I am running
2.4.21-pre4-ac2. I'd have redistributed interrupts, but the motherboard
doesn't allow me to set them specifically, and APIC isn't supposed to work
on the nForce2 yet.

cat /proc/interrupts
           CPU0
  0:     273068          XT-PIC  timer
  1:       5547          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:     122188          XT-PIC  eth0
 10:     641526          XT-PIC  ide2, ide3, usb-ohci, nvidia
 11:          0          XT-PIC  NVIDIA nForce Audio, usb-ohci
 12:      78237          XT-PIC  PS/2 Mouse
 14:     109475          XT-PIC  ide0
 15:     114178          XT-PIC  ide1
NMI:          0
LOC:     273027
ERR:      18344
MIS:          0

- Bryan
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-05 17:19 ` Ross Biro
  2003-02-05 17:34   ` Benjamin Herrenschmidt
@ 2003-02-05 19:10 ` Alan Cox
  1 sibling, 0 replies; 27+ messages in thread
From: Alan Cox @ 2003-02-05 19:10 UTC
To: Ross Biro
Cc: Benjamin Herrenschmidt, alan, Stephan von Krawczynski, Linux Kernel Mailing List

On Wed, 2003-02-05 at 17:19, Ross Biro wrote:
> I've actually had a manufacturer tell me that they don't worry about the
> spec, just making things work with Windows.

It's very common. As a customer, always ask the vendor in writing whether
they are compliant with each appropriate standard. If they say yes, it has
nice little liability issues should they be lying 8)
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-05 17:12 ` Benjamin Herrenschmidt
  2003-02-05 17:19   ` Ross Biro
@ 2003-02-06 12:20 ` Stephan von Krawczynski
  2003-02-06 23:04   ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-06 12:20 UTC
To: Benjamin Herrenschmidt; +Cc: rossb, alan, linux-kernel

On 05 Feb 2003 18:12:31 +0100
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> Stephan: Can you try editing ide-dma.c, function
> __ide_dma_test_irq(), and remove that line:
>
> -       drive->waiting_for_dma++;
>
> And tell us if it helps in any way.
>
> Ben.

Hello Ben,

as requested I tried the above "patch" and had no problem so far. Current
situation is:
(ide2, ide3 are PDC, eth2 is tg3)

           CPU0
  0:    6332048          XT-PIC  timer
  1:      14112          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:          0          XT-PIC  EMU10K1
  7:      14950          XT-PIC  HiSax
  9:   30600647          XT-PIC  ide2, ide3, aic7xxx, aic7xxx, eth0, eth1, eth2
 12:     234451          XT-PIC  PS/2 Mouse
 15:          2          XT-PIC  ide1
NMI:          0
LOC:          0
ERR:          0
MIS:          0

I would not say this is a rock-solid test case. I will continue to stress the
setup and keep you informed. Anyway it looks stable up to now.

Regards,
Stephan
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-06 12:20 ` Stephan von Krawczynski
@ 2003-02-06 23:04 ` Benjamin Herrenschmidt
  2003-02-07  9:10   ` Stephan von Krawczynski
  0 siblings, 1 reply; 27+ messages in thread
From: Benjamin Herrenschmidt @ 2003-02-06 23:04 UTC
To: Stephan von Krawczynski; +Cc: rossb, alan, linux-kernel

On Thu, 2003-02-06 at 13:20, Stephan von Krawczynski wrote:
> On 05 Feb 2003 18:12:31 +0100
> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
>
> > Stephan: Can you try editing ide-dma.c, function
> > __ide_dma_test_irq(), and remove that line:
> >
> > -       drive->waiting_for_dma++;
> >
> > And tell us if it helps in any way.
> >
> > Ben.
>
> Hello Ben,
>
> as requested I tried the above "patch" and had no problem so far. Current
> situation is:
> (ide2, ide3 are PDC, eth2 is tg3)

Ok, well, if it's still stable by now, I believe we can safely remove
that line from ide_dma_test_irq(). AFAIK, it really has nothing to do
here. (I suspect it got copied from ide-pmac somehow... I use it as
a counter in there to implement some timeout when the DMA engine didn't
start at all because the disk issued an error, and on these, I know for
sure the IRQ isn't shared...)

Alan, can you include that?

===== drivers/ide/ide-dma.c 1.10 vs edited =====
--- 1.10/drivers/ide/ide-dma.c  Sat Feb 1 20:37:36 2003
+++ edited/drivers/ide/ide-dma.c        Fri Feb 7 00:03:43 2003
@@ -826,7 +826,6 @@
        if (!drive->waiting_for_dma)
                printk(KERN_WARNING "%s: (%s) called while not waiting\n",
                        drive->name, __FUNCTION__);
-       drive->waiting_for_dma++;
        return 0;
 }

(Patch against Marcelo's 2.4.21-pre4)
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-06 23:04 ` Benjamin Herrenschmidt
@ 2003-02-07  9:10 ` Stephan von Krawczynski
  0 siblings, 0 replies; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-07 9:10 UTC
To: Benjamin Herrenschmidt; +Cc: rossb, alan, linux-kernel

On 07 Feb 2003 00:04:18 +0100
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> On Thu, 2003-02-06 at 13:20, Stephan von Krawczynski wrote:
> > On 05 Feb 2003 18:12:31 +0100
> > Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> >
> > > Stephan: Can you try editing ide-dma.c, function
> > > __ide_dma_test_irq(), and remove that line:
> > >
> > > -       drive->waiting_for_dma++;
> > >
> > > And tell us if it helps in any way.
> > >
> > > Ben.
> >
> > Hello Ben,
> >
> > as requested I tried the above "patch" and had no problem so far. Current
> > situation is:
> > (ide2, ide3 are PDC, eth2 is tg3)
>
> Ok, well, if it's still stable by now, I believe we can safely remove
> that line from ide_dma_test_irq(). AFAIK, it really has nothing to do
> here.

Hello all,

it is still working ok, currently we are at:

           CPU0
  0:   13848205          XT-PIC  timer
  1:      54117          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:          0          XT-PIC  EMU10K1
  7:      27260          XT-PIC  HiSax
  9:   67048861          XT-PIC  ide2, ide3, aic7xxx, aic7xxx, eth0, eth1, eth2
 12:     765541          XT-PIC  PS/2 Mouse
 15:        229          XT-PIC  ide1
NMI:          0
LOC:          0
ERR:          0
MIS:          0

Regards,
Stephan
end of thread, other threads:[~2003-02-07 9:01 UTC | newest]

Thread overview: 27+ messages
2003-02-02 15:18 2.4.21-pre4: tg3 driver problems with shared interrupts Stephan von Krawczynski
2003-02-02 16:49 ` Jeff Garzik
2003-02-02 17:09 ` Stephan von Krawczynski
2003-02-02 17:15 ` Jeff Garzik
2003-02-02 17:52 ` Stephan von Krawczynski
2003-02-02 18:28 ` Jeff Garzik
2003-02-02 18:31 ` Stephan von Krawczynski
2003-02-03 10:25 ` Stephan von Krawczynski
2003-02-05 9:48 ` 2.4.21-pre4: PDC ide " Stephan von Krawczynski
2003-02-05 11:16 ` Benjamin Herrenschmidt
2003-02-05 11:39 ` Stephan von Krawczynski
2003-02-05 12:21 ` Alan Cox
2003-02-05 12:22 ` Benjamin Herrenschmidt
2003-02-05 12:50 ` Alan Cox
2003-02-05 13:19 ` Stephan von Krawczynski
2003-02-05 12:24 ` Alan Cox
2003-02-05 16:56 ` Ross Biro
2003-02-05 17:12 ` Benjamin Herrenschmidt
2003-02-05 17:19 ` Ross Biro
2003-02-05 17:34 ` Benjamin Herrenschmidt
2003-02-05 17:38 ` Stephan von Krawczynski
[not found] ` <1044467091.685.155.camel@zion.wanadoo.fr>
2003-02-05 17:58 ` Stephan von Krawczynski
2003-02-05 20:00 ` Bryan Andersen
2003-02-05 19:10 ` Alan Cox
2003-02-06 12:20 ` Stephan von Krawczynski
2003-02-06 23:04 ` Benjamin Herrenschmidt
2003-02-07 9:10 ` Stephan von Krawczynski