* 2.4.21-pre4: tg3 driver problems with shared interrupts
@ 2003-02-02 15:18 Stephan von Krawczynski
2003-02-02 16:49 ` Jeff Garzik
0 siblings, 1 reply; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-02 15:18 UTC (permalink / raw)
To: linux-kernel; +Cc: davem, jgarzik
Hello Dave, Jeff, all
I just started experiments with a new setup consisting of:
00:00.0 Host bridge: ServerWorks CNB20HE Host Bridge (rev 23)
00:00.1 Host bridge: ServerWorks CNB20HE Host Bridge (rev 01)
00:00.2 Host bridge: ServerWorks: Unknown device 0006 (rev 01)
00:00.3 Host bridge: ServerWorks: Unknown device 0006 (rev 01)
00:02.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 0d)
00:03.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 0d)
00:04.0 Network controller: Elsa AG QuickStep 1000 (rev 01)
00:05.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 07)
00:05.1 Input device controller: Creative Labs SB Live! MIDI/Game Port (rev 07)
00:07.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
00:0f.0 ISA bridge: ServerWorks CSB5 South Bridge (rev 93)
00:0f.1 IDE interface: ServerWorks CSB5 IDE Controller (rev 93)
00:0f.2 USB Controller: ServerWorks OSB4/CSB5 USB Controller (rev 05)
00:0f.3 Host bridge: ServerWorks: Unknown device 0225
01:02.0 Unknown mass storage controller: Promise Technology, Inc. 20268 (rev 01)
01:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5701 Gigabit Ethernet (rev 15)
02:02.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5701 Gigabit Ethernet (rev 15)
02:03.0 SCSI storage controller: Adaptec AIC-7899P U160/m (rev 01)
02:03.1 SCSI storage controller: Adaptec AIC-7899P U160/m (rev 01)
I found out within minutes that this setup does not survive if you let the
Broadcom cards share interrupts with anything else. It works ok now like
this (eth2 is tg3):
           CPU0
  0:     343269          XT-PIC  timer
  1:       6804          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:      37952          XT-PIC  EMU10K1
  7:        515          XT-PIC  HiSax
  9:     711212          XT-PIC  aic7xxx, aic7xxx, eth0, eth1
 10:    4710570          XT-PIC  eth2
 11:     639316          XT-PIC  ide2, ide3
 12:     107821          XT-PIC  PS/2 Mouse
 15:      69222          XT-PIC  ide1
NMI:          0
LOC:          0
ERR:          0
MIS:          0
But it failed horribly in a setup like this:
0: XT-PIC timer
1: XT-PIC keyboard
2: XT-PIC cascade
5: XT-PIC EMU10K1
7: XT-PIC HiSax
9: XT-PIC eth2, aic7xxx, aic7xxx, eth0, eth1, ide2, ide3
12: XT-PIC PS/2 Mouse
15: XT-PIC ide1
I cannot even produce a "cat /proc/interrupts" for this because I am not fast
enough at login (the network at eth2 is heavily loaded). I sometimes read about
problems here with the tg3 driver, and I just wanted to point you to the shared
case; maybe it has to do with this special case rather than with the driver's
internals.
(PS: it's not the ide2-3, I checked that out)
--
Regards,
Stephan
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: 2.4.21-pre4: tg3 driver problems with shared interrupts
2003-02-02 15:18 2.4.21-pre4: tg3 driver problems with shared interrupts Stephan von Krawczynski
@ 2003-02-02 16:49 ` Jeff Garzik
2003-02-02 17:09 ` Stephan von Krawczynski
2003-02-02 17:52 ` Stephan von Krawczynski
0 siblings, 2 replies; 27+ messages in thread
From: Jeff Garzik @ 2003-02-02 16:49 UTC (permalink / raw)
To: Stephan von Krawczynski; +Cc: linux-kernel, davem
[-- Attachment #1: Type: text/plain, Size: 969 bytes --]
Stephan von Krawczynski wrote:
> I found out within minutes that this setup does not survive if you let the
> Broadcom cards share interrupts with anything else. It works ok now like
> this (eth2 is tg3):
[...]
> But horribly failed in such a setup:
[...]
> I cannot even produce a "cat /proc/interrupts" for this because I am not fast
> enough at login (the network at eth2 is heavy loaded). I sometimes read about
> problems here with tg3-drivers, and I just wanted to point you to the shared
> case, maybe it has to do with this special case rather than with the drivers
> internals itself.
> (PS: it's not the ide2-3, I checked that out)
hmmm. I've attached the latest tg3, version 1.4, which I just sent off
to Marcelo. It includes some fixes that may affect your 5701.
Can you try two things?
1) 2.4.21-pre4 + tg3 v1.4
2) 2.4.20 + tg3 v1.4
I'm interested to know if you can reproduce with the latest driver in
either of these two scenarios...
Jeff
[-- Attachment #2: tg3.tar.bz2 --]
[-- Type: application/x-bzip2, Size: 49437 bytes --]
* Re: 2.4.21-pre4: tg3 driver problems with shared interrupts
2003-02-02 16:49 ` Jeff Garzik
@ 2003-02-02 17:09 ` Stephan von Krawczynski
2003-02-02 17:15 ` Jeff Garzik
2003-02-02 17:52 ` Stephan von Krawczynski
1 sibling, 1 reply; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-02 17:09 UTC (permalink / raw)
To: Jeff Garzik; +Cc: linux-kernel, davem
On Sun, 02 Feb 2003 11:49:12 -0500
Jeff Garzik <jgarzik@pobox.com> wrote:
> Stephan von Krawczynski wrote:
> > I found out within minutes that this setup does not survive if you let the
> > Broadcom cards share interrupts with anything else. It works ok now like
> > this (eth2 is tg3):
>
> [...]
> > But horribly failed in such a setup:
>
> [...]
> > I cannot even produce a "cat /proc/interrupts" for this because I am not fast
> > enough at login (the network at eth2 is heavy loaded). I sometimes read about
> > problems here with tg3-drivers, and I just wanted to point you to the shared
> > case, maybe it has to do with this special case rather than with the drivers
> > internals itself.
> > (PS: it's not the ide2-3, I checked that out)
>
>
>
> hmmm. I've attached the latest tg3, version 1.4, which I just sent off
> to Marcelo. It includes some fixes that may affect your 5701.
Sorry, putting it into pre4 results in:
gcc -D__KERNEL__ -I/usr/src/linux-2.4.21-pre4-patch/include -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fno-strict-aliasing -fno-common -fomit-frame-pointer -pipe -mpreferred-stack-boundary=2 -march=i686 -nostdinc -iwithprefix include -DKBUILD_BASENAME=tg3 -c -o tg3.o tg3.c
tg3.c:140: `PCI_DEVICE_ID_TIGON3_5704S' undeclared here (not in a function)
tg3.c:140: initializer element is not constant
tg3.c:140: (near initialization for `tg3_pci_tbl[8].device')
tg3.c:141: initializer element is not constant
tg3.c:141: (near initialization for `tg3_pci_tbl[8]')
tg3.c:142: `PCI_DEVICE_ID_TIGON3_5702A3' undeclared here (not in a function)
tg3.c:142: initializer element is not constant
tg3.c:142: (near initialization for `tg3_pci_tbl[9].device')
tg3.c:143: initializer element is not constant
tg3.c:143: (near initialization for `tg3_pci_tbl[9]')
tg3.c:144: `PCI_DEVICE_ID_TIGON3_5703A3' undeclared here (not in a function)
tg3.c:144: initializer element is not constant
tg3.c:144: (near initialization for `tg3_pci_tbl[10].device')
tg3.c:145: initializer element is not constant
tg3.c:145: (near initialization for `tg3_pci_tbl[10]')
tg3.c:147: initializer element is not constant
tg3.c:147: (near initialization for `tg3_pci_tbl[11]')
tg3.c:149: initializer element is not constant
tg3.c:149: (near initialization for `tg3_pci_tbl[12]')
tg3.c:151: initializer element is not constant
tg3.c:151: (near initialization for `tg3_pci_tbl[13]')
tg3.c:152: initializer element is not constant
tg3.c:152: (near initialization for `tg3_pci_tbl[14]')
make[3]: *** [tg3.o] Error 1
Shall I patch it, or do you?
--
Regards,
Stephan
* Re: 2.4.21-pre4: tg3 driver problems with shared interrupts
2003-02-02 17:09 ` Stephan von Krawczynski
@ 2003-02-02 17:15 ` Jeff Garzik
0 siblings, 0 replies; 27+ messages in thread
From: Jeff Garzik @ 2003-02-02 17:15 UTC (permalink / raw)
To: Stephan von Krawczynski; +Cc: linux-kernel, davem, manish
[-- Attachment #1: Type: text/plain, Size: 499 bytes --]
Stephan von Krawczynski wrote:
> Sorry, putting it into pre4 results in:
>
> gcc -D__KERNEL__ -I/usr/src/linux-2.4.21-pre4-patch/include -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fno-strict-aliasing -fno-common -fomit-frame-pointer -pipe -mpreferred-stack-boundary=2 -march=i686 -nostdinc -iwithprefix include -DKBUILD_BASENAME=tg3 -c -o tg3.o tg3.c
> tg3.c:140: `PCI_DEVICE_ID_TIGON3_5704S' undeclared here (not in a function)
sorry! you need updated include/linux/pci_ids.h (attached).
[-- Attachment #2: tg3.tar.bz2 --]
[-- Type: application/x-bzip2, Size: 62620 bytes --]
* Re: 2.4.21-pre4: tg3 driver problems with shared interrupts
2003-02-02 16:49 ` Jeff Garzik
2003-02-02 17:09 ` Stephan von Krawczynski
@ 2003-02-02 17:52 ` Stephan von Krawczynski
2003-02-02 18:28 ` Jeff Garzik
1 sibling, 1 reply; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-02 17:52 UTC (permalink / raw)
To: Jeff Garzik; +Cc: linux-kernel, davem
On Sun, 02 Feb 2003 11:49:12 -0500
Jeff Garzik <jgarzik@pobox.com> wrote:
> Stephan von Krawczynski wrote:
> > I found out within minutes that this setup does not survive if you let the
> > Broadcom cards share interrupts with anything else. It works ok now like
> > this (eth2 is tg3):
>
> [...]
> > But horribly failed in such a setup:
>
> [...]
> > I cannot even produce a "cat /proc/interrupts" for this because I am not
> > fast enough at login (the network at eth2 is heavy loaded). I sometimes
> > read about problems here with tg3-drivers, and I just wanted to point you
> > to the shared case, maybe it has to do with this special case rather than
> > with the drivers internals itself.
> > (PS: it's not the ide2-3, I checked that out)
>
>
>
> hmmm. I've attached the latest tg3, version 1.4, which I just sent off
> to Marcelo. It includes some fixes that may affect your 5701.
>
> Can you try two things?
>
> 1) 2.4.21-pre4 + tg3 v1.4
Ok. With the latest version you sent I got it compiled and working. As far as I
can tell from short tests it looks good (eth2 is tg3):
           CPU0
  0:      79344          XT-PIC  timer
  1:       2428          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:          0          XT-PIC  EMU10K1
  7:         81          XT-PIC  HiSax
  9:     387286          XT-PIC  aic7xxx, aic7xxx, eth0, eth1, eth2
 11:      75780          XT-PIC  ide2, ide3
 12:      17740          XT-PIC  PS/2 Mouse
 15:          2          XT-PIC  ide1
NMI:          0
LOC:          0
ERR:          0
MIS:          0
To make sure, I will let it stress-test overnight and send you the results in
about 15 hours from now. If everything goes fine I will redo it with ide2,ide3
on the same interrupt, too. Just to see what happens with these Promise things...
--
Thanks a lot, I'll be back,
Stephan
* Re: 2.4.21-pre4: tg3 driver problems with shared interrupts
2003-02-02 17:52 ` Stephan von Krawczynski
@ 2003-02-02 18:28 ` Jeff Garzik
2003-02-02 18:31 ` Stephan von Krawczynski
` (2 more replies)
0 siblings, 3 replies; 27+ messages in thread
From: Jeff Garzik @ 2003-02-02 18:28 UTC (permalink / raw)
To: Stephan von Krawczynski; +Cc: linux-kernel, davem
Stephan von Krawczynski wrote:
> On Sun, 02 Feb 2003 11:49:12 -0500
> Jeff Garzik <jgarzik@pobox.com> wrote:
>>Can you try two things?
>>
>>1) 2.4.21-pre4 + tg3 v1.4
>
>
> Ok. With the latest version you sent I got it compiled and working. As far as I
> can tell from short tests it looks good (eth2 is tg3):
cool beans, thanks!
> To make sure I will let it stress-test overnight and send you the results in
> about 15 hours from now on. If everything does fine I will redo with ide2,ide3
> on same interrupt, too. Just to see what happens with these Promise things...
great.
though who knows with the Promise stuff... :) I hope it's not their
binary-only junk...
Jeff
* Re: 2.4.21-pre4: tg3 driver problems with shared interrupts
2003-02-02 18:28 ` Jeff Garzik
@ 2003-02-02 18:31 ` Stephan von Krawczynski
2003-02-03 10:25 ` Stephan von Krawczynski
2003-02-05 9:48 ` 2.4.21-pre4: PDC ide " Stephan von Krawczynski
2 siblings, 0 replies; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-02 18:31 UTC (permalink / raw)
To: Jeff Garzik; +Cc: linux-kernel, davem
On Sun, 02 Feb 2003 13:28:55 -0500
Jeff Garzik <jgarzik@pobox.com> wrote:
> Stephan von Krawczynski wrote:
> > On Sun, 02 Feb 2003 11:49:12 -0500
> > Jeff Garzik <jgarzik@pobox.com> wrote:
> >>Can you try two things?
> >>
> >>1) 2.4.21-pre4 + tg3 v1.4
> >
> >
> > Ok. With the latest version you sent I got it compiled and working. As far as I
> > can tell from short tests it looks good (eth2 is tg3):
>
> cool beans, thanks!
>
>
> > To make sure I will let it stress-test overnight and send you the results in
> > about 15 hours from now on. If everything does fine I will redo with ide2,ide3
> > on same interrupt, too. Just to see what happens with these Promise things...
>
>
> great.
>
> though who knows with the Promise stuff... :) I hope it's not their
> binary-only junk...
I am using "PROMISE PDC202{68|69|70|71|75|76|77} support" from standard kernel.
--
Regards,
Stephan
* Re: 2.4.21-pre4: tg3 driver problems with shared interrupts
2003-02-02 18:28 ` Jeff Garzik
2003-02-02 18:31 ` Stephan von Krawczynski
@ 2003-02-03 10:25 ` Stephan von Krawczynski
2003-02-05 9:48 ` 2.4.21-pre4: PDC ide " Stephan von Krawczynski
2 siblings, 0 replies; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-03 10:25 UTC (permalink / raw)
To: Jeff Garzik; +Cc: linux-kernel, davem
On Sun, 02 Feb 2003 13:28:55 -0500
Jeff Garzik <jgarzik@pobox.com> wrote:
> > To make sure I will let it stress-test overnight and send you the results
> > in about 15 hours from now on. If everything does fine I will redo with
> > ide2,ide3 on same interrupt, too. Just to see what happens with these
> > Promise things...
Hi Jeff,
I can tell you for sure now that your tg3 driver does very well in shared
interrupt config:
(eth2 is tg3)
           CPU0
  0:    6052783          XT-PIC  timer
  1:       8618          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:        581          XT-PIC  EMU10K1
  7:       6842          XT-PIC  HiSax
  9:   30245790          XT-PIC  aic7xxx, aic7xxx, eth0, eth1, eth2
 11:    7324397          XT-PIC  ide2, ide3
 12:     196375          XT-PIC  PS/2 Mouse
 15:          2          XT-PIC  ide1
NMI:          0
LOC:          0
ERR:         52
MIS:          0
Though I don't know exactly what "ERR:" means in this context, the machine is
alive and performing well.
Now I go again for the Promise stuff ...
--
Regards,
Stephan
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
2003-02-02 18:28 ` Jeff Garzik
2003-02-02 18:31 ` Stephan von Krawczynski
2003-02-03 10:25 ` Stephan von Krawczynski
@ 2003-02-05 9:48 ` Stephan von Krawczynski
2003-02-05 11:16 ` Benjamin Herrenschmidt
2 siblings, 1 reply; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-05 9:48 UTC (permalink / raw)
To: Jeff Garzik; +Cc: linux-kernel, davem
On Sun, 02 Feb 2003 13:28:55 -0500
Jeff Garzik <jgarzik@pobox.com> wrote:
> > To make sure I will let it stress-test overnight and send you the results in
> > about 15 hours from now on. If everything does fine I will redo with ide2,ide3
> > on same interrupt, too. Just to see what happens with these Promise things...
>
>
> great.
>
> though who knows with the Promise stuff... :) I hope it's not their
> binary-only junk...
>
> Jeff
Okay, I had to watch for it a bit longer and it turns out that the kernel PDC driver has a problem in this shared interrupt setup. When loads get high it seems to run into some timing problem which causes things like:
Feb 4 01:02:22 admin kernel: hde: dma_intr: status=0xd0 { Busy }
Feb 4 01:02:22 admin kernel:
Feb 4 01:02:22 admin kernel: PDC202XX: Primary channel reset.
Feb 4 01:02:22 admin kernel: ide2: reset: success
Feb 4 01:02:23 admin kernel: hde: status error: status=0x58 { DriveReady SeekComplete DataRequest }
Feb 4 01:02:23 admin kernel:
Feb 4 01:02:23 admin kernel: hde: drive not ready for command
Feb 4 01:02:23 admin kernel: hde: status error: status=0x50 { DriveReady SeekComplete }
Feb 4 01:02:23 admin kernel:
Feb 4 01:02:23 admin kernel: hde: no DRQ after issuing WRITE
Feb 4 01:02:23 admin kernel: hde: status timeout: status=0xd0 { Busy }
Feb 4 01:02:23 admin kernel:
The result is that the drive itself just hangs and has to be powered off (resetting the box does not work). The drives worked (and still work) fine in a non-shared-interrupt context. Controller is:
<6>PDC20268: IDE controller at PCI slot 01:02.0
<6>PCI: Found IRQ 11 for device 01:02.0
<6>PDC20268: chipset revision 1
<6>PDC20268: not 100%% native mode: will probe irqs later
<6> ide2: BM-DMA at 0x7400-0x7407, BIOS settings: hde:pio, hdf:pio
<6> ide3: BM-DMA at 0x7408-0x740f, BIOS settings: hdg:pio, hdh:pio
<4>hdc: AOPEN CD-RW CRW2440, ATAPI CD/DVD-ROM drive
<4>hde: ST380021A, ATA DISK drive
<4>blk: queue c034e1f8, I/O limit 4095Mb (mask 0xffffffff)
<4>hdg: IC35L060AVER07-0, ATA DISK drive
<4>blk: queue c034e664, I/O limit 4095Mb (mask 0xffffffff)
<4>ide1 at 0x170-0x177,0x376 on irq 15
<4>ide2 at 0x8800-0x8807,0x8402 on irq 11
<4>ide3 at 0x8000-0x8007,0x7802 on irq 11
<4>hde: host protected area => 1
<6>hde: 156301488 sectors (80026 MB) w/2048KiB Cache, CHS=155061/16/63, UDMA(100)
<4>hdg: host protected area => 1
<6>hdg: 120103200 sectors (61493 MB) w/1916KiB Cache, CHS=119150/16/63, UDMA(100)
<6>Partition check:
<6> hde: hde1
<6> hdg:<6> [PTBL] [7476/255/63] hdg1
Regards,
Stephan
PS: tg3 does great! Good job, Jeff...
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
2003-02-05 9:48 ` 2.4.21-pre4: PDC ide " Stephan von Krawczynski
@ 2003-02-05 11:16 ` Benjamin Herrenschmidt
2003-02-05 11:39 ` Stephan von Krawczynski
` (2 more replies)
0 siblings, 3 replies; 27+ messages in thread
From: Benjamin Herrenschmidt @ 2003-02-05 11:16 UTC (permalink / raw)
To: alan; +Cc: Stephan von Krawczynski, linux-kernel
> Okay, I had to watch for it a bit longer and it turns out that the kernel PDC driver has a problem in this shared interrupt setup. When loads get high it seems to run into some timing problem which causes things like:
>
> Feb 4 01:02:22 admin kernel: hde: dma_intr: status=0xd0 { Busy }
> Feb 4 01:02:22 admin kernel:
> Feb 4 01:02:22 admin kernel: PDC202XX: Primary channel reset.
> Feb 4 01:02:22 admin kernel: ide2: reset: success
> Feb 4 01:02:23 admin kernel: hde: status error: status=0x58 { DriveReady SeekComplete DataRequest }
> Feb 4 01:02:23 admin kernel:
> Feb 4 01:02:23 admin kernel: hde: drive not ready for command
> Feb 4 01:02:23 admin kernel: hde: status error: status=0x50 { DriveReady SeekComplete }
> Feb 4 01:02:23 admin kernel:
> Feb 4 01:02:23 admin kernel: hde: no DRQ after issuing WRITE
> Feb 4 01:02:23 admin kernel: hde: status timeout: status=0xd0 { Busy }
> Feb 4 01:02:23 admin kernel:
Hi Alan !
I'm trying to get some sense out of the above report as it seems to match
a problem a user reported to me as well. It's interesting because it's
apparently running UP, so it's not the SMP race found by Ross. It's
definitely a problem with shared interrupts though.
I've followed the code path involved here. Basically, we had
ide_dma_intr() called while the drive was busy. This is strange for a
simple reason: Even if we got the wrong interrupt (shared interrupt),
__ide_dma_test_irq() should have returned 0, and so ide_dma_intr
shouldn't have been called.
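
As an illustration of the expectation described above, here is a minimal
userspace sketch of how a handler on a shared line is supposed to decide
whether an interrupt is its own. This is not the kernel code: struct
fake_hwif, the fake_* function names and the FAKE_DMA_STAT_INTR value are
assumptions made up for the example; the real check is __ide_dma_test_irq().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one IDE channel; not the kernel's hwif structure. */
struct fake_hwif {
	uint8_t dma_status;   /* stand-in for the bus-master status register */
	int handler_calls;    /* how often the DMA completion path ran */
};

/* Assumed position of the interrupt flag in the status byte. */
#define FAKE_DMA_STAT_INTR 0x04

/* Returns 1 if this channel raised the interrupt, 0 otherwise. */
int fake_dma_test_irq(const struct fake_hwif *hwif)
{
	return (hwif->dma_status & FAKE_DMA_STAT_INTR) ? 1 : 0;
}

/* Called for every interrupt that arrives on the shared line. */
int fake_shared_irq(struct fake_hwif *hwif)
{
	assert(hwif != NULL);
	if (!fake_dma_test_irq(hwif))
		return 0;       /* another device's interrupt: ignore it */
	hwif->handler_calls++;  /* ours: run the DMA completion path */
	hwif->dma_status &= (uint8_t)~FAKE_DMA_STAT_INTR;
	return 1;
}
```

In this model an interrupt from a neighbour on the same line leaves
handler_calls untouched, which is the behaviour the report above appears to
violate.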
Assuming the driver was doing basic read/write operations, I checked
the code, and while it seems that do_rw_disk() is called without
the lock held or interrupts masked, I see no obvious race.
drive->waiting_for_dma is set before setting up the handler and
issuing the command, and while the DMA engine is indeed started
only after sending the command, its INTR bit has been cleared
previously (I suppose it can't be stale, or can it while the DMA
hasn't been started yet ? In this case we would need to take
the lock here).
> Results are that the drive itself just hangs and has to be powered off (resetting the box does not work). The drives worked (and works) fine in non shared-interrupt context. Controller is:
>
> <6>PDC20268: IDE controller at PCI slot 01:02.0
> <6>PCI: Found IRQ 11 for device 01:02.0
> <6>PDC20268: chipset revision 1
> <6>PDC20268: not 100%% native mode: will probe irqs later
> <6> ide2: BM-DMA at 0x7400-0x7407, BIOS settings: hde:pio, hdf:pio
> <6> ide3: BM-DMA at 0x7408-0x740f, BIOS settings: hdg:pio, hdh:pio
> <4>hdc: AOPEN CD-RW CRW2440, ATAPI CD/DVD-ROM drive
> <4>hde: ST380021A, ATA DISK drive
> <4>blk: queue c034e1f8, I/O limit 4095Mb (mask 0xffffffff)
> <4>hdg: IC35L060AVER07-0, ATA DISK drive
> <4>blk: queue c034e664, I/O limit 4095Mb (mask 0xffffffff)
> <4>ide1 at 0x170-0x177,0x376 on irq 15
> <4>ide2 at 0x8800-0x8807,0x8402 on irq 11
> <4>ide3 at 0x8000-0x8007,0x7802 on irq 11
> <4>hde: host protected area => 1
> <6>hde: 156301488 sectors (80026 MB) w/2048KiB Cache, CHS=155061/16/63, UDMA(100)
> <4>hdg: host protected area => 1
> <6>hdg: 120103200 sectors (61493 MB) w/1916KiB Cache, CHS=119150/16/63, UDMA(100)
> <6>Partition check:
> <6> hde: hde1
> <6> hdg:<6> [PTBL] [7476/255/63] hdg1
>
> Regards,
> Stephan
>
> PS: tg3 does great! Good job, Jeff...
--
Benjamin Herrenschmidt <benh@kernel.crashing.org>
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
2003-02-05 11:16 ` Benjamin Herrenschmidt
@ 2003-02-05 11:39 ` Stephan von Krawczynski
2003-02-05 12:21 ` Alan Cox
2003-02-05 12:22 ` Benjamin Herrenschmidt
2003-02-05 12:24 ` Alan Cox
2003-02-05 16:56 ` Ross Biro
2 siblings, 2 replies; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-05 11:39 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: alan, linux-kernel
On 05 Feb 2003 12:16:02 +0100
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
>
> > Okay, I had to watch for it a bit longer and it turns out that the kernel
> > PDC driver has a problem in this shared interrupt setup. When loads get
> > high it seems to run into some timing problem which causes things like:
> >
> > Feb 4 01:02:22 admin kernel: hde: dma_intr: status=0xd0 { Busy }
> > Feb 4 01:02:22 admin kernel:
> > Feb 4 01:02:22 admin kernel: PDC202XX: Primary channel reset.
> > Feb 4 01:02:22 admin kernel: ide2: reset: success
> > Feb 4 01:02:23 admin kernel: hde: status error: status=0x58 { DriveReady SeekComplete DataRequest }
> > Feb 4 01:02:23 admin kernel:
> > Feb 4 01:02:23 admin kernel: hde: drive not ready for command
> > Feb 4 01:02:23 admin kernel: hde: status error: status=0x50 { DriveReady SeekComplete }
> > Feb 4 01:02:23 admin kernel:
> > Feb 4 01:02:23 admin kernel: hde: no DRQ after issuing WRITE
> > Feb 4 01:02:23 admin kernel: hde: status timeout: status=0xd0 { Busy }
> > Feb 4 01:02:23 admin kernel:
>
> Hi Alan !
>
> I'm trying to get some sense out of the above report as it seem to match
> a problem a user reported me as well. It's interesting because it's
> apparently running UP, so it's not the SMP race found by Ross. It's
> definitely a problem with shared interrupt though.
Hello Benjamin,
I have to give a short note on this one:
the system is indeed currently running with a single CPU, _but_ since it is a
dual-CPU mainboard the kernel is already compiled for SMP. It is started with
the "nosmp" option though. I wanted to mention this, not knowing if it is
important for the codepath.
--
Regards,
Stephan
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
2003-02-05 11:39 ` Stephan von Krawczynski
@ 2003-02-05 12:21 ` Alan Cox
2003-02-05 12:22 ` Benjamin Herrenschmidt
1 sibling, 0 replies; 27+ messages in thread
From: Alan Cox @ 2003-02-05 12:21 UTC (permalink / raw)
To: Stephan von Krawczynski; +Cc: Benjamin Herrenschmidt, alan, linux-kernel
> > a problem a user reported me as well. It's interesting because it's
> > apparently running UP, so it's not the SMP race found by Ross. It's
> > definitely a problem with shared interrupt though.
The race Ross found can bite you on a uniprocessor box as well I think.
It just needs the irq to hit in the right spot. The more interesting
question is how the current -ac behaves under the same treatment
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
2003-02-05 11:39 ` Stephan von Krawczynski
2003-02-05 12:21 ` Alan Cox
@ 2003-02-05 12:22 ` Benjamin Herrenschmidt
2003-02-05 12:50 ` Alan Cox
2003-02-05 13:19 ` Stephan von Krawczynski
1 sibling, 2 replies; 27+ messages in thread
From: Benjamin Herrenschmidt @ 2003-02-05 12:22 UTC (permalink / raw)
To: Stephan von Krawczynski; +Cc: alan, linux-kernel
> I have to give a short note on this one:
> indeed is the system currently running with a single CPU, _but_ since it is a
> dual-mb the kernel is already compiled for SMP. It is started with "nosmp"
> option though. I wanted to mention this not knowing if it is important for the
> codepath.
Shouldn't be an issue. I suppose you don't use fancy stuff like preempt or
IDE taskfile IO, right ?
Ben.
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
2003-02-05 11:16 ` Benjamin Herrenschmidt
2003-02-05 11:39 ` Stephan von Krawczynski
@ 2003-02-05 12:24 ` Alan Cox
2003-02-05 16:56 ` Ross Biro
2 siblings, 0 replies; 27+ messages in thread
From: Alan Cox @ 2003-02-05 12:24 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: alan, Stephan von Krawczynski, linux-kernel
> ide_dma_intr() called while the drive was busy. This is strange for a
> simple reason: Even if we got the wrong interrupt (shared interrupt),
> __ide_dma_test_irq() should have returned 0, and so ide_dma_intr
> shouldn't have been called.
Ok the other mail makes more sense now 8)
> drive->waiting_for_dma is set before setting up the handler and
> issuing the command, and while the DMA engine is indeed started
> only after sending the command, it's INTR bit have been cleared
> previously (I suppose it can't be stale, or can it while the DMA
> haven't been started yet ? In this case we would need to take
> the lock here).
I'd have to go digging. I think that can occur however.
Andre ?
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
2003-02-05 12:22 ` Benjamin Herrenschmidt
@ 2003-02-05 12:50 ` Alan Cox
2003-02-05 13:19 ` Stephan von Krawczynski
1 sibling, 0 replies; 27+ messages in thread
From: Alan Cox @ 2003-02-05 12:50 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Stephan von Krawczynski, alan, linux-kernel
> > dual-mb the kernel is already compiled for SMP. It is started with "nosmp"
> > option though. I wanted to mention this not knowing if it is important for the
> > codepath.
>
> Shouldn't be an issue. I suppose you don't use fancy stuff like preempt or
> IDE taskfile IO, right ?
IDE taskfile I/O is disabled. Pre-empt and 2.4 IDE don't work together at
all yet, and probably never will (see the /proc code for why it's basically
unfixable in 2.4)
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
2003-02-05 12:22 ` Benjamin Herrenschmidt
2003-02-05 12:50 ` Alan Cox
@ 2003-02-05 13:19 ` Stephan von Krawczynski
1 sibling, 0 replies; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-05 13:19 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: alan, linux-kernel
On 05 Feb 2003 13:22:19 +0100
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
>
> > I have to give a short note on this one:
> > indeed is the system currently running with a single CPU, _but_ since it is a
> > dual-mb the kernel is already compiled for SMP. It is started with "nosmp"
> > option though. I wanted to mention this not knowing if it is important for the
> > codepath.
>
> Shouldn't be an issue. I suppose you don't use fancy stuff like preempt or
> IDE taskfile IO, right ?
No, not at all. Pure and simple filesystem-I/O (reiserfs).
--
Regards,
Stephan
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
2003-02-05 11:16 ` Benjamin Herrenschmidt
2003-02-05 11:39 ` Stephan von Krawczynski
2003-02-05 12:24 ` Alan Cox
@ 2003-02-05 16:56 ` Ross Biro
2003-02-05 17:12 ` Benjamin Herrenschmidt
2 siblings, 1 reply; 27+ messages in thread
From: Ross Biro @ 2003-02-05 16:56 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: alan, Stephan von Krawczynski, linux-kernel
Benjamin Herrenschmidt wrote:
>>Okay, I had to watch for it a bit longer and it turns out that the kernel PDC driver has a problem in this shared interrupt setup. When loads get high it seems to run into some timing problem which causes things like:
>>
>>Feb 4 01:02:22 admin kernel: hde: dma_intr: status=0xd0 { Busy }
>>
>>
>>
Since the busy bit is set, we know the drive must have received a
command. Since dma_intr thought the drive was not busy, an interrupt
must have snuck through between the command being issued and the dma
being started. I think in my original patch, I had the dma start
outside of the spinlock, that is a bug. The command to the controller
to start the dma must be inside of the spinlock.
I have not looked at 2.4.21-pre4 at all, so I could be entirely off base
here.
Ross
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
2003-02-05 16:56 ` Ross Biro
@ 2003-02-05 17:12 ` Benjamin Herrenschmidt
2003-02-05 17:19 ` Ross Biro
2003-02-06 12:20 ` Stephan von Krawczynski
0 siblings, 2 replies; 27+ messages in thread
From: Benjamin Herrenschmidt @ 2003-02-05 17:12 UTC (permalink / raw)
To: Ross Biro; +Cc: alan, Stephan von Krawczynski, linux-kernel
On Wed, 2003-02-05 at 17:56, Ross Biro wrote:
> Benjamin Herrenschmidt wrote:
>
> >>Okay, I had to watch for it a bit longer and it turns out that the kernel PDC driver has a problem in this shared interrupt setup. When loads get high it seems to run into some timing problem which causes things like:
> >>
> >>Feb 4 01:02:22 admin kernel: hde: dma_intr: status=0xd0 { Busy }
> >>
> >>
> >>
> Since the busy bit is set, we know the drive must have received a
> command. Since dma_intr thought the drive was not busy, an interrupt
> must have snuck through between the command being issued and the dma
> being started. I think in my original patch, I had the dma start
> outside of the spinlock, that is a bug. The command to the controller
> to start the dma must be inside of the spinlock.
While I agree with you here, I don't think it's what's happening.
In ide-disk, do_rw_disk sets up the taskfile, then basically calls
hwif->ide_dma_read/write to start the command.
In ide-dma.c in 2.4.21-pre4, what happens is:
	/* PRD table */
	hwif->OUTL(hwif->dmatable_dma, hwif->dma_prdtable);
	/* specify r/w */
	hwif->OUTB(reading, hwif->dma_command);
	/* read dma_status for INTR & ERROR flags */
	dma_stat = hwif->INB(hwif->dma_status);
	/* clear INTR & ERROR flags */
	hwif->OUTB(dma_stat|6, hwif->dma_status);
	drive->waiting_for_dma = 1;
	if (drive->media != ide_disk)
		return 0;
	.../...
Then issue command byte.
Below we clear the DMA status _and_ set waiting_for_dma to 1.
That means that if an IRQ sneaks in, we will call
drive_is_ready(), which shouldn't return INTR 1 since we
just cleared it. I don't see how a race could happen here,
but I might have missed something.
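
To make the "we just cleared it" step concrete, here is a small model of a
status register whose INTR and ERROR bits are cleared by writing 1 to them,
which is what the OUTB(dma_stat|6, ...) above relies on. It is only a sketch:
fake_status_write() is a hypothetical helper, the write-1-to-clear mask 0x06
is an assumption taken from the "|6" in the excerpt, and real hardware may
treat the remaining bits differently (or, as discussed in this thread, not
follow the spec at all).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed write-1-to-clear bits: ERROR (0x02) and INTR (0x04). */
#define FAKE_W1C_MASK 0x06

/*
 * Models one write to the status register.
 * Bits in FAKE_W1C_MASK are cleared where a 1 is written and kept otherwise;
 * all other bits simply take the written value (a simplification).
 */
uint8_t fake_status_write(uint8_t current, uint8_t written)
{
	uint8_t w1c_bits = (uint8_t)(current & ~(written & FAKE_W1C_MASK));
	uint8_t plain_bits = (uint8_t)(written & ~FAKE_W1C_MASK);

	return (uint8_t)((w1c_bits & FAKE_W1C_MASK) | plain_bits);
}
```

Writing back the value just read, ORed with 6, leaves both flags at 0 in this
model, while a write that carries 0 in those positions leaves a pending flag
untouched.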
Even if, on SMP, the code below executes _simultaneously_
with ide_intr, the latter will check for the handler being
non-NULL before checking waiting_for_dma (drive_is_ready),
and thus will not race since we set the handler after.
The only thing I see is a possible wraparound of
waiting_for_dma. It's a u8, so it wraps at 255. However,
it's incremented in each __ide_dma_test_irq call. So if you
get more than 255 shared (network in your case) interrupts
before the end of the command, you die.
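
The wraparound can be reproduced in a few lines of userspace C. This is only
a sketch of the arithmetic, not the driver: struct fake_drive and the fake_*
helpers are hypothetical names, and uint8_t stands in for the kernel's u8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the drive state; only the counter matters. */
struct fake_drive {
	uint8_t waiting_for_dma;  /* 8-bit, like the u8 discussed above */
};

/* Models one foreign interrupt on the shared line bumping the flag. */
void fake_test_irq_bump(struct fake_drive *drive)
{
	assert(drive != NULL);
	drive->waiting_for_dma++;
}

/* Returns the flag value after n foreign interrupts, starting from 1. */
unsigned fake_flag_after(unsigned n)
{
	struct fake_drive d = { 1 };
	unsigned i;

	for (i = 0; i < n; i++)
		fake_test_irq_bump(&d);
	return d.waiting_for_dma;
}
```

Starting from the value 1 set when the command is prepared, the flag in this
model reads 0 again once enough foreign interrupts have arrived, at which
point a handler that tests it would conclude that no DMA is pending.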
Alan: you can safely remove the waiting_for_dma++, I believe,
in drive_is_ready(). I don't know how that code sneaked into
ide-dma. I do indeed do that in ppc/pmac.c for other reasons
(a sort of timeout condition on the DMA controller that happens
when I get an initial error), but that is totally unrelated
HW on which I know I have no shared IRQ.
Stephan: Can you try editing ide-dma.c, function
__ide_dma_test_irq(), and remove that line:
- drive->waiting_for_dma++;
And tell us if it helps in any way.
Ben.
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
2003-02-05 17:12 ` Benjamin Herrenschmidt
@ 2003-02-05 17:19 ` Ross Biro
2003-02-05 17:34 ` Benjamin Herrenschmidt
2003-02-05 19:10 ` Alan Cox
2003-02-06 12:20 ` Stephan von Krawczynski
1 sibling, 2 replies; 27+ messages in thread
From: Ross Biro @ 2003-02-05 17:19 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: alan, Stephan von Krawczynski, linux-kernel
Benjamin Herrenschmidt wrote:
>While I agree with you here, I don't think it's what's happening.
> /* clear INTR & ERROR flags */
> hwif->OUTB(dma_stat|6, hwif->dma_status);
>
>
>
You have way too much faith in the hardware. Promise is especially known
for not keeping to the spec. I wouldn't trust the interrupt bit to be
valid unless a dma is actually active, i.e. that
hwif->OUTB(hwif->INB(dma_base)|1, dma_base);
has actually been written.
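A minimal sketch of that rule, assuming the usual bus-master IDE layout (start/stop is bit 0 of the command register, INTR is bit 2 of the status register). The helper name and macros are made up for illustration and are not from ide-dma.c:

```c
#include <stdbool.h>
#include <stdint.h>

#define BM_CMD_START 0x01u  /* assumed: bit 0 of the bus-master command register */
#define BM_STAT_INTR 0x04u  /* assumed: bit 2 of the bus-master status register */

/*
 * Only believe the INTR bit when the DMA engine has actually been
 * started; otherwise treat the interrupt as belonging to some other
 * device on the shared line.
 */
bool dma_irq_is_credible(uint8_t dma_cmd, uint8_t dma_stat)
{
	if (!(dma_cmd & BM_CMD_START))
		return false;
	return (dma_stat & BM_STAT_INTR) != 0;
}
```

With the start bit clear, a set INTR bit is ignored; with the start bit set, INTR decides.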
I've actually had a manufacturer tell me that they don't worry about the
spec, just making things work with Windows.
Ross
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
2003-02-05 17:19 ` Ross Biro
@ 2003-02-05 17:34 ` Benjamin Herrenschmidt
2003-02-05 17:38 ` Stephan von Krawczynski
2003-02-05 19:10 ` Alan Cox
1 sibling, 1 reply; 27+ messages in thread
From: Benjamin Herrenschmidt @ 2003-02-05 17:34 UTC (permalink / raw)
To: Ross Biro; +Cc: alan, Stephan von Krawczynski, linux-kernel
On Wed, 2003-02-05 at 18:19, Ross Biro wrote:
> Benjamin Herrenschmidt wrote:
>
> >While I agree with you here, I don't think it's what's happening.
> > /* clear INTR & ERROR flags */
> > hwif->OUTB(dma_stat|6, hwif->dma_status);
> >
> >
> >
> You have way too much faith in the hardware. Promise is especially known
> for not keeping to the spec. I wouldn't trust the interrupt bit to be
> valid unless a dma is actually active, i.e. that
>
> hwif->OUTB(hwif->INB(dma_base)|1, dma_base);
>
> has actually been written.
>
> I've actually had a manufacturer tell me that they don't worry about the
> spec, just making things work with Windows.
Ok, so that gives us 2 possibilities. The above problem, which would be
fixed by locking all around ide_dma_read/write (or rather in the
_caller_, seems better so we don't have to drop the lock for ATAPI).
And a possible wraparound of waiting_for_dma if 255 IRQs come in from
whatever device we share the IRQ line with.
I believe both need fixing...
Ben.
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
2003-02-05 17:34 ` Benjamin Herrenschmidt
@ 2003-02-05 17:38 ` Stephan von Krawczynski
[not found] ` <1044467091.685.155.camel@zion.wanadoo.fr>
2003-02-05 20:00 ` Bryan Andersen
0 siblings, 2 replies; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-05 17:38 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: rossb, alan, linux-kernel
On 05 Feb 2003 18:34:55 +0100
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> On Wed, 2003-02-05 at 18:19, Ross Biro wrote:
> > Benjamin Herrenschmidt wrote:
> >
> > >While I agree with you here, I don't think it's what's happening.
> > > /* clear INTR & ERROR flags */
> > > hwif->OUTB(dma_stat|6, hwif->dma_status);
> > >
> > >
> > >
> > You have way too much faith in the hardware. Promise is especially known
> > for not keeping to the spec. I wouldn't trust the interrupt bit to be
> > valid unless a dma is actually active, i.e. that
> >
> > hwif->OUTB(hwif->INB(dma_base)|1, dma_base);
> >
> > has actually been written.
> >
> > I've actually had a manufacturer tell me that they don't worry about the
> > spec, just making things work with Windows.
>
> Ok, so that gives us 2 possibilities. The above problem, which would be
> fixed by locking all around ide_dma_read/write (or rather in the
> _caller_, seems better so we don't have to drop the lock for ATAPI).
>
> And a possible wraparound of waiting_for_dma if 255 IRQs come in from
> whatever device we share the IRQ line with.
>
> I believe both need fixing...
>
> Ben.
Ok, yet another small brick in the wall: this mb has 64bit/66MHz PCI slots. PDC
is only 32bit/33MHz PCI. So it may well be that others are in fact _able_ to
produce a damn lot more data/interrupts than the PDC. I am pretty astonished by
the number of interrupts created by the 3com tg3 cards anyways...
--
Regards,
Stephan
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
[not found] ` <1044467091.685.155.camel@zion.wanadoo.fr>
@ 2003-02-05 17:58 ` Stephan von Krawczynski
0 siblings, 0 replies; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-05 17:58 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: alan, linux-kernel, rossb
On 05 Feb 2003 18:44:51 +0100
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> On Wed, 2003-02-05 at 18:38, Stephan von Krawczynski wrote:
> > On 05 Feb 2003 18:34:55 +0100
> > Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> >
> > > On Wed, 2003-02-05 at 18:19, Ross Biro wrote:
> > > > Benjamin Herrenschmidt wrote:
> > > >
> > > > >While I agree with you here, I don't think it's what's happening.
> > > > > /* clear INTR & ERROR flags */
> > > > > hwif->OUTB(dma_stat|6, hwif->dma_status);
> > > > >
> > > > >
> > > > >
> > > > You have way too much faith in the hardware. Promise is especially
> > > > known for not keeping to the spec. I wouldn't trust the interrupt bit
> > > > to be valid unless a dma is actually active, i.e. that
> > > >
> > > > hwif->OUTB(hwif->INB(dma_base)|1, dma_base);
> > > >
> > > > has actually been written.
> > > >
> > > > I've actually had a manufacturer tell me that they don't worry about
> > > > the spec, just making things work with Windows.
> > >
> > > Ok, so that gives us 2 possibilities. The above problem, which would be
> > > fixed by locking all around ide_dma_read/write (or rather in the
> > > _caller_, seems better so we don't have to drop the lock for ATAPI).
> > >
> > > And a possible wraparound of waiting_for_dma if 255 IRQs come in from
> > > whatever device we share the IRQ line with.
> > >
> > > > I believe both need fixing...
> > >
> > > Ben.
> >
> > Ok, yet another small brick in the wall: this mb has 64bit/66MHz PCI slots.
> > PDC is only 32bit/33MHz PCI. So it may well be that others are in fact
> > _able_ to produce a damn lot more data/interrupts than the PDC. I am pretty
> > astonished by the number of interrupts created by the 3com tg3 cards
> > anyways...
>
> Ok, then please try my "fix" to remove the increment of waiting_for_dma
> and let us know if it helps.
I will try, in the meantime can any kind soul please give me a hint why I
cannot see the interrupts distributed among the CPUs when enabling smp and
apic on this very same box:
           CPU0       CPU1
  0:      71158          0    IO-APIC-edge  timer
  1:        941          0    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
 12:      33166          0    IO-APIC-edge  PS/2 Mouse
 15:          4          0    IO-APIC-edge  ide1
 17:       1732          0   IO-APIC-level  ide2, ide3
 18:       3423          0   IO-APIC-level  eth0, eth1
 21:       8177          0   IO-APIC-level  eth2
 22:     112943          0   IO-APIC-level  aic7xxx
 23:         16          0   IO-APIC-level  aic7xxx
 25:         74          0   IO-APIC-level  HiSax
 26:          0          0   IO-APIC-level  EMU10K1
NMI:          0          0
LOC:      71085      71059
ERR:          0
MIS:          0
??
(kernel 2.4.21-pre4)
--
Regards,
Stephan
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
2003-02-05 17:19 ` Ross Biro
2003-02-05 17:34 ` Benjamin Herrenschmidt
@ 2003-02-05 19:10 ` Alan Cox
1 sibling, 0 replies; 27+ messages in thread
From: Alan Cox @ 2003-02-05 19:10 UTC (permalink / raw)
To: Ross Biro
Cc: Benjamin Herrenschmidt, alan, Stephan von Krawczynski,
Linux Kernel Mailing List
On Wed, 2003-02-05 at 17:19, Ross Biro wrote:
> I've actually had a manufacturer tell me that they don't worry about the
> spec, just making things work with Windows.
It's very common. As a customer always ask the vendor if they are
compliant to each appropriate standard in writing. If they say yes it
has nice little liability issues should they be lying 8)
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
2003-02-05 17:38 ` Stephan von Krawczynski
[not found] ` <1044467091.685.155.camel@zion.wanadoo.fr>
@ 2003-02-05 20:00 ` Bryan Andersen
1 sibling, 0 replies; 27+ messages in thread
From: Bryan Andersen @ 2003-02-05 20:00 UTC (permalink / raw)
To: Stephan von Krawczynski; +Cc: Benjamin Herrenschmidt, rossb, alan, linux-kernel
> Ok, yet another small brick in the wall: this mb has 64bit/66MHz PCI slots. PDC
> is only 32bit/33MHz PCI. So it may well be that others are in fact _able_ to
> produce a damn lot more data/interrupts than the PDC. I am pretty astonished by
> the number of interrupts created by the 3com tg3 cards anyways...
On my box one of the devices sharing the interrupt with the disk
is the display; the USB is quiet with no devices connected. I am
running 2.4.21-pre4-ac2. I'd have redistributed interrupts but the
motherboard doesn't allow me to specifically set them and APIC isn't
supposed to work on the nForce2 yet.
cat /proc/interrupts
           CPU0
  0:     273068          XT-PIC  timer
  1:       5547          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:     122188          XT-PIC  eth0
 10:     641526          XT-PIC  ide2, ide3, usb-ohci, nvidia
 11:          0          XT-PIC  NVIDIA nForce Audio, usb-ohci
 12:      78237          XT-PIC  PS/2 Mouse
 14:     109475          XT-PIC  ide0
 15:     114178          XT-PIC  ide1
NMI:          0
LOC:     273027
ERR:      18344
MIS:          0
- Bryan
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
2003-02-05 17:12 ` Benjamin Herrenschmidt
2003-02-05 17:19 ` Ross Biro
@ 2003-02-06 12:20 ` Stephan von Krawczynski
2003-02-06 23:04 ` Benjamin Herrenschmidt
1 sibling, 1 reply; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-06 12:20 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: rossb, alan, linux-kernel
On 05 Feb 2003 18:12:31 +0100
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> Stephan: Can you try editing ide-dma.c, function
> __ide_dma_test_irq(), and remove that line:
>
> - drive->waiting_for_dma++;
>
> And tell us if it helps in any way.
>
> Ben.
Hello Ben,
as requested I tried the above "patch" and had no problem so far. Current
situation is:
(ide2, ide3 are PDC, eth2 is tg3)
           CPU0
  0:    6332048          XT-PIC  timer
  1:      14112          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:          0          XT-PIC  EMU10K1
  7:      14950          XT-PIC  HiSax
  9:   30600647          XT-PIC  ide2, ide3, aic7xxx, aic7xxx, eth0, eth1, eth2
 12:     234451          XT-PIC  PS/2 Mouse
 15:          2          XT-PIC  ide1
NMI:          0
LOC:          0
ERR:          0
MIS:          0
I would not say this is a rock-solid test case. I will continue to stress the
setup and keep you informed. Anyway it looks stable up to now.
Regards,
Stephan
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
2003-02-06 12:20 ` Stephan von Krawczynski
@ 2003-02-06 23:04 ` Benjamin Herrenschmidt
2003-02-07 9:10 ` Stephan von Krawczynski
0 siblings, 1 reply; 27+ messages in thread
From: Benjamin Herrenschmidt @ 2003-02-06 23:04 UTC (permalink / raw)
To: Stephan von Krawczynski; +Cc: rossb, alan, linux-kernel
On Thu, 2003-02-06 at 13:20, Stephan von Krawczynski wrote:
> On 05 Feb 2003 18:12:31 +0100
> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
>
>
> > Stephan: Can you try editing ide-dma.c, function
> > __ide_dma_test_irq(), and remove that line:
> >
> > - drive->waiting_for_dma++;
> >
> > And tell us if it helps in any way.
> >
> > Ben.
>
> Hello Ben,
>
> as requested I tried the above "patch" and had no problem so far. Current
> situation is:
> (ide2, ide3 are PDC, eth2 is tg3)
Ok, well, if it's still stable by now, I believe we can safely remove
that line from ide_dma_test_irq(). AFAIK, it really has nothing to do
here.
(I suspect it got copied from ide-pmac somehow... I use it as a counter
in there to implement some timeout when the DMA engine didn't start at
all because the disk issued an error, and on these, I know for sure
the IRQ isn't shared...)
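For what it's worth, a counter used as a timeout like this can be kept from wrapping by making the increment saturate. This is only a hypothetical sketch of that idea: it is not what ide-pmac or the patch in this mail does, and the limit value and function names are invented for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

#define DMA_POLL_LIMIT 32  /* hypothetical timeout threshold */

/* Increment an 8-bit counter but stop at 255 instead of wrapping to 0. */
uint8_t bump_saturating(uint8_t count)
{
	return (count == UINT8_MAX) ? count : (uint8_t)(count + 1);
}

/* Report a timeout once the counter has reached the chosen limit. */
bool dma_start_timed_out(uint8_t count)
{
	return count >= DMA_POLL_LIMIT;
}
```

A counter bumped this way can never fall back to 0, so it cannot be mistaken for "not waiting" however many foreign interrupts arrive.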
Alan, can you include that ?
===== drivers/ide/ide-dma.c 1.10 vs edited =====
--- 1.10/drivers/ide/ide-dma.c Sat Feb 1 20:37:36 2003
+++ edited/drivers/ide/ide-dma.c Fri Feb 7 00:03:43 2003
@@ -826,7 +826,6 @@
 	if (!drive->waiting_for_dma)
 		printk(KERN_WARNING "%s: (%s) called while not waiting\n",
 			drive->name, __FUNCTION__);
-	drive->waiting_for_dma++;
 	return 0;
 }
(Patch against Marcelo's 2.4.21-pre4)
* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
2003-02-06 23:04 ` Benjamin Herrenschmidt
@ 2003-02-07 9:10 ` Stephan von Krawczynski
0 siblings, 0 replies; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-07 9:10 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: rossb, alan, linux-kernel
On 07 Feb 2003 00:04:18 +0100
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> On Thu, 2003-02-06 at 13:20, Stephan von Krawczynski wrote:
> > On 05 Feb 2003 18:12:31 +0100
> > Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> >
> >
> > > Stephan: Can you try editing ide-dma.c, function
> > > __ide_dma_test_irq(), and remove that line:
> > >
> > > - drive->waiting_for_dma++;
> > >
> > > And tell us if it helps in any way.
> > >
> > > Ben.
> >
> > Hello Ben,
> >
> > as requested I tried the above "patch" and had no problem so far. Current
> > situation is:
> > (ide2, ide3 are PDC, eth2 is tg3)
>
> Ok, well, if it's still stable by now, I believe we can safely remove
> that line from ide_dma_test_irq(). AFAIK, it really has nothing to do
> here.
Hello all,
it is still working ok, currently we are at:
           CPU0
  0:   13848205          XT-PIC  timer
  1:      54117          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:          0          XT-PIC  EMU10K1
  7:      27260          XT-PIC  HiSax
  9:   67048861          XT-PIC  ide2, ide3, aic7xxx, aic7xxx, eth0, eth1, eth2
 12:     765541          XT-PIC  PS/2 Mouse
 15:        229          XT-PIC  ide1
NMI:          0
LOC:          0
ERR:          0
MIS:          0
Regards,
Stephan
Thread overview: 27+ messages
2003-02-02 15:18 2.4.21-pre4: tg3 driver problems with shared interrupts Stephan von Krawczynski
2003-02-02 16:49 ` Jeff Garzik
2003-02-02 17:09 ` Stephan von Krawczynski
2003-02-02 17:15 ` Jeff Garzik
2003-02-02 17:52 ` Stephan von Krawczynski
2003-02-02 18:28 ` Jeff Garzik
2003-02-02 18:31 ` Stephan von Krawczynski
2003-02-03 10:25 ` Stephan von Krawczynski
2003-02-05 9:48 ` 2.4.21-pre4: PDC ide " Stephan von Krawczynski
2003-02-05 11:16 ` Benjamin Herrenschmidt
2003-02-05 11:39 ` Stephan von Krawczynski
2003-02-05 12:21 ` Alan Cox
2003-02-05 12:22 ` Benjamin Herrenschmidt
2003-02-05 12:50 ` Alan Cox
2003-02-05 13:19 ` Stephan von Krawczynski
2003-02-05 12:24 ` Alan Cox
2003-02-05 16:56 ` Ross Biro
2003-02-05 17:12 ` Benjamin Herrenschmidt
2003-02-05 17:19 ` Ross Biro
2003-02-05 17:34 ` Benjamin Herrenschmidt
2003-02-05 17:38 ` Stephan von Krawczynski
[not found] ` <1044467091.685.155.camel@zion.wanadoo.fr>
2003-02-05 17:58 ` Stephan von Krawczynski
2003-02-05 20:00 ` Bryan Andersen
2003-02-05 19:10 ` Alan Cox
2003-02-06 12:20 ` Stephan von Krawczynski
2003-02-06 23:04 ` Benjamin Herrenschmidt
2003-02-07 9:10 ` Stephan von Krawczynski