* 2.4.21-pre4: tg3 driver problems with shared interrupts
@ 2003-02-02 15:18 Stephan von Krawczynski
  2003-02-02 16:49 ` Jeff Garzik
  0 siblings, 1 reply; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-02 15:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: davem, jgarzik

Hello Dave, Jeff, all

I just started experiments with a new setup consisting of:

00:00.0 Host bridge: ServerWorks CNB20HE Host Bridge (rev 23)
00:00.1 Host bridge: ServerWorks CNB20HE Host Bridge (rev 01)
00:00.2 Host bridge: ServerWorks: Unknown device 0006 (rev 01)
00:00.3 Host bridge: ServerWorks: Unknown device 0006 (rev 01)
00:02.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 0d)
00:03.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 0d)
00:04.0 Network controller: Elsa AG QuickStep 1000 (rev 01)
00:05.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 07)
00:05.1 Input device controller: Creative Labs SB Live! MIDI/Game Port (rev 07)
00:07.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
00:0f.0 ISA bridge: ServerWorks CSB5 South Bridge (rev 93)
00:0f.1 IDE interface: ServerWorks CSB5 IDE Controller (rev 93)
00:0f.2 USB Controller: ServerWorks OSB4/CSB5 USB Controller (rev 05)
00:0f.3 Host bridge: ServerWorks: Unknown device 0225
01:02.0 Unknown mass storage controller: Promise Technology, Inc. 20268 (rev 01)
01:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5701 Gigabit Ethernet (rev 15)
02:02.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5701 Gigabit Ethernet (rev 15)
02:03.0 SCSI storage controller: Adaptec AIC-7899P U160/m (rev 01)
02:03.1 SCSI storage controller: Adaptec AIC-7899P U160/m (rev 01)

I found out within minutes that this setup does not survive if you let the
Broadcom cards share interrupts with anything else. It works ok now like
this (eth2 is tg3):

           CPU0       
  0:     343269          XT-PIC  timer
  1:       6804          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:      37952          XT-PIC  EMU10K1
  7:        515          XT-PIC  HiSax
  9:     711212          XT-PIC  aic7xxx, aic7xxx, eth0, eth1
 10:    4710570          XT-PIC  eth2
 11:     639316          XT-PIC  ide2, ide3
 12:     107821          XT-PIC  PS/2 Mouse
 15:      69222          XT-PIC  ide1
NMI:          0 
LOC:          0 
ERR:          0
MIS:          0

But horribly failed in such a setup:

  0:     XT-PIC  timer
  1:     XT-PIC  keyboard
  2:     XT-PIC  cascade
  5:     XT-PIC  EMU10K1
  7:     XT-PIC  HiSax
  9:     XT-PIC  eth2, aic7xxx, aic7xxx, eth0, eth1, ide2, ide3
 12:     XT-PIC  PS/2 Mouse
 15:     XT-PIC  ide1

I cannot even produce a "cat /proc/interrupts" for this because I am not fast
enough at login (the network at eth2 is heavily loaded). I sometimes read about
problems here with tg3-drivers, and I just wanted to point you to the shared
case, maybe it has to do with this special case rather than with the driver's
internals.
(PS: it's not the ide2-3, I checked that out)

-- 
Regards,
Stephan


* Re: 2.4.21-pre4: tg3 driver problems with shared interrupts
  2003-02-02 15:18 2.4.21-pre4: tg3 driver problems with shared interrupts Stephan von Krawczynski
@ 2003-02-02 16:49 ` Jeff Garzik
  2003-02-02 17:09   ` Stephan von Krawczynski
  2003-02-02 17:52   ` Stephan von Krawczynski
  0 siblings, 2 replies; 27+ messages in thread
From: Jeff Garzik @ 2003-02-02 16:49 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel, davem

[-- Attachment #1: Type: text/plain, Size: 969 bytes --]

Stephan von Krawczynski wrote:
> I found out within minutes that this setup does not survive if you let the
> Broadcom cards share interrupts with anything else. It works ok now like
> this (eth2 is tg3):

[...]
> But horribly failed in such a setup:

[...]
> I cannot even produce a "cat /proc/interrupts" for this because I am not fast
> enough at login (the network at eth2 is heavily loaded). I sometimes read about
> problems here with tg3-drivers, and I just wanted to point you to the shared
> case, maybe it has to do with this special case rather than with the driver's
> internals.
> (PS: it's not the ide2-3, I checked that out)



hmmm.  I've attached the latest tg3, version 1.4, which I just sent off 
to Marcelo.  It includes some fixes that may affect your 5701.

Can you try two things?

1) 2.4.21-pre4 + tg3 v1.4
2) 2.4.20 + tg3 v1.4

I'm interested to know if you can reproduce with the latest driver in 
either of these two scenarios...

	Jeff



[-- Attachment #2: tg3.tar.bz2 --]
[-- Type: application/x-bzip2, Size: 49437 bytes --]


* Re: 2.4.21-pre4: tg3 driver problems with shared interrupts
  2003-02-02 16:49 ` Jeff Garzik
@ 2003-02-02 17:09   ` Stephan von Krawczynski
  2003-02-02 17:15     ` Jeff Garzik
  2003-02-02 17:52   ` Stephan von Krawczynski
  1 sibling, 1 reply; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-02 17:09 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: linux-kernel, davem

On Sun, 02 Feb 2003 11:49:12 -0500
Jeff Garzik <jgarzik@pobox.com> wrote:

> Stephan von Krawczynski wrote:
> > I found out within minutes that this setup does not survive if you let the
> > Broadcom cards share interrupts with anything else. It works ok now like
> > this (eth2 is tg3):
> 
> [...]
> > But horribly failed in such a setup:
> 
> [...]
> > I cannot even produce a "cat /proc/interrupts" for this because I am not fast
> > enough at login (the network at eth2 is heavily loaded). I sometimes read about
> > problems here with tg3-drivers, and I just wanted to point you to the shared
> > case, maybe it has to do with this special case rather than with the driver's
> > internals.
> > (PS: it's not the ide2-3, I checked that out)
> 
> 
> 
> hmmm.  I've attached the latest tg3, version 1.4, which I just sent off 
> to Marcelo.  It includes some fixes that may affect your 5701.

Sorry, putting it into pre4 results in:

gcc -D__KERNEL__ -I/usr/src/linux-2.4.21-pre4-patch/include -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fno-strict-aliasing -fno-common -fomit-frame-pointer -pipe -mpreferred-stack-boundary=2 -march=i686   -nostdinc -iwithprefix include -DKBUILD_BASENAME=tg3  -c -o tg3.o tg3.c
tg3.c:140: `PCI_DEVICE_ID_TIGON3_5704S' undeclared here (not in a function)
tg3.c:140: initializer element is not constant
tg3.c:140: (near initialization for `tg3_pci_tbl[8].device')
tg3.c:141: initializer element is not constant
tg3.c:141: (near initialization for `tg3_pci_tbl[8]')
tg3.c:142: `PCI_DEVICE_ID_TIGON3_5702A3' undeclared here (not in a function)
tg3.c:142: initializer element is not constant
tg3.c:142: (near initialization for `tg3_pci_tbl[9].device')
tg3.c:143: initializer element is not constant
tg3.c:143: (near initialization for `tg3_pci_tbl[9]')
tg3.c:144: `PCI_DEVICE_ID_TIGON3_5703A3' undeclared here (not in a function)
tg3.c:144: initializer element is not constant
tg3.c:144: (near initialization for `tg3_pci_tbl[10].device')
tg3.c:145: initializer element is not constant
tg3.c:145: (near initialization for `tg3_pci_tbl[10]')
tg3.c:147: initializer element is not constant
tg3.c:147: (near initialization for `tg3_pci_tbl[11]')
tg3.c:149: initializer element is not constant
tg3.c:149: (near initialization for `tg3_pci_tbl[12]')
tg3.c:151: initializer element is not constant
tg3.c:151: (near initialization for `tg3_pci_tbl[13]')
tg3.c:152: initializer element is not constant
tg3.c:152: (near initialization for `tg3_pci_tbl[14]')
make[3]: *** [tg3.o] Error 1

Shall I patch it, or will you?

-- 
Regards,
Stephan


* Re: 2.4.21-pre4: tg3 driver problems with shared interrupts
  2003-02-02 17:09   ` Stephan von Krawczynski
@ 2003-02-02 17:15     ` Jeff Garzik
  0 siblings, 0 replies; 27+ messages in thread
From: Jeff Garzik @ 2003-02-02 17:15 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel, davem, manish

[-- Attachment #1: Type: text/plain, Size: 499 bytes --]

Stephan von Krawczynski wrote:
> Sorry, putting it into pre4 results in:
> 
> gcc -D__KERNEL__ -I/usr/src/linux-2.4.21-pre4-patch/include -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fno-strict-aliasing -fno-common -fomit-frame-pointer -pipe -mpreferred-stack-boundary=2 -march=i686   -nostdinc -iwithprefix include -DKBUILD_BASENAME=tg3  -c -o tg3.o tg3.c
> tg3.c:140: `PCI_DEVICE_ID_TIGON3_5704S' undeclared here (not in a function)

Sorry! You need an updated include/linux/pci_ids.h (attached).


[-- Attachment #2: tg3.tar.bz2 --]
[-- Type: application/x-bzip2, Size: 62620 bytes --]


* Re: 2.4.21-pre4: tg3 driver problems with shared interrupts
  2003-02-02 16:49 ` Jeff Garzik
  2003-02-02 17:09   ` Stephan von Krawczynski
@ 2003-02-02 17:52   ` Stephan von Krawczynski
  2003-02-02 18:28     ` Jeff Garzik
  1 sibling, 1 reply; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-02 17:52 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: linux-kernel, davem

On Sun, 02 Feb 2003 11:49:12 -0500
Jeff Garzik <jgarzik@pobox.com> wrote:

> Stephan von Krawczynski wrote:
> > I found out within minutes that this setup does not survive if you let the
> > Broadcom cards share interrupts with anything else. It works ok now like
> > this (eth2 is tg3):
> 
> [...]
> > But horribly failed in such a setup:
> 
> [...]
> > I cannot even produce a "cat /proc/interrupts" for this because I am not
> > fast enough at login (the network at eth2 is heavily loaded). I sometimes
> > read about problems here with tg3-drivers, and I just wanted to point you
> > to the shared case, maybe it has to do with this special case rather than
> > with the driver's internals.
> > (PS: it's not the ide2-3, I checked that out)
> 
> 
> 
> hmmm.  I've attached the latest tg3, version 1.4, which I just sent off 
> to Marcelo.  It includes some fixes that may affect your 5701.
> 
> Can you try two things?
> 
> 1) 2.4.21-pre4 + tg3 v1.4

Ok. With the latest version you sent I got it compiled and working. As far as I
can tell from short tests it looks good (eth2 is tg3):

           CPU0       
  0:      79344          XT-PIC  timer
  1:       2428          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:          0          XT-PIC  EMU10K1
  7:         81          XT-PIC  HiSax
  9:     387286          XT-PIC  aic7xxx, aic7xxx, eth0, eth1, eth2
 11:      75780          XT-PIC  ide2, ide3
 12:      17740          XT-PIC  PS/2 Mouse
 15:          2          XT-PIC  ide1
NMI:          0 
LOC:          0 
ERR:          0
MIS:          0

To make sure I will let it stress-test overnight and send you the results in
about 15 hours from now on. If everything does fine I will redo with ide2,ide3
on same interrupt, too. Just to see what happens with these Promise things...

-- 
Thanks a lot, I'll be back,
Stephan


* Re: 2.4.21-pre4: tg3 driver problems with shared interrupts
  2003-02-02 17:52   ` Stephan von Krawczynski
@ 2003-02-02 18:28     ` Jeff Garzik
  2003-02-02 18:31       ` Stephan von Krawczynski
                         ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Jeff Garzik @ 2003-02-02 18:28 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel, davem

Stephan von Krawczynski wrote:
> On Sun, 02 Feb 2003 11:49:12 -0500
> Jeff Garzik <jgarzik@pobox.com> wrote:
>>Can you try two things?
>>
>>1) 2.4.21-pre4 + tg3 v1.4
> 
> 
> Ok. With the latest version you sent I got it compiled and working. As far as I
> can tell from short tests it looks good (eth2 is tg3):

cool beans, thanks!


> To make sure I will let it stress-test overnight and send you the results in
> about 15 hours from now on. If everything does fine I will redo with ide2,ide3
> on same interrupt, too. Just to see what happens with these Promise things...


great.

though who knows with the Promise stuff... :)  I hope it's not their 
binary-only junk...

	Jeff






* Re: 2.4.21-pre4: tg3 driver problems with shared interrupts
  2003-02-02 18:28     ` Jeff Garzik
@ 2003-02-02 18:31       ` Stephan von Krawczynski
  2003-02-03 10:25       ` Stephan von Krawczynski
  2003-02-05  9:48       ` 2.4.21-pre4: PDC ide " Stephan von Krawczynski
  2 siblings, 0 replies; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-02 18:31 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: linux-kernel, davem

On Sun, 02 Feb 2003 13:28:55 -0500
Jeff Garzik <jgarzik@pobox.com> wrote:

> Stephan von Krawczynski wrote:
> > On Sun, 02 Feb 2003 11:49:12 -0500
> > Jeff Garzik <jgarzik@pobox.com> wrote:
> >>Can you try two things?
> >>
> >>1) 2.4.21-pre4 + tg3 v1.4
> > 
> > 
> > Ok. With the latest version you sent I got it compiled and working. As far as I
> > can tell from short tests it looks good (eth2 is tg3):
> 
> cool beans, thanks!
> 
> 
> > To make sure I will let it stress-test overnight and send you the results in
> > about 15 hours from now on. If everything does fine I will redo with ide2,ide3
> > on same interrupt, too. Just to see what happens with these Promise things...
> 
> 
> great.
> 
> though who knows with the Promise stuff... :)  I hope it's not their 
> binary-only junk...

I am using "PROMISE PDC202{68|69|70|71|75|76|77} support" from standard kernel.

-- 
Regards,
Stephan



* Re: 2.4.21-pre4: tg3 driver problems with shared interrupts
  2003-02-02 18:28     ` Jeff Garzik
  2003-02-02 18:31       ` Stephan von Krawczynski
@ 2003-02-03 10:25       ` Stephan von Krawczynski
  2003-02-05  9:48       ` 2.4.21-pre4: PDC ide " Stephan von Krawczynski
  2 siblings, 0 replies; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-03 10:25 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: linux-kernel, davem

On Sun, 02 Feb 2003 13:28:55 -0500
Jeff Garzik <jgarzik@pobox.com> wrote:

> > To make sure I will let it stress-test overnight and send you the results
> > in about 15 hours from now on. If everything does fine I will redo with
> > ide2,ide3 on same interrupt, too. Just to see what happens with these
> > Promise things...

Hi Jeff,

I can tell you for sure now that your tg3 driver does very well in shared
interrupt config:
(eth2 is tg3)

           CPU0       
  0:    6052783          XT-PIC  timer
  1:       8618          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:        581          XT-PIC  EMU10K1
  7:       6842          XT-PIC  HiSax
  9:   30245790          XT-PIC  aic7xxx, aic7xxx, eth0, eth1, eth2
 11:    7324397          XT-PIC  ide2, ide3
 12:     196375          XT-PIC  PS/2 Mouse
 15:          2          XT-PIC  ide1
NMI:          0 
LOC:          0 
ERR:         52
MIS:          0


Though I don't know exactly what "ERR:" means in this context, the machine is
alive and performing well.
Now on to the Promise stuff again ...

-- 
Regards,
Stephan


* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-02 18:28     ` Jeff Garzik
  2003-02-02 18:31       ` Stephan von Krawczynski
  2003-02-03 10:25       ` Stephan von Krawczynski
@ 2003-02-05  9:48       ` Stephan von Krawczynski
  2003-02-05 11:16         ` Benjamin Herrenschmidt
  2 siblings, 1 reply; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-05  9:48 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: linux-kernel, davem

On Sun, 02 Feb 2003 13:28:55 -0500
Jeff Garzik <jgarzik@pobox.com> wrote:

> > To make sure I will let it stress-test overnight and send you the results in
> > about 15 hours from now on. If everything does fine I will redo with ide2,ide3
> > on same interrupt, too. Just to see what happens with these Promise things...
> 
> 
> great.
> 
> though who knows with the Promise stuff... :)  I hope it's not their 
> binary-only junk...
> 
> 	Jeff

Okay, I had to watch for it a bit longer and it turns out that the kernel PDC driver has a problem in this shared interrupt setup. When loads get high it seems to run into some timing problem which causes things like:

Feb  4 01:02:22 admin kernel: hde: dma_intr: status=0xd0 { Busy }
Feb  4 01:02:22 admin kernel: 
Feb  4 01:02:22 admin kernel: PDC202XX: Primary channel reset.
Feb  4 01:02:22 admin kernel: ide2: reset: success
Feb  4 01:02:23 admin kernel: hde: status error: status=0x58 { DriveReady SeekComplete DataRequest }
Feb  4 01:02:23 admin kernel: 
Feb  4 01:02:23 admin kernel: hde: drive not ready for command
Feb  4 01:02:23 admin kernel: hde: status error: status=0x50 { DriveReady SeekComplete }
Feb  4 01:02:23 admin kernel: 
Feb  4 01:02:23 admin kernel: hde: no DRQ after issuing WRITE
Feb  4 01:02:23 admin kernel: hde: status timeout: status=0xd0 { Busy }
Feb  4 01:02:23 admin kernel: 

The result is that the drive itself just hangs and has to be powered off (resetting the box does not work). The drive worked (and still works) fine in a non-shared-interrupt setup. The controller is:

<6>PDC20268: IDE controller at PCI slot 01:02.0
<6>PCI: Found IRQ 11 for device 01:02.0
<6>PDC20268: chipset revision 1
<6>PDC20268: not 100%% native mode: will probe irqs later
<6>    ide2: BM-DMA at 0x7400-0x7407, BIOS settings: hde:pio, hdf:pio
<6>    ide3: BM-DMA at 0x7408-0x740f, BIOS settings: hdg:pio, hdh:pio
<4>hdc: AOPEN CD-RW CRW2440, ATAPI CD/DVD-ROM drive
<4>hde: ST380021A, ATA DISK drive
<4>blk: queue c034e1f8, I/O limit 4095Mb (mask 0xffffffff)
<4>hdg: IC35L060AVER07-0, ATA DISK drive
<4>blk: queue c034e664, I/O limit 4095Mb (mask 0xffffffff)
<4>ide1 at 0x170-0x177,0x376 on irq 15
<4>ide2 at 0x8800-0x8807,0x8402 on irq 11
<4>ide3 at 0x8000-0x8007,0x7802 on irq 11
<4>hde: host protected area => 1
<6>hde: 156301488 sectors (80026 MB) w/2048KiB Cache, CHS=155061/16/63, UDMA(100)
<4>hdg: host protected area => 1
<6>hdg: 120103200 sectors (61493 MB) w/1916KiB Cache, CHS=119150/16/63, UDMA(100)
<6>Partition check:
<6> hde: hde1
<6> hdg:<6> [PTBL] [7476/255/63] hdg1

Regards,
Stephan

PS: tg3 does great! Good job, Jeff...


* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-05  9:48       ` 2.4.21-pre4: PDC ide " Stephan von Krawczynski
@ 2003-02-05 11:16         ` Benjamin Herrenschmidt
  2003-02-05 11:39           ` Stephan von Krawczynski
                             ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Benjamin Herrenschmidt @ 2003-02-05 11:16 UTC (permalink / raw)
  To: alan; +Cc: Stephan von Krawczynski, linux-kernel


> Okay, I had to watch for it a bit longer and it turns out that the kernel PDC driver has a problem in this shared interrupt setup. When loads get high it seems to run into some timing problem which causes things like:
> 
> Feb  4 01:02:22 admin kernel: hde: dma_intr: status=0xd0 { Busy }
> Feb  4 01:02:22 admin kernel: 
> Feb  4 01:02:22 admin kernel: PDC202XX: Primary channel reset.
> Feb  4 01:02:22 admin kernel: ide2: reset: success
> Feb  4 01:02:23 admin kernel: hde: status error: status=0x58 { DriveReady SeekComplete DataRequest }
> Feb  4 01:02:23 admin kernel: 
> Feb  4 01:02:23 admin kernel: hde: drive not ready for command
> Feb  4 01:02:23 admin kernel: hde: status error: status=0x50 { DriveReady SeekComplete }
> Feb  4 01:02:23 admin kernel: 
> Feb  4 01:02:23 admin kernel: hde: no DRQ after issuing WRITE
> Feb  4 01:02:23 admin kernel: hde: status timeout: status=0xd0 { Busy }
> Feb  4 01:02:23 admin kernel: 

Hi Alan !

I'm trying to make sense of the above report, as it seems to match
a problem a user reported to me as well. It's interesting because it's
apparently running UP, so it's not the SMP race found by Ross. It's
definitely a problem with shared interrupts, though.

I've followed the code path involved here. Basically, we had
ide_dma_intr() called while the drive was busy. This is strange for a
simple reason: Even if we got the wrong interrupt (shared interrupt),
__ide_dma_test_irq() should have returned 0, and so ide_dma_intr
shouldn't have been called.
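
The check described above can be modeled in plain C. This is only an
illustrative sketch, not the kernel code: struct fake_hwif, fake_test_irq()
and fake_shared_irq() are hypothetical names, and the register is simulated
in memory. The one detail taken from real hardware is that the interrupt flag
is bit 2 (0x04) of the bus-master status register.

```c
#include <assert.h>
#include <stdint.h>

#define DMA_STAT_INTR 0x04  /* interrupt bit of the bus-master status register */

/* Hypothetical stand-in for one IDE channel; not the kernel's ide_hwif_t. */
struct fake_hwif {
	uint8_t dma_status;     /* simulated BM-DMA status register */
	int     handler_calls;  /* how often the DMA handler ran */
};

/* Returns 1 only if this channel's own INTR bit is set. */
static int fake_test_irq(const struct fake_hwif *hwif)
{
	assert(hwif);
	return (hwif->dma_status & DMA_STAT_INTR) ? 1 : 0;
}

/*
 * Called for every interrupt on the shared line.
 * Returns 1 if the interrupt was ours and was handled, 0 otherwise.
 */
static int fake_shared_irq(struct fake_hwif *hwif)
{
	if (!fake_test_irq(hwif))
		return 0;  /* another device on the line raised it: ignore */

	/* Real hardware clears this bit by writing 1 to it; the model just clears it. */
	hwif->dma_status &= (uint8_t)~DMA_STAT_INTR;
	hwif->handler_calls++;
	return 1;
}
```

With the INTR bit clear, an interrupt from another device on the same line
leaves handler_calls untouched, which is why a dma_intr report with the drive
still busy is surprising.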

Assuming the driver was doing basic read/write operations, I checked
the code, and while it seems that do_rw_disk() is called with
neither the lock held nor interrupts masked, I see no obvious race.
drive->waiting_for_dma is set before setting up the handler and
issuing the command, and while the DMA engine is indeed started
only after sending the command, its INTR bit has been cleared
previously (I suppose it can't be stale, or can it while the DMA
hasn't been started yet? In that case we would need to take
the lock here).

> The result is that the drive itself just hangs and has to be powered off (resetting the box does not work). The drive worked (and still works) fine in a non-shared-interrupt setup. The controller is:
> 
> <6>PDC20268: IDE controller at PCI slot 01:02.0
> <6>PCI: Found IRQ 11 for device 01:02.0
> <6>PDC20268: chipset revision 1
> <6>PDC20268: not 100%% native mode: will probe irqs later
> <6>    ide2: BM-DMA at 0x7400-0x7407, BIOS settings: hde:pio, hdf:pio
> <6>    ide3: BM-DMA at 0x7408-0x740f, BIOS settings: hdg:pio, hdh:pio
> <4>hdc: AOPEN CD-RW CRW2440, ATAPI CD/DVD-ROM drive
> <4>hde: ST380021A, ATA DISK drive
> <4>blk: queue c034e1f8, I/O limit 4095Mb (mask 0xffffffff)
> <4>hdg: IC35L060AVER07-0, ATA DISK drive
> <4>blk: queue c034e664, I/O limit 4095Mb (mask 0xffffffff)
> <4>ide1 at 0x170-0x177,0x376 on irq 15
> <4>ide2 at 0x8800-0x8807,0x8402 on irq 11
> <4>ide3 at 0x8000-0x8007,0x7802 on irq 11
> <4>hde: host protected area => 1
> <6>hde: 156301488 sectors (80026 MB) w/2048KiB Cache, CHS=155061/16/63, UDMA(100)
> <4>hdg: host protected area => 1
> <6>hdg: 120103200 sectors (61493 MB) w/1916KiB Cache, CHS=119150/16/63, UDMA(100)
> <6>Partition check:
> <6> hde: hde1
> <6> hdg:<6> [PTBL] [7476/255/63] hdg1
> 
> Regards,
> Stephan
> 
> PS: tg3 does great! Good job, Jeff...
-- 
Benjamin Herrenschmidt <benh@kernel.crashing.org>


* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-05 11:16         ` Benjamin Herrenschmidt
@ 2003-02-05 11:39           ` Stephan von Krawczynski
  2003-02-05 12:21             ` Alan Cox
  2003-02-05 12:22             ` Benjamin Herrenschmidt
  2003-02-05 12:24           ` Alan Cox
  2003-02-05 16:56           ` Ross Biro
  2 siblings, 2 replies; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-05 11:39 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: alan, linux-kernel

On 05 Feb 2003 12:16:02 +0100
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> 
> > Okay, I had to watch for it a bit longer and it turns out that the kernel
> > PDC driver has a problem in this shared interrupt setup. When loads get
> > high it seems to run into some timing problem which causes things like:
> > 
> > Feb  4 01:02:22 admin kernel: hde: dma_intr: status=0xd0 { Busy }
> > Feb  4 01:02:22 admin kernel: 
> > Feb  4 01:02:22 admin kernel: PDC202XX: Primary channel reset.
> > Feb  4 01:02:22 admin kernel: ide2: reset: success
> > Feb  4 01:02:23 admin kernel: hde: status error: status=0x58 { DriveReady SeekComplete DataRequest }
> > Feb  4 01:02:23 admin kernel: 
> > Feb  4 01:02:23 admin kernel: hde: drive not ready for command
> > Feb  4 01:02:23 admin kernel: hde: status error: status=0x50 { DriveReady SeekComplete }
> > Feb  4 01:02:23 admin kernel: 
> > Feb  4 01:02:23 admin kernel: hde: no DRQ after issuing WRITE
> > Feb  4 01:02:23 admin kernel: hde: status timeout: status=0xd0 { Busy }
> > Feb  4 01:02:23 admin kernel: 
> 
> Hi Alan !
> 
> I'm trying to make sense of the above report, as it seems to match
> a problem a user reported to me as well. It's interesting because it's
> apparently running UP, so it's not the SMP race found by Ross. It's
> definitely a problem with shared interrupts, though.

Hello Benjamin,

I have to give a short note on this one:
the system is indeed currently running with a single CPU, _but_ since it is a
dual-mb the kernel is already compiled for SMP. It is started with "nosmp"
option though. I wanted to mention this not knowing if it is important for the
codepath.

-- 
Regards,
Stephan


* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-05 11:39           ` Stephan von Krawczynski
@ 2003-02-05 12:21             ` Alan Cox
  2003-02-05 12:22             ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 27+ messages in thread
From: Alan Cox @ 2003-02-05 12:21 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: Benjamin Herrenschmidt, alan, linux-kernel

> > a problem a user reported to me as well. It's interesting because it's
> > apparently running UP, so it's not the SMP race found by Ross. It's
> > definitely a problem with shared interrupts, though.

The race Ross found can bite you on a uniprocessor box as well, I think.
It just needs the irq to hit in the right spot. The more interesting
question is how the current -ac behaves under the same treatment.


* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-05 11:39           ` Stephan von Krawczynski
  2003-02-05 12:21             ` Alan Cox
@ 2003-02-05 12:22             ` Benjamin Herrenschmidt
  2003-02-05 12:50               ` Alan Cox
  2003-02-05 13:19               ` Stephan von Krawczynski
  1 sibling, 2 replies; 27+ messages in thread
From: Benjamin Herrenschmidt @ 2003-02-05 12:22 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: alan, linux-kernel


> I have to give a short note on this one:
> the system is indeed currently running with a single CPU, _but_ since it is a
> dual-mb the kernel is already compiled for SMP. It is started with "nosmp"
> option though. I wanted to mention this not knowing if it is important for the
> codepath.

Shouldn't be an issue. I suppose you don't use fancy stuff like preempt or
IDE taskfile IO, right ?

Ben.



* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-05 11:16         ` Benjamin Herrenschmidt
  2003-02-05 11:39           ` Stephan von Krawczynski
@ 2003-02-05 12:24           ` Alan Cox
  2003-02-05 16:56           ` Ross Biro
  2 siblings, 0 replies; 27+ messages in thread
From: Alan Cox @ 2003-02-05 12:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: alan, Stephan von Krawczynski, linux-kernel

> ide_dma_intr() called while the drive was busy. This is strange for a
> simple reason: Even if we got the wrong interrupt (shared interrupt),
> __ide_dma_test_irq() should have returned 0, and so ide_dma_intr
> shouldn't have been called.

Ok the other mail makes more sense now 8)

> drive->waiting_for_dma is set before setting up the handler and
> issuing the command, and while the DMA engine is indeed started
> only after sending the command, its INTR bit has been cleared
> previously (I suppose it can't be stale, or can it while the DMA
> hasn't been started yet? In that case we would need to take
> the lock here).

I'd have to go digging. I think that can occur however.

Andre ?


* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-05 12:22             ` Benjamin Herrenschmidt
@ 2003-02-05 12:50               ` Alan Cox
  2003-02-05 13:19               ` Stephan von Krawczynski
  1 sibling, 0 replies; 27+ messages in thread
From: Alan Cox @ 2003-02-05 12:50 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Stephan von Krawczynski, alan, linux-kernel

> > dual-mb the kernel is already compiled for SMP. It is started with "nosmp"
> > option though. I wanted to mention this not knowing if it is important for the
> > codepath.
> 
> Shouldn't be an issue. I suppose you don't use fancy stuff like preempt or
> IDE taskfile IO, right ?

IDE taskfile I/O is disabled. Pre-empt and 2.4 IDE don't work together at
all yet, and probably never will (see the /proc code for why it's basically
unfixable in 2.4).


* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-05 12:22             ` Benjamin Herrenschmidt
  2003-02-05 12:50               ` Alan Cox
@ 2003-02-05 13:19               ` Stephan von Krawczynski
  1 sibling, 0 replies; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-05 13:19 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: alan, linux-kernel

On 05 Feb 2003 13:22:19 +0100
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> 
> > I have to give a short note on this one:
> > the system is indeed currently running with a single CPU, _but_ since it is a
> > dual-mb the kernel is already compiled for SMP. It is started with "nosmp"
> > option though. I wanted to mention this not knowing if it is important for the
> > codepath.
> 
> Shouldn't be an issue. I suppose you don't use fancy stuff like preempt or
> IDE taskfile IO, right ?

No, not at all. Pure and simple filesystem-I/O (reiserfs).

-- 
Regards,
Stephan


* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-05 11:16         ` Benjamin Herrenschmidt
  2003-02-05 11:39           ` Stephan von Krawczynski
  2003-02-05 12:24           ` Alan Cox
@ 2003-02-05 16:56           ` Ross Biro
  2003-02-05 17:12             ` Benjamin Herrenschmidt
  2 siblings, 1 reply; 27+ messages in thread
From: Ross Biro @ 2003-02-05 16:56 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: alan, Stephan von Krawczynski, linux-kernel

Benjamin Herrenschmidt wrote:

>>Okay, I had to watch for it a bit longer and it turns out that the kernel PDC driver has a problem in this shared interrupt setup. When loads get high it seems to run into some timing problem which causes things like:
>>
>>Feb  4 01:02:22 admin kernel: hde: dma_intr: status=0xd0 { Busy }
>>
>>    
>>
Since the busy bit is set, we know the drive must have received a 
command.  Since dma_intr thought the drive was not busy, an interrupt 
must have snuck through between the command being issued and the dma 
being started.  I think in my original patch, I had the dma start 
outside of the spinlock, that is a bug.  The command to the controller 
to start the dma must be inside of the spinlock.
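
The window described above can be made concrete with a small single-threaded
model. It is a sketch under stated assumptions, not the IDE driver: struct
fake_drive, fake_irq() and start_request() are hypothetical, and the "lock"
is just a flag standing in for a spinlock taken with interrupts masked, so an
interrupt is only delivered while the flag is clear.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one drive; none of these fields are kernel names. */
struct fake_drive {
	bool locked;          /* models a spinlock held with IRQs masked */
	bool command_issued;  /* drive has the command, so BUSY is set */
	bool dma_started;     /* controller was told to start the DMA */
	bool saw_bad_state;   /* handler ran with BUSY set but DMA not started */
};

/* Models an interrupt on the shared line; delivered only if not masked. */
static void fake_irq(struct fake_drive *d)
{
	assert(d);
	if (d->locked)
		return;
	if (d->command_issued && !d->dma_started)
		d->saw_bad_state = true;
}

/*
 * Issues a command and starts the DMA, with an interrupt attempted at each
 * step. Returns true if the handler ever observed the inconsistent state.
 */
static bool start_request(bool dma_inside_lock)
{
	struct fake_drive d = { false, false, false, false };

	d.locked = true;
	d.command_issued = true;
	fake_irq(&d);                 /* masked: nothing happens */
	if (dma_inside_lock)
		d.dma_started = true;
	d.locked = false;

	fake_irq(&d);                 /* the window right after the unlock */
	if (!dma_inside_lock)
		d.dma_started = true;

	return d.saw_bad_state;
}
```

In this model start_request(true) never exposes the half-started state, while
start_request(false) does, which matches the argument that the DMA start
belongs inside the locked region.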

I have not looked at 2.4.21-pre4 at all, so I could be entirely off base 
here.

    Ross



* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-05 16:56           ` Ross Biro
@ 2003-02-05 17:12             ` Benjamin Herrenschmidt
  2003-02-05 17:19               ` Ross Biro
  2003-02-06 12:20               ` Stephan von Krawczynski
  0 siblings, 2 replies; 27+ messages in thread
From: Benjamin Herrenschmidt @ 2003-02-05 17:12 UTC (permalink / raw)
  To: Ross Biro; +Cc: alan, Stephan von Krawczynski, linux-kernel

On Wed, 2003-02-05 at 17:56, Ross Biro wrote:
> Benjamin Herrenschmidt wrote:
> 
> >>Okay, I had to watch for it a bit longer and it turns out that the kernel PDC driver has a problem in this shared interrupt setup. When loads get high it seems to run into some timing problem which causes things like:
> >>
> >>Feb  4 01:02:22 admin kernel: hde: dma_intr: status=0xd0 { Busy }
> >>
> >>    
> >>
> Since the busy bit is set, we know the drive must have received a 
> command.  Since dma_intr thought the drive was not busy, an interrupt 
> must have snuck through between the command being issued and the dma 
> being started.  I think in my original patch, I had the dma start 
> outside of the spinlock; that is a bug.  The command to the controller 
> to start the dma must be inside of the spinlock.

While I agree with you here, I don't think it's what's happening.

In ide-disk, do_rw_disk sets up the taskfile, then basically calls
hwif->ide_dma_read/write to start the command.

In ide-dma.c in 2.4.21-pre4, what happens is:

	/* PRD table */
	hwif->OUTL(hwif->dmatable_dma, hwif->dma_prdtable);
	/* specify r/w */
	hwif->OUTB(reading, hwif->dma_command);
	/* read dma_status for INTR & ERROR flags */
	dma_stat = hwif->INB(hwif->dma_status);
	/* clear INTR & ERROR flags */
	hwif->OUTB(dma_stat|6, hwif->dma_status);
	drive->waiting_for_dma = 1;
	if (drive->media != ide_disk)
		return 0;
        .../...
        Then issue command byte.

Below we clear the DMA status _and_ set waiting_for_dma to 1.
That means that if an IRQ sneaks in, we will call
drive_is_ready(), which shouldn't return INTR 1 since we
just cleared it. I don't see how a race could happen here,
but I might have missed something.
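
To spell out why the OUTB(dma_stat|6, ...) clears the flags: on a
spec-following bus-master IDE status register, bits 1 (ERROR) and 2 (INTR)
are write-1-to-clear, bit 0 (active) is read-only, and bits 5/6 (drive DMA
capable) are plain read/write. A small model of that behaviour, assuming a
compliant controller (which is exactly what is in question for the Promise
part); bm_status_write() and the BM_STAT_* names are hypothetical:

```c
#include <stdint.h>

/* Bus-master IDE status register bits, per the usual BMIDE layout. */
#define BM_STAT_ACTIVE 0x01  /* read-only: DMA engine active     */
#define BM_STAT_ERROR  0x02  /* write 1 to clear                 */
#define BM_STAT_INTR   0x04  /* write 1 to clear                 */
#define BM_STAT_W1C    (BM_STAT_ERROR | BM_STAT_INTR)
#define BM_STAT_RW     0x60  /* drive 0/1 DMA capable, plain R/W */

/*
 * Hypothetical model of a write to the status register:
 * returns the register contents after writing `written` over `current`.
 */
static uint8_t bm_status_write(uint8_t current, uint8_t written)
{
    uint8_t next = current;

    /* W1C bits: a written 1 clears the bit, a written 0 leaves it alone. */
    next &= (uint8_t)~(written & BM_STAT_W1C);

    /* Ordinary read/write bits simply take the written value. */
    next = (uint8_t)((next & ~BM_STAT_RW) | (written & BM_STAT_RW));

    /* Read-only bits (active, simplex) are untouched by writes. */
    return next;
}
```

So reading the register and writing back dma_stat|6 keeps the DMA-capable
bits as they were and clears both flags whether or not they were set, while
writing a 0 into bit 2 would leave a pending INTR in place.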

Even if, on SMP, the code below executes _simultaneously_
with ide_intr, the latter will check for handler being
non-NULL before checking waiting_for_dma (drive_is_ready),
and thus will not race since we set the handler after.

The only thing I see is a possible wraparound of
waiting_for_dma. It's an u8, so it wraps at 255. However,
it's incremented in each __ide_dma_test_irq call. So if you
get more than 255 shared (network in your case) interrupts
before the end of the command, you die.
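
A toy model of that wraparound, for illustration only: struct fake_drive
and the helper names are invented here and only mimic the one-byte flag,
they are not the kernel structures. It assumes the flag starts at 1 and is
bumped once per interrupt that turns out not to be ours:

```c
#include <stdint.h>

/* Stand-in for the one-byte flag in the real drive structure. */
struct fake_drive {
    uint8_t waiting_for_dma;
};

/* Models the current behaviour: a foreign IRQ increments the flag. */
static void foreign_irq_current(struct fake_drive *d)
{
    d->waiting_for_dma++;   /* u8 arithmetic: 255 + 1 wraps to 0 */
}

/* Models the behaviour with the increment removed. */
static void foreign_irq_patched(struct fake_drive *d)
{
    (void)d;                /* flag is left alone */
}

/*
 * Returns the flag value after n foreign interrupts arrive while a
 * command is outstanding (flag starts at 1, as set before the command).
 */
static uint8_t flag_after_foreign_irqs(unsigned int n, int patched)
{
    struct fake_drive d = { .waiting_for_dma = 1 };
    unsigned int i;

    for (i = 0; i < n; i++) {
        if (patched)
            foreign_irq_patched(&d);
        else
            foreign_irq_current(&d);
    }
    return d.waiting_for_dma;
}
```

In this model the flag reads 0 ("not waiting") after the 255th foreign
interrupt, even though the command is still in flight; with the increment
removed it stays at 1 however many interrupts the other devices generate.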

Alan: you can safely remove the waiting_for_dma++, I believe,
in drive_is_ready(). I don't know how that code sneaked into
ide-dma. I indeed do that in ppc/pmac.c for other reasons
(sort of timeout condition on the DMA controller that happens
when I get an initial error), but this is totally unrelated
HW on which I know I have no shared IRQ.
   
Stephan: Can you try editing ide-dma.c, function
__ide_dma_test_irq(), and remove that line:

-	drive->waiting_for_dma++;

And tell us if it helps in any way.

Ben.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-05 17:12             ` Benjamin Herrenschmidt
@ 2003-02-05 17:19               ` Ross Biro
  2003-02-05 17:34                 ` Benjamin Herrenschmidt
  2003-02-05 19:10                 ` Alan Cox
  2003-02-06 12:20               ` Stephan von Krawczynski
  1 sibling, 2 replies; 27+ messages in thread
From: Ross Biro @ 2003-02-05 17:19 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: alan, Stephan von Krawczynski, linux-kernel

Benjamin Herrenschmidt wrote:

>While I agree with you here, I don't think it's what's happening.
>	/* clear INTR & ERROR flags */
>	hwif->OUTB(dma_stat|6, hwif->dma_status);
>
>  
>
You have way too much faith in the hardware.  Promise is especially known 
for not keeping to the spec.  I wouldn't trust the interrupt bit to be 
valid unless a dma is actually active, i.e. that

                  hwif->OUTB(hwif->INB(dma_base)|1, dma_base);

has actually been written.  

I've actually had a manufacturer tell me that they don't worry about the 
spec, just making things work with Windows.

    Ross


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-05 17:19               ` Ross Biro
@ 2003-02-05 17:34                 ` Benjamin Herrenschmidt
  2003-02-05 17:38                   ` Stephan von Krawczynski
  2003-02-05 19:10                 ` Alan Cox
  1 sibling, 1 reply; 27+ messages in thread
From: Benjamin Herrenschmidt @ 2003-02-05 17:34 UTC (permalink / raw)
  To: Ross Biro; +Cc: alan, Stephan von Krawczynski, linux-kernel

On Wed, 2003-02-05 at 18:19, Ross Biro wrote:
> Benjamin Herrenschmidt wrote:
> 
> >While I agree with you here, I don't think it's what's happening.
> >	/* clear INTR & ERROR flags */
> >	hwif->OUTB(dma_stat|6, hwif->dma_status);
> >
> >  
> >
> You have way too much faith in the hardware.  Promise is especially known 
> for not keeping to the spec.  I wouldn't trust the interrupt bit to be 
> valid unless a dma is actually active, i.e. that
> 
>                   hwif->OUTB(hwif->INB(dma_base)|1, dma_base);
> 
> has actually been written.  
> 
> I've actually had a manufacturer tell me that they don't worry about the 
> spec, just making things work with Windows.

Ok, so that gives us 2 possibilities. The above problem, which would be
fixed by locking all around ide_dma_read/write (or rather in the
_caller_, seems better so we don't have to drop the lock for ATAPI).

And a possible wraparound of waiting_for_dma if 255 IRQs come in from
whatever device we share the IRQ line with.

I believe both need fixing...

Ben.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-05 17:34                 ` Benjamin Herrenschmidt
@ 2003-02-05 17:38                   ` Stephan von Krawczynski
       [not found]                     ` <1044467091.685.155.camel@zion.wanadoo.fr>
  2003-02-05 20:00                     ` Bryan Andersen
  0 siblings, 2 replies; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-05 17:38 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: rossb, alan, linux-kernel

On 05 Feb 2003 18:34:55 +0100
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> On Wed, 2003-02-05 at 18:19, Ross Biro wrote:
> > Benjamin Herrenschmidt wrote:
> > 
> > >While I agree with you here, I don't think it's what's happening.
> > >	/* clear INTR & ERROR flags */
> > >	hwif->OUTB(dma_stat|6, hwif->dma_status);
> > >
> > >  
> > >
> > You have way too much faith in the hardware.  Promise is especially known 
> > for not keeping to the spec.  I wouldn't trust the interrupt bit to be 
> > valid unless a dma is actually active, i.e. that
> > 
> >                   hwif->OUTB(hwif->INB(dma_base)|1, dma_base);
> > 
> > has actually been written.  
> > 
> > I've actually had a manufacturer tell me that they don't worry about the 
> > spec, just making things work with Windows.
> 
> Ok, so that gives us 2 possibilities. The above problem, which would be
> fixed by locking all around ide_dma_read/write (or rather in the
> _caller_, seems better so we don't have to drop the lock for ATAPI).
> 
> And a possible wraparound of waiting_for_dma if 255 IRQs come in from
> whatever device we share the IRQ line with.
> 
> I believe both need fixing...
> 
> Ben.

Ok, yet another small brick in the wall: this mb has 64bit/66MHz PCI slots. PDC
is only 32bit/33MHz PCI. So it may well be that others are in fact _able_ to
produce a damn lot more data/interrupts than the PDC. I am pretty astonished by
the number of interrupts created by the 3com tg3 cards anyways...

-- 
Regards,
Stephan

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
       [not found]                     ` <1044467091.685.155.camel@zion.wanadoo.fr>
@ 2003-02-05 17:58                       ` Stephan von Krawczynski
  0 siblings, 0 replies; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-05 17:58 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: alan, linux-kernel, rossb

On 05 Feb 2003 18:44:51 +0100
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> On Wed, 2003-02-05 at 18:38, Stephan von Krawczynski wrote:
> > On 05 Feb 2003 18:34:55 +0100
> > Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> > 
> > > On Wed, 2003-02-05 at 18:19, Ross Biro wrote:
> > > > Benjamin Herrenschmidt wrote:
> > > > 
> > > > >While I agree with you here, I don't think it's what's happening.
> > > > >	/* clear INTR & ERROR flags */
> > > > >	hwif->OUTB(dma_stat|6, hwif->dma_status);
> > > > >
> > > > >  
> > > > >
> > > > You have way too much faith in the hardware.  Promise is especially
> > > > known for not keeping to the spec.  I wouldn't trust the interrupt bit
> > > > to be valid unless a dma is actually active, i.e. that
> > > > 
> > > >                   hwif->OUTB(hwif->INB(dma_base)|1, dma_base);
> > > > 
> > > > has actually been written.  
> > > > 
> > > > I've actually had a manufacturer tell me that they don't worry about
> > > > the spec, just making things work with Windows.
> > > 
> > > Ok, so that gives us 2 possibilities. The above problem, which would be
> > > fixed by locking all around ide_dma_read/write (or rather in the
> > > _caller_, seems better so we don't have to drop the lock for ATAPI).
> > > 
> > > And a possible wraparound of waiting_for_dma if 255 IRQs come in from
> > > whatever device we share the IRQ line with.
> > > 
> > > I believe both need fixing...
> > > 
> > > Ben.
> > 
> > Ok, yet another small brick in the wall: this mb has 64bit/66MHz PCI slots.
> > PDC is only 32bit/33MHz PCI. So it may well be that others are in fact
> > _able_ to produce a damn lot more data/interrupts than the PDC. I am pretty
> > astonished by the number of interrupts created by the 3com tg3 cards
> > anyways...
> 
> Ok, then please try my "fix" to remove the increment of waiting_for_dma
> and let us know if it helps.

I will try, in the meantime can any kind soul please give me a hint why I
cannot see the interrupts distributed among the CPUs when enabling smp and
apic on this very same box:

           CPU0       CPU1       
  0:      71158          0    IO-APIC-edge  timer
  1:        941          0    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
 12:      33166          0    IO-APIC-edge  PS/2 Mouse
 15:          4          0    IO-APIC-edge  ide1
 17:       1732          0   IO-APIC-level  ide2, ide3
 18:       3423          0   IO-APIC-level  eth0, eth1
 21:       8177          0   IO-APIC-level  eth2
 22:     112943          0   IO-APIC-level  aic7xxx
 23:         16          0   IO-APIC-level  aic7xxx
 25:         74          0   IO-APIC-level  HiSax
 26:          0          0   IO-APIC-level  EMU10K1
NMI:          0          0 
LOC:      71085      71059 
ERR:          0
MIS:          0

??
(kernel 2.4.21-pre4)
-- 
Regards,
Stephan

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-05 17:19               ` Ross Biro
  2003-02-05 17:34                 ` Benjamin Herrenschmidt
@ 2003-02-05 19:10                 ` Alan Cox
  1 sibling, 0 replies; 27+ messages in thread
From: Alan Cox @ 2003-02-05 19:10 UTC (permalink / raw)
  To: Ross Biro
  Cc: Benjamin Herrenschmidt, alan, Stephan von Krawczynski,
	Linux Kernel Mailing List

On Wed, 2003-02-05 at 17:19, Ross Biro wrote:
> I've actually had a manufacturer tell me that they don't worry about the 
> spec, just making things work with Windows.

It's very common. As a customer, always ask the vendor in writing whether
they are compliant with each appropriate standard. If they say yes, it
has nice little liability issues should they be lying 8)



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-05 17:38                   ` Stephan von Krawczynski
       [not found]                     ` <1044467091.685.155.camel@zion.wanadoo.fr>
@ 2003-02-05 20:00                     ` Bryan Andersen
  1 sibling, 0 replies; 27+ messages in thread
From: Bryan Andersen @ 2003-02-05 20:00 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: Benjamin Herrenschmidt, rossb, alan, linux-kernel


> Ok, yet another small brick in the wall: this mb has 64bit/66MHz PCI slots. PDC
> is only 32bit/33MHz PCI. So it may well be that others are in fact _able_ to
> produce a damn lot more data/interrupts than the PDC. I am pretty astonished by
> the number of interrupts created by the 3com tg3 cards anyways...

On my box one of the devices sharing the interrupt with the disk
is the display, the USB is quiet with no devices connected.  I am 
running 2.4.21-pre4-ac2.  I'd have redistributed interrupts but the 
motherboard doesn't allow me to specifically set them and APIC isn't 
supposed to work on the nForce2 yet.

cat /proc/interrupts
            CPU0
   0:     273068          XT-PIC  timer
   1:       5547          XT-PIC  keyboard
   2:          0          XT-PIC  cascade
   5:     122188          XT-PIC  eth0
  10:     641526          XT-PIC  ide2, ide3, usb-ohci, nvidia
  11:          0          XT-PIC  NVIDIA nForce Audio, usb-ohci
  12:      78237          XT-PIC  PS/2 Mouse
  14:     109475          XT-PIC  ide0
  15:     114178          XT-PIC  ide1
NMI:          0
LOC:     273027
ERR:      18344
MIS:          0

- Bryan


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-05 17:12             ` Benjamin Herrenschmidt
  2003-02-05 17:19               ` Ross Biro
@ 2003-02-06 12:20               ` Stephan von Krawczynski
  2003-02-06 23:04                 ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-06 12:20 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: rossb, alan, linux-kernel

On 05 Feb 2003 18:12:31 +0100
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:


> Stephan: Can you try editing ide-dma.c, function
> __ide_dma_test_irq(), and remove that line:
> 
> -	drive->waiting_for_dma++;
> 
> And tell us if it helps in any way.
> 
> Ben.

Hello Ben,

as requested I tried the above "patch" and had no problem so far. Current
situation is:
(ide2, ide3 are PDC, eth2 is tg3)

           CPU0       
  0:    6332048          XT-PIC  timer
  1:      14112          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:          0          XT-PIC  EMU10K1
  7:      14950          XT-PIC  HiSax
  9:   30600647          XT-PIC  ide2, ide3, aic7xxx, aic7xxx, eth0, eth1, eth2
 12:     234451          XT-PIC  PS/2 Mouse
 15:          2          XT-PIC  ide1
NMI:          0 
LOC:          0 
ERR:          0
MIS:          0

I would not say this is a rock-solid test case. I will continue to stress the
setup and keep you informed. Anyway it looks stable up to now.

Regards,
Stephan


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-06 12:20               ` Stephan von Krawczynski
@ 2003-02-06 23:04                 ` Benjamin Herrenschmidt
  2003-02-07  9:10                   ` Stephan von Krawczynski
  0 siblings, 1 reply; 27+ messages in thread
From: Benjamin Herrenschmidt @ 2003-02-06 23:04 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: rossb, alan, linux-kernel

On Thu, 2003-02-06 at 13:20, Stephan von Krawczynski wrote:
> On 05 Feb 2003 18:12:31 +0100
> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> 
> 
> > Stephan: Can you try editing ide-dma.c, function
> > __ide_dma_test_irq(), and remove that line:
> > 
> > -	drive->waiting_for_dma++;
> > 
> > And tell us if it helps in any way.
> > 
> > Ben.
> 
> Hello Ben,
> 
> as requested I tried the above "patch" and had no problem so far. Current
> situation is:
> (ide2, ide3 are PDC, eth2 is tg3)

Ok, well, if it's still stable by now, I believe we can safely remove
that line from ide_dma_test_irq(). AFAIK, it really has nothing to do
here.

(I suspect it got copied from ide-pmac somewhat... I use it as a counter
in there to implement some timeout when the DMA engine didn't start at
all because the disk issued an error, and on these, I know for sure
the IRQ isn't shared...)

Alan, can you include that ?

===== drivers/ide/ide-dma.c 1.10 vs edited =====
--- 1.10/drivers/ide/ide-dma.c  Sat Feb  1 20:37:36 2003
+++ edited/drivers/ide/ide-dma.c        Fri Feb  7 00:03:43 2003
@@ -826,7 +826,6 @@
        if (!drive->waiting_for_dma)
                printk(KERN_WARNING "%s: (%s) called while not waiting\n",
                        drive->name, __FUNCTION__);
-       drive->waiting_for_dma++;
        return 0;
 }

(Patch against Marcelo's 2.4.21-pre4)



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: 2.4.21-pre4: PDC ide driver problems with shared interrupts
  2003-02-06 23:04                 ` Benjamin Herrenschmidt
@ 2003-02-07  9:10                   ` Stephan von Krawczynski
  0 siblings, 0 replies; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-02-07  9:10 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: rossb, alan, linux-kernel

On 07 Feb 2003 00:04:18 +0100
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> On Thu, 2003-02-06 at 13:20, Stephan von Krawczynski wrote:
> > On 05 Feb 2003 18:12:31 +0100
> > Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> > 
> > 
> > > Stephan: Can you try editing ide-dma.c, function
> > > __ide_dma_test_irq(), and remove that line:
> > > 
> > > -	drive->waiting_for_dma++;
> > > 
> > > And tell us if it helps in any way.
> > > 
> > > Ben.
> > 
> > Hello Ben,
> > 
> > as requested I tried the above "patch" and had no problem so far. Current
> > situation is:
> > (ide2, ide3 are PDC, eth2 is tg3)
> 
> Ok, well, if it's still stable by now, I believe we can safely remove
> that line from ide_dma_test_irq(). AFAIK, it really has nothing to do
> here.

Hello all,

it is still working ok, currently we are at:

           CPU0       
  0:   13848205          XT-PIC  timer
  1:      54117          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:          0          XT-PIC  EMU10K1
  7:      27260          XT-PIC  HiSax
  9:   67048861          XT-PIC  ide2, ide3, aic7xxx, aic7xxx, eth0, eth1, eth2
 12:     765541          XT-PIC  PS/2 Mouse
 15:        229          XT-PIC  ide1
NMI:          0 
LOC:          0 
ERR:          0
MIS:          0

Regards,
Stephan



^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2003-02-07  9:01 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-02-02 15:18 2.4.21-pre4: tg3 driver problems with shared interrupts Stephan von Krawczynski
2003-02-02 16:49 ` Jeff Garzik
2003-02-02 17:09   ` Stephan von Krawczynski
2003-02-02 17:15     ` Jeff Garzik
2003-02-02 17:52   ` Stephan von Krawczynski
2003-02-02 18:28     ` Jeff Garzik
2003-02-02 18:31       ` Stephan von Krawczynski
2003-02-03 10:25       ` Stephan von Krawczynski
2003-02-05  9:48       ` 2.4.21-pre4: PDC ide " Stephan von Krawczynski
2003-02-05 11:16         ` Benjamin Herrenschmidt
2003-02-05 11:39           ` Stephan von Krawczynski
2003-02-05 12:21             ` Alan Cox
2003-02-05 12:22             ` Benjamin Herrenschmidt
2003-02-05 12:50               ` Alan Cox
2003-02-05 13:19               ` Stephan von Krawczynski
2003-02-05 12:24           ` Alan Cox
2003-02-05 16:56           ` Ross Biro
2003-02-05 17:12             ` Benjamin Herrenschmidt
2003-02-05 17:19               ` Ross Biro
2003-02-05 17:34                 ` Benjamin Herrenschmidt
2003-02-05 17:38                   ` Stephan von Krawczynski
     [not found]                     ` <1044467091.685.155.camel@zion.wanadoo.fr>
2003-02-05 17:58                       ` Stephan von Krawczynski
2003-02-05 20:00                     ` Bryan Andersen
2003-02-05 19:10                 ` Alan Cox
2003-02-06 12:20               ` Stephan von Krawczynski
2003-02-06 23:04                 ` Benjamin Herrenschmidt
2003-02-07  9:10                   ` Stephan von Krawczynski
