linux-kernel.vger.kernel.org archive mirror
* aic7xxx errors with 2.4.8-ac7 on 440gx mobo
@ 2001-08-20  8:36 Yusuf Goolamabbas
  2001-08-20  8:55 ` Cliff Albert
  0 siblings, 1 reply; 75+ messages in thread
From: Yusuf Goolamabbas @ 2001-08-20  8:36 UTC (permalink / raw)
  To: linux-kernel

Hi, 2.4.8 and 2.4.9 have no problems compiling and booting on a P3 500
440GX mobo (APIC not compiled in the kernel)

With 2.4.8-ac7, I get SCSI errors and the kernel fails to boot. If I
compile with APIC enabled and APIC on UP also enabled, it boots
cleanly

booting with append="noapic", gives the same errors


SCSI subsystem driver Revision: 1.00
PCI: Assigned IRQ 11 for device 00:0c.0
PCI: Sharing IRQ 11 with 00:0c.1
PCI: Found IRQ 11 for device 00:0c.1
PCI: Sharing IRQ 11 with 00:0c.0

scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.1
        <Adaptec aic7896/97 Ultra2 SCSI adapter>
        aic7896/97: Ultra2 Wide Channel A, SCSI Id=7, 32/255 SCBs

scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.1
        <Adaptec aic7896/97 Ultra2 SCSI adapter>
        aic7896/97: Ultra2 Wide Channel B, SCSI Id=7, 32/255 SCBs

scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command already completed
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Device is active, asserting ATN
Recovery code sleeping
Recovery code awake
Timer Expired
aic7xxx_abort returns 8195
scsi0:0:0:0: Attempting to queue a TARGET RESET message
aic7xxx_dev_reset returns 8195
Recovery SCB completes
scsi0:0:0:0: Attempting to queue an ABORT message
ahc_intr: HOST_MSG_LOOP bad phase 0x0
scsi0:0:0:0: Cmd aborted from QINFIFO
aic7xxx_abort returns 8194
scsi: device set offline - not ready or command retry failed after bus reset: host 0 channel 0 id 0 lun 0
scsi0:0:1:0: Attempting to queue an ABORT message
scsi0:0:1:0: Command already completed
aic7xxx_abort returns 8194
scsi0:0:1:0: Attempting to queue an ABORT message
scsi0:0:1:0: Command already completed
aic7xxx_abort returns 8194
scsi0:0:1:0: Attempting to queue a TARGET RESET message
scsi0:0:1:0: Is not an active device
scsi0:0:1:0: Attempting to queue an ABORT message
scsi0:0:1:0: Command already completed
aic7xxx_abort returns 8194

Regards, Yusuf

^ permalink raw reply	[flat|nested] 75+ messages in thread
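
The decimal values printed after "aic7xxx_abort returns" and "aic7xxx_dev_reset returns" in the log above appear to be the SCSI mid-layer error-handler status codes, where SUCCESS is 0x2002 (8194) and FAILED is 0x2003 (8195). A minimal sketch, assuming that mapping applies to this driver revision, that decodes such lines from a pasted log (the function and table names are made up for illustration):

```python
import re

# Assumed mapping: SCSI mid-layer status codes SUCCESS = 0x2002 and
# FAILED = 0x2003. Verify against the headers of the kernel tree in use.
EH_STATUS = {0x2002: "SUCCESS", 0x2003: "FAILED"}

RETURN_RE = re.compile(r"^(aic7xxx_\w+) returns (\d+)$")


def decode_returns(log_text):
    """Yield (function, code, name) for each 'aic7xxx_* returns N' line."""
    for line in log_text.splitlines():
        # Drop mail quote markers ('> ') and surrounding whitespace first.
        m = RETURN_RE.match(line.lstrip("> ").strip())
        if m:
            code = int(m.group(2))
            yield m.group(1), code, EH_STATUS.get(code, "UNKNOWN")


sample = """\
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command already completed
aic7xxx_abort returns 8194
Timer Expired
aic7xxx_abort returns 8195
aic7xxx_dev_reset returns 8195
"""

for func, code, name in decode_returns(sample):
    print(f"{func}: {code} (0x{code:04x}) -> {name}")
```

Read this way, an abort that "returns 8194" was reported as handled, while 8195 means the recovery step itself failed and the mid-layer escalates to the next step.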

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20  8:36 aic7xxx errors with 2.4.8-ac7 on 440gx mobo Yusuf Goolamabbas
@ 2001-08-20  8:55 ` Cliff Albert
  2001-08-20 10:37   ` Alan Cox
  2001-08-20 20:27   ` Justin T. Gibbs
  0 siblings, 2 replies; 75+ messages in thread
From: Cliff Albert @ 2001-08-20  8:55 UTC (permalink / raw)
  To: Yusuf Goolamabbas; +Cc: linux-kernel, gibbs

On Mon, Aug 20, 2001 at 08:36:30AM -0000, Yusuf Goolamabbas wrote:

> Hi, 2.4.8 and 2.4.9 have no problems compiling and booting on a P3 500
> 440GX mobo (APIC not compiled in the kernel)
> 
> With 2.4.8-ac7, I get SCSI errors and the kernel fails to boot. If I
> compile with APIC enabled and APIC on UP also enabled, it boots
> cleanly

I'm getting similar errors on 2.4.8-ac7 on my P2B-S motherboard using
the NEW AIC7xxx driver; the old one isn't experiencing these problems. Also,
I've been getting these errors since 2.4.3.

> booting with append="noapic", gives the same errors

This also didn't resolve my problems

> SCSI subsystem driver Revision: 1.00
> PCI: Assigned IRQ 11 for device 00:0c.0
> PCI: Sharing IRQ 11 with 00:0c.1
> PCI: Found IRQ 11 for device 00:0c.1
> PCI: Sharing IRQ 11 with 00:0c.0
> 
> scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.1
>         <Adaptec aic7896/97 Ultra2 SCSI adapter>
>         aic7896/97: Ultra2 Wide Channel A, SCSI Id=7, 32/255 SCBs
> 
> scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.1
>         <Adaptec aic7896/97 Ultra2 SCSI adapter>
>         aic7896/97: Ultra2 Wide Channel B, SCSI Id=7, 32/255 SCBs

SCSI subsystem driver Revision: 1.00
PCI: Found IRQ 14 for device 00:06.0
PCI: Sharing IRQ 14 with 00:04.2
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.1
        <Adaptec aic7890/91 Ultra2 SCSI adapter>
        aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/255 SCBs

> 
> scsi0:0:0:0: Attempting to queue an ABORT message
> scsi0:0:0:0: Command already completed
> aic7xxx_abort returns 8194
> scsi0:0:0:0: Attempting to queue an ABORT message
> scsi0:0:0:0: Device is active, asserting ATN
> Recovery code sleeping
> Recovery code awake
> Timer Expired
> aic7xxx_abort returns 8195
> scsi0:0:0:0: Attempting to queue a TARGET RESET message
> aic7xxx_dev_reset returns 8195
> Recovery SCB completes
> scsi0:0:0:0: Attempting to queue an ABORT message
> ahc_intr: HOST_MSG_LOOP bad phase 0x0
> scsi0:0:0:0: Cmd aborted from QINFIFO
> aic7xxx_abort returns 8194
> scsi: device set offline - not ready or command retry failed after bus reset: host 0 channel 0 id 0 lun 0
> scsi0:0:1:0: Attempting to queue an ABORT message
> scsi0:0:1:0: Command already completed
> aic7xxx_abort returns 8194
> scsi0:0:1:0: Attempting to queue an ABORT message
> scsi0:0:1:0: Command already completed
> aic7xxx_abort returns 8194
> scsi0:0:1:0: Attempting to queue a TARGET RESET message
> scsi0:0:1:0: Is not an active device
> scsi0:0:1:0: Attempting to queue an ABORT message
> scsi0:0:1:0: Command already completed
> aic7xxx_abort returns 8194

scsi0:0:0:0: Attempting to queue an ABORT message
(scsi0:A:0:0): Queuing a recovery SCB
scsi0:0:0:0: Device is disconnected, re-queuing SCB
Recovery code sleeping
(scsi0:A:0:0): Abort Tag Message Sent
(scsi0:A:0:0): SCB 1 - Abort Tag Completed.
Recovery SCB completes
Recovery code awake
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command found on device queue
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
(scsi0:A:0:0): Queuing a recovery SCB
scsi0:0:0:0: Device is disconnected, re-queuing SCB
Recovery code sleeping
(scsi0:A:0:0): Abort Tag Message Sent
(scsi0:A:0:0): SCB 4 - Abort Tag Completed.
Recovery SCB completes
Recovery code awake
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
(scsi0:A:0:0): Queuing a recovery SCB
scsi0:0:0:0: Device is disconnected, re-queuing SCB
Recovery code sleeping
(scsi0:A:0:0): Abort Tag Message Sent
(scsi0:A:0:0): SCB 2 - Abort Tag Completed.
Recovery SCB completes
Recovery code awake
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
(scsi0:A:0:0): Queuing a recovery SCB
scsi0:0:0:0: Device is disconnected, re-queuing SCB
Recovery code sleeping
(scsi0:A:0:0): Abort Tag Message Sent
(scsi0:A:0:0): SCB 9 - Abort Tag Completed.
Recovery SCB completes
Recovery code awake
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
(scsi0:A:0:0): Queuing a recovery SCB
scsi0:0:0:0: Device is disconnected, re-queuing SCB
(scsi0:A:0:0): Abort Tag Message Sent
Recovery code sleeping
(scsi0:A:0:0): SCB 0 - Abort Tag Completed.
Recovery SCB completes
Recovery code awake
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command not found
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command found on device queue
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command not found
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command found on device queue
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command not found
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command found on device queue
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command not found
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command found on device queue
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
(scsi0:A:0:0): Queuing a recovery SCB
scsi0:0:0:0: Device is disconnected, re-queuing SCB
(scsi0:A:0:0): Abort Tag Message Sent
Recovery code sleeping
(scsi0:A:0:0): SCB 11 - Abort Tag Completed.
Recovery SCB completes
Recovery code awake
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
(scsi0:A:0:0): Queuing a recovery SCB
scsi0:0:0:0: Device is disconnected, re-queuing SCB
Recovery code sleeping
(scsi0:A:0:0): Abort Tag Message Sent
(scsi0:A:0:0): SCB 6 - Abort Tag Completed.
Recovery SCB completes
Recovery code awake
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
(scsi0:A:0:0): Queuing a recovery SCB
scsi0:0:0:0: Device is disconnected, re-queuing SCB
Recovery code sleeping
(scsi0:A:0:0): Abort Tag Message Sent
(scsi0:A:0:0): SCB 3 - Abort Tag Completed.
Recovery SCB completes
Recovery code awake
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue a TARGET RESET message
scsi0:0:0:0: Command not found
aic7xxx_dev_reset returns 8194
Device not ready.  Make sure there is a disc in the drive.
Device not ready.  Make sure there is a disc in the drive.

SCSI device listing is:

  Vendor: QUANTUM   Model: FIREBALL ST6.4S   Rev: 0F0C
    Type:   Direct-Access                      ANSI SCSI revision: 02
    (scsi0:A:0): 20.000MB/s transfers (20.000MHz, offset 15)
  Vendor: IOMEGA    Model: ZIP 100           Rev: J.03
    Type:   Direct-Access                      ANSI SCSI revision: 02
  Vendor: YAMAHA    Model: CRW2100S          Rev: 1.0H
    Type:   CD-ROM                             ANSI SCSI revision: 02
    (scsi0:A:5): 20.000MB/s transfers (20.000MHz, offset 7)
  Vendor: PLEXTOR   Model: CD-ROM PX-32TS    Rev: 1.01
    Type:   CD-ROM                             ANSI SCSI revision: 02
    (scsi0:A:6): 20.000MB/s transfers (20.000MHz, offset 15)


-- 
Cliff Albert		| RIPE:	     CA3348-RIPE | www.oisec.net
cliff@oisec.net		| 6BONE:     CA2-6BONE	 | icq 18461740

^ permalink raw reply	[flat|nested] 75+ messages in thread
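
Long recovery logs like the one above are easier to compare between machines once they are reduced to counts. The following is only a rough sketch, assuming the message wording shown in this thread ("Attempting to queue an ABORT message" and "Attempting to queue a TARGET RESET message"); the helper name is hypothetical.

```python
import re
from collections import Counter

# Matches the two recovery-attempt messages as they appear in this thread.
ATTEMPT_RE = re.compile(
    r"^(scsi\d+:\d+:\d+:\d+): Attempting to queue an? "
    r"(ABORT|TARGET RESET) message$"
)


def count_recovery_attempts(log_text):
    """Count recovery attempts per (device, kind) in a pasted kernel log."""
    counts = Counter()
    for line in log_text.splitlines():
        # Tolerate mail quote markers in front of the log line.
        m = ATTEMPT_RE.match(line.lstrip("> ").strip())
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts


sample = """\
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command found on device queue
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Attempting to queue a TARGET RESET message
scsi0:0:1:0: Attempting to queue an ABORT message
"""

for (device, kind), n in sorted(count_recovery_attempts(sample).items()):
    print(f"{device} {kind}: {n}")
```

A device that collects many ABORT attempts followed by a TARGET RESET, as target 0 does in the log above, is the one to look at first.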

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20  8:55 ` Cliff Albert
@ 2001-08-20 10:37   ` Alan Cox
  2001-08-20 10:56     ` Yusuf Goolamabbas
                       ` (2 more replies)
  2001-08-20 20:27   ` Justin T. Gibbs
  1 sibling, 3 replies; 75+ messages in thread
From: Alan Cox @ 2001-08-20 10:37 UTC (permalink / raw)
  To: Cliff Albert; +Cc: Yusuf Goolamabbas, linux-kernel, gibbs

> > With 2.4.8-ac7, I get SCSI errors and the kernel fails to boot. If I
> > compile with APIC enabled and APIC on UP also enabled, it boots
> > cleanly
> 
> I'm getting similar errors on 2.4.8-ac7 on my P2B-S motherboard using
> the NEW AIC7xxx driver; the old one isn't experiencing these problems. Also,
> I've been getting these errors since 2.4.3.

There is a known BIOS irq routing table problem with a large number of Intel
BIOS boards with onboard adaptec controllers. The fact that making it use
the io-apic works suggests this is the same thing.


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 10:56     ` Yusuf Goolamabbas
@ 2001-08-20 10:56       ` Alan Cox
  2001-08-20 11:13         ` Yusuf Goolamabbas
  0 siblings, 1 reply; 75+ messages in thread
From: Alan Cox @ 2001-08-20 10:56 UTC (permalink / raw)
  To: Yusuf Goolamabbas; +Cc: Alan Cox, Cliff Albert, linux-kernel, gibbs

> > There is a known BIOS irq routing table problem with a large number of Intel
> > BIOS boards with onboard adaptec controllers. The fact that making it use
> > the io-apic works suggests this is the same thing.
> 
> But 2.4.8 and 2.4.9 work without using io-apic. 

I'm not currently sure what that proves. Is your board intel bios ?


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 10:37   ` Alan Cox
@ 2001-08-20 10:56     ` Yusuf Goolamabbas
  2001-08-20 10:56       ` Alan Cox
  2001-08-20 12:46     ` Stefan Fleiter
  2001-08-20 16:21     ` Cliff Albert
  2 siblings, 1 reply; 75+ messages in thread
From: Yusuf Goolamabbas @ 2001-08-20 10:56 UTC (permalink / raw)
  To: Alan Cox; +Cc: Cliff Albert, linux-kernel, gibbs

> > > With 2.4.8-ac7, I get SCSI errors and the kernel fails to boot. If I
> > > compile with APIC enabled and APIC on UP also enabled, it boots
> > > cleanly
> > 
> > I'm getting similar errors on 2.4.8-ac7 on my P2B-S motherboard using
> > the NEW AIC7xxx driver; the old one isn't experiencing these problems. Also,
> > I've been getting these errors since 2.4.3.
> 
> There is a known BIOS irq routing table problem with a large number of Intel
> BIOS boards with onboard adaptec controllers. The fact that making it use
> the io-apic works suggests this is the same thing.

But 2.4.8 and 2.4.9 work without using io-apic. 

-- 
Yusuf Goolamabbas
yusufg@outblaze.com

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 11:13         ` Yusuf Goolamabbas
@ 2001-08-20 11:09           ` Alan Cox
  2001-08-20 16:43             ` Doug Ledford
  0 siblings, 1 reply; 75+ messages in thread
From: Alan Cox @ 2001-08-20 11:09 UTC (permalink / raw)
  To: Yusuf Goolamabbas; +Cc: Alan Cox, Cliff Albert, linux-kernel, gibbs

> > I'm not currently sure what that proves. Is your board intel bios ?
> 
> The BIOS is Phoenix (4.0 Release 6.0, BIOS Build 125). Does Intel
> provide their own branded bios ? Never seen them. The box is an ISP 2150
> and it is of the Slot 1 variant.

Ok that sounds unrelated. Intel do provide their own bioses (and one at
least branded Dell) but Phoenixbios is quite different.


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 10:56       ` Alan Cox
@ 2001-08-20 11:13         ` Yusuf Goolamabbas
  2001-08-20 11:09           ` Alan Cox
  0 siblings, 1 reply; 75+ messages in thread
From: Yusuf Goolamabbas @ 2001-08-20 11:13 UTC (permalink / raw)
  To: Alan Cox; +Cc: Cliff Albert, linux-kernel, gibbs

> > > There is a known BIOS irq routing table problem with a large number of Intel
> > > BIOS boards with onboard adaptec controllers. The fact that making it use
> > > the io-apic works suggests this is the same thing.
> > 
> > But 2.4.8 and 2.4.9 work without using io-apic. 
> 
> I'm not currently sure what that proves. Is your board intel bios ?

The BIOS is Phoenix (4.0 Release 6.0, BIOS Build 125). Does Intel
provide their own branded bios ? Never seen them. The box is an ISP 2150
and it is of the Slot 1 variant.

The adaptec bios version is 2.20S1B1

-- 
Yusuf Goolamabbas
yusufg@outblaze.com

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 10:37   ` Alan Cox
  2001-08-20 10:56     ` Yusuf Goolamabbas
@ 2001-08-20 12:46     ` Stefan Fleiter
  2001-08-20 15:19       ` Ville Herva
                         ` (2 more replies)
  2001-08-20 16:21     ` Cliff Albert
  2 siblings, 3 replies; 75+ messages in thread
From: Stefan Fleiter @ 2001-08-20 12:46 UTC (permalink / raw)
  To: linux-kernel

Hi Alan!

On Mon, 20 Aug 2001 Alan Cox wrote:

> > > With 2.4.8-ac7, I get SCSI errors and the kernel fails to boot. If I
> > > compile with APIC enabled and APIC on UP also enabled, it boots
> > > cleanly
> > 
> > I'm getting similar errors on 2.4.8-ac7 on my P2B-S motherboard using
> > the NEW AIC7xxx driver; the old one isn't experiencing these problems. Also,
> > I've been getting these errors since 2.4.3.
> 
> There is a known BIOS irq routing table problem with a large number of Intel
> BIOS boards with onboard adaptec controllers.

I have the same problem, but my Adaptec is _not_ onboard.

sf@shuttle:~$ uname -r
2.4.8-ac7

sf@shuttle:~$ dmesg
[..]
SCSI subsystem driver Revision: 1.00
PCI: Found IRQ 14 for device 00:0b.0
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.1
        <Adaptec 2940 Ultra SCSI adapter>
        aic7880: Ultra Wide Channel A, SCSI Id=7, 16/255 SCBs

  Vendor: IBM-PCCO  Model: DDRS-39130Y   !#  Rev: S97B
  Type:   Direct-Access                      ANSI SCSI revision: 02
  Vendor: SEAGATE   Model: ST34572W          Rev: 0784
  Type:   Direct-Access                      ANSI SCSI revision: 02
  Vendor: PLEXTOR   Model: CD-ROM PX-12TS    Rev: 1.02
  Type:   CD-ROM                             ANSI SCSI revision: 02
  Vendor: HP        Model: HP35480A          Rev: T503
  Type:   Sequential-Access                  ANSI SCSI revision: 02
scsi0:0:0:0: Tagged Queuing enabled.  Depth 24
scsi0:0:1:0: Tagged Queuing enabled.  Depth 24

Greetings,
Stefan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 12:46     ` Stefan Fleiter
@ 2001-08-20 15:19       ` Ville Herva
  2001-08-20 20:33         ` Justin T. Gibbs
  2001-08-20 16:45       ` Doug Ledford
  2001-08-20 20:28       ` Justin T. Gibbs
  2 siblings, 1 reply; 75+ messages in thread
From: Ville Herva @ 2001-08-20 15:19 UTC (permalink / raw)
  To: linux-kernel

On Mon, Aug 20, 2001 at 02:46:02PM +0200, you [Stefan Fleiter] claimed:
> Hi Alan!
>  
> I have the same problem, but my Adaptec is _not_ onboard.
> 
> sf@shuttle:~$ uname -r
> 2.4.8-ac7

Same here, with 2.2.18pre19 + Gibbs aic7xxx 6.1.7.

uname -r
2.2.18pre19

cat /proc/scsi/scsi 
Attached devices: 
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: SEAGATE  Model: ST118273N        Rev: 6244
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 04 Lun: 00
  Vendor: HP       Model: C5683A           Rev: C005
  Type:   Sequential-Access                ANSI SCSI revision: 02

(I reported this to Gibbs some time ago.)

Adaptec AIC-7892 Ultra 160/m SCSI host adapter (Adaptec 29160, not onboard).
No Intel bios (AMD Duron box).


-- v --

v@iki.fi

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 10:37   ` Alan Cox
  2001-08-20 10:56     ` Yusuf Goolamabbas
  2001-08-20 12:46     ` Stefan Fleiter
@ 2001-08-20 16:21     ` Cliff Albert
  2001-08-20 17:23       ` Peter T. Breuer
  2 siblings, 1 reply; 75+ messages in thread
From: Cliff Albert @ 2001-08-20 16:21 UTC (permalink / raw)
  To: Alan Cox; +Cc: Yusuf Goolamabbas, linux-kernel, gibbs

On Mon, Aug 20, 2001 at 11:37:33AM +0100, Alan Cox wrote:

> > > With 2.4.8-ac7, I get SCSI errors and the kernel fails to boot. If I
> > > compile with APIC enabled and APIC on UP also enabled, it boots
> > > cleanly
> > 
> > I'm getting similar errors on 2.4.8-ac7 on my P2B-S motherboard using
> > the NEW AIC7xxx driver; the old one isn't experiencing these problems. Also,
> > I've been getting these errors since 2.4.3.
> 
> There is a known BIOS irq routing table problem with a large number of Intel
> BIOS boards with onboard adaptec controllers. The fact that making it use
> the io-apic works suggests this is the same thing.

It's an ASUS P2B-S board with an Award BIOS, flashed to the latest revision
that is available from ASUS.

-- 
Cliff Albert		| RIPE:	     CA3348-RIPE | www.oisec.net
cliff@oisec.net		| 6BONE:     CA2-6BONE	 | icq 18461740

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 11:09           ` Alan Cox
@ 2001-08-20 16:43             ` Doug Ledford
  0 siblings, 0 replies; 75+ messages in thread
From: Doug Ledford @ 2001-08-20 16:43 UTC (permalink / raw)
  To: Alan Cox; +Cc: Yusuf Goolamabbas, Cliff Albert, linux-kernel, gibbs

Alan Cox wrote:
>>>I'm not currently sure what that proves. Is your board intel bios ?
>>>
>>The BIOS is Phoenix (4.0 Release 6.0, BIOS Build 125). Does Intel
>>provide their own branded bios ? Never seen them. The box is an ISP 2150
>>and it is of the Slot 1 variant.
>>
> 
> Ok that sounds unrelated. Intel do provide their own bioses (and one at
> least branded Dell) but Phoenixbios is quite different.

No.  The problem Intel boxes do use Phoenix BIOS.  His box is the exact 
problem model.  It requires the use of IOAPIC support for UP or SMP in 
order to work properly.  If 2.4.8 and 2.4.9 both work correctly now
*without* the use of UP-IOAPIC and without SMP, then a DMI scan whitelist
entry must have been added in 2.4.8 that makes this motherboard do
something sane (like never trying to assign interrupts, or enabling
UP-IOAPIC even if it isn't the default).



-- 

  Doug Ledford <dledford@redhat.com>  http://people.redhat.com/dledford
       Please check my web site for aic7xxx updates/answers before
                       e-mailing me about problems


^ permalink raw reply	[flat|nested] 75+ messages in thread
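
The DMI-based quirk idea described above can be pictured as a table of BIOS/board identification strings, each tied to a workaround. The following is purely an illustrative sketch of that matching logic: every string, field name, and action name in it is a placeholder and not an actual kernel entry.

```python
# Hypothetical quirk table. The vendor/board strings and the action name
# are placeholders for illustration only, not real kernel data.
QUIRKS = [
    {
        "match": {
            "bios_vendor": "Example BIOS Vendor",
            "board_name": "EXAMPLE-440GX",
        },
        "action": "force_ioapic_routing",
    },
]


def find_quirks(dmi, table=QUIRKS):
    """Return the actions whose match strings all occur in the DMI data.

    `dmi` maps field names to the strings reported by the firmware.
    An entry applies only if every one of its match strings is a
    substring of the corresponding DMI field.
    """
    actions = []
    for entry in table:
        if all(want in dmi.get(field, "")
               for field, want in entry["match"].items()):
            actions.append(entry["action"])
    return actions


machine = {
    "bios_vendor": "Example BIOS Vendor Inc.",
    "board_name": "EXAMPLE-440GX rev 2",
}
print(find_quirks(machine))
```

The point of such a table is that the workaround is applied automatically at boot for the listed boards, which would explain a kernel starting to work without any change to the user's configuration.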

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 12:46     ` Stefan Fleiter
  2001-08-20 15:19       ` Ville Herva
@ 2001-08-20 16:45       ` Doug Ledford
  2001-08-20 17:23         ` Stefan Fleiter
  2001-08-20 20:28       ` Justin T. Gibbs
  2 siblings, 1 reply; 75+ messages in thread
From: Doug Ledford @ 2001-08-20 16:45 UTC (permalink / raw)
  To: Stefan Fleiter; +Cc: linux-kernel

Stefan Fleiter wrote:
> Hi Alan!
> 
> On Mon, 20 Aug 2001 Alan Cox wrote:
> 
> 
>>>>With 2.4.8-ac7, I get SCSI errors and the kernel fails to boot. If I
>>>>compile with APIC enabled and APIC on UP also enabled, it boots
>>>>cleanly
>>>>
>>>I'm getting similar errors on 2.4.8-ac7 on my P2B-S motherboard using
>>>the NEW AIC7xxx driver; the old one isn't experiencing these problems. Also,
>>>I've been getting these errors since 2.4.3.
>>>
>>There is a known BIOS irq routing table problem with a large number of Intel
>>BIOS boards with onboard adaptec controllers.
>>
> 
> I have the same problem, but my Adaptec is _not_ onboard.

[snip]

This is *not* the same problem.  The original poster can't get his 
system booted at all (and that includes the fact that it won't even find 
all the drives and read partition tables or anything like that).  Your 
system is getting much further along.  Absolutely 0 progress is *vastly* 
different from progress mixed with some errors.



-- 

  Doug Ledford <dledford@redhat.com>  http://people.redhat.com/dledford
       Please check my web site for aic7xxx updates/answers before
                       e-mailing me about problems


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 16:45       ` Doug Ledford
@ 2001-08-20 17:23         ` Stefan Fleiter
  0 siblings, 0 replies; 75+ messages in thread
From: Stefan Fleiter @ 2001-08-20 17:23 UTC (permalink / raw)
  To: linux-kernel

Hi Doug!

On Mon, 20 Aug 2001 Doug Ledford wrote:


>>>>> With 2.4.8-ac7, I get SCSI errors and the kernel fails to boot.
>> I have the same problem, but my Adaptec is _not_ onboard.
> 
> [snip]
> 
> This is *not* the same problem.  The original poster can't get his 
> system booted at all (and that includes the fact that it won't even find 
> all the drives and read partition tables or anything like that).  Your 
> system is getting much further along.  Absolutely 0 progress is *vastly* 
> different from progress mixed with some errors.

Oh, sorry, you are right.
That will teach me to read the full mail next time.

Greetings,
Stefan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 16:21     ` Cliff Albert
@ 2001-08-20 17:23       ` Peter T. Breuer
  2001-08-20 17:28         ` Cliff Albert
  0 siblings, 1 reply; 75+ messages in thread
From: Peter T. Breuer @ 2001-08-20 17:23 UTC (permalink / raw)
  To: Cliff Albert; +Cc: linux kernel

"Cliff Albert wrote:"
> On Mon, Aug 20, 2001 at 11:37:33AM +0100, Alan Cox wrote:
> > There is a known BIOS irq routing table problem with a large number of Intel
> > BIOS boards with onboard adaptec controllers. The fact that making it use
> > the io-apic works suggests this is the same thing.
> 
> It's an ASUS P2B-S board with an Award BIOS, flashed to the latest revision
> that is available from ASUS.

I have exactly the same machine somewhere (will search, later) with onboard scsi.
I believe it's currently running 2.4.3 with apic enabled with only occasional
(weekly) troubles.

I'll see if I can dig it out and test it.

Peter

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 17:23       ` Peter T. Breuer
@ 2001-08-20 17:28         ` Cliff Albert
  0 siblings, 0 replies; 75+ messages in thread
From: Cliff Albert @ 2001-08-20 17:28 UTC (permalink / raw)
  To: Peter T. Breuer; +Cc: linux kernel

On Mon, Aug 20, 2001 at 07:23:31PM +0200, Peter T. Breuer wrote:

> "Cliff Albert wrote:"
> > On Mon, Aug 20, 2001 at 11:37:33AM +0100, Alan Cox wrote:
> > > There is a known BIOS irq routing table problem with a large number of Intel
> > > BIOS boards with onboard adaptec controllers. The fact that making it use
> > > the io-apic works suggests this is the same thing.
> > 
> > It's an ASUS P2B-S board with an Award BIOS, flashed to the latest revision
> > that is available from ASUS.
> 
> I have exactly the same machine somewhere (will search, later) with onboard scsi.
> I believe it's currently running 2.4.3 with apic enabled with only occasional
> (weekly) troubles.
> 
> I'll see if I can dig it out and test it.

Troubles are infrequent here too. Usually I get the errors at boot. With the
aic7xxx driver in 2.4.8-ac7 booted with aic7xxx=verbose I haven't seen them
yet, but I also sometimes get the errors after 8 or 9 hours of uptime. As the
box is around that uptime at the moment, I expect some errors very soon.

-- 
Cliff Albert		| RIPE:	     CA3348-RIPE | www.oisec.net
cliff@oisec.net		| 6BONE:     CA2-6BONE	 | icq 18461740

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20  8:55 ` Cliff Albert
  2001-08-20 10:37   ` Alan Cox
@ 2001-08-20 20:27   ` Justin T. Gibbs
  2001-08-20 20:45     ` Cliff Albert
  1 sibling, 1 reply; 75+ messages in thread
From: Justin T. Gibbs @ 2001-08-20 20:27 UTC (permalink / raw)
  To: Cliff Albert; +Cc: linux-kernel

>
>I'm getting similar errors on 2.4.8-ac7 on my P2B-S motherboard using
>the NEW AIC7xxx driver; the old one isn't experiencing these problems. Also,
>I've been getting these errors since 2.4.3.
>
>> booting with append="noapic", gives the same errors

Can you send me the full messages when you boot with "aic7xxx=verbose"?
That should help indicate the source of your problems.  I also
need to see the devices that are attached to the bus, so a full dmesg
from a successful boot with the old driver would be helpful.

--
Justin

^ permalink raw reply	[flat|nested] 75+ messages in thread
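
Before collecting logs it is worth confirming that the option actually reached the kernel. A small sketch, assuming a Linux system where the running kernel's command line can be read from /proc/cmdline; the helper name and the sample string are only illustrative.

```python
def get_boot_option(cmdline, name):
    """Look up one option on a kernel command line.

    Returns the value for 'name=value', True for a bare flag such as
    'noapic', and None if the option is absent. If the option appears
    more than once, the first occurrence wins.
    """
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == name:
            return value if sep else True
    return None


# Illustrative command line; on a live system use instead:
#     with open("/proc/cmdline") as f:
#         cmdline = f.read()
cmdline = "auto ro root=805 console=ttyS0,9600 aic7xxx=verbose"

print("aic7xxx:", get_boot_option(cmdline, "aic7xxx"))
print("noapic:", get_boot_option(cmdline, "noapic"))
```

With lilo, the option would typically be passed through the append= line that the original poster already used for "noapic".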

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 12:46     ` Stefan Fleiter
  2001-08-20 15:19       ` Ville Herva
  2001-08-20 16:45       ` Doug Ledford
@ 2001-08-20 20:28       ` Justin T. Gibbs
  2001-08-21 20:24         ` Stefan Fleiter
  2 siblings, 1 reply; 75+ messages in thread
From: Justin T. Gibbs @ 2001-08-20 20:28 UTC (permalink / raw)
  To: Stefan Fleiter; +Cc: linux-kernel

>> There is a known BIOS irq routing table problem with a large number of Intel
>> BIOS boards with onboard adaptec controllers.
>
>I have the same problem, but my Adaptec is _not_ onboard.

Not the same problem.  I need a full error message log with "aic7xxx=verbose".

--
Justin

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 15:19       ` Ville Herva
@ 2001-08-20 20:33         ` Justin T. Gibbs
  0 siblings, 0 replies; 75+ messages in thread
From: Justin T. Gibbs @ 2001-08-20 20:33 UTC (permalink / raw)
  To: Ville Herva; +Cc: linux-kernel

>Same here, with 2.2.18pre19 + Gibbs aic7xxx 6.1.7.
>

...

>(I reported this to Gibbs some time ago.)

I don't recall the particular details of your problem, but considering
the known bugs in 6.1.7, you might have success using something newer.

--
Justin

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 20:27   ` Justin T. Gibbs
@ 2001-08-20 20:45     ` Cliff Albert
  2001-08-20 21:04       ` Cliff Albert
  0 siblings, 1 reply; 75+ messages in thread
From: Cliff Albert @ 2001-08-20 20:45 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: linux-kernel

On Mon, Aug 20, 2001 at 02:27:49PM -0600, Justin T. Gibbs wrote:

> >I'm getting similar errors on 2.4.8-ac7 on my P2B-S motherboard using
> >the NEW AIC7xxx driver; the old one isn't experiencing these problems. Also,
> >I've been getting these errors since 2.4.3.
> >
> >> booting with append="noapic", gives the same errors
> 
> Can you send me the full messages when you boot with "aic7xxx=verbose"?
> That should help indicate the source of your problems.  I also
> need to see the devices that are attached to the bus, so a full dmesg
> from a successful boot with the old driver would be helpful.

Well, booting is successful on my board, but I'm getting the same errors
that almost everyone else is getting. I just turned on verbose.

I already sent most of the debugging info to the linux-kernel mailing list;
I'll forward it on to you. The verbose info will also be sent in a few
hours.

-- 
Cliff Albert		| RIPE:	     CA3348-RIPE | www.oisec.net
cliff@oisec.net		| 6BONE:     CA2-6BONE	 | icq 18461740

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 20:45     ` Cliff Albert
@ 2001-08-20 21:04       ` Cliff Albert
  2001-08-20 21:09         ` Cliff Albert
  2001-08-20 21:44         ` aic7xxx errors with 2.4.8-ac7 on 440gx mobo Justin T. Gibbs
  0 siblings, 2 replies; 75+ messages in thread
From: Cliff Albert @ 2001-08-20 21:04 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1267 bytes --]

On Mon, Aug 20, 2001 at 10:45:36PM +0200, Cliff Albert wrote:

> On Mon, Aug 20, 2001 at 02:27:49PM -0600, Justin T. Gibbs wrote:
> 
> > >I'm getting similar errors on 2.4.8-ac7 on my P2B-S motherboard using
> > >the NEW AIC7xxx driver; the old one isn't experiencing these problems. Also,
> > >I've been getting these errors since 2.4.3.
> > >
> > >> booting with append="noapic", gives the same errors
> > 
> > Can you send me the full messages when you boot with "aic7xxx=verbose"?
> > That should help indicate the source of your problems.  I also
> > need to see the devices that are attached to the bus, so a full dmesg
> > from a successful boot with the old driver would be helpful.
> 
> Well, booting is successful on my board, but I'm getting the same errors
> that almost everyone else is getting. I just turned on verbose.
> 
> I already sent most of the debugging info to the linux-kernel mailing list;
> I'll forward it on to you. The verbose info will also be sent in a few
> hours.

And here they are: the first dmesg is my boot-up log with the device drivers
and so on, and the second dmesg contains the actual errors (verbose turned on).

-- 
Cliff Albert		| RIPE:	     CA3348-RIPE | www.oisec.net
cliff@oisec.net		| 6BONE:     CA2-6BONE	 | icq 18461740

[-- Attachment #2: dmesg --]
[-- Type: text/plain, Size: 9296 bytes --]

Linux version 2.4.8-ac7 (root@neve) (gcc version 2.95.4 20010810 (Debian prerelease)) #16 Sun Aug 19 13:58:17 CEST 2001
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 0000000017ffd000 (usable)
 BIOS-e820: 0000000017ffd000 - 0000000017fff000 (ACPI data)
 BIOS-e820: 0000000017fff000 - 0000000018000000 (ACPI NVS)
 BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
On node 0 totalpages: 98301
zone(0): 4096 pages.
zone(1): 94205 pages.
zone(2): 0 pages.
Local APIC disabled by BIOS -- reenabling.
Found and enabled local APIC!
Kernel command line: auto BOOT_IMAGE=Linux248ac7 ro root=805 parport=0x378,7 parport=0x278,5 console=ttyS0,9600 aic7xxx=verbose
Initializing CPU#0
Detected 400.915 MHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 799.53 BogoMIPS
Memory: 384040k/393204k available (1358k kernel code, 8776k reserved, 409k data, 208k init, 0k highmem)
Dentry-cache hash table entries: 65536 (order: 7, 524288 bytes)
Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)
Mount-cache hash table entries: 8192 (order: 4, 65536 bytes)
Buffer-cache hash table entries: 32768 (order: 5, 131072 bytes)
Page-cache hash table entries: 131072 (order: 7, 524288 bytes)
CPU: Before vendor init, caps: 0183fbff 00000000 00000000, vendor = 0
CPU: L1 I cache: 16K, L1 D cache: 16K
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: After vendor init, caps: 0183fbff 00000000 00000000 00000000
CPU:     After generic, caps: 0183fbff 00000000 00000000 00000000
CPU:             Common caps: 0183fbff 00000000 00000000 00000000
CPU: Intel Celeron (Covington) stepping 01
Enabling fast FPU save and restore... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
enabled ExtINT on CPU#0
ESR value before enabling vector: 00000040
ESR value after enabling vector: 00000000
Using local APIC timer interrupts.
calibrating APIC timer ...
..... CPU clock speed is 400.8949 MHz.
..... host bus clock speed is 100.2236 MHz.
cpu: 0, clocks: 1002236, slice: 501118
CPU0<T0:1002224,T1:501104,D:2,S:501118,C:1002236>
PCI: PCI BIOS revision 2.10 entry at 0xf0720, last bus=1
PCI: Using configuration type 1
PCI: Probing PCI hardware
PCI: Using IRQ router PIIX [8086/7110] at 00:04.0
PCI: Found IRQ 14 for device 00:04.2
PCI: Sharing IRQ 14 with 00:06.0
Limiting direct PCI/PCI transfers.
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Initializing RT netlink socket
Simple Boot Flag extension found and enabled.
Starting kswapd v1.8
Journalled Block Device driver loaded
parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE,EPP]
parport0: cpp_daisy: aa5500ff(38)
parport0: assign_addrs: aa5500ff(38)
parport0: cpp_daisy: aa5500ff(38)
parport0: assign_addrs: aa5500ff(38)
0x278: FIFO is 16 bytes
0x278: writeIntrThreshold is 16
0x278: readIntrThreshold is 16
0x278: PWord is 8 bits
0x278: Interrupts are ISA-Pulses
0x278: ECP port cfgA=0x14 cfgB=0x40
0x278: ECP settings irq=<none or set by other means> dma=<none or set by other means>
parport1: PC-style at 0x278 (0x678), irq 5, using FIFO [PCSPP,TRISTATE,COMPAT,ECP]
parport1: cpp_daisy: aa5500ff87(b8)
parport1: assign_addrs: aa5500ff87(b8)
parport1: cpp_daisy: aa5500ff87(b8)
parport1: assign_addrs: aa5500ff87(b8)
Detected PS/2 Mouse Port.
pty: 256 Unix98 ptys configured
keyboard: Timeout - AT keyboard not present?(ed)
keyboard: Timeout - AT keyboard not present?(f4)
Serial driver version 5.05c (2001-07-08) with MANY_PORTS SHARE_IRQ SERIAL_PCI enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
lp0: using parport0 (interrupt-driven).
lp1: using parport1 (interrupt-driven).
Real Time Clock Driver v1.10d
ppdev: user-space parallel port driver
block: queued sectors max/low 254930kB/123858kB, 768 slots per queue
Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
PIIX4: IDE controller on PCI bus 00 dev 21
PCI: Enabling device 00:04.1 (0000 -> 0001)
PIIX4: chipset revision 1
PIIX4: not 100% native mode: will probe irqs later
PIIX4: neither IDE port enabled (BIOS)
ne2k-pci.c:v1.02 10/19/2000 D. Becker/P. Gortmaker
  http://www.scyld.com/network/ne2k-pci.html
PCI: Found IRQ 11 for device 00:0c.0
eth0: RealTek RTL-8029 found at 0xa000, IRQ 11, 00:40:05:5A:6B:90.
8139too Fast Ethernet driver 0.9.18a
PCI: Found IRQ 15 for device 00:0a.0
eth1: RealTek RTL8139 Fast Ethernet at 0xd8800000, 00:50:bf:51:7a:42, IRQ 15
eth1:  Identified 8139 chip type 'RTL-8139C'
PCI: Found IRQ 10 for device 00:0b.0
eth2: RealTek RTL8139 Fast Ethernet at 0xd8802000, 00:50:bf:21:62:9a, IRQ 10
eth2:  Identified 8139 chip type 'RTL-8139C'
Linux agpgart interface v0.99 (c) Jeff Hartmann
agpgart: Maximum main memory to use for agp memory: 321M
agpgart: Detected Intel 440BX chipset
agpgart: AGP aperture is 64M @ 0xe4000000
SCSI subsystem driver Revision: 1.00
PCI: Found IRQ 14 for device 00:06.0
PCI: Sharing IRQ 14 with 00:04.2
ahc_pci:0:6:0: Reading SEEPROM...done.
ahc_pci:0:6:0: Manual SE Termination
ahc_pci:0:6:0: Manual LVD Termination
ahc_pci:0:6:0: BIOS eeprom is present
ahc_pci:0:6:0: Primary Low Byte termination Enabled
ahc_pci:0:6:0: Primary High Byte termination Enabled
ahc_pci:0:6:0: Downloading Sequencer Program... 422 instructions downloaded
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.1
        <Adaptec aic7890/91 Ultra2 SCSI adapter>
        aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/255 SCBs

  Vendor: QUANTUM   Model: FIREBALL ST6.4S   Rev: 0F0C
  Type:   Direct-Access                      ANSI SCSI revision: 02
(scsi0:A:0:1): Sending SDTR period c, offset 7f
(scsi0:A:0:1): Received SDTR period c, offset f
	Filtered to period c, offset f
(scsi0:A:0): 20.000MB/s transfers (20.000MHz, offset 15)
scsi0: target 0 synchronous at 20.0MHz, offset = 0xf
  Vendor: IOMEGA    Model: ZIP 100           Rev: J.03
  Type:   Direct-Access                      ANSI SCSI revision: 02
  Vendor: YAMAHA    Model: CRW2100S          Rev: 1.0H
  Type:   CD-ROM                             ANSI SCSI revision: 02
(scsi0:A:5:1): Sending SDTR period c, offset 7f
(scsi0:A:5:1): Received SDTR period c, offset 7
	Filtered to period c, offset 7
(scsi0:A:5): 20.000MB/s transfers (20.000MHz, offset 7)
scsi0: target 5 synchronous at 20.0MHz, offset = 0x7
  Vendor: PLEXTOR   Model: CD-ROM PX-32TS    Rev: 1.01
  Type:   CD-ROM                             ANSI SCSI revision: 02
(scsi0:A:6:1): Sending SDTR period c, offset 7f
(scsi0:A:6:1): Received SDTR period c, offset f
	Filtered to period c, offset f
(scsi0:A:6): 20.000MB/s transfers (20.000MHz, offset 15)
scsi0: target 6 synchronous at 20.0MHz, offset = 0xf
(scsi0:A:0): 20.000MB/s transfers (20.000MHz, offset 15)
scsi0:0:0:0: Tagged Queuing enabled.  Depth 253
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Attached scsi removable disk sdb at scsi0, channel 0, id 4, lun 0
(scsi0:A:0:0): Sending SDTR period c, offset f
(scsi0:A:0:0): Received SDTR period c, offset f
	Filtered to period c, offset f
SCSI device sda: 12772516 512-byte hdwr sectors (6540 MB)
Partition check:
 sda: sda1 sda2 < sda5 sda6 sda7 sda8 sda9 >
sdb : READ CAPACITY failed.
sdb : status = 1, message = 00, host = 0, driver = 08 
Current sd00:00: sense key Not Ready
Additional sense indicates Medium not present
sdb : block size assumed to be 512 bytes, disk size 1GB.  
 sdb: I/O error: dev 08:10, sector 0
 unable to read partition table
Attached scsi CD-ROM sr0 at scsi0, channel 0, id 5, lun 0
Attached scsi CD-ROM sr1 at scsi0, channel 0, id 6, lun 0
(scsi0:A:5:0): Sending SDTR period c, offset 7
(scsi0:A:5:0): Received SDTR period c, offset 7
	Filtered to period c, offset 7
sr0: scsi3-mmc drive: 40x/40x writer cd/rw xa/form2 cdda tray
Uniform CD-ROM driver Revision: 3.12
(scsi0:A:6:0): Sending SDTR period c, offset f
(scsi0:A:6:0): Received SDTR period c, offset f
	Filtered to period c, offset f
(scsi0:A:6:0): Sending SDTR period c, offset f
(scsi0:A:6:0): Received SDTR period c, offset f
	Filtered to period c, offset f
sr1: scsi-1 drive
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
IP: routing cache hash table of 4096 buckets, 32Kbytes
TCP: Hash tables configured (established 32768 bind 32768)
ip_conntrack (3071 buckets, 24568 max)
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
IPv6 v0.8 for NET4.0
IPv6 over IPv4 tunneling driver
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 208k freed
Adding Swap: 136512k swap-space (priority -1)
reiserfs: checking transaction log (device 08:06) ...
Using r5 hash to sort names
ReiserFS version 3.6.25
reiserfs: checking transaction log (device 08:07) ...
Using r5 hash to sort names
ReiserFS version 3.6.25
reiserfs: checking transaction log (device 08:08) ...
Using r5 hash to sort names
ReiserFS version 3.6.25
reiserfs: checking transaction log (device 08:09) ...
Using r5 hash to sort names
ReiserFS version 3.6.25
eth1: Setting 100mbps full-duplex based on auto-negotiated partner ability 41e1.
eth2: Setting 100mbps full-duplex based on auto-negotiated partner ability 41e1.

[-- Attachment #3: dmesg.debug --]
[-- Type: text/plain, Size: 15163 bytes --]


  Type:   Direct-Access                      ANSI SCSI revision: 02
  Vendor: YAMAHA    Model: CRW2100S          Rev: 1.0H
  Type:   CD-ROM                             ANSI SCSI revision: 02
(scsi0:A:5:1): Sending SDTR period c, offset 7f
(scsi0:A:5:1): Received SDTR period c, offset 7
	Filtered to period c, offset 7
(scsi0:A:5): 20.000MB/s transfers (20.000MHz, offset 7)
scsi0: target 5 synchronous at 20.0MHz, offset = 0x7
  Vendor: PLEXTOR   Model: CD-ROM PX-32TS    Rev: 1.01
  Type:   CD-ROM                             ANSI SCSI revision: 02
(scsi0:A:6:1): Sending SDTR period c, offset 7f
(scsi0:A:6:1): Received SDTR period c, offset f
	Filtered to period c, offset f
(scsi0:A:6): 20.000MB/s transfers (20.000MHz, offset 15)
scsi0: target 6 synchronous at 20.0MHz, offset = 0xf
(scsi0:A:0): 20.000MB/s transfers (20.000MHz, offset 15)
scsi0:0:0:0: Tagged Queuing enabled.  Depth 253
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Attached scsi removable disk sdb at scsi0, channel 0, id 4, lun 0
(scsi0:A:0:0): Sending SDTR period c, offset f
(scsi0:A:0:0): Received SDTR period c, offset f
	Filtered to period c, offset f
SCSI device sda: 12772516 512-byte hdwr sectors (6540 MB)
Partition check:
 sda: sda1 sda2 < sda5 sda6 sda7 sda8 sda9 >
sdb : READ CAPACITY failed.
sdb : status = 1, message = 00, host = 0, driver = 08 
Current sd00:00: sense key Not Ready
Additional sense indicates Medium not present
sdb : block size assumed to be 512 bytes, disk size 1GB.  
 sdb: I/O error: dev 08:10, sector 0
 unable to read partition table
Attached scsi CD-ROM sr0 at scsi0, channel 0, id 5, lun 0
Attached scsi CD-ROM sr1 at scsi0, channel 0, id 6, lun 0
(scsi0:A:5:0): Sending SDTR period c, offset 7
(scsi0:A:5:0): Received SDTR period c, offset 7
	Filtered to period c, offset 7
sr0: scsi3-mmc drive: 40x/40x writer cd/rw xa/form2 cdda tray
Uniform CD-ROM driver Revision: 3.12
(scsi0:A:6:0): Sending SDTR period c, offset f
(scsi0:A:6:0): Received SDTR period c, offset f
	Filtered to period c, offset f
(scsi0:A:6:0): Sending SDTR period c, offset f
(scsi0:A:6:0): Received SDTR period c, offset f
	Filtered to period c, offset f
sr1: scsi-1 drive
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
IP: routing cache hash table of 4096 buckets, 32Kbytes
TCP: Hash tables configured (established 32768 bind 32768)
ip_conntrack (3071 buckets, 24568 max)
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
IPv6 v0.8 for NET4.0
IPv6 over IPv4 tunneling driver
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 208k freed
Adding Swap: 136512k swap-space (priority -1)
reiserfs: checking transaction log (device 08:06) ...
Using r5 hash to sort names
ReiserFS version 3.6.25
reiserfs: checking transaction log (device 08:07) ...
Using r5 hash to sort names
ReiserFS version 3.6.25
reiserfs: checking transaction log (device 08:08) ...
Using r5 hash to sort names
ReiserFS version 3.6.25
reiserfs: checking transaction log (device 08:09) ...
Using r5 hash to sort names
ReiserFS version 3.6.25
eth1: Setting 100mbps full-duplex based on auto-negotiated partner ability 41e1.
eth2: Setting 100mbps full-duplex based on auto-negotiated partner ability 41e1.
keyboard: Timeout - AT keyboard not present?(f4)
eth0: no IPv6 routers present
eth1: no IPv6 routers present
(scsi0:A:0:0): Sending SDTR period c, offset f
(scsi0:A:0:0): Received SDTR period c, offset f
	Filtered to period c, offset f
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0: Dumping Card State while idle, at SEQADDR 0x8
ACCUM = 0x0, SINDEX = 0x5, DINDEX = 0xe4, ARG_2 = 0x0
HCNT = 0x0
SCSISEQ = 0x12, SBLKCTL = 0xa
 DFCNTRL = 0x0, DFSTATUS = 0x89
LASTPHASE = 0x1, SCSISIGI = 0x0, SXFRCTL0 = 0x80
SSTAT0 = 0x0, SSTAT1 = 0xa
STACK == 0x3, 0x10d, 0x165, 0x0
SCB count = 16
Kernel NEXTQSCB = 7
Card NEXTQSCB = 7
QINFIFO entries: 
Waiting Queue entries: 
Disconnected Queue entries: 2:2 0:0 6:3 4:11 3:4 5:6 1:13 7:1 
QOUTFIFO entries: 
Sequencer Free SCB List: 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
Pending list: 2, 0, 3, 11, 4, 6, 13, 1
Kernel Free SCB list: 5 14 15 8 9 10 12 
DevQ(0:0:0): 6 waiting
DevQ(0:4:0): 0 waiting
DevQ(0:5:0): 0 waiting
DevQ(0:6:0): 0 waiting
(scsi0:A:0:0): Queuing a recovery SCB
scsi0:0:0:0: Device is disconnected, re-queuing SCB
(scsi0:A:0:0): Abort Tag Message Sent
Recovery code sleeping
(scsi0:A:0:0): SCB 3 - Abort Tag Completed.
Recovery SCB completes
Recovery code awake
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command found on device queue
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0: Dumping Card State while idle, at SEQADDR 0x9
ACCUM = 0x0, SINDEX = 0x7, DINDEX = 0xe4, ARG_2 = 0x0
HCNT = 0x0
SCSISEQ = 0x12, SBLKCTL = 0xa
 DFCNTRL = 0x0, DFSTATUS = 0x89
LASTPHASE = 0x1, SCSISIGI = 0x0, SXFRCTL0 = 0x80
SSTAT0 = 0x0, SSTAT1 = 0xa
STACK == 0x3, 0x10d, 0x165, 0xec
SCB count = 16
Kernel NEXTQSCB = 3
Card NEXTQSCB = 3
QINFIFO entries: 
Waiting Queue entries: 
Disconnected Queue entries: 2:2 0:0 4:11 3:4 5:6 1:13 7:1 
QOUTFIFO entries: 
Sequencer Free SCB List: 6 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
Pending list: 2, 0, 11, 4, 6, 13, 1
Kernel Free SCB list: 7 5 14 15 8 9 10 12 
DevQ(0:0:0): 0 waiting
DevQ(0:4:0): 0 waiting
DevQ(0:5:0): 0 waiting
DevQ(0:6:0): 0 waiting
(scsi0:A:0:0): Queuing a recovery SCB
scsi0:0:0:0: Device is disconnected, re-queuing SCB
(scsi0:A:0:0): Abort Tag Message Sent
Recovery code sleeping
(scsi0:A:0:0): SCB 0 - Abort Tag Completed.
Recovery SCB completes
Recovery code awake
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0: Dumping Card State while idle, at SEQADDR 0x9
ACCUM = 0x0, SINDEX = 0x3, DINDEX = 0xe4, ARG_2 = 0x0
HCNT = 0x0
SCSISEQ = 0x12, SBLKCTL = 0xa
 DFCNTRL = 0x0, DFSTATUS = 0x89
LASTPHASE = 0x1, SCSISIGI = 0x0, SXFRCTL0 = 0x80
SSTAT0 = 0x0, SSTAT1 = 0xa
STACK == 0x3, 0x10d, 0x165, 0xec
SCB count = 16
Kernel NEXTQSCB = 0
Card NEXTQSCB = 0
QINFIFO entries: 
Waiting Queue entries: 
Disconnected Queue entries: 2:2 4:11 3:4 5:6 1:13 7:1 
QOUTFIFO entries: 
Sequencer Free SCB List: 0 6 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
Pending list: 2, 11, 4, 6, 13, 1
Kernel Free SCB list: 3 7 5 14 15 8 9 10 12 
DevQ(0:0:0): 0 waiting
DevQ(0:4:0): 0 waiting
DevQ(0:5:0): 0 waiting
DevQ(0:6:0): 0 waiting
(scsi0:A:0:0): Queuing a recovery SCB
scsi0:0:0:0: Device is disconnected, re-queuing SCB
(scsi0:A:0:0): Abort Tag Message Sent
Recovery code sleeping
(scsi0:A:0:0): SCB 1 - Abort Tag Completed.
Recovery SCB completes
Recovery code awake
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0: Dumping Card State while idle, at SEQADDR 0x8
ACCUM = 0x0, SINDEX = 0x0, DINDEX = 0xe4, ARG_2 = 0x0
HCNT = 0x0
SCSISEQ = 0x12, SBLKCTL = 0xa
 DFCNTRL = 0x0, DFSTATUS = 0x89
LASTPHASE = 0x1, SCSISIGI = 0x0, SXFRCTL0 = 0x80
SSTAT0 = 0x0, SSTAT1 = 0xa
STACK == 0x3, 0x10d, 0x165, 0xec
SCB count = 16
Kernel NEXTQSCB = 1
Card NEXTQSCB = 1
QINFIFO entries: 
Waiting Queue entries: 
Disconnected Queue entries: 2:2 4:11 3:4 5:6 1:13 
QOUTFIFO entries: 
Sequencer Free SCB List: 7 0 6 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
Pending list: 2, 11, 4, 6, 13
Kernel Free SCB list: 0 3 7 5 14 15 8 9 10 12 
DevQ(0:0:0): 0 waiting
DevQ(0:4:0): 0 waiting
DevQ(0:5:0): 0 waiting
DevQ(0:6:0): 0 waiting
(scsi0:A:0:0): Queuing a recovery SCB
scsi0:0:0:0: Device is disconnected, re-queuing SCB
(scsi0:A:0:0): Abort Tag Message Sent
Recovery code sleeping
(scsi0:A:0:0): SCB 13 - Abort Tag Completed.
Recovery SCB completes
Recovery code awake
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0: Dumping Card State while idle, at SEQADDR 0x8
ACCUM = 0x0, SINDEX = 0x1, DINDEX = 0xe4, ARG_2 = 0x0
HCNT = 0x0
SCSISEQ = 0x12, SBLKCTL = 0xa
 DFCNTRL = 0x0, DFSTATUS = 0x89
LASTPHASE = 0x1, SCSISIGI = 0x0, SXFRCTL0 = 0x80
SSTAT0 = 0x0, SSTAT1 = 0xa
STACK == 0x3, 0x10d, 0x165, 0xec
SCB count = 16
Kernel NEXTQSCB = 13
Card NEXTQSCB = 13
QINFIFO entries: 
Waiting Queue entries: 
Disconnected Queue entries: 2:2 4:11 3:4 5:6 
QOUTFIFO entries: 
Sequencer Free SCB List: 1 7 0 6 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
Pending list: 2, 11, 4, 6
Kernel Free SCB list: 1 0 3 7 5 14 15 8 9 10 12 
DevQ(0:0:0): 0 waiting
DevQ(0:4:0): 0 waiting
DevQ(0:5:0): 0 waiting
DevQ(0:6:0): 0 waiting
(scsi0:A:0:0): Queuing a recovery SCB
scsi0:0:0:0: Device is disconnected, re-queuing SCB
(scsi0:A:0:0): Abort Tag Message Sent
Recovery code sleeping
(scsi0:A:0:0): SCB 2 - Abort Tag Completed.
Recovery SCB completes
Recovery code awake
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command not found
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command found on device queue
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command not found
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command found on device queue
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command not found
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command found on device queue
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command not found
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command found on device queue
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0: Dumping Card State while idle, at SEQADDR 0x8
ACCUM = 0x0, SINDEX = 0xd, DINDEX = 0xe4, ARG_2 = 0x0
HCNT = 0x0
SCSISEQ = 0x12, SBLKCTL = 0xa
 DFCNTRL = 0x0, DFSTATUS = 0x89
LASTPHASE = 0x1, SCSISIGI = 0x0, SXFRCTL0 = 0x80
SSTAT0 = 0x0, SSTAT1 = 0xa
STACK == 0x3, 0x10d, 0x165, 0xec
SCB count = 16
Kernel NEXTQSCB = 2
Card NEXTQSCB = 2
QINFIFO entries: 
Waiting Queue entries: 
Disconnected Queue entries: 4:11 3:4 5:6 
QOUTFIFO entries: 
Sequencer Free SCB List: 2 1 7 0 6 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
Pending list: 11, 4, 6
Kernel Free SCB list: 13 1 0 3 7 5 14 15 8 9 10 12 
DevQ(0:0:0): 0 waiting
DevQ(0:4:0): 0 waiting
DevQ(0:5:0): 0 waiting
DevQ(0:6:0): 0 waiting
(scsi0:A:0:0): Queuing a recovery SCB
scsi0:0:0:0: Device is disconnected, re-queuing SCB
(scsi0:A:0:0): Abort Tag Message Sent
Recovery code sleeping
(scsi0:A:0:0): SCB 6 - Abort Tag Completed.
Recovery SCB completes
Recovery code awake
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0: Dumping Card State while idle, at SEQADDR 0x8
ACCUM = 0x0, SINDEX = 0x2, DINDEX = 0xe4, ARG_2 = 0x0
HCNT = 0x0
SCSISEQ = 0x12, SBLKCTL = 0xa
 DFCNTRL = 0x0, DFSTATUS = 0x89
LASTPHASE = 0x1, SCSISIGI = 0x0, SXFRCTL0 = 0x80
SSTAT0 = 0x0, SSTAT1 = 0xa
STACK == 0x3, 0x10d, 0x165, 0xec
SCB count = 16
Kernel NEXTQSCB = 6
Card NEXTQSCB = 6
QINFIFO entries: 
Waiting Queue entries: 
Disconnected Queue entries: 4:11 3:4 
QOUTFIFO entries: 
Sequencer Free SCB List: 5 2 1 7 0 6 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
Pending list: 11, 4
Kernel Free SCB list: 2 13 1 0 3 7 5 14 15 8 9 10 12 
DevQ(0:0:0): 0 waiting
DevQ(0:4:0): 0 waiting
DevQ(0:5:0): 0 waiting
DevQ(0:6:0): 0 waiting
(scsi0:A:0:0): Queuing a recovery SCB
scsi0:0:0:0: Device is disconnected, re-queuing SCB
(scsi0:A:0:0): Abort Tag Message Sent
Recovery code sleeping
(scsi0:A:0:0): SCB 4 - Abort Tag Completed.
Recovery SCB completes
Recovery code awake
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0: Dumping Card State while idle, at SEQADDR 0x8
ACCUM = 0x0, SINDEX = 0x6, DINDEX = 0xe4, ARG_2 = 0x0
HCNT = 0x0
SCSISEQ = 0x12, SBLKCTL = 0xa
 DFCNTRL = 0x0, DFSTATUS = 0x89
LASTPHASE = 0x1, SCSISIGI = 0x0, SXFRCTL0 = 0x80
SSTAT0 = 0x0, SSTAT1 = 0xa
STACK == 0x3, 0x10d, 0x165, 0xec
SCB count = 16
Kernel NEXTQSCB = 4
Card NEXTQSCB = 4
QINFIFO entries: 
Waiting Queue entries: 
Disconnected Queue entries: 4:11 
QOUTFIFO entries: 
Sequencer Free SCB List: 3 5 2 1 7 0 6 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
Pending list: 11
Kernel Free SCB list: 6 2 13 1 0 3 7 5 14 15 8 9 10 12 
DevQ(0:0:0): 0 waiting
DevQ(0:4:0): 0 waiting
DevQ(0:5:0): 0 waiting
DevQ(0:6:0): 0 waiting
(scsi0:A:0:0): Queuing a recovery SCB
scsi0:0:0:0: Device is disconnected, re-queuing SCB
(scsi0:A:0:0): Abort Tag Message Sent
Recovery code sleeping
(scsi0:A:0:0): SCB 11 - Abort Tag Completed.
Recovery SCB completes
Recovery code awake
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command not found
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue an ABORT message
scsi0:0:0:0: Command not found
aic7xxx_abort returns 8194
scsi0:0:0:0: Attempting to queue a TARGET RESET message
scsi0:0:0:0: Command not found
aic7xxx_dev_reset returns 8194
(scsi0:A:0): 3.300MB/s transfers
scsi0: target 0 using asynchronous transfers
(scsi0:A:5): 3.300MB/s transfers
scsi0: target 5 using asynchronous transfers
(scsi0:A:6): 3.300MB/s transfers
scsi0: target 6 using asynchronous transfers
scsi0: SCSI bus reset delivered. 0 SCBs aborted.
(scsi0:A:0:0): Sending SDTR period c, offset f
(scsi0:A:0:0): Received SDTR period c, offset f
	Filtered to period c, offset f
(scsi0:A:0): 20.000MB/s transfers (20.000MHz, offset 15)
scsi0: target 0 synchronous at 20.0MHz, offset = 0xf
(scsi0:A:0:0): Sending SDTR period c, offset f
(scsi0:A:0:0): Received SDTR period c, offset f
	Filtered to period c, offset f
Device not ready.  Make sure there is a disc in the drive.
(scsi0:A:5:0): Sending SDTR period c, offset 7
(scsi0:A:5:0): Received SDTR period c, offset 7
	Filtered to period c, offset 7
(scsi0:A:5): 20.000MB/s transfers (20.000MHz, offset 7)
scsi0: target 5 synchronous at 20.0MHz, offset = 0x7
(scsi0:A:5:0): Sending SDTR period c, offset 7
(scsi0:A:5:0): Received SDTR period c, offset 7
	Filtered to period c, offset 7
(scsi0:A:6:0): Sending SDTR period c, offset f
(scsi0:A:6:0): Received SDTR period c, offset f
	Filtered to period c, offset f
(scsi0:A:6): 20.000MB/s transfers (20.000MHz, offset 15)
scsi0: target 6 synchronous at 20.0MHz, offset = 0xf
(scsi0:A:6:0): Sending SDTR period c, offset f
(scsi0:A:6:0): Received SDTR period c, offset f
	Filtered to period c, offset f
(scsi0:A:6:0): Sending SDTR period c, offset f
(scsi0:A:6:0): Received SDTR period c, offset f
	Filtered to period c, offset f
(scsi0:A:6:0): Sending SDTR period c, offset f
(scsi0:A:6:0): Received SDTR period c, offset f
	Filtered to period c, offset f
(scsi0:A:6:0): Sending SDTR period c, offset f
(scsi0:A:6:0): Received SDTR period c, offset f
	Filtered to period c, offset f
(scsi0:A:6:0): Sending SDTR period c, offset f
(scsi0:A:6:0): Received SDTR period c, offset f
	Filtered to period c, offset f
Device not ready.  Make sure there is a disc in the drive.
device eth0 entered promiscuous mode

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 21:04       ` Cliff Albert
@ 2001-08-20 21:09         ` Cliff Albert
  2001-08-20 21:45           ` Justin T. Gibbs
  2001-08-20 22:36           ` aic7xxx with 2.4.9 on 7899P Sven Heinicke
  2001-08-20 21:44         ` aic7xxx errors with 2.4.8-ac7 on 440gx mobo Justin T. Gibbs
  1 sibling, 2 replies; 75+ messages in thread
From: Cliff Albert @ 2001-08-20 21:09 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: linux-kernel

> > > >I'm getting similair errors on 2.4.8-ac7 on my P2B-S motherboard using
> > > >the NEW AIC7xxx driver, the old isn't experiencing these problems. Further
> > > >i've been getting these errors since 2.4.3.
> > > >
> > > >> booting with append="noapic", gives the same errors
> > > 
> > > Can you send me the full messages when you boot with "aic7xxx=verbose"?
> > > That should help indicate the source of your problems.  I also
> > > need to see the devices that are attached to the bus, so a full dmesg
> > > from a successful boot with the old driver would be helpful.
> > 
> > Well booting is successful on my board, but the same errors that almost
> > everyone is getting are the same i'm getting. I just turned on verbose.
> > 
> > Most debugging info i already send to the linux-kernel mailinglist, i'll
> > forward it on to you. The verbose info will be send also in about a few 
> > hours.
> 
> And here they are, the dmesg is my bootup dmesg with the devices drivers 
> and stuff, and the second dmesg is the actual errors (verbose turned on)

Some more research showed that the errors/lock of the SCSI bus always
appear about 20 seconds after kernel load when I cold boot the machine.
With a warm boot the machine gives these errors/lock at random times.

-- 
Cliff Albert		ripe:  CA3348-RIPE 	IPng: IPv6 Deployment
cliff@ipng.nl		6bone: CA2-6BONE	http://www.ipng.nl/ 

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 21:04       ` Cliff Albert
  2001-08-20 21:09         ` Cliff Albert
@ 2001-08-20 21:44         ` Justin T. Gibbs
  2001-08-20 21:48           ` Cliff Albert
  2001-08-25  7:15           ` Cliff Albert
  1 sibling, 2 replies; 75+ messages in thread
From: Justin T. Gibbs @ 2001-08-20 21:44 UTC (permalink / raw)
  To: Cliff Albert; +Cc: linux-kernel

>And here they are, the dmesg is my bootup dmesg with the devices drivers 
>and stuff, and the second dmesg is the actual errors (verbose turned on)

You need 0F0J or better firmware in your Fireball ST.  The firmware you
have now is known to be bad.  Before Maxtor's purchase of Quantum's
disk line, you used to be able to get firmware updates off of
ftp.quantum.com, but they've since cleared out those files.  In a
quick look through Maxtor's site, I could not find the relevant files.

--
Justin

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 21:09         ` Cliff Albert
@ 2001-08-20 21:45           ` Justin T. Gibbs
  2001-08-20 22:55             ` Cliff Albert
  2001-08-21 14:42             ` With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P) Sven Heinicke
  2001-08-20 22:36           ` aic7xxx with 2.4.9 on 7899P Sven Heinicke
  1 sibling, 2 replies; 75+ messages in thread
From: Justin T. Gibbs @ 2001-08-20 21:45 UTC (permalink / raw)
  To: Cliff Albert; +Cc: linux-kernel

>> And here they are, the dmesg is my bootup dmesg with the devices drivers 
>> and stuff, and the second dmesg is the actual errors (verbose turned on)
>
>Some more research pointed out that the errors/lock of the scsi bus always 
>appears about 20 seconds after kernel load when i cold boot the machine. 
>With a warm boot the machine gives these errors/lock at random times.

IIRC, the problem has to do with the state of the write cache in the drive.
The cache will be in a different state after power-on as compared to
after some amount of activity.

--
Justin

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 21:44         ` aic7xxx errors with 2.4.8-ac7 on 440gx mobo Justin T. Gibbs
@ 2001-08-20 21:48           ` Cliff Albert
  2001-08-25  7:15           ` Cliff Albert
  1 sibling, 0 replies; 75+ messages in thread
From: Cliff Albert @ 2001-08-20 21:48 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: linux-kernel

On Mon, Aug 20, 2001 at 03:44:34PM -0600, Justin T. Gibbs wrote:

> >And here they are, the dmesg is my bootup dmesg with the devices drivers 
> >and stuff, and the second dmesg is the actual errors (verbose turned on)
> 
> You need OFOJ or better firmware in your Fireball ST.  The firmware you
> have now is known to be bad.  Before Maxtor's purchase of Quantum's
> disk line, you used to be able to get firmware updates off of
> ftp.quantum.com, but they've hence cleared out those files.  In a
> quick look through Maxtor's site, I could not find the relevant files.

Damn, does anyone on the list know a place to get them? Then I can check
whether this error is disk or controller related.

-- 
Cliff Albert		| RIPE:	     CA3348-RIPE | www.oisec.net
cliff@oisec.net		| 6BONE:     CA2-6BONE	 | icq 18461740

* aic7xxx with 2.4.9 on 7899P
  2001-08-20 21:09         ` Cliff Albert
  2001-08-20 21:45           ` Justin T. Gibbs
@ 2001-08-20 22:36           ` Sven Heinicke
  1 sibling, 0 replies; 75+ messages in thread
From: Sven Heinicke @ 2001-08-20 22:36 UTC (permalink / raw)
  To: linux-kernel


It's always a blessing and a curse when people seem to be having
problems with the same drivers as you.  I started looking into this
when a user complained about disk access time.  I think this is
related to the ongoing aic7xxx topics.

From my tests, I have a Dell 4400 whose Adaptec 7899P, according to
bonnie++, was writing slower than some of my IDE drives on a
different system.  I tried Red Hat's 2.4.3-12smp kernel and got a
little improvement.  I then built 2.4.9 and started running bonnie++
again, and my syslog fills up with errors like these:

Aug 20 14:23:33 ps1 kernel: __alloc_pages: 0-order all
Aug 20 14:23:36 ps1 last message repeated 376 times
Aug 20 14:23:36 ps1 kernel: ed.
Aug 20 14:23:36 ps1 kernel: __alloc_pages: 0-order all
Aug 20 14:23:44 ps1 last message repeated 376 times
Aug 20 14:23:44 ps1 kernel: ed.
Aug 20 14:23:44 ps1 kernel: __alloc_pages: 0-order all
Aug 20 14:23:44 ps1 last message repeated 363 times

Access times are slow as well.  Please request more info if you think it
might help.

	Sven

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 21:45           ` Justin T. Gibbs
@ 2001-08-20 22:55             ` Cliff Albert
  2001-08-21  0:36               ` Justin T. Gibbs
  2001-08-21 14:42             ` With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P) Sven Heinicke
  1 sibling, 1 reply; 75+ messages in thread
From: Cliff Albert @ 2001-08-20 22:55 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: linux-kernel

On Mon, Aug 20, 2001 at 03:45:54PM -0600, Justin T. Gibbs wrote:

> >> And here they are, the dmesg is my bootup dmesg with the devices drivers 
> >> and stuff, and the second dmesg is the actual errors (verbose turned on)
> >
> >Some more research pointed out that the errors/lock of the scsi bus always 
> >appears about 20 seconds after kernel load when i cold boot the machine. 
> >With a warm boot the machine gives these errors/lock at random times.
> 
> IIRC, the problem has to do with the state of the write cache in the drive.
> The cache will be in a different state after power-on as compared to
> after some amount of activity.

Well, I still suspect the broken firmware of the disk isn't the only cause of
these errors, as I've heard of more people having problems with the new aic7xxx
driver on multiple platforms with different disks, but with the same error
messages I've experienced. I'll get together with those people and
try to collect some more debugging info for you.

-- 
Cliff Albert		| RIPE:	     CA3348-RIPE | www.oisec.net
cliff@oisec.net		| 6BONE:     CA2-6BONE	 | icq 18461740

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 22:55             ` Cliff Albert
@ 2001-08-21  0:36               ` Justin T. Gibbs
  2001-08-21 15:34                 ` Gérard Roudier
  0 siblings, 1 reply; 75+ messages in thread
From: Justin T. Gibbs @ 2001-08-21  0:36 UTC (permalink / raw)
  To: Cliff Albert; +Cc: linux-kernel

>> IIRC, the problem has to do with the state of the write cache in the drive.
>> The cache will be in a different state after power-on as compared to
>> after some amount of activity.
>
>Well i still suspect the broken firmware of the disk isn't the only cause of
>these errors

Perhaps.

One thing to keep in mind, however, is that although the messages may
*look* the same, they very rarely are.  If you don't have verbose turned
on, all transaction timeouts, regardless of the reason, look the same.
It is only by analyzing the verbose output that a cause for a particular
problem can be found.  In the case of your system, we always time out
with the bus idle and several transactions pending, and we can always abort
a transaction successfully (i.e. the bus is not dead, and neither is the
target); it's just that some transactions never complete.  These are
exactly the symptoms of the bogus FireBall ST firmware on your drive.

Another thing to keep in mind...  The newer driver defaults to using
tagged queuing and attempts to issue the maximum number of concurrent
transactions possible to each device.  The old driver, until fairly
recently, defaulted to leaving tagged queuing disabled, and if enabled,
only queued 8 transactions.  So, the new aic7xxx driver often places
a much higher load on your SCSI setup than the old one did.  I think
this has something to do with the large number of reports.

This doesn't mean that there haven't been, or don't continue to be, bugs.
After all, this is software, but I am doing my best to make
it work. 8-)

--
Justin

^ permalink raw reply	[flat|nested] 75+ messages in thread

* With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P)
  2001-08-20 21:45           ` Justin T. Gibbs
  2001-08-20 22:55             ` Cliff Albert
@ 2001-08-21 14:42             ` Sven Heinicke
  2001-08-21 15:08               ` Daniel Phillips
                                 ` (3 more replies)
  1 sibling, 4 replies; 75+ messages in thread
From: Sven Heinicke @ 2001-08-21 14:42 UTC (permalink / raw)
  To: linux-kernel


Forgive the sin of replying to my own message but Daniel Phillips
replied to a different message with a patch to somebody getting a
similar error to mine.  Here is the result:

Aug 20 15:10:33 ps1 kernel: cation failed (gfp=0x30/1). 
Aug 20 15:10:33 ps1 kernel: __alloc_pages: 0-order allocation failed
(gfp=0x30/1). 
Aug 20 15:10:46 ps1 last message repeated 327 times 
Aug 20 15:10:47 ps1 kernel: cation failed (gfp=0x30/1). 
Aug 20 15:10:47 ps1 kernel: __alloc_pages: 0-order allocation failed
(gfp=0x30/1). 
Aug 20 15:10:56 ps1 last message repeated 294 times 


Sven Heinicke writes:
 > 
 > It's always a blessing and a curse when people seem to be having
 > problems with the same drivers as you.  I started looking into this
 > when a user complained about disk access time.  I think this is
 > related to the running aic7xxx topics.
 > 
 > From my tests, I got a Dell 4400 whose Adaptec 7899P, according to
 > bonnie++, was writing slower than some of my IDE drives on a
 > different system.  I tried Red Hat's 2.4.3-12smp kernel and got a
 > little improvement.  I then built 2.4.9 and started running bonnie++
 > again and my syslog gets filled up with such errors:
 > 
 > Aug 20 14:23:33 ps1 kernel: __alloc_pages: 0-order all
 > Aug 20 14:23:36 ps1 last message repeated 376 times
 > Aug 20 14:23:36 ps1 kernel: ed.
 > Aug 20 14:23:36 ps1 kernel: __alloc_pages: 0-order all
 > Aug 20 14:23:44 ps1 last message repeated 376 times
 > Aug 20 14:23:44 ps1 kernel: ed.
 > Aug 20 14:23:44 ps1 kernel: __alloc_pages: 0-order all
 > Aug 20 14:23:44 ps1 last message repeated 363 times
 > 
 > With slow access time.  Please request more info if you think it might
 > help.
 > 
 > 	Sven
 > -
 > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
 > the body of a message to majordomo@vger.kernel.org
 > More majordomo info at  http://vger.kernel.org/majordomo-info.html
 > Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P)
  2001-08-21 14:42             ` With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P) Sven Heinicke
@ 2001-08-21 15:08               ` Daniel Phillips
  2001-08-21 16:48               ` Sven Heinicke
                                 ` (2 subsequent siblings)
  3 siblings, 0 replies; 75+ messages in thread
From: Daniel Phillips @ 2001-08-21 15:08 UTC (permalink / raw)
  To: Sven Heinicke, linux-kernel

On August 21, 2001 04:42 pm, Sven Heinicke wrote:
> Forgive the sin of replying to my own message but Daniel Phillips
> replied to a different message with a patch to somebody getting a
> similar error to mine.  Here is the result:
> 
> Aug 20 15:10:33 ps1 kernel: cation failed (gfp=0x30/1). 
> Aug 20 15:10:33 ps1 kernel: __alloc_pages: 0-order allocation failed
> (gfp=0x30/1). 
> Aug 20 15:10:46 ps1 last message repeated 327 times 
> Aug 20 15:10:47 ps1 kernel: cation failed (gfp=0x30/1). 
> Aug 20 15:10:47 ps1 kernel: __alloc_pages: 0-order allocation failed
> (gfp=0x30/1). 
> Aug 20 15:10:56 ps1 last message repeated 294 times 

Are you using highmem?  Could you try it with highmem configged off?
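
For readers following the log excerpts, here is a minimal sketch of how
such an __alloc_pages line can be picked apart. The values for
__GFP_HIGH, __GFP_IO and __GFP_FS are the ones visible in the
include/linux/mm.h hunk of the patch posted later in this thread; the
value 0x10 for __GFP_WAIT is an assumption, as is the helper name
decode(). An order-n allocation asks for 2^n contiguous pages. The
trailing "/0" or "/1" is captured but not interpreted, since the thread
does not say what it means.

```python
import re

# Bit values: 0x20, 0x40 and 0x80 are quoted in the mm.h hunk in this thread.
# 0x10 for __GFP_WAIT is an ASSUMPTION (it is not shown in the thread).
GFP_FLAGS = {
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
}

LINE_RE = re.compile(
    r"__alloc_pages: (\d+)-order allocation failed\s*"
    r"\(gfp=0x([0-9a-fA-F]+)/(\d+)\)"
)


def decode(line):
    """Return order, page count and flag names for one log line, or None."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    order = int(match.group(1))
    mask = int(match.group(2), 16)
    names = [name for bit, name in sorted(GFP_FLAGS.items()) if mask & bit]
    return {
        "order": order,
        "pages": 1 << order,      # 2**order contiguous pages requested
        "gfp": mask,
        "flags": names,
        "tail": match.group(3),   # the "/0" or "/1" suffix, left uninterpreted
    }
```

Under the assumption above, gfp=0x30 decodes to __GFP_WAIT plus
__GFP_HIGH, which matches the GFP_NOIO definition shown as context in
the same mm.h hunk.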

--
Daniel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-21  0:36               ` Justin T. Gibbs
@ 2001-08-21 15:34                 ` Gérard Roudier
  0 siblings, 0 replies; 75+ messages in thread
From: Gérard Roudier @ 2001-08-21 15:34 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: Cliff Albert, linux-kernel


Hi Justin,

The Linux SCSI device quirks table can be found in scsi_scan.c. :-)

It does not allow limiting the tagged command queue depth, only disabling
the feature with the BLIST_NOTQ flag. Adding an upper limit seems
feasible without too large a change. Low-level drivers (SIMs) could then
access this information when called for select_queue_depth(), or the SCSI
layer could just lower the value to this upper limit for each device.

And to fill such a new quirks table, you might just get the
data from the FreeBSD cam_xpt.c file you know very well. :-)

Historically, the tagged command queueing feature has been something of a
nightmare under Linux, and low-level drivers that supported this feature
used to default to a safe value for the command queue depth.
The ncr53c8xx and sym53c8xx drivers defaulted and still default to 8
commands per LUN. This has proven to work reasonably well even with
known-broken firmware. Some other drivers just defaulted to no tags.

Without some handling of appropriate per-device quirks regarding tagged
command queue depth, it is not reasonable, in my opinion, to default to
something larger than 8 commands per LUN under current Linux.
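
The per-device upper limit proposed above could look roughly like the
following sketch. It is only an illustration: the table entries, the
vendor and model strings, and the function name queue_depth() are all
made up, and nothing here reflects the actual layout of the quirks
table in scsi_scan.c or in FreeBSD's cam_xpt.c.

```python
# Conservative default depth, following the 8 commands per LUN argued for above.
DEFAULT_MAX_TAGS = 8

# HYPOTHETICAL quirk entries: (vendor, model prefix) -> maximum tagged commands.
# A limit of 0 stands for "no tagged queuing at all".
QUIRKS = {
    ("EXAMPLE", "FLAKY-DISK"): 0,
    ("EXAMPLE", "FAST-DISK"): 64,
}


def queue_depth(vendor, model, driver_wants):
    """Clamp the depth a low-level driver asks for to the per-device limit."""
    limit = DEFAULT_MAX_TAGS
    for (q_vendor, q_prefix), q_limit in QUIRKS.items():
        if vendor == q_vendor and model.startswith(q_prefix):
            limit = q_limit
            break
    return min(driver_wants, limit)
```

A driver that wants 253 tags would get 8 for an unknown device, 64 for
a device listed as capable, and 0 for a device listed as broken.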

My feeling, and to some extent my experience, is that not reusing tag
numbers too quickly avoids triggering races in broken firmware. This is
the reason I implemented a circular tag number allocation scheme in the
ncr53c8xx and sym53c8xx drivers, and it seems to have had the expected
effect.
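
A circular tag allocation scheme of the kind described above can be
sketched as follows. This is not the ncr53c8xx/sym53c8xx code; the
class and method names are invented for illustration. The point is
only that the search for a free tag resumes just past the last tag
handed out, so a tag that was just freed is not reused until the
allocator has walked around the whole ring.

```python
class CircularTagAllocator:
    """Hands out tag numbers round-robin instead of lowest-free-first."""

    def __init__(self, num_tags):
        self.num_tags = num_tags
        self.in_use = [False] * num_tags
        self.next_tag = 0  # position where the next search starts

    def alloc(self):
        """Return a free tag, or None if every tag is outstanding."""
        for offset in range(self.num_tags):
            tag = (self.next_tag + offset) % self.num_tags
            if not self.in_use[tag]:
                self.in_use[tag] = True
                self.next_tag = (tag + 1) % self.num_tags
                return tag
        return None

    def free(self, tag):
        """Mark a tag as available again."""
        self.in_use[tag] = False
```

With four tags, allocating 0 and 1 and then freeing 0 makes the next
allocation return 2 rather than 0.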

Regards,
  Gérard.

PS1: By the way, I do agree with your analysis of the problem.
PS2: Thanks a lot for all your efforts for SCSI in free O/Ses.

On Mon, 20 Aug 2001, Justin T. Gibbs wrote:

> >> IIRC, the problem has to do with the state of the write cache in the drive.
> >> The cache will be in a different state after power-on as compared to
> >> after some amount of activity.
> >
> >Well i still suspect the broken firmware of the disk isn't the only cause of
> >these errors
>
> Perhaps.
>
> One thing to keep in mind, however, is that although the messages may
> *look* the same, they very rarely are.  If you don't have verbose turned
> on, all transaction timeouts, regardless of the reason, look the same.
> It is only by analyzing the verbose output that a cause for a particular
> problem can be found.  In the case of your system, we always time out
> with the bus idle and several transactions pending, and we can always abort
> a transaction successfully (i.e. the bus is not dead, and neither is the
> target); it's just that some transactions never complete.  These are
> exactly the symptoms of the bogus FireBall ST firmware on your drive.
>
> Another thing to keep in mind...  The newer driver defaults to using
> tagged queuing and attempts to issue the maximum number of concurrent
> transactions possible to each device.  The old driver, until fairly
> recently, defaulted to leaving tagged queuing disabled, and if enabled,
> only queued 8 transactions.  So, the new aic7xxx driver often places
> a much higher load on your SCSI setup than the old one did.  I think
> this has something to do with the large number of reports.
>
> This doesn't mean that there haven't been, or don't continue to be, bugs.
> After all, this is software, but I am doing my best to make
> it work. 8-)
>
> --
> Justin


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P)
  2001-08-21 14:42             ` With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P) Sven Heinicke
  2001-08-21 15:08               ` Daniel Phillips
@ 2001-08-21 16:48               ` Sven Heinicke
  2001-08-21 17:18                 ` Justin T. Gibbs
                                   ` (4 more replies)
  2001-08-22 10:25               ` Marcelo Tosatti
  2001-08-22 16:09               ` Sven Heinicke
  3 siblings, 5 replies; 75+ messages in thread
From: Sven Heinicke @ 2001-08-21 16:48 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel


Yes, highmem was on; the system has 4G of memory.  I turned off
highmem and got no messages apart from one:

Aug 21 07:29:19 ps1 kernel: (scsi0:A:0:0): Locking max tag count at 64

which I was getting before.

Disk access is faster than before but still slower than the IDE
drive.  Any ideas?

Thanks for the help.

Daniel Phillips writes:
 > On August 21, 2001 04:42 pm, Sven Heinicke wrote:
 > > Forgive the sin of replying to my own message but Daniel Phillips
 > > replied to a different message with a patch to somebody getting a
 > > similar error to mine.  Here is the result:
 > > 
 > > Aug 20 15:10:33 ps1 kernel: cation failed (gfp=0x30/1). 
 > > Aug 20 15:10:33 ps1 kernel: __alloc_pages: 0-order allocation failed
 > > (gfp=0x30/1). 
 > > Aug 20 15:10:46 ps1 last message repeated 327 times 
 > > Aug 20 15:10:47 ps1 kernel: cation failed (gfp=0x30/1). 
 > > Aug 20 15:10:47 ps1 kernel: __alloc_pages: 0-order allocation failed
 > > (gfp=0x30/1). 
 > > Aug 20 15:10:56 ps1 last message repeated 294 times 
 > 
 > Are you using highmem?  Could you try it with highmem configged off?
 > 
 > --
 > Daniel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P)
  2001-08-21 16:48               ` Sven Heinicke
@ 2001-08-21 17:18                 ` Justin T. Gibbs
  2001-08-21 17:26                 ` Daniel Phillips
                                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 75+ messages in thread
From: Justin T. Gibbs @ 2001-08-21 17:18 UTC (permalink / raw)
  To: Sven Heinicke; +Cc: Daniel Phillips, linux-kernel

>Disk access is faster than before but still slower than the IDE
>drive.  Any ideas?

It could be the occasional ordered tag that is sent to the drive to
prevent tag starvation.  If you search in drivers/scsi/aic7xxx/aic7xxx_linux.c
for "OTAG_THRESH" and make that if test always fail (add an "&& 0") you will
have effectively disabled this feature.  I should probably make it an option
that defaults to off.
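
To illustrate the general idea of an ordered-tag threshold, here is a
small sketch. It is a guess at the concept only and is not taken from
aic7xxx_linux.c: the function name, the counter, and the threshold
value are all hypothetical. After a run of simple-tagged commands, one
command is sent with an ordered tag so that older commands cannot be
starved by the drive's reordering. Passing enabled=False plays the role
of the "&& 0" suggested above.

```python
def tag_types(num_commands, threshold, enabled=True):
    """Return the tag type chosen for each of num_commands commands.

    HYPOTHETICAL model: once `threshold` simple-tagged commands have been
    issued since the last ordered tag, the next command is sent ORDERED.
    """
    types = []
    since_ordered = 0
    for _ in range(num_commands):
        if enabled and since_ordered >= threshold:
            types.append("ORDERED")
            since_ordered = 0
        else:
            types.append("SIMPLE")
            since_ordered += 1
    return types
```

With a threshold of 3, every fourth command in a long stream is
ordered; with the feature disabled, all commands use simple tags.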

--
Justin

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P)
  2001-08-21 16:48               ` Sven Heinicke
  2001-08-21 17:18                 ` Justin T. Gibbs
@ 2001-08-21 17:26                 ` Daniel Phillips
  2001-08-21 17:55                 ` Stephan von Krawczynski
                                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 75+ messages in thread
From: Daniel Phillips @ 2001-08-21 17:26 UTC (permalink / raw)
  To: Sven Heinicke; +Cc: linux-kernel

On August 21, 2001 06:48 pm, Sven Heinicke wrote:
> Yes, highmem was on; the system has 4G of memory.  I turned off
> highmem and got no messages apart from one:
> 
> Aug 21 07:29:19 ps1 kernel: (scsi0:A:0:0): Locking max tag count at 64
> 
> which I was getting before.
>
> Disk access is faster than before but still slower than the IDE
> drive.  Any ideas?

Two separate problems, I think.  I don't know anything about the aic7xxx 
driver but I can take a look at the highmem problem.  First, can you try
it with highmem enabled, on a recent -ac kernel, say 2.4.8-ac7.

--
Daniel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P)
  2001-08-21 16:48               ` Sven Heinicke
  2001-08-21 17:18                 ` Justin T. Gibbs
  2001-08-21 17:26                 ` Daniel Phillips
@ 2001-08-21 17:55                 ` Stephan von Krawczynski
  2001-08-21 18:33                   ` Justin T. Gibbs
  2001-08-21 22:44                 ` With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P) Sven Heinicke
  2001-08-21 22:49                 ` Sven Heinicke
  4 siblings, 1 reply; 75+ messages in thread
From: Stephan von Krawczynski @ 2001-08-21 17:55 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel

On Tue, 21 Aug 2001 19:26:58 +0200
Daniel Phillips <phillips@bonn-fries.net> wrote:

> On August 21, 2001 06:48 pm, Sven Heinicke wrote:
> > Yes, highmem was on; the system has 4G of memory.  I turned off
> > highmem and got no messages apart from one:
> > 
> > Aug 21 07:29:19 ps1 kernel: (scsi0:A:0:0): Locking max tag count at 64
> > 
> > which I was getting before.
> >
> > Disk access is faster than before but still slower than the IDE
> > drive.  Any ideas?
> 
> Two separate problems, I think.  I don't know anything about the aic7xxx 
> driver but I can take a look at the highmem problem.  First, can you try
> it with highmem enabled, on a recent -ac kernel, say 2.4.8-ac7.

OK, Daniel, here are the results of the German jury :-) (EU insider joke)

Aug 21 19:46:40 admin kernel: __alloc_pages: 2-order allocation failed (gfp=0x20/0).
Aug 21 19:46:40 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 21 19:46:40 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 21 19:46:40 admin kernel: __alloc_pages: 2-order allocation failed (gfp=0x20/0).
Aug 21 19:46:40 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 21 19:46:40 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 21 19:46:40 admin kernel: __alloc_pages: 2-order allocation failed (gfp=0x20/0).
Aug 21 19:46:40 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 21 19:46:40 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 21 19:46:40 admin kernel: __alloc_pages: 2-order allocation failed (gfp=0x20/0).

And what may be of big interest for Justin: I am using the _old_ AIC7xxx driver. 

The problem can be reproduced quite easily on my side. All you need is a
(problem) host running an NFS server, and a client system. Then simply copy a
lot of big files to the server. If you now go and read a CD on the server, you
are in big trouble: CPU load shoots through the ceiling and after about 3
minutes you cannot even type characters in a shell. Remember, I am sitting in
front of a dual P-III 1GHz with 1 GB of RAM and U160 SCSI; I simply cannot
believe this. I have never seen such a thing under 2.2.

Regards,
Stephan

PS: I will try disabling HighMem next.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P)
  2001-08-21 17:55                 ` Stephan von Krawczynski
@ 2001-08-21 18:33                   ` Justin T. Gibbs
  2001-08-22  6:46                     ` Jens Axboe
  0 siblings, 1 reply; 75+ messages in thread
From: Justin T. Gibbs @ 2001-08-21 18:33 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: Daniel Phillips, linux-kernel

>And what may be of big interest for Justin: I am using the _old_ AIC7xxx
>driver. 

That doesn't surprise me.  The IA64 port had similar issues until I added
39-bit addressing support to the aic7xxx driver.  Unfortunately, the x86 port
doesn't support passing large DMA addresses to drivers, so bouncing is required
in order to do PAE.

--
Justin

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 20:28       ` Justin T. Gibbs
@ 2001-08-21 20:24         ` Stefan Fleiter
  0 siblings, 0 replies; 75+ messages in thread
From: Stefan Fleiter @ 2001-08-21 20:24 UTC (permalink / raw)
  To: linux-kernel

Hi Justin!

On Mon, 20 Aug 2001 Justin T. Gibbs wrote:


>> I have the same problem, but my Adaptec is _not_ onboard.
> 
> Not the same problem.  I need a full error message log with "aic7xxx=verbose".

Only happened once.
Will send it when I find a way to reproduce.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P)
  2001-08-21 16:48               ` Sven Heinicke
                                   ` (2 preceding siblings ...)
  2001-08-21 17:55                 ` Stephan von Krawczynski
@ 2001-08-21 22:44                 ` Sven Heinicke
  2001-08-22  0:58                   ` Daniel Phillips
  2001-08-21 22:49                 ` Sven Heinicke
  4 siblings, 1 reply; 75+ messages in thread
From: Sven Heinicke @ 2001-08-21 22:44 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel


Great, with 2.4.8-ac8 I get no memory problems.  Can you tell me what
file(s) were modified to fix this so I can look for the fixes in
future vanilla kernels?

Thanks!  Now to work on the drive speed problem: it's faster with your
fix but still slower at writing than my IDE drive on another system.

       Sven

Daniel Phillips writes:
 > On August 21, 2001 06:48 pm, Sven Heinicke wrote:
 > > Yes, highmem was on; the system has 4G of memory.  I turned off
 > > highmem and got no messages apart from one:
 > > 
 > > Aug 21 07:29:19 ps1 kernel: (scsi0:A:0:0): Locking max tag count at 64
 > > 
 > > which I was getting before.
 > >
 > > Disk access is faster than before but still slower than the IDE
 > > drive.  Any ideas?
 > 
 > Two separate problems, I think.  I don't know anything about the aic7xxx 
 > driver but I can take a look at the highmem problem.  First, can you try
 > it with highmem enabled, on a recent -ac kernel, say 2.4.8-ac7.
 > 
 > --
 > Daniel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P)
  2001-08-21 16:48               ` Sven Heinicke
                                   ` (3 preceding siblings ...)
  2001-08-21 22:44                 ` With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P) Sven Heinicke
@ 2001-08-21 22:49                 ` Sven Heinicke
  2001-08-22 13:06                   ` Gérard Roudier
  4 siblings, 1 reply; 75+ messages in thread
From: Sven Heinicke @ 2001-08-21 22:49 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: Daniel Phillips, linux-kernel


Justin,

I've tried removing your check; for writing, bonnie++ still reports
slower write times than the IDE drive on the other system.  Daniel Phillips
was great at helping me notice a problem that was causing a slowdown
but was not related to the aic7xxx driver, so I am now trying the
2.4.8-ac8 kernel.

I made your change and will try to get my user to test the system
with and without it.

     Sven

Justin T. Gibbs writes:
 > >Disk access is faster than before but still slower than the IDE
 > >drive.  Any ideas?
 > 
 > It could be the occasional ordered tag that is sent to the drive to
 > prevent tag starvation.  If you search in drivers/scsi/aic7xxx/aic7xxx_linux.c
 > for "OTAG_THRESH" and make that if test always fail (add an "&& 0") you will
 > have effectively disabled this feature.  I should probably make it an option
 > that defaults to off.
 > 
 > --
 > Justin

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P)
  2001-08-21 22:44                 ` With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P) Sven Heinicke
@ 2001-08-22  0:58                   ` Daniel Phillips
  0 siblings, 0 replies; 75+ messages in thread
From: Daniel Phillips @ 2001-08-22  0:58 UTC (permalink / raw)
  To: Sven Heinicke, linux-kernel; +Cc: Marcelo Tosatti, Andrew Morton, Ben LaHaise

On August 22, 2001 12:44 am, Sven Heinicke wrote:
> Great, with 2.4.8-ac8 I get no memory problems.  Can you tell me what
> file(s) were modified to fix this so I can look for the fixes in
> future vanilla kernels?
> 
> Thanks!  Now to work on the drive speed problem: it's faster with your
> fix but still slower at writing than my IDE drive on another system.

It's a highmem scanning problem.  These were supposed to be fixed in a series 
of changes back in 2.4.8-pre, but they're not completely gone yet.  I've cc'd
a few people who've shown interest in this ugly^H^H^H^H particular corner of
the kernel, but now I'm shutting down for the night.  Guys, can you please
take a look at this one?  It's time to put the highmem allocation problem
definitively to rest.  Most likely, this problem occurs in -ac as well, just
with less frequency.

> Daniel Phillips writes:
>  > On August 21, 2001 06:48 pm, Sven Heinicke wrote:
>  > > Yes, highmem was on; the system has 4G of memory.  I turned off
>  > > highmem and got no messages apart from one:
>  > > 
>  > > Aug 21 07:29:19 ps1 kernel: (scsi0:A:0:0): Locking max tag count at 64
>  > > 
>  > > which I was getting before.
>  > >
>  > > Disk access is faster than before but still slower than the IDE
>  > > drive.  Any ideas?
>  > 
>  > Two separate problems, I think.  I don't know anything about the aic7xxx 
>  > driver but I can take a look at the highmem problem.  First, can you try
>  > it with highmem enabled, on a recent -ac kernel, say 2.4.8-ac7.
>  > 
>  > --
>  > Daniel

--
Daniel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P)
  2001-08-21 18:33                   ` Justin T. Gibbs
@ 2001-08-22  6:46                     ` Jens Axboe
  2001-08-22 13:24                       ` Justin T. Gibbs
  2001-08-22 15:05                       ` With Daniel Phillips Patch David S. Miller
  0 siblings, 2 replies; 75+ messages in thread
From: Jens Axboe @ 2001-08-22  6:46 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: Stephan von Krawczynski, Daniel Phillips, linux-kernel

On Tue, Aug 21 2001, Justin T. Gibbs wrote:
> [...] Unfortunately the x86 port
> doesn't support passing large dma addresses to drivers so bouncing is required
> in order to do PAE.

With the PCI64 + highmem no-bounce patches it does, so feel free to
convert aic7xxx to the newpci64 API :-)

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P)
  2001-08-21 14:42             ` With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P) Sven Heinicke
  2001-08-21 15:08               ` Daniel Phillips
  2001-08-21 16:48               ` Sven Heinicke
@ 2001-08-22 10:25               ` Marcelo Tosatti
  2001-08-22 16:09               ` Sven Heinicke
  3 siblings, 0 replies; 75+ messages in thread
From: Marcelo Tosatti @ 2001-08-22 10:25 UTC (permalink / raw)
  To: Sven Heinicke; +Cc: linux-kernel



On Tue, 21 Aug 2001, Sven Heinicke wrote:

> 
> Forgive the sin of replying to my own message but Daniel Phillips
> replied to a different message with a patch to somebody getting a
> similar error to mine.  Here is the result:
> 
> Aug 20 15:10:33 ps1 kernel: cation failed (gfp=0x30/1). 
> Aug 20 15:10:33 ps1 kernel: __alloc_pages: 0-order allocation failed
> (gfp=0x30/1). 
> Aug 20 15:10:46 ps1 last message repeated 327 times 
> Aug 20 15:10:47 ps1 kernel: cation failed (gfp=0x30/1). 
> Aug 20 15:10:47 ps1 kernel: __alloc_pages: 0-order allocation failed
> (gfp=0x30/1). 
> Aug 20 15:10:56 ps1 last message repeated 294 times 
> 
> 
> Sven Heinicke writes:
>  > 
>  > It's always a blessing and a curse when people seem to be having
>  > problems with the same drivers as you.  I started looking into this
>  > when a user complained about disk access time.  I think this is
>  > related to the running aic7xxx topics.
>  > 
>  > From my tests, I got a Dell 4400 whose Adaptec 7899P, according to
>  > bonnie++, was writing slower than some of my IDE drives on a
>  > different system.  I tried Red Hat's 2.4.3-12smp kernel and got a
>  > little improvement.  I then built 2.4.9 and started running bonnie++
>  > again and my syslog gets filled up with such errors:
>  > 
>  > Aug 20 14:23:33 ps1 kernel: __alloc_pages: 0-order all
>  > Aug 20 14:23:36 ps1 last message repeated 376 times
>  > Aug 20 14:23:36 ps1 kernel: ed.
>  > Aug 20 14:23:36 ps1 kernel: __alloc_pages: 0-order all
>  > Aug 20 14:23:44 ps1 last message repeated 376 times
>  > Aug 20 14:23:44 ps1 kernel: ed.
>  > Aug 20 14:23:44 ps1 kernel: __alloc_pages: 0-order all
>  > Aug 20 14:23:44 ps1 last message repeated 363 times
>  > 
>  > With slow access time.  Please request more info if you think it might
>  > help.

Sven,

Could you please try the following patch on top of 2.4.9? 

diff -Nur --exclude-from=exclude linux.orig/fs/buffer.c linux/fs/buffer.c
--- linux.orig/fs/buffer.c	Wed Aug 15 18:25:49 2001
+++ linux/fs/buffer.c	Tue Aug 21 04:54:01 2001
@@ -2447,7 +2447,8 @@
 	spin_unlock(&free_list[index].lock);
 	write_unlock(&hash_table_lock);
 	spin_unlock(&lru_list_lock);
-	if (gfp_mask & __GFP_IO) {
+	if (gfp_mask & __GFP_IO || (gfp_mask & __GFP_NOBOUNCE)
+			&& page->zone == &pgdat_list->node_zones[ZONE_HIGHMEM]) {
 		sync_page_buffers(bh, gfp_mask);
 		/* We waited synchronously, so we can free the buffers. */
 		if (gfp_mask & __GFP_WAIT) {
diff -Nur --exclude-from=exclude linux.orig/include/linux/mm.h linux/include/linux/mm.h
--- linux.orig/include/linux/mm.h	Wed Aug 15 18:21:11 2001
+++ linux/include/linux/mm.h	Tue Aug 21 04:52:08 2001
@@ -538,6 +538,8 @@
 #define __GFP_HIGH	0x20	/* Should access emergency pools? */
 #define __GFP_IO	0x40	/* Can start physical IO? */
 #define __GFP_FS	0x80	/* Can call down to low-level FS? */
+#define __GFP_NOBOUNCE	0x100	/* Don't do any IO operation which may
+				   result in IO bouncing */
 
 #define GFP_NOIO	(__GFP_HIGH | __GFP_WAIT)
 #define GFP_NOFS	(__GFP_HIGH | __GFP_WAIT | __GFP_IO)
diff -Nur --exclude-from=exclude linux.orig/include/linux/slab.h linux/include/linux/slab.h
--- linux.orig/include/linux/slab.h	Wed Aug 15 18:21:13 2001
+++ linux/include/linux/slab.h	Tue Aug 21 04:51:20 2001
@@ -23,7 +23,7 @@
 #define	SLAB_NFS		GFP_NFS
 #define	SLAB_DMA		GFP_DMA
 
-#define SLAB_LEVEL_MASK		(__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_FS)
+#define SLAB_LEVEL_MASK		(__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_FS|__GFP_NOBOUNCE)
 #define	SLAB_NO_GROW		0x00001000UL	/* don't grow a cache */
 
 /* flags to pass to kmem_cache_create().
diff -Nur --exclude-from=exclude linux.orig/mm/highmem.c linux/mm/highmem.c
--- linux.orig/mm/highmem.c	Thu Aug 16 13:42:45 2001
+++ linux/mm/highmem.c	Tue Aug 21 04:50:08 2001
@@ -321,7 +321,7 @@
 	struct page *page;
 
 repeat_alloc:
-	page = alloc_page(GFP_NOIO);
+	page = alloc_page(GFP_NOIO|__GFP_NOBOUNCE);
 	if (page)
 		return page;
 	/*
@@ -359,7 +359,7 @@
 	struct buffer_head *bh;
 
 repeat_alloc:
-	bh = kmem_cache_alloc(bh_cachep, SLAB_NOIO);
+	bh = kmem_cache_alloc(bh_cachep, SLAB_NOIO|__GFP_NOBOUNCE);
 	if (bh)
 		return bh;
 	/*
diff -Nur --exclude-from=exclude linux.orig/mm/page_alloc.c linux/mm/page_alloc.c
--- linux.orig/mm/page_alloc.c	Thu Aug 16 13:43:02 2001
+++ linux/mm/page_alloc.c	Tue Aug 21 04:51:03 2001
@@ -398,7 +398,8 @@
 	 * - we're /really/ tight on memory
 	 * 	--> try to free pages ourselves with page_launder
 	 */
-	if (!(current->flags & PF_MEMALLOC)) {
+	if (!(current->flags & PF_MEMALLOC) 
+			|| ((gfp_mask & __GFP_NOBOUNCE) && !order)) {
 		/*
 		 * Are we dealing with a higher order allocation?
 		 *


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P)
  2001-08-21 22:49                 ` Sven Heinicke
@ 2001-08-22 13:06                   ` Gérard Roudier
  0 siblings, 0 replies; 75+ messages in thread
From: Gérard Roudier @ 2001-08-22 13:06 UTC (permalink / raw)
  To: Sven Heinicke; +Cc: Justin T. Gibbs, Daniel Phillips, linux-kernel



On Tue, 21 Aug 2001, Sven Heinicke wrote:

> Justin,
>
> I've tried removing your check; for writing, bonnie++ still reports
> slower write times than the IDE drive on the other system.  Daniel Phillips
> was great at helping me notice a problem that was causing a slowdown
> but was not related to the aic7xxx driver, so I am now trying the
> 2.4.8-ac8 kernel.

If you are referring to sequential writes, then I may add the following
comments to this thread. :-)

Given a more-or-less single-threaded sequential disk write load,
sending ORDERED TAGS should not significantly impact performance,
unless the actual IO queuing is extremely misordered.

You will get the best performance for a single-threaded sequential write
load by relying on kernel sorting and coalescing rather than on the ability
of the disk to accept a large number of tagged commands. As the aic7xxx
driver wants to use the largest possible number of queued commands, it
just hits the worst case for sequential write performance, in my
opinion. Having write-behind caching enabled on the device also helps a
lot for sequential writes.

You should enable write caching and give it a try with tagged command
queuing disabled (or with a small queue depth for your drives). BUT, you
must make sure that the low-level driver doesn't announce some large queue
to the upper layers and queue the IOs internally. If it does so, the kernel
will not be able to sort and coalesce IOs, and the device will be fed
bunches of very small IOs.
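
The sorting and coalescing referred to above can be sketched like this.
It is a toy model, not the kernel's elevator code: requests are
(start_sector, sector_count) pairs, the function name coalesce() is
made up, and only exactly contiguous requests are merged (overlapping
requests are not handled).

```python
def coalesce(requests):
    """Sort write requests by start sector and merge contiguous neighbours.

    Each request is a (start_sector, sector_count) tuple.
    """
    merged = []
    for start, count in sorted(requests):
        if merged and merged[-1][0] + merged[-1][1] == start:
            # This request begins exactly where the previous one ends: extend it.
            prev_start, prev_count = merged[-1]
            merged[-1] = (prev_start, prev_count + count)
        else:
            merged.append((start, count))
    return merged
```

Four small requests at sectors 16, 0, 8 and 100 become one 24-sector
write plus one 8-sector write, but only if they are still sitting in a
queue where they can be seen together. Requests already handed to the
device, or parked inside the low-level driver, are out of reach.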

--
To Justin:

Unless I missed some recent development, Linux SCSI does not allow to
dynamically resize device queues. Your FreeBSD-CAM layer do as we know.

As a result, once a device queue depth has been announced under Linux, the
SIM must queue commands internally if it wants to use a shorter actual queue
for this device. All the IOs that are deferred this way aren't seen by
the parts that perform optimizations. These parts are obviously the kernel
block device and the device itself. This cannot be good for performance
in my opinion. On the other hand, there are still some paths that may
sequentially scan device queues and IO request queues in the kernel. As
you know better than I do, you are using heaps under FreeBSD-CAM, which
have the advantage of handling priorities and consuming less CPU. But
CPU is cheap nowadays...

So, let me say again that using large device queues by default under
Linux is not desirable, in my opinion. Personally, I use a device queue
depth of 16 commands under Linux for my drives. Under FreeBSD, my driver
announces 64 and relies on CAM for shrinking this depth if CAM thinks it
is better. Btw, none of my drives are able to handle more than 64
simultaneous commands (the fastest one is a Cheetah Ultra-160).


> I made your change and will try to get my user to test the system
> with and without your change.
>
>      Sven
>
> Justin T. Gibbs writes:
>  > >Disk access is faster than before but still slower than the IDE
>  > >drive.  Any ideas?
>  >
>  > It could be the occasional ordered tag that is sent to the drive to
>  > prevent tag starvation.  If you search in drivers/scsi/aic7xxx/aic7xxx_linux.c
>  > for "OTAG_THRESH" and make that if test always fail (add an "&& 0") you will
>  > have effectively disabled this feature.  I should probably make it an option
>  > that defaults to off.
>  >
>  > --
>  > Justin

Regards,
  Gérard.


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P)
  2001-08-22  6:46                     ` Jens Axboe
@ 2001-08-22 13:24                       ` Justin T. Gibbs
  2001-08-22 15:05                       ` With Daniel Phillips Patch David S. Miller
  1 sibling, 0 replies; 75+ messages in thread
From: Justin T. Gibbs @ 2001-08-22 13:24 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Stephan von Krawczynski, Daniel Phillips, linux-kernel

>On Tue, Aug 21 2001, Justin T. Gibbs wrote:
>> [...] Unfortunately the x86 port
>> doesn't support passing large dma addresses to drivers so bouncing is
>> required in order to do PAE.
>
>With the PCI64 + highmem no-bounce patches it does, so feel free to
>convert aic7xxx to the newpci64 API :-)

Is this somehow different than how large DMA is done on the ia64
port?  All I do is look at the size of dma_addr_t to decide whether
to enable high address support in my driver.  If dma_addr_t's size
changes, then 64bit addressing will work the same as on every other
Linux port.

--
Justin

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch
  2001-08-22  6:46                     ` Jens Axboe
  2001-08-22 13:24                       ` Justin T. Gibbs
@ 2001-08-22 15:05                       ` David S. Miller
  2001-08-22 18:21                         ` Gérard Roudier
                                           ` (3 more replies)
  1 sibling, 4 replies; 75+ messages in thread
From: David S. Miller @ 2001-08-22 15:05 UTC (permalink / raw)
  To: gibbs; +Cc: axboe, skraw, phillips, linux-kernel

   From: "Justin T. Gibbs" <gibbs@scsiguy.com>
   Date: Wed, 22 Aug 2001 07:24:29 -0600
   
   Is this somehow different than how large DMA is done on the ia64
   port?  All I do is look at the size of dma_addr_t to decide whether
   to enable high address support in my driver.  If dma_addr_t's size
   changes, then 64bit addressing will work the same as on every other
   Linux port.

It is totally different.

The ia64 method, while it worked for ia64, could not work properly on
just about any other platform.  For example, it assumed that any
physical address could be represented by a kernel virtual address.
This is not true on 32-bit HIGHMEM systems.  It also assumed that
using SAC or DAC addressing was simply a matter of "does the device
support it", and the world is far from being that simple :-)

Please see the pci64 patches for details:

ftp://ftp.kernel.org/pub/linux/kernel/people/davem/PCI64/*.gz

There are Documentation/DMA-mapping.txt updates, where you can read
how to use the interfaces properly.  A handful of net and scsi drivers
were updated to use the new API, you have examples to work with as
well.

I note that the aic7xxx won't be usable for DAC cycles on many
platforms since not all 64-bits are significant :-(  SYM53C8XX
has a similar limitation.  Surprisingly, the network PCI cards
have been the absolute best about this, supporting the full 64-bits
of DAC address in all card instances I delved into.

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P)
  2001-08-22 16:09               ` Sven Heinicke
@ 2001-08-22 15:42                 ` Marcelo Tosatti
  2001-08-29  7:30                   ` Andrey Nekrasov
  2001-08-22 20:25                 ` Sven Heinicke
  1 sibling, 1 reply; 75+ messages in thread
From: Marcelo Tosatti @ 2001-08-22 15:42 UTC (permalink / raw)
  To: Sven Heinicke; +Cc: linux-kernel


Sven,

There is another mistake on the patch I sent you.

In buffer.c, instead of

"page->zone == &pgdat_list->node_zones[ZONE_HIGHMEM]"
            ^^
you should use

"page->zone != &pgdat_list->node_zones[ZONE_HIGHMEM]"
	    ^^ 

Ok? 

On Wed, 22 Aug 2001, Sven Heinicke wrote:

> 
> I tried your patch below; to compile it I had to edit line 2451 of
> buffer.c, after the patch, to read "page->zone" instead of "page-zone".
> After that the build went great.  But part of the way through running
> bonnie++ the system crashed in a way that it didn't write anything to
> the syslog.  The terminal was spewing:
> 
> APIC error on CPU0: 0c(0c)
> APIC error on CPU1: 0c(0c)
> 
> I've really gotta put that system back into production, as it seemed
> much better off with the 2.4.8-ac8 kernel, before I started this
> thread.
> 
> 	Sven
> 
> Marcelo Tosatti writes:
>  > 
>  > 
>  > On Tue, 21 Aug 2001, Sven Heinicke wrote:
>  > 
>  > > 
>  > > Forgive the sin of replying to my own message but Daniel Phillips
>  > > replied to a different message with a patch to somebody getting a
>  > > similar error to mine.  Here is the result:
>  > > 
>  > > Aug 20 15:10:33 ps1 kernel: cation failed (gfp=0x30/1). 
>  > > Aug 20 15:10:33 ps1 kernel: __alloc_pages: 0-order allocation failed
>  > > (gfp=0x30/1). 
>  > > Aug 20 15:10:46 ps1 last message repeated 327 times 
>  > > Aug 20 15:10:47 ps1 kernel: cation failed (gfp=0x30/1). 
>  > > Aug 20 15:10:47 ps1 kernel: __alloc_pages: 0-order allocation failed
>  > > (gfp=0x30/1). 
>  > > Aug 20 15:10:56 ps1 last message repeated 294 times 
>  > > 
>  > > 
>  > > Sven Heinicke writes:
>  > >  > 
>  > >  > It's always a blessing and a curse when people seem to be having
>  > >  > problems with the same drivers as you.  I started looking into this
>  > >  > when a user complained about disk access time.  I think this is
>  > >  > related to the running aic7xxx topics.
>  > >  > 
>  > >  > From my tests, I have a Dell 4400 whose Adaptec 7899P, according to
>  > >  > bonnie++, was writing slower than some of my IDE drives on a
>  > >  > different system.  I tried Red Hat's 2.4.3-12smp kernel and got a
>  > >  > little improvement.  I then built 2.4.9 and started running bonnie++
>  > >  > again and my syslog gets filled up with such errors:
>  > >  > 
>  > >  > Aug 20 14:23:33 ps1 kernel: __alloc_pages: 0-order all
>  > >  > Aug 20 14:23:36 ps1 last message repeated 376 times
>  > >  > Aug 20 14:23:36 ps1 kernel: ed.
>  > >  > Aug 20 14:23:36 ps1 kernel: __alloc_pages: 0-order all
>  > >  > Aug 20 14:23:44 ps1 last message repeated 376 times
>  > >  > Aug 20 14:23:44 ps1 kernel: ed.
>  > >  > Aug 20 14:23:44 ps1 kernel: __alloc_pages: 0-order all
>  > >  > Aug 20 14:23:44 ps1 last message repeated 363 times
>  > >  > 
>  > >  > With slow access time.  Please request more info if you think it might
>  > >  > help.
>  > 
>  > Sven,
>  > 
>  > Could you please try the following patch on top of 2.4.9? 
>  > 
>  > diff -Nur --exclude-from=exclude linux.orig/fs/buffer.c linux/fs/buffer.c
>  > --- linux.orig/fs/buffer.c	Wed Aug 15 18:25:49 2001
>  > +++ linux/fs/buffer.c	Tue Aug 21 04:54:01 2001
>  > @@ -2447,7 +2447,8 @@
>  >  	spin_unlock(&free_list[index].lock);
>  >  	write_unlock(&hash_table_lock);
>  >  	spin_unlock(&lru_list_lock);
>  > -	if (gfp_mask & __GFP_IO) {
>  > +	if (gfp_mask & __GFP_IO || (gfp_mask & __GFP_NOBOUNCE) 
>  > +			&& page-zone == &pgdat_list->node_zones[ZONE_HIGHMEM]) {
>  >  		sync_page_buffers(bh, gfp_mask);
>  >  		/* We waited synchronously, so we can free the buffers. */
>  >  		if (gfp_mask & __GFP_WAIT) {
>  > diff -Nur --exclude-from=exclude linux.orig/include/linux/mm.h linux/include/linux/mm.h
>  > --- linux.orig/include/linux/mm.h	Wed Aug 15 18:21:11 2001
>  > +++ linux/include/linux/mm.h	Tue Aug 21 04:52:08 2001
>  > @@ -538,6 +538,8 @@
>  >  #define __GFP_HIGH	0x20	/* Should access emergency pools? */
>  >  #define __GFP_IO	0x40	/* Can start physical IO? */
>  >  #define __GFP_FS	0x80	/* Can call down to low-level FS? */
>  > +#define __GFP_NOBOUNCE	0x100	/* Don't do any IO operation which may
>  > +				   result in IO bouncing */
>  >  
>  >  #define GFP_NOIO	(__GFP_HIGH | __GFP_WAIT)
>  >  #define GFP_NOFS	(__GFP_HIGH | __GFP_WAIT | __GFP_IO)
>  > diff -Nur --exclude-from=exclude linux.orig/include/linux/slab.h linux/include/linux/slab.h
>  > --- linux.orig/include/linux/slab.h	Wed Aug 15 18:21:13 2001
>  > +++ linux/include/linux/slab.h	Tue Aug 21 04:51:20 2001
>  > @@ -23,7 +23,7 @@
>  >  #define	SLAB_NFS		GFP_NFS
>  >  #define	SLAB_DMA		GFP_DMA
>  >  
>  > -#define SLAB_LEVEL_MASK		(__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_FS)
>  > +#define SLAB_LEVEL_MASK		(__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_FS|__GFP_NOBOUNCE)
>  >  #define	SLAB_NO_GROW		0x00001000UL	/* don't grow a cache */
>  >  
>  >  /* flags to pass to kmem_cache_create().
>  > diff -Nur --exclude-from=exclude linux.orig/mm/highmem.c linux/mm/highmem.c
>  > --- linux.orig/mm/highmem.c	Thu Aug 16 13:42:45 2001
>  > +++ linux/mm/highmem.c	Tue Aug 21 04:50:08 2001
>  > @@ -321,7 +321,7 @@
>  >  	struct page *page;
>  >  
>  >  repeat_alloc:
>  > -	page = alloc_page(GFP_NOIO);
>  > +	page = alloc_page(GFP_NOIO|__GFP_NOBOUNCE);
>  >  	if (page)
>  >  		return page;
>  >  	/*
>  > @@ -359,7 +359,7 @@
>  >  	struct buffer_head *bh;
>  >  
>  >  repeat_alloc:
>  > -	bh = kmem_cache_alloc(bh_cachep, SLAB_NOIO);
>  > +	bh = kmem_cache_alloc(bh_cachep, SLAB_NOIO|__GFP_NOBOUNCE);
>  >  	if (bh)
>  >  		return bh;
>  >  	/*
>  > diff -Nur --exclude-from=exclude linux.orig/mm/page_alloc.c linux/mm/page_alloc.c
>  > --- linux.orig/mm/page_alloc.c	Thu Aug 16 13:43:02 2001
>  > +++ linux/mm/page_alloc.c	Tue Aug 21 04:51:03 2001
>  > @@ -398,7 +398,8 @@
>  >  	 * - we're /really/ tight on memory
>  >  	 * 	--> try to free pages ourselves with page_launder
>  >  	 */
>  > -	if (!(current->flags & PF_MEMALLOC)) {
>  > +	if (!(current->flags & PF_MEMALLOC) 
>  > +			|| ((gfp_mask & __GFP_NOBOUNCE) && !order)) {
>  >  		/*
>  >  		 * Are we dealing with a higher order allocation?
>  >  		 *
>  > 


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P)
  2001-08-21 14:42             ` With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P) Sven Heinicke
                                 ` (2 preceding siblings ...)
  2001-08-22 10:25               ` Marcelo Tosatti
@ 2001-08-22 16:09               ` Sven Heinicke
  2001-08-22 15:42                 ` Marcelo Tosatti
  2001-08-22 20:25                 ` Sven Heinicke
  3 siblings, 2 replies; 75+ messages in thread
From: Sven Heinicke @ 2001-08-22 16:09 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: linux-kernel


I tried your patch below; to compile it I had to edit line 2451 of
buffer.c, after the patch, to read "page->zone" instead of "page-zone".
After that the build went great.  But part of the way through running
bonnie++ the system crashed in a way that it didn't write anything to
the syslog.  The terminal was spewing:

APIC error on CPU0: 0c(0c)
APIC error on CPU1: 0c(0c)

I've really gotta put that system back into production, as it seemed
much better off with the 2.4.8-ac8 kernel, before I started this
thread.

	Sven

Marcelo Tosatti writes:
 > 
 > 
 > On Tue, 21 Aug 2001, Sven Heinicke wrote:
 > 
 > > 
 > > Forgive the sin of replying to my own message but Daniel Phillips
 > > replied to a different message with a patch to somebody getting a
 > > similar error to mine.  Here is the result:
 > > 
 > > Aug 20 15:10:33 ps1 kernel: cation failed (gfp=0x30/1). 
 > > Aug 20 15:10:33 ps1 kernel: __alloc_pages: 0-order allocation failed
 > > (gfp=0x30/1). 
 > > Aug 20 15:10:46 ps1 last message repeated 327 times 
 > > Aug 20 15:10:47 ps1 kernel: cation failed (gfp=0x30/1). 
 > > Aug 20 15:10:47 ps1 kernel: __alloc_pages: 0-order allocation failed
 > > (gfp=0x30/1). 
 > > Aug 20 15:10:56 ps1 last message repeated 294 times 
 > > 
 > > 
 > > Sven Heinicke writes:
 > >  > 
 > >  > It's always a blessing and a curse when people seem to be having
 > >  > problems with the same drivers as you.  I started looking into this
 > >  > when a user complained about disk access time.  I think this is
 > >  > related to the running aic7xxx topics.
 > >  > 
 > >  > From my tests, I have a Dell 4400 whose Adaptec 7899P, according to
 > >  > bonnie++, was writing slower than some of my IDE drives on a
 > >  > different system.  I tried Red Hat's 2.4.3-12smp kernel and got a
 > >  > little improvement.  I then built 2.4.9 and started running bonnie++
 > >  > again and my syslog gets filled up with such errors:
 > >  > 
 > >  > Aug 20 14:23:33 ps1 kernel: __alloc_pages: 0-order all
 > >  > Aug 20 14:23:36 ps1 last message repeated 376 times
 > >  > Aug 20 14:23:36 ps1 kernel: ed.
 > >  > Aug 20 14:23:36 ps1 kernel: __alloc_pages: 0-order all
 > >  > Aug 20 14:23:44 ps1 last message repeated 376 times
 > >  > Aug 20 14:23:44 ps1 kernel: ed.
 > >  > Aug 20 14:23:44 ps1 kernel: __alloc_pages: 0-order all
 > >  > Aug 20 14:23:44 ps1 last message repeated 363 times
 > >  > 
 > >  > With slow access time.  Please request more info if you think it might
 > >  > help.
 > 
 > Sven,
 > 
 > Could you please try the following patch on top of 2.4.9? 
 > 
 > diff -Nur --exclude-from=exclude linux.orig/fs/buffer.c linux/fs/buffer.c
 > --- linux.orig/fs/buffer.c	Wed Aug 15 18:25:49 2001
 > +++ linux/fs/buffer.c	Tue Aug 21 04:54:01 2001
 > @@ -2447,7 +2447,8 @@
 >  	spin_unlock(&free_list[index].lock);
 >  	write_unlock(&hash_table_lock);
 >  	spin_unlock(&lru_list_lock);
 > -	if (gfp_mask & __GFP_IO) {
 > +	if (gfp_mask & __GFP_IO || (gfp_mask & __GFP_NOBOUNCE) 
 > +			&& page-zone == &pgdat_list->node_zones[ZONE_HIGHMEM]) {
 >  		sync_page_buffers(bh, gfp_mask);
 >  		/* We waited synchronously, so we can free the buffers. */
 >  		if (gfp_mask & __GFP_WAIT) {
 > diff -Nur --exclude-from=exclude linux.orig/include/linux/mm.h linux/include/linux/mm.h
 > --- linux.orig/include/linux/mm.h	Wed Aug 15 18:21:11 2001
 > +++ linux/include/linux/mm.h	Tue Aug 21 04:52:08 2001
 > @@ -538,6 +538,8 @@
 >  #define __GFP_HIGH	0x20	/* Should access emergency pools? */
 >  #define __GFP_IO	0x40	/* Can start physical IO? */
 >  #define __GFP_FS	0x80	/* Can call down to low-level FS? */
 > +#define __GFP_NOBOUNCE	0x100	/* Don't do any IO operation which may
 > +				   result in IO bouncing */
 >  
 >  #define GFP_NOIO	(__GFP_HIGH | __GFP_WAIT)
 >  #define GFP_NOFS	(__GFP_HIGH | __GFP_WAIT | __GFP_IO)
 > diff -Nur --exclude-from=exclude linux.orig/include/linux/slab.h linux/include/linux/slab.h
 > --- linux.orig/include/linux/slab.h	Wed Aug 15 18:21:13 2001
 > +++ linux/include/linux/slab.h	Tue Aug 21 04:51:20 2001
 > @@ -23,7 +23,7 @@
 >  #define	SLAB_NFS		GFP_NFS
 >  #define	SLAB_DMA		GFP_DMA
 >  
 > -#define SLAB_LEVEL_MASK		(__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_FS)
 > +#define SLAB_LEVEL_MASK		(__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_FS|__GFP_NOBOUNCE)
 >  #define	SLAB_NO_GROW		0x00001000UL	/* don't grow a cache */
 >  
 >  /* flags to pass to kmem_cache_create().
 > diff -Nur --exclude-from=exclude linux.orig/mm/highmem.c linux/mm/highmem.c
 > --- linux.orig/mm/highmem.c	Thu Aug 16 13:42:45 2001
 > +++ linux/mm/highmem.c	Tue Aug 21 04:50:08 2001
 > @@ -321,7 +321,7 @@
 >  	struct page *page;
 >  
 >  repeat_alloc:
 > -	page = alloc_page(GFP_NOIO);
 > +	page = alloc_page(GFP_NOIO|__GFP_NOBOUNCE);
 >  	if (page)
 >  		return page;
 >  	/*
 > @@ -359,7 +359,7 @@
 >  	struct buffer_head *bh;
 >  
 >  repeat_alloc:
 > -	bh = kmem_cache_alloc(bh_cachep, SLAB_NOIO);
 > +	bh = kmem_cache_alloc(bh_cachep, SLAB_NOIO|__GFP_NOBOUNCE);
 >  	if (bh)
 >  		return bh;
 >  	/*
 > diff -Nur --exclude-from=exclude linux.orig/mm/page_alloc.c linux/mm/page_alloc.c
 > --- linux.orig/mm/page_alloc.c	Thu Aug 16 13:43:02 2001
 > +++ linux/mm/page_alloc.c	Tue Aug 21 04:51:03 2001
 > @@ -398,7 +398,8 @@
 >  	 * - we're /really/ tight on memory
 >  	 * 	--> try to free pages ourselves with page_launder
 >  	 */
 > -	if (!(current->flags & PF_MEMALLOC)) {
 > +	if (!(current->flags & PF_MEMALLOC) 
 > +			|| ((gfp_mask & __GFP_NOBOUNCE) && !order)) {
 >  		/*
 >  		 * Are we dealing with a higher order allocation?
 >  		 *
 > 

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch
  2001-08-22 15:05                       ` With Daniel Phillips Patch David S. Miller
@ 2001-08-22 18:21                         ` Gérard Roudier
  2001-08-22 18:32                         ` Justin T. Gibbs
                                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 75+ messages in thread
From: Gérard Roudier @ 2001-08-22 18:21 UTC (permalink / raw)
  To: David S. Miller; +Cc: gibbs, axboe, skraw, phillips, linux-kernel



On Wed, 22 Aug 2001, David S. Miller wrote:

>    From: "Justin T. Gibbs" <gibbs@scsiguy.com>
>    Date: Wed, 22 Aug 2001 07:24:29 -0600
>
>    Is this somehow different than how large DMA is done on the ia64
>    port?  All I do is look at the size of dma_addr_t to decide whether
>    to enable high address support in my driver.  If dma_addr_t's size
>    changes, then 64bit addressing will work the same as on every other
>    Linux port.
>
> It is totally different.
>
> The ia64 method, while it worked for ia64, could not work properly on
> just about any other platform.  For example, it assumed that any
> physical address could be represented by a kernel virtual address.
> This is not true on 32-bit HIGHMEM systems.  It also assumed that
> using SAC or DAC addressing was simply a matter of "does the device
> support it", and the world is far from being that simple :-)
>
> Please see the pci64 patches for details:
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/davem/PCI64/*.gz
>
> There are Documentation/DMA-mapping.txt updates, where you can read
> how to use the interfaces properly.  A handful of net and scsi drivers
> were updated to use the new API, you have examples to work with as
> well.
>
> I note that the aic7xxx won't be usable for DAC cycles on many
> platforms since not all 64-bits are significant :-(  SYM53C8XX
> has a similar limitation.  Surprisingly, the network PCI cards
> have been the absolute best about this, supporting the full 64-bits
> of DAC address in all card instances I delved into.

First, let me thank you a LLLOOOOOOTTTTT, David, for your PCI 64 bit
addressing DMA-mapping. I haven't looked into your patch and
documentation update yet, but I will do so ASAP.

OTOH, it is a great pleasure for me to hear that you didn't forget what I
told you about the current limitation of the SYM53C8XX driver. For now,
the driver would only be able to use 40 bit addresses with all upper bits
set to zero. This doesn't fit the PCI 64 bit implementation of the Alpha
Monster window for example, and probably doesn't fit most other non-Intel
PCI 64 bit implementations.

But there is an alternate solution for the SYM53C8XX driver: using up to
16 x 32 bit segment registers. This would (will) allow DMA to address
16 x 4GB segments, with all upper bits being settable for each 4GB segment.
I have had this on my todo-list for months, but haven't had strong reasons
for implementing it. The strongest reason would be having access to
64 bit machines with 64 bit PCI, but this isn't possible. My machine uses
a Supermicro 370 DLE Mobo that offers a PCI 64 bit path, but I only have 256
MB of memory. :-(, and anyway, the thing looks like a 15 year old 32 bit
Intel-arch :-(, that doesn't support 64 bit PCI addressing. :-(
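
The segment-register idea can be sketched in user-space C as follows. All
names (struct seg_table, seg_lookup()) are hypothetical and the actual chip
programming is not shown. The sketch only assumes a table of up to 16
registers, each holding the upper 32 bits of one 4GB segment, so that a
DMA address is described by a register index plus a 32 bit offset.

```c
#include <stdint.h>

#define NSEGS 16  /* number of segment registers assumed by this sketch */

/* Hypothetical software copy of the segment registers. */
struct seg_table {
	uint32_t upper[NSEGS];  /* upper 32 address bits of each segment */
	int used;               /* how many registers are allocated */
};

/*
 * Find (or allocate) the segment register covering addr.
 * Stores the low 32 bits of addr in *offset and returns the register
 * index, or -1 if all registers are in use by other segments.
 */
int seg_lookup(struct seg_table *t, uint64_t addr, uint32_t *offset)
{
	uint32_t hi = (uint32_t)(addr >> 32);
	int i;

	*offset = (uint32_t)addr;

	for (i = 0; i < t->used; i++)
		if (t->upper[i] == hi)
			return i;       /* segment already mapped */

	if (t->used == NSEGS)
		return -1;              /* no free register left */

	t->upper[t->used] = hi;
	return t->used++;
}
```

Since every bit of the upper half is stored per segment, an address with
non-zero top bits costs one register but is otherwise handled like any
other.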

  Gérard.


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch
  2001-08-22 15:05                       ` With Daniel Phillips Patch David S. Miller
  2001-08-22 18:21                         ` Gérard Roudier
@ 2001-08-22 18:32                         ` Justin T. Gibbs
  2001-08-22 18:32                         ` David S. Miller
  2001-08-22 18:46                         ` David S. Miller
  3 siblings, 0 replies; 75+ messages in thread
From: Justin T. Gibbs @ 2001-08-22 18:32 UTC (permalink / raw)
  To: David S. Miller; +Cc: axboe, skraw, phillips, linux-kernel

>   From: "Justin T. Gibbs" <gibbs@scsiguy.com>
>   Date: Wed, 22 Aug 2001 07:24:29 -0600
>   
>   Is this somehow different than how large DMA is done on the ia64
>   port?  All I do is look at the size of dma_addr_t to decide whether
>   to enable high address support in my driver.  If dma_addr_t's size
>   changes, then 64bit addressing will work the same as on every other
>   Linux port.
>
>It is totally different.

In looking at the documentation, it really doesn't seem much different
at all.  All you've done is force there to be two types and two APIs,
instead of one of each, for accessing dma addresses.  I would like the
change much better if the size of dma_addr_t simply changed to be
64bits wide if high mem support is enabled in your kernel config.
That the high 32bits may be empty or not even looked at by some
device (as you describe in DMA-mapping.txt) isn't much of a concern.
Sure, you need the other API changes to more finely set dma characteristics,
but having two APIs just complicates life for the device driver.  You'll
see why I say this below.

>The ia64 method, while it worked for ia64, could not work properly on
>just about any other platform.  For example, it assumed that any
>physical address could be represented by a kernel virtual address.

From the device driver's point of view, this wasn't the case.
The driver asks to have the data mapped into an address that
its dma engine can understand and the system is supposed to do that
mapping.  Whether an IOMMU or some other piece of hardware was involved
didn't matter to the driver.  Well, it might matter because, at least
in the ia64 case, resource shortages result in a panic instead of an
error code being returned that you could do something reasonable with.
Now I just need to stick some "64"'s into my API calls to get the same
effect I currently have on IA64.  The fact that the back end that supports
the mapping changed shouldn't affect the driver.

>It also assumed that using SAC or DAC addressing was simply a matter of
>"does the device support it", and the world is far from being that simple :-)

Can you enumerate the devices that actually issue a DAC when loaded with
a 64bit address with 0's in the most significant 32bits?

>I note that the aic7xxx won't be usable for DAC cycles on many
>platforms since not all 64-bits are significant :-(

This isn't true.  The hardware supports all 64bits, but I've only
implemented two of the three expected S/G formats:

1) 4byte address/3byte count/7bits pad/1bit end of list
2) 4byte address/3byte count/7bits extended address/1bit end of list
and NYI
3) 8byte address/3byte count/7bits pad/1bit end of list

The first is the most efficient as the firmware doesn't have to bother
(or have the code) to load the high address bits.  The second works for
many platforms but doesn't take up any additional space for S/G lists.
The last can be implemented and enabled on any platforms that really need it.
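
For illustration, format 2 could be packed and unpacked as in the
following user-space sketch. The struct name, the helper names and the
exact bit positions (count in the low 24 bits of the second 32 bit word,
extended address in the next 7 bits, end-of-list flag in the top bit) are
assumptions made for this example and are not taken from the driver source.

```c
#include <stdint.h>

#define SG_LEN_MASK        0x00FFFFFFu  /* 3 byte transfer count */
#define SG_HIGH_ADDR_MASK  0x7F000000u  /* 7 extended address bits (32..38) */
#define SG_LAST_SEG        0x80000000u  /* end-of-list flag */

/* One 8 byte S/G element; same size as format 1. */
struct sg_elem {
	uint32_t addr;  /* low 32 bits of the bus address */
	uint32_t len;   /* count | extended address | end-of-list */
};

/* Pack an element; returns -1 if addr needs more than 39 bits
 * or count needs more than 24 bits. */
int sg_pack(struct sg_elem *sg, uint64_t addr, uint32_t count, int last)
{
	if ((addr >> 39) != 0 || (count & ~SG_LEN_MASK) != 0)
		return -1;

	sg->addr = (uint32_t)addr;
	sg->len = count
		| (((uint32_t)(addr >> 32) << 24) & SG_HIGH_ADDR_MASK)
		| (last ? SG_LAST_SEG : 0u);
	return 0;
}

/* Rebuild the full 39 bit address from an element. */
uint64_t sg_addr(const struct sg_elem *sg)
{
	uint64_t high = (sg->len & SG_HIGH_ADDR_MASK) >> 24;

	return (high << 32) | sg->addr;
}

/* Transfer count of an element. */
uint32_t sg_count(const struct sg_elem *sg)
{
	return sg->len & SG_LEN_MASK;
}

/* Nonzero if this is the last element of the list. */
int sg_is_last(const struct sg_elem *sg)
{
	return (sg->len & SG_LAST_SEG) != 0;
}
```

With a layout like this, a driver that never sees addresses above 4GB can
treat the 7 bits as padding, which is format 1.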

With formats 1 and 2, the choice of what to use can easily be made at
driver initialization time.  If you determine that high mappings will
never be needed, why do the extra work?  Now that I'm supposed to use
two different APIs depending on what capabilities I enable in my driver,
I'll have to add more bloat to my mapping routine (two func calls that do
almost exactly the same thing, gated by a test - or use an indirect function
call).  Perhaps I'll just wrap everything into macros and make it a compile
time option.

--
Justin

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch
  2001-08-22 15:05                       ` With Daniel Phillips Patch David S. Miller
  2001-08-22 18:21                         ` Gérard Roudier
  2001-08-22 18:32                         ` Justin T. Gibbs
@ 2001-08-22 18:32                         ` David S. Miller
  2001-08-22 18:46                         ` David S. Miller
  3 siblings, 0 replies; 75+ messages in thread
From: David S. Miller @ 2001-08-22 18:32 UTC (permalink / raw)
  To: groudier; +Cc: gibbs, axboe, skraw, phillips, linux-kernel

   From: Gérard Roudier <groudier@free.fr>
   Date: Wed, 22 Aug 2001 20:21:50 +0200 (CEST)
   
   First, let me thank you a LLLOOOOOOTTTTT, David, for your PCI 64 bit
   addressing DMA-mapping. I haven't looked into your patch and
   documentation update yet, but I will do so ASAP.
   
You're very welcome. :-)

   OTOH, it is a great pleasure for me to hear that you didn't forget what I
   told you about the current limitation of the SYM53C8XX driver. For now,
   the driver would only be able to use 40 bit addresses with all upper bits
   set to zero. This doesn't fit the PCI 64 bit implementation of the Alpha
   Monster window for example, and probably doesn't fit most other non-Intel
   PCI 64 bit implementations.
   
It is fully known, in fact, I converted the sym53c8xx.c driver as
one of the examples in the patch.

Alpha can use it, in cases where memory in the system is less than
the addressing limitation of the device.

For the Alpha port, such logic would reside in the pci_dac_cycles_ok()
definition.  IA64 and x86 could act similarly.  This was in fact
how I intended ports to implement pci_dac_cycles_ok().

   But there is an alternate solution for the SYM53C8XX driver: using up to
   16 x 32 bit segment registers. This would (will) allow DMA to address
   16 x 4GB segments, with all upper bits being settable for each 4GB segment.

Note, it relies on no 4GB crossing ever occurring.  Jens and I have
decided that we will always make this guarantee for devices.  I know
of 2 devices already which have problems with this (Qlogic,FC and some
buggy variants of Tigon3 chips).
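
As a rough illustration of what "no 4GB crossing" means for a single DMA
buffer, here is a small user-space check. The function name crosses_4gb()
is made up for this sketch and is not part of the pci64 patches.

```c
#include <stdint.h>

/*
 * Returns nonzero if the buffer [addr, addr + len) spans a 4GB boundary,
 * i.e. if its first and last bytes differ in the upper 32 address bits.
 * An empty buffer never crosses.
 */
int crosses_4gb(uint64_t addr, uint64_t len)
{
	uint64_t last;

	if (len == 0)
		return 0;

	last = addr + len - 1;  /* address of the final byte */
	return (addr >> 32) != (last >> 32);
}
```

A 4KB buffer at 0xfffff000 ends exactly at the boundary and is fine; one
byte more and the upper 32 bits change in the middle of the transfer.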

   I have had this on my todo-list for months, but haven't had strong reasons
   for implementing it. The strongest reason would be having access to
   64 bit machines with 64 bit PCI, but this isn't possible.

Look for someone to borrow a sparc64 system from.  Or, alternatively
send the patch to me for testing.

On sparc64, you will always be "testing all the bits" since each
DAC address to physical memory has:

	0xfffc000000000000

on the top bits.
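
A small user-space sketch of why this exercises every address bit. It
assumes the top bits are simply OR-ed into the physical address, and the
helper names dac_address() and fits_mask() are invented for the example.

```c
#include <stdint.h>

/* Top bits of a sparc64 DAC address, as quoted above. */
#define SPARC64_DAC_BITS 0xfffc000000000000ULL

/* Assumed construction: OR the fixed top bits into the physical address. */
uint64_t dac_address(uint64_t phys)
{
	return SPARC64_DAC_BITS | phys;
}

/* Nonzero if bus_addr can be expressed by a device limited to mask. */
int fits_mask(uint64_t bus_addr, uint64_t mask)
{
	return (bus_addr & ~mask) == 0;
}
```

Under that assumption, even a page at physical address 0x1000 fails a
40 bit mask once it is turned into a DAC address, while a device with
all 64 bits significant accepts it.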

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch
  2001-08-22 15:05                       ` With Daniel Phillips Patch David S. Miller
                                           ` (2 preceding siblings ...)
  2001-08-22 18:32                         ` David S. Miller
@ 2001-08-22 18:46                         ` David S. Miller
  2001-08-22 19:41                           ` Justin T. Gibbs
                                             ` (4 more replies)
  3 siblings, 5 replies; 75+ messages in thread
From: David S. Miller @ 2001-08-22 18:46 UTC (permalink / raw)
  To: gibbs; +Cc: axboe, skraw, phillips, linux-kernel

   From: "Justin T. Gibbs" <gibbs@scsiguy.com>
   Date: Wed, 22 Aug 2001 12:32:17 -0600

   I would like the change much better if the size of dma_addr_t
   simply changed to be 64bits wide if high mem support is enabled
   in your kernel config.

Drivers for SAC only PCI devices shall not be bloated by 64-bit type,
not in any case whatsoever.

   Sure, you need the other API changes to more finely set dma characteristics,
   but having two APIs just complicates life for the device driver.

Either your device is 64-bit capable or not, what is so complicated?
Each driver I converted was like 15 minutes of work, at most!
   
   From the device driver's point of view, this wasn't the case.
   The driver asks to have the data mapped into an address that
   its dma engine can understand and the system is supposed to do that
   mapping.

What is the virtual address of physical address 0x100000000
on a 32-bit cpu system if the page is not currently kmap()'d?

Answer: it doesn't exist.

The only portable way is to use pages.  That is what Jens's and
my work aims to do.  The ia64 API is nonportable and works only
on 64-bit systems.

   >It also assumed that using SAC or DAC addressing was simply a matter of
   >"does the device support it", and the world is far from being that simple :-)
   
   Can you enumerate the devices that actually issue a DAC when loaded with
   a 64bit address with 0's in the most significant 32bits?

Sym53c8xx does this.  You have to configure it to do SAC or DAC
for data, descriptors use SAC always.
   
There will certainly be devices in the future which will only
support DAC cycles.

   Now that I'm supposed to use two different APIs depending on what
   capabilities I enable in my driver, 

I think you are far overcomplicating things.  I mean, look at how
simple the conversion of some of the networking drivers was.  The
sym53c8xx driver conversion was very simple too, even with its odd
current behavior due to addressing limitations.

It was nothing more than:

1) Adding probe time code to configure DMA attributes correctly.
   Failing the probe if no suitable mode could be determined.

   You know, it allows you to do something like this:

	pci_set_dma_mask(pdev, 0x1fffffffff);
	if (pci_dac_cycles_ok(pdev)) {
		dac_addressing_method = 1;
		goto dma_configured;
	}
	pci_set_dma_mask(pdev, 0x7ffffffffff);
	if (pci_dac_cycles_ok(pdev)) {
		dac_addressing_method = 2;
		goto dma_configured;
	}
	pci_set_dma_mask(pdev, 0xffffffffffffffff);
	if (pci_dac_cycles_ok(pdev)) {
		dac_addressing_method = 3;
		goto dma_configured;
	}
	if (!pci_dma_supported(pdev, 0xffffffff)) {
		probe_fail_msg();
		return -ENODEV;
	}
	dac_addressing_method = 0; /* Use SAC */

2) sed 's/dma_addr_t/dma64_addr_t/'
   sed 's/pci_{map,unmap,dma_sync}*/pci64_{map,unmap,dma_sync}*/'

And doing a cursory glance over the DMA address references
to make sure they weren't being put into u32's or something
similar.

Justin, have you even _TRIED_ to use the new API?

I did, on like 6 drivers, and it works just fine.

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch
  2001-08-22 18:46                         ` David S. Miller
@ 2001-08-22 19:41                           ` Justin T. Gibbs
  2001-08-22 20:19                           ` David S. Miller
                                             ` (3 subsequent siblings)
  4 siblings, 0 replies; 75+ messages in thread
From: Justin T. Gibbs @ 2001-08-22 19:41 UTC (permalink / raw)
  To: David S. Miller; +Cc: axboe, skraw, phillips, linux-kernel

>   From: "Justin T. Gibbs" <gibbs@scsiguy.com>
>   Date: Wed, 22 Aug 2001 12:32:17 -0600
>
>   I would like the change much better if the size of dma_addr_t
>   simply changed to be 64bits wide if high mem support is enabled
>   in your kernel config.
>
>Drivers for SAC only PCI devices shall not be bloated by 64-bit type,
>not in any case whatsoever.

Where's the bloat?  The driver's S/G list is in the driver's native format.
Are you saying that there will be two different mapping implementation back
ends with the 32bit one possibly saving space by using 32bit only address
in whatever storage is required to record the mapping?  Since you have to
use an API call to fetch these values on a per-element basis, the 32bit back
end need only promote its 32bit values to 64bit values upon return.  On most
architectures I've seen, a load of a 32bit value from a 64bit value is still
just a 32bit move, so the fact that the returned value is a 64bit quantity
makes no difference code wise inside the driver.
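
A rough sketch of the argument above, in user-space C.  It assumes a
hypothetical 32-bit-only back end that stores 32-bit addresses and an
accessor that always returns a 64-bit value; none of these names come
from the real kernel API.

```c
#include <stdint.h>

/* Hypothetical 32-bit-only back end: mappings are stored as 32-bit values. */
struct sketch_mapping32 {
	uint32_t bus_addr;
};

/* The accessor always returns a 64-bit quantity; this back end zero-extends. */
static uint64_t sketch_mapping_addr(const struct sketch_mapping32 *m)
{
	return (uint64_t)m->bus_addr;
}

/* A SAC-only driver keeps just the low 32 bits in its native S/G element. */
struct sketch_sg32 {
	uint32_t addr;
	uint32_t len;
};

static void sketch_fill_sg(struct sketch_sg32 *sg,
			   const struct sketch_mapping32 *m, uint32_t len)
{
	sg->addr = (uint32_t)sketch_mapping_addr(m);
	sg->len = len;
}
```

The driver-side structure stays 32 bits wide either way; only the
accessor's return type is 64 bits.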

>   Sure, you need the other API changes to more finely set dma characteristics,
>   but having two APIs just complicates life for the device driver.
>
>Either your device is 64-bit capable or not, what is so complicated?

The complication arises when there is a performance impact associated
with 64bit support.  You may want to completely compile out 64bit support.
If I can use the exact same API if the user decides to configure my
device to not enable large address support, then my complaint is only
that it seems superfluous to have two different APIs and types.

>   From the device driver's point of view, this wasn't the case.
>   The driver asks to have the data mapped into an address that
>   its dma engine can understand and the system is supposed to do that
>   mapping.
>
>What is the virtual address of physical address 0x100000000
>on a 32-bit cpu system if the page is not currently kmap()'d?

Why does the driver care?  The driver asked to have some virtual address
mapped into a bus address.  Are you saying the system can't
figure out what physical page this is and from that the necessary
IOMMU magic to make it visible to the device?

>The only portable way is to use pages.  That is what Jens's and
>my work aims to do.  The ia64 API is nonportable and works only
>on 64-bit systems.

This doesn't follow from your explanation.  Sure, you need to use
pages, but the basic information provided to the mapping calls allows
you to figure out the pages and that basic information is the same in
the new API.

It seems to me that you are complaining that the "backend" implementation
for IA64 sucked.  Okay.  Fine.  But the drivers were never exposed to
that suckage.

--
Justin

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch
  2001-08-22 18:46                         ` David S. Miller
  2001-08-22 19:41                           ` Justin T. Gibbs
@ 2001-08-22 20:19                           ` David S. Miller
  2001-08-22 21:07                           ` Gérard Roudier
                                             ` (2 subsequent siblings)
  4 siblings, 0 replies; 75+ messages in thread
From: David S. Miller @ 2001-08-22 20:19 UTC (permalink / raw)
  To: gibbs; +Cc: axboe, skraw, phillips, linux-kernel

   From: "Justin T. Gibbs" <gibbs@scsiguy.com>
   Date: Wed, 22 Aug 2001 13:41:16 -0600

   >
   >Drivers for SAC only PCI devices shall not be bloated by 64-bit type,
   >not in any case whatsoever.
   
   Where's the bloat?  The driver's S/G list is in the driver's native format.

All the world is not a scsi driver.  Consider networking
and other types of drivers that need not make use of
scatter lists.

   >Either your device is 64-bit capable or not, what is so complicated?
   
   The complication arises when there is a performance impact associated
   with 64bit support.  You may want to completely compile out 64bit support.
   If I can use the exact same API if the user decides to configure my
   device to not enable large address support, then my complaint is only
   that it seems superfluous to have two different APIs and types.

You can indeed use the same API.  There are two situations.

1) The performance impact is "platform specific", so let the
   platform decide for you:

	Use pci64_foo() and set the DMA mask to what you can support.
	Ie. a normal DAC supporting driver.

2) The performance impact is "device specific", so only use
   the 32-bit API.   

   >What is the virtual address of physical address 0x100000000
   >on a 32-bit cpu system if the page is not currently kmap()'d?
   
   Why does the driver care?  The driver asked to have some virtual address
   mapped into a bus address.  Are you saying the system can't
   figure out what physical page this is and from that the necessary
   IOMMU magic to make it visible to the device?
   
It is the object that the block and networking systems work with
that is the issue.

Do you know how physical memory is mapped under Linux?  Everything
non-HIGHMEM is directly mapped.  Everything else must be temporarily
"kmap()'d" so that the kernel can perform loads and stores to that
page.

The only object representation that works for all kinds of pages,
HIGHMEM or not, is the "struct page *page; unsigned long offset;"
tuple.
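
To illustrate why the page/offset tuple works where a virtual address
does not, here is a small user-space model.  The structure, the 896 MB
low-memory boundary and the 0xC0000000 base are assumptions chosen to
resemble a 32-bit x86 layout; they are not taken from the kernel
sources.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12
/* Assumed boundary: frames below this are directly mapped (non-HIGHMEM). */
#define SKETCH_LOWMEM_PFNS	0x38000ULL	/* 896 MB of 4 KB pages */
/* Assumed base of the kernel's direct mapping on a 32-bit CPU. */
#define SKETCH_PAGE_OFFSET	0xC0000000U

/* Stand-in for struct page: all the model needs is the frame number. */
struct sketch_page {
	uint64_t pfn;
};

/* A (page, offset) tuple always yields a physical address, HIGHMEM or not. */
static uint64_t sketch_page_to_phys(const struct sketch_page *pg,
				    unsigned long offset)
{
	return (pg->pfn << SKETCH_PAGE_SHIFT) + offset;
}

/*
 * A 32-bit virtual address exists only for directly mapped pages.
 * Returns 1 and fills *vaddr on success, 0 if the page would need kmap().
 */
static int sketch_lowmem_vaddr(const struct sketch_page *pg,
			       unsigned long offset, uint32_t *vaddr)
{
	if (pg->pfn >= SKETCH_LOWMEM_PFNS)
		return 0;
	*vaddr = SKETCH_PAGE_OFFSET +
		 (uint32_t)(pg->pfn << SKETCH_PAGE_SHIFT) + (uint32_t)offset;
	return 1;
}
```

For the page at physical address 0x100000000 the first function still
returns a usable value, while the second reports that no virtual
address is available.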

If this wasn't a problem, Jens would not be doing any of the
work he is doing right now :-)

   It seems to me that you are complaining that the "backend" implementation
   for IA64 sucked.  Okay.  Fine.  But the drivers were never exposed to
   that suckage.

No, I have in fact no problem with IA64's backend, I don't care how
any platform implements anything.  It is the front end that sucked
balls, and this part I care about because it is the APIs drivers have
to deal with.  Specifically, my gripes are:

1) It took virtual addresses.  Result: does not work on 32-bit
   platforms.

2) It did not take into consideration at all the issues surrounding
   DAC usage on some platforms, such as:

	a) transfers using DAC cycles might run slower than
	   those using SAC cycles
	b) DAC cycles may be preferred even in the presence of
	   slower transfers because the device is "DMA mapping
	   hungry" ala. compute cluster cards.

Let me ask you again: Have you tried to write a driver to the new
APIs at all?  I have for 6 totally different devices, on drivers
written by totally different people (including those I wrote myself)
and they all worked out beautifully.

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P)
  2001-08-22 16:09               ` Sven Heinicke
  2001-08-22 15:42                 ` Marcelo Tosatti
@ 2001-08-22 20:25                 ` Sven Heinicke
  1 sibling, 0 replies; 75+ messages in thread
From: Sven Heinicke @ 2001-08-22 20:25 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: linux-kernel


cool, it now compiles, runs and doesn't crash while running bonnie++
like it did before.  I've told my users to beat at it.  if it crashes
or continues to get bad disk performance i'm sure the list will hear
from me.  Hopefully I'll be able to test more changes if they need
testing.

	Sven

Marcelo Tosatti writes:
 > 
 > Sven,
 > 
 > There is another mistake on the patch I sent you.
 > 
 > In buffer.c, instead of
 > 
 > "page->zone == &pgdat_list->node_zones[ZONE_HIGHMEM]"
 >             ^^
 > you should use
 > 
 > "page->zone != &pgdat_list->node_zones[ZONE_HIGHMEM]"
 > 	    ^^ 
 > 
 > Ok? 
 > 
 > On Wed, 22 Aug 2001, Sven Heinicke wrote:
 > 
 > > 
 > > I tried your patch below; to compile I had to edit line 2451 of
 > > buffer.c, after the patch, to be "page->zone" instead of "page-zone".
 > > After that the build went great.  But part of the way through running
 > > bonnie++ the system crashed in a way that it didn't write anything to
 > > the syslog.  The terminal was spewing:
 > > 
 > > APIC error on CPU0: 0c(0c)
 > > APIC error on CPU1: 0c(0c)
 > > 
 > > I've really gotta put that system back into production.  As it seems
 > > much better off before I started this thread with the 2.4.8-ac8
 > > kernel.
 > > 
 > > 	Sven
 > > 
 > > Marcelo Tosatti writes:
 > >  > 
 > >  > 
 > >  > On Tue, 21 Aug 2001, Sven Heinicke wrote:
 > >  > 
 > >  > > 
 > >  > > Forgive the sin of replying to my own message but Daniel Phillips
 > >  > > replied to a different message with a patch to somebody getting a
 > >  > > similar error to mine.  Here is the result:
 > >  > > 
 > >  > > Aug 20 15:10:33 ps1 kernel: cation failed (gfp=0x30/1). 
 > >  > > Aug 20 15:10:33 ps1 kernel: __alloc_pages: 0-order allocation failed
 > >  > > (gfp=0x30/1). 
 > >  > > Aug 20 15:10:46 ps1 last message repeated 327 times 
 > >  > > Aug 20 15:10:47 ps1 kernel: cation failed (gfp=0x30/1). 
 > >  > > Aug 20 15:10:47 ps1 kernel: __alloc_pages: 0-order allocation failed
 > >  > > (gfp=0x30/1). 
 > >  > > Aug 20 15:10:56 ps1 last message repeated 294 times 
 > >  > > 
 > >  > > 
 > >  > > Sven Heinicke writes:
 > >  > >  > 
 > >  > >  > It's always a blessing and a curse when people seem to be having
 > >  > >  > problems with the same drivers as you.  I started looking into this
 > >  > >  > when a user complained about disk access time.  I think this is
 > >  > >  > related to the running aic7xxx topics.
 > >  > >  > 
 > >  > >  > From my tests, I got a Dell 4400 whose Adaptec 7899P, according to
 > >  > >  > bonnie++, was writing slower than some of my IDE drives on a
 > >  > >  > different system.  I tried Red Hat's 2.4.3-12smp kernel and got a
 > >  > >  > little improvement.  I then built 2.4.9 and started running bonnie++
 > >  > >  > again and my syslog gets filled up with such errors:
 > >  > >  > 
 > >  > >  > Aug 20 14:23:33 ps1 kernel: __alloc_pages: 0-order all
 > >  > >  > Aug 20 14:23:36 ps1 last message repeated 376 times
 > >  > >  > Aug 20 14:23:36 ps1 kernel: ed.
 > >  > >  > Aug 20 14:23:36 ps1 kernel: __alloc_pages: 0-order all
 > >  > >  > Aug 20 14:23:44 ps1 last message repeated 376 times
 > >  > >  > Aug 20 14:23:44 ps1 kernel: ed.
 > >  > >  > Aug 20 14:23:44 ps1 kernel: __alloc_pages: 0-order all
 > >  > >  > Aug 20 14:23:44 ps1 last message repeated 363 times
 > >  > >  > 
 > >  > >  > With slow access time.  Please request more info if you think it might
 > >  > >  > help.
 > >  > 
 > >  > Sven,
 > >  > 
 > >  > Could you please try the following patch on top of 2.4.9? 
 > >  > 
 > >  > diff -Nur --exclude-from=exclude linux.orig/fs/buffer.c linux/fs/buffer.c
 > >  > --- linux.orig/fs/buffer.c	Wed Aug 15 18:25:49 2001
 > >  > +++ linux/fs/buffer.c	Tue Aug 21 04:54:01 2001
 > >  > @@ -2447,7 +2447,8 @@
 > >  >  	spin_unlock(&free_list[index].lock);
 > >  >  	write_unlock(&hash_table_lock);
 > >  >  	spin_unlock(&lru_list_lock);
 > >  > -	if (gfp_mask & __GFP_IO) {
 > >  > +	if (gfp_mask & __GFP_IO || (gfp_mask & __GFP_NOBOUNCE) 
 > >  > +			&& page-zone == &pgdat_list->node_zones[ZONE_HIGHMEM]) {
 > >  >  		sync_page_buffers(bh, gfp_mask);
 > >  >  		/* We waited synchronously, so we can free the buffers. */
 > >  >  		if (gfp_mask & __GFP_WAIT) {
 > >  > diff -Nur --exclude-from=exclude linux.orig/include/linux/mm.h linux/include/linux/mm.h
 > >  > --- linux.orig/include/linux/mm.h	Wed Aug 15 18:21:11 2001
 > >  > +++ linux/include/linux/mm.h	Tue Aug 21 04:52:08 2001
 > >  > @@ -538,6 +538,8 @@
 > >  >  #define __GFP_HIGH	0x20	/* Should access emergency pools? */
 > >  >  #define __GFP_IO	0x40	/* Can start physical IO? */
 > >  >  #define __GFP_FS	0x80	/* Can call down to low-level FS? */
 > >  > +#define __GFP_NOBOUNCE	0x100	/* Don't do any IO operation which may
 > >  > +				   result in IO bouncing */
 > >  >  
 > >  >  #define GFP_NOIO	(__GFP_HIGH | __GFP_WAIT)
 > >  >  #define GFP_NOFS	(__GFP_HIGH | __GFP_WAIT | __GFP_IO)
 > >  > diff -Nur --exclude-from=exclude linux.orig/include/linux/slab.h linux/include/linux/slab.h
 > >  > --- linux.orig/include/linux/slab.h	Wed Aug 15 18:21:13 2001
 > >  > +++ linux/include/linux/slab.h	Tue Aug 21 04:51:20 2001
 > >  > @@ -23,7 +23,7 @@
 > >  >  #define	SLAB_NFS		GFP_NFS
 > >  >  #define	SLAB_DMA		GFP_DMA
 > >  >  
 > >  > -#define SLAB_LEVEL_MASK		(__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_FS)
 > >  > +#define SLAB_LEVEL_MASK		(__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_FS|__GFP_NOBOUNCE)
 > >  >  #define	SLAB_NO_GROW		0x00001000UL	/* don't grow a cache */
 > >  >  
 > >  >  /* flags to pass to kmem_cache_create().
 > >  > diff -Nur --exclude-from=exclude linux.orig/mm/highmem.c linux/mm/highmem.c
 > >  > --- linux.orig/mm/highmem.c	Thu Aug 16 13:42:45 2001
 > >  > +++ linux/mm/highmem.c	Tue Aug 21 04:50:08 2001
 > >  > @@ -321,7 +321,7 @@
 > >  >  	struct page *page;
 > >  >  
 > >  >  repeat_alloc:
 > >  > -	page = alloc_page(GFP_NOIO);
 > >  > +	page = alloc_page(GFP_NOIO|__GFP_NOBOUNCE);
 > >  >  	if (page)
 > >  >  		return page;
 > >  >  	/*
 > >  > @@ -359,7 +359,7 @@
 > >  >  	struct buffer_head *bh;
 > >  >  
 > >  >  repeat_alloc:
 > >  > -	bh = kmem_cache_alloc(bh_cachep, SLAB_NOIO);
 > >  > +	bh = kmem_cache_alloc(bh_cachep, SLAB_NOIO|__GFP_NOBOUNCE);
 > >  >  	if (bh)
 > >  >  		return bh;
 > >  >  	/*
 > >  > diff -Nur --exclude-from=exclude linux.orig/mm/page_alloc.c linux/mm/page_alloc.c
 > >  > --- linux.orig/mm/page_alloc.c	Thu Aug 16 13:43:02 2001
 > >  > +++ linux/mm/page_alloc.c	Tue Aug 21 04:51:03 2001
 > >  > @@ -398,7 +398,8 @@
 > >  >  	 * - we're /really/ tight on memory
 > >  >  	 * 	--> try to free pages ourselves with page_launder
 > >  >  	 */
 > >  > -	if (!(current->flags & PF_MEMALLOC)) {
 > >  > +	if (!(current->flags & PF_MEMALLOC) 
 > >  > +			|| ((gfp_mask & __GFP_NOBOUNCE) && !order)) {
 > >  >  		/*
 > >  >  		 * Are we dealing with a higher order allocation?
 > >  >  		 *
 > >  > 
 > > -
 > > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
 > > the body of a message to majordomo@vger.kernel.org
 > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
 > > Please read the FAQ at  http://www.tux.org/lkml/
 > > 
 > 

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch
  2001-08-22 18:46                         ` David S. Miller
  2001-08-22 19:41                           ` Justin T. Gibbs
  2001-08-22 20:19                           ` David S. Miller
@ 2001-08-22 21:07                           ` Gérard Roudier
  2001-08-22 21:40                             ` Justin T. Gibbs
  2001-08-22 23:09                             ` David S. Miller
  2001-08-22 21:14                           ` David S. Miller
  2001-08-22 21:14                           ` David S. Miller
  4 siblings, 2 replies; 75+ messages in thread
From: Gérard Roudier @ 2001-08-22 21:07 UTC (permalink / raw)
  To: David S. Miller; +Cc: gibbs, axboe, skraw, phillips, linux-kernel



On Wed, 22 Aug 2001, David S. Miller wrote:

>    From: "Justin T. Gibbs" <gibbs@scsiguy.com>
>    Date: Wed, 22 Aug 2001 12:32:17 -0600
>
>    I would like the change much better if the size of dma_addr_t
>    simply changed to be 64bits wide if high mem support is enabled
>    in your kernel config.
>
> Drivers for SAC only PCI devices shall not be bloated by 64-bit type,
> not in any case whatsoever.
>
>    Sure, you need the other API changes to more finely set dma characteristics,
>    but having two APIs just complicates life for the device driver.
>
> Either your device is 64-bit capable or not, what is so complicated?
> Each driver I converted was like 15 minutes of work, at best!
>
>    From the device driver's point of view, this wasn't the case.
>    The driver asks to have the data mapped into an address that
>    its dma engine can understand and the system is supposed to do that
>    mapping.
>
> What is the virtual address of physical address 0x100000000
> on a 32-bit cpu system if the page is not currently kmap()'d?
>
> Answer: it doesn't exist.

I gather that Justin is referring to the DMA-related API of the
BSD O/Ses that, I believe, originates from NetBSD. The Linux
approach simply assumes that 32 bit addressing will still
have a long life. My guess is also that most low end servers and
personal computers may well still use less than 4 GB, and so remain
32 bit addressable, for a long time. This leads me to prefer the Linux
differentiation at the moment, given that O/S vendors provide binaries
and do not encourage users to recompile the kernel and modules. We
probably do not want more than 99% of real machines to carry 64 bit
quantities uselessly just for the sake of source code aesthetics.

And I seem to understand that David's preferred SUN hardware only allows
streaming when using SAC with IOMMU.:-)

And using DAC is 1 PCI cycle lost per transaction and if we are
picky on performances ...

> The only portable way is to use pages.  That is what Jens's and
> my work aims to do.  The ia64 API is nonportable and works only
> on 64-bit systems.
>
>    >It also assumed that using SAC or DAC addressing was simply a matter of
>    >"does the device support it", and the world is far from being that simple :-)
>
>    Can you enumerate the devices that actually issue a DAC when loaded with
>    a 64bit address with 0's in the most significant 32bits?
>
> Sym53c8xx does this.  You have to configure it to do SAC or DAC
> for data, descriptors use SAC always.

Note that the manual says that the device will not use DAC if the upper 32
bits are zero. Nor do I remember any erratum about the device behaving
this way. But, as sym53c8xx devices actually doing 64 bit PCI
addressing have been pretty rare so far, not all errata on this
point may have been discovered (just trying to guess ...).

[...]

Later,
  Gérard.


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch
  2001-08-22 18:46                         ` David S. Miller
                                             ` (2 preceding siblings ...)
  2001-08-22 21:07                           ` Gérard Roudier
@ 2001-08-22 21:14                           ` David S. Miller
  2001-08-22 21:14                           ` David S. Miller
  4 siblings, 0 replies; 75+ messages in thread
From: David S. Miller @ 2001-08-22 21:14 UTC (permalink / raw)
  To: groudier; +Cc: gibbs, axboe, skraw, phillips, linux-kernel


   From: Gérard Roudier <groudier@free.fr>
   Date: Wed, 22 Aug 2001 23:07:47 +0200 (CEST)

   Note that the manual says that the device will not use DAC if higher 32
   bits are zero. Nor I remember of any errata about the device behaving
   this way. But, as sym53c8xx device actually trying to do 64 bit PCI
   addressing should have been pretty rare for now, not all errata on this
   point should have been discovered (just trying to guess ...).

If I do not set DDAC in sym53c8xx it issues DAC cycles for everything,
even addresses with no bits set in upper 32-bits of address.

You will remember, we had this issue long ago and had to add #define
for it (which dies in my pci64 changes because this portability
issue no longer exists with proper API present).

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch
  2001-08-22 18:46                         ` David S. Miller
                                             ` (3 preceding siblings ...)
  2001-08-22 21:14                           ` David S. Miller
@ 2001-08-22 21:14                           ` David S. Miller
  4 siblings, 0 replies; 75+ messages in thread
From: David S. Miller @ 2001-08-22 21:14 UTC (permalink / raw)
  To: groudier; +Cc: gibbs, axboe, skraw, phillips, linux-kernel


   From: Gérard Roudier <groudier@free.fr>
   Date: Wed, 22 Aug 2001 23:07:47 +0200 (CEST)
   
   And I seem to understand that David's preferred SUN hardware only
   allows streaming when using SAC with IOMMU.:-)
   
It is true.

   And using DAC is 1 PCI cycle lost per transaction and if we are
   picky on performances ...

True.  This is why I find it mysterious when I run across a device
which does not issue SAC for addresses with only 32-bits of
significance.

Happily, most do behave this way.
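
The behaviour described as the common case can be written down as a
one-line rule.  This is only a sketch of that rule in user-space C; the
enum and function names are hypothetical, and the address-phase count
simply restates the "1 PCI cycle lost per transaction" remark quoted
above.

```c
#include <stdint.h>

enum sketch_pci_cycle {
	SKETCH_SAC,	/* single address cycle, 32-bit address */
	SKETCH_DAC	/* dual address cycle, 64-bit address */
};

/* Well-behaved device: use DAC only if the upper 32 bits are non-zero. */
static enum sketch_pci_cycle sketch_cycle_for(uint64_t bus_addr)
{
	return (bus_addr >> 32) ? SKETCH_DAC : SKETCH_SAC;
}

/* A DAC transaction spends one extra address phase compared to SAC. */
static unsigned int sketch_address_phases(enum sketch_pci_cycle c)
{
	return c == SKETCH_DAC ? 2U : 1U;
}
```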

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch
  2001-08-22 21:07                           ` Gérard Roudier
@ 2001-08-22 21:40                             ` Justin T. Gibbs
  2001-08-22 23:09                             ` David S. Miller
  1 sibling, 0 replies; 75+ messages in thread
From: Justin T. Gibbs @ 2001-08-22 21:40 UTC (permalink / raw)
  To: Gérard Roudier; +Cc: David S. Miller, axboe, skraw, phillips, linux-kernel

>I seem to understand that Justin is referring to the DMA related API of
>BSD O/Ses that, I believe, originates from NetBSD.

Not really.  I just don't think that having both a 32bit and a 64bit
type for dma_addr_t makes sense.  I'm not advocating all devices perform
DAC.

I've started looking through the network devices for bloat caused
by the change in size of this type and I haven't found it anywhere.

--
Justin

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch
  2001-08-22 21:07                           ` Gérard Roudier
  2001-08-22 21:40                             ` Justin T. Gibbs
@ 2001-08-22 23:09                             ` David S. Miller
  2001-08-23  0:01                               ` Justin T. Gibbs
  2001-08-23  0:40                               ` David S. Miller
  1 sibling, 2 replies; 75+ messages in thread
From: David S. Miller @ 2001-08-22 23:09 UTC (permalink / raw)
  To: gibbs; +Cc: groudier, axboe, skraw, phillips, linux-kernel

   From: "Justin T. Gibbs" <gibbs@scsiguy.com>
   Date: Wed, 22 Aug 2001 15:40:30 -0600
   
   I've started looking through the network devices for bloat caused
   by the change in size of this type and I haven't found it anywhere.

Consider network drivers (most PCI ones) that keep track of:

	struct sk_buff *skb;
	dma_addr_t mapping;

pairs for each transmit packet.  With your suggested change,
their structures will increase 32-bits in size for each entry
when CONFIG_HIGHMEM on x86 or on a 64-bit platform.

I mean, just grep for dma_addr_t in structures of these networking
drivers to see where the wasted space would be.
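
The size argument can be checked with a small user-space sketch.  To
mimic a 32-bit x86 build on any host, the skb pointer is modelled as a
32-bit handle; that substitution and all names here are assumptions for
illustration only.

```c
#include <stdint.h>
#include <stddef.h>

/* Per-packet bookkeeping with a 32-bit mapping (SAC-only driver). */
struct sketch_tx_entry32 {
	uint32_t skb;		/* stands in for a 32-bit struct sk_buff pointer */
	uint32_t mapping;	/* 32-bit DMA address */
};

/* The same entry if the mapping type were always 64 bits wide. */
struct sketch_tx_entry64 {
	uint32_t skb;
	uint64_t mapping;	/* may also force alignment padding */
};

/* Extra bytes a ring of the given size would cost with the wide type. */
static size_t sketch_ring_growth(size_t entries)
{
	return entries * (sizeof(struct sketch_tx_entry64) -
			  sizeof(struct sketch_tx_entry32));
}
```

Each entry grows by at least the 32 bits mentioned above; depending on
the ABI's alignment rules for 64-bit members, padding can make the
growth larger.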

Later,
David S. Miller
davem@redhat.com


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch
  2001-08-22 23:09                             ` David S. Miller
@ 2001-08-23  0:01                               ` Justin T. Gibbs
  2001-08-23  0:40                               ` David S. Miller
  1 sibling, 0 replies; 75+ messages in thread
From: Justin T. Gibbs @ 2001-08-23  0:01 UTC (permalink / raw)
  To: David S. Miller; +Cc: groudier, axboe, skraw, phillips, linux-kernel

>Consider network drivers (most PCI ones) that keep track of:
>
>	struct sk_buff *skb;
>	dma_addr_t mapping;
>
>pairs for each transmit packet.  With your suggested change,
>their structures will increase 32-bits in size for each entry
>when CONFIG_HIGHMEM on x86 or on a 64-bit platform.

They already increase by 32bits on IA64.  A driver should use a
fixed sized type for a fixed sized address that corresponds to its
capabilities.  There is no guarantee of the size of dma_addr_t.
It is opaque and should be able to represent all dma (or I would prefer
bus) addresses in the system.  The examples I've seen where people
assume it to be 32bits in size are, well, broken.
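
A sketch of the position above: the system's address type is treated
as opaque, and the driver converts it into fixed-width fields that
match the hardware's descriptor layout.  The descriptor format and all
names are hypothetical; the opaque type is given 64 bits here only so
the example compiles.

```c
#include <stdint.h>

/* Opaque system type; the driver makes no assumption about its width. */
typedef uint64_t sketch_bus_addr_t;

/* Hypothetical hardware descriptor for a 64-bit capable DMA engine. */
struct sketch_hw_desc64 {
	uint32_t addr_lo;
	uint32_t addr_hi;
	uint32_t len;
};

static void sketch_fill_desc64(struct sketch_hw_desc64 *d,
			       sketch_bus_addr_t addr, uint32_t len)
{
	/* Widen first so the shift is well defined whatever the opaque width. */
	uint64_t wide = (uint64_t)addr;

	d->addr_lo = (uint32_t)(wide & 0xffffffffULL);
	d->addr_hi = (uint32_t)(wide >> 32);
	d->len = len;
}
```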

--
Justin

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch
  2001-08-22 23:09                             ` David S. Miller
  2001-08-23  0:01                               ` Justin T. Gibbs
@ 2001-08-23  0:40                               ` David S. Miller
  2001-08-23  0:55                                 ` Justin T. Gibbs
  2001-08-23  1:08                                 ` David S. Miller
  1 sibling, 2 replies; 75+ messages in thread
From: David S. Miller @ 2001-08-23  0:40 UTC (permalink / raw)
  To: gibbs; +Cc: groudier, axboe, skraw, phillips, linux-kernel

   From: "Justin T. Gibbs" <gibbs@scsiguy.com>
   Date: Wed, 22 Aug 2001 18:01:40 -0600

   It is opaque and should be able to represent all dma (or I would prefer
   bus) addresses in the system.  The examples I've seen where people
   assume it to be 32bits in size are, well, broken.

It is the type to be used for 32-bit SAC based DMA.
DMA-mapping.txt is pretty clear about this.

In fact, because it is well documented, ia64 is in direct violation of
the API.  I've been mentioning things like this since the beginning.

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch
  2001-08-23  0:40                               ` David S. Miller
@ 2001-08-23  0:55                                 ` Justin T. Gibbs
  2001-08-23  1:03                                   ` Matthew Jacob
  2001-08-23  1:08                                 ` David S. Miller
  1 sibling, 1 reply; 75+ messages in thread
From: Justin T. Gibbs @ 2001-08-23  0:55 UTC (permalink / raw)
  To: David S. Miller; +Cc: groudier, axboe, skraw, phillips, linux-kernel

>   From: "Justin T. Gibbs" <gibbs@scsiguy.com>
>   Date: Wed, 22 Aug 2001 18:01:40 -0600
>
>   It is opaque and should be able to represent all dma (or I would prefer
>   bus) addresses in the system.  The examples I've seen where people
>   assume it to be 32bits in size are, well, broken.
>
>It is the type to be used for 32-bit SAC based DMA.
>DMA-mapping.txt is pretty clear about this.

Then it is poorly named.  How about "pci_dma32_t".  Or better yet,
uint32_t.  How do the guys writing SBUS drivers like the fact that
all of this mapping stuff is so PCI centric?

--
Justin

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch
  2001-08-23  0:55                                 ` Justin T. Gibbs
@ 2001-08-23  1:03                                   ` Matthew Jacob
  0 siblings, 0 replies; 75+ messages in thread
From: Matthew Jacob @ 2001-08-23  1:03 UTC (permalink / raw)
  To: Justin T. Gibbs
  Cc: David S. Miller, groudier, axboe, skraw, phillips, linux-kernel


What guys writing SBus drivers? I mean, other than the NetBSD folks?


On Wed, 22 Aug 2001, Justin T. Gibbs wrote:

> >   From: "Justin T. Gibbs" <gibbs@scsiguy.com>
> >   Date: Wed, 22 Aug 2001 18:01:40 -0600
> >
> >   It is opaque and should be able to represent all dma (or I would prefer
> >   bus) addresses in the system.  The examples I've seen where people
> >   assume it to be 32bits in size are, well, broken.
> >
> >It is the type to be used for 32-bit SAC based DMA.
> >DMA-mapping.txt is pretty clear about this.
>
> Then it is poorly named.  How about "pci_dma32_t".  Or better yet,
> uint32_t.  How do the guys writing SBUS drivers like the fact that
> all of this mapping stuff is so PCI centric?
>
> --
> Justin
>


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch
  2001-08-23  0:40                               ` David S. Miller
  2001-08-23  0:55                                 ` Justin T. Gibbs
@ 2001-08-23  1:08                                 ` David S. Miller
  2001-08-23  1:32                                   ` Justin T. Gibbs
  2001-08-23  1:39                                   ` David S. Miller
  1 sibling, 2 replies; 75+ messages in thread
From: David S. Miller @ 2001-08-23  1:08 UTC (permalink / raw)
  To: gibbs; +Cc: groudier, axboe, skraw, phillips, linux-kernel

   From: "Justin T. Gibbs" <gibbs@scsiguy.com>
   Date: Wed, 22 Aug 2001 18:55:21 -0600
   
   Then it is poorly named.  How about "pci_dma32_t".  Or better yet,
   uint32_t.  How do the guys writing SBUS drivers like the fact that
   all of this mapping stuff is so PCI centric?
   
Please actually take a look at a few SBUS drivers before
you open your big mouth.  SBUS drivers use a totally different
API.

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch
  2001-08-23  1:08                                 ` David S. Miller
@ 2001-08-23  1:32                                   ` Justin T. Gibbs
  2001-08-23  1:39                                   ` David S. Miller
  1 sibling, 0 replies; 75+ messages in thread
From: Justin T. Gibbs @ 2001-08-23  1:32 UTC (permalink / raw)
  To: David S. Miller; +Cc: groudier, axboe, skraw, phillips, linux-kernel

>   From: "Justin T. Gibbs" <gibbs@scsiguy.com>
>   Date: Wed, 22 Aug 2001 18:55:21 -0600
>   
>   Then it is poorly named.  How about "pci_dma32_t".  Or better yet,
>   uint32_t.  How do the guys writing SBUS drivers like the fact that
>   all of this mapping stuff is so PCI centric?
>   
>Please actually take a look at a few SBUS drivers before
>you open your big mouth.  SBUS drivers use a totally different
>API.

Perhaps it's different for SBUS, but it's not different for ISA
or EISA.  The main point here is that if a single driver has
multiple bus attachments they either "luck out" and can use a
"pci api" to talk to their non-pci devices (the aic7xxx driver talks
EISA/VL/PCI) or they have to have different mapping paths (SBUS/PCI driver).
Do you believe that it is architecturally correct to have a single
api or multiple apis?  From your "big mouth" comment above, I assume
the latter.  From the driver's standpoint, the task is pretty much the
same, with perhaps different constraints on the types of address that
can be supported by the device.  The "pci" api already allows you
to express this.

--
Justin

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: With Daniel Phillips Patch
  2001-08-23  1:08                                 ` David S. Miller
  2001-08-23  1:32                                   ` Justin T. Gibbs
@ 2001-08-23  1:39                                   ` David S. Miller
  2001-08-23  1:49                                     ` Justin T. Gibbs
  1 sibling, 1 reply; 75+ messages in thread
From: David S. Miller @ 2001-08-23  1:39 UTC (permalink / raw)
  To: gibbs; +Cc: groudier, axboe, skraw, phillips, linux-kernel

   From: "Justin T. Gibbs" <gibbs@scsiguy.com>
   Date: Wed, 22 Aug 2001 19:32:46 -0600

   Perhaps its different for SBUS, but its not different for ISA
   or EISA.

Right, you pass in a NULL pci_dev pointer.  What is the
problem with that?

   Do you believe that it is architecturally correct to have a single
   api or multiple apis?

I think just plain different entry points are the way to do things,
because function pointers and/or extra conditional execution rots when
it's really not needed.

   The "pci" api already allows you to express this.

There will be a "struct device" in 2.5.x and lots of unification.

Frankly, I'd rather not touch the SBUS drivers though.
All the devices are cast in stone, I'm the only person
who maintains or even works on any of the drivers, and
the less I have to change at this point the better.

Later,
David S. Miller
davem@redhat.com


* Re: With Daniel Phillips Patch
  2001-08-23  1:39                                   ` David S. Miller
@ 2001-08-23  1:49                                     ` Justin T. Gibbs
  0 siblings, 0 replies; 75+ messages in thread
From: Justin T. Gibbs @ 2001-08-23  1:49 UTC (permalink / raw)
  To: David S. Miller; +Cc: groudier, axboe, skraw, phillips, linux-kernel

>   From: "Justin T. Gibbs" <gibbs@scsiguy.com>
>   Date: Wed, 22 Aug 2001 19:32:46 -0600
>
>   Perhaps it's different for SBUS, but it's not different for ISA
>   or EISA.
>
>Right, you pass in a NULL pci_dev pointer.  What is the
>problem with that?

I don't have the same latitude to express dma characteristics of
broken, non-PCI devices.  For instance, I can't set the "dma mask"
for a VLB card (say some early BusLogic 445) that had some DMA bugs.
I have to treat it like an ISA card even if it may have problems
with DMAs below the typical ISA dma limit.

>   Do you believe that it is architecturally correct to have a single
>   api or multiple apis?
>
>I think just plain different entry points are the way to do things,
>because function pointers and/or extra conditional execution rots when
>it's really not needed.

That needn't be the case.  If I can use a single API to define the
DMA characteristics of my device, and the system knows where it
is in the bus hierarchy (and all the warts of the bridges along
the way, etc.), the magic to do the mapping can be hidden from me
and I don't need to have multiple APIs or code paths.  I just pass
a "dma descriptor" that has the necessary info for that type of
dma operation on that platform, and the system does the rest.  This
even allows a device to allocate multiple descriptors to handle its
different operations (bulk data is 64bit capable, transaction descriptors
need to be handled with 24bit addresses, etc.).
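
A minimal sketch of the "dma descriptor" idea described above, in plain C
rather than kernel code.  Every name here (struct dma_desc, dma_desc_allows,
bulk_data_desc, ctrl_desc) and every limit value is an assumption made up
for illustration; none of it is an existing kernel interface.  It only shows
how one device could carry several descriptors, one per class of operation,
and how a single check could serve all of them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "dma descriptor": one per class of DMA operation. */
struct dma_desc {
    uint64_t addr_mask; /* contiguous low address bits the device can drive */
    uint32_t max_seg;   /* largest single segment, in bytes */
};

/* Bulk data may live anywhere in a 64-bit space (illustrative values). */
static const struct dma_desc bulk_data_desc = { UINT64_MAX, 1u << 20 };
/* Transaction descriptors must sit below 16MB (24-bit addressing). */
static const struct dma_desc ctrl_desc = { 0x00FFFFFFull, 4096 };

/* Return true if the buffer [addr, addr + len) satisfies the descriptor. */
static bool dma_desc_allows(const struct dma_desc *d, uint64_t addr,
                            uint32_t len)
{
    uint64_t last;

    assert(d != NULL);
    if (len == 0 || len > d->max_seg)
        return false;
    last = addr + (len - 1);
    if (last < addr)    /* wrapped past the end of the address space */
        return false;
    /* With a contiguous mask, checking both ends covers the whole range. */
    return (addr & ~d->addr_mask) == 0 && (last & ~d->addr_mask) == 0;
}
```

A caller would pass &ctrl_desc when placing transaction descriptors and
&bulk_data_desc when mapping data buffers; a buffer that fails the check
would have to be bounced or remapped by the platform code.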

>   The "pci" api already allows you to express this.
>
>There will be a "struct device" in 2.5.x and lots of unification.

That's good to know.

--
Justin

* Re: aic7xxx errors with 2.4.8-ac7 on 440gx mobo
  2001-08-20 21:44         ` aic7xxx errors with 2.4.8-ac7 on 440gx mobo Justin T. Gibbs
  2001-08-20 21:48           ` Cliff Albert
@ 2001-08-25  7:15           ` Cliff Albert
  1 sibling, 0 replies; 75+ messages in thread
From: Cliff Albert @ 2001-08-25  7:15 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: linux-kernel

On Mon, Aug 20, 2001 at 03:44:34PM -0600, Justin T. Gibbs wrote:

> >And here they are, the dmesg is my bootup dmesg with the devices drivers 
> >and stuff, and the second dmesg is the actual errors (verbose turned on)
> 
> You need OFOJ or better firmware in your Fireball ST.  The firmware you
> have now is known to be bad.  Before Maxtor's purchase of Quantum's
> disk line, you used to be able to get firmware updates off of
> ftp.quantum.com, but they've hence cleared out those files.  In a
> quick look through Maxtor's site, I could not find the relevant files.

Actually, all my SCSI errors disappeared when I upgraded my P2B-S motherboard
to BIOS version 1014 Beta 1A (which includes Adaptec BIOS v3.10). It's available
from ftp.asuscom.de

-- 
Cliff Albert		| RIPE:	     CA3348-RIPE | www.oisec.net
cliff@oisec.net		| 6BONE:     CA2-6BONE	 | icq 18461740

* Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P)
  2001-08-22 15:42                 ` Marcelo Tosatti
@ 2001-08-29  7:30                   ` Andrey Nekrasov
  2001-09-03 14:58                     ` Marcelo Tosatti
  0 siblings, 1 reply; 75+ messages in thread
From: Andrey Nekrasov @ 2001-08-29  7:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Marcelo Tosatti

Hello Marcelo Tosatti,

Once you wrote about "Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P)":

 
  Maybe a bit late, but I applied your patch on my server (1.5GB/DAC960/2 CPUs/...).
	All tests passed (tiobench/NFS, big files over 2GB).

	Will your patch go into the mainline kernel (Linus/Alan)?



> Aug 20 15:10:33 ps1 kernel: cation failed (gfp=0x30/1).
> Aug 20 15:10:33 ps1 kernel: __alloc_pages: 0-order allocation failed (gfp=0x30/1).
> Aug 20 15:10:46 ps1 last message repeated 327 times
> Aug 20 15:10:47 ps1 kernel: cation failed (gfp=0x30/1).
> Aug 20 15:10:47 ps1 kernel: __alloc_pages: 0-order allocation failed (gfp=0x30/1).
> Aug 20 15:10:56 ps1 last message repeated 294 times

 
> >  > Could you please try the following patch on top of 2.4.9? 
> >  > 
> >  > diff -Nur --exclude-from=exclude linux.orig/fs/buffer.c linux/fs/buffer.c
> >  > --- linux.orig/fs/buffer.c	Wed Aug 15 18:25:49 2001
> >  > +++ linux/fs/buffer.c	Tue Aug 21 04:54:01 2001
> >  > @@ -2447,7 +2447,8 @@
> >  >  	spin_unlock(&free_list[index].lock);
> >  >  	write_unlock(&hash_table_lock);
> >  >  	spin_unlock(&lru_list_lock);
> >  > -	if (gfp_mask & __GFP_IO) {
> >  > +	if (gfp_mask & __GFP_IO || (gfp_mask & __GFP_NOBOUNCE) 
> >  > +			&& page->zone == &pgdat_list->node_zones[ZONE_HIGHMEM]) {
> >  >  		sync_page_buffers(bh, gfp_mask);
> >  >  		/* We waited synchronously, so we can free the buffers. */
> >  >  		if (gfp_mask & __GFP_WAIT) {
> >  > diff -Nur --exclude-from=exclude linux.orig/include/linux/mm.h linux/include/linux/mm.h
> >  > --- linux.orig/include/linux/mm.h	Wed Aug 15 18:21:11 2001
> >  > +++ linux/include/linux/mm.h	Tue Aug 21 04:52:08 2001
> >  > @@ -538,6 +538,8 @@
> >  >  #define __GFP_HIGH	0x20	/* Should access emergency pools? */
> >  >  #define __GFP_IO	0x40	/* Can start physical IO? */
> >  >  #define __GFP_FS	0x80	/* Can call down to low-level FS? */
> >  > +#define __GFP_NOBOUNCE	0x100	/* Don't do any IO operation which may
> >  > +				   result in IO bouncing */
> >  >  
> >  >  #define GFP_NOIO	(__GFP_HIGH | __GFP_WAIT)
> >  >  #define GFP_NOFS	(__GFP_HIGH | __GFP_WAIT | __GFP_IO)
> >  > diff -Nur --exclude-from=exclude linux.orig/include/linux/slab.h linux/include/linux/slab.h
> >  > --- linux.orig/include/linux/slab.h	Wed Aug 15 18:21:13 2001
> >  > +++ linux/include/linux/slab.h	Tue Aug 21 04:51:20 2001
> >  > @@ -23,7 +23,7 @@
> >  >  #define	SLAB_NFS		GFP_NFS
> >  >  #define	SLAB_DMA		GFP_DMA
> >  >  
> >  > -#define SLAB_LEVEL_MASK		(__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_FS)
> >  > +#define SLAB_LEVEL_MASK		(__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_FS|__GFP_NOBOUNCE)
> >  >  #define	SLAB_NO_GROW		0x00001000UL	/* don't grow a cache */
> >  >  
> >  >  /* flags to pass to kmem_cache_create().
> >  > diff -Nur --exclude-from=exclude linux.orig/mm/highmem.c linux/mm/highmem.c
> >  > --- linux.orig/mm/highmem.c	Thu Aug 16 13:42:45 2001
> >  > +++ linux/mm/highmem.c	Tue Aug 21 04:50:08 2001
> >  > @@ -321,7 +321,7 @@
> >  >  	struct page *page;
> >  >  
> >  >  repeat_alloc:
> >  > -	page = alloc_page(GFP_NOIO);
> >  > +	page = alloc_page(GFP_NOIO|__GFP_NOBOUNCE);
> >  >  	if (page)
> >  >  		return page;
> >  >  	/*
> >  > @@ -359,7 +359,7 @@
> >  >  	struct buffer_head *bh;
> >  >  
> >  >  repeat_alloc:
> >  > -	bh = kmem_cache_alloc(bh_cachep, SLAB_NOIO);
> >  > +	bh = kmem_cache_alloc(bh_cachep, SLAB_NOIO|__GFP_NOBOUNCE);
> >  >  	if (bh)
> >  >  		return bh;
> >  >  	/*
> >  > diff -Nur --exclude-from=exclude linux.orig/mm/page_alloc.c linux/mm/page_alloc.c
> >  > --- linux.orig/mm/page_alloc.c	Thu Aug 16 13:43:02 2001
> >  > +++ linux/mm/page_alloc.c	Tue Aug 21 04:51:03 2001
> >  > @@ -398,7 +398,8 @@
> >  >  	 * - we're /really/ tight on memory
> >  >  	 * 	--> try to free pages ourselves with page_launder
> >  >  	 */
> >  > -	if (!(current->flags & PF_MEMALLOC)) {
> >  > +	if (!(current->flags & PF_MEMALLOC) 
> >  > +			|| ((gfp_mask & __GFP_NOBOUNCE) && !order)) {
> >  >  		/*
> >  >  		 * Are we dealing with a higher order allocation?
> >  >  		 *
> >  > 
> > -
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> > 

-- 
bye.
Andrey Nekrasov, SpyLOG.

* Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P)
  2001-08-29  7:30                   ` Andrey Nekrasov
@ 2001-09-03 14:58                     ` Marcelo Tosatti
  0 siblings, 0 replies; 75+ messages in thread
From: Marcelo Tosatti @ 2001-09-03 14:58 UTC (permalink / raw)
  To: Andrey Nekrasov; +Cc: linux-kernel



On Wed, 29 Aug 2001, Andrey Nekrasov wrote:

> Hello Marcelo Tosatti,
> 
> Once you wrote about "Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P)":
> 
>  
>   Maybe a bit late, but I applied your patch on my server (1.5GB/DAC960/2 CPUs/...).
> 	All tests passed (tiobench/NFS, big files over 2GB).
> 
> 	Will your patch go into the mainline kernel (Linus/Alan)?

Yes, it's already in Linus' tree.


* Re: With Daniel Phillips Patch
  2001-08-23  2:22 Van Maren, Kevin
@ 2001-08-23  2:26 ` David S. Miller
  0 siblings, 0 replies; 75+ messages in thread
From: David S. Miller @ 2001-08-23  2:26 UTC (permalink / raw)
  To: kevin.vanmaren; +Cc: linux-kernel

   From: "Van Maren, Kevin" <kevin.vanmaren@unisys.com>
   Date: Wed, 22 Aug 2001 21:22:19 -0500

   If the HW generates DAC for addresses < 4GB whenever enabling
   support for 64-bit addresses, then that is very broken.

That is what happens.
   
   Please don't complain that I didn't spend hours searching
   through the archives looking for a message from months? years? ago
   that I didn't know existed.

Weeks, if not days.
   
   > I think for SAC-only devices, it is just dumb wasted space in the
   > driver image.
   
   Perhaps.  But the question is whether it is simpler/better to have
   HIGHMEM x86 kernels (which by definition have memory to spare) waste
   a few bytes to provide "sane" interfaces across all platforms.  And
   whether the kernel bloat for all the additional functions compensates
   for it ;-)

The plain fact is that 95% of PCI devices do not support DAC
addressing.

Later,
David S. Miller
davem@redhat.com


   

* RE: With Daniel Phillips Patch
@ 2001-08-23  2:22 Van Maren, Kevin
  2001-08-23  2:26 ` David S. Miller
  0 siblings, 1 reply; 75+ messages in thread
From: Van Maren, Kevin @ 2001-08-23  2:22 UTC (permalink / raw)
  To: 'David S. Miller'; +Cc: linux-kernel

>    There had better not be any.  It is a violation of the PCI specification
>    to generate a DAC if the address fits in 32 bits.
>    
> Then sym53c8xx with Gerard's current scripts code is in violation of
> the PCI specification when the chip is told to use DAC :-)

If you say so.  You make it sound like the driver can
compensate for the broken hardware by explicitly telling
it to use SAC for a transfer after checking if the
dma_addr < 4GB.  The HW is not in "violation" if it doesn't
generate a DAC < 4GB, and if it takes a driver check to
ensure that, it is the driver's problem: it is trivial to
check the high 32-bits of a 64-bit address for 0.
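
The driver-side check mentioned above can be sketched in a few lines of
plain C.  The names pci_addr_cycle and pick_addr_cycle are hypothetical and
exist only for this illustration; how a given chip is actually told to use
SAC or DAC for a transfer is device specific and not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Single Address Cycle vs. Dual Address Cycle (names are illustrative). */
enum pci_addr_cycle { PCI_SAC, PCI_DAC };

/*
 * Pick the address cycle from the starting bus address: if the upper
 * 32 bits are all zero the address fits in a SAC, otherwise a DAC is
 * needed.  A buffer that straddles the 4GB boundary is not handled
 * here; a real driver would have to split it first.
 */
static enum pci_addr_cycle pick_addr_cycle(uint64_t bus_addr)
{
    return (bus_addr >> 32) == 0 ? PCI_SAC : PCI_DAC;
}

/* Usage example: the last page below 4GB vs. the first byte above it. */
static void pick_addr_cycle_examples(void)
{
    assert(pick_addr_cycle(0x00000000FFFFF000ull) == PCI_SAC);
    assert(pick_addr_cycle(0x0000000100000000ull) == PCI_DAC);
}
```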

If the HW generates DAC for addresses < 4GB whenever enabling
support for 64-bit addresses, then that is very broken.

>    DAC is a LOT faster and more efficient than a copy (except perhaps for the
>    very smallest of transfers, which are already very inefficient).
> 
> SAC with IOMMU is faster on some platforms.

Okay, so when the driver asks for the physical address, the arch-
specific code maps it with the iommu and returns a 32-bit address.
In that case, the dma_addr_t is 32 bits (unless it can return
64-bit addresses as well).

> There are several other reasons.  (Man, people check the archives, I
> feel like I've typed this in like 5 times in linux-kernel postings
> already)

If you send a pointer to your previous message I will read it.  I am
interested in this subject (and have experience), so I threw in my
2 cents.  Please don't complain that I didn't spend hours searching
through the archives looking for a message from months? years? ago
that I didn't know existed.

> Let me list one of them, suppose you have a device for which
> some transfers can happily use DAC addresses, but some others strictly
> need to work with SAC addresses.

What does that have to do with anything?  That just means that the
DMA constraints have to be specified on a per-mapping/per-allocation
basis, not a per-device basis.  That doesn't mean you need separate
routines for 64-bit PCI addresses and 32-bit PCI addresses.  It just
means you need sane DMA constraints handling.

> I think for SAC-only devices, it is just dumb wasted space in the
> driver image.

Perhaps.  But the question is whether it is simpler/better to have
HIGHMEM x86 kernels (which by definition have memory to spare) waste
a few bytes to provide "sane" interfaces across all platforms.  And
whether the kernel bloat for all the additional functions compensates
for it ;-)

Reasonable people can have different opinions.

Kevin Van Maren

* Re: With Daniel Phillips Patch
  2001-08-23  1:31 ` David S. Miller
  2001-08-23  1:40   ` Justin T. Gibbs
@ 2001-08-23  1:45   ` David S. Miller
  1 sibling, 0 replies; 75+ messages in thread
From: David S. Miller @ 2001-08-23  1:45 UTC (permalink / raw)
  To: gibbs; +Cc: kevin.vanmaren, linux-kernel

   From: "Justin T. Gibbs" <gibbs@scsiguy.com>
   Date: Wed, 22 Aug 2001 19:40:55 -0600

   You have to keep track of the significant bits in the dma_addr_t
   regardless of its size, so you put it into your TX descriptor's (or
   what have you) native format that doesn't waste any space.  You don't
   need to keep the full dma_addr_t around.  Perhaps this is just sloppy
   programming?
   
Some devices keep these in registers and advance them as the
dma progresses.  The only reliable way is by keeping track
of it in software.

   If you don't want to take part in technical discussions, you should
   work in closed source. 8-)

It's not open source, it's "dumb source" I have problems with.
A lot of discussions here end up being of that variety.

Later,
David S. Miller
davem@redhat.com
   

* Re: With Daniel Phillips Patch
  2001-08-23  1:31 ` David S. Miller
@ 2001-08-23  1:40   ` Justin T. Gibbs
  2001-08-23  1:45   ` David S. Miller
  1 sibling, 0 replies; 75+ messages in thread
From: Justin T. Gibbs @ 2001-08-23  1:40 UTC (permalink / raw)
  To: David S. Miller; +Cc: kevin.vanmaren, linux-kernel

>I do not want to even go into the abuse I took in my email when I
>added the original APIs because people had to keep track of the
>damn mappings at all!  These people would strangle me if they learnt
>that in HIGHMEM kernels twice as much space was needed to do this
>DMA address tracking.

You have to keep track of the significant bits in the dma_addr_t
regardless of its size, so you put it into your TX descriptor's (or
what have you) native format that doesn't waste any space.  You don't
need to keep the full dma_addr_t around.  Perhaps this is just sloppy
programming?
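
As a rough illustration of storing only the significant bits, here is a
sketch in plain C of packing a 39-bit bus address and a 24-bit length into
an 8-byte scatter/gather entry.  The layout and the names hw_sg_entry,
sg_pack, sg_addr and sg_len are invented for this example; this is not the
descriptor format of aic7xxx or any other real controller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical native S/G entry: 8 bytes for a 39-bit address. */
struct hw_sg_entry {
    uint32_t addr_lo;    /* address bits 0-31 */
    uint32_t len_and_hi; /* bits 0-23: length, bits 24-30: address bits 32-38 */
};

/* Pack addr/len; fail if either does not fit the hypothetical format. */
static bool sg_pack(struct hw_sg_entry *e, uint64_t addr, uint32_t len)
{
    assert(e != NULL);
    if ((addr >> 39) != 0 || (len >> 24) != 0)
        return false;
    e->addr_lo = (uint32_t)addr;
    e->len_and_hi = len | ((uint32_t)(addr >> 32) << 24);
    return true;
}

/* Recover the full bus address from the packed entry. */
static uint64_t sg_addr(const struct hw_sg_entry *e)
{
    return ((uint64_t)((e->len_and_hi >> 24) & 0x7Fu) << 32) | e->addr_lo;
}

/* Recover the segment length from the packed entry. */
static uint32_t sg_len(const struct hw_sg_entry *e)
{
    return e->len_and_hi & 0x00FFFFFFu;
}
```

The entry is the same size whether dma_addr_t is 32 or 64 bits wide, and
the address can still be read back from it when the driver needs it.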

>I at least comfort myself that those who maintain drivers on several
>platforms, and have an open mind, such as Gerard, for the most part
>support the API I have designed.

If you don't want to take part in technical discussions, you should
work in closed source. 8-)

--
Justin

* Re: With Daniel Phillips Patch
  2001-08-23  1:06 With Daniel Phillips Patch Van Maren, Kevin
@ 2001-08-23  1:31 ` David S. Miller
  2001-08-23  1:40   ` Justin T. Gibbs
  2001-08-23  1:45   ` David S. Miller
  0 siblings, 2 replies; 75+ messages in thread
From: David S. Miller @ 2001-08-23  1:31 UTC (permalink / raw)
  To: kevin.vanmaren; +Cc: gibbs, linux-kernel

   From: "Van Maren, Kevin" <kevin.vanmaren@unisys.com>
   Date: Wed, 22 Aug 2001 20:06:21 -0500

   There had better not be any.  It is a violation of the PCI specification
   to generate a DAC if the address fits in 32 bits.
   
Then sym53c8xx with Gerard's current scripts code is in violation of
the PCI specification when the chip is told to use DAC :-)

   DAC is a LOT faster and more efficient than a copy (except perhaps for the
   very smallest of transfers, which are already very inefficient).

SAC with IOMMU is faster on some platforms.
   
   Separate 32-bit and 64-bit DMA routines add unnecessary complication.
   As far as I can tell, the only reason to have separate APIs is so
   that 32bit machines with 64 bit DMA  addresses (PAE on ia32) can
   avoid copying around an "extra" 32bits of address for the drivers
   that don't support 64-bit DMA.

Welcome to the complicated real world.

There are several other reasons.  (Man, people check the archives, I
feel like I've typed this in like 5 times in linux-kernel postings
already)  Let me list one of them, suppose you have a device for which
some transfers can happily use DAC addresses, but some others strictly
need to work with SAC addresses.

pci64_*() would mean "DAC address would be OK".

   I think it makes more sense to just make the dma_addr_t 64 bits on
   ia32 if using PAE and deal with the insignificant "waste" -- you
   have > 4GB RAM :-)

I think for SAC-only devices, it is just dumb wasted space in the
driver image.

I do not want to even go into the abuse I took in my email when I
added the original APIs because people had to keep track of the
damn mappings at all!  These people would strangle me if they learnt
that in HIGHMEM kernels twice as much space was needed to do this
DMA address tracking.

I at least comfort myself that those who maintain drivers on several
platforms, and have an open mind, such as Gerard, for the most part
support the API I have designed.

Later,
David S. Miller
davem@redhat.com

* Re: With Daniel Phillips Patch
@ 2001-08-23  1:06 Van Maren, Kevin
  2001-08-23  1:31 ` David S. Miller
  0 siblings, 1 reply; 75+ messages in thread
From: Van Maren, Kevin @ 2001-08-23  1:06 UTC (permalink / raw)
  To: 'gibbs@scsiguy.com'; +Cc: 'linux-kernel@vger.kernel.org'

> Can you enumerate the devices that actually issue a DAC when loaded with
> 64bit address with 0's in the most significant 32bits?

There had better not be any.  It is a violation of the PCI specification
to generate a DAC if the address fits in 32 bits.

DAC is a LOT faster and more efficient than a copy (except perhaps for the
very smallest of transfers, which are already very inefficient).

The problem is that (for most hardware) the 64-bit descriptors take up more
room (and hence more PCI cycles to transfer) than the 32-bit descriptors,
especially with a 32-bit bus.  [Apparently not the case for the 39-bit
AIC7xxx driver, but it is the case for the 64-bit Adaptec.]  So unless
there is the possibility of using 64-bit DMA, you want to use the smaller
descriptors.  So on systems with <= 32bits of memory/dma_addr_t, the driver
should be able to "know" that it should use the smaller descriptors for
efficiency.
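
The descriptor-size decision described above might look like the following
plain C sketch.  The names desc_format, desc_size and choose_desc_format and
the 8/12 byte sizes are assumptions for illustration only; real hardware
defines its own formats.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum desc_format { DESC_32BIT, DESC_64BIT };

/* Illustrative sizes: 32-bit address + length vs. 64-bit address + length. */
static size_t desc_size(enum desc_format f)
{
    return f == DESC_64BIT ? 12 : 8;
}

/*
 * Use the larger format only when the device supports it and the system
 * can actually hand out a DMA address above 4GB.  If the device is
 * 32-bit only, the small format is used and the platform is assumed to
 * bounce or remap anything above 4GB.
 */
static enum desc_format choose_desc_format(uint64_t max_dma_addr,
                                           bool dev_has_64bit)
{
    if (dev_has_64bit && (max_dma_addr >> 32) != 0)
        return DESC_64BIT;
    return DESC_32BIT;
}

/* Usage example: a 4GB machine vs. an 8GB machine, 64-bit capable device. */
static void choose_desc_format_examples(void)
{
    assert(choose_desc_format(0xFFFFFFFFull, true) == DESC_32BIT);
    assert(choose_desc_format(0x1FFFFFFFFull, true) == DESC_64BIT);
}
```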

I also believe that a dma_addr_t should be determined by the system, not the
driver: the driver should indicate constraints and the OS should ensure that
the dma_addr_t it provides meets the constraints.  Separate 32-bit and
64-bit DMA routines add unnecessary complication.  As far as I can tell, the
only reason to have separate APIs is so that 32bit machines with 64 bit DMA
addresses (PAE on ia32) can avoid copying around an "extra" 32bits of
address for the drivers that don't support 64-bit DMA.  I think it makes
more sense to just make the dma_addr_t 64 bits on ia32 if using PAE and
deal with the insignificant "waste" -- you have > 4GB RAM :-)

Kevin Van Maren

end of thread, other threads:[~2001-09-03 16:25 UTC | newest]

Thread overview: 75+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2001-08-20  8:36 aic7xxx errors with 2.4.8-ac7 on 440gx mobo Yusuf Goolamabbas
2001-08-20  8:55 ` Cliff Albert
2001-08-20 10:37   ` Alan Cox
2001-08-20 10:56     ` Yusuf Goolamabbas
2001-08-20 10:56       ` Alan Cox
2001-08-20 11:13         ` Yusuf Goolamabbas
2001-08-20 11:09           ` Alan Cox
2001-08-20 16:43             ` Doug Ledford
2001-08-20 12:46     ` Stefan Fleiter
2001-08-20 15:19       ` Ville Herva
2001-08-20 20:33         ` Justin T. Gibbs
2001-08-20 16:45       ` Doug Ledford
2001-08-20 17:23         ` Stefan Fleiter
2001-08-20 20:28       ` Justin T. Gibbs
2001-08-21 20:24         ` Stefan Fleiter
2001-08-20 16:21     ` Cliff Albert
2001-08-20 17:23       ` Peter T. Breuer
2001-08-20 17:28         ` Cliff Albert
2001-08-20 20:27   ` Justin T. Gibbs
2001-08-20 20:45     ` Cliff Albert
2001-08-20 21:04       ` Cliff Albert
2001-08-20 21:09         ` Cliff Albert
2001-08-20 21:45           ` Justin T. Gibbs
2001-08-20 22:55             ` Cliff Albert
2001-08-21  0:36               ` Justin T. Gibbs
2001-08-21 15:34                 ` Gérard Roudier
2001-08-21 14:42             ` With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P) Sven Heinicke
2001-08-21 15:08               ` Daniel Phillips
2001-08-21 16:48               ` Sven Heinicke
2001-08-21 17:18                 ` Justin T. Gibbs
2001-08-21 17:26                 ` Daniel Phillips
2001-08-21 17:55                 ` Stephan von Krawczynski
2001-08-21 18:33                   ` Justin T. Gibbs
2001-08-22  6:46                     ` Jens Axboe
2001-08-22 13:24                       ` Justin T. Gibbs
2001-08-22 15:05                       ` With Daniel Phillips Patch David S. Miller
2001-08-22 18:21                         ` Gérard Roudier
2001-08-22 18:32                         ` Justin T. Gibbs
2001-08-22 18:32                         ` David S. Miller
2001-08-22 18:46                         ` David S. Miller
2001-08-22 19:41                           ` Justin T. Gibbs
2001-08-22 20:19                           ` David S. Miller
2001-08-22 21:07                           ` Gérard Roudier
2001-08-22 21:40                             ` Justin T. Gibbs
2001-08-22 23:09                             ` David S. Miller
2001-08-23  0:01                               ` Justin T. Gibbs
2001-08-23  0:40                               ` David S. Miller
2001-08-23  0:55                                 ` Justin T. Gibbs
2001-08-23  1:03                                   ` Matthew Jacob
2001-08-23  1:08                                 ` David S. Miller
2001-08-23  1:32                                   ` Justin T. Gibbs
2001-08-23  1:39                                   ` David S. Miller
2001-08-23  1:49                                     ` Justin T. Gibbs
2001-08-22 21:14                           ` David S. Miller
2001-08-22 21:14                           ` David S. Miller
2001-08-21 22:44                 ` With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on 7899P) Sven Heinicke
2001-08-22  0:58                   ` Daniel Phillips
2001-08-21 22:49                 ` Sven Heinicke
2001-08-22 13:06                   ` Gérard Roudier
2001-08-22 10:25               ` Marcelo Tosatti
2001-08-22 16:09               ` Sven Heinicke
2001-08-22 15:42                 ` Marcelo Tosatti
2001-08-29  7:30                   ` Andrey Nekrasov
2001-09-03 14:58                     ` Marcelo Tosatti
2001-08-22 20:25                 ` Sven Heinicke
2001-08-20 22:36           ` aic7xxx with 2.4.9 on 7899P Sven Heinicke
2001-08-20 21:44         ` aic7xxx errors with 2.4.8-ac7 on 440gx mobo Justin T. Gibbs
2001-08-20 21:48           ` Cliff Albert
2001-08-25  7:15           ` Cliff Albert
2001-08-23  1:06 With Daniel Phillips Patch Van Maren, Kevin
2001-08-23  1:31 ` David S. Miller
2001-08-23  1:40   ` Justin T. Gibbs
2001-08-23  1:45   ` David S. Miller
2001-08-23  2:22 Van Maren, Kevin
2001-08-23  2:26 ` David S. Miller
