* 2.4.19 (and newer) - prob with the new adaptec aic7xxx driver and Promise UltraTrak100 TX2
@ 2003-07-24  3:33 Vinnie
  2003-07-24  3:35 ` Vinnie
  2003-07-24  3:41 ` Philippe Troin
  0 siblings, 2 replies; 5+ messages in thread
From: Vinnie @ 2003-07-24  3:33 UTC (permalink / raw)
  To: linux-kernel

Hello everyone,

Yep, I know... 2.4.19 is old news.  But I have tried this with newer 
official kernels also, same results.  Not expecting anybody to have a quick 
fix (although any suggestions would be really welcome!), but I do feel that 
this should be reported, since I have not seen many other posts indicating 
problems like this with "the new adaptec drivers".

Our primary file server is a dual 1.4GHz Tualatin 512K machine, with a Tyan 
S2688 Serverworks HE-SLt mainboard, 2GB of registered ECC SDRAM (4x512 
modules), which also has an AIC7899 dual channel U160 host adapter onboard.

The only SCSI device currently attached is a Promise UltraTrak100 TX8 - an 
8-bay SCSI-to-ATA RAID subsystem, with eight 120GB Western Digital drives 
configured as a 7-drive RAID5 array and 1 non-assigned hot spare.  The 
unit's SCSI interface can run 80MB/sec U2W/LVD (and the unit's SCSI ID is
configured appropriately in the HA BIOS).  The internal ribbon cable from the
motherboard to the external connector is a custom-made Granite Digital teflon
cable, and I am also using a Granite Digital Active Terminator to terminate
the bus (at the TX8).  I am using the external cable supplied by Promise with
the unit.

Note: Problem is reproducible with an Adaptec AHA-2940U2W used as the host 
adapter instead.

In a nutshell, the problem goes like this:

If I compile the kernel to use the NEW aic7xxx adaptec driver, the SCSI bus 
hangs almost immediately upon commencement of a large write operation, such 
as attempting to copy a 500MB file from one of the internal client machines 
to a SMB shared directory on this server.  The problem is reproducible on 
2.4.19 and 2.4.20 kernels, if I use the "new" aic7xxx driver.

The SCSI bus completely hangs, leaving the "SEL" (SCSI Select signal) light 
of my SCSIVue LED pack lit solid yellow until I cycle the power on the 
Promise unit.  The screen fills up with details of SCSI errors, data
overruns, abort commands being sent, etc.  Unfortunately very few of them
make it into the system log, because by then the server can't write to the
logs anymore.  I have to restart the server once this happens.

On the kernel I normally run (a customized 2.4.18 kernel), I have no such
problems.  I did have to do a bit of tweaking to the HA settings when I
first got the Promise unit, discovering for example that I needed to turn
"Allow Disconnect" OFF for the unit's SCSI ID to keep things running well.
Not a problem really, since it's the only device on the chain (right now,
anyway...)


Unfortunately, since this server also runs mdp-style "partitioned" md raid1
arrays with pairs of IDE drives (Neil Brown's mdp patches), I am limited to
trying kernels for which a good set of mdp patches exists.

From the documentation I have on the Promise unit, I know it can handle up
to 32 queued tagged commands, so I have 32 set in the kernel config options
instead of the default 253.
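
For readers who want to check the same setting, here is a minimal sketch that reads a kernel .config and reports the per-device tagged queue depth. The option name CONFIG_AIC7XXX_CMDS_PER_DEVICE and the sample config text are assumptions, not taken from this post; verify the name against your own kernel tree before relying on it.

```python
import re

# Hypothetical excerpt of a 2.4 kernel .config.  The option name
# CONFIG_AIC7XXX_CMDS_PER_DEVICE is an assumption; check your own tree.
SAMPLE_CONFIG = """\
CONFIG_SCSI_AIC7XXX=y
CONFIG_AIC7XXX_CMDS_PER_DEVICE=32
"""


def tagged_queue_depth(config_text, option="CONFIG_AIC7XXX_CMDS_PER_DEVICE"):
    """Return the integer value of `option`, or None if it is not set."""
    pattern = re.compile(r"^%s=(\d+)\s*$" % re.escape(option), re.MULTILINE)
    match = pattern.search(config_text)
    return int(match.group(1)) if match else None


def exceeds_device_limit(config_text, device_limit=32):
    """True if the configured depth is above what the device can queue."""
    depth = tagged_queue_depth(config_text)
    return depth is not None and depth > device_limit


if __name__ == "__main__":
    print("configured depth:", tagged_queue_depth(SAMPLE_CONFIG))
    print("above device limit:", exceeds_device_limit(SAMPLE_CONFIG))
```

The limit of 32 comes from the Promise documentation mentioned above; pass a different device_limit for other hardware.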

One snip from the logs I have been able to find, though:

Jul 21 21:16:13 vince500 kernel: scsi logging level set to 0x00000003
Jul 21 21:18:14 vince500 kernel: (scsi1:A:0:0): data overrun detected in Data-out phase.  Tag == 0x3.
Jul 21 21:18:14 vince500 kernel: (scsi1:A:0:0): Have seen Data Phase.  Length = 524288.  NumSGs = 128.
Jul 21 21:18:14 vince500 kernel: sg[0] - Addr 0x03169f000 : Length 4096
Jul 21 21:18:14 vince500 kernel: sg[1] - Addr 0x03169e000 : Length 4096
Jul 21 21:18:14 vince500 kernel: sg[2] - Addr 0x03169d000 : Length 4096
(50+ lines like the 3 lines above continue in the logs)
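
As a rough cross-check of the log above, the following sketch parses the "Have seen Data Phase" line and the sg[] entries and compares the reported transfer length with the scatter-gather list. It assumes the unlisted segments are 4096 bytes like the three shown, which the log excerpt does not prove; the function and variable names are illustrative only.

```python
import re

# Log text copied from the excerpt above, with syslog prefixes dropped
# and the wrapped line joined.
LOG = """\
(scsi1:A:0:0): Have seen Data Phase.  Length = 524288.  NumSGs = 128.
sg[0] - Addr 0x03169f000 : Length 4096
sg[1] - Addr 0x03169e000 : Length 4096
sg[2] - Addr 0x03169d000 : Length 4096
"""

HEADER_RE = re.compile(r"Length = (\d+)\.\s+NumSGs = (\d+)\.")
SG_RE = re.compile(r"sg\[(\d+)\] - Addr (0x[0-9a-fA-F]+) : Length (\d+)")


def summarize(log_text):
    """Compare the reported transfer length with the S/G segment sizes."""
    header = HEADER_RE.search(log_text)
    if header is None:
        raise ValueError("no 'Length = ... NumSGs = ...' line found")
    total_len, num_sgs = int(header.group(1)), int(header.group(2))
    seg_lengths = [int(m.group(3)) for m in SG_RE.finditer(log_text)]

    # Assumption: every unlisted segment has the same size as the listed ones.
    uniform_total = None
    if seg_lengths and len(set(seg_lengths)) == 1:
        uniform_total = num_sgs * seg_lengths[0]

    return {
        "total_len": total_len,
        "num_sgs": num_sgs,
        "listed_segments": len(seg_lengths),
        "uniform_total": uniform_total,
        "matches_header": uniform_total == total_len,
    }


if __name__ == "__main__":
    print(summarize(LOG))
```

If the segment sizes do add up to the reported length, the request as built by the host looks internally consistent, which would point the overrun at what happened on the bus rather than at a malformed request. That is only a hint, not a diagnosis.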

I have seen a few other people report similar problems with other devices:
hard drives, CD-ROMs, etc.  I have a little trouble believing it is a defect
in the SCSI implementation on the Promise unit, since it does work OK with 
the 2.4.18 and previous drivers.  I'm not saying it's impossible, just that 
I am hesitant to blame it on the unit.

Also, just to note - I have a symlink for the scsi includes
(/usr/include/scsi) which points to /usr/src/linux/include/scsi.
/usr/src/linux is itself a symlink which points to the current kernel
source tree, so when I build a kernel on a different version of the source,
I change the /usr/src/linux symlink to point to it and the rest are
automatically fixed also.  I have the same for /usr/include/asm (to
/usr/src/linux/include/asm-i386) and /usr/include/linux (to
/usr/src/linux/include/linux).
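
The symlink arrangement described above can be sketched as follows. This builds the layout under a scratch directory rather than touching the real /usr; the tree names linux-2.4.19 and linux-2.4.20 and all function names are hypothetical and chosen only for illustration.

```python
import os
import tempfile

HEADER_LINKS = (("scsi", "scsi"), ("linux", "linux"), ("asm", "asm-i386"))


def make_tree(root, kernel_tree):
    """Create an empty stand-in for a kernel source tree's include dirs."""
    for _, target in HEADER_LINKS:
        os.makedirs(os.path.join(root, "usr", "src", kernel_tree, "include", target))


def link_headers(root, kernel_tree):
    """Point usr/src/linux at kernel_tree and usr/include/* through it."""
    src = os.path.join(root, "usr", "src")
    inc = os.path.join(root, "usr", "include")
    os.makedirs(inc)
    os.symlink(kernel_tree, os.path.join(src, "linux"))
    for name, target in HEADER_LINKS:
        os.symlink(os.path.join(src, "linux", "include", target),
                   os.path.join(inc, name))
    return inc


def repoint(root, kernel_tree):
    """Switch usr/src/linux to another tree; the include links follow."""
    link = os.path.join(root, "usr", "src", "linux")
    os.remove(link)
    os.symlink(kernel_tree, link)


if __name__ == "__main__":
    scratch = tempfile.mkdtemp()
    make_tree(scratch, "linux-2.4.19")
    make_tree(scratch, "linux-2.4.20")
    include_dir = link_headers(scratch, "linux-2.4.19")
    repoint(scratch, "linux-2.4.20")
    print(os.path.realpath(os.path.join(include_dir, "scsi")))
```

Only the one usr/src/linux link changes when switching trees; the three include links resolve through it.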

If anybody does have any suggestions, thanks in advance.  But I mainly 
wanted to just report this.  If need be, I can make a mount point for the 
system logs on one of the RAID1 pairs, so that I can capture more of the 
error messages and post them.

Thanks,
vinnie

* Re: 2.4.19 (and newer) - prob with the new adaptec aic7xxx driver and Promise UltraTrak100 TX2
  2003-07-24  3:33 2.4.19 (and newer) - prob with the new adaptec aic7xxx driver and Promise UltraTrak100 TX2 Vinnie
@ 2003-07-24  3:35 ` Vinnie
  2003-07-24  3:41 ` Philippe Troin
  1 sibling, 0 replies; 5+ messages in thread
From: Vinnie @ 2003-07-24  3:35 UTC (permalink / raw)
  To: linux-kernel

Vinnie wrote:
> Hello everyone,
> 

Oops geez the subject was supposed to be UltraTrak100 TX8, not TX2...

vinnie



* Re: 2.4.19 (and newer) - prob with the new adaptec aic7xxx driver and Promise UltraTrak100 TX2
  2003-07-24  3:33 2.4.19 (and newer) - prob with the new adaptec aic7xxx driver and Promise UltraTrak100 TX2 Vinnie
  2003-07-24  3:35 ` Vinnie
@ 2003-07-24  3:41 ` Philippe Troin
  2003-07-24  3:46   ` Vinnie
  2003-07-24  5:36   ` Vinnie
  1 sibling, 2 replies; 5+ messages in thread
From: Philippe Troin @ 2003-07-24  3:41 UTC (permalink / raw)
  To: Vinnie; +Cc: linux-kernel

Vinnie <listacct1@lvwnet.com> writes:

> Hello everyone,
> 
> Yep, I know... 2.4.19 is old news.  But I have tried this with newer
> official kernels also, same results.  Not expecting anybody to have a
> quick fix (although any suggestions would be really welcome!), but I
> do feel that this should be reported, since I have not seen many other
> posts indicating problems like this with "the new adaptec drivers".
> 
> Our primary file server is a dual 1.4GHz Tualatin 512K machine, with a
> Tyan S2688 Serverworks HE-SLt mainboard, 2GB of registered ECC SDRAM
> (4x512 modules), which also has an AIC7899 dual channel U160 host
> adapter onboard.
> 
> The only SCSI device currently attached is a Promise UltraTrak100 TX8
> - an 8-bay SCSI-to-ATA RAID subsystem, with eight 120GB Western
> Digital drives configured as a 7-drive RAID5 array and 1 non-assigned
> hot spare.  The unit's SCSI interface can run 80MB/sec U2W/LVD (and
> unit's SCSI ID is configured appropriately in the HA BIOS).  The
> internal ribbon cable from motherboard to external connector is a
> custom-made Granite Digital teflon cable, and I am also using a
> Granite Digital Active Terminator to terminate the bus (at the TX8).
> Using the external cable supplied by Promise with the unit.
> 
> Note: Problem is reproducible with an Adaptec AHA-2940U2W used as the
> host adapter instead.
> 
> In a nutshell, the problem goes like this:
> 
> If I compile the kernel to use the NEW aic7xxx adaptec driver, the
> SCSI bus hangs almost immediately upon commencement of a large write
> operation, such as attempting to copy a 500MB file from one of the
> internal client machines to a SMB shared directory on this server.
> The problem is reproducible on 2.4.19 and 2.4.20 kernels, if I use the
> "new" aic7xxx driver.

8< snip >8

Have you tried the updated aic7xxx driver at
http://people.freebsd.org/~gibbs/linux/SRC/ ?

AFAIK it fixes a lot of problems with aic7xxx and was not included in
2.4.21 for technicalities.

Phil.


* Re: 2.4.19 (and newer) - prob with the new adaptec aic7xxx driver and Promise UltraTrak100 TX2
  2003-07-24  3:41 ` Philippe Troin
@ 2003-07-24  3:46   ` Vinnie
  2003-07-24  5:36   ` Vinnie
  1 sibling, 0 replies; 5+ messages in thread
From: Vinnie @ 2003-07-24  3:46 UTC (permalink / raw)
  To: Philippe Troin; +Cc: linux-kernel

Philippe Troin wrote:

>>In a nutshell, the problem goes like this:
>>
>>If I compile the kernel to use the NEW aic7xxx adaptec driver, the
>>SCSI bus hangs almost immediately upon commencement of a large write
>>operation, such as attempting to copy a 500MB file from one of the
>>internal client machines to a SMB shared directory on this server.
>>The problem is reproducible on 2.4.19 and 2.4.20 kernels, if I use the
>>"new" aic7xxx driver.
> 
> 
> 8< snip >8
> 
> Have you tried the updated aic7xxx driver at
> http://people.freebsd.org/~gibbs/linux/SRC/ ?
> 
> AFAIK it fixes a lot of problems with aic7xxx and was not included in
> 2.4.21 for technicalities.

Hi Phil,

Thanks for that REALLY quick reply!  I will go check it out.

vinnie



* Re: 2.4.19 (and newer) - prob with the new adaptec aic7xxx driver and Promise UltraTrak100 TX2
  2003-07-24  3:41 ` Philippe Troin
  2003-07-24  3:46   ` Vinnie
@ 2003-07-24  5:36   ` Vinnie
  1 sibling, 0 replies; 5+ messages in thread
From: Vinnie @ 2003-07-24  5:36 UTC (permalink / raw)
  To: Philippe Troin; +Cc: linux-kernel

Philippe Troin wrote:
>>
>>If I compile the kernel to use the NEW aic7xxx adaptec driver, the
>>SCSI bus hangs almost immediately upon commencement of a large write
>>operation, such as attempting to copy a 500MB file from one of the
>>internal client machines to a SMB shared directory on this server.
>>The problem is reproducible on 2.4.19 and 2.4.20 kernels, if I use the
>>"new" aic7xxx driver.
> 
> 
> 8< snip >8
> 
> Have you tried the updated aic7xxx driver at
> http://people.freebsd.org/~gibbs/linux/SRC/ ?
> 
> AFAIK it fixes a lot of problems with aic7xxx and was not included in
> 2.4.21 for technicalities.

Hi Phil,

Thanks - the updated driver solved my problem.  I am now happily up and
running (and doing big writes without problems) on a freshly compiled 2.4.20
kernel, with the /drivers/scsi tree patched with the latest set of Justin
Gibbs' drivers (6.2.36).

Thanks to Justin also, and everybody else who has (no doubt) worked on the
new Adaptec drivers to improve them since the versions included with the
official kernel.org 2.4.19 and 2.4.20 kernel sources.

Vinnie

