* Are QLA2000's doomed under 2.4?
From: Nathan Hunsperger @ 2003-06-09  4:20 UTC (permalink / raw)
  To: linux-scsi

I've been trying to get a QLA2000 (ISP2100) up and running under 2.4.20
for a while now.  While I've been able to make it work, I can't make the
system stable.

When I use the drivers in the kernel, the system locks up within 1
minute of applying a heavy load.  When I use QLogic's driver, I get a
very nice "this should not happen" error message and a frozen system
within 15 minutes of heavy load.

With Feral's driver, the system never freezes, but I'm having problems
on the FC loop.  After about an hour of heavy load, anything using disks
on the FC loop freezes, and syslog gives the following (where ... is
many more instances of above message):

May 27 22:12:25 delta kernel: isp0: Interrupting Mailbox Command (0x15) Timeout
May 27 22:12:25 delta kernel: isp0: Mailbox Command 'ABORT' failed (TIMEOUT)
May 27 22:12:30 delta kernel: isp0: Interrupting Mailbox Command (0x15) Timeout
May 27 22:12:30 delta kernel: isp0: Mailbox Command 'ABORT' failed (TIMEOUT)
...
May 27 22:21:30 delta kernel: isp0: Interrupting Mailbox Command (0x17) Timeout
May 27 22:21:30 delta kernel: isp0: Mailbox Command 'ABORT TARGET' failed (TIMEOUT) 
May 27 22:21:35 delta kernel: isp0: Interrupting Mailbox Command (0x17) Timeout
May 27 22:21:35 delta kernel: isp0: Mailbox Command 'ABORT TARGET' failed (TIMEOUT)
...
May 27 22:22:40 delta kernel: isp0: Interrupting Mailbox Command (0x18) Timeout
May 27 22:22:40 delta kernel: isp0: Mailbox Command 'BUS RESET' failed (TIMEOUT)
May 27 22:22:50 delta kernel: isp0: Interrupting Mailbox Command (0x18) Timeout
May 27 22:22:50 delta kernel: isp0: Mailbox Command 'BUS RESET' failed (TIMEOUT)
...
May 27 22:40:46 delta kernel: isp0: Board Type 2100, Chip Revision 0x3, loaded F/W Revision 1.19.20
May 27 22:40:46 delta kernel: isp0: Loop ID 7, AL_PA 0xda, Port ID 0xda, Loop State 0x2, Topology 'Private Loop'
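To make the escalation in an excerpt like the one above easier to see, the failed mailbox commands can be grouped by command name with first and last timestamps. The following is only a sketch: the regular expression is an assumption inferred from the lines shown here, the year is not in the log and has to be supplied, `summarize` is a hypothetical helper name, and month parsing assumes an English locale.

```python
import re
from collections import OrderedDict
from datetime import datetime

# Assumed line shape, inferred from the excerpt:
#   "<Mon> <day> <hh:mm:ss> <host> kernel: ispN: Mailbox Command '<NAME>' failed (<REASON>)"
FAILED = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ kernel: isp\d+: "
    r"Mailbox Command '([^']+)' failed \((\w+)\)"
)

def summarize(lines, year=2003):
    """Return {command: [first_seen, last_seen, count]} in order of first appearance."""
    stages = OrderedDict()
    for line in lines:
        m = FAILED.match(line)
        if not m:
            continue  # skip "Interrupting Mailbox Command" and unrelated lines
        # syslog timestamps carry no year, so one is prepended before parsing
        stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        entry = stages.setdefault(m.group(2), [stamp, stamp, 0])
        entry[1] = stamp
        entry[2] += 1
    return stages

# A few lines copied from the excerpt above (not the full log).
SAMPLE = """\
May 27 22:12:25 delta kernel: isp0: Interrupting Mailbox Command (0x15) Timeout
May 27 22:12:25 delta kernel: isp0: Mailbox Command 'ABORT' failed (TIMEOUT)
May 27 22:21:30 delta kernel: isp0: Mailbox Command 'ABORT TARGET' failed (TIMEOUT)
May 27 22:22:40 delta kernel: isp0: Mailbox Command 'BUS RESET' failed (TIMEOUT)
May 27 22:22:50 delta kernel: isp0: Mailbox Command 'BUS RESET' failed (TIMEOUT)
""".splitlines()

if __name__ == "__main__":
    for cmd, (first, last, count) in summarize(SAMPLE).items():
        print(f"{cmd:13s} first={first:%H:%M:%S} last={last:%H:%M:%S} count={count}")
```

Run against the full syslog, the per-command counts and time spans show how long each recovery stage was retried before the next one started.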

Other times, I get similar results, but with COMMAND_ERROR messages
instead.  In every case, the card is eventually reset, and all processes
then continue normally.

I've been able to trace the final resets to the SCSI layer finally
calling the eh_host_reset_handler function (isplinux_hreset).
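For readers unfamiliar with the escalation: the 2.4 SCSI host template exposes eh_abort_handler, eh_device_reset_handler, eh_bus_reset_handler and eh_host_reset_handler, and the mid-layer tries them in roughly increasing order of severity. The toy model below is not kernel code; it omits retries and timeouts, and the `recover` function and the `wedged` adapter are hypothetical names used only to illustrate why a card that has stopped answering mailbox commands recovers only at the last step.

```python
# Toy model only: the real logic is C code in the kernel's SCSI error handler.
ESCALATION = [
    "eh_abort_handler",
    "eh_device_reset_handler",
    "eh_bus_reset_handler",
    "eh_host_reset_handler",
]

def recover(handlers):
    """Try each available handler in order.

    Returns (name of the first handler that succeeds or None, list of handlers attempted).
    """
    attempted = []
    for name in ESCALATION:
        handler = handlers.get(name)
        if handler is None:
            continue  # a driver may leave a hook unimplemented
        attempted.append(name)
        if handler():
            return name, attempted
    return None, attempted

# Hypothetical adapter whose firmware no longer answers mailbox commands:
# everything short of a full host reset fails.
wedged = {
    "eh_abort_handler": lambda: False,
    "eh_device_reset_handler": lambda: False,
    "eh_bus_reset_handler": lambda: False,
    "eh_host_reset_handler": lambda: True,
}

if __name__ == "__main__":
    winner, tried = recover(wedged)
    print("recovered by:", winner)
    print("attempted:", ", ".join(tried))
```

This matches the order seen in the log (ABORT, then ABORT TARGET, then BUS RESET, then a reinitialised board), though the mapping of each log message to a specific hook is an assumption.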

Originally, I was thinking that I had a hardware problem, but I have
swapped everything except the disk chassis and disks.  Also, during a 45
minute session of 'BUS RESET' failed, I power-cycled the disk chassis,
unplugged the FC cable, etc, all to no avail.  If something other than
the QLA2000 card itself was causing the reset to fail, I would have
expected one of those two actions to allow the reset to occur.

Does anybody have any ideas on what is going on?  Is the QLA2000 simply
destined to never work?  I've heard of many getting the QLA2100 (same
chipset) to work without a hitch.  So, in short, I'm a tad lost trying
to get this card up and working, and any suggestions would be very much
appreciated.

- Nathan

Some additional info:
I have 14 disks on the loop, all as part of a software raid 5 array,
with lvm on top of that, and an ext3 fs on that.  I am able to cause
these symptoms during parity resync, as well as when doing things like
10 parallel untars of 5GB each.  The system is SMP, though I have also
tried this with non-SMP kernels.  Under FreeBSD and Solaris, I have no
issues, although I get 1/2 the throughput.


* Re: Are QLA2000's doomed under 2.4?
From: Andrew Vasquez @ 2003-06-09 16:20 UTC (permalink / raw)
  To: linux-scsi

On Sun, 08 Jun 2003, Nathan Hunsperger wrote:

> I've been trying to get a QLA2000 (ISP2100) up and running under 2.4.20
> for a while now.  While I've been able to make it work, I can't make the
> system stable.
> 
> When I use the drivers in the kernel, the system locks up within 1
> minute of applying a heavy load.  When I use QLogic's driver, I get a
> very nice "this should not happen" error message and a frozen system
> within 15 minutes of heavy load.
> 

Which version of the QLogic driver are you trying to
use?  The latest beta 6.05.00b9 and the latest formal release
(6.04.00) contain no strings in the source which read as 'this should
not happen.'  The qlogicfc.c driver (in the kernel) contains the
following printk:

	printk("qlogicfc%d : no handle slots, this should not happen.\n",
		hostdata->host_id)

Is this the message you are referring to? 

IAC: Could you try the 6.05.00b9 which is available on the web:

	http://www.qlogic.com/support/os_detail.asp?productid=255&osid=26

QLogic doesn't formally support the EOLd ISP2100 chip, but the driver
has worked for a few others, and I'll try to help as best as I can.

Regards,
Andrew Vasquez


* Re: Are QLA2000's doomed under 2.4?
From: Nathan Hunsperger @ 2003-06-11 19:56 UTC (permalink / raw)
  To: Andrew Vasquez, linux-scsi

On Mon, Jun 09, 2003 at 09:20:47AM -0700, Andrew Vasquez wrote:
> Which version of the QLogic driver are you trying to
> use?  The latest beta 6.05.00b9 and the latest formal release
> (6.04.00) contain no strings in the source which read as 'this should
> not happen.'  The qlogicfc.c driver (in the kernel) contains the
> following printk:
> 
> 	printk("qlogicfc%d : no handle slots, this should not happen.\n",
> 		hostdata->host_id)
> 
> Is this the message you are referring to? 

Yeah, thanks for calling me on that.  I've been beating my head on the
wall over this for about 2 months, so I managed to confuse myself a bit.
I've just re-run some tests with the various drivers, and have attached
the kernel logs.

> IAC: Could you try the 6.05.00b9 which is available on the web:
> 
> 	http://www.qlogic.com/support/os_detail.asp?productid=255&osid=26

No luck.  This driver performs the same as 6.04.

> QLogic doesn't formally support the EOLd ISP2100 chip, but the driver
> has worked for a few others, and I'll try to help as best as I can.

Thanks.  Unfortunately the lack of QLogic support makes it that much
harder for me to debug.


qlogic v6.05.00b9 (f/w 1.19.24):

Jun 10 11:10:38 delta kernel: qla2x00(3): Performing ISP error recovery - ha= f7af807c.
Jun 10 11:10:38 delta kernel: scsi(3): LIP reset occurred.
Jun 10 11:10:38 delta kernel: scsi(3): Waiting for LIP to complete...
Jun 10 11:10:38 delta kernel: scsi(3): Waiting for LIP to complete...
Jun 10 11:10:39 delta kernel: scsi(3): LIP occurred.
Jun 10 11:10:39 delta kernel: scsi(3): LOOP UP detected.
Jun 10 11:10:39 delta kernel: scsi(3): Topology - (Loop), Host Loop address 0x7

or

Jun 10 11:25:32 delta kernel: qla2xxx_eh_abort Exiting: status=Failed
...
Jun 10 12:16:08 delta kernel: scsi(3:0:0:0): DEVICE RESET ISSUED.
...
Jun 10 12:16:08 delta kernel: scsi(3:0:0:0): LOOP RESET ISSUED.
Jun 10 12:16:08 delta kernel: qla2xxx_eh_bus_reset Exiting: Reset Failed

(the above 2 entries repeat 10 times for each of the 14 drives on the loop, at 5s intervals)

Jun 10 12:28:06 delta kernel: scsi(3:0:0:0): now issue ADAPTER RESET.
Jun 10 12:28:06 delta kernel: qla2x00(3): Performing ISP error recovery - ha= f7af807c.
Jun 10 12:28:07 delta kernel: scsi(3): LIP reset occurred.
Jun 10 12:28:07 delta kernel: scsi(3): Waiting for LIP to complete...
Jun 10 12:28:07 delta kernel: scsi(3): LIP occurred.
Jun 10 12:28:07 delta kernel: scsi(3): LOOP UP detected.
Jun 10 12:28:07 delta kernel: scsi(3): Topology - (Loop), Host Loop address 0x7
Jun 10 12:28:07 delta kernel: qla2xxx_eh_host_reset Exiting: status=SUCCESS
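As a rough consistency check on the note about 10 repetitions for each of 14 drives at a 5s interval, the expected duration of the reset phase can be compared with the timestamps shown. This is only a sketch: it assumes the 12:16:08 entry marks the start of the repeated DEVICE RESET / LOOP RESET phase (earlier entries are elided in the log) and that the repetitions run back to back.

```python
from datetime import datetime

DRIVES = 14            # disks on the loop, per the report
REPEATS_PER_DRIVE = 10
INTERVAL_S = 5         # seconds between repetitions

# Expected time spent cycling through device/loop resets.
expected_s = DRIVES * REPEATS_PER_DRIVE * INTERVAL_S

# Timestamps taken from the log excerpt; treating 12:16:08 as the start
# of the reset phase is an assumption.
FMT = "%H:%M:%S"
reset_phase_start = datetime.strptime("12:16:08", FMT)
adapter_reset = datetime.strptime("12:28:06", FMT)
observed_s = int((adapter_reset - reset_phase_start).total_seconds())

if __name__ == "__main__":
    print(f"expected ~{expected_s} s, observed {observed_s} s")
```

The two figures (700 s expected, 718 s observed) are close, which suggests the driver spent the whole interval retrying the lesser resets before it issued the ADAPTER RESET.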

qlogic v6.04.00 (f/w 1.19.24):

<same as v6.05.00b9>

qlogicfc:

qlogicfc0 : no handle slots, this should not happen.
hostdata->queued is 4e, in_ptr: xx
(the above 2 lines repeat several times, then the system locks up)

- Nathan
