* AIC7xxx errors in 2.2.19 but not in 2.2.18
From: Holger Kiehl @ 2001-09-14 9:34 UTC
To: linux-kernel
Hello
I am getting SCSI errors with an onboard Adaptec AIC-7890/1 Ultra2, but
only under very heavy disk load and only under kernel 2.2.19. These errors
do not appear under 2.2.18.
The system is a dual PIII-450 with 6 disks attached to the controller.
All disks are combined into a SW-RAID5 array, with one configured as a hot
spare.
The errors under 2.2.19 look as follows:
scsi : aborting command due to timeout : pid 52414, scsi0, channel 0, id 0, lun 0 Read (10) 00 00 ba f3 76 00 00 18 00
scsi : aborting command due to timeout : pid 52416, scsi0, channel 0, id 1, lun 0 Write (10) 00 00 ba f4 0e 00 00 80 00
(scsi0:0:1:0) SCSISIGI 0x4, SEQADDR 0x77, SSTAT0 0x0, SSTAT1 0x2
(scsi0:0:1:0) SG_CACHEPTR 0x8, SSTAT2 0x40, STCNT 0x5fc
scsi : aborting command due to timeout : pid 52417, scsi0, channel 0, id 1, lun 0 Write (10) 00 00 ba f4 8e 00 00 80 00
scsi : aborting command due to timeout : pid 52419, scsi0, channel 0, id 2, lun 0 Write (10) 00 00 ba f4 0e 00 00 80 00
scsi : aborting command due to timeout : pid 52420, scsi0, channel 0, id 2, lun 0 Write (10) 00 00 ba f4 8e 00 00 80 00
scsi : aborting command due to timeout : pid 52422, scsi0, channel 0, id 3, lun 0 Write (10) 00 00 ba f4 0e 00 00 80 00
scsi : aborting command due to timeout : pid 52423, scsi0, channel 0, id 3, lun 0 Write (10) 00 00 ba f4 8e 00 00 80 00
scsi : aborting command due to timeout : pid 52425, scsi0, channel 0, id 4, lun 0 Write (10) 00 00 1c ed 06 00 00 08 00
scsi : aborting command due to timeout : pid 52426, scsi0, channel 0, id 4, lun 0 Write (10) 00 00 ba f3 8e 00 00 80 00
scsi : aborting command due to timeout : pid 52427, scsi0, channel 0, id 0, lun 0 Write (10) 00 00 ba f3 8e 00 00 80 00
scsi : aborting command due to timeout : pid 52428, scsi0, channel 0, id 1, lun 0 Write (10) 00 00 ba f5 0e 00 00 80 00
scsi : aborting command due to timeout : pid 52429, scsi0, channel 0, id 3, lun 0 Write (10) 00 00 ba f5 0e 00 00 80 00
scsi : aborting command due to timeout : pid 52430, scsi0, channel 0, id 2, lun 0 Write (10) 00 00 ba f5 0e 00 00 80 00
scsi : aborting command due to timeout : pid 52431, scsi0, channel 0, id 4, lun 0 Write (10) 00 00 ba f4 0e 00 00 80 00
scsi : aborting command due to timeout : pid 52432, scsi0, channel 0, id 0, lun 0 Write (10) 00 00 ba f4 0e 00 00 80 00
SCSI host 0 abort (pid 52416) timed out - resetting
SCSI bus is being reset for host 0 channel 0.
wait_on_bh, CPU 1:
irq: 0 [0 0]
bh: 1 [1 0]
<[c010aead]> <[c0199ffc]> <[c019ab9b]> <[c01a3860]> <6>(scsi0:0:4:0) Synchronous at 80.0 Mbyte/sec, offset 31.
(scsi0:0:2:0) Synchronous at 80.0 Mbyte/sec, offset 63.
(scsi0:0:3:0) Synchronous at 80.0 Mbyte/sec, offset 31.
(scsi0:0:1:0) Synchronous at 80.0 Mbyte/sec, offset 31.
(scsi0:0:0:0) Synchronous at 80.0 Mbyte/sec, offset 31.
scsi : aborting command due to timeout : pid 53513, scsi0, channel 0, id 1, lun 0 Write (10) 00 00 bb 4c f6 00 00 80 00
(scsi0:0:1:0) SCSISIGI 0x4, SEQADDR 0x62, SSTAT0 0x0, SSTAT1 0x2
(scsi0:0:1:0) SG_CACHEPTR 0x3c, SSTAT2 0x40, STCNT 0x3fc
scsi : aborting command due to timeout : pid 53514, scsi0, channel 0, id 2, lun 0 Write (10) 00 00 bb 4a f6 00 00 80 00
scsi : aborting command due to timeout : pid 53515, scsi0, channel 0, id 3, lun 0 Write (10) 00 00 bb 4c f6 00 00 80 00
scsi : aborting command due to timeout : pid 53516, scsi0, channel 0, id 4, lun 0 Write (10) 00 00 bb 4a f6 00 00 80 00
scsi : aborting command due to timeout : pid 53517, scsi0, channel 0, id 0, lun 0 Write (10) 00 00 bb 4d 76 00 00 40 00
scsi : aborting command due to timeout : pid 53518, scsi0, channel 0, id 1, lun 0 Write (10) 00 00 bb 4d 76 00 00 40 00
scsi : aborting command due to timeout : pid 53519, scsi0, channel 0, id 3, lun 0 Write (10) 00 00 bb 4d 76 00 00 40 00
scsi : aborting command due to timeout : pid 53520, scsi0, channel 0, id 2, lun 0 Write (10) 00 00 ba fb 86 00 00 08 00
scsi : aborting command due to timeout : pid 53521, scsi0, channel 0, id 4, lun 0 Write (10) 00 00 ba fb 86 00 00 08 00
scsi : aborting command due to timeout : pid 53522, scsi0, channel 0, id 0, lun 0 Write (10) 00 00 bb 4d be 00 00 30 00
scsi : aborting command due to timeout : pid 53523, scsi0, channel 0, id 1, lun 0 Write (10) 00 00 bb 4d be 00 00 30 00
scsi : aborting command due to timeout : pid 53524, scsi0, channel 0, id 3, lun 0 Write (10) 00 00 bb 4d be 00 00 30 00
scsi : aborting command due to timeout : pid 53525, scsi0, channel 0, id 4, lun 0 Write (10) 00 00 bb 4b 76 00 00 80 00
scsi : aborting command due to timeout : pid 53526, scsi0, channel 0, id 2, lun 0 Write (10) 00 00 bb 4b 76 00 00 80 00
scsi : aborting command due to timeout : pid 53527, scsi0, channel 0, id 0, lun 0 Write (10) 00 00 bb 4b f6 00 00 80 00
SCSI host 0 abort (pid 53513) timed out - resetting
SCSI bus is being reset for host 0 channel 0.
(scsi0:0:4:0) Synchronous at 80.0 Mbyte/sec, offset 31.
(scsi0:0:3:0) Synchronous at 80.0 Mbyte/sec, offset 31.
(scsi0:0:2:0) Synchronous at 80.0 Mbyte/sec, offset 63.
(scsi0:0:1:0) Synchronous at 80.0 Mbyte/sec, offset 31.
(scsi0:0:0:0) Synchronous at 80.0 Mbyte/sec, offset 31.
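As an aid to reading the log above, here is a minimal Python sketch (not part of the original report) that pulls the target id, start block and length out of each timeout line. It assumes the nine hex bytes printed after "Read (10)" / "Write (10)" are bytes 1 to 9 of a standard 10-byte SCSI command, with the opcode shown as the name, so the logical block address is bytes 2-5 and the transfer length is bytes 7-8. The names TIMEOUT_RE, decode_timeout and timeouts_per_target are made up for illustration.

```python
import re
from collections import Counter

# Matches the "aborting command due to timeout" lines as they appear in the log.
TIMEOUT_RE = re.compile(
    r"aborting command due to timeout : pid\s+(?P<pid>\d+), "
    r"scsi(?P<host>\d+), channel (?P<channel>\d+), id (?P<id>\d+), "
    r"lun (?P<lun>\d+) (?P<op>Read|Write) \(10\) (?P<cdb>(?:[0-9a-f]{2} ?){9})"
)


def decode_timeout(line):
    """Return a dict describing one timeout line, or None if it does not match."""
    m = TIMEOUT_RE.search(line)
    if m is None:
        return None
    # Assumption: these are CDB bytes 1..9 (the opcode is printed as a name).
    cdb = bytes.fromhex(m.group("cdb").replace(" ", ""))
    return {
        "pid": int(m.group("pid")),
        "id": int(m.group("id")),
        "op": m.group("op"),
        "lba": int.from_bytes(cdb[1:5], "big"),     # CDB bytes 2-5
        "blocks": int.from_bytes(cdb[6:8], "big"),  # CDB bytes 7-8
    }


def timeouts_per_target(lines):
    """Count timed-out commands per SCSI target id."""
    decoded = (decode_timeout(line) for line in lines)
    return Counter(d["id"] for d in decoded if d is not None)
```

For the first line of the log this yields a Read of 0x18 (24) blocks at LBA 0x00baf376 on id 0. Tallying the ids is one quick way to see whether the timeouts cluster on a single drive or are spread across the whole bus.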
From Alan's changelog I see that there were changes in the AIC7xxx code.
Any idea what is wrong here?
Thanks,
Holger
* Re: AIC7xxx errors in 2.2.19 but not in 2.2.18
From: Frank Schneider @ 2001-09-14 13:11 UTC
To: Holger Kiehl; +Cc: linux-kernel
Holger Kiehl wrote:
>
> Hello
>
> I am getting SCSI errors with an onboard Adaptec AIC-7890/1 Ultra2, but
> only under very heavy disk load and only under kernel 2.2.19. These errors
> do not appear under 2.2.18.
>
> The system is a dual PIII-450 with 6 disks attached to the controller.
> All disks are combined into a SW-RAID5 array, with one configured as a hot
> spare.
>
(..log snipped..)
> From Alan's changelog I see that there were changes in the AIC7xxx code.
> Any idea what is wrong here?
Hello...
I (and someone else) also had mysterious problems with AIC7xxx and
RAID1/5, but we use kernel 2.4.x.
In kernel 2.4.x you can choose between two versions of the
aic7xxx driver, an "old" one (version 5.2.x) and a "new" one (version
6.x.x). Run "cat /proc/scsi/aic7xxx/0" to find your version.
We both found that our problems disappear when we use the "old"
driver. My tests are still in progress, though: my error (always the same
SCSI disk falling out of a RAID5 array with an "internal error", although
the disk seems to be good) only appeared randomly, about once a week, so
I still have to wait to see whether it is really gone.
So perhaps you can try the older driver, or determine the version
of your aic7xxx driver. Perhaps you can use the aic7xxx driver from
kernel 2.2.18 in kernel 2.2.19?
You should also boot your system with the parameter "aic7xxx=verbose",
which will provide more information in the syslog.
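The version check could also be scripted. The following is only a sketch: it assumes the driver report under /proc/scsi/aic7xxx/0 contains a line with the word "version", which I have not verified for every driver release, and the function names are invented for this example.

```python
from pathlib import Path


def find_version_lines(text):
    """Return the stripped lines of a report that mention a version."""
    return [line.strip() for line in text.splitlines()
            if "version" in line.lower()]


def driver_version_lines(proc_path="/proc/scsi/aic7xxx/0"):
    """Read the driver's /proc report; return [] if the file is absent."""
    path = Path(proc_path)
    if not path.is_file():
        return []
    return find_version_lines(path.read_text(errors="replace"))


if __name__ == "__main__":
    for line in driver_version_lines():
        print(line)
```

If nothing is printed, either the file is missing (the driver is not loaded, or the controller is not host 0) or the report uses different wording, in which case reading the file by hand is the safer route.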
Solong..
Frank.
--
Frank Schneider, <SPATZ1@T-ONLINE.DE>.
Microsoft isn't the answer.
Microsoft is the question, and the answer is NO.
... -.-
* Re: AIC7xxx errors in 2.2.19 but not in 2.2.18
From: Andreas Steinmetz @ 2001-09-14 13:37 UTC
To: Frank Schneider; +Cc: linux-kernel, Holger Kiehl
Hi,
2.2.19 only has the 'old' driver. The 'raid/scsi new' problem is a notifier
chain sequence problem that seems to have been taken care of now.
What I do see here may be a coincidence of kernel upgrade and a faulty drive.
Some snippets of 2.2.19 log messages of a faulty drive below.
May 2 03:33:07 pollux kernel: (scsi1:0:1:0) Parity error during Data-In phase.
May 2 03:33:37 pollux kernel: scsi : aborting command due to timeout : pid 1188263, scsi1, channel 0, id 1, lun 0 Read (10) 00 01 04 cd 97 00 00 80 00
May 2 03:33:38 pollux kernel: scsi : aborting command due to timeout : pid 1188268, scsi1, channel 0, id 0, lun 0 Read (10) 00 01 04 ce 2f 00 00 80 00
May 2 03:33:38 pollux kernel: scsi : aborting command due to timeout : pid 1188269, scsi1, channel 0, id 0, lun 0 Read (10) 00 01 04 ce af 00 00 28 00
May 2 03:33:38 pollux kernel: scsi : aborting command due to timeout : pid 1188270, scsi1, channel 0, id 1, lun 0 Read (10) 00 01 04 ce 17 00 00 80 00
May 2 03:33:38 pollux kernel: scsi : aborting command due to timeout : pid 1188271, scsi1, channel 0, id 1, lun 0 Read (10) 00 01 04 ce 97 00 00 40 00
May 2 03:33:38 pollux kernel: scsi : aborting command due to timeout : pid 1188272, scsi1, channel 0, id 2, lun 0 Read (10) 00 01 04 ce 17 00 00 80 00
May 2 03:33:38 pollux kernel: scsi : aborting command due to timeout : pid 1188273, scsi1, channel 0, id 2, lun 0 Read (10) 00 01 04 ce 97 00 00 40 00
May 2 03:33:38 pollux kernel: scsi : aborting command due to timeout : pid 1188274, scsi1, channel 0, id 3, lun 0 Read (10) 00 01 04 ce 17 00 00 80 00
May 2 03:33:38 pollux kernel: scsi : aborting command due to timeout : pid 1188275, scsi1, channel 0, id 3, lun 0 Read (10) 00 01 04 ce 97 00 00 08 00
May 2 03:33:39 pollux kernel: SCSI host 1 abort (pid 1188263) timed out - resetting
May 2 03:33:39 pollux kernel: SCSI bus is being reset for host 1 channel 0
May 2 03:33:41 pollux kernel: SCSI host 1 reset (pid 1188263) timed out again -
May 2 03:33:41 pollux kernel: probably an unrecoverable SCSI bus or device hang.
Andreas Steinmetz
D.O.M. Datenverarbeitung GmbH
* Re: AIC7xxx errors in 2.2.19 but not in 2.2.18
From: Holger Kiehl @ 2001-09-14 13:46 UTC
To: Frank Schneider; +Cc: linux-kernel
On Fri, 14 Sep 2001, Frank Schneider wrote:
> Holger Kiehl wrote:
> >
> > Hello
> >
> > I am getting SCSI errors with an onboard Adaptec AIC-7890/1 Ultra2, but
> > only under very heavy disk load and only under kernel 2.2.19. These errors
> > do not appear under 2.2.18.
> >
> > The system is a dual PIII-450 with 6 disks attached to the controller.
> > All disks are combined into a SW-RAID5 array, with one configured as a hot
> > spare.
> >
>
> (..log snipped..)
>
> > From Alan's changelog I see that there were changes in the AIC7xxx code.
> > Any idea what is wrong here?
>
> Hello...
>
> I (and someone else) also had mysterious problems with AIC7xxx and
> RAID1/5, but we use kernel 2.4.x.
>
Just today Neil Brown sent a patch in which he mentioned something
about the AIC7xxx driver, but I don't know whether it has anything
to do with this problem.
> In kernel 2.4.x you can choose between two versions of the
> aic7xxx driver, an "old" one (version 5.2.x) and a "new" one (version
> 6.x.x). Run "cat /proc/scsi/aic7xxx/0" to find your version.
>
I have played with 2.4.5 and the new aic7xxx driver and did not see
these problems there. I have not tried the old one under 2.4.5. Unfortunately
I cannot use 2.4.x because of its larger swap demand.
> We both found that our problems disappear when we use the "old"
> driver. My tests are still in progress, though: my error (always the same
> SCSI disk falling out of a RAID5 array with an "internal error", although
> the disk seems to be good) only appeared randomly, about once a week, so
> I still have to wait to see whether it is really gone.
>
> So perhaps you can try to use the older driver or determine the version
> of your aic7xxx-driver. Perhaps you can use the aic7xxx-driver from
> kernel 2.2.18 in Kernel 2.2.19 ?
>
> You should also boot your system with the parameter "aic7xxx=verbose",
> which will provide more information in the syslog.
>
I will add this option the next time I boot.
Thanks,
Holger
* Re: AIC7xxx errors in 2.2.19 but not in 2.2.18
From: Mike Fedyk @ 2001-09-14 17:12 UTC
To: linux-kernel
On Fri, Sep 14, 2001 at 03:46:10PM +0200, Holger Kiehl wrote:
> I have played with 2.4.5 and the new aic7xxx driver and did not see
> these problems there. I have not tried the old one under 2.4.5. Unfortunately
> I cannot use 2.4.x because of its larger swap demand.
>
2.4.x-ac doesn't have the high swap requirement. With 2.4.{8,9}-ac, swap
demands look similar to those of 2.2.xx kernels.
* Re: AIC7xxx errors in 2.2.19 but not in 2.2.18
From: Doug Ledford @ 2001-09-15 15:37 UTC
To: Andreas Steinmetz; +Cc: Frank Schneider
Andreas Steinmetz wrote:
> Hi,
> 2.2.19 only has the 'old' driver. The 'raid/scsi new' problem is a notifier
> chain sequence problem that seems to have been taken care of now.
> What I do see here may be a coincidence of kernel upgrade and a faulty drive.
> Some snippets of 2.2.19 log messages of a faulty drive below.
>
> May 2 03:33:07 pollux kernel: (scsi1:0:1:0) Parity error during Data-In phase.
> May 2 03:33:37 pollux kernel: scsi : aborting command due to timeout : pid 1188263, scsi1, channel 0, id 1, lun 0 Read (10) 00 01 04 cd 97 00 00 80 00
I've seen that error a few times now with the new code in 2.2.19. I
don't have a fix for it at this time (and I probably won't since
development on that driver isn't a 'regular' thing at this point). If
the old driver in 2.2.18 worked for you, then I would copy the aic7xxx*
files from 2.2.18 into 2.2.19 and rebuild your kernel.
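For anyone who wants to script that copy, a rough Python sketch follows. It is an illustration only: the drivers/scsi location of the aic7xxx* files, the tree paths in the usage note and the function name copy_aic7xxx are assumptions, and shutil.copytree with dirs_exist_ok needs Python 3.8 or newer.

```python
import shutil
from pathlib import Path


def copy_aic7xxx(old_tree, new_tree, subdir="drivers/scsi"):
    """Copy every aic7xxx* file or directory from old_tree into new_tree.

    Returns the names that were copied, in sorted order.
    """
    src = Path(old_tree) / subdir
    dst = Path(new_tree) / subdir
    copied = []
    for entry in sorted(src.glob("aic7xxx*")):
        target = dst / entry.name
        if entry.is_dir():
            # Merges into an existing directory; files that exist only in
            # the newer tree are left in place and may need manual removal.
            shutil.copytree(entry, target, dirs_exist_ok=True)
        else:
            shutil.copy2(entry, target)
        copied.append(entry.name)
    return copied
```

A call such as copy_aic7xxx("/usr/src/linux-2.2.18", "/usr/src/linux-2.2.19") (hypothetical paths) would then be followed by the usual kernel rebuild. Keep a backup of the 2.2.19 files so the change is easy to revert.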
--
Doug Ledford <dledford@redhat.com> http://people.redhat.com/dledford
Please check my web site for aic7xxx updates/answers before
e-mailing me about problems
* Re: AIC7xxx errors in 2.2.19 but not in 2.2.18
From: Andreas Steinmetz @ 2001-09-15 15:42 UTC
To: Doug Ledford; +Cc: Holger Kiehl, linux-kernel, Frank Schneider
On 15-Sep-2001 Doug Ledford wrote:
> Andreas Steinmetz wrote:
>
>> Hi,
>> 2.2.19 only has the 'old' driver. The 'raid/scsi new' problem is a notifier
>> chain sequence problem that seems to have been taken care of now.
>> What I do see here may be a coincidence of kernel upgrade and a faulty
>> drive.
>> Some snippets of 2.2.19 log messages of a faulty drive below.
>>
>> May 2 03:33:07 pollux kernel: (scsi1:0:1:0) Parity error during Data-In phase.
>> May 2 03:33:37 pollux kernel: scsi : aborting command due to timeout : pid 1188263, scsi1, channel 0, id 1, lun 0 Read (10) 00 01 04 cd 97 00 00 80 00
>
>
>
> I've seen that error a few times now with the new code in 2.2.19. I
> don't have a fix for it at this time (and I probably won't since
> development on that driver isn't a 'regular' thing at this point). If
> the old driver in 2.2.18 worked for you, then I would copy the aic7xxx*
> files from 2.2.18 into 2.2.19 and rebuild your kernel.
>
>
Please note that the disk was proven faulty: it kept failing with another
kernel and with another OS on other hardware. It has since been replaced,
and there have been no more problems.
Andreas Steinmetz
D.O.M. Datenverarbeitung GmbH