* repeatable IDE errors when using SMART
@ 2002-11-13 2:19 dean gaudet
2002-11-13 9:38 ` Xavier Bestel
2002-11-13 17:26 ` Ross Vandegrift
0 siblings, 2 replies; 6+ messages in thread
From: dean gaudet @ 2002-11-13 2:19 UTC
To: linux-kernel
i'm 99.99% certain that the use of smartctl and/or hddtemp is causing my
system to lose contact with the drives. there have just been far too many
coincidental errors of this sort:
hdi: status timeout: status=0xd0 { Busy }
hdi: drive not ready for command
hdi: status error: status=0x58 { DriveReady SeekComplete DataRequest }
hdi: drive not ready for command
at the exact time i've got cron jobs running to do either "smartctl -a" or
"hddtemp", or at a time when i run the command by hand.
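For anyone reading the status values in the log above, here is a minimal sketch of how the status byte maps to the flag names printed in braces. It assumes the standard ATA status register bit layout; the function and table names are hypothetical, and the labels for bits that do not appear in the log (DeviceFault, CorrectedError, Index, Error) are assumptions. Reporting only "Busy" when the busy bit is set matches the log, where 0xd0 also has bits 0x40 and 0x10 set but only "Busy" is printed.

```python
# Sketch only: decode an ATA status byte into flag names, assuming the
# standard ATA status register bit layout. Names are illustrative.
BUSY = 0x80

ATA_STATUS_BITS = [
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),      # assumed label, not seen in the log
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),   # assumed label, not seen in the log
    (0x02, "Index"),            # assumed label, not seen in the log
    (0x01, "Error"),            # assumed label, not seen in the log
]


def decode_ide_status(status: int) -> list[str]:
    """Return the flag names set in an 8-bit ATA status value."""
    # While the drive is busy the remaining bits are not meaningful,
    # so only "Busy" is reported, as in the log line for 0xd0.
    if status & BUSY:
        return ["Busy"]
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


if __name__ == "__main__":
    for value in (0xD0, 0x58):
        flags = " ".join(decode_ide_status(value))
        print(f"status=0x{value:02x} {{ {flags} }}")
```

Read this way, the 0x58 line says the drive is ready and still has data to transfer (DataRequest) at the moment the kernel wanted to issue a new command.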
at some times the error state is bad enough to cause md to mark the disk
as bad.
at one point i started running hddtemp every 5 minutes for logging
purposes and it took less than 2 days for the system to lose contact with
one of the disks... and this repeated after i rebooted.
system is:
- linux 2.4.19-pre7-ac4
- tyan 2462, dual athlon 1.4GHz (onboard IDE is unused)
- promise ultra 100TX2
- promise ultra 133TX2
- 4x maxtor D740X 80GB (each master on one of the promise channels)
(3x 6L080J4, and 1x 6L080L4)
i've replaced drives (always D740X though), cables, and controllers
(swapping a 100TX2 for the 133TX2 which is in there now).
the problem has appeared on all the IDE ports, so it's not restricted to
any one port/drive. however hdi seems to be particularly sensitive --
even after several controller, disk, and cable combinations. the 4 disks
are in a sw raid5 which is well balanced according to iostat -- except
that hdi is the hottest disk in the box (41C vs 35C 28C 32C).
any suggestions?
obviously i could move to a more recent kernel ... but i stayed back on
pre7-ac4 because it seemed later kernels messed up the promise driver in
some way and i never quite paid attention enough to know what a good
stable 2.4.x ac kernel was post pre7-ac4. suggestions welcome.
any known errata regarding SMART accesses interfering with other
operations?
-dean
* Re: repeatable IDE errors when using SMART
2002-11-13 2:19 repeatable IDE errors when using SMART dean gaudet
@ 2002-11-13 9:38 ` Xavier Bestel
2002-11-13 14:23 ` Alan Cox
2002-11-13 17:26 ` Ross Vandegrift
1 sibling, 1 reply; 6+ messages in thread
From: Xavier Bestel @ 2002-11-13 9:38 UTC
To: dean gaudet; +Cc: Linux Kernel Mailing List
On Wed 13/11/2002 at 03:19, dean gaudet wrote:
> i'm 99.99% certain that the use of smartctl and/or hddtemp is causing my
> system to lose contact with the drives. there have just been far too many
> coincidental errors of this sort:
Maybe I've seen something like this too, but I'm not sure.
* Re: repeatable IDE errors when using SMART
2002-11-13 9:38 ` Xavier Bestel
@ 2002-11-13 14:23 ` Alan Cox
0 siblings, 0 replies; 6+ messages in thread
From: Alan Cox @ 2002-11-13 14:23 UTC
To: Xavier Bestel; +Cc: dean gaudet, Linux Kernel Mailing List
On Wed, 2002-11-13 at 09:38, Xavier Bestel wrote:
> On Wed 13/11/2002 at 03:19, dean gaudet wrote:
> > i'm 99.99% certain that the use of smartctl and/or hddtemp is causing my
> > system to lose contact with the drives. there have just been far too many
> > coincidental errors of this sort:
>
> Maybe I've seen something like this too, but I'm not sure.
It's certainly part of the code that's quite convoluted and would be the
ideal spot for a race. Really, for the new IDE it wants testing versus
2.5.47-ac, but getting 2.5.47 (ac or otherwise) to stay up for 8 hours is
challenging.
* Re: repeatable IDE errors when using SMART
2002-11-13 2:19 repeatable IDE errors when using SMART dean gaudet
2002-11-13 9:38 ` Xavier Bestel
@ 2002-11-13 17:26 ` Ross Vandegrift
2002-11-13 19:39 ` Jakob Oestergaard
1 sibling, 1 reply; 6+ messages in thread
From: Ross Vandegrift @ 2002-11-13 17:26 UTC
To: dean gaudet; +Cc: linux-kernel
On Tue, Nov 12, 2002 at 06:19:37PM -0800, dean gaudet wrote:
> any suggestions?
I've noticed that using smartctl on some of my drives kills them too. I
didn't bother investigating much - too scared of losing data. I kind of
assumed it was a problem with the drive not the code though, since it
worked fine on one drive but not the other.
I have a Maxtor and an IBM; unfortunately I don't recall which one was
bonking out on smartctl...
If it's interesting (and reasonably safe) I could do some testing on my
system.
--
Ross Vandegrift
ross@willow.seitz.com
A Pope has a Water Cannon. It is a Water Cannon.
He fires Holy-Water from it. It is a Holy-Water Cannon.
He Blesses it. It is a Holy Holy-Water Cannon.
He Blesses the Hell out of it. It is a Wholly Holy Holy-Water Cannon.
He has it pierced. It is a Holey Wholly Holy Holy-Water Cannon.
Batman and Robin arrive. He shoots them.
* Re: repeatable IDE errors when using SMART
2002-11-13 17:26 ` Ross Vandegrift
@ 2002-11-13 19:39 ` Jakob Oestergaard
2002-11-13 21:29 ` Ross Vandegrift
0 siblings, 1 reply; 6+ messages in thread
From: Jakob Oestergaard @ 2002-11-13 19:39 UTC
To: Ross Vandegrift; +Cc: dean gaudet, linux-kernel
On Wed, Nov 13, 2002 at 12:26:10PM -0500, Ross Vandegrift wrote:
> On Tue, Nov 12, 2002 at 06:19:37PM -0800, dean gaudet wrote:
> > any suggestions?
>
> I've noticed that using smartctl on some of my drives kills them too. I
> didn't bother investigating much - too scared of losing data. I kind of
> assumed it was a problem with the drive not the code though, since it
> worked fine on one drive but not the other.
>
> I have a Maxtor and an IBM; unfortunately I don't recall which one was
> bonking out on smartctl...
>
> If it's interesting (and reasonably safe) I could do some testing on my
> system.
Do you use Promise controllers?
If so, do you use anything newer than the Ultra33 or Ultra66
controllers?
--
................................................................
: jakob@unthought.net : And I see the elder races, :
:.........................: putrid forms of man :
: Jakob Østergaard : See him rise and claim the earth, :
: OZ9ABN : his downfall is at hand. :
:.........................:............{Konkhra}...............:
* Re: repeatable IDE errors when using SMART
2002-11-13 19:39 ` Jakob Oestergaard
@ 2002-11-13 21:29 ` Ross Vandegrift
0 siblings, 0 replies; 6+ messages in thread
From: Ross Vandegrift @ 2002-11-13 21:29 UTC
To: Jakob Oestergaard, dean gaudet, linux-kernel
On Wed, Nov 13, 2002 at 08:39:30PM +0100, Jakob Oestergaard wrote:
> > I have a Maxtor and an IBM; unfortunately I don't recall which one was
> > bonking out on smartctl...
> >
> > If it's interesting (and reasonably safe) I could do some testing on my
> > system.
>
> Do you use Promise controllers?
>
> If so, do you use anything newer than the Ultra33 or Ultra66
> controllers?
Sure do - one controller is the integrated VIA KT133, the second is an
on-board Promise PDC20265. The Maxtor drive is on the VIA, IBM on
Promise.
It must've been running smartctl on the IBM drive that would bonk my
system - I just tried 'cat /proc/ide/hde/identity' and my machine froze
solid - no numlock, no sysrq, no ping. The disk access light was on.
I hadn't even considered the different controllers.
--
Ross Vandegrift
ross@willow.seitz.com
A Pope has a Water Cannon. It is a Water Cannon.
He fires Holy-Water from it. It is a Holy-Water Cannon.
He Blesses it. It is a Holy Holy-Water Cannon.
He Blesses the Hell out of it. It is a Wholly Holy Holy-Water Cannon.
He has it pierced. It is a Holey Wholly Holy Holy-Water Cannon.
Batman and Robin arrive. He shoots them.