linux-kernel.vger.kernel.org archive mirror
* Re: Blockbusting news, results get worse
@ 2003-10-26  7:37 Norman Diamond
  2003-10-26 10:39 ` John Bradford
  0 siblings, 1 reply; 26+ messages in thread
From: Norman Diamond @ 2003-10-26  7:37 UTC (permalink / raw)
  To: Mudama, Eric, 'Hans Reiser ', 'Wes Janzen ',
	'Rogier Wolff ', 'John Bradford ',
	linux-kernel, nikita, 'Pavel Machek ',
	'Justin Cormack ', 'Russell King ',
	'Vitaly Fertman ', 'Krzysztof Halasa '

It gets worse.  First, to recap previous results:

1.  The drive reported a permanent error on read, refused to reallocate the
bad sector, and Linux logged the error but refused to remove the block from
the Reiser file system.  (Different people have different opinions about
whether various parts of this behavior are acceptable, but anyway this was
one of the observed results.)

2.  The drive reported a permanent error on write, refused to reallocate the
bad sector, and Linux logged the error but refused to remove the block from
the Reiser file system.  (I'm not sure if different people have different
opinions about whether various parts of this behavior are acceptable.  This
was a write, good data were known at the time, but subsequently good data
would never be retrievable from the file.)

3.  The drive reported a permanent read error during a S.M.A.R.T. long
self-test and refused to reallocate the bad sector.  (I think different
people have different opinions about the acceptability of this too.)

Well, here's news.

4.  When writing ZEROES to the bad sector, the drive reports SUCCESS.
But it lies.  Subsequent attempts to read still fail.  Subsequent writing of
zeroes appears to succeed again.  Subsequent attempts to read still fail.
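
A minimal sketch of the check described in point 4: write one sector of
zeroes, then read it back and compare.  The function name, the 512-byte
sector size, and the result strings are assumptions made for illustration;
they do not come from the test that was actually run.

```python
import os

SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def write_zeroes_and_verify(path, sector, sector_size=SECTOR_SIZE):
    """Write one sector of zeroes, then read it back and compare.

    Returns "ok", "write-error", "read-error" or "miscompare".
    """
    zeroes = bytes(sector_size)
    offset = sector * sector_size
    fd = os.open(path, os.O_RDWR)
    try:
        try:
            os.pwrite(fd, zeroes, offset)
            os.fsync(fd)  # ask the kernel to push the write toward the device
        except OSError:
            return "write-error"
        try:
            data = os.pread(fd, sector_size, offset)
        except OSError:
            return "read-error"  # the behaviour reported in point 4
        return "ok" if data == zeroes else "miscompare"
    finally:
        os.close(fd)
```

On a real block device the read-back would also have to bypass the page
cache (for example by opening with O_DIRECT), otherwise the kernel may
return the cached zeroes without touching the platter at all.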

I swear, I want that block out of the file system.  Even if the writing of
zeroes really succeeded, I would not be satisfied with the continued use of
that block.  I really want the drive to reallocate it, but Toshiba's
firmware is unsafe to drive at any speed.  So I really want the file system
to exclude that block.

Some participants in this discussion have said that ext2fs can exclude bad
blocks in a way that ReiserFS doesn't, though ReiserFS probably will in the
future.  But to the best of my understanding, ext2fs can detect and exclude
bad blocks at the time of formatting and at the time of a destructive
read-write test.  I have not seen news from anyone about whether ext2fs will
remove a detected permanent bad block from an existing mounted filesystem at
the time that the error is detected during normal operations.  It is 99%
necessary to do so (leaving 1% for audio visual applications where it's more
important to play a movie erroneously at proper speed than to attempt
recovery).
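
To make the requested behaviour concrete, here is a toy sketch of an
allocator that retires a block at the moment an I/O error is reported and
hands the owning file a replacement.  ToyAllocator and its methods are
hypothetical names; this is not how ext2fs or ReiserFS is implemented.

```python
class ToyAllocator:
    """Toy block allocator that retires blocks reported bad at run time."""

    def __init__(self, n_blocks):
        self.free = set(range(n_blocks))
        self.bad = set()    # retired blocks, never handed out again
        self.owner = {}     # block number -> file name

    def allocate(self, name):
        block = min(self.free)  # raises ValueError when nothing is free
        self.free.remove(block)
        self.owner[block] = name
        return block

    def report_io_error(self, block):
        """Retire `block`; return a replacement for its owner, if it had one."""
        self.bad.add(block)
        self.free.discard(block)
        name = self.owner.pop(block, None)
        if name is None:
            return None
        return self.allocate(name)
```

The sketch covers only the bookkeeping.  After a failed read the old
contents are gone regardless, so a real filesystem would still have to
report the loss to the application.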

By the way, one participant in this thread recommended not buying disk
drives from bargain basement outlets.  OK, yesterday I inquired at Bic
Camera, which might be one of the biggest two retailers of computers and
parts nationwide, but might not be because they don't have many stores
outside of the Tokyo area.  At least they're surely one of the two biggest
in Tokyo.  They said that they warranty Toshiba disk drives for 1 year.  So
if a customer buys a Toshiba disk drive with firmware that was defective on
the day of purchase and defective on the dates of design and manufacture,
but if the customer doesn't detect the defective firmware until 366 days
later, the customer still gets shafted.

I still have to say, we can't fix Toshiba, and we can avoid Toshiba, but
meanwhile we can fix Linux.  Among other manufacturers, only Maxtor has said
that their firmware isn't broken in this way, but Maxtor doesn't make drives
for notebooks.  Just how many manufacturers of disk drives are we going to
avoid, or can we hope that Linux will be made to compensate for their
defects?

Well, in a future weekend, I will try to see if ext2fs really takes action
on permanently bad blocks that are detected during normal operations on a
mounted partition.


* RE: Blockbusting news, results get worse
@ 2003-10-26 18:33 Mudama, Eric
  2003-10-26 22:03 ` Andre Hedrick
  2003-10-27  9:34 ` Norman Diamond
  0 siblings, 2 replies; 26+ messages in thread
From: Mudama, Eric @ 2003-10-26 18:33 UTC (permalink / raw)
  To: 'Norman Diamond', 'Hans Reiser ',
	'Wes Janzen ', 'Rogier Wolff ',
	'John Bradford ',
	linux-kernel, nikita, 'Pavel Machek ',
	'Justin Cormack ', 'Russell King ',
	'Vitaly Fertman ', 'Krzysztof Halasa '



> -----Original Message-----
> From: Norman Diamond [mailto:ndiamond@wta.att.ne.jp]
>
> 
> 4.  When writing ZEROES to the bad sector, the drive reports SUCCESS.
> But it lies.  Subsequent attempts to read still fail.  Subsequent writing of
> zeroes appears to succeed again.  Subsequent attempts to read still fail.

*That* is the fundamental problem with the drive.  If it knows it has had
trouble with that block in the past, and it gets a new write, it should know
that is a troublesome area and verify that it was able to put the new block
in the old location.

If it can verify that, then there's no need to reallocate it at all, since
the write most likely cured whatever was wrong.

If it can't verify it, then it should need to reallocate and verify at the
new location.
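
The policy in the three paragraphs above can be sketched as a toy model.
ToyDrive, its fields, and the spare-pool size are illustrative assumptions,
not a description of any vendor's firmware.

```python
class ToyDrive:
    """Toy model of the verify-then-reallocate write policy; not real firmware."""

    def __init__(self, defective=(), spares=4):
        self.media = {}                  # physical location -> data
        self.defective = set(defective)  # locations that never hold data
        self.suspect = set()             # logical blocks with a past error
        self.remap = {}                  # logical block -> spare location
        self.free_spares = [("spare", i) for i in range(spares)]

    def _store(self, location, data):
        # A defective location silently drops the write.
        if location not in self.defective:
            self.media[location] = data

    def _verified_store(self, location, data):
        # Write, then read back and compare.
        self._store(location, data)
        return location not in self.defective and self.media.get(location) == data

    def read(self, lba):
        location = self.remap.get(lba, lba)
        if location in self.defective:
            self.suspect.add(lba)        # remember the trouble spot
            raise OSError(f"unrecoverable read error at LBA {lba}")
        return self.media.get(location, b"")

    def write(self, lba, data):
        location = self.remap.get(lba, lba)
        if lba not in self.suspect:
            self._store(location, data)  # normal path: no read-back
            return
        if self._verified_store(location, data):
            self.suspect.discard(lba)    # the rewrite cured the block
            return
        while self.free_spares:          # verify failed: reallocate
            spare = self.free_spares.pop(0)
            if self._verified_store(spare, data):
                self.remap[lba] = spare
                self.suspect.discard(lba)
                return
        raise OSError(f"no usable spare for LBA {lba}")
```

In this model the first write to a defective block is lost silently, the
following read fails and marks the block suspect, and the next write is
verified, fails verification, and lands in a spare location.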

> They said that they warranty Toshiba disk drives for 1 year.  So
> if a customer buys a Toshiba disk drive with firmware that 
> was defective on the day of purchase and defective on the dates
> of design and manufacture, but if the customer doesn't detect
> the defective firmware until 366 days later, the customer still
> gets shafted.

In theory, I don't see the problem with this.

It isn't realistic for a vendor to warranty a product forever, and this is
why OEMs do large qualifications on drives themselves before they purchase a
single unit, since they know they'll bear the brunt of the support headache
if the product fails.

That being said, there are three options:

1. Pay a premium for longer warranty.  I know this is available in both IDE
and SCSI, not sure if it is available in notebook drives.

2. Do qualification tests yourself during the first year of operation.
Hi/low temperature/humidity/air pressure, random command generator, and make
sure the drive never miscompares or has a hard error it can't "fix".
(Writing a zero and reading non-zero is a miscompare)

3. Look at what products are being shipped in large volume from OEMs, and
buy the same product yourself.  Dell or HP or IBM can't afford to ship
products that don't have the lowest in-the-field failure rates, so buying
what they buy would make sense since they'll run their own tests like #2.
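
A minimal sketch of the random command generator from option 2, with the
miscompare check built in.  The function name, block size, and command
count are assumptions; a real qualification run would also vary the
environmental conditions, which software cannot do.

```python
import random


def qualify(dev, n_blocks, block_size=512, n_commands=1000, seed=0):
    """Issue random writes and reads; return the blocks that miscompared.

    `dev` is any seekable binary file object opened for reading and writing.
    """
    rng = random.Random(seed)  # fixed seed makes a failing run repeatable
    expected = {}              # block number -> last data written
    miscompares = []
    for _ in range(n_commands):
        block = rng.randrange(n_blocks)
        dev.seek(block * block_size)
        if block in expected and rng.random() < 0.5:
            # Read command: compare against what was last written.
            if dev.read(block_size) != expected[block]:
                miscompares.append(block)
        else:
            # Write command: remember the data for later comparison.
            data = bytes(rng.getrandbits(8) for _ in range(block_size))
            dev.write(data)
            expected[block] = data
    return miscompares
```

Run against a scratch file or an in-memory buffer, an empty result means no
miscompare was seen.  Against a real drive the reads would again need to
bypass the operating system's cache to mean anything.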


--eric

* RE: Blockbusting news, results get worse
@ 2003-10-26 22:12 Mudama, Eric
  0 siblings, 0 replies; 26+ messages in thread
From: Mudama, Eric @ 2003-10-26 22:12 UTC (permalink / raw)
  To: 'Andre Hedrick'; +Cc: linux-kernel



Andre Hedrick wrote:
> Eric,
> 
> Item "3" in your list is not practical, because no drive
> maker allows the same drives that large oem's purchase to be placed in
> retail.
> There are obvious reasons, but your position stated for the average joe
> consumer is flawed.

I don't believe your statement is correct that OEM drives and retail drives
always differ.  They may have slight configuration differences, but
fundamentally I think they're the same drive with identical or
near-identical firmware.

> Why don't you guys offer extended warranty purchase service contracts?

As an optional feature on any drive?  Not sure, it would be nice.  However,
maintaining it specifically for individual drives in a product line might be
more work than someone high up feels is worth it.  Maybe there's a market
for buying a $30 warranty add-on from Maxtor that buys you an extra year or
whatever, however, I think you can get the same thing from CompUSA or other
companies now if you want it.  For them it is profitable, and I don't think
we'd want to compete with our virtual sales force. (The retail shops)

I do know you get better warranties on the more expensive models though, but
obviously that doesn't help after-the-fact.

--eric

* RE: Blockbusting news, results get worse
@ 2003-10-27 13:07 Samium Gromoff
  0 siblings, 0 replies; 26+ messages in thread
From: Samium Gromoff @ 2003-10-27 13:07 UTC (permalink / raw)
  To: eric_mudama; +Cc: linux-kernel

Eric Mudama wrote:
> Andre Hedrick wrote:
> > Eric,
> >
> > Item "3" in your list is not practical, because no drive
> > maker allows the same drives that large oem's purchase to be placed in retail.
> > There are obvious reasons, but your position stated for the average joe
> > consumer is flawed.
> 
> I don't believe your statement is correct that OEM drives and retail drives
> always differ.  They may have slight configuration differences, but
> fundamentally I think they're the same drive with identical or
> near-identical firmware.

If there is somebody you should believe about such stuff, that would be Andre.
(by the way he was a T13 committee member not so long ago)

And, hey, I would have been rather surprised if you had answered otherwise,
given your email address...


cheers, Samium Gromoff

* RE: Blockbusting news, results get worse
@ 2003-10-27 17:43 Mudama, Eric
  2003-10-27 18:48 ` Hans Reiser
  0 siblings, 1 reply; 26+ messages in thread
From: Mudama, Eric @ 2003-10-27 17:43 UTC (permalink / raw)
  To: 'Norman Diamond', 'Hans Reiser ',
	'Wes Janzen ', 'Rogier Wolff ',
	'John Bradford ',
	linux-kernel, nikita, 'Pavel Machek ',
	'Justin Cormack ', 'Vitaly Fertman ',
	'Krzysztof Halasa '



> -----Original Message-----
> Yeah, I need to deliberately damage one block in order to 
> test the firmware, but I don't want to damage multiple
> blocks and use up the reallocation space.  I am a home
> user, even if I also do programming at work, even if I
> also volunteer one day each weekend to test Linux.  How can I 
> arrange to damage one block on a disk?

Um... you can do that by shorting various pins on the PCBA if you have
access to an oscilloscope, or put it under heavy write workload and remove
power.

A modern drive has many thousands of reassign sectors available, so I don't
think either of these events will cause a permanent issue.

I'd also suggest reading older ATA specs, since some vendors still support
older commands that were capable of various weirdness that might be useful.

--eric


* RE: Blockbusting news, results get worse
@ 2003-10-27 18:06 Mudama, Eric
  2003-10-27 19:18 ` Andre Hedrick
  0 siblings, 1 reply; 26+ messages in thread
From: Mudama, Eric @ 2003-10-27 18:06 UTC (permalink / raw)
  To: 'Samium Gromoff'; +Cc: linux-kernel



> -----Original Message-----
> From: Samium Gromoff [mailto:deepfire@ibe.miee.ru]
> Sent: Monday, October 27, 2003 6:08 AM
> To: Mudama, Eric
> Cc: linux-kernel@vger.kernel.org
> Subject: RE: Blockbusting news, results get worse
> 
> 
> Eric Mudama wrote:
> > Andre Hedrick wrote:
> > > Eric,
> > >
> > > Item "3" in your list is not practical, because no drive
> > > maker allows the same drives that large oem's purchase to be placed in retail.
> > > There are obvious reasons, but your position stated for the average joe
> > > consumer is flawed.
> > 
> > I don't believe your statement is correct that OEM drives and retail drives
> > always differ.  They may have slight configuration differences, but
> > fundamentally I think they're the same drive with identical or
> > near-identical firmware.
> 
> If there is somebody you should believe about such stuff, that would be Andre.
> (by the way he was a T13 committee member not so long ago)

That's nice.  For $800/year, anyone can join who is interested, provided
they can attend the meetings.  Anyone is free to join and ask questions on
the T13 mailing list.

http://www.t13.org

As to the "facts," I guess I choose to believe myself, since I'm one of the
guys writing firmware that decides drive behavior in many of these cases
that people bring up.  Now, I've only been doing this for 3 years, so if
there was something done greater than 3 years ago, odds are I haven't heard
of it.  I am only speaking from recent experience.

As to "believing" Andre, I'm sure he's a nice guy, but he comes off as
awfully bitter... it's tough to read more than a few sentences of what he
writes.  He obviously "knows" stuff, but wants to make people jump through
hoops to learn what he knows.

> And, hey, I would have been rather surprised if you had
> answered otherwise, given your email address...

Of course, you can dismiss everything I'm saying if you like.  However, I'd
like to think I've been helpful to someone.  Disk drives don't work quite
the way some people think, so I like to try to clear up these misconceptions
thinking it will eventually help produce better linux code that works better
with the IDE drives I can afford.

--eric

* RE: Blockbusting news, results get worse
@ 2003-10-29 20:11 Mudama, Eric
  0 siblings, 0 replies; 26+ messages in thread
From: Mudama, Eric @ 2003-10-29 20:11 UTC (permalink / raw)
  To: 'Pavel Machek', John Bradford
  Cc: Jeff Garzik, Hans Reiser, 'Norman Diamond',
	'Wes Janzen ', 'Rogier Wolff ',
	linux-kernel, nikita, 'Justin Cormack ',
	'Vitaly Fertman ', 'Krzysztof Halasa '



> -----Original Message-----
> From: Pavel Machek [mailto:pavel@ucw.cz]
> 
> > > If you don't FLUSH CACHE, you have no guarantees your data is on the
> > > platter.
> > 
> > I think that the idea that is floating around is to deliberately ruin
> > the formatting on part of the drive in order to simulate a bad block.
> > 
> > Operation of disk drives immediately after a power failure has been
> > discussed before, by the way:
> > 
> > http://marc.theaimsgroup.com/?l=linux-kernel&m=100665153518652&w=2
> 
> Well, that looks like pure speculation.
> 
> BTW I *do* believe that powerfail can make the sector bad. Imagine you
> bump into bad sector during write, and need to reallocate...
> 
> 								Pavel

Both the linked post and Pavel's point are correct.

In a modern drive, tolerances are so tight that your drive is constantly
re-writing blocks it knows it didn't write very well.  In a power-fail
event, there's little to no time to reallocate or reattempt a write, and
even less energy available to "fix" things that aren't within specification
anymore (spin speed, etc) ... if we don't get the actuator to the latch,
your drive probably won't spin again and you'll lose *all* your data, so
that is our number 1 concern when the power fails.

"Performance" IDE drives these days ship with 8MB buffers, which compounds
the problem even further if you're trying to get data on the media after
power has been cut.
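
From the application side, the closest portable request to the FLUSH CACHE
point quoted above is fsync().  A minimal sketch follows; durable_write is a
hypothetical helper name, and whether fsync() actually results in a cache
flush command reaching the drive depends on the kernel, filesystem, and
driver in use, so that part is an assumption rather than a guarantee.

```python
import os


def durable_write(path, data):
    """Write `data` to `path` and ask the kernel to flush it before returning."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested; loop until done.
            written = os.write(fd, view)
            view = view[written:]
        # Without this call the data may still sit in the page cache, and
        # behind that in the drive's own buffer, when power is lost.
        os.fsync(fd)
    finally:
        os.close(fd)
```

Even when the flush does reach the drive, it only covers data written before
the call.  Anything still in the drive's buffer at the moment power fails is
subject to the constraints described above.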



end of thread, other threads:[~2003-10-30  8:28 UTC | newest]

Thread overview: 26+ messages
2003-10-26  7:37 Blockbusting news, results get worse Norman Diamond
2003-10-26 10:39 ` John Bradford
2003-10-26  9:41   ` Pavel Machek
2003-10-26 11:38   ` Norman Diamond
2003-10-26 11:56     ` Pavel Machek
2003-10-26 12:06     ` Hans Reiser
2003-10-26 13:59     ` Krzysztof Halasa
2003-10-26 18:33 Mudama, Eric
2003-10-26 22:03 ` Andre Hedrick
2003-10-27  9:34 ` Norman Diamond
2003-10-27 10:23   ` Jan-Benedict Glaw
2003-10-27 23:31   ` Jason Lunz
2003-10-28 20:56   ` Hans Reiser
2003-10-26 22:12 Mudama, Eric
2003-10-27 13:07 Samium Gromoff
2003-10-27 17:43 Mudama, Eric
2003-10-27 18:48 ` Hans Reiser
2003-10-27 19:47   ` Jeff Garzik
2003-10-27 20:03     ` John Bradford
2003-10-29 20:01       ` Pavel Machek
2003-10-30  8:30         ` John Bradford
2003-10-28  1:21     ` Pavel Machek
2003-10-28 12:54       ` Krzysztof Halasa
2003-10-27 18:06 Mudama, Eric
2003-10-27 19:18 ` Andre Hedrick
2003-10-29 20:11 Mudama, Eric
