linux-kernel.vger.kernel.org archive mirror
* RE: Blockbusting news, results end
@ 2003-10-27 17:50 Mudama, Eric
  0 siblings, 0 replies; 23+ messages in thread
From: Mudama, Eric @ 2003-10-27 17:50 UTC
  To: 'Norman Diamond', 'Hans Reiser ',
	'Wes Janzen ', 'Rogier Wolff ',
	'John Bradford ',
	linux-kernel, nikita, 'Pavel Machek ',
	'Justin Cormack ', 'Vitaly Fertman ',
	'Krzysztof Halasa '



>-----Original Message-----
>> If a drive wants to reallocate a block, but due to some temporary
>> condition is unable to (vibration, excessive temperature, 
>> etc), odds are there's no way for that drive to "remember" that
>> it needs to reassign that block, so if you reboot the drive or
>> reset it or whatever, you're back at square 1.
> 
> Bingo.  This is why reallocation at the time of a failed read is also
> necessary.  Yes the data are lost, yes the failure needs to 
> be both logged (once) and displayed to the user (once), yes if an 
> application reads it again before writing then it will be garbage
> or zeroes, but get the LBA sector number moved to a place that is
> less likely to be unreliable.
> 
> Meanwhile software must still make up for defective firmware.
> 

Reallocating on a failed read doesn't always make sense.  Some huge
percentage of the errors on the media are caused by poor writes due to
various transient conditions (temperature, shock events, etc), and are not
actual media defects that prevent writing there in the future.  If we get an
ECC error, the only thing we can "reallocate" is the stuff with the error in
it, in which case you're no closer to getting a good block of data than you
were prior to the reallocation.

If you try to write to that LBA, it should detect that you're writing to a
marginal area, and do some amount of tests to make sure that the new write
can be read.
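
The write-then-verify behaviour described above can be sketched as a toy model. This is only an illustration of the idea, not how any real firmware works: the class `ToyDrive`, its fields, and the spare numbering are all made up for the example. A failed read only marks the LBA as marginal (there is no good data to move); the decision to reallocate is made on the next write, after a read-back.

```python
class ToyDrive:
    """Toy model only; every name here is hypothetical."""

    def __init__(self, spares, defective):
        self.spares = list(spares)       # unused spare physical locations
        self.defective = set(defective)  # locations with a real media defect
        self.remap = {}                  # LBA -> spare physical location
        self.media = {}                  # physical location -> stored data
        self.marginal = set()            # LBAs that failed a read (ECC error)

    def _physical(self, lba):
        return self.remap.get(lba, lba)

    def _store(self, phys, data):
        # A true media defect never holds readable data.
        self.media[phys] = None if phys in self.defective else data

    def read(self, lba):
        data = self.media.get(self._physical(lba))
        if data is None:
            # Nothing good to move; just remember the LBA is suspect.
            self.marginal.add(lba)
        return data

    def write(self, lba, data):
        phys = self._physical(lba)
        self._store(phys, data)
        if lba not in self.marginal:
            return phys                  # normal write, no read-back
        while self.media[phys] != data:  # read-back failed: move to a spare
            if not self.spares:
                raise RuntimeError("out of spare sectors")
            phys = self.spares.pop(0)
            self.remap[lba] = phys
            self._store(phys, data)
        self.marginal.discard(lba)
        return phys


drive = ToyDrive(spares=[1000], defective={7})

# Transient poor write at LBA 5: the rewrite verifies, so nothing is moved.
drive.write(5, b"old")
drive.media[5] = None                    # simulate the poorly written sector
drive.read(5)
print(drive.write(5, b"new"))            # stays at physical location 5

# Real defect at LBA 7: the verify fails, so the write lands on a spare.
drive.write(7, b"old")
drive.read(7)
print(drive.write(7, b"new"))            # moved to spare location 1000
```

In this sketch the in-memory `marginal` set is exactly the kind of state that would not survive a power cycle.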

Also, your term "defective firmware" is getting annoying.  What, exactly,
should a drive that knows it cannot access the media due to severe
environmental conditions do in firmware to remember its problems between
power cycles?

--eric

* RE: Blockbusting news, results end
@ 2003-10-28 16:10 Mudama, Eric
  2003-10-28 18:30 ` bill davidsen
  0 siblings, 1 reply; 23+ messages in thread
From: Mudama, Eric @ 2003-10-28 16:10 UTC
  To: 'Norman Diamond', linux-kernel, jw



> -----Original Message-----
> From: Norman Diamond
> 
> Someone else in this discussion estimated that physical 
> sectors are around 1MB these days.  My friends at Toshiba
> confirmed that physical sectors are much larger than
> logical sectors.  The physical sector size resembles that
> 1MB estimate far better than the 512B logical sector size.

I don't think your friends know what they're talking about.

Disk drives that use anything resembling the "standard" channel technology
have 2 physical sector types: servo sectors and data sectors.  Servo sectors
are designed to give the servo system a constant sample rate at any radius,
therefore they're constant in time, but get physically closer as you get
closer to the ID (inner diameter) of the drive.  Data sectors are a fixed size, to hold a
certain number of user bytes of data, and they can be split, etc, as needed
to fit them nicely on the drive.

A modern disk has "a few" data sectors per servo sector at the OD (outer
diameter; think 3-4.5), and "a few less" at the ID of the drive (think 1.5-2).

The 1MB number can't possibly be a servo sector, since a modern drive
transfers ~50-70MB/sec, which would imply that their servo system is holding
position within microns while sampling only 50-70 times/second.
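
The arithmetic behind that claim, as a quick sketch. The function name is made up for the example, and it assumes exactly one servo sample per 1MB of transferred data, which is what a 1MB servo sector would mean:

```python
MB = 1_000_000

def servo_samples_per_second(transfer_bytes_per_s, bytes_per_servo_sample):
    """Servo samples per second if one sample is taken per N bytes read."""
    return transfer_bytes_per_s / bytes_per_servo_sample

# Transfer rates from the text (~50-70MB/sec), one sample per 1MB.
for rate_mb in (50, 70):
    print(rate_mb, "MB/s ->", servo_samples_per_second(rate_mb * MB, 1 * MB), "samples/s")
```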

I don't think any modern drive can hold 1MB on a single track.  I know what
our current limit is, and we're not there yet.  (Anyone can figure this out
by looking at sequential read throughput per revolution.)
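
A rough version of that calculation, as a sketch. The spindle speeds below are assumptions for illustration (the text gives only the ~50-70MB/sec transfer rate), and the estimate ignores head-switch and track-to-track overhead:

```python
def bytes_per_track(sequential_bytes_per_s, rpm):
    """Estimate the data on one track: throughput divided by revolutions/s."""
    revolutions_per_s = rpm / 60.0
    return sequential_bytes_per_s / revolutions_per_s

# Transfer rates are from the text; the RPM values are assumed.
for rpm in (5400, 7200):
    for rate in (50_000_000, 70_000_000):
        print(rpm, "rpm,", rate // 1_000_000, "MB/s ->",
              round(bytes_per_track(rate, rpm)), "bytes/track")
```

With these assumed spindle speeds every combination comes out below 1MB per track.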

That would mean that to be 1MB for a data sector, or anywhere close, they're
spanning a single data cylinder across tracks.  This doesn't make sense
either.

What I think is possible, but still unlikely, is that their defect
management scheme might not be capable of handling single-block defects.
However, disk drives have had that ability for tens of years, so I can't see
how they could possibly sell a drive that way.

There's also a chance that in doing the larger write, you "cleaned up" a
poorly written adjacent track or ratty servo burst, which could account for
it working now.

Other than the track size and the DRAM buffer size, I can't think of
anything else offhand in a disk drive that is "about 1MB".

The insides of these things are near voodoo-magic...

> It is really hard to imagine a physical sector still being 
> 512B because the inter-sector gaps would take some huge
> multiple of the space occupied by the sectors.

We measure these gaps in nanoseconds.  They're not that huge.  But yes,
moving to a larger standard sector size would get you a significantly larger
disk drive built from the same parts.
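
Why a larger sector buys capacity can be sketched with a simple format-efficiency ratio. The per-sector overhead figure below is a placeholder, not a real drive parameter; the only point is that a fixed per-sector cost (gap, sync, ECC) is amortized over more user bytes:

```python
def format_efficiency(user_bytes, overhead_bytes):
    """Fraction of the track spent on user data for one sector."""
    return user_bytes / (user_bytes + overhead_bytes)

# HYPOTHETICAL overhead, expressed as a byte-equivalent per sector.
ASSUMED_OVERHEAD = 64

for sector in (512, 4096):
    print(sector, "B sector ->", round(format_efficiency(sector, ASSUMED_OVERHEAD), 3))
```

In practice the overhead would not stay constant as the sector grows (a longer sector needs more ECC), so this overstates the gain; it only shows the direction of the effect.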

> I'm sure the physical sectors are not 512B.

I'm sure you're wrong.

I'd imagine that since Seagate and WD and Maxtor are constantly duking it
out to release the next generation of capacity, and we all wind up producing
nearly-identical products when all is said and done, that they're using 512B
data sectors also.

* Re: Blockbusting news, results end
@ 2003-10-28 11:31 Norman Diamond
  0 siblings, 0 replies; 23+ messages in thread
From: Norman Diamond @ 2003-10-28 11:31 UTC
  To: linux-kernel, jw, Mudama, Eric

jw schultz wrote:

> > I am assuming that these numbers are applicable (one is
> > unknown):
> > logical sector size == 512B
> > physical sector size == ???B
> > page size/filesystem block size == 4KB
>
> I have dialoged with Eric Mudama.  He is 99% sure that no
> manufacturer is making ATA drives with physical sectors
> larger than 512B.  I'll let that statement trump Norman
> Diamond's until i hear otherwise.

Someone else in this discussion estimated that physical sectors are around
1MB these days.  My friends at Toshiba confirmed that physical sectors are
much larger than logical sectors.  The physical sector size resembles that
1MB estimate far better than the 512B logical sector size.

> The drive manufacturers would like to be able to go to a
> larger physical sector but the read-modify-write is just too
> scary.

It is really hard to imagine a physical sector still being 512B because the
inter-sector gaps would take some huge multiple of the space occupied by the
sectors.  I think this discussion has proved that we need to be scared of
read-modify-writes, but I think the drive manufacturers are doing it even
though it is scary.

> If they could be sure of market acceptance of drives
> that required all I/O to be in larger units they would build
> them because it would allow greater capacity (and i'm
> guessing speed as well) on the same physical hardware.

No, the effect on speed is the opposite.  A simple write could be done with
one seek and a random fraction of a rotational delay.  A read-modify-write
requires one seek and a random fraction plus an additional entire rotational
delay.
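
That comparison in numbers, as a sketch. The seek time and spindle speed are assumed values for illustration, and the "random fraction" is taken at its average of half a revolution:

```python
def rotation_ms(rpm):
    """Time for one full revolution, in milliseconds."""
    return 60_000.0 / rpm

def simple_write_ms(seek_ms, rpm):
    # One seek plus, on average, half a revolution of rotational delay.
    return seek_ms + 0.5 * rotation_ms(rpm)

def read_modify_write_ms(seek_ms, rpm):
    # Same as a simple write, plus one entire extra revolution to come
    # back around to the sector after reading it.
    return seek_ms + 1.5 * rotation_ms(rpm)

SEEK_MS = 9.0   # assumed average seek time
RPM = 7200      # assumed spindle speed

print("simple write:      %.2f ms" % simple_write_ms(SEEK_MS, RPM))
print("read-modify-write: %.2f ms" % read_modify_write_ms(SEEK_MS, RPM))
```

Whatever values are plugged in, the difference between the two is exactly one revolution.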

I want to ask my friends why the read-modify-write behavior changed between
a 512B write and a 4096B write.  When the drive finally reallocated the
defective sector, it was during a 4096B write.  But I don't think they'll be
allowed to answer.  They already weren't allowed to talk much about the
firmware, but they did confirm that my original complaints about the
defective firmware were pretty accurate.

By the way, a few years ago when I visited other departments at Toshiba's
local division, I walked past a lot of ordinary large open-layout offices,
and also walked past one highly secured door.  That door had a sign on it
(in Japanese) saying that entry was prohibited to anyone not working on disk
drive design.  The accidental occurrences by which I became friends with some
of their disk drive engineers were not a result of those business visits.
Probably there is no way that Toshiba would ever officially publicize even
the limited amount of information that my friends admitted to.  Nonetheless,
I'm sure the physical sectors are not 512B.


* RE: Blockbusting news, results end
@ 2003-10-26 18:39 Mudama, Eric
  2003-10-27  9:45 ` Norman Diamond
  0 siblings, 1 reply; 23+ messages in thread
From: Mudama, Eric @ 2003-10-26 18:39 UTC
  To: 'Norman Diamond', 'Hans Reiser ',
	'Wes Janzen ', 'Rogier Wolff ',
	'John Bradford ',
	linux-kernel, nikita, 'Pavel Machek ',
	'Justin Cormack ', 'Russell King ',
	'Vitaly Fertman ', 'Krzysztof Halasa '



> -----Original Message-----
> From: Norman Diamond [mailto:ndiamond@wta.att.ne.jp]
>
> The drive finally reallocated the block and there are no 
> longer any visible
> bad blocks.

If a drive wants to reallocate a block, but due to some temporary condition
is unable to (vibration, excessive temperature, etc), odds are there's no
way for that drive to "remember" that it needs to reassign that block, so if
you reboot the drive or reset it or whatever, you're back at square 1.

The only "memory" that survives between power cycles in a disk drive is on
the media, so if we can't reliably access the media we're hosed.

--eric


* Re: Blockbusting news, results end
@ 2003-10-26  8:49 Norman Diamond
  2003-10-26  9:22 ` Pavel Machek
  2003-10-26 11:01 ` Hans Reiser
  0 siblings, 2 replies; 23+ messages in thread
From: Norman Diamond @ 2003-10-26  8:49 UTC
  To: Mudama, Eric, 'Hans Reiser ', 'Wes Janzen ',
	'Rogier Wolff ', 'John Bradford ',
	linux-kernel, nikita, 'Pavel Machek ',
	'Justin Cormack ', 'Russell King ',
	'Vitaly Fertman ', 'Krzysztof Halasa '

The drive finally reallocated the block and there are no longer any visible
bad blocks.

I will not be able to perform the following planned test:
  Well, in a future weekend, I will try to see if ext2fs really takes action
  on permanently bad blocks that are detected during normal operations on a
  mounted partition.

But I think the underlying defects remain in need of correction.  Toshiba
knows about theirs but will probably never say if they make any fixes.  Mr.
Reiser and friends have plans to add important features, and I am unable to
detect if ext2fs needs it.  (As mentioned before, I understand that ext2fs
can do it during formatting and fsck, but no one seems to be saying what
happens if a permanently bad block is detected during normal operation on a
mounted partition.)



Thread overview: 23+ messages
2003-10-27 17:50 Blockbusting news, results end Mudama, Eric
  -- strict thread matches above, loose matches on Subject: below --
2003-10-28 16:10 Mudama, Eric
2003-10-28 18:30 ` bill davidsen
2003-10-28 11:31 Norman Diamond
2003-10-26 18:39 Mudama, Eric
2003-10-27  9:45 ` Norman Diamond
2003-10-27 10:48   ` Krzysztof Halasa
2003-10-26  8:49 Norman Diamond
2003-10-26  9:22 ` Pavel Machek
2003-10-26 11:25   ` Norman Diamond
2003-10-27 20:58     ` jw schultz
2003-10-27 22:27       ` Andre Hedrick
2003-10-27 22:57         ` jw schultz
2003-10-28  2:03           ` jw schultz
2003-10-26 11:01 ` Hans Reiser
2003-10-26 12:59   ` Oleg Drokin
2003-10-26 12:05     ` Hans Reiser
2003-10-26 12:39       ` Oleg Drokin
2003-10-26 16:26         ` Hans Reiser
2003-10-26 17:13           ` Oleg Drokin
2003-10-26 18:20             ` Hans Reiser
2003-10-26 19:07               ` Oleg Drokin
2003-10-27 12:44                 ` Vitaly Fertman
