From: Krzysztof Halasa <khc@pm.waw.pl>
To: John Bradford <john@grabjohn.com>
Cc: Rogier Wolff <R.E.Wolff@BitWizard.nl>,
	Norman Diamond <ndiamond@wta.att.ne.jp>,
	Hans Reiser <reiser@namesys.com>,
	Wes Janzen <superchkn@sbcglobal.net>,
	linux-kernel@vger.kernel.org
Subject: Re: Blockbusting news, this is important (Re: Why are bad disk sectors numbered strangely, and what happens to them?)
Date: 18 Oct 2003 01:28:53 +0200
Message-ID: <m37k33igui.fsf@defiant.pm.waw.pl>
In-Reply-To: <200310171935.h9HJZaLm002335@81-2-122-30.bradfords.org.uk>

John Bradford <john@grabjohn.com> writes:

> I said an _additional_ bit.  I am assuming that N-1 reads returned the
> same, (bad), data, which was identified as bad.  Read N encountered
> one too many flipped bits and returned a false positive.  Perfectly
> possible, and arguably more likely than all of the existing incorrect
> bits flipping back, resulting in the correct data being read back, in
> some cases.

In some cases, theoretically, yes. But I've never seen anything like that
in practice.

BTW: Hard drives apparently use more sophisticated algorithms,
involving measuring head signal level even when there is no problem
reading the data, and eventually remapping a sector on read before the
information is lost.

> Tell this to the drive manufacturers.  They are the ones who can sell
> you a specialist firmware if you want to do data recovery, not me.

Maybe. But, you know, it's Linux and I don't want to pay for additional
software just to use disks already paid for. Especially when it's all
working fine now.

> Your argument is flawed - how can you claim the current situation is
> sane when at least some drive manufacturers don't publish simple facts
> such as what happens when defective blocks are encountered on reads
> and on writes?

Do you think you can make them publish such things? It would be great.

> If a system got into a state as extreme as that, I'd generally take
> the whole system down.  Electromagnetic interference that affects one
> drive immediately noticeably may well be affecting other components in
> subtle ways - possible _silent_ data corruption in other words.

Possibly. Or the machine may freeze immediately. But the data on the
disk platters will probably be OK, and you'll be able to read it
once conditions are back within spec.

> Yes.  Or more specifically, I wouldn't trust that data without
> verifying it.  It's easy to ignore such problems and say that
> everything is probably OK, and maybe 99% of the time you would be
> right, but so what?  What about that 1%?

That's not 1% - rather something like 10^-17 or so.
See the specs.
And we have CRCs all over the place - a damaged .gnumeric file will
probably fail at the gunzip stage.
BTW: the probability of silently corrupting, say, (D)RAM contents is
much, much higher than that of corrupting HDD data. Even if you use
ECC RAM.
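
A minimal sketch of the CRC point, in Python. It assumes a made-up
payload standing in for a .gnumeric file, and it stores the payload with
gzip at compression level 0 so the bytes sit verbatim in the archive and
a single bit can be flipped in a known place. The helper names flip_bit
and is_detected are hypothetical, chosen only for this illustration.

```python
import gzip
import zlib


def flip_bit(blob: bytes, byte_index: int, bit: int = 0) -> bytes:
    """Return a copy of blob with one bit inverted (simulated damage)."""
    damaged = bytearray(blob)
    damaged[byte_index] ^= 1 << bit
    return bytes(damaged)


def is_detected(blob: bytes) -> bool:
    """True if gzip refuses to decompress blob (bad stream or CRC mismatch)."""
    try:
        gzip.decompress(blob)
    except (OSError, EOFError, zlib.error):
        # BadGzipFile (CRC check failed) is a subclass of OSError.
        return True
    return False


# Hypothetical stand-in for the contents of a .gnumeric file.
payload = b"<gnumeric><cell row='1' col='1'>42</cell></gnumeric>"

# Level 0 means "stored": the payload appears unmodified inside the archive,
# followed by the CRC-32 and length trailer that gzip always appends.
archive = gzip.compress(payload, compresslevel=0)

# Flip one bit inside the stored payload, leaving the trailer untouched.
offset = archive.index(payload)
damaged = flip_bit(archive, offset + 10)

print("intact archive rejected: ", is_detected(archive))
print("damaged archive rejected:", is_detected(damaged))
```

CRC-32 catches every single-bit error, so the damaged archive is always
rejected here. The check only covers damage that happens after the file
was compressed; data that was already wrong when gzip ran gets a
perfectly valid CRC.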

> > Do you really not value your data enough to mark it as inaccessible?
> 
> Not sure what you mean - in what context?

Remapping a sector on read without actually copying the data makes
it inaccessible. Unless you have manufacturer-provided software, of
course, but I haven't seen any.

> Data recovery is always a last resort.  On the other hand, backing up
> data daily can still result in 23 hours of lost data, so I consider
> early detection of faulty disks very important.  Mirroring brings its
> own problems to consider - more devices to possibly fail, and if they
> are connected to the same controller, a serious fault with any one
> could usually theoretically destroy all of them.

It all depends on requirements. If you need 100% uninterrupted service
you can use mirrored servers, possibly installed in different locations.
This will fix potential problems, while remapping on failed read will
not.
-- 
Krzysztof Halasa, B*FH