All of lore.kernel.org
 help / color / mirror / Atom feed
From: Neil Brown <neilb@suse.de>
To: Theodore Tso <tytso@mit.edu>
Cc: "H. Peter Anvin" <hpa@zytor.com>, Ric Wheeler <ric@emc.com>,
	Linux-ide <linux-ide@vger.kernel.org>,
	linux-scsi <linux-scsi@vger.kernel.org>,
	linux-raid@vger.kernel.org, Tejun Heo <htejun@gmail.com>,
	James Bottomley <James.Bottomley@SteelEye.com>,
	Mark Lord <mlord@pobox.com>, Neil Brown <neilb@suse.de>,
	Jens Axboe <jens.axboe@oracle.com>,
	"Clark, Nathan" <Clark_Nathan@emc.com>,
	"Singh, Arvinder" <Singh_Arvinder@emc.com>,
	"De Smet, Jochen" <DeSmet_Jochen@emc.com>,
	"Farmer, Matt" <Farmer_Matt@emc.com>,
	linux-fsdevel@vger.kernel.org, "Mizar,
	Sunita" <Mizar_Sunita@emc.com>
Subject: Re: end to end error recovery musings
Date: Mon, 26 Feb 2007 16:33:37 +1100	[thread overview]
Message-ID: <17890.28977.989203.938339@notabene.brown> (raw)
In-Reply-To: message from Theodore Tso on Friday February 23

On Friday February 23, tytso@mit.edu wrote:
> On Fri, Feb 23, 2007 at 05:37:23PM -0700, Andreas Dilger wrote:
> > > Probably the only sane thing to do is to remember the bad sectors and 
> > > avoid attempting reading them; that would mean marking "automatic" 
> > > versus "explicitly requested" requests to determine whether or not to 
> > > filter them against a list of discovered bad blocks.
> > 
> > And clearing this list when the sector is overwritten, as it will almost
> > certainly be relocated at the disk level.  For that matter, a huge win
> > would be to have the MD RAID layer rewrite only the bad sector (in hopes
> > of the disk relocating it) instead of failing the whiole disk.  Otherwise,
> > a few read errors on different disks in a RAID set can take the whole
> > system offline.  Apologies if this is already done in recent kernels...

Yes, current md does this.

> 
> And having a way of making this list available to both the filesystem
> and to a userspace utility, so they can more easily deal with doing a
> forced rewrite of the bad sector, after determining which file is
> involved and perhaps doing something intelligent (up to and including
> automatically requesting a backup system to fetch a backup version of
> the file, and if it can be determined that the file shouldn't have
> been changed since the last backup, automatically fixing up the
> corrupted data block :-).
> 
> 						- Ted

So we want a clear path for media read errors from the device up to
user-space.  Stacked devices (like md) would do appropriate mappings
maybe (for raid0/linear at least.  Other levels wouldn't tolerate
errors).
There would need to be a limit on the number of 'bad blocks' that is
recorded.  Maybe a mechanism to clear old bad  blocks from the list is
needed.

Maybe if generic make request gets a request for a block which
overlaps a 'bad-block' it returns an error immediately.

Do we want a path in the other direction to handle write errors?  The
file system could say "Don't worry to much if this block cannot be
written, just return an error and I will write it somewhere else"?
This might allow md not to fail a whole drive if there is a single
write error.
Or is that completely un-necessary as all modern devices do bad-block
relocation for us?
Is there any need for a bad-block-relocating layer in md or dm?

What about corrected-error counts?  Drives provide them with SMART.
The SCSI layer could provide some as well.  Md can do a similar thing
to some extent.  Where these are actually useful predictors of pending
failure is unclear, but there could be some value.
e.g. after a certain number of recovered errors raid5 could trigger a
background consistency check, or a filesystem could trigger a
background fsck should it support that.


Lots of interesting questions... not so many answers.

NeilBrown

  parent reply	other threads:[~2007-02-26  5:33 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-02-23 14:15 end to end error recovery musings Ric Wheeler
2007-02-23 14:15 ` Ric Wheeler
2007-02-24  0:03 ` H. Peter Anvin
2007-02-24  0:37   ` Andreas Dilger
2007-02-24  2:05     ` H. Peter Anvin
2007-02-24  2:32     ` Theodore Tso
2007-02-24 18:39       ` Chris Wedgwood
2007-02-26  5:33       ` Neil Brown [this message]
2007-02-26 13:25         ` Theodore Tso
2007-02-26 15:15           ` Alan
2007-02-26 15:18             ` Ric Wheeler
2007-02-26 17:01               ` Alan
2007-02-26 16:42                 ` Ric Wheeler
2007-02-26 15:17           ` James Bottomley
2007-02-26 18:59           ` H. Peter Anvin
2007-02-26 22:46           ` Jeff Garzik
2007-02-26 22:53             ` Ric Wheeler
2007-02-27  1:19               ` Alan
2007-02-26  6:01   ` Douglas Gilbert
2007-02-27  1:10 Moore, Eric
2007-02-27  1:10 ` Moore, Eric
2007-02-27 16:50 ` Martin K. Petersen
2007-02-27 16:50   ` Martin K. Petersen
2007-02-27 18:51   ` Ric Wheeler
2007-02-27 19:02   ` Alan
2007-02-27 19:02     ` Alan
2007-02-27 18:39     ` Andreas Dilger
2007-02-27 19:07     ` Martin K. Petersen
2007-02-27 19:07       ` Martin K. Petersen
2007-02-27 23:39       ` Alan
2007-02-27 23:39         ` Alan
2007-02-27 22:51         ` Martin K. Petersen
2007-02-27 22:51           ` Martin K. Petersen
2007-02-28 13:46           ` Douglas Gilbert
2007-02-28 17:16             ` Martin K. Petersen
2007-02-28 17:30               ` James Bottomley
2007-02-28 17:42                 ` Martin K. Petersen
2007-02-28 17:52                   ` James Bottomley
2007-03-01  1:28                     ` H. Peter Anvin
2007-03-01 14:25                       ` James Bottomley
2007-03-01 17:19                         ` H. Peter Anvin
2007-02-28 15:19       ` Moore, Eric
2007-02-28 15:19         ` Moore, Eric
2007-02-28 17:27         ` Martin K. Petersen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=17890.28977.989203.938339@notabene.brown \
    --to=neilb@suse.de \
    --cc=Clark_Nathan@emc.com \
    --cc=DeSmet_Jochen@emc.com \
    --cc=Farmer_Matt@emc.com \
    --cc=James.Bottomley@SteelEye.com \
    --cc=Mizar_Sunita@emc.com \
    --cc=Singh_Arvinder@emc.com \
    --cc=hpa@zytor.com \
    --cc=htejun@gmail.com \
    --cc=jens.axboe@oracle.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-ide@vger.kernel.org \
    --cc=linux-raid@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=mlord@pobox.com \
    --cc=ric@emc.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.