From: NeilBrown <neilb@suse.de>
To: keld@keldix.com
Cc: linux-raid@vger.kernel.org
Subject: Re: Using the new bad-block-log in md for Linux 3.1
Date: Wed, 27 Jul 2011 16:49:59 +1000
Message-ID: <20110727164959.3352c5a3@notabene.brown>
In-Reply-To: <20110727062110.GA9801@www5.open-std.org>

On Wed, 27 Jul 2011 08:21:10 +0200 keld@keldix.com wrote:

> On Wed, Jul 27, 2011 at 02:16:52PM +1000, NeilBrown wrote:
> > 
> > As mentioned earlier, Linux 3.1 will contain support for recording and
> > avoiding bad blocks on devices in md arrays.
> > 
> > These patches are currently in -next and I expect to send them to Linus
> > tomorrow.
> > 
> > Using this functionality requires support in mdadm.  When an array is created
> > some space needs to be reserved to store the bad block list.
> > 
> > I have just created an mdadm branch called devel-3.3 which provides initial
> > functionality.  The main patch is included inline below.
> > 
> > This only supports creating new arrays with badblock support.  It also only
> > supports 1.x metadata.
> > 
> > I hope to add support to add a bad block list to an existing 1.x array at
> > some stage, but support for 0.90 metadata is not expected to ever be added.
> > 
> > If you create an array with this mdadm it will add a bad block log - you
> > cannot turn it off (it is only 4K long so why would you want to).  Then as
> > errors occur they will cause the faulty block to be added to the log rather
> > than the device to be removed from the array.
> > If writing the new bad block list fails, then the device as a whole will fail.
> > 
> > I would very much appreciate any reports of success or failure when using
> > this new feature.  If you can make a test array using a known-faulty device
> > and can experiment with that I would particularly like to hear about any
> > experiences.
> > 
> > Thanks,
> > NeilBrown
> > 
> >  git://neil.brown.name/mdadm devel-3.3
> > 
> > http://neil.brown.name/git?p=mdadm;a=shortlog;h=refs/heads/devel-3.3
> 
> How is it implemented? Does the bad block get duplicated in a reserve area?

No duplication - I expect the underlying device to be doing that, and doing
it again at another level seems pointless.

The easiest way to think about it is that the stripe containing a bad block is
treated as 'degraded'.  You can have an array where only some stripes are
degraded, and they are each missing different devices.

> Or are the corresponding good blocks on other sound devices also excluded?

Not sure what you mean.  A bad block is just on one device.  Each device has
its own independent table of bad blocks.
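
As a rough illustration of that model (a sketch only, not the md code or
its on-disk format), the per-device tables and the resulting 'degraded'
view could look like the following.  The device names, the one block per
device per stripe simplification, and the helper names are all
assumptions made for the example.

```python
# Illustrative model only: hypothetical names, not md's data structures.
# Simplification: each device holds one block per stripe, so a bad block
# is identified by (device, stripe number).

# Each device has its own independent table of bad blocks.
bad_blocks = {
    "sda": {3},
    "sdb": {7},
    "sdc": set(),
}

def missing_devices(stripe):
    """Devices whose block in this stripe is recorded as bad."""
    return sorted(dev for dev, bad in bad_blocks.items() if stripe in bad)

def is_degraded(stripe):
    """A stripe is treated as degraded if any device has a bad block in it."""
    return bool(missing_devices(stripe))

for stripe in (3, 5, 7):
    print(stripe, is_degraded(stripe), missing_devices(stripe))
```

Here stripes 3 and 7 are both degraded but are missing different devices,
while stripe 5 is fully intact.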

> 
> How big a device can it handle?

2^54 sectors, which with 512-byte sectors is 8 exbibytes.
With larger sectors, larger devices.
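
The arithmetic can be checked with a few lines.  The 4096-byte case is
only an example of a "larger sector"; it is not a size taken from the
patches.

```python
# Worked arithmetic for the 2^54-sector limit stated above.
MAX_SECTORS = 2 ** 54
EIB = 2 ** 60  # one exbibyte in bytes

def max_device_bytes(sector_size):
    """Largest addressable device, in bytes, for a given sector size."""
    return MAX_SECTORS * sector_size

print(max_device_bytes(512) // EIB)    # 2^54 * 2^9  = 2^63 bytes = 8 EiB
print(max_device_bytes(4096) // EIB)   # 2^54 * 2^12 = 2^66 bytes = 64 EiB
```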

> 
> If a device fails totally and the remaining devices include some with
> bad blocks, will there then be lost data?

Yes.  You shouldn't aim to run an array with bad blocks any more than you
should run an array degraded.
The purpose of bad block management is to provide a more graceful failure
path, not to encourage you to run an array with bad drives (except for
testing).

In particular this lays the groundwork to implement hot-replace.  If you
have a drive that is failing it can stay in the array and hobble along for a
bit longer.  Meanwhile you add a fresh new drive as a hot-replace and let it
rebuild.  If there is a bad block elsewhere in the array the hot-replace
drive might still rebuild completely.  And even if there is a failure, you
will only lose some blocks, not the whole array.
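
To make "only lose some blocks" concrete, here is a small sketch of
which stripes become unrecoverable when one device fails completely
while others carry bad blocks.  It is a toy model, not the md
implementation: the function name, the one block per device per stripe
layout, and the default redundancy of one missing device per stripe
(RAID5-like) are assumptions.

```python
# Toy model, hypothetical names: which stripes are lost if a whole device
# fails while other devices have recorded bad blocks?

def lost_stripes(bad_blocks, failed_device, redundancy=1):
    """Return stripes missing more blocks than the array can tolerate.

    bad_blocks maps device name -> set of stripe numbers with a bad block.
    redundancy is how many missing devices a single stripe can survive.
    """
    lost = set()
    # Only stripes with at least one recorded bad block can be affected.
    for stripe in set().union(*bad_blocks.values()):
        others_bad = sum(
            1
            for dev, bad in bad_blocks.items()
            if dev != failed_device and stripe in bad
        )
        # The failed device is missing from every stripe, hence the 1.
        if 1 + others_bad > redundancy:
            lost.add(stripe)
    return lost

tables = {"sda": {3}, "sdb": {7}, "sdc": set()}
print(sorted(lost_stripes(tables, "sdc")))  # stripes bad on a surviving device
print(sorted(lost_stripes(tables, "sda")))  # sda's own bad block no longer matters
```

With these example tables, losing sdc costs stripes 3 and 7 and nothing
else, and losing sda costs only stripe 7.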

This all makes it very hard to build confidence in the code - most of the
time it is not used at all and I would rather it stayed that way.  But when things
start going wrong, you really want it to be 100% bug free.


Thanks for the questions,
NeilBrown



> 
> Best regards
> keld

