From: Phil Turmel <philip@turmel.org>
To: Fabian Knorr <knorrfab@fim.uni-passau.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: Recovering an Array with inconsistent Superblocks
Date: Sat, 04 Jan 2014 21:32:19 -0500
Message-ID: <52C8C433.5030403@turmel.org>
In-Reply-To: <1388873143.7641.20.camel@vessel>

On 01/04/2014 05:05 PM, Fabian Knorr wrote:
> Hi, Phil,
> 
> thank you very much for your reply.
> 
>> Side note: If you have a live spare available for a raid5, there's no
>> good reason not to reshape to a raid6, and very good reasons to do so.
> 
> I was worried that RAID6 would incur a significant load on the CPU,
> especially if one disk fails. The system is a single-core Intel Atom.

It does add more load, especially when degraded.  I guess it depends on
your usage pattern.  I would try it before I gave up on the idea.

>> Device names are not guaranteed to remain identical from one boot to
>> another.  And often won't be if a removable device is plugged in at that
>> time.  The Linux MD driver keeps identity data in the superblock that
>> makes the actual device names immaterial.
>>
>> It is really important that we get a "map" of device names to drive
>> serial numbers, and adjust all future operations to ensure we are
>> working with the correct names.  An excerpt from "ls -l
>> /dev/disk/by-id/" would do.  And you need to verify it after every boot
>> until this crisis is resolved.
> 
> See the attachment "partitions". I grep'ed for raid partitions.
> 
>> 1) raid.status appears to be from *after* your --add attempts.  That
>> means anything in those reports from those devices is useless.  So we
>> will have to figure out what that data was.
> 
> Could it be that --add only changed the superblock of one disk,
> namely /dev/sdb in the file from my first e-mail?

/dev/sda actually.

>> 2) You attempted to recreate the array.  If you left out
>> "--assume-clean", your data is toast.  Please show the precise command
>> line you used in your re-create attempt.  Also generate a fresh
>> "raid.status" for the current situation.
> 
> The only commands I used were --add /dev/sdb, --run, --assemble --scan,
> --assemble --scan --force and --stop. I didn't try to re-create it, at
> least not now. Also, the timestamp from raid.status (2011) is incorrect,
> the array was re-created from scratch in the summer of 2012. I can't
> tell why disks other than /dev/sdb1 have an invalid superblock.

This is very good news.  In fact, I think --assemble --force can still
be made to work....

>> 3) The array seems to think its member devices were /dev/sda through
>> /dev/sdh (not in that order).  Your "raid.status" has /dev/sd[abcefghi],
>> suggesting a rescue usb or some such is /dev/sdd. 
> 
> Yes, that's correct.

Very good.

>> 4) Please describe the structure of the *content* of the array, so we
>> can suggest strategies to *safely* recognize when our future attempts to
>> --create --assume-clean have succeeded.  LVM?  Partitioned?  One big
>> filesystem?
> 
> I'm using the array as a physical volume for LVM.

Ok.

Try this:

mdadm --stop /dev/md0

mdadm -Afv /dev/md0 /dev/sd[bcefghi]1

It leaves out /dev/sda, which appears to have been the spare in the
original setup.
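
Since the glob above depends on the device names being what we think
they are, here is a minimal sketch of a helper to check the name-to-serial
map before each attempt.  The function name by_id_map is made up for
illustration and is not part of mdadm or udev; it only reformats the
output of "ls -l /dev/disk/by-id/".

```shell
# by_id_map: hypothetical helper (not part of mdadm or udev).
# Reads "ls -l /dev/disk/by-id/" output on stdin and prints one
# "kernel-name by-id-name" pair per symlink, sorted by kernel name.
by_id_map() {
    awk '{
        for (i = 2; i < NF; i++) {
            if ($i == "->") {
                name = $(i + 1)
                sub(/.*\//, "", name)   # strip the "../../" prefix
                print name, $(i - 1)
            }
        }
    }' | LC_ALL=C sort
}

# Usage (read-only, touches nothing on the disks):
#   ls -l /dev/disk/by-id/ | by_id_map | grep -- '-part1$'
```

If the pairs differ from the previous boot, adjust the device list
before running anything else.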

If MD is happy after that, use fsck -n on your logical volumes to verify
your FS integrity, and/or see the extent of the damage (little or none,
I think).
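
Roughly, the check above could go like this once the array is
assembled.  This is only a sketch: fsck_cmds is a made-up helper name,
and it prints the commands instead of running them so you can look them
over first.  It assumes every logical volume holds a filesystem that
fsck understands; skip any swap volumes.

```shell
# fsck_cmds: hypothetical helper. Reads logical volume paths on stdin
# (one per line, as printed by "lvs --noheadings -o lv_path") and
# prints a read-only "fsck -n" command for each one.
fsck_cmds() {
    while read -r lv; do
        [ -n "$lv" ] && printf 'fsck -n %s\n' "$lv"
    done
    return 0
}

# Usage, after MD has assembled the array:
#   vgchange -ay                              # activate the volume groups
#   lvs --noheadings -o lv_path | fsck_cmds   # review, then run by hand
```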

If that works, you can --add /dev/sda1 again, and it will become the
spare again.

If it doesn't work, show everything printed by "mdadm -Afv" above.
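
To capture everything in one go, something like the following should
do.  A minimal sketch: run_logged is a made-up wrapper name and the log
file name is arbitrary.  As far as I know mdadm sends its verbose
messages to stderr, hence the 2>&1.

```shell
# run_logged: hypothetical wrapper. Runs a command, shows its output,
# and saves stdout and stderr together in the given log file.
run_logged() {
    log=$1
    shift
    "$@" 2>&1 | tee "$log"
}

# Usage:
#   run_logged assemble.log mdadm -Afv /dev/md0 /dev/sd[bcefghi]1
```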

HTH,

Phil

