* Raid5 crashed, need comments on possible repair solution
@ 2012-04-23 13:56 Christoph Nelles
  2012-04-23 21:00 ` NeilBrown
  0 siblings, 1 reply; 6+ messages in thread
From: Christoph Nelles @ 2012-04-23 13:56 UTC (permalink / raw)
  To: linux-raid

Hi,

Linux RAID has worked fine for me over the last few years, but yesterday,
while reorganizing the hardware in my server, the RAID5 crashed. It was a
software RAID level 5 with 6x 3TB drives and ran XFS on top. I have no
idea why it crashed, but now all superblocks are invalid (one dump follows
below) and sadly I have no record of the RAID disk layout (i.e. in which
sequence the drives were). All drives from the array are available and
running.

As I cannot afford to buy 6 more drives to make a backup before trying to
fix the situation, I need a non-destructive approach to repairing the RAID
configuration and the superblocks.

From my understanding of the RAID5 implementation, the correct order of
the drives is important.

First Question:
1) Am I right that the order is important and that I have to try to find the
right sequence of drives?

So I would loop over all permutations of the drive list and, for each
permutation (a rough script sketch follows the list below):
- Scrub the Superblock mdadm --zero-superblock /dev/sd[bcdefg]1
- Recreate the RAID5 mdadm --create /dev/md0 -c 64 -l 5 \
	-n 6 --assume-clean <drive permutation>
- Run xfs_check to see if it recognizes the FS xfs_check -s /dev/md0
- Stop the RAID mdadm --stop /dev/md0
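In rough shell terms the whole loop could look like this (a completely
untested sketch; it assumes the six members really are /dev/sd[bcdefg]1 and
that the chunk size really was 64K -- both need checking before running
anything):

  #!/bin/bash
  # Brute-force the drive order by trying every permutation.
  # WARNING: rewrites the md superblocks on every pass.

  try_order() {
      mdadm --zero-superblock /dev/sd[bcdefg]1
      # mdadm may ask for confirmation if it still sees old signatures;
      # --run would suppress that question
      mdadm --create /dev/md0 -c 64 -l 5 -n 6 --assume-clean "$@"
      if xfs_check -s /dev/md0; then
          echo "candidate order: $*" >> order-candidates.txt
      fi
      mdadm --stop /dev/md0
  }

  # Recursively walk all 720 permutations of the member list.
  permute() {
      local prefix="$1"; shift
      if [ "$#" -eq 0 ]; then
          try_order $prefix        # word-splitting of $prefix is intentional
          return
      fi
      local d x rest
      for d in "$@"; do
          rest=""
          for x in "$@"; do
              [ "$x" = "$d" ] || rest="$rest $x"
          done
          permute "$prefix $d" $rest
      done
  }

  permute "" /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1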

2) Is that a promising approach to repairing the RAID5 array?
3) According to the man page, --assume-clean means that no data is affected
unless you write to the array, so this effectively prevents a rebuild?
This is important for me, as I don't want to trigger a rebuild; that
would certainly send my data to hell.
4) Any other ideas for repairing the RAID without losing user data?

Thanks in advance for any answers.


Currently the RAID superblocks on each device look like this:

/dev/sdg1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 53a294b5:975244fc:343b0f94:16652fce
           Name : grml:0
  Creation Time : Fri Apr 15 20:55:52 2011
     Raid Level : -unknown-
   Raid Devices : 0

 Avail Dev Size : 5860529039 (2794.52 GiB 3000.59 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : 9688dc72:02140045:c16a2123:4f6cc006

    Update Time : Sun Apr 22 23:56:14 2012
       Checksum : 350d8d74 - correct
         Events : 1


   Device Role : spare
   Array State :  ('A' == active, '.' == missing)


Interestingly, at the Update Time the system should already have been shut down:
Apr 22 23:55:55 router init: Switching to runlevel: 0
[...]
Apr 22 23:56:03 router exiting on signal 15
Apr 22 23:59:21 router syslogd 1.5.0: restart.

I really have no clue what happened.


Regards

Christoph Nelles

-- 
Christoph Nelles

E-Mail    : evilazrael@evilazrael.de
Jabber    : eazrael@evilazrael.net      ICQ       : 78819723

PGP-Key   : ID 0x424FB55B on subkeys.pgp.net
            or http://evilazrael.net/pgp.txt


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Raid5 crashed, need comments on possible repair solution
  2012-04-23 13:56 Raid5 crashed, need comments on possible repair solution Christoph Nelles
@ 2012-04-23 21:00 ` NeilBrown
  2012-04-23 21:47   ` Christoph Nelles
  0 siblings, 1 reply; 6+ messages in thread
From: NeilBrown @ 2012-04-23 21:00 UTC (permalink / raw)
  To: Christoph Nelles; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 3452 bytes --]

On Mon, 23 Apr 2012 15:56:16 +0200 Christoph Nelles
<evilazrael@evilazrael.de> wrote:

> Hi,
> 
> Linux RAID has worked fine for me over the last few years, but yesterday,
> while reorganizing the hardware in my server, the RAID5 crashed. It was a
> software RAID level 5 with 6x 3TB drives and ran XFS on top. I have no
> idea why it crashed, but now all superblocks are invalid (one dump follows
> below) and sadly I have no record of the RAID disk layout (i.e. in which
> sequence the drives were). All drives from the array are available and
> running.
> 
> As I cannot afford to buy 6 more drives to make a backup before trying to
> fix the situation, I need a non-destructive approach to repairing the RAID
> configuration and the superblocks.
> 
> From my understanding of the RAID5 implementation, the correct order of
> the drives is important.
> 
> First Question:
> 1) Am I right that the order is important and that I have to try to find the
> right sequence of drives?
> 
> So I would loop over all permutations of the drive list and, for each
> permutation:
> - Scrub the Superblock mdadm --zero-superblock /dev/sd[bcdefg]1
> - Recreate the RAID5 mdadm --create /dev/md0 -c 64 -l 5 \
> 	-n 6 --assume-clean <drive permutation>
> - Run xfs_check to see if it recognizes the FS xfs_check -s /dev/md0
> - Stop the RAID mdadm --stop /dev/md0
> 
> 2) Is that a promising approach to repairing the RAID5 array?
> 3) According to the man page, --assume-clean means that no data is affected
> unless you write to the array, so this effectively prevents a rebuild?
> This is important for me, as I don't want to trigger a rebuild; that
> would certainly send my data to hell.
> 4) Any other ideas for repairing the RAID without losing user data?
> 
> Thanks in advance for any answers.
> 
> 
> Currently the RAID superblocks on each device look like this:
> 
> /dev/sdg1:
>           Magic : a92b4efc
>         Version : 1.2
>     Feature Map : 0x0
>      Array UUID : 53a294b5:975244fc:343b0f94:16652fce
>            Name : grml:0
>   Creation Time : Fri Apr 15 20:55:52 2011
>      Raid Level : -unknown-
>    Raid Devices : 0
> 
>  Avail Dev Size : 5860529039 (2794.52 GiB 3000.59 GB)
>     Data Offset : 2048 sectors
>    Super Offset : 8 sectors
>           State : active
>     Device UUID : 9688dc72:02140045:c16a2123:4f6cc006
> 
>     Update Time : Sun Apr 22 23:56:14 2012
>        Checksum : 350d8d74 - correct
>          Events : 1
> 
> 
>    Device Role : spare
>    Array State :  ('A' == active, '.' == missing)
> 
> 
> Interestingly, at the Update Time the system should already have been shut down:
> Apr 22 23:55:55 router init: Switching to runlevel: 0
> [...]
> Apr 22 23:56:03 router exiting on signal 15
> Apr 22 23:59:21 router syslogd 1.5.0: restart.
> 
> I really have no clue what happened.

This is really worrying.  It's about the 3rd or 4th report recently which
contains:

>      Raid Level : -unknown-
>    Raid Devices : 0

and that should not be possible.  There must be some recent bug that causes
the array to be "cleared" *before* writing out the metadata - and that should
be impossible.
What kernel are you running?

You are correct that order is important.  Your algorithm looks good.
However, I suggest that you first look through your system logs to see if

  RAID conf printout:

appears at all.  That could contain the device order.
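
Something along these lines should find it (the log locations are just a
guess -- use wherever your syslog puts kernel messages, and zgrep for any
rotated, compressed files):

  grep -A8 "RAID conf printout" /var/log/messages* /var/log/kern.log*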

NeilBrown


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Raid5 crashed, need comments on possible repair solution
  2012-04-23 21:00 ` NeilBrown
@ 2012-04-23 21:47   ` Christoph Nelles
  2012-04-23 23:01     ` NeilBrown
  0 siblings, 1 reply; 6+ messages in thread
From: Christoph Nelles @ 2012-04-23 21:47 UTC (permalink / raw)
  To: linux-raid

Hello Neil,


First of all, thanks for the answer. I will happily provide any data or logs if
it helps you investigate this problem.


Am 23.04.2012 23:00, schrieb NeilBrown:
> This is really worrying.  It's about the 3rd or 4th report recently which
> contains:
> 
>>      Raid Level : -unknown-
>>    Raid Devices : 0
> 
> and that should not be possible.  There must be some recent bug that causes
> the array to be "cleared" *before* writing out the metadata - and that should
> be impossible.
> What kernel are you running?

I switched kernel versions during that server rebuild. The last running
system was on 3.2.5; then came the rebuild and the switch to 3.3.1, and with
that it crashed. The kernel is vanilla, self-compiled, x86_64.
mdadm is 3.1.5, self-compiled, too.

> You are correct that order is important.  Your algorithm looks good.
> However, I suggest that you first look through your system logs to see if
> 
>   RAID conf printout:
> 
> appears at all.  That could contain the device order.

Yes, I found that, but as I reordered and rewired the drives for thermal
reasons, I have no idea which drive was mapped to which device name. The
normal system logs only contain the model names, and I had multiple drives
of the same type.

Regards

Christoph Nelles


-- 
Christoph Nelles

E-Mail    : evilazrael@evilazrael.de
Jabber    : eazrael@evilazrael.net      ICQ       : 78819723

PGP-Key   : ID 0x424FB55B on subkeys.pgp.net
            or http://evilazrael.net/pgp.txt


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Raid5 crashed, need comments on possible repair solution
  2012-04-23 21:47   ` Christoph Nelles
@ 2012-04-23 23:01     ` NeilBrown
  2012-05-12 17:19       ` Pierre Beck
  0 siblings, 1 reply; 6+ messages in thread
From: NeilBrown @ 2012-04-23 23:01 UTC (permalink / raw)
  To: Christoph Nelles; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 2302 bytes --]

On Mon, 23 Apr 2012 23:47:12 +0200 Christoph Nelles
<evilazrael@evilazrael.de> wrote:

> Hello Neil,
> 
> 
> first thanks for the answer. I will happily provide any data or logs if
> it helps you to investigate this problem.
> 
> 
> Am 23.04.2012 23:00, schrieb NeilBrown:
> > This is really worrying.  It's about the 3rd or 4th report recently which
> > contains:
> > 
> >>      Raid Level : -unknown-
> >>    Raid Devices : 0
> > 
> > and that should not be possible.  There must be some recent bug that causes
> > the array to be "cleared" *before* writing out the metadata - and that should
> > be impossible.
> > What kernel are you running?
> 
> I switched kernel versions during that server rebuild. The last running
> system was on 3.2.5; then came the rebuild and the switch to 3.3.1, and with
> that it crashed. The kernel is vanilla, self-compiled, x86_64.
> mdadm is 3.1.5, self-compiled, too.

Thanks.
This is suggestive that it is a very recently introduced bug, and your
earlier observation that the "update time" correlated with the machine being
rebooted was very helpful.
I believe I have found the problem and have reproduced the symptom.
The sequence I used to reproduce it was a bit forced and probably isn't
exactly what happened in your case.  Maybe there is a race condition that can
trigger it as well.

In any case, the following patch should fix the issue, and is strongly
recommended for any kernel to which it applies.

I'll send this upstream shortly.

Of course this doesn't help you with your current problem, though at least it
suggests that it won't happen again.

I recall that you said you would be re-creating the array with a chunk size
of 64k.  The default has been 512K since mdadm-3.1 in late 2009.
Did you explicitly create with "-c 64" when you created the array? If not,
maybe you need to use "-c 512".
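
i.e. the create step in your loop would then become something like (drive
order still to be determined, of course):

  mdadm --create /dev/md0 -c 512 -l 5 -n 6 --assume-clean <drive permutation>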

NeilBrown



diff --git a/drivers/md/md.c b/drivers/md/md.c
index 333190f..4a7002d 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8402,7 +8402,8 @@ static int md_notify_reboot(struct notifier_block *this,
 
 	for_each_mddev(mddev, tmp) {
 		if (mddev_trylock(mddev)) {
-			__md_stop_writes(mddev);
+			if (mddev->pers)
+				__md_stop_writes(mddev);
 			mddev->safemode = 2;
 			mddev_unlock(mddev);
 		}

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: Raid5 crashed, need comments on possible repair solution
  2012-04-23 23:01     ` NeilBrown
@ 2012-05-12 17:19       ` Pierre Beck
  2012-05-14 21:00         ` C.J. Adams-Collier KF7BMP
  0 siblings, 1 reply; 6+ messages in thread
From: Pierre Beck @ 2012-05-12 17:19 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

Hi,

we had a case on IRC of an all-spares auto-assembly with "Raid Level :
-unknown-". We recovered by re-creating the array. Since more data is always
good, I'm adding this to the thread in the hope that it helps confirm that
the bug is fixed by the patch.

RAID-5, 3 members with 1 missing on creation and ever since.

Members on partitions with partition type set for auto-assembly.

Array was transported to a new machine.

Drive order got mixed up on transport: AB_ BA_
(figured that out on recovery)
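
A sketch of the kind of re-create involved (device names, order and the
read-only check here are illustrative, not the exact commands we ran):

  mdadm --create /dev/md0 -l 5 -n 3 --assume-clean /dev/sdb1 /dev/sda1 missing
  fsck -n /dev/md0    # verify the filesystem read-only before trusting the order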

On target machine boot-up (array not yet configured in mdadm.conf) 
Archlinux auto-assembled array with both drives as spares:

/dev/sda1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x0
      Array UUID : c051172d:52ed3e47:e8dc6dc8:8798f4c9
            Name : OncleGeorges:0
   Creation Time : Fri Aug  5 18:00:19 2011
      Raid Level : -unknown-
    Raid Devices : 0

  Avail Dev Size : 1953515969 (931.51 GiB 1000.20 GB)
     Data Offset : 2048 sectors
    Super Offset : 8 sectors
           State : active
     Device UUID : a8b44e10:ff04d973:c5f92933:3c9e124f

     Update Time : Sat Apr 21 19:14:34 2012
        Checksum : 8b57fb27 - correct
          Events : 1


    Device Role : spare
    Array State :  ('A' == active, '.' == missing)

/dev/sdb1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x0
      Array UUID : c051172d:52ed3e47:e8dc6dc8:8798f4c9
            Name : OncleGeorges:0
   Creation Time : Fri Aug  5 18:00:19 2011
      Raid Level : -unknown-
    Raid Devices : 0

  Avail Dev Size : 2046769231 (975.98 GiB 1047.95 GB)
   Used Dev Size : 1953515969 (931.51 GiB 1000.20 GB)
     Data Offset : 2048 sectors
    Super Offset : 8 sectors
           State : active
     Device UUID : d2594375:c8adc5a0:53938a24:9bd5c6be

     Update Time : Sat Apr 21 19:14:34 2012
        Checksum : 83dd0895 - correct
          Events : 1


    Device Role : spare
    Array State :  ('A' == active, '.' == missing)


Versions:

Linux HostName 3.3.4-2-ARCH #1 SMP PREEMPT Wed May 2 15:39:58 UTC 2012 
i686 AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ AuthenticAMD GNU/Linux

mdadm - v3.2.3 - 23rd December 2011

Greetings,

Pierre Beck

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Raid5 crashed, need comments on possible repair solution
  2012-05-12 17:19       ` Pierre Beck
@ 2012-05-14 21:00         ` C.J. Adams-Collier KF7BMP
  0 siblings, 0 replies; 6+ messages in thread
From: C.J. Adams-Collier KF7BMP @ 2012-05-14 21:00 UTC (permalink / raw)
  To: Pierre Beck; +Cc: NeilBrown, linux-raid

[-- Attachment #1: Type: text/plain, Size: 4186 bytes --]

Thank you Pierre,

This may help me.  I've got an array of 6 disks.  I moved the disks from one
chassis to another, and at that time one of the disks dropped out of the
array.  I modified the partition table of sde, as the new system called
it, since its partition table was blank.

Once I had made the partition table the same as on all the other drives in
the array and ran mdadm -a /dev/md0 /dev/sde2, /proc/mdstat indicated that
it was rebuilding the array.  It took a day and a half or so, and it
looked like it was going to complete before I woke up in the morning, so
I went to sleep when it was at 98.8%, with about 300 minutes left in the
rebuild at the current rate.

I was doing a 500G copy at the time, so the long duration to complete
1.2% seemed reasonable to me.

When I woke up in the morning, the array showed _UUUU_, with sde and sdg
now having fallen out of the array.  I have since shut the array down
and want some advice for how to move forward.  I've got 5 new 1T disks
in the mail, and they should probably arrive today.  I've got a sixth
here on my desk, but it has some of the data that was potentially lost,
so I don't really want to use it in the new array.  I'll set it up as a
spare once the recovery is complete.

So, considering that I've got enough storage to duplicate the current
state of the disks at a block level, can you advise me on next steps?
Mine look like this:

1) wait until new drives arrive
2) dd if=/dev/sd$old of=/dev/sd$new
3) ???
4) profit!
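
For step 2 I'm thinking of something like this per disk (sdX/sdY are
placeholders -- I'd verify the old/new mapping against serial numbers first):

  # whole-disk clone of one old member onto its replacement
  dd if=/dev/sdX of=/dev/sdY bs=4M
  # if a source disk throws read errors, GNU ddrescue is the safer tool:
  #   ddrescue -f /dev/sdX /dev/sdY sdX.map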

Cheers,

C.J.


On Sat, 2012-05-12 at 19:19 +0200, Pierre Beck wrote:
> Hi,
> 
> we had a case on IRC of an all-spares auto-assembly with "Raid Level :
> -unknown-". We recovered by re-creating the array. Since more data is always
> good, I'm adding this to the thread in the hope that it helps confirm that
> the bug is fixed by the patch.
> 
> RAID-5, 3 members with 1 missing on creation and ever since.
> 
> Members on partitions with partition type set for auto-assembly.
> 
> Array was transported to a new machine.
> 
> Drive order got mixed up on transport: AB_ BA_
> (figured that out on recovery)
> 
> On target machine boot-up (array not yet configured in mdadm.conf) 
> Archlinux auto-assembled array with both drives as spares:
> 
> /dev/sda1:
>            Magic : a92b4efc
>          Version : 1.2
>      Feature Map : 0x0
>       Array UUID : c051172d:52ed3e47:e8dc6dc8:8798f4c9
>             Name : OncleGeorges:0
>    Creation Time : Fri Aug  5 18:00:19 2011
>       Raid Level : -unknown-
>     Raid Devices : 0
> 
>   Avail Dev Size : 1953515969 (931.51 GiB 1000.20 GB)
>      Data Offset : 2048 sectors
>     Super Offset : 8 sectors
>            State : active
>      Device UUID : a8b44e10:ff04d973:c5f92933:3c9e124f
> 
>      Update Time : Sat Apr 21 19:14:34 2012
>         Checksum : 8b57fb27 - correct
>           Events : 1
> 
> 
>     Device Role : spare
>     Array State :  ('A' == active, '.' == missing)
> 
> /dev/sdb1:
>            Magic : a92b4efc
>          Version : 1.2
>      Feature Map : 0x0
>       Array UUID : c051172d:52ed3e47:e8dc6dc8:8798f4c9
>             Name : OncleGeorges:0
>    Creation Time : Fri Aug  5 18:00:19 2011
>       Raid Level : -unknown-
>     Raid Devices : 0
> 
>   Avail Dev Size : 2046769231 (975.98 GiB 1047.95 GB)
>    Used Dev Size : 1953515969 (931.51 GiB 1000.20 GB)
>      Data Offset : 2048 sectors
>     Super Offset : 8 sectors
>            State : active
>      Device UUID : d2594375:c8adc5a0:53938a24:9bd5c6be
> 
>      Update Time : Sat Apr 21 19:14:34 2012
>         Checksum : 83dd0895 - correct
>           Events : 1
> 
> 
>     Device Role : spare
>     Array State :  ('A' == active, '.' == missing)
> 
> 
> Versions:
> 
> Linux HostName 3.3.4-2-ARCH #1 SMP PREEMPT Wed May 2 15:39:58 UTC 2012 
> i686 AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ AuthenticAMD GNU/Linux
> 
> mdadm - v3.2.3 - 23rd December 2011
> 
> Greetings,
> 
> Pierre Beck


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread
