From: Olivier Swinkels <olivier.swinkels@gmail.com>
To: linux-raid@vger.kernel.org
Subject: [RAID recovery] Unable to recover RAID5 array after disk failure
Date: Fri, 3 Mar 2017 22:35:03 +0100
Message-ID: <CAJ0QwkJqRbJtM9HiXL+cCj2TkQQRahGgWTwD9QKH_CBfbJeLKA@mail.gmail.com>

Hi,

I'm in quite a pickle here: I can't recover from a disk failure on my
6-disk RAID5 array.
Any help would really be appreciated!

Please bear with me as I lay out the steps that got me here:

- I got a message that my RAID array had gone down because 3 disks
appeared to have failed. I've dealt with this before, and usually it
meant that one disk failed and took out the whole SATA controller.

- One of the disks was quite old and the two others quite new (<1 year),
so I removed the old drive and the controller came back up. I tried to
reassemble the RAID using:   sudo mdadm -v --assemble --force /dev/md0
 /dev/sdb /dev/sdc /dev/sdg /dev/sdf /dev/sde

- However, I got the message:
mdadm: /dev/md0 assembled from 4 drives - not enough to start the array.
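
(In hindsight, the first thing to do at this point is to compare what
each member records about the array: event counter, update time and
state. A read-only sketch, using the same device names as above:

for d in /dev/sdb /dev/sdc /dev/sde /dev/sdf /dev/sdg; do
    echo "== $d =="
    sudo mdadm --examine "$d" | grep -E 'Events|Update Time|State :'
done

A member whose event count lags behind the others is stale; that is
exactly the mismatch a forced assemble is meant to paper over.)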

- This got me worried, and this is where I screwed up:

- Against the recommendations on the wiki I tried to recover the RAID
using a re-create:
sudo mdadm --verbose --create --assume-clean --level=5
--raid-devices=6 /dev/md0 /dev/sdb /dev/sdc missing /dev/sdg /dev/sdf
/dev/sde
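
(For the record, what I should have done first, per the wiki, is put
copy-on-write overlays on top of the members so that any --create
experiment is reversible. A rough sketch of that approach; the overlay
paths and sizes are illustrative, not something I actually ran:

for d in sdb sdc sde sdf sdg; do
    truncate -s 4G "/tmp/overlay-$d"                 # sparse COW file
    loop=$(sudo losetup -f --show "/tmp/overlay-$d")
    size=$(sudo blockdev --getsz "/dev/$d")          # size in 512-byte sectors
    echo "0 $size snapshot /dev/$d $loop P 8" | sudo dmsetup create "overlay-$d"
done

Any subsequent mdadm --create would then target /dev/mapper/overlay-*
and leave the real disks untouched.)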

- The second mistake I made was that I forgot to specify the correct
superblock version and chunk size.
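
(Those values were still sitting in the old superblocks. Something as
simple as this would have recovered them before any --create, assuming
the --examine dump below had been taken first:

sudo mdadm --examine /dev/sdb | grep -E 'Version|Chunk Size|Layout|Used Dev Size'

which for the original array reports metadata 0.90.00, a 64K chunk and
a left-symmetric layout.)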

- The resulting array did not look right, as I couldn't find the LVM
volume that should be on it.

- Subsequently the SATA controller went down again, so my assumption
about which disk had failed was also incorrect and I had disconnected
the wrong disk.

- After some trial and error I found out that one of the newer disks was
the culprit, and I tried to recover the RAID by re-creating the array
with the healthy disks and the correct superblock configuration using:
sudo mdadm --verbose --create --bitmap=none --chunk=64 --metadata=0.90
--assume-clean --level=5 --raid-devices=6 /dev/md0 /dev/sdb missing
/dev/sdc /dev/sdf /dev/sde /dev/sdd
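
(A note on the ordering: the /dev/sdX letters get reshuffled every time
the controller drops out, so the only reliable way to map today's names
to the original roles is by drive serial number. For reference,
something like

lsblk -d -o NAME,MODEL,SERIAL /dev/sd[b-g]

lists the serials, which can then be compared with the lsdrv output
further down.)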

- This gives me a degraded array, but unfortunately the LVM volume is
still not available.
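
(To be concrete about "not available": these are the kind of read-only
checks I mean, sketched against the current array name:

sudo blkid /dev/md0    # shows TYPE="LVM2_member" if the PV header is intact
sudo pvscan            # rescan for LVM physical volumes
sudo vgscan            # rescan for volume groups
sudo lvs               # list any logical volumes LVM can see

The PV signature is apparently still detected, as lsdrv below shows the
array as an inactive LVM2_member, but the volume group does not come up.)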

- Is this situation still recoverable?


===============================================================================
===============================================================================
- Below is the output of "mdadm --examine /dev/sd*" BEFORE the first
create action.

/dev/sdb:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7af7d0ad:b37b1b49:151db09a:a68c27d9 (local to host
horus-server)
  Creation Time : Sun Apr 10 17:59:16 2011
     Raid Level : raid5
  Used Dev Size : 1953514496 (1863.02 GiB 2000.40 GB)
     Array Size : 9767572480 (9315.08 GiB 10001.99 GB)
   Raid Devices : 6
  Total Devices : 3
Preferred Minor : 0

    Update Time : Fri Feb 24 16:31:02 2017
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 3
  Spare Devices : 0
       Checksum : ae0d0dec - correct
         Events : 51108

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       16        0      active sync   /dev/sdb

   0     0       8       16        0      active sync   /dev/sdb
   1     1       0        0        1      active sync
   2     2       0        0        2      faulty removed
   3     3       8       96        3      active sync   /dev/sdg
   4     4       8       80        4      active sync   /dev/sdf
   5     5       0        0        5      faulty removed
/dev/sdc:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7af7d0ad:b37b1b49:151db09a:a68c27d9 (local to host
horus-server)
  Creation Time : Sun Apr 10 17:59:16 2011
     Raid Level : raid5
  Used Dev Size : 1953514496 (1863.02 GiB 2000.40 GB)
     Array Size : 9767572480 (9315.08 GiB 10001.99 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0

    Update Time : Fri Feb 24 02:01:04 2017
          State : clean
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0
       Checksum : ae0c42ac - correct
         Events : 51108

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       32        1      active sync   /dev/sdc

   0     0       8       16        0      active sync   /dev/sdb
   1     1       8       32        1      active sync   /dev/sdc
   2     2       8       48        2      active sync   /dev/sdd
   3     3       8       96        3      active sync   /dev/sdg
   4     4       8       80        4      active sync   /dev/sdf
   5     5       8       64        5      active sync   /dev/sde
/dev/sde:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7af7d0ad:b37b1b49:151db09a:a68c27d9 (local to host
horus-server)
  Creation Time : Sun Apr 10 17:59:16 2011
     Raid Level : raid5
  Used Dev Size : 1953514496 (1863.02 GiB 2000.40 GB)
     Array Size : 9767572480 (9315.08 GiB 10001.99 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0

    Update Time : Fri Feb 24 02:01:04 2017
          State : clean
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0
       Checksum : ae0c42c0 - correct
         Events : 51088

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     5       8       64        5      active sync   /dev/sde

   0     0       8       16        0      active sync   /dev/sdb
   1     1       8       32        1      active sync   /dev/sdc
   2     2       8       48        2      active sync   /dev/sdd
   3     3       8       96        3      active sync   /dev/sdg
   4     4       8       80        4      active sync   /dev/sdf
   5     5       8       64        5      active sync   /dev/sde
/dev/sdf:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7af7d0ad:b37b1b49:151db09a:a68c27d9 (local to host
horus-server)
  Creation Time : Sun Apr 10 17:59:16 2011
     Raid Level : raid5
  Used Dev Size : 1953514496 (1863.02 GiB 2000.40 GB)
     Array Size : 9767572480 (9315.08 GiB 10001.99 GB)
   Raid Devices : 6
  Total Devices : 3
Preferred Minor : 0

    Update Time : Fri Feb 24 16:31:02 2017
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 3
  Spare Devices : 0
       Checksum : ae0d0e37 - correct
         Events : 51108

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     4       8       80        4      active sync   /dev/sdf

   0     0       8       16        0      active sync   /dev/sdb
   1     1       0        0        1      faulty removed
   2     2       0        0        2      faulty removed
   3     3       8       96        3      active sync   /dev/sdg
   4     4       8       80        4      active sync   /dev/sdf
   5     5       0        0        5      faulty removed
/dev/sdg:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7af7d0ad:b37b1b49:151db09a:a68c27d9 (local to host
horus-server)
  Creation Time : Sun Apr 10 17:59:16 2011
     Raid Level : raid5
  Used Dev Size : 1953514496 (1863.02 GiB 2000.40 GB)
     Array Size : 9767572480 (9315.08 GiB 10001.99 GB)
   Raid Devices : 6
  Total Devices : 3
Preferred Minor : 0

    Update Time : Fri Feb 24 16:31:02 2017
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 3
  Spare Devices : 0
       Checksum : ae0d0e45 - correct
         Events : 51108

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8       96        3      active sync   /dev/sdg

   0     0       8       16        0      active sync   /dev/sdb
   1     1       0        0        1      faulty removed
   2     2       0        0        2      faulty removed
   3     3       8       96        3      active sync   /dev/sdg
   4     4       8       80        4      active sync   /dev/sdf
   5     5       0        0        5      faulty removed

===============================================================================
===============================================================================
- Below is the status of the current situation:
===============================================================================
===============================================================================

- Phil Turmel's lsdrv:

sudo ./lsdrv
[sudo] password for horus:
PCI [ata_piix] 00:1f.2 IDE interface: Intel Corporation NM10/ICH7
Family SATA Controller [IDE mode] (rev 02)
├scsi 0:0:0:0 ATA      Samsung SSD 850  {S21UNX0H601730R}
│└sda 111.79g [8:0] Partitioned (dos)
│ └sda1 97.66g [8:1] Partitioned (dos) {a2d2e5b3-cef5-44f8-83a7-3c25f285c7b4}
│  └Mounted as /dev/sda1 @ /
└scsi 1:0:0:0 ATA      SAMSUNG HD204UI  {S2H7JD2B201244}
 └sdb 1.82t [8:16] MD raid5 (0/6) (w/ sdc,sdd,sde,sdf) in_sync
{4c0518af-d198-d804-151d-b09aa68c27d9}
  └md0 9.10t [9:0] MD v0.90 raid5 (6) clean DEGRADED, 64k Chunk
{4c0518af:d198d804:151db09a:a68c27d9}
                   PV LVM2_member (inactive)
{DWv51O-lg9s-Dl4w-EBp9-QeIF-Vv60-8wt2uS}
PCI [pata_jmicron] 01:00.1 IDE interface: JMicron Technology Corp.
JMB363 SATA/IDE Controller (rev 03)
├scsi 2:x:x:x [Empty]
└scsi 3:x:x:x [Empty]
PCI [ahci] 01:00.0 SATA controller: JMicron Technology Corp. JMB363
SATA/IDE Controller (rev 03)
├scsi 4:0:0:0 ATA      SAMSUNG HD204UI  {S2H7JD2B201246}
│└sdc 1.82t [8:32] MD raid5 (2/6) (w/ sdb,sdd,sde,sdf) in_sync
{4c0518af-d198-d804-151d-b09aa68c27d9}
│ └md0 9.10t [9:0] MD v0.90 raid5 (6) clean DEGRADED, 64k Chunk
{4c0518af:d198d804:151db09a:a68c27d9}
│                  PV LVM2_member (inactive)
{DWv51O-lg9s-Dl4w-EBp9-QeIF-Vv60-8wt2uS}
├scsi 4:1:0:0 ATA      WDC WD40EFRX-68W {WD-WCC4E6JF3EE3}
│└sdd 3.64t [8:48] MD raid5 (5/6) (w/ sdb,sdc,sde,sdf) in_sync
{4c0518af-d198-d804-151d-b09aa68c27d9}
│ ├md0 9.10t [9:0] MD v0.90 raid5 (6) clean DEGRADED, 64k Chunk
{4c0518af:d198d804:151db09a:a68c27d9}
│ │                PV LVM2_member (inactive)
{DWv51O-lg9s-Dl4w-EBp9-QeIF-Vv60-8wt2uS}
└scsi 5:x:x:x [Empty]
PCI [ahci] 05:00.0 SATA controller: Marvell Technology Group Ltd.
88SE9120 SATA 6Gb/s Controller (rev 12)
├scsi 6:0:0:0 ATA      Hitachi HDS72202 {JK11A1YAJN30GV}
│└sde 1.82t [8:64] MD raid5 (4/6) (w/ sdb,sdc,sdd,sdf) in_sync
{4c0518af-d198-d804-151d-b09aa68c27d9}
│ └md0 9.10t [9:0] MD v0.90 raid5 (6) clean DEGRADED, 64k Chunk
{4c0518af:d198d804:151db09a:a68c27d9}
│                  PV LVM2_member (inactive)
{DWv51O-lg9s-Dl4w-EBp9-QeIF-Vv60-8wt2uS}
└scsi 7:0:0:0 ATA      Hitachi HDS72202 {JK1174YAH779AW}
 └sdf 1.82t [8:80] MD raid5 (3/6) (w/ sdb,sdc,sdd,sde) in_sync
{4c0518af-d198-d804-151d-b09aa68c27d9}
  └md0 9.10t [9:0] MD v0.90 raid5 (6) clean DEGRADED, 64k Chunk
{4c0518af:d198d804:151db09a:a68c27d9}
                   PV LVM2_member (inactive)
{DWv51O-lg9s-Dl4w-EBp9-QeIF-Vv60-8wt2uS}

===============================================================================
===============================================================================
cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
[raid4] [raid10]
md0 : active raid5 sdd[5] sde[4] sdf[3] sdc[2] sdb[0]
      9767572480 blocks level 5, 64k chunk, algorithm 2 [6/5] [U_UUUU]

unused devices: <none>

===============================================================================
===============================================================================
This is the current, non-functional RAID:
sudo mdadm --detail /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Fri Mar  3 21:09:22 2017
     Raid Level : raid5
     Array Size : 9767572480 (9315.08 GiB 10001.99 GB)
  Used Dev Size : 1953514496 (1863.02 GiB 2000.40 GB)
   Raid Devices : 6
  Total Devices : 5
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Fri Mar  3 21:09:22 2017
          State : clean, degraded
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 4c0518af:d198d804:151db09a:a68c27d9 (local to host
horus-server)
         Events : 0.1

    Number   Major   Minor   RaidDevice State
       0       8       16        0      active sync   /dev/sdb
       2       0        0        2      removed
       2       8       32        2      active sync   /dev/sdc
       3       8       80        3      active sync   /dev/sdf
       4       8       64        4      active sync   /dev/sde
       5       8       48        5      active sync   /dev/sdd


===============================================================================
===============================================================================

sudo mdadm --examine /dev/sd*
/dev/sda:
   MBR Magic : aa55
Partition[0] :    204800000 sectors at         2048 (type 83)
/dev/sda1:
   MBR Magic : aa55
/dev/sdb:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 4c0518af:d198d804:151db09a:a68c27d9 (local to host
horus-server)
  Creation Time : Fri Mar  3 21:09:22 2017
     Raid Level : raid5
  Used Dev Size : 1953514496 (1863.02 GiB 2000.40 GB)
     Array Size : 9767572480 (9315.08 GiB 10001.99 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0

    Update Time : Fri Mar  3 21:09:22 2017
          State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 1
  Spare Devices : 0
       Checksum : a857f8f3 - correct
         Events : 1

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       16        0      active sync   /dev/sdb

   0     0       8       16        0      active sync   /dev/sdb
   1     0       0        0        0      spare
   2     2       8       32        2      active sync   /dev/sdc
   3     3       8       80        3      active sync   /dev/sdf
   4     4       8       64        4      active sync   /dev/sde
   5     5       8       48        5      active sync   /dev/sdd
/dev/sdc:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 4c0518af:d198d804:151db09a:a68c27d9 (local to host
horus-server)
  Creation Time : Fri Mar  3 21:09:22 2017
     Raid Level : raid5
  Used Dev Size : 1953514496 (1863.02 GiB 2000.40 GB)
     Array Size : 9767572480 (9315.08 GiB 10001.99 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0

    Update Time : Fri Mar  3 21:09:22 2017
          State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 1
  Spare Devices : 0
       Checksum : a857f907 - correct
         Events : 1

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8       32        2      active sync   /dev/sdc

   0     0       8       16        0      active sync   /dev/sdb
   1     0       0        0        0      spare
   2     2       8       32        2      active sync   /dev/sdc
   3     3       8       80        3      active sync   /dev/sdf
   4     4       8       64        4      active sync   /dev/sde
   5     5       8       48        5      active sync   /dev/sdd
/dev/sdd:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 4c0518af:d198d804:151db09a:a68c27d9 (local to host
horus-server)
  Creation Time : Fri Mar  3 21:09:22 2017
     Raid Level : raid5
  Used Dev Size : 1953514496 (1863.02 GiB 2000.40 GB)
     Array Size : 9767572480 (9315.08 GiB 10001.99 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0

    Update Time : Fri Mar  3 21:09:22 2017
          State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 1
  Spare Devices : 0
       Checksum : a857f91d - correct
         Events : 1

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     5       8       48        5      active sync   /dev/sdd

   0     0       8       16        0      active sync   /dev/sdb
   1     0       0        0        0      spare
   2     2       8       32        2      active sync   /dev/sdc
   3     3       8       80        3      active sync   /dev/sdf
   4     4       8       64        4      active sync   /dev/sde
   5     5       8       48        5      active sync   /dev/sdd
mdadm: No md superblock detected on /dev/sdd1.
/dev/sde:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 4c0518af:d198d804:151db09a:a68c27d9 (local to host
horus-server)
  Creation Time : Fri Mar  3 21:09:22 2017
     Raid Level : raid5
  Used Dev Size : 1953514496 (1863.02 GiB 2000.40 GB)
     Array Size : 9767572480 (9315.08 GiB 10001.99 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0

    Update Time : Fri Mar  3 21:09:22 2017
          State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 1
  Spare Devices : 0
       Checksum : a857f92b - correct
         Events : 1

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     4       8       64        4      active sync   /dev/sde

   0     0       8       16        0      active sync   /dev/sdb
   1     0       0        0        0      spare
   2     2       8       32        2      active sync   /dev/sdc
   3     3       8       80        3      active sync   /dev/sdf
   4     4       8       64        4      active sync   /dev/sde
   5     5       8       48        5      active sync   /dev/sdd
/dev/sdf:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 4c0518af:d198d804:151db09a:a68c27d9 (local to host
horus-server)
  Creation Time : Fri Mar  3 21:09:22 2017
     Raid Level : raid5
  Used Dev Size : 1953514496 (1863.02 GiB 2000.40 GB)
     Array Size : 9767572480 (9315.08 GiB 10001.99 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0

    Update Time : Fri Mar  3 21:09:22 2017
          State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 1
  Spare Devices : 0
       Checksum : a857f939 - correct
         Events : 1

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8       80        3      active sync   /dev/sdf

   0     0       8       16        0      active sync   /dev/sdb
   1     0       0        0        0      spare
   2     2       8       32        2      active sync   /dev/sdc
   3     3       8       80        3      active sync   /dev/sdf
   4     4       8       64        4      active sync   /dev/sde
   5     5       8       48        5      active sync   /dev/sdd
