Re: Help needed recovering from raid failure

* Re: Help needed recovering from raid failure
@ 2015-04-29 18:17 Peter van Es
  2015-04-29 23:27 ` NeilBrown
  0 siblings, 1 reply; 7+ messages in thread
From: Peter van Es @ 2015-04-29 18:17 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

Dear Neil,

first of all, I really appreciate you trying to help me. This is the first time I’m deploying software raid, so really appreciate the guidance.

> On 29 Apr 2015, at 00:26, NeilBrown <neilb@suse.de> wrote:
> 
> This isn't really reporting anything new.
> There is probably a daily cron job which reports all degraded arrays.  This
> message is reported by that job.

I understand...

> 
> 
> Why do you think the array is off-line?  The above message doesn't suggest
> that.
> 

My Ubuntu server was accessible through ssh but did not serve webpages, files etc. When I went to the console, 
it told me it had taken the array offline because of degraded /dev/sdd2 and /dev/sdc2
Those two drives were out of the array. 

> 
>> 
>> Needless to say, I can't boot the system anymore as the boot drive is /dev/md0, and GRUB can't
>> get at it. I do need to recover data (I know, but there's stuf on there I have no backup for--yet).
> 
> You boot off a RAID5?  Does grub support that?  I didn't know.
> But md0 hasn't failed, has it?
> 
> Confused.

Well, it took a little time but yes, I managed to define a raid 5 array that the system was able to boot from. 

> There is something VERY sick here.  I suggest that you tread very carefully.
> 
> All your '1' partitions should be about 2GB and the '2' parititions about 2TB
> 
> But the --examine output suggests sda2 and sdb2 are 2TB, while sdd2 and sde2
> are 2GB.
> 
> That really really shouldn't happen.  Maybe check your partition table
> (fdisk).
> I really cannot see how this would happen.

But this question, and the previous question you asked, tell me a little of what I may have done…

I think confused /dev/md0 and /dev/md1 (now called /dev/md126 and /dev/md127 when running of the USB stick). 

/dev/md0 is a swap array (around 6GB, comprised of 4 x 2 GB in raid 5)
/dev/md1 is the boot and data array (around 5 TB, comprised of 4 x ~2 TB in raid 5) 

I must have confused them and tried to add the /dev/sdc2 and /dev/sdd2 drive to the /dev/md0 array (mdadm —add /dev/md0 /dev/sdc2)
instead of to the /dev/md1 array.  They were  then added as spare drives, their superblocks were overwritten, but since
a) no swap space was used, and 
b) they were added as spares

The data should not have been overwritten.

> 
> Can you
>  mdadm -Ss
> 
> to stop all the arrays, then
> 
>  fdisk -l /dev/sd?
> 
> then 
> 
>  mdadm -Esvv
> 

Neil, here they are: again, I appreciate you taking the time and guiding me through this!

Is there any way to resurrect the super blocks and try to force assemble the array, skipping the failing drive /dev/sdd2 (the /dev/sdd2 drive created some errors I observed in the log, /dev/sdc2 must have had a one off issue to be taken out….). I have two new drives (arrived today), and a new SSD drive. I would want to get the new array assembled using /dev/sdc2 perhaps forcing it back to the array geometry and “hoping for the best” and then install a new /dev/sdd2 to be recovered. Then I’ll create a boot and swap drive off the SSD which means that any array failures should not prevent the system from booting…

Requested outputs are below

Thanks, 

Peter

fdisk output: (USB devices deleted)

Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000f24ee

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1            2048     3905535     1951744   fd  Linux raid autodetect
/dev/sda2   *     3905536  3907028991  1951561728   fd  Linux raid autodetect

Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00029d5c

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1            2048     3905535     1951744   fd  Linux raid autodetect
/dev/sdb2   *     3905536  3907028991  1951561728   fd  Linux raid autodetect

Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000727bf

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1            2048     3905535     1951744   fd  Linux raid autodetect
/dev/sdd2   *     3905536  3907028991  1951561728   fd  Linux raid autodetect

Disk /dev/sde: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0009fe7f

   Device Boot      Start         End      Blocks   Id  System
/dev/sde1            2048     3905535     1951744   fd  Linux raid autodetect
/dev/sde2   *     3905536  3907028991  1951561728   fd  Linux raid autodetect

mdadm -Esvv output (USB devices deleted)

/dev/sde2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : dbe238a3:c7a528c1:a1b78589:276ecfcf
           Name : ubuntu:0  (local to host ubuntu)
  Creation Time : Wed Apr  1 22:27:42 2015
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3903121408 (1861.15 GiB 1998.40 GB)
     Array Size : 5850624 (5.58 GiB 5.99 GB)
  Used Dev Size : 3900416 (1904.82 MiB 1997.01 MB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : cdae3287:91168194:942ba99d:1a85c466

    Update Time : Wed Apr 29 17:46:25 2015
       Checksum : b8b84dad - correct
         Events : 30

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : spare
   Array State : AAAA ('A' == active, '.' == missing)
/dev/sde1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : dbe238a3:c7a528c1:a1b78589:276ecfcf
           Name : ubuntu:0  (local to host ubuntu)
  Creation Time : Wed Apr  1 22:27:42 2015
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3901440 (1905.32 MiB 1997.54 MB)
     Array Size : 5850624 (5.58 GiB 5.99 GB)
  Used Dev Size : 3900416 (1904.82 MiB 1997.01 MB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : b051f523:4887e729:cd63bed1:8c2a7575

    Update Time : Wed Apr 29 17:46:25 2015
       Checksum : 453ddeef - correct
         Events : 30

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 3
   Array State : AAAA ('A' == active, '.' == missing)
/dev/sde:
   MBR Magic : aa55
Partition[0] :      3903488 sectors at         2048 (type fd)
Partition[1] :   3903123456 sectors at      3905536 (type fd)
/dev/sdd2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : dbe238a3:c7a528c1:a1b78589:276ecfcf
           Name : ubuntu:0  (local to host ubuntu)
  Creation Time : Wed Apr  1 22:27:42 2015
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3903121408 (1861.15 GiB 1998.40 GB)
     Array Size : 5850624 (5.58 GiB 5.99 GB)
  Used Dev Size : 3900416 (1904.82 MiB 1997.01 MB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 0f3f2b91:09cbb344:e52c4c4b:722d65c4

    Update Time : Wed Apr 29 17:46:25 2015
       Checksum : 7e273c0f - correct
         Events : 30

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : spare
   Array State : AAAA ('A' == active, '.' == missing)
/dev/sdd1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : dbe238a3:c7a528c1:a1b78589:276ecfcf
           Name : ubuntu:0  (local to host ubuntu)
  Creation Time : Wed Apr  1 22:27:42 2015
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3901440 (1905.32 MiB 1997.54 MB)
     Array Size : 5850624 (5.58 GiB 5.99 GB)
  Used Dev Size : 3900416 (1904.82 MiB 1997.01 MB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : b6668730:3b1380bf:556700d9:30df829c

    Update Time : Wed Apr 29 17:46:25 2015
       Checksum : 15b83814 - correct
         Events : 30

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAAA ('A' == active, '.' == missing)
/dev/sdd:
   MBR Magic : aa55
Partition[0] :      3903488 sectors at         2048 (type fd)
Partition[1] :   3903123456 sectors at      3905536 (type fd)
/dev/sdb2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 1f28f7bb:7b3ecd41:ca0fa5d1:ccd008df
           Name : ubuntu:1  (local to host ubuntu)
  Creation Time : Wed Apr  1 22:27:58 2015
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3902861312 (1861.03 GiB 1998.26 GB)
     Array Size : 5854290432 (5583.09 GiB 5994.79 GB)
  Used Dev Size : 3902860288 (1861.03 GiB 1998.26 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : f1e79609:79b7ac23:55197f70:e8fbfd58

    Update Time : Sun Apr 26 05:59:13 2015
       Checksum : 696f4e76 - correct
         Events : 18014

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AA.. ('A' == active, '.' == missing)
/dev/sdb1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : dbe238a3:c7a528c1:a1b78589:276ecfcf
           Name : ubuntu:0  (local to host ubuntu)
  Creation Time : Wed Apr  1 22:27:42 2015
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3901440 (1905.32 MiB 1997.54 MB)
     Array Size : 5850624 (5.58 GiB 5.99 GB)
  Used Dev Size : 3900416 (1904.82 MiB 1997.01 MB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : f52239b1:0fb87e7e:71e29ea4:bf67184a

    Update Time : Wed Apr 29 17:46:25 2015
       Checksum : ce9c9cd0 - correct
         Events : 30

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAAA ('A' == active, '.' == missing)
/dev/sdb:
   MBR Magic : aa55
Partition[0] :      3903488 sectors at         2048 (type fd)
Partition[1] :   3903123456 sectors at      3905536 (type fd)
/dev/sda2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 1f28f7bb:7b3ecd41:ca0fa5d1:ccd008df
           Name : ubuntu:1  (local to host ubuntu)
  Creation Time : Wed Apr  1 22:27:58 2015
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3902861312 (1861.03 GiB 1998.26 GB)
     Array Size : 5854290432 (5583.09 GiB 5994.79 GB)
  Used Dev Size : 3902860288 (1861.03 GiB 1998.26 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 713e556d:ca104217:785db68a:d820a57b

    Update Time : Sun Apr 26 05:59:13 2015
       Checksum : fda151f9 - correct
         Events : 18014

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AA.. ('A' == active, '.' == missing)
/dev/sda1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : dbe238a3:c7a528c1:a1b78589:276ecfcf
           Name : ubuntu:0  (local to host ubuntu)
  Creation Time : Wed Apr  1 22:27:42 2015
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3901440 (1905.32 MiB 1997.54 MB)
     Array Size : 5850624 (5.58 GiB 5.99 GB)
  Used Dev Size : 3900416 (1904.82 MiB 1997.01 MB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : c483532d:06f93351:cfdf5a92:e83855b5

    Update Time : Wed Apr 29 17:46:25 2015
       Checksum : 76650d1c - correct
         Events : 30

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAAA ('A' == active, '.' == missing)
/dev/sda:
   MBR Magic : aa55
Partition[0] :      3903488 sectors at         2048 (type fd)
Partition[1] :   3903123456 sectors at      3905536 (type fd)--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread