* Trouble recovering raid5 array
From: Andrey Falko @ 2009-02-28  7:56 UTC
  To: linux-raid

Hi everyone,

I'm having some strange problems putting one of my raid5 arrays back
together. Here is the background story:

I have 4 drives partitioned into a number of raid arrays. One of the
drives failed and I replaced it with a new one. I was able to get
mdadm to recover all of the arrays except one raid5 array. The
troubled array is /dev/md8, which is supposed to have /dev/sd[abcd]13
under it.

This command started the recovery process (the same thing that worked
for my other raid5 arrays):
mdadm --manage --add /dev/md8 /dev/sdc13

md8 : active raid5 sdc13[4] sdd13[3] sdb13[1] sda13[0]
      117185856 blocks level 5, 64k chunk, algorithm 2 [4/3] [UU_U]
      [>....................]  recovery =  1.6% (634084/39061952) finish=12.1min speed=52840K/sec

However, sometime after recovery passed 1.6%, I did a "cat /proc/mdstat" and saw:

md8 : active raid5 sdc13[4](S) sdd13[3] sdb13[1] sda13[5](F)
      117185856 blocks level 5, 64k chunk, algorithm 2 [4/2] [_U_U]

/dev/sda "failed" only for this array, not for any of the other
arrays. I tried removing and re-adding /dev/sda13 and /dev/sdc13, but
that did not work. The remove/re-add sequence was roughly the
following (from memory, so treat it as approximate):
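
# mdadm --manage /dev/md8 --remove /dev/sda13
# mdadm --manage /dev/md8 --add /dev/sda13
# mdadm --manage /dev/md8 --remove /dev/sdc13
# mdadm --manage /dev/md8 --add /dev/sdc13

When that still did not get the array rebuilt, I ran the following: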

# mdadm --examine /dev/sda13
/dev/sda13:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : ec0fab43:4fb8d991:e6a58c12:482d89e4
  Creation Time : Sat Sep 15 00:55:37 2007
     Raid Level : raid5
  Used Dev Size : 39061952 (37.25 GiB 40.00 GB)
     Array Size : 117185856 (111.76 GiB 120.00 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 8

    Update Time : Sat Feb 28 00:49:48 2009
          State : clean
 Active Devices : 2
Working Devices : 4
 Failed Devices : 1
  Spare Devices : 2
       Checksum : a6b02b9d - correct
         Events : 0.36

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     4       8       13        4      spare   /dev/sda13

   0     0       0        0        0      removed
   1     1       8       29        1      active sync   /dev/sdb13
   2     2       0        0        2      faulty removed
   3     3       8       61        3      active sync   /dev/sdd13
   4     4       8       13        4      spare   /dev/sda13
   5     5       8       45        5      spare   /dev/sdc13
# mdadm --examine /dev/sdb13
/dev/sdb13:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : ec0fab43:4fb8d991:e6a58c12:482d89e4
  Creation Time : Sat Sep 15 00:55:37 2007
     Raid Level : raid5
  Used Dev Size : 39061952 (37.25 GiB 40.00 GB)
     Array Size : 117185856 (111.76 GiB 120.00 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 8

    Update Time : Sat Feb 28 00:49:48 2009
          State : clean
 Active Devices : 2
Working Devices : 4
 Failed Devices : 1
  Spare Devices : 2
       Checksum : a6b02bad - correct
         Events : 0.36

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       29        1      active sync   /dev/sdb13

   0     0       0        0        0      removed
   1     1       8       29        1      active sync   /dev/sdb13
   2     2       0        0        2      faulty removed
   3     3       8       61        3      active sync   /dev/sdd13
   4     4       8       13        4      spare   /dev/sda13
   5     5       8       45        5      spare   /dev/sdc13
# mdadm --examine /dev/sdc13
/dev/sdc13:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : ec0fab43:4fb8d991:e6a58c12:482d89e4
  Creation Time : Sat Sep 15 00:55:37 2007
     Raid Level : raid5
  Used Dev Size : 39061952 (37.25 GiB 40.00 GB)
     Array Size : 117185856 (111.76 GiB 120.00 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 8

    Update Time : Sat Feb 28 00:49:48 2009
          State : clean
 Active Devices : 2
Working Devices : 4
 Failed Devices : 1
  Spare Devices : 2
       Checksum : a6b02bbf - correct
         Events : 0.36

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     5       8       45        5      spare   /dev/sdc13

   0     0       0        0        0      removed
   1     1       8       29        1      active sync   /dev/sdb13
   2     2       0        0        2      faulty removed
   3     3       8       61        3      active sync   /dev/sdd13
   4     4       8       13        4      spare   /dev/sda13
   5     5       8       45        5      spare   /dev/sdc13
# mdadm --examine /dev/sdd13
/dev/sdd13:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : ec0fab43:4fb8d991:e6a58c12:482d89e4
  Creation Time : Sat Sep 15 00:55:37 2007
     Raid Level : raid5
  Used Dev Size : 39061952 (37.25 GiB 40.00 GB)
     Array Size : 117185856 (111.76 GiB 120.00 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 8

    Update Time : Sat Feb 28 00:49:48 2009
          State : clean
 Active Devices : 2
Working Devices : 4
 Failed Devices : 1
  Spare Devices : 2
       Checksum : a6b02bd1 - correct
         Events : 0.36

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8       61        3      active sync   /dev/sdd13

   0     0       0        0        0      removed
   1     1       8       29        1      active sync   /dev/sdb13
   2     2       0        0        2      faulty removed
   3     3       8       61        3      active sync   /dev/sdd13
   4     4       8       13        4      spare   /dev/sda13
   5     5       8       45        5      spare   /dev/sdc13


At this point I started googling and ended up doing this:

# mdadm --stop /dev/md8
# mdadm --zero-superblock /dev/sda13
# mdadm --zero-superblock /dev/sdb13
# mdadm --zero-superblock /dev/sdc13
# mdadm --zero-superblock /dev/sdd13
# mdadm -A /dev/md8 /dev/sda13 /dev/sdb13 /dev/sdc13 /dev/sdd13 --force
mdadm: no recogniseable superblock on /dev/sda13
mdadm: /dev/sda13 has no superblock - assembly aborted
# mdadm --create /dev/md8 -l 5 -n 4 /dev/sda13 /dev/sdb13 /dev/sdc13 /dev/sdd13
mdadm: /dev/sda13 appears to contain an ext2fs file system
    size=117185856K  mtime=Fri Feb 27 17:22:46 2009
mdadm: /dev/sdd13 appears to contain an ext2fs file system
    size=117185856K  mtime=Fri Feb 27 17:22:46 2009
Continue creating array? y
mdadm: array /dev/md8 started.
# cat /proc/mdstat
md8 : active raid5 sdd13[4] sdc13[2] sdb13[1] sda13[0]
      117185856 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
      [>....................]  recovery =  0.7% (294244/39061952) finish=10.9min speed=58848K/sec

Again, very shortly after recovery passed 1.6%, it failed. Now I see the following:

md8 : active raid5 sdd13[4](S) sdc13[2] sdb13[1] sda13[5](F)
      117185856 blocks level 5, 64k chunk, algorithm 2 [4/2] [_UU_]

This surely does not look good. I hope that /dev/sda13 did not get
wrongly synced. Does anyone have suggestions on what I should do to
recover this array? Any ideas on what could have caused these issues?
How can /dev/sda13, a healthy part of the array, lose its superblock?
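
If it would help, I can also capture the kernel log from around the
time of the failure, e.g.:

# dmesg | tail -n 100

or pull the relevant portion of /var/log/messages.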

Let me know if you'd like any more info. The mdadm version is 2.6.4
and the kernel version is 2.6.24.3.

Thanks in advance for any help,
Andrey

* Re: Trouble recovering raid5 array
From: Andrey Falko @ 2009-03-01 18:27 UTC
  To: linux-raid

On Fri, Feb 27, 2009 at 11:56 PM, Andrey Falko <ma3oxuct@gmail.com> wrote:
> [...]

Hi everyone,

I upgraded mdadm from 2.6.4 to 2.6.8 and retried the last procedure
above. This time, recovery got to 16.2%, stalled there for about 8
seconds, and then showed me this:
md8 : active raid5 sdd13[4](S) sdc13[2] sdb13[1] sda13[5](F)
      117185856 blocks level 5, 64k chunk, algorithm 2 [4/2] [_UU_]
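
Unless someone has a better idea, I'll also check the drive itself
with smartmontools (assuming it is installed), in case the disk is
actually failing:

# smartctl -a /dev/sda

I can post that output too if it would be useful.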
