* 4 partition raid 5 with 2 disks active and 2 spare, how to force?
       [not found] <S1753093Ab0CYHZE/20100325072504Z+37@vger.kernel.org>
@ 2010-03-25  9:30 ` Anshuman Aggarwal
  2010-03-25 11:37   ` Michael Evans
  0 siblings, 1 reply; 16+ messages in thread
From: Anshuman Aggarwal @ 2010-03-25  9:30 UTC (permalink / raw)
  To: linux-raid

All, thanks in advance...particularly Neil.

My raid5 setup has 4 partitions, 2 of which are showing up as spare and 2 as active. The mdadm --assemble --force gives me the following error:
2 active devices and 2 spare cannot start device

It is a RAID 5 with superblock 1.2 and 4 devices in the order sda1, sdb5, sdc5, sdd5. I have LVM2 on top of this (along with other devices), so as you all know the data is irreplaceable, blah blah.

I know that this device has not been written to for a while, so the data can be considered intact (hopefully all) if I can get the device to start up...but I'm not sure of the best way to coax the kernel to assemble it. Relevant information follows:

=== This device is working fine === 
mdadm --examine  -e1.2 /dev/sdb5
/dev/sdb5:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
           Name : GATEWAY:127  (local to host GATEWAY)
  Creation Time : Sat Aug 22 09:44:21 2009
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
     Array Size : 1758296832 (838.42 GiB 900.25 GB)
  Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
    Data Offset : 272 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : f8ebb9f8:b447f894:d8b0b59f:ca8e98eb

Internal Bitmap : 2 sectors from superblock
    Update Time : Fri Mar 19 00:56:15 2010
       Checksum : 1005cfbc - correct
         Events : 3796145

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 2
   Array State : .AA. ('A' == active, '.' == missing)

=== This device is marked spare, can be marked active (IMHO) ===
mdadm --examine  -e1.2 /dev/sdd5
/dev/sdd5:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
           Name : GATEWAY:127  (local to host GATEWAY)
  Creation Time : Sat Aug 22 09:44:21 2009
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
     Array Size : 1758296832 (838.42 GiB 900.25 GB)
  Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
    Data Offset : 272 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 763a832f:1a9a7ea8:ce90d4a3:32e8ae54

Internal Bitmap : 2 sectors from superblock
    Update Time : Fri Mar 19 00:56:15 2010
       Checksum : c78aab46 - correct
         Events : 3796145

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : spare
   Array State : .AA. ('A' == active, '.' == missing)


=== This is the completely failed device (needs replacement)	=== 
mdadm --examine  -e1.2 /dev/sda1
[HANGS!!]



I already have the replacement drive available as sde5, but I want to be able to reconstruct as much as possible.

Thanks again,
Anshuman Aggarwal


* Re: 4 partition raid 5 with 2 disks active and 2 spare, how to force?
  2010-03-25  9:30 ` 4 partition raid 5 with 2 disks active and 2 spare, how to force? Anshuman Aggarwal
@ 2010-03-25 11:37   ` Michael Evans
  2010-03-25 14:09     ` Anshuman Aggarwal
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Evans @ 2010-03-25 11:37 UTC (permalink / raw)
  To: Anshuman Aggarwal; +Cc: linux-raid

On Thu, Mar 25, 2010 at 2:30 AM, Anshuman Aggarwal
<anshuman@brillgene.com> wrote:
> All, thanks in advance...particularly Neil.
>
> My raid5 setup has 4 partitions, 2 of which are showing up as spare and 2 as active. The mdadm --assemble --force gives me the following error:
> 2 active devices and 2 spare cannot start device
>
> it is a raid 5, with superblock 1.2, 4 devices in the order sda1, sdb5, sdc5, sdd5. I have lvm2 on top of this with other devices ...so as you all know data is irreplaceable blah blah.
>
> I know that this device has not been written to for a while, so the data can be considered intact (hopefully all) if I can get the device to start up...but I'm not sure of the best way to coax the kernel to assemble it. Relevant information follows:
>
> === This device is working fine ===
> mdadm --examine  -e1.2 /dev/sdb5
> /dev/sdb5:
>          Magic : a92b4efc
>        Version : 1.2
>    Feature Map : 0x1
>     Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
>           Name : GATEWAY:127  (local to host GATEWAY)
>  Creation Time : Sat Aug 22 09:44:21 2009
>     Raid Level : raid5
>   Raid Devices : 4
>
>  Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
>     Array Size : 1758296832 (838.42 GiB 900.25 GB)
>  Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
>    Data Offset : 272 sectors
>   Super Offset : 8 sectors
>          State : clean
>    Device UUID : f8ebb9f8:b447f894:d8b0b59f:ca8e98eb
>
> Internal Bitmap : 2 sectors from superblock
>    Update Time : Fri Mar 19 00:56:15 2010
>       Checksum : 1005cfbc - correct
>         Events : 3796145
>
>         Layout : left-symmetric
>     Chunk Size : 64K
>
>   Device Role : Active device 2
>   Array State : .AA. ('A' == active, '.' == missing)
>
> === This device is marked spare, can be marked active (IMHO) ===
> mdadm --examine  -e1.2 /dev/sdd5
> /dev/sdd5:
>          Magic : a92b4efc
>        Version : 1.2
>    Feature Map : 0x1
>     Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
>           Name : GATEWAY:127  (local to host GATEWAY)
>  Creation Time : Sat Aug 22 09:44:21 2009
>     Raid Level : raid5
>   Raid Devices : 4
>
>  Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
>     Array Size : 1758296832 (838.42 GiB 900.25 GB)
>  Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
>    Data Offset : 272 sectors
>   Super Offset : 8 sectors
>          State : clean
>    Device UUID : 763a832f:1a9a7ea8:ce90d4a3:32e8ae54
>
> Internal Bitmap : 2 sectors from superblock
>    Update Time : Fri Mar 19 00:56:15 2010
>       Checksum : c78aab46 - correct
>         Events : 3796145
>
>         Layout : left-symmetric
>     Chunk Size : 64K
>
>   Device Role : spare
>   Array State : .AA. ('A' == active, '.' == missing)
>
>
> === This is the completely failed device (needs replacement)    ===
> mdadm --examine  -e1.2 /dev/sda1
> [HANGS!!]
>
>
>
> I already have the replacement drive available as sde5 but want to be able to reconstruct as much as possible)
>
> Thanks again,
> Anshuman Aggarwal--
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

You have a raid 5 array.

(drives then data+parity per drive as an example)
1234

123P
45P6
7P89
...

You are missing two drives, which means every stripe is short two chunks (data and/or parity), and single parity can only ever rebuild one missing chunk per stripe.

It's like seeing:

.23.
.5P.
.P8.

and expecting to somehow recover the missing data when it is no longer
present in the surviving, clean information.
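
To make that concrete, here is a tiny sketch in plain bash arithmetic (nothing to do with mdadm itself, and the byte values are made up) showing why single parity can rebuild one missing chunk per stripe but never two:

d0=$((0x12)); d1=$((0x34)); d2=$((0x56))
p=$(( d0 ^ d1 ^ d2 ))                                # parity chunk = XOR of the data chunks

# one chunk lost (say d1): recoverable from the survivors plus parity
printf 'recovered d1 = 0x%x\n' $(( d0 ^ d2 ^ p ))    # prints 0x34

# two chunks lost (say d1 and p): only d0 and d2 remain, and no
# combination of the survivors can give d1 back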

Your only hope is to assemble the array in read only mode with the
other devices, if they can still even be read.  In that case you might
at least be able to recover nearly all of your data; hopefully any
missing areas are in unimportant files or non-allocated space.
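
If the array can be coaxed up at all, a rough sketch of keeping it read-only might look like this (untested here; /dev/md_d127 is the array device name used later in this thread, and the mdadm --readonly switch depends on your mdadm version):

echo 1 > /sys/module/md_mod/parameters/start_ro   # newly started arrays stay auto-read-only until written
mdadm --readonly /dev/md_d127                     # explicitly mark the assembled array read-only
blockdev --setro /dev/md_d127                     # also refuse writes at the block-device level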

At this point you should be EXTREMELY CAREFUL and DO NOTHING without
having a good, solid plan in place.  Rushing /WILL/ cause you to lose
data that might still be recoverable.


* Re: 4 partition raid 5 with 2 disks active and 2 spare, how to force?
  2010-03-25 11:37   ` Michael Evans
@ 2010-03-25 14:09     ` Anshuman Aggarwal
  2010-03-26  3:38       ` Michael Evans
  0 siblings, 1 reply; 16+ messages in thread
From: Anshuman Aggarwal @ 2010-03-25 14:09 UTC (permalink / raw)
  To: Michael Evans, linux-raid

Thanks Michael, I am clear on why the multiple failures would cause me to lose data, which is why I wanted to consult this mailing list before proceeding.

Could you tell me how to keep the array read-only, and how to forcibly mark one or both of these spares as active? Also, once I am able to use these spares as active and the data in a particular stripe is not consistent, how does the kernel resolve the inconsistency (i.e. which data does it use, the data chunks or the parity)? That last one is of academic interest only, since it will be difficult to figure out which is the right data anyway.

Thanks,
Anshuman

On 25-Mar-2010, at 5:07 PM, Michael Evans wrote:

> On Thu, Mar 25, 2010 at 2:30 AM, Anshuman Aggarwal
> <anshuman@brillgene.com> wrote:
>> All, thanks in advance...particularly Neil.
>> 
>> My raid5 setup has 4 partitions, 2 of which are showing up as spare and 2 as active. The mdadm --assemble --force gives me the following error:
>> 2 active devices and 2 spare cannot start device
>> 
>> it is a raid 5, with superblock 1.2, 4 devices in the order sda1, sdb5, sdc5, sdd5. I have lvm2 on top of this with other devices ...so as you all know data is irreplaceable blah blah.
>> 
>> I know that this device has not been written to for a while, so the data can be considered intact (hopefully all) if I can get the device to start up...but I'm not sure of the best way to coax the kernel to assemble it. Relevant information follows:
>> 
>> === This device is working fine ===
>> mdadm --examine  -e1.2 /dev/sdb5
>> /dev/sdb5:
>>          Magic : a92b4efc
>>        Version : 1.2
>>    Feature Map : 0x1
>>     Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
>>           Name : GATEWAY:127  (local to host GATEWAY)
>>  Creation Time : Sat Aug 22 09:44:21 2009
>>     Raid Level : raid5
>>   Raid Devices : 4
>> 
>>  Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
>>     Array Size : 1758296832 (838.42 GiB 900.25 GB)
>>  Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
>>    Data Offset : 272 sectors
>>   Super Offset : 8 sectors
>>          State : clean
>>    Device UUID : f8ebb9f8:b447f894:d8b0b59f:ca8e98eb
>> 
>> Internal Bitmap : 2 sectors from superblock
>>    Update Time : Fri Mar 19 00:56:15 2010
>>       Checksum : 1005cfbc - correct
>>         Events : 3796145
>> 
>>         Layout : left-symmetric
>>     Chunk Size : 64K
>> 
>>   Device Role : Active device 2
>>   Array State : .AA. ('A' == active, '.' == missing)
>> 
>> === This device is marked spare, can be marked active (IMHO) ===
>> mdadm --examine  -e1.2 /dev/sdd5
>> /dev/sdd5:
>>          Magic : a92b4efc
>>        Version : 1.2
>>    Feature Map : 0x1
>>     Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
>>           Name : GATEWAY:127  (local to host GATEWAY)
>>  Creation Time : Sat Aug 22 09:44:21 2009
>>     Raid Level : raid5
>>   Raid Devices : 4
>> 
>>  Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
>>     Array Size : 1758296832 (838.42 GiB 900.25 GB)
>>  Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
>>    Data Offset : 272 sectors
>>   Super Offset : 8 sectors
>>          State : clean
>>    Device UUID : 763a832f:1a9a7ea8:ce90d4a3:32e8ae54
>> 
>> Internal Bitmap : 2 sectors from superblock
>>    Update Time : Fri Mar 19 00:56:15 2010
>>       Checksum : c78aab46 - correct
>>         Events : 3796145
>> 
>>         Layout : left-symmetric
>>     Chunk Size : 64K
>> 
>>   Device Role : spare
>>   Array State : .AA. ('A' == active, '.' == missing)
>> 
>> 
>> === This is the completely failed device (needs replacement)    ===
>> mdadm --examine  -e1.2 /dev/sda1
>> [HANGS!!]
>> 
>> 
>> 
>> I already have the replacement drive available as sde5 but want to be able to reconstruct as much as possible)
>> 
>> Thanks again,
>> Anshuman Aggarwal--
>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> 
> 
> You have a raid 5 array.
> 
> (drives then data+parity per drive as an example)
> 1234
> 
> 123P
> 45P6
> 7P89
> ...
> 
> You are missing two drives, meaning you lack parity and 1 data stripe
> and have NO parity to recover it with.
> 
> It's like seeing:
> 
> .23.
> .5P.
> .P8.
> 
> and expecting to somehow recover the missing data when it is no longer
> within the clean information.
> 
> Your only hope is to assemble the array in read only mode with the
> other devices, if they can still even be read.  In that case you might
> at least be able to recover nearly all of your data; hopefully any
> missing areas are in unimportant files or non-allocated space.
> 
> At this point you should be EXTREMELY CAREFUL, and DO NOTHING, without
> having a good solid plan in place.  Rushing /WILL/ cause you to loose
> data that might still potentially be recovered.



* Re: 4 partition raid 5 with 2 disks active and 2 spare, how to force?
  2010-03-25 14:09     ` Anshuman Aggarwal
@ 2010-03-26  3:38       ` Michael Evans
  2010-03-26 16:28         ` Anshuman Aggarwal
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Evans @ 2010-03-26  3:38 UTC (permalink / raw)
  To: Anshuman Aggarwal; +Cc: linux-raid

On Thu, Mar 25, 2010 at 7:09 AM, Anshuman Aggarwal
<anshuman@brillgene.com> wrote:
> Thanks Michael, I am clear about the problem of why the multiple failure would cause me to lose data. Which is why I wanted to consult this mailing list before proceeding.
>
> Could you tell me how to keep the array read-only?  and mark one or both of these spares as active forcibly? and Also, once I am able to use these spares as active and the data is not consistent in a particular stripe, how does the kernel resolve the inconsistency (as in what data does it use, the one based on the data stripes or the one based on the parity?) this one is just academic interest since it'll be difficult to figure out which is the right data anyways.
>
> Thanks,
> Anshuman
>

Please read the Wikipedia page first,

http://en.wikipedia.org/wiki/RAID

and then this

http://wiki.tldp.org/LVM-on-RAID (some links need updating, but it's
still up to date for concepts)


With that background nearly out of the way, please stop, and read them
both again.  Yes, seriously.  In order to prevent data loss you'll
need to have a good understanding of what RAID does, so that you can
watch out for ways it can fail.

The next step, before we do /anything/ else, is for you to post the
COMPLETE output of these commands.

mdadm -Dvvs
mdadm -Evvs

They will help everyone on the list better understand the state of the
metadata records and what potential solutions might be possible.
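
For reference, those short options expand roughly like this (the same two commands, just spelled out):

mdadm --detail  --verbose --verbose --scan    # -Dvvs: details of every assembled array it can find
mdadm --examine --verbose --verbose --scan    # -Evvs: superblocks of every component device it can find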


* Re: 4 partition raid 5 with 2 disks active and 2 spare, how to force?
  2010-03-26  3:38       ` Michael Evans
@ 2010-03-26 16:28         ` Anshuman Aggarwal
  2010-03-26 19:04           ` Michael Evans
  2010-03-26 19:29           ` 4 partition raid 5 with 2 disks active and 2 spare, how to force? John Robinson
  0 siblings, 2 replies; 16+ messages in thread
From: Anshuman Aggarwal @ 2010-03-26 16:28 UTC (permalink / raw)
  To: Michael Evans; +Cc: linux-raid

Thanks again. I have visited those pages (twice, no less) and none of the concepts (both RAID and LVM) seem to have changed since I last studied them.

My problem is that I'm not familiar enough with the recovery tools and the common practical pitfalls to do this comfortably without the hand-holding of this mailing list :)

Here is the requested output:
Note: since I have 3-4 other arrays running (root device etc.) which have nothing to do with this one and are all working fine, I am only including the output for the relevant devices (to avoid confusing everybody). Please let me know if you still require the full output.

mdadm -Dvvs /dev/md_d127
mdadm: md device /dev/md_d127 does not appear to be active.

mdadm --assemble  /dev/md_d127 /dev/sda1 /dev/sdb5 /dev/sdc5 /dev/sdd5
mdadm: /dev/md_d127 assembled from 2 drives and 1 spare - not enough to start the array.

It says that the device /dev/md_d127 is not active (because it is not active in /proc/mdstat).
mdadm -Evvs  /dev/sda1 /dev/sdb5 /dev/sdc5 /dev/sdd5
/dev/sda1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
           Name : GATEWAY:127  (local to host GATEWAY)
  Creation Time : Sat Aug 22 09:44:21 2009
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
     Array Size : 1758296832 (838.42 GiB 900.25 GB)
  Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
    Data Offset : 272 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 571fa32b:d76198a1:0f5d3a2d:31f6d6b8

Internal Bitmap : 2 sectors from superblock
    Update Time : Fri Mar 19 00:56:15 2010
       Checksum : 7e769165 - expected aa523227
         Events : 3796145

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : spare
   Array State : .AA. ('A' == active, '.' == missing)
/dev/sdb5:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
           Name : GATEWAY:127  (local to host GATEWAY)
  Creation Time : Sat Aug 22 09:44:21 2009
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
     Array Size : 1758296832 (838.42 GiB 900.25 GB)
  Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
    Data Offset : 272 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : f8ebb9f8:b447f894:d8b0b59f:ca8e98eb

Internal Bitmap : 2 sectors from superblock
    Update Time : Fri Mar 19 00:56:15 2010
       Checksum : 1005cfbc - correct
         Events : 3796145

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 2
   Array State : .AA. ('A' == active, '.' == missing)
/dev/sdc5:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
           Name : GATEWAY:127  (local to host GATEWAY)
  Creation Time : Sat Aug 22 09:44:21 2009
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
     Array Size : 1758296832 (838.42 GiB 900.25 GB)
  Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
    Data Offset : 272 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : d9ce99fc:79bc1e9d:197d5b11:c990e007

Internal Bitmap : 2 sectors from superblock
    Update Time : Fri Mar 19 00:56:15 2010
       Checksum : a9f9f59f - correct
         Events : 3796145

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 1
   Array State : .AA. ('A' == active, '.' == missing)
/dev/sdd5:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
           Name : GATEWAY:127  (local to host GATEWAY)
  Creation Time : Sat Aug 22 09:44:21 2009
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
     Array Size : 1758296832 (838.42 GiB 900.25 GB)
  Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
    Data Offset : 272 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 763a832f:1a9a7ea8:ce90d4a3:32e8ae54

Internal Bitmap : 2 sectors from superblock
    Update Time : Fri Mar 19 00:56:15 2010
       Checksum : c78aab46 - correct
         Events : 3796145

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : spare
   Array State : .AA. ('A' == active, '.' == missing)


Regards,
Anshuman

On 26-Mar-2010, at 9:08 AM, Michael Evans wrote:

> On Thu, Mar 25, 2010 at 7:09 AM, Anshuman Aggarwal
> <anshuman@brillgene.com> wrote:
>> Thanks Michael, I am clear about the problem of why the multiple failure would cause me to lose data. Which is why I wanted to consult this mailing list before proceeding.
>> 
>> Could you tell me how to keep the array read-only?  and mark one or both of these spares as active forcibly? and Also, once I am able to use these spares as active and the data is not consistent in a particular stripe, how does the kernel resolve the inconsistency (as in what data does it use, the one based on the data stripes or the one based on the parity?) this one is just academic interest since it'll be difficult to figure out which is the right data anyways.
>> 
>> Thanks,
>> Anshuman
>> 
> 
> Please, read the wikipedia page first,
> 
> http://en.wikipedia.org/wiki/RAID
> 
> and then this
> 
> http://wiki.tldp.org/LVM-on-RAID (some links need updating, but it's
> still up to date for concepts)
> 
> 
> With that background nearly out of the way, please stop, and read them
> both again.  Yes, seriously.  In order to prevent data loss you'll
> need to have a good understanding of what RAID does, so that you can
> watch out for ways it can fail.
> 
> The next step, before we do /anything/ else is for you to post the
> COMPLETE output of these commands.
> 
> mdadm -Dvvs
> mdadm -Evvs
> 
> They will help everyone on the list better understand the state of the
> metadata records and what potential solutions might be possible.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



* Re: 4 partition raid 5 with 2 disks active and 2 spare, how to force?
  2010-03-26 16:28         ` Anshuman Aggarwal
@ 2010-03-26 19:04           ` Michael Evans
  2010-03-28 15:18             ` Anshuman Aggarwal
  2010-03-26 19:29           ` 4 partition raid 5 with 2 disks active and 2 spare, how to force? John Robinson
  1 sibling, 1 reply; 16+ messages in thread
From: Michael Evans @ 2010-03-26 19:04 UTC (permalink / raw)
  To: Anshuman Aggarwal; +Cc: linux-raid

On Fri, Mar 26, 2010 at 9:28 AM, Anshuman Aggarwal
<anshuman@brillgene.com> wrote:
> Thanks again. I have visited those pages (twice no less) and nothing seems to be new from the concepts (both raid and lvm) since I last studied them.
>
> My problem is that I'm not familiar enough with the recovery tools and the common practical pitfalls to do this comfortably without the hand holding of this mailing list :)
>
> Here is the requested output:
> Note: Since I have 3-4 other arrays running (root device etc.) which don't have anything to do with this one and are all working fine...I am just putting the output of the relevant devices (in order to avoid confusing everybody). Please let me know if you still require the full output.
>
> mdadm -Dvvs /dev/md_d127
> mdadm: md device /dev/md_d127 does not appear to be active.
>
> mdadm --assemble  /dev/md_d127 /dev/sda1 /dev/sdb5 /dev/sdc5 /dev/sdd5
> mdadm: /dev/md_d127 assembled from 2 drives and 1 spare - not enough to start the array.
>
> Says that the device /dev/md_d127 is not active (because its not active in /proc/mdstat)
> mdadm -Evvs  /dev/sda1 /dev/sdb5 /dev/sdc5 /dev/sdd5
> /dev/sda1:
>          Magic : a92b4efc
>        Version : 1.2
>    Feature Map : 0x1
>     Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
>           Name : GATEWAY:127  (local to host GATEWAY)
>  Creation Time : Sat Aug 22 09:44:21 2009
>     Raid Level : raid5
>   Raid Devices : 4
>
>  Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
>     Array Size : 1758296832 (838.42 GiB 900.25 GB)
>  Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
>    Data Offset : 272 sectors
>   Super Offset : 8 sectors
>          State : clean
>    Device UUID : 571fa32b:d76198a1:0f5d3a2d:31f6d6b8
>
> Internal Bitmap : 2 sectors from superblock
>    Update Time : Fri Mar 19 00:56:15 2010
>       Checksum : 7e769165 - expected aa523227
>         Events : 3796145
>
>         Layout : left-symmetric
>     Chunk Size : 64K
>
>   Device Role : spare
>   Array State : .AA. ('A' == active, '.' == missing)
> /dev/sdb5:
>          Magic : a92b4efc
>        Version : 1.2
>    Feature Map : 0x1
>     Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
>           Name : GATEWAY:127  (local to host GATEWAY)
>  Creation Time : Sat Aug 22 09:44:21 2009
>     Raid Level : raid5
>   Raid Devices : 4
>
>  Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
>     Array Size : 1758296832 (838.42 GiB 900.25 GB)
>  Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
>    Data Offset : 272 sectors
>   Super Offset : 8 sectors
>          State : clean
>    Device UUID : f8ebb9f8:b447f894:d8b0b59f:ca8e98eb
>
> Internal Bitmap : 2 sectors from superblock
>    Update Time : Fri Mar 19 00:56:15 2010
>       Checksum : 1005cfbc - correct
>         Events : 3796145
>
>         Layout : left-symmetric
>     Chunk Size : 64K
>
>   Device Role : Active device 2
>   Array State : .AA. ('A' == active, '.' == missing)
> /dev/sdc5:
>          Magic : a92b4efc
>        Version : 1.2
>    Feature Map : 0x1
>     Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
>           Name : GATEWAY:127  (local to host GATEWAY)
>  Creation Time : Sat Aug 22 09:44:21 2009
>     Raid Level : raid5
>   Raid Devices : 4
>
>  Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
>     Array Size : 1758296832 (838.42 GiB 900.25 GB)
>  Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
>    Data Offset : 272 sectors
>   Super Offset : 8 sectors
>          State : clean
>    Device UUID : d9ce99fc:79bc1e9d:197d5b11:c990e007
>
> Internal Bitmap : 2 sectors from superblock
>    Update Time : Fri Mar 19 00:56:15 2010
>       Checksum : a9f9f59f - correct
>         Events : 3796145
>
>         Layout : left-symmetric
>     Chunk Size : 64K
>
>   Device Role : Active device 1
>   Array State : .AA. ('A' == active, '.' == missing)
> /dev/sdd5:
>          Magic : a92b4efc
>        Version : 1.2
>    Feature Map : 0x1
>     Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
>           Name : GATEWAY:127  (local to host GATEWAY)
>  Creation Time : Sat Aug 22 09:44:21 2009
>     Raid Level : raid5
>   Raid Devices : 4
>
>  Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
>     Array Size : 1758296832 (838.42 GiB 900.25 GB)
>  Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
>    Data Offset : 272 sectors
>   Super Offset : 8 sectors
>          State : clean
>    Device UUID : 763a832f:1a9a7ea8:ce90d4a3:32e8ae54
>
> Internal Bitmap : 2 sectors from superblock
>    Update Time : Fri Mar 19 00:56:15 2010
>       Checksum : c78aab46 - correct
>         Events : 3796145
>
>         Layout : left-symmetric
>     Chunk Size : 64K
>
>   Device Role : spare
>   Array State : .AA. ('A' == active, '.' == missing)
>
>
> Regards,
> Anshuman
>
> On 26-Mar-2010, at 9:08 AM, Michael Evans wrote:
>
>> On Thu, Mar 25, 2010 at 7:09 AM, Anshuman Aggarwal
>> <anshuman@brillgene.com> wrote:
>>> Thanks Michael, I am clear about the problem of why the multiple failure would cause me to lose data. Which is why I wanted to consult this mailing list before proceeding.
>>>
>>> Could you tell me how to keep the array read-only?  and mark one or both of these spares as active forcibly? and Also, once I am able to use these spares as active and the data is not consistent in a particular stripe, how does the kernel resolve the inconsistency (as in what data does it use, the one based on the data stripes or the one based on the parity?) this one is just academic interest since it'll be difficult to figure out which is the right data anyways.
>>>
>>> Thanks,
>>> Anshuman
>>>
>>
>> Please, read the wikipedia page first,
>>
>> http://en.wikipedia.org/wiki/RAID
>>
>> and then this
>>
>> http://wiki.tldp.org/LVM-on-RAID (some links need updating, but it's
>> still up to date for concepts)
>>
>>
>> With that background nearly out of the way, please stop, and read them
>> both again.  Yes, seriously.  In order to prevent data loss you'll
>> need to have a good understanding of what RAID does, so that you can
>> watch out for ways it can fail.
>>
>> The next step, before we do /anything/ else is for you to post the
>> COMPLETE output of these commands.
>>
>> mdadm -Dvvs
>> mdadm -Evvs
>>
>> They will help everyone on the list better understand the state of the
>> metadata records and what potential solutions might be possible.
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>

Obviously you do not understand the problem then: if you did not
understand it previously, and you say you learned nothing new, then you
still do not understand it now.

Also, you added additional arguments to the commands I provided when
that was neither required nor desired.

However, enough data was returned to see one thing:  ALL of the event
counters show the same number.

That is extremely odd; usually in this situation at least one device
will have a lower number.


If possible, please describe what happened to cause this in the first place.

Also, you'll find this link more directly relevant to your problem:

https://raid.wiki.kernel.org/index.php/RAID_Recovery

Reading my local copy of the man page (which is slightly outdated; you
should really get the latest stable mdadm release, compile and install
it, and read its manual to confirm the option still isn't there), I
can't find any way of bringing an array up in read-only mode other than
assembling it with missing devices, which is what the permutation
script tries to do.  Additionally, without knowing what kind of event
is being recovered from (I suspect either a simultaneous disconnection
of half the drives, or something you have done since, because it looks
like something happened), I cannot offer concrete advice on how to
proceed.

However, there are two main routes open to you at this point: posting
a fresh message asking how to bring an array up read-only for data
recovery, or following some variant of the steps of the Perl script
that the linked document mentions.
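
If you do end up going down the recreate/permutation road, one cautious way to experiment without touching the real disks is to work on copies (a sketch only; /recovery is a placeholder path, it needs roughly one member's worth of space per image, and GNU ddrescue would handle read errors better than plain dd):

dd if=/dev/sdb5 of=/recovery/sdb5.img bs=64K conv=noerror,sync   # image each surviving member,
dd if=/dev/sdc5 of=/recovery/sdc5.img bs=64K conv=noerror,sync   # skipping over unreadable blocks
dd if=/dev/sdd5 of=/recovery/sdd5.img bs=64K conv=noerror,sync

losetup /dev/loop1 /recovery/sdb5.img    # attach the images as loop devices
losetup /dev/loop2 /recovery/sdc5.img    # and run the experiments against those
losetup /dev/loop3 /recovery/sdd5.img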


* Re: 4 partition raid 5 with 2 disks active and 2 spare, how to force?
  2010-03-26 16:28         ` Anshuman Aggarwal
  2010-03-26 19:04           ` Michael Evans
@ 2010-03-26 19:29           ` John Robinson
  1 sibling, 0 replies; 16+ messages in thread
From: John Robinson @ 2010-03-26 19:29 UTC (permalink / raw)
  To: Anshuman Aggarwal; +Cc: Linux RAID

On 26/03/2010 16:28, Anshuman Aggarwal wrote:
[...]
> mdadm --assemble  /dev/md_d127 /dev/sda1 /dev/sdb5 /dev/sdc5 /dev/sdd5
> mdadm: /dev/md_d127 assembled from 2 drives and 1 spare - not enough to start the array.
[...]

You said sda was broken, so forget that. Goodness knows how sdd5 managed 
to end up being a spare. I think you want `mdadm --assemble /dev/md_d127 
--force /dev/sd[bcd]5`. I don't think you can start it read-only but 
with a member missing you're not going to get a resync going so this is 
unlikely to cause data loss. Still, don't do this if you don't believe 
it's the correct answer, and certainly don't blame me if it wastes your 
data. Good luck!
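
If it does start, something along these lines would confirm it is running degraded before anything mounts it (a sketch; the --readonly switch depends on your mdadm version):

cat /proc/mdstat                  # md_d127 should show up active with one member missing
mdadm --detail /dev/md_d127       # check State (clean, degraded) and the member roles
mdadm --readonly /dev/md_d127     # optionally pin it read-only before touching the LVM on top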

Cheers,

John.


* Re: 4 partition raid 5 with 2 disks active and 2 spare, how to force?
  2010-03-26 19:04           ` Michael Evans
@ 2010-03-28 15:18             ` Anshuman Aggarwal
  2010-03-28 16:35               ` Anshuman Aggarwal
  0 siblings, 1 reply; 16+ messages in thread
From: Anshuman Aggarwal @ 2010-03-28 15:18 UTC (permalink / raw)
  To: Michael Evans; +Cc: linux-raid

On 27-Mar-2010, at 12:34 AM, Michael Evans wrote:

> On Fri, Mar 26, 2010 at 9:28 AM, Anshuman Aggarwal
> <anshuman@brillgene.com> wrote:
>> Thanks again. I have visited those pages (twice no less) and nothing seems to be new from the concepts (both raid and lvm) since I last studied them.
>> 
>> My problem is that I'm not familiar enough with the recovery tools and the common practical pitfalls to do this comfortably without the hand holding of this mailing list :)
>> 
>> Here is the requested output:
>> Note: Since I have 3-4 other arrays running (root device etc.) which don't have anything to do with this one and are all working fine...I am just putting the output of the relevant devices (in order to avoid confusing everybody). Please let me know if you still require the full output.
>> 
>> mdadm -Dvvs /dev/md_d127
>> mdadm: md device /dev/md_d127 does not appear to be active.
>> 
>> mdadm --assemble  /dev/md_d127 /dev/sda1 /dev/sdb5 /dev/sdc5 /dev/sdd5
>> mdadm: /dev/md_d127 assembled from 2 drives and 1 spare - not enough to start the array.
>> 
>> Says that the device /dev/md_d127 is not active (because its not active in /proc/mdstat)
>> mdadm -Evvs  /dev/sda1 /dev/sdb5 /dev/sdc5 /dev/sdd5
>> /dev/sda1:
>>          Magic : a92b4efc
>>        Version : 1.2
>>    Feature Map : 0x1
>>     Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
>>           Name : GATEWAY:127  (local to host GATEWAY)
>>  Creation Time : Sat Aug 22 09:44:21 2009
>>     Raid Level : raid5
>>   Raid Devices : 4
>> 
>>  Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
>>     Array Size : 1758296832 (838.42 GiB 900.25 GB)
>>  Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
>>    Data Offset : 272 sectors
>>   Super Offset : 8 sectors
>>          State : clean
>>    Device UUID : 571fa32b:d76198a1:0f5d3a2d:31f6d6b8
>> 
>> Internal Bitmap : 2 sectors from superblock
>>    Update Time : Fri Mar 19 00:56:15 2010
>>       Checksum : 7e769165 - expected aa523227
>>         Events : 3796145
>> 
>>         Layout : left-symmetric
>>     Chunk Size : 64K
>> 
>>   Device Role : spare
>>   Array State : .AA. ('A' == active, '.' == missing)
>> /dev/sdb5:
>>          Magic : a92b4efc
>>        Version : 1.2
>>    Feature Map : 0x1
>>     Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
>>           Name : GATEWAY:127  (local to host GATEWAY)
>>  Creation Time : Sat Aug 22 09:44:21 2009
>>     Raid Level : raid5
>>   Raid Devices : 4
>> 
>>  Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
>>     Array Size : 1758296832 (838.42 GiB 900.25 GB)
>>  Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
>>    Data Offset : 272 sectors
>>   Super Offset : 8 sectors
>>          State : clean
>>    Device UUID : f8ebb9f8:b447f894:d8b0b59f:ca8e98eb
>> 
>> Internal Bitmap : 2 sectors from superblock
>>    Update Time : Fri Mar 19 00:56:15 2010
>>       Checksum : 1005cfbc - correct
>>         Events : 3796145
>> 
>>         Layout : left-symmetric
>>     Chunk Size : 64K
>> 
>>   Device Role : Active device 2
>>   Array State : .AA. ('A' == active, '.' == missing)
>> /dev/sdc5:
>>          Magic : a92b4efc
>>        Version : 1.2
>>    Feature Map : 0x1
>>     Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
>>           Name : GATEWAY:127  (local to host GATEWAY)
>>  Creation Time : Sat Aug 22 09:44:21 2009
>>     Raid Level : raid5
>>   Raid Devices : 4
>> 
>>  Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
>>     Array Size : 1758296832 (838.42 GiB 900.25 GB)
>>  Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
>>    Data Offset : 272 sectors
>>   Super Offset : 8 sectors
>>          State : clean
>>    Device UUID : d9ce99fc:79bc1e9d:197d5b11:c990e007
>> 
>> Internal Bitmap : 2 sectors from superblock
>>    Update Time : Fri Mar 19 00:56:15 2010
>>       Checksum : a9f9f59f - correct
>>         Events : 3796145
>> 
>>         Layout : left-symmetric
>>     Chunk Size : 64K
>> 
>>   Device Role : Active device 1
>>   Array State : .AA. ('A' == active, '.' == missing)
>> /dev/sdd5:
>>          Magic : a92b4efc
>>        Version : 1.2
>>    Feature Map : 0x1
>>     Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
>>           Name : GATEWAY:127  (local to host GATEWAY)
>>  Creation Time : Sat Aug 22 09:44:21 2009
>>     Raid Level : raid5
>>   Raid Devices : 4
>> 
>>  Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
>>     Array Size : 1758296832 (838.42 GiB 900.25 GB)
>>  Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
>>    Data Offset : 272 sectors
>>   Super Offset : 8 sectors
>>          State : clean
>>    Device UUID : 763a832f:1a9a7ea8:ce90d4a3:32e8ae54
>> 
>> Internal Bitmap : 2 sectors from superblock
>>    Update Time : Fri Mar 19 00:56:15 2010
>>       Checksum : c78aab46 - correct
>>         Events : 3796145
>> 
>>         Layout : left-symmetric
>>     Chunk Size : 64K
>> 
>>   Device Role : spare
>>   Array State : .AA. ('A' == active, '.' == missing)
>> 
>> 
>> Regards,
>> Anshuman
>> 
>> On 26-Mar-2010, at 9:08 AM, Michael Evans wrote:
>> 
>>> On Thu, Mar 25, 2010 at 7:09 AM, Anshuman Aggarwal
>>> <anshuman@brillgene.com> wrote:
>>>> Thanks Michael, I am clear about the problem of why the multiple failure would cause me to lose data. Which is why I wanted to consult this mailing list before proceeding.
>>>> 
>>>> Could you tell me how to keep the array read-only?  and mark one or both of these spares as active forcibly? and Also, once I am able to use these spares as active and the data is not consistent in a particular stripe, how does the kernel resolve the inconsistency (as in what data does it use, the one based on the data stripes or the one based on the parity?) this one is just academic interest since it'll be difficult to figure out which is the right data anyways.
>>>> 
>>>> Thanks,
>>>> Anshuman
>>>> 
>>> 
>>> Please, read the wikipedia page first,
>>> 
>>> http://en.wikipedia.org/wiki/RAID
>>> 
>>> and then this
>>> 
>>> http://wiki.tldp.org/LVM-on-RAID (some links need updating, but it's
>>> still up to date for concepts)
>>> 
>>> 
>>> With that background nearly out of the way, please stop, and read them
>>> both again.  Yes, seriously.  In order to prevent data loss you'll
>>> need to have a good understanding of what RAID does, so that you can
>>> watch out for ways it can fail.
>>> 
>>> The next step, before we do /anything/ else is for you to post the
>>> COMPLETE output of these commands.
>>> 
>>> mdadm -Dvvs
>>> mdadm -Evvs
>>> 
>>> They will help everyone on the list better understand the state of the
>>> metadata records and what potential solutions might be possible.
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> 
>> 
> 
> Obviously you do not understand the problem then, since if you did not
> previously, and you say you learned nothing new.
> 
> Also, you added additional arguments to the commands I provided when
> that was neither required nor desired.
> 
> However enough data was returned to see one thing:  ALL of the events
> counters show the same number.
> 
> That is extremely odd, usually in this situation at least one device
> will have a lower number.
> 
> 
> If possible please describe what happened to cause this in the first place.
> 
> Also, you'll find these links more directly relevant to your problem:
> 
> https://raid.wiki.kernel.org/index.php/RAID_Recovery
> 
> Reading my local copy of the manpage (which is slightly outdated, you
> should really get the latest stable mdadm release, compile, install
> and read the manual to confirm it's still not there) I can't find any
> way of bringing an array up in read only mode without using missing
> devices, which is what the permutation script tries to do.
> Additionally without knowing what type of event is being recovered
> from; I suspect either simultaneous disconnection of half the drives;
> or what you've done since, because it looks like something, I cannot
> offer concrete advice on how to proceed.
> 
> However there are two main routes open to you at this point.  Posting
> a fresh message asking how to create an array read only for use with
> data recovery, and some variant of following the perl script's steps
> that the linked document mentions.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Michael,
I am running mdadm 3.1.2 (latest stable I think) compiled from source (FYI on Ubuntu Karmic, 2.6.31-20-generic)

Here is what happened: the device /dev/sda1 failed once, but I wondered whether it was a freak accident, so I tried adding it back, and it started resyncing. Somewhere in that process the disk /dev/sda1 stalled and the server needed a reboot. After that boot, I got 2 spares (/dev/sda1, /dev/sdd5) and 2 active devices (/dev/sdb1, /dev/sdc1).

Maybe I need to do a build with --assume-clean with the devices in the right order (which I'm positive I can remember). It would be nice if you could please double-check:
mdadm --build -n 4 -l 5 -e1.2 --assume-clean /dev/md127 /dev/sda1 /dev/sdb5 /dev/sdc5 /dev/sdd5

Again, thanks for your time...

John,
 I did try what you said, without any luck (--assemble --force, but it refuses to accept the spare as a valid device, and 2 active devices on a 4-member array aren't enough).







* Re: 4 partition raid 5 with 2 disks active and 2 spare, how to force?
  2010-03-28 15:18             ` Anshuman Aggarwal
@ 2010-03-28 16:35               ` Anshuman Aggarwal
  2010-03-29  5:32                 ` Luca Berra
  0 siblings, 1 reply; 16+ messages in thread
From: Anshuman Aggarwal @ 2010-03-28 16:35 UTC (permalink / raw)
  To: Michael Evans; +Cc: linux-raid


On 28-Mar-2010, at 8:48 PM, Anshuman Aggarwal wrote:

> On 27-Mar-2010, at 12:34 AM, Michael Evans wrote:
> 
>> On Fri, Mar 26, 2010 at 9:28 AM, Anshuman Aggarwal
>> <anshuman@brillgene.com> wrote:
>>> Thanks again. I have visited those pages (twice no less) and nothing seems to be new from the concepts (both raid and lvm) since I last studied them.
>>> 
>>> My problem is that I'm not familiar enough with the recovery tools and the common practical pitfalls to do this comfortably without the hand holding of this mailing list :)
>>> 
>>> Here is the requested output:
>>> Note: Since I have 3-4 other arrays running (root device etc.) which don't have anything to do with this one and are all working fine...I am just putting the output of the relevant devices (in order to avoid confusing everybody). Please let me know if you still require the full output.
>>> 
>>> mdadm -Dvvs /dev/md_d127
>>> mdadm: md device /dev/md_d127 does not appear to be active.
>>> 
>>> mdadm --assemble  /dev/md_d127 /dev/sda1 /dev/sdb5 /dev/sdc5 /dev/sdd5
>>> mdadm: /dev/md_d127 assembled from 2 drives and 1 spare - not enough to start the array.
>>> 
>>> Says that the device /dev/md_d127 is not active (because its not active in /proc/mdstat)
>>> mdadm -Evvs  /dev/sda1 /dev/sdb5 /dev/sdc5 /dev/sdd5
>>> /dev/sda1:
>>>         Magic : a92b4efc
>>>       Version : 1.2
>>>   Feature Map : 0x1
>>>    Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
>>>          Name : GATEWAY:127  (local to host GATEWAY)
>>> Creation Time : Sat Aug 22 09:44:21 2009
>>>    Raid Level : raid5
>>>  Raid Devices : 4
>>> 
>>> Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
>>>    Array Size : 1758296832 (838.42 GiB 900.25 GB)
>>> Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
>>>   Data Offset : 272 sectors
>>>  Super Offset : 8 sectors
>>>         State : clean
>>>   Device UUID : 571fa32b:d76198a1:0f5d3a2d:31f6d6b8
>>> 
>>> Internal Bitmap : 2 sectors from superblock
>>>   Update Time : Fri Mar 19 00:56:15 2010
>>>      Checksum : 7e769165 - expected aa523227
>>>        Events : 3796145
>>> 
>>>        Layout : left-symmetric
>>>    Chunk Size : 64K
>>> 
>>>  Device Role : spare
>>>  Array State : .AA. ('A' == active, '.' == missing)
>>> /dev/sdb5:
>>>         Magic : a92b4efc
>>>       Version : 1.2
>>>   Feature Map : 0x1
>>>    Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
>>>          Name : GATEWAY:127  (local to host GATEWAY)
>>> Creation Time : Sat Aug 22 09:44:21 2009
>>>    Raid Level : raid5
>>>  Raid Devices : 4
>>> 
>>> Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
>>>    Array Size : 1758296832 (838.42 GiB 900.25 GB)
>>> Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
>>>   Data Offset : 272 sectors
>>>  Super Offset : 8 sectors
>>>         State : clean
>>>   Device UUID : f8ebb9f8:b447f894:d8b0b59f:ca8e98eb
>>> 
>>> Internal Bitmap : 2 sectors from superblock
>>>   Update Time : Fri Mar 19 00:56:15 2010
>>>      Checksum : 1005cfbc - correct
>>>        Events : 3796145
>>> 
>>>        Layout : left-symmetric
>>>    Chunk Size : 64K
>>> 
>>>  Device Role : Active device 2
>>>  Array State : .AA. ('A' == active, '.' == missing)
>>> /dev/sdc5:
>>>         Magic : a92b4efc
>>>       Version : 1.2
>>>   Feature Map : 0x1
>>>    Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
>>>          Name : GATEWAY:127  (local to host GATEWAY)
>>> Creation Time : Sat Aug 22 09:44:21 2009
>>>    Raid Level : raid5
>>>  Raid Devices : 4
>>> 
>>> Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
>>>    Array Size : 1758296832 (838.42 GiB 900.25 GB)
>>> Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
>>>   Data Offset : 272 sectors
>>>  Super Offset : 8 sectors
>>>         State : clean
>>>   Device UUID : d9ce99fc:79bc1e9d:197d5b11:c990e007
>>> 
>>> Internal Bitmap : 2 sectors from superblock
>>>   Update Time : Fri Mar 19 00:56:15 2010
>>>      Checksum : a9f9f59f - correct
>>>        Events : 3796145
>>> 
>>>        Layout : left-symmetric
>>>    Chunk Size : 64K
>>> 
>>>  Device Role : Active device 1
>>>  Array State : .AA. ('A' == active, '.' == missing)
>>> /dev/sdd5:
>>>         Magic : a92b4efc
>>>       Version : 1.2
>>>   Feature Map : 0x1
>>>    Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
>>>          Name : GATEWAY:127  (local to host GATEWAY)
>>> Creation Time : Sat Aug 22 09:44:21 2009
>>>    Raid Level : raid5
>>>  Raid Devices : 4
>>> 
>>> Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
>>>    Array Size : 1758296832 (838.42 GiB 900.25 GB)
>>> Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
>>>   Data Offset : 272 sectors
>>>  Super Offset : 8 sectors
>>>         State : clean
>>>   Device UUID : 763a832f:1a9a7ea8:ce90d4a3:32e8ae54
>>> 
>>> Internal Bitmap : 2 sectors from superblock
>>>   Update Time : Fri Mar 19 00:56:15 2010
>>>      Checksum : c78aab46 - correct
>>>        Events : 3796145
>>> 
>>>        Layout : left-symmetric
>>>    Chunk Size : 64K
>>> 
>>>  Device Role : spare
>>>  Array State : .AA. ('A' == active, '.' == missing)
>>> 
>>> 
>>> Regards,
>>> Anshuman
>>> 
>>> On 26-Mar-2010, at 9:08 AM, Michael Evans wrote:
>>> 
>>>> On Thu, Mar 25, 2010 at 7:09 AM, Anshuman Aggarwal
>>>> <anshuman@brillgene.com> wrote:
>>>>> Thanks Michael, I am clear about the problem of why the multiple failure would cause me to lose data. Which is why I wanted to consult this mailing list before proceeding.
>>>>> 
>>>>> Could you tell me how to keep the array read-only?  and mark one or both of these spares as active forcibly? and Also, once I am able to use these spares as active and the data is not consistent in a particular stripe, how does the kernel resolve the inconsistency (as in what data does it use, the one based on the data stripes or the one based on the parity?) this one is just academic interest since it'll be difficult to figure out which is the right data anyways.
>>>>> 
>>>>> Thanks,
>>>>> Anshuman
>>>>> 
>>>> 
>>>> Please, read the wikipedia page first,
>>>> 
>>>> http://en.wikipedia.org/wiki/RAID
>>>> 
>>>> and then this
>>>> 
>>>> http://wiki.tldp.org/LVM-on-RAID (some links need updating, but it's
>>>> still up to date for concepts)
>>>> 
>>>> 
>>>> With that background nearly out of the way, please stop, and read them
>>>> both again.  Yes, seriously.  In order to prevent data loss you'll
>>>> need to have a good understanding of what RAID does, so that you can
>>>> watch out for ways it can fail.
>>>> 
>>>> The next step, before we do /anything/ else is for you to post the
>>>> COMPLETE output of these commands.
>>>> 
>>>> mdadm -Dvvs
>>>> mdadm -Evvs
>>>> 
>>>> They will help everyone on the list better understand the state of the
>>>> metadata records and what potential solutions might be possible.
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>> 
>>> 
>> 
>> Obviously you do not understand the problem then, since if you did not
>> previously, and you say you learned nothing new.
>> 
>> Also, you added additional arguments to the commands I provided when
>> that was neither required nor desired.
>> 
>> However enough data was returned to see one thing:  ALL of the events
>> counters show the same number.
>> 
>> That is extremely odd, usually in this situation at least one device
>> will have a lower number.
>> 
>> 
>> If possible please describe what happened to cause this in the first place.
>> 
>> Also, you'll find these links more directly relevant to your problem:
>> 
>> https://raid.wiki.kernel.org/index.php/RAID_Recovery
>> 
>> Reading my local copy of the manpage (which is slightly outdated, you
>> should really get the latest stable mdadm release, compile, install
>> and read the manual to confirm it's still not there) I can't find any
>> way of bringing an array up in read only mode without using missing
>> devices, which is what the permutation script tries to do.
>> Additionally without knowing what type of event is being recovered
>> from; I suspect either simultaneous disconnection of half the drives;
>> or what you've done since, because it looks like something, I cannot
>> offer concrete advice on how to proceed.
>> 
>> However there are two main routes open to you at this point.  Posting
>> a fresh message asking how to create an array read only for use with
>> data recovery, and some variant of following the perl script's steps
>> that the linked document mentions.
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
> Michael,
> I am running mdadm 3.1.2 (latest stable I think) compiled from source (FYI on Ubuntu Karmic, 2.6.31-20-generic)
> 
> Here is what happened....the device /dev/sda1 has failed once, but I was wondering if it was a freak accident so I tried adding it back..and then it started resyncing ...somewhere in this process...the disk /dev/sda1 stalled and the server needed a reboot. After that boot, I got 2 spares (/dev/sda1, /dev/sdd5) and 2 active devices (/dev/sdb1, /dev/sdc1)
> 
> Maybe I need to do a build with a --assume-clean with the devices in the right order (which I'm positive I can remember) ...be nice if you could plz double check:
> mdadm --build -n 4 -l 5 -e1.2 --assume-clean /dev/md127 /dev/sda1 /dev/sdb5 /dev/sdc5 /dev/sdd5
> 
> Again, thanks for your time...
> 
> John,
> I did try what you said without any luck(--assemble --force but it refuses to accept the spare as a valid device and 2 active on a 4 member device isn't good enough)
> 
> 
> 

Some more info:

I did try this command with the following result:

mdadm --build -n 4 -l 5 -e1.2 --assume-clean /dev/md127 /dev/sda1 /dev/sdb5 /dev/sdc5 /dev/sdd5
mdadm: Raid level 5 not permitted with --build.

Should I try this?
mdadm --create -n 4 -l 5 -e1.2 --assume-clean /dev/md127 /dev/sda1 /dev/sdb5 /dev/sdc5 /dev/sdd5

Thanks


* Re: 4 partition raid 5 with 2 disks active and 2 spare, how to force?
  2010-03-28 16:35               ` Anshuman Aggarwal
@ 2010-03-29  5:32                 ` Luca Berra
  2010-03-29  6:41                   ` Michael Evans
  0 siblings, 1 reply; 16+ messages in thread
From: Luca Berra @ 2010-03-29  5:32 UTC (permalink / raw)
  To: linux-raid

On Sun, Mar 28, 2010 at 10:05:58PM +0530, Anshuman Aggarwal wrote:
>> Michael,
>> I am running mdadm 3.1.2 (latest stable I think) compiled from source (FYI on Ubuntu Karmic, 2.6.31-20-generic)
>> 
>> Here is what happened....the device /dev/sda1 has failed once, but I was wondering if it was a freak accident so I tried adding it back..and then it started resyncing ...somewhere in this process...the disk /dev/sda1 stalled and the server needed a reboot. After that boot, I got 2 spares (/dev/sda1, /dev/sdd5) and 2 active devices (/dev/sdb1, /dev/sdc1)
>> 
>> Maybe I need to do a build with a --assume-clean with the devices in the right order (which I'm positive I can remember) ...be nice if you could plz double check:
>> mdadm --build -n 4 -l 5 -e1.2 --assume-clean /dev/md127 /dev/sda1 /dev/sdb5 /dev/sdc5 /dev/sdd5
>> 
>> Again, thanks for your time...
>> 
>> John,
>> I did try what you said without any luck(--assemble --force but it refuses to accept the spare as a valid device and 2 active on a 4 member device isn't good enough)
>> 
>> 
>> 
>
>Some more info:
>
>I did try this command with the following result:
>
>mdadm --build -n 4 -l 5 -e1.2 --assume-clean /dev/md127 /dev/sda1 /dev/sdb5 /dev/sdc5 /dev/sdd5
>mdadm: Raid level 5 not permitted with --build.
>
>Should I try this?
>mdadm --create -n 4 -l 5 -e1.2 --assume-clean /dev/md127 /dev/sda1 /dev/sdb5 /dev/sdc5 /dev/sdd5
 From your description above, /dev/sda was the failed one, so you should
not add it to the array. Use the word "missing" in its place.
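
Something like this, purely as a sketch: the device order has to match the original Device Roles from --examine (sdc5 was "Active device 1" and sdb5 "Active device 2"; I am assuming sda1 held role 0 and sdd5 role 3, which the current "spare" labels no longer prove), and the chunk size and metadata version must match the old array:

mdadm --create /dev/md127 --assume-clean --level=5 --raid-devices=4 \
      --metadata=1.2 --chunk=64 \
      missing /dev/sdc5 /dev/sdb5 /dev/sdd5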

L.

-- 
Luca Berra -- bluca@comedia.it
         Communication Media & Services S.r.l.
  /"\
  \ /     ASCII RIBBON CAMPAIGN
   X        AGAINST HTML MAIL
  / \


* Re: 4 partition raid 5 with 2 disks active and 2 spare, how to force?
  2010-03-29  5:32                 ` Luca Berra
@ 2010-03-29  6:41                   ` Michael Evans
  2010-04-06 18:07                     ` linux raid recreate Anshuman Aggarwal
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Evans @ 2010-03-29  6:41 UTC (permalink / raw)
  To: linux-raid

On Sun, Mar 28, 2010 at 10:32 PM, Luca Berra <bluca@comedia.it> wrote:
> On Sun, Mar 28, 2010 at 10:05:58PM +0530, Anshuman Aggarwal wrote:
>>>
>>> Michael,
>>> I am running mdadm 3.1.2 (latest stable I think) compiled from source
>>> (FYI on Ubuntu Karmic, 2.6.31-20-generic)
>>>
>>> Here is what happened....the device /dev/sda1 has failed once, but I was
>>> wondering if it was a freak accident so I tried adding it back..and then it
>>> started resyncing ...somewhere in this process...the disk /dev/sda1 stalled
>>> and the server needed a reboot. After that boot, I got 2 spares (/dev/sda1,
>>> /dev/sdd5) and 2 active devices (/dev/sdb1, /dev/sdc1)
>>>
>>> Maybe I need to do a build with a --assume-clean with the devices in the
>>> right order (which I'm positive I can remember) ...be nice if you could plz
>>> double check:
>>> mdadm --build -n 4 -l 5 -e1.2 --assume-clean /dev/md127 /dev/sda1
>>> /dev/sdb5 /dev/sdc5 /dev/sdd5
>>>
>>> Again, thanks for your time...
>>>
>>> John,
>>> I did try what you said without any luck(--assemble --force but it
>>> refuses to accept the spare as a valid device and 2 active on a 4 member
>>> device isn't good enough)
>>>
>>>
>>>
>>
>> Some more info:
>>
>> I did try this command with the following result:
>>
>> mdadm --build -n 4 -l 5 -e1.2 --assume-clean /dev/md127 /dev/sda1
>> /dev/sdb5 /dev/sdc5 /dev/sdd5
>> mdadm: Raid level 5 not permitted with --build.
>>
>> Should I try this?
>> mdadm --create -n 4 -l 5 -e1.2 --assume-clean /dev/md127 /dev/sda1
>> /dev/sdb5 /dev/sdc5 /dev/sdd5
>
> From your description above /dev/sda was the failed one, so you should
> not add it to the array. use the word "missing" in its place.
>
> L.
>
> --
> Luca Berra -- bluca@comedia.it
>        Communication Media & Services S.r.l.
>  /"\
>  \ /     ASCII RIBBON CAMPAIGN
>  X        AGAINST HTML MAIL
>  / \
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

In addition to using "missing" for the device you know to have failed, I
very strongly suggest running a check, or some other read-only operation,
on the resulting raid device to make sure you can read all of the data.
Be sure to check dmesg/the system logs to confirm there were no storage
errors.  If there were none, it is /probably/ safe to re-add the
previously failed disk and let it resync.
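
For example (assuming the array is assembled as /dev/md127 again, with the
three remaining members in place):

  cat /proc/mdstat                      # confirm the array is up with 3 of 4 members
  dd if=/dev/md127 of=/dev/null bs=1M   # read the whole array once, read-only
  dmesg | tail -n 50                    # look for I/O errors from the member drives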

While checking that your array data can be read, you should probably also
run the SMART tests via smartctl (or a GUI front-end for it) on the
'failed' disk, to see whether the failure was a sign of something worse.
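
Something along these lines, with /dev/sda as the suspect disk (smartctl is
from the smartmontools package):

  smartctl -H /dev/sda           # overall health assessment
  smartctl -t long /dev/sda      # start an extended offline self-test
  # ...once the self-test has finished:
  smartctl -l selftest /dev/sda  # self-test log
  smartctl -A /dev/sda           # reallocated/pending/uncorrectable sector counts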

In any case, I do NOT recommend using anything within the raid container
other than in read-only mode until the resync is complete.  You may still
need the portions of sda that are good, in more elaborate ways, to recover
data that is readable there but not readable on sdd or the other drives.
Read/write mode, or even an fsck of the array contents, will only increase
the chances of data being out of sync.
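
A sketch of keeping things read-only while you look around (the VG/LV names
here are placeholders for whatever your LVM setup actually uses):

  vgchange -ay vg0                      # placeholder VG name
  mount -o ro /dev/vg0/lv0 /mnt/check   # placeholder LV name (a journalled fs may
                                        # still replay its journal even with -o ro)
  tar cf /dev/null /mnt/check           # read every file back, then re-check dmesg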
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* linux raid recreate
  2010-03-29  6:41                   ` Michael Evans
@ 2010-04-06 18:07                     ` Anshuman Aggarwal
  2010-04-06 22:55                       ` Neil Brown
  0 siblings, 1 reply; 16+ messages in thread
From: Anshuman Aggarwal @ 2010-04-06 18:07 UTC (permalink / raw)
  To: linux-raid

I've just had to recreate my raid5 device using
mdadm --create --assume-clean -n4 -l5 -e1.2 -c64

in order to recover my data (because --assemble would not work, even with --force).
The problems:
 * The Data Offset in the new array is much larger.
 * The Internal Bitmap starts at a different number of sectors from the superblock.
 * The Array Size is smaller, even though the disks are the same.

How can I get these to match what they were in the original array?


I have tried to make sure that nothing got written to the md device except the metadata during the create.
All of these details matter because the fs (on top of LVM, on top of the md) will need all the data it can get to fsck properly, and I don't want it starting at the wrong offset.

I am including the output from mdadm --examine from before and after the create

Originally...

>>> /dev/sdb5:
>>>        Magic : a92b4efc
>>>      Version : 1.2
>>>  Feature Map : 0x1
>>>   Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
>>>         Name : GATEWAY:127  (local to host GATEWAY)
>>> Creation Time : Sat Aug 22 09:44:21 2009
>>>   Raid Level : raid5
>>> Raid Devices : 4
>>> 
>>> Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
>>>   Array Size : 1758296832 (838.42 GiB 900.25 GB)
>>> Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
>>>  Data Offset : 272 sectors
>>> Super Offset : 8 sectors
>>>        State : clean
>>>  Device UUID : f8ebb9f8:b447f894:d8b0b59f:ca8e98eb
>>> 
>>> Internal Bitmap : 2 sectors from superblock
>>>  Update Time : Fri Mar 19 00:56:15 2010
>>>     Checksum : 1005cfbc - correct
>>>       Events : 3796145
>>> 
>>>       Layout : left-symmetric
>>>   Chunk Size : 64K
>>> 
>>> Device Role : Active device 2
>>> Array State : .AA. ('A' == active, '.' == missing)

New...

/dev/sdb5:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 8588b69c:c0579680:8a63486a:cbcb0e7d
           Name : GATEWAY:511  (local to host GATEWAY)
  Creation Time : Tue Apr  6 01:53:25 2010
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 586097284 (279.47 GiB 300.08 GB)
     Array Size : 1758290688 (838.42 GiB 900.24 GB)
  Used Dev Size : 586096896 (279.47 GiB 300.08 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 13d6a075:c1cad6dc:c13c3d98:e4b980e9

Internal Bitmap : 8 sectors from superblock
    Update Time : Tue Apr  6 23:23:07 2010
       Checksum : df3cb34f - correct
         Events : 4

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 2
   Array State : .AAA ('A' == active, '.' == missing)




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: linux raid recreate
  2010-04-06 18:07                     ` linux raid recreate Anshuman Aggarwal
@ 2010-04-06 22:55                       ` Neil Brown
  2010-04-07  0:24                         ` Berkey B Walker
  2010-04-07  7:27                         ` Anshuman Aggarwal
  0 siblings, 2 replies; 16+ messages in thread
From: Neil Brown @ 2010-04-06 22:55 UTC (permalink / raw)
  To: Anshuman Aggarwal; +Cc: linux-raid

On Tue, 6 Apr 2010 23:37:02 +0530
Anshuman Aggarwal <anshuman@brillgene.com> wrote:

> I've just had to recreate my raid5 device by using 
> mdadm --create --assume-clean -n4 -l5 -e1.2 -c64 
> 
> in order to recover my data (because --assemble would not work with force etc.). 
> The problem:
>  *  Data Offset in the new array is much larger. 
>  * Internal Bitmap is starting at a different # sectors from superblock.
>  * Array Size is smaller though the disks are the same. 
> 
> How can I get these to be the same as what they were in the original array???

Use the same version of mdadm as you used to originally create the array.
Probably 2.6.9, judging from the data, though 3.1.1 seems to create the same
layout.  So anything before 3.1.2 should work.
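
For example, a sketch assuming the 3.1.1 tarball from kernel.org (location
quoted from memory) and the same create parameters as before:

  wget http://www.kernel.org/pub/linux/utils/raid/mdadm/mdadm-3.1.1.tar.gz
  tar xzf mdadm-3.1.1.tar.gz && cd mdadm-3.1.1 && make
  ./mdadm --create /dev/md127 -e1.2 -n4 -l5 -c64 --assume-clean \
          missing <working partitions, in the same slot order as before>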

I really should write a "--recreate" for mdadm which reuses whatever
parameters it finds already on the devices.

NeilBrown


> 
> I have tried to make sure that nothing gets written to the md device except the metadata during create. 
> All of these are important because the fs on top of the LVM on top of the md would need all the data it can to fsck properly and I don't want it starting on the wrong offset. 
> 
> I am including the output from mdadm --examine from before and after the create
> 
> Originally...
> 
> >>> /dev/sdb5:
> >>>        Magic : a92b4efc
> >>>      Version : 1.2
> >>>  Feature Map : 0x1
> >>>   Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
> >>>         Name : GATEWAY:127  (local to host GATEWAY)
> >>> Creation Time : Sat Aug 22 09:44:21 2009
> >>>   Raid Level : raid5
> >>> Raid Devices : 4
> >>> 
> >>> Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
> >>>   Array Size : 1758296832 (838.42 GiB 900.25 GB)
> >>> Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
> >>>  Data Offset : 272 sectors
> >>> Super Offset : 8 sectors
> >>>        State : clean
> >>>  Device UUID : f8ebb9f8:b447f894:d8b0b59f:ca8e98eb
> >>> 
> >>> Internal Bitmap : 2 sectors from superblock
> >>>  Update Time : Fri Mar 19 00:56:15 2010
> >>>     Checksum : 1005cfbc - correct
> >>>       Events : 3796145
> >>> 
> >>>       Layout : left-symmetric
> >>>   Chunk Size : 64K
> >>> 
> >>> Device Role : Active device 2
> >>> Array State : .AA. ('A' == active, '.' == missing)
> 
> New...
> 
> /dev/sdb5:
>           Magic : a92b4efc
>         Version : 1.2
>     Feature Map : 0x1
>      Array UUID : 8588b69c:c0579680:8a63486a:cbcb0e7d
>            Name : GATEWAY:511  (local to host GATEWAY)
>   Creation Time : Tue Apr  6 01:53:25 2010
>      Raid Level : raid5
>    Raid Devices : 4
> 
>  Avail Dev Size : 586097284 (279.47 GiB 300.08 GB)
>      Array Size : 1758290688 (838.42 GiB 900.24 GB)
>   Used Dev Size : 586096896 (279.47 GiB 300.08 GB)
>     Data Offset : 2048 sectors
>    Super Offset : 8 sectors
>           State : clean
>     Device UUID : 13d6a075:c1cad6dc:c13c3d98:e4b980e9
> 
> Internal Bitmap : 8 sectors from superblock
>     Update Time : Tue Apr  6 23:23:07 2010
>        Checksum : df3cb34f - correct
>          Events : 4
> 
>          Layout : left-symmetric
>      Chunk Size : 64K
> 
>    Device Role : Active device 2
>    Array State : .AAA ('A' == active, '.' == missing)
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: linux raid recreate
  2010-04-06 22:55                       ` Neil Brown
@ 2010-04-07  0:24                         ` Berkey B Walker
  2010-04-07  7:27                         ` Anshuman Aggarwal
  1 sibling, 0 replies; 16+ messages in thread
From: Berkey B Walker @ 2010-04-07  0:24 UTC (permalink / raw)
  To: Neil Brown; +Cc: Anshuman Aggarwal, linux-raid



Neil Brown wrote:
> On Tue, 6 Apr 2010 23:37:02 +0530
> Anshuman Aggarwal<anshuman@brillgene.com>  wrote:
>
>    
>> I've just had to recreate my raid5 device by using
>> mdadm --create --assume-clean -n4 -l5 -e1.2 -c64
>>
>> in order to recover my data (because --assemble would not work with force etc.).
>> The problem:
>>   *  Data Offset in the new array is much larger.
>>   * Internal Bitmap is starting at a different # sectors from superblock.
>>   * Array Size is smaller though the disks are the same.
>>
>> How can I get these to be the same as what they were in the original array???
>>      
> Use the same version of mdadm and you used to originally create the array.
> Probably 2.6.9 from the data, though 3.1.1 seems to create the same layout.
> So anything before 3.1.2
>
> I really should write a "--recreate" for mdadm which uses whatever parameters
> if it finds already on the devices.
>
> NeilBrown
>
>
>    
I think that is great thinking, Sir.
berk-
>> I have tried to make sure that nothing gets written to the md device except the metadata during create.
>> All of these are important because the fs on top of the LVM on top of the md would need all the data it can to fsck properly and I don't want it starting on the wrong offset.
>>
>> I am including the output from mdadm --examine from before and after the create
>>
>> Originally...
>>
>>      
>>>>> /dev/sdb5:
>>>>>         Magic : a92b4efc
>>>>>       Version : 1.2
>>>>>   Feature Map : 0x1
>>>>>    Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
>>>>>          Name : GATEWAY:127  (local to host GATEWAY)
>>>>> Creation Time : Sat Aug 22 09:44:21 2009
>>>>>    Raid Level : raid5
>>>>> Raid Devices : 4
>>>>>
>>>>> Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
>>>>>    Array Size : 1758296832 (838.42 GiB 900.25 GB)
>>>>> Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
>>>>>   Data Offset : 272 sectors
>>>>> Super Offset : 8 sectors
>>>>>         State : clean
>>>>>   Device UUID : f8ebb9f8:b447f894:d8b0b59f:ca8e98eb
>>>>>
>>>>> Internal Bitmap : 2 sectors from superblock
>>>>>   Update Time : Fri Mar 19 00:56:15 2010
>>>>>      Checksum : 1005cfbc - correct
>>>>>        Events : 3796145
>>>>>
>>>>>        Layout : left-symmetric
>>>>>    Chunk Size : 64K
>>>>>
>>>>> Device Role : Active device 2
>>>>> Array State : .AA. ('A' == active, '.' == missing)
>>>>>            
>> New...
>>
>> /dev/sdb5:
>>            Magic : a92b4efc
>>          Version : 1.2
>>      Feature Map : 0x1
>>       Array UUID : 8588b69c:c0579680:8a63486a:cbcb0e7d
>>             Name : GATEWAY:511  (local to host GATEWAY)
>>    Creation Time : Tue Apr  6 01:53:25 2010
>>       Raid Level : raid5
>>     Raid Devices : 4
>>
>>   Avail Dev Size : 586097284 (279.47 GiB 300.08 GB)
>>       Array Size : 1758290688 (838.42 GiB 900.24 GB)
>>    Used Dev Size : 586096896 (279.47 GiB 300.08 GB)
>>      Data Offset : 2048 sectors
>>     Super Offset : 8 sectors
>>            State : clean
>>      Device UUID : 13d6a075:c1cad6dc:c13c3d98:e4b980e9
>>
>> Internal Bitmap : 8 sectors from superblock
>>      Update Time : Tue Apr  6 23:23:07 2010
>>         Checksum : df3cb34f - correct
>>           Events : 4
>>
>>           Layout : left-symmetric
>>       Chunk Size : 64K
>>
>>     Device Role : Active device 2
>>     Array State : .AAA ('A' == active, '.' == missing)
>>
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>      
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>    

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: linux raid recreate
  2010-04-06 22:55                       ` Neil Brown
  2010-04-07  0:24                         ` Berkey B Walker
@ 2010-04-07  7:27                         ` Anshuman Aggarwal
  2010-04-07 13:15                           ` Neil Brown
  1 sibling, 1 reply; 16+ messages in thread
From: Anshuman Aggarwal @ 2010-04-07  7:27 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid


On 07-Apr-2010, at 4:25 AM, Neil Brown wrote:

> On Tue, 6 Apr 2010 23:37:02 +0530
> Anshuman Aggarwal <anshuman@brillgene.com> wrote:
> 
>> I've just had to recreate my raid5 device by using 
>> mdadm --create --assume-clean -n4 -l5 -e1.2 -c64 
>> 
>> in order to recover my data (because --assemble would not work with force etc.). 
>> The problem:
>> *  Data Offset in the new array is much larger. 
>> * Internal Bitmap is starting at a different # sectors from superblock.
>> * Array Size is smaller though the disks are the same. 
>> 
>> How can I get these to be the same as what they were in the original array???
> 
> Use the same version of mdadm and you used to originally create the array.
> Probably 2.6.9 from the data, though 3.1.1 seems to create the same layout.
> So anything before 3.1.2
> 
> I really should write a "--recreate" for mdadm which uses whatever parameters
> if it finds already on the devices.
> 
> NeilBrown
> 


Since I already tried the recreate using 3.1.2, with superblock 1.2, would it have overwritten much other data on the device? Also, is the superblock format documented anywhere, e.g. a diagram showing what is stored where?

Thanks,

> 
>> 
>> I have tried to make sure that nothing gets written to the md device except the metadata during create. 
>> All of these are important because the fs on top of the LVM on top of the md would need all the data it can to fsck properly and I don't want it starting on the wrong offset. 
>> 
>> I am including the output from mdadm --examine from before and after the create
>> 
>> Originally...
>> 
>>>>> /dev/sdb5:
>>>>>       Magic : a92b4efc
>>>>>     Version : 1.2
>>>>> Feature Map : 0x1
>>>>>  Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
>>>>>        Name : GATEWAY:127  (local to host GATEWAY)
>>>>> Creation Time : Sat Aug 22 09:44:21 2009
>>>>>  Raid Level : raid5
>>>>> Raid Devices : 4
>>>>> 
>>>>> Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
>>>>>  Array Size : 1758296832 (838.42 GiB 900.25 GB)
>>>>> Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
>>>>> Data Offset : 272 sectors
>>>>> Super Offset : 8 sectors
>>>>>       State : clean
>>>>> Device UUID : f8ebb9f8:b447f894:d8b0b59f:ca8e98eb
>>>>> 
>>>>> Internal Bitmap : 2 sectors from superblock
>>>>> Update Time : Fri Mar 19 00:56:15 2010
>>>>>    Checksum : 1005cfbc - correct
>>>>>      Events : 3796145
>>>>> 
>>>>>      Layout : left-symmetric
>>>>>  Chunk Size : 64K
>>>>> 
>>>>> Device Role : Active device 2
>>>>> Array State : .AA. ('A' == active, '.' == missing)
>> 
>> New...
>> 
>> /dev/sdb5:
>>          Magic : a92b4efc
>>        Version : 1.2
>>    Feature Map : 0x1
>>     Array UUID : 8588b69c:c0579680:8a63486a:cbcb0e7d
>>           Name : GATEWAY:511  (local to host GATEWAY)
>>  Creation Time : Tue Apr  6 01:53:25 2010
>>     Raid Level : raid5
>>   Raid Devices : 4
>> 
>> Avail Dev Size : 586097284 (279.47 GiB 300.08 GB)
>>     Array Size : 1758290688 (838.42 GiB 900.24 GB)
>>  Used Dev Size : 586096896 (279.47 GiB 300.08 GB)
>>    Data Offset : 2048 sectors
>>   Super Offset : 8 sectors
>>          State : clean
>>    Device UUID : 13d6a075:c1cad6dc:c13c3d98:e4b980e9
>> 
>> Internal Bitmap : 8 sectors from superblock
>>    Update Time : Tue Apr  6 23:23:07 2010
>>       Checksum : df3cb34f - correct
>>         Events : 4
>> 
>>         Layout : left-symmetric
>>     Chunk Size : 64K
>> 
>>   Device Role : Active device 2
>>   Array State : .AAA ('A' == active, '.' == missing)
>> 
>> 
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: linux raid recreate
  2010-04-07  7:27                         ` Anshuman Aggarwal
@ 2010-04-07 13:15                           ` Neil Brown
  0 siblings, 0 replies; 16+ messages in thread
From: Neil Brown @ 2010-04-07 13:15 UTC (permalink / raw)
  To: Anshuman Aggarwal; +Cc: linux-raid

On Wed, 7 Apr 2010 12:57:20 +0530
Anshuman Aggarwal <anshuman@brillgene.com> wrote:

> 
> On 07-Apr-2010, at 4:25 AM, Neil Brown wrote:
> 
> > On Tue, 6 Apr 2010 23:37:02 +0530
> > Anshuman Aggarwal <anshuman@brillgene.com> wrote:
> > 
> >> I've just had to recreate my raid5 device by using 
> >> mdadm --create --assume-clean -n4 -l5 -e1.2 -c64 
> >> 
> >> in order to recover my data (because --assemble would not work with force etc.). 
> >> The problem:
> >> *  Data Offset in the new array is much larger. 
> >> * Internal Bitmap is starting at a different # sectors from superblock.
> >> * Array Size is smaller though the disks are the same. 
> >> 
> >> How can I get these to be the same as what they were in the original array???
> > 
> > Use the same version of mdadm and you used to originally create the array.
> > Probably 2.6.9 from the data, though 3.1.1 seems to create the same layout.
> > So anything before 3.1.2
> > 
> > I really should write a "--recreate" for mdadm which uses whatever parameters
> > if it finds already on the devices.
> > 
> > NeilBrown
> > 
> 
> 
> Since I already tried to recreate using 3.1.2, with super block 1.2, would it have overwritten much other data on the device? Also is the superblock format documented somewhere such as a graph explaining where what is stored?

I don't think it will have overwritten any data, but I don't have enough info
to be 100% certain.

If you used --assume-clean, and did not write anything to the array, then
only the superblock and bitmap will have been written.

The superblock that you wrote went to the same location as the old
superblock, so writing it will not have corrupted any data.

The bitmap will have been written 8 sectors from the superblock rather
than 2, but it will probably have been a smaller bitmap.
If you report the output of
  mdadm -X /dev/sdb5
I can tell you how big the bitmap is, and so whether it would have extended
into the data, which started at 272 sectors from the start of the device.
So the bitmap would have to exceed 266 sectors for it to overwrite any data.

The only superblock documentation I know of is in the source code for mdadm
and the kernel.

NeilBrown



> 
> Thanks,
> 
> > 
> >> 
> >> I have tried to make sure that nothing gets written to the md device except the metadata during create. 
> >> All of these are important because the fs on top of the LVM on top of the md would need all the data it can to fsck properly and I don't want it starting on the wrong offset. 
> >> 
> >> I am including the output from mdadm --examine from before and after the create
> >> 
> >> Originally...
> >> 
> >>>>> /dev/sdb5:
> >>>>>       Magic : a92b4efc
> >>>>>     Version : 1.2
> >>>>> Feature Map : 0x1
> >>>>>  Array UUID : 42c56ea0:2484f566:387adc6c:b3f6a014
> >>>>>        Name : GATEWAY:127  (local to host GATEWAY)
> >>>>> Creation Time : Sat Aug 22 09:44:21 2009
> >>>>>  Raid Level : raid5
> >>>>> Raid Devices : 4
> >>>>> 
> >>>>> Avail Dev Size : 586099060 (279.47 GiB 300.08 GB)
> >>>>>  Array Size : 1758296832 (838.42 GiB 900.25 GB)
> >>>>> Used Dev Size : 586098944 (279.47 GiB 300.08 GB)
> >>>>> Data Offset : 272 sectors
> >>>>> Super Offset : 8 sectors
> >>>>>       State : clean
> >>>>> Device UUID : f8ebb9f8:b447f894:d8b0b59f:ca8e98eb
> >>>>> 
> >>>>> Internal Bitmap : 2 sectors from superblock
> >>>>> Update Time : Fri Mar 19 00:56:15 2010
> >>>>>    Checksum : 1005cfbc - correct
> >>>>>      Events : 3796145
> >>>>> 
> >>>>>      Layout : left-symmetric
> >>>>>  Chunk Size : 64K
> >>>>> 
> >>>>> Device Role : Active device 2
> >>>>> Array State : .AA. ('A' == active, '.' == missing)
> >> 
> >> New...
> >> 
> >> /dev/sdb5:
> >>          Magic : a92b4efc
> >>        Version : 1.2
> >>    Feature Map : 0x1
> >>     Array UUID : 8588b69c:c0579680:8a63486a:cbcb0e7d
> >>           Name : GATEWAY:511  (local to host GATEWAY)
> >>  Creation Time : Tue Apr  6 01:53:25 2010
> >>     Raid Level : raid5
> >>   Raid Devices : 4
> >> 
> >> Avail Dev Size : 586097284 (279.47 GiB 300.08 GB)
> >>     Array Size : 1758290688 (838.42 GiB 900.24 GB)
> >>  Used Dev Size : 586096896 (279.47 GiB 300.08 GB)
> >>    Data Offset : 2048 sectors
> >>   Super Offset : 8 sectors
> >>          State : clean
> >>    Device UUID : 13d6a075:c1cad6dc:c13c3d98:e4b980e9
> >> 
> >> Internal Bitmap : 8 sectors from superblock
> >>    Update Time : Tue Apr  6 23:23:07 2010
> >>       Checksum : df3cb34f - correct
> >>         Events : 4
> >> 
> >>         Layout : left-symmetric
> >>     Chunk Size : 64K
> >> 
> >>   Device Role : Active device 2
> >>   Array State : .AAA ('A' == active, '.' == missing)
> >> 
> >> 
> >> 
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> >> the body of a message to majordomo@vger.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2010-04-07 13:15 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <S1753093Ab0CYHZE/20100325072504Z+37@vger.kernel.org>
2010-03-25  9:30 ` 4 partition raid 5 with 2 disks active and 2 spare, how to force? Anshuman Aggarwal
2010-03-25 11:37   ` Michael Evans
2010-03-25 14:09     ` Anshuman Aggarwal
2010-03-26  3:38       ` Michael Evans
2010-03-26 16:28         ` Anshuman Aggarwal
2010-03-26 19:04           ` Michael Evans
2010-03-28 15:18             ` Anshuman Aggarwal
2010-03-28 16:35               ` Anshuman Aggarwal
2010-03-29  5:32                 ` Luca Berra
2010-03-29  6:41                   ` Michael Evans
2010-04-06 18:07                     ` linux raid recreate Anshuman Aggarwal
2010-04-06 22:55                       ` Neil Brown
2010-04-07  0:24                         ` Berkey B Walker
2010-04-07  7:27                         ` Anshuman Aggarwal
2010-04-07 13:15                           ` Neil Brown
2010-03-26 19:29           ` 4 partition raid 5 with 2 disks active and 2 spare, how to force? John Robinson
