* raid10 issues after reorder of boot drives.
@ 2012-04-27 20:04 likewhoa
  2012-04-27 21:51 ` likewhoa
  2012-04-27 22:03 ` NeilBrown
  0 siblings, 2 replies; 20+ messages in thread
From: likewhoa @ 2012-04-27 20:04 UTC (permalink / raw)
  To: linux-raid

I have a strange issue on my raid10 8x400GB array. I cannot assemble my
array anymore and it's gotten to a point that I don't know what to do to
recover my data. I was hoping I could get some advice on it. Below is
some info which I hope will help troubleshoot this issue.

> mdadm -Esv
ARRAY /dev/md/1 metadata=1.0 num-devices=0
UUID=828ed03d:0c28afda:4a636e88:7b29ec9f name=Darkside:1
   spares=7  
devices=/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdc3,/dev/sda3,/dev/sdb3
ARRAY /dev/md/1 level=raid10 metadata=1.0 num-devices=8
UUID=828ed03d:0c28afda:4a636e88:7b29ec9f name=Darkside:1
   devices=8:131

> mdadm --version
mdadm - v3.2.3 - 23rd December 2011

> for i in /dev/sd{a,b,c,d,e,f,g,i}; do fdisk -l ${i}; done


Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x4644f013

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1            2048     2099199     1048576   82  Linux swap / Solaris
/dev/sda2         2099200    73779199    35840000   fd  Linux raid
autodetect
/dev/sda3        73779200   976773119   451496960   fd  Linux raid
autodetect

Disk /dev/sdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000e460f

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1            2048     2099199     1048576   82  Linux swap / Solaris
/dev/sdb2         2099200    73779199    35840000   fd  Linux raid
autodetect
/dev/sdb3        73779200   976773119   451496960   fd  Linux raid
autodetect

Disk /dev/sdc: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x7d10530f

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1            2048     2099199     1048576   82  Linux swap / Solaris
/dev/sdc2         2099200    73779199    35840000   fd  Linux raid
autodetect
/dev/sdc3        73779200   976773119   451496960   fd  Linux raid
autodetect

Disk /dev/sdd: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x81a213ab

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1            2048     2099199     1048576   82  Linux swap / Solaris
/dev/sdd2         2099200    73779199    35840000   fd  Linux raid
autodetect
/dev/sdd3        73779200   976773119   451496960   fd  Linux raid
autodetect

Disk /dev/sde: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x4644f00e

   Device Boot      Start         End      Blocks   Id  System
/dev/sde1            2048     2099199     1048576   82  Linux swap / Solaris
/dev/sde2         2099200    73779199    35840000   fd  Linux raid
autodetect
/dev/sde3        73779200   976773119   451496960   fd  Linux raid
autodetect

Disk /dev/sdf: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x4644f00c

   Device Boot      Start         End      Blocks   Id  System
/dev/sdf1            2048     2099199     1048576   82  Linux swap / Solaris
/dev/sdf2         2099200    73779199    35840000   fd  Linux raid
autodetect
/dev/sdf3        73779200   976773119   451496960   fd  Linux raid
autodetect

Disk /dev/sdg: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x327d8d82

   Device Boot      Start         End      Blocks   Id  System
/dev/sdg1            2048     2099199     1048576   82  Linux swap / Solaris
/dev/sdg2         2099200    73779199    35840000   fd  Linux raid
autodetect
/dev/sdg3        73779200   976773119   451496960   fd  Linux raid
autodetect

Disk /dev/sdi: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000e460f

   Device Boot      Start         End      Blocks   Id  System
/dev/sdi1            2048     2099199     1048576   82  Linux swap / Solaris
/dev/sdi2         2099200    73779199    35840000   fd  Linux raid
autodetect
/dev/sdi3        73779200   976773119   451496960   fd  Linux raid
autodetect

> uname -r
3.3.3-gentoo

Thanks in advance.



* Re: raid10 issues after reorder of boot drives.
  2012-04-27 20:04 raid10 issues after reorder of boot drives likewhoa
@ 2012-04-27 21:51 ` likewhoa
  2012-04-27 22:05   ` NeilBrown
  2012-04-27 22:03 ` NeilBrown
  1 sibling, 1 reply; 20+ messages in thread
From: likewhoa @ 2012-04-27 21:51 UTC (permalink / raw)
  To: linux-raid

On 04/27/2012 04:04 PM, likewhoa wrote:
> I have a strange issue on my raid10 8x400GB array. I cannot assemble my
> array anymore and it's gotten to a point that I don't know what to do to
> recover my data. I was hoping I could get some advice on it. Below is
> some info which I hope will help troubleshoot this issue.
>
>> mdadm -Esv
> ARRAY /dev/md/1 metadata=1.0 num-devices=0
> UUID=828ed03d:0c28afda:4a636e88:7b29ec9f name=Darkside:1
>    spares=7  
> devices=/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdc3,/dev/sda3,/dev/sdb3
> ARRAY /dev/md/1 level=raid10 metadata=1.0 num-devices=8
> UUID=828ed03d:0c28afda:4a636e88:7b29ec9f name=Darkside:1
>    devices=8:131
>
>> mdadm --version
> mdadm - v3.2.3 - 23rd December 2011
>
>> for i in /dev/sd{a,b,c,d,e,f,g,i}; do fdisk -l
> ${i};done                                                                                      
>
>
> Disk /dev/sda: 500.1 GB, 500107862016 bytes
> 255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x4644f013
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sda1            2048     2099199     1048576   82  Linux swap / Solaris
> /dev/sda2         2099200    73779199    35840000   fd  Linux raid
> autodetect
> /dev/sda3        73779200   976773119   451496960   fd  Linux raid
> autodetect
>
> Disk /dev/sdb: 500.1 GB, 500107862016 bytes
> 255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x000e460f
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdb1            2048     2099199     1048576   82  Linux swap / Solaris
> /dev/sdb2         2099200    73779199    35840000   fd  Linux raid
> autodetect
> /dev/sdb3        73779200   976773119   451496960   fd  Linux raid
> autodetect
>
> Disk /dev/sdc: 500.1 GB, 500107862016 bytes
> 255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x7d10530f
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdc1            2048     2099199     1048576   82  Linux swap / Solaris
> /dev/sdc2         2099200    73779199    35840000   fd  Linux raid
> autodetect
> /dev/sdc3        73779200   976773119   451496960   fd  Linux raid
> autodetect
>
> Disk /dev/sdd: 500.1 GB, 500107862016 bytes
> 255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x81a213ab
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdd1            2048     2099199     1048576   82  Linux swap / Solaris
> /dev/sdd2         2099200    73779199    35840000   fd  Linux raid
> autodetect
> /dev/sdd3        73779200   976773119   451496960   fd  Linux raid
> autodetect
>
> Disk /dev/sde: 500.1 GB, 500107862016 bytes
> 255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x4644f00e
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sde1            2048     2099199     1048576   82  Linux swap / Solaris
> /dev/sde2         2099200    73779199    35840000   fd  Linux raid
> autodetect
> /dev/sde3        73779200   976773119   451496960   fd  Linux raid
> autodetect
>
> Disk /dev/sdf: 500.1 GB, 500107862016 bytes
> 255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x4644f00c
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdf1            2048     2099199     1048576   82  Linux swap / Solaris
> /dev/sdf2         2099200    73779199    35840000   fd  Linux raid
> autodetect
> /dev/sdf3        73779200   976773119   451496960   fd  Linux raid
> autodetect
>
> Disk /dev/sdg: 500.1 GB, 500107862016 bytes
> 255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x327d8d82
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdg1            2048     2099199     1048576   82  Linux swap / Solaris
> /dev/sdg2         2099200    73779199    35840000   fd  Linux raid
> autodetect
> /dev/sdg3        73779200   976773119   451496960   fd  Linux raid
> autodetect
>
> Disk /dev/sdi: 500.1 GB, 500107862016 bytes
> 255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x000e460f
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdi1            2048     2099199     1048576   82  Linux swap / Solaris
> /dev/sdi2         2099200    73779199    35840000   fd  Linux raid
> autodetect
> /dev/sdi3        73779200   976773119   451496960   fd  Linux raid
> autodetect
>
>> uname -r
> 3.3.3-gentoo
>
> Thanks in advanced.
>
adding more verbose info gives me:

> -> mdadm -A --verbose /dev/md1
mdadm: looking for devices for /dev/md1
mdadm: /dev/dm-8 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/sdi1 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/sdi is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/sr0 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/dm-7 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/dm-6 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/dm-5 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/dm-4 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/dm-3 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/dm-2 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/dm-1 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/dm-0 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/md/0 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/sdh3 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/sdh2 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/sdh1 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/sdh is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: cannot open device /dev/sdg3: Device or resource busy
mdadm: /dev/sdg2 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/sdg1 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/sdg is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: cannot open device /dev/sdf3: Device or resource busy
mdadm: /dev/sdf2 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/sdf1 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/sdf is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: cannot open device /dev/sde3: Device or resource busy
mdadm: /dev/sde2 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/sde1 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/sde is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: cannot open device /dev/sdd3: Device or resource busy
mdadm: /dev/sdd2 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/sdd1 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/sdd is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: cannot open device /dev/sdb3: Device or resource busy
mdadm: /dev/sdb2 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/sdb1 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/sdb is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: cannot open device /dev/sda3: Device or resource busy
mdadm: /dev/sda2 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/sda1 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/sda is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: cannot open device /dev/sdc3: Device or resource busy
mdadm: /dev/sdc2 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/sdc1 is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
mdadm: /dev/sdc is not one of
/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3




* Re: raid10 issues after reorder of boot drives.
  2012-04-27 20:04 raid10 issues after reorder of boot drives likewhoa
  2012-04-27 21:51 ` likewhoa
@ 2012-04-27 22:03 ` NeilBrown
  2012-04-27 23:26   ` likewhoa
  2012-05-01  9:45   ` Brian Candler
  1 sibling, 2 replies; 20+ messages in thread
From: NeilBrown @ 2012-04-27 22:03 UTC (permalink / raw)
  To: likewhoa; +Cc: linux-raid


On Fri, 27 Apr 2012 16:04:22 -0400 likewhoa <likewhoa@weboperative.com> wrote:

> I have a strange issue on my raid10 8x400GB array. I cannot assemble my
> array anymore and it's gotten to a point that I don't know what to do to
> recover my data. I was hoping I could get some advice on it. Below is
> some info which I hope will help troubleshoot this issue.
> 
> > mdadm -Esv
> ARRAY /dev/md/1 metadata=1.0 num-devices=0
> UUID=828ed03d:0c28afda:4a636e88:7b29ec9f name=Darkside:1
>    spares=7  
> devices=/dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdc3,/dev/sda3,/dev/sdb3
> ARRAY /dev/md/1 level=raid10 metadata=1.0 num-devices=8
> UUID=828ed03d:0c28afda:4a636e88:7b29ec9f name=Darkside:1
>    devices=8:131

I'm afraid you've been bitten by a rather nasty bug which is present in 3.3
and got back ported to some -stable kernel.  The fix has been submitted and
should be appearing in -stable kernels soon (maybe already).

The effect of this bug is to remove key data from the metadata.  It makes the
devices look like a spare for an array of unknown level with no devices and
no chunk size.

It seems that you have one device which wasn't corrupted:  8:131 which should
be /dev/sdi3.  The others are messed up.

What you will need to do is re-create the array.  You can look at /dev/sdi3
to see what the chunksize and layout is:
  mdadm --examine /dev/sdi3

and then run a command like:

  mdadm --stop /dev/md1
  mdadm --create /dev/md1 --metadata=1.0 -l10 -n8 --chunk=XXX --layout=YY \
  --assume-clean \
    /dev/sdg3 /dev/sda3 /dev/sdi3 /dev/sdb3 /dev/sdf3 /dev/sde3 /dev/sdc3 /dev/sdh3
(probably have some device names wrong)

Note that the order of the devices is *very* important, and that is
information that has been lost.  However some of it can be recovered.
If this is an 'n2' or 'near=2' array, which is the default, then adjacent
devices will be identical.  You can find identical pairs with:

 for i in a b c d e f g h i
 do for j in a b c d e f g h i
  do if [ $i = $j ]; then continue; fi
     if cmp -s -n 32768 /dev/sd${i}3 /dev/sd${j}3
     then echo /dev/sd${i}3 and /dev/sd${j}3 seem to match
     fi
  done
 done

That should help you find pairs.  Then you have at most 16 possible orderings
of those 4 pairs to test.

Note that the "--create" command  doesn't touch the data.  It just writes new
metadata and assembles the array.  Then you can run "fsck -n" to check if the
array looks good.
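
For example, one round of that might look like the following (the device order
here is just one guess, and the chunk size and layout should be whatever
--examine reports on the good device):

  mdadm --stop /dev/md1
  mdadm --create /dev/md1 --metadata=1.0 -l10 -n8 --chunk=256 --layout=n2 \
    --assume-clean \
    /dev/sdg3 /dev/sda3 /dev/sdi3 /dev/sdb3 /dev/sdf3 /dev/sde3 /dev/sdc3 /dev/sdh3
  fsck -n /dev/md1   # read-only; if it complains, stop the array and try another order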

Good luck, and please ask if anything here isn't clear.

NeilBrown




* Re: raid10 issues after reorder of boot drives.
  2012-04-27 21:51 ` likewhoa
@ 2012-04-27 22:05   ` NeilBrown
  2012-04-27 23:29     ` likewhoa
  0 siblings, 1 reply; 20+ messages in thread
From: NeilBrown @ 2012-04-27 22:05 UTC (permalink / raw)
  To: likewhoa; +Cc: linux-raid


On Fri, 27 Apr 2012 17:51:54 -0400 likewhoa <likewhoa@weboperative.com> wrote:


> adding more verbose info gives me:
> 
> > -> mdadm -A --verbose /dev/md1
> mdadm: looking for devices for /dev/md1
> mdadm: /dev/dm-8 is not one of
> /dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3

You seem to have an explicit list of devices in /etc/mdadm.conf
This is not a good idea for 'sd' devices as they can change their names,
which can mean they aren't on the list any more.  You should remove that
once you get this all sorted out.

NeilBrown





* Re: raid10 issues after reorder of boot drives.
  2012-04-27 22:03 ` NeilBrown
@ 2012-04-27 23:26   ` likewhoa
  2012-05-01  9:45   ` Brian Candler
  1 sibling, 0 replies; 20+ messages in thread
From: likewhoa @ 2012-04-27 23:26 UTC (permalink / raw)
  To: NeilBrown, linux-raid

Note this strange problem, which doesn't seem related:

> cat /proc/mdstat
Personalities : [raid10]
md127 : inactive sdb3[8](S) sdf3[13](S) sda3[11](S) sdc3[9](S)
sdd3[12](S) sdg3[10](S) sde3[15](S)
      3160477768 blocks super 1.0
      
md0 : active raid10 sdd2[12] sdh2[0] sdb2[8] sdf2[13] sdg2[10] sda2[11]
sde2[14] sdc2[9]
      143358976 blocks super 1.0 256K chunks 2 near-copies [8/8] [UUUUUUUU]
     
unused devices: <none>

> mdadm -S /dev/md127
mdadm: error opening /dev/md127: No such file or directory




* Re: raid10 issues after reorder of boot drives.
  2012-04-27 22:05   ` NeilBrown
@ 2012-04-27 23:29     ` likewhoa
  2012-04-28  0:24       ` NeilBrown
  2012-04-28  0:35       ` likewhoa
  0 siblings, 2 replies; 20+ messages in thread
From: likewhoa @ 2012-04-27 23:29 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

On 04/27/2012 06:05 PM, NeilBrown wrote:
> On Fri, 27 Apr 2012 17:51:54 -0400 likewhoa <likewhoa@weboperative.com> wrote:
>
>
>> adding more verbose info gives me:
>>
>>> -> mdadm -A --verbose /dev/md1
>> mdadm: looking for devices for /dev/md1
>> mdadm: /dev/dm-8 is not one of
>> /dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
> You seem to have an explicit list of devices in /etc/mdadm.conf
> This is not a good idea for 'sd' devices as they can change their names,
> which can mean they aren't on the list any more.  You should remove that
> once you get this all sorted out.
>
> NeilBrown
>
>
@Neil sorry but I didn't get to reply to all on my last 2 emails, so
here it goes again so it's archived.

/dev/sdh3:
          Magic : a92b4efc
        Version : 1.0
    Feature Map : 0x0
     Array UUID : 828ed03d:0c28afda:4a636e88:7b29ec9f
           Name : Darkside:1  (local to host Darkside)
  Creation Time : Sun Aug 15 21:12:34 2010
     Raid Level : raid10
   Raid Devices : 8

 Avail Dev Size : 902993648 (430.58 GiB 462.33 GB)
     Array Size : 3611971584 (1722.32 GiB 1849.33 GB)
  Used Dev Size : 902992896 (430.58 GiB 462.33 GB)
   Super Offset : 902993904 sectors
          State : clean
    Device UUID : 00565578:e2eaaba3:f1eae17c:f474ee8d

    Update Time : Wed Apr 25 17:22:58 2012
       Checksum : 1e7c3692 - correct
         Events : 82942

         Layout : far=2
     Chunk Size : 256K

   Device Role : Active device 0
   Array State : AAAAAAAA ('A' == active, '.' == missing)
 

The only drive that didn't get affected is /dev/sdh3 (shown above). Any
suggestions? I have the drives on separate controllers, and when I created the
array I set up the order as /dev/sda3 /dev/sde3 /dev/sdb3 /dev/sdf3 and so on,
so I would assume the same order would be used. Also note that I ran
luksFormat on /dev/md1 and then pvcreate on /dev/md1 and so on. Will I have
issues with luksOpen after recreating the array? I removed the /dev/sdh1
drive, so now the output looks like:

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x4644f013

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1            2048     2099199     1048576   82  Linux swap / Solaris
/dev/sda2         2099200    73779199    35840000   fd  Linux raid
autodetect
/dev/sda3        73779200   976773119   451496960   fd  Linux raid
autodetect

Disk /dev/sdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000e460f

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1            2048     2099199     1048576   82  Linux swap / Solaris
/dev/sdb2         2099200    73779199    35840000   fd  Linux raid
autodetect
/dev/sdb3        73779200   976773119   451496960   fd  Linux raid
autodetect

Disk /dev/sdc: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x7d10530f

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1            2048     2099199     1048576   82  Linux swap / Solaris
/dev/sdc2         2099200    73779199    35840000   fd  Linux raid
autodetect
/dev/sdc3        73779200   976773119   451496960   fd  Linux raid
autodetect

Disk /dev/sdd: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x81a213ab

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1            2048     2099199     1048576   82  Linux swap / Solaris
/dev/sdd2         2099200    73779199    35840000   fd  Linux raid
autodetect
/dev/sdd3        73779200   976773119   451496960   fd  Linux raid
autodetect

Disk /dev/sde: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x4644f00e

   Device Boot      Start         End      Blocks   Id  System
/dev/sde1            2048     2099199     1048576   82  Linux swap / Solaris
/dev/sde2         2099200    73779199    35840000   fd  Linux raid
autodetect
/dev/sde3        73779200   976773119   451496960   fd  Linux raid
autodetect

Disk /dev/sdf: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x4644f00c

   Device Boot      Start         End      Blocks   Id  System
/dev/sdf1            2048     2099199     1048576   82  Linux swap / Solaris
/dev/sdf2         2099200    73779199    35840000   fd  Linux raid
autodetect
/dev/sdf3        73779200   976773119   451496960   fd  Linux raid
autodetect

Disk /dev/sdg: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x327d8d82

   Device Boot      Start         End      Blocks   Id  System
/dev/sdg1            2048     2099199     1048576   82  Linux swap / Solaris
/dev/sdg2         2099200    73779199    35840000   fd  Linux raid
autodetect
/dev/sdg3        73779200   976773119   451496960   fd  Linux raid
autodetect

Disk /dev/sdh: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000e460f

   Device Boot      Start         End      Blocks   Id  System
/dev/sdh1            2048     2099199     1048576   82  Linux swap / Solaris
/dev/sdh2         2099200    73779199    35840000   fd  Linux raid
autodetect
/dev/sdh3        73779200   976773119   451496960   fd  Linux raid
autodetect

and

cat /proc/mdstat
Personalities : [raid10]
md127 : inactive sdb3[8](S) sdf3[13](S) sda3[11](S) sdc3[9](S)
sdd3[12](S) sdg3[10](S) sde3[15](S)
      3160477768 blocks super 1.0
      
md0 : active raid10 sdd2[12] sdh2[0] sdb2[8] sdf2[13] sdg2[10] sda2[11]
sde2[14] sdc2[9]
      143358976 blocks super 1.0 256K chunks 2 near-copies [8/8] [UUUUUUUU]
     
unused devices: <none>

and the output from your for loop:

/dev/sda3 and /dev/sdc3 seem to match
/dev/sda3 and /dev/sde3 seem to match
/dev/sda3 and /dev/sdg3 seem to match
/dev/sdc3 and /dev/sda3 seem to match
/dev/sdc3 and /dev/sde3 seem to match
/dev/sdc3 and /dev/sdg3 seem to match
/dev/sde3 and /dev/sda3 seem to match
/dev/sde3 and /dev/sdc3 seem to match
/dev/sde3 and /dev/sdg3 seem to match
/dev/sdg3 and /dev/sda3 seem to match
/dev/sdg3 and /dev/sdc3 seem to match
/dev/sdg3 and /dev/sde3 seem to match

Thanks in advance, Neil.
likewhoa




* Re: raid10 issues after reorder of boot drives.
  2012-04-27 23:29     ` likewhoa
@ 2012-04-28  0:24       ` NeilBrown
  2012-04-28  0:35       ` likewhoa
  1 sibling, 0 replies; 20+ messages in thread
From: NeilBrown @ 2012-04-28  0:24 UTC (permalink / raw)
  To: likewhoa; +Cc: linux-raid


On Fri, 27 Apr 2012 19:29:37 -0400 likewhoa <likewhoa@weboperative.com> wrote:

> On 04/27/2012 06:05 PM, NeilBrown wrote:
> > On Fri, 27 Apr 2012 17:51:54 -0400 likewhoa <likewhoa@weboperative.com> wrote:
> >
> >
> >> adding more verbose info gives me:
> >>
> >>> -> mdadm -A --verbose /dev/md1
> >> mdadm: looking for devices for /dev/md1
> >> mdadm: /dev/dm-8 is not one of
> >> /dev/sdg3,/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdb3,/dev/sda3,/dev/sdc3
> > You seem to have an explicit list of devices in /etc/mdadm.conf
> > This is not a good idea for 'sd' devices as they can change their names,
> > which can mean they aren't on the list any more.  You should remove that
> > once you get this all sorted out.
> >
> > NeilBrown
> >
> >
> @Neil sorry but I didn't get to reply to all on my last 2 emails, so
> here is goes again so it's archived.
> 
> /dev/sdh3:
>           Magic : a92b4efc
>         Version : 1.0
>     Feature Map : 0x0
>      Array UUID : 828ed03d:0c28afda:4a636e88:7b29ec9f
>            Name : Darkside:1  (local to host Darkside)
>   Creation Time : Sun Aug 15 21:12:34 2010
>      Raid Level : raid10
>    Raid Devices : 8
> 
>  Avail Dev Size : 902993648 (430.58 GiB 462.33 GB)
>      Array Size : 3611971584 (1722.32 GiB 1849.33 GB)
>   Used Dev Size : 902992896 (430.58 GiB 462.33 GB)
>    Super Offset : 902993904 sectors
>           State : clean
>     Device UUID : 00565578:e2eaaba3:f1eae17c:f474ee8d
> 
>     Update Time : Wed Apr 25 17:22:58 2012
>        Checksum : 1e7c3692 - correct
>          Events : 82942
> 
>          Layout : far=2
>      Chunk Size : 256K
> 
>    Device Role : Active device 0
>    Array State : AAAAAAAA ('A' == active, '.' == missing)
>  
> 
> The only drive that didn't get affected is far=3. Any suggestions? I
> have the drives on separate controllers and when I created the array I
> set up the order as /dev/sda3 /dev/sde3 /dev/sdb3 /dev/sdf3 and so on.
> so I would assume the same order would be used, also note that I ran
> luksFormat on /dev/md1 then ran pvcreate /dev/md1 and so on. Will I have
> issues with luksOpen after recreating the array? I removed the /dev/sdh1
> drive so now the output is like:

As the array is "far=2" you will need to create with --layout=f2

That means you cannot easily detect pairs by comparing the first few
kilobytes.

However if you are confident that you know the order of the drives (because
of the arrangement of controllers) then maybe you can just create the array.
Make sure you set --chunk=256


Yes, you will need to 'luksOpen' after recreating the array.  That will
quite possibly fail if you have the order wrong.
Then you would need to "pvscan" or whatever one does to find LVM components.
If both those succeed, then try "fsck -n".
If they fail, try a different ordering.
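
Spelled out, checking one attempted ordering might look something like this
(the volume group and logical volume names are placeholders, and none of these
steps should need to write to the array):

  cryptsetup luksOpen /dev/md1 md1     # fails quickly if the first device is wrong
  pvscan                               # should find the PV inside /dev/mapper/md1
  vgchange -ay                         # activate the volume group
  fsck -n /dev/YOURVG/YOURLV           # read-only check; for xfs, xfs_repair -n
  vgchange -an ; cryptsetup luksClose md1   # undo before trying another ordering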

NeilBrown



* Re: raid10 issues after reorder of boot drives.
  2012-04-27 23:29     ` likewhoa
  2012-04-28  0:24       ` NeilBrown
@ 2012-04-28  0:35       ` likewhoa
  2012-04-28  2:37         ` likewhoa
  1 sibling, 1 reply; 20+ messages in thread
From: likewhoa @ 2012-04-28  0:35 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

I am not sure how to proceed now with the output that shows possible
pairs, as it won't allow me to set up all 8 devices on the array but only
4. Should I run the array creation with -x4 and set the available spare
devices, or just create the array, as I can remember which was one pair
from each controller, i.e. /dev/sda3 /dev/sde3 ...?


* Re: raid10 issues after reorder of boot drives.
  2012-04-28  0:35       ` likewhoa
@ 2012-04-28  2:37         ` likewhoa
  2012-04-28  2:55           ` NeilBrown
  0 siblings, 1 reply; 20+ messages in thread
From: likewhoa @ 2012-04-28  2:37 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

On 04/27/2012 08:35 PM, likewhoa wrote:
> I am not sure how to proceed now with the output that shows possible
> pairs as it won't allow me to setup all 8 devices on the array but only
> 4. Should I run the array creation with -x4 and set the available spare
> devicesor or just create the array as I can remember which was one pair
> from each controller. i.e /dev/sda3 /dev/sde3 ...?
OK, I was able to recreate the array with the correct order, which I took from
my /dev/md0's --detail output, and was able to decrypt the luks mapping,
but XFS didn't open and xfs_repair is currently doing the matrix. I will
keep this posted with updates.

Thanks again Neil.
WRT 3.3.3, should I just go back to 3.3.2, which seemed to run fine, and
wait until there is a release of 3.3.3 that has the fix?



* Re: raid10 issues after reorder of boot drives.
  2012-04-28  2:37         ` likewhoa
@ 2012-04-28  2:55           ` NeilBrown
  2012-04-28  2:59             ` likewhoa
  0 siblings, 1 reply; 20+ messages in thread
From: NeilBrown @ 2012-04-28  2:55 UTC (permalink / raw)
  To: likewhoa; +Cc: linux-raid


On Fri, 27 Apr 2012 22:37:29 -0400 likewhoa <likewhoa@weboperative.com> wrote:

> On 04/27/2012 08:35 PM, likewhoa wrote:
> > I am not sure how to proceed now with the output that shows possible
> > pairs as it won't allow me to setup all 8 devices on the array but only
> > 4. Should I run the array creation with -x4 and set the available spare
> > devicesor or just create the array as I can remember which was one pair
> > from each controller. i.e /dev/sda3 /dev/sde3 ...?
> ok I was able to recreate the array with correct order which I took from
> my /dev/md0's --details output and was able to decrypt the luks mapping
> but XFS didn't open and xfs_repair is currently doing the matrix. I will
> keep this posted with updates.

I hope the order really is correct.... I wouldn't expect xfs to find problems
if it was...

> 
> Thanks again Neil.
> WRT 3.3.3 should I just go back to 3.3.2 which seemed to run fine and
> wait until there is a release of 3.3.3 that has fix?

3.3.4 has the fix and was just released.
3.3.1, 3.3.2 and 3.3.3 all have the bug.  It only triggers on shutdown and
even then only occasionally.
So I recommend 3.3.4.

NeilBrown



* Re: raid10 issues after reorder of boot drives.
  2012-04-28  2:55           ` NeilBrown
@ 2012-04-28  2:59             ` likewhoa
  2012-04-28  3:23               ` NeilBrown
  0 siblings, 1 reply; 20+ messages in thread
From: likewhoa @ 2012-04-28  2:59 UTC (permalink / raw)
  To: NeilBrown,
	linux-raid@vger.kernel.org >>
	"linux-raid@vger.kernel.org"

On 04/27/2012 10:55 PM, NeilBrown wrote:
> On Fri, 27 Apr 2012 22:37:29 -0400 likewhoa <likewhoa@weboperative.com> wrote:
>
>> On 04/27/2012 08:35 PM, likewhoa wrote:
>>> I am not sure how to proceed now with the output that shows possible
>>> pairs as it won't allow me to setup all 8 devices on the array but only
>>> 4. Should I run the array creation with -x4 and set the available spare
>>> devicesor or just create the array as I can remember which was one pair
>>> from each controller. i.e /dev/sda3 /dev/sde3 ...?
>> ok I was able to recreate the array with correct order which I took from
>> my /dev/md0's --details output and was able to decrypt the luks mapping
>> but XFS didn't open and xfs_repair is currently doing the matrix. I will
>> keep this posted with updates.
> I hope the order really is correct.... I wouldn't expect xfs to find problems
> if it was...
>
>> Thanks again Neil.
>> WRT 3.3.3 should I just go back to 3.3.2 which seemed to run fine and
>> wait until there is a release of 3.3.3 that has fix?
> 3.3.4 has the fix and was just released.
> 3.3.1, 3.3.2 and 3.3.3 all have the bug.  It only triggers on shutdown and
> even then only occasionally.
> So I recommend 3.3.4.
>
> NeilBrown
The reason I believe it was correct was that 'cryptsetup luksOpen
/dev/md1 md1' worked. I really do hope that it was correct too because
after opening the luks mapping I assume there is no going back.


* Re: raid10 issues after reorder of boot drives.
  2012-04-28  2:59             ` likewhoa
@ 2012-04-28  3:23               ` NeilBrown
  2012-04-28  3:51                 ` likewhoa
  0 siblings, 1 reply; 20+ messages in thread
From: NeilBrown @ 2012-04-28  3:23 UTC (permalink / raw)
  To: likewhoa
  Cc: linux-raid@vger.kernel.org >> "linux-raid@vger.kernel.org"


On Fri, 27 Apr 2012 22:59:48 -0400 likewhoa <likewhoa@weboperative.com> wrote:

> On 04/27/2012 10:55 PM, NeilBrown wrote:
> > On Fri, 27 Apr 2012 22:37:29 -0400 likewhoa <likewhoa@weboperative.com> wrote:
> >
> >> On 04/27/2012 08:35 PM, likewhoa wrote:
> >>> I am not sure how to proceed now with the output that shows possible
> >>> pairs as it won't allow me to setup all 8 devices on the array but only
> >>> 4. Should I run the array creation with -x4 and set the available spare
> >>> devicesor or just create the array as I can remember which was one pair
> >>> from each controller. i.e /dev/sda3 /dev/sde3 ...?
> >> ok I was able to recreate the array with correct order which I took from
> >> my /dev/md0's --details output and was able to decrypt the luks mapping
> >> but XFS didn't open and xfs_repair is currently doing the matrix. I will
> >> keep this posted with updates.
> > I hope the order really is correct.... I wouldn't expect xfs to find problems
> > if it was...
> >
> >> Thanks again Neil.
> >> WRT 3.3.3 should I just go back to 3.3.2 which seemed to run fine and
> >> wait until there is a release of 3.3.3 that has fix?
> > 3.3.4 has the fix and was just released.
> > 3.3.1, 3.3.2 and 3.3.3 all have the bug.  It only triggers on shutdown and
> > even then only occasionally.
> > So I recommend 3.3.4.
> >
> > NeilBrown
> The reason I believe it was correct was that 'cryptsetup luksOpen
> /dev/md1 md1' worked. I really do hope that it was correct too because
> after opening the luks mapping I assume there is no going back.

Opening the luks mapping could just mean that the first few blocks are
correct.  So the first disk is right but others might not be.

There is a way back unless something has been written to the array.  Once
that happens anything could be corrupted.  So if the xfs check you are doing
is read-only you could still have room to move.

With a far=2 array, the first half of each device is mirrored in the second
half of a neighbouring device.  So you can probably recover the ordering by
finding which pairs match.

The "Used Dev Size" is 902992896 sectors.  Half of that is 451496448 sectors,
or 231166181376 bytes.

So to check if two devices are adjacent in the mapping you can try:

 cmp --ignore-initial=0:231166181376 --bytes=231166181376 first-dev second-dev

You could possibly use a smaller --bytes= number, at least on the first
attempt.
Use a 'for' loop similar to the one before with this command and it might
tell you which pairs of devices are consecutive.  From that you should be
able to get the full order.
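
A loop along the lines of the earlier one, only using this cmp invocation,
might look like (just a sketch; a smaller --bytes value would make a first
pass much faster):

 for i in a b c d e f g h
 do for j in a b c d e f g h
   do if [ $i = $j ]; then continue; fi
      if cmp -s --ignore-initial=0:231166181376 --bytes=231166181376 \
            /dev/sd${i}3 /dev/sd${j}3
      then echo /dev/sd${i}3 and /dev/sd${j}3 seem to be adjacent
      fi
   done
 done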

NeilBrown




* Re: raid10 issues after reorder of boot drives.
  2012-04-28  3:23               ` NeilBrown
@ 2012-04-28  3:51                 ` likewhoa
  2012-04-28 15:23                   ` likewhoa
  0 siblings, 1 reply; 20+ messages in thread
From: likewhoa @ 2012-04-28  3:51 UTC (permalink / raw)
  To: NeilBrown
  Cc: linux-raid@vger.kernel.org >> "linux-raid@vger.kernel.org"

On 04/27/2012 11:23 PM, NeilBrown wrote:
> On Fri, 27 Apr 2012 22:59:48 -0400 likewhoa <likewhoa@weboperative.com> wrote:
>
>> On 04/27/2012 10:55 PM, NeilBrown wrote:
>>> On Fri, 27 Apr 2012 22:37:29 -0400 likewhoa <likewhoa@weboperative.com> wrote:
>>>
>>>> On 04/27/2012 08:35 PM, likewhoa wrote:
>>>>> I am not sure how to proceed now with the output that shows possible
>>>>> pairs as it won't allow me to setup all 8 devices on the array but only
>>>>> 4. Should I run the array creation with -x4 and set the available spare
>>>>> devicesor or just create the array as I can remember which was one pair
>>>>> from each controller. i.e /dev/sda3 /dev/sde3 ...?
>>>> ok I was able to recreate the array with correct order which I took from
>>>> my /dev/md0's --details output and was able to decrypt the luks mapping
>>>> but XFS didn't open and xfs_repair is currently doing the matrix. I will
>>>> keep this posted with updates.
>>> I hope the order really is correct.... I wouldn't expect xfs to find problems
>>> if it was...
>>>
>>>> Thanks again Neil.
>>>> WRT 3.3.3 should I just go back to 3.3.2 which seemed to run fine and
>>>> wait until there is a release of 3.3.3 that has fix?
>>> 3.3.4 has the fix and was just released.
>>> 3.3.1, 3.3.2 and 3.3.3 all have the bug.  It only triggers on shutdown and
>>> even then only occasionally.
>>> So I recommend 3.3.4.
>>>
>>> NeilBrown
>> The reason I believe it was correct was that 'cryptsetup luksOpen
>> /dev/md1 md1' worked. I really do hope that it was correct too because
>> after opening the luks mapping I assume there is no going back.
> Opening the luks mapping could just mean that the first few blocks are
> correct.  So the first disk is right but others might not be.
>
> There is going backup unless something has been written to the array.  Once
> that happens anything could be corrupted.  So if the xfs check you are  doing
> is read-only you could still have room to move.
>
> With a far=2 array, each first half of each device is mirrored on the second
> half.  So you can probably recover the ordering by finding which pairs match.
>
> The "Used Dev Size" is 902992896 sectors.  Half of that is 451496448
> or 231166181376 bytes.
>
> So to check if two devices are adjacent in the mapping you can try:
>
>  cmp --ignore-initial=0:231166181376 --bytes=231166181376 first-dev second-dev
>
> You could possibly use a smaller --bytes= number, at least on the first
> attempt.
> You a similar 'for' loop to before an use this command and it might tell you
> which pairs of devices are consecutive.  From that you should be able to get
> the full order.
>
> NeilBrown
>
I don't see why xfs_repair would write data unless it actually finds the
superblock, but I am not sure, so I will take my chances since it's still
searching for the secondary superblock now.



* Re: raid10 issues after reorder of boot drives.
  2012-04-28  3:51                 ` likewhoa
@ 2012-04-28 15:23                   ` likewhoa
  2012-04-28 21:28                     ` NeilBrown
  0 siblings, 1 reply; 20+ messages in thread
From: likewhoa @ 2012-04-28 15:23 UTC (permalink / raw)
  To: NeilBrown
  Cc: linux-raid@vger.kernel.org >> "linux-raid@vger.kernel.org"

On 04/27/2012 11:51 PM, likewhoa wrote:
> On 04/27/2012 11:23 PM, NeilBrown wrote:
>> On Fri, 27 Apr 2012 22:59:48 -0400 likewhoa <likewhoa@weboperative.com> wrote:
>>
>>> On 04/27/2012 10:55 PM, NeilBrown wrote:
>>>> On Fri, 27 Apr 2012 22:37:29 -0400 likewhoa <likewhoa@weboperative.com> wrote:
>>>>
>>>>> On 04/27/2012 08:35 PM, likewhoa wrote:
>>>>>> I am not sure how to proceed now with the output that shows possible
>>>>>> pairs as it won't allow me to setup all 8 devices on the array but only
>>>>>> 4. Should I run the array creation with -x4 and set the available spare
>>>>>> devicesor or just create the array as I can remember which was one pair
>>>>>> from each controller. i.e /dev/sda3 /dev/sde3 ...?
>>>>> ok I was able to recreate the array with correct order which I took from
>>>>> my /dev/md0's --details output and was able to decrypt the luks mapping
>>>>> but XFS didn't open and xfs_repair is currently doing the matrix. I will
>>>>> keep this posted with updates.
>>>> I hope the order really is correct.... I wouldn't expect xfs to find problems
>>>> if it was...
>>>>
>>>>> Thanks again Neil.
>>>>> WRT 3.3.3 should I just go back to 3.3.2 which seemed to run fine and
>>>>> wait until there is a release of 3.3.3 that has fix?
>>>> 3.3.4 has the fix and was just released.
>>>> 3.3.1, 3.3.2 and 3.3.3 all have the bug.  It only triggers on shutdown and
>>>> even then only occasionally.
>>>> So I recommend 3.3.4.
>>>>
>>>> NeilBrown
>>> The reason I believe it was correct was that 'cryptsetup luksOpen
>>> /dev/md1 md1' worked. I really do hope that it was correct too because
>>> after opening the luks mapping I assume there is no going back.
>> Opening the luks mapping could just mean that the first few blocks are
>> correct.  So the first disk is right but others might not be.
>>
>> There is going backup unless something has been written to the array.  Once
>> that happens anything could be corrupted.  So if the xfs check you are  doing
>> is read-only you could still have room to move.
>>
>> With a far=2 array, each first half of each device is mirrored on the second
>> half.  So you can probably recover the ordering by finding which pairs match.
>>
>> The "Used Dev Size" is 902992896 sectors.  Half of that is 451496448
>> or 231166181376 bytes.
>>
>> So to check if two devices are adjacent in the mapping you can try:
>>
>>  cmp --ignore-initial=0:231166181376 --bytes=231166181376 first-dev second-dev
>>
>> You could possibly use a smaller --bytes= number, at least on the first
>> attempt.
>> You a similar 'for' loop to before an use this command and it might tell you
>> which pairs of devices are consecutive.  From that you should be able to get
>> the full order.
>>
>> NeilBrown
>>
> I don't see why xfs_repair would write data unless it actually finds the
> superblock but I am not sure so I will take my chances since it's still
> searching for the secondary superblock now.
>
Running the for loop all night produced this output:

/dev/sda3 /dev/sdb3 differ: byte 1, line 1
/dev/sda3 /dev/sdc3 differ: byte 262145, line 2
/dev/sda3 /dev/sdd3 differ: byte 1, line 1
/dev/sda3 /dev/sde3 differ: byte 1, line 1
/dev/sda3 and /dev/sdf3 seem to match
/dev/sda3 /dev/sdg3 differ: byte 262145, line 2
/dev/sda3 /dev/sdh3 differ: byte 1, line 1
/dev/sdb3 /dev/sda3 differ: byte 1, line 1
/dev/sdb3 /dev/sdc3 differ: byte 1, line 1
/dev/sdb3 /dev/sdd3 differ: byte 1, line 1
/dev/sdb3 and /dev/sde3 seem to match
/dev/sdb3 /dev/sdf3 differ: byte 1, line 1
/dev/sdb3 /dev/sdg3 differ: byte 1, line 1
/dev/sdb3 /dev/sdh3 differ: byte 1, line 1
/dev/sdc3 /dev/sda3 differ: byte 262145, line 2
/dev/sdc3 /dev/sdb3 differ: byte 1, line 1
/dev/sdc3 /dev/sdd3 differ: byte 1, line 1
/dev/sdc3 /dev/sde3 differ: byte 1, line 1
/dev/sdc3 /dev/sdf3 differ: byte 262145, line 2
/dev/sdc3 and /dev/sdg3 seem to match
/dev/sdc3 /dev/sdh3 differ: byte 1, line 1
/dev/sdd3 /dev/sda3 differ: byte 1, line 1
/dev/sdd3 /dev/sdb3 differ: byte 1, line 1
/dev/sdd3 /dev/sdc3 differ: byte 1, line 1
/dev/sdd3 /dev/sde3 differ: byte 1, line 1
/dev/sdd3 /dev/sdf3 differ: byte 1, line 1
/dev/sdd3 /dev/sdg3 differ: byte 1, line 1
/dev/sdd3 and /dev/sdh3 seem to match
/dev/sde3 /dev/sda3 differ: byte 262145, line 2
/dev/sde3 /dev/sdb3 differ: byte 1, line 1
/dev/sde3 and /dev/sdc3 seem to match
/dev/sde3 /dev/sdd3 differ: byte 1, line 1
/dev/sde3 /dev/sdf3 differ: byte 262145, line 2
/dev/sde3 /dev/sdg3 differ: byte 262145, line 2
/dev/sde3 /dev/sdh3 differ: byte 1, line 1
/dev/sdf3 /dev/sda3 differ: byte 1, line 1
/dev/sdf3 /dev/sdb3 differ: byte 1, line 1
/dev/sdf3 /dev/sdc3 differ: byte 1, line 1
/dev/sdf3 and /dev/sdd3 seem to match
/dev/sdf3 /dev/sde3 differ: byte 1, line 1
/dev/sdf3 /dev/sdg3 differ: byte 1, line 1
/dev/sdf3 /dev/sdh3 differ: byte 1, line 1
/dev/sdg3 and /dev/sda3 seem to match
/dev/sdg3 /dev/sdb3 differ: byte 1, line 1
/dev/sdg3 /dev/sdc3 differ: byte 262145, line 2
/dev/sdg3 /dev/sdd3 differ: byte 1, line 1
/dev/sdg3 /dev/sde3 differ: byte 1, line 1
/dev/sdg3 /dev/sdf3 differ: byte 262145, line 2
/dev/sdg3 /dev/sdh3 differ: byte 1, line 1
/dev/sdh3 /dev/sda3 differ: byte 1, line 1
/dev/sdh3 and /dev/sdb3 seem to match
/dev/sdh3 /dev/sdc3 differ: byte 1, line 1
/dev/sdh3 /dev/sdd3 differ: byte 1, line 1
/dev/sdh3 /dev/sde3 differ: byte 1, line 1
/dev/sdh3 /dev/sdf3 differ: byte 1, line 1
/dev/sdh3 /dev/sdg3 differ: byte 1, line 1

I managed to recover my luks+xfs with this --create command \o/
> mdadm --create /dev/md1 --metadata=1.0 -l10 -n8 --chunk=256 \
    --layout=f2 --assume-clean /dev/sdh3 /dev/sdb3 /dev/sde3 /dev/sdc3 \
    /dev/sdg3 /dev/sda3 /dev/sdf3 /dev/sdd3

Thank you Neil for your assistance, you rock! With regards to my
/etc/mdadm.conf, the explicit references to the sd drives were generated
with 'mdadm -Esv'; my question is, do you suggest I not even populate
/etc/mdadm.conf with such output, because the drive names can change, and
just use 'mdadm -s -A /dev/mdX'? Also, after running into this nasty bug I
get the feeling that I should really keep a copy of all my future 'mdadm
--create ...' commands handy just for such situations, do you agree?
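
Something like appending the array geometry to a file kept off the array
after every change is what I have in mind, e.g. (the path is just
illustrative):

  mdadm --detail /dev/md0 /dev/md1 >> /root/md-layouts.txt
  mdadm --examine /dev/sd[a-h]3    >> /root/md-layouts.txt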

Thanks again and have a GREAT weekend.
Fernando Vasquez a.k.a likewhoa


* Re: raid10 issues after reorder of boot drives.
  2012-04-28 15:23                   ` likewhoa
@ 2012-04-28 21:28                     ` NeilBrown
  2012-04-29 14:23                       ` likewhoa
  0 siblings, 1 reply; 20+ messages in thread
From: NeilBrown @ 2012-04-28 21:28 UTC (permalink / raw)
  To: likewhoa; +Cc: linux-raid


On Sat, 28 Apr 2012 11:23:18 -0400 likewhoa <likewhoa@weboperative.com> wrote:

> On 04/27/2012 11:51 PM, likewhoa wrote:
> > On 04/27/2012 11:23 PM, NeilBrown wrote:
> >> On Fri, 27 Apr 2012 22:59:48 -0400 likewhoa <likewhoa@weboperative.com> wrote:
> >>
> >>> On 04/27/2012 10:55 PM, NeilBrown wrote:
> >>>> On Fri, 27 Apr 2012 22:37:29 -0400 likewhoa <likewhoa@weboperative.com> wrote:
> >>>>
> >>>>> On 04/27/2012 08:35 PM, likewhoa wrote:
> >>>>>> I am not sure how to proceed now with the output that shows possible
> >>>>>> pairs as it won't allow me to setup all 8 devices on the array but only
> >>>>>> 4. Should I run the array creation with -x4 and set the available spare
> >>>>>> devicesor or just create the array as I can remember which was one pair
> >>>>>> from each controller. i.e /dev/sda3 /dev/sde3 ...?
> >>>>> ok I was able to recreate the array with correct order which I took from
> >>>>> my /dev/md0's --details output and was able to decrypt the luks mapping
> >>>>> but XFS didn't open and xfs_repair is currently doing the matrix. I will
> >>>>> keep this posted with updates.
> >>>> I hope the order really is correct.... I wouldn't expect xfs to find problems
> >>>> if it was...
> >>>>
> >>>>> Thanks again Neil.
> >>>>> WRT 3.3.3 should I just go back to 3.3.2 which seemed to run fine and
> >>>>> wait until there is a release of 3.3.3 that has fix?
> >>>> 3.3.4 has the fix and was just released.
> >>>> 3.3.1, 3.3.2 and 3.3.3 all have the bug.  It only triggers on shutdown and
> >>>> even then only occasionally.
> >>>> So I recommend 3.3.4.
> >>>>
> >>>> NeilBrown
> >>> The reason I believe it was correct was that 'cryptsetup luksOpen
> >>> /dev/md1 md1' worked. I really do hope that it was correct too because
> >>> after opening the luks mapping I assume there is no going back.
> >> Opening the luks mapping could just mean that the first few blocks are
> >> correct.  So the first disk is right but others might not be.
> >>
> >> There is going backup unless something has been written to the array.  Once
> >> that happens anything could be corrupted.  So if the xfs check you are  doing
> >> is read-only you could still have room to move.
> >>
> >> With a far=2 array, each first half of each device is mirrored on the second
> >> half.  So you can probably recover the ordering by finding which pairs match.
> >>
> >> The "Used Dev Size" is 902992896 sectors.  Half of that is 451496448
> >> or 231166181376 bytes.
> >>
> >> So to check if two devices are adjacent in the mapping you can try:
> >>
> >>  cmp --ignore-initial=0:231166181376 --bytes=231166181376 first-dev second-dev
> >>
> >> You could possibly use a smaller --bytes= number, at least on the first
> >> attempt.
> >> You a similar 'for' loop to before an use this command and it might tell you
> >> which pairs of devices are consecutive.  From that you should be able to get
> >> the full order.
> >>
> >> NeilBrown
> >>
> > I don't see why xfs_repair would write data unless it actually finds the
> > superblock but I am not sure so I will take my chances since it's still
> > searching for the secondary superblock now.
> >
> After running the for loop all night which produced this output.
> 
> /dev/sda3 /dev/sdb3 differ: byte 1, line 1
> /dev/sda3 /dev/sdc3 differ: byte 262145, line 2
> /dev/sda3 /dev/sdd3 differ: byte 1, line 1
> /dev/sda3 /dev/sde3 differ: byte 1, line 1
> /dev/sda3 and /dev/sdf3 seem to match
> /dev/sda3 /dev/sdg3 differ: byte 262145, line 2
> /dev/sda3 /dev/sdh3 differ: byte 1, line 1
> /dev/sdb3 /dev/sda3 differ: byte 1, line 1
> /dev/sdb3 /dev/sdc3 differ: byte 1, line 1
> /dev/sdb3 /dev/sdd3 differ: byte 1, line 1
> /dev/sdb3 and /dev/sde3 seem to match
> /dev/sdb3 /dev/sdf3 differ: byte 1, line 1
> /dev/sdb3 /dev/sdg3 differ: byte 1, line 1
> /dev/sdb3 /dev/sdh3 differ: byte 1, line 1
> /dev/sdc3 /dev/sda3 differ: byte 262145, line 2
> /dev/sdc3 /dev/sdb3 differ: byte 1, line 1
> /dev/sdc3 /dev/sdd3 differ: byte 1, line 1
> /dev/sdc3 /dev/sde3 differ: byte 1, line 1
> /dev/sdc3 /dev/sdf3 differ: byte 262145, line 2
> /dev/sdc3 and /dev/sdg3 seem to match
> /dev/sdc3 /dev/sdh3 differ: byte 1, line 1
> /dev/sdd3 /dev/sda3 differ: byte 1, line 1
> /dev/sdd3 /dev/sdb3 differ: byte 1, line 1
> /dev/sdd3 /dev/sdc3 differ: byte 1, line 1
> /dev/sdd3 /dev/sde3 differ: byte 1, line 1
> /dev/sdd3 /dev/sdf3 differ: byte 1, line 1
> /dev/sdd3 /dev/sdg3 differ: byte 1, line 1
> /dev/sdd3 and /dev/sdh3 seem to match
> /dev/sde3 /dev/sda3 differ: byte 262145, line 2
> /dev/sde3 /dev/sdb3 differ: byte 1, line 1
> /dev/sde3 and /dev/sdc3 seem to match
> /dev/sde3 /dev/sdd3 differ: byte 1, line 1
> /dev/sde3 /dev/sdf3 differ: byte 262145, line 2
> /dev/sde3 /dev/sdg3 differ: byte 262145, line 2
> /dev/sde3 /dev/sdh3 differ: byte 1, line 1
> /dev/sdf3 /dev/sda3 differ: byte 1, line 1
> /dev/sdf3 /dev/sdb3 differ: byte 1, line 1
> /dev/sdf3 /dev/sdc3 differ: byte 1, line 1
> /dev/sdf3 and /dev/sdd3 seem to match
> /dev/sdf3 /dev/sde3 differ: byte 1, line 1
> /dev/sdf3 /dev/sdg3 differ: byte 1, line 1
> /dev/sdf3 /dev/sdh3 differ: byte 1, line 1
> /dev/sdg3 and /dev/sda3 seem to match
> /dev/sdg3 /dev/sdb3 differ: byte 1, line 1
> /dev/sdg3 /dev/sdc3 differ: byte 262145, line 2
> /dev/sdg3 /dev/sdd3 differ: byte 1, line 1
> /dev/sdg3 /dev/sde3 differ: byte 1, line 1
> /dev/sdg3 /dev/sdf3 differ: byte 262145, line 2
> /dev/sdg3 /dev/sdh3 differ: byte 1, line 1
> /dev/sdh3 /dev/sda3 differ: byte 1, line 1
> /dev/sdh3 and /dev/sdb3 seem to match
> /dev/sdh3 /dev/sdc3 differ: byte 1, line 1
> /dev/sdh3 /dev/sdd3 differ: byte 1, line 1
> /dev/sdh3 /dev/sde3 differ: byte 1, line 1
> /dev/sdh3 /dev/sdf3 differ: byte 1, line 1
> /dev/sdh3 /dev/sdg3 differ: byte 1, line 1
> 
> I managed to recover my luks+xfs with this --create command \o/
> > mdadm --create /dev/md1 --metadata=1.0 -l10 -n8 --chunk=256
> --layout=f2 --assume-clean /dev/sdh3 /dev/sdb3 /dev/sde3 /dev/sdc3
> /dev/sdg3 /dev/sda3 /dev/sdf3 /dev/sdd3
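
For anyone following along, a cautious way to verify a re-create like this
before allowing any writes, assuming the md1/LUKS names used in this thread
(the /mnt mount point is only an example), would be roughly:

  cryptsetup luksOpen /dev/md1 md1    # only succeeds if the LUKS header at the
                                      # start of the array is still intact
  xfs_repair -n /dev/mapper/md1       # -n: check only, modify nothing
  mount -o ro /dev/mapper/md1 /mnt    # mount read-only to inspect the data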

cool!!  I love it when something like that produces a nice clean usable
result.

> 
> Thank you Neil for your assistance, you rock! With regards to my
> /etc/mdadm.conf, the explicit references to the sd drives were generated
> with 'mdadm -Esv'. My question is: do you suggest I not populate
> /etc/mdadm.conf with such output at all, since the drive names can change,
> and just use 'mdadm -s -A /dev/mdX'? Also, after running into this nasty
> bug I get the feeling that I should really keep a copy of all my future
> 'mdadm --create ...' commands handy just for such situations, do you agree?

The output of '-Ds' and '-Es' was only ever meant to be a starting point for
mdadm.conf, not the final content.

I think it is good to populate mdadm.conf, though in simple cases it isn't
essential.
I would just list the UUID for each array:
   ARRAY /dev/md0 uuid=.......
and leave it at that.

Keeping a copy of the output of "mdadm -E" of each device could occasionally
be useful.  You would need to update this copy any time a configuration
change happens to the array, such as a device failure or a spare being added.
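
One way to keep such a copy, assuming the /dev/sd[a-h]3 partitions from this
thread (the output path is only an illustration), would be:

  for d in /dev/sd[a-h]3; do
      echo "== $d =="                 # label each section with its device
      mdadm -E "$d"
  done > /root/mdadm-examine-$(date +%Y%m%d).txt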

Glad it all worked out!

NeilBrown


> 
> Thanks again and have a GREAT weekend.
> Fernando Vasquez a.k.a likewhoa




* Re: raid10 issues after reorder of boot drives.
  2012-04-28 21:28                     ` NeilBrown
@ 2012-04-29 14:23                       ` likewhoa
  0 siblings, 0 replies; 20+ messages in thread
From: likewhoa @ 2012-04-29 14:23 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

On 04/28/2012 05:28 PM, NeilBrown wrote:
> I would just list the UUID for each array:
>    ARRAY /dev/md0 uuid=.......
> and leave it at that.
I used the output of 'mdadm -Esv' and then removed everything but the
UUID=..., but after doing 'mdadm -A /dev/md1' I get:

mdadm: failed to add 8:131 to /dev/md1: Invalid argument
mdadm: /dev/md1 assembled from 0 drives - not enough to start the array.

mdadm.conf has "ARRAY /dev/md/1 UUID=158918a9:52356b95:f4926a78:a55d4509",
so I am not sure what the problem could be since it should just work.
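
One hypothetical first step for debugging that would be to check which device
major:minor 8:131 actually is and what md superblock it carries; the commands
below are only an illustration:

  # map major:minor 8:131 to a partition name
  awk '$1 == 8 && $2 == 131 {print $4}' /proc/partitions
  # then inspect the md superblock on whatever partition that prints
  mdadm -E /dev/sdXn                  # substitute the name printed above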



* Re: raid10 issues after reorder of boot drives.
  2012-04-27 22:03 ` NeilBrown
  2012-04-27 23:26   ` likewhoa
@ 2012-05-01  9:45   ` Brian Candler
  2012-05-01 10:18     ` NeilBrown
  1 sibling, 1 reply; 20+ messages in thread
From: Brian Candler @ 2012-05-01  9:45 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

On Sat, Apr 28, 2012 at 08:03:55AM +1000, NeilBrown wrote:
> I'm afraid you've been bitten by a rather nasty bug which is present in 3.3
> and got back ported to some -stable kernel.  The fix has been submitted and
> should be appearing in -stable kernels soon (maybe already).

Do you happen to know if this bug was present in ubuntu 11.10, kernel
versions 3.0.0-15-server or 3.0.0-16-server ?

I saw failures to assemble RAID6 arrays after unclean shutdowns on both of
these, just in test environments.  If it happens again I'll send mdadm
--examine output. I just tried to replicate it and failed :-(

Regards,

Brian.


* Re: raid10 issues after reorder of boot drives.
  2012-05-01  9:45   ` Brian Candler
@ 2012-05-01 10:18     ` NeilBrown
  2012-05-01 11:15       ` Brian Candler
  2012-05-02  2:37       ` linbloke
  0 siblings, 2 replies; 20+ messages in thread
From: NeilBrown @ 2012-05-01 10:18 UTC (permalink / raw)
  To: Brian Candler; +Cc: linux-raid


On Tue, 1 May 2012 10:45:08 +0100 Brian Candler <B.Candler@pobox.com> wrote:

> On Sat, Apr 28, 2012 at 08:03:55AM +1000, NeilBrown wrote:
> > I'm afraid you've been bitten by a rather nasty bug which is present in 3.3
> > and got back ported to some -stable kernel.  The fix has been submitted and
> > should be appearing in -stable kernels soon (maybe already).
> 
> Do you happen to know if this bug was present in ubuntu 11.10, kernel
> versions 3.0.0-15-server or 3.0.0-16-server ?

I don't keep track of what ubuntu (or anyone but suse) put in their kernels,
sorry.

> 
> I saw failures to assemble RAID6 arrays after unclean shutdowns on both of
> these, just in test environments.  If it happens again I'll send mdadm
> --examine output. I just tried to replicate it and failed :-(

To replicate:

 - create an array.
 - stop the array
 - assemble the array with at least one missing device e.g.:
    mdadm -A /dev/md0 /dev/sda1 /dev/sdb1
   if it was a 3-device array
 - check in /proc/mdstat that it is listed as "inactive"
 - reboot
 - now "mdadm -E" the devices.  If the raid level is -unknown- then the bug
   has hit.
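
As a rough command-line transcript of those steps (the device names, RAID
level and member count below are purely illustrative placeholders):

  # create a throwaway 3-device array on spare partitions
  mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdx1 /dev/sdy1 /dev/sdz1
  mdadm --stop /dev/md0
  # re-assemble with one member missing; without --run mdadm leaves it inactive
  mdadm -A /dev/md0 /dev/sdx1 /dev/sdy1
  cat /proc/mdstat                    # the array should be listed as inactive
  reboot
  # after the reboot:
  mdadm -E /dev/sdx1                  # "Raid Level : -unknown-" means the bug hit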

NeilBrown


> 
> Regards,
> 
> Brian.




* Re: raid10 issues after reorder of boot drives.
  2012-05-01 10:18     ` NeilBrown
@ 2012-05-01 11:15       ` Brian Candler
  2012-05-02  2:37       ` linbloke
  1 sibling, 0 replies; 20+ messages in thread
From: Brian Candler @ 2012-05-01 11:15 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

On Tue, May 01, 2012 at 08:18:54PM +1000, NeilBrown wrote:
> To replicate:
> 
>  - create an array.
>  - stop the array
>  - assemble the array with at least one missing device e.g.:
>     mdadm -A /dev/md0 /dev/sda1 /dev/sdb1
>    if it was a 3-device array
>  - check in /proc/mdstat that it is listed as "inactivate"
>  - reboot
>  - now "mdadm -E" the devices.  If the raid level is -unknown- then the bug
>    has hit.

Thank you for this info.

After reboot the array came up as active again, and mdadm -E showed "Raid
Level : raid6" for all drives, so that's not the problem I had before.

Regards,

Brian.


* Re: raid10 issues after reorder of boot drives.
  2012-05-01 10:18     ` NeilBrown
  2012-05-01 11:15       ` Brian Candler
@ 2012-05-02  2:37       ` linbloke
  1 sibling, 0 replies; 20+ messages in thread
From: linbloke @ 2012-05-02  2:37 UTC (permalink / raw)
  To: NeilBrown; +Cc: Brian Candler, linux-raid, Jan Ceuleers

On 1/05/12 8:18 PM, NeilBrown wrote:
> On Tue, 1 May 2012 10:45:08 +0100 Brian Candler<B.Candler@pobox.com>  wrote:
>
>> On Sat, Apr 28, 2012 at 08:03:55AM +1000, NeilBrown wrote:
>>> I'm afraid you've been bitten by a rather nasty bug which is present in 3.3
>>> and got back ported to some -stable kernel.  The fix has been submitted and
>>> should be appearing in -stable kernels soon (maybe already).
>> Do you happen to know if this bug was present in ubuntu 11.10, kernel
>> versions 3.0.0-15-server or 3.0.0-16-server ?
> I don't keep track of what ubuntu (or anyone but suse) put in their kernels,
> sorry.

I wondered about Ubuntu and checked the kernel package changelog:
http://changelogs.ubuntu.com/changelogs/pool/main/l/linux/linux_3.0.0-19.33/changelog

Cross-checking the commit nominated by Jan Ceuleers to this list on
30/04/12 (c744a65c1e2d59acc54333ce8) against the Ubuntu changelog (see the
grep below), I don't believe Ubuntu 11.10 includes the bug.

Cheers,

lb


$ grep ' md' changelog
   * md/bitmap: ensure to load bitmap when creating via sysfs.
   * md/raid1,raid10: avoid deadlock during resync/recovery.
   * net: Fix driver name for mdio-gpio.c
   * tcp: md5: using remote adress for md5 lookup in rst packet
   * md/raid5: fix bug that could result in reads from a failed device.
   * md/raid5: abort any pending parity operations when array fails.
   * md: Avoid waking up a thread after it has been freed.
   * md/raid5: fix bug that could result in reads from a failed device.
   * tcp: properly handle md5sig_pool references
   * md/linear: avoid corrupting structure while waiting for rcu_free to
   * md: Fix handling for devices from 2TB to 4TB in 0.90 metadata.
   * winbond: include linux/delay.h for mdelay et al
   * md: fix 'degraded' calculation when starting a reshape.
   * md: fix small irregularity with start_ro module parameter
   * SCSI: st: fix mdata->page_order handling
   * md: Fix unfortunate interaction with evms
   * md/bitmap: protect against bitmap removal while being updated.
   * md: Fix "strchr" [drivers/md/dm-log-userspace.ko] undefined!





>> I saw failures to assemble RAID6 arrays after unclean shutdowns on both of
>> these, just in test environments.  If it happens again I'll send mdadm
>> --examine output. I just tried to replicate it and failed :-(
> To replicate:
>
>   - create an array.
>   - stop the array
>   - assemble the array with at least one missing device e.g.:
>      mdadm -A /dev/md0 /dev/sda1 /dev/sdb1
>     if it was a 3-device array
>   - check in /proc/mdstat that it is listed as "inactivate"
>   - reboot
>   - now "mdadm -E" the devices.  If the raid level is -unknown- then the bug
>     has hit.
>
> NeilBrown
>
>
>> Regards,
>>
>> Brian.


Thread overview: 20+ messages
2012-04-27 20:04 raid10 issues after reorder of boot drives likewhoa
2012-04-27 21:51 ` likewhoa
2012-04-27 22:05   ` NeilBrown
2012-04-27 23:29     ` likewhoa
2012-04-28  0:24       ` NeilBrown
2012-04-28  0:35       ` likewhoa
2012-04-28  2:37         ` likewhoa
2012-04-28  2:55           ` NeilBrown
2012-04-28  2:59             ` likewhoa
2012-04-28  3:23               ` NeilBrown
2012-04-28  3:51                 ` likewhoa
2012-04-28 15:23                   ` likewhoa
2012-04-28 21:28                     ` NeilBrown
2012-04-29 14:23                       ` likewhoa
2012-04-27 22:03 ` NeilBrown
2012-04-27 23:26   ` likewhoa
2012-05-01  9:45   ` Brian Candler
2012-05-01 10:18     ` NeilBrown
2012-05-01 11:15       ` Brian Candler
2012-05-02  2:37       ` linbloke
