* Possible to rescue SW Raid5 with 2 missing Disks
@ 2013-02-19 15:19 Dragon
  2013-02-19 17:48 ` Phil Turmel
  0 siblings, 1 reply; 33+ messages in thread
From: Dragon @ 2013-02-19 15:19 UTC (permalink / raw)
  To: linux-raid

Hello,
Sorry for pushing, but is there anything further to do to rescue the raid? Sorry, but the system has not been running since December...

thx
sunny

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Possible to rescue SW Raid5 with 2 missing Disks
  2013-02-19 15:19 Possible to rescue SW Raid5 with 2 missing Disks Dragon
@ 2013-02-19 17:48 ` Phil Turmel
  2013-02-19 18:32   ` Roy Sigurd Karlsbakk
  0 siblings, 1 reply; 33+ messages in thread
From: Phil Turmel @ 2013-02-19 17:48 UTC (permalink / raw)
  To: Dragon; +Cc: linux-raid

On 02/19/2013 10:19 AM, Dragon wrote:
> Hello,
> Sorry for pushing, but is there anything further to do to rescue the raid? Sorry, but the system has not been running since December...

I still think you are at the point where duplicating your original disks
onto backups so you can try the combinations I identified is the only
*safe* way forward.  But only *safe* in the sense that if your first
combination doesn't work, you can copy back and try another combination.
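
A minimal sketch of the copy-on-write overlay approach sometimes used for this kind of experiment, as an alternative to full physical copies: it assumes device-mapper and loop devices are available, the overlay file name and its 4G size are arbitrary, and the same steps would be repeated for each member. Experiments then run against /dev/mapper/sda4-ov and the original disk is never written.

truncate -s 4G /var/tmp/overlay-sda4.img
loopdev=$(losetup -f --show /var/tmp/overlay-sda4.img)   # back the overlay with a loop device
sectors=$(blockdev --getsz /dev/sda4)                    # member size in 512-byte sectors
dmsetup create sda4-ov --table "0 $sectors snapshot /dev/sda4 $loopdev P 8"
# tear down afterwards:  dmsetup remove sda4-ov && losetup -d "$loopdev"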

I've yet to see any indication that the moosefs on top of your broken
ext4 fs is likely to be saved.

You should post the links to the "fsck -n" results of creating the array
in the six combinations I had you try.

You should also repost the output of mdadm -D and mdadm -E from the
original array that you shared with me on Dec 17.

Phil


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Possible to rescue SW Raid5 with 2 missing Disks
  2013-02-19 17:48 ` Phil Turmel
@ 2013-02-19 18:32   ` Roy Sigurd Karlsbakk
  0 siblings, 0 replies; 33+ messages in thread
From: Roy Sigurd Karlsbakk @ 2013-02-19 18:32 UTC (permalink / raw)
  To: Phil Turmel; +Cc: linux-raid, Dragon

> > Hello,
> > Sorry for pushing, but is there anything further to do to rescue the
> > raid? Sorry, but the system has not been running since December...
> 
> I still think you are at the point where duplicating your original
> disks
> onto backups so you can try the combinations I identified is the only
> *safe* way forward. But only *safe* in the sense that if your first
> combination doesn't work, you can copy back and try another
> combination.
> 
> I've yet to see any indication that the moosefs on top of your broken
> ext4 fs is likely to be saved.
> 
> You should post the links to the "fsck -n" results of creating the
> array
> in the six combinations I had you try.
> 
> You should also repost the output of mdadm -D and mdadm -E from the
> original array that you shared with me on Dec 17.

IMHO you should be using ddrescue (preferably the one from GNU) to copy the disks onto new disks (at least the faulty ones). Do this first, since additional access may result in more data loss. After doing that, use the new drives for recovery, since they will (hopefully) not be throwing errors. Working live on bad drives isn't something I would recommend.
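
A minimal GNU ddrescue sketch of that copy step, assuming /dev/sdd is a suspect source and /dev/sdX is a hypothetical blank destination disk (double-check the device names before running); the mapfile lets an interrupted copy resume:

ddrescue -f -n /dev/sdd /dev/sdX /root/ddrescue-sdd.map    # first pass, skip the slow scraping phase
ddrescue -f -r3 /dev/sdd /dev/sdX /root/ddrescue-sdd.map   # optional second pass, retry bad areas 3 times
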
-- 
Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 98013356
roy@karlsbakk.net
http://blogg.karlsbakk.net/
GPG Public key: http://karlsbakk.net/roysigurdkarlsbakk.pubkey.txt
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of xenotypic etymology. In most cases, adequate and relevant synonyms exist in Norwegian.
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Possible to rescue SW Raid5 with 2 missing Disks
@ 2013-03-03 23:09 Dragon
  0 siblings, 0 replies; 33+ messages in thread
From: Dragon @ 2013-03-03 23:09 UTC (permalink / raw)
  To: linux-raid

I did the create with sdd missing. Then I ran mount -a and got:
mount -a
"mount: wrong fs type, bad option, bad superblock on /dev/md2,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so"
And
"EXT4-fs (md2): bad geometry: block count 3651722880 exceeds size of device (3651721600 blocks)" and the mount fails

OK, is the superblock on sdf4 bad? What can I do next?
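
A quick, read-only way to compare what ext4 expects with what the re-created array provides (a hedged diagnostic sketch using e2fsprogs/util-linux tools; it assumes the usual 4K filesystem block size, which dumpe2fs will confirm):

dumpe2fs -h /dev/md2 | grep -iE 'block count|block size'   # what the filesystem thinks it should have
blockdev --getsize64 /dev/md2                              # what the array actually provides, in bytes
# 3651722880 blocks x 4096 bytes must fit into the byte count above; a shortfall
# usually means the array was re-created with a different data offset or device order.
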
thx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Possible to rescue SW Raid5 with 2 missing Disks
@ 2013-03-03 21:59 Dragon
  0 siblings, 0 replies; 33+ messages in thread
From: Dragon @ 2013-03-03 21:59 UTC (permalink / raw)
  To: linux-raid

Hello,
I had the idea to check old mails and found this:
active raid5 sda4[0] sdf4[5] sde4[4] sdd4[3](F) sdc4[2] sdb4[1] - and after that I see that sdf4 failed too.
So it looks like sdd4 failed first. Now the question is which steps I have to take for the rescue - block sizes etc. I would now run "mdadm --create --level 5 -n 6 --chunk=512 --assume-clean /dev/md2 /dev/sd{a,b,c,f,e}4 missing" and then mount the raid at its mountpoint to look at the files. Is that the way?

need help.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Possible to rescue SW Raid5 with 2 missing Disks
@ 2013-02-27  7:14 Dragon
  0 siblings, 0 replies; 33+ messages in thread
From: Dragon @ 2013-02-27  7:14 UTC (permalink / raw)
  To: linux-raid

Hello,
Mikael pointed me to the permute_array.pl Perl script, but I can't get it to run. I tried
permute_array --md /dev/md2 --mount /mnt/md2 /dev/sd[a-f]4 

I am not sure how to use it and I don't want to make a mistake - can you help with the right syntax? Anyway, I think we have worked out that:

- the missing disks are sdd and sdf, and sdd appears to have failed first
- the superblock and the 512 chunk size are correct on all members

Here are the tests with the 6 combinations I tried in the past:

Test1:
mdadm --create --level 5 -n 6 --chunk=512 --assume-clean /dev/md2 \
                /dev/sd{a,b,c,d,e,f}4
-> filesize 708MB with 20603326 lines and canceling at the end by e2fsck
- bad superblock, or the partition table is damaged
- bad group descriptor checksums
- lots of invalid inodes
- canceled with lots of illegal blocks in inodes

Test2:
mdadm --create --level 5 -n 6 --chunk=512 --assume-clean /dev/md2 /dev/sd{a,b,c,d,e}4 \
                missing
-> filesize 1.3GB with 37614367 lines and canceling by e2fsck at the end
- back to the original superblock
- bad superblock or damaged partition table at the beginning
- lots of invalid inodes
- canceled during inode iteration

Test3:
mdadm --create --level 5 -n 6 --chunk=512 --assume-clean /dev/md2 /dev/sd{a,b,c}4 \
                missing /dev/sd{e,f}4
-> filesize 1.4GB with 40745425 lines and canceling by e2fsck at the end
- errors as in test 2
- read error while reading the next inode

Test4:
mdadm --create --level 5 -n 6 --chunk=512 --assume-clean /dev/md2 \
                /dev/sd{a,b,c,f,e,d}4
-> filesize 874MB with 25412000 lines and abort by e2fsck at the end
- tries the original superblock
- bad superblock or damaged partition table
- then lots of invalid group descriptor checksums
- at the end, illegal blocks in inodes / too many invalid blocks in inodes

Test5:
mdadm --create --level 5 -n 6 --chunk=512 --assume-clean /dev/md2 /dev/sd{a,b,c}4 \
                missing /dev/sd{e,d}4
-> filesize 1.6GB with 45673505 lines and canceling at the end by e2fsck

Test6:
mdadm --create --level 5 -n 6 --chunk=512 --assume-clean /dev/md2 /dev/sd{a,b,c,f,e}4 \
                missing
- tries the original superblock
- bad superblock or damaged partition table
- lots of checksum errors in group descriptors
- ends with the inode table conflicting with another filesystem block
-> filesize 542MB with 15727702 lines and canceling at the end by e2fsck

What would be the next step?
- back up the disks
- create the best combination (or all of them) with
mdadm --create --level 5 -n 6 --chunk=512 --assume-clean /dev/md2 /dev/sd{x,x,x,x,x}4 missing
- ??

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Possible to rescue SW Raid5 with 2 missing Disks
@ 2013-02-21 15:36 Dragon
  0 siblings, 0 replies; 33+ messages in thread
From: Dragon @ 2013-02-21 15:36 UTC (permalink / raw)
  To: linux-raid

Do you mean this script: http://marc.info/?l=linux-raid&m=134495194322112&w=2 - or else please point me to the one you mentioned.

I read somewhere here that I could mount the re-created raid read-only and test it. Is that right, and how?
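
A minimal sketch of a fully read-only check, assuming /mnt/md2 exists; --readonly at the md layer plus -n and ro,noload at the filesystem layer should keep the members untouched:

mdadm --readonly /dev/md2               # refuse writes at the md layer
fsck.ext4 -n -f /dev/md2                # answer "no" to every repair prompt
mount -o ro,noload /dev/md2 /mnt/md2    # noload skips the ext4 journal replay
ls /mnt/md2                             # spot-check the contents
umount /mnt/md2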

thx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Possible to rescue SW Raid5 with 2 missing Disks
@ 2013-02-20  8:54 Dragon
  0 siblings, 0 replies; 33+ messages in thread
From: Dragon @ 2013-02-20  8:54 UTC (permalink / raw)
  To: linux-raid

Hi,
OK, I had hoped the situation would turn around and we could rescue the system. As I wrote in the first post, Phil and I tested all the orders and checked them with fsck. As a result of that (see the first post), the most promising one would be the last test with:
mdadm --create --level 5 -n 6 --chunk=512 --assume-clean /dev/md2 /dev/sd{a,b,c,f,e}4 missing

Am I right?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Possible to rescue SW Raid5 with 2 missing Disks
  2013-02-20  7:44 Dragon
@ 2013-02-20  8:06 ` Mikael Abrahamsson
  0 siblings, 0 replies; 33+ messages in thread
From: Mikael Abrahamsson @ 2013-02-20  8:06 UTC (permalink / raw)
  To: Dragon; +Cc: linux-raid

On Wed, 20 Feb 2013, Dragon wrote:

> The tests on sdd and sdf ran overnight and both are clean - no errors. Therefore I think it was just my mistake while moving both drives. I hope that helps to get the raid back into a good state.

You need to try to --create --assume-clean all orders of drives to try to 
find an order that will work for you (passes fsck). There are scripts 
posted about which should be searchable via the archives.
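
A hedged sketch of what such a script boils down to, meant to run only against copies or overlays of the members, never the originals; the two candidate orders shown are illustrative, not exhaustive, and the log file names are arbitrary:

for order in \
    "/dev/sda4 /dev/sdb4 /dev/sdc4 missing /dev/sde4 /dev/sdf4" \
    "/dev/sda4 /dev/sdb4 /dev/sdc4 /dev/sdf4 /dev/sde4 missing"
do
    mdadm --stop /dev/md2 2>/dev/null
    mdadm --create /dev/md2 --level=5 --raid-devices=6 --chunk=512 \
          --assume-clean --run $order        # $order left unquoted on purpose: it must word-split
    tag=$(echo "$order" | tr ' /' '_.')
    fsck.ext4 -n -f /dev/md2 > "fsck-$tag.log" 2>&1
    echo "$order -> $(wc -l < "fsck-$tag.log") lines of fsck complaints"
done
mdadm --stop /dev/md2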

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Possible to rescue SW Raid5 with 2 missing Disks
@ 2013-02-20  7:44 Dragon
  2013-02-20  8:06 ` Mikael Abrahamsson
  0 siblings, 1 reply; 33+ messages in thread
From: Dragon @ 2013-02-20  7:44 UTC (permalink / raw)
  To: linux-raid

Hello,

The tests on sdd and sdf ran overnight and both are clean - no errors. Therefore I think it was just my mistake while moving both drives. I hope that helps to get the raid back into a good state.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Possible to rescue SW Raid5 with 2 missing Disks
@ 2013-02-19 20:36 Dragon
  0 siblings, 0 replies; 33+ messages in thread
From: Dragon @ 2013-02-19 20:36 UTC (permalink / raw)
  To: linux-raid

Hi Phil, Roy and Salatiel (whom I didn't find on the mailing list, only in my inbox ;))

@Phil 
I did that already: post no. 6 from 2013-02-15 shows the old -E and -D outputs, and I posted the output of fsck -n in the first post (no. 25, 2013-02-08).

@Roy
You're right, I would do the same, but I was hoping Mikael would say "it looks good" so that I don't have to buy another 6 disks and can re-create the raid with hopefully no errors ;)

@Salatiel
I haven't checked both disks sdd and sdf for bad blocks yet, but I will do so tonight and let you know. I don't think they have any.
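
A non-destructive way to do those checks (a sketch, assuming smartmontools and e2fsprogs are installed; badblocks without -w or -n is a pure read test, and the log names are arbitrary):

smartctl -a /dev/sdd | grep -iE 'reallocated|pending|offline_uncorrect'
smartctl -a /dev/sdf | grep -iE 'reallocated|pending|offline_uncorrect'
badblocks -sv /dev/sdd > badblocks-sdd.log
badblocks -sv /dev/sdf > badblocks-sdf.log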

best regards

sunny

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Possible to rescue SW Raid5 with 2 missing Disks
@ 2013-02-18 12:13 Dragon
  0 siblings, 0 replies; 33+ messages in thread
From: Dragon @ 2013-02-18 12:13 UTC (permalink / raw)
  To: linux-raid

Hello Mikael,

You give me hope ;). After that I did nothing except inform this mailing list and communicate with Phil, who advised me to run fsck -n /dev/md2 and log the errors to a file. After that I tried the various combinations, as I wrote in the first post. Since then I have not written to the raid...

The problem started when I wanted to move the disks slightly in the open case while the system was running. As I did so, sdd and sdf spun down for a moment and the error occurred - yes, it was my mistake ;(

Perhaps it is a problem with how I created the partitions under the raid, because of the offset, for example?

What do we do next?

thx all so far

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Possible to rescue SW Raid5 with 2 missing Disks
  2013-02-15 15:41 Dragon
@ 2013-02-16  5:06 ` Mikael Abrahamsson
  0 siblings, 0 replies; 33+ messages in thread
From: Mikael Abrahamsson @ 2013-02-16  5:06 UTC (permalink / raw)
  To: Dragon; +Cc: linux-raid

On Fri, 15 Feb 2013, Dragon wrote:

>          Magic : a92b4efc
>        Version : 1.2
>    Feature Map : 0x0
>     Array UUID : 92a99ca0:8e22bc66:f8050881:78c344d9
>           Name : mfsnode1:2  (local to host mfsnode1)
>  Creation Time : Tue Jul 24 18:18:21 2012
>     Raid Level : raid5
>   Raid Devices : 6
>
> Avail Dev Size : 5842757597 (2786.04 GiB 2991.49 GB)
>     Array Size : 29213783040 (13930.22 GiB 14957.46 GB)
>  Used Dev Size : 5842756608 (2786.04 GiB 2991.49 GB)
>    Data Offset : 2048 sectors
>   Super Offset : 8 sectors
>          State : clean
>    Device UUID : 6a679a1d:f42d6b9f:a13cd977:0fdbea86
>
>    Update Time : Mon Dec 17 00:13:56 2012
>       Checksum : 8c50873e - correct
>         Events : 211768
>
>         Layout : left-symmetric
>     Chunk Size : 512K

As far as I can tell, it looks like the data offset, super offset and
superblock version all match what you have after doing --create. So this is good.

> mdadm -E /dev/sdd4
>         Events : 0

> mdadm -E /dev/sdf4
>         Events : 0

Do you have any idea how this happened?
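
For comparing these fields across all six members in one pass, a small read-only loop like this (a sketch; adjust the glob to the actual member partitions) saves eyeballing each -E dump separately:

for d in /dev/sd[a-f]4; do
    echo "== $d"
    mdadm --examine "$d" | grep -E 'Data Offset|Device Role|Events|Update Time'
done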

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Possible to rescue SW Raid5 with 2 missing Disks
@ 2013-02-15 15:41 Dragon
  2013-02-16  5:06 ` Mikael Abrahamsson
  0 siblings, 1 reply; 33+ messages in thread
From: Dragon @ 2013-02-15 15:41 UTC (permalink / raw)
  To: linux-raid

Hi,
OK, you're absolutely right. I searched the logfiles in /var/log but found nothing about the superblock. But as I wrote, I worked with Phil on this problem, and in my outbox I found an email with information from before this:

Here is the output of option "-D":
--------------------------------------------------------------
mdadm -D /dev/md2
/dev/md2:
        Version : 1.2
  Creation Time : Tue Jul 24 18:18:21 2012
     Raid Level : raid5
  Used Dev Size : -1
   Raid Devices : 6
  Total Devices : 4
    Persistence : Superblock is persistent

    Update Time : Mon Dec 17 00:13:56 2012
          State : active, FAILED, Not Started
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           Name : mfsnode1:2  (local to host mfsnode1)
           UUID : 92a99ca0:8e22bc66:f8050881:78c344d9
         Events : 211768

    Number   Major   Minor   RaidDevice State
       0       8        4        0      active sync   /dev/sda4
       1       8       20        1      active sync   /dev/sdb4
       2       8       36        2      active sync   /dev/sdc4
       3       0        0        3      removed
       4       8       68        4      active sync   /dev/sde4
       5       0        0        5      removed
--------------------------------------------------------------
mdadm -E /dev/sda4
/dev/sda4:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 92a99ca0:8e22bc66:f8050881:78c344d9
           Name : mfsnode1:2  (local to host mfsnode1)
  Creation Time : Tue Jul 24 18:18:21 2012
     Raid Level : raid5
   Raid Devices : 6

 Avail Dev Size : 5842757597 (2786.04 GiB 2991.49 GB)
     Array Size : 29213783040 (13930.22 GiB 14957.46 GB)
  Used Dev Size : 5842756608 (2786.04 GiB 2991.49 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 6a679a1d:f42d6b9f:a13cd977:0fdbea86

    Update Time : Mon Dec 17 00:13:56 2012
       Checksum : 8c50873e - correct
         Events : 211768

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAA.A. ('A' == active, '.' == missing)
--------------------------------------------------------------
mdadm -E /dev/sdb4
/dev/sdb4:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 92a99ca0:8e22bc66:f8050881:78c344d9
           Name : mfsnode1:2  (local to host mfsnode1)
  Creation Time : Tue Jul 24 18:18:21 2012
     Raid Level : raid5
   Raid Devices : 6

 Avail Dev Size : 5842757597 (2786.04 GiB 2991.49 GB)
     Array Size : 29213783040 (13930.22 GiB 14957.46 GB)
  Used Dev Size : 5842756608 (2786.04 GiB 2991.49 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 099f7aa0:9f7a0802:2a874b3a:259431c6

    Update Time : Mon Dec 17 00:13:56 2012
       Checksum : 73870f28 - correct
         Events : 211768

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAA.A. ('A' == active, '.' == missing)
--------------------------------------------------------------
mdadm -E /dev/sdc4
/dev/sdc4:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 92a99ca0:8e22bc66:f8050881:78c344d9
           Name : mfsnode1:2  (local to host mfsnode1)
  Creation Time : Tue Jul 24 18:18:21 2012
     Raid Level : raid5
   Raid Devices : 6

 Avail Dev Size : 5842757597 (2786.04 GiB 2991.49 GB)
     Array Size : 29213783040 (13930.22 GiB 14957.46 GB)
  Used Dev Size : 5842756608 (2786.04 GiB 2991.49 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 41f58710:f1658674:4980a5ae:a9c674b3

    Update Time : Mon Dec 17 00:13:56 2012
       Checksum : b7af7c56 - correct
         Events : 211768

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAA.A. ('A' == active, '.' == missing)
--------------------------------------------------------------
mdadm -E /dev/sdd4
/dev/sdd4:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 92a99ca0:8e22bc66:f8050881:78c344d9
           Name : mfsnode1:2  (local to host mfsnode1)
  Creation Time : Tue Jul 24 18:18:21 2012
     Raid Level : raid5
   Raid Devices : 6

 Avail Dev Size : 5842757597 (2786.04 GiB 2991.49 GB)
     Array Size : 29213783040 (13930.22 GiB 14957.46 GB)
  Used Dev Size : 5842756608 (2786.04 GiB 2991.49 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 53e3bb36:58d746d3:52012491:f518b985

    Update Time : Mon Dec 17 00:13:56 2012
       Checksum : f16373f1 - correct
         Events : 0

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : spare
   Array State : AAA.A. ('A' == active, '.' == missing)
--------------------------------------------------------------
mdadm -E /dev/sde4
/dev/sde4:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 92a99ca0:8e22bc66:f8050881:78c344d9
           Name : mfsnode1:2  (local to host mfsnode1)
  Creation Time : Tue Jul 24 18:18:21 2012
     Raid Level : raid5
   Raid Devices : 6

 Avail Dev Size : 5842756608 (2786.04 GiB 2991.49 GB)
     Array Size : 29213783040 (13930.22 GiB 14957.46 GB)
    Data Offset : 1024 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 570b6ed0:9be27d23:6db489de:9c5ddf30

    Update Time : Mon Dec 17 00:13:56 2012
       Checksum : d3dbd252 - correct
         Events : 211768

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 4
   Array State : AAA.A. ('A' == active, '.' == missing)
--------------------------------------------------------------
mdadm -E /dev/sdf4
/dev/sdf4:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 92a99ca0:8e22bc66:f8050881:78c344d9
           Name : mfsnode1:2  (local to host mfsnode1)
  Creation Time : Tue Jul 24 18:18:21 2012
     Raid Level : raid5
   Raid Devices : 6

 Avail Dev Size : 5842756608 (2786.04 GiB 2991.49 GB)
     Array Size : 29213783040 (13930.22 GiB 14957.46 GB)
    Data Offset : 1024 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 3718659d:78d3c607:69c0d241:009a0ef6

    Update Time : Mon Dec 17 00:13:56 2012
       Checksum : ad90dd3a - correct
         Events : 0

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : spare
   Array State : AAA.A. ('A' == active, '.' == missing)
--------------------------------------------------------------
cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md2 : inactive sda4[0] sde4[4] sdc4[2] sdb4[1]
      11685514699 blocks super 1.2

md1 : active (auto-read-only) raid5 sda3[0] sdf3[5] sde3[4] sdd3[3]
sdc3[2] sdb3[1]
      4881920 blocks super 1.2 level 5, 512k chunk, algorithm 2 [6/6]
[UUUUUU]

md0 : active raid1 sda2[0] sdf2[5] sde2[4] sdd2[3] sdc2[2] sdb2[1]
      7811464 blocks super 1.2 [6/6] [UUUUUU]

unused devices: <none>
--------------------------------------------------------------
dmesg |grep md2
[    2.497112] md: md2 stopped.
[    2.522264] raid5: allocated 6386kB for md2
[    2.522456] raid5: not enough operational devices for md2 (2/6 failed)
[    2.522780] raid5: failed to run raid set md2
--------------------------------------------------------------
dmesg |grep sd[a,b,c,d,e,f]4
[    1.183060]  sdf: sdd1 sdd2 sdd3 sdd4
[    1.211480]  sdc1 sdc2 sdc3 sdc4
[    1.212753]  sda1 sda2 sda3 sda4
[    1.214156]  sdb1 sdb2 sdb3 sdb4
[    1.221997]  sdf1 sdf2 sdf3 sdf4
[    1.249067]  sde1 sde2 sde3 sde4
[    2.502642] md: bind<sdb4>
[    2.505288] md: bind<sdc4>
[    2.505426] md: bind<sde4>
[    2.505521] md: bind<sdf4>
[    2.505613] md: export_rdev(sdd4)
[    2.505823] md: bind<sda4>
[    2.505884] md: kicking non-fresh sdf4 from array!
[    2.505898] md: unbind<sdf4>
[    2.520389] md: export_rdev(sdf4)
[    2.521897] raid5: device sda4 operational as raid disk 0
[    2.521900] raid5: device sde4 operational as raid disk 4
[    2.521902] raid5: device sdc4 operational as raid disk 2
[    2.521904] raid5: device sdb4 operational as raid disk 1
[    2.522510]  disk 0, o:1, dev:sda4
[    2.522512]  disk 1, o:1, dev:sdb4
[    2.522513]  disk 2, o:1, dev:sdc4
[    2.522515]  disk 4, o:1, dev:sde4
------------------------------------------------

Perhaps this can shed some light on the matter.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Possible to rescue SW Raid5 with 2 missing Disks
  2013-02-15 12:59 Dragon
@ 2013-02-15 14:51 ` Mikael Abrahamsson
  0 siblings, 0 replies; 33+ messages in thread
From: Mikael Abrahamsson @ 2013-02-15 14:51 UTC (permalink / raw)
  To: Dragon; +Cc: linux-raid

On Fri, 15 Feb 2013, Dragon wrote:

> In my case the superblock seems to be persistent on all disks; I used the same version and chunk size.

The problem is that you used --create which just overwrites your old 
superblocks. So the information on your current superblocks might very 
well be wrong. The order of the drives might be wrong, you might have 
wrong chunk size, you might have wrong data offset, you might even have 
wrong superblock version.

Since you didn't save the superblock information before you did --create, 
it's hard to give you advice on how to proceed.

If you have old log entries (syslog) from when the raid was successfully 
started and stopped before you ran into trouble, that might contain some 
information that might help.
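
A hedged sketch of that log search on a Debian system (zgrep also reads the rotated .gz files; the paths are Debian defaults and may differ):

zgrep -i 'md2\|raid5' /var/log/syslog* /var/log/kern.log* 2>/dev/null | less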

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Possible to rescue SW Raid5 with 2 missing Disks
@ 2013-02-15 12:59 Dragon
  2013-02-15 14:51 ` Mikael Abrahamsson
  0 siblings, 1 reply; 33+ messages in thread
From: Dragon @ 2013-02-15 12:59 UTC (permalink / raw)
  To: linux-raid

Hi Mikael and Brad,

Thanks for the information - I appreciate it, because I have learned a lot. You are right that the internet is full of wrong advice, and yes, I am reacting a bit late now that the problem is here. As an excuse: I don't have the time to study a subject as big as RAID with all its interesting aspects (hardware, speed, possibilities, etc.), but I learn a lot by solving errors like this ;) and I appreciate people like you on mailing lists and forums who help.

OK, now it becomes difficult for me to follow because of my English. What I understand from your answers is:
- --force is not a problem at all
- in most cases do not use --create --assume-clean, unless you know why you need it
--> but what is the problem with this? Please explain.

In my case the superblock seems to be persistent on all disks; I used the same version and chunk size.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Possible to rescue SW Raid5 with 2 missing Disks
  2013-02-15 12:12       ` Brad Campbell
@ 2013-02-15 12:34         ` Mikael Abrahamsson
  0 siblings, 0 replies; 33+ messages in thread
From: Mikael Abrahamsson @ 2013-02-15 12:34 UTC (permalink / raw)
  To: Brad Campbell; +Cc: linux-raid

On Fri, 15 Feb 2013, Brad Campbell wrote:

> Please don't get me wrong. There is no nice way of protecting fools from 
> themselves short of revoking their license to breed, it's just from my 
> severely opinionated viewpoint that this looks like a sane idea. I'm quite 
> prepared to be slapped, and in fact probably need it.

Well, yes, it's a sane idea.

Right now it seems a significant part of new threads started on this list 
is about people doing --create --assume-clean with a different version of 
mdadm (so offset/chunk-size/whatever is wrong) or they have the wrong 
order. In order to help them, I believe the default behaviour of
--zero-superblock without any additional flags should be to give an
hdparm-style message so that the user understands what is really going to
happen, and when the superblocks are zeroed, the mdadm --examine and --detail
output (if the array is running), with a UUID/date, should be saved somewhere.

The aim would be to drive two things:

People should think thrice about using --create --assume-clean. If they do 
(or do --zero-superblock), detailed information should be saved for the 
array and components before the operation was performed.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Possible to rescue SW Raid5 with 2 missing Disks
  2013-02-15 11:23     ` Mikael Abrahamsson
@ 2013-02-15 12:12       ` Brad Campbell
  2013-02-15 12:34         ` Mikael Abrahamsson
  0 siblings, 1 reply; 33+ messages in thread
From: Brad Campbell @ 2013-02-15 12:12 UTC (permalink / raw)
  To: Mikael Abrahamsson; +Cc: linux-raid

On 15/02/13 19:23, Mikael Abrahamsson wrote:
> On Fri, 15 Feb 2013, Brad Campbell wrote:
>
>> To the point I've been ready to submit a patch for --assume-clean along the lines of what hdparm 
>> does and makes you also attach --please-destroy-my-disk before it'll work.
>
> Yes, I think this is an excellent idea. It should have --please-destroy-my-disk, and if there are 
> existing superblocks, it should save the contents of mdadm --examine before overwriting or 
> zero:ing them.

Please don't get me wrong. There is no nice way of protecting fools from themselves short of 
revoking their license to breed, it's just from my severely opinionated viewpoint that this looks 
like a sane idea. I'm quite prepared to be slapped, and in fact probably need it.
> --assemble --force should list the drive event count and say which drives have differing event 
> counts and ask for confirmation that this is really what the operator wants to do. If the event 
> count differs more than 50 (or some other value), there should be a requirement for 
> --please-destroy-my-array to be added :P
>
>> --assume-clean is great for those that need it, but for the great majority who rely on google and 
>> wikis it's a data-destroyer.
>
This is not a sleight on Neil and/or mdadm. The tools are wonderful, it's those that compose 
permanent articles without a full understanding that need adjusting. Unfortunately it's also those 
that don't really care as long as it drives page hits or reputation.

Brad (the unpleasant)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Possible to rescue SW Raid5 with 2 missing Disks
  2013-02-15 11:14   ` Brad Campbell
@ 2013-02-15 11:23     ` Mikael Abrahamsson
  2013-02-15 12:12       ` Brad Campbell
  0 siblings, 1 reply; 33+ messages in thread
From: Mikael Abrahamsson @ 2013-02-15 11:23 UTC (permalink / raw)
  To: Brad Campbell; +Cc: linux-raid

On Fri, 15 Feb 2013, Brad Campbell wrote:

> To the point I've been ready to submit a patch for --assume-clean along 
> the lines of what hdparm does and makes you also attach 
> --please-destroy-my-disk before it'll work.

Yes, I think this is an excellent idea. It should have 
--please-destroy-my-disk, and if there are existing superblocks, it should 
save the contents of mdadm --examine before overwriting or zero:ing them.

--assemble --force should list the drive event count and say which drives 
have differing event counts and ask for confirmation that this is really 
what the operator wants to do. If the event count differs more than 50 (or 
some other value), there should be a requirement for 
--please-destroy-my-array to be added :P

> --assume-clean is great for those that need it, but for the great majority 
> who rely on google and wikis it's a data-destroyer.

I agree.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Possible to rescue SW Raid5 with 2 missing Disks
  2013-02-15  9:57 ` Mikael Abrahamsson
@ 2013-02-15 11:14   ` Brad Campbell
  2013-02-15 11:23     ` Mikael Abrahamsson
  0 siblings, 1 reply; 33+ messages in thread
From: Brad Campbell @ 2013-02-15 11:14 UTC (permalink / raw)
  To: Mikael Abrahamsson; +Cc: linux-raid

On 15/02/13 17:57, Mikael Abrahamsson wrote:
> On Thu, 14 Feb 2013, Dragon wrote:
>
>> I heard that --force could result in more problems - right? I did it
>> as well:
>
> Compared to the trouble people get into when getting --create
> --assume-clean and getting it wrong, I'd say --force is *nothing*.
>
> --force needs drives that have similar event count, if they're too far
> apart (one drive for instance), then that specific drive shouldn't be
> used when assembling.
>

To the point I've been ready to submit a patch for --assume-clean along 
the lines of what hdparm does and makes you also attach 
--please-destroy-my-disk before it'll work.

--assume-clean is great for those that need it, but for the great 
majority who rely on google and wikis it's a data-destroyer.

Not mdadm's fault however. More the twits who wrote the wikis.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Possible to rescue SW Raid5 with 2 missing Disks
  2013-02-14 21:39 Dragon
@ 2013-02-15  9:57 ` Mikael Abrahamsson
  2013-02-15 11:14   ` Brad Campbell
  0 siblings, 1 reply; 33+ messages in thread
From: Mikael Abrahamsson @ 2013-02-15  9:57 UTC (permalink / raw)
  To: linux-raid

On Thu, 14 Feb 2013, Dragon wrote:

> I heard that --force could result in more problems - right? I did it as well:

Compared to the trouble people get into when getting --create 
--assume-clean and getting it wrong, I'd say --force is *nothing*.

--force needs drives that have similar event count, if they're too far 
apart (one drive for instance), then that specific drive shouldn't be 
used when assembling.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Possible to rescue SW Raid5 with 2 missing Disks
@ 2013-02-15  9:46 Dragon
  0 siblings, 0 replies; 33+ messages in thread
From: Dragon @ 2013-02-15  9:46 UTC (permalink / raw)
  To: linux-raid

Hello Dave,

Many thanks for your help and your explanations. The first step seems to have worked - am I right? So the next step would be to re-add the missing sdd4 (mdadm --add /dev/md2 /dev/sdd4).

mdadm -A /dev/md2 -R /dev/sda4 /dev/sdb4 /dev/sdc4 /dev/sdf4 /dev/sde4
mdadm: /dev/md2 has been started with 5 drives (out of 6).
-------------
output of dmesg
mdadm: sending ioctl 1261 to a partition!
[ 9884.422907] mdadm: sending ioctl 1261 to a partition!
[ 9884.422916] mdadm: sending ioctl 1261 to a partition!
[ 9884.540433] mdadm: sending ioctl 1261 to a partition!
[ 9884.540443] mdadm: sending ioctl 1261 to a partition!
[ 9884.561874] mdadm: sending ioctl 1261 to a partition!
[ 9884.561884] mdadm: sending ioctl 1261 to a partition!
[ 9884.562283] mdadm: sending ioctl 1261 to a partition!
[ 9884.562292] mdadm: sending ioctl 1261 to a partition!
[ 9885.381902] md: md2 stopped.
[ 9885.431051] md: bind<sdb4>
[ 9885.431185] md: bind<sdc4>
[ 9885.436744] md: bind<sdf4>
[ 9885.437110] md: bind<sde4>
[ 9885.437415] md: bind<sda4>
[ 9885.518260] raid5: device sda4 operational as raid disk 0
[ 9885.518264] raid5: device sde4 operational as raid disk 4
[ 9885.518266] raid5: device sdf4 operational as raid disk 3
[ 9885.518267] raid5: device sdc4 operational as raid disk 2
[ 9885.518269] raid5: device sdb4 operational as raid disk 1
[ 9885.518644] raid5: allocated 6386kB for md2
[ 9885.518843] 0: w=1 pa=0 pr=6 m=1 a=2 r=6 op1=0 op2=0
[ 9885.518847] 4: w=2 pa=0 pr=6 m=1 a=2 r=6 op1=0 op2=0
[ 9885.518849] 3: w=3 pa=0 pr=6 m=1 a=2 r=6 op1=0 op2=0
[ 9885.518851] 2: w=4 pa=0 pr=6 m=1 a=2 r=6 op1=0 op2=0
[ 9885.518853] 1: w=5 pa=0 pr=6 m=1 a=2 r=6 op1=0 op2=0
[ 9885.518855] raid5: raid level 5 set md2 active with 5 out of 6 devices, algorithm 2
[ 9885.518871] RAID5 conf printout:
[ 9885.518872]  --- rd:6 wd:5
[ 9885.518874]  disk 0, o:1, dev:sda4
[ 9885.518876]  disk 1, o:1, dev:sdb4
[ 9885.518877]  disk 2, o:1, dev:sdc4
[ 9885.518879]  disk 3, o:1, dev:sdf4
[ 9885.518880]  disk 4, o:1, dev:sde4
[ 9885.518894] md2: Warning: Device sde4 is misaligned
[ 9885.518896] md2: Warning: Device sdf4 is misaligned
[ 9885.518920] md2: detected capacity change from 0 to 14957451673600
[ 9885.520484]  md2: unknown partition table
---
mdadm -D /dev/md2 
/dev/md2:
        Version : 1.2
  Creation Time : Tue Feb  5 17:44:45 2013
     Raid Level : raid5
     Array Size : 14606886400 (13930.21 GiB 14957.45 GB)
  Used Dev Size : 2921377280 (2786.04 GiB 2991.49 GB)
   Raid Devices : 6
  Total Devices : 5
    Persistence : Superblock is persistent

    Update Time : Tue Feb  5 17:44:45 2013
          State : clean, degraded
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           Name : mfsnode1:2  (local to host mfsnode1)
           UUID : e84f0346:3f5ff3f1:507b6f9c:0fa02c63
         Events : 0

    Number   Major   Minor   RaidDevice State
       0       8        4        0      active sync   /dev/sda4
       1       8       20        1      active sync   /dev/sdb4
       2       8       36        2      active sync   /dev/sdc4
       3       8       84        3      active sync   /dev/sdf4
       4       8       68        4      active sync   /dev/sde4
       5       0        0        5      removed
-----
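
A sketch of the re-add step asked about above, plus how to watch the resync; this should only be done once the degraded array's contents have been verified read-only:

mdadm /dev/md2 --add /dev/sdd4
cat /proc/mdstat                                    # recovery progress shows up here
mdadm --detail /dev/md2 | grep -E 'State|Rebuild'   # "Rebuild Status : x% complete" while syncing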

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Possible to rescue SW Raid5 with 2 missing Disks
  2013-02-14 21:00 Dragon
  2013-02-14 21:11 ` Robin Hill
@ 2013-02-15  0:59 ` Dave Cundiff
  1 sibling, 0 replies; 33+ messages in thread
From: Dave Cundiff @ 2013-02-15  0:59 UTC (permalink / raw)
  To: Dragon; +Cc: Linux MDADM Raid

On Thu, Feb 14, 2013 at 4:00 PM, Dragon <Sunghost@gmx.de> wrote:
> Hello,
>
> the 40TB are in a file cluster and this machine is part of it. The system consists of 6x3TB in software RAID5. There are four partitions on each machine: one with 100MB for the EFI BIOS, one for the OS in RAID1, one for swap in RAID5 and one for the files in RAID5. The machine was open and the disks lay beside it; I moved two disks slightly and at that moment both spun down for a second and the raid was gone. There was no file transfer, but the raid couldn't be reassembled because of the two missing disks.
>
> mdadm -E /dev/sdd4
> /dev/sdd4:
>           Magic : a92b4efc
>         Version : 1.2
>     Feature Map : 0x0
>      Array UUID : 7b99380e:51d754cf:921c68e9:7b830d6a
>            Name : mfsnode1:2  (local to host mfsnode1)
>   Creation Time : Tue Feb  5 17:06:37 2013
>      Raid Level : raid5
>    Raid Devices : 6
>
>  Avail Dev Size : 5842757597 (2786.04 GiB 2991.49 GB)
>      Array Size : 29213772800 (13930.21 GiB 14957.45 GB)
>   Used Dev Size : 5842754560 (2786.04 GiB 2991.49 GB)
>     Data Offset : 2048 sectors
>    Super Offset : 8 sectors
>           State : clean
>     Device UUID : 0da58625:14ed8675:6a7c4ba4:337d8c4b
>
>     Update Time : Tue Feb  5 17:06:37 2013

This disk looks to have dropped first. The times on the others are identical.

>        Checksum : 5f97164a - correct
>          Events : 0
>
>          Layout : left-symmetric
>      Chunk Size : 512K
>
>    Device Role : Active device 5
>    Array State : AAA.AA ('A' == active, '.' == missing)

Let's see if we can just help the md driver along without having it
scan; try this:

mdadm -A /dev/md2 -R /dev/sda4 /dev/sdb4 /dev/sdc4 /dev/sdf4 /dev/sde4

If that complains try

mdadm -A /dev/md2 --force -R /dev/sda4 /dev/sdb4 /dev/sdc4 /dev/sdf4 /dev/sde4

One of these should start your array with 1 missing disk. If neither
works, let me know the output.


--
Dave Cundiff
System Administrator
A2Hosting, Inc
http://www.a2hosting.com

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Possible to rescue SW Raid5 with 2 missing Disks
@ 2013-02-14 21:39 Dragon
  2013-02-15  9:57 ` Mikael Abrahamsson
  0 siblings, 1 reply; 33+ messages in thread
From: Dragon @ 2013-02-14 21:39 UTC (permalink / raw)
  To: linux-raid

I heard that --force could result in more problems - right? I did it as well:
 mdadm -Af -vv /dev/md2
mdadm: looking for devices for /dev/md2
mdadm: cannot open device /dev/md/1: Device or resource busy
mdadm: /dev/md/1 has wrong uuid.
mdadm: cannot open device /dev/md/0: Device or resource busy
mdadm: /dev/md/0 has wrong uuid.
mdadm: /dev/sdf4 has wrong uuid.
mdadm: cannot open device /dev/sdf3: Device or resource busy
mdadm: /dev/sdf3 has wrong uuid.
mdadm: cannot open device /dev/sdf2: Device or resource busy
mdadm: /dev/sdf2 has wrong uuid.
mdadm: no RAID superblock on /dev/sdf1
mdadm: /dev/sdf1 has wrong uuid.
mdadm: cannot open device /dev/sdf: Device or resource busy
mdadm: /dev/sdf has wrong uuid.
mdadm: /dev/sde4 has wrong uuid.
mdadm: cannot open device /dev/sde3: Device or resource busy
mdadm: /dev/sde3 has wrong uuid.
mdadm: cannot open device /dev/sde2: Device or resource busy
mdadm: /dev/sde2 has wrong uuid.
mdadm: no RAID superblock on /dev/sde1
mdadm: /dev/sde1 has wrong uuid.
mdadm: cannot open device /dev/sde: Device or resource busy
mdadm: /dev/sde has wrong uuid.
mdadm: /dev/sdc4 has wrong uuid.
mdadm: cannot open device /dev/sdc3: Device or resource busy
mdadm: /dev/sdc3 has wrong uuid.
mdadm: cannot open device /dev/sdc2: Device or resource busy
mdadm: /dev/sdc2 has wrong uuid.
mdadm: no RAID superblock on /dev/sdc1
mdadm: /dev/sdc1 has wrong uuid.
mdadm: cannot open device /dev/sdc: Device or resource busy
mdadm: /dev/sdc has wrong uuid.
mdadm: /dev/sdd4 has wrong uuid.
mdadm: cannot open device /dev/sdd3: Device or resource busy
mdadm: /dev/sdd3 has wrong uuid.
mdadm: cannot open device /dev/sdd2: Device or resource busy
mdadm: /dev/sdd2 has wrong uuid.
mdadm: no RAID superblock on /dev/sdd1
mdadm: /dev/sdd1 has wrong uuid.
mdadm: cannot open device /dev/sdd: Device or resource busy
mdadm: /dev/sdd has wrong uuid.
mdadm: /dev/sdb4 has wrong uuid.
mdadm: cannot open device /dev/sdb3: Device or resource busy
mdadm: /dev/sdb3 has wrong uuid.
mdadm: cannot open device /dev/sdb2: Device or resource busy
mdadm: /dev/sdb2 has wrong uuid.
mdadm: no RAID superblock on /dev/sdb1
mdadm: /dev/sdb1 has wrong uuid.
mdadm: cannot open device /dev/sdb: Device or resource busy
mdadm: /dev/sdb has wrong uuid.
mdadm: /dev/sda4 has wrong uuid.
mdadm: cannot open device /dev/sda3: Device or resource busy
mdadm: /dev/sda3 has wrong uuid.
mdadm: cannot open device /dev/sda2: Device or resource busy
mdadm: /dev/sda2 has wrong uuid.
mdadm: no RAID superblock on /dev/sda1
mdadm: /dev/sda1 has wrong uuid.
mdadm: cannot open device /dev/sda: Device or resource busy
mdadm: /dev/sda has wrong uuid.

It looks strange; the devices should be sdX4...

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Possible to rescue SW Raid5 with 2 missing Disks
  2013-02-14 21:00 Dragon
@ 2013-02-14 21:11 ` Robin Hill
  2013-02-15  0:59 ` Dave Cundiff
  1 sibling, 0 replies; 33+ messages in thread
From: Robin Hill @ 2013-02-14 21:11 UTC (permalink / raw)
  To: Dragon; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 1032 bytes --]

On Thu Feb 14, 2013 at 10:00:29 +0100, Dragon wrote:

> Hello,
> 
> the 40TB are in a file cluster and this machine is part of it. The
> system consists of 6x3TB in software RAID5. There are four partitions
> on each machine: one with 100MB for the EFI BIOS, one for the OS in
> RAID1, one for swap in RAID5 and one for the files in RAID5. The
> machine was open and the disks lay beside it; I moved two disks
> slightly and at that moment both spun down for a second and the raid
> was gone. There was no file transfer, but the raid couldn't be
> reassembled because of the two missing disks.
> 
The standard option to try is a forced assembly - "mdadm -Af /dev/mdX"

If that fails, then please retry with "-vv" as well and post both the
output from the command and the dmesg output.

HTH,
    Robin
-- 
     ___        
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Possible to rescue SW Raid5 with 2 missing Disks
@ 2013-02-14 21:00 Dragon
  2013-02-14 21:11 ` Robin Hill
  2013-02-15  0:59 ` Dave Cundiff
  0 siblings, 2 replies; 33+ messages in thread
From: Dragon @ 2013-02-14 21:00 UTC (permalink / raw)
  To: linux-raid

Hello,

the 40TB are in a file cluster and this machine is part of it. The system consists of 6x3TB in software RAID5. There are four partitions on each machine: one with 100MB for the EFI BIOS, one for the OS in RAID1, one for swap in RAID5 and one for the files in RAID5. The machine was open and the disks lay beside it; I moved two disks slightly and at that moment both spun down for a second and the raid was gone. There was no file transfer, but the raid couldn't be reassembled because of the two missing disks.

sda4
 mdadm -E /dev/sda4
/dev/sda4:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : e84f0346:3f5ff3f1:507b6f9c:0fa02c63
           Name : mfsnode1:2  (local to host mfsnode1)
  Creation Time : Tue Feb  5 17:44:45 2013
     Raid Level : raid5
   Raid Devices : 6

 Avail Dev Size : 5842757597 (2786.04 GiB 2991.49 GB)
     Array Size : 29213772800 (13930.21 GiB 14957.45 GB)
  Used Dev Size : 5842754560 (2786.04 GiB 2991.49 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 4f8851b4:001bf0c0:3aab60e0:b2c5558f

    Update Time : Tue Feb  5 17:44:45 2013
       Checksum : c0376a50 - correct
         Events : 0

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAAAA. ('A' == active, '.' == missing)
---------------
mdadm -E /dev/sdb4
/dev/sdb4:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : e84f0346:3f5ff3f1:507b6f9c:0fa02c63
           Name : mfsnode1:2  (local to host mfsnode1)
  Creation Time : Tue Feb  5 17:44:45 2013
     Raid Level : raid5
   Raid Devices : 6

 Avail Dev Size : 5842757597 (2786.04 GiB 2991.49 GB)
     Array Size : 29213772800 (13930.21 GiB 14957.45 GB)
  Used Dev Size : 5842754560 (2786.04 GiB 2991.49 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : c2f63fa7:768e9945:64826929:6f1f68c2

    Update Time : Tue Feb  5 17:44:45 2013
       Checksum : b3ea7d20 - correct
         Events : 0

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAAAA. ('A' == active, '.' == missing)
---------------------
mdadm -E /dev/sdc4
/dev/sdc4:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : e84f0346:3f5ff3f1:507b6f9c:0fa02c63
           Name : mfsnode1:2  (local to host mfsnode1)
  Creation Time : Tue Feb  5 17:44:45 2013
     Raid Level : raid5
   Raid Devices : 6

 Avail Dev Size : 5842757597 (2786.04 GiB 2991.49 GB)
     Array Size : 29213772800 (13930.21 GiB 14957.45 GB)
  Used Dev Size : 5842754560 (2786.04 GiB 2991.49 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : e9861f3e:4de4d0ce:7d4b6dd7:e1215fc7

    Update Time : Tue Feb  5 17:44:45 2013
       Checksum : 86fc2eab - correct
         Events : 0

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAAAA. ('A' == active, '.' == missing)
-----------------------
mdadm -E /dev/sdd4
/dev/sdd4:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 7b99380e:51d754cf:921c68e9:7b830d6a
           Name : mfsnode1:2  (local to host mfsnode1)
  Creation Time : Tue Feb  5 17:06:37 2013
     Raid Level : raid5
   Raid Devices : 6

 Avail Dev Size : 5842757597 (2786.04 GiB 2991.49 GB)
     Array Size : 29213772800 (13930.21 GiB 14957.45 GB)
  Used Dev Size : 5842754560 (2786.04 GiB 2991.49 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 0da58625:14ed8675:6a7c4ba4:337d8c4b

    Update Time : Tue Feb  5 17:06:37 2013
       Checksum : 5f97164a - correct
         Events : 0

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 5
   Array State : AAA.AA ('A' == active, '.' == missing)
-------------------
mdadm -E /dev/sde4
/dev/sde4:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : e84f0346:3f5ff3f1:507b6f9c:0fa02c63
           Name : mfsnode1:2  (local to host mfsnode1)
  Creation Time : Tue Feb  5 17:44:45 2013
     Raid Level : raid5
   Raid Devices : 6

 Avail Dev Size : 5842755584 (2786.04 GiB 2991.49 GB)
     Array Size : 29213772800 (13930.21 GiB 14957.45 GB)
  Used Dev Size : 5842754560 (2786.04 GiB 2991.49 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : b70cd4f6:1594cc29:b4346929:89a5ed34

    Update Time : Tue Feb  5 17:44:45 2013
       Checksum : 5a36c944 - correct
         Events : 0

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 4
   Array State : AAAAA. ('A' == active, '.' == missing)
-------------------
 mdadm -E /dev/sdf4
/dev/sdf4:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : e84f0346:3f5ff3f1:507b6f9c:0fa02c63
           Name : mfsnode1:2  (local to host mfsnode1)
  Creation Time : Tue Feb  5 17:44:45 2013
     Raid Level : raid5
   Raid Devices : 6

 Avail Dev Size : 5842755584 (2786.04 GiB 2991.49 GB)
     Array Size : 29213772800 (13930.21 GiB 14957.45 GB)
  Used Dev Size : 5842754560 (2786.04 GiB 2991.49 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 06202661:79792af2:6c8d02ae:769bdded

    Update Time : Tue Feb  5 17:44:45 2013
       Checksum : ca70109c - correct
         Events : 0

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 3
   Array State : AAAAA. ('A' == active, '.' == missing)

I hope you guys can help.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Possible to rescue SW Raid5 with 2 missing Disks
  2013-02-14 17:09 ` Mikael Abrahamsson
@ 2013-02-14 17:18   ` Dave Cundiff
  0 siblings, 0 replies; 33+ messages in thread
From: Dave Cundiff @ 2013-02-14 17:18 UTC (permalink / raw)
  To: Mikael Abrahamsson; +Cc: Dragon, Linux MDADM Raid

On Thu, Feb 14, 2013 at 12:09 PM, Mikael Abrahamsson <swmike@swm.pp.se> wrote:
> On Thu, 14 Feb 2013, Dragon wrote:
>
>> No, I don't have an old version of the examine output from before ;(
>

Are the 2 disks completely failed? Or just dropped from the array?
Can you provide mdadm -E output from all devices that were in the array?

>
> Do you know what version of kernel and mdadm was used to create the raid5 in
> the first place? You should go back to that (at least the mdadm version) and
> try the --create --assume-clean with that mdadm version. Several key factors
> have changed between versions.

I would not do a --create --assume-clean on your array until you have
exhausted all other options. At 40TB this sounds like a very large
array. With raid5 disk ordering among other things is VERY important.


>
> Look at this thread for a person with similar problem
> <http://www.spinics.net/lists/raid/msg41732.html>, there is discussion about
> data offset etc in there, might be a good start.
>
> --
> Mikael Abrahamsson    email: swmike@swm.pp.se
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



--
Dave Cundiff
System Administrator
A2Hosting, Inc
http://www.a2hosting.com

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Possible to rescue SW Raid5 with 2 missing Disks
  2013-02-14 15:01 Dragon
@ 2013-02-14 17:09 ` Mikael Abrahamsson
  2013-02-14 17:18   ` Dave Cundiff
  0 siblings, 1 reply; 33+ messages in thread
From: Mikael Abrahamsson @ 2013-02-14 17:09 UTC (permalink / raw)
  To: Dragon; +Cc: linux-raid

On Thu, 14 Feb 2013, Dragon wrote:

> No, I don't have an old version of the examine output from before ;(

Do you know what version of kernel and mdadm was used to create the raid5 
in the first place? You should go back to that (at least the mdadm 
version) and try the --create --assume-clean with that mdadm version. 
Several key factors have changed between versions.

Look at this thread for a person with similar problem 
<http://www.spinics.net/lists/raid/msg41732.html>, there is discussion 
about data offset etc in there, might be a good start.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Possible to rescue SW Raid5 with 2 missing Disks
@ 2013-02-14 15:01 Dragon
  2013-02-14 17:09 ` Mikael Abrahamsson
  0 siblings, 1 reply; 33+ messages in thread
From: Dragon @ 2013-02-14 15:01 UTC (permalink / raw)
  To: linux-raid

Hi Mikael,

Yes, I searched a bit, but I am not sure if I can use this, because a failure while trying something could cause more problems than I already have. I also don't have a backup for 40TB ;(. Therefore someone with experience should help - creating and troubleshooting are two different pairs of shoes ;)

No, I don't have an old version of the examine output from before ;(

sunny

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Possible to rescue SW Raid5 with 2 missing Disks
  2013-02-14 14:31 Dragon
@ 2013-02-14 14:39 ` Mikael Abrahamsson
  0 siblings, 0 replies; 33+ messages in thread
From: Mikael Abrahamsson @ 2013-02-14 14:39 UTC (permalink / raw)
  To: Dragon; +Cc: linux-raid

On Thu, 14 Feb 2013, Dragon wrote:

> Since nobody is answering, I assume that there is no better way than
> what Phil suggested - or am I wrong?

Have you looked through the recent archives for all the other threads 
about people prematurely doing --create, and the factors that affect this? 
(data offset, order, superblock version etc)

Please, if you have mdadm --examine output from before you destroyed the
original superblocks, that might help.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Possible to rescue SW Raid5 with 2 missing Disks
@ 2013-02-14 14:31 Dragon
  2013-02-14 14:39 ` Mikael Abrahamsson
  0 siblings, 1 reply; 33+ messages in thread
From: Dragon @ 2013-02-14 14:31 UTC (permalink / raw)
  To: linux-raid

Hello,

Since nobody is answering, I assume that there is no better way than what Phil suggested - or am I wrong?

thx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Possible to rescue SW Raid5 with 2 missing Disks
@ 2013-02-10 21:27 Dragon
  0 siblings, 0 replies; 33+ messages in thread
From: Dragon @ 2013-02-10 21:27 UTC (permalink / raw)
  To: linux-raid

Hello,

I want to ask again for more help; it is important for me and very urgent.
I think the MooseFS cluster doesn't play a role, because it is only a file and directory structure on top, not a real filesystem - the real filesystem is ext4. I can't believe that a 4-second timeout of a RAID5 creates such damage. I had no file write jobs running at the time. Please help.


Many thx.
sunny


-------- Original Message --------
Date: Fri, 08 Feb 2013 10:17:00 +0100
From: "Dragon" <Sunghost@gmx.de>
To: linux-raid@vger.kernel.org
Subject: Possible to rescue SW Raid5 with 2 missing Disks

Hello,

My situation is this: I have 3 servers, each with 6x3TB in a software RAID5, on a Debian Squeeze system. On top of this I use MooseFS as a file cluster. The problem now is that for a very short time 2 of the disks failed. At that time there was no file activity, only the file sync between the nodes of the cluster.

After that I cannot assemble the raid again: "mdadm: /dev/md2 assembled from 4 drives and 1 spare - not enough to start the array."

The superblocks seem to be OK, and finally, on advice from Phil Turmel, I tried different re-creations with these results:

Test1:
mdadm --create --level 5 -n 6 --chunk=512 --assume-clean /dev/md2 /dev/sd{a,b,c,d,e,f}4
-> filesize 708MB with 20603326 lines and canceling at the end by e2fsck
- bad superblock, or the partition table is damaged
- bad group descriptor checksums
- lots of invalid inodes
- canceled with lots of illegal blocks in inodes

Test2:
mdadm --create --level 5 -n 6 --chunk=512 --assume-clean /dev/md2 /dev/sd{a,b,c,d,e}4 missing
-> filesize 1.3GB with 37614367 lines and canceling by e2fsck at the end
- back to the original superblock
- bad superblock or damaged partition table at the beginning
- lots of invalid inodes
- canceled during inode iteration

Test3:
mdadm --create --level 5 -n 6 --chunk=512 --assume-clean /dev/md2 /dev/sd{a,b,c}4 missing /dev/sd{e,f}4
-> filesize 1.4GB with 40745425 lines and canceling by e2fsck at the end
- errors as in test 2
- read error while reading the next inode

Test4:
mdadm --create --level 5 -n 6 --chunk=512 --assume-clean /dev/md2 /dev/sd{a,b,c,f,e,d}4
-> filesize 874MB with 25412000 lines and abort by e2fsck at the end
- tries the original superblock
- bad superblock or damaged partition table
- then lots of invalid group descriptor checksums
- at the end, illegal blocks in inodes / too many invalid blocks in inodes

Test5:
mdadm --create --level 5 -n 6 --chunk=512 --assume-clean /dev/md2 /dev/sd{a,b,c}4 missing /dev/sd{e,d}4
-> filesize 1.6GB with 45673505 lines and canceling at the end by e2fsck

Test6:
mdadm --create --level 5 -n 6 --chunk=512 --assume-clean /dev/md2 /dev/sd{a,b,c,f,e}4 missing
- tries the original superblock
- bad superblock or damaged partition table
- lots of checksum errors in group descriptors
- ends with the inode table conflicting with another filesystem block
-> filesize 542MB with 15727702 lines and canceling at the end by e2fsck

Test6 looks like the best one, but what do you think, and what else could I do? Any further help?

Many thanks
Sunny



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Possible to rescue SW Raid5 with 2 missing Disks
@ 2013-02-08  9:17 Dragon
  0 siblings, 0 replies; 33+ messages in thread
From: Dragon @ 2013-02-08  9:17 UTC (permalink / raw)
  To: linux-raid

Hello,

My situation is this: I have 3 servers, each with 6x3TB in a software RAID5, on a Debian Squeeze system. On top of this I use MooseFS as a file cluster. The problem now is that for a very short time 2 of the disks failed. At that time there was no file activity, only the file sync between the nodes of the cluster.

After that I cannot assemble the raid again: "mdadm: /dev/md2 assembled from 4 drives and 1 spare - not enough to start the array."

The superblocks seem to be OK, and finally, on advice from Phil Turmel, I tried different re-creations with these results:

Test1:
mdadm --create --level 5 -n 6 --chunk=512 --assume-clean /dev/md2 /dev/sd{a,b,c,d,e,f}4
-> filesize 708MB with 20603326 lines and canceling at the end by e2fsck
- bad superblock, or the partition table is damaged
- bad group descriptor checksums
- lots of invalid inodes
- canceled with lots of illegal blocks in inodes

Test2:
mdadm --create --level 5 -n 6 --chunk=512 --assume-clean /dev/md2 /dev/sd{a,b,c,d,e}4 missing
-> filesize 1.3GB with 37614367 lines and canceling by e2fsck at the end
- back to the original superblock
- bad superblock or damaged partition table at the beginning
- lots of invalid inodes
- canceled during inode iteration

Test3:
mdadm --create --level 5 -n 6 --chunk=512 --assume-clean /dev/md2 /dev/sd{a,b,c}4 missing /dev/sd{e,f}4
-> filesize 1.4GB with 40745425 lines and canceling by e2fsck at the end
- errors as in test 2
- read error while reading the next inode

Test4:
mdadm --create --level 5 -n 6 --chunk=512 --assume-clean /dev/md2 /dev/sd{a,b,c,f,e,d}4
-> filesize 874MB with 25412000 lines and abort by e2fsck at the end
- tries the original superblock
- bad superblock or damaged partition table
- then lots of invalid group descriptor checksums
- at the end, illegal blocks in inodes / too many invalid blocks in inodes

Test5:
mdadm --create --level 5 -n 6 --chunk=512 --assume-clean /dev/md2 /dev/sd{a,b,c}4 missing /dev/sd{e,d}4
-> filesize 1.6GB with 45673505 lines and canceling at the end by e2fsck

Test6:
mdadm --create --level 5 -n 6 --chunk=512 --assume-clean /dev/md2 /dev/sd{a,b,c,f,e}4 missing
- tries the original superblock
- bad superblock or damaged partition table
- lots of checksum errors in group descriptors
- ends with the inode table conflicting with another filesystem block
-> filesize 542MB with 15727702 lines and canceling at the end by e2fsck

Test6 looks like the best one, but what do you think, and what else could I do? Any further help?

Many thanks
Sunny


^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2013-03-03 23:09 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-19 15:19 Possible to rescue SW Raid5 with 2 missing Disks Dragon
2013-02-19 17:48 ` Phil Turmel
2013-02-19 18:32   ` Roy Sigurd Karlsbakk
  -- strict thread matches above, loose matches on Subject: below --
2013-03-03 23:09 Dragon
2013-03-03 21:59 Dragon
2013-02-27  7:14 Dragon
2013-02-21 15:36 Dragon
2013-02-20  8:54 Dragon
2013-02-20  7:44 Dragon
2013-02-20  8:06 ` Mikael Abrahamsson
2013-02-19 20:36 Dragon
2013-02-18 12:13 Dragon
2013-02-15 15:41 Dragon
2013-02-16  5:06 ` Mikael Abrahamsson
2013-02-15 12:59 Dragon
2013-02-15 14:51 ` Mikael Abrahamsson
2013-02-15  9:46 Dragon
2013-02-14 21:39 Dragon
2013-02-15  9:57 ` Mikael Abrahamsson
2013-02-15 11:14   ` Brad Campbell
2013-02-15 11:23     ` Mikael Abrahamsson
2013-02-15 12:12       ` Brad Campbell
2013-02-15 12:34         ` Mikael Abrahamsson
2013-02-14 21:00 Dragon
2013-02-14 21:11 ` Robin Hill
2013-02-15  0:59 ` Dave Cundiff
2013-02-14 15:01 Dragon
2013-02-14 17:09 ` Mikael Abrahamsson
2013-02-14 17:18   ` Dave Cundiff
2013-02-14 14:31 Dragon
2013-02-14 14:39 ` Mikael Abrahamsson
2013-02-10 21:27 Dragon
2013-02-08  9:17 Dragon
