* Need help with recovery
@ 2010-07-14 14:37 Jan Ceuleers
  2010-07-15  5:37 ` Neil Brown
From: Jan Ceuleers @ 2010-07-14 14:37 UTC (permalink / raw)
  To: linux-raid


Hi guys.

I've been having trouble with a four-disk raid5 array, which I now believe was caused by a flaky eSATA cable. The cable has been replaced, and the controller now seems to see all four disks reliably.

In fact I have five disks, and so a partition on the fifth disk acts as a spare. Over the past several days I've been connecting and disconnecting drives in ways that I didn't realise were unreliable (due to the faulty cable).

This has resulted in the md metadata being out of sync on the different partitions. Please could someone provide advice as to how to recover this array (which contains an ext3 filesystem)?


Many thanks!

Jan

[-- Attachment #2: mdadm-examine.txt --]
[-- Type: text/plain, Size: 5198 bytes --]

/dev/sda2:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 714e46f1:479268a7:895e209c:936fa570
  Creation Time : Fri Jun 22 20:56:51 2007
     Raid Level : raid5
  Used Dev Size : 311588608 (297.15 GiB 319.07 GB)
     Array Size : 934765824 (891.46 GiB 957.20 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 2

    Update Time : Wed Jul 14 13:19:56 2010
          State : active
 Active Devices : 3
Working Devices : 4
 Failed Devices : 1
  Spare Devices : 1
       Checksum : 242755b4 - correct
         Events : 24565

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8        2        1      active sync   /dev/sda2

   0     0       8       66        0      active sync   /dev/sde2
   1     1       8        2        1      active sync   /dev/sda2
   2     2       8       18        2      active sync   /dev/sdb2
   3     3       0        0        3      faulty removed
   4     4       8       34        4      spare   /dev/sdc2
/dev/sdb2:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 714e46f1:479268a7:895e209c:936fa570
  Creation Time : Fri Jun 22 20:56:51 2007
     Raid Level : raid5
  Used Dev Size : 311588608 (297.15 GiB 319.07 GB)
     Array Size : 934765824 (891.46 GiB 957.20 GB)
   Raid Devices : 4
  Total Devices : 5
Preferred Minor : 2

    Update Time : Tue Jul 13 03:47:51 2010
          State : active
 Active Devices : 4
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 1
       Checksum : 2425708a - correct
         Events : 21017

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8        2        0      active sync   /dev/sda2

   0     0       8        2        0      active sync   /dev/sda2
   1     1       8       18        1      active sync   /dev/sdb2
   2     2       8       34        2      active sync   /dev/sdc2
   3     3       8       50        3      active sync   /dev/sdd2
   4     4       8       82        4      spare   /dev/sdf2
/dev/sdc2:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 714e46f1:479268a7:895e209c:936fa570
  Creation Time : Fri Jun 22 20:56:51 2007
     Raid Level : raid5
  Used Dev Size : 311588608 (297.15 GiB 319.07 GB)
     Array Size : 934765824 (891.46 GiB 957.20 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 2

    Update Time : Wed Jul 14 13:20:13 2010
          State : clean
 Active Devices : 2
Working Devices : 3
 Failed Devices : 2
  Spare Devices : 1
       Checksum : 2427b5fa - correct
         Events : 24574

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     4       8       34        4      spare   /dev/sdc2

   0     0       8       66        0      active sync   /dev/sde2
   1     1       0        0        1      faulty removed
   2     2       8       18        2      active sync   /dev/sdb2
   3     3       0        0        3      faulty removed
   4     4       8       34        4      spare   /dev/sdc2
/dev/sdd2:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 714e46f1:479268a7:895e209c:936fa570
  Creation Time : Fri Jun 22 20:56:51 2007
     Raid Level : raid5
  Used Dev Size : 311588608 (297.15 GiB 319.07 GB)
     Array Size : 934765824 (891.46 GiB 957.20 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 2

    Update Time : Wed Jul 14 13:20:13 2010
          State : clean
 Active Devices : 2
Working Devices : 3
 Failed Devices : 2
  Spare Devices : 1
       Checksum : 2427b5ec - correct
         Events : 24574

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8       18        2      active sync   /dev/sdb2

   0     0       8       66        0      active sync   /dev/sde2
   1     1       0        0        1      faulty removed
   2     2       8       18        2      active sync   /dev/sdb2
   3     3       0        0        3      faulty removed
   4     4       8       34        4      spare   /dev/sdc2
/dev/sde2:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 714e46f1:479268a7:895e209c:936fa570
  Creation Time : Fri Jun 22 20:56:51 2007
     Raid Level : raid5
  Used Dev Size : 311588608 (297.15 GiB 319.07 GB)
     Array Size : 934765824 (891.46 GiB 957.20 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 2

    Update Time : Wed Jul 14 13:20:13 2010
          State : clean
 Active Devices : 2
Working Devices : 3
 Failed Devices : 2
  Spare Devices : 1
       Checksum : 2427b618 - correct
         Events : 24574

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       66        0      active sync   /dev/sde2

   0     0       8       66        0      active sync   /dev/sde2
   1     1       0        0        1      faulty removed
   2     2       8       18        2      active sync   /dev/sdb2
   3     3       0        0        3      faulty removed
   4     4       8       34        4      spare   /dev/sdc2


* Re: Need help with recovery
  2010-07-14 14:37 Need help with recovery Jan Ceuleers
@ 2010-07-15  5:37 ` Neil Brown
  2010-07-15  9:50   ` Ceuleers, Jan (Jan)
From: Neil Brown @ 2010-07-15  5:37 UTC (permalink / raw)
  To: Jan Ceuleers; +Cc: linux-raid

On Wed, 14 Jul 2010 16:37:09 +0200
Jan Ceuleers <jan.ceuleers@computer.org> wrote:

> Hi guys.
> 
> I've been having trouble with a four-disk raid5 array, which I now believe was caused by a flaky eSATA cable. The cable has been replaced, and the controller now seems to see all four disks reliably.
> 
> In fact I have five disks, and so a partition on the fifth disk acts as a spare. Over the past several days I've been connecting and disconnecting drives in ways that I didn't realise were unreliable (due to the faulty cable).
> 
> This has resulted in the md metadata being out of sync on the different partitions. Please could someone provide advice as to how to recover this array (which contains an ext3 filesystem)?
> 
> 
> Many thanks!
> 
> Jan


mdadm -A /dev/md2 --force /dev/sd[ade]2

sdc2 thinks it is a spare, so you don't want it just now.
sdb2 and sde2 both think they are 'device 0', but sde2 is clearly newer.

So run the above mdadm command (assuming none of the devices have been
renamed again since you created the --examine output), then
  fsck -n /dev/md2

and make sure it is happy.

If it is, you can add sdb2 and sdc2 back as spares.
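
A minimal sketch of that last step (assuming the array has been assembled
as /dev/md2 and the device names have not changed since then):

  mdadm /dev/md2 --add /dev/sdb2 /dev/sdc2   # re-add the leftover devices; one is rebuilt into the array, the other stays as a spare
  cat /proc/mdstat                           # watch the rebuild progress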

NeilBrown


* RE: Need help with recovery
  2010-07-15  5:37 ` Neil Brown
@ 2010-07-15  9:50   ` Ceuleers, Jan (Jan)
  2010-07-15 13:40     ` Neil Brown
From: Ceuleers, Jan (Jan) @ 2010-07-15  9:50 UTC (permalink / raw)
  To: linux-raid

Neil,

> mdadm -A /dev/md2 --force /dev/sd[ade]2

Many thanks for your feedback. I will try this when I get home.

So if I boil this down to its principles, your advice is to assemble the array in degraded mode, using the components that have the most recent update timestamps. Degraded, because this prevents resynchronisation.
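
For my own notes, a quick sketch for seeing which components carry the freshest metadata (based on the 0.90 superblocks in my earlier --examine output; device names assumed unchanged):

  mdadm --examine /dev/sd[a-e]2 | grep -E '^/dev|Update Time|Events'   # highest event count / latest update time wins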

Thanks, Jan



* Re: Need help with recovery
  2010-07-15  9:50   ` Ceuleers, Jan (Jan)
@ 2010-07-15 13:40     ` Neil Brown
  2010-07-15 19:27       ` Jan Ceuleers
From: Neil Brown @ 2010-07-15 13:40 UTC (permalink / raw)
  To: Ceuleers, Jan (Jan); +Cc: linux-raid

On Thu, 15 Jul 2010 11:50:56 +0200
"Ceuleers, Jan (Jan)" <jan.ceuleers@alcatel-lucent.com> wrote:

> Neil,
> 
> > mdadm -A /dev/md2 --force /dev/sd[ade]2
> 
> Many thanks for your feedback. I will try this when I get home.
> 
> So if I boil this down to the principles, your advice is to assemble
> the array in degraded mode using the components that have the most
> recent update timestamps. Degraded, because this prevents
> resynchronisation.
>
It is more a case of "degraded because that uses fewer devices, and
so fewer old devices which could introduce errors", though avoiding
resync is a good thing too.

The key bit is "--force", which tells mdadm to assemble the array anyway
even if something looks out of date.  mdadm will include just
enough devices to make the array work.
You don't really need to be selective about the devices you
choose. If you give "mdadm -Af" all of the devices, it will just
pick the ones that are best.  I listed them only to make it
clearer what was happening.
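
For example, the unselective form would be something like this (a sketch,
assuming the same /dev/md2 and device names as above):

  mdadm -Af /dev/md2 /dev/sd[a-e]2   # forced assembly; mdadm picks the best set of members itself
  cat /proc/mdstat                   # check that the array came up (degraded is expected here)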

NeilBrown


* Re: Need help with recovery
  2010-07-15 13:40     ` Neil Brown
@ 2010-07-15 19:27       ` Jan Ceuleers
From: Jan Ceuleers @ 2010-07-15 19:27 UTC (permalink / raw)
  To: linux-raid

Neil,

On 15/07/10 15:40, Neil Brown wrote:
>>> mdadm -A /dev/md2 --force /dev/sd[ade]2

Your advice has worked a treat. Thank you very much; my server is back up again.

Thanks, Jan

