* Self inflicted reshape catastrophe
@ 2021-01-14 22:57 Nathan Brown
  2021-01-15 16:21 ` antlists
  2021-01-18  0:49 ` NeilBrown
  0 siblings, 2 replies; 6+ messages in thread
From: Nathan Brown @ 2021-01-14 22:57 UTC
  To: linux-raid

Scenario:

I had a 4 by 10TB raid 5 array and was adding 5 more disks and
reshaping it to a raid6. This was working just fine until I got a
little too aggressive with perf tuning and caused `mdadm` to
completely hang. I froze the rebuild and rebooted the server to wipe
away my tuning mess. The raid didn't automatically assemble, so I ran
`mdadm --assemble` but really screwed up and put the 5 new disks in a
different array. I'm not sure why the superblocks on those disks didn't
stop `mdadm` from putting them into service, but the end result was that
the superblocks on those 5 new drives got wiped. That array was missing
a disk, so 4 went in as spares and 1 went into service. I let that
rebuild complete, as I figured I'd likely already lost any usable data
there.

I now have 4 disks with proper looking superblocks, 4 disks with
garbage superblocks, and 1 disk sitting in an array that it shouldn't
be in. My primary concern is on assembling the 10TB disk array.

What I've done so far:

All this is done with an overlay to avoid modifying the disks any further.
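
For readers unfamiliar with the overlay approach: one common way to do
this is a device-mapper snapshot per member disk, so that every write
lands in a scratch copy-on-write device instead of on the disk itself.
The post does not say how the overlays were built, so the following is
only a minimal sketch under that assumption. The function name, device
paths, and sector count are hypothetical. It only builds the table line
that would be handed to `dmsetup create <name> --table "..."` and does
not touch any device.

```python
# Sketch only: builds the device-mapper table line for a snapshot overlay.
# All device paths and sizes used here are hypothetical placeholders.

def snapshot_table(origin: str, cow: str, size_sectors: int,
                   chunk_sectors: int = 8) -> str:
    """Return a dm 'snapshot' target line.

    Reads are served from `origin`; writes are redirected to `cow`.
    `size_sectors` is the origin size in 512-byte sectors
    (for example as reported by `blockdev --getsz`).
    """
    if size_sectors <= 0:
        raise ValueError("size_sectors must be positive")
    if chunk_sectors <= 0:
        raise ValueError("chunk_sectors must be positive")
    # Format: <start> <length> snapshot <origin> <COW device> <P|N> <chunk size>
    # "P" requests a persistent snapshot.
    return f"0 {size_sectors} snapshot {origin} {cow} P {chunk_sectors}"


if __name__ == "__main__":
    # Hypothetical member disk and loop device backing a sparse overlay file.
    print(snapshot_table("/dev/sdX1", "/dev/loop0", 1000))
```

Every `mdadm` experiment is then pointed at the resulting `/dev/mapper`
devices rather than at the raw partitions, which fits the `/dev/dm-*`
members that appear in the `--detail` output further down.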

`mdadm --assemble` with all disks provided: it refuses to start when it
hits the first of the new drives ("superblock on ... doesn't match
others"), and `--force` has no effect. `--update=revert-reshape` changes
the `--examine` details, but nothing happens since the other 5 drives
are absent.

`mdadm --assemble` again with all disks, but with the new disks'
superblocks zeroed: it refuses once it hits the first of the new disks
("No super block found on ..."), and `--force` has no effect.

`mdadm --assemble` using only the 4 original disks: the md dev shows
up now but can't start. If I try to add any of the new disks I get
"Cannot get array info for /dev/md#"; `--force` and superblock
zeroing have no effect.

`mdadm --create` using all permutations of the new drives; I believe I
know the order of the old ones. A handful of the 120 different
arrangements will allow me to see some of the files, but I do not know
how to move the reshape along in this state. Please note that 1 disk
position is using `missing`.
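
The figure of 120 matches 5! = 120: four surviving new disks plus the
literal `missing` placeholder, arranged over five slots. As a rough
sketch of that enumeration (not the actual commands used), the
candidate orders can be generated as below. The device labels are
hypothetical placeholders, and the sketch assumes the four original
disks occupy roles 0 to 3, as their `Device Role` lines in the details
below suggest.

```python
from itertools import permutations

# Hypothetical placeholder labels; the real device names for the new
# disks are not given in the post.
OLD_ORDER = ["old0", "old1", "old2", "old3"]   # order believed to be known
UNKNOWN = ["new_a", "new_b", "new_c", "new_d", "missing"]


def candidate_orders(known, unknown):
    """Yield every full device order: the known prefix followed by
    each permutation of the devices whose position is unknown."""
    for tail in permutations(unknown):
        yield list(known) + list(tail)


orders = list(candidate_orders(OLD_ORDER, UNKNOWN))
print(len(orders))  # 120
```

Each of the 120 lists is one candidate device order to try against the
overlays.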

I believe my next best bet is to try to create appropriate superblocks
and write them to each of the new disks to see if the array will
assemble and continue the reshape. I wanted to get this list's
suggestions before I went down that path.

Thank you for your time.

Details:

`mdadm --version`
mdadm - v4.1 - 2018-10-01

`lsb_release -a`
Distributor ID: Ubuntu
Description: Ubuntu 20.04.1 LTS
Release: 20.04
Codename: focal

`uname -a`
Linux nas2 5.4.0-60-generic #67-Ubuntu SMP Tue Jan 5 18:31:36 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

`mdadm -E /dev/sdk1`
/dev/sdk1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x5
     Array UUID : a6914f4a:14a64337:c3546c24:42930ff9
           Name : any:0
  Creation Time : Mon Dec 23 22:56:41 2019
     Raid Level : raid6
   Raid Devices : 9
 Avail Dev Size : 19532605440 (9313.87 GiB 10000.69 GB)
     Array Size : 68364119040 (65197.10 GiB 70004.86 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=0 sectors
          State : clean
    Device UUID : a247b8d7:6abdf354:8ca03a82:8681cf54
Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 1922360832 (1833.31 GiB 1968.50 GB)
  Delta Devices : 4 (5->9)
     New Layout : left-symmetric
    Update Time : Thu Jan 14 02:02:24 2021
  Bad Block Log : 512 entries available at offset 48 sectors
       Checksum : 4229db98 - correct
         Events : 146894
         Layout : left-symmetric-6
     Chunk Size : 512K
   Device Role : Active device 0
   Array State : AAAA..A.A ('A' == active, '.' == missing, 'R' == replacing)

`mdadm -E /dev/sdj1`
/dev/sdj1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x5
     Array UUID : a6914f4a:14a64337:c3546c24:42930ff9
           Name : any:0
  Creation Time : Mon Dec 23 22:56:41 2019
     Raid Level : raid6
   Raid Devices : 9
 Avail Dev Size : 19532605440 (9313.87 GiB 10000.69 GB)
     Array Size : 68364119040 (65197.10 GiB 70004.86 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=0 sectors
          State : clean
    Device UUID : 218773e0:f097e26a:10eb2032:8b0c5f2a
Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 1922360832 (1833.31 GiB 1968.50 GB)
  Delta Devices : 4 (5->9)
     New Layout : left-symmetric
    Update Time : Thu Jan 14 02:02:24 2021
  Bad Block Log : 512 entries available at offset 48 sectors
       Checksum : e64ccb33 - correct
         Events : 146894
         Layout : left-symmetric-6
     Chunk Size : 512K
   Device Role : Active device 1
   Array State : AAAAAAAAA ('A' == active, '.' == missing, 'R' == replacing)

`mdadm -E /dev/sdh1`
/dev/sdh1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x5
     Array UUID : a6914f4a:14a64337:c3546c24:42930ff9
           Name : any:0
  Creation Time : Mon Dec 23 22:56:41 2019
     Raid Level : raid6
   Raid Devices : 9
 Avail Dev Size : 19532605440 (9313.87 GiB 10000.69 GB)
     Array Size : 68364119040 (65197.10 GiB 70004.86 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=0 sectors
          State : clean
    Device UUID : e8062d92:654dc1e0:4e28b361:eb97ccc2
Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 1922360832 (1833.31 GiB 1968.50 GB)
  Delta Devices : 4 (5->9)
     New Layout : left-symmetric
    Update Time : Thu Jan 14 02:02:24 2021
  Bad Block Log : 512 entries available at offset 48 sectors
       Checksum : d5e4c90f - correct
         Events : 146894
         Layout : left-symmetric-6
     Chunk Size : 512K
   Device Role : Active device 2
   Array State : AAAAAAAAA ('A' == active, '.' == missing, 'R' == replacing)

`mdadm -E /dev/sdi1`
/dev/sdi1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x5
     Array UUID : a6914f4a:14a64337:c3546c24:42930ff9
           Name : any:0
  Creation Time : Mon Dec 23 22:56:41 2019
     Raid Level : raid6
   Raid Devices : 9
 Avail Dev Size : 19532605440 (9313.87 GiB 10000.69 GB)
     Array Size : 68364119040 (65197.10 GiB 70004.86 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=0 sectors
          State : clean
    Device UUID : f0612be8:dcf9d96b:1926ce52:484d9ab2
Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 1922360832 (1833.31 GiB 1968.50 GB)
  Delta Devices : 4 (5->9)
     New Layout : left-symmetric
    Update Time : Thu Jan 14 02:02:24 2021
  Bad Block Log : 512 entries available at offset 48 sectors
       Checksum : 97e483b8 - correct
         Events : 146894
         Layout : left-symmetric-6
     Chunk Size : 512K
   Device Role : Active device 3
   Array State : AAAAAAAAA ('A' == active, '.' == missing, 'R' == replacing)
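
The size fields in the four `-E` dumps above can be cross-checked
against each other. The sketch below is only an arithmetic sanity
check. It assumes that `Avail Dev Size` is in 512-byte sectors and that
`Array Size` and `Reshape pos'n` are in KiB (the output does not label
the units, but the GiB figures in parentheses agree with that reading).
It further assumes that the reshape position is an offset into the
array's address space. The helper names are made up for illustration.

```python
SECTOR_BYTES = 512  # assumed unit of "Avail Dev Size"


def sectors_to_kib(sectors: int) -> int:
    """Convert a count of 512-byte sectors to KiB."""
    return sectors * SECTOR_BYTES // 1024


def kib_to_gib(kib: int) -> float:
    """Convert KiB to GiB."""
    return kib / 2**20


# Values copied from the `mdadm -E` output above.
avail_dev_sectors = 19532605440   # Avail Dev Size
array_size_kib = 68364119040      # Array Size (assumed KiB)
reshape_pos_kib = 1922360832      # Reshape pos'n (assumed KiB)
new_raid_devices = 9              # Raid Devices
old_raid_devices = 5              # from "Delta Devices : 4 (5->9)"

# RAID 6 spends two devices' worth of space on parity.
new_data_disks = new_raid_devices - 2
old_data_disks = old_raid_devices - 2

dev_kib = sectors_to_kib(avail_dev_sectors)

# The reported array size should equal data disks times per-device size.
assert new_data_disks * dev_kib == array_size_kib

print(f"per-device size : {kib_to_gib(dev_kib):.2f} GiB")          # 9313.87
print(f"reshape position: {kib_to_gib(reshape_pos_kib):.2f} GiB")  # 1833.31

# Assumption: position is measured in array address space, so this is
# the share of the pre-reshape capacity that had already been rewritten.
old_array_kib = old_data_disks * dev_kib
print(f"share of old array passed: {reshape_pos_kib / old_array_kib:.2%}")
```

The assertion passing means the reported array size is exactly seven
data disks' worth of the per-device size, which is consistent with a
9-device raid6.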

If I assemble with only the 4 disks that have proper superblocks, I get:
`mdadm --detail /dev/md0`
/dev/md0:
           Version : 1.2
        Raid Level : raid0
     Total Devices : 4
       Persistence : Superblock is persistent
             State : inactive
   Working Devices : 4
     Delta Devices : 4, (-4->0)
         New Level : raid6
        New Layout : left-symmetric
     New Chunksize : 512K
              Name : any:0
              UUID : a6914f4a:14a64337:c3546c24:42930ff9
            Events : 146894
    Number   Major   Minor   RaidDevice
       -     253        7        -        /dev/dm-7
       -     253        5        -        /dev/dm-5
       -     253        6        -        /dev/dm-6
       -     253        4        -        /dev/dm-4
