* Raid-6 cannot reshape
@ 2020-04-01 10:16 Alexander Shenkin
[not found] ` <6b9b6d37-6325-6515-f693-0ff3b641a67a@shenkin.org>
0 siblings, 1 reply; 13+ messages in thread
From: Alexander Shenkin @ 2020-04-01 10:16 UTC (permalink / raw)
To: Linux-RAID
[-- Attachment #1: Type: text/plain, Size: 1525 bytes --]
Hi all,
I had a problem that caused my Ubuntu Server 14 machine to go down, and it now
will not boot. I added a drive to my raid1 (/dev/md0 -> /boot) + raid6
(/dev/md2 -> /) setup. I was previously running with /dev/md0 (raid1)
assembled from /dev/sd[a-f]1, and /dev/md2 (raid6) assembled from
/dev/sd[a-f]3. Both partitions from the new drive were successfully
added to the two md arrays, and the raid1 (the smaller of the two)
seemed to resync successfully (I think resync is the right word here,
meaning that mdadm spreads the data out amongst the drives once an
array has been grown). When resyncing the larger raid6, however, the
sync speed was quite slow (KB/s), and got slower and slower (7 KB/s...
5 KB/s... 3 KB/s... 1 KB/s...) until the system halted entirely and I
eventually turned the power off. It now will not boot.
Thanks to Roger Heflin's help, I've booted into a live-CD environment
(Ubuntu Server 18) with all the necessary RAID personalities available.
When trying to --assemble md127 (raid6), I get the following error:
"Failed to restore critical section for reshape, sorry. Possibly you
needed to specify the --backup-file". Needless to say, I didn't save a
backup file when adding the drive.
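For anyone hitting the same message: mdadm has an escape hatch for a missing backup file. The following is only a sketch (the device list mirrors the members described in this thread, and the backup-file path is a placeholder); any such attempt should be made against overlays or with expert guidance, since a wrong step can destroy data:

```shell
# Stop the half-assembled, inactive array first.
mdadm --stop /dev/md127

# --invalid-backup tells mdadm the backup file is known to be missing or
# corrupt, so assembly should proceed without restoring the critical section.
# The --backup-file path here is just a placeholder, not from this thread.
mdadm --assemble /dev/md127 --force --invalid-backup \
      --backup-file=/tmp/md127-backup /dev/sd[a-g]3
```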
I have attached some diagnostic output here. It seems that my raid6
array is not being recognized as such. I'm not sure what the next steps
are - do I need to figure out how to get the resync up and running
again? Any help would be greatly appreciated.
Many thanks,
Allie
[-- Attachment #2: mdstat.txt --]
[-- Type: text/plain, Size: 395 bytes --]
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md127 : inactive sdd3[5](S) sdf3[8](S) sdc3[2](S) sdg3[9](S) sde3[7](S) sdb3[4](S) sda3[6](S)
20441322496 blocks super 1.2
md126 : active (auto-read-only) raid1 sdf1[8] sde1[7] sdg1[9] sda1[6] sdd1[5] sdc1[2] sdb1[4]
1950656 blocks super 1.2 [7/7] [UUUUUUU]
unused devices: <none>
[-- Attachment #3: fdisk.txt --]
[-- Type: text/plain, Size: 6695 bytes --]
Disk /dev/loop0: 265.3 MiB, 278147072 bytes, 543256 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/loop1: 70.1 MiB, 73531392 bytes, 143616 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/loop2: 49.7 MiB, 52060160 bytes, 101680 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/loop3: 36 MiB, 37707776 bytes, 73648 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/loop4: 27.4 MiB, 28717056 bytes, 56088 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/loop5: 89.1 MiB, 93417472 bytes, 182456 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/loop6: 51.9 MiB, 54358016 bytes, 106168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/loop7: 51.9 MiB, 54407168 bytes, 106264 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/sda: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: D607CD2D-05F4-42C6-914C-B99CE56E934F
Device Start End Sectors Size Type
/dev/sda1 2048 3905535 3903488 1.9G Linux RAID
/dev/sda2 3905536 3907583 2048 1M BIOS boot
/dev/sda3 3907584 5844547583 5840640000 2.7T Linux RAID
/dev/sda4 5844547584 5860532223 15984640 7.6G Linux filesystem
Disk /dev/sdb: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: D96D7513-3D74-435B-8B2F-26C2A32B0586
Device Start End Sectors Size Type
/dev/sdb1 2048 3905535 3903488 1.9G Linux RAID
/dev/sdb2 3905536 3907583 2048 1M BIOS boot
/dev/sdb3 3907584 5844547583 5840640000 2.7T Linux RAID
/dev/sdb4 5844547584 5860532223 15984640 7.6G Linux filesystem
Disk /dev/sdc: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 4B356AFA-8F48-4227-86F0-329565146D7A
Device Start End Sectors Size Type
/dev/sdc1 2048 3905535 3903488 1.9G Linux RAID
/dev/sdc2 3905536 3907583 2048 1M BIOS boot
/dev/sdc3 3907584 5844547583 5840640000 2.7T Linux RAID
/dev/sdc4 5844547584 5860532223 15984640 7.6G Linux filesystem
Disk /dev/sdd: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: D4D1E2EB-8520-4B3E-8263-AB5B5BD2D7E2
Device Start End Sectors Size Type
/dev/sdd1 2048 3905535 3903488 1.9G Linux RAID
/dev/sdd2 3905536 3907583 2048 1M BIOS boot
/dev/sdd3 3907584 5844547583 5840640000 2.7T Linux RAID
/dev/sdd4 5844547584 5860532223 15984640 7.6G Linux filesystem
Disk /dev/sde: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 52F044AB-9D50-4363-BF60-651A87159A17
Device Start End Sectors Size Type
/dev/sde1 2048 3905535 3903488 1.9G Linux RAID
/dev/sde2 3905536 3907583 2048 1M BIOS boot
/dev/sde3 3907584 5844547583 5840640000 2.7T Linux RAID
/dev/sde4 5844547584 5860532223 15984640 7.6G Linux filesystem
Disk /dev/sdg: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: BE2DDABE-9106-4436-8ABC-6A919EBFB2E6
Device Start End Sectors Size Type
/dev/sdg1 2048 3905535 3903488 1.9G Linux RAID
/dev/sdg2 3905536 3907583 2048 1M BIOS boot
/dev/sdg3 3907584 5844547583 5840640000 2.7T Linux RAID
/dev/sdg4 5844547584 5860532223 15984640 7.6G Linux filesystem
Disk /dev/sdf: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 36615F7D-974F-4A0B-B79B-165D872EF418
Device Start End Sectors Size Type
/dev/sdf1 2048 3905535 3903488 1.9G Linux RAID
/dev/sdf2 3905536 3907583 2048 1M BIOS boot
/dev/sdf3 3907584 5844547583 5840640000 2.7T Linux RAID
/dev/sdf4 5844547584 5860532223 15984640 7.6G Linux filesystem
Disk /dev/md126: 1.9 GiB, 1997471744 bytes, 3901312 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk /dev/sdh: 29.2 GiB, 31376707072 bytes, 61282631 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00f674a6
Device Boot Start End Sectors Size Id Type
/dev/sdh1 * 2048 61282630 61280583 29.2G c W95 FAT32 (LBA)
Disk /dev/sdi: 15 GiB, 16106127360 bytes, 31457280 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00e19bdc
Device Boot Start End Sectors Size Id Type
/dev/sdi1 * 2048 31457279 31455232 15G c W95 FAT32 (LBA)
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Raid-6 cannot reshape
[not found] ` <6b9b6d37-6325-6515-f693-0ff3b641a67a@shenkin.org>
@ 2020-04-06 15:27 ` Alexander Shenkin
2020-04-06 16:12 ` Roger Heflin
0 siblings, 1 reply; 13+ messages in thread
From: Alexander Shenkin @ 2020-04-06 15:27 UTC (permalink / raw)
To: Linux-RAID
On 4/4/2020 9:19 AM, Alexander Shenkin wrote:
> On 4/1/2020 11:16 AM, Alexander Shenkin wrote:
>> [...]
>
> Hello again,
>
> Reading through https://raid.wiki.kernel.org/index.php/Assemble_Run and
> https://raid.wiki.kernel.org/index.php/RAID_Recovery, the advice seems
> to be to force assemble the array given that my event counts are all the
> same. However, I don't want to do that until some experts have chimed
> in. I've seen some other threads about using overlays to avoid data
> loss... not sure if that is still a recommended method.
>
> I'm including the dmesg and mdadm --examine output here... any advice
> much appreciated!
>
> Thanks,
> Allie
>
> (note below - the array member partitions are on /dev/sd[a-g])
>
> root@ubuntu-server:/home/ubuntu-server# mdadm -A --scan --verbose
> mdadm: looking for devices for further assembly
> mdadm: Cannot assemble mbr metadata on /dev/sdh1
> mdadm: Cannot assemble mbr metadata on /dev/sdh
> mdadm: no recogniseable superblock on /dev/md/bobbiflekman:0
> mdadm: no recogniseable superblock on /dev/sdf4
> mdadm: /dev/sdf3 is busy - skipping
> mdadm: no recogniseable superblock on /dev/sdf2
> mdadm: /dev/sdf1 is busy - skipping
> mdadm: Cannot assemble mbr metadata on /dev/sdf
> mdadm: no recogniseable superblock on /dev/sdg4
> mdadm: /dev/sdg3 is busy - skipping
> mdadm: no recogniseable superblock on /dev/sdg2
> mdadm: /dev/sdg1 is busy - skipping
> mdadm: Cannot assemble mbr metadata on /dev/sdg
> mdadm: no recogniseable superblock on /dev/sde4
> mdadm: /dev/sde3 is busy - skipping
> mdadm: no recogniseable superblock on /dev/sde2
> mdadm: /dev/sde1 is busy - skipping
> mdadm: Cannot assemble mbr metadata on /dev/sde
> mdadm: cannot open device /dev/sr1: No medium found
> mdadm: cannot open device /dev/sr0: No medium found
> mdadm: no recogniseable superblock on /dev/sdd4
> mdadm: /dev/sdd3 is busy - skipping
> mdadm: no recogniseable superblock on /dev/sdd2
> mdadm: /dev/sdd1 is busy - skipping
> mdadm: Cannot assemble mbr metadata on /dev/sdd
> mdadm: no recogniseable superblock on /dev/sdc4
> mdadm: /dev/sdc3 is busy - skipping
> mdadm: no recogniseable superblock on /dev/sdc2
> mdadm: /dev/sdc1 is busy - skipping
> mdadm: Cannot assemble mbr metadata on /dev/sdc
> mdadm: no recogniseable superblock on /dev/sdb4
> mdadm: /dev/sdb3 is busy - skipping
> mdadm: no recogniseable superblock on /dev/sdb2
> mdadm: /dev/sdb1 is busy - skipping
> mdadm: Cannot assemble mbr metadata on /dev/sdb
> mdadm: no recogniseable superblock on /dev/sda4
> mdadm: /dev/sda3 is busy - skipping
> mdadm: no recogniseable superblock on /dev/sda2
> mdadm: /dev/sda1 is busy - skipping
> mdadm: Cannot assemble mbr metadata on /dev/sda
> mdadm: no recogniseable superblock on /dev/loop7
> mdadm: no recogniseable superblock on /dev/loop6
> mdadm: no recogniseable superblock on /dev/loop5
> mdadm: no recogniseable superblock on /dev/loop4
> mdadm: no recogniseable superblock on /dev/loop3
> mdadm: no recogniseable superblock on /dev/loop2
> mdadm: no recogniseable superblock on /dev/loop1
> mdadm: no recogniseable superblock on /dev/loop0
> mdadm: No arrays found in config file or automatically
>
Hi again all,
Apologies for all the self-replies. I realized my previous --examine
was run without stopping the array in question. With the array stopped
and re-examined, the critical bit seems to be this:
mdadm: /dev/sdf3 is identified as a member of /dev/md/ubuntu:2, slot 4.
mdadm: /dev/sdg3 is identified as a member of /dev/md/ubuntu:2, slot 6.
mdadm: /dev/sde3 is identified as a member of /dev/md/ubuntu:2, slot 5.
mdadm: /dev/sdd3 is identified as a member of /dev/md/ubuntu:2, slot 3.
mdadm: /dev/sdc3 is identified as a member of /dev/md/ubuntu:2, slot 2.
mdadm: /dev/sdb3 is identified as a member of /dev/md/ubuntu:2, slot 1.
mdadm: /dev/sda3 is identified as a member of /dev/md/ubuntu:2, slot 0.
mdadm: /dev/md/ubuntu:2 has an active reshape - checking if critical
section needs to be restored
mdadm: No backup metadata on device-6
mdadm: Failed to find backup of critical section
mdadm: Failed to restore critical section for reshape, sorry.
Possibly you needed to specify the --backup-file
That is, it thinks there is an active reshape.
Thanks,
Allie
root@ubuntu-server:/home/ubuntu-server# mdadm -A --scan --verbose
mdadm: looking for devices for further assembly
mdadm: Cannot assemble mbr metadata on /dev/sdh1
mdadm: Cannot assemble mbr metadata on /dev/sdh
mdadm: no recogniseable superblock on /dev/md/bobbiflekman:0
mdadm: no recogniseable superblock on /dev/sdf4
mdadm: No super block found on /dev/sdf2 (Expected magic a92b4efc, got
00000000)
mdadm: no RAID superblock on /dev/sdf2
mdadm: /dev/sdf1 is busy - skipping
mdadm: No super block found on /dev/sdf (Expected magic a92b4efc, got
00000000)
mdadm: no RAID superblock on /dev/sdf
mdadm: No super block found on /dev/sdg4 (Expected magic a92b4efc, got
00000000)
mdadm: no RAID superblock on /dev/sdg4
mdadm: No super block found on /dev/sdg2 (Expected magic a92b4efc, got
00000000)
mdadm: no RAID superblock on /dev/sdg2
mdadm: /dev/sdg1 is busy - skipping
mdadm: No super block found on /dev/sdg (Expected magic a92b4efc, got
00000000)
mdadm: no RAID superblock on /dev/sdg
mdadm: No super block found on /dev/sde4 (Expected magic a92b4efc, got
00000000)
mdadm: no RAID superblock on /dev/sde4
mdadm: No super block found on /dev/sde2 (Expected magic a92b4efc, got
00000000)
mdadm: no RAID superblock on /dev/sde2
mdadm: /dev/sde1 is busy - skipping
mdadm: No super block found on /dev/sde (Expected magic a92b4efc, got
00000000)
mdadm: no RAID superblock on /dev/sde
mdadm: cannot open device /dev/sr1: No medium found
mdadm: cannot open device /dev/sr0: No medium found
mdadm: No super block found on /dev/sdd4 (Expected magic a92b4efc, got
00000000)
mdadm: no RAID superblock on /dev/sdd4
mdadm: No super block found on /dev/sdd2 (Expected magic a92b4efc, got
9c6196f1)
mdadm: no RAID superblock on /dev/sdd2
mdadm: /dev/sdd1 is busy - skipping
mdadm: No super block found on /dev/sdd (Expected magic a92b4efc, got
00000000)
mdadm: no RAID superblock on /dev/sdd
mdadm: No super block found on /dev/sdc4 (Expected magic a92b4efc, got
00000000)
mdadm: no RAID superblock on /dev/sdc4
mdadm: No super block found on /dev/sdc2 (Expected magic a92b4efc, got
9c6196f1)
mdadm: no RAID superblock on /dev/sdc2
mdadm: /dev/sdc1 is busy - skipping
mdadm: No super block found on /dev/sdc (Expected magic a92b4efc, got
00000000)
mdadm: no RAID superblock on /dev/sdc
mdadm: No super block found on /dev/sdb4 (Expected magic a92b4efc, got
53425553)
mdadm: no RAID superblock on /dev/sdb4
mdadm: No super block found on /dev/sdb2 (Expected magic a92b4efc, got
9c6196f1)
mdadm: no RAID superblock on /dev/sdb2
mdadm: /dev/sdb1 is busy - skipping
mdadm: No super block found on /dev/sdb (Expected magic a92b4efc, got
00000000)
mdadm: no RAID superblock on /dev/sdb
mdadm: No super block found on /dev/sda4 (Expected magic a92b4efc, got
00000000)
mdadm: no RAID superblock on /dev/sda4
mdadm: No super block found on /dev/sda2 (Expected magic a92b4efc, got
9c6196f1)
mdadm: no RAID superblock on /dev/sda2
mdadm: /dev/sda1 is busy - skipping
mdadm: No super block found on /dev/sda (Expected magic a92b4efc, got
00000000)
mdadm: no RAID superblock on /dev/sda
mdadm: No super block found on /dev/loop7 (Expected magic a92b4efc, got
72756769)
mdadm: no RAID superblock on /dev/loop7
mdadm: No super block found on /dev/loop6 (Expected magic a92b4efc, got
69622f21)
mdadm: no RAID superblock on /dev/loop6
mdadm: No super block found on /dev/loop5 (Expected magic a92b4efc, got
14ea0a05)
mdadm: no RAID superblock on /dev/loop5
mdadm: No super block found on /dev/loop4 (Expected magic a92b4efc, got
1824ef5d)
mdadm: no RAID superblock on /dev/loop4
mdadm: No super block found on /dev/loop3 (Expected magic a92b4efc, got
1e993ae9)
mdadm: no RAID superblock on /dev/loop3
mdadm: No super block found on /dev/loop2 (Expected magic a92b4efc, got
cb4d8c8e)
mdadm: no RAID superblock on /dev/loop2
mdadm: No super block found on /dev/loop1 (Expected magic a92b4efc, got
d2964063)
mdadm: no RAID superblock on /dev/loop1
mdadm: No super block found on /dev/loop0 (Expected magic a92b4efc, got
e7e108a6)
mdadm: no RAID superblock on /dev/loop0
mdadm: /dev/sdf3 is identified as a member of /dev/md/ubuntu:2, slot 4.
mdadm: /dev/sdg3 is identified as a member of /dev/md/ubuntu:2, slot 6.
mdadm: /dev/sde3 is identified as a member of /dev/md/ubuntu:2, slot 5.
mdadm: /dev/sdd3 is identified as a member of /dev/md/ubuntu:2, slot 3.
mdadm: /dev/sdc3 is identified as a member of /dev/md/ubuntu:2, slot 2.
mdadm: /dev/sdb3 is identified as a member of /dev/md/ubuntu:2, slot 1.
mdadm: /dev/sda3 is identified as a member of /dev/md/ubuntu:2, slot 0.
mdadm: /dev/md/ubuntu:2 has an active reshape - checking if critical
section needs to be restored
mdadm: No backup metadata on device-6
mdadm: Failed to find backup of critical section
mdadm: Failed to restore critical section for reshape, sorry.
Possibly you needed to specify the --backup-file
mdadm: looking for devices for further assembly
mdadm: /dev/sdf1 is busy - skipping
mdadm: /dev/sdg1 is busy - skipping
mdadm: /dev/sde1 is busy - skipping
mdadm: /dev/sdd1 is busy - skipping
mdadm: /dev/sdc1 is busy - skipping
mdadm: /dev/sdb1 is busy - skipping
mdadm: /dev/sda1 is busy - skipping
mdadm: No arrays found in config file or automatically
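The overlay approach mentioned earlier in this message can be sketched as follows, after the recipe on the kernel.org RAID wiki (overlay size and paths are illustrative; nothing here is taken from the thread's actual commands):

```shell
# One sparse overlay file + loop device + dm snapshot per array member, so
# assembly experiments never write to the real partitions.
for dev in /dev/sd[a-g]3; do
  name=$(basename "$dev")
  truncate -s 4G "/tmp/overlay-$name"            # sparse copy-on-write store
  loop=$(losetup -f --show "/tmp/overlay-$name")
  size=$(blockdev --getsz "$dev")                # size in 512-byte sectors
  dmsetup create "$name" --table "0 $size snapshot $dev $loop P 8"
done
# Subsequent mdadm attempts then use /dev/mapper/sda3 ... /dev/mapper/sdg3
# instead of the real partitions.
```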
* Re: Raid-6 cannot reshape
2020-04-06 15:27 ` Alexander Shenkin
@ 2020-04-06 16:12 ` Roger Heflin
2020-04-06 16:27 ` Wols Lists
0 siblings, 1 reply; 13+ messages in thread
From: Roger Heflin @ 2020-04-06 16:12 UTC (permalink / raw)
To: Alexander Shenkin; +Cc: Linux-RAID
When I looked at the detailed files you sent a few days ago, all of
the reshapes (on all disks) indicated that they were at position 0, so
it appears the reshape never actually started and hung immediately.
That is probably why it cannot find the critical section: it hung
before the critical section was ever written. I'm not entirely sure
how to undo a reshape that failed like this.
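The per-disk position Roger is describing can be read back without touching the array; "Reshape pos'n" is the field a v1.x superblock reports during a reshape. A read-only sketch:

```shell
# Read-only: print each member's recorded reshape state. A position of 0 on
# every disk supports the theory that the reshape hung before doing any work.
for dev in /dev/sd[a-g]3; do
  echo "== $dev =="
  mdadm --examine "$dev" | grep -E 'Reshape pos|Delta Devices|Events'
done
```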
On Mon, Apr 6, 2020 at 10:29 AM Alexander Shenkin <al@shenkin.org> wrote:
> [...]
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Raid-6 cannot reshape
2020-04-06 16:12 ` Roger Heflin
@ 2020-04-06 16:27 ` Wols Lists
2020-04-06 20:34 ` Phil Turmel
0 siblings, 1 reply; 13+ messages in thread
From: Wols Lists @ 2020-04-06 16:27 UTC (permalink / raw)
To: Roger Heflin, Alexander Shenkin; +Cc: Linux-RAID
On 06/04/20 17:12, Roger Heflin wrote:
> When I looked at your detailed files you sent a few days ago, all of
> the reshapes (on all disks) indicated that they were at position 0, so
> it kind of appears that the reshape never actually started at all and
> hung immediately which is probably why it cannot find the critical
> section, it hung prior to that getting done. Not entirely sure how
> to undo a reshape that failed like this.
This seems quite common. Search the archives - it's probably something
like --assemble --revert-reshape.
Cheers,
Wol
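In current mdadm the operation Wol is recalling is spelled as an `--update` option on assemble (the form used later in this thread). A dry-run sketch that only prints the command; device names (/dev/md127, /dev/sd[a-g]3) are taken from this thread and must be verified against your own system:

```shell
# Dry-run: build and print the revert-reshape assemble command instead
# of executing it. Member list is from this thread -- verify yours first.
CMD="mdadm --assemble --update=revert-reshape /dev/md127 /dev/sd[a-g]3"
echo "$CMD"
```

Only run the printed command as root once the member list is confirmed; add --force and --invalid-backup only if mdadm asks for them.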
* Re: Raid-6 cannot reshape
2020-04-06 16:27 ` Wols Lists
@ 2020-04-06 20:34 ` Phil Turmel
2020-04-07 10:25 ` Alexander Shenkin
0 siblings, 1 reply; 13+ messages in thread
From: Phil Turmel @ 2020-04-06 20:34 UTC (permalink / raw)
To: Wols Lists, Roger Heflin, Alexander Shenkin; +Cc: Linux-RAID
On 4/6/20 12:27 PM, Wols Lists wrote:
> On 06/04/20 17:12, Roger Heflin wrote:
>> When I looked at your detailed files you sent a few days ago, all of
>> the reshapes (on all disks) indicated that they were at position 0, so
>> it kind of appears that the reshape never actually started at all and
>> hung immediately which is probably why it cannot find the critical
>> section, it hung prior to that getting done. Not entirely sure how
>> to undo a reshape that failed like this.
>
> This seems quite common. Search the archives - it's probably something
> like --assemble --revert-reshape.
Ah, yes. I recall cases where mdmon wouldn't start or wouldn't open the
array to start moving the stripes, so the kernel wouldn't advance.
SystemD was one of the culprits, I believe, back then.
* Re: Raid-6 cannot reshape
2020-04-06 20:34 ` Phil Turmel
@ 2020-04-07 10:25 ` Alexander Shenkin
2020-04-07 11:28 ` Phil Turmel
0 siblings, 1 reply; 13+ messages in thread
From: Alexander Shenkin @ 2020-04-07 10:25 UTC (permalink / raw)
To: Phil Turmel, Wols Lists, Roger Heflin; +Cc: Linux-RAID
On 4/6/2020 9:34 PM, Phil Turmel wrote:
> On 4/6/20 12:27 PM, Wols Lists wrote:
>> On 06/04/20 17:12, Roger Heflin wrote:
>>> When I looked at your detailed files you sent a few days ago, all of
>>> the reshapes (on all disks) indicated that they were at position 0, so
>>> it kind of appears that the reshape never actually started at all and
>>> hung immediately which is probably why it cannot find the critical
>>> section, it hung prior to that getting done. Not entirely sure how
>>> to undo a reshape that failed like this.
>>
>> This seems quite common. Search the archives - it's probably something
>> like --assemble --revert-reshape.
>
> Ah, yes. I recall cases where mdmon wouldn't start or wouldn't open the
> array to start moving the stripes, so the kernel wouldn't advance.
> SystemD was one of the culprits, I believe, back then.
Thanks all.
So, is the following safe to run, and a good idea to try?
mdadm --assemble --update=revert-reshape /dev/md127 /dev/sd[a-g]3
And if that doesn't work, add a force?
mdadm --assemble --force --update=revert-reshape /dev/md127 /dev/sd[a-g]3
And adding --invalid-backup if it complains about backup files?
Thanks,
Allie
* Re: Raid-6 cannot reshape
2020-04-07 10:25 ` Alexander Shenkin
@ 2020-04-07 11:28 ` Phil Turmel
2020-04-07 11:46 ` Alexander Shenkin
0 siblings, 1 reply; 13+ messages in thread
From: Phil Turmel @ 2020-04-07 11:28 UTC (permalink / raw)
To: Alexander Shenkin, Wols Lists, Roger Heflin; +Cc: Linux-RAID
Hi Allie,
On 4/7/20 6:25 AM, Alexander Shenkin wrote:
>
>
> On 4/6/2020 9:34 PM, Phil Turmel wrote:
>> On 4/6/20 12:27 PM, Wols Lists wrote:
>>> On 06/04/20 17:12, Roger Heflin wrote:
>>>> When I looked at your detailed files you sent a few days ago, all of
>>>> the reshapes (on all disks) indicated that they were at position 0, so
>>>> it kind of appears that the reshape never actually started at all and
>>>> hung immediately which is probably why it cannot find the critical
>>>> section, it hung prior to that getting done. Not entirely sure how
>>>> to undo a reshape that failed like this.
>>>
>>> This seems quite common. Search the archives - it's probably something
>>> like --assemble --revert-reshape.
>>
>> Ah, yes. I recall cases where mdmon wouldn't start or wouldn't open the
>> array to start moving the stripes, so the kernel wouldn't advance.
>> SystemD was one of the culprits, I believe, back then.
>
> Thanks all.
>
> So, is the following safe to run, and a good idea to try?
>
> mdadm --assemble --update=revert-reshape /dev/md127 /dev/sd[a-g]3
Yes.
> And if that doesn't work, add a force?
> mdadm --assemble --force --update=revert-reshape /dev/md127 /dev/sd[a-g]3
Yes.
> And adding --invalid-backup if it complains about backup files?
Yes.
> Thanks,
> Allie
Phil
* Re: Raid-6 cannot reshape
2020-04-07 11:28 ` Phil Turmel
@ 2020-04-07 11:46 ` Alexander Shenkin
2020-04-07 12:28 ` Phil Turmel
0 siblings, 1 reply; 13+ messages in thread
From: Alexander Shenkin @ 2020-04-07 11:46 UTC (permalink / raw)
To: Phil Turmel, Wols Lists, Roger Heflin; +Cc: Linux-RAID
On 4/7/2020 12:28 PM, Phil Turmel wrote:
> Hi Allie,
>
> On 4/7/20 6:25 AM, Alexander Shenkin wrote:
>>
>>
>> On 4/6/2020 9:34 PM, Phil Turmel wrote:
>>> On 4/6/20 12:27 PM, Wols Lists wrote:
>>>> On 06/04/20 17:12, Roger Heflin wrote:
>>>>> When I looked at your detailed files you sent a few days ago, all of
>>>>> the reshapes (on all disks) indicated that they were at position 0, so
>>>>> it kind of appears that the reshape never actually started at all and
>>>>> hung immediately which is probably why it cannot find the critical
>>>>> section, it hung prior to that getting done. Not entirely sure how
>>>>> to undo a reshape that failed like this.
>>>>
>>>> This seems quite common. Search the archives - it's probably something
>>>> like --assemble --revert-reshape.
>>>
>>> Ah, yes. I recall cases where mdmon wouldn't start or wouldn't open the
>>> array to start moving the stripes, so the kernel wouldn't advance.
>>> SystemD was one of the culprits, I believe, back then.
>>
>> Thanks all.
>>
>> So, is the following safe to run, and a good idea to try?
>>
>> mdadm --assemble --update=revert-reshape /dev/md127 /dev/sd[a-g]3
>
> Yes.
>
>> And if that doesn't work, add a force?
>> mdadm --assemble --force --update=revert-reshape /dev/md127 /dev/sd[a-g]3
>
> Yes.
>
>> And adding --invalid-backup if it complains about backup files?
>
> Yes.
>
>> Thanks,
>> Allie
>
> Phil
>
Thanks Phil,
The --invalid-backup parameter was necessary to get this up and running.
It's now up with the 7th disk as a spare. Shall I run fsck now, or can
I just try to grow again?
proposed grow operation:
> mdadm --grow --raid-devices=7 --backup-file=/dev/usb/grow_md127.bak /dev/md127
> mdadm --stop /dev/md127
> umount /dev/md127 # not sure if this is necessary
> resize2fs /dev/md127
Thanks,
Allie
assemble operation results:
root@ubuntu-server:/home/ubuntu-server# mdadm --assemble
--invalid-backup --update=revert-reshape /dev/md127 /dev/sd[a-g]3
mdadm: device 12 in /dev/md127 has wrong state in superblock, but
/dev/sdg3 seems ok
mdadm: /dev/md127 has been started with 6 drives and 1 spare.
root@ubuntu-server:/home/ubuntu-server# cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5]
[raid4] [raid10]
md127 : active raid6 sda3[6] sdg3[9](S) sde3[7] sdf3[8] sdd3[5] sdc3[2]
sdb3[4]
11680755712 blocks super 1.2 level 6, 512k chunk, algorithm 2
[6/6] [UUUUUU]
bitmap: 0/22 pages [0KB], 65536KB chunk
md126 : active (auto-read-only) raid1 sdf1[8] sde1[7] sdg1[9] sda1[6]
sdd1[5] sdc1[2] sdb1[4]
1950656 blocks super 1.2 [7/7] [UUUUUUU]
unused devices: <none>
* Re: Raid-6 cannot reshape
2020-04-07 11:46 ` Alexander Shenkin
@ 2020-04-07 12:28 ` Phil Turmel
2020-04-07 12:31 ` Phil Turmel
0 siblings, 1 reply; 13+ messages in thread
From: Phil Turmel @ 2020-04-07 12:28 UTC (permalink / raw)
To: Alexander Shenkin, Wols Lists, Roger Heflin; +Cc: Linux-RAID
>
> Thanks Phil,
>
> The --invalid-backup parameter was necessary to get this up and running.
> It's now up with the 7th disk as a spare. Shall I run fsck now, or can
> I just try to grow again?
>
> proposed grow operation:
>> mdadm --grow --raid-devices=7 --backup-file=/dev/usb/grow_md127.bak /dev/md127
>> mdadm --stop /dev/md127
>> umount /dev/md127 # not sure if this is necessary
>> resize2fs /dev/md127
An fsck could help, if any blocks did get moved.
I would not attempt a grow again until you find out why the previous
attempt didn't make progress. Check if mdmon is running, and/or compile
a fresh copy of mdadm from source. If you don't figure it out, you'll
just end up in the same spot again.
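A minimal sketch of those pre-flight checks, assuming standard tooling (pgrep from procps, mdadm in PATH); neither command modifies the array:

```shell
# Non-destructive sanity checks before retrying the grow.
# Fallbacks keep this safe to run even on a box without mdadm installed.
MDMON=$(pgrep -a mdmon 2>/dev/null || echo "no mdmon process found")
# mdadm historically prints its version banner to stderr, hence 2>&1.
MDADM_VER=$(mdadm --version 2>&1 || echo "mdadm not installed")
echo "$MDMON"
echo "$MDADM_VER"
```

If the version is old (early 3.x), building a current mdadm from source before retrying is worthwhile.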
Phil
* Re: Raid-6 cannot reshape
2020-04-07 12:28 ` Phil Turmel
@ 2020-04-07 12:31 ` Phil Turmel
2020-04-07 13:19 ` Alexander Shenkin
0 siblings, 1 reply; 13+ messages in thread
From: Phil Turmel @ 2020-04-07 12:31 UTC (permalink / raw)
To: Alexander Shenkin, Wols Lists, Roger Heflin; +Cc: Linux-RAID
On 4/7/20 8:28 AM, Phil Turmel wrote:
>>
>> Thanks Phil,
>>
>> The --invalid-backup parameter was necessary to get this up and running.
>> It's now up with the 7th disk as a spare. Shall I run fsck now, or can
>> I just try to grow again?
>>
>> proposed grow operation:
>>> mdadm --grow --raid-devices=7 --backup-file=/dev/usb/grow_md127.bak /dev/md127
>>> mdadm --stop /dev/md127
>>> umount /dev/md127 # not sure if this is necessary
>>> resize2fs /dev/md127
>
> An fsck could help, if any blocks did get moved.
>
> I would not attempt a grow again until you find out why the previous
> attempt didn't make progress. Check if mdmon is running, and/or compile
> a fresh copy of mdadm from source. If you don't figure it out, you'll
> just end up in the same spot again.
Oh, one more point: Don't use a backup file. Let mdadm shift the data
offsets to get the temporary space needed. (It'll run faster, too.)
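A dry-run sketch of that offset-shifting grow (prints only; the device name and target count are from this thread, and this assumes an mdadm recent enough to relocate the data offset on 1.2 metadata):

```shell
# Dry-run: print the grow command WITHOUT --backup-file. With 1.2
# metadata and free space before the data offset, a recent mdadm
# relocates the offset itself, so no backup file is needed.
GROW="mdadm --grow --raid-devices=7 /dev/md127"
echo "$GROW"
```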
(I don't see any mdadm --examine reports in the list thread. Did you do
them and keep the complete output?)
Phil
* Re: Raid-6 cannot reshape
2020-04-07 12:31 ` Phil Turmel
@ 2020-04-07 13:19 ` Alexander Shenkin
2020-04-07 15:08 ` antlists
0 siblings, 1 reply; 13+ messages in thread
From: Alexander Shenkin @ 2020-04-07 13:19 UTC (permalink / raw)
To: Phil Turmel, Wols Lists, Roger Heflin; +Cc: Linux-RAID
[-- Attachment #1: Type: text/plain, Size: 1650 bytes --]
On 4/7/2020 1:31 PM, Phil Turmel wrote:
> On 4/7/20 8:28 AM, Phil Turmel wrote:
>>>
>>> Thanks Phil,
>>>
>>> The --invalid-backup parameter was necessary to get this up and running.
>>> It's now up with the 7th disk as a spare. Shall I run fsck now, or
>>> can
>>> I just try to grow again?
>>>
>>> proposed grow operation:
>>>> mdadm --grow --raid-devices=7 --backup-file=/dev/usb/grow_md127.bak /dev/md127
>>>> mdadm --stop /dev/md127
>>>> umount /dev/md127 # not sure if this is necessary
>>>> resize2fs /dev/md127
>>
>> An fsck could help, if any blocks did get moved.
>>
>> I would not attempt a grow again until you find out why the previous
>> attempt didn't make progress. Check if mdmon is running, and/or
>> compile a fresh copy of mdadm from source. If you don't figure it
>> out, you'll just end up in the same spot again.
>
>
> Oh, one more point: Don't use a backup file. Let mdadm shift the data
> offsets to get the temporary space needed. (It'll run faster, too.)
>
> (I don't see any mdadm --examine reports in the list thread. Did you do
> them and keep the complete output?)
>
> Phil
>
Thanks Phil,
fsck is finding lots and lots of problems. Figure I'll just run fsck -p
and see what happens... not sure what choice i have...
examine output attached here...
Re re-growing, I was hoping that running on a newer mdadm (4.1) might
fix the problem, and if i still encountered it, perhaps running the
following might unstick it:
echo frozen > /sys/block/md0/md/sync_action
echo reshape > /sys/block/md0/md/sync_action
But, I personally have no idea what happened (really), nor why... :-(
thanks,
allie
[-- Attachment #2: examine.txt --]
[-- Type: text/plain, Size: 6802 bytes --]
root@ubuntu-server:/home/ubuntu-server# mdadm --examine /dev/sd[a-g]3
/dev/sda3:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : c7303f62:d848d424:269581c8:83a045ec
Name : ubuntu:2
Creation Time : Sun Feb 5 23:39:58 2017
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 5840377856 (2784.91 GiB 2990.27 GB)
Array Size : 11680755712 (11139.64 GiB 11961.09 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=0 sectors
State : clean
Device UUID : 10bdbed5:cb70c8a9:566c384d:ec4c926e
Internal Bitmap : 8 sectors from superblock
Update Time : Tue Apr 7 13:14:11 2020
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : 307cff15 - correct
Events : 316498
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 0
Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb3:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : c7303f62:d848d424:269581c8:83a045ec
Name : ubuntu:2
Creation Time : Sun Feb 5 23:39:58 2017
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 5840377856 (2784.91 GiB 2990.27 GB)
Array Size : 11680755712 (11139.64 GiB 11961.09 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=0 sectors
State : clean
Device UUID : cf70dad5:0c9ff5f6:ede689f2:ccee2eb0
Internal Bitmap : 8 sectors from superblock
Update Time : Tue Apr 7 13:14:11 2020
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : 64b3fd8b - correct
Events : 316498
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 1
Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc3:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : c7303f62:d848d424:269581c8:83a045ec
Name : ubuntu:2
Creation Time : Sun Feb 5 23:39:58 2017
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 5840377856 (2784.91 GiB 2990.27 GB)
Array Size : 11680755712 (11139.64 GiB 11961.09 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=0 sectors
State : clean
Device UUID : f8839952:eaba2e9c:c2c401d4:3e0592a5
Internal Bitmap : 8 sectors from superblock
Update Time : Tue Apr 7 13:14:11 2020
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : 5d8720d6 - correct
Events : 316498
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 2
Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdd3:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : c7303f62:d848d424:269581c8:83a045ec
Name : ubuntu:2
Creation Time : Sun Feb 5 23:39:58 2017
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 5840377856 (2784.91 GiB 2990.27 GB)
Array Size : 11680755712 (11139.64 GiB 11961.09 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=0 sectors
State : clean
Device UUID : 875a0dbd:965a9986:1b78eb3d:e15fee50
Internal Bitmap : 8 sectors from superblock
Update Time : Tue Apr 7 13:14:11 2020
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : c7aba50f - correct
Events : 316498
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 3
Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sde3:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : c7303f62:d848d424:269581c8:83a045ec
Name : ubuntu:2
Creation Time : Sun Feb 5 23:39:58 2017
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 5840377856 (2784.91 GiB 2990.27 GB)
Array Size : 11680755712 (11139.64 GiB 11961.09 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=0 sectors
State : clean
Device UUID : dc0bda8c:2457fb4c:f87a4bec:8d5b58ed
Internal Bitmap : 8 sectors from superblock
Update Time : Tue Apr 7 13:14:11 2020
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : a8a4517e - correct
Events : 316498
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 5
Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdf3:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : c7303f62:d848d424:269581c8:83a045ec
Name : ubuntu:2
Creation Time : Sun Feb 5 23:39:58 2017
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 5840377856 (2784.91 GiB 2990.27 GB)
Array Size : 11680755712 (11139.64 GiB 11961.09 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=0 sectors
State : clean
Device UUID : dc842dc3:09c910c7:c351c307:e2383d13
Internal Bitmap : 8 sectors from superblock
Update Time : Tue Apr 7 13:14:11 2020
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : 9a69f083 - correct
Events : 316498
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 4
Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdg3:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : c7303f62:d848d424:269581c8:83a045ec
Name : ubuntu:2
Creation Time : Sun Feb 5 23:39:58 2017
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 5840377856 (2784.91 GiB 2990.27 GB)
Array Size : 11680755712 (11139.64 GiB 11961.09 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=0 sectors
State : clean
Device UUID : 635ef71b:e4add925:30ae4f0a:f6b46611
Internal Bitmap : 8 sectors from superblock
Update Time : Tue Apr 7 11:37:49 2020
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : 52b270d0 - correct
Events : 316498
Layout : left-symmetric
Chunk Size : 512K
Device Role : spare
Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
* Re: Raid-6 cannot reshape
2020-04-07 13:19 ` Alexander Shenkin
@ 2020-04-07 15:08 ` antlists
2020-04-07 17:04 ` Alexander Shenkin
0 siblings, 1 reply; 13+ messages in thread
From: antlists @ 2020-04-07 15:08 UTC (permalink / raw)
To: Alexander Shenkin, Phil Turmel, Roger Heflin; +Cc: Linux-RAID
On 07/04/2020 14:19, Alexander Shenkin wrote:
> Re re-growing, I was hoping that running on a newer mdadm (4.1) might
> fix the problem, and if i still encountered it, perhaps running the
> following might unstick it:
>
> echo frozen > /sys/block/md0/md/sync_action
> echo reshape > /sys/block/md0/md/sync_action
>
> But, I personally have no idea what happened (really), nor why...:-(
iirc pretty much all these reports come from oldish Ubuntu systems ...
What happened *could* be that you have an updated franken-kernel, plus
an old mdadm, and the mess needs an Igor to stitch it all together...
If you ARE going to try the grow again, I'd use an up-to-date recovery
system to run the grow, and then reboot back in to the old Ubuntu once
your system is back.
And seriously think about upgrading your distro to the latest LTS.
Cheers,
Wol
* Re: Raid-6 cannot reshape
2020-04-07 15:08 ` antlists
@ 2020-04-07 17:04 ` Alexander Shenkin
0 siblings, 0 replies; 13+ messages in thread
From: Alexander Shenkin @ 2020-04-07 17:04 UTC (permalink / raw)
To: antlists, Phil Turmel, Roger Heflin; +Cc: Linux-RAID
Thanks all,
My file system was a total mess, everything dropped into lost+found.
I've put everything back into what I think are the right directory
structures, and nice to not have all my data lost (i think)... but i may
just do a clean install of a new os once i have it up, running, and
grown again... many thanks to all... (and btw, it's growing nicely with
ubuntu 18 at the moment...)
allie
On 4/7/2020 4:08 PM, antlists wrote:
> On 07/04/2020 14:19, Alexander Shenkin wrote:
>> Re re-growing, I was hoping that running on a newer mdadm (4.1) might
>> fix the problem, and if i still encountered it, perhaps running the
>> following might unstick it:
>>
>> echo frozen > /sys/block/md0/md/sync_action
>> echo reshape > /sys/block/md0/md/sync_action
>>
>> But, I personally have no idea what happened (really), nor why...:-(
>
> iirc pretty much all these reports come from oldish Ubuntu systems ...
>
> What happened *could* be that you have an updated franken-kernel, plus
> an old mdadm, and the mess needs an Igor to stitch it all together...
>
> If you ARE going to try the grow again, I'd use an up-to-date recovery
> system to run the grow, and then reboot back in to the old Ubuntu once
> your system is back.
>
> And seriously think about upgrading your distro to the latest LTS.
>
> Cheers,
> Wol
end of thread [~2020-04-07 17:04 UTC | newest]
Thread overview: 13+ messages
2020-04-01 10:16 Raid-6 cannot reshape Alexander Shenkin
[not found] ` <6b9b6d37-6325-6515-f693-0ff3b641a67a@shenkin.org>
2020-04-06 15:27 ` Alexander Shenkin
2020-04-06 16:12 ` Roger Heflin
2020-04-06 16:27 ` Wols Lists
2020-04-06 20:34 ` Phil Turmel
2020-04-07 10:25 ` Alexander Shenkin
2020-04-07 11:28 ` Phil Turmel
2020-04-07 11:46 ` Alexander Shenkin
2020-04-07 12:28 ` Phil Turmel
2020-04-07 12:31 ` Phil Turmel
2020-04-07 13:19 ` Alexander Shenkin
2020-04-07 15:08 ` antlists
2020-04-07 17:04 ` Alexander Shenkin