* Accidentally resized array to 9
@ 2017-09-29  4:23 Eli Ben-Shoshan
  2017-09-29 12:38 ` John Stoffel
  2017-09-29 12:55 ` Roman Mamedov
  0 siblings, 2 replies; 13+ messages in thread
From: Eli Ben-Shoshan @ 2017-09-29  4:23 UTC (permalink / raw)
  To: linux-raid

I needed to add another disk to my array (/dev/md128) and accidentally 
resized the array to 9 in the process. Here is what I did.

First I added the disk to the array with:

mdadm --manage /dev/md128 --add /dev/sdl

This was a RAID6 with 8 devices. Instead of using --grow with 
--raid-devices set to 9, I did the following:

mdadm --grow /dev/md128 --size 9

This happily returned without any errors, so I went to look at 
/proc/mdstat and did not see a reshape in progress. So I shook my 
head, read the output of --grow --help, and did the right thing, which is:

mdadm --grow /dev/md128 --raid-devices=9

Right after that everything hit the fan. dmesg reported a lot of 
filesystem errors. I quickly stopped all processes that were using this 
device and unmounted the filesystems. I then, stupidly, decided to 
reboot before looking around.
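
For the record, what I meant to run was nothing more than the add followed 
by the reshape, with no --size anywhere:

  mdadm --manage /dev/md128 --add /dev/sdl
  mdadm --grow /dev/md128 --raid-devices=9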

I am now booted and can assemble this array but it seems like there is 
no data there. Here is the output of --misc --examine:

ganon raid # cat md128
/dev/md128:
         Version : 1.2
   Creation Time : Sat Aug 30 22:01:09 2014
      Raid Level : raid6
   Used Dev Size : unknown
    Raid Devices : 9
   Total Devices : 9
     Persistence : Superblock is persistent

     Update Time : Thu Sep 28 19:44:39 2017
           State : clean, Not Started
  Active Devices : 9
Working Devices : 9
  Failed Devices : 0
   Spare Devices : 0

          Layout : left-symmetric
      Chunk Size : 512K

            Name : ganon:ganon - large raid6  (local to host ganon)
            UUID : 2b3f41d5:ac904000:965be496:dd3ae4ae
          Events : 84345

     Number   Major   Minor   RaidDevice State
        0       8       32        0      active sync   /dev/sdc
        1       8       48        1      active sync   /dev/sdd
        6       8      128        2      active sync   /dev/sdi
        3       8       96        3      active sync   /dev/sdg
        4       8       80        4      active sync   /dev/sdf
        8       8      160        5      active sync   /dev/sdk
        7       8       64        6      active sync   /dev/sde
        9       8      112        7      active sync   /dev/sdh
       10       8      176        8      active sync   /dev/sdl

You will note that the "Used Dev Size" is unknown. The output of --misc 
--examine on each disk looks similar to this:

/dev/sdc:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x0
      Array UUID : 2b3f41d5:ac904000:965be496:dd3ae4ae
            Name : ganon:ganon - large raid6  (local to host ganon)
   Creation Time : Sat Aug 30 22:01:09 2014
      Raid Level : raid6
    Raid Devices : 9

  Avail Dev Size : 3906767024 (1862.89 GiB 2000.26 GB)
      Array Size : 0
   Used Dev Size : 0
     Data Offset : 239616 sectors
    Super Offset : 8 sectors
    Unused Space : before=239528 sectors, after=3906789552 sectors
           State : clean
     Device UUID : b1bd681a:36849191:b3fdad44:22567d99

     Update Time : Thu Sep 28 19:44:39 2017
   Bad Block Log : 512 entries available at offset 72 sectors
        Checksum : bca7b1d5 - correct
          Events : 84345

          Layout : left-symmetric
      Chunk Size : 512K

    Device Role : Active device 0
    Array State : AAAAAAAAA ('A' == active, '.' == missing, 'R' == 
replacing)

I followed the directions to create overlays and tried to re-create the 
array with the following:

mdadm --create /dev/md150 --assume-clean --metadata=1.2 
--data-offset=117M --level=6 --layout=ls --chunk=512 --raid-devices=9 
/dev/mapper/sdc /dev/mapper/sdd /dev/mapper/sdi /dev/mapper/sdg 
/dev/mapper/sdf /dev/mapper/sdk /dev/mapper/sde /dev/mapper/sdh 
/dev/mapper/sdl

While this creates /dev/md150, it is basically empty. There should be 
an LVM PV label on this device, but pvck returns:

   Could not find LVM label on /dev/md150

The output of --misc --examine looks like this with the overlay:

/dev/md150:
         Version : 1.2
   Creation Time : Fri Sep 29 00:22:11 2017
      Raid Level : raid6
      Array Size : 13673762816 (13040.32 GiB 14001.93 GB)
   Used Dev Size : 1953394688 (1862.90 GiB 2000.28 GB)
    Raid Devices : 9
   Total Devices : 9
     Persistence : Superblock is persistent

   Intent Bitmap : Internal

     Update Time : Fri Sep 29 00:22:11 2017
           State : clean
  Active Devices : 9
Working Devices : 9
  Failed Devices : 0
   Spare Devices : 0

          Layout : left-symmetric
      Chunk Size : 512K

            Name : ganon:150  (local to host ganon)
            UUID : 84098bfe:74c1f70c:958a7d8a:ccb2ef74
          Events : 0

     Number   Major   Minor   RaidDevice State
        0     252       11        0      active sync   /dev/dm-11
        1     252        9        1      active sync   /dev/dm-9
        2     252       16        2      active sync   /dev/dm-16
        3     252       17        3      active sync   /dev/dm-17
        4     252       10        4      active sync   /dev/dm-10
        5     252       14        5      active sync   /dev/dm-14
        6     252       12        6      active sync   /dev/dm-12
        7     252       13        7      active sync   /dev/dm-13
        8     252       15        8      active sync   /dev/dm-15

What do you think? Am I hosed here? Is there any way I can get my data back?


* Re: Accidentally resized array to 9
  2017-09-29  4:23 Accidentally resized array to 9 Eli Ben-Shoshan
@ 2017-09-29 12:38 ` John Stoffel
  2017-09-29 14:47   ` Eli Ben-Shoshan
  2017-09-29 12:55 ` Roman Mamedov
  1 sibling, 1 reply; 13+ messages in thread
From: John Stoffel @ 2017-09-29 12:38 UTC (permalink / raw)
  To: Eli Ben-Shoshan; +Cc: linux-raid

>>>>> "Eli" == Eli Ben-Shoshan <eli@benshoshan.com> writes:

Eli> I need to add another disk to my array (/dev/md128) when I accidentally 
Eli> did an array resize to 9 with the following command:

Eli> First I add the disk to the array with the following:

Eli> mdadm --manage /dev/md128 --add /dev/sdl

Eli> This was a RAID6 with 8 devices. Instead of using --grow with 
Eli> --raid-devices set to 9, I did the following:

Eli> mdadm --grow /dev/md128 --size 9

Eli> This happily returned without any errors so I went to go look at 
Eli> /proc/mdstat and did not see a resize operation going. So I shook my 
Eli> head and read the output of --grow --help and did the right thing which is:

Eli> mdadm --grow /dev/md128 --raid-devices=9

Eli> Right after that everything hit the fan. dmesg reported a lot of 
Eli> filesystem errors. I quickly stopped all processes that were using this 
Eli> device and unmounted the filesystems. I then, stupidly, decided to 
Eli> reboot before looking around.


I think you *might* be able to fix this with just a simple:

   mdadm --grow /dev/md128 --size max

And then try to scan for your LVM configuration, then fsck your volume
on there.  I hope you had backups.
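
Roughly the sequence I have in mind is the following (untested, and the
VG/LV names are placeholders for whatever yours are called):

  mdadm --grow /dev/md128 --size max    # restore the per-device size
  pvscan                                # look for the PV on /dev/md128 again
  vgchange -ay <vgname>
  fsck -n /dev/<vgname>/<lvname>        # read-only check first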

And maybe there should be a warning when re-sizing raid array elements
without a --force option if going smaller than the current size?  

Eli> I am now booted and can assemble this array but it seems like there is 
Eli> no data there. Here is the output of --misc --examine:



Eli> ganon raid # cat md128
Eli> /dev/md128:
Eli>          Version : 1.2
Eli>    Creation Time : Sat Aug 30 22:01:09 2014
Eli>       Raid Level : raid6
Eli>    Used Dev Size : unknown
Eli>     Raid Devices : 9
Eli>    Total Devices : 9
Eli>      Persistence : Superblock is persistent

Eli>      Update Time : Thu Sep 28 19:44:39 2017
Eli>            State : clean, Not Started
Eli>   Active Devices : 9
Eli> Working Devices : 9
Eli>   Failed Devices : 0
Eli>    Spare Devices : 0

Eli>           Layout : left-symmetric
Eli>       Chunk Size : 512K

Eli>             Name : ganon:ganon - large raid6  (local to host ganon)
Eli>             UUID : 2b3f41d5:ac904000:965be496:dd3ae4ae
Eli>           Events : 84345

Eli>      Number   Major   Minor   RaidDevice State
Eli>         0       8       32        0      active sync   /dev/sdc
Eli>         1       8       48        1      active sync   /dev/sdd
Eli>         6       8      128        2      active sync   /dev/sdi
Eli>         3       8       96        3      active sync   /dev/sdg
Eli>         4       8       80        4      active sync   /dev/sdf
Eli>         8       8      160        5      active sync   /dev/sdk
Eli>         7       8       64        6      active sync   /dev/sde
Eli>         9       8      112        7      active sync   /dev/sdh
Eli>        10       8      176        8      active sync   /dev/sdl

Eli> You will note that the "Used Dev Size" is unknown. The output of --misc 
Eli> --examine on each disk looks similar to this:

Eli> /dev/sdc:
Eli>            Magic : a92b4efc
Eli>          Version : 1.2
Eli>      Feature Map : 0x0
Eli>       Array UUID : 2b3f41d5:ac904000:965be496:dd3ae4ae
Eli>             Name : ganon:ganon - large raid6  (local to host ganon)
Eli>    Creation Time : Sat Aug 30 22:01:09 2014
Eli>       Raid Level : raid6
Eli>     Raid Devices : 9

Eli>   Avail Dev Size : 3906767024 (1862.89 GiB 2000.26 GB)
Eli>       Array Size : 0
Eli>    Used Dev Size : 0
Eli>      Data Offset : 239616 sectors
Eli>     Super Offset : 8 sectors
Eli>     Unused Space : before=239528 sectors, after=3906789552 sectors
Eli>            State : clean
Eli>      Device UUID : b1bd681a:36849191:b3fdad44:22567d99

Eli>      Update Time : Thu Sep 28 19:44:39 2017
Eli>    Bad Block Log : 512 entries available at offset 72 sectors
Eli>         Checksum : bca7b1d5 - correct
Eli>           Events : 84345

Eli>           Layout : left-symmetric
Eli>       Chunk Size : 512K

Eli>     Device Role : Active device 0
Eli>     Array State : AAAAAAAAA ('A' == active, '.' == missing, 'R' == 
Eli> replacing)

Eli> I followed directions to create overlays and I tried to re-create the 
Eli> array with the following:

Eli> mdadm --create /dev/md150 --assume-clean --metadata=1.2 
Eli> --data-offset=117M --level=6 --layout=ls --chunk=512 --raid-devices=9 
Eli> /dev/mapper/sdc /dev/mapper/sdd /dev/mapper/sdi /dev/mapper/sdg 
Eli> /dev/mapper/sdf /dev/mapper/sdk /dev/mapper/sde /dev/mapper/sdh 
Eli> /dev/mapper/sdl

Eli> while this creates a /dev/md150, it is basically empty. There should be 
Eli> an LVM PV label on this disk but pvck returns:

Eli>    Could not find LVM label on /dev/md150

Eli> The output of --misc --examine looks like this with the overlay:

Eli> /dev/md150:
Eli>          Version : 1.2
Eli>    Creation Time : Fri Sep 29 00:22:11 2017
Eli>       Raid Level : raid6
Eli>       Array Size : 13673762816 (13040.32 GiB 14001.93 GB)
Eli>    Used Dev Size : 1953394688 (1862.90 GiB 2000.28 GB)
Eli>     Raid Devices : 9
Eli>    Total Devices : 9
Eli>      Persistence : Superblock is persistent

Eli>    Intent Bitmap : Internal

Eli>      Update Time : Fri Sep 29 00:22:11 2017
Eli>            State : clean
Eli>   Active Devices : 9
Eli> Working Devices : 9
Eli>   Failed Devices : 0
Eli>    Spare Devices : 0

Eli>           Layout : left-symmetric
Eli>       Chunk Size : 512K

Eli>             Name : ganon:150  (local to host ganon)
Eli>             UUID : 84098bfe:74c1f70c:958a7d8a:ccb2ef74
Eli>           Events : 0

Eli>      Number   Major   Minor   RaidDevice State
Eli>         0     252       11        0      active sync   /dev/dm-11
Eli>         1     252        9        1      active sync   /dev/dm-9
Eli>         2     252       16        2      active sync   /dev/dm-16
Eli>         3     252       17        3      active sync   /dev/dm-17
Eli>         4     252       10        4      active sync   /dev/dm-10
Eli>         5     252       14        5      active sync   /dev/dm-14
Eli>         6     252       12        6      active sync   /dev/dm-12
Eli>         7     252       13        7      active sync   /dev/dm-13
Eli>         8     252       15        8      active sync   /dev/dm-15

Eli> What do you think? Am I hosed here? Is there any way I can get my data back?


* Re: Accidentally resized array to 9
  2017-09-29  4:23 Accidentally resized array to 9 Eli Ben-Shoshan
  2017-09-29 12:38 ` John Stoffel
@ 2017-09-29 12:55 ` Roman Mamedov
  2017-09-29 14:53   ` Eli Ben-Shoshan
  1 sibling, 1 reply; 13+ messages in thread
From: Roman Mamedov @ 2017-09-29 12:55 UTC (permalink / raw)
  To: Eli Ben-Shoshan; +Cc: linux-raid, John Stoffel

On Fri, 29 Sep 2017 00:23:28 -0400
Eli Ben-Shoshan <eli@benshoshan.com> wrote:

> This was a RAID6 with 8 devices. Instead of using --grow with 
> --raid-devices set to 9, I did the following:
> 
> mdadm --grow /dev/md128 --size 9
> 
> This happily returned without any errors so I went to go look at 
> /proc/mdstat and did not see a resize operation going. So I shook my 
> head and read the output of --grow --help and did the right thing which is:
> 
> mdadm --grow /dev/md128 --raid-devices=9

The output of the first command is:

  # mdadm --grow /dev/md0 --size 9
  mdadm: component size of /dev/md0 has been set to 9K
  unfreeze

It didn't occur to you that you FIRST need to restore the "component size" back
to what it was previously?...
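
In other words, before touching --raid-devices, the first step should have
been something like:

  mdadm --grow /dev/md128 --size max

(or --size set back to the exact previous per-device value) to undo the 9K
setting.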

And yes, as John says, there should be a confirmation request on reducing the
array size. In fact I couldn't believe there isn't one already; that's why I
went checking. But nope, there are no warnings or confirmation requests
for reducing either --size or --array-size. I may be remembering warnings
from LVM, not MD.

-- 
With respect,
Roman


* Re: Accidentally resized array to 9
  2017-09-29 12:38 ` John Stoffel
@ 2017-09-29 14:47   ` Eli Ben-Shoshan
  2017-09-29 19:33     ` John Stoffel
  0 siblings, 1 reply; 13+ messages in thread
From: Eli Ben-Shoshan @ 2017-09-29 14:47 UTC (permalink / raw)
  To: John Stoffel; +Cc: linux-raid

On 09/29/2017 08:38 AM, John Stoffel wrote:
>>>>>> "Eli" == Eli Ben-Shoshan <eli@benshoshan.com> writes:
> 
> Eli> I need to add another disk to my array (/dev/md128) when I accidentally
> Eli> did an array resize to 9 with the following command:
> 
> Eli> First I add the disk to the array with the following:
> 
> Eli> mdadm --manage /dev/md128 --add /dev/sdl
> 
> Eli> This was a RAID6 with 8 devices. Instead of using --grow with
> Eli> --raid-devices set to 9, I did the following:
> 
> Eli> mdadm --grow /dev/md128 --size 9
> 
> Eli> This happily returned without any errors so I went to go look at
> Eli> /proc/mdstat and did not see a resize operation going. So I shook my
> Eli> head and read the output of --grow --help and did the right thing which is:
> 
> Eli> mdadm --grow /dev/md128 --raid-devices=9
> 
> Eli> Right after that everything hit the fan. dmesg reported a lot of
> Eli> filesystem errors. I quickly stopped all processes that were using this
> Eli> device and unmounted the filesystems. I then, stupidly, decided to
> Eli> reboot before looking around.
> 
> 
> I think you *might* be able to fix this with just a simple:
> 
>     mdadm --grow /dev/md128 --size max
> 
> And then try to scan for your LVM configuration, then fsck your volume
> on there.  I hope you had backups.
> 
> And maybe there should be a warning when re-sizing raid array elements
> without a --force option if going smaller than the current size?

I just tried that and got the following error:

mdadm: Cannot set device size in this type of array

Trying to go further down this path, I also tried to set the size 
explicitly with:

mdadm --grow /dev/md150 --size 1953383512

but got:

mdadm: Cannot set device size in this type of array

I am curious if my data is actually still there on disk.

What does the --size with --grow actually do?


* Re: Accidentally resized array to 9
  2017-09-29 12:55 ` Roman Mamedov
@ 2017-09-29 14:53   ` Eli Ben-Shoshan
  2017-09-29 19:50     ` Roman Mamedov
  0 siblings, 1 reply; 13+ messages in thread
From: Eli Ben-Shoshan @ 2017-09-29 14:53 UTC (permalink / raw)
  To: Roman Mamedov; +Cc: linux-raid, John Stoffel

On 09/29/2017 08:55 AM, Roman Mamedov wrote:
> On Fri, 29 Sep 2017 00:23:28 -0400
> Eli Ben-Shoshan <eli@benshoshan.com> wrote:
> 
>> This was a RAID6 with 8 devices. Instead of using --grow with
>> --raid-devices set to 9, I did the following:
>>
>> mdadm --grow /dev/md128 --size 9
>>
>> This happily returned without any errors so I went to go look at
>> /proc/mdstat and did not see a resize operation going. So I shook my
>> head and read the output of --grow --help and did the right thing which is:
>>
>> mdadm --grow /dev/md128 --raid-devices=9
> 
> The output of the first command is:
> 
>    # mdadm --grow /dev/md0 --size 9
>    mdadm: component size of /dev/md0 has been set to 9K
>    unfreeze
> 
> It didn't occur to you that you FIRST need to restore the "component size" back
> to what it was previously?...

I am not sure that I actually got any response at all from setting 
--size. I am running version:

mdadm - v3.4 - 28th January 2016

If I did get output, I totally missed it. I get that this is my fault 
for using the wrong command and any data loss is totally due to my 
stupidity. I am just hoping that there might be a way that I can get the 
data back.

> 
> And yes as John says there should be a confirmation request on reducing the
> array size. In fact I couldn't believe there isn't one already, that's why I
> went checking. But nope, there are no warnings or confirmation requests
> neither for reducing --size, nor --array-size. I might be remembering that
> there were some, from LVM, not MD.
> 



* Re: Accidentally resized array to 9
  2017-09-29 14:47   ` Eli Ben-Shoshan
@ 2017-09-29 19:33     ` John Stoffel
  2017-09-29 21:04       ` Eli Ben-Shoshan
  0 siblings, 1 reply; 13+ messages in thread
From: John Stoffel @ 2017-09-29 19:33 UTC (permalink / raw)
  To: Eli Ben-Shoshan; +Cc: John Stoffel, linux-raid

>>>>> "Eli" == Eli Ben-Shoshan <eli@benshoshan.com> writes:

Eli> On 09/29/2017 08:38 AM, John Stoffel wrote:
>>>>>>> "Eli" == Eli Ben-Shoshan <eli@benshoshan.com> writes:
>> 
Eli> I need to add another disk to my array (/dev/md128) when I accidentally
Eli> did an array resize to 9 with the following command:
>> 
Eli> First I add the disk to the array with the following:
>> 
Eli> mdadm --manage /dev/md128 --add /dev/sdl
>> 
Eli> This was a RAID6 with 8 devices. Instead of using --grow with
Eli> --raid-devices set to 9, I did the following:
>> 
Eli> mdadm --grow /dev/md128 --size 9
>> 
Eli> This happily returned without any errors so I went to go look at
Eli> /proc/mdstat and did not see a resize operation going. So I shook my
Eli> head and read the output of --grow --help and did the right thing which is:
>> 
Eli> mdadm --grow /dev/md128 --raid-devices=9
>> 
Eli> Right after that everything hit the fan. dmesg reported a lot of
Eli> filesystem errors. I quickly stopped all processes that were using this
Eli> device and unmounted the filesystems. I then, stupidly, decided to
Eli> reboot before looking around.
>> 
>> 
>> I think you *might* be able to fix this with just a simple:
>> 
>> mdadm --grow /dev/md128 --size max
>> 
>> And then try to scan for your LVM configuration, then fsck your volume
>> on there.  I hope you had backups.
>> 
>> And maybe there should be a warning when re-sizing raid array elements
>> without a --force option if going smaller than the current size?

Eli> I just tried that and got the following error:

Eli> mdadm: Cannot set device size in this type of array

Eli> Trying to go further down this path, I also tried to set the size 
Eli> explicitly with:

Eli> mdadm --grow /dev/md150 --size 1953383512

Eli> but got:

Eli> mdadm: Cannot set device size in this type of array

Eli> I am curious if my data is actually still there on disk.

Eli> What does the --size with --grow actually do?

It changes the size of each member of the array.  The man page
explains it, though not ... obviously.
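
The value is taken as kibibytes of each member device to use, which is why
your earlier command did what it did:

  mdadm --grow /dev/md128 --size 9      # use only 9 KiB of each member, not 9 devices
  mdadm --grow /dev/md128 --size max    # use as much of each member as possible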

Are you still running with the overlays?  That would explain why it
can't resize them bigger.  But I'm also behind on email today...

John


* Re: Accidentally resized array to 9
  2017-09-29 14:53   ` Eli Ben-Shoshan
@ 2017-09-29 19:50     ` Roman Mamedov
  2017-09-30 16:21       ` Phil Turmel
  0 siblings, 1 reply; 13+ messages in thread
From: Roman Mamedov @ 2017-09-29 19:50 UTC (permalink / raw)
  To: Eli Ben-Shoshan; +Cc: linux-raid, John Stoffel

On Fri, 29 Sep 2017 10:53:57 -0400
Eli Ben-Shoshan <eli@benshoshan.com> wrote:

> I am just hoping that there might be a way that I can get the 
> data back.

In theory what you did was cut the array size to only use 9 KB of each device,
then reshaped THAT tiny array from 8 to 9 devices, with the rest left
completely untouched.

So you could try removing the "new" disk, then try --create --assume-clean
with old devices only and --raid-devices=8.

But I'm not sure how you would get the device order right.
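
Something along the lines of your earlier attempt, only with 8 devices and
on overlays only, for example (the order here is simply taken from your
--detail output above and is an assumption on my part):

  mdadm --create /dev/md150 --assume-clean --metadata=1.2 --data-offset=117M \
        --level=6 --layout=ls --chunk=512 --raid-devices=8 \
        /dev/mapper/sdc /dev/mapper/sdd /dev/mapper/sdi /dev/mapper/sdg \
        /dev/mapper/sdf /dev/mapper/sdk /dev/mapper/sde /dev/mapper/sdh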

The best you can hope for is that the bulk of the array data is intact,
with only the first 9 KB of each device, times (8-2) data devices, i.e. about
the first 54 KB of data on the md array, corrupted and unusable. It is likely
that the LVM and filesystem tools will not recognize anything because of that,
so you will need to use some data recovery software to look for and save the
data.

-- 
With respect,
Roman


* Re: Accidentally resized array to 9
  2017-09-29 19:33     ` John Stoffel
@ 2017-09-29 21:04       ` Eli Ben-Shoshan
  2017-09-29 21:17         ` John Stoffel
  0 siblings, 1 reply; 13+ messages in thread
From: Eli Ben-Shoshan @ 2017-09-29 21:04 UTC (permalink / raw)
  To: John Stoffel; +Cc: linux-raid

On 09/29/2017 03:33 PM, John Stoffel wrote:
>>>>>> "Eli" == Eli Ben-Shoshan <eli@benshoshan.com> writes:
> 
> Eli> On 09/29/2017 08:38 AM, John Stoffel wrote:
>>>>>>>> "Eli" == Eli Ben-Shoshan <eli@benshoshan.com> writes:
>>>
> Eli> I need to add another disk to my array (/dev/md128) when I accidentally
> Eli> did an array resize to 9 with the following command:
>>>
> Eli> First I add the disk to the array with the following:
>>>
> Eli> mdadm --manage /dev/md128 --add /dev/sdl
>>>
> Eli> This was a RAID6 with 8 devices. Instead of using --grow with
> Eli> --raid-devices set to 9, I did the following:
>>>
> Eli> mdadm --grow /dev/md128 --size 9
>>>
> Eli> This happily returned without any errors so I went to go look at
> Eli> /proc/mdstat and did not see a resize operation going. So I shook my
> Eli> head and read the output of --grow --help and did the right thing which is:
>>>
> Eli> mdadm --grow /dev/md128 --raid-devices=9
>>>
> Eli> Right after that everything hit the fan. dmesg reported a lot of
> Eli> filesystem errors. I quickly stopped all processes that were using this
> Eli> device and unmounted the filesystems. I then, stupidly, decided to
> Eli> reboot before looking around.
>>>
>>>
>>> I think you *might* be able to fix this with just a simple:
>>>
>>> mdadm --grow /dev/md128 --size max
>>>
>>> And then try to scan for your LVM configuration, then fsck your volume
>>> on there.  I hope you had backups.
>>>
>>> And maybe there should be a warning when re-sizing raid array elements
>>> without a --force option if going smaller than the current size?
> 
> Eli> I just tried that and got the following error:
> 
> Eli> mdadm: Cannot set device size in this type of array
> 
> Eli> Trying to go further down this path, I also tried to set the size
> Eli> explicitly with:
> 
> Eli> mdadm --grow /dev/md150 --size 1953383512
> 
> Eli> but got:
> 
> Eli> mdadm: Cannot set device size in this type of array
> 
> Eli> I am curious if my data is actually still there on disk.
> 
> Eli> What does the --size with --grow actually do?
> 
> It changes the size of each member of the array.  The man page
> explains it, though not ... obviously.
> 
> Are you still running with the overlays?  That would explain why it
> can't resize them bigger.  But I'm also behind on email today...
> 
> John
> 

I was still using the overlay. I just tried the grow without the overlay 
and got the same error.


* Re: Accidentally resized array to 9
  2017-09-29 21:04       ` Eli Ben-Shoshan
@ 2017-09-29 21:17         ` John Stoffel
  2017-09-29 21:49           ` Eli Ben-Shoshan
  0 siblings, 1 reply; 13+ messages in thread
From: John Stoffel @ 2017-09-29 21:17 UTC (permalink / raw)
  To: Eli Ben-Shoshan; +Cc: John Stoffel, linux-raid

>>>>> "Eli" == Eli Ben-Shoshan <eli@benshoshan.com> writes:

Eli> On 09/29/2017 03:33 PM, John Stoffel wrote:
>>>>>>> "Eli" == Eli Ben-Shoshan <eli@benshoshan.com> writes:
>> 
Eli> On 09/29/2017 08:38 AM, John Stoffel wrote:
>>>>>>>>> "Eli" == Eli Ben-Shoshan <eli@benshoshan.com> writes:
>>>> 
Eli> I need to add another disk to my array (/dev/md128) when I accidentally
Eli> did an array resize to 9 with the following command:
>>>> 
Eli> First I add the disk to the array with the following:
>>>> 
Eli> mdadm --manage /dev/md128 --add /dev/sdl
>>>> 
Eli> This was a RAID6 with 8 devices. Instead of using --grow with
Eli> --raid-devices set to 9, I did the following:
>>>> 
Eli> mdadm --grow /dev/md128 --size 9
>>>> 
Eli> This happily returned without any errors so I went to go look at
Eli> /proc/mdstat and did not see a resize operation going. So I shook my
Eli> head and read the output of --grow --help and did the right thing which is:
>>>> 
Eli> mdadm --grow /dev/md128 --raid-devices=9
>>>> 
Eli> Right after that everything hit the fan. dmesg reported a lot of
Eli> filesystem errors. I quickly stopped all processes that were using this
Eli> device and unmounted the filesystems. I then, stupidly, decided to
Eli> reboot before looking around.
>>>> 
>>>> 
>>>> I think you *might* be able to fix this with just a simple:
>>>> 
>>>> mdadm --grow /dev/md128 --size max
>>>> 
>>>> And then try to scan for your LVM configuration, then fsck your volume
>>>> on there.  I hope you had backups.
>>>> 
>>>> And maybe there should be a warning when re-sizing raid array elements
>>>> without a --force option if going smaller than the current size?
>> 
Eli> I just tried that and got the following error:
>> 
Eli> mdadm: Cannot set device size in this type of array
>> 
Eli> Trying to go further down this path, I also tried to set the size
Eli> explicitly with:
>> 
Eli> mdadm --grow /dev/md150 --size 1953383512
>> 
Eli> but got:
>> 
Eli> mdadm: Cannot set device size in this type of array
>> 
Eli> I am curious if my data is actually still there on disk.
>> 
Eli> What does the --size with --grow actually do?
>> 
>> It changes the size of each member of the array.  The man page
>> explains it, though not ... obviously.
>> 
>> Are you still running with the overlays?  That would explain why it
>> can't resize them bigger.  But I'm also behind on email today...


Eli> I was still using the overlay. I just tried the grow without the overlay 
Eli> and got the same error.

Hmm.. what do the partitions on the disks look like now?  You might
need to do more digging.  But I would say that letting --grow *shrink*
an array without any warning is a bad idea for the mdadm tools.
It should scream loudly and only shrink like that when forced to.

Aw crap... you used the whole disk.  I don't like doing this because
A) if I get a replacement disk slightly *smaller* than what I currently
have, it will be painful, and B) it's just as easy to use a partition
starting 4 MB from the start and ending a few hundred MB (or even a GB)
before the end.

In your case, can you try to do the 'mdadm --grow /dev/md### --size
max' but with a version of mdadm compiled with debugging info, or at
least using the latest version of the code if at all possible.

Grab it from https://github.com/neilbrown/mdadm  and when you
configure it, make sure you enable debugging.  Or grab it from
https://www.kernel.org/pub/linux/utils/raid/mdadm/ and try the same
thing.
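
Something like this, roughly (assuming the 4.0 tarball name; run the freshly
built binary from the build directory rather than the installed one):

  wget https://www.kernel.org/pub/linux/utils/raid/mdadm/mdadm-4.0.tar.xz
  tar xf mdadm-4.0.tar.xz && cd mdadm-4.0
  make            # see the Makefile for the debug-related CFLAGS knobs
  ./mdadm --grow /dev/md128 --size max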

Can you show the output of 'cat /proc/partitions' as well?  Maybe you
need to do:

  mdadm --grow <dev> --size ########

where ######## is the smallest of the maximum sizes of your disks.  Might
work...



* Re: Accidentally resized array to 9
  2017-09-29 21:17         ` John Stoffel
@ 2017-09-29 21:49           ` Eli Ben-Shoshan
  0 siblings, 0 replies; 13+ messages in thread
From: Eli Ben-Shoshan @ 2017-09-29 21:49 UTC (permalink / raw)
  To: John Stoffel; +Cc: linux-raid

On 09/29/2017 05:17 PM, John Stoffel wrote:
>>>>>> "Eli" == Eli Ben-Shoshan <eli@benshoshan.com> writes:
> 
> Eli> On 09/29/2017 03:33 PM, John Stoffel wrote:
>>>>>>>> "Eli" == Eli Ben-Shoshan <eli@benshoshan.com> writes:
>>>
> Eli> On 09/29/2017 08:38 AM, John Stoffel wrote:
>>>>>>>>>> "Eli" == Eli Ben-Shoshan <eli@benshoshan.com> writes:
>>>>>
> Eli> I need to add another disk to my array (/dev/md128) when I accidentally
> Eli> did an array resize to 9 with the following command:
>>>>>
> Eli> First I add the disk to the array with the following:
>>>>>
> Eli> mdadm --manage /dev/md128 --add /dev/sdl
>>>>>
> Eli> This was a RAID6 with 8 devices. Instead of using --grow with
> Eli> --raid-devices set to 9, I did the following:
>>>>>
> Eli> mdadm --grow /dev/md128 --size 9
>>>>>
> Eli> This happily returned without any errors so I went to go look at
> Eli> /proc/mdstat and did not see a resize operation going. So I shook my
> Eli> head and read the output of --grow --help and did the right thing which is:
>>>>>
> Eli> mdadm --grow /dev/md128 --raid-devices=9
>>>>>
> Eli> Right after that everything hit the fan. dmesg reported a lot of
> Eli> filesystem errors. I quickly stopped all processes that were using this
> Eli> device and unmounted the filesystems. I then, stupidly, decided to
> Eli> reboot before looking around.
>>>>>
>>>>>
>>>>> I think you *might* be able to fix this with just a simple:
>>>>>
>>>>> mdadm --grow /dev/md128 --size max
>>>>>
>>>>> And then try to scan for your LVM configuration, then fsck your volume
>>>>> on there.  I hope you had backups.
>>>>>
>>>>> And maybe there should be a warning when re-sizing raid array elements
>>>>> without a --force option if going smaller than the current size?
>>>
> Eli> I just tried that and got the following error:
>>>
> Eli> mdadm: Cannot set device size in this type of array
>>>
> Eli> Trying to go further down this path, I also tried to set the size
> Eli> explicitly with:
>>>
> Eli> mdadm --grow /dev/md150 --size 1953383512
>>>
> Eli> but got:
>>>
> Eli> mdadm: Cannot set device size in this type of array
>>>
> Eli> I am curious if my data is actually still there on disk.
>>>
> Eli> What does the --size with --grow actually do?
>>>
>>> It changes the size of each member of the array.  The man page
>>> explains it, though not ... obviously.
>>>
>>> Are you still running with the overlays?  That would explain why it
>>> can't resize them bigger.  But I'm also behind on email today...
> 
> 
> Eli> I was still using the overlay. I just tried the grow without the overlay
> Eli> and got the same error.
> 
> Hmm.. what do the partitions on the disk look like now?  You might
> need to do more digging.  But I would say that using --grow and having
> it *shrink* without any warnings is a bad idea for the mdadm tools.
> It should scream loudly and only run when forced to like that.
> 
> Aw crap... you used the whole disk.  I don't like doing this because
> A) if I get a disk slightly *smaller* than what I currently have, it
> will be painful, B) it's easy to use a small partition starting 4mb
> from the start and a few hundred Mb (or even a Gb) from the end.
> 
> In your case, can you try to do the 'mdadm --grow /dev/md### --size
> max' but with a version of mdadm compiled with debugging info, or at
> least using the latest version of the code if at all possible.
> 
> Grab it from https://github.com/neilbrown/mdadm  and when you
> configure it, make sure you enable debugging.  Or grab it from
> https://www.kernel.org/pub/linux/utils/raid/mdadm/ and try the same
> thing.
> 
> Can you show the output of: cat /proc/partitions as well?  Maybe you
> need to do:
> 
>    mdadm --grow <dev> --size ########
> 
> which is the smallest of the max size of all your disks.  Might
> work...
> 

ganon mdadm-4.0 # cat /proc/partitions
major minor  #blocks  name

    1        0       8192 ram0
    1        1       8192 ram1
    1        2       8192 ram2
    1        3       8192 ram3
    1        4       8192 ram4
    1        5       8192 ram5
    1        6       8192 ram6
    1        7       8192 ram7
    1        8       8192 ram8
    1        9       8192 ram9
    1       10       8192 ram10
    1       11       8192 ram11
    1       12       8192 ram12
    1       13       8192 ram13
    1       14       8192 ram14
    1       15       8192 ram15
    8        0  234431064 sda
    8        1     262144 sda1
    8        2  204472320 sda2
    8        3    2097152 sda3
    8       16  234431064 sdb
    8       17     262144 sdb1
    8       18  204472320 sdb2
    8       19    2097152 sdb3
    8       32 1953514584 sdc
    8       48 1953514584 sdd
    8       64 1953514584 sde
    8       80 1953514584 sdf
    8       96 1953514584 sdg
    9      126     262080 md126
    9      127  204341248 md127
  252        0    1572864 dm-0
  252        1    9437184 dm-1
  252        2    4194304 dm-2
  252        3   16777216 dm-3
  252        4   25165824 dm-4
  252        5    7340032 dm-5
  252        6    6291456 dm-6
  252        7   53477376 dm-7
  252        8    1048576 dm-8
    8      112 1953514584 sdh
    8      128 1953514584 sdi
    8      144 1953514584 sdj
    8      145    4194304 sdj1
    8      146     524288 sdj2
    8      147 1948793912 sdj3
    8      160 1953514584 sdk
    8      176 1953514584 sdl
    8      192 1953514584 sdm
    8      193    4194304 sdm1
    8      194     524288 sdm2
    8      195 1948793912 sdm3
    9      131 1948662656 md131
    9      129    4192192 md129
    9      130     523968 md130

I got mdadm-4.0 compiled with debug flags. Here is the output, starting 
with --assemble --scan:

ganon mdadm-4.0 # ./mdadm --assemble --scan
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: 
/sys/devices/pci0000:00/0000:00:03.0/0000:01:00.0/host14/port-14:5/end_device-14:5/target14:0:5/14:0:5:0
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: 
/sys/devices/pci0000:00/0000:00:03.0/0000:01:00.0/host14/port-14:4/end_device-14:4/target14:0:4/14:0:4:0
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: 
/sys/devices/pci0000:00/0000:00:03.0/0000:01:00.0/host14/port-14:3/end_device-14:3/target14:0:3/14:0:3:0
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: 
/sys/devices/pci0000:00/0000:00:03.0/0000:01:00.0/host14/port-14:2/end_device-14:2/target14:0:2/14:0:2:0
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: 
/sys/devices/pci0000:00/0000:00:03.0/0000:01:00.0/host14/port-14:1/end_device-14:1/target14:0:1/14:0:1:0
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: 
/sys/devices/pci0000:00/0000:00:03.0/0000:01:00.0/host14/port-14:0/end_device-14:0/target14:0:0/14:0:0:0
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: 
/sys/devices/pci0000:00/0000:00:1c.5/0000:04:00.0/ata13/host12/target12:0:0/12:0:0:0
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: /sys/devices/pci0000:00/0000:00:1f.2/ata10/host9/target9:0:0/9:0:0:0
mdadm: scan: ptr->vendorID: 1103 __le16_to_cpu(ptr->deviceID): 2720
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: /sys/devices/pci0000:00/0000:00:1f.2/ata9/host8/target8:0:0/8:0:0:0
mdadm: scan: ptr->vendorID: 1103 __le16_to_cpu(ptr->deviceID): 2720
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: /sys/devices/pci0000:00/0000:00:1f.2/ata8/host7/target7:0:0/7:0:0:0
mdadm: scan: ptr->vendorID: 1103 __le16_to_cpu(ptr->deviceID): 2720
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: /sys/devices/pci0000:00/0000:00:1f.2/ata7/host6/target6:0:0/6:0:0:0
mdadm: scan: ptr->vendorID: 1103 __le16_to_cpu(ptr->deviceID): 2720
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: /sys/devices/pci0000:00/0000:00:1f.2/ata6/host5/target5:0:0/5:0:0:0
mdadm: scan: ptr->vendorID: 1103 __le16_to_cpu(ptr->deviceID): 2720
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: /sys/devices/pci0000:00/0000:00:1f.2/ata5/host4/target4:0:0/4:0:0:0
mdadm: scan: ptr->vendorID: 1103 __le16_to_cpu(ptr->deviceID): 2720
mdadm: start_array: /dev/md128 has been started with 9 drives.
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: 
/sys/devices/pci0000:00/0000:00:03.0/0000:01:00.0/host14/port-14:5/end_device-14:5/target14:0:5/14:0:5:0
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: 
/sys/devices/pci0000:00/0000:00:03.0/0000:01:00.0/host14/port-14:4/end_device-14:4/target14:0:4/14:0:4:0
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: 
/sys/devices/pci0000:00/0000:00:03.0/0000:01:00.0/host14/port-14:3/end_device-14:3/target14:0:3/14:0:3:0
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: 
/sys/devices/pci0000:00/0000:00:03.0/0000:01:00.0/host14/port-14:2/end_device-14:2/target14:0:2/14:0:2:0
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: 
/sys/devices/pci0000:00/0000:00:03.0/0000:01:00.0/host14/port-14:1/end_device-14:1/target14:0:1/14:0:1:0
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: 
/sys/devices/pci0000:00/0000:00:03.0/0000:01:00.0/host14/port-14:0/end_device-14:0/target14:0:0/14:0:0:0
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: 
/sys/devices/pci0000:00/0000:00:1c.5/0000:04:00.0/ata13/host12/target12:0:0/12:0:0:0
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: /sys/devices/pci0000:00/0000:00:1f.2/ata10/host9/target9:0:0/9:0:0:0
mdadm: scan: ptr->vendorID: 1103 __le16_to_cpu(ptr->deviceID): 2720
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: /sys/devices/pci0000:00/0000:00:1f.2/ata9/host8/target8:0:0/8:0:0:0
mdadm: scan: ptr->vendorID: 1103 __le16_to_cpu(ptr->deviceID): 2720
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: /sys/devices/pci0000:00/0000:00:1f.2/ata8/host7/target7:0:0/7:0:0:0
mdadm: scan: ptr->vendorID: 1103 __le16_to_cpu(ptr->deviceID): 2720
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: /sys/devices/pci0000:00/0000:00:1f.2/ata7/host6/target6:0:0/6:0:0:0
mdadm: scan: ptr->vendorID: 1103 __le16_to_cpu(ptr->deviceID): 2720
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: /sys/devices/pci0000:00/0000:00:1f.2/ata6/host5/target5:0:0/5:0:0:0
mdadm: scan: ptr->vendorID: 1103 __le16_to_cpu(ptr->deviceID): 2720
mdadm: path_attached_to_hba: hba: /sys/devices/pci0000:00/0000:00:1f.2 - 
disk: /sys/devices/pci0000:00/0000:00:1f.2/ata5/host4/target4:0:0/4:0:0:0
mdadm: scan: ptr->vendorID: 1103 __le16_to_cpu(ptr->deviceID): 2720

and now an attempt to --grow with --size max:

ganon mdadm-4.0 # ./mdadm --grow /dev/md128 --size max
mdadm: Grow_reshape: Cannot set device size in this type of array.

I am not using overlays with the above commands.


* Re: Accidentally resized array to 9
  2017-09-29 19:50     ` Roman Mamedov
@ 2017-09-30 16:21       ` Phil Turmel
  2017-09-30 16:29         ` Roman Mamedov
  2017-09-30 23:30         ` John Stoffel
  0 siblings, 2 replies; 13+ messages in thread
From: Phil Turmel @ 2017-09-30 16:21 UTC (permalink / raw)
  To: Roman Mamedov, Eli Ben-Shoshan; +Cc: linux-raid, John Stoffel

On 09/29/2017 03:50 PM, Roman Mamedov wrote:
> On Fri, 29 Sep 2017 10:53:57 -0400
> Eli Ben-Shoshan <eli@benshoshan.com> wrote:
> 
>> I am just hoping that there might be a way that I can get the 
>> data back.
> 
> In theory what you did was cut the array size to only use 9 KB of each device,
> then reshaped THAT tiny array from 8 to 9 devices, with the rest left
> completely untouched.
> 
> So you could try removing the "new" disk, then try --create --assume-clean
> with old devices only and --raid-devices=8.
> 
> But I'm not sure how you would get the device order right.
> 
> Ideally what you can hope for, is you would get the bulk of array data intact,
> only with the first 9 KB of each device *(8-2), so about the first 54 KB of
> data on the md array, corrupted and unusable. It is likely the LVM and
> filesystem tools will not recognize anything due to that, so you will need to
> use some data recovery software to look for and save the data.
> 

I agree with Roman.  Most of your array should still be on the 8-disk
layout.  But the array was mounted and had processes writing to it
immediately after the broken grow, so there is probably additional
corruption from writes laid out in the 9-disk pattern across the 8 disks.

Roman's suggestion is the best plan, but even after restoring LVM,
expect breakage all over.  Use overlays.
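
For reference, the usual overlay recipe is roughly the following sketch (the
COW file size and paths are placeholders, and the list of disks assumes the
original eight members):

  for d in sdc sdd sde sdf sdg sdh sdi sdk; do
      truncate -s 4G /root/overlay-$d               # sparse copy-on-write file
      size=$(blockdev --getsize /dev/$d)            # device size in 512-byte sectors
      loop=$(losetup -f --show /root/overlay-$d)
      echo "0 $size snapshot /dev/$d $loop P 8" | dmsetup create $d
  done

Every experiment then runs against /dev/mapper/sdc ... /dev/mapper/sdk
instead of the raw disks.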

Phil


* Re: Accidentally resized array to 9
  2017-09-30 16:21       ` Phil Turmel
@ 2017-09-30 16:29         ` Roman Mamedov
  2017-09-30 23:30         ` John Stoffel
  1 sibling, 0 replies; 13+ messages in thread
From: Roman Mamedov @ 2017-09-30 16:29 UTC (permalink / raw)
  To: Phil Turmel; +Cc: Eli Ben-Shoshan, linux-raid, John Stoffel

On Sat, 30 Sep 2017 12:21:20 -0400
Phil Turmel <philip@turmel.org> wrote:

> > Ideally what you can hope for, is you would get the bulk of array data intact,
> > only with the first 9 KB of each device *(8-2), so about the first 54 KB of
> > data on the md array, corrupted and unusable. It is likely the LVM and
> > filesystem tools will not recognize anything due to that, so you will need to
> > use some data recovery software to look for and save the data.
> > 
> 
> I agree with Roman.  Most of your array should be still on the 8-disk
> layout.  But you were mounted and had writing processes immediately
> after the broken grow, so there's probably other corruption due to
> writes on the 9-disk pattern in the 8 disks.

One afterthought I had: you could probably salvage the 54 KB in
question by reading it (and saving it to a file) from the current
"9-device array of 9 KB devices" that you ended up with.

With respect,
Roman


* Re: Accidentally resized array to 9
  2017-09-30 16:21       ` Phil Turmel
  2017-09-30 16:29         ` Roman Mamedov
@ 2017-09-30 23:30         ` John Stoffel
  1 sibling, 0 replies; 13+ messages in thread
From: John Stoffel @ 2017-09-30 23:30 UTC (permalink / raw)
  To: Phil Turmel; +Cc: Roman Mamedov, Eli Ben-Shoshan, linux-raid, John Stoffel

>>>>> "Phil" == Phil Turmel <philip@turmel.org> writes:

Phil> On 09/29/2017 03:50 PM, Roman Mamedov wrote:
>> On Fri, 29 Sep 2017 10:53:57 -0400
>> Eli Ben-Shoshan <eli@benshoshan.com> wrote:
>> 
>>> I am just hoping that there might be a way that I can get the 
>>> data back.
>> 
>> In theory what you did was cut the array size to only use 9 KB of each device,
>> then reshaped THAT tiny array from 8 to 9 devices, with the rest left
>> completely untouched.
>> 
>> So you could try removing the "new" disk, then try --create --assume-clean
>> with old devices only and --raid-devices=8.
>> 
>> But I'm not sure how you would get the device order right.
>> 
>> Ideally what you can hope for, is you would get the bulk of array data intact,
>> only with the first 9 KB of each device *(8-2), so about the first 54 KB of
>> data on the md array, corrupted and unusable. It is likely the LVM and
>> filesystem tools will not recognize anything due to that, so you will need to
>> use some data recovery software to look for and save the data.
>> 

Phil> I agree with Roman.  Most of your array should be still on the 8-disk
Phil> layout.  But you were mounted and had writing processes immediately
Phil> after the broken grow, so there's probably other corruption due to
Phil> writes on the 9-disk pattern in the 8 disks.

Phil> Roman's suggestion is the best plan, but even after restoring LVM,
Phil> expect breakage all over.  Use overlays.

Maybe the answer is to remove the added disk, set up overlays on the
eight remaining disks, and then try mdadm --create ... with each
of the permutations.  Then you would bring up the LVs on there and see
if you can fsck them and get some data back.

I think the grow isn't going to work; it's really quite hosed at this
point.

If I find some time, I think I'll try to spin up a patch to mdadm to
stop issues like this from happening, by requiring an explicit
confirmation (or a flag to force the shrink) before a --size change to
a smaller size is applied.  Since it's so damn painful.

I don't have a lot of hope here for you, unfortunately.  I think you're
now in the stage where a --create using the original eight disks is
the way to go.

You *might* be able to find RAID backups at some offset into the disks
to tell you what order each disk is in.  So the steps, roughly, would
be:

1. stop /dev/md128
2. remove the new disk (/dev/sdl).
3. setup overlays again.

4. mdadm --create /dev/md128 --assume-clean --level 6 -n 8 /dev/mapper/sd{c,d,e,f,g,h,i,k}

5. pvscan, then vgchange -ay <vgname>
6. lvs
7. fsck ....
8. if nothing, loop back to step four with a different order of
   devices.


If you have any /proc/mdstat output from before, that would be
helpful, as would a mapping of device names (/dev/sd*) to serial
numbers.
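
A quick way to capture that mapping, if nothing else is handy:

  lsblk -o NAME,SIZE,MODEL,SERIAL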

[ There's a neat script called 'lsdrv' which you can grab here
(https://github.com/pturmel/lsdrv) to gather and show all this data.
But it's busted for lvcache devices.  Oops!  Time for more hacking! ]

But I hate to say... I suspect you're toast. But don't listen to me.

John


