* mdadm: Patch to restrict --size when shrinking unless forced
@ 2017-10-04 18:00 John Stoffel
  2017-10-04 18:11 ` Jes Sorensen
  2017-10-04 21:50 ` NeilBrown
  0 siblings, 2 replies; 28+ messages in thread
From: John Stoffel @ 2017-10-04 18:00 UTC (permalink / raw)
  To: John Stoffel; +Cc: Eli Ben-Shoshan, Jes.Sorensen, linux-raid


Since Eli had such a horrible experience where he shrunk the
individual component raid device size, instead of growing the overall
raid by adding a device, I came up with this hacky patch to warn you
when you are about to shoot yourself in the foot.

The idea is it will warn you and exit unless you pass in the --force
(or -f) switch when using the command.  For example, on a set of loop
devices:

    # cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5]
    [raid4] [multipath] [faulty]
    md99 : active raid6 loop4p1[4] loop3p1[3] loop2p1[2] loop1p1[1]
    loop0p1[0]
	  606720 blocks super 1.2 level 6, 512k chunk, algorithm 2 [5/5]
	  [UUUUU]

    # ./mdadm --grow /dev/md99 --size 128
    mdadm: Cannot set device size smaller than current component_size of /dev/md99 array.  Use -f to force change.

    # ./mdadm --grow /dev/md99 --size 128 -f
    mdadm: component size of /dev/md99 has been set to 0K


I suspect I could do a better job of showing the original component
size, so that you have a chance of recovering even then.

But the patch:

diff --git a/Grow.c b/Grow.c
index 455c5f9..701590f 100755
--- a/Grow.c
+++ b/Grow.c
@@ -1561,7 +1561,7 @@ static int reshape_container(char *container, char *devname,
 			     char *backup_file, int verbose,
 			     int forked, int restart, int freeze_reshape);
 
-int Grow_reshape(char *devname, int fd,
+int Grow_reshape(char *devname, int fd, int force,
 		 struct mddev_dev *devlist,
 		 unsigned long long data_offset,
 		 struct context *c, struct shape *s)
@@ -1574,6 +1574,7 @@ int Grow_reshape(char *devname, int fd,
 	 * requested everything (if kernel supports freezing - 2.6.30).
 	 * The steps are:
 	 *  - change size (i.e. component_size)
+         *    - when shrinking, you must force the change 
 	 *  - change level
 	 *  - change layout/chunksize/ndisks
 	 *
@@ -1617,6 +1618,11 @@ int Grow_reshape(char *devname, int fd,
 		return 1;
 	}
 
+	if ((s->size < (unsigned)array.size) && !force) {
+	    pr_err("Cannot set device size smaller than current component_size of %s array.  Use -f to force change.\n",devname);
+	    return 1;
+	}
+
 	if (s->raiddisks && s->raiddisks < array.raid_disks && array.level > 1 &&
 	    get_linux_version() < 2006032 &&
 	    !check_env("MDADM_FORCE_FEWER")) {
diff --git a/ReadMe.c b/ReadMe.c
index 50d3807..46988ae 100644
--- a/ReadMe.c
+++ b/ReadMe.c
@@ -203,6 +203,7 @@ struct option long_options[] = {
     {"invalid-backup",0,0,InvalidBackup},
     {"array-size", 1, 0, 'Z'},
     {"continue", 0, 0, Continue},
+    {"force",	  0, 0, Force},
 
     /* For Incremental */
     {"rebuild-map", 0, 0, RebuildMapOpt},
@@ -563,6 +564,7 @@ char Help_grow[] =
 "                      : This is useful if all devices have been replaced\n"
 "                      : with larger devices.   Value is in Kilobytes, or\n"
 "                      : the special word 'max' meaning 'as large as possible'.\n"
+"  --force       -f    : Override normal checks and be more forceful\n"
 "  --assume-clean      : When increasing the --size, this flag will avoid\n"
 "                      : a resync of the new space\n"
 "  --chunk=       -c   : Change the chunksize of the array\n"
diff --git a/mdadm.c b/mdadm.c
index c3a265b..821658a 100644
--- a/mdadm.c
+++ b/mdadm.c
@@ -1617,7 +1617,7 @@ int main(int argc, char *argv[])
 		else if (s.size > 0 || s.raiddisks || s.layout_str ||
 			 s.chunk != 0 || s.level != UnSet ||
 			 data_offset != INVALID_SECTORS) {
-			rv = Grow_reshape(devlist->devname, mdfd,
+		    rv = Grow_reshape(devlist->devname, mdfd, c.force,
 					  devlist->next,
 					  data_offset, &c, &s);
 		} else if (array_size == 0)
diff --git a/mdadm.h b/mdadm.h
index 71b8afb..9e00f05 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1300,7 +1300,7 @@ extern int autodetect(void);
 extern int Grow_Add_device(char *devname, int fd, char *newdev);
 extern int Grow_addbitmap(char *devname, int fd,
 			  struct context *c, struct shape *s);
-extern int Grow_reshape(char *devname, int fd,
+extern int Grow_reshape(char *devname, int fd, int force,
 			struct mddev_dev *devlist,
 			unsigned long long data_offset,
 			struct context *c, struct shape *s);
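
A minimal sketch of how the guard could report the original component
size, following the note above about making recovery easier.  This is
stand-alone C, not a diff against Grow.c: check_shrink(), its parameters
and the message wording are hypothetical, sizes are taken to be in K as
with --size, and a requested size of 0 is assumed to mean "no --size
given" (main() only tests s.size > 0).

```c
#include <stdio.h>

/*
 * Hypothetical stand-alone shrink guard; not the code from the patch.
 * cur_kib stands in for array.size and new_kib for s->size, both in K.
 * Writes a message into msg and returns 1 if the change is refused,
 * 0 if it may go ahead.  msglen must be at least 1.
 */
static int check_shrink(const char *devname,
			unsigned long long cur_kib,
			unsigned long long new_kib,
			int force, char *msg, size_t msglen)
{
	msg[0] = '\0';

	/* Assumption: 0 means no size change was requested. */
	if (new_kib == 0 || new_kib >= cur_kib)
		return 0;

	if (!force) {
		snprintf(msg, msglen,
			 "%s: refusing to shrink component size from %lluK to %lluK, use --force to override",
			 devname, cur_kib, new_kib);
		return 1;
	}

	/* Forced: still record the old size so it can be restored. */
	snprintf(msg, msglen,
		 "%s: component size reduced from %lluK to %lluK",
		 devname, cur_kib, new_kib);
	return 0;
}
```

For the loop-device array above, and assuming three of the five RAID6
devices carry data, 606720K / 3 gives 202240K per component, so the
unforced call would name both 202240K and 128K in its message.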


* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-04 18:00 mdadm: Patch to restrict --size when shrinking unless forced John Stoffel
@ 2017-10-04 18:11 ` Jes Sorensen
  2017-10-04 19:15   ` John Stoffel
  2017-10-04 21:50 ` NeilBrown
  1 sibling, 1 reply; 28+ messages in thread
From: Jes Sorensen @ 2017-10-04 18:11 UTC (permalink / raw)
  To: John Stoffel; +Cc: Eli Ben-Shoshan, linux-raid

On 10/04/2017 02:00 PM, John Stoffel wrote:
> 
> Since Eli had such a horrible experience where he shrunk the
> individual component raid device size, instead of growing the overall
> raid by adding a device, I came up with this hacky patch to warn you
> when you are about to shoot yourself in the foot.
> 
> The idea is it will warn you and exit unless you pass in the --force
> (or -f) switch when using the command.  For example, on a set of loop
> devices:
> 
>      # cat /proc/mdstat
>      Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5]
>      [raid4] [multipath] [faulty]
>      md99 : active raid6 loop4p1[4] loop3p1[3] loop2p1[2] loop1p1[1]
>      loop0p1[0]
> 	  606720 blocks super 1.2 level 6, 512k chunk, algorithm 2 [5/5]
> 	  [UUUUU]
> 
>      # ./mdadm --grow /dev/md99 --size 128
>      mdadm: Cannot set device size smaller than current component_size of /dev/md99 array.  Use -f to force change.
> 
>      # ./mdadm --grow /dev/md99 --size 128 -f
>      mdadm: component size of /dev/md99 has been set to 0K
> 
> 
> I suspect I could do a better job of showing the original component
> size, so that you have a chance of recovering even then.
> 
> But the patch:

Before looking at the actual code, some cosmetics


> diff --git a/Grow.c b/Grow.c
> index 455c5f9..701590f 100755
> --- a/Grow.c
> +++ b/Grow.c
> @@ -1561,7 +1561,7 @@ static int reshape_container(char *container, char *devname,
>   			     char *backup_file, int verbose,
>   			     int forked, int restart, int freeze_reshape);
>   
> -int Grow_reshape(char *devname, int fd,
> +int Grow_reshape(char *devname, int fd, int force,
>   		 struct mddev_dev *devlist,
>   		 unsigned long long data_offset,
>   		 struct context *c, struct shape *s)
> @@ -1574,6 +1574,7 @@ int Grow_reshape(char *devname, int fd,
>   	 * requested everything (if kernel supports freezing - 2.6.30).
>   	 * The steps are:
>   	 *  - change size (i.e. component_size)
> +         *    - when shrinking, you must force the change

Code is indented by tabs, not spaces.

>   	 *  - change level
>   	 *  - change layout/chunksize/ndisks
>   	 *
> @@ -1617,6 +1618,11 @@ int Grow_reshape(char *devname, int fd,
>   		return 1;
>   	}
>   
> +	if ((s->size < (unsigned)array.size) && !force) {
> +	    pr_err("Cannot set device size smaller than current component_size of %s array.  Use -f to force change.\n",devname);
> +	    return 1;
> +	}
> +

Again, tabs, and they are 8 characters wide.

>   	if (s->raiddisks && s->raiddisks < array.raid_disks && array.level > 1 &&
>   	    get_linux_version() < 2006032 &&
>   	    !check_env("MDADM_FORCE_FEWER")) {
> diff --git a/ReadMe.c b/ReadMe.c
> index 50d3807..46988ae 100644
> --- a/ReadMe.c
> +++ b/ReadMe.c
> @@ -203,6 +203,7 @@ struct option long_options[] = {
>       {"invalid-backup",0,0,InvalidBackup},
>       {"array-size", 1, 0, 'Z'},
>       {"continue", 0, 0, Continue},
> +    {"force",	  0, 0, Force},
>   
>       /* For Incremental */
>       {"rebuild-map", 0, 0, RebuildMapOpt},
> @@ -563,6 +564,7 @@ char Help_grow[] =
>   "                      : This is useful if all devices have been replaced\n"
>   "                      : with larger devices.   Value is in Kilobytes, or\n"
>   "                      : the special word 'max' meaning 'as large as possible'.\n"
> +"  --force       -f    : Override normal checks and be more forceful\n"
>   "  --assume-clean      : When increasing the --size, this flag will avoid\n"
>   "                      : a resync of the new space\n"
>   "  --chunk=       -c   : Change the chunksize of the array\n"
> diff --git a/mdadm.c b/mdadm.c
> index c3a265b..821658a 100644
> --- a/mdadm.c
> +++ b/mdadm.c
> @@ -1617,7 +1617,7 @@ int main(int argc, char *argv[])
>   		else if (s.size > 0 || s.raiddisks || s.layout_str ||
>   			 s.chunk != 0 || s.level != UnSet ||
>   			 data_offset != INVALID_SECTORS) {
> -			rv = Grow_reshape(devlist->devname, mdfd,
> +		    rv = Grow_reshape(devlist->devname, mdfd, c.force,

Yikes!

Thanks,
Jes



* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-04 18:11 ` Jes Sorensen
@ 2017-10-04 19:15   ` John Stoffel
  2017-10-04 19:23     ` Jes Sorensen
  0 siblings, 1 reply; 28+ messages in thread
From: John Stoffel @ 2017-10-04 19:15 UTC (permalink / raw)
  To: Jes Sorensen; +Cc: John Stoffel, Eli Ben-Shoshan, linux-raid

>>>>> "Jes" == Jes Sorensen <jes.sorensen@gmail.com> writes:

Jes> On 10/04/2017 02:00 PM, John Stoffel wrote:
>> 
>> Since Eli had such a horrible experience where he shrunk the
>> individual component raid device size, instead of growing the overall
>> raid by adding a device, I came up with this hacky patch to warn you
>> when you are about to shoot yourself in the foot.
>> 
>> The idea is it will warn you and exit unless you pass in the --force
>> (or -f) switch when using the command.  For example, on a set of loop
>> devices:
>> 
>> # cat /proc/mdstat
>> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5]
>> [raid4] [multipath] [faulty]
>> md99 : active raid6 loop4p1[4] loop3p1[3] loop2p1[2] loop1p1[1]
>> loop0p1[0]
>> 606720 blocks super 1.2 level 6, 512k chunk, algorithm 2 [5/5]
>> [UUUUU]
>> 
>> # ./mdadm --grow /dev/md99 --size 128
>> mdadm: Cannot set device size smaller than current component_size of /dev/md99 array.  Use -f to force change.
>> 
>> # ./mdadm --grow /dev/md99 --size 128 -f
>> mdadm: component size of /dev/md99 has been set to 0K
>> 
>> 
>> I suspect I could do a better job of showing the original component
>> size, so that you have a chance of recovering even then.
>> 
>> But the patch:

Jes> Before looking at the actual code, some cosmetics

Sure, I'll re-spin the patch.  I have my emacs set up to use 4 spaces
for indentation, not 8-space tabs.

>> @@ -1617,7 +1617,7 @@ int main(int argc, char *argv[])
>> else if (s.size > 0 || s.raiddisks || s.layout_str ||
>> s.chunk != 0 || s.level != UnSet ||
>> data_offset != INVALID_SECTORS) {
>> -			rv = Grow_reshape(devlist->devname, mdfd,
>> +		    rv = Grow_reshape(devlist->devname, mdfd, c.force,

Jes> Yikes!

Can you explain please?  I added in the c.force since that's the value
set by the -f (--force) flag when you call mdadm from the command
line.






* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-04 19:15   ` John Stoffel
@ 2017-10-04 19:23     ` Jes Sorensen
  2017-10-04 19:33       ` John Stoffel
  0 siblings, 1 reply; 28+ messages in thread
From: Jes Sorensen @ 2017-10-04 19:23 UTC (permalink / raw)
  To: John Stoffel; +Cc: Eli Ben-Shoshan, linux-raid

On 10/04/2017 03:15 PM, John Stoffel wrote:
>>>>>> "Jes" == Jes Sorensen <jes.sorensen@gmail.com> writes:
> 
> Jes> On 10/04/2017 02:00 PM, John Stoffel wrote:
>>>
>>> Since Eli had such a horrible experience where he shrunk the
>>> individual component raid device size, instead of growing the overall
>>> raid by adding a device, I came up with this hacky patch to warn you
>>> when you are about to shoot yourself in the foot.
>>>
>>> The idea is it will warn you and exit unless you pass in the --force
>>> (or -f) switch when using the command.  For example, on a set of loop
>>> devices:
>>>
>>> # cat /proc/mdstat
>>> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5]
>>> [raid4] [multipath] [faulty]
>>> md99 : active raid6 loop4p1[4] loop3p1[3] loop2p1[2] loop1p1[1]
>>> loop0p1[0]
>>> 606720 blocks super 1.2 level 6, 512k chunk, algorithm 2 [5/5]
>>> [UUUUU]
>>>
>>> # ./mdadm --grow /dev/md99 --size 128
>>> mdadm: Cannot set device size smaller than current component_size of /dev/md99 array.  Use -f to force change.
>>>
>>> # ./mdadm --grow /dev/md99 --size 128 -f
>>> mdadm: component size of /dev/md99 has been set to 0K
>>>
>>>
>>> I suspect I could do a better job of showing the original component
>>> size, so that you have a chance of recovering even then.
>>>
>>> But the patch:
> 
> Jes> Before looking at the actual code, some cosmetics
> 
> Sure, I'll re-spin the patch.  I have my emacs set up to use 4 spaces
> for indentation, not 8-space tabs.

mdadm follows Linux kernel rules, so (setq c-basic-offset 8) in .emacs

>>> @@ -1617,7 +1617,7 @@ int main(int argc, char *argv[])
>>> else if (s.size > 0 || s.raiddisks || s.layout_str ||
>>> s.chunk != 0 || s.level != UnSet ||
>>> data_offset != INVALID_SECTORS) {
>>> -			rv = Grow_reshape(devlist->devname, mdfd,
>>> +		    rv = Grow_reshape(devlist->devname, mdfd, c.force,
> 
> Jes> Yikes!
> 
> Can you explain please?  I added in the c.force since that's the value
> set by the -f (--force) flag when you call mdadm from the command
> line.

The broken indentation

Jes


* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-04 19:23     ` Jes Sorensen
@ 2017-10-04 19:33       ` John Stoffel
  0 siblings, 0 replies; 28+ messages in thread
From: John Stoffel @ 2017-10-04 19:33 UTC (permalink / raw)
  To: Jes Sorensen; +Cc: John Stoffel, Eli Ben-Shoshan, linux-raid

>>>>> "Jes" == Jes Sorensen <jes.sorensen@gmail.com> writes:

Jes> Yikes!
>> 
>> Can you explain please?  I added in the c.force since that's the value
>> set by the -f (--force) flag when you call mdadm from the command
>> line.

Jes> The broken indentation

Heh, I'm working on fixing it.  Sorry!  


* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-04 18:00 mdadm: Patch to restrict --size when shrinking unless forced John Stoffel
  2017-10-04 18:11 ` Jes Sorensen
@ 2017-10-04 21:50 ` NeilBrown
  2017-10-05  1:26   ` John Stoffel
  2017-10-08 20:57   ` John Stoffel
  1 sibling, 2 replies; 28+ messages in thread
From: NeilBrown @ 2017-10-04 21:50 UTC (permalink / raw)
  To: John Stoffel; +Cc: Eli Ben-Shoshan, Jes.Sorensen, linux-raid


On Wed, Oct 04 2017, John Stoffel wrote:

> Since Eli had such a horrible experience where he shrunk the
> individual component raid device size, instead of growing the overall
> raid by adding a device, I came up with this hacky patch to warn you
> when you are about to shoot yourself in the foot.
>
> The idea is it will warn you and exit unless you pass in the --force
> (or -f) switch when using the command.  For example, on a set of loop
> devices:
>
>     # cat /proc/mdstat
>     Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5]
>     [raid4] [multipath] [faulty]
>     md99 : active raid6 loop4p1[4] loop3p1[3] loop2p1[2] loop1p1[1]
>     loop0p1[0]
> 	  606720 blocks super 1.2 level 6, 512k chunk, algorithm 2 [5/5]
> 	  [UUUUU]
>
>     # ./mdadm --grow /dev/md99 --size 128
>     mdadm: Cannot set device size smaller than current component_size of /dev/md99 array.  Use -f to force change.
>
>     # ./mdadm --grow /dev/md99 --size 128 -f
>     mdadm: component size of /dev/md99 has been set to 0K
>

I'm not sure I like this.
The reason that mdadm will quietly accept a size change like this is
that it is trivial to revert - just set the size to a big number and all
your data is still there.

Eli's problem was that he made a harmless mistake, realized that he had
made a mistake, but didn't address the mistake before continuing!

If you really want to make this a two-step process, an approach that
would be more consistent with other aspects of mdadm is to require that
--array-size be reduced first.  i.e. setting --size mustn't reduce the
size of the array.
I think that separating the two steps (resize array, resize component)
gives the user a better picture of what is happening, whereas requiring
-f just causes people to use -f more often.
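
A rough sketch of that rule, assuming sizes in K and that the number of
data-bearing devices is already known (3 for the 5-device RAID6 in the
example).  size_would_shrink_array() is a hypothetical helper, not an
mdadm function, and it ignores chunk rounding and overflow.

```c
/*
 * Hypothetical helper, not part of mdadm.
 * array_kib:         size the array currently exports (after any --array-size)
 * new_component_kib: requested per-device size (--size)
 * data_disks:        devices that hold data, e.g. raid_disks - 2 for RAID6
 * Returns 1 if the new component size cannot hold the current array size.
 */
static int size_would_shrink_array(unsigned long long array_kib,
				   unsigned long long new_component_kib,
				   unsigned int data_disks)
{
	unsigned long long new_capacity_kib = new_component_kib * data_disks;

	return new_capacity_kib < array_kib;
}
```

For md99 above, --size 128 would give 3 * 128K = 384K, well under the
606720K the array exports, so under this rule it would be refused until
--array-size had been lowered to 384K or less.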

Thanks,
NeilBrown





* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-04 21:50 ` NeilBrown
@ 2017-10-05  1:26   ` John Stoffel
  2017-10-07 22:06     ` Wols Lists
  2017-10-08 20:57   ` John Stoffel
  1 sibling, 1 reply; 28+ messages in thread
From: John Stoffel @ 2017-10-05  1:26 UTC (permalink / raw)
  To: NeilBrown; +Cc: Eli Ben-Shoshan, Jes.Sorensen, linux-raid

It's trivial to revert if you know the starting size!  And I would argue that the --size option is misnamed, since it is a per-component resize.

In any case, would it be better to print a message when the user does this, something like: array md## devices resized from <orig> to <new size>?

But again, I think the --force option is good to have when reducing the size of component devices, since I would hope the message gives people pause and makes them think.

So I really don't think we're holding people back; we're educating them with this warning.



* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-05  1:26   ` John Stoffel
@ 2017-10-07 22:06     ` Wols Lists
  2017-10-07 22:17       ` John Stoffel
  0 siblings, 1 reply; 28+ messages in thread
From: Wols Lists @ 2017-10-07 22:06 UTC (permalink / raw)
  To: John Stoffel, NeilBrown; +Cc: Eli Ben-Shoshan, Jes.Sorensen, linux-raid

On 05/10/17 02:26, John Stoffel wrote:
> It's trivial to revert if you know the starting size!  And I would argue that the --size option is misnamed, since it is a per-component resize.
> 
> In any case, would it be better to print a message which said something like: array md## devices resized from <orig> to <new size>
> 
I think a message like "You are setting array space available to less
than array space used. Use --force if you really want to do this" would
be clearer.

> When the user does this?  But again, I think the --force option is good to have when reducing the size of component devices, since I would hope the message gives people pause and makes them think.
> 
I'm with Neil in that you should never have to use force if you're doing
something sensible. As soon as mdadm says "you need to use --force" it
should be a warning that something is amiss. So only require it if the
array needs the space that you're reducing away. If you're using 6TB
with 3 x 3TB drives, then reducing component size to 2.1TB shouldn't
trigger a warning ...

> So I really don't think we're holding people back, we're educating them with this warning.  
> 
Good idea - I just think that the message as you've phrased it isn't
that educative, sorry.

Looking at your current message, it sounds like you're comparing current
array usage with future array size so that's right - you just need a
warning that sends a clear "you are about to shoot yourself in the foot"
message, not just a "use --force to suppress this warning".

Cheers,
Wol




* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-07 22:06     ` Wols Lists
@ 2017-10-07 22:17       ` John Stoffel
  2017-10-07 22:37         ` Wols Lists
  0 siblings, 1 reply; 28+ messages in thread
From: John Stoffel @ 2017-10-07 22:17 UTC (permalink / raw)
  To: Wols Lists
  Cc: John Stoffel, NeilBrown, Eli Ben-Shoshan, Jes.Sorensen, linux-raid

>>>>> "Wols" == Wols Lists <antlists@youngman.org.uk> writes:

Wols> On 05/10/17 02:26, John Stoffel wrote:
>> It's trivial to revert if you know the starting size!  And I would argue that the --size option is misnamed, since it is a per-component resize.
>> 
>> In any case, would it be better to print a message which said something like: array md## devices resized from <orig> to <new size>
>> 

Wols> I think a message like "You are setting array space available to
Wols> less than array space used. Use --force if you really want to do
Wols> this".

I think changing the message to say: "Resizing array component size
from X to Y." would address a bunch of comments on this thread.  And
would give people a way to get back to where they were more easily. 

>> When the user does this?  But again, I think the --force option is good to have when reducing the size of component devices, since I would hope the message gives people pause and makes them think.
>> 

Wols> I'm with Neil in that you should never have to use force if
Wols> you're doing something sensible. As soon as mdadm says "you need
Wols> to use --force" it should be a warning that something is
Wols> amiss. So only require it if the array needs the space that
Wols> you're reducing away. If you're using 6TB with 3 x 3TB drives,
Wols> then reducing component size to 2.1TB shouldn't trigger a
Wols> warning ...

You're taking both sides of the argument here!  The question in my
mind is really if it's *ever* a good idea to reduce the size of
components of an array without an explicit command.  For growing,
sure, that's not a problem.  But since we can shrink component (not
just the array size!) sizes without warning and destroy people's data,
it's up to the tool to at least make some effort to notify them.

>> So I really don't think we're holding people back, we're educating them with this warning.  
>> 

Wols> Good idea - I just think that the message as you've phrased it
Wols> isn't that educative, sorry.

That's okay, the message needs to be tweaked for sure.  I was just
getting out a proof of concept patch.

Wols> Looking at your current message, it sounds like you're comparing
Wols> current array usage with future array size so that's right - you
Wols> just need a warning that sends a clear "you are about to shoot
Wols> yourself in the foot" message, not just a "use --force to
Wols> suppress this warning".

I agree that both A) the message needs to be improved, and B) the --force
option needs to be there when you are shrinking.  Neil didn't like B
as much, but I still think that when shrinking, we need to be very
hesitant to do something without an explicit statement from the user,
because it's too easy without the new message (to be done still!) to
mess up and break things horribly.
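
To make A and B concrete, here is a minimal sketch of the decision and
the message, assuming sizes are already in KiB and ignoring the special
"max" value.  The names check_component_resize(),
format_resize_message() and the enum are made up for illustration and
are not in the patch or in Grow.c:

```c
#include <stdio.h>

/* Hypothetical outcome codes for a --grow --size request. */
enum size_verdict {
	SIZE_OK,             /* same size or growing: proceed quietly */
	SIZE_SHRINK_FORCED,  /* shrinking, and the user passed --force */
	SIZE_SHRINK_REFUSED  /* shrinking without --force: stop */
};

/*
 * Decide what to do with a requested component size.
 * old_kib is the current per-device size, new_kib the requested one.
 */
enum size_verdict check_component_resize(unsigned long long old_kib,
					 unsigned long long new_kib,
					 int force)
{
	if (new_kib >= old_kib)
		return SIZE_OK;
	return force ? SIZE_SHRINK_FORCED : SIZE_SHRINK_REFUSED;
}

/*
 * Write the user-facing message into buf.  Both sizes are always
 * printed so the old value is on screen if a revert is needed.
 * Returns the snprintf() result.
 */
int format_resize_message(char *buf, size_t len, const char *devname,
			  unsigned long long old_kib,
			  unsigned long long new_kib,
			  enum size_verdict v)
{
	const char *tail = "";

	if (v == SIZE_SHRINK_REFUSED)
		tail = "  This shrinks the array; use --force to apply.";
	return snprintf(buf, len, "%s: component size %lluK -> %lluK.%s",
			devname, old_kib, new_kib, tail);
}
```

With the loop-device numbers from this thread (202240K down to 128K),
the check refuses without --force, and the message still shows the
202240 needed to get back.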


As another idea, maybe we could syslog the current device settings
before we do any changes, so that there's a log of where you started
from and where you ended up?  The output of -D might be helpful.
Might be noisy... but so what?
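
A rough sketch of that logging idea, assuming the caller already has
the level, device count and component size in hand.  The function
names and the record layout are invented here; only snprintf() and
syslog() are real library calls:

```c
#include <stdio.h>
#include <syslog.h>

/*
 * Build a one-line record of the array geometry before a change.
 * All values are supplied by the caller; nothing is read from the
 * kernel in this sketch.  Returns the snprintf() result.
 */
int format_geometry_record(char *buf, size_t len, const char *devname,
			   int level, int raid_disks,
			   unsigned long long component_kib)
{
	return snprintf(buf, len,
			"%s: before resize: level=%d raid_disks=%d component_size=%lluK",
			devname, level, raid_disks, component_kib);
}

/* Send the record to syslog at notice priority. */
void log_geometry_record(const char *devname, int level, int raid_disks,
			 unsigned long long component_kib)
{
	char line[256];

	format_geometry_record(line, sizeof(line), devname, level,
			       raid_disks, component_kib);
	syslog(LOG_NOTICE, "%s", line);
}
```

One line per change keeps the noise down while still recording the
number you would need to revert.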



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-07 22:17       ` John Stoffel
@ 2017-10-07 22:37         ` Wols Lists
  2017-10-07 22:46           ` John Stoffel
  0 siblings, 1 reply; 28+ messages in thread
From: Wols Lists @ 2017-10-07 22:37 UTC (permalink / raw)
  To: John Stoffel; +Cc: NeilBrown, Eli Ben-Shoshan, Jes.Sorensen, linux-raid

On 07/10/17 23:17, John Stoffel wrote:
>>>>>> "Wols" == Wols Lists <antlists@youngman.org.uk> writes:
> 
> Wols> On 05/10/17 02:26, John Stoffel wrote:
>>> It's trivial to revert if you know the starting size!  And I would argue that the --size option is misnamed, since is is a per-component resize.  
>>>
>>> In any case, would it be better to print a message which said something like: array md## devices resized from <orig> to <new size>
>>>
> 
> Wols> I think a message like "You are setting array space available to
> Wols> less than array space used. Use --force if you really want to do
> Wols> this".
> 
> I think changing the message to say: "Resizing array component size
> from X to Y." would address a bunch of comments on this thread.  And
> would give people a way to get back to where they were more easily. 

Except it does NOT tell the user WHY they are being stupid ...
> 
>>> When the user does this?  But again, I think the --force option is good to have when reducing the size of component devices, sine I would hope the message gives people a pause and hopefully makes them think.
>>>
> 
> Wols> I'm with Neil in that you should never have to use force if
> Wols> you're doing something sensible. As soon as mdadm says "you need
> Wols> to use --force" it should be a warning that something is
> Wols> amiss. So only require it if the array needs the space that
> Wols> you're reducing away. If you're using 6TB with 3 x 3TB drives,
> Wols> then reducing component size to 2.1TB shouldn't trigger a
> Wols> warning ...
> 
> You're taking both sides of the arguement here!  The question in my
> mind is really if it's *ever* a good idea to reduce the size of
> components of an array without an explicit command.  For growing,
> sure, that's not a problem.  But since we can shrink component (not
> just the array size!) sizes without warning and destroy people's data,
> it's upon the tool to at least make some effort to notify them.

But it's also possible to reduce the size of an array WITHOUT destroying
people's data, and making them use --force here is a bad idea. (See
below - I've just realised I don't think this is possible.)
> 
>>> So I really don't think we're holding people back, we're educating them with this warning.  
>>>
> 
> Wols> Good idea - I just think that the message as you've phrased it
> Wols> isn't that educative, sorry.
> 
> That's okay, the message needs to be tweaked for sure.  I was just
> getting out a proof of concept patch.
> 
> Wols> Looking at your current message, it sounds like you're comparing
> Wols> current array usage with future array size so that's right - you
> Wols> just need a warning that sends a clear "you are about to shoot
> Wols> yourself in the foot" message, not just a "use --force to
> Wols> suppress this warning".
> 
> I agree that both A) the message needs to be improved, and B) the --force
> option needs to be there when you are shrinking.  Neil didn't like B
> as much, but I still think that when shrinkinking, we need to be very
> hesitant to do something without explicit statement from the user,
> because it's too easy without the new message (to be done still!) to
> mess up and break things horribly.
> 
Let me give a worked explanation of what I'm getting at. A bit
contrived, and I've suddenly realised I may be muddling my layers of the
stack, but ...

What I was thinking was let's say the user created an array with 3 x 2TB
drives. He then replaces the drives with 3TB drives. So the array is
only using some of the space available.

So he increases the component size from 2TB to 3TB - and then changes
his mind! To me, it makes sense that he should be able to revert that
change *without* getting a warning. However, as I've just said above,
I've just realised that might not be possible :-( as mdadm has no way of
knowing - in between the increase and decrease of size - whether the user
has used other commands to use the new space available.

So if mdadm can tell that the user is only using 2TB, it shouldn't warn
when size is reduced. I just don't think it can tell :-(

So yes, your approach of requiring --force to reduce the component size
does seem a sensible approach - we just need a clear message. Going on
about component devices muddies the water imho. Maybe something like
"WARNING: this command will shrink your array. Have you shrunk the
contents accordingly? Use --force to apply the change." Bear in mind Eli
thought he was growing the array (which is what most people will
expect), a warning that the array is going to shrink should trigger a
"what the!?" response.

Cheers,
Wol


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-07 22:37         ` Wols Lists
@ 2017-10-07 22:46           ` John Stoffel
  0 siblings, 0 replies; 28+ messages in thread
From: John Stoffel @ 2017-10-07 22:46 UTC (permalink / raw)
  To: Wols Lists
  Cc: John Stoffel, NeilBrown, Eli Ben-Shoshan, Jes.Sorensen, linux-raid

>>>>> "Wols" == Wols Lists <antlists@youngman.org.uk> writes:

Wols> On 07/10/17 23:17, John Stoffel wrote:
>>>>>>> "Wols" == Wols Lists <antlists@youngman.org.uk> writes:
>> 
Wols> On 05/10/17 02:26, John Stoffel wrote:
>>>> It's trivial to revert if you know the starting size!  And I would argue that the --size option is misnamed, since is is a per-component resize.  
>>>> 
>>>> In any case, would it be better to print a message which said something like: array md## devices resized from <orig> to <new size>
>>>> 
>> 
Wols> I think a message like "You are setting array space available to
Wols> less than array space used. Use --force if you really want to do
Wols> this".
>> 
>> I think changing the message to say: "Resizing array component size
>> from X to Y." would address a bunch of comments on this thread.  And
>> would give people a way to get back to where they were more easily. 

Wols> Except it does NOT tell the user WHY they are being stupid ...

Ok.  But how much hand holding can we do here?  I see where Neil is
coming from in terms of not stopping people from being stupid.  I just
want to give them help in not making stupid mistakes.

>>>> When the user does this?  But again, I think the --force option is good to have when reducing the size of component devices, sine I would hope the message gives people a pause and hopefully makes them think.
>>>> 
>> 
Wols> I'm with Neil in that you should never have to use force if
Wols> you're doing something sensible. As soon as mdadm says "you need
Wols> to use --force" it should be a warning that something is
Wols> amiss. So only require it if the array needs the space that
Wols> you're reducing away. If you're using 6TB with 3 x 3TB drives,
Wols> then reducing component size to 2.1TB shouldn't trigger a
Wols> warning ...
>> 
>> You're taking both sides of the arguement here!  The question in my
>> mind is really if it's *ever* a good idea to reduce the size of
>> components of an array without an explicit command.  For growing,
>> sure, that's not a problem.  But since we can shrink component (not
>> just the array size!) sizes without warning and destroy people's data,
>> it's upon the tool to at least make some effort to notify them.

Wols> But it's also possible to reduce the size of an array WITHOUT destroying
Wols> peoples' data, and making them use --force here is a bad idea. (See
Wols> below - I've just realised I don't think this is possible.)

But how does mdadm *know* that people won't be destroying their data?
Yes, if they resize the filesystem(s), the logical volumes, the volume
groups, or any upper layers to be smaller, then you can reduce the
component sizes.  But that's a *really* unusual step to take with
RAID1,5 or 6, don't you think?

>>>> So I really don't think we're holding people back, we're educating them with this warning.  
>>>> 
>> 
Wols> Good idea - I just think that the message as you've phrased it
Wols> isn't that educative, sorry.
>> 
>> That's okay, the message needs to be tweaked for sure.  I was just
>> getting out a proof of concept patch.
>> 
Wols> Looking at your current message, it sounds like you're comparing
Wols> current array usage with future array size so that's right - you
Wols> just need a warning that sends a clear "you are about to shoot
Wols> yourself in the foot" message, not just a "use --force to
Wols> suppress this warning".
>> 
>> I agree that both A) the message needs to be improved, and B) the --force
>> option needs to be there when you are shrinking.  Neil didn't like B
>> as much, but I still think that when shrinkinking, we need to be very
>> hesitant to do something without explicit statement from the user,
>> because it's too easy without the new message (to be done still!) to
>> mess up and break things horribly.
>> 
Wols> Let me give a worked explanation of what I'm getting at. A bit
Wols> contrived, and I've suddenly realised I may be muddling my layers of the
Wols> stack, but ...

Wols> What I was thinking was let's say the user created an array with 3 x 2TB
Wols> drives. He then replaces the drives with 3TB drives. So the array is
Wols> only using some of the space available.

Wols> So he increases the component size from 2TB to 3TB - and then changes
Wols> his mind! To me, it makes sense that he should be able to revert that
Wols> change *without* getting a warning. However, as I've just said above,
Wols> I've just realised that might not be possible :-( as mdadm has no way of
Wols> knowing - inbetween the increase and decrease of size - whether the user
Wols> has used other commands to use the new space available.

Exactly!!!!

Wols> So if mdadm can tell that the user is only using 2TB, it shouldn't warn
Wols> when size is reduced. I just don't think it can tell :-(

Correct, it can't know.  So that's why the --force is good in that case.

Wols> So yes, your approach of requiring --force to reduce the component size
Wols> does seem a sensible approach - we just need a clear message. Going on
Wols> about component devices muddies the water imho. Maybe something like
Wols> "WARNING: this command will shrink your array. Have you shrunk the
Wols> contents accordingly? Use --force to apply the change." Bear in mind Eli
Wols> thought he was growing the array (which is what most people will
Wols> expect), a warning that the array is going to shrink should trigger a
Wols> "what the!?" response.

Yes, the message needs to be improved, I agree 100%.  I'll try to whip
up something and send it out for comments.

John


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-04 21:50 ` NeilBrown
  2017-10-05  1:26   ` John Stoffel
@ 2017-10-08 20:57   ` John Stoffel
  2017-10-08 22:52     ` NeilBrown
  1 sibling, 1 reply; 28+ messages in thread
From: John Stoffel @ 2017-10-08 20:57 UTC (permalink / raw)
  To: NeilBrown; +Cc: John Stoffel, Eli Ben-Shoshan, Jes.Sorensen, linux-raid

>>>>> "NeilBrown" == NeilBrown  <neilb@suse.com> writes:

NeilBrown> On Wed, Oct 04 2017, John Stoffel wrote:
>> Since Eli had such a horrible experience where he shrunk the
>> individual component raid device size, instead of growing the overall
>> raid by adding a device, I came up with this hacky patch to warn you
>> when you are about to shoot yourself in the foot.
>> 
>> The idea is it will warn you and exit unless you pass in the --force
>> (or -f) switch when using the command.  For example, on a set of loop
>> devices:
>> 
>> # cat /proc/mdstat
>> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5]
>> [raid4] [multipath] [faulty]
>> md99 : active raid6 loop4p1[4] loop3p1[3] loop2p1[2] loop1p1[1]
>> loop0p1[0]
>> 606720 blocks super 1.2 level 6, 512k chunk, algorithm 2 [5/5]
>> [UUUUU]
>> 
>> # ./mdadm --grow /dev/md99 --size 128
>> mdadm: Cannot set device size smaller than current component_size of /dev/md99 array.  Use -f to force change.
>> 
>> # ./mdadm --grow /dev/md99 --size 128 -f
>> mdadm: component size of /dev/md99 has been set to 0K
>> 

NeilBrown> I'm not sure I like this.
NeilBrown> The reason that mdadm will quietly accept a size change like this is
NeilBrown> that it is trivial to revert - just set the same to a big number and all
NeilBrown> your data is still there.

This is wrong, because if you use --grow --size ### with a small
enough number, it destroys the MD raid superblock.  So again, I think
the --force option is *critical* here.  Or we need to block the size
change from going smaller than the superblock size.  Here's my test,
where I just warn if the size is going to be smaller:

    # ./mdadm --grow /dev/md99 --size 128
    mdadm: setting raid component device size from 202240 to 128 in array /dev/md99,
    this may need to be reverted if new size is smaller.
    mdadm: component size of /dev/md99 has been set to 0K

    # ./mdadm --grow /dev/md99 --size 202240
    mdadm: setting raid component device size from 0 to 202240 in array /dev/md99,
    this may need to be reverted if new size is smaller.
    mdadm: Cannot set device size in this type of array.

    # mdadm -E /dev/md99
    mdadm: No md superblock detected on /dev/md99.

So I think this argues for a much stronger check, and/or the --force
option when shrinking.  I'll re-spin my patch series into two chunks,
one just the message if changing size.  The second to require the
--force option.

And I think we need a third option to make sure the size can't be
smaller than the array superblock size as well.  Otherwise a simple
mistake trashes your array.

My current warning only patch (with whitespace damage...)

> git diff
diff --git a/Grow.c b/Grow.c
index 455c5f9..18aea63 100755
--- a/Grow.c
+++ b/Grow.c
@@ -1625,6 +1625,10 @@ int Grow_reshape(char *devname, int fd,
                return 1;
		        }

+       if (s->size != (unsigned)array.size) {
+               pr_err("setting raid component device size from %u to %llu in array %s,\nthis may need to be reverted if new size is smaller.\n",(unsigned)array.size,s->size,devname);
+       }
+
        st = super_by_fd(fd, &subarray);
	        if (!st) {
		                pr_err("Unable to determine metadata format for %s\n", devname);
				

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-08 20:57   ` John Stoffel
@ 2017-10-08 22:52     ` NeilBrown
  2017-10-09  1:18       ` John Stoffel
  2017-10-09  1:22       ` John Stoffel
  0 siblings, 2 replies; 28+ messages in thread
From: NeilBrown @ 2017-10-08 22:52 UTC (permalink / raw)
  Cc: John Stoffel, Eli Ben-Shoshan, Jes.Sorensen, linux-raid

On Sun, Oct 08 2017, John Stoffel wrote:

>>>>>> "NeilBrown" == NeilBrown  <neilb@suse.com> writes:
>
> NeilBrown> On Wed, Oct 04 2017, John Stoffel wrote:
>>> Since Eli had such a horrible experience where he shrunk the
>>> individual component raid device size, instead of growing the overall
>>> raid by adding a device, I came up with this hacky patch to warn you
>>> when you are about to shoot yourself in the foot.
>>> 
>>> The idea is it will warn you and exit unless you pass in the --force
>>> (or -f) switch when using the command.  For example, on a set of loop
>>> devices:
>>> 
>>> # cat /proc/mdstat
>>> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5]
>>> [raid4] [multipath] [faulty]
>>> md99 : active raid6 loop4p1[4] loop3p1[3] loop2p1[2] loop1p1[1]
>>> loop0p1[0]
>>> 606720 blocks super 1.2 level 6, 512k chunk, algorithm 2 [5/5]
>>> [UUUUU]
>>> 
>>> # ./mdadm --grow /dev/md99 --size 128
>>> mdadm: Cannot set device size smaller than current component_size of /dev/md99 array.  Use -f to force change.
>>> 
>>> # ./mdadm --grow /dev/md99 --size 128 -f
>>> mdadm: component size of /dev/md99 has been set to 0K
>>> 
>
> NeilBrown> I'm not sure I like this.
> NeilBrown> The reason that mdadm will quietly accept a size change like this is
> NeilBrown> that it is trivial to revert - just set the same to a big number and all
> NeilBrown> your data is still there.
>
> This is wrong, because if you use --grow --size ### with a small
> enough number, it destroys the MD raid superblock.

If that is true, then it is a kernel bug and should be fixed in the kernel.

>  So again, I think
> the --force option is *critical* here.  Or we need to block the size
> change from going smaller than the superblock size.  Here's my test,
> where I just warn if the size is going to be smaller:
>
>     # ./mdadm --grow /dev/md99 --size 128
>     mdadm: setting raid component device size from 202240 to 128 in array /dev/md99,
>     this may need to be reverted if new size is smaller.
>     mdadm: component size of /dev/md99 has been set to 0K
>
>     # ./mdadm --grow /dev/md99 --size 202240
>     mdadm: setting raid component device size from 0 to 202240 in array /dev/md99,
>     this may need to be reverted if new size is smaller.
>     mdadm: Cannot set device size in this type of array.
>
>     # mdadm -E /dev/md99
>     mdadm: No md superblock detected on /dev/md99.
>
> So I think this argues for a much stronger check, and/or the --force
> option when shrinking.  I'll re-spin my patch series into two chunks,
> one just the message if changing size.  The second to require the
> --force option.

Why don't you like my suggestion that you should need to reduce the
--array-size first?

Thanks,
NeilBrown

>
> And I think we need a third option to make sure the size can't be
> smaller than the array superblock size as well.  Otherwise a simple
> mistake trashes your array.
>
> My current warning only patch (with whitespace damage...)
>
>> git diff
> diff --git a/Grow.c b/Grow.c
> index 455c5f9..18aea63 100755
> --- a/Grow.c
> +++ b/Grow.c
> @@ -1625,6 +1625,10 @@ int Grow_reshape(char *devname, int fd,
>                 return 1;
> 		        }
>
> +       if (s->size != (unsigned)array.size) {
> +               pr_err("setting raid component device size from %u to %llu in array %s,\nthis may need to be reverted if new size is smaller.\n",(unsigned)array.size,s->size,devname);
> +       }
> +
>         st = super_by_fd(fd, &subarray);
> 	        if (!st) {
> 		                pr_err("Unable to determine metadata format for %s\n", devname);
> 				

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-08 22:52     ` NeilBrown
@ 2017-10-09  1:18       ` John Stoffel
  2017-10-09  1:36         ` NeilBrown
  2017-10-09  1:22       ` John Stoffel
  1 sibling, 1 reply; 28+ messages in thread
From: John Stoffel @ 2017-10-09  1:18 UTC (permalink / raw)
  To: NeilBrown; +Cc: John Stoffel, Eli Ben-Shoshan, Jes.Sorensen, linux-raid

>>>>> "NeilBrown" == NeilBrown  <neilb@suse.com> writes:

NeilBrown> On Sun, Oct 08 2017, John Stoffel wrote:
>>>>>>> "NeilBrown" == NeilBrown  <neilb@suse.com> writes:
>> 
NeilBrown> On Wed, Oct 04 2017, John Stoffel wrote:
>>>> Since Eli had such a horrible experience where he shrunk the
>>>> individual component raid device size, instead of growing the overall
>>>> raid by adding a device, I came up with this hacky patch to warn you
>>>> when you are about to shoot yourself in the foot.
>>>> 
>>>> The idea is it will warn you and exit unless you pass in the --force
>>>> (or -f) switch when using the command.  For example, on a set of loop
>>>> devices:
>>>> 
>>>> # cat /proc/mdstat
>>>> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5]
>>>> [raid4] [multipath] [faulty]
>>>> md99 : active raid6 loop4p1[4] loop3p1[3] loop2p1[2] loop1p1[1]
>>>> loop0p1[0]
>>>> 606720 blocks super 1.2 level 6, 512k chunk, algorithm 2 [5/5]
>>>> [UUUUU]
>>>> 
>>>> # ./mdadm --grow /dev/md99 --size 128
>>>> mdadm: Cannot set device size smaller than current component_size of /dev/md99 array.  Use -f to force change.
>>>> 
>>>> # ./mdadm --grow /dev/md99 --size 128 -f
>>>> mdadm: component size of /dev/md99 has been set to 0K
>>>> 
>> 
NeilBrown> I'm not sure I like this.
NeilBrown> The reason that mdadm will quietly accept a size change like this is
NeilBrown> that it is trivial to revert - just set the same to a big number and all
NeilBrown> your data is still there.
>> 
>> This is wrong, because if you use --grow --size ### with a small
>> enough number, it destroys the MD raid superblock.

NeilBrown> If that is true, then it is a kernel bug and should be fixed in the kernel.

That's a better solution of course.  I'll see if I can figure this
out, but my testing will be slower... :-)

>> So again, I think
>> the --force option is *critical* here.  Or we need to block the size
>> change from going smaller than the superblock size.  Here's my test,
>> where I just warn if the size is going to be smaller:
>> 
>> # ./mdadm --grow /dev/md99 --size 128
>> mdadm: setting raid component device size from 202240 to 128 in array /dev/md99,
>> this may need to be reverted if new size is smaller.
>> mdadm: component size of /dev/md99 has been set to 0K
>> 
>> # ./mdadm --grow /dev/md99 --size 202240
>> mdadm: setting raid component device size from 0 to 202240 in array /dev/md99,
>> this may need to be reverted if new size is smaller.
>> mdadm: Cannot set device size in this type of array.
>> 
>> # mdadm -E /dev/md99
>> mdadm: No md superblock detected on /dev/md99.
>> 
>> So I think this argues for a much stronger check, and/or the --force
>> option when shrinking.  I'll re-spin my patch series into two chunks,
>> one just the message if changing size.  The second to require the
>> --force option.

NeilBrown> Why don't you like my suggestion that you should need to reduce the
NeilBrown> --array-size first?

Ok, so assuming a RAID6 with 4 x 100MB loop devices:

  mdadm --create /dev/md99 --name md99 --level 6 -n 4 /dev/loop?p1

Now we have a 200mb visible size array, using 100mb on each disk.  I
want to shrink them by 50mb each:

  mdadm --grow md## --array-size 100m  

So now the array should be just using the first 50mb on each loop device.

  mdadm --grow md## --size 50m

Then we've shrunk each loop device to 50MB.  So since the docs say that
--array-size doesn't change anything, the only way to make an
--array-size change permanent is using --size ##, correct?
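
For reference, the arithmetic behind that sequence can be sketched as
below.  This is an illustration only: it assumes the usable capacity
is simply data disks times component size (no metadata or rounding),
and the function names are hypothetical rather than anything in mdadm.
The last helper expresses the ordering Neil is suggesting: a --size
shrink only passes if the array size has already been reduced to fit.

```c
/*
 * Number of data-bearing devices for the RAID levels that accept
 * --size.  Returns 0 for levels or device counts this sketch does
 * not handle.
 */
int data_disks(int level, int raid_disks)
{
	switch (level) {
	case 1:
		return raid_disks >= 1 ? 1 : 0;
	case 4:
	case 5:
		return raid_disks > 1 ? raid_disks - 1 : 0;
	case 6:
		return raid_disks > 2 ? raid_disks - 2 : 0;
	default:
		return 0;
	}
}

/* Idealised array capacity in KiB for a given component size. */
unsigned long long array_capacity_kib(int level, int raid_disks,
				      unsigned long long component_kib)
{
	return (unsigned long long)data_disks(level, raid_disks) *
	       component_kib;
}

/*
 * Returns 1 if shrinking components to new_component_kib still
 * covers the currently configured array size, 0 otherwise.
 */
int shrink_fits_array_size(int level, int raid_disks,
			   unsigned long long new_component_kib,
			   unsigned long long array_size_kib)
{
	return array_capacity_kib(level, raid_disks, new_component_kib) >=
	       array_size_kib;
}
```

For the 4-device RAID6 above, 2 data disks x 100MB gives the 200MB
array; shrinking components to 50MB only fits once --array-size is
down to 100MB.  The 5-device md99 from the start of the thread checks
out the same way: 3 x 202240K = 606720K.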

But doesn't

  mdadm --grow md99 --size 0

imply that we *expand* the array component sizes, along with the
--array-size of the array?  When does that change become permanent?
This is the part of mdadm management that gets wonky in my mind.  We
make it too difficult for the SysAdmin to know what has to happen
here.

And looking at the mdadm man page:

       -z, --size= Amount (in Kibibytes) of space to use from each
		   drive in RAID levels 1/4/5/6.  This must be a multiple of the chunk
		   size, and must leave about 128Kb of space at the end of the drive for
		   the RAID superblock.  If this is not specified (as it normally is not)
		   the smallest drive (or partition) sets the size, though if there is a
		   variance among the drives of greater than 1%, a warning is issued.

This is the possible error we're seeing, since setting a --size of 128
is *way* too small.  We need at least 128k * num_devices.  So now I
know what to check in the kernel...  The _about_ is the worrying
phrase, we should be able to be more precise here, and show how to
figure this number from the mdadm -E or -D commands.

Thanks again for all your work on this Neil, you've been amazing,
truly!

Thanks,
John





^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-08 22:52     ` NeilBrown
  2017-10-09  1:18       ` John Stoffel
@ 2017-10-09  1:22       ` John Stoffel
  2017-10-09  4:10         ` NeilBrown
  1 sibling, 1 reply; 28+ messages in thread
From: John Stoffel @ 2017-10-09  1:22 UTC (permalink / raw)
  To: NeilBrown; +Cc: John Stoffel, Eli Ben-Shoshan, Jes.Sorensen, linux-raid

>>>>> "NeilBrown" == NeilBrown  <neilb@suse.com> writes:

NeilBrown> On Sun, Oct 08 2017, John Stoffel wrote:
>>>>>>> "NeilBrown" == NeilBrown  <neilb@suse.com> writes:
>> 
NeilBrown> On Wed, Oct 04 2017, John Stoffel wrote:
>>>> Since Eli had such a horrible experience where he shrunk the
>>>> individual component raid device size, instead of growing the overall
>>>> raid by adding a device, I came up with this hacky patch to warn you
>>>> when you are about to shoot yourself in the foot.
>>>> 
>>>> The idea is it will warn you and exit unless you pass in the --force
>>>> (or -f) switch when using the command.  For example, on a set of loop
>>>> devices:
>>>> 
>>>> # cat /proc/mdstat
>>>> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5]
>>>> [raid4] [multipath] [faulty]
>>>> md99 : active raid6 loop4p1[4] loop3p1[3] loop2p1[2] loop1p1[1]
>>>> loop0p1[0]
>>>> 606720 blocks super 1.2 level 6, 512k chunk, algorithm 2 [5/5]
>>>> [UUUUU]
>>>> 
>>>> # ./mdadm --grow /dev/md99 --size 128
>>>> mdadm: Cannot set device size smaller than current component_size of /dev/md99 array.  Use -f to force change.
>>>> 
>>>> # ./mdadm --grow /dev/md99 --size 128 -f
>>>> mdadm: component size of /dev/md99 has been set to 0K
>>>> 
>> 
NeilBrown> I'm not sure I like this.
NeilBrown> The reason that mdadm will quietly accept a size change like this is
NeilBrown> that it is trivial to revert - just set the same to a big number and all
NeilBrown> your data is still there.
>> 
>> This is wrong, because if you use --grow --size ### with a small
>> enough number, it destroys the MD raid superblock.

NeilBrown> If that is true, then it is a kernel bug and should be fixed in the kernel.

I just remembered another point I wanted to make.  The earliest we get
such a change into the kernel is 4.15, and then maybe back ported into
some number of stable kernels.  But by putting this check into mdadm
too, we can protect people running older kernels as well.  It
seems to me like a good argument for fixing mdadm by:

- adding the --grow --force ... option.
- fixing the size check so you don't destroy the MD superblock even
  with --force.
- reporting to the user the pre- and post- size of array components
  when using --grow --size ##

Thanks,
John

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-09  1:18       ` John Stoffel
@ 2017-10-09  1:36         ` NeilBrown
  0 siblings, 0 replies; 28+ messages in thread
From: NeilBrown @ 2017-10-09  1:36 UTC (permalink / raw)
  Cc: John Stoffel, Eli Ben-Shoshan, Jes.Sorensen, linux-raid

On Sun, Oct 08 2017, John Stoffel wrote:

>>>>>> "NeilBrown" == NeilBrown  <neilb@suse.com> writes:
>
> NeilBrown> On Sun, Oct 08 2017, John Stoffel wrote:
>>>>>>>> "NeilBrown" == NeilBrown  <neilb@suse.com> writes:
>>> 
> NeilBrown> On Wed, Oct 04 2017, John Stoffel wrote:
>>>>> Since Eli had such a horrible experience where he shrunk the
>>>>> individual component raid device size, instead of growing the overall
>>>>> raid by adding a device, I came up with this hacky patch to warn you
>>>>> when you are about to shoot yourself in the foot.
>>>>> 
>>>>> The idea is it will warn you and exit unless you pass in the --force
>>>>> (or -f) switch when using the command.  For example, on a set of loop
>>>>> devices:
>>>>> 
>>>>> # cat /proc/mdstat
>>>>> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5]
>>>>> [raid4] [multipath] [faulty]
>>>>> md99 : active raid6 loop4p1[4] loop3p1[3] loop2p1[2] loop1p1[1]
>>>>> loop0p1[0]
>>>>> 606720 blocks super 1.2 level 6, 512k chunk, algorithm 2 [5/5]
>>>>> [UUUUU]
>>>>> 
>>>>> # ./mdadm --grow /dev/md99 --size 128
>>>>> mdadm: Cannot set device size smaller than current component_size of /dev/md99 array.  Use -f to force change.
>>>>> 
>>>>> # ./mdadm --grow /dev/md99 --size 128 -f
>>>>> mdadm: component size of /dev/md99 has been set to 0K
>>>>> 
>>> 
> NeilBrown> I'm not sure I like this.
> NeilBrown> The reason that mdadm will quietly accept a size change like this is
> NeilBrown> that it is trivial to revert - just set the same to a big number and all
> NeilBrown> your data is still there.
>>> 
>>> This is wrong, because if you use --grow --size ### with a small
>>> enough number, it destroys the MD raid superblock.
>
> NeilBrown> If that is true, then it is a kernel bug and should be fixed in the kernel.
>
> That's a better solution of course.  I'll see if I can figure this
> out, but my testing will be slower... :-)
>
>>> So again, I think
>>> the --force option is *critical* here.  Or we need to block the size
>>> change from going smaller than the superblock size.  Here's my test,
>>> where I just warn if the size is going to be smaller:
>>> 
>>> # ./mdadm --grow /dev/md99 --size 128
>>> mdadm: setting raid component device size from 202240 to 128 in array /dev/md99,
>>> this may need to be reverted if new size is smaller.
>>> mdadm: component size of /dev/md99 has been set to 0K
>>> 
>>> # ./mdadm --grow /dev/md99 --size 202240
>>> mdadm: setting raid component device size from 0 to 202240 in array /dev/md99,
>>> this may need to be reverted if new size is smaller.
>>> mdadm: Cannot set device size in this type of array.
>>> 
>>> # mdadm -E /dev/md99
>>> mdadm: No md superblock detected on /dev/md99.
>>> 
>>> So I think this argues for a much stronger check, and/or the --force
>>> option when shrinking.  I'll re-spin my patch series into two chunks,
>>> one just the message if changing size.  The second to require the
>>> --force option.
>
> NeilBrown> Why don't you like my suggestion that you should need to reduce the
> NeilBrown> --array-size first?
>
> Ok, so assuming at RAID6 with 4 x 100mb loop devices:
>
>   mdadm --create /dev/md99 --name md99 --level 6 -n 4 /dev/loop?p1
>
> Now we have a 200mb visible size array, using 100mb on each disk.  I
> want to shrink them by 50mb each:
>
>   mdadm --grow md## --array-size 100m  
>
> So now the array should be just using the first 50mb on each loop device.
>
>   mdadm --grow md## --size 50m
>
> Then we've shrunk each loop device to 50m.  So since the docs say that
> the --array-size doesn't change anything, the only way to make a
> --array-size change permanent is using --size ## correct?

Yes.
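
As a worked check of the sizes in that exchange, here is a minimal
sketch of the arithmetic.  It assumes the usual parity overhead (one
device for RAID5, two for RAID6); the function names are made up for
illustration and are not mdadm code.

```c
#include <stdint.h>

/* Data-bearing devices for the parity levels discussed in this thread.
 * RAID5 spends one device's worth of space on parity and RAID6 two;
 * any other level is out of scope for this sketch and returns 0. */
static unsigned int data_devices(int level, unsigned int raid_disks)
{
	switch (level) {
	case 5:
		return raid_disks > 1 ? raid_disks - 1 : 0;
	case 6:
		return raid_disks > 2 ? raid_disks - 2 : 0;
	default:
		return 0;
	}
}

/* Usable array size in KiB for a given per-device size (--size) in KiB. */
static uint64_t array_size_kib(int level, unsigned int raid_disks,
			       uint64_t component_kib)
{
	return (uint64_t)data_devices(level, raid_disks) * component_kib;
}
```

With 4 devices at 100MB each this gives 200MB, and 100MB after
shrinking to 50MB per device.  The 5-device test array at the top of
the thread works out the same way: 3 x 202240 = 606720 blocks.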

>
> But doesn't
>
>   mdadm --grow md99 --size 0
>
> imply that we *expand* the array component sizes, along with the
> --array-size of the array?

Does it?  "mdadm --grow .. --size max" certainly means that.
Maybe "--size 0" has the same effect, but then "0" is a special case.


>                              When does that change become permanent?

As you say, when you set "--size".

> This is the part of mdadm management that gets wonky in my mind.  We
> make it too difficult for the SysAdmin to know what has to happen
> here.
>
> And looking at the mdadm man page:
>
>        -z, --size=
>               Amount (in Kibibytes) of space to use from each drive in
>               RAID levels 1/4/5/6.  This must be a multiple of the chunk
>               size, and must leave about 128Kb of space at the end of the
>               drive for the RAID superblock.  If this is not specified (as
>               it normally is not) the smallest drive (or partition) sets
>               the size, though if there is a variance among the drives of
>               greater than 1%, a warning is issued.
>
> This is the possible error we're seeing, since setting a --size of 128
> is *way* too small.  We need at least 128k * num_devices.

Why is this "*way* too small"?? It is too small precisely if there is a
meaningful chunk size, and if the chunk size is > 128k.
num_devices has nothing to do with this.

Where it says "leave 128K at the end of the drive" it means that the
given size should be at least 128K less than the actual size of the
partition/device.
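
To illustrate why a request of 128 ends up as 0K on the 512k-chunk
test array, here is a minimal sketch.  It assumes the requested size
is rounded down to a whole number of chunks, which is what the "set
to 0K" output suggests; that has not been checked against the kernel
source.  round_down_to_chunk() is a hypothetical helper.

```c
#include <stdint.h>

/* Round a requested per-device size down to a whole number of chunks.
 * Both values are in KiB.  A chunk size of 0 stands for a level with
 * no meaningful chunk size (e.g. RAID1), so the request is left alone. */
static uint64_t round_down_to_chunk(uint64_t size_kib, uint64_t chunk_kib)
{
	if (chunk_kib == 0)
		return size_kib;
	return size_kib - (size_kib % chunk_kib);
}
```

With a 512K chunk, 128 rounds down to 0, while the original 202240
(395 chunks) is unchanged.  With a 64K chunk, 128 would survive as 128.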


>                                                             So now I
> know what to check in the kernel...  The _about_ is the worrying
> phrase, we should be able to be more precise here, and show how to
> figure this number from the mdadm -E or -D commands.

The 128K is for 0.90 metadata.
The precise number here is "size of device/partition, rounded down to a
multiple of 64K, with 64K then subtracted".
For 1.0, the hard number is like that but with 4K, except that it is
nice to leave extra space for bitmaps etc.
If you ask mdadm to make the --size larger than it can support -
i.e. large enough that there would be no space for the metadata - mdadm
will not let you.
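
The numbers above can be turned into a small calculation.  This is a
sketch of the rule exactly as worded here (round down, then subtract),
for 0.90 and 1.0 metadata only.  The helper names are hypothetical and
the code is not taken from mdadm or the kernel.  Sizes are in KiB.

```c
#include <stdint.h>

/* Round dev_kib down to a multiple of align_kib, then subtract
 * align_kib.  Returns 0 if nothing usable would be left. */
static uint64_t usable_kib(uint64_t dev_kib, uint64_t align_kib)
{
	uint64_t rounded = dev_kib - (dev_kib % align_kib);

	return rounded > align_kib ? rounded - align_kib : 0;
}

/* 0.90 metadata: 64K granularity, 64K reserved. */
static uint64_t max_size_0_90(uint64_t dev_kib)
{
	return usable_kib(dev_kib, 64);
}

/* 1.0 metadata: same shape with 4K; this is the hard limit only and
 * leaves no extra room for bitmaps etc. */
static uint64_t max_size_1_0(uint64_t dev_kib)
{
	return usable_kib(dev_kib, 4);
}
```

For a 102400K partition this gives 102336K for 0.90 and 102396K for
1.0.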

So I don't think there is a need to spell out a precise number.
Anything that works should be acceptable.

NeilBrown


>
> Thanks again for all your work on this Neil, you've been amazing,
> truly!
>
> Thanks,
> John
>
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-09  1:22       ` John Stoffel
@ 2017-10-09  4:10         ` NeilBrown
  2017-10-09 20:04           ` Phil Turmel
  0 siblings, 1 reply; 28+ messages in thread
From: NeilBrown @ 2017-10-09  4:10 UTC (permalink / raw)
  Cc: John Stoffel, Eli Ben-Shoshan, Jes.Sorensen, linux-raid

On Sun, Oct 08 2017, John Stoffel wrote:

>>>>>> "NeilBrown" == NeilBrown  <neilb@suse.com> writes:
>
> NeilBrown> On Sun, Oct 08 2017, John Stoffel wrote:
>>>>>>>> "NeilBrown" == NeilBrown  <neilb@suse.com> writes:
>>> 
> NeilBrown> On Wed, Oct 04 2017, John Stoffel wrote:
>>>>> Since Eli had such a horrible experience where he shrunk the
>>>>> individual component raid device size, instead of growing the overall
>>>>> raid by adding a device, I came up with this hacky patch to warn you
>>>>> when you are about to shoot yourself in the foot.
>>>>> 
>>>>> The idea is it will warn you and exit unless you pass in the --force
>>>>> (or -f) switch when using the command.  For example, on a set of loop
>>>>> devices:
>>>>> 
>>>>> # cat /proc/mdstat
>>>>> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5]
>>>>> [raid4] [multipath] [faulty]
>>>>> md99 : active raid6 loop4p1[4] loop3p1[3] loop2p1[2] loop1p1[1]
>>>>> loop0p1[0]
>>>>> 606720 blocks super 1.2 level 6, 512k chunk, algorithm 2 [5/5]
>>>>> [UUUUU]
>>>>> 
>>>>> # ./mdadm --grow /dev/md99 --size 128
>>>>> mdadm: Cannot set device size smaller than current component_size of /dev/md99 array.  Use -f to force change.
>>>>> 
>>>>> # ./mdadm --grow /dev/md99 --size 128 -f
>>>>> mdadm: component size of /dev/md99 has been set to 0K
>>>>> 
>>> 
> NeilBrown> I'm not sure I like this.
> NeilBrown> The reason that mdadm will quietly accept a size change like this is
> NeilBrown> that it is trivial to revert - just set the size to a big number and all
> NeilBrown> your data is still there.
>>> 
>>> This is wrong, because if you use --grow --size ### with a small
>>> enough number, it destroys the MD raid superblock.
>
> NeilBrown> If that is true, then it is a kernel bug and should be fixed in the kernel.
>
> I just remembered another point I wanted to make.  The earliest we get
> such a change into the kernel is 4.15, and then maybe back ported into
> some number of stable kernels.  But by putting this check into mdadm
> as well, we can protect people running older kernels as well.  It
> seems to me like a good argument for fixing mdadm to:

If there is some action that mdadm can currently be told to perform, and
when it tries to perform that action it corrupts the array, then
it is certainly appropriate to teach mdadm not to perform that action.
It shouldn't even perform that action with --force.   I agree that
changing mdadm like this is complementary to changing the kernel.  Both
are useful.

This is quite separate from any proposal to require --force to reduce
the size with --grow --size.
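
The distinction between those two cases can be sketched as follows.
This is an illustration under stated assumptions, not the patch from
this thread: min_safe_kib is a hypothetical lower bound below which
the array would be corrupted (the thread does not pin down its value),
and the names are invented.

```c
#include <stdint.h>

enum size_verdict {
	SIZE_OK,          /* proceed */
	SIZE_NEEDS_FORCE, /* shrink: the separate proposal to require --force */
	SIZE_REFUSED      /* would corrupt the array: refused even with --force */
};

/* cur_kib/new_kib are the current and requested component sizes.
 * min_safe_kib is an assumed lower bound supplied by the caller. */
static enum size_verdict check_new_size(uint64_t cur_kib, uint64_t new_kib,
					uint64_t min_safe_kib, int force)
{
	if (new_kib < min_safe_kib)
		return SIZE_REFUSED;
	if (new_kib < cur_kib && !force)
		return SIZE_NEEDS_FORCE;
	return SIZE_OK;
}
```

The first test is the one that should hold regardless of --force; the
second is only relevant if the --force requirement is adopted.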

NeilBrown

>
> - adding the --grow --force ... option.
> - fixing the size check so you don't destroy the MD superblock even
>   with --force.
> - reporting to the user the pre- and post- size of array components
>   when using --grow --size ##
>
> Thanks,
> John

* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-09  4:10         ` NeilBrown
@ 2017-10-09 20:04           ` Phil Turmel
  2017-10-10  0:07             ` Wakko Warner
                               ` (3 more replies)
  0 siblings, 4 replies; 28+ messages in thread
From: Phil Turmel @ 2017-10-09 20:04 UTC (permalink / raw)
  To: NeilBrown, John Stoffel; +Cc: Eli Ben-Shoshan, Jes.Sorensen, linux-raid

On 10/09/2017 12:10 AM, NeilBrown wrote:

> If there is some action that mdadm can currently be told to perform, and
> when it tries to perform that action it corrupts the array, then
> it is certainly appropriate to teach mdadm not to perform that action.
> It shouldn't even perform that action with --force.   I agree that
> changing mdadm like this is complementary to changing the kernel.  Both
> are useful.

A certain amount of the trouble with all of this is the english meaning
of "grow" doesn't really match what mdadm allows.

Might it be reasonable to reject "--grow" operations that reduce the
final array size, and introduce the complementary "--reduce" operation
that rejects array size increases?

Both operations would share the current code, just apply a different
sanity check before proceeding.

mdadm would then at least not violate the rule of least surprise.
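
A minimal sketch of what that shared check could look like, assuming
the caller already knows the old and new array sizes.  MODE_REDUCE
corresponds to the proposed "--reduce", which is not an existing mdadm
option, and all names here are hypothetical.  Operations that leave
the size unchanged (a chunk size change, say) pass in either mode.

```c
#include <stdint.h>

enum resize_mode { MODE_GROW, MODE_REDUCE };

/* Returns 1 if the request passes the mode's sanity check, else 0. */
static int resize_allowed(enum resize_mode mode,
			  uint64_t old_array_kib, uint64_t new_array_kib)
{
	if (new_array_kib == old_array_kib)
		return 1;	/* size-neutral reshape */
	if (mode == MODE_GROW)
		return new_array_kib > old_array_kib;
	return new_array_kib < old_array_kib;
}
```

Everything after the check would be the existing code path, so the
only new logic is the comparison itself.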

Phil


* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-09 20:04           ` Phil Turmel
@ 2017-10-10  0:07             ` Wakko Warner
  2017-10-10 13:12               ` Phil Turmel
  2017-10-10 20:52               ` NeilBrown
  2017-10-10  2:01             ` John Stoffel
                               ` (2 subsequent siblings)
  3 siblings, 2 replies; 28+ messages in thread
From: Wakko Warner @ 2017-10-10  0:07 UTC (permalink / raw)
  To: Phil Turmel
  Cc: NeilBrown, John Stoffel, Eli Ben-Shoshan, Jes.Sorensen, linux-raid

Phil Turmel wrote:
> On 10/09/2017 12:10 AM, NeilBrown wrote:
> 
> > If there is some action that mdadm can currently be told to perform, and
> > when it tries to perform that action it corrupts the array, then
> > it is certainly appropriate to teach mdadm not to perform that action.
> > It shouldn't even perform that action with --force.   I agree that
> > changing mdadm like this is complementary to changing the kernel.  Both
> > are useful.
> 
> A certain amount of the trouble with all of this is the english meaning
> of "grow" doesn't really match what mdadm allows.
> 
> Might it be reasonable to reject "--grow" operations that reduce the
> final array size, and introduce the complementary "--reduce" operation
> that rejects array size increases?
> 
> Both operations would share the current code, just apply a different
> sanity check before proceeding.
> 
> mdadm would then at least not violate the rule of least surprise.

As a general user of md raid and as a reader of the list, I would agree that
this would be a better solution.  Thinking in terms of lvm, there's lvreduce
and lvextend.  IMO, --force wouldn't be needed for --reduce (I was originally
thinking of --shrink)

On a side note, is it possible for the lower layers to know what the last
used sector is?  I.e. lvm on top of raid with 10% allocated and the last
sector is around the 10% mark.  (If this were possible, --force would be
required if shrinking would result in inaccessible data.)

I recently did a shrink of 4x 2tb drives so that I could replace the 2tb
drives with 80gb drives (yes, big shrink!)  Would have been nice for mdadm
to know what the smallest size was that wouldn't destroy my lvm volumes that were
on top.

-- 
 Microsoft has beaten Volkswagen's world record.  Volkswagen only created 22
 million bugs.


* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-09 20:04           ` Phil Turmel
  2017-10-10  0:07             ` Wakko Warner
@ 2017-10-10  2:01             ` John Stoffel
  2017-10-10 20:09             ` Jes Sorensen
  2017-10-10 20:48             ` NeilBrown
  3 siblings, 0 replies; 28+ messages in thread
From: John Stoffel @ 2017-10-10  2:01 UTC (permalink / raw)
  To: Phil Turmel
  Cc: NeilBrown, John Stoffel, Eli Ben-Shoshan, Jes.Sorensen, linux-raid

>>>>> "Phil" == Phil Turmel <philip@turmel.org> writes:

Phil> On 10/09/2017 12:10 AM, NeilBrown wrote:
>> If there is some action that mdadm can currently be told to perform, and
>> when it tries to perform that action it corrupts the array, then
>> it is certainly appropriate to teach mdadm not to perform that action.
>> It shouldn't even perform that action with --force.   I agree that
>> changing mdadm like this is complementary to changing the kernel.  Both
>> are useful.

Phil> A certain amount of the trouble with all of this is the english meaning
Phil> of "grow" doesn't really match what mdadm allows.

Phil> Might it be reasonable to reject "--grow" operations that reduce the
Phil> final array size, and introduce the complementary "--reduce" operation
Phil> that rejects array size increases?

I like this idea!  And it wouldn't be hard to implement in mdadm.  

Phil> Both operations would share the current code, just apply a different
Phil> sanity check before proceeding.

Phil> mdadm would then at least not violate the rule of least surprise.



* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-10  0:07             ` Wakko Warner
@ 2017-10-10 13:12               ` Phil Turmel
  2017-10-10 20:52               ` NeilBrown
  1 sibling, 0 replies; 28+ messages in thread
From: Phil Turmel @ 2017-10-10 13:12 UTC (permalink / raw)
  To: Wakko Warner
  Cc: NeilBrown, John Stoffel, Eli Ben-Shoshan, Jes.Sorensen, linux-raid

On 10/09/2017 08:07 PM, Wakko Warner wrote:
> On a side note, is it possible for the lower layers to know what the last
> used sector is?

No.  Upper layers can use the device content in any order desired.

> I.e. lvm on top of raid with 10% allocated and the last
> sector is around the 10% mark.  (If this were possible, --force would be
> required if shrinking would result in inaccessible data.)

LVM has a number of allocation policies, including selecting space at
the end instead of beginning.  And it won't relocate anything (on its
own) when gaps open.  Lower layers simply can't assume anything at all.

> I recently did a shrink of 4x 2tb drives so that I could replace the 2tb
> drives with 80gb drives (yes, big shrink!)  Would have been nice for mdadm
> to know what the smallest size was that wouldn't destroy my lvm volumes that were
> on top.

Only LVM can know.

Phil


* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-09 20:04           ` Phil Turmel
  2017-10-10  0:07             ` Wakko Warner
  2017-10-10  2:01             ` John Stoffel
@ 2017-10-10 20:09             ` Jes Sorensen
  2017-10-10 20:54               ` Wols Lists
  2017-10-10 20:48             ` NeilBrown
  3 siblings, 1 reply; 28+ messages in thread
From: Jes Sorensen @ 2017-10-10 20:09 UTC (permalink / raw)
  To: Phil Turmel, NeilBrown, John Stoffel; +Cc: Eli Ben-Shoshan, linux-raid

On 10/09/2017 04:04 PM, Phil Turmel wrote:
> On 10/09/2017 12:10 AM, NeilBrown wrote:
> 
>> If there is some action that mdadm can currently be told to perform, and
>> when it tries to perform that action it corrupts the array, then
>> it is certainly appropriate to teach mdadm not to perform that action.
>> It shouldn't even perform that action with --force.   I agree that
>> changing mdadm like this is complementary to changing the kernel.  Both
>> are useful.
> 
> A certain amount of the trouble with all of this is the english meaning
> of "grow" doesn't really match what mdadm allows.
> 
> Might it be reasonable to reject "--grow" operations that reduce the
> final array size, and introduce the complementary "--reduce" operation
> that rejects array size increases?
> 
> Both operations would share the current code, just apply a different
> sanity check before proceeding.

"grow" in mdadmlish translates to reshape/resize in English. Starting to 
introduce new keywords for this really makes no sense and just causes
confusion, so I am not going to support that.

Jes



* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-09 20:04           ` Phil Turmel
                               ` (2 preceding siblings ...)
  2017-10-10 20:09             ` Jes Sorensen
@ 2017-10-10 20:48             ` NeilBrown
  2017-10-10 20:58               ` Phil Turmel
  3 siblings, 1 reply; 28+ messages in thread
From: NeilBrown @ 2017-10-10 20:48 UTC (permalink / raw)
  To: Phil Turmel, John Stoffel; +Cc: Eli Ben-Shoshan, Jes.Sorensen, linux-raid

On Mon, Oct 09 2017, Phil Turmel wrote:

> On 10/09/2017 12:10 AM, NeilBrown wrote:
>
>> If there is some action that mdadm can currently be told to perform, and
>> when it tries to perform that action it corrupts the array, then
>> it is certainly appropriate to teach mdadm not to perform that action.
>> It shouldn't even perform that action with --force.   I agree that
>> changing mdadm like this is complementary to changing the kernel.  Both
>> are useful.
>
> A certain amount of the trouble with all of this is the english meaning
> of "grow" doesn't really match what mdadm allows.
>
> Might it be reasonable to reject "--grow" operations that reduce the
> final array size, and introduce the complementary "--reduce" operation
> that rejects array size increases?

While there is a lot to like about this approach, one problem is that
some "grow" operations do not change the size. They might, e.g., just change
the chunksize.

I guess you could have --grow --reduce --reshape.

I wouldn't object to such a change.

Thanks,
NeilBrown

>
> Both operations would share the current code, just apply a different
> sanity check before proceeding.
>
> mdadm would then at least not violate the rule of least surprise.
>
> Phil


* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-10  0:07             ` Wakko Warner
  2017-10-10 13:12               ` Phil Turmel
@ 2017-10-10 20:52               ` NeilBrown
  2017-10-10 20:55                 ` Wakko Warner
  1 sibling, 1 reply; 28+ messages in thread
From: NeilBrown @ 2017-10-10 20:52 UTC (permalink / raw)
  To: Wakko Warner, Phil Turmel
  Cc: John Stoffel, Eli Ben-Shoshan, Jes.Sorensen, linux-raid

On Mon, Oct 09 2017, Wakko Warner wrote:

> Phil Turmel wrote:
>> On 10/09/2017 12:10 AM, NeilBrown wrote:
>> 
>> > If there is some action that mdadm can currently be told to perform, and
>> > when it tries to perform that action it corrupts the array, then
>> > it is certainly appropriate to teach mdadm not to perform that action.
>> > It shouldn't even perform that action with --force.   I agree that
>> > changing mdadm like this is complementary to changing the kernel.  Both
>> > are useful.
>> 
>> A certain amount of the trouble with all of this is the english meaning
>> of "grow" doesn't really match what mdadm allows.
>> 
>> Might it be reasonable to reject "--grow" operations that reduce the
>> final array size, and introduce the complementary "--reduce" operation
>> that rejects array size increases?
>> 
>> Both operations would share the current code, just apply a different
>> sanity check before proceeding.
>> 
>> mdadm would then at least not violate the rule of least surprise.
>
> As a general user of md raid and as a reader of the list, I would agree that
> this would be a better solution.  Thinking in terms of lvm, there's lvreduce
> and lvextend.  IMO, --force wouldn't be needed for --reduce (I was originally
> thinking of --shrink)
>
> On a side note, is it possible for the lower layers to know what the last
> used sector is?  I.e. lvm on top of raid with 10% allocated and the last
> sector is around the 10% mark.  (If this were possible, --force would be
> required if shrinking would result in inaccessible data.)

No it isn't.  I've occasionally thought of adding functionality so that
a device could ask its client (e.g. filesystem, lvm, etc) if
shrinking is OK - but it hasn't happened yet.

>
> I recently did a shrink of 4x 2tb drives so that I could replace the 2tb
> drives with 80gb drives (yes, big shrink!)  Would have been nice for mdadm
> to know what the smallest size was that wouldn't destroy my lvm volumes that were
> on top.

Guess, try, see if data is still accessible.  If not, revert the change.
If you have a filesystem on the raid, fsck will complain if you made it
too small.  I don't know what you would try with lvm.  pvscan?

NeilBrown


* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-10 20:09             ` Jes Sorensen
@ 2017-10-10 20:54               ` Wols Lists
  2017-10-10 21:07                 ` Jes Sorensen
  0 siblings, 1 reply; 28+ messages in thread
From: Wols Lists @ 2017-10-10 20:54 UTC (permalink / raw)
  To: Jes Sorensen, Phil Turmel, NeilBrown, John Stoffel
  Cc: Eli Ben-Shoshan, linux-raid

On 10/10/17 21:09, Jes Sorensen wrote:
>> Both operations would share the current code, just apply a different
>> sanity check before proceeding.
> 
> "grow" in mdadmlish translates to reshape/resize in English. Starting to
> introduce new keywords for this really makes no sense and just causes
> confusion, so I am not going to support that.

But saying "grow" when the result is a shrink also causes confusion.
Would you accept changing "grow" to "resize"?

But personally I think adding a new keyword is sensible. Firstly, in
normal use no-one is ever going to want to shrink an array, so this is
rarely going to be used.

And secondly, if you use "grow" to grow an array, it's a "safe"
operation (unless something goes wrong). If you use "grow" to *shrink*
an array, as Eli found out, it's very dangerous.

I think abusing the English language is far more dangerous than adding a
new keyword. No disrespect to them, but you forget your average sysadmin
is, well, average. Handing them a loaded foot-gun with no safety-catch
is *not* a good idea. (And even a good sysadmin will spend little time
with mdadm. Even if they know this now, there's a good chance they'll
forget before they need it again, and it becomes a land-mine waiting to
go off ...)

One only has to look at the "hung grow" thread to see what the lack of
safety-catches can do - if anybody wants another little project, might
it be an idea to make a load of operations (like resize for example)
block on a degraded array?

Cheers,
Wol


* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-10 20:52               ` NeilBrown
@ 2017-10-10 20:55                 ` Wakko Warner
  0 siblings, 0 replies; 28+ messages in thread
From: Wakko Warner @ 2017-10-10 20:55 UTC (permalink / raw)
  To: NeilBrown
  Cc: Phil Turmel, John Stoffel, Eli Ben-Shoshan, Jes.Sorensen, linux-raid

NeilBrown wrote:
> On Mon, Oct 09 2017, Wakko Warner wrote:
> 
> > Phil Turmel wrote:
> >> A certain amount of the trouble with all of this is the english meaning
> >> of "grow" doesn't really match what mdadm allows.
> >> 
> >> Might it be reasonable to reject "--grow" operations that reduce the
> >> final array size, and introduce the complementary "--reduce" operation
> >> that rejects array size increases?
> >> 
> >> Both operations would share the current code, just apply a different
> >> sanity check before proceeding.
> >> 
> >> mdadm would then at least not violate the rule of least surprise.
> >
> > As a general user of md raid and as a reader of the list, I would agree that
> > this would be a better solution.  Thinking in terms of lvm, there's lvreduce
> > and lvextend.  IMO, --force wouldn't be needed for --reduce (I was originally
> > thinking of --shrink)
> >
> > On a side note, is it possible for the lower layers to know what the last
> > used sector is?  I.e. lvm on top of raid with 10% allocated and the last
> > sector is around the 10% mark.  (If this were possible, --force would be
> > required if shrinking would result in inaccessible data.)
> 
> No it isn't.  I've occasionally thought of adding functionality so that
> a device could ask its client (e.g. filesystem, lvm, etc) if
> shrinking is OK - but it hasn't happened yet.

That's what I thought but wasn't sure.  Thanks.

> > I recently did a shrink of 4x 2tb drives so that I could replace the 2tb
> > drives with 80gb drives (yes, big shrink!)  Would have been nice for mdadm
> > to know what the smallest size was that wouldn't destroy my lvm volumes that were
> > on top.
> 
> Guess, try, see if data is still accessible.  If not, revert the change.
> If you have a filesystem on the raid, fsck will complain if you made it
> too small.  I don't know what you would try with lvm.  pvscan?

Fortunately for me, my first try worked.

-- 
 Microsoft has beaten Volkswagen's world record.  Volkswagen only created 22
 million bugs.


* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-10 20:48             ` NeilBrown
@ 2017-10-10 20:58               ` Phil Turmel
  0 siblings, 0 replies; 28+ messages in thread
From: Phil Turmel @ 2017-10-10 20:58 UTC (permalink / raw)
  To: NeilBrown, John Stoffel; +Cc: Eli Ben-Shoshan, Jes.Sorensen, linux-raid

On 10/10/2017 04:48 PM, NeilBrown wrote:
> On Mon, Oct 09 2017, Phil Turmel wrote:

>> Might it be reasonable to reject "--grow" operations that reduce the
>> final array size, and introduce the complementary "--reduce" operation
>> that rejects array size increases?
> 
> While there is a lot to like about this approach, one problem is that
> some "grow" operations do not change the size. They might, e.g., just change
> the chunksize.

I tried to word my proposal to address that -- either keyword could
accept operations that didn't change the size.

> I guess you could have --grow --reduce --reshape.
> 
> I wouldn't object to such a change.

Jes has weighed in as opposed.  Sigh.

Phil


* Re: mdadm: Patch to restrict --size when shrinking unless forced
  2017-10-10 20:54               ` Wols Lists
@ 2017-10-10 21:07                 ` Jes Sorensen
  0 siblings, 0 replies; 28+ messages in thread
From: Jes Sorensen @ 2017-10-10 21:07 UTC (permalink / raw)
  To: Wols Lists, Phil Turmel, NeilBrown, John Stoffel
  Cc: Eli Ben-Shoshan, linux-raid

On 10/10/2017 04:54 PM, Wols Lists wrote:
> On 10/10/17 21:09, Jes Sorensen wrote:
>>> Both operations would share the current code, just apply a different
>>> sanity check before proceeding.
>>
>> "grow" in mdadmlish translates to reshape/resize in English. Starting to
>> introduce new keywords for this really makes no sense and just causes
>> confusion, so I am not going to support that.
> 
> But saying "grow" when the result is a shrink also causes confusion.
> Would you accept changing "grow" to "resize"?

Changing an existing keyword that people have been using for years isn't 
going to make anything better. There are scripts in place, people have 
systems with old and new installed.

> But personally I think adding a new keyword is sensible. Firstly, in
> normal use no-one is ever going to want to shrink an array, so this is
> rarely going to be used.
> 
> And secondly, if you use "grow" to grow an array, it's a "safe"
> operation (unless something goes wrong). If you use "grow" to *shrink*
> an array, as Eli found out, it's very dangerous.
> 
> I think abusing the English language is far more dangerous than adding a
> new keyword. No disrespect to them, but you forget your average sysadmin
> is, well, average. Handing them a loaded foot-gun with no safety-catch
> is *not* a good idea. (And even a good sysadmin will spend little time
> with mdadm. Even if they know this now, there's a good chance they'll
> forget before they need it again, and it becomes a land-mine waiting to
> go off ...)

In this case a good sysadmin will read the man page and follow the 
instructions.

The English abuse isn't an argument I really buy. There are millions of 
cases out there for that.

Jes


Thread overview: 28+ messages
2017-10-04 18:00 mdadm: Patch to restrict --size when shrinking unless forced John Stoffel
2017-10-04 18:11 ` Jes Sorensen
2017-10-04 19:15   ` John Stoffel
2017-10-04 19:23     ` Jes Sorensen
2017-10-04 19:33       ` John Stoffel
2017-10-04 21:50 ` NeilBrown
2017-10-05  1:26   ` John Stoffel
2017-10-07 22:06     ` Wols Lists
2017-10-07 22:17       ` John Stoffel
2017-10-07 22:37         ` Wols Lists
2017-10-07 22:46           ` John Stoffel
2017-10-08 20:57   ` John Stoffel
2017-10-08 22:52     ` NeilBrown
2017-10-09  1:18       ` John Stoffel
2017-10-09  1:36         ` NeilBrown
2017-10-09  1:22       ` John Stoffel
2017-10-09  4:10         ` NeilBrown
2017-10-09 20:04           ` Phil Turmel
2017-10-10  0:07             ` Wakko Warner
2017-10-10 13:12               ` Phil Turmel
2017-10-10 20:52               ` NeilBrown
2017-10-10 20:55                 ` Wakko Warner
2017-10-10  2:01             ` John Stoffel
2017-10-10 20:09             ` Jes Sorensen
2017-10-10 20:54               ` Wols Lists
2017-10-10 21:07                 ` Jes Sorensen
2017-10-10 20:48             ` NeilBrown
2017-10-10 20:58               ` Phil Turmel
