All of lore.kernel.org
 help / color / mirror / Atom feed
* [mdadm PATCH 0/5] Assorted mdadm patch
@ 2017-03-27  3:36 NeilBrown
  2017-03-27  3:36 ` [mdadm PATCH 1/5] udev-md-raid-assembly.rules: Skip non-ready devices NeilBrown
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: NeilBrown @ 2017-03-27  3:36 UTC (permalink / raw)
  To: Jes Sorensen; +Cc: Linux-RAID

I found these while tidying up....
I post a couple already, but then realised that I
have quite a few, so here they are all together.
Sorry for the duplication.

NeilBrown


---

Hannes Reinecke (1):
      udev-md-raid-assembly.rules: Skip non-ready devices

NeilBrown (4):
      Retry HOT_REMOVE_DISK a few times.
      Introduce sys_hot_remove_disk()
      Add 'force' flag to *hot_remove_disk().
      Detail: handle non-existent arrays better.


 Detail.c                    |    8 ++++++++
 Grow.c                      |    9 +--------
 Manage.c                    |   14 +++++---------
 mdadm.h                     |    2 ++
 udev-md-raid-assembly.rules |    3 +++
 util.c                      |   30 ++++++++++++++++++++++++++++++
 6 files changed, 49 insertions(+), 17 deletions(-)

--
Signature


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [mdadm PATCH 2/5] Retry HOT_REMOVE_DISK a few times.
  2017-03-27  3:36 [mdadm PATCH 0/5] Assorted mdadm patch NeilBrown
  2017-03-27  3:36 ` [mdadm PATCH 1/5] udev-md-raid-assembly.rules: Skip non-ready devices NeilBrown
@ 2017-03-27  3:36 ` NeilBrown
  2017-03-27  7:51   ` 王金浦
  2017-03-27  3:36 ` [mdadm PATCH 3/5] Introduce sys_hot_remove_disk() NeilBrown
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: NeilBrown @ 2017-03-27  3:36 UTC (permalink / raw)
  To: Jes Sorensen; +Cc: Linux-RAID

HOT_REMOVE_DISK can fail with EBUSY if there are outstanding
IO request that have not completed yet.  It can sometimes
be helpful to wait a little while for these to complete.

We already do this in impose_level() when reshaping a device,
but not in Manage.c in response to an explicit --remove request.

So create hot_remove_disk() to central this code, and call it
where-ever it makes sense to wait for a HOT_REMOVE_DISK to succeed.

Signed-off-by: NeilBrown <neilb@suse.com>
---
 Grow.c   |    9 +--------
 Manage.c |    4 ++--
 mdadm.h  |    1 +
 util.c   |   18 ++++++++++++++++++
 4 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/Grow.c b/Grow.c
index 455c5f90bf58..218a70649c1f 100755
--- a/Grow.c
+++ b/Grow.c
@@ -2736,7 +2736,6 @@ static int impose_level(int fd, int level, char *devname, int verbose)
 		for (d = 0, found = 0;
 		     d < MAX_DISKS && found < array.nr_disks;
 		     d++) {
-			int cnt;
 			mdu_disk_info_t disk;
 			disk.number = d;
 			if (ioctl(fd, GET_DISK_INFO, &disk) < 0)
@@ -2750,13 +2749,7 @@ static int impose_level(int fd, int level, char *devname, int verbose)
 				continue;
 			ioctl(fd, SET_DISK_FAULTY,
 			      makedev(disk.major, disk.minor));
-			cnt = 5;
-			while (ioctl(fd, HOT_REMOVE_DISK,
-				     makedev(disk.major, disk.minor)) < 0
-			       && errno == EBUSY
-			       && cnt--) {
-				usleep(10000);
-			}
+			hot_remove_disk(fd, makedev(disk.major, disk.minor));
 		}
 	}
 	c = map_num(pers, level);
diff --git a/Manage.c b/Manage.c
index 5c3d2b9b1a9f..9139f96e1431 100644
--- a/Manage.c
+++ b/Manage.c
@@ -1183,7 +1183,7 @@ int Manage_remove(struct supertype *tst, int fd, struct mddev_dev *dv,
 		else
 			err = 0;
 	} else {
-		err = ioctl(fd, HOT_REMOVE_DISK, rdev);
+		err = hot_remove_disk(fd, rdev);
 		if (err && errno == ENODEV) {
 			/* Old kernels rejected this if no personality
 			 * is registered */
@@ -1607,7 +1607,7 @@ int Manage_subdevs(char *devname, int fd,
 
 			if (dv->disposition == 'F')
 				/* Need to remove first */
-				ioctl(fd, HOT_REMOVE_DISK, rdev);
+				hot_remove_disk(fd, rdev);
 			/* Make sure it isn't in use (in 2.6 or later) */
 			tfd = dev_open(dv->devname, O_RDONLY|O_EXCL);
 			if (tfd >= 0) {
diff --git a/mdadm.h b/mdadm.h
index 71b8afb9fee6..53b3b5836841 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1476,6 +1476,7 @@ extern int add_disk(int mdfd, struct supertype *st,
 		    struct mdinfo *sra, struct mdinfo *info);
 extern int remove_disk(int mdfd, struct supertype *st,
 		       struct mdinfo *sra, struct mdinfo *info);
+extern int hot_remove_disk(int mdfd, unsigned long dev);
 extern int set_array_info(int mdfd, struct supertype *st, struct mdinfo *info);
 unsigned long long min_recovery_start(struct mdinfo *array);
 
diff --git a/util.c b/util.c
index f1009723ed04..e176a9d466cf 100644
--- a/util.c
+++ b/util.c
@@ -1795,6 +1795,24 @@ int remove_disk(int mdfd, struct supertype *st,
 	return rv;
 }
 
+int hot_remove_disk(int mdfd, unsigned long dev)
+{
+	int cnt = 5;
+	int ret;
+
+	/* HOT_REMOVE_DISK can fail with EBUSY if there are
+	 * outstanding IO requests to the device.
+	 * In this case, it can be helpful to wait a little while,
+	 * up to half a second, for that IO to flush.
+	 */
+	while ((ret = ioctl(mdfd, HOT_REMOVE_DISK, dev)) == -1 &&
+	       errno == EBUSY &&
+	       cnt-- > 0)
+		usleep(10000);
+
+	return ret;
+}
+
 int set_array_info(int mdfd, struct supertype *st, struct mdinfo *info)
 {
 	/* Initialise kernel's knowledge of array.



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [mdadm PATCH 1/5] udev-md-raid-assembly.rules: Skip non-ready devices
  2017-03-27  3:36 [mdadm PATCH 0/5] Assorted mdadm patch NeilBrown
@ 2017-03-27  3:36 ` NeilBrown
  2017-03-27  3:36 ` [mdadm PATCH 2/5] Retry HOT_REMOVE_DISK a few times NeilBrown
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: NeilBrown @ 2017-03-27  3:36 UTC (permalink / raw)
  To: Jes Sorensen; +Cc: Linux-RAID

From: Hannes Reinecke <hare@suse.de>

If a device isn't fully initialized (e.g if it should be
handled by multipathing) it should not be considered for
md/RAID auto-assembly.  Doing so can cause incorrect results
such as causing multipath to fail during startup.

There is a convention that the udev environment variable
SYSTEMD_READY be set to zero for such devices.  So change
the mdadm rules to ignore devices with SYSTEMD_READY==0.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: NeilBrown <neilb@suse.com>
---
 udev-md-raid-assembly.rules |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/udev-md-raid-assembly.rules b/udev-md-raid-assembly.rules
index d0d440a6394c..8ca232a44b1f 100644
--- a/udev-md-raid-assembly.rules
+++ b/udev-md-raid-assembly.rules
@@ -7,6 +7,9 @@ ENV{ANACONDA}=="?*", GOTO="md_inc_end"
 
 SUBSYSTEM!="block", GOTO="md_inc_end"
 
+# skip non-initialized devices
+ENV{SYSTEMD_READY}=="0", GOTO="md_inc_end"
+
 # handle potential components of arrays (the ones supported by md)
 ENV{ID_FS_TYPE}=="linux_raid_member", GOTO="md_inc"
 



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [mdadm PATCH 4/5] Add 'force' flag to *hot_remove_disk().
  2017-03-27  3:36 [mdadm PATCH 0/5] Assorted mdadm patch NeilBrown
                   ` (3 preceding siblings ...)
  2017-03-27  3:36 ` [mdadm PATCH 5/5] Detail: handle non-existent arrays better NeilBrown
@ 2017-03-27  3:36 ` NeilBrown
  2017-03-27  8:05   ` 王金浦
  2017-03-28 18:33   ` jes.sorensen
  4 siblings, 2 replies; 11+ messages in thread
From: NeilBrown @ 2017-03-27  3:36 UTC (permalink / raw)
  To: Jes Sorensen; +Cc: Linux-RAID

In rare circumstances, the short period that *hot_remove_disk()
waits isn't long enough to IO to complete.  This particularly happens
when a device is failing and many retries are still happening.

We don't want to increase the normal wait time for "mdadm --remove"
as that might be use just to test if a device is active or not, and a
delay would be problematic.
So allow "--force" to mean that mdadm should try extra hard for a
--remove to complete, waiting up to 5 seconds.

Note that this patch fixes a comment which claim the previous
wait time was half a second, where it was really 50msec.

Signed-off-by: NeilBrown <neilb@suse.com>
---
 Grow.c   |    2 +-
 Manage.c |   10 +++++-----
 mdadm.h  |    4 ++--
 util.c   |   10 +++++-----
 4 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/Grow.c b/Grow.c
index 218a70649c1f..e22661ca3dff 100755
--- a/Grow.c
+++ b/Grow.c
@@ -2749,7 +2749,7 @@ static int impose_level(int fd, int level, char *devname, int verbose)
 				continue;
 			ioctl(fd, SET_DISK_FAULTY,
 			      makedev(disk.major, disk.minor));
-			hot_remove_disk(fd, makedev(disk.major, disk.minor));
+			hot_remove_disk(fd, makedev(disk.major, disk.minor), 1);
 		}
 	}
 	c = map_num(pers, level);
diff --git a/Manage.c b/Manage.c
index edf5798aa87d..55218d9b0a7d 100644
--- a/Manage.c
+++ b/Manage.c
@@ -1110,7 +1110,7 @@ int Manage_add(int fd, int tfd, struct mddev_dev *dv,
 }
 
 int Manage_remove(struct supertype *tst, int fd, struct mddev_dev *dv,
-		  int sysfd, unsigned long rdev, int verbose, char *devname)
+		  int sysfd, unsigned long rdev, int force, int verbose, char *devname)
 {
 	int lfd = -1;
 	int err;
@@ -1177,9 +1177,9 @@ int Manage_remove(struct supertype *tst, int fd, struct mddev_dev *dv,
 		/* device has been removed and we don't know
 		 * the major:minor number
 		 */
-		err = sys_hot_remove_disk(sysfd);
+		err = sys_hot_remove_disk(sysfd, force);
 	} else {
-		err = hot_remove_disk(fd, rdev);
+		err = hot_remove_disk(fd, rdev, force);
 		if (err && errno == ENODEV) {
 			/* Old kernels rejected this if no personality
 			 * is registered */
@@ -1603,7 +1603,7 @@ int Manage_subdevs(char *devname, int fd,
 
 			if (dv->disposition == 'F')
 				/* Need to remove first */
-				hot_remove_disk(fd, rdev);
+				hot_remove_disk(fd, rdev, force);
 			/* Make sure it isn't in use (in 2.6 or later) */
 			tfd = dev_open(dv->devname, O_RDONLY|O_EXCL);
 			if (tfd >= 0) {
@@ -1645,7 +1645,7 @@ int Manage_subdevs(char *devname, int fd,
 				rv = -1;
 			} else
 				rv = Manage_remove(tst, fd, dv, sysfd,
-						   rdev, verbose,
+						   rdev, verbose, force,
 						   devname);
 			if (sysfd >= 0)
 				close(sysfd);
diff --git a/mdadm.h b/mdadm.h
index 52952a800ace..3b3d24230a79 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1476,8 +1476,8 @@ extern int add_disk(int mdfd, struct supertype *st,
 		    struct mdinfo *sra, struct mdinfo *info);
 extern int remove_disk(int mdfd, struct supertype *st,
 		       struct mdinfo *sra, struct mdinfo *info);
-extern int hot_remove_disk(int mdfd, unsigned long dev);
-extern int sys_hot_remove_disk(int statefd);
+extern int hot_remove_disk(int mdfd, unsigned long dev, int force);
+extern int sys_hot_remove_disk(int statefd, int force);
 extern int set_array_info(int mdfd, struct supertype *st, struct mdinfo *info);
 unsigned long long min_recovery_start(struct mdinfo *array);
 
diff --git a/util.c b/util.c
index 43fd83e72ded..6fd582520f0d 100644
--- a/util.c
+++ b/util.c
@@ -1795,15 +1795,15 @@ int remove_disk(int mdfd, struct supertype *st,
 	return rv;
 }
 
-int hot_remove_disk(int mdfd, unsigned long dev)
+int hot_remove_disk(int mdfd, unsigned long dev, int force)
 {
-	int cnt = 5;
+	int cnt = force ? 500 : 5;
 	int ret;
 
 	/* HOT_REMOVE_DISK can fail with EBUSY if there are
 	 * outstanding IO requests to the device.
 	 * In this case, it can be helpful to wait a little while,
-	 * up to half a second, for that IO to flush.
+	 * up to 5 seconds if 'force' is set, or 50 msec if not.
 	 */
 	while ((ret = ioctl(mdfd, HOT_REMOVE_DISK, dev)) == -1 &&
 	       errno == EBUSY &&
@@ -1813,9 +1813,9 @@ int hot_remove_disk(int mdfd, unsigned long dev)
 	return ret;
 }
 
-int sys_hot_remove_disk(int statefd)
+int sys_hot_remove_disk(int statefd, int force)
 {
-	int cnt = 5;
+	int cnt = force ? 500 : 5;
 	int ret;
 
 	while ((ret = write(statefd, "remove", 6)) == -1 &&



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [mdadm PATCH 5/5] Detail: handle non-existent arrays better.
  2017-03-27  3:36 [mdadm PATCH 0/5] Assorted mdadm patch NeilBrown
                   ` (2 preceding siblings ...)
  2017-03-27  3:36 ` [mdadm PATCH 3/5] Introduce sys_hot_remove_disk() NeilBrown
@ 2017-03-27  3:36 ` NeilBrown
  2017-03-28 18:34   ` jes.sorensen
  2017-03-27  3:36 ` [mdadm PATCH 4/5] Add 'force' flag to *hot_remove_disk() NeilBrown
  4 siblings, 1 reply; 11+ messages in thread
From: NeilBrown @ 2017-03-27  3:36 UTC (permalink / raw)
  To: Jes Sorensen; +Cc: Linux-RAID

If you call "mdadm --detail" with a device file for an array which
doesn't exist, such as by
  mknod /dev/md57 b 9 57
  mdadm --detail /dev/md57

you get an unhelpful message about and inactive RAID0, and return
status is '0'.  This is confusing.

So catch this possibility and print a more useful message, and
return a non-zero status.

Signed-off-by: NeilBrown <neilb@suse.com>
---
 Detail.c |    8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/Detail.c b/Detail.c
index 509b0d418768..d9d1b7092167 100644
--- a/Detail.c
+++ b/Detail.c
@@ -110,6 +110,14 @@ int Detail(char *dev, struct context *c)
 	if (ioctl(fd, GET_ARRAY_INFO, &array) == 0) {
 		inactive = 0;
 	} else if (errno == ENODEV && sra) {
+		if (sra->array.major_version == -1 &&
+		    sra->array.minor_version == -1 &&
+		    sra->devs == NULL) {
+			pr_err("Array associated with md device %s does not exist.\n", dev);
+			close(fd);
+			sysfs_free(sra);
+			return rv;
+		}
 		array = sra->array;
 		inactive = 1;
 	} else {



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [mdadm PATCH 3/5] Introduce sys_hot_remove_disk()
  2017-03-27  3:36 [mdadm PATCH 0/5] Assorted mdadm patch NeilBrown
  2017-03-27  3:36 ` [mdadm PATCH 1/5] udev-md-raid-assembly.rules: Skip non-ready devices NeilBrown
  2017-03-27  3:36 ` [mdadm PATCH 2/5] Retry HOT_REMOVE_DISK a few times NeilBrown
@ 2017-03-27  3:36 ` NeilBrown
  2017-03-28 18:33   ` jes.sorensen
  2017-03-27  3:36 ` [mdadm PATCH 5/5] Detail: handle non-existent arrays better NeilBrown
  2017-03-27  3:36 ` [mdadm PATCH 4/5] Add 'force' flag to *hot_remove_disk() NeilBrown
  4 siblings, 1 reply; 11+ messages in thread
From: NeilBrown @ 2017-03-27  3:36 UTC (permalink / raw)
  To: Jes Sorensen; +Cc: Linux-RAID

The new hot_remove_disk() will retry HOT_REMOVE_DISK
several times in the face of EBUSY.
However we sometimes remove a device by writing "remove" to the
"state" attributed.  This should be retried as well.
So introduce sys_hot_remove_disk() to repeat this action a few times.

Signed-off-by: NeilBrown <neilb@suse.com>
---
 Manage.c |    6 +-----
 mdadm.h  |    1 +
 util.c   |   12 ++++++++++++
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/Manage.c b/Manage.c
index 9139f96e1431..edf5798aa87d 100644
--- a/Manage.c
+++ b/Manage.c
@@ -1177,11 +1177,7 @@ int Manage_remove(struct supertype *tst, int fd, struct mddev_dev *dv,
 		/* device has been removed and we don't know
 		 * the major:minor number
 		 */
-		int n = write(sysfd, "remove", 6);
-		if (n != 6)
-			err = -1;
-		else
-			err = 0;
+		err = sys_hot_remove_disk(sysfd);
 	} else {
 		err = hot_remove_disk(fd, rdev);
 		if (err && errno == ENODEV) {
diff --git a/mdadm.h b/mdadm.h
index 53b3b5836841..52952a800ace 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1477,6 +1477,7 @@ extern int add_disk(int mdfd, struct supertype *st,
 extern int remove_disk(int mdfd, struct supertype *st,
 		       struct mdinfo *sra, struct mdinfo *info);
 extern int hot_remove_disk(int mdfd, unsigned long dev);
+extern int sys_hot_remove_disk(int statefd);
 extern int set_array_info(int mdfd, struct supertype *st, struct mdinfo *info);
 unsigned long long min_recovery_start(struct mdinfo *array);
 
diff --git a/util.c b/util.c
index e176a9d466cf..43fd83e72ded 100644
--- a/util.c
+++ b/util.c
@@ -1813,6 +1813,18 @@ int hot_remove_disk(int mdfd, unsigned long dev)
 	return ret;
 }
 
+int sys_hot_remove_disk(int statefd)
+{
+	int cnt = 5;
+	int ret;
+
+	while ((ret = write(statefd, "remove", 6)) == -1 &&
+	       errno == EBUSY &&
+	       cnt-- > 0)
+		usleep(10000);
+	return ret == 6 ? 0 : -1;
+}
+
 int set_array_info(int mdfd, struct supertype *st, struct mdinfo *info)
 {
 	/* Initialise kernel's knowledge of array.



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [mdadm PATCH 2/5] Retry HOT_REMOVE_DISK a few times.
  2017-03-27  3:36 ` [mdadm PATCH 2/5] Retry HOT_REMOVE_DISK a few times NeilBrown
@ 2017-03-27  7:51   ` 王金浦
  0 siblings, 0 replies; 11+ messages in thread
From: 王金浦 @ 2017-03-27  7:51 UTC (permalink / raw)
  To: NeilBrown; +Cc: Jes Sorensen, Linux-RAID

Hi Neil,
>
> +int hot_remove_disk(int mdfd, unsigned long dev)
> +{
> +       int cnt = 5;
> +       int ret;
> +
> +       /* HOT_REMOVE_DISK can fail with EBUSY if there are
> +        * outstanding IO requests to the device.
> +        * In this case, it can be helpful to wait a little while,
> +        * up to half a second, for that IO to flush.
> +        */
> +       while ((ret = ioctl(mdfd, HOT_REMOVE_DISK, dev)) == -1 &&
> +              errno == EBUSY &&
> +              cnt-- > 0)
> +               usleep(10000);
> +
> +       return ret;
> +}
The comments is different with the code, 50000 us = 50 ms not half a seconds.

Other than that.

It looks good to me!

Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com>

Thanks,
Jack

> +
>  int set_array_info(int mdfd, struct supertype *st, struct mdinfo *info)
>  {
>         /* Initialise kernel's knowledge of array.
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [mdadm PATCH 4/5] Add 'force' flag to *hot_remove_disk().
  2017-03-27  3:36 ` [mdadm PATCH 4/5] Add 'force' flag to *hot_remove_disk() NeilBrown
@ 2017-03-27  8:05   ` 王金浦
  2017-03-28 18:33   ` jes.sorensen
  1 sibling, 0 replies; 11+ messages in thread
From: 王金浦 @ 2017-03-27  8:05 UTC (permalink / raw)
  To: NeilBrown; +Cc: Jes Sorensen, Linux-RAID

2017-03-27 5:36 GMT+02:00 NeilBrown <neilb@suse.com>:
> In rare circumstances, the short period that *hot_remove_disk()
> waits isn't long enough to IO to complete.  This particularly happens
> when a device is failing and many retries are still happening.
>
> We don't want to increase the normal wait time for "mdadm --remove"
> as that might be use just to test if a device is active or not, and a
> delay would be problematic.
> So allow "--force" to mean that mdadm should try extra hard for a
> --remove to complete, waiting up to 5 seconds.
>
> Note that this patch fixes a comment which claim the previous
> wait time was half a second, where it was really 50msec.
>
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
>  Grow.c   |    2 +-
>  Manage.c |   10 +++++-----
>  mdadm.h  |    4 ++--
>  util.c   |   10 +++++-----
>  4 files changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/Grow.c b/Grow.c
> index 218a70649c1f..e22661ca3dff 100755
> --- a/Grow.c
> +++ b/Grow.c
> @@ -2749,7 +2749,7 @@ static int impose_level(int fd, int level, char *devname, int verbose)
>                                 continue;
>                         ioctl(fd, SET_DISK_FAULTY,
>                               makedev(disk.major, disk.minor));
> -                       hot_remove_disk(fd, makedev(disk.major, disk.minor));
> +                       hot_remove_disk(fd, makedev(disk.major, disk.minor), 1);
>                 }
>         }
>         c = map_num(pers, level);
> diff --git a/Manage.c b/Manage.c
> index edf5798aa87d..55218d9b0a7d 100644
> --- a/Manage.c
> +++ b/Manage.c
> @@ -1110,7 +1110,7 @@ int Manage_add(int fd, int tfd, struct mddev_dev *dv,
>  }
>
>  int Manage_remove(struct supertype *tst, int fd, struct mddev_dev *dv,
> -                 int sysfd, unsigned long rdev, int verbose, char *devname)
> +                 int sysfd, unsigned long rdev, int force, int verbose, char *devname)
>  {
>         int lfd = -1;
>         int err;
> @@ -1177,9 +1177,9 @@ int Manage_remove(struct supertype *tst, int fd, struct mddev_dev *dv,
>                 /* device has been removed and we don't know
>                  * the major:minor number
>                  */
> -               err = sys_hot_remove_disk(sysfd);
> +               err = sys_hot_remove_disk(sysfd, force);
>         } else {
> -               err = hot_remove_disk(fd, rdev);
> +               err = hot_remove_disk(fd, rdev, force);
>                 if (err && errno == ENODEV) {
>                         /* Old kernels rejected this if no personality
>                          * is registered */
> @@ -1603,7 +1603,7 @@ int Manage_subdevs(char *devname, int fd,
>
>                         if (dv->disposition == 'F')
>                                 /* Need to remove first */
> -                               hot_remove_disk(fd, rdev);
> +                               hot_remove_disk(fd, rdev, force);
>                         /* Make sure it isn't in use (in 2.6 or later) */
>                         tfd = dev_open(dv->devname, O_RDONLY|O_EXCL);
>                         if (tfd >= 0) {
> @@ -1645,7 +1645,7 @@ int Manage_subdevs(char *devname, int fd,
>                                 rv = -1;
>                         } else
>                                 rv = Manage_remove(tst, fd, dv, sysfd,
> -                                                  rdev, verbose,
> +                                                  rdev, verbose, force,
>                                                    devname);
>                         if (sysfd >= 0)
>                                 close(sysfd);
> diff --git a/mdadm.h b/mdadm.h
> index 52952a800ace..3b3d24230a79 100644
> --- a/mdadm.h
> +++ b/mdadm.h
> @@ -1476,8 +1476,8 @@ extern int add_disk(int mdfd, struct supertype *st,
>                     struct mdinfo *sra, struct mdinfo *info);
>  extern int remove_disk(int mdfd, struct supertype *st,
>                        struct mdinfo *sra, struct mdinfo *info);
> -extern int hot_remove_disk(int mdfd, unsigned long dev);
> -extern int sys_hot_remove_disk(int statefd);
> +extern int hot_remove_disk(int mdfd, unsigned long dev, int force);
> +extern int sys_hot_remove_disk(int statefd, int force);
>  extern int set_array_info(int mdfd, struct supertype *st, struct mdinfo *info);
>  unsigned long long min_recovery_start(struct mdinfo *array);
>
> diff --git a/util.c b/util.c
> index 43fd83e72ded..6fd582520f0d 100644
> --- a/util.c
> +++ b/util.c
> @@ -1795,15 +1795,15 @@ int remove_disk(int mdfd, struct supertype *st,
>         return rv;
>  }
>
> -int hot_remove_disk(int mdfd, unsigned long dev)
> +int hot_remove_disk(int mdfd, unsigned long dev, int force)
>  {
> -       int cnt = 5;
> +       int cnt = force ? 500 : 5;
>         int ret;
>
>         /* HOT_REMOVE_DISK can fail with EBUSY if there are
>          * outstanding IO requests to the device.
>          * In this case, it can be helpful to wait a little while,
> -        * up to half a second, for that IO to flush.
> +        * up to 5 seconds if 'force' is set, or 50 msec if not.
>          */
>         while ((ret = ioctl(mdfd, HOT_REMOVE_DISK, dev)) == -1 &&
>                errno == EBUSY &&
> @@ -1813,9 +1813,9 @@ int hot_remove_disk(int mdfd, unsigned long dev)
>         return ret;
>  }
>
> -int sys_hot_remove_disk(int statefd)
> +int sys_hot_remove_disk(int statefd, int force)
>  {
> -       int cnt = 5;
> +       int cnt = force ? 500 : 5;
>         int ret;
>
>         while ((ret = write(statefd, "remove", 6)) == -1 &&
>
>
> --

Ah, you've found and fix it,
Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [mdadm PATCH 3/5] Introduce sys_hot_remove_disk()
  2017-03-27  3:36 ` [mdadm PATCH 3/5] Introduce sys_hot_remove_disk() NeilBrown
@ 2017-03-28 18:33   ` jes.sorensen
  0 siblings, 0 replies; 11+ messages in thread
From: jes.sorensen @ 2017-03-28 18:33 UTC (permalink / raw)
  To: NeilBrown; +Cc: Linux-RAID

NeilBrown <neilb@suse.com> writes:
> The new hot_remove_disk() will retry HOT_REMOVE_DISK
> several times in the face of EBUSY.
> However we sometimes remove a device by writing "remove" to the
> "state" attributed.  This should be retried as well.
> So introduce sys_hot_remove_disk() to repeat this action a few times.
>
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
>  Manage.c |    6 +-----
>  mdadm.h  |    1 +
>  util.c   |   12 ++++++++++++
>  3 files changed, 14 insertions(+), 5 deletions(-)

Applied!

Thanks,
Jes

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [mdadm PATCH 4/5] Add 'force' flag to *hot_remove_disk().
  2017-03-27  3:36 ` [mdadm PATCH 4/5] Add 'force' flag to *hot_remove_disk() NeilBrown
  2017-03-27  8:05   ` 王金浦
@ 2017-03-28 18:33   ` jes.sorensen
  1 sibling, 0 replies; 11+ messages in thread
From: jes.sorensen @ 2017-03-28 18:33 UTC (permalink / raw)
  To: NeilBrown; +Cc: Linux-RAID

NeilBrown <neilb@suse.com> writes:
> In rare circumstances, the short period that *hot_remove_disk()
> waits isn't long enough to IO to complete.  This particularly happens
> when a device is failing and many retries are still happening.
>
> We don't want to increase the normal wait time for "mdadm --remove"
> as that might be use just to test if a device is active or not, and a
> delay would be problematic.
> So allow "--force" to mean that mdadm should try extra hard for a
> --remove to complete, waiting up to 5 seconds.
>
> Note that this patch fixes a comment which claim the previous
> wait time was half a second, where it was really 50msec.
>
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
>  Grow.c   |    2 +-
>  Manage.c |   10 +++++-----
>  mdadm.h  |    4 ++--
>  util.c   |   10 +++++-----
>  4 files changed, 13 insertions(+), 13 deletions(-)

Applied!

Thanks,
Jes

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [mdadm PATCH 5/5] Detail: handle non-existent arrays better.
  2017-03-27  3:36 ` [mdadm PATCH 5/5] Detail: handle non-existent arrays better NeilBrown
@ 2017-03-28 18:34   ` jes.sorensen
  0 siblings, 0 replies; 11+ messages in thread
From: jes.sorensen @ 2017-03-28 18:34 UTC (permalink / raw)
  To: NeilBrown; +Cc: Linux-RAID

NeilBrown <neilb@suse.com> writes:
> If you call "mdadm --detail" with a device file for an array which
> doesn't exist, such as by
>   mknod /dev/md57 b 9 57
>   mdadm --detail /dev/md57
>
> you get an unhelpful message about and inactive RAID0, and return
> status is '0'.  This is confusing.
>
> So catch this possibility and print a more useful message, and
> return a non-zero status.
>
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
>  Detail.c |    8 ++++++++
>  1 file changed, 8 insertions(+)

Applied!

Thanks,
Jes

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-03-28 18:34 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-27  3:36 [mdadm PATCH 0/5] Assorted mdadm patch NeilBrown
2017-03-27  3:36 ` [mdadm PATCH 1/5] udev-md-raid-assembly.rules: Skip non-ready devices NeilBrown
2017-03-27  3:36 ` [mdadm PATCH 2/5] Retry HOT_REMOVE_DISK a few times NeilBrown
2017-03-27  7:51   ` 王金浦
2017-03-27  3:36 ` [mdadm PATCH 3/5] Introduce sys_hot_remove_disk() NeilBrown
2017-03-28 18:33   ` jes.sorensen
2017-03-27  3:36 ` [mdadm PATCH 5/5] Detail: handle non-existent arrays better NeilBrown
2017-03-28 18:34   ` jes.sorensen
2017-03-27  3:36 ` [mdadm PATCH 4/5] Add 'force' flag to *hot_remove_disk() NeilBrown
2017-03-27  8:05   ` 王金浦
2017-03-28 18:33   ` jes.sorensen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.