All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Monitor: Allow taking spare from rebuilding array
@ 2017-10-25 10:37 Erik Berg
  2017-10-31 11:23 ` Erik Berg
  2017-11-01  1:30 ` NeilBrown
  0 siblings, 2 replies; 3+ messages in thread
From: Erik Berg @ 2017-10-25 10:37 UTC (permalink / raw)
  To: linux-raid

When you have a box with 60 8TB drives, 8 RAID6 sets and 12 spares
assigned to the first set, and the first set has a failed disk and is
rebuilding, spares aren't given to other sets with failing disks.
Waiting 2-3 days for a rebuild to complete before giving out spares to
other failing sets seems excessive.
---
 Monitor.c | 36 ++++++++++++++++++++++++++++++++----
 1 file changed, 32 insertions(+), 4 deletions(-)

diff --git a/Monitor.c b/Monitor.c
index b60996b..b1178fd 100644
--- a/Monitor.c
+++ b/Monitor.c
@@ -781,14 +781,41 @@ static int check_donor(struct state *from, struct state *to)
 		 */
 		if (sub->active < sub->raid)
 			return 0;
-	if (from->metadata->ss->external == 0)
-		if (from->active < from->raid)
-			return 0;
 	if (from->spare <= 0)
 		return 0;
 	return 1;
 }
 
+int spare_rebuilding(struct state *dev)
+{
+	mdu_disk_info_t disk;
+	mdu_array_info_t array;
+
+	int fd = open(dev->devname, O_RDONLY);
+	if (fd < 0) {
+		pr_err("cannot open %s: %s\n",
+			dev->devname, strerror(errno));
+		return 1;
+	}
+
+	md_get_disk_info(fd, &disk);
+	md_get_array_info(fd, &array);
+
+	close(fd);
+
+	if ((disk.state &
+	     ((1 << MD_DISK_ACTIVE) | (1 << MD_DISK_SYNC) |
+	      (1 << MD_DISK_REMOVED) | (1 << MD_DISK_FAULTY) |
+	      (1 << MD_DISK_JOURNAL))) == 0) {
+
+		if (disk.raid_disk < array.raid_disks &&
+		    disk.raid_disk >= 0)
+			return 1;
+	}
+
+	return 0;
+}
+
 static dev_t choose_spare(struct state *from, struct state *to,
 			  struct domainlist *domlist, struct spare_criteria *sc)
 {
@@ -821,7 +848,8 @@ static dev_t choose_spare(struct state *from, struct state *to,
 				pol_add(&pol, pol_domain,
 					from->spare_group, NULL);
 			if (domain_test(domlist, pol,
-					to->metadata->ss->name) == 1)
+					to->metadata->ss->name) == 1 &&
+			    !spare_rebuilding(from))
 			    dev = from->devid[d];
 			dev_policy_free(pol);
 		}
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] Monitor: Allow taking spare from rebuilding array
  2017-10-25 10:37 [PATCH] Monitor: Allow taking spare from rebuilding array Erik Berg
@ 2017-10-31 11:23 ` Erik Berg
  2017-11-01  1:30 ` NeilBrown
  1 sibling, 0 replies; 3+ messages in thread
From: Erik Berg @ 2017-10-31 11:23 UTC (permalink / raw)
  To: linux-raid; +Cc: neilb, jsorensen

On 10/25/2017 10:37 AM, Erik Berg wrote:
> When you have a box with 60 8TB drives, 8 RAID6 sets and 12 spares
> assigned to the first set, and the first set has a failed disk and is
> rebuilding, spares aren't given to other sets with failing disks.
> Waiting 2-3 days for a rebuild to complete before giving out spares to
> other failing sets seems excessive.

Anyone got any comments/opinions on this, or suggestions as to other 
ways of doing this?

We'd ultimately just like to have a pool of 12 spares that can go to any 
raid-set as drives fail.


--
erikberg

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] Monitor: Allow taking spare from rebuilding array
  2017-10-25 10:37 [PATCH] Monitor: Allow taking spare from rebuilding array Erik Berg
  2017-10-31 11:23 ` Erik Berg
@ 2017-11-01  1:30 ` NeilBrown
  1 sibling, 0 replies; 3+ messages in thread
From: NeilBrown @ 2017-11-01  1:30 UTC (permalink / raw)
  To: Erik Berg, linux-raid

[-- Attachment #1: Type: text/plain, Size: 3323 bytes --]

On Wed, Oct 25 2017, Erik Berg wrote:

> When you have a box with 60 8TB drives, 8 RAID6 sets and 12 spares
> assigned to the first set, and the first set has a failed disk and is
> rebuilding, spares aren't given to other sets with failing disks.
> Waiting 2-3 days for a rebuild to complete before giving out spares to
> other failing sets seems excessive.

True.
One way around this is to have an inactive array that just contains
spares.
It is probably awkward to create such an array with current mdadm
thought :-(


> ---
>  Monitor.c | 36 ++++++++++++++++++++++++++++++++----
>  1 file changed, 32 insertions(+), 4 deletions(-)
>
> diff --git a/Monitor.c b/Monitor.c
> index b60996b..b1178fd 100644
> --- a/Monitor.c
> +++ b/Monitor.c
> @@ -781,14 +781,41 @@ static int check_donor(struct state *from, struct state *to)
>  		 */
>  		if (sub->active < sub->raid)
>  			return 0;
> -	if (from->metadata->ss->external == 0)
> -		if (from->active < from->raid)
> -			return 0;
>  	if (from->spare <= 0)
>  		return 0;
>  	return 1;
>  }
>  
> +int spare_rebuilding(struct state *dev)
> +{
> +	mdu_disk_info_t disk;
> +	mdu_array_info_t array;
> +
> +	int fd = open(dev->devname, O_RDONLY);
> +	if (fd < 0) {
> +		pr_err("cannot open %s: %s\n",
> +			dev->devname, strerror(errno));
> +		return 1;
> +	}
> +
> +	md_get_disk_info(fd, &disk);

This is wrong. md_get_disk_info requires that disk.number be set,
and it returns the info for that disk.

You need to pass 'd' from choose_spare() into  spare_rebuilding(),
and set
 disk.number = d;

Did you test your code? :-)

> +	md_get_array_info(fd, &array);

You only use 'array' to get 'array.raid_disks', and that should be the
same as dev->raid.

> +
> +	close(fd);
> +
> +	if ((disk.state &
> +	     ((1 << MD_DISK_ACTIVE) | (1 << MD_DISK_SYNC) |
> +	      (1 << MD_DISK_REMOVED) | (1 << MD_DISK_FAULTY) |
> +	      (1 << MD_DISK_JOURNAL))) == 0) {

dev->devstate[d] already contains disk.state, and has be tested against
zero, so the above is not needed.

> +
> +		if (disk.raid_disk < array.raid_disks &&
> +		    disk.raid_disk >= 0)
> +			return 1;

This is the important test. If raid_disk is -1, the device is really a
spare.  If >= 0, it isn't.

Maybe it would be better to add a 'devrole' array to 'struct state', and
collect this information in check_array().  Then spare_rebuilding() would
become trivial.

So I don't much like the code, but the concept seems sound.

Thanks,
NeilBrown


> +	}
> +
> +	return 0;
> +}
> +
>  static dev_t choose_spare(struct state *from, struct state *to,
>  			  struct domainlist *domlist, struct spare_criteria *sc)
>  {
> @@ -821,7 +848,8 @@ static dev_t choose_spare(struct state *from, struct state *to,
>  				pol_add(&pol, pol_domain,
>  					from->spare_group, NULL);
>  			if (domain_test(domlist, pol,
> -					to->metadata->ss->name) == 1)
> +					to->metadata->ss->name) == 1 &&
> +			    !spare_rebuilding(from))
>  			    dev = from->devid[d];
>  			dev_policy_free(pol);
>  		}
> -- 
> 2.14.1
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2017-11-01  1:30 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-25 10:37 [PATCH] Monitor: Allow taking spare from rebuilding array Erik Berg
2017-10-31 11:23 ` Erik Berg
2017-11-01  1:30 ` NeilBrown

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.