On Wed, Oct 25 2017, Erik Berg wrote: > When you have a box with 60 8TB drives, 8 RAID6 sets and 12 spares > assigned to the first set, and the first set has a failed disk and is > rebuilding, spares aren't given to other sets with failing disks. > Waiting 2-3 days for a rebuild to complete before giving out spares to > other failing sets seems excessive. True. One way around this is to have an inactive array that just contains spares. It is probably awkward to create such an array with current mdadm thought :-( > --- > Monitor.c | 36 ++++++++++++++++++++++++++++++++---- > 1 file changed, 32 insertions(+), 4 deletions(-) > > diff --git a/Monitor.c b/Monitor.c > index b60996b..b1178fd 100644 > --- a/Monitor.c > +++ b/Monitor.c > @@ -781,14 +781,41 @@ static int check_donor(struct state *from, struct state *to) > */ > if (sub->active < sub->raid) > return 0; > - if (from->metadata->ss->external == 0) > - if (from->active < from->raid) > - return 0; > if (from->spare <= 0) > return 0; > return 1; > } > > +int spare_rebuilding(struct state *dev) > +{ > + mdu_disk_info_t disk; > + mdu_array_info_t array; > + > + int fd = open(dev->devname, O_RDONLY); > + if (fd < 0) { > + pr_err("cannot open %s: %s\n", > + dev->devname, strerror(errno)); > + return 1; > + } > + > + md_get_disk_info(fd, &disk); This is wrong. md_get_disk_info requires that disk.number be set, and it returns the info for that disk. You need to pass 'd' from choose_spare() into spare_rebuilding(), and set disk.number = d; Did you test your code? :-) > + md_get_array_info(fd, &array); You only use 'array' to get 'array.raid_disks', and that should be the same as dev->raid. > + > + close(fd); > + > + if ((disk.state & > + ((1 << MD_DISK_ACTIVE) | (1 << MD_DISK_SYNC) | > + (1 << MD_DISK_REMOVED) | (1 << MD_DISK_FAULTY) | > + (1 << MD_DISK_JOURNAL))) == 0) { dev->devstate[d] already contains disk.state, and has be tested against zero, so the above is not needed. > + > + if (disk.raid_disk < array.raid_disks && > + disk.raid_disk >= 0) > + return 1; This is the important test. If raid_disk is -1, the device is really a spare. If >= 0, it isn't. Maybe it would be better to add a 'devrole' array to 'struct state', and collect this information in check_array(). Then spare_rebuilding() would become trivial. So I don't much like the code, but the concept seems sound. Thanks, NeilBrown > + } > + > + return 0; > +} > + > static dev_t choose_spare(struct state *from, struct state *to, > struct domainlist *domlist, struct spare_criteria *sc) > { > @@ -821,7 +848,8 @@ static dev_t choose_spare(struct state *from, struct state *to, > pol_add(&pol, pol_domain, > from->spare_group, NULL); > if (domain_test(domlist, pol, > - to->metadata->ss->name) == 1) > + to->metadata->ss->name) == 1 && > + !spare_rebuilding(from)) > dev = from->devid[d]; > dev_policy_free(pol); > } > -- > 2.14.1 > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html