* [PATCH v5 0/2] md_cluster: bugs fix @ 2020-07-20 18:08 Zhao Heming 2020-07-20 18:08 ` [PATCH v5 1/2] md-cluster: fix safemode_delay value when converting to clustered bitmap Zhao Heming ` (2 more replies) 0 siblings, 3 replies; 7+ messages in thread From: Zhao Heming @ 2020-07-20 18:08 UTC (permalink / raw) To: linux-raid; +Cc: Zhao Heming, neilb, song v5: - change safemode_delay patch code by neil's comments. - modify safemode_delay patch comments: - remove useless analysis - limit line in 75 chars - add neil's reviewd info in rmmod patch. v4: - add new patch 'fix rmmod issue when md_cluster convert bitmap to none' - add cover leter - modify both patches comments. v3: - add restoring path for error case - modify comments v2: - change setting path from location_store to md_setup_cluster - add restoring path v1: - create: fix safemode_delay value when converting to clustered bitmap Zhao Heming (2): md-cluster: fix safemode_delay value when converting to clustered bitmap md-cluster: fix rmmod issue when md_cluster convert bitmap to none drivers/md/md.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) -- 2.25.0 ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v5 1/2] md-cluster: fix safemode_delay value when converting to clustered bitmap 2020-07-20 18:08 [PATCH v5 0/2] md_cluster: bugs fix Zhao Heming @ 2020-07-20 18:08 ` Zhao Heming 2020-07-20 23:02 ` NeilBrown 2020-07-21 8:35 ` Guoqing Jiang 2020-07-20 18:08 ` [PATCH v5 2/2] md-cluster: fix rmmod issue when md_cluster convert bitmap to none Zhao Heming 2020-07-22 5:01 ` [PATCH v5 0/2] md_cluster: bugs fix Song Liu 2 siblings, 2 replies; 7+ messages in thread From: Zhao Heming @ 2020-07-20 18:08 UTC (permalink / raw) To: linux-raid; +Cc: Zhao Heming, neilb, song When array convert to clustered bitmap, the safe_mode_delay doesn't clean and vice versa. the /sys/block/mdX/md/safe_mode_delay keep original value after changing bitmap type. In safe_delay_store(), the code forbids setting mddev->safemode_delay when array is clustered. So in cluster-md env, the expected safemode_delay value should be 0. Reproducible steps: ``` node1 # mdadm --zero-superblock /dev/sd{b,c,d} node1 # mdadm -C /dev/md0 -b internal -e 1.2 -n 2 -l mirror /dev/sdb /dev/sdc node1 # cat /sys/block/md0/md/safe_mode_delay 0.204 node1 # mdadm -G /dev/md0 -b none node1 # mdadm --grow /dev/md0 --bitmap=clustered node1 # cat /sys/block/md0/md/safe_mode_delay 0.204 <== doesn't change, should ZERO for cluster-md node1 # mdadm --zero-superblock /dev/sd{b,c,d} node1 # mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sdb /dev/sdc node1 # cat /sys/block/md0/md/safe_mode_delay 0.000 node1 # mdadm -G /dev/md0 -b none node1 # cat /sys/block/md0/md/safe_mode_delay 0.000 <== doesn't change, should default value ``` Signed-off-by: Zhao Heming <heming.zhao@suse.com> --- drivers/md/md.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index f567f536b529..1bde3df3fa18 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -101,6 +101,8 @@ static void mddev_detach(struct mddev *mddev); * count by 2 for every hour elapsed between read errors. */ #define MD_DEFAULT_MAX_CORRECTED_READ_ERRORS 20 +/* Default safemode delay: 200 msec */ +#define DEFAULT_SAFEMODE_DELAY ((200 * HZ)/1000 +1) /* * Current RAID-1,4,5 parallel reconstruction 'guaranteed speed limit' * is 1000 KB/sec, so the extra system load does not show up that much. @@ -5982,7 +5984,7 @@ int md_run(struct mddev *mddev) if (mddev_is_clustered(mddev)) mddev->safemode_delay = 0; else - mddev->safemode_delay = (200 * HZ)/1000 +1; /* 200 msec delay */ + mddev->safemode_delay = DEFAULT_SAFEMODE_DELAY; mddev->in_sync = 1; smp_wmb(); spin_lock(&mddev->lock); @@ -7361,6 +7363,7 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info) mddev->bitmap_info.nodes = 0; md_cluster_ops->leave(mddev); + mddev->safemode_delay = DEFAULT_SAFEMODE_DELAY; } mddev_suspend(mddev); md_bitmap_destroy(mddev); @@ -8355,6 +8358,7 @@ EXPORT_SYMBOL(unregister_md_cluster_operations); int md_setup_cluster(struct mddev *mddev, int nodes) { + int ret; if (!md_cluster_ops) request_module("md-cluster"); spin_lock(&pers_lock); @@ -8366,7 +8370,10 @@ int md_setup_cluster(struct mddev *mddev, int nodes) } spin_unlock(&pers_lock); - return md_cluster_ops->join(mddev, nodes); + ret = md_cluster_ops->join(mddev, nodes); + if (!ret) + mddev->safemode_delay = 0; + return ret; } void md_cluster_stop(struct mddev *mddev) -- 2.25.0 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v5 1/2] md-cluster: fix safemode_delay value when converting to clustered bitmap 2020-07-20 18:08 ` [PATCH v5 1/2] md-cluster: fix safemode_delay value when converting to clustered bitmap Zhao Heming @ 2020-07-20 23:02 ` NeilBrown 2020-07-21 8:35 ` Guoqing Jiang 1 sibling, 0 replies; 7+ messages in thread From: NeilBrown @ 2020-07-20 23:02 UTC (permalink / raw) To: linux-raid; +Cc: Zhao Heming, neilb, song [-- Attachment #1: Type: text/plain, Size: 3207 bytes --] On Tue, Jul 21 2020, Zhao Heming wrote: > When array convert to clustered bitmap, the safe_mode_delay doesn't > clean and vice versa. the /sys/block/mdX/md/safe_mode_delay keep original > value after changing bitmap type. In safe_delay_store(), the code forbids > setting mddev->safemode_delay when array is clustered. So in cluster-md > env, the expected safemode_delay value should be 0. > > Reproducible steps: > ``` > node1 # mdadm --zero-superblock /dev/sd{b,c,d} > node1 # mdadm -C /dev/md0 -b internal -e 1.2 -n 2 -l mirror /dev/sdb /dev/sdc > node1 # cat /sys/block/md0/md/safe_mode_delay > 0.204 > node1 # mdadm -G /dev/md0 -b none > node1 # mdadm --grow /dev/md0 --bitmap=clustered > node1 # cat /sys/block/md0/md/safe_mode_delay > 0.204 <== doesn't change, should ZERO for cluster-md > > node1 # mdadm --zero-superblock /dev/sd{b,c,d} > node1 # mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sdb /dev/sdc > node1 # cat /sys/block/md0/md/safe_mode_delay > 0.000 > node1 # mdadm -G /dev/md0 -b none > node1 # cat /sys/block/md0/md/safe_mode_delay > 0.000 <== doesn't change, should default value > ``` > > Signed-off-by: Zhao Heming <heming.zhao@suse.com> Reviewed-by: NeilBrown <neilb@suse.de> Thanks, NeilBrown > --- > drivers/md/md.c | 11 +++++++++-- > 1 file changed, 9 insertions(+), 2 deletions(-) > > diff --git a/drivers/md/md.c b/drivers/md/md.c > index f567f536b529..1bde3df3fa18 100644 > --- a/drivers/md/md.c > +++ b/drivers/md/md.c > @@ -101,6 +101,8 @@ static void mddev_detach(struct mddev *mddev); > * count by 2 for every hour elapsed between read errors. > */ > #define MD_DEFAULT_MAX_CORRECTED_READ_ERRORS 20 > +/* Default safemode delay: 200 msec */ > +#define DEFAULT_SAFEMODE_DELAY ((200 * HZ)/1000 +1) > /* > * Current RAID-1,4,5 parallel reconstruction 'guaranteed speed limit' > * is 1000 KB/sec, so the extra system load does not show up that much. > @@ -5982,7 +5984,7 @@ int md_run(struct mddev *mddev) > if (mddev_is_clustered(mddev)) > mddev->safemode_delay = 0; > else > - mddev->safemode_delay = (200 * HZ)/1000 +1; /* 200 msec delay */ > + mddev->safemode_delay = DEFAULT_SAFEMODE_DELAY; > mddev->in_sync = 1; > smp_wmb(); > spin_lock(&mddev->lock); > @@ -7361,6 +7363,7 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info) > > mddev->bitmap_info.nodes = 0; > md_cluster_ops->leave(mddev); > + mddev->safemode_delay = DEFAULT_SAFEMODE_DELAY; > } > mddev_suspend(mddev); > md_bitmap_destroy(mddev); > @@ -8355,6 +8358,7 @@ EXPORT_SYMBOL(unregister_md_cluster_operations); > > int md_setup_cluster(struct mddev *mddev, int nodes) > { > + int ret; > if (!md_cluster_ops) > request_module("md-cluster"); > spin_lock(&pers_lock); > @@ -8366,7 +8370,10 @@ int md_setup_cluster(struct mddev *mddev, int nodes) > } > spin_unlock(&pers_lock); > > - return md_cluster_ops->join(mddev, nodes); > + ret = md_cluster_ops->join(mddev, nodes); > + if (!ret) > + mddev->safemode_delay = 0; > + return ret; > } > > void md_cluster_stop(struct mddev *mddev) > -- > 2.25.0 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 832 bytes --] ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v5 1/2] md-cluster: fix safemode_delay value when converting to clustered bitmap 2020-07-20 18:08 ` [PATCH v5 1/2] md-cluster: fix safemode_delay value when converting to clustered bitmap Zhao Heming 2020-07-20 23:02 ` NeilBrown @ 2020-07-21 8:35 ` Guoqing Jiang 2020-07-21 8:58 ` heming.zhao 1 sibling, 1 reply; 7+ messages in thread From: Guoqing Jiang @ 2020-07-21 8:35 UTC (permalink / raw) To: Zhao Heming, linux-raid; +Cc: neilb, song On 7/20/20 8:08 PM, Zhao Heming wrote: > When array convert to clustered bitmap, the safe_mode_delay doesn't > clean and vice versa. the /sys/block/mdX/md/safe_mode_delay keep original > value after changing bitmap type. In safe_delay_store(), the code forbids > setting mddev->safemode_delay when array is clustered. So in cluster-md > env, the expected safemode_delay value should be 0. > > Reproducible steps: > ``` > node1 # mdadm --zero-superblock /dev/sd{b,c,d} > node1 # mdadm -C /dev/md0 -b internal -e 1.2 -n 2 -l mirror /dev/sdb /dev/sdc > node1 # cat /sys/block/md0/md/safe_mode_delay > 0.204 > node1 # mdadm -G /dev/md0 -b none > node1 # mdadm --grow /dev/md0 --bitmap=clustered > node1 # cat /sys/block/md0/md/safe_mode_delay > 0.204 <== doesn't change, should ZERO for cluster-md > > node1 # mdadm --zero-superblock /dev/sd{b,c,d} > node1 # mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sdb /dev/sdc > node1 # cat /sys/block/md0/md/safe_mode_delay > 0.000 > node1 # mdadm -G /dev/md0 -b none > node1 # cat /sys/block/md0/md/safe_mode_delay > 0.000 <== doesn't change, should default value > ``` > > Signed-off-by: Zhao Heming <heming.zhao@suse.com> > --- > drivers/md/md.c | 11 +++++++++-- > 1 file changed, 9 insertions(+), 2 deletions(-) > > diff --git a/drivers/md/md.c b/drivers/md/md.c > index f567f536b529..1bde3df3fa18 100644 > --- a/drivers/md/md.c > +++ b/drivers/md/md.c > @@ -101,6 +101,8 @@ static void mddev_detach(struct mddev *mddev); > * count by 2 for every hour elapsed between read errors. > */ > #define MD_DEFAULT_MAX_CORRECTED_READ_ERRORS 20 > +/* Default safemode delay: 200 msec */ > +#define DEFAULT_SAFEMODE_DELAY ((200 * HZ)/1000 +1) > /* > * Current RAID-1,4,5 parallel reconstruction 'guaranteed speed limit' > * is 1000 KB/sec, so the extra system load does not show up that much. > @@ -5982,7 +5984,7 @@ int md_run(struct mddev *mddev) > if (mddev_is_clustered(mddev)) > mddev->safemode_delay = 0; > else > - mddev->safemode_delay = (200 * HZ)/1000 +1; /* 200 msec delay */ > + mddev->safemode_delay = DEFAULT_SAFEMODE_DELAY; > mddev->in_sync = 1; > smp_wmb(); > spin_lock(&mddev->lock); > @@ -7361,6 +7363,7 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info) > > mddev->bitmap_info.nodes = 0; > md_cluster_ops->leave(mddev); > + mddev->safemode_delay = DEFAULT_SAFEMODE_DELAY; > } Not about the patch itself, just confuse about the meaning of safemode_delay. As safemode_delay represents 200 ms delay, but md_write_end has this. /* The roundup() ensures this only performs locking once * every ->safemode_delay *jiffies* */ mod_timer(&mddev->safemode_timer, roundup(jiffies, mddev->safemode_delay) + mddev->safemode_delay); And the second argument in mod_timer() means "new timeout in jiffies", does the above need to convert from ms to jiffies? Am I miss something? Thanks, Guoqing ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v5 1/2] md-cluster: fix safemode_delay value when converting to clustered bitmap 2020-07-21 8:35 ` Guoqing Jiang @ 2020-07-21 8:58 ` heming.zhao 0 siblings, 0 replies; 7+ messages in thread From: heming.zhao @ 2020-07-21 8:58 UTC (permalink / raw) To: Guoqing Jiang, linux-raid; +Cc: neilb, song On 7/21/20 4:35 PM, Guoqing Jiang wrote: > > > On 7/20/20 8:08 PM, Zhao Heming wrote: >> When array convert to clustered bitmap, the safe_mode_delay doesn't >> clean and vice versa. the /sys/block/mdX/md/safe_mode_delay keep original >> value after changing bitmap type. In safe_delay_store(), the code forbids >> setting mddev->safemode_delay when array is clustered. So in cluster-md >> env, the expected safemode_delay value should be 0. >> >> Reproducible steps: >> ``` >> node1 # mdadm --zero-superblock /dev/sd{b,c,d} >> node1 # mdadm -C /dev/md0 -b internal -e 1.2 -n 2 -l mirror /dev/sdb /dev/sdc >> node1 # cat /sys/block/md0/md/safe_mode_delay >> 0.204 >> node1 # mdadm -G /dev/md0 -b none >> node1 # mdadm --grow /dev/md0 --bitmap=clustered >> node1 # cat /sys/block/md0/md/safe_mode_delay >> 0.204 <== doesn't change, should ZERO for cluster-md >> >> node1 # mdadm --zero-superblock /dev/sd{b,c,d} >> node1 # mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sdb /dev/sdc >> node1 # cat /sys/block/md0/md/safe_mode_delay >> 0.000 >> node1 # mdadm -G /dev/md0 -b none >> node1 # cat /sys/block/md0/md/safe_mode_delay >> 0.000 <== doesn't change, should default value >> ``` >> >> Signed-off-by: Zhao Heming <heming.zhao@suse.com> >> --- >> drivers/md/md.c | 11 +++++++++-- >> 1 file changed, 9 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/md/md.c b/drivers/md/md.c >> index f567f536b529..1bde3df3fa18 100644 >> --- a/drivers/md/md.c >> +++ b/drivers/md/md.c >> @@ -101,6 +101,8 @@ static void mddev_detach(struct mddev *mddev); >> * count by 2 for every hour elapsed between read errors. >> */ >> #define MD_DEFAULT_MAX_CORRECTED_READ_ERRORS 20 >> +/* Default safemode delay: 200 msec */ >> +#define DEFAULT_SAFEMODE_DELAY ((200 * HZ)/1000 +1) >> /* >> * Current RAID-1,4,5 parallel reconstruction 'guaranteed speed limit' >> * is 1000 KB/sec, so the extra system load does not show up that much. >> @@ -5982,7 +5984,7 @@ int md_run(struct mddev *mddev) >> if (mddev_is_clustered(mddev)) >> mddev->safemode_delay = 0; >> else >> - mddev->safemode_delay = (200 * HZ)/1000 +1; /* 200 msec delay */ >> + mddev->safemode_delay = DEFAULT_SAFEMODE_DELAY; >> mddev->in_sync = 1; >> smp_wmb(); >> spin_lock(&mddev->lock); >> @@ -7361,6 +7363,7 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info) >> mddev->bitmap_info.nodes = 0; >> md_cluster_ops->leave(mddev); >> + mddev->safemode_delay = DEFAULT_SAFEMODE_DELAY; >> } > > Not about the patch itself, just confuse about the meaning of safemode_delay. > As safemode_delay represents 200 ms delay, but md_write_end has this. > > /* The roundup() ensures this only performs locking once > * every ->safemode_delay *jiffies* > */ > mod_timer(&mddev->safemode_timer, > roundup(jiffies, mddev->safemode_delay) + > mddev->safemode_delay); > > And the second argument in mod_timer() means "new timeout in jiffies", > does the above need to convert from ms to jiffies? Am I miss something? > > Thanks, > Guoqing > The safemode_delay stores jiffies not msec. it converts to ms only in safe_delay_{store|show}. please see default value: #define DEFAULT_SAFEMODE_DELAY ((200 * HZ)/1000 +1) in safe_delay_store, user space input will multiple 10^3, which will convert from second to msec. then (*HZ/1000) to jiffies. ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v5 2/2] md-cluster: fix rmmod issue when md_cluster convert bitmap to none 2020-07-20 18:08 [PATCH v5 0/2] md_cluster: bugs fix Zhao Heming 2020-07-20 18:08 ` [PATCH v5 1/2] md-cluster: fix safemode_delay value when converting to clustered bitmap Zhao Heming @ 2020-07-20 18:08 ` Zhao Heming 2020-07-22 5:01 ` [PATCH v5 0/2] md_cluster: bugs fix Song Liu 2 siblings, 0 replies; 7+ messages in thread From: Zhao Heming @ 2020-07-20 18:08 UTC (permalink / raw) To: linux-raid; +Cc: Zhao Heming, neilb, song update_array_info misses calling module_put when removing cluster bitmap. steps to reproduce: ``` node1 # mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sda /dev/sdb mdadm: array /dev/md0 started. node1 # lsmod | egrep "dlm|md_|raid1" md_cluster 28672 1 dlm 212992 14 md_cluster configfs 57344 2 dlm raid1 53248 1 md_mod 176128 2 raid1,md_cluster node1 # mdadm -G /dev/md0 -b none node1 # lsmod | egrep "dlm|md_|raid1" md_cluster 28672 1 <== should be zero dlm 212992 9 md_cluster configfs 57344 2 dlm raid1 53248 1 md_mod 176128 2 raid1,md_cluster node1 # mdadm -G /dev/md0 -b clustered node1 # lsmod | egrep "dlm|md_|raid1" md_cluster 28672 2 <== increase dlm 212992 14 md_cluster configfs 57344 2 dlm raid1 53248 1 md_mod 176128 2 raid1,md_cluster node1 # mdadm -G /dev/md0 -b none node1 # mdadm -G /dev/md0 -b clustered node1 # lsmod | egrep "dlm|md_|raid1" md_cluster 28672 3 <== increase dlm 212992 14 md_cluster configfs 57344 2 dlm raid1 53248 1 md_mod 176128 2 raid1,md_cluster ``` Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Zhao Heming <heming.zhao@suse.com> --- drivers/md/md.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/md/md.c b/drivers/md/md.c index 1bde3df3fa18..502f0d100a96 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -7363,6 +7363,7 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info) mddev->bitmap_info.nodes = 0; md_cluster_ops->leave(mddev); + module_put(md_cluster_mod); mddev->safemode_delay = DEFAULT_SAFEMODE_DELAY; } mddev_suspend(mddev); -- 2.25.0 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v5 0/2] md_cluster: bugs fix 2020-07-20 18:08 [PATCH v5 0/2] md_cluster: bugs fix Zhao Heming 2020-07-20 18:08 ` [PATCH v5 1/2] md-cluster: fix safemode_delay value when converting to clustered bitmap Zhao Heming 2020-07-20 18:08 ` [PATCH v5 2/2] md-cluster: fix rmmod issue when md_cluster convert bitmap to none Zhao Heming @ 2020-07-22 5:01 ` Song Liu 2 siblings, 0 replies; 7+ messages in thread From: Song Liu @ 2020-07-22 5:01 UTC (permalink / raw) To: Zhao Heming; +Cc: linux-raid, NeilBrown On Mon, Jul 20, 2020 at 11:09 AM Zhao Heming <heming.zhao@suse.com> wrote: > > v5: > - change safemode_delay patch code by neil's comments. > - modify safemode_delay patch comments: > - remove useless analysis > - limit line in 75 chars > - add neil's reviewd info in rmmod patch. Applied to md-next. Thanks! Song ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2020-07-22 5:01 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-07-20 18:08 [PATCH v5 0/2] md_cluster: bugs fix Zhao Heming 2020-07-20 18:08 ` [PATCH v5 1/2] md-cluster: fix safemode_delay value when converting to clustered bitmap Zhao Heming 2020-07-20 23:02 ` NeilBrown 2020-07-21 8:35 ` Guoqing Jiang 2020-07-21 8:58 ` heming.zhao 2020-07-20 18:08 ` [PATCH v5 2/2] md-cluster: fix rmmod issue when md_cluster convert bitmap to none Zhao Heming 2020-07-22 5:01 ` [PATCH v5 0/2] md_cluster: bugs fix Song Liu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).