[PATCH 3/5] md: Increase the frequency of recording checkpoint of recovery.

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 3/5] md: Increase the frequency of recording checkpoint of recovery.
@ 2012-10-27  2:28 majianpeng
  2012-10-28 21:41 ` NeilBrown
  0 siblings, 1 reply; 4+ messages in thread
From: majianpeng @ 2012-10-27  2:28 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

With the hard disk capacity of larger,the entire recovery process
becomes longer.So it should increase the frequency.
Supposed the speed of recovery is 100MB/s(i think for HDD is good).
The total time of recovery 4TB hdd is about 666 minutes.So change the
frequency from 16 to 64 times,it's about 10mins.

Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
---
 drivers/md/md.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 01e25bd..ba328ee 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7446,7 +7446,7 @@ void md_do_sync(struct md_thread *thread)
 		if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
 		    ((mddev->curr_resync > mddev->curr_resync_completed &&
 		      (mddev->curr_resync - mddev->curr_resync_completed)
-		      > (max_sectors >> 4)) ||
+		      > (max_sectors >> 6)) ||
 		     (j - mddev->curr_resync_completed)*2
 		     >= mddev->resync_max - mddev->curr_resync_completed
 			    )) {
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH 3/5] md: Increase the frequency of recording checkpoint of recovery.
  2012-10-27  2:28 [PATCH 3/5] md: Increase the frequency of recording checkpoint of recovery majianpeng
@ 2012-10-28 21:41 ` NeilBrown
  2012-10-29  9:24   ` majianpeng
  0 siblings, 1 reply; 4+ messages in thread
From: NeilBrown @ 2012-10-28 21:41 UTC (permalink / raw)
  To: majianpeng; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 1882 bytes --]

On Sat, 27 Oct 2012 10:28:20 +0800 majianpeng <majianpeng@gmail.com> wrote:

> With the hard disk capacity of larger,the entire recovery process
> becomes longer.So it should increase the frequency.
> Supposed the speed of recovery is 100MB/s(i think for HDD is good).
> The total time of recovery 4TB hdd is about 666 minutes.So change the
> frequency from 16 to 64 times,it's about 10mins.
> 
> Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
> ---
>  drivers/md/md.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 01e25bd..ba328ee 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -7446,7 +7446,7 @@ void md_do_sync(struct md_thread *thread)
>  		if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
>  		    ((mddev->curr_resync > mddev->curr_resync_completed &&
>  		      (mddev->curr_resync - mddev->curr_resync_completed)
> -		      > (max_sectors >> 4)) ||
> +		      > (max_sectors >> 6)) ||
>  		     (j - mddev->curr_resync_completed)*2
>  		     >= mddev->resync_max - mddev->curr_resync_completed
>  			    )) {

I don't like this - the number '6' is completely arbitrary.

The original '4' was never about number of minutes.  It was a fraction of the
total effort needed.
I wanted to make sure you only repeated at most 10% of the required effort
(one order of magnitude).  >>4 is about 6% and easy to calculate.

If you want it to be about time (which I don't object to), then make it about
time.  Keep track of the last time we updated curr_resync_completed, and if
that is more than a few minutes ago, update it again.
This is what I said last time - if you want something to be based on time,
test the time.
If 'echo idle > sync_action' doesn't work (Which it seems it doesn't), then
it is OK to just do it in the kernel.

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Re: [PATCH 3/5] md: Increase the frequency of recording checkpoint of recovery.
  2012-10-28 21:41 ` NeilBrown
@ 2012-10-29  9:24   ` majianpeng
  2012-10-31  0:37     ` NeilBrown
  0 siblings, 1 reply; 4+ messages in thread
From: majianpeng @ 2012-10-29  9:24 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

>On Sat, 27 Oct 2012 10:28:20 +0800 majianpeng <majianpeng@gmail.com> wrote:
>
>> With the hard disk capacity of larger,the entire recovery process
>> becomes longer.So it should increase the frequency.
>> Supposed the speed of recovery is 100MB/s(i think for HDD is good).
>> The total time of recovery 4TB hdd is about 666 minutes.So change the
>> frequency from 16 to 64 times,it's about 10mins.
>> 
>> Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
>> ---
>>  drivers/md/md.c |    2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index 01e25bd..ba328ee 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -7446,7 +7446,7 @@ void md_do_sync(struct md_thread *thread)
>>  		if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
>>  		    ((mddev->curr_resync > mddev->curr_resync_completed &&
>>  		      (mddev->curr_resync - mddev->curr_resync_completed)
>> -		      > (max_sectors >> 4)) ||
>> +		      > (max_sectors >> 6)) ||
>>  		     (j - mddev->curr_resync_completed)*2
>>  		     >= mddev->resync_max - mddev->curr_resync_completed
>>  			    )) {
>
>I don't like this - the number '6' is completely arbitrary.
>
>The original '4' was never about number of minutes.  It was a fraction of the
>total effort needed.
>I wanted to make sure you only repeated at most 10% of the required effort
>(one order of magnitude).  >>4 is about 6% and easy to calculate.
>
>If you want it to be about time (which I don't object to), then make it about
>time.  Keep track of the last time we updated curr_resync_completed, and if
>that is more than a few minutes ago, update it again.
>This is what I said last time - if you want something to be based on time,
>test the time.
>If 'echo idle > sync_action' doesn't work (Which it seems it doesn't), then
>it is OK to just do it in the kernel.
>
>NeilBrown
>
Ok, I used your suggestions.

[PATCH V1] md:Updating checkpoint of resync/recovery based time.

With the hard disk capacity of larger,the entire recovery process
becomes longer.So it should increase the frequency.
To do so there are a lot of benefits,for example
1:avoid doing repeated work when os crashed or suddenly power failure.
2:In md drivers, there are many places to be treated differently based
on recovery_cp.For example, commit a7854487cd7128a30a7f4.

It is difficult to judge how long it updated.There i chosed five
minutes.No impact on performance.

Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
---
 drivers/md/md.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 9ab768a..eddb840 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7269,6 +7269,7 @@ EXPORT_SYMBOL_GPL(md_allow_write);
 
 #define SYNC_MARKS	10
 #define	SYNC_MARK_STEP	(3*HZ)
+#define UPDATE_FREQUENCY (5*60*HZ)
 void md_do_sync(struct md_thread *thread)
 {
 	struct mddev *mddev = thread->mddev;
@@ -7277,6 +7278,7 @@ void md_do_sync(struct md_thread *thread)
 		 window;
 	sector_t max_sectors,j, io_sectors;
 	unsigned long mark[SYNC_MARKS];
+	unsigned long update_time;
 	sector_t mark_cnt[SYNC_MARKS];
 	int last_mark,m;
 	struct list_head *tmp;
@@ -7436,17 +7438,19 @@ void md_do_sync(struct md_thread *thread)
 	mddev->curr_resync_completed = j;
 	sysfs_notify(&mddev->kobj, NULL, "sync_completed");
 	md_new_event(mddev);
+	update_time = jiffies;
 
 	blk_start_plug(&plug);
 	while (j < max_sectors) {
 		sector_t sectors;
-
+
 		skipped = 0;
 
 		if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
 		    ((mddev->curr_resync > mddev->curr_resync_completed &&
 		      (mddev->curr_resync - mddev->curr_resync_completed)
 		      > (max_sectors >> 4)) ||
+		     time_after_eq(jiffies, update_time + UPDATE_FREQUENCY) ||
 		     (j - mddev->curr_resync_completed)*2
 		     >= mddev->resync_max - mddev->curr_resync_completed
 			    )) {
@@ -7454,6 +7458,7 @@ void md_do_sync(struct md_thread *thread)
 			wait_event(mddev->recovery_wait,
 				   atomic_read(&mddev->recovery_active) == 0);
 			mddev->curr_resync_completed = j;
+			update_time = jiffies;
 			set_bit(MD_CHANGE_CLEAN, &mddev->flags);
 			sysfs_notify(&mddev->kobj, NULL, "sync_completed");
 		}
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH 3/5] md: Increase the frequency of recording checkpoint of recovery.
  2012-10-29  9:24   ` majianpeng
@ 2012-10-31  0:37     ` NeilBrown
  0 siblings, 0 replies; 4+ messages in thread
From: NeilBrown @ 2012-10-31  0:37 UTC (permalink / raw)
  To: majianpeng; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 4670 bytes --]

On Mon, 29 Oct 2012 17:24:11 +0800 majianpeng <majianpeng@gmail.com> wrote:

> >On Sat, 27 Oct 2012 10:28:20 +0800 majianpeng <majianpeng@gmail.com> wrote:
> >
> >> With the hard disk capacity of larger,the entire recovery process
> >> becomes longer.So it should increase the frequency.
> >> Supposed the speed of recovery is 100MB/s(i think for HDD is good).
> >> The total time of recovery 4TB hdd is about 666 minutes.So change the
> >> frequency from 16 to 64 times,it's about 10mins.
> >> 
> >> Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
> >> ---
> >>  drivers/md/md.c |    2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >> 
> >> diff --git a/drivers/md/md.c b/drivers/md/md.c
> >> index 01e25bd..ba328ee 100644
> >> --- a/drivers/md/md.c
> >> +++ b/drivers/md/md.c
> >> @@ -7446,7 +7446,7 @@ void md_do_sync(struct md_thread *thread)
> >>  		if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
> >>  		    ((mddev->curr_resync > mddev->curr_resync_completed &&
> >>  		      (mddev->curr_resync - mddev->curr_resync_completed)
> >> -		      > (max_sectors >> 4)) ||
> >> +		      > (max_sectors >> 6)) ||
> >>  		     (j - mddev->curr_resync_completed)*2
> >>  		     >= mddev->resync_max - mddev->curr_resync_completed
> >>  			    )) {
> >
> >I don't like this - the number '6' is completely arbitrary.
> >
> >The original '4' was never about number of minutes.  It was a fraction of the
> >total effort needed.
> >I wanted to make sure you only repeated at most 10% of the required effort
> >(one order of magnitude).  >>4 is about 6% and easy to calculate.
> >
> >If you want it to be about time (which I don't object to), then make it about
> >time.  Keep track of the last time we updated curr_resync_completed, and if
> >that is more than a few minutes ago, update it again.
> >This is what I said last time - if you want something to be based on time,
> >test the time.
> >If 'echo idle > sync_action' doesn't work (Which it seems it doesn't), then
> >it is OK to just do it in the kernel.
> >
> >NeilBrown
> >
> Ok, I used your suggestions.
> 
> [PATCH V1] md:Updating checkpoint of resync/recovery based time.
> 
> With the hard disk capacity of larger,the entire recovery process
> becomes longer.So it should increase the frequency.
> To do so there are a lot of benefits,for example
> 1:avoid doing repeated work when os crashed or suddenly power failure.
> 2:In md drivers, there are many places to be treated differently based
> on recovery_cp.For example, commit a7854487cd7128a30a7f4.
> 
> It is difficult to judge how long it updated.There i chosed five
> minutes.No impact on performance.
> 
> Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
> ---
>  drivers/md/md.c |    7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 9ab768a..eddb840 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -7269,6 +7269,7 @@ EXPORT_SYMBOL_GPL(md_allow_write);
>  
>  #define SYNC_MARKS	10
>  #define	SYNC_MARK_STEP	(3*HZ)
> +#define UPDATE_FREQUENCY (5*60*HZ)
>  void md_do_sync(struct md_thread *thread)
>  {
>  	struct mddev *mddev = thread->mddev;
> @@ -7277,6 +7278,7 @@ void md_do_sync(struct md_thread *thread)
>  		 window;
>  	sector_t max_sectors,j, io_sectors;
>  	unsigned long mark[SYNC_MARKS];
> +	unsigned long update_time;
>  	sector_t mark_cnt[SYNC_MARKS];
>  	int last_mark,m;
>  	struct list_head *tmp;
> @@ -7436,17 +7438,19 @@ void md_do_sync(struct md_thread *thread)
>  	mddev->curr_resync_completed = j;
>  	sysfs_notify(&mddev->kobj, NULL, "sync_completed");
>  	md_new_event(mddev);
> +	update_time = jiffies;
>  
>  	blk_start_plug(&plug);
>  	while (j < max_sectors) {
>  		sector_t sectors;
> -
> +
>  		skipped = 0;
>  
>  		if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
>  		    ((mddev->curr_resync > mddev->curr_resync_completed &&
>  		      (mddev->curr_resync - mddev->curr_resync_completed)
>  		      > (max_sectors >> 4)) ||
> +		     time_after_eq(jiffies, update_time + UPDATE_FREQUENCY) ||
>  		     (j - mddev->curr_resync_completed)*2
>  		     >= mddev->resync_max - mddev->curr_resync_completed
>  			    )) {
> @@ -7454,6 +7458,7 @@ void md_do_sync(struct md_thread *thread)
>  			wait_event(mddev->recovery_wait,
>  				   atomic_read(&mddev->recovery_active) == 0);
>  			mddev->curr_resync_completed = j;
> +			update_time = jiffies;
>  			set_bit(MD_CHANGE_CLEAN, &mddev->flags);
>  			sysfs_notify(&mddev->kobj, NULL, "sync_completed");
>  		}


Applied, thanks.

NeilBrown


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2012-10-31  0:37 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-10-27  2:28 [PATCH 3/5] md: Increase the frequency of recording checkpoint of recovery majianpeng
2012-10-28 21:41 ` NeilBrown
2012-10-29  9:24   ` majianpeng
2012-10-31  0:37     ` NeilBrown

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.