* [PATCH/RFC] md/raid10: optimize read_balance() for 'far copies' arrays
From: Namhyung Kim @ 2011-06-08  7:00 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid

If @conf->far_offset > 0, there is only 1 stripe, so we can treat
the array the same as 'near' arrays. Furthermore, we could calculate the
new distance from the previous position even for real 'far' arrays
if the position of the given disk is already in the lowest stripe.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
---
 drivers/md/raid10.c |   14 +++++++++++---
 1 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 6e846688962f..9ec4c5f8cd48 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -531,11 +531,19 @@ retry:
 			break;
 
 		/* for far > 1 always use the lowest address */
-		if (conf->far_copies > 1)
-			new_distance = r10_bio->devs[slot].addr;
-		else
+		if (conf->far_copies > 1 && conf->far_offset == 0) {
+			if (conf->mirrors[disk].head_position < conf->stride &&
+			    r10_bio->devs[slot].addr < conf->stride)
+				/* already in the lowest stripe */
+				new_distance = abs(r10_bio->devs[slot].addr -
+						   conf->mirrors[disk].head_position);
+			else
+				new_distance = r10_bio->devs[slot].addr;
+		} else {
 			new_distance = abs(r10_bio->devs[slot].addr -
 					   conf->mirrors[disk].head_position);
+		}
+
 		if (new_distance < best_dist) {
 			best_dist = new_distance;
 			best_slot = slot;
-- 
1.7.5.2



* Re: [PATCH/RFC] md/raid10: optimize read_balance() for 'far copies' arrays
From: NeilBrown @ 2011-06-08  7:21 UTC (permalink / raw)
  To: Namhyung Kim; +Cc: linux-raid

On Wed,  8 Jun 2011 16:00:45 +0900 Namhyung Kim <namhyung@gmail.com> wrote:

> If @conf->far_offset > 0, there is only 1 stripe, so we can treat
> the array the same as 'near' arrays. Furthermore, we could calculate the
> new distance from the previous position even for real 'far' arrays
> if the position of the given disk is already in the lowest stripe.
> 
> Signed-off-by: Namhyung Kim <namhyung@gmail.com>
> ---
>  drivers/md/raid10.c |   14 +++++++++++---
>  1 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 6e846688962f..9ec4c5f8cd48 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -531,11 +531,19 @@ retry:
>  			break;
>  
>  		/* for far > 1 always use the lowest address */
> -		if (conf->far_copies > 1)
> -			new_distance = r10_bio->devs[slot].addr;
> -		else
> +		if (conf->far_copies > 1 && conf->far_offset == 0) {
> +			if (conf->mirrors[disk].head_position < conf->stride &&
> +			    r10_bio->devs[slot].addr < conf->stride)
> +				/* already in the lowest stripe */
> +				new_distance = abs(r10_bio->devs[slot].addr -
> +						   conf->mirrors[disk].head_position);
> +			else
> +				new_distance = r10_bio->devs[slot].addr;
> +		} else {
>  			new_distance = abs(r10_bio->devs[slot].addr -
>  					   conf->mirrors[disk].head_position);
> +		}
> +
>  		if (new_distance < best_dist) {
>  			best_dist = new_distance;
>  			best_slot = slot;


I agree that it still makes sense to do balancing if far_offset != 0.
However, there is absolutely no point in your change to the calculation of
new_distance.
You only want new_distance to contain a distance from the head position if we
want to choose the device with the 'closest' head.  But we don't.  We want to
choose the device where the data is closest to the start of the device.  So
the current value for new_distance is correct.
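
To illustrate the distinction, here is a minimal user-space sketch of the
distance calculation with only the first change applied. It is not the
raid10.c code: struct geo and the free-standing new_distance() function are
hypothetical stand-ins, and only the names far_copies, far_offset and
head_position are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for the geometry fields of the conf. */
struct geo {
	int far_copies; /* number of 'far' copies of each block */
	int far_offset; /* non-zero for the 'offset' layout */
};

/*
 * Cost used to rank one candidate device for a read.
 * addr is where the data lives on that device, head_position is
 * where that device's head was last left.
 */
uint64_t new_distance(const struct geo *g, uint64_t addr,
		      uint64_t head_position)
{
	assert(g->far_copies >= 1);

	/* True 'far' layout: rank by closeness to the start of the device. */
	if (g->far_copies > 1 && g->far_offset == 0)
		return addr;

	/* 'near' and 'offset' layouts: rank by seek distance. */
	return addr > head_position ? addr - head_position
				    : head_position - addr;
}
```

With far_copies = 2 and far_offset = 0 the head position is ignored
entirely, which is the behaviour defended above.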

If you would like to resubmit with just the first change I'll happily apply
the patch.

If you have performed some tests and can demonstrate some cases where this
makes something faster, and can show us the results of those tests, I would
be even more happy!!!

Thanks,
NeilBrown


* Re: [PATCH/RFC] md/raid10: optimize read_balance() for 'far copies' arrays
From: Namhyung Kim @ 2011-06-08  7:42 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

NeilBrown <neilb@suse.de> writes:

> On Wed,  8 Jun 2011 16:00:45 +0900 Namhyung Kim <namhyung@gmail.com> wrote:
>
>> If @conf->far_offset > 0, there is only 1 stripe, so we can treat
>> the array the same as 'near' arrays. Furthermore, we could calculate the
>> new distance from the previous position even for real 'far' arrays
>> if the position of the given disk is already in the lowest stripe.
>>
> I agree that it still makes sense to do balancing if far_offset != 0.
> However, there is absolutely no point in your change to the calculation of
> new_distance.
> You only want new_distance to contain a distance from the head position if we
> want to choose the device with the 'closest' head.  But we don't.  We want to
> choose the device where the data is closest to the start of the device.  So
> the current value for new_distance is correct.
>

I still can't understand why we choose the closest-to-the-start disk
when we could have sequential access on another disk. That's probably
because of my lack of understanding of how md/disk works :(


> If you would like to resubmit with just the first change I'll happily apply
> the patch.
>

OK. Will do that soon.


> If you have performed some tests and can demonstrate some cases where this
> makes something faster, and can show us the results of those tests, I would
> be even more happy!!!
>

I wish I could. :) However, unfortunately, I don't have such a real system
to test on.

Thanks.



* Re: [PATCH/RFC] md/raid10: optimize read_balance() for 'far copies' arrays
From: Keld Jørn Simonsen @ 2011-06-08 11:49 UTC (permalink / raw)
  To: Namhyung Kim; +Cc: NeilBrown, linux-raid

On Wed, Jun 08, 2011 at 04:42:27PM +0900, Namhyung Kim wrote:
> NeilBrown <neilb@suse.de> writes:
> 
> > On Wed,  8 Jun 2011 16:00:45 +0900 Namhyung Kim <namhyung@gmail.com> wrote:
> >
> >> If @conf->far_offset > 0, there is only 1 stripe, so we can treat
> >> the array the same as 'near' arrays. Furthermore, we could calculate the
> >> new distance from the previous position even for real 'far' arrays
> >> if the position of the given disk is already in the lowest stripe.
> >>
> > I agree that it still makes sense to do balancing if far_offset != 0.
> > However, there is absolutely no point in your change to the calculation of
> > new_distance.
> > You only want new_distance to contain a distance from the head position if we
> > want to choose the device with the 'closest' head.  But we don't.  We want to
> > choose the device where the data is closest to the start of the device.  So
> > the current value for new_distance is correct.
> >
>
> I still can't understand why we choose the closest-to-the-start disk
> when we could have sequential access on another disk. That's probably
> because of my lack of understanding of how md/disk works :(

Choosing the nearest head position was what the initial implementation of
raid10-far did.  But this performed badly for an array with disks of
varying specifications, and it also meant the faster outer sectors were
not used. Using closest-to-beginning gave a speed-up of about 50% in
some cases.

best regards
keld


* Re: [PATCH/RFC] md/raid10: optimize read_balance() for 'far copies' arrays
From: Namhyung Kim @ 2011-06-08 14:39 UTC (permalink / raw)
  To: Keld Jørn Simonsen; +Cc: NeilBrown, linux-raid

Keld Jørn Simonsen <keld@keldix.com> writes:
> On Wed, Jun 08, 2011 at 04:42:27PM +0900, Namhyung Kim wrote:
>> I still can't understand why we choose the closest-to-the-start disk
>> when we could have sequential access on another disk. That's probably
>> because of my lack of understanding of how md/disk works :(
>
> Choosing the nearest head position was what the initial implementation of
> raid10-far did.  But this performed badly for an array with disks of
> varying specifications, and it also meant the faster outer sectors were
> not used. Using closest-to-beginning gave a speed-up of about 50% in
> some cases.
>

Hi Keld,

Thanks for the explanation. That means lower sectors reside on the outer
tracks/cylinders of the disk, right? A 50% speed-up is a huge improvement
that I can't argue against. Although my patch tried to choose the
closest-to-current-head disk if the disk head is in the lowest stripe -
in the (similar) hope that it'd be on the outer tracks - I don't have
the numbers, so I'll just give up on it.

Besides, I just noticed that the rationale behind read_balance()
presumes that all components of the array are traditional disks. If we
could detect that all/some of them are not (e.g. SSDs), it would be
better to use some other criterion for read balancing IMHO,
something like nr_pending?
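
A minimal user-space sketch of that idea, assuming a per-device count of
in-flight requests is available. struct dev_load and pick_least_pending()
are hypothetical; only the name nr_pending comes from the md code, and
nothing here is an actual read_balance() implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-device state: how many requests are in flight. */
struct dev_load {
	unsigned int nr_pending;
};

/*
 * Return the index of the device with the fewest pending requests.
 * Ties go to the lowest index. Seek distance and address are ignored,
 * on the assumption that they cost nothing on a non-rotational device.
 */
size_t pick_least_pending(const struct dev_load *devs, size_t n)
{
	size_t best = 0;

	assert(n > 0);
	for (size_t i = 1; i < n; i++)
		if (devs[i].nr_pending < devs[best].nr_pending)
			best = i;
	return best;
}
```

Whether queue depth alone is a good enough criterion would still need to be
measured on real hardware.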

-- 
Regards,
Namhyung Kim


* Re: [PATCH/RFC] md/raid10: optimize read_balance() for 'far copies' arrays
From: Bill Davidsen @ 2011-06-10 14:29 UTC (permalink / raw)
  To: Namhyung Kim; +Cc: NeilBrown, linux-raid

Namhyung Kim wrote:
> NeilBrown <neilb@suse.de> writes:
>
>> On Wed,  8 Jun 2011 16:00:45 +0900 Namhyung Kim <namhyung@gmail.com> wrote:
>>
>>> If @conf->far_offset > 0, there is only 1 stripe, so we can treat
>>> the array the same as 'near' arrays. Furthermore, we could calculate the
>>> new distance from the previous position even for real 'far' arrays
>>> if the position of the given disk is already in the lowest stripe.
>>>
>> I agree that it still makes sense to do balancing if far_offset != 0.
>> However, there is absolutely no point in your change to the calculation of
>> new_distance.
>> You only want new_distance to contain a distance from the head position if we
>> want to choose the device with the 'closest' head.  But we don't.  We want to
>> choose the device where the data is closest to the start of the device.  So
>> the current value for new_distance is correct.
>>
> I still can't understand why we choose the closest-to-the-start disk
> when we could have sequential access on another disk. That's probably
> because of my lack of understanding of how md/disk works :(
>

This code is all based on traditional drives, where the seek time,
rotational latency, and position on the platter are all factors which
affect performance in some way. Devices like SSDs don't have these
factors (i.e. they are constants) and someday it may make sense to
rethink this code again.

Also note that "close to current" optimizes seek time, while "close to 
beginning" optimizes transfer rate. Note the total lack of parameters to 
tune "what you want" for a given device.
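
One shape such a tuning parameter could take is sketched below. Nothing like
it exists in the code under discussion: enum read_policy and read_cost() are
hypothetical names, and how the policy would be set per device is left open.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-device knob selecting what read balancing optimizes. */
enum read_policy {
	READ_CLOSE_TO_CURRENT,   /* minimize seek distance */
	READ_CLOSE_TO_BEGINNING, /* prefer low addresses for transfer rate */
};

/* Cost of reading addr from a device whose head is at head_position. */
uint64_t read_cost(enum read_policy policy, uint64_t addr,
		   uint64_t head_position)
{
	assert(policy == READ_CLOSE_TO_CURRENT ||
	       policy == READ_CLOSE_TO_BEGINNING);

	if (policy == READ_CLOSE_TO_BEGINNING)
		return addr;

	return addr > head_position ? addr - head_position
				    : head_position - addr;
}
```

A caller would compute read_cost() for every candidate device and read from
the one with the lowest value.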

-- 
Bill Davidsen <davidsen@tmr.com>
   We are not out of the woods yet, but we know the direction and have
taken the first step. The steps are many, but finite in number, and if
we persevere we will reach our destination.  -me, 2010



