linux-kernel.vger.kernel.org archive mirror
* [rfc][patch] kernel/sched.c oddness?
@ 2002-10-02 18:41 Matthew Dobson
  2002-10-03  0:06 ` Nick Piggin
  0 siblings, 1 reply; 7+ messages in thread
From: Matthew Dobson @ 2002-10-02 18:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Michael Hohnbaum

[-- Attachment #1: Type: text/plain, Size: 810 bytes --]

This snippet of code appears wrong...  Either that, or the accompanying 
comment is wrong?

from kernel/sched.c::find_busiest_queue():

| *imbalance = (max_load - nr_running) / 2;
|
| /* It needs an at least ~25% imbalance to trigger balancing. */
| if (!idle && (*imbalance < (max_load + 3)/4)) {
| 	busiest = NULL;
| 	goto out;
| }

The comment says 25% imbalance, but the code is really checking for a 
~50% imbalance.  The attached patch moves the division by two to the 
pull_task function, where the imbalance number is actually used.  This 
patch makes the code match the comment, and divides the imbalance by two 
where it is needed.

Please let me know if I've misinterpreted what this code is supposed to 
be doing, -or- if we really want the comment to match the current code.

Cheers!

-Matt

[-- Attachment #2: sched_cleanup-2.5.40.patch --]
[-- Type: text/plain, Size: 826 bytes --]

diff -Nur --exclude-from=/usr/src/.dontdiff linux-2.5.40-vanilla/kernel/sched.c linux-2.5.40-sched_cleanup/kernel/sched.c
--- linux-2.5.40-vanilla/kernel/sched.c	Tue Oct  1 00:07:35 2002
+++ linux-2.5.40-sched_cleanup/kernel/sched.c	Wed Oct  2 11:22:41 2002
@@ -689,7 +689,7 @@
 	if (likely(!busiest))
 		goto out;
 
-	*imbalance = (max_load - nr_running) / 2;
+	*imbalance = (max_load - nr_running);
 
 	/* It needs an at least ~25% imbalance to trigger balancing. */
 	if (!idle && (*imbalance < (max_load + 3)/4)) {
@@ -746,6 +746,11 @@
 	task_t *tmp;
 
 	busiest = find_busiest_queue(this_rq, this_cpu, idle, &imbalance);
+	/*
+	 * We only want to steal a number of tasks equal to 1/2 the imbalance,
+ 	 * otherwise, we'll just shift the imbalance to the new queue.
+	 */
+	imbalance >>= 1;
 	if (!busiest)
 		goto out;
 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [rfc][patch] kernel/sched.c oddness?
  2002-10-02 18:41 [rfc][patch] kernel/sched.c oddness? Matthew Dobson
@ 2002-10-03  0:06 ` Nick Piggin
  2002-10-03  0:30   ` Matthew Dobson
  2002-10-03  6:54   ` Ingo Molnar
  0 siblings, 2 replies; 7+ messages in thread
From: Nick Piggin @ 2002-10-03  0:06 UTC (permalink / raw)
  To: colpatch; +Cc: linux-kernel, Michael Hohnbaum, Ingo Molnar

[-- Attachment #1: Type: text/plain, Size: 3007 bytes --]

Hi Matt,
I think your patch is good, except that it would be fine to use /= 2, as
the compiler will turn it into a shift, and imbalance is not really valid
(or used) if busiest == NULL. Actually, there are a number of things (I
think) still wrong with this! I posted a patch to Ingo - I guess I should
have cc'ed the list - we should combine our efforts! Here is my
explanation and a new patch including both our fixes.

Regards,
Nick

Nick Piggin wrote:

> Just a quick follow up
> * using imbalance (not divided by 2) instead of nr_running in
>   is_unbalanced would save a multiply by 3
> * it is more accurate with a small number of processes due to no
>   divisions
> * perhaps the last test should actually be
>   if (busiest->nr_running <= nr_running)
>
> Regards,
> Nick
>
> Nick Piggin wrote:
>
>> Dear Ingo,
>> I apologise in advance if the current behaviour of the scheduler is
>> the result of tuning / special-casing you did; I didn't follow its
>> development closely. However, I noticed on my 2xSMP system that quite
>> unbalanced loads weren't getting even CPU time. Best example - 3
>> processes in busywait loops - one would get 100% of one cpu while two
>> would get 50% each of the other.
>>
>> I think find_busiest_queue might have a few problems:
>> * the "imbalance" value divided by 2 loses meaning and precision,
>>   esp. with a small number of processes
>>
>> * in the 25% imbalance calculation, presumably the + should be a *
>>
>> * the 25% calculation uses the divided imbalance value, which is
>>   inaccurate
>>
>> * I think the 25% imbalance calculation should use a quarter of
>>   max_load, not three quarters
>>
>> * the second test (after the lock) isn't very good; it's cheaper than
>>   the mul, but having battled contention and claimed the lock, it
>>   would be a shame to not do anything with it!
>>
>> Now top says the patch has fixed up the balancing, but perhaps some
>> of these aren't bugs - perhaps they need a comment because they are a
>> bit subtle! If they are, please recommend the patch to Linus.
>>
>> Regards,
>> Nick
>>
>>
>


Matthew Dobson wrote:

> This snippet of code appears wrong...  Either that, or the 
> accompanying comment is wrong?
>
> from kernel/sched.c::find_busiest_queue():
>
> | *imbalance = (max_load - nr_running) / 2;
> |
> | /* It needs an at least ~25% imbalance to trigger balancing. */
> | if (!idle && (*imbalance < (max_load + 3)/4)) {
> |     busiest = NULL;
> |     goto out;
> | }
>
> The comment says 25% imbalance, but the code is really checking for a 
> ~50% imbalance.  The attached patch moves the division by two to the 
> pull_task function, where the imbalance number is actually used.  This 
> patch makes the code match the comment, and divides the imbalance by 
> two where it is needed.
>
> Please let me know if I've misinterpreted what this code is supposed 
> to be doing, -or- if we really want the comment to match the current 
> code.
>
> Cheers!
>
> -Matt



[-- Attachment #2: sched-balance2.patch --]
[-- Type: text/plain, Size: 1172 bytes --]

--- linux-2.5/kernel/sched.c.old	2002-10-02 23:14:12.000000000 +1000
+++ linux-2.5/kernel/sched.c	2002-10-03 10:03:41.000000000 +1000
@@ -689,10 +689,10 @@
 	if (likely(!busiest))
 		goto out;
 
-	*imbalance = (max_load - nr_running) / 2;
+	*imbalance = max_load - nr_running;
 
 	/* It needs an at least ~25% imbalance to trigger balancing. */
-	if (!idle && (*imbalance < (max_load + 3)/4)) {
+	if (!idle && (*imbalance*4 < max_load)) {
 		busiest = NULL;
 		goto out;
 	}
@@ -702,10 +702,11 @@
 	 * Make sure nothing changed since we checked the
 	 * runqueue length.
 	 */
-	if (busiest->nr_running <= nr_running + 1) {
+	if (busiest->nr_running <= nr_running) {
 		spin_unlock(&busiest->lock);
 		busiest = NULL;
 	}
+
 out:
 	return busiest;
 }
@@ -748,7 +749,12 @@
 	busiest = find_busiest_queue(this_rq, this_cpu, idle, &imbalance);
 	if (!busiest)
 		goto out;
-
+	/*
+	 * We only want to steal a number of tasks equal to 1/2 the imbalance,
+	 * otherwise, we'll just shift the imbalance to the new queue.
+	 */
+	imbalance /= 2;
+	
 	/*
 	 * We first consider expired tasks. Those will likely not be
 	 * executed in the near future, and they are most likely to


* Re: [rfc][patch] kernel/sched.c oddness?
  2002-10-03  0:06 ` Nick Piggin
@ 2002-10-03  0:30   ` Matthew Dobson
  2002-10-03  6:54   ` Ingo Molnar
  1 sibling, 0 replies; 7+ messages in thread
From: Matthew Dobson @ 2002-10-03  0:30 UTC (permalink / raw)
  To: Nick Piggin; +Cc: linux-kernel, Michael Hohnbaum, Ingo Molnar

Looks good to me...

Ingo, are we both crazy?  It's entirely possible, as no one else seems to 
have noticed the 'oddities' that Nick, Michael, and I have... 
It'd be nice to know for sure either way...

Cheers!

-Matt

Nick Piggin wrote:
> Hi Matt,
> I think your patch is good, except that it would be fine to use /= 2,
> as the compiler will turn it into a shift, and imbalance is not really
> valid (or used) if busiest == NULL. Actually, there are a number of
> things (I think) still wrong with this! I posted a patch to Ingo - I
> guess I should have cc'ed the list - we should combine our efforts!
> Here is my explanation and a new patch including both our fixes.
> 
> Regards,
> Nick
> 
> Nick Piggin wrote:
> 
>> Just a quick follow up
>> * using imbalance (not divided by 2) instead of nr_running in
>>   is_unbalanced would save a multiply by 3
>> * it is more accurate with a small number of processes due to no
>>   divisions
>> * perhaps the last test should actually be
>>   if (busiest->nr_running <= nr_running)
>>
>> Regards,
>> Nick
>>
>> Nick Piggin wrote:
>>
>>> Dear Ingo,
>>> I apologise in advance if the current behaviour of the scheduler is
>>> the result of tuning / special-casing you did; I didn't follow its
>>> development closely. However, I noticed on my 2xSMP system that quite
>>> unbalanced loads weren't getting even CPU time. Best example - 3
>>> processes in busywait loops - one would get 100% of one cpu while two
>>> would get 50% each of the other.
>>>
>>> I think find_busiest_queue might have a few problems:
>>> * the "imbalance" value divided by 2 loses meaning and precision,
>>>   esp. with a small number of processes
>>>
>>> * in the 25% imbalance calculation, presumably the + should be a *
>>>
>>> * the 25% calculation uses the divided imbalance value, which is
>>>   inaccurate
>>>
>>> * I think the 25% imbalance calculation should use a quarter of
>>>   max_load, not three quarters
>>>
>>> * the second test (after the lock) isn't very good; it's cheaper
>>>   than the mul, but having battled contention and claimed the lock,
>>>   it would be a shame to not do anything with it!
>>>
>>> Now top says the patch has fixed up the balancing, but perhaps some
>>> of these aren't bugs - perhaps they need a comment because they are
>>> a bit subtle! If they are, please recommend the patch to Linus.
>>>
>>> Regards,
>>> Nick
>>>
>>>
>>
> 
> 
> Matthew Dobson wrote:
> 
>> This snippet of code appears wrong...  Either that, or the 
>> accompanying comment is wrong?
>>
>> from kernel/sched.c::find_busiest_queue():
>>
>> | *imbalance = (max_load - nr_running) / 2;
>> |
>> | /* It needs an at least ~25% imbalance to trigger balancing. */
>> | if (!idle && (*imbalance < (max_load + 3)/4)) {
>> |     busiest = NULL;
>> |     goto out;
>> | }
>>
>> The comment says 25% imbalance, but the code is really checking for a 
>> ~50% imbalance.  The attached patch moves the division by two to the 
>> pull_task function, where the imbalance number is actually used.  This 
>> patch makes the code match the comment, and divides the imbalance by 
>> two where it is needed.
>>
>> Please let me know if I've misinterpreted what this code is supposed 
>> to be doing, -or- if we really want the comment to match the current 
>> code.
>>
>> Cheers!
>>
>> -Matt
> 
> 
> 
> 
> ------------------------------------------------------------------------
> 
> --- linux-2.5/kernel/sched.c.old	2002-10-02 23:14:12.000000000 +1000
> +++ linux-2.5/kernel/sched.c	2002-10-03 10:03:41.000000000 +1000
> @@ -689,10 +689,10 @@
>  	if (likely(!busiest))
>  		goto out;
>  
> -	*imbalance = (max_load - nr_running) / 2;
> +	*imbalance = max_load - nr_running;
>  
>  	/* It needs an at least ~25% imbalance to trigger balancing. */
> -	if (!idle && (*imbalance < (max_load + 3)/4)) {
> +	if (!idle && (*imbalance*4 < max_load)) {
>  		busiest = NULL;
>  		goto out;
>  	}
> @@ -702,10 +702,11 @@
>  	 * Make sure nothing changed since we checked the
>  	 * runqueue length.
>  	 */
> -	if (busiest->nr_running <= nr_running + 1) {
> +	if (busiest->nr_running <= nr_running) {
>  		spin_unlock(&busiest->lock);
>  		busiest = NULL;
>  	}
> +
>  out:
>  	return busiest;
>  }
> @@ -748,7 +749,12 @@
>  	busiest = find_busiest_queue(this_rq, this_cpu, idle, &imbalance);
>  	if (!busiest)
>  		goto out;
> -
> +	/*
> +	 * We only want to steal a number of tasks equal to 1/2 the imbalance,
> +	 * otherwise, we'll just shift the imbalance to the new queue.
> +	 */
> +	imbalance /= 2;
> +	
>  	/*
>  	 * We first consider expired tasks. Those will likely not be
>  	 * executed in the near future, and they are most likely to




* Re: [rfc][patch] kernel/sched.c oddness?
  2002-10-03  0:06 ` Nick Piggin
  2002-10-03  0:30   ` Matthew Dobson
@ 2002-10-03  6:54   ` Ingo Molnar
  2002-10-03  8:32     ` Nick Piggin
  2002-10-03 21:15     ` Matthew Dobson
  1 sibling, 2 replies; 7+ messages in thread
From: Ingo Molnar @ 2002-10-03  6:54 UTC (permalink / raw)
  To: Nick Piggin; +Cc: colpatch, linux-kernel, Michael Hohnbaum


> [...] However, I noticed on my 2xSMP system that quite unbalanced loads
> weren't getting even CPU time. Best example - 3 processes in busywait
> loops - one would get 100% of one cpu while two would get 50% each of
> the other.

this was done intentionally, and this scenario (1+2 tasks) is the very
worst scenario. The problem is that by trying to balance all 3 tasks we
now have 3 tasks that trash their cache going from one CPU to another.  
(this is what happens with your patch - even with another approach we'd
have to trash at least one task)

By keeping 2 tasks on one CPU and 1 task on the other CPU, we avoid
cross-CPU migration of threads. Think rather about the 2+3 or 4+5 tasks
case: do we want absolutely perfect balancing, or good SMP affinity and
good combined performance?

	Ingo



* Re: [rfc][patch] kernel/sched.c oddness?
  2002-10-03  6:54   ` Ingo Molnar
@ 2002-10-03  8:32     ` Nick Piggin
  2002-10-03 21:15     ` Matthew Dobson
  1 sibling, 0 replies; 7+ messages in thread
From: Nick Piggin @ 2002-10-03  8:32 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: colpatch, linux-kernel, Michael Hohnbaum


         * Make sure nothing changed since we checked the
         * runqueue length.
         */
-       if (busiest->nr_running <= nr_running + 1) {
+       if (busiest->nr_running <= nr_running) {
                spin_unlock(&busiest->lock);
                busiest = NULL;
        }

OK, thanks for the explanation; disregard this part of the patch, then.
The other two changes are still valid, I think.

Ingo Molnar wrote:

>>[...] However, I noticed on my 2xSMP system that quite unbalanced loads
>>weren't getting even CPU time. Best example - 3 processes in busywait
>>loops - one would get 100% of one cpu while two would get 50% each of
>>the other.
>>
>
>this was done intentionally, and this scenario (1+2 tasks) is the very
>worst scenario. The problem is that by trying to balance all 3 tasks we
>now have 3 tasks that trash their cache going from one CPU to another.  
>(this is what happens with your patch - even with another approach we'd
>have to trash at least one task)
>
>By keeping 2 tasks on one CPU and 1 task on the other CPU we avoid
>cross-CPU migration of threads. Think about the 2+3 or 4+5 tasks case
>rather, do we want absolutely perfect balancing, or good SMP affinity and
>good combined performance?
>
>	Ingo
>
>
>
>




* Re: [rfc][patch] kernel/sched.c oddness?
  2002-10-03  6:54   ` Ingo Molnar
  2002-10-03  8:32     ` Nick Piggin
@ 2002-10-03 21:15     ` Matthew Dobson
  2002-10-04  7:46       ` Ingo Molnar
  1 sibling, 1 reply; 7+ messages in thread
From: Matthew Dobson @ 2002-10-03 21:15 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Nick Piggin, linux-kernel, Michael Hohnbaum

[-- Attachment #1: Type: text/plain, Size: 1054 bytes --]

Ingo Molnar wrote:
> this was done intentionally, and this scenario (1+2 tasks) is the very
> worst scenario. The problem is that by trying to balance all 3 tasks we
> now have 3 tasks that trash their cache going from one CPU to another.  
> (this is what happens with your patch - even with another approach we'd
> have to trash at least one task)
> 
> By keeping 2 tasks on one CPU and 1 task on the other CPU we avoid
> cross-CPU migration of threads. Think about the 2+3 or 4+5 tasks case
> rather, do we want absolutely perfect balancing, or good SMP affinity and
> good combined performance?
OK...  But what about the (imbalance / 2) part?  Either the comment 
needs to change, or the code.  Attached is a slightly revised patch for 
the code.  The comment patch would be even easier:

- /* It needs an at least ~25% imbalance to trigger balancing. */
+ /* It needs an at least ~50% imbalance to trigger balancing. */

Either way works for me.  I'd like to see something done, as the 
comments don't match the code right now...

Cheers!

-Matt

[-- Attachment #2: sched_cleanup-2.5.40.patch --]
[-- Type: text/plain, Size: 912 bytes --]

diff -Nur --exclude-from=/usr/src/.dontdiff linux-2.5.40-vanilla/kernel/sched.c linux-2.5.40-sched_cleanup/kernel/sched.c
--- linux-2.5.40-vanilla/kernel/sched.c	Tue Oct  1 00:07:35 2002
+++ linux-2.5.40-sched_cleanup/kernel/sched.c	Thu Oct  3 14:09:31 2002
@@ -689,10 +689,10 @@
 	if (likely(!busiest))
 		goto out;
 
-	*imbalance = (max_load - nr_running) / 2;
+	*imbalance = max_load - nr_running;
 
 	/* It needs an at least ~25% imbalance to trigger balancing. */
-	if (!idle && (*imbalance < (max_load + 3)/4)) {
+	if (!idle && (*imbalance <= (max_load + 3)/4)) {
 		busiest = NULL;
 		goto out;
 	}
@@ -746,6 +746,11 @@
 	task_t *tmp;
 
 	busiest = find_busiest_queue(this_rq, this_cpu, idle, &imbalance);
+	/*
+	 * We only want to steal a number of tasks equal to 1/2 the imbalance,
+ 	 * otherwise, we'll just shift the imbalance to the new queue.
+	 */
+	imbalance /= 2;
 	if (!busiest)
 		goto out;
 


* Re: [rfc][patch] kernel/sched.c oddness?
  2002-10-03 21:15     ` Matthew Dobson
@ 2002-10-04  7:46       ` Ingo Molnar
  0 siblings, 0 replies; 7+ messages in thread
From: Ingo Molnar @ 2002-10-04  7:46 UTC (permalink / raw)
  To: Matthew Dobson; +Cc: Nick Piggin, linux-kernel, Michael Hohnbaum


On Thu, 3 Oct 2002, Matthew Dobson wrote:

> /* It needs an at least ~50% imbalance to trigger balancing. */
> 
> Either way works for me.  I'd like to see something done, as the
> comments don't match the code right now...

the patch looks good to me - i'll add it to my next scheduler patchset,
after some testing on bigger SMP boxes. Balancing tends to be a very
volatile area.

	Ingo


