All of lore.kernel.org
 help / color / mirror / Atom feed
From: Gregory Haskins <ghaskins@novell.com>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: mingo@elte.hu, srostedt@redhat.com, peterz@infradead.org,
	linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org,
	npiggin@suse.de, gregory.haskins@gmail.com
Subject: Re: [PATCH 3/5] sched: make double-lock-balance fair
Date: Wed, 27 Aug 2008 08:10:43 -0400	[thread overview]
Message-ID: <48B54443.3000001@novell.com> (raw)
In-Reply-To: <200808272153.32007.nickpiggin@yahoo.com.au>

[-- Attachment #1: Type: text/plain, Size: 4347 bytes --]

Nick Piggin wrote:
> On Wednesday 27 August 2008 21:41, Gregory Haskins wrote:
>   
>> Nick Piggin wrote:
>>     
>>> On Tuesday 26 August 2008 22:23, Gregory Haskins wrote:
>>>       
>>>> Nick Piggin wrote:
>>>>         
>>>>> On Tuesday 26 August 2008 06:15, Gregory Haskins wrote:
>>>>>           
>>>>>> double_lock balance() currently favors logically lower cpus since they
>>>>>> often do not have to release their own lock to acquire a second lock.
>>>>>> The result is that logically higher cpus can get starved when there is
>>>>>> a lot of pressure on the RQs.  This can result in higher latencies on
>>>>>> higher cpu-ids.
>>>>>>
>>>>>> This patch makes the algorithm more fair by forcing all paths to have
>>>>>> to release both locks before acquiring them again.  Since callsites to
>>>>>> double_lock_balance already consider it a potential
>>>>>> preemption/reschedule point, they have the proper logic to recheck for
>>>>>> atomicity violations.
>>>>>>
>>>>>> Signed-off-by: Gregory Haskins <ghaskins@novell.com>
>>>>>> ---
>>>>>>
>>>>>>  kernel/sched.c |   17 +++++------------
>>>>>>  1 files changed, 5 insertions(+), 12 deletions(-)
>>>>>>
>>>>>> diff --git a/kernel/sched.c b/kernel/sched.c
>>>>>> index 6e0bde6..b7326cd 100644
>>>>>> --- a/kernel/sched.c
>>>>>> +++ b/kernel/sched.c
>>>>>> @@ -2790,23 +2790,16 @@ static int double_lock_balance(struct rq
>>>>>> *this_rq, struct rq *busiest) __acquires(busiest->lock)
>>>>>>  	__acquires(this_rq->lock)
>>>>>>  {
>>>>>> -	int ret = 0;
>>>>>> -
>>>>>>  	if (unlikely(!irqs_disabled())) {
>>>>>>  		/* printk() doesn't work good under rq->lock */
>>>>>>  		spin_unlock(&this_rq->lock);
>>>>>>  		BUG_ON(1);
>>>>>>  	}
>>>>>> -	if (unlikely(!spin_trylock(&busiest->lock))) {
>>>>>> -		if (busiest < this_rq) {
>>>>>> -			spin_unlock(&this_rq->lock);
>>>>>> -			spin_lock(&busiest->lock);
>>>>>> -			spin_lock_nested(&this_rq->lock, SINGLE_DEPTH_NESTING);
>>>>>> -			ret = 1;
>>>>>> -		} else
>>>>>> -			spin_lock_nested(&busiest->lock, SINGLE_DEPTH_NESTING);
>>>>>> -	}
>>>>>> -	return ret;
>>>>>> +
>>>>>> +	spin_unlock(&this_rq->lock);
>>>>>> +	double_rq_lock(this_rq, busiest);
>>>>>>             
>>>>> Rather than adding the extra atomic operation, can't you just put this
>>>>> into the unlikely spin_trylock failure path rather than the unfair
>>>>> logic there?
>>>>>           
>>>> The trick is that we *must* first release this_rq before proceeding or
>>>> the new proposal doesn't work as intended.  This patch effectively
>>>> breaks up the this_rq->lock critical section evenly across all CPUs as
>>>> if it hit the case common for higher cpus.
>>>>         
>>> I don't exactly see why my proposal would introduce any more latency,
>>> because we only trylock while holding the existing lock -- this is will
>>> only ever add a small ~constant time to the critical section, regardless
>>> of whether it is a high or low CPU runqueue.
>>>       
>> Its because we are trying to create a break in the critical section of
>> this_rq->lock, not improve the acquisition of busiest->lock.  So whether
>> you do spin_lock or spin_trylock on busiest does not matter.  Busiest
>> will not be contended in the case that I am concerned with.  If you use
>> my example below: rq[N] will not be contended because cpuN is blocked on
>> rq[0] after already having released rq[N].  So its the contention
>> against this_rq that is the problem.
>>
>> Or am I missing your point completely?
>>     
>
> Well my point is just that after my change, there may be some windows
> of "unfair" behaviour where one CPU gets to go ahead early, but AFAIKS
> now there is no systemic bias against higher numbered CPUs (except the
> small issue of taking lowered numbered locks first which is also present
> in any solution).
>
> As such, I would actually like to see my proposal implemented in the
> !lowlatency case as well... unless my reasoning has a hole in it?
>
> But if you are _also_ wanting to eliminate the long lock hold time and
> strictly improve fairness for lowlatency kernels, then I agree that your
> patch goes much further than mine, so I don't object to putting that
> under CONFIG_PREEMPT.
>   

Ok, I understand what you are saying now, and that makes sense.

-Greg


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

  reply	other threads:[~2008-08-27 12:13 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-08-25 20:15 [PATCH 0/5] sched: misc rt fixes for tip/sched/devel Gregory Haskins
2008-08-25 20:15 ` [PATCH 1/5] sched: only try to push a task on wakeup if it is migratable Gregory Haskins
2008-08-25 20:15 ` [PATCH 2/5] sched: pull only one task during NEWIDLE balancing to limit critical section Gregory Haskins
2008-08-26  6:21   ` Nick Piggin
2008-08-26 11:36     ` Gregory Haskins
2008-08-27  6:41       ` Nick Piggin
2008-08-27 11:50         ` Gregory Haskins
2008-08-27 11:57           ` Nick Piggin
2008-08-25 20:15 ` [PATCH 3/5] sched: make double-lock-balance fair Gregory Haskins
2008-08-25 20:15   ` Gregory Haskins
2008-08-26  6:14   ` Nick Piggin
2008-08-26 12:23     ` Gregory Haskins
2008-08-27  6:36       ` Nick Piggin
2008-08-27 11:41         ` Gregory Haskins
2008-08-27 11:53           ` Nick Piggin
2008-08-27 12:10             ` Gregory Haskins [this message]
2008-08-25 20:15 ` [PATCH 4/5] sched: add sched_class->needs_post_schedule() member Gregory Haskins
2008-08-25 20:15 ` [PATCH 5/5] sched: create "pushable_tasks" list to limit pushing to one attempt Gregory Haskins
2008-08-26 17:34 ` [PATCH v2 0/6] Series short description Gregory Haskins
2008-08-26 17:34   ` [PATCH v2 1/6] sched: only try to push a task on wakeup if it is migratable Gregory Haskins
2008-08-26 17:34   ` [PATCH v2 2/6] sched: pull only one task during NEWIDLE balancing to limit critical section Gregory Haskins
2008-08-26 17:34     ` Gregory Haskins
2008-08-26 17:35   ` [PATCH v2 3/6] sched: make double-lock-balance fair Gregory Haskins
2008-08-27  8:21     ` Peter Zijlstra
2008-08-27  8:25       ` Peter Zijlstra
2008-08-27 10:26       ` Nick Piggin
2008-08-27 10:41         ` Peter Zijlstra
2008-08-27 10:56           ` Nick Piggin
2008-08-27 10:57             ` Nick Piggin
2008-08-27 12:03               ` Gregory Haskins
2008-08-27 11:07             ` Peter Zijlstra
2008-08-27 11:17               ` Russell King
2008-08-27 12:00               ` Gregory Haskins
2008-08-29 12:49               ` Ralf Baechle
2008-08-27 12:13             ` Gregory Haskins
2008-08-27 12:02       ` Gregory Haskins
2008-08-26 17:35   ` [PATCH v2 4/6] sched: add sched_class->needs_post_schedule() member Gregory Haskins
2008-08-26 17:35   ` [PATCH v2 5/6] plist: fix PLIST_NODE_INIT to work with debug enabled Gregory Haskins
2008-08-26 17:35   ` [PATCH v2 6/6] sched: create "pushable_tasks" list to limit pushing to one attempt Gregory Haskins
2008-08-29 13:24     ` Gregory Haskins
2008-08-26 18:16   ` [PATCH v2 0/6] sched: misc rt fixes for tip/sched/devel (was: Series short description) Gregory Haskins
2008-08-27  8:33   ` [PATCH v2 0/6] Series short description Peter Zijlstra
2008-09-04 12:54 ` [TIP/SCHED/DEVEL PATCH v3 0/6] sched: misc rt fixes Gregory Haskins
2008-09-04 12:55   ` [TIP/SCHED/DEVEL PATCH v3 1/6] sched: only try to push a task on wakeup if it is migratable Gregory Haskins
2008-09-04 12:55   ` [TIP/SCHED/DEVEL PATCH v3 2/6] sched: pull only one task during NEWIDLE balancing to limit critical section Gregory Haskins
2008-09-04 12:55     ` Gregory Haskins
2008-09-04 12:55   ` [TIP/SCHED/DEVEL PATCH v3 3/6] sched: make double-lock-balance fair Gregory Haskins
2008-09-04 12:55   ` [TIP/SCHED/DEVEL PATCH v3 4/6] sched: add sched_class->needs_post_schedule() member Gregory Haskins
2008-09-04 20:36     ` Steven Rostedt
2008-09-04 20:36       ` Gregory Haskins
2008-09-04 12:55   ` [TIP/SCHED/DEVEL PATCH v3 5/6] plist: fix PLIST_NODE_INIT to work with debug enabled Gregory Haskins
2008-09-04 12:55   ` [TIP/SCHED/DEVEL PATCH v3 6/6] sched: create "pushable_tasks" list to limit pushing to one attempt Gregory Haskins
2008-09-04 21:16     ` Steven Rostedt
2008-09-04 21:26       ` Gregory Haskins

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=48B54443.3000001@novell.com \
    --to=ghaskins@novell.com \
    --cc=gregory.haskins@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=nickpiggin@yahoo.com.au \
    --cc=npiggin@suse.de \
    --cc=peterz@infradead.org \
    --cc=srostedt@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.