* deadlock in scheduler enabling HRTICK feature
@ 2013-06-25 21:05 David Ahern
  2013-06-25 21:17 ` Peter Zijlstra
  0 siblings, 1 reply; 17+ messages in thread
From: David Ahern @ 2013-06-25 21:05 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, LKML

Peter/Ingo:

I can reliably cause a deadlock in the scheduler by enabling the HRTICK 
feature. I first hit the problem with 2.6.27 but have been able to 
reproduce it with newer kernels. I have not tried top of Linus' tree, so 
perhaps this has been fixed in 3.10. Exact backtrace differs by release, 
but the root cause is the same: the run queue is locked early in the 
schedule path and then wanted again when raising the softirq.

Using Fedora 18 and the 3.9.6-200.fc18.x86_64 kernel as an example,

[root@f18 ~]# cat /sys/kernel/debug/sched_features
GENTLE_FAIR_SLEEPERS START_DEBIT NO_NEXT_BUDDY LAST_BUDDY 
CACHE_HOT_BUDDY WAKEUP_PREEMPTION ARCH_POWER NO_HRTICK NO_DOUBLE_TICK 
LB_BIAS OWNER_SPIN NONTASK_POWER TTWU_QUEUE NO_FORCE_SD_OVERLAP 
RT_RUNTIME_SHARE NO_LB_MIN NO_NUMA NO_NUMA_FORCE

[root@f18 ~]# echo HRTICK > /sys/kernel/debug/sched_features

[root@f18 ~]# cat /sys/kernel/debug/sched_features
GENTLE_FAIR_SLEEPERS START_DEBIT NO_NEXT_BUDDY LAST_BUDDY 
CACHE_HOT_BUDDY WAKEUP_PREEMPTION ARCH_POWER HRTICK NO_DOUBLE_TICK 
LB_BIAS OWNER_SPIN NONTASK_POWER TTWU_QUEUE NO_FORCE_SD_OVERLAP 
RT_RUNTIME_SHARE NO_LB_MIN NO_NUMA NO_NUMA_FORCE
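
To turn the feature back off later, the same interface takes the 
NO_-prefixed name (assuming the usual sched_features semantics):

[root@f18 ~]# echo NO_HRTICK > /sys/kernel/debug/sched_features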

For a workload, a simple kernel build suffices: 'make O=/tmp/kbuild -j 8' 
on a 4-vcpu VM. The lockup occurs pretty quickly.

The relevant stack trace from the nmi watchdog:
...
[  219.467698]  <<EOE>>  [<ffffffff81093c61>] try_to_wake_up+0x1d1/0x2d0
[  219.467698]  [<ffffffff81043daf>] ? kvm_clock_read+0x1f/0x30
[  219.467698]  [<ffffffff81093dc7>] wake_up_process+0x27/0x50
[  219.467698]  [<ffffffff81066fc9>] wakeup_softirqd+0x29/0x30
[  219.467698]  [<ffffffff81067b95>] raise_softirq_irqoff+0x25/0x30
[  219.467698]  [<ffffffff810867c5>] __hrtimer_start_range_ns+0x3a5/0x400
[  219.467698]  [<ffffffff8109a089>] ? update_curr+0x99/0x170
[  219.467698]  [<ffffffff81086854>] hrtimer_start_range_ns+0x14/0x20
[  219.467698]  [<ffffffff81090bf0>] hrtick_start+0x90/0xa0
[  219.467698]  [<ffffffff810985f8>] hrtick_start_fair+0x88/0xd0
[  219.467698]  [<ffffffff81098f33>] hrtick_update+0x73/0x80
[  219.467698]  [<ffffffff8109c876>] enqueue_task_fair+0x346/0x550
[  219.467698]  [<ffffffff81090ab6>] enqueue_task+0x66/0x80
[  219.467698]  [<ffffffff81091443>] activate_task+0x23/0x30
[  219.467698]  [<ffffffff810917ac>] ttwu_do_activate.constprop.83+0x3c/0x70
[  219.467698]  [<ffffffff81093c6c>] try_to_wake_up+0x1dc/0x2d0
[  219.467698]  [<ffffffff81198898>] ? mem_cgroup_charge_common+0xa8/0x120
[  219.467698]  [<ffffffff81093d72>] default_wake_function+0x12/0x20
[  219.467698]  [<ffffffff810833fd>] autoremove_wake_function+0x1d/0x50
[  219.467698]  [<ffffffff8108b0e5>] __wake_up_common+0x55/0x90
[  219.467698]  [<ffffffff8108e973>] __wake_up_sync_key+0x53/0x80
...


You can see the nested calls to try_to_wake_up(), each of which has 
called ttwu_queue(). The trouble spot is here in ttwu_queue():
     ...
     raw_spin_lock(&rq->lock);     <---- dead lock here on second call
     ttwu_do_activate(rq, p, 0);
     raw_spin_unlock(&rq->lock);
     ...
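
Condensing the trace above, the recursion is roughly (hand-annotated 
sketch, not literal source):

     try_to_wake_up()
       ttwu_queue()
         raw_spin_lock(&rq->lock)            <---- first acquisition
         ttwu_do_activate()
           enqueue_task_fair()
             hrtick_update() -> hrtick_start_fair() -> hrtick_start()
               hrtimer_start_range_ns()      <---- expiry already in the past
                 raise_softirq_irqoff()
                   wakeup_softirqd()         <---- ksoftirqd lives on this rq
                     try_to_wake_up()
                       ttwu_queue()
                         raw_spin_lock(&rq->lock)  <---- same lock, deadlock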

David


* Re: deadlock in scheduler enabling HRTICK feature
  2013-06-25 21:05 deadlock in scheduler enabling HRTICK feature David Ahern
@ 2013-06-25 21:17 ` Peter Zijlstra
  2013-06-25 21:20   ` David Ahern
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2013-06-25 21:17 UTC (permalink / raw)
  To: David Ahern; +Cc: Ingo Molnar, LKML

On Tue, Jun 25, 2013 at 03:05:38PM -0600, David Ahern wrote:
> Peter/Ingo:
> 
> I can reliably cause a deadlock in the scheduler by enabling the HRTICK
> feature. I first hit the problem with 2.6.27 but have been able to reproduce
> it with newer kernels. I have not tried top of Linus' tree, so perhaps this
> has been fixed in 3.10. 

Yeah, I know.. its been on the todo list forever but its never been
important. Its disabled by default and all sched debug features are for
those brave enough to keep the pieces ;-)


* Re: deadlock in scheduler enabling HRTICK feature
  2013-06-25 21:17 ` Peter Zijlstra
@ 2013-06-25 21:20   ` David Ahern
  2013-06-26  7:05     ` Peter Zijlstra
  0 siblings, 1 reply; 17+ messages in thread
From: David Ahern @ 2013-06-25 21:20 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, LKML

On 6/25/13 3:17 PM, Peter Zijlstra wrote:
> On Tue, Jun 25, 2013 at 03:05:38PM -0600, David Ahern wrote:
>> Peter/Ingo:
>>
>> I can reliably cause a deadlock in the scheduler by enabling the HRTICK
>> feature. I first hit the problem with 2.6.27 but have been able to reproduce
>> it with newer kernels. I have not tried top of Linus' tree, so perhaps this
>> has been fixed in 3.10.
>
> Yeah, I know.. its been on the todo list forever but its never been
> important. Its disabled by default and all sched debug features are for
> those brave enough to keep the pieces ;-)
>

Not a whole lot of pieces to keep.

What is the expectation that the feature provides? There is not a whole 
lot of documentation on it. I walked down this path wondering if it 
solved an odd problem we are seeing with CFS in the 2.6.27 kernel.

David


* Re: deadlock in scheduler enabling HRTICK feature
  2013-06-25 21:20   ` David Ahern
@ 2013-06-26  7:05     ` Peter Zijlstra
  2013-06-26 16:46       ` David Ahern
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2013-06-26  7:05 UTC (permalink / raw)
  To: David Ahern; +Cc: Ingo Molnar, LKML

On Tue, Jun 25, 2013 at 03:20:00PM -0600, David Ahern wrote:
> On 6/25/13 3:17 PM, Peter Zijlstra wrote:
> >On Tue, Jun 25, 2013 at 03:05:38PM -0600, David Ahern wrote:
> >>Peter/Ingo:
> >>
> >>I can reliably cause a deadlock in the scheduler by enabling the HRTICK
> >>feature. I first hit the problem with 2.6.27 but have been able to reproduce
> >>it with newer kernels. I have not tried top of Linus' tree, so perhaps this
> >>has been fixed in 3.10.
> >
> >Yeah, I know.. its been on the todo list forever but its never been
> >important. Its disabled by default and all sched debug features are for
> >those brave enough to keep the pieces ;-)
> >
> 
> Not a whole lot of pieces to keep.
> 
> What is the expectation that the feature provides? not a whole lot of
> documentation on it. I walked down the path wondering if it solved an odd
> problem we are seeing with the CFS in 2.6.27 kernel.

Its supposed to use hrtimers for slice expiry instead of the regular tick.

IIRC it did work at some point but bitrotted a bit since. The good news is that
the deadline scheduler wants to use it and I'll probably have to fix it up
then.

But yes, I need to write more comments in sched/features.h
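
To make "slice expiry via hrtimers" a bit more concrete, the CFS hook 
looks roughly like this (paraphrased from memory of the 3.x 
kernel/sched/fair.c code, so details may differ): on enqueue the 
remaining part of the current task's slice is computed and an hrtimer is 
armed for exactly that point, instead of waiting for the next periodic 
tick.

    static void hrtick_start_fair(struct rq *rq, struct task_struct *p)
    {
            struct sched_entity *se = &p->se;
            struct cfs_rq *cfs_rq = cfs_rq_of(se);

            if (cfs_rq->nr_running > 1) {
                    u64 slice = sched_slice(cfs_rq, se);
                    u64 ran = se->sum_exec_runtime - se->prev_sum_exec_runtime;
                    s64 delta = slice - ran;

                    /* slice already used up: just ask for a resched */
                    if (delta < 0) {
                            if (rq->curr == p)
                                    resched_task(p);
                            return;
                    }

                    /* otherwise arm an hrtimer 'delta' ns from now */
                    hrtick_start(rq, delta);
            }
    }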


* Re: deadlock in scheduler enabling HRTICK feature
  2013-06-26  7:05     ` Peter Zijlstra
@ 2013-06-26 16:46       ` David Ahern
  2013-06-27 10:43         ` Peter Zijlstra
  0 siblings, 1 reply; 17+ messages in thread
From: David Ahern @ 2013-06-26 16:46 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, LKML

On 6/26/13 1:05 AM, Peter Zijlstra wrote:
>> What is the expectation that the feature provides? not a whole lot of
>> documentation on it. I walked down the path wondering if it solved an odd
>> problem we are seeing with the CFS in 2.6.27 kernel.
>
> Its supposed to use hrtimers for slice expiry instead of the regular tick.

So theoretically CPU bound tasks would get preempted sooner? That was my 
guess/hope anyways.

>
> IIRC it did work at some point but bitrotted a bit since. The good news is that
> the deadline scheduler wants to use it and I'll probably have to fix it up
> then.

Hmmm... meaning I should not be expecting anything in the next couple 
of months? Any gut opinions on how to approach the nesting problem - at 
least a quick hack for me to see if this option has any impact on our 
problem? E.g., a per-CPU variable noting we already hold the runqueue 
lock, so there is no need to grab it a second time.

David


* Re: deadlock in scheduler enabling HRTICK feature
  2013-06-26 16:46       ` David Ahern
@ 2013-06-27 10:43         ` Peter Zijlstra
  2013-06-27 10:53           ` Peter Zijlstra
  2013-06-27 22:28           ` David Ahern
  0 siblings, 2 replies; 17+ messages in thread
From: Peter Zijlstra @ 2013-06-27 10:43 UTC (permalink / raw)
  To: David Ahern; +Cc: Ingo Molnar, LKML

On Wed, Jun 26, 2013 at 10:46:33AM -0600, David Ahern wrote:
> On 6/26/13 1:05 AM, Peter Zijlstra wrote:
> >>What is the expectation that the feature provides? not a whole lot of
> >>documentation on it. I walked down the path wondering if it solved an odd
> >>problem we are seeing with the CFS in 2.6.27 kernel.
> >
> >Its supposed to use hrtimers for slice expiry instead of the regular tick.
> 
> So theoretically CPU bound tasks would get preempted sooner? That was my
> guess/hope anyways.

Doth the below worketh?

---
 kernel/sched/core.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9b1f2e5..0d8eb45 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -370,13 +370,6 @@ static struct rq *this_rq_lock(void)
 #ifdef CONFIG_SCHED_HRTICK
 /*
  * Use HR-timers to deliver accurate preemption points.
- *
- * Its all a bit involved since we cannot program an hrt while holding the
- * rq->lock. So what we do is store a state in in rq->hrtick_* and ask for a
- * reschedule event.
- *
- * When we get rescheduled we reprogram the hrtick_timer outside of the
- * rq->lock.
  */
 
 static void hrtick_clear(struct rq *rq)
@@ -404,6 +397,15 @@ static enum hrtimer_restart hrtick(struct hrtimer *timer)
 }
 
 #ifdef CONFIG_SMP
+
+static int __hrtick_restart(struct rq *rq)
+{
+	struct hrtimer *timer = &rq->hrtick_timer;
+	ktime_t time = hrtimer_get_softexpires(timer);
+
+	return __hrtimer_start_range_ns(timer, time, 0, HRTIMER_MODE_ABS_PINNED, 0);
+}
+
 /*
  * called from hardirq (IPI) context
  */
@@ -412,7 +414,7 @@ static void __hrtick_start(void *arg)
 	struct rq *rq = arg;
 
 	raw_spin_lock(&rq->lock);
-	hrtimer_restart(&rq->hrtick_timer);
+	__hrtick_restart(rq);
 	rq->hrtick_csd_pending = 0;
 	raw_spin_unlock(&rq->lock);
 }
@@ -430,7 +432,7 @@ void hrtick_start(struct rq *rq, u64 delay)
 	hrtimer_set_expires(timer, time);
 
 	if (rq == this_rq()) {
-		hrtimer_restart(timer);
+		__hrtick_restart(rq);
 	} else if (!rq->hrtick_csd_pending) {
 		__smp_call_function_single(cpu_of(rq), &rq->hrtick_csd, 0);
 		rq->hrtick_csd_pending = 1;



* Re: deadlock in scheduler enabling HRTICK feature
  2013-06-27 10:43         ` Peter Zijlstra
@ 2013-06-27 10:53           ` Peter Zijlstra
  2013-06-27 12:28             ` Mike Galbraith
                               ` (2 more replies)
  2013-06-27 22:28           ` David Ahern
  1 sibling, 3 replies; 17+ messages in thread
From: Peter Zijlstra @ 2013-06-27 10:53 UTC (permalink / raw)
  To: David Ahern; +Cc: Ingo Molnar, LKML, tglx

On Thu, Jun 27, 2013 at 12:43:09PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 26, 2013 at 10:46:33AM -0600, David Ahern wrote:
> > On 6/26/13 1:05 AM, Peter Zijlstra wrote:
> > >>What is the expectation that the feature provides? not a whole lot of
> > >>documentation on it. I walked down the path wondering if it solved an odd
> > >>problem we are seeing with the CFS in 2.6.27 kernel.
> > >
> > >Its supposed to use hrtimers for slice expiry instead of the regular tick.
> > 
> > So theoretically CPU bound tasks would get preempted sooner? That was my
> > guess/hope anyways.
> 
> Doth the below worketh?
> 

Related to all this; the reason its not enabled by default is that mucking
about with hrtimers all the while is god awful expensive.

I've had ideas about making this a special purpose 'hard-coded' timer in the
hrtimer guts that's only ever re-programmed when the new value is sooner.

By making it a 'special' timer we can avoid the whole rb-tree song and dance;
and by taking 'spurious' short interrupts we can avoid prodding the hardware
too often.
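
A very rough sketch of that idea (invented names, illustration only, and 
ignoring all the locking):

    /* Illustration only -- not real kernel code. */
    static DEFINE_PER_CPU(s64, hrtick_deadline_ns);    /* 0 == nothing armed */

    static void hrtick_program(s64 new_deadline_ns)
    {
            s64 cur = __this_cpu_read(hrtick_deadline_ns);

            /* only touch the hardware when the new deadline is sooner */
            if (!cur || new_deadline_ns < cur) {
                    __this_cpu_write(hrtick_deadline_ns, new_deadline_ns);
                    /* reprogram the clockevent here; a too-early interrupt
                     * is treated as spurious and simply re-armed */
            }
    }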

Then again; Thomas will likely throw frozen seafood my way for even proposing
stuff like this and I'm not even sure that's going to be enough to make the
cost acceptable.


* Re: deadlock in scheduler enabling HRTICK feature
  2013-06-27 10:53           ` Peter Zijlstra
@ 2013-06-27 12:28             ` Mike Galbraith
  2013-06-27 13:06             ` Ingo Molnar
  2013-06-27 19:18             ` Andy Lutomirski
  2 siblings, 0 replies; 17+ messages in thread
From: Mike Galbraith @ 2013-06-27 12:28 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: David Ahern, Ingo Molnar, LKML, tglx

On Thu, 2013-06-27 at 12:53 +0200, Peter Zijlstra wrote: 
> On Thu, Jun 27, 2013 at 12:43:09PM +0200, Peter Zijlstra wrote:
> > On Wed, Jun 26, 2013 at 10:46:33AM -0600, David Ahern wrote:
> > > On 6/26/13 1:05 AM, Peter Zijlstra wrote:
> > > >>What is the expectation that the feature provides? not a whole lot of
> > > >>documentation on it. I walked down the path wondering if it solved an odd
> > > >>problem we are seeing with the CFS in 2.6.27 kernel.
> > > >
> > > >Its supposed to use hrtimers for slice expiry instead of the regular tick.
> > > 
> > > So theoretically CPU bound tasks would get preempted sooner? That was my
> > > guess/hope anyways.
> > 
> > Doth the below worketh?
> > 
> 
> Related to all this; the reason its not enabled by default is that mucking
> about with hrtimers all the while is god awful expensive.

That cost sprang to mind when you mentioned that deadline needs hrtick.

> I've had ideas about making this a special purpose 'hard-coded' timer in the
> hrtimer guts that's only ever re-programmed when the new value is sooner.
> 
> By making it a 'special' timer we can avoid the whole rb-tree song and dance;
> and by taking 'spurious' short interrupts we can avoid prodding the hardware
> too often.
> 
> Then again; Thomas will likely throw frozen seafood my way for even proposing
> stuff like this and I'm not even sure that's going to be enough to make the
> cost acceptable.

Could be worse, flaming marshmallows stick.

-Mike



* Re: deadlock in scheduler enabling HRTICK feature
  2013-06-27 10:53           ` Peter Zijlstra
  2013-06-27 12:28             ` Mike Galbraith
@ 2013-06-27 13:06             ` Ingo Molnar
  2013-06-27 19:18             ` Andy Lutomirski
  2 siblings, 0 replies; 17+ messages in thread
From: Ingo Molnar @ 2013-06-27 13:06 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: David Ahern, LKML, tglx


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Thu, Jun 27, 2013 at 12:43:09PM +0200, Peter Zijlstra wrote:
> > On Wed, Jun 26, 2013 at 10:46:33AM -0600, David Ahern wrote:
> > > On 6/26/13 1:05 AM, Peter Zijlstra wrote:
> > > >>What is the expectation that the feature provides? not a whole lot of
> > > >>documentation on it. I walked down the path wondering if it solved an odd
> > > >>problem we are seeing with the CFS in 2.6.27 kernel.
> > > >
> > > >Its supposed to use hrtimers for slice expiry instead of the regular tick.
> > > 
> > > So theoretically CPU bound tasks would get preempted sooner? That was my
> > > guess/hope anyways.
> > 
> > Doth the below worketh?
> > 
> 
> Related to all this; the reason its not enabled by default is that 
> mucking about with hrtimers all the while is god awful expensive.
> 
> I've had ideas about making this a special purpose 'hard-coded' timer in 
> the hrtimer guts that's only ever re-programmed when the new value is 
> sooner.
> 
> By making it a 'special' timer we can avoid the whole rb-tree song and 
> dance; and by taking 'spurious' short interrupts we can avoid prodding 
> the hardware too often.

Sounds neat ...

> Then again; Thomas will likely throw frozen seafood my way for even 
> proposing stuff like this and I'm not even sure that's going to be 
> enough to make the cost acceptable.

(Could be worse: rotten seafood?)

Thanks,

	Ingo


* Re: deadlock in scheduler enabling HRTICK feature
  2013-06-27 10:53           ` Peter Zijlstra
  2013-06-27 12:28             ` Mike Galbraith
  2013-06-27 13:06             ` Ingo Molnar
@ 2013-06-27 19:18             ` Andy Lutomirski
  2013-06-27 20:37               ` Peter Zijlstra
  2 siblings, 1 reply; 17+ messages in thread
From: Andy Lutomirski @ 2013-06-27 19:18 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: David Ahern, Ingo Molnar, LKML, tglx

On 06/27/2013 03:53 AM, Peter Zijlstra wrote:
> On Thu, Jun 27, 2013 at 12:43:09PM +0200, Peter Zijlstra wrote:
>> On Wed, Jun 26, 2013 at 10:46:33AM -0600, David Ahern wrote:
>>> On 6/26/13 1:05 AM, Peter Zijlstra wrote:
>>>>> What is the expectation that the feature provides? not a whole lot of
>>>>> documentation on it. I walked down the path wondering if it solved an odd
>>>>> problem we are seeing with the CFS in 2.6.27 kernel.
>>>>
>>>> Its supposed to use hrtimers for slice expiry instead of the regular tick.
>>>
>>> So theoretically CPU bound tasks would get preempted sooner? That was my
>>> guess/hope anyways.
>>
>> Doth the below worketh?
>>
> 
> Related to all this; the reason its not enabled by default is that mucking
> about with hrtimers all the while is god awful expensive.
> 
> I've had ideas about making this a special purpose 'hard-coded' timer in the
> hrtimer guts that's only ever re-programmed when the new value is sooner.
> 
> By making it a 'special' timer we can avoid the whole rb-tree song and dance;
> and by taking 'spurious' short interrupts we can avoid prodding the hardware
> too often.

Supposedly, on really new x86 systems, the TSC deadline timer is so fast
to reprogram that this isn't worth it.

I wonder if the general timing code should have a way to indicate that a
given clockevent is (a) very fast and (b) is actually locked to a
clocksource as opposed to just having a vaguely accurately calibrated
frequency.
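
To make that concrete, something like the following -- purely 
hypothetical flags, nothing of the sort exists upstream:

    /* Hypothetical illustration only; these flags do not exist. */
    #define CLOCK_EVT_FEAT_CHEAP_REPROGRAM    0x000100  /* (a) reprogramming is cheap */
    #define CLOCK_EVT_FEAT_CLOCKSOURCE_LOCKED 0x000200  /* (b) same oscillator as the
                                                          *     current clocksource   */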

--Andy


* Re: deadlock in scheduler enabling HRTICK feature
  2013-06-27 19:18             ` Andy Lutomirski
@ 2013-06-27 20:37               ` Peter Zijlstra
  0 siblings, 0 replies; 17+ messages in thread
From: Peter Zijlstra @ 2013-06-27 20:37 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: David Ahern, Ingo Molnar, LKML, tglx

On Thu, Jun 27, 2013 at 12:18:04PM -0700, Andy Lutomirski wrote:
> Supposedly, on really new x86 systems, the TSC deadline timer is so fast
> to reprogram that this isn't worth it.

From what I've heard it's not all that fast. I should play with it sometime; my 
SNB desktop actually has this thing.


* Re: deadlock in scheduler enabling HRTICK feature
  2013-06-27 10:43         ` Peter Zijlstra
  2013-06-27 10:53           ` Peter Zijlstra
@ 2013-06-27 22:28           ` David Ahern
  2013-06-28  9:00             ` Ingo Molnar
  2013-06-28  9:09             ` deadlock in scheduler enabling HRTICK feature Peter Zijlstra
  1 sibling, 2 replies; 17+ messages in thread
From: David Ahern @ 2013-06-27 22:28 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, LKML

On 6/27/13 4:43 AM, Peter Zijlstra wrote:
> On Wed, Jun 26, 2013 at 10:46:33AM -0600, David Ahern wrote:
>> On 6/26/13 1:05 AM, Peter Zijlstra wrote:
>>>> What is the expectation that the feature provides? not a whole lot of
>>>> documentation on it. I walked down the path wondering if it solved an odd
>>>> problem we are seeing with the CFS in 2.6.27 kernel.
>>>
>>> Its supposed to use hrtimers for slice expiry instead of the regular tick.
>>
>> So theoretically CPU bound tasks would get preempted sooner? That was my
>> guess/hope anyways.
>
> Doth the below worketh?

It doth.

Usually make -j 8 for a kernel build in a VM would lock it up pretty 
quickly. With the patch I was able to run full builds multiple times.

As for the solution, you are avoiding the nesting by not waking up the 
softirq daemon.

I'll backport to 2.6.27 and see what happens. Thanks for the patch.

David



* Re: deadlock in scheduler enabling HRTICK feature
  2013-06-27 22:28           ` David Ahern
@ 2013-06-28  9:00             ` Ingo Molnar
  2013-06-28  9:18               ` Peter Zijlstra
  2013-06-28  9:09             ` deadlock in scheduler enabling HRTICK feature Peter Zijlstra
  1 sibling, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2013-06-28  9:00 UTC (permalink / raw)
  To: David Ahern; +Cc: Peter Zijlstra, LKML, Mike Galbraith


* David Ahern <dsahern@gmail.com> wrote:

> On 6/27/13 4:43 AM, Peter Zijlstra wrote:
> >On Wed, Jun 26, 2013 at 10:46:33AM -0600, David Ahern wrote:
> >>On 6/26/13 1:05 AM, Peter Zijlstra wrote:
> >>>>What is the expectation that the feature provides? not a whole lot of
> >>>>documentation on it. I walked down the path wondering if it solved an odd
> >>>>problem we are seeing with the CFS in 2.6.27 kernel.
> >>>
> >>>Its supposed to use hrtimers for slice expiry instead of the regular tick.
> >>
> >>So theoretically CPU bound tasks would get preempted sooner? That was my
> >>guess/hope anyways.
> >
> >Doth the below worketh?
> 
> It doth.
> 
> Usually make -j 8 for a kernel build in a VM would lock it up pretty
> quickly. With the patch I was able to run full builds multiple
> times.
> 
> As for the solution you are avoiding the nesting by not waking up
> the softirq daemon.

I guess we could merge this fix?

Thanks,

	Ingo


* Re: deadlock in scheduler enabling HRTICK feature
  2013-06-27 22:28           ` David Ahern
  2013-06-28  9:00             ` Ingo Molnar
@ 2013-06-28  9:09             ` Peter Zijlstra
  2013-06-28 17:28               ` David Ahern
  1 sibling, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2013-06-28  9:09 UTC (permalink / raw)
  To: David Ahern; +Cc: Ingo Molnar, LKML

On Thu, Jun 27, 2013 at 04:28:57PM -0600, David Ahern wrote:
> On 6/27/13 4:43 AM, Peter Zijlstra wrote:
> >On Wed, Jun 26, 2013 at 10:46:33AM -0600, David Ahern wrote:
> >>On 6/26/13 1:05 AM, Peter Zijlstra wrote:
> >>>>What is the expectation that the feature provides? not a whole lot of
> >>>>documentation on it. I walked down the path wondering if it solved an odd
> >>>>problem we are seeing with the CFS in 2.6.27 kernel.
> >>>
> >>>Its supposed to use hrtimers for slice expiry instead of the regular tick.
> >>
> >>So theoretically CPU bound tasks would get preempted sooner? That was my
> >>guess/hope anyways.
> >
> >Doth the below worketh?
> 
> It doth.
> 
> Usually make -j 8 for a kernel build in a VM would lock it up pretty
> quickly. With the patch I was able to run full builds multiple times.

Good!

> As for the solution you are avoiding the nesting by not waking up the
> softirq daemon.

Yah! :-) Obviously doing a wakeup while holding scheduler locks isn't going to
work out well. And the only reason we really need that pesky softirq nonsense
is when we accidentally schedule a timer that's already expired; in that case
we'll run it from sirq context.

We don't care about missing events like that; there's always the actual tick
for backup.

I suppose I'd better go write a Changelog and properly submit the patch :-)


* Re: deadlock in scheduler enabling HRTICK feature
  2013-06-28  9:00             ` Ingo Molnar
@ 2013-06-28  9:18               ` Peter Zijlstra
  2013-07-12 13:29                 ` [tip:sched/core] sched: Fix HRTICK tip-bot for Peter Zijlstra
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2013-06-28  9:18 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: David Ahern, LKML, Mike Galbraith

On Fri, Jun 28, 2013 at 11:00:21AM +0200, Ingo Molnar wrote:
> I guess we could merge this fix?

---
Subject: sched: Fix HRTICK

David reported that the HRTICK sched feature was borken; which was enough
motivation for me to finally fix it ;-)

We should not allow hrtimer code to do softirq wakeups while holding scheduler
locks. The hrtimer code only needs this when we accidentally try to program an
expired time. We don't much care about those anyway since we have the regular
tick to fall back to.

Reported-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
---
 kernel/sched/core.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9b1f2e5..0d8eb45 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -370,13 +370,6 @@ static struct rq *this_rq_lock(void)
 #ifdef CONFIG_SCHED_HRTICK
 /*
  * Use HR-timers to deliver accurate preemption points.
- *
- * Its all a bit involved since we cannot program an hrt while holding the
- * rq->lock. So what we do is store a state in in rq->hrtick_* and ask for a
- * reschedule event.
- *
- * When we get rescheduled we reprogram the hrtick_timer outside of the
- * rq->lock.
  */
 
 static void hrtick_clear(struct rq *rq)
@@ -404,6 +397,15 @@ static enum hrtimer_restart hrtick(struct hrtimer *timer)
 }
 
 #ifdef CONFIG_SMP
+
+static int __hrtick_restart(struct rq *rq)
+{
+	struct hrtimer *timer = &rq->hrtick_timer;
+	ktime_t time = hrtimer_get_softexpires(timer);
+
+	return __hrtimer_start_range_ns(timer, time, 0, HRTIMER_MODE_ABS_PINNED, 0);
+}
+
 /*
  * called from hardirq (IPI) context
  */
@@ -412,7 +414,7 @@ static void __hrtick_start(void *arg)
 	struct rq *rq = arg;
 
 	raw_spin_lock(&rq->lock);
-	hrtimer_restart(&rq->hrtick_timer);
+	__hrtick_restart(rq);
 	rq->hrtick_csd_pending = 0;
 	raw_spin_unlock(&rq->lock);
 }
@@ -430,7 +432,7 @@ void hrtick_start(struct rq *rq, u64 delay)
 	hrtimer_set_expires(timer, time);
 
 	if (rq == this_rq()) {
-		hrtimer_restart(timer);
+		__hrtick_restart(rq);
 	} else if (!rq->hrtick_csd_pending) {
 		__smp_call_function_single(cpu_of(rq), &rq->hrtick_csd, 0);
 		rq->hrtick_csd_pending = 1;



* Re: deadlock in scheduler enabling HRTICK feature
  2013-06-28  9:09             ` deadlock in scheduler enabling HRTICK feature Peter Zijlstra
@ 2013-06-28 17:28               ` David Ahern
  0 siblings, 0 replies; 17+ messages in thread
From: David Ahern @ 2013-06-28 17:28 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, LKML

On 6/28/13 3:09 AM, Peter Zijlstra wrote:

> Yah! :-) Obviously doing a wakeup while holding scheduler locks isn't going to
> work out well. And the only reason we really need that pesky softirq nonsense
> is when we accidentally schedule a timer that's already expired; in that case
> we'll run it from sirq context.
>
> We don't care about missing events like that; there's always the actual tick
> for backup.
>
> I suppose I'd better go write a Changelog and properly submit the patch :-)
>

Thanks for the quick turnaround on the patch - and the explanation.

David


* [tip:sched/core] sched: Fix HRTICK
  2013-06-28  9:18               ` Peter Zijlstra
@ 2013-07-12 13:29                 ` tip-bot for Peter Zijlstra
  0 siblings, 0 replies; 17+ messages in thread
From: tip-bot for Peter Zijlstra @ 2013-07-12 13:29 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, peterz, dsahern, tglx

Commit-ID:  971ee28cbd1ccd87b3164facd9359a534c1d2892
Gitweb:     http://git.kernel.org/tip/971ee28cbd1ccd87b3164facd9359a534c1d2892
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Fri, 28 Jun 2013 11:18:53 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 12 Jul 2013 13:52:58 +0200

sched: Fix HRTICK

David reported that the HRTICK sched feature was borken; which was enough
motivation for me to finally fix it ;-)

We should not allow hrtimer code to do softirq wakeups while holding scheduler
locks. The hrtimer code only needs this when we accidentally try to program an
expired time. We don't much care about those anyway since we have the regular
tick to fall back to.

Reported-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20130628091853.GE29209@dyad.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/core.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9b1f2e5..0d8eb45 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -370,13 +370,6 @@ static struct rq *this_rq_lock(void)
 #ifdef CONFIG_SCHED_HRTICK
 /*
  * Use HR-timers to deliver accurate preemption points.
- *
- * Its all a bit involved since we cannot program an hrt while holding the
- * rq->lock. So what we do is store a state in in rq->hrtick_* and ask for a
- * reschedule event.
- *
- * When we get rescheduled we reprogram the hrtick_timer outside of the
- * rq->lock.
  */
 
 static void hrtick_clear(struct rq *rq)
@@ -404,6 +397,15 @@ static enum hrtimer_restart hrtick(struct hrtimer *timer)
 }
 
 #ifdef CONFIG_SMP
+
+static int __hrtick_restart(struct rq *rq)
+{
+	struct hrtimer *timer = &rq->hrtick_timer;
+	ktime_t time = hrtimer_get_softexpires(timer);
+
+	return __hrtimer_start_range_ns(timer, time, 0, HRTIMER_MODE_ABS_PINNED, 0);
+}
+
 /*
  * called from hardirq (IPI) context
  */
@@ -412,7 +414,7 @@ static void __hrtick_start(void *arg)
 	struct rq *rq = arg;
 
 	raw_spin_lock(&rq->lock);
-	hrtimer_restart(&rq->hrtick_timer);
+	__hrtick_restart(rq);
 	rq->hrtick_csd_pending = 0;
 	raw_spin_unlock(&rq->lock);
 }
@@ -430,7 +432,7 @@ void hrtick_start(struct rq *rq, u64 delay)
 	hrtimer_set_expires(timer, time);
 
 	if (rq == this_rq()) {
-		hrtimer_restart(timer);
+		__hrtick_restart(rq);
 	} else if (!rq->hrtick_csd_pending) {
 		__smp_call_function_single(cpu_of(rq), &rq->hrtick_csd, 0);
 		rq->hrtick_csd_pending = 1;

