linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] sched_rt: the task in irq context can be migrated during context switching
@ 2012-01-05 11:00 Chanho Min
  2012-01-05 14:25 ` Peter Zijlstra
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Chanho Min @ 2012-01-05 11:00 UTC (permalink / raw)
  To: linux-kernel, Ingo Molnar, Peter Zijlstra; +Cc: chanho.min

This issue happens under the following conditions
1. preemption is off
2. __ARCH_WANT_INTERRUPTS_ON_CTXSW is defined.
3. RT scheduling class
4. SMP system

Sequence is as follows:
1.suppose current task is A. start schedule()
2.task A is enqueued pushable task at the entry of schedule()
   __schedule
    prev = rq->curr;
    ...
    put_prev_task
     put_prev_task_rt
      enqueue_pushable_task
4.pick the task B as next task.
   next = pick_next_task(rq);
3.rq->curr set to task B and context_switch is started.
   rq->curr = next;
4.At the entry of context_swtich, release this cpu's rq->lock.
   context_switch
    prepare_task_switch
     prepare_lock_switch
      raw_spin_unlock_irq(&rq->lock);
5.Shortly after rq->lock is released, interrupt is occurred and start
IRQ context
6.try_to_wake_up() which called by ISR acquires rq->lock
    try_to_wake_up
     ttwu_remote
      rq = __task_rq_lock(p)
      ttwu_do_wakeup(rq, p, wake_flags);
        task_woken_rt
7.push_rt_task picks the task A which is enqueued before.
   task_woken_rt
    push_rt_tasks(rq)
     next_task = pick_next_pushable_task(rq)
8.At find_lock_lowest_rq(), If double_lock_balance() returns 0,
lowest_rq can be the remote rq.
  (But,If preemption is on, double_lock_balance always return 1 and it
does't happen.)
   push_rt_task
    find_lock_lowest_rq
     if (double_lock_balance(rq, lowest_rq))..
9.find_lock_lowest_rq return the available rq. task A is migrated to
the remote cpu/rq.
   push_rt_task
    ...
    deactivate_task(rq, next_task, 0);
    set_task_cpu(next_task, lowest_rq->cpu);
    activate_task(lowest_rq, next_task, 0);
10. But, task A is on irq context at this cpu.
    So, task A is scheduled by two cpus at the same time until restore from IRQ.
    Task A's stack is corrupted. Unexpected problem is occurred.

In recent ARM, I saw the patch to remove
__ARCH_WANT_INTERRUPTS_ON_CTXSW is posted.
But, if that feature is adopted by the others architecture or to
remain in the release version,,this can occur.
This is my patch to fix it. Any opinions will be appreciated.

Signed-off-by: Chanho Min <chanho.min@lge.com>
---
 kernel/sched_rt.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
index 583a136..59e66e3 100644
--- a/kernel/sched_rt.c
+++ b/kernel/sched_rt.c
@@ -1388,6 +1388,11 @@ static int push_rt_task(struct rq *rq)
        if (!next_task)
                return 0;

+#ifdef __ARCH_WANT_INTERRUPTS_ON_CTXSW
+       if (unlikely(task_running(rq, next_task)))
+               return 0;
+#endif
+
 retry:
        if (unlikely(next_task == rq->curr)) {
                WARN_ON(1);
--

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched_rt: the task in irq context can be migrated during context switching
  2012-01-05 11:00 [PATCH] sched_rt: the task in irq context can be migrated during context switching Chanho Min
@ 2012-01-05 14:25 ` Peter Zijlstra
  2012-01-05 17:55 ` Peter Zijlstra
  2012-01-28 12:06 ` [tip:sched/core] sched/rt: Fix task stack corruption under __ARCH_WANT_INTERRUPTS_ON_CTXSW tip-bot for Chanho Min
  2 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2012-01-05 14:25 UTC (permalink / raw)
  To: Chanho Min; +Cc: linux-kernel, Ingo Molnar, chanho.min

On Thu, 2012-01-05 at 20:00 +0900, Chanho Min wrote:
> In recent ARM, I saw the patch to remove
> __ARCH_WANT_INTERRUPTS_ON_CTXSW is posted.
> But, if that feature is adopted by the others architecture 

No, the moment ARM removes the need, the whole feature will be removed
never to be brought back, its a horrid complication to the scheduler we
really don't ever want to bring back (like you found out).

I'll go and digest your scenario now. At the very least your patch might
want to go into -stable.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched_rt: the task in irq context can be migrated during context switching
  2012-01-05 11:00 [PATCH] sched_rt: the task in irq context can be migrated during context switching Chanho Min
  2012-01-05 14:25 ` Peter Zijlstra
@ 2012-01-05 17:55 ` Peter Zijlstra
  2012-01-05 18:15   ` Steven Rostedt
  2012-01-28 12:06 ` [tip:sched/core] sched/rt: Fix task stack corruption under __ARCH_WANT_INTERRUPTS_ON_CTXSW tip-bot for Chanho Min
  2 siblings, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2012-01-05 17:55 UTC (permalink / raw)
  To: Chanho Min; +Cc: linux-kernel, Ingo Molnar, chanho.min, rostedt

On Thu, 2012-01-05 at 20:00 +0900, Chanho Min wrote:
> This issue happens under the following conditions
> 1. preemption is off
> 2. __ARCH_WANT_INTERRUPTS_ON_CTXSW is defined.
> 3. RT scheduling class
> 4. SMP system
> 
> Sequence is as follows:
> 1.suppose current task is A. start schedule()
> 2.task A is enqueued pushable task at the entry of schedule()
>    __schedule
>     prev = rq->curr;
>     ...
>     put_prev_task
>      put_prev_task_rt
>       enqueue_pushable_task
> 4.pick the task B as next task.
>    next = pick_next_task(rq);
> 3.rq->curr set to task B and context_switch is started.
>    rq->curr = next;
> 4.At the entry of context_swtich, release this cpu's rq->lock.
>    context_switch
>     prepare_task_switch
>      prepare_lock_switch
>       raw_spin_unlock_irq(&rq->lock);
> 5.Shortly after rq->lock is released, interrupt is occurred and start
> IRQ context
> 6.try_to_wake_up() which called by ISR acquires rq->lock
>     try_to_wake_up
>      ttwu_remote
>       rq = __task_rq_lock(p)
>       ttwu_do_wakeup(rq, p, wake_flags);
>         task_woken_rt
> 7.push_rt_task picks the task A which is enqueued before.
>    task_woken_rt
>     push_rt_tasks(rq)
>      next_task = pick_next_pushable_task(rq)
> 8.At find_lock_lowest_rq(), If double_lock_balance() returns 0,
> lowest_rq can be the remote rq.
>   (But,If preemption is on, double_lock_balance always return 1 and it
> does't happen.)
>    push_rt_task
>     find_lock_lowest_rq
>      if (double_lock_balance(rq, lowest_rq))..
> 9.find_lock_lowest_rq return the available rq. task A is migrated to
> the remote cpu/rq.
>    push_rt_task
>     ...
>     deactivate_task(rq, next_task, 0);
>     set_task_cpu(next_task, lowest_rq->cpu);
>     activate_task(lowest_rq, next_task, 0);
> 10. But, task A is on irq context at this cpu.
>     So, task A is scheduled by two cpus at the same time until restore from IRQ.
>     Task A's stack is corrupted. Unexpected problem is occurred.
> 
> In recent ARM, I saw the patch to remove
> __ARCH_WANT_INTERRUPTS_ON_CTXSW is posted.
> But, if that feature is adopted by the others architecture or to
> remain in the release version,,this can occur.
> This is my patch to fix it. Any opinions will be appreciated.

So the problem is quite real, as already said we don't need to worry
about the future, but we might want to fix this in previous kernels.
What I'm not entirely sure of is the proposed solution, Steven don't we
get in trouble by simply bailing out on the push?

> Signed-off-by: Chanho Min <chanho.min@lge.com>
> ---
>  kernel/sched_rt.c |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
> 
> diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
> index 583a136..59e66e3 100644
> --- a/kernel/sched_rt.c
> +++ b/kernel/sched_rt.c
> @@ -1388,6 +1388,11 @@ static int push_rt_task(struct rq *rq)
>         if (!next_task)
>                 return 0;
> 
> +#ifdef __ARCH_WANT_INTERRUPTS_ON_CTXSW
> +       if (unlikely(task_running(rq, next_task)))
> +               return 0;
> +#endif
> +
>  retry:
>         if (unlikely(next_task == rq->curr)) {
>                 WARN_ON(1);
> --


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched_rt: the task in irq context can be migrated during context switching
  2012-01-05 17:55 ` Peter Zijlstra
@ 2012-01-05 18:15   ` Steven Rostedt
  2012-01-05 21:45     ` Peter Zijlstra
  0 siblings, 1 reply; 10+ messages in thread
From: Steven Rostedt @ 2012-01-05 18:15 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Chanho Min, linux-kernel, Ingo Molnar, chanho.min

On Thu, 2012-01-05 at 18:55 +0100, Peter Zijlstra wrote:

> So the problem is quite real, as already said we don't need to worry
> about the future, but we might want to fix this in previous kernels.
> What I'm not entirely sure of is the proposed solution, Steven don't we
> get in trouble by simply bailing out on the push?

It shouldn't break anything. We shouldn't be pushing tasks that are
running on a rq anyway. I don't see any harm here. As this scenario can
only happen if we get an interrupt after letting go of the rq lock and
before doing the switch_to(). The schedule_tail() calls
post_schedule_rt() which does the push again, and will push task A at
that time.

That said, I'm not sure this patch is enough. I'm worried about a pull
happening. As task A is running, we could possible possibly pick it on
another CPU to do a pull.

Hmm, looking at the code, the pull already does a task_running() test,
so I guess we should be fine.

Acked-by: Steven Rostedt <rostedt@goodmis.org>

-- Steve



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched_rt: the task in irq context can be migrated during context switching
  2012-01-05 18:15   ` Steven Rostedt
@ 2012-01-05 21:45     ` Peter Zijlstra
  2012-01-10  8:13       ` Chanho Min
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2012-01-05 21:45 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Chanho Min, linux-kernel, Ingo Molnar, chanho.min

On Thu, 2012-01-05 at 13:15 -0500, Steven Rostedt wrote:
> On Thu, 2012-01-05 at 18:55 +0100, Peter Zijlstra wrote:
> 
> > So the problem is quite real, as already said we don't need to worry
> > about the future, but we might want to fix this in previous kernels.
> > What I'm not entirely sure of is the proposed solution, Steven don't we
> > get in trouble by simply bailing out on the push?
> 
> It shouldn't break anything. We shouldn't be pushing tasks that are
> running on a rq anyway. 

Its not running, but its in the middle of getting scheduled out.

> I don't see any harm here. As this scenario can
> only happen if we get an interrupt after letting go of the rq lock and
> before doing the switch_to(). The schedule_tail() calls
> post_schedule_rt() which does the push again, and will push task A at
> that time.

Right, so the post_schedule() hook will try again.

> That said, I'm not sure this patch is enough. I'm worried about a pull
> happening. As task A is running, we could possible possibly pick it on
> another CPU to do a pull.
> 
> Hmm, looking at the code, the pull already does a task_running() test,
> so I guess we should be fine.

Yeah, I'm not sure all those task_running() things make sense though,
when !->on_rq && ->on_cpu we should busy wait for tasks, not skip them.

Then again, with this WANT_INTERRUPTS_ON_CTXSW the busy wait crap is
tricky. Luckily its going the way of the Dodo very soon.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched_rt: the task in irq context can be migrated during context switching
  2012-01-05 21:45     ` Peter Zijlstra
@ 2012-01-10  8:13       ` Chanho Min
  2012-01-10  9:23         ` Peter Zijlstra
  0 siblings, 1 reply; 10+ messages in thread
From: Chanho Min @ 2012-01-10  8:13 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Steven Rostedt, linux-kernel, Ingo Molnar, chanho.min

On Fri, Jan 6, 2012 at 6:45 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, 2012-01-05 at 13:15 -0500, Steven Rostedt wrote:
>> On Thu, 2012-01-05 at 18:55 +0100, Peter Zijlstra wrote:
>>
>> > So the problem is quite real, as already said we don't need to worry
>> > about the future, but we might want to fix this in previous kernels.
>> > What I'm not entirely sure of is the proposed solution, Steven don't we
>> > get in trouble by simply bailing out on the push?
>>
>> It shouldn't break anything. We shouldn't be pushing tasks that are
>> running on a rq anyway.
>
> Its not running, but its in the middle of getting scheduled out.
>
>> I don't see any harm here. As this scenario can
>> only happen if we get an interrupt after letting go of the rq lock and
>> before doing the switch_to(). The schedule_tail() calls
>> post_schedule_rt() which does the push again, and will push task A at
>> that time.
>
> Right, so the post_schedule() hook will try again.
>
>> That said, I'm not sure this patch is enough. I'm worried about a pull
>> happening. As task A is running, we could possible possibly pick it on
>> another CPU to do a pull.
>>
>> Hmm, looking at the code, the pull already does a task_running() test,
>> so I guess we should be fine.
>
> Yeah, I'm not sure all those task_running() things make sense though,
> when !->on_rq && ->on_cpu we should busy wait for tasks, not skip them.
>
> Then again, with this WANT_INTERRUPTS_ON_CTXSW the busy wait crap is
> tricky. Luckily its going the way of the Dodo very soon.

So Will this be applied?  It will be critical and very hard to debug.
( pretty rare, sometimes system hang without any message )
It has been well tested on our ARM platform.

Thanks
Chanho

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched_rt: the task in irq context can be migrated during context switching
  2012-01-10  8:13       ` Chanho Min
@ 2012-01-10  9:23         ` Peter Zijlstra
  2012-01-10  9:42           ` Namhyung Kim
  2012-01-28  4:05           ` Chanho Min
  0 siblings, 2 replies; 10+ messages in thread
From: Peter Zijlstra @ 2012-01-10  9:23 UTC (permalink / raw)
  To: Chanho Min; +Cc: Steven Rostedt, linux-kernel, Ingo Molnar, chanho.min

On Tue, 2012-01-10 at 17:13 +0900, Chanho Min wrote:
> So Will this be applied?  It will be critical and very hard to debug.
> ( pretty rare, sometimes system hang without any message )
> It has been well tested on our ARM platform.
> 
> 
Yeah, thanks. I've queued it, should show up in -tip soonish.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched_rt: the task in irq context can be migrated during context switching
  2012-01-10  9:23         ` Peter Zijlstra
@ 2012-01-10  9:42           ` Namhyung Kim
  2012-01-28  4:05           ` Chanho Min
  1 sibling, 0 replies; 10+ messages in thread
From: Namhyung Kim @ 2012-01-10  9:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Chanho Min, Steven Rostedt, linux-kernel, Ingo Molnar, chanho.min

2012-01-10 6:23 PM, Peter Zijlstra wrote:
> On Tue, 2012-01-10 at 17:13 +0900, Chanho Min wrote:
>> So Will this be applied?  It will be critical and very hard to debug.
>> ( pretty rare, sometimes system hang without any message )
>> It has been well tested on our ARM platform.
>>
>>
> Yeah, thanks. I've queued it, should show up in -tip soonish.

It seems it should go into -stable as well, right?


Thanks,
Namhyung Kim

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched_rt: the task in irq context can be migrated during context switching
  2012-01-10  9:23         ` Peter Zijlstra
  2012-01-10  9:42           ` Namhyung Kim
@ 2012-01-28  4:05           ` Chanho Min
  1 sibling, 0 replies; 10+ messages in thread
From: Chanho Min @ 2012-01-28  4:05 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Steven Rostedt, linux-kernel, Ingo Molnar, chanho.min

On Tue, Jan 10, 2012 at 6:23 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> Yeah, thanks. I've queued it, should show up in -tip soonish.
Hi,
I haven't seen it yet in tip tree. Is there any other issue?

Thanks,

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [tip:sched/core] sched/rt: Fix task stack corruption under __ARCH_WANT_INTERRUPTS_ON_CTXSW
  2012-01-05 11:00 [PATCH] sched_rt: the task in irq context can be migrated during context switching Chanho Min
  2012-01-05 14:25 ` Peter Zijlstra
  2012-01-05 17:55 ` Peter Zijlstra
@ 2012-01-28 12:06 ` tip-bot for Chanho Min
  2 siblings, 0 replies; 10+ messages in thread
From: tip-bot for Chanho Min @ 2012-01-28 12:06 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, chanho.min, chanho0207, hpa, mingo, a.p.zijlstra,
	rostedt, stable, tglx, mingo

Commit-ID:  cb297a3e433dbdcf7ad81e0564e7b804c941ff0d
Gitweb:     http://git.kernel.org/tip/cb297a3e433dbdcf7ad81e0564e7b804c941ff0d
Author:     Chanho Min <chanho0207@gmail.com>
AuthorDate: Thu, 5 Jan 2012 20:00:19 +0900
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Fri, 27 Jan 2012 12:49:41 +0100

sched/rt: Fix task stack corruption under __ARCH_WANT_INTERRUPTS_ON_CTXSW

This issue happens under the following conditions:

 1. preemption is off
 2. __ARCH_WANT_INTERRUPTS_ON_CTXSW is defined
 3. RT scheduling class
 4. SMP system

Sequence is as follows:

 1.suppose current task is A. start schedule()
 2.task A is enqueued pushable task at the entry of schedule()
   __schedule
    prev = rq->curr;
    ...
    put_prev_task
     put_prev_task_rt
      enqueue_pushable_task
 4.pick the task B as next task.
   next = pick_next_task(rq);
 3.rq->curr set to task B and context_switch is started.
   rq->curr = next;
 4.At the entry of context_swtich, release this cpu's rq->lock.
   context_switch
    prepare_task_switch
     prepare_lock_switch
      raw_spin_unlock_irq(&rq->lock);
 5.Shortly after rq->lock is released, interrupt is occurred and start IRQ context
 6.try_to_wake_up() which called by ISR acquires rq->lock
    try_to_wake_up
     ttwu_remote
      rq = __task_rq_lock(p)
      ttwu_do_wakeup(rq, p, wake_flags);
        task_woken_rt
 7.push_rt_task picks the task A which is enqueued before.
   task_woken_rt
    push_rt_tasks(rq)
     next_task = pick_next_pushable_task(rq)
 8.At find_lock_lowest_rq(), If double_lock_balance() returns 0,
   lowest_rq can be the remote rq.
  (But,If preemption is on, double_lock_balance always return 1 and it
   does't happen.)
   push_rt_task
    find_lock_lowest_rq
     if (double_lock_balance(rq, lowest_rq))..
 9.find_lock_lowest_rq return the available rq. task A is migrated to
   the remote cpu/rq.
   push_rt_task
    ...
    deactivate_task(rq, next_task, 0);
    set_task_cpu(next_task, lowest_rq->cpu);
    activate_task(lowest_rq, next_task, 0);
 10. But, task A is on irq context at this cpu.
     So, task A is scheduled by two cpus at the same time until restore from IRQ.
     Task A's stack is corrupted.

To fix it, don't migrate an RT task if it's still running.

Signed-off-by: Chanho Min <chanho.min@lge.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/CAOAMb1BHA=5fm7KTewYyke6u-8DP0iUuJMpgQw54vNeXFsGpoQ@mail.gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/sched/rt.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 3640ebb..f42ae7f 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1587,6 +1587,11 @@ static int push_rt_task(struct rq *rq)
 	if (!next_task)
 		return 0;
 
+#ifdef __ARCH_WANT_INTERRUPTS_ON_CTXSW
+       if (unlikely(task_running(rq, next_task)))
+               return 0;
+#endif
+
 retry:
 	if (unlikely(next_task == rq->curr)) {
 		WARN_ON(1);

^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2012-01-28 12:06 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-01-05 11:00 [PATCH] sched_rt: the task in irq context can be migrated during context switching Chanho Min
2012-01-05 14:25 ` Peter Zijlstra
2012-01-05 17:55 ` Peter Zijlstra
2012-01-05 18:15   ` Steven Rostedt
2012-01-05 21:45     ` Peter Zijlstra
2012-01-10  8:13       ` Chanho Min
2012-01-10  9:23         ` Peter Zijlstra
2012-01-10  9:42           ` Namhyung Kim
2012-01-28  4:05           ` Chanho Min
2012-01-28 12:06 ` [tip:sched/core] sched/rt: Fix task stack corruption under __ARCH_WANT_INTERRUPTS_ON_CTXSW tip-bot for Chanho Min

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).