linux-kernel.vger.kernel.org archive mirror
* [RFC BUG] There is a potential bug in "yield_to"
@ 2012-07-05  5:31 Michael Wang
  2012-07-05  8:35 ` Peter Zijlstra
  0 siblings, 1 reply; 3+ messages in thread
From: Michael Wang @ 2012-07-05  5:31 UTC (permalink / raw)
  To: LKML; +Cc: Ingo Molnar, Peter Zijlstra

Hi, All

I found there may be a potential bug in "yield_to":

        local_irq_save(flags);
        rq = this_rq();

again:	

// the task's rq may already have been changed by "sched_move_task"

        p_rq = task_rq(p);
        double_rq_lock(rq, p_rq);
        while (task_rq(p) != p_rq) {
                double_rq_unlock(rq, p_rq);
                goto again;
        }

I think it may happen in this scenario:

	cpu 0				cpu 1(task a)

					yield_to {
					disable_irq
	sched_move_task {		rq = this_rq();
	task_rq_lock(task a)		double_rq_lock

	hold lock of rq 1			
	set_task_rq			//task rq changed
	release lock of rq 1

					hold lock of rq 1
					but task a is no longer
					there

					set rq 1's current as
					skip, which is not task a

This means we hold an rq's lock, but its current task is not the one that
should yield.

Only "sched_move_task" can cause this issue, since it moves a task that is
still running.

The bug would make the task that wants to yield fail to set the skip buddy
to itself and set it to an innocent task instead. That is not very harmful
and almost impossible to hit in practice, but should we fix it with another
check, "rq == this_rq()"?

Regards,
Michael Wang


* Re: [RFC BUG] There is a potential bug in "yield_to"
  2012-07-05  5:31 [RFC BUG] There is a potential bug in "yield_to" Michael Wang
@ 2012-07-05  8:35 ` Peter Zijlstra
  2012-07-05 23:50   ` Michael Wang
  0 siblings, 1 reply; 3+ messages in thread
From: Peter Zijlstra @ 2012-07-05  8:35 UTC (permalink / raw)
  To: Michael Wang; +Cc: LKML, Ingo Molnar

On Thu, 2012-07-05 at 13:31 +0800, Michael Wang wrote:
> The bug would make the task that wants to yield fail to set the skip buddy
> to itself and set it to an innocent task instead. That is not very harmful
> and almost impossible to hit in practice, but should we fix it with another
> check, "rq == this_rq()"?

Uhm, what?!

We've got interrupts disabled, this_rq() cannot ever possibly change, so
rq is always correct.

Only p_rq can change, and we have an again loop on that, so what's the
problem again?
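
For readers following along, the "again loop" on p_rq can be sketched in
userspace C. This is only an illustration under stated assumptions, not the
kernel code: pthread mutexes stand in for rq locks, and struct rq, struct
task, double_lock(), double_unlock(), lock_task_rq_pair(), race_hook and
migrate_once() are all hypothetical names. It shows the two ideas in play:
both locks are taken in a fixed (address) order, and the unlocked sample of
the task's rq is rechecked once the locks are held, retrying if it went stale.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's struct rq and task_struct. */
struct rq {
	pthread_mutex_t lock;
	int cpu;
};

struct task {
	struct rq *_Atomic rq;	/* run queue the task currently sits on */
};

/* Take both locks in address order so two callers cannot deadlock. */
static void double_lock(struct rq *a, struct rq *b)
{
	if (a == b) {
		pthread_mutex_lock(&a->lock);
	} else if ((uintptr_t)a < (uintptr_t)b) {
		pthread_mutex_lock(&a->lock);
		pthread_mutex_lock(&b->lock);
	} else {
		pthread_mutex_lock(&b->lock);
		pthread_mutex_lock(&a->lock);
	}
}

static void double_unlock(struct rq *a, struct rq *b)
{
	pthread_mutex_unlock(&a->lock);
	if (a != b)
		pthread_mutex_unlock(&b->lock);
}

/* Test-only hook, run between sampling p->rq and taking the locks. */
static void (*race_hook)(struct task *p);

/*
 * Lock this_rq and p's rq. Returns p's rq with both locks held.
 * *retries counts how often the unlocked sample turned out stale.
 */
static struct rq *lock_task_rq_pair(struct rq *this_rq, struct task *p,
				    int *retries)
{
	struct rq *p_rq;

	assert(this_rq && p && retries);
	for (;;) {
		p_rq = atomic_load(&p->rq);	/* unlocked sample */
		if (race_hook)
			race_hook(p);		/* window where p may move */
		double_lock(this_rq, p_rq);
		if (atomic_load(&p->rq) == p_rq)
			return p_rq;		/* sample still valid */
		double_unlock(this_rq, p_rq);	/* stale: drop and retry */
		(*retries)++;
	}
}

/* Demo hook: move p to migrate_target exactly once, under the old lock. */
static struct rq *migrate_target;

static void migrate_once(struct task *p)
{
	struct rq *old = atomic_load(&p->rq);

	if (!migrate_target)
		return;
	pthread_mutex_lock(&old->lock);
	atomic_store(&p->rq, migrate_target);
	pthread_mutex_unlock(&old->lock);
	migrate_target = NULL;
}
```

Setting race_hook to migrate_once forces one migration inside the window, so
the first attempt locks a stale rq, notices the mismatch, and retries. The
caller's own rq is never re-sampled in this sketch, mirroring the point that
rq stays valid while interrupts are off; userspace cannot model that part.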


* Re: [RFC BUG] There is a potential bug in "yield_to"
  2012-07-05  8:35 ` Peter Zijlstra
@ 2012-07-05 23:50   ` Michael Wang
  0 siblings, 0 replies; 3+ messages in thread
From: Michael Wang @ 2012-07-05 23:50 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: LKML, Ingo Molnar

On 07/05/2012 04:35 PM, Peter Zijlstra wrote:
> 
> Uhm, what?!
> 
> We've got interrupts disabled, this_rq() cannot ever possibly change, so
> rq is always correct.
> 
I knew I must have missed something: no reschedule can happen until IRQs
are enabled again later, so even if that scenario occurs, nothing changes
on rq.

Thanks for the explanation :)

Regards,
Michael Wang

> Only p_rq can change, and we have an again loop on that, so what's the
> problem again?
> 


