* [patch] rt, hotplug: Use set_cpus_allowed_ptr() in sync_unplug_thread()
@ 2015-03-24  7:14 Mike Galbraith
  2015-04-09 14:05 ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 6+ messages in thread
From: Mike Galbraith @ 2015-03-24  7:14 UTC (permalink / raw)
  To: LKML, linux-rt-users; +Cc: Steven Rostedt, Sebastian Andrzej Siewior


do_set_cpus_allowed() is not safe vs ->sched_class change.

crash> bt
PID: 11676  TASK: ffff88026f979da0  CPU: 22  COMMAND: "sync_unplug/22"
 #0 [ffff880274d25bc8] machine_kexec at ffffffff8103b41c
 #1 [ffff880274d25c18] crash_kexec at ffffffff810d881a
 #2 [ffff880274d25cd8] oops_end at ffffffff81525818
 #3 [ffff880274d25cf8] do_invalid_op at ffffffff81003096
 #4 [ffff880274d25d90] invalid_op at ffffffff8152d3de
    [exception RIP: set_cpus_allowed_rt+18]
    RIP: ffffffff8109e012  RSP: ffff880274d25e48  RFLAGS: 00010202
    RAX: ffffffff8109e000  RBX: ffff88026f979da0  RCX: ffff8802770cb6e8
    RDX: 0000000000000000  RSI: ffffffff81add700  RDI: ffff88026f979da0
    RBP: ffff880274d25e78   R8: ffffffff816112e0   R9: 0000000000000001
    R10: 0000000000000001  R11: 0000000000011940  R12: ffff88026f979da0
    R13: ffff8802770cb6d0  R14: ffff880274d25fd8  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #5 [ffff880274d25e60] do_set_cpus_allowed at ffffffff8108e65f
 #6 [ffff880274d25e80] sync_unplug_thread at ffffffff81058c08
 #7 [ffff880274d25ed8] kthread at ffffffff8107cad6
 #8 [ffff880274d25f50] ret_from_fork at ffffffff8152bbbc
crash> task_struct ffff88026f979da0 | grep class
  sched_class = 0xffffffff816111e0 <fair_sched_class+64>,

Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: stable-rt@vger.kernel.org
---
 kernel/cpu.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -265,7 +265,7 @@ static int sync_unplug_thread(void *data
 	 * we don't want any more work on this CPU.
 	 */
 	current->flags &= ~PF_NO_SETAFFINITY;
-	do_set_cpus_allowed(current, cpu_present_mask);
+	set_cpus_allowed_ptr(current, cpu_present_mask);
 	migrate_me();
 	return 0;
 }




* Re: [patch] rt, hotplug: Use set_cpus_allowed_ptr() in sync_unplug_thread()
  2015-03-24  7:14 [patch] rt, hotplug: Use set_cpus_allowed_ptr() in sync_unplug_thread() Mike Galbraith
@ 2015-04-09 14:05 ` Sebastian Andrzej Siewior
  2015-04-09 14:23   ` Mike Galbraith
  0 siblings, 1 reply; 6+ messages in thread
From: Sebastian Andrzej Siewior @ 2015-04-09 14:05 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: LKML, linux-rt-users, Steven Rostedt

* Mike Galbraith | 2015-03-24 08:14:49 [+0100]:

>do_set_cpus_allowed() is not safe vs ->sched_class change.
>
>crash> bt
>PID: 11676  TASK: ffff88026f979da0  CPU: 22  COMMAND: "sync_unplug/22"
> #0 [ffff880274d25bc8] machine_kexec at ffffffff8103b41c
> #1 [ffff880274d25c18] crash_kexec at ffffffff810d881a
> #2 [ffff880274d25cd8] oops_end at ffffffff81525818
> #3 [ffff880274d25cf8] do_invalid_op at ffffffff81003096
> #4 [ffff880274d25d90] invalid_op at ffffffff8152d3de
>    [exception RIP: set_cpus_allowed_rt+18]
>    RIP: ffffffff8109e012  RSP: ffff880274d25e48  RFLAGS: 00010202
>    RAX: ffffffff8109e000  RBX: ffff88026f979da0  RCX: ffff8802770cb6e8
>    RDX: 0000000000000000  RSI: ffffffff81add700  RDI: ffff88026f979da0
>    RBP: ffff880274d25e78   R8: ffffffff816112e0   R9: 0000000000000001
>    R10: 0000000000000001  R11: 0000000000011940  R12: ffff88026f979da0
>    R13: ffff8802770cb6d0  R14: ffff880274d25fd8  R15: 0000000000000000
>    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> #5 [ffff880274d25e60] do_set_cpus_allowed at ffffffff8108e65f
> #6 [ffff880274d25e80] sync_unplug_thread at ffffffff81058c08
> #7 [ffff880274d25ed8] kthread at ffffffff8107cad6
> #8 [ffff880274d25f50] ret_from_fork at ffffffff8152bbbc
>crash> task_struct ffff88026f979da0 | grep class
>  sched_class = 0xffffffff816111e0 <fair_sched_class+64>,

Is this a one-time thing or can you reproduce this?
What happened here? I doubt p vanished. +18 is most likely the
"migrate_disabled_updated()" check.

I doubt p->sched_class->set_cpus_allowed or p->sched_class vanish
between testing for it and invoking it, or did it?

>Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
>Cc: stable-rt@vger.kernel.org
>---
> kernel/cpu.c |    2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>--- a/kernel/cpu.c
>+++ b/kernel/cpu.c
>@@ -265,7 +265,7 @@ static int sync_unplug_thread(void *data
> 	 * we don't want any more work on this CPU.
> 	 */
> 	current->flags &= ~PF_NO_SETAFFINITY;
>-	do_set_cpus_allowed(current, cpu_present_mask);
>+	set_cpus_allowed_ptr(current, cpu_present_mask);
> 	migrate_me();
> 	return 0;
> }

Sebastian


* Re: [patch] rt, hotplug: Use set_cpus_allowed_ptr() in sync_unplug_thread()
  2015-04-09 14:05 ` Sebastian Andrzej Siewior
@ 2015-04-09 14:23   ` Mike Galbraith
  2015-04-09 14:54     ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 6+ messages in thread
From: Mike Galbraith @ 2015-04-09 14:23 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: LKML, linux-rt-users, Steven Rostedt

On Thu, 2015-04-09 at 16:05 +0200, Sebastian Andrzej Siewior wrote:
> * Mike Galbraith | 2015-03-24 08:14:49 [+0100]:
> 
> > do_set_cpus_allowed() is not safe vs ->sched_class change.
> > 
> > crash> bt
> > PID: 11676  TASK: ffff88026f979da0  CPU: 22  COMMAND: "sync_unplug/22"
> > #0 [ffff880274d25bc8] machine_kexec at ffffffff8103b41c
> > #1 [ffff880274d25c18] crash_kexec at ffffffff810d881a
> > #2 [ffff880274d25cd8] oops_end at ffffffff81525818
> > #3 [ffff880274d25cf8] do_invalid_op at ffffffff81003096
> > #4 [ffff880274d25d90] invalid_op at ffffffff8152d3de
> >    [exception RIP: set_cpus_allowed_rt+18]
> >    RIP: ffffffff8109e012  RSP: ffff880274d25e48  RFLAGS: 00010202
> >    RAX: ffffffff8109e000  RBX: ffff88026f979da0  RCX: ffff8802770cb6e8
> >    RDX: 0000000000000000  RSI: ffffffff81add700  RDI: ffff88026f979da0
> >    RBP: ffff880274d25e78   R8: ffffffff816112e0   R9: 0000000000000001
> >    R10: 0000000000000001  R11: 0000000000011940  R12: ffff88026f979da0
> >    R13: ffff8802770cb6d0  R14: ffff880274d25fd8  R15: 0000000000000000
> >    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> > #5 [ffff880274d25e60] do_set_cpus_allowed at ffffffff8108e65f
> > #6 [ffff880274d25e80] sync_unplug_thread at ffffffff81058c08
> > #7 [ffff880274d25ed8] kthread at ffffffff8107cad6
> > #8 [ffff880274d25f50] ret_from_fork at ffffffff8152bbbc
> > crash> task_struct ffff88026f979da0 | grep class
> >  sched_class = 0xffffffff816111e0 <fair_sched_class+64>,
> 
> Is this a one-time thing or can you reproduce this?

Well, I can't reproduce it now, having fixed it ;-)  Dunno how 
repeatable it would be if I un-fixed it.

> What happened here? I doubt p vanished. +18 is most likely the
> "migrate_disabled_updated()" check.
> 
> I doubt p->sched_class->set_cpus_allowed or p->sched_class vanish
> between testing for it and invoking it, or did it?

Class changed under us.  We saw rt task, called rt method, rt method 
said BUG_ON(!rt_task(p)), as task had become fair class.

        -Mike
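
The race described above can be modelled in userspace. What follows is a minimal sketch, not the kernel code: struct task, rt_class, fair_class, change_class() and the two set_affinity_*() helpers are hypothetical names, a pthread mutex stands in for the scheduler locks, and a return value of -1 stands in for BUG_ON(!rt_task(p)). It assumes, as is my understanding of set_cpus_allowed_ptr(), that the fixed path takes the same lock that class changes take before it picks the class method.

```c
#include <pthread.h>
#include <stddef.h>

struct task;

/* Hypothetical stand-in for the kernel's struct sched_class. */
struct sched_class {
	int (*set_cpus_allowed)(struct task *p, unsigned long mask);
};

struct task {
	pthread_mutex_t lock;            /* stands in for the scheduler locks */
	const struct sched_class *class; /* may be changed by another thread */
	unsigned long cpus_allowed;
};

static int set_cpus_allowed_rt(struct task *p, unsigned long mask);
static int set_cpus_allowed_fair(struct task *p, unsigned long mask);

static const struct sched_class rt_class = { set_cpus_allowed_rt };
static const struct sched_class fair_class = { set_cpus_allowed_fair };

/* Returns -1 where the kernel method would hit BUG_ON(!rt_task(p)). */
static int set_cpus_allowed_rt(struct task *p, unsigned long mask)
{
	if (p->class != &rt_class)
		return -1;
	p->cpus_allowed = mask;
	return 0;
}

static int set_cpus_allowed_fair(struct task *p, unsigned long mask)
{
	p->cpus_allowed = mask;
	return 0;
}

/* A class change always happens under the task's lock. */
static void change_class(struct task *p, const struct sched_class *c)
{
	pthread_mutex_lock(&p->lock);
	p->class = c;
	pthread_mutex_unlock(&p->lock);
}

static void demote_to_fair(struct task *p)
{
	change_class(p, &fair_class);
}

/*
 * Unlocked dispatch: the method is picked first and called later.
 * The optional hook replays, deterministically, a class change that
 * lands between the two steps.
 */
static int set_affinity_unlocked(struct task *p, unsigned long mask,
				 void (*between)(struct task *))
{
	const struct sched_class *seen = p->class;

	if (between)
		between(p);
	return seen->set_cpus_allowed(p, mask);
}

/* Locked dispatch: pick and call under the lock that class changes take. */
static int set_affinity_locked(struct task *p, unsigned long mask)
{
	int ret;

	pthread_mutex_lock(&p->lock);
	ret = p->class->set_cpus_allowed(p, mask);
	pthread_mutex_unlock(&p->lock);
	return ret;
}
```

Calling set_affinity_unlocked() on an rt task with demote_to_fair as the hook returns -1, because the rt method runs on a task that is no longer rt. set_affinity_locked() cannot interleave with change_class(), so the method it calls always matches the task's current class.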


* Re: [patch] rt, hotplug: Use set_cpus_allowed_ptr() in sync_unplug_thread()
  2015-04-09 14:23   ` Mike Galbraith
@ 2015-04-09 14:54     ` Sebastian Andrzej Siewior
  2015-04-09 17:40       ` Mike Galbraith
  0 siblings, 1 reply; 6+ messages in thread
From: Sebastian Andrzej Siewior @ 2015-04-09 14:54 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: LKML, linux-rt-users, Steven Rostedt

On 04/09/2015 04:23 PM, Mike Galbraith wrote:
> On Thu, 2015-04-09 at 16:05 +0200, Sebastian Andrzej Siewior wrote:
>> * Mike Galbraith | 2015-03-24 08:14:49 [+0100]:
>>
>>> do_set_cpus_allowed() is not safe vs ->sched_class change.
>>>
>>> crash> bt
>>> PID: 11676  TASK: ffff88026f979da0  CPU: 22  COMMAND: "sync_unplug/22"
>>> #0 [ffff880274d25bc8] machine_kexec at ffffffff8103b41c
>>> #1 [ffff880274d25c18] crash_kexec at ffffffff810d881a
>>> #2 [ffff880274d25cd8] oops_end at ffffffff81525818
>>> #3 [ffff880274d25cf8] do_invalid_op at ffffffff81003096
>>> #4 [ffff880274d25d90] invalid_op at ffffffff8152d3de
>>>    [exception RIP: set_cpus_allowed_rt+18]
>>>    RIP: ffffffff8109e012  RSP: ffff880274d25e48  RFLAGS: 00010202
>>>    RAX: ffffffff8109e000  RBX: ffff88026f979da0  RCX: ffff8802770cb6e8
>>>    RDX: 0000000000000000  RSI: ffffffff81add700  RDI: ffff88026f979da0
>>>    RBP: ffff880274d25e78   R8: ffffffff816112e0   R9: 0000000000000001
>>>    R10: 0000000000000001  R11: 0000000000011940  R12: ffff88026f979da0
>>>    R13: ffff8802770cb6d0  R14: ffff880274d25fd8  R15: 0000000000000000
>>>    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>>> #5 [ffff880274d25e60] do_set_cpus_allowed at ffffffff8108e65f
>>> #6 [ffff880274d25e80] sync_unplug_thread at ffffffff81058c08
>>> #7 [ffff880274d25ed8] kthread at ffffffff8107cad6
>>> #8 [ffff880274d25f50] ret_from_fork at ffffffff8152bbbc
>>> crash> task_struct ffff88026f979da0 | grep class
>>>  sched_class = 0xffffffff816111e0 <fair_sched_class+64>,
>>
>> Is this a one-time thing or can you reproduce this?
> 
> Well, I can't reproduce it now, having fixed it ;-)  Dunno how 
> repeatable it would be if I un-fixed it.
> 
>> What happened here? I doubt p vanished. +18 is most likely the
>> "migrate_disabled_updated()" check.
>>
>> I doubt p->sched_class->set_cpus_allowed or p->sched_class vanish
>> between testing for it and invoking it, or did it?
> 
> Class changed under us.  We saw rt task, called rt method, rt method 
> said BUG_ON(!rt_task(p)), as task had become fair class.

But why does the backtrace then end in do_set_cpus_allowed and not in
set_cpus_allowed_rt()? Is it possible to provide a backtrace which ends
in the BUG() statement in set_cpus_allowed_rt() if this is where it is
coming from?

>         -Mike
Sebastian



* Re: [patch] rt, hotplug: Use set_cpus_allowed_ptr() in sync_unplug_thread()
  2015-04-09 14:54     ` Sebastian Andrzej Siewior
@ 2015-04-09 17:40       ` Mike Galbraith
  2015-04-10 14:00         ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 6+ messages in thread
From: Mike Galbraith @ 2015-04-09 17:40 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: LKML, linux-rt-users, Steven Rostedt

On Thu, 2015-04-09 at 16:54 +0200, Sebastian Andrzej Siewior wrote:
> On 04/09/2015 04:23 PM, Mike Galbraith wrote:
> > On Thu, 2015-04-09 at 16:05 +0200, Sebastian Andrzej Siewior wrote:
> > > * Mike Galbraith | 2015-03-24 08:14:49 [+0100]:
> > > 
> > > > do_set_cpus_allowed() is not safe vs ->sched_class change.
> > > > 
> > > > crash> bt
> > > > PID: 11676  TASK: ffff88026f979da0  CPU: 22  COMMAND: "sync_unplug/22"
> > > > #0 [ffff880274d25bc8] machine_kexec at ffffffff8103b41c
> > > > #1 [ffff880274d25c18] crash_kexec at ffffffff810d881a
> > > > #2 [ffff880274d25cd8] oops_end at ffffffff81525818
> > > > #3 [ffff880274d25cf8] do_invalid_op at ffffffff81003096
> > > > #4 [ffff880274d25d90] invalid_op at ffffffff8152d3de
> > > >    [exception RIP: set_cpus_allowed_rt+18]
> > > >    RIP: ffffffff8109e012  RSP: ffff880274d25e48  RFLAGS: 00010202
> > > >    RAX: ffffffff8109e000  RBX: ffff88026f979da0  RCX: ffff8802770cb6e8
> > > >    RDX: 0000000000000000  RSI: ffffffff81add700  RDI: ffff88026f979da0
> > > >    RBP: ffff880274d25e78   R8: ffffffff816112e0   R9: 0000000000000001
> > > >    R10: 0000000000000001  R11: 0000000000011940  R12: ffff88026f979da0
> > > >    R13: ffff8802770cb6d0  R14: ffff880274d25fd8  R15: 0000000000000000
> > > >    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> > > > #5 [ffff880274d25e60] do_set_cpus_allowed at ffffffff8108e65f
> > > > #6 [ffff880274d25e80] sync_unplug_thread at ffffffff81058c08
> > > > #7 [ffff880274d25ed8] kthread at ffffffff8107cad6
> > > > #8 [ffff880274d25f50] ret_from_fork at ffffffff8152bbbc
> > > > crash> task_struct ffff88026f979da0 | grep class
> > > >  sched_class = 0xffffffff816111e0 <fair_sched_class+64>,
> > > 
> > > Is this a one-time thing or can you reproduce this?
> > 
> > Well, I can't reproduce it now, having fixed it ;-)  Dunno how 
> > repeatable it would be if I un-fixed it.
> > 
> > > What happened here? I doubt p vanished. +18 is most likely the
> > > "migrate_disabled_updated()" check.
> > > 
> > > I doubt p->sched_class->set_cpus_allowed or p->sched_class vanish
> > > between testing for it and invoking it, or did it?
> > 
> > Class changed under us.  We saw rt task, called rt method, rt method
> > said BUG_ON(!rt_task(p)), as task had become fair class.
> 
> but why does backtrace then end in do_set_cpus_allowed and not in
> set_cpus_allowed_rt()? Is it possible to provide a backtrace which ends
> in the BUG() statement in set_cpus_allowed_rt() if this is where it is
> coming from?

[exception RIP: set_cpus_allowed_rt+18] is BUG_ON(!rt_task(p)).

        -Mike





* Re: [patch] rt, hotplug: Use set_cpus_allowed_ptr() in sync_unplug_thread()
  2015-04-09 17:40       ` Mike Galbraith
@ 2015-04-10 14:00         ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 6+ messages in thread
From: Sebastian Andrzej Siewior @ 2015-04-10 14:00 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: LKML, linux-rt-users, Steven Rostedt

* Mike Galbraith | 2015-04-09 19:40:01 [+0200]:

>> in the BUG() statement in set_cpus_allowed_rt() if this is where it is
>> coming from?
>
>[exception RIP: set_cpus_allowed_rt+18] is BUG_ON(!rt_task(p)).

Oh sorry, I missed the _rt. Now it makes sense.

>
>        -Mike

Sebastian
