* sched_setscheduler() vs idle_balance() race
@ 2015-05-28  7:43 Mike Galbraith
  2015-05-28 11:51 ` Peter Zijlstra
  2015-05-28 13:53 ` Peter Zijlstra
  0 siblings, 2 replies; 18+ messages in thread
From: Mike Galbraith @ 2015-05-28  7:43 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: LKML, Ingo Molnar

Hi Peter,

I'm not seeing what prevents pull_task() from yanking a task out from
under __sched_setscheduler().  A box sprinkling smoldering 3.0 kernel
wreckage all over my bugzilla mbox isn't seeing it either ;-)

Scenario: rt task forks, wakes child to CPU foo, immediately tries to
change child to fair class, calls switched_from_rt(), that leads to
pull_rt_task() -> double_lock_balance() which momentarily drops child's
rq->lock, letting some prick doing idle balancing over on CPU bar in to
migrate the child.  Rt parent then calls switched_to_fair(), and box
explodes when we use the passed rq as if the child still lived there.
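
Roughly, the interleaving as I see it (mainline names, abridged; the old
tree differs in detail):

  rt parent: sched_setscheduler(child)        CPU bar: idle_balance()
    lock child's rq (CPU foo)
    dequeue child, __setscheduler(), enqueue as fair
    check_class_changed()
      switched_from_rt()
        pull_rt_task()
          double_lock_balance()  <-- momentarily drops foo's rq->lock
                                               locks foo + bar
                                               can_migrate_task(child) says ok
                                               pulls child: foo -> bar
          retakes foo's rq->lock
      switched_to_fair(rq == foo, child)  <-- child now lives on bar, boom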

I sent a patchlet to verify that the diagnosis is really really correct
(can_migrate_task() says no if ->pi_lock is held), but I think it is,
the 8x10 color glossy with circles and arrows clearly shows both tasks
with their grubby mitts on that child at the same time, each thinking it
has that child locked down tight.

Not seeing what should prevent that in mainline either, I'll just ask
while I wait to (hopefully) hear "yup, all better".

	-Mike



* Re: sched_setscheduler() vs idle_balance() race
  2015-05-28  7:43 sched_setscheduler() vs idle_balance() race Mike Galbraith
@ 2015-05-28 11:51 ` Peter Zijlstra
  2015-05-28 12:04   ` Mike Galbraith
  2015-05-28 13:53 ` Peter Zijlstra
  1 sibling, 1 reply; 18+ messages in thread
From: Peter Zijlstra @ 2015-05-28 11:51 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: LKML, Ingo Molnar, ktkhai

On Thu, May 28, 2015 at 09:43:52AM +0200, Mike Galbraith wrote:
> Hi Peter,
> 
> I'm not seeing what prevents pull_task() from yanking a task out from
> under __sched_setscheduler().  A box sprinkling smoldering 3.0 kernel
> wreckage all over my bugzilla mbox isn't seeing it either ;-)
> 
> Scenario: rt task forks, wakes child to CPU foo, immediately tries to
> change child to fair class, calls switched_from_rt(), that leads to
> pull_rt_task() -> double_lock_balance() which momentarily drops child's
> rq->lock, letting some prick doing idle balancing over on CPU bar in to
> migrate the child.  Rt parent then calls switched_to_fair(), and box
> explodes when we use the passed rq as if the child still lived there.
> 
> I sent a patchlet to verify that the diagnosis is really really correct
> (can_migrate_task() says no if ->pi_lock is held), but I think it is,
> the 8x10 color glossy with circles and arrows clearly shows both tasks
> with their grubby mitts on that child at the same time, each thinking it
> has that child locked down tight.
> 
> Not seeing what should prevent that in mainline either, I'll just ask
> while I wait to (hopefully) hear "yup, all better".

The last patch to come close is 67dfa1b756f2 ("sched/deadline: Implement
cancel_dl_timer() to use in switched_from_dl()")

Which places the comment /* Possible rq-lock hole */ between
switched_from() and switched_to().

Which is exactly the hole you mean, right?

And that commit talks about how all that is 'safe' because all scheduler
operations take ->pi_lock, which is true, except for load-balancing,
which only uses rq->lock.

Furthermore, we call check_class_changed() _after_ we enqueue the task
on the new class, so balancing can indeed occur.
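
That is, the current ordering in __sched_setscheduler() is roughly
(abridged):

	queued = task_on_rq_queued(p);
	running = task_current(rq, p);
	if (queued)
		dequeue_task(rq, p, 0);
	if (running)
		put_prev_task(rq, p);

	prev_class = p->sched_class;
	__setscheduler(rq, p, attr, true);

	if (running)
		p->sched_class->set_curr_task(rq);
	if (queued)
		enqueue_task(rq, p, oldprio <= p->prio ? ENQUEUE_HEAD : 0);

	check_class_changed(rq, p, prev_class, oldprio); /* may drop rq->lock */
	task_rq_unlock(rq, p, &flags);

so by the time switched_from()/switched_to() get to drop rq->lock, @p is
already sitting on its new runqueue where the load-balancer can see it.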

Lemme go stare at this; ideally we'd call check_class_changed() at
__setscheduler() time where the task is off all rqs, but I suspect
there's 'obvious' problems with that..


* Re: sched_setscheduler() vs idle_balance() race
  2015-05-28 11:51 ` Peter Zijlstra
@ 2015-05-28 12:04   ` Mike Galbraith
  2015-05-28 12:06     ` Peter Zijlstra
  0 siblings, 1 reply; 18+ messages in thread
From: Mike Galbraith @ 2015-05-28 12:04 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: LKML, Ingo Molnar, ktkhai

On Thu, 2015-05-28 at 13:51 +0200, Peter Zijlstra wrote:
> On Thu, May 28, 2015 at 09:43:52AM +0200, Mike Galbraith wrote:
> > Hi Peter,
> > 
> > I'm not seeing what prevents pull_task() from yanking a task out from
> > under __sched_setscheduler().  A box sprinkling smoldering 3.0 kernel
> > wreckage all over my bugzilla mbox isn't seeing it either ;-)
> > 
> > Scenario: rt task forks, wakes child to CPU foo, immediately tries to
> > change child to fair class, calls switched_from_rt(), that leads to
> > pull_rt_task() -> double_lock_balance() which momentarily drops child's
> > rq->lock, letting some prick doing idle balancing over on CPU bar in to
> > migrate the child.  Rt parent then calls switched_to_fair(), and box
> > explodes when we use the passed rq as if the child still lived there.
> > 
> > I sent a patchlet to verify that the diagnosis is really really correct
> > (can_migrate_task() says no if ->pi_lock is held), but I think it is,
> > the 8x10 color glossy with circles and arrows clearly shows both tasks
> > with their grubby mitts on that child at the same time, each thinking it
> > has that child locked down tight.
> > 
> > Not seeing what should prevent that in mainline either, I'll just ask
> > while I wait to (hopefully) hear "yup, all better".
> 
> The last patch to come close is 67dfa1b756f2 ("sched/deadline: Implement
> cancel_dl_timer() to use in switched_from_dl()")
> 
> Which places the comment /* Possible rq-lock hole */ between
> switched_from() and switched_to().
> 
> Which is exactly the hole you mean, right?

Yeah, but that hole is way older than dl.  Box falling into it is
running SLE11, which is.. well, still somewhat resembles 3.0.

> And that commit talks about how all that is 'safe' because all scheduler
> operations take ->pi_lock, which is true, except for load-balancing,
> which only uses rq->lock.

Yes.  The child's CPU scheduled, so the child was no longer ->curr, making
it eligible given !hot or too many failed attempts.
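
For reference, the gist of can_migrate_task(), heavily abridged and from
memory (details may well differ per tree):

	if (!cpumask_test_cpu(env->dst_cpu, tsk_cpus_allowed(p)))
		return 0;	/* not allowed on the destination */
	if (task_running(env->src_rq, p))
		return 0;	/* still ->curr, hands off */
	if (!task_hot(p, env) ||
	    env->sd->nr_balance_failed > env->sd->cache_nice_tries)
		return 1;	/* !hot, or we failed often enough: pull it */
	return 0;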

Oh, btw, we pull tasks that are about to schedule off too.  While first
trying to figure out wth was going on, I sprinkled some checks, and "NOPE,
why bother" was the only one that appeared, quite a lot.

> Furthermore, we call check_class_changed() _after_ we enqueue the task
> on the new class, so balancing can indeed occur.
> 
> Lemme go stare at this; ideally we'd call check_class_changed() at
> __setscheduler() time where the task is off all rqs, but I suspect
> there's 'obvious' problems with that..

	-Mike



* Re: sched_setscheduler() vs idle_balance() race
  2015-05-28 12:04   ` Mike Galbraith
@ 2015-05-28 12:06     ` Peter Zijlstra
  2015-05-28 12:32       ` Mike Galbraith
  0 siblings, 1 reply; 18+ messages in thread
From: Peter Zijlstra @ 2015-05-28 12:06 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: LKML, Ingo Molnar, ktkhai

On Thu, May 28, 2015 at 02:04:21PM +0200, Mike Galbraith wrote:
> On Thu, 2015-05-28 at 13:51 +0200, Peter Zijlstra wrote:

> > Which is exactly the hole you mean, right?
> 
> Yeah, but that hole is way older than dl.  Box falling into it is
> running SLE11, which is.. well, still somewhat resembles 3.0.

Oh sure, that patch just recognised there was a hole there and put a
comment in. We further failed to spot the failure you just found.

Right mess though, I think I already found some of those 'obvious'
reasons mentioned in the previous email.


* Re: sched_setscheduler() vs idle_balance() race
  2015-05-28 12:06     ` Peter Zijlstra
@ 2015-05-28 12:32       ` Mike Galbraith
  0 siblings, 0 replies; 18+ messages in thread
From: Mike Galbraith @ 2015-05-28 12:32 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: LKML, Ingo Molnar, ktkhai

On Thu, 2015-05-28 at 14:06 +0200, Peter Zijlstra wrote:
> On Thu, May 28, 2015 at 02:04:21PM +0200, Mike Galbraith wrote:
> > On Thu, 2015-05-28 at 13:51 +0200, Peter Zijlstra wrote:
> 
> > > Which is exactly the hole you mean, right?
> > 
> > Yeah, but that hole is way older than dl.  Box falling into it is
> > running SLE11, which is.. well, still somewhat resembles 3.0.
> 
> Oh sure, that patch just recognised there was a hole there and put a
> comment in. We further failed to spot the failure you just found.

Happens.  Even with a perfect trap set and triggered, evidence right
under my nose, I'm so used to that code that I didn't see the obvious
until I took a second look.. and had that 'you dumbass' moment ;-)

	-Mike



* Re: sched_setscheduler() vs idle_balance() race
  2015-05-28  7:43 sched_setscheduler() vs idle_balance() race Mike Galbraith
  2015-05-28 11:51 ` Peter Zijlstra
@ 2015-05-28 13:53 ` Peter Zijlstra
  2015-05-28 14:54   ` Mike Galbraith
                     ` (2 more replies)
  1 sibling, 3 replies; 18+ messages in thread
From: Peter Zijlstra @ 2015-05-28 13:53 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: LKML, Ingo Molnar, ktkhai

On Thu, May 28, 2015 at 09:43:52AM +0200, Mike Galbraith wrote:
> Hi Peter,
> 
> I'm not seeing what prevents pull_task() from yanking a task out from
> under __sched_setscheduler().  A box sprinkling smoldering 3.0 kernel
> wreckage all over my bugzilla mbox isn't seeing it either ;-)

Say, how easy can that thing be reproduced?

The below is compile tested only, but it might just work if I didn't
miss anything :-)


---
 kernel/sched/core.c | 137 +++++++++++++++++++++++++++++-----------------------
 1 file changed, 77 insertions(+), 60 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4eec60757b16..28f1ddc0bef2 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1000,22 +1000,6 @@ inline int task_curr(const struct task_struct *p)
 	return cpu_curr(task_cpu(p)) == p;
 }
 
-/*
- * Can drop rq->lock because from sched_class::switched_from() methods drop it.
- */
-static inline void check_class_changed(struct rq *rq, struct task_struct *p,
-				       const struct sched_class *prev_class,
-				       int oldprio)
-{
-	if (prev_class != p->sched_class) {
-		if (prev_class->switched_from)
-			prev_class->switched_from(rq, p);
-		/* Possble rq->lock 'hole'.  */
-		p->sched_class->switched_to(rq, p);
-	} else if (oldprio != p->prio || dl_task(p))
-		p->sched_class->prio_changed(rq, p, oldprio);
-}
-
 void check_preempt_curr(struct rq *rq, struct task_struct *p, int flags)
 {
 	const struct sched_class *class;
@@ -3075,12 +3059,38 @@ void rt_mutex_setprio(struct task_struct *p, int prio)
 
 	p->prio = prio;
 
+	if (prev_class != p->sched_class) {
+		prev_class->switched_from(rq, p);
+		/*
+		 * switched_from() is allowed to drop @rq->lock; which opens a
+		 * race against load-balancing, however since @p is not
+		 * currently enqueued it is invisible to the load-balancer.
+		 *
+		 * double check @p is still where we thought it was.
+		 */
+		WARN_ON_ONCE(task_rq(p) != rq);
+	}
+
 	if (running)
 		p->sched_class->set_curr_task(rq);
 	if (queued)
 		enqueue_task(rq, p, enqueue_flag);
 
-	check_class_changed(rq, p, prev_class, oldprio);
+	/*
+	 * Both switched_to() and prio_changed() are allowed to drop @rq->lock;
+	 * which opens a race against load-balancing, and since @p is now
+	 * enqueued it can indeed be subject to this.
+	 *
+	 * This means that any balancing done by these functions must double
+	 * check a task's rq.
+	 */
+	if (prev_class != p->sched_class)
+		p->sched_class->switched_to(rq, p);
+	else if (oldprio != p->prio || dl_task(p))
+		p->sched_class->prio_changed(rq, p, oldprio);
+	/*
+	 * It further means we should not rely on @p's rq from here on.
+	 */
 out_unlock:
 	__task_rq_unlock(rq);
 }
@@ -3420,7 +3430,7 @@ static bool dl_param_changed(struct task_struct *p,
 
 static int __sched_setscheduler(struct task_struct *p,
 				const struct sched_attr *attr,
-				bool user)
+				bool user, bool pi)
 {
 	int newprio = dl_policy(attr->sched_policy) ? MAX_DL_PRIO - 1 :
 		      MAX_RT_PRIO - 1 - attr->sched_priority;
@@ -3606,18 +3616,20 @@ static int __sched_setscheduler(struct task_struct *p,
 	p->sched_reset_on_fork = reset_on_fork;
 	oldprio = p->prio;
 
-	/*
-	 * Take priority boosted tasks into account. If the new
-	 * effective priority is unchanged, we just store the new
-	 * normal parameters and do not touch the scheduler class and
-	 * the runqueue. This will be done when the task deboost
-	 * itself.
-	 */
-	new_effective_prio = rt_mutex_get_effective_prio(p, newprio);
-	if (new_effective_prio == oldprio) {
-		__setscheduler_params(p, attr);
-		task_rq_unlock(rq, p, &flags);
-		return 0;
+	if (pi) {
+		/*
+		 * Take priority boosted tasks into account. If the new
+		 * effective priority is unchanged, we just store the new
+		 * normal parameters and do not touch the scheduler class and
+		 * the runqueue. This will be done when the task deboost
+		 * itself.
+		 */
+		new_effective_prio = rt_mutex_get_effective_prio(p, newprio);
+		if (new_effective_prio == oldprio) {
+			__setscheduler_params(p, attr);
+			task_rq_unlock(rq, p, &flags);
+			return 0;
+		}
 	}
 
 	queued = task_on_rq_queued(p);
@@ -3628,7 +3640,19 @@ static int __sched_setscheduler(struct task_struct *p,
 		put_prev_task(rq, p);
 
 	prev_class = p->sched_class;
-	__setscheduler(rq, p, attr, true);
+	__setscheduler(rq, p, attr, pi);
+
+	if (prev_class != p->sched_class) {
+		prev_class->switched_from(rq, p);
+		/*
+		 * switched_from() is allowed to drop @rq->lock; which opens a
+		 * race against load-balancing, however since @p is not
+		 * currently enqueued it is invisible to the load-balancer.
+		 *
+		 * double check @p is still where we thought it was.
+		 */
+		WARN_ON_ONCE(task_rq(p) != rq);
+	}
 
 	if (running)
 		p->sched_class->set_curr_task(rq);
@@ -3640,10 +3664,25 @@ static int __sched_setscheduler(struct task_struct *p,
 		enqueue_task(rq, p, oldprio <= p->prio ? ENQUEUE_HEAD : 0);
 	}
 
-	check_class_changed(rq, p, prev_class, oldprio);
+	/*
+	 * Both switched_to() and prio_changed() are allowed to drop @rq->lock;
+	 * which opens a race against load-balancing, and since @p is now
+	 * enqueued it can indeed be subject to this.
+	 *
+	 * This means that any balancing done by these functions must double
+	 * check a task's rq.
+	 */
+	if (prev_class != p->sched_class)
+		p->sched_class->switched_to(rq, p);
+	else if (oldprio != p->prio || dl_task(p))
+		p->sched_class->prio_changed(rq, p, oldprio);
+	/*
+	 * It further means we should not rely on @p's rq from here on.
+	 */
 	task_rq_unlock(rq, p, &flags);
 
-	rt_mutex_adjust_pi(p);
+	if (pi)
+		rt_mutex_adjust_pi(p);
 
 	return 0;
 }
@@ -3664,7 +3703,7 @@ static int _sched_setscheduler(struct task_struct *p, int policy,
 		attr.sched_policy = policy;
 	}
 
-	return __sched_setscheduler(p, &attr, check);
+	return __sched_setscheduler(p, &attr, check, true);
 }
 /**
  * sched_setscheduler - change the scheduling policy and/or RT priority of a thread.
@@ -3685,7 +3724,7 @@ EXPORT_SYMBOL_GPL(sched_setscheduler);
 
 int sched_setattr(struct task_struct *p, const struct sched_attr *attr)
 {
-	return __sched_setscheduler(p, attr, true);
+	return __sched_setscheduler(p, attr, true, true);
 }
 EXPORT_SYMBOL_GPL(sched_setattr);
 
@@ -7346,32 +7385,12 @@ EXPORT_SYMBOL(___might_sleep);
 #endif
 
 #ifdef CONFIG_MAGIC_SYSRQ
-static void normalize_task(struct rq *rq, struct task_struct *p)
+void normalize_rt_tasks(void)
 {
-	const struct sched_class *prev_class = p->sched_class;
+	struct task_struct *g, *p;
 	struct sched_attr attr = {
 		.sched_policy = SCHED_NORMAL,
 	};
-	int old_prio = p->prio;
-	int queued;
-
-	queued = task_on_rq_queued(p);
-	if (queued)
-		dequeue_task(rq, p, 0);
-	__setscheduler(rq, p, &attr, false);
-	if (queued) {
-		enqueue_task(rq, p, 0);
-		resched_curr(rq);
-	}
-
-	check_class_changed(rq, p, prev_class, old_prio);
-}
-
-void normalize_rt_tasks(void)
-{
-	struct task_struct *g, *p;
-	unsigned long flags;
-	struct rq *rq;
 
 	read_lock(&tasklist_lock);
 	for_each_process_thread(g, p) {
@@ -7398,9 +7417,7 @@ void normalize_rt_tasks(void)
 			continue;
 		}
 
-		rq = task_rq_lock(p, &flags);
-		normalize_task(rq, p);
-		task_rq_unlock(rq, p, &flags);
+		__sched_setscheduler(p, &attr, false, false);
 	}
 	read_unlock(&tasklist_lock);
 }


* Re: sched_setscheduler() vs idle_balance() race
  2015-05-28 13:53 ` Peter Zijlstra
@ 2015-05-28 14:54   ` Mike Galbraith
  2015-05-28 15:24     ` Peter Zijlstra
  2015-05-28 16:59   ` Kirill Tkhai
  2015-05-30 13:08   ` Mike Galbraith
  2 siblings, 1 reply; 18+ messages in thread
From: Mike Galbraith @ 2015-05-28 14:54 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: LKML, Ingo Molnar, ktkhai

On Thu, 2015-05-28 at 15:53 +0200, Peter Zijlstra wrote:

> Say, how easy can that thing be reproduced?

It doesn't seem to take the reporter very long to blow their box up.
What they're doing must be pretty darn uncommon though.

I have the source to a test application, no destructions to go with it
though, and little time to play around.  It has to be installed and
whatnot, it's not the desired make-it-go-boom.c.

> The below is compile tested only, but it might just work if I didn't
> miss anything :-)

I'll take it for a spin, and take a peek at the application.

	-Mike



* Re: sched_setscheduler() vs idle_balance() race
  2015-05-28 14:54   ` Mike Galbraith
@ 2015-05-28 15:24     ` Peter Zijlstra
  2015-05-29 18:30       ` Mike Galbraith
  0 siblings, 1 reply; 18+ messages in thread
From: Peter Zijlstra @ 2015-05-28 15:24 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: LKML, Ingo Molnar, ktkhai

On Thu, May 28, 2015 at 04:54:26PM +0200, Mike Galbraith wrote:

> > The below is compile tested only, but it might just work if I didn't
> > miss anything :-)
> 
> I'll take it for a spin, and take a peek at the application.

Thanks!


* Re: sched_setscheduler() vs idle_balance() race
  2015-05-28 13:53 ` Peter Zijlstra
  2015-05-28 14:54   ` Mike Galbraith
@ 2015-05-28 16:59   ` Kirill Tkhai
  2015-05-30 13:08   ` Mike Galbraith
  2 siblings, 0 replies; 18+ messages in thread
From: Kirill Tkhai @ 2015-05-28 16:59 UTC (permalink / raw)
  To: Peter Zijlstra, Mike Galbraith; +Cc: LKML, Ingo Molnar, ktkhai

On Thu, 28/05/2015 at 15:53 +0200, Peter Zijlstra wrote:
> On Thu, May 28, 2015 at 09:43:52AM +0200, Mike Galbraith wrote:
> > Hi Peter,
> > 
> > I'm not seeing what prevents pull_task() from yanking a task out from
> > under __sched_setscheduler().  A box sprinkling smoldering 3.0 kernel
> > wreckage all over my bugzilla mbox isn't seeing it either ;-)
> 
> Say, how easy can that thing be reproduced?
> 
> The below is compile tested only, but it might just work if I didn't
> miss anything :-)
> 
> 
> ---
>  kernel/sched/core.c | 137 +++++++++++++++++++++++++++++-----------------------
>  1 file changed, 77 insertions(+), 60 deletions(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 4eec60757b16..28f1ddc0bef2 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1000,22 +1000,6 @@ inline int task_curr(const struct task_struct *p)
>  	return cpu_curr(task_cpu(p)) == p;
>  }
>  
> -/*
> - * Can drop rq->lock because from sched_class::switched_from() methods drop it.
> - */
> -static inline void check_class_changed(struct rq *rq, struct task_struct *p,
> -				       const struct sched_class *prev_class,
> -				       int oldprio)
> -{
> -	if (prev_class != p->sched_class) {
> -		if (prev_class->switched_from)
> -			prev_class->switched_from(rq, p);
> -		/* Possble rq->lock 'hole'.  */
> -		p->sched_class->switched_to(rq, p);
> -	} else if (oldprio != p->prio || dl_task(p))
> -		p->sched_class->prio_changed(rq, p, oldprio);
> -}
> -
>  void check_preempt_curr(struct rq *rq, struct task_struct *p, int flags)
>  {
>  	const struct sched_class *class;
> @@ -3075,12 +3059,38 @@ void rt_mutex_setprio(struct task_struct *p, int prio)
>  
>  	p->prio = prio;
>  
> +	if (prev_class != p->sched_class) {
> +		prev_class->switched_from(rq, p);
>
 
switched_from_XXX() doesn't depend on on_YYY_rq state, so, yeah, this should be safe.

> +		/*
> +		 * switched_from() is allowed to drop @rq->lock; which opens a
> +		 * race against load-balancing, however since @p is not
> +		 * currently enqueued it is invisible to the load-balancer.
> +		 *
> +		 * double check @p is still where we thought it was.
> +		 */
> +		WARN_ON_ONCE(task_rq(p) != rq);
> +	}
> +
>  	if (running)
>  		p->sched_class->set_curr_task(rq);
>  	if (queued)
>  		enqueue_task(rq, p, enqueue_flag);
>  
> -	check_class_changed(rq, p, prev_class, oldprio);
> +	/*
> +	 * Both switched_to() and prio_changed() are allowed to drop @rq->lock;
> +	 * which opens a race against load-balancing, and since @p is now
> +	 * enqueued it can indeed be subject to this.
> +	 *
> +	 * This means that any balancing done by these functions must double
> +	 * check a task's rq.
> +	 */
> +	if (prev_class != p->sched_class)
> +		p->sched_class->switched_to(rq, p);
> +	else if (oldprio != p->prio || dl_task(p))
> +		p->sched_class->prio_changed(rq, p, oldprio);
> +	/*
> +	 * It further means we should not rely on @p's rq from here on.
> +	 */
>  out_unlock:
>  	__task_rq_unlock(rq);
>  }
> @@ -3420,7 +3430,7 @@ static bool dl_param_changed(struct task_struct *p,
>  
>  static int __sched_setscheduler(struct task_struct *p,
>  				const struct sched_attr *attr,
> -				bool user)
> +				bool user, bool pi)
>  {
>  	int newprio = dl_policy(attr->sched_policy) ? MAX_DL_PRIO - 1 :
>  		      MAX_RT_PRIO - 1 - attr->sched_priority;
> @@ -3606,18 +3616,20 @@ static int __sched_setscheduler(struct task_struct *p,
>  	p->sched_reset_on_fork = reset_on_fork;
>  	oldprio = p->prio;
>  
> -	/*
> -	 * Take priority boosted tasks into account. If the new
> -	 * effective priority is unchanged, we just store the new
> -	 * normal parameters and do not touch the scheduler class and
> -	 * the runqueue. This will be done when the task deboost
> -	 * itself.
> -	 */
> -	new_effective_prio = rt_mutex_get_effective_prio(p, newprio);
> -	if (new_effective_prio == oldprio) {
> -		__setscheduler_params(p, attr);
> -		task_rq_unlock(rq, p, &flags);
> -		return 0;
> +	if (pi) {
> +		/*
> +		 * Take priority boosted tasks into account. If the new
> +		 * effective priority is unchanged, we just store the new
> +		 * normal parameters and do not touch the scheduler class and
> +		 * the runqueue. This will be done when the task deboost
> +		 * itself.
> +		 */
> +		new_effective_prio = rt_mutex_get_effective_prio(p, newprio);
> +		if (new_effective_prio == oldprio) {
> +			__setscheduler_params(p, attr);
> +			task_rq_unlock(rq, p, &flags);
> +			return 0;
> +		}
>  	}
>  
>  	queued = task_on_rq_queued(p);
> @@ -3628,7 +3640,19 @@ static int __sched_setscheduler(struct task_struct *p,
>  		put_prev_task(rq, p);
>  
>  	prev_class = p->sched_class;
> -	__setscheduler(rq, p, attr, true);
> +	__setscheduler(rq, p, attr, pi);
> +
> +	if (prev_class != p->sched_class) {
> +		prev_class->switched_from(rq, p);
> +		/*
> +		 * switched_from() is allowed to drop @rq->lock; which opens a
> +		 * race against load-balancing, however since @p is not
> +		 * currently enqueued it is invisible to the load-balancer.
> +		 *
> +		 * double check @p is still where we thought it was.
> +		 */
> +		WARN_ON_ONCE(task_rq(p) != rq);
> +	}
>  
>  	if (running)
>  		p->sched_class->set_curr_task(rq);
> @@ -3640,10 +3664,25 @@ static int __sched_setscheduler(struct task_struct *p,
>  		enqueue_task(rq, p, oldprio <= p->prio ? ENQUEUE_HEAD : 0);
>  	}
>  
> -	check_class_changed(rq, p, prev_class, oldprio);
> +	/*
> +	 * Both switched_to() and prio_changed() are allowed to drop @rq->lock;
> +	 * which opens a race against load-balancing, and since @p is now
> +	 * enqueued it can indeed be subject to this.
> +	 *
> +	 * This means that any balancing done by these functions must double
> +	 * check a task's rq.
> +	 */
> +	if (prev_class != p->sched_class)
> +		p->sched_class->switched_to(rq, p);
> +	else if (oldprio != p->prio || dl_task(p))
> +		p->sched_class->prio_changed(rq, p, oldprio);
> +	/*
> +	 * It further means we should not rely on @p's rq from here on.
> +	 */
>  	task_rq_unlock(rq, p, &flags);
>  
> -	rt_mutex_adjust_pi(p);
> +	if (pi)
> +		rt_mutex_adjust_pi(p);
>  
>  	return 0;
>  }
> @@ -3664,7 +3703,7 @@ static int _sched_setscheduler(struct task_struct *p, int policy,
>  		attr.sched_policy = policy;
>  	}
>  
> -	return __sched_setscheduler(p, &attr, check);
> +	return __sched_setscheduler(p, &attr, check, true);
>  }
>  /**
>   * sched_setscheduler - change the scheduling policy and/or RT priority of a thread.
> @@ -3685,7 +3724,7 @@ EXPORT_SYMBOL_GPL(sched_setscheduler);
>  
>  int sched_setattr(struct task_struct *p, const struct sched_attr *attr)
>  {
> -	return __sched_setscheduler(p, attr, true);
> +	return __sched_setscheduler(p, attr, true, true);
>  }
>  EXPORT_SYMBOL_GPL(sched_setattr);
>  
> @@ -7346,32 +7385,12 @@ EXPORT_SYMBOL(___might_sleep);
>  #endif
>  
>  #ifdef CONFIG_MAGIC_SYSRQ
> -static void normalize_task(struct rq *rq, struct task_struct *p)
> +void normalize_rt_tasks(void)
>  {
> -	const struct sched_class *prev_class = p->sched_class;
> +	struct task_struct *g, *p;
>  	struct sched_attr attr = {
>  		.sched_policy = SCHED_NORMAL,
>  	};
> -	int old_prio = p->prio;
> -	int queued;
> -
> -	queued = task_on_rq_queued(p);
> -	if (queued)
> -		dequeue_task(rq, p, 0);
> -	__setscheduler(rq, p, &attr, false);
> -	if (queued) {
> -		enqueue_task(rq, p, 0);
> -		resched_curr(rq);
> -	}
> -
> -	check_class_changed(rq, p, prev_class, old_prio);
> -}
> -
> -void normalize_rt_tasks(void)
> -{
> -	struct task_struct *g, *p;
> -	unsigned long flags;
> -	struct rq *rq;
>  
>  	read_lock(&tasklist_lock);
>  	for_each_process_thread(g, p) {
> @@ -7398,9 +7417,7 @@ void normalize_rt_tasks(void)
>  			continue;
>  		}
>  
> -		rq = task_rq_lock(p, &flags);
> -		normalize_task(rq, p);
> -		task_rq_unlock(rq, p, &flags);
> +		__sched_setscheduler(p, &attr, false, false);
>  	}
>  	read_unlock(&tasklist_lock);
>  }
> 


* Re: sched_setscheduler() vs idle_balance() race
  2015-05-28 15:24     ` Peter Zijlstra
@ 2015-05-29 18:30       ` Mike Galbraith
  2015-05-29 18:48         ` Mike Galbraith
  2015-06-01  8:16         ` Peter Zijlstra
  0 siblings, 2 replies; 18+ messages in thread
From: Mike Galbraith @ 2015-05-29 18:30 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: LKML, Ingo Molnar, ktkhai

On Thu, 2015-05-28 at 17:24 +0200, Peter Zijlstra wrote:
> On Thu, May 28, 2015 at 04:54:26PM +0200, Mike Galbraith wrote:
> 
> > > The below is compile tested only, but it might just work if I didn't
> > > miss anything :-)
> > 
> > I'll take it for a spin, and take a peek at the application.
> 
> Thanks!

It took quite a bit longer than I thought it would, but I finally
managed to cobble a standalone testcase together that brings nearly
instant gratification on my 8 socket DL980.  Patched kernel explodes, so
first cut ain't quite ready to ship ;-)

I applied the 'say no to migration if ->pi_lock is held' patchlet, and the
otherwise toxic testcase was rendered harmless, so it seems there is a hole
in the patch.

Here's the burp, I haven't rummaged around at all yet.

[  286.105446] ------------[ cut here ]------------
[  286.151163] kernel BUG at kernel/sched/rt.c:986!
[  286.203404] invalid opcode: 0000 [#1] SMP 
[  286.249093] Dumping ftrace buffer:
[  286.288337]    (ftrace buffer empty)
[  286.328403] Modules linked in: edd af_packet cpufreq_conservative cpufreq_userspace cpufreq_powersave fuse loop md_mod dm_mod iTCO_wdt gpio_ich iTCO_vendor_support ipmi_ssif joydev i7core_edac ipmi_si lpc_ich hpilo hid_generic netxen_nic hpwdt shpchp sr_mod ehci_pci mfd_core pcspkr bnx2 edac_core ipmi_msghandler cdrom sg pcc_cpufreq 8250_fintek acpi_cpufreq acpi_power_meter button usbhid uhci_hcd ehci_hcd usbcore thermal usb_common processor scsi_dh_hp_sw scsi_dh_emc scsi_dh_rdac scsi_dh_alua scsi_dh ata_generic ata_piix hpsa cciss
[  286.855938] CPU: 3 PID: 6893 Comm: massive_intr_x Not tainted 4.1.0-default #2
[  286.933673] Hardware name: Hewlett-Packard ProLiant DL980 G7, BIOS P66 07/07/2010
[  287.009379] task: ffff8802717bc4d0 ti: ffff8802715b4000 task.ti: ffff8802715b4000
[  287.089247] RIP: 0010:[<ffffffff810a75d4>]  [<ffffffff810a75d4>] dequeue_top_rt_rq+0x44/0x50
[  287.184723] RSP: 0018:ffff8802715b7d98  EFLAGS: 00010046
[  287.244782] RAX: ffff880277316480 RBX: ffff88007a4ba788 RCX: 00000000000025c7
[  287.326088] RDX: 0000000000000000 RSI: ffff88007a4ba590 RDI: ffff880277316618
[  287.407138] RBP: ffff8802715b7d98 R08: ffffffff81c3ff00 R09: 0000000000001aed
[  287.487730] R10: ffff88007a4ba590 R11: 0000000000000001 R12: ffff880277316480
[  287.568328] R13: ffff880277316c90 R14: ffff8802715b7ed8 R15: ffff88007a4ba590
[  287.649732] FS:  00007efc0515c700(0000) GS:ffff8802766c0000(0000) knlGS:0000000000000000
[  287.741131] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  287.805467] CR2: ffffffffff600400 CR3: 000000026f6a1000 CR4: 00000000000007e0
[  287.889403] Stack:
[  287.912171]  ffff8802715b7dc8 ffffffff810a81fc ffff88007a4ba788 ffff880277316480
[  287.995061]  ffff880277316c90 ffff8802715b7ed8 ffff8802715b7de8 ffffffff810a909f
[  288.079516]  ffff880277316480 ffff88007a4ba590 ffff8802715b7e18 ffffffff810a9691
[  288.163784] Call Trace:
[  288.193691]  [<ffffffff810a81fc>] dequeue_rt_stack+0x3c/0x350
[  288.260484]  [<ffffffff810a909f>] dequeue_rt_entity+0x1f/0x80
[  288.330554]  [<ffffffff810a9691>] dequeue_task_rt+0x31/0x80
[  288.395212]  [<ffffffff8108e16c>] dequeue_task+0x5c/0x80
[  288.472481]  [<ffffffff81091ef5>] __sched_setscheduler+0x635/0xa50
[  288.547063]  [<ffffffff81092378>] _sched_setscheduler+0x68/0x70
[  288.613281]  [<ffffffff81092401>] do_sched_setscheduler+0x61/0xa0
[  288.681984]  [<ffffffff81094f82>] SyS_sched_setscheduler+0x12/0x30
[  288.750797]  [<ffffffff81669cb2>] system_call_fastpath+0x16/0x75
[  288.819013] Code: d7 75 26 8b 97 ac 06 00 00 85 d2 74 1a 8b 50 04 85 d2 74 17 2b 97 50 06 00 00 89 50 04 c7 87 ac 06 00 00 00 00 00 00 5d c3 0f 0b <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 
[  289.037988] RIP  [<ffffffff810a75d4>] dequeue_top_rt_rq+0x44/0x50
[  289.100594]  RSP <ffff8802715b7d98>




* Re: sched_setscheduler() vs idle_balance() race
  2015-05-29 18:30       ` Mike Galbraith
@ 2015-05-29 18:48         ` Mike Galbraith
  2015-06-01  8:14           ` Peter Zijlstra
  2015-06-01  8:16         ` Peter Zijlstra
  1 sibling, 1 reply; 18+ messages in thread
From: Mike Galbraith @ 2015-05-29 18:48 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: LKML, Ingo Molnar, ktkhai


P.S. intel_idle is not all that wonderful on this box.

-   78.31%  [kernel]       [k] _raw_spin_lock
   - _raw_spin_lock
      - 94.91% tick_broadcast_oneshot_control
           intel_idle
           cpuidle_enter_state
           cpuidle_enter
         + cpu_startup_entry
      + 1.63% pull_rt_task
      + 1.35% __sched_setscheduler
      + 1.11% push_rt_task.part.46
+    2.44%  [kernel]       [k] native_write_msr_safe





* Re: sched_setscheduler() vs idle_balance() race
  2015-05-28 13:53 ` Peter Zijlstra
  2015-05-28 14:54   ` Mike Galbraith
  2015-05-28 16:59   ` Kirill Tkhai
@ 2015-05-30 13:08   ` Mike Galbraith
  2015-05-31  6:39     ` Mike Galbraith
  2015-06-01  8:19     ` Peter Zijlstra
  2 siblings, 2 replies; 18+ messages in thread
From: Mike Galbraith @ 2015-05-30 13:08 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: LKML, Ingo Molnar, ktkhai

On Thu, 2015-05-28 at 15:53 +0200, Peter Zijlstra wrote:
> On Thu, May 28, 2015 at 09:43:52AM +0200, Mike Galbraith wrote:
> > Hi Peter,
> > 
> > I'm not seeing what prevents pull_task() from yanking a task out from
> > under __sched_setscheduler().  A box sprinkling smoldering 3.0 kernel
> > wreckage all over my bugzilla mbox isn't seeing it either ;-)
> 
> Say, how easy can that thing be reproduced?
> 
> The below is compile tested only, but it might just work if I didn't
> miss anything :-)

Seems trying to make the target invisible to balancing created a new
race: dequeue target, do stuff that may drop rq->lock while it's
dequeued, target sneaks into schedule(), dequeues itself (#2), boom.
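
Roughly, as I read it:

  __sched_setscheduler(p)                    p's CPU: p is ->curr, going to sleep
    lock p's rq
    dequeue_task(p)
    switched_from_rt(p)
      pull_rt_task()
        double_lock_balance()  <-- drops p's rq->lock
                                               __schedule()
                                                 deactivate_task(p)  <-- dequeue #2
                                                 nr_running underflows (0xffffffff below)
                                                 pick_next_task_fair() -> boom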

On my desktop box I meet..

crash> bt
PID: 6281   TASK: ffff880401950000  CPU: 5   COMMAND: "massive_intr_x"
 #0 [ffff8800da9d79c0] machine_kexec at ffffffff8103c428
 #1 [ffff8800da9d7a20] crash_kexec at ffffffff810c98e5
 #2 [ffff8800da9d7af0] oops_end at ffffffff81006418
 #3 [ffff8800da9d7b20] no_context at ffffffff815b4296
 #4 [ffff8800da9d7b80] __bad_area_nosemaphore at ffffffff815b4353
 #5 [ffff8800da9d7bd0] bad_area at ffffffff815b4691
 #6 [ffff8800da9d7c00] __do_page_fault at ffffffff81044eba
 #7 [ffff8800da9d7c70] do_page_fault at ffffffff8104500c
 #8 [ffff8800da9d7c80] page_fault at ffffffff815c10b2
    [exception RIP: set_next_entity+28]
    RIP: ffffffff81080bac  RSP: ffff8800da9d7d38  RFLAGS: 00010092
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: 00000000044aa200
    RDX: 0000000000000000  RSI: 0000000000000000  RDI: ffff88041ed55968
    RBP: ffff8800da9d7d78   R8: 0000000000000000   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000246  R12: ffff88041ed55968
    R13: 0000000000015900  R14: 0000000000000005  R15: ffff88041ed55900
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #9 [ffff8800da9d7d80] pick_next_task_fair at ffffffff81084071
#10 [ffff8800da9d7df0] __schedule at ffffffff815bb507
#11 [ffff8800da9d7e40] schedule at ffffffff815bbcb7
#12 [ffff8800da9d7e60] do_nanosleep at ffffffff815be615
#13 [ffff8800da9d7ea0] hrtimer_nanosleep at ffffffff810acc96
#14 [ffff8800da9d7f20] sys_nanosleep at ffffffff810acdb6
#15 [ffff8800da9d7f50] system_call_fastpath at ffffffff815bf61b
    RIP: 00007fbde0eb3130  RSP: 00007ffcb4a0a7e8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: ffff8800da9d7f90  RCX: 00007fbde0eb3130
    RDX: 0000000055695905  RSI: 0000000000000000  RDI: 00007ffcb4a0a800
    RBP: 00007ffcb4a0a7e0   R8: 0000000000000000   R9: 00007ffcb4a0a740
    R10: 00007ffcb4a0a5b0  R11: 0000000000000246  R12: 0000000000000010
    R13: ffffffff815bf5f2  R14: ffffffffffffff10  R15: 00007ffcb4a0a800
    ORIG_RAX: 0000000000000023  CS: 0033  SS: 002b
crash> struct -x rq ffff88041ed55900
struct rq {
  lock = {
    raw_lock = {
      {
        head_tail = 0xb0ae, 
        tickets = {
          head = 0xae, 
          tail = 0xb0
        }
      }
    }
  }, 
  nr_running = 0xffffffff, 
  cpu_load = {0x3ff, 0x3ea, 0x35a, 0x29b, 0x225}, 
  last_load_update_tick = 0xffffa827, 
  nohz_stamp = 0x0, 
  nohz_flags = 0x0, 
  load = {
    weight = 0xfffffffffffffc00, 
    inv_weight = 0x0
  }, 
  nr_load_updates = 0xb644, 
  nr_switches = 0x5b5b0, 
  cfs = {
    load = {
      weight = 0xfffffffffffffc00, 
      inv_weight = 0x0
    }, 
    nr_running = 0xffffffff,
    h_nr_running = 0xffffffff, 
    exec_clock = 0xc9070a1c3, 
    min_vruntime = 0x70cee14b9, 
    tasks_timeline = {
      rb_node = 0x0
    }, 
    rb_leftmost = 0x0, 
    curr = 0x0, 
    next = 0x0, 
    last = 0x0, 
    skip = 0x0,




* Re: sched_setscheduler() vs idle_balance() race
  2015-05-30 13:08   ` Mike Galbraith
@ 2015-05-31  6:39     ` Mike Galbraith
  2015-06-01  8:24       ` Peter Zijlstra
  2015-06-01  8:19     ` Peter Zijlstra
  1 sibling, 1 reply; 18+ messages in thread
From: Mike Galbraith @ 2015-05-31  6:39 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: LKML, Ingo Molnar, ktkhai

On Sat, 2015-05-30 at 15:08 +0200, Mike Galbraith wrote:
> On Thu, 2015-05-28 at 15:53 +0200, Peter Zijlstra wrote:
> > On Thu, May 28, 2015 at 09:43:52AM +0200, Mike Galbraith wrote:
> > > Hi Peter,
> > > 
> > > I'm not seeing what prevents pull_task() from yanking a task out from
> > > under __sched_setscheduler().  A box sprinkling smoldering 3.0 kernel
> > > wreckage all over my bugzilla mbox isn't seeing it either ;-)
> > 
> > Say, how easy can that thing be reproduced?
> > 
> > The below is compile tested only, but it might just work if I didn't
> > miss anything :-)
> 
> Seems trying to make the target invisible to balancing created a new
> race: dequeue target, do stuff that may drop rq->lock while it's
> dequeued, target sneaks into schedule(), dequeues itself (#2), boom.

Well, the below (ick) plugged it up, but...

I don't see why we can't just say no in can_migrate_task() if ->pi_lock
is held.  It plugged the original hole in a lot fewer lines.
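
Something along these lines, sketch only (not the actual patchlet):

	/* can_migrate_task(): leave p alone while someone is busy
	 * fiddling with its scheduling parameters under ->pi_lock. */
	if (raw_spin_is_locked(&p->pi_lock))
		return 0;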

Hohum, time to go pretend I have something better to do on a sunny
Sunday morning ;-)

  massive_intr_x-6213  [007] d...   170.579339: pull_rt_task: yup, pulled
  massive_intr_x-6213  [002] d...   170.580114: pull_rt_task: yup, pulled
  massive_intr_x-6213  [006] d...   170.586083: pull_rt_task: yup, pulled
           <...>-6237  [006] d...   170.593878: __schedule: saving the day

---
 kernel/sched/core.c     |   43 +++++++++++++++++++++++++++++++++++--------
 kernel/sched/deadline.c |    6 +++---
 kernel/sched/rt.c       |   11 +++++++++--
 kernel/sched/sched.h    |   10 +++++++++-
 4 files changed, 56 insertions(+), 14 deletions(-)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2745,9 +2745,18 @@ static void __sched __schedule(void)
 	 * can't be reordered with __set_current_state(TASK_INTERRUPTIBLE)
 	 * done by the caller to avoid the race with signal_wake_up().
 	 */
+dequeued:
 	smp_mb__before_spinlock();
 	raw_spin_lock_irq(&rq->lock);
 
+	if (unlikely(!task_on_rq_queued(prev))) {
+		trace_printk("saving the day\n");
+		tracing_off();
+		raw_spin_unlock_irq(&rq->lock);
+		cpu_relax();
+		goto dequeued;
+	}
+
 	rq->clock_skip_update <<= 1; /* promote REQ to ACT */
 
 	switch_count = &prev->nivcsw;
@@ -3013,8 +3022,10 @@ void rt_mutex_setprio(struct task_struct
 	prev_class = p->sched_class;
 	queued = task_on_rq_queued(p);
 	running = task_current(rq, p);
-	if (queued)
+	if (queued) {
 		dequeue_task(rq, p, 0);
+		p->on_rq = TASK_ON_RQ_DEQUEUED;
+	}
 	if (running)
 		put_prev_task(rq, p);
 
@@ -3067,8 +3078,10 @@ void rt_mutex_setprio(struct task_struct
 
 	if (running)
 		p->sched_class->set_curr_task(rq);
-	if (queued)
+	if (queued) {
 		enqueue_task(rq, p, enqueue_flag);
+		p->on_rq = TASK_ON_RQ_QUEUED;
+	}
 
 	/*
 	 * Both switched_to() and prio_changed() are allowed to drop @rq->lock;
@@ -3114,8 +3127,10 @@ void set_user_nice(struct task_struct *p
 		goto out_unlock;
 	}
 	queued = task_on_rq_queued(p);
-	if (queued)
+	if (queued) {
 		dequeue_task(rq, p, 0);
+		p->on_rq = TASK_ON_RQ_DEQUEUED;
+	}
 
 	p->static_prio = NICE_TO_PRIO(nice);
 	set_load_weight(p);
@@ -3125,6 +3140,7 @@ void set_user_nice(struct task_struct *p
 
 	if (queued) {
 		enqueue_task(rq, p, 0);
+		p->on_rq = TASK_ON_RQ_QUEUED;
 		/*
 		 * If the task increased its priority or is running and
 		 * lowered its priority, then reschedule its CPU:
@@ -3628,8 +3644,10 @@ static int __sched_setscheduler(struct t
 
 	queued = task_on_rq_queued(p);
 	running = task_current(rq, p);
-	if (queued)
+	if (queued) {
 		dequeue_task(rq, p, 0);
+		p->on_rq = TASK_ON_RQ_DEQUEUED;
+	}
 	if (running)
 		put_prev_task(rq, p);
 
@@ -3656,6 +3674,7 @@ static int __sched_setscheduler(struct t
 		 * increased (user space view).
 		 */
 		enqueue_task(rq, p, oldprio <= p->prio ? ENQUEUE_HEAD : 0);
+		p->on_rq = TASK_ON_RQ_QUEUED;
 	}
 
 	/*
@@ -4943,8 +4962,10 @@ void sched_setnuma(struct task_struct *p
 	queued = task_on_rq_queued(p);
 	running = task_current(rq, p);
 
-	if (queued)
+	if (queued) {
 		dequeue_task(rq, p, 0);
+		p->on_rq = TASK_ON_RQ_DEQUEUED;
+	}
 	if (running)
 		put_prev_task(rq, p);
 
@@ -4952,8 +4973,10 @@ void sched_setnuma(struct task_struct *p
 
 	if (running)
 		p->sched_class->set_curr_task(rq);
-	if (queued)
+	if (queued) {
 		enqueue_task(rq, p, 0);
+		p->on_rq = TASK_ON_RQ_QUEUED;
+	}
 	task_rq_unlock(rq, p, &flags);
 }
 #endif
@@ -7587,8 +7610,10 @@ void sched_move_task(struct task_struct
 	running = task_current(rq, tsk);
 	queued = task_on_rq_queued(tsk);
 
-	if (queued)
+	if (queued) {
 		dequeue_task(rq, tsk, 0);
+		tsk->on_rq = TASK_ON_RQ_DEQUEUED;
+	}
 	if (unlikely(running))
 		put_prev_task(rq, tsk);
 
@@ -7611,8 +7636,10 @@ void sched_move_task(struct task_struct
 
 	if (unlikely(running))
 		tsk->sched_class->set_curr_task(rq);
-	if (queued)
+	if (queued) {
 		enqueue_task(rq, tsk, 0);
+		tsk->on_rq = TASK_ON_RQ_QUEUED;
+	}
 
 	task_rq_unlock(rq, tsk, &flags);
 }
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -607,7 +607,7 @@ static enum hrtimer_restart dl_task_time
 	 * We can be both throttled and !queued. Replenish the counter
 	 * but do not enqueue -- wait for our wakeup to do that.
 	 */
-	if (!task_on_rq_queued(p)) {
+	if (!task_on_rq_queued_or_dequeued(p)) {
 		replenish_dl_entity(dl_se, dl_se);
 		goto unlock;
 	}
@@ -1526,7 +1526,7 @@ static int pull_dl_task(struct rq *this_
 		     dl_time_before(p->dl.deadline,
 				    this_rq->dl.earliest_dl.curr))) {
 			WARN_ON(p == src_rq->curr);
-			WARN_ON(!task_on_rq_queued(p));
+			WARN_ON(!task_on_rq_queued_or_dequeued(p));
 
 			/*
 			 * Then we pull iff p has actually an earlier
@@ -1707,7 +1707,7 @@ static void switched_from_dl(struct rq *
 	 * this is the right place to try to pull some other one
 	 * from an overloaded cpu, if any.
 	 */
-	if (!task_on_rq_queued(p) || rq->dl.dl_nr_running)
+	if (!task_on_rq_queued_or_dequeued(p) || rq->dl.dl_nr_running)
 		return;
 
 	if (pull_dl_task(rq))
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1656,7 +1656,7 @@ static struct rq *find_lock_lowest_rq(st
 				     !cpumask_test_cpu(lowest_rq->cpu,
 						       tsk_cpus_allowed(task)) ||
 				     task_running(rq, task) ||
-				     !task_on_rq_queued(task))) {
+				     !task_on_rq_queued_or_dequeued(task))) {
 
 				double_unlock_balance(rq, lowest_rq);
 				lowest_rq = NULL;
@@ -1953,10 +1953,14 @@ static int pull_rt_task(struct rq *this_
 	int this_cpu = this_rq->cpu, ret = 0, cpu;
 	struct task_struct *p;
 	struct rq *src_rq;
+	int task_dequeued = 0;
 
 	if (likely(!rt_overloaded(this_rq)))
 		return 0;
 
+	if (this_rq->curr->on_rq == TASK_ON_RQ_DEQUEUED)
+		task_dequeued = 1;
+
 	/*
 	 * Match the barrier from rt_set_overloaded; this guarantees that if we
 	 * see overloaded we must also see the rto_mask bit.
@@ -2035,6 +2039,9 @@ static int pull_rt_task(struct rq *this_
 		double_unlock_balance(this_rq, src_rq);
 	}
 
+	if (ret && task_dequeued)
+		trace_printk("yup, pulled\n");
+
 	return ret;
 }
 
@@ -2133,7 +2140,7 @@ static void switched_from_rt(struct rq *
 	 * we may need to handle the pulling of RT tasks
 	 * now.
 	 */
-	if (!task_on_rq_queued(p) || rq->rt.rt_nr_running)
+	if (!task_on_rq_queued_or_dequeued(p) || rq->rt.rt_nr_running)
 		return;
 
 	if (pull_rt_task(rq))
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -19,7 +19,8 @@ struct cpuidle_state;
 
 /* task_struct::on_rq states: */
 #define TASK_ON_RQ_QUEUED	1
-#define TASK_ON_RQ_MIGRATING	2
+#define TASK_ON_RQ_DEQUEUED	2
+#define TASK_ON_RQ_MIGRATING	3
 
 extern __read_mostly int scheduler_running;
 
@@ -1034,6 +1035,13 @@ static inline int task_on_rq_queued(stru
 	return p->on_rq == TASK_ON_RQ_QUEUED;
 }
 
+static inline int task_on_rq_queued_or_dequeued(struct task_struct *p)
+{
+	if (p->on_rq == TASK_ON_RQ_QUEUED)
+		return 1;
+	return p->on_rq == TASK_ON_RQ_DEQUEUED;
+}
+
 static inline int task_on_rq_migrating(struct task_struct *p)
 {
 	return p->on_rq == TASK_ON_RQ_MIGRATING;




* Re: sched_setscheduler() vs idle_balance() race
  2015-05-29 18:48         ` Mike Galbraith
@ 2015-06-01  8:14           ` Peter Zijlstra
  0 siblings, 0 replies; 18+ messages in thread
From: Peter Zijlstra @ 2015-06-01  8:14 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: LKML, Ingo Molnar, ktkhai

On Fri, May 29, 2015 at 08:48:56PM +0200, Mike Galbraith wrote:
> 
> P.S. intel_idle is not all that wonderful on this box.
> 
> -   78.31%  [kernel]       [k] _raw_spin_lock
>    - _raw_spin_lock
>       - 94.91% tick_broadcast_oneshot_control

Your DL980 G7 has E7-4800 parts in it, right?  Which, if Wikipedia is
correct, resolves to a Nehalem-EX.

Now the NHM-EX has a fun 'feature' in that for (some?) idle states the
local timer stops, so we have to fall back to a global broadcast
timer.

Now go count the number of cpus on your box and then imagine a global
spinlock, oh wait, you already found it ^ :-)


* Re: sched_setscheduler() vs idle_balance() race
  2015-05-29 18:30       ` Mike Galbraith
  2015-05-29 18:48         ` Mike Galbraith
@ 2015-06-01  8:16         ` Peter Zijlstra
  2015-06-01 10:00           ` Mike Galbraith
  1 sibling, 1 reply; 18+ messages in thread
From: Peter Zijlstra @ 2015-06-01  8:16 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: LKML, Ingo Molnar, ktkhai

On Fri, May 29, 2015 at 08:30:27PM +0200, Mike Galbraith wrote:

> It took quite a bit longer than I thought it would, but I finally
> managed to cobble a standalone testcase together that brings nearly
> instant gratification on my 8 socket DL980.  Patched kernel explodes, so
> first cut ain't quite ready to ship ;-)

Could you perchance share this testcase?


* Re: sched_setscheduler() vs idle_balance() race
  2015-05-30 13:08   ` Mike Galbraith
  2015-05-31  6:39     ` Mike Galbraith
@ 2015-06-01  8:19     ` Peter Zijlstra
  1 sibling, 0 replies; 18+ messages in thread
From: Peter Zijlstra @ 2015-06-01  8:19 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: LKML, Ingo Molnar, ktkhai

On Sat, May 30, 2015 at 03:08:26PM +0200, Mike Galbraith wrote:

> Seems trying to make the target invisible to balancing created a new
> race: dequeue target, do stuff that may drop rq->lock while it's
> dequeued, target sneaks into schedule(), dequeues itself (#2), boom.

Aw god yes, duh.

Fun little puzzle this. Lemme go think a bit.


* Re: sched_setscheduler() vs idle_balance() race
  2015-05-31  6:39     ` Mike Galbraith
@ 2015-06-01  8:24       ` Peter Zijlstra
  0 siblings, 0 replies; 18+ messages in thread
From: Peter Zijlstra @ 2015-06-01  8:24 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: LKML, Ingo Molnar, ktkhai

On Sun, May 31, 2015 at 08:39:04AM +0200, Mike Galbraith wrote:
> I don't see why we can't just say no in can_migrate_task() if ->pi_lock
> is held.

I suppose we could do that; what I really want to avoid is also
requiring pi_lock for scheduling.

The down-side of looking at pi_lock for migration is that there is no
common point for migrating tasks; it's all inside the classes, so we'd
get to sprinkle it all over the place.


* Re: sched_setscheduler() vs idle_balance() race
  2015-06-01  8:16         ` Peter Zijlstra
@ 2015-06-01 10:00           ` Mike Galbraith
  0 siblings, 0 replies; 18+ messages in thread
From: Mike Galbraith @ 2015-06-01 10:00 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: LKML, Ingo Molnar, ktkhai

On Mon, 2015-06-01 at 10:16 +0200, Peter Zijlstra wrote:
> On Fri, May 29, 2015 at 08:30:27PM +0200, Mike Galbraith wrote:
> 
> > It took quite a bit longer than I thought it would, but I finally
> > managed to cobble a standalone testcase together that brings nearly
> > instant gratification on my 8 socket DL980.  Patched kernel explodes, so
> > first cut ain't quite ready to ship ;-)
> 
> Could you perchance share this testcase?

Sure, I'll send it off list.  I'd just post it, but the duct tape might
fall off somewhere in transit, and land god knows where ;-)

	-Mike



