possible migration bug with hotplug cpu

* possible migration bug with hotplug cpu
@ 2009-07-08 15:48 Lucas De Marchi
  2009-07-08 15:55 ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Lucas De Marchi @ 2009-07-08 15:48 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel

I was doing some analysis with the number of migrations in my application and
I think there's a bug in this accounting or even worse, in the migrations
mechanism when used together with cpu hotplug.

I turned off all CPUs except one using the hotplug mechanism, after what I
launghed my application that has 8 threads. Before they finish they print the
file /proc/<tid>/sched. I have only 1 online CPU and there are ~ 200
migrations per thread. The function set_task_cpu is responsible for updating
the migrations counter and is called by 9 other functions. With some tests I
discovered that 95% of these migrations come from try_to_wake_up and the other
5% from pull_task and __migrate_task.

Looking at try_to_wake_up:

...
	cpu = task_cpu(p);
	orig_cpu = cpu;
	this_cpu = smp_processor_id();

#ifdef CONFIG_SMP
	if (unlikely(task_running(rq, p)))
		goto out_activate;

	cpu = p->sched_class->select_task_rq(p, sync);  //<<<<===
	if (cpu != orig_cpu) {                          //<<<<===
		set_task_cpu(p, cpu);
...

p->sched_class->select_task_rq(p, sync)  is returning a different cpu of
task_cpu(p) even if I have only 1 online CPU. In my tests this behavior is
similar for rt and normal tasks. For RT, the only possible problem could be on
find_lowest_rq, but I'm still rying to find out why. Since you have more
experience with this code, if you could give it a look I'd appreciate.

Is there any obscure reason why this behavior could be right?

Lucas De Marchi

^ permalink raw reply	[flat|nested] 9+ messages in thread