* load balancing regression since commit 367456c7
@ 2012-04-11  1:06 Tim Chen
  2012-04-17 11:43 ` Peter Zijlstra
  2012-04-17 12:09 ` Peter Zijlstra
  0 siblings, 2 replies; 14+ messages in thread
From: Tim Chen @ 2012-04-11  1:06 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Suresh Siddha, Alex Shi, Huang, Ying, linux-kernel

Peter,

We noticed in a hackbench test (./hackbench 100 process 2000)
on a Sandy Bridge 2-socket server that there has been a slowdown
by a factor of 4 since commit 367456c7 was applied
("sched: Ditch per cgroup task lists for load-balancing").

Commit 5d6523e ("sched: Fix load-balance wreckage") did
not fix the regression.

The profile shows heavy spin lock contention in the load_balance path
on 3.4-rc2, whereas it accounted for less than 0.003% of CPU time
before commit 367456c7.

When we looked at /proc/schedstat for 3.4-rc2 over the run duration,
schedule was called 13x more often on cpu0, and the number of schedule
calls that left the processor idle was 530x as high.

There was also a big increase in the remote try_to_wake_up
(sd->ttwu_wake_remote) count:

	increase in sd->ttwu_wake_remote for cpu0
	domain 0	 540%
	domain 1	7570%
	domain 2	4426%

I wonder if there is unnecessary load balancing to remote cpus?

Tim


profile for 3.4-rc2

     7.16%     hackbench  [kernel.kallsyms]      [k] _raw_spin_lock
               |
               --- _raw_spin_lock
                  |
                  |--56.52%-- load_balance
                  |          idle_balance
                  |          __schedule
                  |          schedule
                  |          |
                  |          |--98.73%-- schedule_timeout
                  |          |          |
                  |          |          |--97.80%-- unix_stream_recvmsg
                  |          |          |          sock_aio_read.part.7
                  |          |          |          sock_aio_read
                  |          |          |          do_sync_read
                  |          |          |          vfs_read
                  |          |          |          sys_read
                  |          |          |          system_call
                  |          |          |          __read_nocancel
                  |          |          |          create_worker
                  |          |          |          group
                  |          |          |          main
                  |          |          |          __libc_start_main
                  |          |          |








* Re: load balancing regression since commit 367456c7
  2012-04-11  1:06 load balancing regression since commit 367456c7 Tim Chen
@ 2012-04-17 11:43 ` Peter Zijlstra
  2012-04-17 12:09 ` Peter Zijlstra
  1 sibling, 0 replies; 14+ messages in thread
From: Peter Zijlstra @ 2012-04-17 11:43 UTC (permalink / raw)
  To: Tim Chen; +Cc: Suresh Siddha, Alex Shi, Huang, Ying, linux-kernel

On Tue, 2012-04-10 at 18:06 -0700, Tim Chen wrote:
> Peter,
> 
> We noticed in a hackbench test (./hackbench 100 process 2000)
> on a Sandy Bridge 2-socket server that there has been a slowdown
> by a factor of 4 since commit 367456c7 was applied
> ("sched: Ditch per cgroup task lists for load-balancing").
> 
> Commit 5d6523e ("sched: Fix load-balance wreckage") did
> not fix the regression.
> 
> The profile shows heavy spin lock contention in the load_balance path
> on 3.4-rc2, whereas it accounted for less than 0.003% of CPU time
> before commit 367456c7.

I can't actually reproduce this, but does the below help?

If not, can you shoot your .config over?

---
 kernel/sched/fair.c |    8 +++++---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0d97ebd..e1da5c6 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3215,6 +3215,8 @@ static int move_one_task(struct lb_env *env)
 
 static unsigned long task_h_load(struct task_struct *p);
 
+static const unsigned int sched_nr_migrate_break = IS_ENABLED(CONFIG_PREEMPT) ? 8 : 32;
+
 /*
  * move_tasks tries to move up to load_move weighted load from busiest to
  * this_rq, as part of a balancing operation within domain "sd".
@@ -3242,7 +3244,7 @@ static int move_tasks(struct lb_env *env)
 
 		/* take a breather every nr_migrate tasks */
 		if (env->loop > env->loop_break) {
-			env->loop_break += sysctl_sched_nr_migrate;
+			env->loop_break += sched_nr_migrate_break;
 			env->flags |= LBF_NEED_BREAK;
 			break;
 		}
@@ -4407,7 +4409,8 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 		.dst_cpu	= this_cpu,
 		.dst_rq		= this_rq,
 		.idle		= idle,
-		.loop_break	= sysctl_sched_nr_migrate,
+		.loop_break	= sched_nr_migrate_break,
+		.loop_max	= sysctl_sched_nr_migrate,
 	};
 
 	cpumask_copy(cpus, cpu_active_mask);
@@ -4448,7 +4451,6 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 		env.load_move = imbalance;
 		env.src_cpu = busiest->cpu;
 		env.src_rq = busiest;
-		env.loop_max = busiest->nr_running;
 
 more_balance:
 		local_irq_save(flags);




* Re: load balancing regression since commit 367456c7
  2012-04-11  1:06 load balancing regression since commit 367456c7 Tim Chen
  2012-04-17 11:43 ` Peter Zijlstra
@ 2012-04-17 12:09 ` Peter Zijlstra
  2012-04-17 16:44   ` Tim Chen
  1 sibling, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2012-04-17 12:09 UTC (permalink / raw)
  To: Tim Chen; +Cc: Suresh Siddha, Alex Shi, Ying, linux-kernel

On Tue, 2012-04-10 at 18:06 -0700, Tim Chen wrote:
>                   |--56.52%-- load_balance
>                   |          idle_balance
>                   |          __schedule
>                   |          schedule 

Ahh, I know why I didn't see it: I have a CONFIG_PREEMPT kernel, and
idle balancing stops once it's gotten a single task over instead of
achieving proper balance.

And since hackbench generates insanely long runqueues, and the patch that
caused your regression 'fixed' the lock-breaking, it will now iterate the
entire runqueue if needed to achieve balance, which hurts.

I think the patch I sent ought to work; let me try disabling
CONFIG_PREEMPT.
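
For reference, the move_tasks() iteration being discussed looks roughly
like this (a simplified sketch of the 3.4-rc code, not the verbatim
source; details abbreviated):

	while (!list_empty(tasks)) {
		p = list_first_entry(tasks, struct task_struct, se.group_node);

		env->loop++;
		/* We've more or less seen every task there is, call it quits */
		if (env->loop > env->loop_max)
			break;

		/* take a breather every nr_migrate tasks */
		if (env->loop > env->loop_break) {
			env->loop_break += sched_nr_migrate_break;
			env->flags |= LBF_NEED_BREAK;	/* drop the locks, then retry */
			break;
		}

		/* ... check can_migrate_task(), compute task_h_load(p), move it ... */
	}

With loop_max set straight to busiest->nr_running, a hackbench run with
thousands of runnable tasks means thousands of iterations per balance
pass; the patch in the previous mail caps both the breather interval
(sched_nr_migrate_break) and loop_max (sysctl_sched_nr_migrate), so a
single pass stays bounded.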


* Re: load balancing regression since commit 367456c7
  2012-04-17 12:09 ` Peter Zijlstra
@ 2012-04-17 16:44   ` Tim Chen
  2012-04-20 14:00     ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Tim Chen @ 2012-04-17 16:44 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Suresh Siddha, Alex Shi, Ying, linux-kernel

On Tue, 2012-04-17 at 14:09 +0200, Peter Zijlstra wrote:
> On Tue, 2012-04-10 at 18:06 -0700, Tim Chen wrote:
> >                   |--56.52%-- load_balance
> >                   |          idle_balance
> >                   |          __schedule
> >                   |          schedule 
> 
> Ahh, I know why I didn't see it: I have a CONFIG_PREEMPT kernel, and
> idle balancing stops once it's gotten a single task over instead of
> achieving proper balance.
> 
> And since hackbench generates insanely long runqueues, and the patch that
> caused your regression 'fixed' the lock-breaking, it will now iterate the
> entire runqueue if needed to achieve balance, which hurts.
> 
> I think the patch I sent ought to work; let me try disabling
> CONFIG_PREEMPT.
> --

Yes, CONFIG_PREEMPT is turned off on my side.  With the patch that you
sent, the slowdown went from a factor of 4 down to a factor of 2.

So the run time is now twice as long, instead of four times as long,
compared to the v3.3 kernel.

Tim



* Re: load balancing regression since commit 367456c7
  2012-04-17 16:44   ` Tim Chen
@ 2012-04-20 14:00     ` Peter Zijlstra
  2012-04-20 16:40       ` Tim Chen
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2012-04-20 14:00 UTC (permalink / raw)
  To: Tim Chen; +Cc: Suresh Siddha, Alex Shi, Ying, linux-kernel

On Tue, 2012-04-17 at 09:44 -0700, Tim Chen wrote:
> On Tue, 2012-04-17 at 14:09 +0200, Peter Zijlstra wrote:
> > On Tue, 2012-04-10 at 18:06 -0700, Tim Chen wrote:
> > >                   |--56.52%-- load_balance
> > >                   |          idle_balance
> > >                   |          __schedule
> > >                   |          schedule 
> > 
> > Ahh, I know why I didn't see it: I have a CONFIG_PREEMPT kernel, and
> > idle balancing stops once it's gotten a single task over instead of
> > achieving proper balance.
> > 
> > And since hackbench generates insanely long runqueues, and the patch that
> > caused your regression 'fixed' the lock-breaking, it will now iterate the
> > entire runqueue if needed to achieve balance, which hurts.
> > 
> > I think the patch I sent ought to work; let me try disabling
> > CONFIG_PREEMPT.
> > --
> 
> Yes, CONFIG_PREEMPT is turned off on my side.  With the patch that you
> sent, the slowdown went from a factor of 4 down to a factor of 2.
> 
> So the run time is now twice as long, instead of four times as long,
> compared to the v3.3 kernel.

OK, so I can't reproduce this on my WSM-EP.. even !PREEMPT kernels give
consistent hackbench times with or without that patch.

Can you still send your full .config? Also, do you have cpu-cgroup muck
enabled and are you using that systemd shite?

What does the below patch (on top of the previous) do?

---
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -784,7 +784,7 @@ account_entity_enqueue(struct cfs_rq *cf
 		update_load_add(&rq_of(cfs_rq)->load, se->load.weight);
 #ifdef CONFIG_SMP
 	if (entity_is_task(se))
-		list_add_tail(&se->group_node, &rq_of(cfs_rq)->cfs_tasks);
+		list_add(&se->group_node, &rq_of(cfs_rq)->cfs_tasks);
 #endif
 	cfs_rq->nr_running++;
 }




* Re: load balancing regression since commit 367456c7
  2012-04-20 14:00     ` Peter Zijlstra
@ 2012-04-20 16:40       ` Tim Chen
  2012-04-20 16:53         ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Tim Chen @ 2012-04-20 16:40 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Suresh Siddha, Alex Shi, Ying, linux-kernel

On Fri, 2012-04-20 at 16:00 +0200, Peter Zijlstra wrote:

> 
> OK, so I can't reproduce this on my WSM-EP.. even !PREEMPT kernels give
> consistent hackbench times with or without that patch.
> 
> Can you still send your full .config? Also, do you have cpu-cgroup muck
> enabled and are you using that systemd shite?
> 
> What does the below patch (on top of the previous) do?
> 

There is a slight 10% to 15% improvement with the patch.  However,
the change in performance is difficult to quantify precisely, as the
hackbench runtime has had large variations (up to 50%) on our Sandy
Bridge EP server machines since commit 367456c7.

We also do not see regression for hackbench on WSM-EP, but only 
on machines with Sandy-Bridge EP (2 socket, 8 cores/socket, HT enabled).  
We are not running hackbench in cgroup for this test.

The Sandy Bridge EP machines have FC16 installed, so I think they use
systemd.  Our WSM-EP machine has FC15, which also has systemd.

I'm sending you the .config in a separate mail.  Thanks for taking a look.
The .config aligns closely with standard Fedora settings.

Tim




* Re: load balancing regression since commit 367456c7
  2012-04-20 16:40       ` Tim Chen
@ 2012-04-20 16:53         ` Peter Zijlstra
  2012-04-20 17:13           ` Tim Chen
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2012-04-20 16:53 UTC (permalink / raw)
  To: Tim Chen; +Cc: Suresh Siddha, Alex Shi, Ying, linux-kernel

On Fri, 2012-04-20 at 09:40 -0700, Tim Chen wrote:
> We also do not see regression for hackbench on WSM-EP, but only 
> on machines with Sandy-Bridge EP (2 socket, 8 cores/socket, HT enabled).  

Argh, Suresh, any idea what could be different and relevant to this
issue between WSM and SNB -EP ?

Tim, can you see the problem on the desktop SNB part? That's the only
SNB I have available.

> We are not running hackbench in cgroup for this test.
> 
> The Sandy Bridge EP machines have FC16 installed, so I think they use
> systemd.  Our WSM-EP machine has FC15, which also has systemd.

Ah, but your .config has:

CONFIG_CGROUP_SCHED=y
CONFIG_SCHED_AUTOGROUP=y

and systemd, when cpu-cgroup is available, will automagically use it.

Could you disable those two CONFIG knobs and see if it persists?
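
That is, roughly this in the .config (illustrative; the exact set of
dependent options that get switched off may differ):

# CONFIG_CGROUP_SCHED is not set
# CONFIG_SCHED_AUTOGROUP is not set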

If it goes away, it's the cgroup balancing; if it's still there, it's
elsewhere.



* Re: load balancing regression since commit 367456c7
  2012-04-20 16:53         ` Peter Zijlstra
@ 2012-04-20 17:13           ` Tim Chen
  2012-04-20 17:33             ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Tim Chen @ 2012-04-20 17:13 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Suresh Siddha, Alex Shi, Ying, linux-kernel

On Fri, 2012-04-20 at 18:53 +0200, Peter Zijlstra wrote:
> On Fri, 2012-04-20 at 09:40 -0700, Tim Chen wrote:
> > We also do not see regression for hackbench on WSM-EP, but only 
> > on machines with Sandy-Bridge EP (2 socket, 8 cores/socket, HT enabled).  
> 
> Argh, Suresh, any idea what could be different and relevant to this
> issue between WSM and SNB -EP ?
> 
> Tim, can you see the problem on the desktop SNB part? That's the only
> SNB I have available.
> 
> > We are not running hackbench in cgroup for this test.
> > 
> > The Sandy Bridge EP machines have FC16 installed, so I think they use
> > systemd.  Our WSM-EP machine has FC15, which also has systemd.
> 
> Ah, but your .config has:
> 
> CONFIG_CGROUP_SCHED=y
> CONFIG_SCHED_AUTOGROUP=y
> 
> and systemd, when cpu-cgroup is available, will automagically use it.
> 
> Could you disable those two CONFIG knobs and see if it persists?
> 

Turning those two off did make the regression go away.

Tim



* Re: load balancing regression since commit 367456c7
  2012-04-20 17:13           ` Tim Chen
@ 2012-04-20 17:33             ` Peter Zijlstra
  2012-04-25 14:56               ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2012-04-20 17:33 UTC (permalink / raw)
  To: Tim Chen; +Cc: Suresh Siddha, Alex Shi, Ying, linux-kernel

On Fri, 2012-04-20 at 10:13 -0700, Tim Chen wrote:
> On Fri, 2012-04-20 at 18:53 +0200, Peter Zijlstra wrote:
> > On Fri, 2012-04-20 at 09:40 -0700, Tim Chen wrote:
> > > We also do not see regression for hackbench on WSM-EP, but only 
> > > on machines with Sandy-Bridge EP (2 socket, 8 cores/socket, HT enabled).  
> > 
> > Argh, Suresh, any idea what could be different and relevant to this
> > issue between WSM and SNB -EP ?
> > 
> > Tim, can you see the problem on the desktop SNB part? That's the only
> > SNB I have available.
> > 
> > > We are not running hackbench in cgroup for this test.
> > > 
> > > The Sandy Bridge EP machines have FC16 installed, so I think they use
> > > systemd.  Our WSM-EP machine has FC15, which also has systemd.
> > 
> > Ah, but your .config has:
> > 
> > CONFIG_CGROUP_SCHED=y
> > CONFIG_SCHED_AUTOGROUP=y
> > 
> > and systemd, when cpu-cgroup is available, will automagically use it.
> > 
> > Could you disable those two CONFIG knobs and see if it persists?
> > 
> 
> Turning those two off did make the regression go away.

OK, I'll go stare at the cgroup part then.. Thanks!



* Re: load balancing regression since commit 367456c7
  2012-04-20 17:33             ` Peter Zijlstra
@ 2012-04-25 14:56               ` Peter Zijlstra
  2012-04-25 17:38                 ` Tim Chen
  2012-04-26 11:56                 ` [tip:sched/urgent] sched: Fix more load-balancing fallout tip-bot for Peter Zijlstra
  0 siblings, 2 replies; 14+ messages in thread
From: Peter Zijlstra @ 2012-04-25 14:56 UTC (permalink / raw)
  To: Tim Chen; +Cc: Suresh Siddha, Alex Shi, Ying, linux-kernel

On Fri, 2012-04-20 at 19:33 +0200, Peter Zijlstra wrote:
> 
> OK, I'll go stare at the cgroup part then.. Thanks!
> 
OK, I could reproduce it when using cgroups; the below fixes it for me,
can you confirm?

---
Subject: sched: Fix more load-balance fallout
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date: Tue Apr 17 13:38:40 CEST 2012

Commits 367456c756a6 ("sched: Ditch per cgroup task lists for
load-balancing") and 5d6523ebd ("sched: Fix load-balance wreckage")
left some more wreckage.

By setting loop_max unconditionally to ->nr_running, load-balancing
could take a lot of time on very long runqueues (hackbench!). So keep
the sysctl as the max limit on the number of tasks we'll iterate.

Furthermore, the min load filter for migration completely fails with
cgroups since inequality in per-cpu state can easily lead to such
small loads :/

Furthermore the change to add new tasks to the tail of the queue
instead of the head seems to have some effect.. not quite sure I
understand why.

Combined, these fixes solve the huge hackbench regression reported by
Tim when hackbench is run in a cgroup.

Reported-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 kernel/sched/fair.c |   19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -784,7 +784,7 @@ account_entity_enqueue(struct cfs_rq *cf
 		update_load_add(&rq_of(cfs_rq)->load, se->load.weight);
 #ifdef CONFIG_SMP
 	if (entity_is_task(se))
-		list_add_tail(&se->group_node, &rq_of(cfs_rq)->cfs_tasks);
+		list_add(&se->group_node, &rq_of(cfs_rq)->cfs_tasks);
 #endif
 	cfs_rq->nr_running++;
 }
@@ -3215,6 +3215,14 @@ static int move_one_task(struct lb_env *
 
 static unsigned long task_h_load(struct task_struct *p);
 
+static const unsigned int sched_nr_migrate_break =
+#ifdef CONFIG_PREEMPT
+	8
+#else
+	32
+#endif
+	;
+
 /*
  * move_tasks tries to move up to load_move weighted load from busiest to
  * this_rq, as part of a balancing operation within domain "sd".
@@ -3242,7 +3250,7 @@ static int move_tasks(struct lb_env *env
 
 		/* take a breather every nr_migrate tasks */
 		if (env->loop > env->loop_break) {
-			env->loop_break += sysctl_sched_nr_migrate;
+			env->loop_break += sched_nr_migrate_break;
 			env->flags |= LBF_NEED_BREAK;
 			break;
 		}
@@ -3252,7 +3260,7 @@ static int move_tasks(struct lb_env *env
 
 		load = task_h_load(p);
 
-		if (load < 16 && !env->sd->nr_balance_failed)
+		if (sched_feat(LB_MIN) && load < 16 && !env->sd->nr_balance_failed)
 			goto next;
 
 		if ((load / 2) > env->load_move)
@@ -4407,7 +4415,7 @@ static int load_balance(int this_cpu, st
 		.dst_cpu	= this_cpu,
 		.dst_rq		= this_rq,
 		.idle		= idle,
-		.loop_break	= sysctl_sched_nr_migrate,
+		.loop_break	= sched_nr_migrate_break,
 	};
 
 	cpumask_copy(cpus, cpu_active_mask);
@@ -4448,7 +4456,8 @@ static int load_balance(int this_cpu, st
 		env.load_move = imbalance;
 		env.src_cpu = busiest->cpu;
 		env.src_rq = busiest;
-		env.loop_max = busiest->nr_running;
+		env.loop_max = min_t(unsigned long,
+				sysctl_sched_nr_migrate, busiest->nr_running);
 
 more_balance:
 		local_irq_save(flags);
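
As a rough illustration of why the load < 16 filter misbehaves with
cgroups (numbers purely illustrative): a task group with the default
1024 shares spread over a 32-CPU box gets only about 1024/32 = 32 of
weight per CPU, and with two or more of its tasks on one runqueue each
task's task_h_load() easily ends up below 16, so without the LB_MIN
gate above the filter would skip essentially every task in such a group.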



* Re: load balancing regression since commit 367456c7
  2012-04-25 14:56               ` Peter Zijlstra
@ 2012-04-25 17:38                 ` Tim Chen
  2012-04-25 17:43                   ` Peter Zijlstra
  2012-04-26 11:56                 ` [tip:sched/urgent] sched: Fix more load-balancing fallout tip-bot for Peter Zijlstra
  1 sibling, 1 reply; 14+ messages in thread
From: Tim Chen @ 2012-04-25 17:38 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Suresh Siddha, Alex Shi, Ying, linux-kernel

On Wed, 2012-04-25 at 16:56 +0200, Peter Zijlstra wrote:
> On Fri, 2012-04-20 at 19:33 +0200, Peter Zijlstra wrote:
> > 
> > OK, I'll go stare at the cgroup part then.. Thanks!
> > 
> OK, I could reproduce it when using cgroups; the below fixes it for me,
> can you confirm?
> 
> ---

Mmm... I got the following error when I tried to compile the patch on the latest kernel tip:

kernel/sched/fair.c:3263:3: error: implicit declaration of function ‘static_branch_LB_MIN’ [-Werror=implicit-function-declaration]
kernel/sched/fair.c:3263:46: error: ‘__SCHED_FEAT_LB_MIN’ undeclared (first use in this function)
kernel/sched/fair.c:3263:46: note: each undeclared identifier is reported only once for each function it appears in
cc1: some warnings being treated as errors
make[2]: *** [kernel/sched/fair.o] Error 1
make[1]: *** [kernel/sched] Error 2
make: *** [kernel] Error 2

Tim



* Re: load balancing regression since commit 367456c7
  2012-04-25 17:38                 ` Tim Chen
@ 2012-04-25 17:43                   ` Peter Zijlstra
  2012-04-25 17:58                     ` Tim Chen
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2012-04-25 17:43 UTC (permalink / raw)
  To: Tim Chen; +Cc: Suresh Siddha, Alex Shi, Ying, linux-kernel


Gargh.. lost the change to kernel/sched/features.h, now included.

Sorry for that.
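
For context: each SCHED_FEAT(name, enabled) line in
kernel/sched/features.h generates a __SCHED_FEAT_<name> enum value and,
on SCHED_DEBUG builds with jump labels, a static_branch_<name>()
helper, which is exactly what the errors above complain about.
Roughly, simplified from kernel/sched/sched.h of that era (a sketch,
not the verbatim code):

	#define SCHED_FEAT(name, enabled)	__SCHED_FEAT_##name ,
	enum {
	#include "features.h"
		__SCHED_FEAT_NR,
	};
	#undef SCHED_FEAT

	#if defined(CONFIG_SCHED_DEBUG) && defined(HAVE_JUMP_LABEL)
	/* features.h also expands to one static_branch_<name>() wrapper each */
	#define sched_feat(x) (static_branch_##x(&sched_feat_keys[__SCHED_FEAT_##x]))
	#else
	#define sched_feat(x) (sysctl_sched_features & (1UL << __SCHED_FEAT_##x))
	#endif

So sched_feat(LB_MIN) cannot build until LB_MIN is declared there.
With CONFIG_SCHED_DEBUG the bit can then be toggled at runtime by
writing LB_MIN or NO_LB_MIN to /sys/kernel/debug/sched_features.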

---
Subject: sched: Fix more load-balance fallout
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date: Tue Apr 17 13:38:40 CEST 2012

Commits 367456c756a6 ("sched: Ditch per cgroup task lists for
load-balancing") and 5d6523ebd ("sched: Fix load-balance wreckage")
left some more wreckage.

By setting loop_max unconditionally to ->nr_running, load-balancing
could take a lot of time on very long runqueues (hackbench!). So keep
the sysctl as the max limit on the number of tasks we'll iterate.

Furthermore, the min load filter for migration completely fails with
cgroups since inequality in per-cpu state can easily lead to such
small loads :/

Furthermore the change to add new tasks to the tail of the queue
instead of the head seems to have some effect.. not quite sure I
understand why.

Combined, these fixes solve the huge hackbench regression reported by
Tim when hackbench is run in a cgroup.

Reported-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1335365763.28150.267.camel@twins
---
 kernel/sched/fair.c     |   19 ++++++++++++++-----
 kernel/sched/features.h |    1 +
 2 files changed, 15 insertions(+), 5 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -784,7 +784,7 @@ account_entity_enqueue(struct cfs_rq *cf
 		update_load_add(&rq_of(cfs_rq)->load, se->load.weight);
 #ifdef CONFIG_SMP
 	if (entity_is_task(se))
-		list_add_tail(&se->group_node, &rq_of(cfs_rq)->cfs_tasks);
+		list_add(&se->group_node, &rq_of(cfs_rq)->cfs_tasks);
 #endif
 	cfs_rq->nr_running++;
 }
@@ -3215,6 +3215,14 @@ static int move_one_task(struct lb_env *
 
 static unsigned long task_h_load(struct task_struct *p);
 
+static const unsigned int sched_nr_migrate_break =
+#ifdef CONFIG_PREEMPT
+	8
+#else
+	32
+#endif
+	;
+
 /*
  * move_tasks tries to move up to load_move weighted load from busiest to
  * this_rq, as part of a balancing operation within domain "sd".
@@ -3242,7 +3250,7 @@ static int move_tasks(struct lb_env *env
 
 		/* take a breather every nr_migrate tasks */
 		if (env->loop > env->loop_break) {
-			env->loop_break += sysctl_sched_nr_migrate;
+			env->loop_break += sched_nr_migrate_break;
 			env->flags |= LBF_NEED_BREAK;
 			break;
 		}
@@ -3252,7 +3260,7 @@ static int move_tasks(struct lb_env *env
 
 		load = task_h_load(p);
 
-		if (load < 16 && !env->sd->nr_balance_failed)
+		if (sched_feat(LB_MIN) && load < 16 && !env->sd->nr_balance_failed)
 			goto next;
 
 		if ((load / 2) > env->load_move)
@@ -4407,7 +4415,7 @@ static int load_balance(int this_cpu, st
 		.dst_cpu	= this_cpu,
 		.dst_rq		= this_rq,
 		.idle		= idle,
-		.loop_break	= sysctl_sched_nr_migrate,
+		.loop_break	= sched_nr_migrate_break,
 	};
 
 	cpumask_copy(cpus, cpu_active_mask);
@@ -4448,7 +4456,8 @@ static int load_balance(int this_cpu, st
 		env.load_move = imbalance;
 		env.src_cpu = busiest->cpu;
 		env.src_rq = busiest;
-		env.loop_max = busiest->nr_running;
+		env.loop_max = min_t(unsigned long,
+				sysctl_sched_nr_migrate, busiest->nr_running);
 
 more_balance:
 		local_irq_save(flags);
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -68,3 +68,4 @@ SCHED_FEAT(TTWU_QUEUE, true)
 
 SCHED_FEAT(FORCE_SD_OVERLAP, false)
 SCHED_FEAT(RT_RUNTIME_SHARE, true)
+SCHED_FEAT(LB_MIN, false)



* Re: load balancing regression since commit 367456c7
  2012-04-25 17:43                   ` Peter Zijlstra
@ 2012-04-25 17:58                     ` Tim Chen
  0 siblings, 0 replies; 14+ messages in thread
From: Tim Chen @ 2012-04-25 17:58 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Suresh Siddha, Alex Shi, Ying, linux-kernel

On Wed, 2012-04-25 at 19:43 +0200, Peter Zijlstra wrote:
> Gargh.. lost the change to kernel/sched/features.h, now included.
> 
> Sorry for that.
> 
> ---
> Subject: sched: Fix more load-balance fallout
> From: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Date: Tue Apr 17 13:38:40 CEST 2012
> 
> Commits 367456c756a6 ("sched: Ditch per cgroup task lists for
> load-balancing") and 5d6523ebd ("sched: Fix load-balance wreckage")
> left some more wreckage.
> 
> By setting loop_max unconditionally to ->nr_running, load-balancing
> could take a lot of time on very long runqueues (hackbench!). So keep
> the sysctl as the max limit on the number of tasks we'll iterate.
> 
> Furthermore, the min load filter for migration completely fails with
> cgroups since inequality in per-cpu state can easily lead to such
> small loads :/
> 
> Furthermore the change to add new tasks to the tail of the queue
> instead of the head seems to have some effect.. not quite sure I
> understand why.
> 
> Combined, these fixes solve the huge hackbench regression reported by
> Tim when hackbench is run in a cgroup.
> 
> Reported-by: Tim Chen <tim.c.chen@linux.intel.com>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Link: http://lkml.kernel.org/r/1335365763.28150.267.camel@twins

The patch fixed the regression for me.  

Acked-by: Tim Chen <tim.c.chen@linux.intel.com>




* [tip:sched/urgent] sched: Fix more load-balancing fallout
  2012-04-25 14:56               ` Peter Zijlstra
  2012-04-25 17:38                 ` Tim Chen
@ 2012-04-26 11:56                 ` tip-bot for Peter Zijlstra
  1 sibling, 0 replies; 14+ messages in thread
From: tip-bot for Peter Zijlstra @ 2012-04-26 11:56 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, torvalds, a.p.zijlstra, tim.c.chen, akpm, tglx

Commit-ID:  eb95308ee2a69403909e111837b9068c64cfc349
Gitweb:     http://git.kernel.org/tip/eb95308ee2a69403909e111837b9068c64cfc349
Author:     Peter Zijlstra <a.p.zijlstra@chello.nl>
AuthorDate: Tue, 17 Apr 2012 13:38:40 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 26 Apr 2012 12:54:52 +0200

sched: Fix more load-balancing fallout

Commits 367456c756a6 ("sched: Ditch per cgroup task lists for
load-balancing") and 5d6523ebd ("sched: Fix load-balance wreckage")
left some more wreckage.

By setting loop_max unconditionally to ->nr_running, load-balancing
could take a lot of time on very long runqueues (hackbench!). So keep
the sysctl as the max limit on the number of tasks we'll iterate.

Furthermore, the min load filter for migration completely fails with
cgroups since inequality in per-cpu state can easily lead to such
small loads :/

Furthermore the change to add new tasks to the tail of the queue
instead of the head seems to have some effect.. not quite sure I
understand why.

Combined, these fixes solve the huge hackbench regression reported by
Tim when hackbench is run in a cgroup.

Reported-by: Tim Chen <tim.c.chen@linux.intel.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1335365763.28150.267.camel@twins
[ got rid of the CONFIG_PREEMPT tuning and made small readability edits ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c     |   18 ++++++++++--------
 kernel/sched/features.h |    1 +
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0d97ebd..e955364 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -784,7 +784,7 @@ account_entity_enqueue(struct cfs_rq *cfs_rq, struct sched_entity *se)
 		update_load_add(&rq_of(cfs_rq)->load, se->load.weight);
 #ifdef CONFIG_SMP
 	if (entity_is_task(se))
-		list_add_tail(&se->group_node, &rq_of(cfs_rq)->cfs_tasks);
+		list_add(&se->group_node, &rq_of(cfs_rq)->cfs_tasks);
 #endif
 	cfs_rq->nr_running++;
 }
@@ -3215,6 +3215,8 @@ static int move_one_task(struct lb_env *env)
 
 static unsigned long task_h_load(struct task_struct *p);
 
+static const unsigned int sched_nr_migrate_break = 32;
+
 /*
  * move_tasks tries to move up to load_move weighted load from busiest to
  * this_rq, as part of a balancing operation within domain "sd".
@@ -3242,7 +3244,7 @@ static int move_tasks(struct lb_env *env)
 
 		/* take a breather every nr_migrate tasks */
 		if (env->loop > env->loop_break) {
-			env->loop_break += sysctl_sched_nr_migrate;
+			env->loop_break += sched_nr_migrate_break;
 			env->flags |= LBF_NEED_BREAK;
 			break;
 		}
@@ -3252,7 +3254,7 @@ static int move_tasks(struct lb_env *env)
 
 		load = task_h_load(p);
 
-		if (load < 16 && !env->sd->nr_balance_failed)
+		if (sched_feat(LB_MIN) && load < 16 && !env->sd->nr_balance_failed)
 			goto next;
 
 		if ((load / 2) > env->load_move)
@@ -4407,7 +4409,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 		.dst_cpu	= this_cpu,
 		.dst_rq		= this_rq,
 		.idle		= idle,
-		.loop_break	= sysctl_sched_nr_migrate,
+		.loop_break	= sched_nr_migrate_break,
 	};
 
 	cpumask_copy(cpus, cpu_active_mask);
@@ -4445,10 +4447,10 @@ redo:
 		 * correctly treated as an imbalance.
 		 */
 		env.flags |= LBF_ALL_PINNED;
-		env.load_move = imbalance;
-		env.src_cpu = busiest->cpu;
-		env.src_rq = busiest;
-		env.loop_max = busiest->nr_running;
+		env.load_move	= imbalance;
+		env.src_cpu	= busiest->cpu;
+		env.src_rq	= busiest;
+		env.loop_max	= min_t(unsigned long, sysctl_sched_nr_migrate, busiest->nr_running);
 
 more_balance:
 		local_irq_save(flags);
diff --git a/kernel/sched/features.h b/kernel/sched/features.h
index e61fd73..de00a48 100644
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -68,3 +68,4 @@ SCHED_FEAT(TTWU_QUEUE, true)
 
 SCHED_FEAT(FORCE_SD_OVERLAP, false)
 SCHED_FEAT(RT_RUNTIME_SHARE, true)
+SCHED_FEAT(LB_MIN, false)

