* [RFC] sched: unused cpu in affine workload
@ 2016-04-04 8:23 Jiri Olsa
2016-04-04 8:44 ` Peter Zijlstra
2016-04-04 8:59 ` Ingo Molnar
0 siblings, 2 replies; 9+ messages in thread
From: Jiri Olsa @ 2016-04-04 8:23 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra
Cc: James Hartsock, Rik van Riel, Srivatsa Vaddagiri, Kirill Tkhai,
linux-kernel
hi,
we've noticed the following issue in one of our workloads.
I have a 24-CPU server with the following sched domains:
domain 0: (pairs)
domain 1: 0-5,12-17 (group1) 6-11,18-23 (group2)
domain 2: 0-23 level NUMA
I run a CPU-hogging workload on the following CPUs:
4,6,14,18,19,20,23
that is:
4,14 CPUs from group1
6,18,19,20,23 CPUs from group2
the workload process gets its affinity set up via 'taskset -c ${CPUs workload ...'
and forks a child for every CPU
very often we notice CPUs 4 and 14 running 3 processes of the workload
while CPUs 6,18,19,20,23 run just 4 processes, leaving one of the
CPUs from group2 idle
AFAICS from the code the reason for this is that the load balancing
follows the domain setup (topology) and does not take affinity setups
like this into account. The code in find_busiest_group running on an idle
CPU from group2 will find group1 as busiest, but its average load will be
smaller than that of the local group, so there's no task pulling.
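To make the average-load comparison concrete, here is a toy model of the arithmetic, with illustrative values only (one CPU hog weighted as 1024, each domain-1 group spanning 12 CPUs as above); this is not kernel code:

```python
NICE_0_LOAD = 1024  # weight of one nice-0 CPU hog (illustrative)

def group_avg_load(ntasks, ncpus):
    """Per-CPU average load of a sched group running ntasks hogs."""
    return ntasks * NICE_0_LOAD / ncpus

# group1 (0-5,12-17): the 3 hogs crowded onto CPUs 4 and 14
busiest_avg = group_avg_load(3, 12)  # 256.0
# group2 (6-11,18-23): the remaining 4 hogs
local_avg = group_avg_load(4, 12)    # ~341.3

# An idle CPU in group2 pulls only when the remote group's average
# load exceeds the local group's, which never happens here, so the
# idle CPU stays idle.
pulls = busiest_avg > local_avg
print(pulls)  # False
```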
It's obvious that the load balancer follows the sched domain topology.
However, is there some sched feature I'm missing that could help
with this? Or do we need to follow the sched domain topology when
we select CPUs for the workload to get even balancing?
thanks,
jirka
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC] sched: unused cpu in affine workload
2016-04-04 8:23 [RFC] sched: unused cpu in affine workload Jiri Olsa
@ 2016-04-04 8:44 ` Peter Zijlstra
2016-04-04 8:59 ` Ingo Molnar
1 sibling, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2016-04-04 8:44 UTC (permalink / raw)
To: Jiri Olsa
Cc: Ingo Molnar, James Hartsock, Rik van Riel, Srivatsa Vaddagiri,
Kirill Tkhai, linux-kernel
On Mon, Apr 04, 2016 at 10:23:02AM +0200, Jiri Olsa wrote:
> hi,
> we've noticed the following issue in one of our workloads.
>
> I have a 24-CPU server with the following sched domains:
> domain 0: (pairs)
> domain 1: 0-5,12-17 (group1) 6-11,18-23 (group2)
> domain 2: 0-23 level NUMA
>
> I run a CPU-hogging workload on the following CPUs:
> 4,6,14,18,19,20,23
>
> that is:
> 4,14 CPUs from group1
> 6,18,19,20,23 CPUs from group2
>
> the workload process gets its affinity set up via 'taskset -c ${CPUs workload ...'
> and forks a child for every CPU
>
> very often we notice CPUs 4 and 14 running 3 processes of the workload
> while CPUs 6,18,19,20,23 run just 4 processes, leaving one of the
> CPUs from group2 idle
>
> AFAICS from the code the reason for this is that the load balancing
> follows the domain setup (topology) and does not take affinity setups
> like this into account. The code in find_busiest_group running on an idle
> CPU from group2 will find group1 as busiest, but its average load will be
> smaller than that of the local group, so there's no task pulling.
>
> It's obvious that the load balancer follows the sched domain topology.
> However, is there some sched feature I'm missing that could help
> with this? Or do we need to follow the sched domain topology when
> we select CPUs for the workload to get even balancing?
Yeah, this is 'hard'; there is some code that tries not to totally blow up
with this, but it's all a bit of a mess. See
kernel/sched/fair.c:sg_imbalanced().
The easiest solution is to simply not do this and stick with the topo
like you suggest.
So far I've not come up with a sane/stable solution for this problem.
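The idea behind sg_imbalanced() can be sketched roughly like this (names and logic are illustrative, not the kernel implementation): a balance attempt that fails purely because of task affinity leaves an "imbalance" hint on the group, which lets a later pass pull despite the plain average-load comparison:

```python
class SchedGroupCapacity:
    """Toy stand-in for per-group balance state (illustrative)."""
    def __init__(self):
        # set when a previous, affinity-limited balance attempt failed
        self.imbalance = 0

def should_pull(busiest_avg, local_avg, sgc):
    if busiest_avg > local_avg:
        return True             # the normal load-based decision
    return bool(sgc.imbalance)  # affinity-induced override

sgc = SchedGroupCapacity()
print(should_pull(256, 341, sgc))  # False: normal case, no pull
sgc.imbalance = 1
print(should_pull(256, 341, sgc))  # True: the hint forces a pull attempt
```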
* Re: [RFC] sched: unused cpu in affine workload
2016-04-04 8:23 [RFC] sched: unused cpu in affine workload Jiri Olsa
2016-04-04 8:44 ` Peter Zijlstra
@ 2016-04-04 8:59 ` Ingo Molnar
2016-04-04 9:19 ` Ingo Molnar
1 sibling, 1 reply; 9+ messages in thread
From: Ingo Molnar @ 2016-04-04 8:59 UTC (permalink / raw)
To: Jiri Olsa
Cc: Peter Zijlstra, James Hartsock, Rik van Riel, Srivatsa Vaddagiri,
Kirill Tkhai, linux-kernel
* Jiri Olsa <jolsa@redhat.com> wrote:
> hi,
> we've noticed the following issue in one of our workloads.
>
> I have a 24-CPU server with the following sched domains:
> domain 0: (pairs)
> domain 1: 0-5,12-17 (group1) 6-11,18-23 (group2)
> domain 2: 0-23 level NUMA
>
> I run a CPU-hogging workload on the following CPUs:
> 4,6,14,18,19,20,23
>
> that is:
> 4,14 CPUs from group1
> 6,18,19,20,23 CPUs from group2
>
> the workload process gets its affinity set up via 'taskset -c ${CPUs workload ...'
> and forks a child for every CPU
>
> very often we notice CPUs 4 and 14 running 3 processes of the workload
> while CPUs 6,18,19,20,23 run just 4 processes, leaving one of the
> CPUs from group2 idle
>
> AFAICS from the code the reason for this is that the load balancing
> follows the domain setup (topology) and does not take affinity setups
> like this into account. The code in find_busiest_group running on an idle
> CPU from group2 will find group1 as busiest, but its average load will be
> smaller than that of the local group, so there's no task pulling.
>
> It's obvious that the load balancer follows the sched domain topology.
> However, is there some sched feature I'm missing that could help
> with this? Or do we need to follow the sched domain topology when
> we select CPUs for the workload to get even balancing?
Yeah, so the principle with user-pinning of tasks to CPUs was always:
- pinning a task to a single CPU should obviously work fine, it's the primary
usecase for isolation.
- pinning a task to an arbitrary subset of CPUs is a 'hard' problem
mathematically that the scheduler never truly wanted to solve in a frontal
fashion.
... but that principle was set into place well before we did the NUMA scheduling
work, which in itself is a highly non-trivial load optimization problem to begin
with, so we might want to reconsider.
So there's two directions I can suggest:
- if you can come up with workable small-scale solutions to scratch an itch
that comes up in practice then that's obviously good, as long as it does not
regress anything else.
- if you want to come up with a 'complete' solution then please don't put it into
hot paths such as wakeup or context switching, or any of the hardirq methods,
but try to integrate it with the NUMA scheduling slow path.
The NUMA balancing slow path is softirq driven and runs at a reasonably low
frequency, so it does not cause many performance problems.
The two problems (NUMA affinity and user affinity) are also loosely related on a
conceptual level: the NUMA affinity optimization problem can be considered as a
workload determined, arbitrary 'NUMA mask' being optimized from first principles.
There's one ABI detail: this is true only as long as SMP affinity masks follow
node boundaries - the current NUMA balancing code is very much node granular, so
the two can only be merged if the ->cpus_allowed mask follows node boundaries as
well.
A third approach would be to extend the NUMA balancing code to be CPU granular
(without changing any task placement behavior of the current NUMA balancing code,
of course), with node granular being a special case. This would fit the cgroups
(and virtualization) usecases, but that would be a major change.
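The node-boundary condition above can be expressed as a small predicate. This is a sketch under the assumption that the two domain-1 groups stand in for NUMA nodes; `follows_node_boundaries` is a made-up helper name, not a kernel function:

```python
node0 = frozenset(range(0, 6)) | frozenset(range(12, 18))   # group1
node1 = frozenset(range(6, 12)) | frozenset(range(18, 24))  # group2
nodes = [node0, node1]

def follows_node_boundaries(mask, nodes):
    """True iff mask is a union of whole nodes, i.e. node granular."""
    return all(node <= mask or not (node & mask) for node in nodes)

print(follows_node_boundaries(node0 | node1, nodes))               # True
print(follows_node_boundaries(node1, nodes))                       # True
# The affinity mask from this thread straddles both groups:
print(follows_node_boundaries({4, 6, 14, 18, 19, 20, 23}, nodes))  # False
```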
Thanks,
Ingo
* Re: [RFC] sched: unused cpu in affine workload
2016-04-04 8:59 ` Ingo Molnar
@ 2016-04-04 9:19 ` Ingo Molnar
2016-04-04 9:38 ` Ingo Molnar
0 siblings, 1 reply; 9+ messages in thread
From: Ingo Molnar @ 2016-04-04 9:19 UTC (permalink / raw)
To: Jiri Olsa
Cc: Peter Zijlstra, James Hartsock, Rik van Riel, Srivatsa Vaddagiri,
Kirill Tkhai, linux-kernel
* Ingo Molnar <mingo@kernel.org> wrote:
> - if you want to come up with a 'complete' solution then please don't put it into
> hot paths such as wakeup or context switching, or any of the hardirq methods,
> but try to integrate it with the NUMA scheduling slow path.
>
> The NUMA balancing slow path is softirq driven and runs at a reasonably low
> frequency, so it does not cause many performance problems.
>
> The two problems (NUMA affinity and user affinity) are also loosely related on a
> conceptual level: the NUMA affinity optimization problem can be considered as a
> workload determined, arbitrary 'NUMA mask' being optimized from first
> principles.
>
> There's one ABI detail: this is true only as long as SMP affinity masks follow
> node boundaries - the current NUMA balancing code is very much node granular, so
> the two can only be merged if the ->cpus_allowed mask follows node boundaries as
> well.
>
> A third approach would be to extend the NUMA balancing code to be CPU granular
> (without changing any task placement behavior of the current NUMA balancing code
> of course), with node granular being a special case. This would fit the cgroups
> (and virtualization) usecases, but that would be a major change.
So my thinking here is: if the NUMA balancing code (which is node granular at the
moment and uses node masks, etc.) is extended to be CPU granular (which is a big
task in itself), then the two problems can be 'unified':
- the NUMA balancing code inputs arbitrary CPU (node) affinity masks from the
MM code into the scheduler.
- the scheduler syscall ABI (and other configuration sources) inputs arbitrary
CPU affinity masks into the scheduler.
it's a similar problem, with two (minor looking) complications:
- the NUMA code right now is 'statistical', while ->cpus_allowed are hard
constraints that must never be violated. So there always has to be a final
layer to implement the hard constraint - which does not exist in the NUMA
balancing case. This should be relatively easy I think as we already do it
with the regular balancer.
- the balancing slowpath would have to be activated on non-NUMA systems as well,
so that it can handle ->cpus_allowed balancing.
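The first complication, the hard ->cpus_allowed constraint, amounts to a final filtering layer over whatever the statistical balancer proposes; a minimal sketch, with a made-up helper name:

```python
def constrain(proposed_cpus, cpus_allowed):
    """Intersect a proposed placement with the task's hard affinity mask."""
    legal = proposed_cpus & cpus_allowed
    # ->cpus_allowed must never be violated; an empty intersection
    # means the whole proposal is rejected.
    return legal or None

print(constrain({0, 1, 2, 3}, {2, 3, 8}))  # {2, 3}
print(constrain({0, 1}, {8, 9}))           # None
```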
... once all that is solved, I can see several advantages from unifying the NUMA
balancing and SMP affinity balancing code:
- the NUMA balancer would improve: cpus_allowed isolation is used more
frequently, so fixes from those workloads would benefit the NUMA balancing case
as well.
- testing the NUMA balancer would become easier: we'd simply set cpus_allowed and
watch how it balances. No need to coax workloads into actual MM NUMA
usage patterns to set up interesting scenarios.
- our existing half-hearted ways to deal with cpus_allowed balancing could be
outsourced to the NUMA slow path, which would simplify the SMP balancing fast
path.
But it's a major piece of work, and I might be missing implementational details.
It would be the biggest new scheduler feature since NUMA balancing for sure.
Thanks,
Ingo
* Re: [RFC] sched: unused cpu in affine workload
2016-04-04 9:19 ` Ingo Molnar
@ 2016-04-04 9:38 ` Ingo Molnar
2016-04-04 13:23 ` Peter Zijlstra
0 siblings, 1 reply; 9+ messages in thread
From: Ingo Molnar @ 2016-04-04 9:38 UTC (permalink / raw)
To: Jiri Olsa
Cc: Peter Zijlstra, James Hartsock, Rik van Riel, Srivatsa Vaddagiri,
Kirill Tkhai, linux-kernel
* Ingo Molnar <mingo@kernel.org> wrote:
> So my thinking here is: if the NUMA balancing code (which is node granular at
> the moment and uses node masks, etc.) is extended to be CPU granular (which is a
> big task in itself), then the two problems can be 'unified':
>
> - the NUMA balancing code inputs arbitrary CPU (node) affinity masks from the
> MM code into the scheduler.
>
> - the scheduler syscall ABI (and other configuration sources) inputs arbitrary
> CPU affinity masks into the scheduler.
>
> it's a similar problem, with two (minor looking) complications:
btw., this highlights how hard the optimization problem is: the NUMA balancing
code is (at least ...) O(nr_nodes^2) complex - but we had O(nr_nodes^3) passes too
in some of the NUMA balancing submissions...
We'd upgrade that to O(nr_cpus^2), which is totally unrealistic with 16,000 CPUs
even in a slowpath - but it would probably cause problems even with 120 CPUs. It
will get quadratically worse as the number of CPUs in a system increases on its
current exponential trajectory ...
So the safest bet would be to restrict any 'perfect' balancing attempts to node
boundaries. Which won't solve the problem you outlined to begin with.
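Back-of-the-envelope numbers for an O(n^2) pass (pair counts, purely illustrative arithmetic):

```python
def pairs(n):
    """Number of unordered pairs an O(n^2) pass would consider."""
    return n * (n - 1) // 2

print(pairs(2))      # 1: two nodes, trivial
print(pairs(120))    # 7140: already noticeable, even in a slowpath
print(pairs(16000))  # 127992000: clearly unworkable
```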
Thanks,
Ingo
* Re: [RFC] sched: unused cpu in affine workload
2016-04-04 9:38 ` Ingo Molnar
@ 2016-04-04 13:23 ` Peter Zijlstra
2016-04-04 19:45 ` Rik van Riel
0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2016-04-04 13:23 UTC (permalink / raw)
To: Ingo Molnar
Cc: Jiri Olsa, James Hartsock, Rik van Riel, Srivatsa Vaddagiri,
Kirill Tkhai, linux-kernel
On Mon, Apr 04, 2016 at 11:38:44AM +0200, Ingo Molnar wrote:
> We'd upgrade that to O(nr_cpus^2), which is totally unrealistic with 16,000 CPUs
> even in a slowpath - but it would probably cause problems even with 120 CPUs. It
> will get quadratically worse as the number of CPUs in a system increases on its
> current exponential trajectory ...
The arbitrary affinity thing is, I think, a packing problem, which is NP-hard
IIRC.
* Re: [RFC] sched: unused cpu in affine workload
2016-04-04 13:23 ` Peter Zijlstra
@ 2016-04-04 19:45 ` Rik van Riel
2016-04-04 21:34 ` Peter Zijlstra
0 siblings, 1 reply; 9+ messages in thread
From: Rik van Riel @ 2016-04-04 19:45 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar
Cc: Jiri Olsa, James Hartsock, Srivatsa Vaddagiri, Kirill Tkhai,
linux-kernel
On Mon, 2016-04-04 at 15:23 +0200, Peter Zijlstra wrote:
> On Mon, Apr 04, 2016 at 11:38:44AM +0200, Ingo Molnar wrote:
> >
> > We'd upgrade that to O(nr_cpus^2), which is totally unrealistic with 16,000 CPUs
> > even in a slowpath - but it would probably cause problems even with 120 CPUs. It
> > will get quadratically worse as the number of CPUs in a system increases on its
> > current exponential trajectory ...
> The arbitrary affinity thing is, I think, a packing problem, which is NP-hard
> IIRC.
An optimal solution is NP-hard.
Heuristics that "move tasks with pressure" may be
much more doable, and lead to perfectly satisfactory
results, especially if most migrations happen within
a socket (and the same shared L3 cache).
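One possible shape of such a "move tasks under pressure" heuristic, sketched with made-up names (an illustration of the idea, not a proposal for actual kernel code): pick the most loaded CPU in the affinity mask, move one task to the least loaded one (preferring a same-socket destination on ties), and only when the imbalance exceeds one task:

```python
def pick_migration(nr_running, socket_of):
    """Return (src, dst) for one migration, or None if balanced enough."""
    src = max(nr_running, key=nr_running.get)
    # Least-loaded other CPU; ties go to a CPU sharing src's socket
    # (and hence, typically, its L3).
    dst = min((c for c in nr_running if c != src),
              key=lambda c: (nr_running[c], socket_of[c] != socket_of[src]))
    if nr_running[src] - nr_running[dst] > 1:
        return src, dst
    return None

# The scenario from this thread: hogs crowded on CPUs 4/14, CPU 23 idle.
nr_running = {4: 2, 14: 1, 6: 1, 18: 1, 19: 1, 20: 1, 23: 0}
socket_of = {4: 0, 14: 0, 6: 1, 18: 1, 19: 1, 20: 1, 23: 1}
print(pick_migration(nr_running, socket_of))  # (4, 23)
```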
--
All Rights Reversed.
* Re: [RFC] sched: unused cpu in affine workload
2016-04-04 19:45 ` Rik van Riel
@ 2016-04-04 21:34 ` Peter Zijlstra
2016-04-05 8:56 ` Jiri Olsa
0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2016-04-04 21:34 UTC (permalink / raw)
To: Rik van Riel
Cc: Ingo Molnar, Jiri Olsa, James Hartsock, Srivatsa Vaddagiri,
Kirill Tkhai, linux-kernel
On Mon, Apr 04, 2016 at 03:45:16PM -0400, Rik van Riel wrote:
> An optimal solution is NP-hard.
>
> Heuristics that "move tasks with pressure" may be
> much more doable, and lead to perfectly satisfactory
> results, especially if most migrations happen within
> a socket (and the same shared L3 cache).
Right; the trick will be finding something that mostly works without making
the regular balance paths increase in complexity.
As per the argument in kernel/sched/fair.c:5694 the current
load-balancing averages out to O(n), and I would very much like to keep
it that way.
* Re: [RFC] sched: unused cpu in affine workload
2016-04-04 21:34 ` Peter Zijlstra
@ 2016-04-05 8:56 ` Jiri Olsa
0 siblings, 0 replies; 9+ messages in thread
From: Jiri Olsa @ 2016-04-05 8:56 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Rik van Riel, Ingo Molnar, James Hartsock, Srivatsa Vaddagiri,
Kirill Tkhai, linux-kernel
On Mon, Apr 04, 2016 at 11:34:50PM +0200, Peter Zijlstra wrote:
> On Mon, Apr 04, 2016 at 03:45:16PM -0400, Rik van Riel wrote:
> > An optimal solution is NP-hard.
> >
> > Heuristics that "move tasks with pressure" may be
> > much more doable, and lead to perfectly satisfactory
> > results, especially if most migrations happen within
> > a socket (and the same shared L3 cache).
>
> Right; the trick will be finding something that mostly works without making
> the regular balance paths increase in complexity.
>
> As per the argument in kernel/sched/fair.c:5694 the current
> load-balancing averages out to O(n), and I would very much like to keep
> it that way.
guys, thanks a lot for all the thoughts and suggestions,
I'll try to come up with something
jirka
end of thread
Thread overview: 9+ messages
2016-04-04 8:23 [RFC] sched: unused cpu in affine workload Jiri Olsa
2016-04-04 8:44 ` Peter Zijlstra
2016-04-04 8:59 ` Ingo Molnar
2016-04-04 9:19 ` Ingo Molnar
2016-04-04 9:38 ` Ingo Molnar
2016-04-04 13:23 ` Peter Zijlstra
2016-04-04 19:45 ` Rik van Riel
2016-04-04 21:34 ` Peter Zijlstra
2016-04-05 8:56 ` Jiri Olsa