On Mon, 2018-06-04 at 15:30 +0530, Srikar Dronamraju wrote:
>
> +       dist = node_distance(src_nid, dst_nid);
>         if (numa_group) {
> -               src_faults = group_faults(p, src_nid);
> -               dst_faults = group_faults(p, dst_nid);
> +               src_weight = group_weight(p, src_nid, dist);
> +               dst_weight = group_weight(p, dst_nid, dist);
>         } else {
> -               src_faults = task_faults(p, src_nid);
> -               dst_faults = task_faults(p, dst_nid);
> +               src_weight = task_weight(p, src_nid, dist);
> +               dst_weight = task_weight(p, dst_nid, dist);
>         }
>
> -       return dst_faults < src_faults;
> +       return dst_weight < src_weight;
>  }

While this is better in principle, in practice
task/group_weight is a LOT more expensive to
calculate than just comparing the faults.

This may be too expensive to do in the load
balancing code.

This patch regressed performance in your synthetic
test. How does it do for "real workload" style
benchmarks?

-- 
All Rights Reversed.
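
To make the cost concern concrete, here is a rough user-space sketch. It
is NOT the kernel code: the fault counts, the distance table, and the
names toy_weight(), degrades_by_faults() and degrades_by_weight() are all
made up for illustration, and the real task_weight()/group_weight()
helpers differ in detail. The point is only that a raw fault comparison
is two array lookups, while a distance-aware weight has to walk the other
nodes and do a multiply and divide for each one, on every call.

```c
#include <assert.h>

#define NR_NODES   4
#define LOCAL_DIST 10   /* distance of a node to itself */
#define MAX_DIST   40   /* largest distance in the toy table below */

/* Hypothetical per-node NUMA fault counts for one task. */
static const unsigned long faults[NR_NODES] = { 400, 100, 300, 200 };

/* Hypothetical SLIT-style node distance table. */
static const int distance[NR_NODES][NR_NODES] = {
    { 10, 20, 30, 40 },
    { 20, 10, 20, 30 },
    { 30, 20, 10, 20 },
    { 40, 30, 20, 10 },
};

/*
 * Toy distance-aware weight: faults on nid plus a share of the faults
 * on every other node, scaled down as the node gets further away, then
 * normalized to parts per thousand of all faults.
 * Cost: one loop over all nodes, with a multiply and divide per node.
 */
unsigned long toy_weight(int nid)
{
    unsigned long total = 0, nearby = 0;

    assert(nid >= 0 && nid < NR_NODES);

    for (int n = 0; n < NR_NODES; n++) {
        total += faults[n];
        if (n == nid)
            continue;
        nearby += faults[n] * (MAX_DIST - distance[nid][n])
                  / (MAX_DIST - LOCAL_DIST);
    }
    if (!total)
        return 0;
    return 1000 * (faults[nid] + nearby) / total;
}

/* Old-style check: two array lookups and a compare. */
int degrades_by_faults(int src, int dst)
{
    return faults[dst] < faults[src];
}

/* New-style check: two full node walks and a compare. */
int degrades_by_weight(int src, int dst)
{
    return toy_weight(dst) < toy_weight(src);
}
```

With these made-up numbers, moving from node 0 to node 2 looks like a
locality loss by raw faults (300 < 400) but not by weight (632 vs 566),
which is the "better in principle" part. The extra per-call loop is the
part that may hurt in the load balancing path.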