On Mon, 2018-06-04 at 15:30 +0530, Srikar Dronamraju wrote:
> 
> +	dist = node_distance(src_nid, dst_nid);
>  	if (numa_group) {
> -		src_faults = group_faults(p, src_nid);
> -		dst_faults = group_faults(p, dst_nid);
> +		src_weight = group_weight(p, src_nid, dist);
> +		dst_weight = group_weight(p, dst_nid, dist);
>  	} else {
> -		src_faults = task_faults(p, src_nid);
> -		dst_faults = task_faults(p, dst_nid);
> +		src_weight = task_weight(p, src_nid, dist);
> +		dst_weight = task_weight(p, dst_nid, dist);
>  	}
>  
> -	return dst_faults < src_faults;
> +	return dst_weight < src_weight;
>  }

While this is better in principle, in practice
task_weight() and group_weight() are a LOT more
expensive to calculate than just comparing the
fault counts.

This may be too expensive to do in the load
balancing code.
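
As a rough illustration of the cost difference,
here is a simplified userspace model, not the
kernel code. Every name in it (task_model,
node_dist, faults_model, weight_model) and the
4-node distance table are made up for the example.
It assumes the weight calculation scans the other
nodes and ends in a normalizing division, while
the fault comparison is one array lookup per node.

```c
#include <assert.h>

#define NR_NODES   4	/* hypothetical node count */
#define LOCAL_DIST 10	/* distance of a node to itself */

/* Hypothetical per-task fault statistics, one slot per node. */
struct task_model {
	unsigned long faults[NR_NODES];
	unsigned long total_faults;
};

/* Hypothetical symmetric distance table for a 4-node ring. */
static const int node_dist[NR_NODES][NR_NODES] = {
	{ 10, 20, 30, 20 },
	{ 20, 10, 20, 30 },
	{ 30, 20, 10, 20 },
	{ 20, 30, 20, 10 },
};

/* Cheap path: a single array lookup. */
unsigned long faults_model(const struct task_model *t, int nid)
{
	assert(nid >= 0 && nid < NR_NODES);
	return t->faults[nid];
}

/*
 * Expensive path (assumed shape): add distance-scaled faults from
 * every node closer than maxdist, then normalize to a 0..1000 scale.
 * Remote distances are all above LOCAL_DIST in this table, so the
 * inner divisor is never zero when a node passes the distance check.
 */
unsigned long weight_model(const struct task_model *t, int nid, int maxdist)
{
	unsigned long faults;
	int n;

	assert(nid >= 0 && nid < NR_NODES);
	if (!t->total_faults)
		return 0;

	faults = t->faults[nid];
	for (n = 0; n < NR_NODES; n++) {
		int d = node_dist[nid][n];

		if (n == nid || d >= maxdist)
			continue;
		faults += t->faults[n] * (unsigned long)(maxdist - d) /
			  (unsigned long)(maxdist - LOCAL_DIST);
	}
	return 1000 * faults / t->total_faults;
}
```

In this model faults_model() is a single load,
while weight_model() walks every node and divides
several times, and the comparison in the patch
needs that done for both the source and the
destination node.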

This patch regressed performance in your synthetic
test. How does it do for "real workload" style
benchmarks?

-- 
All Rights Reversed.