From: Srikar Dronamraju
To: Ingo Molnar, Peter Zijlstra
Cc: LKML, Mel Gorman, Rik van Riel, Srikar Dronamraju
Subject: [PATCH] sched: Fix numabalancing to work with isolated cpus
Date: Tue, 4 Apr 2017 22:57:28 +0530
Message-Id: <1491326848-5748-1-git-send-email-srikar@linux.vnet.ibm.com>

When performing load balancing, numabalancing only looks at
task->cpus_allowed to see if the task can run on the target cpu. If the
isolcpus kernel parameter is set, the isolated cpus will not be part of
the task->cpus_allowed mask.

For example (on a Power 8 box running in smt 1 mode):

isolcpus=56,64,72,80,88

Cpus_allowed_list:	0-55,57-63,65-71,73-79,81-87,89-175
/proc/20996/task/20996/status:Cpus_allowed_list:	0-55,57-63,65-71,73-79,81-87,89-175
/proc/20996/task/20997/status:Cpus_allowed_list:	0-55,57-63,65-71,73-79,81-87,89-175
/proc/20996/task/20998/status:Cpus_allowed_list:	0-55,57-63,65-71,73-79,81-87,89-175

Note: offline cpus are excluded in cpus_allowed_list.

However, a task might call sched_setaffinity() with a mask that includes
all possible cpus in the system, including the isolated cpus.

For example:
perf bench numa mem --no-data_rand_walk -p 4 -t $THREADS -G 0 -P 3072 -T 0 -l 50 -c -s 1000
would call sched_setaffinity(), which resets the cpus_allowed mask.

Cpus_allowed_list:	0-55,57-63,65-71,73-79,81-87,89-175
Cpus_allowed_list:	0,8,16,24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152,160,168
Cpus_allowed_list:	0,8,16,24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152,160,168
Cpus_allowed_list:	0,8,16,24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152,160,168
Cpus_allowed_list:	0,8,16,24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152,160,168

The isolated cpus are now part of the cpus allowed list. In the above
case, numabalancing ends up scheduling some of these tasks on isolated
cpus.

To avoid this, check for isolated cpus before choosing a target cpu.

Signed-off-by: Srikar Dronamraju
---
 kernel/sched/fair.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f045a35..f853dc0 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1666,6 +1666,10 @@ static void task_numa_find_cpu(struct task_numa_env *env,
 		if (!cpumask_test_cpu(cpu, &env->p->cpus_allowed))
 			continue;
 
+		/* Skip isolated cpus */
+		if (cpumask_test_cpu(cpu, cpu_isolated_map))
+			continue;
+
 		env->dst_cpu = cpu;
 		task_numa_compare(env, taskimp, groupimp);
 	}
-- 
1.8.3.1
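
Illustration only, not part of the patch: the selection rule that the hunk adds to task_numa_find_cpu() can be modelled in user space as below. This is a minimal sketch under stated assumptions. The names cpu_mask_t, mask_test_cpu() and numa_candidate_cpus() are hypothetical, and a single 64-bit word stands in for the kernel's cpumask type, so it only covers cpus 0-63.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for a cpumask: one bit per cpu, cpus 0-63. */
typedef uint64_t cpu_mask_t;

/* Return non-zero if 'cpu' is set in 'mask' (models cpumask_test_cpu()). */
static int mask_test_cpu(int cpu, cpu_mask_t mask)
{
	assert(cpu >= 0 && cpu < 64);
	return (int)((mask >> cpu) & 1u);
}

/*
 * Model of the candidate filter: walk the cpus of the target node and keep
 * only those that the task is allowed to run on and that are not isolated.
 */
static cpu_mask_t numa_candidate_cpus(cpu_mask_t node_cpus,
				      cpu_mask_t cpus_allowed,
				      cpu_mask_t isolated)
{
	cpu_mask_t candidates = 0;
	int cpu;

	for (cpu = 0; cpu < 64; cpu++) {
		if (!mask_test_cpu(cpu, node_cpus))
			continue;

		/* Existing check: the task must be allowed on this cpu. */
		if (!mask_test_cpu(cpu, cpus_allowed))
			continue;

		/* Check modelled on the patch: skip isolated cpus. */
		if (mask_test_cpu(cpu, isolated))
			continue;

		candidates |= (cpu_mask_t)1 << cpu;
	}

	return candidates;
}
```

With node cpus 0-7, an allowed mask that sched_setaffinity() has widened to all eight cpus, and cpu 3 isolated, numa_candidate_cpus(0xFF, 0xFF, 0x08) yields 0xF7: every cpu on the node except the isolated one.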