From: Srikar Dronamraju
To: Ingo Molnar, Peter Zijlstra, Michael Ellerman
Subject: [PATCH v2 1/2] sched/topology: Skip updating masks for non-online nodes
Date: Thu, 1 Jul 2021 09:45:51 +0530
Message-Id: <20210701041552.112072-2-srikar@linux.vnet.ibm.com>
X-Mailer: git-send-email 2.26.3
In-Reply-To: <20210701041552.112072-1-srikar@linux.vnet.ibm.com>
References: <20210701041552.112072-1-srikar@linux.vnet.ibm.com>
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
List-Id: Linux on PowerPC Developers Mail List
Cc: Nathan Lynch, Gautham R Shenoy, Vincent Guittot, Srikar Dronamraju,
 Rik van Riel, linuxppc-dev@lists.ozlabs.org, Geetika Moolchandani, LKML,
 Dietmar Eggemann, Thomas Gleixner, Laurent Dufour, Mel Gorman,
 Valentin Schneider

Currently the scheduler doesn't check whether a node is online before
adding CPUs to the node mask. However, on some architectures, node
distance is only available for nodes that are online. It is unclear how
much the node distance can be relied on when one of the nodes is
offline.

If the reported node distance is fake (because one of the nodes is
offline) and the actual node distance is different, then the cpumask of
such nodes will be wrong once the nodes come online.

This can cause topology_span_sane() to emit a warning and leave the
rest of the topology not updated properly.

Resolve this by skipping the cpumask update for nodes that are not
online.

However, because of the skipping, relevant CPUs may not be set when
nodes are onlined, i.e. when computing the NUMA masks at a certain NUMA
distance, CPUs that belong to other nodes which are already online will
not be part of the NUMA mask. Hence, the first time a CPU is added to
the newly onlined node, add the other CPUs to the numa_mask.

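The mask update described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not
the kernel code: the bitmask array masks[][], the tables distance[][],
level_distance[] and cpu_node[], the online[] flags and the helpers
masks_set() and demo() are hypothetical stand-ins for
sched_domains_numa_masks, node_distance(), sched_domains_numa_distance,
cpu_to_node(), node_online() and the hotplug path. It also assumes the
node is marked online before its first CPU is added.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_NODES  3
#define NR_LEVELS 2

/* Toy stand-ins for kernel state; all names and values are illustrative. */
static const int level_distance[NR_LEVELS] = { 10, 20 };
static const int distance[NR_NODES][NR_NODES] = {
	{ 10, 20, 20 },
	{ 20, 10, 20 },
	{ 20, 20, 10 },
};
static const int cpu_node[] = { 0, 0, 1, 1, 2, 2 };	/* cpu -> node */
static bool online[NR_NODES];
static unsigned int masks[NR_LEVELS][NR_NODES];	/* bit n set = CPU n */

/* Mirrors the shape of the patched sched_domains_numa_masks_set(). */
static void masks_set(int cpu)
{
	int node = cpu_node[cpu];
	/* Was this node's own (level 0) mask empty before this CPU arrived? */
	bool empty = masks[0][node] == 0;

	for (int i = 0; i < NR_LEVELS; i++) {
		for (int j = 0; j < NR_NODES; j++) {
			if (!online[j])
				continue;	/* distance to offline nodes is not trusted */
			if (distance[j][node] > level_distance[i])
				continue;

			masks[i][j] |= 1u << cpu;

			/* First CPU on a newly onlined node: pull in the CPUs of node j. */
			if (node != j && empty)
				masks[i][node] |= masks[0][j];
		}
	}
}

/* Bring up node 0 (CPUs 0-1), then node 1 (CPUs 2-3); node 2 stays offline. */
static void demo(void)
{
	online[0] = true;
	masks_set(0);
	masks_set(1);

	online[1] = true;
	masks_set(2);
	masks_set(3);
}
```

In this model, after demo() the level-1 masks of node 0 and node 1 both
cover CPUs 0-3, while every mask of the offline node 2 stays empty.
Without the "empty" step, the level-1 mask of node 1 would contain only
CPUs 2-3, because CPUs 0-1 were added while node 1 was being skipped.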
Cc: LKML
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Nathan Lynch
Cc: Michael Ellerman
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Valentin Schneider
Cc: Gautham R Shenoy
Cc: Dietmar Eggemann
Cc: Mel Gorman
Cc: Vincent Guittot
Cc: Rik van Riel
Cc: Geetika Moolchandani
Cc: Laurent Dufour
Reported-by: Geetika Moolchandani
Signed-off-by: Srikar Dronamraju
---
Changelog v1->v2:
v1 link: http://lore.kernel.org/lkml/20210520154427.1041031-4-srikar@linux.vnet.ibm.com/t/#u
Update the NUMA masks whenever the first CPU is added to a cpuless node.

 kernel/sched/topology.c | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index b77ad49dc14f..f25dbcab4fd2 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1833,6 +1833,9 @@ void sched_init_numa(void)
 			sched_domains_numa_masks[i][j] = mask;
 
 			for_each_node(k) {
+				if (!node_online(j))
+					continue;
+
 				if (sched_debug() && (node_distance(j, k) != node_distance(k, j)))
 					sched_numa_warn("Node-distance not symmetric");
 
@@ -1891,12 +1894,30 @@ void sched_init_numa(void)
 void sched_domains_numa_masks_set(unsigned int cpu)
 {
 	int node = cpu_to_node(cpu);
-	int i, j;
+	int i, j, empty;
 
+	empty = cpumask_empty(sched_domains_numa_masks[0][node]);
 	for (i = 0; i < sched_domains_numa_levels; i++) {
 		for (j = 0; j < nr_node_ids; j++) {
-			if (node_distance(j, node) <= sched_domains_numa_distance[i])
+			if (!node_online(j))
+				continue;
+
+			if (node_distance(j, node) <= sched_domains_numa_distance[i]) {
 				cpumask_set_cpu(cpu, sched_domains_numa_masks[i][j]);
+
+				/*
+				 * We skip updating numa_masks for offline
+				 * nodes. However, now that the node is
+				 * finally online, CPUs that were added
+				 * earlier should now be accommodated into
+				 * the newly online node's numa mask.
+				 */
+				if (node != j && empty) {
+					cpumask_or(sched_domains_numa_masks[i][node],
+							sched_domains_numa_masks[i][node],
+							sched_domains_numa_masks[0][j]);
+				}
+			}
 		}
 	}
 }
-- 
2.27.0