From: Mel Gorman
To: LKML
Cc: Ingo Molnar, Peter Zijlstra, Vincent Guittot, Valentin Schneider, Juri Lelli, Mel Gorman
Subject: [RFC PATCH 0/3] Revisit NUMA imbalance tolerance and fork balancing
Date: Tue, 17 Nov 2020 13:42:19 +0000
Message-Id: <20201117134222.31482-1-mgorman@techsingularity.net>

When NUMA and CPU balancing were reconciled, there was an attempt to
allow a degree of imbalance but it caused more problems than it solved. Instead, imbalance was only allowed with an almost idle NUMA domain. A lot of the problems have since been addressed so it's time for a revisit.

There is also an issue with how fork is balanced across threads. It's mentioned in this context as patches 2 and 3 should share similar behaviour in terms of a node's utilisation.

Patch 1 is just a cosmetic rename.

Patch 2 allows a "floating" imbalance to exist so communicating tasks can remain on the same domain until utilisation is higher. It aims to balance compute availability with memory bandwidth.

Patch 3 is the interesting one. Currently fork can allow a NUMA node to be completely utilised as long as there are idle CPUs until the load balancer gets involved. This caused serious problems with a real workload that unfortunately I cannot share many details about but there is a proxy reproducer.

 kernel/sched/fair.c | 41 ++++++++++++++++++++++++-----------------
 1 file changed, 24 insertions(+), 17 deletions(-)
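
As a rough illustration of the idea behind patches 2 and 3, and not the code in the series, the sketch below tolerates a small imbalance only while the destination node is lightly utilised, and applies the same utilisation test when deciding whether a forked task may stay on the local node. Every name here (the sketch_* helpers) and both constants are hypothetical and chosen only for illustration; the cut-off of a quarter of the node's CPUs is an assumption, not a value taken from this cover letter.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical cut-off: an imbalance is tolerated while fewer than
 * 1/4 of the node's CPUs are busy. Assumed value, illustration only.
 */
#define SKETCH_UTIL_SHIFT	2

/* Hypothetical size of imbalance that is ignored below the cut-off. */
#define SKETCH_IMBALANCE_MIN	2

/* True while the node is lightly enough utilised to leave an imbalance alone. */
static inline bool sketch_allow_imbalance(int dst_running, int dst_cpus)
{
	assert(dst_cpus > 0 && dst_running >= 0);
	return dst_running < (dst_cpus >> SKETCH_UTIL_SHIFT);
}

/*
 * Imbalance the load balancer should act on. A small imbalance is
 * dropped to zero while the node is below the cut-off, so communicating
 * tasks can stay together; otherwise it is returned unchanged.
 */
static inline int sketch_adjust_imbalance(int imbalance, int dst_running,
					  int dst_cpus)
{
	if (sketch_allow_imbalance(dst_running, dst_cpus) &&
	    imbalance <= SKETCH_IMBALANCE_MIN)
		return 0;
	return imbalance;
}

/*
 * Fork placement under the same rule (an assumption about how the two
 * paths could share behaviour): keep the child on the local node only
 * while that node is below the cut-off, instead of filling the node
 * until no CPU is idle.
 */
static inline bool sketch_fork_stay_local(int local_running, int local_cpus)
{
	return sketch_allow_imbalance(local_running, local_cpus);
}
```

With these assumed values on a 16-CPU node, an imbalance of 2 is ignored while 3 tasks are running, but is acted on once 8 tasks are running, and a fork with 12 tasks already on the node would be spread rather than kept local.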