Date: Sat, 9 Jan 2021 14:12:03 +0000
From: Mel Gorman
To: Peter Zijlstra
Cc: Vincent Guittot, "Li, Aubrey", linux-kernel, Ingo Molnar, Juri Lelli, Valentin Schneider, Qais Yousef, Dietmar Eggemann, Steven Rostedt, Ben Segall, Tim Chen, Jiang Biao
Subject: Re: [RFC][PATCH 1/5] sched/fair: Fix select_idle_cpu()s cost accounting
Message-ID: <20210109141203.GG3592@techsingularity.net>
References: <20201214164822.402812729@infradead.org> <20201214170017.877557652@infradead.org>
 <20201215075911.GA3040@hirez.programming.kicks-ass.net> <20210108102738.GB3592@techsingularity.net> <20210108144058.GD3592@techsingularity.net>

On Fri, Jan 08, 2021 at 08:45:44PM +0100, Peter Zijlstra wrote:
> On Fri, Jan 08, 2021 at 04:10:51PM +0100, Vincent Guittot wrote:
> > Also, there is another problem (that I'm investigating) which is that
> > this_rq()->avg_idle is stalled when your cpu is busy. Which means that
> > this avg_idle can just be a very old and meaningless value. I think
> > that we should decay it periodically to reflect there is less and less
>
> https://lkml.kernel.org/r/20180530143105.977759909@infradead.org
>
> :-)

This needs to be revived.

I'm of the opinion that your initial series needs to be split somewhat
into "single scan for core" parts followed by the "Fix depth selection".
I have tests in the queue and one of them is just patch 2 on its own.
Preliminary results for patch 2 on its own do not look bad, but that's
based on one test (tbench). It'll be tomorrow before it works through
variants of patch 1, which I suspect will be inconclusive and make me
more convinced it should be split out.

The single scan for core would be patches 2-4 of the series this thread
is about, which is orthogonal to the problem of avoiding repeated scans
of the same runqueues during a single wakeup. I would prefer to drop
patch 5 initially because the has_idle_cores throttling mechanism for
idle core searches is reasonably effective, and the first round of tests
indicated that changing it was inconclusive and should be treated
separately.
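
To illustrate the stale avg_idle problem in the quoted text, here is a
minimal user-space sketch of an idle estimate that ages when no new idle
samples arrive. It is not the approach in the patch linked above and not
kernel code: struct idle_estimate, DECAY_PERIOD_NS, the function names,
the halving rule and the 1/8 sample weight are all assumptions made for
the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for per-runqueue state; not the kernel's struct rq. */
struct idle_estimate {
	uint64_t avg_idle_ns;	/* average idle duration at the last sample */
	uint64_t stamp_ns;	/* time the last idle sample was folded in */
};

/* Assumed ageing interval: the estimate halves per period without a sample. */
#define DECAY_PERIOD_NS 1000000ULL	/* 1ms, an arbitrary choice */

/* Fold a new idle sample into the average with a 1/8 weight. */
static void idle_estimate_sample(struct idle_estimate *e, uint64_t now_ns,
				 uint64_t idle_ns)
{
	int64_t diff = (int64_t)idle_ns - (int64_t)e->avg_idle_ns;

	e->avg_idle_ns = (uint64_t)((int64_t)e->avg_idle_ns + diff / 8);
	e->stamp_ns = now_ns;
}

/*
 * Read the estimate, aged by the time since the last sample. A CPU that
 * has been busy for a long time reports an estimate that trends to zero
 * instead of an arbitrarily old value.
 */
static uint64_t idle_estimate_read(const struct idle_estimate *e,
				   uint64_t now_ns)
{
	uint64_t periods;

	assert(now_ns >= e->stamp_ns);
	periods = (now_ns - e->stamp_ns) / DECAY_PERIOD_NS;
	if (periods >= 64)	/* shifting a 64-bit value by >= 64 is undefined */
		return 0;
	return e->avg_idle_ns >> periods;
}
```

Ageing at read time keeps the busy path free of extra work; whether that
is preferable to decaying from a periodic tick is exactly the kind of
question that would need testing.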

The depth scan stuff would go on top because right now the depth scan is
based on magic numbers, and no amount of shuffling that around will make
it less magic without an avg_idle value that decays and, potentially, a
scan cost that also ages. It's much less straightforward than the single
scan aspect.

Switching idle core searches to SIS_PROP should also be treated
separately.

Thoughts? I don't want to go too far in this direction on my own because
the testing requirements are severe in terms of time. Even then, no
matter how much I test, stuff will be missed because the results are
sensitive to domain sizes, CPU generation, cache topology, cache
characteristics, domain utilisation, architecture, kitchen sink etc.

-- 
Mel Gorman
SUSE Labs