From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 73D92C433F5 for ; Wed, 27 Apr 2022 11:04:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231703AbiD0LHJ (ORCPT ); Wed, 27 Apr 2022 07:07:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33860 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232218AbiD0LGp (ORCPT ); Wed, 27 Apr 2022 07:06:45 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 0CDAE36CE0A for ; Wed, 27 Apr 2022 03:58:47 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id BB0E5ED1; Wed, 27 Apr 2022 03:58:47 -0700 (PDT) Received: from wubuntu (unknown [10.57.77.199]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 208FF3F5A1; Wed, 27 Apr 2022 03:58:46 -0700 (PDT) Date: Wed, 27 Apr 2022 11:58:44 +0100 From: Qais Yousef To: Xuewen Yan Cc: Xuewen Yan , dietmar.eggemann@arm.com, lukasz.luba@arm.com, rafael@kernel.org, viresh.kumar@linaro.org, mingo@redhat.com, peterz@infradead.org, vincent.guittot@linaro.org, rostedt@goodmis.org, linux-kernel@vger.kernel.org, di.shen@unisoc.com Subject: Re: [PATCH] sched: Take thermal pressure into account when determine rt fits capacity Message-ID: <20220427105844.otru4yohja4s23ye@wubuntu> References: <20220407051932.4071-1-xuewen.yan@unisoc.com> <20220420135127.o7ttm5tddwvwrp2a@airbuntu> <20220421161509.asz25zmh25eurgrk@airbuntu> <20220425161209.ydugtrs3b7gyy3kk@airbuntu> <20220426092142.lppfj5eqgt3d24nb@airbuntu> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 04/27/22 09:38, Xuewen Yan wrote: > > > > The best (simplest) way forward IMHO is to introduce a new function > > > > > > > > bool cpu_in_capacity_inversion(int cpu); > > > > > > > > (feel free to pick another name) which will detect the scenario you're in. You > > > > can use this function then in rt_task_fits_capacity() > > > > > > > > diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c > > > > index a32c46889af8..d48811a7e956 100644 > > > > --- a/kernel/sched/rt.c > > > > +++ b/kernel/sched/rt.c > > > > @@ -462,6 +462,9 @@ static inline bool rt_task_fits_capacity(struct task_struct *p, int cpu) > > > > if (!static_branch_unlikely(&sched_asym_cpucapacity)) > > > > return true; > > > > > > > > + if (cpu_in_capacity_inversion(cpu)) > > > > + return false; > > > > + > > > > min_cap = uclamp_eff_value(p, UCLAMP_MIN); > > > > max_cap = uclamp_eff_value(p, UCLAMP_MAX); > > > > > > > > You'll probably need to do something similar in dl_task_fits_capacity(). > > > > > > > > This might be a bit aggressive though as we'll steer away all RT tasks from > > > > this CPU (as long as there's another CPU that can fit it). I need to think more > > > > about it. But we could do something like this too > > > > > > > > diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c > > > > index a32c46889af8..f2a34946a7ab 100644 > > > > --- a/kernel/sched/rt.c > > > > +++ b/kernel/sched/rt.c > > > > @@ -462,11 +462,14 @@ static inline bool rt_task_fits_capacity(struct task_struct *p, int cpu) > > > > if (!static_branch_unlikely(&sched_asym_cpucapacity)) > > > > return true; > > > > > > > > + cpu_cap = capacity_orig_of(cpu); > > > > + > > > > + if (cpu_in_capacity_inversion(cpu)) > > > > > > It's a good idea, but as you said, in mainline, the > > > sysctl_sched_uclamp_util_min_rt_default is always 1024, > > > Maybe it's better to add it to the judgment? > > > > I don't think so. If we want to handle finding the next best thing, we need to > > make the search more complex than that. This is no worse than having 2 RT tasks > > waking up at the same time while there's only a single big CPU. One of them > > will end up on a medium or a little and we don't provide better guarantees > > here. > > I may have misunderstood your patch before, do you mean this: > 1. the cpu has to be inversion, if not, the cpu's capacity is still > the biggest, although the sysctl_sched_uclamp_util_min_rt_default > =1024, it still can put on the cpu. > 2. If the cpu is inversion, the thermal pressure should be considered, > at this time, if the sysctl_sched_uclamp_util_min_rt_default is not > 1024, make the rt still have chance to select the cpu. > If the sysctl_sched_uclamp_util_min_rt_default is 1024, all of the > cpu actually can not fit the rt, at this time, select cpu without > considering the cap_orig_of(cpu). The worst thing may be that rt > would put on the small core. > > I understand right? If so, Perhaps this approach has the least impact > on the current code complexity. I believe you understood correctly. Tasks that need to run at 1024 when the biggest cpu is in capacity inversion will get screwed - the system can't satisfy their requirements. If they're happy to run on a medium (the next best thing), then their uclamp_min should change to reflect that. If they are not happy to run at the medium, then I'm not sure if it'll make much of a difference if they end up on little. Their deadline will be missed anyway.. Again this is no worse than having two RT tasks with uclamp_min = 1024 waking up at the same time on a system with 1 big cpu. Only one of them will be able to run there. I think tasks wanting 1024 is rare and no one seemed to bother with doing better here so far. But we can certainly do better if need to :-) Thanks -- Qais Yousef