Subject: Re: [PATCH v2 2/2] [RFC] CPUFreq: Add support for cpu-perf-dependencies
From: Lukasz Luba
To: Ionela Voinescu, Sudeep Holla
Date: Mon, 12 Oct 2020 18:18:59 +0100
Message-ID: <17819d4d-9e7e-9a38-4227-d0d10a0749f1@arm.com>
In-Reply-To: <20201012165219.GA3573@arm.com>
References: <20200924095347.32148-1-nicola.mazzucato@arm.com>
 <20200924095347.32148-3-nicola.mazzucato@arm.com>
 <20201006071909.3cgz7i5v35dgnuzn@vireshk-i7>
 <2417d7b5-bc58-fa30-192c-e5991ec22ce0@arm.com>
 <20201008110241.dcyxdtqqj7slwmnc@vireshk-i7>
 <20201008150317.GB20268@arm.com>
 <56846759-e3a6-9471-827d-27af0c3d410d@arm.com>
 <20201009053921.pkq4pcyrv4r7ylzu@vireshk-i7>
 <20201012154915.GD16519@bogus>
 <20201012165219.GA3573@arm.com>
Cc: devicetree@vger.kernel.org, linux-pm@vger.kernel.org, vireshk@kernel.org,
 daniel.lezcano@linaro.org, rjw@rjwysocki.net, linux-kernel@vger.kernel.org,
 robh+dt@kernel.org, Nicola Mazzucato, Viresh Kumar, chris.redpath@arm.com,
 morten.rasmussen@arm.com, linux-arm-kernel@lists.infradead.org

On 10/12/20 5:52 PM, Ionela Voinescu wrote:
> On Monday 12 Oct 2020 at 16:49:30 (+0100), Sudeep Holla wrote:
>> On Fri, Oct 09, 2020 at 11:09:21AM +0530, Viresh Kumar wrote:
>>> On 08-10-20, 17:00, Nicola Mazzucato wrote:
>>>> On 10/8/20 4:03 PM, Ionela Voinescu wrote:
>>>>> Hi Viresh,
>>>>>
>>>>> On Thursday 08 Oct 2020 at 16:32:41 (+0530), Viresh Kumar wrote:
>>>>>> On 07-10-20, 13:58, Nicola Mazzucato wrote:
>>>>>>> Hi Viresh,
>>>>>>>
>>>>>>> Performance controls are what the firmware exposes through a protocol
>>>>>>> that is not capable of describing hardware (say SCMI). For example,
>>>>>>> the firmware can tell that the platform has N controls, but it can't
>>>>>>> say which hardware they are "wired" to. This is done in DT, where,
>>>>>>> for example, we map these controls to cpus, gpus, etc.
>>>>>>>
>>>>>>> Let's focus on cpus.
>>>>>>>
>>>>>>> Normally we would have N performance controls (what comes from f/w)
>>>>>>> that correspond to hardware clock/dvfs domains.
>>>>>>>
>>>>>>> However, some firmware implementations might benefit from having finer
>>>>>>> grained information about the performance requirements (e.g.
>>>>>>> per-CPU) and therefore choose to present M performance controls to the
>>>>>>> OS. DT would be adjusted accordingly to "wire" these controls to cpus
>>>>>>> or sets of cpus.
>>>>>>> In this scenario, the f/w will make aggregation decisions based on the
>>>>>>> requests it receives on these M controls.
>>>>>>>
>>>>>>> Here we would have M cpufreq policies which do not necessarily reflect
>>>>>>> the underlying clock domains, thus some s/w components will
>>>>>>> underperform (EAS and thermal, for example).
>>>>>>>
>>>>>>> A real example would be a platform in which the firmware describes the
>>>>>>> system as having M per-cpu controls; the cpufreq subsystem will then
>>>>>>> have M policies, while in fact these cpus are "performance-dependent"
>>>>>>> on each other (e.g. they are in the same clock domain).
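For illustration, with the SCMI DVFS clock binding such per-cpu controls
would be wired up in DT roughly as below. This is only a sketch; the
labels, reg values and compatibles are made up:

	cpus {
		cpu0: cpu@0 {
			compatible = "arm,cortex-a53";
			reg = <0x0>;
			/* per-CPU performance control: SCMI DVFS domain 0 */
			clocks = <&scmi_dvfs 0>;
		};
		cpu1: cpu@1 {
			compatible = "arm,cortex-a53";
			reg = <0x1>;
			/* per-CPU performance control: SCMI DVFS domain 1 */
			clocks = <&scmi_dvfs 1>;
		};
		/* ...and so on, one control per CPU, 8 in total */
	};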
>>>>>>
>>>>>> If the CPUs are in the same clock domain, they must be part of the
>>>>>> same cpufreq policy.
>>>>>
>>>>> But cpufreq does not currently support HW_ALL (I'm using the ACPI
>>>>> coordination type to describe the generic scenario of using hardware
>>>>> aggregation and coordination when establishing the clock rate of CPUs).
>>>>>
>>>>> Adding support for HW_ALL* will involve either bypassing some
>>>>> assumptions around cpufreq policies or making core cpufreq changes.
>>>>>
>>>>> The way I see it, support for HW_ALL involves either:
>>>>>
>>>>>  - (a) Creating per-cpu policies in order to allow each of the CPUs to
>>>>>    send their own frequency request to the hardware, which will do
>>>>>    aggregation and decide the clock rate at the level of the clock
>>>>>    domain. The PSD domains (ACPI) and the new DT binding will tell
>>>>>    whoever is interested which CPUs are actually in the same clock
>>>>>    domain, despite those CPUs not being in the same policy.
>>>>>    This requires the extra mask that Nicola introduced.
>>>>>
>>>>>  - (b) Making deep changes to cpufreq (core/governors/drivers) to allow:
>>>>>    - Governors to stop aggregating (usually max) the information
>>>>>      for each of the CPUs in the policy and convey to the core
>>>>>      information for each CPU.
>>>>>    - Cpufreq core to be able to receive and pass this information
>>>>>      down to the drivers.
>>>>>    - Drivers to be able to have some per-cpu structures to hold
>>>>>      frequency control (let's say SCP fast channel addresses) for
>>>>>      each of the CPUs in the policy. Or have these structures in the
>>>>>      cpufreq core/policy, to avoid code duplication in drivers.
>>>>>
>>>>> Therefore (a) is the least invasive, but we'd be bypassing the rule
>>>>> above. To make that rule stick, we'd have to make the invasive cpufreq
>>>>> changes of (b).
>>>>
>>>> Regarding the 'rule' above of one cpufreq policy per clock domain, I
>>>> would like to share my understanding of it. Perhaps it's a good
>>>> opportunity to shed some light.
>>>>
>>>> Looking back at the history of CPUFreq, related_cpus was originally
>>>> designed to hold the map of cpus within the same clock domain. Later
>>>> on, the meaning of this cpumask changed [1].
>>>> This led to the introduction of a new cpumask, 'freqdomain_cpus', within
>>>> acpi-cpufreq, to keep the knowledge of hardware clock domains for sysfs
>>>> consumers, since related_cpus was not suitable for this anymore.
>>>> Further on, this cpumask was assigned to online+offline cpus within the
>>>> same clk domain when sw coordination is in use [2].
>>>>
>>>> My interpretation is that there is no guarantee that related_cpus
>>>> matches the 'real' hardware clock domains. As a consequence, it is no
>>>> longer true that cpus in the same clock domain will be part of the same
>>>> policy.
>>>>
>>>> This guided me to think it would be better to have a cpumask in the
>>>> policy which always holds the real hw clock domains.
>>>>
>>>>>
>>>>> This is my current understanding and I'm leaning towards (a). What do
>>>>> you think?
>>>>>
>>>>> *In not so many words, this is what these patches are trying to
>>>>> propose, while also making sure it's supported for both ACPI and DT.
>>>>>
>>>>> BTW, thank you for your effort in making sense of this!
>>>>>
>>>>> Regards,
>>>>> Ionela.
>>>>>
>>>>
>>>> This could be a platform where per-cpu controls and perf-dependencies
>>>> will be used:
>>>>
>>>> CPU:              0    1    2    3    4    5    6    7
>>>> Type:             A    A    A    A    B    B    B    B
>>>> Cluster:         [                                    ]
>>>> perf-controls:   [ ]  [ ]  [ ]  [ ]  [ ]  [ ]  [ ]  [ ]
>>>> perf-dependency: [                ]  [                ]
>>>> HW clock:        [                ]  [                ]
>>>>
>>>> The firmware will present 8 controls to the OS, and each control is
>>>> mapped to a cpu device via standard DT. This is done so we can achieve
>>>> hw coordination. What is required in these systems is to present to the
>>>> OS the information of which cpus belong to which clock domain. In other
>>>> words, when hw coordinates, we currently have no way in DT to describe
>>>> how these cpus depend on each other from a performance perspective (as
>>>> opposed to ACPI, where we have _PSD). Hence my proposal for the new
>>>> cpu-perf-dependencies.
>>>> This is regardless of whether we decide to go for either a policy
>>>> per-cpu or a policy per-domain.
>>>>
>>>> Hope it helps.
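For reference, one possible shape for such a binding would be a phandle
list per dependency domain in the cpus node. I'm not quoting the exact
syntax from the RFC here; the grouping below is only illustrative:

	cpus {
		/* hypothetical: CPUs 0-3 share one HW clock, CPUs 4-7 the other */
		cpu-perf-dependencies = <&cpu0 &cpu1 &cpu2 &cpu3>,
					<&cpu4 &cpu5 &cpu6 &cpu7>;
	};

Consumers (EAS, thermal, cpufreq itself) would only use this to build the
"real" clock-domain cpumask; the per-cpu controls stay as they are.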
>>>
>>> Oh yes, I get it now. Finally. Thanks for helping me out :)
>>>
>>> So if I can put all this stuff in simple terms, this is what it will
>>> be like:
>>>
>>> - We don't want software aggregation of frequencies, and so we need to
>>>   have per-cpu policies even when the CPUs share their clock lines.
>>>
>>> - But we still need a way for other frameworks to know which CPUs
>>>   share the clock lines (that's what the perf-dependency is all about,
>>>   right?).
>>>
>>> - We can't get it from SCMI, but need a DT based solution.
>>>
>>> - Currently, for the cpufreq case, we relied for this on the way OPP
>>>   tables for the CPUs were described, i.e. the opp-table is marked as
>>>   "shared" and multiple CPUs point to it.
>>>
>>> - I wonder if we can keep using that instead of creating new bindings
>>>   for the exact same stuff? Though the difference here would be that
>>>   the OPP table may not have any other entries.
>>
>> Well summarised, sorry for chiming in late. I could not have summarised
>> it any better. I just saw the big thread and was thinking of summarising.
>> If the last point on OPP is possible (i.e. no OPP entries, just using the
>> table to fetch the sharing information) for what the $subject patch is
>> trying to achieve, then it would be good.
>>
>
> Just to put in my two pennies' worth: using opp-shared (in a possibly
> empty OPP table) as an alternative to cpu-perf-dependencies sounds good
> enough to me as well.

+1

Regards,
Lukasz
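P.S. A sketch of what the empty-table alternative could look like
(untested; whether the OPP binding and code accept a table without any
opp entries still needs checking, and the labels are made up):

	cpu_perf_domain0: opp-table-0 {
		compatible = "operating-points-v2";
		/* no opp entries: the table only expresses which CPUs share perf */
		opp-shared;
	};

	cpus {
		cpu0: cpu@0 {
			/* ... */
			operating-points-v2 = <&cpu_perf_domain0>;
		};
		cpu1: cpu@1 {
			/* ... */
			operating-points-v2 = <&cpu_perf_domain0>;
		};
	};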