From: Beata Michalska <beata.michalska@arm.com> To: Vanshidhar Konda <vanshikonda@os.amperecomputing.com>, viresh.kumar@linaro.org Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, ionela.voinescu@arm.com, sudeep.holla@arm.com, will@kernel.org, catalin.marinas@arm.com, vincent.guittot@linaro.org, sumitg@nvidia.com, yang@os.amperecomputing.com, lihuisong@huawei.com Subject: Re: Re: [PATCH v4 4/4] cpufreq: Use arch specific feedback for cpuinfo_cur_freq Date: Fri, 26 Apr 2024 12:45:39 +0200 [thread overview] Message-ID: <ZiuF0zgqkMlmkEZz@arm.com> (raw) In-Reply-To: <s2bel7fzwpkyfyfkhod4xaihuklsaum75ycbcgmcanqaezxdu7@uxvqdqt3yo7l> On Wed, Apr 17, 2024 at 02:38:58PM -0700, Vanshidhar Konda wrote: > On Tue, Apr 16, 2024 at 05:46:18PM +0200, Beata Michalska wrote: > > On Mon, Apr 15, 2024 at 09:23:10PM -0700, Vanshidhar Konda wrote: > > > On Fri, Apr 05, 2024 at 02:33:19PM +0100, Beata Michalska wrote: > > > > Some architectures provide a way to determine an average frequency over > > > > a certain period of time based on available performance monitors (AMU on > > > > ARM or APERF/MPERf on x86). With those at hand, enroll arch_freq_get_on_cpu > > > > into cpuinfo_cur_freq policy sysfs attribute handler, which is expected to > > > > represent the current frequency of a given CPU, as obtained by the hardware. > > > > This is the type of feedback that counters do provide. > > > > > > > > > > --- snip --- > > > > > > While testing this patch series on AmpereOne system, I simulated CPU > > > frequency throttling when the system is under power or thermal > > > constraints. > > > > > > In this scenario, based on the user guilde, I expect scaling_cur_freq > > > is the frequency the kernel requests from the hardware; cpuinfo_cur_freq > > > is the actual frequency that the hardware is able to run at during the > > > power or thermal constraints. > > There has been a discussion on scaling_cur_freq vs cpuinfo_cur_freq [1]. > > The guidelines you are referring here (assuming you mean [2]) are kinda > > out-of-sync already as scaling_cur_freq has been wired earlier to use arch > > specific feedback. As there was no Arm dedicated implementation of > > arch_freq_get_on_cpu, this went kinda unnoticed. > > The conclusion of the above mentioned discussion (though rather unstated > > explicitly) was to keep the current behaviour of scaling_cur_freq and align > > both across different archs: so with the patches, both attributes will provide > > hw feedback on current frequency, when available. > > Note that if we are to adhere to the docs cpuinfo_cur_freq is the place to use > > the counters really. > > > > That change was also requested through [3] > > > > Adding @Viresh in case there was any shift in the tides .... > > > > Thank you for the pointer to the discussion in [1]. It makes sense to > bring arm64 behavior in line with x86. The question about whether > modifying the behavior of scaling_cur_freq was a good idea did not get > any response. > > > > > > > The AmpereOne system I'm testing on has the following configuration: > > > - Max frequency is 3000000 > > > - Support for AMU registers > > > - ACPI CPPC feedback counters use PCC register space > > > - Fedora 39 with 6.7.5 kernel > > > - Fedora 39 with 6.9.0-rc3 + this patch series > > > > > > With 6.7.5 kernel: > > > Core scaling_cur_freq cpuinfo_cur_freq > > > ---- ---------------- ---------------- > > > 0 3000000 2593000 > > > 1 3000000 2613000 > > > 2 3000000 2625000 > > > 3 3000000 2632000 > > > > > So if I got it right from the info you have provided the numbers above are > > obtained without applying the patches. In that case, scaling_cur_freq will > > use policy->cur (in your case) showing last frequency set, not necessarily > > the actual freq, whereas cpuinfo_cur_freq uses __cpufreq_get and AMU counters. > > > > > > > With 6.9.0-rc3 + this patch series: > > > Core scaling_cur_freq cpuinfo_cur_freq > > > ---- ---------------- ---------------- > > > 0 2671875 2671875 > > > 1 2589632 2589632 > > > 2 2648437 2648437 > > > 3 2698242 2698242 > > > > > With the patches applied both scaling_cur_freq and cpuinfo_cur_freq will use AMU > > counters, or fie scale factor obtained based on AMU counters to be more precise: > > both should now show similar/same frequency (as discussed in [1]) > > I'd say due to existing implementation for scaling_cur_freq (which we cannot > > change at this point) this is unavoidable. > > > > > In the second case we can't identify that the CPU frequency is > > > being throttled by the hardware. I noticed this behavior with > > > or without this patch. > > > > > I am not entirely sure comparing the two should be a way to go about throttling > > (whether w/ or w/o the changes). > > It would probably be best to refer to thermal sysfs and get a hold of cur_state > > Throttling could happen due to non-thermal reasons. Or a system may not > even support thermal zones. So on those systems we wouldn't be able to > identify/debug the behavior of the hardware providing less than maximum > performance. The discussion around scaling_cur_freq should probably be > re-visited in a targeted manner I think. > @Viresh: It seems that we might need to revisit the discussion we've had around scaling_cur_freq and cpuinfo_cur_freq and the use of arch_freq_get_on_cpu. As Vanshi has raised, having both utilizing arch specific feedback for getting current frequency is bit problematic and might be confusing at best. As arch_freq_get_on_cpu is already used by show_scaling_cur_freq there are not many options we are left with, if we want to kee all archs aligned: we can either try to rework show_scaling_cur_freq and it's use of arch_freq_get_on_cpu, and move it to cpuinfo_cur_freq, which would align with relevant docs, though that will not work for x86, or we keep it only there and skip updating cpuinfo_cur_freq, going against the guidelines. Other options, purely theoretical, would involve making arch_freq_get_on_cpu aware of type of the info requested (hw vs sw) or adding yet another arch-specific implementation, and those are not really appealing alternatives to say at least. What's your opinion on this one ? --- BR Beata > I'll test v5 of the series on AmpereOne and report back on that thread. > > Thanks, > Vanshi > > > which should indicate current throttle state: > > > > /sys/class/thermal/thermal_zone[0-*]/cdev[0-*]/cur_state > > > > with values above '0' implying ongoing throttling. > > > > The appropriate thermal_zone can be identified through 'type' attribute. > > > > > > Thank you for giving those patches a spin. > > > > --- > > BR > > Beata > > --- > > [1] https://lore.kernel.org/all/20230609043922.eyyqutbwlofqaddz@vireshk-i7/ > > [2] https://elixir.bootlin.com/linux/latest/source/Documentation/admin-guide/pm/cpufreq.rst#L197 > > [3] https://lore.kernel.org/lkml/2cfbc633-1e94-d741-2337-e1b0cf48b81b@nvidia.com/ > > --- > > > > > > > Thanks, > > > Vanshi
WARNING: multiple messages have this Message-ID (diff)
From: Beata Michalska <beata.michalska@arm.com> To: Vanshidhar Konda <vanshikonda@os.amperecomputing.com>, viresh.kumar@linaro.org Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, ionela.voinescu@arm.com, sudeep.holla@arm.com, will@kernel.org, catalin.marinas@arm.com, vincent.guittot@linaro.org, sumitg@nvidia.com, yang@os.amperecomputing.com, lihuisong@huawei.com Subject: Re: Re: [PATCH v4 4/4] cpufreq: Use arch specific feedback for cpuinfo_cur_freq Date: Fri, 26 Apr 2024 12:45:39 +0200 [thread overview] Message-ID: <ZiuF0zgqkMlmkEZz@arm.com> (raw) In-Reply-To: <s2bel7fzwpkyfyfkhod4xaihuklsaum75ycbcgmcanqaezxdu7@uxvqdqt3yo7l> On Wed, Apr 17, 2024 at 02:38:58PM -0700, Vanshidhar Konda wrote: > On Tue, Apr 16, 2024 at 05:46:18PM +0200, Beata Michalska wrote: > > On Mon, Apr 15, 2024 at 09:23:10PM -0700, Vanshidhar Konda wrote: > > > On Fri, Apr 05, 2024 at 02:33:19PM +0100, Beata Michalska wrote: > > > > Some architectures provide a way to determine an average frequency over > > > > a certain period of time based on available performance monitors (AMU on > > > > ARM or APERF/MPERf on x86). With those at hand, enroll arch_freq_get_on_cpu > > > > into cpuinfo_cur_freq policy sysfs attribute handler, which is expected to > > > > represent the current frequency of a given CPU, as obtained by the hardware. > > > > This is the type of feedback that counters do provide. > > > > > > > > > > --- snip --- > > > > > > While testing this patch series on AmpereOne system, I simulated CPU > > > frequency throttling when the system is under power or thermal > > > constraints. > > > > > > In this scenario, based on the user guilde, I expect scaling_cur_freq > > > is the frequency the kernel requests from the hardware; cpuinfo_cur_freq > > > is the actual frequency that the hardware is able to run at during the > > > power or thermal constraints. > > There has been a discussion on scaling_cur_freq vs cpuinfo_cur_freq [1]. > > The guidelines you are referring here (assuming you mean [2]) are kinda > > out-of-sync already as scaling_cur_freq has been wired earlier to use arch > > specific feedback. As there was no Arm dedicated implementation of > > arch_freq_get_on_cpu, this went kinda unnoticed. > > The conclusion of the above mentioned discussion (though rather unstated > > explicitly) was to keep the current behaviour of scaling_cur_freq and align > > both across different archs: so with the patches, both attributes will provide > > hw feedback on current frequency, when available. > > Note that if we are to adhere to the docs cpuinfo_cur_freq is the place to use > > the counters really. > > > > That change was also requested through [3] > > > > Adding @Viresh in case there was any shift in the tides .... > > > > Thank you for the pointer to the discussion in [1]. It makes sense to > bring arm64 behavior in line with x86. The question about whether > modifying the behavior of scaling_cur_freq was a good idea did not get > any response. > > > > > > > The AmpereOne system I'm testing on has the following configuration: > > > - Max frequency is 3000000 > > > - Support for AMU registers > > > - ACPI CPPC feedback counters use PCC register space > > > - Fedora 39 with 6.7.5 kernel > > > - Fedora 39 with 6.9.0-rc3 + this patch series > > > > > > With 6.7.5 kernel: > > > Core scaling_cur_freq cpuinfo_cur_freq > > > ---- ---------------- ---------------- > > > 0 3000000 2593000 > > > 1 3000000 2613000 > > > 2 3000000 2625000 > > > 3 3000000 2632000 > > > > > So if I got it right from the info you have provided the numbers above are > > obtained without applying the patches. In that case, scaling_cur_freq will > > use policy->cur (in your case) showing last frequency set, not necessarily > > the actual freq, whereas cpuinfo_cur_freq uses __cpufreq_get and AMU counters. > > > > > > > With 6.9.0-rc3 + this patch series: > > > Core scaling_cur_freq cpuinfo_cur_freq > > > ---- ---------------- ---------------- > > > 0 2671875 2671875 > > > 1 2589632 2589632 > > > 2 2648437 2648437 > > > 3 2698242 2698242 > > > > > With the patches applied both scaling_cur_freq and cpuinfo_cur_freq will use AMU > > counters, or fie scale factor obtained based on AMU counters to be more precise: > > both should now show similar/same frequency (as discussed in [1]) > > I'd say due to existing implementation for scaling_cur_freq (which we cannot > > change at this point) this is unavoidable. > > > > > In the second case we can't identify that the CPU frequency is > > > being throttled by the hardware. I noticed this behavior with > > > or without this patch. > > > > > I am not entirely sure comparing the two should be a way to go about throttling > > (whether w/ or w/o the changes). > > It would probably be best to refer to thermal sysfs and get a hold of cur_state > > Throttling could happen due to non-thermal reasons. Or a system may not > even support thermal zones. So on those systems we wouldn't be able to > identify/debug the behavior of the hardware providing less than maximum > performance. The discussion around scaling_cur_freq should probably be > re-visited in a targeted manner I think. > @Viresh: It seems that we might need to revisit the discussion we've had around scaling_cur_freq and cpuinfo_cur_freq and the use of arch_freq_get_on_cpu. As Vanshi has raised, having both utilizing arch specific feedback for getting current frequency is bit problematic and might be confusing at best. As arch_freq_get_on_cpu is already used by show_scaling_cur_freq there are not many options we are left with, if we want to kee all archs aligned: we can either try to rework show_scaling_cur_freq and it's use of arch_freq_get_on_cpu, and move it to cpuinfo_cur_freq, which would align with relevant docs, though that will not work for x86, or we keep it only there and skip updating cpuinfo_cur_freq, going against the guidelines. Other options, purely theoretical, would involve making arch_freq_get_on_cpu aware of type of the info requested (hw vs sw) or adding yet another arch-specific implementation, and those are not really appealing alternatives to say at least. What's your opinion on this one ? --- BR Beata > I'll test v5 of the series on AmpereOne and report back on that thread. > > Thanks, > Vanshi > > > which should indicate current throttle state: > > > > /sys/class/thermal/thermal_zone[0-*]/cdev[0-*]/cur_state > > > > with values above '0' implying ongoing throttling. > > > > The appropriate thermal_zone can be identified through 'type' attribute. > > > > > > Thank you for giving those patches a spin. > > > > --- > > BR > > Beata > > --- > > [1] https://lore.kernel.org/all/20230609043922.eyyqutbwlofqaddz@vireshk-i7/ > > [2] https://elixir.bootlin.com/linux/latest/source/Documentation/admin-guide/pm/cpufreq.rst#L197 > > [3] https://lore.kernel.org/lkml/2cfbc633-1e94-d741-2337-e1b0cf48b81b@nvidia.com/ > > --- > > > > > > > Thanks, > > > Vanshi _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2024-04-26 10:45 UTC|newest] Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top 2024-04-05 13:33 [PATCH v4 0/4] Add support for AArch64 AMUv1-based arch_freq_get_on_cpu Beata Michalska 2024-04-05 13:33 ` Beata Michalska 2024-04-05 13:33 ` [PATCH v4 1/4] arch_topology: init capacity_freq_ref to 0 Beata Michalska 2024-04-05 13:33 ` Beata Michalska 2024-04-08 8:35 ` Vincent Guittot 2024-04-08 8:35 ` Vincent Guittot 2024-04-05 13:33 ` [PATCH v4 2/4] arm64: Provide an AMU-based version of arch_freq_get_on_cpu Beata Michalska 2024-04-05 13:33 ` Beata Michalska 2024-04-05 13:33 ` [PATCH v4 3/4] arm64: Update AMU-based frequency scale factor on entering idle Beata Michalska 2024-04-05 13:33 ` Beata Michalska 2024-04-10 18:57 ` Sumit Gupta 2024-04-10 18:57 ` Sumit Gupta 2024-04-11 19:30 ` Beata Michalska 2024-04-11 19:30 ` Beata Michalska 2024-04-05 13:33 ` [PATCH v4 4/4] cpufreq: Use arch specific feedback for cpuinfo_cur_freq Beata Michalska 2024-04-05 13:33 ` Beata Michalska 2024-04-16 4:23 ` Vanshidhar Konda 2024-04-16 4:23 ` Vanshidhar Konda 2024-04-16 15:46 ` Beata Michalska 2024-04-16 15:46 ` Beata Michalska 2024-04-17 21:38 ` Vanshidhar Konda 2024-04-17 21:38 ` Vanshidhar Konda 2024-04-26 10:45 ` Beata Michalska [this message] 2024-04-26 10:45 ` Beata Michalska 2024-04-29 9:25 ` Viresh Kumar 2024-04-29 9:25 ` Viresh Kumar 2024-05-01 14:46 ` Vanshidhar Konda 2024-05-01 14:46 ` Vanshidhar Konda 2024-05-07 8:31 ` Beata Michalska 2024-05-07 8:31 ` Beata Michalska 2024-05-07 10:02 ` Beata Michalska 2024-05-07 10:02 ` Beata Michalska 2024-05-20 9:18 ` Viresh Kumar 2024-05-20 9:18 ` Viresh Kumar
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=ZiuF0zgqkMlmkEZz@arm.com \ --to=beata.michalska@arm.com \ --cc=catalin.marinas@arm.com \ --cc=ionela.voinescu@arm.com \ --cc=lihuisong@huawei.com \ --cc=linux-arm-kernel@lists.infradead.org \ --cc=linux-kernel@vger.kernel.org \ --cc=sudeep.holla@arm.com \ --cc=sumitg@nvidia.com \ --cc=vanshikonda@os.amperecomputing.com \ --cc=vincent.guittot@linaro.org \ --cc=viresh.kumar@linaro.org \ --cc=will@kernel.org \ --cc=yang@os.amperecomputing.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.