From: "Rafael J. Wysocki"
To: Francisco Jerez
Cc: Srinivas Pandruvada, "Rafael J. Wysocki", Linux PM, Linux Documentation,
    LKML, Peter Zijlstra, Giovanni Gherdovich, Doug Smythies
Subject: Re: [PATCH] cpufreq: intel_pstate: Implement passive mode with HWP enabled
Date: Mon, 27 Jul 2020 19:23:32 +0200
Message-ID: <1712943.Luj0Z5seXe@kreacher>
In-Reply-To: <87h7u0h34t.fsf@riseup.net>
References: <3955470.QvD6XneCf3@kreacher> <87h7u0h34t.fsf@riseup.net>

On Wednesday, July 22, 2020 1:14:42 AM CEST Francisco Jerez wrote:
>
> Srinivas Pandruvada writes:
>
> > On Mon, 2020-07-20 at 16:20 -0700, Francisco Jerez wrote:
> >> "Rafael J. Wysocki" writes:
> >>
> >> > On Fri, Jul 17, 2020 at 2:21 AM Francisco Jerez <
> >> > currojerez@riseup.net> wrote:
> >> > > "Rafael J. Wysocki" writes:
> >> > >
> >
> > [...]
> >
> >> > Overall, so far, I'm seeing a claim that the CPU subsystem can be
> >> > made to use less energy and do as much work as before (which is
> >> > what improving the energy efficiency means in general) if the
> >> > maximum frequency of CPUs is limited in a clever way.
> >> >
> >> > I'm failing to see what that clever way is, though.
> >>
> >> Hopefully the clarifications above help some.
> >
> > To simplify:
> >
> > Suppose I called a function numpy.multiply() to multiply two big
> > arrays, and the thread is pegged to a CPU. Let's say the CPU finishes
> > the job in 10 ms while using a P-state of 0x20, but the same job could
> > have been done in 10 ms even if it was using a P-state of 0x16. So we
> > are not energy-efficient. To really know where the bottleneck is,
> > there are a number of perf counters; maybe the cache was the issue and
> > we could rather raise the uncore frequency a little. Simple APERF and
> > MPERF counters are not enough.
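For reference, on their own those two counters only yield the average
active-state frequency: the base frequency scaled by the APERF delta over
the MPERF delta for the sampling interval, i.e. the "busy MHz" figure
turbostat prints. A minimal user-space sketch of that computation,
assuming the msr driver is loaded and an illustrative base frequency of
2400 MHz:

/*
 * Sketch: estimate the average "busy" frequency of CPU0 over a one-second
 * window from the APERF/MPERF MSRs.  Requires root and the msr module;
 * BASE_MHZ is illustrative only (real code would read it from CPUID/MSRs).
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define MSR_IA32_MPERF	0xE7
#define MSR_IA32_APERF	0xE8
#define BASE_MHZ	2400.0	/* assumed base frequency, adjust per machine */

static uint64_t rdmsr(int fd, off_t msr)
{
	uint64_t val;

	/* The msr char device uses the file offset as the MSR index. */
	if (pread(fd, &val, sizeof(val), msr) != sizeof(val))
		return 0;
	return val;
}

int main(void)
{
	uint64_t mperf0, aperf0, mperf1, aperf1;
	int fd = open("/dev/cpu/0/msr", O_RDONLY);

	if (fd < 0) {
		perror("open /dev/cpu/0/msr (need root and the msr module)");
		return 1;
	}

	mperf0 = rdmsr(fd, MSR_IA32_MPERF);
	aperf0 = rdmsr(fd, MSR_IA32_APERF);
	sleep(1);
	mperf1 = rdmsr(fd, MSR_IA32_MPERF);
	aperf1 = rdmsr(fd, MSR_IA32_APERF);

	printf("busy frequency ~ %.0f MHz\n",
	       BASE_MHZ * (double)(aperf1 - aperf0) / (double)(mperf1 - mperf0));
	close(fd);
	return 0;
}

That number says how fast the core ran while it was active, not whether
running slower would have hurt, which is why the extra visibility
discussed below is needed.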
> Yes, that's right, APERF and MPERF aren't sufficient to identify every
> kind of possible bottleneck; some visibility into the utilization of
> other subsystems is necessary in addition -- like e.g. the
> instrumentation introduced in my series to detect a GPU bottleneck. A
> bottleneck condition in an IO device can be communicated to CPUFREQ

It generally is not sufficient to communicate it to cpufreq.  It needs to
be communicated to the CPU scheduler.

> by adjusting a PM QoS latency request (link [2] in my previous reply)
> that effectively gives the governor permission to rearrange CPU work
> arbitrarily within the specified time frame (which should be of the
> order of the natural latency of the IO device -- e.g. at least the
> rendering time of a frame for a GPU) in order to minimize energy usage.

OK, we need to talk more about this.

> > or we characterize the workload at different P-states and set limits.
> > I think this is not what you want to say for energy efficiency with
> > your changes.
> >
> > The way you are trying to improve "performance" is by having the
> > caller (the device driver) say how important the job at hand is. Here
> > the device driver, say, offloads these calculations to some GPU and
> > can wait up to 10 ms, so you want to tell the CPU to be slow. But the
> > moment the P-state driver observes that there is a chance of
> > overshooting the latency, it will immediately ask for a higher
> > P-state. So you want P-state limits based on the latency requirements
> > of the caller. Since the caller has more knowledge of the latency
> > requirement, this allows other devices sharing the power budget to get
> > more or less power, and it improves overall energy efficiency as the
> > combined performance of the system is improved.
> >
> > Is this correct?
>
> Yes, pretty much.

OK
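For reference on the mechanics discussed above: the QoS class proposed in
the series is not in mainline, but the way a driver places and withdraws a
CPU PM QoS request today looks roughly like the sketch below, which uses
the existing CPU latency QoS interface from include/linux/pm_qos.h. Note
the semantics differ: that class bounds the tolerable wakeup latency (it
can only make the CPUs more responsive), whereas the proposal is about
granting the governor latitude to save energy. All my_gpu_* names are
hypothetical.

/*
 * Calling-convention sketch only, using the existing CPU latency QoS API;
 * the response-latency class from the series above is out of tree.
 */
#include <linux/pm_qos.h>
#include <linux/types.h>

struct my_gpu {
	struct pm_qos_request cpu_lat_req;	/* hypothetical driver state */
};

/* Work submitted: the driver knows how much CPU latency it can tolerate. */
static void my_gpu_begin_frame(struct my_gpu *gpu, s32 tolerable_latency_us)
{
	if (cpu_latency_qos_request_active(&gpu->cpu_lat_req))
		cpu_latency_qos_update_request(&gpu->cpu_lat_req,
					       tolerable_latency_us);
	else
		cpu_latency_qos_add_request(&gpu->cpu_lat_req,
					    tolerable_latency_us);
}

/* Device idle again: withdraw the constraint entirely. */
static void my_gpu_idle(struct my_gpu *gpu)
{
	if (cpu_latency_qos_request_active(&gpu->cpu_lat_req))
		cpu_latency_qos_remove_request(&gpu->cpu_lat_req);
}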