From mboxrd@z Thu Jan 1 00:00:00 1970
From: Antti P Miettinen
Subject: Re: [linux-pm] [PATCH 0/2] RFC: CPU frequency max as PM QoS param
Date: Wed, 07 Mar 2012 20:08:23 +0200
Message-ID: <871up4xp0o.fsf@amiettinen-lnx.nvidia.com>
References: <87d39fk2n3.fsf@ti.com>
 <20120228005630.GA15348@envy17>
 <87ty2b5mdo.fsf@amiettinen-lnx.nvidia.com>
 <201203042346.54468.rjw@sisk.pl>
 <87399lvrxj.fsf@amiettinen-lnx.nvidia.com>
 <20120306143729.GB29474@redhat.com>
 <8762egyky7.fsf@amiettinen-lnx.nvidia.com>
 <20120307165957.GA23690@redhat.com>
Mime-Version: 1.0
Return-path:
In-Reply-To: <20120307165957.GA23690@redhat.com> (Dave Jones's message of "Wed, 7 Mar 2012 11:59:57 -0500")
Sender: cpufreq-owner@vger.kernel.org
List-ID:
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: Dave Jones
Cc: "Rafael J. Wysocki", markgross@thegnar.org, Kevin Hilman, Len Brown,
 cpufreq List, j-pihet, pavel@ucw.cz, Linux PM list, Peter Zijlstra

Dave Jones writes:

> I think exposing absolute frequencies to applications is a mistake.
> (And one that the core cpufreq made a long time ago). How is an application
> to decide what to set it to without knowledge of the hardware it's
> running on ?

Abstracting the CPU frequency was briefly discussed previously.
Computing performance is affected by many factors, not just CPU
frequency. My view is that the actual frequency values for a given use
case are likely to be very system specific (CPU, memory, etc.). The
values would probably come from a tuning exercise and would be
configurable parameters of applications.

> I much prefer the idea that was mentioned a few weeks ago during the
> discussion with Peter Zijlstra about cpufreq being more connected to
> the scheduler, and essentially having per-process governors.
>
> Each process gets a /proc/self/power-policy
> This can be 'performance' 'power-save' or 'ondemand'
> - A global sysfs knob sets the default new processes get.
> - Processes can adjust it themselves if desired.
> - There's no need for a system-wide governor any more.
>
> There are some open questions about how this could work.
>
> - A list of rules for desired behaviour when performing state changes
> when switching between tasks with different policies is needed.
>
> - We don't want to be doing power transitions every context switch,
> or switching overhead will be brutal.
> So some kind of lazy state changing may be necessary.
>
> - For 'ondemand', when would the scheduler decide to ramp up/down
> the speed ?
>
> Dave

I think better power/perf requests from applications to the kernel are
indeed sorely needed. However, if we are going to construct application
oriented power/performance requests, I think something other than
performance/powersave/ondemand might be more appropriate. For a fixed
hardware platform, an application might well be tuned to know very
well the minimum and maximum levels of CPU frequency that it needs in
a certain situation in order to meet the required responsiveness
and/or consume the minimum amount of energy. I think performance vs
power-save is an unnecessarily coarse setting.

The big problem I see with application oriented metrics is that there
are so many of them. I'd rather keep this kind of complexity outside
the kernel. To me it feels easier to start from the hardware knobs
that we can control, since - in the end - those are the knobs that the
kernel needs to control. We do need consolidation of requests from
multiple applications, but to me exposing hardware oriented parameters
towards user space is not a problem as such.

--Antti
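
To make the consolidation point above concrete, here is a minimal
user-space sketch of how per-application minimum and maximum frequency
requests could be merged into one pair of limits. It is not taken from
the patch set: the names FreqRequest and consolidate, the kHz units,
and the policy (highest requested floor wins, lowest requested ceiling
wins, floor takes precedence on conflict) are all assumptions made for
illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class FreqRequest:
    """One application's CPU frequency request in kHz (hypothetical type)."""
    owner: str
    min_khz: Optional[int] = None  # floor needed for responsiveness
    max_khz: Optional[int] = None  # ceiling wanted to limit energy use


def consolidate(requests: Iterable[FreqRequest],
                hw_min_khz: int, hw_max_khz: int) -> Tuple[int, int]:
    """Merge all requests into one (min, max) pair within hardware limits.

    Assumed policy: the highest requested floor and the lowest requested
    ceiling win; if the two conflict, the floor takes precedence.
    """
    floor, ceiling = hw_min_khz, hw_max_khz
    for req in requests:
        if req.min_khz is not None:
            floor = max(floor, req.min_khz)
        if req.max_khz is not None:
            ceiling = min(ceiling, req.max_khz)
    # Clamp both limits to what the hardware can actually provide.
    floor = min(max(floor, hw_min_khz), hw_max_khz)
    ceiling = min(max(ceiling, hw_min_khz), hw_max_khz)
    # Resolve a conflicting pair in favour of the floor (assumption).
    if ceiling < floor:
        ceiling = floor
    return floor, ceiling
```

For example, with hardware limits of 200000 and 1200000 kHz, one
application asking for a floor of 600000 and another asking for a
ceiling of 800000 would yield limits of (600000, 800000). Whether the
floor or the ceiling should win on conflict is a policy question this
sketch does not settle.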