linux-kernel.vger.kernel.org archive mirror
From: Daniel Lezcano <daniel.lezcano@linaro.org>
To: Greg KH <gregkh@linuxfoundation.org>
Cc: linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org,
	lukasz.luba@arm.com, rafael@kernel.org,
	Ram Chandrasekar <rkumbako@codeaurora.org>
Subject: Re: [PATCH v6 2/7] powercap/drivers/dtpm: Create a registering system
Date: Fri, 2 Apr 2021 00:08:49 +0200
Message-ID: <d0f818c7-3262-268b-bcc2-8036ce559d7b@linaro.org>
In-Reply-To: <YGYe9p3oyNpMnsBT@kroah.com>


Hi Greg,

On 01/04/2021 21:28, Greg KH wrote:
> On Thu, Apr 01, 2021 at 08:36:49PM +0200, Daniel Lezcano wrote:
>> A SoC can be differently structured depending on the platform and the
>> kernel can not be aware of all the combinations, as well as the
>> specific tweaks for a particular board.
>>
>> The creation of the hierarchy must be delegated to userspace.
> 
> Why?  Isn't this what DT is for?

I've always been told that the DT describes the hardware. Here we are
describing a configuration rather than hardware, which is why I've left
it to userspace to handle through configfs.

> What "userspace tool" is going to be created to manage all of this?
> Pointers to that codebase?

You are certainly aware of most of it, but let me give a bit more context.

The thermal framework has cooling devices which export their 'state', a
representation of their performance level, in sysfs. Unfortunately, that
lets userspace act on the performance level as a power limiter, in total
conflict with the in-kernel thermal governor's decisions.

That is done by the thermal daemons that the different SoC vendors tweak
for their platforms. Depending on the running application, which is
identified as a scenario, the daemon acts proactively on the different
cooling devices to keep the skin temperature far below the thermal limit
of the components.

This usage of the cooling devices hijacks the real purpose of the
thermal framework, which is to protect the silicon. Nobody is to blame;
there is no alternative for userspace.

The use case falls within the remit of a power limitation framework,
which is why we provided a new framework, based on powercap, to limit
the power. The thermal daemon can then use it and stop abusing the
thermal framework.

The DTPM framework allows reading the power consumption of a device and
setting a power limit on it.

While a simple powercap backend gives a single device entry, DTPM
aggregates the different devices by summing their power and their
limits. The tree representation of the DTPM nodes describes how the
limits are set and how the power is computed across the different
devices.
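
The aggregation described above can be modelled in a few lines. This is
only a userspace sketch in Python, not the kernel implementation: the
class name, the field names and the microwatt figures are all made up
for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DtpmNode:
    """Illustrative model of a DTPM node (not the kernel structure)."""
    name: str
    power_uw: int = 0   # power reading of a leaf device, in microwatts
    limit_uw: int = 0   # power limit of a leaf device, in microwatts
    children: List["DtpmNode"] = field(default_factory=list)

    def power(self) -> int:
        # A node with children reports the sum of its children's power.
        if self.children:
            return sum(child.power() for child in self.children)
        return self.power_uw

    def limit(self) -> int:
        # Limits are aggregated the same way as power.
        if self.children:
            return sum(child.limit() for child in self.children)
        return self.limit_uw


# Hypothetical numbers, chosen only to make the sums easy to check.
cpu = DtpmNode("cpu", power_uw=300_000, limit_uw=500_000)
gpu = DtpmNode("gpu", power_uw=700_000, limit_uw=900_000)
package = DtpmNode("package", children=[cpu, gpu])

print(package.power(), package.limit())
```

The parent node holds no numbers of its own here; everything it reports
is derived from its children.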

For more info, we gave presentations at ELC [1] and at the Linux PM
microconference [2], and there is an article about it [3].


To answer your questions, there is a SoC vendor thermal daemon using
DTPM, and there is a tool created to watch the thermal framework and
read the DTPM values; it is available at [4]. It is currently under
development, with the goal of doing power rebalancing / capping across
the different nodes when there is a violation of the parent's power limit.



[1] https://ossna2020.sched.com/event/c3Wf/ideas-for-finer-grained-control-over-your-heat-budget-amit-kucheria-daniel-lezcano-linaro

[2] https://www.linuxplumbersconf.org/event/7/page/80-accepted-microconferences#power-cr

[3] https://www.linaro.org/blog/using-energy-model-to-stay-in-tdp-budget/

[4] https://git.linaro.org/people/daniel.lezcano/dtpm.git


>> These changes provide a registering mechanism where the different
>> subsystems will initialize their dtpm backends and register with a
>> name the dtpm node in a list.
>>
>> The next changes will provide an userspace interface to create
>> hierarchically the different nodes. Those will be created by name and
>> found via the list filled by the different subsystem.
>>
>> If a specified name is not found in the list, it is assumed to be a
>> virtual node which will have children and the default is to allocate
>> such node.
> 
> So userspace sets the name?
> 
> Why not use the name in the device itself?  I thought I asked that last
> time...

I probably missed it, sorry for that.

When userspace creates a directory in configfs, the directory name is
looked up in the list of registered device names. If it is found, that
device is used; otherwise a virtual node is created instead, whose power
consumption is equal to the sum of its children's.

The different drivers register themselves with their name and the
associated dtpm structure. Userspace picks from this list to create a
hierarchy via configfs.
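
The lookup rule can be sketched as follows. This is a Python model of
the behaviour described above, not the kernel code: the function name
and the returned dictionary are hypothetical, and the registered names
are borrowed from the example later in this mail.

```python
from typing import Dict, Set


def make_node(name: str, registered: Set[str]) -> Dict[str, str]:
    """Describe the node created for a new configfs directory name."""
    # A name found in the registration list maps to the real device;
    # any other name becomes a virtual node that only aggregates children.
    kind = "device" if name in registered else "virtual"
    return {"name": name, "kind": kind}


# Names the drivers are assumed to have registered (illustrative subset).
registered = {"cpu0-cpufreq", "cpu4-cpufreq", "dmc-devfreq"}

for name in ("soc", "cpu0-cpufreq"):
    print(make_node(name, registered))
```

With this rule, userspace never has to say whether a node is real or
virtual; the name alone decides it.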

For example, on a big.LITTLE system:

- little CPUs power limiter will have the name cpu0-cpufreq
- big CPUs will have the name cpu4-cpufreq
- gpu will have the name ff9a0000.gpu-devfreq
- charger will have the name power-supply-charge
- DDR memory controller can have the name dmc-devfreq

Userspace may want to create this hierarchy:

soc
 - package
   - cluster0
     - cpu0-cpufreq
   - cluster1
     - ff9a0000.gpu-devfreq
   - dmc-devfreq
 - battery
   - power-supply-charge

It will do:

mkdir soc (virtual node)
mkdir soc/package (virtual node)
mkdir soc/package/cluster0 (virtual node)
mkdir soc/package/cluster0/cpu0-cpufreq (real device)
etc ...
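
For the full tree listed above (with cluster0 under package), the
directories have to be created parent-first. Here is a small Python
sketch that derives that order. It uses a temporary directory as a
stand-in root, since the actual configfs location is not given in this
mail; the helper name and the nested-dict encoding are assumptions made
for the illustration.

```python
import os
import tempfile

# Hierarchy from this mail as nested dicts; leaves are empty dicts.
HIERARCHY = {
    "soc": {
        "package": {
            "cluster0": {"cpu0-cpufreq": {}},
            "cluster1": {"ff9a0000.gpu-devfreq": {}},
            "dmc-devfreq": {},
        },
        "battery": {"power-supply-charge": {}},
    }
}


def mkdir_order(tree, prefix=""):
    """Yield relative paths parent-first, the order mkdir needs."""
    for name, children in tree.items():
        path = f"{prefix}/{name}" if prefix else name
        yield path
        yield from mkdir_order(children, path)


paths = list(mkdir_order(HIERARCHY))

# Stand-in root: a temporary directory, NOT the real configfs mount point.
with tempfile.TemporaryDirectory() as root:
    for path in paths:
        os.mkdir(os.path.join(root, path))
    # Collect what was created, to check it against the plan.
    created = sorted(
        os.path.relpath(os.path.join(parent, sub), root)
        for parent, subdirs, _files in os.walk(root)
        for sub in subdirs
    )

print("\n".join(paths))
```

Each os.mkdir() call succeeds only because its parent was created by an
earlier iteration, which is the same constraint the manual mkdir
sequence has to respect.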

The configfs hierarchy does not represent the layout of the sensors or
the floor plan of the devices, only the constraints we want to tie together.

That is why I think configfs is more suitable and flexible than OF,
since userspace deals with the power numbers. Moreover, we won't face
device initialization ordering issues when the daemon starts.

I thought we could add OF later, when the framework has more users and
more devices. configfs and OF can certainly co-exist, or be made
mutually exclusive via a Kconfig option.


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
