Linux-PM Archive on lore.kernel.org
 help / color / Atom feed
* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
@ 2019-10-22 22:06 Leonard Crestez
  2019-10-22 22:47 ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Leonard Crestez @ 2019-10-22 22:06 UTC (permalink / raw)
  To: Rafael J. Wysocki, Viresh Kumar
  Cc: Linux ACPI, Linux PM, Sudeep Holla, Dmitry Osipenko,
	Matthias Kaehlcke, Kyungmin Park, Chanwoo Choi,
	Artur Świgoń,
	Georgi Djakov, dl-linux-imx, Saravana Kannan, MyungJoo Ham

Hello,

I've been working on a series which add DEV_PM_QOS support to devfreq, 
now at v9:

	https://patchwork.kernel.org/cover/11171807/

Your third patch removes DEV_PM_QOS_FREQUENCY_MIN/MAX that my series 
depends upon. I found the email on patchwork, hopefully the in-reply-to 
header is OK?

As far as I can tell the replacement ("frequency qos") needs constraints 
to be managed outside the device infrastructure and it's not obviously 
usable a generic mechanism for making "min_freq/max_freq" requests to a 
specific device.

I've read a bit through your emails and it seems the problem is that 
you're dealing with dev_pm_qos on per-policy basis but each "struct 
cpufreq_policy" can cover multiple CPU devices.

An alternative solution which follows dev_pm_qos would be to add 
notifiers for each CPU inside cpufreq_online and cpufreq_offline. This 
makes quite a bit of sense because each CPU is a separate "device" with 
a possibly distinct list of qos requests.

If cpufreq needs a group of CPUs to run at the same frequency then it 
should deal with this by doing dev_pm_qos_read_frequency on each CPU 
device and picking a frequency that attempts to satisfy all constraints.

Handling sysfs min/max_freq through dev_pm_qos would be of dubious 
value, though I guess you could register identical requests for each CPU.

I'm not familiar with what you're trying to accomplish with PM_QOS other 
than replace the sysfs min_freq/max_freq files: What I want to do is add 
a driver using the interconnect driver which translates requests for 
"bandwidth-on-a-path" into "frequency-on-a-device". More specifically a 
display driver could request bandwidth to RAM and this would be 
translated into min frequency for NoC and the DDR controller, both of 
which implement scaling via devfreq:

	https://patchwork.kernel.org/cover/11104113/
	https://patchwork.kernel.org/cover/11111865/

This is part of an effort to upstream an out-of-tree "busfreq" feature 
which allows device device to make "min frequency requests" through an 
entirely out-of-tree mechanism. It would also allow finer-grained 
scaling that what IMX tree currently support.

If you're making cpufreq qos constrains be "per-cpufreq-policy" then 
it's not clear how you would handle in-kernel constraints from other 
subsystems. Would users have to get a pointer to struct cpufreq_policy 
and struct freq_constraints? That would make object lifetime a 
nightmare! But dev_pm_qos solves this by tying to struct device.

And if you don't care about in-kernel requests then what's the purpose 
of involving PM QoS? The old min/max_freq sysfs implementation worked.

--
Regards,
Leonard

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-22 22:06 [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq Leonard Crestez
@ 2019-10-22 22:47 ` Rafael J. Wysocki
  2019-10-23  2:20   ` Leonard Crestez
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2019-10-22 22:47 UTC (permalink / raw)
  To: Leonard Crestez
  Cc: Rafael J. Wysocki, Viresh Kumar, Linux ACPI, Linux PM,
	Sudeep Holla, Dmitry Osipenko, Matthias Kaehlcke, Kyungmin Park,
	Chanwoo Choi, Artur Świgoń,
	Georgi Djakov, dl-linux-imx, Saravana Kannan, MyungJoo Ham

On Wed, Oct 23, 2019 at 12:06 AM Leonard Crestez
<leonard.crestez@nxp.com> wrote:
>
> Hello,
>
> I've been working on a series which add DEV_PM_QOS support to devfreq,
> now at v9:
>
>         https://patchwork.kernel.org/cover/11171807/
>
> Your third patch removes DEV_PM_QOS_FREQUENCY_MIN/MAX that my series
> depends upon. I found the email on patchwork, hopefully the in-reply-to
> header is OK?
>
> As far as I can tell the replacement ("frequency qos") needs constraints
> to be managed outside the device infrastructure and it's not obviously
> usable a generic mechanism for making "min_freq/max_freq" requests to a
> specific device.

You can add a struct freq_constrants pointer to struct dev_pm_info and
use it just fine.  It doesn't have to be bolted into struct
dev_pm_qos.

> I've read a bit through your emails and it seems the problem is that
> you're dealing with dev_pm_qos on per-policy basis but each "struct
> cpufreq_policy" can cover multiple CPU devices.
>
> An alternative solution which follows dev_pm_qos would be to add
> notifiers for each CPU inside cpufreq_online and cpufreq_offline. This
> makes quite a bit of sense because each CPU is a separate "device" with
> a possibly distinct list of qos requests.

But combining the lists of requests for all the CPUs in a policy
defeats the idea of automatic aggregation of requests which really is
what PM QoS is about.

There have to be two lists of requests per policy, one for the max and
one for the min frequency.

> If cpufreq needs a group of CPUs to run at the same frequency then it
> should deal with this by doing dev_pm_qos_read_frequency on each CPU
> device and picking a frequency that attempts to satisfy all constraints.

No, that would be combining the requests by hand.

> Handling sysfs min/max_freq through dev_pm_qos would be of dubious
> value, though I guess you could register identical requests for each CPU.
>
> I'm not familiar with what you're trying to accomplish with PM_QOS other
> than replace the sysfs min_freq/max_freq files:

QoS-based management of the frequency limits is not really needed for
that.  The real motivation for adding it were things like thermal and
platform firmware induced limits that all have their own values to
combine with the ones provided by user space.

> What I want to do is add
> a driver using the interconnect driver which translates requests for
> "bandwidth-on-a-path" into "frequency-on-a-device". More specifically a
> display driver could request bandwidth to RAM and this would be
> translated into min frequency for NoC and the DDR controller, both of
> which implement scaling via devfreq:
>
>         https://patchwork.kernel.org/cover/11104113/
>         https://patchwork.kernel.org/cover/11111865/
>
> This is part of an effort to upstream an out-of-tree "busfreq" feature
> which allows device device to make "min frequency requests" through an
> entirely out-of-tree mechanism. It would also allow finer-grained
> scaling that what IMX tree currently support.
>
> If you're making cpufreq qos constrains be "per-cpufreq-policy" then
> it's not clear how you would handle in-kernel constraints from other
> subsystems. Would users have to get a pointer to struct cpufreq_policy
> and struct freq_constraints?

Yes.

> That would make object lifetime a nightmare!

Why really?  It is not much different from the device PM QoS case.

Actually, https://patchwork.kernel.org/patch/11193019/ is a simple
one-for-one replacement of the former.  As it turns out, all of its
users have access to a policy object anyway already.

> But dev_pm_qos solves this by tying to struct device.

Well, the cpufreq sysfs is per-policy and not per-CPU and we really
need a per-policy min and max frequency in cpufreq, for governors etc.

> And if you don't care about in-kernel requests then what's the purpose
> of involving PM QoS? The old min/max_freq sysfs implementation worked.

See above.

Thanks!

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-22 22:47 ` Rafael J. Wysocki
@ 2019-10-23  2:20   ` Leonard Crestez
  2019-10-23  8:54     ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Leonard Crestez @ 2019-10-23  2:20 UTC (permalink / raw)
  To: Rafael J. Wysocki, Viresh Kumar
  Cc: Linux ACPI, Linux PM, Sudeep Holla, Dmitry Osipenko,
	Matthias Kaehlcke, Kyungmin Park, Chanwoo Choi,
	Artur Świgoń,
	Georgi Djakov, dl-linux-imx, Saravana Kannan, MyungJoo Ham

On 2019-10-23 1:48 AM, Rafael J. Wysocki wrote:
> On Wed, Oct 23, 2019 at 12:06 AM Leonard Crestez
> <leonard.crestez@nxp.com> wrote:
>> I've been working on a series which add DEV_PM_QOS support to devfreq,
>> now at v9:
>>
>> Your third patch removes DEV_PM_QOS_FREQUENCY_MIN/MAX that my series
>> depends upon. I found the email on patchwork, hopefully the in-reply-to
>> header is OK?
>>
>> As far as I can tell the replacement ("frequency qos") needs constraints
>> to be managed outside the device infrastructure and it's not obviously
>> usable a generic mechanism for making "min_freq/max_freq" requests to a
>> specific device.
> 
> You can add a struct freq_constrants pointer to struct dev_pm_info and
> use it just fine.  It doesn't have to be bolted into struct
> dev_pm_qos.

I'm not sure what you mean by this? min/max_freq was already available 
in dev_pm_qos so it's not clear why it would be moved somewhere else. 
What I'm looking for is a mechanism to make min/max_freq requests on a 
per-device basis and DEV_PM_QOS_MIN_FREQUENCY already did that.

Reuse is good, right?

>> I've read a bit through your emails and it seems the problem is that
>> you're dealing with dev_pm_qos on per-policy basis but each "struct
>> cpufreq_policy" can cover multiple CPU devices.
>>
>> An alternative solution which follows dev_pm_qos would be to add
>> notifiers for each CPU inside cpufreq_online and cpufreq_offline. This
>> makes quite a bit of sense because each CPU is a separate "device" with
>> a possibly distinct list of qos requests.
> 
> But combining the lists of requests for all the CPUs in a policy
> defeats the idea of automatic aggregation of requests which really is
> what PM QoS is about.

My primary interest is the "dev" part of dev_pm_qos: making pm_qos 
requests tied to a specific device.

> There have to be two lists of requests per policy, one for the max and
> one for the min frequency >
>> If cpufreq needs a group of CPUs to run at the same frequency then it
>> should deal with this by doing dev_pm_qos_read_frequency on each CPU
>> device and picking a frequency that attempts to satisfy all constraints.
> 
> No, that would be combining the requests by hand.

It's just a loop though.

>> Handling sysfs min/max_freq through dev_pm_qos would be of dubious
>> value, though I guess you could register identical requests for each CPU.
>>
>> I'm not familiar with what you're trying to accomplish with PM_QOS other
>> than replace the sysfs min_freq/max_freq files:
> 
> QoS-based management of the frequency limits is not really needed for
> that.  The real motivation for adding it were things like thermal and
> platform firmware induced limits that all have their own values to
> combine with the ones provided by user space.

Current users seem to be thermal-related. Do you care about min/max_freq 
requests from stuff not directly tied to a CPU?

>> What I want to do is add
>> a driver using the interconnect driver which translates requests for
>> "bandwidth-on-a-path" into "frequency-on-a-device". More specifically a
>> display driver could request bandwidth to RAM and this would be
>> translated into min frequency for NoC and the DDR controller, both of
>> which implement scaling via devfreq:
>>
>> This is part of an effort to upstream an out-of-tree "busfreq" feature
>> which allows device device to make "min frequency requests" through an
>> entirely out-of-tree mechanism. It would also allow finer-grained
>> scaling that what IMX tree currently support.
>>
>> If you're making cpufreq qos constrains be "per-cpufreq-policy" then
>> it's not clear how you would handle in-kernel constraints from other
>> subsystems. Would users have to get a pointer to struct cpufreq_policy
>> and struct freq_constraints?
> 
> Yes.
> 
>> That would make object lifetime a nightmare!
> 
> Why really?  It is not much different from the device PM QoS case
 >> Actually,  is a simple
> one-for-one replacement of the former.  As it turns out, all of its
> users have access to a policy object anyway already.

All current users are very closely tied to cpufreq, what I had in mind 
is requests from unrelated subsystems.

Browsing through the cpufreq core it seems that it's possible for a 
struct cpufreq_policy to be created and destroyed at various points, the 
simplest example being rmmod/modprobe on a cpufreq driver.

The freq_qos_add_request function grabs a pointer to struct 
freq_constraints, this can become invalid when cpufreq_policy is freed.

I guess all users need to register a CPUFREQ_POLICY_NOTIFIER and make 
sure to freq_qos_add_request every time? Looking at your [PATCH 2/3] I 
can't spot any obvious issue, thermal clamping code seems to get the 
appropriate callbacks.
>> But dev_pm_qos solves this by tying to struct device.

The lifetime of "struct device" is already controlled by 
get_device/put_device.

> Well, the cpufreq sysfs is per-policy and not per-CPU and we really
> need a per-policy min and max frequency in cpufreq, for governors etc.

Aggregation could be performed at two levels:

1) Per cpu device (by dev_pm_qos)
2) Per policy (inside cpufreq)

The per-cpu dev_pm_qos notifier would just update a per-policy 
pm_qos_constraints object. The second step could even be done strictly 
inside the cpufreq core using existing pm_qos, no need to invent new 
frameworks.

Maybe dev_pm_qos is not a very good fit for cpufreq because of these 
"cpu device versus cpufreq_policy" issues but it makes a ton of sense 
for devfreq. Can you maybe hold PATCH 3 from this series pending further 
discussion?

--
Regards,
Leonard

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-23  2:20   ` Leonard Crestez
@ 2019-10-23  8:54     ` Rafael J. Wysocki
  2019-10-23  8:57       ` Rafael J. Wysocki
  2019-10-23 13:33       ` Leonard Crestez
  0 siblings, 2 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2019-10-23  8:54 UTC (permalink / raw)
  To: Leonard Crestez
  Cc: Rafael J. Wysocki, Viresh Kumar, Linux ACPI, Linux PM,
	Sudeep Holla, Dmitry Osipenko, Matthias Kaehlcke, Kyungmin Park,
	Chanwoo Choi, Artur Świgoń,
	Georgi Djakov, dl-linux-imx, Saravana Kannan, MyungJoo Ham

On Wed, Oct 23, 2019 at 4:20 AM Leonard Crestez <leonard.crestez@nxp.com> wrote:
>
> On 2019-10-23 1:48 AM, Rafael J. Wysocki wrote:
> > On Wed, Oct 23, 2019 at 12:06 AM Leonard Crestez
> > <leonard.crestez@nxp.com> wrote:
> >> I've been working on a series which add DEV_PM_QOS support to devfreq,
> >> now at v9:
> >>
> >> Your third patch removes DEV_PM_QOS_FREQUENCY_MIN/MAX that my series
> >> depends upon. I found the email on patchwork, hopefully the in-reply-to
> >> header is OK?
> >>
> >> As far as I can tell the replacement ("frequency qos") needs constraints
> >> to be managed outside the device infrastructure and it's not obviously
> >> usable a generic mechanism for making "min_freq/max_freq" requests to a
> >> specific device.
> >
> > You can add a struct freq_constrants pointer to struct dev_pm_info and
> > use it just fine.  It doesn't have to be bolted into struct
> > dev_pm_qos.
>
> I'm not sure what you mean by this? min/max_freq was already available
> in dev_pm_qos so it's not clear why it would be moved somewhere else.
> What I'm looking for is a mechanism to make min/max_freq requests on a
> per-device basis and DEV_PM_QOS_MIN_FREQUENCY already did that.
>
> Reuse is good, right?

But they go away in patch 3 of this series as there are no users in
the tree.  Sorry about that.

> >> I've read a bit through your emails and it seems the problem is that
> >> you're dealing with dev_pm_qos on per-policy basis but each "struct
> >> cpufreq_policy" can cover multiple CPU devices.
> >>
> >> An alternative solution which follows dev_pm_qos would be to add
> >> notifiers for each CPU inside cpufreq_online and cpufreq_offline. This
> >> makes quite a bit of sense because each CPU is a separate "device" with
> >> a possibly distinct list of qos requests.
> >
> > But combining the lists of requests for all the CPUs in a policy
> > defeats the idea of automatic aggregation of requests which really is
> > what PM QoS is about.
>
> My primary interest is the "dev" part of dev_pm_qos: making pm_qos
> requests tied to a specific device.

The list of requests needs to be associated with the user of the
effective constraint.  If that is the device, it is all good.

> > There have to be two lists of requests per policy, one for the max and
> > one for the min frequency >
> >> If cpufreq needs a group of CPUs to run at the same frequency then it
> >> should deal with this by doing dev_pm_qos_read_frequency on each CPU
> >> device and picking a frequency that attempts to satisfy all constraints.
> >
> > No, that would be combining the requests by hand.
>
> It's just a loop though.

Yes, it is, and needs to be run on every change of an effective
constraint for any CPU even if the total effective constraint doesn't
change.  And, of course, the per-policy user space limits would need
to be combined with that by hand.

Not particularly straightforward if you asked me.

Not to mention the fact that, say, cpu_cooling, has a per-policy list
of requests anyway.

> >> Handling sysfs min/max_freq through dev_pm_qos would be of dubious
> >> value, though I guess you could register identical requests for each CPU.
> >>
> >> I'm not familiar with what you're trying to accomplish with PM_QOS other
> >> than replace the sysfs min_freq/max_freq files:
> >
> > QoS-based management of the frequency limits is not really needed for
> > that.  The real motivation for adding it were things like thermal and
> > platform firmware induced limits that all have their own values to
> > combine with the ones provided by user space.
>
> Current users seem to be thermal-related. Do you care about min/max_freq
> requests from stuff not directly tied to a CPU?

Yes, I do.

And they will need to add requests per policy.

> >> What I want to do is add
> >> a driver using the interconnect driver which translates requests for
> >> "bandwidth-on-a-path" into "frequency-on-a-device". More specifically a
> >> display driver could request bandwidth to RAM and this would be
> >> translated into min frequency for NoC and the DDR controller, both of
> >> which implement scaling via devfreq:
> >>
> >> This is part of an effort to upstream an out-of-tree "busfreq" feature
> >> which allows device device to make "min frequency requests" through an
> >> entirely out-of-tree mechanism. It would also allow finer-grained
> >> scaling that what IMX tree currently support.
> >>
> >> If you're making cpufreq qos constrains be "per-cpufreq-policy" then
> >> it's not clear how you would handle in-kernel constraints from other
> >> subsystems. Would users have to get a pointer to struct cpufreq_policy
> >> and struct freq_constraints?
> >
> > Yes.
> >
> >> That would make object lifetime a nightmare!
> >
> > Why really?  It is not much different from the device PM QoS case
>  >> Actually,  is a simple
> > one-for-one replacement of the former.  As it turns out, all of its
> > users have access to a policy object anyway already.
>
> All current users are very closely tied to cpufreq, what I had in mind
> is requests from unrelated subsystems.

You can use cpufreq policy notifiers for that.  Add a request for each
CPU in the policy (or for each related CPU if that is needed) to
policy->constraints on CREATE_POLICY and remove them on REMOVE_POLICY.
That's all you need to do.

BTW, the original code from Viresh did that already, I haven't changed
it.  And it didn't have per-CPU lists of frequency requests for that
matter, it used the ones in policy->cpu as the per-policy lists, which
doesn't work.

> Browsing through the cpufreq core it seems that it's possible for a
> struct cpufreq_policy to be created and destroyed at various points, the
> simplest example being rmmod/modprobe on a cpufreq driver.
>
> The freq_qos_add_request function grabs a pointer to struct
> freq_constraints, this can become invalid when cpufreq_policy is freed.
>
> I guess all users need to register a CPUFREQ_POLICY_NOTIFIER and make
> sure to freq_qos_add_request every time?

Yes.

The policy is the user of the effective constraint anyway and holding
on to a list of requests without a user of the effective constraint
would be, well, not useful.

> Looking at your [PATCH 2/3] I  can't spot any obvious issue, thermal clamping
> code seems to get the appropriate callbacks.
>
> >> But dev_pm_qos solves this by tying to struct device.
>
> The lifetime of "struct device" is already controlled by
> get_device/put_device.

And why does this matter here?

> > Well, the cpufreq sysfs is per-policy and not per-CPU and we really
> > need a per-policy min and max frequency in cpufreq, for governors etc.
>
> Aggregation could be performed at two levels:
>
> 1) Per cpu device (by dev_pm_qos)
> 2) Per policy (inside cpufreq)
>
> The per-cpu dev_pm_qos notifier would just update a per-policy
> pm_qos_constraints object. The second step could even be done strictly
> inside the cpufreq core using existing pm_qos, no need to invent new
> frameworks.
>
> Maybe dev_pm_qos is not a very good fit for cpufreq because of these
> "cpu device versus cpufreq_policy" issues but it makes a ton of sense
> for devfreq. Can you maybe hold PATCH 3 from this series pending further
> discussion?

It can be reverted at any time if need be and in 5.4 that would be dead code.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-23  8:54     ` Rafael J. Wysocki
@ 2019-10-23  8:57       ` Rafael J. Wysocki
  2019-10-23 13:33       ` Leonard Crestez
  1 sibling, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2019-10-23  8:57 UTC (permalink / raw)
  To: Leonard Crestez
  Cc: Rafael J. Wysocki, Viresh Kumar, Linux ACPI, Linux PM,
	Sudeep Holla, Dmitry Osipenko, Matthias Kaehlcke, Kyungmin Park,
	Chanwoo Choi, Artur Świgoń,
	Georgi Djakov, dl-linux-imx, Saravana Kannan, MyungJoo Ham

On Wed, Oct 23, 2019 at 10:54 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Wed, Oct 23, 2019 at 4:20 AM Leonard Crestez <leonard.crestez@nxp.com> wrote:
> >
> > On 2019-10-23 1:48 AM, Rafael J. Wysocki wrote:
> > > On Wed, Oct 23, 2019 at 12:06 AM Leonard Crestez
> > > <leonard.crestez@nxp.com> wrote:
> > >> I've been working on a series which add DEV_PM_QOS support to devfreq,
> > >> now at v9:
> > >>
> > >> Your third patch removes DEV_PM_QOS_FREQUENCY_MIN/MAX that my series
> > >> depends upon. I found the email on patchwork, hopefully the in-reply-to
> > >> header is OK?
> > >>
> > >> As far as I can tell the replacement ("frequency qos") needs constraints
> > >> to be managed outside the device infrastructure and it's not obviously
> > >> usable a generic mechanism for making "min_freq/max_freq" requests to a
> > >> specific device.
> > >
> > > You can add a struct freq_constrants pointer to struct dev_pm_info and
> > > use it just fine.  It doesn't have to be bolted into struct
> > > dev_pm_qos.
> >
> > I'm not sure what you mean by this? min/max_freq was already available
> > in dev_pm_qos so it's not clear why it would be moved somewhere else.
> > What I'm looking for is a mechanism to make min/max_freq requests on a
> > per-device basis and DEV_PM_QOS_MIN_FREQUENCY already did that.
> >
> > Reuse is good, right?
>
> But they go away in patch 3 of this series as there are no users in
> the tree.  Sorry about that.
>
> > >> I've read a bit through your emails and it seems the problem is that
> > >> you're dealing with dev_pm_qos on per-policy basis but each "struct
> > >> cpufreq_policy" can cover multiple CPU devices.
> > >>
> > >> An alternative solution which follows dev_pm_qos would be to add
> > >> notifiers for each CPU inside cpufreq_online and cpufreq_offline. This
> > >> makes quite a bit of sense because each CPU is a separate "device" with
> > >> a possibly distinct list of qos requests.
> > >
> > > But combining the lists of requests for all the CPUs in a policy
> > > defeats the idea of automatic aggregation of requests which really is
> > > what PM QoS is about.
> >
> > My primary interest is the "dev" part of dev_pm_qos: making pm_qos
> > requests tied to a specific device.
>
> The list of requests needs to be associated with the user of the
> effective constraint.  If that is the device, it is all good.
>
> > > There have to be two lists of requests per policy, one for the max and
> > > one for the min frequency >
> > >> If cpufreq needs a group of CPUs to run at the same frequency then it
> > >> should deal with this by doing dev_pm_qos_read_frequency on each CPU
> > >> device and picking a frequency that attempts to satisfy all constraints.
> > >
> > > No, that would be combining the requests by hand.
> >
> > It's just a loop though.
>
> Yes, it is, and needs to be run on every change of an effective
> constraint for any CPU even if the total effective constraint doesn't
> change.  And, of course, the per-policy user space limits would need
> to be combined with that by hand.
>
> Not particularly straightforward if you asked me.
>
> Not to mention the fact that, say, cpu_cooling, has a per-policy list
> of requests anyway.

A per-policy request, not a list of them.  Sorry.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-23  8:54     ` Rafael J. Wysocki
  2019-10-23  8:57       ` Rafael J. Wysocki
@ 2019-10-23 13:33       ` Leonard Crestez
  2019-10-24 13:42         ` Rafael J. Wysocki
  1 sibling, 1 reply; 25+ messages in thread
From: Leonard Crestez @ 2019-10-23 13:33 UTC (permalink / raw)
  To: Rafael J. Wysocki, Viresh Kumar
  Cc: Linux ACPI, Linux PM, Sudeep Holla, Dmitry Osipenko,
	Matthias Kaehlcke, Kyungmin Park, Chanwoo Choi,
	Artur Świgoń,
	Georgi Djakov, dl-linux-imx, Saravana Kannan, MyungJoo Ham

On 2019-10-23 11:54 AM, Rafael J. Wysocki wrote:
> On Wed, Oct 23, 2019 at 4:20 AM Leonard Crestez <leonard.crestez@nxp.com> wrote:
>> On 2019-10-23 1:48 AM, Rafael J. Wysocki wrote:
>>> On Wed, Oct 23, 2019 at 12:06 AM Leonard Crestez
>>> <leonard.crestez@nxp.com> wrote:
>>>> I've been working on a series which add DEV_PM_QOS support to devfreq,
>>>> now at v9:
>>>>
>>>> Your third patch removes DEV_PM_QOS_FREQUENCY_MIN/MAX that my series
>>>> depends upon. I found the email on patchwork, hopefully the in-reply-to
>>>> header is OK?
>>>>
>>>> As far as I can tell the replacement ("frequency qos") needs constraints
>>>> to be managed outside the device infrastructure and it's not obviously
>>>> usable a generic mechanism for making "min_freq/max_freq" requests to a
>>>> specific device.
>>>
>>> You can add a struct freq_constrants pointer to struct dev_pm_info and
>>> use it just fine.  It doesn't have to be bolted into struct
>>> dev_pm_qos.
>>
>> I'm not sure what you mean by this? min/max_freq was already available
>> in dev_pm_qos so it's not clear why it would be moved somewhere else.
>> What I'm looking for is a mechanism to make min/max_freq requests on a
>> per-device basis and DEV_PM_QOS_MIN_FREQUENCY already did that.
>>
>> Reuse is good, right?
> 
> But they go away in patch 3 of this series as there are no users in
> the tree.  Sorry about that. >
>>>> I've read a bit through your emails and it seems the problem is that
>>>> you're dealing with dev_pm_qos on per-policy basis but each "struct
>>>> cpufreq_policy" can cover multiple CPU devices.
>>>>
>>>> An alternative solution which follows dev_pm_qos would be to add
>>>> notifiers for each CPU inside cpufreq_online and cpufreq_offline. This
>>>> makes quite a bit of sense because each CPU is a separate "device" with
>>>> a possibly distinct list of qos requests.
>>>
>>> But combining the lists of requests for all the CPUs in a policy
>>> defeats the idea of automatic aggregation of requests which really is
>>> what PM QoS is about.
>>
>> My primary interest is the "dev" part of dev_pm_qos: making pm_qos
>> requests tied to a specific device.
> 
> The list of requests needs to be associated with the user of the
> effective constraint.  If that is the device, it is all good.
The phrase "user of the effective constraint" is somewhat unclear.

I'm using the target device as dev for dev_pm_qos, not the requestor. 
This is consistent with how it was used for cpufreq: thermal called a 
dev_pm_qos_add_request on with dev = cpu_dev not a thermal sensor or 
anything else.

However looking at other dev_pm_qos users there are instances of a 
driver calling dev_pm_qos_add_request on it's own device but this is not 
a strict requirement, correct?

>>> There have to be two lists of requests per policy, one for the max and
>>> one for the min frequency >
>>>> If cpufreq needs a group of CPUs to run at the same frequency then it
>>>> should deal with this by doing dev_pm_qos_read_frequency on each CPU
>>>> device and picking a frequency that attempts to satisfy all constraints.
>>>
>>> No, that would be combining the requests by hand.
>>
>> It's just a loop though.
> 
> Yes, it is, and needs to be run on every change of an effective
> constraint for any CPU even if the total effective constraint doesn't
> change.  And, of course, the per-policy user space limits would need
> to be combined with that by hand.
> 
> Not particularly straightforward if you asked me.

Well, this cpu-to-policy aggregation could also use a pm_qos_constraint 
object instead of looping.

> Not to mention the fact that, say, cpu_cooling, has a per-policy list
> of requests anyway.
> 
>>>> Handling sysfs min/max_freq through dev_pm_qos would be of dubious
>>>> value, though I guess you could register identical requests for each CPU.
>>>>
>>>> I'm not familiar with what you're trying to accomplish with PM_QOS other
>>>> than replace the sysfs min_freq/max_freq files:
>>>
>>> QoS-based management of the frequency limits is not really needed for
>>> that.  The real motivation for adding it were things like thermal and
>>> platform firmware induced limits that all have their own values to
>>> combine with the ones provided by user space.
>>
>> Current users seem to be thermal-related. Do you care about min/max_freq
>> requests from stuff not directly tied to a CPU?
> 
> Yes, I do.
> 
> And they will need to add requests per policy.
> 
>>>> What I want to do is add
>>>> a driver using the interconnect driver which translates requests for
>>>> "bandwidth-on-a-path" into "frequency-on-a-device". More specifically a
>>>> display driver could request bandwidth to RAM and this would be
>>>> translated into min frequency for NoC and the DDR controller, both of
>>>> which implement scaling via devfreq:
>>>>
>>>> This is part of an effort to upstream an out-of-tree "busfreq" feature
>>>> which allows device device to make "min frequency requests" through an
>>>> entirely out-of-tree mechanism. It would also allow finer-grained
>>>> scaling that what IMX tree currently support.
>>>>
>>>> If you're making cpufreq qos constrains be "per-cpufreq-policy" then
>>>> it's not clear how you would handle in-kernel constraints from other
>>>> subsystems. Would users have to get a pointer to struct cpufreq_policy
>>>> and struct freq_constraints?
>>>
>>> Yes.
>>>
>>>> That would make object lifetime a nightmare!
>>>
>>> Why really?  It is not much different from the device PM QoS case
>>   >> Actually,  is a simple
>>> one-for-one replacement of the former.  As it turns out, all of its
>>> users have access to a policy object anyway already.
>>
>> All current users are very closely tied to cpufreq, what I had in mind
>> is requests from unrelated subsystems.
> 
> You can use cpufreq policy notifiers for that.  Add a request for each
> CPU in the policy (or for each related CPU if that is needed) to
> policy->constraints on CREATE_POLICY and remove them on REMOVE_POLICY.
> That's all you need to do.
> 
> BTW, the original code from Viresh did that already, I haven't changed
> it.  And it didn't have per-CPU lists of frequency requests for that
> matter, it used the ones in policy->cpu as the per-policy lists, which
> doesn't work.
> 
>> Browsing through the cpufreq core it seems that it's possible for a
>> struct cpufreq_policy to be created and destroyed at various points, the
>> simplest example being rmmod/modprobe on a cpufreq driver.
>>
>> The freq_qos_add_request function grabs a pointer to struct
>> freq_constraints, this can become invalid when cpufreq_policy is freed.
>>
>> I guess all users need to register a CPUFREQ_POLICY_NOTIFIER and make
>> sure to freq_qos_add_request every time?
> 
> Yes.
> 
> The policy is the user of the effective constraint anyway and holding
> on to a list of requests without a user of the effective constraint
> would be, well, not useful.
> 
>> Looking at your [PATCH 2/3] I  can't spot any obvious issue, thermal clamping
>> code seems to get the appropriate callbacks.
>>
>>>> But dev_pm_qos solves this by tying to struct device.
>>
>> The lifetime of "struct device" is already controlled by
>> get_device/put_device.
> 
> And why does this matter here?

My point is that dev_pm_qos is easier for consumers to use than dealing 
with cpufreq_policy lifetime and has less exposure to cpufreq 
implementation details.

But all current consumers seem to be appropriately coupled into cpufreq.

>>> Well, the cpufreq sysfs is per-policy and not per-CPU and we really
>>> need a per-policy min and max frequency in cpufreq, for governors etc.
>>
>> Aggregation could be performed at two levels:
>>
>> 1) Per cpu device (by dev_pm_qos)
>> 2) Per policy (inside cpufreq)
>>
>> The per-cpu dev_pm_qos notifier would just update a per-policy
>> pm_qos_constraints object. The second step could even be done strictly
>> inside the cpufreq core using existing pm_qos, no need to invent new
>> frameworks.
>>
>> Maybe dev_pm_qos is not a very good fit for cpufreq because of these
>> "cpu device versus cpufreq_policy" issues but it makes a ton of sense
>> for devfreq. Can you maybe hold PATCH 3 from this series pending further
>> discussion?
> 
> It can be reverted at any time if need be and in 5.4 that would be dead code.

I guess I can post v10 of my "devfreq pm qos" which starts by reverting 
"PATCH 3" of this series?

--
Regards,
Leonard

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-23 13:33       ` Leonard Crestez
@ 2019-10-24 13:42         ` Rafael J. Wysocki
  2019-10-24 17:47           ` Leonard Crestez
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2019-10-24 13:42 UTC (permalink / raw)
  To: Leonard Crestez
  Cc: Rafael J. Wysocki, Viresh Kumar, Linux ACPI, Linux PM,
	Sudeep Holla, Dmitry Osipenko, Matthias Kaehlcke, Kyungmin Park,
	Chanwoo Choi, Artur Świgoń,
	Georgi Djakov, dl-linux-imx, Saravana Kannan, MyungJoo Ham

On Wed, Oct 23, 2019 at 3:33 PM Leonard Crestez <leonard.crestez@nxp.com> wrote:
>
> On 2019-10-23 11:54 AM, Rafael J. Wysocki wrote:
> > On Wed, Oct 23, 2019 at 4:20 AM Leonard Crestez <leonard.crestez@nxp.com> wrote:
> >> On 2019-10-23 1:48 AM, Rafael J. Wysocki wrote:

[cut]

> >>> But combining the lists of requests for all the CPUs in a policy
> >>> defeats the idea of automatic aggregation of requests which really is
> >>> what PM QoS is about.
> >>
> >> My primary interest is the "dev" part of dev_pm_qos: making pm_qos
> >> requests tied to a specific device.
> >
> > The list of requests needs to be associated with the user of the
> > effective constraint.  If that is the device, it is all good.
>
> The phrase "user of the effective constraint" is somewhat unclear.

Fair enough, so let me elaborate.

The effective constraint (ie. the one resulting from taking all of the
requests in the relevant QoS list into account) affects the selection
of an OPP, so it is natural to associate the QoS list producing it
with a list of OPPs to select.  In the cpufreq case, the policy holds
the list of OPPs and so it also should hold the corresponding QoS
lists (for the min and max frequency limits).  It "uses" the effective
constraints produced by those QoS lists by preventing the OPPs out of
the between the min and max values  from being selected.

Essentially, the policy represents a power (clock/voltage) domain with
multiple components (it doesn't matter what they are at this level of
abstraction). While there can be multiple sources of QoS requests
associated with each component, all of these requests ultimately need
to be passed to the domain for aggregation, because that's where the
frequency selection decisions are made and so that's where the
effective constraint value needs to be known. Now, the natural way to
allow requests from multiple sources to be passed for aggregation is
to provide a QoS list that they can be added to. That really is what
PM QoS is for.

> I'm using the target device as dev for dev_pm_qos, not the requestor.
> This is consistent with how it was used for cpufreq: thermal called a
> dev_pm_qos_add_request on with dev = cpu_dev not a thermal sensor or
> anything else.

Not really, but close. :-)

Without my series (that is 5.4-rc4, say), the cpu_cooling driver adds
its constraint to the device PM QoS of cpufreq_cdev which is a special
device created by that driver.  That would be fine, except that the
cpufreq core doesn't use that QoS.  It uses the device PM QoS of the
policy->cpu device instead.  That is, that's where it adds its
notifiers (see cpufreq_policy_alloc()), that's where user space
requests are added (see cpufreq_online()), and (most important) that's
where the effective constraint value is read from (see
cpufreq_set_policy()).  That turns out to be problematic (in addition
to the cpu_cooling driver's QoS requests going nowhere), because
confusion ensues if the current policy->cpu goes away.

> However looking at other dev_pm_qos users there are instances of a
> driver calling dev_pm_qos_add_request on it's own device but this is not
> a strict requirement, correct?

No, it isn't.

> >>> There have to be two lists of requests per policy, one for the max and
> >>> one for the min frequency >
> >>>> If cpufreq needs a group of CPUs to run at the same frequency then it
> >>>> should deal with this by doing dev_pm_qos_read_frequency on each CPU
> >>>> device and picking a frequency that attempts to satisfy all constraints.
> >>>
> >>> No, that would be combining the requests by hand.
> >>
> >> It's just a loop though.
> >
> > Yes, it is, and needs to be run on every change of an effective
> > constraint for any CPU even if the total effective constraint doesn't
> > change.  And, of course, the per-policy user space limits would need
> > to be combined with that by hand.
> >
> > Not particularly straightforward if you asked me.
>
> Well, this cpu-to-policy aggregation could also use a pm_qos_constraint
> object instead of looping.

Yes, it could, but then somebody would need to add those
"intermediate" requests to a proper policy-level QoS and it would need
an extra notifier invocation to update each of them on a "component"
QoS change.

This is an interesting idea in case we ever need to improve the
scalability of the QoS lists, but I'd rather use the simpler approach
for now.

[cut]

> >>> Well, the cpufreq sysfs is per-policy and not per-CPU and we really
> >>> need a per-policy min and max frequency in cpufreq, for governors etc.
> >>
> >> Aggregation could be performed at two levels:
> >>
> >> 1) Per cpu device (by dev_pm_qos)
> >> 2) Per policy (inside cpufreq)
> >>
> >> The per-cpu dev_pm_qos notifier would just update a per-policy
> >> pm_qos_constraints object. The second step could even be done strictly
> >> inside the cpufreq core using existing pm_qos, no need to invent new
> >> frameworks.
> >>
> >> Maybe dev_pm_qos is not a very good fit for cpufreq because of these
> >> "cpu device versus cpufreq_policy" issues but it makes a ton of sense
> >> for devfreq. Can you maybe hold PATCH 3 from this series pending further
> >> discussion?
> >
> > It can be reverted at any time if need be and in 5.4 that would be dead code.
>
> I guess I can post v10 of my "devfreq pm qos" which starts by reverting
> "PATCH 3" of this series?

You may do that, but I would consider adding a struct freq_constraints
pointer directly to struct dev_pm_info and using the new frequency QoS
helpers to manage it.

Arguably, there is no need to bundle that with the rest of device PM
QoS and doing the above would help to avoid some code duplication too.

Thanks!

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-24 13:42         ` Rafael J. Wysocki
@ 2019-10-24 17:47           ` Leonard Crestez
  2019-10-24 21:10             ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Leonard Crestez @ 2019-10-24 17:47 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Viresh Kumar, Linux ACPI, Linux PM, Sudeep Holla,
	Dmitry Osipenko, Matthias Kaehlcke, Kyungmin Park, Chanwoo Choi,
	Artur Świgoń,
	Georgi Djakov, dl-linux-imx, Saravana Kannan, MyungJoo Ham

On 24.10.2019 16:42, Rafael J. Wysocki wrote:
> On Wed, Oct 23, 2019 at 3:33 PM Leonard Crestez <leonard.crestez@nxp.com> wrote:
>>
>> On 2019-10-23 11:54 AM, Rafael J. Wysocki wrote:
>>> On Wed, Oct 23, 2019 at 4:20 AM Leonard Crestez <leonard.crestez@nxp.com> wrote:
>>>> On 2019-10-23 1:48 AM, Rafael J. Wysocki wrote:
> 
> [cut]
> 
>>>>> But combining the lists of requests for all the CPUs in a policy
>>>>> defeats the idea of automatic aggregation of requests which really is
>>>>> what PM QoS is about.
>>>>
>>>> My primary interest is the "dev" part of dev_pm_qos: making pm_qos
>>>> requests tied to a specific device.
>>>
>>> The list of requests needs to be associated with the user of the
>>> effective constraint.  If that is the device, it is all good.
>>
>> The phrase "user of the effective constraint" is somewhat unclear.
> 
> Fair enough, so let me elaborate.
> 
> The effective constraint (ie. the one resulting from taking all of the
> requests in the relevant QoS list into account) affects the selection
> of an OPP, so it is natural to associate the QoS list producing it
> with a list of OPPs to select.  In the cpufreq case, the policy holds
> the list of OPPs and so it also should hold the corresponding QoS
> lists (for the min and max frequency limits).  It "uses" the effective
> constraints produced by those QoS lists by preventing the OPPs out of
> the between the min and max values  from being selected.
> 
> Essentially, the policy represents a power (clock/voltage) domain with
> multiple components (it doesn't matter what they are at this level of
> abstraction). While there can be multiple sources of QoS requests
> associated with each component, all of these requests ultimately need
> to be passed to the domain for aggregation, because that's where the
> frequency selection decisions are made and so that's where the
> effective constraint value needs to be known. Now, the natural way to
> allow requests from multiple sources to be passed for aggregation is
> to provide a QoS list that they can be added to. That really is what
> PM QoS is for.
> 
>> I'm using the target device as dev for dev_pm_qos, not the requestor.
>> This is consistent with how it was used for cpufreq: thermal called a
>> dev_pm_qos_add_request on with dev = cpu_dev not a thermal sensor or
>> anything else.
> 
> Not really, but close. :-)
> 
> Without my series (that is 5.4-rc4, say), the cpu_cooling driver adds
> its constraint to the device PM QoS of cpufreq_cdev which is a special
> device created by that driver.  That would be fine, except that the
> cpufreq core doesn't use that QoS.  It uses the device PM QoS of the
> policy->cpu device instead.  That is, that's where it adds its
> notifiers (see cpufreq_policy_alloc()), that's where user space
> requests are added (see cpufreq_online()), and (most important) that's
> where the effective constraint value is read from (see
> cpufreq_set_policy()).  That turns out to be problematic (in addition
> to the cpu_cooling driver's QoS requests going nowhere), because
> confusion ensues if the current policy->cpu goes away.

That behavior in cpu_cooling seems like a bug.

>> However looking at other dev_pm_qos users there are instances of a
>> driver calling dev_pm_qos_add_request on it's own device but this is not
>> a strict requirement, correct?
> 
> No, it isn't.
> 
>>>>> There have to be two lists of requests per policy, one for the max and
>>>>> one for the min frequency >
>>>>>> If cpufreq needs a group of CPUs to run at the same frequency then it
>>>>>> should deal with this by doing dev_pm_qos_read_frequency on each CPU
>>>>>> device and picking a frequency that attempts to satisfy all constraints.
>>>>>
>>>>> No, that would be combining the requests by hand.
>>>>
>>>> It's just a loop though.
>>>
>>> Yes, it is, and needs to be run on every change of an effective
>>> constraint for any CPU even if the total effective constraint doesn't
>>> change.  And, of course, the per-policy user space limits would need
>>> to be combined with that by hand.
>>>
>>> Not particularly straightforward if you asked me.
>>
>> Well, this cpu-to-policy aggregation could also use a pm_qos_constraint
>> object instead of looping.
> 
> Yes, it could, but then somebody would need to add those
> "intermediate" requests to a proper policy-level QoS and it would need
> an extra notifier invocation to update each of them on a "component"
> QoS change.
> 
> This is an interesting idea in case we ever need to improve the
> scalability of the QoS lists, but I'd rather use the simpler approach
> for now.

The advantage I see is reducing the exposure of cpufreq internals

>>>>> Well, the cpufreq sysfs is per-policy and not per-CPU and we really
>>>>> need a per-policy min and max frequency in cpufreq, for governors etc.
>>>>
>>>> Aggregation could be performed at two levels:
>>>>
>>>> 1) Per cpu device (by dev_pm_qos)
>>>> 2) Per policy (inside cpufreq)
>>>>
>>>> The per-cpu dev_pm_qos notifier would just update a per-policy
>>>> pm_qos_constraints object. The second step could even be done strictly
>>>> inside the cpufreq core using existing pm_qos, no need to invent new
>>>> frameworks.
>>>>
>>>> Maybe dev_pm_qos is not a very good fit for cpufreq because of these
>>>> "cpu device versus cpufreq_policy" issues but it makes a ton of sense
>>>> for devfreq. Can you maybe hold PATCH 3 from this series pending further
>>>> discussion?
>>>
>>> It can be reverted at any time if need be and in 5.4 that would be dead code.
>>
>> I guess I can post v10 of my "devfreq pm qos" which starts by reverting
>> "PATCH 3" of this series?
> 
> You may do that, but I would consider adding a struct freq_constraints
> pointer directly to struct dev_pm_info and using the new frequency QoS
> helpers to manage it.
> 
> Arguably, there is no need to bundle that with the rest of device PM
> QoS and doing the above would help to avoid some code duplication too.

Adding to struct dev_pm_info would increase sizeof(struct device) while 
dev_pm_qos only allocates memory when constraints are added. My 
expectation is that very few devices would even have min_freq and 
max_freq constraints.

Maybe struct dev_pm_qos could host a "struct freq_constraints freq" 
instead of two separate "struct pm_qos_constraints min/max_frequency"? 
This way there would be two users of freq_constraints: cpufreq_policy 
(which is not a device) and dev_pm_qos.

In the future freq_constraints might be extended to implement some logic 
for conflicts between min_freq and max_freq requests.

--
Regards,
Leonard

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-24 17:47           ` Leonard Crestez
@ 2019-10-24 21:10             ` Rafael J. Wysocki
  2019-10-25 18:04               ` Leonard Crestez
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2019-10-24 21:10 UTC (permalink / raw)
  To: Leonard Crestez
  Cc: Rafael J. Wysocki, Viresh Kumar, Linux ACPI, Linux PM,
	Sudeep Holla, Dmitry Osipenko, Matthias Kaehlcke, Kyungmin Park,
	Chanwoo Choi, Artur Świgoń,
	Georgi Djakov, dl-linux-imx, Saravana Kannan, MyungJoo Ham

On Thu, Oct 24, 2019 at 7:47 PM Leonard Crestez <leonard.crestez@nxp.com> wrote:
>
> On 24.10.2019 16:42, Rafael J. Wysocki wrote:
> > On Wed, Oct 23, 2019 at 3:33 PM Leonard Crestez <leonard.crestez@nxp.com> wrote:
> >>
> >> On 2019-10-23 11:54 AM, Rafael J. Wysocki wrote:
> >>> On Wed, Oct 23, 2019 at 4:20 AM Leonard Crestez <leonard.crestez@nxp.com> wrote:
> >>>> On 2019-10-23 1:48 AM, Rafael J. Wysocki wrote:
> >
> > [cut]
> >
> >>>>> But combining the lists of requests for all the CPUs in a policy
> >>>>> defeats the idea of automatic aggregation of requests which really is
> >>>>> what PM QoS is about.
> >>>>
> >>>> My primary interest is the "dev" part of dev_pm_qos: making pm_qos
> >>>> requests tied to a specific device.
> >>>
> >>> The list of requests needs to be associated with the user of the
> >>> effective constraint.  If that is the device, it is all good.
> >>
> >> The phrase "user of the effective constraint" is somewhat unclear.
> >
> > Fair enough, so let me elaborate.
> >
> > The effective constraint (ie. the one resulting from taking all of the
> > requests in the relevant QoS list into account) affects the selection
> > of an OPP, so it is natural to associate the QoS list producing it
> > with a list of OPPs to select.  In the cpufreq case, the policy holds
> > the list of OPPs and so it also should hold the corresponding QoS
> > lists (for the min and max frequency limits).  It "uses" the effective
> > constraints produced by those QoS lists by preventing the OPPs out of
> > the between the min and max values  from being selected.
> >
> > Essentially, the policy represents a power (clock/voltage) domain with
> > multiple components (it doesn't matter what they are at this level of
> > abstraction). While there can be multiple sources of QoS requests
> > associated with each component, all of these requests ultimately need
> > to be passed to the domain for aggregation, because that's where the
> > frequency selection decisions are made and so that's where the
> > effective constraint value needs to be known. Now, the natural way to
> > allow requests from multiple sources to be passed for aggregation is
> > to provide a QoS list that they can be added to. That really is what
> > PM QoS is for.
> >
> >> I'm using the target device as dev for dev_pm_qos, not the requestor.
> >> This is consistent with how it was used for cpufreq: thermal called a
> >> dev_pm_qos_add_request on with dev = cpu_dev not a thermal sensor or
> >> anything else.
> >
> > Not really, but close. :-)
> >
> > Without my series (that is 5.4-rc4, say), the cpu_cooling driver adds
> > its constraint to the device PM QoS of cpufreq_cdev which is a special
> > device created by that driver.  That would be fine, except that the
> > cpufreq core doesn't use that QoS.  It uses the device PM QoS of the
> > policy->cpu device instead.  That is, that's where it adds its
> > notifiers (see cpufreq_policy_alloc()), that's where user space
> > requests are added (see cpufreq_online()), and (most important) that's
> > where the effective constraint value is read from (see
> > cpufreq_set_policy()).  That turns out to be problematic (in addition
> > to the cpu_cooling driver's QoS requests going nowhere), because
> > confusion ensues if the current policy->cpu goes away.
>
> That behavior in cpu_cooling seems like a bug.

Well, kind of. :-)

> >> However looking at other dev_pm_qos users there are instances of a
> >> driver calling dev_pm_qos_add_request on it's own device but this is not
> >> a strict requirement, correct?
> >
> > No, it isn't.
> >
> >>>>> There have to be two lists of requests per policy, one for the max and
> >>>>> one for the min frequency >
> >>>>>> If cpufreq needs a group of CPUs to run at the same frequency then it
> >>>>>> should deal with this by doing dev_pm_qos_read_frequency on each CPU
> >>>>>> device and picking a frequency that attempts to satisfy all constraints.
> >>>>>
> >>>>> No, that would be combining the requests by hand.
> >>>>
> >>>> It's just a loop though.
> >>>
> >>> Yes, it is, and needs to be run on every change of an effective
> >>> constraint for any CPU even if the total effective constraint doesn't
> >>> change.  And, of course, the per-policy user space limits would need
> >>> to be combined with that by hand.
> >>>
> >>> Not particularly straightforward if you asked me.
> >>
> >> Well, this cpu-to-policy aggregation could also use a pm_qos_constraint
> >> object instead of looping.
> >
> > Yes, it could, but then somebody would need to add those
> > "intermediate" requests to a proper policy-level QoS and it would need
> > an extra notifier invocation to update each of them on a "component"
> > QoS change.
> >
> > This is an interesting idea in case we ever need to improve the
> > scalability of the QoS lists, but I'd rather use the simpler approach
> > for now.
>
> The advantage I see is reducing the exposure of cpufreq internals

That can be achieved by providing a helper to add a frequency QoS
request to the min or max QoS list of the policy covering a given CPU.
The caller of it would just need to pass the CPU number, a pointer to
the request struct and the type.

It wasn't necessary to add it at this time, though, and there would be
the extra complication that the caller would need to know whether or
not the policy had been created already.

> >>>>> Well, the cpufreq sysfs is per-policy and not per-CPU and we really
> >>>>> need a per-policy min and max frequency in cpufreq, for governors etc.
> >>>>
> >>>> Aggregation could be performed at two levels:
> >>>>
> >>>> 1) Per cpu device (by dev_pm_qos)
> >>>> 2) Per policy (inside cpufreq)
> >>>>
> >>>> The per-cpu dev_pm_qos notifier would just update a per-policy
> >>>> pm_qos_constraints object. The second step could even be done strictly
> >>>> inside the cpufreq core using existing pm_qos, no need to invent new
> >>>> frameworks.
> >>>>
> >>>> Maybe dev_pm_qos is not a very good fit for cpufreq because of these
> >>>> "cpu device versus cpufreq_policy" issues but it makes a ton of sense
> >>>> for devfreq. Can you maybe hold PATCH 3 from this series pending further
> >>>> discussion?
> >>>
> >>> It can be reverted at any time if need be and in 5.4 that would be dead code.
> >>
> >> I guess I can post v10 of my "devfreq pm qos" which starts by reverting
> >> "PATCH 3" of this series?
> >
> > You may do that, but I would consider adding a struct freq_constraints
> > pointer directly to struct dev_pm_info and using the new frequency QoS
> > helpers to manage it.
> >
> > Arguably, there is no need to bundle that with the rest of device PM
> > QoS and doing the above would help to avoid some code duplication too.
>
> Adding to struct dev_pm_info would increase sizeof(struct device) while
> dev_pm_qos only allocates memory when constraints are added. My
> expectation is that very few devices would even have min_freq and
> max_freq constraints.

Well, fair enough.

> Maybe struct dev_pm_qos could host a "struct freq_constraints freq"
> instead of two separate "struct pm_qos_constraints min/max_frequency"?

That is possible too.

> This way there would be two users of freq_constraints: cpufreq_policy
> (which is not a device) and dev_pm_qos.
>
> In the future freq_constraints might be extended to implement some logic
> for conflicts between min_freq and max_freq requests.

Sure.

Thanks!

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-24 21:10             ` Rafael J. Wysocki
@ 2019-10-25 18:04               ` Leonard Crestez
  0 siblings, 0 replies; 25+ messages in thread
From: Leonard Crestez @ 2019-10-25 18:04 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Viresh Kumar, Linux ACPI, Linux PM, Sudeep Holla,
	Dmitry Osipenko, Matthias Kaehlcke, Kyungmin Park, Chanwoo Choi,
	Artur Świgoń,
	Georgi Djakov, dl-linux-imx, Saravana Kannan, MyungJoo Ham

On 25.10.2019 00:11, Rafael J. Wysocki wrote:
> On Thu, Oct 24, 2019 at 7:47 PM Leonard Crestez <leonard.crestez@nxp.com> wrote:
>>
>> On 24.10.2019 16:42, Rafael J. Wysocki wrote:
>>> On Wed, Oct 23, 2019 at 3:33 PM Leonard Crestez <leonard.crestez@nxp.com> wrote:
>>>>
>>>> On 2019-10-23 11:54 AM, Rafael J. Wysocki wrote:
>>>>> On Wed, Oct 23, 2019 at 4:20 AM Leonard Crestez <leonard.crestez@nxp.com> wrote:
>>>>>> On 2019-10-23 1:48 AM, Rafael J. Wysocki wrote:
>>>
>>> [cut]
>>>
>>>>>>> But combining the lists of requests for all the CPUs in a policy
>>>>>>> defeats the idea of automatic aggregation of requests which really is
>>>>>>> what PM QoS is about.
>>>>>>
>>>>>> My primary interest is the "dev" part of dev_pm_qos: making pm_qos
>>>>>> requests tied to a specific device.
>>>>>
>>>>> The list of requests needs to be associated with the user of the
>>>>> effective constraint.  If that is the device, it is all good.
>>>>
>>>> The phrase "user of the effective constraint" is somewhat unclear.
>>>
>>> Fair enough, so let me elaborate.
>>>
>>> The effective constraint (ie. the one resulting from taking all of the
>>> requests in the relevant QoS list into account) affects the selection
>>> of an OPP, so it is natural to associate the QoS list producing it
>>> with a list of OPPs to select.  In the cpufreq case, the policy holds
>>> the list of OPPs and so it also should hold the corresponding QoS
>>> lists (for the min and max frequency limits).  It "uses" the effective
>>> constraints produced by those QoS lists by preventing the OPPs out of
>>> the between the min and max values  from being selected.
>>>
>>> Essentially, the policy represents a power (clock/voltage) domain with
>>> multiple components (it doesn't matter what they are at this level of
>>> abstraction). While there can be multiple sources of QoS requests
>>> associated with each component, all of these requests ultimately need
>>> to be passed to the domain for aggregation, because that's where the
>>> frequency selection decisions are made and so that's where the
>>> effective constraint value needs to be known. Now, the natural way to
>>> allow requests from multiple sources to be passed for aggregation is
>>> to provide a QoS list that they can be added to. That really is what
>>> PM QoS is for.
>>>
>>>> I'm using the target device as dev for dev_pm_qos, not the requestor.
>>>> This is consistent with how it was used for cpufreq: thermal called a
>>>> dev_pm_qos_add_request on with dev = cpu_dev not a thermal sensor or
>>>> anything else.
>>>
>>> Not really, but close. :-)
>>>
>>> Without my series (that is 5.4-rc4, say), the cpu_cooling driver adds
>>> its constraint to the device PM QoS of cpufreq_cdev which is a special
>>> device created by that driver.  That would be fine, except that the
>>> cpufreq core doesn't use that QoS.  It uses the device PM QoS of the
>>> policy->cpu device instead.  That is, that's where it adds its
>>> notifiers (see cpufreq_policy_alloc()), that's where user space
>>> requests are added (see cpufreq_online()), and (most important) that's
>>> where the effective constraint value is read from (see
>>> cpufreq_set_policy()).  That turns out to be problematic (in addition
>>> to the cpu_cooling driver's QoS requests going nowhere), because
>>> confusion ensues if the current policy->cpu goes away.
>>
>> That behavior in cpu_cooling seems like a bug.
> 
> Well, kind of. :-)
> 
>>>> However looking at other dev_pm_qos users there are instances of a
>>>> driver calling dev_pm_qos_add_request on it's own device but this is not
>>>> a strict requirement, correct?
>>>
>>> No, it isn't.
>>>
>>>>>>> There have to be two lists of requests per policy, one for the max and
>>>>>>> one for the min frequency >
>>>>>>>> If cpufreq needs a group of CPUs to run at the same frequency then it
>>>>>>>> should deal with this by doing dev_pm_qos_read_frequency on each CPU
>>>>>>>> device and picking a frequency that attempts to satisfy all constraints.
>>>>>>>
>>>>>>> No, that would be combining the requests by hand.
>>>>>>
>>>>>> It's just a loop though.
>>>>>
>>>>> Yes, it is, and needs to be run on every change of an effective
>>>>> constraint for any CPU even if the total effective constraint doesn't
>>>>> change.  And, of course, the per-policy user space limits would need
>>>>> to be combined with that by hand.
>>>>>
>>>>> Not particularly straightforward if you asked me.
>>>>
>>>> Well, this cpu-to-policy aggregation could also use a pm_qos_constraint
>>>> object instead of looping.
>>>
>>> Yes, it could, but then somebody would need to add those
>>> "intermediate" requests to a proper policy-level QoS and it would need
>>> an extra notifier invocation to update each of them on a "component"
>>> QoS change.
>>>
>>> This is an interesting idea in case we ever need to improve the
>>> scalability of the QoS lists, but I'd rather use the simpler approach
>>> for now.
>>
>> The advantage I see is reducing the exposure of cpufreq internals
> 
> That can be achieved by providing a helper to add a frequency QoS
> request to the min or max QoS list of the policy covering a given CPU.
> The caller of it would just need to pass the CPU number, a pointer to
> the request struct and the type.
> 
> It wasn't necessary to add it at this time, though, and there would be
> the extra complication that the caller would need to know whether or
> not the policy had been created already.

Using dev_pm_qos already provides that, and since the request is tied to 
the struct device of the CPU there is no need to know anything about 
cpufreq_policy at all.

It would just need an additional layer of aggregation from CPU to 
cpufreq_policy.

>>>>>>> Well, the cpufreq sysfs is per-policy and not per-CPU and we really
>>>>>>> need a per-policy min and max frequency in cpufreq, for governors etc.
>>>>>>
>>>>>> Aggregation could be performed at two levels:
>>>>>>
>>>>>> 1) Per cpu device (by dev_pm_qos)
>>>>>> 2) Per policy (inside cpufreq)
>>>>>>
>>>>>> The per-cpu dev_pm_qos notifier would just update a per-policy
>>>>>> pm_qos_constraints object. The second step could even be done strictly
>>>>>> inside the cpufreq core using existing pm_qos, no need to invent new
>>>>>> frameworks.
>>>>>>
>>>>>> Maybe dev_pm_qos is not a very good fit for cpufreq because of these
>>>>>> "cpu device versus cpufreq_policy" issues but it makes a ton of sense
>>>>>> for devfreq. Can you maybe hold PATCH 3 from this series pending further
>>>>>> discussion?
>>>>>
>>>>> It can be reverted at any time if need be and in 5.4 that would be dead code.
>>>>
>>>> I guess I can post v10 of my "devfreq pm qos" which starts by reverting
>>>> "PATCH 3" of this series?
>>>
>>> You may do that, but I would consider adding a struct freq_constraints
>>> pointer directly to struct dev_pm_info and using the new frequency QoS
>>> helpers to manage it.
>>>
>>> Arguably, there is no need to bundle that with the rest of device PM
>>> QoS and doing the above would help to avoid some code duplication too.
>>
>> Adding to struct dev_pm_info would increase sizeof(struct device) while
>> dev_pm_qos only allocates memory when constraints are added. My
>> expectation is that very few devices would even have min_freq and
>> max_freq constraints.
> 
> Well, fair enough.
> 
>> Maybe struct dev_pm_qos could host a "struct freq_constraints freq"
>> instead of two separate "struct pm_qos_constraints min/max_frequency"?
> 
> That is possible too.
> 
>> This way there would be two users of freq_constraints: cpufreq_policy
>> (which is not a device) and dev_pm_qos.
>>
>> In the future freq_constraints might be extended to implement some logic
>> for conflicts between min_freq and max_freq requests.
> 
> Sure.

Posted for review: https://patchwork.kernel.org/cover/11212887/

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-18  9:26                   ` Rafael J. Wysocki
@ 2019-10-18  9:28                     ` Viresh Kumar
  0 siblings, 0 replies; 25+ messages in thread
From: Viresh Kumar @ 2019-10-18  9:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Sudeep Holla, Rafael J. Wysocki, Linux PM, Linux ACPI, LKML,
	Dmitry Osipenko

On 18-10-19, 11:26, Rafael J. Wysocki wrote:
> On Fri, Oct 18, 2019 at 11:24 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> >
> > On 18-10-19, 10:30, Rafael J. Wysocki wrote:
> > > On Fri, Oct 18, 2019 at 10:27 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > >
> > > > On 18-10-19, 10:24, Rafael J. Wysocki wrote:
> > > > > On Fri, Oct 18, 2019 at 7:44 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > > > >
> > > > > > On 17-10-19, 18:34, Rafael J. Wysocki wrote:
> > > > > > > [BTW, Viresh, it looks like cpufreq_set_policy() should still ensure
> > > > > > > that the new min is less than the new max, because the QoS doesn't do
> > > > > > > that.]
> > > > > >
> > > > > > The ->verify() callback does that for us I believe.
> > > > >
> > > > > It does in practice AFAICS, but in theory it may assume the right
> > > > > ordering between the min and the max and just test the boundaries, may
> > > > > it not?
> > > >
> > > > I think cpufreq_verify_within_limits() gets called for sure from
> > > > within ->verify() for all platforms
> > >
> > > That's why I mean by "in practice". :-)
> >
> > Hmm, I am not sure if we should really add another min <= max check in
> > cpufreq_set_policy() as in practice it will never hit :)
> 
> Fair enough, but adding a comment regarding that in there would be prudent IMO.

will do.

-- 
viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-18  9:24                 ` Viresh Kumar
@ 2019-10-18  9:26                   ` Rafael J. Wysocki
  2019-10-18  9:28                     ` Viresh Kumar
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2019-10-18  9:26 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Sudeep Holla, Rafael J. Wysocki, Linux PM,
	Linux ACPI, LKML, Dmitry Osipenko

On Fri, Oct 18, 2019 at 11:24 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 18-10-19, 10:30, Rafael J. Wysocki wrote:
> > On Fri, Oct 18, 2019 at 10:27 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > >
> > > On 18-10-19, 10:24, Rafael J. Wysocki wrote:
> > > > On Fri, Oct 18, 2019 at 7:44 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > > >
> > > > > On 17-10-19, 18:34, Rafael J. Wysocki wrote:
> > > > > > [BTW, Viresh, it looks like cpufreq_set_policy() should still ensure
> > > > > > that the new min is less than the new max, because the QoS doesn't do
> > > > > > that.]
> > > > >
> > > > > The ->verify() callback does that for us I believe.
> > > >
> > > > It does in practice AFAICS, but in theory it may assume the right
> > > > ordering between the min and the max and just test the boundaries, may
> > > > it not?
> > >
> > > I think cpufreq_verify_within_limits() gets called for sure from
> > > within ->verify() for all platforms
> >
> > That's why I mean by "in practice". :-)
>
> Hmm, I am not sure if we should really add another min <= max check in
> cpufreq_set_policy() as in practice it will never hit :)

Fair enough, but adding a comment regarding that in there would be prudent IMO.


>
> --
> viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-18  8:30               ` Rafael J. Wysocki
@ 2019-10-18  9:24                 ` Viresh Kumar
  2019-10-18  9:26                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Viresh Kumar @ 2019-10-18  9:24 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Sudeep Holla, Rafael J. Wysocki, Linux PM, Linux ACPI, LKML,
	Dmitry Osipenko

On 18-10-19, 10:30, Rafael J. Wysocki wrote:
> On Fri, Oct 18, 2019 at 10:27 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> >
> > On 18-10-19, 10:24, Rafael J. Wysocki wrote:
> > > On Fri, Oct 18, 2019 at 7:44 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > >
> > > > On 17-10-19, 18:34, Rafael J. Wysocki wrote:
> > > > > [BTW, Viresh, it looks like cpufreq_set_policy() should still ensure
> > > > > that the new min is less than the new max, because the QoS doesn't do
> > > > > that.]
> > > >
> > > > The ->verify() callback does that for us I believe.
> > >
> > > It does in practice AFAICS, but in theory it may assume the right
> > > ordering between the min and the max and just test the boundaries, may
> > > it not?
> >
> > I think cpufreq_verify_within_limits() gets called for sure from
> > within ->verify() for all platforms
> 
> That's why I mean by "in practice". :-)

Hmm, I am not sure if we should really add another min <= max check in
cpufreq_set_policy() as in practice it will never hit :)

-- 
viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-18  8:27             ` Viresh Kumar
@ 2019-10-18  8:30               ` Rafael J. Wysocki
  2019-10-18  9:24                 ` Viresh Kumar
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2019-10-18  8:30 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Sudeep Holla, Rafael J. Wysocki, Linux PM,
	Linux ACPI, LKML, Dmitry Osipenko

On Fri, Oct 18, 2019 at 10:27 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 18-10-19, 10:24, Rafael J. Wysocki wrote:
> > On Fri, Oct 18, 2019 at 7:44 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > >
> > > On 17-10-19, 18:34, Rafael J. Wysocki wrote:
> > > > [BTW, Viresh, it looks like cpufreq_set_policy() should still ensure
> > > > that the new min is less than the new max, because the QoS doesn't do
> > > > that.]
> > >
> > > The ->verify() callback does that for us I believe.
> >
> > It does in practice AFAICS, but in theory it may assume the right
> > ordering between the min and the max and just test the boundaries, may
> > it not?
>
> I think cpufreq_verify_within_limits() gets called for sure from
> within ->verify() for all platforms

That's why I mean by "in practice". :-)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-18  8:24           ` Rafael J. Wysocki
@ 2019-10-18  8:27             ` Viresh Kumar
  2019-10-18  8:30               ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Viresh Kumar @ 2019-10-18  8:27 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Sudeep Holla, Rafael J. Wysocki, Linux PM, Linux ACPI, LKML,
	Dmitry Osipenko

On 18-10-19, 10:24, Rafael J. Wysocki wrote:
> On Fri, Oct 18, 2019 at 7:44 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> >
> > On 17-10-19, 18:34, Rafael J. Wysocki wrote:
> > > [BTW, Viresh, it looks like cpufreq_set_policy() should still ensure
> > > that the new min is less than the new max, because the QoS doesn't do
> > > that.]
> >
> > The ->verify() callback does that for us I believe.
> 
> It does in practice AFAICS, but in theory it may assume the right
> ordering between the min and the max and just test the boundaries, may
> it not?

I think cpufreq_verify_within_limits() gets called for sure from
within ->verify() for all platforms and this explicitly checks

        if (policy->min > policy->max)
                policy->min = policy->max;

-- 
viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-18  5:44         ` Viresh Kumar
@ 2019-10-18  8:24           ` Rafael J. Wysocki
  2019-10-18  8:27             ` Viresh Kumar
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2019-10-18  8:24 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Sudeep Holla, Rafael J. Wysocki, Linux PM,
	Linux ACPI, LKML, Dmitry Osipenko

On Fri, Oct 18, 2019 at 7:44 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 17-10-19, 18:34, Rafael J. Wysocki wrote:
> > [BTW, Viresh, it looks like cpufreq_set_policy() should still ensure
> > that the new min is less than the new max, because the QoS doesn't do
> > that.]
>
> The ->verify() callback does that for us I believe.

It does in practice AFAICS, but in theory it may assume the right
ordering between the min and the max and just test the boundaries, may
it not?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-17 16:34       ` Rafael J. Wysocki
  2019-10-17 16:42         ` Sudeep Holla
@ 2019-10-18  5:44         ` Viresh Kumar
  2019-10-18  8:24           ` Rafael J. Wysocki
  1 sibling, 1 reply; 25+ messages in thread
From: Viresh Kumar @ 2019-10-18  5:44 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Sudeep Holla, Rafael J. Wysocki, Linux PM, Linux ACPI, LKML,
	Dmitry Osipenko

On 17-10-19, 18:34, Rafael J. Wysocki wrote:
> [BTW, Viresh, it looks like cpufreq_set_policy() should still ensure
> that the new min is less than the new max, because the QoS doesn't do
> that.]

The ->verify() callback does that for us I believe.

-- 
viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-16 14:23 ` Sudeep Holla
  2019-10-17  9:57   ` Viresh Kumar
@ 2019-10-17 17:14   ` Sudeep Holla
  1 sibling, 0 replies; 25+ messages in thread
From: Sudeep Holla @ 2019-10-17 17:14 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Linux ACPI, LKML, Viresh Kumar, Dmitry Osipenko, Sudeep Holla

On Wed, Oct 16, 2019 at 03:23:43PM +0100, Sudeep Holla wrote:
> On Wed, Oct 16, 2019 at 12:37:58PM +0200, Rafael J. Wysocki wrote:
> > Hi All,
> >
> > The motivation for this series is to address the problem discussed here:
> >
> > https://lore.kernel.org/linux-pm/5ad2624194baa2f53acc1f1e627eb7684c577a19.1562210705.git.viresh.kumar@linaro.org/T/#md2d89e95906b8c91c15f582146173dce2e86e99f
> >
> > and also reported here:
> >
> > https://lore.kernel.org/linux-pm/20191015155735.GA29105@bogus/
> >
> > Plus, generally speaking, using the policy CPU as a proxy for the policy
> > with respect to PM QoS does not feel particularly straightforward to me
> > and adds extra complexity.
> >
> > Anyway, the first patch adds frequency QoS that is based on "raw" PM QoS (kind
> > of in analogy with device PM QoS) and is just about min and max frequency
> > requests (no direct relationship to devices).
> >
> > The second patch switches over cpufreq and its users to the new frequency QoS.
> > [The Fixes: tag has been tentatively added to it.]
> >
> > The third one removes frequency request types from device PM QoS.
> >
> > Unfortunately, the patches are rather big, but also they are quite
> > straightforward.
> >
> > I didn't have the time to test this series, so giving it a go would be much
> > appreciated.
>
> Thanks for the spinning these patches so quickly.
>
For the record, I thought of providing the crash that this series fixes.
After applying [1] which fixes the boot issue I was seeing on TC2, I started
seeing the below crash, which this series fixes.

FWIW,
Tested-by: Sudeep Holla <sudeep.holla@arm.com>

--

Unable to handle kernel paging request at virtual address 31b2c303
pgd = 772b96e1
[31b2c303] *pgd=a4050003, *pmd=00000000
Internal error: Oops: 206 [#1] SMP THUMB2
Modules linked in:
CPU: 1 PID: 518 Comm: bash Not tainted 5.4.0-rc3-00062-g6e3a7fd7a87e-dirty #123
Hardware name: ARM-Versatile Express
PC is at blocking_notifier_chain_unregister+0x2a/0x78
LR is at blocking_notifier_chain_unregister+0x1b/0x78
Flags: NzCv  IRQs on  FIQs off  Mode SVC_32  ISA Thumb  Segment user
Control: 70c5387d  Table: a57b08c0  DAC: 55555555
Process bash (pid: 518, stack limit = 0x018ebe57)
(blocking_notifier_chain_unregister) from (dev_pm_qos_remove_notifier+0x5d/0xb4)
(dev_pm_qos_remove_notifier) from (cpufreq_policy_free+0x77/0xc8)
(cpufreq_policy_free) from (subsys_interface_unregister+0x4f/0x80)
(subsys_interface_unregister) from (cpufreq_unregister_driver+0x29/0x6c)
(cpufreq_unregister_driver) from (bL_cpufreq_switcher_notifier+0x41/0x4c)
(bL_cpufreq_switcher_notifier) from (notifier_call_chain+0x3d/0x58)
(notifier_call_chain) from (blocking_notifier_call_chain+0x29/0x38)
(blocking_notifier_call_chain) from (bL_activation_notify+0x13/0x40)
(bL_activation_notify) from (bL_switcher_active_store+0x59/0x190)
(bL_switcher_active_store) from (kernfs_fop_write+0x85/0x12c)
(kernfs_fop_write) from (__vfs_write+0x21/0x130)
(__vfs_write) from (vfs_write+0x6b/0xfc)
(vfs_write) from (ksys_write+0x6d/0x90)
(ksys_write) from (ret_fast_syscall+0x1/0x5a)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-17 16:34       ` Rafael J. Wysocki
@ 2019-10-17 16:42         ` Sudeep Holla
  2019-10-18  5:44         ` Viresh Kumar
  1 sibling, 0 replies; 25+ messages in thread
From: Sudeep Holla @ 2019-10-17 16:42 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Viresh Kumar, Rafael J. Wysocki, Linux PM, Linux ACPI, LKML,
	Sudeep Holla, Dmitry Osipenko

On Thu, Oct 17, 2019 at 06:34:28PM +0200, Rafael J. Wysocki wrote:
> On Thu, Oct 17, 2019 at 12:00 PM Sudeep Holla <sudeep.holla@arm.com> wrote:
> >
> > On Thu, Oct 17, 2019 at 03:27:25PM +0530, Viresh Kumar wrote:
> > > On 16-10-19, 15:23, Sudeep Holla wrote:
> > > > Thanks for the spinning these patches so quickly.
> > > >
> > > > I did give it a spin, but unfortunately it doesn't fix the bug I reported.
> > > > So I looked at my bug report in detail and looks like the cpufreq_driver
> > > > variable is set to NULL at that point and it fails to dereference it
> > > > while trying to execute:
> > > >     ret = cpufreq_driver->verify(new_policy);
> > > > (Hint verify is at offset 0x1c/28)
> > > >
> > > > So I suspect some race as this platform with bL switcher tries to
> > > > unregister and re-register the cpufreq driver during the boot.
> > > >
> > > > I need to spend more time on this as reverting the initial PM QoS patch
> > > > to cpufreq.c makes the issue disappear.
>
> I guess you mean commit 67d874c3b2c6 ("cpufreq: Register notifiers
> with the PM QoS framework")?
>

Correct.

> That would make sense, because it added the cpufreq_notifier_min() and
> cpufreq_notifier_max() that trigger handle_update() via
> schedule_work().
>

Yes, it was not clear as I didn't trace to handle_update earlier. After
looking at depth today afternoon, I arrived at the same conclusion.

> [BTW, Viresh, it looks like cpufreq_set_policy() should still ensure
> that the new min is less than the new max, because the QoS doesn't do
> that.]
>
> > > Is this easily reproducible ? cpufreq_driver == NULL shouldn't be the case, it
> > > get updated only once while registering/unregistering cpufreq drivers. That is
> > > the last thing which can go wrong from my point of view :)
> > >
> >
> > Yes, if I boot my TC2 with bL switcher enabled, it always crashes on boot.
>
> It does look like handle_update() races with
> cpufreq_unregister_driver() and cpufreq_remove_dev (called from there
> indirectly) does look racy.

Indeed, we just crossed the mails. I just found that and posted a patch.

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-17  9:59     ` Sudeep Holla
@ 2019-10-17 16:34       ` Rafael J. Wysocki
  2019-10-17 16:42         ` Sudeep Holla
  2019-10-18  5:44         ` Viresh Kumar
  0 siblings, 2 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2019-10-17 16:34 UTC (permalink / raw)
  To: Sudeep Holla, Viresh Kumar
  Cc: Rafael J. Wysocki, Linux PM, Linux ACPI, LKML, Dmitry Osipenko

On Thu, Oct 17, 2019 at 12:00 PM Sudeep Holla <sudeep.holla@arm.com> wrote:
>
> On Thu, Oct 17, 2019 at 03:27:25PM +0530, Viresh Kumar wrote:
> > On 16-10-19, 15:23, Sudeep Holla wrote:
> > > Thanks for the spinning these patches so quickly.
> > >
> > > I did give it a spin, but unfortunately it doesn't fix the bug I reported.
> > > So I looked at my bug report in detail and looks like the cpufreq_driver
> > > variable is set to NULL at that point and it fails to dereference it
> > > while trying to execute:
> > >     ret = cpufreq_driver->verify(new_policy);
> > > (Hint verify is at offset 0x1c/28)
> > >
> > > So I suspect some race as this platform with bL switcher tries to
> > > unregister and re-register the cpufreq driver during the boot.
> > >
> > > I need to spend more time on this as reverting the initial PM QoS patch
> > > to cpufreq.c makes the issue disappear.

I guess you mean commit 67d874c3b2c6 ("cpufreq: Register notifiers
with the PM QoS framework")?

That would make sense, because it added the cpufreq_notifier_min() and
cpufreq_notifier_max() that trigger handle_update() via
schedule_work().

[BTW, Viresh, it looks like cpufreq_set_policy() should still ensure
that the new min is less than the new max, because the QoS doesn't do
that.]

> > Is this easily reproducible ? cpufreq_driver == NULL shouldn't be the case, it
> > get updated only once while registering/unregistering cpufreq drivers. That is
> > the last thing which can go wrong from my point of view :)
> >
>
> Yes, if I boot my TC2 with bL switcher enabled, it always crashes on boot.

It does look like handle_update() races with
cpufreq_unregister_driver() and cpufreq_remove_dev (called from there
indirectly) does look racy.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-17  9:57   ` Viresh Kumar
@ 2019-10-17  9:59     ` Sudeep Holla
  2019-10-17 16:34       ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Sudeep Holla @ 2019-10-17  9:59 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Linux PM, Linux ACPI, LKML, Dmitry Osipenko

On Thu, Oct 17, 2019 at 03:27:25PM +0530, Viresh Kumar wrote:
> On 16-10-19, 15:23, Sudeep Holla wrote:
> > Thanks for the spinning these patches so quickly.
> >
> > I did give it a spin, but unfortunately it doesn't fix the bug I reported.
> > So I looked at my bug report in detail and looks like the cpufreq_driver
> > variable is set to NULL at that point and it fails to dereference it
> > while trying to execute:
> > 	ret = cpufreq_driver->verify(new_policy);
> > (Hint verify is at offset 0x1c/28)
> >
> > So I suspect some race as this platform with bL switcher tries to
> > unregister and re-register the cpufreq driver during the boot.
> >
> > I need to spend more time on this as reverting the initial PM QoS patch
> > to cpufreq.c makes the issue disappear.
>
> Is this easily reproducible ? cpufreq_driver == NULL shouldn't be the case, it
> get updated only once while registering/unregistering cpufreq drivers. That is
> the last thing which can go wrong from my point of view :)
>

Yes, if I boot my TC2 with bL switcher enabled, it always crashes on boot.

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-16 14:23 ` Sudeep Holla
@ 2019-10-17  9:57   ` Viresh Kumar
  2019-10-17  9:59     ` Sudeep Holla
  2019-10-17 17:14   ` Sudeep Holla
  1 sibling, 1 reply; 25+ messages in thread
From: Viresh Kumar @ 2019-10-17  9:57 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Rafael J. Wysocki, Linux PM, Linux ACPI, LKML, Dmitry Osipenko

On 16-10-19, 15:23, Sudeep Holla wrote:
> Thanks for the spinning these patches so quickly.
> 
> I did give it a spin, but unfortunately it doesn't fix the bug I reported.
> So I looked at my bug report in detail and looks like the cpufreq_driver
> variable is set to NULL at that point and it fails to dereference it
> while trying to execute:
> 	ret = cpufreq_driver->verify(new_policy);
> (Hint verify is at offset 0x1c/28)
> 
> So I suspect some race as this platform with bL switcher tries to
> unregister and re-register the cpufreq driver during the boot.
> 
> I need to spend more time on this as reverting the initial PM QoS patch
> to cpufreq.c makes the issue disappear.

Is this easily reproducible ? cpufreq_driver == NULL shouldn't be the case, it
get updated only once while registering/unregistering cpufreq drivers. That is
the last thing which can go wrong from my point of view :)

-- 
viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-16 10:37 Rafael J. Wysocki
  2019-10-16 14:23 ` Sudeep Holla
@ 2019-10-17  9:46 ` Viresh Kumar
  1 sibling, 0 replies; 25+ messages in thread
From: Viresh Kumar @ 2019-10-17  9:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Linux ACPI, LKML, Sudeep Holla, Dmitry Osipenko

On 16-10-19, 12:37, Rafael J. Wysocki wrote:
> Hi All,
> 
> The motivation for this series is to address the problem discussed here:
> 
> https://lore.kernel.org/linux-pm/5ad2624194baa2f53acc1f1e627eb7684c577a19.1562210705.git.viresh.kumar@linaro.org/T/#md2d89e95906b8c91c15f582146173dce2e86e99f
> 
> and also reported here:
> 
> https://lore.kernel.org/linux-pm/20191015155735.GA29105@bogus/
> 
> Plus, generally speaking, using the policy CPU as a proxy for the policy
> with respect to PM QoS does not feel particularly straightforward to me
> and adds extra complexity.
> 
> Anyway, the first patch adds frequency QoS that is based on "raw" PM QoS (kind
> of in analogy with device PM QoS) and is just about min and max frequency
> requests (no direct relationship to devices).
> 
> The second patch switches over cpufreq and its users to the new frequency QoS.
> [The Fixes: tag has been tentatively added to it.]
> 
> The third one removes frequency request types from device PM QoS.
> 
> Unfortunately, the patches are rather big, but also they are quite
> straightforward.
> 
> I didn't have the time to test this series, so giving it a go would be much
> appreciated.

Apart from the minor comment on one of the patches, these look okay to me.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
  2019-10-16 10:37 Rafael J. Wysocki
@ 2019-10-16 14:23 ` Sudeep Holla
  2019-10-17  9:57   ` Viresh Kumar
  2019-10-17 17:14   ` Sudeep Holla
  2019-10-17  9:46 ` Viresh Kumar
  1 sibling, 2 replies; 25+ messages in thread
From: Sudeep Holla @ 2019-10-16 14:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Linux ACPI, LKML, Viresh Kumar, Sudeep Holla, Dmitry Osipenko

On Wed, Oct 16, 2019 at 12:37:58PM +0200, Rafael J. Wysocki wrote:
> Hi All,
>
> The motivation for this series is to address the problem discussed here:
>
> https://lore.kernel.org/linux-pm/5ad2624194baa2f53acc1f1e627eb7684c577a19.1562210705.git.viresh.kumar@linaro.org/T/#md2d89e95906b8c91c15f582146173dce2e86e99f
>
> and also reported here:
>
> https://lore.kernel.org/linux-pm/20191015155735.GA29105@bogus/
>
> Plus, generally speaking, using the policy CPU as a proxy for the policy
> with respect to PM QoS does not feel particularly straightforward to me
> and adds extra complexity.
>
> Anyway, the first patch adds frequency QoS that is based on "raw" PM QoS (kind
> of in analogy with device PM QoS) and is just about min and max frequency
> requests (no direct relationship to devices).
>
> The second patch switches over cpufreq and its users to the new frequency QoS.
> [The Fixes: tag has been tentatively added to it.]
>
> The third one removes frequency request types from device PM QoS.
>
> Unfortunately, the patches are rather big, but also they are quite
> straightforward.
>
> I didn't have the time to test this series, so giving it a go would be much
> appreciated.

Thanks for the spinning these patches so quickly.

I did give it a spin, but unfortunately it doesn't fix the bug I reported.
So I looked at my bug report in detail and looks like the cpufreq_driver
variable is set to NULL at that point and it fails to dereference it
while trying to execute:
	ret = cpufreq_driver->verify(new_policy);
(Hint verify is at offset 0x1c/28)

So I suspect some race as this platform with bL switcher tries to
unregister and re-register the cpufreq driver during the boot.

I need to spend more time on this as reverting the initial PM QoS patch
to cpufreq.c makes the issue disappear.

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq
@ 2019-10-16 10:37 Rafael J. Wysocki
  2019-10-16 14:23 ` Sudeep Holla
  2019-10-17  9:46 ` Viresh Kumar
  0 siblings, 2 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2019-10-16 10:37 UTC (permalink / raw)
  To: Linux PM; +Cc: Linux ACPI, LKML, Viresh Kumar, Sudeep Holla, Dmitry Osipenko

Hi All,

The motivation for this series is to address the problem discussed here:

https://lore.kernel.org/linux-pm/5ad2624194baa2f53acc1f1e627eb7684c577a19.1562210705.git.viresh.kumar@linaro.org/T/#md2d89e95906b8c91c15f582146173dce2e86e99f

and also reported here:

https://lore.kernel.org/linux-pm/20191015155735.GA29105@bogus/

Plus, generally speaking, using the policy CPU as a proxy for the policy
with respect to PM QoS does not feel particularly straightforward to me
and adds extra complexity.

Anyway, the first patch adds frequency QoS that is based on "raw" PM QoS (kind
of in analogy with device PM QoS) and is just about min and max frequency
requests (no direct relationship to devices).

The second patch switches over cpufreq and its users to the new frequency QoS.
[The Fixes: tag has been tentatively added to it.]

The third one removes frequency request types from device PM QoS.

Unfortunately, the patches are rather big, but also they are quite
straightforward.

I didn't have the time to test this series, so giving it a go would be much
appreciated.

Thanks,
Rafael




^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, back to index

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-22 22:06 [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq Leonard Crestez
2019-10-22 22:47 ` Rafael J. Wysocki
2019-10-23  2:20   ` Leonard Crestez
2019-10-23  8:54     ` Rafael J. Wysocki
2019-10-23  8:57       ` Rafael J. Wysocki
2019-10-23 13:33       ` Leonard Crestez
2019-10-24 13:42         ` Rafael J. Wysocki
2019-10-24 17:47           ` Leonard Crestez
2019-10-24 21:10             ` Rafael J. Wysocki
2019-10-25 18:04               ` Leonard Crestez
  -- strict thread matches above, loose matches on Subject: below --
2019-10-16 10:37 Rafael J. Wysocki
2019-10-16 14:23 ` Sudeep Holla
2019-10-17  9:57   ` Viresh Kumar
2019-10-17  9:59     ` Sudeep Holla
2019-10-17 16:34       ` Rafael J. Wysocki
2019-10-17 16:42         ` Sudeep Holla
2019-10-18  5:44         ` Viresh Kumar
2019-10-18  8:24           ` Rafael J. Wysocki
2019-10-18  8:27             ` Viresh Kumar
2019-10-18  8:30               ` Rafael J. Wysocki
2019-10-18  9:24                 ` Viresh Kumar
2019-10-18  9:26                   ` Rafael J. Wysocki
2019-10-18  9:28                     ` Viresh Kumar
2019-10-17 17:14   ` Sudeep Holla
2019-10-17  9:46 ` Viresh Kumar

Linux-PM Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-pm/0 linux-pm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-pm linux-pm/ https://lore.kernel.org/linux-pm \
		linux-pm@vger.kernel.org
	public-inbox-index linux-pm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-pm


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git