Linux-PM Archive on lore.kernel.org
 help / color / Atom feed
* Re: [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support
       [not found] ` <20200520034307.20435-1-andrew-sh.cheng@mediatek.com>
@ 2020-05-20  4:10   ` Chanwoo Choi
       [not found]     ` <1589953015.8243.2.camel@mtksdaap41>
       [not found]   ` <20200520034307.20435-10-andrew-sh.cheng@mediatek.com>
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Chanwoo Choi @ 2020-05-20  4:10 UTC (permalink / raw)
  To: Andrew-sh.Cheng, MyungJoo Ham, Kyungmin Park, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream

Hi Andrew,

Could you explain the base commit of these patches?
When I tried to apply them to v5.7-rc1 for testing,
the merge conflict occurs.

Thanks,
Chanwoo Choi

On 5/20/20 12:42 PM, Andrew-sh.Cheng wrote:
> MT8183 supports CPU DVFS and CCI DVFS, and LITTLE cpus and CCI are in the same voltage domain.
> So, this series is to add drivers to handle the voltage coupling between CPU and CCI DVFS.
> 
> For SVS support, need OPP_EVENT_ADJUST_VOLTAGE and corresponding reaction.
> 
> Change since v5:
> 	- Changing dt-binding format to yaml.
> 	- Extending current devfreq passive_governor instead of create a new one.
> 	- Resend depending patches of Sravana Kannan base on kernel-5.7
> 
> 
> Andrew-sh.Cheng (6):
>   cpufreq: mediatek: add clock and regulator enable for intermediate
>     clock
>   dt-bindings: devfreq: add compatible for mt8183 cci devfreq
>   devfreq: add mediatek cci devfreq
>   opp: Modify opp API, dev_pm_opp_get_freq(), find freq in opp, even it
>     is disabled
>   cpufreq: mediatek: add opp notification for SVS support
>   devfreq: mediatek: cci devfreq register opp notification for SVS
>     support
> 
> Saravana Kannan (6):
>   OPP: Allow required-opps even if the device doesn't have power-domains
>   OPP: Add function to look up required OPP's for a given OPP
>   OPP: Improve required-opps linking
>   PM / devfreq: Cache OPP table reference in devfreq
>   PM / devfreq: Add required OPPs support to passive governor
>   PM / devfreq: Add cpu based scaling support to passive_governor
> 
>  .../devicetree/bindings/devfreq/mt8183-cci.yaml    |  51 ++++
>  drivers/cpufreq/mediatek-cpufreq.c                 | 122 ++++++++-
>  drivers/devfreq/Kconfig                            |  12 +
>  drivers/devfreq/Makefile                           |   1 +
>  drivers/devfreq/devfreq.c                          |   6 +
>  drivers/devfreq/governor_passive.c                 | 298 +++++++++++++++++++--
>  drivers/devfreq/mt8183-cci-devfreq.c               | 233 ++++++++++++++++
>  drivers/opp/core.c                                 |  85 +++++-
>  drivers/opp/of.c                                   | 108 ++++----
>  drivers/opp/opp.h                                  |   5 +
>  include/linux/devfreq.h                            |  42 ++-
>  include/linux/pm_opp.h                             |  11 +
>  12 files changed, 874 insertions(+), 100 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml
>  create mode 100644 drivers/devfreq/mt8183-cci-devfreq.c
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support
       [not found]     ` <1589953015.8243.2.camel@mtksdaap41>
@ 2020-05-20  6:24       ` Chanwoo Choi
       [not found]         ` <1589958625.23971.2.camel@mtksdaap41>
  0 siblings, 1 reply; 14+ messages in thread
From: Chanwoo Choi @ 2020-05-20  6:24 UTC (permalink / raw)
  To: andrew-sh.cheng
  Cc: MyungJoo Ham, Kyungmin Park, Rob Herring, Mark Rutland,
	Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown,
	linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream

Hi,

On 5/20/20 2:36 PM, andrew-sh.cheng wrote:
> On Wed, 2020-05-20 at 13:10 +0900, Chanwoo Choi wrote:
>> Hi Andrew,
>>
>> Could you explain the base commit of these patches?
>> When I tried to apply them to v5.7-rc1 for testing,
>> the merge conflict occurs.
>>
>> Thanks,

>> Chanwoo Choi
> 
> Hi Chanwoo Choi,
> 
> My base commit is
> commit 8f3d9f354286745c751374f5f1fcafee6b3f3136
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date:   Sun Apr 12 12:35:55 2020 -0700
> 
>     Linux 5.7-rc1
> 
> Could you show me the conflict error?


When I tried to apply first patch with 'git am',
the merge conflict occurred.

git am \[PATCH\ 01_12\]\ OPP\:\ Allow\ required-opps\ even\ if\ the\ device\ doesn\'t\ have\ power-domains.eml
Applying: OPP: Allow required-opps even if the device doesn't have power-domains
error: patch failed: drivers/opp/core.c:755
error: drivers/opp/core.c: patch does not apply
error: patch failed: drivers/opp/of.c:195																																																																												
error: drivers/opp/of.c: patch does not apply
Patch failed at 0001 OPP: Allow required-opps even if the device doesn't have power-domains
Use 'git am --show-current-patch' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

Regards,
Chanwoo Choi

> 
> BR,
> Andrew-sh.Cheng
>>
>> On 5/20/20 12:42 PM, Andrew-sh.Cheng wrote:
>>> MT8183 supports CPU DVFS and CCI DVFS, and LITTLE cpus and CCI are in the same voltage domain.
>>> So, this series is to add drivers to handle the voltage coupling between CPU and CCI DVFS.
>>>
>>> For SVS support, need OPP_EVENT_ADJUST_VOLTAGE and corresponding reaction.
>>>
>>> Change since v5:
>>> 	- Changing dt-binding format to yaml.
>>> 	- Extending current devfreq passive_governor instead of create a new one.
>>> 	- Resend depending patches of Sravana Kannan base on kernel-5.7
>>>
>>>
>>> Andrew-sh.Cheng (6):
>>>   cpufreq: mediatek: add clock and regulator enable for intermediate
>>>     clock
>>>   dt-bindings: devfreq: add compatible for mt8183 cci devfreq
>>>   devfreq: add mediatek cci devfreq
>>>   opp: Modify opp API, dev_pm_opp_get_freq(), find freq in opp, even it
>>>     is disabled
>>>   cpufreq: mediatek: add opp notification for SVS support
>>>   devfreq: mediatek: cci devfreq register opp notification for SVS
>>>     support
>>>
>>> Saravana Kannan (6):
>>>   OPP: Allow required-opps even if the device doesn't have power-domains
>>>   OPP: Add function to look up required OPP's for a given OPP
>>>   OPP: Improve required-opps linking
>>>   PM / devfreq: Cache OPP table reference in devfreq
>>>   PM / devfreq: Add required OPPs support to passive governor
>>>   PM / devfreq: Add cpu based scaling support to passive_governor
>>>
>>>  .../devicetree/bindings/devfreq/mt8183-cci.yaml    |  51 ++++
>>>  drivers/cpufreq/mediatek-cpufreq.c                 | 122 ++++++++-
>>>  drivers/devfreq/Kconfig                            |  12 +
>>>  drivers/devfreq/Makefile                           |   1 +
>>>  drivers/devfreq/devfreq.c                          |   6 +
>>>  drivers/devfreq/governor_passive.c                 | 298 +++++++++++++++++++--
>>>  drivers/devfreq/mt8183-cci-devfreq.c               | 233 ++++++++++++++++
>>>  drivers/opp/core.c                                 |  85 +++++-
>>>  drivers/opp/of.c                                   | 108 ++++----
>>>  drivers/opp/opp.h                                  |   5 +
>>>  include/linux/devfreq.h                            |  42 ++-
>>>  include/linux/pm_opp.h                             |  11 +
>>>  12 files changed, 874 insertions(+), 100 deletions(-)
>>>  create mode 100644 Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml
>>>  create mode 100644 drivers/devfreq/mt8183-cci-devfreq.c
>>>
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 09/12] devfreq: add mediatek cci devfreq
       [not found]   ` <20200520034307.20435-10-andrew-sh.cheng@mediatek.com>
@ 2020-05-20 12:31     ` Mark Brown
  2020-05-21  8:52       ` andrew-sh.cheng
  2020-05-28  7:35     ` Chanwoo Choi
  1 sibling, 1 reply; 14+ messages in thread
From: Mark Brown @ 2020-05-20 12:31 UTC (permalink / raw)
  To: Andrew-sh.Cheng
  Cc: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, linux-pm,
	devicetree, linux-arm-kernel, linux-mediatek, linux-kernel,
	srv_heupstream


[-- Attachment #1: Type: text/plain, Size: 502 bytes --]

On Wed, May 20, 2020 at 11:43:04AM +0800, Andrew-sh.Cheng wrote:

> +	cci_df->proc_reg = devm_regulator_get_optional(cci_dev, "proc");
> +	ret = PTR_ERR_OR_ZERO(cci_df->proc_reg);
> +	if (ret) {
> +		if (ret != -EPROBE_DEFER)
> +			dev_err(cci_dev, "failed to get regulator for CCI: %d\n",
> +				ret);
> +		return ret;
> +	}
> +	ret = regulator_enable(cci_df->proc_reg);

The code appears to require a regulator (and I'm guessing the device
needs power) so why is this using regulator_get_optional()?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support
       [not found]         ` <1589958625.23971.2.camel@mtksdaap41>
@ 2020-05-20 14:53           ` Matthias Brugger
  0 siblings, 0 replies; 14+ messages in thread
From: Matthias Brugger @ 2020-05-20 14:53 UTC (permalink / raw)
  To: andrew-sh.cheng, Chanwoo Choi
  Cc: MyungJoo Ham, Kyungmin Park, Rob Herring, Mark Rutland,
	Rafael J . Wysocki, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Liam Girdwood, Mark Brown, linux-pm, devicetree,
	linux-arm-kernel, linux-mediatek, linux-kernel, srv_heupstream



On 20/05/2020 09:10, andrew-sh.cheng wrote:
> On Wed, 2020-05-20 at 15:24 +0900, Chanwoo Choi wrote:
>> Hi,
>>
>> On 5/20/20 2:36 PM, andrew-sh.cheng wrote:
>>> On Wed, 2020-05-20 at 13:10 +0900, Chanwoo Choi wrote:
>>>> Hi Andrew,
>>>>
>>>> Could you explain the base commit of these patches?
>>>> When I tried to apply them to v5.7-rc1 for testing,
>>>> the merge conflict occurs.
>>>>
>>>> Thanks,
>>
>>>> Chanwoo Choi
>>>
>>> Hi Chanwoo Choi,
>>>
>>> My base commit is
>>> commit 8f3d9f354286745c751374f5f1fcafee6b3f3136
>>> Author: Linus Torvalds <torvalds@linux-foundation.org>
>>> Date:   Sun Apr 12 12:35:55 2020 -0700
>>>
>>>     Linux 5.7-rc1
>>>
>>> Could you show me the conflict error?
>>
>>
>> When I tried to apply first patch with 'git am',
>> the merge conflict occurred.
>>
>> git am \[PATCH\ 01_12\]\ OPP\:\ Allow\ required-opps\ even\ if\ the\ device\ doesn\'t\ have\ power-domains.eml
>> Applying: OPP: Allow required-opps even if the device doesn't have power-domains
>> error: patch failed: drivers/opp/core.c:755
>> error: drivers/opp/core.c: patch does not apply
>> error: patch failed: drivers/opp/of.c:195																																																																												
>> error: drivers/opp/of.c: patch does not apply
>> Patch failed at 0001 OPP: Allow required-opps even if the device doesn't have power-domains
>> Use 'git am --show-current-patch' to see the failed patch
>> When you have resolved this problem, run "git am --continue".
>> If you prefer to skip this patch, run "git am --skip" instead.
>> To restore the original branch and stop patching, run "git am --abort".
>>
>> Regards,
>> Chanwoo Choi
> 
> Hi Chanwoo,
> 
> I just make a new folder to get code and check.
> Below is my command list.
> Please help check the different with you.
>   505  repo init -u http://gerrit.mediatek.inc:8080/cros-kernel/manifest
> -b upstream
>   506  repo sync -j8
>   507  repo start kern-dev --all
>   508   git remote add main
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>   509  git remote add main
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>   510  ls
>   511  cd kernel/mediatek/
>   512   git remote add main
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>   513  git fetch main
>   514  git checkout v5.7-rc1
>   515  git am
> Add-cpufreq-and-cci-devfreq-for-mt8183-and-SVS-support.patch
>   516  history
> 

For reference I just tried with b4.sh [1]:
# b4.sh am -l -o /tmp -n patch  1589958625.23971.2.camel@mtksdaap41
# git am -3 -s /tmp/patch.mbx

Applies without conflicts.

Regards,
Matthias

[1] https://git.kernel.org/pub/scm/utils/b4/b4.git

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 01/12] OPP: Allow required-opps even if the device doesn't have power-domains
       [not found]   ` <20200520034307.20435-2-andrew-sh.cheng@mediatek.com>
@ 2020-05-20 14:54     ` Matthias Brugger
  0 siblings, 0 replies; 14+ messages in thread
From: Matthias Brugger @ 2020-05-20 14:54 UTC (permalink / raw)
  To: Andrew-sh.Cheng, MyungJoo Ham, Kyungmin Park, Chanwoo Choi,
	Rob Herring, Mark Rutland, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Saravana Kannan



On 20/05/2020 05:42, Andrew-sh.Cheng wrote:
> From: Saravana Kannan <saravanak@google.com>
> 
> A Device-A can have a (minimum) performance requirement on another
> Device-B to be able to function correctly. This performance requirement
> on Device-B can also change based on the current performance level of
> Device-A.
> 
> The existing required-opps feature fits well to describe this need. So,
> instead of limiting required-opps to point to only PM-domain devices,
> allow it to point to any device.
> 
> Signed-off-by: Saravana Kannan <saravanak@google.com>

Please check all patches, they are missing your
Signed-off-by

Regards,
Matthias

> ---
>  drivers/opp/core.c |  2 +-
>  drivers/opp/of.c   | 11 -----------
>  2 files changed, 1 insertion(+), 12 deletions(-)
> 
> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> index ba43e6a3dc0a..51403c1f2481 100644
> --- a/drivers/opp/core.c
> +++ b/drivers/opp/core.c
> @@ -755,7 +755,7 @@ static int _set_required_opps(struct device *dev,
>  		return 0;
>  
>  	/* Single genpd case */
> -	if (!genpd_virt_devs) {
> +	if (!genpd_virt_devs && required_opp_tables[0]->is_genpd) {
>  		pstate = likely(opp) ? opp->required_opps[0]->pstate : 0;
>  		ret = dev_pm_genpd_set_performance_state(dev, pstate);
>  		if (ret) {
> diff --git a/drivers/opp/of.c b/drivers/opp/of.c
> index 9cd8f0adacae..6d33de668a7b 100644
> --- a/drivers/opp/of.c
> +++ b/drivers/opp/of.c
> @@ -195,17 +195,6 @@ static void _opp_table_alloc_required_tables(struct opp_table *opp_table,
>  
>  		if (IS_ERR(required_opp_tables[i]))
>  			goto free_required_tables;
> -
> -		/*
> -		 * We only support genpd's OPPs in the "required-opps" for now,
> -		 * as we don't know how much about other cases. Error out if the
> -		 * required OPP doesn't belong to a genpd.
> -		 */
> -		if (!required_opp_tables[i]->is_genpd) {
> -			dev_err(dev, "required-opp doesn't belong to genpd: %pOF\n",
> -				required_np);
> -			goto free_required_tables;
> -		}
>  	}
>  
>  	goto put_np;
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 09/12] devfreq: add mediatek cci devfreq
  2020-05-20 12:31     ` [PATCH 09/12] devfreq: add mediatek cci devfreq Mark Brown
@ 2020-05-21  8:52       ` andrew-sh.cheng
  0 siblings, 0 replies; 14+ messages in thread
From: andrew-sh.cheng @ 2020-05-21  8:52 UTC (permalink / raw)
  Cc: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, linux-pm,
	devicetree, linux-arm-kernel, linux-mediatek, linux-kernel,
	srv_heupstream


On Wed, 2020-05-20 at 13:31 +0100, Mark Brown wrote:
> On Wed, May 20, 2020 at 11:43:04AM +0800, Andrew-sh.Cheng wrote:
> 
> > +	cci_df->proc_reg = devm_regulator_get_optional(cci_dev, "proc");
> > +	ret = PTR_ERR_OR_ZERO(cci_df->proc_reg);
> > +	if (ret) {
> > +		if (ret != -EPROBE_DEFER)
> > +			dev_err(cci_dev, "failed to get regulator for CCI: %d\n",
> > +				ret);
> > +		return ret;
> > +	}
> > +	ret = regulator_enable(cci_df->proc_reg);
> 
> The code appears to require a regulator (and I'm guessing the device
> needs power) so why is this using regulator_get_optional()?

Hi Mark,

Do you mean, why not use regulator_get_exclusive() or regulator_get()?
Because cci and cpu litter core shared buck, it cannot use
regulator_get_exclusive().
Because both cci and cpu want to tune voltage, it cannot use
regulator_get(), otherwise it will get dummy regulator even this buck
doesn't register.as regulator.

BR,
Andrew-sh.Cheng

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 06/12] PM / devfreq: Add cpu based scaling support to passive_governor
       [not found]     ` <20200520034307.20435-7-andrew-sh.cheng@mediatek.com>
@ 2020-05-28  5:03       ` Chanwoo Choi
  2020-05-28  6:14       ` Chanwoo Choi
  1 sibling, 0 replies; 14+ messages in thread
From: Chanwoo Choi @ 2020-05-28  5:03 UTC (permalink / raw)
  To: Andrew-sh.Cheng, MyungJoo Ham, Kyungmin Park, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Saravana Kannan, Sibi Sankar

Hi Andrew-sh.Cheng,

Thanks for your posting. I like this approach absolutely.
I think that it is necessary. When I developed the embedded product,
I needed this feature always. 

I add the comments on below.

On 5/20/20 12:43 PM, Andrew-sh.Cheng wrote:
> From: Saravana Kannan <skannan@codeaurora.org>
> 
> Many CPU architectures have caches that can scale independent of the
> CPUs. Frequency scaling of the caches is necessary to make sure that the
> cache is not a performance bottleneck that leads to poor performance and
> power. The same idea applies for RAM/DDR.
> 
> To achieve this, this patch adds support for cpu based scaling to the
> passive governor. This is accomplished by taking the current frequency
> of each CPU frequency domain and then adjust the frequency of the cache
> (or any devfreq device) based on the frequency of the CPUs. It listens
> to CPU frequency transition notifiers to keep itself up to date on the
> current CPU frequency.
> 
> To decide the frequency of the device, the governor does one of the
> following:
> * Derives the optimal devfreq device opp from required-opps property of
>   the parent cpu opp_table.
> 
> * Scales the device frequency in proportion to the CPU frequency. So, if
>   the CPUs are running at their max frequency, the device runs at its
>   max frequency. If the CPUs are running at their min frequency, the
>   device runs at its min frequency. It is interpolated for frequencies
>   in between.
> 
> Andrew-sh.Cheng change
> dev_pm_opp_xlate_opp to dev_pm_opp_xlate_required_opp devfreq->max_freq
> to devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value
> for kernel-5.7
> 
> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
> [Sibi: Integrated cpu-freqmap governor into passive_governor]
> Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> ---
>  drivers/devfreq/Kconfig            |   2 +
>  drivers/devfreq/governor_passive.c | 278 ++++++++++++++++++++++++++++++++++---
>  include/linux/devfreq.h            |  40 +++++-
>  3 files changed, 299 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
> index 0b1df12e0f21..d9067950af6a 100644
> --- a/drivers/devfreq/Kconfig
> +++ b/drivers/devfreq/Kconfig
> @@ -73,6 +73,8 @@ config DEVFREQ_GOV_PASSIVE
>  	  device. This governor does not change the frequency by itself
>  	  through sysfs entries. The passive governor recommends that
>  	  devfreq device uses the OPP table to get the frequency/voltage.
> +	  Alternatively the governor can also be chosen to scale based on
> +	  the online CPUs current frequency.
>  
>  comment "DEVFREQ Drivers"
>  
> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
> index 2d67d6c12dce..7dcda02a5bb7 100644
> --- a/drivers/devfreq/governor_passive.c
> +++ b/drivers/devfreq/governor_passive.c
> @@ -8,11 +8,89 @@
>   */
>  
>  #include <linux/module.h>
> +#include <linux/cpu.h>
> +#include <linux/cpufreq.h>
> +#include <linux/cpumask.h>
>  #include <linux/device.h>
>  #include <linux/devfreq.h>
> +#include <linux/slab.h>
>  #include "governor.h"
>  
> -static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> +static unsigned int xlate_cpufreq_to_devfreq(struct devfreq_passive_data *data,

Need to change 'unsigned int' to 'unsigned long'.

> +					     unsigned int cpu)
> +{
> +	unsigned int cpu_min, cpu_max, dev_min, dev_max, cpu_percent, max_state;

Better to define them separately as following and then need to rename
the variable. Usually, use the 'min_freq' and 'max_freq' word for
the minimum/maximum frequency.

	unsigned int cpu_min_freq, cpu_max_freq, cpu_curr_freq, cpu_percent;
	unsigned long dev_min_freq, dev_max_freq, dev_max_state,

The devfreq used 'unsigned long'. The cpufreq used 'unsigned long'
and 'unsigned int'. You need to handle them properly.


> +	struct devfreq_cpu_state *cpu_state = data->cpu_state[cpu];
> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> +	unsigned long *freq_table = devfreq->profile->freq_table;

In this function, use 'cpu' work for cpufreq and use 'dev' for devfreq.
So, I think 'dev_freq_table' is proper name instead of 'freq_table'
for the readability.

	freq_table -> dev_freq_table

> +	struct dev_pm_opp *opp = NULL, *cpu_opp = NULL;

In the get_target_freq_with_devfreq(), use 'p_opp' indicating
the OPP of parent device. For the consistency, I think that
use 'p_opp' instead of 'cpu_opp'. 

> +	unsigned long cpu_freq, freq;

Define the 'cpu_freq' on above with cpu_min_freq/cpu_max_freq definition.
	cpu_freq -> cpu_curr_freq.

> +
> +	if (!cpu_state || cpu_state->first_cpu != cpu ||
> +	    !cpu_state->opp_table || !devfreq->opp_table)
> +		return 0;
> +
> +	cpu_freq = cpu_state->freq * 1000;
> +	cpu_opp = devfreq_recommended_opp(cpu_state->dev, &cpu_freq, 0);
> +	if (IS_ERR(cpu_opp))
> +		return 0;
> +
> +	opp = dev_pm_opp_xlate_required_opp(cpu_state->opp_table,
> +					    devfreq->opp_table, cpu_opp);
> +	dev_pm_opp_put(cpu_opp);
> +
> +	if (!IS_ERR(opp)) {
> +		freq = dev_pm_opp_get_freq(opp);
> +		dev_pm_opp_put(opp);

Better to add the 'out' goto statement.
If you use 'goto out', you can reduce the one indentation
without 'else' statement.
	

> +	} else {

As I commented, when dev_pm_opp_xlate_required_opp() return successfully
, use 'goto out'. We can remove 'else' and then reduce the unneeded indentation.


> +		/* Use Interpolation if required opps is not available */
> +		cpu_min = cpu_state->min_freq;
> +		cpu_max = cpu_state->max_freq;
> +		cpu_freq = cpu_state->freq;
> +
> +		if (freq_table) {
> +			/* Get minimum frequency according to sorting order */
> +			max_state = freq_table[devfreq->profile->max_state - 1];
> +			if (freq_table[0] < max_state) {
> +				dev_min = freq_table[0];
> +				dev_max = max_state;
> +			} else {
> +				dev_min = max_state;
> +				dev_max = freq_table[0];
> +			}
> +		} else {
> +			if (devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value
> +			    <= devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value)
> +				return 0;
> +			dev_min =
> +			devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value;
> +			dev_max =
> +			devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value;

I think it is not proper to access the variable of pm_qos structure directly.
Instead of direct access, you have to use the exported PM QoS function such as
- pm_qos_read_value(devfreq->dev.parent, DEV_PM_QOS_MIN_FREQUENCY);
- pm_qos_read_value(devfreq->dev.parent, DEV_PM_QOS_MAX_FREQUENCY);

> +		}
> +		cpu_percent = ((cpu_freq - cpu_min) * 100) / cpu_max - cpu_min;
> +		freq = dev_min + mult_frac(dev_max - dev_min, cpu_percent, 100);
> +	}


I think that you better to add 'out' jump label as following:

out:

> +
> +	return freq;
> +}
> +
> +static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
> +					unsigned long *freq)
> +{
> +	struct devfreq_passive_data *p_data =
> +				(struct devfreq_passive_data *)devfreq->data;
> +	unsigned int cpu, target_freq = 0;

Need to define 'target_freq' with 'unsigned long' type.

> +
> +	for_each_online_cpu(cpu)
> +		target_freq = max(target_freq,
> +				  xlate_cpufreq_to_devfreq(p_data, cpu));
> +
> +	*freq = target_freq;
> +
> +	return 0;
> +}
> +
> +static int get_target_freq_with_devfreq(struct devfreq *devfreq,
>  					unsigned long *freq)
>  {
>  	struct devfreq_passive_data *p_data
> @@ -23,16 +101,6 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>  	int i, count, ret = 0;
>  
>  	/*
> -	 * If the devfreq device with passive governor has the specific method
> -	 * to determine the next frequency, should use the get_target_freq()
> -	 * of struct devfreq_passive_data.
> -	 */
> -	if (p_data->get_target_freq) {
> -		ret = p_data->get_target_freq(devfreq, freq);
> -		goto out;
> -	}
> -
> -	/*
>  	 * If the parent and passive devfreq device uses the OPP table,
>  	 * get the next frequency by using the OPP table.
>  	 */
> @@ -102,6 +170,37 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>  	return ret;
>  }
>  
> +static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> +					   unsigned long *freq)
> +{
> +	struct devfreq_passive_data *p_data =
> +				(struct devfreq_passive_data *)devfreq->data;
> +	int ret;
> +
> +	/*
> +	 * If the devfreq device with passive governor has the specific method
> +	 * to determine the next frequency, should use the get_target_freq()
> +	 * of struct devfreq_passive_data.
> +	 */
> +	if (p_data->get_target_freq)
> +		return p_data->get_target_freq(devfreq, freq);
> +
> +	switch (p_data->parent_type) {
> +	case DEVFREQ_PARENT_DEV:
> +		ret = get_target_freq_with_devfreq(devfreq, freq);
> +		break;
> +	case CPUFREQ_PARENT_DEV:
> +		ret = get_target_freq_with_cpufreq(devfreq, freq);
> +		break;
> +	default:
> +		ret = -EINVAL;
> +		dev_err(&devfreq->dev, "Invalid parent type\n");
> +		break;
> +	}
> +
> +	return ret;
> +}
> +
>  static int update_devfreq_passive(struct devfreq *devfreq, unsigned long freq)
>  {
>  	int ret;
> @@ -156,6 +255,140 @@ static int devfreq_passive_notifier_call(struct notifier_block *nb,
>  	return NOTIFY_DONE;
>  }
>  
> +static int cpufreq_passive_notifier_call(struct notifier_block *nb,
> +					 unsigned long event, void *ptr)
> +{
> +	struct devfreq_passive_data *data =
> +			container_of(nb, struct devfreq_passive_data, nb);
> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> +	struct devfreq_cpu_state *cpu_state;
> +	struct cpufreq_freqs *freq = ptr;

How about changing 'freq' to 'cpu_freqs'?

In the drivers/cpufreq/cpufreq.c, use 'freqs' name indicating
the instance of 'struct cpufreq_freqs'. And in order to
identfy, how about adding 'cpu_' prefix for variable name?

> +	unsigned int current_freq;

Need to define curr_freq with 'unsigned long' type
and better to use 'curr_freq' variable name.

> +	int ret;
> +
> +	if (event != CPUFREQ_POSTCHANGE || !freq ||
> +	    !data->cpu_state[freq->policy->cpu])
> +		return 0;
> +
> +	cpu_state = data->cpu_state[freq->policy->cpu];
> +	if (cpu_state->freq == freq->new)
> +		return 0;
> +
> +	/* Backup current freq and pre-update cpu state freq*/
> +	current_freq = cpu_state->freq;
> +	cpu_state->freq = freq->new;
> +
> +	mutex_lock(&devfreq->lock);
> +	ret = update_devfreq(devfreq);
> +	mutex_unlock(&devfreq->lock);
> +	if (ret) {
> +		cpu_state->freq = current_freq;
> +		dev_err(&devfreq->dev, "Couldn't update the frequency.\n");
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int cpufreq_passive_register(struct devfreq_passive_data **p_data)
> +{
> +	struct devfreq_passive_data *data = *p_data;
> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> +	struct device *dev = devfreq->dev.parent;
> +	struct opp_table *opp_table = NULL;
> +	struct devfreq_cpu_state *state;

For the readability, I thinkt 'cpu_state' is proper instead of 'state'.

> +	struct cpufreq_policy *policy;
> +	struct device *cpu_dev;
> +	unsigned int cpu;
> +	int ret;
> +
> +	get_online_cpus();

Add blank line.

> +	data->nb.notifier_call = cpufreq_passive_notifier_call;
> +	ret = cpufreq_register_notifier(&data->nb,
> +					CPUFREQ_TRANSITION_NOTIFIER);
> +	if (ret) {
> +		dev_err(dev, "Couldn't register cpufreq notifier.\n");
> +		data->nb.notifier_call = NULL;
> +		goto out;
> +	}
> +
> +	/* Populate devfreq_cpu_state */
> +	for_each_online_cpu(cpu) {
> +		if (data->cpu_state[cpu])
> +			continue;
> +
> +		policy = cpufreq_cpu_get(cpu);

cpufreq_cpu_get() might return 'NULL'. I think you need to handle
return value as following:

		if (!policy) {
			ret = -EINVAL;
			goto out;
		} else if (PTR_ERR(policy) == -EPROBE_DEFER) {
			goto out;
		} else if (IS_ERR(policy) {
			ret = PTR_ERR(policy);
			dev_err(dev, "Couldn't get the cpufreq_poliy.\n");
			goto out;
		}

If cpufreq_cpu_get() return successfully, to do next.
It reduces the one indentaion.



> +		if (policy) {
> +			state = kzalloc(sizeof(*state), GFP_KERNEL);
> +			if (!state) {
> +				ret = -ENOMEM;
> +				goto out;
> +			}
> +
> +			cpu_dev = get_cpu_device(cpu);
> +			if (!cpu_dev) {
> +				dev_err(dev, "Couldn't get cpu device.\n");
> +				ret = -ENODEV;
> +				goto out;
> +			}
> +
> +			opp_table = dev_pm_opp_get_opp_table(cpu_dev);
> +			if (IS_ERR(devfreq->opp_table)) {
> +				ret = PTR_ERR(opp_table);
> +				goto out;
> +			}
> +
> +			state->dev = cpu_dev;
> +			state->opp_table = opp_table;
> +			state->first_cpu = cpumask_first(policy->related_cpus);
> +			state->freq = policy->cur;
> +			state->min_freq = policy->cpuinfo.min_freq;
> +			state->max_freq = policy->cpuinfo.max_freq;
> +			data->cpu_state[cpu] = state;

Add blank line.

> +			cpufreq_cpu_put(policy);
> +		} else {
> +			ret = -EPROBE_DEFER;
> +			goto out;
> +		}
> +	}

Add blank line.

> +out:
> +	put_online_cpus();
> +	if (ret)
> +		return ret;
> +
> +	/* Update devfreq */
> +	mutex_lock(&devfreq->lock);
> +	ret = update_devfreq(devfreq);
> +	mutex_unlock(&devfreq->lock);
> +	if (ret)
> +		dev_err(dev, "Couldn't update the frequency.\n");
> +
> +	return ret;
> +}
> +
> +static int cpufreq_passive_unregister(struct devfreq_passive_data **p_data)
> +{
> +	struct devfreq_passive_data *data = *p_data;
> +	struct devfreq_cpu_state *cpu_state;
> +	int cpu;
> +
> +	if (data->nb.notifier_call)
> +		cpufreq_unregister_notifier(&data->nb,
> +					    CPUFREQ_TRANSITION_NOTIFIER);
> +
> +	for_each_possible_cpu(cpu) {
> +		cpu_state = data->cpu_state[cpu];
> +		if (cpu_state) {
> +			if (cpu_state->opp_table)
> +				dev_pm_opp_put_opp_table(cpu_state->opp_table);
> +			kfree(cpu_state);
> +			cpu_state = NULL;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
>  static int devfreq_passive_event_handler(struct devfreq *devfreq,
>  				unsigned int event, void *data)
>  {
> @@ -165,7 +398,7 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
>  	struct notifier_block *nb = &p_data->nb;
>  	int ret = 0;
>  
> -	if (!parent)
> +	if (p_data->parent_type == DEVFREQ_PARENT_DEV && !parent)
>  		return -EPROBE_DEFER;

If you modify the devfreq_passive_event_handler() as following,
you can move this condition for DEVFREQ_PARENT_DEV into 
(register|unregister)_parent_dev_notifier.

	switch (event) {                                                                                  
	case DEVFREQ_GOV_START:                                               
		ret = register_parent_dev_notifier(p_data);
		break;
	case DEVFREQ_GOV_STOP:                                             
		ret = unregister_parent_dev_notifier(p_data);
		break;
	default: 
		ret = -EINVAL;
		break;
	}
                                                                                              
	return ret;

>  
>  	switch (event) {
> @@ -173,13 +406,24 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
>  		if (!p_data->this)
>  			p_data->this = devfreq;
>  
> -		nb->notifier_call = devfreq_passive_notifier_call;
> -		ret = devfreq_register_notifier(parent, nb,
> -					DEVFREQ_TRANSITION_NOTIFIER);
> +		if (p_data->parent_type == DEVFREQ_PARENT_DEV) {
> +			nb->notifier_call = devfreq_passive_notifier_call;
> +			ret = devfreq_register_notifier(parent, nb,
> +						DEVFREQ_TRANSITION_NOTIFIER);
> +		} else if (p_data->parent_type == CPUFREQ_PARENT_DEV) {
> +			ret = cpufreq_passive_register(&p_data);

I think that we better to collect the code related to notifier registration
into one function like devfreq_pass_register_notifier() instead of
cpufreq_passive_register() as following: I think it is more simple and readable.

If you have more proper function name of register_parent_dev_notifier,
please give your opinion.


	int register_parent_dev_notifier(struct devfreq_passive_data **p_data)
		switch (p_data->parent_type) {
		case DEVFREQ_PARENT_DEV:
			nb->notifier_call = devfreq_passive_notifier_call;
			ret = devfreq_register_notifier(parent, nb,
			break;
		case CPUFREQ_PARENT_DEV:
			cpufreq_register_notifier(...)
			...
			break;
		}
		

> +		} else {
> +			ret = -EINVAL;
> +		}
>  		break;
>  	case DEVFREQ_GOV_STOP:
> -		WARN_ON(devfreq_unregister_notifier(parent, nb,
> -					DEVFREQ_TRANSITION_NOTIFIER));
> +		if (p_data->parent_type == DEVFREQ_PARENT_DEV)
> +			WARN_ON(devfreq_unregister_notifier(parent, nb,
> +						DEVFREQ_TRANSITION_NOTIFIER));
> +		else if (p_data->parent_type == CPUFREQ_PARENT_DEV)
> +			cpufreq_passive_unregister(&p_data);
> +		else
> +			ret = -EINVAL;

ditto. unregister_parent_dev_notifier(struct devfreq_passive_data **p_data)

>  		break;
>  	default:
>  		break;
> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
> index a4b19d593151..04ce576fd6f1 100644
> --- a/include/linux/devfreq.h
> +++ b/include/linux/devfreq.h
> @@ -278,6 +278,32 @@ struct devfreq_simple_ondemand_data {
>  
>  #if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
>  /**
> + * struct devfreq_cpu_state - holds the per-cpu state
> + * @freq:	the current frequency of the cpu.
> + * @min_freq:	the min frequency of the cpu.
> + * @max_freq:	the max frequency of the cpu.
> + * @first_cpu:	the cpumask of the first cpu of a policy.
> + * @dev:	reference to cpu device.
> + * @opp_table:	reference to cpu opp table.
> + *
> + * This structure stores the required cpu_state of a cpu.
> + * This is auto-populated by the governor.
> + */
> +struct devfreq_cpu_state {> +	unsigned int freq;

It is better to change from 'freq' to 'curr_freq'
for more correct expression.

> +	unsigned int min_freq;
> +	unsigned int max_freq;
> +	unsigned int first_cpu;
> +	struct device *dev;

How about changing the name 'dev' to 'cpu_dev'?


> +	struct opp_table *opp_table;
> +};

devfreq_cpu_state is only handled by within driver/devfreq/governor_passive.c.

So, you can move it into drivers/devfreq/governor_passive.c
and just add the definition into include/linux/devfreq.h as following:
It is able to prevent the access of variable of 'struct devfreq_cpu_state'
outside.

	struct devfreq_cpu_state;

> +
> +enum devfreq_parent_dev_type {
> +	DEVFREQ_PARENT_DEV,
> +	CPUFREQ_PARENT_DEV,
> +};
> +
> +/**
>   * struct devfreq_passive_data - ``void *data`` fed to struct devfreq
>   *	and devfreq_add_device
>   * @parent:	the devfreq instance of parent device.
> @@ -288,13 +314,15 @@ struct devfreq_simple_ondemand_data {
>   *			using governors except for passive governor.
>   *			If the devfreq device has the specific method to decide
>   *			the next frequency, should use this callback.
> - * @this:	the devfreq instance of own device.
> - * @nb:		the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
> + * @parent_type		parent type of the device

Need to add ':' at the end of word. -> "parent_type:".

> + * @this:		the devfreq instance of own device.
> + * @nb:			the notifier block for DEVFREQ_TRANSITION_NOTIFIER list

I knew that you make them with same indentation.
But, actually, it is not related to this patch like clean-up code.
Even if it is not pretty, you better to don't touch 'this' and 'nb' indentaion.

> + * @cpu_state:		the state min/max/current frequency of all online cpu's
>   *
>   * The devfreq_passive_data have to set the devfreq instance of parent
>   * device with governors except for the passive governor. But, don't need to
> - * initialize the 'this' and 'nb' field because the devfreq core will handle
> - * them.
> + * initialize the 'this', 'nb' and 'cpu_state' field because the devfreq core
> + * will handle them.
>   */
>  struct devfreq_passive_data {
>  	/* Should set the devfreq instance of parent device */
> @@ -303,9 +331,13 @@ struct devfreq_passive_data {
>  	/* Optional callback to decide the next frequency of passvice device */
>  	int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
>  
> +	/* Should set the type of parent device */
> +	enum devfreq_parent_dev_type parent_type;
> +
>  	/* For passive governor's internal use. Don't need to set them */
>  	struct devfreq *this;
>  	struct notifier_block nb;
> +	struct devfreq_cpu_state *cpu_state[NR_CPUS];
>  };
>  #endif
>  
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 06/12] PM / devfreq: Add cpu based scaling support to passive_governor
       [not found]     ` <20200520034307.20435-7-andrew-sh.cheng@mediatek.com>
  2020-05-28  5:03       ` [PATCH 06/12] PM / devfreq: Add cpu based scaling support to passive_governor Chanwoo Choi
@ 2020-05-28  6:14       ` Chanwoo Choi
  2020-05-28  7:17         ` Chanwoo Choi
       [not found]         ` <1591098190.30729.15.camel@mtksdaap41>
  1 sibling, 2 replies; 14+ messages in thread
From: Chanwoo Choi @ 2020-05-28  6:14 UTC (permalink / raw)
  To: Andrew-sh.Cheng, MyungJoo Ham, Kyungmin Park, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Sibi Sankar

Hi Andrew-sh.Cheng,

Thanks for your posting. I like this approach absolutely.
I think that it is necessary. When I developed the embedded product,
I needed this feature always. 

I add the comments on below.


And the following email is not valid. So, I dropped this email
from Cc list.
Saravana Kannan <skannan@codeaurora.org>


On 5/20/20 12:43 PM, Andrew-sh.Cheng wrote:
> From: Saravana Kannan <skannan@codeaurora.org>
> 
> Many CPU architectures have caches that can scale independent of the
> CPUs. Frequency scaling of the caches is necessary to make sure that the
> cache is not a performance bottleneck that leads to poor performance and
> power. The same idea applies for RAM/DDR.
> 
> To achieve this, this patch adds support for cpu based scaling to the
> passive governor. This is accomplished by taking the current frequency
> of each CPU frequency domain and then adjust the frequency of the cache
> (or any devfreq device) based on the frequency of the CPUs. It listens
> to CPU frequency transition notifiers to keep itself up to date on the
> current CPU frequency.
> 
> To decide the frequency of the device, the governor does one of the
> following:
> * Derives the optimal devfreq device opp from required-opps property of
>   the parent cpu opp_table.
> 
> * Scales the device frequency in proportion to the CPU frequency. So, if
>   the CPUs are running at their max frequency, the device runs at its
>   max frequency. If the CPUs are running at their min frequency, the
>   device runs at its min frequency. It is interpolated for frequencies
>   in between.
> 
> Andrew-sh.Cheng change
> dev_pm_opp_xlate_opp to dev_pm_opp_xlate_required_opp devfreq->max_freq
> to devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value
> for kernel-5.7
> 
> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
> [Sibi: Integrated cpu-freqmap governor into passive_governor]
> Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> ---
>  drivers/devfreq/Kconfig            |   2 +
>  drivers/devfreq/governor_passive.c | 278 ++++++++++++++++++++++++++++++++++---
>  include/linux/devfreq.h            |  40 +++++-
>  3 files changed, 299 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
> index 0b1df12e0f21..d9067950af6a 100644
> --- a/drivers/devfreq/Kconfig
> +++ b/drivers/devfreq/Kconfig
> @@ -73,6 +73,8 @@ config DEVFREQ_GOV_PASSIVE
>  	  device. This governor does not change the frequency by itself
>  	  through sysfs entries. The passive governor recommends that
>  	  devfreq device uses the OPP table to get the frequency/voltage.
> +	  Alternatively the governor can also be chosen to scale based on
> +	  the online CPUs current frequency.
>  
>  comment "DEVFREQ Drivers"
>  
> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
> index 2d67d6c12dce..7dcda02a5bb7 100644
> --- a/drivers/devfreq/governor_passive.c
> +++ b/drivers/devfreq/governor_passive.c
> @@ -8,11 +8,89 @@
>   */
>  
>  #include <linux/module.h>
> +#include <linux/cpu.h>
> +#include <linux/cpufreq.h>
> +#include <linux/cpumask.h>
>  #include <linux/device.h>
>  #include <linux/devfreq.h>
> +#include <linux/slab.h>
>  #include "governor.h"
>  
> -static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> +static unsigned int xlate_cpufreq_to_devfreq(struct devfreq_passive_data *data,

Need to change 'unsigned int' to 'unsigned long'.

> +					     unsigned int cpu)
> +{
> +	unsigned int cpu_min, cpu_max, dev_min, dev_max, cpu_percent, max_state;

Better to define them separately as following and then need to rename
the variable. Usually, use the 'min_freq' and 'max_freq' word for
the minimum/maximum frequency.

	unsigned int cpu_min_freq, cpu_max_freq, cpu_curr_freq, cpu_percent;
	unsigned long dev_min_freq, dev_max_freq, dev_max_state,

The devfreq used 'unsigned long'. The cpufreq used 'unsigned long'
and 'unsigned int'. You need to handle them properly.


> +	struct devfreq_cpu_state *cpu_state = data->cpu_state[cpu];
> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> +	unsigned long *freq_table = devfreq->profile->freq_table;

In this function, use 'cpu' work for cpufreq and use 'dev' for devfreq.
So, I think 'dev_freq_table' is proper name instead of 'freq_table'
for the readability.

	freq_table -> dev_freq_table

> +	struct dev_pm_opp *opp = NULL, *cpu_opp = NULL;

In the get_target_freq_with_devfreq(), use 'p_opp' indicating
the OPP of parent device. For the consistency, I think that
use 'p_opp' instead of 'cpu_opp'. 

> +	unsigned long cpu_freq, freq;

Define the 'cpu_freq' on above with cpu_min_freq/cpu_max_freq definition.
	cpu_freq -> cpu_curr_freq.

> +
> +	if (!cpu_state || cpu_state->first_cpu != cpu ||
> +	    !cpu_state->opp_table || !devfreq->opp_table)
> +		return 0;
> +
> +	cpu_freq = cpu_state->freq * 1000;
> +	cpu_opp = devfreq_recommended_opp(cpu_state->dev, &cpu_freq, 0);
> +	if (IS_ERR(cpu_opp))
> +		return 0;
> +
> +	opp = dev_pm_opp_xlate_required_opp(cpu_state->opp_table,
> +					    devfreq->opp_table, cpu_opp);
> +	dev_pm_opp_put(cpu_opp);
> +
> +	if (!IS_ERR(opp)) {
> +		freq = dev_pm_opp_get_freq(opp);
> +		dev_pm_opp_put(opp);

Better to add the 'out' goto statement.
If you use 'goto out', you can reduce the one indentation
without 'else' statement.
	

> +	} else {

As I commented, when dev_pm_opp_xlate_required_opp() return successfully
, use 'goto out'. We can remove 'else' and then reduce the unneeded indentation.


> +		/* Use Interpolation if required opps is not available */
> +		cpu_min = cpu_state->min_freq;
> +		cpu_max = cpu_state->max_freq;
> +		cpu_freq = cpu_state->freq;
> +
> +		if (freq_table) {
> +			/* Get minimum frequency according to sorting order */
> +			max_state = freq_table[devfreq->profile->max_state - 1];
> +			if (freq_table[0] < max_state) {
> +				dev_min = freq_table[0];
> +				dev_max = max_state;
> +			} else {
> +				dev_min = max_state;
> +				dev_max = freq_table[0];
> +			}
> +		} else {
> +			if (devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value
> +			    <= devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value)
> +				return 0;
> +			dev_min =
> +			devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value;
> +			dev_max =
> +			devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value;

I think it is not proper to access the variable of pm_qos structure directly.
Instead of direct access, you have to use the exported PM QoS function such as
- pm_qos_read_value(devfreq->dev.parent, DEV_PM_QOS_MIN_FREQUENCY);
- pm_qos_read_value(devfreq->dev.parent, DEV_PM_QOS_MAX_FREQUENCY);

> +		}
> +		cpu_percent = ((cpu_freq - cpu_min) * 100) / cpu_max - cpu_min;
> +		freq = dev_min + mult_frac(dev_max - dev_min, cpu_percent, 100);
> +	}


I think that you better to add 'out' jump label as following:

out:

> +
> +	return freq;
> +}
> +
> +static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
> +					unsigned long *freq)
> +{
> +	struct devfreq_passive_data *p_data =
> +				(struct devfreq_passive_data *)devfreq->data;
> +	unsigned int cpu, target_freq = 0;

Need to define 'target_freq' with 'unsigned long' type.

> +
> +	for_each_online_cpu(cpu)
> +		target_freq = max(target_freq,
> +				  xlate_cpufreq_to_devfreq(p_data, cpu));
> +
> +	*freq = target_freq;
> +
> +	return 0;
> +}
> +
> +static int get_target_freq_with_devfreq(struct devfreq *devfreq,
>  					unsigned long *freq)
>  {
>  	struct devfreq_passive_data *p_data
> @@ -23,16 +101,6 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>  	int i, count, ret = 0;
>  
>  	/*
> -	 * If the devfreq device with passive governor has the specific method
> -	 * to determine the next frequency, should use the get_target_freq()
> -	 * of struct devfreq_passive_data.
> -	 */
> -	if (p_data->get_target_freq) {
> -		ret = p_data->get_target_freq(devfreq, freq);
> -		goto out;
> -	}
> -
> -	/*
>  	 * If the parent and passive devfreq device uses the OPP table,
>  	 * get the next frequency by using the OPP table.
>  	 */
> @@ -102,6 +170,37 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>  	return ret;
>  }
>  
> +static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> +					   unsigned long *freq)
> +{
> +	struct devfreq_passive_data *p_data =
> +				(struct devfreq_passive_data *)devfreq->data;
> +	int ret;
> +
> +	/*
> +	 * If the devfreq device with passive governor has the specific method
> +	 * to determine the next frequency, should use the get_target_freq()
> +	 * of struct devfreq_passive_data.
> +	 */
> +	if (p_data->get_target_freq)
> +		return p_data->get_target_freq(devfreq, freq);
> +
> +	switch (p_data->parent_type) {
> +	case DEVFREQ_PARENT_DEV:
> +		ret = get_target_freq_with_devfreq(devfreq, freq);
> +		break;
> +	case CPUFREQ_PARENT_DEV:
> +		ret = get_target_freq_with_cpufreq(devfreq, freq);
> +		break;
> +	default:
> +		ret = -EINVAL;
> +		dev_err(&devfreq->dev, "Invalid parent type\n");
> +		break;
> +	}
> +
> +	return ret;
> +}
> +
>  static int update_devfreq_passive(struct devfreq *devfreq, unsigned long freq)
>  {
>  	int ret;
> @@ -156,6 +255,140 @@ static int devfreq_passive_notifier_call(struct notifier_block *nb,
>  	return NOTIFY_DONE;
>  }
>  
> +static int cpufreq_passive_notifier_call(struct notifier_block *nb,
> +					 unsigned long event, void *ptr)
> +{
> +	struct devfreq_passive_data *data =
> +			container_of(nb, struct devfreq_passive_data, nb);
> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> +	struct devfreq_cpu_state *cpu_state;
> +	struct cpufreq_freqs *freq = ptr;

How about changing 'freq' to 'cpu_freqs'?

In the drivers/cpufreq/cpufreq.c, use 'freqs' name indicating
the instance of 'struct cpufreq_freqs'. And in order to
identfy, how about adding 'cpu_' prefix for variable name?

> +	unsigned int current_freq;

Need to define curr_freq with 'unsigned long' type
and better to use 'curr_freq' variable name.

> +	int ret;
> +
> +	if (event != CPUFREQ_POSTCHANGE || !freq ||
> +	    !data->cpu_state[freq->policy->cpu])
> +		return 0;
> +
> +	cpu_state = data->cpu_state[freq->policy->cpu];
> +	if (cpu_state->freq == freq->new)
> +		return 0;
> +
> +	/* Backup current freq and pre-update cpu state freq*/
> +	current_freq = cpu_state->freq;
> +	cpu_state->freq = freq->new;
> +
> +	mutex_lock(&devfreq->lock);
> +	ret = update_devfreq(devfreq);
> +	mutex_unlock(&devfreq->lock);
> +	if (ret) {
> +		cpu_state->freq = current_freq;
> +		dev_err(&devfreq->dev, "Couldn't update the frequency.\n");
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int cpufreq_passive_register(struct devfreq_passive_data **p_data)
> +{
> +	struct devfreq_passive_data *data = *p_data;
> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> +	struct device *dev = devfreq->dev.parent;
> +	struct opp_table *opp_table = NULL;
> +	struct devfreq_cpu_state *state;

For the readability, I thinkt 'cpu_state' is proper instead of 'state'.

> +	struct cpufreq_policy *policy;
> +	struct device *cpu_dev;
> +	unsigned int cpu;
> +	int ret;
> +
> +	get_online_cpus();

Add blank line.

> +	data->nb.notifier_call = cpufreq_passive_notifier_call;
> +	ret = cpufreq_register_notifier(&data->nb,
> +					CPUFREQ_TRANSITION_NOTIFIER);
> +	if (ret) {
> +		dev_err(dev, "Couldn't register cpufreq notifier.\n");
> +		data->nb.notifier_call = NULL;
> +		goto out;
> +	}
> +
> +	/* Populate devfreq_cpu_state */
> +	for_each_online_cpu(cpu) {
> +		if (data->cpu_state[cpu])
> +			continue;
> +
> +		policy = cpufreq_cpu_get(cpu);

cpufreq_cpu_get() might return 'NULL'. I think you need to handle
return value as following:

		if (!policy) {
			ret = -EINVAL;
			goto out;
		} else if (PTR_ERR(policy) == -EPROBE_DEFER) {
			goto out;
		} else if (IS_ERR(policy) {
			ret = PTR_ERR(policy);
			dev_err(dev, "Couldn't get the cpufreq_poliy.\n");
			goto out;
		}

If cpufreq_cpu_get() return successfully, to do next.
It reduces the one indentaion.



> +		if (policy) {
> +			state = kzalloc(sizeof(*state), GFP_KERNEL);
> +			if (!state) {
> +				ret = -ENOMEM;
> +				goto out;
> +			}
> +
> +			cpu_dev = get_cpu_device(cpu);
> +			if (!cpu_dev) {
> +				dev_err(dev, "Couldn't get cpu device.\n");
> +				ret = -ENODEV;
> +				goto out;
> +			}
> +
> +			opp_table = dev_pm_opp_get_opp_table(cpu_dev);
> +			if (IS_ERR(devfreq->opp_table)) {
> +				ret = PTR_ERR(opp_table);
> +				goto out;
> +			}
> +
> +			state->dev = cpu_dev;
> +			state->opp_table = opp_table;
> +			state->first_cpu = cpumask_first(policy->related_cpus);
> +			state->freq = policy->cur;
> +			state->min_freq = policy->cpuinfo.min_freq;
> +			state->max_freq = policy->cpuinfo.max_freq;
> +			data->cpu_state[cpu] = state;

Add blank line.

> +			cpufreq_cpu_put(policy);
> +		} else {
> +			ret = -EPROBE_DEFER;
> +			goto out;
> +		}
> +	}

Add blank line.

> +out:
> +	put_online_cpus();
> +	if (ret)
> +		return ret;
> +
> +	/* Update devfreq */
> +	mutex_lock(&devfreq->lock);
> +	ret = update_devfreq(devfreq);
> +	mutex_unlock(&devfreq->lock);
> +	if (ret)
> +		dev_err(dev, "Couldn't update the frequency.\n");
> +
> +	return ret;
> +}
> +
> +static int cpufreq_passive_unregister(struct devfreq_passive_data **p_data)
> +{
> +	struct devfreq_passive_data *data = *p_data;
> +	struct devfreq_cpu_state *cpu_state;
> +	int cpu;
> +
> +	if (data->nb.notifier_call)
> +		cpufreq_unregister_notifier(&data->nb,
> +					    CPUFREQ_TRANSITION_NOTIFIER);
> +
> +	for_each_possible_cpu(cpu) {
> +		cpu_state = data->cpu_state[cpu];
> +		if (cpu_state) {
> +			if (cpu_state->opp_table)
> +				dev_pm_opp_put_opp_table(cpu_state->opp_table);
> +			kfree(cpu_state);
> +			cpu_state = NULL;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
>  static int devfreq_passive_event_handler(struct devfreq *devfreq,
>  				unsigned int event, void *data)
>  {
> @@ -165,7 +398,7 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
>  	struct notifier_block *nb = &p_data->nb;
>  	int ret = 0;
>  
> -	if (!parent)
> +	if (p_data->parent_type == DEVFREQ_PARENT_DEV && !parent)
>  		return -EPROBE_DEFER;

If you modify the devfreq_passive_event_handler() as following,
you can move this condition for DEVFREQ_PARENT_DEV into 
(register|unregister)_parent_dev_notifier.

	switch (event) {                                                                                  
	case DEVFREQ_GOV_START:                                               
		ret = register_parent_dev_notifier(p_data);
		break;
	case DEVFREQ_GOV_STOP:                                             
		ret = unregister_parent_dev_notifier(p_data);
		break;
	default: 
		ret = -EINVAL;
		break;
	}
                                                                                              
	return ret;

>  
>  	switch (event) {
> @@ -173,13 +406,24 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
>  		if (!p_data->this)
>  			p_data->this = devfreq;
>  
> -		nb->notifier_call = devfreq_passive_notifier_call;
> -		ret = devfreq_register_notifier(parent, nb,
> -					DEVFREQ_TRANSITION_NOTIFIER);
> +		if (p_data->parent_type == DEVFREQ_PARENT_DEV) {
> +			nb->notifier_call = devfreq_passive_notifier_call;
> +			ret = devfreq_register_notifier(parent, nb,
> +						DEVFREQ_TRANSITION_NOTIFIER);
> +		} else if (p_data->parent_type == CPUFREQ_PARENT_DEV) {
> +			ret = cpufreq_passive_register(&p_data);

I think that we better to collect the code related to notifier registration
into one function like devfreq_pass_register_notifier() instead of
cpufreq_passive_register() as following: I think it is more simple and readable.

If you have more proper function name of register_parent_dev_notifier,
please give your opinion.


	int register_parent_dev_notifier(struct devfreq_passive_data **p_data)
		switch (p_data->parent_type) {
		case DEVFREQ_PARENT_DEV:
			nb->notifier_call = devfreq_passive_notifier_call;
			ret = devfreq_register_notifier(parent, nb,
			break;
		case CPUFREQ_PARENT_DEV:
			cpufreq_register_notifier(...)
			...
			break;
		}
		

> +		} else {
> +			ret = -EINVAL;
> +		}
>  		break;
>  	case DEVFREQ_GOV_STOP:
> -		WARN_ON(devfreq_unregister_notifier(parent, nb,
> -					DEVFREQ_TRANSITION_NOTIFIER));
> +		if (p_data->parent_type == DEVFREQ_PARENT_DEV)
> +			WARN_ON(devfreq_unregister_notifier(parent, nb,
> +						DEVFREQ_TRANSITION_NOTIFIER));
> +		else if (p_data->parent_type == CPUFREQ_PARENT_DEV)
> +			cpufreq_passive_unregister(&p_data);
> +		else
> +			ret = -EINVAL;

ditto. unregister_parent_dev_notifier(struct devfreq_passive_data **p_data)

>  		break;
>  	default:
>  		break;
> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
> index a4b19d593151..04ce576fd6f1 100644
> --- a/include/linux/devfreq.h
> +++ b/include/linux/devfreq.h
> @@ -278,6 +278,32 @@ struct devfreq_simple_ondemand_data {
>  
>  #if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
>  /**
> + * struct devfreq_cpu_state - holds the per-cpu state
> + * @freq:	the current frequency of the cpu.
> + * @min_freq:	the min frequency of the cpu.
> + * @max_freq:	the max frequency of the cpu.
> + * @first_cpu:	the cpumask of the first cpu of a policy.
> + * @dev:	reference to cpu device.
> + * @opp_table:	reference to cpu opp table.
> + *
> + * This structure stores the required cpu_state of a cpu.
> + * This is auto-populated by the governor.
> + */
> +struct devfreq_cpu_state {> +	unsigned int freq;

It is better to change from 'freq' to 'curr_freq'
for more correct expression.

> +	unsigned int min_freq;
> +	unsigned int max_freq;
> +	unsigned int first_cpu;
> +	struct device *dev;

How about changing the name 'dev' to 'cpu_dev'?


> +	struct opp_table *opp_table;
> +};

devfreq_cpu_state is only handled by within driver/devfreq/governor_passive.c.

So, you can move it into drivers/devfreq/governor_passive.c
and just add the definition into include/linux/devfreq.h as following:
It is able to prevent the access of variable of 'struct devfreq_cpu_state'
outside.

	struct devfreq_cpu_state;

> +
> +enum devfreq_parent_dev_type {
> +	DEVFREQ_PARENT_DEV,
> +	CPUFREQ_PARENT_DEV,
> +};
> +
> +/**
>   * struct devfreq_passive_data - ``void *data`` fed to struct devfreq
>   *	and devfreq_add_device
>   * @parent:	the devfreq instance of parent device.
> @@ -288,13 +314,15 @@ struct devfreq_simple_ondemand_data {
>   *			using governors except for passive governor.
>   *			If the devfreq device has the specific method to decide
>   *			the next frequency, should use this callback.
> - * @this:	the devfreq instance of own device.
> - * @nb:		the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
> + * @parent_type		parent type of the device

Need to add ':' at the end of word. -> "parent_type:".

> + * @this:		the devfreq instance of own device.
> + * @nb:			the notifier block for DEVFREQ_TRANSITION_NOTIFIER list

I knew that you make them with same indentation.
But, actually, it is not related to this patch like clean-up code.
Even if it is not pretty, you better to don't touch 'this' and 'nb' indentaion.

> + * @cpu_state:		the state min/max/current frequency of all online cpu's
>   *
>   * The devfreq_passive_data have to set the devfreq instance of parent
>   * device with governors except for the passive governor. But, don't need to
> - * initialize the 'this' and 'nb' field because the devfreq core will handle
> - * them.
> + * initialize the 'this', 'nb' and 'cpu_state' field because the devfreq core
> + * will handle them.
>   */
>  struct devfreq_passive_data {
>  	/* Should set the devfreq instance of parent device */
> @@ -303,9 +331,13 @@ struct devfreq_passive_data {
>  	/* Optional callback to decide the next frequency of passvice device */
>  	int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
>  
> +	/* Should set the type of parent device */
> +	enum devfreq_parent_dev_type parent_type;
> +
>  	/* For passive governor's internal use. Don't need to set them */
>  	struct devfreq *this;
>  	struct notifier_block nb;
> +	struct devfreq_cpu_state *cpu_state[NR_CPUS];
>  };
>  #endif
>  
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 06/12] PM / devfreq: Add cpu based scaling support to passive_governor
  2020-05-28  6:14       ` Chanwoo Choi
@ 2020-05-28  7:17         ` Chanwoo Choi
       [not found]           ` <1591100614.1804.1.camel@mtksdaap41>
       [not found]         ` <1591098190.30729.15.camel@mtksdaap41>
  1 sibling, 1 reply; 14+ messages in thread
From: Chanwoo Choi @ 2020-05-28  7:17 UTC (permalink / raw)
  To: Andrew-sh.Cheng, MyungJoo Ham, Kyungmin Park, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Sibi Sankar

Hi Andrew-sh.Cheng,

The exynos-bus.c used the passive governor.
Even if don't make the problem because DEVFREQ_PARENT_DEV is zero,
you need to initialize the parent_type with DEVFREQ_PARENT_DEV as following:

diff --git a/drivers/devfreq/exynos-bus.c b/drivers/devfreq/exynos-bus.c
index 8fa8eb541373..1c71c47bc2ac 100644
--- a/drivers/devfreq/exynos-bus.c
+++ b/drivers/devfreq/exynos-bus.c
@@ -369,6 +369,7 @@ static int exynos_bus_profile_init_passive(struct exynos_bus *bus,
                return -ENOMEM;
 
        passive_data->parent = parent_devfreq;
+       passive_data->parent_type = DEVFREQ_PARENT_DEV;
 
        /* Add devfreq device for exynos bus with passive governor */
        bus->devfreq = devm_devfreq_add_device(dev, profile, DEVFREQ_GOV_PASSIVE,


On 5/28/20 3:14 PM, Chanwoo Choi wrote:
> Hi Andrew-sh.Cheng,
> 
> Thanks for your posting. I like this approach absolutely.
> I think that it is necessary. When I developed the embedded product,
> I needed this feature always. 
> 
> I add the comments on below.
> 
> 
> And the following email is not valid. So, I dropped this email
> from Cc list.
> Saravana Kannan <skannan@codeaurora.org>
> 
> 
> On 5/20/20 12:43 PM, Andrew-sh.Cheng wrote:
>> From: Saravana Kannan <skannan@codeaurora.org>
>>
>> Many CPU architectures have caches that can scale independent of the
>> CPUs. Frequency scaling of the caches is necessary to make sure that the
>> cache is not a performance bottleneck that leads to poor performance and
>> power. The same idea applies for RAM/DDR.
>>
>> To achieve this, this patch adds support for cpu based scaling to the
>> passive governor. This is accomplished by taking the current frequency
>> of each CPU frequency domain and then adjust the frequency of the cache
>> (or any devfreq device) based on the frequency of the CPUs. It listens
>> to CPU frequency transition notifiers to keep itself up to date on the
>> current CPU frequency.
>>
>> To decide the frequency of the device, the governor does one of the
>> following:
>> * Derives the optimal devfreq device opp from required-opps property of
>>   the parent cpu opp_table.
>>
>> * Scales the device frequency in proportion to the CPU frequency. So, if
>>   the CPUs are running at their max frequency, the device runs at its
>>   max frequency. If the CPUs are running at their min frequency, the
>>   device runs at its min frequency. It is interpolated for frequencies
>>   in between.
>>
>> Andrew-sh.Cheng change
>> dev_pm_opp_xlate_opp to dev_pm_opp_xlate_required_opp devfreq->max_freq
>> to devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value
>> for kernel-5.7
>>
>> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
>> [Sibi: Integrated cpu-freqmap governor into passive_governor]
>> Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
>> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
>> ---
>>  drivers/devfreq/Kconfig            |   2 +
>>  drivers/devfreq/governor_passive.c | 278 ++++++++++++++++++++++++++++++++++---
>>  include/linux/devfreq.h            |  40 +++++-
>>  3 files changed, 299 insertions(+), 21 deletions(-)
>>
>> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
>> index 0b1df12e0f21..d9067950af6a 100644
>> --- a/drivers/devfreq/Kconfig
>> +++ b/drivers/devfreq/Kconfig
>> @@ -73,6 +73,8 @@ config DEVFREQ_GOV_PASSIVE
>>  	  device. This governor does not change the frequency by itself
>>  	  through sysfs entries. The passive governor recommends that
>>  	  devfreq device uses the OPP table to get the frequency/voltage.
>> +	  Alternatively the governor can also be chosen to scale based on
>> +	  the online CPUs current frequency.
>>  
>>  comment "DEVFREQ Drivers"
>>  
>> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
>> index 2d67d6c12dce..7dcda02a5bb7 100644
>> --- a/drivers/devfreq/governor_passive.c
>> +++ b/drivers/devfreq/governor_passive.c
>> @@ -8,11 +8,89 @@
>>   */
>>  
>>  #include <linux/module.h>
>> +#include <linux/cpu.h>
>> +#include <linux/cpufreq.h>
>> +#include <linux/cpumask.h>
>>  #include <linux/device.h>
>>  #include <linux/devfreq.h>
>> +#include <linux/slab.h>
>>  #include "governor.h"
>>  
>> -static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>> +static unsigned int xlate_cpufreq_to_devfreq(struct devfreq_passive_data *data,
> 
> Need to change 'unsigned int' to 'unsigned long'.
> 
>> +					     unsigned int cpu)
>> +{
>> +	unsigned int cpu_min, cpu_max, dev_min, dev_max, cpu_percent, max_state;
> 
> Better to define them separately as following and then need to rename
> the variable. Usually, use the 'min_freq' and 'max_freq' word for
> the minimum/maximum frequency.
> 
> 	unsigned int cpu_min_freq, cpu_max_freq, cpu_curr_freq, cpu_percent;
> 	unsigned long dev_min_freq, dev_max_freq, dev_max_state,
> 
> The devfreq used 'unsigned long'. The cpufreq used 'unsigned long'
> and 'unsigned int'. You need to handle them properly.
> 
> 
>> +	struct devfreq_cpu_state *cpu_state = data->cpu_state[cpu];
>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
>> +	unsigned long *freq_table = devfreq->profile->freq_table;
> 
> In this function, use 'cpu' work for cpufreq and use 'dev' for devfreq.
> So, I think 'dev_freq_table' is proper name instead of 'freq_table'
> for the readability.
> 
> 	freq_table -> dev_freq_table
> 
>> +	struct dev_pm_opp *opp = NULL, *cpu_opp = NULL;
> 
> In the get_target_freq_with_devfreq(), use 'p_opp' indicating
> the OPP of parent device. For the consistency, I think that
> use 'p_opp' instead of 'cpu_opp'. 
> 
>> +	unsigned long cpu_freq, freq;
> 
> Define the 'cpu_freq' on above with cpu_min_freq/cpu_max_freq definition.
> 	cpu_freq -> cpu_curr_freq.
> 
>> +
>> +	if (!cpu_state || cpu_state->first_cpu != cpu ||
>> +	    !cpu_state->opp_table || !devfreq->opp_table)
>> +		return 0;
>> +
>> +	cpu_freq = cpu_state->freq * 1000;
>> +	cpu_opp = devfreq_recommended_opp(cpu_state->dev, &cpu_freq, 0);
>> +	if (IS_ERR(cpu_opp))
>> +		return 0;
>> +
>> +	opp = dev_pm_opp_xlate_required_opp(cpu_state->opp_table,
>> +					    devfreq->opp_table, cpu_opp);
>> +	dev_pm_opp_put(cpu_opp);
>> +
>> +	if (!IS_ERR(opp)) {
>> +		freq = dev_pm_opp_get_freq(opp);
>> +		dev_pm_opp_put(opp);
> 
> Better to add the 'out' goto statement.
> If you use 'goto out', you can reduce the one indentation
> without 'else' statement.
> 	
> 
>> +	} else {
> 
> As I commented, when dev_pm_opp_xlate_required_opp() return successfully
> , use 'goto out'. We can remove 'else' and then reduce the unneeded indentation.
> 
> 
>> +		/* Use Interpolation if required opps is not available */
>> +		cpu_min = cpu_state->min_freq;
>> +		cpu_max = cpu_state->max_freq;
>> +		cpu_freq = cpu_state->freq;
>> +
>> +		if (freq_table) {
>> +			/* Get minimum frequency according to sorting order */
>> +			max_state = freq_table[devfreq->profile->max_state - 1];
>> +			if (freq_table[0] < max_state) {
>> +				dev_min = freq_table[0];
>> +				dev_max = max_state;
>> +			} else {
>> +				dev_min = max_state;
>> +				dev_max = freq_table[0];
>> +			}
>> +		} else {
>> +			if (devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value
>> +			    <= devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value)
>> +				return 0;
>> +			dev_min =
>> +			devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value;
>> +			dev_max =
>> +			devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value;
> 
> I think it is not proper to access the variable of pm_qos structure directly.
> Instead of direct access, you have to use the exported PM QoS function such as
> - pm_qos_read_value(devfreq->dev.parent, DEV_PM_QOS_MIN_FREQUENCY);
> - pm_qos_read_value(devfreq->dev.parent, DEV_PM_QOS_MAX_FREQUENCY);
> 
>> +		}
>> +		cpu_percent = ((cpu_freq - cpu_min) * 100) / cpu_max - cpu_min;
>> +		freq = dev_min + mult_frac(dev_max - dev_min, cpu_percent, 100);
>> +	}
> 
> 
> I think that you better to add 'out' jump label as following:
> 
> out:
> 
>> +
>> +	return freq;
>> +}
>> +
>> +static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
>> +					unsigned long *freq)
>> +{
>> +	struct devfreq_passive_data *p_data =
>> +				(struct devfreq_passive_data *)devfreq->data;
>> +	unsigned int cpu, target_freq = 0;
> 
> Need to define 'target_freq' with 'unsigned long' type.
> 
>> +
>> +	for_each_online_cpu(cpu)
>> +		target_freq = max(target_freq,
>> +				  xlate_cpufreq_to_devfreq(p_data, cpu));
>> +
>> +	*freq = target_freq;
>> +
>> +	return 0;
>> +}
>> +
>> +static int get_target_freq_with_devfreq(struct devfreq *devfreq,
>>  					unsigned long *freq)
>>  {
>>  	struct devfreq_passive_data *p_data
>> @@ -23,16 +101,6 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>  	int i, count, ret = 0;
>>  
>>  	/*
>> -	 * If the devfreq device with passive governor has the specific method
>> -	 * to determine the next frequency, should use the get_target_freq()
>> -	 * of struct devfreq_passive_data.
>> -	 */
>> -	if (p_data->get_target_freq) {
>> -		ret = p_data->get_target_freq(devfreq, freq);
>> -		goto out;
>> -	}
>> -
>> -	/*
>>  	 * If the parent and passive devfreq device uses the OPP table,
>>  	 * get the next frequency by using the OPP table.
>>  	 */
>> @@ -102,6 +170,37 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>  	return ret;
>>  }
>>  
>> +static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>> +					   unsigned long *freq)
>> +{
>> +	struct devfreq_passive_data *p_data =
>> +				(struct devfreq_passive_data *)devfreq->data;
>> +	int ret;
>> +
>> +	/*
>> +	 * If the devfreq device with passive governor has the specific method
>> +	 * to determine the next frequency, should use the get_target_freq()
>> +	 * of struct devfreq_passive_data.
>> +	 */
>> +	if (p_data->get_target_freq)
>> +		return p_data->get_target_freq(devfreq, freq);
>> +
>> +	switch (p_data->parent_type) {
>> +	case DEVFREQ_PARENT_DEV:
>> +		ret = get_target_freq_with_devfreq(devfreq, freq);
>> +		break;
>> +	case CPUFREQ_PARENT_DEV:
>> +		ret = get_target_freq_with_cpufreq(devfreq, freq);
>> +		break;
>> +	default:
>> +		ret = -EINVAL;
>> +		dev_err(&devfreq->dev, "Invalid parent type\n");
>> +		break;
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>>  static int update_devfreq_passive(struct devfreq *devfreq, unsigned long freq)
>>  {
>>  	int ret;
>> @@ -156,6 +255,140 @@ static int devfreq_passive_notifier_call(struct notifier_block *nb,
>>  	return NOTIFY_DONE;
>>  }
>>  
>> +static int cpufreq_passive_notifier_call(struct notifier_block *nb,
>> +					 unsigned long event, void *ptr)
>> +{
>> +	struct devfreq_passive_data *data =
>> +			container_of(nb, struct devfreq_passive_data, nb);
>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
>> +	struct devfreq_cpu_state *cpu_state;
>> +	struct cpufreq_freqs *freq = ptr;
> 
> How about changing 'freq' to 'cpu_freqs'?
> 
> In the drivers/cpufreq/cpufreq.c, use 'freqs' name indicating
> the instance of 'struct cpufreq_freqs'. And in order to
> identfy, how about adding 'cpu_' prefix for variable name?
> 
>> +	unsigned int current_freq;
> 
> Need to define curr_freq with 'unsigned long' type
> and better to use 'curr_freq' variable name.
> 
>> +	int ret;
>> +
>> +	if (event != CPUFREQ_POSTCHANGE || !freq ||
>> +	    !data->cpu_state[freq->policy->cpu])
>> +		return 0;
>> +
>> +	cpu_state = data->cpu_state[freq->policy->cpu];
>> +	if (cpu_state->freq == freq->new)
>> +		return 0;
>> +
>> +	/* Backup current freq and pre-update cpu state freq*/
>> +	current_freq = cpu_state->freq;
>> +	cpu_state->freq = freq->new;
>> +
>> +	mutex_lock(&devfreq->lock);
>> +	ret = update_devfreq(devfreq);
>> +	mutex_unlock(&devfreq->lock);
>> +	if (ret) {
>> +		cpu_state->freq = current_freq;
>> +		dev_err(&devfreq->dev, "Couldn't update the frequency.\n");
>> +		return ret;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int cpufreq_passive_register(struct devfreq_passive_data **p_data)
>> +{
>> +	struct devfreq_passive_data *data = *p_data;
>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
>> +	struct device *dev = devfreq->dev.parent;
>> +	struct opp_table *opp_table = NULL;
>> +	struct devfreq_cpu_state *state;
> 
> For the readability, I thinkt 'cpu_state' is proper instead of 'state'.
> 
>> +	struct cpufreq_policy *policy;
>> +	struct device *cpu_dev;
>> +	unsigned int cpu;
>> +	int ret;
>> +
>> +	get_online_cpus();
> 
> Add blank line.
> 
>> +	data->nb.notifier_call = cpufreq_passive_notifier_call;
>> +	ret = cpufreq_register_notifier(&data->nb,
>> +					CPUFREQ_TRANSITION_NOTIFIER);
>> +	if (ret) {
>> +		dev_err(dev, "Couldn't register cpufreq notifier.\n");
>> +		data->nb.notifier_call = NULL;
>> +		goto out;
>> +	}
>> +
>> +	/* Populate devfreq_cpu_state */
>> +	for_each_online_cpu(cpu) {
>> +		if (data->cpu_state[cpu])
>> +			continue;
>> +
>> +		policy = cpufreq_cpu_get(cpu);
> 
> cpufreq_cpu_get() might return 'NULL'. I think you need to handle
> return value as following:
> 
> 		if (!policy) {
> 			ret = -EINVAL;
> 			goto out;
> 		} else if (PTR_ERR(policy) == -EPROBE_DEFER) {
> 			goto out;
> 		} else if (IS_ERR(policy) {
> 			ret = PTR_ERR(policy);
> 			dev_err(dev, "Couldn't get the cpufreq_poliy.\n");
> 			goto out;
> 		}
> 
> If cpufreq_cpu_get() return successfully, to do next.
> It reduces the one indentaion.
> 
> 
> 
>> +		if (policy) {
>> +			state = kzalloc(sizeof(*state), GFP_KERNEL);
>> +			if (!state) {
>> +				ret = -ENOMEM;
>> +				goto out;
>> +			}
>> +
>> +			cpu_dev = get_cpu_device(cpu);
>> +			if (!cpu_dev) {
>> +				dev_err(dev, "Couldn't get cpu device.\n");
>> +				ret = -ENODEV;
>> +				goto out;
>> +			}
>> +
>> +			opp_table = dev_pm_opp_get_opp_table(cpu_dev);
>> +			if (IS_ERR(devfreq->opp_table)) {
>> +				ret = PTR_ERR(opp_table);
>> +				goto out;
>> +			}
>> +
>> +			state->dev = cpu_dev;
>> +			state->opp_table = opp_table;
>> +			state->first_cpu = cpumask_first(policy->related_cpus);
>> +			state->freq = policy->cur;
>> +			state->min_freq = policy->cpuinfo.min_freq;
>> +			state->max_freq = policy->cpuinfo.max_freq;
>> +			data->cpu_state[cpu] = state;
> 
> Add blank line.
> 
>> +			cpufreq_cpu_put(policy);
>> +		} else {
>> +			ret = -EPROBE_DEFER;
>> +			goto out;
>> +		}
>> +	}
> 
> Add blank line.
> 
>> +out:
>> +	put_online_cpus();
>> +	if (ret)
>> +		return ret;
>> +
>> +	/* Update devfreq */
>> +	mutex_lock(&devfreq->lock);
>> +	ret = update_devfreq(devfreq);
>> +	mutex_unlock(&devfreq->lock);
>> +	if (ret)
>> +		dev_err(dev, "Couldn't update the frequency.\n");
>> +
>> +	return ret;
>> +}
>> +
>> +static int cpufreq_passive_unregister(struct devfreq_passive_data **p_data)
>> +{
>> +	struct devfreq_passive_data *data = *p_data;
>> +	struct devfreq_cpu_state *cpu_state;
>> +	int cpu;
>> +
>> +	if (data->nb.notifier_call)
>> +		cpufreq_unregister_notifier(&data->nb,
>> +					    CPUFREQ_TRANSITION_NOTIFIER);
>> +
>> +	for_each_possible_cpu(cpu) {
>> +		cpu_state = data->cpu_state[cpu];
>> +		if (cpu_state) {
>> +			if (cpu_state->opp_table)
>> +				dev_pm_opp_put_opp_table(cpu_state->opp_table);
>> +			kfree(cpu_state);
>> +			cpu_state = NULL;
>> +		}
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>>  static int devfreq_passive_event_handler(struct devfreq *devfreq,
>>  				unsigned int event, void *data)
>>  {
>> @@ -165,7 +398,7 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
>>  	struct notifier_block *nb = &p_data->nb;
>>  	int ret = 0;
>>  
>> -	if (!parent)
>> +	if (p_data->parent_type == DEVFREQ_PARENT_DEV && !parent)
>>  		return -EPROBE_DEFER;
> 
> If you modify the devfreq_passive_event_handler() as following,
> you can move this condition for DEVFREQ_PARENT_DEV into 
> (register|unregister)_parent_dev_notifier.
> 
> 	switch (event) {                                                                                  
> 	case DEVFREQ_GOV_START:                                               
> 		ret = register_parent_dev_notifier(p_data);
> 		break;
> 	case DEVFREQ_GOV_STOP:                                             
> 		ret = unregister_parent_dev_notifier(p_data);
> 		break;
> 	default: 
> 		ret = -EINVAL;
> 		break;
> 	}
>                                                                                               
> 	return ret;
> 
>>  
>>  	switch (event) {
>> @@ -173,13 +406,24 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
>>  		if (!p_data->this)
>>  			p_data->this = devfreq;
>>  
>> -		nb->notifier_call = devfreq_passive_notifier_call;
>> -		ret = devfreq_register_notifier(parent, nb,
>> -					DEVFREQ_TRANSITION_NOTIFIER);
>> +		if (p_data->parent_type == DEVFREQ_PARENT_DEV) {
>> +			nb->notifier_call = devfreq_passive_notifier_call;
>> +			ret = devfreq_register_notifier(parent, nb,
>> +						DEVFREQ_TRANSITION_NOTIFIER);
>> +		} else if (p_data->parent_type == CPUFREQ_PARENT_DEV) {
>> +			ret = cpufreq_passive_register(&p_data);
> 
> I think that we better to collect the code related to notifier registration
> into one function like devfreq_pass_register_notifier() instead of
> cpufreq_passive_register() as following: I think it is more simple and readable.
> 
> If you have more proper function name of register_parent_dev_notifier,
> please give your opinion.
> 
> 
> 	int register_parent_dev_notifier(struct devfreq_passive_data **p_data)
> 		switch (p_data->parent_type) {
> 		case DEVFREQ_PARENT_DEV:
> 			nb->notifier_call = devfreq_passive_notifier_call;
> 			ret = devfreq_register_notifier(parent, nb,
> 			break;
> 		case CPUFREQ_PARENT_DEV:
> 			cpufreq_register_notifier(...)
> 			...
> 			break;
> 		}
> 		
> 
>> +		} else {
>> +			ret = -EINVAL;
>> +		}
>>  		break;
>>  	case DEVFREQ_GOV_STOP:
>> -		WARN_ON(devfreq_unregister_notifier(parent, nb,
>> -					DEVFREQ_TRANSITION_NOTIFIER));
>> +		if (p_data->parent_type == DEVFREQ_PARENT_DEV)
>> +			WARN_ON(devfreq_unregister_notifier(parent, nb,
>> +						DEVFREQ_TRANSITION_NOTIFIER));
>> +		else if (p_data->parent_type == CPUFREQ_PARENT_DEV)
>> +			cpufreq_passive_unregister(&p_data);
>> +		else
>> +			ret = -EINVAL;
> 
> ditto. unregister_parent_dev_notifier(struct devfreq_passive_data **p_data)
> 
>>  		break;
>>  	default:
>>  		break;
>> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
>> index a4b19d593151..04ce576fd6f1 100644
>> --- a/include/linux/devfreq.h
>> +++ b/include/linux/devfreq.h
>> @@ -278,6 +278,32 @@ struct devfreq_simple_ondemand_data {
>>  
>>  #if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
>>  /**
>> + * struct devfreq_cpu_state - holds the per-cpu state
>> + * @freq:	the current frequency of the cpu.
>> + * @min_freq:	the min frequency of the cpu.
>> + * @max_freq:	the max frequency of the cpu.
>> + * @first_cpu:	the cpumask of the first cpu of a policy.
>> + * @dev:	reference to cpu device.
>> + * @opp_table:	reference to cpu opp table.
>> + *
>> + * This structure stores the required cpu_state of a cpu.
>> + * This is auto-populated by the governor.
>> + */
>> +struct devfreq_cpu_state {> +	unsigned int freq;
> 
> It is better to change from 'freq' to 'curr_freq'
> for more correct expression.
> 
>> +	unsigned int min_freq;
>> +	unsigned int max_freq;
>> +	unsigned int first_cpu;
>> +	struct device *dev;
> 
> How about changing the name 'dev' to 'cpu_dev'?
> 
> 
>> +	struct opp_table *opp_table;
>> +};
> 
> devfreq_cpu_state is only handled by within driver/devfreq/governor_passive.c.
> 
> So, you can move it into drivers/devfreq/governor_passive.c
> and just add the definition into include/linux/devfreq.h as following:
> It is able to prevent the access of variable of 'struct devfreq_cpu_state'
> outside.
> 
> 	struct devfreq_cpu_state;
> 
>> +
>> +enum devfreq_parent_dev_type {
>> +	DEVFREQ_PARENT_DEV,
>> +	CPUFREQ_PARENT_DEV,
>> +};
>> +
>> +/**
>>   * struct devfreq_passive_data - ``void *data`` fed to struct devfreq
>>   *	and devfreq_add_device
>>   * @parent:	the devfreq instance of parent device.
>> @@ -288,13 +314,15 @@ struct devfreq_simple_ondemand_data {
>>   *			using governors except for passive governor.
>>   *			If the devfreq device has the specific method to decide
>>   *			the next frequency, should use this callback.
>> - * @this:	the devfreq instance of own device.
>> - * @nb:		the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
>> + * @parent_type		parent type of the device
> 
> Need to add ':' at the end of word. -> "parent_type:".
> 
>> + * @this:		the devfreq instance of own device.
>> + * @nb:			the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
> 
> I knew that you make them with same indentation.
> But, actually, it is not related to this patch like clean-up code.
> Even if it is not pretty, you better to don't touch 'this' and 'nb' indentaion.
> 
>> + * @cpu_state:		the state min/max/current frequency of all online cpu's
>>   *
>>   * The devfreq_passive_data have to set the devfreq instance of parent
>>   * device with governors except for the passive governor. But, don't need to
>> - * initialize the 'this' and 'nb' field because the devfreq core will handle
>> - * them.
>> + * initialize the 'this', 'nb' and 'cpu_state' field because the devfreq core
>> + * will handle them.
>>   */
>>  struct devfreq_passive_data {
>>  	/* Should set the devfreq instance of parent device */
>> @@ -303,9 +331,13 @@ struct devfreq_passive_data {
>>  	/* Optional callback to decide the next frequency of passvice device */
>>  	int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
>>  
>> +	/* Should set the type of parent device */
>> +	enum devfreq_parent_dev_type parent_type;
>> +
>>  	/* For passive governor's internal use. Don't need to set them */
>>  	struct devfreq *this;
>>  	struct notifier_block nb;
>> +	struct devfreq_cpu_state *cpu_state[NR_CPUS];
>>  };
>>  #endif
>>  
>>
> 
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 09/12] devfreq: add mediatek cci devfreq
       [not found]   ` <20200520034307.20435-10-andrew-sh.cheng@mediatek.com>
  2020-05-20 12:31     ` [PATCH 09/12] devfreq: add mediatek cci devfreq Mark Brown
@ 2020-05-28  7:35     ` Chanwoo Choi
  2020-05-28  8:00       ` Chanwoo Choi
  1 sibling, 1 reply; 14+ messages in thread
From: Chanwoo Choi @ 2020-05-28  7:35 UTC (permalink / raw)
  To: Andrew-sh.Cheng, MyungJoo Ham, Kyungmin Park, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream

Hi Andrew-sh.Cheng,

On 5/20/20 12:43 PM, Andrew-sh.Cheng wrote:
> This adds a devfreq driver for the Cache Coherent Interconnect (CCI)
> of the Mediatek MT8183.
> 
> On the MT8183 the CCI is supplied by the same regulator as the LITTLE
> cores. The driver is notified when the regulator voltage changes
> (driven by cpufreq) and adjusts the CCI frequency to the maximum
> possible value.
> 
> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> ---
>  drivers/devfreq/Kconfig              |  10 ++
>  drivers/devfreq/Makefile             |   1 +
>  drivers/devfreq/mt8183-cci-devfreq.c | 206 +++++++++++++++++++++++++++++++++++

The mt8183-cci.c is enough for driver name.

>  3 files changed, 217 insertions(+)
>  create mode 100644 drivers/devfreq/mt8183-cci-devfreq.c
> 
> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
> index d9067950af6a..4ed7116271ee 100644
> --- a/drivers/devfreq/Kconfig
> +++ b/drivers/devfreq/Kconfig
> @@ -103,6 +103,16 @@ config ARM_IMX8M_DDRC_DEVFREQ
>  	  This adds the DEVFREQ driver for the i.MX8M DDR Controller. It allows
>  	  adjusting DRAM frequency.
>  
> +config ARM_MT8183_CCI_DEVFREQ
> +	tristate "MT8183 CCI DEVFREQ Driver"
> +	depends on ARM_MEDIATEK_CPUFREQ
> +	help
> +		This adds a devfreq driver for Cache Coherent Interconnect
> +		of Mediatek MT8183, which is shared the same regulator
> +		with cpu cluster.
> +		It can track buck voltage and update a proper cci frequency.

s/cci/CCI

> +		Use notification to get regulator status.
> +
>  config ARM_TEGRA_DEVFREQ
>  	tristate "NVIDIA Tegra30/114/124/210 DEVFREQ Driver"
>  	depends on ARCH_TEGRA_3x_SOC || ARCH_TEGRA_114_SOC || \
> diff --git a/drivers/devfreq/Makefile b/drivers/devfreq/Makefile
> index 3eb4d5e6635c..5b1b670c954d 100644
> --- a/drivers/devfreq/Makefile
> +++ b/drivers/devfreq/Makefile
> @@ -10,6 +10,7 @@ obj-$(CONFIG_DEVFREQ_GOV_PASSIVE)	+= governor_passive.o
>  # DEVFREQ Drivers
>  obj-$(CONFIG_ARM_EXYNOS_BUS_DEVFREQ)	+= exynos-bus.o
>  obj-$(CONFIG_ARM_IMX8M_DDRC_DEVFREQ)	+= imx8m-ddrc.o
> +obj-$(CONFIG_ARM_MT8183_CCI_DEVFREQ)	+= mt8183-cci-devfreq.o
>  obj-$(CONFIG_ARM_RK3399_DMC_DEVFREQ)	+= rk3399_dmc.o
>  obj-$(CONFIG_ARM_TEGRA_DEVFREQ)		+= tegra30-devfreq.o
>  obj-$(CONFIG_ARM_TEGRA20_DEVFREQ)	+= tegra20-devfreq.o
> diff --git a/drivers/devfreq/mt8183-cci-devfreq.c b/drivers/devfreq/mt8183-cci-devfreq.c
> new file mode 100644
> index 000000000000..cd7929a83bf8
> --- /dev/null
> +++ b/drivers/devfreq/mt8183-cci-devfreq.c
> @@ -0,0 +1,206 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2019 MediaTek Inc.

s/2019/2020

> +
> + * Author: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> + */
> +
> +#include <linux/clk.h>
> +#include <linux/devfreq.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include <linux/regulator/consumer.h>
> +#include <linux/time.h>
> +
> +#include "governor.h"

It is not needed. Please remove it.

> +
> +#define MAX_VOLT_LIMIT		(1150000)
> +
> +struct cci_devfreq {
> +	struct devfreq *devfreq;
> +	struct regulator *proc_reg;

'proc' means the 'processor'?
Instead of 'proc_reg', you better to use 'cpu_reg'.

> +	struct clk *cci_clk;
> +	int old_vproc;
> +	unsigned long old_freq;
> +};
> +
> +static int mtk_cci_set_voltage(struct cci_devfreq *cci_df, int vproc)
> +{
> +	int ret;
> +
> +	ret = regulator_set_voltage(cci_df->proc_reg, vproc,
> +				    MAX_VOLT_LIMIT);
> +	if (!ret)
> +		cci_df->old_vproc = vproc;
> +	return ret;
> +}
> +
> +static int mtk_cci_devfreq_target(struct device *dev, unsigned long *freq,
> +				  u32 flags)
> +{
> +	int ret;
> +	struct cci_devfreq *cci_df = dev_get_drvdata(dev);
> +	struct dev_pm_opp *opp;
> +	unsigned long opp_rate, opp_voltage, old_voltage;
> +
> +	if (!cci_df)
> +		return -EINVAL;
> +
> +	if (cci_df->old_freq == *freq)
> +		return 0;
> +
> +	opp_rate = *freq;
> +	opp = dev_pm_opp_find_freq_floor(dev, &opp_rate);
> +	opp_voltage = dev_pm_opp_get_voltage(opp);
> +	dev_pm_opp_put(opp);


You can use the helper function for getting the rate 
with devfreq_recommended_opp(). You can refer the following code
in drivers/devfreq/exynos-bus.c

	opp = devfreq_recommended_opp(dev, freq, flags);
	if (IS_ERR(opp)) {
		dev_err(dev, "failed to get recommended opp instance\n");
		return PTR_ERR(opp);
	}
	dev_pm_opp_put(opp);

> +
> +	old_voltage = cci_df->old_vproc;
> +	if (old_voltage == 0)
> +		old_voltage = regulator_get_voltage(cci_df->proc_reg);
> +
> +	// scale up: set voltage first then freq
> +	if (opp_voltage > old_voltage) {
> +		ret = mtk_cci_set_voltage(cci_df, opp_voltage);
> +		if (ret) {
> +			pr_err("cci: failed to scale up voltage\n");
> +			return ret;
> +		}
> +	}
> +
> +	ret = clk_set_rate(cci_df->cci_clk, *freq);
> +	if (ret) {
> +		pr_err("%s: failed cci to set rate: %d\n", __func__,
> +		       ret);
> +		mtk_cci_set_voltage(cci_df, old_voltage);
> +		return ret;
> +	}
> +
> +	// scale down: set freq first then voltage
> +	if (opp_voltage < old_voltage) {
> +		ret = mtk_cci_set_voltage(cci_df, opp_voltage);
> +		if (ret) {
> +			pr_err("cci: failed to scale down voltage\n");
> +			clk_set_rate(cci_df->cci_clk, cci_df->old_freq);
> +			return ret;
> +		}
> +	}


I recommend that dev_pm_opp_set_rate() and dev_pm_opp_set_regulator()
instead of 'clk_set_rate' and 'regulator_set_voltage'.
In the dev_pm_opp_set_rate(), handle the these sequence.
You can refer the merged patch[1].

[1] commit 4294a779bd8dff6c65e7e85ffe7a1ea236e92a68
- PM / devfreq: exynos-bus: Convert to use dev_pm_opp_set_rate()


> +
> +	cci_df->old_freq = *freq;
> +
> +	return 0;
> +}
> +
> +static struct devfreq_dev_profile cci_devfreq_profile = {
> +	.target = mtk_cci_devfreq_target,

Need to add '.exit' for calling dev_pm_opp_of_remove_table().
You can refer the merged devfreq patches like exynos_bus.c, imx-bus.c.

> +};
> +
> +static int mtk_cci_devfreq_probe(struct platform_device *pdev)
> +{
> +	struct device *cci_dev = &pdev->dev;
> +	struct cci_devfreq *cci_df;
> +	struct devfreq_passive_data *passive_data;
> +	int ret;
> +
> +	cci_df = devm_kzalloc(cci_dev, sizeof(*cci_df), GFP_KERNEL);
> +	if (!cci_df)
> +		return -ENOMEM;
> +
> +	cci_df->cci_clk = devm_clk_get(cci_dev, "cci_clock");
> +	ret = PTR_ERR_OR_ZERO(cci_df->cci_clk);
> +	if (ret) {
> +		if (ret != -EPROBE_DEFER)
> +			dev_err(cci_dev, "failed to get clock for CCI: %d\n",
> +				ret);
> +		return ret;
> +	}
> +	cci_df->proc_reg = devm_regulator_get_optional(cci_dev, "proc");
> +	ret = PTR_ERR_OR_ZERO(cci_df->proc_reg);
> +	if (ret) {
> +		if (ret != -EPROBE_DEFER)
> +			dev_err(cci_dev, "failed to get regulator for CCI: %d\n",
> +				ret);
> +		return ret;
> +	}

I recommend that use dev_pm_opp_set_regulators.
You can refer the merged patch[1].

[1] commit 4294a779bd8dff6c65e7e85ffe7a1ea236e92a68
- PM / devfreq: exynos-bus: Convert to use dev_pm_opp_set_rate()


> +	ret = regulator_enable(cci_df->proc_reg);
> +	if (ret) {
> +		pr_warn("enable buck for cci fail\n");

Use dev_err instead of 'pr_warn'.

> +		return ret;
> +	}
> +
> +	ret = dev_pm_opp_of_add_table(cci_dev);
> +	if (ret) {
> +		dev_err(cci_dev, "Fail to init CCI OPP table: %d\n", ret);

How about changing the error log as following
because in this driver, use the 'failed to' sentence for error handling?

	failed to get OPP table for CCI:L %d

> +		return ret;
> +	}
> +
> +	platform_set_drvdata(pdev, cci_df);
> +
> +	passive_data = devm_kzalloc(cci_dev, sizeof(*passive_data), GFP_KERNEL);
> +	if (!passive_data)
> +		return -ENOMEM;

On this error case, you have to call dev_pm_opp_of_remove_table().
You better to make the 'err_opp' jump lable and then add 'goto err_opp'.

> +
> +	passive_data->parent_type = CPUFREQ_PARENT_DEV;
> +
> +	cci_df->devfreq = devm_devfreq_add_device(cci_dev,
> +						  &cci_devfreq_profile,
> +						  DEVFREQ_GOV_PASSIVE,
> +						  passive_data);
> +	if (IS_ERR(cci_df->devfreq)) {
> +		ret = PTR_ERR(cci_df->devfreq);
> +		dev_err(cci_dev, "cannot create cci devfreq device:%d\n", ret);
> +		dev_pm_opp_of_remove_table(cci_dev);

Instead of direct call, use 'goto err_opp'.

> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int mtk_cci_devfreq_remove(struct platform_device *pdev)
> +{
> +	struct device *cci_dev = &pdev->dev;
> +	struct cci_devfreq *cci_df;
> +	struct notifier_block *opp_nb;
> +
> +	cci_df = platform_get_drvdata(pdev);
> +	opp_nb = &cci_df->opp_nb;
> +
> +	dev_pm_opp_unregister_notifier(cci_dev, opp_nb);

This patch doesn't call the dev_pm_opp_register_notifier.
Please remove it.

> +	devm_devfreq_remove_device(cci_dev, cci_df->devfreq);

Don't need to call this function because you used devm_devfreq_add_device().

> +	dev_pm_opp_of_remove_table(cci_dev)> +	regulator_disable(cci_df->proc_reg);
> +
> +	return 0;
> +}
> +
> +static const __maybe_unused struct of_device_id
> +	mediatek_cci_devfreq_of_match[] = {

Make it on one line and remove '__maybe_unused' keyword.
- mediatek_cci_devfreq_of_match-> mediatek_cci_of_match

> +	{ .compatible = "mediatek,mt8183-cci" },
> +	{ },
> +};
> +MODULE_DEVICE_TABLE(of, mediatek_cci_devfreq_of_match);

ditto.

> +
> +static struct platform_driver cci_devfreq_driver = {
> +	.probe	= mtk_cci_devfreq_probe,
> +	.remove	= mtk_cci_devfreq_remove,
> +	.driver = {
> +		.name = "mediatek-cci-devfreq",
> +		.of_match_table = of_match_ptr(mediatek_cci_devfreq_of_match),

ditto.

> +	},
> +};
> +
> +static int __init mtk_cci_devfreq_init(void)
> +{
> +	return platform_driver_register(&cci_devfreq_driver);
> +}
> +module_init(mtk_cci_devfreq_init)
> +
> +static void __exit mtk_cci_devfreq_exit(void)
> +{
> +	platform_driver_unregister(&cci_devfreq_driver);
> +}
> +module_exit(mtk_cci_devfreq_exit)

Use 'module_platform_driver' instead of module_init and module_exit.

> +
> +MODULE_DESCRIPTION("Mediatek CCI devfreq driver");
> +MODULE_AUTHOR("Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>");
> +MODULE_LICENSE("GPL v2");
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 08/12] dt-bindings: devfreq: add compatible for mt8183 cci devfreq
       [not found]     ` <20200520034307.20435-9-andrew-sh.cheng@mediatek.com>
@ 2020-05-28  7:42       ` Chanwoo Choi
  0 siblings, 0 replies; 14+ messages in thread
From: Chanwoo Choi @ 2020-05-28  7:42 UTC (permalink / raw)
  To: Andrew-sh.Cheng, MyungJoo Ham, Kyungmin Park, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream

Hi,

On 5/20/20 12:43 PM, Andrew-sh.Cheng wrote:
> This adds dt-binding documentation of cci devfreq
> for Mediatek MT8183 SoC platform.
> 
> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> ---
>  .../devicetree/bindings/devfreq/mt8183-cci.yaml    | 51 ++++++++++++++++++++++
>  1 file changed, 51 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml
> 
> diff --git a/Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml b/Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml
> new file mode 100644
> index 000000000000..a7341fd94097
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml
> @@ -0,0 +1,51 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: https://protect2.fireeye.com/url?k=33f1f15d-6e23ea05-33f07a12-0cc47a31c8b4-91b3f8aeecce95dc&q=1&u=http%3A%2F%2Fdevicetree.org%2Fschemas%2Fdevfreq%2Fmt8183-cci.yaml%23
> +$schema: https://protect2.fireeye.com/url?k=fc7d9089-a1af8bd1-fc7c1bc6-0cc47a31c8b4-b46f5afc59faf86d&q=1&u=http%3A%2F%2Fdevicetree.org%2Fmeta-schemas%2Fcore.yaml%23
> +
> +title: CCI_DEVFREQ driver for MT8183.
> +
> +maintainers:
> +  - Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> +
> +description: |
> +  This module is used to create CCI DEVFREQ.
> +  The performance will depend on both CCI frequency and CPU frequency.
> +  For MT8183, CCI co-buck with Little core.
> +  Contain CCI opp table for voltage and frequency scaling.
> +
> +properties:
> +  compatible:
> +    const: "mediatek,mt8183-cci"
> +
> +  clocks:
> +    maxItems: 1
> +
> +  clock-names:
> +    const: "cci"
> +
> +  operating-points-v2: true
> +  opp-table: true
> +
> +  proc-supply:
> +    description:
> +      Phandle of the regulator that provides the supply voltage.
> +
> +required:
> +  - compatible
> +  - clocks
> +  - clock-names
> +  - proc-supply
> +
> +examples:
> +  - |
> +    #include <dt-bindings/clock/mt8183-clk.h>
> +    cci: cci {
> +      compatible = "mediatek,mt8183-cci";
> +      clocks = <&apmixedsys CLK_APMIXED_CCIPLL>;
> +      clock-names = "cci";
> +      operating-points-v2 = <&cci_opp>;
> +      proc-supply = <&mt6358_vproc12_reg>;
> +    };
> +
> 

I recommend that add the more detailed example
with OPP table with CPU node.


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 09/12] devfreq: add mediatek cci devfreq
  2020-05-28  7:35     ` Chanwoo Choi
@ 2020-05-28  8:00       ` Chanwoo Choi
  0 siblings, 0 replies; 14+ messages in thread
From: Chanwoo Choi @ 2020-05-28  8:00 UTC (permalink / raw)
  To: Andrew-sh.Cheng, MyungJoo Ham, Kyungmin Park, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream

Hi Andrew-sh.Cheng,

On 5/28/20 4:35 PM, Chanwoo Choi wrote:
> Hi Andrew-sh.Cheng,
> 
> On 5/20/20 12:43 PM, Andrew-sh.Cheng wrote:
>> This adds a devfreq driver for the Cache Coherent Interconnect (CCI)
>> of the Mediatek MT8183.
>>
>> On the MT8183 the CCI is supplied by the same regulator as the LITTLE
>> cores. The driver is notified when the regulator voltage changes
>> (driven by cpufreq) and adjusts the CCI frequency to the maximum
>> possible value.

I understood that the mt8183-cci.c and mt8183 cpufreq driver (ARM_MEDIATEK_CPUFREQ)
shared the same regulator. So, when mt8183 cpufreq driver
changes the CPU frequency and voltage, the mt8183-cci.c
changes the CCI frequency according to the new mt8183 frequency
with passive governor. 

I think that mt8183-cci.c don't need to change the voltage
because already mt8183 cpufreq changed the voltage of shared regulator.
Why do you change the voltage in this driver?

>>
>> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
>> ---
>>  drivers/devfreq/Kconfig              |  10 ++
>>  drivers/devfreq/Makefile             |   1 +
>>  drivers/devfreq/mt8183-cci-devfreq.c | 206 +++++++++++++++++++++++++++++++++++
> 
> The mt8183-cci.c is enough for driver name.
> 
>>  3 files changed, 217 insertions(+)
>>  create mode 100644 drivers/devfreq/mt8183-cci-devfreq.c
>>
>> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
>> index d9067950af6a..4ed7116271ee 100644
>> --- a/drivers/devfreq/Kconfig
>> +++ b/drivers/devfreq/Kconfig
>> @@ -103,6 +103,16 @@ config ARM_IMX8M_DDRC_DEVFREQ
>>  	  This adds the DEVFREQ driver for the i.MX8M DDR Controller. It allows
>>  	  adjusting DRAM frequency.
>>  
>> +config ARM_MT8183_CCI_DEVFREQ
>> +	tristate "MT8183 CCI DEVFREQ Driver"
>> +	depends on ARM_MEDIATEK_CPUFREQ
>> +	help
>> +		This adds a devfreq driver for Cache Coherent Interconnect
>> +		of Mediatek MT8183, which is shared the same regulator
>> +		with cpu cluster.
>> +		It can track buck voltage and update a proper cci frequency.
> 
> s/cci/CCI
> 
>> +		Use notification to get regulator status.
>> +
>>  config ARM_TEGRA_DEVFREQ
>>  	tristate "NVIDIA Tegra30/114/124/210 DEVFREQ Driver"
>>  	depends on ARCH_TEGRA_3x_SOC || ARCH_TEGRA_114_SOC || \
>> diff --git a/drivers/devfreq/Makefile b/drivers/devfreq/Makefile
>> index 3eb4d5e6635c..5b1b670c954d 100644
>> --- a/drivers/devfreq/Makefile
>> +++ b/drivers/devfreq/Makefile
>> @@ -10,6 +10,7 @@ obj-$(CONFIG_DEVFREQ_GOV_PASSIVE)	+= governor_passive.o
>>  # DEVFREQ Drivers
>>  obj-$(CONFIG_ARM_EXYNOS_BUS_DEVFREQ)	+= exynos-bus.o
>>  obj-$(CONFIG_ARM_IMX8M_DDRC_DEVFREQ)	+= imx8m-ddrc.o
>> +obj-$(CONFIG_ARM_MT8183_CCI_DEVFREQ)	+= mt8183-cci-devfreq.o
>>  obj-$(CONFIG_ARM_RK3399_DMC_DEVFREQ)	+= rk3399_dmc.o
>>  obj-$(CONFIG_ARM_TEGRA_DEVFREQ)		+= tegra30-devfreq.o
>>  obj-$(CONFIG_ARM_TEGRA20_DEVFREQ)	+= tegra20-devfreq.o
>> diff --git a/drivers/devfreq/mt8183-cci-devfreq.c b/drivers/devfreq/mt8183-cci-devfreq.c
>> new file mode 100644
>> index 000000000000..cd7929a83bf8
>> --- /dev/null
>> +++ b/drivers/devfreq/mt8183-cci-devfreq.c
>> @@ -0,0 +1,206 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Copyright (c) 2019 MediaTek Inc.
> 
> s/2019/2020
> 
>> +
>> + * Author: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
>> + */
>> +
>> +#include <linux/clk.h>
>> +#include <linux/devfreq.h>
>> +#include <linux/module.h>
>> +#include <linux/of.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/regulator/consumer.h>
>> +#include <linux/time.h>
>> +
>> +#include "governor.h"
> 
> It is not needed. Please remove it.
> 
>> +
>> +#define MAX_VOLT_LIMIT		(1150000)
>> +
>> +struct cci_devfreq {
>> +	struct devfreq *devfreq;
>> +	struct regulator *proc_reg;
> 
> 'proc' means the 'processor'?
> Instead of 'proc_reg', you better to use 'cpu_reg'.
> 
>> +	struct clk *cci_clk;
>> +	int old_vproc;
>> +	unsigned long old_freq;
>> +};
>> +
>> +static int mtk_cci_set_voltage(struct cci_devfreq *cci_df, int vproc)
>> +{
>> +	int ret;
>> +
>> +	ret = regulator_set_voltage(cci_df->proc_reg, vproc,
>> +				    MAX_VOLT_LIMIT);
>> +	if (!ret)
>> +		cci_df->old_vproc = vproc;
>> +	return ret;
>> +}
>> +
>> +static int mtk_cci_devfreq_target(struct device *dev, unsigned long *freq,
>> +				  u32 flags)
>> +{
>> +	int ret;
>> +	struct cci_devfreq *cci_df = dev_get_drvdata(dev);
>> +	struct dev_pm_opp *opp;
>> +	unsigned long opp_rate, opp_voltage, old_voltage;
>> +
>> +	if (!cci_df)
>> +		return -EINVAL;
>> +
>> +	if (cci_df->old_freq == *freq)
>> +		return 0;
>> +
>> +	opp_rate = *freq;
>> +	opp = dev_pm_opp_find_freq_floor(dev, &opp_rate);
>> +	opp_voltage = dev_pm_opp_get_voltage(opp);
>> +	dev_pm_opp_put(opp);
> 
> 
> You can use the helper function for getting the rate 
> with devfreq_recommended_opp(). You can refer the following code
> in drivers/devfreq/exynos-bus.c
> 
> 	opp = devfreq_recommended_opp(dev, freq, flags);
> 	if (IS_ERR(opp)) {
> 		dev_err(dev, "failed to get recommended opp instance\n");
> 		return PTR_ERR(opp);
> 	}
> 	dev_pm_opp_put(opp);
> 
>> +
>> +	old_voltage = cci_df->old_vproc;
>> +	if (old_voltage == 0)
>> +		old_voltage = regulator_get_voltage(cci_df->proc_reg);
>> +
>> +	// scale up: set voltage first then freq
>> +	if (opp_voltage > old_voltage) {
>> +		ret = mtk_cci_set_voltage(cci_df, opp_voltage);
>> +		if (ret) {
>> +			pr_err("cci: failed to scale up voltage\n");
>> +			return ret;
>> +		}
>> +	}
>> +
>> +	ret = clk_set_rate(cci_df->cci_clk, *freq);
>> +	if (ret) {
>> +		pr_err("%s: failed cci to set rate: %d\n", __func__,
>> +		       ret);
>> +		mtk_cci_set_voltage(cci_df, old_voltage);
>> +		return ret;
>> +	}
>> +
>> +	// scale down: set freq first then voltage
>> +	if (opp_voltage < old_voltage) {
>> +		ret = mtk_cci_set_voltage(cci_df, opp_voltage);
>> +		if (ret) {
>> +			pr_err("cci: failed to scale down voltage\n");
>> +			clk_set_rate(cci_df->cci_clk, cci_df->old_freq);
>> +			return ret;
>> +		}
>> +	}
> 
> 
> I recommend that dev_pm_opp_set_rate() and dev_pm_opp_set_regulator()
> instead of 'clk_set_rate' and 'regulator_set_voltage'.
> In the dev_pm_opp_set_rate(), handle the these sequence.
> You can refer the merged patch[1].
> 
> [1] commit 4294a779bd8dff6c65e7e85ffe7a1ea236e92a68
> - PM / devfreq: exynos-bus: Convert to use dev_pm_opp_set_rate()
> 
> 
>> +
>> +	cci_df->old_freq = *freq;
>> +
>> +	return 0;
>> +}
>> +
>> +static struct devfreq_dev_profile cci_devfreq_profile = {
>> +	.target = mtk_cci_devfreq_target,
> 
> Need to add '.exit' for calling dev_pm_opp_of_remove_table().
> You can refer the merged devfreq patches like exynos_bus.c, imx-bus.c.
> 
>> +};
>> +
>> +static int mtk_cci_devfreq_probe(struct platform_device *pdev)
>> +{
>> +	struct device *cci_dev = &pdev->dev;
>> +	struct cci_devfreq *cci_df;
>> +	struct devfreq_passive_data *passive_data;
>> +	int ret;
>> +
>> +	cci_df = devm_kzalloc(cci_dev, sizeof(*cci_df), GFP_KERNEL);
>> +	if (!cci_df)
>> +		return -ENOMEM;
>> +
>> +	cci_df->cci_clk = devm_clk_get(cci_dev, "cci_clock");
>> +	ret = PTR_ERR_OR_ZERO(cci_df->cci_clk);
>> +	if (ret) {
>> +		if (ret != -EPROBE_DEFER)
>> +			dev_err(cci_dev, "failed to get clock for CCI: %d\n",
>> +				ret);
>> +		return ret;
>> +	}
>> +	cci_df->proc_reg = devm_regulator_get_optional(cci_dev, "proc");
>> +	ret = PTR_ERR_OR_ZERO(cci_df->proc_reg);
>> +	if (ret) {
>> +		if (ret != -EPROBE_DEFER)
>> +			dev_err(cci_dev, "failed to get regulator for CCI: %d\n",
>> +				ret);
>> +		return ret;
>> +	}
> 
> I recommend that use dev_pm_opp_set_regulators.
> You can refer the merged patch[1].
> 
> [1] commit 4294a779bd8dff6c65e7e85ffe7a1ea236e92a68
> - PM / devfreq: exynos-bus: Convert to use dev_pm_opp_set_rate()
> 
> 
>> +	ret = regulator_enable(cci_df->proc_reg);
>> +	if (ret) {
>> +		pr_warn("enable buck for cci fail\n");
> 
> Use dev_err instead of 'pr_warn'.
> 
>> +		return ret;
>> +	}
>> +
>> +	ret = dev_pm_opp_of_add_table(cci_dev);
>> +	if (ret) {
>> +		dev_err(cci_dev, "Fail to init CCI OPP table: %d\n", ret);
> 
> How about changing the error log as following
> because in this driver, use the 'failed to' sentence for error handling?
> 
> 	failed to get OPP table for CCI:L %d
> 
>> +		return ret;
>> +	}
>> +
>> +	platform_set_drvdata(pdev, cci_df);
>> +
>> +	passive_data = devm_kzalloc(cci_dev, sizeof(*passive_data), GFP_KERNEL);
>> +	if (!passive_data)
>> +		return -ENOMEM;
> 
> On this error case, you have to call dev_pm_opp_of_remove_table().
> You better to make the 'err_opp' jump lable and then add 'goto err_opp'.
> 
>> +
>> +	passive_data->parent_type = CPUFREQ_PARENT_DEV;
>> +
>> +	cci_df->devfreq = devm_devfreq_add_device(cci_dev,
>> +						  &cci_devfreq_profile,
>> +						  DEVFREQ_GOV_PASSIVE,
>> +						  passive_data);
>> +	if (IS_ERR(cci_df->devfreq)) {
>> +		ret = PTR_ERR(cci_df->devfreq);
>> +		dev_err(cci_dev, "cannot create cci devfreq device:%d\n", ret);
>> +		dev_pm_opp_of_remove_table(cci_dev);
> 
> Instead of direct call, use 'goto err_opp'.
> 
>> +		return ret;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int mtk_cci_devfreq_remove(struct platform_device *pdev)
>> +{
>> +	struct device *cci_dev = &pdev->dev;
>> +	struct cci_devfreq *cci_df;
>> +	struct notifier_block *opp_nb;
>> +
>> +	cci_df = platform_get_drvdata(pdev);
>> +	opp_nb = &cci_df->opp_nb;
>> +
>> +	dev_pm_opp_unregister_notifier(cci_dev, opp_nb);
> 
> This patch doesn't call the dev_pm_opp_register_notifier.
> Please remove it.
> 
>> +	devm_devfreq_remove_device(cci_dev, cci_df->devfreq);
> 
> Don't need to call this function because you used devm_devfreq_add_device().
> 
>> +	dev_pm_opp_of_remove_table(cci_dev)> +	regulator_disable(cci_df->proc_reg);
>> +
>> +	return 0;
>> +}
>> +
>> +static const __maybe_unused struct of_device_id
>> +	mediatek_cci_devfreq_of_match[] = {
> 
> Make it on one line and remove '__maybe_unused' keyword.
> - mediatek_cci_devfreq_of_match-> mediatek_cci_of_match
> 
>> +	{ .compatible = "mediatek,mt8183-cci" },
>> +	{ },
>> +};
>> +MODULE_DEVICE_TABLE(of, mediatek_cci_devfreq_of_match);
> 
> ditto.
> 
>> +
>> +static struct platform_driver cci_devfreq_driver = {
>> +	.probe	= mtk_cci_devfreq_probe,
>> +	.remove	= mtk_cci_devfreq_remove,
>> +	.driver = {
>> +		.name = "mediatek-cci-devfreq",
>> +		.of_match_table = of_match_ptr(mediatek_cci_devfreq_of_match),
> 
> ditto.
> 
>> +	},
>> +};
>> +
>> +static int __init mtk_cci_devfreq_init(void)
>> +{
>> +	return platform_driver_register(&cci_devfreq_driver);
>> +}
>> +module_init(mtk_cci_devfreq_init)
>> +
>> +static void __exit mtk_cci_devfreq_exit(void)
>> +{
>> +	platform_driver_unregister(&cci_devfreq_driver);
>> +}
>> +module_exit(mtk_cci_devfreq_exit)
> 
> Use 'module_platform_driver' instead of module_init and module_exit.
> 
>> +
>> +MODULE_DESCRIPTION("Mediatek CCI devfreq driver");
>> +MODULE_AUTHOR("Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>");
>> +MODULE_LICENSE("GPL v2");
>>
> 
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 06/12] PM / devfreq: Add cpu based scaling support to passive_governor
       [not found]         ` <1591098190.30729.15.camel@mtksdaap41>
@ 2020-06-03  4:07           ` Chanwoo Choi
  0 siblings, 0 replies; 14+ messages in thread
From: Chanwoo Choi @ 2020-06-03  4:07 UTC (permalink / raw)
  To: andrew-sh.cheng
  Cc: MyungJoo Ham, Kyungmin Park, Rob Herring, Mark Rutland,
	Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown,
	devicetree, srv_heupstream, linux-pm, linux-kernel,
	linux-mediatek, Sibi Sankar, linux-arm-kernel

Hi Andrew-sh.Cheng,

Do you know that why cannot show the patches sent from you on mailing list?

Even if you sent them to linux-pm mailing list, I cannot find
your patches on linux-pm's patchwork[1] and others.
[1] https://patchwork.kernel.org/project/linux-pm/list/

Could you find you patch on mailing list?
Do you use git send-email when you send these patches?

I used the thunderbird tool and gmail for reading the patches.
When I tried to read the original source of this patch,
it looks like that the body of patch is encoded.
I cannot read the plain text of patch body.
- When gmail, use 'Show original'
- When thunderbird, use 'More -> View Source'

If I'm missing something to check this patch,
please let me know. I'll fix my environment.
It is strange situation on my case.


On 6/2/20 8:43 PM, andrew-sh.cheng wrote:
> On Thu, 2020-05-28 at 15:14 +0900, Chanwoo Choi wrote:
>> Hi Andrew-sh.Cheng,
>>
>> Thanks for your posting. I like this approach absolutely.
>> I think that it is necessary. When I developed the embedded product,
>> I needed this feature always. 
>>
>> I add the comments on below.
>>
>>
>> And the following email is not valid. So, I dropped this email
>> from Cc list.
>> Saravana Kannan <skannan@codeaurora.org>
>>
>>
>> On 5/20/20 12:43 PM, Andrew-sh.Cheng wrote:
>>> From: Saravana Kannan <skannan@codeaurora.org>
>>>
>>> Many CPU architectures have caches that can scale independent of the
>>> CPUs. Frequency scaling of the caches is necessary to make sure that the
>>> cache is not a performance bottleneck that leads to poor performance and
>>> power. The same idea applies for RAM/DDR.
>>>
>>> To achieve this, this patch adds support for cpu based scaling to the
>>> passive governor. This is accomplished by taking the current frequency
>>> of each CPU frequency domain and then adjust the frequency of the cache
>>> (or any devfreq device) based on the frequency of the CPUs. It listens
>>> to CPU frequency transition notifiers to keep itself up to date on the
>>> current CPU frequency.
>>>
>>> To decide the frequency of the device, the governor does one of the
>>> following:
>>> * Derives the optimal devfreq device opp from required-opps property of
>>>   the parent cpu opp_table.
>>>
>>> * Scales the device frequency in proportion to the CPU frequency. So, if
>>>   the CPUs are running at their max frequency, the device runs at its
>>>   max frequency. If the CPUs are running at their min frequency, the
>>>   device runs at its min frequency. It is interpolated for frequencies
>>>   in between.
>>>
>>> Andrew-sh.Cheng change
>>> dev_pm_opp_xlate_opp to dev_pm_opp_xlate_required_opp devfreq->max_freq
>>> to devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value
>>> for kernel-5.7
>>>
>>> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
>>> [Sibi: Integrated cpu-freqmap governor into passive_governor]
>>> Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
>>> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
>>> ---
>>>  drivers/devfreq/Kconfig            |   2 +
>>>  drivers/devfreq/governor_passive.c | 278 ++++++++++++++++++++++++++++++++++---
>>>  include/linux/devfreq.h            |  40 +++++-
>>>  3 files changed, 299 insertions(+), 21 deletions(-)
>>>
>>> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
>>> index 0b1df12e0f21..d9067950af6a 100644
>>> --- a/drivers/devfreq/Kconfig
>>> +++ b/drivers/devfreq/Kconfig
>>> @@ -73,6 +73,8 @@ config DEVFREQ_GOV_PASSIVE
>>>  	  device. This governor does not change the frequency by itself
>>>  	  through sysfs entries. The passive governor recommends that
>>>  	  devfreq device uses the OPP table to get the frequency/voltage.
>>> +	  Alternatively the governor can also be chosen to scale based on
>>> +	  the online CPUs current frequency.
>>>  
>>>  comment "DEVFREQ Drivers"
>>>  
>>> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
>>> index 2d67d6c12dce..7dcda02a5bb7 100644
>>> --- a/drivers/devfreq/governor_passive.c
>>> +++ b/drivers/devfreq/governor_passive.c
>>> @@ -8,11 +8,89 @@
>>>   */
>>>  
>>>  #include <linux/module.h>
>>> +#include <linux/cpu.h>
>>> +#include <linux/cpufreq.h>
>>> +#include <linux/cpumask.h>
>>>  #include <linux/device.h>
>>>  #include <linux/devfreq.h>
>>> +#include <linux/slab.h>
>>>  #include "governor.h"
>>>  
>>> -static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>> +static unsigned int xlate_cpufreq_to_devfreq(struct devfreq_passive_data *data,
>>
>> Need to change 'unsigned int' to 'unsigned long'
> Get it.

If you add the blank line before/after of your reply,
it is better to catch your reply. Please add the blank line for me.

>> .
>>
>>> +					     unsigned int cpu)
>>> +{
>>> +	unsigned int cpu_min, cpu_max, dev_min, dev_max, cpu_percent, max_state;
>>
>> Better to define them separately as following and then need to rename
>> the variable. Usually, use the 'min_freq' and 'max_freq' word for
>> the minimum/maximum frequency.
>>
>> 	unsigned int cpu_min_freq, cpu_max_freq, cpu_curr_freq, cpu_percent;
>> 	unsigned long dev_min_freq, dev_max_freq, dev_max_state,
>>
>> The devfreq used 'unsigned long'. The cpufreq used 'unsigned long'
>> and 'unsigned int'. You need to handle them properly.
> Get it.
> For cpu_freq, I separate it into "unsigned long cpu_curr_freq" and
> "unsigned int cpu_curr_freq_khz"
>>
>>
>>> +	struct devfreq_cpu_state *cpu_state = data->cpu_state[cpu];
>>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
>>> +	unsigned long *freq_table = devfreq->profile->freq_table;
>>
>> In this function, use 'cpu' work for cpufreq and use 'dev' for devfreq.
>> So, I think 'dev_freq_table' is proper name instead of 'freq_table'
>> for the readability.
>>
>> 	freq_table -> dev_freq_table
>>
>>> +	struct dev_pm_opp *opp = NULL, *cpu_opp = NULL;
>>
>> In the get_target_freq_with_devfreq(), use 'p_opp' indicating
>> the OPP of parent device. For the consistency, I think that
>> use 'p_opp' instead of 'cpu_opp'. 
>>
>>> +	unsigned long cpu_freq, freq;
>>
>> Define the 'cpu_freq' on above with cpu_min_freq/cpu_max_freq definition.
>> 	cpu_freq -> cpu_curr_freq.
> Get it.
> Will modify them for readability.
>>
>>> +
>>> +	if (!cpu_state || cpu_state->first_cpu != cpu ||
>>> +	    !cpu_state->opp_table || !devfreq->opp_table)
>>> +		return 0;
>>> +
>>> +	cpu_freq = cpu_state->freq * 1000;
>>> +	cpu_opp = devfreq_recommended_opp(cpu_state->dev, &cpu_freq, 0);
>>> +	if (IS_ERR(cpu_opp))
>>> +		return 0;
>>> +
>>> +	opp = dev_pm_opp_xlate_required_opp(cpu_state->opp_table,
>>> +					    devfreq->opp_table, cpu_opp);
>>> +	dev_pm_opp_put(cpu_opp);
>>> +
>>> +	if (!IS_ERR(opp)) {
>>> +		freq = dev_pm_opp_get_freq(opp);
>>> +		dev_pm_opp_put(opp);
>>
>> Better to add the 'out' goto statement.
>> If you use 'goto out', you can reduce the one indentation
>> without 'else' statement.
> Get it.
>> 	
>>
>>> +	} else {
>>
>> As I commented, when dev_pm_opp_xlate_required_opp() return successfully
>> , use 'goto out'. We can remove 'else' and then reduce the unneeded indentation.
>>
>>
>>> +		/* Use Interpolation if required opps is not available */
>>> +		cpu_min = cpu_state->min_freq;
>>> +		cpu_max = cpu_state->max_freq;
>>> +		cpu_freq = cpu_state->freq;
>>> +
>>> +		if (freq_table) {
>>> +			/* Get minimum frequency according to sorting order */
>>> +			max_state = freq_table[devfreq->profile->max_state - 1];
>>> +			if (freq_table[0] < max_state) {
>>> +				dev_min = freq_table[0];
>>> +				dev_max = max_state;
>>> +			} else {
>>> +				dev_min = max_state;
>>> +				dev_max = freq_table[0];
>>> +			}
>>> +		} else {
>>> +			if (devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value
>>> +			    <= devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value)
>>> +				return 0;
>>> +			dev_min =
>>> +			devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value;
>>> +			dev_max =
>>> +			devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value;
>>
>> I think it is not proper to access the variable of pm_qos structure directly.
>> Instead of direct access, you have to use the exported PM QoS function such as
>> - pm_qos_read_value(devfreq->dev.parent, DEV_PM_QOS_MIN_FREQUENCY);
>> - pm_qos_read_value(devfreq->dev.parent, DEV_PM_QOS_MAX_FREQUENCY);
> Get it.
>>
>>> +		}
>>> +		cpu_percent = ((cpu_freq - cpu_min) * 100) / cpu_max - cpu_min;
>>> +		freq = dev_min + mult_frac(dev_max - dev_min, cpu_percent, 100);
>>> +	}
>>
>>
>> I think that you better to add 'out' jump label as following:
>>
>> out:
>>
>>> +
>>> +	return freq;
>>> +}
>>> +
>>> +static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
>>> +					unsigned long *freq)
>>> +{
>>> +	struct devfreq_passive_data *p_data =
>>> +				(struct devfreq_passive_data *)devfreq->data;
>>> +	unsigned int cpu, target_freq = 0;
>>
>> Need to define 'target_freq' with 'unsigned long' type.
> Get it.
>>
>>> +
>>> +	for_each_online_cpu(cpu)
>>> +		target_freq = max(target_freq,
>>> +				  xlate_cpufreq_to_devfreq(p_data, cpu));
>>> +
>>> +	*freq = target_freq;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int get_target_freq_with_devfreq(struct devfreq *devfreq,
>>>  					unsigned long *freq)
>>>  {
>>>  	struct devfreq_passive_data *p_data
>>> @@ -23,16 +101,6 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>>  	int i, count, ret = 0;
>>>  
>>>  	/*
>>> -	 * If the devfreq device with passive governor has the specific method
>>> -	 * to determine the next frequency, should use the get_target_freq()
>>> -	 * of struct devfreq_passive_data.
>>> -	 */
>>> -	if (p_data->get_target_freq) {
>>> -		ret = p_data->get_target_freq(devfreq, freq);
>>> -		goto out;
>>> -	}
>>> -
>>> -	/*
>>>  	 * If the parent and passive devfreq device uses the OPP table,
>>>  	 * get the next frequency by using the OPP table.
>>>  	 */
>>> @@ -102,6 +170,37 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>>  	return ret;
>>>  }
>>>  
>>> +static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>> +					   unsigned long *freq)
>>> +{
>>> +	struct devfreq_passive_data *p_data =
>>> +				(struct devfreq_passive_data *)devfreq->data;
>>> +	int ret;
>>> +
>>> +	/*
>>> +	 * If the devfreq device with passive governor has the specific method
>>> +	 * to determine the next frequency, should use the get_target_freq()
>>> +	 * of struct devfreq_passive_data.
>>> +	 */
>>> +	if (p_data->get_target_freq)
>>> +		return p_data->get_target_freq(devfreq, freq);
>>> +
>>> +	switch (p_data->parent_type) {
>>> +	case DEVFREQ_PARENT_DEV:
>>> +		ret = get_target_freq_with_devfreq(devfreq, freq);
>>> +		break;
>>> +	case CPUFREQ_PARENT_DEV:
>>> +		ret = get_target_freq_with_cpufreq(devfreq, freq);
>>> +		break;
>>> +	default:
>>> +		ret = -EINVAL;
>>> +		dev_err(&devfreq->dev, "Invalid parent type\n");
>>> +		break;
>>> +	}
>>> +
>>> +	return ret;
>>> +}
>>> +
>>>  static int update_devfreq_passive(struct devfreq *devfreq, unsigned long freq)
>>>  {
>>>  	int ret;
>>> @@ -156,6 +255,140 @@ static int devfreq_passive_notifier_call(struct notifier_block *nb,
>>>  	return NOTIFY_DONE;
>>>  }
>>>  
>>> +static int cpufreq_passive_notifier_call(struct notifier_block *nb,
>>> +					 unsigned long event, void *ptr)
>>> +{
>>> +	struct devfreq_passive_data *data =
>>> +			container_of(nb, struct devfreq_passive_data, nb);
>>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
>>> +	struct devfreq_cpu_state *cpu_state;
>>> +	struct cpufreq_freqs *freq = ptr;
>>
>> How about changing 'freq' to 'cpu_freqs'?
>>
>> In the drivers/cpufreq/cpufreq.c, use 'freqs' name indicating
>> the instance of 'struct cpufreq_freqs'. And in order to
>> identfy, how about adding 'cpu_' prefix for variable name?
>>
>>> +	unsigned int current_freq;
>>
>> Need to define curr_freq with 'unsigned long' type
>> and better to use 'curr_freq' variable name.
> It is good to change current_freq to curr_freq, but why should it us
> 'unsigned long'?
> I think it is 'unsigned int'.

I think that 'curr_freq' is proper. Yes, it is 'unsigned int'.
When you changing the cpu frequency to device frequency,
recommend to handle them between unsigned int and unsigned long.

>>
>>> +	int ret;
>>> +
>>> +	if (event != CPUFREQ_POSTCHANGE || !freq ||
>>> +	    !data->cpu_state[freq->policy->cpu])
>>> +		return 0;
>>> +
>>> +	cpu_state = data->cpu_state[freq->policy->cpu];
>>> +	if (cpu_state->freq == freq->new)
>>> +		return 0;
>>> +
>>> +	/* Backup current freq and pre-update cpu state freq*/
>>> +	current_freq = cpu_state->freq;
>>> +	cpu_state->freq = freq->new;
>>> +
>>> +	mutex_lock(&devfreq->lock);
>>> +	ret = update_devfreq(devfreq);
>>> +	mutex_unlock(&devfreq->lock);
>>> +	if (ret) {
>>> +		cpu_state->freq = current_freq;
>>> +		dev_err(&devfreq->dev, "Couldn't update the frequency.\n");
>>> +		return ret;
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int cpufreq_passive_register(struct devfreq_passive_data **p_data)
>>> +{
>>> +	struct devfreq_passive_data *data = *p_data;
>>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
>>> +	struct device *dev = devfreq->dev.parent;
>>> +	struct opp_table *opp_table = NULL;
>>> +	struct devfreq_cpu_state *state;
>>
>> For the readability, I thinkt 'cpu_state' is proper instead of 'state'.
> Get it.
>>
>>> +	struct cpufreq_policy *policy;
>>> +	struct device *cpu_dev;
>>> +	unsigned int cpu;
>>> +	int ret;
>>> +
>>> +	get_online_cpus();
>>
>> Add blank line.
> Get it.
>>
>>> +	data->nb.notifier_call = cpufreq_passive_notifier_call;
>>> +	ret = cpufreq_register_notifier(&data->nb,
>>> +					CPUFREQ_TRANSITION_NOTIFIER);
>>> +	if (ret) {
>>> +		dev_err(dev, "Couldn't register cpufreq notifier.\n");
>>> +		data->nb.notifier_call = NULL;
>>> +		goto out;
>>> +	}
>>> +
>>> +	/* Populate devfreq_cpu_state */
>>> +	for_each_online_cpu(cpu) {
>>> +		if (data->cpu_state[cpu])
>>> +			continue;
>>> +
>>> +		policy = cpufreq_cpu_get(cpu);
>>
>> cpufreq_cpu_get() might return 'NULL'. I think you need to handle
>> return value as following:
>>
>> 		if (!policy) {
>> 			ret = -EINVAL;
>> 			goto out;
>> 		} else if (PTR_ERR(policy) == -EPROBE_DEFER) {
>> 			goto out;
>> 		} else if (IS_ERR(policy) {
>> 			ret = PTR_ERR(policy);
>> 			dev_err(dev, "Couldn't get the cpufreq_poliy.\n");
>> 			goto out;
>> 		}
>>
>> If cpufreq_cpu_get() return successfully, to do next.
>> It reduces the one indentaion.
>>
>>
> Get it.
>>
>>> +		if (policy) {
>>> +			state = kzalloc(sizeof(*state), GFP_KERNEL);
>>> +			if (!state) {
>>> +				ret = -ENOMEM;
>>> +				goto out;
>>> +			}
>>> +
>>> +			cpu_dev = get_cpu_device(cpu);
>>> +			if (!cpu_dev) {
>>> +				dev_err(dev, "Couldn't get cpu device.\n");
>>> +				ret = -ENODEV;
>>> +				goto out;
>>> +			}
>>> +
>>> +			opp_table = dev_pm_opp_get_opp_table(cpu_dev);
>>> +			if (IS_ERR(devfreq->opp_table)) {
>>> +				ret = PTR_ERR(opp_table);
>>> +				goto out;
>>> +			}
>>> +
>>> +			state->dev = cpu_dev;
>>> +			state->opp_table = opp_table;
>>> +			state->first_cpu = cpumask_first(policy->related_cpus);
>>> +			state->freq = policy->cur;
>>> +			state->min_freq = policy->cpuinfo.min_freq;
>>> +			state->max_freq = policy->cpuinfo.max_freq;
>>> +			data->cpu_state[cpu] = state;
>>
>> Add blank line.
>>
>>> +			cpufreq_cpu_put(policy);
>>> +		} else {
>>> +			ret = -EPROBE_DEFER;
>>> +			goto out;
>>> +		}
>>> +	}
>>
>> Add blank line.
> Get it.
>>> +out:
>>> +	put_online_cpus();
>>> +	if (ret)
>>> +		return ret;
>>> +
>>> +	/* Update devfreq */
>>> +	mutex_lock(&devfreq->lock);
>>> +	ret = update_devfreq(devfreq);
>>> +	mutex_unlock(&devfreq->lock);
>>> +	if (ret)
>>> +		dev_err(dev, "Couldn't update the frequency.\n");
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +static int cpufreq_passive_unregister(struct devfreq_passive_data **p_data)
>>> +{
>>> +	struct devfreq_passive_data *data = *p_data;
>>> +	struct devfreq_cpu_state *cpu_state;
>>> +	int cpu;
>>> +
>>> +	if (data->nb.notifier_call)
>>> +		cpufreq_unregister_notifier(&data->nb,
>>> +					    CPUFREQ_TRANSITION_NOTIFIER);
>>> +
>>> +	for_each_possible_cpu(cpu) {
>>> +		cpu_state = data->cpu_state[cpu];
>>> +		if (cpu_state) {
>>> +			if (cpu_state->opp_table)
>>> +				dev_pm_opp_put_opp_table(cpu_state->opp_table);
>>> +			kfree(cpu_state);
>>> +			cpu_state = NULL;
>>> +		}
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>>  static int devfreq_passive_event_handler(struct devfreq *devfreq,
>>>  				unsigned int event, void *data)
>>>  {
>>> @@ -165,7 +398,7 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
>>>  	struct notifier_block *nb = &p_data->nb;
>>>  	int ret = 0;
>>>  
>>> -	if (!parent)
>>> +	if (p_data->parent_type == DEVFREQ_PARENT_DEV && !parent)
>>>  		return -EPROBE_DEFER;
>>
>> If you modify the devfreq_passive_event_handler() as following,
>> you can move this condition for DEVFREQ_PARENT_DEV into 
>> (register|unregister)_parent_dev_notifier.
>>
>> 	switch (event) {                                                                                  
>> 	case DEVFREQ_GOV_START:                                               
>> 		ret = register_parent_dev_notifier(p_data);
>> 		break;
>> 	case DEVFREQ_GOV_STOP:                                             
>> 		ret = unregister_parent_dev_notifier(p_data);
>> 		break;
>> 	default: 
>> 		ret = -EINVAL;
>> 		break;
>> 	}
>>                                                                                               
>> 	return ret;
>>
> Get it.
>>>  
>>>  	switch (event) {
>>> @@ -173,13 +406,24 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
>>>  		if (!p_data->this)
>>>  			p_data->this = devfreq;
>>>  
>>> -		nb->notifier_call = devfreq_passive_notifier_call;
>>> -		ret = devfreq_register_notifier(parent, nb,
>>> -					DEVFREQ_TRANSITION_NOTIFIER);
>>> +		if (p_data->parent_type == DEVFREQ_PARENT_DEV) {
>>> +			nb->notifier_call = devfreq_passive_notifier_call;
>>> +			ret = devfreq_register_notifier(parent, nb,
>>> +						DEVFREQ_TRANSITION_NOTIFIER);
>>> +		} else if (p_data->parent_type == CPUFREQ_PARENT_DEV) {
>>> +			ret = cpufreq_passive_register(&p_data);
>>
>> I think that we better to collect the code related to notifier registration
>> into one function like devfreq_pass_register_notifier() instead of
>> cpufreq_passive_register() as following: I think it is more simple and readable.
>>
>> If you have more proper function name of register_parent_dev_notifier,
>> please give your opinion.
>>
>> 	int register_parent_dev_notifier(struct devfreq_passive_data **p_data)
>> 		switch (p_data->parent_type) {
>> 		case DEVFREQ_PARENT_DEV:
>> 			nb->notifier_call = devfreq_passive_notifier_call;
>> 			ret = devfreq_register_notifier(parent, nb,
>> 			break;
>> 		case CPUFREQ_PARENT_DEV:
>> 			cpufreq_register_notifier(...)
>> 			...
>> 			break;
>> 		}
> Not fully understanding.
> Do you mean expanding cpufreq_passive_register()?

Yes and rename it for both cpufreq and devfreq.

> I think leave it in function will be with clean for this code segment.

I want that one function handle the notifier register
for both cpufreq and devfreq so that we make it more simply as following:
On the step hanling the governor event, don't need to consider
the type of parent device of devfreq deivce with this style.

	case DEVFREQ_GOV_START:
		ret = register_notifier(...);
		break;
	case DEVFREQ_GOV_STOP:
		ret = unregister_notifier(...);
		break;

> 
>> 		
>>
>>> +		} else {
>>> +			ret = -EINVAL;
>>> +		}
>>>  		break;
>>>  	case DEVFREQ_GOV_STOP:
>>> -		WARN_ON(devfreq_unregister_notifier(parent, nb,
>>> -					DEVFREQ_TRANSITION_NOTIFIER));
>>> +		if (p_data->parent_type == DEVFREQ_PARENT_DEV)
>>> +			WARN_ON(devfreq_unregister_notifier(parent, nb,
>>> +						DEVFREQ_TRANSITION_NOTIFIER));
>>> +		else if (p_data->parent_type == CPUFREQ_PARENT_DEV)
>>> +			cpufreq_passive_unregister(&p_data);
>>> +		else
>>> +			ret = -EINVAL;
>>
>> ditto. unregister_parent_dev_notifier(struct devfreq_passive_data **p_data)
> Get it.

ditto. As I aboved commented.

>>
>>>  		break;
>>>  	default:
>>>  		break;
>>> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
>>> index a4b19d593151..04ce576fd6f1 100644
>>> --- a/include/linux/devfreq.h
>>> +++ b/include/linux/devfreq.h
>>> @@ -278,6 +278,32 @@ struct devfreq_simple_ondemand_data {
>>>  
>>>  #if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
>>>  /**
>>> + * struct devfreq_cpu_state - holds the per-cpu state
>>> + * @freq:	the current frequency of the cpu.
>>> + * @min_freq:	the min frequency of the cpu.
>>> + * @max_freq:	the max frequency of the cpu.
>>> + * @first_cpu:	the cpumask of the first cpu of a policy.
>>> + * @dev:	reference to cpu device.
>>> + * @opp_table:	reference to cpu opp table.
>>> + *
>>> + * This structure stores the required cpu_state of a cpu.
>>> + * This is auto-populated by the governor.
>>> + */
>>> +struct devfreq_cpu_state {> +	unsigned int freq;
>>
>> It is better to change from 'freq' to 'curr_freq'
>> for more correct expression.
> Get it.
>>
>>> +	unsigned int min_freq;
>>> +	unsigned int max_freq;
>>> +	unsigned int first_cpu;
>>> +	struct device *dev;
>>
>> How about changing the name 'dev' to 'cpu_dev'?
> Okay.
>>
>>
>>> +	struct opp_table *opp_table;
>>> +};
>>
>> devfreq_cpu_state is only handled by within driver/devfreq/governor_passive.c.
>>
>> So, you can move it into drivers/devfreq/governor_passive.c
>> and just add the definition into include/linux/devfreq.h as following:
>> It is able to prevent the access of variable of 'struct devfreq_cpu_state'
>> outside.
>>
>> 	struct devfreq_cpu_state;
> Get it.
>>
>>> +
>>> +enum devfreq_parent_dev_type {
>>> +	DEVFREQ_PARENT_DEV,
>>> +	CPUFREQ_PARENT_DEV,
>>> +};
>>> +
>>> +/**
>>>   * struct devfreq_passive_data - ``void *data`` fed to struct devfreq
>>>   *	and devfreq_add_device
>>>   * @parent:	the devfreq instance of parent device.
>>> @@ -288,13 +314,15 @@ struct devfreq_simple_ondemand_data {
>>>   *			using governors except for passive governor.
>>>   *			If the devfreq device has the specific method to decide
>>>   *			the next frequency, should use this callback.
>>> - * @this:	the devfreq instance of own device.
>>> - * @nb:		the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
>>> + * @parent_type		parent type of the device
>>
>> Need to add ':' at the end of word. -> "parent_type:".
>>
>>> + * @this:		the devfreq instance of own device.
>>> + * @nb:			the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
>>
>> I knew that you make them with same indentation.
>> But, actually, it is not related to this patch like clean-up code.
>> Even if it is not pretty, you better to don't touch 'this' and 'nb' indentaion.
> Get it.
>>
>>> + * @cpu_state:		the state min/max/current frequency of all online cpu's
>>>   *
>>>   * The devfreq_passive_data have to set the devfreq instance of parent
>>>   * device with governors except for the passive governor. But, don't need to
>>> - * initialize the 'this' and 'nb' field because the devfreq core will handle
>>> - * them.
>>> + * initialize the 'this', 'nb' and 'cpu_state' field because the devfreq core
>>> + * will handle them.
>>>   */
>>>  struct devfreq_passive_data {
>>>  	/* Should set the devfreq instance of parent device */
>>> @@ -303,9 +331,13 @@ struct devfreq_passive_data {
>>>  	/* Optional callback to decide the next frequency of passvice device */
>>>  	int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
>>>  
>>> +	/* Should set the type of parent device */
>>> +	enum devfreq_parent_dev_type parent_type;
>>> +
>>>  	/* For passive governor's internal use. Don't need to set them */
>>>  	struct devfreq *this;
>>>  	struct notifier_block nb;
>>> +	struct devfreq_cpu_state *cpu_state[NR_CPUS];
>>>  };
>>>  #endif
>>>  
>>>
>>
>>
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 06/12] PM / devfreq: Add cpu based scaling support to passive_governor
       [not found]           ` <1591100614.1804.1.camel@mtksdaap41>
@ 2020-06-03  4:12             ` Chanwoo Choi
  0 siblings, 0 replies; 14+ messages in thread
From: Chanwoo Choi @ 2020-06-03  4:12 UTC (permalink / raw)
  To: andrew-sh.cheng
  Cc: MyungJoo Ham, Kyungmin Park, Rob Herring, Mark Rutland,
	Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown,
	devicetree, srv_heupstream, linux-pm, linux-kernel,
	linux-mediatek, Sibi Sankar, linux-arm-kernel

Hi Andrew-sh.Cheng,

On 6/2/20 9:23 PM, andrew-sh.cheng wrote:
> On Thu, 2020-05-28 at 16:17 +0900, Chanwoo Choi wrote:
>> Hi Andrew-sh.Cheng,
>>
>> The exynos-bus.c used the passive governor.
>> Even if don't make the problem because DEVFREQ_PARENT_DEV is zero,
>> you need to initialize the parent_type with DEVFREQ_PARENT_DEV as following:
>>
>> diff --git a/drivers/devfreq/exynos-bus.c b/drivers/devfreq/exynos-bus.c
>> index 8fa8eb541373..1c71c47bc2ac 100644
>> --- a/drivers/devfreq/exynos-bus.c
>> +++ b/drivers/devfreq/exynos-bus.c
>> @@ -369,6 +369,7 @@ static int exynos_bus_profile_init_passive(struct exynos_bus *bus,
>>                 return -ENOMEM;
>>  
>>         passive_data->parent = parent_devfreq;
>> +       passive_data->parent_type = DEVFREQ_PARENT_DEV;
>>  
>>         /* Add devfreq device for exynos bus with passive governor */
>>         bus->devfreq = devm_devfreq_add_device(dev, profile, DEVFREQ_GOV_PASSIVE,
> Hi Chanwoo Choi,
> Do you just remind me to initialize it to DEVFREQ_PARENT_DEV whn use
> this governor?

Yes. This change was not included in this patchset.

> I will do it and thank you for reminding.

Thanks.

(snip)


And, this patchset doesn't include the dt-binding example
and any real example in devicetree. If possible, I recommend
you better to update dt-binding document with example.

-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, back to index

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CGME20200520034324epcas1p3affbd24bd1f3fe40d51baade07c1abba@epcas1p3.samsung.com>
     [not found] ` <20200520034307.20435-1-andrew-sh.cheng@mediatek.com>
2020-05-20  4:10   ` [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support Chanwoo Choi
     [not found]     ` <1589953015.8243.2.camel@mtksdaap41>
2020-05-20  6:24       ` Chanwoo Choi
     [not found]         ` <1589958625.23971.2.camel@mtksdaap41>
2020-05-20 14:53           ` Matthias Brugger
     [not found]   ` <20200520034307.20435-10-andrew-sh.cheng@mediatek.com>
2020-05-20 12:31     ` [PATCH 09/12] devfreq: add mediatek cci devfreq Mark Brown
2020-05-21  8:52       ` andrew-sh.cheng
2020-05-28  7:35     ` Chanwoo Choi
2020-05-28  8:00       ` Chanwoo Choi
     [not found]   ` <20200520034307.20435-2-andrew-sh.cheng@mediatek.com>
2020-05-20 14:54     ` [PATCH 01/12] OPP: Allow required-opps even if the device doesn't have power-domains Matthias Brugger
     [not found]   ` <CGME20200520034335epcas1p45a321a1a878fb7cd7b9c9ada0a474ef7@epcas1p4.samsung.com>
     [not found]     ` <20200520034307.20435-7-andrew-sh.cheng@mediatek.com>
2020-05-28  5:03       ` [PATCH 06/12] PM / devfreq: Add cpu based scaling support to passive_governor Chanwoo Choi
2020-05-28  6:14       ` Chanwoo Choi
2020-05-28  7:17         ` Chanwoo Choi
     [not found]           ` <1591100614.1804.1.camel@mtksdaap41>
2020-06-03  4:12             ` Chanwoo Choi
     [not found]         ` <1591098190.30729.15.camel@mtksdaap41>
2020-06-03  4:07           ` Chanwoo Choi
     [not found]   ` <CGME20200520034339epcas1p1524dea2d7089cb3492384bbe917dcffe@epcas1p1.samsung.com>
     [not found]     ` <20200520034307.20435-9-andrew-sh.cheng@mediatek.com>
2020-05-28  7:42       ` [PATCH 08/12] dt-bindings: devfreq: add compatible for mt8183 cci devfreq Chanwoo Choi

Linux-PM Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-pm/0 linux-pm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-pm linux-pm/ https://lore.kernel.org/linux-pm \
		linux-pm@vger.kernel.org
	public-inbox-index linux-pm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-pm


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git