linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1] arch_topology: Adjust initial CPU capacities with current freq
@ 2020-01-09  7:52 Jeffy Chen
  2020-01-10 11:37 ` Sudeep Holla
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Jeffy Chen @ 2020-01-09  7:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: Brian Norris, Marc Zyngier, Douglas Anderson, Robin Murphy,
	Heiko Stuebner, Jeffy Chen, Greg Kroah-Hartman, Sudeep Holla,
	Rafael J. Wysocki

The CPU freqs are not supposed to change before cpufreq policies
properly registered, meaning that they should be used to calculate the
initial CPU capacities.

Doing this helps choosing the best CPU during early boot, especially
for the initramfs decompressing.

Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
---
When testing the rk3399-evb(arm64 big.LITTLE platform), i saw a noticable
slow-down in early boot:
[    0.280176] hw-breakpoint: found 6 breakpoint and 4 watchpoint registers.
[    3.564412] vcc_sys: supplied by dc_12v
[    3.637176] iommu: Default domain type: Translated
[    3.830721] SCSI subsystem initialized

That is because the big cores are running at a low freq(from bootrom), but
been chosen based on the larger initial cpu capacity.

 drivers/base/arch_topology.c | 31 ++++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 1eb81f113786..d62fe1d4dda9 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -94,7 +94,7 @@ static void update_topology_flags_workfn(struct work_struct *work)
 	update_topology = 0;
 }
 
-static u32 capacity_scale;
+static DEFINE_PER_CPU(u32, max_freq);
 static u32 *raw_capacity;
 
 static int free_raw_capacity(void)
@@ -108,16 +108,22 @@ static int free_raw_capacity(void)
 void topology_normalize_cpu_scale(void)
 {
 	u64 capacity;
+	u64 capacity_scale;
 	int cpu;
 
 	if (!raw_capacity)
 		return;
 
-	pr_debug("cpu_capacity: capacity_scale=%u\n", capacity_scale);
+	capacity_scale = 1;
 	for_each_possible_cpu(cpu) {
-		pr_debug("cpu_capacity: cpu=%d raw_capacity=%u\n",
-			 cpu, raw_capacity[cpu]);
-		capacity = (raw_capacity[cpu] << SCHED_CAPACITY_SHIFT)
+		capacity = raw_capacity[cpu] * per_cpu(max_freq, cpu);
+		capacity_scale = max(capacity, capacity_scale);
+	}
+
+	pr_debug("cpu_capacity: capacity_scale=%llu\n", capacity_scale);
+	for_each_possible_cpu(cpu) {
+		capacity = raw_capacity[cpu] * per_cpu(max_freq, cpu);
+		capacity = (capacity << SCHED_CAPACITY_SHIFT)
 			/ capacity_scale;
 		topology_set_cpu_scale(cpu, capacity);
 		pr_debug("cpu_capacity: CPU%d cpu_capacity=%lu\n",
@@ -127,6 +133,7 @@ void topology_normalize_cpu_scale(void)
 
 bool __init topology_parse_cpu_capacity(struct device_node *cpu_node, int cpu)
 {
+	struct clk *cpu_clk;
 	static bool cap_parsing_failed;
 	int ret;
 	u32 cpu_capacity;
@@ -146,10 +153,15 @@ bool __init topology_parse_cpu_capacity(struct device_node *cpu_node, int cpu)
 				return false;
 			}
 		}
-		capacity_scale = max(cpu_capacity, capacity_scale);
 		raw_capacity[cpu] = cpu_capacity;
 		pr_debug("cpu_capacity: %pOF cpu_capacity=%u (raw)\n",
 			cpu_node, raw_capacity[cpu]);
+
+		cpu_clk = of_clk_get(cpu_node, 0);
+		if (!PTR_ERR_OR_ZERO(cpu_clk))
+			per_cpu(max_freq, cpu) = clk_get_rate(cpu_clk) / 1000;
+
+		clk_put(cpu_clk);
 	} else {
 		if (raw_capacity) {
 			pr_err("cpu_capacity: missing %pOF raw capacity\n",
@@ -188,11 +200,8 @@ init_cpu_capacity_callback(struct notifier_block *nb,
 
 	cpumask_andnot(cpus_to_visit, cpus_to_visit, policy->related_cpus);
 
-	for_each_cpu(cpu, policy->related_cpus) {
-		raw_capacity[cpu] = topology_get_cpu_scale(cpu) *
-				    policy->cpuinfo.max_freq / 1000UL;
-		capacity_scale = max(raw_capacity[cpu], capacity_scale);
-	}
+	for_each_cpu(cpu, policy->related_cpus)
+		per_cpu(max_freq, cpu) = policy->cpuinfo.max_freq / 1000;
 
 	if (cpumask_empty(cpus_to_visit)) {
 		topology_normalize_cpu_scale();
-- 
2.11.0




^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v1] arch_topology: Adjust initial CPU capacities with current freq
  2020-01-09  7:52 [PATCH v1] arch_topology: Adjust initial CPU capacities with current freq Jeffy Chen
@ 2020-01-10 11:37 ` Sudeep Holla
  2020-01-10 11:54   ` Robin Murphy
  2020-01-10 12:01   ` Dietmar Eggemann
  2020-01-10 14:03 ` kbuild test robot
  2020-01-10 14:45 ` kbuild test robot
  2 siblings, 2 replies; 10+ messages in thread
From: Sudeep Holla @ 2020-01-10 11:37 UTC (permalink / raw)
  To: Jeffy Chen
  Cc: linux-kernel, Brian Norris, Marc Zyngier, Douglas Anderson,
	Robin Murphy, Heiko Stuebner, Sudeep Holla, Greg Kroah-Hartman,
	Rafael J. Wysocki

On Thu, Jan 09, 2020 at 03:52:14PM +0800, Jeffy Chen wrote:
> The CPU freqs are not supposed to change before cpufreq policies
> properly registered, meaning that they should be used to calculate the
> initial CPU capacities.
> 
> Doing this helps choosing the best CPU during early boot, especially
> for the initramfs decompressing.
> 
> Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>

[...]

> @@ -146,10 +153,15 @@ bool __init topology_parse_cpu_capacity(struct device_node *cpu_node, int cpu)
>  				return false;
>  			}
>  		}
> -		capacity_scale = max(cpu_capacity, capacity_scale);
>  		raw_capacity[cpu] = cpu_capacity;
>  		pr_debug("cpu_capacity: %pOF cpu_capacity=%u (raw)\n",
>  			cpu_node, raw_capacity[cpu]);
> +
> +		cpu_clk = of_clk_get(cpu_node, 0);
> +		if (!PTR_ERR_OR_ZERO(cpu_clk))
> +			per_cpu(max_freq, cpu) = clk_get_rate(cpu_clk) / 1000;
> +
> +		clk_put(cpu_clk);

I don't like to assume DVFS to be supplied only using 'clk'. So NACK!
We have other non-clk mechanism for CPU DVFS and this needs to simply
use cpufreq APIs to get frequency value if required.

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1] arch_topology: Adjust initial CPU capacities with current freq
  2020-01-10 11:37 ` Sudeep Holla
@ 2020-01-10 11:54   ` Robin Murphy
  2020-01-10 12:01   ` Dietmar Eggemann
  1 sibling, 0 replies; 10+ messages in thread
From: Robin Murphy @ 2020-01-10 11:54 UTC (permalink / raw)
  To: Sudeep Holla, Jeffy Chen
  Cc: linux-kernel, Brian Norris, Marc Zyngier, Douglas Anderson,
	Heiko Stuebner, Greg Kroah-Hartman, Rafael J. Wysocki

On 2020-01-10 11:37 am, Sudeep Holla wrote:
> On Thu, Jan 09, 2020 at 03:52:14PM +0800, Jeffy Chen wrote:
>> The CPU freqs are not supposed to change before cpufreq policies
>> properly registered, meaning that they should be used to calculate the
>> initial CPU capacities.
>>
>> Doing this helps choosing the best CPU during early boot, especially
>> for the initramfs decompressing.
>>
>> Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
> 
> [...]
> 
>> @@ -146,10 +153,15 @@ bool __init topology_parse_cpu_capacity(struct device_node *cpu_node, int cpu)
>>   				return false;
>>   			}
>>   		}
>> -		capacity_scale = max(cpu_capacity, capacity_scale);
>>   		raw_capacity[cpu] = cpu_capacity;
>>   		pr_debug("cpu_capacity: %pOF cpu_capacity=%u (raw)\n",
>>   			cpu_node, raw_capacity[cpu]);
>> +
>> +		cpu_clk = of_clk_get(cpu_node, 0);
>> +		if (!PTR_ERR_OR_ZERO(cpu_clk))
>> +			per_cpu(max_freq, cpu) = clk_get_rate(cpu_clk) / 1000;
>> +
>> +		clk_put(cpu_clk);
> 
> I don't like to assume DVFS to be supplied only using 'clk'. So NACK!
> We have other non-clk mechanism for CPU DVFS and this needs to simply
> use cpufreq APIs to get frequency value if required.

...but in this case, as soon as cpufreq is ready the problem is gone 
anyway, because it sees the big cluster's clock rate is way out-of-spec 
and bumps it up to a sane OPP.

It really is unfortunate that so many RK3399 images out there are using 
the broken firmware combination that manages to miss out the boot-time 
PLL setup altogether.

Robin.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1] arch_topology: Adjust initial CPU capacities with current freq
  2020-01-10 11:37 ` Sudeep Holla
  2020-01-10 11:54   ` Robin Murphy
@ 2020-01-10 12:01   ` Dietmar Eggemann
  2020-01-10 12:28     ` Robin Murphy
  1 sibling, 1 reply; 10+ messages in thread
From: Dietmar Eggemann @ 2020-01-10 12:01 UTC (permalink / raw)
  To: Sudeep Holla, Jeffy Chen
  Cc: linux-kernel, Brian Norris, Marc Zyngier, Douglas Anderson,
	Robin Murphy, Heiko Stuebner, Greg Kroah-Hartman,
	Rafael J. Wysocki

On 10/01/2020 12:37, Sudeep Holla wrote:
> On Thu, Jan 09, 2020 at 03:52:14PM +0800, Jeffy Chen wrote:
>> The CPU freqs are not supposed to change before cpufreq policies
>> properly registered, meaning that they should be used to calculate the
>> initial CPU capacities.
>>
>> Doing this helps choosing the best CPU during early boot, especially
>> for the initramfs decompressing.
>>
>> Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
> 
> [...]
> 
>> @@ -146,10 +153,15 @@ bool __init topology_parse_cpu_capacity(struct device_node *cpu_node, int cpu)
>>  				return false;
>>  			}
>>  		}
>> -		capacity_scale = max(cpu_capacity, capacity_scale);
>>  		raw_capacity[cpu] = cpu_capacity;
>>  		pr_debug("cpu_capacity: %pOF cpu_capacity=%u (raw)\n",
>>  			cpu_node, raw_capacity[cpu]);
>> +
>> +		cpu_clk = of_clk_get(cpu_node, 0);
>> +		if (!PTR_ERR_OR_ZERO(cpu_clk))
>> +			per_cpu(max_freq, cpu) = clk_get_rate(cpu_clk) / 1000;
>> +
>> +		clk_put(cpu_clk);
> 
> I don't like to assume DVFS to be supplied only using 'clk'. So NACK!
> We have other non-clk mechanism for CPU DVFS and this needs to simply
> use cpufreq APIs to get frequency value if required.

To support this, it's failing on my Arm64 Juno board.

...
[    0.084858] CPU1 cpu_clk=-517
[    0.087961] CPU2 cpu_clk=-517
[    0.091005] CPU0 cpu_clk=-517
[    0.094121] CPU3 cpu_clk=-517
[    0.097248] CPU4 cpu_clk=-517
[    0.100415] CPU5 cpu_clk=-517
...

Since you're on a big.LITTLE platform, did you specify
'capacity-dmips-mhz' for CPUs to be able to distinguish big and little
CPUs before CPUfreq kicks in?

$ grep capacity-dmips-mhz ./arch/arm64/boot/dts/arm/juno.dts
			capacity-dmips-mhz = <1024>;
			capacity-dmips-mhz = <1024>;
			capacity-dmips-mhz = <578>;
			capacity-dmips-mhz = <578>;
			capacity-dmips-mhz = <578>;
			capacity-dmips-mhz = <578>;

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1] arch_topology: Adjust initial CPU capacities with current freq
  2020-01-10 12:01   ` Dietmar Eggemann
@ 2020-01-10 12:28     ` Robin Murphy
  2020-01-11  2:51       ` JeffyChen
  0 siblings, 1 reply; 10+ messages in thread
From: Robin Murphy @ 2020-01-10 12:28 UTC (permalink / raw)
  To: Dietmar Eggemann, Sudeep Holla, Jeffy Chen
  Cc: linux-kernel, Brian Norris, Marc Zyngier, Douglas Anderson,
	Heiko Stuebner, Greg Kroah-Hartman, Rafael J. Wysocki

On 2020-01-10 12:01 pm, Dietmar Eggemann wrote:
> On 10/01/2020 12:37, Sudeep Holla wrote:
>> On Thu, Jan 09, 2020 at 03:52:14PM +0800, Jeffy Chen wrote:
>>> The CPU freqs are not supposed to change before cpufreq policies
>>> properly registered, meaning that they should be used to calculate the
>>> initial CPU capacities.
>>>
>>> Doing this helps choosing the best CPU during early boot, especially
>>> for the initramfs decompressing.
>>>
>>> Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
>>
>> [...]
>>
>>> @@ -146,10 +153,15 @@ bool __init topology_parse_cpu_capacity(struct device_node *cpu_node, int cpu)
>>>   				return false;
>>>   			}
>>>   		}
>>> -		capacity_scale = max(cpu_capacity, capacity_scale);
>>>   		raw_capacity[cpu] = cpu_capacity;
>>>   		pr_debug("cpu_capacity: %pOF cpu_capacity=%u (raw)\n",
>>>   			cpu_node, raw_capacity[cpu]);
>>> +
>>> +		cpu_clk = of_clk_get(cpu_node, 0);
>>> +		if (!PTR_ERR_OR_ZERO(cpu_clk))
>>> +			per_cpu(max_freq, cpu) = clk_get_rate(cpu_clk) / 1000;
>>> +
>>> +		clk_put(cpu_clk);
>>
>> I don't like to assume DVFS to be supplied only using 'clk'. So NACK!
>> We have other non-clk mechanism for CPU DVFS and this needs to simply
>> use cpufreq APIs to get frequency value if required.
> 
> To support this, it's failing on my Arm64 Juno board.
> 
> ...
> [    0.084858] CPU1 cpu_clk=-517
> [    0.087961] CPU2 cpu_clk=-517
> [    0.091005] CPU0 cpu_clk=-517
> [    0.094121] CPU3 cpu_clk=-517
> [    0.097248] CPU4 cpu_clk=-517
> [    0.100415] CPU5 cpu_clk=-517
> ...
> 
> Since you're on a big.LITTLE platform, did you specify
> 'capacity-dmips-mhz' for CPUs to be able to distinguish big and little
> CPUs before CPUfreq kicks in?

Indeed, and that's the "problem" - the capacities are there, but with 
the broken firmware the kernel starts with the little (boot) cluster 
clocked at either 400 or 200MHz, but the big cluster at just 12MHz. At 
that speed, a full distro config can take about 3 minutes to get to the 
point of loading cpufreq as a module, and I've seen at least one distro 
reverting 97df3aa76b4a to 'fix' the symptom :(

Robin.

> 
> $ grep capacity-dmips-mhz ./arch/arm64/boot/dts/arm/juno.dts
> 			capacity-dmips-mhz = <1024>;
> 			capacity-dmips-mhz = <1024>;
> 			capacity-dmips-mhz = <578>;
> 			capacity-dmips-mhz = <578>;
> 			capacity-dmips-mhz = <578>;
> 			capacity-dmips-mhz = <578>;
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1] arch_topology: Adjust initial CPU capacities with current freq
  2020-01-09  7:52 [PATCH v1] arch_topology: Adjust initial CPU capacities with current freq Jeffy Chen
  2020-01-10 11:37 ` Sudeep Holla
@ 2020-01-10 14:03 ` kbuild test robot
  2020-01-10 14:45 ` kbuild test robot
  2 siblings, 0 replies; 10+ messages in thread
From: kbuild test robot @ 2020-01-10 14:03 UTC (permalink / raw)
  To: Jeffy Chen
  Cc: kbuild-all, linux-kernel, Brian Norris, Marc Zyngier,
	Douglas Anderson, Robin Murphy, Heiko Stuebner, Jeffy Chen,
	Greg Kroah-Hartman, Sudeep Holla, Rafael J. Wysocki

[-- Attachment #1: Type: text/plain, Size: 1462 bytes --]

Hi Jeffy,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on driver-core/driver-core-testing]
[also build test ERROR on linux/master linus/master v5.5-rc5 next-20200108]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Jeffy-Chen/arch_topology-Adjust-initial-CPU-capacities-with-current-freq/20200110-083308
base:   https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git adc92dd4550ee038a9794eae1c05d88721a3a737
config: riscv-rv32_defconfig (attached as .config)
compiler: riscv32-linux-gcc (GCC) 7.5.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.5.0 make.cross ARCH=riscv 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   drivers/base/arch_topology.o: In function `.L19':
>> arch_topology.c:(.text+0x1a2): undefined reference to `__udivdi3'

---
0-DAY kernel test infrastructure                 Open Source Technology Center
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 18412 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1] arch_topology: Adjust initial CPU capacities with current freq
  2020-01-09  7:52 [PATCH v1] arch_topology: Adjust initial CPU capacities with current freq Jeffy Chen
  2020-01-10 11:37 ` Sudeep Holla
  2020-01-10 14:03 ` kbuild test robot
@ 2020-01-10 14:45 ` kbuild test robot
  2 siblings, 0 replies; 10+ messages in thread
From: kbuild test robot @ 2020-01-10 14:45 UTC (permalink / raw)
  To: Jeffy Chen
  Cc: kbuild-all, linux-kernel, Brian Norris, Marc Zyngier,
	Douglas Anderson, Robin Murphy, Heiko Stuebner, Jeffy Chen,
	Greg Kroah-Hartman, Sudeep Holla, Rafael J. Wysocki

[-- Attachment #1: Type: text/plain, Size: 1501 bytes --]

Hi Jeffy,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on driver-core/driver-core-testing]
[also build test ERROR on linux/master linus/master v5.5-rc5 next-20200109]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Jeffy-Chen/arch_topology-Adjust-initial-CPU-capacities-with-current-freq/20200110-083308
base:   https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git adc92dd4550ee038a9794eae1c05d88721a3a737
config: arm-sunxi_defconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (GCC) 7.5.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.5.0 make.cross ARCH=arm 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   drivers/base/arch_topology.o: In function `topology_normalize_cpu_scale.part.0':
>> arch_topology.c:(.text+0x224): undefined reference to `__aeabi_uldivmod'

---
0-DAY kernel test infrastructure                 Open Source Technology Center
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 26003 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1] arch_topology: Adjust initial CPU capacities with current freq
  2020-01-10 12:28     ` Robin Murphy
@ 2020-01-11  2:51       ` JeffyChen
  2020-01-11 15:12         ` Robin Murphy
  0 siblings, 1 reply; 10+ messages in thread
From: JeffyChen @ 2020-01-11  2:51 UTC (permalink / raw)
  To: Robin Murphy, Dietmar Eggemann, Sudeep Holla
  Cc: linux-kernel, Brian Norris, Marc Zyngier, Douglas Anderson,
	Heiko Stuebner, Greg Kroah-Hartman, Rafael J. Wysocki

Hi Robin,

Thanks for the clarification :)

On 01/10/2020 08:28 PM, Robin Murphy wrote:
> On 2020-01-10 12:01 pm, Dietmar Eggemann wrote:
>> On 10/01/2020 12:37, Sudeep Holla wrote:
>>> On Thu, Jan 09, 2020 at 03:52:14PM +0800, Jeffy Chen wrote:
>>>> The CPU freqs are not supposed to change before cpufreq policies
>>>> properly registered, meaning that they should be used to calculate the
>>>> initial CPU capacities.
>>>>
>>>> Doing this helps choosing the best CPU during early boot, especially
>>>> for the initramfs decompressing.
>>>>
>>>> Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
>>>
>>> [...]
>>>
>>>> @@ -146,10 +153,15 @@ bool __init topology_parse_cpu_capacity(struct
>>>> device_node *cpu_node, int cpu)
>>>>                   return false;
>>>>               }
>>>>           }
>>>> -        capacity_scale = max(cpu_capacity, capacity_scale);
>>>>           raw_capacity[cpu] = cpu_capacity;
>>>>           pr_debug("cpu_capacity: %pOF cpu_capacity=%u (raw)\n",
>>>>               cpu_node, raw_capacity[cpu]);
>>>> +
>>>> +        cpu_clk = of_clk_get(cpu_node, 0);
>>>> +        if (!PTR_ERR_OR_ZERO(cpu_clk))
>>>> +            per_cpu(max_freq, cpu) = clk_get_rate(cpu_clk) / 1000;
>>>> +
>>>> +        clk_put(cpu_clk);
>>>
>>> I don't like to assume DVFS to be supplied only using 'clk'. So NACK!
>>> We have other non-clk mechanism for CPU DVFS and this needs to simply
>>> use cpufreq APIs to get frequency value if required.
>>
>> To support this, it's failing on my Arm64 Juno board.
>>
>> ...
>> [    0.084858] CPU1 cpu_clk=-517
>> [    0.087961] CPU2 cpu_clk=-517
>> [    0.091005] CPU0 cpu_clk=-517
>> [    0.094121] CPU3 cpu_clk=-517
>> [    0.097248] CPU4 cpu_clk=-517
>> [    0.100415] CPU5 cpu_clk=-517

It there any other way to get the initial cpu capacity for this case?

Or can we just assuming all the cores running at the same freq here?

>> ...
>>
>> Since you're on a big.LITTLE platform, did you specify
>> 'capacity-dmips-mhz' for CPUs to be able to distinguish big and little
>> CPUs before CPUfreq kicks in?
>
> Indeed, and that's the "problem" - the capacities are there, but with
> the broken firmware the kernel starts with the little (boot) cluster
> clocked at either 400 or 200MHz, but the big cluster at just 12MHz. At
> that speed, a full distro config can take about 3 minutes to get to the
> point of loading cpufreq as a module, and I've seen at least one distro
> reverting 97df3aa76b4a to 'fix' the symptom :(

Right, for the big cluster, the bootrom(maskrom) will init the clock to 
24MHz, and if the bootloader(u-boot for example) doesn't bump it, it 
would become 12MHz after kernel initialized the whole clk tree.

And in rockchip's BSP 4.4 kernel, there are hacks to bump it to 
800MHz(higher freq might require regulator changing) in clk tree 
initialization, the BSP u-boot also added that recently.

The chromeos's coreboot looks fine, but upstream u-boot seems missing 
that part too, i'll try to send a patch for that :)

>
> Robin.
>
>>
>> $ grep capacity-dmips-mhz ./arch/arm64/boot/dts/arm/juno.dts
>>             capacity-dmips-mhz = <1024>;
>>             capacity-dmips-mhz = <1024>;
>>             capacity-dmips-mhz = <578>;
>>             capacity-dmips-mhz = <578>;
>>             capacity-dmips-mhz = <578>;
>>             capacity-dmips-mhz = <578>;
>>
>
>
>
>




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1] arch_topology: Adjust initial CPU capacities with current freq
  2020-01-11  2:51       ` JeffyChen
@ 2020-01-11 15:12         ` Robin Murphy
  2020-01-13  3:59           ` JeffyChen
  0 siblings, 1 reply; 10+ messages in thread
From: Robin Murphy @ 2020-01-11 15:12 UTC (permalink / raw)
  To: JeffyChen, Dietmar Eggemann, Sudeep Holla
  Cc: linux-kernel, Brian Norris, Marc Zyngier, Douglas Anderson,
	Heiko Stuebner, Greg Kroah-Hartman, Rafael J. Wysocki

On 2020-01-11 2:51 am, JeffyChen wrote:
> Hi Robin,
> 
> Thanks for the clarification :)
> 
> On 01/10/2020 08:28 PM, Robin Murphy wrote:
>> On 2020-01-10 12:01 pm, Dietmar Eggemann wrote:
>>> On 10/01/2020 12:37, Sudeep Holla wrote:
>>>> On Thu, Jan 09, 2020 at 03:52:14PM +0800, Jeffy Chen wrote:
>>>>> The CPU freqs are not supposed to change before cpufreq policies
>>>>> properly registered, meaning that they should be used to calculate the
>>>>> initial CPU capacities.
>>>>>
>>>>> Doing this helps choosing the best CPU during early boot, especially
>>>>> for the initramfs decompressing.
>>>>>
>>>>> Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
>>>>
>>>> [...]
>>>>
>>>>> @@ -146,10 +153,15 @@ bool __init topology_parse_cpu_capacity(struct
>>>>> device_node *cpu_node, int cpu)
>>>>>                   return false;
>>>>>               }
>>>>>           }
>>>>> -        capacity_scale = max(cpu_capacity, capacity_scale);
>>>>>           raw_capacity[cpu] = cpu_capacity;
>>>>>           pr_debug("cpu_capacity: %pOF cpu_capacity=%u (raw)\n",
>>>>>               cpu_node, raw_capacity[cpu]);
>>>>> +
>>>>> +        cpu_clk = of_clk_get(cpu_node, 0);
>>>>> +        if (!PTR_ERR_OR_ZERO(cpu_clk))
>>>>> +            per_cpu(max_freq, cpu) = clk_get_rate(cpu_clk) / 1000;
>>>>> +
>>>>> +        clk_put(cpu_clk);
>>>>
>>>> I don't like to assume DVFS to be supplied only using 'clk'. So NACK!
>>>> We have other non-clk mechanism for CPU DVFS and this needs to simply
>>>> use cpufreq APIs to get frequency value if required.
>>>
>>> To support this, it's failing on my Arm64 Juno board.
>>>
>>> ...
>>> [    0.084858] CPU1 cpu_clk=-517
>>> [    0.087961] CPU2 cpu_clk=-517
>>> [    0.091005] CPU0 cpu_clk=-517
>>> [    0.094121] CPU3 cpu_clk=-517
>>> [    0.097248] CPU4 cpu_clk=-517
>>> [    0.100415] CPU5 cpu_clk=-517
> 
> It there any other way to get the initial cpu capacity for this case?
> 
> Or can we just assuming all the cores running at the same freq here?
> 
>>> ...
>>>
>>> Since you're on a big.LITTLE platform, did you specify
>>> 'capacity-dmips-mhz' for CPUs to be able to distinguish big and little
>>> CPUs before CPUfreq kicks in?
>>
>> Indeed, and that's the "problem" - the capacities are there, but with
>> the broken firmware the kernel starts with the little (boot) cluster
>> clocked at either 400 or 200MHz, but the big cluster at just 12MHz. At
>> that speed, a full distro config can take about 3 minutes to get to the
>> point of loading cpufreq as a module, and I've seen at least one distro
>> reverting 97df3aa76b4a to 'fix' the symptom :(
> 
> Right, for the big cluster, the bootrom(maskrom) will init the clock to 
> 24MHz, and if the bootloader(u-boot for example) doesn't bump it, it 
> would become 12MHz after kernel initialized the whole clk tree.
> 
> And in rockchip's BSP 4.4 kernel, there are hacks to bump it to 
> 800MHz(higher freq might require regulator changing) in clk tree 
> initialization, the BSP u-boot also added that recently.
> 
> The chromeos's coreboot looks fine, but upstream u-boot seems missing 
> that part too, i'll try to send a patch for that :)

Actually, last time I looked both the BSP U-Boot and mainline do contain 
equivalent code to initialise both PLLs to (IIRC) 600MHz and apparently 
adjust a couple of other things set by the maskrom. The trap is that 
mainline does it in the SPL - thus the unfortunately common combination 
of using the upstream main stage with the miniloader ends up missing out 
that step entirely. In comparison, I'm now using the full upstream 
TPL/SPL flow on my RK3399 board (NanoPC-T4) and even a full generic 
distro kernel is acceptably quick:

[    2.315378] Trying to unpack rootfs image as initramfs...
[    2.781747] Freeing initrd memory: 7316K
...
[    4.239990] Freeing unused kernel memory: 1984K
[    4.247829] Run /init as init process

Robin.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1] arch_topology: Adjust initial CPU capacities with current freq
  2020-01-11 15:12         ` Robin Murphy
@ 2020-01-13  3:59           ` JeffyChen
  0 siblings, 0 replies; 10+ messages in thread
From: JeffyChen @ 2020-01-13  3:59 UTC (permalink / raw)
  To: Robin Murphy, Dietmar Eggemann, Sudeep Holla
  Cc: linux-kernel, Brian Norris, Marc Zyngier, Douglas Anderson,
	Heiko Stuebner, Greg Kroah-Hartman, Rafael J. Wysocki

Hi Robin,

On 01/11/2020 11:12 PM, Robin Murphy wrote:
>>
>
> Actually, last time I looked both the BSP U-Boot and mainline do contain
> equivalent code to initialise both PLLs to (IIRC) 600MHz and apparently
> adjust a couple of other things set by the maskrom. The trap is that
> mainline does it in the SPL - thus the unfortunately common combination
> of using the upstream main stage with the miniloader ends up missing out
> that step entirely. In comparison, I'm now using the full upstream
> TPL/SPL flow on my RK3399 board (NanoPC-T4) and even a full generic
> distro kernel is acceptably quick:
>
> [    2.315378] Trying to unpack rootfs image as initramfs...
> [    2.781747] Freeing initrd memory: 7316K
> ...
> [    4.239990] Freeing unused kernel memory: 1984K
> [    4.247829] Run /init as init process

Oops, sorry for the noise, i've checked in the wrong u-boot code base...

Will ask miniloader team to check that, thanks :)

>
> Robin.




^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-01-13  4:00 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-09  7:52 [PATCH v1] arch_topology: Adjust initial CPU capacities with current freq Jeffy Chen
2020-01-10 11:37 ` Sudeep Holla
2020-01-10 11:54   ` Robin Murphy
2020-01-10 12:01   ` Dietmar Eggemann
2020-01-10 12:28     ` Robin Murphy
2020-01-11  2:51       ` JeffyChen
2020-01-11 15:12         ` Robin Murphy
2020-01-13  3:59           ` JeffyChen
2020-01-10 14:03 ` kbuild test robot
2020-01-10 14:45 ` kbuild test robot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).