Re: [PATCH v2 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support

All of lore.kernel.org
 help / color / mirror / Atom feed

* Re: [PATCH v2 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-10-17 16:49 Dave Martin
  0 siblings, 0 replies; 18+ messages in thread
From: Dave Martin @ 2013-10-17 16:49 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Mark Rutland, Nicolas Pitre, Heiko Stuebner, linux-doc,
	Naour Romain, Tarek Dakhran, Kukjin Kim, Russell King,
	Stephen Warren, linux-samsung-soc, devicetree, Pawel Moll,
	Ian Campbell, Rob Herring, Lorenzo Pieralisi, Vyacheslav Tyrtov,
	Ben Dooks, Mike Turquette, Thomas Gleixner, linux-arm-kernel,
	linux-kernel, Rob Landley

On Thu, Oct 17, 2013 at 06:04:13PM +0200, Daniel Lezcano wrote:
> On 10/17/2013 04:32 PM, Dave Martin wrote:
> > On Thu, Oct 17, 2013 at 12:45:29PM +0200, Daniel Lezcano wrote:
> >> On 10/14/2013 05:08 PM, Vyacheslav Tyrtov wrote:
> >>> From: Tarek Dakhran <t.dakhran@samsung.com>
> >>> 
> >>> Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
> >>> This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
> > 
> > [...]
> > 
> >>> +	__mcpm_cpu_down(cpu, cluster);
> >>> +
> >>> +	if (!skip_wfi) {
> >>> +		exynos_core_power_down(cpu, cluster);
> >>> +		wfi();
> >>> +	}
> >>> +}
> >> 
> >> I did not looked line by line but these functions looks very similar
> >> than the tc2_pm.c's function. no ?
> > 
> > This is true.
> > 
> >> May be some code consolidation could be considered here.
> >> 
> >> Added Nico and Lorenzo in Cc.
> >> 
> >> Thanks
> >>    -- Daniel
> > 
> > Nico can commnent further, but I think the main concern here was that
> > this code shouldn't be factored prematurely.
> 
> I do not share this opinion.
> 
> > There are many low-level platform specifics involved here, so it's
> > hard to be certain that all platforms could fit into a more abstracted
> > framework until we have some evidence to look at.
> > 
> > This could be revisited when we have a few diverse MCPM ports to
> > compare.
> 
> I am worried about seeing more and more duplicated code around the ARM
> arch (eg. arm[64]/kernel/smp.c arm64/kernel/smp.c).
> 
> The cpuidle drivers have been duplicated and it took a while before
> refactoring them, and it is not finished. The hotplug code have been
> duplicated and now it is very difficult to factor it out.
> 
> A lot of work have been done to consolidate the code across the
> different SoC since the last 2 years.
> 
> If we let duplicate code populate the different files, they will
> belong to different maintainers, thus different trees. Each SoC
> contributors will tend to add their small changes making the code to
> diverge more and more and making difficult to re-factor it later.

I think this is Nico's call, since he has more experience than I do
of how these things tend to play out in practice.

Cheers
---Dave

> I am in favor of preventing duplicate code entering in the kernel and
> force the contributors to improve the kernel in general, not just the
> small part they are supposed to work on. Otherwise, we are letting the
> kernel to fork itself, internally...
> 
> > The low-level A15/A7 cacheflush sequence is already being factored
> > by Nico [1].
> 
> Hopefully he did it :)
> 
> Thanks
>   -- Daniel
> 
> > [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2013-October/205085.html
> > [PATCH] ARM: cacheflush: consolidate single-CPU ARMv7 cache disabling code
> > 
> > [...]
> > 
> 
> 
> -- 
>  <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
> 
> Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
> <http://twitter.com/#!/linaroorg> Twitter |
> <http://www.linaro.org/linaro-blog/> Blog
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
  2013-11-04 10:42     ` Alexei Colin
@ 2013-11-04 17:12       ` Alexei Colin
  0 siblings, 0 replies; 18+ messages in thread
From: Alexei Colin @ 2013-11-04 17:12 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/04/2013 05:42 AM, Alexei Colin wrote:
> Aliaksei,
> 
> On 12/31/1969 07:00 PM,  wrote:
>>> From: Tarek Dakhran <t.dakhran@samsung.com>
>>>
>>> Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
>>> This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
>>>
>>> Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
>>> Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
>>> ---
>>>  arch/arm/mach-exynos/Makefile |   2 +
>>>  arch/arm/mach-exynos/edcs.c   | 270 ++++++++++++++++++++++++++++++++++++++++++
>>>  2 files changed, 272 insertions(+)
>>>  create mode 100644 arch/arm/mach-exynos/edcs.c
>>>
> [snip]
>>> +
>>> +/*
>>> + * Enable cluster-level coherency, in preparation for turning on the MMU.
>>> + */
>>> +static void __naked edcs_power_up_setup(unsigned int affinity_level)
>>> +{
>>> +	asm volatile ("\n"
>>> +	"b	cci_enable_port_for_self");
>>> +}
>>
>> 	This code breaks odroid-xu boot with NR_CPUS set to 8. Kernel panics
>> 	like this:
>>
>> %< -----------------------------------------------------------------------
>> [    5.315000] drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
>> [    5.320000] Freeing unused kernel memory: 216K (c049b000 - c04d1000)
>> [    5.325000] Unhandled fault: imprecise external abort (0x1406) at 0x00000000
>> [    5.340000] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000007
>> [    5.340000] 
>> [    5.345000] mmc_host mmc0: Bus speed (slot 0) = 100000000Hz (slot req 200000Hz, actual 200000HZ div = 250)
>> [    5.355000] CPU: 3 PID: 1 Comm: init Not tainted 3.12.0-rc5-00006-g847e427-dirty #1
>> [    5.365000] [<c0014d40>] (unwind_backtrace+0x0/0xf8) from [<c00117cc>] (show_stack+0x10/0x14)
>> [    5.370000] [<c00117cc>] (show_stack+0x10/0x14) from [<c03633ac>] (dump_stack+0x6c/0xac)
>> [    5.380000] mmc_host mmc0: Bus speed (slot 0) = 100000000Hz (slot req 196079Hz, actual 196078HZ div = 255)
>> [    5.390000] [<c03633ac>] (dump_stack+0x6c/0xac) from [<c03609fc>] (panic+0x90/0x1e8)
>> [    5.395000] [<c03609fc>] (panic+0x90/0x1e8) from [<c002048c>] (do_exit+0x780/0x834)
>> [    5.405000] [<c002048c>] (do_exit+0x780/0x834) from [<c002062c>] (do_group_exit+0x3c/0xb0)
>> [    5.410000] [<c002062c>] (do_group_exit+0x3c/0xb0) from [<c002ae80>] (get_signal_to_deliver+0x1d4/0x534)
>> [    5.420000] [<c002ae80>] (get_signal_to_deliver+0x1d4/0x534) from [<c0010d08>] (do_signal+0x100/0x40c)
>> [    5.430000] [<c0010d08>] (do_signal+0x100/0x40c) from [<c0011348>] (do_work_pending+0x68/0xa8)
>> [    5.430000] mmc_host mmc1: Bus speed (slot 0) = 100000000Hz (slot req 50000000Hz, actual 50000000HZ div = 1)
>> [    5.430000] mmc1: new high speed SDHC card at address b368
>> [    5.435000] mmcblk0: mmc1:b368 USD   14.9 GiB 
>> [    5.440000]  mmcblk0: p1 p2 p3 < p5 p6 p7 >
>> [    5.455000] mmc_host mmc0: Bus speed (slot 0) = 100000000Hz (slot req 400000Hz, actual 400000HZ div = 125)
>> [    5.475000] [<c0011348>] (do_work_pending+0x68/0xa8) from [<c000e420>] (work_pending+0xc/0x20)
>> [    5.480000] CPU1: stopping
>> [    5.480000] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.12.0-rc5-00006-g847e427-dirty #1
>> [    5.480000] [<c0014d40>] (unwind_backtrace+0x0/0xf8) from [<c00117cc>] (show_stack+0x10/0x14)
>> [    5.480000] [<c00117cc>] (show_stack+0x10/0x14) from [<c03633ac>] (dump_stack+0x6c/0xac)
>> [    5.480000] [<c03633ac>] (dump_stack+0x6c/0xac) from [<c0013604>] (handle_IPI+0xf8/0x11c)
>> [    5.480000] [<c0013604>] (handle_IPI+0xf8/0x11c) from [<c000851c>] (gic_handle_irq+0x60/0x68)
>> [    5.480000] [<c000851c>] (gic_handle_irq+0x60/0x68) from [<c00122c0>] (__irq_svc+0x40/0x70)
>> [    5.480000] Exception stack(0xef0a7f88 to 0xef0a7fd0)
>> [    5.480000] 7f80:                   00000001 00000000 008d20ff 00000001 00000000 00000000
>> [    5.480000] 7fa0: c04d07a0 60000113 010da000 412fc0f3 c15aa7a0 00000000 00000001 ef0a7fd0
>> [    5.480000] 7fc0: c0072d74 c0072d78 20000113 ffffffff
>> [    5.480000] [<c00122c0>] (__irq_svc+0x40/0x70) from [<c0072d78>] (rcu_idle_exit+0x68/0xb8)
>> [    5.480000] [<c0072d78>] (rcu_idle_exit+0x68/0xb8) from [<c00550a4>] (cpu_startup_entry+0x6c/0x148)
>> [    5.480000] [<c00550a4>] (cpu_startup_entry+0x6c/0x148) from [<400085c4>] (0x400085c4)
>> [    5.480000] CPU0: stopping
>> [    5.480000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.0-rc5-00006-g847e427-dirty #1
>> [    5.480000] [<c0014d40>] (unwind_backtrace+0x0/0xf8) from [<c00117cc>] (show_stack+0x10/0x14)
>> [    5.480000] [<c00117cc>] (show_stack+0x10/0x14) from [<c03633ac>] (dump_stack+0x6c/0xac)
>> [    5.480000] [<c03633ac>] (dump_stack+0x6c/0xac) from [<c0013604>] (handle_IPI+0xf8/0x11c)
>> [    5.480000] [<c0013604>] (handle_IPI+0xf8/0x11c) from [<c000851c>] (gic_handle_irq+0x60/0x68)
>> [    5.480000] [<c000851c>] (gic_handle_irq+0x60/0x68) from [<c00122c0>] (__irq_svc+0x40/0x70)
>> [    5.480000] Exception stack(0xc04d3f70 to 0xc04d3fb8)
>> [    5.480000] SMP: failed to stop secondary CPUs
>> [    5.480000] 3f60:                                     00000000 00000000 00002190 00000000
>> [    5.480000] 3f80: c04d2000 c050a88f 00000001 c050a88f c04da44c 412fc0f3 c036a960 00000000
>> [    5.480000] 3fa0: 00000020 c04d3fb8 c000f5d4 c000f5d8 60000113 ffffffff
>> [    5.480000] [<c00122c0>] (__irq_svc+0x40/0x70) from [<c000f5d8>] (arch_cpu_idle+0x28/0x30)
>> [    5.480000] [<c000f5d8>] (arch_cpu_idle+0x28/0x30) from [<c0055094>] (cpu_startup_entry+0x5c/0x148)
>> [    5.480000] [<c0055094>] (cpu_startup_entry+0x5c/0x148) from [<c049ba9c>] (start_kernel+0x32c/0x384)
>> [    5.480000] CPU2: stopping
>> [    5.480000] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.12.0-rc5-00006-g847e427-dirty #1
>> [    5.480000] [<c0014d40>] (unwind_backtrace+0x0/0xf8) from [<c00117cc>] (show_stack+0x10/0x14)
>> [    5.480000] [<c00117cc>] (show_stack+0x10/0x14) from [<c03633ac>] (dump_stack+0x6c/0xac)
>> [    5.480000] [<c03633ac>] (dump_stack+0x6c/0xac) from [<c0013604>] (handle_IPI+0xf8/0x11c)
>> [    5.480000] [<c0013604>] (handle_IPI+0xf8/0x11c) from [<c000851c>] (gic_handle_irq+0x60/0x68)
>> [    5.480000] [<c000851c>] (gic_handle_irq+0x60/0x68) from [<c00122c0>] (__irq_svc+0x40/0x70)
>> [    5.480000] Exception stack(0xef0a9fa0 to 0xef0a9fe8)
>> [    5.480000] 9fa0: 00000002 00000000 008e4858 00000000 ef0a8000 c050a88f 00000001 c050a88f
>> [    5.480000] 9fc0: c04da44c 412fc0f3 c036a960 00000000 00000001 ef0a9fe8 c000f5d4 c000f5d8
>> [    5.480000] 9fe0: 60000113 ffffffff
>> [    5.480000] [<c00122c0>] (__irq_svc+0x40/0x70) from [<c000f5d8>] (arch_cpu_idle+0x28/0x30)
>> [    5.480000] [<c000f5d8>] (arch_cpu_idle+0x28/0x30) from [<c0055094>] (cpu_startup_entry+0x5c/0x148)
>> [    5.480000] [<c0055094>] (cpu_startup_entry+0x5c/0x148) from [<400085c4>] (0x400085c4)
>> %< -----------------------------------------------------------------------
>>
>> 	I checked arch/arm/mach-vexpress/tc2_pm.c to see how CCI is enabled
>> 	there an realized that you should follow same pattern, i.e.:
>>
>>         asm volatile (" \n"
>> "       cmp     r0, #1 \n"
>> "       bxne    lr \n"
>> "       b       cci_enable_port_for_self ");
>>
>> 	In this case only one cluster (4 LITTLE cores for Exynos5410) will be
>> 	initialized at boot time. And no panic.
> 
> After this modification the same crash goes away for me too: Odroid
> XU+E, 3.12-rc5 + this patchset, exynos_defconfig, exynos5410-smdk5410.dtb.
> 
> But, the other four cores fail to come online at boot time and fail when
> manually brought online 'echo 1 > /sys/devices/system/cpu/cpu4/online'.
> The write of S5P_CORE_LOCAL_PWR_EN to EDCS_CORE_CONFIGURATION happens
> but the CPU never reaches mcpm_entry_point.
> 
> If I change the above to 'cmp r0, #0' (the A15 cluster?), then I get the
> same crash as with unconditional call, and _the same_ 4 CPUs online.
> Does this mean that enabling coherence for the A15 cluster causes the
> above crash? And, is it true that for the A15 CPUs to wake up and
> proceed to the entry point coherence must be enabled (or is some other
> reason preventing them from waking up)?

Just realized that the argument to power_up_setup is not the cluster but
"affinity," which is 1 for clusters and 0 for cores (?). This explains
Aliaksei's correction and renders my above experiment pointless.

This code that enables coherency happens much later than
mcpm_entry_point. So, this strengthens the hypothesis that CPUs from the
other cluster don't come online due to some other reason. Is there an
explit "turn on cluster" operation that needs to happen before (or
after) the register write in exynos_core_power_control can turn on a
core in that cluster?

> 
> [    0.045000] CPU: Testing write buffer coherency: ok
> [    0.050000] CPU0: update cpu_power 1468
> [    0.055000] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
> [    0.060000] Setting up static identity map for 0xc038c4b8 - 0xc038c510
> [    0.065000] ARM CCI driver probed
> [    0.065000] edcs_init: configuring entry points
> [    0.070000] edcs_init: calling data init
> [    0.075000] edcs_data_init: cpu 0 cluster 0
> [    0.080000] EDCS power management initialized
> [    0.115000] exynos_power_up: cpu 1 cluster 0
> [    0.115000] exynos_power_up: cpu 1 cluster 0 use count 1
> [    0.115000] exynos_core_power_control: changing value to 3
> CPU01 cluster00: kernel mcpm_entry_point
> CPU01 cluster00: released
> [    0.125000] CPU1: Booted secondary processor
> [    0.125000] mcpm_cpu_powered_up: up
> [    0.160000] CPU1: update cpu_power 1468
> [    0.165000] CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
> [    0.175000] exynos_power_up: cpu 2 cluster 0
> [    0.180000] exynos_power_up: cpu 2 cluster 0 use count 1
> [    0.180000] exynos_core_power_control: changing value to 3
> CPU02 cluster00: kernel mcpm_entry_point
> CPU02 cluster00: released
> [    0.190000] CPU2: Booted secondary processor
> ...
> [    0.280000] exynos_power_up: cpu 0 cluster 1
> [    0.285000] exynos_power_up: cpu 0 cluster 1 use count 1
> [    0.285000] exynos_core_power_control: changing value to 3
> [    1.290000] CPU4: failed to come online
> [    1.300000] exynos_power_up: cpu 1 cluster 1
> [    1.300000] exynos_power_up: cpu 1 cluster 1 use count 1
> [    1.300000] exynos_core_power_control: changing value to 3
> [    2.305000] CPU5: failed to come online
> ...
> [    4.340000] SMP: Total of 4 processors activated.
> [    4.345000] CPU: All CPU(s) started in SVC mode.
> 
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
  2013-10-25 10:06     ` Aliaksei Katovich
  (?)
@ 2013-11-04 10:42     ` Alexei Colin
  2013-11-04 17:12       ` Alexei Colin
  -1 siblings, 1 reply; 18+ messages in thread
From: Alexei Colin @ 2013-11-04 10:42 UTC (permalink / raw)
  To: linux-arm-kernel

Aliaksei,

On 12/31/1969 07:00 PM,  wrote:
>> From: Tarek Dakhran <t.dakhran@samsung.com>
>>
>> Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
>> This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
>>
>> Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
>> Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
>> ---
>>  arch/arm/mach-exynos/Makefile |   2 +
>>  arch/arm/mach-exynos/edcs.c   | 270 ++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 272 insertions(+)
>>  create mode 100644 arch/arm/mach-exynos/edcs.c
>>
[snip]
>> +
>> +/*
>> + * Enable cluster-level coherency, in preparation for turning on the MMU.
>> + */
>> +static void __naked edcs_power_up_setup(unsigned int affinity_level)
>> +{
>> +	asm volatile ("\n"
>> +	"b	cci_enable_port_for_self");
>> +}
> 
> 	This code breaks odroid-xu boot with NR_CPUS set to 8. Kernel panics
> 	like this:
> 
> %< -----------------------------------------------------------------------
> [    5.315000] drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
> [    5.320000] Freeing unused kernel memory: 216K (c049b000 - c04d1000)
> [    5.325000] Unhandled fault: imprecise external abort (0x1406) at 0x00000000
> [    5.340000] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000007
> [    5.340000] 
> [    5.345000] mmc_host mmc0: Bus speed (slot 0) = 100000000Hz (slot req 200000Hz, actual 200000HZ div = 250)
> [    5.355000] CPU: 3 PID: 1 Comm: init Not tainted 3.12.0-rc5-00006-g847e427-dirty #1
> [    5.365000] [<c0014d40>] (unwind_backtrace+0x0/0xf8) from [<c00117cc>] (show_stack+0x10/0x14)
> [    5.370000] [<c00117cc>] (show_stack+0x10/0x14) from [<c03633ac>] (dump_stack+0x6c/0xac)
> [    5.380000] mmc_host mmc0: Bus speed (slot 0) = 100000000Hz (slot req 196079Hz, actual 196078HZ div = 255)
> [    5.390000] [<c03633ac>] (dump_stack+0x6c/0xac) from [<c03609fc>] (panic+0x90/0x1e8)
> [    5.395000] [<c03609fc>] (panic+0x90/0x1e8) from [<c002048c>] (do_exit+0x780/0x834)
> [    5.405000] [<c002048c>] (do_exit+0x780/0x834) from [<c002062c>] (do_group_exit+0x3c/0xb0)
> [    5.410000] [<c002062c>] (do_group_exit+0x3c/0xb0) from [<c002ae80>] (get_signal_to_deliver+0x1d4/0x534)
> [    5.420000] [<c002ae80>] (get_signal_to_deliver+0x1d4/0x534) from [<c0010d08>] (do_signal+0x100/0x40c)
> [    5.430000] [<c0010d08>] (do_signal+0x100/0x40c) from [<c0011348>] (do_work_pending+0x68/0xa8)
> [    5.430000] mmc_host mmc1: Bus speed (slot 0) = 100000000Hz (slot req 50000000Hz, actual 50000000HZ div = 1)
> [    5.430000] mmc1: new high speed SDHC card at address b368
> [    5.435000] mmcblk0: mmc1:b368 USD   14.9 GiB 
> [    5.440000]  mmcblk0: p1 p2 p3 < p5 p6 p7 >
> [    5.455000] mmc_host mmc0: Bus speed (slot 0) = 100000000Hz (slot req 400000Hz, actual 400000HZ div = 125)
> [    5.475000] [<c0011348>] (do_work_pending+0x68/0xa8) from [<c000e420>] (work_pending+0xc/0x20)
> [    5.480000] CPU1: stopping
> [    5.480000] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.12.0-rc5-00006-g847e427-dirty #1
> [    5.480000] [<c0014d40>] (unwind_backtrace+0x0/0xf8) from [<c00117cc>] (show_stack+0x10/0x14)
> [    5.480000] [<c00117cc>] (show_stack+0x10/0x14) from [<c03633ac>] (dump_stack+0x6c/0xac)
> [    5.480000] [<c03633ac>] (dump_stack+0x6c/0xac) from [<c0013604>] (handle_IPI+0xf8/0x11c)
> [    5.480000] [<c0013604>] (handle_IPI+0xf8/0x11c) from [<c000851c>] (gic_handle_irq+0x60/0x68)
> [    5.480000] [<c000851c>] (gic_handle_irq+0x60/0x68) from [<c00122c0>] (__irq_svc+0x40/0x70)
> [    5.480000] Exception stack(0xef0a7f88 to 0xef0a7fd0)
> [    5.480000] 7f80:                   00000001 00000000 008d20ff 00000001 00000000 00000000
> [    5.480000] 7fa0: c04d07a0 60000113 010da000 412fc0f3 c15aa7a0 00000000 00000001 ef0a7fd0
> [    5.480000] 7fc0: c0072d74 c0072d78 20000113 ffffffff
> [    5.480000] [<c00122c0>] (__irq_svc+0x40/0x70) from [<c0072d78>] (rcu_idle_exit+0x68/0xb8)
> [    5.480000] [<c0072d78>] (rcu_idle_exit+0x68/0xb8) from [<c00550a4>] (cpu_startup_entry+0x6c/0x148)
> [    5.480000] [<c00550a4>] (cpu_startup_entry+0x6c/0x148) from [<400085c4>] (0x400085c4)
> [    5.480000] CPU0: stopping
> [    5.480000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.0-rc5-00006-g847e427-dirty #1
> [    5.480000] [<c0014d40>] (unwind_backtrace+0x0/0xf8) from [<c00117cc>] (show_stack+0x10/0x14)
> [    5.480000] [<c00117cc>] (show_stack+0x10/0x14) from [<c03633ac>] (dump_stack+0x6c/0xac)
> [    5.480000] [<c03633ac>] (dump_stack+0x6c/0xac) from [<c0013604>] (handle_IPI+0xf8/0x11c)
> [    5.480000] [<c0013604>] (handle_IPI+0xf8/0x11c) from [<c000851c>] (gic_handle_irq+0x60/0x68)
> [    5.480000] [<c000851c>] (gic_handle_irq+0x60/0x68) from [<c00122c0>] (__irq_svc+0x40/0x70)
> [    5.480000] Exception stack(0xc04d3f70 to 0xc04d3fb8)
> [    5.480000] SMP: failed to stop secondary CPUs
> [    5.480000] 3f60:                                     00000000 00000000 00002190 00000000
> [    5.480000] 3f80: c04d2000 c050a88f 00000001 c050a88f c04da44c 412fc0f3 c036a960 00000000
> [    5.480000] 3fa0: 00000020 c04d3fb8 c000f5d4 c000f5d8 60000113 ffffffff
> [    5.480000] [<c00122c0>] (__irq_svc+0x40/0x70) from [<c000f5d8>] (arch_cpu_idle+0x28/0x30)
> [    5.480000] [<c000f5d8>] (arch_cpu_idle+0x28/0x30) from [<c0055094>] (cpu_startup_entry+0x5c/0x148)
> [    5.480000] [<c0055094>] (cpu_startup_entry+0x5c/0x148) from [<c049ba9c>] (start_kernel+0x32c/0x384)
> [    5.480000] CPU2: stopping
> [    5.480000] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.12.0-rc5-00006-g847e427-dirty #1
> [    5.480000] [<c0014d40>] (unwind_backtrace+0x0/0xf8) from [<c00117cc>] (show_stack+0x10/0x14)
> [    5.480000] [<c00117cc>] (show_stack+0x10/0x14) from [<c03633ac>] (dump_stack+0x6c/0xac)
> [    5.480000] [<c03633ac>] (dump_stack+0x6c/0xac) from [<c0013604>] (handle_IPI+0xf8/0x11c)
> [    5.480000] [<c0013604>] (handle_IPI+0xf8/0x11c) from [<c000851c>] (gic_handle_irq+0x60/0x68)
> [    5.480000] [<c000851c>] (gic_handle_irq+0x60/0x68) from [<c00122c0>] (__irq_svc+0x40/0x70)
> [    5.480000] Exception stack(0xef0a9fa0 to 0xef0a9fe8)
> [    5.480000] 9fa0: 00000002 00000000 008e4858 00000000 ef0a8000 c050a88f 00000001 c050a88f
> [    5.480000] 9fc0: c04da44c 412fc0f3 c036a960 00000000 00000001 ef0a9fe8 c000f5d4 c000f5d8
> [    5.480000] 9fe0: 60000113 ffffffff
> [    5.480000] [<c00122c0>] (__irq_svc+0x40/0x70) from [<c000f5d8>] (arch_cpu_idle+0x28/0x30)
> [    5.480000] [<c000f5d8>] (arch_cpu_idle+0x28/0x30) from [<c0055094>] (cpu_startup_entry+0x5c/0x148)
> [    5.480000] [<c0055094>] (cpu_startup_entry+0x5c/0x148) from [<400085c4>] (0x400085c4)
> %< -----------------------------------------------------------------------
> 
> 	I checked arch/arm/mach-vexpress/tc2_pm.c to see how CCI is enabled
> 	there an realized that you should follow same pattern, i.e.:
> 
>         asm volatile (" \n"
> "       cmp     r0, #1 \n"
> "       bxne    lr \n"
> "       b       cci_enable_port_for_self ");
> 
> 	In this case only one cluster (4 LITTLE cores for Exynos5410) will be
> 	initialized at boot time. And no panic.

After this modification the same crash goes away for me too: Odroid
XU+E, 3.12-rc5 + this patchset, exynos_defconfig, exynos5410-smdk5410.dtb.

But, the other four cores fail to come online at boot time and fail when
manually brought online 'echo 1 > /sys/devices/system/cpu/cpu4/online'.
The write of S5P_CORE_LOCAL_PWR_EN to EDCS_CORE_CONFIGURATION happens
but the CPU never reaches mcpm_entry_point.

If I change the above to 'cmp r0, #0' (the A15 cluster?), then I get the
same crash as with unconditional call, and _the same_ 4 CPUs online.
Does this mean that enabling coherence for the A15 cluster causes the
above crash? And, is it true that for the A15 CPUs to wake up and
proceed to the entry point coherence must be enabled (or is some other
reason preventing them from waking up)?

[    0.045000] CPU: Testing write buffer coherency: ok
[    0.050000] CPU0: update cpu_power 1468
[    0.055000] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
[    0.060000] Setting up static identity map for 0xc038c4b8 - 0xc038c510
[    0.065000] ARM CCI driver probed
[    0.065000] edcs_init: configuring entry points
[    0.070000] edcs_init: calling data init
[    0.075000] edcs_data_init: cpu 0 cluster 0
[    0.080000] EDCS power management initialized
[    0.115000] exynos_power_up: cpu 1 cluster 0
[    0.115000] exynos_power_up: cpu 1 cluster 0 use count 1
[    0.115000] exynos_core_power_control: changing value to 3
CPU01 cluster00: kernel mcpm_entry_point
CPU01 cluster00: released
[    0.125000] CPU1: Booted secondary processor
[    0.125000] mcpm_cpu_powered_up: up
[    0.160000] CPU1: update cpu_power 1468
[    0.165000] CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
[    0.175000] exynos_power_up: cpu 2 cluster 0
[    0.180000] exynos_power_up: cpu 2 cluster 0 use count 1
[    0.180000] exynos_core_power_control: changing value to 3
CPU02 cluster00: kernel mcpm_entry_point
CPU02 cluster00: released
[    0.190000] CPU2: Booted secondary processor
...
[    0.280000] exynos_power_up: cpu 0 cluster 1
[    0.285000] exynos_power_up: cpu 0 cluster 1 use count 1
[    0.285000] exynos_core_power_control: changing value to 3
[    1.290000] CPU4: failed to come online
[    1.300000] exynos_power_up: cpu 1 cluster 1
[    1.300000] exynos_power_up: cpu 1 cluster 1 use count 1
[    1.300000] exynos_core_power_control: changing value to 3
[    2.305000] CPU5: failed to come online
...
[    4.340000] SMP: Total of 4 processors activated.
[    4.345000] CPU: All CPU(s) started in SVC mode.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
  2013-10-14 15:08   ` Vyacheslav Tyrtov
@ 2013-10-25 10:06     ` Aliaksei Katovich
  -1 siblings, 0 replies; 18+ messages in thread
From: Aliaksei Katovich @ 2013-10-25 10:06 UTC (permalink / raw)
  To: Vyacheslav Tyrtov
  Cc: linux-kernel, Mark Rutland, devicetree, Kukjin Kim, Russell King,
	Ben Dooks, Pawel Moll, Ian Campbell, Stephen Warren, linux-doc,
	Rob Herring, Tarek Dakhran, Daniel Lezcano, linux-samsung-soc,
	Rob Landley, Mike Turquette, Thomas Gleixner, Naour Romain,
	linux-arm-kernel, Heiko Stuebner

hi Vyacheslav;

> From: Tarek Dakhran <t.dakhran@samsung.com>
> 
> Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
> This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
> 
> Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
> Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
> ---
>  arch/arm/mach-exynos/Makefile |   2 +
>  arch/arm/mach-exynos/edcs.c   | 270 ++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 272 insertions(+)
>  create mode 100644 arch/arm/mach-exynos/edcs.c
> 
> diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
> index 5369615..ba6efdb 100644
> --- a/arch/arm/mach-exynos/Makefile
> +++ b/arch/arm/mach-exynos/Makefile
> @@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
>  
>  obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
>  obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
> +
> +obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
> diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
> new file mode 100644
> index 0000000..e304bd9
> --- /dev/null
> +++ b/arch/arm/mach-exynos/edcs.c
> @@ -0,0 +1,270 @@
> +/*
> + * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
> + *
> + * Copyright (c) 2013 Samsung Electronics Co., Ltd.
> + * Author: Tarek Dakhran <t.dakhran@samsung.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * EDCS(exynos dual cluster support) for Exynos5410 SoC.
> + */
> +
> +#include <linux/init.h>
> +#include <linux/io.h>
> +#include <linux/of_address.h>
> +#include <linux/spinlock.h>
> +#include <linux/errno.h>
> +
> +#include <asm/mcpm.h>
> +#include <asm/proc-fns.h>
> +#include <asm/cacheflush.h>
> +#include <asm/cputype.h>
> +#include <asm/cp15.h>
> +
> +#include <linux/arm-cci.h>
> +#include <mach/regs-pmu.h>
> +
> +#define EDCS_CPUS_PER_CLUSTER	4
> +#define EDCS_CLUSTERS		2
> +
> +/* Exynos5410 power management registers */
> +#define EDCS_CORE_CONFIGURATION(_nr)	(S5P_ARM_CORE0_CONFIGURATION	\
> +						+ ((_nr) * 0x80))
> +#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
> +#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
> +
> +#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)
> +#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
> +						 _nr * EDCS_CPUS_PER_CLUSTER)
> +
> +static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
> +
> +static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
> +static int core_count[EDCS_CLUSTERS];
> +
> +static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
> +				bool enable)
> +{
> +	unsigned int offset = cluster * EDCS_CPUS_PER_CLUSTER + cpu;
> +	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
> +
> +	if ((__raw_readl(EDCS_CORE_STATUS(offset)) & 0x3) != value)
> +		__raw_writel(value, EDCS_CORE_CONFIGURATION(offset));
> +}
> +
> +static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
> +{
> +	exynos_core_power_control(cpu, cluster, true);
> +}
> +
> +static void exynos_core_power_down(unsigned int cpu, unsigned int cluster)
> +{
> +	exynos_core_power_control(cpu, cluster, false);
> +}
> +
> +void set_boot_flag(unsigned int cpu, unsigned int mode)
> +{
> +	__raw_writel(mode, REG_CPU_STATE_ADDR(cpu));
> +}
> +
> +static int exynos_power_up(unsigned int cpu, unsigned int cluster)
> +{
> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER || cluster >= EDCS_CLUSTERS);
> +
> +	local_irq_disable();
> +	arch_spin_lock(&edcs_lock);
> +
> +	edcs_use_count[cpu][cluster]++;
> +	if (edcs_use_count[cpu][cluster] == 1) {
> +		++core_count[cluster];
> +		set_boot_flag(cpu, 0x2);
> +		exynos_core_power_up(cpu, cluster);
> +	} else if (edcs_use_count[cpu][cluster] != 2) {
> +		/*
> +		 * The only possible values are:
> +		 * 0 = CPU down
> +		 * 1 = CPU (still) up
> +		 * 2 = CPU requested to be up before it had a chance
> +		 *     to actually make itself down.
> +		 * Any other value is a bug.
> +		 */
> +		BUG();
> +	}
> +
> +	arch_spin_unlock(&edcs_lock);
> +	local_irq_enable();
> +
> +	return 0;
> +}
> +static void exynos_power_down(void)
> +{
> +	unsigned int mpidr, cpu, cluster;
> +	bool last_man = false, skip_wfi = false;
> +
> +	mpidr = read_cpuid_mpidr();
> +	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> +	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> +
> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
> +
> +	__mcpm_cpu_going_down(cpu, cluster);
> +
> +	arch_spin_lock(&edcs_lock);
> +	BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
> +	edcs_use_count[cpu][cluster]--;
> +	if (edcs_use_count[cpu][cluster] == 0) {
> +		--core_count[cluster];
> +		if (core_count[cluster] == 0)
> +			last_man = true;
> +	} else if (edcs_use_count[cpu][cluster] == 1) {
> +		/*
> +		 * A power_up request went ahead of us.
> +		 * Even if we do not want to shut this CPU down,
> +		 * the caller expects a certain state as if the WFI
> +		 * was aborted.  So let's continue with cache cleaning.
> +		 */
> +		skip_wfi = true;
> +	} else
> +		BUG();
> +
> +	if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
> +		arch_spin_unlock(&edcs_lock);
> +
> +		if (read_cpuid_part_number() == ARM_CPU_PART_CORTEX_A15) {
> +			/*
> +			 * On the Cortex-A15 we need to disable
> +			 * L2 prefetching before flushing the cache.
> +			 */
> +			asm volatile(
> +			"mcr	p15, 1, %0, c15, c0, 3\n\t"
> +			"isb\n\t"
> +			"dsb"
> +			: : "r" (0x400));
> +		}
> +
> +		/*
> +		 * We need to disable and flush the whole (L1 and L2) cache.
> +		 * Let's do it in the safest possible way i.e. with
> +		 * no memory access within the following sequence
> +		 * including the stack.
> +		 *
> +		 * Note: fp is preserved to the stack explicitly prior doing
> +		 * this since adding it to the clobber list is incompatible
> +		 * with having CONFIG_FRAME_POINTER=y.
> +		 */
> +		asm volatile(
> +		"str	fp, [sp, #-4]!\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
> +		"isb\n\t"
> +		"bl	v7_flush_dcache_all\n\t"
> +		"clrex\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
> +		"isb\n\t"
> +		"dsb\n\t"
> +		"ldr	fp, [sp], #4"
> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
> +			"r9", "r10", "lr", "memory");
> +
> +		cci_disable_port_by_cpu(mpidr);
> +
> +		__mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
> +
> +	} else {
> +		arch_spin_unlock(&edcs_lock);
> +		/*
> +			* We need to disable and flush only the L1 cache.
> +			* Let's do it in the safest possible way as above.
> +		*/
> +		asm volatile(
> +		"str	fp, [sp, #-4]!\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
> +		"isb\n\t"
> +		"bl	v7_flush_dcache_louis\n\t"
> +		"clrex\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
> +		"isb\n\t"
> +		"dsb\n\t"
> +		"ldr	fp, [sp], #4"
> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
> +		      "r9", "r10", "lr", "memory");
> +
> +	}
> +	__mcpm_cpu_down(cpu, cluster);
> +
> +	if (!skip_wfi) {
> +		exynos_core_power_down(cpu, cluster);
> +		wfi();
> +	}
> +}
> +
> +static const struct mcpm_platform_ops exynos_power_ops = {
> +	.power_up	= exynos_power_up,
> +	.power_down	= exynos_power_down,
> +};
> +
> +static void __init edcs_data_init(void)
> +{
> +	unsigned int mpidr, cpu, cluster;
> +
> +	mpidr = read_cpuid_mpidr();
> +	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> +	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> +
> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
> +	edcs_use_count[cpu][cluster] = 1;
> +	++core_count[cluster];
> +}
> +
> +/*
> + * Enable cluster-level coherency, in preparation for turning on the MMU.
> + */
> +static void __naked edcs_power_up_setup(unsigned int affinity_level)
> +{
> +	asm volatile ("\n"
> +	"b	cci_enable_port_for_self");
> +}

	This code breaks odroid-xu boot with NR_CPUS set to 8. Kernel panics
	like this:

%< -----------------------------------------------------------------------
[    5.315000] drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
[    5.320000] Freeing unused kernel memory: 216K (c049b000 - c04d1000)
[    5.325000] Unhandled fault: imprecise external abort (0x1406) at 0x00000000
[    5.340000] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000007
[    5.340000] 
[    5.345000] mmc_host mmc0: Bus speed (slot 0) = 100000000Hz (slot req 200000Hz, actual 200000HZ div = 250)
[    5.355000] CPU: 3 PID: 1 Comm: init Not tainted 3.12.0-rc5-00006-g847e427-dirty #1
[    5.365000] [<c0014d40>] (unwind_backtrace+0x0/0xf8) from [<c00117cc>] (show_stack+0x10/0x14)
[    5.370000] [<c00117cc>] (show_stack+0x10/0x14) from [<c03633ac>] (dump_stack+0x6c/0xac)
[    5.380000] mmc_host mmc0: Bus speed (slot 0) = 100000000Hz (slot req 196079Hz, actual 196078HZ div = 255)
[    5.390000] [<c03633ac>] (dump_stack+0x6c/0xac) from [<c03609fc>] (panic+0x90/0x1e8)
[    5.395000] [<c03609fc>] (panic+0x90/0x1e8) from [<c002048c>] (do_exit+0x780/0x834)
[    5.405000] [<c002048c>] (do_exit+0x780/0x834) from [<c002062c>] (do_group_exit+0x3c/0xb0)
[    5.410000] [<c002062c>] (do_group_exit+0x3c/0xb0) from [<c002ae80>] (get_signal_to_deliver+0x1d4/0x534)
[    5.420000] [<c002ae80>] (get_signal_to_deliver+0x1d4/0x534) from [<c0010d08>] (do_signal+0x100/0x40c)
[    5.430000] [<c0010d08>] (do_signal+0x100/0x40c) from [<c0011348>] (do_work_pending+0x68/0xa8)
[    5.430000] mmc_host mmc1: Bus speed (slot 0) = 100000000Hz (slot req 50000000Hz, actual 50000000HZ div = 1)
[    5.430000] mmc1: new high speed SDHC card at address b368
[    5.435000] mmcblk0: mmc1:b368 USD   14.9 GiB 
[    5.440000]  mmcblk0: p1 p2 p3 < p5 p6 p7 >
[    5.455000] mmc_host mmc0: Bus speed (slot 0) = 100000000Hz (slot req 400000Hz, actual 400000HZ div = 125)
[    5.475000] [<c0011348>] (do_work_pending+0x68/0xa8) from [<c000e420>] (work_pending+0xc/0x20)
[    5.480000] CPU1: stopping
[    5.480000] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.12.0-rc5-00006-g847e427-dirty #1
[    5.480000] [<c0014d40>] (unwind_backtrace+0x0/0xf8) from [<c00117cc>] (show_stack+0x10/0x14)
[    5.480000] [<c00117cc>] (show_stack+0x10/0x14) from [<c03633ac>] (dump_stack+0x6c/0xac)
[    5.480000] [<c03633ac>] (dump_stack+0x6c/0xac) from [<c0013604>] (handle_IPI+0xf8/0x11c)
[    5.480000] [<c0013604>] (handle_IPI+0xf8/0x11c) from [<c000851c>] (gic_handle_irq+0x60/0x68)
[    5.480000] [<c000851c>] (gic_handle_irq+0x60/0x68) from [<c00122c0>] (__irq_svc+0x40/0x70)
[    5.480000] Exception stack(0xef0a7f88 to 0xef0a7fd0)
[    5.480000] 7f80:                   00000001 00000000 008d20ff 00000001 00000000 00000000
[    5.480000] 7fa0: c04d07a0 60000113 010da000 412fc0f3 c15aa7a0 00000000 00000001 ef0a7fd0
[    5.480000] 7fc0: c0072d74 c0072d78 20000113 ffffffff
[    5.480000] [<c00122c0>] (__irq_svc+0x40/0x70) from [<c0072d78>] (rcu_idle_exit+0x68/0xb8)
[    5.480000] [<c0072d78>] (rcu_idle_exit+0x68/0xb8) from [<c00550a4>] (cpu_startup_entry+0x6c/0x148)
[    5.480000] [<c00550a4>] (cpu_startup_entry+0x6c/0x148) from [<400085c4>] (0x400085c4)
[    5.480000] CPU0: stopping
[    5.480000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.0-rc5-00006-g847e427-dirty #1
[    5.480000] [<c0014d40>] (unwind_backtrace+0x0/0xf8) from [<c00117cc>] (show_stack+0x10/0x14)
[    5.480000] [<c00117cc>] (show_stack+0x10/0x14) from [<c03633ac>] (dump_stack+0x6c/0xac)
[    5.480000] [<c03633ac>] (dump_stack+0x6c/0xac) from [<c0013604>] (handle_IPI+0xf8/0x11c)
[    5.480000] [<c0013604>] (handle_IPI+0xf8/0x11c) from [<c000851c>] (gic_handle_irq+0x60/0x68)
[    5.480000] [<c000851c>] (gic_handle_irq+0x60/0x68) from [<c00122c0>] (__irq_svc+0x40/0x70)
[    5.480000] Exception stack(0xc04d3f70 to 0xc04d3fb8)
[    5.480000] SMP: failed to stop secondary CPUs
[    5.480000] 3f60:                                     00000000 00000000 00002190 00000000
[    5.480000] 3f80: c04d2000 c050a88f 00000001 c050a88f c04da44c 412fc0f3 c036a960 00000000
[    5.480000] 3fa0: 00000020 c04d3fb8 c000f5d4 c000f5d8 60000113 ffffffff
[    5.480000] [<c00122c0>] (__irq_svc+0x40/0x70) from [<c000f5d8>] (arch_cpu_idle+0x28/0x30)
[    5.480000] [<c000f5d8>] (arch_cpu_idle+0x28/0x30) from [<c0055094>] (cpu_startup_entry+0x5c/0x148)
[    5.480000] [<c0055094>] (cpu_startup_entry+0x5c/0x148) from [<c049ba9c>] (start_kernel+0x32c/0x384)
[    5.480000] CPU2: stopping
[    5.480000] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.12.0-rc5-00006-g847e427-dirty #1
[    5.480000] [<c0014d40>] (unwind_backtrace+0x0/0xf8) from [<c00117cc>] (show_stack+0x10/0x14)
[    5.480000] [<c00117cc>] (show_stack+0x10/0x14) from [<c03633ac>] (dump_stack+0x6c/0xac)
[    5.480000] [<c03633ac>] (dump_stack+0x6c/0xac) from [<c0013604>] (handle_IPI+0xf8/0x11c)
[    5.480000] [<c0013604>] (handle_IPI+0xf8/0x11c) from [<c000851c>] (gic_handle_irq+0x60/0x68)
[    5.480000] [<c000851c>] (gic_handle_irq+0x60/0x68) from [<c00122c0>] (__irq_svc+0x40/0x70)
[    5.480000] Exception stack(0xef0a9fa0 to 0xef0a9fe8)
[    5.480000] 9fa0: 00000002 00000000 008e4858 00000000 ef0a8000 c050a88f 00000001 c050a88f
[    5.480000] 9fc0: c04da44c 412fc0f3 c036a960 00000000 00000001 ef0a9fe8 c000f5d4 c000f5d8
[    5.480000] 9fe0: 60000113 ffffffff
[    5.480000] [<c00122c0>] (__irq_svc+0x40/0x70) from [<c000f5d8>] (arch_cpu_idle+0x28/0x30)
[    5.480000] [<c000f5d8>] (arch_cpu_idle+0x28/0x30) from [<c0055094>] (cpu_startup_entry+0x5c/0x148)
[    5.480000] [<c0055094>] (cpu_startup_entry+0x5c/0x148) from [<400085c4>] (0x400085c4)
%< -----------------------------------------------------------------------

	I checked arch/arm/mach-vexpress/tc2_pm.c to see how CCI is enabled
	there an realized that you should follow same pattern, i.e.:

        asm volatile (" \n"
"       cmp     r0, #1 \n"
"       bxne    lr \n"
"       b       cci_enable_port_for_self ");

	In this case only one cluster (4 LITTLE cores for Exynos5410) will be
	initialized at boot time. And no panic.

--
Aliaksei

> +
> +static int __init edcs_init(void)
> +{
> +	int ret;
> +	struct device_node *node;
> +
> +	node = of_find_compatible_node(NULL, NULL, "samsung,exynos5410");
> +	if (!node)
> +		return -ENODEV;
> +
> +	if (!cci_probed())
> +		return -ENODEV;
> +
> +	/*
> +	 * Future entries into the kernel can now go
> +	 * through the cluster entry vectors.
> +	 */
> +	__raw_writel(virt_to_phys(mcpm_entry_point),
> +				S5P_VA_SYSRAM_NS + 0x1c);
> +
> +	edcs_data_init();
> +	mcpm_smp_set_ops();
> +
> +	ret = mcpm_platform_register(&exynos_power_ops);
> +	if (!ret) {
> +		mcpm_sync_init(edcs_power_up_setup);
> +		pr_info("EDCS power management initialized\n");
> +	}
> +	return ret;
> +}
> +
> +early_initcall(edcs_init);
> -- 
> 1.8.1.5
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-10-25 10:06     ` Aliaksei Katovich
  0 siblings, 0 replies; 18+ messages in thread
From: Aliaksei Katovich @ 2013-10-25 10:06 UTC (permalink / raw)
  To: linux-arm-kernel

hi Vyacheslav;

> From: Tarek Dakhran <t.dakhran@samsung.com>
> 
> Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
> This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
> 
> Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
> Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
> ---
>  arch/arm/mach-exynos/Makefile |   2 +
>  arch/arm/mach-exynos/edcs.c   | 270 ++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 272 insertions(+)
>  create mode 100644 arch/arm/mach-exynos/edcs.c
> 
> diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
> index 5369615..ba6efdb 100644
> --- a/arch/arm/mach-exynos/Makefile
> +++ b/arch/arm/mach-exynos/Makefile
> @@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
>  
>  obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
>  obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
> +
> +obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
> diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
> new file mode 100644
> index 0000000..e304bd9
> --- /dev/null
> +++ b/arch/arm/mach-exynos/edcs.c
> @@ -0,0 +1,270 @@
> +/*
> + * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
> + *
> + * Copyright (c) 2013 Samsung Electronics Co., Ltd.
> + * Author: Tarek Dakhran <t.dakhran@samsung.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * EDCS(exynos dual cluster support) for Exynos5410 SoC.
> + */
> +
> +#include <linux/init.h>
> +#include <linux/io.h>
> +#include <linux/of_address.h>
> +#include <linux/spinlock.h>
> +#include <linux/errno.h>
> +
> +#include <asm/mcpm.h>
> +#include <asm/proc-fns.h>
> +#include <asm/cacheflush.h>
> +#include <asm/cputype.h>
> +#include <asm/cp15.h>
> +
> +#include <linux/arm-cci.h>
> +#include <mach/regs-pmu.h>
> +
> +#define EDCS_CPUS_PER_CLUSTER	4
> +#define EDCS_CLUSTERS		2
> +
> +/* Exynos5410 power management registers */
> +#define EDCS_CORE_CONFIGURATION(_nr)	(S5P_ARM_CORE0_CONFIGURATION	\
> +						+ ((_nr) * 0x80))
> +#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
> +#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
> +
> +#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)
> +#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
> +						 _nr * EDCS_CPUS_PER_CLUSTER)
> +
> +static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
> +
> +static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
> +static int core_count[EDCS_CLUSTERS];
> +
> +static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
> +				bool enable)
> +{
> +	unsigned int offset = cluster * EDCS_CPUS_PER_CLUSTER + cpu;
> +	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
> +
> +	if ((__raw_readl(EDCS_CORE_STATUS(offset)) & 0x3) != value)
> +		__raw_writel(value, EDCS_CORE_CONFIGURATION(offset));
> +}
> +
> +static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
> +{
> +	exynos_core_power_control(cpu, cluster, true);
> +}
> +
> +static void exynos_core_power_down(unsigned int cpu, unsigned int cluster)
> +{
> +	exynos_core_power_control(cpu, cluster, false);
> +}
> +
> +void set_boot_flag(unsigned int cpu, unsigned int mode)
> +{
> +	__raw_writel(mode, REG_CPU_STATE_ADDR(cpu));
> +}
> +
> +static int exynos_power_up(unsigned int cpu, unsigned int cluster)
> +{
> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER || cluster >= EDCS_CLUSTERS);
> +
> +	local_irq_disable();
> +	arch_spin_lock(&edcs_lock);
> +
> +	edcs_use_count[cpu][cluster]++;
> +	if (edcs_use_count[cpu][cluster] == 1) {
> +		++core_count[cluster];
> +		set_boot_flag(cpu, 0x2);
> +		exynos_core_power_up(cpu, cluster);
> +	} else if (edcs_use_count[cpu][cluster] != 2) {
> +		/*
> +		 * The only possible values are:
> +		 * 0 = CPU down
> +		 * 1 = CPU (still) up
> +		 * 2 = CPU requested to be up before it had a chance
> +		 *     to actually make itself down.
> +		 * Any other value is a bug.
> +		 */
> +		BUG();
> +	}
> +
> +	arch_spin_unlock(&edcs_lock);
> +	local_irq_enable();
> +
> +	return 0;
> +}
> +static void exynos_power_down(void)
> +{
> +	unsigned int mpidr, cpu, cluster;
> +	bool last_man = false, skip_wfi = false;
> +
> +	mpidr = read_cpuid_mpidr();
> +	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> +	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> +
> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
> +
> +	__mcpm_cpu_going_down(cpu, cluster);
> +
> +	arch_spin_lock(&edcs_lock);
> +	BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
> +	edcs_use_count[cpu][cluster]--;
> +	if (edcs_use_count[cpu][cluster] == 0) {
> +		--core_count[cluster];
> +		if (core_count[cluster] == 0)
> +			last_man = true;
> +	} else if (edcs_use_count[cpu][cluster] == 1) {
> +		/*
> +		 * A power_up request went ahead of us.
> +		 * Even if we do not want to shut this CPU down,
> +		 * the caller expects a certain state as if the WFI
> +		 * was aborted.  So let's continue with cache cleaning.
> +		 */
> +		skip_wfi = true;
> +	} else
> +		BUG();
> +
> +	if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
> +		arch_spin_unlock(&edcs_lock);
> +
> +		if (read_cpuid_part_number() == ARM_CPU_PART_CORTEX_A15) {
> +			/*
> +			 * On the Cortex-A15 we need to disable
> +			 * L2 prefetching before flushing the cache.
> +			 */
> +			asm volatile(
> +			"mcr	p15, 1, %0, c15, c0, 3\n\t"
> +			"isb\n\t"
> +			"dsb"
> +			: : "r" (0x400));
> +		}
> +
> +		/*
> +		 * We need to disable and flush the whole (L1 and L2) cache.
> +		 * Let's do it in the safest possible way i.e. with
> +		 * no memory access within the following sequence
> +		 * including the stack.
> +		 *
> +		 * Note: fp is preserved to the stack explicitly prior doing
> +		 * this since adding it to the clobber list is incompatible
> +		 * with having CONFIG_FRAME_POINTER=y.
> +		 */
> +		asm volatile(
> +		"str	fp, [sp, #-4]!\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
> +		"isb\n\t"
> +		"bl	v7_flush_dcache_all\n\t"
> +		"clrex\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
> +		"isb\n\t"
> +		"dsb\n\t"
> +		"ldr	fp, [sp], #4"
> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
> +			"r9", "r10", "lr", "memory");
> +
> +		cci_disable_port_by_cpu(mpidr);
> +
> +		__mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
> +
> +	} else {
> +		arch_spin_unlock(&edcs_lock);
> +		/*
> +			* We need to disable and flush only the L1 cache.
> +			* Let's do it in the safest possible way as above.
> +		*/
> +		asm volatile(
> +		"str	fp, [sp, #-4]!\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
> +		"isb\n\t"
> +		"bl	v7_flush_dcache_louis\n\t"
> +		"clrex\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
> +		"isb\n\t"
> +		"dsb\n\t"
> +		"ldr	fp, [sp], #4"
> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
> +		      "r9", "r10", "lr", "memory");
> +
> +	}
> +	__mcpm_cpu_down(cpu, cluster);
> +
> +	if (!skip_wfi) {
> +		exynos_core_power_down(cpu, cluster);
> +		wfi();
> +	}
> +}
> +
> +static const struct mcpm_platform_ops exynos_power_ops = {
> +	.power_up	= exynos_power_up,
> +	.power_down	= exynos_power_down,
> +};
> +
> +static void __init edcs_data_init(void)
> +{
> +	unsigned int mpidr, cpu, cluster;
> +
> +	mpidr = read_cpuid_mpidr();
> +	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> +	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> +
> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
> +	edcs_use_count[cpu][cluster] = 1;
> +	++core_count[cluster];
> +}
> +
> +/*
> + * Enable cluster-level coherency, in preparation for turning on the MMU.
> + */
> +static void __naked edcs_power_up_setup(unsigned int affinity_level)
> +{
> +	asm volatile ("\n"
> +	"b	cci_enable_port_for_self");
> +}

	This code breaks odroid-xu boot with NR_CPUS set to 8. Kernel panics
	like this:

%< -----------------------------------------------------------------------
[    5.315000] drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
[    5.320000] Freeing unused kernel memory: 216K (c049b000 - c04d1000)
[    5.325000] Unhandled fault: imprecise external abort (0x1406)@0x00000000
[    5.340000] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000007
[    5.340000] 
[    5.345000] mmc_host mmc0: Bus speed (slot 0) = 100000000Hz (slot req 200000Hz, actual 200000HZ div = 250)
[    5.355000] CPU: 3 PID: 1 Comm: init Not tainted 3.12.0-rc5-00006-g847e427-dirty #1
[    5.365000] [<c0014d40>] (unwind_backtrace+0x0/0xf8) from [<c00117cc>] (show_stack+0x10/0x14)
[    5.370000] [<c00117cc>] (show_stack+0x10/0x14) from [<c03633ac>] (dump_stack+0x6c/0xac)
[    5.380000] mmc_host mmc0: Bus speed (slot 0) = 100000000Hz (slot req 196079Hz, actual 196078HZ div = 255)
[    5.390000] [<c03633ac>] (dump_stack+0x6c/0xac) from [<c03609fc>] (panic+0x90/0x1e8)
[    5.395000] [<c03609fc>] (panic+0x90/0x1e8) from [<c002048c>] (do_exit+0x780/0x834)
[    5.405000] [<c002048c>] (do_exit+0x780/0x834) from [<c002062c>] (do_group_exit+0x3c/0xb0)
[    5.410000] [<c002062c>] (do_group_exit+0x3c/0xb0) from [<c002ae80>] (get_signal_to_deliver+0x1d4/0x534)
[    5.420000] [<c002ae80>] (get_signal_to_deliver+0x1d4/0x534) from [<c0010d08>] (do_signal+0x100/0x40c)
[    5.430000] [<c0010d08>] (do_signal+0x100/0x40c) from [<c0011348>] (do_work_pending+0x68/0xa8)
[    5.430000] mmc_host mmc1: Bus speed (slot 0) = 100000000Hz (slot req 50000000Hz, actual 50000000HZ div = 1)
[    5.430000] mmc1: new high speed SDHC card at address b368
[    5.435000] mmcblk0: mmc1:b368 USD   14.9 GiB 
[    5.440000]  mmcblk0: p1 p2 p3 < p5 p6 p7 >
[    5.455000] mmc_host mmc0: Bus speed (slot 0) = 100000000Hz (slot req 400000Hz, actual 400000HZ div = 125)
[    5.475000] [<c0011348>] (do_work_pending+0x68/0xa8) from [<c000e420>] (work_pending+0xc/0x20)
[    5.480000] CPU1: stopping
[    5.480000] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.12.0-rc5-00006-g847e427-dirty #1
[    5.480000] [<c0014d40>] (unwind_backtrace+0x0/0xf8) from [<c00117cc>] (show_stack+0x10/0x14)
[    5.480000] [<c00117cc>] (show_stack+0x10/0x14) from [<c03633ac>] (dump_stack+0x6c/0xac)
[    5.480000] [<c03633ac>] (dump_stack+0x6c/0xac) from [<c0013604>] (handle_IPI+0xf8/0x11c)
[    5.480000] [<c0013604>] (handle_IPI+0xf8/0x11c) from [<c000851c>] (gic_handle_irq+0x60/0x68)
[    5.480000] [<c000851c>] (gic_handle_irq+0x60/0x68) from [<c00122c0>] (__irq_svc+0x40/0x70)
[    5.480000] Exception stack(0xef0a7f88 to 0xef0a7fd0)
[    5.480000] 7f80:                   00000001 00000000 008d20ff 00000001 00000000 00000000
[    5.480000] 7fa0: c04d07a0 60000113 010da000 412fc0f3 c15aa7a0 00000000 00000001 ef0a7fd0
[    5.480000] 7fc0: c0072d74 c0072d78 20000113 ffffffff
[    5.480000] [<c00122c0>] (__irq_svc+0x40/0x70) from [<c0072d78>] (rcu_idle_exit+0x68/0xb8)
[    5.480000] [<c0072d78>] (rcu_idle_exit+0x68/0xb8) from [<c00550a4>] (cpu_startup_entry+0x6c/0x148)
[    5.480000] [<c00550a4>] (cpu_startup_entry+0x6c/0x148) from [<400085c4>] (0x400085c4)
[    5.480000] CPU0: stopping
[    5.480000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.0-rc5-00006-g847e427-dirty #1
[    5.480000] [<c0014d40>] (unwind_backtrace+0x0/0xf8) from [<c00117cc>] (show_stack+0x10/0x14)
[    5.480000] [<c00117cc>] (show_stack+0x10/0x14) from [<c03633ac>] (dump_stack+0x6c/0xac)
[    5.480000] [<c03633ac>] (dump_stack+0x6c/0xac) from [<c0013604>] (handle_IPI+0xf8/0x11c)
[    5.480000] [<c0013604>] (handle_IPI+0xf8/0x11c) from [<c000851c>] (gic_handle_irq+0x60/0x68)
[    5.480000] [<c000851c>] (gic_handle_irq+0x60/0x68) from [<c00122c0>] (__irq_svc+0x40/0x70)
[    5.480000] Exception stack(0xc04d3f70 to 0xc04d3fb8)
[    5.480000] SMP: failed to stop secondary CPUs
[    5.480000] 3f60:                                     00000000 00000000 00002190 00000000
[    5.480000] 3f80: c04d2000 c050a88f 00000001 c050a88f c04da44c 412fc0f3 c036a960 00000000
[    5.480000] 3fa0: 00000020 c04d3fb8 c000f5d4 c000f5d8 60000113 ffffffff
[    5.480000] [<c00122c0>] (__irq_svc+0x40/0x70) from [<c000f5d8>] (arch_cpu_idle+0x28/0x30)
[    5.480000] [<c000f5d8>] (arch_cpu_idle+0x28/0x30) from [<c0055094>] (cpu_startup_entry+0x5c/0x148)
[    5.480000] [<c0055094>] (cpu_startup_entry+0x5c/0x148) from [<c049ba9c>] (start_kernel+0x32c/0x384)
[    5.480000] CPU2: stopping
[    5.480000] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.12.0-rc5-00006-g847e427-dirty #1
[    5.480000] [<c0014d40>] (unwind_backtrace+0x0/0xf8) from [<c00117cc>] (show_stack+0x10/0x14)
[    5.480000] [<c00117cc>] (show_stack+0x10/0x14) from [<c03633ac>] (dump_stack+0x6c/0xac)
[    5.480000] [<c03633ac>] (dump_stack+0x6c/0xac) from [<c0013604>] (handle_IPI+0xf8/0x11c)
[    5.480000] [<c0013604>] (handle_IPI+0xf8/0x11c) from [<c000851c>] (gic_handle_irq+0x60/0x68)
[    5.480000] [<c000851c>] (gic_handle_irq+0x60/0x68) from [<c00122c0>] (__irq_svc+0x40/0x70)
[    5.480000] Exception stack(0xef0a9fa0 to 0xef0a9fe8)
[    5.480000] 9fa0: 00000002 00000000 008e4858 00000000 ef0a8000 c050a88f 00000001 c050a88f
[    5.480000] 9fc0: c04da44c 412fc0f3 c036a960 00000000 00000001 ef0a9fe8 c000f5d4 c000f5d8
[    5.480000] 9fe0: 60000113 ffffffff
[    5.480000] [<c00122c0>] (__irq_svc+0x40/0x70) from [<c000f5d8>] (arch_cpu_idle+0x28/0x30)
[    5.480000] [<c000f5d8>] (arch_cpu_idle+0x28/0x30) from [<c0055094>] (cpu_startup_entry+0x5c/0x148)
[    5.480000] [<c0055094>] (cpu_startup_entry+0x5c/0x148) from [<400085c4>] (0x400085c4)
%< -----------------------------------------------------------------------

	I checked arch/arm/mach-vexpress/tc2_pm.c to see how CCI is enabled
	there an realized that you should follow same pattern, i.e.:

        asm volatile (" \n"
"       cmp     r0, #1 \n"
"       bxne    lr \n"
"       b       cci_enable_port_for_self ");

	In this case only one cluster (4 LITTLE cores for Exynos5410) will be
	initialized@boot time. And no panic.

--
Aliaksei

> +
> +static int __init edcs_init(void)
> +{
> +	int ret;
> +	struct device_node *node;
> +
> +	node = of_find_compatible_node(NULL, NULL, "samsung,exynos5410");
> +	if (!node)
> +		return -ENODEV;
> +
> +	if (!cci_probed())
> +		return -ENODEV;
> +
> +	/*
> +	 * Future entries into the kernel can now go
> +	 * through the cluster entry vectors.
> +	 */
> +	__raw_writel(virt_to_phys(mcpm_entry_point),
> +				S5P_VA_SYSRAM_NS + 0x1c);
> +
> +	edcs_data_init();
> +	mcpm_smp_set_ops();
> +
> +	ret = mcpm_platform_register(&exynos_power_ops);
> +	if (!ret) {
> +		mcpm_sync_init(edcs_power_up_setup);
> +		pr_info("EDCS power management initialized\n");
> +	}
> +	return ret;
> +}
> +
> +early_initcall(edcs_init);
> -- 
> 1.8.1.5
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-10-18 12:53 Dave Martin
  0 siblings, 0 replies; 18+ messages in thread
From: Dave Martin @ 2013-10-18 12:53 UTC (permalink / raw)
  To: Tarek Dakhran
  Cc: Vyacheslav Tyrtov, Mark Rutland, devicetree, Kukjin Kim,
	Russell King, Daniel Lezcano, Pawel Moll, Ian Campbell,
	Stephen Warren, linux-doc, linux-kernel, Rob Herring,
	linux-samsung-soc, Rob Landley, Ben Dooks, Mike Turquette,
	Thomas Gleixner, Naour Romain, linux-arm-kernel, Heiko Stuebner

On Fri, Oct 18, 2013 at 12:29:03PM +0400, Tarek Dakhran wrote:
> On 17.10.2013 19:46, Dave Martin wrote:
> > On Mon, Oct 14, 2013 at 07:08:24PM +0400, Vyacheslav Tyrtov wrote:
> >> From: Tarek Dakhran <t.dakhran@samsung.com>
> >> 
> >> Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
> >> This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
> >> 
> >> Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
> >> Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
> >> ---
> >>   arch/arm/mach-exynos/Makefile |   2 +
> >>   arch/arm/mach-exynos/edcs.c   | 270 ++++++++++++++++++++++++++++++++++++++++++
> >>   2 files changed, 272 insertions(+)
> >>   create mode 100644 arch/arm/mach-exynos/edcs.c
> >> 
> >> diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
> >> index 5369615..ba6efdb 100644
> >> --- a/arch/arm/mach-exynos/Makefile
> >> +++ b/arch/arm/mach-exynos/Makefile
> >> @@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
> >>   obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
> >>   obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
> >> +
> >> +obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
> >> diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
> >> new file mode 100644
> >> index 0000000..e304bd9
> >> --- /dev/null
> >> +++ b/arch/arm/mach-exynos/edcs.c
> >> @@ -0,0 +1,270 @@
> >> +/*
> >> + * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
> >> + *
> >> + * Copyright (c) 2013 Samsung Electronics Co., Ltd.
> >> + * Author: Tarek Dakhran <t.dakhran@samsung.com>
> >> + *
> >> + * This program is free software; you can redistribute it and/or modify
> >> + * it under the terms of the GNU General Public License version 2 as
> >> + * published by the Free Software Foundation.
> >> + *
> >> + * EDCS(exynos dual cluster support) for Exynos5410 SoC.
> >> + */
> >> +

> [snip]

> >> +#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
> >> +#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
> >> +
> >> +#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)
> > Is there any reason why S5P_VA_SYSRAM_NS needs to be a static mapping?

> What do you mean by "static mapping"?

See below

> > 
> >> +#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
> >> +						 _nr * EDCS_CPUS_PER_CLUSTER)
> >> +
> >> +static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
> >> +
> >> +static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
> >> +static int core_count[EDCS_CLUSTERS];
> >> +
> >> +static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
> >> +				bool enable)
> >> +{
> >> +	unsigned int offset = cluster * EDCS_CPUS_PER_CLUSTER + cpu;
> >> +	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
> >> +
> >> +	if ((__raw_readl(EDCS_CORE_STATUS(offset)) & 0x3) != value)
> >> +		__raw_writel(value, EDCS_CORE_CONFIGURATION(offset));
> > I think you need to replace all the __raw_readl() / __raw_writel() calls
> > in this file with readl_relaxed() / writel_relaxed().
> > 
> > This ensures little-endian byte order, so that BE8 kernels will work.
> > 
> Will be done.
> >> +}
> >> +
> >> +static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
> >> +{
> >> +	exynos_core_power_control(cpu, cluster, true);
> >> +}

> [snip]

> >> +
> >> +	edcs_use_count[cpu][cluster]++;
> >> +	if (edcs_use_count[cpu][cluster] == 1) {
> >> +		++core_count[cluster];
> >> +		set_boot_flag(cpu, 0x2);
> > 0x2 looks like a magic number.  Can we have a #define for that?

> Will be done.

Thanks

> > If the boot flag is read by the inbound CPU or by a peripheral then
> > there are memory ordering issues to take into account.  Otherwise, can't
> > the inbound CPU come online and race with the write to the boot flag?

> Inbound CPU doesn't write the boot flag.

Sorry, I didn't explain this clearly.  The scenario I'm thinking about is:

	CPU0 writes 2 to the boot flag

	The write is temporarily held in a write buffer somewhere in the
	memory system.

	CPU0 pokes the power controller to bring CPU1 online

	CPU1 powers up and starts running its boot ROM

	CPU1 reads the boot flag, but misses the value written by CPU0
	because that is still held in a write buffer somewhere, and isn't
	globally visible.

	... and finally ...

	The write to the boot flag drains and becomes globally visible.
	But it's too late now.

There's also the problem that set_boot_flag() and the read by the
boot ROM probably don't use the same memory type (Device versus
Strongly-Ordered for example).  The ARM Architecture doesn't guarantee
full coherency in situations like this.


I believe that a correct solution to this problem is to put a wmb()
between setting the boot flag and poking the power controller.

Putting the wmb() before the writel in exynos_core_power_control()
should do the job.

> > What is the memory type of REG_STATE_ADDR()?
> Same type as S5P_VA_SYSRAM_NS.
> This 4K region is mapped as follows:
> 
> static struct map_desc exynos5410_iodesc[] __initdata = {
> 	{
> 		.virtual	= (unsigned long)S5P_VA_SYSRAM_NS,
> 		.pfn		= __phys_to_pfn(EXYNOS5410_PA_SYSRAM_NS),
> 		.length		= SZ_4K,
> 		.type		= MT_DEVICE,
> 	},
> };

This is what I mean by a static mapping.

If this mapping isn't needed early (i.e., before ioremap works) then
ioremap can be used instead of reserving a specific virtual address.

This isn't a problem for this patch, just nice to have in general.

[...]

> >> +	arch_spin_lock(&edcs_lock);
> >> +	BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
> >> +	edcs_use_count[cpu][cluster]--;
> >> +	if (edcs_use_count[cpu][cluster] == 0) {
> >> +		--core_count[cluster];
> >> +		if (core_count[cluster] == 0)
> >> +			last_man = true;
> >> +	} else if (edcs_use_count[cpu][cluster] == 1) {
> >> +		/*
> >> +		 * A power_up request went ahead of us.
> >> +		 * Even if we do not want to shut this CPU down,
> >> +		 * the caller expects a certain state as if the WFI
> >> +		 * was aborted.  So let's continue with cache cleaning.
> >> +		 */
> >> +		skip_wfi = true;
> >> +	} else
> >> +		BUG();
> > For TC2, Lorenzo discovered need to disable the GIC CPU interface, so
> > that no suprious wakeups can happen once the power controller is
> > configured to power the CPU down.
> > 
> > The problem is that a spurious wakeup can trigger a race in the power
> > controller when the last man powers down, where the power controller
> > continues to wait for the cluster to power down, but it never happens.
> > 
> > http://lists.infradead.org/pipermail/linux-arm-kernel/2013-September/200917.html
> > ([PATCH v2] arm: vexpress: tc2: fix hotplug/idle/kexec race on cluster power down)
> > 
> > 
> > This might not be a problem for Exynos5410 -- it depends on the way the
> > power control logic works.
> > 
> > On the other hand, disabling the GIC CPU interface here is cheap to do,
> > even if it's not really needed.

Did you have any further thoughts on this issue?

> >> +
> >> +	if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
> >> +		arch_spin_unlock(&edcs_lock);
> >> +
> >> +		if (read_cpuid_part_number() == ARM_CPU_PART_CORTEX_A15) {
> >> +			/*
> >> +			 * On the Cortex-A15 we need to disable
> >> +			 * L2 prefetching before flushing the cache.
> >> +			 */
> >> +			asm volatile(
> >> +			"mcr	p15, 1, %0, c15, c0, 3\n\t"
> >> +			"isb\n\t"
> >> +			"dsb"
> >> +			: : "r" (0x400));
> >> +		}
> >> +
> >> +		/*
> >> +		 * We need to disable and flush the whole (L1 and L2) cache.
> >> +		 * Let's do it in the safest possible way i.e. with
> >> +		 * no memory access within the following sequence
> >> +		 * including the stack.
> >> +		 *
> >> +		 * Note: fp is preserved to the stack explicitly prior doing
> >> +		 * this since adding it to the clobber list is incompatible
> >> +		 * with having CONFIG_FRAME_POINTER=y.
> >> +		 */
> >> +		asm volatile(
> >> +		"str	fp, [sp, #-4]!\n\t"
> >> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
> >> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
> >> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
> >> +		"isb\n\t"
> >> +		"bl	v7_flush_dcache_all\n\t"
> >> +		"clrex\n\t"
> >> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
> >> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
> >> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
> >> +		"isb\n\t"
> >> +		"dsb\n\t"
> >> +		"ldr	fp, [sp], #4"
> >> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
> >> +			"r9", "r10", "lr", "memory");
> > For these code sequences, please take a look at
> > 
> > http://lists.infradead.org/pipermail/linux-arm-kernel/2013-October/205085.html
> > [PATCH] ARM: cacheflush: consolidate single-CPU ARMv7 cache disabling code
> > 
> > The aim is to consolidate on a macro that can be shared between
> > backends.
> Will be done right after Nicolases patch will be applied to the mainline.

OK, keep an eye on it.

> >> +static int __init edcs_init(void)
> >> +{

[...]

> >> +	/*
> >> +	 * Future entries into the kernel can now go
> >> +	 * through the cluster entry vectors.
> >> +	 */
> >> +	__raw_writel(virt_to_phys(mcpm_entry_point),
> >> +				S5P_VA_SYSRAM_NS + 0x1c);
> > It would me more readable to have a #define to describe what
> > S5P_VA_SYSRAM_NS + 0x1c is.
> Will be done.

> > Also, what is the memory type of S5P_VA_SYSRAM_NS?  How is the entry
> > point address value read?  Does the inbound CPU's boot ROM or firmware
> > read it?

> The former (i.e. boot ROM reads it).

OK -- since that write is to device memory, I think we can assume in
practice that it will drain before the first secondary comes online.

If you follow the above advice about adding a wmb() in
exynos_core_power_control(), that should ensure correct ordering for
this too, if I've understood correctly.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
  2013-10-17 15:46 Dave Martin
@ 2013-10-18  8:29   ` Tarek Dakhran
  0 siblings, 0 replies; 18+ messages in thread
From: Tarek Dakhran @ 2013-10-18  8:29 UTC (permalink / raw)
  To: Dave Martin, Vyacheslav Tyrtov
  Cc: linux-kernel, Mark Rutland, devicetree, Kukjin Kim, Russell King,
	Ben Dooks, Pawel Moll, Ian Campbell, Stephen Warren, linux-doc,
	Rob Herring, Daniel Lezcano, linux-samsung-soc, Rob Landley,
	Mike Turquette, Thomas Gleixner, Naour Romain, linux-arm-kernel,
	Heiko Stuebner

On 17.10.2013 19:46, Dave Martin wrote:
> On Mon, Oct 14, 2013 at 07:08:24PM +0400, Vyacheslav Tyrtov wrote:
>> From: Tarek Dakhran <t.dakhran@samsung.com>
>>
>> Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
>> This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
>>
>> Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
>> Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
>> ---
>>   arch/arm/mach-exynos/Makefile |   2 +
>>   arch/arm/mach-exynos/edcs.c   | 270 ++++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 272 insertions(+)
>>   create mode 100644 arch/arm/mach-exynos/edcs.c
>>
>> diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
>> index 5369615..ba6efdb 100644
>> --- a/arch/arm/mach-exynos/Makefile
>> +++ b/arch/arm/mach-exynos/Makefile
>> @@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
>>   
>>   obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
>>   obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
>> +
>> +obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
>> diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
>> new file mode 100644
>> index 0000000..e304bd9
>> --- /dev/null
>> +++ b/arch/arm/mach-exynos/edcs.c
>> @@ -0,0 +1,270 @@
>> +/*
>> + * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
>> + *
>> + * Copyright (c) 2013 Samsung Electronics Co., Ltd.
>> + * Author: Tarek Dakhran <t.dakhran@samsung.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * EDCS(exynos dual cluster support) for Exynos5410 SoC.
>> + */
>> +
[snip]
>> +#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
>> +#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
>> +
>> +#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)
> Is there any reason why S5P_VA_SYSRAM_NS needs to be a static mapping?
What do you mean by "static mapping"?
>
>> +#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
>> +						 _nr * EDCS_CPUS_PER_CLUSTER)
>> +
>> +static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
>> +
>> +static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
>> +static int core_count[EDCS_CLUSTERS];
>> +
>> +static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
>> +				bool enable)
>> +{
>> +	unsigned int offset = cluster * EDCS_CPUS_PER_CLUSTER + cpu;
>> +	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
>> +
>> +	if ((__raw_readl(EDCS_CORE_STATUS(offset)) & 0x3) != value)
>> +		__raw_writel(value, EDCS_CORE_CONFIGURATION(offset));
> I think you need to replace all the __raw_readl() / __raw_writel() calls
> in this file with readl_relaxed() / writel_relaxed().
>
> This ensures little-endian byte order, so that BE8 kernels will work.
>
Will be done.
>> +}
>> +
>> +static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
>> +{
>> +	exynos_core_power_control(cpu, cluster, true);
>> +}
[snip]
>> +
>> +	edcs_use_count[cpu][cluster]++;
>> +	if (edcs_use_count[cpu][cluster] == 1) {
>> +		++core_count[cluster];
>> +		set_boot_flag(cpu, 0x2);
> 0x2 looks like a magic number.  Can we have a #define for that?
Will be done.
> If the boot flag is read by the inbound CPU or by a peripheral then
> there are memory ordering issues to take into account.  Otherwise, can't
> the inbound CPU come online and race with the write to the boot flag?
Inbound CPU doesn't write the boot flag.
> What is the memory type of REG_STATE_ADDR()?
Same type as S5P_VA_SYSRAM_NS.
This 4K region is mapped as follows:

static struct map_desc exynos5410_iodesc[] __initdata = {
	{
		.virtual	= (unsigned long)S5P_VA_SYSRAM_NS,
		.pfn		= __phys_to_pfn(EXYNOS5410_PA_SYSRAM_NS),
		.length		= SZ_4K,
		.type		= MT_DEVICE,
	},
};


>> +		exynos_core_power_up(cpu, cluster);
>> +	} else if (edcs_use_count[cpu][cluster] != 2) {
>> +		/*
>> +		 * The only possible values are:
>> +		 * 0 = CPU down
>> +		 * 1 = CPU (still) up
>> +		 * 2 = CPU requested to be up before it had a chance
>> +		 *     to actually make itself down.
>> +		 * Any other value is a bug.
>> +		 */
>> +		BUG();
>> +	}
>> +
>> +	arch_spin_unlock(&edcs_lock);
>> +	local_irq_enable();
>> +
>> +	return 0;
>> +}
>> +static void exynos_power_down(void)
>> +{
>> +	unsigned int mpidr, cpu, cluster;
>> +	bool last_man = false, skip_wfi = false;
>> +
>> +	mpidr = read_cpuid_mpidr();
>> +	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
>> +	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
>> +
>> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
>> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
>> +
>> +	__mcpm_cpu_going_down(cpu, cluster);
>> +
>> +	arch_spin_lock(&edcs_lock);
>> +	BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
>> +	edcs_use_count[cpu][cluster]--;
>> +	if (edcs_use_count[cpu][cluster] == 0) {
>> +		--core_count[cluster];
>> +		if (core_count[cluster] == 0)
>> +			last_man = true;
>> +	} else if (edcs_use_count[cpu][cluster] == 1) {
>> +		/*
>> +		 * A power_up request went ahead of us.
>> +		 * Even if we do not want to shut this CPU down,
>> +		 * the caller expects a certain state as if the WFI
>> +		 * was aborted.  So let's continue with cache cleaning.
>> +		 */
>> +		skip_wfi = true;
>> +	} else
>> +		BUG();
> For TC2, Lorenzo discovered need to disable the GIC CPU interface, so
> that no suprious wakeups can happen once the power controller is
> configured to power the CPU down.
>
> The problem is that a spurious wakeup can trigger a race in the power
> controller when the last man powers down, where the power controller
> continues to wait for the cluster to power down, but it never happens.
>
> http://lists.infradead.org/pipermail/linux-arm-kernel/2013-September/200917.html
> ([PATCH v2] arm: vexpress: tc2: fix hotplug/idle/kexec race on cluster power down)
>
>
> This might not be a problem for Exynos5410 -- it depends on the way the
> power control logic works.
>
> On the other hand, disabling the GIC CPU interface here is cheap to do,
> even if it's not really needed.
>
>> +
>> +	if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
>> +		arch_spin_unlock(&edcs_lock);
>> +
>> +		if (read_cpuid_part_number() == ARM_CPU_PART_CORTEX_A15) {
>> +			/*
>> +			 * On the Cortex-A15 we need to disable
>> +			 * L2 prefetching before flushing the cache.
>> +			 */
>> +			asm volatile(
>> +			"mcr	p15, 1, %0, c15, c0, 3\n\t"
>> +			"isb\n\t"
>> +			"dsb"
>> +			: : "r" (0x400));
>> +		}
>> +
>> +		/*
>> +		 * We need to disable and flush the whole (L1 and L2) cache.
>> +		 * Let's do it in the safest possible way i.e. with
>> +		 * no memory access within the following sequence
>> +		 * including the stack.
>> +		 *
>> +		 * Note: fp is preserved to the stack explicitly prior doing
>> +		 * this since adding it to the clobber list is incompatible
>> +		 * with having CONFIG_FRAME_POINTER=y.
>> +		 */
>> +		asm volatile(
>> +		"str	fp, [sp, #-4]!\n\t"
>> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
>> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
>> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
>> +		"isb\n\t"
>> +		"bl	v7_flush_dcache_all\n\t"
>> +		"clrex\n\t"
>> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
>> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
>> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
>> +		"isb\n\t"
>> +		"dsb\n\t"
>> +		"ldr	fp, [sp], #4"
>> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
>> +			"r9", "r10", "lr", "memory");
> For these code sequences, please take a look at
>
> http://lists.infradead.org/pipermail/linux-arm-kernel/2013-October/205085.html
> [PATCH] ARM: cacheflush: consolidate single-CPU ARMv7 cache disabling code
>
> The aim is to consolidate on a macro that can be shared between
> backends.
Will be done right after Nicolases patch will be applied to the mainline.
>
>> +
>> +		cci_disable_port_by_cpu(mpidr);
>> +
>> +		__mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
>> +
>> +	} else {
>> +		arch_spin_unlock(&edcs_lock);
>> +		/*
>> +			* We need to disable and flush only the L1 cache.
>> +			* Let's do it in the safest possible way as above.
>> +		*/
>> +		asm volatile(
>> +		"str	fp, [sp, #-4]!\n\t"
>> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
>> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
>> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
>> +		"isb\n\t"
>> +		"bl	v7_flush_dcache_louis\n\t"
>> +		"clrex\n\t"
>> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
>> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
>> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
>> +		"isb\n\t"
>> +		"dsb\n\t"
>> +		"ldr	fp, [sp], #4"
>> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
>> +		      "r9", "r10", "lr", "memory");
>> +
>> +	}
>> +	__mcpm_cpu_down(cpu, cluster);
>> +
>> +	if (!skip_wfi) {
>> +		exynos_core_power_down(cpu, cluster);
>> +		wfi();
>> +	}
>> +}
>> +
>> +static const struct mcpm_platform_ops exynos_power_ops = {
>> +	.power_up	= exynos_power_up,
>> +	.power_down	= exynos_power_down,
>> +};
>> +
>> +static void __init edcs_data_init(void)
>> +{
>> +	unsigned int mpidr, cpu, cluster;
>> +
>> +	mpidr = read_cpuid_mpidr();
>> +	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
>> +	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
>> +
>> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
>> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
>> +	edcs_use_count[cpu][cluster] = 1;
>> +	++core_count[cluster];
>> +}
>> +
>> +/*
>> + * Enable cluster-level coherency, in preparation for turning on the MMU.
>> + */
>> +static void __naked edcs_power_up_setup(unsigned int affinity_level)
>> +{
>> +	asm volatile ("\n"
>> +	"b	cci_enable_port_for_self");
>> +}
>> +
>> +static int __init edcs_init(void)
>> +{
>> +	int ret;
>> +	struct device_node *node;
>> +
>> +	node = of_find_compatible_node(NULL, NULL, "samsung,exynos5410");
>> +	if (!node)
>> +		return -ENODEV;
>> +
>> +	if (!cci_probed())
>> +		return -ENODEV;
>> +
>> +	/*
>> +	 * Future entries into the kernel can now go
>> +	 * through the cluster entry vectors.
>> +	 */
>> +	__raw_writel(virt_to_phys(mcpm_entry_point),
>> +				S5P_VA_SYSRAM_NS + 0x1c);
> It would me more readable to have a #define to describe what
> S5P_VA_SYSRAM_NS + 0x1c is.
Will be done.
> Also, what is the memory type of S5P_VA_SYSRAM_NS?  How is the entry
> point address value read?  Does the inbound CPU's boot ROM or firmware
> read it?
The former (i.e. boot ROM reads it).
> If it is read via some non-coherent side-channel, or if S5P_VA_SYSRAM_NS
> is mapped as normal memory then we would need some explicit
> synchronisation, but I suspect this is not the case (?)
>
> Cheers
> ---Dave
>
>> +
>> +	edcs_data_init();
>> +	mcpm_smp_set_ops();
>> +
>> +	ret = mcpm_platform_register(&exynos_power_ops);
>> +	if (!ret) {
>> +		mcpm_sync_init(edcs_power_up_setup);
>> +		pr_info("EDCS power management initialized\n");
>> +	}
>> +	return ret;
>> +}
>> +
>> +early_initcall(edcs_init);
>> -- 
>> 1.8.1.5
>>
>>
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

Thanks for your comments.

Sincerely yours,
     Tarek Dakhran

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-10-18  8:29   ` Tarek Dakhran
  0 siblings, 0 replies; 18+ messages in thread
From: Tarek Dakhran @ 2013-10-18  8:29 UTC (permalink / raw)
  To: linux-arm-kernel

On 17.10.2013 19:46, Dave Martin wrote:
> On Mon, Oct 14, 2013 at 07:08:24PM +0400, Vyacheslav Tyrtov wrote:
>> From: Tarek Dakhran <t.dakhran@samsung.com>
>>
>> Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
>> This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
>>
>> Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
>> Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
>> ---
>>   arch/arm/mach-exynos/Makefile |   2 +
>>   arch/arm/mach-exynos/edcs.c   | 270 ++++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 272 insertions(+)
>>   create mode 100644 arch/arm/mach-exynos/edcs.c
>>
>> diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
>> index 5369615..ba6efdb 100644
>> --- a/arch/arm/mach-exynos/Makefile
>> +++ b/arch/arm/mach-exynos/Makefile
>> @@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
>>   
>>   obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
>>   obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
>> +
>> +obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
>> diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
>> new file mode 100644
>> index 0000000..e304bd9
>> --- /dev/null
>> +++ b/arch/arm/mach-exynos/edcs.c
>> @@ -0,0 +1,270 @@
>> +/*
>> + * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
>> + *
>> + * Copyright (c) 2013 Samsung Electronics Co., Ltd.
>> + * Author: Tarek Dakhran <t.dakhran@samsung.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * EDCS(exynos dual cluster support) for Exynos5410 SoC.
>> + */
>> +
[snip]
>> +#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
>> +#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
>> +
>> +#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)
> Is there any reason why S5P_VA_SYSRAM_NS needs to be a static mapping?
What do you mean by "static mapping"?
>
>> +#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
>> +						 _nr * EDCS_CPUS_PER_CLUSTER)
>> +
>> +static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
>> +
>> +static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
>> +static int core_count[EDCS_CLUSTERS];
>> +
>> +static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
>> +				bool enable)
>> +{
>> +	unsigned int offset = cluster * EDCS_CPUS_PER_CLUSTER + cpu;
>> +	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
>> +
>> +	if ((__raw_readl(EDCS_CORE_STATUS(offset)) & 0x3) != value)
>> +		__raw_writel(value, EDCS_CORE_CONFIGURATION(offset));
> I think you need to replace all the __raw_readl() / __raw_writel() calls
> in this file with readl_relaxed() / writel_relaxed().
>
> This ensures little-endian byte order, so that BE8 kernels will work.
>
Will be done.
>> +}
>> +
>> +static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
>> +{
>> +	exynos_core_power_control(cpu, cluster, true);
>> +}
[snip]
>> +
>> +	edcs_use_count[cpu][cluster]++;
>> +	if (edcs_use_count[cpu][cluster] == 1) {
>> +		++core_count[cluster];
>> +		set_boot_flag(cpu, 0x2);
> 0x2 looks like a magic number.  Can we have a #define for that?
Will be done.
> If the boot flag is read by the inbound CPU or by a peripheral then
> there are memory ordering issues to take into account.  Otherwise, can't
> the inbound CPU come online and race with the write to the boot flag?
Inbound CPU doesn't write the boot flag.
> What is the memory type of REG_STATE_ADDR()?
Same type as S5P_VA_SYSRAM_NS.
This 4K region is mapped as follows:

static struct map_desc exynos5410_iodesc[] __initdata = {
	{
		.virtual	= (unsigned long)S5P_VA_SYSRAM_NS,
		.pfn		= __phys_to_pfn(EXYNOS5410_PA_SYSRAM_NS),
		.length		= SZ_4K,
		.type		= MT_DEVICE,
	},
};


>> +		exynos_core_power_up(cpu, cluster);
>> +	} else if (edcs_use_count[cpu][cluster] != 2) {
>> +		/*
>> +		 * The only possible values are:
>> +		 * 0 = CPU down
>> +		 * 1 = CPU (still) up
>> +		 * 2 = CPU requested to be up before it had a chance
>> +		 *     to actually make itself down.
>> +		 * Any other value is a bug.
>> +		 */
>> +		BUG();
>> +	}
>> +
>> +	arch_spin_unlock(&edcs_lock);
>> +	local_irq_enable();
>> +
>> +	return 0;
>> +}
>> +static void exynos_power_down(void)
>> +{
>> +	unsigned int mpidr, cpu, cluster;
>> +	bool last_man = false, skip_wfi = false;
>> +
>> +	mpidr = read_cpuid_mpidr();
>> +	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
>> +	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
>> +
>> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
>> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
>> +
>> +	__mcpm_cpu_going_down(cpu, cluster);
>> +
>> +	arch_spin_lock(&edcs_lock);
>> +	BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
>> +	edcs_use_count[cpu][cluster]--;
>> +	if (edcs_use_count[cpu][cluster] == 0) {
>> +		--core_count[cluster];
>> +		if (core_count[cluster] == 0)
>> +			last_man = true;
>> +	} else if (edcs_use_count[cpu][cluster] == 1) {
>> +		/*
>> +		 * A power_up request went ahead of us.
>> +		 * Even if we do not want to shut this CPU down,
>> +		 * the caller expects a certain state as if the WFI
>> +		 * was aborted.  So let's continue with cache cleaning.
>> +		 */
>> +		skip_wfi = true;
>> +	} else
>> +		BUG();
> For TC2, Lorenzo discovered need to disable the GIC CPU interface, so
> that no suprious wakeups can happen once the power controller is
> configured to power the CPU down.
>
> The problem is that a spurious wakeup can trigger a race in the power
> controller when the last man powers down, where the power controller
> continues to wait for the cluster to power down, but it never happens.
>
> http://lists.infradead.org/pipermail/linux-arm-kernel/2013-September/200917.html
> ([PATCH v2] arm: vexpress: tc2: fix hotplug/idle/kexec race on cluster power down)
>
>
> This might not be a problem for Exynos5410 -- it depends on the way the
> power control logic works.
>
> On the other hand, disabling the GIC CPU interface here is cheap to do,
> even if it's not really needed.
>
>> +
>> +	if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
>> +		arch_spin_unlock(&edcs_lock);
>> +
>> +		if (read_cpuid_part_number() == ARM_CPU_PART_CORTEX_A15) {
>> +			/*
>> +			 * On the Cortex-A15 we need to disable
>> +			 * L2 prefetching before flushing the cache.
>> +			 */
>> +			asm volatile(
>> +			"mcr	p15, 1, %0, c15, c0, 3\n\t"
>> +			"isb\n\t"
>> +			"dsb"
>> +			: : "r" (0x400));
>> +		}
>> +
>> +		/*
>> +		 * We need to disable and flush the whole (L1 and L2) cache.
>> +		 * Let's do it in the safest possible way i.e. with
>> +		 * no memory access within the following sequence
>> +		 * including the stack.
>> +		 *
>> +		 * Note: fp is preserved to the stack explicitly prior doing
>> +		 * this since adding it to the clobber list is incompatible
>> +		 * with having CONFIG_FRAME_POINTER=y.
>> +		 */
>> +		asm volatile(
>> +		"str	fp, [sp, #-4]!\n\t"
>> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
>> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
>> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
>> +		"isb\n\t"
>> +		"bl	v7_flush_dcache_all\n\t"
>> +		"clrex\n\t"
>> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
>> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
>> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
>> +		"isb\n\t"
>> +		"dsb\n\t"
>> +		"ldr	fp, [sp], #4"
>> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
>> +			"r9", "r10", "lr", "memory");
> For these code sequences, please take a look at
>
> http://lists.infradead.org/pipermail/linux-arm-kernel/2013-October/205085.html
> [PATCH] ARM: cacheflush: consolidate single-CPU ARMv7 cache disabling code
>
> The aim is to consolidate on a macro that can be shared between
> backends.
Will be done right after Nicolases patch will be applied to the mainline.
>
>> +
>> +		cci_disable_port_by_cpu(mpidr);
>> +
>> +		__mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
>> +
>> +	} else {
>> +		arch_spin_unlock(&edcs_lock);
>> +		/*
>> +			* We need to disable and flush only the L1 cache.
>> +			* Let's do it in the safest possible way as above.
>> +		*/
>> +		asm volatile(
>> +		"str	fp, [sp, #-4]!\n\t"
>> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
>> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
>> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
>> +		"isb\n\t"
>> +		"bl	v7_flush_dcache_louis\n\t"
>> +		"clrex\n\t"
>> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
>> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
>> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
>> +		"isb\n\t"
>> +		"dsb\n\t"
>> +		"ldr	fp, [sp], #4"
>> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
>> +		      "r9", "r10", "lr", "memory");
>> +
>> +	}
>> +	__mcpm_cpu_down(cpu, cluster);
>> +
>> +	if (!skip_wfi) {
>> +		exynos_core_power_down(cpu, cluster);
>> +		wfi();
>> +	}
>> +}
>> +
>> +static const struct mcpm_platform_ops exynos_power_ops = {
>> +	.power_up	= exynos_power_up,
>> +	.power_down	= exynos_power_down,
>> +};
>> +
>> +static void __init edcs_data_init(void)
>> +{
>> +	unsigned int mpidr, cpu, cluster;
>> +
>> +	mpidr = read_cpuid_mpidr();
>> +	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
>> +	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
>> +
>> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
>> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
>> +	edcs_use_count[cpu][cluster] = 1;
>> +	++core_count[cluster];
>> +}
>> +
>> +/*
>> + * Enable cluster-level coherency, in preparation for turning on the MMU.
>> + */
>> +static void __naked edcs_power_up_setup(unsigned int affinity_level)
>> +{
>> +	asm volatile ("\n"
>> +	"b	cci_enable_port_for_self");
>> +}
>> +
>> +static int __init edcs_init(void)
>> +{
>> +	int ret;
>> +	struct device_node *node;
>> +
>> +	node = of_find_compatible_node(NULL, NULL, "samsung,exynos5410");
>> +	if (!node)
>> +		return -ENODEV;
>> +
>> +	if (!cci_probed())
>> +		return -ENODEV;
>> +
>> +	/*
>> +	 * Future entries into the kernel can now go
>> +	 * through the cluster entry vectors.
>> +	 */
>> +	__raw_writel(virt_to_phys(mcpm_entry_point),
>> +				S5P_VA_SYSRAM_NS + 0x1c);
> It would me more readable to have a #define to describe what
> S5P_VA_SYSRAM_NS + 0x1c is.
Will be done.
> Also, what is the memory type of S5P_VA_SYSRAM_NS?  How is the entry
> point address value read?  Does the inbound CPU's boot ROM or firmware
> read it?
The former (i.e. boot ROM reads it).
> If it is read via some non-coherent side-channel, or if S5P_VA_SYSRAM_NS
> is mapped as normal memory then we would need some explicit
> synchronisation, but I suspect this is not the case (?)
>
> Cheers
> ---Dave
>
>> +
>> +	edcs_data_init();
>> +	mcpm_smp_set_ops();
>> +
>> +	ret = mcpm_platform_register(&exynos_power_ops);
>> +	if (!ret) {
>> +		mcpm_sync_init(edcs_power_up_setup);
>> +		pr_info("EDCS power management initialized\n");
>> +	}
>> +	return ret;
>> +}
>> +
>> +early_initcall(edcs_init);
>> -- 
>> 1.8.1.5
>>
>>
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel at lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

Thanks for your comments.

Sincerely yours,
     Tarek Dakhran

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
  2013-10-17 16:04   ` Daniel Lezcano
@ 2013-10-17 18:16     ` Nicolas Pitre
  -1 siblings, 0 replies; 18+ messages in thread
From: Nicolas Pitre @ 2013-10-17 18:16 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Dave Martin, Vyacheslav Tyrtov, linux-kernel, Mark Rutland,
	devicetree, Kukjin Kim, Russell King, Ben Dooks, Pawel Moll,
	Ian Campbell, Stephen Warren, linux-doc, Rob Herring,
	Tarek Dakhran, linux-samsung-soc, Rob Landley, Mike Turquette,
	Thomas Gleixner, Naour Romain, Lorenzo Pieralisi,
	linux-arm-kernel, Heiko Stuebner

On Thu, 17 Oct 2013, Daniel Lezcano wrote:

> On 10/17/2013 04:32 PM, Dave Martin wrote:
> > On Thu, Oct 17, 2013 at 12:45:29PM +0200, Daniel Lezcano wrote:
> > > On 10/14/2013 05:08 PM, Vyacheslav Tyrtov wrote:
> > > > From: Tarek Dakhran <t.dakhran@samsung.com>
> > > > 
> > > > Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
> > > > This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
> > 
> > [...]
> > 
> > > > +	__mcpm_cpu_down(cpu, cluster);
> > > > +
> > > > +	if (!skip_wfi) {
> > > > +		exynos_core_power_down(cpu, cluster);
> > > > +		wfi();
> > > > +	}
> > > > +}
> > > 
> > > I did not looked line by line but these functions looks very similar
> > > than the tc2_pm.c's function. no ?
> > 
> > This is true.
> > 
> > > May be some code consolidation could be considered here.
> > > 
> > > Added Nico and Lorenzo in Cc.
> > > 
> > > Thanks
> > >    -- Daniel
> > 
> > Nico can commnent further, but I think the main concern here was that
> > this code shouldn't be factored prematurely.
> 
> I do not share this opinion.
> 
> > There are many low-level platform specifics involved here, so it's
> > hard to be certain that all platforms could fit into a more abstracted
> > framework until we have some evidence to look at.
> > 
> > This could be revisited when we have a few diverse MCPM ports to
> > compare.
> 
> I am worried about seeing more and more duplicated code around the ARM arch
> (eg. arm[64]/kernel/smp.c arm64/kernel/smp.c).

I didn't look too closely at those files so can't comment much on them.

> The cpuidle drivers have been duplicated and it took a while before
> refactoring them, and it is not finished. The hotplug code have been
> duplicated and now it is very difficult to factor it out.
> 
> A lot of work have been done to consolidate the code across the different SoC
> since the last 2 years.
> 
> If we let duplicate code populate the different files, they will belong to
> different maintainers, thus different trees. Each SoC contributors will tend
> to add their small changes making the code to diverge more and more and making
> difficult to re-factor it later.
> 
> I am in favor of preventing duplicate code entering in the kernel and force
> the contributors to improve the kernel in general, not just the small part
> they are supposed to work on. Otherwise, we are letting the kernel to fork
> itself, internally...

Now I'm going to comment.

What you say above is perfectly right.  As a general principle, we want 
good consolidation.  That's the theory.

However you need to know what needs to be consolidated in the first 
place.  In practice you cannot consolidate duplicated code if that code 
doesn't exist. In the cpuidle and hotplug cases it is very easy now to 
see what kind of consolidation should have been done after the facts.  
I'm sure that 10 years ago that wasn't as obvious.

In the MCPM case we didn't know up front what exactly would end up being 
duplicated in each machine specific backend.  And to some extent we 
still don't know as there is very few backends merged into mainline at 
this point (I know more of them exist out of tree but I didn't see the 
code yet). Right now we have only 2 such backends and some duplication 
was already identified, hence the patch to abstract the v7 cache 
flushing I posted recently.

Another possibility for consolidation in the MCPM backends is the last 
man determination.  However we've seen that the PSCI compatibility 
backend doesn't need that and abstracting that just yet might be 
premature, possibly making interaction with PSCI very awkward.

So it is very important to get the right balance between code 
duplication and over-engineering.  Premature abstractions do fall into 
the over-engineering category and it is just as bad.  Above all it is 
most important to be vigilant and take care of duplicated code before it 
gets too big.

In the past we've seen bad code duplication in some subsystems because 
no one was looking at the big picture.  In the MCPM case I'm watching. 
And in this case we're far from being in a critical situation.

Nicolas

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-10-17 18:16     ` Nicolas Pitre
  0 siblings, 0 replies; 18+ messages in thread
From: Nicolas Pitre @ 2013-10-17 18:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 17 Oct 2013, Daniel Lezcano wrote:

> On 10/17/2013 04:32 PM, Dave Martin wrote:
> > On Thu, Oct 17, 2013 at 12:45:29PM +0200, Daniel Lezcano wrote:
> > > On 10/14/2013 05:08 PM, Vyacheslav Tyrtov wrote:
> > > > From: Tarek Dakhran <t.dakhran@samsung.com>
> > > > 
> > > > Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
> > > > This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
> > 
> > [...]
> > 
> > > > +	__mcpm_cpu_down(cpu, cluster);
> > > > +
> > > > +	if (!skip_wfi) {
> > > > +		exynos_core_power_down(cpu, cluster);
> > > > +		wfi();
> > > > +	}
> > > > +}
> > > 
> > > I did not looked line by line but these functions looks very similar
> > > than the tc2_pm.c's function. no ?
> > 
> > This is true.
> > 
> > > May be some code consolidation could be considered here.
> > > 
> > > Added Nico and Lorenzo in Cc.
> > > 
> > > Thanks
> > >    -- Daniel
> > 
> > Nico can commnent further, but I think the main concern here was that
> > this code shouldn't be factored prematurely.
> 
> I do not share this opinion.
> 
> > There are many low-level platform specifics involved here, so it's
> > hard to be certain that all platforms could fit into a more abstracted
> > framework until we have some evidence to look at.
> > 
> > This could be revisited when we have a few diverse MCPM ports to
> > compare.
> 
> I am worried about seeing more and more duplicated code around the ARM arch
> (eg. arm[64]/kernel/smp.c arm64/kernel/smp.c).

I didn't look too closely at those files so can't comment much on them.

> The cpuidle drivers have been duplicated and it took a while before
> refactoring them, and it is not finished. The hotplug code have been
> duplicated and now it is very difficult to factor it out.
> 
> A lot of work have been done to consolidate the code across the different SoC
> since the last 2 years.
> 
> If we let duplicate code populate the different files, they will belong to
> different maintainers, thus different trees. Each SoC contributors will tend
> to add their small changes making the code to diverge more and more and making
> difficult to re-factor it later.
> 
> I am in favor of preventing duplicate code entering in the kernel and force
> the contributors to improve the kernel in general, not just the small part
> they are supposed to work on. Otherwise, we are letting the kernel to fork
> itself, internally...

Now I'm going to comment.

What you say above is perfectly right.  As a general principle, we want 
good consolidation.  That's the theory.

However you need to know what needs to be consolidated in the first 
place.  In practice you cannot consolidate duplicated code if that code 
doesn't exist. In the cpuidle and hotplug cases it is very easy now to 
see what kind of consolidation should have been done after the facts.  
I'm sure that 10 years ago that wasn't as obvious.

In the MCPM case we didn't know up front what exactly would end up being 
duplicated in each machine specific backend.  And to some extent we 
still don't know as there is very few backends merged into mainline at 
this point (I know more of them exist out of tree but I didn't see the 
code yet). Right now we have only 2 such backends and some duplication 
was already identified, hence the patch to abstract the v7 cache 
flushing I posted recently.

Another possibility for consolidation in the MCPM backends is the last 
man determination.  However we've seen that the PSCI compatibility 
backend doesn't need that and abstracting that just yet might be 
premature, possibly making interaction with PSCI very awkward.

So it is very important to get the right balance between code 
duplication and over-engineering.  Premature abstractions do fall into 
the over-engineering category and it is just as bad.  Above all it is 
most important to be vigilant and take care of duplicated code before it 
gets too big.

In the past we've seen bad code duplication in some subsystems because 
no one was looking at the big picture.  In the MCPM case I'm watching. 
And in this case we're far from being in a critical situation.

Nicolas

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
  2013-10-17 14:32 Dave Martin
@ 2013-10-17 16:04   ` Daniel Lezcano
  0 siblings, 0 replies; 18+ messages in thread
From: Daniel Lezcano @ 2013-10-17 16:04 UTC (permalink / raw)
  To: Dave Martin
  Cc: Vyacheslav Tyrtov, linux-kernel, Mark Rutland, devicetree,
	Kukjin Kim, Russell King, Ben Dooks, Pawel Moll, Ian Campbell,
	Nicolas Pitre, Stephen Warren, linux-doc, Rob Herring,
	Tarek Dakhran, linux-samsung-soc, Rob Landley, Mike Turquette,
	Thomas Gleixner, Naour Romain, Lorenzo Pieralisi,
	linux-arm-kernel, Heiko Stuebner

On 10/17/2013 04:32 PM, Dave Martin wrote:
> On Thu, Oct 17, 2013 at 12:45:29PM +0200, Daniel Lezcano wrote:
>> On 10/14/2013 05:08 PM, Vyacheslav Tyrtov wrote:
>>> From: Tarek Dakhran <t.dakhran@samsung.com>
>>>
>>> Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
>>> This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
>
> [...]
>
>>> +	__mcpm_cpu_down(cpu, cluster);
>>> +
>>> +	if (!skip_wfi) {
>>> +		exynos_core_power_down(cpu, cluster);
>>> +		wfi();
>>> +	}
>>> +}
>>
>> I did not looked line by line but these functions looks very similar
>> than the tc2_pm.c's function. no ?
>
> This is true.
>
>> May be some code consolidation could be considered here.
>>
>> Added Nico and Lorenzo in Cc.
>>
>> Thanks
>>    -- Daniel
>
> Nico can commnent further, but I think the main concern here was that
> this code shouldn't be factored prematurely.

I do not share this opinion.

> There are many low-level platform specifics involved here, so it's
> hard to be certain that all platforms could fit into a more abstracted
> framework until we have some evidence to look at.
>
> This could be revisited when we have a few diverse MCPM ports to
> compare.

I am worried about seeing more and more duplicated code around the ARM 
arch (eg. arm[64]/kernel/smp.c arm64/kernel/smp.c).

The cpuidle drivers have been duplicated and it took a while before 
refactoring them, and it is not finished. The hotplug code have been 
duplicated and now it is very difficult to factor it out.

A lot of work have been done to consolidate the code across the 
different SoC since the last 2 years.

If we let duplicate code populate the different files, they will belong 
to different maintainers, thus different trees. Each SoC contributors 
will tend to add their small changes making the code to diverge more and 
more and making difficult to re-factor it later.

I am in favor of preventing duplicate code entering in the kernel and 
force the contributors to improve the kernel in general, not just the 
small part they are supposed to work on. Otherwise, we are letting the 
kernel to fork itself, internally...

> The low-level A15/A7 cacheflush sequence is already being factored
> by Nico [1].

Hopefully he did it :)

Thanks
   -- Daniel

> [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2013-October/205085.html
> [PATCH] ARM: cacheflush: consolidate single-CPU ARMv7 cache disabling code
>
> [...]
>


-- 
  <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-10-17 16:04   ` Daniel Lezcano
  0 siblings, 0 replies; 18+ messages in thread
From: Daniel Lezcano @ 2013-10-17 16:04 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/17/2013 04:32 PM, Dave Martin wrote:
> On Thu, Oct 17, 2013 at 12:45:29PM +0200, Daniel Lezcano wrote:
>> On 10/14/2013 05:08 PM, Vyacheslav Tyrtov wrote:
>>> From: Tarek Dakhran <t.dakhran@samsung.com>
>>>
>>> Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
>>> This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
>
> [...]
>
>>> +	__mcpm_cpu_down(cpu, cluster);
>>> +
>>> +	if (!skip_wfi) {
>>> +		exynos_core_power_down(cpu, cluster);
>>> +		wfi();
>>> +	}
>>> +}
>>
>> I did not looked line by line but these functions looks very similar
>> than the tc2_pm.c's function. no ?
>
> This is true.
>
>> May be some code consolidation could be considered here.
>>
>> Added Nico and Lorenzo in Cc.
>>
>> Thanks
>>    -- Daniel
>
> Nico can commnent further, but I think the main concern here was that
> this code shouldn't be factored prematurely.

I do not share this opinion.

> There are many low-level platform specifics involved here, so it's
> hard to be certain that all platforms could fit into a more abstracted
> framework until we have some evidence to look at.
>
> This could be revisited when we have a few diverse MCPM ports to
> compare.

I am worried about seeing more and more duplicated code around the ARM 
arch (eg. arm[64]/kernel/smp.c arm64/kernel/smp.c).

The cpuidle drivers have been duplicated and it took a while before 
refactoring them, and it is not finished. The hotplug code have been 
duplicated and now it is very difficult to factor it out.

A lot of work have been done to consolidate the code across the 
different SoC since the last 2 years.

If we let duplicate code populate the different files, they will belong 
to different maintainers, thus different trees. Each SoC contributors 
will tend to add their small changes making the code to diverge more and 
more and making difficult to re-factor it later.

I am in favor of preventing duplicate code entering in the kernel and 
force the contributors to improve the kernel in general, not just the 
small part they are supposed to work on. Otherwise, we are letting the 
kernel to fork itself, internally...

> The low-level A15/A7 cacheflush sequence is already being factored
> by Nico [1].

Hopefully he did it :)

Thanks
   -- Daniel

> [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2013-October/205085.html
> [PATCH] ARM: cacheflush: consolidate single-CPU ARMv7 cache disabling code
>
> [...]
>

-- 
  <http://www.linaro.org/> Linaro.org ? Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-10-17 15:46 Dave Martin
  2013-10-18  8:29   ` Tarek Dakhran
  0 siblings, 1 reply; 18+ messages in thread
From: Dave Martin @ 2013-10-17 15:46 UTC (permalink / raw)
  To: Vyacheslav Tyrtov
  Cc: linux-kernel, Mark Rutland, devicetree, Kukjin Kim, Russell King,
	Ben Dooks, Pawel Moll, Ian Campbell, Stephen Warren, linux-doc,
	Rob Herring, Tarek Dakhran, Daniel Lezcano, linux-samsung-soc,
	Rob Landley, Mike Turquette, Thomas Gleixner, Naour Romain,
	linux-arm-kernel, Heiko Stuebner

On Mon, Oct 14, 2013 at 07:08:24PM +0400, Vyacheslav Tyrtov wrote:
> From: Tarek Dakhran <t.dakhran@samsung.com>
> 
> Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
> This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
> 
> Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
> Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
> ---
>  arch/arm/mach-exynos/Makefile |   2 +
>  arch/arm/mach-exynos/edcs.c   | 270 ++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 272 insertions(+)
>  create mode 100644 arch/arm/mach-exynos/edcs.c
> 
> diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
> index 5369615..ba6efdb 100644
> --- a/arch/arm/mach-exynos/Makefile
> +++ b/arch/arm/mach-exynos/Makefile
> @@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
>  
>  obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
>  obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
> +
> +obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
> diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
> new file mode 100644
> index 0000000..e304bd9
> --- /dev/null
> +++ b/arch/arm/mach-exynos/edcs.c
> @@ -0,0 +1,270 @@
> +/*
> + * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
> + *
> + * Copyright (c) 2013 Samsung Electronics Co., Ltd.
> + * Author: Tarek Dakhran <t.dakhran@samsung.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * EDCS(exynos dual cluster support) for Exynos5410 SoC.
> + */
> +
> +#include <linux/init.h>
> +#include <linux/io.h>
> +#include <linux/of_address.h>
> +#include <linux/spinlock.h>
> +#include <linux/errno.h>
> +
> +#include <asm/mcpm.h>
> +#include <asm/proc-fns.h>
> +#include <asm/cacheflush.h>
> +#include <asm/cputype.h>
> +#include <asm/cp15.h>
> +
> +#include <linux/arm-cci.h>
> +#include <mach/regs-pmu.h>
> +
> +#define EDCS_CPUS_PER_CLUSTER	4
> +#define EDCS_CLUSTERS		2
> +
> +/* Exynos5410 power management registers */
> +#define EDCS_CORE_CONFIGURATION(_nr)	(S5P_ARM_CORE0_CONFIGURATION	\
> +						+ ((_nr) * 0x80))
> +#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
> +#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
> +
> +#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)

Is there any reason why S5P_VA_SYSRAM_NS needs to be a static mapping?

> +#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
> +						 _nr * EDCS_CPUS_PER_CLUSTER)
> +
> +static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
> +
> +static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
> +static int core_count[EDCS_CLUSTERS];
> +
> +static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
> +				bool enable)
> +{
> +	unsigned int offset = cluster * EDCS_CPUS_PER_CLUSTER + cpu;
> +	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
> +
> +	if ((__raw_readl(EDCS_CORE_STATUS(offset)) & 0x3) != value)
> +		__raw_writel(value, EDCS_CORE_CONFIGURATION(offset));

I think you need to replace all the __raw_readl() / __raw_writel() calls
in this file with readl_relaxed() / writel_relaxed().

This ensures little-endian byte order, so that BE8 kernels will work.

> +}
> +
> +static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
> +{
> +	exynos_core_power_control(cpu, cluster, true);
> +}
> +
> +static void exynos_core_power_down(unsigned int cpu, unsigned int cluster)
> +{
> +	exynos_core_power_control(cpu, cluster, false);
> +}
> +
> +void set_boot_flag(unsigned int cpu, unsigned int mode)
> +{
> +	__raw_writel(mode, REG_CPU_STATE_ADDR(cpu));
> +}
> +
> +static int exynos_power_up(unsigned int cpu, unsigned int cluster)
> +{
> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER || cluster >= EDCS_CLUSTERS);
> +
> +	local_irq_disable();
> +	arch_spin_lock(&edcs_lock);
> +
> +	edcs_use_count[cpu][cluster]++;
> +	if (edcs_use_count[cpu][cluster] == 1) {
> +		++core_count[cluster];
> +		set_boot_flag(cpu, 0x2);

0x2 looks like a magic number.  Can we have a #define for that?


If the boot flag is read by the inbound CPU or by a peripheral then
there are memory ordering issues to take into account.  Otherwise, can't
the inbound CPU come online and race with the write to the boot flag?

What is the memory type of REG_STATE_ADDR()?

> +		exynos_core_power_up(cpu, cluster);
> +	} else if (edcs_use_count[cpu][cluster] != 2) {
> +		/*
> +		 * The only possible values are:
> +		 * 0 = CPU down
> +		 * 1 = CPU (still) up
> +		 * 2 = CPU requested to be up before it had a chance
> +		 *     to actually make itself down.
> +		 * Any other value is a bug.
> +		 */
> +		BUG();
> +	}
> +
> +	arch_spin_unlock(&edcs_lock);
> +	local_irq_enable();
> +
> +	return 0;
> +}
> +static void exynos_power_down(void)
> +{
> +	unsigned int mpidr, cpu, cluster;
> +	bool last_man = false, skip_wfi = false;
> +
> +	mpidr = read_cpuid_mpidr();
> +	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> +	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> +
> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
> +
> +	__mcpm_cpu_going_down(cpu, cluster);
> +
> +	arch_spin_lock(&edcs_lock);
> +	BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
> +	edcs_use_count[cpu][cluster]--;
> +	if (edcs_use_count[cpu][cluster] == 0) {
> +		--core_count[cluster];
> +		if (core_count[cluster] == 0)
> +			last_man = true;
> +	} else if (edcs_use_count[cpu][cluster] == 1) {
> +		/*
> +		 * A power_up request went ahead of us.
> +		 * Even if we do not want to shut this CPU down,
> +		 * the caller expects a certain state as if the WFI
> +		 * was aborted.  So let's continue with cache cleaning.
> +		 */
> +		skip_wfi = true;
> +	} else
> +		BUG();

For TC2, Lorenzo discovered need to disable the GIC CPU interface, so
that no suprious wakeups can happen once the power controller is
configured to power the CPU down.

The problem is that a spurious wakeup can trigger a race in the power
controller when the last man powers down, where the power controller
continues to wait for the cluster to power down, but it never happens.

http://lists.infradead.org/pipermail/linux-arm-kernel/2013-September/200917.html
([PATCH v2] arm: vexpress: tc2: fix hotplug/idle/kexec race on cluster power down)


This might not be a problem for Exynos5410 -- it depends on the way the
power control logic works.

On the other hand, disabling the GIC CPU interface here is cheap to do,
even if it's not really needed.

> +
> +	if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
> +		arch_spin_unlock(&edcs_lock);
> +
> +		if (read_cpuid_part_number() == ARM_CPU_PART_CORTEX_A15) {
> +			/*
> +			 * On the Cortex-A15 we need to disable
> +			 * L2 prefetching before flushing the cache.
> +			 */
> +			asm volatile(
> +			"mcr	p15, 1, %0, c15, c0, 3\n\t"
> +			"isb\n\t"
> +			"dsb"
> +			: : "r" (0x400));
> +		}
> +
> +		/*
> +		 * We need to disable and flush the whole (L1 and L2) cache.
> +		 * Let's do it in the safest possible way i.e. with
> +		 * no memory access within the following sequence
> +		 * including the stack.
> +		 *
> +		 * Note: fp is preserved to the stack explicitly prior doing
> +		 * this since adding it to the clobber list is incompatible
> +		 * with having CONFIG_FRAME_POINTER=y.
> +		 */
> +		asm volatile(
> +		"str	fp, [sp, #-4]!\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
> +		"isb\n\t"
> +		"bl	v7_flush_dcache_all\n\t"
> +		"clrex\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
> +		"isb\n\t"
> +		"dsb\n\t"
> +		"ldr	fp, [sp], #4"
> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
> +			"r9", "r10", "lr", "memory");

For these code sequences, please take a look at

http://lists.infradead.org/pipermail/linux-arm-kernel/2013-October/205085.html
[PATCH] ARM: cacheflush: consolidate single-CPU ARMv7 cache disabling code

The aim is to consolidate on a macro that can be shared between
backends.

> +
> +		cci_disable_port_by_cpu(mpidr);
> +
> +		__mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
> +
> +	} else {
> +		arch_spin_unlock(&edcs_lock);
> +		/*
> +			* We need to disable and flush only the L1 cache.
> +			* Let's do it in the safest possible way as above.
> +		*/
> +		asm volatile(
> +		"str	fp, [sp, #-4]!\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
> +		"isb\n\t"
> +		"bl	v7_flush_dcache_louis\n\t"
> +		"clrex\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
> +		"isb\n\t"
> +		"dsb\n\t"
> +		"ldr	fp, [sp], #4"
> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
> +		      "r9", "r10", "lr", "memory");
> +
> +	}
> +	__mcpm_cpu_down(cpu, cluster);
> +
> +	if (!skip_wfi) {
> +		exynos_core_power_down(cpu, cluster);
> +		wfi();
> +	}
> +}
> +
> +static const struct mcpm_platform_ops exynos_power_ops = {
> +	.power_up	= exynos_power_up,
> +	.power_down	= exynos_power_down,
> +};
> +
> +static void __init edcs_data_init(void)
> +{
> +	unsigned int mpidr, cpu, cluster;
> +
> +	mpidr = read_cpuid_mpidr();
> +	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> +	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> +
> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
> +	edcs_use_count[cpu][cluster] = 1;
> +	++core_count[cluster];
> +}
> +
> +/*
> + * Enable cluster-level coherency, in preparation for turning on the MMU.
> + */
> +static void __naked edcs_power_up_setup(unsigned int affinity_level)
> +{
> +	asm volatile ("\n"
> +	"b	cci_enable_port_for_self");
> +}
> +
> +static int __init edcs_init(void)
> +{
> +	int ret;
> +	struct device_node *node;
> +
> +	node = of_find_compatible_node(NULL, NULL, "samsung,exynos5410");
> +	if (!node)
> +		return -ENODEV;
> +
> +	if (!cci_probed())
> +		return -ENODEV;
> +
> +	/*
> +	 * Future entries into the kernel can now go
> +	 * through the cluster entry vectors.
> +	 */
> +	__raw_writel(virt_to_phys(mcpm_entry_point),
> +				S5P_VA_SYSRAM_NS + 0x1c);

It would me more readable to have a #define to describe what
S5P_VA_SYSRAM_NS + 0x1c is.


Also, what is the memory type of S5P_VA_SYSRAM_NS?  How is the entry
point address value read?  Does the inbound CPU's boot ROM or firmware
read it?

If it is read via some non-coherent side-channel, or if S5P_VA_SYSRAM_NS
is mapped as normal memory then we would need some explicit
synchronisation, but I suspect this is not the case (?)

Cheers
---Dave

> +
> +	edcs_data_init();
> +	mcpm_smp_set_ops();
> +
> +	ret = mcpm_platform_register(&exynos_power_ops);
> +	if (!ret) {
> +		mcpm_sync_init(edcs_power_up_setup);
> +		pr_info("EDCS power management initialized\n");
> +	}
> +	return ret;
> +}
> +
> +early_initcall(edcs_init);
> -- 
> 1.8.1.5
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-10-17 14:32 Dave Martin
  2013-10-17 16:04   ` Daniel Lezcano
  0 siblings, 1 reply; 18+ messages in thread
From: Dave Martin @ 2013-10-17 14:32 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Vyacheslav Tyrtov, linux-kernel, Mark Rutland, devicetree,
	Kukjin Kim, Russell King, Ben Dooks, Pawel Moll, Ian Campbell,
	Nicolas Pitre, Stephen Warren, linux-doc, Rob Herring,
	Tarek Dakhran, linux-samsung-soc, Rob Landley, Mike Turquette,
	Thomas Gleixner, Naour Romain, Lorenzo Pieralisi,
	linux-arm-kernel, Heiko Stuebner

On Thu, Oct 17, 2013 at 12:45:29PM +0200, Daniel Lezcano wrote:
> On 10/14/2013 05:08 PM, Vyacheslav Tyrtov wrote:
> > From: Tarek Dakhran <t.dakhran@samsung.com>
> > 
> > Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
> > This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.

[...]

> > +	__mcpm_cpu_down(cpu, cluster);
> > +
> > +	if (!skip_wfi) {
> > +		exynos_core_power_down(cpu, cluster);
> > +		wfi();
> > +	}
> > +}
> 
> I did not looked line by line but these functions looks very similar
> than the tc2_pm.c's function. no ?

This is true.

> May be some code consolidation could be considered here.
> 
> Added Nico and Lorenzo in Cc.
> 
> Thanks
>   -- Daniel

Nico can commnent further, but I think the main concern here was that
this code shouldn't be factored prematurely.

There are many low-level platform specifics involved here, so it's
hard to be certain that all platforms could fit into a more abstracted
framework until we have some evidence to look at.

This could be revisited when we have a few diverse MCPM ports to
compare.


The low-level A15/A7 cacheflush sequence is already being factored
by Nico [1].

Cheers
---Dave

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2013-October/205085.html
[PATCH] ARM: cacheflush: consolidate single-CPU ARMv7 cache disabling code

[...]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
  2013-10-14 15:08   ` Vyacheslav Tyrtov
@ 2013-10-17 10:45     ` Daniel Lezcano
  -1 siblings, 0 replies; 18+ messages in thread
From: Daniel Lezcano @ 2013-10-17 10:45 UTC (permalink / raw)
  To: Vyacheslav Tyrtov, linux-kernel
  Cc: Rob Herring, Pawel Moll, Mark Rutland, Stephen Warren,
	Ian Campbell, Rob Landley, Kukjin Kim, Russell King, Ben Dooks,
	Mike Turquette, Thomas Gleixner, Heiko Stuebner, Naour Romain,
	devicetree, linux-doc, linux-arm-kernel, linux-samsung-soc,
	Tarek Dakhran, Nicolas Pitre, Lorenzo Pieralisi

On 10/14/2013 05:08 PM, Vyacheslav Tyrtov wrote:
> From: Tarek Dakhran <t.dakhran@samsung.com>
>
> Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
> This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.

IIUC, the 5410 has a CCI-400 bug preventing to use the two clusters at 
the same time. Right ? Could you explain how you fixed it ?

> Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
> Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
> ---
>   arch/arm/mach-exynos/Makefile |   2 +
>   arch/arm/mach-exynos/edcs.c   | 270 ++++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 272 insertions(+)
>   create mode 100644 arch/arm/mach-exynos/edcs.c
>
> diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
> index 5369615..ba6efdb 100644
> --- a/arch/arm/mach-exynos/Makefile
> +++ b/arch/arm/mach-exynos/Makefile
> @@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
>
>   obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
>   obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
> +
> +obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
> diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
> new file mode 100644
> index 0000000..e304bd9
> --- /dev/null
> +++ b/arch/arm/mach-exynos/edcs.c
> @@ -0,0 +1,270 @@
> +/*
> + * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
> + *
> + * Copyright (c) 2013 Samsung Electronics Co., Ltd.
> + * Author: Tarek Dakhran <t.dakhran@samsung.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * EDCS(exynos dual cluster support) for Exynos5410 SoC.
> + */
> +
> +#include <linux/init.h>
> +#include <linux/io.h>
> +#include <linux/of_address.h>
> +#include <linux/spinlock.h>
> +#include <linux/errno.h>
> +
> +#include <asm/mcpm.h>
> +#include <asm/proc-fns.h>
> +#include <asm/cacheflush.h>
> +#include <asm/cputype.h>
> +#include <asm/cp15.h>
> +
> +#include <linux/arm-cci.h>
> +#include <mach/regs-pmu.h>
> +
> +#define EDCS_CPUS_PER_CLUSTER	4
> +#define EDCS_CLUSTERS		2
> +
> +/* Exynos5410 power management registers */
> +#define EDCS_CORE_CONFIGURATION(_nr)	(S5P_ARM_CORE0_CONFIGURATION	\
> +						+ ((_nr) * 0x80))
> +#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
> +#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
> +
> +#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)
> +#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
> +						 _nr * EDCS_CPUS_PER_CLUSTER)
> +
> +static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
> +
> +static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
> +static int core_count[EDCS_CLUSTERS];
> +
> +static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
> +				bool enable)
> +{
> +	unsigned int offset = cluster * EDCS_CPUS_PER_CLUSTER + cpu;
> +	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
> +
> +	if ((__raw_readl(EDCS_CORE_STATUS(offset)) & 0x3) != value)
> +		__raw_writel(value, EDCS_CORE_CONFIGURATION(offset));
> +}
> +
> +static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
> +{
> +	exynos_core_power_control(cpu, cluster, true);
> +}
> +
> +static void exynos_core_power_down(unsigned int cpu, unsigned int cluster)
> +{
> +	exynos_core_power_control(cpu, cluster, false);
> +}
> +
> +void set_boot_flag(unsigned int cpu, unsigned int mode)
> +{
> +	__raw_writel(mode, REG_CPU_STATE_ADDR(cpu));
> +}
> +
> +static int exynos_power_up(unsigned int cpu, unsigned int cluster)
> +{
> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER || cluster >= EDCS_CLUSTERS);
> +
> +	local_irq_disable();
> +	arch_spin_lock(&edcs_lock);
> +
> +	edcs_use_count[cpu][cluster]++;
> +	if (edcs_use_count[cpu][cluster] == 1) {
> +		++core_count[cluster];
> +		set_boot_flag(cpu, 0x2);
> +		exynos_core_power_up(cpu, cluster);
> +	} else if (edcs_use_count[cpu][cluster] != 2) {
> +		/*
> +		 * The only possible values are:
> +		 * 0 = CPU down
> +		 * 1 = CPU (still) up
> +		 * 2 = CPU requested to be up before it had a chance
> +		 *     to actually make itself down.
> +		 * Any other value is a bug.
> +		 */
> +		BUG();
> +	}
> +
> +	arch_spin_unlock(&edcs_lock);
> +	local_irq_enable();
> +
> +	return 0;
> +}
> +static void exynos_power_down(void)
> +{
> +	unsigned int mpidr, cpu, cluster;
> +	bool last_man = false, skip_wfi = false;
> +
> +	mpidr = read_cpuid_mpidr();
> +	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> +	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> +
> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
> +
> +	__mcpm_cpu_going_down(cpu, cluster);
> +
> +	arch_spin_lock(&edcs_lock);
> +	BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
> +	edcs_use_count[cpu][cluster]--;
> +	if (edcs_use_count[cpu][cluster] == 0) {
> +		--core_count[cluster];
> +		if (core_count[cluster] == 0)
> +			last_man = true;
> +	} else if (edcs_use_count[cpu][cluster] == 1) {
> +		/*
> +		 * A power_up request went ahead of us.
> +		 * Even if we do not want to shut this CPU down,
> +		 * the caller expects a certain state as if the WFI
> +		 * was aborted.  So let's continue with cache cleaning.
> +		 */
> +		skip_wfi = true;
> +	} else
> +		BUG();
> +
> +	if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
> +		arch_spin_unlock(&edcs_lock);
> +
> +		if (read_cpuid_part_number() == ARM_CPU_PART_CORTEX_A15) {
> +			/*
> +			 * On the Cortex-A15 we need to disable
> +			 * L2 prefetching before flushing the cache.
> +			 */
> +			asm volatile(
> +			"mcr	p15, 1, %0, c15, c0, 3\n\t"
> +			"isb\n\t"
> +			"dsb"
> +			: : "r" (0x400));
> +		}
> +
> +		/*
> +		 * We need to disable and flush the whole (L1 and L2) cache.
> +		 * Let's do it in the safest possible way i.e. with
> +		 * no memory access within the following sequence
> +		 * including the stack.
> +		 *
> +		 * Note: fp is preserved to the stack explicitly prior doing
> +		 * this since adding it to the clobber list is incompatible
> +		 * with having CONFIG_FRAME_POINTER=y.
> +		 */
> +		asm volatile(
> +		"str	fp, [sp, #-4]!\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
> +		"isb\n\t"
> +		"bl	v7_flush_dcache_all\n\t"
> +		"clrex\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
> +		"isb\n\t"
> +		"dsb\n\t"
> +		"ldr	fp, [sp], #4"
> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
> +			"r9", "r10", "lr", "memory");
> +
> +		cci_disable_port_by_cpu(mpidr);
> +
> +		__mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
> +
> +	} else {
> +		arch_spin_unlock(&edcs_lock);
> +		/*
> +			* We need to disable and flush only the L1 cache.
> +			* Let's do it in the safest possible way as above.
> +		*/
> +		asm volatile(
> +		"str	fp, [sp, #-4]!\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
> +		"isb\n\t"
> +		"bl	v7_flush_dcache_louis\n\t"
> +		"clrex\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
> +		"isb\n\t"
> +		"dsb\n\t"
> +		"ldr	fp, [sp], #4"
> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
> +		      "r9", "r10", "lr", "memory");
> +
> +	}
> +	__mcpm_cpu_down(cpu, cluster);
> +
> +	if (!skip_wfi) {
> +		exynos_core_power_down(cpu, cluster);
> +		wfi();
> +	}
> +}

I did not looked line by line but these functions looks very similar 
than the tc2_pm.c's function. no ?

May be some code consolidation could be considered here.

Added Nico and Lorenzo in Cc.

Thanks
   -- Daniel

> +static const struct mcpm_platform_ops exynos_power_ops = {
> +	.power_up	= exynos_power_up,
> +	.power_down	= exynos_power_down,
> +};
> +
> +static void __init edcs_data_init(void)
> +{
> +	unsigned int mpidr, cpu, cluster;
> +
> +	mpidr = read_cpuid_mpidr();
> +	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> +	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> +
> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
> +	edcs_use_count[cpu][cluster] = 1;
> +	++core_count[cluster];
> +}
> +
> +/*
> + * Enable cluster-level coherency, in preparation for turning on the MMU.
> + */
> +static void __naked edcs_power_up_setup(unsigned int affinity_level)
> +{
> +	asm volatile ("\n"
> +	"b	cci_enable_port_for_self");
> +}
> +
> +static int __init edcs_init(void)
> +{
> +	int ret;
> +	struct device_node *node;
> +
> +	node = of_find_compatible_node(NULL, NULL, "samsung,exynos5410");
> +	if (!node)
> +		return -ENODEV;
> +
> +	if (!cci_probed())
> +		return -ENODEV;
> +
> +	/*
> +	 * Future entries into the kernel can now go
> +	 * through the cluster entry vectors.
> +	 */
> +	__raw_writel(virt_to_phys(mcpm_entry_point),
> +				S5P_VA_SYSRAM_NS + 0x1c);
> +
> +	edcs_data_init();
> +	mcpm_smp_set_ops();
> +
> +	ret = mcpm_platform_register(&exynos_power_ops);
> +	if (!ret) {
> +		mcpm_sync_init(edcs_power_up_setup);
> +		pr_info("EDCS power management initialized\n");
> +	}
> +	return ret;
> +}
> +
> +early_initcall(edcs_init);
>


-- 
  <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-10-17 10:45     ` Daniel Lezcano
  0 siblings, 0 replies; 18+ messages in thread
From: Daniel Lezcano @ 2013-10-17 10:45 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/14/2013 05:08 PM, Vyacheslav Tyrtov wrote:
> From: Tarek Dakhran <t.dakhran@samsung.com>
>
> Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
> This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.

IIUC, the 5410 has a CCI-400 bug preventing to use the two clusters at 
the same time. Right ? Could you explain how you fixed it ?

> Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
> Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
> ---
>   arch/arm/mach-exynos/Makefile |   2 +
>   arch/arm/mach-exynos/edcs.c   | 270 ++++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 272 insertions(+)
>   create mode 100644 arch/arm/mach-exynos/edcs.c
>
> diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
> index 5369615..ba6efdb 100644
> --- a/arch/arm/mach-exynos/Makefile
> +++ b/arch/arm/mach-exynos/Makefile
> @@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
>
>   obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
>   obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
> +
> +obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
> diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
> new file mode 100644
> index 0000000..e304bd9
> --- /dev/null
> +++ b/arch/arm/mach-exynos/edcs.c
> @@ -0,0 +1,270 @@
> +/*
> + * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
> + *
> + * Copyright (c) 2013 Samsung Electronics Co., Ltd.
> + * Author: Tarek Dakhran <t.dakhran@samsung.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * EDCS(exynos dual cluster support) for Exynos5410 SoC.
> + */
> +
> +#include <linux/init.h>
> +#include <linux/io.h>
> +#include <linux/of_address.h>
> +#include <linux/spinlock.h>
> +#include <linux/errno.h>
> +
> +#include <asm/mcpm.h>
> +#include <asm/proc-fns.h>
> +#include <asm/cacheflush.h>
> +#include <asm/cputype.h>
> +#include <asm/cp15.h>
> +
> +#include <linux/arm-cci.h>
> +#include <mach/regs-pmu.h>
> +
> +#define EDCS_CPUS_PER_CLUSTER	4
> +#define EDCS_CLUSTERS		2
> +
> +/* Exynos5410 power management registers */
> +#define EDCS_CORE_CONFIGURATION(_nr)	(S5P_ARM_CORE0_CONFIGURATION	\
> +						+ ((_nr) * 0x80))
> +#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
> +#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
> +
> +#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)
> +#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
> +						 _nr * EDCS_CPUS_PER_CLUSTER)
> +
> +static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
> +
> +static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
> +static int core_count[EDCS_CLUSTERS];
> +
> +static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
> +				bool enable)
> +{
> +	unsigned int offset = cluster * EDCS_CPUS_PER_CLUSTER + cpu;
> +	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
> +
> +	if ((__raw_readl(EDCS_CORE_STATUS(offset)) & 0x3) != value)
> +		__raw_writel(value, EDCS_CORE_CONFIGURATION(offset));
> +}
> +
> +static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
> +{
> +	exynos_core_power_control(cpu, cluster, true);
> +}
> +
> +static void exynos_core_power_down(unsigned int cpu, unsigned int cluster)
> +{
> +	exynos_core_power_control(cpu, cluster, false);
> +}
> +
> +void set_boot_flag(unsigned int cpu, unsigned int mode)
> +{
> +	__raw_writel(mode, REG_CPU_STATE_ADDR(cpu));
> +}
> +
> +static int exynos_power_up(unsigned int cpu, unsigned int cluster)
> +{
> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER || cluster >= EDCS_CLUSTERS);
> +
> +	local_irq_disable();
> +	arch_spin_lock(&edcs_lock);
> +
> +	edcs_use_count[cpu][cluster]++;
> +	if (edcs_use_count[cpu][cluster] == 1) {
> +		++core_count[cluster];
> +		set_boot_flag(cpu, 0x2);
> +		exynos_core_power_up(cpu, cluster);
> +	} else if (edcs_use_count[cpu][cluster] != 2) {
> +		/*
> +		 * The only possible values are:
> +		 * 0 = CPU down
> +		 * 1 = CPU (still) up
> +		 * 2 = CPU requested to be up before it had a chance
> +		 *     to actually make itself down.
> +		 * Any other value is a bug.
> +		 */
> +		BUG();
> +	}
> +
> +	arch_spin_unlock(&edcs_lock);
> +	local_irq_enable();
> +
> +	return 0;
> +}
> +static void exynos_power_down(void)
> +{
> +	unsigned int mpidr, cpu, cluster;
> +	bool last_man = false, skip_wfi = false;
> +
> +	mpidr = read_cpuid_mpidr();
> +	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> +	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> +
> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
> +
> +	__mcpm_cpu_going_down(cpu, cluster);
> +
> +	arch_spin_lock(&edcs_lock);
> +	BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
> +	edcs_use_count[cpu][cluster]--;
> +	if (edcs_use_count[cpu][cluster] == 0) {
> +		--core_count[cluster];
> +		if (core_count[cluster] == 0)
> +			last_man = true;
> +	} else if (edcs_use_count[cpu][cluster] == 1) {
> +		/*
> +		 * A power_up request went ahead of us.
> +		 * Even if we do not want to shut this CPU down,
> +		 * the caller expects a certain state as if the WFI
> +		 * was aborted.  So let's continue with cache cleaning.
> +		 */
> +		skip_wfi = true;
> +	} else
> +		BUG();
> +
> +	if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
> +		arch_spin_unlock(&edcs_lock);
> +
> +		if (read_cpuid_part_number() == ARM_CPU_PART_CORTEX_A15) {
> +			/*
> +			 * On the Cortex-A15 we need to disable
> +			 * L2 prefetching before flushing the cache.
> +			 */
> +			asm volatile(
> +			"mcr	p15, 1, %0, c15, c0, 3\n\t"
> +			"isb\n\t"
> +			"dsb"
> +			: : "r" (0x400));
> +		}
> +
> +		/*
> +		 * We need to disable and flush the whole (L1 and L2) cache.
> +		 * Let's do it in the safest possible way i.e. with
> +		 * no memory access within the following sequence
> +		 * including the stack.
> +		 *
> +		 * Note: fp is preserved to the stack explicitly prior doing
> +		 * this since adding it to the clobber list is incompatible
> +		 * with having CONFIG_FRAME_POINTER=y.
> +		 */
> +		asm volatile(
> +		"str	fp, [sp, #-4]!\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
> +		"isb\n\t"
> +		"bl	v7_flush_dcache_all\n\t"
> +		"clrex\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
> +		"isb\n\t"
> +		"dsb\n\t"
> +		"ldr	fp, [sp], #4"
> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
> +			"r9", "r10", "lr", "memory");
> +
> +		cci_disable_port_by_cpu(mpidr);
> +
> +		__mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
> +
> +	} else {
> +		arch_spin_unlock(&edcs_lock);
> +		/*
> +			* We need to disable and flush only the L1 cache.
> +			* Let's do it in the safest possible way as above.
> +		*/
> +		asm volatile(
> +		"str	fp, [sp, #-4]!\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
> +		"isb\n\t"
> +		"bl	v7_flush_dcache_louis\n\t"
> +		"clrex\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
> +		"isb\n\t"
> +		"dsb\n\t"
> +		"ldr	fp, [sp], #4"
> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
> +		      "r9", "r10", "lr", "memory");
> +
> +	}
> +	__mcpm_cpu_down(cpu, cluster);
> +
> +	if (!skip_wfi) {
> +		exynos_core_power_down(cpu, cluster);
> +		wfi();
> +	}
> +}

I did not looked line by line but these functions looks very similar 
than the tc2_pm.c's function. no ?

May be some code consolidation could be considered here.

Added Nico and Lorenzo in Cc.

Thanks
   -- Daniel

> +static const struct mcpm_platform_ops exynos_power_ops = {
> +	.power_up	= exynos_power_up,
> +	.power_down	= exynos_power_down,
> +};
> +
> +static void __init edcs_data_init(void)
> +{
> +	unsigned int mpidr, cpu, cluster;
> +
> +	mpidr = read_cpuid_mpidr();
> +	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> +	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> +
> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
> +	edcs_use_count[cpu][cluster] = 1;
> +	++core_count[cluster];
> +}
> +
> +/*
> + * Enable cluster-level coherency, in preparation for turning on the MMU.
> + */
> +static void __naked edcs_power_up_setup(unsigned int affinity_level)
> +{
> +	asm volatile ("\n"
> +	"b	cci_enable_port_for_self");
> +}
> +
> +static int __init edcs_init(void)
> +{
> +	int ret;
> +	struct device_node *node;
> +
> +	node = of_find_compatible_node(NULL, NULL, "samsung,exynos5410");
> +	if (!node)
> +		return -ENODEV;
> +
> +	if (!cci_probed())
> +		return -ENODEV;
> +
> +	/*
> +	 * Future entries into the kernel can now go
> +	 * through the cluster entry vectors.
> +	 */
> +	__raw_writel(virt_to_phys(mcpm_entry_point),
> +				S5P_VA_SYSRAM_NS + 0x1c);
> +
> +	edcs_data_init();
> +	mcpm_smp_set_ops();
> +
> +	ret = mcpm_platform_register(&exynos_power_ops);
> +	if (!ret) {
> +		mcpm_sync_init(edcs_power_up_setup);
> +		pr_info("EDCS power management initialized\n");
> +	}
> +	return ret;
> +}
> +
> +early_initcall(edcs_init);
>


-- 
  <http://www.linaro.org/> Linaro.org ? Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
  2013-10-14 15:08 [PATCH v2 0/4] Exynos 5410 Dual cluster support Vyacheslav Tyrtov
@ 2013-10-14 15:08   ` Vyacheslav Tyrtov
  0 siblings, 0 replies; 18+ messages in thread
From: Vyacheslav Tyrtov @ 2013-10-14 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rob Herring, Pawel Moll, Mark Rutland, Stephen Warren,
	Ian Campbell, Rob Landley, Kukjin Kim, Russell King, Ben Dooks,
	Mike Turquette, Daniel Lezcano, Thomas Gleixner, Heiko Stuebner,
	Naour Romain, devicetree, linux-doc, linux-arm-kernel,
	linux-samsung-soc, Tarek Dakhran, Tyrtov Vyacheslav

From: Tarek Dakhran <t.dakhran@samsung.com>

Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.

Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
---
 arch/arm/mach-exynos/Makefile |   2 +
 arch/arm/mach-exynos/edcs.c   | 270 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 272 insertions(+)
 create mode 100644 arch/arm/mach-exynos/edcs.c

diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
index 5369615..ba6efdb 100644
--- a/arch/arm/mach-exynos/Makefile
+++ b/arch/arm/mach-exynos/Makefile
@@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
 
 obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
 obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
+
+obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
new file mode 100644
index 0000000..e304bd9
--- /dev/null
+++ b/arch/arm/mach-exynos/edcs.c
@@ -0,0 +1,270 @@
+/*
+ * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
+ *
+ * Copyright (c) 2013 Samsung Electronics Co., Ltd.
+ * Author: Tarek Dakhran <t.dakhran@samsung.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * EDCS(exynos dual cluster support) for Exynos5410 SoC.
+ */
+
+#include <linux/init.h>
+#include <linux/io.h>
+#include <linux/of_address.h>
+#include <linux/spinlock.h>
+#include <linux/errno.h>
+
+#include <asm/mcpm.h>
+#include <asm/proc-fns.h>
+#include <asm/cacheflush.h>
+#include <asm/cputype.h>
+#include <asm/cp15.h>
+
+#include <linux/arm-cci.h>
+#include <mach/regs-pmu.h>
+
+#define EDCS_CPUS_PER_CLUSTER	4
+#define EDCS_CLUSTERS		2
+
+/* Exynos5410 power management registers */
+#define EDCS_CORE_CONFIGURATION(_nr)	(S5P_ARM_CORE0_CONFIGURATION	\
+						+ ((_nr) * 0x80))
+#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
+#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
+
+#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)
+#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
+						 _nr * EDCS_CPUS_PER_CLUSTER)
+
+static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
+
+static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
+static int core_count[EDCS_CLUSTERS];
+
+static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
+				bool enable)
+{
+	unsigned int offset = cluster * EDCS_CPUS_PER_CLUSTER + cpu;
+	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
+
+	if ((__raw_readl(EDCS_CORE_STATUS(offset)) & 0x3) != value)
+		__raw_writel(value, EDCS_CORE_CONFIGURATION(offset));
+}
+
+static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
+{
+	exynos_core_power_control(cpu, cluster, true);
+}
+
+static void exynos_core_power_down(unsigned int cpu, unsigned int cluster)
+{
+	exynos_core_power_control(cpu, cluster, false);
+}
+
+void set_boot_flag(unsigned int cpu, unsigned int mode)
+{
+	__raw_writel(mode, REG_CPU_STATE_ADDR(cpu));
+}
+
+static int exynos_power_up(unsigned int cpu, unsigned int cluster)
+{
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER || cluster >= EDCS_CLUSTERS);
+
+	local_irq_disable();
+	arch_spin_lock(&edcs_lock);
+
+	edcs_use_count[cpu][cluster]++;
+	if (edcs_use_count[cpu][cluster] == 1) {
+		++core_count[cluster];
+		set_boot_flag(cpu, 0x2);
+		exynos_core_power_up(cpu, cluster);
+	} else if (edcs_use_count[cpu][cluster] != 2) {
+		/*
+		 * The only possible values are:
+		 * 0 = CPU down
+		 * 1 = CPU (still) up
+		 * 2 = CPU requested to be up before it had a chance
+		 *     to actually make itself down.
+		 * Any other value is a bug.
+		 */
+		BUG();
+	}
+
+	arch_spin_unlock(&edcs_lock);
+	local_irq_enable();
+
+	return 0;
+}
+static void exynos_power_down(void)
+{
+	unsigned int mpidr, cpu, cluster;
+	bool last_man = false, skip_wfi = false;
+
+	mpidr = read_cpuid_mpidr();
+	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
+
+	__mcpm_cpu_going_down(cpu, cluster);
+
+	arch_spin_lock(&edcs_lock);
+	BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
+	edcs_use_count[cpu][cluster]--;
+	if (edcs_use_count[cpu][cluster] == 0) {
+		--core_count[cluster];
+		if (core_count[cluster] == 0)
+			last_man = true;
+	} else if (edcs_use_count[cpu][cluster] == 1) {
+		/*
+		 * A power_up request went ahead of us.
+		 * Even if we do not want to shut this CPU down,
+		 * the caller expects a certain state as if the WFI
+		 * was aborted.  So let's continue with cache cleaning.
+		 */
+		skip_wfi = true;
+	} else
+		BUG();
+
+	if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
+		arch_spin_unlock(&edcs_lock);
+
+		if (read_cpuid_part_number() == ARM_CPU_PART_CORTEX_A15) {
+			/*
+			 * On the Cortex-A15 we need to disable
+			 * L2 prefetching before flushing the cache.
+			 */
+			asm volatile(
+			"mcr	p15, 1, %0, c15, c0, 3\n\t"
+			"isb\n\t"
+			"dsb"
+			: : "r" (0x400));
+		}
+
+		/*
+		 * We need to disable and flush the whole (L1 and L2) cache.
+		 * Let's do it in the safest possible way i.e. with
+		 * no memory access within the following sequence
+		 * including the stack.
+		 *
+		 * Note: fp is preserved to the stack explicitly prior doing
+		 * this since adding it to the clobber list is incompatible
+		 * with having CONFIG_FRAME_POINTER=y.
+		 */
+		asm volatile(
+		"str	fp, [sp, #-4]!\n\t"
+		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
+		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
+		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
+		"isb\n\t"
+		"bl	v7_flush_dcache_all\n\t"
+		"clrex\n\t"
+		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
+		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
+		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
+		"isb\n\t"
+		"dsb\n\t"
+		"ldr	fp, [sp], #4"
+		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
+			"r9", "r10", "lr", "memory");
+
+		cci_disable_port_by_cpu(mpidr);
+
+		__mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
+
+	} else {
+		arch_spin_unlock(&edcs_lock);
+		/*
+			* We need to disable and flush only the L1 cache.
+			* Let's do it in the safest possible way as above.
+		*/
+		asm volatile(
+		"str	fp, [sp, #-4]!\n\t"
+		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
+		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
+		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
+		"isb\n\t"
+		"bl	v7_flush_dcache_louis\n\t"
+		"clrex\n\t"
+		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
+		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
+		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
+		"isb\n\t"
+		"dsb\n\t"
+		"ldr	fp, [sp], #4"
+		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
+		      "r9", "r10", "lr", "memory");
+
+	}
+	__mcpm_cpu_down(cpu, cluster);
+
+	if (!skip_wfi) {
+		exynos_core_power_down(cpu, cluster);
+		wfi();
+	}
+}
+
+static const struct mcpm_platform_ops exynos_power_ops = {
+	.power_up	= exynos_power_up,
+	.power_down	= exynos_power_down,
+};
+
+static void __init edcs_data_init(void)
+{
+	unsigned int mpidr, cpu, cluster;
+
+	mpidr = read_cpuid_mpidr();
+	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
+	edcs_use_count[cpu][cluster] = 1;
+	++core_count[cluster];
+}
+
+/*
+ * Enable cluster-level coherency, in preparation for turning on the MMU.
+ */
+static void __naked edcs_power_up_setup(unsigned int affinity_level)
+{
+	asm volatile ("\n"
+	"b	cci_enable_port_for_self");
+}
+
+static int __init edcs_init(void)
+{
+	int ret;
+	struct device_node *node;
+
+	node = of_find_compatible_node(NULL, NULL, "samsung,exynos5410");
+	if (!node)
+		return -ENODEV;
+
+	if (!cci_probed())
+		return -ENODEV;
+
+	/*
+	 * Future entries into the kernel can now go
+	 * through the cluster entry vectors.
+	 */
+	__raw_writel(virt_to_phys(mcpm_entry_point),
+				S5P_VA_SYSRAM_NS + 0x1c);
+
+	edcs_data_init();
+	mcpm_smp_set_ops();
+
+	ret = mcpm_platform_register(&exynos_power_ops);
+	if (!ret) {
+		mcpm_sync_init(edcs_power_up_setup);
+		pr_info("EDCS power management initialized\n");
+	}
+	return ret;
+}
+
+early_initcall(edcs_init);
-- 
1.8.1.5


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-10-14 15:08   ` Vyacheslav Tyrtov
  0 siblings, 0 replies; 18+ messages in thread
From: Vyacheslav Tyrtov @ 2013-10-14 15:08 UTC (permalink / raw)
  To: linux-arm-kernel

From: Tarek Dakhran <t.dakhran@samsung.com>

Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.

Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
---
 arch/arm/mach-exynos/Makefile |   2 +
 arch/arm/mach-exynos/edcs.c   | 270 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 272 insertions(+)
 create mode 100644 arch/arm/mach-exynos/edcs.c

diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
index 5369615..ba6efdb 100644
--- a/arch/arm/mach-exynos/Makefile
+++ b/arch/arm/mach-exynos/Makefile
@@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
 
 obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
 obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
+
+obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
new file mode 100644
index 0000000..e304bd9
--- /dev/null
+++ b/arch/arm/mach-exynos/edcs.c
@@ -0,0 +1,270 @@
+/*
+ * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
+ *
+ * Copyright (c) 2013 Samsung Electronics Co., Ltd.
+ * Author: Tarek Dakhran <t.dakhran@samsung.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * EDCS(exynos dual cluster support) for Exynos5410 SoC.
+ */
+
+#include <linux/init.h>
+#include <linux/io.h>
+#include <linux/of_address.h>
+#include <linux/spinlock.h>
+#include <linux/errno.h>
+
+#include <asm/mcpm.h>
+#include <asm/proc-fns.h>
+#include <asm/cacheflush.h>
+#include <asm/cputype.h>
+#include <asm/cp15.h>
+
+#include <linux/arm-cci.h>
+#include <mach/regs-pmu.h>
+
+#define EDCS_CPUS_PER_CLUSTER	4
+#define EDCS_CLUSTERS		2
+
+/* Exynos5410 power management registers */
+#define EDCS_CORE_CONFIGURATION(_nr)	(S5P_ARM_CORE0_CONFIGURATION	\
+						+ ((_nr) * 0x80))
+#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
+#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
+
+#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)
+#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
+						 _nr * EDCS_CPUS_PER_CLUSTER)
+
+static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
+
+static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
+static int core_count[EDCS_CLUSTERS];
+
+static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
+				bool enable)
+{
+	unsigned int offset = cluster * EDCS_CPUS_PER_CLUSTER + cpu;
+	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
+
+	if ((__raw_readl(EDCS_CORE_STATUS(offset)) & 0x3) != value)
+		__raw_writel(value, EDCS_CORE_CONFIGURATION(offset));
+}
+
+static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
+{
+	exynos_core_power_control(cpu, cluster, true);
+}
+
+static void exynos_core_power_down(unsigned int cpu, unsigned int cluster)
+{
+	exynos_core_power_control(cpu, cluster, false);
+}
+
+void set_boot_flag(unsigned int cpu, unsigned int mode)
+{
+	__raw_writel(mode, REG_CPU_STATE_ADDR(cpu));
+}
+
+static int exynos_power_up(unsigned int cpu, unsigned int cluster)
+{
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER || cluster >= EDCS_CLUSTERS);
+
+	local_irq_disable();
+	arch_spin_lock(&edcs_lock);
+
+	edcs_use_count[cpu][cluster]++;
+	if (edcs_use_count[cpu][cluster] == 1) {
+		++core_count[cluster];
+		set_boot_flag(cpu, 0x2);
+		exynos_core_power_up(cpu, cluster);
+	} else if (edcs_use_count[cpu][cluster] != 2) {
+		/*
+		 * The only possible values are:
+		 * 0 = CPU down
+		 * 1 = CPU (still) up
+		 * 2 = CPU requested to be up before it had a chance
+		 *     to actually make itself down.
+		 * Any other value is a bug.
+		 */
+		BUG();
+	}
+
+	arch_spin_unlock(&edcs_lock);
+	local_irq_enable();
+
+	return 0;
+}
+static void exynos_power_down(void)
+{
+	unsigned int mpidr, cpu, cluster;
+	bool last_man = false, skip_wfi = false;
+
+	mpidr = read_cpuid_mpidr();
+	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
+
+	__mcpm_cpu_going_down(cpu, cluster);
+
+	arch_spin_lock(&edcs_lock);
+	BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
+	edcs_use_count[cpu][cluster]--;
+	if (edcs_use_count[cpu][cluster] == 0) {
+		--core_count[cluster];
+		if (core_count[cluster] == 0)
+			last_man = true;
+	} else if (edcs_use_count[cpu][cluster] == 1) {
+		/*
+		 * A power_up request went ahead of us.
+		 * Even if we do not want to shut this CPU down,
+		 * the caller expects a certain state as if the WFI
+		 * was aborted.  So let's continue with cache cleaning.
+		 */
+		skip_wfi = true;
+	} else
+		BUG();
+
+	if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
+		arch_spin_unlock(&edcs_lock);
+
+		if (read_cpuid_part_number() == ARM_CPU_PART_CORTEX_A15) {
+			/*
+			 * On the Cortex-A15 we need to disable
+			 * L2 prefetching before flushing the cache.
+			 */
+			asm volatile(
+			"mcr	p15, 1, %0, c15, c0, 3\n\t"
+			"isb\n\t"
+			"dsb"
+			: : "r" (0x400));
+		}
+
+		/*
+		 * We need to disable and flush the whole (L1 and L2) cache.
+		 * Let's do it in the safest possible way i.e. with
+		 * no memory access within the following sequence
+		 * including the stack.
+		 *
+		 * Note: fp is preserved to the stack explicitly prior doing
+		 * this since adding it to the clobber list is incompatible
+		 * with having CONFIG_FRAME_POINTER=y.
+		 */
+		asm volatile(
+		"str	fp, [sp, #-4]!\n\t"
+		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
+		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
+		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
+		"isb\n\t"
+		"bl	v7_flush_dcache_all\n\t"
+		"clrex\n\t"
+		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
+		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
+		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
+		"isb\n\t"
+		"dsb\n\t"
+		"ldr	fp, [sp], #4"
+		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
+			"r9", "r10", "lr", "memory");
+
+		cci_disable_port_by_cpu(mpidr);
+
+		__mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
+
+	} else {
+		arch_spin_unlock(&edcs_lock);
+		/*
+			* We need to disable and flush only the L1 cache.
+			* Let's do it in the safest possible way as above.
+		*/
+		asm volatile(
+		"str	fp, [sp, #-4]!\n\t"
+		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
+		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
+		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
+		"isb\n\t"
+		"bl	v7_flush_dcache_louis\n\t"
+		"clrex\n\t"
+		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
+		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
+		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
+		"isb\n\t"
+		"dsb\n\t"
+		"ldr	fp, [sp], #4"
+		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
+		      "r9", "r10", "lr", "memory");
+
+	}
+	__mcpm_cpu_down(cpu, cluster);
+
+	if (!skip_wfi) {
+		exynos_core_power_down(cpu, cluster);
+		wfi();
+	}
+}
+
+static const struct mcpm_platform_ops exynos_power_ops = {
+	.power_up	= exynos_power_up,
+	.power_down	= exynos_power_down,
+};
+
+static void __init edcs_data_init(void)
+{
+	unsigned int mpidr, cpu, cluster;
+
+	mpidr = read_cpuid_mpidr();
+	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
+	edcs_use_count[cpu][cluster] = 1;
+	++core_count[cluster];
+}
+
+/*
+ * Enable cluster-level coherency, in preparation for turning on the MMU.
+ */
+static void __naked edcs_power_up_setup(unsigned int affinity_level)
+{
+	asm volatile ("\n"
+	"b	cci_enable_port_for_self");
+}
+
+static int __init edcs_init(void)
+{
+	int ret;
+	struct device_node *node;
+
+	node = of_find_compatible_node(NULL, NULL, "samsung,exynos5410");
+	if (!node)
+		return -ENODEV;
+
+	if (!cci_probed())
+		return -ENODEV;
+
+	/*
+	 * Future entries into the kernel can now go
+	 * through the cluster entry vectors.
+	 */
+	__raw_writel(virt_to_phys(mcpm_entry_point),
+				S5P_VA_SYSRAM_NS + 0x1c);
+
+	edcs_data_init();
+	mcpm_smp_set_ops();
+
+	ret = mcpm_platform_register(&exynos_power_ops);
+	if (!ret) {
+		mcpm_sync_init(edcs_power_up_setup);
+		pr_info("EDCS power management initialized\n");
+	}
+	return ret;
+}
+
+early_initcall(edcs_init);
-- 
1.8.1.5

^ permalink raw reply related	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2013-11-04 17:12 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-10-17 16:49 [PATCH v2 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support Dave Martin
  -- strict thread matches above, loose matches on Subject: below --
2013-10-18 12:53 Dave Martin
2013-10-17 15:46 Dave Martin
2013-10-18  8:29 ` Tarek Dakhran
2013-10-18  8:29   ` Tarek Dakhran
2013-10-17 14:32 Dave Martin
2013-10-17 16:04 ` Daniel Lezcano
2013-10-17 16:04   ` Daniel Lezcano
2013-10-17 18:16   ` Nicolas Pitre
2013-10-17 18:16     ` Nicolas Pitre
2013-10-14 15:08 [PATCH v2 0/4] Exynos 5410 Dual cluster support Vyacheslav Tyrtov
2013-10-14 15:08 ` [PATCH v2 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support Vyacheslav Tyrtov
2013-10-14 15:08   ` Vyacheslav Tyrtov
2013-10-17 10:45   ` Daniel Lezcano
2013-10-17 10:45     ` Daniel Lezcano
2013-10-25 10:06   ` Aliaksei Katovich
2013-10-25 10:06     ` Aliaksei Katovich
2013-11-04 10:42     ` Alexei Colin
2013-11-04 17:12       ` Alexei Colin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.