* [PATCH 0/4] ARM: sun9i: SMP bring-up with Multi-Cluster Power Management
@ 2017-07-25  5:09 ` Chen-Yu Tsai
From: Chen-Yu Tsai @ 2017-07-25  5:09 UTC
  To: Maxime Ripard, Russell King
  Cc: linux-sunxi, Chen-Yu Tsai, linux-arm-kernel, linux-kernel,
	devicetree, Nicolas Pitre, Dave Martin

Hi everyone,

This is a partial resend of my sun9i SMP support with MCPM series from
over two years ago [1]. Not much has changed since then. We've tried
to implement PSCI for both the A80 and A83T. Results were not promising.
The issue is that these two chips have a broken security extensions
implementation. If a specific bit is not burned in the chip's e-fuse, most
if not all security protections don't work [2]. Even worse, non-secure
accesses to the GIC become secure. This requires a crazy workaround in
the GIC driver which probably doesn't work in all cases [3].

Nicolas mentioned that the MCPM framework is likely overkill in our
case [4]. However, the framework does provide cluster/core state tracking
and proper sequencing of cache-related operations. We could rework
the code to use standard smp_ops, but I would like to get a working
version merged first.
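
To make the "cluster/core state tracking" point concrete, here is a
minimal user-space sketch of the idea. It is not the kernel's MCPM code:
it only assumes that the framework keeps a per-CPU use count and asks the
platform to power up a cluster when the first core in it comes up. Every
model_* / MODEL_* name is hypothetical, and the two counters stand in for
the platform's .cluster_powerup() and .cpu_powerup() callbacks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; sizes match the A80 (2 clusters of 4 cores). */
#define MODEL_CPUS_PER_CLUSTER	4
#define MODEL_NR_CLUSTERS	2

/* Per-CPU use counts; a cluster is in use if any of its counts is non-zero. */
static int model_use_count[MODEL_NR_CLUSTERS][MODEL_CPUS_PER_CLUSTER];

/* Counters standing in for the platform power-up callbacks. */
static int model_cluster_powerups;
static int model_cpu_powerups;

static bool model_cluster_unused(unsigned int cluster)
{
	unsigned int i;

	for (i = 0; i < MODEL_CPUS_PER_CLUSTER; i++)
		if (model_use_count[cluster][i])
			return false;
	return true;
}

/* Returns 0 on success, -1 on an out-of-range cpu/cluster. */
static int model_cpu_power_up(unsigned int cpu, unsigned int cluster)
{
	bool cpu_was_down, cluster_was_down;

	if (cpu >= MODEL_CPUS_PER_CLUSTER || cluster >= MODEL_NR_CLUSTERS)
		return -1;

	/* Sample both states before taking the new reference. */
	cpu_was_down = !model_use_count[cluster][cpu];
	cluster_was_down = model_cluster_unused(cluster);

	model_use_count[cluster][cpu]++;
	assert(model_use_count[cluster][cpu] > 0);

	/* "First man" in a cluster: the cluster itself must be powered up. */
	if (cluster_was_down)
		model_cluster_powerups++;
	if (cpu_was_down)
		model_cpu_powerups++;

	return 0;
}
```

With the boot CPU already counted in cluster 0, bringing up another core
in cluster 0 triggers only a core power-up, while the first core of
cluster 1 triggers a cluster power-up as well. Plain smp_ops would leave
this bookkeeping to the platform code.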

Core and cluster power-down, aka hotplugging, is not included in this
series. Nicolas mentioned that a new optional callback should be added
in cases where the kernel has to do the actual power down [5]. This
will be done later on. Only patches 1 ~ 4 from the original RFC series
are resent.

Changes since RFC:

  - Have MACH_SUN9I imply MCPM, and have SUN9I_A80_MCPM default to
    MACH_SUN9I. This means no defconfig changes are required.


Please have a look.

Regards
ChenYu

[1] http://www.spinics.net/lists/arm-kernel/msg418350.html
[2] https://lists.denx.de/pipermail/u-boot/2017-June/294637.html
[3] https://github.com/wens/linux/commit/c48654c1f737116e7a7660183c8c74fa91970528
[4] http://www.spinics.net/lists/arm-kernel/msg434160.html
[5] http://www.spinics.net/lists/arm-kernel/msg434408.html


Original cover letter from the old RFC series:

This is my attempt to support SMP and CPU hotplugging on the Allwinner
A80 SoC. The A80 is a big.LITTLE processor with 2 clusters: one of
4x Cortex-A7 cores and one of 4x Cortex-A15 cores.

Much of the sunxi-specific MCPM code is derived from Allwinner code and
documentation, with some references to the other MCPM implementations,
as well as the Cortex Technical Reference Manuals for the power
sequencing info.

One major difference compared to other platforms is we currently do not
have a standalone PMU or other embedded firmware to do the actual power
sequencing. All power/reset control is done by the kernel. As such, I
couldn't figure out where to put the code to power off the outbound
processor. I'm putting it in the .wait_for_powerdown() callback for now.
This does not get called by the big.LITTLE switcher. But since we lack
cpufreq support at the moment, the big.LITTLE switcher is probably not
going to work anyway.

The code has been tested on my A80 Optimus, and reliably brings up all
cores. CPU hotplugging works as well. One issue I have is the processors
in cluster 0 do not stay in WFI after they are signaled to go offline.
I haven't tested the CCI-400 PMU bits yet.

I've done the best I could to fit the code into the new MCPM callbacks,
unlike the Allwinner code which uses the old .power_up()/.power_down()
ones. However my knowledge of ARM internals is limited, so it is quite
possible I got something wrong. Reviews are highly appreciated.

The actual work is split into 3 phases:

Patch 1 adds basic SMP bringup code using the common MCPM code.
No hotplugging is supported.

Patch 2 ~ 4 add the required DT device nodes.

Patch 5 adds support for hotplugging processor cores 1~7.

Patch 6 adds support for cpu0 hotplugging. The BROM checks a region
of secure SRAM for special flags. If they are set, execution is
diverted to the configured secondary startup address, just like it
would be for all the other processor cores.

Patch 7 adds the DT nodes for the secure SRAM.

Chen-Yu Tsai (4):
  ARM: sun9i: Support SMP on A80 with Multi-Cluster Power Management
    (MCPM)
  ARM: dts: sun9i: Add CCI-400 device nodes for A80
  ARM: dts: sun9i: Add CPUCFG device node for A80 dtsi
  ARM: dts: sun9i: Add PRCM device node for the A80 dtsi

 arch/arm/boot/dts/sun9i-a80.dtsi |  56 ++++++
 arch/arm/mach-sunxi/Kconfig      |  10 +
 arch/arm/mach-sunxi/Makefile     |   1 +
 arch/arm/mach-sunxi/mcpm.c       | 391 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 458 insertions(+)
 create mode 100644 arch/arm/mach-sunxi/mcpm.c

-- 
2.13.3

* [PATCH 1/4] ARM: sun9i: Support SMP on A80 with Multi-Cluster Power Management (MCPM)
@ 2017-07-25  5:09   ` Chen-Yu Tsai
From: Chen-Yu Tsai @ 2017-07-25  5:09 UTC
  To: Maxime Ripard, Russell King
  Cc: linux-sunxi, Chen-Yu Tsai, linux-arm-kernel, linux-kernel,
	devicetree, Nicolas Pitre, Dave Martin

The A80 is a big.LITTLE SoC with 1 cluster of 4 Cortex-A7s and
1 cluster of 4 Cortex-A15s.

This patch adds support to bring up the second cluster and thus all
cores using the common MCPM code. Core/cluster power down has not
been implemented, thus CPU hotplugging and the big.LITTLE switcher are
not supported.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
---
 arch/arm/mach-sunxi/Kconfig  |  10 ++
 arch/arm/mach-sunxi/Makefile |   1 +
 arch/arm/mach-sunxi/mcpm.c   | 391 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 402 insertions(+)
 create mode 100644 arch/arm/mach-sunxi/mcpm.c

diff --git a/arch/arm/mach-sunxi/Kconfig b/arch/arm/mach-sunxi/Kconfig
index 58153cdf025b..177380548d99 100644
--- a/arch/arm/mach-sunxi/Kconfig
+++ b/arch/arm/mach-sunxi/Kconfig
@@ -47,5 +47,15 @@ config MACH_SUN9I
 	bool "Allwinner (sun9i) SoCs support"
 	default ARCH_SUNXI
 	select ARM_GIC
+	imply MCPM
+
+config SUN9I_A80_MCPM
+	bool "Allwinner A80 Multi-Cluster PM support"
+	depends on MCPM && MACH_SUN9I
+	default MACH_SUN9I
+	select ARM_CCI400_PORT_CTRL
+	help
+	  This is needed to provide CPU and cluster power management
+	  on Allwinner A80 implementing big.LITTLE.
 
 endif
diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile
index 27b168f121a1..e8558912c714 100644
--- a/arch/arm/mach-sunxi/Makefile
+++ b/arch/arm/mach-sunxi/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_ARCH_SUNXI) += sunxi.o
 obj-$(CONFIG_SMP) += platsmp.o
+obj-$(CONFIG_SUN9I_A80_MCPM) += mcpm.o
diff --git a/arch/arm/mach-sunxi/mcpm.c b/arch/arm/mach-sunxi/mcpm.c
new file mode 100644
index 000000000000..4b6e1d6ae379
--- /dev/null
+++ b/arch/arm/mach-sunxi/mcpm.c
@@ -0,0 +1,391 @@
+/*
+ * Copyright (c) 2015 Chen-Yu Tsai
+ *
+ * Chen-Yu Tsai <wens@csie.org>
+ *
+ * arch/arm/mach-sunxi/mcpm.c
+ *
+ * Based on arch/arm/mach-exynos/mcpm-exynos.c and Allwinner code
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/arm-cci.h>
+#include <linux/delay.h>
+#include <linux/io.h>
+#include <linux/of_address.h>
+
+#include <asm/cputype.h>
+#include <asm/cp15.h>
+#include <asm/mcpm.h>
+
+#define SUNXI_CPUS_PER_CLUSTER		4
+#define SUNXI_NR_CLUSTERS		2
+
+#define SUN9I_A80_A15_CLUSTER		1
+
+#define CPUCFG_CX_CTRL_REG0(c)		(0x10 * (c))
+#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(n)	BIT(n)
+#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL	0xf
+#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7	BIT(4)
+#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15	BIT(0)
+#define CPUCFG_CX_CTRL_REG1(c)		(0x10 * (c) + 0x4)
+#define CPUCFG_CX_CTRL_REG1_ACINACTM	BIT(0)
+#define CPUCFG_CX_RST_CTRL(c)		(0x80 + 0x4 * (c))
+#define CPUCFG_CX_RST_CTRL_DBG_SOC_RST	BIT(24)
+#define CPUCFG_CX_RST_CTRL_ETM_RST(n)	BIT(20 + (n))
+#define CPUCFG_CX_RST_CTRL_ETM_RST_ALL	(0xf << 20)
+#define CPUCFG_CX_RST_CTRL_DBG_RST(n)	BIT(16 + (n))
+#define CPUCFG_CX_RST_CTRL_DBG_RST_ALL	(0xf << 16)
+#define CPUCFG_CX_RST_CTRL_H_RST	BIT(12)
+#define CPUCFG_CX_RST_CTRL_L2_RST	BIT(8)
+#define CPUCFG_CX_RST_CTRL_CX_RST(n)	BIT(4 + (n))
+#define CPUCFG_CX_RST_CTRL_CORE_RST(n)	BIT(n)
+
+#define PRCM_CPU_PO_RST_CTRL(c)		(0x4 + 0x4 * (c))
+#define PRCM_CPU_PO_RST_CTRL_CORE(n)	BIT(n)
+#define PRCM_CPU_PO_RST_CTRL_CORE_ALL	0xf
+#define PRCM_PWROFF_GATING_REG(c)	(0x100 + 0x4 * (c))
+#define PRCM_PWROFF_GATING_REG_CLUSTER	BIT(4)
+#define PRCM_PWROFF_GATING_REG_CORE(n)	BIT(n)
+#define PRCM_PWR_SWITCH_REG(c, cpu)	(0x140 + 0x10 * (c) + 0x4 * (cpu))
+#define PRCM_CPU_SOFT_ENTRY_REG		0x164
+
+static void __iomem *cpucfg_base;
+static void __iomem *prcm_base;
+
+static int sunxi_cpu_power_switch_set(unsigned int cpu, unsigned int cluster,
+				      bool enable)
+{
+	u32 reg;
+
+	/* control sequence from Allwinner A80 user manual v1.2 PRCM section */
+	reg = readl(prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+	if (enable) {
+		if (reg == 0x00) {
+			pr_debug("power clamp for cluster %u cpu %u already open\n",
+				 cluster, cpu);
+			return 0;
+		}
+
+		writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+		udelay(10);
+		writel(0xfe, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+		udelay(10);
+		writel(0xf8, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+		udelay(10);
+		writel(0xf0, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+		udelay(10);
+		writel(0x00, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+		udelay(10);
+	} else {
+		writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+		udelay(10);
+	}
+
+	return 0;
+}
+
+static int sunxi_cpu_powerup(unsigned int cpu, unsigned int cluster)
+{
+	u32 reg;
+
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
+		return -EINVAL;
+
+	/* assert processor power-on reset */
+	reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+	reg &= ~PRCM_CPU_PO_RST_CTRL_CORE(cpu);
+	writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+
+	/* Cortex-A7: hold L1 reset disable signal low */
+	if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
+			cluster == SUN9I_A80_A15_CLUSTER)) {
+		reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
+		reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(cpu);
+		writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
+	}
+
+	/* assert processor related resets */
+	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+	reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
+
+	/*
+	 * Allwinner code also asserts resets for NEON on A15. According
+	 * to ARM manuals, asserting power-on reset is sufficient.
+	 */
+	if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
+			cluster == SUN9I_A80_A15_CLUSTER)) {
+		reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
+	}
+	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+	/* open power switch */
+	sunxi_cpu_power_switch_set(cpu, cluster, true);
+
+	/* clear processor power gate */
+	reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	reg &= ~PRCM_PWROFF_GATING_REG_CORE(cpu);
+	writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	udelay(20);
+
+	/* de-assert processor power-on reset */
+	reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+	reg |= PRCM_CPU_PO_RST_CTRL_CORE(cpu);
+	writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+
+	/* de-assert all processor resets */
+	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+	reg |= CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
+	reg |= CPUCFG_CX_RST_CTRL_CORE_RST(cpu);
+	if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
+			cluster == SUN9I_A80_A15_CLUSTER)) {
+		reg |= CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
+	} else {
+		reg |= CPUCFG_CX_RST_CTRL_CX_RST(cpu); /* NEON */
+	}
+	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+	return 0;
+}
+
+static int sunxi_cluster_powerup(unsigned int cluster)
+{
+	u32 reg;
+
+	pr_debug("%s: cluster %u\n", __func__, cluster);
+	if (cluster >= SUNXI_NR_CLUSTERS)
+		return -EINVAL;
+
+	/* assert ACINACTM */
+	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+	reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
+	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+
+	/* assert cluster processor power-on resets */
+	reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+	reg &= ~PRCM_CPU_PO_RST_CTRL_CORE_ALL;
+	writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+
+	/* assert cluster resets */
+	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+	reg &= ~CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
+	reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST_ALL;
+	reg &= ~CPUCFG_CX_RST_CTRL_H_RST;
+	reg &= ~CPUCFG_CX_RST_CTRL_L2_RST;
+
+	/*
+	 * Allwinner code also asserts resets for NEON on A15. According
+	 * to ARM manuals, asserting power-on reset is sufficient.
+	 */
+	if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
+			cluster == SUN9I_A80_A15_CLUSTER)) {
+		reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST_ALL;
+	}
+	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+	/* hold L1/L2 reset disable signals low */
+	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
+	if (of_machine_is_compatible("allwinner,sun9i-a80") &&
+			cluster == SUN9I_A80_A15_CLUSTER) {
+		/* Cortex-A15: hold L2RSTDISABLE low */
+		reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15;
+	} else {
+		/* Cortex-A7: hold L1RSTDISABLE and L2RSTDISABLE low */
+		reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL;
+		reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7;
+	}
+	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
+
+	/* clear cluster power gate */
+	reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	reg &= ~PRCM_PWROFF_GATING_REG_CLUSTER;
+	writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	udelay(20);
+
+	/* de-assert cluster resets */
+	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+	reg |= CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
+	reg |= CPUCFG_CX_RST_CTRL_H_RST;
+	reg |= CPUCFG_CX_RST_CTRL_L2_RST;
+	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+	/* de-assert ACINACTM */
+	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+	reg &= ~CPUCFG_CX_CTRL_REG1_ACINACTM;
+	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+
+	return 0;
+}
+
+static void sunxi_cpu_cache_disable(void)
+{
+	/* Disable and flush the local CPU cache. */
+	v7_exit_coherency_flush(louis);
+}
+
+/*
+ * This bit is shared between the initial mcpm_sync_init call to enable
+ * CCI-400 and proper cluster cache disable before power down.
+ */
+static void sunxi_cluster_cache_disable_without_axi(void)
+{
+	if (read_cpuid_part() == ARM_CPU_PART_CORTEX_A15) {
+		/*
+		 * On the Cortex-A15 we need to disable
+		 * L2 prefetching before flushing the cache.
+		 */
+		asm volatile(
+		"mcr	p15, 1, %0, c15, c0, 3\n"
+		"isb\n"
+		"dsb"
+		: : "r" (0x400));
+	}
+
+	/* Flush all cache levels for this cluster. */
+	v7_exit_coherency_flush(all);
+
+	/*
+	 * Disable cluster-level coherency by masking
+	 * incoming snoops and DVM messages:
+	 */
+	cci_disable_port_by_cpu(read_cpuid_mpidr());
+}
+
+static void sunxi_cluster_cache_disable(void)
+{
+	unsigned int cluster = MPIDR_AFFINITY_LEVEL(read_cpuid_mpidr(), 1);
+	u32 reg;
+
+	pr_info("%s: cluster %u\n", __func__, cluster);
+
+	sunxi_cluster_cache_disable_without_axi();
+
+	/* last man standing, assert ACINACTM */
+	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+	reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
+	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+}
+
+static const struct mcpm_platform_ops sunxi_power_ops = {
+	.cpu_powerup		= sunxi_cpu_powerup,
+	.cluster_powerup	= sunxi_cluster_powerup,
+	.cpu_cache_disable	= sunxi_cpu_cache_disable,
+	.cluster_cache_disable	= sunxi_cluster_cache_disable,
+};
+
+/*
+ * Enable cluster-level coherency, in preparation for turning on the MMU.
+ *
+ * Also enable regional clock gating and L2 data latency settings for
+ * Cortex-A15.
+ */
+static void __naked sunxi_power_up_setup(unsigned int affinity_level)
+{
+	asm volatile (
+		"mrc	p15, 0, r1, c0, c0, 0\n"
+		"movw	r2, #" __stringify(ARM_CPU_PART_MASK & 0xffff) "\n"
+		"movt	r2, #" __stringify(ARM_CPU_PART_MASK >> 16) "\n"
+		"and	r1, r1, r2\n"
+		"movw	r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 & 0xffff) "\n"
+		"movt	r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 >> 16) "\n"
+		"cmp	r1, r2\n"
+		"bne	not_a15\n"
+
+		/* The following is Cortex-A15 specific */
+
+		/* L2CTRL: Enable CPU regional clock gates */
+		"mrc p15, 1, r1, c15, c0, 4\n"
+		"orr r1, r1, #(0x1<<31)\n"
+		"mcr p15, 1, r1, c15, c0, 4\n"
+
+		/* L2ACTLR */
+		"mrc p15, 1, r1, c15, c0, 0\n"
+		/* Enable L2, GIC, and Timer regional clock gates */
+		"orr r1, r1, #(0x1<<26)\n"
+		/* Disable clean/evict from being pushed to external */
+		"orr r1, r1, #(0x1<<3)\n"
+		"mcr p15, 1, r1, c15, c0, 0\n"
+
+		/* L2 data RAM latency */
+		"mrc p15, 1, r1, c9, c0, 2\n"
+		"bic r1, r1, #(0x7<<0)\n"
+		"orr r1, r1, #(0x3<<0)\n"
+		"mcr p15, 1, r1, c9, c0, 2\n"
+
+		/* End of Cortex-A15 specific setup */
+		"not_a15:\n"
+
+		"cmp	r0, #1\n"
+		"bxne	lr\n"
+		"b	cci_enable_port_for_self"
+	);
+}
+
+static void sunxi_mcpm_setup_entry_point(void)
+{
+	__raw_writel(virt_to_phys(mcpm_entry_point),
+		     prcm_base + PRCM_CPU_SOFT_ENTRY_REG);
+}
+
+static int __init sunxi_mcpm_init(void)
+{
+	struct device_node *node;
+	int ret;
+
+	if (!of_machine_is_compatible("allwinner,sun9i-a80"))
+		return -ENODEV;
+
+	if (!cci_probed())
+		return -ENODEV;
+
+	node = of_find_compatible_node(NULL, NULL,
+			"allwinner,sun9i-a80-cpucfg");
+	if (!node)
+		return -ENODEV;
+
+	cpucfg_base = of_iomap(node, 0);
+	of_node_put(node);
+	if (!cpucfg_base) {
+		pr_err("%s: failed to map CPUCFG registers\n", __func__);
+		return -ENOMEM;
+	}
+
+	node = of_find_compatible_node(NULL, NULL,
+			"allwinner,sun9i-a80-prcm");
+	if (!node)
+		return -ENODEV;
+
+	prcm_base = of_iomap(node, 0);
+	of_node_put(node);
+	if (!prcm_base) {
+		pr_err("%s: failed to map PRCM registers\n", __func__);
+		iounmap(cpucfg_base);
+		return -ENOMEM;
+	}
+
+	ret = mcpm_platform_register(&sunxi_power_ops);
+	if (!ret)
+		ret = mcpm_sync_init(sunxi_power_up_setup);
+	if (!ret)
+		/* do not disable AXI master as no one will re-enable it */
+		ret = mcpm_loopback(sunxi_cluster_cache_disable_without_axi);
+	if (ret) {
+		iounmap(cpucfg_base);
+		iounmap(prcm_base);
+		return ret;
+	}
+
+	mcpm_smp_set_ops();
+
+	pr_info("sunxi MCPM support installed\n");
+
+	sunxi_mcpm_setup_entry_point();
+
+	return ret;
+}
+
+early_initcall(sunxi_mcpm_init);
-- 
2.13.3
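
For readers who want to trace the power switch handling in
sunxi_cpu_power_switch_set() without hardware, the following is a rough
user-space model. Only the PRCM_PWR_SWITCH_REG() macro and the 0xff,
0xfe, 0xf8, 0xf0, 0x00 staging values are taken from the patch. The
register array, the write log, and all model_* / MODEL_* names are
hypothetical, the 10us delays between writes are left out, and the model
assumes a closed clamp reads back as 0xff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset macro copied from the patch. */
#define PRCM_PWR_SWITCH_REG(c, cpu)	(0x140 + 0x10 * (c) + 0x4 * (cpu))

/* Everything named model_* / MODEL_* below is hypothetical. */
#define MODEL_CPUS_PER_CLUSTER	4
#define MODEL_NR_CLUSTERS	2
#define MODEL_PRCM_WORDS	(0x200 / 4)
#define MODEL_LOG_MAX		8

static uint32_t model_prcm[MODEL_PRCM_WORDS];	/* stand-in for the PRCM block */
static uint32_t model_log[MODEL_LOG_MAX];	/* values written, in order */
static size_t model_log_len;

/* Close every clamp (assumed to read back as 0xff) and clear the log. */
static void model_reset(void)
{
	unsigned int c, n;

	for (c = 0; c < MODEL_NR_CLUSTERS; c++)
		for (n = 0; n < MODEL_CPUS_PER_CLUSTER; n++)
			model_prcm[PRCM_PWR_SWITCH_REG(c, n) / 4] = 0xff;
	model_log_len = 0;
}

/* Stand-in for writel(): store the value and record it in the log. */
static void model_writel(uint32_t val, unsigned int off)
{
	model_prcm[off / 4] = val;
	if (model_log_len < MODEL_LOG_MAX)
		model_log[model_log_len++] = val;
}

static int model_power_switch_set(unsigned int cpu, unsigned int cluster,
				  int enable)
{
	static const uint32_t stages[] = { 0xff, 0xfe, 0xf8, 0xf0, 0x00 };
	unsigned int off;
	size_t i;

	assert(cpu < MODEL_CPUS_PER_CLUSTER && cluster < MODEL_NR_CLUSTERS);
	off = PRCM_PWR_SWITCH_REG(cluster, cpu);

	if (!enable) {
		/* Closing the clamp is a single write. */
		model_writel(0xff, off);
		return 0;
	}

	/* Already open: nothing to do, as in the patch. */
	if (model_prcm[off / 4] == 0x00)
		return 0;

	/* Open the clamp in stages; the real code waits 10us after each. */
	for (i = 0; i < sizeof(stages) / sizeof(stages[0]); i++)
		model_writel(stages[i], off);

	return 0;
}
```

After model_reset(), opening the clamp for cpu 3 of cluster 1 touches
offset 0x15c and logs five writes ending in 0x00. A second call on the
same core returns early without writing anything.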

+	reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
+
+	/*
+	 * Allwinner code also asserts resets for NEON on A15. According
+	 * to ARM manuals, asserting power-on reset is sufficient.
+	 */
+	if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
+			cluster == SUN9I_A80_A15_CLUSTER)) {
+		reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
+	}
+	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+	/* open power switch */
+	sunxi_cpu_power_switch_set(cpu, cluster, true);
+
+	/* clear processor power gate */
+	reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	reg &= ~PRCM_PWROFF_GATING_REG_CORE(cpu);
+	writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	udelay(20);
+
+	/* de-assert processor power-on reset */
+	reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+	reg |= PRCM_CPU_PO_RST_CTRL_CORE(cpu);
+	writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+
+	/* de-assert all processor resets */
+	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+	reg |= CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
+	reg |= CPUCFG_CX_RST_CTRL_CORE_RST(cpu);
+	if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
+			cluster == SUN9I_A80_A15_CLUSTER)) {
+		reg |= CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
+	} else {
+		reg |= CPUCFG_CX_RST_CTRL_CX_RST(cpu); /* NEON */
+	}
+	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+	return 0;
+}
+
+static int sunxi_cluster_powerup(unsigned int cluster)
+{
+	u32 reg;
+
+	pr_debug("%s: cluster %u\n", __func__, cluster);
+	if (cluster >= SUNXI_NR_CLUSTERS)
+		return -EINVAL;
+
+	/* assert ACINACTM */
+	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+	reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
+	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+
+	/* assert cluster processor power-on resets */
+	reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+	reg &= ~PRCM_CPU_PO_RST_CTRL_CORE_ALL;
+	writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+
+	/* assert cluster resets */
+	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+	reg &= ~CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
+	reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST_ALL;
+	reg &= ~CPUCFG_CX_RST_CTRL_H_RST;
+	reg &= ~CPUCFG_CX_RST_CTRL_L2_RST;
+
+	/*
+	 * Allwinner code also asserts resets for NEON on A15. According
+	 * to ARM manuals, asserting power-on reset is sufficient.
+	 */
+	if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
+			cluster == SUN9I_A80_A15_CLUSTER)) {
+		reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST_ALL;
+	}
+	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+	/* hold L1/L2 reset disable signals low */
+	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
+	if (of_machine_is_compatible("allwinner,sun9i-a80") &&
+			cluster == SUN9I_A80_A15_CLUSTER) {
+		/* Cortex-A15: hold L2RSTDISABLE low */
+		reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15;
+	} else {
+		/* Cortex-A7: hold L1RSTDISABLE and L2RSTDISABLE low */
+		reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL;
+		reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7;
+	}
+	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
+
+	/* clear cluster power gate */
+	reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	reg &= ~PRCM_PWROFF_GATING_REG_CLUSTER;
+	writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	udelay(20);
+
+	/* de-assert cluster resets */
+	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+	reg |= CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
+	reg |= CPUCFG_CX_RST_CTRL_H_RST;
+	reg |= CPUCFG_CX_RST_CTRL_L2_RST;
+	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+	/* de-assert ACINACTM */
+	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+	reg &= ~CPUCFG_CX_CTRL_REG1_ACINACTM;
+	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+
+	return 0;
+}
+
+static void sunxi_cpu_cache_disable(void)
+{
+	/* Disable and flush the local CPU cache. */
+	v7_exit_coherency_flush(louis);
+}
+
+/*
+ * This bit is shared between the initial mcpm_sync_init call to enable
+ * CCI-400 and proper cluster cache disable before power down.
+ */
+static void sunxi_cluster_cache_disable_without_axi(void)
+{
+	if (read_cpuid_part() == ARM_CPU_PART_CORTEX_A15) {
+		/*
+		 * On the Cortex-A15 we need to disable
+		 * L2 prefetching before flushing the cache.
+		 */
+		asm volatile(
+		"mcr	p15, 1, %0, c15, c0, 3\n"
+		"isb\n"
+		"dsb"
+		: : "r" (0x400));
+	}
+
+	/* Flush all cache levels for this cluster. */
+	v7_exit_coherency_flush(all);
+
+	/*
+	 * Disable cluster-level coherency by masking
+	 * incoming snoops and DVM messages:
+	 */
+	cci_disable_port_by_cpu(read_cpuid_mpidr());
+}
+
+static void sunxi_cluster_cache_disable(void)
+{
+	unsigned int cluster = MPIDR_AFFINITY_LEVEL(read_cpuid_mpidr(), 1);
+	u32 reg;
+
+	pr_info("%s: cluster %u\n", __func__, cluster);
+
+	sunxi_cluster_cache_disable_without_axi();
+
+	/* last man standing, assert ACINACTM */
+	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+	reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
+	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+}
+
+static const struct mcpm_platform_ops sunxi_power_ops = {
+	.cpu_powerup		= sunxi_cpu_powerup,
+	.cluster_powerup	= sunxi_cluster_powerup,
+	.cpu_cache_disable	= sunxi_cpu_cache_disable,
+	.cluster_cache_disable	= sunxi_cluster_cache_disable,
+};
+
+/*
+ * Enable cluster-level coherency, in preparation for turning on the MMU.
+ *
+ * Also enable regional clock gating and L2 data latency settings for
+ * Cortex-A15.
+ */
+static void __naked sunxi_power_up_setup(unsigned int affinity_level)
+{
+	asm volatile (
+		"mrc	p15, 0, r1, c0, c0, 0\n"
+		"movw	r2, #" __stringify(ARM_CPU_PART_MASK & 0xffff) "\n"
+		"movt	r2, #" __stringify(ARM_CPU_PART_MASK >> 16) "\n"
+		"and	r1, r1, r2\n"
+		"movw	r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 & 0xffff) "\n"
+		"movt	r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 >> 16) "\n"
+		"cmp	r1, r2\n"
+		"bne	not_a15\n"
+
+		/* The following is Cortex-A15 specific */
+
+		/* L2CTRL: Enable CPU regional clock gates */
+		"mrc p15, 1, r1, c15, c0, 4\n"
+		"orr r1, r1, #(0x1<<31)\n"
+		"mcr p15, 1, r1, c15, c0, 4\n"
+
+		/* L2ACTLR */
+		"mrc p15, 1, r1, c15, c0, 0\n"
+		/* Enable L2, GIC, and Timer regional clock gates */
+		"orr r1, r1, #(0x1<<26)\n"
+		/* Disable clean/evict from being pushed to external */
+		"orr r1, r1, #(0x1<<3)\n"
+		"mcr p15, 1, r1, c15, c0, 0\n"
+
+		/* L2 data RAM latency */
+		"mrc p15, 1, r1, c9, c0, 2\n"
+		"bic r1, r1, #(0x7<<0)\n"
+		"orr r1, r1, #(0x3<<0)\n"
+		"mcr p15, 1, r1, c9, c0, 2\n"
+
+		/* End of Cortex-A15 specific setup */
+		"not_a15:\n"
+
+		"cmp	r0, #1\n"
+		"bxne	lr\n"
+		"b	cci_enable_port_for_self"
+	);
+}
+
+static void sunxi_mcpm_setup_entry_point(void)
+{
+	__raw_writel(virt_to_phys(mcpm_entry_point),
+		     prcm_base + PRCM_CPU_SOFT_ENTRY_REG);
+}
+
+static int __init sunxi_mcpm_init(void)
+{
+	struct device_node *node;
+	int ret;
+
+	if (!of_machine_is_compatible("allwinner,sun9i-a80"))
+		return -ENODEV;
+
+	if (!cci_probed())
+		return -ENODEV;
+
+	node = of_find_compatible_node(NULL, NULL,
+			"allwinner,sun9i-a80-cpucfg");
+	if (!node)
+		return -ENODEV;
+
+	cpucfg_base = of_iomap(node, 0);
+	of_node_put(node);
+	if (!cpucfg_base) {
+		pr_err("%s: failed to map CPUCFG registers\n", __func__);
+		return -ENOMEM;
+	}
+
+	node = of_find_compatible_node(NULL, NULL,
+			"allwinner,sun9i-a80-prcm");
+	if (!node)
+		return -ENODEV;
+
+	prcm_base = of_iomap(node, 0);
+	of_node_put(node);
+	if (!prcm_base) {
+		pr_err("%s: failed to map PRCM registers\n", __func__);
+		iounmap(cpucfg_base);
+		return -ENOMEM;
+	}
+
+	ret = mcpm_platform_register(&sunxi_power_ops);
+	if (!ret)
+		ret = mcpm_sync_init(sunxi_power_up_setup);
+	if (!ret)
+		/* do not disable AXI master as no one will re-enable it */
+		ret = mcpm_loopback(sunxi_cluster_cache_disable_without_axi);
+	if (ret) {
+		iounmap(cpucfg_base);
+		iounmap(prcm_base);
+		return ret;
+	}
+
+	mcpm_smp_set_ops();
+
+	pr_info("sunxi MCPM support installed\n");
+
+	sunxi_mcpm_setup_entry_point();
+
+	return ret;
+}
+
+early_initcall(sunxi_mcpm_init);
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 28+ messages in thread
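
Editorial note on sunxi_cpu_power_switch_set() in the patch above: the staged
writes that open the power clamp can be read as a small table walked in order.
What follows is only a user-space sketch under stated assumptions, not the
driver code. An array stands in for the PRCM register block, the udelay(10)
between steps is reduced to a comment, and fake_prcm, fake_writel, fake_readl,
clamp_steps, open_power_clamp and close_power_clamp are hypothetical names
introduced for illustration. Only the offset macro and the written values are
taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset macro copied from the patch. */
#define PRCM_PWR_SWITCH_REG(c, cpu)	(0x140 + 0x10 * (c) + 0x4 * (cpu))

/* Hypothetical stand-in for the memory-mapped PRCM block (assumption). */
static uint32_t fake_prcm[0x200 / 4];
/* Counts register writes so the length of a sequence can be checked. */
static unsigned int fake_writes;

static void fake_writel(uint32_t val, unsigned int off)
{
	fake_prcm[off / 4] = val;
	fake_writes++;
}

static uint32_t fake_readl(unsigned int off)
{
	return fake_prcm[off / 4];
}

/* Values the patch writes when opening the clamp, in order. */
static const uint8_t clamp_steps[] = { 0xff, 0xfe, 0xf8, 0xf0, 0x00 };

/* Mirrors the "disable" branch of the patch: a single write of 0xff. */
static void close_power_clamp(unsigned int cpu, unsigned int cluster)
{
	fake_writel(0xff, PRCM_PWR_SWITCH_REG(cluster, cpu));
}

/* Opens the clamp in stages; returns the number of writes performed. */
static unsigned int open_power_clamp(unsigned int cpu, unsigned int cluster)
{
	unsigned int off = PRCM_PWR_SWITCH_REG(cluster, cpu);
	unsigned int before = fake_writes;
	size_t i;

	/* Same early exit as the patch: 0x00 means the clamp is already open. */
	if (fake_readl(off) == 0x00)
		return 0;

	for (i = 0; i < sizeof(clamp_steps) / sizeof(clamp_steps[0]); i++) {
		fake_writel(clamp_steps[i], off);
		/* the kernel code waits here with udelay(10) */
	}

	return fake_writes - before;
}
```

With the macro as posted, cluster 1 / cpu 2 resolves to offset
0x140 + 0x10 + 0x8 = 0x158. Keeping the values in a table makes it easy to
compare the sequence against the A80 user manual without changing behaviour.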

* [PATCH 2/4] ARM: dts: sun9i: Add CCI-400 device nodes for A80
@ 2017-07-25  5:09   ` Chen-Yu Tsai
  0 siblings, 0 replies; 28+ messages in thread
From: Chen-Yu Tsai @ 2017-07-25  5:09 UTC (permalink / raw)
  To: Maxime Ripard, Russell King
  Cc: linux-sunxi, Chen-Yu Tsai, linux-arm-kernel, linux-kernel,
	devicetree, Nicolas Pitre, Dave Martin

The A80 includes an ARM CCI-400 interconnect to support multi-cluster
CPU caches.

Also add the maximum clock frequency for the CPUs, as listed in the
A80 Optimus Board FEX file.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
---
 arch/arm/boot/dts/sun9i-a80.dtsi | 46 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/arch/arm/boot/dts/sun9i-a80.dtsi b/arch/arm/boot/dts/sun9i-a80.dtsi
index 759a72317eb8..fc179b8ab038 100644
--- a/arch/arm/boot/dts/sun9i-a80.dtsi
+++ b/arch/arm/boot/dts/sun9i-a80.dtsi
@@ -63,48 +63,64 @@
 		cpu0: cpu@0 {
 			compatible = "arm,cortex-a7";
 			device_type = "cpu";
+			cci-control-port = <&cci_control0>;
+			clock-frequency = <12000000>;
 			reg = <0x0>;
 		};
 
 		cpu1: cpu@1 {
 			compatible = "arm,cortex-a7";
 			device_type = "cpu";
+			cci-control-port = <&cci_control0>;
+			clock-frequency = <12000000>;
 			reg = <0x1>;
 		};
 
 		cpu2: cpu@2 {
 			compatible = "arm,cortex-a7";
 			device_type = "cpu";
+			cci-control-port = <&cci_control0>;
+			clock-frequency = <12000000>;
 			reg = <0x2>;
 		};
 
 		cpu3: cpu@3 {
 			compatible = "arm,cortex-a7";
 			device_type = "cpu";
+			cci-control-port = <&cci_control0>;
+			clock-frequency = <12000000>;
 			reg = <0x3>;
 		};
 
 		cpu4: cpu@100 {
 			compatible = "arm,cortex-a15";
 			device_type = "cpu";
+			cci-control-port = <&cci_control1>;
+			clock-frequency = <18000000>;
 			reg = <0x100>;
 		};
 
 		cpu5: cpu@101 {
 			compatible = "arm,cortex-a15";
 			device_type = "cpu";
+			cci-control-port = <&cci_control1>;
+			clock-frequency = <18000000>;
 			reg = <0x101>;
 		};
 
 		cpu6: cpu@102 {
 			compatible = "arm,cortex-a15";
 			device_type = "cpu";
+			cci-control-port = <&cci_control1>;
+			clock-frequency = <18000000>;
 			reg = <0x102>;
 		};
 
 		cpu7: cpu@103 {
 			compatible = "arm,cortex-a15";
 			device_type = "cpu";
+			cci-control-port = <&cci_control1>;
+			clock-frequency = <18000000>;
 			reg = <0x103>;
 		};
 	};
@@ -436,6 +452,36 @@
 			interrupts = <GIC_PPI 9 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_HIGH)>;
 		};
 
+		cci: cci@01c90000 {
+			compatible = "arm,cci-400";
+			#address-cells = <1>;
+			#size-cells = <1>;
+			reg = <0x01c90000 0x1000>;
+			ranges = <0x0 0x01c90000 0x10000>;
+
+			cci_control0: slave-if@4000 {
+				compatible = "arm,cci-400-ctrl-if";
+				interface-type = "ace";
+				reg = <0x4000 0x1000>;
+			};
+
+			cci_control1: slave-if@5000 {
+				compatible = "arm,cci-400-ctrl-if";
+				interface-type = "ace";
+				reg = <0x5000 0x1000>;
+			};
+
+			pmu@9000 {
+				 compatible = "arm,cci-400-pmu,r1";
+				 reg = <0x9000 0x5000>;
+				 interrupts = <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>,
+					      <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>,
+					      <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>,
+					      <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>,
+					      <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>;
+			};
+		};
+
 		de_clocks: clock@03000000 {
 			compatible = "allwinner,sun9i-a80-de-clks";
 			reg = <0x03000000 0x30>;
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 28+ messages in thread
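
Editorial note on the device tree changes above: each CPU node's reg value is
its MPIDR affinity, and the cci-control-port property ties the Cortex-A7
cluster to slave-if@4000 and the Cortex-A15 cluster to slave-if@5000. The
sketch below, in user-space C, shows how those values relate. It assumes the
usual 8-bit affinity fields and a plain base-plus-offset translation through
the cci node's ranges property; mpidr_affinity, cci_child_to_phys and
cci_ctrl_port_phys are hypothetical helper names, not kernel functions.

```c
#include <stdint.h>

/* Base of the CCI-400 block, from ranges = <0x0 0x01c90000 0x10000>. */
#define CCI_PARENT_BASE		0x01c90000u

/* Affinity fields are 8 bits wide: level 0 is the core, level 1 the cluster. */
static unsigned int mpidr_affinity(uint32_t mpidr, unsigned int level)
{
	return (mpidr >> (8 * level)) & 0xff;
}

/* Translate an address inside the cci node to a physical address. */
static uint32_t cci_child_to_phys(uint32_t child)
{
	return CCI_PARENT_BASE + child;
}

/*
 * Pick the control interface for a CPU, following the cci-control-port
 * properties in the patch: cluster 0 uses slave-if@4000, cluster 1 uses
 * slave-if@5000.
 */
static uint32_t cci_ctrl_port_phys(uint32_t mpidr)
{
	uint32_t off = mpidr_affinity(mpidr, 1) == 0 ? 0x4000 : 0x5000;

	return cci_child_to_phys(off);
}
```

For example, cpu6 has reg = <0x102>, which decodes to core 2 of cluster 1, so
its control port sits at 0x01c90000 + 0x5000 = 0x01c95000.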

* [PATCH 2/4] ARM: dts: sun9i: Add CCI-400 device nodes for A80
@ 2017-07-25  5:09   ` Chen-Yu Tsai
  0 siblings, 0 replies; 28+ messages in thread
From: Chen-Yu Tsai @ 2017-07-25  5:09 UTC (permalink / raw)
  To: linux-arm-kernel

The A80 includes an ARM CCI-400 interconnect to support multi-cluster
CPU caches.

Also add the maximum clock frequency for the CPUs, as listed in the
A80 Optimus Board FEX file.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
---
 arch/arm/boot/dts/sun9i-a80.dtsi | 46 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/arch/arm/boot/dts/sun9i-a80.dtsi b/arch/arm/boot/dts/sun9i-a80.dtsi
index 759a72317eb8..fc179b8ab038 100644
--- a/arch/arm/boot/dts/sun9i-a80.dtsi
+++ b/arch/arm/boot/dts/sun9i-a80.dtsi
@@ -63,48 +63,64 @@
 		cpu0: cpu@0 {
 			compatible = "arm,cortex-a7";
 			device_type = "cpu";
+			cci-control-port = <&cci_control0>;
+			clock-frequency = <12000000>;
 			reg = <0x0>;
 		};
 
 		cpu1: cpu@1 {
 			compatible = "arm,cortex-a7";
 			device_type = "cpu";
+			cci-control-port = <&cci_control0>;
+			clock-frequency = <12000000>;
 			reg = <0x1>;
 		};
 
 		cpu2: cpu at 2 {
 			compatible = "arm,cortex-a7";
 			device_type = "cpu";
+			cci-control-port = <&cci_control0>;
+			clock-frequency = <12000000>;
 			reg = <0x2>;
 		};
 
 		cpu3: cpu at 3 {
 			compatible = "arm,cortex-a7";
 			device_type = "cpu";
+			cci-control-port = <&cci_control0>;
+			clock-frequency = <12000000>;
 			reg = <0x3>;
 		};
 
 		cpu4: cpu at 100 {
 			compatible = "arm,cortex-a15";
 			device_type = "cpu";
+			cci-control-port = <&cci_control1>;
+			clock-frequency = <18000000>;
 			reg = <0x100>;
 		};
 
 		cpu5: cpu at 101 {
 			compatible = "arm,cortex-a15";
 			device_type = "cpu";
+			cci-control-port = <&cci_control1>;
+			clock-frequency = <18000000>;
 			reg = <0x101>;
 		};
 
 		cpu6: cpu at 102 {
 			compatible = "arm,cortex-a15";
 			device_type = "cpu";
+			cci-control-port = <&cci_control1>;
+			clock-frequency = <18000000>;
 			reg = <0x102>;
 		};
 
 		cpu7: cpu at 103 {
 			compatible = "arm,cortex-a15";
 			device_type = "cpu";
+			cci-control-port = <&cci_control1>;
+			clock-frequency = <18000000>;
 			reg = <0x103>;
 		};
 	};
@@ -436,6 +452,36 @@
 			interrupts = <GIC_PPI 9 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_HIGH)>;
 		};
 
+		cci: cci at 01c90000 {
+			compatible = "arm,cci-400";
+			#address-cells = <1>;
+			#size-cells = <1>;
+			reg = <0x01c90000 0x1000>;
+			ranges = <0x0 0x01c90000 0x10000>;
+
+			cci_control0: slave-if at 4000 {
+				compatible = "arm,cci-400-ctrl-if";
+				interface-type = "ace";
+				reg = <0x4000 0x1000>;
+			};
+
+			cci_control1: slave-if at 5000 {
+				compatible = "arm,cci-400-ctrl-if";
+				interface-type = "ace";
+				reg = <0x5000 0x1000>;
+			};
+
+			pmu at 9000 {
+				 compatible = "arm,cci-400-pmu,r1";
+				 reg = <0x9000 0x5000>;
+				 interrupts = <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>,
+					      <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>,
+					      <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>,
+					      <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>,
+					      <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>;
+			};
+		};
+
 		de_clocks: clock at 03000000 {
 			compatible = "allwinner,sun9i-a80-de-clks";
 			reg = <0x03000000 0x30>;
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 3/4] ARM: dts: sun9i: Add CPUCFG device node for A80 dtsi
@ 2017-07-25  5:09   ` Chen-Yu Tsai
  0 siblings, 0 replies; 28+ messages in thread
From: Chen-Yu Tsai @ 2017-07-25  5:09 UTC (permalink / raw)
  To: Maxime Ripard, Russell King
  Cc: linux-sunxi, Chen-Yu Tsai, linux-arm-kernel, linux-kernel,
	devicetree, Nicolas Pitre, Dave Martin

CPUCFG is a collection of registers mapped to the SoC's signals for each
individual processor core and its associated peripherals, such as resets
for the processors and the L1/L2 caches.

These registers are used for SMP bring-up and CPU hotplugging.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
---
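For reviewers, a minimal userspace sketch of how the MCPM code in patch 1
drives the per-cluster reset register in this block. The offsets and bit
positions are copied from that patch; the fake_cpucfg array and the helper
names are made up for illustration and stand in for the real MMIO
accessors, so treat this as a model rather than driver code.

```c
#include <stdint.h>

/* Offsets and bit positions copied from the defines in patch 1/4. */
#define CPUCFG_CX_RST_CTRL(c)		(0x80 + 0x4 * (c))
#define CPUCFG_CX_RST_CTRL_DBG_RST(n)	(1u << (16 + (n)))
#define CPUCFG_CX_RST_CTRL_CORE_RST(n)	(1u << (n))

/* Hypothetical stand-in for the 0x100-byte CPUCFG window; not real MMIO. */
uint32_t fake_cpucfg[0x100 / 4];

uint32_t cpucfg_readl(unsigned int off)
{
	return fake_cpucfg[off / 4];
}

void cpucfg_writel(uint32_t val, unsigned int off)
{
	fake_cpucfg[off / 4] = val;
}

/*
 * The reset bits are active-low: clearing a bit asserts the reset.
 * As in patch 1/4, only the debug reset is asserted here; the core
 * itself is held by the power-on reset in the PRCM.
 */
void core_debug_reset_assert(unsigned int cluster, unsigned int cpu)
{
	uint32_t reg = cpucfg_readl(CPUCFG_CX_RST_CTRL(cluster));

	reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
	cpucfg_writel(reg, CPUCFG_CX_RST_CTRL(cluster));
}

/* Setting the bits releases the core and its debug logic from reset. */
void core_resets_release(unsigned int cluster, unsigned int cpu)
{
	uint32_t reg = cpucfg_readl(CPUCFG_CX_RST_CTRL(cluster));

	reg |= CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
	reg |= CPUCFG_CX_RST_CTRL_CORE_RST(cpu);
	cpucfg_writel(reg, CPUCFG_CX_RST_CTRL(cluster));
}
```

Each cluster gets its own 32-bit reset register, so cluster 1 lands at
offset 0x84, well inside the 0x100 bytes mapped by the node below.
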
 arch/arm/boot/dts/sun9i-a80.dtsi | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/boot/dts/sun9i-a80.dtsi b/arch/arm/boot/dts/sun9i-a80.dtsi
index fc179b8ab038..cc5db467f616 100644
--- a/arch/arm/boot/dts/sun9i-a80.dtsi
+++ b/arch/arm/boot/dts/sun9i-a80.dtsi
@@ -368,6 +368,11 @@
 			#reset-cells = <1>;
 		};
 
+		cpucfg@01700000 {
+			compatible = "allwinner,sun9i-a80-cpucfg";
+			reg = <0x01700000 0x100>;
+		};
+
 		mmc0: mmc@01c0f000 {
 			compatible = "allwinner,sun9i-a80-mmc";
 			reg = <0x01c0f000 0x1000>;
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 4/4] ARM: dts: sun9i: Add PRCM device node for the A80 dtsi
@ 2017-07-25  5:09   ` Chen-Yu Tsai
  0 siblings, 0 replies; 28+ messages in thread
From: Chen-Yu Tsai @ 2017-07-25  5:09 UTC (permalink / raw)
  To: Maxime Ripard, Russell King
  Cc: linux-sunxi, Chen-Yu Tsai, linux-arm-kernel, linux-kernel,
	devicetree, Nicolas Pitre, Dave Martin

The PRCM is a collection of clock controls, reset controls, and various
power switches/gates. Some of these can be independently listed and
supported, while a number of CPU-related ones are used in tandem with
CPUCFG for SMP bring-up and CPU hotplugging.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
---
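For reviewers, a minimal userspace sketch of the staged power switch
sequence that patch 1 applies through this register block. The offset
macro and the 0xff, 0xfe, 0xf8, 0xf0, 0x00 steps are taken from that
patch; the fake_prcm array, the trace buffer and the helper names are
hypothetical, and the 10 us delay after each write is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset copied from the defines in patch 1/4. */
#define PRCM_PWR_SWITCH_REG(c, cpu)	(0x140 + 0x10 * (c) + 0x4 * (cpu))

/* Hypothetical stand-in for the 0x200-byte PRCM window; not real MMIO. */
uint32_t fake_prcm[0x200 / 4];

/* Hypothetical trace of written values, so the staging can be observed. */
uint32_t prcm_trace[8];
size_t prcm_trace_len;

uint32_t prcm_readl(unsigned int off)
{
	return fake_prcm[off / 4];
}

void prcm_writel(uint32_t val, unsigned int off)
{
	fake_prcm[off / 4] = val;
	if (prcm_trace_len < sizeof(prcm_trace) / sizeof(prcm_trace[0]))
		prcm_trace[prcm_trace_len++] = val;
}

/*
 * Open the power switch for one core in stages.
 * Returns the number of register writes performed, 0 if already open.
 */
int power_switch_open(unsigned int cluster, unsigned int cpu)
{
	static const uint32_t steps[] = { 0xff, 0xfe, 0xf8, 0xf0, 0x00 };
	unsigned int off = PRCM_PWR_SWITCH_REG(cluster, cpu);
	size_t i;

	if (prcm_readl(off) == 0x00)
		return 0;	/* switch already open, nothing to do */

	/* The driver waits 10 us after each of these writes. */
	for (i = 0; i < sizeof(steps) / sizeof(steps[0]); i++)
		prcm_writel(steps[i], off);

	return (int)i;
}
```

The highest power switch register, for cluster 1 core 3, sits at offset
0x15c, which fits in the 0x200 bytes mapped by the node below.
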
 arch/arm/boot/dts/sun9i-a80.dtsi | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/boot/dts/sun9i-a80.dtsi b/arch/arm/boot/dts/sun9i-a80.dtsi
index cc5db467f616..cadf3a5e6997 100644
--- a/arch/arm/boot/dts/sun9i-a80.dtsi
+++ b/arch/arm/boot/dts/sun9i-a80.dtsi
@@ -714,6 +714,11 @@
 			interrupts = <GIC_SPI 36 IRQ_TYPE_LEVEL_HIGH>;
 		};
 
+		prcm@08001400 {
+			compatible = "allwinner,sun9i-a80-prcm";
+			reg = <0x08001400 0x200>;
+		};
+
 		apbs_rst: reset@080014b0 {
 			reg = <0x080014b0 0x4>;
 			compatible = "allwinner,sun6i-a31-clock-reset";
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/4] ARM: sun9i: Support SMP on A80 with Multi-Cluster Power Management (MCPM)
  2017-07-25  5:09   ` Chen-Yu Tsai
@ 2017-07-25  7:47     ` Maxime Ripard
  -1 siblings, 0 replies; 28+ messages in thread
From: Maxime Ripard @ 2017-07-25  7:47 UTC (permalink / raw)
  To: Chen-Yu Tsai
  Cc: Russell King, linux-sunxi, linux-arm-kernel, linux-kernel,
	devicetree, Nicolas Pitre, Dave Martin

Hi Chen-Yu,

On Tue, Jul 25, 2017 at 01:09:16PM +0800, Chen-Yu Tsai wrote:
> The A80 is a big.LITTLE SoC with 1 cluster of 4 Cortex-A7s and
> 1 cluster of 4 Cortex-A15s.
> 
> This patch adds support to bring up the second cluster and thus all
> cores using the common MCPM code. Core/cluster power down has not
> been implemented, so CPU hotplugging and the big.LITTLE switcher are
> not supported.
> 
> Signed-off-by: Chen-Yu Tsai <wens@csie.org>
> ---
>  arch/arm/mach-sunxi/Kconfig  |  10 ++
>  arch/arm/mach-sunxi/Makefile |   1 +
>  arch/arm/mach-sunxi/mcpm.c   | 391 +++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 402 insertions(+)
>  create mode 100644 arch/arm/mach-sunxi/mcpm.c
> 
> diff --git a/arch/arm/mach-sunxi/Kconfig b/arch/arm/mach-sunxi/Kconfig
> index 58153cdf025b..177380548d99 100644
> --- a/arch/arm/mach-sunxi/Kconfig
> +++ b/arch/arm/mach-sunxi/Kconfig
> @@ -47,5 +47,15 @@ config MACH_SUN9I
>  	bool "Allwinner (sun9i) SoCs support"
>  	default ARCH_SUNXI
>  	select ARM_GIC
> +	imply MCPM
> +
> +config SUN9I_A80_MCPM
> +	bool "Allwinner A80 Multi-Cluster PM support"
> +	depends on MCPM && MACH_SUN9I
> +	default MACH_SUN9I
> +	select ARM_CCI400_PORT_CTRL
> +	help
> +	  This is needed to provide CPU and cluster power management
> +	  on Allwinner A80 implementing big.LITTLE.

Do we really need an option for that? We don't provide an option to
disable the CPU SMP operations for the rest of the SoCs.

>  endif
> diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile
> index 27b168f121a1..e8558912c714 100644
> --- a/arch/arm/mach-sunxi/Makefile
> +++ b/arch/arm/mach-sunxi/Makefile
> @@ -1,2 +1,3 @@
>  obj-$(CONFIG_ARCH_SUNXI) += sunxi.o
>  obj-$(CONFIG_SMP) += platsmp.o
> +obj-$(CONFIG_SUN9I_A80_MCPM) += mcpm.o
> diff --git a/arch/arm/mach-sunxi/mcpm.c b/arch/arm/mach-sunxi/mcpm.c
> new file mode 100644
> index 000000000000..4b6e1d6ae379
> --- /dev/null
> +++ b/arch/arm/mach-sunxi/mcpm.c
> @@ -0,0 +1,391 @@
> +/*
> + * Copyright (c) 2015 Chen-Yu Tsai
> + *
> + * Chen-Yu Tsai <wens@csie.org>
> + *
> + * arch/arm/mach-sunxi/mcpm.c
> + *
> + * Based on arch/arm/mach-exynos/mcpm-exynos.c and Allwinner code
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/arm-cci.h>
> +#include <linux/delay.h>
> +#include <linux/io.h>
> +#include <linux/of_address.h>
> +
> +#include <asm/cputype.h>
> +#include <asm/cp15.h>
> +#include <asm/mcpm.h>
> +
> +#define SUNXI_CPUS_PER_CLUSTER		4
> +#define SUNXI_NR_CLUSTERS		2
> + 
> +#define SUN9I_A80_A15_CLUSTER		1

Don't we have a way to derive that from the DT?
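
A rough userspace model of that idea, not kernel code: the table mirrors
the cpu nodes in sun9i-a80.dtsi, and the struct and helper names are made
up for illustration. In the kernel this would presumably walk the real cpu
nodes and check their compatible strings instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flattened view of a DT cpu node. */
struct cpu_node {
	const char *compatible;
	uint32_t reg;		/* MPIDR affinity value from the "reg" property */
};

/* Mirrors the eight cpu nodes of the A80 dtsi. */
const struct cpu_node a80_cpus[] = {
	{ "arm,cortex-a7",  0x000 },
	{ "arm,cortex-a7",  0x001 },
	{ "arm,cortex-a7",  0x002 },
	{ "arm,cortex-a7",  0x003 },
	{ "arm,cortex-a15", 0x100 },
	{ "arm,cortex-a15", 0x101 },
	{ "arm,cortex-a15", 0x102 },
	{ "arm,cortex-a15", 0x103 },
};

/* MPIDR affinity level 1 (the cluster) lives in bits [15:8]. */
unsigned int cpu_node_cluster(const struct cpu_node *cpu)
{
	return (cpu->reg >> 8) & 0xff;
}

/* Return the cluster of the first cpu matching compatible, or -1. */
int find_cluster_by_compatible(const struct cpu_node *cpus, size_t count,
			       const char *compatible)
{
	size_t i;

	for (i = 0; i < count; i++)
		if (strcmp(cpus[i].compatible, compatible) == 0)
			return (int)cpu_node_cluster(&cpus[i]);

	return -1;
}
```

With that, the A15 cluster number falls out of a lookup for
"arm,cortex-a15" instead of being hardcoded.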

> +#define CPUCFG_CX_CTRL_REG0(c)		(0x10 * (c))
> +#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(n)	BIT(n)
> +#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL	0xf
> +#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7	BIT(4)
> +#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15	BIT(0)
> +#define CPUCFG_CX_CTRL_REG1(c)		(0x10 * (c) + 0x4)
> +#define CPUCFG_CX_CTRL_REG1_ACINACTM	BIT(0)
> +#define CPUCFG_CX_RST_CTRL(c)		(0x80 + 0x4 * (c))
> +#define CPUCFG_CX_RST_CTRL_DBG_SOC_RST	BIT(24)
> +#define CPUCFG_CX_RST_CTRL_ETM_RST(n)	BIT(20 + (n))
> +#define CPUCFG_CX_RST_CTRL_ETM_RST_ALL	(0xf << 20)
> +#define CPUCFG_CX_RST_CTRL_DBG_RST(n)	BIT(16 + (n))
> +#define CPUCFG_CX_RST_CTRL_DBG_RST_ALL	(0xf << 16)
> +#define CPUCFG_CX_RST_CTRL_H_RST	BIT(12)
> +#define CPUCFG_CX_RST_CTRL_L2_RST	BIT(8)
> +#define CPUCFG_CX_RST_CTRL_CX_RST(n)	BIT(4 + (n))
> +#define CPUCFG_CX_RST_CTRL_CORE_RST(n)	BIT(n)
> +
> +#define PRCM_CPU_PO_RST_CTRL(c)		(0x4 + 0x4 * (c))
> +#define PRCM_CPU_PO_RST_CTRL_CORE(n)	BIT(n)
> +#define PRCM_CPU_PO_RST_CTRL_CORE_ALL	0xf
> +#define PRCM_PWROFF_GATING_REG(c)	(0x100 + 0x4 * (c))
> +#define PRCM_PWROFF_GATING_REG_CLUSTER	BIT(4)
> +#define PRCM_PWROFF_GATING_REG_CORE(n)	BIT(n)
> +#define PRCM_PWR_SWITCH_REG(c, cpu)	(0x140 + 0x10 * (c) + 0x4 * (cpu))
> +#define PRCM_CPU_SOFT_ENTRY_REG		0x164
> +
> +static void __iomem *cpucfg_base;
> +static void __iomem *prcm_base;
> +
> +static int sunxi_cpu_power_switch_set(unsigned int cpu, unsigned int cluster,
> +				      bool enable)
> +{
> +	u32 reg;
> +
> +	/* control sequence from Allwinner A80 user manual v1.2 PRCM section */
> +	reg = readl(prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> +	if (enable) {
> +		if (reg == 0x00) {
> +			pr_debug("power clamp for cluster %u cpu %u already open\n",
> +				 cluster, cpu);
> +			return 0;
> +		}
> +
> +		writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> +		udelay(10);
> +		writel(0xfe, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> +		udelay(10);
> +		writel(0xf8, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> +		udelay(10);
> +		writel(0xf0, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> +		udelay(10);
> +		writel(0x00, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> +		udelay(10);
> +	} else {
> +		writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> +		udelay(10);
> +	}
> +
> +	return 0;
> +}
> +
> +static int sunxi_cpu_powerup(unsigned int cpu, unsigned int cluster)
> +{
> +	u32 reg;
> +
> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> +	if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
> +		return -EINVAL;
> +
> +	/* assert processor power-on reset */
> +	reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> +	reg &= ~PRCM_CPU_PO_RST_CTRL_CORE(cpu);
> +	writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> +
> +	/* Cortex-A7: hold L1 reset disable signal low */
> +	if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
> +			cluster == SUN9I_A80_A15_CLUSTER)) {
> +		reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
> +		reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(cpu);
> +		writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
> +	}
> +
> +	/* assert processor related resets */
> +	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> +	reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
> +
> +	/*
> +	 * Allwinner code also asserts resets for NEON on A15. According
> +	 * to ARM manuals, asserting power-on reset is sufficient.
> +	 */
> +	if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
> +			cluster == SUN9I_A80_A15_CLUSTER)) {
> +		reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
> +	}
> +	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> +
> +	/* open power switch */
> +	sunxi_cpu_power_switch_set(cpu, cluster, true);
> +
> +	/* clear processor power gate */
> +	reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
> +	reg &= ~PRCM_PWROFF_GATING_REG_CORE(cpu);
> +	writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
> +	udelay(20);
> +
> +	/* de-assert processor power-on reset */
> +	reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> +	reg |= PRCM_CPU_PO_RST_CTRL_CORE(cpu);
> +	writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> +
> +	/* de-assert all processor resets */
> +	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> +	reg |= CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
> +	reg |= CPUCFG_CX_RST_CTRL_CORE_RST(cpu);
> +	if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
> +			cluster == SUN9I_A80_A15_CLUSTER)) {
> +		reg |= CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
> +	} else {
> +		reg |= CPUCFG_CX_RST_CTRL_CX_RST(cpu); /* NEON */
> +	}
> +	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> +
> +	return 0;
> +}
> +
> +static int sunxi_cluster_powerup(unsigned int cluster)
> +{
> +	u32 reg;
> +
> +	pr_debug("%s: cluster %u\n", __func__, cluster);
> +	if (cluster >= SUNXI_NR_CLUSTERS)
> +		return -EINVAL;
> +
> +	/* assert ACINACTM */
> +	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> +	reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
> +	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> +
> +	/* assert cluster processor power-on resets */
> +	reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> +	reg &= ~PRCM_CPU_PO_RST_CTRL_CORE_ALL;
> +	writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> +
> +	/* assert cluster resets */
> +	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> +	reg &= ~CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
> +	reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST_ALL;
> +	reg &= ~CPUCFG_CX_RST_CTRL_H_RST;
> +	reg &= ~CPUCFG_CX_RST_CTRL_L2_RST;
> +
> +	/*
> +	 * Allwinner code also asserts resets for NEON on A15. According
> +	 * to ARM manuals, asserting power-on reset is sufficient.
> +	 */
> +	if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
> +			cluster == SUN9I_A80_A15_CLUSTER)) {
> +		reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST_ALL;
> +	}
> +	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> +
> +	/* hold L1/L2 reset disable signals low */
> +	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
> +	if (of_machine_is_compatible("allwinner,sun9i-a80") &&
> +			cluster == SUN9I_A80_A15_CLUSTER) {
> +		/* Cortex-A15: hold L2RSTDISABLE low */
> +		reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15;
> +	} else {
> +		/* Cortex-A7: hold L1RSTDISABLE and L2RSTDISABLE low */
> +		reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL;
> +		reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7;
> +	}
> +	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
> +
> +	/* clear cluster power gate */
> +	reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
> +	reg &= ~PRCM_PWROFF_GATING_REG_CLUSTER;
> +	writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
> +	udelay(20);
> +
> +	/* de-assert cluster resets */
> +	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> +	reg |= CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
> +	reg |= CPUCFG_CX_RST_CTRL_H_RST;
> +	reg |= CPUCFG_CX_RST_CTRL_L2_RST;
> +	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> +
> +	/* de-assert ACINACTM */
> +	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> +	reg &= ~CPUCFG_CX_CTRL_REG1_ACINACTM;
> +	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> +
> +	return 0;
> +}
> +
> +static void sunxi_cpu_cache_disable(void)
> +{
> +	/* Disable and flush the local CPU cache. */
> +	v7_exit_coherency_flush(louis);
> +}
> +
> +/*
> + * This bit is shared between the initial mcpm_sync_init call to enable
> + * CCI-400 and proper cluster cache disable before power down.
> + */
> +static void sunxi_cluster_cache_disable_without_axi(void)
> +{
> +	if (read_cpuid_part() == ARM_CPU_PART_CORTEX_A15) {
> +		/*
> +		 * On the Cortex-A15 we need to disable
> +		 * L2 prefetching before flushing the cache.
> +		 */
> +		asm volatile(
> +		"mcr	p15, 1, %0, c15, c0, 3\n"
> +		"isb\n"
> +		"dsb"
> +		: : "r" (0x400));
> +	}
> +
> +	/* Flush all cache levels for this cluster. */
> +	v7_exit_coherency_flush(all);
> +
> +	/*
> +	 * Disable cluster-level coherency by masking
> +	 * incoming snoops and DVM messages:
> +	 */
> +	cci_disable_port_by_cpu(read_cpuid_mpidr());
> +}
> +
> +static void sunxi_cluster_cache_disable(void)
> +{
> +	unsigned int cluster = MPIDR_AFFINITY_LEVEL(read_cpuid_mpidr(), 1);
> +	u32 reg;
> +
> +	pr_info("%s: cluster %u\n", __func__, cluster);
> +
> +	sunxi_cluster_cache_disable_without_axi();
> +
> +	/* last man standing, assert ACINACTM */
> +	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> +	reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
> +	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> +}
> +
> +static const struct mcpm_platform_ops sunxi_power_ops = {
> +	.cpu_powerup		= sunxi_cpu_powerup,
> +	.cluster_powerup	= sunxi_cluster_powerup,
> +	.cpu_cache_disable	= sunxi_cpu_cache_disable,
> +	.cluster_cache_disable	= sunxi_cluster_cache_disable,
> +};
> +
> +/*
> + * Enable cluster-level coherency, in preparation for turning on the MMU.
> + *
> + * Also enable regional clock gating and L2 data latency settings for
> + * Cortex-A15.
> + */
> +static void __naked sunxi_power_up_setup(unsigned int affinity_level)
> +{
> +	asm volatile (
> +		"mrc	p15, 0, r1, c0, c0, 0\n"
> +		"movw	r2, #" __stringify(ARM_CPU_PART_MASK & 0xffff) "\n"
> +		"movt	r2, #" __stringify(ARM_CPU_PART_MASK >> 16) "\n"
> +		"and	r1, r1, r2\n"
> +		"movw	r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 & 0xffff) "\n"
> +		"movt	r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 >> 16) "\n"
> +		"cmp	r1, r2\n"
> +		"bne	not_a15\n"
> +
> +		/* The following is Cortex-A15 specific */
> +
> +		/* L2CTRL: Enable CPU regional clock gates */
> +		"mrc p15, 1, r1, c15, c0, 4\n"
> +		"orr r1, r1, #(0x1<<31)\n"
> +		"mcr p15, 1, r1, c15, c0, 4\n"
> +
> +		/* L2ACTLR */
> +		"mrc p15, 1, r1, c15, c0, 0\n"
> +		/* Enable L2, GIC, and Timer regional clock gates */
> +		"orr r1, r1, #(0x1<<26)\n"
> +		/* Disable clean/evict from being pushed to external */
> +		"orr r1, r1, #(0x1<<3)\n"
> +		"mcr p15, 1, r1, c15, c0, 0\n"
> +
> +		/* L2 data RAM latency */
> +		"mrc p15, 1, r1, c9, c0, 2\n"
> +		"bic r1, r1, #(0x7<<0)\n"
> +		"orr r1, r1, #(0x3<<0)\n"
> +		"mcr p15, 1, r1, c9, c0, 2\n"
> +
> +		/* End of Cortex-A15 specific setup */
> +		"not_a15:\n"
> +
> +		"cmp	r0, #1\n"
> +		"bxne	lr\n"
> +		"b	cci_enable_port_for_self"
> +	);
> +}
> +
> +static void sunxi_mcpm_setup_entry_point(void)
> +{
> +	__raw_writel(virt_to_phys(mcpm_entry_point),
> +		     prcm_base + PRCM_CPU_SOFT_ENTRY_REG);
> +}
> +
> +static int __init sunxi_mcpm_init(void)
> +{
> +	struct device_node *node;
> +	int ret;
> +
> +	if (!of_machine_is_compatible("allwinner,sun9i-a80"))
> +		return -ENODEV;
> +
> +	if (!cci_probed())
> +		return -ENODEV;
> +
> +	node = of_find_compatible_node(NULL, NULL,
> +			"allwinner,sun9i-a80-cpucfg");
> +	if (!node)
> +		return -ENODEV;
> +
> +	cpucfg_base = of_iomap(node, 0);
> +	of_node_put(node);
> +	if (!cpucfg_base) {
> +		pr_err("%s: failed to map CPUCFG registers\n", __func__);
> +		return -ENOMEM;
> +	}

Can't we request the region as well?

> +
> +	node = of_find_compatible_node(NULL, NULL,
> +			"allwinner,sun9i-a80-prcm");
> +	if (!node)
> +		return -ENODEV;
> +
> +	prcm_base = of_iomap(node, 0);
> +
> +	of_node_put(node);
> +	if (!prcm_base) {
> +		pr_err("%s: failed to map PRCM registers\n", __func__);
> +		iounmap(cpucfg_base);
> +		return -ENOMEM;
> +	}
> +
> +	ret = mcpm_platform_register(&sunxi_power_ops);
> +	if (!ret)
> +		ret = mcpm_sync_init(sunxi_power_up_setup);
> +	if (!ret)
> +		/* do not disable AXI master as no one will re-enable it */
> +		ret = mcpm_loopback(sunxi_cluster_cache_disable_without_axi);
> +	if (ret) {
> +		iounmap(cpucfg_base);
> +		iounmap(prcm_base);
> +		return ret;
> +	}
> +
> +	mcpm_smp_set_ops();
> +
> +	pr_info("sunxi MCPM support installed\n");
> +
> +	sunxi_mcpm_setup_entry_point();
> +
> +	return ret;
> +}

It looks mostly good. I would replace the sunxi prefix with sun9i and
name the file sun9i-mcpm.c.

Thanks!
Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/4] ARM: sun9i: Support SMP on A80 with Multi-Cluster Power Management (MCPM)
@ 2017-07-25  7:47     ` Maxime Ripard
  0 siblings, 0 replies; 28+ messages in thread
From: Maxime Ripard @ 2017-07-25  7:47 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Chen-Yu,

On Tue, Jul 25, 2017 at 01:09:16PM +0800, Chen-Yu Tsai wrote:
> The A80 is a big.LITTLE SoC with 1 cluster of 4 Cortex-A7s and
> 1 cluster of 4 Cortex-A15s.
> 
> This patch adds support to bring up the second cluster and thus all
> cores using the common MCPM code. Core/cluster power down has not
> been implemented, so CPU hotplugging and the big.LITTLE switcher are
> not supported.
> 
> Signed-off-by: Chen-Yu Tsai <wens@csie.org>
> ---
>  arch/arm/mach-sunxi/Kconfig  |  10 ++
>  arch/arm/mach-sunxi/Makefile |   1 +
>  arch/arm/mach-sunxi/mcpm.c   | 391 +++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 402 insertions(+)
>  create mode 100644 arch/arm/mach-sunxi/mcpm.c
> 
> diff --git a/arch/arm/mach-sunxi/Kconfig b/arch/arm/mach-sunxi/Kconfig
> index 58153cdf025b..177380548d99 100644
> --- a/arch/arm/mach-sunxi/Kconfig
> +++ b/arch/arm/mach-sunxi/Kconfig
> @@ -47,5 +47,15 @@ config MACH_SUN9I
>  	bool "Allwinner (sun9i) SoCs support"
>  	default ARCH_SUNXI
>  	select ARM_GIC
> +	imply MCPM
> +
> +config SUN9I_A80_MCPM
> +	bool "Allwinner A80 Multi-Cluster PM support"
> +	depends on MCPM && MACH_SUN9I
> +	default MACH_SUN9I
> +	select ARM_CCI400_PORT_CTRL
> +	help
> +	  This is needed to provide CPU and cluster power management
> +	  on Allwinner A80 implementing big.LITTLE.

Do we really need an option for that? We don't provide an option to
disable the CPU SMP operations for the rest of the SoCs.

>  endif
> diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile
> index 27b168f121a1..e8558912c714 100644
> --- a/arch/arm/mach-sunxi/Makefile
> +++ b/arch/arm/mach-sunxi/Makefile
> @@ -1,2 +1,3 @@
>  obj-$(CONFIG_ARCH_SUNXI) += sunxi.o
>  obj-$(CONFIG_SMP) += platsmp.o
> +obj-$(CONFIG_SUN9I_A80_MCPM) += mcpm.o
> diff --git a/arch/arm/mach-sunxi/mcpm.c b/arch/arm/mach-sunxi/mcpm.c
> new file mode 100644
> index 000000000000..4b6e1d6ae379
> --- /dev/null
> +++ b/arch/arm/mach-sunxi/mcpm.c
> @@ -0,0 +1,391 @@
> +/*
> + * Copyright (c) 2015 Chen-Yu Tsai
> + *
> + * Chen-Yu Tsai <wens@csie.org>
> + *
> + * arch/arm/mach-sunxi/mcpm.c
> + *
> + * Based on arch/arm/mach-exynos/mcpm-exynos.c and Allwinner code
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/arm-cci.h>
> +#include <linux/delay.h>
> +#include <linux/io.h>
> +#include <linux/of_address.h>
> +
> +#include <asm/cputype.h>
> +#include <asm/cp15.h>
> +#include <asm/mcpm.h>
> +
> +#define SUNXI_CPUS_PER_CLUSTER		4
> +#define SUNXI_NR_CLUSTERS		2
> + 
> +#define SUN9I_A80_A15_CLUSTER		1

Don't we have a way to derive that from the DT?

> +#define CPUCFG_CX_CTRL_REG0(c)		(0x10 * (c))
> +#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(n)	BIT(n)
> +#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL	0xf
> +#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7	BIT(4)
> +#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15	BIT(0)
> +#define CPUCFG_CX_CTRL_REG1(c)		(0x10 * (c) + 0x4)
> +#define CPUCFG_CX_CTRL_REG1_ACINACTM	BIT(0)
> +#define CPUCFG_CX_RST_CTRL(c)		(0x80 + 0x4 * (c))
> +#define CPUCFG_CX_RST_CTRL_DBG_SOC_RST	BIT(24)
> +#define CPUCFG_CX_RST_CTRL_ETM_RST(n)	BIT(20 + (n))
> +#define CPUCFG_CX_RST_CTRL_ETM_RST_ALL	(0xf << 20)
> +#define CPUCFG_CX_RST_CTRL_DBG_RST(n)	BIT(16 + (n))
> +#define CPUCFG_CX_RST_CTRL_DBG_RST_ALL	(0xf << 16)
> +#define CPUCFG_CX_RST_CTRL_H_RST	BIT(12)
> +#define CPUCFG_CX_RST_CTRL_L2_RST	BIT(8)
> +#define CPUCFG_CX_RST_CTRL_CX_RST(n)	BIT(4 + (n))
> +#define CPUCFG_CX_RST_CTRL_CORE_RST(n)	BIT(n)
> +
> +#define PRCM_CPU_PO_RST_CTRL(c)		(0x4 + 0x4 * (c))
> +#define PRCM_CPU_PO_RST_CTRL_CORE(n)	BIT(n)
> +#define PRCM_CPU_PO_RST_CTRL_CORE_ALL	0xf
> +#define PRCM_PWROFF_GATING_REG(c)	(0x100 + 0x4 * (c))
> +#define PRCM_PWROFF_GATING_REG_CLUSTER	BIT(4)
> +#define PRCM_PWROFF_GATING_REG_CORE(n)	BIT(n)
> +#define PRCM_PWR_SWITCH_REG(c, cpu)	(0x140 + 0x10 * (c) + 0x4 * (cpu))
> +#define PRCM_CPU_SOFT_ENTRY_REG		0x164
> +
> +static void __iomem *cpucfg_base;
> +static void __iomem *prcm_base;
> +
> +static int sunxi_cpu_power_switch_set(unsigned int cpu, unsigned int cluster,
> +				      bool enable)
> +{
> +	u32 reg;
> +
> +	/* control sequence from Allwinner A80 user manual v1.2 PRCM section */
> +	reg = readl(prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> +	if (enable) {
> +		if (reg == 0x00) {
> +			pr_debug("power clamp for cluster %u cpu %u already open\n",
> +				 cluster, cpu);
> +			return 0;
> +		}
> +
> +		writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> +		udelay(10);
> +		writel(0xfe, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> +		udelay(10);
> +		writel(0xf8, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> +		udelay(10);
> +		writel(0xf0, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> +		udelay(10);
> +		writel(0x00, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> +		udelay(10);
> +	} else {
> +		writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> +		udelay(10);
> +	}
> +
> +	return 0;
> +}
> +
> +static int sunxi_cpu_powerup(unsigned int cpu, unsigned int cluster)
> +{
> +	u32 reg;
> +
> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> +	if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
> +		return -EINVAL;
> +
> +	/* assert processor power-on reset */
> +	reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> +	reg &= ~PRCM_CPU_PO_RST_CTRL_CORE(cpu);
> +	writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> +
> +	/* Cortex-A7: hold L1 reset disable signal low */
> +	if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
> +			cluster == SUN9I_A80_A15_CLUSTER)) {
> +		reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
> +		reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(cpu);
> +		writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
> +	}
> +
> +	/* assert processor related resets */
> +	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> +	reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
> +
> +	/*
> +	 * Allwinner code also asserts resets for NEON on A15. According
> +	 * to ARM manuals, asserting power-on reset is sufficient.
> +	 */
> +	if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
> +			cluster == SUN9I_A80_A15_CLUSTER)) {
> +		reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
> +	}
> +	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> +
> +	/* open power switch */
> +	sunxi_cpu_power_switch_set(cpu, cluster, true);
> +
> +	/* clear processor power gate */
> +	reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
> +	reg &= ~PRCM_PWROFF_GATING_REG_CORE(cpu);
> +	writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
> +	udelay(20);
> +
> +	/* de-assert processor power-on reset */
> +	reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> +	reg |= PRCM_CPU_PO_RST_CTRL_CORE(cpu);
> +	writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> +
> +	/* de-assert all processor resets */
> +	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> +	reg |= CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
> +	reg |= CPUCFG_CX_RST_CTRL_CORE_RST(cpu);
> +	if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
> +			cluster == SUN9I_A80_A15_CLUSTER)) {
> +		reg |= CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
> +	} else {
> +		reg |= CPUCFG_CX_RST_CTRL_CX_RST(cpu); /* NEON */
> +	}
> +	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> +
> +	return 0;
> +}
> +
> +static int sunxi_cluster_powerup(unsigned int cluster)
> +{
> +	u32 reg;
> +
> +	pr_debug("%s: cluster %u\n", __func__, cluster);
> +	if (cluster >= SUNXI_NR_CLUSTERS)
> +		return -EINVAL;
> +
> +	/* assert ACINACTM */
> +	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> +	reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
> +	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> +
> +	/* assert cluster processor power-on resets */
> +	reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> +	reg &= ~PRCM_CPU_PO_RST_CTRL_CORE_ALL;
> +	writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> +
> +	/* assert cluster resets */
> +	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> +	reg &= ~CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
> +	reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST_ALL;
> +	reg &= ~CPUCFG_CX_RST_CTRL_H_RST;
> +	reg &= ~CPUCFG_CX_RST_CTRL_L2_RST;
> +
> +	/*
> +	 * Allwinner code also asserts resets for NEON on A15. According
> +	 * to ARM manuals, asserting power-on reset is sufficient.
> +	 */
> +	if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
> +			cluster == SUN9I_A80_A15_CLUSTER)) {
> +		reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST_ALL;
> +	}
> +	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> +
> +	/* hold L1/L2 reset disable signals low */
> +	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
> +	if (of_machine_is_compatible("allwinner,sun9i-a80") &&
> +			cluster == SUN9I_A80_A15_CLUSTER) {
> +		/* Cortex-A15: hold L2RSTDISABLE low */
> +		reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15;
> +	} else {
> +		/* Cortex-A7: hold L1RSTDISABLE and L2RSTDISABLE low */
> +		reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL;
> +		reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7;
> +	}
> +	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
> +
> +	/* clear cluster power gate */
> +	reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
> +	reg &= ~PRCM_PWROFF_GATING_REG_CLUSTER;
> +	writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
> +	udelay(20);
> +
> +	/* de-assert cluster resets */
> +	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> +	reg |= CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
> +	reg |= CPUCFG_CX_RST_CTRL_H_RST;
> +	reg |= CPUCFG_CX_RST_CTRL_L2_RST;
> +	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> +
> +	/* de-assert ACINACTM */
> +	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> +	reg &= ~CPUCFG_CX_CTRL_REG1_ACINACTM;
> +	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> +
> +	return 0;
> +}
> +
> +static void sunxi_cpu_cache_disable(void)
> +{
> +	/* Disable and flush the local CPU cache. */
> +	v7_exit_coherency_flush(louis);
> +}
> +
> +/*
> + * This bit is shared between the initial mcpm_sync_init call to enable
> + * CCI-400 and proper cluster cache disable before power down.
> + */
> +static void sunxi_cluster_cache_disable_without_axi(void)
> +{
> +	if (read_cpuid_part() == ARM_CPU_PART_CORTEX_A15) {
> +		/*
> +		 * On the Cortex-A15 we need to disable
> +		 * L2 prefetching before flushing the cache.
> +		 */
> +		asm volatile(
> +		"mcr	p15, 1, %0, c15, c0, 3\n"
> +		"isb\n"
> +		"dsb"
> +		: : "r" (0x400));
> +	}
> +
> +	/* Flush all cache levels for this cluster. */
> +	v7_exit_coherency_flush(all);
> +
> +	/*
> +	 * Disable cluster-level coherency by masking
> +	 * incoming snoops and DVM messages:
> +	 */
> +	cci_disable_port_by_cpu(read_cpuid_mpidr());
> +}
> +
> +static void sunxi_cluster_cache_disable(void)
> +{
> +	unsigned int cluster = MPIDR_AFFINITY_LEVEL(read_cpuid_mpidr(), 1);
> +	u32 reg;
> +
> +	pr_info("%s: cluster %u\n", __func__, cluster);
> +
> +	sunxi_cluster_cache_disable_without_axi();
> +
> +	/* last man standing, assert ACINACTM */
> +	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> +	reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
> +	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> +}
> +
> +static const struct mcpm_platform_ops sunxi_power_ops = {
> +	.cpu_powerup		= sunxi_cpu_powerup,
> +	.cluster_powerup	= sunxi_cluster_powerup,
> +	.cpu_cache_disable	= sunxi_cpu_cache_disable,
> +	.cluster_cache_disable	= sunxi_cluster_cache_disable,
> +};
> +
> +/*
> + * Enable cluster-level coherency, in preparation for turning on the MMU.
> + *
> + * Also enable regional clock gating and L2 data latency settings for
> + * Cortex-A15.
> + */
> +static void __naked sunxi_power_up_setup(unsigned int affinity_level)
> +{
> +	asm volatile (
> +		"mrc	p15, 0, r1, c0, c0, 0\n"
> +		"movw	r2, #" __stringify(ARM_CPU_PART_MASK & 0xffff) "\n"
> +		"movt	r2, #" __stringify(ARM_CPU_PART_MASK >> 16) "\n"
> +		"and	r1, r1, r2\n"
> +		"movw	r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 & 0xffff) "\n"
> +		"movt	r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 >> 16) "\n"
> +		"cmp	r1, r2\n"
> +		"bne	not_a15\n"
> +
> +		/* The following is Cortex-A15 specific */
> +
> +		/* L2CTRL: Enable CPU regional clock gates */
> +		"mrc p15, 1, r1, c15, c0, 4\n"
> +		"orr r1, r1, #(0x1<<31)\n"
> +		"mcr p15, 1, r1, c15, c0, 4\n"
> +
> +		/* L2ACTLR */
> +		"mrc p15, 1, r1, c15, c0, 0\n"
> +		/* Enable L2, GIC, and Timer regional clock gates */
> +		"orr r1, r1, #(0x1<<26)\n"
> +		/* Disable clean/evict from being pushed to external */
> +		"orr r1, r1, #(0x1<<3)\n"
> +		"mcr p15, 1, r1, c15, c0, 0\n"
> +
> +		/* L2 data RAM latency */
> +		"mrc p15, 1, r1, c9, c0, 2\n"
> +		"bic r1, r1, #(0x7<<0)\n"
> +		"orr r1, r1, #(0x3<<0)\n"
> +		"mcr p15, 1, r1, c9, c0, 2\n"
> +
> +		/* End of Cortex-A15 specific setup */
> +		"not_a15:\n"
> +
> +		"cmp	r0, #1\n"
> +		"bxne	lr\n"
> +		"b	cci_enable_port_for_self"
> +	);
> +}
> +
> +static void sunxi_mcpm_setup_entry_point(void)
> +{
> +	__raw_writel(virt_to_phys(mcpm_entry_point),
> +		     prcm_base + PRCM_CPU_SOFT_ENTRY_REG);
> +}
> +
> +static int __init sunxi_mcpm_init(void)
> +{
> +	struct device_node *node;
> +	int ret;
> +
> +	if (!of_machine_is_compatible("allwinner,sun9i-a80"))
> +		return -ENODEV;
> +
> +	if (!cci_probed())
> +		return -ENODEV;
> +
> +	node = of_find_compatible_node(NULL, NULL,
> +			"allwinner,sun9i-a80-cpucfg");
> +	if (!node)
> +		return -ENODEV;
> +
> +	cpucfg_base = of_iomap(node, 0);
> +	of_node_put(node);
> +	if (!cpucfg_base) {
> +		pr_err("%s: failed to map CPUCFG registers\n", __func__);
> +		return -ENOMEM;
> +	}

Can't we request the region as well?

> +
> +	node = of_find_compatible_node(NULL, NULL,
> +			"allwinner,sun9i-a80-prcm");
> +	if (!node)
> +		return -ENODEV;
> +
> +	prcm_base = of_iomap(node, 0);
> +
> +	of_node_put(node);
> +	if (!prcm_base) {
> +		pr_err("%s: failed to map PRCM registers\n", __func__);
> +		iounmap(cpucfg_base);
> +		return -ENOMEM;
> +	}
> +
> +	ret = mcpm_platform_register(&sunxi_power_ops);
> +	if (!ret)
> +		ret = mcpm_sync_init(sunxi_power_up_setup);
> +	if (!ret)
> +		/* do not disable AXI master as no one will re-enable it */
> +		ret = mcpm_loopback(sunxi_cluster_cache_disable_without_axi);
> +	if (ret) {
> +		iounmap(cpucfg_base);
> +		iounmap(prcm_base);
> +		return ret;
> +	}
> +
> +	mcpm_smp_set_ops();
> +
> +	pr_info("sunxi MCPM support installed\n");
> +
> +	sunxi_mcpm_setup_entry_point();
> +
> +	return ret;
> +}

It looks mostly good, and I would replace the sunxi by sun9i, and call
that file sun9i-mcpm.c

Thanks!
Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


* Re: [PATCH 1/4] ARM: sun9i: Support SMP on A80 with Multi-Cluster Power Management (MCPM)
  2017-07-25  7:47     ` Maxime Ripard
  (?)
@ 2017-07-25  8:29       ` Chen-Yu Tsai
  -1 siblings, 0 replies; 28+ messages in thread
From: Chen-Yu Tsai @ 2017-07-25  8:29 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: Chen-Yu Tsai, Russell King, linux-sunxi, linux-arm-kernel,
	linux-kernel, devicetree, Nicolas Pitre, Dave Martin

On Tue, Jul 25, 2017 at 3:47 PM, Maxime Ripard
<maxime.ripard@free-electrons.com> wrote:
> Hi Chen-Yu,
>
> On Tue, Jul 25, 2017 at 01:09:16PM +0800, Chen-Yu Tsai wrote:
>> The A80 is a big.LITTLE SoC with 1 cluster of 4 Cortex-A7s and
>> 1 cluster of 4 Cortex-A15s.
>>
>> This patch adds support to bring up the second cluster and thus all
>> cores using the common MCPM code. Core/cluster power down has not
>> been implemented, thus CPU hotplugging and big.LITTLE switcher are
>> not supported.
>>
>> Signed-off-by: Chen-Yu Tsai <wens@csie.org>
>> ---
>>  arch/arm/mach-sunxi/Kconfig  |  10 ++
>>  arch/arm/mach-sunxi/Makefile |   1 +
>>  arch/arm/mach-sunxi/mcpm.c   | 391 +++++++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 402 insertions(+)
>>  create mode 100644 arch/arm/mach-sunxi/mcpm.c
>>
>> diff --git a/arch/arm/mach-sunxi/Kconfig b/arch/arm/mach-sunxi/Kconfig
>> index 58153cdf025b..177380548d99 100644
>> --- a/arch/arm/mach-sunxi/Kconfig
>> +++ b/arch/arm/mach-sunxi/Kconfig
>> @@ -47,5 +47,15 @@ config MACH_SUN9I
>>       bool "Allwinner (sun9i) SoCs support"
>>       default ARCH_SUNXI
>>       select ARM_GIC
>> +     imply MCPM
>> +
>> +config SUN9I_A80_MCPM
>> +     bool "Allwinner A80 Multi-Cluster PM support"
>> +     depends on MCPM && MACH_SUN9I
>> +     default MACH_SUN9I
>> +     select ARM_CCI400_PORT_CTRL
>> +     help
>> +       This is needed to provide CPU and cluster power management
>> +       on Allwinner A80 implementing big.LITTLE.
>
> Do we really need an option for that? We don't provide the option to
> disable the CPU SMP operations for the rest of the SoCs.

It was an option as it also required MCPM and CCI400 support to be built.
We could hide it. Or, using mach-hisi as a reference, we could do:

config MACH_SUN9I
        default ARCH_SUNXI
        select ARM_GIC
        select MCPM if SMP
        select ARM_CCI400_PORT_CTRL if SMP

and in the Makefile:

obj-$(CONFIG_MCPM) += sun9i-mcpm.o

>
>>  endif
>> diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile
>> index 27b168f121a1..e8558912c714 100644
>> --- a/arch/arm/mach-sunxi/Makefile
>> +++ b/arch/arm/mach-sunxi/Makefile
>> @@ -1,2 +1,3 @@
>>  obj-$(CONFIG_ARCH_SUNXI) += sunxi.o
>>  obj-$(CONFIG_SMP) += platsmp.o
>> +obj-$(CONFIG_SUN9I_A80_MCPM) += mcpm.o
>> diff --git a/arch/arm/mach-sunxi/mcpm.c b/arch/arm/mach-sunxi/mcpm.c
>> new file mode 100644
>> index 000000000000..4b6e1d6ae379
>> --- /dev/null
>> +++ b/arch/arm/mach-sunxi/mcpm.c
>> @@ -0,0 +1,391 @@
>> +/*
>> + * Copyright (c) 2015 Chen-Yu Tsai
>> + *
>> + * Chen-Yu Tsai <wens@csie.org>
>> + *
>> + * arch/arm/mach-sunxi/mcpm.c
>> + *
>> + * Based on arch/arm/mach-exynos/mcpm-exynos.c and Allwinner code
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#include <linux/arm-cci.h>
>> +#include <linux/delay.h>
>> +#include <linux/io.h>
>> +#include <linux/of_address.h>
>> +
>> +#include <asm/cputype.h>
>> +#include <asm/cp15.h>
>> +#include <asm/mcpm.h>
>> +
>> +#define SUNXI_CPUS_PER_CLUSTER               4
>> +#define SUNXI_NR_CLUSTERS            2
>> +
>> +#define SUN9I_A80_A15_CLUSTER                1
>
> Don't we have a way to derive that from the DT ?

Indeed we can.

It would be slightly more complicated though:

node = of_cpu_device_node_get(cluster * SUNXI_CPUS_PER_CLUSTER + cpu);
if (of_device_is_compatible(node, "arm,cortex-a15")) {
        ...
}
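
As a rough illustration of that lookup (a userspace model, not kernel
code), the sketch below replaces the device tree with a hypothetical
cpu_compatible[] table and the of_device_is_compatible() check with a
string compare. It assumes logical CPU numbers are assigned cluster by
cluster, which the cluster * SUNXI_CPUS_PER_CLUSTER + cpu expression
above relies on. core_is_a15() is a made-up helper name.

```c
#include <stdbool.h>
#include <string.h>

#define SUNXI_CPUS_PER_CLUSTER	4
#define SUNXI_NR_CLUSTERS	2

/*
 * Hypothetical stand-in for the "compatible" strings of the DT cpu nodes:
 * cluster 0 holds the four Cortex-A7s, cluster 1 the four Cortex-A15s.
 */
static const char *const cpu_compatible[SUNXI_NR_CLUSTERS *
					 SUNXI_CPUS_PER_CLUSTER] = {
	"arm,cortex-a7", "arm,cortex-a7", "arm,cortex-a7", "arm,cortex-a7",
	"arm,cortex-a15", "arm,cortex-a15", "arm,cortex-a15", "arm,cortex-a15",
};

/* Returns true if the core at (cluster, cpu) is described as a Cortex-A15. */
static bool core_is_a15(unsigned int cpu, unsigned int cluster)
{
	unsigned int idx;

	/* Out-of-range coordinates never match. */
	if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
		return false;

	/* Assumed mapping from (cluster, cpu) to the logical CPU number. */
	idx = cluster * SUNXI_CPUS_PER_CLUSTER + cpu;

	return strcmp(cpu_compatible[idx], "arm,cortex-a15") == 0;
}
```

In the kernel, the node returned by of_cpu_device_node_get() holds a
reference, so it would need an of_node_put() once the check is done.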

>
>> +#define CPUCFG_CX_CTRL_REG0(c)               (0x10 * (c))
>> +#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(n)        BIT(n)
>> +#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL       0xf
>> +#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7        BIT(4)
>> +#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15       BIT(0)
>> +#define CPUCFG_CX_CTRL_REG1(c)               (0x10 * (c) + 0x4)
>> +#define CPUCFG_CX_CTRL_REG1_ACINACTM BIT(0)
>> +#define CPUCFG_CX_RST_CTRL(c)                (0x80 + 0x4 * (c))
>> +#define CPUCFG_CX_RST_CTRL_DBG_SOC_RST       BIT(24)
>> +#define CPUCFG_CX_RST_CTRL_ETM_RST(n)        BIT(20 + (n))
>> +#define CPUCFG_CX_RST_CTRL_ETM_RST_ALL       (0xf << 20)
>> +#define CPUCFG_CX_RST_CTRL_DBG_RST(n)        BIT(16 + (n))
>> +#define CPUCFG_CX_RST_CTRL_DBG_RST_ALL       (0xf << 16)
>> +#define CPUCFG_CX_RST_CTRL_H_RST     BIT(12)
>> +#define CPUCFG_CX_RST_CTRL_L2_RST    BIT(8)
>> +#define CPUCFG_CX_RST_CTRL_CX_RST(n) BIT(4 + (n))
>> +#define CPUCFG_CX_RST_CTRL_CORE_RST(n)       BIT(n)
>> +
>> +#define PRCM_CPU_PO_RST_CTRL(c)              (0x4 + 0x4 * (c))
>> +#define PRCM_CPU_PO_RST_CTRL_CORE(n) BIT(n)
>> +#define PRCM_CPU_PO_RST_CTRL_CORE_ALL        0xf
>> +#define PRCM_PWROFF_GATING_REG(c)    (0x100 + 0x4 * (c))
>> +#define PRCM_PWROFF_GATING_REG_CLUSTER       BIT(4)
>> +#define PRCM_PWROFF_GATING_REG_CORE(n)       BIT(n)
>> +#define PRCM_PWR_SWITCH_REG(c, cpu)  (0x140 + 0x10 * (c) + 0x4 * (cpu))
>> +#define PRCM_CPU_SOFT_ENTRY_REG              0x164
>> +
>> +static void __iomem *cpucfg_base;
>> +static void __iomem *prcm_base;
>> +
>> +static int sunxi_cpu_power_switch_set(unsigned int cpu, unsigned int cluster,
>> +                                   bool enable)
>> +{
>> +     u32 reg;
>> +
>> +     /* control sequence from Allwinner A80 user manual v1.2 PRCM section */
>> +     reg = readl(prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>> +     if (enable) {
>> +             if (reg == 0x00) {
>> +                     pr_debug("power clamp for cluster %u cpu %u already open\n",
>> +                              cluster, cpu);
>> +                     return 0;
>> +             }
>> +
>> +             writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>> +             udelay(10);
>> +             writel(0xfe, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>> +             udelay(10);
>> +             writel(0xf8, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>> +             udelay(10);
>> +             writel(0xf0, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>> +             udelay(10);
>> +             writel(0x00, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>> +             udelay(10);
>> +     } else {
>> +             writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>> +             udelay(10);
>> +     }
>> +
>> +     return 0;
>> +}
>> +
>> +static int sunxi_cpu_powerup(unsigned int cpu, unsigned int cluster)
>> +{
>> +     u32 reg;
>> +
>> +     pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
>> +     if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
>> +             return -EINVAL;
>> +
>> +     /* assert processor power-on reset */
>> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>> +     reg &= ~PRCM_CPU_PO_RST_CTRL_CORE(cpu);
>> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>> +
>> +     /* Cortex-A7: hold L1 reset disable signal low */
>> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
>> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
>> +             reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
>> +             reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(cpu);
>> +             writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
>> +     }
>> +
>> +     /* assert processor related resets */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
>> +
>> +     /*
>> +      * Allwinner code also asserts resets for NEON on A15. According
>> +      * to ARM manuals, asserting power-on reset is sufficient.
>> +      */
>> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
>> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
>> +             reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
>> +     }
>> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +
>> +     /* open power switch */
>> +     sunxi_cpu_power_switch_set(cpu, cluster, true);
>> +
>> +     /* clear processor power gate */
>> +     reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
>> +     reg &= ~PRCM_PWROFF_GATING_REG_CORE(cpu);
>> +     writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
>> +     udelay(20);
>> +
>> +     /* de-assert processor power-on reset */
>> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>> +     reg |= PRCM_CPU_PO_RST_CTRL_CORE(cpu);
>> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>> +
>> +     /* de-assert all processor resets */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +     reg |= CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
>> +     reg |= CPUCFG_CX_RST_CTRL_CORE_RST(cpu);
>> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
>> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
>> +             reg |= CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
>> +     } else {
>> +             reg |= CPUCFG_CX_RST_CTRL_CX_RST(cpu); /* NEON */
>> +     }
>> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +
>> +     return 0;
>> +}
>> +
>> +static int sunxi_cluster_powerup(unsigned int cluster)
>> +{
>> +     u32 reg;
>> +
>> +     pr_debug("%s: cluster %u\n", __func__, cluster);
>> +     if (cluster >= SUNXI_NR_CLUSTERS)
>> +             return -EINVAL;
>> +
>> +     /* assert ACINACTM */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>> +     reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
>> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>> +
>> +     /* assert cluster processor power-on resets */
>> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>> +     reg &= ~PRCM_CPU_PO_RST_CTRL_CORE_ALL;
>> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>> +
>> +     /* assert cluster resets */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
>> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST_ALL;
>> +     reg &= ~CPUCFG_CX_RST_CTRL_H_RST;
>> +     reg &= ~CPUCFG_CX_RST_CTRL_L2_RST;
>> +
>> +     /*
>> +      * Allwinner code also asserts resets for NEON on A15. According
>> +      * to ARM manuals, asserting power-on reset is sufficient.
>> +      */
>> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
>> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
>> +             reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST_ALL;
>> +     }
>> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +
>> +     /* hold L1/L2 reset disable signals low */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
>> +     if (of_machine_is_compatible("allwinner,sun9i-a80") &&
>> +                     cluster == SUN9I_A80_A15_CLUSTER) {
>> +             /* Cortex-A15: hold L2RSTDISABLE low */
>> +             reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15;
>> +     } else {
>> +             /* Cortex-A7: hold L1RSTDISABLE and L2RSTDISABLE low */
>> +             reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL;
>> +             reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7;
>> +     }
>> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
>> +
>> +     /* clear cluster power gate */
>> +     reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
>> +     reg &= ~PRCM_PWROFF_GATING_REG_CLUSTER;
>> +     writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
>> +     udelay(20);
>> +
>> +     /* de-assert cluster resets */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +     reg |= CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
>> +     reg |= CPUCFG_CX_RST_CTRL_H_RST;
>> +     reg |= CPUCFG_CX_RST_CTRL_L2_RST;
>> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +
>> +     /* de-assert ACINACTM */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>> +     reg &= ~CPUCFG_CX_CTRL_REG1_ACINACTM;
>> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>> +
>> +     return 0;
>> +}
>> +
>> +static void sunxi_cpu_cache_disable(void)
>> +{
>> +     /* Disable and flush the local CPU cache. */
>> +     v7_exit_coherency_flush(louis);
>> +}
>> +
>> +/*
>> + * This bit is shared between the initial mcpm_sync_init call to enable
>> + * CCI-400 and proper cluster cache disable before power down.
>> + */
>> +static void sunxi_cluster_cache_disable_without_axi(void)
>> +{
>> +     if (read_cpuid_part() == ARM_CPU_PART_CORTEX_A15) {
>> +             /*
>> +              * On the Cortex-A15 we need to disable
>> +              * L2 prefetching before flushing the cache.
>> +              */
>> +             asm volatile(
>> +             "mcr    p15, 1, %0, c15, c0, 3\n"
>> +             "isb\n"
>> +             "dsb"
>> +             : : "r" (0x400));
>> +     }
>> +
>> +     /* Flush all cache levels for this cluster. */
>> +     v7_exit_coherency_flush(all);
>> +
>> +     /*
>> +      * Disable cluster-level coherency by masking
>> +      * incoming snoops and DVM messages:
>> +      */
>> +     cci_disable_port_by_cpu(read_cpuid_mpidr());
>> +}
>> +
>> +static void sunxi_cluster_cache_disable(void)
>> +{
>> +     unsigned int cluster = MPIDR_AFFINITY_LEVEL(read_cpuid_mpidr(), 1);
>> +     u32 reg;
>> +
>> +     pr_info("%s: cluster %u\n", __func__, cluster);
>> +
>> +     sunxi_cluster_cache_disable_without_axi();
>> +
>> +     /* last man standing, assert ACINACTM */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>> +     reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
>> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>> +}
>> +
>> +static const struct mcpm_platform_ops sunxi_power_ops = {
>> +     .cpu_powerup            = sunxi_cpu_powerup,
>> +     .cluster_powerup        = sunxi_cluster_powerup,
>> +     .cpu_cache_disable      = sunxi_cpu_cache_disable,
>> +     .cluster_cache_disable  = sunxi_cluster_cache_disable,
>> +};
>> +
>> +/*
>> + * Enable cluster-level coherency, in preparation for turning on the MMU.
>> + *
>> + * Also enable regional clock gating and L2 data latency settings for
>> + * Cortex-A15.
>> + */
>> +static void __naked sunxi_power_up_setup(unsigned int affinity_level)
>> +{
>> +     asm volatile (
>> +             "mrc    p15, 0, r1, c0, c0, 0\n"
>> +             "movw   r2, #" __stringify(ARM_CPU_PART_MASK & 0xffff) "\n"
>> +             "movt   r2, #" __stringify(ARM_CPU_PART_MASK >> 16) "\n"
>> +             "and    r1, r1, r2\n"
>> +             "movw   r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 & 0xffff) "\n"
>> +             "movt   r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 >> 16) "\n"
>> +             "cmp    r1, r2\n"
>> +             "bne    not_a15\n"
>> +
>> +             /* The following is Cortex-A15 specific */
>> +
>> +             /* L2CTRL: Enable CPU regional clock gates */
>> +             "mrc p15, 1, r1, c15, c0, 4\n"
>> +             "orr r1, r1, #(0x1<<31)\n"
>> +             "mcr p15, 1, r1, c15, c0, 4\n"
>> +
>> +             /* L2ACTLR */
>> +             "mrc p15, 1, r1, c15, c0, 0\n"
>> +             /* Enable L2, GIC, and Timer regional clock gates */
>> +             "orr r1, r1, #(0x1<<26)\n"
>> +             /* Disable clean/evict from being pushed to external */
>> +             "orr r1, r1, #(0x1<<3)\n"
>> +             "mcr p15, 1, r1, c15, c0, 0\n"
>> +
>> +             /* L2 data RAM latency */
>> +             "mrc p15, 1, r1, c9, c0, 2\n"
>> +             "bic r1, r1, #(0x7<<0)\n"
>> +             "orr r1, r1, #(0x3<<0)\n"
>> +             "mcr p15, 1, r1, c9, c0, 2\n"
>> +
>> +             /* End of Cortex-A15 specific setup */
>> +             "not_a15:\n"
>> +
>> +             "cmp    r0, #1\n"
>> +             "bxne   lr\n"
>> +             "b      cci_enable_port_for_self"
>> +     );
>> +}
>> +
>> +static void sunxi_mcpm_setup_entry_point(void)
>> +{
>> +     __raw_writel(virt_to_phys(mcpm_entry_point),
>> +                  prcm_base + PRCM_CPU_SOFT_ENTRY_REG);
>> +}
>> +
>> +static int __init sunxi_mcpm_init(void)
>> +{
>> +     struct device_node *node;
>> +     int ret;
>> +
>> +     if (!of_machine_is_compatible("allwinner,sun9i-a80"))
>> +             return -ENODEV;
>> +
>> +     if (!cci_probed())
>> +             return -ENODEV;
>> +
>> +     node = of_find_compatible_node(NULL, NULL,
>> +                     "allwinner,sun9i-a80-cpucfg");
>> +     if (!node)
>> +             return -ENODEV;
>> +
>> +     cpucfg_base = of_iomap(node, 0);
>> +     of_node_put(node);
>> +     if (!cpucfg_base) {
>> +             pr_err("%s: failed to map CPUCFG registers\n", __func__);
>> +             return -ENOMEM;
>> +     }
>
> Can't we request the region as well?

Yes, we can, but only for the CPUCFG registers. The PRCM block is
shared with the PRCM clock drivers, so it cannot be claimed exclusively
here. :(
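
To show why a shared block cannot be requested twice, here is a small
userspace model of exclusive region claims. It is only a sketch: the
claim table, claim_region() and region_busy() are hypothetical names,
and in the kernel the corresponding calls would be request_mem_region()
or of_io_request_and_map().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CLAIMS	8

/* One exclusively claimed address range [start, start + len). */
struct claim {
	uintptr_t start;
	size_t len;
	const char *owner;
};

static struct claim claims[MAX_CLAIMS];
static size_t nr_claims;

/* Returns true if [start, start + len) overlaps an existing claim. */
static bool region_busy(uintptr_t start, size_t len)
{
	size_t i;

	for (i = 0; i < nr_claims; i++) {
		if (start < claims[i].start + claims[i].len &&
		    claims[i].start < start + len)
			return true;
	}

	return false;
}

/* Claims the range for owner; returns 0 on success, -1 if busy or full. */
static int claim_region(uintptr_t start, size_t len, const char *owner)
{
	if (len == 0 || nr_claims == MAX_CLAIMS || region_busy(start, len))
		return -1;

	claims[nr_claims].start = start;
	claims[nr_claims].len = len;
	claims[nr_claims].owner = owner;
	nr_claims++;

	return 0;
}
```

With placeholder addresses, claim_region(0x2000, 0x400, "prcm-clocks")
followed by claim_region(0x2000, 0x400, "mcpm") returns 0 and then -1,
while a separate range for CPUCFG can still be claimed. That mirrors the
situation above: CPUCFG can be requested, PRCM can only be mapped.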

>
>> +
>> +     node = of_find_compatible_node(NULL, NULL,
>> +                     "allwinner,sun9i-a80-prcm");
>> +     if (!node)
>> +             return -ENODEV;
>> +
>> +     prcm_base = of_iomap(node, 0);
>> +
>> +     of_node_put(node);
>> +     if (!prcm_base) {
>> +             pr_err("%s: failed to map PRCM registers\n", __func__);
>> +             iounmap(cpucfg_base);
>> +             return -ENOMEM;
>> +     }
>> +
>> +     ret = mcpm_platform_register(&sunxi_power_ops);
>> +     if (!ret)
>> +             ret = mcpm_sync_init(sunxi_power_up_setup);
>> +     if (!ret)
>> +             /* do not disable AXI master as no one will re-enable it */
>> +             ret = mcpm_loopback(sunxi_cluster_cache_disable_without_axi);
>> +     if (ret) {
>> +             iounmap(cpucfg_base);
>> +             iounmap(prcm_base);
>> +             return ret;
>> +     }
>> +
>> +     mcpm_smp_set_ops();
>> +
>> +     pr_info("sunxi MCPM support installed\n");
>> +
>> +     sunxi_mcpm_setup_entry_point();
>> +
>> +     return ret;
>> +}
>
> It looks mostly good, and I would replace the sunxi by sun9i, and call
> that file sun9i-mcpm.c

I was hoping to reuse the file for the A83T, which is why it was named
sunxi-mcpm.c or just mcpm.c. Most of the code is similar, except that the
A83T has two revisions, and one of them has two gate/power bits swapped. :(
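
One way a shared file could cope with such a revision difference is a
per-variant description that remaps the affected bits. The sketch below
is purely illustrative: struct mcpm_variant, its fields and the bit
numbers 0 and 4 are placeholders, not taken from any datasheet or from
the actual A83T layout.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)	(1U << (n))

/* Hypothetical per-SoC description; the bit numbers are placeholders. */
struct mcpm_variant {
	const char *name;
	bool swap_bits;		/* true if this revision swaps two bits */
	unsigned int bit_a;	/* first bit of the swapped pair */
	unsigned int bit_b;	/* second bit of the swapped pair */
};

static const struct mcpm_variant variant_plain = {
	.name = "plain",
	.swap_bits = false,
};

static const struct mcpm_variant variant_swapped = {
	.name = "swapped",
	.swap_bits = true,
	.bit_a = 0,	/* placeholder */
	.bit_b = 4,	/* placeholder */
};

/* Translates a logical bit number into the one this variant expects. */
static unsigned int variant_gate_bit(const struct mcpm_variant *v,
				     unsigned int bit)
{
	if (v->swap_bits) {
		if (bit == v->bit_a)
			return v->bit_b;
		if (bit == v->bit_b)
			return v->bit_a;
	}

	return bit;
}

/* Returns the register mask for a logical gating bit on this variant. */
static uint32_t variant_gate_mask(const struct mcpm_variant *v,
				  unsigned int bit)
{
	return BIT(variant_gate_bit(v, bit));
}
```

The power-up code would then build its masks through
variant_gate_mask() instead of using BIT(n) directly, so the rest of
the sequence stays identical across revisions.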

ChenYu

>
> Thanks!
> Maxime
>
> --
> Maxime Ripard, Free Electrons
> Embedded Linux and Kernel engineering
> http://free-electrons.com


* Re: [PATCH 1/4] ARM: sun9i: Support SMP on A80 with Multi-Cluster Power Management (MCPM)
@ 2017-07-25  8:29       ` Chen-Yu Tsai
  0 siblings, 0 replies; 28+ messages in thread
From: Chen-Yu Tsai @ 2017-07-25  8:29 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: Chen-Yu Tsai, Russell King, linux-sunxi, linux-arm-kernel,
	linux-kernel, devicetree, Nicolas Pitre, Dave Martin

On Tue, Jul 25, 2017 at 3:47 PM, Maxime Ripard
<maxime.ripard@free-electrons.com> wrote:
> Hi Chen-Yu,
>
> On Tue, Jul 25, 2017 at 01:09:16PM +0800, Chen-Yu Tsai wrote:
>> The A80 is a big.LITTLE SoC with 1 cluster of 4 Cortex-A7s and
>> 1 cluster of 4 Cortex-A15s.
>>
>> This patch adds support to bring up the second cluster and thus all
>> cores using the common MCPM code. Core/cluster power down has not
>> been implemented, thus CPU hotplugging and big.LITTLE switcher are
>> not supported.
>>
>> Signed-off-by: Chen-Yu Tsai <wens@csie.org>
>> ---
>>  arch/arm/mach-sunxi/Kconfig  |  10 ++
>>  arch/arm/mach-sunxi/Makefile |   1 +
>>  arch/arm/mach-sunxi/mcpm.c   | 391 +++++++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 402 insertions(+)
>>  create mode 100644 arch/arm/mach-sunxi/mcpm.c
>>
>> diff --git a/arch/arm/mach-sunxi/Kconfig b/arch/arm/mach-sunxi/Kconfig
>> index 58153cdf025b..177380548d99 100644
>> --- a/arch/arm/mach-sunxi/Kconfig
>> +++ b/arch/arm/mach-sunxi/Kconfig
>> @@ -47,5 +47,15 @@ config MACH_SUN9I
>>       bool "Allwinner (sun9i) SoCs support"
>>       default ARCH_SUNXI
>>       select ARM_GIC
>> +     imply MCPM
>> +
>> +config SUN9I_A80_MCPM
>> +     bool "Allwinner A80 Multi-Cluster PM support"
>> +     depends on MCPM && MACH_SUN9I
>> +     default MACH_SUN9I
>> +     select ARM_CCI400_PORT_CTRL
>> +     help
>> +       This is needed to provide CPU and cluster power management
>> +       on Allwinner A80 implementing big.LITTLE.
>
> Do we really need an option for that? We don't provide the option to
> disable the CPU SMP operations for the rest of the SoCs.

It was an option as it also required MCPM and CCI400 support to be built.
We could hide it. Or, using mach-hisi as a reference, we could do:

config MACH_SUN9I
        default ARCH_SUNXI
        select ARM_GIC
        select MCPM if SMP
        select ARM_CCI400_PORT_CTRL if SMP

and in the Makefile:

obj-$(CONFIG_MCPM) += sun9i-mcpm.o

>
>>  endif
>> diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile
>> index 27b168f121a1..e8558912c714 100644
>> --- a/arch/arm/mach-sunxi/Makefile
>> +++ b/arch/arm/mach-sunxi/Makefile
>> @@ -1,2 +1,3 @@
>>  obj-$(CONFIG_ARCH_SUNXI) += sunxi.o
>>  obj-$(CONFIG_SMP) += platsmp.o
>> +obj-$(CONFIG_SUN9I_A80_MCPM) += mcpm.o
>> diff --git a/arch/arm/mach-sunxi/mcpm.c b/arch/arm/mach-sunxi/mcpm.c
>> new file mode 100644
>> index 000000000000..4b6e1d6ae379
>> --- /dev/null
>> +++ b/arch/arm/mach-sunxi/mcpm.c
>> @@ -0,0 +1,391 @@
>> +/*
>> + * Copyright (c) 2015 Chen-Yu Tsai
>> + *
>> + * Chen-Yu Tsai <wens@csie.org>
>> + *
>> + * arch/arm/mach-sunxi/mcpm.c
>> + *
>> + * Based on arch/arm/mach-exynos/mcpm-exynos.c and Allwinner code
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#include <linux/arm-cci.h>
>> +#include <linux/delay.h>
>> +#include <linux/io.h>
>> +#include <linux/of_address.h>
>> +
>> +#include <asm/cputype.h>
>> +#include <asm/cp15.h>
>> +#include <asm/mcpm.h>
>> +
>> +#define SUNXI_CPUS_PER_CLUSTER               4
>> +#define SUNXI_NR_CLUSTERS            2
>> +
>> +#define SUN9I_A80_A15_CLUSTER                1
>
> Don't we have a way to derive that from the DT ?

Indeed we can.

It would be slightly more complicated though:

node = of_cpu_device_node_get(cluster * SUNXI_CPUS_PER_CLUSTER + cpu);
if (of_device_is_compatible(node, "arm,cortex-a15")) {
        ...
}

>
>> +#define CPUCFG_CX_CTRL_REG0(c)               (0x10 * (c))
>> +#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(n)        BIT(n)
>> +#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL       0xf
>> +#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7        BIT(4)
>> +#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15       BIT(0)
>> +#define CPUCFG_CX_CTRL_REG1(c)               (0x10 * (c) + 0x4)
>> +#define CPUCFG_CX_CTRL_REG1_ACINACTM BIT(0)
>> +#define CPUCFG_CX_RST_CTRL(c)                (0x80 + 0x4 * (c))
>> +#define CPUCFG_CX_RST_CTRL_DBG_SOC_RST       BIT(24)
>> +#define CPUCFG_CX_RST_CTRL_ETM_RST(n)        BIT(20 + (n))
>> +#define CPUCFG_CX_RST_CTRL_ETM_RST_ALL       (0xf << 20)
>> +#define CPUCFG_CX_RST_CTRL_DBG_RST(n)        BIT(16 + (n))
>> +#define CPUCFG_CX_RST_CTRL_DBG_RST_ALL       (0xf << 16)
>> +#define CPUCFG_CX_RST_CTRL_H_RST     BIT(12)
>> +#define CPUCFG_CX_RST_CTRL_L2_RST    BIT(8)
>> +#define CPUCFG_CX_RST_CTRL_CX_RST(n) BIT(4 + (n))
>> +#define CPUCFG_CX_RST_CTRL_CORE_RST(n)       BIT(n)
>> +
>> +#define PRCM_CPU_PO_RST_CTRL(c)              (0x4 + 0x4 * (c))
>> +#define PRCM_CPU_PO_RST_CTRL_CORE(n) BIT(n)
>> +#define PRCM_CPU_PO_RST_CTRL_CORE_ALL        0xf
>> +#define PRCM_PWROFF_GATING_REG(c)    (0x100 + 0x4 * (c))
>> +#define PRCM_PWROFF_GATING_REG_CLUSTER       BIT(4)
>> +#define PRCM_PWROFF_GATING_REG_CORE(n)       BIT(n)
>> +#define PRCM_PWR_SWITCH_REG(c, cpu)  (0x140 + 0x10 * (c) + 0x4 * (cpu))
>> +#define PRCM_CPU_SOFT_ENTRY_REG              0x164
>> +
>> +static void __iomem *cpucfg_base;
>> +static void __iomem *prcm_base;
>> +
>> +static int sunxi_cpu_power_switch_set(unsigned int cpu, unsigned int cluster,
>> +                                   bool enable)
>> +{
>> +     u32 reg;
>> +
>> +     /* control sequence from Allwinner A80 user manual v1.2 PRCM section */
>> +     reg = readl(prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>> +     if (enable) {
>> +             if (reg == 0x00) {
>> +                     pr_debug("power clamp for cluster %u cpu %u already open\n",
>> +                              cluster, cpu);
>> +                     return 0;
>> +             }
>> +
>> +             writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>> +             udelay(10);
>> +             writel(0xfe, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>> +             udelay(10);
>> +             writel(0xf8, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>> +             udelay(10);
>> +             writel(0xf0, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>> +             udelay(10);
>> +             writel(0x00, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>> +             udelay(10);
>> +     } else {
>> +             writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>> +             udelay(10);
>> +     }
>> +
>> +     return 0;
>> +}
>> +
>> +static int sunxi_cpu_powerup(unsigned int cpu, unsigned int cluster)
>> +{
>> +     u32 reg;
>> +
>> +     pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
>> +     if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
>> +             return -EINVAL;
>> +
>> +     /* assert processor power-on reset */
>> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>> +     reg &= ~PRCM_CPU_PO_RST_CTRL_CORE(cpu);
>> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>> +
>> +     /* Cortex-A7: hold L1 reset disable signal low */
>> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
>> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
>> +             reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
>> +             reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(cpu);
>> +             writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
>> +     }
>> +
>> +     /* assert processor related resets */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
>> +
>> +     /*
>> +      * Allwinner code also asserts resets for NEON on A15. According
>> +      * to ARM manuals, asserting power-on reset is sufficient.
>> +      */
>> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
>> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
>> +             reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
>> +     }
>> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +
>> +     /* open power switch */
>> +     sunxi_cpu_power_switch_set(cpu, cluster, true);
>> +
>> +     /* clear processor power gate */
>> +     reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
>> +     reg &= ~PRCM_PWROFF_GATING_REG_CORE(cpu);
>> +     writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
>> +     udelay(20);
>> +
>> +     /* de-assert processor power-on reset */
>> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>> +     reg |= PRCM_CPU_PO_RST_CTRL_CORE(cpu);
>> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>> +
>> +     /* de-assert all processor resets */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +     reg |= CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
>> +     reg |= CPUCFG_CX_RST_CTRL_CORE_RST(cpu);
>> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
>> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
>> +             reg |= CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
>> +     } else {
>> +             reg |= CPUCFG_CX_RST_CTRL_CX_RST(cpu); /* NEON */
>> +     }
>> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +
>> +     return 0;
>> +}
>> +
>> +static int sunxi_cluster_powerup(unsigned int cluster)
>> +{
>> +     u32 reg;
>> +
>> +     pr_debug("%s: cluster %u\n", __func__, cluster);
>> +     if (cluster >= SUNXI_NR_CLUSTERS)
>> +             return -EINVAL;
>> +
>> +     /* assert ACINACTM */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>> +     reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
>> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>> +
>> +     /* assert cluster processor power-on resets */
>> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>> +     reg &= ~PRCM_CPU_PO_RST_CTRL_CORE_ALL;
>> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>> +
>> +     /* assert cluster resets */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
>> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST_ALL;
>> +     reg &= ~CPUCFG_CX_RST_CTRL_H_RST;
>> +     reg &= ~CPUCFG_CX_RST_CTRL_L2_RST;
>> +
>> +     /*
>> +      * Allwinner code also asserts resets for NEON on A15. According
>> +      * to ARM manuals, asserting power-on reset is sufficient.
>> +      */
>> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
>> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
>> +             reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST_ALL;
>> +     }
>> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +
>> +     /* hold L1/L2 reset disable signals low */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
>> +     if (of_machine_is_compatible("allwinner,sun9i-a80") &&
>> +                     cluster == SUN9I_A80_A15_CLUSTER) {
>> +             /* Cortex-A15: hold L2RSTDISABLE low */
>> +             reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15;
>> +     } else {
>> +             /* Cortex-A7: hold L1RSTDISABLE and L2RSTDISABLE low */
>> +             reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL;
>> +             reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7;
>> +     }
>> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
>> +
>> +     /* clear cluster power gate */
>> +     reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
>> +     reg &= ~PRCM_PWROFF_GATING_REG_CLUSTER;
>> +     writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
>> +     udelay(20);
>> +
>> +     /* de-assert cluster resets */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +     reg |= CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
>> +     reg |= CPUCFG_CX_RST_CTRL_H_RST;
>> +     reg |= CPUCFG_CX_RST_CTRL_L2_RST;
>> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +
>> +     /* de-assert ACINACTM */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>> +     reg &= ~CPUCFG_CX_CTRL_REG1_ACINACTM;
>> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>> +
>> +     return 0;
>> +}
>> +
>> +static void sunxi_cpu_cache_disable(void)
>> +{
>> +     /* Disable and flush the local CPU cache. */
>> +     v7_exit_coherency_flush(louis);
>> +}
>> +
>> +/*
>> + * This bit is shared between the initial mcpm_sync_init call to enable
>> + * CCI-400 and proper cluster cache disable before power down.
>> + */
>> +static void sunxi_cluster_cache_disable_without_axi(void)
>> +{
>> +     if (read_cpuid_part() == ARM_CPU_PART_CORTEX_A15) {
>> +             /*
>> +              * On the Cortex-A15 we need to disable
>> +              * L2 prefetching before flushing the cache.
>> +              */
>> +             asm volatile(
>> +             "mcr    p15, 1, %0, c15, c0, 3\n"
>> +             "isb\n"
>> +             "dsb"
>> +             : : "r" (0x400));
>> +     }
>> +
>> +     /* Flush all cache levels for this cluster. */
>> +     v7_exit_coherency_flush(all);
>> +
>> +     /*
>> +      * Disable cluster-level coherency by masking
>> +      * incoming snoops and DVM messages:
>> +      */
>> +     cci_disable_port_by_cpu(read_cpuid_mpidr());
>> +}
>> +
>> +static void sunxi_cluster_cache_disable(void)
>> +{
>> +     unsigned int cluster = MPIDR_AFFINITY_LEVEL(read_cpuid_mpidr(), 1);
>> +     u32 reg;
>> +
>> +     pr_info("%s: cluster %u\n", __func__, cluster);
>> +
>> +     sunxi_cluster_cache_disable_without_axi();
>> +
>> +     /* last man standing, assert ACINACTM */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>> +     reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
>> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>> +}
>> +
>> +static const struct mcpm_platform_ops sunxi_power_ops = {
>> +     .cpu_powerup            = sunxi_cpu_powerup,
>> +     .cluster_powerup        = sunxi_cluster_powerup,
>> +     .cpu_cache_disable      = sunxi_cpu_cache_disable,
>> +     .cluster_cache_disable  = sunxi_cluster_cache_disable,
>> +};
>> +
>> +/*
>> + * Enable cluster-level coherency, in preparation for turning on the MMU.
>> + *
>> + * Also enable regional clock gating and L2 data latency settings for
>> + * Cortex-A15.
>> + */
>> +static void __naked sunxi_power_up_setup(unsigned int affinity_level)
>> +{
>> +     asm volatile (
>> +             "mrc    p15, 0, r1, c0, c0, 0\n"
>> +             "movw   r2, #" __stringify(ARM_CPU_PART_MASK & 0xffff) "\n"
>> +             "movt   r2, #" __stringify(ARM_CPU_PART_MASK >> 16) "\n"
>> +             "and    r1, r1, r2\n"
>> +             "movw   r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 & 0xffff) "\n"
>> +             "movt   r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 >> 16) "\n"
>> +             "cmp    r1, r2\n"
>> +             "bne    not_a15\n"
>> +
>> +             /* The following is Cortex-A15 specific */
>> +
>> +             /* L2CTRL: Enable CPU regional clock gates */
>> +             "mrc p15, 1, r1, c15, c0, 4\n"
>> +             "orr r1, r1, #(0x1<<31)\n"
>> +             "mcr p15, 1, r1, c15, c0, 4\n"
>> +
>> +             /* L2ACTLR */
>> +             "mrc p15, 1, r1, c15, c0, 0\n"
>> +             /* Enable L2, GIC, and Timer regional clock gates */
>> +             "orr r1, r1, #(0x1<<26)\n"
>> +             /* Disable clean/evict from being pushed to external */
>> +             "orr r1, r1, #(0x1<<3)\n"
>> +             "mcr p15, 1, r1, c15, c0, 0\n"
>> +
>> +             /* L2 data RAM latency */
>> +             "mrc p15, 1, r1, c9, c0, 2\n"
>> +             "bic r1, r1, #(0x7<<0)\n"
>> +             "orr r1, r1, #(0x3<<0)\n"
>> +             "mcr p15, 1, r1, c9, c0, 2\n"
>> +
>> +             /* End of Cortex-A15 specific setup */
>> +             "not_a15:\n"
>> +
>> +             "cmp    r0, #1\n"
>> +             "bxne   lr\n"
>> +             "b      cci_enable_port_for_self"
>> +     );
>> +}
>> +
>> +static void sunxi_mcpm_setup_entry_point(void)
>> +{
>> +     __raw_writel(virt_to_phys(mcpm_entry_point),
>> +                  prcm_base + PRCM_CPU_SOFT_ENTRY_REG);
>> +}
>> +
>> +static int __init sunxi_mcpm_init(void)
>> +{
>> +     struct device_node *node;
>> +     int ret;
>> +
>> +     if (!of_machine_is_compatible("allwinner,sun9i-a80"))
>> +             return -ENODEV;
>> +
>> +     if (!cci_probed())
>> +             return -ENODEV;
>> +
>> +     node = of_find_compatible_node(NULL, NULL,
>> +                     "allwinner,sun9i-a80-cpucfg");
>> +     if (!node)
>> +             return -ENODEV;
>> +
>> +     cpucfg_base = of_iomap(node, 0);
>> +     of_node_put(node);
>> +     if (!cpucfg_base) {
>> +             pr_err("%s: failed to map CPUCFG registers\n", __func__);
>> +             return -ENOMEM;
>> +     }
>
> Can't we request the region as well?

Yes we can! But only for the CPUCFG registers. The PRCM block is
shared with all the PRCM block clock drivers. :(
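
To illustrate why only CPUCFG can be claimed, here is a toy userspace model
of exclusive region claims. It is not the kernel resource code; the
claim_region() helper, the table size and the addresses used with it are
made up for illustration. In the kernel, a helper such as
of_io_request_and_map() would presumably do the request and the mapping in
one step for CPUCFG.

```c
#include <stddef.h>

#define MAX_CLAIMS	8

struct claim {
	unsigned long start;
	unsigned long end;	/* inclusive */
	const char *owner;
};

static struct claim claims[MAX_CLAIMS];
static size_t nr_claims;

/*
 * Claim [start, start + len - 1] for owner.
 * Returns 0 on success, -1 if the range overlaps an existing claim,
 * len is zero, or the table is full.
 */
int claim_region(unsigned long start, unsigned long len, const char *owner)
{
	unsigned long end = start + len - 1;
	size_t i;

	if (len == 0 || nr_claims == MAX_CLAIMS)
		return -1;

	for (i = 0; i < nr_claims; i++)
		if (start <= claims[i].end && end >= claims[i].start)
			return -1;	/* already owned by someone else */

	claims[nr_claims].start = start;
	claims[nr_claims].end = end;
	claims[nr_claims].owner = owner;
	nr_claims++;

	return 0;
}
```

Once the clock drivers own the PRCM range, a second exclusive claim from
the MCPM code fails, so the PRCM block can only be mapped, not requested.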

>
>> +
>> +     node = of_find_compatible_node(NULL, NULL,
>> +                     "allwinner,sun9i-a80-prcm");
>> +     if (!node)
>> +             return -ENODEV;
>> +
>> +     prcm_base = of_iomap(node, 0);
>> +
>> +     of_node_put(node);
>> +     if (!prcm_base) {
>> +             pr_err("%s: failed to map PRCM registers\n", __func__);
>> +             iounmap(cpucfg_base);
>> +             return -ENOMEM;
>> +     }
>> +
>> +     ret = mcpm_platform_register(&sunxi_power_ops);
>> +     if (!ret)
>> +             ret = mcpm_sync_init(sunxi_power_up_setup);
>> +     if (!ret)
>> +             /* do not disable AXI master as no one will re-enable it */
>> +             ret = mcpm_loopback(sunxi_cluster_cache_disable_without_axi);
>> +     if (ret) {
>> +             iounmap(cpucfg_base);
>> +             iounmap(prcm_base);
>> +             return ret;
>> +     }
>> +
>> +     mcpm_smp_set_ops();
>> +
>> +     pr_info("sunxi MCPM support installed\n");
>> +
>> +     sunxi_mcpm_setup_entry_point();
>> +
>> +     return ret;
>> +}
>
> It looks mostly good, and I would replace the sunxi by sun9i, and call
> that file sun9i-mcpm.c

I was hoping to reuse the file for the A83T, so it was sunxi-mcpm.c
or just mcpm. Most of the stuff is similar, except the A83T has two
revisions and one of them has two gate/power bits swapped. :(
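
One way a shared file could cope with that, sketched in plain C under
assumptions: struct soc_variant, variant_bit() and variant_mask() are
hypothetical names, and the bit positions passed in are placeholders, not
the actual A83T bits.

```c
#include <stdbool.h>

/* Per-SoC (or per-revision) description of a swapped bit pair. */
struct soc_variant {
	bool swap;		/* true if bit_a and bit_b are exchanged */
	unsigned int bit_a;
	unsigned int bit_b;
};

/* Translate a documented bit position to the one this variant uses. */
unsigned int variant_bit(const struct soc_variant *v, unsigned int bit)
{
	if (v->swap) {
		if (bit == v->bit_a)
			return v->bit_b;
		if (bit == v->bit_b)
			return v->bit_a;
	}

	return bit;
}

/* Mask for a documented bit position on this variant. */
unsigned long variant_mask(const struct soc_variant *v, unsigned int bit)
{
	return 1UL << variant_bit(v, bit);
}
```

The power-up code would then build its masks through variant_mask() rather
than using BIT(n) directly, with the variant picked once at init time.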

ChenYu

>
> Thanks!
> Maxime
>
> --
> Maxime Ripard, Free Electrons
> Embedded Linux and Kernel engineering
> http://free-electrons.com

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/4] ARM: sun9i: Support SMP on A80 with Multi-Cluster Power Management (MCPM)
@ 2017-07-25  8:29       ` Chen-Yu Tsai
  0 siblings, 0 replies; 28+ messages in thread
From: Chen-Yu Tsai @ 2017-07-25  8:29 UTC (permalink / raw)
  To: linux-arm-kernel

         default ARCH_SUNXI
On Tue, Jul 25, 2017 at 3:47 PM, Maxime Ripard
<maxime.ripard@free-electrons.com> wrote:
> Hi Chen-Yu,
>
> On Tue, Jul 25, 2017 at 01:09:16PM +0800, Chen-Yu Tsai wrote:
>> The A80 is a big.LITTLE SoC with 1 cluster of 4 Cortex-A7s and
>> 1 cluster of 4 Cortex-A15s.
>>
>> This patch adds support to bring up the second cluster and thus all
>> cores using the common MCPM code. Core/cluster power down has not
>> been implemented, thus CPU hotplugging and big.LITTLE switcher is
>> not supported.
>>
>> Signed-off-by: Chen-Yu Tsai <wens@csie.org>
>> ---
>>  arch/arm/mach-sunxi/Kconfig  |  10 ++
>>  arch/arm/mach-sunxi/Makefile |   1 +
>>  arch/arm/mach-sunxi/mcpm.c   | 391 +++++++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 402 insertions(+)
>>  create mode 100644 arch/arm/mach-sunxi/mcpm.c
>>
>> diff --git a/arch/arm/mach-sunxi/Kconfig b/arch/arm/mach-sunxi/Kconfig
>> index 58153cdf025b..177380548d99 100644
>> --- a/arch/arm/mach-sunxi/Kconfig
>> +++ b/arch/arm/mach-sunxi/Kconfig
>> @@ -47,5 +47,15 @@ config MACH_SUN9I
>>       bool "Allwinner (sun9i) SoCs support"
>>       default ARCH_SUNXI
>>       select ARM_GIC
>> +     imply MCPM
>> +
>> +config SUN9I_A80_MCPM
>> +     bool "Allwinner A80 Multi-Cluster PM support"
>> +     depends on MCPM && MACH_SUN9I
>> +     default MACH_SUN9I
>> +     select ARM_CCI400_PORT_CTRL
>> +     help
>> +       This is needed to provide CPU and cluster power management
>> +       on Allwinner A80 implementing big.LITTLE.
>
> Do we really need an option for that? we don't provide the option to
> disable the CPU SMP operations for the rest of the SoCs.

It was an option as it also required MCPM and CCI400 support to be built.
We could hide it. Or, using mach-hisi as a reference, we could do:

config MACH_SUN9I
        default ARCH_SUNXI
        select ARM_GIC
        select MCPM if SMP
        select ARM_CCI400_PORT_CTRL if SMP

and in the Makefile:

obj-$(CONFIG_MCPM) += sun9i-mcpm.o

>
>>  endif
>> diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile
>> index 27b168f121a1..e8558912c714 100644
>> --- a/arch/arm/mach-sunxi/Makefile
>> +++ b/arch/arm/mach-sunxi/Makefile
>> @@ -1,2 +1,3 @@
>>  obj-$(CONFIG_ARCH_SUNXI) += sunxi.o
>>  obj-$(CONFIG_SMP) += platsmp.o
>> +obj-$(CONFIG_SUN9I_A80_MCPM) += mcpm.o
>> diff --git a/arch/arm/mach-sunxi/mcpm.c b/arch/arm/mach-sunxi/mcpm.c
>> new file mode 100644
>> index 000000000000..4b6e1d6ae379
>> --- /dev/null
>> +++ b/arch/arm/mach-sunxi/mcpm.c
>> @@ -0,0 +1,391 @@
>> +/*
>> + * Copyright (c) 2015 Chen-Yu Tsai
>> + *
>> + * Chen-Yu Tsai <wens@csie.org>
>> + *
>> + * arch/arm/mach-sunxi/mcpm.c
>> + *
>> + * Based on arch/arm/mach-exynos/mcpm-exynos.c and Allwinner code
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#include <linux/arm-cci.h>
>> +#include <linux/delay.h>
>> +#include <linux/io.h>
>> +#include <linux/of_address.h>
>> +
>> +#include <asm/cputype.h>
>> +#include <asm/cp15.h>
>> +#include <asm/mcpm.h>
>> +
>> +#define SUNXI_CPUS_PER_CLUSTER               4
>> +#define SUNXI_NR_CLUSTERS            2
>> +
>> +#define SUN9I_A80_A15_CLUSTER                1
>
> Don't we have a way to derive that from the DT ?

Indeed we can.

It would be slighty more complicated though:

node = of_cpu_device_node_get(cluster * SUNXI_CPUS_PER_CLUSTER + cpu);
if (of_device_is_compatible(node, "arm,cortex-a15")) {
        ...
}

>
>> +#define CPUCFG_CX_CTRL_REG0(c)               (0x10 * (c))
>> +#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(n)        BIT(n)
>> +#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL       0xf
>> +#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7        BIT(4)
>> +#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15       BIT(0)
>> +#define CPUCFG_CX_CTRL_REG1(c)               (0x10 * (c) + 0x4)
>> +#define CPUCFG_CX_CTRL_REG1_ACINACTM BIT(0)
>> +#define CPUCFG_CX_RST_CTRL(c)                (0x80 + 0x4 * (c))
>> +#define CPUCFG_CX_RST_CTRL_DBG_SOC_RST       BIT(24)
>> +#define CPUCFG_CX_RST_CTRL_ETM_RST(n)        BIT(20 + (n))
>> +#define CPUCFG_CX_RST_CTRL_ETM_RST_ALL       (0xf << 20)
>> +#define CPUCFG_CX_RST_CTRL_DBG_RST(n)        BIT(16 + (n))
>> +#define CPUCFG_CX_RST_CTRL_DBG_RST_ALL       (0xf << 16)
>> +#define CPUCFG_CX_RST_CTRL_H_RST     BIT(12)
>> +#define CPUCFG_CX_RST_CTRL_L2_RST    BIT(8)
>> +#define CPUCFG_CX_RST_CTRL_CX_RST(n) BIT(4 + (n))
>> +#define CPUCFG_CX_RST_CTRL_CORE_RST(n)       BIT(n)
>> +
>> +#define PRCM_CPU_PO_RST_CTRL(c)              (0x4 + 0x4 * (c))
>> +#define PRCM_CPU_PO_RST_CTRL_CORE(n) BIT(n)
>> +#define PRCM_CPU_PO_RST_CTRL_CORE_ALL        0xf
>> +#define PRCM_PWROFF_GATING_REG(c)    (0x100 + 0x4 * (c))
>> +#define PRCM_PWROFF_GATING_REG_CLUSTER       BIT(4)
>> +#define PRCM_PWROFF_GATING_REG_CORE(n)       BIT(n)
>> +#define PRCM_PWR_SWITCH_REG(c, cpu)  (0x140 + 0x10 * (c) + 0x4 * (cpu))
>> +#define PRCM_CPU_SOFT_ENTRY_REG              0x164
>> +
>> +static void __iomem *cpucfg_base;
>> +static void __iomem *prcm_base;
>> +
>> +static int sunxi_cpu_power_switch_set(unsigned int cpu, unsigned int cluster,
>> +                                   bool enable)
>> +{
>> +     u32 reg;
>> +
>> +     /* control sequence from Allwinner A80 user manual v1.2 PRCM section */
>> +     reg = readl(prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>> +     if (enable) {
>> +             if (reg == 0x00) {
>> +                     pr_debug("power clamp for cluster %u cpu %u already open\n",
>> +                              cluster, cpu);
>> +                     return 0;
>> +             }
>> +
>> +             writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>> +             udelay(10);
>> +             writel(0xfe, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>> +             udelay(10);
>> +             writel(0xf8, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>> +             udelay(10);
>> +             writel(0xf0, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>> +             udelay(10);
>> +             writel(0x00, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>> +             udelay(10);
>> +     } else {
>> +             writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>> +             udelay(10);
>> +     }
>> +
>> +     return 0;
>> +}
>> +
>> +static int sunxi_cpu_powerup(unsigned int cpu, unsigned int cluster)
>> +{
>> +     u32 reg;
>> +
>> +     pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
>> +     if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
>> +             return -EINVAL;
>> +
>> +     /* assert processor power-on reset */
>> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>> +     reg &= ~PRCM_CPU_PO_RST_CTRL_CORE(cpu);
>> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>> +
>> +     /* Cortex-A7: hold L1 reset disable signal low */
>> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
>> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
>> +             reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
>> +             reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(cpu);
>> +             writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
>> +     }
>> +
>> +     /* assert processor related resets */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
>> +
>> +     /*
>> +      * Allwinner code also asserts resets for NEON on A15. According
>> +      * to ARM manuals, asserting power-on reset is sufficient.
>> +      */
>> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
>> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
>> +             reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
>> +     }
>> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +
>> +     /* open power switch */
>> +     sunxi_cpu_power_switch_set(cpu, cluster, true);
>> +
>> +     /* clear processor power gate */
>> +     reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
>> +     reg &= ~PRCM_PWROFF_GATING_REG_CORE(cpu);
>> +     writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
>> +     udelay(20);
>> +
>> +     /* de-assert processor power-on reset */
>> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>> +     reg |= PRCM_CPU_PO_RST_CTRL_CORE(cpu);
>> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>> +
>> +     /* de-assert all processor resets */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +     reg |= CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
>> +     reg |= CPUCFG_CX_RST_CTRL_CORE_RST(cpu);
>> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
>> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
>> +             reg |= CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
>> +     } else {
>> +             reg |= CPUCFG_CX_RST_CTRL_CX_RST(cpu); /* NEON */
>> +     }
>> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +
>> +     return 0;
>> +}
>> +
>> +static int sunxi_cluster_powerup(unsigned int cluster)
>> +{
>> +     u32 reg;
>> +
>> +     pr_debug("%s: cluster %u\n", __func__, cluster);
>> +     if (cluster >= SUNXI_NR_CLUSTERS)
>> +             return -EINVAL;
>> +
>> +     /* assert ACINACTM */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>> +     reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
>> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>> +
>> +     /* assert cluster processor power-on resets */
>> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>> +     reg &= ~PRCM_CPU_PO_RST_CTRL_CORE_ALL;
>> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>> +
>> +     /* assert cluster resets */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
>> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST_ALL;
>> +     reg &= ~CPUCFG_CX_RST_CTRL_H_RST;
>> +     reg &= ~CPUCFG_CX_RST_CTRL_L2_RST;
>> +
>> +     /*
>> +      * Allwinner code also asserts resets for NEON on A15. According
>> +      * to ARM manuals, asserting power-on reset is sufficient.
>> +      */
>> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
>> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
>> +             reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST_ALL;
>> +     }
>> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +
>> +     /* hold L1/L2 reset disable signals low */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
>> +     if (of_machine_is_compatible("allwinner,sun9i-a80") &&
>> +                     cluster == SUN9I_A80_A15_CLUSTER) {
>> +             /* Cortex-A15: hold L2RSTDISABLE low */
>> +             reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15;
>> +     } else {
>> +             /* Cortex-A7: hold L1RSTDISABLE and L2RSTDISABLE low */
>> +             reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL;
>> +             reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7;
>> +     }
>> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
>> +
>> +     /* clear cluster power gate */
>> +     reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
>> +     reg &= ~PRCM_PWROFF_GATING_REG_CLUSTER;
>> +     writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
>> +     udelay(20);
>> +
>> +     /* de-assert cluster resets */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +     reg |= CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
>> +     reg |= CPUCFG_CX_RST_CTRL_H_RST;
>> +     reg |= CPUCFG_CX_RST_CTRL_L2_RST;
>> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>> +
>> +     /* de-assert ACINACTM */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>> +     reg &= ~CPUCFG_CX_CTRL_REG1_ACINACTM;
>> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>> +
>> +     return 0;
>> +}
>> +
>> +static void sunxi_cpu_cache_disable(void)
>> +{
>> +     /* Disable and flush the local CPU cache. */
>> +     v7_exit_coherency_flush(louis);
>> +}
>> +
>> +/*
>> + * This bit is shared between the initial mcpm_sync_init call to enable
>> + * CCI-400 and proper cluster cache disable before power down.
>> + */
>> +static void sunxi_cluster_cache_disable_without_axi(void)
>> +{
>> +     if (read_cpuid_part() == ARM_CPU_PART_CORTEX_A15) {
>> +             /*
>> +              * On the Cortex-A15 we need to disable
>> +              * L2 prefetching before flushing the cache.
>> +              */
>> +             asm volatile(
>> +             "mcr    p15, 1, %0, c15, c0, 3\n"
>> +             "isb\n"
>> +             "dsb"
>> +             : : "r" (0x400));
>> +     }
>> +
>> +     /* Flush all cache levels for this cluster. */
>> +     v7_exit_coherency_flush(all);
>> +
>> +     /*
>> +      * Disable cluster-level coherency by masking
>> +      * incoming snoops and DVM messages:
>> +      */
>> +     cci_disable_port_by_cpu(read_cpuid_mpidr());
>> +}
>> +
>> +static void sunxi_cluster_cache_disable(void)
>> +{
>> +     unsigned int cluster = MPIDR_AFFINITY_LEVEL(read_cpuid_mpidr(), 1);
>> +     u32 reg;
>> +
>> +     pr_info("%s: cluster %u\n", __func__, cluster);
>> +
>> +     sunxi_cluster_cache_disable_without_axi();
>> +
>> +     /* last man standing, assert ACINACTM */
>> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>> +     reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
>> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>> +}
>> +
>> +static const struct mcpm_platform_ops sunxi_power_ops = {
>> +     .cpu_powerup            = sunxi_cpu_powerup,
>> +     .cluster_powerup        = sunxi_cluster_powerup,
>> +     .cpu_cache_disable      = sunxi_cpu_cache_disable,
>> +     .cluster_cache_disable  = sunxi_cluster_cache_disable,
>> +};
>> +
>> +/*
>> + * Enable cluster-level coherency, in preparation for turning on the MMU.
>> + *
>> + * Also enable regional clock gating and L2 data latency settings for
>> + * Cortex-A15.
>> + */
>> +static void __naked sunxi_power_up_setup(unsigned int affinity_level)
>> +{
>> +     asm volatile (
>> +             "mrc    p15, 0, r1, c0, c0, 0\n"
>> +             "movw   r2, #" __stringify(ARM_CPU_PART_MASK & 0xffff) "\n"
>> +             "movt   r2, #" __stringify(ARM_CPU_PART_MASK >> 16) "\n"
>> +             "and    r1, r1, r2\n"
>> +             "movw   r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 & 0xffff) "\n"
>> +             "movt   r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 >> 16) "\n"
>> +             "cmp    r1, r2\n"
>> +             "bne    not_a15\n"
>> +
>> +             /* The following is Cortex-A15 specific */
>> +
>> +             /* L2CTRL: Enable CPU regional clock gates */
>> +             "mrc p15, 1, r1, c15, c0, 4\n"
>> +             "orr r1, r1, #(0x1<<31)\n"
>> +             "mcr p15, 1, r1, c15, c0, 4\n"
>> +
>> +             /* L2ACTLR */
>> +             "mrc p15, 1, r1, c15, c0, 0\n"
>> +             /* Enable L2, GIC, and Timer regional clock gates */
>> +             "orr r1, r1, #(0x1<<26)\n"
>> +             /* Disable clean/evict from being pushed to external */
>> +             "orr r1, r1, #(0x1<<3)\n"
>> +             "mcr p15, 1, r1, c15, c0, 0\n"
>> +
>> +             /* L2 data RAM latency */
>> +             "mrc p15, 1, r1, c9, c0, 2\n"
>> +             "bic r1, r1, #(0x7<<0)\n"
>> +             "orr r1, r1, #(0x3<<0)\n"
>> +             "mcr p15, 1, r1, c9, c0, 2\n"
>> +
>> +             /* End of Cortex-A15 specific setup */
>> +             "not_a15:\n"
>> +
>> +             "cmp    r0, #1\n"
>> +             "bxne   lr\n"
>> +             "b      cci_enable_port_for_self"
>> +     );
>> +}
>> +
>> +static void sunxi_mcpm_setup_entry_point(void)
>> +{
>> +     __raw_writel(virt_to_phys(mcpm_entry_point),
>> +                  prcm_base + PRCM_CPU_SOFT_ENTRY_REG);
>> +}
>> +
>> +static int __init sunxi_mcpm_init(void)
>> +{
>> +     struct device_node *node;
>> +     int ret;
>> +
>> +     if (!of_machine_is_compatible("allwinner,sun9i-a80"))
>> +             return -ENODEV;
>> +
>> +     if (!cci_probed())
>> +             return -ENODEV;
>> +
>> +     node = of_find_compatible_node(NULL, NULL,
>> +                     "allwinner,sun9i-a80-cpucfg");
>> +     if (!node)
>> +             return -ENODEV;
>> +
>> +     cpucfg_base = of_iomap(node, 0);
>> +     of_node_put(node);
>> +     if (!cpucfg_base) {
>> +             pr_err("%s: failed to map CPUCFG registers\n", __func__);
>> +             return -ENOMEM;
>> +     }
>
> Can't we request the region as well?

Yes we can! But only for the CPUCFG registers. The PRCM block is
shared with all the PRCM block clock drivers. :(

>
>> +
>> +     node = of_find_compatible_node(NULL, NULL,
>> +                     "allwinner,sun9i-a80-prcm");
>> +     if (!node)
>> +             return -ENODEV;
>> +
>> +     prcm_base = of_iomap(node, 0);
>> +
>> +     of_node_put(node);
>> +     if (!prcm_base) {
>> +             pr_err("%s: failed to map PRCM registers\n", __func__);
>> +             iounmap(prcm_base);
>> +             return -ENOMEM;
>> +     }
>> +
>> +     ret = mcpm_platform_register(&sunxi_power_ops);
>> +     if (!ret)
>> +             ret = mcpm_sync_init(sunxi_power_up_setup);
>> +     if (!ret)
>> +             /* do not disable AXI master as no one will re-enable it */
>> +             ret = mcpm_loopback(sunxi_cluster_cache_disable_without_axi);
>> +     if (ret) {
>> +             iounmap(cpucfg_base);
>> +             iounmap(prcm_base);
>> +             return ret;
>> +     }
>> +
>> +     mcpm_smp_set_ops();
>> +
>> +     pr_info("sunxi MCPM support installed\n");
>> +
>> +     sunxi_mcpm_setup_entry_point();
>> +
>> +     return ret;
>> +}
>
> It looks mostly good, and I would replace the sunxi by sun9i, and call
> that file sun9i-mcpm.c

I was hoping to reuse the file for the A83T, so it was sunxi-mcpm.c
or just mcpm. Most of the stuff is similar, except the A83T has two
revisions and one of them has two gate/power bits swapped. :(
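
A shared file would need some per-SoC description to pick the right bits. What follows is only a minimal userspace sketch of that idea, not code from the series: `struct mcpm_variant`, its fields and `gate_mask()` are hypothetical names, and the mail does not say which two bits are swapped, so the positions are left as parameters. The two gating macros are copied from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Copied from the patch: A80 layout of the power-off gating register. */
#define PRCM_PWROFF_GATING_REG_CLUSTER  BIT(4)
#define PRCM_PWROFF_GATING_REG_CORE(n)  BIT(n)

/* Hypothetical per-SoC description; nothing like this exists in the patch. */
struct mcpm_variant {
	bool swap_gate_bits;	/* assumption: set for the odd A83T revision */
	unsigned int swap_a;	/* hypothetical bit positions to exchange */
	unsigned int swap_b;
};

/* Exchange bits a and b of mask; all other bits are left untouched. */
static uint32_t swap_bits(uint32_t mask, unsigned int a, unsigned int b)
{
	uint32_t bit_a = (mask >> a) & 1u;
	uint32_t bit_b = (mask >> b) & 1u;

	if (bit_a != bit_b)
		mask ^= BIT(a) | BIT(b);
	return mask;
}

/* Translate an A80-style gating mask into the layout of the given variant. */
static uint32_t gate_mask(const struct mcpm_variant *v, uint32_t mask)
{
	return v->swap_gate_bits ? swap_bits(mask, v->swap_a, v->swap_b) : mask;
}
```

With a variant that sets `swap_gate_bits` and names bits 0 and 4 (again, purely illustrative positions), `gate_mask()` maps `PRCM_PWROFF_GATING_REG_CORE(0)` to `BIT(4)` and leaves masks that involve neither bit unchanged, so the power-up sequence itself could stay common to both SoCs.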

ChenYu

>
> Thanks!
> Maxime
>
> --
> Maxime Ripard, Free Electrons
> Embedded Linux and Kernel engineering
> http://free-electrons.com

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/4] ARM: sun9i: Support SMP on A80 with Multi-Cluster Power Management (MCPM)
  2017-07-25  8:29       ` Chen-Yu Tsai
@ 2017-07-25 12:13         ` icenowy at aosc.io
  -1 siblings, 0 replies; 28+ messages in thread
From: icenowy @ 2017-07-25 12:13 UTC (permalink / raw)
  To: Chen-Yu Tsai
  Cc: Maxime Ripard, Nicolas Pitre, devicetree, linux-kernel,
	linux-sunxi, Russell King, Dave Martin, linux-arm-kernel

On 2017-07-25 16:29, Chen-Yu Tsai wrote:
> default ARCH_SUNXI
> On Tue, Jul 25, 2017 at 3:47 PM, Maxime Ripard
> <maxime.ripard@free-electrons.com> wrote:
>> Hi Chen-Yu,
>> 
>> On Tue, Jul 25, 2017 at 01:09:16PM +0800, Chen-Yu Tsai wrote:
>>> The A80 is a big.LITTLE SoC with 1 cluster of 4 Cortex-A7s and
>>> 1 cluster of 4 Cortex-A15s.
>>> 
>>> This patch adds support to bring up the second cluster and thus all
>>> cores using the common MCPM code. Core/cluster power down has not
>>> been implemented, thus CPU hotplugging and the big.LITTLE switcher are
>>> not supported.
>>> 
>>> Signed-off-by: Chen-Yu Tsai <wens@csie.org>
>>> ---
>>>  arch/arm/mach-sunxi/Kconfig  |  10 ++
>>>  arch/arm/mach-sunxi/Makefile |   1 +
>>>  arch/arm/mach-sunxi/mcpm.c   | 391 +++++++++++++++++++++++++++++++++++++++++++
>>>  3 files changed, 402 insertions(+)
>>>  create mode 100644 arch/arm/mach-sunxi/mcpm.c
>>> 
>>> diff --git a/arch/arm/mach-sunxi/Kconfig b/arch/arm/mach-sunxi/Kconfig
>>> index 58153cdf025b..177380548d99 100644
>>> --- a/arch/arm/mach-sunxi/Kconfig
>>> +++ b/arch/arm/mach-sunxi/Kconfig
>>> @@ -47,5 +47,15 @@ config MACH_SUN9I
>>>       bool "Allwinner (sun9i) SoCs support"
>>>       default ARCH_SUNXI
>>>       select ARM_GIC
>>> +     imply MCPM
>>> +
>>> +config SUN9I_A80_MCPM
>>> +     bool "Allwinner A80 Multi-Cluster PM support"
>>> +     depends on MCPM && MACH_SUN9I
>>> +     default MACH_SUN9I
>>> +     select ARM_CCI400_PORT_CTRL
>>> +     help
>>> +       This is needed to provide CPU and cluster power management
>>> +       on Allwinner A80 implementing big.LITTLE.
>> 
>> Do we really need an option for that? we don't provide the option to
>> disable the CPU SMP operations for the rest of the SoCs.
> 
> It was an option as it also required MCPM and CCI400 support to be built.
> We could hide it. Or, using mach-hisi as a reference, we could do:

I think a hidden config option is a proper way, as we can then select
this config option in MACH_SUN8I when introducing A83T support.
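
For illustration, a hidden option along those lines might look roughly like the Kconfig fragment below. This is a sketch under assumptions, not part of the series: `SUNXI_MCPM` is a hypothetical symbol name, and the `MACH_SUN9I` entry is the one from the patch with its `imply MCPM` line replaced.

```
# Hidden helper symbol: it has no prompt string, so it is never shown
# in menuconfig. SUNXI_MCPM is a hypothetical name, not from the patch.
config SUNXI_MCPM
	bool
	depends on SMP
	select MCPM
	select ARM_CCI400_PORT_CTRL

config MACH_SUN9I
	bool "Allwinner (sun9i) SoCs support"
	default ARCH_SUNXI
	select ARM_GIC
	select SUNXI_MCPM if SMP
```

The Makefile rule would then key off the hidden symbol, for example `obj-$(CONFIG_SUNXI_MCPM) += mcpm.o`, and a later A83T patch could add the same `select SUNXI_MCPM if SMP` line to `MACH_SUN8I`.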

> 
> config MACH_SUN9I
>         default ARCH_SUNXI
>         select ARM_GIC
>         select MCPM if SMP
>         select ARM_CCI400_PORT_CTRL if SMP
> 
> and in the Makefile:
> 
> obj-$(CONFIG_MCPM) += sun9i-mcpm.o
> 
>> 
>>>  endif
>>> diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile
>>> index 27b168f121a1..e8558912c714 100644
>>> --- a/arch/arm/mach-sunxi/Makefile
>>> +++ b/arch/arm/mach-sunxi/Makefile
>>> @@ -1,2 +1,3 @@
>>>  obj-$(CONFIG_ARCH_SUNXI) += sunxi.o
>>>  obj-$(CONFIG_SMP) += platsmp.o
>>> +obj-$(CONFIG_SUN9I_A80_MCPM) += mcpm.o
>>> diff --git a/arch/arm/mach-sunxi/mcpm.c b/arch/arm/mach-sunxi/mcpm.c
>>> new file mode 100644
>>> index 000000000000..4b6e1d6ae379
>>> --- /dev/null
>>> +++ b/arch/arm/mach-sunxi/mcpm.c
>>> @@ -0,0 +1,391 @@
>>> +/*
>>> + * Copyright (c) 2015 Chen-Yu Tsai
>>> + *
>>> + * Chen-Yu Tsai <wens@csie.org>
>>> + *
>>> + * arch/arm/mach-sunxi/mcpm.c
>>> + *
>>> + * Based on arch/arm/mach-exynos/mcpm-exynos.c and Allwinner code
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> + * it under the terms of the GNU General Public License version 2 as
>>> + * published by the Free Software Foundation.
>>> + */
>>> +
>>> +#include <linux/arm-cci.h>
>>> +#include <linux/delay.h>
>>> +#include <linux/io.h>
>>> +#include <linux/of_address.h>
>>> +
>>> +#include <asm/cputype.h>
>>> +#include <asm/cp15.h>
>>> +#include <asm/mcpm.h>
>>> +
>>> +#define SUNXI_CPUS_PER_CLUSTER               4
>>> +#define SUNXI_NR_CLUSTERS            2
>>> +
>>> +#define SUN9I_A80_A15_CLUSTER                1
>> 
>> Don't we have a way to derive that from the DT ?
> 
> Indeed we can.
> 
> It would be slightly more complicated though:
> 
> node = of_cpu_device_node_get(cluster * SUNXI_CPUS_PER_CLUSTER + cpu);
> if (of_device_is_compatible(node, "arm,cortex-a15")) {
>         ...
> }
> 
>> 
>>> +#define CPUCFG_CX_CTRL_REG0(c)               (0x10 * (c))
>>> +#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(n)        BIT(n)
>>> +#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL       0xf
>>> +#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7        BIT(4)
>>> +#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15       BIT(0)
>>> +#define CPUCFG_CX_CTRL_REG1(c)               (0x10 * (c) + 0x4)
>>> +#define CPUCFG_CX_CTRL_REG1_ACINACTM BIT(0)
>>> +#define CPUCFG_CX_RST_CTRL(c)                (0x80 + 0x4 * (c))
>>> +#define CPUCFG_CX_RST_CTRL_DBG_SOC_RST       BIT(24)
>>> +#define CPUCFG_CX_RST_CTRL_ETM_RST(n)        BIT(20 + (n))
>>> +#define CPUCFG_CX_RST_CTRL_ETM_RST_ALL       (0xf << 20)
>>> +#define CPUCFG_CX_RST_CTRL_DBG_RST(n)        BIT(16 + (n))
>>> +#define CPUCFG_CX_RST_CTRL_DBG_RST_ALL       (0xf << 16)
>>> +#define CPUCFG_CX_RST_CTRL_H_RST     BIT(12)
>>> +#define CPUCFG_CX_RST_CTRL_L2_RST    BIT(8)
>>> +#define CPUCFG_CX_RST_CTRL_CX_RST(n) BIT(4 + (n))
>>> +#define CPUCFG_CX_RST_CTRL_CORE_RST(n)       BIT(n)
>>> +
>>> +#define PRCM_CPU_PO_RST_CTRL(c)              (0x4 + 0x4 * (c))
>>> +#define PRCM_CPU_PO_RST_CTRL_CORE(n) BIT(n)
>>> +#define PRCM_CPU_PO_RST_CTRL_CORE_ALL        0xf
>>> +#define PRCM_PWROFF_GATING_REG(c)    (0x100 + 0x4 * (c))
>>> +#define PRCM_PWROFF_GATING_REG_CLUSTER       BIT(4)
>>> +#define PRCM_PWROFF_GATING_REG_CORE(n)       BIT(n)
>>> +#define PRCM_PWR_SWITCH_REG(c, cpu)  (0x140 + 0x10 * (c) + 0x4 * (cpu))
>>> +#define PRCM_CPU_SOFT_ENTRY_REG              0x164
>>> +
>>> +static void __iomem *cpucfg_base;
>>> +static void __iomem *prcm_base;
>>> +
>>> +static int sunxi_cpu_power_switch_set(unsigned int cpu, unsigned int cluster,
>>> +                                   bool enable)
>>> +{
>>> +     u32 reg;
>>> +
>>> +     /* control sequence from Allwinner A80 user manual v1.2 PRCM section */
>>> +     reg = readl(prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>>> +     if (enable) {
>>> +             if (reg == 0x00) {
>>> +                     pr_debug("power clamp for cluster %u cpu %u already open\n",
>>> +                              cluster, cpu);
>>> +                     return 0;
>>> +             }
>>> +
>>> +             writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>>> +             udelay(10);
>>> +             writel(0xfe, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>>> +             udelay(10);
>>> +             writel(0xf8, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>>> +             udelay(10);
>>> +             writel(0xf0, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>>> +             udelay(10);
>>> +             writel(0x00, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>>> +             udelay(10);
>>> +     } else {
>>> +             writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>>> +             udelay(10);
>>> +     }
>>> +
>>> +     return 0;
>>> +}
>>> +
>>> +static int sunxi_cpu_powerup(unsigned int cpu, unsigned int cluster)
>>> +{
>>> +     u32 reg;
>>> +
>>> +     pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
>>> +     if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
>>> +             return -EINVAL;
>>> +
>>> +     /* assert processor power-on reset */
>>> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>>> +     reg &= ~PRCM_CPU_PO_RST_CTRL_CORE(cpu);
>>> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>>> +
>>> +     /* Cortex-A7: hold L1 reset disable signal low */
>>> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
>>> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
>>> +             reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
>>> +             reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(cpu);
>>> +             writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
>>> +     }
>>> +
>>> +     /* assert processor related resets */
>>> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>>> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
>>> +
>>> +     /*
>>> +      * Allwinner code also asserts resets for NEON on A15. According
>>> +      * to ARM manuals, asserting power-on reset is sufficient.
>>> +      */
>>> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
>>> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
>>> +             reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
>>> +     }
>>> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>>> +
>>> +     /* open power switch */
>>> +     sunxi_cpu_power_switch_set(cpu, cluster, true);
>>> +
>>> +     /* clear processor power gate */
>>> +     reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
>>> +     reg &= ~PRCM_PWROFF_GATING_REG_CORE(cpu);
>>> +     writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
>>> +     udelay(20);
>>> +
>>> +     /* de-assert processor power-on reset */
>>> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>>> +     reg |= PRCM_CPU_PO_RST_CTRL_CORE(cpu);
>>> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>>> +
>>> +     /* de-assert all processor resets */
>>> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>>> +     reg |= CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
>>> +     reg |= CPUCFG_CX_RST_CTRL_CORE_RST(cpu);
>>> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
>>> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
>>> +             reg |= CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
>>> +     } else {
>>> +             reg |= CPUCFG_CX_RST_CTRL_CX_RST(cpu); /* NEON */
>>> +     }
>>> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>>> +
>>> +     return 0;
>>> +}
>>> +
>>> +static int sunxi_cluster_powerup(unsigned int cluster)
>>> +{
>>> +     u32 reg;
>>> +
>>> +     pr_debug("%s: cluster %u\n", __func__, cluster);
>>> +     if (cluster >= SUNXI_NR_CLUSTERS)
>>> +             return -EINVAL;
>>> +
>>> +     /* assert ACINACTM */
>>> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>>> +     reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
>>> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>>> +
>>> +     /* assert cluster processor power-on resets */
>>> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>>> +     reg &= ~PRCM_CPU_PO_RST_CTRL_CORE_ALL;
>>> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>>> +
>>> +     /* assert cluster resets */
>>> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>>> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
>>> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST_ALL;
>>> +     reg &= ~CPUCFG_CX_RST_CTRL_H_RST;
>>> +     reg &= ~CPUCFG_CX_RST_CTRL_L2_RST;
>>> +
>>> +     /*
>>> +      * Allwinner code also asserts resets for NEON on A15. According
>>> +      * to ARM manuals, asserting power-on reset is sufficient.
>>> +      */
>>> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
>>> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
>>> +             reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST_ALL;
>>> +     }
>>> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>>> +
>>> +     /* hold L1/L2 reset disable signals low */
>>> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
>>> +     if (of_machine_is_compatible("allwinner,sun9i-a80") &&
>>> +                     cluster == SUN9I_A80_A15_CLUSTER) {
>>> +             /* Cortex-A15: hold L2RSTDISABLE low */
>>> +             reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15;
>>> +     } else {
>>> +             /* Cortex-A7: hold L1RSTDISABLE and L2RSTDISABLE low */
>>> +             reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL;
>>> +             reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7;
>>> +     }
>>> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
>>> +
>>> +     /* clear cluster power gate */
>>> +     reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
>>> +     reg &= ~PRCM_PWROFF_GATING_REG_CLUSTER;
>>> +     writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
>>> +     udelay(20);
>>> +
>>> +     /* de-assert cluster resets */
>>> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>>> +     reg |= CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
>>> +     reg |= CPUCFG_CX_RST_CTRL_H_RST;
>>> +     reg |= CPUCFG_CX_RST_CTRL_L2_RST;
>>> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>>> +
>>> +     /* de-assert ACINACTM */
>>> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>>> +     reg &= ~CPUCFG_CX_CTRL_REG1_ACINACTM;
>>> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>>> +
>>> +     return 0;
>>> +}
>>> +
>>> +static void sunxi_cpu_cache_disable(void)
>>> +{
>>> +     /* Disable and flush the local CPU cache. */
>>> +     v7_exit_coherency_flush(louis);
>>> +}
>>> +
>>> +/*
>>> + * This bit is shared between the initial mcpm_sync_init call to enable
>>> + * CCI-400 and proper cluster cache disable before power down.
>>> + */
>>> +static void sunxi_cluster_cache_disable_without_axi(void)
>>> +{
>>> +     if (read_cpuid_part() == ARM_CPU_PART_CORTEX_A15) {
>>> +             /*
>>> +              * On the Cortex-A15 we need to disable
>>> +              * L2 prefetching before flushing the cache.
>>> +              */
>>> +             asm volatile(
>>> +             "mcr    p15, 1, %0, c15, c0, 3\n"
>>> +             "isb\n"
>>> +             "dsb"
>>> +             : : "r" (0x400));
>>> +     }
>>> +
>>> +     /* Flush all cache levels for this cluster. */
>>> +     v7_exit_coherency_flush(all);
>>> +
>>> +     /*
>>> +      * Disable cluster-level coherency by masking
>>> +      * incoming snoops and DVM messages:
>>> +      */
>>> +     cci_disable_port_by_cpu(read_cpuid_mpidr());
>>> +}
>>> +
>>> +static void sunxi_cluster_cache_disable(void)
>>> +{
>>> +     unsigned int cluster = MPIDR_AFFINITY_LEVEL(read_cpuid_mpidr(), 1);
>>> +     u32 reg;
>>> +
>>> +     pr_info("%s: cluster %u\n", __func__, cluster);
>>> +
>>> +     sunxi_cluster_cache_disable_without_axi();
>>> +
>>> +     /* last man standing, assert ACINACTM */
>>> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>>> +     reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
>>> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>>> +}
>>> +
>>> +static const struct mcpm_platform_ops sunxi_power_ops = {
>>> +     .cpu_powerup            = sunxi_cpu_powerup,
>>> +     .cluster_powerup        = sunxi_cluster_powerup,
>>> +     .cpu_cache_disable      = sunxi_cpu_cache_disable,
>>> +     .cluster_cache_disable  = sunxi_cluster_cache_disable,
>>> +};
>>> +
>>> +/*
>>> + * Enable cluster-level coherency, in preparation for turning on the MMU.
>>> + *
>>> + * Also enable regional clock gating and L2 data latency settings for
>>> + * Cortex-A15.
>>> + */
>>> +static void __naked sunxi_power_up_setup(unsigned int affinity_level)
>>> +{
>>> +     asm volatile (
>>> +             "mrc    p15, 0, r1, c0, c0, 0\n"
>>> +             "movw   r2, #" __stringify(ARM_CPU_PART_MASK & 0xffff) "\n"
>>> +             "movt   r2, #" __stringify(ARM_CPU_PART_MASK >> 16) "\n"
>>> +             "and    r1, r1, r2\n"
>>> +             "movw   r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 & 0xffff) "\n"
>>> +             "movt   r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 >> 16) "\n"
>>> +             "cmp    r1, r2\n"
>>> +             "bne    not_a15\n"
>>> +
>>> +             /* The following is Cortex-A15 specific */
>>> +
>>> +             /* L2CTRL: Enable CPU regional clock gates */
>>> +             "mrc p15, 1, r1, c15, c0, 4\n"
>>> +             "orr r1, r1, #(0x1<<31)\n"
>>> +             "mcr p15, 1, r1, c15, c0, 4\n"
>>> +
>>> +             /* L2ACTLR */
>>> +             "mrc p15, 1, r1, c15, c0, 0\n"
>>> +             /* Enable L2, GIC, and Timer regional clock gates */
>>> +             "orr r1, r1, #(0x1<<26)\n"
>>> +             /* Disable clean/evict from being pushed to external */
>>> +             "orr r1, r1, #(0x1<<3)\n"
>>> +             "mcr p15, 1, r1, c15, c0, 0\n"
>>> +
>>> +             /* L2 data RAM latency */
>>> +             "mrc p15, 1, r1, c9, c0, 2\n"
>>> +             "bic r1, r1, #(0x7<<0)\n"
>>> +             "orr r1, r1, #(0x3<<0)\n"
>>> +             "mcr p15, 1, r1, c9, c0, 2\n"
>>> +
>>> +             /* End of Cortex-A15 specific setup */
>>> +             "not_a15:\n"
>>> +
>>> +             "cmp    r0, #1\n"
>>> +             "bxne   lr\n"
>>> +             "b      cci_enable_port_for_self"
>>> +     );
>>> +}
>>> +
>>> +static void sunxi_mcpm_setup_entry_point(void)
>>> +{
>>> +     __raw_writel(virt_to_phys(mcpm_entry_point),
>>> +                  prcm_base + PRCM_CPU_SOFT_ENTRY_REG);
>>> +}
>>> +
>>> +static int __init sunxi_mcpm_init(void)
>>> +{
>>> +     struct device_node *node;
>>> +     int ret;
>>> +
>>> +     if (!of_machine_is_compatible("allwinner,sun9i-a80"))
>>> +             return -ENODEV;
>>> +
>>> +     if (!cci_probed())
>>> +             return -ENODEV;
>>> +
>>> +     node = of_find_compatible_node(NULL, NULL,
>>> +                     "allwinner,sun9i-a80-cpucfg");
>>> +     if (!node)
>>> +             return -ENODEV;
>>> +
>>> +     cpucfg_base = of_iomap(node, 0);
>>> +     of_node_put(node);
>>> +     if (!cpucfg_base) {
>>> +             pr_err("%s: failed to map CPUCFG registers\n", __func__);
>>> +             return -ENOMEM;
>>> +     }
>> 
>> Can't we request the region as well?
> 
> Yes we can! But only for the CPUCFG registers. The PRCM block is
> shared with all the PRCM block clock drivers. :(
> 
>> 
>>> +
>>> +     node = of_find_compatible_node(NULL, NULL,
>>> +                     "allwinner,sun9i-a80-prcm");
>>> +     if (!node)
>>> +             return -ENODEV;
>>> +
>>> +     prcm_base = of_iomap(node, 0);
>>> +
>>> +     of_node_put(node);
>>> +     if (!prcm_base) {
>>> +             pr_err("%s: failed to map PRCM registers\n", __func__);
>>> +             iounmap(cpucfg_base);
>>> +             return -ENOMEM;
>>> +     }
>>> +
>>> +     ret = mcpm_platform_register(&sunxi_power_ops);
>>> +     if (!ret)
>>> +             ret = mcpm_sync_init(sunxi_power_up_setup);
>>> +     if (!ret)
>>> +             /* do not disable AXI master as no one will re-enable it */
>>> +             ret = mcpm_loopback(sunxi_cluster_cache_disable_without_axi);
>>> +     if (ret) {
>>> +             iounmap(cpucfg_base);
>>> +             iounmap(prcm_base);
>>> +             return ret;
>>> +     }
>>> +
>>> +     mcpm_smp_set_ops();
>>> +
>>> +     pr_info("sunxi MCPM support installed\n");
>>> +
>>> +     sunxi_mcpm_setup_entry_point();
>>> +
>>> +     return ret;
>>> +}
>> 
>> It looks mostly good, and I would replace the sunxi by sun9i, and call
>> that file sun9i-mcpm.c
> 
> I was hoping to reuse the file for the A83T, so it was sunxi-mcpm.c
> or just mcpm. Most of the stuff is similar, except the A83T has two
> revisions and one of them has two gate/power bits swapped. :(
> 
> ChenYu
> 
>> 
>> Thanks!
>> Maxime
>> 
>> --
>> Maxime Ripard, Free Electrons
>> Embedded Linux and Kernel engineering
>> http://free-electrons.com
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/4] ARM: sun9i: Support SMP on A80 with Multi-Cluster Power Management (MCPM)
@ 2017-07-25 12:13         ` icenowy at aosc.io
  0 siblings, 0 replies; 28+ messages in thread
From: icenowy at aosc.io @ 2017-07-25 12:13 UTC (permalink / raw)
  To: linux-arm-kernel

On 2017-07-25 16:29, Chen-Yu Tsai wrote:
> default ARCH_SUNXI
> On Tue, Jul 25, 2017 at 3:47 PM, Maxime Ripard
> <maxime.ripard@free-electrons.com> wrote:
>> Hi Chen-Yu,
>> 
>> On Tue, Jul 25, 2017 at 01:09:16PM +0800, Chen-Yu Tsai wrote:
>>> The A80 is a big.LITTLE SoC with 1 cluster of 4 Cortex-A7s and
>>> 1 cluster of 4 Cortex-A15s.
>>> 
>>> This patch adds support to bring up the second cluster and thus all
>>> cores using the common MCPM code. Core/cluster power down has not
>>> been implemented, thus CPU hotplugging and the big.LITTLE switcher are
>>> not supported.
>>> 
>>> Signed-off-by: Chen-Yu Tsai <wens@csie.org>
>>> ---
>>>  arch/arm/mach-sunxi/Kconfig  |  10 ++
>>>  arch/arm/mach-sunxi/Makefile |   1 +
>>>  arch/arm/mach-sunxi/mcpm.c   | 391 +++++++++++++++++++++++++++++++++++++++++++
>>>  3 files changed, 402 insertions(+)
>>>  create mode 100644 arch/arm/mach-sunxi/mcpm.c
>>> 
>>> diff --git a/arch/arm/mach-sunxi/Kconfig b/arch/arm/mach-sunxi/Kconfig
>>> index 58153cdf025b..177380548d99 100644
>>> --- a/arch/arm/mach-sunxi/Kconfig
>>> +++ b/arch/arm/mach-sunxi/Kconfig
>>> @@ -47,5 +47,15 @@ config MACH_SUN9I
>>>       bool "Allwinner (sun9i) SoCs support"
>>>       default ARCH_SUNXI
>>>       select ARM_GIC
>>> +     imply MCPM
>>> +
>>> +config SUN9I_A80_MCPM
>>> +     bool "Allwinner A80 Multi-Cluster PM support"
>>> +     depends on MCPM && MACH_SUN9I
>>> +     default MACH_SUN9I
>>> +     select ARM_CCI400_PORT_CTRL
>>> +     help
>>> +       This is needed to provide CPU and cluster power management
>>> +       on Allwinner A80 implementing big.LITTLE.
>> 
>> Do we really need an option for that? we don't provide the option to
>> disable the CPU SMP operations for the rest of the SoCs.
> 
> It was an option as it also required MCPM and CCI400 support to be built.
> We could hide it. Or, using mach-hisi as a reference, we could do:

I think a hidden config option is a proper way, as we can then select
this config option in MACH_SUN8I when introducing A83T support.

> 
> config MACH_SUN9I
>         default ARCH_SUNXI
>         select ARM_GIC
>         select MCPM if SMP
>         select ARM_CCI400_PORT_CTRL if SMP
> 
> and in the Makefile:
> 
> obj-$(CONFIG_MCPM) += sun9i-mcpm.o
> 
>> 
>>>  endif
>>> diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile
>>> index 27b168f121a1..e8558912c714 100644
>>> --- a/arch/arm/mach-sunxi/Makefile
>>> +++ b/arch/arm/mach-sunxi/Makefile
>>> @@ -1,2 +1,3 @@
>>>  obj-$(CONFIG_ARCH_SUNXI) += sunxi.o
>>>  obj-$(CONFIG_SMP) += platsmp.o
>>> +obj-$(CONFIG_SUN9I_A80_MCPM) += mcpm.o
>>> diff --git a/arch/arm/mach-sunxi/mcpm.c b/arch/arm/mach-sunxi/mcpm.c
>>> new file mode 100644
>>> index 000000000000..4b6e1d6ae379
>>> --- /dev/null
>>> +++ b/arch/arm/mach-sunxi/mcpm.c
>>> @@ -0,0 +1,391 @@
>>> +/*
>>> + * Copyright (c) 2015 Chen-Yu Tsai
>>> + *
>>> + * Chen-Yu Tsai <wens@csie.org>
>>> + *
>>> + * arch/arm/mach-sunxi/mcpm.c
>>> + *
>>> + * Based on arch/arm/mach-exynos/mcpm-exynos.c and Allwinner code
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> + * it under the terms of the GNU General Public License version 2 as
>>> + * published by the Free Software Foundation.
>>> + */
>>> +
>>> +#include <linux/arm-cci.h>
>>> +#include <linux/delay.h>
>>> +#include <linux/io.h>
>>> +#include <linux/of_address.h>
>>> +
>>> +#include <asm/cputype.h>
>>> +#include <asm/cp15.h>
>>> +#include <asm/mcpm.h>
>>> +
>>> +#define SUNXI_CPUS_PER_CLUSTER               4
>>> +#define SUNXI_NR_CLUSTERS            2
>>> +
>>> +#define SUN9I_A80_A15_CLUSTER                1
>> 
>> Don't we have a way to derive that from the DT ?
> 
> Indeed we can.
> 
> It would be slightly more complicated though:
> 
> node = of_cpu_device_node_get(cluster * SUNXI_CPUS_PER_CLUSTER + cpu);
> if (of_device_is_compatible(node, "arm,cortex-a15")) {
>         ...
> }
> 
>> 
>>> +#define CPUCFG_CX_CTRL_REG0(c)               (0x10 * (c))
>>> +#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(n)        BIT(n)
>>> +#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL       0xf
>>> +#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7        BIT(4)
>>> +#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15       BIT(0)
>>> +#define CPUCFG_CX_CTRL_REG1(c)               (0x10 * (c) + 0x4)
>>> +#define CPUCFG_CX_CTRL_REG1_ACINACTM BIT(0)
>>> +#define CPUCFG_CX_RST_CTRL(c)                (0x80 + 0x4 * (c))
>>> +#define CPUCFG_CX_RST_CTRL_DBG_SOC_RST       BIT(24)
>>> +#define CPUCFG_CX_RST_CTRL_ETM_RST(n)        BIT(20 + (n))
>>> +#define CPUCFG_CX_RST_CTRL_ETM_RST_ALL       (0xf << 20)
>>> +#define CPUCFG_CX_RST_CTRL_DBG_RST(n)        BIT(16 + (n))
>>> +#define CPUCFG_CX_RST_CTRL_DBG_RST_ALL       (0xf << 16)
>>> +#define CPUCFG_CX_RST_CTRL_H_RST     BIT(12)
>>> +#define CPUCFG_CX_RST_CTRL_L2_RST    BIT(8)
>>> +#define CPUCFG_CX_RST_CTRL_CX_RST(n) BIT(4 + (n))
>>> +#define CPUCFG_CX_RST_CTRL_CORE_RST(n)       BIT(n)
>>> +
>>> +#define PRCM_CPU_PO_RST_CTRL(c)              (0x4 + 0x4 * (c))
>>> +#define PRCM_CPU_PO_RST_CTRL_CORE(n) BIT(n)
>>> +#define PRCM_CPU_PO_RST_CTRL_CORE_ALL        0xf
>>> +#define PRCM_PWROFF_GATING_REG(c)    (0x100 + 0x4 * (c))
>>> +#define PRCM_PWROFF_GATING_REG_CLUSTER       BIT(4)
>>> +#define PRCM_PWROFF_GATING_REG_CORE(n)       BIT(n)
>>> +#define PRCM_PWR_SWITCH_REG(c, cpu)  (0x140 + 0x10 * (c) + 0x4 * (cpu))
>>> +#define PRCM_CPU_SOFT_ENTRY_REG              0x164
>>> +
>>> +static void __iomem *cpucfg_base;
>>> +static void __iomem *prcm_base;
>>> +
>>> +static int sunxi_cpu_power_switch_set(unsigned int cpu, unsigned int cluster,
>>> +                                   bool enable)
>>> +{
>>> +     u32 reg;
>>> +
>>> +     /* control sequence from Allwinner A80 user manual v1.2 PRCM section */
>>> +     reg = readl(prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>>> +     if (enable) {
>>> +             if (reg == 0x00) {
>>> +                     pr_debug("power clamp for cluster %u cpu %u already open\n",
>>> +                              cluster, cpu);
>>> +                     return 0;
>>> +             }
>>> +
>>> +             writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>>> +             udelay(10);
>>> +             writel(0xfe, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>>> +             udelay(10);
>>> +             writel(0xf8, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>>> +             udelay(10);
>>> +             writel(0xf0, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>>> +             udelay(10);
>>> +             writel(0x00, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>>> +             udelay(10);
>>> +     } else {
>>> +             writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
>>> +             udelay(10);
>>> +     }
>>> +
>>> +     return 0;
>>> +}
>>> +
>>> +static int sunxi_cpu_powerup(unsigned int cpu, unsigned int cluster)
>>> +{
>>> +     u32 reg;
>>> +
>>> +     pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
>>> +     if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
>>> +             return -EINVAL;
>>> +
>>> +     /* assert processor power-on reset */
>>> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>>> +     reg &= ~PRCM_CPU_PO_RST_CTRL_CORE(cpu);
>>> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>>> +
>>> +     /* Cortex-A7: hold L1 reset disable signal low */
>>> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
>>> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
>>> +             reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
>>> +             reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(cpu);
>>> +             writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
>>> +     }
>>> +
>>> +     /* assert processor related resets */
>>> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>>> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
>>> +
>>> +     /*
>>> +      * Allwinner code also asserts resets for NEON on A15. According
>>> +      * to ARM manuals, asserting power-on reset is sufficient.
>>> +      */
>>> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
>>> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
>>> +             reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
>>> +     }
>>> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>>> +
>>> +     /* open power switch */
>>> +     sunxi_cpu_power_switch_set(cpu, cluster, true);
>>> +
>>> +     /* clear processor power gate */
>>> +     reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
>>> +     reg &= ~PRCM_PWROFF_GATING_REG_CORE(cpu);
>>> +     writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
>>> +     udelay(20);
>>> +
>>> +     /* de-assert processor power-on reset */
>>> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>>> +     reg |= PRCM_CPU_PO_RST_CTRL_CORE(cpu);
>>> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>>> +
>>> +     /* de-assert all processor resets */
>>> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>>> +     reg |= CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
>>> +     reg |= CPUCFG_CX_RST_CTRL_CORE_RST(cpu);
>>> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
>>> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
>>> +             reg |= CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
>>> +     } else {
>>> +             reg |= CPUCFG_CX_RST_CTRL_CX_RST(cpu); /* NEON */
>>> +     }
>>> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>>> +
>>> +     return 0;
>>> +}
>>> +
>>> +static int sunxi_cluster_powerup(unsigned int cluster)
>>> +{
>>> +     u32 reg;
>>> +
>>> +     pr_debug("%s: cluster %u\n", __func__, cluster);
>>> +     if (cluster >= SUNXI_NR_CLUSTERS)
>>> +             return -EINVAL;
>>> +
>>> +     /* assert ACINACTM */
>>> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>>> +     reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
>>> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>>> +
>>> +     /* assert cluster processor power-on resets */
>>> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>>> +     reg &= ~PRCM_CPU_PO_RST_CTRL_CORE_ALL;
>>> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
>>> +
>>> +     /* assert cluster resets */
>>> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>>> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
>>> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST_ALL;
>>> +     reg &= ~CPUCFG_CX_RST_CTRL_H_RST;
>>> +     reg &= ~CPUCFG_CX_RST_CTRL_L2_RST;
>>> +
>>> +     /*
>>> +      * Allwinner code also asserts resets for NEON on A15. According
>>> +      * to ARM manuals, asserting power-on reset is sufficient.
>>> +      */
>>> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
>>> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
>>> +             reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST_ALL;
>>> +     }
>>> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>>> +
>>> +     /* hold L1/L2 reset disable signals low */
>>> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
>>> +     if (of_machine_is_compatible("allwinner,sun9i-a80") &&
>>> +                     cluster == SUN9I_A80_A15_CLUSTER) {
>>> +             /* Cortex-A15: hold L2RSTDISABLE low */
>>> +             reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15;
>>> +     } else {
>>> +             /* Cortex-A7: hold L1RSTDISABLE and L2RSTDISABLE low */
>>> +             reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL;
>>> +             reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7;
>>> +     }
>>> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
>>> +
>>> +     /* clear cluster power gate */
>>> +     reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
>>> +     reg &= ~PRCM_PWROFF_GATING_REG_CLUSTER;
>>> +     writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
>>> +     udelay(20);
>>> +
>>> +     /* de-assert cluster resets */
>>> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>>> +     reg |= CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
>>> +     reg |= CPUCFG_CX_RST_CTRL_H_RST;
>>> +     reg |= CPUCFG_CX_RST_CTRL_L2_RST;
>>> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
>>> +
>>> +     /* de-assert ACINACTM */
>>> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>>> +     reg &= ~CPUCFG_CX_CTRL_REG1_ACINACTM;
>>> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>>> +
>>> +     return 0;
>>> +}
>>> +
>>> +static void sunxi_cpu_cache_disable(void)
>>> +{
>>> +     /* Disable and flush the local CPU cache. */
>>> +     v7_exit_coherency_flush(louis);
>>> +}
>>> +
>>> +/*
>>> + * This bit is shared between the initial mcpm_sync_init call to enable
>>> + * CCI-400 and proper cluster cache disable before power down.
>>> + */
>>> +static void sunxi_cluster_cache_disable_without_axi(void)
>>> +{
>>> +     if (read_cpuid_part() == ARM_CPU_PART_CORTEX_A15) {
>>> +             /*
>>> +              * On the Cortex-A15 we need to disable
>>> +              * L2 prefetching before flushing the cache.
>>> +              */
>>> +             asm volatile(
>>> +             "mcr    p15, 1, %0, c15, c0, 3\n"
>>> +             "isb\n"
>>> +             "dsb"
>>> +             : : "r" (0x400));
>>> +     }
>>> +
>>> +     /* Flush all cache levels for this cluster. */
>>> +     v7_exit_coherency_flush(all);
>>> +
>>> +     /*
>>> +      * Disable cluster-level coherency by masking
>>> +      * incoming snoops and DVM messages:
>>> +      */
>>> +     cci_disable_port_by_cpu(read_cpuid_mpidr());
>>> +}
>>> +
>>> +static void sunxi_cluster_cache_disable(void)
>>> +{
>>> +     unsigned int cluster = MPIDR_AFFINITY_LEVEL(read_cpuid_mpidr(), 1);
>>> +     u32 reg;
>>> +
>>> +     pr_info("%s: cluster %u\n", __func__, cluster);
>>> +
>>> +     sunxi_cluster_cache_disable_without_axi();
>>> +
>>> +     /* last man standing, assert ACINACTM */
>>> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>>> +     reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
>>> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
>>> +}
>>> +
>>> +static const struct mcpm_platform_ops sunxi_power_ops = {
>>> +     .cpu_powerup            = sunxi_cpu_powerup,
>>> +     .cluster_powerup        = sunxi_cluster_powerup,
>>> +     .cpu_cache_disable      = sunxi_cpu_cache_disable,
>>> +     .cluster_cache_disable  = sunxi_cluster_cache_disable,
>>> +};
>>> +
>>> +/*
>>> + * Enable cluster-level coherency, in preparation for turning on the MMU.
>>> + *
>>> + * Also enable regional clock gating and L2 data latency settings for
>>> + * Cortex-A15.
>>> + */
>>> +static void __naked sunxi_power_up_setup(unsigned int affinity_level)
>>> +{
>>> +     asm volatile (
>>> +             "mrc    p15, 0, r1, c0, c0, 0\n"
>>> +             "movw   r2, #" __stringify(ARM_CPU_PART_MASK & 0xffff) "\n"
>>> +             "movt   r2, #" __stringify(ARM_CPU_PART_MASK >> 16) "\n"
>>> +             "and    r1, r1, r2\n"
>>> +             "movw   r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 & 0xffff) "\n"
>>> +             "movt   r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 >> 16) "\n"
>>> +             "cmp    r1, r2\n"
>>> +             "bne    not_a15\n"
>>> +
>>> +             /* The following is Cortex-A15 specific */
>>> +
>>> +             /* L2CTRL: Enable CPU regional clock gates */
>>> +             "mrc p15, 1, r1, c15, c0, 4\n"
>>> +             "orr r1, r1, #(0x1<<31)\n"
>>> +             "mcr p15, 1, r1, c15, c0, 4\n"
>>> +
>>> +             /* L2ACTLR */
>>> +             "mrc p15, 1, r1, c15, c0, 0\n"
>>> +             /* Enable L2, GIC, and Timer regional clock gates */
>>> +             "orr r1, r1, #(0x1<<26)\n"
>>> +             /* Disable clean/evict from being pushed to external */
>>> +             "orr r1, r1, #(0x1<<3)\n"
>>> +             "mcr p15, 1, r1, c15, c0, 0\n"
>>> +
>>> +             /* L2 data RAM latency */
>>> +             "mrc p15, 1, r1, c9, c0, 2\n"
>>> +             "bic r1, r1, #(0x7<<0)\n"
>>> +             "orr r1, r1, #(0x3<<0)\n"
>>> +             "mcr p15, 1, r1, c9, c0, 2\n"
>>> +
>>> +             /* End of Cortex-A15 specific setup */
>>> +             "not_a15:\n"
>>> +
>>> +             "cmp    r0, #1\n"
>>> +             "bxne   lr\n"
>>> +             "b      cci_enable_port_for_self"
>>> +     );
>>> +}
>>> +
>>> +static void sunxi_mcpm_setup_entry_point(void)
>>> +{
>>> +     __raw_writel(virt_to_phys(mcpm_entry_point),
>>> +                  prcm_base + PRCM_CPU_SOFT_ENTRY_REG);
>>> +}
>>> +
>>> +static int __init sunxi_mcpm_init(void)
>>> +{
>>> +     struct device_node *node;
>>> +     int ret;
>>> +
>>> +     if (!of_machine_is_compatible("allwinner,sun9i-a80"))
>>> +             return -ENODEV;
>>> +
>>> +     if (!cci_probed())
>>> +             return -ENODEV;
>>> +
>>> +     node = of_find_compatible_node(NULL, NULL,
>>> +                     "allwinner,sun9i-a80-cpucfg");
>>> +     if (!node)
>>> +             return -ENODEV;
>>> +
>>> +     cpucfg_base = of_iomap(node, 0);
>>> +     of_node_put(node);
>>> +     if (!cpucfg_base) {
>>> +             pr_err("%s: failed to map CPUCFG registers\n", __func__);
>>> +             return -ENOMEM;
>>> +     }
>> 
>> Can't we request the region as well?
> 
> Yes we can! But only for the CPUCFG registers. The PRCM block is
> shared with all the PRCM block clock drivers. :(
> 
>> 
>>> +
>>> +     node = of_find_compatible_node(NULL, NULL,
>>> +                     "allwinner,sun9i-a80-prcm");
>>> +     if (!node)
>>> +             return -ENODEV;
>>> +
>>> +     prcm_base = of_iomap(node, 0);
>>> +
>>> +     of_node_put(node);
>>> +     if (!prcm_base) {
>>> +             pr_err("%s: failed to map PRCM registers\n", __func__);
>>> +             iounmap(cpucfg_base);
>>> +             return -ENOMEM;
>>> +     }
>>> +
>>> +     ret = mcpm_platform_register(&sunxi_power_ops);
>>> +     if (!ret)
>>> +             ret = mcpm_sync_init(sunxi_power_up_setup);
>>> +     if (!ret)
>>> +             /* do not disable AXI master as no one will re-enable it */
>>> +             ret = mcpm_loopback(sunxi_cluster_cache_disable_without_axi);
>>> +     if (ret) {
>>> +             iounmap(cpucfg_base);
>>> +             iounmap(prcm_base);
>>> +             return ret;
>>> +     }
>>> +
>>> +     mcpm_smp_set_ops();
>>> +
>>> +     pr_info("sunxi MCPM support installed\n");
>>> +
>>> +     sunxi_mcpm_setup_entry_point();
>>> +
>>> +     return ret;
>>> +}
>> 
>> It looks mostly good, and I would replace the sunxi by sun9i, and call
>> that file sun9i-mcpm.c
> 
> I was hoping to reuse the file for the A83T, so it was sunxi-mcpm.c
> or just mcpm. Most of the stuff is similar, except the A83T has two
> revisions and one of them has two gate/power bits swapped. :(
> 
> ChenYu
> 
>> 
>> Thanks!
>> Maxime
>> 
>> --
>> Maxime Ripard, Free Electrons
>> Embedded Linux and Kernel engineering
>> http://free-electrons.com
> 


* Re: [PATCH 1/4] ARM: sun9i: Support SMP on A80 with Multi-Cluster Power Management (MCPM)
@ 2017-07-25 14:40         ` Maxime Ripard
  0 siblings, 0 replies; 28+ messages in thread
From: Maxime Ripard @ 2017-07-25 14:40 UTC (permalink / raw)
  To: Chen-Yu Tsai
  Cc: Russell King, linux-sunxi, linux-arm-kernel, linux-kernel,
	devicetree, Nicolas Pitre, Dave Martin

[-- Attachment #1: Type: text/plain, Size: 18692 bytes --]

On Tue, Jul 25, 2017 at 04:29:52PM +0800, Chen-Yu Tsai wrote:
> On Tue, Jul 25, 2017 at 3:47 PM, Maxime Ripard
> <maxime.ripard@free-electrons.com> wrote:
> > Hi Chen-Yu,
> >
> > On Tue, Jul 25, 2017 at 01:09:16PM +0800, Chen-Yu Tsai wrote:
> >> The A80 is a big.LITTLE SoC with 1 cluster of 4 Cortex-A7s and
> >> 1 cluster of 4 Cortex-A15s.
> >>
> >> This patch adds support to bring up the second cluster and thus all
> >> cores using the common MCPM code. Core/cluster power down has not
> >> been implemented, so CPU hotplugging and the big.LITTLE switcher are
> >> not supported.
> >>
> >> Signed-off-by: Chen-Yu Tsai <wens@csie.org>
> >> ---
> >>  arch/arm/mach-sunxi/Kconfig  |  10 ++
> >>  arch/arm/mach-sunxi/Makefile |   1 +
> >>  arch/arm/mach-sunxi/mcpm.c   | 391 +++++++++++++++++++++++++++++++++++++++++++
> >>  3 files changed, 402 insertions(+)
> >>  create mode 100644 arch/arm/mach-sunxi/mcpm.c
> >>
> >> diff --git a/arch/arm/mach-sunxi/Kconfig b/arch/arm/mach-sunxi/Kconfig
> >> index 58153cdf025b..177380548d99 100644
> >> --- a/arch/arm/mach-sunxi/Kconfig
> >> +++ b/arch/arm/mach-sunxi/Kconfig
> >> @@ -47,5 +47,15 @@ config MACH_SUN9I
> >>       bool "Allwinner (sun9i) SoCs support"
> >>       default ARCH_SUNXI
> >>       select ARM_GIC
> >> +     imply MCPM
> >> +
> >> +config SUN9I_A80_MCPM
> >> +     bool "Allwinner A80 Multi-Cluster PM support"
> >> +     depends on MCPM && MACH_SUN9I
> >> +     default MACH_SUN9I
> >> +     select ARM_CCI400_PORT_CTRL
> >> +     help
> >> +       This is needed to provide CPU and cluster power management
> >> +       on Allwinner A80 implementing big.LITTLE.
> >
> > > Do we really need an option for that? We don't provide the option to
> > disable the CPU SMP operations for the rest of the SoCs.
> 
> It was an option as it also required MCPM and CCI400 support to be built.
> We could hide it. Or, using mach-hisi as a reference, we could do:
> 
> config MACH_SUN9I
>         default ARCH_SUNXI
>         select ARM_GIC
>         select MCPM if SMP
>         select ARM_CCI400_PORT_CTRL if SMP
> 
> and in the Makefile:
> 
> obj-$(CONFIG_MCPM) += sun9i-mcpm.o

I guess a hidden option would work for me.

> >> +#define SUNXI_CPUS_PER_CLUSTER               4
> >> +#define SUNXI_NR_CLUSTERS            2
> >> +
> >> +#define SUN9I_A80_A15_CLUSTER                1
> >
> > Don't we have a way to derive that from the DT ?
> 
> Indeed we can.
> 
> It would be slightly more complicated though:
> 
> node = of_cpu_device_node_get(cluster * SUNXI_CPUS_PER_CLUSTER + cpu);
> if (of_device_is_compatible(node, "arm,cortex-a15")) {
>         ...
> }

There's no helper to create that map?

We'll use it for A83T too, so the complexity will be reduced anyway.
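
A rough sketch of what such a helper could look like, as a userspace
model only. The cpu_compatible table is a hypothetical stand-in for the
"compatible" strings the kernel would read from the cpu nodes (for
example through of_cpu_device_node_get() and of_device_is_compatible(),
as in the snippet quoted above). The helper names and the caching
approach are assumptions for illustration, not part of the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SUNXI_CPUS_PER_CLUSTER	4
#define SUNXI_NR_CLUSTERS	2

/*
 * Hypothetical stand-in for the DT cpu nodes, indexed by logical CPU
 * (cluster * SUNXI_CPUS_PER_CLUSTER + cpu). In the kernel this data
 * would come from the device tree, not from a static table.
 */
static const char *const cpu_compatible[SUNXI_NR_CLUSTERS *
					SUNXI_CPUS_PER_CLUSTER] = {
	"arm,cortex-a7", "arm,cortex-a7", "arm,cortex-a7", "arm,cortex-a7",
	"arm,cortex-a15", "arm,cortex-a15", "arm,cortex-a15", "arm,cortex-a15",
};

/* Per-cluster result, filled once at init time. */
static bool cluster_is_a15[SUNXI_NR_CLUSTERS];

/* Build the cluster -> core type map from the first core of each cluster. */
static void sunxi_build_cluster_map(void)
{
	unsigned int cluster;

	for (cluster = 0; cluster < SUNXI_NR_CLUSTERS; cluster++) {
		const char *compat =
			cpu_compatible[cluster * SUNXI_CPUS_PER_CLUSTER];

		cluster_is_a15[cluster] = !strcmp(compat, "arm,cortex-a15");
	}
}

/* Replacement for the "cluster == SUN9I_A80_A15_CLUSTER" comparisons. */
static bool sunxi_cluster_is_a15(unsigned int cluster)
{
	assert(cluster < SUNXI_NR_CLUSTERS);
	return cluster_is_a15[cluster];
}
```

Building the map once at init keeps the power-up paths free of repeated
DT lookups, which is why the sketch caches one flag per cluster.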

> >
> >> +#define CPUCFG_CX_CTRL_REG0(c)               (0x10 * (c))
> >> +#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(n)        BIT(n)
> >> +#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL       0xf
> >> +#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7        BIT(4)
> >> +#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15       BIT(0)
> >> +#define CPUCFG_CX_CTRL_REG1(c)               (0x10 * (c) + 0x4)
> >> +#define CPUCFG_CX_CTRL_REG1_ACINACTM BIT(0)
> >> +#define CPUCFG_CX_RST_CTRL(c)                (0x80 + 0x4 * (c))
> >> +#define CPUCFG_CX_RST_CTRL_DBG_SOC_RST       BIT(24)
> >> +#define CPUCFG_CX_RST_CTRL_ETM_RST(n)        BIT(20 + (n))
> >> +#define CPUCFG_CX_RST_CTRL_ETM_RST_ALL       (0xf << 20)
> >> +#define CPUCFG_CX_RST_CTRL_DBG_RST(n)        BIT(16 + (n))
> >> +#define CPUCFG_CX_RST_CTRL_DBG_RST_ALL       (0xf << 16)
> >> +#define CPUCFG_CX_RST_CTRL_H_RST     BIT(12)
> >> +#define CPUCFG_CX_RST_CTRL_L2_RST    BIT(8)
> >> +#define CPUCFG_CX_RST_CTRL_CX_RST(n) BIT(4 + (n))
> >> +#define CPUCFG_CX_RST_CTRL_CORE_RST(n)       BIT(n)
> >> +
> >> +#define PRCM_CPU_PO_RST_CTRL(c)              (0x4 + 0x4 * (c))
> >> +#define PRCM_CPU_PO_RST_CTRL_CORE(n) BIT(n)
> >> +#define PRCM_CPU_PO_RST_CTRL_CORE_ALL        0xf
> >> +#define PRCM_PWROFF_GATING_REG(c)    (0x100 + 0x4 * (c))
> >> +#define PRCM_PWROFF_GATING_REG_CLUSTER       BIT(4)
> >> +#define PRCM_PWROFF_GATING_REG_CORE(n)       BIT(n)
> >> +#define PRCM_PWR_SWITCH_REG(c, cpu)  (0x140 + 0x10 * (c) + 0x4 * (cpu))
> >> +#define PRCM_CPU_SOFT_ENTRY_REG              0x164
> >> +
> >> +static void __iomem *cpucfg_base;
> >> +static void __iomem *prcm_base;
> >> +
> >> +static int sunxi_cpu_power_switch_set(unsigned int cpu, unsigned int cluster,
> >> +                                   bool enable)
> >> +{
> >> +     u32 reg;
> >> +
> >> +     /* control sequence from Allwinner A80 user manual v1.2 PRCM section */
> >> +     reg = readl(prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> >> +     if (enable) {
> >> +             if (reg == 0x00) {
> >> +                     pr_debug("power clamp for cluster %u cpu %u already open\n",
> >> +                              cluster, cpu);
> >> +                     return 0;
> >> +             }
> >> +
> >> +             writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> >> +             udelay(10);
> >> +             writel(0xfe, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> >> +             udelay(10);
> >> +             writel(0xf8, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> >> +             udelay(10);
> >> +             writel(0xf0, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> >> +             udelay(10);
> >> +             writel(0x00, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> >> +             udelay(10);
> >> +     } else {
> >> +             writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> >> +             udelay(10);
> >> +     }
> >> +
> >> +     return 0;
> >> +}
> >> +
> >> +static int sunxi_cpu_powerup(unsigned int cpu, unsigned int cluster)
> >> +{
> >> +     u32 reg;
> >> +
> >> +     pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> >> +     if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
> >> +             return -EINVAL;
> >> +
> >> +     /* assert processor power-on reset */
> >> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> >> +     reg &= ~PRCM_CPU_PO_RST_CTRL_CORE(cpu);
> >> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> >> +
> >> +     /* Cortex-A7: hold L1 reset disable signal low */
> >> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
> >> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
> >> +             reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
> >> +             reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(cpu);
> >> +             writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
> >> +     }
> >> +
> >> +     /* assert processor related resets */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
> >> +
> >> +     /*
> >> +      * Allwinner code also asserts resets for NEON on A15. According
> >> +      * to ARM manuals, asserting power-on reset is sufficient.
> >> +      */
> >> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
> >> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
> >> +             reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
> >> +     }
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +
> >> +     /* open power switch */
> >> +     sunxi_cpu_power_switch_set(cpu, cluster, true);
> >> +
> >> +     /* clear processor power gate */
> >> +     reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
> >> +     reg &= ~PRCM_PWROFF_GATING_REG_CORE(cpu);
> >> +     writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
> >> +     udelay(20);
> >> +
> >> +     /* de-assert processor power-on reset */
> >> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> >> +     reg |= PRCM_CPU_PO_RST_CTRL_CORE(cpu);
> >> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> >> +
> >> +     /* de-assert all processor resets */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +     reg |= CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
> >> +     reg |= CPUCFG_CX_RST_CTRL_CORE_RST(cpu);
> >> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
> >> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
> >> +             reg |= CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
> >> +     } else {
> >> +             reg |= CPUCFG_CX_RST_CTRL_CX_RST(cpu); /* NEON */
> >> +     }
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +
> >> +     return 0;
> >> +}
> >> +
> >> +static int sunxi_cluster_powerup(unsigned int cluster)
> >> +{
> >> +     u32 reg;
> >> +
> >> +     pr_debug("%s: cluster %u\n", __func__, cluster);
> >> +     if (cluster >= SUNXI_NR_CLUSTERS)
> >> +             return -EINVAL;
> >> +
> >> +     /* assert ACINACTM */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> >> +     reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> >> +
> >> +     /* assert cluster processor power-on resets */
> >> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> >> +     reg &= ~PRCM_CPU_PO_RST_CTRL_CORE_ALL;
> >> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> >> +
> >> +     /* assert cluster resets */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
> >> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST_ALL;
> >> +     reg &= ~CPUCFG_CX_RST_CTRL_H_RST;
> >> +     reg &= ~CPUCFG_CX_RST_CTRL_L2_RST;
> >> +
> >> +     /*
> >> +      * Allwinner code also asserts resets for NEON on A15. According
> >> +      * to ARM manuals, asserting power-on reset is sufficient.
> >> +      */
> >> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
> >> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
> >> +             reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST_ALL;
> >> +     }
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +
> >> +     /* hold L1/L2 reset disable signals low */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
> >> +     if (of_machine_is_compatible("allwinner,sun9i-a80") &&
> >> +                     cluster == SUN9I_A80_A15_CLUSTER) {
> >> +             /* Cortex-A15: hold L2RSTDISABLE low */
> >> +             reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15;
> >> +     } else {
> >> +             /* Cortex-A7: hold L1RSTDISABLE and L2RSTDISABLE low */
> >> +             reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL;
> >> +             reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7;
> >> +     }
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
> >> +
> >> +     /* clear cluster power gate */
> >> +     reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
> >> +     reg &= ~PRCM_PWROFF_GATING_REG_CLUSTER;
> >> +     writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
> >> +     udelay(20);
> >> +
> >> +     /* de-assert cluster resets */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +     reg |= CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
> >> +     reg |= CPUCFG_CX_RST_CTRL_H_RST;
> >> +     reg |= CPUCFG_CX_RST_CTRL_L2_RST;
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +
> >> +     /* de-assert ACINACTM */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> >> +     reg &= ~CPUCFG_CX_CTRL_REG1_ACINACTM;
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> >> +
> >> +     return 0;
> >> +}
> >> +
> >> +static void sunxi_cpu_cache_disable(void)
> >> +{
> >> +     /* Disable and flush the local CPU cache. */
> >> +     v7_exit_coherency_flush(louis);
> >> +}
> >> +
> >> +/*
> >> + * This bit is shared between the initial mcpm_sync_init call to enable
> >> + * CCI-400 and proper cluster cache disable before power down.
> >> + */
> >> +static void sunxi_cluster_cache_disable_without_axi(void)
> >> +{
> >> +     if (read_cpuid_part() == ARM_CPU_PART_CORTEX_A15) {
> >> +             /*
> >> +              * On the Cortex-A15 we need to disable
> >> +              * L2 prefetching before flushing the cache.
> >> +              */
> >> +             asm volatile(
> >> +             "mcr    p15, 1, %0, c15, c0, 3\n"
> >> +             "isb\n"
> >> +             "dsb"
> >> +             : : "r" (0x400));
> >> +     }
> >> +
> >> +     /* Flush all cache levels for this cluster. */
> >> +     v7_exit_coherency_flush(all);
> >> +
> >> +     /*
> >> +      * Disable cluster-level coherency by masking
> >> +      * incoming snoops and DVM messages:
> >> +      */
> >> +     cci_disable_port_by_cpu(read_cpuid_mpidr());
> >> +}
> >> +
> >> +static void sunxi_cluster_cache_disable(void)
> >> +{
> >> +     unsigned int cluster = MPIDR_AFFINITY_LEVEL(read_cpuid_mpidr(), 1);
> >> +     u32 reg;
> >> +
> >> +     pr_info("%s: cluster %u\n", __func__, cluster);
> >> +
> >> +     sunxi_cluster_cache_disable_without_axi();
> >> +
> >> +     /* last man standing, assert ACINACTM */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> >> +     reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> >> +}
> >> +
> >> +static const struct mcpm_platform_ops sunxi_power_ops = {
> >> +     .cpu_powerup            = sunxi_cpu_powerup,
> >> +     .cluster_powerup        = sunxi_cluster_powerup,
> >> +     .cpu_cache_disable      = sunxi_cpu_cache_disable,
> >> +     .cluster_cache_disable  = sunxi_cluster_cache_disable,
> >> +};
> >> +
> >> +/*
> >> + * Enable cluster-level coherency, in preparation for turning on the MMU.
> >> + *
> >> + * Also enable regional clock gating and L2 data latency settings for
> >> + * Cortex-A15.
> >> + */
> >> +static void __naked sunxi_power_up_setup(unsigned int affinity_level)
> >> +{
> >> +     asm volatile (
> >> +             "mrc    p15, 0, r1, c0, c0, 0\n"
> >> +             "movw   r2, #" __stringify(ARM_CPU_PART_MASK & 0xffff) "\n"
> >> +             "movt   r2, #" __stringify(ARM_CPU_PART_MASK >> 16) "\n"
> >> +             "and    r1, r1, r2\n"
> >> +             "movw   r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 & 0xffff) "\n"
> >> +             "movt   r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 >> 16) "\n"
> >> +             "cmp    r1, r2\n"
> >> +             "bne    not_a15\n"
> >> +
> >> +             /* The following is Cortex-A15 specific */
> >> +
> >> +             /* L2CTRL: Enable CPU regional clock gates */
> >> +             "mrc p15, 1, r1, c15, c0, 4\n"
> >> +             "orr r1, r1, #(0x1<<31)\n"
> >> +             "mcr p15, 1, r1, c15, c0, 4\n"
> >> +
> >> +             /* L2ACTLR */
> >> +             "mrc p15, 1, r1, c15, c0, 0\n"
> >> +             /* Enable L2, GIC, and Timer regional clock gates */
> >> +             "orr r1, r1, #(0x1<<26)\n"
> >> +             /* Disable clean/evict from being pushed to external */
> >> +             "orr r1, r1, #(0x1<<3)\n"
> >> +             "mcr p15, 1, r1, c15, c0, 0\n"
> >> +
> >> +             /* L2 data RAM latency */
> >> +             "mrc p15, 1, r1, c9, c0, 2\n"
> >> +             "bic r1, r1, #(0x7<<0)\n"
> >> +             "orr r1, r1, #(0x3<<0)\n"
> >> +             "mcr p15, 1, r1, c9, c0, 2\n"
> >> +
> >> +             /* End of Cortex-A15 specific setup */
> >> +             "not_a15:\n"
> >> +
> >> +             "cmp    r0, #1\n"
> >> +             "bxne   lr\n"
> >> +             "b      cci_enable_port_for_self"
> >> +     );
> >> +}
> >> +
> >> +static void sunxi_mcpm_setup_entry_point(void)
> >> +{
> >> +     __raw_writel(virt_to_phys(mcpm_entry_point),
> >> +                  prcm_base + PRCM_CPU_SOFT_ENTRY_REG);
> >> +}
> >> +
> >> +static int __init sunxi_mcpm_init(void)
> >> +{
> >> +     struct device_node *node;
> >> +     int ret;
> >> +
> >> +     if (!of_machine_is_compatible("allwinner,sun9i-a80"))
> >> +             return -ENODEV;
> >> +
> >> +     if (!cci_probed())
> >> +             return -ENODEV;
> >> +
> >> +     node = of_find_compatible_node(NULL, NULL,
> >> +                     "allwinner,sun9i-a80-cpucfg");
> >> +     if (!node)
> >> +             return -ENODEV;
> >> +
> >> +     cpucfg_base = of_iomap(node, 0);
> >> +     of_node_put(node);
> >> +     if (!cpucfg_base) {
> >> +             pr_err("%s: failed to map CPUCFG registers\n", __func__);
> >> +             return -ENOMEM;
> >> +     }
> >
> > Can't we request the region as well?
> 
> Yes we can! But only for the CPUCFG registers. The PRCM block is
> shared with all the PRCM block clock drivers. :(

Yeah, I know :/
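
Since two register blocks end up mapped either way, here is a minimal
sketch of how the init path could unwind them, written as a userspace
model rather than kernel code. fake_map() and fake_unmap() are
hypothetical stand-ins for of_iomap() and iounmap(), and fail_region is
only a test knob. The point is that a failure on the second block
releases the first one:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* 1 while the corresponding (fake) region is mapped, 0 otherwise. */
static int cpucfg_mapped, prcm_mapped;

/* Name of the region whose mapping should fail, or NULL for none. */
static const char *fail_region;

/* Hypothetical stand-in for of_iomap(): returns NULL on failure. */
static int *fake_map(const char *name, int *state)
{
	if (fail_region && !strcmp(fail_region, name))
		return NULL;
	*state = 1;
	return state;
}

/* Hypothetical stand-in for iounmap(). */
static void fake_unmap(int *base)
{
	*base = 0;
}

static int sunxi_mcpm_map_regs(void)
{
	int *cpucfg, *prcm;
	int ret;

	cpucfg = fake_map("cpucfg", &cpucfg_mapped);
	if (!cpucfg)
		return -ENOMEM;

	prcm = fake_map("prcm", &prcm_mapped);
	if (!prcm) {
		ret = -ENOMEM;
		goto err_unmap_cpucfg;
	}

	return 0;

err_unmap_cpucfg:
	/* Release the block that was mapped, not the one that failed. */
	fake_unmap(cpucfg);
	return ret;
}
```

A goto-based unwind like this keeps each failure path releasing exactly
what was acquired before it, and a region request for CPUCFG would slot
in as one more step with its own label.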

> >
> >> +
> >> +     node = of_find_compatible_node(NULL, NULL,
> >> +                     "allwinner,sun9i-a80-prcm");
> >> +     if (!node)
> >> +             return -ENODEV;
> >> +
> >> +     prcm_base = of_iomap(node, 0);
> >> +
> >> +     of_node_put(node);
> >> +     if (!prcm_base) {
> >> +             pr_err("%s: failed to map PRCM registers\n", __func__);
> >> +             iounmap(cpucfg_base);
> >> +             return -ENOMEM;
> >> +     }
> >> +
> >> +     ret = mcpm_platform_register(&sunxi_power_ops);
> >> +     if (!ret)
> >> +             ret = mcpm_sync_init(sunxi_power_up_setup);
> >> +     if (!ret)
> >> +             /* do not disable AXI master as no one will re-enable it */
> >> +             ret = mcpm_loopback(sunxi_cluster_cache_disable_without_axi);
> >> +     if (ret) {
> >> +             iounmap(cpucfg_base);
> >> +             iounmap(prcm_base);
> >> +             return ret;
> >> +     }
> >> +
> >> +     mcpm_smp_set_ops();
> >> +
> >> +     pr_info("sunxi MCPM support installed\n");
> >> +
> >> +     sunxi_mcpm_setup_entry_point();
> >> +
> >> +     return ret;
> >> +}
> >
> > It looks mostly good, and I would replace the sunxi by sun9i, and call
> > that file sun9i-mcpm.c
> 
> I was hoping to reuse the file for the A83T, so it was sunxi-mcpm.c
> or just mcpm. Most of the stuff is similar, except the A83T has two
> revisions and one of them has two gate/power bits swapped. :(

Hmmm, that's true.

What about just mcpm then?

Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com



* Re: [PATCH 1/4] ARM: sun9i: Support SMP on A80 with Multi-Cluster Power Management (MCPM)
@ 2017-07-25 14:40         ` Maxime Ripard
  0 siblings, 0 replies; 28+ messages in thread
From: Maxime Ripard @ 2017-07-25 14:40 UTC (permalink / raw)
  To: Chen-Yu Tsai
  Cc: Russell King, linux-sunxi, linux-arm-kernel, linux-kernel,
	devicetree, Nicolas Pitre, Dave Martin

[-- Attachment #1: Type: text/plain, Size: 18743 bytes --]

On Tue, Jul 25, 2017 at 04:29:52PM +0800, Chen-Yu Tsai wrote:
> On Tue, Jul 25, 2017 at 3:47 PM, Maxime Ripard
> <maxime.ripard-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org> wrote:
> > Hi Chen-Yu,
> >
> > On Tue, Jul 25, 2017 at 01:09:16PM +0800, Chen-Yu Tsai wrote:
> >> The A80 is a big.LITTLE SoC with 1 cluster of 4 Cortex-A7s and
> >> 1 cluster of 4 Cortex-A15s.
> >>
> >> This patch adds support to bring up the second cluster and thus all
> >> cores using the common MCPM code. Core/cluster power down has not
> >> been implemented, so CPU hotplugging and the big.LITTLE switcher are
> >> not supported.
> >>
> >> Signed-off-by: Chen-Yu Tsai <wens-jdAy2FN1RRM@public.gmane.org>
> >> ---
> >>  arch/arm/mach-sunxi/Kconfig  |  10 ++
> >>  arch/arm/mach-sunxi/Makefile |   1 +
> >>  arch/arm/mach-sunxi/mcpm.c   | 391 +++++++++++++++++++++++++++++++++++++++++++
> >>  3 files changed, 402 insertions(+)
> >>  create mode 100644 arch/arm/mach-sunxi/mcpm.c
> >>
> >> diff --git a/arch/arm/mach-sunxi/Kconfig b/arch/arm/mach-sunxi/Kconfig
> >> index 58153cdf025b..177380548d99 100644
> >> --- a/arch/arm/mach-sunxi/Kconfig
> >> +++ b/arch/arm/mach-sunxi/Kconfig
> >> @@ -47,5 +47,15 @@ config MACH_SUN9I
> >>       bool "Allwinner (sun9i) SoCs support"
> >>       default ARCH_SUNXI
> >>       select ARM_GIC
> >> +     imply MCPM
> >> +
> >> +config SUN9I_A80_MCPM
> >> +     bool "Allwinner A80 Multi-Cluster PM support"
> >> +     depends on MCPM && MACH_SUN9I
> >> +     default MACH_SUN9I
> >> +     select ARM_CCI400_PORT_CTRL
> >> +     help
> >> +       This is needed to provide CPU and cluster power management
> >> +       on Allwinner A80 implementing big.LITTLE.
> >
> > > Do we really need an option for that? We don't provide the option to
> > disable the CPU SMP operations for the rest of the SoCs.
> 
> It was an option as it also required MCPM and CCI400 support to be built.
> We could hide it. Or, using mach-hisi as a reference, we could do:
> 
> config MACH_SUN9I
>         default ARCH_SUNXI
>         select ARM_GIC
>         select MCPM if SMP
>         select ARM_CCI400_PORT_CTRL if SMP
> 
> and in the Makefile:
> 
> obj-$(CONFIG_MCPM) += sun9i-mcpm.o

I guess a hidden option would work for me.

> >> +#define SUNXI_CPUS_PER_CLUSTER               4
> >> +#define SUNXI_NR_CLUSTERS            2
> >> +
> >> +#define SUN9I_A80_A15_CLUSTER                1
> >
> > Don't we have a way to derive that from the DT ?
> 
> Indeed we can.
> 
> It would be slightly more complicated though:
> 
> node = of_cpu_device_node_get(cluster * SUNXI_CPUS_PER_CLUSTER + cpu);
> if (of_device_is_compatible(node, "arm,cortex-a15")) {
>         ...
> }

There's no helper to create that map?

We'll use it for A83T too, so the complexity will be reduced anyway.

> >
> >> +#define CPUCFG_CX_CTRL_REG0(c)               (0x10 * (c))
> >> +#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(n)        BIT(n)
> >> +#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL       0xf
> >> +#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7        BIT(4)
> >> +#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15       BIT(0)
> >> +#define CPUCFG_CX_CTRL_REG1(c)               (0x10 * (c) + 0x4)
> >> +#define CPUCFG_CX_CTRL_REG1_ACINACTM BIT(0)
> >> +#define CPUCFG_CX_RST_CTRL(c)                (0x80 + 0x4 * (c))
> >> +#define CPUCFG_CX_RST_CTRL_DBG_SOC_RST       BIT(24)
> >> +#define CPUCFG_CX_RST_CTRL_ETM_RST(n)        BIT(20 + (n))
> >> +#define CPUCFG_CX_RST_CTRL_ETM_RST_ALL       (0xf << 20)
> >> +#define CPUCFG_CX_RST_CTRL_DBG_RST(n)        BIT(16 + (n))
> >> +#define CPUCFG_CX_RST_CTRL_DBG_RST_ALL       (0xf << 16)
> >> +#define CPUCFG_CX_RST_CTRL_H_RST     BIT(12)
> >> +#define CPUCFG_CX_RST_CTRL_L2_RST    BIT(8)
> >> +#define CPUCFG_CX_RST_CTRL_CX_RST(n) BIT(4 + (n))
> >> +#define CPUCFG_CX_RST_CTRL_CORE_RST(n)       BIT(n)
> >> +
> >> +#define PRCM_CPU_PO_RST_CTRL(c)              (0x4 + 0x4 * (c))
> >> +#define PRCM_CPU_PO_RST_CTRL_CORE(n) BIT(n)
> >> +#define PRCM_CPU_PO_RST_CTRL_CORE_ALL        0xf
> >> +#define PRCM_PWROFF_GATING_REG(c)    (0x100 + 0x4 * (c))
> >> +#define PRCM_PWROFF_GATING_REG_CLUSTER       BIT(4)
> >> +#define PRCM_PWROFF_GATING_REG_CORE(n)       BIT(n)
> >> +#define PRCM_PWR_SWITCH_REG(c, cpu)  (0x140 + 0x10 * (c) + 0x4 * (cpu))
> >> +#define PRCM_CPU_SOFT_ENTRY_REG              0x164
> >> +
> >> +static void __iomem *cpucfg_base;
> >> +static void __iomem *prcm_base;
> >> +
> >> +static int sunxi_cpu_power_switch_set(unsigned int cpu, unsigned int cluster,
> >> +                                   bool enable)
> >> +{
> >> +     u32 reg;
> >> +
> >> +     /* control sequence from Allwinner A80 user manual v1.2 PRCM section */
> >> +     reg = readl(prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> >> +     if (enable) {
> >> +             if (reg == 0x00) {
> >> +                     pr_debug("power clamp for cluster %u cpu %u already open\n",
> >> +                              cluster, cpu);
> >> +                     return 0;
> >> +             }
> >> +
> >> +             writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> >> +             udelay(10);
> >> +             writel(0xfe, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> >> +             udelay(10);
> >> +             writel(0xf8, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> >> +             udelay(10);
> >> +             writel(0xf0, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> >> +             udelay(10);
> >> +             writel(0x00, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> >> +             udelay(10);
> >> +     } else {
> >> +             writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> >> +             udelay(10);
> >> +     }
> >> +
> >> +     return 0;
> >> +}
> >> +
> >> +static int sunxi_cpu_powerup(unsigned int cpu, unsigned int cluster)
> >> +{
> >> +     u32 reg;
> >> +
> >> +     pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> >> +     if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
> >> +             return -EINVAL;
> >> +
> >> +     /* assert processor power-on reset */
> >> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> >> +     reg &= ~PRCM_CPU_PO_RST_CTRL_CORE(cpu);
> >> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> >> +
> >> +     /* Cortex-A7: hold L1 reset disable signal low */
> >> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
> >> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
> >> +             reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
> >> +             reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(cpu);
> >> +             writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
> >> +     }
> >> +
> >> +     /* assert processor related resets */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
> >> +
> >> +     /*
> >> +      * Allwinner code also asserts resets for NEON on A15. According
> >> +      * to ARM manuals, asserting power-on reset is sufficient.
> >> +      */
> >> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
> >> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
> >> +             reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
> >> +     }
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +
> >> +     /* open power switch */
> >> +     sunxi_cpu_power_switch_set(cpu, cluster, true);
> >> +
> >> +     /* clear processor power gate */
> >> +     reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
> >> +     reg &= ~PRCM_PWROFF_GATING_REG_CORE(cpu);
> >> +     writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
> >> +     udelay(20);
> >> +
> >> +     /* de-assert processor power-on reset */
> >> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> >> +     reg |= PRCM_CPU_PO_RST_CTRL_CORE(cpu);
> >> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> >> +
> >> +     /* de-assert all processor resets */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +     reg |= CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
> >> +     reg |= CPUCFG_CX_RST_CTRL_CORE_RST(cpu);
> >> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
> >> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
> >> +             reg |= CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
> >> +     } else {
> >> +             reg |= CPUCFG_CX_RST_CTRL_CX_RST(cpu); /* NEON */
> >> +     }
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +
> >> +     return 0;
> >> +}
> >> +
> >> +static int sunxi_cluster_powerup(unsigned int cluster)
> >> +{
> >> +     u32 reg;
> >> +
> >> +     pr_debug("%s: cluster %u\n", __func__, cluster);
> >> +     if (cluster >= SUNXI_NR_CLUSTERS)
> >> +             return -EINVAL;
> >> +
> >> +     /* assert ACINACTM */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> >> +     reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> >> +
> >> +     /* assert cluster processor power-on resets */
> >> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> >> +     reg &= ~PRCM_CPU_PO_RST_CTRL_CORE_ALL;
> >> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> >> +
> >> +     /* assert cluster resets */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
> >> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST_ALL;
> >> +     reg &= ~CPUCFG_CX_RST_CTRL_H_RST;
> >> +     reg &= ~CPUCFG_CX_RST_CTRL_L2_RST;
> >> +
> >> +     /*
> >> +      * Allwinner code also asserts resets for NEON on A15. According
> >> +      * to ARM manuals, asserting power-on reset is sufficient.
> >> +      */
> >> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
> >> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
> >> +             reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST_ALL;
> >> +     }
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +
> >> +     /* hold L1/L2 reset disable signals low */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
> >> +     if (of_machine_is_compatible("allwinner,sun9i-a80") &&
> >> +                     cluster == SUN9I_A80_A15_CLUSTER) {
> >> +             /* Cortex-A15: hold L2RSTDISABLE low */
> >> +             reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15;
> >> +     } else {
> >> +             /* Cortex-A7: hold L1RSTDISABLE and L2RSTDISABLE low */
> >> +             reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL;
> >> +             reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7;
> >> +     }
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
> >> +
> >> +     /* clear cluster power gate */
> >> +     reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
> >> +     reg &= ~PRCM_PWROFF_GATING_REG_CLUSTER;
> >> +     writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
> >> +     udelay(20);
> >> +
> >> +     /* de-assert cluster resets */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +     reg |= CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
> >> +     reg |= CPUCFG_CX_RST_CTRL_H_RST;
> >> +     reg |= CPUCFG_CX_RST_CTRL_L2_RST;
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +
> >> +     /* de-assert ACINACTM */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> >> +     reg &= ~CPUCFG_CX_CTRL_REG1_ACINACTM;
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> >> +
> >> +     return 0;
> >> +}
> >> +
> >> +static void sunxi_cpu_cache_disable(void)
> >> +{
> >> +     /* Disable and flush the local CPU cache. */
> >> +     v7_exit_coherency_flush(louis);
> >> +}
> >> +
> >> +/*
> >> + * This bit is shared between the initial mcpm_sync_init call to enable
> >> + * CCI-400 and proper cluster cache disable before power down.
> >> + */
> >> +static void sunxi_cluster_cache_disable_without_axi(void)
> >> +{
> >> +     if (read_cpuid_part() == ARM_CPU_PART_CORTEX_A15) {
> >> +             /*
> >> +              * On the Cortex-A15 we need to disable
> >> +              * L2 prefetching before flushing the cache.
> >> +              */
> >> +             asm volatile(
> >> +             "mcr    p15, 1, %0, c15, c0, 3\n"
> >> +             "isb\n"
> >> +             "dsb"
> >> +             : : "r" (0x400));
> >> +     }
> >> +
> >> +     /* Flush all cache levels for this cluster. */
> >> +     v7_exit_coherency_flush(all);
> >> +
> >> +     /*
> >> +      * Disable cluster-level coherency by masking
> >> +      * incoming snoops and DVM messages:
> >> +      */
> >> +     cci_disable_port_by_cpu(read_cpuid_mpidr());
> >> +}
> >> +
> >> +static void sunxi_cluster_cache_disable(void)
> >> +{
> >> +     unsigned int cluster = MPIDR_AFFINITY_LEVEL(read_cpuid_mpidr(), 1);
> >> +     u32 reg;
> >> +
> >> +     pr_info("%s: cluster %u\n", __func__, cluster);
> >> +
> >> +     sunxi_cluster_cache_disable_without_axi();
> >> +
> >> +     /* last man standing, assert ACINACTM */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> >> +     reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> >> +}
> >> +
> >> +static const struct mcpm_platform_ops sunxi_power_ops = {
> >> +     .cpu_powerup            = sunxi_cpu_powerup,
> >> +     .cluster_powerup        = sunxi_cluster_powerup,
> >> +     .cpu_cache_disable      = sunxi_cpu_cache_disable,
> >> +     .cluster_cache_disable  = sunxi_cluster_cache_disable,
> >> +};
> >> +
> >> +/*
> >> + * Enable cluster-level coherency, in preparation for turning on the MMU.
> >> + *
> >> + * Also enable regional clock gating and L2 data latency settings for
> >> + * Cortex-A15.
> >> + */
> >> +static void __naked sunxi_power_up_setup(unsigned int affinity_level)
> >> +{
> >> +     asm volatile (
> >> +             "mrc    p15, 0, r1, c0, c0, 0\n"
> >> +             "movw   r2, #" __stringify(ARM_CPU_PART_MASK & 0xffff) "\n"
> >> +             "movt   r2, #" __stringify(ARM_CPU_PART_MASK >> 16) "\n"
> >> +             "and    r1, r1, r2\n"
> >> +             "movw   r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 & 0xffff) "\n"
> >> +             "movt   r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 >> 16) "\n"
> >> +             "cmp    r1, r2\n"
> >> +             "bne    not_a15\n"
> >> +
> >> +             /* The following is Cortex-A15 specific */
> >> +
> >> +             /* L2CTRL: Enable CPU regional clock gates */
> >> +             "mrc p15, 1, r1, c15, c0, 4\n"
> >> +             "orr r1, r1, #(0x1<<31)\n"
> >> +             "mcr p15, 1, r1, c15, c0, 4\n"
> >> +
> >> +             /* L2ACTLR */
> >> +             "mrc p15, 1, r1, c15, c0, 0\n"
> >> +             /* Enable L2, GIC, and Timer regional clock gates */
> >> +             "orr r1, r1, #(0x1<<26)\n"
> >> +             /* Disable clean/evict from being pushed to external */
> >> +             "orr r1, r1, #(0x1<<3)\n"
> >> +             "mcr p15, 1, r1, c15, c0, 0\n"
> >> +
> >> +             /* L2 data RAM latency */
> >> +             "mrc p15, 1, r1, c9, c0, 2\n"
> >> +             "bic r1, r1, #(0x7<<0)\n"
> >> +             "orr r1, r1, #(0x3<<0)\n"
> >> +             "mcr p15, 1, r1, c9, c0, 2\n"
> >> +
> >> +             /* End of Cortex-A15 specific setup */
> >> +             "not_a15:\n"
> >> +
> >> +             "cmp    r0, #1\n"
> >> +             "bxne   lr\n"
> >> +             "b      cci_enable_port_for_self"
> >> +     );
> >> +}
> >> +
> >> +static void sunxi_mcpm_setup_entry_point(void)
> >> +{
> >> +     __raw_writel(virt_to_phys(mcpm_entry_point),
> >> +                  prcm_base + PRCM_CPU_SOFT_ENTRY_REG);
> >> +}
> >> +
> >> +static int __init sunxi_mcpm_init(void)
> >> +{
> >> +     struct device_node *node;
> >> +     int ret;
> >> +
> >> +     if (!of_machine_is_compatible("allwinner,sun9i-a80"))
> >> +             return -ENODEV;
> >> +
> >> +     if (!cci_probed())
> >> +             return -ENODEV;
> >> +
> >> +     node = of_find_compatible_node(NULL, NULL,
> >> +                     "allwinner,sun9i-a80-cpucfg");
> >> +     if (!node)
> >> +             return -ENODEV;
> >> +
> >> +     cpucfg_base = of_iomap(node, 0);
> >> +     of_node_put(node);
> >> +     if (!cpucfg_base) {
> >> +             pr_err("%s: failed to map CPUCFG registers\n", __func__);
> >> +             return -ENOMEM;
> >> +     }
> >
> > Can't we request the region as well?
> 
> Yes we can! But only for the CPUCFG registers. The PRCM block is
> shared with all the PRCM block clock drivers. :(

Yeah, I know :/
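
To spell out the constraint: requesting a region marks it busy, so a
second request for an overlapping range fails. Below is a toy user-space
model of that behaviour, not the kernel's resource API. region_request()
is a hypothetical name, and any addresses passed to it are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REGIONS 8

struct region {
	uint32_t start;
	uint32_t size;
	const char *owner;
};

static struct region claimed[MAX_REGIONS];
static size_t nr_claimed;

/* Half-open interval overlap test, done in 64 bits to avoid wrap-around. */
static bool regions_overlap(uint32_t a, uint32_t a_size,
			    uint32_t b, uint32_t b_size)
{
	return (uint64_t)a < (uint64_t)b + b_size &&
	       (uint64_t)b < (uint64_t)a + a_size;
}

/*
 * Claim [start, start + size) for owner. Returns false if the range
 * overlaps an existing claim or the table is full.
 */
static bool region_request(uint32_t start, uint32_t size, const char *owner)
{
	size_t i;

	assert(size > 0);

	for (i = 0; i < nr_claimed; i++) {
		if (regions_overlap(start, size,
				    claimed[i].start, claimed[i].size))
			return false;
	}

	if (nr_claimed == MAX_REGIONS)
		return false;

	claimed[nr_claimed].start = start;
	claimed[nr_claimed].size = size;
	claimed[nr_claimed].owner = owner;
	nr_claimed++;
	return true;
}
```

With the PRCM range already claimed by another driver, a second
region_request() for it returns false, while a CPUCFG-style range that
nobody else uses succeeds. In the kernel, of_io_request_and_map() would
cover the CPUCFG case by combining the request and the mapping.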

> >
> >> +
> >> +     node = of_find_compatible_node(NULL, NULL,
> >> +                     "allwinner,sun9i-a80-prcm");
> >> +     if (!node)
> >> +             return -ENODEV;
> >> +
> >> +     prcm_base = of_iomap(node, 0);
> >> +
> >> +     of_node_put(node);
> >> +     if (!prcm_base) {
> >> +             pr_err("%s: failed to map PRCM registers\n", __func__);
> >> +             iounmap(cpucfg_base);
> >> +             return -ENOMEM;
> >> +     }
> >> +
> >> +     ret = mcpm_platform_register(&sunxi_power_ops);
> >> +     if (!ret)
> >> +             ret = mcpm_sync_init(sunxi_power_up_setup);
> >> +     if (!ret)
> >> +             /* do not disable AXI master as no one will re-enable it */
> >> +             ret = mcpm_loopback(sunxi_cluster_cache_disable_without_axi);
> >> +     if (ret) {
> >> +             iounmap(cpucfg_base);
> >> +             iounmap(prcm_base);
> >> +             return ret;
> >> +     }
> >> +
> >> +     mcpm_smp_set_ops();
> >> +
> >> +     pr_info("sunxi MCPM support installed\n");
> >> +
> >> +     sunxi_mcpm_setup_entry_point();
> >> +
> >> +     return ret;
> >> +}
> >
> > It looks mostly good, and I would replace the sunxi by sun9i, and call
> > that file sun9i-mcpm.c
> 
> I was hoping to reuse the file for the A83T, so it was sunxi-mcpm.c
> or just mcpm. Most of the stuff is similar, except the A83T has two
> revisions and one of them has two gate/power bits swapped. :(

Hmmm, that's true.

What about just mcpm then?

Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/4] ARM: sun9i: Support SMP on A80 with Multi-Cluster Power Management (MCPM)
@ 2017-07-25 14:40         ` Maxime Ripard
  0 siblings, 0 replies; 28+ messages in thread
From: Maxime Ripard @ 2017-07-25 14:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jul 25, 2017 at 04:29:52PM +0800, Chen-Yu Tsai wrote:
>          default ARCH_SUNXI
> On Tue, Jul 25, 2017 at 3:47 PM, Maxime Ripard
> <maxime.ripard@free-electrons.com> wrote:
> > Hi Chen-Yu,
> >
> > On Tue, Jul 25, 2017 at 01:09:16PM +0800, Chen-Yu Tsai wrote:
> >> The A80 is a big.LITTLE SoC with 1 cluster of 4 Cortex-A7s and
> >> 1 cluster of 4 Cortex-A15s.
> >>
> >> This patch adds support to bring up the second cluster and thus all
> >> cores using the common MCPM code. Core/cluster power down has not
> >> been implemented, thus CPU hotplugging and big.LITTLE switcher is
> >> not supported.
> >>
> >> Signed-off-by: Chen-Yu Tsai <wens@csie.org>
> >> ---
> >>  arch/arm/mach-sunxi/Kconfig  |  10 ++
> >>  arch/arm/mach-sunxi/Makefile |   1 +
> >>  arch/arm/mach-sunxi/mcpm.c   | 391 +++++++++++++++++++++++++++++++++++++++++++
> >>  3 files changed, 402 insertions(+)
> >>  create mode 100644 arch/arm/mach-sunxi/mcpm.c
> >>
> >> diff --git a/arch/arm/mach-sunxi/Kconfig b/arch/arm/mach-sunxi/Kconfig
> >> index 58153cdf025b..177380548d99 100644
> >> --- a/arch/arm/mach-sunxi/Kconfig
> >> +++ b/arch/arm/mach-sunxi/Kconfig
> >> @@ -47,5 +47,15 @@ config MACH_SUN9I
> >>       bool "Allwinner (sun9i) SoCs support"
> >>       default ARCH_SUNXI
> >>       select ARM_GIC
> >> +     imply MCPM
> >> +
> >> +config SUN9I_A80_MCPM
> >> +     bool "Allwinner A80 Multi-Cluster PM support"
> >> +     depends on MCPM && MACH_SUN9I
> >> +     default MACH_SUN9I
> >> +     select ARM_CCI400_PORT_CTRL
> >> +     help
> >> +       This is needed to provide CPU and cluster power management
> >> +       on Allwinner A80 implementing big.LITTLE.
> >
> > Do we really need an option for that? we don't provide the option to
> > disable the CPU SMP operations for the rest of the SoCs.
> 
> It was an option as it also required MCPM and CCI400 support to be built.
> We could hide it. Or, using mach-hisi as a reference, we could do:
> 
> config MACH_SUN9I
>         default ARCH_SUNXI
>         select ARM_GIC
>         select MCPM if SMP
>         select ARM_CCI400_PORT_CTRL if SMP
> 
> and in the Makefile:
> 
> obj-$(CONFIG_MCPM) += sun9i-mcpm.o

I guess a hidden option would work for me.

> >> +#define SUNXI_CPUS_PER_CLUSTER               4
> >> +#define SUNXI_NR_CLUSTERS            2
> >> +
> >> +#define SUN9I_A80_A15_CLUSTER                1
> >
> > Don't we have a way to derive that from the DT ?
> 
> Indeed we can.
> 
> It would be slighty more complicated though:
> 
> node = of_cpu_device_node_get(cluster * SUNXI_CPUS_PER_CLUSTER + cpu);
> if (of_device_is_compatible(node, "arm,cortex-a15")) {
>         ...
> }

There's no helper to create that map?

We'll use it for A83T too, so the complexity will be reduced anyway.

> >
> >> +#define CPUCFG_CX_CTRL_REG0(c)               (0x10 * (c))
> >> +#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(n)        BIT(n)
> >> +#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL       0xf
> >> +#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7        BIT(4)
> >> +#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15       BIT(0)
> >> +#define CPUCFG_CX_CTRL_REG1(c)               (0x10 * (c) + 0x4)
> >> +#define CPUCFG_CX_CTRL_REG1_ACINACTM BIT(0)
> >> +#define CPUCFG_CX_RST_CTRL(c)                (0x80 + 0x4 * (c))
> >> +#define CPUCFG_CX_RST_CTRL_DBG_SOC_RST       BIT(24)
> >> +#define CPUCFG_CX_RST_CTRL_ETM_RST(n)        BIT(20 + (n))
> >> +#define CPUCFG_CX_RST_CTRL_ETM_RST_ALL       (0xf << 20)
> >> +#define CPUCFG_CX_RST_CTRL_DBG_RST(n)        BIT(16 + (n))
> >> +#define CPUCFG_CX_RST_CTRL_DBG_RST_ALL       (0xf << 16)
> >> +#define CPUCFG_CX_RST_CTRL_H_RST     BIT(12)
> >> +#define CPUCFG_CX_RST_CTRL_L2_RST    BIT(8)
> >> +#define CPUCFG_CX_RST_CTRL_CX_RST(n) BIT(4 + (n))
> >> +#define CPUCFG_CX_RST_CTRL_CORE_RST(n)       BIT(n)
> >> +
> >> +#define PRCM_CPU_PO_RST_CTRL(c)              (0x4 + 0x4 * (c))
> >> +#define PRCM_CPU_PO_RST_CTRL_CORE(n) BIT(n)
> >> +#define PRCM_CPU_PO_RST_CTRL_CORE_ALL        0xf
> >> +#define PRCM_PWROFF_GATING_REG(c)    (0x100 + 0x4 * (c))
> >> +#define PRCM_PWROFF_GATING_REG_CLUSTER       BIT(4)
> >> +#define PRCM_PWROFF_GATING_REG_CORE(n)       BIT(n)
> >> +#define PRCM_PWR_SWITCH_REG(c, cpu)  (0x140 + 0x10 * (c) + 0x4 * (cpu))
> >> +#define PRCM_CPU_SOFT_ENTRY_REG              0x164
> >> +
> >> +static void __iomem *cpucfg_base;
> >> +static void __iomem *prcm_base;
> >> +
> >> +static int sunxi_cpu_power_switch_set(unsigned int cpu, unsigned int cluster,
> >> +                                   bool enable)
> >> +{
> >> +     u32 reg;
> >> +
> >> +     /* control sequence from Allwinner A80 user manual v1.2 PRCM section */
> >> +     reg = readl(prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> >> +     if (enable) {
> >> +             if (reg == 0x00) {
> >> +                     pr_debug("power clamp for cluster %u cpu %u already open\n",
> >> +                              cluster, cpu);
> >> +                     return 0;
> >> +             }
> >> +
> >> +             writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> >> +             udelay(10);
> >> +             writel(0xfe, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> >> +             udelay(10);
> >> +             writel(0xf8, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> >> +             udelay(10);
> >> +             writel(0xf0, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> >> +             udelay(10);
> >> +             writel(0x00, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> >> +             udelay(10);
> >> +     } else {
> >> +             writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
> >> +             udelay(10);
> >> +     }
> >> +
> >> +     return 0;
> >> +}
> >> +
> >> +static int sunxi_cpu_powerup(unsigned int cpu, unsigned int cluster)
> >> +{
> >> +     u32 reg;
> >> +
> >> +     pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> >> +     if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
> >> +             return -EINVAL;
> >> +
> >> +     /* assert processor power-on reset */
> >> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> >> +     reg &= ~PRCM_CPU_PO_RST_CTRL_CORE(cpu);
> >> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> >> +
> >> +     /* Cortex-A7: hold L1 reset disable signal low */
> >> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
> >> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
> >> +             reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
> >> +             reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(cpu);
> >> +             writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
> >> +     }
> >> +
> >> +     /* assert processor related resets */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
> >> +
> >> +     /*
> >> +      * Allwinner code also asserts resets for NEON on A15. According
> >> +      * to ARM manuals, asserting power-on reset is sufficient.
> >> +      */
> >> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
> >> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
> >> +             reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
> >> +     }
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +
> >> +     /* open power switch */
> >> +     sunxi_cpu_power_switch_set(cpu, cluster, true);
> >> +
> >> +     /* clear processor power gate */
> >> +     reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
> >> +     reg &= ~PRCM_PWROFF_GATING_REG_CORE(cpu);
> >> +     writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
> >> +     udelay(20);
> >> +
> >> +     /* de-assert processor power-on reset */
> >> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> >> +     reg |= PRCM_CPU_PO_RST_CTRL_CORE(cpu);
> >> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> >> +
> >> +     /* de-assert all processor resets */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +     reg |= CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
> >> +     reg |= CPUCFG_CX_RST_CTRL_CORE_RST(cpu);
> >> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
> >> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
> >> +             reg |= CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
> >> +     } else {
> >> +             reg |= CPUCFG_CX_RST_CTRL_CX_RST(cpu); /* NEON */
> >> +     }
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +
> >> +     return 0;
> >> +}
> >> +
> >> +static int sunxi_cluster_powerup(unsigned int cluster)
> >> +{
> >> +     u32 reg;
> >> +
> >> +     pr_debug("%s: cluster %u\n", __func__, cluster);
> >> +     if (cluster >= SUNXI_NR_CLUSTERS)
> >> +             return -EINVAL;
> >> +
> >> +     /* assert ACINACTM */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> >> +     reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> >> +
> >> +     /* assert cluster processor power-on resets */
> >> +     reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> >> +     reg &= ~PRCM_CPU_PO_RST_CTRL_CORE_ALL;
> >> +     writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
> >> +
> >> +     /* assert cluster resets */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
> >> +     reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST_ALL;
> >> +     reg &= ~CPUCFG_CX_RST_CTRL_H_RST;
> >> +     reg &= ~CPUCFG_CX_RST_CTRL_L2_RST;
> >> +
> >> +     /*
> >> +      * Allwinner code also asserts resets for NEON on A15. According
> >> +      * to ARM manuals, asserting power-on reset is sufficient.
> >> +      */
> >> +     if (!(of_machine_is_compatible("allwinner,sun9i-a80") &&
> >> +                     cluster == SUN9I_A80_A15_CLUSTER)) {
> >> +             reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST_ALL;
> >> +     }
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +
> >> +     /* hold L1/L2 reset disable signals low */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
> >> +     if (of_machine_is_compatible("allwinner,sun9i-a80") &&
> >> +                     cluster == SUN9I_A80_A15_CLUSTER) {
> >> +             /* Cortex-A15: hold L2RSTDISABLE low */
> >> +             reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15;
> >> +     } else {
> >> +             /* Cortex-A7: hold L1RSTDISABLE and L2RSTDISABLE low */
> >> +             reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL;
> >> +             reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7;
> >> +     }
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
> >> +
> >> +     /* clear cluster power gate */
> >> +     reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
> >> +     reg &= ~PRCM_PWROFF_GATING_REG_CLUSTER;
> >> +     writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
> >> +     udelay(20);
> >> +
> >> +     /* de-assert cluster resets */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +     reg |= CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
> >> +     reg |= CPUCFG_CX_RST_CTRL_H_RST;
> >> +     reg |= CPUCFG_CX_RST_CTRL_L2_RST;
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
> >> +
> >> +     /* de-assert ACINACTM */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> >> +     reg &= ~CPUCFG_CX_CTRL_REG1_ACINACTM;
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> >> +
> >> +     return 0;
> >> +}
> >> +
> >> +static void sunxi_cpu_cache_disable(void)
> >> +{
> >> +     /* Disable and flush the local CPU cache. */
> >> +     v7_exit_coherency_flush(louis);
> >> +}
> >> +
> >> +/*
> >> + * This bit is shared between the initial mcpm_sync_init call to enable
> >> + * CCI-400 and proper cluster cache disable before power down.
> >> + */
> >> +static void sunxi_cluster_cache_disable_without_axi(void)
> >> +{
> >> +     if (read_cpuid_part() == ARM_CPU_PART_CORTEX_A15) {
> >> +             /*
> >> +              * On the Cortex-A15 we need to disable
> >> +              * L2 prefetching before flushing the cache.
> >> +              */
> >> +             asm volatile(
> >> +             "mcr    p15, 1, %0, c15, c0, 3\n"
> >> +             "isb\n"
> >> +             "dsb"
> >> +             : : "r" (0x400));
> >> +     }
> >> +
> >> +     /* Flush all cache levels for this cluster. */
> >> +     v7_exit_coherency_flush(all);
> >> +
> >> +     /*
> >> +      * Disable cluster-level coherency by masking
> >> +      * incoming snoops and DVM messages:
> >> +      */
> >> +     cci_disable_port_by_cpu(read_cpuid_mpidr());
> >> +}
> >> +
> >> +static void sunxi_cluster_cache_disable(void)
> >> +{
> >> +     unsigned int cluster = MPIDR_AFFINITY_LEVEL(read_cpuid_mpidr(), 1);
> >> +     u32 reg;
> >> +
> >> +     pr_info("%s: cluster %u\n", __func__, cluster);
> >> +
> >> +     sunxi_cluster_cache_disable_without_axi();
> >> +
> >> +     /* last man standing, assert ACINACTM */
> >> +     reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> >> +     reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
> >> +     writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
> >> +}
> >> +
> >> +static const struct mcpm_platform_ops sunxi_power_ops = {
> >> +     .cpu_powerup            = sunxi_cpu_powerup,
> >> +     .cluster_powerup        = sunxi_cluster_powerup,
> >> +     .cpu_cache_disable      = sunxi_cpu_cache_disable,
> >> +     .cluster_cache_disable  = sunxi_cluster_cache_disable,
> >> +};
> >> +
> >> +/*
> >> + * Enable cluster-level coherency, in preparation for turning on the MMU.
> >> + *
> >> + * Also enable regional clock gating and L2 data latency settings for
> >> + * Cortex-A15.
> >> + */
> >> +static void __naked sunxi_power_up_setup(unsigned int affinity_level)
> >> +{
> >> +     asm volatile (
> >> +             "mrc    p15, 0, r1, c0, c0, 0\n"
> >> +             "movw   r2, #" __stringify(ARM_CPU_PART_MASK & 0xffff) "\n"
> >> +             "movt   r2, #" __stringify(ARM_CPU_PART_MASK >> 16) "\n"
> >> +             "and    r1, r1, r2\n"
> >> +             "movw   r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 & 0xffff) "\n"
> >> +             "movt   r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 >> 16) "\n"
> >> +             "cmp    r1, r2\n"
> >> +             "bne    not_a15\n"
> >> +
> >> +             /* The following is Cortex-A15 specific */
> >> +
> >> +             /* L2CTRL: Enable CPU regional clock gates */
> >> +             "mrc p15, 1, r1, c15, c0, 4\n"
> >> +             "orr r1, r1, #(0x1<<31)\n"
> >> +             "mcr p15, 1, r1, c15, c0, 4\n"
> >> +
> >> +             /* L2ACTLR */
> >> +             "mrc p15, 1, r1, c15, c0, 0\n"
> >> +             /* Enable L2, GIC, and Timer regional clock gates */
> >> +             "orr r1, r1, #(0x1<<26)\n"
> >> +             /* Disable clean/evict from being pushed to external */
> >> +             "orr r1, r1, #(0x1<<3)\n"
> >> +             "mcr p15, 1, r1, c15, c0, 0\n"
> >> +
> >> +             /* L2 data RAM latency */
> >> +             "mrc p15, 1, r1, c9, c0, 2\n"
> >> +             "bic r1, r1, #(0x7<<0)\n"
> >> +             "orr r1, r1, #(0x3<<0)\n"
> >> +             "mcr p15, 1, r1, c9, c0, 2\n"
> >> +
> >> +             /* End of Cortex-A15 specific setup */
> >> +             "not_a15:\n"
> >> +
> >> +             "cmp    r0, #1\n"
> >> +             "bxne   lr\n"
> >> +             "b      cci_enable_port_for_self"
> >> +     );
> >> +}
> >> +
> >> +static void sunxi_mcpm_setup_entry_point(void)
> >> +{
> >> +     __raw_writel(virt_to_phys(mcpm_entry_point),
> >> +                  prcm_base + PRCM_CPU_SOFT_ENTRY_REG);
> >> +}
> >> +
> >> +static int __init sunxi_mcpm_init(void)
> >> +{
> >> +     struct device_node *node;
> >> +     int ret;
> >> +
> >> +     if (!of_machine_is_compatible("allwinner,sun9i-a80"))
> >> +             return -ENODEV;
> >> +
> >> +     if (!cci_probed())
> >> +             return -ENODEV;
> >> +
> >> +     node = of_find_compatible_node(NULL, NULL,
> >> +                     "allwinner,sun9i-a80-cpucfg");
> >> +     if (!node)
> >> +             return -ENODEV;
> >> +
> >> +     cpucfg_base = of_iomap(node, 0);
> >> +     of_node_put(node);
> >> +     if (!cpucfg_base) {
> >> +             pr_err("%s: failed to map CPUCFG registers\n", __func__);
> >> +             return -ENOMEM;
> >> +     }
> >
> > Can't we request the region as well?
> 
> Yes we can! But only for the CPUCFG registers. The PRCM block is
> shared with all the PRCM block clock drivers. :(

Yeah, I know :/

> >
> >> +
> >> +     node = of_find_compatible_node(NULL, NULL,
> >> +                     "allwinner,sun9i-a80-prcm");
> >> +     if (!node)
> >> +             return -ENODEV;
> >> +
> >> +     prcm_base = of_iomap(node, 0);
> >> +
> >> +     of_node_put(node);
> >> +     if (!prcm_base) {
> >> +             pr_err("%s: failed to map PRCM registers\n", __func__);
> >> +             iounmap(prcm_base);
> >> +             return -ENOMEM;
> >> +     }
> >> +
> >> +     ret = mcpm_platform_register(&sunxi_power_ops);
> >> +     if (!ret)
> >> +             ret = mcpm_sync_init(sunxi_power_up_setup);
> >> +     if (!ret)
> >> +             /* do not disable AXI master as no one will re-enable it */
> >> +             ret = mcpm_loopback(sunxi_cluster_cache_disable_without_axi);
> >> +     if (ret) {
> >> +             iounmap(cpucfg_base);
> >> +             iounmap(prcm_base);
> >> +             return ret;
> >> +     }
> >> +
> >> +     mcpm_smp_set_ops();
> >> +
> >> +     pr_info("sunxi MCPM support installed\n");
> >> +
> >> +     sunxi_mcpm_setup_entry_point();
> >> +
> >> +     return ret;
> >> +}
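
An init path like the one quoted above maps two register blocks in sequence, so a failure part-way through has to release exactly what was already mapped and nothing else. The following is only a userspace sketch of the usual goto-based unwinding, not kernel code: mock_iomap(), mock_iounmap(), mock_mcpm_init() and the live_mappings counter are hypothetical stand-ins for of_iomap(), iounmap() and sunxi_mcpm_init().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a mapped register block. */
struct mock_regs {
	bool mapped;
};

static struct mock_regs cpucfg_regs, prcm_regs;
static int live_mappings;	/* number of mappings currently held */

/* Stand-in for of_iomap(): returns NULL when the mapping "fails". */
static struct mock_regs *mock_iomap(struct mock_regs *regs, bool succeed)
{
	if (!succeed)
		return NULL;
	regs->mapped = true;
	live_mappings++;
	return regs;
}

/* Stand-in for iounmap(): must only be given a live mapping. */
static void mock_iounmap(struct mock_regs *regs)
{
	assert(regs != NULL && regs->mapped);
	regs->mapped = false;
	live_mappings--;
}

/*
 * Model of the init sequence: map CPUCFG, map PRCM, then run the
 * registration step (its result is passed in as register_ret).
 * Each error label undoes only the steps that already succeeded.
 */
static int mock_mcpm_init(bool cpucfg_ok, bool prcm_ok, int register_ret)
{
	struct mock_regs *cpucfg_base, *prcm_base;
	int ret;

	cpucfg_base = mock_iomap(&cpucfg_regs, cpucfg_ok);
	if (!cpucfg_base)
		return -ENOMEM;

	prcm_base = mock_iomap(&prcm_regs, prcm_ok);
	if (!prcm_base) {
		ret = -ENOMEM;
		goto err_unmap_cpucfg;
	}

	ret = register_ret;
	if (ret)
		goto err_unmap_prcm;

	return 0;

err_unmap_prcm:
	mock_iounmap(prcm_base);
err_unmap_cpucfg:
	mock_iounmap(cpucfg_base);
	return ret;
}
```

With this layout, adding another resource later means adding one label rather than touching every earlier error branch.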
> >
> > It looks mostly good, and I would replace the sunxi by sun9i, and call
> > that file sun9i-mcpm.c
> 
> I was hoping to reuse the file for the A83T, so it was sunxi-mcpm.c
> or just mcpm. Most of the stuff is similar, except the A83T has two
> revisions and one of them has two gate/power bits swapped. :(

Hmmm, that's true.

What about just mcpm then?

Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/4] ARM: sun9i: Support SMP on A80 with Multi-Cluster Power Management (MCPM)
  2017-07-25 14:40         ` Maxime Ripard
  (?)
@ 2017-07-25 15:18           ` Chen-Yu Tsai
  -1 siblings, 0 replies; 28+ messages in thread
From: Chen-Yu Tsai @ 2017-07-25 15:18 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: Chen-Yu Tsai, Russell King, linux-sunxi, linux-arm-kernel,
	linux-kernel, devicetree, Nicolas Pitre, Dave Martin

On Tue, Jul 25, 2017 at 10:40 PM, Maxime Ripard
<maxime.ripard@free-electrons.com> wrote:
> On Tue, Jul 25, 2017 at 04:29:52PM +0800, Chen-Yu Tsai wrote:
>> On Tue, Jul 25, 2017 at 3:47 PM, Maxime Ripard
>> <maxime.ripard@free-electrons.com> wrote:
>> > Hi Chen-Yu,
>> >
>> > On Tue, Jul 25, 2017 at 01:09:16PM +0800, Chen-Yu Tsai wrote:
>> >> The A80 is a big.LITTLE SoC with 1 cluster of 4 Cortex-A7s and
>> >> 1 cluster of 4 Cortex-A15s.
>> >>
>> >> This patch adds support to bring up the second cluster and thus all
>> >> cores using the common MCPM code. Core/cluster power down has not
>> >> been implemented, thus CPU hotplugging and the big.LITTLE switcher are
>> >> not supported.
>> >>
>> >> Signed-off-by: Chen-Yu Tsai <wens@csie.org>
>> >> ---
>> >>  arch/arm/mach-sunxi/Kconfig  |  10 ++
>> >>  arch/arm/mach-sunxi/Makefile |   1 +
>> >>  arch/arm/mach-sunxi/mcpm.c   | 391 +++++++++++++++++++++++++++++++++++++++++++
>> >>  3 files changed, 402 insertions(+)
>> >>  create mode 100644 arch/arm/mach-sunxi/mcpm.c
>> >>
>> >> diff --git a/arch/arm/mach-sunxi/Kconfig b/arch/arm/mach-sunxi/Kconfig
>> >> index 58153cdf025b..177380548d99 100644
>> >> --- a/arch/arm/mach-sunxi/Kconfig
>> >> +++ b/arch/arm/mach-sunxi/Kconfig
>> >> @@ -47,5 +47,15 @@ config MACH_SUN9I
>> >>       bool "Allwinner (sun9i) SoCs support"
>> >>       default ARCH_SUNXI
>> >>       select ARM_GIC
>> >> +     imply MCPM
>> >> +
>> >> +config SUN9I_A80_MCPM
>> >> +     bool "Allwinner A80 Multi-Cluster PM support"
>> >> +     depends on MCPM && MACH_SUN9I
>> >> +     default MACH_SUN9I
>> >> +     select ARM_CCI400_PORT_CTRL
>> >> +     help
>> >> +       This is needed to provide CPU and cluster power management
>> >> +       on Allwinner A80 implementing big.LITTLE.
>> >
>> > Do we really need an option for that? we don't provide the option to
>> > disable the CPU SMP operations for the rest of the SoCs.
>>
>> It was an option as it also required MCPM and CCI400 support to be built.
>> We could hide it. Or, using mach-hisi as a reference, we could do:
>>
>> config MACH_SUN9I
>>         default ARCH_SUNXI
>>         select ARM_GIC
>>         select MCPM if SMP
>>         select ARM_CCI400_PORT_CTRL if SMP
>>
>> and in the Makefile:
>>
>> obj-$(CONFIG_MCPM) += sun9i-mcpm.o
>
> I guess a hidden option would work for me.

I kind of prefer mach-hisi's solution though.

>
>> >> +#define SUNXI_CPUS_PER_CLUSTER               4
>> >> +#define SUNXI_NR_CLUSTERS            2
>> >> +
>> >> +#define SUN9I_A80_A15_CLUSTER                1
>> >
>> > Don't we have a way to derive that from the DT ?
>>
>> Indeed we can.
>>
>> It would be slightly more complicated though:
>>
>> node = of_cpu_device_node_get(cluster * SUNXI_CPUS_PER_CLUSTER + cpu);
>> if (of_device_is_compatible(node, "arm,cortex-a15")) {
>>         ...
>> }
>
> There's no helper to create that map?

Are you referring to the topology map? That one only stores topology,
not what type of cores they are. The CPU capacity part doesn't either.
It only stores the results.

> We'll use it for A83T too, so the complexity will be reduced anyway.

I'll just put it in a helper function.
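
As a rough illustration of what such a helper could look like, here is a userspace sketch only. In the kernel the lookup would go through of_cpu_device_node_get() and of_device_is_compatible() as in the snippet quoted above; here the device tree is replaced by a hypothetical table, mock_cpu_compatible, and the helper name sunxi_core_is_cortex_a15() is likewise an assumption. The table follows the A80 layout described in the commit message, with the Cortex-A15s in cluster 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SUNXI_CPUS_PER_CLUSTER	4
#define SUNXI_NR_CLUSTERS	2

/*
 * Hypothetical stand-in for the "compatible" property of each CPU node,
 * indexed by logical CPU number (cluster * SUNXI_CPUS_PER_CLUSTER + cpu).
 */
static const char *const mock_cpu_compatible[SUNXI_NR_CLUSTERS *
					     SUNXI_CPUS_PER_CLUSTER] = {
	"arm,cortex-a7", "arm,cortex-a7", "arm,cortex-a7", "arm,cortex-a7",
	"arm,cortex-a15", "arm,cortex-a15", "arm,cortex-a15", "arm,cortex-a15",
};

/*
 * Returns true if the given core is a Cortex-A15 according to the table.
 * Out-of-range cpu/cluster values are reported as "not an A15".
 */
static bool sunxi_core_is_cortex_a15(unsigned int cpu, unsigned int cluster)
{
	size_t idx;

	if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
		return false;

	idx = (size_t)cluster * SUNXI_CPUS_PER_CLUSTER + cpu;
	assert(mock_cpu_compatible[idx] != NULL);

	return strcmp(mock_cpu_compatible[idx], "arm,cortex-a15") == 0;
}
```

A real implementation would also have to drop the node reference with of_node_put() after the compatible check.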

[...]

>> >
>> > It looks mostly good, and I would replace the sunxi by sun9i, and call
>> > that file sun9i-mcpm.c
>>
>> I was hoping to reuse the file for the A83T, so it was sunxi-mcpm.c
>> or just mcpm. Most of the stuff is similar, except the A83T has two
>> revisions and one of them has two gate/power bits swapped. :(
>
> Hmmm, that's true.
>
> What about just mcpm then?

Works for me. I don't need to change anything. :)

ChenYu

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2017-07-25 15:18 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-25  5:09 [PATCH 0/4] ARM: sun9i: SMP bring-up with Multi-Cluster Power Management Chen-Yu Tsai
2017-07-25  5:09 ` Chen-Yu Tsai
2017-07-25  5:09 ` Chen-Yu Tsai
2017-07-25  5:09 ` [PATCH 1/4] ARM: sun9i: Support SMP on A80 with Multi-Cluster Power Management (MCPM) Chen-Yu Tsai
2017-07-25  5:09   ` Chen-Yu Tsai
2017-07-25  5:09   ` Chen-Yu Tsai
2017-07-25  7:47   ` Maxime Ripard
2017-07-25  7:47     ` Maxime Ripard
2017-07-25  8:29     ` Chen-Yu Tsai
2017-07-25  8:29       ` Chen-Yu Tsai
2017-07-25  8:29       ` Chen-Yu Tsai
2017-07-25 12:13       ` icenowy
2017-07-25 12:13         ` icenowy at aosc.io
2017-07-25 14:40       ` Maxime Ripard
2017-07-25 14:40         ` Maxime Ripard
2017-07-25 14:40         ` Maxime Ripard
2017-07-25 15:18         ` Chen-Yu Tsai
2017-07-25 15:18           ` Chen-Yu Tsai
2017-07-25 15:18           ` Chen-Yu Tsai
2017-07-25  5:09 ` [PATCH 2/4] ARM: dts: sun9i: Add CCI-400 device nodes for A80 Chen-Yu Tsai
2017-07-25  5:09   ` Chen-Yu Tsai
2017-07-25  5:09   ` Chen-Yu Tsai
2017-07-25  5:09 ` [PATCH 3/4] ARM: dts: sun9i: Add CPUCFG device node for A80 dtsi Chen-Yu Tsai
2017-07-25  5:09   ` Chen-Yu Tsai
2017-07-25  5:09   ` Chen-Yu Tsai
2017-07-25  5:09 ` [PATCH 4/4] ARM: dts: sun9i: Add PRCM device node for the " Chen-Yu Tsai
2017-07-25  5:09   ` Chen-Yu Tsai
2017-07-25  5:09   ` Chen-Yu Tsai
