* [PATCH v2 0/8] ARM: sun9i: SMP support with Multi-Cluster Power Management
@ 2018-01-04 14:37 ` Chen-Yu Tsai
  0 siblings, 0 replies; 38+ messages in thread
From: Chen-Yu Tsai @ 2018-01-04 14:37 UTC (permalink / raw)
  To: Maxime Ripard, Russell King, Rob Herring, Mark Rutland
  Cc: Mylene JOSSERAND, Chen-Yu Tsai, devicetree, linux-arm-kernel,
	linux-kernel, linux-sunxi, Nicolas Pitre, Dave Martin

This is v2 of my sun9i SMP support with MCPM series, which was started
over two years ago [1]. We have tried to implement PSCI for both the A80
and A83T, but the results were not promising. The issue is that these two
chips have a broken security extensions implementation: if a specific bit
is not burned in the e-fuse, most if not all security protections don't
work [2]. Even worse, non-secure accesses to the GIC become secure. This
requires a crazy workaround in the GIC driver that probably doesn't work
in all cases [3].

Nicolas mentioned that the MCPM framework is likely overkill in our
case [4]. However, the framework does provide cluster/core state tracking
and proper sequencing of cache-related operations. We could rework
the code to use standard smp_ops, but I would like to get a working
version in first.

Much of the sunxi-specific MCPM code is derived from Allwinner code and
documentation, with some references to the other MCPM implementations,
as well as the ARM Cortex-A7/A15 Technical Reference Manuals for the
power sequencing details.

One major difference compared to other platforms is that we currently do
not have a standalone PMU or other embedded firmware to do the actual
power sequencing. All power/reset control is done by the kernel. Nicolas
mentioned that a new optional callback should be added for cases where the
kernel has to do the actual power down [5]. For now, however, I'm using a
dedicated single-threaded workqueue. CPU and cluster power-off work is
queued from the .{cpu,cluster}_powerdown_prepare callbacks. This solution
is somewhat heavy, as I have a total of 10 static work structs. It might
also be a bit racy, as nothing prevents the system from bringing a core
back up before the asynchronous work shuts it down. This would likely
happen on a heavily loaded system with a scheduler that brings cores in
and out frequently. In simple use cases it performs OK.
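
As an illustration, here is a rough sketch of the workqueue approach
described above. It is not the actual patch code (the real code lives in
the power-down patch later in this series); the sunxi_pwroff_work layout,
the sunxi_cpu_powerdown() helper and the "sunxi-mcpm" workqueue name are
made-up placeholders:

#include <linux/workqueue.h>

struct sunxi_pwroff_work {
	struct work_struct work;
	unsigned int cpu;
	unsigned int cluster;
};

/* 8 per-CPU work structs; the 2 per-cluster ones are analogous */
static struct workqueue_struct *sunxi_mcpm_wq;
static struct sunxi_pwroff_work
	cpu_pwroff_work[SUNXI_NR_CLUSTERS][SUNXI_CPUS_PER_CLUSTER];

static void sunxi_cpu_powerdown_fn(struct work_struct *work)
{
	struct sunxi_pwroff_work *w =
		container_of(work, struct sunxi_pwroff_work, work);

	/* wait for the core to reach WFI, then gate its power (omitted) */
	sunxi_cpu_powerdown(w->cpu, w->cluster);
}

static int sunxi_cpu_powerdown_prepare(unsigned int cpu,
				       unsigned int cluster)
{
	/* defer the actual register poking to the ordered workqueue */
	queue_work(sunxi_mcpm_wq, &cpu_pwroff_work[cluster][cpu].work);
	return 0;
}

/*
 * At init time:
 *   sunxi_mcpm_wq = alloc_ordered_workqueue("sunxi-mcpm", 0);
 *   INIT_WORK(&cpu_pwroff_work[c][n].work, sunxi_cpu_powerdown_fn);
 */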

Changes since v1:

  - Removed leading zeroes from device node addresses
  - Added device tree binding for SMP SRAM
  - Simplified Kconfig options
  - Switched to SPDX license identifier
  - Mapped CPUs to their device tree nodes and checked the compatible to
    see whether each is a Cortex-A15 or Cortex-A7
  - Fixed incorrect CPUCFG cluster status macro that prevented cluster 0
    L2 cache WFI detection
  - Fixed reversed bit for turning off a cluster
  - Put the cluster in reset before turning off its power (or it hangs)
  - Added dedicated workqueue for turning off power to CPUs and clusters
  - Requested the CPUCFG and SRAM MMIO ranges
  - Fixed or added some comments
  - Added some debug messages

[1] http://www.spinics.net/lists/arm-kernel/msg418350.html
[2] https://lists.denx.de/pipermail/u-boot/2017-June/294637.html
[3] https://github.com/wens/linux/commit/c48654c1f737116e7a7660183c8c74fa91970528
[4] http://www.spinics.net/lists/arm-kernel/msg434160.html
[5] http://www.spinics.net/lists/arm-kernel/msg434408.html

Chen-Yu Tsai (8):
  ARM: sun9i: Support SMP on A80 with Multi-Cluster Power Management
    (MCPM)
  ARM: dts: sun9i: Add CCI-400 device nodes for A80
  ARM: dts: sun9i: Add CPUCFG device node for A80 dtsi
  ARM: dts: sun9i: Add PRCM device node for the A80 dtsi
  ARM: sun9i: mcpm: Support CPU/cluster power down and hotplugging for
    cpu1~7
  dt-bindings: ARM: sunxi: Document A80 SoC secure SRAM usage by SMP
    hotplug
  ARM: sun9i: mcpm: Support cpu0 hotplug
  ARM: dts: sun9i: Add secure SRAM node used for MCPM SMP hotplug

 .../devicetree/bindings/arm/sunxi/smp-sram.txt     |  44 ++
 arch/arm/boot/dts/sun9i-a80.dtsi                   |  75 +++
 arch/arm/mach-sunxi/Kconfig                        |   2 +
 arch/arm/mach-sunxi/Makefile                       |   1 +
 arch/arm/mach-sunxi/mcpm.c                         | 633 +++++++++++++++++++++
 5 files changed, 755 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/sunxi/smp-sram.txt
 create mode 100644 arch/arm/mach-sunxi/mcpm.c

-- 
2.15.1

* [PATCH v2 1/8] ARM: sun9i: Support SMP on A80 with Multi-Cluster Power Management (MCPM)
@ 2018-01-04 14:37   ` Chen-Yu Tsai
  0 siblings, 0 replies; 38+ messages in thread
From: Chen-Yu Tsai @ 2018-01-04 14:37 UTC (permalink / raw)
  To: Maxime Ripard, Russell King, Rob Herring, Mark Rutland
  Cc: Mylene JOSSERAND, Chen-Yu Tsai, devicetree, linux-arm-kernel,
	linux-kernel, linux-sunxi, Nicolas Pitre, Dave Martin

The A80 is a big.LITTLE SoC with one cluster of four Cortex-A7 cores
and one cluster of four Cortex-A15 cores.

This patch adds support for bringing up the second cluster, and thus all
cores, using the common MCPM code. Core/cluster power down has not been
implemented, so CPU hotplugging and the big.LITTLE switcher are not
supported.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
---
 arch/arm/mach-sunxi/Kconfig  |   2 +
 arch/arm/mach-sunxi/Makefile |   1 +
 arch/arm/mach-sunxi/mcpm.c   | 425 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 428 insertions(+)
 create mode 100644 arch/arm/mach-sunxi/mcpm.c

diff --git a/arch/arm/mach-sunxi/Kconfig b/arch/arm/mach-sunxi/Kconfig
index 58153cdf025b..53c4e7420cfb 100644
--- a/arch/arm/mach-sunxi/Kconfig
+++ b/arch/arm/mach-sunxi/Kconfig
@@ -47,5 +47,7 @@ config MACH_SUN9I
 	bool "Allwinner (sun9i) SoCs support"
 	default ARCH_SUNXI
 	select ARM_GIC
+	select MCPM if SMP
+	select ARM_CCI400_PORT_CTRL if SMP
 
 endif
diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile
index 27b168f121a1..cd25d9d81257 100644
--- a/arch/arm/mach-sunxi/Makefile
+++ b/arch/arm/mach-sunxi/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_ARCH_SUNXI) += sunxi.o
+obj-$(CONFIG_MCPM) += mcpm.o
 obj-$(CONFIG_SMP) += platsmp.o
diff --git a/arch/arm/mach-sunxi/mcpm.c b/arch/arm/mach-sunxi/mcpm.c
new file mode 100644
index 000000000000..30719998f3f0
--- /dev/null
+++ b/arch/arm/mach-sunxi/mcpm.c
@@ -0,0 +1,425 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2015 Chen-Yu Tsai
+ *
+ * Chen-Yu Tsai <wens@csie.org>
+ *
+ * arch/arm/mach-sunxi/mcpm.c
+ *
+ * Based on arch/arm/mach-exynos/mcpm-exynos.c and Allwinner code
+ */
+
+#include <linux/arm-cci.h>
+#include <linux/delay.h>
+#include <linux/io.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_device.h>
+
+#include <asm/cputype.h>
+#include <asm/cp15.h>
+#include <asm/mcpm.h>
+
+#define SUNXI_CPUS_PER_CLUSTER		4
+#define SUNXI_NR_CLUSTERS		2
+
+#define CPUCFG_CX_CTRL_REG0(c)		(0x10 * (c))
+#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(n)	BIT(n)
+#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL	0xf
+#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7	BIT(4)
+#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15	BIT(0)
+#define CPUCFG_CX_CTRL_REG1(c)		(0x10 * (c) + 0x4)
+#define CPUCFG_CX_CTRL_REG1_ACINACTM	BIT(0)
+#define CPUCFG_CX_RST_CTRL(c)		(0x80 + 0x4 * (c))
+#define CPUCFG_CX_RST_CTRL_DBG_SOC_RST	BIT(24)
+#define CPUCFG_CX_RST_CTRL_ETM_RST(n)	BIT(20 + (n))
+#define CPUCFG_CX_RST_CTRL_ETM_RST_ALL	(0xf << 20)
+#define CPUCFG_CX_RST_CTRL_DBG_RST(n)	BIT(16 + (n))
+#define CPUCFG_CX_RST_CTRL_DBG_RST_ALL	(0xf << 16)
+#define CPUCFG_CX_RST_CTRL_H_RST	BIT(12)
+#define CPUCFG_CX_RST_CTRL_L2_RST	BIT(8)
+#define CPUCFG_CX_RST_CTRL_CX_RST(n)	BIT(4 + (n))
+#define CPUCFG_CX_RST_CTRL_CORE_RST(n)	BIT(n)
+
+#define PRCM_CPU_PO_RST_CTRL(c)		(0x4 + 0x4 * (c))
+#define PRCM_CPU_PO_RST_CTRL_CORE(n)	BIT(n)
+#define PRCM_CPU_PO_RST_CTRL_CORE_ALL	0xf
+#define PRCM_PWROFF_GATING_REG(c)	(0x100 + 0x4 * (c))
+#define PRCM_PWROFF_GATING_REG_CLUSTER	BIT(4)
+#define PRCM_PWROFF_GATING_REG_CORE(n)	BIT(n)
+#define PRCM_PWR_SWITCH_REG(c, cpu)	(0x140 + 0x10 * (c) + 0x4 * (cpu))
+#define PRCM_CPU_SOFT_ENTRY_REG		0x164
+
+static void __iomem *cpucfg_base;
+static void __iomem *prcm_base;
+
+static bool sunxi_core_is_cortex_a15(unsigned int core, unsigned int cluster)
+{
+	struct device_node *node;
+	int cpu = cluster * SUNXI_CPUS_PER_CLUSTER + core;
+
+	node = of_cpu_device_node_get(cpu);
+
+	/* In case of_cpu_device_node_get fails */
+	if (!node)
+		node = of_get_cpu_node(cpu, NULL);
+
+	if (!node) {
+		/*
+		 * There's no point in returning an error, since we
+		 * would be mid way in a core or cluster power sequence.
+		 */
+		pr_err("%s: Couldn't get CPU cluster %u core %u device node\n",
+		       __func__, cluster, core);
+
+		return false;
+	}
+
+	return of_device_is_compatible(node, "arm,cortex-a15");
+}
+
+static int sunxi_cpu_power_switch_set(unsigned int cpu, unsigned int cluster,
+				      bool enable)
+{
+	u32 reg;
+
+	/* control sequence from Allwinner A80 user manual v1.2 PRCM section */
+	reg = readl(prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+	if (enable) {
+		if (reg == 0x00) {
+			pr_debug("power clamp for cluster %u cpu %u already open\n",
+				 cluster, cpu);
+			return 0;
+		}
+
+		writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+		udelay(10);
+		writel(0xfe, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+		udelay(10);
+		writel(0xf8, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+		udelay(10);
+		writel(0xf0, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+		udelay(10);
+		writel(0x00, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+		udelay(10);
+	} else {
+		writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+		udelay(10);
+	}
+
+	return 0;
+}
+
+static int sunxi_cpu_powerup(unsigned int cpu, unsigned int cluster)
+{
+	u32 reg;
+
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
+		return -EINVAL;
+
+	/* assert processor power-on reset */
+	reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+	reg &= ~PRCM_CPU_PO_RST_CTRL_CORE(cpu);
+	writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+
+	/* Cortex-A7: hold L1 reset disable signal low */
+	if (!sunxi_core_is_cortex_a15(cpu, cluster)) {
+		reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
+		reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(cpu);
+		writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
+	}
+
+	/* assert processor related resets */
+	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+	reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
+
+	/*
+	 * Allwinner code also asserts resets for NEON on A15. According
+	 * to ARM manuals, asserting power-on reset is sufficient.
+	 */
+	if (!sunxi_core_is_cortex_a15(cpu, cluster))
+		reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
+
+	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+	/* open power switch */
+	sunxi_cpu_power_switch_set(cpu, cluster, true);
+
+	/* clear processor power gate */
+	reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	reg &= ~PRCM_PWROFF_GATING_REG_CORE(cpu);
+	writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	udelay(20);
+
+	/* de-assert processor power-on reset */
+	reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+	reg |= PRCM_CPU_PO_RST_CTRL_CORE(cpu);
+	writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+
+	/* de-assert all processor resets */
+	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+	reg |= CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
+	reg |= CPUCFG_CX_RST_CTRL_CORE_RST(cpu);
+	if (!sunxi_core_is_cortex_a15(cpu, cluster))
+		reg |= CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
+	else
+		reg |= CPUCFG_CX_RST_CTRL_CX_RST(cpu); /* NEON */
+	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+	return 0;
+}
+
+static int sunxi_cluster_powerup(unsigned int cluster)
+{
+	u32 reg;
+
+	pr_debug("%s: cluster %u\n", __func__, cluster);
+	if (cluster >= SUNXI_NR_CLUSTERS)
+		return -EINVAL;
+
+	/* assert ACINACTM */
+	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+	reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
+	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+
+	/* assert cluster processor power-on resets */
+	reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+	reg &= ~PRCM_CPU_PO_RST_CTRL_CORE_ALL;
+	writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+
+	/* assert cluster resets */
+	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+	reg &= ~CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
+	reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST_ALL;
+	reg &= ~CPUCFG_CX_RST_CTRL_H_RST;
+	reg &= ~CPUCFG_CX_RST_CTRL_L2_RST;
+
+	/*
+	 * Allwinner code also asserts resets for NEON on A15. According
+	 * to ARM manuals, asserting power-on reset is sufficient.
+	 */
+	if (!sunxi_core_is_cortex_a15(0, cluster))
+		reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST_ALL;
+
+	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+	/* hold L1/L2 reset disable signals low */
+	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
+	if (sunxi_core_is_cortex_a15(0, cluster)) {
+		/* Cortex-A15: hold L2RSTDISABLE low */
+		reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15;
+	} else {
+		/* Cortex-A7: hold L1RSTDISABLE and L2RSTDISABLE low */
+		reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL;
+		reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7;
+	}
+	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
+
+	/* clear cluster power gate */
+	reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	reg &= ~PRCM_PWROFF_GATING_REG_CLUSTER;
+	writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	udelay(20);
+
+	/* de-assert cluster resets */
+	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+	reg |= CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
+	reg |= CPUCFG_CX_RST_CTRL_H_RST;
+	reg |= CPUCFG_CX_RST_CTRL_L2_RST;
+	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+	/* de-assert ACINACTM */
+	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+	reg &= ~CPUCFG_CX_CTRL_REG1_ACINACTM;
+	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+
+	return 0;
+}
+
+static void sunxi_cpu_cache_disable(void)
+{
+	/* Disable and flush the local CPU cache. */
+	v7_exit_coherency_flush(louis);
+}
+
+/*
+ * This bit is shared between the initial mcpm_sync_init call to enable
+ * CCI-400 and proper cluster cache disable before power down.
+ */
+static void sunxi_cluster_cache_disable_without_axi(void)
+{
+	if (read_cpuid_part() == ARM_CPU_PART_CORTEX_A15) {
+		/*
+		 * On the Cortex-A15 we need to disable
+		 * L2 prefetching before flushing the cache.
+		 */
+		asm volatile(
+		"mcr	p15, 1, %0, c15, c0, 3\n"
+		"isb\n"
+		"dsb"
+		: : "r" (0x400));
+	}
+
+	/* Flush all cache levels for this cluster. */
+	v7_exit_coherency_flush(all);
+
+	/*
+	 * Disable cluster-level coherency by masking
+	 * incoming snoops and DVM messages:
+	 */
+	cci_disable_port_by_cpu(read_cpuid_mpidr());
+}
+
+static void sunxi_cluster_cache_disable(void)
+{
+	unsigned int cluster = MPIDR_AFFINITY_LEVEL(read_cpuid_mpidr(), 1);
+	u32 reg;
+
+	pr_debug("%s: cluster %u\n", __func__, cluster);
+
+	sunxi_cluster_cache_disable_without_axi();
+
+	/* last man standing, assert ACINACTM */
+	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+	reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
+	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+}
+
+static const struct mcpm_platform_ops sunxi_power_ops = {
+	.cpu_powerup		= sunxi_cpu_powerup,
+	.cluster_powerup	= sunxi_cluster_powerup,
+	.cpu_cache_disable	= sunxi_cpu_cache_disable,
+	.cluster_cache_disable	= sunxi_cluster_cache_disable,
+};
+
+/*
+ * Enable cluster-level coherency, in preparation for turning on the MMU.
+ *
+ * Also enable regional clock gating and L2 data latency settings for
+ * Cortex-A15. These settings are from the vendor kernel.
+ */
+static void __naked sunxi_power_up_setup(unsigned int affinity_level)
+{
+	asm volatile (
+		"mrc	p15, 0, r1, c0, c0, 0\n"
+		"movw	r2, #" __stringify(ARM_CPU_PART_MASK & 0xffff) "\n"
+		"movt	r2, #" __stringify(ARM_CPU_PART_MASK >> 16) "\n"
+		"and	r1, r1, r2\n"
+		"movw	r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 & 0xffff) "\n"
+		"movt	r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 >> 16) "\n"
+		"cmp	r1, r2\n"
+		"bne	not_a15\n"
+
+		/* The following is Cortex-A15 specific */
+
+		/* ACTLR2: Enable CPU regional clock gates */
+		"mrc p15, 1, r1, c15, c0, 4\n"
+		"orr r1, r1, #(0x1<<31)\n"
+		"mcr p15, 1, r1, c15, c0, 4\n"
+
+		/* L2ACTLR */
+		"mrc p15, 1, r1, c15, c0, 0\n"
+		/* Enable L2, GIC, and Timer regional clock gates */
+		"orr r1, r1, #(0x1<<26)\n"
+		/* Disable clean/evict from being pushed to external */
+		"orr r1, r1, #(0x1<<3)\n"
+		"mcr p15, 1, r1, c15, c0, 0\n"
+
+		/* L2CTRL: L2 data RAM latency */
+		"mrc p15, 1, r1, c9, c0, 2\n"
+		"bic r1, r1, #(0x7<<0)\n"
+		"orr r1, r1, #(0x3<<0)\n"
+		"mcr p15, 1, r1, c9, c0, 2\n"
+
+		/* End of Cortex-A15 specific setup */
+		"not_a15:\n"
+
+		"cmp	r0, #1\n"
+		"bxne	lr\n"
+		"b	cci_enable_port_for_self"
+	);
+}
+
+static void sunxi_mcpm_setup_entry_point(void)
+{
+	__raw_writel(virt_to_phys(mcpm_entry_point),
+		     prcm_base + PRCM_CPU_SOFT_ENTRY_REG);
+}
+
+static int __init sunxi_mcpm_init(void)
+{
+	struct device_node *cpucfg_node, *node;
+	struct resource res;
+	int ret;
+
+	if (!of_machine_is_compatible("allwinner,sun9i-a80"))
+		return -ENODEV;
+
+	if (!cci_probed())
+		return -ENODEV;
+
+	node = of_find_compatible_node(NULL, NULL, "allwinner,sun9i-a80-prcm");
+	if (!node)
+		return -ENODEV;
+
+	/*
+	 * Unfortunately we can not request the I/O region for the PRCM.
+	 * It is shared with the PRCM clock.
+	 */
+	prcm_base = of_iomap(node, 0);
+	of_node_put(node);
+	if (!prcm_base) {
+		pr_err("%s: failed to map PRCM registers\n", __func__);
+		return -ENOMEM;
+	}
+
+	cpucfg_node = of_find_compatible_node(NULL, NULL,
+					      "allwinner,sun9i-a80-cpucfg");
+	if (!cpucfg_node) {
+		ret = -ENODEV;
+		goto err_unmap_prcm;
+	}
+
+	cpucfg_base = of_io_request_and_map(cpucfg_node, 0, "sunxi-mcpm");
+	if (IS_ERR(cpucfg_base)) {
+		ret = PTR_ERR(cpucfg_base);
+		pr_err("%s: failed to map CPUCFG registers: %d\n",
+		       __func__, ret);
+		goto err_put_cpucfg_node;
+	}
+
+	ret = mcpm_platform_register(&sunxi_power_ops);
+	if (!ret)
+		ret = mcpm_sync_init(sunxi_power_up_setup);
+	if (!ret)
+		/* do not disable AXI master as no one will re-enable it */
+		ret = mcpm_loopback(sunxi_cluster_cache_disable_without_axi);
+	if (ret)
+		goto err_unmap_release_cpucfg;
+
+	/* We don't need the CPUCFG device node anymore */
+	of_node_put(cpucfg_node);
+
+	/* Set the hardware entry point address */
+	sunxi_mcpm_setup_entry_point();
+
+	/* Actually enable MCPM */
+	mcpm_smp_set_ops();
+
+	pr_info("sunxi MCPM support installed\n");
+
+	return ret;
+
+err_unmap_release_cpucfg:
+	iounmap(cpucfg_base);
+	of_address_to_resource(cpucfg_node, 0, &res);
+	release_mem_region(res.start, resource_size(&res));
+err_put_cpucfg_node:
+	of_node_put(cpucfg_node);
+err_unmap_prcm:
+	iounmap(prcm_base);
+	return ret;
+}
+
+early_initcall(sunxi_mcpm_init);
-- 
2.15.1

* [PATCH v2 2/8] ARM: dts: sun9i: Add CCI-400 device nodes for A80
@ 2018-01-04 14:37   ` Chen-Yu Tsai
  0 siblings, 0 replies; 38+ messages in thread
From: Chen-Yu Tsai @ 2018-01-04 14:37 UTC (permalink / raw)
  To: Maxime Ripard, Russell King, Rob Herring, Mark Rutland
  Cc: Mylene JOSSERAND, Chen-Yu Tsai, devicetree, linux-arm-kernel,
	linux-kernel, linux-sunxi, Nicolas Pitre, Dave Martin

The A80 includes an ARM CCI-400 interconnect to keep the CPU caches of
its two clusters coherent.

Also add the maximum clock frequency for the CPUs, as listed in the
A80 Optimus Board FEX file.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
---
 arch/arm/boot/dts/sun9i-a80.dtsi | 46 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/arch/arm/boot/dts/sun9i-a80.dtsi b/arch/arm/boot/dts/sun9i-a80.dtsi
index 90eac0b2a193..85fb800af8ab 100644
--- a/arch/arm/boot/dts/sun9i-a80.dtsi
+++ b/arch/arm/boot/dts/sun9i-a80.dtsi
@@ -63,48 +63,64 @@
 		cpu0: cpu@0 {
 			compatible = "arm,cortex-a7";
 			device_type = "cpu";
+			cci-control-port = <&cci_control0>;
+			clock-frequency = <12000000>;
 			reg = <0x0>;
 		};
 
 		cpu1: cpu@1 {
 			compatible = "arm,cortex-a7";
 			device_type = "cpu";
+			cci-control-port = <&cci_control0>;
+			clock-frequency = <12000000>;
 			reg = <0x1>;
 		};
 
 		cpu2: cpu@2 {
 			compatible = "arm,cortex-a7";
 			device_type = "cpu";
+			cci-control-port = <&cci_control0>;
+			clock-frequency = <12000000>;
 			reg = <0x2>;
 		};
 
 		cpu3: cpu@3 {
 			compatible = "arm,cortex-a7";
 			device_type = "cpu";
+			cci-control-port = <&cci_control0>;
+			clock-frequency = <12000000>;
 			reg = <0x3>;
 		};
 
 		cpu4: cpu@100 {
 			compatible = "arm,cortex-a15";
 			device_type = "cpu";
+			cci-control-port = <&cci_control1>;
+			clock-frequency = <18000000>;
 			reg = <0x100>;
 		};
 
 		cpu5: cpu@101 {
 			compatible = "arm,cortex-a15";
 			device_type = "cpu";
+			cci-control-port = <&cci_control1>;
+			clock-frequency = <18000000>;
 			reg = <0x101>;
 		};
 
 		cpu6: cpu@102 {
 			compatible = "arm,cortex-a15";
 			device_type = "cpu";
+			cci-control-port = <&cci_control1>;
+			clock-frequency = <18000000>;
 			reg = <0x102>;
 		};
 
 		cpu7: cpu@103 {
 			compatible = "arm,cortex-a15";
 			device_type = "cpu";
+			cci-control-port = <&cci_control1>;
+			clock-frequency = <18000000>;
 			reg = <0x103>;
 		};
 	};
@@ -431,6 +447,36 @@
 			interrupts = <GIC_PPI 9 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_HIGH)>;
 		};
 
+		cci: cci@1c90000 {
+			compatible = "arm,cci-400";
+			#address-cells = <1>;
+			#size-cells = <1>;
+			reg = <0x01c90000 0x1000>;
+			ranges = <0x0 0x01c90000 0x10000>;
+
+			cci_control0: slave-if@4000 {
+				compatible = "arm,cci-400-ctrl-if";
+				interface-type = "ace";
+				reg = <0x4000 0x1000>;
+			};
+
+			cci_control1: slave-if@5000 {
+				compatible = "arm,cci-400-ctrl-if";
+				interface-type = "ace";
+				reg = <0x5000 0x1000>;
+			};
+
+			pmu@9000 {
+				 compatible = "arm,cci-400-pmu,r1";
+				 reg = <0x9000 0x5000>;
+				 interrupts = <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>,
+					      <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>,
+					      <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>,
+					      <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>,
+					      <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>;
+			};
+		};
+
 		de_clocks: clock@3000000 {
 			compatible = "allwinner,sun9i-a80-de-clks";
 			reg = <0x03000000 0x30>;
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 3/8] ARM: dts: sun9i: Add CPUCFG device node for A80 dtsi
@ 2018-01-04 14:37   ` Chen-Yu Tsai
  0 siblings, 0 replies; 38+ messages in thread
From: Chen-Yu Tsai @ 2018-01-04 14:37 UTC (permalink / raw)
  To: Maxime Ripard, Russell King, Rob Herring, Mark Rutland
  Cc: Mylene JOSSERAND, Chen-Yu Tsai, devicetree, linux-arm-kernel,
	linux-kernel, linux-sunxi, Nicolas Pitre, Dave Martin

CPUCFG is a collection of registers mapped to control and status signals
of each individual processor core and its associated peripherals, such as
resets for the processors, the L1/L2 caches, and related logic.

These registers are used for SMP bringup and CPU hotplugging.
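
For reference, below is a minimal sketch of how platform SMP code such as
the MCPM driver earlier in this series consumes this node: look it up by
compatible, then request and map its MMIO range. It is illustrative only;
the helper name sunxi_map_cpucfg() is made up and error handling is
reduced to the essentials.

	/* Illustrative sketch, not part of the patch. */
	#include <linux/err.h>
	#include <linux/io.h>
	#include <linux/of.h>
	#include <linux/of_address.h>

	static void __iomem *sunxi_map_cpucfg(void)
	{
		struct device_node *node;
		void __iomem *base;

		node = of_find_compatible_node(NULL, NULL,
					       "allwinner,sun9i-a80-cpucfg");
		if (!node)
			return IOMEM_ERR_PTR(-ENODEV);

		/* claim the range from the "reg" property and map it */
		base = of_io_request_and_map(node, 0, "sunxi-mcpm");
		of_node_put(node);

		return base;		/* IS_ERR(base) on failure */
	}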

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
---
 arch/arm/boot/dts/sun9i-a80.dtsi | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/boot/dts/sun9i-a80.dtsi b/arch/arm/boot/dts/sun9i-a80.dtsi
index 85fb800af8ab..85ecb4d64cfd 100644
--- a/arch/arm/boot/dts/sun9i-a80.dtsi
+++ b/arch/arm/boot/dts/sun9i-a80.dtsi
@@ -363,6 +363,11 @@
 			#reset-cells = <1>;
 		};
 
+		cpucfg@1700000 {
+			compatible = "allwinner,sun9i-a80-cpucfg";
+			reg = <0x01700000 0x100>;
+		};
+
 		mmc0: mmc@1c0f000 {
 			compatible = "allwinner,sun9i-a80-mmc";
 			reg = <0x01c0f000 0x1000>;
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 4/8] ARM: dts: sun9i: Add PRCM device node for the A80 dtsi
@ 2018-01-04 14:37   ` Chen-Yu Tsai
  0 siblings, 0 replies; 38+ messages in thread
From: Chen-Yu Tsai @ 2018-01-04 14:37 UTC (permalink / raw)
  To: Maxime Ripard, Russell King, Rob Herring, Mark Rutland
  Cc: Mylene JOSSERAND, Chen-Yu Tsai, devicetree, linux-arm-kernel,
	linux-kernel, linux-sunxi, Nicolas Pitre, Dave Martin

The PRCM is a collection of clock controls, reset controls, and various
power switches/gates. Some of these can be listed and supported
independently, while a number of CPU-related ones are used in tandem with
CPUCFG for SMP bringup and CPU hotplugging.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
---
 arch/arm/boot/dts/sun9i-a80.dtsi | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/boot/dts/sun9i-a80.dtsi b/arch/arm/boot/dts/sun9i-a80.dtsi
index 85ecb4d64cfd..bf4d40e8359f 100644
--- a/arch/arm/boot/dts/sun9i-a80.dtsi
+++ b/arch/arm/boot/dts/sun9i-a80.dtsi
@@ -709,6 +709,11 @@
 			interrupts = <GIC_SPI 36 IRQ_TYPE_LEVEL_HIGH>;
 		};
 
+		prcm@8001400 {
+			compatible = "allwinner,sun9i-a80-prcm";
+			reg = <0x08001400 0x200>;
+		};
+
 		apbs_rst: reset@80014b0 {
 			reg = <0x080014b0 0x4>;
 			compatible = "allwinner,sun6i-a31-clock-reset";
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 5/8] ARM: sun9i: mcpm: Support CPU/cluster power down and hotplugging for cpu1~7
@ 2018-01-04 14:37   ` Chen-Yu Tsai
  0 siblings, 0 replies; 38+ messages in thread
From: Chen-Yu Tsai @ 2018-01-04 14:37 UTC (permalink / raw)
  To: Maxime Ripard, Russell King, Rob Herring, Mark Rutland
  Cc: Mylene JOSSERAND, Chen-Yu Tsai, devicetree, linux-arm-kernel,
	linux-kernel, linux-sunxi, Nicolas Pitre, Dave Martin

This patch adds the common code used to power down all cores and clusters.
The code is quite long. The common MCPM library does not provide a
callback for performing the actual power down sequence; it assumes some
other agent (such as a power management coprocessor) will handle it.
Since our platform has no such agent, we resort to an ordered,
single-threaded workqueue, relying on work items executing in the order
they were queued so that the cluster power down work never runs before
the core power down work. If it did, the result would not be catastrophic,
but it would leave the cluster powered on and wasting power.

This might be a bit racy, as nothing prevents the system from bringing a
core or cluster back up before the asynchronous work has shut it down.
This is most likely to happen on a heavily loaded system whose scheduler
brings cores in and out frequently. It would result in either a stall on
a single core or, worse, a system hang if a cluster is abruptly turned
off. In simple use cases it performs OK.
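
As a side note, a minimal sketch of the ordering guarantee relied on here
is shown below (illustrative only; all names are made up). An ordered
workqueue has max_active fixed at one, so work items execute strictly in
queueing order, which is why queueing the per-cpu work before the cluster
work is enough to serialize the two power down steps.

	#include <linux/workqueue.h>

	static struct workqueue_struct *wq;
	static struct work_struct cpu_off_work, cluster_off_work;

	static void cpu_off(struct work_struct *work)
	{
		/* power down one core here */
	}

	static void cluster_off(struct work_struct *work)
	{
		/* power down the cluster here */
	}

	static int example_setup(void)
	{
		wq = alloc_ordered_workqueue("%s", 0, "example-ordered");
		if (!wq)
			return -ENOMEM;

		INIT_WORK(&cpu_off_work, cpu_off);
		INIT_WORK(&cluster_off_work, cluster_off);

		/* cluster_off() cannot start before cpu_off() has finished */
		queue_work(wq, &cpu_off_work);
		queue_work(wq, &cluster_off_work);
		return 0;
	}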

The primary core (cpu0) requires setting flags to have the BROM bounce
execution to the SMP software entry code. This is done in a subsequent
patch to keep the changes cleanly separated.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
---
 arch/arm/mach-sunxi/mcpm.c | 170 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 165 insertions(+), 5 deletions(-)

diff --git a/arch/arm/mach-sunxi/mcpm.c b/arch/arm/mach-sunxi/mcpm.c
index 30719998f3f0..ddc26b5fec48 100644
--- a/arch/arm/mach-sunxi/mcpm.c
+++ b/arch/arm/mach-sunxi/mcpm.c
@@ -12,9 +12,12 @@
 #include <linux/arm-cci.h>
 #include <linux/delay.h>
 #include <linux/io.h>
+#include <linux/iopoll.h>
+#include <linux/irqchip/arm-gic.h>
 #include <linux/of.h>
 #include <linux/of_address.h>
 #include <linux/of_device.h>
+#include <linux/workqueue.h>
 
 #include <asm/cputype.h>
 #include <asm/cp15.h>
@@ -30,6 +33,9 @@
 #define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15	BIT(0)
 #define CPUCFG_CX_CTRL_REG1(c)		(0x10 * (c) + 0x4)
 #define CPUCFG_CX_CTRL_REG1_ACINACTM	BIT(0)
+#define CPUCFG_CX_STATUS(c)		(0x30 + 0x4 * (c))
+#define CPUCFG_CX_STATUS_STANDBYWFI(n)	BIT(16 + (n))
+#define CPUCFG_CX_STATUS_STANDBYWFIL2	BIT(0)
 #define CPUCFG_CX_RST_CTRL(c)		(0x80 + 0x4 * (c))
 #define CPUCFG_CX_RST_CTRL_DBG_SOC_RST	BIT(24)
 #define CPUCFG_CX_RST_CTRL_ETM_RST(n)	BIT(20 + (n))
@@ -237,6 +243,30 @@ static int sunxi_cluster_powerup(unsigned int cluster)
 	return 0;
 }
 
+struct sunxi_mcpm_work {
+	struct work_struct work;
+	unsigned int cpu;
+	unsigned int cluster;
+};
+
+static struct workqueue_struct *sunxi_mcpm_wq;
+static struct sunxi_mcpm_work sunxi_mcpm_cpu_down_work[2][4];
+static struct sunxi_mcpm_work sunxi_mcpm_cluster_down_work[2];
+
+static void sunxi_cpu_powerdown_prepare(unsigned int cpu, unsigned int cluster)
+{
+	gic_cpu_if_down(0);
+
+	queue_work(sunxi_mcpm_wq,
+		   &sunxi_mcpm_cpu_down_work[cluster][cpu].work);
+}
+
+static void sunxi_cluster_powerdown_prepare(unsigned int cluster)
+{
+	queue_work(sunxi_mcpm_wq,
+		   &sunxi_mcpm_cluster_down_work[cluster].work);
+}
+
 static void sunxi_cpu_cache_disable(void)
 {
 	/* Disable and flush the local CPU cache. */
@@ -286,11 +316,116 @@ static void sunxi_cluster_cache_disable(void)
 	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
 }
 
+static int sunxi_do_cpu_powerdown(unsigned int cpu, unsigned int cluster)
+{
+	u32 reg;
+
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
+		return -EINVAL;
+
+	/* gate processor power */
+	reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	reg |= PRCM_PWROFF_GATING_REG_CORE(cpu);
+	writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	udelay(20);
+
+	/* close power switch */
+	sunxi_cpu_power_switch_set(cpu, cluster, false);
+
+	return 0;
+}
+
+static int sunxi_do_cluster_powerdown(unsigned int cluster)
+{
+	u32 reg;
+
+	pr_debug("%s: cluster %u\n", __func__, cluster);
+	if (cluster >= SUNXI_NR_CLUSTERS)
+		return -EINVAL;
+
+	/* assert cluster resets */
+	pr_debug("%s: assert cluster reset\n", __func__);
+	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+	reg &= ~CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
+	reg &= ~CPUCFG_CX_RST_CTRL_H_RST;
+	reg &= ~CPUCFG_CX_RST_CTRL_L2_RST;
+	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+	/* gate cluster power */
+	pr_debug("%s: gate cluster power\n", __func__);
+	reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	reg |= PRCM_PWROFF_GATING_REG_CLUSTER;
+	writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	udelay(20);
+
+	return 0;
+}
+
+static struct sunxi_mcpm_work *to_sunxi_mcpm_work(struct work_struct *work)
+{
+	return container_of(work, struct sunxi_mcpm_work, work);
+}
+
+/* async. work functions to bring down cpus and clusters */
+static void sunxi_cpu_powerdown(struct work_struct *_work)
+{
+	struct sunxi_mcpm_work *work = to_sunxi_mcpm_work(_work);
+	unsigned int cluster = work->cluster, cpu = work->cpu;
+	int ret;
+	u32 reg;
+
+	/* wait for CPU core to enter WFI */
+	ret = readl_poll_timeout(cpucfg_base + CPUCFG_CX_STATUS(cluster), reg,
+				 reg & CPUCFG_CX_STATUS_STANDBYWFI(cpu),
+				 1000, 100000);
+
+	if (ret)
+		return;
+
+	/* power down CPU core */
+	sunxi_do_cpu_powerdown(cpu, cluster);
+}
+
+static void sunxi_cluster_powerdown(struct work_struct *_work)
+{
+	struct sunxi_mcpm_work *work = to_sunxi_mcpm_work(_work);
+	unsigned int cluster = work->cluster;
+	int ret;
+	u32 reg;
+
+	pr_debug("%s: cluster %u\n", __func__, cluster);
+
+	/* wait for cluster L2 WFI */
+	ret = readl_poll_timeout(cpucfg_base + CPUCFG_CX_STATUS(cluster), reg,
+				 reg & CPUCFG_CX_STATUS_STANDBYWFIL2,
+				 1000, 100000);
+	if (ret)
+		return;
+
+	sunxi_do_cluster_powerdown(cluster);
+}
+
+static int sunxi_wait_for_powerdown(unsigned int cpu, unsigned int cluster)
+{
+	int ret;
+	u32 reg;
+
+	ret = readl_poll_timeout(prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu),
+				 reg, reg == 0xff, 1000, 100000);
+	pr_debug("%s: cpu %u cluster %u powerdown: %s\n", __func__,
+		 cpu, cluster, ret ? "timed out" : "done");
+	return ret;
+}
+
 static const struct mcpm_platform_ops sunxi_power_ops = {
-	.cpu_powerup		= sunxi_cpu_powerup,
-	.cluster_powerup	= sunxi_cluster_powerup,
-	.cpu_cache_disable	= sunxi_cpu_cache_disable,
-	.cluster_cache_disable	= sunxi_cluster_cache_disable,
+	.cpu_powerup		   = sunxi_cpu_powerup,
+	.cpu_powerdown_prepare	   = sunxi_cpu_powerdown_prepare,
+	.cluster_powerup	   = sunxi_cluster_powerup,
+	.cluster_powerdown_prepare = sunxi_cluster_powerdown_prepare,
+	.cpu_cache_disable	   = sunxi_cpu_cache_disable,
+	.cluster_cache_disable	   = sunxi_cluster_cache_disable,
+	.wait_for_powerdown	   = sunxi_wait_for_powerdown,
 };
 
 /*
@@ -352,6 +487,7 @@ static int __init sunxi_mcpm_init(void)
 	struct device_node *cpucfg_node, *node;
 	struct resource res;
 	int ret;
+	int i, j;
 
 	if (!of_machine_is_compatible("allwinner,sun9i-a80"))
 		return -ENODEV;
@@ -389,6 +525,28 @@ static int __init sunxi_mcpm_init(void)
 		goto err_put_cpucfg_node;
 	}
 
+	/* Initialize our strictly ordered workqueue */
+	sunxi_mcpm_wq = alloc_ordered_workqueue("%s", 0, "sunxi-mcpm");
+	if (!sunxi_mcpm_wq) {
+		ret = -ENOMEM;
+		pr_err("%s: failed to create our workqueue\n", __func__);
+		goto err_unmap_release_cpucfg;
+	}
+
+	/* Initialize power down work */
+	for (i = 0; i < SUNXI_NR_CLUSTERS; i++) {
+		for (j = 0; j < SUNXI_CPUS_PER_CLUSTER; j++) {
+			sunxi_mcpm_cpu_down_work[i][j].cluster = i;
+			sunxi_mcpm_cpu_down_work[i][j].cpu = j;
+			INIT_WORK(&sunxi_mcpm_cpu_down_work[i][j].work,
+				  sunxi_cpu_powerdown);
+		}
+
+		sunxi_mcpm_cluster_down_work[i].cluster = i;
+		INIT_WORK(&sunxi_mcpm_cluster_down_work[i].work,
+			  sunxi_cluster_powerdown);
+	}
+
 	ret = mcpm_platform_register(&sunxi_power_ops);
 	if (!ret)
 		ret = mcpm_sync_init(sunxi_power_up_setup);
@@ -396,7 +554,7 @@ static int __init sunxi_mcpm_init(void)
 		/* do not disable AXI master as no one will re-enable it */
 		ret = mcpm_loopback(sunxi_cluster_cache_disable_without_axi);
 	if (ret)
-		goto err_unmap_release_cpucfg;
+		goto err_destroy_workqueue;
 
 	/* We don't need the CPUCFG device node anymore */
 	of_node_put(cpucfg_node);
@@ -411,6 +569,8 @@ static int __init sunxi_mcpm_init(void)
 
 	return ret;
 
+err_destroy_workqueue:
+	destroy_workqueue(sunxi_mcpm_wq);
 err_unmap_release_cpucfg:
 	iounmap(cpucfg_base);
 	of_address_to_resource(cpucfg_node, 0, &res);
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 6/8] dt-bindings: ARM: sunxi: Document A80 SoC secure SRAM usage by SMP hotplug
@ 2018-01-04 14:37   ` Chen-Yu Tsai
  0 siblings, 0 replies; 38+ messages in thread
From: Chen-Yu Tsai @ 2018-01-04 14:37 UTC (permalink / raw)
  To: Maxime Ripard, Russell King, Rob Herring, Mark Rutland
  Cc: Mylene JOSSERAND, Chen-Yu Tsai, devicetree, linux-arm-kernel,
	linux-kernel, linux-sunxi, Nicolas Pitre, Dave Martin

On the Allwinner A80 SoC the BROM supports hotplugging the primary core
(cpu0) by checking two 32-bit values at a specific location within the
secure SRAM block. This region needs to be reserved and accessible to
the SMP code.

Document its usage.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
---
 .../devicetree/bindings/arm/sunxi/smp-sram.txt     | 44 ++++++++++++++++++++++
 1 file changed, 44 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/sunxi/smp-sram.txt

diff --git a/Documentation/devicetree/bindings/arm/sunxi/smp-sram.txt b/Documentation/devicetree/bindings/arm/sunxi/smp-sram.txt
new file mode 100644
index 000000000000..082e6a9382d3
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/sunxi/smp-sram.txt
@@ -0,0 +1,44 @@
+Allwinner SRAM for SMP bringup:
+------------------------------------------------
+
+Allwinner's A80 SoC uses part of the secure SRAM for hotplugging of the
+primary core (cpu0). Once the core is powered up, the BROM checks whether
+a magic value is set at a specific location. If it is, the BROM jumps
+to the software entry address instead of executing a standard boot.
+
+Therefore a reserved section sub-node has to be added to the mmio-sram
+declaration.
+
+Note that this is separate from the Allwinner SRAM controller found in
+../../sram/sunxi-sram.txt. This SRAM is secure-only and not mappable to
+any device.
+
+Also, there is no "secure-only" property. The implementation should
+check whether this SRAM is usable first.
+
+Required sub-node properties:
+- compatible : depending on the SoC this should be one of:
+		"allwinner,sun9i-a80-smp-sram"
+
+The rest of the properties should follow the generic mmio-sram description
+found in ../../misc/sram.txt.
+
+Example:
+
+	sram_b: sram@20000 {
+		/* 256 KiB secure SRAM at 0x20000 */
+		compatible = "mmio-sram";
+		reg = <0x00020000 0x40000>;
+		#address-cells = <1>;
+		#size-cells = <1>;
+		ranges = <0 0x00020000 0x40000>;
+
+		smp-sram@1000 {
+			/*
+			 * This is checked by BROM to determine if
+			 * cpu0 should jump to SMP entry vector
+			 */
+			compatible = "allwinner,sun9i-a80-smp-sram";
+			reg = <0x1000 0x8>;
+		};
+	};
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 7/8] ARM: sun9i: mcpm: Support cpu0 hotplug
@ 2018-01-04 14:37   ` Chen-Yu Tsai
  0 siblings, 0 replies; 38+ messages in thread
From: Chen-Yu Tsai @ 2018-01-04 14:37 UTC (permalink / raw)
  To: Maxime Ripard, Russell King, Rob Herring, Mark Rutland
  Cc: Mylene JOSSERAND, Chen-Yu Tsai, devicetree, linux-arm-kernel,
	linux-kernel, linux-sunxi, Nicolas Pitre, Dave Martin

The BROM has a branch that checks whether the primary core is being
hotplugged. If the magic flag is set, execution jumps to the address set
in the software entry register. (Secondary cores always branch to that
address.)

This patch sets the flags that make the BROM redirect execution on the
primary core (cpu0) to the SMP software entry code when the core is
powered back up. Once the core has been re-integrated into the system,
the flags are cleared.
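
For context, the BROM-side decision described above can be pictured
roughly as follows. This is a reconstruction from the description only,
not actual BROM code; cpu_is_primary(), sram_read(), read_soft_entry_reg(),
jump_to() and standard_boot() are hypothetical placeholders, while the two
magic values match CPU0_SUPPORT_HOTPLUG_MAGIC0/1 in this patch.

	/* Illustrative sketch of the BROM boot path, not real BROM code. */
	#include <stdint.h>

	extern int cpu_is_primary(void);
	extern uint32_t sram_read(unsigned int offset);	/* secure SMP SRAM */
	extern uint32_t read_soft_entry_reg(void);	/* PRCM + 0x164 */
	extern void jump_to(uint32_t addr);
	extern void standard_boot(void);

	void brom_boot_path(void)
	{
		if (!cpu_is_primary()) {
			/* secondary cores always take the software entry */
			jump_to(read_soft_entry_reg());
			return;
		}

		/* primary core: divert only if the hotplug flags are set */
		if (sram_read(0x0) == 0xFA50392F &&
		    sram_read(0x4) == 0x790DCA3A) {
			jump_to(read_soft_entry_reg());
			return;
		}

		standard_boot();
	}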

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
---
 arch/arm/mach-sunxi/mcpm.c | 54 +++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 51 insertions(+), 3 deletions(-)

diff --git a/arch/arm/mach-sunxi/mcpm.c b/arch/arm/mach-sunxi/mcpm.c
index ddc26b5fec48..c8d74c783644 100644
--- a/arch/arm/mach-sunxi/mcpm.c
+++ b/arch/arm/mach-sunxi/mcpm.c
@@ -56,8 +56,12 @@
 #define PRCM_PWR_SWITCH_REG(c, cpu)	(0x140 + 0x10 * (c) + 0x4 * (cpu))
 #define PRCM_CPU_SOFT_ENTRY_REG		0x164
 
+#define CPU0_SUPPORT_HOTPLUG_MAGIC0	0xFA50392F
+#define CPU0_SUPPORT_HOTPLUG_MAGIC1	0x790DCA3A
+
 static void __iomem *cpucfg_base;
 static void __iomem *prcm_base;
+static void __iomem *sram_b_smp_base;
 
 static bool sunxi_core_is_cortex_a15(unsigned int core, unsigned int cluster)
 {
@@ -116,6 +120,17 @@ static int sunxi_cpu_power_switch_set(unsigned int cpu, unsigned int cluster,
 	return 0;
 }
 
+static void sunxi_cpu0_hotplug_support_set(bool enable)
+{
+	if (enable) {
+		writel(CPU0_SUPPORT_HOTPLUG_MAGIC0, sram_b_smp_base);
+		writel(CPU0_SUPPORT_HOTPLUG_MAGIC1, sram_b_smp_base + 0x4);
+	} else {
+		writel(0x0, sram_b_smp_base);
+		writel(0x0, sram_b_smp_base + 0x4);
+	}
+}
+
 static int sunxi_cpu_powerup(unsigned int cpu, unsigned int cluster)
 {
 	u32 reg;
@@ -124,6 +139,10 @@ static int sunxi_cpu_powerup(unsigned int cpu, unsigned int cluster)
 	if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
 		return -EINVAL;
 
+	/* Set hotplug support magic flags for cpu0 */
+	if (cluster == 0 && cpu == 0)
+		sunxi_cpu0_hotplug_support_set(true);
+
 	/* assert processor power-on reset */
 	reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
 	reg &= ~PRCM_CPU_PO_RST_CTRL_CORE(cpu);
@@ -406,6 +425,13 @@ static void sunxi_cluster_powerdown(struct work_struct *_work)
 	sunxi_do_cluster_powerdown(cluster);
 }
 
+static void sunxi_cpu_is_up(unsigned int cpu, unsigned int cluster)
+{
+	/* Clear hotplug support magic flags for cpu0 */
+	if (cluster == 0 && cpu == 0)
+		sunxi_cpu0_hotplug_support_set(false);
+}
+
 static int sunxi_wait_for_powerdown(unsigned int cpu, unsigned int cluster)
 {
 	int ret;
@@ -425,6 +451,7 @@ static const struct mcpm_platform_ops sunxi_power_ops = {
 	.cluster_powerdown_prepare = sunxi_cluster_powerdown_prepare,
 	.cpu_cache_disable	   = sunxi_cpu_cache_disable,
 	.cluster_cache_disable	   = sunxi_cluster_cache_disable,
+	.cpu_is_up		   = sunxi_cpu_is_up,
 	.wait_for_powerdown	   = sunxi_wait_for_powerdown,
 };
 
@@ -484,7 +511,7 @@ static void sunxi_mcpm_setup_entry_point(void)
 
 static int __init sunxi_mcpm_init(void)
 {
-	struct device_node *cpucfg_node, *node;
+	struct device_node *cpucfg_node, *sram_node, *node;
 	struct resource res;
 	int ret;
 	int i, j;
@@ -525,12 +552,26 @@ static int __init sunxi_mcpm_init(void)
 		goto err_put_cpucfg_node;
 	}
 
+	sram_node = of_find_compatible_node(NULL, NULL,
+					    "allwinner,sun9i-a80-smp-sram");
+	if (!sram_node) {
+		ret = -ENODEV;
+		goto err_unmap_release_cpucfg;
+	}
+
+	sram_b_smp_base = of_io_request_and_map(sram_node, 0, "sunxi-mcpm");
+	if (IS_ERR(sram_b_smp_base)) {
+		ret = PTR_ERR(sram_b_smp_base);
+		pr_err("%s: failed to map secure SRAM\n", __func__);
+		goto err_put_sram_node;
+	}
+
 	/* Initialize our strictly ordered workqueue */
 	sunxi_mcpm_wq = alloc_ordered_workqueue("%s", 0, "sunxi-mcpm");
 	if (!sunxi_mcpm_wq) {
 		ret = -ENOMEM;
 		pr_err("%s: failed to create our workqueue\n", __func__);
-		goto err_unmap_release_cpucfg;
+		goto err_unmap_release_secure_sram;
 	}
 
 	/* Initialize power down work */
@@ -556,8 +597,9 @@ static int __init sunxi_mcpm_init(void)
 	if (ret)
 		goto err_destroy_workqueue;
 
-	/* We don't need the CPUCFG device node anymore */
+	/* We don't need the CPUCFG and SRAM device nodes anymore */
 	of_node_put(cpucfg_node);
+	of_node_put(sram_node);
 
 	/* Set the hardware entry point address */
 	sunxi_mcpm_setup_entry_point();
@@ -571,6 +613,12 @@ static int __init sunxi_mcpm_init(void)
 
 err_destroy_workqueue:
 	destroy_workqueue(sunxi_mcpm_wq);
+err_unmap_release_secure_sram:
+	iounmap(sram_b_smp_base);
+	of_address_to_resource(sram_node, 0, &res);
+	release_mem_region(res.start, resource_size(&res));
+err_put_sram_node:
+	of_node_put(sram_node);
 err_unmap_release_cpucfg:
 	iounmap(cpucfg_base);
 	of_address_to_resource(cpucfg_node, 0, &res);
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread
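
For illustration, here is a rough sketch of the BROM-side check described in
the commit message above. This is not the actual BROM code (the boot ROM is
mask ROM and its sources are not public): the function shape and the pointer
parameters are assumptions, with the magic values taken from this patch and
the flag location (secure SRAM B + 0x1000) taken from patch 8/8.

#include <stdint.h>

#define CPU0_SUPPORT_HOTPLUG_MAGIC0	0xFA50392F
#define CPU0_SUPPORT_HOTPLUG_MAGIC1	0x790DCA3A

/*
 * Hypothetical sketch only. smp_flags would point at secure SRAM B + 0x1000
 * and soft_entry_reg at the PRCM software entry register; both are passed
 * as parameters purely to avoid hardcoding physical addresses.
 */
static void brom_boot_path(int is_primary_core,
			   volatile const uint32_t *smp_flags,
			   volatile const uint32_t *soft_entry_reg)
{
	void (*entry)(void);

	if (!is_primary_core ||
	    (smp_flags[0] == CPU0_SUPPORT_HOTPLUG_MAGIC0 &&
	     smp_flags[1] == CPU0_SUPPORT_HOTPLUG_MAGIC1)) {
		/* Jump to the software entry point set by the SMP code. */
		entry = (void (*)(void))(uintptr_t)*soft_entry_reg;
		entry();
	}

	/* Primary core without the magic flags: continue the cold boot path. */
}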

* [PATCH v2 8/8] ARM: dts: sun9i: Add secure SRAM node used for MCPM SMP hotplug
@ 2018-01-04 14:37   ` Chen-Yu Tsai
  0 siblings, 0 replies; 38+ messages in thread
From: Chen-Yu Tsai @ 2018-01-04 14:37 UTC (permalink / raw)
  To: Maxime Ripard, Russell King, Rob Herring, Mark Rutland
  Cc: Mylene JOSSERAND, Chen-Yu Tsai, devicetree, linux-arm-kernel,
	linux-kernel, linux-sunxi, Nicolas Pitre, Dave Martin

The A80 stores some magic flags in a portion of the secure SRAM. The
BROM jumps directly to the software entry point set by the SMP code
if the flags are set. This is required for CPU0 hotplugging.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
---
 arch/arm/boot/dts/sun9i-a80.dtsi | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/arch/arm/boot/dts/sun9i-a80.dtsi b/arch/arm/boot/dts/sun9i-a80.dtsi
index bf4d40e8359f..b1c86b76ac3c 100644
--- a/arch/arm/boot/dts/sun9i-a80.dtsi
+++ b/arch/arm/boot/dts/sun9i-a80.dtsi
@@ -250,6 +250,25 @@
 		 */
 		ranges = <0 0 0 0x20000000>;
 
+		sram_b: sram@20000 {
+			/* 256 KiB secure SRAM at 0x20000 */
+			compatible = "mmio-sram";
+			reg = <0x00020000 0x40000>;
+
+			#address-cells = <1>;
+			#size-cells = <1>;
+			ranges = <0 0x00020000 0x40000>;
+
+			smp-sram@1000 {
+				/*
+				 * This is checked by BROM to determine if
+				 * cpu0 should jump to SMP entry vector
+				 */
+				compatible = "allwinner,sun9i-a80-smp-sram";
+				reg = <0x1000 0x8>;
+			};
+		};
+
 		ehci0: usb@a00000 {
 			compatible = "allwinner,sun9i-a80-ehci", "generic-ehci";
 			reg = <0x00a00000 0x100>;
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread
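
As a cross-reference, here is a condensed sketch of how this node is consumed
by the MCPM code added in patch 7/8: look the node up by compatible, then
request and map its registers. The helper name is hypothetical and the error
unwinding of the original is trimmed for brevity.

#include <linux/err.h>
#include <linux/io.h>
#include <linux/of.h>
#include <linux/of_address.h>

/* Hypothetical helper condensed from the lookup in patch 7/8. */
static void __iomem *sunxi_map_smp_sram(void)
{
	struct device_node *node;
	void __iomem *base;

	node = of_find_compatible_node(NULL, NULL,
				       "allwinner,sun9i-a80-smp-sram");
	if (!node)
		return IOMEM_ERR_PTR(-ENODEV);

	/* Maps reg = <0x1000 0x8>, i.e. the two 32-bit magic flag words. */
	base = of_io_request_and_map(node, 0, "sunxi-mcpm");
	of_node_put(node);

	return base;		/* IS_ERR() on failure */
}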

* Re: [PATCH v2 0/8] ARM: sun9i: SMP support with Multi-Cluster Power Management
  2018-01-04 14:37 ` Chen-Yu Tsai
@ 2018-01-04 14:58   ` Maxime Ripard
  -1 siblings, 0 replies; 38+ messages in thread
From: Maxime Ripard @ 2018-01-04 14:58 UTC (permalink / raw)
  To: Chen-Yu Tsai
  Cc: Russell King, Rob Herring, Mark Rutland, Mylene JOSSERAND,
	devicetree, linux-arm-kernel, linux-kernel, linux-sunxi,
	Nicolas Pitre, Dave Martin

On Thu, Jan 04, 2018 at 10:37:46PM +0800, Chen-Yu Tsai wrote:
> This is v2 of my sun9i SMP support with MCPM series which was started
> over two years ago [1]. We've tried to implement PSCI for both the A80
> and A83T. Results were not promising. The issue is that these two chips
> have a broken security extensions implementation. If a specific bit is
> not burned in its e-fuse, most if not all security protections don't
> work [2]. Even worse, non-secure access to the GIC become secure. This
> requires a crazy workaround in the GIC driver which probably doesn't work
> in all cases [3].
> 
> Nicolas mentioned that the MCPM framework is likely overkill in our
> case [4]. However the framework does provide cluster/core state tracking
> and proper sequencing of cache related operations. We could rework
> the code to use standard smp_ops, but I would like to actually get
> a working version in first.
> 
> Much of the sunxi-specific MCPM code is derived from Allwinner code and
> documentation, with some references to the other MCPM implementations,
> as well as the Cortex's Technical Reference Manuals for the power
> sequencing info.
> 
> One major difference compared to other platforms is we currently do not
> have a standalone PMU or other embedded firmware to do the actually power
> sequencing. All power/reset control is done by the kernel. Nicolas
> mentioned that a new optional callback should be added in cases where the
> kernel has to do the actual power down [5]. For now however I'm using a
> dedicated single thread workqueue. CPU and cluster power off work is
> queued from the .{cpu,cluster}_powerdown_prepare callbacks. This solution
> is somewhat heavy, as I have a total of 10 static work structs. It might
> also be a bit racy, as nothing prevents the system from bringing a core
> back before the asynchronous work shuts it down. This would likely
> happen under a heavily loaded system with a scheduler that brings cores
> in and out of the system frequently. In simple use-cases it performs OK.

It all looks sane to me.
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>

Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 0/8] ARM: sun9i: SMP support with Multi-Cluster Power Management
@ 2018-01-04 18:04     ` Lorenzo Pieralisi
  0 siblings, 0 replies; 38+ messages in thread
From: Lorenzo Pieralisi @ 2018-01-04 18:04 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: Chen-Yu Tsai, Russell King, Rob Herring, Mark Rutland,
	Mylene JOSSERAND, devicetree, linux-arm-kernel, linux-kernel,
	linux-sunxi, Nicolas Pitre, Dave Martin

On Thu, Jan 04, 2018 at 03:58:38PM +0100, Maxime Ripard wrote:
> On Thu, Jan 04, 2018 at 10:37:46PM +0800, Chen-Yu Tsai wrote:
> > This is v2 of my sun9i SMP support with MCPM series which was started
> > over two years ago [1]. We've tried to implement PSCI for both the A80
> > and A83T. Results were not promising. The issue is that these two chips
> > have a broken security extensions implementation. If a specific bit is
> > not burned in its e-fuse, most if not all security protections don't
> > work [2]. Even worse, non-secure access to the GIC become secure. This
> > requires a crazy workaround in the GIC driver which probably doesn't work
> > in all cases [3].
> > 
> > Nicolas mentioned that the MCPM framework is likely overkill in our
> > case [4]. However the framework does provide cluster/core state tracking
> > and proper sequencing of cache related operations. We could rework
> > the code to use standard smp_ops, but I would like to actually get
> > a working version in first.
> > 
> > Much of the sunxi-specific MCPM code is derived from Allwinner code and
> > documentation, with some references to the other MCPM implementations,
> > as well as the Cortex's Technical Reference Manuals for the power
> > sequencing info.
> > 
> > One major difference compared to other platforms is we currently do not
> > have a standalone PMU or other embedded firmware to do the actually power
> > sequencing. All power/reset control is done by the kernel. Nicolas
> > mentioned that a new optional callback should be added in cases where the
> > kernel has to do the actual power down [5]. For now however I'm using a
> > dedicated single thread workqueue. CPU and cluster power off work is
> > queued from the .{cpu,cluster}_powerdown_prepare callbacks. This solution
> > is somewhat heavy, as I have a total of 10 static work structs. It might
> > also be a bit racy, as nothing prevents the system from bringing a core
> > back before the asynchronous work shuts it down. This would likely
> > happen under a heavily loaded system with a scheduler that brings cores
> > in and out of the system frequently. In simple use-cases it performs OK.
> 
> It all looks sane to me
> Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>

It does not to me, sorry. You do not need MCPM (and workqueues) to
do SMP bring-up.

Nico explained why, just do it:

commit 905cdf9dda5d ("ARM: hisi/hip04: remove the MCPM overhead")

Lorenzo

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 0/8] ARM: sun9i: SMP support with Multi-Cluster Power Management
@ 2018-01-04 20:56   ` Nicolas Pitre
  0 siblings, 0 replies; 38+ messages in thread
From: Nicolas Pitre @ 2018-01-04 20:56 UTC (permalink / raw)
  To: Chen-Yu Tsai
  Cc: Maxime Ripard, Russell King, Rob Herring, Mark Rutland,
	Mylene JOSSERAND, devicetree, linux-arm-kernel, linux-kernel,
	linux-sunxi, Dave Martin

On Thu, 4 Jan 2018, Chen-Yu Tsai wrote:

> Nicolas mentioned that the MCPM framework is likely overkill in our
> case [4]. However the framework does provide cluster/core state tracking
> and proper sequencing of cache related operations. We could rework
> the code to use standard smp_ops, but I would like to actually get
> a working version in first.
> 
> [...] For now however I'm using a
> dedicated single thread workqueue. CPU and cluster power off work is
> queued from the .{cpu,cluster}_powerdown_prepare callbacks. This solution
> is somewhat heavy, as I have a total of 10 static work structs. It might
> also be a bit racy, as nothing prevents the system from bringing a core
> back before the asynchronous work shuts it down. This would likely
> happen under a heavily loaded system with a scheduler that brings cores
> in and out of the system frequently. In simple use-cases it performs OK.

If you know up front your code is racy, then this doesn't fully qualify
as a "working version". Furthermore, you're trading custom cluster/core
state tracking for workqueue handling, which doesn't look like a winning
tradeoff to me. Especially given you can't have asynchronous CPU wakeups
in hardware from an IRQ to deal with, the state tracking becomes very
simple.

If you hook into struct smp_operations directly, you'll have direct 
access to both .cpu_die and .cpu_kill methods which are executed on the 
target CPU and on a different CPU respectively, which is exactly what 
you need. Those calls are already serialized with .smp_boot_secondary so 
you don't have to worry about races. The only thing you need to protect 
against races is your cluster usage count. Your code will end up being 
simpler than what you have now. See arch/arm/mach-hisi/platmcpm.c for 
example.


Nicolas

^ permalink raw reply	[flat|nested] 38+ messages in thread
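
For reference, a minimal sketch of the smp_operations shape being suggested
here, modeled on the hip04 code Nicolas points to. The sunxi_* power helpers,
the enable-method string, and the locking granularity are assumptions
(reworked versions of what this series already provides); cluster usage
counting and the cache maintenance details are elided.

#include <linux/init.h>
#include <linux/smp.h>
#include <linux/spinlock.h>
#include <asm/cacheflush.h>
#include <asm/cputype.h>
#include <asm/smp.h>
#include <asm/smp_plat.h>

/* Assumed helpers, reworked from the MCPM callbacks in this series. */
extern int sunxi_cluster_powerup(unsigned int cluster);
extern int sunxi_cpu_powerup(unsigned int cpu, unsigned int cluster);
extern void sunxi_cpu_powerdown(unsigned int cpu, unsigned int cluster);
extern bool sunxi_cpu_in_wfi(unsigned int cpu, unsigned int cluster);

static DEFINE_SPINLOCK(boot_lock);

static int sunxi_smp_boot_secondary(unsigned int cpu, struct task_struct *idle)
{
	unsigned int mpidr = cpu_logical_map(cpu);
	unsigned int core = MPIDR_AFFINITY_LEVEL(mpidr, 0);
	unsigned int cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);

	spin_lock_irq(&boot_lock);
	/*
	 * Power up the cluster if needed, then the core; the entry point
	 * is already set in the PRCM software entry register.
	 */
	sunxi_cluster_powerup(cluster);
	sunxi_cpu_powerup(core, cluster);
	spin_unlock_irq(&boot_lock);

	return 0;
}

#ifdef CONFIG_HOTPLUG_CPU
static void sunxi_smp_cpu_die(unsigned int cpu)
{
	/* Runs on the dying CPU: drop out of coherency, then wait to be killed. */
	v7_exit_coherency_flush(louis);
	while (1)
		wfi();
}

static int sunxi_smp_cpu_kill(unsigned int cpu)
{
	unsigned int mpidr = cpu_logical_map(cpu);
	unsigned int core = MPIDR_AFFINITY_LEVEL(mpidr, 0);
	unsigned int cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);

	/* Runs on another CPU: wait for WFI, then cut power synchronously. */
	while (!sunxi_cpu_in_wfi(core, cluster))
		cpu_relax();

	spin_lock_irq(&boot_lock);
	sunxi_cpu_powerdown(core, cluster);
	spin_unlock_irq(&boot_lock);

	return 1;	/* 1 == the CPU was killed */
}
#endif

static const struct smp_operations sun9i_a80_smp_ops __initconst = {
	.smp_boot_secondary	= sunxi_smp_boot_secondary,
#ifdef CONFIG_HOTPLUG_CPU
	.cpu_die		= sunxi_smp_cpu_die,
	.cpu_kill		= sunxi_smp_cpu_kill,
#endif
};
CPU_METHOD_OF_DECLARE(sun9i_a80_smp, "allwinner,sun9i-a80-smp", &sun9i_a80_smp_ops);

Compared to the MCPM version, the power-down here is synchronous in .cpu_kill,
which removes both the workqueue and the race described in the cover letter.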

* Re: [PATCH v2 6/8] dt-bindings: ARM: sunxi: Document A80 SoC secure SRAM usage by SMP hotplug
@ 2018-01-09  3:40     ` Rob Herring
  0 siblings, 0 replies; 38+ messages in thread
From: Rob Herring @ 2018-01-09  3:40 UTC (permalink / raw)
  To: Chen-Yu Tsai
  Cc: Maxime Ripard, Russell King, Mark Rutland, Mylene JOSSERAND,
	devicetree, linux-arm-kernel, linux-kernel, linux-sunxi,
	Nicolas Pitre, Dave Martin

On Thu, Jan 04, 2018 at 10:37:52PM +0800, Chen-Yu Tsai wrote:
> On the Allwinner A80 SoC the BROM supports hotplugging the primary core
> (cpu0) by checking two 32bit values at a specific location within the
> secure SRAM block. This region needs to be reserved and accessible to
> the SMP code.
> 
> Document its usage.
> 
> Signed-off-by: Chen-Yu Tsai <wens@csie.org>
> ---
>  .../devicetree/bindings/arm/sunxi/smp-sram.txt     | 44 ++++++++++++++++++++++
>  1 file changed, 44 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/arm/sunxi/smp-sram.txt

Reviewed-by: Rob Herring <robh@kernel.org>

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2018-01-09  3:40 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-04 14:37 [PATCH v2 0/8] ARM: sun9i: SMP support with Multi-Cluster Power Management Chen-Yu Tsai
2018-01-04 14:37 ` [PATCH v2 1/8] ARM: sun9i: Support SMP on A80 with Multi-Cluster Power Management (MCPM) Chen-Yu Tsai
2018-01-04 14:37 ` [PATCH v2 2/8] ARM: dts: sun9i: Add CCI-400 device nodes for A80 Chen-Yu Tsai
2018-01-04 14:37 ` [PATCH v2 3/8] ARM: dts: sun9i: Add CPUCFG device node for A80 dtsi Chen-Yu Tsai
2018-01-04 14:37 ` [PATCH v2 4/8] ARM: dts: sun9i: Add PRCM device node for the " Chen-Yu Tsai
2018-01-04 14:37 ` [PATCH v2 5/8] ARM: sun9i: mcpm: Support CPU/cluster power down and hotplugging for cpu1~7 Chen-Yu Tsai
2018-01-04 14:37 ` [PATCH v2 6/8] dt-bindings: ARM: sunxi: Document A80 SoC secure SRAM usage by SMP hotplug Chen-Yu Tsai
2018-01-09  3:40   ` Rob Herring
2018-01-04 14:37 ` [PATCH v2 7/8] ARM: sun9i: mcpm: Support cpu0 hotplug Chen-Yu Tsai
2018-01-04 14:37 ` [PATCH v2 8/8] ARM: dts: sun9i: Add secure SRAM node used for MCPM SMP hotplug Chen-Yu Tsai
2018-01-04 14:58 ` [PATCH v2 0/8] ARM: sun9i: SMP support with Multi-Cluster Power Management Maxime Ripard
2018-01-04 18:04   ` Lorenzo Pieralisi
2018-01-04 20:56 ` Nicolas Pitre
