linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 0/8] ARM: sun9i: SMP and CPU hotplug support
@ 2018-01-15  7:14 Chen-Yu Tsai
  2018-01-15  7:14 ` [PATCH v3 1/8] ARM: sun9i: Support SMP bring-up on A80 Chen-Yu Tsai
                   ` (8 more replies)
  0 siblings, 9 replies; 13+ messages in thread
From: Chen-Yu Tsai @ 2018-01-15  7:14 UTC
  To: Maxime Ripard, Russell King
  Cc: Chen-Yu Tsai, devicetree, linux-arm-kernel, linux-kernel,
	linux-sunxi, Nicolas Pitre, Dave Martin

This is v3 of my sun9i SMP/hotplug support series, which was started
over two years ago [1]. We've tried to implement PSCI for both the A80
and A83T. Results were not promising. The issue is that these two chips
have a broken security extensions implementation. If a specific bit is
not burned in the e-fuse, most if not all security protections don't
work [2]. Even worse, non-secure accesses to the GIC become secure. This
requires a crazy workaround in the GIC driver which probably doesn't work
in all cases [3].

This version completely does away with the MCPM framework, instead just
implementing a set of smp_ops. Most of the code from the previous
version was reused, so the structure still has some traces of MCPM.
As our hardware has CCI-400, we still need some sort of MMU/cache
disabled trampoline code to enable cache coherency. Code for this
was adapted from the MCPM framework. This and the entry code are done
in inline assembly. Most of the other sunxi-specific code is derived
from Allwinner code and documentation, with some references to the
other MCPM implementations, as well as the Cortex-A7 and Cortex-A15
Technical Reference Manuals for the power sequencing.

This currently only works for !THUMB2_KERNEL. The entry code needs more
work before it runs with a Thumb kernel. I've looked at the ARM entry
code for this, but it seems my knowledge of ARM, Thumb mode, or PIC
programming is lacking, as I could not get it to work. If anyone is
interested, the remaining changes can be found here [4]. The Kconfig
symbol is guarded against this, so if the user chooses THUMB2_KERNEL,
they will lose SMP on this platform.

Hope we can get this version merged. A83T SMP support will be built on
it.

As an aside, THUMB2_KERNEL has not worked for me in some time. My custom
kernel config [5] boots, but gets an undefined instruction exception upon
returning from a syscall to userspace.

Regards
ChenYu

Changes since v2:
  - Do away with the MCPM framework, directly implement smp_ops
  - Some debug messages were clarified
  - New ARCH_SUNXI_MCPM Kconfig symbol for this feature

Changes since v1:
  - Leading zeroes for device node addresses removed
  - Added device tree binding for SMP SRAM
  - Simplified Kconfig options
  - Switched to SPDX license identifier
  - Map CPU to device tree node and check compatible to see if it's
    Cortex-A15 or Cortex-A7
  - Fix incorrect CPUCFG cluster status macro that prevented cluster
    0 L2 cache WFI detection
  - Fixed reversed bit for turning off cluster
  - Put cluster in reset before turning off power (or it hangs)
  - Added dedicated workqueue for turning off power to cpus and clusters
  - Request CPUCFG and SRAM MMIO ranges
  - Some comments fixed or added
  - Some debug messages added

[1] http://www.spinics.net/lists/arm-kernel/msg418350.html
[2] https://lists.denx.de/pipermail/u-boot/2017-June/294637.html
[3] https://github.com/wens/linux/commit/c48654c1f737116e7a7660183c8c74fa91970528
[4] https://github.com/wens/linux/commits/sun9i-smp-mcpm-v3
[5] http://wens.tw/a83t/arm-sunxi-config

Chen-Yu Tsai (8):
  ARM: sun9i: Support SMP bring-up on A80
  ARM: dts: sun9i: Add CCI-400 device nodes for A80
  ARM: dts: sun9i: Add CPUCFG device node for A80 dtsi
  ARM: dts: sun9i: Add PRCM device node for the A80 dtsi
  ARM: sun9i: mcpm: Support CPU/cluster power down and hotplugging for
    cpu1~7
  dt-bindings: ARM: sunxi: Document A80 SoC secure SRAM usage by SMP
    hotplug
  ARM: sun9i: mcpm: Support cpu0 hotplug
  ARM: dts: sun9i: Add secure SRAM node used for MCPM SMP hotplug

 .../devicetree/bindings/arm/sunxi/smp-sram.txt     |  44 ++
 arch/arm/boot/dts/sun9i-a80.dtsi                   |  75 ++
 arch/arm/mach-sunxi/Kconfig                        |   7 +
 arch/arm/mach-sunxi/Makefile                       |   1 +
 arch/arm/mach-sunxi/mcpm.c                         | 789 +++++++++++++++++++++
 5 files changed, 916 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/sunxi/smp-sram.txt
 create mode 100644 arch/arm/mach-sunxi/mcpm.c

-- 
2.15.1


* [PATCH v3 1/8] ARM: sun9i: Support SMP bring-up on A80
  2018-01-15  7:14 [PATCH v3 0/8] ARM: sun9i: SMP and CPU hotplug support Chen-Yu Tsai
@ 2018-01-15  7:14 ` Chen-Yu Tsai
  2018-01-15 12:04   ` Dave Martin
  2018-01-15  7:14 ` [PATCH v3 2/8] ARM: dts: sun9i: Add CCI-400 device nodes for A80 Chen-Yu Tsai
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 13+ messages in thread
From: Chen-Yu Tsai @ 2018-01-15  7:14 UTC
  To: Maxime Ripard, Russell King
  Cc: Chen-Yu Tsai, devicetree, linux-arm-kernel, linux-kernel,
	linux-sunxi, Nicolas Pitre, Dave Martin

The A80 is a big.LITTLE SoC with 1 cluster of 4 Cortex-A7s and
1 cluster of 4 Cortex-A15s.

This patch adds support to bring up the second cluster and thus all
cores using custom platform SMP code. Core/cluster power down has not
been implemented, thus CPU hotplugging is not supported.
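
For illustration only (this is not part of the patch): the dtsi numbers
the Cortex-A7 cores 0x0 to 0x3 and the Cortex-A15 cores 0x100 to 0x103,
and the SMP code splits such a value into core and cluster with
MPIDR_AFFINITY_LEVEL(). A minimal userspace sketch of that split,
assuming the usual 8-bit affinity fields, could look like the following.
The helper names affinity_level() and split_mpidr() are made up for this
example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: each MPIDR affinity level is an 8-bit field. */
#define AFFINITY_LEVEL_BITS	8u
#define AFFINITY_LEVEL_MASK	0xffu

/* Same limits as SUNXI_CPUS_PER_CLUSTER / SUNXI_NR_CLUSTERS in the patch */
#define CPUS_PER_CLUSTER	4u
#define NR_CLUSTERS		2u

/* Hypothetical helper: extract one affinity level from an MPIDR value */
unsigned int affinity_level(uint32_t mpidr, unsigned int level)
{
	return (mpidr >> (level * AFFINITY_LEVEL_BITS)) & AFFINITY_LEVEL_MASK;
}

/*
 * Hypothetical helper: split an MPIDR value into core and cluster.
 * Returns 0 on success, -1 if either index is out of range for the A80.
 */
int split_mpidr(uint32_t mpidr, unsigned int *core, unsigned int *cluster)
{
	assert(core && cluster);

	*core = affinity_level(mpidr, 0);
	*cluster = affinity_level(mpidr, 1);

	if (*core >= CPUS_PER_CLUSTER || *cluster >= NR_CLUSTERS)
		return -1;

	return 0;
}
```

With this sketch, 0x103 (cpu7 in the dtsi) splits into core 3 of
cluster 1, while a value such as 0x200 is rejected as out of range.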

This is limited to !THUMB2_KERNEL for now. The entry code must be built
as ARM machine code, and it does not switch modes. Further work was
done to move the assembly code to a separate file and add the proper
mode statements and mode switching instructions. However, initial tests
failed to boot properly with Thumb-2.

Parts of the trampoline and re-entry code for the boot cpu were adapted
from the MCPM framework.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
---
 arch/arm/mach-sunxi/Kconfig  |   7 +
 arch/arm/mach-sunxi/Makefile |   1 +
 arch/arm/mach-sunxi/mcpm.c   | 548 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 556 insertions(+)
 create mode 100644 arch/arm/mach-sunxi/mcpm.c

diff --git a/arch/arm/mach-sunxi/Kconfig b/arch/arm/mach-sunxi/Kconfig
index 58153cdf025b..b53e37d170e6 100644
--- a/arch/arm/mach-sunxi/Kconfig
+++ b/arch/arm/mach-sunxi/Kconfig
@@ -48,4 +48,11 @@ config MACH_SUN9I
 	default ARCH_SUNXI
 	select ARM_GIC
 
+config ARCH_SUNXI_MCPM
+	bool
+	depends on SMP && !THUMB2_KERNEL
+	default MACH_SUN9I
+	select ARM_CCI400_PORT_CTRL
+	select ARM_CPU_SUSPEND
+
 endif
diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile
index 27b168f121a1..cacd1afa8137 100644
--- a/arch/arm/mach-sunxi/Makefile
+++ b/arch/arm/mach-sunxi/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_ARCH_SUNXI) += sunxi.o
+obj-$(CONFIG_ARCH_SUNXI_MCPM) += mcpm.o
 obj-$(CONFIG_SMP) += platsmp.o
diff --git a/arch/arm/mach-sunxi/mcpm.c b/arch/arm/mach-sunxi/mcpm.c
new file mode 100644
index 000000000000..7c77bb3b367a
--- /dev/null
+++ b/arch/arm/mach-sunxi/mcpm.c
@@ -0,0 +1,548 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2015 Chen-Yu Tsai
+ *
+ * Chen-Yu Tsai <wens@csie.org>
+ *
+ * arch/arm/mach-sunxi/mcpm.c
+ *
+ * Based on Allwinner code, arch/arm/mach-exynos/mcpm-exynos.c, and
+ * arch/arm/mach-hisi/platmcpm.c
+ */
+
+#include <linux/arm-cci.h>
+#include <linux/cpu_pm.h>
+#include <linux/delay.h>
+#include <linux/io.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_device.h>
+#include <linux/smp.h>
+
+#include <asm/cacheflush.h>
+#include <asm/cp15.h>
+#include <asm/cputype.h>
+#include <asm/idmap.h>
+#include <asm/smp_plat.h>
+#include <asm/suspend.h>
+
+#define SUNXI_CPUS_PER_CLUSTER		4
+#define SUNXI_NR_CLUSTERS		2
+
+#define CPUCFG_CX_CTRL_REG0(c)		(0x10 * (c))
+#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(n)	BIT(n)
+#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL	0xf
+#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7	BIT(4)
+#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15	BIT(0)
+#define CPUCFG_CX_CTRL_REG1(c)		(0x10 * (c) + 0x4)
+#define CPUCFG_CX_CTRL_REG1_ACINACTM	BIT(0)
+#define CPUCFG_CX_RST_CTRL(c)		(0x80 + 0x4 * (c))
+#define CPUCFG_CX_RST_CTRL_DBG_SOC_RST	BIT(24)
+#define CPUCFG_CX_RST_CTRL_ETM_RST(n)	BIT(20 + (n))
+#define CPUCFG_CX_RST_CTRL_ETM_RST_ALL	(0xf << 20)
+#define CPUCFG_CX_RST_CTRL_DBG_RST(n)	BIT(16 + (n))
+#define CPUCFG_CX_RST_CTRL_DBG_RST_ALL	(0xf << 16)
+#define CPUCFG_CX_RST_CTRL_H_RST	BIT(12)
+#define CPUCFG_CX_RST_CTRL_L2_RST	BIT(8)
+#define CPUCFG_CX_RST_CTRL_CX_RST(n)	BIT(4 + (n))
+#define CPUCFG_CX_RST_CTRL_CORE_RST(n)	BIT(n)
+
+#define PRCM_CPU_PO_RST_CTRL(c)		(0x4 + 0x4 * (c))
+#define PRCM_CPU_PO_RST_CTRL_CORE(n)	BIT(n)
+#define PRCM_CPU_PO_RST_CTRL_CORE_ALL	0xf
+#define PRCM_PWROFF_GATING_REG(c)	(0x100 + 0x4 * (c))
+#define PRCM_PWROFF_GATING_REG_CLUSTER	BIT(4)
+#define PRCM_PWROFF_GATING_REG_CORE(n)	BIT(n)
+#define PRCM_PWR_SWITCH_REG(c, cpu)	(0x140 + 0x10 * (c) + 0x4 * (cpu))
+#define PRCM_CPU_SOFT_ENTRY_REG		0x164
+
+static void __iomem *cpucfg_base;
+static void __iomem *prcm_base;
+
+static bool sunxi_core_is_cortex_a15(unsigned int core, unsigned int cluster)
+{
+	struct device_node *node;
+	int cpu = cluster * SUNXI_CPUS_PER_CLUSTER + core;
+
+	node = of_cpu_device_node_get(cpu);
+
+	/* In case of_cpu_device_node_get fails */
+	if (!node)
+		node = of_get_cpu_node(cpu, NULL);
+
+	if (!node) {
+		/*
+		 * There's no point in returning an error, since we
+		 * would be mid way in a core or cluster power sequence.
+		 */
+		pr_err("%s: Couldn't get CPU cluster %u core %u device node\n",
+		       __func__, cluster, core);
+
+		return false;
+	}
+
+	return of_device_is_compatible(node, "arm,cortex-a15");
+}
+
+static int sunxi_cpu_power_switch_set(unsigned int cpu, unsigned int cluster,
+				      bool enable)
+{
+	u32 reg;
+
+	/* control sequence from Allwinner A80 user manual v1.2 PRCM section */
+	reg = readl(prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+	if (enable) {
+		if (reg == 0x00) {
+			pr_debug("power clamp for cluster %u cpu %u already open\n",
+				 cluster, cpu);
+			return 0;
+		}
+
+		writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+		udelay(10);
+		writel(0xfe, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+		udelay(10);
+		writel(0xf8, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+		udelay(10);
+		writel(0xf0, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+		udelay(10);
+		writel(0x00, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+		udelay(10);
+	} else {
+		writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+		udelay(10);
+	}
+
+	return 0;
+}
+
+static int sunxi_cpu_powerup(unsigned int cpu, unsigned int cluster)
+{
+	u32 reg;
+
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
+		return -EINVAL;
+
+	/* assert processor power-on reset */
+	reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+	reg &= ~PRCM_CPU_PO_RST_CTRL_CORE(cpu);
+	writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+
+	/* Cortex-A7: hold L1 reset disable signal low */
+	if (!sunxi_core_is_cortex_a15(cpu, cluster)) {
+		reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
+		reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(cpu);
+		writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
+	}
+
+	/* assert processor related resets */
+	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+	reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
+
+	/*
+	 * Allwinner code also asserts resets for NEON on A15. According
+	 * to ARM manuals, asserting power-on reset is sufficient.
+	 */
+	if (!sunxi_core_is_cortex_a15(cpu, cluster))
+		reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
+
+	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+	/* open power switch */
+	sunxi_cpu_power_switch_set(cpu, cluster, true);
+
+	/* clear processor power gate */
+	reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	reg &= ~PRCM_PWROFF_GATING_REG_CORE(cpu);
+	writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	udelay(20);
+
+	/* de-assert processor power-on reset */
+	reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+	reg |= PRCM_CPU_PO_RST_CTRL_CORE(cpu);
+	writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+
+	/* de-assert all processor resets */
+	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+	reg |= CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
+	reg |= CPUCFG_CX_RST_CTRL_CORE_RST(cpu);
+	if (!sunxi_core_is_cortex_a15(cpu, cluster))
+		reg |= CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
+	else
+		reg |= CPUCFG_CX_RST_CTRL_CX_RST(cpu); /* NEON */
+	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+	return 0;
+}
+
+static int sunxi_cluster_powerup(unsigned int cluster)
+{
+	u32 reg;
+
+	pr_debug("%s: cluster %u\n", __func__, cluster);
+	if (cluster >= SUNXI_NR_CLUSTERS)
+		return -EINVAL;
+
+	/* assert ACINACTM */
+	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+	reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
+	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+
+	/* assert cluster processor power-on resets */
+	reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+	reg &= ~PRCM_CPU_PO_RST_CTRL_CORE_ALL;
+	writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+
+	/* assert cluster resets */
+	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+	reg &= ~CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
+	reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST_ALL;
+	reg &= ~CPUCFG_CX_RST_CTRL_H_RST;
+	reg &= ~CPUCFG_CX_RST_CTRL_L2_RST;
+
+	/*
+	 * Allwinner code also asserts resets for NEON on A15. According
+	 * to ARM manuals, asserting power-on reset is sufficient.
+	 */
+	if (!sunxi_core_is_cortex_a15(0, cluster))
+		reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST_ALL;
+
+	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+	/* hold L1/L2 reset disable signals low */
+	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
+	if (sunxi_core_is_cortex_a15(0, cluster)) {
+		/* Cortex-A15: hold L2RSTDISABLE low */
+		reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15;
+	} else {
+		/* Cortex-A7: hold L1RSTDISABLE and L2RSTDISABLE low */
+		reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL;
+		reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7;
+	}
+	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
+
+	/* clear cluster power gate */
+	reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	reg &= ~PRCM_PWROFF_GATING_REG_CLUSTER;
+	writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	udelay(20);
+
+	/* de-assert cluster resets */
+	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+	reg |= CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
+	reg |= CPUCFG_CX_RST_CTRL_H_RST;
+	reg |= CPUCFG_CX_RST_CTRL_L2_RST;
+	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+	/* de-assert ACINACTM */
+	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+	reg &= ~CPUCFG_CX_CTRL_REG1_ACINACTM;
+	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+
+	return 0;
+}
+
+/*
+ * This bit is shared between the initial mcpm_sync_init call to enable
+ * CCI-400 and proper cluster cache disable before power down.
+ */
+static void sunxi_cluster_cache_disable_without_axi(void)
+{
+	if (read_cpuid_part() == ARM_CPU_PART_CORTEX_A15) {
+		/*
+		 * On the Cortex-A15 we need to disable
+		 * L2 prefetching before flushing the cache.
+		 */
+		asm volatile(
+		"mcr	p15, 1, %0, c15, c0, 3\n"
+		"isb\n"
+		"dsb"
+		: : "r" (0x400));
+	}
+
+	/* Flush all cache levels for this cluster. */
+	v7_exit_coherency_flush(all);
+
+	/*
+	 * Disable cluster-level coherency by masking
+	 * incoming snoops and DVM messages:
+	 */
+	cci_disable_port_by_cpu(read_cpuid_mpidr());
+}
+
+static int sunxi_mcpm_cpu_table[SUNXI_NR_CLUSTERS][SUNXI_CPUS_PER_CLUSTER];
+static int sunxi_mcpm_first_comer;
+
+/*
+ * Enable cluster-level coherency, in preparation for turning on the MMU.
+ *
+ * Also enable regional clock gating and L2 data latency settings for
+ * Cortex-A15. These settings are from the vendor kernel.
+ */
+static void __naked sunxi_mcpm_cluster_cache_enable(void)
+{
+	asm volatile (
+		"mrc	p15, 0, r1, c0, c0, 0\n"
+		"movw	r2, #" __stringify(ARM_CPU_PART_MASK & 0xffff) "\n"
+		"movt	r2, #" __stringify(ARM_CPU_PART_MASK >> 16) "\n"
+		"and	r1, r1, r2\n"
+		"movw	r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 & 0xffff) "\n"
+		"movt	r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 >> 16) "\n"
+		"cmp	r1, r2\n"
+		"bne	not_a15\n"
+
+		/* The following is Cortex-A15 specific */
+
+		/* ACTLR2: Enable CPU regional clock gates */
+		"mrc p15, 1, r1, c15, c0, 4\n"
+		"orr r1, r1, #(0x1<<31)\n"
+		"mcr p15, 1, r1, c15, c0, 4\n"
+
+		/* L2ACTLR */
+		"mrc p15, 1, r1, c15, c0, 0\n"
+		/* Enable L2, GIC, and Timer regional clock gates */
+		"orr r1, r1, #(0x1<<26)\n"
+		/* Disable clean/evict from being pushed to external */
+		"orr r1, r1, #(0x1<<3)\n"
+		"mcr p15, 1, r1, c15, c0, 0\n"
+
+		/* L2CTRL: L2 data RAM latency */
+		"mrc p15, 1, r1, c9, c0, 2\n"
+		"bic r1, r1, #(0x7<<0)\n"
+		"orr r1, r1, #(0x3<<0)\n"
+		"mcr p15, 1, r1, c9, c0, 2\n"
+
+		/* End of Cortex-A15 specific setup */
+		"not_a15:\n"
+
+		/* Get value of sunxi_mcpm_first_comer */
+		"adr	r1, first\n"
+		"ldr	r0, [r1]\n"
+		"ldr	r0, [r1, r0]\n"
+
+		/* Skip cci_enable_port_for_self if not first comer */
+		"cmp	r0, #0\n"
+		"bxeq	lr\n"
+		"b	cci_enable_port_for_self\n"
+
+		"first: .word sunxi_mcpm_first_comer - .\n"
+	);
+}
+
+static void __naked sunxi_mcpm_secondary_startup(void)
+{
+	asm volatile(
+		"bl	sunxi_mcpm_cluster_cache_enable\n"
+		"b	secondary_startup"
+		/* Let compiler know about sunxi_mcpm_cluster_cache_enable */
+		:: "i" (sunxi_mcpm_cluster_cache_enable)
+	);
+}
+
+static DEFINE_SPINLOCK(boot_lock);
+
+static bool sunxi_mcpm_cluster_is_down(unsigned int cluster)
+{
+	int i;
+
+	for (i = 0; i < SUNXI_CPUS_PER_CLUSTER; i++)
+		if (sunxi_mcpm_cpu_table[cluster][i])
+			return false;
+	return true;
+}
+
+static int sunxi_mcpm_boot_secondary(unsigned int l_cpu, struct task_struct *idle)
+{
+	unsigned int mpidr, cpu, cluster;
+
+	mpidr = cpu_logical_map(l_cpu);
+	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+
+	if (!cpucfg_base)
+		return -ENODEV;
+	if (cluster >= SUNXI_NR_CLUSTERS || cpu >= SUNXI_CPUS_PER_CLUSTER)
+		return -EINVAL;
+
+	spin_lock_irq(&boot_lock);
+
+	if (sunxi_mcpm_cpu_table[cluster][cpu])
+		goto out;
+
+	if (sunxi_mcpm_cluster_is_down(cluster)) {
+		sunxi_mcpm_first_comer = true;
+		sunxi_cluster_powerup(cluster);
+	} else {
+		sunxi_mcpm_first_comer = false;
+	}
+
+	/* This is read by incoming CPUs with their cache and MMU disabled */
+	sync_cache_w(&sunxi_mcpm_first_comer);
+	sunxi_cpu_powerup(cpu, cluster);
+
+out:
+	sunxi_mcpm_cpu_table[cluster][cpu]++;
+	spin_unlock_irq(&boot_lock);
+
+	return 0;
+}
+
+static const struct smp_operations sunxi_mcpm_smp_ops __initconst = {
+	.smp_boot_secondary	= sunxi_mcpm_boot_secondary,
+};
+
+static bool __init sunxi_mcpm_cpu_table_init(void)
+{
+	unsigned int mpidr, cpu, cluster;
+
+	mpidr = read_cpuid_mpidr();
+	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+
+	if (cluster >= SUNXI_NR_CLUSTERS || cpu >= SUNXI_CPUS_PER_CLUSTER) {
+		pr_err("%s: boot CPU is out of bounds!\n", __func__);
+		return false;
+	}
+	sunxi_mcpm_cpu_table[cluster][cpu] = 1;
+	return true;
+}
+
+/*
+ * Adapted from arch/arm/common/mcpm_entry.c
+ *
+ * We need the trampoline code to enable CCI-400 on the first cluster
+ */
+typedef typeof(cpu_reset) phys_reset_t;
+
+static void __naked sunxi_mcpm_resume(void)
+{
+	asm volatile(
+		"bl	sunxi_mcpm_cluster_cache_enable\n"
+		"b	cpu_resume"
+		/* Let compiler know about sunxi_mcpm_cluster_cache_enable */
+		:: "i" (sunxi_mcpm_cluster_cache_enable)
+	);
+}
+
+static int __init nocache_trampoline(unsigned long __unused)
+{
+	phys_reset_t phys_reset;
+
+	setup_mm_for_reboot();
+	sunxi_cluster_cache_disable_without_axi();
+
+	phys_reset = (phys_reset_t)(unsigned long)__pa_symbol(cpu_reset);
+	phys_reset(__pa_symbol(sunxi_mcpm_resume), false);
+	BUG();
+}
+
+static int __init sunxi_mcpm_lookback(void)
+{
+	int ret;
+
+	/*
+	 * We're going to soft-restart the current CPU through the
+	 * low-level MCPM code by leveraging the suspend/resume
+	 * infrastructure. Let's play it safe by using cpu_pm_enter()
+	 * in case the CPU init code path resets the VFP or similar.
+	 */
+	sunxi_mcpm_first_comer = true;
+	local_irq_disable();
+	local_fiq_disable();
+	ret = cpu_pm_enter();
+	if (!ret) {
+		ret = cpu_suspend(0, nocache_trampoline);
+		cpu_pm_exit();
+	}
+	local_fiq_enable();
+	local_irq_enable();
+	sunxi_mcpm_first_comer = false;
+
+	return ret;
+}
+
+static int __init sunxi_mcpm_init(void)
+{
+	struct device_node *cpucfg_node, *node;
+	struct resource res;
+	int ret;
+
+	if (!of_machine_is_compatible("allwinner,sun9i-a80"))
+		return -ENODEV;
+
+	if (!sunxi_mcpm_cpu_table_init())
+		return -EINVAL;
+
+	if (!cci_probed()) {
+		pr_err("%s: CCI-400 not available\n", __func__);
+		return -ENODEV;
+	}
+
+	node = of_find_compatible_node(NULL, NULL, "allwinner,sun9i-a80-prcm");
+	if (!node) {
+		pr_err("%s: PRCM not available\n", __func__);
+		return -ENODEV;
+	}
+
+	/*
+	 * Unfortunately we can not request the I/O region for the PRCM.
+	 * It is shared with the PRCM clock.
+	 */
+	prcm_base = of_iomap(node, 0);
+	of_node_put(node);
+	if (!prcm_base) {
+		pr_err("%s: failed to map PRCM registers\n", __func__);
+		return -ENOMEM;
+	}
+
+	cpucfg_node = of_find_compatible_node(NULL, NULL,
+					      "allwinner,sun9i-a80-cpucfg");
+	if (!cpucfg_node) {
+		ret = -ENODEV;
+		pr_err("%s: CPUCFG not available\n", __func__);
+		goto err_unmap_prcm;
+	}
+
+	cpucfg_base = of_io_request_and_map(cpucfg_node, 0, "sunxi-mcpm");
+	if (IS_ERR(cpucfg_base)) {
+		ret = PTR_ERR(cpucfg_base);
+		pr_err("%s: failed to map CPUCFG registers: %d\n",
+		       __func__, ret);
+		goto err_put_cpucfg_node;
+	}
+
+	/* Configure CCI-400 for boot cluster */
+	ret = sunxi_mcpm_lookback();
+	if (ret) {
+		pr_err("%s: failed to configure boot cluster: %d\n",
+		       __func__, ret);
+		goto err_unmap_release_cpucfg;
+	}
+
+	/* We don't need the CPUCFG device node anymore */
+	of_node_put(cpucfg_node);
+
+	/* Set the hardware entry point address */
+	writel(__pa_symbol(sunxi_mcpm_secondary_startup),
+	       prcm_base + PRCM_CPU_SOFT_ENTRY_REG);
+
+	/* Actually enable MCPM */
+	smp_set_ops(&sunxi_mcpm_smp_ops);
+
+	pr_info("sunxi MCPM support installed\n");
+
+	return 0;
+
+err_unmap_release_cpucfg:
+	iounmap(cpucfg_base);
+	of_address_to_resource(cpucfg_node, 0, &res);
+	release_mem_region(res.start, resource_size(&res));
+err_put_cpucfg_node:
+	of_node_put(cpucfg_node);
+err_unmap_prcm:
+	iounmap(prcm_base);
+	return ret;
+}
+
+early_initcall(sunxi_mcpm_init);
-- 
2.15.1


* [PATCH v3 2/8] ARM: dts: sun9i: Add CCI-400 device nodes for A80
  2018-01-15  7:14 [PATCH v3 0/8] ARM: sun9i: SMP and CPU hotplug support Chen-Yu Tsai
  2018-01-15  7:14 ` [PATCH v3 1/8] ARM: sun9i: Support SMP bring-up on A80 Chen-Yu Tsai
@ 2018-01-15  7:14 ` Chen-Yu Tsai
  2018-01-15  7:14 ` [PATCH v3 3/8] ARM: dts: sun9i: Add CPUCFG device node for A80 dtsi Chen-Yu Tsai
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Chen-Yu Tsai @ 2018-01-15  7:14 UTC
  To: Maxime Ripard, Russell King
  Cc: Chen-Yu Tsai, devicetree, linux-arm-kernel, linux-kernel,
	linux-sunxi, Nicolas Pitre, Dave Martin

The A80 includes an ARM CCI-400 interconnect to support multi-cluster
CPU caches.

Also add the maximum clock frequency for the CPUs, as listed in the
A80 Optimus Board FEX file.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
---
 arch/arm/boot/dts/sun9i-a80.dtsi | 46 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/arch/arm/boot/dts/sun9i-a80.dtsi b/arch/arm/boot/dts/sun9i-a80.dtsi
index 90eac0b2a193..85fb800af8ab 100644
--- a/arch/arm/boot/dts/sun9i-a80.dtsi
+++ b/arch/arm/boot/dts/sun9i-a80.dtsi
@@ -63,48 +63,64 @@
 		cpu0: cpu@0 {
 			compatible = "arm,cortex-a7";
 			device_type = "cpu";
+			cci-control-port = <&cci_control0>;
+			clock-frequency = <12000000>;
 			reg = <0x0>;
 		};
 
 		cpu1: cpu@1 {
 			compatible = "arm,cortex-a7";
 			device_type = "cpu";
+			cci-control-port = <&cci_control0>;
+			clock-frequency = <12000000>;
 			reg = <0x1>;
 		};
 
 		cpu2: cpu@2 {
 			compatible = "arm,cortex-a7";
 			device_type = "cpu";
+			cci-control-port = <&cci_control0>;
+			clock-frequency = <12000000>;
 			reg = <0x2>;
 		};
 
 		cpu3: cpu@3 {
 			compatible = "arm,cortex-a7";
 			device_type = "cpu";
+			cci-control-port = <&cci_control0>;
+			clock-frequency = <12000000>;
 			reg = <0x3>;
 		};
 
 		cpu4: cpu@100 {
 			compatible = "arm,cortex-a15";
 			device_type = "cpu";
+			cci-control-port = <&cci_control1>;
+			clock-frequency = <18000000>;
 			reg = <0x100>;
 		};
 
 		cpu5: cpu@101 {
 			compatible = "arm,cortex-a15";
 			device_type = "cpu";
+			cci-control-port = <&cci_control1>;
+			clock-frequency = <18000000>;
 			reg = <0x101>;
 		};
 
 		cpu6: cpu@102 {
 			compatible = "arm,cortex-a15";
 			device_type = "cpu";
+			cci-control-port = <&cci_control1>;
+			clock-frequency = <18000000>;
 			reg = <0x102>;
 		};
 
 		cpu7: cpu@103 {
 			compatible = "arm,cortex-a15";
 			device_type = "cpu";
+			cci-control-port = <&cci_control1>;
+			clock-frequency = <18000000>;
 			reg = <0x103>;
 		};
 	};
@@ -431,6 +447,36 @@
 			interrupts = <GIC_PPI 9 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_HIGH)>;
 		};
 
+		cci: cci@1c90000 {
+			compatible = "arm,cci-400";
+			#address-cells = <1>;
+			#size-cells = <1>;
+			reg = <0x01c90000 0x1000>;
+			ranges = <0x0 0x01c90000 0x10000>;
+
+			cci_control0: slave-if@4000 {
+				compatible = "arm,cci-400-ctrl-if";
+				interface-type = "ace";
+				reg = <0x4000 0x1000>;
+			};
+
+			cci_control1: slave-if@5000 {
+				compatible = "arm,cci-400-ctrl-if";
+				interface-type = "ace";
+				reg = <0x5000 0x1000>;
+			};
+
+			pmu@9000 {
+				 compatible = "arm,cci-400-pmu,r1";
+				 reg = <0x9000 0x5000>;
+				 interrupts = <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>,
+					      <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>,
+					      <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>,
+					      <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>,
+					      <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>;
+			};
+		};
+
 		de_clocks: clock@3000000 {
 			compatible = "allwinner,sun9i-a80-de-clks";
 			reg = <0x03000000 0x30>;
-- 
2.15.1


* [PATCH v3 3/8] ARM: dts: sun9i: Add CPUCFG device node for A80 dtsi
  2018-01-15  7:14 [PATCH v3 0/8] ARM: sun9i: SMP and CPU hotplug support Chen-Yu Tsai
  2018-01-15  7:14 ` [PATCH v3 1/8] ARM: sun9i: Support SMP bring-up on A80 Chen-Yu Tsai
  2018-01-15  7:14 ` [PATCH v3 2/8] ARM: dts: sun9i: Add CCI-400 device nodes for A80 Chen-Yu Tsai
@ 2018-01-15  7:14 ` Chen-Yu Tsai
  2018-01-15  7:14 ` [PATCH v3 4/8] ARM: dts: sun9i: Add PRCM device node for the " Chen-Yu Tsai
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Chen-Yu Tsai @ 2018-01-15  7:14 UTC
  To: Maxime Ripard, Russell King
  Cc: Chen-Yu Tsai, devicetree, linux-arm-kernel, linux-kernel,
	linux-sunxi, Nicolas Pitre, Dave Martin

CPUCFG is a collection of registers that are mapped to the SoC's signals
from each individual processor core and associated peripherals, such as
resets for processors, L1/L2 cache and other things.

These registers are used for SMP bringup and CPU hotplugging.
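
As a rough illustration of how the SMP code in this series drives these
registers (not part of this patch): the reset bits are treated as
active-low, so a bit is cleared to assert a reset and set to de-assert
it, always with a read-modify-write. The sketch below replays that
pattern in userspace against a plain array standing in for the MMIO
window. The offsets and bit positions are copied from mcpm.c in patch 1;
fake_readl(), fake_writel() and the two helper functions are made up for
this example.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)				(1u << (n))

/* Offsets and bit positions as defined in mcpm.c from patch 1 */
#define CPUCFG_CX_RST_CTRL(c)		(0x80 + 0x4 * (c))
#define CPUCFG_CX_RST_CTRL_DBG_RST(n)	BIT(16 + (n))
#define CPUCFG_CX_RST_CTRL_CORE_RST(n)	BIT(n)

/* Hypothetical stand-in for the 0x100-byte CPUCFG register window */
static uint32_t fake_cpucfg[0x100 / 4];

uint32_t fake_readl(unsigned int offset)
{
	assert(offset % 4 == 0 && offset < sizeof(fake_cpucfg));
	return fake_cpucfg[offset / 4];
}

void fake_writel(uint32_t val, unsigned int offset)
{
	assert(offset % 4 == 0 && offset < sizeof(fake_cpucfg));
	fake_cpucfg[offset / 4] = val;
}

/* Assert the debug reset of one core: clear its (active-low) bit */
void debug_reset_assert(unsigned int cluster, unsigned int core)
{
	uint32_t reg = fake_readl(CPUCFG_CX_RST_CTRL(cluster));

	reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST(core);
	fake_writel(reg, CPUCFG_CX_RST_CTRL(cluster));
}

/* De-assert the debug and core resets of one core: set both bits */
void resets_deassert(unsigned int cluster, unsigned int core)
{
	uint32_t reg = fake_readl(CPUCFG_CX_RST_CTRL(cluster));

	reg |= CPUCFG_CX_RST_CTRL_DBG_RST(core);
	reg |= CPUCFG_CX_RST_CTRL_CORE_RST(core);
	fake_writel(reg, CPUCFG_CX_RST_CTRL(cluster));
}
```

Only the bits belonging to the selected core change; the other cores
and the other cluster's register are left alone.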

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
---
 arch/arm/boot/dts/sun9i-a80.dtsi | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/boot/dts/sun9i-a80.dtsi b/arch/arm/boot/dts/sun9i-a80.dtsi
index 85fb800af8ab..85ecb4d64cfd 100644
--- a/arch/arm/boot/dts/sun9i-a80.dtsi
+++ b/arch/arm/boot/dts/sun9i-a80.dtsi
@@ -363,6 +363,11 @@
 			#reset-cells = <1>;
 		};
 
+		cpucfg@1700000 {
+			compatible = "allwinner,sun9i-a80-cpucfg";
+			reg = <0x01700000 0x100>;
+		};
+
 		mmc0: mmc@1c0f000 {
 			compatible = "allwinner,sun9i-a80-mmc";
 			reg = <0x01c0f000 0x1000>;
-- 
2.15.1


* [PATCH v3 4/8] ARM: dts: sun9i: Add PRCM device node for the A80 dtsi
  2018-01-15  7:14 [PATCH v3 0/8] ARM: sun9i: SMP and CPU hotplug support Chen-Yu Tsai
                   ` (2 preceding siblings ...)
  2018-01-15  7:14 ` [PATCH v3 3/8] ARM: dts: sun9i: Add CPUCFG device node for A80 dtsi Chen-Yu Tsai
@ 2018-01-15  7:14 ` Chen-Yu Tsai
  2018-01-15  7:14 ` [PATCH v3 5/8] ARM: sun9i: mcpm: Support CPU/cluster power down and hotplugging for cpu1~7 Chen-Yu Tsai
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Chen-Yu Tsai @ 2018-01-15  7:14 UTC
  To: Maxime Ripard, Russell King
  Cc: Chen-Yu Tsai, devicetree, linux-arm-kernel, linux-kernel,
	linux-sunxi, Nicolas Pitre, Dave Martin

The PRCM is a collection of clock controls, reset controls, and various
power switches/gates. Some of these can be independently listed and
supported, while a number of CPU related ones are used in tandem with
CPUCFG for SMP bringup and CPU hotplugging.
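
To give an idea of what the CPU related power switches involve (not part
of this patch): patch 1 opens a core's power clamp in stages, writing
0xff, 0xfe, 0xf8, 0xf0 and finally 0x00 to the per-core switch register
with a short delay after each write, and closes it with a single 0xff.
The userspace sketch below replays that sequence against an array
standing in for the PRCM window. The register offset macro is copied
from mcpm.c; the function names, the write counter and the empty
settle_delay() are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset as defined in mcpm.c from patch 1 */
#define PRCM_PWR_SWITCH_REG(c, cpu)	(0x140 + 0x10 * (c) + 0x4 * (cpu))

/* Values written, in order, when opening the power clamp */
static const uint8_t clamp_open_steps[] = { 0xff, 0xfe, 0xf8, 0xf0, 0x00 };

/* Hypothetical stand-in for the 0x200-byte PRCM register window */
static uint32_t fake_prcm[0x200 / 4];

/* Counts register writes so the sequence can be inspected */
unsigned int clamp_writes;

/* Placeholder for the udelay(10) the kernel code issues after each write */
static void settle_delay(void)
{
}

uint32_t power_switch_get(unsigned int cluster, unsigned int cpu)
{
	size_t idx = PRCM_PWR_SWITCH_REG(cluster, cpu) / 4;

	assert(idx < sizeof(fake_prcm) / sizeof(fake_prcm[0]));
	return fake_prcm[idx];
}

int power_switch_set(unsigned int cluster, unsigned int cpu, int enable)
{
	size_t idx = PRCM_PWR_SWITCH_REG(cluster, cpu) / 4;
	size_t i;

	assert(idx < sizeof(fake_prcm) / sizeof(fake_prcm[0]));

	if (!enable) {
		/* Closing the clamp is a single write */
		fake_prcm[idx] = 0xff;
		clamp_writes++;
		settle_delay();
		return 0;
	}

	/* Nothing to do if the clamp is already fully open */
	if (fake_prcm[idx] == 0x00)
		return 0;

	/* Open the clamp step by step */
	for (i = 0; i < sizeof(clamp_open_steps); i++) {
		fake_prcm[idx] = clamp_open_steps[i];
		clamp_writes++;
		settle_delay();
	}

	return 0;
}
```

Closing and then re-opening a clamp therefore costs six writes in this
sketch, and a second open request on an already open clamp costs none.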

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
---
 arch/arm/boot/dts/sun9i-a80.dtsi | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/boot/dts/sun9i-a80.dtsi b/arch/arm/boot/dts/sun9i-a80.dtsi
index 85ecb4d64cfd..bf4d40e8359f 100644
--- a/arch/arm/boot/dts/sun9i-a80.dtsi
+++ b/arch/arm/boot/dts/sun9i-a80.dtsi
@@ -709,6 +709,11 @@
 			interrupts = <GIC_SPI 36 IRQ_TYPE_LEVEL_HIGH>;
 		};
 
+		prcm@8001400 {
+			compatible = "allwinner,sun9i-a80-prcm";
+			reg = <0x08001400 0x200>;
+		};
+
 		apbs_rst: reset@80014b0 {
 			reg = <0x080014b0 0x4>;
 			compatible = "allwinner,sun6i-a31-clock-reset";
-- 
2.15.1


* [PATCH v3 5/8] ARM: sun9i: mcpm: Support CPU/cluster power down and hotplugging for cpu1~7
  2018-01-15  7:14 [PATCH v3 0/8] ARM: sun9i: SMP and CPU hotplug support Chen-Yu Tsai
                   ` (3 preceding siblings ...)
  2018-01-15  7:14 ` [PATCH v3 4/8] ARM: dts: sun9i: Add PRCM device node for the " Chen-Yu Tsai
@ 2018-01-15  7:14 ` Chen-Yu Tsai
  2018-01-15  7:14 ` [PATCH v3 6/8] dt-bindings: ARM: sunxi: Document A80 SoC secure SRAM usage by SMP hotplug Chen-Yu Tsai
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Chen-Yu Tsai @ 2018-01-15  7:14 UTC
  To: Maxime Ripard, Russell King
  Cc: Chen-Yu Tsai, devicetree, linux-arm-kernel, linux-kernel,
	linux-sunxi, Nicolas Pitre, Dave Martin

This patch adds common code used to power down all cores and clusters.
The code was previously based on the MCPM framework. It has now been
adapted to hook into struct smp_operations directly, but the code
structure still shows signs of prior work.

The primary core (cpu0) requires setting flags to have the BROM bounce
execution to the SMP software entry code. This is done in a subsequent
patch to keep the changes cleanly separated. By default the ARM SMP code
blocks cpu0 from being turned off, so splitting this out is safe.
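
For readers unfamiliar with the pattern: the diff below adds
POLL_USEC/TIMEOUT_USEC, a STANDBYWFI status bit per core, and an include
of <linux/iopoll.h>, which suggests that power down waits for a core to
reach WFI by polling with a timeout. The exact call is an assumption on
my part, since that part of the code is not quoted here. A userspace
sketch of such a bounded poll, with simulated time instead of real
sleeps and with made-up names (wait_for_wfi(), fake_read_status()),
might look like this:

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)				(1u << (n))

/* Constants as added to mcpm.c by this patch */
#define POLL_USEC			100
#define TIMEOUT_USEC			100000
#define CPUCFG_CX_STATUS_STANDBYWFI(n)	BIT(16 + (n))

/* Hypothetical callback type: returns the current status register value */
typedef uint32_t (*read_status_fn)(void *ctx);

/*
 * Poll until the STANDBYWFI bit of the given core is set.
 * Returns 0 once the bit is seen, -1 if the timeout budget runs out.
 * Time is only counted here; real code would sleep between reads.
 */
int wait_for_wfi(read_status_fn read_status, void *ctx, unsigned int core)
{
	unsigned int elapsed;

	assert(read_status);

	for (elapsed = 0; elapsed <= TIMEOUT_USEC; elapsed += POLL_USEC) {
		if (read_status(ctx) & CPUCFG_CX_STATUS_STANDBYWFI(core))
			return 0;
	}

	return -1;
}

/* Hypothetical hardware model: core 1 enters WFI after a number of reads */
struct fake_status {
	unsigned int reads_until_wfi;
};

uint32_t fake_read_status(void *ctx)
{
	struct fake_status *s = ctx;

	if (s->reads_until_wfi == 0)
		return CPUCFG_CX_STATUS_STANDBYWFI(1);

	s->reads_until_wfi--;
	return 0;
}
```

The point of the bound is that a core which never reaches WFI leads to
an error return rather than a hang in the hotplug path.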

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
---
 arch/arm/mach-sunxi/mcpm.c | 190 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 189 insertions(+), 1 deletion(-)

diff --git a/arch/arm/mach-sunxi/mcpm.c b/arch/arm/mach-sunxi/mcpm.c
index 7c77bb3b367a..740343a2156e 100644
--- a/arch/arm/mach-sunxi/mcpm.c
+++ b/arch/arm/mach-sunxi/mcpm.c
@@ -14,6 +14,8 @@
 #include <linux/cpu_pm.h>
 #include <linux/delay.h>
 #include <linux/io.h>
+#include <linux/iopoll.h>
+#include <linux/irqchip/arm-gic.h>
 #include <linux/of.h>
 #include <linux/of_address.h>
 #include <linux/of_device.h>
@@ -29,6 +31,9 @@
 #define SUNXI_CPUS_PER_CLUSTER		4
 #define SUNXI_NR_CLUSTERS		2
 
+#define POLL_USEC	100
+#define TIMEOUT_USEC	100000
+
 #define CPUCFG_CX_CTRL_REG0(c)		(0x10 * (c))
 #define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(n)	BIT(n)
 #define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL	0xf
@@ -36,6 +41,9 @@
 #define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15	BIT(0)
 #define CPUCFG_CX_CTRL_REG1(c)		(0x10 * (c) + 0x4)
 #define CPUCFG_CX_CTRL_REG1_ACINACTM	BIT(0)
+#define CPUCFG_CX_STATUS(c)		(0x30 + 0x4 * (c))
+#define CPUCFG_CX_STATUS_STANDBYWFI(n)	BIT(16 + (n))
+#define CPUCFG_CX_STATUS_STANDBYWFIL2	BIT(0)
 #define CPUCFG_CX_RST_CTRL(c)		(0x80 + 0x4 * (c))
 #define CPUCFG_CX_RST_CTRL_DBG_SOC_RST	BIT(24)
 #define CPUCFG_CX_RST_CTRL_ETM_RST(n)	BIT(20 + (n))
@@ -120,7 +128,7 @@ static int sunxi_cpu_powerup(unsigned int cpu, unsigned int cluster)
 {
 	u32 reg;
 
-	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	pr_debug("%s: cluster %u cpu %u\n", __func__, cluster, cpu);
 	if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
 		return -EINVAL;
 
@@ -388,8 +396,188 @@ static int sunxi_mcpm_boot_secondary(unsigned int l_cpu, struct task_struct *idl
 	return 0;
 }
 
+#ifdef CONFIG_HOTPLUG_CPU
+static void sunxi_cluster_cache_disable(void)
+{
+	unsigned int cluster = MPIDR_AFFINITY_LEVEL(read_cpuid_mpidr(), 1);
+	u32 reg;
+
+	pr_debug("%s: cluster %u\n", __func__, cluster);
+
+	sunxi_cluster_cache_disable_without_axi();
+
+	/* last man standing, assert ACINACTM */
+	reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+	reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
+	writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+}
+
+static void sunxi_mcpm_cpu_die(unsigned int l_cpu)
+{
+	unsigned int mpidr, cpu, cluster;
+	bool last_man;
+
+	mpidr = cpu_logical_map(l_cpu);
+	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+	pr_debug("%s: cluster %u cpu %u\n", __func__, cluster, cpu);
+
+	spin_lock(&boot_lock);
+	sunxi_mcpm_cpu_table[cluster][cpu]--;
+	if (sunxi_mcpm_cpu_table[cluster][cpu] == 1) {
+		/* A power_up request went ahead of us. */
+		pr_debug("%s: aborting due to a power up request\n",
+			 __func__);
+		spin_unlock(&boot_lock);
+		return;
+	} else if (sunxi_mcpm_cpu_table[cluster][cpu] > 1) {
+		pr_err("Cluster %d CPU%d boots multiple times\n",
+		       cluster, cpu);
+		BUG();
+	}
+
+	last_man = sunxi_mcpm_cluster_is_down(cluster);
+	spin_unlock(&boot_lock);
+
+	gic_cpu_if_down(0);
+	if (last_man)
+		sunxi_cluster_cache_disable();
+	else
+		v7_exit_coherency_flush(louis);
+
+	for (;;)
+		wfi();
+}
+
+static int sunxi_cpu_powerdown(unsigned int cpu, unsigned int cluster)
+{
+	u32 reg;
+
+	pr_debug("%s: cluster %u cpu %u\n", __func__, cluster, cpu);
+	if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
+		return -EINVAL;
+
+	/* gate processor power */
+	reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	reg |= PRCM_PWROFF_GATING_REG_CORE(cpu);
+	writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	udelay(20);
+
+	/* close power switch */
+	sunxi_cpu_power_switch_set(cpu, cluster, false);
+
+	return 0;
+}
+
+static int sunxi_cluster_powerdown(unsigned int cluster)
+{
+	u32 reg;
+
+	pr_debug("%s: cluster %u\n", __func__, cluster);
+	if (cluster >= SUNXI_NR_CLUSTERS)
+		return -EINVAL;
+
+	/* assert cluster resets or system will hang */
+	pr_debug("%s: assert cluster reset\n", __func__);
+	reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+	reg &= ~CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
+	reg &= ~CPUCFG_CX_RST_CTRL_H_RST;
+	reg &= ~CPUCFG_CX_RST_CTRL_L2_RST;
+	writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+	/* gate cluster power */
+	pr_debug("%s: gate cluster power\n", __func__);
+	reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	reg |= PRCM_PWROFF_GATING_REG_CLUSTER;
+	writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+	udelay(20);
+
+	return 0;
+}
+
+static int sunxi_mcpm_cpu_kill(unsigned int l_cpu)
+{
+	unsigned int mpidr, cpu, cluster;
+	unsigned int tries, count;
+	int ret = 0;
+	u32 reg;
+
+	mpidr = cpu_logical_map(l_cpu);
+	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+
+	/* This should never happen */
+	if (WARN_ON(cluster >= SUNXI_NR_CLUSTERS ||
+		    cpu >= SUNXI_CPUS_PER_CLUSTER))
+		return 0;
+
+	/* wait for CPU core to die and enter WFI */
+	count = TIMEOUT_USEC / POLL_USEC;
+	spin_lock_irq(&boot_lock);
+	for (tries = 0; tries < count; tries++) {
+		spin_unlock_irq(&boot_lock);
+		usleep_range(POLL_USEC / 2, POLL_USEC);
+		spin_lock_irq(&boot_lock);
+
+		/*
+		 * If the user turns off a bunch of cores at the same
+		 * time, the kernel might call cpu_kill before some of
+		 * them are ready. This is because boot_lock serializes
+		 * both cpu_die and cpu_kill callbacks. Either one could
+		 * run first. We should wait for cpu_die to complete.
+		 */
+		if (sunxi_mcpm_cpu_table[cluster][cpu])
+			continue;
+
+		reg = readl(cpucfg_base + CPUCFG_CX_STATUS(cluster));
+		if (reg & CPUCFG_CX_STATUS_STANDBYWFI(cpu))
+			break;
+	}
+
+	if (tries >= count) {
+		ret = ETIMEDOUT;
+		goto out;
+	}
+
+	/* power down CPU core */
+	sunxi_cpu_powerdown(cpu, cluster);
+
+	if (!sunxi_mcpm_cluster_is_down(cluster))
+		goto out;
+
+	/* wait for cluster L2 WFI */
+	ret = readl_poll_timeout(cpucfg_base + CPUCFG_CX_STATUS(cluster), reg,
+				 reg & CPUCFG_CX_STATUS_STANDBYWFIL2,
+				 POLL_USEC, TIMEOUT_USEC);
+	if (ret) {
+		/*
+		 * Ignore timeout on the cluster. Leaving the cluster on
+		 * will not affect system execution, just use a bit more
+		 * power. But returning an error here will only confuse
+		 * the user as the CPU has already been shutdown.
+		 */
+		ret = 0;
+		goto out;
+	}
+
+	/* Power down cluster */
+	sunxi_cluster_powerdown(cluster);
+
+out:
+	spin_unlock_irq(&boot_lock);
+	pr_debug("%s: cluster %u cpu %u powerdown: %d\n",
+		 __func__, cluster, cpu, ret);
+	return !ret;
+}
+
+#endif
+
 static const struct smp_operations sunxi_mcpm_smp_ops __initconst = {
 	.smp_boot_secondary	= sunxi_mcpm_boot_secondary,
+#ifdef CONFIG_HOTPLUG_CPU
+	.cpu_die		= sunxi_mcpm_cpu_die,
+	.cpu_kill		= sunxi_mcpm_cpu_kill,
+#endif
 };
 
 static bool __init sunxi_mcpm_cpu_table_init(void)
-- 
2.15.1
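
The use-count check at the top of sunxi_mcpm_cpu_die() can be modelled in
plain userspace C. This is only a sketch: the names cpu_die_decide(),
cluster_is_down() and enum die_action are hypothetical, and the body of
cluster_is_down() is an assumption (all use counts in the cluster are zero),
since sunxi_mcpm_cluster_is_down() is not part of this patch.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CLUSTERS		2
#define CPUS_PER_CLUSTER	4

enum die_action {
	DIE_ABORT,	/* a power-up request raced ahead; stay up */
	DIE_CPU_ONLY,	/* other cores in the cluster are still up */
	DIE_LAST_MAN,	/* last core in the cluster; cluster cache goes too */
	DIE_BUG,	/* use count above 1: core booted more than once */
};

/* Assumed behaviour: a cluster is down when every use count is zero. */
static bool cluster_is_down(int table[NR_CLUSTERS][CPUS_PER_CLUSTER],
			    unsigned int cluster)
{
	unsigned int i;

	for (i = 0; i < CPUS_PER_CLUSTER; i++)
		if (table[cluster][i])
			return false;
	return true;
}

/* Drop one use count and classify the result the way cpu_die does. */
static enum die_action cpu_die_decide(int table[NR_CLUSTERS][CPUS_PER_CLUSTER],
				      unsigned int cluster, unsigned int cpu)
{
	assert(cluster < NR_CLUSTERS && cpu < CPUS_PER_CLUSTER);

	table[cluster][cpu]--;
	if (table[cluster][cpu] == 1)
		return DIE_ABORT;
	if (table[cluster][cpu] > 1)
		return DIE_BUG;

	return cluster_is_down(table, cluster) ? DIE_LAST_MAN : DIE_CPU_ONLY;
}
```

In the patch the decrement and the last-man test both happen under
boot_lock; the sketch leaves locking out and only shows the decision.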

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 6/8] dt-bindings: ARM: sunxi: Document A80 SoC secure SRAM usage by SMP hotplug
  2018-01-15  7:14 [PATCH v3 0/8] ARM: sun9i: SMP and CPU hotplug support Chen-Yu Tsai
                   ` (4 preceding siblings ...)
  2018-01-15  7:14 ` [PATCH v3 5/8] ARM: sun9i: mcpm: Support CPU/cluster power down and hotplugging for cpu1~7 Chen-Yu Tsai
@ 2018-01-15  7:14 ` Chen-Yu Tsai
  2018-01-15  7:14 ` [PATCH v3 7/8] ARM: sun9i: mcpm: Support cpu0 hotplug Chen-Yu Tsai
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Chen-Yu Tsai @ 2018-01-15  7:14 UTC (permalink / raw)
  To: Maxime Ripard, Russell King
  Cc: Chen-Yu Tsai, devicetree, linux-arm-kernel, linux-kernel,
	linux-sunxi, Nicolas Pitre, Dave Martin

On the Allwinner A80 SoC the BROM supports hotplugging the primary core
(cpu0) by checking two 32bit values at a specific location within the
secure SRAM block. This region needs to be reserved and accessible to
the SMP code.

Document its usage.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Reviewed-by: Rob Herring <robh@kernel.org>
---
 .../devicetree/bindings/arm/sunxi/smp-sram.txt     | 44 ++++++++++++++++++++++
 1 file changed, 44 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/sunxi/smp-sram.txt

diff --git a/Documentation/devicetree/bindings/arm/sunxi/smp-sram.txt b/Documentation/devicetree/bindings/arm/sunxi/smp-sram.txt
new file mode 100644
index 000000000000..082e6a9382d3
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/sunxi/smp-sram.txt
@@ -0,0 +1,44 @@
+Allwinner SRAM for smp bringup:
+------------------------------------------------
+
+Allwinner's A80 SoC uses part of the secure sram for hotplugging of the
+primary core (cpu0). Once the core gets powered up it checks if a magic
+value is set at a specific location. If it is then the BROM will jump
+to the software entry address, instead of executing a standard boot.
+
+Therefore a reserved section sub-node has to be added to the mmio-sram
+declaration.
+
+Note that this is separate from the Allwinner SRAM controller found in
+../../sram/sunxi-sram.txt. This SRAM is secure only and not mappable to
+any device.
+
+Also there are no "secure-only" properties. The implementation should
+check if this SRAM is usable first.
+
+Required sub-node properties:
+- compatible : depending on the SoC this should be one of:
+		"allwinner,sun9i-a80-smp-sram"
+
+The rest of the properties should follow the generic mmio-sram description
+found in ../../misc/sram.txt
+
+Example:
+
+	sram_b: sram@20000 {
+		/* 256 KiB secure SRAM at 0x20000 */
+		compatible = "mmio-sram";
+		reg = <0x00020000 0x40000>;
+		#address-cells = <1>;
+		#size-cells = <1>;
+		ranges = <0 0x00020000 0x40000>;
+
+		smp-sram@1000 {
+			/*
+			 * This is checked by BROM to determine if
+			 * cpu0 should jump to SMP entry vector
+			 */
+			compatible = "allwinner,sun9i-a80-smp-sram";
+			reg = <0x1000 0x8>;
+		};
+	};
-- 
2.15.1
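
To see which address the smp-sram sub-node in the example ends up at, the
"ranges" translation can be worked through in userspace C. This is a rough
sketch, not the kernel's OF address code: struct dt_range and dt_translate()
are hypothetical names, and only a single one-cell ranges entry is handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One <child-base parent-base size> entry of a "ranges" property. */
struct dt_range {
	uint32_t child_base;
	uint32_t parent_base;
	uint32_t size;
};

/*
 * Translate a child <addr len> region into the parent address space.
 * Returns false if the region does not fit inside the range.
 */
static bool dt_translate(const struct dt_range *r, uint32_t addr,
			 uint32_t len, uint32_t *out)
{
	uint32_t off;

	assert(r != NULL && out != NULL);

	if (addr < r->child_base)
		return false;
	off = addr - r->child_base;
	if (off > r->size || len > r->size - off)
		return false;

	*out = r->parent_base + off;
	return true;
}
```

With ranges = <0 0x00020000 0x40000> and reg = <0x1000 0x8>, the two
32-bit flag words land at 0x21000 and 0x21004 in the parent address space.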

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 7/8] ARM: sun9i: mcpm: Support cpu0 hotplug
  2018-01-15  7:14 [PATCH v3 0/8] ARM: sun9i: SMP and CPU hotplug support Chen-Yu Tsai
                   ` (5 preceding siblings ...)
  2018-01-15  7:14 ` [PATCH v3 6/8] dt-bindings: ARM: sunxi: Document A80 SoC secure SRAM usage by SMP hotplug Chen-Yu Tsai
@ 2018-01-15  7:14 ` Chen-Yu Tsai
  2018-01-15  7:14 ` [PATCH v3 8/8] ARM: dts: sun9i: Add secure SRAM node used for MCPM SMP hotplug Chen-Yu Tsai
  2018-01-15 21:00 ` [PATCH v3 0/8] ARM: sun9i: SMP and CPU hotplug support Nicolas Pitre
  8 siblings, 0 replies; 13+ messages in thread
From: Chen-Yu Tsai @ 2018-01-15  7:14 UTC (permalink / raw)
  To: Maxime Ripard, Russell King
  Cc: Chen-Yu Tsai, devicetree, linux-arm-kernel, linux-kernel,
	linux-sunxi, Nicolas Pitre, Dave Martin

The BROM has a branch that checks if the primary core is hotplugging.
If the magic flag is set, execution jumps to the address set in the
software entry register. (Secondary cores always branch to that
address.)

This patch sets the flags that make the BROM jump execution on the
primary core (cpu0) to the SMP software entry code when the core is
powered back up. After it is re-integrated into the system, the flag
is cleared.

Also add a custom .cpu_can_disable callback that returns true for all
cpus, so that cpu0 can really be brought down.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
---
 arch/arm/mach-sunxi/mcpm.c | 59 +++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 56 insertions(+), 3 deletions(-)

diff --git a/arch/arm/mach-sunxi/mcpm.c b/arch/arm/mach-sunxi/mcpm.c
index 740343a2156e..9ea8a4aa4f1d 100644
--- a/arch/arm/mach-sunxi/mcpm.c
+++ b/arch/arm/mach-sunxi/mcpm.c
@@ -64,8 +64,12 @@
 #define PRCM_PWR_SWITCH_REG(c, cpu)	(0x140 + 0x10 * (c) + 0x4 * (cpu))
 #define PRCM_CPU_SOFT_ENTRY_REG		0x164
 
+#define CPU0_SUPPORT_HOTPLUG_MAGIC0	0xFA50392F
+#define CPU0_SUPPORT_HOTPLUG_MAGIC1	0x790DCA3A
+
 static void __iomem *cpucfg_base;
 static void __iomem *prcm_base;
+static void __iomem *sram_b_smp_base;
 
 static bool sunxi_core_is_cortex_a15(unsigned int core, unsigned int cluster)
 {
@@ -124,6 +128,17 @@ static int sunxi_cpu_power_switch_set(unsigned int cpu, unsigned int cluster,
 	return 0;
 }
 
+static void sunxi_cpu0_hotplug_support_set(bool enable)
+{
+	if (enable) {
+		writel(CPU0_SUPPORT_HOTPLUG_MAGIC0, sram_b_smp_base);
+		writel(CPU0_SUPPORT_HOTPLUG_MAGIC1, sram_b_smp_base + 0x4);
+	} else {
+		writel(0x0, sram_b_smp_base);
+		writel(0x0, sram_b_smp_base + 0x4);
+	}
+}
+
 static int sunxi_cpu_powerup(unsigned int cpu, unsigned int cluster)
 {
 	u32 reg;
@@ -132,6 +147,10 @@ static int sunxi_cpu_powerup(unsigned int cpu, unsigned int cluster)
 	if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
 		return -EINVAL;
 
+	/* Set hotplug support magic flags for cpu0 */
+	if (cluster == 0 && cpu == 0)
+		sunxi_cpu0_hotplug_support_set(true);
+
 	/* assert processor power-on reset */
 	reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
 	reg &= ~PRCM_CPU_PO_RST_CTRL_CORE(cpu);
@@ -360,6 +379,13 @@ static bool sunxi_mcpm_cluster_is_down(unsigned int cluster)
 	return true;
 }
 
+static void sunxi_mcpm_secondary_init(unsigned int cpu)
+{
+	/* Clear hotplug support magic flags for cpu0 */
+	if (cpu == 0)
+		sunxi_cpu0_hotplug_support_set(false);
+}
+
 static int sunxi_mcpm_boot_secondary(unsigned int l_cpu, struct task_struct *idle)
 {
 	unsigned int mpidr, cpu, cluster;
@@ -570,13 +596,19 @@ static int sunxi_mcpm_cpu_kill(unsigned int l_cpu)
 	return !ret;
 }
 
+static bool sunxi_mcpm_cpu_can_disable(unsigned int __unused)
+{
+	return true;
+}
 #endif
 
 static const struct smp_operations sunxi_mcpm_smp_ops __initconst = {
+	.smp_secondary_init	= sunxi_mcpm_secondary_init,
 	.smp_boot_secondary	= sunxi_mcpm_boot_secondary,
 #ifdef CONFIG_HOTPLUG_CPU
 	.cpu_die		= sunxi_mcpm_cpu_die,
 	.cpu_kill		= sunxi_mcpm_cpu_kill,
+	.cpu_can_disable	= sunxi_mcpm_cpu_can_disable,
 #endif
 };
 
@@ -652,7 +684,7 @@ static int __init sunxi_mcpm_lookback(void)
 
 static int __init sunxi_mcpm_init(void)
 {
-	struct device_node *cpucfg_node, *node;
+	struct device_node *cpucfg_node, *sram_node, *node;
 	struct resource res;
 	int ret;
 
@@ -700,16 +732,31 @@ static int __init sunxi_mcpm_init(void)
 		goto err_put_cpucfg_node;
 	}
 
+	sram_node = of_find_compatible_node(NULL, NULL,
+					    "allwinner,sun9i-a80-smp-sram");
+	if (!sram_node) {
+		ret = -ENODEV;
+		goto err_unmap_release_cpucfg;
+	}
+
+	sram_b_smp_base = of_io_request_and_map(sram_node, 0, "sunxi-mcpm");
+	if (IS_ERR(sram_b_smp_base)) {
+		ret = PTR_ERR(sram_b_smp_base);
+		pr_err("%s: failed to map secure SRAM\n", __func__);
+		goto err_put_sram_node;
+	}
+
 	/* Configure CCI-400 for boot cluster */
 	ret = sunxi_mcpm_lookback();
 	if (ret) {
 		pr_err("%s: failed to configure boot cluster: %d\n",
 		       __func__, ret);
-		goto err_unmap_release_cpucfg;
+		goto err_unmap_release_secure_sram;
 	}
 
-	/* We don't need the CPUCFG device node anymore */
+	/* We don't need the CPUCFG and SRAM device nodes anymore */
 	of_node_put(cpucfg_node);
+	of_node_put(sram_node);
 
 	/* Set the hardware entry point address */
 	writel(__pa_symbol(sunxi_mcpm_secondary_startup),
@@ -722,6 +769,12 @@ static int __init sunxi_mcpm_init(void)
 
 	return 0;
 
+err_unmap_release_secure_sram:
+	iounmap(sram_b_smp_base);
+	of_address_to_resource(sram_node, 0, &res);
+	release_mem_region(res.start, resource_size(&res));
+err_put_sram_node:
+	of_node_put(sram_node);
 err_unmap_release_cpucfg:
 	iounmap(cpucfg_base);
 	of_address_to_resource(cpucfg_node, 0, &res);
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 8/8] ARM: dts: sun9i: Add secure SRAM node used for MCPM SMP hotplug
  2018-01-15  7:14 [PATCH v3 0/8] ARM: sun9i: SMP and CPU hotplug support Chen-Yu Tsai
                   ` (6 preceding siblings ...)
  2018-01-15  7:14 ` [PATCH v3 7/8] ARM: sun9i: mcpm: Support cpu0 hotplug Chen-Yu Tsai
@ 2018-01-15  7:14 ` Chen-Yu Tsai
  2018-01-15 21:00 ` [PATCH v3 0/8] ARM: sun9i: SMP and CPU hotplug support Nicolas Pitre
  8 siblings, 0 replies; 13+ messages in thread
From: Chen-Yu Tsai @ 2018-01-15  7:14 UTC (permalink / raw)
  To: Maxime Ripard, Russell King
  Cc: Chen-Yu Tsai, devicetree, linux-arm-kernel, linux-kernel,
	linux-sunxi, Nicolas Pitre, Dave Martin

The A80 stores some magic flags in a portion of the secure SRAM. The
BROM jumps directly to the software entry point set by the SMP code
if the flags are set. This is required for CPU0 hotplugging.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
---
 arch/arm/boot/dts/sun9i-a80.dtsi | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/arch/arm/boot/dts/sun9i-a80.dtsi b/arch/arm/boot/dts/sun9i-a80.dtsi
index bf4d40e8359f..b1c86b76ac3c 100644
--- a/arch/arm/boot/dts/sun9i-a80.dtsi
+++ b/arch/arm/boot/dts/sun9i-a80.dtsi
@@ -250,6 +250,25 @@
 		 */
 		ranges = <0 0 0 0x20000000>;
 
+		sram_b: sram@20000 {
+			/* 256 KiB secure SRAM at 0x20000 */
+			compatible = "mmio-sram";
+			reg = <0x00020000 0x40000>;
+
+			#address-cells = <1>;
+			#size-cells = <1>;
+			ranges = <0 0x00020000 0x40000>;
+
+			smp-sram@1000 {
+				/*
+				 * This is checked by BROM to determine if
+				 * cpu0 should jump to SMP entry vector
+				 */
+				compatible = "allwinner,sun9i-a80-smp-sram";
+				reg = <0x1000 0x8>;
+			};
+		};
+
 		ehci0: usb@a00000 {
 			compatible = "allwinner,sun9i-a80-ehci", "generic-ehci";
 			reg = <0x00a00000 0x100>;
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 1/8] ARM: sun9i: Support SMP bring-up on A80
  2018-01-15  7:14 ` [PATCH v3 1/8] ARM: sun9i: Support SMP bring-up on A80 Chen-Yu Tsai
@ 2018-01-15 12:04   ` Dave Martin
  2018-01-16  4:09     ` Chen-Yu Tsai
  0 siblings, 1 reply; 13+ messages in thread
From: Dave Martin @ 2018-01-15 12:04 UTC (permalink / raw)
  To: Chen-Yu Tsai
  Cc: Maxime Ripard, Russell King, Nicolas Pitre, devicetree,
	linux-sunxi, linux-kernel, linux-arm-kernel

On Mon, Jan 15, 2018 at 07:14:43AM +0000, Chen-Yu Tsai wrote:
> The A80 is a big.LITTLE SoC with 1 cluster of 4 Cortex-A7s and
> 1 cluster of 4 Cortex-A15s.
> 
> This patch adds support to bring up the second cluster and thus all
> cores using custom platform SMP code. Core/cluster power down has not
> been implemented, thus CPU hotplugging is not supported.
> 
> This is limited to !THUMB2_KERNEL for now. The entry code must be built
> as ARM machine code, and it does not switch modes. Further work was
> done to move the assembly code to a separate file and add the proper
> mode statements and mode switching instructions. However initial tests
> failed to boot properly with Thumb-2.
> 
> Parts of the trampoline and re-entry code for the boot cpu was adapted
> from the MCPM framework.
> 
> Signed-off-by: Chen-Yu Tsai <wens@csie.org>
> ---
>  arch/arm/mach-sunxi/Kconfig  |   7 +
>  arch/arm/mach-sunxi/Makefile |   1 +
>  arch/arm/mach-sunxi/mcpm.c   | 548 +++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 556 insertions(+)
>  create mode 100644 arch/arm/mach-sunxi/mcpm.c
> 
> diff --git a/arch/arm/mach-sunxi/Kconfig b/arch/arm/mach-sunxi/Kconfig
> index 58153cdf025b..b53e37d170e6 100644
> --- a/arch/arm/mach-sunxi/Kconfig
> +++ b/arch/arm/mach-sunxi/Kconfig
> @@ -48,4 +48,11 @@ config MACH_SUN9I
>  	default ARCH_SUNXI
>  	select ARM_GIC
>  
> +config ARCH_SUNXI_MCPM
> +	bool
> +	depends on SMP && !THUMB2_KERNEL
> +	default MACH_SUN9I
> +	select ARM_CCI400_PORT_CTRL
> +	select ARM_CPU_SUSPEND
> +

If this no longer uses MCPM, you should get rid of "mcpm" from all the
names, comments etc. -- it's just confusing otherwise.

>  endif
> diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile
> index 27b168f121a1..cacd1afa8137 100644
> --- a/arch/arm/mach-sunxi/Makefile
> +++ b/arch/arm/mach-sunxi/Makefile
> @@ -1,2 +1,3 @@
>  obj-$(CONFIG_ARCH_SUNXI) += sunxi.o
> +obj-$(CONFIG_ARCH_SUNXI_MCPM) += mcpm.o
>  obj-$(CONFIG_SMP) += platsmp.o
> diff --git a/arch/arm/mach-sunxi/mcpm.c b/arch/arm/mach-sunxi/mcpm.c
> new file mode 100644
> index 000000000000..7c77bb3b367a
> --- /dev/null
> +++ b/arch/arm/mach-sunxi/mcpm.c

[...]

> +static int sunxi_mcpm_cpu_table[SUNXI_NR_CLUSTERS][SUNXI_CPUS_PER_CLUSTER];
> +static int sunxi_mcpm_first_comer;
> +
> +/*
> + * Enable cluster-level coherency, in preparation for turning on the MMU.
> + *
> + * Also enable regional clock gating and L2 data latency settings for
> + * Cortex-A15. These settings are from the vendor kernel.
> + */
> +static void __naked sunxi_mcpm_cluster_cache_enable(void)
> +{
> +	asm volatile (
> +		"mrc	p15, 0, r1, c0, c0, 0\n"
> +		"movw	r2, #" __stringify(ARM_CPU_PART_MASK & 0xffff) "\n"
> +		"movt	r2, #" __stringify(ARM_CPU_PART_MASK >> 16) "\n"
> +		"and	r1, r1, r2\n"
> +		"movw	r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 & 0xffff) "\n"
> +		"movt	r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 >> 16) "\n"
> +		"cmp	r1, r2\n"
> +		"bne	not_a15\n"
> +
> +		/* The following is Cortex-A15 specific */
> +
> +		/* ACTLR2: Enable CPU regional clock gates */
> +		"mrc p15, 1, r1, c15, c0, 4\n"
> +		"orr r1, r1, #(0x1<<31)\n"
> +		"mcr p15, 1, r1, c15, c0, 4\n"
> +
> +		/* L2ACTLR */
> +		"mrc p15, 1, r1, c15, c0, 0\n"
> +		/* Enable L2, GIC, and Timer regional clock gates */
> +		"orr r1, r1, #(0x1<<26)\n"
> +		/* Disable clean/evict from being pushed to external */
> +		"orr r1, r1, #(0x1<<3)\n"
> +		"mcr p15, 1, r1, c15, c0, 0\n"
> +
> +		/* L2CTRL: L2 data RAM latency */
> +		"mrc p15, 1, r1, c9, c0, 2\n"
> +		"bic r1, r1, #(0x7<<0)\n"
> +		"orr r1, r1, #(0x3<<0)\n"
> +		"mcr p15, 1, r1, c9, c0, 2\n"
> +
> +		/* End of Cortex-A15 specific setup */
> +		"not_a15:\n"
> +
> +		/* Get value of sunxi_mcpm_first_comer */
> +		"adr	r1, first\n"
> +		"ldr	r0, [r1]\n"
> +		"ldr	r0, [r1, r0]\n"
> +
> +		/* Skip cci_enable_port_for_self if not first comer */
> +		"cmp	r0, #0\n"
> +		"bxeq	lr\n"
> +		"b	cci_enable_port_for_self\n"

For Thumb, you need a ".align 2" here.

I've never understood why gas doesn't implicitly align .word on arches
that care about data alignment, but it doesn't...

Since the MMU isn't on yet, you may get alignment faults if the .word
is misaligned in the assembled code.

> +
> +		"first: .word sunxi_mcpm_first_comer - .\n"
> +	);
> +}

[...]

> +static int __init sunxi_mcpm_init(void)
> +{
> +	struct device_node *cpucfg_node, *node;
> +	struct resource res;
> +	int ret;
> +
> +	if (!of_machine_is_compatible("allwinner,sun9i-a80"))
> +		return -ENODEV;
> +
> +	if (!sunxi_mcpm_cpu_table_init())
> +		return -EINVAL;
> +
> +	if (!cci_probed()) {
> +		pr_err("%s: CCI-400 not available\n", __func__);
> +		return -ENODEV;
> +	}
> +
> +	node = of_find_compatible_node(NULL, NULL, "allwinner,sun9i-a80-prcm");
> +	if (!node) {
> +		pr_err("%s: PRCM not available\n", __func__);
> +		return -ENODEV;
> +	}
> +
> +	/*
> +	 * Unfortunately we can not request the I/O region for the PRCM.
> +	 * It is shared with the PRCM clock.
> +	 */
> +	prcm_base = of_iomap(node, 0);
> +	of_node_put(node);
> +	if (!prcm_base) {
> +		pr_err("%s: failed to map PRCM registers\n", __func__);
> +		return -ENOMEM;
> +	}
> +
> +	cpucfg_node = of_find_compatible_node(NULL, NULL,
> +					      "allwinner,sun9i-a80-cpucfg");
> +	if (!cpucfg_node) {
> +		ret = -ENODEV;
> +		pr_err("%s: CPUCFG not available\n", __func__);
> +		goto err_unmap_prcm;
> +	}
> +
> +	cpucfg_base = of_io_request_and_map(cpucfg_node, 0, "sunxi-mcpm");
> +	if (IS_ERR(cpucfg_base)) {
> +		ret = PTR_ERR(cpucfg_base);
> +		pr_err("%s: failed to map CPUCFG registers: %d\n",
> +		       __func__, ret);
> +		goto err_put_cpucfg_node;
> +	}
> +
> +	/* Configure CCI-400 for boot cluster */
> +	ret = sunxi_mcpm_lookback();
> +	if (ret) {
> +		pr_err("%s: failed to configure boot cluster: %d\n",
> +		       __func__, ret);
> +		goto err_unmap_release_cpucfg;
> +	}
> +
> +	/* We don't need the CPUCFG device node anymore */
> +	of_node_put(cpucfg_node);
> +
> +	/* Set the hardware entry point address */
> +	writel(__pa_symbol(sunxi_mcpm_secondary_startup),
> +	       prcm_base + PRCM_CPU_SOFT_ENTRY_REG);

It's possible the firmware / boot ROM doesn't know how to branch
correctly to a Thumb symbol here.  This is often missed in firmware
implementations (or deliberately omitted, since it is easy to work
around in the OS).

Things you could try:

1) Add CFLAGS_mcpm.o += -marm
(to see whether building just this file as ARM fixes it).

2) Split sunxi_mcpm_secondary_startup out into a separate asm file
and build just that as ARM, so that the entry point from the firmware
is ARM.

If both work, the firmware can't branch directly to Thumb and so you
need to keep the secondary entry point as ARM code.

If (1) works but (2) doesn't, then there must be something else in this
file that is Thumb-incompatible, but I don't see it so far.


[...]

Cheers
---Dave
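
For reference, the "first: .word sym - ." lookup and the effect of
".align 2" (4-byte alignment on ARM gas) can be modelled on a byte buffer
in userspace C. This is only an illustration: align_word(),
emit_rel_word() and load_via_rel_word() are hypothetical helpers, and the
buffer offsets stand in for addresses in the assembled code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round an offset up to the next 4-byte boundary, like ".align 2". */
static size_t align_word(size_t off)
{
	return (off + 3) & ~(size_t)3;
}

/* Emit "label: .word target - ." at word_off, which must be aligned. */
static void emit_rel_word(uint8_t *image, size_t word_off, size_t target_off)
{
	int32_t rel = (int32_t)target_off - (int32_t)word_off;

	assert(word_off % 4 == 0);
	memcpy(image + word_off, &rel, sizeof(rel));
}

/* Follow the self-relative offset and load the 32-bit value it points at. */
static int32_t load_via_rel_word(const uint8_t *image, size_t word_off)
{
	int32_t rel, val;

	memcpy(&rel, image + word_off, sizeof(rel));		/* ldr r0, [r1] */
	memcpy(&val, image + word_off + rel, sizeof(val));	/* ldr r0, [r1, r0] */
	return val;
}
```

Because only the distance between the word and the variable is stored,
the lookup gives the same result whatever address the code runs from,
which is why it works before the MMU is on.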

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 0/8] ARM: sun9i: SMP and CPU hotplug support
  2018-01-15  7:14 [PATCH v3 0/8] ARM: sun9i: SMP and CPU hotplug support Chen-Yu Tsai
                   ` (7 preceding siblings ...)
  2018-01-15  7:14 ` [PATCH v3 8/8] ARM: dts: sun9i: Add secure SRAM node used for MCPM SMP hotplug Chen-Yu Tsai
@ 2018-01-15 21:00 ` Nicolas Pitre
  2018-01-16  4:24   ` Chen-Yu Tsai
  8 siblings, 1 reply; 13+ messages in thread
From: Nicolas Pitre @ 2018-01-15 21:00 UTC (permalink / raw)
  To: Chen-Yu Tsai
  Cc: Maxime Ripard, Russell King, devicetree, linux-arm-kernel,
	linux-kernel, linux-sunxi, Dave Martin

On Mon, 15 Jan 2018, Chen-Yu Tsai wrote:

> Changes since v2:
>   - Do away with the MCPM framework, directly implement smp_ops
>   - Some debug messages were clarified
>   - New ARCH_SUNXI_MCPM Kconfig symbol for this feature

You should use ARCH_SUNXI_SMP instead to avoid confusion with the actual 
MCPM code. Ditto for function names as Dave mentioned.

For the ARM to Thumb switch you could add something like this at the 
beginning of your entry code:

#ifdef CONFIG_THUMB2_KERNEL
	.arm
	add	ip, pc, #1
	bx	ip
	.thumb
#endif
	[your code follows here]

And make sure that code is word aligned.


Nicolas
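
An ARM to Thumb switch of this kind relies on two architectural rules:
in ARM state a read of pc yields the address of the current instruction
plus 8, and bx selects Thumb state when bit 0 of its operand is set. A
small userspace C model of just those two rules, as I understand them
(struct bx_result, bx_decode() and arm_pc_read() are hypothetical names,
not kernel code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct bx_result {
	uint32_t target;	/* address execution continues at */
	bool thumb;		/* true if the core ends up in Thumb state */
};

/* Model of "bx Rm": bit 0 picks the state and is not part of the address. */
static struct bx_result bx_decode(uint32_t operand)
{
	struct bx_result r;

	r.thumb = operand & 1;
	r.target = operand & ~UINT32_C(1);
	return r;
}

/* Value seen when an ARM-state instruction at insn_addr reads pc. */
static uint32_t arm_pc_read(uint32_t insn_addr)
{
	assert((insn_addr & 3) == 0);	/* ARM instructions are word aligned */
	return insn_addr + 8;
}
```

So an ARM instruction at address X that computes pc + 1, followed by a bx
at X + 4, continues in Thumb state at X + 8, directly after the bx. That
also shows why the entry point has to be word aligned.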

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 1/8] ARM: sun9i: Support SMP bring-up on A80
  2018-01-15 12:04   ` Dave Martin
@ 2018-01-16  4:09     ` Chen-Yu Tsai
  0 siblings, 0 replies; 13+ messages in thread
From: Chen-Yu Tsai @ 2018-01-16  4:09 UTC (permalink / raw)
  To: Dave Martin
  Cc: Maxime Ripard, Russell King, Nicolas Pitre, devicetree,
	linux-sunxi, linux-kernel, linux-arm-kernel

On Mon, Jan 15, 2018 at 8:04 PM, Dave Martin <Dave.Martin@arm.com> wrote:
> On Mon, Jan 15, 2018 at 07:14:43AM +0000, Chen-Yu Tsai wrote:
>> The A80 is a big.LITTLE SoC with 1 cluster of 4 Cortex-A7s and
>> 1 cluster of 4 Cortex-A15s.
>>
>> This patch adds support to bring up the second cluster and thus all
>> cores using custom platform SMP code. Core/cluster power down has not
>> been implemented, thus CPU hotplugging is not supported.
>>
>> This is limited to !THUMB2_KERNEL for now. The entry code must be built
>> as ARM machine code, and it does not switch modes. Further work was
>> done to move the assembly code to a separate file and add the proper
>> mode statements and mode switching instructions. However initial tests
>> failed to boot properly with Thumb-2.
>>
>> Parts of the trampoline and re-entry code for the boot cpu was adapted
>> from the MCPM framework.
>>
>> Signed-off-by: Chen-Yu Tsai <wens@csie.org>
>> ---
>>  arch/arm/mach-sunxi/Kconfig  |   7 +
>>  arch/arm/mach-sunxi/Makefile |   1 +
>>  arch/arm/mach-sunxi/mcpm.c   | 548 +++++++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 556 insertions(+)
>>  create mode 100644 arch/arm/mach-sunxi/mcpm.c
>>
>> diff --git a/arch/arm/mach-sunxi/Kconfig b/arch/arm/mach-sunxi/Kconfig
>> index 58153cdf025b..b53e37d170e6 100644
>> --- a/arch/arm/mach-sunxi/Kconfig
>> +++ b/arch/arm/mach-sunxi/Kconfig
>> @@ -48,4 +48,11 @@ config MACH_SUN9I
>>       default ARCH_SUNXI
>>       select ARM_GIC
>>
>> +config ARCH_SUNXI_MCPM
>> +     bool
>> +     depends on SMP && !THUMB2_KERNEL
>> +     default MACH_SUN9I
>> +     select ARM_CCI400_PORT_CTRL
>> +     select ARM_CPU_SUSPEND
>> +
>
> If this no longer uses MCPM, you should get rid of "mcpm" from all the
> names, comments etc. -- it's just confusing otherwise.

Discussed with Maxime. Will switch to "mc_smp" instead. The "mc" part
is there to differentiate it from some old SMP code we have for the A31
and A23.

>
>>  endif
>> diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile
>> index 27b168f121a1..cacd1afa8137 100644
>> --- a/arch/arm/mach-sunxi/Makefile
>> +++ b/arch/arm/mach-sunxi/Makefile
>> @@ -1,2 +1,3 @@
>>  obj-$(CONFIG_ARCH_SUNXI) += sunxi.o
>> +obj-$(CONFIG_ARCH_SUNXI_MCPM) += mcpm.o
>>  obj-$(CONFIG_SMP) += platsmp.o
>> diff --git a/arch/arm/mach-sunxi/mcpm.c b/arch/arm/mach-sunxi/mcpm.c
>> new file mode 100644
>> index 000000000000..7c77bb3b367a
>> --- /dev/null
>> +++ b/arch/arm/mach-sunxi/mcpm.c
>
> [...]
>
>> +static int sunxi_mcpm_cpu_table[SUNXI_NR_CLUSTERS][SUNXI_CPUS_PER_CLUSTER];
>> +static int sunxi_mcpm_first_comer;
>> +
>> +/*
>> + * Enable cluster-level coherency, in preparation for turning on the MMU.
>> + *
>> + * Also enable regional clock gating and L2 data latency settings for
>> + * Cortex-A15. These settings are from the vendor kernel.
>> + */
>> +static void __naked sunxi_mcpm_cluster_cache_enable(void)
>> +{
>> +     asm volatile (
>> +             "mrc    p15, 0, r1, c0, c0, 0\n"
>> +             "movw   r2, #" __stringify(ARM_CPU_PART_MASK & 0xffff) "\n"
>> +             "movt   r2, #" __stringify(ARM_CPU_PART_MASK >> 16) "\n"
>> +             "and    r1, r1, r2\n"
>> +             "movw   r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 & 0xffff) "\n"
>> +             "movt   r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 >> 16) "\n"
>> +             "cmp    r1, r2\n"
>> +             "bne    not_a15\n"
>> +
>> +             /* The following is Cortex-A15 specific */
>> +
>> +             /* ACTLR2: Enable CPU regional clock gates */
>> +             "mrc p15, 1, r1, c15, c0, 4\n"
>> +             "orr r1, r1, #(0x1<<31)\n"
>> +             "mcr p15, 1, r1, c15, c0, 4\n"
>> +
>> +             /* L2ACTLR */
>> +             "mrc p15, 1, r1, c15, c0, 0\n"
>> +             /* Enable L2, GIC, and Timer regional clock gates */
>> +             "orr r1, r1, #(0x1<<26)\n"
>> +             /* Disable clean/evict from being pushed to external */
>> +             "orr r1, r1, #(0x1<<3)\n"
>> +             "mcr p15, 1, r1, c15, c0, 0\n"
>> +
>> +             /* L2CTRL: L2 data RAM latency */
>> +             "mrc p15, 1, r1, c9, c0, 2\n"
>> +             "bic r1, r1, #(0x7<<0)\n"
>> +             "orr r1, r1, #(0x3<<0)\n"
>> +             "mcr p15, 1, r1, c9, c0, 2\n"
>> +
>> +             /* End of Cortex-A15 specific setup */
>> +             "not_a15:\n"
>> +
>> +             /* Get value of sunxi_mcpm_first_comer */
>> +             "adr    r1, first\n"
>> +             "ldr    r0, [r1]\n"
>> +             "ldr    r0, [r1, r0]\n"
>> +
>> +             /* Skip cci_enable_port_for_self if not first comer */
>> +             "cmp    r0, #0\n"
>> +             "bxeq   lr\n"
>> +             "b      cci_enable_port_for_self\n"
>
> For Thumb, you need a ".align 2" here.
>
> I've never understood why gas doesn't implicitly align .word on arches
> that care about data alignment, but it doesn't...
>
> Since the MMU isn't on yet, you may get alignment faults if the .word
> is misaligned in the assembled code.

This was indeed the problem. Don't know why I missed it considering I
had added them for other parts in my full assembly patches...

>
>> +
>> +             "first: .word sunxi_mcpm_first_comer - .\n"
>> +     );
>> +}
>
> [...]
>
>> +static int __init sunxi_mcpm_init(void)
>> +{
>> +     struct device_node *cpucfg_node, *node;
>> +     struct resource res;
>> +     int ret;
>> +
>> +     if (!of_machine_is_compatible("allwinner,sun9i-a80"))
>> +             return -ENODEV;
>> +
>> +     if (!sunxi_mcpm_cpu_table_init())
>> +             return -EINVAL;
>> +
>> +     if (!cci_probed()) {
>> +             pr_err("%s: CCI-400 not available\n", __func__);
>> +             return -ENODEV;
>> +     }
>> +
>> +     node = of_find_compatible_node(NULL, NULL, "allwinner,sun9i-a80-prcm");
>> +     if (!node) {
>> +             pr_err("%s: PRCM not available\n", __func__);
>> +             return -ENODEV;
>> +     }
>> +
>> +     /*
>> +      * Unfortunately we can not request the I/O region for the PRCM.
>> +      * It is shared with the PRCM clock.
>> +      */
>> +     prcm_base = of_iomap(node, 0);
>> +     of_node_put(node);
>> +     if (!prcm_base) {
>> +             pr_err("%s: failed to map PRCM registers\n", __func__);
>> +             return -ENOMEM;
>> +     }
>> +
>> +     cpucfg_node = of_find_compatible_node(NULL, NULL,
>> +                                           "allwinner,sun9i-a80-cpucfg");
>> +     if (!cpucfg_node) {
>> +             ret = -ENODEV;
>> +             pr_err("%s: CPUCFG not available\n", __func__);
>> +             goto err_unmap_prcm;
>> +     }
>> +
>> +     cpucfg_base = of_io_request_and_map(cpucfg_node, 0, "sunxi-mcpm");
>> +     if (IS_ERR(cpucfg_base)) {
>> +             ret = PTR_ERR(cpucfg_base);
>> +             pr_err("%s: failed to map CPUCFG registers: %d\n",
>> +                    __func__, ret);
>> +             goto err_put_cpucfg_node;
>> +     }
>> +
>> +     /* Configure CCI-400 for boot cluster */
>> +     ret = sunxi_mcpm_loopback();
>> +     if (ret) {
>> +             pr_err("%s: failed to configure boot cluster: %d\n",
>> +                    __func__, ret);
>> +             goto err_unmap_release_cpucfg;
>> +     }
>> +
>> +     /* We don't need the CPUCFG device node anymore */
>> +     of_node_put(cpucfg_node);
>> +
>> +     /* Set the hardware entry point address */
>> +     writel(__pa_symbol(sunxi_mcpm_secondary_startup),
>> +            prcm_base + PRCM_CPU_SOFT_ENTRY_REG);
>
> It's possible the firmware / boot ROM doesn't know how to branch
> correctly to a Thumb symbol here.  This is often missed in firmware
> implementations (or deliberately omitted, since it is easy to work
> around in the OS).

I've tried all the tricks to jump from ARM to Thumb. Then it occurred
to me, __pa_symbol() would set the lowest bit for Thumb symbols. So I
tried just using cpu_resume for the trampoline re-entry point, and it
worked.

I went back and checked the BROM code. It does

    ldr pc, <entry_code_address>

which I assume is the same as

    ldr ip, <entry_code_address>
    bx ip

Indeed just fixing the alignment issue above made it work.
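
As a sketch of the interworking rule assumed above (on ARMv7, both
"bx <reg>" and a load into pc from ARM state select the instruction set
from bit 0 of the target value), the entry address written to the soft
entry register can be decoded as below. The struct and function names are
hypothetical and only model the rule; they are not kernel or BROM code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of where an interworking branch lands. */
struct entry_target {
	uint32_t addr;	/* address of the first instruction fetched */
	bool thumb;	/* true if execution continues in Thumb state */
};

/*
 * Bit 0 of the branch value selects Thumb state and is not part of the
 * fetch address. ARM targets (bit 0 clear) are expected to be word
 * aligned already.
 */
static struct entry_target decode_interworking_entry(uint32_t value)
{
	struct entry_target t;

	t.thumb = (value & 1u) != 0;
	t.addr = value & ~1u;
	return t;
}
```

With a Thumb symbol, the value written has bit 0 set, so a BROM that
performs an interworking branch enters the kernel entry code in Thumb
state without any extra veneer.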

> Things you could try:
>
> 1) Add CFLAGS_mcpm.o += -marm
> (to see whether building just this file as ARM fixes it).

This doesn't really work. The spinlock code still uses Thumb instructions
and the assembler complains.

> 2) Split sunxi_mcpm_secondary_startup out into a separate asm file
> and build just that as ARM, so that the entry point from the firmware
> is ARM.

I tried this before but...

> If both work, the firmware can't branch directly to Thumb and so you
> need to keep the secondary entry point as ARM code.

as mentioned above the BROM does branch correctly and it was the .word
alignment that was the culprit.

Thanks for all the feedback!
ChenYu

> If (1) works but (2) doesn't, then there must be something else in this
> file that is Thumb-incompatible, but I don't see it so far.
>
>
> [...]
>
> Cheers
> ---Dave


* Re: [PATCH v3 0/8] ARM: sun9i: SMP and CPU hotplug support
  2018-01-15 21:00 ` [PATCH v3 0/8] ARM: sun9i: SMP and CPU hotplug support Nicolas Pitre
@ 2018-01-16  4:24   ` Chen-Yu Tsai
  0 siblings, 0 replies; 13+ messages in thread
From: Chen-Yu Tsai @ 2018-01-16  4:24 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Maxime Ripard, Russell King, devicetree, linux-arm-kernel,
	linux-kernel, linux-sunxi, Dave Martin

On Tue, Jan 16, 2018 at 5:00 AM, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> On Mon, 15 Jan 2018, Chen-Yu Tsai wrote:
>
>> Changes since v2:
>>   - Do away with the MCPM framework, directly implement smp_ops
>>   - Some debug messages were clarified
>>   - New ARCH_SUNXI_MCPM Kconfig symbol for this feature
>
> You should use ARCH_SUNXI_SMP instead to avoid confusion with the actual
> MCPM code. Ditto for function names as Dave mentioned.

All switched to "MC_SMP". There is existing, albeit deprecated, SMP code
for single cluster SoCs, so "multi cluster" is desired to differentiate
from the old stuff.

> For the ARM to Thumb switch you could add something like this at the
> beginning of your entry code:
>
> #ifdef CONFIG_THUMB2_KERNEL
>         .arm
>         add     ip, pc, #1
>         bx      ip
>         .thumb
> #endif
>         [your code follows here]
>
> And make sure that code is word aligned.

Thanks for the tip. As I mentioned in my reply to Dave,
this wasn't really needed.

Thanks again!
ChenYu


Thread overview: 13+ messages
2018-01-15  7:14 [PATCH v3 0/8] ARM: sun9i: SMP and CPU hotplug support Chen-Yu Tsai
2018-01-15  7:14 ` [PATCH v3 1/8] ARM: sun9i: Support SMP bring-up on A80 Chen-Yu Tsai
2018-01-15 12:04   ` Dave Martin
2018-01-16  4:09     ` Chen-Yu Tsai
2018-01-15  7:14 ` [PATCH v3 2/8] ARM: dts: sun9i: Add CCI-400 device nodes for A80 Chen-Yu Tsai
2018-01-15  7:14 ` [PATCH v3 3/8] ARM: dts: sun9i: Add CPUCFG device node for A80 dtsi Chen-Yu Tsai
2018-01-15  7:14 ` [PATCH v3 4/8] ARM: dts: sun9i: Add PRCM device node for the " Chen-Yu Tsai
2018-01-15  7:14 ` [PATCH v3 5/8] ARM: sun9i: mcpm: Support CPU/cluster power down and hotplugging for cpu1~7 Chen-Yu Tsai
2018-01-15  7:14 ` [PATCH v3 6/8] dt-bindings: ARM: sunxi: Document A80 SoC secure SRAM usage by SMP hotplug Chen-Yu Tsai
2018-01-15  7:14 ` [PATCH v3 7/8] ARM: sun9i: mcpm: Support cpu0 hotplug Chen-Yu Tsai
2018-01-15  7:14 ` [PATCH v3 8/8] ARM: dts: sun9i: Add secure SRAM node used for MCPM SMP hotplug Chen-Yu Tsai
2018-01-15 21:00 ` [PATCH v3 0/8] ARM: sun9i: SMP and CPU hotplug support Nicolas Pitre
2018-01-16  4:24   ` Chen-Yu Tsai
