All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/6] arm64: Add asm/cpu.h
@ 2013-12-11 13:13 Mark Brown
  2013-12-11 13:13 ` [PATCH 2/6] arm64: dts: Add a virtio disk to the RTSM motherboard Mark Brown
                   ` (5 more replies)
  0 siblings, 6 replies; 47+ messages in thread
From: Mark Brown @ 2013-12-11 13:13 UTC (permalink / raw)
  To: linux-arm-kernel

From: Hanjun Guo <hanjun.guo@linaro.org>

In order to implement both topology and ACPI support we need to define
per-CPU data.

Signed-off-by: Graeme Gregory <graeme.gregory@linaro.org>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Mark Brown <broonie@linaro.org>
---
 arch/arm64/include/asm/cpu.h | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)
 create mode 100644 arch/arm64/include/asm/cpu.h

diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
new file mode 100644
index 000000000000..d67ff011d361
--- /dev/null
+++ b/arch/arm64/include/asm/cpu.h
@@ -0,0 +1,25 @@
+/*
+ *  Copyright (C) 2004-2005 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef __ASM_ARM64_CPU_H
+#define __ASM_ARM64_CPU_H
+
+#include <linux/percpu.h>
+#include <linux/cpu.h>
+#include <linux/topology.h>
+
+struct cpuinfo_arm {
+	struct cpu	cpu;
+	u64		cpuid;
+#ifdef CONFIG_SMP
+	unsigned int	loops_per_jiffy;
+#endif
+};
+
+DECLARE_PER_CPU(struct cpuinfo_arm, cpu_data);
+
+#endif
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 2/6] arm64: dts: Add a virtio disk to the RTSM motherboard
  2013-12-11 13:13 [PATCH 1/6] arm64: Add asm/cpu.h Mark Brown
@ 2013-12-11 13:13 ` Mark Brown
  2013-12-11 13:13 ` [PATCH 3/6] arm64: dts: Add a devicetree for the ARMv8 4xA53 4xA57 FVP Mark Brown
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 47+ messages in thread
From: Mark Brown @ 2013-12-11 13:13 UTC (permalink / raw)
  To: linux-arm-kernel

From: Mark Hambleton <mahamble@broadcom.com>

Describe the virtio device so we can mount disk images in the simulator.

[Reduced the size of the region based on feedback from review -- broonie]

Signed-off-by: Mark Hambleton <mahamble@broadcom.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/boot/dts/rtsm_ve-motherboard.dtsi | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm64/boot/dts/rtsm_ve-motherboard.dtsi b/arch/arm64/boot/dts/rtsm_ve-motherboard.dtsi
index b45e5f39f577..2f2ecd217363 100644
--- a/arch/arm64/boot/dts/rtsm_ve-motherboard.dtsi
+++ b/arch/arm64/boot/dts/rtsm_ve-motherboard.dtsi
@@ -183,6 +183,12 @@
 				clocks = <&v2m_oscclk1>, <&v2m_clk24mhz>;
 				clock-names = "clcdclk", "apb_pclk";
 			};
+
+			virtio_block at 0130000 {
+				compatible = "virtio,mmio";
+				reg = <0x130000 0x200>;
+				interrupts = <42>;
+			};
 		};
 
 		v2m_fixed_3v3: fixedregulator at 0 {
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 3/6] arm64: dts: Add a devicetree for the ARMv8 4xA53 4xA57 FVP
  2013-12-11 13:13 [PATCH 1/6] arm64: Add asm/cpu.h Mark Brown
  2013-12-11 13:13 ` [PATCH 2/6] arm64: dts: Add a virtio disk to the RTSM motherboard Mark Brown
@ 2013-12-11 13:13 ` Mark Brown
  2013-12-11 13:55   ` Mark Rutland
  2013-12-11 13:13 ` [PATCH 4/6] arm64: topology: Implement basic CPU topology support Mark Brown
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 47+ messages in thread
From: Mark Brown @ 2013-12-11 13:13 UTC (permalink / raw)
  To: linux-arm-kernel

From: Mark Brown <broonie@linaro.org>

Add a dts file for ARMv8 fixed virtual platform which models a
big.LITTLE configuration for the architecture.  This is based on the
DT shipped by ARM but has been modified for mainline, removing features
not present in mainline and updating the CPU bindings for mainline.

Signed-off-by: Mark Brown <broonie@linaro.org>
---
 arch/arm64/boot/dts/Makefile                |   4 +-
 arch/arm64/boot/dts/fvp-base-gicv2-psci.dts | 233 ++++++++++++++++++++++++++++
 2 files changed, 236 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/boot/dts/fvp-base-gicv2-psci.dts

diff --git a/arch/arm64/boot/dts/Makefile b/arch/arm64/boot/dts/Makefile
index c52bdb051f66..2d16cdbe5d47 100644
--- a/arch/arm64/boot/dts/Makefile
+++ b/arch/arm64/boot/dts/Makefile
@@ -1,4 +1,6 @@
-dtb-$(CONFIG_ARCH_VEXPRESS) += rtsm_ve-aemv8a.dtb foundation-v8.dtb
+dtb-$(CONFIG_ARCH_VEXPRESS) += 	rtsm_ve-aemv8a.dtb \
+				foundation-v8.dtb \
+				fvp-base-gicv2-psci.dtb
 dtb-$(CONFIG_ARCH_XGENE) += apm-mustang.dtb
 
 targets += dtbs
diff --git a/arch/arm64/boot/dts/fvp-base-gicv2-psci.dts b/arch/arm64/boot/dts/fvp-base-gicv2-psci.dts
new file mode 100644
index 000000000000..736f546123bb
--- /dev/null
+++ b/arch/arm64/boot/dts/fvp-base-gicv2-psci.dts
@@ -0,0 +1,233 @@
+/*
+ * Copyright (c) 2013, ARM Limited. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials provided with the distribution.
+ *
+ * Neither the name of ARM nor the names of its contributors may be used
+ * to endorse or promote products derived from this software without specific
+ * prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/dts-v1/;
+
+/memreserve/ 0x80000000 0x00010000;
+
+/ {
+};
+
+/ {
+	model = "FVP Base";
+	compatible = "arm,vfp-base", "arm,vexpress";
+	interrupt-parent = <&gic>;
+	#address-cells = <2>;
+	#size-cells = <2>;
+
+	chosen { };
+
+	aliases {
+		serial0 = &v2m_serial0;
+		serial1 = &v2m_serial1;
+		serial2 = &v2m_serial2;
+		serial3 = &v2m_serial3;
+	};
+
+	psci {
+		compatible = "arm,psci";
+		method = "smc";
+		cpu_suspend = <0xc4000001>;
+		cpu_off = <0x84000002>;
+		cpu_on = <0xc4000003>;
+	};
+
+	cpus {
+		#address-cells = <2>;
+		#size-cells = <0>;
+
+		big0: cpu at 0 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a57", "arm,armv8";
+			reg = <0x0 0x0>;
+			enable-method = "psci";
+			clock-frequency = <1000000>;
+		};
+		big1: cpu at 1 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a57", "arm,armv8";
+			reg = <0x0 0x1>;
+			enable-method = "psci";
+			clock-frequency = <1000000>;
+		};
+		big2: cpu at 2 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a57", "arm,armv8";
+			reg = <0x0 0x2>;
+			enable-method = "psci";
+			clock-frequency = <1000000>;
+		};
+		big3: cpu at 3 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a57", "arm,armv8";
+			reg = <0x0 0x3>;
+			enable-method = "psci";
+			clock-frequency = <1000000>;
+		};
+		little0: cpu at 100 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a53", "arm,armv8";
+			reg = <0x0 0x100>;
+			enable-method = "psci";
+			clock-frequency = <1000000>;
+		};
+		little1: cpu at 101 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a53", "arm,armv8";
+			reg = <0x0 0x101>;
+			enable-method = "psci";
+			clock-frequency = <1000000>;
+		};
+		little2: cpu at 102 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a53", "arm,armv8";
+			reg = <0x0 0x102>;
+			enable-method = "psci";
+			clock-frequency = <1000000>;
+		};
+		little3: cpu at 103 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a53", "arm,armv8";
+			reg = <0x0 0x103>;
+			enable-method = "psci";
+			clock-frequency = <1000000>;
+		};
+	};
+
+	memory at 80000000 {
+		device_type = "memory";
+		reg = <0x00000000 0x80000000 0 0x80000000>,
+		      <0x00000008 0x80000000 0 0x80000000>;
+	};
+
+	gic: interrupt-controller at 2f000000 {
+		compatible = "arm,cortex-a15-gic", "arm,cortex-a9-gic";
+		#interrupt-cells = <3>;
+		#address-cells = <0>;
+		interrupt-controller;
+		reg = <0x0 0x2f000000 0 0x10000>,
+		      <0x0 0x2c000000 0 0x2000>,
+		      <0x0 0x2c010000 0 0x2000>,
+		      <0x0 0x2c02F000 0 0x2000>;
+		interrupts = <1 9 0xf04>;
+	};
+
+	timer {
+		compatible = "arm,armv8-timer";
+		interrupts = <1 13 0xff01>,
+			     <1 14 0xff01>,
+			     <1 11 0xff01>,
+			     <1 10 0xff01>;
+		clock-frequency = <100000000>;
+	};
+
+	timer at 2a810000 {
+			compatible = "arm,armv7-timer-mem";
+			reg = <0x0 0x2a810000 0x0 0x10000>;
+			clock-frequency = <100000000>;
+			#address-cells = <2>;
+			#size-cells = <2>;
+			ranges;
+			frame at 2a820000 {
+				frame-number = <0>;
+				interrupts = <0 25 4>;
+				reg = <0x0 0x2a820000 0x0 0x10000>;
+			};
+	};
+
+	pmu {
+		compatible = "arm,armv8-pmuv3";
+		interrupts = <0 60 4>,
+			     <0 61 4>,
+			     <0 62 4>,
+			     <0 63 4>;
+	};
+
+	smb {
+		compatible = "simple-bus";
+
+		#address-cells = <2>;
+		#size-cells = <1>;
+		ranges = <0 0 0 0x08000000 0x04000000>,
+			 <1 0 0 0x14000000 0x04000000>,
+			 <2 0 0 0x18000000 0x04000000>,
+			 <3 0 0 0x1c000000 0x04000000>,
+			 <4 0 0 0x0c000000 0x04000000>,
+			 <5 0 0 0x10000000 0x04000000>;
+
+		#interrupt-cells = <1>;
+		interrupt-map-mask = <0 0 63>;
+		interrupt-map = <0 0  0 &gic 0  0 4>,
+				<0 0  1 &gic 0  1 4>,
+				<0 0  2 &gic 0  2 4>,
+				<0 0  3 &gic 0  3 4>,
+				<0 0  4 &gic 0  4 4>,
+				<0 0  5 &gic 0  5 4>,
+				<0 0  6 &gic 0  6 4>,
+				<0 0  7 &gic 0  7 4>,
+				<0 0  8 &gic 0  8 4>,
+				<0 0  9 &gic 0  9 4>,
+				<0 0 10 &gic 0 10 4>,
+				<0 0 11 &gic 0 11 4>,
+				<0 0 12 &gic 0 12 4>,
+				<0 0 13 &gic 0 13 4>,
+				<0 0 14 &gic 0 14 4>,
+				<0 0 15 &gic 0 15 4>,
+				<0 0 16 &gic 0 16 4>,
+				<0 0 17 &gic 0 17 4>,
+				<0 0 18 &gic 0 18 4>,
+				<0 0 19 &gic 0 19 4>,
+				<0 0 20 &gic 0 20 4>,
+				<0 0 21 &gic 0 21 4>,
+				<0 0 22 &gic 0 22 4>,
+				<0 0 23 &gic 0 23 4>,
+				<0 0 24 &gic 0 24 4>,
+				<0 0 25 &gic 0 25 4>,
+				<0 0 26 &gic 0 26 4>,
+				<0 0 27 &gic 0 27 4>,
+				<0 0 28 &gic 0 28 4>,
+				<0 0 29 &gic 0 29 4>,
+				<0 0 30 &gic 0 30 4>,
+				<0 0 31 &gic 0 31 4>,
+				<0 0 32 &gic 0 32 4>,
+				<0 0 33 &gic 0 33 4>,
+				<0 0 34 &gic 0 34 4>,
+				<0 0 35 &gic 0 35 4>,
+				<0 0 36 &gic 0 36 4>,
+				<0 0 37 &gic 0 37 4>,
+				<0 0 38 &gic 0 38 4>,
+				<0 0 39 &gic 0 39 4>,
+				<0 0 40 &gic 0 40 4>,
+				<0 0 41 &gic 0 41 4>,
+				<0 0 42 &gic 0 42 4>;
+
+		/include/ "rtsm_ve-motherboard.dtsi"
+	};
+};
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 4/6] arm64: topology: Implement basic CPU topology support
  2013-12-11 13:13 [PATCH 1/6] arm64: Add asm/cpu.h Mark Brown
  2013-12-11 13:13 ` [PATCH 2/6] arm64: dts: Add a virtio disk to the RTSM motherboard Mark Brown
  2013-12-11 13:13 ` [PATCH 3/6] arm64: dts: Add a devicetree for the ARMv8 4xA53 4xA57 FVP Mark Brown
@ 2013-12-11 13:13 ` Mark Brown
  2013-12-11 14:12   ` Will Deacon
                     ` (2 more replies)
  2013-12-11 13:13 ` [PATCH 5/6] arm64: topology: Tell the scheduler about the relative power of cores Mark Brown
                   ` (2 subsequent siblings)
  5 siblings, 3 replies; 47+ messages in thread
From: Mark Brown @ 2013-12-11 13:13 UTC (permalink / raw)
  To: linux-arm-kernel

From: Mark Brown <broonie@linaro.org>

Add basic CPU topology support to arm64, based on the existing pre-v8
code and some work done by Mark Hambleton.  This patch does not
implement the ARM CPU topology bindings, it implements equivalent
support to the existing the equivalent pre-v8 capability using the
mandatory MPIDR information in the CPU binding in device tree and
assuming that a simple SMP or multi-cluster topology is in use.

The primary goal is to separate the architecture hookup for providing
topology information from the DT parsing in order to ease review and
avoid blocking the architecture code (which will be built on by other
work) with the DT code review by providing something something simple
and basic.  Having this support should also make the kernel cope better
with incomplete DTs.

Further patches will provide support for overriding this using the
topology bindings, providing richer support for a wider range of systems.

Signed-off-by: Mark Brown <broonie@linaro.org>
---
 arch/arm64/Kconfig                |   8 +++
 arch/arm64/include/asm/cpu.h      |   1 -
 arch/arm64/include/asm/cputype.h  |   9 +++
 arch/arm64/include/asm/smp_plat.h |   1 +
 arch/arm64/include/asm/topology.h |  42 +++++++++++
 arch/arm64/kernel/Makefile        |   1 +
 arch/arm64/kernel/setup.c         |   9 +--
 arch/arm64/kernel/smp.c           |  19 ++++-
 arch/arm64/kernel/topology.c      | 143 ++++++++++++++++++++++++++++++++++++++
 9 files changed, 227 insertions(+), 6 deletions(-)
 create mode 100644 arch/arm64/include/asm/topology.h
 create mode 100644 arch/arm64/kernel/topology.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 88c8b6c1341a..7b4dab852937 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -154,6 +154,14 @@ config SMP
 
 	  If you don't know what to do here, say N.
 
+config ARM_CPU_TOPOLOGY
+	bool "Support CPU topology definition"
+	depends on SMP
+	default y
+	help
+	  Support CPU topology definition, based on configuration
+	  provided by the firmware.
+
 config NR_CPUS
 	int "Maximum number of CPUs (2-32)"
 	range 2 32
diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
index d67ff011d361..8a26b690110c 100644
--- a/arch/arm64/include/asm/cpu.h
+++ b/arch/arm64/include/asm/cpu.h
@@ -10,7 +10,6 @@
 
 #include <linux/percpu.h>
 #include <linux/cpu.h>
-#include <linux/topology.h>
 
 struct cpuinfo_arm {
 	struct cpu	cpu;
diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index 5fe138e0b828..bd504739cbfd 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -29,6 +29,15 @@
 #define INVALID_HWID		ULONG_MAX
 
 #define MPIDR_HWID_BITMASK	0xff00ffffff
+#define MPIDR_SMP_BITMASK (0x3 << 30)
+#define MPIDR_SMP_VALUE (0x2 << 30)
+#define MPIDR_MT_BITMASK (0x1 << 24)
+#define MPIDR_LEVEL_BITS 8
+#define MPIDR_LEVEL_MASK ((1 << MPIDR_LEVEL_BITS) - 1)
+
+#define MPIDR_AFFINITY_LEVEL(mpidr, level) \
+	((mpidr >> (MPIDR_LEVEL_BITS * level)) & MPIDR_LEVEL_MASK)
+
 
 #define read_cpuid(reg) ({						\
 	u64 __val;							\
diff --git a/arch/arm64/include/asm/smp_plat.h b/arch/arm64/include/asm/smp_plat.h
index ed43a0d2b1b2..4ad4ecc93bcf 100644
--- a/arch/arm64/include/asm/smp_plat.h
+++ b/arch/arm64/include/asm/smp_plat.h
@@ -19,6 +19,7 @@
 #ifndef __ASM_SMP_PLAT_H
 #define __ASM_SMP_PLAT_H
 
+#include <linux/cpumask.h>
 #include <asm/types.h>
 
 /*
diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
new file mode 100644
index 000000000000..611edefaeaf1
--- /dev/null
+++ b/arch/arm64/include/asm/topology.h
@@ -0,0 +1,42 @@
+#ifndef _ASM_ARM_TOPOLOGY_H
+#define _ASM_ARM_TOPOLOGY_H
+
+#ifdef CONFIG_ARM_CPU_TOPOLOGY
+
+#include <linux/cpumask.h>
+
+struct cputopo_arm {
+	int thread_id;
+	int core_id;
+	int socket_id;
+	cpumask_t thread_sibling;
+	cpumask_t core_sibling;
+};
+
+extern struct cputopo_arm cpu_topology[NR_CPUS];
+
+#define topology_physical_package_id(cpu)	(cpu_topology[cpu].socket_id)
+#define topology_core_id(cpu)		(cpu_topology[cpu].core_id)
+#define topology_core_cpumask(cpu)	(&cpu_topology[cpu].core_sibling)
+#define topology_thread_cpumask(cpu)	(&cpu_topology[cpu].thread_sibling)
+
+#define mc_capable()	(cpu_topology[0].socket_id != -1)
+#define smt_capable()	(cpu_topology[0].thread_id != -1)
+
+void init_cpu_topology(void);
+void store_cpu_topology(unsigned int cpuid);
+const struct cpumask *cpu_coregroup_mask(int cpu);
+int cluster_to_logical_mask(unsigned int socket_id, cpumask_t *cluster_mask);
+
+#else
+
+static inline void init_cpu_topology(void) { }
+static inline void store_cpu_topology(unsigned int cpuid) { }
+static inline int cluster_to_logical_mask(unsigned int socket_id,
+	cpumask_t *cluster_mask) { return -EINVAL; }
+
+#endif
+
+#include <asm-generic/topology.h>
+
+#endif /* _ASM_ARM_TOPOLOGY_H */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 5ba2fd43a75b..2d145e38ad49 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -18,6 +18,7 @@ arm64-obj-$(CONFIG_SMP)			+= smp.o smp_spin_table.o
 arm64-obj-$(CONFIG_HW_PERF_EVENTS)	+= perf_event.o
 arm64-obj-$(CONFIG_HAVE_HW_BREAKPOINT)+= hw_breakpoint.o
 arm64-obj-$(CONFIG_EARLY_PRINTK)	+= early_printk.o
+arm64-obj-$(CONFIG_ARM_CPU_TOPOLOGY)  += topology.o
 
 obj-y					+= $(arm64-obj-y) vdso/
 obj-m					+= $(arm64-obj-m)
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 0bc5e4cbc017..dbe4a9ba90cb 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -54,6 +54,7 @@
 #include <asm/traps.h>
 #include <asm/memblock.h>
 #include <asm/psci.h>
+#include <asm/cpu.h>
 
 unsigned int processor_id;
 EXPORT_SYMBOL(processor_id);
@@ -250,16 +251,16 @@ static int __init arm64_device_init(void)
 }
 arch_initcall(arm64_device_init);
 
-static DEFINE_PER_CPU(struct cpu, cpu_data);
+DEFINE_PER_CPU(struct cpuinfo_arm, cpu_data);
 
 static int __init topology_init(void)
 {
 	int i;
 
 	for_each_possible_cpu(i) {
-		struct cpu *cpu = &per_cpu(cpu_data, i);
-		cpu->hotpluggable = 1;
-		register_cpu(cpu, i);
+		struct cpuinfo_arm *cpu = &per_cpu(cpu_data, i);
+		cpu->cpu.hotpluggable = 1;
+		register_cpu(&cpu->cpu, i);
 	}
 
 	return 0;
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index a5aeefab03c3..f29c7ffad84a 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -35,7 +35,6 @@
 #include <linux/clockchips.h>
 #include <linux/completion.h>
 #include <linux/of.h>
-
 #include <asm/atomic.h>
 #include <asm/cacheflush.h>
 #include <asm/cputype.h>
@@ -48,6 +47,7 @@
 #include <asm/sections.h>
 #include <asm/tlbflush.h>
 #include <asm/ptrace.h>
+#include <asm/cpu.h>
 
 /*
  * as from 2.5, kernels no longer have an init_tasks structure
@@ -113,6 +113,16 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 	return ret;
 }
 
+static void __cpuinit smp_store_cpu_info(unsigned int cpuid)
+{
+	struct cpuinfo_arm *cpu_info = &per_cpu(cpu_data, cpuid);
+
+	cpu_info->loops_per_jiffy = loops_per_jiffy;
+	cpu_info->cpuid = read_cpuid_id();
+
+	store_cpu_topology(cpuid);
+}
+
 /*
  * This is the secondary CPU boot entry.  We're using this CPUs
  * idle thread stack, but a set of temporary page tables.
@@ -150,6 +160,8 @@ asmlinkage void secondary_start_kernel(void)
 	 */
 	notify_cpu_starting(cpu);
 
+	smp_store_cpu_info(cpu);
+
 	/*
 	 * OK, now it's safe to let the boot CPU continue.  Wait for
 	 * the CPU migration code to notice that the CPU is online
@@ -387,6 +399,11 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
 	int err;
 	unsigned int cpu, ncores = num_possible_cpus();
 
+	init_cpu_topology();
+
+	smp_store_cpu_info(smp_processor_id());
+
+
 	/*
 	 * are we trying to boot more cores than exist?
 	 */
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
new file mode 100644
index 000000000000..e0b40f48b448
--- /dev/null
+++ b/arch/arm64/kernel/topology.c
@@ -0,0 +1,143 @@
+/*
+ * arch/arm64/kernel/topology.c
+ *
+ * Copyright (C) 2011,2013 Linaro Limited.
+ * Written by: Vincent Guittot
+ *
+ * based on arch/sh/kernel/topology.c
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ */
+
+#include <linux/cpu.h>
+#include <linux/cpumask.h>
+#include <linux/export.h>
+#include <linux/init.h>
+#include <linux/percpu.h>
+#include <linux/node.h>
+#include <linux/nodemask.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+
+#include <asm/cputype.h>
+#include <asm/smp_plat.h>
+#include <asm/topology.h>
+
+/*
+ * cpu topology table
+ */
+struct cputopo_arm cpu_topology[NR_CPUS];
+EXPORT_SYMBOL_GPL(cpu_topology);
+
+const struct cpumask *cpu_coregroup_mask(int cpu)
+{
+	return &cpu_topology[cpu].core_sibling;
+}
+
+static void update_siblings_masks(unsigned int cpuid)
+{
+	struct cputopo_arm *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
+	int cpu;
+
+	/* update core and thread sibling masks */
+	for_each_possible_cpu(cpu) {
+		cpu_topo = &cpu_topology[cpu];
+
+		if (cpuid_topo->socket_id != cpu_topo->socket_id)
+			continue;
+
+		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
+		if (cpu != cpuid)
+			cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
+
+		if (cpuid_topo->core_id != cpu_topo->core_id)
+			continue;
+
+		cpumask_set_cpu(cpuid, &cpu_topo->thread_sibling);
+		if (cpu != cpuid)
+			cpumask_set_cpu(cpu, &cpuid_topo->thread_sibling);
+	}
+	smp_wmb();
+}
+
+/*
+ * store_cpu_topology is called at boot when only one cpu is running
+ * and with the mutex cpu_hotplug.lock locked, when several cpus have booted,
+ * which prevents simultaneous write access to cpu_topology array
+ */
+void store_cpu_topology(unsigned int cpuid)
+{
+	struct cputopo_arm *cpuid_topo = &cpu_topology[cpuid];
+	u64 mpidr;
+
+	/* If the cpu topology has been already set, just return */
+	if (cpuid_topo->core_id != -1)
+		return;
+
+	mpidr = cpu_logical_map(cpuid);
+
+	/*
+	 * Create cpu topology mapping, assume the cores are largely
+	 * independent since the DT bindings do not include the flags
+	 * for MT.
+	 */
+	cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cpuid_topo->socket_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+
+	update_siblings_masks(cpuid);
+
+	pr_info("CPU%u: cpu %d, socket %d mapped using MPIDR %llx\n",
+		cpuid, cpu_topology[cpuid].core_id,
+		cpu_topology[cpuid].socket_id, mpidr);
+}
+
+
+/*
+ * cluster_to_logical_mask - return cpu logical mask of CPUs in a cluster
+ * @socket_id:		cluster HW identifier
+ * @cluster_mask:	the cpumask location to be initialized, modified by the
+ *			function only if return value == 0
+ *
+ * Return:
+ *
+ * 0 on success
+ * -EINVAL if cluster_mask is NULL or there is no record matching socket_id
+ */
+int cluster_to_logical_mask(unsigned int socket_id, cpumask_t *cluster_mask)
+{
+	int cpu;
+
+	if (!cluster_mask)
+		return -EINVAL;
+
+	for_each_online_cpu(cpu)
+		if (socket_id == topology_physical_package_id(cpu)) {
+			cpumask_copy(cluster_mask, topology_core_cpumask(cpu));
+			return 0;
+		}
+
+	return -EINVAL;
+}
+
+/*
+ * init_cpu_topology is called at boot when only one cpu is running
+ * which prevent simultaneous write access to cpu_topology array
+ */
+void __init init_cpu_topology(void)
+{
+	unsigned int cpu;
+
+	/* init core mask and power*/
+	for_each_possible_cpu(cpu) {
+		struct cputopo_arm *cpu_topo = &(cpu_topology[cpu]);
+
+		cpu_topo->thread_id = -1;
+		cpu_topo->core_id =  -1;
+		cpu_topo->socket_id = -1;
+		cpumask_clear(&cpu_topo->core_sibling);
+		cpumask_clear(&cpu_topo->thread_sibling);
+	}
+	smp_wmb();
+}
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 5/6] arm64: topology: Tell the scheduler about the relative power of cores
  2013-12-11 13:13 [PATCH 1/6] arm64: Add asm/cpu.h Mark Brown
                   ` (2 preceding siblings ...)
  2013-12-11 13:13 ` [PATCH 4/6] arm64: topology: Implement basic CPU topology support Mark Brown
@ 2013-12-11 13:13 ` Mark Brown
  2013-12-11 14:47   ` Catalin Marinas
  2013-12-11 13:13 ` [PATCH 6/6] arm64: dts: Add CPU topology properties for ARMv8 4xA53 4xA57 FVP Mark Brown
  2013-12-11 14:10 ` [PATCH 1/6] arm64: Add asm/cpu.h Catalin Marinas
  5 siblings, 1 reply; 47+ messages in thread
From: Mark Brown @ 2013-12-11 13:13 UTC (permalink / raw)
  To: linux-arm-kernel

From: Mark Brown <broonie@linaro.org>

In non-heterogeneous systems like big.LITTLE systems the scheduler will be
able to make better use of the available cores if we provide power numbers
to it. Do this by parsing the CPU nodes in the DT.

The power numbers are the same as for ARMv7 since it seems that the
expected differential between the big and little cores is very similar on
both ARMv7 and ARMv8.

Signed-off-by: Mark Brown <broonie@linaro.org>
---
 arch/arm64/kernel/topology.c | 164 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 164 insertions(+)

diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index e0b40f48b448..f08bb2306cd4 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -18,6 +18,7 @@
 #include <linux/percpu.h>
 #include <linux/node.h>
 #include <linux/nodemask.h>
+#include <linux/of.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
 
@@ -26,6 +27,163 @@
 #include <asm/topology.h>
 
 /*
+ * cpu power scale management
+ */
+
+/*
+ * cpu power table
+ * This per cpu data structure describes the relative capacity of each core.
+ * On a heteregenous system, cores don't have the same computation capacity
+ * and we reflect that difference in the cpu_power field so the scheduler can
+ * take this difference into account during load balance. A per cpu structure
+ * is preferred because each CPU updates its own cpu_power field during the
+ * load balance except for idle cores. One idle core is selected to run the
+ * rebalance_domains for all idle cores and the cpu_power can be updated
+ * during this sequence.
+ */
+static DEFINE_PER_CPU(unsigned long, cpu_scale);
+
+unsigned long arch_scale_freq_power(struct sched_domain *sd, int cpu)
+{
+	return per_cpu(cpu_scale, cpu);
+}
+
+static void set_power_scale(unsigned int cpu, unsigned long power)
+{
+	per_cpu(cpu_scale, cpu) = power;
+}
+
+#ifdef CONFIG_OF
+struct cpu_efficiency {
+	const char *compatible;
+	unsigned long efficiency;
+};
+
+/*
+ * Table of relative efficiency of each processors
+ * The efficiency value must fit in 20bit and the final
+ * cpu_scale value must be in the range
+ *   0 < cpu_scale < 3*SCHED_POWER_SCALE/2
+ * in order to return@most 1 when DIV_ROUND_CLOSEST
+ * is used to compute the capacity of a CPU.
+ * Processors that are not defined in the table,
+ * use the default SCHED_POWER_SCALE value for cpu_scale.
+ */
+static const struct cpu_efficiency table_efficiency[] = {
+	{ "arm,cortex-a57", 3891 },
+	{ "arm,cortex-a53", 2048 },
+	{ NULL, },
+};
+
+static unsigned long *__cpu_capacity;
+#define cpu_capacity(cpu)	__cpu_capacity[cpu]
+
+static unsigned long middle_capacity = 1;
+
+/*
+ * Iterate all CPUs' descriptor in DT and compute the efficiency
+ * (as per table_efficiency). Also calculate a middle efficiency
+ * as close as possible to  (max{eff_i} - min{eff_i}) / 2
+ * This is later used to scale the cpu_power field such that an
+ * 'average' CPU is of middle power. Also see the comments near
+ * table_efficiency[] and update_cpu_power().
+ */
+static void __init parse_dt_topology(void)
+{
+	const struct cpu_efficiency *cpu_eff;
+	struct device_node *cn = NULL;
+	unsigned long min_capacity = (unsigned long)(-1);
+	unsigned long max_capacity = 0;
+	unsigned long capacity = 0;
+	int alloc_size, cpu;
+
+	alloc_size = nr_cpu_ids * sizeof(*__cpu_capacity);
+	__cpu_capacity = kzalloc(alloc_size, GFP_NOWAIT);
+
+	for_each_possible_cpu(cpu) {
+		const u32 *rate;
+		int len;
+
+		/* Too early to use cpu->of_node */
+		cn = of_get_cpu_node(cpu, NULL);
+		if (!cn) {
+			pr_err("Missing device node for CPU %d\n", cpu);
+			continue;
+		}
+
+		/* check if the cpu is marked as "disabled", if so ignore */
+		if (!of_device_is_available(cn))
+			continue;
+
+		for (cpu_eff = table_efficiency; cpu_eff->compatible; cpu_eff++)
+			if (of_device_is_compatible(cn, cpu_eff->compatible))
+				break;
+
+		if (cpu_eff->compatible == NULL) {
+			pr_warn("%s: Unknown CPU type\n", cn->full_name);
+			continue;
+		}
+
+		rate = of_get_property(cn, "clock-frequency", &len);
+		if (!rate || len != 4) {
+			pr_err("%s: Missing clock-frequency property\n",
+				cn->full_name);
+			continue;
+		}
+
+		capacity = ((be32_to_cpup(rate)) >> 20) * cpu_eff->efficiency;
+
+		/* Save min capacity of the system */
+		if (capacity < min_capacity)
+			min_capacity = capacity;
+
+		/* Save max capacity of the system */
+		if (capacity > max_capacity)
+			max_capacity = capacity;
+
+		cpu_capacity(cpu) = capacity;
+	}
+
+	/* If min and max capacities are equal we bypass the update of the
+	 * cpu_scale because all CPUs have the same capacity. Otherwise, we
+	 * compute a middle_capacity factor that will ensure that the capacity
+	 * of an 'average' CPU of the system will be as close as possible to
+	 * SCHED_POWER_SCALE, which is the default value, but with the
+	 * constraint explained near table_efficiency[].
+	 */
+	if (min_capacity == max_capacity)
+		return;
+	else if (4 * max_capacity < (3 * (max_capacity + min_capacity)))
+		middle_capacity = (min_capacity + max_capacity)
+				>> (SCHED_POWER_SHIFT+1);
+	else
+		middle_capacity = ((max_capacity / 3)
+				>> (SCHED_POWER_SHIFT-1)) + 1;
+
+}
+
+/*
+ * Look for a customed capacity of a CPU in the cpu_topo_data table during the
+ * boot. The update of all CPUs is in O(n^2) for heteregeneous system but the
+ * function returns directly for SMP system.
+ */
+static void update_cpu_power(unsigned int cpu, unsigned long hwid)
+{
+	if (!cpu_capacity(cpu))
+		return;
+
+	set_power_scale(cpu, cpu_capacity(cpu) / middle_capacity);
+
+	pr_info("CPU%u: update cpu_power %lu\n",
+		cpu, arch_scale_freq_power(NULL, cpu));
+}
+
+#else
+static inline void parse_dt_topology(void) {}
+static inline void update_cpu_power(unsigned int cpuid, unsigned int mpidr) {}
+#endif
+
+/*
  * cpu topology table
  */
 struct cputopo_arm cpu_topology[NR_CPUS];
@@ -88,6 +246,8 @@ void store_cpu_topology(unsigned int cpuid)
 
 	update_siblings_masks(cpuid);
 
+	update_cpu_power(cpuid, mpidr & MPIDR_HWID_BITMASK);
+
 	pr_info("CPU%u: cpu %d, socket %d mapped using MPIDR %llx\n",
 		cpuid, cpu_topology[cpuid].core_id,
 		cpu_topology[cpuid].socket_id, mpidr);
@@ -138,6 +298,10 @@ void __init init_cpu_topology(void)
 		cpu_topo->socket_id = -1;
 		cpumask_clear(&cpu_topo->core_sibling);
 		cpumask_clear(&cpu_topo->thread_sibling);
+
+		set_power_scale(cpu, SCHED_POWER_SCALE);
 	}
 	smp_wmb();
+
+	parse_dt_topology();
 }
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 6/6] arm64: dts: Add CPU topology properties for ARMv8 4xA53 4xA57 FVP
  2013-12-11 13:13 [PATCH 1/6] arm64: Add asm/cpu.h Mark Brown
                   ` (3 preceding siblings ...)
  2013-12-11 13:13 ` [PATCH 5/6] arm64: topology: Tell the scheduler about the relative power of cores Mark Brown
@ 2013-12-11 13:13 ` Mark Brown
  2013-12-11 14:10 ` [PATCH 1/6] arm64: Add asm/cpu.h Catalin Marinas
  5 siblings, 0 replies; 47+ messages in thread
From: Mark Brown @ 2013-12-11 13:13 UTC (permalink / raw)
  To: linux-arm-kernel

From: Mark Brown <broonie@linaro.org>

This provides for discovery of the clusters using the CPU topology
bindings, it is added as a separate commit to aid testing of the code
for systems without explicit topology bindings.

Signed-off-by: Mark Brown <broonie@linaro.org>
---
 arch/arm64/boot/dts/fvp-base-gicv2-psci.dts | 31 +++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/arch/arm64/boot/dts/fvp-base-gicv2-psci.dts b/arch/arm64/boot/dts/fvp-base-gicv2-psci.dts
index 736f546123bb..452ba22f3c6e 100644
--- a/arch/arm64/boot/dts/fvp-base-gicv2-psci.dts
+++ b/arch/arm64/boot/dts/fvp-base-gicv2-psci.dts
@@ -119,6 +119,37 @@
 			enable-method = "psci";
 			clock-frequency = <1000000>;
 		};
+
+		cpu-map {
+			cluster0 {
+				core0 {
+					cpu = <&big0>;
+				};
+				core1 {
+					cpu = <&big1>;
+				};
+				core2 {
+					cpu = <&big2>;
+				};
+				core3 {
+					cpu = <&big3>;
+				};
+			};
+			cluster1 {
+				core0 {
+					cpu = <&little0>;
+				};
+				core1 {
+					cpu = <&little1>;
+				};
+				core2 {
+					cpu = <&little2>;
+				};
+				core3 {
+					cpu = <&little3>;
+				};
+			};
+		};
 	};
 
 	memory at 80000000 {
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 3/6] arm64: dts: Add a devicetree for the ARMv8 4xA53 4xA57 FVP
  2013-12-11 13:13 ` [PATCH 3/6] arm64: dts: Add a devicetree for the ARMv8 4xA53 4xA57 FVP Mark Brown
@ 2013-12-11 13:55   ` Mark Rutland
  2013-12-11 14:11     ` Mark Brown
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Rutland @ 2013-12-11 13:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 11, 2013 at 01:13:23PM +0000, Mark Brown wrote:
> From: Mark Brown <broonie@linaro.org>
> 
> Add a dts file for ARMv8 fixed virtual platform which models a
> big.LITTLE configuration for the architecture.  This is based on the
> DT shipped by ARM but has been modified for mainline, removing features
> not present in mainline and updating the CPU bindings for mainline.
> 
> Signed-off-by: Mark Brown <broonie@linaro.org>
> ---
>  arch/arm64/boot/dts/Makefile                |   4 +-
>  arch/arm64/boot/dts/fvp-base-gicv2-psci.dts | 233 ++++++++++++++++++++++++++++
>  2 files changed, 236 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm64/boot/dts/fvp-base-gicv2-psci.dts
> 
> diff --git a/arch/arm64/boot/dts/Makefile b/arch/arm64/boot/dts/Makefile
> index c52bdb051f66..2d16cdbe5d47 100644
> --- a/arch/arm64/boot/dts/Makefile
> +++ b/arch/arm64/boot/dts/Makefile
> @@ -1,4 +1,6 @@
> -dtb-$(CONFIG_ARCH_VEXPRESS) += rtsm_ve-aemv8a.dtb foundation-v8.dtb
> +dtb-$(CONFIG_ARCH_VEXPRESS) += 	rtsm_ve-aemv8a.dtb \
> +				foundation-v8.dtb \
> +				fvp-base-gicv2-psci.dtb
>  dtb-$(CONFIG_ARCH_XGENE) += apm-mustang.dtb
>  
>  targets += dtbs
> diff --git a/arch/arm64/boot/dts/fvp-base-gicv2-psci.dts b/arch/arm64/boot/dts/fvp-base-gicv2-psci.dts
> new file mode 100644
> index 000000000000..736f546123bb
> --- /dev/null
> +++ b/arch/arm64/boot/dts/fvp-base-gicv2-psci.dts
> @@ -0,0 +1,233 @@
> +/*
> + * Copyright (c) 2013, ARM Limited. All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions are met:
> + *
> + * Redistributions of source code must retain the above copyright notice, this
> + * list of conditions and the following disclaimer.
> + *
> + * Redistributions in binary form must reproduce the above copyright notice,
> + * this list of conditions and the following disclaimer in the documentation
> + * and/or other materials provided with the distribution.
> + *
> + * Neither the name of ARM nor the names of its contributors may be used
> + * to endorse or promote products derived from this software without specific
> + * prior written permission.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
> + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
> + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
> + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
> + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
> + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
> + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
> + * POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +/dts-v1/;
> +
> +/memreserve/ 0x80000000 0x00010000;

While we are admittedly missing them elsewhere, it would be nice to have
a comment stating what the memreserve is for. With a proper PSCI
implementation, I don't see why this should be necessary.

> +
> +/ {
> +};
> +
> +/ {
> +	model = "FVP Base";

FVP Base (is as the name implies) a base upon which particular model
instances are built. This name should be clarified (e.g. "FVP Base A57x4
A53x4").

That also applies to the filename.

> +	compatible = "arm,vfp-base", "arm,vexpress";

s/vfp/fvp/

Likewise, it would be nice to have a more specific compatible string
here.

As the FVP is not a vexpress (though it is similar), we should probably
have "arm,fvp-base" as the fallback and drop the "arm,vexpress" entry.

> +	interrupt-parent = <&gic>;
> +	#address-cells = <2>;
> +	#size-cells = <2>;
> +
> +	chosen { };
> +
> +	aliases {
> +		serial0 = &v2m_serial0;
> +		serial1 = &v2m_serial1;
> +		serial2 = &v2m_serial2;
> +		serial3 = &v2m_serial3;
> +	};
> +
> +	psci {
> +		compatible = "arm,psci";
> +		method = "smc";
> +		cpu_suspend = <0xc4000001>;
> +		cpu_off = <0x84000002>;
> +		cpu_on = <0xc4000003>;
> +	};

Are these IDs right? One of these IDs is a different width than the
others.

Which firmware/bootloader does this correspond to?

> +
> +	cpus {
> +		#address-cells = <2>;
> +		#size-cells = <0>;
> +
> +		big0: cpu at 0 {
> +			device_type = "cpu";
> +			compatible = "arm,cortex-a57", "arm,armv8";
> +			reg = <0x0 0x0>;
> +			enable-method = "psci";
> +			clock-frequency = <1000000>;

Is clock-frequency used anywhere? Is it a useful thing to have
(regardless of whether ePAPR defines it)?

[...]

> +	timer {
> +		compatible = "arm,armv8-timer";
> +		interrupts = <1 13 0xff01>,
> +			     <1 14 0xff01>,
> +			     <1 11 0xff01>,
> +			     <1 10 0xff01>;
> +		clock-frequency = <100000000>;

A clock-frequency property in a timer is an indicator that the
bootloader/firmware is broken and should be fixed. Is this actually
necessary, or does the firmware/loader program CNTFRQ correctly on all
CPUs?

> +	};
> +
> +	timer at 2a810000 {
> +			compatible = "arm,armv7-timer-mem";
> +			reg = <0x0 0x2a810000 0x0 0x10000>;
> +			clock-frequency = <100000000>;
> +			#address-cells = <2>;
> +			#size-cells = <2>;
> +			ranges;
> +			frame at 2a820000 {
> +				frame-number = <0>;
> +				interrupts = <0 25 4>;
> +				reg = <0x0 0x2a820000 0x0 0x10000>;
> +			};
> +	};
> +
> +	pmu {
> +		compatible = "arm,armv8-pmuv3";
> +		interrupts = <0 60 4>,
> +			     <0 61 4>,
> +			     <0 62 4>,
> +			     <0 63 4>;
> +	};

I assume this is just the A57 cores? That should probably be noted for
now. In future we'll need to define the relationship between interrupts
and CPUs, and describe the A53 cores.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 1/6] arm64: Add asm/cpu.h
  2013-12-11 13:13 [PATCH 1/6] arm64: Add asm/cpu.h Mark Brown
                   ` (4 preceding siblings ...)
  2013-12-11 13:13 ` [PATCH 6/6] arm64: dts: Add CPU topology properties for ARMv8 4xA53 4xA57 FVP Mark Brown
@ 2013-12-11 14:10 ` Catalin Marinas
  2013-12-11 14:23   ` Mark Brown
  5 siblings, 1 reply; 47+ messages in thread
From: Catalin Marinas @ 2013-12-11 14:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 11, 2013 at 01:13:21PM +0000, Mark Brown wrote:
> +struct cpuinfo_arm {
> +	struct cpu	cpu;
> +	u64		cpuid;
> +#ifdef CONFIG_SMP
> +	unsigned int	loops_per_jiffy;
> +#endif
> +};

How is this structure used? I haven't seen the ACPI code doing anything
with struct cpu (though I haven't dug deep enough). Also loops_per_jiffy
is useless, that's related to the delay loop based on the generic timer.

-- 
Catalin

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 3/6] arm64: dts: Add a devicetree for the ARMv8 4xA53 4xA57 FVP
  2013-12-11 13:55   ` Mark Rutland
@ 2013-12-11 14:11     ` Mark Brown
  2013-12-11 15:04       ` Mark Rutland
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Brown @ 2013-12-11 14:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 11, 2013 at 01:55:36PM +0000, Mark Rutland wrote:
> On Wed, Dec 11, 2013 at 01:13:23PM +0000, Mark Brown wrote:

> > +/dts-v1/;

> > +/memreserve/ 0x80000000 0x00010000;

> While we are admittedly missing them elsewhere, it would be nice to have
> a comment stating what the memreserve is for. With a proper PSCI
> implementation, I don't see why this should be necessary.

Could you tell me what it's there for?  Like you say everyone else has
it with no comments and the source doesn't either...

> > +/ {
> > +	model = "FVP Base";

> FVP Base (is as the name implies) a base upon which particular model
> instances are built. This name should be clarified (e.g. "FVP Base A57x4
> A53x4").

> That also applies to the filename.

I can update these, though they do seem to come from what you guys are
releasing - you might want to follow up on this internally (this applies
to almost all of your review comments, sorry).  It's probably going to
be a bit confusing for users to have the filename change but ho hum :/

> As the FVP is not a vexpress (though it is similar), we should probably
> have "arm,fvp-base" as the fallback and drop the "arm,vexpress" entry.

Is it not intended to be a software emulation of a vexpress and hence
supposed to be broadly compatible with one?

> > +	psci {
> > +		compatible = "arm,psci";
> > +		method = "smc";
> > +		cpu_suspend = <0xc4000001>;
> > +		cpu_off = <0x84000002>;
> > +		cpu_on = <0xc4000003>;
> > +	};

> Are these IDs right? One of these IDs is a different width than the
> others.

I have no idea, I have no documentation for any of this stuff other than
the DT you guys release on github - it's just copied verbatim from
there.

> Which firmware/bootloader does this correspond to?

I'm testing using the Linaro 13.11 release.

> > +		big0: cpu at 0 {
> > +			device_type = "cpu";
> > +			compatible = "arm,cortex-a57", "arm,armv8";
> > +			reg = <0x0 0x0>;
> > +			enable-method = "psci";
> > +			clock-frequency = <1000000>;

> Is clock-frequency used anywhere? Is it a useful thing to have
> (regardless of whether ePAPR defines it)?

The topology code for ARMv7 (which I'm following for ARMv8) uses this in
conjunction with the compatible to provide information to the scheduler
about the relative performance of the cores.  

> > +	timer {
> > +		compatible = "arm,armv8-timer";
> > +		interrupts = <1 13 0xff01>,
> > +			     <1 14 0xff01>,
> > +			     <1 11 0xff01>,
> > +			     <1 10 0xff01>;
> > +		clock-frequency = <100000000>;

> A clock-frequency property in a timer is an indicator that the
> bootloader/firmware is broken and should be fixed. Is this actually
> necessary, or does the firmware/loader program CNTFRQ correctly on all
> CPUs?

I can try to see what happens if I remove it - I doubt anyone has tested
without it...

> > +	pmu {
> > +		compatible = "arm,armv8-pmuv3";
> > +		interrupts = <0 60 4>,
> > +			     <0 61 4>,
> > +			     <0 62 4>,
> > +			     <0 63 4>;
> > +	};

> I assume this is just the A57 cores? That should probably be noted for
> now. In future we'll need to define the relationship between interrupts
> and CPUs, and describe the A53 cores.

Again I don't know, this is just copied verbatim from what you guys
release - I have no additional documentation.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: Digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20131211/3104db53/attachment.sig>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 4/6] arm64: topology: Implement basic CPU topology support
  2013-12-11 13:13 ` [PATCH 4/6] arm64: topology: Implement basic CPU topology support Mark Brown
@ 2013-12-11 14:12   ` Will Deacon
  2013-12-11 14:15     ` Mark Brown
  2013-12-16 10:57   ` Lorenzo Pieralisi
  2013-12-16 14:45   ` Alex Shi
  2 siblings, 1 reply; 47+ messages in thread
From: Will Deacon @ 2013-12-11 14:12 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 11, 2013 at 01:13:24PM +0000, Mark Brown wrote:
> From: Mark Brown <broonie@linaro.org>
> 
> Add basic CPU topology support to arm64, based on the existing pre-v8
> code and some work done by Mark Hambleton.  This patch does not
> implement the ARM CPU topology bindings, it implements equivalent
> support to the existing the equivalent pre-v8 capability using the
> mandatory MPIDR information in the CPU binding in device tree and
> assuming that a simple SMP or multi-cluster topology is in use.
> 
> The primary goal is to separate the architecture hookup for providing
> topology information from the DT parsing in order to ease review and
> avoid blocking the architecture code (which will be built on by other
> work) with the DT code review by providing something something simple
> and basic.  Having this support should also make the kernel cope better
> with incomplete DTs.
> 
> Further patches will provide support for overriding this using the
> topology bindings, providing richer support for a wider range of systems.

[Adding Lorenzo]

I seem to remember Lorenzo having patches already for this, along with
bindings, Documentation etc. so it would be good to know how these two
series are supposed to interact.

Will

> Signed-off-by: Mark Brown <broonie@linaro.org>
> ---
>  arch/arm64/Kconfig                |   8 +++
>  arch/arm64/include/asm/cpu.h      |   1 -
>  arch/arm64/include/asm/cputype.h  |   9 +++
>  arch/arm64/include/asm/smp_plat.h |   1 +
>  arch/arm64/include/asm/topology.h |  42 +++++++++++
>  arch/arm64/kernel/Makefile        |   1 +
>  arch/arm64/kernel/setup.c         |   9 +--
>  arch/arm64/kernel/smp.c           |  19 ++++-
>  arch/arm64/kernel/topology.c      | 143 ++++++++++++++++++++++++++++++++++++++
>  9 files changed, 227 insertions(+), 6 deletions(-)
>  create mode 100644 arch/arm64/include/asm/topology.h
>  create mode 100644 arch/arm64/kernel/topology.c
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 88c8b6c1341a..7b4dab852937 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -154,6 +154,14 @@ config SMP
>  
>  	  If you don't know what to do here, say N.
>  
> +config ARM_CPU_TOPOLOGY
> +	bool "Support CPU topology definition"
> +	depends on SMP
> +	default y
> +	help
> +	  Support CPU topology definition, based on configuration
> +	  provided by the firmware.
> +
>  config NR_CPUS
>  	int "Maximum number of CPUs (2-32)"
>  	range 2 32
> diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
> index d67ff011d361..8a26b690110c 100644
> --- a/arch/arm64/include/asm/cpu.h
> +++ b/arch/arm64/include/asm/cpu.h
> @@ -10,7 +10,6 @@
>  
>  #include <linux/percpu.h>
>  #include <linux/cpu.h>
> -#include <linux/topology.h>
>  
>  struct cpuinfo_arm {
>  	struct cpu	cpu;
> diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
> index 5fe138e0b828..bd504739cbfd 100644
> --- a/arch/arm64/include/asm/cputype.h
> +++ b/arch/arm64/include/asm/cputype.h
> @@ -29,6 +29,15 @@
>  #define INVALID_HWID		ULONG_MAX
>  
>  #define MPIDR_HWID_BITMASK	0xff00ffffff
> +#define MPIDR_SMP_BITMASK (0x3 << 30)
> +#define MPIDR_SMP_VALUE (0x2 << 30)
> +#define MPIDR_MT_BITMASK (0x1 << 24)
> +#define MPIDR_LEVEL_BITS 8
> +#define MPIDR_LEVEL_MASK ((1 << MPIDR_LEVEL_BITS) - 1)
> +
> +#define MPIDR_AFFINITY_LEVEL(mpidr, level) \
> +	((mpidr >> (MPIDR_LEVEL_BITS * level)) & MPIDR_LEVEL_MASK)
> +
>  
>  #define read_cpuid(reg) ({						\
>  	u64 __val;							\
> diff --git a/arch/arm64/include/asm/smp_plat.h b/arch/arm64/include/asm/smp_plat.h
> index ed43a0d2b1b2..4ad4ecc93bcf 100644
> --- a/arch/arm64/include/asm/smp_plat.h
> +++ b/arch/arm64/include/asm/smp_plat.h
> @@ -19,6 +19,7 @@
>  #ifndef __ASM_SMP_PLAT_H
>  #define __ASM_SMP_PLAT_H
>  
> +#include <linux/cpumask.h>
>  #include <asm/types.h>
>  
>  /*
> diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
> new file mode 100644
> index 000000000000..611edefaeaf1
> --- /dev/null
> +++ b/arch/arm64/include/asm/topology.h
> @@ -0,0 +1,42 @@
> +#ifndef _ASM_ARM_TOPOLOGY_H
> +#define _ASM_ARM_TOPOLOGY_H
> +
> +#ifdef CONFIG_ARM_CPU_TOPOLOGY
> +
> +#include <linux/cpumask.h>
> +
> +struct cputopo_arm {
> +	int thread_id;
> +	int core_id;
> +	int socket_id;
> +	cpumask_t thread_sibling;
> +	cpumask_t core_sibling;
> +};
> +
> +extern struct cputopo_arm cpu_topology[NR_CPUS];
> +
> +#define topology_physical_package_id(cpu)	(cpu_topology[cpu].socket_id)
> +#define topology_core_id(cpu)		(cpu_topology[cpu].core_id)
> +#define topology_core_cpumask(cpu)	(&cpu_topology[cpu].core_sibling)
> +#define topology_thread_cpumask(cpu)	(&cpu_topology[cpu].thread_sibling)
> +
> +#define mc_capable()	(cpu_topology[0].socket_id != -1)
> +#define smt_capable()	(cpu_topology[0].thread_id != -1)
> +
> +void init_cpu_topology(void);
> +void store_cpu_topology(unsigned int cpuid);
> +const struct cpumask *cpu_coregroup_mask(int cpu);
> +int cluster_to_logical_mask(unsigned int socket_id, cpumask_t *cluster_mask);
> +
> +#else
> +
> +static inline void init_cpu_topology(void) { }
> +static inline void store_cpu_topology(unsigned int cpuid) { }
> +static inline int cluster_to_logical_mask(unsigned int socket_id,
> +	cpumask_t *cluster_mask) { return -EINVAL; }
> +
> +#endif
> +
> +#include <asm-generic/topology.h>
> +
> +#endif /* _ASM_ARM_TOPOLOGY_H */
> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
> index 5ba2fd43a75b..2d145e38ad49 100644
> --- a/arch/arm64/kernel/Makefile
> +++ b/arch/arm64/kernel/Makefile
> @@ -18,6 +18,7 @@ arm64-obj-$(CONFIG_SMP)			+= smp.o smp_spin_table.o
>  arm64-obj-$(CONFIG_HW_PERF_EVENTS)	+= perf_event.o
>  arm64-obj-$(CONFIG_HAVE_HW_BREAKPOINT)+= hw_breakpoint.o
>  arm64-obj-$(CONFIG_EARLY_PRINTK)	+= early_printk.o
> +arm64-obj-$(CONFIG_ARM_CPU_TOPOLOGY)  += topology.o
>  
>  obj-y					+= $(arm64-obj-y) vdso/
>  obj-m					+= $(arm64-obj-m)
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index 0bc5e4cbc017..dbe4a9ba90cb 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -54,6 +54,7 @@
>  #include <asm/traps.h>
>  #include <asm/memblock.h>
>  #include <asm/psci.h>
> +#include <asm/cpu.h>
>  
>  unsigned int processor_id;
>  EXPORT_SYMBOL(processor_id);
> @@ -250,16 +251,16 @@ static int __init arm64_device_init(void)
>  }
>  arch_initcall(arm64_device_init);
>  
> -static DEFINE_PER_CPU(struct cpu, cpu_data);
> +DEFINE_PER_CPU(struct cpuinfo_arm, cpu_data);
>  
>  static int __init topology_init(void)
>  {
>  	int i;
>  
>  	for_each_possible_cpu(i) {
> -		struct cpu *cpu = &per_cpu(cpu_data, i);
> -		cpu->hotpluggable = 1;
> -		register_cpu(cpu, i);
> +		struct cpuinfo_arm *cpu = &per_cpu(cpu_data, i);
> +		cpu->cpu.hotpluggable = 1;
> +		register_cpu(&cpu->cpu, i);
>  	}
>  
>  	return 0;
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index a5aeefab03c3..f29c7ffad84a 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -35,7 +35,6 @@
>  #include <linux/clockchips.h>
>  #include <linux/completion.h>
>  #include <linux/of.h>
> -
>  #include <asm/atomic.h>
>  #include <asm/cacheflush.h>
>  #include <asm/cputype.h>
> @@ -48,6 +47,7 @@
>  #include <asm/sections.h>
>  #include <asm/tlbflush.h>
>  #include <asm/ptrace.h>
> +#include <asm/cpu.h>
>  
>  /*
>   * as from 2.5, kernels no longer have an init_tasks structure
> @@ -113,6 +113,16 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
>  	return ret;
>  }
>  
> +static void __cpuinit smp_store_cpu_info(unsigned int cpuid)
> +{
> +	struct cpuinfo_arm *cpu_info = &per_cpu(cpu_data, cpuid);
> +
> +	cpu_info->loops_per_jiffy = loops_per_jiffy;
> +	cpu_info->cpuid = read_cpuid_id();
> +
> +	store_cpu_topology(cpuid);
> +}
> +
>  /*
>   * This is the secondary CPU boot entry.  We're using this CPUs
>   * idle thread stack, but a set of temporary page tables.
> @@ -150,6 +160,8 @@ asmlinkage void secondary_start_kernel(void)
>  	 */
>  	notify_cpu_starting(cpu);
>  
> +	smp_store_cpu_info(cpu);
> +
>  	/*
>  	 * OK, now it's safe to let the boot CPU continue.  Wait for
>  	 * the CPU migration code to notice that the CPU is online
> @@ -387,6 +399,11 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
>  	int err;
>  	unsigned int cpu, ncores = num_possible_cpus();
>  
> +	init_cpu_topology();
> +
> +	smp_store_cpu_info(smp_processor_id());
> +
> +
>  	/*
>  	 * are we trying to boot more cores than exist?
>  	 */
> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> new file mode 100644
> index 000000000000..e0b40f48b448
> --- /dev/null
> +++ b/arch/arm64/kernel/topology.c
> @@ -0,0 +1,143 @@
> +/*
> + * arch/arm64/kernel/topology.c
> + *
> + * Copyright (C) 2011,2013 Linaro Limited.
> + * Written by: Vincent Guittot
> + *
> + * based on arch/sh/kernel/topology.c
> + *
> + * This file is subject to the terms and conditions of the GNU General Public
> + * License.  See the file "COPYING" in the main directory of this archive
> + * for more details.
> + */
> +
> +#include <linux/cpu.h>
> +#include <linux/cpumask.h>
> +#include <linux/export.h>
> +#include <linux/init.h>
> +#include <linux/percpu.h>
> +#include <linux/node.h>
> +#include <linux/nodemask.h>
> +#include <linux/sched.h>
> +#include <linux/slab.h>
> +
> +#include <asm/cputype.h>
> +#include <asm/smp_plat.h>
> +#include <asm/topology.h>
> +
> +/*
> + * cpu topology table
> + */
> +struct cputopo_arm cpu_topology[NR_CPUS];
> +EXPORT_SYMBOL_GPL(cpu_topology);
> +
> +const struct cpumask *cpu_coregroup_mask(int cpu)
> +{
> +	return &cpu_topology[cpu].core_sibling;
> +}
> +
> +static void update_siblings_masks(unsigned int cpuid)
> +{
> +	struct cputopo_arm *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
> +	int cpu;
> +
> +	/* update core and thread sibling masks */
> +	for_each_possible_cpu(cpu) {
> +		cpu_topo = &cpu_topology[cpu];
> +
> +		if (cpuid_topo->socket_id != cpu_topo->socket_id)
> +			continue;
> +
> +		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
> +		if (cpu != cpuid)
> +			cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
> +
> +		if (cpuid_topo->core_id != cpu_topo->core_id)
> +			continue;
> +
> +		cpumask_set_cpu(cpuid, &cpu_topo->thread_sibling);
> +		if (cpu != cpuid)
> +			cpumask_set_cpu(cpu, &cpuid_topo->thread_sibling);
> +	}
> +	smp_wmb();
> +}
> +
> +/*
> + * store_cpu_topology is called at boot when only one cpu is running
> + * and with the mutex cpu_hotplug.lock locked, when several cpus have booted,
> + * which prevents simultaneous write access to cpu_topology array
> + */
> +void store_cpu_topology(unsigned int cpuid)
> +{
> +	struct cputopo_arm *cpuid_topo = &cpu_topology[cpuid];
> +	u64 mpidr;
> +
> +	/* If the cpu topology has been already set, just return */
> +	if (cpuid_topo->core_id != -1)
> +		return;
> +
> +	mpidr = cpu_logical_map(cpuid);
> +
> +	/*
> +	 * Create cpu topology mapping, assume the cores are largely
> +	 * independent since the DT bindings do not include the flags
> +	 * for MT.
> +	 */
> +	cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> +	cpuid_topo->socket_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> +
> +	update_siblings_masks(cpuid);
> +
> +	pr_info("CPU%u: cpu %d, socket %d mapped using MPIDR %llx\n",
> +		cpuid, cpu_topology[cpuid].core_id,
> +		cpu_topology[cpuid].socket_id, mpidr);
> +}
> +
> +
> +/*
> + * cluster_to_logical_mask - return cpu logical mask of CPUs in a cluster
> + * @socket_id:		cluster HW identifier
> + * @cluster_mask:	the cpumask location to be initialized, modified by the
> + *			function only if return value == 0
> + *
> + * Return:
> + *
> + * 0 on success
> + * -EINVAL if cluster_mask is NULL or there is no record matching socket_id
> + */
> +int cluster_to_logical_mask(unsigned int socket_id, cpumask_t *cluster_mask)
> +{
> +	int cpu;
> +
> +	if (!cluster_mask)
> +		return -EINVAL;
> +
> +	for_each_online_cpu(cpu)
> +		if (socket_id == topology_physical_package_id(cpu)) {
> +			cpumask_copy(cluster_mask, topology_core_cpumask(cpu));
> +			return 0;
> +		}
> +
> +	return -EINVAL;
> +}
> +
> +/*
> + * init_cpu_topology is called at boot when only one cpu is running
> + * which prevent simultaneous write access to cpu_topology array
> + */
> +void __init init_cpu_topology(void)
> +{
> +	unsigned int cpu;
> +
> +	/* init core mask and power*/
> +	for_each_possible_cpu(cpu) {
> +		struct cputopo_arm *cpu_topo = &(cpu_topology[cpu]);
> +
> +		cpu_topo->thread_id = -1;
> +		cpu_topo->core_id =  -1;
> +		cpu_topo->socket_id = -1;
> +		cpumask_clear(&cpu_topo->core_sibling);
> +		cpumask_clear(&cpu_topo->thread_sibling);
> +	}
> +	smp_wmb();
> +}
> -- 
> 1.8.5.1
> 
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 4/6] arm64: topology: Implement basic CPU topology support
  2013-12-11 14:12   ` Will Deacon
@ 2013-12-11 14:15     ` Mark Brown
  2013-12-11 14:24       ` Catalin Marinas
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Brown @ 2013-12-11 14:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 11, 2013 at 02:12:24PM +0000, Will Deacon wrote:

> [Adding Lorenzo]

> I seem to remember Lorenzo having patches already for this, along with
> bindings, Documentation etc. so it would be good to know how these two
> series are supposed to interact.

He said he hadn't had time to work on the actual implementation which is
why I picked this up - as far as I'm aware we only have the (already
merged) binding for this.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: Digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20131211/72a9933b/attachment-0001.sig>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 1/6] arm64: Add asm/cpu.h
  2013-12-11 14:10 ` [PATCH 1/6] arm64: Add asm/cpu.h Catalin Marinas
@ 2013-12-11 14:23   ` Mark Brown
  2013-12-11 14:27     ` Catalin Marinas
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Brown @ 2013-12-11 14:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 11, 2013 at 02:10:19PM +0000, Catalin Marinas wrote:
> On Wed, Dec 11, 2013 at 01:13:21PM +0000, Mark Brown wrote:

> > +struct cpuinfo_arm {
> > +	struct cpu	cpu;
> > +	u64		cpuid;
> > +#ifdef CONFIG_SMP
> > +	unsigned int	loops_per_jiffy;
> > +#endif
> > +};

> How is this structure used? I haven't seen the ACPI code doing anything
> with struct cpu (though I haven't dug deep enough). Also loops_per_jiffy
> is useless, that's related to the delay loop based on the generic timer.

Now I look again we can probably drop this for the toplogy work - it had
been pulled in as part of pulling things in from pre-v8 and the cpuid
was used in the code while I was working on it but isn't any more unless
I'm misreading the code.  

I don't know what the ACPI guys are doing wtih it, I just saw they added
the same thing.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: Digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20131211/7ca1e735/attachment.sig>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 4/6] arm64: topology: Implement basic CPU topology support
  2013-12-11 14:15     ` Mark Brown
@ 2013-12-11 14:24       ` Catalin Marinas
  2013-12-11 14:30         ` Mark Brown
  2013-12-12  6:59         ` Hanjun Guo
  0 siblings, 2 replies; 47+ messages in thread
From: Catalin Marinas @ 2013-12-11 14:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 11, 2013 at 02:15:03PM +0000, Mark Brown wrote:
> On Wed, Dec 11, 2013 at 02:12:24PM +0000, Will Deacon wrote:
> > I seem to remember Lorenzo having patches already for this, along with
> > bindings, Documentation etc. so it would be good to know how these two
> > series are supposed to interact.
> 
> He said he hadn't had time to work on the actual implementation which is
> why I picked this up - as far as I'm aware we only have the (already
> merged) binding for this.

Lorenzo (on holiday now) pushed the DT topology bindings but there are
no patches yet for reading them. I would much prefer to only describe
the topology via DT rather than MPIDR (because hardware people sometimes
have strange ideas).

-- 
Catalin

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 1/6] arm64: Add asm/cpu.h
  2013-12-11 14:23   ` Mark Brown
@ 2013-12-11 14:27     ` Catalin Marinas
  2013-12-12  6:50       ` Hanjun Guo
  0 siblings, 1 reply; 47+ messages in thread
From: Catalin Marinas @ 2013-12-11 14:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 11, 2013 at 02:23:08PM +0000, Mark Brown wrote:
> On Wed, Dec 11, 2013 at 02:10:19PM +0000, Catalin Marinas wrote:
> > On Wed, Dec 11, 2013 at 01:13:21PM +0000, Mark Brown wrote:
> 
> > > +struct cpuinfo_arm {
> > > +	struct cpu	cpu;
> > > +	u64		cpuid;
> > > +#ifdef CONFIG_SMP
> > > +	unsigned int	loops_per_jiffy;
> > > +#endif
> > > +};
> 
> > How is this structure used? I haven't seen the ACPI code doing anything
> > with struct cpu (though I haven't dug deep enough). Also loops_per_jiffy
> > is useless, that's related to the delay loop based on the generic timer.
> 
> Now I look again we can probably drop this for the toplogy work - it had
> been pulled in as part of pulling things in from pre-v8 and the cpuid
> was used in the code while I was working on it but isn't any more unless
> I'm misreading the code.  
> 
> I don't know what the ACPI guys are doing wtih it, I just saw they added
> the same thing.

It looked to me that for ACPI and empty asm/cpu.h file would do for now.
There are a couple of prototypes for CPU hotplug I think but haven't
seen the structure used (not even the x86_cpu one).

-- 
Catalin

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 4/6] arm64: topology: Implement basic CPU topology support
  2013-12-11 14:24       ` Catalin Marinas
@ 2013-12-11 14:30         ` Mark Brown
  2013-12-11 14:49           ` Catalin Marinas
  2013-12-12  6:59         ` Hanjun Guo
  1 sibling, 1 reply; 47+ messages in thread
From: Mark Brown @ 2013-12-11 14:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 11, 2013 at 02:24:32PM +0000, Catalin Marinas wrote:
> On Wed, Dec 11, 2013 at 02:15:03PM +0000, Mark Brown wrote:

> > He said he hadn't had time to work on the actual implementation which is
> > why I picked this up - as far as I'm aware we only have the (already
> > merged) binding for this.

> Lorenzo (on holiday now) pushed the DT topology bindings but there are
> no patches yet for reading them. I would much prefer to only describe
> the topology via DT rather than MPIDR (because hardware people sometimes
> have strange ideas).

Right, that was my understanding too.  

For clarity the current code only uses DT information, the MPIDR values
it uses are those in the CPU nodes in the DT which are used by the SMP
code and are not read from the hardware.  We can drop that bit of code
easily enough but it seemed useful for getting the scaffolding in place
separately to the binding parsing and for general robustness, perhaps
adding a patch at the end of the series to drop the code handling MPIDR
would deal with the application issues?
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: Digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20131211/c84aaf90/attachment.sig>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 5/6] arm64: topology: Tell the scheduler about the relative power of cores
  2013-12-11 13:13 ` [PATCH 5/6] arm64: topology: Tell the scheduler about the relative power of cores Mark Brown
@ 2013-12-11 14:47   ` Catalin Marinas
  2013-12-11 17:31     ` Mark Brown
  0 siblings, 1 reply; 47+ messages in thread
From: Catalin Marinas @ 2013-12-11 14:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 11, 2013 at 01:13:25PM +0000, Mark Brown wrote:
> The power numbers are the same as for ARMv7 since it seems that the
> expected differential between the big and little cores is very similar on
> both ARMv7 and ARMv8.

I have no idea ;). We don't have real silicon yet, so that's just a wild
guess.

> +/*
> + * Table of relative efficiency of each processors
> + * The efficiency value must fit in 20bit and the final
> + * cpu_scale value must be in the range
> + *   0 < cpu_scale < 3*SCHED_POWER_SCALE/2
> + * in order to return at most 1 when DIV_ROUND_CLOSEST
> + * is used to compute the capacity of a CPU.
> + * Processors that are not defined in the table,
> + * use the default SCHED_POWER_SCALE value for cpu_scale.
> + */
> +static const struct cpu_efficiency table_efficiency[] = {
> +	{ "arm,cortex-a57", 3891 },
> +	{ "arm,cortex-a53", 2048 },
> +	{ NULL, },
> +};

I also don't think we can just have absolute numbers here. I'm pretty
sure these were generated on TC2 but other platforms may have different
max CPU frequencies, memory subsystem, level and size of caches. The
"average" efficiency and difference will be different.

Can we define this via DT? It's a bit strange since that's a constant
used by the Linux scheduler but highly related to hardware.

-- 
Catalin

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 4/6] arm64: topology: Implement basic CPU topology support
  2013-12-11 14:30         ` Mark Brown
@ 2013-12-11 14:49           ` Catalin Marinas
  2013-12-11 16:13             ` Mark Brown
  0 siblings, 1 reply; 47+ messages in thread
From: Catalin Marinas @ 2013-12-11 14:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 11, 2013 at 02:30:18PM +0000, Mark Brown wrote:
> On Wed, Dec 11, 2013 at 02:24:32PM +0000, Catalin Marinas wrote:
> > On Wed, Dec 11, 2013 at 02:15:03PM +0000, Mark Brown wrote:
> 
> > > He said he hadn't had time to work on the actual implementation which is
> > > why I picked this up - as far as I'm aware we only have the (already
> > > merged) binding for this.
> 
> > Lorenzo (on holiday now) pushed the DT topology bindings but there are
> > no patches yet for reading them. I would much prefer to only describe
> > the topology via DT rather than MPIDR (because hardware people sometimes
> > have strange ideas).
> 
> Right, that was my understanding too.  
> 
> For clarity the current code only uses DT information, the MPIDR values
> it uses are those in the CPU nodes in the DT which are used by the SMP
> code and are not read from the hardware.

But these must match the hardware MPIDR to be useful on SMP. So whether
you read them from hardware or DT, they are still the same.

> We can drop that bit of code
> easily enough but it seemed useful for getting the scaffolding in place
> separately to the binding parsing and for general robustness, perhaps
> adding a patch at the end of the series to drop the code handling MPIDR
> would deal with the application issues?

Can we have the topology infrastructure with a flat initial topology and
the actual building in a separate patch?

-- 
Catalin

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 3/6] arm64: dts: Add a devicetree for the ARMv8 4xA53 4xA57 FVP
  2013-12-11 14:11     ` Mark Brown
@ 2013-12-11 15:04       ` Mark Rutland
  2013-12-11 16:00         ` Mark Brown
  2013-12-11 16:08         ` Jon Medhurst (Tixy)
  0 siblings, 2 replies; 47+ messages in thread
From: Mark Rutland @ 2013-12-11 15:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 11, 2013 at 02:11:48PM +0000, Mark Brown wrote:
> On Wed, Dec 11, 2013 at 01:55:36PM +0000, Mark Rutland wrote:
> > On Wed, Dec 11, 2013 at 01:13:23PM +0000, Mark Brown wrote:
> 
> > > +/dts-v1/;
> 
> > > +/memreserve/ 0x80000000 0x00010000;
> 
> > While we are admittedly missing them elsewhere, it would be nice to have
> > a comment stating what the memreserve is for. With a proper PSCI
> > implementation, I don't see why this should be necessary.
> 
> Could you tell me what it's there for?  Like you say everyone else has
> it with no comments and the source doesn't either...

When using the bootwrapper, the bootwrapper itself takes up a portion of
memory (along with the spin-table) which would otherwise be available to
the kernel. This is also true with my bootwrapper PSCI implementation.
The memreserve tells the kernel not to stomp over this memory (though it
will map it in).

With a proper firmware, I'd expect the PSCI implementation to be outside
of the memory exposed to non-secure software, and thus there shouldn't
be anything to reserve.

It's possible that there is a reason for the reservation, but I'd rather
we understood it than we propagated and codified a misunderstanding.

> 
> > > +/ {
> > > +	model = "FVP Base";
> 
> > FVP Base (is as the name implies) a base upon which particular model
> > instances are built. This name should be clarified (e.g. "FVP Base A57x4
> > A53x4").
> 
> > That also applies to the filename.
> 
> I can update these, though they do seem to come from what you guys are
> releasing - you might want to follow up on this internally (this applies
> to almost all of your review comments, sorry).  It's probably going to
> be a bit confusing for users to have the filename change but ho hum :/

I'll try to chase up the issues, thanks for making me aware.

I don't see the name issue as a big problem. This DT has never been part
of the kernel tree, so there's no filename compatibility issue to deal
with. Existing users of the DT will already have to be modified to get
the DTs from a new source.

There should be nothing hanging off the compatible string for the
platform yet -- we have no board files or platform blobs in the arm64
port. If the model name is being used as anything other than a handy
indicator to users, then that's broken anyway.

> 
> > As the FVP is not a vexpress (though it is similar), we should probably
> > have "arm,fvp-base" as the fallback and drop the "arm,vexpress" entry.
> 
> Is it not intended to be a software emulation of a vexpress and hence
> supposed to be broadly compatible with one?

As I understand it, "Base" is a base platform that's somewhat like
vexpress, but is not intended to be an emulation of vexpress. It may be
broadly equivalent, but to limit confusion I'd rather treat them
separately.

Regardless, top-level compatible strings aren't used for anything at the
moment (and hopefully never will be), so nothing should be relying on
"arm,vexpress" being present and we can remove it.

> 
> > > +	psci {
> > > +		compatible = "arm,psci";
> > > +		method = "smc";
> > > +		cpu_suspend = <0xc4000001>;
> > > +		cpu_off = <0x84000002>;
> > > +		cpu_on = <0xc4000003>;
> > > +	};
> 
> > Are these IDs right? One of these IDs is a different width than the
> > others.
> 
> I have no idea, I have no documentation for any of this stuff other than
> the DT you guys release on github - it's just copied verbatim from
> there.

So this goes with the trusted firmware?

> 
> > Which firmware/bootloader does this correspond to?
> 
> I'm testing using the Linaro 13.11 release.

Release of what? I'm not familiar with the entire stack Linaro manage.

I guess the trusted firmware is being used, as mentioned above?

> 
> > > +		big0: cpu at 0 {
> > > +			device_type = "cpu";
> > > +			compatible = "arm,cortex-a57", "arm,armv8";
> > > +			reg = <0x0 0x0>;
> > > +			enable-method = "psci";
> > > +			clock-frequency = <1000000>;
> 
> > Is clock-frequency used anywhere? Is it a useful thing to have
> > (regardless of whether ePAPR defines it)?
> 
> The topology code for ARMv7 (which I'm following for ARMv8) uses this in
> conjunction with the compatible to provide information to the scheduler
> about the relative performance of the cores.  

Ok. While I have concerns regarding the topology code, I'm not averse to
describing the clock-frequency here.

However I worry about how configurable clocks are going to be handled
here. On the 32-bit side I see some platforms with clock-frequency and
others with clocks. We should figure out what the expected way to handle
configurable clocks before we add the code for handling a fixed rate
clock here, or we're going to back ourselves into a corner where this
support can only work on a subset of systems.

> 
> > > +	timer {
> > > +		compatible = "arm,armv8-timer";
> > > +		interrupts = <1 13 0xff01>,
> > > +			     <1 14 0xff01>,
> > > +			     <1 11 0xff01>,
> > > +			     <1 10 0xff01>;
> > > +		clock-frequency = <100000000>;
> 
> > A clock-frequency property in a timer is an indicator that the
> > bootloader/firmware is broken and should be fixed. Is this actually
> > necessary, or does the firmware/loader program CNTFRQ correctly on all
> > CPUs?
> 
> I can try to see what happens if I remove it - I doubt anyone has tested
> without it...

I would very much hope it's unnecessary. If it's needed, then bugs
should be filed and the firmware fixed. The system initialisation code
_must_ set CNTFRQ on all CPUs or it's fundamentally broken.

> 
> > > +	pmu {
> > > +		compatible = "arm,armv8-pmuv3";
> > > +		interrupts = <0 60 4>,
> > > +			     <0 61 4>,
> > > +			     <0 62 4>,
> > > +			     <0 63 4>;
> > > +	};
> 
> > I assume this is just the A57 cores? That should probably be noted for
> > now. In future we'll need to define the relationship between interrupts
> > and CPUs, and describe the A53 cores.
> 
> Again I don't know, this is just copied verbatim from what you guys
> release - I have no additional documentation.

Given that the DTs on the github page don't mention specific CPUs, these
interrupts might not be correct for specific implementations.

I'd rather that the elements we are unsure about were dropped from the
DT for the timebeing rather than being present but incorrect. There's
less room for something to blow up that way, and it makes it clearer
where the deficiencies are.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 3/6] arm64: dts: Add a devicetree for the ARMv8 4xA53 4xA57 FVP
  2013-12-11 15:04       ` Mark Rutland
@ 2013-12-11 16:00         ` Mark Brown
  2013-12-11 16:08         ` Jon Medhurst (Tixy)
  1 sibling, 0 replies; 47+ messages in thread
From: Mark Brown @ 2013-12-11 16:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 11, 2013 at 03:04:33PM +0000, Mark Rutland wrote:
> On Wed, Dec 11, 2013 at 02:11:48PM +0000, Mark Brown wrote:
> > On Wed, Dec 11, 2013 at 01:55:36PM +0000, Mark Rutland wrote:
> > > On Wed, Dec 11, 2013 at 01:13:23PM +0000, Mark Brown wrote:

> > > > +/memreserve/ 0x80000000 0x00010000;

> > > While we are admittedly missing them elsewhere, it would be nice to have
> > > a comment stating what the memreserve is for. With a proper PSCI
> > > implementation, I don't see why this should be necessary.

> > Could you tell me what it's there for?  Like you say everyone else has
> > it with no comments and the source doesn't either...

> When using the bootwrapper, the bootwrapper itself takes up a portion of
> memory (along with the spin-table) which would otherwise be available to
> the kernel. This is also true with my bootwrapper PSCI implementation.
> The memreserve tells the kernel not to stomp over this memory (though it
> will map it in).

OK, I'll try to distill that down to a comment (and add the same comment
to the other DTs).  However see below...

> With a proper firmware, I'd expect the PSCI implementation to be outside
> of the memory exposed to non-secure software, and thus there shouldn't
> be anything to reserve.

> It's possible that there is a reason for the reservation, but I'd rather
> we understood it than we propagated and codified a misunderstanding.

It'd certainly be nice if it were clearer.

> I don't see the name issue as a big problem. This DT has never been part
> of the kernel tree, so there's no filename compatibility issue to deal
> with. Existing users of the DT will already have to be modified to get
> the DTs from a new source.

I'm more worried about existing users not noticing that this is a DT for
the same thing.

> There should be nothing hanging off the compatible string for the
> platform yet -- we have no board files or platform blobs in the arm64
> port. If the model name is being used as anything other than a handy
> indicator to users, then that's broken anyway.

Yeah, that doesn't seem at all risky.

> > > > +	psci {
> > > > +		compatible = "arm,psci";
> > > > +		method = "smc";
> > > > +		cpu_suspend = <0xc4000001>;
> > > > +		cpu_off = <0x84000002>;
> > > > +		cpu_on = <0xc4000003>;
> > > > +	};

> > > Are these IDs right? One of these IDs is a different width than the
> > > others.

> > I have no idea, I have no documentation for any of this stuff other than
> > the DT you guys release on github - it's just copied verbatim from
> > there.

> So this goes with the trusted firmware?

This is on the ARM github site so I'm not 100% sure how tied that is.

> > > Which firmware/bootloader does this correspond to?

> > I'm testing using the Linaro 13.11 release.

> Release of what? I'm not familiar with the entire stack Linaro manage.

> I guess the trusted firmware is being used, as mentioned above?

The only release Linaro have for ARMv8 is an OE one AFAICT which I got
from here:

    http://releases.linaro.org/13.11/openembedded/aarch64

According to the notes there it includes the trusted firmware.  The
above is all the information I have on it.

> Ok. While I have concerns regarding the topology code, I'm not averse to
> describing the clock-frequency here.

> However I worry about how configurable clocks are going to be handled
> here. On the 32-bit side I see some platforms with clock-frequency and
> others with clocks. We should figure out what the expected way to handle
> configurable clocks before we add the code for handling a fixed rate
> clock here, or we're going to back ourselves into a corner where this
> support can only work on a subset of systems.

Well, we can always extend later.  I do share your concerns about doing
this using clock-frequency though, that's why I specified it as being
the maximum when I copied it into the kernel bindings.  That's clearly
the intention of the existing ARMv7 implementation.

One trick here is that this is all happening rather early in boot so
going through the whole clock tree might be entertaining, though we can
probably go round and do another pass when cpufreq comes up.

> I'd rather that the elements we are unsure about were dropped from the
> DT for the timebeing rather than being present but incorrect. There's
> less room for something to blow up that way, and it makes it clearer
> where the deficiencies are.

OK.  To be honest I'm leaning towards letting you guys upstream this
given that all the information I have on it is the DT so from my point
of view I'm unsure about essentially all of it.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: Digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20131211/619d4490/attachment.sig>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 3/6] arm64: dts: Add a devicetree for the ARMv8 4xA53 4xA57 FVP
  2013-12-11 15:04       ` Mark Rutland
  2013-12-11 16:00         ` Mark Brown
@ 2013-12-11 16:08         ` Jon Medhurst (Tixy)
  2013-12-11 16:41           ` Ryan Harkin
  1 sibling, 1 reply; 47+ messages in thread
From: Jon Medhurst (Tixy) @ 2013-12-11 16:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 2013-12-11 at 15:04 +0000, Mark Rutland wrote:
> On Wed, Dec 11, 2013 at 02:11:48PM +0000, Mark Brown wrote:
> > On Wed, Dec 11, 2013 at 01:55:36PM +0000, Mark Rutland wrote:
> > > On Wed, Dec 11, 2013 at 01:13:23PM +0000, Mark Brown wrote:
[...]
> > 
> > > > +/ {
> > > > +	model = "FVP Base";
> > 
> > > FVP Base (is as the name implies) a base upon which particular model
> > > instances are built. This name should be clarified (e.g. "FVP Base A57x4
> > > A53x4").
> > 
> > > That also applies to the filename.
> > 
> > I can update these, though they do seem to come from what you guys are
> > releasing - you might want to follow up on this internally (this applies
> > to almost all of your review comments, sorry).  It's probably going to
> > be a bit confusing for users to have the filename change but ho hum :/
> 
> I'll try to chase up the issues, thanks for making me aware.
> 
> I don't see the name issue as a big problem. This DT has never been part
> of the kernel tree, so there's no filename compatibility issue to deal
> with. Existing users of the DT will already have to be modified to get
> the DTs from a new source.
> 
> There should be nothing hanging off the compatible string for the
> platform yet -- we have no board files or platform blobs in the arm64
> port. If the model name is being used as anything other than a handy
> indicator to users, then that's broken anyway.

I believe Android uses model names to determine the filenames of init
scripts to run. That's not a kernel problem, but thought I would point
out one 'broken' use that I have first hand experience of, having been
tripped up before by ARM's twice yearly lets-rename-everything-again
exercise ;-)

-- 
Tixy

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 4/6] arm64: topology: Implement basic CPU topology support
  2013-12-11 14:49           ` Catalin Marinas
@ 2013-12-11 16:13             ` Mark Brown
  0 siblings, 0 replies; 47+ messages in thread
From: Mark Brown @ 2013-12-11 16:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 11, 2013 at 02:49:24PM +0000, Catalin Marinas wrote:
> On Wed, Dec 11, 2013 at 02:30:18PM +0000, Mark Brown wrote:

> > For clarity the current code only uses DT information, the MPIDR values
> > it uses are those in the CPU nodes in the DT which are used by the SMP
> > code and are not read from the hardware.

> But these must match the hardware MPIDR to be useful on SMP. So whether
> you read them from hardware or DT, they are still the same.

OK.  To be honest it does seem a bit unehelpful to just ignore the
values; systems with bogus MPIDRs can always use the topology binding to
set something sensible so we end up making work for people who did the
right thing.  That said...

> > We can drop that bit of code
> > easily enough but it seemed useful for getting the scaffolding in place
> > separately to the binding parsing and for general robustness, perhaps
> > adding a patch at the end of the series to drop the code handling MPIDR
> > would deal with the application issues?

> Can we have the topology infrastructure with a flat initial topology and
> the actual building in a separate patch?

That would be the effect of just dropping that hunk completely, the
default values we initialise if we don't do anything have that effect.
I'll do that when I repost.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: Digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20131211/85cf2894/attachment.sig>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 3/6] arm64: dts: Add a devicetree for the ARMv8 4xA53 4xA57 FVP
  2013-12-11 16:08         ` Jon Medhurst (Tixy)
@ 2013-12-11 16:41           ` Ryan Harkin
  2013-12-11 17:09             ` Mark Rutland
  0 siblings, 1 reply; 47+ messages in thread
From: Ryan Harkin @ 2013-12-11 16:41 UTC (permalink / raw)
  To: linux-arm-kernel

On 11 December 2013 16:08, Jon Medhurst (Tixy) <tixy@linaro.org> wrote:
> On Wed, 2013-12-11 at 15:04 +0000, Mark Rutland wrote:
>> On Wed, Dec 11, 2013 at 02:11:48PM +0000, Mark Brown wrote:
>> > On Wed, Dec 11, 2013 at 01:55:36PM +0000, Mark Rutland wrote:
>> > > On Wed, Dec 11, 2013 at 01:13:23PM +0000, Mark Brown wrote:
> [...]
>> >
>> > > > +/ {
>> > > > +       model = "FVP Base";
>> >
>> > > FVP Base (is as the name implies) a base upon which particular model
>> > > instances are built. This name should be clarified (e.g. "FVP Base A57x4
>> > > A53x4").
>> >
>> > > That also applies to the filename.

This same file is used to boot the AEMv8 architectural model as well
as the Cortex A57-A73 model, so I think someone would need to find
another filename that makes sense in both contexts.

I guess that using the same file for two models could in itself be a
problem solved via includes and simpler wrappers.

But as Mark Brown says, ARM have originated this file and personally
I'd rather it was changed in the ARM Trusted Firmware repo first and
propagated here.

To answer another question from earlier: there is no direct
correlation between the ARM Trusted Firmware and the device tree files
other than the same repo hosts both files.  Trusted firmware does not
build or embed the DTBs.  UEFI is currently what loads the DTB and
passes it to the kernel.  And that isn't part of the trusted firmware
repo, of course.


>> >
>> > I can update these, though they do seem to come from what you guys are
>> > releasing - you might want to follow up on this internally (this applies
>> > to almost all of your review comments, sorry).  It's probably going to
>> > be a bit confusing for users to have the filename change but ho hum :/
>>
>> I'll try to chase up the issues, thanks for making me aware.
>>
>> I don't see the name issue as a big problem. This DT has never been part
>> of the kernel tree, so there's no filename compatibility issue to deal
>> with. Existing users of the DT will already have to be modified to get
>> the DTs from a new source.
>>
>> There should be nothing hanging off the compatible string for the
>> platform yet -- we have no board files or platform blobs in the arm64
>> port. If the model name is being used as anything other than a handy
>> indicator to users, then that's broken anyway.
>
> I believe Android uses model names to determine the filenames of init
> scripts to run. That's not a kernel problem, but thought I would point
> out one 'broken' use that I have first hand experience of, having been
> tripped up before by ARM's twice yearly lets-rename-everything-again
> exercise ;-)
>
> --
> Tixy
>
>
>
> _______________________________________________
> linaro-kernel mailing list
> linaro-kernel at lists.linaro.org
> http://lists.linaro.org/mailman/listinfo/linaro-kernel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 3/6] arm64: dts: Add a devicetree for the ARMv8 4xA53 4xA57 FVP
  2013-12-11 16:41           ` Ryan Harkin
@ 2013-12-11 17:09             ` Mark Rutland
  2013-12-11 17:50               ` Mark Brown
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Rutland @ 2013-12-11 17:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 11, 2013 at 04:41:32PM +0000, Ryan Harkin wrote:
> On 11 December 2013 16:08, Jon Medhurst (Tixy) <tixy@linaro.org> wrote:
> > On Wed, 2013-12-11 at 15:04 +0000, Mark Rutland wrote:
> >> On Wed, Dec 11, 2013 at 02:11:48PM +0000, Mark Brown wrote:
> >> > On Wed, Dec 11, 2013 at 01:55:36PM +0000, Mark Rutland wrote:
> >> > > On Wed, Dec 11, 2013 at 01:13:23PM +0000, Mark Brown wrote:
> > [...]
> >> >
> >> > > > +/ {
> >> > > > +       model = "FVP Base";
> >> >
> >> > > FVP Base (is as the name implies) a base upon which particular model
> >> > > instances are built. This name should be clarified (e.g. "FVP Base A57x4
> >> > > A53x4").
> >> >
> >> > > That also applies to the filename.
> 
> This same file is used to boot the AEMv8 architectural model as well
> as the Cortex A57-A73 model, so I think someone would need to find
> another filename that makes sense in both contexts.
> 
> I guess that using the same file for two models could in itself be a
> problem solved via includes and simpler wrappers.

We should have a base platform dtsi that describes the standard devices
and memory map.

Individual variants can include the dtsi and describe the model name and
compatible string, CPUs, variant-specific devices, and firmware details.

While the same DT will work regardless of cpu type currently, it's
relatively easy to factor that out anyway.

> 
> But as Mark Brown says, ARM have originated this file and personally
> I'd rather it was changed in the ARM Trusted Firmware repo first and
> propagated here.

I completely agree that the DTs in the Trusted Firmware repo should be
corrected. I will try to get them fixed.

> 
> To answer another question from earlier: there is no direct
> correlation between the ARM Trusted Firmware and the device tree files
> other than the same repo hosts both files.  Trusted firmware does not
> build or embed the DTBs.  UEFI is currently what loads the DTB and
> passes it to the kernel.  And that isn't part of the trusted firmware
> repo, of course.

There _is_ a direct correlation between the trusted firmware and the DT;
the psci node describes the Trusted Firmware PSCI implementation. If you
don't use the Trusted Firmware then you need a different DT.

However, this would only be a platform variant built atop of the more
common FVP Base, so most of the DT can be shared.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 5/6] arm64: topology: Tell the scheduler about the relative power of cores
  2013-12-11 14:47   ` Catalin Marinas
@ 2013-12-11 17:31     ` Mark Brown
  2013-12-11 19:27       ` Nicolas Pitre
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Brown @ 2013-12-11 17:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 11, 2013 at 02:47:55PM +0000, Catalin Marinas wrote:
> On Wed, Dec 11, 2013 at 01:13:25PM +0000, Mark Brown wrote:

> > The power numbers are the same as for ARMv7 since it seems that the
> > expected differential between the big and little cores is very similar on
> > both ARMv7 and ARMv8.

> I have no idea ;). We don't have real silicon yet, so that's just a wild
> guess.

I was going on some typical DMIPS/MHz numbers that I'd found so
hopefully it's not a complete guess, though it will vary and that's just
one benchmark with all the realism problems that entails.  The ratio
seemed to be about the same as the equivalent for the ARMv7 cores so
given that it's a finger in the air thing it didn't seem worth drilling
down much further.

> > +static const struct cpu_efficiency table_efficiency[] = {
> > +	{ "arm,cortex-a57", 3891 },
> > +	{ "arm,cortex-a53", 2048 },
> > +	{ NULL, },
> > +};

> I also don't think we can just have absolute numbers here. I'm pretty
> sure these were generated on TC2 but other platforms may have different
> max CPU frequencies, memory subsystem, level and size of caches. The
> "average" efficiency and difference will be different.

The CPU frequencies at least are taken care of already, these numbers
get scaled for each core.  Once we're talking about things like the
memory I'd also start worrying about application specific effects.
There's also going to be stuff like thermal management which get fed in
here and which varies during runtime.

I don't know where the numbers came from for v7.

> Can we define this via DT? It's a bit strange since that's a constant
> used by the Linux scheduler but highly related to hardware.

I really don't think that's a good idea at this point, it seems better
for the DT to stick to factual descriptions of what's present rather
than putting tuning numbers in there.  If the wild guesses are in the
kernel source it's fairly easy to improve them, if they're baked into
system DTs that becomes harder.

I think it's important not to overthink what we're doing here - the
information we're trying to convey is that the A57s are a lot faster
than the A53s.  Getting the numbers "right" is good and helpful but it's
not so critical that we should let perfect be the enemy of good.  This
should at least give ARMv8 implementations about equivalent performance
to ARMv7 with this stuff.

I'm also worried about putting numbers into the DT now with all the
scheduler work going on, this time next year we may well have a
completely different idea of what we want to tell the scheduler.  It may
be that we end up being able to explicitly tell the scheduler about
things like the memory architecture, or that the scheduler just gets
smarter and can estimate all this stuff at runtime.  

Customisation seems better provided at runtime than in the DT, that's
more friendly to application specific tuning and it means that we're
less committed to what's in the DT so we can improve things as our
understanding increases.  If it was punting to platform data and we
could just update it if we decided it wasn't ideal it'd be less of an
issue but punting to something that ought to be an ABI isn't awesome.

Once we've got more experience with the silicon and the scheduler work
has progressed we might decide it's helpful to put tuning controls into
DT but starting from that point feels like it's more likely to cause
problems than help.  With where we are now something simple and in the
ballpark is going to get us a long way.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: Digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20131211/6b8f55f4/attachment.sig>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 3/6] arm64: dts: Add a devicetree for the ARMv8 4xA53 4xA57 FVP
  2013-12-11 17:09             ` Mark Rutland
@ 2013-12-11 17:50               ` Mark Brown
  0 siblings, 0 replies; 47+ messages in thread
From: Mark Brown @ 2013-12-11 17:50 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 11, 2013 at 05:09:03PM +0000, Mark Rutland wrote:
> On Wed, Dec 11, 2013 at 04:41:32PM +0000, Ryan Harkin wrote:

> > But as Mark Brown says, ARM have originated this file and personally
> > I'd rather it was changed in the ARM Trusted Firmware repo first and
> > propagated here.

> I completely agree that the DTs in the Trusted Firmware repo should be
> corrected. I will try to get them fixed.

And mainlined?
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: Digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20131211/fb9afc37/attachment.sig>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 5/6] arm64: topology: Tell the scheduler about the relative power of cores
  2013-12-11 17:31     ` Mark Brown
@ 2013-12-11 19:27       ` Nicolas Pitre
  2013-12-12 11:56         ` Morten Rasmussen
  0 siblings, 1 reply; 47+ messages in thread
From: Nicolas Pitre @ 2013-12-11 19:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 11 Dec 2013, Mark Brown wrote:

> On Wed, Dec 11, 2013 at 02:47:55PM +0000, Catalin Marinas wrote:
> > On Wed, Dec 11, 2013 at 01:13:25PM +0000, Mark Brown wrote:
> 
> > > The power numbers are the same as for ARMv7 since it seems that the
> > > expected differential between the big and little cores is very similar on
> > > both ARMv7 and ARMv8.
> 
> > I have no idea ;). We don't have real silicon yet, so that's just a wild
> > guess.
> 
> I was going on some typical DMIPS/MHz numbers that I'd found so
> hopefully it's not a complete guess, though it will vary and that's just
> one benchmark with all the realism problems that entails.  The ratio
> seemed to be about the same as the equivalent for the ARMv7 cores so
> given that it's a finger in the air thing it didn't seem worth drilling
> down much further.
> 
> > > +static const struct cpu_efficiency table_efficiency[] = {
> > > +	{ "arm,cortex-a57", 3891 },
> > > +	{ "arm,cortex-a53", 2048 },
> > > +	{ NULL, },
> > > +};
> 
> > I also don't think we can just have absolute numbers here. I'm pretty
> > sure these were generated on TC2 but other platforms may have different
> > max CPU frequencies, memory subsystem, level and size of caches. The
> > "average" efficiency and difference will be different.
> 
> The CPU frequencies at least are taken care of already, these numbers
> get scaled for each core.  Once we're talking about things like the
> memory I'd also start worrying about application specific effects.
> There's also going to be stuff like thermal management which get fed in
> here and which varies during runtime.
> 
> I don't know where the numbers came from for v7.
> 
> > Can we define this via DT? It's a bit strange since that's a constant
> > used by the Linux scheduler but highly related to hardware.
> 
> I really don't think that's a good idea at this point, it seems better
> for the DT to stick to factual descriptions of what's present rather
> than putting tuning numbers in there.  If the wild guesses are in the
> kernel source it's fairly easy to improve them, if they're baked into
> system DTs that becomes harder.

I really think putting such things into DT is wrong.

If those numbers were derived from benchmark results, then it is most 
probably best to try to come up with some kind of equivalent benchmark 
in the kernel to qualify CPUs at run time.  After all this is what 
actually matters i.e. how CPUs perform relative to each other, and that 
may vary with many factors that people will forget to update when 
copying a DT content to enable a new board.

And that wouldn't be the first time some benchmark is used at boot time.  
Different crypto/RAID algorithms are tested to determine the best one to 
use, etc.

> I'm also worried about putting numbers into the DT now with all the
> scheduler work going on, this time next year we may well have a
> completely different idea of what we want to tell the scheduler.  It may
> be that we end up being able to explicitly tell the scheduler about
> things like the memory architecture, or that the scheduler just gets
> smarter and can estimate all this stuff at runtime.  

Exactly.  Which is why the kernel better be self-sufficient to determine 
such params.  Dt should be used only for things that may not be probed 
at run time.  The relative performance of a CPU certainly can be probed 
at run time.

Obviously the specifics of the actual benchmark might be debated, but 
the same can be said about static numbers.


Nicolas

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 1/6] arm64: Add asm/cpu.h
  2013-12-11 14:27     ` Catalin Marinas
@ 2013-12-12  6:50       ` Hanjun Guo
  2013-12-12 10:36         ` Mark Rutland
  0 siblings, 1 reply; 47+ messages in thread
From: Hanjun Guo @ 2013-12-12  6:50 UTC (permalink / raw)
  To: linux-arm-kernel

On 2013-12-11 22:27, Catalin Marinas wrote:
> On Wed, Dec 11, 2013 at 02:23:08PM +0000, Mark Brown wrote:
>> On Wed, Dec 11, 2013 at 02:10:19PM +0000, Catalin Marinas wrote:
>>> On Wed, Dec 11, 2013 at 01:13:21PM +0000, Mark Brown wrote:
>>
>>>> +struct cpuinfo_arm {
>>>> +	struct cpu	cpu;
>>>> +	u64		cpuid;
>>>> +#ifdef CONFIG_SMP
>>>> +	unsigned int	loops_per_jiffy;
>>>> +#endif
>>>> +};
>>
>>> How is this structure used? I haven't seen the ACPI code doing anything
>>> with struct cpu (though I haven't dug deep enough). Also loops_per_jiffy
>>> is useless, that's related to the delay loop based on the generic timer.
>>
>> Now I look again we can probably drop this for the toplogy work - it had
>> been pulled in as part of pulling things in from pre-v8 and the cpuid
>> was used in the code while I was working on it but isn't any more unless
>> I'm misreading the code.  
>>
>> I don't know what the ACPI guys are doing wtih it, I just saw they added
>> the same thing.
> 
> It looked to me that for ACPI and empty asm/cpu.h file would do for now.
> There are a couple of prototypes for CPU hotplug I think but haven't
> seen the structure used (not even the x86_cpu one).

In ACPI code, only struct cpu is used for ACPI based CPU hotplug, here
is the code will send to upstream when the ACPI core for ARM is ready.

---
 arch/arm64/include/asm/cpu.h |    5 +++++
 arch/arm64/kernel/topology.c |   26 ++++++++++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
index dbeb98d..5613e09 100644
--- a/arch/arm64/include/asm/cpu.h
+++ b/arch/arm64/include/asm/cpu.h
@@ -20,6 +20,11 @@ struct cpuinfo_arm {
 #endif
 };

+#ifdef CONFIG_HOTPLUG_CPU
+extern int arch_register_cpu(int cpu);
+extern void arch_unregister_cpu(int cpu);
+#endif
+
 DECLARE_PER_CPU(struct cpuinfo_arm, cpu_data);

 #endif
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index cb548f1..5c8e69c 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -18,3 +18,29 @@ void arch_fix_phys_package_id(int num, u32 slot)
 }
 EXPORT_SYMBOL_GPL(arch_fix_phys_package_id);

+#ifdef CONFIG_HOTPLUG_CPU
+int __ref arch_register_cpu(int cpu)
+{
+       struct cpuinfo_arm *cpuinfo = &per_cpu(cpu_data, cpu);
+
+       /* BSP cann't be taken down on arm */
+       if (cpu)
+               cpuinfo->cpu.hotpluggable = 1;
+
+       return register_cpu(&cpuinfo->cpu, cpu);
+}
+EXPORT_SYMBOL(arch_register_cpu);
+
+void arch_unregister_cpu(int cpu)
+{
+       unregister_cpu(&per_cpu(cpu_data, cpu).cpu);
+}
+EXPORT_SYMBOL(arch_unregister_cpu);
+#else /* CONFIG_HOTPLUG_CPU */
+
+static int __init arch_register_cpu(int cpu)
+{
+       return register_cpu(&per_cpu(cpu_data, cpu).cpu, cpu);
+}
+#endif /* CONFIG_HOTPLUG_CPU */
+

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 4/6] arm64: topology: Implement basic CPU topology support
  2013-12-11 14:24       ` Catalin Marinas
  2013-12-11 14:30         ` Mark Brown
@ 2013-12-12  6:59         ` Hanjun Guo
  2013-12-12 10:27           ` Mark Brown
  1 sibling, 1 reply; 47+ messages in thread
From: Hanjun Guo @ 2013-12-12  6:59 UTC (permalink / raw)
  To: linux-arm-kernel

On 2013-12-11 22:24, Catalin Marinas wrote:
> On Wed, Dec 11, 2013 at 02:15:03PM +0000, Mark Brown wrote:
>> On Wed, Dec 11, 2013 at 02:12:24PM +0000, Will Deacon wrote:
>>> I seem to remember Lorenzo having patches already for this, along with
>>> bindings, Documentation etc. so it would be good to know how these two
>>> series are supposed to interact.
>>
>> He said he hadn't had time to work on the actual implementation which is
>> why I picked this up - as far as I'm aware we only have the (already
>> merged) binding for this.
> 
> Lorenzo (on holiday now) pushed the DT topology bindings but there are
> no patches yet for reading them. I would much prefer to only describe
> the topology via DT rather than MPIDR (because hardware people sometimes
> have strange ideas).

Topology code for ARM32 is based on MPIDR now, should we update it too with
describing the topology via DT?

Thanks
Hanjun

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 4/6] arm64: topology: Implement basic CPU topology support
  2013-12-12  6:59         ` Hanjun Guo
@ 2013-12-12 10:27           ` Mark Brown
  2013-12-12 11:22             ` Hanjun Guo
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Brown @ 2013-12-12 10:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Dec 12, 2013 at 02:59:52PM +0800, Hanjun Guo wrote:
> On 2013-12-11 22:24, Catalin Marinas wrote:

> > Lorenzo (on holiday now) pushed the DT topology bindings but there are
> > no patches yet for reading them. I would much prefer to only describe
> > the topology via DT rather than MPIDR (because hardware people sometimes
> > have strange ideas).

> Topology code for ARM32 is based on MPIDR now, should we update it too with
> describing the topology via DT?

Yes, the binding is supposed to apply to both so it'd be good to support
it on pre-v8.  However using MPIDR should continue to be supported on
pre-v8 as existing systems don't use the topology binding and so
presumably have accurate MPIDRs, requiring the binding would be a
regression for them.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: Digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20131212/d51be8f5/attachment.sig>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 1/6] arm64: Add asm/cpu.h
  2013-12-12  6:50       ` Hanjun Guo
@ 2013-12-12 10:36         ` Mark Rutland
  2013-12-12 11:20           ` Hanjun Guo
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Rutland @ 2013-12-12 10:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Dec 12, 2013 at 06:50:27AM +0000, Hanjun Guo wrote:
> On 2013-12-11 22:27, Catalin Marinas wrote:
> > On Wed, Dec 11, 2013 at 02:23:08PM +0000, Mark Brown wrote:
> >> On Wed, Dec 11, 2013 at 02:10:19PM +0000, Catalin Marinas wrote:
> >>> On Wed, Dec 11, 2013 at 01:13:21PM +0000, Mark Brown wrote:
> >>
> >>>> +struct cpuinfo_arm {
> >>>> +	struct cpu	cpu;
> >>>> +	u64		cpuid;
> >>>> +#ifdef CONFIG_SMP
> >>>> +	unsigned int	loops_per_jiffy;
> >>>> +#endif
> >>>> +};
> >>
> >>> How is this structure used? I haven't seen the ACPI code doing anything
> >>> with struct cpu (though I haven't dug deep enough). Also loops_per_jiffy
> >>> is useless, that's related to the delay loop based on the generic timer.
> >>
> >> Now I look again we can probably drop this for the toplogy work - it had
> >> been pulled in as part of pulling things in from pre-v8 and the cpuid
> >> was used in the code while I was working on it but isn't any more unless
> >> I'm misreading the code.  
> >>
> >> I don't know what the ACPI guys are doing wtih it, I just saw they added
> >> the same thing.
> > 
> > It looked to me that for ACPI and empty asm/cpu.h file would do for now.
> > There are a couple of prototypes for CPU hotplug I think but haven't
> > seen the structure used (not even the x86_cpu one).
> 
> In ACPI code, only struct cpu is used for ACPI based CPU hotplug, here
> is the code will send to upstream when the ACPI core for ARM is ready.
> 
> ---
>  arch/arm64/include/asm/cpu.h |    5 +++++
>  arch/arm64/kernel/topology.c |   26 ++++++++++++++++++++++++++
>  2 files changed, 31 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
> index dbeb98d..5613e09 100644
> --- a/arch/arm64/include/asm/cpu.h
> +++ b/arch/arm64/include/asm/cpu.h
> @@ -20,6 +20,11 @@ struct cpuinfo_arm {
>  #endif
>  };
> 
> +#ifdef CONFIG_HOTPLUG_CPU
> +extern int arch_register_cpu(int cpu);
> +extern void arch_unregister_cpu(int cpu);
> +#endif
> +
>  DECLARE_PER_CPU(struct cpuinfo_arm, cpu_data);
> 
>  #endif
> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> index cb548f1..5c8e69c 100644
> --- a/arch/arm64/kernel/topology.c
> +++ b/arch/arm64/kernel/topology.c
> @@ -18,3 +18,29 @@ void arch_fix_phys_package_id(int num, u32 slot)
>  }
>  EXPORT_SYMBOL_GPL(arch_fix_phys_package_id);
> 
> +#ifdef CONFIG_HOTPLUG_CPU
> +int __ref arch_register_cpu(int cpu)
> +{
> +       struct cpuinfo_arm *cpuinfo = &per_cpu(cpu_data, cpu);
> +
> +       /* BSP cann't be taken down on arm */
> +       if (cpu)
> +               cpuinfo->cpu.hotpluggable = 1;

_why_ does the ACPI standard prohibit hotplugging the boot CPU?

In non-ACPI systems we can hotplug out the boot CPU (we can do so under
KVM using PSCI).

I note that the x86 arch_register_cpu allows CPU0 to be hotpluggable on
Intel systems as long as there are no dependencies on CPU0 being
active. Surely we can test for something more fine-grained rather than
disallowing CPU0 hotplug outright?

How does this interact with the existing arm64 hotplug code?

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 1/6] arm64: Add asm/cpu.h
  2013-12-12 10:36         ` Mark Rutland
@ 2013-12-12 11:20           ` Hanjun Guo
  0 siblings, 0 replies; 47+ messages in thread
From: Hanjun Guo @ 2013-12-12 11:20 UTC (permalink / raw)
  To: linux-arm-kernel

On 2013-12-12 18:36, Mark Rutland wrote:
> On Thu, Dec 12, 2013 at 06:50:27AM +0000, Hanjun Guo wrote:
[...]
>>  #endif
>> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
>> index cb548f1..5c8e69c 100644
>> --- a/arch/arm64/kernel/topology.c
>> +++ b/arch/arm64/kernel/topology.c
>> @@ -18,3 +18,29 @@ void arch_fix_phys_package_id(int num, u32 slot)
>>  }
>>  EXPORT_SYMBOL_GPL(arch_fix_phys_package_id);
>>
>> +#ifdef CONFIG_HOTPLUG_CPU
>> +int __ref arch_register_cpu(int cpu)
>> +{
>> +       struct cpuinfo_arm *cpuinfo = &per_cpu(cpu_data, cpu);
>> +
>> +       /* BSP cann't be taken down on arm */
>> +       if (cpu)
>> +               cpuinfo->cpu.hotpluggable = 1;
> 
> _why_ does the ACPI standard prohibit hotplugging the boot CPU?

ACPI spec has not limitations to hotplug the boot CPU, it depends
on the arch.

> 
> In non-ACPI systems we can hotplug out the boot CPU (we can do so under
> KVM using PSCI).

If all the things (interrupts and etc) running on CPU0 can be migrated to
other CPUs when offline a CPU, it can be hotplugable in ACPI way too.

> 
> I note that the x86 arch_register_cpu allows CPU0 to be hotpluggable on
> Intel systems as long as there are no dependencies on CPU0 being
> active. Surely we can test for something more fine-grained rather than
> disallowing CPU0 hotplug outright?

Ok, will update it when I formally send this patch out.

> 
> How does this interact with the existing arm64 hotplug code?

Some other patches are needed for ACPI based CPU hotplug,will send
out when the ACPI core for ARM is ready, I will cc you and then
you will know what's going on :)

Thanks
Hanjun

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 4/6] arm64: topology: Implement basic CPU topology support
  2013-12-12 10:27           ` Mark Brown
@ 2013-12-12 11:22             ` Hanjun Guo
  0 siblings, 0 replies; 47+ messages in thread
From: Hanjun Guo @ 2013-12-12 11:22 UTC (permalink / raw)
  To: linux-arm-kernel

On 2013-12-12 18:27, Mark Brown wrote:
> On Thu, Dec 12, 2013 at 02:59:52PM +0800, Hanjun Guo wrote:
>> On 2013-12-11 22:24, Catalin Marinas wrote:
> 
>>> Lorenzo (on holiday now) pushed the DT topology bindings but there are
>>> no patches yet for reading them. I would much prefer to only describe
>>> the topology via DT rather than MPIDR (because hardware people sometimes
>>> have strange ideas).
> 
>> Topology code for ARM32 is based on MPIDR now, should we update it too with
>> describing the topology via DT?
> 
> Yes, the binding is supposed to apply to both so it'd be good to support
> it on pre-v8.  However using MPIDR should continue to be supported on
> pre-v8 as existing systems don't use the topology binding and so
> presumably have accurate MPIDRs, requiring the binding would be a
> regression for them.

Ah, that makes sense :)

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 5/6] arm64: topology: Tell the scheduler about the relative power of cores
  2013-12-11 19:27       ` Nicolas Pitre
@ 2013-12-12 11:56         ` Morten Rasmussen
  2013-12-12 12:22           ` Mark Brown
  2013-12-12 14:26           ` Vincent Guittot
  0 siblings, 2 replies; 47+ messages in thread
From: Morten Rasmussen @ 2013-12-12 11:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 11, 2013 at 07:27:09PM +0000, Nicolas Pitre wrote:
> On Wed, 11 Dec 2013, Mark Brown wrote:
> 
> > On Wed, Dec 11, 2013 at 02:47:55PM +0000, Catalin Marinas wrote:
> > > On Wed, Dec 11, 2013 at 01:13:25PM +0000, Mark Brown wrote:
> > 
> > > > The power numbers are the same as for ARMv7 since it seems that the
> > > > expected differential between the big and little cores is very similar on
> > > > both ARMv7 and ARMv8.
> > 
> > > I have no idea ;). We don't have real silicon yet, so that's just a wild
> > > guess.
> > 
> > I was going on some typical DMIPS/MHz numbers that I'd found so
> > hopefully it's not a complete guess, though it will vary and that's just
> > one benchmark with all the realism problems that entails.  The ratio
> > seemed to be about the same as the equivalent for the ARMv7 cores so
> > given that it's a finger in the air thing it didn't seem worth drilling
> > down much further.
> > 
> > > > +static const struct cpu_efficiency table_efficiency[] = {
> > > > +	{ "arm,cortex-a57", 3891 },
> > > > +	{ "arm,cortex-a53", 2048 },
> > > > +	{ NULL, },
> > > > +};
> > 
> > > I also don't think we can just have absolute numbers here. I'm pretty
> > > sure these were generated on TC2 but other platforms may have different
> > > max CPU frequencies, memory subsystem, level and size of caches. The
> > > "average" efficiency and difference will be different.
> > 
> > The CPU frequencies at least are taken care of already, these numbers
> > get scaled for each core.  Once we're talking about things like the
> > memory I'd also start worrying about application specific effects.
> > There's also going to be stuff like thermal management which get fed in
> > here and which varies during runtime.
> > 
> > I don't know where the numbers came from for v7.

I'm fairly sure that they are guestimates based on TC2. Vincent should
know. I wouldn't consider them accurate in any way as the relative
performance varies wildly depending on the workload. However, they are
better than having no information at all.

> > 
> > > Can we define this via DT? It's a bit strange since that's a constant
> > > used by the Linux scheduler but highly related to hardware.
> > 
> > I really don't think that's a good idea at this point, it seems better
> > for the DT to stick to factual descriptions of what's present rather
> > than putting tuning numbers in there.  If the wild guesses are in the
> > kernel source it's fairly easy to improve them, if they're baked into
> > system DTs that becomes harder.
> 
> I really think putting such things into DT is wrong.
> 
> If those numbers were derived from benchmark results, then it is most 
> probably best to try to come up with some kind of equivalent benchmark 
> in the kernel to qualify CPUs at run time.  After all this is what 
> actually matters i.e. how CPUs perform relative to each other, and that 
> may vary with many factors that people will forget to update when 
> copying a DT content to enable a new board.
> 
> And that wouldn't be the first time some benchmark is used at boot time.  
> Different crypto/RAID algorithms are tested to determine the best one to 
> use, etc.
> 
> > I'm also worried about putting numbers into the DT now with all the
> > scheduler work going on, this time next year we may well have a
> > completely different idea of what we want to tell the scheduler.  It may
> > be that we end up being able to explicitly tell the scheduler about
> > things like the memory architecture, or that the scheduler just gets
> > smarter and can estimate all this stuff at runtime.  

I agree. We need to sort the scheduler side out first before we commit
to anything. If we are worried about including code into v8 that we are
going to change later, then it is probably better to leave this part
out. See my response to Mark's patch subset with the same patch for
details (I didn't see this thread until afterwardsi - sorry).

> 
> Exactly.  Which is why the kernel better be self-sufficient to determine 
> such params.  Dt should be used only for things that may not be probed 
> at run time.  The relative performance of a CPU certainly can be probed 
> at run time.
> 
> Obviously the specifics of the actual benchmark might be debated, but 
> the same can be said about static numbers.

Indeed.

Morten

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 5/6] arm64: topology: Tell the scheduler about the relative power of cores
  2013-12-12 11:56         ` Morten Rasmussen
@ 2013-12-12 12:22           ` Mark Brown
  2013-12-12 13:42             ` Morten Rasmussen
  2013-12-12 14:26           ` Vincent Guittot
  1 sibling, 1 reply; 47+ messages in thread
From: Mark Brown @ 2013-12-12 12:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Dec 12, 2013 at 11:56:40AM +0000, Morten Rasmussen wrote:

> > > I'm also worried about putting numbers into the DT now with all the
> > > scheduler work going on, this time next year we may well have a
> > > completely different idea of what we want to tell the scheduler.  It may
> > > be that we end up being able to explicitly tell the scheduler about
> > > things like the memory architecture, or that the scheduler just gets
> > > smarter and can estimate all this stuff at runtime.  

> I agree. We need to sort the scheduler side out first before we commit
> to anything. If we are worried about including code into v8 that we are
> going to change later, then it is probably better to leave this part
> out. See my response to Mark's patch subset with the same patch for
> details (I didn't see this thread until afterwardsi - sorry).

My take on change is that we should be doing as good a job as we can
with the scheduler we have so users get whatever we're able to deliver
at the current time.  Having to change in kernel code shouldn't be that
big a deal, especially with something like this where the scheduler is
free to ignore what it's told without churning the interface.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: Digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20131212/219906ec/attachment.sig>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 5/6] arm64: topology: Tell the scheduler about the relative power of cores
  2013-12-12 12:22           ` Mark Brown
@ 2013-12-12 13:42             ` Morten Rasmussen
  0 siblings, 0 replies; 47+ messages in thread
From: Morten Rasmussen @ 2013-12-12 13:42 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Dec 12, 2013 at 12:22:36PM +0000, Mark Brown wrote:
> On Thu, Dec 12, 2013 at 11:56:40AM +0000, Morten Rasmussen wrote:
> 
> > > > I'm also worried about putting numbers into the DT now with all the
> > > > scheduler work going on, this time next year we may well have a
> > > > completely different idea of what we want to tell the scheduler.  It may
> > > > be that we end up being able to explicitly tell the scheduler about
> > > > things like the memory architecture, or that the scheduler just gets
> > > > smarter and can estimate all this stuff at runtime.  
> 
> > I agree. We need to sort the scheduler side out first before we commit
> > to anything. If we are worried about including code into v8 that we are
> > going to change later, then it is probably better to leave this part
> > out. See my response to Mark's patch subset with the same patch for
> > details (I didn't see this thread until afterwardsi - sorry).
> 
> My take on change is that we should be doing as good a job as we can
> with the scheduler we have so users get whatever we're able to deliver
> at the current time.  Having to change in kernel code shouldn't be that
> big a deal, especially with something like this where the scheduler is
> free to ignore what it's told without churning the interface.

Fair enough. I just wanted to make sure that people knew about the
cpu_power issues before deciding whether to do the same for v8.

Morten

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 5/6] arm64: topology: Tell the scheduler about the relative power of cores
  2013-12-12 11:56         ` Morten Rasmussen
  2013-12-12 12:22           ` Mark Brown
@ 2013-12-12 14:26           ` Vincent Guittot
  1 sibling, 0 replies; 47+ messages in thread
From: Vincent Guittot @ 2013-12-12 14:26 UTC (permalink / raw)
  To: linux-arm-kernel

On 12 December 2013 12:56, Morten Rasmussen <morten.rasmussen@arm.com> wrote:
> On Wed, Dec 11, 2013 at 07:27:09PM +0000, Nicolas Pitre wrote:
>> On Wed, 11 Dec 2013, Mark Brown wrote:
>>
>> > On Wed, Dec 11, 2013 at 02:47:55PM +0000, Catalin Marinas wrote:
>> > > On Wed, Dec 11, 2013 at 01:13:25PM +0000, Mark Brown wrote:
>> >
>> > > > The power numbers are the same as for ARMv7 since it seems that the
>> > > > expected differential between the big and little cores is very similar on
>> > > > both ARMv7 and ARMv8.
>> >
>> > > I have no idea ;). We don't have real silicon yet, so that's just a wild
>> > > guess.
>> >
>> > I was going on some typical DMIPS/MHz numbers that I'd found so
>> > hopefully it's not a complete guess, though it will vary and that's just
>> > one benchmark with all the realism problems that entails.  The ratio
>> > seemed to be about the same as the equivalent for the ARMv7 cores so
>> > given that it's a finger in the air thing it didn't seem worth drilling
>> > down much further.
>> >
>> > > > +static const struct cpu_efficiency table_efficiency[] = {
>> > > > +       { "arm,cortex-a57", 3891 },
>> > > > +       { "arm,cortex-a53", 2048 },
>> > > > +       { NULL, },
>> > > > +};
>> >
>> > > I also don't think we can just have absolute numbers here. I'm pretty
>> > > sure these were generated on TC2 but other platforms may have different
>> > > max CPU frequencies, memory subsystem, level and size of caches. The
>> > > "average" efficiency and difference will be different.
>> >
>> > The CPU frequencies at least are taken care of already, these numbers
>> > get scaled for each core.  Once we're talking about things like the
>> > memory I'd also start worrying about application specific effects.
>> > There's also going to be stuff like thermal management which get fed in
>> > here and which varies during runtime.
>> >
>> > I don't know where the numbers came from for v7.
>
> I'm fairly sure that they are guestimates based on TC2. Vincent should
> know. I wouldn't consider them accurate in any way as the relative

The values are not based on TC2 but on the dmips/Mhz figures from ARM

Vincent

> performance varies wildly depending on the workload. However, they are
> better than having no information at all.
>
>> >
>> > > Can we define this via DT? It's a bit strange since that's a constant
>> > > used by the Linux scheduler but highly related to hardware.
>> >
>> > I really don't think that's a good idea at this point, it seems better
>> > for the DT to stick to factual descriptions of what's present rather
>> > than putting tuning numbers in there.  If the wild guesses are in the
>> > kernel source it's fairly easy to improve them, if they're baked into
>> > system DTs that becomes harder.
>>
>> I really think putting such things into DT is wrong.
>>
>> If those numbers were derived from benchmark results, then it is most
>> probably best to try to come up with some kind of equivalent benchmark
>> in the kernel to qualify CPUs at run time.  After all this is what
>> actually matters i.e. how CPUs perform relative to each other, and that
>> may vary with many factors that people will forget to update when
>> copying a DT content to enable a new board.
>>
>> And that wouldn't be the first time some benchmark is used at boot time.
>> Different crypto/RAID algorithms are tested to determine the best one to
>> use, etc.
>>
>> > I'm also worried about putting numbers into the DT now with all the
>> > scheduler work going on, this time next year we may well have a
>> > completely different idea of what we want to tell the scheduler.  It may
>> > be that we end up being able to explicitly tell the scheduler about
>> > things like the memory architecture, or that the scheduler just gets
>> > smarter and can estimate all this stuff at runtime.
>
> I agree. We need to sort the scheduler side out first before we commit
> to anything. If we are worried about including code into v8 that we are
> going to change later, then it is probably better to leave this part
> out. See my response to Mark's patch subset with the same patch for
> details (I didn't see this thread until afterwardsi - sorry).
>
>>
>> Exactly.  Which is why the kernel better be self-sufficient to determine
>> such params.  Dt should be used only for things that may not be probed
>> at run time.  The relative performance of a CPU certainly can be probed
>> at run time.
>>
>> Obviously the specifics of the actual benchmark might be debated, but
>> the same can be said about static numbers.
>
> Indeed.
>
> Morten
>
> _______________________________________________
> linaro-kernel mailing list
> linaro-kernel at lists.linaro.org
> http://lists.linaro.org/mailman/listinfo/linaro-kernel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 4/6] arm64: topology: Implement basic CPU topology support
  2013-12-11 13:13 ` [PATCH 4/6] arm64: topology: Implement basic CPU topology support Mark Brown
  2013-12-11 14:12   ` Will Deacon
@ 2013-12-16 10:57   ` Lorenzo Pieralisi
  2013-12-16 11:33     ` Mark Brown
  2013-12-16 12:29     ` Mark Brown
  2013-12-16 14:45   ` Alex Shi
  2 siblings, 2 replies; 47+ messages in thread
From: Lorenzo Pieralisi @ 2013-12-16 10:57 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 11, 2013 at 01:13:24PM +0000, Mark Brown wrote:
> From: Mark Brown <broonie@linaro.org>
> 
> Add basic CPU topology support to arm64, based on the existing pre-v8
> code and some work done by Mark Hambleton.  This patch does not
> implement the ARM CPU topology bindings, it implements equivalent
> support to the existing the equivalent pre-v8 capability using the
> mandatory MPIDR information in the CPU binding in device tree and
> assuming that a simple SMP or multi-cluster topology is in use.
> 
> The primary goal is to separate the architecture hookup for providing
> topology information from the DT parsing in order to ease review and
> avoid blocking the architecture code (which will be built on by other
> work) with the DT code review by providing something something simple
> and basic.  Having this support should also make the kernel cope better
> with incomplete DTs.
> 
> Further patches will provide support for overriding this using the
> topology bindings, providing richer support for a wider range of systems.
> 
> Signed-off-by: Mark Brown <broonie@linaro.org>
> ---
>  arch/arm64/Kconfig                |   8 +++
>  arch/arm64/include/asm/cpu.h      |   1 -
>  arch/arm64/include/asm/cputype.h  |   9 +++
>  arch/arm64/include/asm/smp_plat.h |   1 +
>  arch/arm64/include/asm/topology.h |  42 +++++++++++
>  arch/arm64/kernel/Makefile        |   1 +
>  arch/arm64/kernel/setup.c         |   9 +--
>  arch/arm64/kernel/smp.c           |  19 ++++-
>  arch/arm64/kernel/topology.c      | 143 ++++++++++++++++++++++++++++++++++++++
>  9 files changed, 227 insertions(+), 6 deletions(-)
>  create mode 100644 arch/arm64/include/asm/topology.h
>  create mode 100644 arch/arm64/kernel/topology.c
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 88c8b6c1341a..7b4dab852937 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -154,6 +154,14 @@ config SMP
> 
>           If you don't know what to do here, say N.
> 
> +config ARM_CPU_TOPOLOGY
> +       bool "Support CPU topology definition"
> +       depends on SMP
> +       default y
> +       help
> +         Support CPU topology definition, based on configuration
> +         provided by the firmware.
> +
>  config NR_CPUS
>         int "Maximum number of CPUs (2-32)"
>         range 2 32
> diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
> index d67ff011d361..8a26b690110c 100644
> --- a/arch/arm64/include/asm/cpu.h
> +++ b/arch/arm64/include/asm/cpu.h
> @@ -10,7 +10,6 @@
> 
>  #include <linux/percpu.h>
>  #include <linux/cpu.h>
> -#include <linux/topology.h>
> 
>  struct cpuinfo_arm {
>         struct cpu      cpu;
> diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
> index 5fe138e0b828..bd504739cbfd 100644
> --- a/arch/arm64/include/asm/cputype.h
> +++ b/arch/arm64/include/asm/cputype.h
> @@ -29,6 +29,15 @@
>  #define INVALID_HWID           ULONG_MAX
> 
>  #define MPIDR_HWID_BITMASK     0xff00ffffff
> +#define MPIDR_SMP_BITMASK (0x3 << 30)
> +#define MPIDR_SMP_VALUE (0x2 << 30)
> +#define MPIDR_MT_BITMASK (0x1 << 24)
> +#define MPIDR_LEVEL_BITS 8
> +#define MPIDR_LEVEL_MASK ((1 << MPIDR_LEVEL_BITS) - 1)
> +
> +#define MPIDR_AFFINITY_LEVEL(mpidr, level) \
> +       ((mpidr >> (MPIDR_LEVEL_BITS * level)) & MPIDR_LEVEL_MASK)

This macros does not cover all affinity levels, I have a patch that
implements this macro in my arm64 cpu_{suspend}/{resume} series.

http://lists.infradead.org/pipermail/linux-arm-kernel/2013-November/213031.html

> +
> 
>  #define read_cpuid(reg) ({                                             \
>         u64 __val;                                                      \
> diff --git a/arch/arm64/include/asm/smp_plat.h b/arch/arm64/include/asm/smp_plat.h
> index ed43a0d2b1b2..4ad4ecc93bcf 100644
> --- a/arch/arm64/include/asm/smp_plat.h
> +++ b/arch/arm64/include/asm/smp_plat.h
> @@ -19,6 +19,7 @@
>  #ifndef __ASM_SMP_PLAT_H
>  #define __ASM_SMP_PLAT_H
> 
> +#include <linux/cpumask.h>
>  #include <asm/types.h>
> 
>  /*
> diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
> new file mode 100644
> index 000000000000..611edefaeaf1
> --- /dev/null
> +++ b/arch/arm64/include/asm/topology.h
> @@ -0,0 +1,42 @@
> +#ifndef _ASM_ARM_TOPOLOGY_H
> +#define _ASM_ARM_TOPOLOGY_H
> +
> +#ifdef CONFIG_ARM_CPU_TOPOLOGY
> +
> +#include <linux/cpumask.h>
> +
> +struct cputopo_arm {
> +       int thread_id;
> +       int core_id;
> +       int socket_id;
> +       cpumask_t thread_sibling;
> +       cpumask_t core_sibling;
> +};
> +
> +extern struct cputopo_arm cpu_topology[NR_CPUS];
> +
> +#define topology_physical_package_id(cpu)      (cpu_topology[cpu].socket_id)
> +#define topology_core_id(cpu)          (cpu_topology[cpu].core_id)
> +#define topology_core_cpumask(cpu)     (&cpu_topology[cpu].core_sibling)
> +#define topology_thread_cpumask(cpu)   (&cpu_topology[cpu].thread_sibling)
> +
> +#define mc_capable()   (cpu_topology[0].socket_id != -1)
> +#define smt_capable()  (cpu_topology[0].thread_id != -1)
> +
> +void init_cpu_topology(void);
> +void store_cpu_topology(unsigned int cpuid);

Do not think function above should be exported. Topology can be built in one
go from DT, code needing this function was there because the MPIDR in arm32
was "probed", hence all CPUs (primaries and secondaries) had to call it IIRC.

> +const struct cpumask *cpu_coregroup_mask(int cpu);
> +int cluster_to_logical_mask(unsigned int socket_id, cpumask_t *cluster_mask);
I think the function signature above needs changing when we move to DT
topology bindings, put it differently the look up won't be based on a
socket id anymore, I need some time to think about it.

[...]

> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index a5aeefab03c3..f29c7ffad84a 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -35,7 +35,6 @@
>  #include <linux/clockchips.h>
>  #include <linux/completion.h>
>  #include <linux/of.h>
> -
>  #include <asm/atomic.h>
>  #include <asm/cacheflush.h>
>  #include <asm/cputype.h>
> @@ -48,6 +47,7 @@
>  #include <asm/sections.h>
>  #include <asm/tlbflush.h>
>  #include <asm/ptrace.h>
> +#include <asm/cpu.h>
> 
>  /*
>   * as from 2.5, kernels no longer have an init_tasks structure
> @@ -113,6 +113,16 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
>         return ret;
>  }
> 
> +static void __cpuinit smp_store_cpu_info(unsigned int cpuid)
> +{
> +       struct cpuinfo_arm *cpu_info = &per_cpu(cpu_data, cpuid);
> +
> +       cpu_info->loops_per_jiffy = loops_per_jiffy;
> +       cpu_info->cpuid = read_cpuid_id();
> +
> +       store_cpu_topology(cpuid);

All bits of info are in the DT, topology can be built by primary CPU, no need
to call store_cpu_topology() on each CPU, that was only needed because on
arm32 the topology code relies on each CPU reading its own MPIDR.

> +
>  /*
>   * This is the secondary CPU boot entry.  We're using this CPUs
>   * idle thread stack, but a set of temporary page tables.
> @@ -150,6 +160,8 @@ asmlinkage void secondary_start_kernel(void)
>          */
>         notify_cpu_starting(cpu);
> 
> +       smp_store_cpu_info(cpu);
> +
>         /*
>          * OK, now it's safe to let the boot CPU continue.  Wait for
>          * the CPU migration code to notice that the CPU is online
> @@ -387,6 +399,11 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
>         int err;
>         unsigned int cpu, ncores = num_possible_cpus();
> 
> +       init_cpu_topology();
> +
> +       smp_store_cpu_info(smp_processor_id());
> +
> +
>         /*
>          * are we trying to boot more cores than exist?
>          */
> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> new file mode 100644
> index 000000000000..e0b40f48b448
> --- /dev/null
> +++ b/arch/arm64/kernel/topology.c
> @@ -0,0 +1,143 @@
> +/*
> + * arch/arm64/kernel/topology.c
> + *
> + * Copyright (C) 2011,2013 Linaro Limited.
> + * Written by: Vincent Guittot
> + *
> + * based on arch/sh/kernel/topology.c
> + *
> + * This file is subject to the terms and conditions of the GNU General Public
> + * License.  See the file "COPYING" in the main directory of this archive
> + * for more details.
> + */
> +
> +#include <linux/cpu.h>
> +#include <linux/cpumask.h>
> +#include <linux/export.h>
> +#include <linux/init.h>
> +#include <linux/percpu.h>
> +#include <linux/node.h>
> +#include <linux/nodemask.h>
> +#include <linux/sched.h>
> +#include <linux/slab.h>
> +
> +#include <asm/cputype.h>
> +#include <asm/smp_plat.h>
> +#include <asm/topology.h>
> +
> +/*
> + * cpu topology table
> + */
> +struct cputopo_arm cpu_topology[NR_CPUS];
> +EXPORT_SYMBOL_GPL(cpu_topology);
> +
> +const struct cpumask *cpu_coregroup_mask(int cpu)
> +{
> +       return &cpu_topology[cpu].core_sibling;
> +}
> +
> +static void update_siblings_masks(unsigned int cpuid)
> +{
> +       struct cputopo_arm *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
> +       int cpu;
> +
> +       /* update core and thread sibling masks */
> +       for_each_possible_cpu(cpu) {
> +               cpu_topo = &cpu_topology[cpu];
> +
> +               if (cpuid_topo->socket_id != cpu_topo->socket_id)
> +                       continue;
> +
> +               cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
> +               if (cpu != cpuid)
> +                       cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
> +
> +               if (cpuid_topo->core_id != cpu_topo->core_id)
> +                       continue;
> +
> +               cpumask_set_cpu(cpuid, &cpu_topo->thread_sibling);
> +               if (cpu != cpuid)
> +                       cpumask_set_cpu(cpu, &cpuid_topo->thread_sibling);
> +       }
> +       smp_wmb();
> +}
> +
> +/*
> + * store_cpu_topology is called at boot when only one cpu is running
> + * and with the mutex cpu_hotplug.lock locked, when several cpus have booted,
> + * which prevents simultaneous write access to cpu_topology array
> + */
> +void store_cpu_topology(unsigned int cpuid)
> +{
> +       struct cputopo_arm *cpuid_topo = &cpu_topology[cpuid];
> +       u64 mpidr;
> +
> +       /* If the cpu topology has been already set, just return */
> +       if (cpuid_topo->core_id != -1)
> +               return;
> +
> +       mpidr = cpu_logical_map(cpuid);
> +
> +       /*
> +        * Create cpu topology mapping, assume the cores are largely
> +        * independent since the DT bindings do not include the flags
> +        * for MT.
> +        */
> +       cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> +       cpuid_topo->socket_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);

This is what we are trying to prevent.  The way levels are mapped to
core and cluster id is just a recommendation. Better to follow DT bindings,
they are stricter and provide us with all the required bits of information.

> +
> +       update_siblings_masks(cpuid);
> +
> +       pr_info("CPU%u: cpu %d, socket %d mapped using MPIDR %llx\n",
> +               cpuid, cpu_topology[cpuid].core_id,
> +               cpu_topology[cpuid].socket_id, mpidr);
> +}
> +
> +
> +/*
> + * cluster_to_logical_mask - return cpu logical mask of CPUs in a cluster
> + * @socket_id:         cluster HW identifier
> + * @cluster_mask:      the cpumask location to be initialized, modified by the
> + *                     function only if return value == 0
> + *
> + * Return:
> + *
> + * 0 on success
> + * -EINVAL if cluster_mask is NULL or there is no record matching socket_id
> + */
> +int cluster_to_logical_mask(unsigned int socket_id, cpumask_t *cluster_mask)
> +{
> +       int cpu;
> +
> +       if (!cluster_mask)
> +               return -EINVAL;
> +
> +       for_each_online_cpu(cpu)
> +               if (socket_id == topology_physical_package_id(cpu)) {
> +                       cpumask_copy(cluster_mask, topology_core_cpumask(cpu));
> +                       return 0;
> +               }
> +
> +       return -EINVAL;
> +}

As mentioned, I think this function will have to change. Masks can be
built using phandles to topology nodes. I know this is how cluster masks
are currently built in arm32 kernels, but this does not mean that's the
correct approach, given the laxity of the MPIDR specification.

> +/*
> + * init_cpu_topology is called at boot when only one cpu is running
> + * which prevent simultaneous write access to cpu_topology array
> + */
> +void __init init_cpu_topology(void)
> +{
> +       unsigned int cpu;
> +
> +       /* init core mask and power*/
> +       for_each_possible_cpu(cpu) {
> +               struct cputopo_arm *cpu_topo = &(cpu_topology[cpu]);
> +
> +               cpu_topo->thread_id = -1;
> +               cpu_topo->core_id =  -1;
> +               cpu_topo->socket_id = -1;
> +               cpumask_clear(&cpu_topo->core_sibling);
> +               cpumask_clear(&cpu_topo->thread_sibling);
> +       }

This is probably the place where the topology should be parsed and built
in one go, from DT, I did that and then needed to rewrite the code since
topology bindings changed before getting merged.

Thanks,
Lorenzo

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 4/6] arm64: topology: Implement basic CPU topology support
  2013-12-16 10:57   ` Lorenzo Pieralisi
@ 2013-12-16 11:33     ` Mark Brown
  2013-12-16 12:29     ` Mark Brown
  1 sibling, 0 replies; 47+ messages in thread
From: Mark Brown @ 2013-12-16 11:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 16, 2013 at 10:57:34AM +0000, Lorenzo Pieralisi wrote:

> > +#define MPIDR_AFFINITY_LEVEL(mpidr, level) \
> > +       ((mpidr >> (MPIDR_LEVEL_BITS * level)) & MPIDR_LEVEL_MASK)

> This macros does not cover all affinity levels, I have a patch that
> implements this macro in my arm64 cpu_{suspend}/{resume} series.
> 
> http://lists.infradead.org/pipermail/linux-arm-kernel/2013-November/213031.html

That hunk has been dropped from the latest postings of the series
anyway, please see the more current threads (and I'm going to do another
spin today too).  It'd be good to get things like your helpers
integrated so we don't have duplication of effort or people running
around trying to keep track of out of tree patches.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: Digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20131216/2e74114f/attachment.sig>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 4/6] arm64: topology: Implement basic CPU topology support
  2013-12-16 10:57   ` Lorenzo Pieralisi
  2013-12-16 11:33     ` Mark Brown
@ 2013-12-16 12:29     ` Mark Brown
  2013-12-16 14:46       ` Lorenzo Pieralisi
  1 sibling, 1 reply; 47+ messages in thread
From: Mark Brown @ 2013-12-16 12:29 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 16, 2013 at 10:57:34AM +0000, Lorenzo Pieralisi wrote:
> On Wed, Dec 11, 2013 at 01:13:24PM +0000, Mark Brown wrote:

Replying again since I didn't notice half the content here - please
delete unneeded context from your mails, it makes it much easier to 
find the content you've added.

> > +const struct cpumask *cpu_coregroup_mask(int cpu);
> > +int cluster_to_logical_mask(unsigned int socket_id, cpumask_t *cluster_mask);

> I think the function signature above needs changing when we move to DT
> topology bindings, put it differently the look up won't be based on a
> socket id anymore, I need some time to think about it.

Well, we need to consider the possibility of ACPI or whatever as well.

> > +       cpu_info->loops_per_jiffy = loops_per_jiffy;
> > +       cpu_info->cpuid = read_cpuid_id();

> > +       store_cpu_topology(cpuid);

> All bits of info are in the DT, topology can be built by primary CPU, no need
> to call store_cpu_topology() on each CPU, that was only needed because on
> arm32 the topology code relies on each CPU reading its own MPIDR.

This is also gone in the current versions of the series.

> > +       /*
> > +        * Create cpu topology mapping, assume the cores are largely
> > +        * independent since the DT bindings do not include the flags
> > +        * for MT.
> > +        */
> > +       cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> > +       cpuid_topo->socket_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);

> This is what we are trying to prevent.  The way levels are mapped to
> core and cluster id is just a recommendation. Better to follow DT bindings,
> they are stricter and provide us with all the required bits of information.

Again, this is gone from the current version.  Like I said to Catalin it
does feel like this is making more work for systems that have done the
right thing with their MPIDRs which doesn't seem ideal (and note that
all the DTs that you guys are publishing for your models lack any
topology information at present).

> > +       for_each_online_cpu(cpu)
> > +               if (socket_id == topology_physical_package_id(cpu)) {
> > +                       cpumask_copy(cluster_mask, topology_core_cpumask(cpu));
> > +                       return 0;
> > +               }
> > +
> > +       return -EINVAL;
> > +}

> As mentioned, I think this function will have to change. Masks can be
> built using phandles to topology nodes. I know this is how cluster masks
> are currently built in arm32 kernels, but this does not mean that's the
> correct approach, given the laxity of the MPIDR specification.

So, this can actually be removed completely since there aren't any
references to this function any more that said the socket_id assignment
is still there...

This isn't being done using MPDIR, it's being done based on using the
lowest level of cluster however we found it.  What we're doing is that
while parsing the topology information source we use it to pick a
physical package identifier and then later on we use that identifier as
a handle.  The socket ID isn't really taken from a field in the MPDIR,
it's the result of doing the mapping you mention.

I definitely don't think we should be using phandles directly unless we
want to have a separate implementation for ACPI (or whatever else might
come up) which would mean less encapsulation of the topology code.
Having the parsing code assign socket IDs doesn't seem like a
particularly bad way of doing things, we need IDs for the generic
topology API functions to use anyway.

> > +       for_each_possible_cpu(cpu) {
> > +               struct cputopo_arm *cpu_topo = &(cpu_topology[cpu]);
> > +
> > +               cpu_topo->thread_id = -1;
> > +               cpu_topo->core_id =  -1;
> > +               cpu_topo->socket_id = -1;
> > +               cpumask_clear(&cpu_topo->core_sibling);
> > +               cpumask_clear(&cpu_topo->thread_sibling);
> > +       }

> This is probably the place where the topology should be parsed and built
> in one go, from DT, I did that and then needed to rewrite the code since
> topology bindings changed before getting merged.

Right, and that's where it happens for the DT parsing code.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: Digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20131216/aaec2a2a/attachment-0001.sig>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 4/6] arm64: topology: Implement basic CPU topology support
  2013-12-11 13:13 ` [PATCH 4/6] arm64: topology: Implement basic CPU topology support Mark Brown
  2013-12-11 14:12   ` Will Deacon
  2013-12-16 10:57   ` Lorenzo Pieralisi
@ 2013-12-16 14:45   ` Alex Shi
  2013-12-16 15:22     ` Mark Brown
  2 siblings, 1 reply; 47+ messages in thread
From: Alex Shi @ 2013-12-16 14:45 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/11/2013 09:13 PM, Mark Brown wrote:
> diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
> new file mode 100644
> index 000000000000..611edefaeaf1
> --- /dev/null
> +++ b/arch/arm64/include/asm/topology.h
> @@ -0,0 +1,42 @@
> +#ifndef _ASM_ARM_TOPOLOGY_H
> +#define _ASM_ARM_TOPOLOGY_H
> +
> +#ifdef CONFIG_ARM_CPU_TOPOLOGY
> +
> +#include <linux/cpumask.h>
> +
> +struct cputopo_arm {
> +	int thread_id;
> +	int core_id;
> +	int socket_id;
> +	cpumask_t thread_sibling;
> +	cpumask_t core_sibling;
> +};

Forgive me if I am stupid. :)

why we don't need a cluster_id? and does one cpu socket include few
clusters?

-- 
Thanks
    Alex

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 4/6] arm64: topology: Implement basic CPU topology support
  2013-12-16 12:29     ` Mark Brown
@ 2013-12-16 14:46       ` Lorenzo Pieralisi
  2013-12-16 15:12         ` Mark Brown
  0 siblings, 1 reply; 47+ messages in thread
From: Lorenzo Pieralisi @ 2013-12-16 14:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 16, 2013 at 12:29:48PM +0000, Mark Brown wrote:
> On Mon, Dec 16, 2013 at 10:57:34AM +0000, Lorenzo Pieralisi wrote:
> > On Wed, Dec 11, 2013 at 01:13:24PM +0000, Mark Brown wrote:
> 
> Replying again since I didn't notice half the content here - please
> delete unneeded context from your mails, it makes it much easier to 
> find the content you've added.
> 
> > > +const struct cpumask *cpu_coregroup_mask(int cpu);
> > > +int cluster_to_logical_mask(unsigned int socket_id, cpumask_t *cluster_mask);
> 
> > I think the function signature above needs changing when we move to DT
> > topology bindings, put it differently the look up won't be based on a
> > socket id anymore, I need some time to think about it.
> 
> Well, we need to consider the possibility of ACPI or whatever as well.

That's a fair point, I will have a look at v2.

[...]

> > > +       /*
> > > +        * Create cpu topology mapping, assume the cores are largely
> > > +        * independent since the DT bindings do not include the flags
> > > +        * for MT.
> > > +        */
> > > +       cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> > > +       cpuid_topo->socket_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> 
> > This is what we are trying to prevent.  The way levels are mapped to
> > core and cluster id is just a recommendation. Better to follow DT bindings,
> > they are stricter and provide us with all the required bits of information.
> 
> Again, this is gone from the current version.  Like I said to Catalin it
> does feel like this is making more work for systems that have done the
> right thing with their MPIDRs which doesn't seem ideal (and note that
> all the DTs that you guys are publishing for your models lack any
> topology information at present).

This is an age-old question and the problem has always been that the
"right thing" is recommended, not enforced. I do not want to turn this into
bikeshedding, as long as cpu-map node takes priority if present, fine by me.

> > > +       for_each_online_cpu(cpu)
> > > +               if (socket_id == topology_physical_package_id(cpu)) {
> > > +                       cpumask_copy(cluster_mask, topology_core_cpumask(cpu));
> > > +                       return 0;
> > > +               }
> > > +
> > > +       return -EINVAL;
> > > +}
> 
> > As mentioned, I think this function will have to change. Masks can be
> > built using phandles to topology nodes. I know this is how cluster masks
> > are currently built in arm32 kernels, but this does not mean that's the
> > correct approach, given the laxity of the MPIDR specification.
> 
> So, this can actually be removed completely since there aren't any
> references to this function any more that said the socket_id assignment
> is still there...
> 
> This isn't being done using MPDIR, it's being done based on using the
> lowest level of cluster however we found it.  What we're doing is that
> while parsing the topology information source we use it to pick a
> physical package identifier and then later on we use that identifier as
> a handle.  The socket ID isn't really taken from a field in the MPDIR,
> it's the result of doing the mapping you mention.
> 
> I definitely don't think we should be using phandles directly unless we
> want to have a separate implementation for ACPI (or whatever else might
> come up) which would mean less encapsulation of the topology code.
> Having the parsing code assign socket IDs doesn't seem like a
> particularly bad way of doing things, we need IDs for the generic
> topology API functions to use anyway.

It makes sense, again I will have a look at v2 and comment on that.

Thanks,
Lorenzo

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 4/6] arm64: topology: Implement basic CPU topology support
  2013-12-16 14:46       ` Lorenzo Pieralisi
@ 2013-12-16 15:12         ` Mark Brown
  2013-12-17 11:47           ` Catalin Marinas
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Brown @ 2013-12-16 15:12 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 16, 2013 at 02:46:38PM +0000, Lorenzo Pieralisi wrote:
> On Mon, Dec 16, 2013 at 12:29:48PM +0000, Mark Brown wrote:

> > Well, we need to consider the possibility of ACPI or whatever as well.

> That's a fair point, I will have a look at v2.

Probably best to wait until the v4 or whatever that I'm going to post
shortly (need to do a few more checks locally before I post).  I'll CC
you.

[MPIDR parsing]
> > Again, this is gone from the current version.  Like I said to Catalin it
> > does feel like this is making more work for systems that have done the
> > right thing with their MPIDRs which doesn't seem ideal (and note that
> > all the DTs that you guys are publishing for your models lack any
> > topology information at present).

> This is an age-old question and the problem has always been that the
> "right thing" is recommended, not enforced. I do not want to turn this into
> bikeshedding, as long as cpu-map node takes priority if present, fine by me.

I already dropped that code, though I could resurrect it (perhaps as a
separate patch).  The way the code was written was as you describe as a
last resort - MPIDR would only be considered if the explict topology
binding was not present, it was done as a last step before reporting if
no other topology information was discovered.

Actually now I think about it if we're not going to parse the MPIDR we
should probably update the bindings to say that the topology binding is
mandatory for any v8 system with more than one core.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: Digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20131216/9a8ff959/attachment.sig>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 4/6] arm64: topology: Implement basic CPU topology support
  2013-12-16 14:45   ` Alex Shi
@ 2013-12-16 15:22     ` Mark Brown
  2013-12-17  7:19       ` Alex Shi
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Brown @ 2013-12-16 15:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 16, 2013 at 10:45:18PM +0800, Alex Shi wrote:
> On 12/11/2013 09:13 PM, Mark Brown wrote:

> > +struct cputopo_arm {
> > +	int thread_id;
> > +	int core_id;
> > +	int socket_id;

> Forgive me if I am stupid. :)

> why we don't need a cluster_id? and does one cpu socket include few
> clusters?

The ARMv7 code calls a cluster a socket I think because it's trying to
maintain similarity with other architectures and the ARMv8 code follows
ARMv7 in this regard.  At the minute we're using this ID for whatever
the lowest level of grouping is above a core and presenting the
scheduler with a flat topology there - the topology binding and MPIDR
both allow multiple layers of cluster so you could definitely have
multiple clusters in a supercluster.

Without practical examples of such systems or more architecture
documentation than I've been able to find it's not clear how to
represent them to the scheduler, it will depend on how closely
associated the clusters are and what the scheduler's features are,
perhaps we should describe such a system as NUMA but it's not clear to
me that this would produce the desired results.  I wonder if we may end
up figuring this out from other data such as descriptions of caches and
interconnects rather than purely from the topology information.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: Digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20131216/b1c8dde4/attachment.sig>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 4/6] arm64: topology: Implement basic CPU topology support
  2013-12-16 15:22     ` Mark Brown
@ 2013-12-17  7:19       ` Alex Shi
  2013-12-17 12:02         ` Mark Brown
  0 siblings, 1 reply; 47+ messages in thread
From: Alex Shi @ 2013-12-17  7:19 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/16/2013 11:22 PM, Mark Brown wrote:
> On Mon, Dec 16, 2013 at 10:45:18PM +0800, Alex Shi wrote:
>> On 12/11/2013 09:13 PM, Mark Brown wrote:
> 
>>> +struct cputopo_arm {
>>> +	int thread_id;
>>> +	int core_id;
>>> +	int socket_id;
> 
>> Forgive me if I am stupid. :)
> 
>> why we don't need a cluster_id? and does one cpu socket include few
>> clusters?
> 
> The ARMv7 code calls a cluster a socket I think because it's trying to
> maintain similarity with other architectures and the ARMv8 code follows
> ARMv7 in this regard.  At the minute we're using this ID for whatever
> the lowest level of grouping is above a core and presenting the
> scheduler with a flat topology there - the topology binding and MPIDR
> both allow multiple layers of cluster so you could definitely have
> multiple clusters in a supercluster.
> 
> Without practical examples of such systems or more architecture
> documentation than I've been able to find it's not clear how to
> represent them to the scheduler, it will depend on how closely
> associated the clusters are and what the scheduler's features are,
> perhaps we should describe such a system as NUMA but it's not clear to
> me that this would produce the desired results.  I wonder if we may end
> up figuring this out from other data such as descriptions of caches and
> interconnects rather than purely from the topology information.

For topology meaning, it may be better to have cluster concept there.
And don't know if there will has a real ARM NUMA system for 64bit
server. If so, socket_id is good in a NUMA system.


-- 
Thanks
    Alex

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 4/6] arm64: topology: Implement basic CPU topology support
  2013-12-16 15:12         ` Mark Brown
@ 2013-12-17 11:47           ` Catalin Marinas
  2013-12-17 12:17             ` Mark Brown
  0 siblings, 1 reply; 47+ messages in thread
From: Catalin Marinas @ 2013-12-17 11:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 16, 2013 at 03:12:32PM +0000, Mark Brown wrote:
> On Mon, Dec 16, 2013 at 02:46:38PM +0000, Lorenzo Pieralisi wrote:
> > On Mon, Dec 16, 2013 at 12:29:48PM +0000, Mark Brown wrote:
> [MPIDR parsing]
> > > Again, this is gone from the current version.  Like I said to Catalin it
> > > does feel like this is making more work for systems that have done the
> > > right thing with their MPIDRs which doesn't seem ideal (and note that
> > > all the DTs that you guys are publishing for your models lack any
> > > topology information at present).
> 
> > This is an age-old question and the problem has always been that the
> > "right thing" is recommended, not enforced. I do not want to turn this into
> > bikeshedding, as long as cpu-map node takes priority if present, fine by me.
> 
> I already dropped that code, though I could resurrect it (perhaps as a
> separate patch).  The way the code was written was as you describe as a
> last resort - MPIDR would only be considered if the explict topology
> binding was not present, it was done as a last step before reporting if
> no other topology information was discovered.
> 
> Actually now I think about it if we're not going to parse the MPIDR we
> should probably update the bindings to say that the topology binding is
> mandatory for any v8 system with more than one core.

Do we need such information if only a flat topology is needed?

-- 
Catalin

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 4/6] arm64: topology: Implement basic CPU topology support
  2013-12-17  7:19       ` Alex Shi
@ 2013-12-17 12:02         ` Mark Brown
  0 siblings, 0 replies; 47+ messages in thread
From: Mark Brown @ 2013-12-17 12:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 17, 2013 at 03:19:39PM +0800, Alex Shi wrote:

> For topology meaning, it may be better to have cluster concept there.
> And don't know if there will has a real ARM NUMA system for 64bit
> server. If so, socket_id is good in a NUMA system.

The scheduler does have a separate NUMA node ID that we're not currently
using which I'd have expected to be a better fit for actual NUMA stuff;
looking at the options we've got it's not clear to me that with what the
scheduler has at the minute sockets aren't a good mapping for what the
hardware has.  There doesn't seem to be anything between threaded cores
and physical packages.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: Digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20131217/d9dc0ae4/attachment-0001.sig>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 4/6] arm64: topology: Implement basic CPU topology support
  2013-12-17 11:47           ` Catalin Marinas
@ 2013-12-17 12:17             ` Mark Brown
  0 siblings, 0 replies; 47+ messages in thread
From: Mark Brown @ 2013-12-17 12:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 17, 2013 at 11:47:34AM +0000, Catalin Marinas wrote:
> On Mon, Dec 16, 2013 at 03:12:32PM +0000, Mark Brown wrote:

> > Actually now I think about it if we're not going to parse the MPIDR we
> > should probably update the bindings to say that the topology binding is
> > mandatory for any v8 system with more than one core.

> Do we need such information if only a flat topology is needed?

Probably not but it's simpler to specify that rather than saying it's
mandatory only if you need it and it might be useful to know if
everything is a flat collection of threads or a flat collection of
cores, it seems better to get the information in there if we can in case
the DT is fixed.  Single core systems are the only ones where it's
definitely never going to be helpful.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: Digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20131217/2e597e6b/attachment.sig>

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2013-12-17 12:17 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-11 13:13 [PATCH 1/6] arm64: Add asm/cpu.h Mark Brown
2013-12-11 13:13 ` [PATCH 2/6] arm64: dts: Add a virtio disk to the RTSM motherboard Mark Brown
2013-12-11 13:13 ` [PATCH 3/6] arm64: dts: Add a devicetree for the ARMv8 4xA53 4xA57 FVP Mark Brown
2013-12-11 13:55   ` Mark Rutland
2013-12-11 14:11     ` Mark Brown
2013-12-11 15:04       ` Mark Rutland
2013-12-11 16:00         ` Mark Brown
2013-12-11 16:08         ` Jon Medhurst (Tixy)
2013-12-11 16:41           ` Ryan Harkin
2013-12-11 17:09             ` Mark Rutland
2013-12-11 17:50               ` Mark Brown
2013-12-11 13:13 ` [PATCH 4/6] arm64: topology: Implement basic CPU topology support Mark Brown
2013-12-11 14:12   ` Will Deacon
2013-12-11 14:15     ` Mark Brown
2013-12-11 14:24       ` Catalin Marinas
2013-12-11 14:30         ` Mark Brown
2013-12-11 14:49           ` Catalin Marinas
2013-12-11 16:13             ` Mark Brown
2013-12-12  6:59         ` Hanjun Guo
2013-12-12 10:27           ` Mark Brown
2013-12-12 11:22             ` Hanjun Guo
2013-12-16 10:57   ` Lorenzo Pieralisi
2013-12-16 11:33     ` Mark Brown
2013-12-16 12:29     ` Mark Brown
2013-12-16 14:46       ` Lorenzo Pieralisi
2013-12-16 15:12         ` Mark Brown
2013-12-17 11:47           ` Catalin Marinas
2013-12-17 12:17             ` Mark Brown
2013-12-16 14:45   ` Alex Shi
2013-12-16 15:22     ` Mark Brown
2013-12-17  7:19       ` Alex Shi
2013-12-17 12:02         ` Mark Brown
2013-12-11 13:13 ` [PATCH 5/6] arm64: topology: Tell the scheduler about the relative power of cores Mark Brown
2013-12-11 14:47   ` Catalin Marinas
2013-12-11 17:31     ` Mark Brown
2013-12-11 19:27       ` Nicolas Pitre
2013-12-12 11:56         ` Morten Rasmussen
2013-12-12 12:22           ` Mark Brown
2013-12-12 13:42             ` Morten Rasmussen
2013-12-12 14:26           ` Vincent Guittot
2013-12-11 13:13 ` [PATCH 6/6] arm64: dts: Add CPU topology properties for ARMv8 4xA53 4xA57 FVP Mark Brown
2013-12-11 14:10 ` [PATCH 1/6] arm64: Add asm/cpu.h Catalin Marinas
2013-12-11 14:23   ` Mark Brown
2013-12-11 14:27     ` Catalin Marinas
2013-12-12  6:50       ` Hanjun Guo
2013-12-12 10:36         ` Mark Rutland
2013-12-12 11:20           ` Hanjun Guo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.