* [RFC PATCH v3 0/4] arm64:numa: Add numa support for arm64 platforms.
@ 2014-12-31  7:33 Ganapatrao Kulkarni
       [not found] ` <1420011208-7051-1-git-send-email-ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
  0 siblings, 1 reply; 43+ messages in thread
From: Ganapatrao Kulkarni @ 2014-12-31  7:33 UTC (permalink / raw)
  To: inux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Will.Deacon-5wv7dgnIgG8, catalin.marinas-5wv7dgnIgG8,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	jchandra-dY08KVG/lbpWk0Htik3J/w, al.stone-QSEj5FYQhm4dnm+yROfE0A,
	arnd-r2nGTMty4D4
  Cc: gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w

This is v3 of the patch set adding NUMA support for arm64-based platforms.
These patches were tested on Cavium's multi-node (2-node topology) platform.

This patch set defines and implements DT bindings for the NUMA mapping of
cores and memory, using the arm,associativity device node property.

v2:
Defined and implemented the NUMA mapping of memory and cores to nodes, and
the proximity distance matrix between nodes.

v1:
Initial patch set to support NUMA on arm64 platforms.

Ganapatrao Kulkarni (4):
  arm64: defconfig: increase NR_CPUS range to 2-4096.
  Documentation: arm64/arm: dt bindings for numa.
  arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node
    topology.
  arm64:numa: adding numa support for arm64 platforms.

 Documentation/devicetree/bindings/arm/numa.txt | 198 +++++++
 arch/arm64/Kconfig                             |  36 +-
 arch/arm64/boot/dts/thunder-88xx-2n.dts        |  78 +++
 arch/arm64/boot/dts/thunder-88xx-2n.dtsi       | 789 +++++++++++++++++++++++++
 arch/arm64/include/asm/mmzone.h                |  32 +
 arch/arm64/include/asm/numa.h                  |  45 ++
 arch/arm64/kernel/Makefile                     |   1 +
 arch/arm64/kernel/dt_numa.c                    | 296 ++++++++++
 arch/arm64/kernel/setup.c                      |   8 +
 arch/arm64/kernel/smp.c                        |   2 +
 arch/arm64/mm/Makefile                         |   1 +
 arch/arm64/mm/init.c                           |  34 +-
 arch/arm64/mm/numa.c                           | 520 ++++++++++++++++
 13 files changed, 2032 insertions(+), 8 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
 create mode 100644 arch/arm64/boot/dts/thunder-88xx-2n.dts
 create mode 100644 arch/arm64/boot/dts/thunder-88xx-2n.dtsi
 create mode 100644 arch/arm64/include/asm/mmzone.h
 create mode 100644 arch/arm64/include/asm/numa.h
 create mode 100644 arch/arm64/kernel/dt_numa.c
 create mode 100644 arch/arm64/mm/numa.c

-- 
1.8.1.4

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* [RFC PATCH v3 1/4] arm64: defconfig: increase NR_CPUS range to 2-4096.
       [not found] ` <1420011208-7051-1-git-send-email-ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
@ 2014-12-31  7:33   ` Ganapatrao Kulkarni
       [not found]     ` <1420011208-7051-2-git-send-email-ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
  2014-12-31  7:33   ` [RFC PATCH v3 2/4] Documentation: arm64/arm: dt bindings for numa Ganapatrao Kulkarni
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 43+ messages in thread
From: Ganapatrao Kulkarni @ 2014-12-31  7:33 UTC (permalink / raw)

Raise the maximum NR_CPUS limit to 4096 to accommodate upcoming
platforms with higher core counts.

Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
---
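Note on cost: the kernel stores a CPU mask as a bitmap with one bit per
possible CPU, so the size of each mask grows with NR_CPUS. The user-space
sketch below only illustrates that arithmetic; cpumask_bytes() is a
hypothetical helper written for this note, not a kernel function.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel API): bytes needed for a bitmap
 * holding one bit per possible CPU, rounded up to whole unsigned longs.
 */
static size_t cpumask_bytes(unsigned int nr_cpus)
{
	size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;
	size_t longs = (nr_cpus + bits_per_long - 1) / bits_per_long;

	return longs * sizeof(unsigned long);
}
```

Under this arithmetic a mask takes 8 bytes at NR_CPUS=64 and 512 bytes at
NR_CPUS=4096. The default stays at 64, so only configurations that raise
the value see the larger masks.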
 arch/arm64/Kconfig | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 61fbd1f..242419d 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -302,8 +302,8 @@ config SCHED_SMT
 	  places. If unsure say N here.
 
 config NR_CPUS
-	int "Maximum number of CPUs (2-64)"
-	range 2 64
+	int "Maximum number of CPUs (2-4096)"
+	range 2 4096
 	depends on SMP
 	# These have to remain sorted largest to smallest
 	default "64"
-- 
1.8.1.4

* [RFC PATCH v3 2/4] Documentation: arm64/arm: dt bindings for numa.
       [not found] ` <1420011208-7051-1-git-send-email-ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
  2014-12-31  7:33   ` [RFC PATCH v3 1/4] arm64: defconfig: increase NR_CPUS range to 2-4096 Ganapatrao Kulkarni
@ 2014-12-31  7:33   ` Ganapatrao Kulkarni
       [not found]     ` <1420011208-7051-3-git-send-email-ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
  2014-12-31  7:33   ` [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology Ganapatrao Kulkarni
  2014-12-31  7:33   ` [RFC PATCH v3 4/4] arm64:numa: adding numa support for arm64 platforms Ganapatrao Kulkarni
  3 siblings, 1 reply; 43+ messages in thread
From: Ganapatrao Kulkarni @ 2014-12-31  7:33 UTC (permalink / raw)

Add DT bindings for the NUMA mapping of memory, cores and I/O devices
using the arm,associativity device node property.

Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@caviumnetworks.com>
---
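Illustration only: one way a consumer of this binding might turn two
arm,associativity arrays into a node distance. The doubling rule,
LOCAL_DISTANCE value and the helper name assoc_distance() are assumptions
made for this sketch; they are not taken from the implementation in
patch 4/4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOCAL_DISTANCE 10	/* assumed "same node" distance */

/*
 * Hypothetical helper: walk the indexes listed in
 * arm,associativity-reference-points and double the distance for
 * every listed level at which the two associativity arrays differ.
 */
static int assoc_distance(const uint32_t *a, const uint32_t *b,
			  size_t assoc_len,
			  const uint32_t *ref_points, size_t nr_refs)
{
	int distance = LOCAL_DISTANCE;
	size_t i;

	for (i = 0; i < nr_refs; i++) {
		uint32_t idx = ref_points[i];

		/* A reference point must index into the associativity array. */
		assert(idx < assoc_len);
		if (a[idx] != b[idx])
			distance *= 2;
	}
	return distance;
}
```

With reference points <0 1>, cpu@000 (<0 0 0x000>) and cpu@008
(<1 0 0x008>) differ only in the board id, so this sketch yields 20.
cpu@008 and the board 1 memory node (<1 0 0xffff>) match at both levels
and yield 10.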
 Documentation/devicetree/bindings/arm/numa.txt | 198 +++++++++++++++++++++++++
 1 file changed, 198 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/numa.txt

diff --git a/Documentation/devicetree/bindings/arm/numa.txt b/Documentation/devicetree/bindings/arm/numa.txt
new file mode 100644
index 0000000..4f51e25
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/numa.txt
@@ -0,0 +1,198 @@
+==============================================================================
+NUMA binding description.
+==============================================================================
+
+==============================================================================
+1 - Introduction
+==============================================================================
+
+Systems employing a Non-Uniform Memory Access (NUMA) architecture contain
+collections of hardware resources, including processors, memory, and I/O buses,
+that comprise what is commonly known as a "NUMA node".
+Processor accesses to memory within the local NUMA node are generally faster
+than processor accesses to memory outside of the local NUMA node.
+DT defines interfaces that allow the platform to convey NUMA node
+topology information to the OS.
+
+==============================================================================
+2 - arm,associativity
+==============================================================================
+
+The mapping is done using the arm,associativity device property.
+This property must be present in every device node that needs to be
+mapped to a NUMA node.
+
+The arm,associativity property is a set of 32-bit integers representing the
+board id, socket id and core id.
+
+Ex:
+	/* board 0, socket 0, core 0 */
+	arm,associativity = <0 0 0x000>;
+
+	/* board 1, socket 0, core 8 */
+	arm,associativity = <1 0 0x08>;
+
+==============================================================================
+3 - arm,associativity-reference-points
+==============================================================================
+This property is a set of 32-bit integers, each representing an index into
+the arm,associativity property. The first integer is the most significant
+NUMA boundary and the following are progressively less significant boundaries.
+There can be more than one level of NUMA.
+
+Ex:
+	arm,associativity-reference-points = <0 1>;
+	The board id (index 0) is used first to calculate the associativity (node
+	distance), followed by the socket id (index 1).
+
+	arm,associativity-reference-points = <1 0>;
+	The socket id (index 1) is used first to calculate the associativity,
+	followed by the board id (index 0).
+
+	arm,associativity-reference-points = <0>;
+	Only the board id (index 0) is used to calculate the associativity.
+
+	arm,associativity-reference-points = <1>;
+	Only the socket id (index 1) is used to calculate the associativity.
+
+==============================================================================
+4 - Example dts
+==============================================================================
+
+Example: a 2-node system consisting of 2 boards, each board having one socket
+with 8 cores.
+
+	arm,associativity-reference-points = <0 1>;
+
+	memory@00c00000 {
+		device_type = "memory";
+		reg = <0x0 0x00c00000 0x0 0x80000000>;
+		/* board 0, socket 0, no specific core */
+		arm,associativity = <0 0 0xffff>;
+	};
+
+	memory@10000000000 {
+		device_type = "memory";
+		reg = <0x100 0x00000000 0x0 0x80000000>;
+		/* board 1, socket 0, no specific core */
+		arm,associativity = <1 0 0xffff>;
+	};
+
+	cpus {
+		#address-cells = <2>;
+		#size-cells = <0>;
+
+		cpu@000 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x000>;
+			enable-method = "psci";
+			/* board 0, socket 0, core 0*/
+			arm,associativity = <0 0 0x000>;
+		};
+		cpu@001 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x001>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x001>;
+		};
+		cpu@002 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x002>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x002>;
+		};
+		cpu@003 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x003>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x003>;
+		};
+		cpu@004 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x004>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x004>;
+		};
+		cpu@005 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x005>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x005>;
+		};
+		cpu@006 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x006>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x006>;
+		};
+		cpu@007 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x007>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x007>;
+		};
+		cpu@008 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x008>;
+			enable-method = "psci";
+			/* board 1, socket 0, core 0*/
+			arm,associativity = <1 0 0x008>;
+		};
+		cpu@009 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x009>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x009>;
+		};
+		cpu@00a {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x00a>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x00a>;
+		};
+		cpu@00b {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x00b>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x00b>;
+		};
+		cpu@00c {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x00c>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x00c>;
+		};
+		cpu@00d {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x00d>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x00d>;
+		};
+		cpu@00e {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x00e>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x00e>;
+		};
+		cpu@00f {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x00f>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x00f>;
+		};
-- 
1.8.1.4

* [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
       [not found] ` <1420011208-7051-1-git-send-email-ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
  2014-12-31  7:33   ` [RFC PATCH v3 1/4] arm64: defconfig: increase NR_CPUS range to 2-4096 Ganapatrao Kulkarni
  2014-12-31  7:33   ` [RFC PATCH v3 2/4] Documentation: arm64/arm: dt bindings for numa Ganapatrao Kulkarni
@ 2014-12-31  7:33   ` Ganapatrao Kulkarni
       [not found]     ` <1420011208-7051-4-git-send-email-ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
  2014-12-31  7:33   ` [RFC PATCH v3 4/4] arm64:numa: adding numa support for arm64 platforms Ganapatrao Kulkarni
  3 siblings, 1 reply; 43+ messages in thread
From: Ganapatrao Kulkarni @ 2014-12-31  7:33 UTC (permalink / raw)

Add a DT file for Cavium's Thunder SoC in a 2-node topology,
using the arm,associativity device node property.

Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
---
 arch/arm64/boot/dts/thunder-88xx-2n.dts  |  78 +++
 arch/arm64/boot/dts/thunder-88xx-2n.dtsi | 789 +++++++++++++++++++++++++++++++
 2 files changed, 867 insertions(+)
 create mode 100644 arch/arm64/boot/dts/thunder-88xx-2n.dts
 create mode 100644 arch/arm64/boot/dts/thunder-88xx-2n.dtsi

diff --git a/arch/arm64/boot/dts/thunder-88xx-2n.dts b/arch/arm64/boot/dts/thunder-88xx-2n.dts
new file mode 100644
index 0000000..5dc89d5e
--- /dev/null
+++ b/arch/arm64/boot/dts/thunder-88xx-2n.dts
@@ -0,0 +1,78 @@
+/*
+ * Cavium Thunder DTS file - Thunder board description
+ *
+ * Copyright (C) 2014, Cavium Inc.
+ *
+ * This file is dual-licensed: you can use it either under the terms
+ * of the GPL or the X11 license, at your option. Note that this dual
+ * licensing only applies to this file, and not this project as a
+ * whole.
+ *
+ *  a) This library is free software; you can redistribute it and/or
+ *     modify it under the terms of the GNU General Public License as
+ *     published by the Free Software Foundation; either version 2 of the
+ *     License, or (at your option) any later version.
+ *
+ *     This library is distributed in the hope that it will be useful,
+ *     but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *     GNU General Public License for more details.
+ *
+ *     You should have received a copy of the GNU General Public
+ *     License along with this library; if not, write to the Free
+ *     Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston,
+ *     MA 02110-1301 USA
+ *
+ * Or, alternatively,
+ *
+ *  b) Permission is hereby granted, free of charge, to any person
+ *     obtaining a copy of this software and associated documentation
+ *     files (the "Software"), to deal in the Software without
+ *     restriction, including without limitation the rights to use,
+ *     copy, modify, merge, publish, distribute, sublicense, and/or
+ *     sell copies of the Software, and to permit persons to whom the
+ *     Software is furnished to do so, subject to the following
+ *     conditions:
+ *
+ *     The above copyright notice and this permission notice shall be
+ *     included in all copies or substantial portions of the Software.
+ *
+ *     THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ *     EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
+ *     OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ *     NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+ *     HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+ *     WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ *     FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ *     OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/dts-v1/;
+
+/include/ "thunder-88xx-2n.dtsi"
+
+/ {
+	model = "Cavium ThunderX CN88XX board";
+	compatible = "cavium,thunder-88xx";
+	arm,associativity-reference-points = <0 1>;
+
+	aliases {
+		serial0 = &uaa0;
+		serial1 = &uaa1;
+	};
+
+	memory@00000000 {
+		device_type = "memory";
+		reg = <0x0 0x00000000 0x0 0x80000000>;
+		/* board 0, socket 0, no specific core */
+		arm,associativity = <0 0 0xffff>;
+	};
+
+	memory@10000000000 {
+		device_type = "memory";
+		reg = <0x100 0x00000000 0x0 0x80000000>;
+		/* board 1, socket 0, no specific core */
+		arm,associativity = <1 0 0xffff>;
+	};
+
+};
diff --git a/arch/arm64/boot/dts/thunder-88xx-2n.dtsi b/arch/arm64/boot/dts/thunder-88xx-2n.dtsi
new file mode 100644
index 0000000..f7f561a
--- /dev/null
+++ b/arch/arm64/boot/dts/thunder-88xx-2n.dtsi
@@ -0,0 +1,789 @@
+/*
+ * Cavium Thunder DTS file - Thunder SoC description
+ *
+ * Copyright (C) 2014, Cavium Inc.
+ *
+ * This file is dual-licensed: you can use it either under the terms
+ * of the GPL or the X11 license, at your option. Note that this dual
+ * licensing only applies to this file, and not this project as a
+ * whole.
+ *
+ *  a) This library is free software; you can redistribute it and/or
+ *     modify it under the terms of the GNU General Public License as
+ *     published by the Free Software Foundation; either version 2 of the
+ *     License, or (at your option) any later version.
+ *
+ *     This library is distributed in the hope that it will be useful,
+ *     but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *     GNU General Public License for more details.
+ *
+ *     You should have received a copy of the GNU General Public
+ *     License along with this library; if not, write to the Free
+ *     Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston,
+ *     MA 02110-1301 USA
+ *
+ * Or, alternatively,
+ *
+ *  b) Permission is hereby granted, free of charge, to any person
+ *     obtaining a copy of this software and associated documentation
+ *     files (the "Software"), to deal in the Software without
+ *     restriction, including without limitation the rights to use,
+ *     copy, modify, merge, publish, distribute, sublicense, and/or
+ *     sell copies of the Software, and to permit persons to whom the
+ *     Software is furnished to do so, subject to the following
+ *     conditions:
+ *
+ *     The above copyright notice and this permission notice shall be
+ *     included in all copies or substantial portions of the Software.
+ *
+ *     THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ *     EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
+ *     OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ *     NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+ *     HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+ *     WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ *     FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ *     OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/ {
+	compatible = "cavium,thunder-88xx";
+	interrupt-parent = <&gic0>;
+	#address-cells = <2>;
+	#size-cells = <2>;
+
+	psci {
+		compatible = "arm,psci-0.2";
+		method = "smc";
+	};
+
+	cpus {
+		#address-cells = <2>;
+		#size-cells = <0>;
+
+		cpu@000 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x000>;
+			enable-method = "psci";
+			/* board 0, socket 0, core 0*/
+			arm,associativity = <0 0 0x000>;
+		};
+		cpu@001 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x001>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x001>;
+		};
+		cpu@002 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x002>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x002>;
+		};
+		cpu@003 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x003>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x003>;
+		};
+		cpu@004 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x004>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x004>;
+		};
+		cpu@005 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x005>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x005>;
+		};
+		cpu@006 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x006>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x006>;
+		};
+		cpu@007 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x007>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x007>;
+		};
+		cpu@008 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x008>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x008>;
+		};
+		cpu@009 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x009>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x009>;
+		};
+		cpu@00a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00a>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x00a>;
+		};
+		cpu@00b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00b>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x00b>;
+		};
+		cpu@00c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00c>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x00c>;
+		};
+		cpu@00d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00d>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x00d>;
+		};
+		cpu@00e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00e>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x00e>;
+		};
+		cpu@00f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00f>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x00f>;
+		};
+		cpu@100 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x100>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x100>;
+		};
+		cpu@101 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x101>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x101>;
+		};
+		cpu@102 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x102>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x102>;
+		};
+		cpu@103 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x103>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x103>;
+		};
+		cpu@104 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x104>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x104>;
+		};
+		cpu@105 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x105>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x105>;
+		};
+		cpu@106 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x106>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x106>;
+		};
+		cpu@107 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x107>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x107>;
+		};
+		cpu@108 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x108>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x108>;
+		};
+		cpu@109 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x109>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x109>;
+		};
+		cpu@10a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10a>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x10a>;
+		};
+		cpu@10b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10b>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x10b>;
+		};
+		cpu@10c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10c>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x10c>;
+		};
+		cpu@10d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10d>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x10d>;
+		};
+		cpu@10e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10e>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x10e>;
+		};
+		cpu@10f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10f>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x10f>;
+		};
+		cpu@200 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x200>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x200>;
+		};
+		cpu@201 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x201>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x201>;
+		};
+		cpu@202 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x202>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x202>;
+		};
+		cpu@203 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x203>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x203>;
+		};
+		cpu@204 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x204>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x204>;
+		};
+		cpu@205 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x205>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x205>;
+		};
+		cpu@206 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x206>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x206>;
+		};
+		cpu@207 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x207>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x207>;
+		};
+		cpu@208 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x208>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x208>;
+		};
+		cpu@209 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x209>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x209>;
+		};
+		cpu@20a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20a>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x20a>;
+		};
+		cpu@20b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20b>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x20b>;
+		};
+		cpu@20c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20c>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x20c>;
+		};
+		cpu@20d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20d>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x20d>;
+		};
+		cpu@20e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20e>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x20e>;
+		};
+		cpu@20f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20f>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x20f>;
+		};
+		cpu@10000 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10000>;
+			enable-method = "psci";
+			/* board 1, socket 0, core 0*/
+			arm,associativity = <1 0 0x10000>;
+		};
+		cpu@10001 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10001>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10001>;
+		};
+		cpu@10002 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10002>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10002>;
+		};
+		cpu@10003 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10003>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10003>;
+		};
+		cpu@10004 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10004>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10004>;
+		};
+		cpu@10005 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10005>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10005>;
+		};
+		cpu@10006 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10006>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10006>;
+		};
+		cpu@10007 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10007>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10007>;
+		};
+		cpu@10008 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10008>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10008>;
+		};
+		cpu@10009 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10009>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10009>;
+		};
+		cpu@1000a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000a>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1000a>;
+		};
+		cpu@1000b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000b>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1000b>;
+		};
+		cpu@1000c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000c>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1000c>;
+		};
+		cpu@1000d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000d>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1000d>;
+		};
+		cpu@1000e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000e>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1000e>;
+		};
+		cpu@1000f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000f>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1000f>;
+		};
+		cpu@10100 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10100>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10100>;
+		};
+		cpu@10101 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10101>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10101>;
+		};
+		cpu@10102 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10102>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10102>;
+		};
+		cpu@10103 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10103>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10103>;
+		};
+		cpu@10104 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10104>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10104>;
+		};
+		cpu@10105 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10105>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10105>;
+		};
+		cpu@10106 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10106>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10106>;
+		};
+		cpu@10107 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10107>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10107>;
+		};
+		cpu@10108 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10108>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10108>;
+		};
+		cpu@10109 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10109>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10109>;
+		};
+		cpu@1010a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010a>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1010a>;
+		};
+		cpu@1010b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010b>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1010b>;
+		};
+		cpu@1010c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010c>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1010c>;
+		};
+		cpu@1010d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010d>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1010d>;
+		};
+		cpu@1010e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010e>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1010e>;
+		};
+		cpu@1010f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010f>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1010f>;
+		};
+		cpu@10200 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10200>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10200>;
+		};
+		cpu@10201 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10201>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10201>;
+		};
+		cpu@10202 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10202>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10202>;
+		};
+		cpu@10203 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10203>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10203>;
+		};
+		cpu@10204 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10204>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10204>;
+		};
+		cpu@10205 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10205>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10205>;
+		};
+		cpu@10206 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10206>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10206>;
+		};
+		cpu@10207 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10207>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10207>;
+		};
+		cpu@10208 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10208>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10208>;
+		};
+		cpu@10209 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10209>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10209>;
+		};
+		cpu@1020a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020a>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1020a>;
+		};
+		cpu@1020b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020b>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1020b>;
+		};
+		cpu@1020c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020c>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1020c>;
+		};
+		cpu@1020d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020d>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1020d>;
+		};
+		cpu@1020e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020e>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1020e>;
+		};
+		cpu@1020f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020f>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1020f>;
+		};
+	};
+
+	timer {
+		compatible = "arm,armv8-timer";
+		interrupts = <1 13 0xff01>,
+		             <1 14 0xff01>,
+		             <1 11 0xff01>,
+		             <1 10 0xff01>;
+	};
+
+	soc {
+		compatible = "simple-bus";
+		#address-cells = <2>;
+		#size-cells = <2>;
+		ranges;
+
+		refclk50mhz: refclk50mhz {
+			compatible = "fixed-clock";
+			#clock-cells = <0>;
+			clock-frequency = <50000000>;
+			clock-output-names = "refclk50mhz";
+		};
+
+		gic0: interrupt-controller@8010,00000000 {
+			compatible = "arm,gic-v3";
+			#interrupt-cells = <3>;
+			#redistributor-regions = <2>;
+			interrupt-controller;
+			reg = <0x8010 0x00000000 0x0 0x010000>, /* GICD */
+			      <0x8010 0x80000000 0x0 0x600000>, /* GICR Node 0 */
+			      <0x9010 0x80000000 0x0 0x600000>; /* GICR Node 1 */
+			interrupts = <1 9 0xf04>;
+		};
+
+		uaa0: serial@87e0,24000000 {
+			compatible = "arm,pl011", "arm,primecell";
+			reg = <0x87e0 0x24000000 0x0 0x1000>;
+			interrupts = <1 21 4>;
+			clocks = <&refclk50mhz>;
+			clock-names = "apb_pclk";
+		};
+
+		uaa1: serial@87e0,25000000 {
+			compatible = "arm,pl011", "arm,primecell";
+			reg = <0x87e0 0x25000000 0x0 0x1000>;
+			interrupts = <1 22 4>;
+			clocks = <&refclk50mhz>;
+			clock-names = "apb_pclk";
+		};
+	};
+};
-- 
1.8.1.4

* [RFC PATCH v3 4/4] arm64:numa: adding numa support for arm64 platforms.
       [not found] ` <1420011208-7051-1-git-send-email-ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
                     ` (2 preceding siblings ...)
  2014-12-31  7:33   ` [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology Ganapatrao Kulkarni
@ 2014-12-31  7:33   ` Ganapatrao Kulkarni
       [not found]     ` <1420011208-7051-5-git-send-email-ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
  3 siblings, 1 reply; 43+ messages in thread
From: Ganapatrao Kulkarni @ 2014-12-31  7:33 UTC (permalink / raw)
  To: inux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Will.Deacon-5wv7dgnIgG8, catalin.marinas-5wv7dgnIgG8,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	jchandra-dY08KVG/lbpWk0Htik3J/w, al.stone-QSEj5FYQhm4dnm+yROfE0A,
	arnd-r2nGTMty4D4
  Cc: gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w

Add NUMA support for arm64-based platforms.
Add DT node parsing for the NUMA topology using the arm,associativity property.

Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
---
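For reviewers, the distance rule implemented by dt_get_node_distance() in
this patch can be tried in user space. The following is only a minimal
sketch, not the kernel code: struct node_assoc and sketch_node_distance()
are hypothetical names, LOCAL_DISTANCE is redefined locally with the
kernel's value of 10, and the per-node reference-point values used in the
note below are made up for illustration.

```c
#include <assert.h>

#define LOCAL_DISTANCE 10		/* same value as the kernel's LOCAL_DISTANCE */
#define MAX_DISTANCE_REF_POINTS 4

/*
 * Hypothetical stand-in for one row of distance_lookup_table[][]:
 * the associativity value of a node at each reference point,
 * most significant NUMA boundary first.
 */
struct node_assoc {
	int ref[MAX_DISTANCE_REF_POINTS];
};

/*
 * Start at LOCAL_DISTANCE and double it for every reference point at
 * which the two nodes differ, stopping at the first level they share.
 */
static int sketch_node_distance(const struct node_assoc *a,
				const struct node_assoc *b, int depth)
{
	int distance = LOCAL_DISTANCE;
	int i;

	assert(depth >= 0 && depth <= MAX_DISTANCE_REF_POINTS);

	for (i = 0; i < depth; i++) {
		if (a->ref[i] == b->ref[i])
			break;
		distance *= 2;
	}
	return distance;
}
```

With two reference points and assumed rows {0, 0} for node 0 and {1, 0} for
node 1, a node compared with itself yields 10, and node 0 compared with
node 1 yields 20: the rows differ at the first reference point and match at
the second.
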
 arch/arm64/Kconfig              |  32 +++
 arch/arm64/include/asm/mmzone.h |  32 +++
 arch/arm64/include/asm/numa.h   |  45 ++++
 arch/arm64/kernel/Makefile      |   1 +
 arch/arm64/kernel/dt_numa.c     | 296 +++++++++++++++++++++++
 arch/arm64/kernel/setup.c       |   8 +
 arch/arm64/kernel/smp.c         |   2 +
 arch/arm64/mm/Makefile          |   1 +
 arch/arm64/mm/init.c            |  34 ++-
 arch/arm64/mm/numa.c            | 520 ++++++++++++++++++++++++++++++++++++++++
 10 files changed, 965 insertions(+), 6 deletions(-)
 create mode 100644 arch/arm64/include/asm/mmzone.h
 create mode 100644 arch/arm64/include/asm/numa.h
 create mode 100644 arch/arm64/kernel/dt_numa.c
 create mode 100644 arch/arm64/mm/numa.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 242419d..6d262b1 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -59,6 +59,7 @@ config ARM64
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_RCU_TABLE_FREE
 	select HAVE_SYSCALL_TRACEPOINTS
+	select HAVE_MEMBLOCK_NODE_MAP if NUMA
 	select IRQ_DOMAIN
 	select MODULES_USE_ELF_RELA
 	select NO_BOOTMEM
@@ -315,6 +316,37 @@ config HOTPLUG_CPU
 	  Say Y here to experiment with turning CPUs off and on.  CPUs
 	  can be controlled through /sys/devices/system/cpu.
 
+# Common NUMA Features
+config NUMA
+	bool "Numa Memory Allocation and Scheduler Support"
+	depends on SMP
+	---help---
+	  Enable NUMA (Non Uniform Memory Access) support.
+
+	  The kernel will try to allocate memory used by a CPU on the
+	  local memory controller of the CPU and add some more
+	  NUMA awareness to the kernel.
+
+config ARM64_DT_NUMA
+	def_bool n
+	prompt "DT NUMA detection"
+	---help---
+	  Enable DT based numa.
+
+config NODES_SHIFT
+	int "Maximum NUMA Nodes (as a power of 2)"
+	range 1 10
+	default "2"
+	depends on NEED_MULTIPLE_NODES
+	---help---
+	  Specify the maximum number of NUMA Nodes available on the target
+	  system.  Increases memory reserved to accommodate various tables.
+
+config USE_PERCPU_NUMA_NODE_ID
+	def_bool y
+	depends on NUMA
+
+
 source kernel/Kconfig.preempt
 
 config HZ
diff --git a/arch/arm64/include/asm/mmzone.h b/arch/arm64/include/asm/mmzone.h
new file mode 100644
index 0000000..d27ee66
--- /dev/null
+++ b/arch/arm64/include/asm/mmzone.h
@@ -0,0 +1,32 @@
+#ifndef __ASM_ARM64_MMZONE_H_
+#define __ASM_ARM64_MMZONE_H_
+
+#ifdef CONFIG_NUMA
+
+#include <linux/mmdebug.h>
+#include <asm/smp.h>
+#include <linux/types.h>
+#include <asm/numa.h>
+
+extern struct pglist_data *node_data[];
+
+#define NODE_DATA(nid)		(node_data[nid])
+
+
+struct numa_memblk {
+	u64			start;
+	u64			end;
+	int			nid;
+};
+
+struct numa_meminfo {
+	int			nr_blks;
+	struct numa_memblk	blk[NR_NODE_MEMBLKS];
+};
+
+void __init numa_remove_memblk_from(int idx, struct numa_meminfo *mi);
+int __init numa_cleanup_meminfo(struct numa_meminfo *mi);
+void __init numa_reset_distance(void);
+
+#endif /* CONFIG_NUMA */
+#endif /* __ASM_ARM64_MMZONE_H_ */
diff --git a/arch/arm64/include/asm/numa.h b/arch/arm64/include/asm/numa.h
new file mode 100644
index 0000000..7c9bebc
--- /dev/null
+++ b/arch/arm64/include/asm/numa.h
@@ -0,0 +1,45 @@
+#ifndef _ASM_ARM64_NUMA_H
+#define _ASM_ARM64_NUMA_H
+
+#include <linux/nodemask.h>
+#include <asm/topology.h>
+
+#ifdef CONFIG_NUMA
+
+#define NR_NODE_MEMBLKS		(MAX_NUMNODES * 2)
+#define ZONE_ALIGN (1UL << (MAX_ORDER + PAGE_SHIFT))
+
+/* currently, arm64 implements flat NUMA topology */
+#define parent_node(node)	(node)
+
+/* dummy definitions for pci functions */
+#define pcibus_to_node(node)	0
+#define cpumask_of_pcibus(bus)	0
+
+struct __node_cpu_hwid {
+	u32 node_id;    /* logical node containing this CPU */
+	u64 cpu_hwid;   /* MPIDR for this CPU */
+};
+
+const struct cpumask *cpumask_of_node(int node);
+/* Mappings between node number and cpus on that node. */
+extern cpumask_var_t node_to_cpumask_map[MAX_NUMNODES];
+
+void __init arm64_numa_init(void);
+int __init numa_add_memblk(u32 nid, u64 base, u64 size);
+void numa_store_cpu_info(int cpu);
+void numa_set_node(int cpu, int node);
+void numa_clear_node(int cpu);
+void numa_add_cpu(int cpu);
+void numa_remove_cpu(int cpu);
+void __init numa_set_distance(int from, int to, int distance);
+#ifdef CONFIG_ARM64_DT_NUMA
+int dt_get_cpu_node_id(int cpu);
+int __init arm64_dt_numa_init(void);
+#endif
+#else	/* CONFIG_NUMA */
+static inline void arm64_numa_init(void);
+static inline void numa_store_cpu_info(int cpu)		{ }
+static inline void arm64_numa_init(void)		{ }
+#endif	/* CONFIG_NUMA */
+#endif	/* _ASM_ARM64_NUMA_H */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 5bd029b..39451df 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -31,6 +31,7 @@ arm64-obj-$(CONFIG_JUMP_LABEL)		+= jump_label.o
 arm64-obj-$(CONFIG_KGDB)		+= kgdb.o
 arm64-obj-$(CONFIG_EFI)			+= efi.o efi-stub.o efi-entry.o
 arm64-obj-$(CONFIG_PCI)			+= pci.o
+arm64-obj-$(CONFIG_ARM64_DT_NUMA)	+= dt_numa.o
 
 obj-y					+= $(arm64-obj-y) vdso/
 obj-m					+= $(arm64-obj-m)
diff --git a/arch/arm64/kernel/dt_numa.c b/arch/arm64/kernel/dt_numa.c
new file mode 100644
index 0000000..1b64b25
--- /dev/null
+++ b/arch/arm64/kernel/dt_numa.c
@@ -0,0 +1,296 @@
+/*
+ * DT NUMA Parsing support, based on the powerpc implementation.
+ *
+ * Copyright (C) 2014 Cavium Inc.
+ * Author: Ganapatrao Kulkarni <gkulkarni-YGCgFSpz5w/QT0dZR+AlfA@public.gmane.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/string.h>
+#include <linux/init.h>
+#include <linux/bootmem.h>
+#include <linux/memblock.h>
+#include <linux/ctype.h>
+#include <linux/module.h>
+#include <linux/nodemask.h>
+#include <linux/sched.h>
+#include <linux/of.h>
+#include <linux/of_fdt.h>
+#include <asm/smp_plat.h>
+
+#define MAX_DISTANCE_REF_POINTS 4
+static int min_common_depth;
+static int distance_ref_points_depth;
+static const __be32 *distance_ref_points;
+static int distance_lookup_table[MAX_NUMNODES][MAX_DISTANCE_REF_POINTS];
+static int of_node_to_nid_single(struct device_node *device);
+static int default_nid;
+extern nodemask_t numa_nodes_parsed __initdata;
+
+static void initialize_distance_lookup_table(int nid,
+		const __be32 *associativity)
+{
+	int i;
+
+	for (i = 0; i < distance_ref_points_depth; i++) {
+		const __be32 *entry;
+
+		entry = &associativity[be32_to_cpu(distance_ref_points[i])];
+		distance_lookup_table[nid][i] = of_read_number(entry, 1);
+	}
+}
+
+/* must hold reference to node during call */
+static const __be32 *of_get_associativity(struct device_node *dev)
+{
+	return of_get_property(dev, "arm,associativity", NULL);
+}
+
+/* Returns nid in the range [0..MAX_NUMNODES-1], or -1 if no useful numa
+ * info is found.
+ */
+static int associativity_to_nid(const __be32 *associativity)
+{
+	int nid = -1;
+
+	if (min_common_depth == -1)
+		goto out;
+
+	if (of_read_number(associativity, 1) >= min_common_depth)
+		nid = of_read_number(&associativity[min_common_depth], 1);
+
+	/* 0xffff marks an invalid node; only valid nids index the lookup table */
+	if (nid == 0xffff || nid >= MAX_NUMNODES)
+		nid = -1;
+	else if (nid >= 0)
+		initialize_distance_lookup_table(nid, associativity);
+
+out:
+	return nid;
+}
+
+/* Returns the nid associated with the given device tree node,
+ * or -1 if not found.
+ */
+static int of_node_to_nid_single(struct device_node *device)
+{
+	int nid = -1;
+	const __be32 *tmp;
+
+	tmp = of_get_associativity(device);
+	if (tmp)
+		nid = associativity_to_nid(tmp);
+	return nid;
+}
+
+/* Walk the device tree upwards, looking for an associativity id */
+int of_node_to_nid(struct device_node *device)
+{
+	struct device_node *tmp;
+	int nid = -1;
+
+	of_node_get(device);
+	while (device) {
+		nid = of_node_to_nid_single(device);
+		if (nid != -1)
+			break;
+
+		tmp = device;
+		device = of_get_parent(tmp);
+		of_node_put(tmp);
+	}
+	of_node_put(device);
+
+	return nid;
+}
+EXPORT_SYMBOL_GPL(of_node_to_nid);
+
+static int __init find_min_common_depth(unsigned long node)
+{
+	int depth;
+	const __be32 *numa_prop;
+	int nr_address_cells;
+
+	/*
+	 * This property is a set of 32-bit integers, each representing
+	 * an index into the arm,associativity nodes.
+	 *
+	 * With form 1 affinity the first integer is the most significant
+	 * NUMA boundary and the following are progressively less significant
+	 * boundaries. There can be more than one level of NUMA.
+	 */
+
+	distance_ref_points = of_get_flat_dt_prop(node,
+			"arm,associativity-reference-points",
+			&distance_ref_points_depth);
+	numa_prop = distance_ref_points;
+
+	if (numa_prop) {
+		nr_address_cells = dt_mem_next_cell(
+				OF_ROOT_NODE_ADDR_CELLS_DEFAULT, &numa_prop);
+		nr_address_cells = dt_mem_next_cell(
+				OF_ROOT_NODE_ADDR_CELLS_DEFAULT, &numa_prop);
+	}
+	if (!distance_ref_points) {
+		pr_err("NUMA: arm,associativity-reference-points not found.\n");
+		goto err;
+	}
+
+	distance_ref_points_depth /= sizeof(__be32);
+
+	if (distance_ref_points_depth < 2) {
+		pr_warn("NUMA: short arm,associativity-reference-points\n");
+		goto err;
+	}
+
+	depth = of_read_number(distance_ref_points, 1);
+
+	/*
+	 * Warn and cap if the hardware supports more than
+	 * MAX_DISTANCE_REF_POINTS domains.
+	 */
+	if (distance_ref_points_depth > MAX_DISTANCE_REF_POINTS) {
+		pr_warn("NUMA: distance array capped at %d entries\n", MAX_DISTANCE_REF_POINTS);
+		distance_ref_points_depth = MAX_DISTANCE_REF_POINTS;
+	}
+
+	return depth;
+
+err:
+	return -1;
+}
+
+int dt_get_cpu_node_id(int cpu)
+{
+	struct device_node *dn = NULL;
+
+	while ((dn = of_find_node_by_type(dn, "cpu"))) {
+		const __be32 *cell;
+		u64 hwid;
+
+		/*
+		 * A cpu node with missing "reg" property is
+		 * considered invalid to build a cpu_logical_map
+		 * entry.
+		 */
+		cell = of_get_property(dn, "reg", NULL);
+		if (!cell) {
+			pr_err("%s: missing reg property\n", dn->full_name);
+			return default_nid;
+		}
+		hwid = of_read_number(cell, of_n_addr_cells(dn));
+
+		if (cpu_logical_map(cpu) == hwid)
+			return of_node_to_nid_single(dn);
+	}
+	return NUMA_NO_NODE;
+}
+EXPORT_SYMBOL(dt_get_cpu_node_id);
+
+static int __init parse_memory_node(unsigned long node)
+{
+	const __be32 *reg, *endp, *associativity;
+	int length;
+	int nid = -1;
+
+	associativity = of_get_flat_dt_prop(node, "arm,associativity", &length);
+
+	if (associativity)
+		nid = associativity_to_nid(associativity);
+
+	reg = of_get_flat_dt_prop(node, "reg", &length);
+	endp = reg + (length / sizeof(__be32));
+
+	while ((endp - reg) >= (dt_root_addr_cells + dt_root_size_cells)) {
+		u64 base, size;
+		struct memblock_region *mblk;
+
+		base = dt_mem_next_cell(dt_root_addr_cells, &reg);
+		size = dt_mem_next_cell(dt_root_size_cells, &reg);
+		pr_debug("NUMA-DT:  base = %llx , node = %d\n",
+				base, nid);
+		for_each_memblock(memory, mblk) {
+			if (mblk->base == base) {
+				numa_add_memblk(nid, mblk->base, mblk->size);
+				node_set(nid, numa_nodes_parsed);
+				break;
+			}
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * early_init_dt_scan_numa_map - parse memory node and map nid to memory range.
+ */
+int __init early_init_dt_scan_numa_map(unsigned long node, const char *uname,
+				     int depth, void *data)
+{
+	const char *type = of_get_flat_dt_prop(node, "device_type", NULL);
+
+	/* The root node (depth 0) carries arm,associativity-reference-points */
+	if (depth == 0) {
+		min_common_depth = find_min_common_depth(node);
+		if (min_common_depth < 0)
+			return min_common_depth;
+		pr_debug("NUMA associativity depth for CPU/Memory: %d\n", min_common_depth);
+		return 0;
+	}
+
+	if (type) {
+		if (strcmp(type, "memory") == 0)
+			parse_memory_node(node);
+	}
+	return 0;
+}
+
+int dt_get_node_distance(int a, int b)
+{
+	int i;
+	int distance = LOCAL_DISTANCE;
+
+	for (i = 0; i < distance_ref_points_depth; i++) {
+		if (distance_lookup_table[a][i] == distance_lookup_table[b][i])
+			break;
+
+		/* Double the distance for each NUMA level */
+		distance *= 2;
+	}
+	return distance;
+}
+
+/* DT node mapping is already done by early_init_dt_scan_memory */
+int __init arm64_dt_numa_init(void)
+{
+	int i;
+	u32 nodea, nodeb, distance, node_count = 0;
+
+	of_scan_flat_dt(early_init_dt_scan_numa_map, NULL);
+
+	for_each_node_mask(i, numa_nodes_parsed)
+		node_count = i;
+	node_count++;
+
+	for (nodea = 0; nodea < node_count; nodea++) {
+		for (nodeb = 0; nodeb < node_count; nodeb++) {
+			distance = dt_get_node_distance(nodea, nodeb);
+			numa_set_distance(nodea, nodeb, distance);
+		}
+	}
+	return 0;
+}
+EXPORT_SYMBOL(arm64_dt_numa_init);
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 2437196..80b4a9e 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -425,6 +425,9 @@ static int __init topology_init(void)
 {
 	int i;
 
+	for_each_online_node(i)
+		register_one_node(i);
+
 	for_each_possible_cpu(i) {
 		struct cpu *cpu = &per_cpu(cpu_data.cpu, i);
 		cpu->hotpluggable = 1;
@@ -461,7 +464,12 @@ static int c_show(struct seq_file *m, void *v)
 		 * "processor".  Give glibc what it expects.
 		 */
 #ifdef CONFIG_SMP
+		if (IS_ENABLED(CONFIG_NUMA)) {
+			seq_printf(m, "processor\t: %d", i);
+			seq_printf(m, " [nid: %d]\n", cpu_to_node(i));
+		} else {
 		seq_printf(m, "processor\t: %d\n", i);
+		}
 #endif
 	}
 
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index b06d1d9..1d1e86f 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -50,6 +50,7 @@
 #include <asm/sections.h>
 #include <asm/tlbflush.h>
 #include <asm/ptrace.h>
+#include <asm/numa.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/ipi.h>
@@ -123,6 +124,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 static void smp_store_cpu_info(unsigned int cpuid)
 {
 	store_cpu_topology(cpuid);
+	numa_store_cpu_info(cpuid);
 }
 
 /*
diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
index c56179e..c86e6de 100644
--- a/arch/arm64/mm/Makefile
+++ b/arch/arm64/mm/Makefile
@@ -3,3 +3,4 @@ obj-y				:= dma-mapping.o extable.o fault.o init.o \
 				   ioremap.o mmap.o pgd.o mmu.o \
 				   context.o proc.o pageattr.o
 obj-$(CONFIG_HUGETLB_PAGE)	+= hugetlbpage.o
+obj-$(CONFIG_NUMA)		+= numa.o
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 494297c..6fd6802 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -39,6 +39,7 @@
 #include <asm/setup.h>
 #include <asm/sizes.h>
 #include <asm/tlb.h>
+#include <asm/numa.h>
 
 #include "mm.h"
 
@@ -73,6 +74,20 @@ static phys_addr_t max_zone_dma_phys(void)
 	return min(offset + (1ULL << 32), memblock_end_of_DRAM());
 }
 
+#ifdef CONFIG_NUMA
+static void __init zone_sizes_init(unsigned long min, unsigned long max)
+{
+	unsigned long max_zone_pfns[MAX_NR_ZONES];
+
+	memset(max_zone_pfns, 0, sizeof(max_zone_pfns));
+	if (IS_ENABLED(CONFIG_ZONE_DMA))
+		max_zone_pfns[ZONE_DMA] = PFN_DOWN(max_zone_dma_phys());
+	max_zone_pfns[ZONE_NORMAL] = max;
+
+	free_area_init_nodes(max_zone_pfns);
+}
+
+#else
 static void __init zone_sizes_init(unsigned long min, unsigned long max)
 {
 	struct memblock_region *reg;
@@ -111,6 +126,7 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
 
 	free_area_init_node(0, zone_size, min, zhole_size);
 }
+#endif /* CONFIG_NUMA */
 
 #ifdef CONFIG_HAVE_ARCH_PFN_VALID
 int pfn_valid(unsigned long pfn)
@@ -128,10 +144,15 @@ static void arm64_memory_present(void)
 static void arm64_memory_present(void)
 {
 	struct memblock_region *reg;
+	int nid = 0;
 
-	for_each_memblock(memory, reg)
-		memory_present(0, memblock_region_memory_base_pfn(reg),
-			       memblock_region_memory_end_pfn(reg));
+	for_each_memblock(memory, reg) {
+#ifdef CONFIG_NUMA
+		nid = reg->nid;
+#endif
+		memory_present(nid, memblock_region_memory_base_pfn(reg),
+				memblock_region_memory_end_pfn(reg));
+	}
 }
 #endif
 
@@ -167,6 +188,10 @@ void __init bootmem_init(void)
 	min = PFN_UP(memblock_start_of_DRAM());
 	max = PFN_DOWN(memblock_end_of_DRAM());
 
+	high_memory = __va((max << PAGE_SHIFT) - 1) + 1;
+	max_pfn = max_low_pfn = max;
+
+	arm64_numa_init();
 	/*
 	 * Sparsemem tries to allocate bootmem in memory_present(), so must be
 	 * done after the fixed reservations.
@@ -175,9 +200,6 @@ void __init bootmem_init(void)
 
 	sparse_init();
 	zone_sizes_init(min, max);
-
-	high_memory = __va((max << PAGE_SHIFT) - 1) + 1;
-	max_pfn = max_low_pfn = max;
 }
 
 #ifndef CONFIG_SPARSEMEM_VMEMMAP
diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
new file mode 100644
index 0000000..e146a2d
--- /dev/null
+++ b/arch/arm64/mm/numa.c
@@ -0,0 +1,520 @@
+/*
+ * NUMA support, based on the x86 implementation.
+ *
+ * Copyright (C) 2014 Cavium Inc.
+ * Author: Ganapatrao Kulkarni <gkulkarni-YGCgFSpz5w/QT0dZR+AlfA@public.gmane.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/string.h>
+#include <linux/init.h>
+#include <linux/bootmem.h>
+#include <linux/memblock.h>
+#include <linux/mmzone.h>
+#include <linux/ctype.h>
+#include <linux/module.h>
+#include <linux/nodemask.h>
+#include <linux/sched.h>
+#include <linux/topology.h>
+#include <linux/of.h>
+#include <asm/smp_plat.h>
+
+int __initdata numa_off;
+nodemask_t numa_nodes_parsed __initdata;
+static int numa_distance_cnt;
+static u8 *numa_distance;
+
+struct pglist_data *node_data[MAX_NUMNODES] __read_mostly;
+EXPORT_SYMBOL(node_data);
+
+static struct __node_cpu_hwid node_cpu_hwid[NR_CPUS];
+static struct numa_meminfo numa_meminfo;
+
+static __init int numa_setup(char *opt)
+{
+	if (!opt)
+		return -EINVAL;
+	if (!strncmp(opt, "off", 3)) {
+		pr_info("%s\n", "NUMA turned off");
+		numa_off = 1;
+	}
+	return 0;
+}
+early_param("numa", numa_setup);
+
+cpumask_var_t node_to_cpumask_map[MAX_NUMNODES];
+EXPORT_SYMBOL(node_to_cpumask_map);
+
+/*
+ * Returns a pointer to the bitmask of CPUs on Node 'node'.
+ */
+const struct cpumask *cpumask_of_node(int node)
+{
+	if (node >= nr_node_ids) {
+		pr_warn("cpumask_of_node(%d): node >= nr_node_ids(%d)\n",
+			node, nr_node_ids);
+		dump_stack();
+		return cpu_none_mask;
+	}
+	if (node_to_cpumask_map[node] == NULL) {
+		pr_warn("cpumask_of_node(%d): no node_to_cpumask_map!\n",
+			node);
+		dump_stack();
+		return cpu_online_mask;
+	}
+	return node_to_cpumask_map[node];
+}
+EXPORT_SYMBOL(cpumask_of_node);
+
+int cpu_to_node_map[NR_CPUS];
+EXPORT_SYMBOL(cpu_to_node_map);
+
+void numa_clear_node(int cpu)
+{
+	cpu_to_node_map[cpu] = NUMA_NO_NODE;
+}
+
+/*
+ * Allocate node_to_cpumask_map based on number of available nodes
+ * Requires node_possible_map to be valid.
+ *
+ * Note: cpumask_of_node() is not valid until after this is done.
+ * (Use CONFIG_DEBUG_PER_CPU_MAPS to check this.)
+ */
+void __init setup_node_to_cpumask_map(void)
+{
+	unsigned int node;
+
+	/* setup nr_node_ids if not done yet */
+	if (nr_node_ids == MAX_NUMNODES)
+		setup_nr_node_ids();
+
+	/* allocate the map */
+	for (node = 0; node < nr_node_ids; node++)
+		alloc_bootmem_cpumask_var(&node_to_cpumask_map[node]);
+
+	/* cpumask_of_node() will now work */
+	pr_debug("Node to cpumask map for %d nodes\n", nr_node_ids);
+}
+
+/*
+ *  Set the cpu to node and mem mapping
+ */
+void numa_store_cpu_info(int cpu)
+{
+#ifdef CONFIG_ARM64_DT_NUMA
+	node_cpu_hwid[cpu].node_id = dt_get_cpu_node_id(cpu);
+#endif
+	/* mapping of MPIDR/hwid, node and logical cpu id */
+	cpu_to_node_map[cpu] = node_cpu_hwid[cpu].node_id;
+	node_cpu_hwid[cpu].cpu_hwid = cpu_logical_map(cpu);
+	cpumask_set_cpu(cpu, node_to_cpumask_map[cpu_to_node_map[cpu]]);
+	set_numa_node(cpu_to_node_map[cpu]);
+	set_numa_mem(local_memory_node(cpu_to_node_map[cpu]));
+}
+
+/**
+ * numa_add_memblk_to - Add one numa_memblk to a numa_meminfo
+ */
+
+static int __init numa_add_memblk_to(int nid, u64 start, u64 end,
+				     struct numa_meminfo *mi)
+{
+	/* ignore zero length blks */
+	if (start == end)
+		return 0;
+
+	/* whine about and ignore invalid blks */
+	if (start > end || nid < 0 || nid >= MAX_NUMNODES) {
+		pr_warn("NUMA: Warning: invalid memblk node %d [mem %#010Lx-%#010Lx]\n",
+				nid, start, end - 1);
+		return 0;
+	}
+
+	if (mi->nr_blks >= NR_NODE_MEMBLKS) {
+		pr_err("NUMA: too many memblk ranges\n");
+		return -EINVAL;
+	}
+
+	pr_info("NUMA: Adding memblock %d [0x%llx - 0x%llx] on node %d\n",
+			mi->nr_blks, start, end, nid);
+	mi->blk[mi->nr_blks].start = start;
+	mi->blk[mi->nr_blks].end = end;
+	mi->blk[mi->nr_blks].nid = nid;
+	mi->nr_blks++;
+	return 0;
+}
+
+/**
+ * numa_add_memblk - Add one numa_memblk to numa_meminfo
+ * @nid: NUMA node ID of the new memblk
+ * @base: Base address of the new memblk
+ * @size: Size of the new memblk
+ *
+ * Add a new memblk to the default numa_meminfo.
+ *
+ * RETURNS:
+ * 0 on success, -errno on failure.
+ */
+#define MAX_PHYS_ADDR	((phys_addr_t)~0)
+
+int __init numa_add_memblk(u32 nid, u64 base, u64 size)
+{
+	const u64 phys_offset = __pa(PAGE_OFFSET);
+
+	base &= PAGE_MASK;
+	size &= PAGE_MASK;
+
+	if (base > MAX_PHYS_ADDR) {
+		pr_warn("NUMA: Ignoring memory block 0x%llx - 0x%llx\n",
+				base, base + size);
+		return -ENOMEM;
+	}
+
+	if (base + size > MAX_PHYS_ADDR) {
+		pr_info("NUMA: Ignoring memory range 0x%lx - 0x%llx\n",
+				ULONG_MAX, base + size);
+		size = MAX_PHYS_ADDR - base;
+	}
+
+	if (base + size < phys_offset) {
+		pr_warn("NUMA: Ignoring memory block 0x%llx - 0x%llx\n",
+			   base, base + size);
+		return -ENOMEM;
+	}
+	if (base < phys_offset) {
+		pr_info("NUMA: Ignoring memory range 0x%llx - 0x%llx\n",
+			   base, phys_offset);
+		size -= phys_offset - base;
+		base = phys_offset;
+	}
+
+	return numa_add_memblk_to(nid, base, base+size, &numa_meminfo);
+}
+EXPORT_SYMBOL(numa_add_memblk);
+
+/* Initialize NODE_DATA for a node on the local memory */
+static void __init setup_node_data(int nid, u64 start, u64 end)
+{
+	const size_t nd_size = roundup(sizeof(pg_data_t), PAGE_SIZE);
+	u64 nd_pa;
+	void *nd;
+	int tnid;
+
+	start = roundup(start, ZONE_ALIGN);
+
+	pr_info("Initmem setup node %d [mem %#010Lx-%#010Lx]\n",
+	       nid, start, end - 1);
+
+	/*
+	 * Allocate node data.  Try node-local memory and then any node.
+	 */
+	nd_pa = memblock_alloc_nid(nd_size, SMP_CACHE_BYTES, nid);
+	if (!nd_pa) {
+		nd_pa = __memblock_alloc_base(nd_size, SMP_CACHE_BYTES,
+					      MEMBLOCK_ALLOC_ACCESSIBLE);
+		if (!nd_pa) {
+			pr_err("Cannot find %zu bytes in node %d\n",
+			       nd_size, nid);
+			return;
+		}
+	}
+	nd = __va(nd_pa);
+
+	/* report and initialize */
+	pr_info("  NODE_DATA [mem %#010Lx-%#010Lx]\n",
+	       nd_pa, nd_pa + nd_size - 1);
+	tnid = early_pfn_to_nid(nd_pa >> PAGE_SHIFT);
+	if (tnid != nid)
+		pr_info("    NODE_DATA(%d) on node %d\n", nid, tnid);
+
+	node_data[nid] = nd;
+	memset(NODE_DATA(nid), 0, sizeof(pg_data_t));
+	NODE_DATA(nid)->node_id = nid;
+	NODE_DATA(nid)->node_start_pfn = start >> PAGE_SHIFT;
+	NODE_DATA(nid)->node_spanned_pages = (end - start) >> PAGE_SHIFT;
+
+	node_set_online(nid);
+}
+
+/*
+ * Set nodes, which have memory in @mi, in *@nodemask.
+ */
+static void __init numa_nodemask_from_meminfo(nodemask_t *nodemask,
+					      const struct numa_meminfo *mi)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(mi->blk); i++)
+		if (mi->blk[i].start != mi->blk[i].end &&
+		    mi->blk[i].nid != NUMA_NO_NODE)
+			node_set(mi->blk[i].nid, *nodemask);
+}
+
+/*
+ * Sanity check to catch more bad NUMA configurations (they are amazingly
+ * common).  Make sure the nodes cover all memory.
+ */
+static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
+{
+	u64 numaram, totalram;
+	int i;
+
+	numaram = 0;
+	for (i = 0; i < mi->nr_blks; i++) {
+		u64 s = mi->blk[i].start >> PAGE_SHIFT;
+		u64 e = mi->blk[i].end >> PAGE_SHIFT;
+
+		numaram += e - s;
+		numaram -= __absent_pages_in_range(mi->blk[i].nid, s, e);
+		if ((s64)numaram < 0)
+			numaram = 0;
+	}
+
+	totalram = max_pfn - absent_pages_in_range(0, max_pfn);
+
+	/* We seem to lose 3 pages somewhere. Allow 1M of slack. */
+	if ((s64)(totalram - numaram) >= (1 << (20 - PAGE_SHIFT))) {
+		pr_err("NUMA: nodes only cover %lluMB of your %lluMB Total RAM. Not used.\n",
+		       (numaram << PAGE_SHIFT) >> 20,
+		       (totalram << PAGE_SHIFT) >> 20);
+		return false;
+	}
+	return true;
+}
+
+/**
+ * numa_reset_distance - Reset NUMA distance table
+ *
+ * The current table is freed.  The next numa_set_distance() call will
+ * create a new one.
+ */
+void __init numa_reset_distance(void)
+{
+	size_t size = numa_distance_cnt * numa_distance_cnt *
+		sizeof(numa_distance[0]);
+
+	/* numa_distance could be 1LU marking allocation failure, test cnt */
+	if (numa_distance_cnt)
+		memblock_free(__pa(numa_distance), size);
+	numa_distance_cnt = 0;
+	numa_distance = NULL;	/* enable table creation */
+}
+
+static int __init numa_alloc_distance(void)
+{
+	nodemask_t nodes_parsed;
+	size_t size;
+	int i, j, cnt = 0;
+	u64 phys;
+
+	/* size the new table and allocate it */
+	nodes_parsed = numa_nodes_parsed;
+	numa_nodemask_from_meminfo(&nodes_parsed, &numa_meminfo);
+
+	for_each_node_mask(i, nodes_parsed)
+		cnt = i;
+	cnt++;
+	size = cnt * cnt * sizeof(numa_distance[0]);
+
+	phys = memblock_find_in_range(0, PFN_PHYS(max_pfn),
+				      size, PAGE_SIZE);
+	if (!phys) {
+		pr_warning("NUMA: Warning: can't allocate distance table!\n");
+		/* don't retry until explicitly reset */
+		numa_distance = (void *)1LU;
+		return -ENOMEM;
+	}
+	memblock_reserve(phys, size);
+
+	numa_distance = __va(phys);
+	numa_distance_cnt = cnt;
+
+	/* fill with the default distances */
+	for (i = 0; i < cnt; i++)
+		for (j = 0; j < cnt; j++)
+			numa_distance[i * cnt + j] = i == j ?
+				LOCAL_DISTANCE : REMOTE_DISTANCE;
+	pr_debug("NUMA: Initialized distance table, cnt=%d\n", cnt);
+
+	return 0;
+}
+
+/**
+ * numa_set_distance - Set NUMA distance from one NUMA node to another
+ * @from: the 'from' node to set distance
+ * @to: the 'to'  node to set distance
+ * @distance: NUMA distance
+ *
+ * Set the distance from node @from to @to to @distance.  If distance table
+ * doesn't exist, one which is large enough to accommodate all the currently
+ * known nodes will be created.
+ *
+ * If such table cannot be allocated, a warning is printed and further
+ * calls are ignored until the distance table is reset with
+ * numa_reset_distance().
+ *
+ * If @from or @to is higher than the highest known node or lower than zero
+ * at the time of table creation or @distance doesn't make sense, the call
+ * is ignored.
+ * This is to allow simplification of specific NUMA config implementations.
+ */
+void __init numa_set_distance(int from, int to, int distance)
+{
+	if (!numa_distance && numa_alloc_distance() < 0)
+		return;
+
+	if (from >= numa_distance_cnt || to >= numa_distance_cnt ||
+			from < 0 || to < 0) {
+		pr_warn_once("NUMA: Warning: node ids are out of bound, from=%d to=%d distance=%d\n",
+			    from, to, distance);
+		return;
+	}
+
+	if ((u8)distance != distance ||
+	    (from == to && distance != LOCAL_DISTANCE)) {
+		pr_warn_once("NUMA: Warning: invalid distance parameter, from=%d to=%d distance=%d\n",
+			     from, to, distance);
+		return;
+	}
+
+	numa_distance[from * numa_distance_cnt + to] = distance;
+}
+EXPORT_SYMBOL(numa_set_distance);
+
+int __node_distance(int from, int to)
+{
+	if (from >= numa_distance_cnt || to >= numa_distance_cnt)
+		return from == to ? LOCAL_DISTANCE : REMOTE_DISTANCE;
+	return numa_distance[from * numa_distance_cnt + to];
+}
+EXPORT_SYMBOL(__node_distance);
+
+static int __init numa_register_memblks(struct numa_meminfo *mi)
+{
+	unsigned long uninitialized_var(pfn_align);
+	int i, nid;
+
+	/* Account for nodes with cpus and no memory */
+	node_possible_map = numa_nodes_parsed;
+	numa_nodemask_from_meminfo(&node_possible_map, mi);
+	if (WARN_ON(nodes_empty(node_possible_map)))
+		return -EINVAL;
+
+	for (i = 0; i < mi->nr_blks; i++) {
+		struct numa_memblk *mb = &mi->blk[i];
+
+		memblock_set_node(mb->start, mb->end - mb->start,
+				  &memblock.memory, mb->nid);
+	}
+
+	/*
+	 * If sections array is gonna be used for pfn -> nid mapping, check
+	 * whether its granularity is fine enough.
+	 */
+#ifdef NODE_NOT_IN_PAGE_FLAGS
+	pfn_align = node_map_pfn_alignment();
+	if (pfn_align && pfn_align < PAGES_PER_SECTION) {
+		pr_warn("Node alignment %lluMB < min %lluMB, rejecting NUMA config\n",
+		       PFN_PHYS(pfn_align) >> 20,
+		       PFN_PHYS(PAGES_PER_SECTION) >> 20);
+		return -EINVAL;
+	}
+#endif
+	if (!numa_meminfo_cover_memory(mi))
+		return -EINVAL;
+
+	/* Finally register nodes. */
+	for_each_node_mask(nid, node_possible_map) {
+		u64 start = PFN_PHYS(max_pfn);
+		u64 end = 0;
+
+		for (i = 0; i < mi->nr_blks; i++) {
+			if (nid != mi->blk[i].nid)
+				continue;
+			start = min(mi->blk[i].start, start);
+			end = max(mi->blk[i].end, end);
+		}
+
+		if (start < end)
+			setup_node_data(nid, start, end);
+	}
+
+	/* Dump memblock with node info and return. */
+	memblock_dump_all();
+	return 0;
+}
+
+static int __init numa_init(int (*init_func)(void))
+{
+	int ret, i;
+
+	nodes_clear(node_possible_map);
+	nodes_clear(node_online_map);
+
+	ret = init_func();
+	if (ret < 0)
+		return ret;
+
+	ret = numa_register_memblks(&numa_meminfo);
+	if (ret < 0)
+		return ret;
+
+	for (i = 0; i < nr_cpu_ids; i++)
+		numa_clear_node(i);
+
+	setup_node_to_cpumask_map();
+	return 0;
+}
+
+/**
+ * dummy_numa_init - Fallback dummy NUMA init
+ *
+ * Used if there's no underlying NUMA architecture, NUMA initialization
+ * fails, or NUMA is disabled on the command line.
+ *
+ * Must online at least one node and add memory blocks that cover all
+ * allowed memory.  This function must not fail.
+ */
+static int __init dummy_numa_init(void)
+{
+	pr_info("%s\n", "No NUMA configuration found");
+	pr_info("Faking a node at [mem %#018Lx-%#018Lx]\n",
+	       0LLU, PFN_PHYS(max_pfn) - 1);
+	node_set(0, numa_nodes_parsed);
+	numa_add_memblk(0, 0, PFN_PHYS(max_pfn));
+
+	return 0;
+}
+
+/**
+ * arm64_numa_init - Initialize NUMA
+ *
+ * Try each configured NUMA initialization method until one succeeds.  The
+ * last fallback is dummy single node config encompassing whole memory and
+ * never fails.
+ */
+void __init arm64_numa_init(void)
+{
+	if (!numa_off) {
+#ifdef CONFIG_ARM64_DT_NUMA
+		if (!numa_init(arm64_dt_numa_init))
+			return;
+#endif
+	}
+
+	numa_init(dummy_numa_init);
+}
-- 
1.8.1.4

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
       [not found]     ` <1420011208-7051-4-git-send-email-ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
@ 2015-01-02 10:49       ` Arnd Bergmann
  2015-01-02 21:17         ` Arnd Bergmann
  1 sibling, 0 replies; 43+ messages in thread
From: Arnd Bergmann @ 2015-01-02 10:49 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: inux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Will.Deacon-5wv7dgnIgG8, catalin.marinas-5wv7dgnIgG8,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	jchandra-dY08KVG/lbpWk0Htik3J/w, al.stone-QSEj5FYQhm4dnm+yROfE0A,
	gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w

On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
> +
> +	memory@00000000 {
> +		device_type = "memory";
> +		reg = <0x0 0x00000000 0x0 0x80000000>;
> +		/* board 0, socket 0, no specific core */
> +		arm,associativity = <0 0 0xffff>;
> +	};
> +
> +	memory@10000000000 {
> +		device_type = "memory";
> +		reg = <0x100 0x00000000 0x0 0x80000000>;
> +		/* board 1, socket 0, no specific core */
> +		arm,associativity = <1 0 0xffff>;
> +	};
> +};

So no memory in any other socket?

> +		cpu@00f {
> +			device_type = "cpu";
> +			compatible = "cavium,thunder", "arm,armv8";
> +			reg = <0x0 0x00f>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 0x00f>;
> +		};
> +		cpu@100 {
> +			device_type = "cpu";
> +			compatible = "cavium,thunder", "arm,armv8";
> +			reg = <0x0 0x100>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 0x100>;
> +		};

What is the 0x100 offset in the last-level topology field? Does this have
no significance to topology at all? I would expect that to be something
like cluster number that is relevant to caching and should be represented
as a separate level.

In contrast, the level-two topology information seems to always be
zero for all CPUs, so you could probably leave that one out.

> +	soc {
> +		compatible = "simple-bus";
> +		#address-cells = <2>;
> +		#size-cells = <2>;
> +		ranges;

The soc node is missing topology information; please add it.

	Arnd


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH v3 1/4] arm64: defconfig: increase NR_CPUS range to 2-4096.
       [not found]     ` <1420011208-7051-2-git-send-email-ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
@ 2015-01-02 10:49       ` Arnd Bergmann
  2015-01-02 21:17         ` Arnd Bergmann
  1 sibling, 0 replies; 43+ messages in thread
From: Arnd Bergmann @ 2015-01-02 10:49 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: inux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Will.Deacon-5wv7dgnIgG8, catalin.marinas-5wv7dgnIgG8,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	jchandra-dY08KVG/lbpWk0Htik3J/w, al.stone-QSEj5FYQhm4dnm+yROfE0A,
	gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w

On Wednesday 31 December 2014 13:03:25 Ganapatrao Kulkarni wrote:
> Raising the maximum limit to 4096.
> This is to accommodate upcoming higher multi-core platforms.
> 
> Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>

Acked-by: Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH v3 2/4] Documentation: arm64/arm: dt bindings for numa.
       [not found]     ` <1420011208-7051-3-git-send-email-ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
@ 2015-01-02 11:02       ` Arnd Bergmann
  2015-01-02 21:17         ` Arnd Bergmann
  1 sibling, 0 replies; 43+ messages in thread
From: Arnd Bergmann @ 2015-01-02 11:02 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: inux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Will.Deacon-5wv7dgnIgG8, catalin.marinas-5wv7dgnIgG8,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	jchandra-dY08KVG/lbpWk0Htik3J/w, al.stone-QSEj5FYQhm4dnm+yROfE0A,
	gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w

On Wednesday 31 December 2014 13:03:26 Ganapatrao Kulkarni wrote:
> DT bindings for numa map for memory, cores and IOs using arm,associativity
> device node property.
> 
> Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@caviumnetworks.com>
> ---
>  Documentation/devicetree/bindings/arm/numa.txt | 198 +++++++++++++++++++++++++
>  1 file changed, 198 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
> 
> diff --git a/Documentation/devicetree/bindings/arm/numa.txt b/Documentation/devicetree/bindings/arm/numa.txt
> new file mode 100644
> index 0000000..4f51e25
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/numa.txt
> @@ -0,0 +1,198 @@
> +==============================================================================
> +NUMA binding description.
> +==============================================================================
> +
> +==============================================================================
> +1 - Introduction
> +==============================================================================
> +
> +Systems employing a Non Uniform Memory Access (NUMA) architecture contain
> +collections of hardware resources including processors, memory, and I/O buses,
> +that comprise what is commonly known as a “NUMA node”.
> +Processor accesses to memory within the local NUMA node is generally faster
> +than processor accesses to memory outside of the local NUMA node.
> +DT defines interfaces that allow the platform to convey NUMA node
> +topology information to OS.
> +
> +==============================================================================
> +2 - arm,associativity
> +==============================================================================
> +
> +The mapping is done using the arm,associativity device property.
> +This property needs to be present in every device node which needs to be
> +mapped to numa nodes.
> +
> +The arm,associativity property is a set of 32-bit integers representing the
> +board id, socket id and core id.
> +
> +ex:
> +	/* board 0, socket 0, core 0 */
> +	arm,associativity = <0 0 0x000>;
> +
> +	/* board 1, socket 0, core 8 */
> +	arm,associativity = <1 0 0x08>;

This is way too specific to Cavium machines. Most other vendors will not (at first)
have multiple boards or multiple sockets, but need to represent multiple clusters
and/or SMT threads instead. Also the wording suggests that this is only relevant
for NUMA, which I don't think is helpful because we will also want to describe
the topology within one NUMA node for locality.

I think we should stick to the powerpc definition here and not define what the
levels mean at the binding level. Something like:

"Each level of topology defines a boundary in the system at which a significant
difference in performance can be measured between cross-device accesses within
a single location and those spanning multiple locations. The first cell always
contains the broadest subdivision within the system, while the last cell enumerates
the individual devices, such as an SMT thread of a CPU, or a bus bridge within
an SoC".
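
To illustrate the ordering in that wording (broadest subdivision in the
first cell, individual device in the last), here is a minimal userspace C
sketch. The helper name assoc_shared_levels() is hypothetical and not part
of the patch or of any existing binding code; it only counts how many
leading cells two associativity lists have in common, on the assumption
that more shared levels means the two devices are closer.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: count how many leading
 * associativity cells two devices share. Cell 0 is assumed to be the
 * broadest subdivision, the last cell identifies the individual device.
 */
static size_t assoc_shared_levels(const uint32_t *a, const uint32_t *b,
				  size_t nr_cells)
{
	size_t i;

	/* Stop at the first (broadest) level where the devices differ. */
	for (i = 0; i < nr_cells; i++)
		if (a[i] != b[i])
			break;
	return i;
}
```

With the values from the posted Thunder dts, cpu@00f <0 0 0x00f> and
cpu@100 <0 0 0x100> share two levels, while cpu@00f and the memory node on
board 1 <1 0 0xffff> share none.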

> +==============================================================================
> +3 - arm,associativity-reference-points
> +==============================================================================
> +This property is a set of 32-bit integers, each representing an index into
> +the arm,associativity nodes. The first integer is the most significant
> +NUMA boundary and the following are progressively less significant boundaries.
> +There can be more than one level of NUMA.
> +
> +Ex:
> +	arm,associativity-reference-points = <0 1>;
> +	The board Id(index 0) used first to calculate the associativity (node
> +	distance), then follows the  socket id(index 1).
> +
> +	arm,associativity-reference-points = <1 0>;
> +	The socket Id(index 1) used first to calculate the associativity,
> +	then follows the board id(index 0).
> +
> +	arm,associativity-reference-points = <0>;
> +	Only the board Id(index 0) used to calculate the associativity.
> +
> +	arm,associativity-reference-points = <1>;
> +	Only socket Id(index 1) used to calculate the associativity.
> +
> +==============================================================================
> +4 - Example dts
> +==============================================================================
> +
> +Example: a 2-node system consisting of 2 boards, each board having one
> +socket with 8 cores.

I think the example should also include a PCI controller.

> +
> +	arm,associativity-reference-points = <0 1>;

This doesn't really match the associativity properties, because the
second level in the cpus nodes is completely meaningless and should
not be listed as a secondary reference point.

	Arnd

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH v3 4/4] arm64:numa: adding numa support for arm64 platforms.
       [not found]     ` <1420011208-7051-5-git-send-email-ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
@ 2015-01-02 11:34       ` Arnd Bergmann
  2015-01-02 21:10         ` Arnd Bergmann
  1 sibling, 0 replies; 43+ messages in thread
From: Arnd Bergmann @ 2015-01-02 11:34 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: inux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Will.Deacon-5wv7dgnIgG8, catalin.marinas-5wv7dgnIgG8,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	jchandra-dY08KVG/lbpWk0Htik3J/w, al.stone-QSEj5FYQhm4dnm+yROfE0A,
	gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w

On Wednesday 31 December 2014 13:03:28 Ganapatrao Kulkarni wrote:
> Adding numa support for arm64 based platforms.
> Adding dt node parsing for numa topology using property arm,associativity.
> 
> Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>

Maybe the parts that are common with powerpc can be moved to drivers/of/numa.c?
We can always look for both arm,associativity and ibm,associativity, I don't
think we should be worried about any conflicts that way.

> +#define MAX_DISTANCE_REF_POINTS 4

I think we should use 8 here like powerpc, four levels might get exceeded
on complex SoCs.

> +int dt_get_cpu_node_id(int cpu)
> +{
> +	struct device_node *dn = NULL;
> +
> +	while ((dn = of_find_node_by_type(dn, "cpu"))) {
> +		const u32 *cell;
> +		u64 hwid;
> +
> +		/*
> +		 * A cpu node with missing "reg" property is
> +		 * considered invalid to build a cpu_logical_map
> +		 * entry.
> +		 */
> +		cell = of_get_property(dn, "reg", NULL);
> +		if (!cell) {
> +			pr_err("%s: missing reg property\n", dn->full_name);
> +			return default_nid;
> +		}
> +		hwid = of_read_number(cell, of_n_addr_cells(dn));
> +
> +		if (cpu_logical_map(cpu) == hwid)
> +		return of_node_to_nid_single(dn);
> +	}
> +	return NUMA_NO_NODE;
> +}
> +EXPORT_SYMBOL(dt_get_cpu_node_id);

Maybe just expose a function that returns the device node for a CPU ID here,
and expect callers to use of_node_to_nid?
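
As a rough userspace sketch of that split, assuming a much simplified
stand-in for struct device_node: the lookup only finds the node for a
hardware ID, and the node-id translation is left to the caller. All names
below (mock_device_node, mock_cpu_device_node, mock_node_to_nid) are
hypothetical; in the kernel the second step would be of_node_to_nid().

```c
#include <stddef.h>
#include <stdint.h>

#define NUMA_NO_NODE (-1)

/* Simplified stand-in for the kernel's struct device_node (assumption). */
struct mock_device_node {
	uint64_t hwid;	/* value of the cpu node's "reg" property */
	int nid;	/* node id derived from its associativity property */
};

/* Hypothetical lookup: return the cpu node matching @hwid, or NULL. */
static const struct mock_device_node *
mock_cpu_device_node(const struct mock_device_node *nodes, size_t nr_nodes,
		     uint64_t hwid)
{
	size_t i;

	for (i = 0; i < nr_nodes; i++)
		if (nodes[i].hwid == hwid)
			return &nodes[i];
	return NULL;
}

/* Stand-in for of_node_to_nid(): the caller translates node to nid. */
static int mock_node_to_nid(const struct mock_device_node *dn)
{
	return dn ? dn->nid : NUMA_NO_NODE;
}
```

A caller would then chain the two, for example
mock_node_to_nid(mock_cpu_device_node(nodes, nr_nodes, hwid)), and get
NUMA_NO_NODE when no cpu node matches.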

> +
> +/**
> + * early_init_dt_scan_numa_map - parse memory node and map nid to memory range.
> + */
> +int __init early_init_dt_scan_numa_map(unsigned long node, const char *uname,
> +				     int depth, void *data)
> +{
> +	const char *type = of_get_flat_dt_prop(node, "device_type", NULL);
> +
> +	/* We are scanning "numa-map" nodes only */

a stale comment?

> +/* DT node mapping is done already early_init_dt_scan_memory */
> +int __init arm64_dt_numa_init(void)
> +{
> +	int i;
> +	u32 nodea, nodeb, distance, node_count = 0;
> +
> +	of_scan_flat_dt(early_init_dt_scan_numa_map, NULL);
> +
> +	for_each_node_mask(i, numa_nodes_parsed)
> +		node_count = i;
> +	node_count++;
> +
> +	for (nodea =  0; nodea < node_count; nodea++) {
> +		for (nodeb = 0; nodeb < node_count; nodeb++) {
> +			distance = dt_get_node_distance(nodea, nodeb);
> +			numa_set_distance(nodea, nodeb, distance);
> +		}
> +	}
> +	return 0;
> +}
> +EXPORT_SYMBOL(arm64_dt_numa_init);

No need to export functions that are called only by architecture code.
Since this works on the flattened device tree format, you can never
have loadable modules calling it.

> @@ -461,7 +464,12 @@ static int c_show(struct seq_file *m, void *v)
>  		 * "processor".  Give glibc what it expects.
>  		 */
>  #ifdef CONFIG_SMP
> +	if (IS_ENABLED(CONFIG_NUMA)) {
> +		seq_printf(m, "processor\t: %d", i);
> +		seq_printf(m, " [nid: %d]\n", cpu_to_node(i));
> +	} else {
>  		seq_printf(m, "processor\t: %d\n", i);
> +	}
>  #endif
>  	}

Do we need to make this conditional? I think we can just always
print the node number, even if it's going to be zero for systems
without the associativity properties.
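
Roughly, the unconditional version could format the line as in the sketch
below. This is userspace C using snprintf() in place of seq_printf(), and
format_processor_line() is a hypothetical name; in the kernel the nid
argument would come from cpu_to_node(i).

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: always emit the node id, which is simply 0 on
 * systems that carry no associativity properties.
 */
static int format_processor_line(char *buf, size_t len, int cpu, int nid)
{
	return snprintf(buf, len, "processor\t: %d [nid: %d]\n", cpu, nid);
}
```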

> +
> +int cpu_to_node_map[NR_CPUS];
> +EXPORT_SYMBOL(cpu_to_node_map);

This seems to be x86 specific, do we need it?

> +/*
> + *  Set the cpu to node and mem mapping
> + */
> +void numa_store_cpu_info(int cpu)
> +{
> +#ifdef CONFIG_ARM64_DT_NUMA
> +	node_cpu_hwid[cpu].node_id  =  dt_get_cpu_node_id(cpu);
> +#endif

I would try to avoid the #ifdef here, by providing a stub function of
dt_get_cpu_node_id or whichever function we end up calling here when
NUMA is disabled.
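
A minimal sketch of the stub pattern, written as plain C so it builds
outside the kernel. dt_get_cpu_node_id() is the function from the patch;
returning NUMA_NO_NODE from the stub and the cpu_node_or_default() caller
are assumptions made for illustration only.

```c
#define NUMA_NO_NODE (-1)

#ifdef CONFIG_ARM64_DT_NUMA
/* Real implementation lives in the DT NUMA code. */
int dt_get_cpu_node_id(int cpu);
#else
/* Stub: with DT NUMA support compiled out, no CPU has a known node. */
static inline int dt_get_cpu_node_id(int cpu)
{
	(void)cpu;
	return NUMA_NO_NODE;
}
#endif

/* Hypothetical caller: no #ifdef needed, unknown nodes map to node 0. */
static int cpu_node_or_default(int cpu)
{
	int nid = dt_get_cpu_node_id(cpu);

	return nid == NUMA_NO_NODE ? 0 : nid;
}
```

The #ifdef then lives once in the header, and callers such as
numa_store_cpu_info() stay free of conditional compilation.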

> +
> +/**
> + * arm64_numa_init - Initialize NUMA
> + *
> + * Try each configured NUMA initialization method until one succeeds.  The
> + * last fallback is dummy single node config encompassing whole memory and
> + * never fails.
> + */
> +void __init arm64_numa_init(void)
> +{
> +	if (!numa_off) {
> +#ifdef CONFIG_ARM64_DT_NUMA
> +		if (!numa_init(arm64_dt_numa_init))
> +			return;
> +#endif
> +	}
> +
> +	numa_init(dummy_numa_init);
> +}

I don't think we need the CONFIG_ARM64_DT_NUMA=n case here, it should just
not be conditional, and the arm64_dt_numa_init should fall back to doing
something reasonable when numa is turned off or there are no associativity
properties.
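
The fallback decision could be as simple as the following userspace sketch.
sketch_numa_node_count() is a hypothetical name, and nr_dt_nodes stands for
however many nodes the associativity properties describe (zero when there
are none); neither exists in the patch.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: decide how many nodes to bring up. When NUMA is
 * turned off or the device tree describes no nodes, fall back to a
 * single node covering all memory.
 */
static int sketch_numa_node_count(bool numa_off, int nr_dt_nodes)
{
	if (numa_off || nr_dt_nodes <= 0)
		return 1;
	return nr_dt_nodes;
}
```

With this shape the init path has one unconditional call and the single
node case is just the degenerate result.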

	Arnd

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH v3 4/4] arm64:numa: adding numa support for arm64 platforms.
  2014-12-31  7:33   ` [RFC PATCH v3 4/4] arm64:numa: adding numa support for arm64 platforms Ganapatrao Kulkarni
@ 2015-01-02 21:10         ` Arnd Bergmann
  0 siblings, 0 replies; 43+ messages in thread
From: Arnd Bergmann @ 2015-01-02 21:10 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Will.Deacon-5wv7dgnIgG8, catalin.marinas-5wv7dgnIgG8,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	jchandra-dY08KVG/lbpWk0Htik3J/w, al.stone-QSEj5FYQhm4dnm+yROfE0A,
	gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w

[re-sent with correct mailing list address]

	Arnd

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [RFC PATCH v3 4/4] arm64:numa: adding numa support for arm64 platforms.
@ 2015-01-02 21:10         ` Arnd Bergmann
  0 siblings, 0 replies; 43+ messages in thread
From: Arnd Bergmann @ 2015-01-02 21:10 UTC (permalink / raw)
  To: linux-arm-kernel

[re-sent with correct mailing list address]

On Wednesday 31 December 2014 13:03:28 Ganapatrao Kulkarni wrote:
> Adding numa support for arm64 based platforms.
> Adding dt node parsing for numa topology using property arm,associativity.
> 
> Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@caviumnetworks.com>

Maybe the parts that are common with powerpc can be moved to drivers/of/numa.c?
We can always look for both arm,associativity and ibm,associativity, I don't
think we should be worried about any conflicts that way.

> +#define MAX_DISTANCE_REF_POINTS 4

I think we should use 8 here like powerpc, four levels might get exceeded
on complex SoCs.

> +int dt_get_cpu_node_id(int cpu)
> +{
> +	struct device_node *dn = NULL;
> +
> +	while ((dn = of_find_node_by_type(dn, "cpu"))) {
> +		const u32 *cell;
> +		u64 hwid;
> +
> +		/*
> +		 * A cpu node with missing "reg" property is
> +		 * considered invalid to build a cpu_logical_map
> +		 * entry.
> +		 */
> +		cell = of_get_property(dn, "reg", NULL);
> +		if (!cell) {
> +			pr_err("%s: missing reg property\n", dn->full_name);
> +			return default_nid;
> +		}
> +		hwid = of_read_number(cell, of_n_addr_cells(dn));
> +
> +		if (cpu_logical_map(cpu) == hwid)
> +		return of_node_to_nid_single(dn);
> +	}
> +	return NUMA_NO_NODE;
> +}
> +EXPORT_SYMBOL(dt_get_cpu_node_id);

Maybe just expose a function that returns the device node for a CPU ID here,
and expect callers to use of_node_to_nid?
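
One possible shape for that split, as a self-contained userspace sketch
rather than kernel code. cpu_device_node() is a hypothetical helper name, and
struct device_node, node_to_nid() and the two tables are simplified stand-ins
for the kernel's device tree structures, of_node_to_nid() and
cpu_logical_map(); none of them are taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define NUMA_NO_NODE (-1)

/* Simplified stand-in for the kernel's struct device_node. */
struct device_node {
	uint64_t hwid;	/* value of the "reg" property */
	int nid;	/* node id derived from the associativity property */
};

/* Hypothetical device tree contents: two CPUs on two nodes. */
struct device_node cpu_nodes[] = {
	{ .hwid = 0x000, .nid = 0 },
	{ .hwid = 0x100, .nid = 1 },
};

/* Hypothetical logical-to-hardware CPU id table. */
const uint64_t logical_map[] = { 0x000, 0x100 };

/* Return the device node for a logical CPU, or NULL if there is none. */
struct device_node *cpu_device_node(int cpu)
{
	size_t ncpus = sizeof(logical_map) / sizeof(logical_map[0]);
	size_t nnodes = sizeof(cpu_nodes) / sizeof(cpu_nodes[0]);
	size_t i;

	if (cpu < 0 || (size_t)cpu >= ncpus)
		return NULL;

	for (i = 0; i < nnodes; i++)
		if (cpu_nodes[i].hwid == logical_map[cpu])
			return &cpu_nodes[i];
	return NULL;
}

/* Stand-in for of_node_to_nid(): the caller performs the nid lookup. */
int node_to_nid(const struct device_node *dn)
{
	return dn ? dn->nid : NUMA_NO_NODE;
}
```

A caller would then write node_to_nid(cpu_device_node(cpu)) instead of
relying on a NUMA-specific dt_get_cpu_node_id().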

> +
> +/**
> + * early_init_dt_scan_numa_map - parse memory node and map nid to memory range.
> + */
> +int __init early_init_dt_scan_numa_map(unsigned long node, const char *uname,
> +				     int depth, void *data)
> +{
> +	const char *type = of_get_flat_dt_prop(node, "device_type", NULL);
> +
> +	/* We are scanning "numa-map" nodes only */

Is this a stale comment?

> +/* DT node mapping is done already early_init_dt_scan_memory */
> +int __init arm64_dt_numa_init(void)
> +{
> +	int i;
> +	u32 nodea, nodeb, distance, node_count = 0;
> +
> +	of_scan_flat_dt(early_init_dt_scan_numa_map, NULL);
> +
> +	for_each_node_mask(i, numa_nodes_parsed)
> +		node_count = i;
> +	node_count++;
> +
> +	for (nodea =  0; nodea < node_count; nodea++) {
> +		for (nodeb = 0; nodeb < node_count; nodeb++) {
> +			distance = dt_get_node_distance(nodea, nodeb);
> +			numa_set_distance(nodea, nodeb, distance);
> +		}
> +	}
> +	return 0;
> +}
> +EXPORT_SYMBOL(arm64_dt_numa_init);

No need to export functions that are called only by architecture code.
Since this works on the flattened device tree format, you can never
have loadable modules calling it.

> @@ -461,7 +464,12 @@ static int c_show(struct seq_file *m, void *v)
>  		 * "processor".  Give glibc what it expects.
>  		 */
>  #ifdef CONFIG_SMP
> +	if (IS_ENABLED(CONFIG_NUMA)) {
> +		seq_printf(m, "processor\t: %d", i);
> +		seq_printf(m, " [nid: %d]\n", cpu_to_node(i));
> +	} else {
>  		seq_printf(m, "processor\t: %d\n", i);
> +	}
>  #endif
>  	}

Do we need to make this conditional? I think we can just always
print the node number, even if it's going to be zero for systems
without the associativity properties.
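
A minimal userspace sketch of the unconditional variant. The helper name
format_processor_line() is hypothetical, and snprintf() stands in for the
kernel's seq_printf(); the output format is the one from the quoted hunk.

```c
#include <stdio.h>

/*
 * Format one "processor" line with the node id always appended.
 * Returns the number of characters written, as snprintf() does.
 */
int format_processor_line(char *buf, size_t len, int cpu, int nid)
{
	return snprintf(buf, len, "processor\t: %d [nid: %d]\n", cpu, nid);
}
```

On a system without associativity properties the caller would simply pass
node 0.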

> +
> +int cpu_to_node_map[NR_CPUS];
> +EXPORT_SYMBOL(cpu_to_node_map);

This seems to be x86 specific, do we need it?

> +/*
> + *  Set the cpu to node and mem mapping
> + */
> +void numa_store_cpu_info(int cpu)
> +{
> +#ifdef CONFIG_ARM64_DT_NUMA
> +	node_cpu_hwid[cpu].node_id  =  dt_get_cpu_node_id(cpu);
> +#endif

I would try to avoid the #ifdef here, by providing a stub function of
dt_get_cpu_node_id or whichever function we end up calling here when
NUMA is disabled.
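
A sketch of the stub pattern, written so it compiles on its own. The return
value of the stub (NUMA_NO_NODE) and the simplified node_cpu_hwid[] table
are assumptions for illustration, not what the patch has to use.

```c
#define NR_CPUS 4
#define NUMA_NO_NODE (-1)

#ifdef CONFIG_ARM64_DT_NUMA
/* Real lookup, provided by the device tree NUMA code. */
int dt_get_cpu_node_id(int cpu);
#else
/* Stub used when DT NUMA support is compiled out (return value assumed). */
static inline int dt_get_cpu_node_id(int cpu)
{
	(void)cpu;
	return NUMA_NO_NODE;
}
#endif

/* Simplified stand-in for the patch's node_cpu_hwid[] table. */
struct node_cpu_info {
	int node_id;
};
struct node_cpu_info node_cpu_hwid[NR_CPUS];

/* The caller no longer needs an #ifdef of its own. */
void numa_store_cpu_info(int cpu)
{
	node_cpu_hwid[cpu].node_id = dt_get_cpu_node_id(cpu);
}
```

The #ifdef then lives in one header next to the declaration, and every call
site stays unconditional.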

> +
> +/**
> + * arm64_numa_init - Initialize NUMA
> + *
> + * Try each configured NUMA initialization method until one succeeds.  The
> + * last fallback is dummy single node config encomapssing whole memory and
> + * never fails.
> + */
> +void __init arm64_numa_init(void)
> +{
> +	if (!numa_off) {
> +#ifdef CONFIG_ARM64_DT_NUMA
> +		if (!numa_init(arm64_dt_numa_init))
> +			return;
> +#endif
> +	}
> +
> +	numa_init(dummy_numa_init);
> +}

I don't think we need the CONFIG_ARM64_DT_NUMA=n case here, it should just
not be conditional, and the arm64_dt_numa_init should fall back to doing
something reasonable when numa is turned off or there are no associativity
properties.
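
Roughly what that could look like, as a userspace model rather than kernel
code. The state variables and the error code are assumptions; only the
function names come from the quoted patch.

```c
#include <errno.h>
#include <stdbool.h>

bool numa_off;		/* set by a "numa=off" style option */
int nodes_parsed;	/* nodes found via associativity properties */
int nr_online_nodes;	/* result of whichever method succeeded */

/* DT method: declines on its own when it has nothing to offer. */
static int arm64_dt_numa_init(void)
{
	if (numa_off || nodes_parsed == 0)
		return -EINVAL;
	nr_online_nodes = nodes_parsed;
	return 0;
}

/* Fallback: one node spanning all memory; never fails. */
static int dummy_numa_init(void)
{
	nr_online_nodes = 1;
	return 0;
}

/* Stand-in for numa_init(): run one method and report its result. */
static int numa_init(int (*init_func)(void))
{
	return init_func();
}

/* No #ifdef and no numa_off check are needed at this level. */
void arm64_numa_init(void)
{
	if (!numa_init(arm64_dt_numa_init))
		return;
	numa_init(dummy_numa_init);
}
```

The decision about whether DT NUMA applies moves into arm64_dt_numa_init()
itself, so the top-level function always has the same two steps.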

	Arnd

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH v3 2/4] Documentation: arm64/arm: dt bindings for numa.
  2014-12-31  7:33   ` [RFC PATCH v3 2/4] Documentation: arm64/arm: dt bindings for numa Ganapatrao Kulkarni
@ 2015-01-02 21:17         ` Arnd Bergmann
  0 siblings, 0 replies; 43+ messages in thread
From: Arnd Bergmann @ 2015-01-02 21:17 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Will.Deacon-5wv7dgnIgG8, catalin.marinas-5wv7dgnIgG8,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	jchandra-dY08KVG/lbpWk0Htik3J/w, al.stone-QSEj5FYQhm4dnm+yROfE0A,
	gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w

On Wednesday 31 December 2014 13:03:26 Ganapatrao Kulkarni wrote:
> DT bindings for numa map for memory, cores and IOs using arm,associativity
> device node property.
> 
> Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@caviumnetworks.com>
> ---
>  Documentation/devicetree/bindings/arm/numa.txt | 198 +++++++++++++++++++++++++
>  1 file changed, 198 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
> 
> diff --git a/Documentation/devicetree/bindings/arm/numa.txt b/Documentation/devicetree/bindings/arm/numa.txt
> new file mode 100644
> index 0000000..4f51e25
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/numa.txt
> @@ -0,0 +1,198 @@
> +==============================================================================
> +NUMA binding description.
> +==============================================================================
> +
> +==============================================================================
> +1 - Introduction
> +==============================================================================
> +
> +Systems employing a Non Uniform Memory Access (NUMA) architecture contain
> +collections of hardware resources including processors, memory, and I/O buses,
> +that comprise what is commonly known as a “NUMA node”.
> +Processor accesses to memory within the local NUMA node is generally faster
> +than processor accesses to memory outside of the local NUMA node.
> +DT defines interfaces that allow the platform to convey NUMA node
> +topology information to OS.
> +
> +==============================================================================
> +2 - arm,associativity
> +==============================================================================
> +
> +The mapping is done using arm,associativity device property.
> +this property needs to be present in every device node which needs to to be
> +mapped to numa nodes.
> +
> +arm,associativity property is set of 32-bit integers. representing the
> +board id, socket id and core id.
> +
> +ex:
> +	/* board 0, socket 0, core 0 */
> +	arm,associativity = <0 0 0x000>;
> +
> +	/* board 1, socket 0, core 8 */
> +	arm,associativity = <1 0 0x08>;

This is way too specific to Cavium machines. Most other vendors will not (at first)
have multiple boards or multiple sockets, but need to represent multiple clusters
and/or SMT threads instead. Also the wording suggests that this is only relevant
for NUMA, which I don't think is helpful because we will also want to describe
the topology within one NUMA node for locality.

I think we should stick to the powerpc definition here and not define what the
levels mean at the binding level. Something like:

"Each level of topology defines a boundary in the system at which a significant
difference in performance can be measured between cross-device accesses within
a single location and those spanning multiple locations. The first cell always
contains the broadest subdivision within the system, while the last cell enumerates
the individual devices, such as an SMT thread of a CPU, or a bus bridge within
an SoC".

> +==============================================================================
> +3 - arm,associativity-reference-points
> +==============================================================================
> +This property is a set of 32-bit integers, each representing an index into
> +the arm,associativity nodes. The first integer is the most significant
> +NUMA boundary and the following are progressively less significant boundaries.
> +There can be more than one level of NUMA.
> +
> +Ex:
> +	arm,associativity-reference-points = <0 1>;
> +	The board Id(index 0) used first to calculate the associativity (node
> +	distance), then follows the  socket id(index 1).
> +
> +	arm,associativity-reference-points = <1 0>;
> +	The socket Id(index 1) used first to calculate the associativity,
> +	then follows the board id(index 0).
> +
> +	arm,associativity-reference-points = <0>;
> +	Only the board Id(index 0) used to calculate the associativity.
> +
> +	arm,associativity-reference-points = <1>;
> +	Only socket Id(index 1) used to calculate the associativity.
> +
> +==============================================================================
> +4 - Example dts
> +==============================================================================
> +
> +Example: 2 Node system consists of 2 boards and each board having one socket
> +and 8 core in each socket.

I think the example should also include a PCI controller.

> +
> +	arm,associativity-reference-points = <0 1>;

This doesn't really match the associativity properties, because the
second level in the cpus nodes is completely meaningless and should
not be listed as a secondary reference point.
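
To illustrate, here is a small model of how reference points could turn two
associativity arrays into a distance. It is loosely modeled on the doubling
scheme used by the powerpc NUMA code and is an assumption about the intended
semantics, not the patch's dt_get_node_distance(); assoc_distance() is a
hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

#define LOCAL_DISTANCE 10

/*
 * Walk the reference points in order. Each point is an index into the
 * associativity arrays; every level at which the two devices differ
 * doubles the distance, and the first matching level ends the walk.
 */
int assoc_distance(const uint32_t *a, const uint32_t *b,
		   const uint32_t *ref_points, size_t nr_ref_points)
{
	int distance = LOCAL_DISTANCE;
	size_t i;

	for (i = 0; i < nr_ref_points; i++) {
		/* Same domain at this level: stop counting boundaries. */
		if (a[ref_points[i]] == b[ref_points[i]])
			break;
		/* A boundary is crossed: double the distance. */
		distance *= 2;
	}
	return distance;
}
```

In this model, a CPU with <0 0 0x00f> and memory with <1 0 0xffff> are at
distance 20 with reference points <0> and also with <0 1>, because the second
cell is 0 everywhere and so never contributes.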

	Arnd

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [RFC PATCH v3 2/4] Documentation: arm64/arm: dt bindings for numa.
@ 2015-01-02 21:17         ` Arnd Bergmann
  0 siblings, 0 replies; 43+ messages in thread
From: Arnd Bergmann @ 2015-01-02 21:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 31 December 2014 13:03:26 Ganapatrao Kulkarni wrote:
> DT bindings for numa map for memory, cores and IOs using arm,associativity
> device node property.
> 
> Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@caviumnetworks.com>
> ---
>  Documentation/devicetree/bindings/arm/numa.txt | 198 +++++++++++++++++++++++++
>  1 file changed, 198 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
> 
> diff --git a/Documentation/devicetree/bindings/arm/numa.txt b/Documentation/devicetree/bindings/arm/numa.txt
> new file mode 100644
> index 0000000..4f51e25
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/numa.txt
> @@ -0,0 +1,198 @@
> +==============================================================================
> +NUMA binding description.
> +==============================================================================
> +
> +==============================================================================
> +1 - Introduction
> +==============================================================================
> +
> +Systems employing a Non Uniform Memory Access (NUMA) architecture contain
> +collections of hardware resources including processors, memory, and I/O buses,
> +that comprise what is commonly known as a “NUMA node”.
> +Processor accesses to memory within the local NUMA node is generally faster
> +than processor accesses to memory outside of the local NUMA node.
> +DT defines interfaces that allow the platform to convey NUMA node
> +topology information to OS.
> +
> +==============================================================================
> +2 - arm,associativity
> +==============================================================================
> +
> +The mapping is done using arm,associativity device property.
> +this property needs to be present in every device node which needs to to be
> +mapped to numa nodes.
> +
> +arm,associativity property is set of 32-bit integers. representing the
> +board id, socket id and core id.
> +
> +ex:
> +	/* board 0, socket 0, core 0 */
> +	arm,associativity = <0 0 0x000>;
> +
> +	/* board 1, socket 0, core 8 */
> +	arm,associativity = <1 0 0x08>;

This is way too specific to Cavium machines. Most other vendors will not (at first)
have multiple boards or multiple sockets, but need to represent multiple clusters
and/or SMT threads instead. Also the wording suggests that this is only relevant
for NUMA, which I don't think is helpful because we will also want to describe
the topology within one NUMA node for locality.

I think we should stick to the powerpc definition here and not define what the
levels mean at the binding level. Something like:

"Each level of topology defines a boundary in the system at which a significant
difference in performance can be measured between cross-device accesses within
a single location and those spanning multiple locations. The first cell always
contains the broadest subdivision within the system, while the last cell enumerates
the individual devices, such as an SMT thread of a CPU, or a bus bridge within
an SoC".

> +==============================================================================
> +3 - arm,associativity-reference-points
> +==============================================================================
> +This property is a set of 32-bit integers, each representing an index into
> +the arm,associativity nodes. The first integer is the most significant
> +NUMA boundary and the following are progressively less significant boundaries.
> +There can be more than one level of NUMA.
> +
> +Ex:
> +	arm,associativity-reference-points = <0 1>;
> +	The board Id(index 0) used first to calculate the associativity (node
> +	distance), then follows the  socket id(index 1).
> +
> +	arm,associativity-reference-points = <1 0>;
> +	The socket Id(index 1) used first to calculate the associativity,
> +	then follows the board id(index 0).
> +
> +	arm,associativity-reference-points = <0>;
> +	Only the board Id(index 0) used to calculate the associativity.
> +
> +	arm,associativity-reference-points = <1>;
> +	Only socket Id(index 1) used to calculate the associativity.
> +
> +==============================================================================
> +4 - Example dts
> +==============================================================================
> +
> +Example: 2 Node system consists of 2 boards and each board having one socket
> +and 8 core in each socket.

I think the example should also include a PCI controller.

> +
> +	arm,associativity-reference-points = <0 1>;

This doesn't really match the associativity properties, because the
second level in the cpus nodes is completely meaningless and should
not be listed as a secondary reference point.

	Arnd

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH v3 1/4] arm64: defconfig: increase NR_CPUS range to 2-4096.
  2014-12-31  7:33   ` [RFC PATCH v3 1/4] arm64: defconfig: increase NR_CPUS range to 2-4096 Ganapatrao Kulkarni
@ 2015-01-02 21:17         ` Arnd Bergmann
  0 siblings, 0 replies; 43+ messages in thread
From: Arnd Bergmann @ 2015-01-02 21:17 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Will.Deacon-5wv7dgnIgG8, catalin.marinas-5wv7dgnIgG8,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	jchandra-dY08KVG/lbpWk0Htik3J/w, al.stone-QSEj5FYQhm4dnm+yROfE0A,
	gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w

On Wednesday 31 December 2014 13:03:25 Ganapatrao Kulkarni wrote:
> Raising the maximum limit to 4096.
> This is to accomadate up-coming higher multi-core platforms.
> 
> Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>

Acked-by: Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [RFC PATCH v3 1/4] arm64: defconfig: increase NR_CPUS range to 2-4096.
@ 2015-01-02 21:17         ` Arnd Bergmann
  0 siblings, 0 replies; 43+ messages in thread
From: Arnd Bergmann @ 2015-01-02 21:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 31 December 2014 13:03:25 Ganapatrao Kulkarni wrote:
> Raising the maximum limit to 4096.
> This is to accomadate up-coming higher multi-core platforms.
> 
> Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@caviumnetworks.com>

Acked-by: Arnd Bergmann <arnd@arndb.de>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
  2014-12-31  7:33   ` [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology Ganapatrao Kulkarni
@ 2015-01-02 21:17         ` Arnd Bergmann
  0 siblings, 0 replies; 43+ messages in thread
From: Arnd Bergmann @ 2015-01-02 21:17 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Will.Deacon-5wv7dgnIgG8, catalin.marinas-5wv7dgnIgG8,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	jchandra-dY08KVG/lbpWk0Htik3J/w, al.stone-QSEj5FYQhm4dnm+yROfE0A,
	gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w

On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
> +
> +	memory@00000000 {
> +		device_type = "memory";
> +		reg = <0x0 0x00000000 0x0 0x80000000>;
> +		/* board 0, socket 0, no specific core */
> +		arm,associativity = <0 0 0xffff>;
> +	};
> +
> +	memory@10000000000 {
> +		device_type = "memory";
> +		reg = <0x100 0x00000000 0x0 0x80000000>;
> +		/* board 1, socket 0, no specific core */
> +		arm,associativity = <1 0 0xffff>;
> +	};
> +};

So no memory in any other socket?

> +		cpu@00f {
> +			device_type = "cpu";
> +			compatible = "cavium,thunder", "arm,armv8";
> +			reg = <0x0 0x00f>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 0x00f>;
> +		};
> +		cpu@100 {
> +			device_type = "cpu";
> +			compatible = "cavium,thunder", "arm,armv8";
> +			reg = <0x0 0x100>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 0x100>;
> +		};

What is the 0x100 offset in the last-level topology field? Does this have
no significance to topology at all? I would expect that to be something
like cluster number that is relevant to caching and should be represented
as a separate level.

In contrast, the level-two topology information seems to always be
zero for all CPUs, so you could probably leave that one out.
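
A sketch of how both points could be combined. It assumes the cpu "reg"
value carries the MPIDR_EL1 affinity fields (Aff1 in bits [15:8]) and that
Aff1 is a cluster number on this SoC; the second assumption is unconfirmed,
and hwid_to_associativity() is a hypothetical helper.

```c
#include <stdint.h>

/* MPIDR_EL1 affinity level 1 occupies bits [15:8]. */
#define MPIDR_AFF1(hwid) ((uint32_t)(((hwid) >> 8) & 0xff))

/*
 * Fill a three-cell associativity array <board cluster cpu>.
 * The always-zero socket level is dropped, the cluster (assumed to be
 * Aff1) becomes its own level, and the last cell keeps the full hardware
 * id so that it still identifies the individual CPU.
 */
void hwid_to_associativity(uint32_t board, uint64_t hwid, uint32_t assoc[3])
{
	assoc[0] = board;
	assoc[1] = MPIDR_AFF1(hwid);
	assoc[2] = (uint32_t)hwid;
}
```

Under these assumptions cpu@100 on board 0 would be described as
<0 1 0x100> rather than <0 0 0x100>.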

> +	soc {
> +		compatible = "simple-bus";
> +		#address-cells = <2>;
> +		#size-cells = <2>;
> +		ranges;

The soc node is missing topology information, please add it.

	Arnd

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
@ 2015-01-02 21:17         ` Arnd Bergmann
  0 siblings, 0 replies; 43+ messages in thread
From: Arnd Bergmann @ 2015-01-02 21:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
> +
> +	memory@00000000 {
> +		device_type = "memory";
> +		reg = <0x0 0x00000000 0x0 0x80000000>;
> +		/* board 0, socket 0, no specific core */
> +		arm,associativity = <0 0 0xffff>;
> +	};
> +
> +	memory@10000000000 {
> +		device_type = "memory";
> +		reg = <0x100 0x00000000 0x0 0x80000000>;
> +		/* board 1, socket 0, no specific core */
> +		arm,associativity = <1 0 0xffff>;
> +	};
> +};

So no memory in any other socket?

> +		cpu@00f {
> +			device_type = "cpu";
> +			compatible = "cavium,thunder", "arm,armv8";
> +			reg = <0x0 0x00f>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 0x00f>;
> +		};
> +		cpu@100 {
> +			device_type = "cpu";
> +			compatible = "cavium,thunder", "arm,armv8";
> +			reg = <0x0 0x100>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 0x100>;
> +		};

What is the 0x100 offset in the last-level topology field? Does this have
no significance to topology at all? I would expect that to be something
like cluster number that is relevant to caching and should be represented
as a separate level.

In contrast, the level-two topology information seems to always be
zero for all CPUs, so you could probably leave that one out.

> +	soc {
> +		compatible = "simple-bus";
> +		#address-cells = <2>;
> +		#size-cells = <2>;
> +		ranges;

The soc node is missing topology information, please add it.

	Arnd

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH v3 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-01-02 21:17         ` Arnd Bergmann
@ 2015-01-06  5:28           ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 43+ messages in thread
From: Ganapatrao Kulkarni @ 2015-01-06  5:28 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Will Deacon,
	Catalin Marinas, Grant Likely, devicetree-u79uwXL29TY76Z2rM5mHXA,
	Leif Lindholm, Roy Franz, Ard Biesheuvel,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, Rob Herring, Steve Capper,
	Hanjun Guo, jchandra-dY08KVG/lbpWk0Htik3J/w, Al Stone

Hi Arnd,


On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org> wrote:
> On Wednesday 31 December 2014 13:03:26 Ganapatrao Kulkarni wrote:
>> DT bindings for numa map for memory, cores and IOs using arm,associativity
>> device node property.
>>
>> Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@caviumnetworks.com>
>> ---
>>  Documentation/devicetree/bindings/arm/numa.txt | 198 +++++++++++++++++++++++++
>>  1 file changed, 198 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
>>
>> diff --git a/Documentation/devicetree/bindings/arm/numa.txt b/Documentation/devicetree/bindings/arm/numa.txt
>> new file mode 100644
>> index 0000000..4f51e25
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/arm/numa.txt
>> @@ -0,0 +1,198 @@
>> +==============================================================================
>> +NUMA binding description.
>> +==============================================================================
>> +
>> +==============================================================================
>> +1 - Introduction
>> +==============================================================================
>> +
>> +Systems employing a Non Uniform Memory Access (NUMA) architecture contain
>> +collections of hardware resources including processors, memory, and I/O buses,
>> +that comprise what is commonly known as a “NUMA node”.
>> +Processor accesses to memory within the local NUMA node is generally faster
>> +than processor accesses to memory outside of the local NUMA node.
>> +DT defines interfaces that allow the platform to convey NUMA node
>> +topology information to OS.
>> +
>> +==============================================================================
>> +2 - arm,associativity
>> +==============================================================================
>> +
>> +The mapping is done using arm,associativity device property.
>> +this property needs to be present in every device node which needs to to be
>> +mapped to numa nodes.
>> +
>> +arm,associativity property is set of 32-bit integers. representing the
>> +board id, socket id and core id.
>> +
>> +ex:
>> +     /* board 0, socket 0, core 0 */
>> +     arm,associativity = <0 0 0x000>;
>> +
>> +     /* board 1, socket 0, core 8 */
>> +     arm,associativity = <1 0 0x08>;
>
> This is way too specific to Cavium machines. Most other vendors will not (at first)
> have multiple boards or multiple sockets, but need to represent multiple clusters
> and/or SMT threads instead. Also the wording suggests that this is only relevant
> for NUMA, which I don't think is helpful because we will also want to describe
> the topology within one NUMA node for locality.
>
> I think we should stick to the powerpc definition here and not define what the
> levels mean at the binding level. Something like:
>
> "Each level of topology defines a boundary in the system at which a significant
> difference in performance can be measured between cross-device accesses within
> a single location and those spanning multiple locations. The first cell always
> contains the broadest subdivision within the system, while the last cell enumerates
> the individual devices, such as an SMT thread of a CPU, or a bus bridge within
> an SoC".
OK, I will change it as suggested.
>
>> +==============================================================================
>> +3 - arm,associativity-reference-points
>> +==============================================================================
>> +This property is a set of 32-bit integers, each representing an index into
>> +the arm,associativity nodes. The first integer is the most significant
>> +NUMA boundary and the following are progressively less significant boundaries.
>> +There can be more than one level of NUMA.
>> +
>> +Ex:
>> +     arm,associativity-reference-points = <0 1>;
>> +     The board Id(index 0) used first to calculate the associativity (node
>> +     distance), then follows the  socket id(index 1).
>> +
>> +     arm,associativity-reference-points = <1 0>;
>> +     The socket Id(index 1) used first to calculate the associativity,
>> +     then follows the board id(index 0).
>> +
>> +     arm,associativity-reference-points = <0>;
>> +     Only the board Id(index 0) used to calculate the associativity.
>> +
>> +     arm,associativity-reference-points = <1>;
>> +     Only socket Id(index 1) used to calculate the associativity.
>> +
>> +==============================================================================
>> +4 - Example dts
>> +==============================================================================
>> +
>> +Example: 2 Node system consists of 2 boards and each board having one socket
>> +and 8 core in each socket.
>
> I think the example should also include a PCI controller.
Yes, I will add PCI.
>
>> +
>> +     arm,associativity-reference-points = <0 1>;
>
> This doesn't really match the associativity properties, because the
> second level in the cpus nodes is completely meaningless and should
> not be listed as a secondary reference point.
Agreed, I will remove the second entry.
>
>         Arnd

thanks
ganapat

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [RFC PATCH v3 2/4] Documentation: arm64/arm: dt bindings for numa.
@ 2015-01-06  5:28           ` Ganapatrao Kulkarni
  0 siblings, 0 replies; 43+ messages in thread
From: Ganapatrao Kulkarni @ 2015-01-06  5:28 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Arnd,


On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Wednesday 31 December 2014 13:03:26 Ganapatrao Kulkarni wrote:
>> DT bindings for numa map for memory, cores and IOs using arm,associativity
>> device node property.
>>
>> Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@caviumnetworks.com>
>> ---
>>  Documentation/devicetree/bindings/arm/numa.txt | 198 +++++++++++++++++++++++++
>>  1 file changed, 198 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
>>
>> diff --git a/Documentation/devicetree/bindings/arm/numa.txt b/Documentation/devicetree/bindings/arm/numa.txt
>> new file mode 100644
>> index 0000000..4f51e25
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/arm/numa.txt
>> @@ -0,0 +1,198 @@
>> +==============================================================================
>> +NUMA binding description.
>> +==============================================================================
>> +
>> +==============================================================================
>> +1 - Introduction
>> +==============================================================================
>> +
>> +Systems employing a Non Uniform Memory Access (NUMA) architecture contain
>> +collections of hardware resources including processors, memory, and I/O buses,
>> +that comprise what is commonly known as a “NUMA node”.
>> +Processor accesses to memory within the local NUMA node is generally faster
>> +than processor accesses to memory outside of the local NUMA node.
>> +DT defines interfaces that allow the platform to convey NUMA node
>> +topology information to OS.
>> +
>> +==============================================================================
>> +2 - arm,associativity
>> +==============================================================================
>> +
>> +The mapping is done using arm,associativity device property.
>> +this property needs to be present in every device node which needs to to be
>> +mapped to numa nodes.
>> +
>> +arm,associativity property is set of 32-bit integers. representing the
>> +board id, socket id and core id.
>> +
>> +ex:
>> +     /* board 0, socket 0, core 0 */
>> +     arm,associativity = <0 0 0x000>;
>> +
>> +     /* board 1, socket 0, core 8 */
>> +     arm,associativity = <1 0 0x08>;
>
> This is way too specific to Cavium machines. Most other vendors will not (at first)
> have multiple boards or multiple sockets, but need to represent multiple clusters
> and/or SMT threads instead. Also the wording suggests that this is only relevant
> for NUMA, which I don't think is helpful because we will also want to describe
> the topology within one NUMA node for locality.
>
> I think we should stick to the powerpc definition here and not define what the
> levels mean at the binding level. Something like:
>
> "Each level of topology defines a boundary in the system at which a significant
> difference in performance can be measured between cross-device accesses within
> a single location and those spanning multiple locations. The first cell always
> contains the broadest subdivision within the system, while the last cell enumerates
> the individual devices, such as an SMT thread of a CPU, or a bus bridge within
> an SoC".
OK, I will change it as suggested.
>
>> +==============================================================================
>> +3 - arm,associativity-reference-points
>> +==============================================================================
>> +This property is a set of 32-bit integers, each representing an index into
>> +the arm,associativity nodes. The first integer is the most significant
>> +NUMA boundary and the following are progressively less significant boundaries.
>> +There can be more than one level of NUMA.
>> +
>> +Ex:
>> +     arm,associativity-reference-points = <0 1>;
>> +     The board Id(index 0) used first to calculate the associativity (node
>> +     distance), then follows the  socket id(index 1).
>> +
>> +     arm,associativity-reference-points = <1 0>;
>> +     The socket Id(index 1) used first to calculate the associativity,
>> +     then follows the board id(index 0).
>> +
>> +     arm,associativity-reference-points = <0>;
>> +     Only the board Id(index 0) used to calculate the associativity.
>> +
>> +     arm,associativity-reference-points = <1>;
>> +     Only socket Id(index 1) used to calculate the associativity.
>> +
>> +==============================================================================
>> +4 - Example dts
>> +==============================================================================
>> +
>> +Example: 2 Node system consists of 2 boards and each board having one socket
>> +and 8 core in each socket.
>
> I think the example should also include a PCI controller.
Yes, I will add PCI.
>
>> +
>> +     arm,associativity-reference-points = <0 1>;
>
> This doesn't really match the associativity properties, because the
> second level in the cpus nodes is completely meaningless and should
> not be listed as a secondary reference point.
Agreed, I will remove the second entry.
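
To make the role of the reference points concrete, here is a minimal sketch of
one way a distance could be derived from two associativity arrays. It is not
the patch's dt_get_node_distance(), whose body is not shown in this thread; the
names struct assoc and assoc_distance(), and the rule "the more significant the
first differing boundary, the larger the distance", are assumptions made for
illustration only.

```c
#include <stddef.h>

#define LOCAL_DISTANCE 10	/* conventional distance of a node to itself */
#define MAX_LEVELS 8		/* hypothetical upper bound on associativity cells */

/* Hypothetical container for the cells of one arm,associativity property. */
struct assoc {
	unsigned int cell[MAX_LEVELS];
};

/*
 * Walk the reference points from most to least significant and find the
 * first boundary at which the two devices differ. A difference at
 * reference point i (out of nr_ref) yields LOCAL_DISTANCE << (nr_ref - i),
 * so a more significant boundary gives a larger distance. If no listed
 * boundary differs, the devices are treated as local to each other.
 */
int assoc_distance(const struct assoc *a, const struct assoc *b,
		   const unsigned int *ref_points, size_t nr_ref)
{
	size_t i;

	for (i = 0; i < nr_ref; i++) {
		unsigned int idx = ref_points[i];

		if (idx >= MAX_LEVELS)
			continue;	/* ignore out-of-range indices */
		if (a->cell[idx] != b->cell[idx])
			return LOCAL_DISTANCE << (nr_ref - i);
	}
	return LOCAL_DISTANCE;
}
```

With reference points <0 1>, two devices on different boards would get 40,
two devices on the same board but different sockets would get 20, and anything
else would get 10. With a single reference point <0>, only the board boundary
contributes.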
>
>         Arnd

thanks
ganapat

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH v3 4/4] arm64:numa: adding numa support for arm64 platforms.
  2015-01-02 21:10         ` Arnd Bergmann
@ 2015-01-06  9:25           ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 43+ messages in thread
From: Ganapatrao Kulkarni @ 2015-01-06  9:25 UTC (permalink / raw)
  To: Arnd Bergmann, Shannon Zhao
  Cc: devicetree, Steve Capper, Al Stone, Ard Biesheuvel,
	Catalin Marinas, Will Deacon, Leif Lindholm, Roy Franz,
	Rob Herring, Ganapatrao Kulkarni, msalter, Grant Likely,
	jchandra, linux-arm-kernel, Hanjun Guo

Hi Arnd,


On Sat, Jan 3, 2015 at 2:40 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> [re-sent with correct mailing list address]
>
> On Wednesday 31 December 2014 13:03:28 Ganapatrao Kulkarni wrote:
>> Adding numa support for arm64 based platforms.
>> Adding dt node parsing for numa topology using property arm,associativity.
>>
>> Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@caviumnetworks.com>
>
> Maybe the parts that are common with powerpc can be moved to drivers/of/numa.c?
> We can always look for both arm,associativity and ibm,associativity, I don't
> think we should be worried about any conflicts that way.
OK, I will move the functions common to powerpc and arm64 into drivers/of/numa.c.
>
>> +#define MAX_DISTANCE_REF_POINTS 4
>
> I think we should use 8 here like powerpc, four levels might get exceeded
> on complex SoCs.
sure.
>
>> +int dt_get_cpu_node_id(int cpu)
>> +{
>> +     struct device_node *dn = NULL;
>> +
>> +     while ((dn = of_find_node_by_type(dn, "cpu"))) {
>> +             const u32 *cell;
>> +             u64 hwid;
>> +
>> +             /*
>> +              * A cpu node with missing "reg" property is
>> +              * considered invalid to build a cpu_logical_map
>> +              * entry.
>> +              */
>> +             cell = of_get_property(dn, "reg", NULL);
>> +             if (!cell) {
>> +                     pr_err("%s: missing reg property\n", dn->full_name);
>> +                     return default_nid;
>> +             }
>> +             hwid = of_read_number(cell, of_n_addr_cells(dn));
>> +
>> +             if (cpu_logical_map(cpu) == hwid)
>> +                     return of_node_to_nid_single(dn);
>> +     }
>> +     return NUMA_NO_NODE;
>> +}
>> +EXPORT_SYMBOL(dt_get_cpu_node_id);
>
> Maybe just expose a function to the device node for a CPU ID here, and
> expect callers to use of_node_to_nid?
Shall I make this a wrapper function in dt_numa.c that uses the
functions _of_node_to_nid and _of_cpu_to_node(cpu)?
This function could then be a weak function in numa.c which returns 0.
>
>> +
>> +/**
>> + * early_init_dt_scan_numa_map - parse memory node and map nid to memory range.
>> + */
>> +int __init early_init_dt_scan_numa_map(unsigned long node, const char *uname,
>> +                                  int depth, void *data)
>> +{
>> +     const char *type = of_get_flat_dt_prop(node, "device_type", NULL);
>> +
>> +     /* We are scanning "numa-map" nodes only */
>
> a stale comment?
oops, will remove.
>
>> +/* DT node mapping is done already early_init_dt_scan_memory */
>> +int __init arm64_dt_numa_init(void)
>> +{
>> +     int i;
>> +     u32 nodea, nodeb, distance, node_count = 0;
>> +
>> +     of_scan_flat_dt(early_init_dt_scan_numa_map, NULL);
>> +
>> +     for_each_node_mask(i, numa_nodes_parsed)
>> +             node_count = i;
>> +     node_count++;
>> +
>> +     for (nodea =  0; nodea < node_count; nodea++) {
>> +             for (nodeb = 0; nodeb < node_count; nodeb++) {
>> +                     distance = dt_get_node_distance(nodea, nodeb);
>> +                     numa_set_distance(nodea, nodeb, distance);
>> +             }
>> +     }
>> +     return 0;
>> +}
>> +EXPORT_SYMBOL(arm64_dt_numa_init);
>
> No need to export functions that are called only by architecture code.
> Since this works on the flattened device tree format, you can never
> have loadable modules calling it.
yes, will do.
>
>> @@ -461,7 +464,12 @@ static int c_show(struct seq_file *m, void *v)
>>                * "processor".  Give glibc what it expects.
>>                */
>>  #ifdef CONFIG_SMP
>> +     if (IS_ENABLED(CONFIG_NUMA)) {
>> +             seq_printf(m, "processor\t: %d", i);
>> +             seq_printf(m, " [nid: %d]\n", cpu_to_node(i));
>> +     } else {
>>               seq_printf(m, "processor\t: %d\n", i);
>> +     }
>>  #endif
>>       }
>
> Do we need to make this conditional? I think we can just always
> print the node number, even if it's going to be zero for systems
> without the associativity properties.
yes, we can.
>
>> +
>> +int cpu_to_node_map[NR_CPUS];
>> +EXPORT_SYMBOL(cpu_to_node_map);
>
> This seems to be x86 specific, do we need it?
>
>> +/*
>> + *  Set the cpu to node and mem mapping
>> + */
>> +void numa_store_cpu_info(int cpu)
>> +{
>> +#ifdef CONFIG_ARM64_DT_NUMA
>> +     node_cpu_hwid[cpu].node_id  =  dt_get_cpu_node_id(cpu);
>> +#endif
>
> I would try to avoid the #ifdef here, by providing a stub function of
> dt_get_cpu_node_id or whichever function we end up calling here when
> NUMA is disabled.
as commented above.
>
>> +
>> +/**
>> + * arm64_numa_init - Initialize NUMA
>> + *
>> + * Try each configured NUMA initialization method until one succeeds.  The
>> + * last fallback is dummy single node config encompassing whole memory and
>> + * never fails.
>> + */
>> +void __init arm64_numa_init(void)
>> +{
>> +     if (!numa_off) {
>> +#ifdef CONFIG_ARM64_DT_NUMA
>> +             if (!numa_init(arm64_dt_numa_init))
>> +                     return;
>> +#endif
>> +     }
>> +
>> +     numa_init(dummy_numa_init);
>> +}
>
> I don't think we need the CONFIG_ARM64_DT_NUMA=n case here, it should just
> not be conditional, and the arm64_dt_numa_init should fall back to doing
> something reasonable when numa is turned off or there are no associativity
> properties.
I think we can remove the #ifdef; I will do that.
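
As a rough illustration of the unconditional form, the sketch below tries the
DT-based method first and falls back to a single dummy node when NUMA is
switched off or the DT carries no associativity properties. Every name ending
in _sketch, and the dt_has_associativity flag, is a hypothetical stand-in and
not taken from the patch.

```c
#include <errno.h>
#include <stdbool.h>

bool numa_off_sketch;		/* stand-in for a "numa=off" style switch */
bool dt_has_associativity;	/* stand-in for "DT has arm,associativity" */
int nr_online_nodes_sketch;	/* result of whichever method succeeded */

/* DT method: fails cleanly when there is nothing to parse. */
static int dt_numa_init_sketch(void)
{
	if (!dt_has_associativity)
		return -ENODEV;
	nr_online_nodes_sketch = 2;	/* pretend two nodes were found */
	return 0;
}

/* Last resort: one node covering all memory; never fails. */
static int dummy_numa_init_sketch(void)
{
	nr_online_nodes_sketch = 1;
	return 0;
}

/* Run one initialization method and report its result. */
static int numa_init_sketch(int (*init_func)(void))
{
	return init_func();
}

/* No #ifdef: the DT method is always tried unless NUMA is off. */
void arch_numa_init_sketch(void)
{
	if (!numa_off_sketch && !numa_init_sketch(dt_numa_init_sketch))
		return;

	numa_init_sketch(dummy_numa_init_sketch);
}
```

The point of the shape is that the "no NUMA information" case is handled by the
DT method returning an error at run time, rather than by compiling the call
out.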
>
>         Arnd

thanks
ganapat

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [RFC PATCH v3 4/4] arm64:numa: adding numa support for arm64 platforms.
@ 2015-01-06  9:25           ` Ganapatrao Kulkarni
  0 siblings, 0 replies; 43+ messages in thread
From: Ganapatrao Kulkarni @ 2015-01-06  9:25 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Arnd,


On Sat, Jan 3, 2015 at 2:40 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> [re-sent with correct mailing list address]
>
> On Wednesday 31 December 2014 13:03:28 Ganapatrao Kulkarni wrote:
>> Adding numa support for arm64 based platforms.
>> Adding dt node parsing for numa topology using property arm,associativity.
>>
>> Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@caviumnetworks.com>
>
> Maybe the parts that are common with powerpc can be moved to drivers/of/numa.c?
> We can always look for both arm,associativity and ibm,associativity, I don't
> think we should be worried about any conflicts that way.
OK, I will move the functions common to powerpc and arm64 into drivers/of/numa.c.
>
>> +#define MAX_DISTANCE_REF_POINTS 4
>
> I think we should use 8 here like powerpc, four levels might get exceeded
> on complex SoCs.
sure.
>
>> +int dt_get_cpu_node_id(int cpu)
>> +{
>> +     struct device_node *dn = NULL;
>> +
>> +     while ((dn = of_find_node_by_type(dn, "cpu"))) {
>> +             const u32 *cell;
>> +             u64 hwid;
>> +
>> +             /*
>> +              * A cpu node with missing "reg" property is
>> +              * considered invalid to build a cpu_logical_map
>> +              * entry.
>> +              */
>> +             cell = of_get_property(dn, "reg", NULL);
>> +             if (!cell) {
>> +                     pr_err("%s: missing reg property\n", dn->full_name);
>> +                     return default_nid;
>> +             }
>> +             hwid = of_read_number(cell, of_n_addr_cells(dn));
>> +
>> +             if (cpu_logical_map(cpu) == hwid)
>> +                     return of_node_to_nid_single(dn);
>> +     }
>> +     return NUMA_NO_NODE;
>> +}
>> +EXPORT_SYMBOL(dt_get_cpu_node_id);
>
> Maybe just expose a function to the device node for a CPU ID here, and
> expect callers to use of_node_to_nid?
Shall I make this a wrapper function in dt_numa.c that uses the
functions _of_node_to_nid and _of_cpu_to_node(cpu)?
This function could then be a weak function in numa.c which returns 0.
>
>> +
>> +/**
>> + * early_init_dt_scan_numa_map - parse memory node and map nid to memory range.
>> + */
>> +int __init early_init_dt_scan_numa_map(unsigned long node, const char *uname,
>> +                                  int depth, void *data)
>> +{
>> +     const char *type = of_get_flat_dt_prop(node, "device_type", NULL);
>> +
>> +     /* We are scanning "numa-map" nodes only */
>
> a stale comment?
oops, will remove.
>
>> +/* DT node mapping is done already early_init_dt_scan_memory */
>> +int __init arm64_dt_numa_init(void)
>> +{
>> +     int i;
>> +     u32 nodea, nodeb, distance, node_count = 0;
>> +
>> +     of_scan_flat_dt(early_init_dt_scan_numa_map, NULL);
>> +
>> +     for_each_node_mask(i, numa_nodes_parsed)
>> +             node_count = i;
>> +     node_count++;
>> +
>> +     for (nodea =  0; nodea < node_count; nodea++) {
>> +             for (nodeb = 0; nodeb < node_count; nodeb++) {
>> +                     distance = dt_get_node_distance(nodea, nodeb);
>> +                     numa_set_distance(nodea, nodeb, distance);
>> +             }
>> +     }
>> +     return 0;
>> +}
>> +EXPORT_SYMBOL(arm64_dt_numa_init);
>
> No need to export functions that are called only by architecture code.
> Since this works on the flattened device tree format, you can never
> have loadable modules calling it.
yes, will do.
>
>> @@ -461,7 +464,12 @@ static int c_show(struct seq_file *m, void *v)
>>                * "processor".  Give glibc what it expects.
>>                */
>>  #ifdef CONFIG_SMP
>> +     if (IS_ENABLED(CONFIG_NUMA)) {
>> +             seq_printf(m, "processor\t: %d", i);
>> +             seq_printf(m, " [nid: %d]\n", cpu_to_node(i));
>> +     } else {
>>               seq_printf(m, "processor\t: %d\n", i);
>> +     }
>>  #endif
>>       }
>
> Do we need to make this conditional? I think we can just always
> print the node number, even if it's going to be zero for systems
> without the associativity properties.
yes, we can.
>
>> +
>> +int cpu_to_node_map[NR_CPUS];
>> +EXPORT_SYMBOL(cpu_to_node_map);
>
> This seems to be x86 specific, do we need it?
>
>> +/*
>> + *  Set the cpu to node and mem mapping
>> + */
>> +void numa_store_cpu_info(int cpu)
>> +{
>> +#ifdef CONFIG_ARM64_DT_NUMA
>> +     node_cpu_hwid[cpu].node_id  =  dt_get_cpu_node_id(cpu);
>> +#endif
>
> I would try to avoid the #ifdef here, by providing a stub function of
> dt_get_cpu_node_id or whichever function we end up calling here when
> NUMA is disabled.
as commented above.
>
>> +
>> +/**
>> + * arm64_numa_init - Initialize NUMA
>> + *
>> + * Try each configured NUMA initialization method until one succeeds.  The
>> + * last fallback is dummy single node config encompassing whole memory and
>> + * never fails.
>> + */
>> +void __init arm64_numa_init(void)
>> +{
>> +     if (!numa_off) {
>> +#ifdef CONFIG_ARM64_DT_NUMA
>> +             if (!numa_init(arm64_dt_numa_init))
>> +                     return;
>> +#endif
>> +     }
>> +
>> +     numa_init(dummy_numa_init);
>> +}
>
> I don't think we need the CONFIG_ARM64_DT_NUMA=n case here, it should just
> not be conditional, and the arm64_dt_numa_init should fall back to doing
> something reasonable when numa is turned off or there are no associativity
> properties.
I think we can remove the #ifdef; I will do that.
>
>         Arnd

thanks
ganapat

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
  2015-01-02 21:17         ` Arnd Bergmann
@ 2015-01-06  9:34           ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 43+ messages in thread
From: Ganapatrao Kulkarni @ 2015-01-06  9:34 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Will Deacon,
	Catalin Marinas, Grant Likely, devicetree-u79uwXL29TY76Z2rM5mHXA,
	Leif Lindholm, Roy Franz, Ard Biesheuvel,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, Rob Herring, Steve Capper,
	Hanjun Guo, jchandra-dY08KVG/lbpWk0Htik3J/w, Al Stone

On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org> wrote:
> On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
>> +
>> +     memory@00000000 {
>> +             device_type = "memory";
>> +             reg = <0x0 0x00000000 0x0 0x80000000>;
>> +             /* board 0, socket 0, no specific core */
>> +             arm,associativity = <0 0 0xffff>;
>> +     };
>> +
>> +     memory@10000000000 {
>> +             device_type = "memory";
>> +             reg = <0x100 0x00000000 0x0 0x80000000>;
>> +             /* board 1, socket 0, no specific core */
>> +             arm,associativity = <1 0 0xffff>;
>> +     };
>> +};
>
> So no memory in any other socket?
>
>> +             cpu@00f {
>> +                     device_type = "cpu";
>> +                     compatible = "cavium,thunder", "arm,armv8";
>> +                     reg = <0x0 0x00f>;
>> +                     enable-method = "psci";
>> +                     arm,associativity = <0 0 0x00f>;
>> +             };
>> +             cpu@100 {
>> +                     device_type = "cpu";
>> +                     compatible = "cavium,thunder", "arm,armv8";
>> +                     reg = <0x0 0x100>;
>> +                     enable-method = "psci";
>> +                     arm,associativity = <0 0 0x100>;
>> +             };
>
> What is the 0x100 offset in the last-level topology field? Does this have
> no significance to topology at all? I would expect that to be something
> like cluster number that is relevant to caching and should be represented
> as a separate level.
I did not understand. Can you please explain a little more about
"should be represented as a separate level"?
At present, I have put the hwid of the CPU there.
>
> In contrast, the level-two topology information seems to always be
> zero for all CPUs, so you could probably leave that one out.
>
>> +     soc {
>> +             compatible = "simple-bus";
>> +             #address-cells = <2>;
>> +             #size-cells = <2>;
>> +             ranges;
>
> The soc node is missing a topology information, please add one.
ok, will be added.
>
>         Arnd

thanks
ganapat
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
@ 2015-01-06  9:34           ` Ganapatrao Kulkarni
  0 siblings, 0 replies; 43+ messages in thread
From: Ganapatrao Kulkarni @ 2015-01-06  9:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
>> +
>> +     memory@00000000 {
>> +             device_type = "memory";
>> +             reg = <0x0 0x00000000 0x0 0x80000000>;
>> +             /* board 0, socket 0, no specific core */
>> +             arm,associativity = <0 0 0xffff>;
>> +     };
>> +
>> +     memory@10000000000 {
>> +             device_type = "memory";
>> +             reg = <0x100 0x00000000 0x0 0x80000000>;
>> +             /* board 1, socket 0, no specific core */
>> +             arm,associativity = <1 0 0xffff>;
>> +     };
>> +};
>
> So no memory in any other socket?
>
>> +             cpu@00f {
>> +                     device_type = "cpu";
>> +                     compatible = "cavium,thunder", "arm,armv8";
>> +                     reg = <0x0 0x00f>;
>> +                     enable-method = "psci";
>> +                     arm,associativity = <0 0 0x00f>;
>> +             };
>> +             cpu@100 {
>> +                     device_type = "cpu";
>> +                     compatible = "cavium,thunder", "arm,armv8";
>> +                     reg = <0x0 0x100>;
>> +                     enable-method = "psci";
>> +                     arm,associativity = <0 0 0x100>;
>> +             };
>
> What is the 0x100 offset in the last-level topology field? Does this have
> no significance to topology at all? I would expect that to be something
> like cluster number that is relevant to caching and should be represented
> as a separate level.
I did not understand. Can you please explain a little more about
"should be represented as a separate level"?
At present, I have put the hwid of the CPU there.
>
> In contrast, the level-two topology information seems to always be
> zero for all CPUs, so you could probably leave that one out.
>
>> +     soc {
>> +             compatible = "simple-bus";
>> +             #address-cells = <2>;
>> +             #size-cells = <2>;
>> +             ranges;
>
> The soc node is missing a topology information, please add one.
ok, will be added.
>
>         Arnd

thanks
ganapat

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH v3 4/4] arm64:numa: adding numa support for arm64 platforms.
  2015-01-06  9:25           ` Ganapatrao Kulkarni
@ 2015-01-06 19:59               ` Arnd Bergmann
  -1 siblings, 0 replies; 43+ messages in thread
From: Arnd Bergmann @ 2015-01-06 19:59 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: Shannon Zhao, Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Will Deacon,
	Catalin Marinas, Grant Likely, devicetree-u79uwXL29TY76Z2rM5mHXA,
	Leif Lindholm, Roy Franz, Ard Biesheuvel,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, Rob Herring, Steve Capper,
	Hanjun Guo, jchandra-dY08KVG/lbpWk0Htik3J/w, Al Stone

On Tuesday 06 January 2015 14:55:53 Ganapatrao Kulkarni wrote:
> On Sat, Jan 3, 2015 at 2:40 AM, Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org> wrote:
> >> +int dt_get_cpu_node_id(int cpu)
> >> +{
> >> +     struct device_node *dn = NULL;
> >> +
> >> +     while ((dn = of_find_node_by_type(dn, "cpu"))) {
> >> +             const u32 *cell;
> >> +             u64 hwid;
> >> +
> >> +             /*
> >> +              * A cpu node with missing "reg" property is
> >> +              * considered invalid to build a cpu_logical_map
> >> +              * entry.
> >> +              */
> >> +             cell = of_get_property(dn, "reg", NULL);
> >> +             if (!cell) {
> >> +                     pr_err("%s: missing reg property\n", dn->full_name);
> >> +                     return default_nid;
> >> +             }
> >> +             hwid = of_read_number(cell, of_n_addr_cells(dn));
> >> +
> >> +             if (cpu_logical_map(cpu) == hwid)
> >> +                     return of_node_to_nid_single(dn);
> >> +     }
> >> +     return NUMA_NO_NODE;
> >> +}
> >> +EXPORT_SYMBOL(dt_get_cpu_node_id);
> >
> > Maybe just expose a function to the device node for a CPU ID here, and
> > expect callers to use of_node_to_nid?
> shall i make this wrapper function in dt_numa.c, which will use
> functions _of_node_to_nid and  _of_cpu_to_node(cpu)

Yes, I guess that would work.

> And,  this function can be a weak function in numa.c which returns 0.

No, please don't use weak functions. You can either use IS_ENABLED()
tricks to remove function calls at compile-time, or in the header
file provide an inline function as an alternative to the extern
declaration, based on a configuration symbol.
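
A minimal sketch of the header-stub variant, assuming a made-up configuration
symbol CONFIG_DT_NUMA_SKETCH and a made-up helper dt_cpu_to_nid(); neither
name comes from the patch. For the sake of a single self-contained file the
symbol is tested with #if and a 0/1 value, whereas a real kernel header would
test a Kconfig symbol with #ifdef or IS_ENABLED().

```c
/* Hypothetical stand-in for a Kconfig symbol: 0 = feature disabled. */
#ifndef CONFIG_DT_NUMA_SKETCH
#define CONFIG_DT_NUMA_SKETCH 0
#endif

#define NR_CPUS_SKETCH 4

/* ---- what would live in the header ---- */
#if CONFIG_DT_NUMA_SKETCH
int dt_cpu_to_nid(int cpu);		/* real version, defined in a .c file */
#else
static inline int dt_cpu_to_nid(int cpu)
{
	(void)cpu;
	return 0;			/* without NUMA every CPU is on node 0 */
}
#endif

/* ---- what would live in the .c file built only when enabled ---- */
#if CONFIG_DT_NUMA_SKETCH
int dt_cpu_to_nid(int cpu)
{
	return cpu / 2;			/* placeholder topology: two CPUs per node */
}
#endif

/* ---- the caller: no #ifdef and no weak symbol needed ---- */
int node_of_cpu[NR_CPUS_SKETCH];

void store_cpu_info_sketch(int cpu)
{
	if (cpu < 0 || cpu >= NR_CPUS_SKETCH)
		return;
	node_of_cpu[cpu] = dt_cpu_to_nid(cpu);
}
```

Compared with a weak function, the stub is resolved at compile time, so the
disabled case costs nothing and a missing real implementation shows up as a
link error instead of silently running the fallback.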

	Arnd

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [RFC PATCH v3 4/4] arm64:numa: adding numa support for arm64 platforms.
@ 2015-01-06 19:59               ` Arnd Bergmann
  0 siblings, 0 replies; 43+ messages in thread
From: Arnd Bergmann @ 2015-01-06 19:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 06 January 2015 14:55:53 Ganapatrao Kulkarni wrote:
> On Sat, Jan 3, 2015 at 2:40 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> >> +int dt_get_cpu_node_id(int cpu)
> >> +{
> >> +     struct device_node *dn = NULL;
> >> +
> >> +     while ((dn = of_find_node_by_type(dn, "cpu"))) {
> >> +             const u32 *cell;
> >> +             u64 hwid;
> >> +
> >> +             /*
> >> +              * A cpu node with missing "reg" property is
> >> +              * considered invalid to build a cpu_logical_map
> >> +              * entry.
> >> +              */
> >> +             cell = of_get_property(dn, "reg", NULL);
> >> +             if (!cell) {
> >> +                     pr_err("%s: missing reg property\n", dn->full_name);
> >> +                     return default_nid;
> >> +             }
> >> +             hwid = of_read_number(cell, of_n_addr_cells(dn));
> >> +
> >> +             if (cpu_logical_map(cpu) == hwid)
> >> +                     return of_node_to_nid_single(dn);
> >> +     }
> >> +     return NUMA_NO_NODE;
> >> +}
> >> +EXPORT_SYMBOL(dt_get_cpu_node_id);
> >
> > Maybe just expose a function to the device node for a CPU ID here, and
> > expect callers to use of_node_to_nid?
> shall i make this wrapper function in dt_numa.c, which will use
> functions _of_node_to_nid and  _of_cpu_to_node(cpu)

Yes, I guess that would work.

> And,  this function can be a weak function in numa.c which returns 0.

No, please don't use weak functions. You can either use IS_ENABLED()
tricks to remove function calls at compile-time, or in the header
file provide an inline function as an alternative to the extern
declaration, based on a configuration symbol.

	Arnd

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
  2015-01-06  9:34           ` Ganapatrao Kulkarni
@ 2015-01-06 20:02               ` Arnd Bergmann
  -1 siblings, 0 replies; 43+ messages in thread
From: Arnd Bergmann @ 2015-01-06 20:02 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Will Deacon,
	Catalin Marinas, Grant Likely, devicetree-u79uwXL29TY76Z2rM5mHXA,
	Leif Lindholm, Roy Franz, Ard Biesheuvel,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, Rob Herring, Steve Capper,
	Hanjun Guo, jchandra-dY08KVG/lbpWk0Htik3J/w, Al Stone

On Tuesday 06 January 2015 15:04:26 Ganapatrao Kulkarni wrote:
> On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org> wrote:
> > On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
> >> +             cpu@00f {
> >> +                     device_type = "cpu";
> >> +                     compatible = "cavium,thunder", "arm,armv8";
> >> +                     reg = <0x0 0x00f>;
> >> +                     enable-method = "psci";
> >> +                     arm,associativity = <0 0 0x00f>;
> >> +             };
> >> +             cpu@100 {
> >> +                     device_type = "cpu";
> >> +                     compatible = "cavium,thunder", "arm,armv8";
> >> +                     reg = <0x0 0x100>;
> >> +                     enable-method = "psci";
> >> +                     arm,associativity = <0 0 0x100>;
> >> +             };
> >
> > What is the 0x100 offset in the last-level topology field? Does this have
> > no significance to topology at all? I would expect that to be something
> > like cluster number that is relevant to caching and should be represented
> > as a separate level.
>
> i did not understand, can you please explain little more about "
> should be represented as a separate level."
> at present, i have put the hwid of a cpu.

From what I understand, the hwid of the CPU contains the "cluster" number in
this bit position, so you typically have a shared L2 or L3 cache between
all cores within a cluster, but separate caches in other clusters.

If this is the case, there will be a measurable difference in performance
between two processes sharing memory when running on the same cluster,
or when running on different clusters on the same socket. If the
performance difference is relevant, it should be described as a separate
level in the associativity property.
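
As a worked example of the bit position in question, the sketch below splits a
hwid into core and cluster, assuming the usual MPIDR-style packing with
affinity level 0 in bits [7:0] and affinity level 1 in bits [15:8]. Whether
Thunder's affinity level 1 really corresponds to a cache-sharing cluster is
exactly the open question here, and the helper names and struct assoc4 are
hypothetical.

```c
#include <stdint.h>

/* Affinity level 0: the core number within its cluster (bits [7:0]). */
unsigned int hwid_core(uint64_t hwid)
{
	return (unsigned int)(hwid & 0xff);
}

/* Affinity level 1: assumed here to be the cluster number (bits [15:8]). */
unsigned int hwid_cluster(uint64_t hwid)
{
	return (unsigned int)((hwid >> 8) & 0xff);
}

/* Hypothetical four-level associativity with the cluster as its own level. */
struct assoc4 {
	unsigned int board;
	unsigned int socket;
	unsigned int cluster;
	unsigned int core;
};

/* Build the four cells from board, socket and the raw hwid. */
struct assoc4 make_assoc4(unsigned int board, unsigned int socket,
			  uint64_t hwid)
{
	struct assoc4 a = {
		.board = board,
		.socket = socket,
		.cluster = hwid_cluster(hwid),
		.core = hwid_core(hwid),
	};
	return a;
}
```

Under that assumption cpu@00f decodes to cluster 0, core 15 and cpu@100 to
cluster 1, core 0, so the two would differ at the cluster level rather than
only in the last cell.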

	Arnd

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
@ 2015-01-06 20:02               ` Arnd Bergmann
  0 siblings, 0 replies; 43+ messages in thread
From: Arnd Bergmann @ 2015-01-06 20:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 06 January 2015 15:04:26 Ganapatrao Kulkarni wrote:
> On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> > On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
> >> +             cpu@00f {
> >> +                     device_type = "cpu";
> >> +                     compatible = "cavium,thunder", "arm,armv8";
> >> +                     reg = <0x0 0x00f>;
> >> +                     enable-method = "psci";
> >> +                     arm,associativity = <0 0 0x00f>;
> >> +             };
> >> +             cpu@100 {
> >> +                     device_type = "cpu";
> >> +                     compatible = "cavium,thunder", "arm,armv8";
> >> +                     reg = <0x0 0x100>;
> >> +                     enable-method = "psci";
> >> +                     arm,associativity = <0 0 0x100>;
> >> +             };
> >
> > What is the 0x100 offset in the last-level topology field? Does this have
> > no significance to topology at all? I would expect that to be something
> > like cluster number that is relevant to caching and should be represented
> > as a separate level.
>
> i did not understand, can you please explain little more about "
> should be represented as a separate level."
> at present, i have put the hwid of a cpu.

From what I understand, the hwid of the CPU contains the "cluster" number in
this bit position, so you typically have a shared L2 or L3 cache between
all cores within a cluster, but separate caches in other clusters.

If this is the case, there will be a measurable difference in performance
between two processes sharing memory when running on the same cluster,
or when running on different clusters on the same socket. If the
performance difference is relevant, it should be described as a separate
level in the associativity property.

	Arnd

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
  2015-01-06 20:02               ` Arnd Bergmann
@ 2015-01-07  7:07                 ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 43+ messages in thread
From: Ganapatrao Kulkarni @ 2015-01-07  7:07 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Will Deacon,
	Catalin Marinas, Grant Likely, devicetree-u79uwXL29TY76Z2rM5mHXA,
	Leif Lindholm, Roy Franz, Ard Biesheuvel,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, Rob Herring, Steve Capper,
	Hanjun Guo, jchandra-dY08KVG/lbpWk0Htik3J/w, Al Stone

Hi Arnd,

On Wed, Jan 7, 2015 at 1:32 AM, Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org> wrote:
> On Tuesday 06 January 2015 15:04:26 Ganapatrao Kulkarni wrote:
>> On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org> wrote:
>> > On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
>> >> +             cpu@00f {
>> >> +                     device_type = "cpu";
>> >> +                     compatible = "cavium,thunder", "arm,armv8";
>> >> +                     reg = <0x0 0x00f>;
>> >> +                     enable-method = "psci";
>> >> +                     arm,associativity = <0 0 0x00f>;
>> >> +             };
>> >> +             cpu@100 {
>> >> +                     device_type = "cpu";
>> >> +                     compatible = "cavium,thunder", "arm,armv8";
>> >> +                     reg = <0x0 0x100>;
>> >> +                     enable-method = "psci";
>> >> +                     arm,associativity = <0 0 0x100>;
>> >> +             };
>> >
>> > What is the 0x100 offset in the last-level topology field? Does this have
>> > no significance to topology at all? I would expect that to be something
>> > like cluster number that is relevant to caching and should be represented
>> > as a separate level.
>>
>> i did not understand, can you please explain little more about "
>> should be represented as a separate level."
>> at present, i have put the hwid of a cpu.
>
> From what I understand, the hwid of the CPU contains the "cluster" number in
> this bit position, so you typically have a shared L2 or L3 cache between
> all cores within a cluster, but separate caches in other clusters.
>
> If this is the case, there will be a measurable difference in performance
> between two processes sharing memory when running on the same cluster,
> or when running on different clusters on the same socket. If the
> performance difference is relevant, it should be described as a separate
> level in the associativity property.
You mean the associativity as an array of <board> <socket> <cluster>?
>
>         Arnd
thanks
ganapat

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
@ 2015-01-07  7:07                 ` Ganapatrao Kulkarni
  0 siblings, 0 replies; 43+ messages in thread
From: Ganapatrao Kulkarni @ 2015-01-07  7:07 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Arnd,

On Wed, Jan 7, 2015 at 1:32 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Tuesday 06 January 2015 15:04:26 Ganapatrao Kulkarni wrote:
>> On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>> > On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
>> >> +             cpu@00f {
>> >> +                     device_type = "cpu";
>> >> +                     compatible = "cavium,thunder", "arm,armv8";
>> >> +                     reg = <0x0 0x00f>;
>> >> +                     enable-method = "psci";
>> >> +                     arm,associativity = <0 0 0x00f>;
>> >> +             };
>> >> +             cpu@100 {
>> >> +                     device_type = "cpu";
>> >> +                     compatible = "cavium,thunder", "arm,armv8";
>> >> +                     reg = <0x0 0x100>;
>> >> +                     enable-method = "psci";
>> >> +                     arm,associativity = <0 0 0x100>;
>> >> +             };
>> >
>> > What is the 0x100 offset in the last-level topology field? Does this have
>> > no significance to topology at all? I would expect that to be something
>> > like cluster number that is relevant to caching and should be represented
>> > as a separate level.
>>
>> i did not understand, can you please explain little more about "
>> should be represented as a separate level."
>> at present, i have put the hwid of a cpu.
>
> From what I undertand, the hwid of the CPU contains the "cluster" number in
> this bit position, so you typically have a shared L2 or L3 cache between
> all cores within a cluster, but separate caches in other clusters.
>
> If this is the case, there will be a measurable difference in performance
> between two processes sharing memory when running on the same cluster,
> or when running on different clusters on the same socket. If the
> performance difference is relevant, it should be described as a separate
> level in the associativity property.
You mean the associativity as an array of <board> <socket> <cluster>?
>
>         Arnd
thanks
ganapat

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH v3 4/4] arm64:numa: adding numa support for arm64 platforms.
  2015-01-06 19:59               ` Arnd Bergmann
@ 2015-01-07  7:09                 ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 43+ messages in thread
From: Ganapatrao Kulkarni @ 2015-01-07  7:09 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Shannon Zhao, Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Will Deacon,
	Catalin Marinas, Grant Likely, devicetree-u79uwXL29TY76Z2rM5mHXA,
	Leif Lindholm, Roy Franz, Ard Biesheuvel,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, Rob Herring, Steve Capper,
	Hanjun Guo, jchandra-dY08KVG/lbpWk0Htik3J/w, Al Stone

On Wed, Jan 7, 2015 at 1:29 AM, Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org> wrote:
> On Tuesday 06 January 2015 14:55:53 Ganapatrao Kulkarni wrote:
>> On Sat, Jan 3, 2015 at 2:40 AM, Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org> wrote:
>> >> +int dt_get_cpu_node_id(int cpu)
>> >> +{
>> >> +     struct device_node *dn = NULL;
>> >> +
>> >> +     while ((dn = of_find_node_by_type(dn, "cpu"))) {
>> >> +             const u32 *cell;
>> >> +             u64 hwid;
>> >> +
>> >> +             /*
>> >> +              * A cpu node with missing "reg" property is
>> >> +              * considered invalid to build a cpu_logical_map
>> >> +              * entry.
>> >> +              */
>> >> +             cell = of_get_property(dn, "reg", NULL);
>> >> +             if (!cell) {
>> >> +                     pr_err("%s: missing reg property\n", dn->full_name);
>> >> +                     return default_nid;
>> >> +             }
>> >> +             hwid = of_read_number(cell, of_n_addr_cells(dn));
>> >> +
>> >> +             if (cpu_logical_map(cpu) == hwid)
>> >> +             return of_node_to_nid_single(dn);
>> >> +     }
>> >> +     return NUMA_NO_NODE;
>> >> +}
>> >> +EXPORT_SYMBOL(dt_get_cpu_node_id);
>> >
>> > Maybe just expose a function to the device node for a CPU ID here, and
>> > expect callers to use of_node_to_nid?
>> shall i make this wrapper function in dt_numa.c, which will use
>> functions _of_node_to_nid and  _of_cpu_to_node(cpu)
>
> Yes, I guess that would work.
>
>> And,  this function can be a weak function in numa.c which returns 0.
>
> No, please don't use weak functions. You can either use IS_ENABLED()
> tricks to remove function calls at compile-time, or in the header
> file provide an inline function as an alternative to the extern
> declaration, based on a configuration symbol.
ok
>
>         Arnd
thanks
ganapat
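
The header-based alternative Arnd describes could look roughly like the
sketch below. It reuses the dt_get_cpu_node_id() name from the patch;
CONFIG_NUMA stands in for whichever configuration symbol the series ends up
using, cpu_is_on_boot_node() is a hypothetical caller, and the fallback value
of 0 is simply the one proposed above for the weak function. None of this is
settled by the thread.

```c
/*
 * Sketch of the header pattern: with the configuration symbol set, callers
 * see an extern declaration; without it they get an inline stub, so no
 * out-of-line function call is needed. CONFIG_NUMA is an assumed symbol name.
 */
#ifdef CONFIG_NUMA
extern int dt_get_cpu_node_id(int cpu);
#else
static inline int dt_get_cpu_node_id(int cpu)
{
	(void)cpu;	/* unused when NUMA support is compiled out */
	return 0;	/* single-node fallback, as proposed for the weak version */
}
#endif

/* Hypothetical caller: the same source builds in both configurations. */
static inline int cpu_is_on_boot_node(int cpu)
{
	return dt_get_cpu_node_id(cpu) == 0;
}
```

Compared with a weak symbol, the choice between the two definitions is made
at compile time and is visible in the header, which is the point of the
request above.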

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [RFC PATCH v3 4/4] arm64:numa: adding numa support for arm64 platforms.
@ 2015-01-07  7:09                 ` Ganapatrao Kulkarni
  0 siblings, 0 replies; 43+ messages in thread
From: Ganapatrao Kulkarni @ 2015-01-07  7:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jan 7, 2015 at 1:29 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Tuesday 06 January 2015 14:55:53 Ganapatrao Kulkarni wrote:
>> On Sat, Jan 3, 2015 at 2:40 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>> >> +int dt_get_cpu_node_id(int cpu)
>> >> +{
>> >> +     struct device_node *dn = NULL;
>> >> +
>> >> +     while ((dn = of_find_node_by_type(dn, "cpu"))) {
>> >> +             const u32 *cell;
>> >> +             u64 hwid;
>> >> +
>> >> +             /*
>> >> +              * A cpu node with missing "reg" property is
>> >> +              * considered invalid to build a cpu_logical_map
>> >> +              * entry.
>> >> +              */
>> >> +             cell = of_get_property(dn, "reg", NULL);
>> >> +             if (!cell) {
>> >> +                     pr_err("%s: missing reg property\n", dn->full_name);
>> >> +                     return default_nid;
>> >> +             }
>> >> +             hwid = of_read_number(cell, of_n_addr_cells(dn));
>> >> +
>> >> +             if (cpu_logical_map(cpu) == hwid)
>> >> +             return of_node_to_nid_single(dn);
>> >> +     }
>> >> +     return NUMA_NO_NODE;
>> >> +}
>> >> +EXPORT_SYMBOL(dt_get_cpu_node_id);
>> >
>> > Maybe just expose a function to the device node for a CPU ID here, and
>> > expect callers to use of_node_to_nid?
>> shall i make this wrapper function in dt_numa.c, which will use
>> functions _of_node_to_nid and  _of_cpu_to_node(cpu)
>
> Yes, I guess that would work.
>
>> And,  this function can be a weak function in numa.c which returns 0.
>
> No, please don't use weak functions. You can either use IS_ENABLED()
> tricks to remove function calls at compile-time, or in the header
> file provide an inline function as an alternative to the extern
> declaration, based on a configuration symbol.
ok
>
>         Arnd
thanks
ganapat

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
  2015-01-07  7:07                 ` Ganapatrao Kulkarni
@ 2015-01-07  8:18                   ` Arnd Bergmann
  -1 siblings, 0 replies; 43+ messages in thread
From: Arnd Bergmann @ 2015-01-07  8:18 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: devicetree, Steve Capper, Al Stone, Ard Biesheuvel,
	Catalin Marinas, Will Deacon, Leif Lindholm, Roy Franz,
	Rob Herring, Ganapatrao Kulkarni, msalter, Grant Likely,
	jchandra, linux-arm-kernel, Hanjun Guo

On Wednesday 07 January 2015 12:37:51 Ganapatrao Kulkarni wrote:
> Hi Arnd,
> 
> On Wed, Jan 7, 2015 at 1:32 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> > On Tuesday 06 January 2015 15:04:26 Ganapatrao Kulkarni wrote:
> >> On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> >> > On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
> >> >> +             cpu@00f {
> >> >> +                     device_type = "cpu";
> >> >> +                     compatible = "cavium,thunder", "arm,armv8";
> >> >> +                     reg = <0x0 0x00f>;
> >> >> +                     enable-method = "psci";
> >> >> +                     arm,associativity = <0 0 0x00f>;
> >> >> +             };
> >> >> +             cpu@100 {
> >> >> +                     device_type = "cpu";
> >> >> +                     compatible = "cavium,thunder", "arm,armv8";
> >> >> +                     reg = <0x0 0x100>;
> >> >> +                     enable-method = "psci";
> >> >> +                     arm,associativity = <0 0 0x100>;
> >> >> +             };
> >> >
> >> > What is the 0x100 offset in the last-level topology field? Does this have
> >> > no significance to topology at all? I would expect that to be something
> >> > like cluster number that is relevant to caching and should be represented
> >> > as a separate level.
> >>
> >> i did not understand, can you please explain little more about "
> >> should be represented as a separate level."
> >> at present, i have put the hwid of a cpu.
> >
> > From what I undertand, the hwid of the CPU contains the "cluster" number in
> > this bit position, so you typically have a shared L2 or L3 cache between
> > all cores within a cluster, but separate caches in other clusters.
> >
> > If this is the case, there will be a measurable difference in performance
> > between two processes sharing memory when running on the same cluster,
> > or when running on different clusters on the same socket. If the
> > performance difference is relevant, it should be described as a separate
> > level in the associativity property.
> you mean, the associativity as array of  <board> <socket> <cluster>

No, that would leave out the core number, which is required to identify
the individual thread. I meant adding an extra level such as

<board> <socket> <cluster> <core>

A lot of machines will leave out the <board> number because they are
built with SoCs that don't have a long-distance coherency protocol.

	Arnd
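
To make the extra level concrete, the sketch below splits a cpu hwid into
cluster and core fields and builds a <board> <socket> <cluster> <core> tuple.
The bit layout (core in bits [7:0], cluster in bits [15:8]) is an assumption
chosen to match the 0x00f and 0x100 values quoted above, and struct assoc and
hwid_to_assoc() are hypothetical names, not part of the patch.

```c
#include <stdint.h>

/* Hypothetical 4-level associativity tuple: <board> <socket> <cluster> <core>. */
struct assoc {
	uint32_t board;
	uint32_t socket;
	uint32_t cluster;
	uint32_t core;
};

/*
 * Assumed layout: core number in bits [7:0], cluster number in bits [15:8].
 * Real hardware may encode these fields differently.
 */
static struct assoc hwid_to_assoc(uint32_t board, uint32_t socket, uint64_t hwid)
{
	struct assoc a = {
		.board   = board,
		.socket  = socket,
		.cluster = (uint32_t)((hwid >> 8) & 0xff),
		.core    = (uint32_t)(hwid & 0xff),
	};
	return a;
}
```

Under this layout, hwid 0x00f gives cluster 0, core 15, and hwid 0x100 gives
cluster 1, core 0, so the two cpus quoted above would differ at the cluster
level rather than only in the last cell.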

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
@ 2015-01-07  8:18                   ` Arnd Bergmann
  0 siblings, 0 replies; 43+ messages in thread
From: Arnd Bergmann @ 2015-01-07  8:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 07 January 2015 12:37:51 Ganapatrao Kulkarni wrote:
> Hi Arnd,
> 
> On Wed, Jan 7, 2015 at 1:32 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> > On Tuesday 06 January 2015 15:04:26 Ganapatrao Kulkarni wrote:
> >> On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> >> > On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
> >> >> +             cpu@00f {
> >> >> +                     device_type = "cpu";
> >> >> +                     compatible = "cavium,thunder", "arm,armv8";
> >> >> +                     reg = <0x0 0x00f>;
> >> >> +                     enable-method = "psci";
> >> >> +                     arm,associativity = <0 0 0x00f>;
> >> >> +             };
> >> >> +             cpu@100 {
> >> >> +                     device_type = "cpu";
> >> >> +                     compatible = "cavium,thunder", "arm,armv8";
> >> >> +                     reg = <0x0 0x100>;
> >> >> +                     enable-method = "psci";
> >> >> +                     arm,associativity = <0 0 0x100>;
> >> >> +             };
> >> >
> >> > What is the 0x100 offset in the last-level topology field? Does this have
> >> > no significance to topology at all? I would expect that to be something
> >> > like cluster number that is relevant to caching and should be represented
> >> > as a separate level.
> >>
> >> i did not understand, can you please explain little more about "
> >> should be represented as a separate level."
> >> at present, i have put the hwid of a cpu.
> >
> > From what I undertand, the hwid of the CPU contains the "cluster" number in
> > this bit position, so you typically have a shared L2 or L3 cache between
> > all cores within a cluster, but separate caches in other clusters.
> >
> > If this is the case, there will be a measurable difference in performance
> > between two processes sharing memory when running on the same cluster,
> > or when running on different clusters on the same socket. If the
> > performance difference is relevant, it should be described as a separate
> > level in the associativity property.
> you mean, the associativity as array of  <board> <socket> <cluster>

No, that would leave out the core number, which is required to identify
the individual thread. I meant adding an extra level such as

<board> <socket> <cluster> <core>

A lot of machines will leave out the <board> number because they are
built with SoCs that don't have a long-distance coherency protocol.

	Arnd

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
  2015-01-07  8:18                   ` Arnd Bergmann
@ 2015-01-14 17:36                     ` Lorenzo Pieralisi
  -1 siblings, 0 replies; 43+ messages in thread
From: Lorenzo Pieralisi @ 2015-01-14 17:36 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Ganapatrao Kulkarni, Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Will Deacon,
	Catalin Marinas, grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Leif Lindholm, Roy Franz,
	Ard Biesheuvel, msalter-H+wXaHxf7aLQT0dZR+AlfA, Rob Herring,
	Steve Capper, hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	jchandra-dY08KVG/lbpWk0Htik3J/w, Al Stone

On Wed, Jan 07, 2015 at 08:18:50AM +0000, Arnd Bergmann wrote:
> On Wednesday 07 January 2015 12:37:51 Ganapatrao Kulkarni wrote:
> > Hi Arnd,
> > 
> > On Wed, Jan 7, 2015 at 1:32 AM, Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org> wrote:
> > > On Tuesday 06 January 2015 15:04:26 Ganapatrao Kulkarni wrote:
> > >> On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org> wrote:
> > >> > On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
> > >> >> +             cpu@00f {
> > >> >> +                     device_type = "cpu";
> > >> >> +                     compatible = "cavium,thunder", "arm,armv8";
> > >> >> +                     reg = <0x0 0x00f>;
> > >> >> +                     enable-method = "psci";
> > >> >> +                     arm,associativity = <0 0 0x00f>;
> > >> >> +             };
> > >> >> +             cpu@100 {
> > >> >> +                     device_type = "cpu";
> > >> >> +                     compatible = "cavium,thunder", "arm,armv8";
> > >> >> +                     reg = <0x0 0x100>;
> > >> >> +                     enable-method = "psci";
> > >> >> +                     arm,associativity = <0 0 0x100>;
> > >> >> +             };
> > >> >
> > >> > What is the 0x100 offset in the last-level topology field? Does this have
> > >> > no significance to topology at all? I would expect that to be something
> > >> > like cluster number that is relevant to caching and should be represented
> > >> > as a separate level.
> > >>
> > >> i did not understand, can you please explain little more about "
> > >> should be represented as a separate level."
> > >> at present, i have put the hwid of a cpu.
> > >
> > > From what I undertand, the hwid of the CPU contains the "cluster" number in
> > > this bit position, so you typically have a shared L2 or L3 cache between
> > > all cores within a cluster, but separate caches in other clusters.
> > >
> > > If this is the case, there will be a measurable difference in performance
> > > between two processes sharing memory when running on the same cluster,
> > > or when running on different clusters on the same socket. If the
> > > performance difference is relevant, it should be described as a separate
> > > level in the associativity property.
> > you mean, the associativity as array of  <board> <socket> <cluster>
> 
> No, that would leave out the core number, which is required to identify
> the individual thread. I meant adding an extra level such as
> 
> <board> <socket> <cluster> <core>
> 
> A lot of machines will leave out the <board> number because they are
> built with SoCs that don't have a long-distance coherency protocol.

Can't we use phandles to cpu-map nodes instead of a list of numbers (and
yet another topology binding description)?

Is arm,associativity used solely to map "devices" (inclusive of caches)
to a set of cpus?

cpu-map lacks a notion of distance between hierarchy layers, but we can
add that.

Lorenzo

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
@ 2015-01-14 17:36                     ` Lorenzo Pieralisi
  0 siblings, 0 replies; 43+ messages in thread
From: Lorenzo Pieralisi @ 2015-01-14 17:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jan 07, 2015 at 08:18:50AM +0000, Arnd Bergmann wrote:
> On Wednesday 07 January 2015 12:37:51 Ganapatrao Kulkarni wrote:
> > Hi Arnd,
> > 
> > On Wed, Jan 7, 2015 at 1:32 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> > > On Tuesday 06 January 2015 15:04:26 Ganapatrao Kulkarni wrote:
> > >> On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> > >> > On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
> > >> >> +             cpu@00f {
> > >> >> +                     device_type = "cpu";
> > >> >> +                     compatible = "cavium,thunder", "arm,armv8";
> > >> >> +                     reg = <0x0 0x00f>;
> > >> >> +                     enable-method = "psci";
> > >> >> +                     arm,associativity = <0 0 0x00f>;
> > >> >> +             };
> > >> >> +             cpu@100 {
> > >> >> +                     device_type = "cpu";
> > >> >> +                     compatible = "cavium,thunder", "arm,armv8";
> > >> >> +                     reg = <0x0 0x100>;
> > >> >> +                     enable-method = "psci";
> > >> >> +                     arm,associativity = <0 0 0x100>;
> > >> >> +             };
> > >> >
> > >> > What is the 0x100 offset in the last-level topology field? Does this have
> > >> > no significance to topology at all? I would expect that to be something
> > >> > like cluster number that is relevant to caching and should be represented
> > >> > as a separate level.
> > >>
> > >> i did not understand, can you please explain little more about "
> > >> should be represented as a separate level."
> > >> at present, i have put the hwid of a cpu.
> > >
> > > From what I undertand, the hwid of the CPU contains the "cluster" number in
> > > this bit position, so you typically have a shared L2 or L3 cache between
> > > all cores within a cluster, but separate caches in other clusters.
> > >
> > > If this is the case, there will be a measurable difference in performance
> > > between two processes sharing memory when running on the same cluster,
> > > or when running on different clusters on the same socket. If the
> > > performance difference is relevant, it should be described as a separate
> > > level in the associativity property.
> > you mean, the associativity as array of  <board> <socket> <cluster>
> 
> No, that would leave out the core number, which is required to identify
> the individual thread. I meant adding an extra level such as
> 
> <board> <socket> <cluster> <core>
> 
> A lot of machines will leave out the <board> number because they are
> built with SoCs that don't have a long-distance coherency protocol.

Can't we use phandles to cpu-map nodes instead of a list of numbers (and
yet another topology binding description)?

Is arm,associativity used solely to map "devices" (inclusive of caches)
to a set of cpus?

cpu-map lacks a notion of distance between hierarchy layers, but we can
add that.

Lorenzo

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
  2015-01-14 17:36                     ` Lorenzo Pieralisi
@ 2015-01-14 18:48                       ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 43+ messages in thread
From: Ganapatrao Kulkarni @ 2015-01-14 18:48 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: Arnd Bergmann, Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Will Deacon,
	Catalin Marinas, grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Leif Lindholm, Roy Franz,
	Ard Biesheuvel, msalter-H+wXaHxf7aLQT0dZR+AlfA, Rob Herring,
	Steve Capper, hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	jchandra-dY08KVG/lbpWk0Htik3J/w, Al Stone

Hi Lorenzo,

On Wed, Jan 14, 2015 at 11:06 PM, Lorenzo Pieralisi
<lorenzo.pieralisi-5wv7dgnIgG8@public.gmane.org> wrote:
> On Wed, Jan 07, 2015 at 08:18:50AM +0000, Arnd Bergmann wrote:
>> On Wednesday 07 January 2015 12:37:51 Ganapatrao Kulkarni wrote:
>> > Hi Arnd,
>> >
>> > On Wed, Jan 7, 2015 at 1:32 AM, Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org> wrote:
>> > > On Tuesday 06 January 2015 15:04:26 Ganapatrao Kulkarni wrote:
>> > >> On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org> wrote:
>> > >> > On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
>> > >> >> +             cpu@00f {
>> > >> >> +                     device_type = "cpu";
>> > >> >> +                     compatible = "cavium,thunder", "arm,armv8";
>> > >> >> +                     reg = <0x0 0x00f>;
>> > >> >> +                     enable-method = "psci";
>> > >> >> +                     arm,associativity = <0 0 0x00f>;
>> > >> >> +             };
>> > >> >> +             cpu@100 {
>> > >> >> +                     device_type = "cpu";
>> > >> >> +                     compatible = "cavium,thunder", "arm,armv8";
>> > >> >> +                     reg = <0x0 0x100>;
>> > >> >> +                     enable-method = "psci";
>> > >> >> +                     arm,associativity = <0 0 0x100>;
>> > >> >> +             };
>> > >> >
>> > >> > What is the 0x100 offset in the last-level topology field? Does this have
>> > >> > no significance to topology at all? I would expect that to be something
>> > >> > like cluster number that is relevant to caching and should be represented
>> > >> > as a separate level.
>> > >>
>> > >> i did not understand, can you please explain little more about "
>> > >> should be represented as a separate level."
>> > >> at present, i have put the hwid of a cpu.
>> > >
>> > > From what I undertand, the hwid of the CPU contains the "cluster" number in
>> > > this bit position, so you typically have a shared L2 or L3 cache between
>> > > all cores within a cluster, but separate caches in other clusters.
>> > >
>> > > If this is the case, there will be a measurable difference in performance
>> > > between two processes sharing memory when running on the same cluster,
>> > > or when running on different clusters on the same socket. If the
>> > > performance difference is relevant, it should be described as a separate
>> > > level in the associativity property.
>> > you mean, the associativity as array of  <board> <socket> <cluster>
>>
>> No, that would leave out the core number, which is required to identify
>> the individual thread. I meant adding an extra level such as
>>
>> <board> <socket> <cluster> <core>
>>
>> A lot of machines will leave out the <board> number because they are
>> built with SoCs that don't have a long-distance coherency protocol.
>
> Can't we use phandles to cpu-map nodes instead of a list of numbers (and
> yet another topology binding description) ?
cpu-map describes only the cpu topology.
In fact, I initially tried (in the v1 patch set) to use the cpu topology
for the numa mapping.
However, for numa we need to define the association of cpus, memory and
I/O devices.
arm,associativity is a generic node property and can be used in any dt node.
>
> Is arm,associativity used solely to map "devices" (inclusive of caches)
> to a set of cpus ?
>
> cpu-map misses a notion of distance between hierarchy layers, but we can
> add to that.
>
> Lorenzo
thanks
ganapat

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
@ 2015-01-14 18:48                       ` Ganapatrao Kulkarni
  0 siblings, 0 replies; 43+ messages in thread
From: Ganapatrao Kulkarni @ 2015-01-14 18:48 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Lorenzo,

On Wed, Jan 14, 2015 at 11:06 PM, Lorenzo Pieralisi
<lorenzo.pieralisi@arm.com> wrote:
> On Wed, Jan 07, 2015 at 08:18:50AM +0000, Arnd Bergmann wrote:
>> On Wednesday 07 January 2015 12:37:51 Ganapatrao Kulkarni wrote:
>> > Hi Arnd,
>> >
>> > On Wed, Jan 7, 2015 at 1:32 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>> > > On Tuesday 06 January 2015 15:04:26 Ganapatrao Kulkarni wrote:
>> > >> On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>> > >> > On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
>> > >> >> +             cpu@00f {
>> > >> >> +                     device_type = "cpu";
>> > >> >> +                     compatible = "cavium,thunder", "arm,armv8";
>> > >> >> +                     reg = <0x0 0x00f>;
>> > >> >> +                     enable-method = "psci";
>> > >> >> +                     arm,associativity = <0 0 0x00f>;
>> > >> >> +             };
>> > >> >> +             cpu@100 {
>> > >> >> +                     device_type = "cpu";
>> > >> >> +                     compatible = "cavium,thunder", "arm,armv8";
>> > >> >> +                     reg = <0x0 0x100>;
>> > >> >> +                     enable-method = "psci";
>> > >> >> +                     arm,associativity = <0 0 0x100>;
>> > >> >> +             };
>> > >> >
>> > >> > What is the 0x100 offset in the last-level topology field? Does this have
>> > >> > no significance to topology at all? I would expect that to be something
>> > >> > like cluster number that is relevant to caching and should be represented
>> > >> > as a separate level.
>> > >>
>> > >> i did not understand, can you please explain little more about "
>> > >> should be represented as a separate level."
>> > >> at present, i have put the hwid of a cpu.
>> > >
>> > > From what I undertand, the hwid of the CPU contains the "cluster" number in
>> > > this bit position, so you typically have a shared L2 or L3 cache between
>> > > all cores within a cluster, but separate caches in other clusters.
>> > >
>> > > If this is the case, there will be a measurable difference in performance
>> > > between two processes sharing memory when running on the same cluster,
>> > > or when running on different clusters on the same socket. If the
>> > > performance difference is relevant, it should be described as a separate
>> > > level in the associativity property.
>> > you mean, the associativity as array of  <board> <socket> <cluster>
>>
>> No, that would leave out the core number, which is required to identify
>> the individual thread. I meant adding an extra level such as
>>
>> <board> <socket> <cluster> <core>
>>
>> A lot of machines will leave out the <board> number because they are
>> built with SoCs that don't have a long-distance coherency protocol.
>
> Can't we use phandles to cpu-map nodes instead of a list of numbers (and
> yet another topology binding description) ?
cpu-map describes only the cpu topology.
In fact, I initially tried (in the v1 patch set) to use the cpu topology
for the numa mapping.
However, for numa we need to define the association of cpus, memory and
I/O devices.
arm,associativity is a generic node property and can be used in any dt node.
>
> Is arm,associativity used solely to map "devices" (inclusive of caches)
> to a set of cpus ?
>
> cpu-map misses a notion of distance between hierarchy layers, but we can
> add to that.
>
> Lorenzo
thanks
ganapat

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
  2015-01-14 18:48                       ` Ganapatrao Kulkarni
@ 2015-01-14 23:49                           ` Lorenzo Pieralisi
  -1 siblings, 0 replies; 43+ messages in thread
From: Lorenzo Pieralisi @ 2015-01-14 23:49 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: Arnd Bergmann, Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Will Deacon,
	Catalin Marinas, grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Leif Lindholm, Roy Franz,
	Ard Biesheuvel, msalter-H+wXaHxf7aLQT0dZR+AlfA, Rob Herring,
	Steve Capper, hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	jchandra-dY08KVG/lbpWk0Htik3J/w, Al Stone

On Wed, Jan 14, 2015 at 06:48:32PM +0000, Ganapatrao Kulkarni wrote:
> Hi Lorenzo,
> 
> On Wed, Jan 14, 2015 at 11:06 PM, Lorenzo Pieralisi
> <lorenzo.pieralisi-5wv7dgnIgG8@public.gmane.org> wrote:
> > On Wed, Jan 07, 2015 at 08:18:50AM +0000, Arnd Bergmann wrote:
> >> On Wednesday 07 January 2015 12:37:51 Ganapatrao Kulkarni wrote:
> >> > Hi Arnd,
> >> >
> >> > On Wed, Jan 7, 2015 at 1:32 AM, Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org> wrote:
> >> > > On Tuesday 06 January 2015 15:04:26 Ganapatrao Kulkarni wrote:
> >> > >> On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org> wrote:
> >> > >> > On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
> >> > >> >> +             cpu@00f {
> >> > >> >> +                     device_type = "cpu";
> >> > >> >> +                     compatible = "cavium,thunder", "arm,armv8";
> >> > >> >> +                     reg = <0x0 0x00f>;
> >> > >> >> +                     enable-method = "psci";
> >> > >> >> +                     arm,associativity = <0 0 0x00f>;
> >> > >> >> +             };
> >> > >> >> +             cpu@100 {
> >> > >> >> +                     device_type = "cpu";
> >> > >> >> +                     compatible = "cavium,thunder", "arm,armv8";
> >> > >> >> +                     reg = <0x0 0x100>;
> >> > >> >> +                     enable-method = "psci";
> >> > >> >> +                     arm,associativity = <0 0 0x100>;
> >> > >> >> +             };
> >> > >> >
> >> > >> > What is the 0x100 offset in the last-level topology field? Does this have
> >> > >> > no significance to topology at all? I would expect that to be something
> >> > >> > like cluster number that is relevant to caching and should be represented
> >> > >> > as a separate level.
> >> > >>
> >> > >> i did not understand, can you please explain little more about "
> >> > >> should be represented as a separate level."
> >> > >> at present, i have put the hwid of a cpu.
> >> > >
> >> > > From what I undertand, the hwid of the CPU contains the "cluster" number in
> >> > > this bit position, so you typically have a shared L2 or L3 cache between
> >> > > all cores within a cluster, but separate caches in other clusters.
> >> > >
> >> > > If this is the case, there will be a measurable difference in performance
> >> > > between two processes sharing memory when running on the same cluster,
> >> > > or when running on different clusters on the same socket. If the
> >> > > performance difference is relevant, it should be described as a separate
> >> > > level in the associativity property.
> >> > you mean, the associativity as array of  <board> <socket> <cluster>
> >>
> >> No, that would leave out the core number, which is required to identify
> >> the individual thread. I meant adding an extra level such as
> >>
> >> <board> <socket> <cluster> <core>
> >>
> >> A lot of machines will leave out the <board> number because they are
> >> built with SoCs that don't have a long-distance coherency protocol.
> >
> > Can't we use phandles to cpu-map nodes instead of a list of numbers (and
> > yet another topology binding description) ?
> cpu-map describes only a cpu topology.
> infact, i have tried initially(in v1 patch set) to use topology for
> the numa mapping.
> However, for numa, we need to define association of cpu, memory and IOs.
> arm,associativity is a generic node property and can be used in any dt nodes.

I understand that, I was advising to define "arm,associativity" as a
phandle in cpu nodes AND all devices.

Why can't you make it point at a phandle in the cpu-map instead of adding
a tuple doing the same thing? Am I missing something here?
cpu-map allows you to describe the system hierarchy and can expand beyond
clusters (several layers of clustering; above the core level it is just a
way to define the system hierarchy, and leaf nodes will always be cores or
threads).

On a side note, one of the reasons cpu-map was devised was exactly
that: to allow mapping of resources (i.e. IRQs, but it is valid for caches
and other devices too) to groups of CPUs.

Is there anything that you can't do by using cpu-map phandles to
describe device associativity?

We have to add bindings that allow the distance to be computed, as you
do by using the reference points (I am reading the code to figure
out how it is used), but that's feasible as a binding update.
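
As a rough illustration of the kind of distance computation meant here,
the sketch below assumes the four-level <board> <socket> <cluster> <core>
tuple suggested earlier in the thread, and assumes that the hwid carries
the cluster number in bits [15:8] and the core number in bits [7:0]. The
helper names (assoc_from_hwid, same_domain, assoc_distance), the ordering
of the reference levels and the doubling rule are all hypothetical; this
is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ASSOC_LEVELS   4   /* <board> <socket> <cluster> <core> */
#define LOCAL_DISTANCE 10  /* conventional distance of a node to itself */

struct assoc {
	uint32_t level[ASSOC_LEVELS];
};

/* Assumption: hwid bits [15:8] hold the cluster, bits [7:0] the core. */
static struct assoc assoc_from_hwid(uint32_t board, uint32_t socket,
				    uint32_t hwid)
{
	struct assoc a = { { board, socket, (hwid >> 8) & 0xff, hwid & 0xff } };

	return a;
}

/* True if a and b agree on every level from 0 up to and including 'last'. */
static int same_domain(const struct assoc *a, const struct assoc *b,
		       size_t last)
{
	size_t i;

	for (i = 0; i <= last; i++)
		if (a->level[i] != b->level[i])
			return 0;
	return 1;
}

/*
 * ref[] lists the tuple indices that matter for NUMA, innermost first.
 * The distance doubles for every reference level that a and b do not share.
 */
static int assoc_distance(const struct assoc *a, const struct assoc *b,
			  const size_t *ref, size_t nref)
{
	int distance = LOCAL_DISTANCE;
	size_t i;

	for (i = 0; i < nref; i++) {
		assert(ref[i] < ASSOC_LEVELS);
		if (same_domain(a, b, ref[i]))
			break;
		distance *= 2;
	}
	return distance;
}
```

With ref = { 2, 1 } (cluster, then socket), hwid 0x00f and hwid 0x100 on
the same socket come out at distance 20, and two CPUs on different sockets
at 40. With ref = { 1 } only the socket boundary counts, so the first pair
drops back to 10.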

Lorenzo

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
  2015-01-14 23:49                           ` Lorenzo Pieralisi
@ 2015-01-15 17:32                               ` Arnd Bergmann
  -1 siblings, 0 replies; 43+ messages in thread
From: Arnd Bergmann @ 2015-01-15 17:32 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: Ganapatrao Kulkarni, Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Will Deacon,
	Catalin Marinas, grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Leif Lindholm, Roy Franz,
	Ard Biesheuvel, msalter-H+wXaHxf7aLQT0dZR+AlfA, Rob Herring,
	Steve Capper, hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	jchandra-dY08KVG/lbpWk0Htik3J/w, Al Stone

On Wednesday 14 January 2015 23:49:05 Lorenzo Pieralisi wrote:
> On Wed, Jan 14, 2015 at 06:48:32PM +0000, Ganapatrao Kulkarni wrote:
> > On Wed, Jan 14, 2015 at 11:06 PM, Lorenzo Pieralisi
> > <lorenzo.pieralisi-5wv7dgnIgG8@public.gmane.org> wrote:
> > > On Wed, Jan 07, 2015 at 08:18:50AM +0000, Arnd Bergmann wrote:
> > >> No, that would leave out the core number, which is required to identify
> > >> the individual thread. I meant adding an extra level such as
> > >>
> > >> <board> <socket> <cluster> <core>
> > >>
> > >> A lot of machines will leave out the <board> number because they are
> > >> built with SoCs that don't have a long-distance coherency protocol.
> > >
> > > Can't we use phandles to cpu-map nodes instead of a list of numbers (and
> > > yet another topology binding description) ?
> > cpu-map describes only a cpu topology.
> > in fact, i have tried initially (in v1 patch set) to use topology for
> > the numa mapping.
> > However, for numa, we need to define association of cpu, memory and IOs.
> > arm,associativity is a generic node property and can be used in any dt nodes.
> 
> I understand that, I was advising to define "arm,associativity" as a
> phandle in cpu nodes AND all devices.
> 
> Why can't you make it point at a phandle in the cpu-map instead of adding
> a tuple doing the same thing? Am I missing something here?

Most importantly, it's following an existing spec for ibm,associativity,
which defines topology in terms of associativity, not a hierarchical tree.

> cpu-map allows you to describe the system hierarchy and can expand beyond
> clusters (several layers of clustering; above the core level it is just a
> way to define the system hierarchy, and leaf nodes will always be cores or
> threads).
> 
> On a side note, one of the reasons cpu-map was devised was exactly
> that: to allow mapping of resources (i.e. IRQs, but it is valid for caches
> and other devices too) to groups of CPUs.
> 
> Is there anything that you can't do by using cpu-map phandles to
> describe device associativity?

- It doesn't work for cpu-less nodes.
- It fails if you have multiple paths between two devices, rather than
  a strict tree.
- It doesn't (yet) have a way to define which levels are relevant to NUMA
  topology.
- The phandle references are done in the wrong way if you want to
  represent a lot of devices.
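
For the cpu-less case, a minimal sketch of why per-node tuples do not have
this problem, assuming a hypothetical two-level <board> <socket> tuple
attached to every node (CPU, memory or device) and taking the <socket>
value as the NUMA node ID. The table contents and the names used here are
invented for illustration and are not taken from the patch or the thunder
dts.

```c
#include <stddef.h>
#include <stdint.h>

enum dt_kind { DT_CPU, DT_MEMORY, DT_DEVICE };

struct dt_entry {
	enum dt_kind kind;
	uint32_t assoc[2];   /* hypothetical <board> <socket> tuple */
};

#define NUMA_LEVEL 1     /* assumption: the <socket> entry is the node ID */

/* Made-up system: socket 1 has memory and a device, but no CPUs. */
static const struct dt_entry example_nodes[] = {
	{ DT_CPU,    { 0, 0 } },
	{ DT_CPU,    { 0, 0 } },
	{ DT_MEMORY, { 0, 0 } },
	{ DT_MEMORY, { 0, 1 } },
	{ DT_DEVICE, { 0, 1 } },
};

#define EXAMPLE_COUNT (sizeof(example_nodes) / sizeof(example_nodes[0]))

/* Count the entries of one kind whose tuple places them in NUMA node 'nid'. */
static size_t count_in_node(const struct dt_entry *tbl, size_t n,
			    enum dt_kind kind, uint32_t nid)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (tbl[i].kind == kind && tbl[i].assoc[NUMA_LEVEL] == nid)
			count++;
	return count;
}
```

Here count_in_node(example_nodes, EXAMPLE_COUNT, DT_CPU, 1) is 0 while the
DT_MEMORY count for node 1 is 1: the memory node is placed by its own
tuple and does not need a CPU to point at.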

> We have to add bindings that allow the distance to be computed, as you
> do by using the reference points (I am reading the code to figure
> out how it is used), but that's feasible as a binding update.

It's very unfortunate that we have two conflicting bindings that are
established. I still think that the associativity binding is more
flexible. We could try to extend the arm topology binding if necessary,
but I'm not sure the end result of that would be better.

	Arnd

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [RFC PATCH v3 0/4] arm64:numa: Add numa support for arm64 platforms.
@ 2014-12-31  7:36 ` Ganapatrao Kulkarni
  0 siblings, 0 replies; 43+ messages in thread
From: Ganapatrao Kulkarni @ 2014-12-31  7:36 UTC (permalink / raw)
  To: linux-arm-kernel, Will.Deacon, catalin.marinas, grant.likely,
	devicetree, leif.lindholm, roy.franz, ard.biesheuvel, msalter,
	robh+dt, steve.capper, hanjun.guo, jchandra, al.stone, arnd
  Cc: gpkulkarni

This is the v3 patch set to support NUMA on arm64-based platforms.
These patches were tested on Cavium's multinode (2-node topology) platform.

This patchset defines and implements dt bindings for NUMA mapping of cores
and memory, using the device node property arm,associativity.

v2:
Defined and implemented numa map for memory, cores to node and
proximity distance matrix of nodes to each other.

v1:
Initial patchset to support numa on arm64 platforms.

Ganapatrao Kulkarni (4):
  arm64: defconfig: increase NR_CPUS range to 2-4096.
  Documentation: arm64/arm: dt bindings for numa.
  arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node
    topology.
  arm64:numa: adding numa support for arm64 platforms.

 Documentation/devicetree/bindings/arm/numa.txt | 198 +++++++
 arch/arm64/Kconfig                             |  36 +-
 arch/arm64/boot/dts/thunder-88xx-2n.dts        |  78 +++
 arch/arm64/boot/dts/thunder-88xx-2n.dtsi       | 789 +++++++++++++++++++++++++
 arch/arm64/include/asm/mmzone.h                |  32 +
 arch/arm64/include/asm/numa.h                  |  45 ++
 arch/arm64/kernel/Makefile                     |   1 +
 arch/arm64/kernel/dt_numa.c                    | 296 ++++++++++
 arch/arm64/kernel/setup.c                      |   8 +
 arch/arm64/kernel/smp.c                        |   2 +
 arch/arm64/mm/Makefile                         |   1 +
 arch/arm64/mm/init.c                           |  34 +-
 arch/arm64/mm/numa.c                           | 520 ++++++++++++++++
 13 files changed, 2032 insertions(+), 8 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
 create mode 100644 arch/arm64/boot/dts/thunder-88xx-2n.dts
 create mode 100644 arch/arm64/boot/dts/thunder-88xx-2n.dtsi
 create mode 100644 arch/arm64/include/asm/mmzone.h
 create mode 100644 arch/arm64/include/asm/numa.h
 create mode 100644 arch/arm64/kernel/dt_numa.c
 create mode 100644 arch/arm64/mm/numa.c

-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 43+ messages in thread

Thread overview: 43+ messages
2014-12-31  7:33 [RFC PATCH v3 0/4] arm64:numa: Add numa support for arm64 platforms Ganapatrao Kulkarni
     [not found] ` <1420011208-7051-1-git-send-email-ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
2014-12-31  7:33   ` [RFC PATCH v3 1/4] arm64: defconfig: increase NR_CPUS range to 2-4096 Ganapatrao Kulkarni
     [not found]     ` <1420011208-7051-2-git-send-email-ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
2015-01-02 10:49       ` Arnd Bergmann
2015-01-02 21:17       ` Arnd Bergmann
2015-01-02 21:17         ` Arnd Bergmann
2014-12-31  7:33   ` [RFC PATCH v3 2/4] Documentation: arm64/arm: dt bindings for numa Ganapatrao Kulkarni
     [not found]     ` <1420011208-7051-3-git-send-email-ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
2015-01-02 11:02       ` Arnd Bergmann
2015-01-02 21:17       ` Arnd Bergmann
2015-01-02 21:17         ` Arnd Bergmann
2015-01-06  5:28         ` Ganapatrao Kulkarni
2015-01-06  5:28           ` Ganapatrao Kulkarni
2014-12-31  7:33   ` [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology Ganapatrao Kulkarni
     [not found]     ` <1420011208-7051-4-git-send-email-ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
2015-01-02 10:49       ` Arnd Bergmann
2015-01-02 21:17       ` Arnd Bergmann
2015-01-02 21:17         ` Arnd Bergmann
2015-01-06  9:34         ` Ganapatrao Kulkarni
2015-01-06  9:34           ` Ganapatrao Kulkarni
     [not found]           ` <CAFpQJXXnM==4AmmNHf8yp2x0aK4Lnp2cy-4JpzuxsgXX3A=J4A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-01-06 20:02             ` Arnd Bergmann
2015-01-06 20:02               ` Arnd Bergmann
2015-01-07  7:07               ` Ganapatrao Kulkarni
2015-01-07  7:07                 ` Ganapatrao Kulkarni
2015-01-07  8:18                 ` Arnd Bergmann
2015-01-07  8:18                   ` Arnd Bergmann
2015-01-14 17:36                   ` Lorenzo Pieralisi
2015-01-14 17:36                     ` Lorenzo Pieralisi
2015-01-14 18:48                     ` Ganapatrao Kulkarni
2015-01-14 18:48                       ` Ganapatrao Kulkarni
     [not found]                       ` <CAFpQJXXiUmK+BdQAFxV_JCuLyDSm89M7pe+enc2ZMTbuyC-T+g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-01-14 23:49                         ` Lorenzo Pieralisi
2015-01-14 23:49                           ` Lorenzo Pieralisi
     [not found]                           ` <20150114234905.GB18194-7AyDDHkRsp3ZROr8t4l/smS4ubULX0JqMm0uRHvK7Nw@public.gmane.org>
2015-01-15 17:32                             ` Arnd Bergmann
2015-01-15 17:32                               ` Arnd Bergmann
2014-12-31  7:33   ` [RFC PATCH v3 4/4] arm64:numa: adding numa support for arm64 platforms Ganapatrao Kulkarni
     [not found]     ` <1420011208-7051-5-git-send-email-ganapatrao.kulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
2015-01-02 11:34       ` Arnd Bergmann
2015-01-02 21:10       ` Arnd Bergmann
2015-01-02 21:10         ` Arnd Bergmann
2015-01-06  9:25         ` Ganapatrao Kulkarni
2015-01-06  9:25           ` Ganapatrao Kulkarni
     [not found]           ` <CAFpQJXVQnRY9wZRk-83bQLw0m=41ZxH2v-YZCHdF0fvC8LS_Tw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-01-06 19:59             ` Arnd Bergmann
2015-01-06 19:59               ` Arnd Bergmann
2015-01-07  7:09               ` Ganapatrao Kulkarni
2015-01-07  7:09                 ` Ganapatrao Kulkarni
2014-12-31  7:36 [RFC PATCH v3 0/4] arm64:numa: Add " Ganapatrao Kulkarni
2014-12-31  7:36 ` Ganapatrao Kulkarni
