* [PATCH v3 00/15] multi-cluster power management
@ 2013-01-29  7:50 Nicolas Pitre
  2013-01-29  7:50 ` [PATCH v3 01/15] ARM: multi-cluster PM: secondary kernel entry code Nicolas Pitre
                   ` (15 more replies)
  0 siblings, 16 replies; 54+ messages in thread
From: Nicolas Pitre @ 2013-01-29  7:50 UTC (permalink / raw)
  To: linux-arm-kernel

This is version 3 of the patch series required to safely power up
and down CPUs in a cluster as can be found in b.L systems.  Also
included are the needed patches to allow CPU hotplug on RTSM configured
for big.LITTLE.

This is now called "Multi-Cluster Power Management", or mcpm for short.
At least that makes for a prefix which is not already used in the kernel
and therefore is unlikely to be ambiguous.  Despite the name, this can be
used on single cluster systems as well if appropriate.

Please refer to http://article.gmane.org/gmane.linux.ports.arm.kernel/208625
for the initial series and particularly the cover page blurb for this work.

Thanks to those who provided review comments.

Changes from v2:

- The bL_ prefix has been changed into mcpm_ and surroundings adjusted
  accordingly.
- Documentation moved up one level in Documentation/arm/.
- Clarifications in commit log for patch #1 about future work.
- The debug macro in mcpm_head.S now displays CPU and cluster numbers.
- Patch improving mcpm_cpu_die() folded into the original patch that
  created it.
- Return -EADDRNOTAVAIL on ioremap failure.
- The auxcr patch moved down in the series to better identify dependencies.

Changes from v1:

- Pulled in Rob Herring's auxcr accessor patch and converted this series
  to it.
- Major rework of various barriers (some DSBs demoted to DMBs, etc.)
- The sync_mem() macro is now split and enhanced to properly process the
  cache for writers and readers in the cluster critical region helpers.
- BL_NR_CLUSTERS and BL_CPUS_PER_CLUSTER renamed to BL_MAX_CLUSTERS
  and BL_MAX_CPUS_PER_CLUSTER.
- Removed unused C definitions and prototypes for vlocks.
- Simplified the vlock memory allocation.
- The vlock code is GPL v2.
- Replaced MPIDR inline asm by read_cpuid_mpidr().
- Use of MPIDR_AFFINITY_LEVEL() to replace explicit shifts and masks.
- Dropped gic_cpu_if_down().
- Added a DSB before SEV and WFI.
- Fixed power_up_setup helper prototype.
- Nuked smp_wmb() in bL_set_entry_vector().
- Moved the CCI driver to drivers/bus/.
- Dependency on CONFIG_EXPERIMENTAL removed.
- Leftover garbage in Makefile removed.
- Added/clarified various comments in the assembly code.
- Some documentation typos fixed.
- Copyright notices updated to 2013.

Still not addressed yet in this series:

- The CCI and DCSCB device tree binding descriptions.

Diffstat:

 Documentation/arm/cluster-pm-race-avoidance.txt | 498 ++++++++++++++++++
 Documentation/arm/vlocks.txt                    | 211 ++++++++
 arch/arm/Kconfig                                |   8 +
 arch/arm/common/Makefile                        |   1 +
 arch/arm/common/mcpm_entry.c                    | 314 +++++++++++
 arch/arm/common/mcpm_head.S                     | 219 ++++++++
 arch/arm/common/mcpm_platsmp.c                  |  85 +++
 arch/arm/common/vlock.S                         | 108 ++++
 arch/arm/common/vlock.h                         |  29 +
 arch/arm/include/asm/cp15.h                     |  14 +
 arch/arm/include/asm/mach/arch.h                |   3 +
 arch/arm/include/asm/mcpm_entry.h               | 190 +++++++
 arch/arm/kernel/setup.c                         |   5 +-
 arch/arm/mach-vexpress/Kconfig                  |   9 +
 arch/arm/mach-vexpress/Makefile                 |   1 +
 arch/arm/mach-vexpress/core.h                   |   2 +
 arch/arm/mach-vexpress/dcscb.c                  | 249 +++++++++
 arch/arm/mach-vexpress/dcscb_setup.S            |  80 +++
 arch/arm/mach-vexpress/platsmp.c                |  12 +
 arch/arm/mach-vexpress/v2m.c                    |   2 +-
 drivers/bus/Kconfig                             |   5 +
 drivers/bus/Makefile                            |   2 +
 drivers/bus/arm-cci.c                           | 124 +++++
 drivers/cpuidle/cpuidle-calxeda.c               |  14 -
 include/linux/arm-cci.h                         |  30 ++
 25 files changed, 2199 insertions(+), 16 deletions(-)

Nicolas


* [PATCH v3 01/15] ARM: multi-cluster PM: secondary kernel entry code
  2013-01-29  7:50 [PATCH v3 00/15] multi-cluster power management Nicolas Pitre
@ 2013-01-29  7:50 ` Nicolas Pitre
  2013-01-31 15:45   ` Santosh Shilimkar
  2013-01-29  7:50 ` [PATCH v3 02/15] ARM: mcpm: introduce the CPU/cluster power API Nicolas Pitre
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 54+ messages in thread
From: Nicolas Pitre @ 2013-01-29  7:50 UTC (permalink / raw)
  To: linux-arm-kernel

CPUs in cluster based systems, such as big.LITTLE, have special needs
when entering the kernel due to a hotplug event, or when resuming from
a deep sleep mode.

This is vectorized so multiple CPUs can enter the kernel in parallel
without serialization.

The mcpm prefix stands for "multi cluster power management", however
this is usable on single cluster systems as well.  Only the basic
structure is introduced here.  This will be extended with later patches.

In order not to complicate things more than they currently have to be,
the planned work to make runtime adjusted MPIDR based indexing and
dynamic memory allocation for cluster states is postponed to a later
cycle. The MAX_NR_CLUSTERS and MAX_CPUS_PER_CLUSTER static definitions
should be sufficient for those systems expected to be available in the
near future.
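
To illustrate the intended use, here is a minimal sketch of a
hypothetical platform's secondary boot path.  Note that
platform_release_from_reset() is an invented stand-in for whatever
register poking the given hardware needs; only mcpm_set_entry_vector()
and mcpm_entry_point come from this patch:

	#include <asm/mcpm_entry.h>

	extern void secondary_startup(void);	/* standard ARM SMP entry */
	extern void platform_release_from_reset(unsigned cpu, unsigned cluster);

	static void example_boot_secondary(unsigned cpu, unsigned cluster)
	{
		/*
		 * Publish the address this CPU should branch to.  A CPU
		 * parked in the mcpm_entry_point gate loops on WFE until
		 * its vector becomes non-NULL.
		 */
		mcpm_set_entry_vector(cpu, cluster, secondary_startup);

		/* Power the CPU on and/or deassert its reset line. */
		platform_release_from_reset(cpu, cluster);
	}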

Signed-off-by: Nicolas Pitre <nico@linaro.org>
---
 arch/arm/Kconfig                  |  8 ++++
 arch/arm/common/Makefile          |  1 +
 arch/arm/common/mcpm_entry.c      | 29 +++++++++++++
 arch/arm/common/mcpm_head.S       | 86 +++++++++++++++++++++++++++++++++++++++
 arch/arm/include/asm/mcpm_entry.h | 35 ++++++++++++++++
 5 files changed, 159 insertions(+)
 create mode 100644 arch/arm/common/mcpm_entry.c
 create mode 100644 arch/arm/common/mcpm_head.S
 create mode 100644 arch/arm/include/asm/mcpm_entry.h

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 67874b82a4..200f559c1c 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1584,6 +1584,14 @@ config HAVE_ARM_TWD
 	help
 	  This options enables support for the ARM timer and watchdog unit
 
+config CLUSTER_PM
+	bool "Cluster Power Management Infrastructure"
+	depends on CPU_V7 && SMP
+	help
+	  This option provides the common power management infrastructure
+	  for (multi-)cluster based systems, such as big.LITTLE based
+	  systems.
+
 choice
 	prompt "Memory split"
 	default VMSPLIT_3G
diff --git a/arch/arm/common/Makefile b/arch/arm/common/Makefile
index e8a4e58f1b..23e85b1fae 100644
--- a/arch/arm/common/Makefile
+++ b/arch/arm/common/Makefile
@@ -13,3 +13,4 @@ obj-$(CONFIG_SHARP_PARAM)	+= sharpsl_param.o
 obj-$(CONFIG_SHARP_SCOOP)	+= scoop.o
 obj-$(CONFIG_PCI_HOST_ITE8152)  += it8152.o
 obj-$(CONFIG_ARM_TIMER_SP804)	+= timer-sp.o
+obj-$(CONFIG_CLUSTER_PM)	+= mcpm_head.o mcpm_entry.o
diff --git a/arch/arm/common/mcpm_entry.c b/arch/arm/common/mcpm_entry.c
new file mode 100644
index 0000000000..3a6d7e70fd
--- /dev/null
+++ b/arch/arm/common/mcpm_entry.c
@@ -0,0 +1,29 @@
+/*
+ * arch/arm/common/mcpm_entry.c -- entry point for multi-cluster PM
+ *
+ * Created by:  Nicolas Pitre, March 2012
+ * Copyright:   (C) 2012-2013  Linaro Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/kernel.h>
+#include <linux/init.h>
+
+#include <asm/mcpm_entry.h>
+#include <asm/barrier.h>
+#include <asm/proc-fns.h>
+#include <asm/cacheflush.h>
+
+extern volatile unsigned long mcpm_entry_vectors[MAX_NR_CLUSTERS][MAX_CPUS_PER_CLUSTER];
+
+void mcpm_set_entry_vector(unsigned cpu, unsigned cluster, void *ptr)
+{
+	unsigned long val = ptr ? virt_to_phys(ptr) : 0;
+	mcpm_entry_vectors[cluster][cpu] = val;
+	__cpuc_flush_dcache_area((void *)&mcpm_entry_vectors[cluster][cpu], 4);
+	outer_clean_range(__pa(&mcpm_entry_vectors[cluster][cpu]),
+			  __pa(&mcpm_entry_vectors[cluster][cpu + 1]));
+}
diff --git a/arch/arm/common/mcpm_head.S b/arch/arm/common/mcpm_head.S
new file mode 100644
index 0000000000..794c8ea8c4
--- /dev/null
+++ b/arch/arm/common/mcpm_head.S
@@ -0,0 +1,86 @@
+/*
+ * arch/arm/common/mcpm_head.S -- kernel entry point for multi-cluster PM
+ *
+ * Created by:  Nicolas Pitre, March 2012
+ * Copyright:   (C) 2012-2013  Linaro Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/linkage.h>
+#include <asm/mcpm_entry.h>
+
+	.macro	pr_dbg	string
+#if defined(CONFIG_DEBUG_LL) && defined(DEBUG)
+	b	1901f
+1902:	.asciz	"CPU"
+1903:	.asciz	" cluster"
+1904:	.asciz	": \string"
+	.align
+1901:	adr	r0, 1902b
+	bl	printascii
+	mov	r0, r9
+	bl	printhex8
+	adr	r0, 1903b
+	bl	printascii
+	mov	r0, r10
+	bl	printhex8
+	adr	r0, 1904b
+	bl	printascii
+#endif
+	.endm
+
+	.arm
+	.align
+
+ENTRY(mcpm_entry_point)
+
+ THUMB(	adr	r12, BSYM(1f)	)
+ THUMB(	bx	r12		)
+ THUMB(	.thumb			)
+1:
+	mrc	p15, 0, r0, c0, c0, 5		@ MPIDR
+	ubfx	r9, r0, #0, #4			@ r9 = cpu
+	ubfx	r10, r0, #8, #4			@ r10 = cluster
+	mov	r3, #MAX_CPUS_PER_CLUSTER
+	mla	r4, r3, r10, r9			@ r4 = canonical CPU index
+	cmp	r4, #(MAX_CPUS_PER_CLUSTER * MAX_NR_CLUSTERS)
+	blo	2f
+
+	/* We didn't expect this CPU.  Try to cheaply make it quiet. */
+1:	wfi
+	wfe
+	b	1b
+
+2:	pr_dbg	"kernel mcpm_entry_point\n"
+
+	/*
+	 * MMU is off so we need to get to mcpm_entry_vectors in a
+	 * position independent way.
+	 */
+	adr	r5, 3f
+	ldr	r6, [r5]
+	add	r6, r5, r6			@ r6 = mcpm_entry_vectors
+
+mcpm_entry_gated:
+	ldr	r5, [r6, r4, lsl #2]		@ r5 = CPU entry vector
+	cmp	r5, #0
+	wfeeq
+	beq	mcpm_entry_gated
+	pr_dbg	"released\n"
+	bx	r5
+
+	.align	2
+
+3:	.word	mcpm_entry_vectors - .
+
+ENDPROC(mcpm_entry_point)
+
+	.bss
+	.align	5
+
+	.type	mcpm_entry_vectors, #object
+ENTRY(mcpm_entry_vectors)
+	.space	4 * MAX_NR_CLUSTERS * MAX_CPUS_PER_CLUSTER
diff --git a/arch/arm/include/asm/mcpm_entry.h b/arch/arm/include/asm/mcpm_entry.h
new file mode 100644
index 0000000000..cc10ebbd2e
--- /dev/null
+++ b/arch/arm/include/asm/mcpm_entry.h
@@ -0,0 +1,35 @@
+/*
+ * arch/arm/include/asm/mcpm_entry.h
+ *
+ * Created by:  Nicolas Pitre, April 2012
+ * Copyright:   (C) 2012-2013  Linaro Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef MCPM_ENTRY_H
+#define MCPM_ENTRY_H
+
+#define MAX_CPUS_PER_CLUSTER	4
+#define MAX_NR_CLUSTERS		2
+
+#ifndef __ASSEMBLY__
+
+/*
+ * Platform specific code should use this symbol to set up secondary
+ * entry location for processors to use when released from reset.
+ */
+extern void mcpm_entry_point(void);
+
+/*
+ * This is used to indicate where the given CPU from given cluster should
+ * branch once it is ready to re-enter the kernel using ptr, or NULL if it
+ * should be gated.  A gated CPU is held in a WFE loop until its vector
+ * becomes non NULL.
+ */
+void mcpm_set_entry_vector(unsigned cpu, unsigned cluster, void *ptr);
+
+#endif /* ! __ASSEMBLY__ */
+#endif
-- 
1.8.1.2


* [PATCH v3 02/15] ARM: mcpm: introduce the CPU/cluster power API
  2013-01-29  7:50 [PATCH v3 00/15] multi-cluster power management Nicolas Pitre
  2013-01-29  7:50 ` [PATCH v3 01/15] ARM: multi-cluster PM: secondary kernel entry code Nicolas Pitre
@ 2013-01-29  7:50 ` Nicolas Pitre
  2013-01-31 15:55   ` Santosh Shilimkar
  2013-01-29  7:50 ` [PATCH v3 03/15] ARM: mcpm: introduce helpers for platform coherency exit/setup Nicolas Pitre
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 54+ messages in thread
From: Nicolas Pitre @ 2013-01-29  7:50 UTC (permalink / raw)
  To: linux-arm-kernel

This is the basic API used to handle the powering up/down of individual
CPUs in a (multi-)cluster system.  The platform specific backend
implementation is responsible for also handling the cluster level power
when the first/last CPU in a cluster is brought up/down.
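
As an illustration, here is a minimal sketch of how a backend might
hook itself up.  The example_* functions are hypothetical stand-ins
for real platform code (such as the DCSCB backend later in this
series), not part of this patch:

	#include <linux/init.h>
	#include <asm/mcpm_entry.h>

	static int example_power_up(unsigned int cpu, unsigned int cluster)
	{
		/* assert power-on and deassert reset for the CPU/cluster */
		return 0;
	}

	static void example_power_down(void)
	{
		/* exit coherency, clean caches, then WFI */
	}

	static void example_suspend(u64 expected_residency)
	{
		/* pick a sleep state appropriate for expected_residency */
	}

	static const struct mcpm_platform_ops example_ops = {
		.power_up	= example_power_up,
		.power_down	= example_power_down,
		.suspend	= example_suspend,
	};

	static int __init example_mcpm_init(void)
	{
		return mcpm_platform_register(&example_ops);
	}
	early_initcall(example_mcpm_init);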

Signed-off-by: Nicolas Pitre <nico@linaro.org>
---
 arch/arm/common/mcpm_entry.c      | 88 +++++++++++++++++++++++++++++++++++++
 arch/arm/include/asm/mcpm_entry.h | 92 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 180 insertions(+)

diff --git a/arch/arm/common/mcpm_entry.c b/arch/arm/common/mcpm_entry.c
index 3a6d7e70fd..c8c0e2113e 100644
--- a/arch/arm/common/mcpm_entry.c
+++ b/arch/arm/common/mcpm_entry.c
@@ -11,11 +11,13 @@
 
 #include <linux/kernel.h>
 #include <linux/init.h>
+#include <linux/irqflags.h>
 
 #include <asm/mcpm_entry.h>
 #include <asm/barrier.h>
 #include <asm/proc-fns.h>
 #include <asm/cacheflush.h>
+#include <asm/idmap.h>
 
 extern volatile unsigned long mcpm_entry_vectors[MAX_NR_CLUSTERS][MAX_CPUS_PER_CLUSTER];
 
@@ -27,3 +29,89 @@ void mcpm_set_entry_vector(unsigned cpu, unsigned cluster, void *ptr)
 	outer_clean_range(__pa(&mcpm_entry_vectors[cluster][cpu]),
 			  __pa(&mcpm_entry_vectors[cluster][cpu + 1]));
 }
+
+static const struct mcpm_platform_ops *platform_ops;
+
+int __init mcpm_platform_register(const struct mcpm_platform_ops *ops)
+{
+	if (platform_ops)
+		return -EBUSY;
+	platform_ops = ops;
+	return 0;
+}
+
+int mcpm_cpu_power_up(unsigned int cpu, unsigned int cluster)
+{
+	if (!platform_ops)
+		return -EUNATCH; /* try not to shadow power_up errors */
+	might_sleep();
+	return platform_ops->power_up(cpu, cluster);
+}
+
+typedef void (*phys_reset_t)(unsigned long);
+
+void mcpm_cpu_power_down(void)
+{
+	phys_reset_t phys_reset;
+
+	BUG_ON(!platform_ops);
+	BUG_ON(!irqs_disabled());
+
+	/*
+	 * Do this before calling into the power_down method,
+	 * as it might not always be safe to do afterwards.
+	 */
+	setup_mm_for_reboot();
+
+	platform_ops->power_down();
+
+	/*
+	 * It is possible for a power_up request to happen concurrently
+	 * with a power_down request for the same CPU. In this case the
+	 * power_down method might not be able to actually enter a
+	 * powered down state with the WFI instruction if the power_up
+	 * method has removed the required reset condition.  The
+	 * power_down method is then allowed to return. We must perform
+	 * a re-entry in the kernel as if the power_up method just had
+	 * deasserted reset on the CPU.
+	 *
+	 * To simplify race issues, the platform specific implementation
+	 * must accommodate for the possibility of unordered calls to
+	 * power_down and power_up with a usage count. Therefore, if a
+	 * call to power_up is issued for a CPU that is not down, then
+	 * the next call to power_down must not attempt a full shutdown
+	 * but only do the minimum (normally disabling L1 cache and CPU
+	 * coherency) and return just as if a concurrent power_up request
+	 * had happened as described above.
+	 */
+
+	phys_reset = (phys_reset_t)(unsigned long)virt_to_phys(cpu_reset);
+	phys_reset(virt_to_phys(mcpm_entry_point));
+
+	/* should never get here */
+	BUG();
+}
+
+void mcpm_cpu_suspend(u64 expected_residency)
+{
+	phys_reset_t phys_reset;
+
+	BUG_ON(!platform_ops);
+	BUG_ON(!irqs_disabled());
+
+	/* Very similar to mcpm_cpu_power_down() */
+	setup_mm_for_reboot();
+	platform_ops->suspend(expected_residency);
+	phys_reset = (phys_reset_t)(unsigned long)virt_to_phys(cpu_reset);
+	phys_reset(virt_to_phys(mcpm_entry_point));
+	BUG();
+}
+
+int mcpm_cpu_powered_up(void)
+{
+	if (!platform_ops)
+		return -EUNATCH;
+	if (platform_ops->powered_up)
+		platform_ops->powered_up();
+	return 0;
+}
diff --git a/arch/arm/include/asm/mcpm_entry.h b/arch/arm/include/asm/mcpm_entry.h
index cc10ebbd2e..3286d5eb91 100644
--- a/arch/arm/include/asm/mcpm_entry.h
+++ b/arch/arm/include/asm/mcpm_entry.h
@@ -31,5 +31,97 @@ extern void mcpm_entry_point(void);
  */
 void mcpm_set_entry_vector(unsigned cpu, unsigned cluster, void *ptr);
 
+/*
+ * CPU/cluster power operations API for higher subsystems to use.
+ */
+
+/**
+ * mcpm_cpu_power_up - make given CPU in given cluster runable
+ *
+ * @cpu: CPU number within given cluster
+ * @cluster: cluster number for the CPU
+ *
+ * The identified CPU is brought out of reset.  If the cluster was powered
+ * down then it is brought up as well, taking care not to let the other CPUs
+ * in the cluster run, and ensuring appropriate cluster setup.
+ *
+ * Caller must ensure the appropriate entry vector is initialized with
+ * mcpm_set_entry_vector() prior to calling this.
+ *
+ * This must be called in a sleepable context.  However, the implementation
+ * is strongly encouraged to return early and let the operation happen
+ * asynchronously, especially when significant delays are expected.
+ *
+ * If the operation cannot be performed then an error code is returned.
+ */
+int mcpm_cpu_power_up(unsigned int cpu, unsigned int cluster);
+
+/**
+ * mcpm_cpu_power_down - power the calling CPU down
+ *
+ * The calling CPU is powered down.
+ *
+ * If this CPU is found to be the "last man standing" in the cluster
+ * then the cluster is prepared for power-down too.
+ *
+ * This must be called with interrupts disabled.
+ *
+ * This does not return.  Re-entry in the kernel is expected via
+ * mcpm_entry_point.
+ */
+void mcpm_cpu_power_down(void);
+
+/**
+ * mcpm_cpu_suspend - bring the calling CPU in a suspended state
+ *
+ * @expected_residency: duration in microseconds the CPU is expected
+ *			to remain suspended, or 0 if unknown/infinity.
+ *
+ * The calling CPU is suspended.  The expected residency argument is used
+ * as a hint by the platform specific backend to implement the appropriate
+ * sleep state level according to the knowledge it has on wake-up latency
+ * for the given hardware.
+ *
+ * If this CPU is found to be the "last man standing" in the cluster
+ * then the cluster may be prepared for power-down too, if the expected
+ * residency makes it worthwhile.
+ *
+ * This must be called with interrupts disabled.
+ *
+ * This does not return.  Re-entry in the kernel is expected via
+ * mcpm_entry_point.
+ */
+void mcpm_cpu_suspend(u64 expected_residency);
+
+/**
+ * mcpm_cpu_powered_up - housekeeping work after a CPU has been powered up
+ *
+ * This lets the platform specific backend code perform needed housekeeping
+ * work.  This must be called by the newly activated CPU as soon as it is
+ * fully operational in kernel space, before it enables interrupts.
+ *
+ * If the operation cannot be performed then an error code is returned.
+ */
+int mcpm_cpu_powered_up(void);
+
+/*
+ * Platform specific methods used in the implementation of the above API.
+ */
+struct mcpm_platform_ops {
+	int (*power_up)(unsigned int cpu, unsigned int cluster);
+	void (*power_down)(void);
+	void (*suspend)(u64);
+	void (*powered_up)(void);
+};
+
+/**
+ * mcpm_platform_register - register platform specific power methods
+ *
+ * @ops: mcpm_platform_ops structure to register
+ *
+ * An error is returned if the registration has been done previously.
+ */
+int __init mcpm_platform_register(const struct mcpm_platform_ops *ops);
+
 #endif /* ! __ASSEMBLY__ */
 #endif
-- 
1.8.1.2


* [PATCH v3 03/15] ARM: mcpm: introduce helpers for platform coherency exit/setup
  2013-01-29  7:50 [PATCH v3 00/15] multi-cluster power management Nicolas Pitre
  2013-01-29  7:50 ` [PATCH v3 01/15] ARM: multi-cluster PM: secondary kernel entry code Nicolas Pitre
  2013-01-29  7:50 ` [PATCH v3 02/15] ARM: mcpm: introduce the CPU/cluster power API Nicolas Pitre
@ 2013-01-29  7:50 ` Nicolas Pitre
  2013-01-31 16:08   ` Santosh Shilimkar
  2013-01-29  7:50 ` [PATCH v3 04/15] ARM: mcpm: Add baremetal voting mutexes Nicolas Pitre
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 54+ messages in thread
From: Nicolas Pitre @ 2013-01-29  7:50 UTC (permalink / raw)
  To: linux-arm-kernel

From: Dave Martin <dave.martin@linaro.org>

This provides helper methods to coordinate between CPUs coming down
and CPUs going up, as well as documentation on the algorithms used, so
that cluster teardown and setup operations are not done for a cluster
simultaneously.

For use in the power_down() implementation (see the sketch after this
list):
  * __mcpm_cpu_going_down(unsigned int cpu, unsigned int cluster)
  * __mcpm_outbound_enter_critical(unsigned int cpu, unsigned int cluster)
  * __mcpm_outbound_leave_critical(unsigned int cluster, int state)
  * __mcpm_cpu_down(unsigned int cpu, unsigned int cluster)
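
These helpers nest roughly as follows in a backend's power_down()
method.  This is a sketch only: example_last_man_down() and the
example_exit_*() helpers are hypothetical placeholders for platform
code, and the precise ordering constraints are spelled out in the
documentation added by this patch:

	/* hypothetical platform helpers, for illustration only: */
	extern bool example_last_man_down(unsigned int cluster);
	extern void example_exit_cluster_coherency(void);
	extern void example_exit_cpu_coherency(void);

	static void example_power_down(void)
	{
		unsigned int mpidr = read_cpuid_mpidr();
		unsigned int cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
		unsigned int cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);

		/* Commit this CPU to teardown (CPU cache still on). */
		__mcpm_cpu_going_down(cpu, cluster);

		if (example_last_man_down(cluster) &&
		    __mcpm_outbound_enter_critical(cpu, cluster)) {
			/* Last man: safe to tear the whole cluster down. */
			example_exit_cluster_coherency();
			__mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
		} else {
			/* Not the last man: CPU-level teardown only. */
			example_exit_cpu_coherency();
		}

		/* Teardown complete; must precede the actual power-down. */
		__mcpm_cpu_down(cpu, cluster);

		wfi();
	}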

The power_up_setup() helper should do platform-specific setup in
preparation for turning the CPU on, such as invalidating local caches
or entering coherency.  It must be assembler for now, since it must
run before the MMU can be switched on.  It is passed the affinity level
which should be initialized.
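
The registration of that helper can be done from C; a minimal sketch,
assuming the platform provides example_power_up_setup in assembly:

	extern void example_power_up_setup(unsigned int affinity_level);

	static int __init example_sync_init(void)
	{
		return mcpm_sync_init(example_power_up_setup);
	}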

Because the mcpm_sync_struct content is looked up and modified
with the cache enabled or disabled depending on the code path, it is
crucial to always ensure proper cache maintenance to update main memory
right away.  Therefore, any cached write must be followed by a cache
clean operation and any cached read must be preceded by a cache
invalidate operation (actually a cache flush i.e. clean+invalidate to
avoid discarding possible concurrent writes) on the accessed memory.

Also, in order to prevent a cached writer from interfering with an
adjacent non-cached writer, each state variable is given its own cache
line.

Thanks to Nicolas Pitre and Achin Gupta for the help with this
patch.

Signed-off-by: Dave Martin <dave.martin@linaro.org>
---
 Documentation/arm/cluster-pm-race-avoidance.txt | 498 ++++++++++++++++++++++++
 arch/arm/common/mcpm_entry.c                    | 197 ++++++++++
 arch/arm/common/mcpm_head.S                     | 106 ++++-
 arch/arm/include/asm/mcpm_entry.h               |  63 +++
 4 files changed, 862 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/arm/cluster-pm-race-avoidance.txt

diff --git a/Documentation/arm/cluster-pm-race-avoidance.txt b/Documentation/arm/cluster-pm-race-avoidance.txt
new file mode 100644
index 0000000000..750b6fc24a
--- /dev/null
+++ b/Documentation/arm/cluster-pm-race-avoidance.txt
@@ -0,0 +1,498 @@
+Cluster-wide Power-up/power-down race avoidance algorithm
+=========================================================
+
+This file documents the algorithm which is used to coordinate CPU and
+cluster setup and teardown operations and to manage hardware coherency
+controls safely.
+
+The section "Rationale" explains what the algorithm is for and why it is
+needed.  "Basic model" explains general concepts using a simplified view
+of the system.  The other sections explain the actual details of the
+algorithm in use.
+
+
+Rationale
+---------
+
+In a system containing multiple CPUs, it is desirable to have the
+ability to turn off individual CPUs when the system is idle, reducing
+power consumption and thermal dissipation.
+
+In a system containing multiple clusters of CPUs, it is also desirable
+to have the ability to turn off entire clusters.
+
+Turning entire clusters off and on is a risky business, because it
+involves performing potentially destructive operations affecting a group
+of independently running CPUs, while the OS continues to run.  This
+means that we need some coordination in order to ensure that critical
+cluster-level operations are only performed when it is truly safe to do
+so.
+
+Simple locking may not be sufficient to solve this problem, because
+mechanisms like Linux spinlocks may rely on coherency mechanisms which
+are not immediately enabled when a cluster powers up.  Since enabling or
+disabling those mechanisms may itself be a non-atomic operation (such as
+writing some hardware registers and invalidating large caches), other
+methods of coordination are required in order to guarantee safe
+power-down and power-up at the cluster level.
+
+The mechanism presented in this document describes a coherent memory
+based protocol for performing the needed coordination.  It aims to be as
+lightweight as possible, while providing the required safety properties.
+
+
+Basic model
+-----------
+
+Each cluster and CPU is assigned a state, as follows:
+
+	DOWN
+	COMING_UP
+	UP
+	GOING_DOWN
+
+	    +---------> UP ----------+
+	    |                        v
+
+	COMING_UP                GOING_DOWN
+
+	    ^                        |
+	    +--------- DOWN <--------+
+
+
+DOWN:	The CPU or cluster is not coherent, and is either powered off or
+	suspended, or is ready to be powered off or suspended.
+
+COMING_UP: The CPU or cluster has committed to moving to the UP state.
+	It may be part way through the process of initialisation and
+	enabling coherency.
+
+UP:	The CPU or cluster is active and coherent at the hardware
+	level.  A CPU in this state is not necessarily being used
+	actively by the kernel.
+
+GOING_DOWN: The CPU or cluster has committed to moving to the DOWN
+	state.  It may be part way through the process of teardown and
+	coherency exit.
+
+
+Each CPU has one of these states assigned to it at any point in time.
+The CPU states are described in the "CPU state" section, below.
+
+Each cluster is also assigned a state, but it is necessary to split the
+state value into two parts (the "cluster" state and "inbound" state) and
+to introduce additional states in order to avoid races between different
+CPUs in the cluster simultaneously modifying the state.  The cluster-
+level states are described in the "Cluster state" section.
+
+To help distinguish the CPU states from cluster states in this
+discussion, the state names are given a CPU_ prefix for the CPU states,
+and a CLUSTER_ or INBOUND_ prefix for the cluster states.
+
+
+CPU state
+---------
+
+In this algorithm, each individual core in a multi-core processor is
+referred to as a "CPU".  CPUs are assumed to be single-threaded:
+therefore, a CPU can only be doing one thing@a single point in time.
+
+This means that CPUs fit the basic model closely.
+
+The algorithm defines the following states for each CPU in the system:
+
+	CPU_DOWN
+	CPU_COMING_UP
+	CPU_UP
+	CPU_GOING_DOWN
+
+	 cluster setup and
+	CPU setup complete          policy decision
+	      +-----------> CPU_UP ------------+
+	      |                                v
+
+	CPU_COMING_UP                   CPU_GOING_DOWN
+
+	      ^                                |
+	      +----------- CPU_DOWN <----------+
+	 policy decision           CPU teardown complete
+	or hardware event
+
+
+The definitions of the four states correspond closely to the states of
+the basic model.
+
+Transitions between states occur as follows.
+
+A trigger event (spontaneous) means that the CPU can transition to the
+next state as a result of making local progress only, with no
+requirement for any external event to happen.
+
+
+CPU_DOWN:
+
+	A CPU reaches the CPU_DOWN state when it is ready for
+	power-down.  On reaching this state, the CPU will typically
+	power itself down or suspend itself, via a WFI instruction or a
+	firmware call.
+
+	Next state:	CPU_COMING_UP
+	Conditions:	none
+
+	Trigger events:
+
+		a) an explicit hardware power-up operation, resulting
+		   from a policy decision on another CPU;
+
+		b) a hardware event, such as an interrupt.
+
+
+CPU_COMING_UP:
+
+	A CPU cannot start participating in hardware coherency until the
+	cluster is set up and coherent.  If the cluster is not ready,
+	then the CPU will wait in the CPU_COMING_UP state until the
+	cluster has been set up.
+
+	Next state:	CPU_UP
+	Conditions:	The CPU's parent cluster must be in CLUSTER_UP.
+	Trigger events:	Transition of the parent cluster to CLUSTER_UP.
+
+	Refer to the "Cluster state" section for a description of the
+	CLUSTER_UP state.
+
+
+CPU_UP:
+	When a CPU reaches the CPU_UP state, it is safe for the CPU to
+	start participating in local coherency.
+
+	This is done by jumping to the kernel's CPU resume code.
+
+	Note that the definition of this state is slightly different
+	from the basic model definition: CPU_UP does not mean that the
+	CPU is coherent yet, but it does mean that it is safe to resume
+	the kernel.  The kernel handles the rest of the resume
+	procedure, so the remaining steps are not visible as part of the
+	race avoidance algorithm.
+
+	The CPU remains in this state until an explicit policy decision
+	is made to shut down or suspend the CPU.
+
+	Next state:	CPU_GOING_DOWN
+	Conditions:	none
+	Trigger events:	explicit policy decision
+
+
+CPU_GOING_DOWN:
+
+	While in this state, the CPU exits coherency, including any
+	operations required to achieve this (such as cleaning data
+	caches).
+
+	Next state:	CPU_DOWN
+	Conditions:	local CPU teardown complete
+	Trigger events:	(spontaneous)
+
+
+Cluster state
+-------------
+
+A cluster is a group of connected CPUs with some common resources.
+Because a cluster contains multiple CPUs, it can be doing multiple
+things at the same time.  This has some implications.  In particular, a
+CPU can start up while another CPU is tearing the cluster down.
+
+In this discussion, the "outbound side" is the view of the cluster state
+as seen by a CPU tearing the cluster down.  The "inbound side" is the
+view of the cluster state as seen by a CPU setting the cluster up.
+
+In order to enable safe coordination in such situations, it is important
+that a CPU which is setting up the cluster can advertise its state
+independently of the CPU which is tearing down the cluster.  For this
+reason, the cluster state is split into two parts:
+
+	"cluster" state: The global state of the cluster; or the state
+		on the outbound side:
+
+		CLUSTER_DOWN
+		CLUSTER_UP
+		CLUSTER_GOING_DOWN
+
+	"inbound" state: The state of the cluster on the inbound side.
+
+		INBOUND_NOT_COMING_UP
+		INBOUND_COMING_UP
+
+
+	The different pairings of these states result in six possible
+	states for the cluster as a whole:
+
+	                            CLUSTER_UP
+	          +==========> INBOUND_NOT_COMING_UP -------------+
+	          #                                               |
+	                                                          |
+	     CLUSTER_UP     <----+                                |
+	  INBOUND_COMING_UP      |                                v
+
+	          ^             CLUSTER_GOING_DOWN       CLUSTER_GOING_DOWN
+	          #              INBOUND_COMING_UP <=== INBOUND_NOT_COMING_UP
+
+	    CLUSTER_DOWN         |                                |
+	  INBOUND_COMING_UP <----+                                |
+	                                                          |
+	          ^                                               |
+	          +===========     CLUSTER_DOWN      <------------+
+	                       INBOUND_NOT_COMING_UP
+
+	Transitions -----> can only be made by the outbound CPU, and
+	only involve changes to the "cluster" state.
+
+	Transitions ===##> can only be made by the inbound CPU, and only
+	involve changes to the "inbound" state, except where there is no
+	further transition possible on the outbound side (i.e., the
+	outbound CPU has put the cluster into the CLUSTER_DOWN state).
+
+	The race avoidance algorithm does not provide a way to determine
+	which exact CPUs within the cluster play these roles.  This must
+	be decided in advance by some other means.  Refer to the section
+	"Last man and first man selection" for more explanation.
+
+
+	CLUSTER_DOWN/INBOUND_NOT_COMING_UP is the only state where the
+	cluster can actually be powered down.
+
+	The parallelism of the inbound and outbound CPUs is reflected by
+	the existence of two different paths from CLUSTER_GOING_DOWN/
+	INBOUND_NOT_COMING_UP (corresponding to GOING_DOWN in the basic
+	model) to CLUSTER_DOWN/INBOUND_COMING_UP (corresponding to
+	COMING_UP in the basic model).  The second path avoids cluster
+	teardown completely.
+
+	CLUSTER_UP/INBOUND_COMING_UP is equivalent to UP in the basic
+	model.  The final transition to CLUSTER_UP/INBOUND_NOT_COMING_UP
+	is trivial and merely resets the state machine ready for the
+	next cycle.
+
+	Details of the allowable transitions follow.
+
+	The next state in each case is notated
+
+		<cluster state>/<inbound state> (<transitioner>)
+
+	where the <transitioner> is the side on which the transition
+	can occur; either the inbound or the outbound side.
+
+
+CLUSTER_DOWN/INBOUND_NOT_COMING_UP:
+
+	Next state:	CLUSTER_DOWN/INBOUND_COMING_UP (inbound)
+	Conditions:	none
+	Trigger events:
+
+		a) an explicit hardware power-up operation, resulting
+		   from a policy decision on another CPU;
+
+		b) a hardware event, such as an interrupt.
+
+
+CLUSTER_DOWN/INBOUND_COMING_UP:
+
+	In this state, an inbound CPU sets up the cluster, including
+	enabling of hardware coherency at the cluster level and any
+	other operations (such as cache invalidation) which are required
+	in order to achieve this.
+
+	The purpose of this state is to do sufficient cluster-level
+	setup to enable other CPUs in the cluster to enter coherency
+	safely.
+
+	Next state:	CLUSTER_UP/INBOUND_COMING_UP (inbound)
+	Conditions:	cluster-level setup and hardware coherency complete
+	Trigger events:	(spontaneous)
+
+
+CLUSTER_UP/INBOUND_COMING_UP:
+
+	Cluster-level setup is complete and hardware coherency is
+	enabled for the cluster.  Other CPUs in the cluster can safely
+	enter coherency.
+
+	This is a transient state, leading immediately to
+	CLUSTER_UP/INBOUND_NOT_COMING_UP.  All other CPUs in the cluster
+	should treat these two states as equivalent.
+
+	Next state:	CLUSTER_UP/INBOUND_NOT_COMING_UP (inbound)
+	Conditions:	none
+	Trigger events:	(spontaneous)
+
+
+CLUSTER_UP/INBOUND_NOT_COMING_UP:
+
+	Cluster-level setup is complete and hardware coherency is
+	enabled for the cluster.  Other CPUs in the cluster can safely
+	enter coherency.
+
+	The cluster will remain in this state until a policy decision is
+	made to power the cluster down.
+
+	Next state:	CLUSTER_GOING_DOWN/INBOUND_NOT_COMING_UP (outbound)
+	Conditions:	none
+	Trigger events:	policy decision to power down the cluster
+
+
+CLUSTER_GOING_DOWN/INBOUND_NOT_COMING_UP:
+
+	An outbound CPU is tearing the cluster down.  The selected CPU
+	must wait in this state until all CPUs in the cluster are in the
+	CPU_DOWN state.
+
+	When all CPUs are in the CPU_DOWN state, the cluster can be torn
+	down, for example by cleaning data caches and exiting
+	cluster-level coherency.
+
+	To avoid unnecessary teardown operations, the outbound CPU
+	should check the inbound cluster state for asynchronous
+	transitions to INBOUND_COMING_UP.  Alternatively, individual
+	CPUs can be checked for entry into CPU_COMING_UP or CPU_UP.
+
+
+	Next states:
+
+	CLUSTER_DOWN/INBOUND_NOT_COMING_UP (outbound)
+		Conditions:	cluster torn down and ready to power off
+		Trigger events:	(spontaneous)
+
+	CLUSTER_GOING_DOWN/INBOUND_COMING_UP (inbound)
+		Conditions:	none
+		Trigger events:
+
+			a) an explicit hardware power-up operation,
+			   resulting from a policy decision on another
+			   CPU;
+
+			b) a hardware event, such as an interrupt.
+
+
+CLUSTER_GOING_DOWN/INBOUND_COMING_UP:
+
+	The cluster is (or was) being torn down, but another CPU has
+	come online in the meantime and is trying to set up the cluster
+	again.
+
+	If the outbound CPU observes this state, it has two choices:
+
+		a) back out of teardown, restoring the cluster to the
+		   CLUSTER_UP state;
+
+		b) finish tearing the cluster down and put the cluster
+		   in the CLUSTER_DOWN state; the inbound CPU will
+		   set up the cluster again from there.
+
+	Choice (a) reduces latency by avoiding
+	unnecessary teardown and setup operations in situations where
+	the cluster is not really going to be powered down.
+
+
+	Next states:
+
+	CLUSTER_UP/INBOUND_COMING_UP (outbound)
+		Conditions:	cluster-level setup and hardware
+				coherency complete
+		Trigger events:	(spontaneous)
+
+	CLUSTER_DOWN/INBOUND_COMING_UP (outbound)
+		Conditions:	cluster torn down and ready to power off
+		Trigger events:	(spontaneous)
+
+
+Last man and first man selection
+--------------------------------
+
+The CPU which performs cluster tear-down operations on the outbound side
+is commonly referred to as the "last man".
+
+The CPU which performs cluster setup on the inbound side is commonly
+referred to as the "first man".
+
+The race avoidance algorithm documented above does not provide a
+mechanism to choose which CPUs should play these roles.
+
+
+Last man:
+
+When shutting down the cluster, all the CPUs involved are initially
+executing Linux and hence coherent.  Therefore, ordinary spinlocks can
+be used to select a last man safely, before the CPUs become
+non-coherent.
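+
+As an example, a hypothetical backend might keep a usage count per
+cluster (this is a sketch only; real backends make this decision in
+their platform-specific code):
+
+	static DEFINE_SPINLOCK(example_usage_lock);
+	static int example_cpu_usage[MAX_NR_CLUSTERS];
+
+	/* Returns true if the caller is the last man of its cluster. */
+	static bool example_last_man_down(unsigned int cluster)
+	{
+		bool last_man;
+
+		spin_lock(&example_usage_lock);
+		last_man = (--example_cpu_usage[cluster] == 0);
+		spin_unlock(&example_usage_lock);
+		return last_man;
+	}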
+
+
+First man:
+
+Because CPUs may power up asynchronously in response to external wake-up
+events, a dynamic mechanism is needed to make sure that only one CPU
+attempts to play the first man role and do the cluster-level
+initialisation: any other CPUs must wait for this to complete before
+proceeding.
+
+Cluster-level initialisation may involve actions such as configuring
+coherency controls in the bus fabric.
+
+The current implementation in mcpm_head.S uses a separate mutual exclusion
+mechanism to do this arbitration.  This mechanism is documented in
+detail in vlocks.txt.
+
+
+Features and Limitations
+------------------------
+
+Implementation:
+
+	The current ARM-based implementation is split between
+	arch/arm/common/mcpm_head.S (low-level inbound CPU operations) and
+	arch/arm/common/mcpm_entry.c (everything else):
+
+	__mcpm_cpu_going_down() signals the transition of a CPU to the
+		CPU_GOING_DOWN state.
+
+	__mcpm_cpu_down() signals the transition of a CPU to the CPU_DOWN
+		state.
+
+	A CPU transitions to CPU_COMING_UP and then to CPU_UP via the
+		low-level power-up code in mcpm_head.S.  This could
+		involve CPU-specific setup code, but in the current
+		implementation it does not.
+
+	__mcpm_outbound_enter_critical() and __mcpm_outbound_leave_critical()
+		handle transitions from CLUSTER_UP to CLUSTER_GOING_DOWN
+		and from there to CLUSTER_DOWN or back to CLUSTER_UP (in
+		the case of an aborted cluster power-down).
+
+		These functions are more complex than the __mcpm_cpu_*()
+		functions due to the extra inter-CPU coordination which
+		is needed for safe transitions at the cluster level.
+
+	A cluster transitions from CLUSTER_DOWN back to CLUSTER_UP via
+		the low-level power-up code in mcpm_head.S.  This
+		typically involves platform-specific setup code,
+		provided by the platform-specific power_up_setup
+		function registered via mcpm_sync_init.
+
+Deep topologies:
+
+	As currently described and implemented, the algorithm does not
+	support CPU topologies involving more than two levels (i.e.,
+	clusters of clusters are not supported).  The algorithm could be
+	extended by replicating the cluster-level states for the
+	additional topological levels, and modifying the transition
+	rules for the intermediate (non-outermost) cluster levels.
+
+
+Colophon
+--------
+
+Originally created and documented by Dave Martin for Linaro Limited, in
+collaboration with Nicolas Pitre and Achin Gupta.
+
+Copyright (C) 2012-2013  Linaro Limited
+Distributed under the terms of Version 2 of the GNU General Public
+License, as defined in linux/COPYING.
diff --git a/arch/arm/common/mcpm_entry.c b/arch/arm/common/mcpm_entry.c
index c8c0e2113e..2b83121966 100644
--- a/arch/arm/common/mcpm_entry.c
+++ b/arch/arm/common/mcpm_entry.c
@@ -18,6 +18,7 @@
 #include <asm/proc-fns.h>
 #include <asm/cacheflush.h>
 #include <asm/idmap.h>
+#include <asm/cputype.h>
 
 extern volatile unsigned long mcpm_entry_vectors[MAX_NR_CLUSTERS][MAX_CPUS_PER_CLUSTER];
 
@@ -115,3 +116,199 @@ int mcpm_cpu_powered_up(void)
 		platform_ops->powered_up();
 	return 0;
 }
+
+struct sync_struct mcpm_sync;
+
+/*
+ * There is no __cpuc_clean_dcache_area but we use it anyway for
+ * code intent clarity, and alias it to __cpuc_flush_dcache_area.
+ */
+#define __cpuc_clean_dcache_area __cpuc_flush_dcache_area
+
+/*
+ * Ensure preceding writes to *p by this CPU are visible to
+ * subsequent reads by other CPUs:
+ */
+static void __sync_range_w(volatile void *p, size_t size)
+{
+	char *_p = (char *)p;
+
+	__cpuc_clean_dcache_area(_p, size);
+	outer_clean_range(__pa(_p), __pa(_p + size));
+}
+
+/*
+ * Ensure preceding writes to *p by other CPUs are visible to
+ * subsequent reads by this CPU.  We must be careful not to
+ * discard data simultaneously written by another CPU, hence the
+ * usage of flush rather than invalidate operations.
+ */
+static void __sync_range_r(volatile void *p, size_t size)
+{
+	char *_p = (char *)p;
+
+#ifdef CONFIG_OUTER_CACHE
+	if (outer_cache.flush_range) {
+		/*
+		 * Ensure dirty data migrated from other CPUs into our cache
+		 * are cleaned out safely before the outer cache is cleaned:
+		 */
+		__cpuc_clean_dcache_area(_p, size);
+
+		/* Clean and invalidate stale data for *p from outer ... */
+		outer_flush_range(__pa(_p), __pa(_p + size));
+	}
+#endif
+
+	/* ... and inner cache: */
+	__cpuc_flush_dcache_area(_p, size);
+}
+
+#define sync_w(ptr) __sync_range_w(ptr, sizeof *(ptr))
+#define sync_r(ptr) __sync_range_r(ptr, sizeof *(ptr))
+
+/*
+ * __mcpm_cpu_going_down: Indicates that the cpu is being torn down.
+ *    This must be called at the point of committing to teardown of a CPU.
+ *    The CPU cache (SCTLR.C bit) is expected to still be active.
+ */
+void __mcpm_cpu_going_down(unsigned int cpu, unsigned int cluster)
+{
+	mcpm_sync.clusters[cluster].cpus[cpu].cpu = CPU_GOING_DOWN;
+	sync_w(&mcpm_sync.clusters[cluster].cpus[cpu].cpu);
+}
+
+/*
+ * __mcpm_cpu_down: Indicates that cpu teardown is complete and that the
+ *    cluster can be torn down without disrupting this CPU.
+ *    To avoid deadlocks, this must be called before a CPU is powered down.
+ *    The CPU cache (SCTLR.C bit) is expected to be off.
+ */
+void __mcpm_cpu_down(unsigned int cpu, unsigned int cluster)
+{
+	dmb();
+	mcpm_sync.clusters[cluster].cpus[cpu].cpu = CPU_DOWN;
+	sync_w(&mcpm_sync.clusters[cluster].cpus[cpu].cpu);
+	dsb_sev();
+}
+
+/*
+ * __mcpm_outbound_leave_critical: Leave the cluster teardown critical section.
+ * @state: the final state of the cluster:
+ *     CLUSTER_UP: no destructive teardown was done and the cluster has been
+ *         restored to the previous state (CPU cache still active); or
+ *     CLUSTER_DOWN: the cluster has been torn-down, ready for power-off
+ *         (CPU cache disabled).
+ */
+void __mcpm_outbound_leave_critical(unsigned int cluster, int state)
+{
+	dmb();
+	mcpm_sync.clusters[cluster].cluster = state;
+	sync_w(&mcpm_sync.clusters[cluster].cluster);
+	dsb_sev();
+}
+
+/*
+ * __mcpm_outbound_enter_critical: Enter the cluster teardown critical section.
+ * This function should be called by the last man, after local CPU teardown
+ * is complete.  CPU cache expected to be active.
+ *
+ * Returns:
+ *     false: the critical section was not entered because an inbound CPU was
+ *         observed, or the cluster is already being set up;
+ *     true: the critical section was entered: it is now safe to tear down the
+ *         cluster.
+ */
+bool __mcpm_outbound_enter_critical(unsigned int cpu, unsigned int cluster)
+{
+	unsigned int i;
+	struct mcpm_sync_struct *c = &mcpm_sync.clusters[cluster];
+
+	/* Warn inbound CPUs that the cluster is being torn down: */
+	c->cluster = CLUSTER_GOING_DOWN;
+	sync_w(&c->cluster);
+
+	/* Back out if the inbound cluster is already in the critical region: */
+	sync_r(&c->inbound);
+	if (c->inbound == INBOUND_COMING_UP)
+		goto abort;
+
+	/*
+	 * Wait for all CPUs to get out of the GOING_DOWN state, so that local
+	 * teardown is complete on each CPU before tearing down the cluster.
+	 *
+	 * If any CPU has been woken up again from the DOWN state, then we
+	 * shouldn't be taking the cluster down at all: abort in that case.
+	 */
+	sync_r(&c->cpus);
+	for (i = 0; i < MAX_CPUS_PER_CLUSTER; i++) {
+		int cpustate;
+
+		if (i == cpu)
+			continue;
+
+		while (1) {
+			cpustate = c->cpus[i].cpu;
+			if (cpustate != CPU_GOING_DOWN)
+				break;
+
+			wfe();
+			sync_r(&c->cpus[i].cpu);
+		}
+
+		switch (cpustate) {
+		case CPU_DOWN:
+			continue;
+
+		default:
+			goto abort;
+		}
+	}
+
+	return true;
+
+abort:
+	__mcpm_outbound_leave_critical(cluster, CLUSTER_UP);
+	return false;
+}
+
+int __mcpm_mcpm_state(unsigned int cluster)
+{
+	sync_r(&mcpm_sync.clusters[cluster].cluster);
+	return mcpm_sync.clusters[cluster].cluster;
+}
+
+extern unsigned long mcpm_power_up_setup_phys;
+
+int __init mcpm_sync_init(
+	void (*power_up_setup)(unsigned int affinity_level))
+{
+	unsigned int i, j, mpidr, this_cluster;
+
+	BUILD_BUG_ON(MCPM_SYNC_CLUSTER_SIZE * MAX_NR_CLUSTERS != sizeof mcpm_sync);
+	BUG_ON((unsigned long)&mcpm_sync & (__CACHE_WRITEBACK_GRANULE - 1));
+
+	/*
+	 * Set initial CPU and cluster states.
+	 * Only one cluster is assumed to be active at this point.
+	 */
+	for (i = 0; i < MAX_NR_CLUSTERS; i++) {
+		mcpm_sync.clusters[i].cluster = CLUSTER_DOWN;
+		mcpm_sync.clusters[i].inbound = INBOUND_NOT_COMING_UP;
+		for (j = 0; j < MAX_CPUS_PER_CLUSTER; j++)
+			mcpm_sync.clusters[i].cpus[j].cpu = CPU_DOWN;
+	}
+	mpidr = read_cpuid_mpidr();
+	this_cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+	for_each_online_cpu(i)
+		mcpm_sync.clusters[this_cluster].cpus[i].cpu = CPU_UP;
+	mcpm_sync.clusters[this_cluster].cluster = CLUSTER_UP;
+	sync_w(&mcpm_sync);
+
+	if (power_up_setup) {
+		mcpm_power_up_setup_phys = virt_to_phys(power_up_setup);
+		sync_w(&mcpm_power_up_setup_phys);
+	}
+
+	return 0;
+}
diff --git a/arch/arm/common/mcpm_head.S b/arch/arm/common/mcpm_head.S
index 794c8ea8c4..65db7ec87e 100644
--- a/arch/arm/common/mcpm_head.S
+++ b/arch/arm/common/mcpm_head.S
@@ -7,11 +7,19 @@
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
  * published by the Free Software Foundation.
+ *
+ *
+ * Refer to Documentation/arm/cluster-pm-race-avoidance.txt
+ * for details of the synchronisation algorithms used here.
  */
 
 #include <linux/linkage.h>
 #include <asm/mcpm_entry.h>
 
+.if MCPM_SYNC_CLUSTER_CPUS
+.error "cpus must be the first member of struct mcpm_sync_struct"
+.endif
+
 	.macro	pr_dbg	string
 #if defined(CONFIG_DEBUG_LL) && defined(DEBUG)
 	b	1901f
@@ -57,24 +65,114 @@ ENTRY(mcpm_entry_point)
 2:	pr_dbg	"kernel mcpm_entry_point\n"
 
 	/*
-	 * MMU is off so we need to get to mcpm_entry_vectors in a
+	 * MMU is off so we need to get to various variables in a
 	 * position independent way.
 	 */
 	adr	r5, 3f
-	ldr	r6, [r5]
+	ldmia	r5, {r6, r7, r8}
 	add	r6, r5, r6			@ r6 = mcpm_entry_vectors
+	ldr	r7, [r5, r7]			@ r7 = mcpm_power_up_setup_phys
+	add	r8, r5, r8			@ r8 = mcpm_sync
+
+	mov	r0, #MCPM_SYNC_CLUSTER_SIZE
+	mla	r8, r0, r10, r8			@ r8 = sync cluster base
+
+	@ Signal that this CPU is coming UP:
+	mov	r0, #CPU_COMING_UP
+	mov	r5, #MCPM_SYNC_CPU_SIZE
+	mla	r5, r9, r5, r8			@ r5 = sync cpu address
+	strb	r0, [r5]
+
+	@ At this point, the cluster cannot unexpectedly enter the GOING_DOWN
+	@ state, because there is at least one active CPU (this CPU).
+
+	@ Note: the following is racy as another CPU might be testing
+	@ the same flag at the same moment.  That'll be fixed later.
+	ldrb	r0, [r8, #MCPM_SYNC_CLUSTER_CLUSTER]
+	cmp	r0, #CLUSTER_UP			@ cluster already up?
+	bne	mcpm_setup			@ if not, set up the cluster
+
+	@ Otherwise, skip setup:
+	b	mcpm_setup_complete
+
+mcpm_setup:
+	@ Control dependency implies strb not observable before previous ldrb.
+
+	@ Signal that the cluster is being brought up:
+	mov	r0, #INBOUND_COMING_UP
+	strb	r0, [r8, #MCPM_SYNC_CLUSTER_INBOUND]
+	dmb
+
+	@ Any CPU trying to take the cluster into CLUSTER_GOING_DOWN from this
+	@ point onwards will observe INBOUND_COMING_UP and abort.
+
+	@ Wait for any previously-pending cluster teardown operations to abort
+	@ or complete:
+mcpm_teardown_wait:
+	ldrb	r0, [r8, #MCPM_SYNC_CLUSTER_CLUSTER]
+	cmp	r0, #CLUSTER_GOING_DOWN
+	bne	first_man_setup
+	wfe
+	b	mcpm_teardown_wait
+
+first_man_setup:
+	dmb
+
+	@ If the outbound gave up before teardown started, skip cluster setup:
+
+	cmp	r0, #CLUSTER_UP
+	beq	mcpm_setup_leave
+
+	@ power_up_setup is now responsible for setting up the cluster:
+
+	cmp	r7, #0
+	mov	r0, #1		@ second (cluster) affinity level
+	blxne	r7		@ Call power_up_setup if defined
+	dmb
+
+	mov	r0, #CLUSTER_UP
+	strb	r0, [r8, #MCPM_SYNC_CLUSTER_CLUSTER]
+	dmb
+
+mcpm_setup_leave:
+	@ Leave the cluster setup critical section:
+
+	mov	r0, #INBOUND_NOT_COMING_UP
+	strb	r0, [r8, #MCPM_SYNC_CLUSTER_INBOUND]
+	dsb
+	sev
+
+mcpm_setup_complete:
+	@ If a platform-specific CPU setup hook is needed, it is
+	@ called from here.
+
+	cmp	r7, #0
+	mov	r0, #0		@ first (CPU) affinity level
+	blxne	r7		@ Call power_up_setup if defined
+	dmb
+
+	@ Mark the CPU as up:
+
+	mov	r0, #CPU_UP
+	strb	r0, [r5]
+
+	@ Observability order of CPU_UP and opening of the gate does not matter.
 
 mcpm_entry_gated:
 	ldr	r5, [r6, r4, lsl #2]		@ r5 = CPU entry vector
 	cmp	r5, #0
 	wfeeq
 	beq	mcpm_entry_gated
+	dmb
+
 	pr_dbg	"released\n"
 	bx	r5
 
 	.align	2
 
 3:	.word	mcpm_entry_vectors - .
+	.word	mcpm_power_up_setup_phys - 3b
+	.word	mcpm_sync - 3b
 
 ENDPROC(mcpm_entry_point)
 
@@ -84,3 +182,7 @@ ENDPROC(mcpm_entry_point)
 	.type	mcpm_entry_vectors, #object
 ENTRY(mcpm_entry_vectors)
 	.space	4 * MAX_NR_CLUSTERS * MAX_CPUS_PER_CLUSTER
+
+	.type	mcpm_power_up_setup_phys, #object
+ENTRY(mcpm_power_up_setup_phys)
+	.space  4		@ set by mcpm_sync_init()
diff --git a/arch/arm/include/asm/mcpm_entry.h b/arch/arm/include/asm/mcpm_entry.h
index 3286d5eb91..e76652209d 100644
--- a/arch/arm/include/asm/mcpm_entry.h
+++ b/arch/arm/include/asm/mcpm_entry.h
@@ -15,8 +15,37 @@
 #define MAX_CPUS_PER_CLUSTER	4
 #define MAX_NR_CLUSTERS		2
 
+/* Definitions for mcpm_sync_struct */
+#define CPU_DOWN		0x11
+#define CPU_COMING_UP		0x12
+#define CPU_UP			0x13
+#define CPU_GOING_DOWN		0x14
+
+#define CLUSTER_DOWN		0x21
+#define CLUSTER_UP		0x22
+#define CLUSTER_GOING_DOWN	0x23
+
+#define INBOUND_NOT_COMING_UP	0x31
+#define INBOUND_COMING_UP	0x32
+
+/* This is a complete guess. */
+#define __CACHE_WRITEBACK_ORDER	6
+#define __CACHE_WRITEBACK_GRANULE (1 << __CACHE_WRITEBACK_ORDER)
+
+/* Offsets for the mcpm_sync_struct members, for use in asm: */
+#define MCPM_SYNC_CLUSTER_CPUS	0
+#define MCPM_SYNC_CPU_SIZE	__CACHE_WRITEBACK_GRANULE
+#define MCPM_SYNC_CLUSTER_CLUSTER \
+	(MCPM_SYNC_CLUSTER_CPUS + MCPM_SYNC_CPU_SIZE * MAX_CPUS_PER_CLUSTER)
+#define MCPM_SYNC_CLUSTER_INBOUND \
+	(MCPM_SYNC_CLUSTER_CLUSTER + __CACHE_WRITEBACK_GRANULE)
+#define MCPM_SYNC_CLUSTER_SIZE \
+	(MCPM_SYNC_CLUSTER_INBOUND + __CACHE_WRITEBACK_GRANULE)
+
 #ifndef __ASSEMBLY__
 
+#include <linux/types.h>
+
 /*
  * Platform specific code should use this symbol to set up secondary
  * entry location for processors to use when released from reset.
@@ -123,5 +152,39 @@ struct mcpm_platform_ops {
  */
 int __init mcpm_platform_register(const struct mcpm_platform_ops *ops);
 
+/* Synchronisation structures for coordinating safe cluster setup/teardown: */
+
+/*
+ * When modifying this structure, make sure you update the MCPM_SYNC_ defines
+ * to match.
+ */
+struct mcpm_sync_struct {
+	/* individual CPU states */
+	struct {
+		volatile s8 cpu __aligned(__CACHE_WRITEBACK_GRANULE);
+	} cpus[MAX_CPUS_PER_CLUSTER];
+
+	/* cluster state */
+	volatile s8 cluster __aligned(__CACHE_WRITEBACK_GRANULE);
+
+	/* inbound-side state */
+	volatile s8 inbound __aligned(__CACHE_WRITEBACK_GRANULE);
+};
+
+struct sync_struct {
+	struct mcpm_sync_struct clusters[MAX_NR_CLUSTERS];
+};
+
+extern unsigned long sync_phys;	/* physical address of *mcpm_sync */
+
+void __mcpm_cpu_going_down(unsigned int cpu, unsigned int cluster);
+void __mcpm_cpu_down(unsigned int cpu, unsigned int cluster);
+void __mcpm_outbound_leave_critical(unsigned int cluster, int state);
+bool __mcpm_outbound_enter_critical(unsigned int this_cpu, unsigned int cluster);
+int __mcpm_mcpm_state(unsigned int cluster);
+
+int __init mcpm_sync_init(
+	void (*power_up_setup)(unsigned int affinity_level));
+
 #endif /* ! __ASSEMBLY__ */
 #endif
-- 
1.8.1.2


* [PATCH v3 04/15] ARM: mcpm: Add baremetal voting mutexes
  2013-01-29  7:50 [PATCH v3 00/15] multi-cluster power management Nicolas Pitre
                   ` (2 preceding siblings ...)
  2013-01-29  7:50 ` [PATCH v3 03/15] ARM: mcpm: introduce helpers for platform coherency exit/setup Nicolas Pitre
@ 2013-01-29  7:50 ` Nicolas Pitre
  2013-02-01  5:29   ` Santosh Shilimkar
  2013-01-29  7:51 ` [PATCH v3 05/15] ARM: mcpm_head.S: vlock-based first man election Nicolas Pitre
                   ` (11 subsequent siblings)
  15 siblings, 1 reply; 54+ messages in thread
From: Nicolas Pitre @ 2013-01-29  7:50 UTC (permalink / raw)
  To: linux-arm-kernel

From: Dave Martin <dave.martin@linaro.org>

This patch adds a simple low-level voting mutex implementation
to be used to arbitrate during first man selection when no load/store
exclusive instructions are usable.

For want of a better name, these are called "vlocks".  (I was
tempted to call them ballot locks, but "block" is way too confusing
an abbreviation...)

There is no function to wait for the lock to be released, and no
vlock_lock() function since we don't need these at the moment.
These could straightforwardly be added if vlocks get used for other
purposes.

For architectural correctness even Strongly-Ordered memory accesses
require barriers in order to guarantee that multiple CPUs have a
coherent view of the ordering of memory accesses.  Whether or not
this matters depends on hardware implementation details of the
memory system.  Since the purpose of this code is to provide a clean,
generic locking mechanism with no platform-specific dependencies the
barriers should be present to avoid unpleasant surprises on future
platforms.

Note:

  * When taking the lock, we don't care about implicit background
    memory operations and other signalling which may be pending,
    because those are not part of the critical section anyway.

    A DMB is sufficient to ensure correctly observed ordering of
    the explicit memory accesses in vlock_trylock.

  * No barrier is required after checking the election result,
    because the result is determined by the store to
    VLOCK_OWNER_OFFSET and is already globally observed due to the
    barriers in voting_end.  This means that global agreement on
    the winner is guaranteed, even before the winner is known
    locally.

Signed-off-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
---
 Documentation/arm/vlocks.txt | 211 +++++++++++++++++++++++++++++++++++++++++++
 arch/arm/common/vlock.S      | 108 ++++++++++++++++++++++
 arch/arm/common/vlock.h      |  29 ++++++
 3 files changed, 348 insertions(+)
 create mode 100644 Documentation/arm/vlocks.txt
 create mode 100644 arch/arm/common/vlock.S
 create mode 100644 arch/arm/common/vlock.h

diff --git a/Documentation/arm/vlocks.txt b/Documentation/arm/vlocks.txt
new file mode 100644
index 0000000000..415960a9ba
--- /dev/null
+++ b/Documentation/arm/vlocks.txt
@@ -0,0 +1,211 @@
+vlocks for Bare-Metal Mutual Exclusion
+======================================
+
+Voting Locks, or "vlocks" provide a simple low-level mutual exclusion
+mechanism, with reasonable but minimal requirements on the memory
+system.
+
+These are intended to be used to coordinate critical activity among CPUs
+which are otherwise non-coherent, in situations where the hardware
+provides no other mechanism to support this and ordinary spinlocks
+cannot be used.
+
+
+vlocks make use of the atomicity provided by the memory system for
+writes to a single memory location.  To arbitrate, every CPU "votes for
+itself", by storing a unique number to a common memory location.  The
+final value seen in that memory location when all the votes have been
+cast identifies the winner.
+
+In order to make sure that the election produces an unambiguous result
+in finite time, a CPU will only enter the election in the first place if
+no winner has been chosen and the election does not appear to have
+started yet.
+
+
+Algorithm
+---------
+
+The easiest way to explain the vlocks algorithm is with some pseudo-code:
+
+
+	int currently_voting[NR_CPUS] = { 0, };
+	int last_vote = -1; /* no votes yet */
+
+	bool vlock_trylock(int this_cpu)
+	{
+		/* signal our desire to vote */
+		currently_voting[this_cpu] = 1;
+		if (last_vote != -1) {
+			/* someone already volunteered himself */
+			currently_voting[this_cpu] = 0;
+			return false; /* not ourself */
+		}
+
+		/* let's suggest ourself */
+		last_vote = this_cpu;
+		currently_voting[this_cpu] = 0;
+
+		/* then wait until everyone else is done voting */
+		for_each_cpu(i) {
+			while (currently_voting[i] != 0)
+				/* wait */;
+		}
+
+		/* result */
+		if (last_vote == this_cpu)
+			return true; /* we won */
+		return false;
+	}
+
+	void vlock_unlock(void)
+	{
+		last_vote = -1;
+	}
+
+
+The currently_voting[] array provides a way for the CPUs to determine
+whether an election is in progress, and plays a role analogous to the
+"entering" array in Lamport's bakery algorithm [1].
+
+However, once the election has started, the underlying memory system
+atomicity is used to pick the winner.  This avoids the need for a static
+priority rule to act as a tie-breaker, or any counters which could
+overflow.
+
+As long as the last_vote variable is globally visible to all CPUs, it
+will settle on a single value that no longer changes once every CPU
+has cleared its currently_voting flag.
+
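+For illustration only, here is a sketch of how a caller might use a
+vlock to elect a single winner for some one-time piece of work (the
+helper names are made up for this example):
+
+	if (vlock_trylock(this_cpu)) {
+		do_first_man_work();	/* we won the election */
+		vlock_unlock();
+	}
+	/* losers just carry on; there is no vlock_lock() to wait on */
+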
+
+Features and limitations
+------------------------
+
+ * vlocks are not intended to be fair.  In the contended case, it is the
+   _last_ CPU which attempts to get the lock which will be most likely
+   to win.
+
+   vlocks are therefore best suited to situations where it is necessary
+   to pick a unique winner, but it does not matter which CPU actually
+   wins.
+
+ * Like other similar mechanisms, vlocks will not scale well to a large
+   number of CPUs.
+
+   vlocks can be cascaded in a voting hierarchy to permit better scaling
+   if necessary, as in the following hypothetical example for 4096 CPUs:
+
+	/* first level: local election */
+	my_town = towns[(this_cpu >> 4) & 0xff];
+	I_won = vlock_trylock(my_town, this_cpu & 0xf);
+	if (I_won) {
+		/* we won the town election, let's go for the state */
+		my_state = states[(this_cpu >> 8) & 0xf];
+		I_won = vlock_trylock(my_state, (this_cpu >> 4) & 0xf);
+		if (I_won) {
+			/* and so on */
+			I_won = vlock_trylock(the_whole_country, (this_cpu >> 8) & 0xf);
+			if (I_won) {
+				/* ... */
+			}
+			vlock_unlock(the_whole_country);
+		}
+		vlock_unlock(my_state);
+	}
+	vlock_unlock(my_town);
+
+
+ARM implementation
+------------------
+
+The current ARM implementation [2] contains some optimisations beyond
+the basic algorithm:
+
+ * By packing the members of the currently_voting array close together,
+   we can read the whole array in one transaction (providing the number
+   of CPUs potentially contending the lock is small enough).  This
+   reduces the number of round-trips required to external memory.
+
+   In the ARM implementation, this means that we can use a single load
+   and comparison:
+
+	LDR	Rt, [Rn]
+	CMP	Rt, #0
+
+   ...in place of code equivalent to:
+
+	LDRB	Rt, [Rn]
+	CMP	Rt, #0
+	LDRBEQ	Rt, [Rn, #1]
+	CMPEQ	Rt, #0
+	LDRBEQ	Rt, [Rn, #2]
+	CMPEQ	Rt, #0
+	LDRBEQ	Rt, [Rn, #3]
+	CMPEQ	Rt, #0
+
+   This cuts down on the fast-path latency, as well as potentially
+   reducing bus contention in contended cases.
+
+   The optimisation relies on the fact that the ARM memory system
+   guarantees coherency between overlapping memory accesses of
+   different sizes, similarly to many other architectures.  Note that
+   we do not care which element of currently_voting appears in which
+   bits of Rt, so there is no need to worry about endianness in this
+   optimisation.
+
+   If there are too many CPUs to read the currently_voting array in
+   one transaction, then multiple transactions are still required.  The
+   implementation uses a simple loop of word-sized loads for this
+   case.  The number of transactions is still fewer than would be
+   required if bytes were loaded individually.
+
+
+   In principle, we could aggregate further by using LDRD or LDM, but
+   to keep the code simple this was not attempted in the initial
+   implementation.
+
+
+ * vlocks are currently only used to coordinate between CPUs which are
+   unable to enable their caches yet.  This means that the
+   implementation removes many of the barriers which would be required
+   when executing the algorithm in cached memory.
+
+   Packing of the currently_voting array does not work with cached
+   memory unless all CPUs contending the lock are cache-coherent, due
+   to cache writebacks from one CPU clobbering values written by other
+   CPUs.  (Though if all the CPUs are cache-coherent, you should
+   probably be using proper spinlocks instead anyway.)
+
+
+ * The "no votes yet" value used for the last_vote variable is 0 (not
+   -1 as in the pseudocode).  This allows statically-allocated vlocks
+   to be implicitly initialised to an unlocked state simply by putting
+   them in .bss.
+
+   An offset is added to each CPU's ID for the purpose of setting this
+   variable, so that no CPU uses the value 0 for its ID.
+
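+   As a concrete sketch (assuming the offset is VLOCK_VOTING_OFFSET, as
+   in the current ARM implementation):
+
+	vote = this_cpu + VLOCK_VOTING_OFFSET;	/* never equals 0 */
+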
+
+Colophon
+--------
+
+Originally created and documented by Dave Martin for Linaro Limited, for
+use in ARM-based big.LITTLE platforms, with review and input gratefully
+received from Nicolas Pitre and Achin Gupta.  Thanks to Nicolas for
+grabbing most of this text out of the relevant mail thread and writing
+up the pseudocode.
+
+Copyright (C) 2012-2013  Linaro Limited
+Distributed under the terms of Version 2 of the GNU General Public
+License, as defined in linux/COPYING.
+
+
+References
+----------
+
+[1] Lamport, L. "A New Solution of Dijkstra's Concurrent Programming
+    Problem", Communications of the ACM 17, 8 (August 1974), 453-455.
+
+    http://en.wikipedia.org/wiki/Lamport%27s_bakery_algorithm
+
+[2] linux/arch/arm/common/vlock.S, www.kernel.org.
diff --git a/arch/arm/common/vlock.S b/arch/arm/common/vlock.S
new file mode 100644
index 0000000000..ff198583f6
--- /dev/null
+++ b/arch/arm/common/vlock.S
@@ -0,0 +1,108 @@
+/*
+ * vlock.S - simple voting lock implementation for ARM
+ *
+ * Created by:	Dave Martin, 2012-08-16
+ * Copyright:	(C) 2012-2013  Linaro Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ *
+ * This algorithm is described in more detail in
+ * Documentation/arm/vlocks.txt.
+ */
+
+#include <linux/linkage.h>
+#include "vlock.h"
+
+/* Select different code depending on whether the voting flags fit in a
+ * single word: FEW() emits its argument if they do, MANY() if not. */
+#if VLOCK_VOTING_SIZE > 4
+#define FEW(x...)
+#define MANY(x...) x
+#else
+#define FEW(x...) x
+#define MANY(x...)
+#endif
+
+@ voting lock for first-man coordination
+
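+@ voting_begin raises this CPU's currently_voting flag; voting_end
+@ lowers it again and then issues DSB+SEV so that CPUs waiting in WFE
+@ re-examine the voting state.
+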
+.macro voting_begin rbase:req, rcpu:req, rscratch:req
+	mov	\rscratch, #1
+	strb	\rscratch, [\rbase, \rcpu]
+	dmb
+.endm
+
+.macro voting_end rbase:req, rcpu:req, rscratch:req
+	dmb
+	mov	\rscratch, #0
+	strb	\rscratch, [\rbase, \rcpu]
+	dsb
+	sev
+.endm
+
+/*
+ * The vlock structure must reside in Strongly-Ordered or Device memory.
+ * This implementation deliberately eliminates most of the barriers which
+ * would be required for other memory types, and assumes that independent
+ * writes to neighbouring locations within a cacheline do not interfere
+ * with one another.
+ */
+
+@ r0: lock structure base
+@ r1: CPU ID (0-based index within cluster)
+ENTRY(vlock_trylock)
+	add	r1, r1, #VLOCK_VOTING_OFFSET
+
+	voting_begin	r0, r1, r2
+
+	ldrb	r2, [r0, #VLOCK_OWNER_OFFSET]	@ check whether lock is held
+	cmp	r2, #VLOCK_OWNER_NONE
+	bne	trylock_fail			@ fail if so
+
+	@ Control dependency implies strb not observable before previous ldrb.
+
+	strb	r1, [r0, #VLOCK_OWNER_OFFSET]	@ submit my vote
+
+	voting_end	r0, r1, r2		@ implies DMB
+
+	@ Wait for the current round of voting to finish:
+
+ MANY(	mov	r3, #VLOCK_VOTING_OFFSET			)
+0:
+ MANY(	ldr	r2, [r0, r3]					)
+ FEW(	ldr	r2, [r0, #VLOCK_VOTING_OFFSET]			)
+	cmp	r2, #0
+	wfene
+	bne	0b
+ MANY(	add	r3, r3, #4					)
+ MANY(	cmp	r3, #VLOCK_VOTING_OFFSET + VLOCK_VOTING_SIZE	)
+ MANY(	bne	0b						)
+
+	@ Check who won:
+
+	dmb
+	ldrb	r2, [r0, #VLOCK_OWNER_OFFSET]
+	eor	r0, r1, r2			@ zero if I won, else nonzero
+	bx	lr
+
+trylock_fail:
+	voting_end	r0, r1, r2
+	mov	r0, #1				@ nonzero indicates that I lost
+	bx	lr
+ENDPROC(vlock_trylock)
+
+@ r0: lock structure base
+ENTRY(vlock_unlock)
+	dmb
+	mov	r1, #VLOCK_OWNER_NONE
+	strb	r1, [r0, #VLOCK_OWNER_OFFSET]
+	dsb
+	sev
+	bx	lr
+ENDPROC(vlock_unlock)
diff --git a/arch/arm/common/vlock.h b/arch/arm/common/vlock.h
new file mode 100644
index 0000000000..eda912f915
--- /dev/null
+++ b/arch/arm/common/vlock.h
@@ -0,0 +1,29 @@
+/*
+ * vlock.h - simple voting lock implementation
+ *
+ * Created by:	Dave Martin, 2012-08-16
+ * Copyright:	(C) 2012-2013  Linaro Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef __VLOCK_H
+#define __VLOCK_H
+
+#include <asm/mcpm_entry.h>
+
+/* Offsets and sizes are rounded to a word (4 bytes) */
+#define VLOCK_OWNER_OFFSET	0
+#define VLOCK_VOTING_OFFSET	4
+#define VLOCK_VOTING_SIZE	((MAX_CPUS_PER_CLUSTER + 3) / 4 * 4)
+#define VLOCK_SIZE		(VLOCK_VOTING_OFFSET + VLOCK_VOTING_SIZE)
+#define VLOCK_OWNER_NONE	0
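+
+/*
+ * Worked example, assuming MAX_CPUS_PER_CLUSTER == 4:
+ * VLOCK_VOTING_SIZE = (4 + 3) / 4 * 4 = 4, so each vlock occupies
+ * VLOCK_SIZE = 8 bytes: one owner word followed by one word of
+ * per-CPU voting bytes.
+ */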
+
+#endif /* ! __VLOCK_H */
-- 
1.8.1.2

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 05/15] ARM: mcpm_head.S: vlock-based first man election
  2013-01-29  7:50 [PATCH v3 00/15] multi-cluster power management Nicolas Pitre
                   ` (3 preceding siblings ...)
  2013-01-29  7:50 ` [PATCH v3 04/15] ARM: mcpm: Add baremetal voting mutexes Nicolas Pitre
@ 2013-01-29  7:51 ` Nicolas Pitre
  2013-02-01  5:34   ` Santosh Shilimkar
  2013-01-29  7:51 ` [PATCH v3 06/15] ARM: mcpm: generic SMP secondary bringup and hotplug support Nicolas Pitre
                   ` (10 subsequent siblings)
  15 siblings, 1 reply; 54+ messages in thread
From: Nicolas Pitre @ 2013-01-29  7:51 UTC (permalink / raw)
  To: linux-arm-kernel

From: Dave Martin <dave.martin@linaro.org>

Instead of requiring the first man to be elected in advance (which
can be suboptimal in some situations), this patch uses a per-
cluster mutex to co-ordinate selection of the first man.

This should also make it more feasible to reuse this code path for
asynchronous cluster resume (as in CPUidle scenarios).

We must ensure that the vlock data doesn't share a cacheline with
anything else, or dirty cache eviction could corrupt it.

Signed-off-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
---
 arch/arm/common/Makefile    |  2 +-
 arch/arm/common/mcpm_head.S | 41 ++++++++++++++++++++++++++++++++++++-----
 2 files changed, 37 insertions(+), 6 deletions(-)

diff --git a/arch/arm/common/Makefile b/arch/arm/common/Makefile
index 23e85b1fae..c901a38c59 100644
--- a/arch/arm/common/Makefile
+++ b/arch/arm/common/Makefile
@@ -13,4 +13,4 @@ obj-$(CONFIG_SHARP_PARAM)	+= sharpsl_param.o
 obj-$(CONFIG_SHARP_SCOOP)	+= scoop.o
 obj-$(CONFIG_PCI_HOST_ITE8152)  += it8152.o
 obj-$(CONFIG_ARM_TIMER_SP804)	+= timer-sp.o
-obj-$(CONFIG_CLUSTER_PM)	+= mcpm_head.o mcpm_entry.o
+obj-$(CONFIG_CLUSTER_PM)	+= mcpm_head.o mcpm_entry.o vlock.o
diff --git a/arch/arm/common/mcpm_head.S b/arch/arm/common/mcpm_head.S
index 65db7ec87e..a2a2bb6bf0 100644
--- a/arch/arm/common/mcpm_head.S
+++ b/arch/arm/common/mcpm_head.S
@@ -16,6 +16,8 @@
 #include <linux/linkage.h>
 #include <asm/mcpm_entry.h>
 
+#include "vlock.h"
+
 .if MCPM_SYNC_CLUSTER_CPUS
 .error "cpus must be the first member of struct mcpm_sync_struct"
 .endif
@@ -69,10 +71,11 @@ ENTRY(mcpm_entry_point)
 	 * position independent way.
 	 */
 	adr	r5, 3f
-	ldmia	r5, {r6, r7, r8}
+	ldmia	r5, {r6, r7, r8, r11}
 	add	r6, r5, r6			@ r6 = mcpm_entry_vectors
 	ldr	r7, [r5, r7]			@ r7 = mcpm_power_up_setup_phys
 	add	r8, r5, r8			@ r8 = mcpm_sync
+	add	r11, r5, r11			@ r11 = first_man_locks
 
 	mov	r0, #MCPM_SYNC_CLUSTER_SIZE
 	mla	r8, r0, r10, r8			@ r8 = sync cluster base
@@ -86,13 +89,22 @@ ENTRY(mcpm_entry_point)
 	@ At this point, the cluster cannot unexpectedly enter the GOING_DOWN
 	@ state, because there is at least one active CPU (this CPU).
 
-	@ Note: the following is racy as another CPU might be testing
-	@ the same flag at the same moment.  That'll be fixed later.
+	mov	r0, #VLOCK_SIZE
+	mla	r11, r0, r10, r11		@ r11 = cluster first man lock
+	mov	r0, r11
+	mov	r1, r9				@ cpu
+	bl	vlock_trylock			@ implies DMB
+
+	cmp	r0, #0				@ failed to get the lock?
+	bne	mcpm_setup_wait		@ wait for cluster setup if so
+
 	ldrb	r0, [r8, #MCPM_SYNC_CLUSTER_CLUSTER]
 	cmp	r0, #CLUSTER_UP			@ cluster already up?
 	bne	mcpm_setup			@ if not, set up the cluster
 
-	@ Otherwise, skip setup:
+	@ Otherwise, release the first man lock and skip setup:
+	mov	r0, r11
+	bl	vlock_unlock
 	b	mcpm_setup_complete
 
 mcpm_setup:
@@ -142,6 +154,19 @@ mcpm_setup_leave:
 	dsb
 	sev
 
+	mov	r0, r11
+	bl	vlock_unlock	@ implies DMB
+	b	mcpm_setup_complete
+
+	@ In the contended case, non-first men wait here for cluster setup
+	@ to complete:
+mcpm_setup_wait:
+	ldrb	r0, [r8, #MCPM_SYNC_CLUSTER_CLUSTER]
+	cmp	r0, #CLUSTER_UP
+	wfene
+	bne	mcpm_setup_wait
+	dmb
+
 mcpm_setup_complete:
 	@ If a platform-specific CPU setup hook is needed, it is
 	@ called from here.
@@ -173,11 +198,17 @@ mcpm_entry_gated:
 3:	.word	mcpm_entry_vectors - .
 	.word	mcpm_power_up_setup_phys - 3b
 	.word	mcpm_sync - 3b
+	.word	first_man_locks - 3b
 
 ENDPROC(mcpm_entry_point)
 
 	.bss
-	.align	5
+
+	.align	__CACHE_WRITEBACK_ORDER
+	.type	first_man_locks, #object
+first_man_locks:
+	.space	VLOCK_SIZE * MAX_NR_CLUSTERS
+	.align	__CACHE_WRITEBACK_ORDER
 
 	.type	mcpm_entry_vectors, #object
 ENTRY(mcpm_entry_vectors)
-- 
1.8.1.2

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 06/15] ARM: mcpm: generic SMP secondary bringup and hotplug support
  2013-01-29  7:50 [PATCH v3 00/15] multi-cluster power management Nicolas Pitre
                   ` (4 preceding siblings ...)
  2013-01-29  7:51 ` [PATCH v3 05/15] ARM: mcpm_head.S: vlock-based first man election Nicolas Pitre
@ 2013-01-29  7:51 ` Nicolas Pitre
  2013-01-29 20:38   ` Rob Herring
  2013-02-01  5:38   ` Santosh Shilimkar
  2013-01-29  7:51 ` [PATCH v3 07/15] ARM: vexpress: Select the correct SMP operations at run-time Nicolas Pitre
                   ` (9 subsequent siblings)
  15 siblings, 2 replies; 54+ messages in thread
From: Nicolas Pitre @ 2013-01-29  7:51 UTC (permalink / raw)
  To: linux-arm-kernel

Now that the cluster power API is in place, we can use it for SMP secondary
bringup and CPU hotplug in a generic fashion.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
---
 arch/arm/common/Makefile       |  2 +-
 arch/arm/common/mcpm_platsmp.c | 85 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 86 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm/common/mcpm_platsmp.c

diff --git a/arch/arm/common/Makefile b/arch/arm/common/Makefile
index c901a38c59..e1c9db45de 100644
--- a/arch/arm/common/Makefile
+++ b/arch/arm/common/Makefile
@@ -13,4 +13,4 @@ obj-$(CONFIG_SHARP_PARAM)	+= sharpsl_param.o
 obj-$(CONFIG_SHARP_SCOOP)	+= scoop.o
 obj-$(CONFIG_PCI_HOST_ITE8152)  += it8152.o
 obj-$(CONFIG_ARM_TIMER_SP804)	+= timer-sp.o
-obj-$(CONFIG_CLUSTER_PM)	+= mcpm_head.o mcpm_entry.o vlock.o
+obj-$(CONFIG_CLUSTER_PM)	+= mcpm_head.o mcpm_entry.o mcpm_platsmp.o vlock.o
diff --git a/arch/arm/common/mcpm_platsmp.c b/arch/arm/common/mcpm_platsmp.c
new file mode 100644
index 0000000000..401298f5ee
--- /dev/null
+++ b/arch/arm/common/mcpm_platsmp.c
@@ -0,0 +1,85 @@
+/*
+ * linux/arch/arm/common/mcpm_platsmp.c
+ *
+ * Created by:  Nicolas Pitre, November 2012
+ * Copyright:   (C) 2012-2013  Linaro Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Code to handle secondary CPU bringup and hotplug for the cluster power API.
+ */
+
+#include <linux/init.h>
+#include <linux/smp.h>
+
+#include <asm/mcpm_entry.h>
+#include <asm/smp_plat.h>
+#include <asm/hardware/gic.h>
+
+static void __init simple_smp_init_cpus(void)
+{
+	set_smp_cross_call(gic_raise_softirq);
+}
+
+static int __cpuinit mcpm_boot_secondary(unsigned int cpu, struct task_struct *idle)
+{
+	unsigned int mpidr, pcpu, pcluster;
+	int ret;
+	extern void secondary_startup(void);
+
+	mpidr = cpu_logical_map(cpu);
+	pcpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	pcluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+	pr_debug("%s: logical CPU %d is physical CPU %d cluster %d\n",
+		 __func__, cpu, pcpu, pcluster);
+
+	mcpm_set_entry_vector(pcpu, pcluster, NULL);
+	ret = mcpm_cpu_power_up(pcpu, pcluster);
+	if (ret)
+		return ret;
+	mcpm_set_entry_vector(pcpu, pcluster, secondary_startup);
+	gic_raise_softirq(cpumask_of(cpu), 0);
+	dsb_sev();
+	return 0;
+}
+
+static void __cpuinit mcpm_secondary_init(unsigned int cpu)
+{
+	mcpm_cpu_powered_up();
+	gic_secondary_init(0);
+}
+
+#ifdef CONFIG_HOTPLUG_CPU
+
+static int mcpm_cpu_disable(unsigned int cpu)
+{
+	/*
+	 * We assume all CPUs may be shut down.
+	 * This would be the hook to use for eventual Secure
+	 * OS migration requests as described in the PSCI spec.
+	 */
+	return 0;
+}
+
+static void mcpm_cpu_die(unsigned int cpu)
+{
+	unsigned int mpidr, pcpu, pcluster;
+	mpidr = read_cpuid_mpidr();
+	pcpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	pcluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+	mcpm_set_entry_vector(pcpu, pcluster, NULL);
+	mcpm_cpu_power_down();
+}
+
+#endif
+
+struct smp_operations __initdata mcpm_smp_ops = {
+	.smp_init_cpus		= simple_smp_init_cpus,
+	.smp_boot_secondary	= mcpm_boot_secondary,
+	.smp_secondary_init	= mcpm_secondary_init,
+#ifdef CONFIG_HOTPLUG_CPU
+	.cpu_disable		= mcpm_cpu_disable,
+	.cpu_die		= mcpm_cpu_die,
+#endif
+};
-- 
1.8.1.2

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 07/15] ARM: vexpress: Select the correct SMP operations at run-time
  2013-01-29  7:50 [PATCH v3 00/15] multi-cluster power management Nicolas Pitre
                   ` (5 preceding siblings ...)
  2013-01-29  7:51 ` [PATCH v3 06/15] ARM: mcpm: generic SMP secondary bringup and hotplug support Nicolas Pitre
@ 2013-01-29  7:51 ` Nicolas Pitre
  2013-01-29 15:43   ` Jon Medhurst (Tixy)
  2013-02-01  5:41   ` Santosh Shilimkar
  2013-01-29  7:51 ` [PATCH v3 08/15] ARM: introduce common set_auxcr/get_auxcr functions Nicolas Pitre
                   ` (8 subsequent siblings)
  15 siblings, 2 replies; 54+ messages in thread
From: Nicolas Pitre @ 2013-01-29  7:51 UTC (permalink / raw)
  To: linux-arm-kernel

From: Jon Medhurst <tixy@linaro.org>

Signed-off-by: Jon Medhurst <tixy@linaro.org>
---
 arch/arm/include/asm/mach/arch.h |  3 +++
 arch/arm/kernel/setup.c          |  5 ++++-
 arch/arm/mach-vexpress/core.h    |  2 ++
 arch/arm/mach-vexpress/platsmp.c | 12 ++++++++++++
 arch/arm/mach-vexpress/v2m.c     |  2 +-
 5 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/mach/arch.h b/arch/arm/include/asm/mach/arch.h
index 917d4fcfd9..3d01c6d6c3 100644
--- a/arch/arm/include/asm/mach/arch.h
+++ b/arch/arm/include/asm/mach/arch.h
@@ -17,8 +17,10 @@ struct pt_regs;
 struct smp_operations;
 #ifdef CONFIG_SMP
 #define smp_ops(ops) (&(ops))
+#define smp_init_ops(ops) (&(ops))
 #else
 #define smp_ops(ops) (struct smp_operations *)NULL
+#define smp_init_ops(ops) (void (*)(void))NULL
 #endif
 
 struct machine_desc {
@@ -42,6 +44,7 @@ struct machine_desc {
 	unsigned char		reserve_lp2 :1;	/* never has lp2	*/
 	char			restart_mode;	/* default restart mode	*/
 	struct smp_operations	*smp;		/* SMP operations	*/
+	void			(*smp_init)(void);
 	void			(*fixup)(struct tag *, char **,
 					 struct meminfo *);
 	void			(*reserve)(void);/* reserve mem blocks	*/
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 3f6cbb2e3e..41edca8582 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -768,7 +768,10 @@ void __init setup_arch(char **cmdline_p)
 	arm_dt_init_cpu_maps();
 #ifdef CONFIG_SMP
 	if (is_smp()) {
-		smp_set_ops(mdesc->smp);
+		if (mdesc->smp_init)
+			(*mdesc->smp_init)();
+		else
+			smp_set_ops(mdesc->smp);
 		smp_init_cpus();
 	}
 #endif
diff --git a/arch/arm/mach-vexpress/core.h b/arch/arm/mach-vexpress/core.h
index f134cd4a85..3a761fd76c 100644
--- a/arch/arm/mach-vexpress/core.h
+++ b/arch/arm/mach-vexpress/core.h
@@ -6,6 +6,8 @@
 
 void vexpress_dt_smp_map_io(void);
 
+void vexpress_smp_init_ops(void);
+
 extern struct smp_operations	vexpress_smp_ops;
 
 extern void vexpress_cpu_die(unsigned int cpu);
diff --git a/arch/arm/mach-vexpress/platsmp.c b/arch/arm/mach-vexpress/platsmp.c
index c5d70de9bb..667344b479 100644
--- a/arch/arm/mach-vexpress/platsmp.c
+++ b/arch/arm/mach-vexpress/platsmp.c
@@ -12,6 +12,7 @@
 #include <linux/errno.h>
 #include <linux/smp.h>
 #include <linux/io.h>
+#include <linux/of.h>
 #include <linux/of_fdt.h>
 #include <linux/vexpress.h>
 
@@ -206,3 +207,14 @@ struct smp_operations __initdata vexpress_smp_ops = {
 	.cpu_die		= vexpress_cpu_die,
 #endif
 };
+
+void __init vexpress_smp_init_ops(void)
+{
+	struct smp_operations *ops = &vexpress_smp_ops;
+#ifdef CONFIG_CLUSTER_PM
+	extern struct smp_operations mcpm_smp_ops;
+	if (of_find_compatible_node(NULL, NULL, "arm,cci"))
+		ops = &mcpm_smp_ops;
+#endif
+	smp_set_ops(ops);
+}
diff --git a/arch/arm/mach-vexpress/v2m.c b/arch/arm/mach-vexpress/v2m.c
index 011661a6c5..34172bd504 100644
--- a/arch/arm/mach-vexpress/v2m.c
+++ b/arch/arm/mach-vexpress/v2m.c
@@ -494,7 +494,7 @@ static const char * const v2m_dt_match[] __initconst = {
 
 DT_MACHINE_START(VEXPRESS_DT, "ARM-Versatile Express")
 	.dt_compat	= v2m_dt_match,
-	.smp		= smp_ops(vexpress_smp_ops),
+	.smp_init	= smp_init_ops(vexpress_smp_init_ops),
 	.map_io		= v2m_dt_map_io,
 	.init_early	= v2m_dt_init_early,
 	.init_irq	= v2m_dt_init_irq,
-- 
1.8.1.2

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 08/15] ARM: introduce common set_auxcr/get_auxcr functions
  2013-01-29  7:50 [PATCH v3 00/15] multi-cluster power management Nicolas Pitre
                   ` (6 preceding siblings ...)
  2013-01-29  7:51 ` [PATCH v3 07/15] ARM: vexpress: Select the correct SMP operations at run-time Nicolas Pitre
@ 2013-01-29  7:51 ` Nicolas Pitre
  2013-02-01  5:44   ` Santosh Shilimkar
  2013-01-29  7:51 ` [PATCH v3 09/15] ARM: vexpress: introduce DCSCB support Nicolas Pitre
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 54+ messages in thread
From: Nicolas Pitre @ 2013-01-29  7:51 UTC (permalink / raw)
  To: linux-arm-kernel

From: Rob Herring <rob.herring@calxeda.com>

Move the private set_auxcr/get_auxcr functions from
drivers/cpuidle/cpuidle-calxeda.c so they can be used across platforms.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
---
 arch/arm/include/asm/cp15.h       | 14 ++++++++++++++
 drivers/cpuidle/cpuidle-calxeda.c | 14 --------------
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/arm/include/asm/cp15.h b/arch/arm/include/asm/cp15.h
index 5ef4d8015a..ce4d01c03e 100644
--- a/arch/arm/include/asm/cp15.h
+++ b/arch/arm/include/asm/cp15.h
@@ -59,6 +59,20 @@ static inline void set_cr(unsigned int val)
 	isb();
 }
 
+static inline unsigned int get_auxcr(void)
+{
+	unsigned int val;
+	asm("mrc p15, 0, %0, c1, c0, 1	@ get AUXCR" : "=r" (val));
+	return val;
+}
+
+static inline void set_auxcr(unsigned int val)
+{
+	asm volatile("mcr p15, 0, %0, c1, c0, 1	@ set AUXCR"
+	  : : "r" (val));
+	isb();
+}
+
 #ifndef CONFIG_SMP
 extern void adjust_cr(unsigned long mask, unsigned long set);
 #endif
diff --git a/drivers/cpuidle/cpuidle-calxeda.c b/drivers/cpuidle/cpuidle-calxeda.c
index e1aab38c5a..ece83d6e04 100644
--- a/drivers/cpuidle/cpuidle-calxeda.c
+++ b/drivers/cpuidle/cpuidle-calxeda.c
@@ -37,20 +37,6 @@ extern void *scu_base_addr;
 
 static struct cpuidle_device __percpu *calxeda_idle_cpuidle_devices;
 
-static inline unsigned int get_auxcr(void)
-{
-	unsigned int val;
-	asm("mrc p15, 0, %0, c1, c0, 1	@ get AUXCR" : "=r" (val) : : "cc");
-	return val;
-}
-
-static inline void set_auxcr(unsigned int val)
-{
-	asm volatile("mcr p15, 0, %0, c1, c0, 1	@ set AUXCR"
-	  : : "r" (val) : "cc");
-	isb();
-}
-
 static noinline void calxeda_idle_restore(void)
 {
 	set_cr(get_cr() | CR_C);
-- 
1.8.1.2

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 09/15] ARM: vexpress: introduce DCSCB support
  2013-01-29  7:50 [PATCH v3 00/15] multi-cluster power management Nicolas Pitre
                   ` (7 preceding siblings ...)
  2013-01-29  7:51 ` [PATCH v3 08/15] ARM: introduce common set_auxcr/get_auxcr functions Nicolas Pitre
@ 2013-01-29  7:51 ` Nicolas Pitre
  2013-02-01  5:50   ` Santosh Shilimkar
  2013-01-29  7:51 ` [PATCH v3 10/15] ARM: vexpress/dcscb: add CPU use counts to the power up/down API implementation Nicolas Pitre
                   ` (6 subsequent siblings)
  15 siblings, 1 reply; 54+ messages in thread
From: Nicolas Pitre @ 2013-01-29  7:51 UTC (permalink / raw)
  To: linux-arm-kernel

This adds basic CPU and cluster reset controls on RTSM for the
A15x4-A7x4 model configuration using the Dual Cluster System
Configuration Block (DCSCB).

The cache coherency interconnect (CCI) is not handled yet.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
---
 arch/arm/mach-vexpress/Kconfig  |   8 ++
 arch/arm/mach-vexpress/Makefile |   1 +
 arch/arm/mach-vexpress/dcscb.c  | 159 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 168 insertions(+)
 create mode 100644 arch/arm/mach-vexpress/dcscb.c

diff --git a/arch/arm/mach-vexpress/Kconfig b/arch/arm/mach-vexpress/Kconfig
index 52d315b792..f3f92b120a 100644
--- a/arch/arm/mach-vexpress/Kconfig
+++ b/arch/arm/mach-vexpress/Kconfig
@@ -52,4 +52,12 @@ config ARCH_VEXPRESS_CORTEX_A5_A9_ERRATA
 config ARCH_VEXPRESS_CA9X4
 	bool "Versatile Express Cortex-A9x4 tile"
 
+config ARCH_VEXPRESS_DCSCB
+	bool "Dual Cluster System Control Block (DCSCB) support"
+	depends on CLUSTER_PM
+	help
+	  Support for the Dual Cluster System Configuration Block (DCSCB).
+	  This is needed to provide CPU and cluster power management
+	  on RTSM.
+
 endmenu
diff --git a/arch/arm/mach-vexpress/Makefile b/arch/arm/mach-vexpress/Makefile
index 80b64971fb..2253644054 100644
--- a/arch/arm/mach-vexpress/Makefile
+++ b/arch/arm/mach-vexpress/Makefile
@@ -6,5 +6,6 @@ ccflags-$(CONFIG_ARCH_MULTIPLATFORM) := -I$(srctree)/$(src)/include \
 
 obj-y					:= v2m.o reset.o
 obj-$(CONFIG_ARCH_VEXPRESS_CA9X4)	+= ct-ca9x4.o
+obj-$(CONFIG_ARCH_VEXPRESS_DCSCB)	+= dcscb.o
 obj-$(CONFIG_SMP)			+= platsmp.o
 obj-$(CONFIG_HOTPLUG_CPU)		+= hotplug.o
diff --git a/arch/arm/mach-vexpress/dcscb.c b/arch/arm/mach-vexpress/dcscb.c
new file mode 100644
index 0000000000..677ced9efc
--- /dev/null
+++ b/arch/arm/mach-vexpress/dcscb.c
@@ -0,0 +1,159 @@
+/*
+ * arch/arm/mach-vexpress/dcscb.c - Dual Cluster System Control Block
+ *
+ * Created by:	Nicolas Pitre, May 2012
+ * Copyright:	(C) 2012-2013  Linaro Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/io.h>
+#include <linux/spinlock.h>
+#include <linux/errno.h>
+#include <linux/vexpress.h>
+
+#include <asm/mcpm_entry.h>
+#include <asm/proc-fns.h>
+#include <asm/cacheflush.h>
+#include <asm/cputype.h>
+#include <asm/cp15.h>
+
+
+#define DCSCB_PHYS_BASE	0x60000000
+
+#define RST_HOLD0	0x0
+#define RST_HOLD1	0x4
+#define SYS_SWRESET	0x8
+#define RST_STAT0	0xc
+#define RST_STAT1	0x10
+#define EAG_CFG_R	0x20
+#define EAG_CFG_W	0x24
+#define KFC_CFG_R	0x28
+#define KFC_CFG_W	0x2c
+#define DCS_CFG_R	0x30
+
+/*
+ * We can't use regular spinlocks. In the switcher case, it is possible
+ * for an outbound CPU to call power_down() after its inbound counterpart
+ * is already live using the same logical CPU number which trips lockdep
+ * debugging.
+ */
+static arch_spinlock_t dcscb_lock = __ARCH_SPIN_LOCK_UNLOCKED;
+
+static void __iomem *dcscb_base;
+
+static int dcscb_power_up(unsigned int cpu, unsigned int cluster)
+{
+	unsigned int rst_hold, cpumask = (1 << cpu);
+
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	if (cpu >= 4 || cluster >= 2)
+		return -EINVAL;
+
+	/*
+	 * Since this is called with IRQs enabled, and no arch_spin_lock_irq
+	 * variant exists, we need to disable IRQs manually here.
+	 */
+	local_irq_disable();
+	arch_spin_lock(&dcscb_lock);
+
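+	/*
+	 * RST_HOLDx layout, as used below: bits [3:0] and [7:4] hold
+	 * per-CPU reset controls, bit 8 holds the cluster-wide reset.
+	 */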
+	rst_hold = readl_relaxed(dcscb_base + RST_HOLD0 + cluster * 4);
+	if (rst_hold & (1 << 8)) {
+		/* remove cluster reset and add individual CPU's reset */
+		rst_hold &= ~(1 << 8);
+		rst_hold |= 0xf;
+	}
+	rst_hold &= ~(cpumask | (cpumask << 4));
+	writel(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4);
+
+	arch_spin_unlock(&dcscb_lock);
+	local_irq_enable();
+
+	return 0;
+}
+
+static void dcscb_power_down(void)
+{
+	unsigned int mpidr, cpu, cluster, rst_hold, cpumask, last_man;
+
+	mpidr = read_cpuid_mpidr();
+	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+	cpumask = (1 << cpu);
+
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	BUG_ON(cpu >= 4 || cluster >= 2);
+
+	arch_spin_lock(&dcscb_lock);
+	rst_hold = readl_relaxed(dcscb_base + RST_HOLD0 + cluster * 4);
+	rst_hold |= cpumask;
+	if (((rst_hold | (rst_hold >> 4)) & 0xf) == 0xf)
+		rst_hold |= (1 << 8);
+	writel(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4);
+	arch_spin_unlock(&dcscb_lock);
+	last_man = (rst_hold & (1 << 8));
+
+	/*
+	 * Now let's clean our L1 cache and shut ourself down.
+	 * If we're the last CPU in this cluster then clean L2 too.
+	 */
+
+	/*
+	 * A15/A7 can hit in the cache with SCTLR.C=0, so we don't need
+	 * a preliminary flush here for those CPUs.  At least, that's
+	 * the theory -- without the extra flush, Linux explodes on
+	 * RTSM (maybe not needed anymore, to be investigated).
+	 */
+	flush_cache_louis();
+	cpu_proc_fin();
+
+	if (!last_man) {
+		flush_cache_louis();
+	} else {
+		flush_cache_all();
+		outer_flush_all();
+	}
+
+	/* Disable local coherency by clearing the ACTLR "SMP" bit: */
+	set_auxcr(get_auxcr() & ~(1 << 6));
+
+	/* Now we are prepared for power-down, do it: */
+	dsb();
+	wfi();
+
+	/* Not dead at this point?  Let our caller cope. */
+}
+
+static const struct mcpm_platform_ops dcscb_power_ops = {
+	.power_up	= dcscb_power_up,
+	.power_down	= dcscb_power_down,
+};
+
+static int __init dcscb_init(void)
+{
+	int ret;
+
+	dcscb_base = ioremap(DCSCB_PHYS_BASE, 0x1000);
+	if (!dcscb_base)
+		return -EADDRNOTAVAIL;
+
+	ret = mcpm_platform_register(&dcscb_power_ops);
+	if (ret) {
+		iounmap(dcscb_base);
+		return ret;
+	}
+
+	/*
+	 * Future entries into the kernel can now go
+	 * through the cluster entry vectors.
+	 */
+	vexpress_flags_set(virt_to_phys(mcpm_entry_point));
+
+	return 0;
+}
+
+early_initcall(dcscb_init);
-- 
1.8.1.2

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 10/15] ARM: vexpress/dcscb: add CPU use counts to the power up/down API implementation
  2013-01-29  7:50 [PATCH v3 00/15] multi-cluster power management Nicolas Pitre
                   ` (8 preceding siblings ...)
  2013-01-29  7:51 ` [PATCH v3 09/15] ARM: vexpress: introduce DCSCB support Nicolas Pitre
@ 2013-01-29  7:51 ` Nicolas Pitre
  2013-02-01  5:53   ` Santosh Shilimkar
  2013-01-29  7:51 ` [PATCH v3 11/15] ARM: vexpress/dcscb: do not hardcode number of CPUs per cluster Nicolas Pitre
                   ` (5 subsequent siblings)
  15 siblings, 1 reply; 54+ messages in thread
From: Nicolas Pitre @ 2013-01-29  7:51 UTC (permalink / raw)
  To: linux-arm-kernel

It is possible for a CPU to be told to power up before it managed
to power itself down.  Solve this race with a usage count as mandated
by the API definition.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
---
 arch/arm/mach-vexpress/dcscb.c | 77 +++++++++++++++++++++++++++++++++---------
 1 file changed, 61 insertions(+), 16 deletions(-)

diff --git a/arch/arm/mach-vexpress/dcscb.c b/arch/arm/mach-vexpress/dcscb.c
index 677ced9efc..f993608944 100644
--- a/arch/arm/mach-vexpress/dcscb.c
+++ b/arch/arm/mach-vexpress/dcscb.c
@@ -45,6 +45,7 @@
 static arch_spinlock_t dcscb_lock = __ARCH_SPIN_LOCK_UNLOCKED;
 
 static void __iomem *dcscb_base;
+static int dcscb_use_count[4][2];
 
 static int dcscb_power_up(unsigned int cpu, unsigned int cluster)
 {
@@ -61,14 +62,27 @@ static int dcscb_power_up(unsigned int cpu, unsigned int cluster)
 	local_irq_disable();
 	arch_spin_lock(&dcscb_lock);
 
-	rst_hold = readl_relaxed(dcscb_base + RST_HOLD0 + cluster * 4);
-	if (rst_hold & (1 << 8)) {
-		/* remove cluster reset and add individual CPU's reset */
-		rst_hold &= ~(1 << 8);
-		rst_hold |= 0xf;
+	dcscb_use_count[cpu][cluster]++;
+	if (dcscb_use_count[cpu][cluster] == 1) {
+		rst_hold = readl_relaxed(dcscb_base + RST_HOLD0 + cluster * 4);
+		if (rst_hold & (1 << 8)) {
+			/* remove cluster reset and add individual CPU's reset */
+			rst_hold &= ~(1 << 8);
+			rst_hold |= 0xf;
+		}
+		rst_hold &= ~(cpumask | (cpumask << 4));
+		writel(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4);
+	} else if (dcscb_use_count[cpu][cluster] != 2) {
+		/*
+		 * The only possible values are:
+		 * 0 = CPU down
+		 * 1 = CPU (still) up
+		 * 2 = CPU requested to be up before it had a chance
+		 *     to actually make itself down.
+		 * Any other value is a bug.
+		 */
+		BUG();
 	}
-	rst_hold &= ~(cpumask | (cpumask << 4));
-	writel(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4);
 
 	arch_spin_unlock(&dcscb_lock);
 	local_irq_enable();
@@ -78,7 +92,8 @@ static int dcscb_power_up(unsigned int cpu, unsigned int cluster)
 
 static void dcscb_power_down(void)
 {
-	unsigned int mpidr, cpu, cluster, rst_hold, cpumask, last_man;
+	unsigned int mpidr, cpu, cluster, rst_hold, cpumask;
+	bool last_man = false, skip_wfi = false;
 
 	mpidr = read_cpuid_mpidr();
 	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
@@ -89,13 +104,26 @@ static void dcscb_power_down(void)
 	BUG_ON(cpu >= 4 || cluster >= 2);
 
 	arch_spin_lock(&dcscb_lock);
-	rst_hold = readl_relaxed(dcscb_base + RST_HOLD0 + cluster * 4);
-	rst_hold |= cpumask;
-	if (((rst_hold | (rst_hold >> 4)) & 0xf) == 0xf)
-		rst_hold |= (1 << 8);
-	writel(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4);
+	dcscb_use_count[cpu][cluster]--;
+	if (dcscb_use_count[cpu][cluster] == 0) {
+		rst_hold = readl_relaxed(dcscb_base + RST_HOLD0 + cluster * 4);
+		rst_hold |= cpumask;
+		if (((rst_hold | (rst_hold >> 4)) & 0xf) == 0xf) {
+			rst_hold |= (1 << 8);
+			last_man = true;
+		}
+		writel(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4);
+	} else if (dcscb_use_count[cpu][cluster] == 1) {
+		/*
+		 * A power_up request went ahead of us.
+		 * Even if we do not want to shut this CPU down,
+		 * the caller expects a certain state as if the WFI
+		 * was aborted.  So let's continue with cache cleaning.
+		 */
+		skip_wfi = true;
+	} else
+		BUG();
 	arch_spin_unlock(&dcscb_lock);
-	last_man = (rst_hold & (1 << 8));
 
 	/*
 	 * Now let's clean our L1 cache and shut ourself down.
@@ -122,8 +150,10 @@ static void dcscb_power_down(void)
 	set_auxcr(get_auxcr() & ~(1 << 6));
 
 	/* Now we are prepared for power-down, do it: */
-	dsb();
-	wfi();
+	if (!skip_wfi) {
+		dsb();
+		wfi();
+	}
 
 	/* Not dead at this point?  Let our caller cope. */
 }
@@ -133,6 +163,19 @@ static const struct mcpm_platform_ops dcscb_power_ops = {
 	.power_down	= dcscb_power_down,
 };
 
+static void __init dcscb_usage_count_init(void)
+{
+	unsigned int mpidr, cpu, cluster;
+
+	mpidr = read_cpuid_mpidr();
+	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	BUG_ON(cpu >= 4 || cluster >= 2);
+	dcscb_use_count[cpu][cluster] = 1;
+}
+
 static int __init dcscb_init(void)
 {
 	int ret;
@@ -141,6 +184,8 @@ static int __init dcscb_init(void)
 	if (!dcscb_base)
 		return -EADDRNOTAVAIL;
 
+	dcscb_usage_count_init();
+
 	ret = mcpm_platform_register(&dcscb_power_ops);
 	if (ret) {
 		iounmap(dcscb_base);
-- 
1.8.1.2

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 11/15] ARM: vexpress/dcscb: do not hardcode number of CPUs per cluster
  2013-01-29  7:50 [PATCH v3 00/15] multi-cluster power management Nicolas Pitre
                   ` (9 preceding siblings ...)
  2013-01-29  7:51 ` [PATCH v3 10/15] ARM: vexpress/dcscb: add CPU use counts to the power up/down API implementation Nicolas Pitre
@ 2013-01-29  7:51 ` Nicolas Pitre
  2013-02-01  5:57   ` Santosh Shilimkar
  2013-01-29  7:51 ` [PATCH v3 12/15] drivers/bus: add ARM CCI support Nicolas Pitre
                   ` (4 subsequent siblings)
  15 siblings, 1 reply; 54+ messages in thread
From: Nicolas Pitre @ 2013-01-29  7:51 UTC (permalink / raw)
  To: linux-arm-kernel

If 4 CPUs per cluster are assumed, the A15x1-A7x1 model configuration
would never shut down the initial cluster, as the 0xf reset bit mask
would never be fully set.  Let's construct this mask from the number
of CPUs per cluster provided in the DCSCB config register.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
---
 arch/arm/mach-vexpress/dcscb.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/arch/arm/mach-vexpress/dcscb.c b/arch/arm/mach-vexpress/dcscb.c
index f993608944..8d363357ef 100644
--- a/arch/arm/mach-vexpress/dcscb.c
+++ b/arch/arm/mach-vexpress/dcscb.c
@@ -46,10 +46,12 @@ static arch_spinlock_t dcscb_lock = __ARCH_SPIN_LOCK_UNLOCKED;
 
 static void __iomem *dcscb_base;
 static int dcscb_use_count[4][2];
+static int dcscb_mcpm_cpu_mask[2];
 
 static int dcscb_power_up(unsigned int cpu, unsigned int cluster)
 {
 	unsigned int rst_hold, cpumask = (1 << cpu);
+	unsigned int mcpm_mask = dcscb_mcpm_cpu_mask[cluster];
 
 	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
 	if (cpu >= 4 || cluster >= 2)
@@ -68,7 +70,7 @@ static int dcscb_power_up(unsigned int cpu, unsigned int cluster)
 		if (rst_hold & (1 << 8)) {
 			/* remove cluster reset and add individual CPU's reset */
 			rst_hold &= ~(1 << 8);
-			rst_hold |= 0xf;
+			rst_hold |= mcpm_mask;
 		}
 		rst_hold &= ~(cpumask | (cpumask << 4));
 		writel(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4);
@@ -92,13 +94,14 @@ static int dcscb_power_up(unsigned int cpu, unsigned int cluster)
 
 static void dcscb_power_down(void)
 {
-	unsigned int mpidr, cpu, cluster, rst_hold, cpumask;
+	unsigned int mpidr, cpu, cluster, rst_hold, cpumask, mcpm_mask;
 	bool last_man = false, skip_wfi = false;
 
 	mpidr = read_cpuid_mpidr();
 	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
 	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
 	cpumask = (1 << cpu);
+	mcpm_mask = dcscb_mcpm_cpu_mask[cluster];
 
 	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
 	BUG_ON(cpu >= 4 || cluster >= 2);
@@ -108,7 +111,7 @@ static void dcscb_power_down(void)
 	if (dcscb_use_count[cpu][cluster] == 0) {
 		rst_hold = readl_relaxed(dcscb_base + RST_HOLD0 + cluster * 4);
 		rst_hold |= cpumask;
-		if (((rst_hold | (rst_hold >> 4)) & 0xf) == 0xf) {
+		if (((rst_hold | (rst_hold >> 4)) & mcpm_mask) == mcpm_mask) {
 			rst_hold |= (1 << 8);
 			last_man = true;
 		}
@@ -178,12 +181,15 @@ static void __init dcscb_usage_count_init(void)
 
 static int __init dcscb_init(void)
 {
+	unsigned int cfg;
 	int ret;
 
 	dcscb_base = ioremap(DCSCB_PHYS_BASE, 0x1000);
 	if (!dcscb_base)
 		return -EADDRNOTAVAIL;
-
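+
+	/*
+	 * The DCSCB config register carries the number of CPUs in each
+	 * cluster in 4-bit fields starting at bit 16 (bits [19:16] for
+	 * cluster 0, [23:20] for cluster 1); turn each count n into an
+	 * n-bit "all CPUs" reset mask.
+	 */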
+	cfg = readl_relaxed(dcscb_base + DCS_CFG_R);
+	dcscb_mcpm_cpu_mask[0] = (1 << (((cfg >> 16) >> (0 << 2)) & 0xf)) - 1;
+	dcscb_mcpm_cpu_mask[1] = (1 << (((cfg >> 16) >> (1 << 2)) & 0xf)) - 1;
 	dcscb_usage_count_init();
 
 	ret = mcpm_platform_register(&dcscb_power_ops);
-- 
1.8.1.2

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 12/15] drivers/bus: add ARM CCI support
  2013-01-29  7:50 [PATCH v3 00/15] multi-cluster power management Nicolas Pitre
                   ` (10 preceding siblings ...)
  2013-01-29  7:51 ` [PATCH v3 11/15] ARM: vexpress/dcscb: do not hardcode number of CPUs per cluster Nicolas Pitre
@ 2013-01-29  7:51 ` Nicolas Pitre
  2013-02-01  6:01   ` Santosh Shilimkar
  2013-01-29  7:51 ` [PATCH v3 13/15] ARM: CCI: ensure powerdown-time data is flushed from cache Nicolas Pitre
                   ` (3 subsequent siblings)
  15 siblings, 1 reply; 54+ messages in thread
From: Nicolas Pitre @ 2013-01-29  7:51 UTC (permalink / raw)
  To: linux-arm-kernel

From: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>

On ARM multi-cluster systems coherency between cores running on
different clusters is managed by the cache-coherent interconnect (CCI).
It allows broadcasting of TLB invalidates and memory barriers and it
guarantees cache coherency at system level.

This patch enables the basic infrastructure required in Linux to
handle and programme the CCI component. The first implementation is
based on a platform device, its corresponding DT compatible property,
and a simple programming interface.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
---
 drivers/bus/Kconfig     |   4 ++
 drivers/bus/Makefile    |   2 +
 drivers/bus/arm-cci.c   | 107 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/arm-cci.h |  30 ++++++++++++++
 4 files changed, 143 insertions(+)
 create mode 100644 drivers/bus/arm-cci.c
 create mode 100644 include/linux/arm-cci.h

diff --git a/drivers/bus/Kconfig b/drivers/bus/Kconfig
index 0f51ed687d..d032f74ff2 100644
--- a/drivers/bus/Kconfig
+++ b/drivers/bus/Kconfig
@@ -19,4 +19,8 @@ config OMAP_INTERCONNECT
 
 	help
 	  Driver to enable OMAP interconnect error handling driver.
+
+config ARM_CCI
+       bool "ARM CCI driver support"
+
 endmenu
diff --git a/drivers/bus/Makefile b/drivers/bus/Makefile
index 45d997c854..55aac809e5 100644
--- a/drivers/bus/Makefile
+++ b/drivers/bus/Makefile
@@ -6,3 +6,5 @@ obj-$(CONFIG_OMAP_OCP2SCP)	+= omap-ocp2scp.o
 
 # Interconnect bus driver for OMAP SoCs.
 obj-$(CONFIG_OMAP_INTERCONNECT)	+= omap_l3_smx.o omap_l3_noc.o
+
+obj-$(CONFIG_ARM_CCI)		+= arm-cci.o
diff --git a/drivers/bus/arm-cci.c b/drivers/bus/arm-cci.c
new file mode 100644
index 0000000000..25ae156924
--- /dev/null
+++ b/drivers/bus/arm-cci.c
@@ -0,0 +1,107 @@
+/*
+ * ARM Cache Coherency Interconnect (CCI400) support
+ *
+ * Copyright (C) 2012-2013 ARM Ltd.
+ * Author: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/device.h>
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/arm-cci.h>
+
+#define CCI400_EAG_OFFSET       0x4000
+#define CCI400_KF_OFFSET        0x5000
+
+#define DRIVER_NAME	"CCI"
+
+struct cci_drvdata {
+	void __iomem *baseaddr;
+	spinlock_t lock;
+};
+
+static struct cci_drvdata *info;
+
+void disable_cci(int cluster)
+{
+	u32 cci_reg = cluster ? CCI400_KF_OFFSET : CCI400_EAG_OFFSET;
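+
+	/* Clear the snoop and DVM enables for this cluster's slave
+	 * interface, then spin until the CCI status register no longer
+	 * reports a pending change (bit 0). */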
+	writel_relaxed(0x0, info->baseaddr + cci_reg);
+
+	while (readl_relaxed(info->baseaddr + 0xc) & 0x1)
+		;
+}
+EXPORT_SYMBOL_GPL(disable_cci);
+
+static int cci_driver_probe(struct platform_device *pdev)
+{
+	struct resource *res;
+	int ret = 0;
+
+	info = kzalloc(sizeof(*info), GFP_KERNEL);
+	if (!info) {
+		dev_err(&pdev->dev, "unable to allocate mem\n");
+		return -ENOMEM;
+	}
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!res) {
+		dev_err(&pdev->dev, "No memory resource\n");
+		ret = -EINVAL;
+		goto mem_free;
+	}
+
+	if (!request_mem_region(res->start, resource_size(res),
+				dev_name(&pdev->dev))) {
+		dev_err(&pdev->dev, "address 0x%x in use\n", (u32) res->start);
+		ret = -EBUSY;
+		goto mem_free;
+	}
+
+	info->baseaddr = ioremap(res->start, resource_size(res));
+	if (!info->baseaddr) {
+		ret = -EADDRNOTAVAIL;
+		goto ioremap_err;
+	}
+
+	platform_set_drvdata(pdev, info);
+
+	pr_info("CCI loaded at %p\n", info->baseaddr);
+	return ret;
+
+ioremap_err:
+	release_mem_region(res->start, resource_size(res));
+mem_free:
+	kfree(info);
+
+	return ret;
+}
+
+static const struct of_device_id arm_cci_matches[] = {
+	{.compatible = "arm,cci"},
+	{},
+};
+
+static struct platform_driver cci_platform_driver = {
+	.driver = {
+		   .name = DRIVER_NAME,
+		   .of_match_table = arm_cci_matches,
+		  },
+	.probe = cci_driver_probe,
+};
+
+static int __init cci_init(void)
+{
+	return platform_driver_register(&cci_platform_driver);
+}
+
+core_initcall(cci_init);
diff --git a/include/linux/arm-cci.h b/include/linux/arm-cci.h
new file mode 100644
index 0000000000..86ae587817
--- /dev/null
+++ b/include/linux/arm-cci.h
@@ -0,0 +1,30 @@
+/*
+ * CCI support
+ *
+ * Copyright (C) 2012-2013 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#ifndef __LINUX_ARM_CCI_H
+#define __LINUX_ARM_CCI_H
+
+#ifdef CONFIG_ARM_CCI
+extern void disable_cci(int cluster);
+#else
+static inline void disable_cci(int cluster) { }
+#endif
+
+#endif
-- 
1.8.1.2

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 13/15] ARM: CCI: ensure powerdown-time data is flushed from cache
  2013-01-29  7:50 [PATCH v3 00/15] multi-cluster power management Nicolas Pitre
                   ` (11 preceding siblings ...)
  2013-01-29  7:51 ` [PATCH v3 12/15] drivers/bus: add ARM CCI support Nicolas Pitre
@ 2013-01-29  7:51 ` Nicolas Pitre
  2013-02-01  6:13   ` Santosh Shilimkar
  2013-01-29  7:51 ` [PATCH v3 14/15] ARM: vexpress/dcscb: handle platform coherency exit/setup and CCI Nicolas Pitre
                   ` (2 subsequent siblings)
  15 siblings, 1 reply; 54+ messages in thread
From: Nicolas Pitre @ 2013-01-29  7:51 UTC (permalink / raw)
  To: linux-arm-kernel

From: Dave Martin <dave.martin@linaro.org>

Non-local variables used by the CCI management function called after
disabling the cache must be flushed out to main memory in advance,
otherwise those values may become incoherent if they are sitting
in the cache of some other CPU when disable_cci() executes.

This patch adds the appropriate flushing to the CCI driver to ensure
that the relevant data is available in RAM ahead of time.

Because this creates a dependency on arch-specific cacheflushing
functions, this patch also makes ARM_CCI depend on ARM.

Signed-off-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
---
 drivers/bus/Kconfig   |  1 +
 drivers/bus/arm-cci.c | 21 +++++++++++++++++++--
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/bus/Kconfig b/drivers/bus/Kconfig
index d032f74ff2..cd4ac9f001 100644
--- a/drivers/bus/Kconfig
+++ b/drivers/bus/Kconfig
@@ -22,5 +22,6 @@ config OMAP_INTERCONNECT
 
 config ARM_CCI
        bool "ARM CCI driver support"
+	depends on ARM
 
 endmenu
diff --git a/drivers/bus/arm-cci.c b/drivers/bus/arm-cci.c
index 25ae156924..30e4a77da0 100644
--- a/drivers/bus/arm-cci.c
+++ b/drivers/bus/arm-cci.c
@@ -21,8 +21,16 @@
 #include <linux/slab.h>
 #include <linux/arm-cci.h>
 
-#define CCI400_EAG_OFFSET       0x4000
-#define CCI400_KF_OFFSET        0x5000
+#include <asm/cacheflush.h>
+#include <asm/memory.h>
+#include <asm/outercache.h>
+
+#include <asm/irq_regs.h>
+#include <asm/pmu.h>
+
+#define CCI400_PMCR                   0x0100
+#define CCI400_EAG_OFFSET             0x4000
+#define CCI400_KF_OFFSET              0x5000
 
 #define DRIVER_NAME	"CCI"
 struct cci_drvdata {
@@ -73,6 +81,15 @@ static int cci_driver_probe(struct platform_device *pdev)
 		goto ioremap_err;
 	}
 
+	/*
+	 * Multi-cluster systems may need this data when non-coherent, during
+	 * cluster power-up/power-down. Make sure it reaches main memory:
+	 */
+	__cpuc_flush_dcache_area(info, sizeof *info);
+	__cpuc_flush_dcache_area(&info, sizeof info);
+	outer_clean_range(virt_to_phys(info), virt_to_phys(info + 1));
+	outer_clean_range(virt_to_phys(&info), virt_to_phys(&info + 1));
+
 	platform_set_drvdata(pdev, info);
 
 	pr_info("CCI loaded at %p\n", info->baseaddr);
-- 
1.8.1.2

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 14/15] ARM: vexpress/dcscb: handle platform coherency exit/setup and CCI
  2013-01-29  7:50 [PATCH v3 00/15] multi-cluster power management Nicolas Pitre
                   ` (12 preceding siblings ...)
  2013-01-29  7:51 ` [PATCH v3 13/15] ARM: CCI: ensure powerdown-time data is flushed from cache Nicolas Pitre
@ 2013-01-29  7:51 ` Nicolas Pitre
  2013-01-29 10:46   ` Lorenzo Pieralisi
  2013-02-01  6:15   ` Santosh Shilimkar
  2013-01-29  7:51 ` [PATCH v3 15/15] ARM: vexpress/dcscb: probe via device tree Nicolas Pitre
  2013-02-04 14:24 ` [PATCH v3 00/15] multi-cluster power management Will Deacon
  15 siblings, 2 replies; 54+ messages in thread
From: Nicolas Pitre @ 2013-01-29  7:51 UTC (permalink / raw)
  To: linux-arm-kernel

From: Dave Martin <dave.martin@linaro.org>

Add the required code to properly handle race free platform coherency exit
to the DCSCB power down method.

The power_up_setup callback is used to enable the CCI interface for
the cluster being brought up.  This must be done in assembly before
the kernel environment is entered.

Thanks to Achin Gupta and Nicolas Pitre for their help and
contributions.

Signed-off-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
---
 arch/arm/mach-vexpress/Kconfig       |  1 +
 arch/arm/mach-vexpress/Makefile      |  2 +-
 arch/arm/mach-vexpress/dcscb.c       | 74 ++++++++++++++++++++++++---------
 arch/arm/mach-vexpress/dcscb_setup.S | 80 ++++++++++++++++++++++++++++++++++++
 4 files changed, 137 insertions(+), 20 deletions(-)
 create mode 100644 arch/arm/mach-vexpress/dcscb_setup.S

diff --git a/arch/arm/mach-vexpress/Kconfig b/arch/arm/mach-vexpress/Kconfig
index f3f92b120a..f8fbe7c6a2 100644
--- a/arch/arm/mach-vexpress/Kconfig
+++ b/arch/arm/mach-vexpress/Kconfig
@@ -55,6 +55,7 @@ config ARCH_VEXPRESS_CA9X4
 config ARCH_VEXPRESS_DCSCB
 	bool "Dual Cluster System Control Block (DCSCB) support"
 	depends on CLUSTER_PM
+	select ARM_CCI
 	help
 	  Support for the Dual Cluster System Configuration Block (DCSCB).
 	  This is needed to provide CPU and cluster power management
diff --git a/arch/arm/mach-vexpress/Makefile b/arch/arm/mach-vexpress/Makefile
index 2253644054..f6e90f3272 100644
--- a/arch/arm/mach-vexpress/Makefile
+++ b/arch/arm/mach-vexpress/Makefile
@@ -6,6 +6,6 @@ ccflags-$(CONFIG_ARCH_MULTIPLATFORM) := -I$(srctree)/$(src)/include \
 
 obj-y					:= v2m.o reset.o
 obj-$(CONFIG_ARCH_VEXPRESS_CA9X4)	+= ct-ca9x4.o
-obj-$(CONFIG_ARCH_VEXPRESS_DCSCB)	+= dcscb.o
+obj-$(CONFIG_ARCH_VEXPRESS_DCSCB)	+= dcscb.o dcscb_setup.o
 obj-$(CONFIG_SMP)			+= platsmp.o
 obj-$(CONFIG_HOTPLUG_CPU)		+= hotplug.o
diff --git a/arch/arm/mach-vexpress/dcscb.c b/arch/arm/mach-vexpress/dcscb.c
index 8d363357ef..58051ffafb 100644
--- a/arch/arm/mach-vexpress/dcscb.c
+++ b/arch/arm/mach-vexpress/dcscb.c
@@ -15,6 +15,7 @@
 #include <linux/spinlock.h>
 #include <linux/errno.h>
 #include <linux/vexpress.h>
+#include <linux/arm-cci.h>
 
 #include <asm/mcpm_entry.h>
 #include <asm/proc-fns.h>
@@ -106,6 +107,8 @@ static void dcscb_power_down(void)
 	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
 	BUG_ON(cpu >= 4 || cluster >= 2);
 
+	__mcpm_cpu_going_down(cpu, cluster);
+
 	arch_spin_lock(&dcscb_lock);
 	dcscb_use_count[cpu][cluster]--;
 	if (dcscb_use_count[cpu][cluster] == 0) {
@@ -113,6 +116,7 @@ static void dcscb_power_down(void)
 		rst_hold |= cpumask;
 		if (((rst_hold | (rst_hold >> 4)) & mcpm_mask) == mcpm_mask) {
 			rst_hold |= (1 << 8);
+			BUG_ON(__mcpm_mcpm_state(cluster) != CLUSTER_UP);
 			last_man = true;
 		}
 		writel(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4);
@@ -126,31 +130,59 @@ static void dcscb_power_down(void)
 		skip_wfi = true;
 	} else
 		BUG();
-	arch_spin_unlock(&dcscb_lock);
 
-	/*
-	 * Now let's clean our L1 cache and shut ourself down.
-	 * If we're the last CPU in this cluster then clean L2 too.
-	 */
-
-	/*
-	 * A15/A7 can hit in the cache with SCTLR.C=0, so we don't need
-	 * a preliminary flush here for those CPUs.  At least, that's
-	 * the theory -- without the extra flush, Linux explodes on
-	 * RTSM (maybe not needed anymore, to be investigated)..
-	 */
-	flush_cache_louis();
-	cpu_proc_fin();
+	if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
+		arch_spin_unlock(&dcscb_lock);
 
-	if (!last_man) {
-		flush_cache_louis();
-	} else {
+		/*
+		 * Flush all cache levels for this cluster.
+		 *
+		 * A15/A7 can hit in the cache with SCTLR.C=0, so we don't need
+		 * a preliminary flush here for those CPUs.  At least, that's
+		 * the theory -- without the extra flush, Linux explodes on
+		 * RTSM (maybe not needed anymore, to be investigated).
+		 */
 		flush_cache_all();
+		cpu_proc_fin(); /* disable allocation into internal caches */
+		flush_cache_all();
+
+		/*
+		 * This is a harmless no-op.  On platforms with a real
+		 * outer cache this might either be needed or not,
+		 * depending on where the outer cache sits.
+		 */
 		outer_flush_all();
+
+		/* Disable local coherency by clearing the ACTLR "SMP" bit: */
+		set_auxcr(get_auxcr() & ~(1 << 6));
+
+		/*
+		 * Disable cluster-level coherency by masking
+		 * incoming snoops and DVM messages:
+		 */
+		disable_cci(cluster);
+
+		__mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
+	} else {
+		arch_spin_unlock(&dcscb_lock);
+
+		/*
+		 * Flush the local CPU cache.
+		 *
+		 * A15/A7 can hit in the cache with SCTLR.C=0, so we don't need
+		 * a preliminary flush here for those CPUs.  At least, that's
+		 * the theory -- without the extra flush, Linux explodes on
+		 * RTSM (maybe not needed anymore, to be investigated).
+		 */
+		flush_cache_louis();
+		cpu_proc_fin(); /* disable allocation into internal caches */
+		flush_cache_louis();
+
+		/* Disable local coherency by clearing the ACTLR "SMP" bit: */
+		set_auxcr(get_auxcr() & ~(1 << 6));
 	}
 
-	/* Disable local coherency by clearing the ACTLR "SMP" bit: */
-	set_auxcr(get_auxcr() & ~(1 << 6));
+	__mcpm_cpu_down(cpu, cluster);
 
 	/* Now we are prepared for power-down, do it: */
 	if (!skip_wfi) {
@@ -179,6 +211,8 @@ static void __init dcscb_usage_count_init(void)
 	dcscb_use_count[cpu][cluster] = 1;
 }
 
+extern void dcscb_power_up_setup(unsigned int affinity_level);
+
 static int __init dcscb_init(void)
 {
 	unsigned int cfg;
@@ -193,6 +227,8 @@ static int __init dcscb_init(void)
 	dcscb_usage_count_init();
 
 	ret = mcpm_platform_register(&dcscb_power_ops);
+	if (!ret)
+		ret = mcpm_sync_init(dcscb_power_up_setup);
 	if (ret) {
 		iounmap(dcscb_base);
 		return ret;
diff --git a/arch/arm/mach-vexpress/dcscb_setup.S b/arch/arm/mach-vexpress/dcscb_setup.S
new file mode 100644
index 0000000000..cac033b982
--- /dev/null
+++ b/arch/arm/mach-vexpress/dcscb_setup.S
@@ -0,0 +1,80 @@
+/*
+ * arch/arm/mach-vexpress/dcscb_setup.S
+ *
+ * Created by:  Dave Martin, 2012-06-22
+ * Copyright:   (C) 2012-2013  Linaro Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+
+#include <linux/linkage.h>
+#include <asm/mcpm_entry.h>
+
+
+#define SLAVE_SNOOPCTL_OFFSET	0
+#define SNOOPCTL_SNOOP_ENABLE	(1 << 0)
+#define SNOOPCTL_DVM_ENABLE	(1 << 1)
+
+#define CCI_STATUS_OFFSET	0xc
+#define STATUS_CHANGE_PENDING	(1 << 0)
+
+#define CCI_SLAVE_OFFSET(n)	(0x1000 + 0x1000 * (n))
+
+#define RTSM_CCI_PHYS_BASE	0x2c090000
+#define RTSM_CCI_SLAVE_A15	3
+#define RTSM_CCI_SLAVE_A7	4
+
+#define RTSM_CCI_A15_OFFSET	CCI_SLAVE_OFFSET(RTSM_CCI_SLAVE_A15)
+#define RTSM_CCI_A7_OFFSET	CCI_SLAVE_OFFSET(RTSM_CCI_SLAVE_A7)
+
+
+ENTRY(dcscb_power_up_setup)
+
+	cmp	r0, #0			@ check affinity level
+	beq	2f
+
+/*
+ * Enable cluster-level coherency, in preparation for turning on the MMU.
+ * The ACTLR SMP bit does not need to be set here, because cpu_resume()
+ * already restores that.
+ */
+
+	mrc	p15, 0, r0, c0, c0, 5	@ MPIDR
+	ubfx	r0, r0, #8, #4		@ cluster
+
+	@ A15/A7 may not require explicit L2 invalidation on reset, depending
+	@ on hardware integration decisions.
+	@ For now, this code assumes that L2 is either already invalidated, or
+	@ invalidation is not required.
+
+	ldr	r3, =RTSM_CCI_PHYS_BASE + RTSM_CCI_A15_OFFSET
+	cmp	r0, #0		@ A15 cluster?
+	addne	r3, r3, #RTSM_CCI_A7_OFFSET - RTSM_CCI_A15_OFFSET
+
+	@ r3 now points to the correct CCI slave register block
+
+	ldr	r0, [r3, #SLAVE_SNOOPCTL_OFFSET]
+	orr	r0, r0, #SNOOPCTL_SNOOP_ENABLE | SNOOPCTL_DVM_ENABLE
+	str	r0, [r3, #SLAVE_SNOOPCTL_OFFSET]	@ enable CCI snoops
+
+	@ Wait for snoop control change to complete:
+
+	ldr	r3, =RTSM_CCI_PHYS_BASE
+
+1:	ldr	r0, [r3, #CCI_STATUS_OFFSET]
+	tst	r0, #STATUS_CHANGE_PENDING
+	bne	1b
+
+	dsb		@ Synchronise side-effects of enabling CCI
+
+	bx	lr
+
+2:	@ Implementation-specific local CPU setup operations should go here,
+	@ if any.  In this case, there is nothing to do.
+
+	bx	lr
+
+ENDPROC(dcscb_power_up_setup)
-- 
1.8.1.2

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 15/15] ARM: vexpress/dcscb: probe via device tree
  2013-01-29  7:50 [PATCH v3 00/15] multi-cluster power management Nicolas Pitre
                   ` (13 preceding siblings ...)
  2013-01-29  7:51 ` [PATCH v3 14/15] ARM: vexpress/dcscb: handle platform coherency exit/setup and CCI Nicolas Pitre
@ 2013-01-29  7:51 ` Nicolas Pitre
  2013-01-29 21:01   ` Rob Herring
  2013-02-04 14:24 ` [PATCH v3 00/15] multi-cluster power management Will Deacon
  15 siblings, 1 reply; 54+ messages in thread
From: Nicolas Pitre @ 2013-01-29  7:51 UTC (permalink / raw)
  To: linux-arm-kernel

This allows for the DCSCB support to be compiled in and selected
at run time.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
---
 arch/arm/mach-vexpress/dcscb.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/arm/mach-vexpress/dcscb.c b/arch/arm/mach-vexpress/dcscb.c
index 58051ffafb..a724507cbc 100644
--- a/arch/arm/mach-vexpress/dcscb.c
+++ b/arch/arm/mach-vexpress/dcscb.c
@@ -14,6 +14,7 @@
 #include <linux/io.h>
 #include <linux/spinlock.h>
 #include <linux/errno.h>
+#include <linux/of_address.h>
 #include <linux/vexpress.h>
 #include <linux/arm-cci.h>
 
@@ -24,8 +25,6 @@
 #include <asm/cp15.h>
 
 
-#define DCSCB_PHYS_BASE	0x60000000
-
 #define RST_HOLD0	0x0
 #define RST_HOLD1	0x4
 #define SYS_SWRESET	0x8
@@ -215,10 +214,14 @@ extern void dcscb_power_up_setup(unsigned int affinity_level);
 
 static int __init dcscb_init(void)
 {
+	struct device_node *node;
 	unsigned int cfg;
 	int ret;
 
-	dcscb_base = ioremap(DCSCB_PHYS_BASE, 0x1000);
+	node = of_find_compatible_node(NULL, NULL, "arm,dcscb");
+	if (!node)
+		return -ENODEV;
+	dcscb_base = of_iomap(node, 0);
 	if (!dcscb_base)
 		return -EADDRNOTAVAIL;
 	cfg = readl_relaxed(dcscb_base + DCS_CFG_R);
-- 
1.8.1.2

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 14/15] ARM: vexpress/dcscb: handle platform coherency exit/setup and CCI
  2013-01-29  7:51 ` [PATCH v3 14/15] ARM: vexpress/dcscb: handle platform coherency exit/setup and CCI Nicolas Pitre
@ 2013-01-29 10:46   ` Lorenzo Pieralisi
  2013-01-29 18:42     ` Nicolas Pitre
  2013-02-01  6:15   ` Santosh Shilimkar
  1 sibling, 1 reply; 54+ messages in thread
From: Lorenzo Pieralisi @ 2013-01-29 10:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jan 29, 2013 at 07:51:09AM +0000, Nicolas Pitre wrote:

[...]

> +		/*
> +		 * Flush the local CPU cache.
> +		 *
> +		 * A15/A7 can hit in the cache with SCTLR.C=0, so we don't need
> +		 * a preliminary flush here for those CPUs.  At least, that's
> +		 * the theory -- without the extra flush, Linux explodes on
> +		 * RTSM (maybe not needed anymore, to be investigated).
> +		 */
> +		flush_cache_louis();

This is not needed. If it is, that is a model bug and should be flagged
up as such.

> +		cpu_proc_fin(); /* disable allocation into internal caches */

This code disables the I-cache, causing subsequent instruction fetches
to come from DRAM; that is extremely slow and should be avoided.  There
is no point in disabling the I-cache here; it is not required.

On fast-models that's a non-issue, but I really want to prevent copy'n'paste
of this sequence as it stands.
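
If only the D-side needs to go, something along these lines would do
(untested sketch based on the <asm/cp15.h> helpers, with the usual
caveats about poking SCTLR from C):

	/* untested sketch: stop D-cache allocations, leave I-cache on */
	set_cr(get_cr() & ~CR_C);	/* clear SCTLR.C only */
	isb();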

> +		flush_cache_louis();
> +
> +		/* Disable local coherency by clearing the ACTLR "SMP" bit: */
> +		set_auxcr(get_auxcr() & ~(1 << 6));
>  	}
>  
> -	/* Disable local coherency by clearing the ACTLR "SMP" bit: */
> -	set_auxcr(get_auxcr() & ~(1 << 6));
> +	__mcpm_cpu_down(cpu, cluster);
>  
>  	/* Now we are prepared for power-down, do it: */
>  	if (!skip_wfi) {
> @@ -179,6 +211,8 @@ static void __init dcscb_usage_count_init(void)
>  	dcscb_use_count[cpu][cluster] = 1;
>  }
>  
> +extern void dcscb_power_up_setup(unsigned int affinity_level);
> +
>  static int __init dcscb_init(void)
>  {
>  	unsigned int cfg;
> @@ -193,6 +227,8 @@ static int __init dcscb_init(void)
>  	dcscb_usage_count_init();
>  
>  	ret = mcpm_platform_register(&dcscb_power_ops);
> +	if (!ret)
> +		ret = mcpm_sync_init(dcscb_power_up_setup);
>  	if (ret) {
>  		iounmap(dcscb_base);
>  		return ret;
> diff --git a/arch/arm/mach-vexpress/dcscb_setup.S b/arch/arm/mach-vexpress/dcscb_setup.S
> new file mode 100644
> index 0000000000..cac033b982
> --- /dev/null
> +++ b/arch/arm/mach-vexpress/dcscb_setup.S
> @@ -0,0 +1,80 @@
> +/*
> > + * arch/arm/mach-vexpress/dcscb_setup.S
> + *
> + * Created by:  Dave Martin, 2012-06-22
> + * Copyright:   (C) 2012-2013  Linaro Limited
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +
> +#include <linux/linkage.h>
> +#include <asm/mcpm_entry.h>
> +
> +
> +#define SLAVE_SNOOPCTL_OFFSET	0
> +#define SNOOPCTL_SNOOP_ENABLE	(1 << 0)
> +#define SNOOPCTL_DVM_ENABLE	(1 << 1)
> +
> +#define CCI_STATUS_OFFSET	0xc
> +#define STATUS_CHANGE_PENDING	(1 << 0)
> +
> +#define CCI_SLAVE_OFFSET(n)	(0x1000 + 0x1000 * (n))
> +
> +#define RTSM_CCI_PHYS_BASE	0x2c090000
> +#define RTSM_CCI_SLAVE_A15	3
> +#define RTSM_CCI_SLAVE_A7	4

We need to remove these hardcoded values in due course, as you know; I am
working on new code that allows us to match the CCI port address to
MPIDR on resume.

Lorenzo

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 07/15] ARM: vexpress: Select the correct SMP operations at run-time
  2013-01-29  7:51 ` [PATCH v3 07/15] ARM: vexpress: Select the correct SMP operations at run-time Nicolas Pitre
@ 2013-01-29 15:43   ` Jon Medhurst (Tixy)
  2013-01-29 19:26     ` Nicolas Pitre
  2013-02-01  5:41   ` Santosh Shilimkar
  1 sibling, 1 reply; 54+ messages in thread
From: Jon Medhurst (Tixy) @ 2013-01-29 15:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2013-01-29 at 02:51 -0500, Nicolas Pitre wrote:
> From: Jon Medhurst <tixy@linaro.org>
> 
> Signed-off-by: Jon Medhurst <tixy@linaro.org>
> ---

Should this patch be split into two?  One to introduce the new smp_init
hook into the generic ARM kernel code, and one to make vexpress use it?
With descriptions something like:

-----------------------------------------------------------------------
ARM: kernel: Enable selection of SMP operations at boot time

Add a new 'smp_init' hook to machine_desc so platforms can specify a
function to be used to setup smp ops instead of having a statically
defined value.
-----------------------------------------------------------------------
ARM: vexpress: Select multi-cluster SMP operation if required
-----------------------------------------------------------------------

-- 
Tixy


>  arch/arm/include/asm/mach/arch.h |  3 +++
>  arch/arm/kernel/setup.c          |  5 ++++-
>  arch/arm/mach-vexpress/core.h    |  2 ++
>  arch/arm/mach-vexpress/platsmp.c | 12 ++++++++++++
>  arch/arm/mach-vexpress/v2m.c     |  2 +-
>  5 files changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/include/asm/mach/arch.h b/arch/arm/include/asm/mach/arch.h
> index 917d4fcfd9..3d01c6d6c3 100644
> --- a/arch/arm/include/asm/mach/arch.h
> +++ b/arch/arm/include/asm/mach/arch.h
> @@ -17,8 +17,10 @@ struct pt_regs;
>  struct smp_operations;
>  #ifdef CONFIG_SMP
>  #define smp_ops(ops) (&(ops))
> +#define smp_init_ops(ops) (&(ops))
>  #else
>  #define smp_ops(ops) (struct smp_operations *)NULL
> +#define smp_init_ops(ops) (void (*)(void))NULL
>  #endif
>  
>  struct machine_desc {
> @@ -42,6 +44,7 @@ struct machine_desc {
>  	unsigned char		reserve_lp2 :1;	/* never has lp2	*/
>  	char			restart_mode;	/* default restart mode	*/
>  	struct smp_operations	*smp;		/* SMP operations	*/
> +	void			(*smp_init)(void);
>  	void			(*fixup)(struct tag *, char **,
>  					 struct meminfo *);
>  	void			(*reserve)(void);/* reserve mem blocks	*/
> diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
> index 3f6cbb2e3e..41edca8582 100644
> --- a/arch/arm/kernel/setup.c
> +++ b/arch/arm/kernel/setup.c
> @@ -768,7 +768,10 @@ void __init setup_arch(char **cmdline_p)
>  	arm_dt_init_cpu_maps();
>  #ifdef CONFIG_SMP
>  	if (is_smp()) {
> -		smp_set_ops(mdesc->smp);
> +		if (mdesc->smp_init)
> +			(*mdesc->smp_init)();
> +		else
> +			smp_set_ops(mdesc->smp);
>  		smp_init_cpus();
>  	}
>  #endif
> diff --git a/arch/arm/mach-vexpress/core.h b/arch/arm/mach-vexpress/core.h
> index f134cd4a85..3a761fd76c 100644
> --- a/arch/arm/mach-vexpress/core.h
> +++ b/arch/arm/mach-vexpress/core.h
> @@ -6,6 +6,8 @@
>  
>  void vexpress_dt_smp_map_io(void);
>  
> +void vexpress_smp_init_ops(void);
> +
>  extern struct smp_operations	vexpress_smp_ops;
>  
>  extern void vexpress_cpu_die(unsigned int cpu);
> diff --git a/arch/arm/mach-vexpress/platsmp.c b/arch/arm/mach-vexpress/platsmp.c
> index c5d70de9bb..667344b479 100644
> --- a/arch/arm/mach-vexpress/platsmp.c
> +++ b/arch/arm/mach-vexpress/platsmp.c
> @@ -12,6 +12,7 @@
>  #include <linux/errno.h>
>  #include <linux/smp.h>
>  #include <linux/io.h>
> +#include <linux/of.h>
>  #include <linux/of_fdt.h>
>  #include <linux/vexpress.h>
>  
> @@ -206,3 +207,14 @@ struct smp_operations __initdata vexpress_smp_ops = {
>  	.cpu_die		= vexpress_cpu_die,
>  #endif
>  };
> +
> +void __init vexpress_smp_init_ops(void)
> +{
> +	struct smp_operations *ops = &vexpress_smp_ops;
> +#ifdef CONFIG_CLUSTER_PM
> +	extern struct smp_operations mcpm_smp_ops;
> +	if (of_find_compatible_node(NULL, NULL, "arm,cci"))
> +		ops = &mcpm_smp_ops;
> +#endif
> +	smp_set_ops(ops);
> +}
> diff --git a/arch/arm/mach-vexpress/v2m.c b/arch/arm/mach-vexpress/v2m.c
> index 011661a6c5..34172bd504 100644
> --- a/arch/arm/mach-vexpress/v2m.c
> +++ b/arch/arm/mach-vexpress/v2m.c
> @@ -494,7 +494,7 @@ static const char * const v2m_dt_match[] __initconst = {
>  
>  DT_MACHINE_START(VEXPRESS_DT, "ARM-Versatile Express")
>  	.dt_compat	= v2m_dt_match,
> -	.smp		= smp_ops(vexpress_smp_ops),
> +	.smp_init	= smp_init_ops(vexpress_smp_init_ops),
>  	.map_io		= v2m_dt_map_io,
>  	.init_early	= v2m_dt_init_early,
>  	.init_irq	= v2m_dt_init_irq,

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 14/15] ARM: vexpress/dcscb: handle platform coherency exit/setup and CCI
  2013-01-29 10:46   ` Lorenzo Pieralisi
@ 2013-01-29 18:42     ` Nicolas Pitre
  2013-01-30 17:27       ` Lorenzo Pieralisi
  0 siblings, 1 reply; 54+ messages in thread
From: Nicolas Pitre @ 2013-01-29 18:42 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 29 Jan 2013, Lorenzo Pieralisi wrote:

> On Tue, Jan 29, 2013 at 07:51:09AM +0000, Nicolas Pitre wrote:
> 
> [...]
> 
> > +		/*
> > +		 * Flush the local CPU cache.
> > +		 *
> > +		 * A15/A7 can hit in the cache with SCTLR.C=0, so we don't need
> > +		 * a preliminary flush here for those CPUs.  At least, that's
> > +		 * the theory -- without the extra flush, Linux explodes on
> > +		 * RTSM (maybe not needed anymore, to be investigated).
> > +		 */
> > +		flush_cache_louis();
> 
> This is not needed. If it is, that is a model bug and should be flagged
> up as such.

Could someone at ARM do that?

I just confirmed that this is still the case by commenting out the 
preliminary flush calls and hot-plugging CPUs out and back.  The result 
is a nonsensical kernel oops which looks like serious memory 
corruption.  This is with RTSM version 7.1.42.
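
For the record, the hot-plug exercise is nothing more than cycling
CPUs through sysfs in a loop, i.e. something like:

	while true; do
		for cpu in /sys/devices/system/cpu/cpu[1-9]*; do
			echo 0 > $cpu/online
			echo 1 > $cpu/online
		done
	done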

> > +		cpu_proc_fin(); /* disable allocation into internal caches */
> 
> > This code disables the I-cache, causing subsequent instruction fetches
> > to come from DRAM; that is extremely slow and should be avoided.  There
> > is no point in disabling the I-cache here; it is not required.
> On fast-models that's a non-issue, but I really want to prevent copy'n'paste
> of this sequence as it stands.

Agreed, I'll change that.  The (not included in this series) TC2 backend 
does leave the I-cache active already.

> > diff --git a/arch/arm/mach-vexpress/dcscb_setup.S b/arch/arm/mach-vexpress/dcscb_setup.S
> > new file mode 100644
> > index 0000000000..cac033b982
> > --- /dev/null
> > +++ b/arch/arm/mach-vexpress/dcscb_setup.S
> > @@ -0,0 +1,80 @@
> > +/*
> > > + * arch/arm/mach-vexpress/dcscb_setup.S
> > + *
> > + * Created by:  Dave Martin, 2012-06-22
> > + * Copyright:   (C) 2012-2013  Linaro Limited
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + */
> > +
> > +
> > +#include <linux/linkage.h>
> > +#include <asm/mcpm_entry.h>
> > +
> > +
> > +#define SLAVE_SNOOPCTL_OFFSET	0
> > +#define SNOOPCTL_SNOOP_ENABLE	(1 << 0)
> > +#define SNOOPCTL_DVM_ENABLE	(1 << 1)
> > +
> > +#define CCI_STATUS_OFFSET	0xc
> > +#define STATUS_CHANGE_PENDING	(1 << 0)
> > +
> > +#define CCI_SLAVE_OFFSET(n)	(0x1000 + 0x1000 * (n))
> > +
> > +#define RTSM_CCI_PHYS_BASE	0x2c090000
> > +#define RTSM_CCI_SLAVE_A15	3
> > +#define RTSM_CCI_SLAVE_A7	4
> 
> We need to remove these hardcoded values in due course, as you know; I am
> working on new code that allows us to match the CCI port address to
> MPIDR on resume.

Yes, absolutely.  I was expecting this code to become generic and more
closely tied to the CCI driver.  The CCI init code could set up
variables to be used by this code.


Nicolas

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 07/15] ARM: vexpress: Select the correct SMP operations at run-time
  2013-01-29 15:43   ` Jon Medhurst (Tixy)
@ 2013-01-29 19:26     ` Nicolas Pitre
  0 siblings, 0 replies; 54+ messages in thread
From: Nicolas Pitre @ 2013-01-29 19:26 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 29 Jan 2013, Jon Medhurst (Tixy) wrote:

> On Tue, 2013-01-29 at 02:51 -0500, Nicolas Pitre wrote:
> > From: Jon Medhurst <tixy@linaro.org>
> > 
> > Signed-off-by: Jon Medhurst <tixy@linaro.org>
> > ---
> 
> Should this patch be split into two?  One to introduce the new smp_init
> hook into the generic ARM kernel code, and one to make vexpress use it?
> With descriptions something like:
> 
> -----------------------------------------------------------------------
> ARM: kernel: Enable selection of SMP operations at boot time
> 
> Add a new 'smp_init' hook to machine_desc so platforms can specify a
> function to be used to setup smp ops instead of having a statically
> defined value.
> -----------------------------------------------------------------------
> ARM: vexpress: Select multi-cluster SMP operation if required
> -----------------------------------------------------------------------

That certainly makes sense.  Are you willing to split your patch as such 
and send me the result?


Nicolas

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 06/15] ARM: mcpm: generic SMP secondary bringup and hotplug support
  2013-01-29  7:51 ` [PATCH v3 06/15] ARM: mcpm: generic SMP secondary bringup and hotplug support Nicolas Pitre
@ 2013-01-29 20:38   ` Rob Herring
  2013-02-01  5:38   ` Santosh Shilimkar
  1 sibling, 0 replies; 54+ messages in thread
From: Rob Herring @ 2013-01-29 20:38 UTC (permalink / raw)
  To: linux-arm-kernel

On 01/29/2013 01:51 AM, Nicolas Pitre wrote:
> Now that the cluster power API is in place, we can use it for SMP secondary
> bringup and CPU hotplug in a generic fashion.
> 
> Signed-off-by: Nicolas Pitre <nico@linaro.org>
> ---
>  arch/arm/common/Makefile       |  2 +-
>  arch/arm/common/mcpm_platsmp.c | 85 ++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 86 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm/common/mcpm_platsmp.c
> 
> diff --git a/arch/arm/common/Makefile b/arch/arm/common/Makefile
> index c901a38c59..e1c9db45de 100644
> --- a/arch/arm/common/Makefile
> +++ b/arch/arm/common/Makefile
> @@ -13,4 +13,4 @@ obj-$(CONFIG_SHARP_PARAM)	+= sharpsl_param.o
>  obj-$(CONFIG_SHARP_SCOOP)	+= scoop.o
>  obj-$(CONFIG_PCI_HOST_ITE8152)  += it8152.o
>  obj-$(CONFIG_ARM_TIMER_SP804)	+= timer-sp.o
> -obj-$(CONFIG_CLUSTER_PM)	+= mcpm_head.o mcpm_entry.o vlock.o
> +obj-$(CONFIG_CLUSTER_PM)	+= mcpm_head.o mcpm_entry.o mcpm_platsmp.o vlock.o
> diff --git a/arch/arm/common/mcpm_platsmp.c b/arch/arm/common/mcpm_platsmp.c
> new file mode 100644
> index 0000000000..401298f5ee
> --- /dev/null
> +++ b/arch/arm/common/mcpm_platsmp.c
> @@ -0,0 +1,85 @@
> +/*
> + * linux/arch/arm/common/mcpm_platsmp.c
> + *
> + * Created by:  Nicolas Pitre, November 2012
> + * Copyright:   (C) 2012-2013  Linaro Limited
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * Code to handle secondary CPU bringup and hotplug for the cluster power API.
> + */
> +
> +#include <linux/init.h>
> +#include <linux/smp.h>
> +
> +#include <asm/mcpm_entry.h>
> +#include <asm/smp_plat.h>
> +#include <asm/hardware/gic.h>
> +
> +static void __init simple_smp_init_cpus(void)
> +{
> +	set_smp_cross_call(gic_raise_softirq);

In case you're not aware, you might want to base this on the gic move to
drivers/irqchips. It is in arm-soc now. Then you don't need this
set_smp_cross_call anymore. It is now set by the gic code internally.

> +}
> +
> +static int __cpuinit mcpm_boot_secondary(unsigned int cpu, struct task_struct *idle)
> +{
> +	unsigned int mpidr, pcpu, pcluster, ret;
> +	extern void secondary_startup(void);
> +
> +	mpidr = cpu_logical_map(cpu);
> +	pcpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> +	pcluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> +	pr_debug("%s: logical CPU %d is physical CPU %d cluster %d\n",
> +		 __func__, cpu, pcpu, pcluster);
> +
> +	mcpm_set_entry_vector(pcpu, pcluster, NULL);
> +	ret = mcpm_cpu_power_up(pcpu, pcluster);
> +	if (ret)
> +		return ret;
> +	mcpm_set_entry_vector(pcpu, pcluster, secondary_startup);
> +	gic_raise_softirq(cpumask_of(cpu), 0);

You can use arch_send_wakeup_ipi_mask here now instead. That's in 3.8-rc1.
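
i.e. (untested):

	-	gic_raise_softirq(cpumask_of(cpu), 0);
	+	arch_send_wakeup_ipi_mask(cpumask_of(cpu));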

> +	dsb_sev();
> +	return 0;
> +}
> +
> +static void __cpuinit mcpm_secondary_init(unsigned int cpu)
> +{
> +	mcpm_cpu_powered_up();
> +	gic_secondary_init(0);

Catalin's gic series will remove this.

> +}
> +
> +#ifdef CONFIG_HOTPLUG_CPU
> +
> +static int mcpm_cpu_disable(unsigned int cpu)
> +{
> +	/*
> +	 * We assume all CPUs may be shut down.
> +	 * This would be the hook to use for eventual Secure
> +	 * OS migration requests as described in the PSCI spec.
> +	 */
> +	return 0;
> +}
> +
> +static void mcpm_cpu_die(unsigned int cpu)
> +{
> +	unsigned int mpidr, pcpu, pcluster;
> +	mpidr = read_cpuid_mpidr();
> +	pcpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> +	pcluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> +	mcpm_set_entry_vector(pcpu, pcluster, NULL);
> +	mcpm_cpu_power_down();
> +}
> +
> +#endif
> +
> +struct smp_operations __initdata mcpm_smp_ops = {
> +	.smp_init_cpus		= simple_smp_init_cpus,
> +	.smp_boot_secondary	= mcpm_boot_secondary,
> +	.smp_secondary_init	= mcpm_secondary_init,
> +#ifdef CONFIG_HOTPLUG_CPU
> +	.cpu_disable		= mcpm_cpu_disable,
> +	.cpu_die		= mcpm_cpu_die,
> +#endif
> +};
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 15/15] ARM: vexpress/dcscb: probe via device tree
  2013-01-29  7:51 ` [PATCH v3 15/15] ARM: vexpress/dcscb: probe via device tree Nicolas Pitre
@ 2013-01-29 21:01   ` Rob Herring
  2013-01-29 21:41     ` Nicolas Pitre
  0 siblings, 1 reply; 54+ messages in thread
From: Rob Herring @ 2013-01-29 21:01 UTC (permalink / raw)
  To: linux-arm-kernel

On 01/29/2013 01:51 AM, Nicolas Pitre wrote:
> This allows for the DCSCB support to be compiled in and selected
> at run time.

Shouldn't this just be rolled into the commit creating dcscb.c?

> Signed-off-by: Nicolas Pitre <nico@linaro.org>
> ---
>  arch/arm/mach-vexpress/dcscb.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/mach-vexpress/dcscb.c b/arch/arm/mach-vexpress/dcscb.c
> index 58051ffafb..a724507cbc 100644
> --- a/arch/arm/mach-vexpress/dcscb.c
> +++ b/arch/arm/mach-vexpress/dcscb.c
> @@ -14,6 +14,7 @@
>  #include <linux/io.h>
>  #include <linux/spinlock.h>
>  #include <linux/errno.h>
> +#include <linux/of_address.h>
>  #include <linux/vexpress.h>
>  #include <linux/arm-cci.h>
>  
> @@ -24,8 +25,6 @@
>  #include <asm/cp15.h>
>  
>  
> -#define DCSCB_PHYS_BASE	0x60000000
> -
>  #define RST_HOLD0	0x0
>  #define RST_HOLD1	0x4
>  #define SYS_SWRESET	0x8
> @@ -215,10 +214,14 @@ extern void dcscb_power_up_setup(unsigned int affinity_level);
>  
>  static int __init dcscb_init(void)
>  {
> +	struct device_node *node;
>  	unsigned int cfg;
>  	int ret;
>  
> -	dcscb_base = ioremap(DCSCB_PHYS_BASE, 0x1000);
> +	node = of_find_compatible_node(NULL, NULL, "arm,dcscb");

This needs binding documentation and should be a more specific name. Not
knowing what dcscb is, I don't have a suggestion. Perhaps should include
vexpress or specific core tile name it is part of.

Rob

> +	if (!node)
> +		return -ENODEV;
> +	dcscb_base = of_iomap(node, 0);
>  	if (!dcscb_base)
>  		return -EADDRNOTAVAIL;
>  	cfg = readl_relaxed(dcscb_base + DCS_CFG_R);
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 15/15] ARM: vexpress/dcscb: probe via device tree
  2013-01-29 21:01   ` Rob Herring
@ 2013-01-29 21:41     ` Nicolas Pitre
  2013-01-30 12:22       ` Achin Gupta
  0 siblings, 1 reply; 54+ messages in thread
From: Nicolas Pitre @ 2013-01-29 21:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 29 Jan 2013, Rob Herring wrote:

> On 01/29/2013 01:51 AM, Nicolas Pitre wrote:
> > This allows for the DCSCB support to be compiled in and selected
> > at run time.
> 
> Shouldn't this just be rolled into the commit creating dcscb.c?

Probably, yes.

> > Signed-off-by: Nicolas Pitre <nico@linaro.org>
> > ---
> >  arch/arm/mach-vexpress/dcscb.c | 9 ++++++---
> >  1 file changed, 6 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/arm/mach-vexpress/dcscb.c b/arch/arm/mach-vexpress/dcscb.c
> > index 58051ffafb..a724507cbc 100644
> > --- a/arch/arm/mach-vexpress/dcscb.c
> > +++ b/arch/arm/mach-vexpress/dcscb.c
> > @@ -14,6 +14,7 @@
> >  #include <linux/io.h>
> >  #include <linux/spinlock.h>
> >  #include <linux/errno.h>
> > +#include <linux/of_address.h>
> >  #include <linux/vexpress.h>
> >  #include <linux/arm-cci.h>
> >  
> > @@ -24,8 +25,6 @@
> >  #include <asm/cp15.h>
> >  
> >  
> > -#define DCSCB_PHYS_BASE	0x60000000
> > -
> >  #define RST_HOLD0	0x0
> >  #define RST_HOLD1	0x4
> >  #define SYS_SWRESET	0x8
> > @@ -215,10 +214,14 @@ extern void dcscb_power_up_setup(unsigned int affinity_level);
> >  
> >  static int __init dcscb_init(void)
> >  {
> > +	struct device_node *node;
> >  	unsigned int cfg;
> >  	int ret;
> >  
> > -	dcscb_base = ioremap(DCSCB_PHYS_BASE, 0x1000);
> > +	node = of_find_compatible_node(NULL, NULL, "arm,dcscb");
> 
> This needs binding documentation and should be a more specific name. Not
> knowing what dcscb is, I don't have a suggestion.

Yes, I mentioned in the cover page that DT bindings are not yet 
documented.

DCSCB stands for "Dual Cluster System Control Block".  This is in fact a 
set of miscellaneous registers, mainly for reset control of individual 
CPUs and clusters.
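
For illustration, the corresponding node on RTSM could be as simple as
this (base and size taken from the patch; the compatible string is
still up for debate):

	dcscb@60000000 {
		compatible = "arm,dcscb";
		reg = <0x60000000 0x1000>;
	};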

> Perhaps should include vexpress or specific core tile name it is part 
> of.

/me hopes for some ARM dude more acquainted with their nomenclature to 
chime in with suggestions.


Nicolas

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 15/15] ARM: vexpress/dcscb: probe via device tree
  2013-01-29 21:41     ` Nicolas Pitre
@ 2013-01-30 12:22       ` Achin Gupta
  2013-01-30 17:43         ` Nicolas Pitre
  0 siblings, 1 reply; 54+ messages in thread
From: Achin Gupta @ 2013-01-30 12:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jan 29, 2013 at 9:41 PM, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> On Tue, 29 Jan 2013, Rob Herring wrote:
>
>> On 01/29/2013 01:51 AM, Nicolas Pitre wrote:
>> > This allows for the DCSCB support to be compiled in and selected
>> > at run time.
>>
>> Shouldn't this just be rolled into the commit creating dcscb.c?
>
> Probably, yes.
>
>> > Signed-off-by: Nicolas Pitre <nico@linaro.org>
>> > ---
>> >  arch/arm/mach-vexpress/dcscb.c | 9 ++++++---
>> >  1 file changed, 6 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/arch/arm/mach-vexpress/dcscb.c b/arch/arm/mach-vexpress/dcscb.c
>> > index 58051ffafb..a724507cbc 100644
>> > --- a/arch/arm/mach-vexpress/dcscb.c
>> > +++ b/arch/arm/mach-vexpress/dcscb.c
>> > @@ -14,6 +14,7 @@
>> >  #include <linux/io.h>
>> >  #include <linux/spinlock.h>
>> >  #include <linux/errno.h>
>> > +#include <linux/of_address.h>
>> >  #include <linux/vexpress.h>
>> >  #include <linux/arm-cci.h>
>> >
>> > @@ -24,8 +25,6 @@
>> >  #include <asm/cp15.h>
>> >
>> >
>> > -#define DCSCB_PHYS_BASE    0x60000000
>> > -
>> >  #define RST_HOLD0  0x0
>> >  #define RST_HOLD1  0x4
>> >  #define SYS_SWRESET        0x8
>> > @@ -215,10 +214,14 @@ extern void dcscb_power_up_setup(unsigned int affinity_level);
>> >
>> >  static int __init dcscb_init(void)
>> >  {
>> > +   struct device_node *node;
>> >     unsigned int cfg;
>> >     int ret;
>> >
>> > -   dcscb_base = ioremap(DCSCB_PHYS_BASE, 0x1000);
>> > +   node = of_find_compatible_node(NULL, NULL, "arm,dcscb");
>>
>> This needs binding documentation and should be a more specific name. Not
>> knowing what dcscb is, I don't have a suggestion.
>
> Yes, I mentioned in the cover page that DT bindings are not yet
> documented.
>
> DCSCB stands for "Dual Cluster System Control Block".  This is in fact a
> set of miscellaneous registers, mainly for reset control of individual
> CPUs and clusters.
>
>> Perhaps should include vexpress or specific core tile name it is part
>> of.
>
> /me hopes for some ARM dude more acquainted with their nomenclature to
> chime in with suggestions.
>

As nico said, the DCSCB is just a reset controller that's a part of the
FastModels implementation.  The implementation should be referred to as
VE bL RTSM.  The official names are:

RTSM_VE_Cortex-A15x4-A7x4
RTSM_VE_Cortex-A15x1-A7x1

The file should be renamed as bL_rtsm.c or something similar.

Thanks,
Achin

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 14/15] ARM: vexpress/dcscb: handle platform coherency exit/setup and CCI
  2013-01-29 18:42     ` Nicolas Pitre
@ 2013-01-30 17:27       ` Lorenzo Pieralisi
  0 siblings, 0 replies; 54+ messages in thread
From: Lorenzo Pieralisi @ 2013-01-30 17:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jan 29, 2013 at 06:42:33PM +0000, Nicolas Pitre wrote:
> On Tue, 29 Jan 2013, Lorenzo Pieralisi wrote:
> 
> > On Tue, Jan 29, 2013 at 07:51:09AM +0000, Nicolas Pitre wrote:
> > 
> > [...]
> > 
> > > +		/*
> > > +		 * Flush the local CPU cache.
> > > +		 *
> > > +		 * A15/A7 can hit in the cache with SCTLR.C=0, so we don't need
> > > +		 * a preliminary flush here for those CPUs.  At least, that's
> > > +		 * the theory -- without the extra flush, Linux explodes on
> > > +		 * RTSM (maybe not needed anymore, to be investigated).
> > > +		 */
> > > +		flush_cache_louis();
> > 
> > This is not needed. If it is, that is a model bug and should be flagged
> > up as such.
> 
> Could someone at ARM do that?

I will do that.

> 
> I just confirmed that this is still the case by commenting out the 
> preliminary flush calls and hot-plugging CPUs out and back.  Result is a 
> non-sensical kernel oops which has the looks of serious memory 
> corruption.  This is with RTSM version 7.1.42.
> 
> > > +		cpu_proc_fin(); /* disable allocation into internal caches */
> > 
> > This code disables the I-cache causing following instruction fetches from
> > DRAM; that is extremely slow and should be avoided, there is no point in
> > disabling the I-cache here, that is not required.
> > On fast-models that's a non-issue, but I really want to prevent copy'n'paste
> > of this sequence as it stands.
> 
> Agreed, I'll change that.  The (not included in this series) TC2 backend 
> does leave the I-cache active already.

Great, thanks !!

Lorenzo

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 15/15] ARM: vexpress/dcscb: probe via device tree
  2013-01-30 12:22       ` Achin Gupta
@ 2013-01-30 17:43         ` Nicolas Pitre
  2013-01-31 10:54           ` Dave Martin
  0 siblings, 1 reply; 54+ messages in thread
From: Nicolas Pitre @ 2013-01-30 17:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 30 Jan 2013, Achin Gupta wrote:

> On Tue, Jan 29, 2013 at 9:41 PM, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> > On Tue, 29 Jan 2013, Rob Herring wrote:
> >
> >> On 01/29/2013 01:51 AM, Nicolas Pitre wrote:
> >> > +   node = of_find_compatible_node(NULL, NULL, "arm,dcscb");
> >>
> >> This needs binding documentation and should be a more specific name. Not
> >> knowing what dcscb is, I don't have a suggestion.
> >
> > Yes, I mentioned in the cover page that DT bindings are not yet
> > documented.
> >
> > DCSCB stands for "Dual Cluster System Control Block".  This is in fact a
> > set of miscellaneous registers, mainly for reset control of individual
> > CPUs and clusters.
> >
> >> Perhaps should include vexpress or specific core tile name it is part
> >> of.
> >
> > /me hopes for some ARM dude more acquainted with their nomenclature to
> > chime in with suggestions.
> >
> 
> As nico said, the DCSCB is just a reset controller that's a part of the
> FastModels implementation.  The implementation should be referred to as
> VE bL RTSM.  The official names are:
> 
> RTSM_VE_Cortex-A15x4-A7x4
> RTSM_VE_Cortex-A15x1-A7x1
> 
> The file should be renamed as bL_rtsm.c or something similar.

I don't think the file name is a problem.  Actually, going with 
bL_rtsm.c is rather too generic for what it covers.

It's the actual device tree binding name that I'd need suggestions for.


Nicolas

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 15/15] ARM: vexpress/dcscb: probe via device tree
  2013-01-30 17:43         ` Nicolas Pitre
@ 2013-01-31 10:54           ` Dave Martin
  2013-02-04  4:39             ` Nicolas Pitre
  0 siblings, 1 reply; 54+ messages in thread
From: Dave Martin @ 2013-01-31 10:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jan 30, 2013 at 12:43:29PM -0500, Nicolas Pitre wrote:
> On Wed, 30 Jan 2013, Achin Gupta wrote:
> 
> > On Tue, Jan 29, 2013 at 9:41 PM, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> > > On Tue, 29 Jan 2013, Rob Herring wrote:
> > >
> > >> On 01/29/2013 01:51 AM, Nicolas Pitre wrote:
> > >> > +   node = of_find_compatible_node(NULL, NULL, "arm,dcscb");
> > >>
> > >> This needs binding documentation and should be a more specific name. Not
> > >> knowing what dcscb is, I don't have a suggestion.

The name is 100% specific.  The real problem seems to be that it's also
very cryptic, and undocumented.

> > >
> > > Yes, I mentioned in the cover page that DT bindings are not yet
> > > documented.
> > >
> > > DCSCB stands for "Dual Cluster System Control Block".  This is in fact a
> > > set of miscellaneous registers, mainly for reset control of individual
> > > CPUs and clusters.
> > >
> > >> Perhaps should include vexpress or specific core tile name it is part
> > >> of.
> > >
> > > /me hopes for some ARM dude more acquainted with their nomenclature to
> > > chime in with suggestions.
> > >
> > 
> > As nico said, the DCSCB is just a reset controller that's a part of the
> > FastModels implementation.  The implementation should be referred to as
> > VE bL RTSM.  The official names are:
> > 
> > RTSM_VE_Cortex-A15x4-A7x4
> > RTSM_VE_Cortex-A15x1-A7x1
> > 
> > The file should be renamed as bL_rtsm.c or something similar.
> 
> I don't think the file name is a problem.  Actually, going with 
> bL_rtsm.c is rather too generic for what it covers.
> 
> It's the actual device tree binding name that I'd need suggestions for.

We could go for a slightly more generic, informative name like

	arm,dcscb-system-controller

Any views on that?

I think the most important thing is to document the binding, though.

As discussed, this thing appears in the fast models, but we don't
anticipate its being used in real SoCs because they will usually need
something more sophisticated.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 01/15] ARM: multi-cluster PM: secondary kernel entry code
  2013-01-29  7:50 ` [PATCH v3 01/15] ARM: multi-cluster PM: secondary kernel entry code Nicolas Pitre
@ 2013-01-31 15:45   ` Santosh Shilimkar
  0 siblings, 0 replies; 54+ messages in thread
From: Santosh Shilimkar @ 2013-01-31 15:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 29 January 2013 01:20 PM, Nicolas Pitre wrote:
> CPUs in cluster based systems, such as big.LITTLE, have special needs
> when entering the kernel due to a hotplug event, or when resuming from
> a deep sleep mode.
>
> This is vectorized so multiple CPUs can enter the kernel in parallel
> without serialization.
>
> The mcpm prefix stands for "multi cluster power management", however
> this is usable on single cluster systems as well.  Only the basic
> structure is introduced here.  This will be extended with later patches.
>
> In order not to complicate things more than they currently need to be,
> the planned work to make runtime adjusted MPIDR based indexing and
> dynamic memory allocation for cluster states is postponed to a later
> cycle. The MAX_NR_CLUSTERS and MAX_CPUS_PER_CLUSTER static definitions
> should be sufficient for those systems expected to be available in the
> near future.
>
'mcpm' definitely sounds better than 'bL' ;)

> Signed-off-by: Nicolas Pitre <nico@linaro.org>
> ---
>   arch/arm/Kconfig                  |  8 ++++
>   arch/arm/common/Makefile          |  1 +
>   arch/arm/common/mcpm_entry.c      | 29 +++++++++++++
>   arch/arm/common/mcpm_head.S       | 86 +++++++++++++++++++++++++++++++++++++++
>   arch/arm/include/asm/mcpm_entry.h | 35 ++++++++++++++++
>   5 files changed, 159 insertions(+)
>   create mode 100644 arch/arm/common/mcpm_entry.c
>   create mode 100644 arch/arm/common/mcpm_head.S
>   create mode 100644 arch/arm/include/asm/mcpm_entry.h
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 67874b82a4..200f559c1c 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -1584,6 +1584,14 @@ config HAVE_ARM_TWD
>   	help
>   	  This options enables support for the ARM timer and watchdog unit
>
> +config CLUSTER_PM
> +	bool "Cluster Power Management Infrastructure"
> +	depends on CPU_V7 && SMP
> +	help
> +	  This option provides the common power management infrastructure
> +	  for (multi-)cluster based systems, such as big.LITTLE based
> +	  systems.
> +
>   choice
>   	prompt "Memory split"
>   	default VMSPLIT_3G
> diff --git a/arch/arm/common/Makefile b/arch/arm/common/Makefile
> index e8a4e58f1b..23e85b1fae 100644
> --- a/arch/arm/common/Makefile
> +++ b/arch/arm/common/Makefile
> @@ -13,3 +13,4 @@ obj-$(CONFIG_SHARP_PARAM)	+= sharpsl_param.o
>   obj-$(CONFIG_SHARP_SCOOP)	+= scoop.o
>   obj-$(CONFIG_PCI_HOST_ITE8152)  += it8152.o
>   obj-$(CONFIG_ARM_TIMER_SP804)	+= timer-sp.o
> +obj-$(CONFIG_CLUSTER_PM)	+= mcpm_head.o mcpm_entry.o
> diff --git a/arch/arm/common/mcpm_entry.c b/arch/arm/common/mcpm_entry.c
> new file mode 100644
> index 0000000000..3a6d7e70fd
> --- /dev/null
> +++ b/arch/arm/common/mcpm_entry.c
> @@ -0,0 +1,29 @@
> +/*
> + * arch/arm/common/mcpm_entry.c -- entry point for multi-cluster PM
> + *
> + * Created by:  Nicolas Pitre, March 2012
> + * Copyright:   (C) 2012-2013  Linaro Limited
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/init.h>
> +
> +#include <asm/mcpm_entry.h>
> +#include <asm/barrier.h>
> +#include <asm/proc-fns.h>
> +#include <asm/cacheflush.h>
> +
> +extern volatile unsigned long mcpm_entry_vectors[MAX_NR_CLUSTERS][MAX_CPUS_PER_CLUSTER];
> +
> +void mcpm_set_entry_vector(unsigned cpu, unsigned cluster, void *ptr)
> +{
> +	unsigned long val = ptr ? virt_to_phys(ptr) : 0;
Maybe an extra line would be good here.
Patch looks fine to my eyes otherwise.

Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 02/15] ARM: mcpm: introduce the CPU/cluster power API
  2013-01-29  7:50 ` [PATCH v3 02/15] ARM: mcpm: introduce the CPU/cluster power API Nicolas Pitre
@ 2013-01-31 15:55   ` Santosh Shilimkar
  0 siblings, 0 replies; 54+ messages in thread
From: Santosh Shilimkar @ 2013-01-31 15:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 29 January 2013 01:20 PM, Nicolas Pitre wrote:
> This is the basic API used to handle the powering up/down of individual
> CPUs in a (multi-)cluster system.  The platform specific backend
> implementation has the responsibility to also handle the cluster level
> power as well when the first/last CPU in a cluster is brought up/down.
>
> Signed-off-by: Nicolas Pitre <nico@linaro.org>
> ---
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 03/15] ARM: mcpm: introduce helpers for platform coherency exit/setup
  2013-01-29  7:50 ` [PATCH v3 03/15] ARM: mcpm: introduce helpers for platform coherency exit/setup Nicolas Pitre
@ 2013-01-31 16:08   ` Santosh Shilimkar
  2013-01-31 17:16     ` Nicolas Pitre
  0 siblings, 1 reply; 54+ messages in thread
From: Santosh Shilimkar @ 2013-01-31 16:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 29 January 2013 01:20 PM, Nicolas Pitre wrote:
> From: Dave Martin <dave.martin@linaro.org>
>
> This provides helper methods to coordinate between CPUs coming down
> and CPUs going up, as well as documentation on the used algorithms,
> so that cluster teardown and setup
> operations are not done for a cluster simultaneously.
>
> For use in the power_down() implementation:
>    * __mcpm_cpu_going_down(unsigned int cluster, unsigned int cpu)
>    * __mcpm_outbound_enter_critical(unsigned int cluster)
>    * __mcpm_outbound_leave_critical(unsigned int cluster)
>    * __mcpm_cpu_down(unsigned int cluster, unsigned int cpu)
>
> The power_up_setup() helper should do platform-specific setup in
> preparation for turning the CPU on, such as invalidating local caches
> or entering coherency.  It must be assembler for now, since it must
> run before the MMU can be switched on.  It is passed the affinity level
> which should be initialized.
>
> Because the mcpm_sync_struct content is looked-up and modified
> with the cache enabled or disabled depending on the code path, it is
> crucial to always ensure proper cache maintenance to update main memory
> right away.  Therefore, any cached write must be followed by a cache
> clean operation and any cached read must be preceded by a cache
> invalidate operation (actually a cache flush i.e. clean+invalidate to
> avoid discarding possible concurrent writes) on the accessed memory.
>
> Also, in order to prevent a cached writer from interfering with an
> adjacent non-cached writer, we ensure each state variable is located to
> a separate cache line.
>
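
[ For concreteness, the sync structure ends up looking roughly like
  this, with each member padded out to its own writeback granule --
  simplified sketch, see mcpm_entry.h in this series for the real
  thing:

	struct mcpm_sync_struct {
		/* individual CPU states */
		struct {
			s8 cpu __aligned(__CACHE_WRITEBACK_GRANULE);
		} cpus[MAX_CPUS_PER_CLUSTER];

		/* cluster state */
		s8 cluster __aligned(__CACHE_WRITEBACK_GRANULE);

		/* inbound-side state */
		s8 inbound __aligned(__CACHE_WRITEBACK_GRANULE);
	};
]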
> Thanks to Nicolas Pitre and Achin Gupta for the help with this
> patch.
>
> Signed-off-by: Dave Martin <dave.martin@linaro.org>
> ---
[..]

> diff --git a/arch/arm/common/mcpm_entry.c b/arch/arm/common/mcpm_entry.c
> index c8c0e2113e..2b83121966 100644
> --- a/arch/arm/common/mcpm_entry.c
> +++ b/arch/arm/common/mcpm_entry.c
> @@ -18,6 +18,7 @@
>   #include <asm/proc-fns.h>
>   #include <asm/cacheflush.h>
>   #include <asm/idmap.h>
> +#include <asm/cputype.h>
>
>   extern volatile unsigned long mcpm_entry_vectors[MAX_NR_CLUSTERS][MAX_CPUS_PER_CLUSTER];
>
[...]

> +/*
> + * Ensure preceding writes to *p by other CPUs are visible to
> + * subsequent reads by this CPU.  We must be careful not to
> + * discard data simultaneously written by another CPU, hence the
> + * usage of flush rather than invalidate operations.
> + */
> +static void __sync_range_r(volatile void *p, size_t size)
> +{
> +	char *_p = (char *)p;
> +
> +#ifdef CONFIG_OUTER_CACHE
> +	if (outer_cache.flush_range) {
> +
You don't need the above #ifdef.  In the non-outer-cache case the
function pointer is NULL anyway.
> +		/*
> +		 * Ensure dirty data migrated from other CPUs into our cache
> +		 * are cleaned out safely before the outer cache is cleaned:
> +		 */
> +		__cpuc_clean_dcache_area(_p, size);
> +
> +		/* Clean and invalidate stale data for *p from outer ... */
> +		outer_flush_range(__pa(_p), __pa(_p + size));
> +	}
> +#endif
> +
> +	/* ... and inner cache: */
> +	__cpuc_flush_dcache_area(_p, size);
This will be unnecessary when inner cache is available, no?
Maybe you can re-arrange the code like below, unless and until
you would like to invalidate any speculative fetches during the
outer_flush_range()

	__cpuc_clean_dcache_area(_p, size);
	if (outer_cache.flush_range)
		outer_flush_range(__pa(_p), __pa(_p + size));

Rest of the patch looks fine to me.

Regards,
Santosh

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 03/15] ARM: mcpm: introduce helpers for platform coherency exit/setup
  2013-01-31 16:08   ` Santosh Shilimkar
@ 2013-01-31 17:16     ` Nicolas Pitre
  2013-02-01  5:10       ` Santosh Shilimkar
  0 siblings, 1 reply; 54+ messages in thread
From: Nicolas Pitre @ 2013-01-31 17:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 31 Jan 2013, Santosh Shilimkar wrote:

> On Tuesday 29 January 2013 01:20 PM, Nicolas Pitre wrote:
> > From: Dave Martin <dave.martin@linaro.org>
> > 
> > This provides helper methods to coordinate between CPUs coming down
> > and CPUs going up, as well as documentation on the used algorithms,
> > so that cluster teardown and setup
> > operations are not done for a cluster simultaneously.
> > 
> > For use in the power_down() implementation:
> >    * __mcpm_cpu_going_down(unsigned int cluster, unsigned int cpu)
> >    * __mcpm_outbound_enter_critical(unsigned int cluster)
> >    * __mcpm_outbound_leave_critical(unsigned int cluster)
> >    * __mcpm_cpu_down(unsigned int cluster, unsigned int cpu)
> > 
> > The power_up_setup() helper should do platform-specific setup in
> > preparation for turning the CPU on, such as invalidating local caches
> > or entering coherency.  It must be assembler for now, since it must
> > run before the MMU can be switched on.  It is passed the affinity level
> > which should be initialized.
> > 
> > Because the mcpm_sync_struct content is looked-up and modified
> > with the cache enabled or disabled depending on the code path, it is
> > crucial to always ensure proper cache maintenance to update main memory
> > right away.  Therefore, any cached write must be followed by a cache
> > clean operation and any cached read must be preceded by a cache
> > invalidate operation (actually a cache flush i.e. clean+invalidate to
> > avoid discarding possible concurrent writes) on the accessed memory.
> > 
> > Also, in order to prevent a cached writer from interfering with an
> > adjacent non-cached writer, we ensure each state variable is located to
> > a separate cache line.
> > 
> > Thanks to Nicolas Pitre and Achin Gupta for the help with this
> > patch.
> > 
> > Signed-off-by: Dave Martin <dave.martin@linaro.org>
> > ---
> [..]
> 
> > diff --git a/arch/arm/common/mcpm_entry.c b/arch/arm/common/mcpm_entry.c
> > index c8c0e2113e..2b83121966 100644
> > --- a/arch/arm/common/mcpm_entry.c
> > +++ b/arch/arm/common/mcpm_entry.c
> > @@ -18,6 +18,7 @@
> >   #include <asm/proc-fns.h>
> >   #include <asm/cacheflush.h>
> >   #include <asm/idmap.h>
> > +#include <asm/cputype.h>
> > 
> >   extern volatile unsigned long
> > mcpm_entry_vectors[MAX_NR_CLUSTERS][MAX_CPUS_PER_CLUSTER];
> > 
> [...]
> 
> > +/*
> > + * Ensure preceding writes to *p by other CPUs are visible to
> > + * subsequent reads by this CPU.  We must be careful not to
> > + * discard data simultaneously written by another CPU, hence the
> > + * usage of flush rather than invalidate operations.
> > + */
> > +static void __sync_range_r(volatile void *p, size_t size)
> > +{
> > +	char *_p = (char *)p;
> > +
> > +#ifdef CONFIG_OUTER_CACHE
> > +	if (outer_cache.flush_range) {
> > +
> You don't need the above #ifdef.  In the non-outer-cache case the
> function pointer is NULL anyway.

We do need the #ifdef, because if CONFIG_OUTER_CACHE is not selected 
then the outer_cache structure simply doesn't exist.

> > +		/*
> > +		 * Ensure dirty data migrated from other CPUs into our cache
> > +		 * are cleaned out safely before the outer cache is cleaned:
> > +		 */
> > +		__cpuc_clean_dcache_area(_p, size);
> > +
> > +		/* Clean and invalidate stale data for *p from outer ... */
> > +		outer_flush_range(__pa(_p), __pa(_p + size));
> > +	}
> > +#endif
> > +
> > +	/* ... and inner cache: */
> > +	__cpuc_flush_dcache_area(_p, size);
> This will be unnecessary when inner cache is available, no?
> Maybe you can re-arrange the code like below, unless and until
> you would like to invalidate any speculative fetches during the
> outer_flush_range()
> 
> 	__cpuc_clean_dcache_area(_p, size);
> 	if (outer_cache.flush_range)
> 		outer_flush_range(__pa(_p), __pa(_p + size));

As you said, the code is sequenced that way to get rid of potential 
speculative fetches that could happen right before L2 is flushed.

See discussion here:
http://article.gmane.org/gmane.linux.ports.arm.kernel/208887
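
To spell out the intended ordering with comments (same code as quoted
above, just annotated):

	/* 1) clean L1 so dirty lines for *p reach the outer cache */
	__cpuc_clean_dcache_area(_p, size);

	/* 2) clean+invalidate stale data for *p from the outer cache */
	outer_flush_range(__pa(_p), __pa(_p + size));

	/* 3) flush the inner cache last, to discard anything that may
	 *    have been speculatively fetched into it while step 2 was
	 *    still in progress */
	__cpuc_flush_dcache_area(_p, size);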


Nicolas

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 03/15] ARM: mcpm: introduce helpers for platform coherency exit/setup
  2013-01-31 17:16     ` Nicolas Pitre
@ 2013-02-01  5:10       ` Santosh Shilimkar
  2013-02-01 17:26         ` Nicolas Pitre
  0 siblings, 1 reply; 54+ messages in thread
From: Santosh Shilimkar @ 2013-02-01  5:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Thursday 31 January 2013 10:46 PM, Nicolas Pitre wrote:
> On Thu, 31 Jan 2013, Santosh Shilimkar wrote:
>
>> On Tuesday 29 January 2013 01:20 PM, Nicolas Pitre wrote:
>>> From: Dave Martin <dave.martin@linaro.org>
>>>
>>> This provides helper methods to coordinate between CPUs coming down
>>> and CPUs going up, as well as documentation on the used algorithms,
>>> so that cluster teardown and setup
>>> operations are not done for a cluster simultaneously.
>>>
>>> For use in the power_down() implementation:
>>>     * __mcpm_cpu_going_down(unsigned int cluster, unsigned int cpu)
>>>     * __mcpm_outbound_enter_critical(unsigned int cluster)
>>>     * __mcpm_outbound_leave_critical(unsigned int cluster)
>>>     * __mcpm_cpu_down(unsigned int cluster, unsigned int cpu)
>>>
>>> The power_up_setup() helper should do platform-specific setup in
>>> preparation for turning the CPU on, such as invalidating local caches
>>> or entering coherency.  It must be assembler for now, since it must
>>> run before the MMU can be switched on.  It is passed the affinity level
>>> which should be initialized.
>>>
>>> Because the mcpm_sync_struct content is looked-up and modified
>>> with the cache enabled or disabled depending on the code path, it is
>>> crucial to always ensure proper cache maintenance to update main memory
>>> right away.  Therefore, any cached write must be followed by a cache
>>> clean operation and any cached read must be preceded by a cache
>>> invalidate operation (actually a cache flush i.e. clean+invalidate to
>>> avoid discarding possible concurrent writes) on the accessed memory.
>>>
>>> Also, in order to prevent a cached writer from interfering with an
>>> adjacent non-cached writer, we ensure each state variable is located to
>>> a separate cache line.
>>>
>>> Thanks to Nicolas Pitre and Achin Gupta for the help with this
>>> patch.
>>>
>>> Signed-off-by: Dave Martin <dave.martin@linaro.org>
>>> ---
>> [..]
>>
>>> diff --git a/arch/arm/common/mcpm_entry.c b/arch/arm/common/mcpm_entry.c
>>> index c8c0e2113e..2b83121966 100644
>>> --- a/arch/arm/common/mcpm_entry.c
>>> +++ b/arch/arm/common/mcpm_entry.c
>>> @@ -18,6 +18,7 @@
>>>    #include <asm/proc-fns.h>
>>>    #include <asm/cacheflush.h>
>>>    #include <asm/idmap.h>
>>> +#include <asm/cputype.h>
>>>
>>>    extern volatile unsigned long
>>> mcpm_entry_vectors[MAX_NR_CLUSTERS][MAX_CPUS_PER_CLUSTER];
>>>
>> [...]
>>
>>> +/*
>>> + * Ensure preceding writes to *p by other CPUs are visible to
>>> + * subsequent reads by this CPU.  We must be careful not to
>>> + * discard data simultaneously written by another CPU, hence the
>>> + * usage of flush rather than invalidate operations.
>>> + */
>>> +static void __sync_range_r(volatile void *p, size_t size)
>>> +{
>>> +	char *_p = (char *)p;
>>> +
>>> +#ifdef CONFIG_OUTER_CACHE
>>> +	if (outer_cache.flush_range) {
>>> +
>> You don't need the above #ifdef.  In the non-outer-cache case the
>> function pointer is NULL anyway.
>
> We do need the #ifdef, because if CONFIG_OUTER_CACHE is not selected
> then the outer_cache structure simply doesn't exist.
>
You are right.  An #ifdef in the middle of the code looks a bit ugly,
hence I was thinking of avoiding it.

>>> +		/*
>>> +		 * Ensure dirty data migrated from other CPUs into our cache
>>> +		 * are cleaned out safely before the outer cache is cleaned:
>>> +		 */
>>> +		__cpuc_clean_dcache_area(_p, size);
>>> +
>>> +		/* Clean and invalidate stale data for *p from outer ... */
>>> +		outer_flush_range(__pa(_p), __pa(_p + size));
>>> +	}
>>> +#endif
>>> +
>>> +	/* ... and inner cache: */
>>> +	__cpuc_flush_dcache_area(_p, size);
>> This will be unnecessary when inner cache is available, no?
>> Maybe you can re-arrange the code like below, unless and until
>> you would like to invalidate any speculative fetches during the
>> outer_flush_range()
>>
>> 	__cpuc_clean_dcache_area(_p, size);
>> 	if (outer_cache.flush_range)
>> 		outer_flush_range(__pa(_p), __pa(_p + size));
>
> As you said, the code is sequenced that way to get rid of potential
> speculative fetches that could happen right before L2 is flushed.
>
Thanks for clarifying it. It makes sense.

Regards,
Santosh

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 04/15] ARM: mcpm: Add baremetal voting mutexes
  2013-01-29  7:50 ` [PATCH v3 04/15] ARM: mcpm: Add baremetal voting mutexes Nicolas Pitre
@ 2013-02-01  5:29   ` Santosh Shilimkar
  0 siblings, 0 replies; 54+ messages in thread
From: Santosh Shilimkar @ 2013-02-01  5:29 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 29 January 2013 01:20 PM, Nicolas Pitre wrote:
> From: Dave Martin <dave.martin@linaro.org>
>
> This patch adds a simple low-level voting mutex implementation
> to be used to arbitrate during first man selection when no load/store
> exclusive instructions are usable.
>
> For want of a better name, these are called "vlocks".  (I was
> tempted to call them ballot locks, but "block" is way too confusing
> an abbreviation...)
>
> There is no function to wait for the lock to be released, and no
> vlock_lock() function since we don't need these at the moment.
> These could straightforwardly be added if vlocks get used for other
> purposes.
>
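
[ For readers not following the whole series: the
  Documentation/arm/vlocks.txt file added by this patch explains the
  algorithm; in rough C pseudo-code it boils down to:

	/* pseudo-code only -- the real thing is assembly in vlock.S */
	int currently_voting[NR_CPUS];
	int last_vote = -1;	/* no votes yet */

	bool vlock_trylock(int this_cpu)
	{
		int i;

		/* signal our desire to vote */
		currently_voting[this_cpu] = 1;
		if (last_vote != -1) {
			/* someone already volunteered himself */
			currently_voting[this_cpu] = 0;
			return false;
		}

		/* suggest ourself as the winner */
		last_vote = this_cpu;
		currently_voting[this_cpu] = 0;

		/* then wait until everyone else is done voting */
		for (i = 0; i < NR_CPUS; i++)
			while (currently_voting[i])
				/* wait */;

		return last_vote == this_cpu;	/* did we win? */
	}

  with the barriers discussed below inserted around the shared
  accesses. ]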
> For architectural correctness even Strongly-Ordered memory accesses
> require barriers in order to guarantee that multiple CPUs have a
> coherent view of the ordering of memory accesses.  Whether or not
> this matters depends on hardware implementation details of the
> memory system.  Since the purpose of this code is to provide a clean,
> generic locking mechanism with no platform-specific dependencies the
> barriers should be present to avoid unpleasant surprises on future
> platforms.
>
> Note:
>
>    * When taking the lock, we don't care about implicit background
>      memory operations and other signalling which may be pending,
>      because those are not part of the critical section anyway.
>
>      A DMB is sufficient to ensure correctly observed ordering of
>      the explicit memory accesses in vlock_trylock.
>
>    * No barrier is required after checking the election result,
>      because the result is determined by the store to
>      VLOCK_OWNER_OFFSET and is already globally observed due to the
>      barriers in voting_end.  This means that global agreement on
>      the winner is guaranteed, even before the winner is known
>      locally.
>
> Signed-off-by: Dave Martin <dave.martin@linaro.org>
> Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
> ---
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 05/15] ARM: mcpm_head.S: vlock-based first man election
  2013-01-29  7:51 ` [PATCH v3 05/15] ARM: mcpm_head.S: vlock-based first man election Nicolas Pitre
@ 2013-02-01  5:34   ` Santosh Shilimkar
  0 siblings, 0 replies; 54+ messages in thread
From: Santosh Shilimkar @ 2013-02-01  5:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 29 January 2013 01:21 PM, Nicolas Pitre wrote:
> From: Dave Martin <dave.martin@linaro.org>
>
> Instead of requiring the first man to be elected in advance (which
> can be suboptimal in some situations), this patch uses a per-
> cluster mutex to co-ordinate selection of the first man.
>
> This should also make it more feasible to reuse this code path for
> asynchronous cluster resume (as in CPUidle scenarios).
>
> We must ensure that the vlock data doesn't share a cacheline with
> anything else, or dirty cache eviction could corrupt it.
>
> Signed-off-by: Dave Martin <dave.martin@linaro.org>
> Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
> ---
Reviewed-by: Santosh Shilimkar<santosh.shilimkar@ti.com>

Regards,
Santosh

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 06/15] ARM: mcpm: generic SMP secondary bringup and hotplug support
  2013-01-29  7:51 ` [PATCH v3 06/15] ARM: mcpm: generic SMP secondary bringup and hotplug support Nicolas Pitre
  2013-01-29 20:38   ` Rob Herring
@ 2013-02-01  5:38   ` Santosh Shilimkar
  1 sibling, 0 replies; 54+ messages in thread
From: Santosh Shilimkar @ 2013-02-01  5:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 29 January 2013 01:21 PM, Nicolas Pitre wrote:
> Now that the cluster power API is in place, we can use it for SMP secondary
> bringup and CPU hotplug in a generic fashion.
>
> Signed-off-by: Nicolas Pitre <nico@linaro.org>
> ---
>   arch/arm/common/Makefile       |  2 +-
>   arch/arm/common/mcpm_platsmp.c | 85 ++++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 86 insertions(+), 1 deletion(-)
>   create mode 100644 arch/arm/common/mcpm_platsmp.c
>
> diff --git a/arch/arm/common/Makefile b/arch/arm/common/Makefile
> index c901a38c59..e1c9db45de 100644
> --- a/arch/arm/common/Makefile
> +++ b/arch/arm/common/Makefile
> @@ -13,4 +13,4 @@ obj-$(CONFIG_SHARP_PARAM)	+= sharpsl_param.o
>   obj-$(CONFIG_SHARP_SCOOP)	+= scoop.o
>   obj-$(CONFIG_PCI_HOST_ITE8152)  += it8152.o
>   obj-$(CONFIG_ARM_TIMER_SP804)	+= timer-sp.o
> -obj-$(CONFIG_CLUSTER_PM)	+= mcpm_head.o mcpm_entry.o vlock.o
> +obj-$(CONFIG_CLUSTER_PM)	+= mcpm_head.o mcpm_entry.o mcpm_platsmp.o vlock.o
> diff --git a/arch/arm/common/mcpm_platsmp.c b/arch/arm/common/mcpm_platsmp.c
> new file mode 100644
> index 0000000000..401298f5ee
> --- /dev/null
> +++ b/arch/arm/common/mcpm_platsmp.c
> @@ -0,0 +1,85 @@
> +/*
> + * linux/arch/arm/common/mcpm_platsmp.c
> + *
> + * Created by:  Nicolas Pitre, November 2012
> + * Copyright:   (C) 2012-2013  Linaro Limited
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * Code to handle secondary CPU bringup and hotplug for the cluster power API.
> + */
> +
> +#include <linux/init.h>
> +#include <linux/smp.h>
> +
> +#include <asm/mcpm_entry.h>
> +#include <asm/smp_plat.h>
> +#include <asm/hardware/gic.h>
> +
> +static void __init simple_smp_init_cpus(void)
> +{
> +	set_smp_cross_call(gic_raise_softirq);
> +}
> +
> +static int __cpuinit mcpm_boot_secondary(unsigned int cpu, struct task_struct *idle)
> +{
> +	unsigned int mpidr, pcpu, pcluster, ret;
> +	extern void secondary_startup(void);
> +
> +	mpidr = cpu_logical_map(cpu);
> +	pcpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> +	pcluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> +	pr_debug("%s: logical CPU %d is physical CPU %d cluster %d\n",
> +		 __func__, cpu, pcpu, pcluster);
> +
> +	mcpm_set_entry_vector(pcpu, pcluster, NULL);
> +	ret = mcpm_cpu_power_up(pcpu, pcluster);
> +	if (ret)
> +		return ret;
> +	mcpm_set_entry_vector(pcpu, pcluster, secondary_startup);
> +	gic_raise_softirq(cpumask_of(cpu), 0);
> +	dsb_sev();
> +	return 0;
> +}
> +
> +static void __cpuinit mcpm_secondary_init(unsigned int cpu)
> +{
> +	mcpm_cpu_powered_up();
> +	gic_secondary_init(0);
This gic init should not be needed with Catalin's notifier
series.  Something to be removed depending on when that series
gets in.

Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 07/15] ARM: vexpress: Select the correct SMP operations at run-time
  2013-01-29  7:51 ` [PATCH v3 07/15] ARM: vexpress: Select the correct SMP operations at run-time Nicolas Pitre
  2013-01-29 15:43   ` Jon Medhurst (Tixy)
@ 2013-02-01  5:41   ` Santosh Shilimkar
  2013-02-01 17:28     ` Nicolas Pitre
  1 sibling, 1 reply; 54+ messages in thread
From: Santosh Shilimkar @ 2013-02-01  5:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 29 January 2013 01:21 PM, Nicolas Pitre wrote:
> From: Jon Medhurst <tixy@linaro.org>
>
The patch deserves a couple of lines of description here.

> Signed-off-by: Jon Medhurst <tixy@linaro.org>
> ---
>   arch/arm/include/asm/mach/arch.h |  3 +++
>   arch/arm/kernel/setup.c          |  5 ++++-
>   arch/arm/mach-vexpress/core.h    |  2 ++
>   arch/arm/mach-vexpress/platsmp.c | 12 ++++++++++++
>   arch/arm/mach-vexpress/v2m.c     |  2 +-
>   5 files changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/include/asm/mach/arch.h b/arch/arm/include/asm/mach/arch.h
> index 917d4fcfd9..3d01c6d6c3 100644
> --- a/arch/arm/include/asm/mach/arch.h
> +++ b/arch/arm/include/asm/mach/arch.h
> @@ -17,8 +17,10 @@ struct pt_regs;
>   struct smp_operations;
>   #ifdef CONFIG_SMP
>   #define smp_ops(ops) (&(ops))
> +#define smp_init_ops(ops) (&(ops))
>   #else
>   #define smp_ops(ops) (struct smp_operations *)NULL
> +#define smp_init_ops(ops) (void (*)(void))NULL
>   #endif
>
>   struct machine_desc {
> @@ -42,6 +44,7 @@ struct machine_desc {
>   	unsigned char		reserve_lp2 :1;	/* never has lp2	*/
>   	char			restart_mode;	/* default restart mode	*/
>   	struct smp_operations	*smp;		/* SMP operations	*/
> +	void			(*smp_init)(void);
>   	void			(*fixup)(struct tag *, char **,
>   					 struct meminfo *);
>   	void			(*reserve)(void);/* reserve mem blocks	*/
> diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
> index 3f6cbb2e3e..41edca8582 100644
> --- a/arch/arm/kernel/setup.c
> +++ b/arch/arm/kernel/setup.c
> @@ -768,7 +768,10 @@ void __init setup_arch(char **cmdline_p)
>   	arm_dt_init_cpu_maps();
>   #ifdef CONFIG_SMP
>   	if (is_smp()) {
> -		smp_set_ops(mdesc->smp);
> +		if (mdesc->smp_init)
> +			(*mdesc->smp_init)();
> +		else
> +			smp_set_ops(mdesc->smp);
>   		smp_init_cpus();
>   	}
>   #endif
> diff --git a/arch/arm/mach-vexpress/core.h b/arch/arm/mach-vexpress/core.h
> index f134cd4a85..3a761fd76c 100644
> --- a/arch/arm/mach-vexpress/core.h
> +++ b/arch/arm/mach-vexpress/core.h
> @@ -6,6 +6,8 @@
>
>   void vexpress_dt_smp_map_io(void);
>
> +void vexpress_smp_init_ops(void);
> +
>   extern struct smp_operations	vexpress_smp_ops;
>
>   extern void vexpress_cpu_die(unsigned int cpu);
> diff --git a/arch/arm/mach-vexpress/platsmp.c b/arch/arm/mach-vexpress/platsmp.c
> index c5d70de9bb..667344b479 100644
> --- a/arch/arm/mach-vexpress/platsmp.c
> +++ b/arch/arm/mach-vexpress/platsmp.c
> @@ -12,6 +12,7 @@
>   #include <linux/errno.h>
>   #include <linux/smp.h>
>   #include <linux/io.h>
> +#include <linux/of.h>
>   #include <linux/of_fdt.h>
>   #include <linux/vexpress.h>
>
> @@ -206,3 +207,14 @@ struct smp_operations __initdata vexpress_smp_ops = {
>   	.cpu_die		= vexpress_cpu_die,
>   #endif
>   };
> +
> +void __init vexpress_smp_init_ops(void)
> +{
> +	struct smp_operations *ops = &vexpress_smp_ops;
> +#ifdef CONFIG_CLUSTER_PM
See if you can avoid this #ifdef in the middle of the function.
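
One possibility is IS_ENABLED(), which keeps the code visible to the
compiler while still discarding the dead branch at compile time.  A
rough sketch of what I mean; I'm guessing at what the #ifdef body
does, and the DT match string here is only illustrative:

void __init vexpress_smp_init_ops(void)
{
	struct smp_operations *ops = &vexpress_smp_ops;

	/*
	 * IS_ENABLED(CONFIG_CLUSTER_PM) folds to a compile-time
	 * constant, so the compiler drops the dead branch when the
	 * option is off, as long as the mcpm declarations remain
	 * visible to it.
	 */
	if (IS_ENABLED(CONFIG_CLUSTER_PM) &&
	    of_find_compatible_node(NULL, NULL, "arm,dcscb"))
		ops = &mcpm_smp_ops;

	smp_set_ops(ops);
}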

Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 08/15] ARM: introduce common set_auxcr/get_auxcr functions
  2013-01-29  7:51 ` [PATCH v3 08/15] ARM: introduce common set_auxcr/get_auxcr functions Nicolas Pitre
@ 2013-02-01  5:44   ` Santosh Shilimkar
  0 siblings, 0 replies; 54+ messages in thread
From: Santosh Shilimkar @ 2013-02-01  5:44 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 29 January 2013 01:21 PM, Nicolas Pitre wrote:
> From: Rob Herring <rob.herring@calxeda.com>
>
> Move the private set_auxcr/get_auxcr functions from
> drivers/cpuidle/cpuidle-calxeda.c so they can be used across platforms.
>
> Signed-off-by: Rob Herring <rob.herring@calxeda.com>
> Cc: Russell King <linux@arm.linux.org.uk>
> Acked-by: Tony Lindgren <tony@atomide.com>
> Signed-off-by: Nicolas Pitre <nico@linaro.org>
> ---
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 09/15] ARM: vexpress: introduce DCSCB support
  2013-01-29  7:51 ` [PATCH v3 09/15] ARM: vexpress: introduce DCSCB support Nicolas Pitre
@ 2013-02-01  5:50   ` Santosh Shilimkar
  0 siblings, 0 replies; 54+ messages in thread
From: Santosh Shilimkar @ 2013-02-01  5:50 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 29 January 2013 01:21 PM, Nicolas Pitre wrote:
> This adds basic CPU and cluster reset controls on RTSM for the
> A15x4-A7x4 model configuration using the Dual Cluster System
> Configuration Block (DCSCB).
>
> The cache coherency interconnect (CCI) is not handled yet.
>
> Signed-off-by: Nicolas Pitre <nico@linaro.org>
> ---
>   arch/arm/mach-vexpress/Kconfig  |   8 ++
>   arch/arm/mach-vexpress/Makefile |   1 +
>   arch/arm/mach-vexpress/dcscb.c  | 159 ++++++++++++++++++++++++++++++++++++++++
>   3 files changed, 168 insertions(+)
>   create mode 100644 arch/arm/mach-vexpress/dcscb.c
>
> diff --git a/arch/arm/mach-vexpress/Kconfig b/arch/arm/mach-vexpress/Kconfig
> index 52d315b792..f3f92b120a 100644
> --- a/arch/arm/mach-vexpress/Kconfig
> +++ b/arch/arm/mach-vexpress/Kconfig
> @@ -52,4 +52,12 @@ config ARCH_VEXPRESS_CORTEX_A5_A9_ERRATA
>   config ARCH_VEXPRESS_CA9X4
>   	bool "Versatile Express Cortex-A9x4 tile"
>
> +config ARCH_VEXPRESS_DCSCB
> +	bool "Dual Cluster System Control Block (DCSCB) support"
> +	depends on CLUSTER_PM
> +	help
> +	  Support for the Dual Cluster System Configuration Block (DCSCB).
> +	  This is needed to provide CPU and cluster power management
> +	  on RTSM.
> +
>   endmenu
> diff --git a/arch/arm/mach-vexpress/Makefile b/arch/arm/mach-vexpress/Makefile
> index 80b64971fb..2253644054 100644
> --- a/arch/arm/mach-vexpress/Makefile
> +++ b/arch/arm/mach-vexpress/Makefile
> @@ -6,5 +6,6 @@ ccflags-$(CONFIG_ARCH_MULTIPLATFORM) := -I$(srctree)/$(src)/include \
>
>   obj-y					:= v2m.o reset.o
>   obj-$(CONFIG_ARCH_VEXPRESS_CA9X4)	+= ct-ca9x4.o
> +obj-$(CONFIG_ARCH_VEXPRESS_DCSCB)	+= dcscb.o
>   obj-$(CONFIG_SMP)			+= platsmp.o
>   obj-$(CONFIG_HOTPLUG_CPU)		+= hotplug.o
> diff --git a/arch/arm/mach-vexpress/dcscb.c b/arch/arm/mach-vexpress/dcscb.c
> new file mode 100644
> index 0000000000..677ced9efc
> --- /dev/null
> +++ b/arch/arm/mach-vexpress/dcscb.c
> @@ -0,0 +1,159 @@
> +/*
> + * arch/arm/mach-vexpress/dcscb.c - Dual Cluster System Control Block
> + *
> + * Created by:	Nicolas Pitre, May 2012
> + * Copyright:	(C) 2012-2013  Linaro Limited
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/init.h>
> +#include <linux/kernel.h>
> +#include <linux/io.h>
> +#include <linux/spinlock.h>
> +#include <linux/errno.h>
> +#include <linux/vexpress.h>
> +
> +#include <asm/mcpm_entry.h>
> +#include <asm/proc-fns.h>
> +#include <asm/cacheflush.h>
> +#include <asm/cputype.h>
> +#include <asm/cp15.h>
> +
> +
> +#define DCSCB_PHYS_BASE	0x60000000
> +
> +#define RST_HOLD0	0x0
> +#define RST_HOLD1	0x4
> +#define SYS_SWRESET	0x8
> +#define RST_STAT0	0xc
> +#define RST_STAT1	0x10
> +#define EAG_CFG_R	0x20
> +#define EAG_CFG_W	0x24
> +#define KFC_CFG_R	0x28
> +#define KFC_CFG_W	0x2c
> +#define DCS_CFG_R	0x30
> +
> +/*
> + * We can't use regular spinlocks. In the switcher case, it is possible
> + * for an outbound CPU to call power_down() after its inbound counterpart
> + * is already live using the same logical CPU number which trips lockdep
> + * debugging.
> + */
> +static arch_spinlock_t dcscb_lock = __ARCH_SPIN_LOCK_UNLOCKED;
> +
> +static void __iomem *dcscb_base;
> +
> +static int dcscb_power_up(unsigned int cpu, unsigned int cluster)
> +{
> +	unsigned int rst_hold, cpumask = (1 << cpu);
> +
> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> +	if (cpu >= 4 || cluster >= 2)

Is this a CCI limitation?
> +		return -EINVAL;
> +
> +	/*
> +	 * Since this is called with IRQs enabled, and no arch_spin_lock_irq
> +	 * variant exists, we need to disable IRQs manually here.
> +	 */
> +	local_irq_disable();
> +	arch_spin_lock(&dcscb_lock);
> +
> +	rst_hold = readl_relaxed(dcscb_base + RST_HOLD0 + cluster * 4);
> +	if (rst_hold & (1 << 8)) {
> +		/* remove cluster reset and add individual CPU's reset */
> +		rst_hold &= ~(1 << 8);
> +		rst_hold |= 0xf;
> +	}
> +	rst_hold &= ~(cpumask | (cpumask << 4));
> +	writel(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4);
> +
> +	arch_spin_unlock(&dcscb_lock);
> +	local_irq_enable();
> +
> +	return 0;
> +}
> +
> +static void dcscb_power_down(void)
> +{
> +	unsigned int mpidr, cpu, cluster, rst_hold, cpumask, last_man;
> +
> +	mpidr = read_cpuid_mpidr();
> +	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> +	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> +	cpumask = (1 << cpu);
> +
> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> +	BUG_ON(cpu >= 4 || cluster >= 2);
> +
> +	arch_spin_lock(&dcscb_lock);
> +	rst_hold = readl_relaxed(dcscb_base + RST_HOLD0 + cluster * 4);
> +	rst_hold |= cpumask;
> +	if (((rst_hold | (rst_hold >> 4)) & 0xf) == 0xf)
> +		rst_hold |= (1 << 8);
> +	writel(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4);
> +	arch_spin_unlock(&dcscb_lock);
> +	last_man = (rst_hold & (1 << 8));
> +
> +	/*
> +	 * Now let's clean our L1 cache and shut ourself down.
> +	 * If we're the last CPU in this cluster then clean L2 too.
> +	 */
> +
How about merging these two comments?
> +	/*
> +	 * A15/A7 can hit in the cache with SCTLR.C=0, so we don't need
> +	 * a preliminary flush here for those CPUs.  At least, that's
> +	 * the theory -- without the extra flush, Linux explodes on
> +	 * RTSM (maybe not needed anymore, to be investigated)..
> +	 */
> +	flush_cache_louis();
> +	cpu_proc_fin();
Lorenzo already noticed the I-cache getting disabled here.
That should be fixed.  Rest of the patch looks fine.

Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 10/15] ARM: vexpress/dcscb: add CPU use counts to the power up/down API implementation
  2013-01-29  7:51 ` [PATCH v3 10/15] ARM: vexpress/dcscb: add CPU use counts to the power up/down API implementation Nicolas Pitre
@ 2013-02-01  5:53   ` Santosh Shilimkar
  0 siblings, 0 replies; 54+ messages in thread
From: Santosh Shilimkar @ 2013-02-01  5:53 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 29 January 2013 01:21 PM, Nicolas Pitre wrote:
> It is possible for a CPU to be told to power up before it managed
> to power itself down.  Solve this race with a usage count as mandated
> by the API definition.
>
> Signed-off-by: Nicolas Pitre <nico@linaro.org>
> ---
>   arch/arm/mach-vexpress/dcscb.c | 77 +++++++++++++++++++++++++++++++++---------
>   1 file changed, 61 insertions(+), 16 deletions(-)
>
> diff --git a/arch/arm/mach-vexpress/dcscb.c b/arch/arm/mach-vexpress/dcscb.c
> index 677ced9efc..f993608944 100644
> --- a/arch/arm/mach-vexpress/dcscb.c
> +++ b/arch/arm/mach-vexpress/dcscb.c
> @@ -45,6 +45,7 @@
>   static arch_spinlock_t dcscb_lock = __ARCH_SPIN_LOCK_UNLOCKED;
>
>   static void __iomem *dcscb_base;
> +static int dcscb_use_count[4][2];
>
>   static int dcscb_power_up(unsigned int cpu, unsigned int cluster)
>   {
> @@ -61,14 +62,27 @@ static int dcscb_power_up(unsigned int cpu, unsigned int cluster)
>   	local_irq_disable();
>   	arch_spin_lock(&dcscb_lock);
>
> -	rst_hold = readl_relaxed(dcscb_base + RST_HOLD0 + cluster * 4);
> -	if (rst_hold & (1 << 8)) {
> -		/* remove cluster reset and add individual CPU's reset */
> -		rst_hold &= ~(1 << 8);
> -		rst_hold |= 0xf;
> +	dcscb_use_count[cpu][cluster]++;
> +	if (dcscb_use_count[cpu][cluster] == 1) {
> +		rst_hold = readl_relaxed(dcscb_base + RST_HOLD0 + cluster * 4);
> +		if (rst_hold & (1 << 8)) {
> +			/* remove cluster reset and add individual CPU's reset */
> +			rst_hold &= ~(1 << 8);
> +			rst_hold |= 0xf;
> +		}
> +		rst_hold &= ~(cpumask | (cpumask << 4));
> +		writel(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4);
> +	} else if (dcscb_use_count[cpu][cluster] != 2) {
> +		/*
> +		 * The only possible values are:
> +		 * 0 = CPU down
> +		 * 1 = CPU (still) up
> +		 * 2 = CPU requested to be up before it had a chance
> > +		 *     to actually power itself down.
> +		 * Any other value is a bug.
> +		 */
> +		BUG();
No strong opinion, but would a switch case be better here?
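
Roughly what I have in mind, switching on the count before it gets
incremented; untested, but it should keep your exact logic:

	switch (dcscb_use_count[cpu][cluster]++) {
	case 0:		/* CPU was down: release it from reset */
		rst_hold = readl_relaxed(dcscb_base + RST_HOLD0 + cluster * 4);
		if (rst_hold & (1 << 8)) {
			/* remove cluster reset and add individual CPU's reset */
			rst_hold &= ~(1 << 8);
			rst_hold |= 0xf;
		}
		rst_hold &= ~(cpumask | (cpumask << 4));
		writel(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4);
		break;
	case 1:		/* CPU (still) up: nothing to do */
		break;
	default:	/* any other use count is a bug */
		BUG();
	}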

>   	}
> -	rst_hold &= ~(cpumask | (cpumask << 4));
> -	writel(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4);
>
>   	arch_spin_unlock(&dcscb_lock);
>   	local_irq_enable();
> @@ -78,7 +92,8 @@ static int dcscb_power_up(unsigned int cpu, unsigned int cluster)
>
>   static void dcscb_power_down(void)
>   {
> -	unsigned int mpidr, cpu, cluster, rst_hold, cpumask, last_man;
> +	unsigned int mpidr, cpu, cluster, rst_hold, cpumask;
> +	bool last_man = false, skip_wfi = false;
>
>   	mpidr = read_cpuid_mpidr();
>   	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> @@ -89,13 +104,26 @@ static void dcscb_power_down(void)
>   	BUG_ON(cpu >= 4 || cluster >= 2);
>
>   	arch_spin_lock(&dcscb_lock);
> -	rst_hold = readl_relaxed(dcscb_base + RST_HOLD0 + cluster * 4);
> -	rst_hold |= cpumask;
> -	if (((rst_hold | (rst_hold >> 4)) & 0xf) == 0xf)
> -		rst_hold |= (1 << 8);
> -	writel(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4);
> +	dcscb_use_count[cpu][cluster]--;
> +	if (dcscb_use_count[cpu][cluster] == 0) {
> +		rst_hold = readl_relaxed(dcscb_base + RST_HOLD0 + cluster * 4);
> +		rst_hold |= cpumask;
> +		if (((rst_hold | (rst_hold >> 4)) & 0xf) == 0xf) {
> +			rst_hold |= (1 << 8);
> +			last_man = true;
> +		}
> +		writel(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4);
> +	} else if (dcscb_use_count[cpu][cluster] == 1) {
> +		/*
> +		 * A power_up request went ahead of us.
> +		 * Even if we do not want to shut this CPU down,
> +		 * the caller expects a certain state as if the WFI
> +		 * was aborted.  So let's continue with cache cleaning.
> +		 */
> +		skip_wfi = true;
> +	} else
> +		BUG();
Same comment as above.

Rest looks fine.
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 11/15] ARM: vexpress/dcscb: do not hardcode number of CPUs per cluster
  2013-01-29  7:51 ` [PATCH v3 11/15] ARM: vexpress/dcscb: do not hardcode number of CPUs per cluster Nicolas Pitre
@ 2013-02-01  5:57   ` Santosh Shilimkar
  2013-02-01 17:24     ` Nicolas Pitre
  0 siblings, 1 reply; 54+ messages in thread
From: Santosh Shilimkar @ 2013-02-01  5:57 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 29 January 2013 01:21 PM, Nicolas Pitre wrote:
> If 4 CPUs are assumed, the A15x1-A7x1 model configuration would never
> shut down the initial cluster as the 0xf reset bit mask will never be
> observed.  Let's construct this mask based on the provided information
> in the DCSCB config register for the number of CPUs per cluster.
>
> Signed-off-by: Nicolas Pitre <nico@linaro.org>
> ---
>   arch/arm/mach-vexpress/dcscb.c | 14 ++++++++++----
>   1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm/mach-vexpress/dcscb.c b/arch/arm/mach-vexpress/dcscb.c
> index f993608944..8d363357ef 100644
> --- a/arch/arm/mach-vexpress/dcscb.c
> +++ b/arch/arm/mach-vexpress/dcscb.c
> @@ -46,10 +46,12 @@ static arch_spinlock_t dcscb_lock = __ARCH_SPIN_LOCK_UNLOCKED;
>
>   static void __iomem *dcscb_base;
>   static int dcscb_use_count[4][2];
> +static int dcscb_mcpm_cpu_mask[2];
s/2/MAX_CLUSTERS ?

Apart from above minor question,
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 12/15] drivers/bus: add ARM CCI support
  2013-01-29  7:51 ` [PATCH v3 12/15] drivers/bus: add ARM CCI support Nicolas Pitre
@ 2013-02-01  6:01   ` Santosh Shilimkar
  0 siblings, 0 replies; 54+ messages in thread
From: Santosh Shilimkar @ 2013-02-01  6:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 29 January 2013 01:21 PM, Nicolas Pitre wrote:
> From: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
>
> On ARM multi-cluster systems coherency between cores running on
> different clusters is managed by the cache-coherent interconnect (CCI).
> It allows broadcasting of TLB invalidates and memory barriers and it
> guarantees cache coherency at system level.
>
> This patch enables the basic infrastructure required in Linux to
> handle and program the CCI component. The first implementation is
> based on a platform device, its corresponding DT compatible property and
> a simple programming interface.
>
> Signed-off-by: Nicolas Pitre <nico@linaro.org>
> ---
>   drivers/bus/Kconfig     |   4 ++
>   drivers/bus/Makefile    |   2 +
>   drivers/bus/arm-cci.c   | 107 ++++++++++++++++++++++++++++++++++++++++++++++++
>   include/linux/arm-cci.h |  30 ++++++++++++++
>   4 files changed, 143 insertions(+)
>   create mode 100644 drivers/bus/arm-cci.c
>   create mode 100644 include/linux/arm-cci.h
>
> diff --git a/drivers/bus/Kconfig b/drivers/bus/Kconfig
> index 0f51ed687d..d032f74ff2 100644
> --- a/drivers/bus/Kconfig
> +++ b/drivers/bus/Kconfig
> @@ -19,4 +19,8 @@ config OMAP_INTERCONNECT
>
>   	help
>   	  Driver to enable OMAP interconnect error handling driver.
> +
> +config ARM_CCI
> +       bool "ARM CCI driver support"
> +
>   endmenu
> diff --git a/drivers/bus/Makefile b/drivers/bus/Makefile
> index 45d997c854..55aac809e5 100644
> --- a/drivers/bus/Makefile
> +++ b/drivers/bus/Makefile
> @@ -6,3 +6,5 @@ obj-$(CONFIG_OMAP_OCP2SCP)	+= omap-ocp2scp.o
>
>   # Interconnect bus driver for OMAP SoCs.
>   obj-$(CONFIG_OMAP_INTERCONNECT)	+= omap_l3_smx.o omap_l3_noc.o
> +
> +obj-$(CONFIG_ARM_CCI)		+= arm-cci.o
> diff --git a/drivers/bus/arm-cci.c b/drivers/bus/arm-cci.c
> new file mode 100644
> index 0000000000..25ae156924
> --- /dev/null
> +++ b/drivers/bus/arm-cci.c
> @@ -0,0 +1,107 @@
> +/*
> + * ARM Cache Coherency Interconnect (CCI400) support
> + *
> + * Copyright (C) 2012-2013 ARM Ltd.
> + * Author: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed "as is" WITHOUT ANY WARRANTY of any
> + * kind, whether express or implied; without even the implied warranty
> + * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include <linux/device.h>
> +#include <linux/io.h>
> +#include <linux/module.h>
> +#include <linux/platform_device.h>
> +#include <linux/slab.h>
> +#include <linux/arm-cci.h>
> +
> +#define CCI400_EAG_OFFSET       0x4000
> +#define CCI400_KF_OFFSET        0x5000
> +
> +#define DRIVER_NAME	"CCI"
> +struct cci_drvdata {
> +	void __iomem *baseaddr;
> +	spinlock_t lock;
> +};
> +
> +static struct cci_drvdata *info;
> +
> +void disable_cci(int cluster)
> +{
> +	u32 cci_reg = cluster ? CCI400_KF_OFFSET : CCI400_EAG_OFFSET;
> +	writel_relaxed(0x0, info->baseaddr + cci_reg);
> +
> +	while (readl_relaxed(info->baseaddr + 0xc) & 0x1)
0xc? Is that a status register? A define for it would be good
(see the sketch below).  Rest of the patch looks fine.
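
Register and bit names invented here, purely to illustrate the idea:

#define CCI400_STATUS		0xc
#define STATUS_CHANGE_PENDING	(1 << 0)

	/* wait for the coherency state change to take effect */
	while (readl_relaxed(info->baseaddr + CCI400_STATUS) &
	       STATUS_CHANGE_PENDING)
		cpu_relax();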

Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 13/15] ARM: CCI: ensure powerdown-time data is flushed from cache
  2013-01-29  7:51 ` [PATCH v3 13/15] ARM: CCI: ensure powerdown-time data is flushed from cache Nicolas Pitre
@ 2013-02-01  6:13   ` Santosh Shilimkar
  2013-02-02 22:23     ` Nicolas Pitre
  0 siblings, 1 reply; 54+ messages in thread
From: Santosh Shilimkar @ 2013-02-01  6:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 29 January 2013 01:21 PM, Nicolas Pitre wrote:
> From: Dave Martin <dave.martin@linaro.org>
>
> Non-local variables used by the CCI management function called after
> disabling the cache must be flushed out to main memory in advance,
> otherwise incoherency of those values may occur if they are sitting
> in the cache of some other CPU when cci_disable() executes.
>
Any CPU calling cci_disable() would have already cleaned its local
cache, and the snoop unit should take care of syncing the shared data
beforehand from other CPUs' local caches for shared accesses.
Maybe I am unable to visualize the issue here or am missing some key
point.

> This patch adds the appropriate flushing to the CCI driver to ensure
> that the relevant data is available in RAM ahead of time.
>
> Because this creates a dependency on arch-specific cacheflushing
> functions, this patch also makes ARM_CCI depend on ARM.
>
You should do that anyway, to avoid other architectures building this
driver in random builds and breaking them.


> Signed-off-by: Dave Martin <dave.martin@linaro.org>
> Signed-off-by: Nicolas Pitre <nico@linaro.org>
> ---
Patch is fine apart from the question.

Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 14/15] ARM: vexpress/dcscb: handle platform coherency exit/setup and CCI
  2013-01-29  7:51 ` [PATCH v3 14/15] ARM: vexpress/dcscb: handle platform coherency exit/setup and CCI Nicolas Pitre
  2013-01-29 10:46   ` Lorenzo Pieralisi
@ 2013-02-01  6:15   ` Santosh Shilimkar
  1 sibling, 0 replies; 54+ messages in thread
From: Santosh Shilimkar @ 2013-02-01  6:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 29 January 2013 01:21 PM, Nicolas Pitre wrote:
> From: Dave Martin <dave.martin@linaro.org>
>
> Add the required code to properly handle race free platform coherency exit
> to the DCSCB power down method.
>
> The power_up_setup callback is used to enable the CCI interface for
> the cluster being brought up.  This must be done in assembly before
> the kernel environment is entered.
>
> Thanks to Achin Gupta and Nicolas Pitre for their help and
> contributions.
>
> Signed-off-by: Dave Martin <dave.martin@linaro.org>
> Signed-off-by: Nicolas Pitre <nico@linaro.org>
> ---
My concerns on this patch are already highlighted by Lorenzo.
Apart from that patch looks fine to me.

Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 11/15] ARM: vexpress/dcscb: do not hardcode number of CPUs per cluster
  2013-02-01  5:57   ` Santosh Shilimkar
@ 2013-02-01 17:24     ` Nicolas Pitre
  2013-02-02  6:54       ` Santosh Shilimkar
  0 siblings, 1 reply; 54+ messages in thread
From: Nicolas Pitre @ 2013-02-01 17:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 1 Feb 2013, Santosh Shilimkar wrote:

> On Tuesday 29 January 2013 01:21 PM, Nicolas Pitre wrote:
> > If 4 CPUs are assumed, the A15x1-A7x1 model configuration would never
> > shut down the initial cluster as the 0xf reset bit mask will never be
> > observed.  Let's construct this mask based on the provided information
> > in the DCSCB config register for the number of CPUs per cluster.
> > 
> > Signed-off-by: Nicolas Pitre <nico@linaro.org>
> > ---
> >   arch/arm/mach-vexpress/dcscb.c | 14 ++++++++++----
> >   1 file changed, 10 insertions(+), 4 deletions(-)
> > 
> > diff --git a/arch/arm/mach-vexpress/dcscb.c b/arch/arm/mach-vexpress/dcscb.c
> > index f993608944..8d363357ef 100644
> > --- a/arch/arm/mach-vexpress/dcscb.c
> > +++ b/arch/arm/mach-vexpress/dcscb.c
> > @@ -46,10 +46,12 @@ static arch_spinlock_t dcscb_lock =
> > __ARCH_SPIN_LOCK_UNLOCKED;
> > 
> >   static void __iomem *dcscb_base;
> >   static int dcscb_use_count[4][2];
> > +static int dcscb_mcpm_cpu_mask[2];
> s/2/MAX_CLUSTERS ?

No.  The DCSCB (*dual* cluster system control block) does manage only 2 
clusters, regardless of the MAX_CLUSTERS definition which might increase 
in the future.

> Apart from above minor question,
> Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 03/15] ARM: mcpm: introduce helpers for platform coherency exit/setup
  2013-02-01  5:10       ` Santosh Shilimkar
@ 2013-02-01 17:26         ` Nicolas Pitre
  0 siblings, 0 replies; 54+ messages in thread
From: Nicolas Pitre @ 2013-02-01 17:26 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 1 Feb 2013, Santosh Shilimkar wrote:

> On Thursday 31 January 2013 10:46 PM, Nicolas Pitre wrote:
> > On Thu, 31 Jan 2013, Santosh Shilimkar wrote:
> > 
> > > On Tuesday 29 January 2013 01:20 PM, Nicolas Pitre wrote:
> > > > From: Dave Martin <dave.martin@linaro.org>
> > > > 
> > > > This provides helper methods to coordinate between CPUs coming down
> > > > and CPUs going up, as well as documentation on the used algorithms,
> > > > so that cluster teardown and setup
> > > > operations are not done for a cluster simultaneously.
> > > > 
> > > > For use in the power_down() implementation:
> > > >     * __mcpm_cpu_going_down(unsigned int cluster, unsigned int cpu)
> > > >     * __mcpm_outbound_enter_critical(unsigned int cluster)
> > > >     * __mcpm_outbound_leave_critical(unsigned int cluster)
> > > >     * __mcpm_cpu_down(unsigned int cluster, unsigned int cpu)
> > > > 
> > > > The power_up_setup() helper should do platform-specific setup in
> > > > preparation for turning the CPU on, such as invalidating local caches
> > > > or entering coherency.  It must be assembler for now, since it must
> > > > run before the MMU can be switched on.  It is passed the affinity level
> > > > which should be initialized.
> > > > 
> > > > Because the mcpm_sync_struct content is looked-up and modified
> > > > with the cache enabled or disabled depending on the code path, it is
> > > > crucial to always ensure proper cache maintenance to update main memory
> > > > right away.  Therefore, any cached write must be followed by a cache
> > > > clean operation and any cached read must be preceded by a cache
> > > > invalidate operation (actually a cache flush i.e. clean+invalidate to
> > > > avoid discarding possible concurrent writes) on the accessed memory.
> > > > 
> > > > Also, in order to prevent a cached writer from interfering with an
> > > > adjacent non-cached writer, we ensure each state variable is located on
> > > > a separate cache line.
> > > > 
> > > > Thanks to Nicolas Pitre and Achin Gupta for the help with this
> > > > patch.
> > > > 
> > > > Signed-off-by: Dave Martin <dave.martin@linaro.org>
> > > > ---
> > > [..]
> > > 
> > > > diff --git a/arch/arm/common/mcpm_entry.c b/arch/arm/common/mcpm_entry.c
> > > > index c8c0e2113e..2b83121966 100644
> > > > --- a/arch/arm/common/mcpm_entry.c
> > > > +++ b/arch/arm/common/mcpm_entry.c
> > > > @@ -18,6 +18,7 @@
> > > >    #include <asm/proc-fns.h>
> > > >    #include <asm/cacheflush.h>
> > > >    #include <asm/idmap.h>
> > > > +#include <asm/cputype.h>
> > > > 
> > > >    extern volatile unsigned long
> > > > mcpm_entry_vectors[MAX_NR_CLUSTERS][MAX_CPUS_PER_CLUSTER];
> > > > 
> > > [...]
> > > 
> > > > +/*
> > > > + * Ensure preceding writes to *p by other CPUs are visible to
> > > > + * subsequent reads by this CPU.  We must be careful not to
> > > > + * discard data simultaneously written by another CPU, hence the
> > > > + * usage of flush rather than invalidate operations.
> > > > + */
> > > > +static void __sync_range_r(volatile void *p, size_t size)
> > > > +{
> > > > +	char *_p = (char *)p;
> > > > +
> > > > +#ifdef CONFIG_OUTER_CACHE
> > > > +	if (outer_cache.flush_range) {
> > > > +
> > > You don't need the above #ifdef. In the non-outer-cache
> > > case the function pointer is NULL anyway.
> > 
> > We do need the #ifdef, because if CONFIG_OUTER_CACHE is not selected
> > then the outer_cache structure simply doesn't exist.
> > 
> You are right.  An #ifdef in the middle of the code looks a bit ugly,
> hence I was thinking of avoiding it.
> 
> > > 		/*
> > > > +		 * Ensure dirty data migrated from other CPUs into our cache
> > > > +		 * are cleaned out safely before the outer cache is cleaned:
> > > > +		 */
> > > > +		__cpuc_clean_dcache_area(_p, size);
> > > > +
> > > > +		/* Clean and invalidate stale data for *p from outer ... */
> > > > +		outer_flush_range(__pa(_p), __pa(_p + size));
> > > > +	}
> > > > +#endif
> > > > +
> > > > +	/* ... and inner cache: */
> > > > +	__cpuc_flush_dcache_area(_p, size);
> > > This will be unnecessary when inner cache is available, no?
> > > Maybe you can rearrange the code like below, unless
> > > you would like to invalidate any speculative fetches during the
> > > outer_flush_range()
> > > 
> > > 	__cpuc_clean_dcache_area(_p, size);
> > > 	if (outer_cache.flush_range)
> > > 		outer_flush_range(__pa(_p), __pa(_p + size));
> > 
> > As you said, the code is sequenced that way to get rid of a potential
> > speculative fetch that could happen right before L2 is flushed.
> > 
> Thanks for clarifying it. It makes sense.

May I translate this into an ACK tag?  ;-)


Nicolas

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 07/15] ARM: vexpress: Select the correct SMP operations at run-time
  2013-02-01  5:41   ` Santosh Shilimkar
@ 2013-02-01 17:28     ` Nicolas Pitre
  0 siblings, 0 replies; 54+ messages in thread
From: Nicolas Pitre @ 2013-02-01 17:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 1 Feb 2013, Santosh Shilimkar wrote:

> On Tuesday 29 January 2013 01:21 PM, Nicolas Pitre wrote:
> > From: Jon Medhurst <tixy@linaro.org>
> > 
> The patch deserves couple of lines of description here.

Yes, Tixy has split it into 2 patches with proper description which 
should be part of the next round.


> 
> > Signed-off-by: Jon Medhurst <tixy@linaro.org>
> > ---
> >   arch/arm/include/asm/mach/arch.h |  3 +++
> >   arch/arm/kernel/setup.c          |  5 ++++-
> >   arch/arm/mach-vexpress/core.h    |  2 ++
> >   arch/arm/mach-vexpress/platsmp.c | 12 ++++++++++++
> >   arch/arm/mach-vexpress/v2m.c     |  2 +-
> >   5 files changed, 22 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/arm/include/asm/mach/arch.h
> > b/arch/arm/include/asm/mach/arch.h
> > index 917d4fcfd9..3d01c6d6c3 100644
> > --- a/arch/arm/include/asm/mach/arch.h
> > +++ b/arch/arm/include/asm/mach/arch.h
> > @@ -17,8 +17,10 @@ struct pt_regs;
> >   struct smp_operations;
> >   #ifdef CONFIG_SMP
> >   #define smp_ops(ops) (&(ops))
> > +#define smp_init_ops(ops) (&(ops))
> >   #else
> >   #define smp_ops(ops) (struct smp_operations *)NULL
> > +#define smp_init_ops(ops) (void (*)(void))NULL
> >   #endif
> > 
> >   struct machine_desc {
> > @@ -42,6 +44,7 @@ struct machine_desc {
> >   	unsigned char		reserve_lp2 :1;	/* never has lp2	*/
> >   	char			restart_mode;	/* default restart mode	*/
> >   	struct smp_operations	*smp;		/* SMP operations	*/
> > +	void			(*smp_init)(void);
> >   	void			(*fixup)(struct tag *, char **,
> >   					 struct meminfo *);
> >   	void			(*reserve)(void);/* reserve mem blocks	*/
> > diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
> > index 3f6cbb2e3e..41edca8582 100644
> > --- a/arch/arm/kernel/setup.c
> > +++ b/arch/arm/kernel/setup.c
> > @@ -768,7 +768,10 @@ void __init setup_arch(char **cmdline_p)
> >   	arm_dt_init_cpu_maps();
> >   #ifdef CONFIG_SMP
> >   	if (is_smp()) {
> > -		smp_set_ops(mdesc->smp);
> > +		if (mdesc->smp_init)
> > +			(*mdesc->smp_init)();
> > +		else
> > +			smp_set_ops(mdesc->smp);
> >   		smp_init_cpus();
> >   	}
> >   #endif
> > diff --git a/arch/arm/mach-vexpress/core.h b/arch/arm/mach-vexpress/core.h
> > index f134cd4a85..3a761fd76c 100644
> > --- a/arch/arm/mach-vexpress/core.h
> > +++ b/arch/arm/mach-vexpress/core.h
> > @@ -6,6 +6,8 @@
> > 
> >   void vexpress_dt_smp_map_io(void);
> > 
> > +void vexpress_smp_init_ops(void);
> > +
> >   extern struct smp_operations	vexpress_smp_ops;
> > 
> >   extern void vexpress_cpu_die(unsigned int cpu);
> > diff --git a/arch/arm/mach-vexpress/platsmp.c
> > b/arch/arm/mach-vexpress/platsmp.c
> > index c5d70de9bb..667344b479 100644
> > --- a/arch/arm/mach-vexpress/platsmp.c
> > +++ b/arch/arm/mach-vexpress/platsmp.c
> > @@ -12,6 +12,7 @@
> >   #include <linux/errno.h>
> >   #include <linux/smp.h>
> >   #include <linux/io.h>
> > +#include <linux/of.h>
> >   #include <linux/of_fdt.h>
> >   #include <linux/vexpress.h>
> > 
> > @@ -206,3 +207,14 @@ struct smp_operations __initdata vexpress_smp_ops = {
> >   	.cpu_die		= vexpress_cpu_die,
> >   #endif
> >   };
> > +
> > +void __init vexpress_smp_init_ops(void)
> > +{
> > +	struct smp_operations *ops = &vexpress_smp_ops;
> > +#ifdef CONFIG_CLUSTER_PM
> See if you can avoid this #ifdef in the middle of the function.
> 
> Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 11/15] ARM: vexpress/dcscb: do not hardcode number of CPUs per cluster
  2013-02-01 17:24     ` Nicolas Pitre
@ 2013-02-02  6:54       ` Santosh Shilimkar
  0 siblings, 0 replies; 54+ messages in thread
From: Santosh Shilimkar @ 2013-02-02  6:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Friday 01 February 2013 10:54 PM, Nicolas Pitre wrote:
> On Fri, 1 Feb 2013, Santosh Shilimkar wrote:
>
>> On Tuesday 29 January 2013 01:21 PM, Nicolas Pitre wrote:
>>> If 4 CPUs are assumed, the A15x1-A7x1 model configuration would never
>>> shut down the initial cluster as the 0xf reset bit mask will never be
>>> observed.  Let's construct this mask based on the provided information
>>> in the DCSCB config register for the number of CPUs per cluster.
>>>
>>> Signed-off-by: Nicolas Pitre <nico@linaro.org>
>>> ---
>>>    arch/arm/mach-vexpress/dcscb.c | 14 ++++++++++----
>>>    1 file changed, 10 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/arch/arm/mach-vexpress/dcscb.c b/arch/arm/mach-vexpress/dcscb.c
>>> index f993608944..8d363357ef 100644
>>> --- a/arch/arm/mach-vexpress/dcscb.c
>>> +++ b/arch/arm/mach-vexpress/dcscb.c
>>> @@ -46,10 +46,12 @@ static arch_spinlock_t dcscb_lock =
>>> __ARCH_SPIN_LOCK_UNLOCKED;
>>>
>>>    static void __iomem *dcscb_base;
>>>    static int dcscb_use_count[4][2];
>>> +static int dcscb_mcpm_cpu_mask[2];
>> s/2/MAX_CLUSTERS ?
>
> No.  The DCSCB (*dual* cluster system control block) does manage only 2
> clusters, regardless of the MAX_CLUSTERS definition which might increase
> in the future.
>
OK. Thanks for clarification.

Regards
Santosh

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 13/15] ARM: CCI: ensure powerdown-time data is flushed from cache
  2013-02-01  6:13   ` Santosh Shilimkar
@ 2013-02-02 22:23     ` Nicolas Pitre
  2013-02-03 10:07       ` Santosh Shilimkar
  0 siblings, 1 reply; 54+ messages in thread
From: Nicolas Pitre @ 2013-02-02 22:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 1 Feb 2013, Santosh Shilimkar wrote:

> On Tuesday 29 January 2013 01:21 PM, Nicolas Pitre wrote:
> > From: Dave Martin <dave.martin@linaro.org>
> > 
> > Non-local variables used by the CCI management function called after
> > disabling the cache must be flushed out to main memory in advance,
> > otherwise incoherency of those values may occur if they are sitting
> > in the cache of some other CPU when cci_disable() executes.
> > 
> Any CPU calling cci_disable() would have already cleaned its local
> cache, and the snoop unit should take care of syncing the shared data
> beforehand from other CPUs' local caches for shared accesses.
> Maybe I am unable to visualize the issue here or am missing some key
> point.

Let's suppose CPU0 initializes the CCI.  Without this patch, the CCI 
base address might be sitting in CPU0's cache.

The last CPU in a cluster to shut itself down is responsible for calling 
cci_disable().  And being the last, it is also responsible for flushing 
out its L1 and L2 caches before doing that.  If CPU0 went down before 
that, it did flush its L1 already. So the base address will be flushed 
to RAM in that case.

But suppose it is a CPU in _another_ cluster which is shutting down
and becoming the last man _there_.  It will flush its L1 and L2 cache
before calling cci_disable().  And because the cache is disabled at
that point, that CPU won't send any snoop request across to the other
cluster where CPU0 holds the base address in its L1 or even L2 cache.

This is why we must push that value out to RAM before cci_disable()
is used.
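
In code terms, the idea boils down to cleaning those values through
both cache levels once they are initialized; roughly like this (an
illustration only, not necessarily the literal patch code):

	/*
	 * Push the static pointer, and the descriptor it points to,
	 * out to RAM through L1 and the outer cache, so that a CPU
	 * reading them with its cache disabled sees current values.
	 */
	__cpuc_flush_dcache_area(&info, sizeof info);
	outer_clean_range(__pa(&info), __pa(&info) + sizeof info);
	__cpuc_flush_dcache_area(info, sizeof *info);
	outer_clean_range(__pa(info), __pa(info) + sizeof *info);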

> > This patch adds the appropriate flushing to the CCI driver to ensure
> > that the relevant data is available in RAM ahead of time.
> > 
> > Because this creates a dependency on arch-specific cacheflushing
> > functions, this patch also makes ARM_CCI depend on ARM.
> > 
> You should do that anyway, to avoid other architectures building this
> driver in random builds and breaking them.

Before this patch the driver was buildable on any architecture.  That's 
why this dependency is added only in this patch.

> > Signed-off-by: Dave Martin <dave.martin@linaro.org>
> > Signed-off-by: Nicolas Pitre <nico@linaro.org>
> > ---
> Patch is fine apart from the question.
> 
> Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
> 


Nicolas

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 13/15] ARM: CCI: ensure powerdown-time data is flushed from cache
  2013-02-02 22:23     ` Nicolas Pitre
@ 2013-02-03 10:07       ` Santosh Shilimkar
  2013-02-03 18:29         ` Nicolas Pitre
  0 siblings, 1 reply; 54+ messages in thread
From: Santosh Shilimkar @ 2013-02-03 10:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Sunday 03 February 2013 03:53 AM, Nicolas Pitre wrote:
> On Fri, 1 Feb 2013, Santosh Shilimkar wrote:
>
>> On Tuesday 29 January 2013 01:21 PM, Nicolas Pitre wrote:
>>> From: Dave Martin <dave.martin@linaro.org>
>>>
>>> Non-local variables used by the CCI management function called after
>>> disabling the cache must be flushed out to main memory in advance,
>>> otherwise incoherency of those values may occur if they are sitting
>>> in the cache of some other CPU when cci_disable() executes.
>>>
>> Any CPU calling cci_disable() would have already cleaned its local
>> cache, and the snoop unit should take care of syncing the shared data
>> beforehand from other CPUs' local caches for shared accesses.
>> Maybe I am unable to visualize the issue here or am missing some key
>> point.
>
> Let's suppose CPU0 initializes the CCI.  Without this patch, the CCI
> base address might be sitting in CPU0's cache.
>
> The last CPU in a cluster to shut itself down is responsible for calling
> cci_disable().  And being the last, it is also responsible for flushing
> out its L1 and L2 caches before doing that.  If CPU0 went down before
> that, it did flush its L1 already. So the base address will be flushed
> to RAM in that case.
>
Yes. This is valid case. Thanks for description.

> But suppose it is a CPU in _another_ cluster which is shutting down
> and becoming the last man _there_.  It will flush its L1 and L2 cache
> before calling cci_disable().  And because the cache is disabled at
> that point, that CPU won't send any snoop request across to the other
> cluster where CPU0 holds the base address in its L1 or even L2 cache.
>
> This is why we must push that value out to RAM before cci_disable()
> is used.
>
>>> This patch adds the appropriate flushing to the CCI driver to ensure
>>> that the relevant data is available in RAM ahead of time.
>>>
>>> Because this creates a dependency on arch-specific cacheflushing
>>> functions, this patch also makes ARM_CCI depend on ARM.
>>>
>> You should do that anyway, to avoid other architectures building this
>> driver in random builds and breaking them.
>
> Before this patch the driver was buildable on any architecture.  That's
> why this dependency is added only in this patch.
>
I was just trying to counter the reasoning in the changelog, which says
the dependency is added because of the arch-specific cache flushing
functions.  Meaning even without that, the ARM dependency should be in
place to avoid the driver getting built for other archs.

Regards,
Santosh

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 13/15] ARM: CCI: ensure powerdown-time data is flushed from cache
  2013-02-03 10:07       ` Santosh Shilimkar
@ 2013-02-03 18:29         ` Nicolas Pitre
  2013-02-04  5:25           ` Santosh Shilimkar
  0 siblings, 1 reply; 54+ messages in thread
From: Nicolas Pitre @ 2013-02-03 18:29 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, 3 Feb 2013, Santosh Shilimkar wrote:

> On Sunday 03 February 2013 03:53 AM, Nicolas Pitre wrote:
> > On Fri, 1 Feb 2013, Santosh Shilimkar wrote:
> > 
> > > On Tuesday 29 January 2013 01:21 PM, Nicolas Pitre wrote:
> > > > From: Dave Martin <dave.martin@linaro.org>
> > > > 
> > > > Non-local variables used by the CCI management function called after
> > > > disabling the cache must be flushed out to main memory in advance,
> > > > otherwise incoherency of those values may occur if they are sitting
> > > > in the cache of some other CPU when cci_disable() executes.
> > > > 
> > > Any CPU calling cci_disable() would have already cleaned its local
> > > cache, and the snoop unit should take care of syncing the shared data
> > > beforehand from other CPUs' local caches for shared accesses.
> > > Maybe I am unable to visualize the issue here or am missing some key
> > > point.
> > 
> > Let's suppose CPU0 initializes the CCI.  Without this patch, the CCI
> > base address might be sitting in CPU0's cache.
> > 
> > The last CPU in a cluster to shut itself down is responsible for calling
> > cci_disable().  And being the last, it is also responsible for flushing
> > out its L1 and L2 caches before doing that.  If CPU0 went down before
> > that, it did flush its L1 already. So the base address will be flushed
> > to RAM in that case.
> > 
> Yes. This is valid case. Thanks for description.
> 
> > But suppose it is a CPU in _another_ cluster which is shutting down
> > and becoming the last man _there_.  It will flush its L1 and L2 cache
> > before calling cci_disable().  And because the cache is disabled at
> > that point, that CPU won't send any snoop request across to the other
> > cluster where CPU0 holds the base address in its L1 or even L2 cache.
> > 
> > This is why we must push that value out to RAM before cci_disable()
> > is used.
> > 
> > > > This patch adds the appropriate flushing to the CCI driver to ensure
> > > > that the relevant data is available in RAM ahead of time.
> > > > 
> > > > Because this creates a dependency on arch-specific cacheflushing
> > > > functions, this patch also makes ARM_CCI depend on ARM.
> > > > 
> > > You should do that anyway, to avoid other architectures building this
> > > driver in random builds and breaking them.
> > 
> > Before this patch the driver was buildable on any architecture.  That's
> > why this dependency is added only in this patch.
> > 
> I was just trying to counter the reasoning in the changelog, which says
> the dependency is added because of the arch-specific cache flushing
> functions.  Meaning even without that, the ARM dependency should be in
> place to avoid the driver getting built for other archs.

Well, some upstream maintainers' opinion is that you should not put 
artificial dependencies on a specific architecture if a driver is 
buildable on any architecture, even if the built driver is of no use to 
those other architectures.


Nicolas

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 15/15] ARM: vexpress/dcscb: probe via device tree
  2013-01-31 10:54           ` Dave Martin
@ 2013-02-04  4:39             ` Nicolas Pitre
  0 siblings, 0 replies; 54+ messages in thread
From: Nicolas Pitre @ 2013-02-04  4:39 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 31 Jan 2013, Dave Martin wrote:

> On Wed, Jan 30, 2013 at 12:43:29PM -0500, Nicolas Pitre wrote:
> > On Wed, 30 Jan 2013, Achin Gupta wrote:
> > 
> > > On Tue, Jan 29, 2013 at 9:41 PM, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> > > > On Tue, 29 Jan 2013, Rob Herring wrote:
> > > >
> > > >> On 01/29/2013 01:51 AM, Nicolas Pitre wrote:
> > > >> > +   node = of_find_compatible_node(NULL, NULL, "arm,dcscb");
> > > >>
> > > >> This needs binding documentation and should be a more specific name. Not
> > > >> knowing what dcscb is, I don't have a suggestion.
> 
> The name is 100% specific.  The real problem seems to be that it's also
> very cryptic, and undocumented.

OK, here's my attempt at documenting it, folded with the latest fixes in 
the patch that introduced DCSCB support.  ACK?

---------- >8
From: Nicolas Pitre <nicolas.pitre@linaro.org>
Date: Wed, 2 May 2012 20:56:52 -0400
Subject: [PATCH] ARM: vexpress: introduce DCSCB support

This adds basic CPU and cluster reset controls on RTSM for the
A15x4-A7x4 model configuration using the Dual Cluster System
Configuration Block (DCSCB).

The cache coherency interconnect (CCI) is not handled yet.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
---
 .../devicetree/bindings/arm/rtsm-dcscb.txt         |  19 +++
 arch/arm/mach-vexpress/Kconfig                     |   8 +
 arch/arm/mach-vexpress/Makefile                    |   1 +
 arch/arm/mach-vexpress/dcscb.c                     | 162 +++++++++++++++++++++
 4 files changed, 190 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/rtsm-dcscb.txt b/Documentation/devicetree/bindings/arm/rtsm-dcscb.txt
new file mode 100644
index 0000000000..3b8fbf3c00
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/rtsm-dcscb.txt
@@ -0,0 +1,19 @@
+ARM Dual Cluster System Configuration Block
+-------------------------------------------
+
+The Dual Cluster System Configuration Block (DCSCB) provides basic
+functionality for controlling clocks, resets and configuration pins in
+the Dual Cluster System implemented by the Real-Time System Model (RTSM).
+
+Required properties:
+
+- compatible : should be "arm,rtsm,dcscb"
+
+- reg : physical base address and the size of the registers window
+
+Example:
+
+	dcscb@60000000 {
+		compatible = "arm,rtsm,dcscb";
+		reg = <0x60000000 0x1000>;
+	};
diff --git a/arch/arm/mach-vexpress/Kconfig b/arch/arm/mach-vexpress/Kconfig
index 52d315b792..f3f92b120a 100644
--- a/arch/arm/mach-vexpress/Kconfig
+++ b/arch/arm/mach-vexpress/Kconfig
@@ -52,4 +52,12 @@ config ARCH_VEXPRESS_CORTEX_A5_A9_ERRATA
 config ARCH_VEXPRESS_CA9X4
 	bool "Versatile Express Cortex-A9x4 tile"
 
+config ARCH_VEXPRESS_DCSCB
+	bool "Dual Cluster System Control Block (DCSCB) support"
+	depends on CLUSTER_PM
+	help
+	  Support for the Dual Cluster System Configuration Block (DCSCB).
+	  This is needed to provide CPU and cluster power management
+	  on RTSM.
+
 endmenu
diff --git a/arch/arm/mach-vexpress/Makefile b/arch/arm/mach-vexpress/Makefile
index 80b64971fb..2253644054 100644
--- a/arch/arm/mach-vexpress/Makefile
+++ b/arch/arm/mach-vexpress/Makefile
@@ -6,5 +6,6 @@ ccflags-$(CONFIG_ARCH_MULTIPLATFORM) := -I$(srctree)/$(src)/include \
 
 obj-y					:= v2m.o reset.o
 obj-$(CONFIG_ARCH_VEXPRESS_CA9X4)	+= ct-ca9x4.o
+obj-$(CONFIG_ARCH_VEXPRESS_DCSCB)	+= dcscb.o
 obj-$(CONFIG_SMP)			+= platsmp.o
 obj-$(CONFIG_HOTPLUG_CPU)		+= hotplug.o
diff --git a/arch/arm/mach-vexpress/dcscb.c b/arch/arm/mach-vexpress/dcscb.c
new file mode 100644
index 0000000000..07e835cb72
--- /dev/null
+++ b/arch/arm/mach-vexpress/dcscb.c
@@ -0,0 +1,162 @@
+/*
+ * arch/arm/mach-vexpress/dcscb.c - Dual Cluster System Configuration Block
+ *
+ * Created by:	Nicolas Pitre, May 2012
+ * Copyright:	(C) 2012-2013  Linaro Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/io.h>
+#include <linux/spinlock.h>
+#include <linux/errno.h>
+#include <linux/of_address.h>
+#include <linux/vexpress.h>
+
+#include <asm/mcpm_entry.h>
+#include <asm/proc-fns.h>
+#include <asm/cacheflush.h>
+#include <asm/cputype.h>
+#include <asm/cp15.h>
+
+
+#define RST_HOLD0	0x0
+#define RST_HOLD1	0x4
+#define SYS_SWRESET	0x8
+#define RST_STAT0	0xc
+#define RST_STAT1	0x10
+#define EAG_CFG_R	0x20
+#define EAG_CFG_W	0x24
+#define KFC_CFG_R	0x28
+#define KFC_CFG_W	0x2c
+#define DCS_CFG_R	0x30
+
+/*
+ * We can't use regular spinlocks. In the switcher case, it is possible
+ * for an outbound CPU to call power_down() after its inbound counterpart
+ * is already live using the same logical CPU number which trips lockdep
+ * debugging.
+ */
+static arch_spinlock_t dcscb_lock = __ARCH_SPIN_LOCK_UNLOCKED;
+
+static void __iomem *dcscb_base;
+
+static int dcscb_power_up(unsigned int cpu, unsigned int cluster)
+{
+	unsigned int rst_hold, cpumask = (1 << cpu);
+
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	if (cpu >= 4 || cluster >= 2)
+		return -EINVAL;
+
+	/*
+	 * Since this is called with IRQs enabled, and no arch_spin_lock_irq
+	 * variant exists, we need to disable IRQs manually here.
+	 */
+	local_irq_disable();
+	arch_spin_lock(&dcscb_lock);
+
+	rst_hold = readl_relaxed(dcscb_base + RST_HOLD0 + cluster * 4);
+	if (rst_hold & (1 << 8)) {
+		/* remove cluster reset and add individual CPU's reset */
+		rst_hold &= ~(1 << 8);
+		rst_hold |= 0xf;
+	}
+	rst_hold &= ~(cpumask | (cpumask << 4));
+	writel(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4);
+
+	arch_spin_unlock(&dcscb_lock);
+	local_irq_enable();
+
+	return 0;
+}
+
+static void dcscb_power_down(void)
+{
+	unsigned int mpidr, cpu, cluster, rst_hold, cpumask, last_man;
+
+	mpidr = read_cpuid_mpidr();
+	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+	cpumask = (1 << cpu);
+
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	BUG_ON(cpu >= 4 || cluster >= 2);
+
+	arch_spin_lock(&dcscb_lock);
+	rst_hold = readl_relaxed(dcscb_base + RST_HOLD0 + cluster * 4);
+	rst_hold |= cpumask;
+	if (((rst_hold | (rst_hold >> 4)) & 0xf) == 0xf)
+		rst_hold |= (1 << 8);
+	writel(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4);
+	arch_spin_unlock(&dcscb_lock);
+	last_man = (rst_hold & (1 << 8));
+
+	/*
+	 * Now let's clean our L1 cache and shut ourself down.
+	 * If we're the last CPU in this cluster then clean L2 too.
+	 */
+
+	/*
+	 * A15/A7 can hit in the cache with SCTLR.C=0, so we don't need
+	 * a preliminary flush here for those CPUs.  At least, that's
+	 * the theory -- without the extra flush, Linux explodes on
+	 * RTSM (maybe not needed anymore, to be investigated)..
+	 */
+	flush_cache_louis();
+	set_cr(get_cr() & ~CR_C);
+
+	if (!last_man) {
+		flush_cache_louis();
+	} else {
+		flush_cache_all();
+		outer_flush_all();
+	}
+
+	/* Disable local coherency by clearing the ACTLR "SMP" bit: */
+	set_auxcr(get_auxcr() & ~(1 << 6));
+
+	/* Now we are prepared for power-down, do it: */
+	dsb();
+	wfi();
+
+	/* Not dead at this point?  Let our caller cope. */
+}
+
+static const struct mcpm_platform_ops dcscb_power_ops = {
+	.power_up	= dcscb_power_up,
+	.power_down	= dcscb_power_down,
+};
+
+static int __init dcscb_init(void)
+{
+	struct device_node *node;
+	int ret;
+
+	node = of_find_compatible_node(NULL, NULL, "arm,rtsm,dcscb");
+	if (!node)
+		return -ENODEV;
+	dcscb_base = of_iomap(node, 0);
+	if (!dcscb_base)
+		return -EADDRNOTAVAIL;
+
+	ret = mcpm_platform_register(&dcscb_power_ops);
+	if (ret) {
+		iounmap(dcscb_base);
+		return ret;
+	}
+
+	/*
+	 * Future entries into the kernel can now go
+	 * through the cluster entry vectors.
+	 */
+	vexpress_flags_set(virt_to_phys(mcpm_entry_point));
+
+	return 0;
+}
+
+early_initcall(dcscb_init);

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 13/15] ARM: CCI: ensure powerdown-time data is flushed from cache
  2013-02-03 18:29         ` Nicolas Pitre
@ 2013-02-04  5:25           ` Santosh Shilimkar
  0 siblings, 0 replies; 54+ messages in thread
From: Santosh Shilimkar @ 2013-02-04  5:25 UTC (permalink / raw)
  To: linux-arm-kernel

On Sunday 03 February 2013 11:59 PM, Nicolas Pitre wrote:
> On Sun, 3 Feb 2013, Santosh Shilimkar wrote:
>
>> On Sunday 03 February 2013 03:53 AM, Nicolas Pitre wrote:
>>> On Fri, 1 Feb 2013, Santosh Shilimkar wrote:
>>>
>>>> On Tuesday 29 January 2013 01:21 PM, Nicolas Pitre wrote:
>>>>> From: Dave Martin <dave.martin@linaro.org>

[..]

>>>>> This patch adds the appropriate flushing to the CCI driver to ensure
>>>>> that the relevant data is available in RAM ahead of time.
>>>>>
>>>>> Because this creates a dependency on arch-specific cacheflushing
>>>>> functions, this patch also makes ARM_CCI depend on ARM.
>>>>>
>>>> You should do that anyway, to avoid other architectures building this
>>>> driver in random builds and breaking them.
>>>
>>> Before this patch the driver was buildable on any architecture.  That's
>>> why this dependency is added only in this patch.
>>>
>> I was just trying to counter the reasoning in the changelog, which says
>> the dependency is added because of the arch-specific cache flushing
>> functions.  Meaning even without that, the ARM dependency should be in
>> place to avoid the driver getting built for other archs.
>
> Well, some upstream maintainers' opinion is that you should not put
> artificial dependencies on a specific architecture if a driver is
> buildable on any architecture, even if the built driver is of no use to
> those other architectures.
>
I see.  I have seen some complaints about not adding the arch dependency.
Anyway, the patch has the dependency set on ARM, and hence it's a
non-issue.

Regards,
Santosh

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 00/15] multi-cluster power management
  2013-01-29  7:50 [PATCH v3 00/15] multi-cluster power management Nicolas Pitre
                   ` (14 preceding siblings ...)
  2013-01-29  7:51 ` [PATCH v3 15/15] ARM: vexpress/dcscb: probe via device tree Nicolas Pitre
@ 2013-02-04 14:24 ` Will Deacon
  2013-02-04 20:59   ` Nicolas Pitre
  15 siblings, 1 reply; 54+ messages in thread
From: Will Deacon @ 2013-02-04 14:24 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Nicolas,

On Tue, Jan 29, 2013 at 07:50:55AM +0000, Nicolas Pitre wrote:
> This is version 3 of the patch series required to safely power up
> and down CPUs in a cluster as can be found in b.L systems.  Also
> included are the needed patches to allow CPU hotplug on RTSM configured
> for big.LITTLE.

[...]

For patches 1-6 and 8:

  Reviewed-by: Will Deacon <will.deacon@arm.com>

Please keep me posted when you have a crack at removing the compile-time
cluster/cpu limits!

Cheers,

Will

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 00/15] multi-cluster power management
  2013-02-04 14:24 ` [PATCH v3 00/15] multi-cluster power management Will Deacon
@ 2013-02-04 20:59   ` Nicolas Pitre
  0 siblings, 0 replies; 54+ messages in thread
From: Nicolas Pitre @ 2013-02-04 20:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 4 Feb 2013, Will Deacon wrote:

> Hi Nicolas,
> 
> On Tue, Jan 29, 2013 at 07:50:55AM +0000, Nicolas Pitre wrote:
> > This is version 3 of the patch series required to safely power up
> > and down CPUs in a cluster as can be found in b.L systems.  Also
> > included are the needed patches to allow CPU hotplug on RTSM configured
> > for big.LITTLE.
> 
> [...]
> 
> For patches 1-6 and 8:
> 
>   Reviewed-by: Will Deacon <will.deacon@arm.com>

Thanks.

> Please keep me posted when you have a crack at removing the compile-time
> cluster/cpu limits!

The trick would entail standard memory allocation from the boot CPU,
with the physical addresses of those allocations passed to the assembly
code.

Then we have this code:

	mrc     p15, 0, r0, c0, c0, 5           @ MPIDR
	ubfx    r9, r0, #0, #8                  @ r9 = cpu
	ubfx    r10, r0, #8, #8                 @ r10 = cluster
	mov     r3, #MAX_CPUS_PER_CLUSTER
	mla     r4, r3, r10, r9                 @ r4 = canonical CPU index
	cmp     r4, #(MAX_CPUS_PER_CLUSTER * MAX_NR_CLUSTERS)
	blo     2f
	... out-of-bound code here

Those constants just need to be turned into global variables as well.  
Or if we want to be really fancy, we could do some trivial instruction 
rewriting.
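
For illustration, the same canonical index computation in C, using the
MPIDR accessors this series already converted to (a sketch, not part of
the patches):

	#include <asm/cputype.h>

	unsigned int mpidr = read_cpuid_mpidr();
	unsigned int cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);	/* r9 above */
	unsigned int cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);	/* r10 above */
	unsigned int index = cluster * MAX_CPUS_PER_CLUSTER + cpu;	/* r4 */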

Still, for the foreseeable future, going with dynamic allocation isn't
going to provide much of a gain, while the compile-time limits keep the
code simple as people get familiar with it.


Nicolas

^ permalink raw reply	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2013-02-04 20:59 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-01-29  7:50 [PATCH v3 00/15] multi-cluster power management Nicolas Pitre
2013-01-29  7:50 ` [PATCH v3 01/15] ARM: multi-cluster PM: secondary kernel entry code Nicolas Pitre
2013-01-31 15:45   ` Santosh Shilimkar
2013-01-29  7:50 ` [PATCH v3 02/15] ARM: mcpm: introduce the CPU/cluster power API Nicolas Pitre
2013-01-31 15:55   ` Santosh Shilimkar
2013-01-29  7:50 ` [PATCH v3 03/15] ARM: mcpm: introduce helpers for platform coherency exit/setup Nicolas Pitre
2013-01-31 16:08   ` Santosh Shilimkar
2013-01-31 17:16     ` Nicolas Pitre
2013-02-01  5:10       ` Santosh Shilimkar
2013-02-01 17:26         ` Nicolas Pitre
2013-01-29  7:50 ` [PATCH v3 04/15] ARM: mcpm: Add baremetal voting mutexes Nicolas Pitre
2013-02-01  5:29   ` Santosh Shilimkar
2013-01-29  7:51 ` [PATCH v3 05/15] ARM: mcpm_head.S: vlock-based first man election Nicolas Pitre
2013-02-01  5:34   ` Santosh Shilimkar
2013-01-29  7:51 ` [PATCH v3 06/15] ARM: mcpm: generic SMP secondary bringup and hotplug support Nicolas Pitre
2013-01-29 20:38   ` Rob Herring
2013-02-01  5:38   ` Santosh Shilimkar
2013-01-29  7:51 ` [PATCH v3 07/15] ARM: vexpress: Select the correct SMP operations at run-time Nicolas Pitre
2013-01-29 15:43   ` Jon Medhurst (Tixy)
2013-01-29 19:26     ` Nicolas Pitre
2013-02-01  5:41   ` Santosh Shilimkar
2013-02-01 17:28     ` Nicolas Pitre
2013-01-29  7:51 ` [PATCH v3 08/15] ARM: introduce common set_auxcr/get_auxcr functions Nicolas Pitre
2013-02-01  5:44   ` Santosh Shilimkar
2013-01-29  7:51 ` [PATCH v3 09/15] ARM: vexpress: introduce DCSCB support Nicolas Pitre
2013-02-01  5:50   ` Santosh Shilimkar
2013-01-29  7:51 ` [PATCH v3 10/15] ARM: vexpress/dcscb: add CPU use counts to the power up/down API implementation Nicolas Pitre
2013-02-01  5:53   ` Santosh Shilimkar
2013-01-29  7:51 ` [PATCH v3 11/15] ARM: vexpress/dcscb: do not hardcode number of CPUs per cluster Nicolas Pitre
2013-02-01  5:57   ` Santosh Shilimkar
2013-02-01 17:24     ` Nicolas Pitre
2013-02-02  6:54       ` Santosh Shilimkar
2013-01-29  7:51 ` [PATCH v3 12/15] drivers/bus: add ARM CCI support Nicolas Pitre
2013-02-01  6:01   ` Santosh Shilimkar
2013-01-29  7:51 ` [PATCH v3 13/15] ARM: CCI: ensure powerdown-time data is flushed from cache Nicolas Pitre
2013-02-01  6:13   ` Santosh Shilimkar
2013-02-02 22:23     ` Nicolas Pitre
2013-02-03 10:07       ` Santosh Shilimkar
2013-02-03 18:29         ` Nicolas Pitre
2013-02-04  5:25           ` Santosh Shilimkar
2013-01-29  7:51 ` [PATCH v3 14/15] ARM: vexpress/dcscb: handle platform coherency exit/setup and CCI Nicolas Pitre
2013-01-29 10:46   ` Lorenzo Pieralisi
2013-01-29 18:42     ` Nicolas Pitre
2013-01-30 17:27       ` Lorenzo Pieralisi
2013-02-01  6:15   ` Santosh Shilimkar
2013-01-29  7:51 ` [PATCH v3 15/15] ARM: vexpress/dcscb: probe via device tree Nicolas Pitre
2013-01-29 21:01   ` Rob Herring
2013-01-29 21:41     ` Nicolas Pitre
2013-01-30 12:22       ` Achin Gupta
2013-01-30 17:43         ` Nicolas Pitre
2013-01-31 10:54           ` Dave Martin
2013-02-04  4:39             ` Nicolas Pitre
2013-02-04 14:24 ` [PATCH v3 00/15] multi-cluster power management Will Deacon
2013-02-04 20:59   ` Nicolas Pitre
