linux-kernel.vger.kernel.org archive mirror
* [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup
@ 2019-03-27 13:21 Abel Vesa
  2019-03-27 13:21 ` [RFC 1/7] sched: idle: Add sched get idle state helper Abel Vesa
                   ` (7 more replies)
  0 siblings, 8 replies; 28+ messages in thread
From: Abel Vesa @ 2019-03-27 13:21 UTC
  To: Sudeep Holla, Marc Zyngier, Rob Herring, Mark Rutland, Shawn Guo,
	Sascha Hauer, catalin.marinas, Will Deacon, Rafael J. Wysocki,
	Lorenzo Pieralisi, Fabio Estevam, Lucas Stach, Aisheng Dong
  Cc: dl-linux-imx, linux-arm-kernel, Linux Kernel Mailing List,
	linux-pm, Abel Vesa

This work is a workaround I'm looking into (more as a background task)
in order to add support for cpuidle on i.MX8MQ based platforms.

The main idea here is to get around the missing GIC wake_request signal
(due to an integration design issue) by waking up each individual core through
some dedicated SW power-up bits inside the power controller (GPC) right before
every IPI is requested for that core.

This work is basically composed of four parts (in kernel):

 - the cpuidle core poking mechanism along with the related sched/irq_work calls
 - the cpuidle-arm ops addition in order to support poking, along with the
   'local-wakeup-poke' DT idle state knob
 - the psci and cpu_ops cpu_poke addition
 - the i.MX8MQ specific idle states in dts
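
The ordering described above can be sketched as a small userspace model.
This is only an illustration under stated assumptions: core_model, gpc_poke,
raise_ipi and poke_then_ipi are hypothetical names, not kernel or GPC
interfaces, and the model simply assumes that an IPI requested while the
core is powered down never wakes it.

```c
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical per-core model; none of this is kernel code. */
struct core_model {
	bool powered_down;	/* core is in a state the GIC cannot wake it from */
	int pokes;		/* SW power-up requests seen by the core */
	int ipis_handled;	/* IPIs the core actually got to handle */
};

static struct core_model cores[NR_CPUS];

/* Stand-in for setting the core's SW power-up bit in the GPC. */
static void gpc_poke(int cpu)
{
	cores[cpu].pokes++;
	cores[cpu].powered_down = false;
}

/*
 * Stand-in for requesting an IPI. With the wake_request signal missing,
 * a powered-down core is assumed never to notice it.
 */
static bool raise_ipi(int cpu)
{
	if (cores[cpu].powered_down)
		return false;

	cores[cpu].ipis_handled++;
	return true;
}

/* The ordering the series aims for: poke first, then request the IPI. */
static bool poke_then_ipi(int cpu)
{
	if (cores[cpu].powered_down)
		gpc_poke(cpu);

	return raise_ipi(cpu);
}
```

In this model, raise_ipi() on a powered-down core returns false, while
poke_then_ipi() on the same core returns true after exactly one poke. A core
that is already running is not poked at all.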

There is also a change needed in TF-A which is available here:

  https://lists.trustedfirmware.org/pipermail/tf-a/2019-March/000009.html

Abel Vesa (7):
  sched: idle: Add sched get idle state helper
  cpuidle: Add cpu poke support
  smp: Poke the cores before requesting IPI
  psci: Add cpu_poke ops to support core poking
  cpuidle-arm: Add ops to support poke alongside enter
  cpuidle-arm: Add arm64 wake helper for cpu_poke op
  arm64: dts: imx8mq: Add cpu-sleep state with poke wake-up enabled

 arch/arm64/boot/dts/freescale/imx8mq.dtsi | 20 ++++++++++++++++++
 arch/arm64/include/asm/cpu_ops.h          |  1 +
 arch/arm64/include/asm/cpuidle.h          |  6 ++++++
 arch/arm64/kernel/cpuidle.c               |  8 ++++++++
 arch/arm64/kernel/psci.c                  |  1 +
 drivers/cpuidle/cpuidle-arm.c             | 13 +++++++++++-
 drivers/cpuidle/cpuidle.c                 | 34 +++++++++++++++++++++++++++++++
 drivers/cpuidle/dt_idle_states.c          | 15 +++++++++-----
 drivers/cpuidle/dt_idle_states.h          | 10 +++++++++
 drivers/firmware/psci.c                   |  6 ++++++
 include/linux/cpuidle.h                   |  7 +++++++
 include/linux/psci.h                      |  1 +
 include/uapi/linux/psci.h                 |  2 ++
 kernel/irq_work.c                         | 19 ++++++++++++++---
 kernel/sched/core.c                       | 16 ++++++++++-----
 kernel/sched/idle.c                       | 11 ++++++++++
 kernel/smp.c                              | 10 ++++++++-
 kernel/time/tick-broadcast.c              |  4 ++++
 18 files changed, 169 insertions(+), 15 deletions(-)

-- 
2.7.4



* [RFC 1/7] sched: idle: Add sched get idle state helper
  2019-03-27 13:21 [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup Abel Vesa
@ 2019-03-27 13:21 ` Abel Vesa
  2019-03-27 13:21 ` [RFC 2/7] cpuidle: Add cpu poke support Abel Vesa
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 28+ messages in thread
From: Abel Vesa @ 2019-03-27 13:21 UTC
  To: Sudeep Holla, Marc Zyngier, Rob Herring, Mark Rutland, Shawn Guo,
	Sascha Hauer, catalin.marinas, Will Deacon, Rafael J. Wysocki,
	Lorenzo Pieralisi, Fabio Estevam, Lucas Stach, Aisheng Dong
  Cc: dl-linux-imx, linux-arm-kernel, Linux Kernel Mailing List,
	linux-pm, Abel Vesa

This helper is useful in order to get the idle state of a specific cpu.
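
A minimal userspace sketch of what such a helper amounts to, assuming (as
the diff below suggests) that each CPU's runqueue carries a pointer to the
idle state it is currently in. rq_model, model_set_idle_state and
model_get_idle_state are hypothetical names used only for illustration, and
treating NULL as "not in any idle state" is likewise an assumption of this
model.

```c
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical stand-ins for struct cpuidle_state and struct rq. */
struct idle_state_model {
	const char *name;
};

struct rq_model {
	const struct idle_state_model *idle_state;	/* NULL: not idle */
};

static struct rq_model runqueues[NR_CPUS];

/* Record the idle state a CPU is entering (NULL when it leaves idle). */
static void model_set_idle_state(int cpu, const struct idle_state_model *state)
{
	runqueues[cpu].idle_state = state;
}

/* Look up the idle state of any CPU, not just the calling one. */
static const struct idle_state_model *model_get_idle_state(int cpu)
{
	return runqueues[cpu].idle_state;
}
```

The point of the getter is that a CPU about to send an IPI can inspect the
state of the target CPU rather than its own.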

Signed-off-by: Abel Vesa <abel.vesa@nxp.com>
---
 include/linux/cpuidle.h |  1 +
 kernel/sched/idle.c     | 11 +++++++++++
 2 files changed, 12 insertions(+)

diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 3b39472..88a9119 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -211,6 +211,7 @@ static inline void cpuidle_use_deepest_state(bool enable)
 
 /* kernel/sched/idle.c */
 extern void sched_idle_set_state(struct cpuidle_state *idle_state);
+extern struct cpuidle_state *sched_idle_get_state(int cpu);
 extern void default_idle_call(void);
 
 #ifdef CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index f5516ba..484825d 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -21,6 +21,17 @@ void sched_idle_set_state(struct cpuidle_state *idle_state)
 	idle_set_state(this_rq(), idle_state);
 }
 
+/**
+ * sched_idle_get_state - Get idle state for the specified CPU.
+ * @cpu: CPU index.
+ */
+
+struct cpuidle_state *sched_idle_get_state(int cpu)
+{
+	return idle_get_state(cpu_rq(cpu));
+}
+
+
 static int __read_mostly cpu_idle_force_poll;
 
 void cpu_idle_poll_ctrl(bool enable)
-- 
2.7.4



* [RFC 2/7] cpuidle: Add cpu poke support
  2019-03-27 13:21 [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup Abel Vesa
  2019-03-27 13:21 ` [RFC 1/7] sched: idle: Add sched get idle state helper Abel Vesa
@ 2019-03-27 13:21 ` Abel Vesa
  2019-03-27 13:21 ` [RFC 3/7] smp: Poke the cores before requesting IPI Abel Vesa
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 28+ messages in thread
From: Abel Vesa @ 2019-03-27 13:21 UTC
  To: Sudeep Holla, Marc Zyngier, Rob Herring, Mark Rutland, Shawn Guo,
	Sascha Hauer, catalin.marinas, Will Deacon, Rafael J. Wysocki,
	Lorenzo Pieralisi, Fabio Estevam, Lucas Stach, Aisheng Dong
  Cc: dl-linux-imx, linux-arm-kernel, Linux Kernel Mailing List,
	linux-pm, Abel Vesa

Having a poke operation per state allows each cpuidle driver to
implement, for each state, a different way of waking up (poking) cores.
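
The dispatch this enables can be sketched in plain C. The sketch assumes a
bitmask in place of struct cpumask and a per-CPU pointer to the current
state. poke_state_model, model_poke_single and model_poke_mask are
hypothetical names, not the functions added by this patch.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 4

/* Hypothetical idle state: poke is NULL when no poke is needed. */
struct poke_state_model {
	const char *name;
	int (*poke)(int cpu);
};

static int poke_calls[NR_CPUS];

/* Example poke op; a real one would reach into platform code. */
static int example_poke(int cpu)
{
	poke_calls[cpu]++;
	return 0;
}

static const struct poke_state_model shallow_state = { "wfi", NULL };
static const struct poke_state_model deep_state = { "cpu-sleep", example_poke };

/* Current idle state per CPU; NULL means the CPU is not idle. */
static const struct poke_state_model *current_state[NR_CPUS];

/* Poke one CPU only if its current state provides a poke op. */
static int model_poke_single(int cpu)
{
	const struct poke_state_model *state = current_state[cpu];

	if (state && state->poke)
		return state->poke(cpu);

	return 0;
}

/* Poke every CPU set in mask; returns how many poke ops failed. */
static int model_poke_mask(uint32_t mask)
{
	int failures = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if ((mask & (UINT32_C(1) << cpu)) && model_poke_single(cpu))
			failures++;
	}

	return failures;
}
```

Only CPUs that are both in the mask and in a state with a poke op get
poked. Running CPUs and CPUs in states without one are skipped silently.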

Signed-off-by: Abel Vesa <abel.vesa@nxp.com>
---
 drivers/cpuidle/cpuidle.c | 34 ++++++++++++++++++++++++++++++++++
 include/linux/cpuidle.h   |  6 ++++++
 2 files changed, 40 insertions(+)

diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 7f10830..fca5313 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -297,6 +297,29 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv,
 }
 
 /**
+ * cpuidle_poke_single - poke the specified cpu to wake up from
+ *		         current idle state
+ *
+ * @dev: cpuidle device for this cpu
+ * @drv: cpuidle driver for this cpu
+ * @cpu: the index of the cpu
+ */
+int cpuidle_poke_single(struct cpuidle_driver *drv, struct cpuidle_device *dev,
+			int cpu)
+{
+	struct cpuidle_state *state;
+
+	if (cpuidle_disabled())
+		return 0;
+
+	state = sched_idle_get_state(cpu);
+	if (state && state->poke)
+		return state->poke(dev, drv, cpu);
+
+	return 0;
+}
+
+/**
  * cpuidle_select - ask the cpuidle framework to choose an idle state
  *
  * @drv: the cpuidle driver
@@ -414,6 +437,17 @@ void cpuidle_resume(void)
 	mutex_unlock(&cpuidle_lock);
 }
 
+void cpuidle_poke(const struct cpumask *mask)
+{
+	struct cpuidle_device *dev = cpuidle_get_device();
+	struct cpuidle_driver *drv = cpuidle_get_cpu_driver(dev);
+	int cpu;
+
+	for_each_cpu(cpu, mask) {
+		WARN_ON(cpuidle_poke_single(drv, dev, cpu));
+	}
+}
+
 /**
  * cpuidle_enable_device - enables idle PM for a CPU
  * @dev: the CPU
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 88a9119..0270771 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -55,6 +55,10 @@ struct cpuidle_state {
 			struct cpuidle_driver *drv,
 			int index);
 
+	int (*poke)	(struct cpuidle_device *dev,
+			struct cpuidle_driver *drv,
+			int cpu);
+
 	int (*enter_dead) (struct cpuidle_device *dev, int index);
 
 	/*
@@ -145,6 +149,7 @@ extern void cpuidle_unregister(struct cpuidle_driver *drv);
 extern void cpuidle_pause_and_lock(void);
 extern void cpuidle_resume_and_unlock(void);
 extern void cpuidle_pause(void);
+extern void cpuidle_poke(const struct cpumask *mask);
 extern void cpuidle_resume(void);
 extern int cpuidle_enable_device(struct cpuidle_device *dev);
 extern void cpuidle_disable_device(struct cpuidle_device *dev);
@@ -181,6 +186,7 @@ static inline void cpuidle_unregister(struct cpuidle_driver *drv) { }
 static inline void cpuidle_pause_and_lock(void) { }
 static inline void cpuidle_resume_and_unlock(void) { }
 static inline void cpuidle_pause(void) { }
+static inline void cpuidle_poke(const struct cpumask *mask) { }
 static inline void cpuidle_resume(void) { }
 static inline int cpuidle_enable_device(struct cpuidle_device *dev)
 {return -ENODEV; }
-- 
2.7.4



* [RFC 3/7] smp: Poke the cores before requesting IPI
  2019-03-27 13:21 [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup Abel Vesa
  2019-03-27 13:21 ` [RFC 1/7] sched: idle: Add sched get idle state helper Abel Vesa
  2019-03-27 13:21 ` [RFC 2/7] cpuidle: Add cpu poke support Abel Vesa
@ 2019-03-27 13:21 ` Abel Vesa
  2019-03-27 13:21 ` [RFC 4/7] psci: Add cpu_poke ops to support core poking Abel Vesa
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 28+ messages in thread
From: Abel Vesa @ 2019-03-27 13:21 UTC
  To: Sudeep Holla, Marc Zyngier, Rob Herring, Mark Rutland, Shawn Guo,
	Sascha Hauer, catalin.marinas, Will Deacon, Rafael J. Wysocki,
	Lorenzo Pieralisi, Fabio Estevam, Lucas Stach, Aisheng Dong
  Cc: dl-linux-imx, linux-arm-kernel, Linux Kernel Mailing List,
	linux-pm, Abel Vesa

Poke the specified core(s) every time before an IPI is requested. This
allows the cpuidle driver to do whatever the current idle state of the
specified core(s) requires, if anything.

Signed-off-by: Abel Vesa <abel.vesa@nxp.com>
---
 kernel/irq_work.c            | 19 ++++++++++++++++---
 kernel/sched/core.c          | 16 +++++++++++-----
 kernel/smp.c                 | 10 +++++++++-
 kernel/time/tick-broadcast.c |  4 ++++
 4 files changed, 40 insertions(+), 9 deletions(-)

diff --git a/kernel/irq_work.c b/kernel/irq_work.c
index 6b7cdf1..deca898 100644
--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -17,6 +17,7 @@
 #include <linux/cpu.h>
 #include <linux/notifier.h>
 #include <linux/smp.h>
+#include <linux/cpuidle.h>
 #include <asm/processor.h>
 
 
@@ -76,8 +77,12 @@ bool irq_work_queue_on(struct irq_work *work, int cpu)
 	if (!irq_work_claim(work))
 		return false;
 
-	if (llist_add(&work->llnode, &per_cpu(raised_list, cpu)))
+	if (llist_add(&work->llnode, &per_cpu(raised_list, cpu))) {
+		/* Poke the cpu through cpuidle first */
+		cpuidle_poke(cpumask_of(cpu));
+
 		arch_send_call_function_single_ipi(cpu);
+	}
 
 #else /* #ifdef CONFIG_SMP */
 	irq_work_queue(work);
@@ -99,11 +104,19 @@ bool irq_work_queue(struct irq_work *work)
 	/* If the work is "lazy", handle it from next tick if any */
 	if (work->flags & IRQ_WORK_LAZY) {
 		if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) &&
-		    tick_nohz_tick_stopped())
+		    tick_nohz_tick_stopped()) {
+			/* Poke the cpus through cpuidle first */
+			cpuidle_poke(cpumask_of(smp_processor_id()));
+
 			arch_irq_work_raise();
+		}
 	} else {
-		if (llist_add(&work->llnode, this_cpu_ptr(&raised_list)))
+		if (llist_add(&work->llnode, this_cpu_ptr(&raised_list))) {
+			/* Poke the cpus through cpuidle first */
+			cpuidle_poke(cpumask_of(smp_processor_id()));
+
 			arch_irq_work_raise();
+		}
 	}
 
 	preempt_enable();
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4778c48..7be9dba 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -126,6 +126,12 @@ struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
 	}
 }
 
+static void smp_poke_and_send_reschedule(int cpu)
+{
+	cpuidle_poke(cpumask_of(cpu));
+	smp_send_reschedule(cpu);
+}
+
 /*
  * RQ-clock updating methods:
  */
@@ -511,7 +517,7 @@ void resched_curr(struct rq *rq)
 	}
 
 	if (set_nr_and_not_polling(curr))
-		smp_send_reschedule(cpu);
+		smp_poke_and_send_reschedule(cpu);
 	else
 		trace_sched_wake_idle_without_ipi(cpu);
 }
@@ -583,7 +589,7 @@ static void wake_up_idle_cpu(int cpu)
 		return;
 
 	if (set_nr_and_not_polling(rq->idle))
-		smp_send_reschedule(cpu);
+		smp_poke_and_send_reschedule(cpu);
 	else
 		trace_sched_wake_idle_without_ipi(cpu);
 }
@@ -1471,7 +1477,7 @@ void kick_process(struct task_struct *p)
 	preempt_disable();
 	cpu = task_cpu(p);
 	if ((cpu != smp_processor_id()) && task_curr(p))
-		smp_send_reschedule(cpu);
+		smp_poke_and_send_reschedule(cpu);
 	preempt_enable();
 }
 EXPORT_SYMBOL_GPL(kick_process);
@@ -1836,7 +1842,7 @@ static void ttwu_queue_remote(struct task_struct *p, int cpu, int wake_flags)
 
 	if (llist_add(&p->wake_entry, &cpu_rq(cpu)->wake_list)) {
 		if (!set_nr_if_polling(rq->idle))
-			smp_send_reschedule(cpu);
+			smp_poke_and_send_reschedule(cpu);
 		else
 			trace_sched_wake_idle_without_ipi(cpu);
 	}
@@ -1857,7 +1863,7 @@ void wake_up_if_idle(int cpu)
 	} else {
 		rq_lock_irqsave(rq, &rf);
 		if (is_idle_task(rq->curr))
-			smp_send_reschedule(cpu);
+			smp_poke_and_send_reschedule(cpu);
 		/* Else CPU is not idle, do nothing here: */
 		rq_unlock_irqrestore(rq, &rf);
 	}
diff --git a/kernel/smp.c b/kernel/smp.c
index f4cf1b0..f6b2ce7 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -17,6 +17,7 @@
 #include <linux/smp.h>
 #include <linux/cpu.h>
 #include <linux/sched.h>
+#include <linux/cpuidle.h>
 #include <linux/sched/idle.h>
 #include <linux/hypervisor.h>
 
@@ -175,8 +176,12 @@ static int generic_exec_single(int cpu, call_single_data_t *csd,
 	 * locking and barrier primitives. Generic code isn't really
 	 * equipped to do the right thing...
 	 */
-	if (llist_add(&csd->llist, &per_cpu(call_single_queue, cpu)))
+	if (llist_add(&csd->llist, &per_cpu(call_single_queue, cpu))) {
+		/* Poke the cpus through cpuidle first */
+		cpuidle_poke(cpumask_of(cpu));
+
 		arch_send_call_function_single_ipi(cpu);
+	}
 
 	return 0;
 }
@@ -457,6 +462,9 @@ void smp_call_function_many(const struct cpumask *mask,
 			__cpumask_set_cpu(cpu, cfd->cpumask_ipi);
 	}
 
+	/* Poke the cpus through cpuidle first */
+	cpuidle_poke(cfd->cpumask_ipi);
+
 	/* Send a message to all CPUs in the map */
 	arch_send_call_function_ipi_mask(cfd->cpumask_ipi);
 
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 0283523..8bb7b2b 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -16,6 +16,7 @@
 #include <linux/sched.h>
 #include <linux/smp.h>
 #include <linux/module.h>
+#include <linux/cpuidle.h>
 
 #include "tick-internal.h"
 
@@ -286,6 +287,9 @@ static bool tick_do_broadcast(struct cpumask *mask)
 	}
 
 	if (!cpumask_empty(mask)) {
+		/* Poke the cpus through cpuidle first */
+		cpuidle_poke(mask);
+
 		/*
 		 * It might be necessary to actually check whether the devices
 		 * have different broadcast functions. For now, just use the
-- 
2.7.4



* [RFC 4/7] psci: Add cpu_poke ops to support core poking
  2019-03-27 13:21 [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup Abel Vesa
                   ` (2 preceding siblings ...)
  2019-03-27 13:21 ` [RFC 3/7] smp: Poke the cores before requesting IPI Abel Vesa
@ 2019-03-27 13:21 ` Abel Vesa
  2019-03-27 13:21 ` [RFC 5/7] cpuidle-arm: Add ops to support poke alongside enter Abel Vesa
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 28+ messages in thread
From: Abel Vesa @ 2019-03-27 13:21 UTC
  To: Sudeep Holla, Marc Zyngier, Rob Herring, Mark Rutland, Shawn Guo,
	Sascha Hauer, catalin.marinas, Will Deacon, Rafael J. Wysocki,
	Lorenzo Pieralisi, Fabio Estevam, Lucas Stach, Aisheng Dong
  Cc: dl-linux-imx, linux-arm-kernel, Linux Kernel Mailing List,
	linux-pm, Abel Vesa

Some platforms need dedicated work to be done in TF-A before the
specified core can be woken up through an IPI. Allow those platforms
to call into TF-A to do that work by making use of the cpu_poke
operation.
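
For reference, the arithmetic behind the two new function IDs can be
written out as follows. The base values mirror the PSCI 0.2 definitions in
include/uapi/linux/psci.h. The helper names model_psci_fn and
model_psci_fn64 are made up for this sketch.

```c
#include <stdint.h>

/* Same values as PSCI_0_2_FN_BASE and PSCI_0_2_64BIT in the uapi header. */
#define MODEL_PSCI_0_2_FN_BASE	UINT32_C(0x84000000)
#define MODEL_PSCI_0_2_64BIT	UINT32_C(0x40000000)

/* 32-bit calling convention ID for PSCI function number n. */
static uint32_t model_psci_fn(uint32_t n)
{
	return MODEL_PSCI_0_2_FN_BASE + n;
}

/* 64-bit calling convention ID for PSCI function number n. */
static uint32_t model_psci_fn64(uint32_t n)
{
	return MODEL_PSCI_0_2_FN_BASE + MODEL_PSCI_0_2_64BIT + n;
}
```

With function number 11, as used in this patch, the helpers yield
0x8400000B for the 32-bit ID and 0xC400000B for the 64-bit one.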

Signed-off-by: Abel Vesa <abel.vesa@nxp.com>
---
 arch/arm64/include/asm/cpu_ops.h | 1 +
 arch/arm64/kernel/psci.c         | 1 +
 drivers/firmware/psci.c          | 6 ++++++
 include/linux/psci.h             | 1 +
 include/uapi/linux/psci.h        | 2 ++
 5 files changed, 11 insertions(+)

diff --git a/arch/arm64/include/asm/cpu_ops.h b/arch/arm64/include/asm/cpu_ops.h
index 8f03446..913afef 100644
--- a/arch/arm64/include/asm/cpu_ops.h
+++ b/arch/arm64/include/asm/cpu_ops.h
@@ -60,6 +60,7 @@ struct cpu_operations {
 #ifdef CONFIG_CPU_IDLE
 	int		(*cpu_init_idle)(unsigned int);
 	int		(*cpu_suspend)(unsigned long);
+	int		(*cpu_poke)(unsigned int);
 #endif
 };
 
diff --git a/arch/arm64/kernel/psci.c b/arch/arm64/kernel/psci.c
index 8cdaf25..53227eb 100644
--- a/arch/arm64/kernel/psci.c
+++ b/arch/arm64/kernel/psci.c
@@ -115,6 +115,7 @@ const struct cpu_operations cpu_psci_ops = {
 #ifdef CONFIG_CPU_IDLE
 	.cpu_init_idle	= psci_cpu_init_idle,
 	.cpu_suspend	= psci_cpu_suspend_enter,
+	.cpu_poke	= psci_cpu_suspend_exit,
 #endif
 	.cpu_init	= cpu_psci_cpu_init,
 	.cpu_prepare	= cpu_psci_cpu_prepare,
diff --git a/drivers/firmware/psci.c b/drivers/firmware/psci.c
index c80ec1d..282bc47 100644
--- a/drivers/firmware/psci.c
+++ b/drivers/firmware/psci.c
@@ -73,6 +73,7 @@ enum psci_function {
 	PSCI_FN_CPU_ON,
 	PSCI_FN_CPU_OFF,
 	PSCI_FN_MIGRATE,
+	PSCI_FN_CPU_POKE,
 	PSCI_FN_MAX,
 };
 
@@ -424,6 +425,11 @@ int psci_cpu_suspend_enter(unsigned long index)
 	return ret;
 }
 
+int psci_cpu_suspend_exit(unsigned int index)
+{
+	return invoke_psci_fn(PSCI_0_2_FN_CPU_POKE, index, 0, 0);
+}
+
 /* ARM specific CPU idle operations */
 #ifdef CONFIG_ARM
 static const struct cpuidle_ops psci_cpuidle_ops __initconst = {
diff --git a/include/linux/psci.h b/include/linux/psci.h
index 8b1b3b5..d863733 100644
--- a/include/linux/psci.h
+++ b/include/linux/psci.h
@@ -24,6 +24,7 @@ bool psci_tos_resident_on(int cpu);
 
 int psci_cpu_init_idle(unsigned int cpu);
 int psci_cpu_suspend_enter(unsigned long index);
+int psci_cpu_suspend_exit(unsigned int index);
 
 enum psci_conduit {
 	PSCI_CONDUIT_NONE,
diff --git a/include/uapi/linux/psci.h b/include/uapi/linux/psci.h
index b3bcabe..19e7481 100644
--- a/include/uapi/linux/psci.h
+++ b/include/uapi/linux/psci.h
@@ -40,8 +40,10 @@
 #define PSCI_0_2_FN_MIGRATE_INFO_UP_CPU		PSCI_0_2_FN(7)
 #define PSCI_0_2_FN_SYSTEM_OFF			PSCI_0_2_FN(8)
 #define PSCI_0_2_FN_SYSTEM_RESET		PSCI_0_2_FN(9)
+#define PSCI_0_2_FN_CPU_POKE			PSCI_0_2_FN(11)
 
 #define PSCI_0_2_FN64_CPU_SUSPEND		PSCI_0_2_FN64(1)
+#define PSCI_0_2_FN64_CPU_POKE			PSCI_0_2_FN64(11)
 #define PSCI_0_2_FN64_CPU_ON			PSCI_0_2_FN64(3)
 #define PSCI_0_2_FN64_AFFINITY_INFO		PSCI_0_2_FN64(4)
 #define PSCI_0_2_FN64_MIGRATE			PSCI_0_2_FN64(5)
-- 
2.7.4



* [RFC 5/7] cpuidle-arm: Add ops to support poke alongside enter
  2019-03-27 13:21 [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup Abel Vesa
                   ` (3 preceding siblings ...)
  2019-03-27 13:21 ` [RFC 4/7] psci: Add cpu_poke ops to support core poking Abel Vesa
@ 2019-03-27 13:21 ` Abel Vesa
  2019-03-27 13:21 ` [RFC 6/7] cpuidle-arm: Add arm64 wake helper for cpu_poke op Abel Vesa
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 28+ messages in thread
From: Abel Vesa @ 2019-03-27 13:21 UTC
  To: Sudeep Holla, Marc Zyngier, Rob Herring, Mark Rutland, Shawn Guo,
	Sascha Hauer, catalin.marinas, Will Deacon, Rafael J. Wysocki,
	Lorenzo Pieralisi, Fabio Estevam, Lucas Stach, Aisheng Dong
  Cc: dl-linux-imx, linux-arm-kernel, Linux Kernel Mailing List,
	linux-pm, Abel Vesa

In order to support poking alongside the enter operation,
the cpuidle_dt_ops are added. On each state initialization,
if the state has the property "local-wakeup-poke" set, the
state gets the poking mechanism enabled. For now,
the arm_poke_idle_state doesn't do anything.
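
The state initialization described above can be sketched in userspace C.
A boolean stands in for the device tree property lookup, and
dt_ops_model, dt_node_model, idle_state_init_model and model_init_state
are hypothetical names. This is an illustration of the idea, not the
dt_idle_states.c code itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ops table a driver would hand over via match data. */
struct dt_ops_model {
	int (*enter)(int index);
	int (*poke)(int cpu);
};

/* Hypothetical view of an idle state node's boolean property. */
struct dt_node_model {
	bool local_wakeup_poke;
};

/* Hypothetical idle state being initialized. */
struct idle_state_init_model {
	int (*enter)(int index);
	int (*poke)(int cpu);
};

static int model_enter(int index)
{
	return index;
}

static int model_poke(int cpu)
{
	(void)cpu;
	return 0;
}

/* Always wire up enter; wire up poke only when the node asks for it. */
static void model_init_state(struct idle_state_init_model *state,
			     const struct dt_node_model *node,
			     const struct dt_ops_model *ops)
{
	state->enter = ops->enter;
	state->poke = NULL;

	if (node->local_wakeup_poke)
		state->poke = ops->poke;
}
```

A state whose node lacks the property keeps a NULL poke pointer, so the
core code can tell it needs no poking.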

Signed-off-by: Abel Vesa <abel.vesa@nxp.com>
---
 drivers/cpuidle/cpuidle-arm.c    | 13 ++++++++++++-
 drivers/cpuidle/dt_idle_states.c | 15 ++++++++++-----
 drivers/cpuidle/dt_idle_states.h | 10 ++++++++++
 3 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/drivers/cpuidle/cpuidle-arm.c b/drivers/cpuidle/cpuidle-arm.c
index 3a407a3..76ee7ac 100644
--- a/drivers/cpuidle/cpuidle-arm.c
+++ b/drivers/cpuidle/cpuidle-arm.c
@@ -45,6 +45,12 @@ static int arm_enter_idle_state(struct cpuidle_device *dev,
 	return CPU_PM_CPU_IDLE_ENTER(arm_cpuidle_suspend, idx);
 }
 
+static int arm_poke_idle_state(struct cpuidle_device *dev,
+				struct cpuidle_driver *drv, int cpu)
+{
+	return 0;
+}
+
 static struct cpuidle_driver arm_idle_driver __initdata = {
 	.name = "arm_idle",
 	.owner = THIS_MODULE,
@@ -65,9 +71,14 @@ static struct cpuidle_driver arm_idle_driver __initdata = {
 	}
 };
 
+static const struct cpuidle_dt_ops cpuidle_ops = {
+	.enter = arm_enter_idle_state,
+	.poke = arm_poke_idle_state
+};
+
 static const struct of_device_id arm_idle_state_match[] __initconst = {
 	{ .compatible = "arm,idle-state",
-	  .data = arm_enter_idle_state },
+	  .data = &cpuidle_ops },
 	{ },
 };
 
diff --git a/drivers/cpuidle/dt_idle_states.c b/drivers/cpuidle/dt_idle_states.c
index add9569..6490ed4 100644
--- a/drivers/cpuidle/dt_idle_states.c
+++ b/drivers/cpuidle/dt_idle_states.c
@@ -27,19 +27,18 @@ static int init_state_node(struct cpuidle_state *idle_state,
 {
 	int err;
 	const char *desc;
-
+	const struct cpuidle_dt_ops *ops = match_id->data;
 	/*
 	 * CPUidle drivers are expected to initialize the const void *data
-	 * pointer of the passed in struct of_device_id array to the idle
-	 * state enter function.
+	 * pointer of the passed in struct of_device_id array to the ops.
 	 */
-	idle_state->enter = match_id->data;
+	idle_state->enter = ops->enter;
 	/*
 	 * Since this is not a "coupled" state, it's safe to assume interrupts
 	 * won't be enabled when it exits allowing the tick to be frozen
 	 * safely. So enter() can be also enter_s2idle() callback.
 	 */
-	idle_state->enter_s2idle = match_id->data;
+	idle_state->enter_s2idle = (void *)ops->enter;
 
 	err = of_property_read_u32(state_node, "wakeup-latency-us",
 				   &idle_state->exit_latency);
@@ -83,6 +82,12 @@ static int init_state_node(struct cpuidle_state *idle_state,
 	idle_state->flags = 0;
 	if (of_property_read_bool(state_node, "local-timer-stop"))
 		idle_state->flags |= CPUIDLE_FLAG_TIMER_STOP;
+
+	if (of_property_read_bool(state_node, "local-wakeup-poke")) {
+		WARN_ONCE(!ops->poke, "cpuidle driver: missing poke function\n");
+		idle_state->poke = ops->poke;
+	}
+
 	/*
 	 * TODO:
 	 *	replace with kstrdup and pointer assignment when name
diff --git a/drivers/cpuidle/dt_idle_states.h b/drivers/cpuidle/dt_idle_states.h
index 14ae88c..901a40e 100644
--- a/drivers/cpuidle/dt_idle_states.h
+++ b/drivers/cpuidle/dt_idle_states.h
@@ -2,6 +2,16 @@
 #ifndef __DT_IDLE_STATES
 #define __DT_IDLE_STATES
 
+struct cpuidle_dt_ops {
+	int (*enter)	(struct cpuidle_device *dev,
+			struct cpuidle_driver *drv,
+			int index);
+
+	int (*poke)	(struct cpuidle_device *dev,
+			struct cpuidle_driver *drv,
+			int cpu);
+};
+
 int dt_init_idle_driver(struct cpuidle_driver *drv,
 			const struct of_device_id *matches,
 			unsigned int start_idx);
-- 
2.7.4



* [RFC 6/7] cpuidle-arm: Add arm64 wake helper for cpu_poke op
  2019-03-27 13:21 [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup Abel Vesa
                   ` (4 preceding siblings ...)
  2019-03-27 13:21 ` [RFC 5/7] cpuidle-arm: Add ops to support poke alongside enter Abel Vesa
@ 2019-03-27 13:21 ` Abel Vesa
  2019-03-27 13:21 ` [RFC 7/7] arm64: dts: imx8mq: Add cpu-sleep state with poke wake-up enabled Abel Vesa
  2019-03-27 15:44 ` [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup Lucas Stach
  7 siblings, 0 replies; 28+ messages in thread
From: Abel Vesa @ 2019-03-27 13:21 UTC
  To: Sudeep Holla, Marc Zyngier, Rob Herring, Mark Rutland, Shawn Guo,
	Sascha Hauer, catalin.marinas, Will Deacon, Rafael J. Wysocki,
	Lorenzo Pieralisi, Fabio Estevam, Lucas Stach, Aisheng Dong
  Cc: dl-linux-imx, linux-arm-kernel, Linux Kernel Mailing List,
	linux-pm, Abel Vesa

When arm_poke_idle_state gets called, the cpu_poke
cpu_ops of the current core is called, passing on
the index of the core to be poked.

Signed-off-by: Abel Vesa <abel.vesa@nxp.com>
---
 arch/arm64/include/asm/cpuidle.h | 6 ++++++
 arch/arm64/kernel/cpuidle.c      | 8 ++++++++
 drivers/cpuidle/cpuidle-arm.c    | 2 +-
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/cpuidle.h b/arch/arm64/include/asm/cpuidle.h
index 3c5ddb4..e637d4d 100644
--- a/arch/arm64/include/asm/cpuidle.h
+++ b/arch/arm64/include/asm/cpuidle.h
@@ -7,6 +7,7 @@
 #ifdef CONFIG_CPU_IDLE
 extern int arm_cpuidle_init(unsigned int cpu);
 extern int arm_cpuidle_suspend(int index);
+extern int arm_cpuidle_wake(int index);
 #else
 static inline int arm_cpuidle_init(unsigned int cpu)
 {
@@ -17,5 +18,10 @@ static inline int arm_cpuidle_suspend(int index)
 {
 	return -EOPNOTSUPP;
 }
+
+static inline int arm_cpuidle_wake(int index)
+{
+	return -EOPNOTSUPP;
+}
 #endif
 #endif
diff --git a/arch/arm64/kernel/cpuidle.c b/arch/arm64/kernel/cpuidle.c
index f2d1381..af00955 100644
--- a/arch/arm64/kernel/cpuidle.c
+++ b/arch/arm64/kernel/cpuidle.c
@@ -43,6 +43,14 @@ int arm_cpuidle_suspend(int index)
 	return cpu_ops[cpu]->cpu_suspend(index);
 }
 
+int arm_cpuidle_wake(int index)
+{
+	int cpu = smp_processor_id();
+
+	return cpu_ops[cpu]->cpu_poke(index);
+}
+
+
 #ifdef CONFIG_ACPI
 
 #include <acpi/processor.h>
diff --git a/drivers/cpuidle/cpuidle-arm.c b/drivers/cpuidle/cpuidle-arm.c
index 76ee7ac..d5d3eef 100644
--- a/drivers/cpuidle/cpuidle-arm.c
+++ b/drivers/cpuidle/cpuidle-arm.c
@@ -48,7 +48,7 @@ static int arm_enter_idle_state(struct cpuidle_device *dev,
 static int arm_poke_idle_state(struct cpuidle_device *dev,
 				struct cpuidle_driver *drv, int cpu)
 {
-	return 0;
+	return arm_cpuidle_wake(cpu);
 }
 
 static struct cpuidle_driver arm_idle_driver __initdata = {
-- 
2.7.4



* [RFC 7/7] arm64: dts: imx8mq: Add cpu-sleep state with poke wake-up enabled
  2019-03-27 13:21 [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup Abel Vesa
                   ` (5 preceding siblings ...)
  2019-03-27 13:21 ` [RFC 6/7] cpuidle-arm: Add arm64 wake helper for cpu_poke op Abel Vesa
@ 2019-03-27 13:21 ` Abel Vesa
  2019-03-27 15:44 ` [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup Lucas Stach
  7 siblings, 0 replies; 28+ messages in thread
From: Abel Vesa @ 2019-03-27 13:21 UTC
  To: Sudeep Holla, Marc Zyngier, Rob Herring, Mark Rutland, Shawn Guo,
	Sascha Hauer, catalin.marinas, Will Deacon, Rafael J. Wysocki,
	Lorenzo Pieralisi, Fabio Estevam, Lucas Stach, Aisheng Dong
  Cc: dl-linux-imx, linux-arm-kernel, Linux Kernel Mailing List,
	linux-pm, Abel Vesa

Add the idle state cpu-sleep to each core. This idle state
makes use of the 'local-wakeup-poke' property, which tells
the cpuidle-arm driver to enable poking for this state.

Signed-off-by: Abel Vesa <abel.vesa@nxp.com>
---
 arch/arm64/boot/dts/freescale/imx8mq.dtsi | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/imx8mq.dtsi b/arch/arm64/boot/dts/freescale/imx8mq.dtsi
index 230f198..8b7303d 100644
--- a/arch/arm64/boot/dts/freescale/imx8mq.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx8mq.dtsi
@@ -84,6 +84,22 @@
 		#address-cells = <1>;
 		#size-cells = <0>;
 
+		idle-states {
+			entry-method = "psci";
+
+			CPU_SLEEP: cpu-sleep {
+				compatible = "arm,idle-state";
+				arm,psci-suspend-param = <0x0010033>;
+				local-timer-stop;
+				local-wakeup-poke;
+				entry-latency-us = <1000>;
+				exit-latency-us = <700>;
+				min-residency-us = <2700>;
+				wakeup-latency-us = <1500>;
+			};
+		};
+
+
 		A53_0: cpu@0 {
 			device_type = "cpu";
 			compatible = "arm,cortex-a53";
@@ -94,6 +110,7 @@
 			next-level-cache = <&A53_L2>;
 			operating-points-v2 = <&a53_opp_table>;
 			#cooling-cells = <2>;
+			cpu-idle-states = <&CPU_SLEEP>;
 		};
 
 		A53_1: cpu@1 {
@@ -106,6 +123,7 @@
 			next-level-cache = <&A53_L2>;
 			operating-points-v2 = <&a53_opp_table>;
 			#cooling-cells = <2>;
+			cpu-idle-states = <&CPU_SLEEP>;
 		};
 
 		A53_2: cpu@2 {
@@ -118,6 +136,7 @@
 			next-level-cache = <&A53_L2>;
 			operating-points-v2 = <&a53_opp_table>;
 			#cooling-cells = <2>;
+			cpu-idle-states = <&CPU_SLEEP>;
 		};
 
 		A53_3: cpu@3 {
@@ -130,6 +149,7 @@
 			next-level-cache = <&A53_L2>;
 			operating-points-v2 = <&a53_opp_table>;
 			#cooling-cells = <2>;
+			cpu-idle-states = <&CPU_SLEEP>;
 		};
 
 		A53_L2: l2-cache0 {
-- 
2.7.4



* Re: [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup
  2019-03-27 13:21 [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup Abel Vesa
                   ` (6 preceding siblings ...)
  2019-03-27 13:21 ` [RFC 7/7] arm64: dts: imx8mq: Add cpu-sleep state with poke wake-up enabled Abel Vesa
@ 2019-03-27 15:44 ` Lucas Stach
  2019-03-27 15:57   ` Marc Zyngier
  7 siblings, 1 reply; 28+ messages in thread
From: Lucas Stach @ 2019-03-27 15:44 UTC
  To: Abel Vesa, Sudeep Holla, Marc Zyngier, Rob Herring, Mark Rutland,
	Shawn Guo, Sascha Hauer, catalin.marinas, Will Deacon,
	Rafael J. Wysocki, Lorenzo Pieralisi, Fabio Estevam,
	Aisheng Dong
  Cc: dl-linux-imx, linux-arm-kernel, Linux Kernel Mailing List, linux-pm

Hi Abel,

On Wednesday, 27.03.2019 at 13:21 +0000, Abel Vesa wrote:
> This work is a workaround I'm looking into (more as a background task)
> in order to add support for cpuidle on i.MX8MQ based platforms.
> 
> The main idea here is to get around the missing GIC wake_request signal
> (due to an integration design issue) by waking up each individual core through
> some dedicated SW power-up bits inside the power controller (GPC) right before
> every IPI is requested for that core.

Just a general comment, without going into the details of this series:
this issue is not only affecting IPIs, but also MSIs terminated at the
GIC. Currently MSIs are terminated at the PCIe core, but terminating
them at the GIC is clearly preferable, as this allows assigning CPU
affinity to individual MSIs and lowers IRQ service overhead.

I'm not sure what the consequences are for upstream Linux support yet,
but we should keep in mind that having a workaround for IPIs is only
solving part of the issue.

Regards,
Lucas


* Re: [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup
  2019-03-27 15:44 ` [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup Lucas Stach
@ 2019-03-27 15:57   ` Marc Zyngier
  2019-03-27 16:06     ` Lucas Stach
  0 siblings, 1 reply; 28+ messages in thread
From: Marc Zyngier @ 2019-03-27 15:57 UTC
  To: Lucas Stach, Abel Vesa, Sudeep Holla, Rob Herring, Mark Rutland,
	Shawn Guo, Sascha Hauer, catalin.marinas, Will Deacon,
	Rafael J. Wysocki, Lorenzo Pieralisi, Fabio Estevam,
	Aisheng Dong
  Cc: dl-linux-imx, linux-arm-kernel, Linux Kernel Mailing List, linux-pm

On 27/03/2019 15:44, Lucas Stach wrote:
> Hi Abel,
> 
> On Wednesday, 27.03.2019 at 13:21 +0000, Abel Vesa wrote:
>> This work is a workaround I'm looking into (more as a background task)
>> in order to add support for cpuidle on i.MX8MQ based platforms.
>>
>> The main idea here is getting around the missing GIC wake_request signal
>> (due to integration design issue) by waking up a each individual core through
>> some dedicated SW power-up bits inside the power controller (GPC) right before
>> every IPI is requested for that each individual core.
> 
> Just a general comment, without going into the details of this series:
> this issue is not only affecting IPIs, but also MSIs terminated at the
> GIC. Currently MSIs are terminated at the PCIe core, but terminating
> them at the GIC is clearly preferable, as this allows assigning CPU
> affinity to individual MSIs and lowers IRQ service overhead.
> 
> I'm not sure what the consequences are for upstream Linux support yet,
> but we should keep in mind that having a workaround for IPIs is only
> solving part of the issue.

If this erratum is affecting more than just IPIs, then indeed I don't
see how this patch series solves anything.

But the erratum documentation seems to imply that only SGIs are
affected, and goes as far as suggesting that using an external interrupt
would solve it. How come this is not the case? Or is it that anything
directly routed to a redistributor is also affected? This would break
LPIs (and thus MSIs) and PPIs (the CPU timer, among others).

What is the *exact* status of this thing? I have the ugly feeling that
the true workaround is just to disable cpuidle.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup
  2019-03-27 15:57   ` Marc Zyngier
@ 2019-03-27 16:06     ` Lucas Stach
  2019-03-27 17:00       ` Leonard Crestez
  2019-03-27 17:45       ` Marc Zyngier
  0 siblings, 2 replies; 28+ messages in thread
From: Lucas Stach @ 2019-03-27 16:06 UTC (permalink / raw)
  To: Marc Zyngier, Abel Vesa, Sudeep Holla, Rob Herring, Mark Rutland,
	Shawn Guo, Sascha Hauer, catalin.marinas, Will Deacon,
	Rafael J. Wysocki, Lorenzo Pieralisi, Fabio Estevam,
	Aisheng Dong
  Cc: dl-linux-imx, linux-arm-kernel, Linux Kernel Mailing List, linux-pm

Hi Marc,

On Wednesday, 27.03.2019 at 15:57 +0000, Marc Zyngier wrote:
> On 27/03/2019 15:44, Lucas Stach wrote:
> > Hi Abel,
> > 
> > On Wednesday, 27.03.2019 at 13:21 +0000, Abel Vesa wrote:
> > > This work is a workaround I'm looking into (more as a background task)
> > > in order to add support for cpuidle on i.MX8MQ based platforms.
> > > 
> > > The main idea here is getting around the missing GIC wake_request signal
> > > (due to integration design issue) by waking up a each individual core through
> > > some dedicated SW power-up bits inside the power controller (GPC) right before
> > > every IPI is requested for that each individual core.
> > 
> > Just a general comment, without going into the details of this series:
> > this issue is not only affecting IPIs, but also MSIs terminated at the
> > GIC. Currently MSIs are terminated at the PCIe core, but terminating
> > them at the GIC is clearly preferable, as this allows assigning CPU
> > affinity to individual MSIs and lowers IRQ service overhead.
> > 
> > I'm not sure what the consequences are for upstream Linux support yet,
> > but we should keep in mind that having a workaround for IPIs is only
> > solving part of the issue.
> 
> If this erratum is affecting more than just IPIs, then indeed I don't
> see how this patch series solves anything.
> 
> But the erratum documentation seems to imply that only SGIs are
> affected, and goes as far as suggesting to use an external interrupt
> would solve it. How comes this is not the case? Or is it that anything
> directly routed to a redistributor is also affected? This would break
> LPIs (and thus MSIs) and PPIs (the CPU timer, among others).
> 
> What is the *exact* status of this thing? I have the ugly feeling that
> the true workaround is just to disable cpuidle.

As far as I understand the erratum, the basic issue is that the GIC
wake_request signals are not connected to the GPC (the CPU/peripheral
power sequencer). The SPIs are routed through the GPC and thus are
visible as wakeup sources, which is why the workaround of using an
external SPI as wakeup trigger for the IPI works.

Anything that isn't visible to the GPC and requires the GIC
wake_request signal to behave as specified is broken by this erratum.
You probably know the GIC better than any of us to tell what this
means.
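
The ordering behind the SPI-as-wakeup-trigger workaround can be sketched
as a toy model in C. This is only an illustration under stated
assumptions: struct core and all function names below are hypothetical
and do not come from the series, the GIC driver or the GPC driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one core; the fields are illustrative, not kernel structures. */
struct core {
	bool powered;     /* false while the core sits in a power-gated idle state */
	bool sgi_pending; /* an SGI (IPI) has been raised at the GIC for this core */
};

/*
 * Raising an SGI only marks it pending. With wake_request not connected,
 * the GPC never learns about it, so a power-gated core stays down.
 */
static void send_sgi(struct core *c)
{
	assert(c != NULL);
	c->sgi_pending = true;
}

/* Anything the GPC can see (a wired SPI or a SW power-up bit) powers the core. */
static void gpc_wake(struct core *c)
{
	assert(c != NULL);
	c->powered = true;
}

/* The IPI is handled only once the core is actually running. */
static bool ipi_handled(const struct core *c)
{
	assert(c != NULL);
	return c->powered && c->sgi_pending;
}

/* Workaround ordering: wake through the GPC first, then raise the SGI. */
static void send_ipi_with_poke(struct core *c)
{
	gpc_wake(c);
	send_sgi(c);
}
```

In this model a bare send_sgi() on a power-gated core leaves
ipi_handled() false, while send_ipi_with_poke() makes it true.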

Regards,
Lucas

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup
  2019-03-27 16:06     ` Lucas Stach
@ 2019-03-27 17:00       ` Leonard Crestez
  2019-03-27 17:11         ` Lucas Stach
  2019-03-27 18:13         ` Marc Zyngier
  2019-03-27 17:45       ` Marc Zyngier
  1 sibling, 2 replies; 28+ messages in thread
From: Leonard Crestez @ 2019-03-27 17:00 UTC (permalink / raw)
  To: l.stach, marc.zyngier, Richard Zhu
  Cc: Fabio Estevam, Cosmin Samoila, Robin Gong, Mircea Pop,
	Daniel Baluta, catalin.marinas, Aisheng Dong, shawnguo,
	Robert Chiras, Anson Huang, Jun Li, Abel Vesa, robh, Zening Wang,
	dl-linux-imx, BOUGH CHEN, Horia Geanta, Leonard Crestez,
	Peter Chen, Joakim Zhang, rjw, Leo Zhang, Shenwei Wang, linux-pm,
	linux-arm-kernel, Ranjani Vaidyanathan, Han Xu, will.deacon,
	Iuliana Prodan, sudeep.holla, lorenzo.pieralisi, Jacky Bai,
	linux-kernel, mark.rutland, Peng Fan, kernel, Viorel Suman

On Wed, 2019-03-27 at 17:06 +0100, Lucas Stach wrote:
> On Wednesday, 27.03.2019 at 15:57 +0000, Marc Zyngier wrote:
> > On 27/03/2019 15:44, Lucas Stach wrote:
> > > On Wednesday, 27.03.2019 at 13:21 +0000, Abel Vesa wrote:
> > > > This work is a workaround I'm looking into (more as a background task)
> > > > in order to add support for cpuidle on i.MX8MQ based platforms.
> > > > 
> > > > The main idea here is getting around the missing GIC wake_request signal
> > > > (due to integration design issue) by waking up a each individual core through
> > > > some dedicated SW power-up bits inside the power controller (GPC) right before
> > > > every IPI is requested for that each individual core.
> > > 
> > > Just a general comment, without going into the details of this series:
> > > this issue is not only affecting IPIs, but also MSIs terminated at the
> > > GIC. Currently MSIs are terminated at the PCIe core, but terminating
> > > them at the GIC is clearly preferable, as this allows assigning CPU
> > > affinity to individual MSIs and lowers IRQ service overhead.
> > > 
> > > I'm not sure what the consequences are for upstream Linux support yet,
> > > but we should keep in mind that having a workaround for IPIs is only
> > > solving part of the issue.
> > 
> > If this erratum is affecting more than just IPIs, then indeed I don't
> > see how this patch series solves anything.
> > 
> > But the erratum documentation seems to imply that only SGIs are
> > affected, and goes as far as suggesting to use an external interrupt
> > would solve it. How comes this is not the case? Or is it that anything
> > directly routed to a redistributor is also affected? This would break
> > LPIs (and thus MSIs) and PPIs (the CPU timer, among others).
> > 
> > What is the *exact* status of this thing? I have the ugly feeling that
> > the true workaround is just to disable cpuidle.
> 
> As far as I understand the erratum, the basic issue is that the GIC
> wake_request signals are not connected to the GPC (the CPU/peripheral
> power sequencer). The SPIs are routed through the GPC and thus are
> visible as wakeup sources, which is why the workaround of using an
> external SPI as wakeup trigger for the IPI works.

We had a kernel workaround for IPIs in our internal tree for a long
time and I don't think we do anything special for PCI. Does PCI MSI
really bypass the GPC on 8mq?

Adding Richard/Jacky, they might know about this.

This seems like something of a corner case to me: don't many imx boards
ship without PCI, especially for low-power scenarios? If required, it
might be reasonable to add an additional workaround to disable all
cpuidle if PCI MSIs are used.

--
Regards,
Leonard

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup
  2019-03-27 17:00       ` Leonard Crestez
@ 2019-03-27 17:11         ` Lucas Stach
  2019-03-27 18:13         ` Marc Zyngier
  1 sibling, 0 replies; 28+ messages in thread
From: Lucas Stach @ 2019-03-27 17:11 UTC (permalink / raw)
  To: Leonard Crestez, marc.zyngier, Richard Zhu
  Cc: Fabio Estevam, Cosmin Samoila, Robin Gong, Mircea Pop,
	Daniel Baluta, catalin.marinas, Aisheng Dong, shawnguo,
	Robert Chiras, Anson Huang, Jun Li, Abel Vesa, robh, Zening Wang,
	dl-linux-imx, BOUGH CHEN, Horia Geanta, Peter Chen, Joakim Zhang,
	rjw, Leo Zhang, Shenwei Wang, linux-pm, linux-arm-kernel,
	Ranjani Vaidyanathan, Han Xu, will.deacon, Iuliana Prodan,
	sudeep.holla, lorenzo.pieralisi, Jacky Bai, linux-kernel,
	mark.rutland, Peng Fan, kernel, Viorel Suman

On Wednesday, 27.03.2019 at 17:00 +0000, Leonard Crestez wrote:
> On Wed, 2019-03-27 at 17:06 +0100, Lucas Stach wrote:
> > On Wednesday, 27.03.2019 at 15:57 +0000, Marc Zyngier wrote:
> > > On 27/03/2019 15:44, Lucas Stach wrote:
> > > > On Wednesday, 27.03.2019 at 13:21 +0000, Abel Vesa wrote:
> > > > > This work is a workaround I'm looking into (more as a
> > > > > background task)
> > > > > in order to add support for cpuidle on i.MX8MQ based
> > > > > platforms.
> > > > > 
> > > > > The main idea here is getting around the missing GIC
> > > > > wake_request signal
> > > > > (due to integration design issue) by waking up a each
> > > > > individual core through
> > > > > some dedicated SW power-up bits inside the power controller
> > > > > (GPC) right before
> > > > > every IPI is requested for that each individual core.
> > > > 
> > > > Just a general comment, without going into the details of this
> > > > series:
> > > > this issue is not only affecting IPIs, but also MSIs terminated
> > > > at the
> > > > GIC. Currently MSIs are terminated at the PCIe core, but
> > > > terminating
> > > > them at the GIC is clearly preferable, as this allows assigning
> > > > CPU
> > > > affinity to individual MSIs and lowers IRQ service overhead.
> > > > 
> > > > I'm not sure what the consequences are for upstream Linux
> > > > support yet,
> > > > but we should keep in mind that having a workaround for IPIs is
> > > > only
> > > > solving part of the issue.
> > > 
> > > If this erratum is affecting more than just IPIs, then indeed I
> > > don't
> > > see how this patch series solves anything.
> > > 
> > > But the erratum documentation seems to imply that only SGIs are
> > > affected, and goes as far as suggesting to use an external
> > > interrupt
> > > would solve it. How comes this is not the case? Or is it that
> > > anything
> > > directly routed to a redistributor is also affected? This would
> > > break
> > > LPIs (and thus MSIs) and PPIs (the CPU timer, among others).
> > > 
> > > What is the *exact* status of this thing? I have the ugly feeling
> > > that
> > > the true workaround is just to disable cpuidle.
> > 
> > As far as I understand the erratum, the basic issue is that the GIC
> > wake_request signals are not connected to the GPC (the
> > CPU/peripheral
> > power sequencer). The SPIs are routed through the GPC and thus are
> > visible as wakeup sources, which is why the workaround of using an
> > external SPI as wakeup trigger for the IPI works.
> 
> We had a kernel workaround for IPIs in our internal tree for a long
> time and I don't think we do anything special for PCI. Does PCI MSI
> really bypass the GPC on 8mq?
> 
> Adding Richard/Jacky, they might know about this.

Currently the MSIs are terminated at the PCIe controller and routed to
the CPU via a normal interrupt line that is going through the GPC, so
no workaround is required today.

But then this setup severely limits the usefulness of PCI MSIs, as they
incur an additional overhead of working with the DWC MSI controller and
are unable to target a specific CPU, as they are all routed via a
single IRQ line.

> This seems like something of a corner case to me, don't many imx
> boards
> ship without PCI; especially for low-power scenarios? If required it
> might be reasonable to add an additional workaround to disable all
> cpuidle if pci msis are used.

I don't know how common using PCIe with the i.MX8M is, but even the
reference board ships with the WLAN connected to PCIe.

I'm working with a design that has both a multi-queue and TSN capable
ethernet card connected to one PCIe controller and an NVMe SSD with
multiple queues connected to the second controller. Being able to
terminate the MSIs at the GIC level and have proper CPU affinity makes
a lot of sense in that scenario.

Regards,
Lucas

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup
  2019-03-27 16:06     ` Lucas Stach
  2019-03-27 17:00       ` Leonard Crestez
@ 2019-03-27 17:45       ` Marc Zyngier
  2019-03-27 17:55         ` Lucas Stach
  2019-03-27 18:40         ` Leonard Crestez
  1 sibling, 2 replies; 28+ messages in thread
From: Marc Zyngier @ 2019-03-27 17:45 UTC (permalink / raw)
  To: Lucas Stach, Abel Vesa, Sudeep Holla, Rob Herring, Mark Rutland,
	Shawn Guo, Sascha Hauer, catalin.marinas, Will Deacon,
	Rafael J. Wysocki, Lorenzo Pieralisi, Fabio Estevam,
	Aisheng Dong
  Cc: dl-linux-imx, linux-arm-kernel, Linux Kernel Mailing List, linux-pm

On 27/03/2019 16:06, Lucas Stach wrote:
> Hi Marc,
> 
> On Wednesday, 27.03.2019 at 15:57 +0000, Marc Zyngier wrote:
>> On 27/03/2019 15:44, Lucas Stach wrote:
>>> Hi Abel,
>>>
>>> On Wednesday, 27.03.2019 at 13:21 +0000, Abel Vesa wrote:
>>>> This work is a workaround I'm looking into (more as a background task)
>>>> in order to add support for cpuidle on i.MX8MQ based platforms.
>>>>
>>>> The main idea here is getting around the missing GIC wake_request signal
>>>> (due to integration design issue) by waking up a each individual core through
>>>> some dedicated SW power-up bits inside the power controller (GPC) right before
>>>> every IPI is requested for that each individual core.
>>>
>>> Just a general comment, without going into the details of this series:
>>> this issue is not only affecting IPIs, but also MSIs terminated at the
>>> GIC. Currently MSIs are terminated at the PCIe core, but terminating
>>> them at the GIC is clearly preferable, as this allows assigning CPU
>>> affinity to individual MSIs and lowers IRQ service overhead.
>>>
>>> I'm not sure what the consequences are for upstream Linux support yet,
>>> but we should keep in mind that having a workaround for IPIs is only
>>> solving part of the issue.
>>
>> If this erratum is affecting more than just IPIs, then indeed I don't
>> see how this patch series solves anything.
>>
>> But the erratum documentation seems to imply that only SGIs are
>> affected, and goes as far as suggesting to use an external interrupt
>> would solve it. How comes this is not the case? Or is it that anything
>> directly routed to a redistributor is also affected? This would break
>> LPIs (and thus MSIs) and PPIs (the CPU timer, among others).
>>
>> What is the *exact* status of this thing? I have the ugly feeling that
>> the true workaround is just to disable cpuidle.
> 
> As far as I understand the erratum, the basic issue is that the GIC
> wake_request signals are not connected to the GPC (the CPU/peripheral
> power sequencer). The SPIs are routed through the GPC and thus are
> visible as wakeup sources, which is why the workaround of using an
> external SPI as wakeup trigger for the IPI works.

Are all SPIs connected to the GPC?

> Anything that isn't visible to the GPC and requires the GIC
> wake_request signal to behave as specified is broken by this erratum.

I really wonder how a timer interrupt (a PPI, hence not routed through
the GPC) can wake up the CPU in this case. It really feels like
something like "program CNTV_CVAL_EL0 to expire at some later point;
WFI" could result in the CPU going to a deep sleep state, and not
waking up at all.

This would indicate that not only cpuidle is broken with this, but
absolutely every interrupt that is not routed through the GPC.

> You probably know the GIC better than any of us to tell what this
> means.

Yeah, and that's a very unfortunate state of things... :-/

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup
  2019-03-27 17:45       ` Marc Zyngier
@ 2019-03-27 17:55         ` Lucas Stach
  2019-03-28 11:27           ` Aisheng Dong
  2019-03-27 18:40         ` Leonard Crestez
  1 sibling, 1 reply; 28+ messages in thread
From: Lucas Stach @ 2019-03-27 17:55 UTC (permalink / raw)
  To: Marc Zyngier, Abel Vesa, Sudeep Holla, Rob Herring, Mark Rutland,
	Shawn Guo, Sascha Hauer, catalin.marinas, Will Deacon,
	Rafael J. Wysocki, Lorenzo Pieralisi, Fabio Estevam,
	Aisheng Dong
  Cc: dl-linux-imx, linux-arm-kernel, Linux Kernel Mailing List, linux-pm

On Wednesday, 27.03.2019 at 17:45 +0000, Marc Zyngier wrote:
> On 27/03/2019 16:06, Lucas Stach wrote:
> > Hi Marc,
> > 
> > On Wednesday, 27.03.2019 at 15:57 +0000, Marc Zyngier wrote:
> > > On 27/03/2019 15:44, Lucas Stach wrote:
> > > > Hi Abel,
> > > > 
> > > > On Wednesday, 27.03.2019 at 13:21 +0000, Abel Vesa wrote:
> > > > > This work is a workaround I'm looking into (more as a background task)
> > > > > in order to add support for cpuidle on i.MX8MQ based platforms.
> > > > > 
> > > > > The main idea here is getting around the missing GIC wake_request signal
> > > > > (due to integration design issue) by waking up a each individual core through
> > > > > some dedicated SW power-up bits inside the power controller (GPC) right before
> > > > > every IPI is requested for that each individual core.
> > > > 
> > > > Just a general comment, without going into the details of this series:
> > > > this issue is not only affecting IPIs, but also MSIs terminated at the
> > > > GIC. Currently MSIs are terminated at the PCIe core, but terminating
> > > > them at the GIC is clearly preferable, as this allows assigning CPU
> > > > affinity to individual MSIs and lowers IRQ service overhead.
> > > > 
> > > > I'm not sure what the consequences are for upstream Linux support yet,
> > > > but we should keep in mind that having a workaround for IPIs is only
> > > > solving part of the issue.
> > > 
> > > If this erratum is affecting more than just IPIs, then indeed I don't
> > > see how this patch series solves anything.
> > > 
> > > But the erratum documentation seems to imply that only SGIs are
> > > affected, and goes as far as suggesting to use an external interrupt
> > > would solve it. How comes this is not the case? Or is it that anything
> > > directly routed to a redistributor is also affected? This would break
> > > LPIs (and thus MSIs) and PPIs (the CPU timer, among others).
> > > 
> > > What is the *exact* status of this thing? I have the ugly feeling that
> > > the true workaround is just to disable cpuidle.
> > 
> > As far as I understand the erratum, the basic issue is that the GIC
> > wake_request signals are not connected to the GPC (the CPU/peripheral
> > power sequencer). The SPIs are routed through the GPC and thus are
> > visible as wakeup sources, which is why the workaround of using an
> > external SPI as wakeup trigger for the IPI works.
> 
> Are all SPIs connected to the GPC?

AFAICS yes.

> > Anything that isn't visible to the GPC and requires the GIC
> > wake_request signal to behave as specified is broken by this erratum.
> 
> I really wonder how a timer interrupt (a PPI, hence not routed through
> the GPC) can wake up the CPU in this case. It really feels like
> something like "program CNTV_CVAL_EL0 to expire at some later point;
> WFI" could result in the CPU going to a deep sleep state, and not
> wake-up at all.

I guess it's broken in the same way. The downstream DT claims
"local-timer-stop" for the CPU sleep state and "arm,no-tick-in-suspend" 
for the armv8-timer, which I guess is not the timer actually stopping
in suspend, but the CPU being unable to wake up due to the timer IRQ.

> This would indicate that not only cpuidle is broken with this, but
> absolutely every interrupt that is not routed through the GPC.

That's my understanding as well. Note that I have no NXP internal
information and can only infer from the published reference manual,
errata notice and downstream kernel.

Regards,
Lucas

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup
  2019-03-27 17:00       ` Leonard Crestez
  2019-03-27 17:11         ` Lucas Stach
@ 2019-03-27 18:13         ` Marc Zyngier
  2019-03-28 11:21           ` Aisheng Dong
  1 sibling, 1 reply; 28+ messages in thread
From: Marc Zyngier @ 2019-03-27 18:13 UTC (permalink / raw)
  To: Leonard Crestez, l.stach, Richard Zhu
  Cc: Fabio Estevam, Cosmin Samoila, Robin Gong, Mircea Pop,
	Daniel Baluta, catalin.marinas, Aisheng Dong, shawnguo,
	Robert Chiras, Anson Huang, Jun Li, Abel Vesa, robh, Zening Wang,
	dl-linux-imx, BOUGH CHEN, Horia Geanta, Peter Chen, Joakim Zhang,
	rjw, Leo Zhang, Shenwei Wang, linux-pm, linux-arm-kernel,
	Ranjani Vaidyanathan, Han Xu, will.deacon, Iuliana Prodan,
	sudeep.holla, lorenzo.pieralisi, Jacky Bai, linux-kernel,
	mark.rutland, Peng Fan, kernel, Viorel Suman

On 27/03/2019 17:00, Leonard Crestez wrote:
> On Wed, 2019-03-27 at 17:06 +0100, Lucas Stach wrote:
>> On Wednesday, 27.03.2019 at 15:57 +0000, Marc Zyngier wrote:
>>> On 27/03/2019 15:44, Lucas Stach wrote:
>>>> On Wednesday, 27.03.2019 at 13:21 +0000, Abel Vesa wrote:
>>>>> This work is a workaround I'm looking into (more as a background task)
>>>>> in order to add support for cpuidle on i.MX8MQ based platforms.
>>>>>
>>>>> The main idea here is getting around the missing GIC wake_request signal
>>>>> (due to integration design issue) by waking up a each individual core through
>>>>> some dedicated SW power-up bits inside the power controller (GPC) right before
>>>>> every IPI is requested for that each individual core.
>>>>
>>>> Just a general comment, without going into the details of this series:
>>>> this issue is not only affecting IPIs, but also MSIs terminated at the
>>>> GIC. Currently MSIs are terminated at the PCIe core, but terminating
>>>> them at the GIC is clearly preferable, as this allows assigning CPU
>>>> affinity to individual MSIs and lowers IRQ service overhead.
>>>>
>>>> I'm not sure what the consequences are for upstream Linux support yet,
>>>> but we should keep in mind that having a workaround for IPIs is only
>>>> solving part of the issue.
>>>
>>> If this erratum is affecting more than just IPIs, then indeed I don't
>>> see how this patch series solves anything.
>>>
>>> But the erratum documentation seems to imply that only SGIs are
>>> affected, and goes as far as suggesting to use an external interrupt
>>> would solve it. How comes this is not the case? Or is it that anything
>>> directly routed to a redistributor is also affected? This would break
>>> LPIs (and thus MSIs) and PPIs (the CPU timer, among others).
>>>
>>> What is the *exact* status of this thing? I have the ugly feeling that
>>> the true workaround is just to disable cpuidle.
>>
>> As far as I understand the erratum, the basic issue is that the GIC
>> wake_request signals are not connected to the GPC (the CPU/peripheral
>> power sequencer). The SPIs are routed through the GPC and thus are
>> visible as wakeup sources, which is why the workaround of using an
>> external SPI as wakeup trigger for the IPI works.
> 
> We had a kernel workaround for IPIs in our internal tree for a long
> time and I don't think we do anything special for PCI. Does PCI MSI
> really bypass the GPC on 8mq?

If you have an ITS, certainly. If you don't, it depends. MSIs can hit
the distributor's MBI registers and generate non-wired SPIs, which I
assume will bypass the GPC altogether.

> Adding Richard/Jacky, they might know about this.
> 
> This seems like something of a corner case to me, don't many imx boards
> ship without PCI; especially for low-power scenarios? If required it
> might be reasonable to add an additional workaround to disable all
> cpuidle if pci msis are used.

Establishing a link between cpuidle and PCI in the kernel would be
pretty invasive, and that would come on top of what this series also
mandates.

At that level of apparent brokenness, it is far safer to get cpuidle out
of the picture altogether, and I'd rather see these patches in a vendor
tree (for once).

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup
  2019-03-27 17:45       ` Marc Zyngier
  2019-03-27 17:55         ` Lucas Stach
@ 2019-03-27 18:40         ` Leonard Crestez
  2019-03-28 10:35           ` Marc Zyngier
  2019-03-28 10:45           ` Lorenzo Pieralisi
  1 sibling, 2 replies; 28+ messages in thread
From: Leonard Crestez @ 2019-03-27 18:40 UTC (permalink / raw)
  To: l.stach, marc.zyngier, Abel Vesa, Jacky Bai
  Cc: dl-linux-imx, linux-kernel, Aisheng Dong, linux-pm,
	lorenzo.pieralisi, Fabio Estevam, mark.rutland, rjw,
	catalin.marinas, will.deacon, robh, shawnguo, linux-arm-kernel,
	sudeep.holla, Anson Huang, kernel

On Wed, 2019-03-27 at 17:45 +0000, Marc Zyngier wrote:
> On 27/03/2019 16:06, Lucas Stach wrote:
> > On Wednesday, 27.03.2019 at 15:57 +0000, Marc Zyngier wrote:
> > > On 27/03/2019 15:44, Lucas Stach wrote:
> > > > On Wednesday, 27.03.2019 at 13:21 +0000, Abel Vesa wrote:
> > > > > This work is a workaround I'm looking into (more as a background task)
> > > > > in order to add support for cpuidle on i.MX8MQ based platforms.
> > > > > 
> > > > > The main idea here is getting around the missing GIC wake_request signal
> > > > > (due to integration design issue) by waking up a each individual core through
> > > > > some dedicated SW power-up bits inside the power controller (GPC) right before
> > > > > every IPI is requested for that each individual core.
> > > > 
> > > > Just a general comment, without going into the details of this series:
> > > > this issue is not only affecting IPIs, but also MSIs terminated at the
> > > > GIC. Currently MSIs are terminated at the PCIe core, but terminating
> > > > them at the GIC is clearly preferable, as this allows assigning CPU
> > > > affinity to individual MSIs and lowers IRQ service overhead.
> > > > 
> > > > I'm not sure what the consequences are for upstream Linux support yet,
> > > > but we should keep in mind that having a workaround for IPIs is only
> > > > solving part of the issue.
> > > 
> > > If this erratum is affecting more than just IPIs, then indeed I don't
> > > see how this patch series solves anything.
> > > 
> > > But the erratum documentation seems to imply that only SGIs are
> > > affected, and goes as far as suggesting to use an external interrupt
> > > would solve it. How comes this is not the case? Or is it that anything
> > > directly routed to a redistributor is also affected? This would break
> > > LPIs (and thus MSIs) and PPIs (the CPU timer, among others).
> > 
> > Anything that isn't visible to the GPC and requires the GIC
> > wake_request signal to behave as specified is broken by this erratum.
> 
> I really wonder how a timer interrupt (a PPI, hence not routed through
> the GPC) can wake up the CPU in this case. It really feels like
> something like "program CNTV_CVAL_EL0 to expire at some later point;
> WFI" could result in the CPU going to a deep sleep state, and not
> wake-up at all.

This is already a common issue for cpuidle implementations, handled by
the "local-timer-stop" property. imx has other timer blocks in the SoC;
they generate SPIs which are connected to the GPC.

> This would indicate that not only cpuidle is broken with this, but
> absolutely every interrupt that is not routed through the GPC.

Yes, cpuidle is broken for irqs not routed through GPC. However:

* All SPIs are connected to GPC in a 1:1 mapping
* This series deals with SGIs
* The timer PPIs are not required; covered by local-timer-stop
* LPIs are currently unused (I understand from Lucas that imx-pci uses
SPI by default)

Anything missing?
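
The status list above can be written down as a small lookup in C. This
is a rough sketch only: the enum names and wake_status_for() are
hypothetical, made up for illustration, and "wired SPI" is an assumption
that excludes message-based SPIs, which may not pass through the GPC.

```c
#include <assert.h>

/* Interrupt classes discussed in this thread (names are illustrative). */
enum irq_class {
	IRQ_SGI,       /* IPIs */
	IRQ_PPI,       /* per-CPU interrupts such as the arch timer */
	IRQ_WIRED_SPI, /* SPIs wired 1:1 through the GPC */
	IRQ_LPI,       /* message-based interrupts, e.g. MSIs behind an ITS */
};

enum wake_status {
	WAKE_VIA_GPC,       /* the GPC sees the interrupt and powers the core */
	WAKE_NEEDS_POKE,    /* software must poke the GPC first (this series) */
	WAKE_NOT_AVAILABLE, /* cannot wake a power-gated core */
};

/* Map an interrupt class to its wake-up status as summarised in the list. */
static enum wake_status wake_status_for(enum irq_class cls)
{
	switch (cls) {
	case IRQ_WIRED_SPI:
		return WAKE_VIA_GPC;
	case IRQ_SGI:
		return WAKE_NEEDS_POKE;
	case IRQ_PPI: /* timer: rely on local-timer-stop and an SPI-based timer */
	case IRQ_LPI: /* currently unused on this platform */
		return WAKE_NOT_AVAILABLE;
	}
	assert(0 && "unknown irq class");
	return WAKE_NOT_AVAILABLE;
}
```

Only the IRQ_SGI row is addressed by this series; the PPI and LPI rows
stay in the "cannot wake" state and have to be avoided by configuration.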

My understanding is that this wake request feature via GIC is new in v3
and this is maybe why the HW team missed it during integration. Older
imx6/7 have GICv2 and deep idle states which always rely on the GPC to
wake up, so the approach can work.

--
Regards,
Leonard

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup
  2019-03-27 18:40         ` Leonard Crestez
@ 2019-03-28 10:35           ` Marc Zyngier
  2019-03-28 10:36             ` Rafael J. Wysocki
  2019-03-28 11:55             ` Aisheng Dong
  2019-03-28 10:45           ` Lorenzo Pieralisi
  1 sibling, 2 replies; 28+ messages in thread
From: Marc Zyngier @ 2019-03-28 10:35 UTC (permalink / raw)
  To: Leonard Crestez, l.stach, Abel Vesa, Jacky Bai
  Cc: dl-linux-imx, linux-kernel, Aisheng Dong, linux-pm,
	lorenzo.pieralisi, Fabio Estevam, mark.rutland, rjw,
	catalin.marinas, will.deacon, robh, shawnguo, linux-arm-kernel,
	sudeep.holla, Anson Huang, kernel

On 27/03/2019 18:40, Leonard Crestez wrote:
> On Wed, 2019-03-27 at 17:45 +0000, Marc Zyngier wrote:
>> On 27/03/2019 16:06, Lucas Stach wrote:
>>> On Wednesday, 27.03.2019 at 15:57 +0000, Marc Zyngier wrote:
>>>> On 27/03/2019 15:44, Lucas Stach wrote:
>>>>> On Wednesday, 27.03.2019 at 13:21 +0000, Abel Vesa wrote:
>>>>>> This work is a workaround I'm looking into (more as a background task)
>>>>>> in order to add support for cpuidle on i.MX8MQ based platforms.
>>>>>>
>>>>>> The main idea here is getting around the missing GIC wake_request signal
>>>>>> (due to integration design issue) by waking up a each individual core through
>>>>>> some dedicated SW power-up bits inside the power controller (GPC) right before
>>>>>> every IPI is requested for that each individual core.
>>>>>
>>>>> Just a general comment, without going into the details of this series:
>>>>> this issue is not only affecting IPIs, but also MSIs terminated at the
>>>>> GIC. Currently MSIs are terminated at the PCIe core, but terminating
>>>>> them at the GIC is clearly preferable, as this allows assigning CPU
>>>>> affinity to individual MSIs and lowers IRQ service overhead.
>>>>>
>>>>> I'm not sure what the consequences are for upstream Linux support yet,
>>>>> but we should keep in mind that having a workaround for IPIs is only
>>>>> solving part of the issue.
>>>>
>>>> If this erratum is affecting more than just IPIs, then indeed I don't
>>>> see how this patch series solves anything.
>>>>
>>>> But the erratum documentation seems to imply that only SGIs are
>>>> affected, and goes as far as suggesting to use an external interrupt
>>>> would solve it. How comes this is not the case? Or is it that anything
>>>> directly routed to a redistributor is also affected? This would break
>>>> LPIs (and thus MSIs) and PPIs (the CPU timer, among others).
>>>
>>> Anything that isn't visible to the GPC and requires the GIC
>>> wake_request signal to behave as specified is broken by this erratum.
>>
>> I really wonder how a timer interrupt (a PPI, hence not routed through
>> the GPC) can wake up the CPU in this case. It really feels like
>> something like "program CNTV_CVAL_EL0 to expire at some later point;
>> WFI" could result in the CPU going to a deep sleep state, and not
>> wake-up at all.
> 
> This is already a common issue for cpuidle implementations handled by the
> "local-timer-stop" property. imx has other timer blocks in the SOC,
> they generate SPIs which are connected to GPC.
> 
>> This would indicate that not only cpuidle is broken with this, but
>> absolutely every interrupt that is not routed through the GPC.
> 
> Yes, cpuidle is broken for irqs not routed through GPC. However:
> 
> * All SPIs are connected to GPC in a 1:1 mapping
> * This series deals with SGIs
> * The timer PPIs are not required; covered by local-timer-stop
> * LPIs are currently unused (I understand imx-pci uses SPI by default
> from Lucas)
> 
> Anything missing?
> 
> My understanding is that this wake request feature via GIC is new in v3
> and this is maybe why HW team missed it during integration. Older
> imx6/7 has GICv2 and has deep idle states which always rely on GPC to
> wakeup so the approach can work.

Certainly the approach can work. The question is whether we want to
support this in a mainline kernel, spreading random hooks in the generic
code and adding a firmware interface on top of that.

By all accounts, this HW is broken. You can indeed impose limitations
(dumb down PCI, mandate the use of a broadcast timer), or you can just
flag cpuidle as unsupported on this HW. My vote is on the latter.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup
  2019-03-28 10:35           ` Marc Zyngier
@ 2019-03-28 10:36             ` Rafael J. Wysocki
  2019-03-28 11:55             ` Aisheng Dong
  1 sibling, 0 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2019-03-28 10:36 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Leonard Crestez, l.stach, Abel Vesa, Jacky Bai, dl-linux-imx,
	linux-kernel, Aisheng Dong, linux-pm, lorenzo.pieralisi,
	Fabio Estevam, mark.rutland, catalin.marinas, will.deacon, robh,
	shawnguo, linux-arm-kernel, sudeep.holla, Anson Huang, kernel

On Thursday, March 28, 2019 11:35:23 AM CET Marc Zyngier wrote:
> On 27/03/2019 18:40, Leonard Crestez wrote:
> > On Wed, 2019-03-27 at 17:45 +0000, Marc Zyngier wrote:
> >> On 27/03/2019 16:06, Lucas Stach wrote:
> >>> On Wednesday, 27.03.2019, at 15:57 +0000, Marc Zyngier wrote:
> >>>> On 27/03/2019 15:44, Lucas Stach wrote:
> >>>>> On Wednesday, 27.03.2019, at 13:21 +0000, Abel Vesa wrote:
> >>>>>> This work is a workaround I'm looking into (more as a background task)
> >>>>>> in order to add support for cpuidle on i.MX8MQ based platforms.
> >>>>>>
> >>>>>> The main idea here is getting around the missing GIC wake_request signal
> >>>>>> (due to an integration design issue) by waking up each individual core through
> >>>>>> some dedicated SW power-up bits inside the power controller (GPC) right before
> >>>>>> every IPI is requested for that individual core.
> >>>>>
> >>>>> Just a general comment, without going into the details of this series:
> >>>>> this issue is not only affecting IPIs, but also MSIs terminated at the
> >>>>> GIC. Currently MSIs are terminated at the PCIe core, but terminating
> >>>>> them at the GIC is clearly preferable, as this allows assigning CPU
> >>>>> affinity to individual MSIs and lowers IRQ service overhead.
> >>>>>
> >>>>> I'm not sure what the consequences are for upstream Linux support yet,
> >>>>> but we should keep in mind that having a workaround for IPIs is only
> >>>>> solving part of the issue.
> >>>>
> >>>> If this erratum is affecting more than just IPIs, then indeed I don't
> >>>> see how this patch series solves anything.
> >>>>
> >>>> But the erratum documentation seems to imply that only SGIs are
> >>>> affected, and goes as far as suggesting that using an external interrupt
> >>>> would solve it. How come this is not the case? Or is it that anything
> >>>> directly routed to a redistributor is also affected? This would break
> >>>> LPIs (and thus MSIs) and PPIs (the CPU timer, among others).
> >>>
> >>> Anything that isn't visible to the GPC and requires the GIC
> >>> wake_request signal to behave as specified is broken by this erratum.
> >>
> >> I really wonder how a timer interrupt (a PPI, hence not routed through
> >> the GPC) can wake up the CPU in this case. It really feels like
> >> something like "program CNTV_CVAL_EL0 to expire at some later point;
> >> WFI" could result in the CPU going to a deep sleep state, and not
> >> wake-up at all.
> > 
> > This is already a common issue for cpuidle implementations handled by the
> > "local-timer-stop" property. imx has other timer blocks in the SOC,
> > they generate SPIs which are connected to GPC.
> > 
> >> This would indicate that not only cpuidle is broken with this, but
> >> absolutely every interrupt that is not routed through the GPC.
> > 
> > Yes, cpuidle is broken for irqs not routed through GPC. However:
> > 
> > * All SPIs are connected to GPC in a 1:1 mapping
> > * This series deals with SGIs
> > * The timer PPIs are not required; covered by local-timer-stop
> > * LPIs are currently unused (I understand imx-pci uses SPI by default
> > from Lucas)
> > 
> > Anything missing?
> > 
> > My understanding is that this wake request feature via GIC is new in v3
> > and this is maybe why HW team missed it during integration. Older
> > imx6/7 has GICv2 and has deep idle states which always rely on GPC to
> > wakeup so the approach can work.
> 
> Certainly the approach can work. The question is whether we want to
> support this in a mainline kernel, spreading random hooks in the generic
> code and adding a firmware interface on top of that.

Not really.

> By all accounts, this HW is broken. You can indeed impose limitations
> (dumb down PCI, mandate the use of a broadcast timer), or you can just
> flag cpuidle as unsupported on this HW. My vote is on the latter.

Agreed.

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup
  2019-03-27 18:40         ` Leonard Crestez
  2019-03-28 10:35           ` Marc Zyngier
@ 2019-03-28 10:45           ` Lorenzo Pieralisi
  2019-11-06 20:14             ` Florian Fainelli
  1 sibling, 1 reply; 28+ messages in thread
From: Lorenzo Pieralisi @ 2019-03-28 10:45 UTC (permalink / raw)
  To: Leonard Crestez
  Cc: l.stach, marc.zyngier, Abel Vesa, Jacky Bai, dl-linux-imx,
	linux-kernel, Aisheng Dong, linux-pm, Fabio Estevam,
	mark.rutland, rjw, catalin.marinas, will.deacon, robh, shawnguo,
	linux-arm-kernel, sudeep.holla, Anson Huang, kernel

On Wed, Mar 27, 2019 at 06:40:07PM +0000, Leonard Crestez wrote:
> On Wed, 2019-03-27 at 17:45 +0000, Marc Zyngier wrote:
> > On 27/03/2019 16:06, Lucas Stach wrote:
> > > On Wednesday, 27.03.2019, at 15:57 +0000, Marc Zyngier wrote:
> > > > On 27/03/2019 15:44, Lucas Stach wrote:
> > > > > On Wednesday, 27.03.2019, at 13:21 +0000, Abel Vesa wrote:
> > > > > > This work is a workaround I'm looking into (more as a background task)
> > > > > > in order to add support for cpuidle on i.MX8MQ based platforms.
> > > > > >
> > > > > > The main idea here is getting around the missing GIC wake_request signal
> > > > > > (due to an integration design issue) by waking up each individual core through
> > > > > > some dedicated SW power-up bits inside the power controller (GPC) right before
> > > > > > every IPI is requested for that individual core.
> > > > > 
> > > > > Just a general comment, without going into the details of this series:
> > > > > this issue is not only affecting IPIs, but also MSIs terminated at the
> > > > > GIC. Currently MSIs are terminated at the PCIe core, but terminating
> > > > > them at the GIC is clearly preferable, as this allows assigning CPU
> > > > > affinity to individual MSIs and lowers IRQ service overhead.
> > > > > 
> > > > > I'm not sure what the consequences are for upstream Linux support yet,
> > > > > but we should keep in mind that having a workaround for IPIs is only
> > > > > solving part of the issue.
> > > > 
> > > > If this erratum is affecting more than just IPIs, then indeed I don't
> > > > see how this patch series solves anything.
> > > > 
> > > > But the erratum documentation seems to imply that only SGIs are
> > > > affected, and goes as far as suggesting that using an external interrupt
> > > > would solve it. How come this is not the case? Or is it that anything
> > > > directly routed to a redistributor is also affected? This would break
> > > > LPIs (and thus MSIs) and PPIs (the CPU timer, among others).
> > > 
> > > Anything that isn't visible to the GPC and requires the GIC
> > > wake_request signal to behave as specified is broken by this erratum.
> > 
> > I really wonder how a timer interrupt (a PPI, hence not routed through
> > the GPC) can wake up the CPU in this case. It really feels like
> > something like "program CNTV_CVAL_EL0 to expire at some later point;
> > WFI" could result in the CPU going to a deep sleep state, and not
> > wake-up at all.
> 
> This is already a common issue for cpuidle implementations handled by the
> "local-timer-stop" property. imx has other timer blocks in the SOC,
> they generate SPIs which are connected to GPC.

It is not a common issue. The tick-broadcast mechanism relies on
IPIs that are sent to specific CPUs upon timer expiry.

If IPIs don't work for CPUs in a shutdown state (which is what this patch
is fixing, AFAIU), the only way I can see a CPU resuming from idle on a
timer expiry is the GPC waking up all cores upon the global timer SPI.
If that's the case, there is precious little point in implementing
CPUidle at all - too bad people worked hard to implement NOHZ in a
power-efficient manner.
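
To make the dependency explicit, here is a rough behavioural model in
Python (not kernel code). The names Cpu, broadcast_expiry and
gpc_wake_all are made up for illustration, and the model assumes a
single broadcast timer whose SPI is handled by a core that is still
running.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cpu:
    ident: int
    powered_down: bool = False        # core is in a deep idle state, local timer stopped
    next_event: Optional[int] = None  # time at which this core asked to be woken


def broadcast_expiry(cpus: List[Cpu], now: int, ipi_wakes_core: bool) -> List[int]:
    """Model one expiry of the shared broadcast timer, handled by a running core.

    The handler sends an IPI to every sleeping core whose event is due.
    Returns the ids of the cores that actually resumed.
    """
    resumed = []
    for cpu in cpus:
        due = cpu.next_event is not None and cpu.next_event <= now
        # The IPI only helps if the hardware can wake a powered-down core.
        if cpu.powered_down and due and ipi_wakes_core:
            cpu.powered_down = False
            resumed.append(cpu.ident)
    return resumed


def gpc_wake_all(cpus: List[Cpu]) -> List[int]:
    """Fallback model: the GPC powers up every sleeping core on the timer SPI."""
    resumed = [cpu.ident for cpu in cpus if cpu.powered_down]
    for cpu in cpus:
        cpu.powered_down = False
    return resumed
```

With working IPIs only the core whose event is due resumes. Without
them nothing resumes unless every sleeping core is woken, which is the
power cost described above.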

> > This would indicate that not only cpuidle is broken with this, but
> > absolutely every interrupt that is not routed through the GPC.
> 
> Yes, cpuidle is broken for irqs not routed through GPC. However:
> 
> * All SPIs are connected to GPC in a 1:1 mapping
> * This series deals with SGIs
> * The timer PPIs are not required; covered by local-timer-stop
> * LPIs are currently unused (I understand imx-pci uses SPI by default
> from Lucas)
> 
> Anything missing?

Yes, LPIs must be able to wake up CPUs and only the CPU for which
an IRQ is actually pending.

From an architectural perspective, an ARM core executing the WFI
instruction must resume execution upon an IRQ occurrence targeted
at it and that's true regardless of the idle state entered.

Anything deviating from this behaviour is not architecture compliant.

> My understanding is that this wake request feature via GIC is new in v3
> and this is maybe why HW team missed it during integration. Older
> imx6/7 has GICv2 and has deep idle states which always rely on GPC to
> wakeup so the approach can work.

If the HW designers really wanted a sensible power management policy
in this SoC, they would have paid attention. I am against patching the
kernel heavily to fix a platform bug.

Thanks,
Lorenzo

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup
  2019-03-27 18:13         ` Marc Zyngier
@ 2019-03-28 11:21           ` Aisheng Dong
  2019-03-29  9:11             ` Richard Zhu
  0 siblings, 1 reply; 28+ messages in thread
From: Aisheng Dong @ 2019-03-28 11:21 UTC (permalink / raw)
  To: Marc Zyngier, Leonard Crestez, l.stach, Richard Zhu, Jacky Bai
  Cc: Fabio Estevam, Cosmin Samoila, Robin Gong, Mircea Pop,
	Daniel Baluta, catalin.marinas, shawnguo, Robert Chiras,
	Anson Huang, Jun Li, Abel Vesa, robh, Zening Wang, dl-linux-imx,
	BOUGH CHEN, Horia Geanta, Peter Chen, Joakim Zhang, rjw,
	Leo Zhang, Shenwei Wang, linux-pm, linux-arm-kernel,
	Ranjani Vaidyanathan, Han Xu, will.deacon, Iuliana Prodan,
	sudeep.holla, lorenzo.pieralisi, linux-kernel, mark.rutland,
	Peng Fan, kernel, Viorel Suman

> From: Marc Zyngier [mailto:marc.zyngier@arm.com]
> Sent: Thursday, March 28, 2019 2:13 AM
> On 27/03/2019 17:00, Leonard Crestez wrote:
> > On Wed, 2019-03-27 at 17:06 +0100, Lucas Stach wrote:
> >> On Wednesday, 27.03.2019, at 15:57 +0000, Marc Zyngier wrote:
> >>> On 27/03/2019 15:44, Lucas Stach wrote:
> >>>> On Wednesday, 27.03.2019, at 13:21 +0000, Abel Vesa wrote:
> >>>>> This work is a workaround I'm looking into (more as a background
> >>>>> task) in order to add support for cpuidle on i.MX8MQ based platforms.
> >>>>>
> >>>>> The main idea here is getting around the missing GIC wake_request
> >>>>> signal (due to an integration design issue) by waking up each
> >>>>> individual core through some dedicated SW power-up bits inside the
> >>>>> power controller (GPC) right before every IPI is requested for that
> >>>>> individual core.
> >>>>
> >>>> Just a general comment, without going into the details of this series:
> >>>> this issue is not only affecting IPIs, but also MSIs terminated at
> >>>> the GIC. Currently MSIs are terminated at the PCIe core, but
> >>>> terminating them at the GIC is clearly preferable, as this allows
> >>>> assigning CPU affinity to individual MSIs and lowers IRQ service overhead.
> >>>>
> >>>> I'm not sure what the consequences are for upstream Linux support
> >>>> yet, but we should keep in mind that having a workaround for IPIs
> >>>> is only solving part of the issue.
> >>>
> >>> If this erratum is affecting more than just IPIs, then indeed I
> >>> don't see how this patch series solves anything.
> >>>
> >>> But the erratum documentation seems to imply that only SGIs are
> >>> affected, and goes as far as suggesting that using an external interrupt
> >>> would solve it. How come this is not the case? Or is it that
> >>> anything directly routed to a redistributor is also affected? This
> >>> would break LPIs (and thus MSIs) and PPIs (the CPU timer, among others).
> >>>
> >>> What is the *exact* status of this thing? I have the ugly feeling
> >>> that the true workaround is just to disable cpuidle.
> >>
> >> As far as I understand the erratum, the basic issue is that the GIC
> >> wake_request signals are not connected to the GPC (the CPU/peripheral
> >> power sequencer). The SPIs are routed through the GPC and thus are
> >> visible as wakeup sources, which is why the workaround of using an
> >> external SPI as wakeup trigger for the IPI works.
> >
> > We had a kernel workaround for IPIs in our internal tree for a long
> > time and I don't think we do anything special for PCI. Does PCI MSI
> > really bypass the GPC on 8mq?
> 
> If you have an ITS, certainly. If you don't, it depends. MSIs can hit the
> distributor's MBI registers and generate non-wired SPIs, which I assume will
> bypass the GPC altogether.
> 

Richard & Jacky,

Can you double-check whether this issue affects PCI MSI functionality?

Regards
Dong Aisheng

> > Adding Richard/Jacky, they might know about this.
> >
> > This seems like something of a corner case to me, don't many imx
> > boards ship without PCI; especially for low-power scenarios? If
> > required it might be reasonable to add an additional workaround to
> > disable all cpuidle if pci msis are used.
> 
> Establishing a link between cpuidle and PCI in the kernel would be pretty
> invasive, and that would come on top of what this series also mandates.
> 
> At that level of apparent brokenness, it is far safer to get cpuidle out of the
> picture altogether, and I'd rather see these patches in a vendor tree (for once).
> 
> Thanks,
> 
> 	M.
> --
> Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup
  2019-03-27 17:55         ` Lucas Stach
@ 2019-03-28 11:27           ` Aisheng Dong
  0 siblings, 0 replies; 28+ messages in thread
From: Aisheng Dong @ 2019-03-28 11:27 UTC (permalink / raw)
  To: Lucas Stach, Marc Zyngier, Abel Vesa, Sudeep Holla, Rob Herring,
	Mark Rutland, Shawn Guo, Sascha Hauer, catalin.marinas,
	Will Deacon, Rafael J. Wysocki, Lorenzo Pieralisi, Fabio Estevam
  Cc: dl-linux-imx, linux-arm-kernel, Linux Kernel Mailing List, linux-pm

[...]
> > > Anything that isn't visible to the GPC and requires the GIC
> > > wake_request signal to behave as specified is broken by this erratum.
> >
> > I really wonder how a timer interrupt (a PPI, hence not routed through
> > the GPC) can wake up the CPU in this case. It really feels like
> > something like "program CNTV_CVAL_EL0 to expire at some later point;
> > WFI" could result in the CPU going to a deep sleep state, and not
> > wake-up at all.
> 
> I guess it's broken in the same way. The downstream DT claims
> "local-timer-stop" for the CPU sleep state and "arm,no-tick-in-suspend"
> for the armv8-timer, which I guess is not the timer actually stopping in suspend,
> but the CPU being unable to wake up due to the timer IRQ.
> 
> > This would indicate that not only cpuidle is broken with this, but
> > absolutely every interrupt that is not routed through the GPC.
> 
> That's my understanding as well. Note that I have no NXP internal information
> and can only infer from the published reference manual, errata notice and
> downstream kernel.
> 

We will double check it.
Thanks for the information.

Regards
Dong Aisheng

> Regards,
> Lucas

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup
  2019-03-28 10:35           ` Marc Zyngier
  2019-03-28 10:36             ` Rafael J. Wysocki
@ 2019-03-28 11:55             ` Aisheng Dong
  1 sibling, 0 replies; 28+ messages in thread
From: Aisheng Dong @ 2019-03-28 11:55 UTC (permalink / raw)
  To: Marc Zyngier, Leonard Crestez, l.stach, Abel Vesa, Jacky Bai,
	Rafael J. Wysocki, Lorenzo Pieralisi
  Cc: dl-linux-imx, linux-kernel, linux-pm, lorenzo.pieralisi,
	Fabio Estevam, mark.rutland, rjw, catalin.marinas, will.deacon,
	robh, shawnguo, linux-arm-kernel, sudeep.holla, Anson Huang,
	kernel

[...]

> > * All SPIs are connected to GPC in a 1:1 mapping
> > * This series deals with SGIs
> > * The timer PPIs are not required; covered by local-timer-stop
> > * LPIs are currently unused (I understand imx-pci uses SPI by default
> > from Lucas)
> >
> > Anything missing?
> >
> > My understanding is that this wake request feature via GIC is new in
> > v3 and this is maybe why HW team missed it during integration. Older
> > imx6/7 has GICv2 and has deep idle states which always rely on GPC to
> > wakeup so the approach can work.
> 
> Certainly the approach can work. The question is whether we want to support
> this in a mainline kernel, spreading random hooks in the generic code and
> adding a firmware interface on top of that.
> 
> By all accounts, this HW is broken. You can indeed impose limitations (dumb
> down PCI, mandate the use of a broadcast timer), or you can just flag cpuidle
> as unsupported on this HW. My vote is on the latter.
> 

Hi Marc, Rafael, Lorenzo 

Thanks for the suggestion. I fully understand the concern.
Do you think we can patch the platform code to address the issue and
avoid the big churn in kernel core code?

If yes, we could investigate whether there is a suitable place to do that.
The main thing we need to do seems to be manually waking up the CPU core
in the IPI send path so that it can exit idle. We could see if there is a
chance to do it on that path.
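
As a rough sketch of the ordering we have in mind (a Python model
only, not the actual platform code; send_ipi and the "poke"/"sgi"
event names are hypothetical), the send path would poke a power-gated
core through the GPC before raising the SGI for it:

```python
from typing import Iterable, List, Set, Tuple

Event = Tuple[str, int]


def send_ipi(targets: Iterable[int], deep_idle: Set[int]) -> List[Event]:
    """Return the ordered hardware actions for one IPI request.

    `deep_idle` is the set of cores currently power gated. Both event
    kinds are placeholders for real register writes.
    """
    events: List[Event] = []
    for cpu in sorted(targets):
        if cpu in deep_idle:
            # Hypothetical platform hook: set the SW power-up bit in the GPC
            # first, so the SGI is not raised at a core that cannot see it.
            events.append(("poke", cpu))
        events.append(("sgi", cpu))
    return events
```

For example, send_ipi({1, 2}, {2}) pokes only core 2, and does so
before its SGI. Cores that are not power gated take the normal path.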

Regards
Dong Aisheng

> Thanks,
> 
> 	M.
> --
> Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup
  2019-03-28 11:21           ` Aisheng Dong
@ 2019-03-29  9:11             ` Richard Zhu
  0 siblings, 0 replies; 28+ messages in thread
From: Richard Zhu @ 2019-03-29  9:11 UTC (permalink / raw)
  To: Aisheng Dong, Marc Zyngier, Leonard Crestez, l.stach, Jacky Bai
  Cc: Fabio Estevam, Cosmin Samoila, Robin Gong, Mircea Pop,
	Daniel Baluta, catalin.marinas, shawnguo, Robert Chiras,
	Anson Huang, Jun Li, Abel Vesa, robh, Zening Wang, dl-linux-imx,
	BOUGH CHEN, Horia Geanta, Peter Chen, Joakim Zhang, rjw,
	Leo Zhang, Shenwei Wang, linux-pm, linux-arm-kernel,
	Ranjani Vaidyanathan, Han Xu, will.deacon, Iuliana Prodan,
	sudeep.holla, lorenzo.pieralisi, linux-kernel, mark.rutland,
	Peng Fan, kernel, Viorel Suman



> -----Original Message-----
> From: Aisheng Dong
> Sent: March 28, 2019 19:21
> To: Marc Zyngier <marc.zyngier@arm.com>; Leonard Crestez
> <leonard.crestez@nxp.com>; l.stach@pengutronix.de; Richard Zhu
> <hongxing.zhu@nxp.com>; Jacky Bai <ping.bai@nxp.com>
> Cc: Fabio Estevam <fabio.estevam@nxp.com>; Cosmin Samoila
> <cosmin.samoila@nxp.com>; Robin Gong <yibin.gong@nxp.com>; Mircea Pop
> <mircea.pop@nxp.com>; Daniel Baluta <daniel.baluta@nxp.com>;
> catalin.marinas@arm.com; shawnguo@kernel.org; Robert Chiras
> <robert.chiras@nxp.com>; Anson Huang <anson.huang@nxp.com>; Jun Li
> <jun.li@nxp.com>; Abel Vesa <abel.vesa@nxp.com>; robh@kernel.org;
> Zening Wang <zening.wang@nxp.com>; dl-linux-imx <linux-imx@nxp.com>;
> BOUGH CHEN <haibo.chen@nxp.com>; Horia Geanta
> <horia.geanta@nxp.com>; Peter Chen <peter.chen@nxp.com>; Joakim Zhang
> <qiangqing.zhang@nxp.com>; rjw@rjwysocki.net; Leo Zhang
> <leo.zhang@nxp.com>; Shenwei Wang <shenwei.wang@nxp.com>;
> linux-pm@vger.kernel.org; linux-arm-kernel@lists.infradead.org; Ranjani
> Vaidyanathan <ranjani.vaidyanathan@nxp.com>; Han Xu <han.xu@nxp.com>;
> will.deacon@arm.com; Iuliana Prodan <iuliana.prodan@nxp.com>;
> sudeep.holla@arm.com; lorenzo.pieralisi@arm.com;
> linux-kernel@vger.kernel.org; mark.rutland@arm.com; Peng Fan
> <peng.fan@nxp.com>; kernel@pengutronix.de; Viorel Suman
> <viorel.suman@nxp.com>
> Subject: RE: [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI
> wakeup
> 
> > From: Marc Zyngier [mailto:marc.zyngier@arm.com]
> > Sent: Thursday, March 28, 2019 2:13 AM
> > On 27/03/2019 17:00, Leonard Crestez wrote:
> > > On Wed, 2019-03-27 at 17:06 +0100, Lucas Stach wrote:
> > >> On Wednesday, 27.03.2019, at 15:57 +0000, Marc Zyngier wrote:
> > >>> On 27/03/2019 15:44, Lucas Stach wrote:
> > >>>> On Wednesday, 27.03.2019, at 13:21 +0000, Abel Vesa wrote:
> > >>>>> This work is a workaround I'm looking into (more as a background
> > >>>>> task) in order to add support for cpuidle on i.MX8MQ based
> > >>>>> platforms.
> > >>>>>
> > >>>>> The main idea here is getting around the missing GIC
> > >>>>> wake_request signal (due to an integration design issue) by waking
> > >>>>> up each individual core through some dedicated SW power-up
> > >>>>> bits inside the power controller (GPC) right before every IPI is
> > >>>>> requested for that individual core.
> > >>>>
> > >>>> Just a general comment, without going into the details of this series:
> > >>>> this issue is not only affecting IPIs, but also MSIs terminated
> > >>>> at the GIC. Currently MSIs are terminated at the PCIe core, but
> > >>>> terminating them at the GIC is clearly preferable, as this allows
> > >>>> assigning CPU affinity to individual MSIs and lowers IRQ service
> > >>>> overhead.
> > >>>>
> > >>>> I'm not sure what the consequences are for upstream Linux support
> > >>>> yet, but we should keep in mind that having a workaround for IPIs
> > >>>> is only solving part of the issue.
> > >>>
> > >>> If this erratum is affecting more than just IPIs, then indeed I
> > >>> don't see how this patch series solves anything.
> > >>>
> > >>> But the erratum documentation seems to imply that only SGIs are
> > >>> affected, and goes as far as suggesting that using an external
> > >>> interrupt would solve it. How come this is not the case? Or is it
> > >>> that anything directly routed to a redistributor is also affected?
> > >>> This would break LPIs (and thus MSIs) and PPIs (the CPU timer, among
> > >>> others).
> > >>>
> > >>> What is the *exact* status of this thing? I have the ugly feeling
> > >>> that the true workaround is just to disable cpuidle.
> > >>
> > >> As far as I understand the erratum, the basic issue is that the GIC
> > >> wake_request signals are not connected to the GPC (the
> > >> CPU/peripheral power sequencer). The SPIs are routed through the
> > >> GPC and thus are visible as wakeup sources, which is why the
> > >> workaround of using an external SPI as wakeup trigger for the IPI works.
> > >
> > > We had a kernel workaround for IPIs in our internal tree for a long
> > > time and I don't think we do anything special for PCI. Does PCI MSI
> > > really bypass the GPC on 8mq?
> >
> > If you have an ITS, certainly. If you don't, it depends. MSIs can hit
> > the distributor's MBI registers and generate non-wired SPIs, which I
> > assume will bypass the GPC altogether.
> >
> 
> Richard & Jacky,
> 
> Can you double-check whether this issue affects PCI MSI functionality?
> 
[Richard Zhu] GICv3 has the ITS/LPI features, which can be used for PCIe MSIs.
BTW, the PCIe MSI ITS mode is not enabled in the vendor tree.

Best Regards
Richard Zhu

> Regards
> Dong Aisheng
> 
> > > Adding Richard/Jacky, they might know about this.
> > >
> > > This seems like something of a corner case to me, don't many imx
> > > boards ship without PCI; especially for low-power scenarios? If
> > > required it might be reasonable to add an additional workaround to
> > > disable all cpuidle if pci msis are used.
> >
> > Establishing a link between cpuidle and PCI in the kernel would be
> > pretty invasive, and that would come on top of what this series also
> > mandates.
> >
> > At that level of apparent brokenness, it is far safer to get cpuidle
> > out of the picture altogether, and I'd rather see these patches in a vendor
> > tree (for once).
> >
> > Thanks,
> >
> > 	M.
> > --
> > Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup
  2019-03-28 10:45           ` Lorenzo Pieralisi
@ 2019-11-06 20:14             ` Florian Fainelli
  2019-11-06 21:31               ` Leonard Crestez
  0 siblings, 1 reply; 28+ messages in thread
From: Florian Fainelli @ 2019-11-06 20:14 UTC (permalink / raw)
  To: Lorenzo Pieralisi, Leonard Crestez
  Cc: Aisheng Dong, mark.rutland, Jacky Bai, Anson Huang, linux-pm,
	marc.zyngier, catalin.marinas, rjw, linux-kernel, will.deacon,
	dl-linux-imx, kernel, sudeep.holla, Fabio Estevam, l.stach,
	shawnguo, robh, linux-arm-kernel, Abel Vesa

On 3/28/19 3:45 AM, Lorenzo Pieralisi wrote:
> On Wed, Mar 27, 2019 at 06:40:07PM +0000, Leonard Crestez wrote:
>> On Wed, 2019-03-27 at 17:45 +0000, Marc Zyngier wrote:
>>> On 27/03/2019 16:06, Lucas Stach wrote:
>>>> On Wednesday, 27.03.2019, at 15:57 +0000, Marc Zyngier wrote:
>>>>> On 27/03/2019 15:44, Lucas Stach wrote:
>>>>>> On Wednesday, 27.03.2019, at 13:21 +0000, Abel Vesa wrote:
>>>>>>> This work is a workaround I'm looking into (more as a background task)
>>>>>>> in order to add support for cpuidle on i.MX8MQ based platforms.
>>>>>>>
>>>>>>> The main idea here is getting around the missing GIC wake_request signal
>>>>>>> (due to an integration design issue) by waking up each individual core through
>>>>>>> some dedicated SW power-up bits inside the power controller (GPC) right before
>>>>>>> every IPI is requested for that individual core.
>>>>>>
>>>>>> Just a general comment, without going into the details of this series:
>>>>>> this issue is not only affecting IPIs, but also MSIs terminated at the
>>>>>> GIC. Currently MSIs are terminated at the PCIe core, but terminating
>>>>>> them at the GIC is clearly preferable, as this allows assigning CPU
>>>>>> affinity to individual MSIs and lowers IRQ service overhead.
>>>>>>
>>>>>> I'm not sure what the consequences are for upstream Linux support yet,
>>>>>> but we should keep in mind that having a workaround for IPIs is only
>>>>>> solving part of the issue.
>>>>>
>>>>> If this erratum is affecting more than just IPIs, then indeed I don't
>>>>> see how this patch series solves anything.
>>>>>
>>>>> But the erratum documentation seems to imply that only SGIs are
>>>>> affected, and goes as far as suggesting that using an external interrupt
>>>>> would solve it. How come this is not the case? Or is it that anything
>>>>> directly routed to a redistributor is also affected? This would break
>>>>> LPIs (and thus MSIs) and PPIs (the CPU timer, among others).
>>>>
>>>> Anything that isn't visible to the GPC and requires the GIC
>>>> wake_request signal to behave as specified is broken by this erratum.
>>>
>>> I really wonder how a timer interrupt (a PPI, hence not routed through
>>> the GPC) can wake up the CPU in this case. It really feels like
>>> something like "program CNTV_CVAL_EL0 to expire at some later point;
>>> WFI" could result in the CPU going to a deep sleep state, and not
>>> wake-up at all.
>>
>> This is already a common issue for cpuidle implementations handled by the
>> "local-timer-stop" property. imx has other timer blocks in the SOC,
>> they generate SPIs which are connected to GPC.
> 
> It is not a common issue. The tick-broadcast mechanism relies on
> IPIs that are sent to specific CPUs upon timer expiry.
> 
> If IPIs don't work for CPUs in a shutdown state (which is what this patch
> is fixing, AFAIU), the only way I can see a CPU resuming from idle on a
> timer expiry is the GPC waking up all cores upon the global timer SPI.
> If that's the case, there is precious little point in implementing
> CPUidle at all - too bad people worked hard to implement NOHZ in a
> power-efficient manner.
> 
>>> This would indicate that not only cpuidle is broken with this, but
>>> absolutely every interrupt that is not routed through the GPC.
>>
>> Yes, cpuidle is broken for irqs not routed through GPC. However:
>>
>> * All SPIs are connected to GPC in a 1:1 mapping
>> * This series deals with SGIs
>> * The timer PPIs are not required; covered by local-timer-stop
>> * LPIs are currently unused (I understand imx-pci uses SPI by default
>> from Lucas)
>>
>> Anything missing?
> 
> Yes, LPIs must be able to wake up CPUs and only the CPU for which
> an IRQ is actually pending.
> 
> From an architectural perspective, an ARM core executing the WFI
> instruction must resume execution upon an IRQ occurrence targeted
> at it and that's true regardless of the idle state entered.
> 
> Anything deviating from this behaviour is not architecture compliant.

What if you enter a deeper state than WFI, which leads to the power
gating of your CPU core, and you are missing the necessary hardware that
should be driven from the GIC's nIRQOUT/nFIQOUT signals to automatically
bring the core back on upon the GIC seeing a pending interrupt targeting
that core?

Would it be acceptable in that case to "help" the platform by ensuring
that at least one core is never allowed to enter the deepest idle state,
so that it can wake the others back up? I am asking because I am facing
an issue similar to the one Abel is trying to solve here, on
ARCH_BRCMSTB platforms whose CPU cores cannot wake up on their own once
they are power gated.
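
To illustrate the kind of policy I mean, here is a small Python model
(not a proposed patch; pick_idle_state and its arguments are names I
made up, and it ignores the races a real governor would have to
handle). A core is refused the deepest state when every other core is
already in it:

```python
from typing import Dict


def pick_idle_state(cpu: int, requested: int, current: Dict[int, int], deepest: int) -> int:
    """Cap `requested` so that at least one core stays out of `deepest`.

    `current` maps each core to the idle state index it is in
    (0 means running, larger means deeper). The entry for `cpu`
    itself, if present, is ignored.
    """
    others_available = any(
        state < deepest for other, state in current.items() if other != cpu
    )
    if requested >= deepest and not others_available:
        # This would be the last core able to wake the others: keep it shallow.
        return deepest - 1
    return min(requested, deepest)
```

With deepest=2 and the other two cores already at state 2, a third
core asking for state 2 is capped to state 1. If any other core is
shallower, the request is granted as is.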

> 
>> My understanding is that this wake request feature via GIC is new in v3
>> and this is maybe why HW team missed it during integration. Older
>> imx6/7 has GICv2 and has deep idle states which always rely on GPC to
>> wakeup so the approach can work.
> 
> If the HW designers really wanted a sensible power management policy
> in this SoC, they would have paid attention. I am against patching the
> kernel heavily to fix a platform bug.

HW designers may not be aware of how the cpuidle framework operates or
what its constraints are, so they may not understand that any interrupt
must be able to autonomously (for lack of a better word) wake up a
given core, whatever idle state it has entered.
-- 
Florian

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup
  2019-11-06 20:14             ` Florian Fainelli
@ 2019-11-06 21:31               ` Leonard Crestez
  2019-11-06 22:10                 ` Florian Fainelli
  0 siblings, 1 reply; 28+ messages in thread
From: Leonard Crestez @ 2019-11-06 21:31 UTC (permalink / raw)
  To: Florian Fainelli, Abel Vesa
  Cc: Lorenzo Pieralisi, Aisheng Dong, mark.rutland, Jacky Bai,
	Anson Huang, linux-pm, marc.zyngier, catalin.marinas, rjw,
	linux-kernel, will.deacon, dl-linux-imx, kernel, sudeep.holla,
	Fabio Estevam, l.stach, shawnguo, robh, linux-arm-kernel

On 06.11.2019 22:15, Florian Fainelli wrote:
> On 3/28/19 3:45 AM, Lorenzo Pieralisi wrote:
>> On Wed, Mar 27, 2019 at 06:40:07PM +0000, Leonard Crestez wrote:
>>> On Wed, 2019-03-27 at 17:45 +0000, Marc Zyngier wrote:
>>>> On 27/03/2019 16:06, Lucas Stach wrote:
>>>>> Am Mittwoch, den 27.03.2019, 15:57 +0000 schrieb Marc Zyngier:
>>>>>> On 27/03/2019 15:44, Lucas Stach wrote:
>>>>>>> Am Mittwoch, den 27.03.2019, 13:21 +0000 schrieb Abel Vesa:
>>>>>>>> This work is a workaround I'm looking into (more as a background task)
>>>>>>>> in order to add support for cpuidle on i.MX8MQ based platforms.
>>>>>>>>
>>>>>>>> The main idea here is getting around the missing GIC wake_request signal
>>>>>>>> (due to integration design issue) by waking up a each individual core through
>>>>>>>> some dedicated SW power-up bits inside the power controller (GPC) right before
>>>>>>>> every IPI is requested for that each individual core.
>>>>>>>
>>>>>>> Just a general comment, without going into the details of this series:
>>>>>>> this issue is not only affecting IPIs, but also MSIs terminated at the
>>>>>>> GIC. Currently MSIs are terminated at the PCIe core, but terminating
>>>>>>> them at the GIC is clearly preferable, as this allows assigning CPU
>>>>>>> affinity to individual MSIs and lowers IRQ service overhead.
>>>>>>>
>>>>>>> I'm not sure what the consequences are for upstream Linux support yet,
>>>>>>> but we should keep in mind that having a workaround for IPIs is only
>>>>>>> solving part of the issue.
>>>>>>
>>>>>> If this erratum is affecting more than just IPIs, then indeed I don't
>>>>>> see how this patch series solves anything.
>>>>>>
>>>>>> But the erratum documentation seems to imply that only SGIs are
>>>>>> affected, and goes as far as suggesting to use an external interrupt
>>>>>> would solve it. How comes this is not the case? Or is it that anything
>>>>>> directly routed to a redistributor is also affected? This would break
>>>>>> LPIs (and thus MSIs) and PPIs (the CPU timer, among others).
>>>>>
>>>>> Anything that isn't visible to the GPC and requires the GIC
>>>>> wake_request signal to behave as specified is broken by this erratum.
>>>>
>>>> I really wonder how a timer interrupt (a PPI, hence not routed through
>>>> the GPC) can wake up the CPU in this case. It really feels like
>>>> something like "program CNTV_CVAL_EL0 to expire at some later point;
>>>> WFI" could result in the CPU going to a deep sleep state, and not
>>>> wake-up at all.
>>>
>>> This is already a common issue for cpuidle implementions handled by the
>>> "local-timer-stop" property. imx has other timer blocks in the SOC,
>>> they generate SPIs which are connected to GPC.
>>
>> It is not a common issue. The tick-broadcast mechanism relies on
>> IPIs that are sent to specific CPUs upon timer expiry.
>>
>> If IPIs don't work for CPUs in shutdown state (which is what this patch
>> is fixing AFAIU), the only reason I can see how a CPU can resume from
>> idle on a timer expiry is the GPC waking up all cores upon the global
>> timer SPI; if that's the case there is precious little point in
>> implementing CPUidle at all - too bad people worked hard to implement
>> NOHZ in a power efficient manner.
>>
>>>> This would indicate that not only cpuidle is broken with this, but
>>>> absolutely every interrupt that is not routed through the GPC.
>>>
>>> Yes, cpuidle is broken for irqs not routed through GPC. However:
>>>
>>> * All SPIs are connected to GPC in a 1:1 mapping
>>> * This series deals with SGIs
>>> * The timer PPIs are not required; covered by local-timer-stop
>>> * LPIs are currently unused (I understand imx-pci uses SPI by default
>>> from Lucas)
>>>
>>> Anything missing?
>>
>> Yes, LPIs must be able to wake up CPUs and only the CPU for which
>> an IRQ is actually pending.
>>
>> From an architectural perspective, an ARM core executing the WFI
>> instruction must resume execution upon an IRQ occurrence targeted
>> at it and that's true regardless of the idle state entered.
>>
>> Anything deviating from this behaviour is not architecture compliant.
> 
> What if you enter a deeper state than WFI, which leads to the power
> gating of your CPU core, and you are missing the necessary hardware that
> should be driven from the GIC's nIRQOUT/nFIQOUT signals to automatically
> bring the core back on upon the GIC seeing a pending interrupt targeting
> that core?

imx8mq has a secondary "GPC" block which receives SPIs and can wake the 
cores. Do you have something similar? Because if you only have the GIC 
then that sounds much worse: you'd have to ensure that all peripheral 
interrupts are routed away from sleeping cores.

On IMX only SGIs need special treatment and a newer version just 
replaces __smp_cross_call in a platform-specific manner:

     https://lkml.org/lkml/2019/6/10/350

> Would it be acceptable in that case to "help" the platform by ensuring
> that there is at least one core that is not allowed to enter the deepest
> idle state and be able to help wake back up the others? I am asking
> because I am facing a similar issue to what Abel is trying to solve here
> with ARCH_BRCMSTB platforms which do not have the ability to have their
> CPU cores wake up on their own once power gated.

Maybe you can work around this in ATF: if (last_core) wfi(); else powerdown();

But you still need special treatment for interrupts targeted at gated cores.
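
To make the "last core stays in WFI" idea a bit more concrete, here is a
minimal sketch in plain C. It is only a model of the decision, not TF-A
code: the names (cluster_state, choose_idle_action, core_woken) are
hypothetical, and real firmware would need a lock around the mask and
would still need the special interrupt handling mentioned above.

```c
#include <stdint.h>

/* Hypothetical idle decision; these are not TF-A or PSCI APIs. */
enum idle_action { IDLE_WFI, IDLE_POWERDOWN };

struct cluster_state {
    uint32_t awake_mask; /* bit n set: core n is not power gated */
};

/*
 * Called by a core that is about to go idle. If no other core is
 * awake, the caller must stay in plain WFI so that somebody is left
 * to receive interrupts and wake the others. Assumes the caller
 * holds whatever lock protects awake_mask.
 */
static enum idle_action choose_idle_action(struct cluster_state *cs,
                                           unsigned int cpu)
{
    uint32_t self = UINT32_C(1) << cpu;

    if ((cs->awake_mask & ~self) == 0)
        return IDLE_WFI;          /* last awake core: do not gate */

    cs->awake_mask &= ~self;      /* record that this core is gated */
    return IDLE_POWERDOWN;
}

/* Called on the wake-up path once a gated core is running again. */
static void core_woken(struct cluster_state *cs, unsigned int cpu)
{
    cs->awake_mask |= UINT32_C(1) << cpu;
}
```

With four cores awake, the first three callers are allowed to power
down and the fourth is told to use WFI.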

>>> My understanding is that this wake request feature via GIC is new in v3
>>> and this is maybe why HW team missed it during integration. Older
>>> imx6/7 has GICv2 and has deep idle states which always rely on GPC to
>>> wakeup so the approach can work.
>>
>> If HW designers really wanted to have sensible power management policy
>> in this SoC they would have paid attention, I am against patching the
>> kernel heavily to fix a platform bug.

> HW designers may not be aware of how the cpuidle framework operates or
> what its constraints are, so they may not understand that any interrupt
> must be able to autonomously (for lack of a better name) wake up a
> given core, given any idle state it has entered.

My understanding is that this is a requirement of GICv3 architecture.

--
Regards,
Leonard

* Re: [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup
  2019-11-06 21:31               ` Leonard Crestez
@ 2019-11-06 22:10                 ` Florian Fainelli
  2019-11-06 22:47                   ` Leonard Crestez
  0 siblings, 1 reply; 28+ messages in thread
From: Florian Fainelli @ 2019-11-06 22:10 UTC (permalink / raw)
  To: Leonard Crestez, Abel Vesa
  Cc: Lorenzo Pieralisi, Aisheng Dong, mark.rutland, Jacky Bai,
	Anson Huang, linux-pm, marc.zyngier, catalin.marinas, rjw,
	linux-kernel, will.deacon, dl-linux-imx, kernel, sudeep.holla,
	Fabio Estevam, l.stach, shawnguo, robh, linux-arm-kernel

On 11/6/19 1:31 PM, Leonard Crestez wrote:
> On 06.11.2019 22:15, Florian Fainelli wrote:
>> On 3/28/19 3:45 AM, Lorenzo Pieralisi wrote:
>>> On Wed, Mar 27, 2019 at 06:40:07PM +0000, Leonard Crestez wrote:
>>>> On Wed, 2019-03-27 at 17:45 +0000, Marc Zyngier wrote:
>>>>> On 27/03/2019 16:06, Lucas Stach wrote:
>>>>>> Am Mittwoch, den 27.03.2019, 15:57 +0000 schrieb Marc Zyngier:
>>>>>>> On 27/03/2019 15:44, Lucas Stach wrote:
>>>>>>>> Am Mittwoch, den 27.03.2019, 13:21 +0000 schrieb Abel Vesa:
>>>>>>>>> This work is a workaround I'm looking into (more as a background task)
>>>>>>>>> in order to add support for cpuidle on i.MX8MQ based platforms.
>>>>>>>>>
>>>>>>>>> The main idea here is getting around the missing GIC wake_request signal
>>>>>>>>> (due to integration design issue) by waking up a each individual core through
>>>>>>>>> some dedicated SW power-up bits inside the power controller (GPC) right before
>>>>>>>>> every IPI is requested for that each individual core.
>>>>>>>>
>>>>>>>> Just a general comment, without going into the details of this series:
>>>>>>>> this issue is not only affecting IPIs, but also MSIs terminated at the
>>>>>>>> GIC. Currently MSIs are terminated at the PCIe core, but terminating
>>>>>>>> them at the GIC is clearly preferable, as this allows assigning CPU
>>>>>>>> affinity to individual MSIs and lowers IRQ service overhead.
>>>>>>>>
>>>>>>>> I'm not sure what the consequences are for upstream Linux support yet,
>>>>>>>> but we should keep in mind that having a workaround for IPIs is only
>>>>>>>> solving part of the issue.
>>>>>>>
>>>>>>> If this erratum is affecting more than just IPIs, then indeed I don't
>>>>>>> see how this patch series solves anything.
>>>>>>>
>>>>>>> But the erratum documentation seems to imply that only SGIs are
>>>>>>> affected, and goes as far as suggesting to use an external interrupt
>>>>>>> would solve it. How comes this is not the case? Or is it that anything
>>>>>>> directly routed to a redistributor is also affected? This would break
>>>>>>> LPIs (and thus MSIs) and PPIs (the CPU timer, among others).
>>>>>>
>>>>>> Anything that isn't visible to the GPC and requires the GIC
>>>>>> wake_request signal to behave as specified is broken by this erratum.
>>>>>
>>>>> I really wonder how a timer interrupt (a PPI, hence not routed through
>>>>> the GPC) can wake up the CPU in this case. It really feels like
>>>>> something like "program CNTV_CVAL_EL0 to expire at some later point;
>>>>> WFI" could result in the CPU going to a deep sleep state, and not
>>>>> wake-up at all.
>>>>
>>>> This is already a common issue for cpuidle implementions handled by the
>>>> "local-timer-stop" property. imx has other timer blocks in the SOC,
>>>> they generate SPIs which are connected to GPC.
>>>
>>> It is not a common issue. The tick-broadcast mechanism relies on
>>> IPIs that are sent to specific CPUs upon timer expiry.
>>>
>>> If IPIs don't work for CPUs in shutdown state (which is what this patch
>>> is fixing AFAIU), the only reason I can see how a CPU can resume from
>>> idle on a timer expiry is the GPC waking up all cores upon the global
>>> timer SPI; if that's the case there is precious little point in
>>> implementing CPUidle at all - too bad people worked hard to implement
>>> NOHZ in a power efficient manner.
>>>
>>>>> This would indicate that not only cpuidle is broken with this, but
>>>>> absolutely every interrupt that is not routed through the GPC.
>>>>
>>>> Yes, cpuidle is broken for irqs not routed through GPC. However:
>>>>
>>>> * All SPIs are connected to GPC in a 1:1 mapping
>>>> * This series deals with SGIs
>>>> * The timer PPIs are not required; covered by local-timer-stop
>>>> * LPIs are currently unused (I understand imx-pci uses SPI by default
>>>> from Lucas)
>>>>
>>>> Anything missing?
>>>
>>> Yes, LPIs must be able to wake up CPUs and only the CPU for which
>>> an IRQ is actually pending.
>>>
>>> From an architectural perspective, an ARM core executing the WFI
>>> instruction must resume execution upon an IRQ occurrence targeted
>>> at it and that's true regardless of the idle state entered.
>>>
>>> Anything deviating from this behaviour is not architecture compliant.
>>
>> What if you enter a deeper state than WFI, which leads to the power
>> gating of your CPU core, and you are missing the necessary hardware that
>> should be driven from the GIC's nIRQOUT/nFIQOUT signals to automatically
>> bring the core back on upon the GIC seeing a pending interrupt targeting
>> that core?
> 
> imx8mq has a secondary "GPC" block which receives SPIs and can wake the 
> cores. Do you have something similar? Because if you only have the GIC 
> then that sounds much worse: you'd have to ensure that all peripheral 
> interrupts are routed away from sleeping cores.

We have a legacy interrupt controller that receives all SPIs as well,
and it can be used as a full replacement for the GIC (with the loss of
nVIRQ/nFIQ) but it cannot wake-up the cores unfortunately. This is all
custom logic, so we could have done at least wake-up based on SPIs, but
we missed that apparently, at least we were consistent.

Out of curiosity, does your GPC somehow know the affinity of a given
interrupt to a particular core?

> 
> On IMX only SGIs need special treatment and a newer version just 
> replaces __smp_cross_call in a platform-specific manner:
> 
>      https://lkml.org/lkml/2019/6/10/350

Right, because for PPIs you leverage the timer broadcast and for SPIs
you have that GPC, so all you are left with are the remaining "intra GIC"
interrupts which are SGIs.

> 
>> Would it be acceptable in that case to "help" the platform by ensuring
>> that there is at least one core that is not allowed to enter the deepest
>> idle state and be able to help wake back up the others? I am asking
>> because I am facing a similar issue to what Abel is trying to solve here
>> with ARCH_BRCMSTB platforms which do not have the ability to have their
>> CPU cores wake up on their own once power gated.
> 
> Maybe you can work around this in ATF: if (last_core) wfi(); else powerdown();

Yes, that would certainly work, the biggest problem in my case is
dealing with SPIs, since we still have no way to wake-up from those,
other than by getting the help of another CPU that is not power gated.
Lovely, I know.

> 
> But you still need special treatment for interrupts targeted at gated cores.
> 
>>>> My understanding is that this wake request feature via GIC is new in v3
>>>> and this is maybe why HW team missed it during integration. Older
>>>> imx6/7 has GICv2 and has deep idle states which always rely on GPC to
>>>> wakeup so the approach can work.
>>>
>>> If HW designers really wanted to have sensible power management policy
>>> in this SoC they would have paid attention, I am against patching the
>>> kernel heavily to fix a platform bug.
> 
>> HW designers may not be aware of how the cpuidle framework operates or
>> what its constraints are, so they may not understand that any interrupt
>> must be able to autonomously (for lack of a better name) wake up a
>> given core, given any idle state it has entered.
> 
> My understanding is that this is a requirement of GICv3 architecture.
> 

The systems I use have a GICv2 architecture, though this is still no
excuse for not having hooked the nIRQOUT/nFIQOUT to a power management
controller. This is clearly an oversight, and it should have been
possible to automatically take a core out of power gating, since we did
design our own power gating logic, but this was done that way. Hopefully
future designs can remedy that; designers are aware of why this is a
problem now.
--
Florian

* Re: [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup
  2019-11-06 22:10                 ` Florian Fainelli
@ 2019-11-06 22:47                   ` Leonard Crestez
  0 siblings, 0 replies; 28+ messages in thread
From: Leonard Crestez @ 2019-11-06 22:47 UTC (permalink / raw)
  To: Florian Fainelli, Abel Vesa
  Cc: Lorenzo Pieralisi, Aisheng Dong, mark.rutland, Jacky Bai,
	Anson Huang, linux-pm, marc.zyngier, catalin.marinas, rjw,
	linux-kernel, will.deacon, dl-linux-imx, kernel, sudeep.holla,
	Fabio Estevam, l.stach, shawnguo, robh, linux-arm-kernel

On 07.11.2019 00:10, Florian Fainelli wrote:
> On 11/6/19 1:31 PM, Leonard Crestez wrote:
>> On 06.11.2019 22:15, Florian Fainelli wrote:
>>> On 3/28/19 3:45 AM, Lorenzo Pieralisi wrote:
>>>> On Wed, Mar 27, 2019 at 06:40:07PM +0000, Leonard Crestez wrote:
>>>>> On Wed, 2019-03-27 at 17:45 +0000, Marc Zyngier wrote:
>>>>>> On 27/03/2019 16:06, Lucas Stach wrote:
>>>>>>> Am Mittwoch, den 27.03.2019, 15:57 +0000 schrieb Marc Zyngier:
>>>>>>>> On 27/03/2019 15:44, Lucas Stach wrote:
>>>>>>>>> Am Mittwoch, den 27.03.2019, 13:21 +0000 schrieb Abel Vesa:
>>>>>>>>>> This work is a workaround I'm looking into (more as a background task)
>>>>>>>>>> in order to add support for cpuidle on i.MX8MQ based platforms.
>>>>>>>>>>
>>>>>>>>>> The main idea here is getting around the missing GIC wake_request signal
>>>>>>>>>> (due to integration design issue) by waking up a each individual core through
>>>>>>>>>> some dedicated SW power-up bits inside the power controller (GPC) right before
>>>>>>>>>> every IPI is requested for that each individual core.
>>>>>>>>>
>>>>>>>>> Just a general comment, without going into the details of this series:
>>>>>>>>> this issue is not only affecting IPIs, but also MSIs terminated at the
>>>>>>>>> GIC. Currently MSIs are terminated at the PCIe core, but terminating
>>>>>>>>> them at the GIC is clearly preferable, as this allows assigning CPU
>>>>>>>>> affinity to individual MSIs and lowers IRQ service overhead.
>>>>>>>>>
>>>>>>>>> I'm not sure what the consequences are for upstream Linux support yet,
>>>>>>>>> but we should keep in mind that having a workaround for IPIs is only
>>>>>>>>> solving part of the issue.
>>>>>>>>
>>>>>>>> If this erratum is affecting more than just IPIs, then indeed I don't
>>>>>>>> see how this patch series solves anything.
>>>>>>>>
>>>>>>>> But the erratum documentation seems to imply that only SGIs are
>>>>>>>> affected, and goes as far as suggesting to use an external interrupt
>>>>>>>> would solve it. How comes this is not the case? Or is it that anything
>>>>>>>> directly routed to a redistributor is also affected? This would break
>>>>>>>> LPIs (and thus MSIs) and PPIs (the CPU timer, among others).
>>>>>>>
>>>>>>> Anything that isn't visible to the GPC and requires the GIC
>>>>>>> wake_request signal to behave as specified is broken by this erratum.
>>>>>>
>>>>>> I really wonder how a timer interrupt (a PPI, hence not routed through
>>>>>> the GPC) can wake up the CPU in this case. It really feels like
>>>>>> something like "program CNTV_CVAL_EL0 to expire at some later point;
>>>>>> WFI" could result in the CPU going to a deep sleep state, and not
>>>>>> wake-up at all.
>>>>>
>>>>> This is already a common issue for cpuidle implementions handled by the
>>>>> "local-timer-stop" property. imx has other timer blocks in the SOC,
>>>>> they generate SPIs which are connected to GPC.
>>>>
>>>> It is not a common issue. The tick-broadcast mechanism relies on
>>>> IPIs that are sent to specific CPUs upon timer expiry.
>>>>
>>>> If IPIs don't work for CPUs in shutdown state (which is what this patch
>>>> is fixing AFAIU), the only reason I can see how a CPU can resume from
>>>> idle on a timer expiry is the GPC waking up all cores upon the global
>>>> timer SPI; if that's the case there is precious little point in
>>>> implementing CPUidle at all - too bad people worked hard to implement
>>>> NOHZ in a power efficient manner.
>>>>
>>>>>> This would indicate that not only cpuidle is broken with this, but
>>>>>> absolutely every interrupt that is not routed through the GPC.
>>>>>
>>>>> Yes, cpuidle is broken for irqs not routed through GPC. However:
>>>>>
>>>>> * All SPIs are connected to GPC in a 1:1 mapping
>>>>> * This series deals with SGIs
>>>>> * The timer PPIs are not required; covered by local-timer-stop
>>>>> * LPIs are currently unused (I understand imx-pci uses SPI by default
>>>>> from Lucas)
>>>>>
>>>>> Anything missing?
>>>>
>>>> Yes, LPIs must be able to wake up CPUs and only the CPU for which
>>>> an IRQ is actually pending.
>>>>
>>>> From an architectural perspective, an ARM core executing the WFI
>>>> instruction must resume execution upon an IRQ occurrence targeted
>>>> at it and that's true regardless of the idle state entered.
>>>>
>>>> Anything deviating from this behaviour is not architecture compliant.
>>>
>>> What if you enter a deeper state than WFI, which leads to the power
>>> gating of your CPU core, and you are missing the necessary hardware that
>>> should be driven from the GIC's nIRQOUT/nFIQOUT signals to automatically
>>> bring the core back on upon the GIC seeing a pending interrupt targeting
>>> that core?
>>
>> imx8mq has a secondary "GPC" block which receives SPIs and can wake the
>> cores. Do you have something similar? Because if you only have the GIC
>> then that sounds much worse: you'd have to ensure that all peripheral
>> interrupts are routed away from sleeping cores.
> 
> We have a legacy interrupt controller that receives all SPIs as well,
> and it can be used as a full replacement for the GIC (with the loss of
> nVIRQ/nFIQ) but it cannot wake-up the cores unfortunately. This is all
> custom logic, so we could have done at least wake-up based on SPIs, but
> we missed that apparently, at least we were consistent.
> 
> Out of curiosity, does your GPC somehow know the affinity of a given
> interrupt to a particular core?

Yes, if it's told by software. There are mask and status registers for 
each SPI for each core in GPC but AFAIK GIC bits are unrelated.
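
As a rough illustration of "told by software", the sketch below models
per-core mask words plus a shared status word and mirrors an affinity
change into them. Everything here is an assumption made for the
example: the struct layout, the sizes, the "bit set means masked"
polarity and the function names are hypothetical and do not describe
the real GPC register map.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NR_CORES 4                 /* assumed core count */
#define NR_SPIS  128               /* assumed number of SPIs */
#define NR_WORDS (NR_SPIS / 32)

struct gpc_model {
    uint32_t mask[NR_CORES][NR_WORDS]; /* bit set: SPI cannot wake core */
    uint32_t status[NR_WORDS];         /* bit set: SPI is pending */
};

/* Start with every SPI masked for every core and nothing pending. */
static void gpc_init(struct gpc_model *g)
{
    memset(g->mask, 0xff, sizeof(g->mask));
    memset(g->status, 0, sizeof(g->status));
}

/* Software mirrors the chosen affinity: unmask on one core only. */
static void gpc_set_affinity(struct gpc_model *g, unsigned int spi,
                             unsigned int cpu)
{
    uint32_t bit = UINT32_C(1) << (spi % 32);

    for (unsigned int c = 0; c < NR_CORES; c++) {
        if (c == cpu)
            g->mask[c][spi / 32] &= ~bit;
        else
            g->mask[c][spi / 32] |= bit;
    }
}

/* Mark an SPI as pending in the model. */
static void gpc_raise(struct gpc_model *g, unsigned int spi)
{
    g->status[spi / 32] |= UINT32_C(1) << (spi % 32);
}

/* A core is woken if any pending SPI is unmasked for it. */
static bool gpc_wakes(const struct gpc_model *g, unsigned int cpu)
{
    for (unsigned int w = 0; w < NR_WORDS; w++) {
        if (g->status[w] & ~g->mask[cpu][w])
            return true;
    }
    return false;
}
```

The point of the model is that the wake decision depends only on what
software wrote into the mask words, independent of the GIC's own
routing state.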

>> On IMX only SGIs need special treatment and a newer version just
>> replaces __smp_cross_call in a platform-specific manner:
>>
>>       https://lkml.org/lkml/2019/6/10/350
> 
> Right, because for PPIs you leverage the timer broadcast and for SPIs
> you have that GPC, so all you are left with are the remaining "intra GIC"
> interrupts which are SGIs.
> 
>>> Would it be acceptable in that case to "help" the platform by ensuring
>>> that there is at least one core that is not allowed to enter the deepest
>>> idle state and be able to help wake back up the others? I am asking
>>> because I am facing a similar issue to what Abel is trying to solve here
>>> with ARCH_BRCMSTB platforms which do not have the ability to have their
>>> CPU cores wake up on their own once power gated.
>>
>> Maybe you can work around this in ATF: if (last_core) wfi(); else powerdown();
> 
> Yes, that would certainly work, the biggest problem in my case is
> dealing with SPIs, since we still have no way to wake-up from those,
> other than by getting the help of another CPU that is not power gated.
> Lovely, I know.

By default irqs are only routed to core0 so maybe you could only power 
down if your core has no irqs enabled? It might even be possible to do 
this by reading GIC registers in ATF but this might race with other GIC 
manipulation from kernel.
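
A minimal sketch of that check, assuming a GICv2 where GICD_ISENABLERn
holds one enable bit per interrupt ID and GICD_ITARGETSRn holds one
target byte per interrupt ID. To keep it self-contained it works on
snapshots of those registers passed in as arrays; the function names
are hypothetical, and it does nothing about the race with the kernel
mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GIC_FIRST_SPI 32u /* interrupt IDs 0-31 are SGIs and PPIs */

/*
 * isenabler: snapshot of GICD_ISENABLERn, one bit per interrupt ID.
 * itargets:  snapshot of GICD_ITARGETSRn, one byte per interrupt ID;
 *            bit c set means the interrupt targets CPU interface c.
 * Returns true if any enabled SPI targets the given CPU interface.
 */
static bool cpu_has_enabled_spi(const uint32_t *isenabler,
                                const uint8_t *itargets,
                                size_t nr_irqs, unsigned int cpu)
{
    for (size_t id = GIC_FIRST_SPI; id < nr_irqs; id++) {
        bool enabled = (isenabler[id / 32] >> (id % 32)) & 1u;
        bool targets_cpu = (itargets[id] >> cpu) & 1u;

        if (enabled && targets_cpu)
            return true;
    }
    return false;
}

/* Only allow power gating when no enabled SPI is routed to this core. */
static bool may_power_gate(const uint32_t *isenabler,
                           const uint8_t *itargets,
                           size_t nr_irqs, unsigned int cpu)
{
    return !cpu_has_enabled_spi(isenabler, itargets, nr_irqs, cpu);
}
```

With the default routing described above, this would let every core
except core0 power down.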

Perhaps your workarounds could also be encapsulated into a 
platform-specific irqchip implementation which occasionally pokes at ATF.

>> But you still need special treatment for interrupts targeted at gated cores.
>>
>>>>> My understanding is that this wake request feature via GIC is new in v3
>>>>> and this is maybe why HW team missed it during integration. Older
>>>>> imx6/7 has GICv2 and has deep idle states which always rely on GPC to
>>>>> wakeup so the approach can work.
>>>>
>>>> If HW designers really wanted to have sensible power management policy
>>>> in this SoC they would have paid attention, I am against patching the
>>>> kernel heavily to fix a platform bug.
>>
>>> HW designers may not be aware of how the cpuidle framework operates or
>>> what its constraints are, so they may not understand that any interrupt
>>> must be able to autonomously (for lack of a better name) wake up a
>>> given core, given any idle state it has entered.
>>
>> My understanding is that this is a requirement of GICv3 architecture.
>>
> 
> The systems I use have a GICv2 architecture, though this is still no
> excuse for not having hooked the nIRQOUT/nFIQOUT to a power management
> controller. This is clearly an oversight, and it should have been
> possible to automatically take a core out of power gating, since we did
> design our own power gating logic, but this was done that way. Hopefully
> future designs can remedy that; designers are aware of why this is a
> problem now.
> --
> Florian
> 


Thread overview: 28+ messages
2019-03-27 13:21 [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup Abel Vesa
2019-03-27 13:21 ` [RFC 1/7] sched: idle: Add sched get idle state helper Abel Vesa
2019-03-27 13:21 ` [RFC 2/7] cpuidle: Add cpu poke support Abel Vesa
2019-03-27 13:21 ` [RFC 3/7] smp: Poke the cores before requesting IPI Abel Vesa
2019-03-27 13:21 ` [RFC 4/7] psci: Add cpu_poke ops to support core poking Abel Vesa
2019-03-27 13:21 ` [RFC 5/7] cpuidle-arm: Add ops to support poke alonside enter Abel Vesa
2019-03-27 13:21 ` [RFC 6/7] cpuidle-arm: Add arm64 wake helper for cpu_poke op Abel Vesa
2019-03-27 13:21 ` [RFC 7/7] arm64: dts: imx8mq: Add cpu-sleep state with poke wake-up enabled Abel Vesa
2019-03-27 15:44 ` [RFC 0/7] cpuidle: Add poking mechanism to support non-IPI wakeup Lucas Stach
2019-03-27 15:57   ` Marc Zyngier
2019-03-27 16:06     ` Lucas Stach
2019-03-27 17:00       ` Leonard Crestez
2019-03-27 17:11         ` Lucas Stach
2019-03-27 18:13         ` Marc Zyngier
2019-03-28 11:21           ` Aisheng Dong
2019-03-29  9:11             ` Richard Zhu
2019-03-27 17:45       ` Marc Zyngier
2019-03-27 17:55         ` Lucas Stach
2019-03-28 11:27           ` Aisheng Dong
2019-03-27 18:40         ` Leonard Crestez
2019-03-28 10:35           ` Marc Zyngier
2019-03-28 10:36             ` Rafael J. Wysocki
2019-03-28 11:55             ` Aisheng Dong
2019-03-28 10:45           ` Lorenzo Pieralisi
2019-11-06 20:14             ` Florian Fainelli
2019-11-06 21:31               ` Leonard Crestez
2019-11-06 22:10                 ` Florian Fainelli
2019-11-06 22:47                   ` Leonard Crestez
