* [PATCH 3.19-rc2 v15 0/8] irq/arm: Implement arch_trigger_all_cpu_backtrace
From: Daniel Thompson @ 2015-01-23 14:22 UTC
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

This patchset modifies the GIC driver to allow it, on supported
platforms, to route IPI interrupts to FIQ. It then uses this
feature to implement arch_trigger_all_cpu_backtrace for arm.
In order to neatly (and safely) bring in the changes for arm, we also
make the sched_clock implementation NMI-safe and rearrange some of the
existing x86 NMI code to make it architecture neutral.

This patchset touches a fairly large number of different sub-systems
(irq, generic sched_clock, printk, x86, arm). However, of the eight
patches, five fall under one of tglx's maintainerships (either through
irq, time or x86). Thus unless there are objections I'd like to gather
acks from some of the folks Cc:ed on the patches. Then I can wrap it up
nicely and send it to Thomas.

The patches have been runtime tested on two systems capable of
supporting FIQ (Freescale i.MX6 and STiH416) and two that cannot
(vexpress-a9 and Qualcomm Snapdragon 600). The changes to the x86
logic were tested on qemu, and all patches have been compile tested
on x86, arm and arm64.

Note: On platforms not capable of supporting FIQ, the IPI to generate a
      backtrace will fall back to using IRQ for propagation instead.
      The backtrace logic contains a timeout so that we will not wedge the
      requesting CPU if other CPUs are not responsive.
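
For illustration, here is a minimal sketch of the kind of timeout-guarded
trigger the later ARM patches implement. The cpumask bookkeeping, the
smp_cross_call() plumbing and the IPI_CPU_BACKTRACE name are assumptions
of this sketch (the completion side, where each CPU clears its own bit
after printing its backtrace, is not shown):

static cpumask_t backtrace_mask;

void arch_trigger_all_cpu_backtrace(bool include_self)
{
	int i;

	/* Mark which CPUs we expect a backtrace from. */
	cpumask_copy(&backtrace_mask, cpu_online_mask);
	if (!include_self)
		cpumask_clear_cpu(smp_processor_id(), &backtrace_mask);

	/* Raise the IPI; it is routed via FIQ where the GIC supports it. */
	if (!cpumask_empty(&backtrace_mask))
		smp_cross_call(&backtrace_mask, IPI_CPU_BACKTRACE);

	/* Wait up to 10 seconds, then give up rather than wedge this CPU. */
	for (i = 0; i < 10 * 1000; i++) {
		if (cpumask_empty(&backtrace_mask))
			break;
		mdelay(1);
	}
}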

v15:

* Added a patch to make sched_clock safe to call from NMI (Stephen
  Boyd). Note that sched_clock() is not called by the NMI handlers that
  have been added for the arm but it could be called if tools such as
  ftrace are deployed.

* Fixed some warnings picked up during bisectability testing.

v14:

* Moved nmi_vprintk() and friends from arch/x86/kernel/apic/hw_nmi.c
  to printk.c (Steven Rostedt).

v13:

* Updated the code to print the backtrace to replicate Steven Rostedt's
  x86 work to make SysRq-l safe. This is pretty much a total rewrite of
  patches 4 and 5.

v12:

* Squash first two patches into a single one and re-describe
  (Thomas Gleixner).

* Improve description of "irqchip: gic: Make gic_raise_softirq FIQ-safe"
  (Thomas Gleixner).

v11:

* Optimized gic_raise_softirq() by replacing a register read with
  a memory read (Jason Cooper).

v10:

* Add a further patch to optimize away some of the locking on systems
  where CONFIG_BL_SWITCHER is not set (Marc Zyngier). Compiles OK with
  exynos_defconfig (which is the only defconfig to set this option).

* Whitespace fixes in patch 4. That patch previously used spaces for
  alignment of new constants but the rest of the file used tabs.

v9:

* Improved documentation and structure of initial patch (now initial
  two patches) to make gic_raise_softirq() safe to call from FIQ
  (Thomas Gleixner).

* Avoid masking interrupts during gic_raise_softirq(). The use of the
  read lock makes this redundant (because we can safely re-enter the
  function).

v8:

* Fixed build on arm64 caused by a spurious include file in irq-gic.c.

v7-2 (accidentally released twice with same number):

* Fixed boot regression on vexpress-a9 (reported by Russell King).

* Rebased on v3.18-rc3; removed one patch from set that is already
  included in mainline.

* Dropped arm64/fiq.h patch from the set (still useful but not related
  to issuing backtraces).

v7:

* Re-arranged code within the patch series to fix a regression
  introduced midway through the series and corrected by a later patch
  (testing by Olof's autobuilder). Tested offending patch in isolation
  using defconfig identified by the autobuilder.

v6:

* Renamed svc_entry's call_trace argument to just trace (example code
  from Russell King).

* Fixed mismatched ENDPROC() in __fiq_abt (example code from Russell
  King).

* Modified usr_entry to optionally avoid calling into the trace code and
  used this in FIQ entry from usr path. Modified corresponding exit code
  to avoid calling into trace code and the scheduler (example code from
  Russell King).

* Ensured the default FIQ register state is restored when the default
  FIQ handler is reinstalled (example code from Russell King).

* Renamed no_fiq_insn to dfl_fiq_insn to reflect the effect of adopting
  a default FIQ handler.

* Re-instated fiq_safe_migration_lock and associated logic in
  gic_raise_softirq(). gic_raise_softirq() is called by wake_up_klogd()
  in the console unlock logic.

v5:

* Rebased on 3.17-rc4.

* Removed a spurious line from the final "glue it together" patch
  that broke the build.

v4:

* Replaced push/pop with stmfd/ldmfd respectively (review of Nicolas
  Pitre).

* Really fix bad pt_regs pointer generation in __fiq_abt.

* Remove fiq_safe_migration_lock and associated logic in
  gic_raise_softirq() (review of Russell King)

* Restructured to introduce the default FIQ handler first, before the
  new features (review of Russell King).

v3:

* Removed redundant header guards from arch/arm64/include/asm/fiq.h
  (review of Catalin Marinas).

* Moved svc_exit_via_fiq macro to entry-header.S (review of Nicolas
  Pitre).

v2:

* Restructured to sit nicely on a similar FYI patchset from Russell
  King. It now effectively replaces the work in progress final patch
  with something much more complete.

* Implemented (and tested) a Thumb-2 implementation of svc_exit_via_fiq
  (review of Nicolas Pitre)

* Dropped the GIC group 0 workaround patch. The issue of FIQ interrupts
  being acknowledged by the IRQ handler does still exist but should be
  harmless because the IRQ handler will still wind up calling
  ipi_cpu_backtrace().

* Removed any dependency on CONFIG_FIQ; the all-CPU backtrace effectively
  becomes a platform feature (although the use of non-maskable
  interrupts to implement it is best effort rather than guaranteed).

* Better comments highlighting usage of RAZ/WI registers (and parts of
  registers) in the GIC code.

Changes *before* v1:

* This patchset is a hugely cut-down successor to "[PATCH v11 00/19]
  arm: KGDB NMI/FIQ support". Thanks to Thomas Gleixner for suggesting
  the new structure. For historic details see:
        https://lkml.org/lkml/2014/9/2/227

* Fix bug in __fiq_abt (no longer passes a bad struct pt_regs value).
  In fixing this we also remove the useless indirection previously
  found in the fiq_handler macro.

* Make default fiq handler "always on" by migrating from fiq.c to
  traps.c and replace do_unexp_fiq with the new handler (review
  of Russell King).

* Add arm64 version of fiq.h (review of Russell King)

* Removed conditional branching and code from irq-gic.c; this is
  replaced by much simpler code that relies on the GIC specification's
  heavy use of read-as-zero/write-ignored (review of Russell King).


Daniel Thompson (8):
  irqchip: gic: Optimize locking in gic_raise_softirq
  irqchip: gic: Make gic_raise_softirq FIQ-safe
  irqchip: gic: Introduce plumbing for IPI FIQ
  sched_clock: Avoid deadlock during read from NMI
  printk: Simple implementation for NMI backtracing
  x86/nmi: Use common printk functions
  ARM: Add support for on-demand backtrace of other CPUs
  ARM: Fix on-demand backtrace triggered by IRQ

 arch/Kconfig                    |   3 +
 arch/arm/Kconfig                |   1 +
 arch/arm/include/asm/hardirq.h  |   2 +-
 arch/arm/include/asm/irq.h      |   5 +
 arch/arm/include/asm/smp.h      |   3 +
 arch/arm/kernel/smp.c           |  81 ++++++++++++++++
 arch/arm/kernel/traps.c         |   8 +-
 arch/x86/Kconfig                |   1 +
 arch/x86/kernel/apic/hw_nmi.c   |  97 ++-----------------
 drivers/irqchip/irq-gic.c       | 203 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 ++
 include/linux/printk.h          |  22 +++++
 kernel/printk/printk.c          | 122 ++++++++++++++++++++++++
 kernel/time/sched_clock.c       | 157 ++++++++++++++++++++-----------
 14 files changed, 549 insertions(+), 164 deletions(-)

--
1.9.3


* [PATCH 3.19-rc2 v15 1/8] irqchip: gic: Optimize locking in gic_raise_softirq
From: Daniel Thompson @ 2015-01-23 14:22 UTC
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

Currently gic_raise_softirq() is locked using irq_controller_lock.
This lock is primarily used to make register read-modify-write sequences
atomic but gic_raise_softirq() uses it instead to ensure that the
big.LITTLE migration logic can figure out when it is safe to migrate
interrupts between physical cores.

This is sub-optimal in two closely related ways:

1. No locking at all is required on systems where the b.L switcher is
   not configured.

2. Finer grain locking can be used on systems where the b.L switcher is
   present.

This patch resolves both of the above by introducing a separate finer
grain lock and providing conditionally compiled inlines to lock/unlock
it.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
---
 drivers/irqchip/irq-gic.c | 36 +++++++++++++++++++++++++++++++++---
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index d617ee5a3d8a..a9ed64dcc84b 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -73,6 +73,27 @@ struct gic_chip_data {
 static DEFINE_RAW_SPINLOCK(irq_controller_lock);
 
 /*
+ * This lock is used by the big.LITTLE migration code to ensure no IPIs
+ * can be pended on the old core after the map has been updated.
+ */
+#ifdef CONFIG_BL_SWITCHER
+static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
+
+static inline void bl_migration_lock(unsigned long *flags)
+{
+	raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags);
+}
+
+static inline void bl_migration_unlock(unsigned long flags)
+{
+	raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags);
+}
+#else
+static inline void bl_migration_lock(unsigned long *flags) {}
+static inline void bl_migration_unlock(unsigned long flags) {}
+#endif
+
+/*
  * The GIC mapping of CPU interfaces does not necessarily match
  * the logical CPU numbering.  Let's use a mapping as returned
  * by the GIC itself.
@@ -624,7 +645,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	int cpu;
 	unsigned long flags, map = 0;
 
-	raw_spin_lock_irqsave(&irq_controller_lock, flags);
+	bl_migration_lock(&flags);
 
 	/* Convert our logical CPU mask into a physical one. */
 	for_each_cpu(cpu, mask)
@@ -639,7 +660,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	/* this always happens on GIC0 */
 	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
-	raw_spin_unlock_irqrestore(&irq_controller_lock, flags);
+	bl_migration_unlock(flags);
 }
 #endif
 
@@ -710,8 +731,17 @@ void gic_migrate_target(unsigned int new_cpu_id)
 
 	raw_spin_lock(&irq_controller_lock);
 
-	/* Update the target interface for this logical CPU */
+	/*
+	 * Update the target interface for this logical CPU
+	 *
+	 * From the point we release the cpu_map_migration_lock any new
+	 * SGIs will be pended on the new cpu which makes the set of SGIs
+	 * pending on the old cpu static. That means we can defer the
+	 * migration until after we have released the irq_controller_lock.
+	 */
+	raw_spin_lock(&cpu_map_migration_lock);
 	gic_cpu_map[cpu] = 1 << new_cpu_id;
+	raw_spin_unlock(&cpu_map_migration_lock);
 
 	/*
 	 * Find all the peripheral interrupts targetting the current
-- 
1.9.3


* [PATCH 3.19-rc2 v15 2/8] irqchip: gic: Make gic_raise_softirq FIQ-safe
From: Daniel Thompson @ 2015-01-23 14:22 UTC
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

It is currently possible for FIQ handlers to re-enter gic_raise_softirq()
and lock up.

    	gic_raise_softirq()
	   lock(x);
-~-> FIQ
        handle_fiq()
	   gic_raise_softirq()
	      lock(x);		<-- Lockup

arch/arm/ uses IPIs to implement arch_irq_work_raise(); thus this issue
renders it difficult for FIQ handlers to safely defer work to less
restrictive calling contexts.

This patch fixes the problem by converting the cpu_map_migration_lock
into an rwlock, making it safe to re-enter the function.

Note that having made it safe to re-enter gic_raise_softirq() we no
longer need to mask interrupts during gic_raise_softirq() because the
b.L migration is always performed from task context.
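
For clarity, a minimal sketch (not part of the patch itself, and using
placeholder names) of the locking pattern this relies on: read_lock() can
be taken again by a FIQ/NMI that interrupts a reader on the same CPU,
while the b.L switcher excludes all readers by taking the write side from
task context with IRQ and FIQ disabled:

static DEFINE_RWLOCK(example_migration_lock);

/* IPI raise path: may run in task, IRQ or FIQ context, and may nest. */
static void example_raise_path(void)
{
	read_lock(&example_migration_lock);
	/* ... write GIC_DIST_SOFTINT ... */
	read_unlock(&example_migration_lock);
}

/* b.L migration path: task context only, IRQ and FIQ locally disabled. */
static void example_migrate_path(void)
{
	write_lock(&example_migration_lock);
	/* ... update gic_cpu_map[] so new SGIs target the new core ... */
	write_unlock(&example_migration_lock);
}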

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
---
 drivers/irqchip/irq-gic.c | 38 +++++++++++++++++++++++++-------------
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index a9ed64dcc84b..c172176499f6 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -75,22 +75,25 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock);
 /*
  * This lock is used by the big.LITTLE migration code to ensure no IPIs
  * can be pended on the old core after the map has been updated.
+ *
+ * This lock may be locked for reading from both IRQ and FIQ handlers
+ * and therefore must not be locked for writing when these are enabled.
  */
 #ifdef CONFIG_BL_SWITCHER
-static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
+static DEFINE_RWLOCK(cpu_map_migration_lock);
 
-static inline void bl_migration_lock(unsigned long *flags)
+static inline void bl_migration_lock(void)
 {
-	raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags);
+	read_lock(&cpu_map_migration_lock);
 }
 
-static inline void bl_migration_unlock(unsigned long flags)
+static inline void bl_migration_unlock(void)
 {
-	raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags);
+	read_unlock(&cpu_map_migration_lock);
 }
 #else
-static inline void bl_migration_lock(unsigned long *flags) {}
-static inline void bl_migration_unlock(unsigned long flags) {}
+static inline void bl_migration_lock(void) {}
+static inline void bl_migration_unlock(void) {}
 #endif
 
 /*
@@ -640,12 +643,20 @@ static void __init gic_pm_init(struct gic_chip_data *gic)
 #endif
 
 #ifdef CONFIG_SMP
+/*
+ * Raise the specified IPI on all cpus set in mask.
+ *
+ * This function is safe to call from all calling contexts, including
+ * FIQ handlers. It relies on bl_migration_lock() being multiply acquirable
+ * to avoid deadlocks when the function is re-entered at different
+ * exception levels.
+ */
 static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 {
 	int cpu;
-	unsigned long flags, map = 0;
+	unsigned long map = 0;
 
-	bl_migration_lock(&flags);
+	bl_migration_lock();
 
 	/* Convert our logical CPU mask into a physical one. */
 	for_each_cpu(cpu, mask)
@@ -660,7 +671,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	/* this always happens on GIC0 */
 	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
-	bl_migration_unlock(flags);
+	bl_migration_unlock();
 }
 #endif
 
@@ -708,7 +719,8 @@ int gic_get_cpu_id(unsigned int cpu)
  * Migrate all peripheral interrupts with a target matching the current CPU
  * to the interface corresponding to @new_cpu_id.  The CPU interface mapping
  * is also updated.  Targets to other CPU interfaces are unchanged.
- * This must be called with IRQs locally disabled.
+ * This must be called from a task context and with IRQ and FIQ locally
+ * disabled.
  */
 void gic_migrate_target(unsigned int new_cpu_id)
 {
@@ -739,9 +751,9 @@ void gic_migrate_target(unsigned int new_cpu_id)
 	 * pending on the old cpu static. That means we can defer the
 	 * migration until after we have released the irq_controller_lock.
 	 */
-	raw_spin_lock(&cpu_map_migration_lock);
+	write_lock(&cpu_map_migration_lock);
 	gic_cpu_map[cpu] = 1 << new_cpu_id;
-	raw_spin_unlock(&cpu_map_migration_lock);
+	write_unlock(&cpu_map_migration_lock);
 
 	/*
 	 * Find all the peripheral interrupts targetting the current
-- 
1.9.3


* [PATCH 3.19-rc2 v15 3/8] irqchip: gic: Introduce plumbing for IPI FIQ
From: Daniel Thompson @ 2015-01-23 14:22 UTC
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

Currently it is not possible to exploit FIQ on systems with a GIC, even if
the systems are otherwise capable of it. This patch makes it possible
for IPIs to be delivered using FIQ.

To do so it modifies the register state so that normal interrupts are
placed in group 1 and specific IPIs are placed into group 0. It also
configures the controller to raise group 0 interrupts using the FIQ
signal. It provides a means for architecture code to define which IPIs
shall use FIQ and to acknowledge any IPIs that are raised.
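
By way of example, the arch-side opt-in could look roughly like the
following (the IPI number is purely illustrative; the real definitions
arrive with the later ARM patches in this series):

/* In the arch's <asm/smp.h>: nominate the IPIs delivered as group 0/FIQ. */
#define IPI_CPU_BACKTRACE	7		/* illustrative IPI number */
#define SMP_IPI_FIQ_MASK	(1u << IPI_CPU_BACKTRACE)

The FIQ entry path then only has to call gic_handle_fiq_ipi() to
acknowledge whatever was raised, as the traps.c hunk below shows.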

All GIC hardware except GICv1-without-TrustZone-support provides a means
to group exceptions into group 0 and group 1, but the hardware
functionality is unavailable to the kernel when a secure monitor is
present because access to the grouping registers is prohibited outside
"secure world". However, when grouping is not available (or, in the case
of early GICv1 implementations, is very hard to configure) the code to
change groups does not deploy and all IPIs will be raised via IRQ.

It has been tested and shown working on two systems capable of
supporting grouping (Freescale i.MX6 and STiH416). It has also been
tested for boot regressions on two systems that do not support grouping
(vexpress-a9 and Qualcomm Snapdragon 600).

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Jon Medhurst <tixy@linaro.org>
---
 arch/arm/kernel/traps.c         |   5 +-
 drivers/irqchip/irq-gic.c       | 151 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 +++
 3 files changed, 153 insertions(+), 11 deletions(-)

diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 788e23fe64d8..b35e220ae1b1 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -26,6 +26,7 @@
 #include <linux/init.h>
 #include <linux/sched.h>
 #include <linux/irq.h>
+#include <linux/irqchip/arm-gic.h>
 
 #include <linux/atomic.h>
 #include <asm/cacheflush.h>
@@ -479,7 +480,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
 
 	nmi_enter();
 
-	/* nop. FIQ handlers for special arch/arm features can be added here. */
+#ifdef CONFIG_ARM_GIC
+	gic_handle_fiq_ipi();
+#endif
 
 	nmi_exit();
 
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index c172176499f6..c4f4a8827ed8 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -39,6 +39,7 @@
 #include <linux/slab.h>
 #include <linux/irqchip/chained_irq.h>
 #include <linux/irqchip/arm-gic.h>
+#include <linux/ratelimit.h>
 
 #include <asm/cputype.h>
 #include <asm/irq.h>
@@ -48,6 +49,10 @@
 #include "irq-gic-common.h"
 #include "irqchip.h"
 
+#ifndef SMP_IPI_FIQ_MASK
+#define SMP_IPI_FIQ_MASK 0
+#endif
+
 union gic_base {
 	void __iomem *common_base;
 	void __percpu * __iomem *percpu_base;
@@ -65,6 +70,7 @@ struct gic_chip_data {
 #endif
 	struct irq_domain *domain;
 	unsigned int gic_irqs;
+	u32 igroup0_shadow;
 #ifdef CONFIG_GIC_NON_BANKED
 	void __iomem *(*get_base)(union gic_base *);
 #endif
@@ -348,6 +354,83 @@ static struct irq_chip gic_chip = {
 	.irq_set_wake		= gic_set_wake,
 };
 
+/*
+ * Shift an interrupt between Group 0 and Group 1.
+ *
+ * In addition to changing the group we also modify the priority to
+ * match what "ARM strongly recommends" for a system where no Group 1
+ * interrupt must ever preempt a Group 0 interrupt.
+ *
+ * It is safe to call this function on systems which do not support
+ * grouping (it will have no effect).
+ */
+static void gic_set_group_irq(struct gic_chip_data *gic, unsigned int hwirq,
+			      int group)
+{
+	void __iomem *base = gic_data_dist_base(gic);
+	unsigned int grp_reg = hwirq / 32 * 4;
+	u32 grp_mask = BIT(hwirq % 32);
+	u32 grp_val;
+
+	unsigned int pri_reg = (hwirq / 4) * 4;
+	u32 pri_mask = BIT(7 + ((hwirq % 4) * 8));
+	u32 pri_val;
+
+	/*
+	 * Systems which do not support grouping will not have
+	 * the EnableGrp1 bit set.
+	 */
+	if (!(GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL)))
+		return;
+
+	raw_spin_lock(&irq_controller_lock);
+
+	grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg);
+	pri_val = readl_relaxed(base + GIC_DIST_PRI + pri_reg);
+
+	if (group) {
+		grp_val |= grp_mask;
+		pri_val |= pri_mask;
+	} else {
+		grp_val &= ~grp_mask;
+		pri_val &= ~pri_mask;
+	}
+
+	writel_relaxed(grp_val, base + GIC_DIST_IGROUP + grp_reg);
+	if (grp_reg == 0)
+		gic->igroup0_shadow = grp_val;
+
+	writel_relaxed(pri_val, base + GIC_DIST_PRI + pri_reg);
+
+	raw_spin_unlock(&irq_controller_lock);
+}
+
+
+/*
+ * Fully acknowledge (both ack and eoi) any outstanding FIQ-based IPI,
+ * otherwise do nothing.
+ */
+void gic_handle_fiq_ipi(void)
+{
+	struct gic_chip_data *gic = &gic_data[0];
+	void __iomem *cpu_base = gic_data_cpu_base(gic);
+	unsigned long irqstat, irqnr;
+
+	if (WARN_ON(!in_nmi()))
+		return;
+
+	while ((1u << readl_relaxed(cpu_base + GIC_CPU_HIGHPRI)) &
+	       SMP_IPI_FIQ_MASK) {
+		irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK);
+		writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);
+
+		irqnr = irqstat & GICC_IAR_INT_ID_MASK;
+		WARN_RATELIMIT(irqnr > 16,
+			       "Unexpected irqnr %lu (bad prioritization?)\n",
+			       irqnr);
+	}
+}
+
 void __init gic_cascade_irq(unsigned int gic_nr, unsigned int irq)
 {
 	if (gic_nr >= MAX_GIC_NR)
@@ -379,15 +462,24 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
 static void gic_cpu_if_up(void)
 {
 	void __iomem *cpu_base = gic_data_cpu_base(&gic_data[0]);
-	u32 bypass = 0;
+	void __iomem *dist_base = gic_data_dist_base(&gic_data[0]);
+	u32 ctrl = 0;
 
 	/*
-	* Preserve bypass disable bits to be written back later
-	*/
-	bypass = readl(cpu_base + GIC_CPU_CTRL);
-	bypass &= GICC_DIS_BYPASS_MASK;
+	 * Preserve bypass disable bits to be written back later
+	 */
+	ctrl = readl(cpu_base + GIC_CPU_CTRL);
+	ctrl &= GICC_DIS_BYPASS_MASK;
 
-	writel_relaxed(bypass | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
+	/*
+	 * If EnableGrp1 is set in the distributor then enable group 1
+	 * support for this CPU (and route group 0 interrupts to FIQ).
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL))
+		ctrl |= GICC_COMMON_BPR | GICC_FIQ_EN | GICC_ACK_CTL |
+			GICC_ENABLE_GRP1;
+
+	writel_relaxed(ctrl | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
 }
 
 
@@ -411,7 +503,23 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
 
 	gic_dist_config(base, gic_irqs, NULL);
 
-	writel_relaxed(GICD_ENABLE, base + GIC_DIST_CTRL);
+	/*
+	 * Set EnableGrp1/EnableGrp0 (bit 1 and 0) or EnableGrp (bit 0 only,
+	 * bit 1 ignored) depending on current mode.
+	 */
+	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, base + GIC_DIST_CTRL);
+
+	/*
+	 * Set all global interrupts to be group 1 if (and only if) it
+	 * is possible to enable group 1 interrupts. This register is RAZ/WI
+	 * if not accessible or not implemented, however some GICv1 devices
+	 * do not implement the EnableGrp1 bit making it unsafe to set
+	 * this register unconditionally.
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL))
+		for (i = 32; i < gic_irqs; i += 32)
+			writel_relaxed(0xffffffff,
+				       base + GIC_DIST_IGROUP + i * 4 / 32);
 }
 
 static void gic_cpu_init(struct gic_chip_data *gic)
@@ -420,6 +528,7 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 	void __iomem *base = gic_data_cpu_base(gic);
 	unsigned int cpu_mask, cpu = smp_processor_id();
 	int i;
+	unsigned long secure_irqs, secure_irq;
 
 	/*
 	 * Get what the GIC says our CPU mask is.
@@ -438,6 +547,20 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 
 	gic_cpu_config(dist_base, NULL);
 
+	/*
+	 * If the distributor is configured to support interrupt grouping
+	 * then set any PPI and SGI interrupts not set in SMP_IPI_FIQ_MASK
+	 * to be group1 and ensure any remaining group 0 interrupts have
+	 * the right priority.
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) {
+		secure_irqs = SMP_IPI_FIQ_MASK;
+		writel_relaxed(~secure_irqs, dist_base + GIC_DIST_IGROUP + 0);
+		gic->igroup0_shadow = ~secure_irqs;
+		for_each_set_bit(secure_irq, &secure_irqs, 16)
+			gic_set_group_irq(gic, secure_irq, 0);
+	}
+
 	writel_relaxed(GICC_INT_PRI_THRESHOLD, base + GIC_CPU_PRIMASK);
 	gic_cpu_if_up();
 }
@@ -527,7 +650,8 @@ static void gic_dist_restore(unsigned int gic_nr)
 		writel_relaxed(gic_data[gic_nr].saved_spi_enable[i],
 			dist_base + GIC_DIST_ENABLE_SET + i * 4);
 
-	writel_relaxed(GICD_ENABLE, dist_base + GIC_DIST_CTRL);
+	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE,
+		       dist_base + GIC_DIST_CTRL);
 }
 
 static void gic_cpu_save(unsigned int gic_nr)
@@ -655,6 +779,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 {
 	int cpu;
 	unsigned long map = 0;
+	unsigned long softint;
 
 	bl_migration_lock();
 
@@ -668,8 +793,14 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	 */
 	dmb(ishst);
 
-	/* this always happens on GIC0 */
-	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
+	/* We avoid a readl here by using the shadow copy of IGROUP[0] */
+	softint = map << 16 | irq;
+	if (gic_data[0].igroup0_shadow & BIT(irq))
+		softint |= 0x8000;
+
+	/* This always happens on GIC0 */
+	writel_relaxed(softint,
+		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
 	bl_migration_unlock();
 }
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index 71d706d5f169..7690f70049a3 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -22,6 +22,10 @@
 #define GIC_CPU_IDENT			0xfc
 
 #define GICC_ENABLE			0x1
+#define GICC_ENABLE_GRP1		0x2
+#define GICC_ACK_CTL			0x4
+#define GICC_FIQ_EN			0x8
+#define GICC_COMMON_BPR			0x10
 #define GICC_INT_PRI_THRESHOLD		0xf0
 #define GICC_IAR_INT_ID_MASK		0x3ff
 #define GICC_INT_SPURIOUS		1023
@@ -44,6 +48,7 @@
 #define GIC_DIST_SGI_PENDING_SET	0xf20
 
 #define GICD_ENABLE			0x1
+#define GICD_ENABLE_GRP1		0x2
 #define GICD_DISABLE			0x0
 #define GICD_INT_ACTLOW_LVLTRIG		0x0
 #define GICD_INT_EN_CLR_X32		0xffffffff
@@ -121,5 +126,8 @@ static inline void __init register_routable_domain_ops
 {
 	gic_routable_irq_domain_ops = ops;
 }
+
+void gic_handle_fiq_ipi(void);
+
 #endif /* __ASSEMBLY */
 #endif
-- 
1.9.3


* [PATCH 3.19-rc2 v15 4/8] sched_clock: Avoid deadlock during read from NMI
From: Daniel Thompson @ 2015-01-23 14:22 UTC
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

Currently it is possible for an NMI (or FIQ on ARM) to come in and
read sched_clock() whilst update_sched_clock() has locked the seqcount
for writing. This results in the NMI handler locking up when it calls
raw_read_seqcount_begin().

This patch fixes that problem by providing banked clock data in a
similar manner to Thomas Gleixner's 4396e058c52e ("timekeeping: Provide
fast and NMI safe access to CLOCK_MONOTONIC").

Changing the mode of operation of the seqcount away from the traditional
LSB-means-interrupted-write to a banked approach also revealed a good deal
of "fake" write locking within sched_clock_register(). This is likely
to be a latent issue because sched_clock_register() is typically called
before we enable interrupts. Nevertheless the issue has been eliminated
by increasing the scope of the read locking performed by sched_clock().
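
For illustration only (this snippet is not part of the patch), the banked
approach described above boils down to something like the following
stand-alone sketch. The structure, the function names and the use of C11
atomics are assumptions made purely for the example; the kernel code uses
the seqcount latch helpers instead:

#include <stdatomic.h>

struct snapshot {
	unsigned long long epoch_ns;
	unsigned long long epoch_cyc;
};

static struct snapshot bank[2];
static atomic_uint seq;	/* even: bank[0] is live, odd: bank[1] is live */

static struct snapshot read_snapshot(void)
{
	struct snapshot out;
	unsigned int s;

	do {
		s = atomic_load_explicit(&seq, memory_order_acquire);
		out = bank[s & 1];	/* read the bank the writer is not touching */
	} while (s != atomic_load_explicit(&seq, memory_order_acquire));

	return out;
}

static void update_snapshot(struct snapshot val)
{
	bank[1] = bank[0];	/* refresh the spare (odd) bank */
	atomic_fetch_add_explicit(&seq, 1, memory_order_release); /* readers -> bank[1] */
	bank[0] = val;		/* update the primary (even) bank */
	atomic_fetch_add_explicit(&seq, 1, memory_order_release); /* readers -> bank[0] */
}

A reader that overlaps an update simply retries, exactly as sched_clock()
does below with read_seqcount_retry().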

Suggested-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: John Stultz <john.stultz@linaro.org>
---
 kernel/time/sched_clock.c | 157 +++++++++++++++++++++++++++++-----------------
 1 file changed, 100 insertions(+), 57 deletions(-)

diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index 01d2d15aa662..a2ea66944bc1 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -18,28 +18,28 @@
 #include <linux/seqlock.h>
 #include <linux/bitops.h>
 
-struct clock_data {
-	ktime_t wrap_kt;
+struct clock_data_banked {
 	u64 epoch_ns;
 	u64 epoch_cyc;
-	seqcount_t seq;
-	unsigned long rate;
+	u64 (*read_sched_clock)(void);
+	u64 sched_clock_mask;
 	u32 mult;
 	u32 shift;
 	bool suspended;
 };
 
+struct clock_data {
+	ktime_t wrap_kt;
+	seqcount_t seq;
+	unsigned long rate;
+	struct clock_data_banked bank[2];
+};
+
 static struct hrtimer sched_clock_timer;
 static int irqtime = -1;
 
 core_param(irqtime, irqtime, int, 0400);
 
-static struct clock_data cd = {
-	.mult	= NSEC_PER_SEC / HZ,
-};
-
-static u64 __read_mostly sched_clock_mask;
-
 static u64 notrace jiffy_sched_clock_read(void)
 {
 	/*
@@ -49,7 +49,14 @@ static u64 notrace jiffy_sched_clock_read(void)
 	return (u64)(jiffies - INITIAL_JIFFIES);
 }
 
-static u64 __read_mostly (*read_sched_clock)(void) = jiffy_sched_clock_read;
+static struct clock_data cd = {
+	.bank = {
+		[0] = {
+			.mult	= NSEC_PER_SEC / HZ,
+			.read_sched_clock = jiffy_sched_clock_read,
+		},
+	},
+};
 
 static inline u64 notrace cyc_to_ns(u64 cyc, u32 mult, u32 shift)
 {
@@ -58,50 +65,82 @@ static inline u64 notrace cyc_to_ns(u64 cyc, u32 mult, u32 shift)
 
 unsigned long long notrace sched_clock(void)
 {
-	u64 epoch_ns;
-	u64 epoch_cyc;
 	u64 cyc;
 	unsigned long seq;
-
-	if (cd.suspended)
-		return cd.epoch_ns;
+	struct clock_data_banked *b;
+	u64 res;
 
 	do {
-		seq = raw_read_seqcount_begin(&cd.seq);
-		epoch_cyc = cd.epoch_cyc;
-		epoch_ns = cd.epoch_ns;
+		seq = raw_read_seqcount(&cd.seq);
+		b = cd.bank + (seq & 1);
+		if (b->suspended) {
+			res = b->epoch_ns;
+		} else {
+			cyc = b->read_sched_clock();
+			cyc = (cyc - b->epoch_cyc) & b->sched_clock_mask;
+			res = b->epoch_ns + cyc_to_ns(cyc, b->mult, b->shift);
+		}
 	} while (read_seqcount_retry(&cd.seq, seq));
 
-	cyc = read_sched_clock();
-	cyc = (cyc - epoch_cyc) & sched_clock_mask;
-	return epoch_ns + cyc_to_ns(cyc, cd.mult, cd.shift);
+	return res;
+}
+
+/*
+ * Start updating the banked clock data.
+ *
+ * sched_clock will never observe mis-matched data even if called from
+ * an NMI. We do this by maintaining an odd/even copy of the data and
+ * steering sched_clock to one or the other using a sequence counter.
+ * In order to preserve the data cache profile of sched_clock as much
+ * as possible the system reverts back to the even copy when the update
+ * completes; the odd copy is used *only* during an update.
+ *
+ * The caller is responsible for avoiding simultaneous updates.
+ */
+static struct clock_data_banked *update_bank_begin(void)
+{
+	/* update the backup (odd) bank and steer readers towards it */
+	memcpy(cd.bank + 1, cd.bank, sizeof(struct clock_data_banked));
+	raw_write_seqcount_latch(&cd.seq);
+
+	return cd.bank;
+}
+
+/*
+ * Finalize update of banked clock data.
+ *
+ * This is just a trivial switch back to the primary (even) copy.
+ */
+static void update_bank_end(void)
+{
+	raw_write_seqcount_latch(&cd.seq);
 }
 
 /*
  * Atomically update the sched_clock epoch.
  */
-static void notrace update_sched_clock(void)
+static void notrace update_sched_clock(bool suspended)
 {
-	unsigned long flags;
+	struct clock_data_banked *b;
 	u64 cyc;
 	u64 ns;
 
-	cyc = read_sched_clock();
-	ns = cd.epoch_ns +
-		cyc_to_ns((cyc - cd.epoch_cyc) & sched_clock_mask,
-			  cd.mult, cd.shift);
-
-	raw_local_irq_save(flags);
-	raw_write_seqcount_begin(&cd.seq);
-	cd.epoch_ns = ns;
-	cd.epoch_cyc = cyc;
-	raw_write_seqcount_end(&cd.seq);
-	raw_local_irq_restore(flags);
+	b = update_bank_begin();
+
+	cyc = b->read_sched_clock();
+	ns = b->epoch_ns + cyc_to_ns((cyc - b->epoch_cyc) & b->sched_clock_mask,
+				     b->mult, b->shift);
+
+	b->epoch_ns = ns;
+	b->epoch_cyc = cyc;
+	b->suspended = suspended;
+
+	update_bank_end();
 }
 
 static enum hrtimer_restart sched_clock_poll(struct hrtimer *hrt)
 {
-	update_sched_clock();
+	update_sched_clock(false);
 	hrtimer_forward_now(hrt, cd.wrap_kt);
 	return HRTIMER_RESTART;
 }
@@ -111,9 +150,9 @@ void __init sched_clock_register(u64 (*read)(void), int bits,
 {
 	u64 res, wrap, new_mask, new_epoch, cyc, ns;
 	u32 new_mult, new_shift;
-	ktime_t new_wrap_kt;
 	unsigned long r;
 	char r_unit;
+	struct clock_data_banked *b;
 
 	if (cd.rate > rate)
 		return;
@@ -122,29 +161,30 @@ void __init sched_clock_register(u64 (*read)(void), int bits,
 
 	/* calculate the mult/shift to convert counter ticks to ns. */
 	clocks_calc_mult_shift(&new_mult, &new_shift, rate, NSEC_PER_SEC, 3600);
+	cd.rate = rate;
 
 	new_mask = CLOCKSOURCE_MASK(bits);
 
 	/* calculate how many ns until we wrap */
 	wrap = clocks_calc_max_nsecs(new_mult, new_shift, 0, new_mask);
-	new_wrap_kt = ns_to_ktime(wrap - (wrap >> 3));
+	cd.wrap_kt = ns_to_ktime(wrap - (wrap >> 3));
+
+	b = update_bank_begin();
 
 	/* update epoch for new counter and update epoch_ns from old counter*/
 	new_epoch = read();
-	cyc = read_sched_clock();
-	ns = cd.epoch_ns + cyc_to_ns((cyc - cd.epoch_cyc) & sched_clock_mask,
-			  cd.mult, cd.shift);
+	cyc = b->read_sched_clock();
+	ns = b->epoch_ns + cyc_to_ns((cyc - b->epoch_cyc) & b->sched_clock_mask,
+				     b->mult, b->shift);
 
-	raw_write_seqcount_begin(&cd.seq);
-	read_sched_clock = read;
-	sched_clock_mask = new_mask;
-	cd.rate = rate;
-	cd.wrap_kt = new_wrap_kt;
-	cd.mult = new_mult;
-	cd.shift = new_shift;
-	cd.epoch_cyc = new_epoch;
-	cd.epoch_ns = ns;
-	raw_write_seqcount_end(&cd.seq);
+	b->read_sched_clock = read;
+	b->sched_clock_mask = new_mask;
+	b->mult = new_mult;
+	b->shift = new_shift;
+	b->epoch_cyc = new_epoch;
+	b->epoch_ns = ns;
+
+	update_bank_end();
 
 	r = rate;
 	if (r >= 4000000) {
@@ -175,10 +215,10 @@ void __init sched_clock_postinit(void)
 	 * If no sched_clock function has been provided at that point,
 	 * make it the final one one.
 	 */
-	if (read_sched_clock == jiffy_sched_clock_read)
+	if (cd.bank[0].read_sched_clock == jiffy_sched_clock_read)
 		sched_clock_register(jiffy_sched_clock_read, BITS_PER_LONG, HZ);
 
-	update_sched_clock();
+	update_sched_clock(false);
 
 	/*
 	 * Start the timer to keep sched_clock() properly updated and
@@ -191,17 +231,20 @@ void __init sched_clock_postinit(void)
 
 static int sched_clock_suspend(void)
 {
-	update_sched_clock();
+	update_sched_clock(true);
 	hrtimer_cancel(&sched_clock_timer);
-	cd.suspended = true;
 	return 0;
 }
 
 static void sched_clock_resume(void)
 {
-	cd.epoch_cyc = read_sched_clock();
+	struct clock_data_banked *b;
+
+	b = update_bank_begin();
+	b->epoch_cyc = b->read_sched_clock();
 	hrtimer_start(&sched_clock_timer, cd.wrap_kt, HRTIMER_MODE_REL);
-	cd.suspended = false;
+	b->suspended = false;
+	update_bank_end();
 }
 
 static struct syscore_ops sched_clock_ops = {
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 3.19-rc2 v15 5/8] printk: Simple implementation for NMI backtracing
  2015-01-23 14:22 ` Daniel Thompson
@ 2015-01-23 14:22   ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-01-23 14:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

Currently there is quite a pile of code sitting in
arch/x86/kernel/apic/hw_nmi.c to support safe all-cpu backtracing from NMI.
The code is inaccessible to backtrace implementations for other
architectures, which is a shame because they would probably like to be
safe too.

Copy this code into printk. We'll port the x86 NMI backtrace to it in a
later patch.

Incidentally, technically I think it might be safe to call
prepare_nmi_printk() from NMI, providing care were taken to honour the
return code. complete_nmi_printk() cannot be called from NMI but could
be scheduled using irq_work_queue(). However honouring the return code
means sometimes it is impossible to get the message out so I'd say using
this code in such a way should probably attract sympathy and/or derision
rather than admiration.
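
As a rough sketch of the intended calling sequence (not part of the patch;
error handling and includes are elided, and trigger_remote_cpus() is a
made-up placeholder for whatever mechanism delivers the NMI/FIQ to the
other CPUs):

/* Requester side (task or IRQ context). */
static cpumask_t backtrace_cpus;

static void dump_all_cpus(void)
{
	if (prepare_nmi_printk(&backtrace_cpus))
		return;			/* -EBUSY: another dump is in progress */

	trigger_remote_cpus(&backtrace_cpus);	/* placeholder: raise the NMI/FIQ IPI */
	/* ... wait (with a timeout) for the targets to finish ... */

	complete_nmi_printk(&backtrace_cpus);	/* replay the per-cpu buffers via printk() */
}

/* Target side (NMI/FIQ handler). */
static void nmi_dump_this_cpu(struct pt_regs *regs)
{
	printk_func_t orig = this_cpu_begin_nmi_printk();

	show_regs(regs);	/* any printk() here lands in the per-cpu seq_buf */

	this_cpu_end_nmi_printk(orig);
}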

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
 arch/Kconfig           |   3 ++
 include/linux/printk.h |  22 +++++++++
 kernel/printk/printk.c | 122 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 147 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index 05d7a8a458d5..50c9412a77d0 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -309,6 +309,9 @@ config ARCH_WANT_OLD_COMPAT_IPC
 	select ARCH_WANT_COMPAT_IPC_PARSE_VERSION
 	bool
 
+config ARCH_WANT_NMI_PRINTK
+	bool
+
 config HAVE_ARCH_SECCOMP_FILTER
 	bool
 	help
diff --git a/include/linux/printk.h b/include/linux/printk.h
index c8f170324e64..188fdc2c1efd 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -219,6 +219,28 @@ static inline void show_regs_print_info(const char *log_lvl)
 }
 #endif
 
+#ifdef CONFIG_ARCH_WANT_NMI_PRINTK
+extern __printf(1, 0) int nmi_vprintk(const char *fmt, va_list args);
+
+struct cpumask;
+extern int prepare_nmi_printk(struct cpumask *cpus);
+extern void complete_nmi_printk(struct cpumask *cpus);
+
+/*
+ * Replace printk to write into the NMI seq.
+ *
+ * To avoid include hell this is a macro rather than an inline function
+ * (printk_func is not declared in this header file).
+ */
+#define this_cpu_begin_nmi_printk() ({ \
+	printk_func_t __orig = this_cpu_read(printk_func); \
+	this_cpu_write(printk_func, nmi_vprintk); \
+	__orig; \
+})
+#define this_cpu_end_nmi_printk(fn) this_cpu_write(printk_func, fn)
+
+#endif
+
 extern asmlinkage void dump_stack(void) __cold;
 
 #ifndef pr_fmt
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 02d6b6d28796..774119e27e0b 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1805,6 +1805,127 @@ asmlinkage int printk_emit(int facility, int level,
 }
 EXPORT_SYMBOL(printk_emit);
 
+#ifdef CONFIG_ARCH_WANT_NMI_PRINTK
+
+#define NMI_BUF_SIZE		4096
+
+struct nmi_seq_buf {
+	unsigned char		buffer[NMI_BUF_SIZE];
+	struct seq_buf		seq;
+};
+
+/* Safe printing in NMI context */
+static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);
+
+/* "in progress" flag of NMI printing */
+static unsigned long nmi_print_flag;
+
+/*
+ * It is not safe to call printk() directly from NMI handlers.
+ * It may be fine if the NMI detected a lock up and we have no choice
+ * but to do so, but doing an NMI on all other CPUs to get a back trace
+ * can be done with a sysrq-l. We don't want that to lock up, which
+ * can happen if the NMI interrupts a printk in progress.
+ *
+ * Instead, we redirect the vprintk() to this nmi_vprintk() that writes
+ * the content into a per cpu seq_buf buffer. Then when the NMIs are
+ * all done, we can safely dump the contents of the seq_buf to a printk()
+ * from a non NMI context.
+ *
+ * This is not a generic printk() implementation and must be used with
+ * great care. In particular there is a static limit on the quantity of
+ * data that may be emitted during NMI; only one client can be active at
+ * one time (arbitrated by the return value of prepare_nmi_printk()) and
+ * it is required that something at task or interrupt context be scheduled
+ * to issue the output.
+ */
+int nmi_vprintk(const char *fmt, va_list args)
+{
+	struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
+	unsigned int len = seq_buf_used(&s->seq);
+
+	seq_buf_vprintf(&s->seq, fmt, args);
+	return seq_buf_used(&s->seq) - len;
+}
+EXPORT_SYMBOL_GPL(nmi_vprintk);
+
+/*
+ * Check for concurrent usage and set up per_cpu seq_buf buffers that the NMIs
+ * running on the other CPUs will write to. Provides the mask of CPUs it is
+ * safe to write from (i.e. a copy of the online mask).
+ */
+int prepare_nmi_printk(struct cpumask *cpus)
+{
+	struct nmi_seq_buf *s;
+	int cpu;
+
+	if (test_and_set_bit(0, &nmi_print_flag)) {
+		/*
+		 * If something is already using the NMI print facility we
+		 * can't allow a second one...
+		 */
+		return -EBUSY;
+	}
+
+	cpumask_copy(cpus, cpu_online_mask);
+
+	for_each_cpu(cpu, cpus) {
+		s = &per_cpu(nmi_print_seq, cpu);
+		seq_buf_init(&s->seq, s->buffer, NMI_BUF_SIZE);
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(prepare_nmi_printk);
+
+static void print_seq_line(struct nmi_seq_buf *s, int start, int end)
+{
+	const char *buf = s->buffer + start;
+
+	printk("%.*s", (end - start) + 1, buf);
+}
+
+void complete_nmi_printk(struct cpumask *cpus)
+{
+	struct nmi_seq_buf *s;
+	int len;
+	int cpu;
+	int i;
+
+	/*
+	 * Now that all the NMIs have triggered, we can dump out their
+	 * back traces safely to the console.
+	 */
+	for_each_cpu(cpu, cpus) {
+		int last_i = 0;
+
+		s = &per_cpu(nmi_print_seq, cpu);
+
+		len = seq_buf_used(&s->seq);
+		if (!len)
+			continue;
+
+		/* Print line by line. */
+		for (i = 0; i < len; i++) {
+			if (s->buffer[i] == '\n') {
+				print_seq_line(s, last_i, i);
+				last_i = i + 1;
+			}
+		}
+		/* Check if there was a partial line. */
+		if (last_i < len) {
+			print_seq_line(s, last_i, len - 1);
+			pr_cont("\n");
+		}
+	}
+
+	clear_bit(0, &nmi_print_flag);
+	smp_mb__after_atomic();
+}
+EXPORT_SYMBOL_GPL(complete_nmi_printk);
+
+#endif /* CONFIG_ARCH_WANT_NMI_PRINTK */
+
 int vprintk_default(const char *fmt, va_list args)
 {
 	int r;
@@ -1829,6 +1950,7 @@ EXPORT_SYMBOL_GPL(vprintk_default);
  */
 DEFINE_PER_CPU(printk_func_t, printk_func) = vprintk_default;
 
+
 /**
  * printk - print a kernel message
  * @fmt: format string
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 3.19-rc2 v15 6/8] x86/nmi: Use common printk functions
  2015-01-23 14:22 ` Daniel Thompson
@ 2015-01-23 14:22   ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-01-23 14:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander, Ingo Molnar, H. Peter Anvin, x86

Much of the code sitting in arch/x86/kernel/apic/hw_nmi.c to support safe
all-cpu backtracing from NMI has been copied to printk.c to make it
accessible to other architectures.

Port the x86 NMI backtrace to the generic code.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
---
 arch/x86/Kconfig              |  1 +
 arch/x86/kernel/apic/hw_nmi.c | 97 ++++---------------------------------------
 2 files changed, 8 insertions(+), 90 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index ba397bde7948..f36d3058968e 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -138,6 +138,7 @@ config X86
 	select HAVE_ACPI_APEI_NMI if ACPI
 	select ACPI_LEGACY_TABLES_LOOKUP if ACPI
 	select X86_FEATURE_NAMES if PROC_FS
+	select ARCH_WANT_NMI_PRINTK if X86_LOCAL_APIC
 
 config INSTRUCTION_DECODER
 	def_bool y
diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index 6873ab925d00..e15c3131636e 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -32,56 +32,23 @@ u64 hw_nmi_get_sample_period(int watchdog_thresh)
 static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
 static cpumask_t printtrace_mask;
 
-#define NMI_BUF_SIZE		4096
-
-struct nmi_seq_buf {
-	unsigned char		buffer[NMI_BUF_SIZE];
-	struct seq_buf		seq;
-};
-
-/* Safe printing in NMI context */
-static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);
-
-/* "in progress" flag of arch_trigger_all_cpu_backtrace */
-static unsigned long backtrace_flag;
-
-static void print_seq_line(struct nmi_seq_buf *s, int start, int end)
-{
-	const char *buf = s->buffer + start;
-
-	printk("%.*s", (end - start) + 1, buf);
-}
-
 void arch_trigger_all_cpu_backtrace(bool include_self)
 {
-	struct nmi_seq_buf *s;
-	int len;
-	int cpu;
 	int i;
 	int this_cpu = get_cpu();
 
-	if (test_and_set_bit(0, &backtrace_flag)) {
+	if (0 != prepare_nmi_printk(to_cpumask(backtrace_mask))) {
 		/*
-		 * If there is already a trigger_all_cpu_backtrace() in progress
-		 * (backtrace_flag == 1), don't output double cpu dump infos.
+		 * If there is already an nmi printk sequence in
+		 * progress then just give up...
 		 */
 		put_cpu();
 		return;
 	}
 
-	cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask);
 	if (!include_self)
 		cpumask_clear_cpu(this_cpu, to_cpumask(backtrace_mask));
-
 	cpumask_copy(&printtrace_mask, to_cpumask(backtrace_mask));
-	/*
-	 * Set up per_cpu seq_buf buffers that the NMIs running on the other
-	 * CPUs will write to.
-	 */
-	for_each_cpu(cpu, to_cpumask(backtrace_mask)) {
-		s = &per_cpu(nmi_print_seq, cpu);
-		seq_buf_init(&s->seq, s->buffer, NMI_BUF_SIZE);
-	}
 
 	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
 		pr_info("sending NMI to %s CPUs:\n",
@@ -97,73 +64,23 @@ void arch_trigger_all_cpu_backtrace(bool include_self)
 		touch_softlockup_watchdog();
 	}
 
-	/*
-	 * Now that all the NMIs have triggered, we can dump out their
-	 * back traces safely to the console.
-	 */
-	for_each_cpu(cpu, &printtrace_mask) {
-		int last_i = 0;
-
-		s = &per_cpu(nmi_print_seq, cpu);
-		len = seq_buf_used(&s->seq);
-		if (!len)
-			continue;
-
-		/* Print line by line. */
-		for (i = 0; i < len; i++) {
-			if (s->buffer[i] == '\n') {
-				print_seq_line(s, last_i, i);
-				last_i = i + 1;
-			}
-		}
-		/* Check if there was a partial line. */
-		if (last_i < len) {
-			print_seq_line(s, last_i, len - 1);
-			pr_cont("\n");
-		}
-	}
-
-	clear_bit(0, &backtrace_flag);
-	smp_mb__after_atomic();
+	complete_nmi_printk(&printtrace_mask);
 	put_cpu();
 }
 
-/*
- * It is not safe to call printk() directly from NMI handlers.
- * It may be fine if the NMI detected a lock up and we have no choice
- * but to do so, but doing a NMI on all other CPUs to get a back trace
- * can be done with a sysrq-l. We don't want that to lock up, which
- * can happen if the NMI interrupts a printk in progress.
- *
- * Instead, we redirect the vprintk() to this nmi_vprintk() that writes
- * the content into a per cpu seq_buf buffer. Then when the NMIs are
- * all done, we can safely dump the contents of the seq_buf to a printk()
- * from a non NMI context.
- */
-static int nmi_vprintk(const char *fmt, va_list args)
-{
-	struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
-	unsigned int len = seq_buf_used(&s->seq);
-
-	seq_buf_vprintf(&s->seq, fmt, args);
-	return seq_buf_used(&s->seq) - len;
-}
-
 static int
 arch_trigger_all_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)
 {
 	int cpu;
+	printk_func_t orig;
 
 	cpu = smp_processor_id();
 
 	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
-		printk_func_t printk_func_save = this_cpu_read(printk_func);
-
-		/* Replace printk to write into the NMI seq */
-		this_cpu_write(printk_func, nmi_vprintk);
+		orig = this_cpu_begin_nmi_printk();
 		printk(KERN_WARNING "NMI backtrace for cpu %d\n", cpu);
 		show_regs(regs);
-		this_cpu_write(printk_func, printk_func_save);
+		this_cpu_end_nmi_printk(orig);
 
 		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
 		return NMI_HANDLED;
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 3.19-rc2 v15 7/8] ARM: Add support for on-demand backtrace of other CPUs
  2015-01-23 14:22 ` Daniel Thompson
@ 2015-01-23 14:22   ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-01-23 14:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

Duplicate the x86 code to trigger a backtrace using an NMI and hook
it up to IPI on ARM. Where it is possible for the hardware to do so,
the IPI will be delivered at FIQ level.

Also provided are a few small items of plumbing to hook up the new code.
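
For anyone checking the arithmetic (this note is editorial, not part of
the patch): IPI_CPU_BACKTRACE is appended as the ninth entry of enum
ipi_msg_type and therefore takes the value 8, which is why NR_IPI grows
from 8 to 9 and why the BUILD_BUG_ON() added to handle_IPI() below holds:

	/* IPI_CPU_BACKTRACE == 8, so BIT(IPI_CPU_BACKTRACE) == 1 << 8 == 0x0100 */
	BUILD_BUG_ON(SMP_IPI_FIQ_MASK != BIT(IPI_CPU_BACKTRACE));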

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
 arch/arm/Kconfig               |  1 +
 arch/arm/include/asm/hardirq.h |  2 +-
 arch/arm/include/asm/irq.h     |  5 ++++
 arch/arm/include/asm/smp.h     |  3 ++
 arch/arm/kernel/smp.c          | 68 ++++++++++++++++++++++++++++++++++++++++++
 arch/arm/kernel/traps.c        |  3 ++
 6 files changed, 81 insertions(+), 1 deletion(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 97d07ed60a0b..91d62731b52d 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -11,6 +11,7 @@ config ARM
 	select ARCH_USE_BUILTIN_BSWAP
 	select ARCH_USE_CMPXCHG_LOCKREF
 	select ARCH_WANT_IPC_PARSE_VERSION
+	select ARCH_WANT_NMI_PRINTK
 	select BUILDTIME_EXTABLE_SORT if MMU
 	select CLONE_BACKWARDS
 	select CPU_PM if (SUSPEND || CPU_IDLE)
diff --git a/arch/arm/include/asm/hardirq.h b/arch/arm/include/asm/hardirq.h
index fe3ea776dc34..5df33e30ae1b 100644
--- a/arch/arm/include/asm/hardirq.h
+++ b/arch/arm/include/asm/hardirq.h
@@ -5,7 +5,7 @@
 #include <linux/threads.h>
 #include <asm/irq.h>
 
-#define NR_IPI	8
+#define NR_IPI	9
 
 typedef struct {
 	unsigned int __softirq_pending;
diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h
index 53c15dec7af6..be1d07d59ee9 100644
--- a/arch/arm/include/asm/irq.h
+++ b/arch/arm/include/asm/irq.h
@@ -35,6 +35,11 @@ extern void (*handle_arch_irq)(struct pt_regs *);
 extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
 #endif
 
+#ifdef CONFIG_SMP
+extern void arch_trigger_all_cpu_backtrace(bool);
+#define arch_trigger_all_cpu_backtrace(x) arch_trigger_all_cpu_backtrace(x)
+#endif
+
 #endif
 
 #endif
diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
index 18f5a554134f..b076584ac0fa 100644
--- a/arch/arm/include/asm/smp.h
+++ b/arch/arm/include/asm/smp.h
@@ -18,6 +18,8 @@
 # error "<asm/smp.h> included in non-SMP build"
 #endif
 
+#define SMP_IPI_FIQ_MASK 0x0100
+
 #define raw_smp_processor_id() (current_thread_info()->cpu)
 
 struct seq_file;
@@ -79,6 +81,7 @@ extern void arch_send_call_function_single_ipi(int cpu);
 extern void arch_send_call_function_ipi_mask(const struct cpumask *mask);
 extern void arch_send_wakeup_ipi_mask(const struct cpumask *mask);
 
+extern void ipi_cpu_backtrace(struct pt_regs *regs);
 extern int register_ipi_completion(struct completion *completion, int cpu);
 
 struct smp_operations {
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 5e6052e18850..93fe51d305d1 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -26,6 +26,7 @@
 #include <linux/completion.h>
 #include <linux/cpufreq.h>
 #include <linux/irq_work.h>
+#include <linux/seq_buf.h>
 
 #include <linux/atomic.h>
 #include <asm/smp.h>
@@ -72,6 +73,7 @@ enum ipi_msg_type {
 	IPI_CPU_STOP,
 	IPI_IRQ_WORK,
 	IPI_COMPLETION,
+	IPI_CPU_BACKTRACE,
 };
 
 static DECLARE_COMPLETION(cpu_running);
@@ -444,6 +446,7 @@ static const char *ipi_types[NR_IPI] __tracepoint_string = {
 	S(IPI_CPU_STOP, "CPU stop interrupts"),
 	S(IPI_IRQ_WORK, "IRQ work interrupts"),
 	S(IPI_COMPLETION, "completion interrupts"),
+	S(IPI_CPU_BACKTRACE, "backtrace interrupts"),
 };
 
 static void smp_cross_call(const struct cpumask *target, unsigned int ipinr)
@@ -558,6 +561,8 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 	unsigned int cpu = smp_processor_id();
 	struct pt_regs *old_regs = set_irq_regs(regs);
 
+	BUILD_BUG_ON(SMP_IPI_FIQ_MASK != BIT(IPI_CPU_BACKTRACE));
+
 	if ((unsigned)ipinr < NR_IPI) {
 		trace_ipi_entry(ipi_types[ipinr]);
 		__inc_irq_stat(cpu, ipi_irqs[ipinr]);
@@ -611,6 +616,12 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 		irq_exit();
 		break;
 
+	case IPI_CPU_BACKTRACE:
+		irq_enter();
+		ipi_cpu_backtrace(regs);
+		irq_exit();
+		break;
+
 	default:
 		pr_crit("CPU%u: Unknown IPI message 0x%x\n",
 		        cpu, ipinr);
@@ -705,3 +716,60 @@ static int __init register_cpufreq_notifier(void)
 core_initcall(register_cpufreq_notifier);
 
 #endif
+
+/* For reliability, we're prepared to waste bits here. */
+static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
+static cpumask_t printtrace_mask;
+
+void arch_trigger_all_cpu_backtrace(bool include_self)
+{
+	int i;
+	int this_cpu = get_cpu();
+
+	if (0 != prepare_nmi_printk(to_cpumask(backtrace_mask))) {
+		/*
+		 * If there is already an nmi printk sequence in
+		 * progress then just give up...
+		 */
+		put_cpu();
+		return;
+	}
+
+	if (!include_self)
+		cpumask_clear_cpu(this_cpu, to_cpumask(backtrace_mask));
+	cpumask_copy(&printtrace_mask, to_cpumask(backtrace_mask));
+
+	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
+		pr_info("Sending FIQ to %s CPUs:\n",
+			(include_self ? "all" : "other"));
+		smp_cross_call(to_cpumask(backtrace_mask), IPI_CPU_BACKTRACE);
+	}
+
+	/* Wait for up to 10 seconds for all CPUs to do the backtrace */
+	for (i = 0; i < 10 * 1000; i++) {
+		if (cpumask_empty(to_cpumask(backtrace_mask)))
+			break;
+		mdelay(1);
+		touch_softlockup_watchdog();
+	}
+
+	complete_nmi_printk(&printtrace_mask);
+	put_cpu();
+}
+
+void ipi_cpu_backtrace(struct pt_regs *regs)
+{
+	int cpu;
+	printk_func_t orig;
+
+	cpu = smp_processor_id();
+
+	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
+		orig = this_cpu_begin_nmi_printk();
+		pr_warn("FIQ backtrace for cpu %d\n", cpu);
+		show_regs(regs);
+		this_cpu_end_nmi_printk(orig);
+
+		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
+	}
+}
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index b35e220ae1b1..1836415b8a5c 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -483,6 +483,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
 #ifdef CONFIG_ARM_GIC
 	gic_handle_fiq_ipi();
 #endif
+#ifdef CONFIG_SMP
+	ipi_cpu_backtrace(regs);
+#endif
 
 	nmi_exit();
 
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 3.19-rc2 v15 8/8] ARM: Fix on-demand backtrace triggered by IRQ
  2015-01-23 14:22 ` Daniel Thompson
@ 2015-01-23 14:22   ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-01-23 14:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

Currently, if arch_trigger_all_cpu_backtrace() is called with interrupts
disabled on a platform that delivers IPI_CPU_BACKTRACE using a regular
IRQ, the system will wedge for ten seconds waiting for the current CPU
to react to a masked interrupt.

This patch resolves this issue by calling directly into the backtrace
dump code instead of generating an IPI.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Russell King <linux@arm.linux.org.uk>
---
 arch/arm/kernel/smp.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 93fe51d305d1..ef35cf832aee 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -739,6 +739,16 @@ void arch_trigger_all_cpu_backtrace(bool include_self)
 		cpumask_clear_cpu(this_cpu, to_cpumask(backtrace_mask));
 	cpumask_copy(&printtrace_mask, to_cpumask(backtrace_mask));
 
+	/*
+	 * If irqs are disabled on the current processor then, if
+	 * IPI_CPU_BACKTRACE is delivered using IRQ, we won't be able to
+	 * react to IPI_CPU_BACKTRACE until we leave this function. We avoid
+	 * the potential timeout (not to mention the failure to print useful
+	 * information) by calling the backtrace directly.
+	 */
+	if (include_self && irqs_disabled())
+		ipi_cpu_backtrace(in_interrupt() ? get_irq_regs() : NULL);
+
 	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
 		pr_info("Sending FIQ to %s CPUs:\n",
 			(include_self ? "all" : "other"));
@@ -767,7 +777,10 @@ void ipi_cpu_backtrace(struct pt_regs *regs)
 	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
 		orig = this_cpu_begin_nmi_printk();
 		pr_warn("FIQ backtrace for cpu %d\n", cpu);
-		show_regs(regs);
+		if (regs != NULL)
+			show_regs(regs);
+		else
+			dump_stack();
 		this_cpu_end_nmi_printk(orig);
 
 		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* Re: [PATCH 3.19-rc2 v15 5/8] printk: Simple implementation for NMI backtracing
  2015-01-23 14:22   ` Daniel Thompson
@ 2015-01-24 21:44     ` Thomas Gleixner
  -1 siblings, 0 replies; 94+ messages in thread
From: Thomas Gleixner @ 2015-01-24 21:44 UTC (permalink / raw)
  To: Daniel Thompson
  Cc: Jason Cooper, Russell King, Will Deacon, Catalin Marinas,
	Marc Zyngier, Stephen Boyd, John Stultz, Steven Rostedt,
	linux-kernel, linux-arm-kernel, patches, linaro-kernel,
	Sumit Semwal, Dirk Behme, Daniel Drake, Dmitry Pervushin,
	Tim Sander

On Fri, 23 Jan 2015, Daniel Thompson wrote:
> +#ifdef CONFIG_ARCH_WANT_NMI_PRINTK
> +extern __printf(1, 0) int nmi_vprintk(const char *fmt, va_list args);
> +
> +struct cpumask;
> +extern int prepare_nmi_printk(struct cpumask *cpus);
> +extern void complete_nmi_printk(struct cpumask *cpus);
> +
> +/*
> + * Replace printk to write into the NMI seq.
> + *
> + * To avoid include hell this is a macro rather than an inline function
> + * (printk_func is not declared in this header file).
> + */
> +#define this_cpu_begin_nmi_printk() ({ \
> +	printk_func_t __orig = this_cpu_read(printk_func); \
> +	this_cpu_write(printk_func, nmi_vprintk); \
> +	__orig; \
> +})
> +#define this_cpu_end_nmi_printk(fn) this_cpu_write(printk_func, fn)

Why can't we just make it a proper function in printk.c and make
DEFINE_PER_CPU(printk_func_t, printk_func) static once x86 is
converted over, thereby getting rid of the misplaced declaration in
percpu.h?

It's really not performance critical at all. If you do system wide
backtraces a function call is the least of your worries.

> +#ifdef CONFIG_ARCH_WANT_NMI_PRINTK

Why can't this simply be CONFIG_PRINTK_NMI and live at the same place
as the other printk related options?

> +int nmi_vprintk(const char *fmt, va_list args)
> +{
> +	struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
> +	unsigned int len = seq_buf_used(&s->seq);
> +
> +	seq_buf_vprintf(&s->seq, fmt, args);
> +	return seq_buf_used(&s->seq) - len;
> +}
> +EXPORT_SYMBOL_GPL(nmi_vprintk);

What's the point of these exports? This stuff is really not supposed
to be used inside random modules.

> +/*
> + * Check for concurrent usage and set up per_cpu seq_buf buffers that the NMIs
> + * running on the other CPUs will write to. Provides the mask of CPUs it is
> + * safe to write from (i.e. a copy of the online mask).
> + */
> +int prepare_nmi_printk(struct cpumask *cpus)

Can we please make all this properly prefixed, i.e. printk_nmi_*?

> +{
> +	struct nmi_seq_buf *s;
> +	int cpu;
> +
> +	if (test_and_set_bit(0, &nmi_print_flag)) {
> +		/*
> +		 * If something is already using the NMI print facility we
> +		 * can't allow a second one...
> +		 */
> +		return -EBUSY;

So what's the point of saving and restoring the printk_func pointer at
the call site?

void printk_nmi_begin()
{
	if (__this_cpu_inc_return(nmi_printk_nest_level) == 1)
	      this_cpu_write(printk_func, nmi_vprintk);
}

void printk_nmi_end()
{
	if (__this_cpu_dec_return(nmi_printk_nest_level) > 0)
	        return;
        this_cpu_write(printk_func, default_vprintk);
	if (in_nmi())
		irq_work_schedule();
        else
		printk_nmi_complete();
}

> +	}
> +
> +	cpumask_copy(cpus, cpu_online_mask);

Why do you need external storage for this if nesting is not allowed?
What's wrong with having a printk_nmi_mask? It's protected by the
nmi_print_flag, so the call sites do not have to take care about
protecting it until printk_nmi_complete() has been invoked.

> +	for_each_cpu(cpu, cpus) {
> +		s = &per_cpu(nmi_print_seq, cpu);
> +		seq_buf_init(&s->seq, s->buffer, NMI_BUF_SIZE);

Why do you want to do this here? The buffers should be initialized
before the first NMI can hit and the complete code should reinit them
before the next printk_nmi_prepare() sees the nmi_print_flag cleared.
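
For illustration only, the complete/flush side could then look roughly
like the sketch below. It reuses the print_seq_line() helper quoted just
below and the per-cpu buffers from the patch; the name just follows the
printk_nmi_* prefix requested above and the details are obviously open:

void printk_nmi_complete(void)
{
	struct nmi_seq_buf *s;
	int cpu, last, i;

	for_each_possible_cpu(cpu) {
		s = &per_cpu(nmi_print_seq, cpu);

		/* replay whatever this CPU buffered, one line at a time */
		for (i = 0, last = 0; i < seq_buf_used(&s->seq); i++) {
			if (s->buffer[i] != '\n')
				continue;
			print_seq_line(s, last, i);
			last = i + 1;
		}
		if (last < seq_buf_used(&s->seq))
			print_seq_line(s, last, seq_buf_used(&s->seq) - 1);

		/* reinit before the flag is cleared, as described above */
		seq_buf_init(&s->seq, s->buffer, NMI_BUF_SIZE);
	}

	clear_bit(0, &nmi_print_flag);
}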

> +static void print_seq_line(struct nmi_seq_buf *s, int start, int end)
> +{
> +	const char *buf = s->buffer + start;
> +
> +	printk("%.*s", (end - start) + 1, buf);
> +}
> +
> +void complete_nmi_printk(struct cpumask *cpus)
> +{
> +	struct nmi_seq_buf *s;
> +	int len;
> +	int cpu;
> +	int i;

Please condense all ints to a single line, but what's worse is the
complete inconsistency with respect to scopes.

len and i are only used in the for_each loop. Either we put all of
them at the top of the function or we do it right.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 3.19-rc2 v15 4/8] sched_clock: Avoid deadlock during read from NMI
  2015-01-23 14:22   ` Daniel Thompson
@ 2015-01-24 22:40     ` Thomas Gleixner
  -1 siblings, 0 replies; 94+ messages in thread
From: Thomas Gleixner @ 2015-01-24 22:40 UTC (permalink / raw)
  To: Daniel Thompson
  Cc: Jason Cooper, Russell King, Will Deacon, Catalin Marinas,
	Marc Zyngier, Stephen Boyd, John Stultz, Steven Rostedt,
	linux-kernel, linux-arm-kernel, patches, linaro-kernel,
	Sumit Semwal, Dirk Behme, Daniel Drake, Dmitry Pervushin,
	Tim Sander

On Fri, 23 Jan 2015, Daniel Thompson wrote:
> This patch fixes that problem by providing banked clock data in a
> similar manner to Thomas Gleixner's 4396e058c52e("timekeeping: Provide
> fast and NMI safe access to CLOCK_MONOTONIC").

By some definition of similar.

> -struct clock_data {
> -	ktime_t wrap_kt;
> +struct clock_data_banked {
>  	u64 epoch_ns;
>  	u64 epoch_cyc;
> -	seqcount_t seq;
> -	unsigned long rate;
> +	u64 (*read_sched_clock)(void);
> +	u64 sched_clock_mask;
>  	u32 mult;
>  	u32 shift;
>  	bool suspended;
>  };
>  
> +struct clock_data {
> +	ktime_t wrap_kt;
> +	seqcount_t seq;
> +	unsigned long rate;
> +	struct clock_data_banked bank[2];
> +};

....

> -static u64 __read_mostly (*read_sched_clock)(void) = jiffy_sched_clock_read;
> +static struct clock_data cd = {
> +	.bank = {
> +		[0] = {
> +			.mult	= NSEC_PER_SEC / HZ,
> +			.read_sched_clock = jiffy_sched_clock_read,
> +		},
> +	},
> +};

If you had carefully studied the changes which made it possible to do
the NMI-safe clock monotonic accessor then you'd have noticed that I
went to great lengths to optimize the cache footprint first and then added
this newfangled thing.

So in the first place 'cd' lacks ____cacheline_aligned. It should have
been there before, but that's a different issue. You should have
noticed.

Secondly, I don't see any hint that you actually thought about the
cache footprint of the resulting struct clock_data.

struct clock_data {
	ktime_t wrap_kt;
	seqcount_t seq;
	unsigned long rate;
	struct clock_data_banked bank[2];
};

wrap_kt and rate are completely irrelevant for the hotpath. The whole
thing up to the last member of bank[0] still fits into 64 bytes on both
32 and 64bit, but that's not by design and not documented, so anyone
who is aware of cache footprint issues will go WTF when the first
member of a hot path data structure is completely irrelevant.

>  static inline u64 notrace cyc_to_ns(u64 cyc, u32 mult, u32 shift)
>  {
> @@ -58,50 +65,82 @@ static inline u64 notrace cyc_to_ns(u64 cyc, u32 mult, u32 shift)
>  
>  unsigned long long notrace sched_clock(void)
>  {
> -	u64 epoch_ns;
> -	u64 epoch_cyc;
>  	u64 cyc;
>  	unsigned long seq;
> -
> -	if (cd.suspended)
> -		return cd.epoch_ns;
> +	struct clock_data_banked *b;
> +	u64 res;

So we now have

  	u64 cyc;
  	unsigned long seq;
	struct clock_data_banked *b;
	u64 res;

Let me try a different version of that:

	struct clock_data_banked *b;
  	unsigned long seq;
	u64 res, cyc;

Can you spot the difference in the reading experience?
 
>  	do {
> -		seq = raw_read_seqcount_begin(&cd.seq);
> -		epoch_cyc = cd.epoch_cyc;
> -		epoch_ns = cd.epoch_ns;
> +		seq = raw_read_seqcount(&cd.seq);
> +		b = cd.bank + (seq & 1);
> +		if (b->suspended) {
> +			res = b->epoch_ns;

So now we have read_sched_clock as a pointer in the bank. Why do you
still need b->suspended?

What's wrong with setting b->read_sched_clock to NULL at suspend and
restore the proper pointer on resume and use that as a conditional?
 
It would allow the compiler to generate better code, but that's
obviously not the goal here. Darn, this is hot path code and not some
random driver.

> +		} else {
> +			cyc = b->read_sched_clock();
> +			cyc = (cyc - b->epoch_cyc) & b->sched_clock_mask;
> +			res = b->epoch_ns + cyc_to_ns(cyc, b->mult, b->shift);

It would allow the following optimization as well:

   	 res = b->epoch_ns;
	 if (b->read_sched_clock) {
	    	...
	 }

If you think that compilers are smart enough to figure all that out
for you, you might get surprised. The clearer your code is, the better
the chance that the compiler gets it right. We have seen the opposite
of that as well, but that's clearly a compiler bug.
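
To illustrate the suspend/resume side of that (a rough sketch only;
update_bank_begin/end() are the helpers from this patch and the stashed
pointer is just a placeholder name):

static u64 (*stashed_read_sched_clock)(void);

static int sched_clock_suspend(void)
{
	struct clock_data_banked *b = update_bank_begin();

	/* readers now take the !read_sched_clock path and return epoch_ns */
	stashed_read_sched_clock = b->read_sched_clock;
	b->read_sched_clock = NULL;
	update_bank_end();

	return 0;
}

static void sched_clock_resume(void)
{
	struct clock_data_banked *b = update_bank_begin();

	b->epoch_cyc = stashed_read_sched_clock();
	b->read_sched_clock = stashed_read_sched_clock;
	update_bank_end();
}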

> +/*
> + * Start updating the banked clock data.
> + *
> + * sched_clock will never observe mis-matched data even if called from
> + * an NMI. We do this by maintaining an odd/even copy of the data and
> + * steering sched_clock to one or the other using a sequence counter.
> + * In order to preserve the data cache profile of sched_clock as much
> + * as possible the system reverts back to the even copy when the update
> + * completes; the odd copy is used *only* during an update.
> + *
> + * The caller is responsible for avoiding simultaneous updates.
> + */
> +static struct clock_data_banked *update_bank_begin(void)
> +{
> +	/* update the backup (odd) bank and steer readers towards it */
> +	memcpy(cd.bank + 1, cd.bank, sizeof(struct clock_data_banked));
> +	raw_write_seqcount_latch(&cd.seq);
> +
> +	return cd.bank;
> +}
> +
> +/*
> + * Finalize update of banked clock data.
> + *
> + * This is just a trivial switch back to the primary (even) copy.
> + */
> +static void update_bank_end(void)
> +{
> +	raw_write_seqcount_latch(&cd.seq);
>  }

What's wrong with having a master struct

struct master_data {
	struct clock_data_banked master_data;
	ktime_t wrap_kt;
	unsigned long rate;
	u64 (*real_read_sched_clock)(void);
};

Then you only have to care about the serialization of the master_data
update and then the hotpath data update would be the same simple
function as update_fast_timekeeper(). And it would have the same
ordering scheme and aside of that the resulting code would be simpler,
more intuitive to read and I'm pretty sure faster.
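
Roughly, mirroring update_fast_timekeeper() (the function name is purely
illustrative):

static void update_clock_data(const struct clock_data_banked *master)
{
	/* force NMI-safe readers onto bank[1] while bank[0] is updated */
	raw_write_seqcount_latch(&cd.seq);
	cd.bank[0] = *master;

	/* switch readers back to bank[0] and bring bank[1] up to date */
	raw_write_seqcount_latch(&cd.seq);
	cd.bank[1] = *master;
}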

Thanks,

	tglx



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 3.19-rc2 v15 5/8] printk: Simple implementation for NMI backtracing
  2015-01-24 21:44     ` Thomas Gleixner
@ 2015-01-26 17:21       ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-01-26 17:21 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jason Cooper, Russell King, Will Deacon, Catalin Marinas,
	Marc Zyngier, Stephen Boyd, John Stultz, Steven Rostedt,
	linux-kernel, linux-arm-kernel, patches, linaro-kernel,
	Sumit Semwal, Dirk Behme, Daniel Drake, Dmitry Pervushin,
	Tim Sander

On 24/01/15 21:44, Thomas Gleixner wrote:
> On Fri, 23 Jan 2015, Daniel Thompson wrote:
>> +#ifdef CONFIG_ARCH_WANT_NMI_PRINTK
>> +extern __printf(1, 0) int nmi_vprintk(const char *fmt, va_list args);
>> +
>> +struct cpumask;
>> +extern int prepare_nmi_printk(struct cpumask *cpus);
>> +extern void complete_nmi_printk(struct cpumask *cpus);
>> +
>> +/*
>> + * Replace printk to write into the NMI seq.
>> + *
>> + * To avoid include hell this is a macro rather than an inline function
>> + * (printk_func is not declared in this header file).
>> + */
>> +#define this_cpu_begin_nmi_printk() ({ \
>> +	printk_func_t __orig = this_cpu_read(printk_func); \
>> +	this_cpu_write(printk_func, nmi_vprintk); \
>> +	__orig; \
>> +})
>> +#define this_cpu_end_nmi_printk(fn) this_cpu_write(printk_func, fn)
> 
> Why can't we just make it a proper function in printk.c and make
> DEFINE_PER_CPU(printk_func_t, printk_func) static once x86 is
> converted over, thereby getting rid of the misplaced declaration in
> percpu.h?
> 
> It's really not performance critical at all. If you do system wide
> backtraces a function call is the least of your worries.

Yes. I'll make this a proper function.

Not sure about tidying up printk_func though. I had hoped to use that to
get rid of the CONFIG_KGDB_KDB ifdefs that are currently found in printk.c.


>> +#ifdef CONFIG_ARCH_WANT_NMI_PRINTK
> 
> Why can't this simply be CONFIG_PRINTK_NMI and live at the same place
> as the other printk related options?

Will do.


>> +int nmi_vprintk(const char *fmt, va_list args)
>> +{
>> +	struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
>> +	unsigned int len = seq_buf_used(&s->seq);
>> +
>> +	seq_buf_vprintf(&s->seq, fmt, args);
>> +	return seq_buf_used(&s->seq) - len;
>> +}
>> +EXPORT_SYMBOL_GPL(nmi_vprintk);
> 
> What's the point of these exports? This stuff is really not supposed
> to be used inside random modules.

Will do.


>> +/*
>> + * Check for concurrent usage and set up per_cpu seq_buf buffers that the NMIs
>> + * running on the other CPUs will write to. Provides the mask of CPUs it is
>> + * safe to write from (i.e. a copy of the online mask).
>> + */
>> +int prepare_nmi_printk(struct cpumask *cpus)
> 
> Can we please make all this properly prefixed, i.e. printk_nmi_*?

Will do.


>> +{
>> +	struct nmi_seq_buf *s;
>> +	int cpu;
>> +
>> +	if (test_and_set_bit(0, &nmi_print_flag)) {
>> +		/*
>> +		 * If something is already using the NMI print facility we
>> +		 * can't allow a second one...
>> +		 */
>> +		return -EBUSY;
> 
> So what's the point of saving and restoring the printk_func pointer at
> the call site?
> 
> void printk_nmi_begin()
> {
> 	if (__this_cpu_inc_return(nmi_printk_nest_level) == 1)
> 	      this_cpu_write(printk_func, nmi_vprintk);
> }
> 
> void printk_nmi_end()
> {
> 	if (__this_cpu_dec_return(nmi_printk_nest_level) > 0)
> 	        return;
>         this_cpu_write(printk_func, default_vprintk);

Looks good to here.


> 	if (in_nmi())
> 		irq_work_schedule();
>         else
> 		printk_nmi_complete();
> }

Not sure about using irq_work here. arch_trigger_all_cpu_backtrace is
generally called when something's gone bad, meaning there's a good chance
that interrupts are masked.


> 
>> +	}
>> +
>> +	cpumask_copy(cpus, cpu_online_mask);
> 
> Why do you need external storage for this if nesting is not allowed?
> What's wrong with having a printk_nmi_mask? It's protected by the
> nmi_print_flag, so the call sites do not have to take care about
> protecting it until printk_nmi_complete() has been invoked.

It was used to tell the caller which CPUs are initialized and allowed to
trace...

On reflection though that's a rather pointless optimization. Given the
quantity of data we're about to throw on the console I can't really see
any reason not to use for_each_possible_cpu() for initialization and
leave the caller to figure out which cores to send IPIs to.
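
i.e. the initialization simply becomes a walk over every possible CPU,
something like this sketch (the function name is a placeholder and where
it ends up being called from is exactly the point discussed just below;
nmi_print_seq and NMI_BUF_SIZE are the ones from the patch):

static void printk_nmi_init_buffers(void)
{
	int cpu;

	for_each_possible_cpu(cpu) {
		struct nmi_seq_buf *s = &per_cpu(nmi_print_seq, cpu);

		seq_buf_init(&s->seq, s->buffer, NMI_BUF_SIZE);
	}
}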


>> +	for_each_cpu(cpu, cpus) {
>> +		s = &per_cpu(nmi_print_seq, cpu);
>> +		seq_buf_init(&s->seq, s->buffer, NMI_BUF_SIZE);
> 
> Why do you want to do this here? The buffers should be initialized
> before the first NMI can hit and the complete code should reinit them
> before the next printk_nmi_prepare() sees the nmi_print_flag cleared.

To be honest I inherited the just-in-time initialization from Steven's code.

Assuming Steven didn't have a special reason to do it like that then I'm
happy to change this.


>> +static void print_seq_line(struct nmi_seq_buf *s, int start, int end)
>> +{
>> +	const char *buf = s->buffer + start;
>> +
>> +	printk("%.*s", (end - start) + 1, buf);
>> +}
>> +
>> +void complete_nmi_printk(struct cpumask *cpus)
>> +{
>> +	struct nmi_seq_buf *s;
>> +	int len;
>> +	int cpu;
>> +	int i;
> 
> Please condense all ints to a single line, but what's worse is the
> complete inconsistency with respect to scopes.
> 
> len and i are only used in the for_each loop. Either we put all of
> them at the top of the function or we do it right.

Will do.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 3.19-rc2 v15 4/8] sched_clock: Avoid deadlock during read from NMI
  2015-01-24 22:40     ` Thomas Gleixner
@ 2015-01-26 20:28       ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-01-26 20:28 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jason Cooper, Russell King, Will Deacon, Catalin Marinas,
	Marc Zyngier, Stephen Boyd, John Stultz, Steven Rostedt,
	linux-kernel, linux-arm-kernel, patches, linaro-kernel,
	Sumit Semwal, Dirk Behme, Daniel Drake, Dmitry Pervushin,
	Tim Sander

On 24/01/15 22:40, Thomas Gleixner wrote:
> On Fri, 23 Jan 2015, Daniel Thompson wrote:
>> This patch fixes that problem by providing banked clock data in a
>> similar manner to Thomas Gleixner's 4396e058c52e("timekeeping: Provide
>> fast and NMI safe access to CLOCK_MONOTONIC").
> 
> By some definition of similar.

Fair point, I copied only the NMI-safety concept.

Anyhow, thanks very much for the review.


>> -struct clock_data {
>> -	ktime_t wrap_kt;
>> +struct clock_data_banked {
>>  	u64 epoch_ns;
>>  	u64 epoch_cyc;
>> -	seqcount_t seq;
>> -	unsigned long rate;
>> +	u64 (*read_sched_clock)(void);
>> +	u64 sched_clock_mask;
>>  	u32 mult;
>>  	u32 shift;
>>  	bool suspended;
>>  };
>>  
>> +struct clock_data {
>> +	ktime_t wrap_kt;
>> +	seqcount_t seq;
>> +	unsigned long rate;
>> +	struct clock_data_banked bank[2];
>> +};
> 
> ....
> 
>> -static u64 __read_mostly (*read_sched_clock)(void) = jiffy_sched_clock_read;
>> +static struct clock_data cd = {
>> +	.bank = {
>> +		[0] = {
>> +			.mult	= NSEC_PER_SEC / HZ,
>> +			.read_sched_clock = jiffy_sched_clock_read,
>> +		},
>> +	},
>> +};
> 
> If you had carefully studied the changes which made it possible to do
> the NMI-safe clock monotonic accessor then you'd have noticed that I
> went to great lengths to optimize the cache footprint first and then added
> this newfangled thing.
> 
> So in the first place 'cd' lacks ____cacheline_aligned. It should have
> been there before, but that's a different issue. You should have
> noticed.
> 
> Secondly, I don't see any hint that you actually thought about the
> cache footprint of the resulting struct clock_data.

I did think about the cache footprint but only to the point of believing
my patch was unlikely to regress performance. As it happens it was the
absence of __cacheline_aligned on cd in the current code that made me
believe absence of regression would be enough (once I'd managed that I
ordered the members within the structure to get best locality of
reference within the *patch* in order to make code review easier).

I guess I did two things wrong here: inadequately documenting what work
I did and possessing insufficient ambition to improve!

I'll work on both of these.


> struct clock_data {
> 	ktime_t wrap_kt;
> 	seqcount_t seq;
> 	unsigned long rate;
> 	struct clock_data_banked bank[2];
> };
> 
> wrap_kt and rate are completely irrelevant for the hotpath. The whole
> thing up to the last member of bank[0] still fits into 64 bytes on both
> 32 and 64bit, but that's not by design and not documented, so anyone
> who is aware of cache footprint issues will go WTF when the first
> member of a hot path data structure is completely irrelevant.

Agreed.

It looks like I also put the function pointer in the wrong place within
clock_data_banked. It should occupy the space between the 64-bit and
32-bit members, shouldn't it?
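
In other words, something roughly like this, just reshuffling the fields
already in the patch:

struct clock_data_banked {
	u64 epoch_ns;
	u64 epoch_cyc;
	u64 sched_clock_mask;
	u64 (*read_sched_clock)(void);
	u32 mult;
	u32 shift;
	bool suspended;
};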


>>  static inline u64 notrace cyc_to_ns(u64 cyc, u32 mult, u32 shift)
>>  {
>> @@ -58,50 +65,82 @@ static inline u64 notrace cyc_to_ns(u64 cyc, u32 mult, u32 shift)
>>  
>>  unsigned long long notrace sched_clock(void)
>>  {
>> -	u64 epoch_ns;
>> -	u64 epoch_cyc;
>>  	u64 cyc;
>>  	unsigned long seq;
>> -
>> -	if (cd.suspended)
>> -		return cd.epoch_ns;
>> +	struct clock_data_banked *b;
>> +	u64 res;
> 
> So we now have
> 
>   	u64 cyc;
>   	unsigned long seq;
> 	struct clock_data_banked *b;
> 	u64 res;
> 
> Let me try a different version of that:
> 
> 	struct clock_data_banked *b;
>   	unsigned long seq;
> 	u64 res, cyc;
> 
> Can you spot the difference in the reading experience?

Will fix.

>  
>>  	do {
>> -		seq = raw_read_seqcount_begin(&cd.seq);
>> -		epoch_cyc = cd.epoch_cyc;
>> -		epoch_ns = cd.epoch_ns;
>> +		seq = raw_read_seqcount(&cd.seq);
>> +		b = cd.bank + (seq & 1);
>> +		if (b->suspended) {
>> +			res = b->epoch_ns;
> 
> So now we have read_sched_clock as a pointer in the bank. Why do you
> still need b->suspended?
> 
> What's wrong with setting b->read_sched_clock to NULL at suspend and
> restore the proper pointer on resume and use that as a conditional?
>  
> It would allow the compiler to generate better code, but that's
> obviously not the goal here. Darn, this is hot path code and not some
> random driver.

The update code probably won't be as easy to read but, as you say, this
is hot path code.

>> +		} else {
>> +			cyc = b->read_sched_clock();
>> +			cyc = (cyc - b->epoch_cyc) & b->sched_clock_mask;
>> +			res = b->epoch_ns + cyc_to_ns(cyc, b->mult, b->shift);
> 
> It would allow the following optimization as well:
> 
>    	 res = b->epoch_ns;
> 	 if (b->read_sched_clock) {
> 	    	...
> 	 }
> 
> If you think that compilers are smart enough to figure all that out
> for you, you might get surprised. The clearer your code is, the better
> the chance that the compiler gets it right. We have seen the opposite
> of that as well, but that's clearly a compiler bug.

Good idea, and in this case there is a function pointer with unknown
side effects, so a compiler would never be able to make that optimization.
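
Putting your two suggestions together, the reader would end up looking
roughly like this (sketch only):

unsigned long long notrace sched_clock(void)
{
	struct clock_data_banked *b;
	unsigned long seq;
	u64 res, cyc;

	do {
		seq = raw_read_seqcount(&cd.seq);
		b = cd.bank + (seq & 1);

		res = b->epoch_ns;
		if (b->read_sched_clock) {
			cyc = (b->read_sched_clock() - b->epoch_cyc) &
			      b->sched_clock_mask;
			res += cyc_to_ns(cyc, b->mult, b->shift);
		}
	} while (read_seqcount_retry(&cd.seq, seq));

	return res;
}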


>> +/*
>> + * Start updating the banked clock data.
>> + *
>> + * sched_clock will never observe mis-matched data even if called from
>> + * an NMI. We do this by maintaining an odd/even copy of the data and
>> + * steering sched_clock to one or the other using a sequence counter.
>> + * In order to preserve the data cache profile of sched_clock as much
>> + * as possible the system reverts back to the even copy when the update
>> + * completes; the odd copy is used *only* during an update.
>> + *
>> + * The caller is responsible for avoiding simultaneous updates.
>> + */
>> +static struct clock_data_banked *update_bank_begin(void)
>> +{
>> +	/* update the backup (odd) bank and steer readers towards it */
>> +	memcpy(cd.bank + 1, cd.bank, sizeof(struct clock_data_banked));
>> +	raw_write_seqcount_latch(&cd.seq);
>> +
>> +	return cd.bank;
>> +}
>> +
>> +/*
>> + * Finalize update of banked clock data.
>> + *
>> + * This is just a trivial switch back to the primary (even) copy.
>> + */
>> +static void update_bank_end(void)
>> +{
>> +	raw_write_seqcount_latch(&cd.seq);
>>  }
> 
> What's wrong with having a master struct
> 
> struct master_data {
> 	struct clock_data_banked master_data;
> 	ktime_t wrap_kt;
> 	unsigned long rate;
> 	u64 (*real_read_sched_clock)(void);
> };
> 
> Then you only have to care about the serialization of the master_data
> update and then the hotpath data update would be the same simple
> function as update_fast_timekeeper(). And it would have the same
> ordering scheme and aside of that the resulting code would be simpler,
> more intuitive to read and I'm pretty sure faster.

Sorry. I don't quite understand this.

Is the intent to have a single function to update the hotpath data used
by both update_sched_clock() and sched_clock_register() to replace the
pairing of update_bank_begin/end()?

If so, I started out doing that but eventually concluded that
update_sched_clock() didn't really benefit from having to make a third
copy of the values it consumes rather than updates.

However if that's an unconvincing reason I'm happy to switch to having
an update structure.


Daniel.


^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 3.19-rc6 v16 0/6] irq/arm: Implement arch_trigger_all_cpu_backtrace
  2015-01-23 14:22 ` Daniel Thompson
@ 2015-02-03 19:06   ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-02-03 19:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

This patchset modifies the GIC driver to allow it, on supported
platforms, to route IPI interrupts to FIQ. It then uses this
feature to implement arch_trigger_all_cpu_backtrace for arm.
In order to neatly (and safely) bring in the changes for the arm
we also make the sched_clock implementation NMI-safe and rearrange
some of the existing x86 NMI code to make it architecture neutral.

This patchset touches a fairly large number of different sub-systems
(irq, printk, x86, arm). However, of the six patches, four fall under
one of tglx's maintainerships (either through irq or x86) and another
(printk) has no explicit maintainer. Thus unless there are objections
I'd like to gather acks from some of the folks Cc:ed on the patches.
Then I can wrap it up nicely and send it to Thomas.

The patches have been runtime tested on two systems capable of
supporting FIQ (Freescale i.MX6 and STiH416) and two that do not
(vexpress-a9 and Qualcomm Snapdragon 600), the changes to the x86
logic were tested on qemu and all patches have been compile tested
on x86, arm and arm64.

Note: On platforms not capable of supporting FIQ, the IPI to generate a
      backtrace will fall back to using IRQ for propagation instead.
      The backtrace logic contains a timeout so we will not wedge the
      requesting CPU if other CPUs are not responsive.

v16:

* Significant clean up of the printk patches (Thomas Gleixner).
  Replacing macros with real functions, CONFIG_ARCH_WANT_NMI_PRINTK
  -> CONFIG_PRINTK_NMI, prefixing global functions with printk_nmi,
  removing pointless exports, removing cpu_mask from the interfaces,
  removal of just-in-time initialization of trace buffers, prevented
  call sites having to save state, rolled up variable declarations
  into single lines.

* Dropped the sched_clock() patches from *this* patchset and managed
  them separately (http://thread.gmane.org/gmane.linux.kernel/1879261).
  The cross-dependencies between the patches are minimal; the backtrace
  code only calls sched_clock() if we are ftracing, and backtracing is
  normally only triggered to report information about a broken
  system (although users can type SysRq-l for amusement, most use it
  to find out why the system is dead).

* Squashed together the final two patches. Essentially these duplicated
  the x86 code and slavishly avoided changing it before, in the next
  patch, fixing it to work better on ARM. It seems better that the code
  just works first time!

v15:

* Added a patch to make sched_clock safe to call from NMI (Stephen
  Boyd). Note that sched_clock() is not called by the NMI handlers that
  have been added for the arm but it could be called if tools such as
  ftrace are deployed.

* Fixed some warnings picked up during bisectability testing.

v14:

* Moved a nmi_vprintk() and friends from arch/x86/kernel/apic/hw_nmi.c
  to printk.c (Steven Rostedt)

v13:

* Updated the code to print the backtrace to replicate Steven Rostedt's
  x86 work to make SysRq-l safe. This is pretty much a total rewrite of
  patches 4 and 5.

v12:

* Squash first two patches into a single one and re-describe
  (Thomas Gleixner).

* Improve description of "irqchip: gic: Make gic_raise_softirq FIQ-safe"
  (Thomas Gleixner).

v11:

* Optimized gic_raise_softirq() by replacing a register read with
  a memory read (Jason Cooper).

v10:

* Add a further patch to optimize away some of the locking on systems
  where CONFIG_BL_SWITCHER is not set (Marc Zyngier). Compiles OK with
  exynos_defconfig (which is the only defconfig to set this option).

* Whitespace fixes in patch 4. That patch previously used spaces for
  alignment of new constants but the rest of the file used tabs.

v9:

* Improved documentation and structure of initial patch (now initial
  two patches) to make gic_raise_softirq() safe to call from FIQ
  (Thomas Gleixner).

* Avoid masking interrupts during gic_raise_softirq(). The use of the
  read lock makes this redundant (because we can safely re-enter the
  function).

v8:

* Fixed build on arm64 caused by a spurious include file in irq-gic.c.

v7-2 (accidentally released twice with same number):

* Fixed boot regression on vexpress-a9 (reported by Russell King).

* Rebased on v3.18-rc3; removed one patch from set that is already
  included in mainline.

* Dropped arm64/fiq.h patch from the set (still useful but not related
  to issuing backtraces).

v7:

* Re-arranged code within the patch series to fix a regression
  introduced midway through the series and corrected by a later patch
  (testing by Olof's autobuilder). Tested offending patch in isolation
  using defconfig identified by the autobuilder.

v6:

* Renamed svc_entry's call_trace argument to just trace (example code
  from Russell King).

* Fixed mismatched ENDPROC() in __fiq_abt (example code from Russell
  King).

* Modified usr_entry to optionally avoid calling into the trace code and
  used this in FIQ entry from usr path. Modified corresponding exit code
  to avoid calling into trace code and the scheduler (example code from
  Russell King).

* Ensured the default FIQ register state is restored when the default
  FIQ handler is reinstalled (example code from Russell King).

* Renamed no_fiq_insn to dfl_fiq_insn to reflect the effect of adopting
  a default FIQ handler.

* Re-instated fiq_safe_migration_lock and associated logic in
  gic_raise_softirq(). gic_raise_softirq() is called by wake_up_klogd()
  in the console unlock logic.

v5:

* Rebased on 3.17-rc4.

* Removed a spurious line from the final "glue it together" patch
  that broke the build.

v4:

* Replaced push/pop with stmfd/ldmfd respectively (review of Nicolas
  Pitre).

* Really fix bad pt_regs pointer generation in __fiq_abt.

* Remove fiq_safe_migration_lock and associated logic in
  gic_raise_softirq() (review of Russell King)

* Restructured to introduce the default FIQ handler first, before the
  new features (review of Russell King).

v3:

* Removed redundant header guards from arch/arm64/include/asm/fiq.h
  (review of Catalin Marinas).

* Moved svc_exit_via_fiq macro to entry-header.S (review of Nicolas
  Pitre).

v2:

* Restructured to sit nicely on a similar FYI patchset from Russell
  King. It now effectively replaces the work in progress final patch
  with something much more complete.

* Implemented (and tested) a Thumb-2 implementation of svc_exit_via_fiq
  (review of Nicolas Pitre)

* Dropped the GIC group 0 workaround patch. The issue of FIQ interrupts
  being acknowledged by the IRQ handler does still exist but should be
  harmless because the IRQ handler will still wind up calling
  ipi_cpu_backtrace().

* Removed any dependency on CONFIG_FIQ; the all-CPU backtrace effectively
  becomes a platform feature (although the use of non-maskable
  interrupts to implement it is best effort rather than guaranteed).

* Better comments highlighting usage of RAZ/WI registers (and parts of
  registers) in the GIC code.

Changes *before* v1:

* This patchset is a hugely cut-down successor to "[PATCH v11 00/19]
  arm: KGDB NMI/FIQ support". Thanks to Thomas Gleixner for suggesting
  the new structure. For historic details see:
        https://lkml.org/lkml/2014/9/2/227

* Fix bug in __fiq_abt (no longer passes a bad struct pt_regs value).
  In fixing this we also remove the useless indirection previously
  found in the fiq_handler macro.

* Make default fiq handler "always on" by migrating from fiq.c to
  traps.c and replace do_unexp_fiq with the new handler (review
  of Russell King).

* Add arm64 version of fiq.h (review of Russell King)

* Removed conditional branching and code from irq-gic.c, this is
  replaced by much simpler code that relies on the GIC specification's
  heavy use of read-as-zero/write-ignored (review of Russell King)


Daniel Thompson (6):
  irqchip: gic: Optimize locking in gic_raise_softirq
  irqchip: gic: Make gic_raise_softirq FIQ-safe
  irqchip: gic: Introduce plumbing for IPI FIQ
  printk: Simple implementation for NMI backtracing
  x86/nmi: Use common printk functions
  ARM: Add support for on-demand backtrace of other CPUs

 arch/arm/Kconfig                |   1 +
 arch/arm/include/asm/hardirq.h  |   2 +-
 arch/arm/include/asm/irq.h      |   5 +
 arch/arm/include/asm/smp.h      |   3 +
 arch/arm/kernel/smp.c           |  80 ++++++++++++++++
 arch/arm/kernel/traps.c         |   8 +-
 arch/x86/Kconfig                |   1 +
 arch/x86/kernel/apic/hw_nmi.c   | 101 ++------------------
 drivers/irqchip/irq-gic.c       | 203 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 ++
 include/linux/printk.h          |  18 ++++
 init/Kconfig                    |   3 +
 kernel/printk/printk.c          | 149 +++++++++++++++++++++++++++++
 13 files changed, 471 insertions(+), 111 deletions(-)

--
1.9.3


^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 3.19-rc6 v16 0/6] irq/arm: Implement arch_trigger_all_cpu_backtrace
@ 2015-02-03 19:06   ` Daniel Thompson
  0 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-02-03 19:06 UTC (permalink / raw)
  To: linux-arm-kernel

This patchset modifies the GIC driver to allow it, on supported
platforms, to route IPI interrupts to FIQ. It then uses this
feature to implement arch_trigger_all_cpu_backtrace for arm.
In order to neatly (and safely) bring in the changes for the arm
we also make the sched_clock implementation NMI-safe and rearrange
some of the existing x86 NMI code to make it architecture neutral.

This patchset touches a fairly large number of different sub-systems
(irq, printk, x86, arm). However, of the six patches, four fall under
one of tglx's maintainerships (either through irq or x86) and another
(printk) has no explicit maintainer. Thus unless there are objections
I'd like to gather acks from some of the folks Cc:ed on the patches.
Then I can wrap it up nicely and send it to Thomas.

The patches have been runtime tested on two systems capable of
supporting FIQ (Freescale i.MX6 and STiH416) and two that do not
(vexpress-a9 and Qualcomm Snapdragon 600), the changes to the x86
logic were tested on qemu and all patches have been compile tested
on x86, arm and arm64.

Note: On platforms not capable of supporting FIQ, the IPI to generate a
      backtrace will fall back to using IRQ for propagation instead.
      The backtrace logic contains a timeout so we will not wedge the
      requesting CPU if other CPUs are not responsive.
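
As an illustration of that timeout, here is a minimal sketch of the
polling loop run by the requesting CPU (the helper name and the ten
second figure are assumptions made for the sketch, not code lifted from
the patches):

#include <linux/cpumask.h>
#include <linux/delay.h>

/* Sketch only: give the targets ~10s to respond, then carry on regardless. */
static void wait_for_backtrace_targets(const struct cpumask *pending)
{
	int i;

	for (i = 0; i < 10 * 1000; i++) {
		if (cpumask_empty(pending))
			break;		/* every target has printed its trace */
		mdelay(1);
	}
	/* No BUG/WARN here: a wedged CPU must not wedge the requester too. */
}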

v16:

* Significant clean up of the printk patches (Thomas Gleixner).
  Replacing macros with real functions, CONFIG_ARCH_WANT_NMI_PRINTK
  -> CONFIG_PRINTK_NMI, prefixing global functions with printk_nmi,
  removing pointless exports, removing cpu_mask from the interfaces,
  removal of just-in-time initialization of trace buffers, prevented
  call sites having to save state, rolled up variable declarations
  into single lines.

* Dropped the sched_clock() patches from *this* patchset and managed
  them separately (http://thread.gmane.org/gmane.linux.kernel/1879261 ).
  The cross-dependencies between the patches are minimal; the backtrace
  code only calls sched_clock() if we are ftracing and backtracing is
  normally only triggered to report information about a broken
  system (although users can type SysRq-l for amusement, most use it
  to find out why the system is dead).

* Squashed together the final two patches. Essentially these duplicated
  the x86 code and slavishly avoided changing it before, in the next
  patch, fixing it to work better on ARM. It seems better that the code
  just works first time!

v15:

* Added a patch to make sched_clock safe to call from NMI (Stephen
  Boyd). Note that sched_clock() is not called by the NMI handlers that
  have been added for the arm but it could be called if tools such as
  ftrace are deployed.

* Fixed some warnings picked up during bisectability testing.

v14:

* Moved a nmi_vprintk() and friends from arch/x86/kernel/apic/hw_nmi.c
  to printk.c (Steven Rostedt)

v13:

* Updated the code to print the backtrace to replicate Steven Rostedt's
  x86 work to make SysRq-l safe. This is pretty much a total rewrite of
  patches 4 and 5.

v12:

* Squash first two patches into a single one and re-describe
  (Thomas Gleixner).

* Improve description of "irqchip: gic: Make gic_raise_softirq FIQ-safe"
  (Thomas Gleixner).

v11:

* Optimized gic_raise_softirq() by replacing a register read with
  a memory read (Jason Cooper).

v10:

* Add a further patch to optimize away some of the locking on systems
  where CONFIG_BL_SWITCHER is not set (Marc Zyngier). Compiles OK with
  exynos_defconfig (which is the only defconfig to set this option).

* Whitespace fixes in patch 4. That patch previously used spaces for
  alignment of new constants but the rest of the file used tabs.

v9:

* Improved documentation and structure of initial patch (now initial
  two patches) to make gic_raise_softirq() safe to call from FIQ
  (Thomas Gleixner).

* Avoid masking interrupts during gic_raise_softirq(). The use of the
  read lock makes this redundant (because we can safely re-enter the
  function).

v8:

* Fixed build on arm64 caused by a spurious include file in irq-gic.c.

v7-2 (accidentally released twice with same number):

* Fixed boot regression on vexpress-a9 (reported by Russell King).

* Rebased on v3.18-rc3; removed one patch from set that is already
  included in mainline.

* Dropped arm64/fiq.h patch from the set (still useful but not related
  to issuing backtraces).

v7:

* Re-arranged code within the patch series to fix a regression
  introduced midway through the series and corrected by a later patch
  (testing by Olof's autobuilder). Tested offending patch in isolation
  using defconfig identified by the autobuilder.

v6:

* Renamed svc_entry's call_trace argument to just trace (example code
  from Russell King).

* Fixed mismatched ENDPROC() in __fiq_abt (example code from Russell
  King).

* Modified usr_entry to optionally avoid calling into the trace code and
  used this in FIQ entry from usr path. Modified corresponding exit code
  to avoid calling into trace code and the scheduler (example code from
  Russell King).

* Ensured the default FIQ register state is restored when the default
  FIQ handler is reinstalled (example code from Russell King).

* Renamed no_fiq_insn to dfl_fiq_insn to reflect the effect of adopting
  a default FIQ handler.

* Re-instated fiq_safe_migration_lock and associated logic in
  gic_raise_softirq(). gic_raise_softirq() is called by wake_up_klogd()
  in the console unlock logic.

v5:

* Rebased on 3.17-rc4.

* Removed a spurious line from the final "glue it together" patch
  that broke the build.

v4:

* Replaced push/pop with stmfd/ldmfd respectively (review of Nicolas
  Pitre).

* Really fix bad pt_regs pointer generation in __fiq_abt.

* Remove fiq_safe_migration_lock and associated logic in
  gic_raise_softirq() (review of Russell King)

* Restructured to introduce the default FIQ handler first, before the
  new features (review of Russell King).

v3:

* Removed redundant header guards from arch/arm64/include/asm/fiq.h
  (review of Catalin Marinas).

* Moved svc_exit_via_fiq macro to entry-header.S (review of Nicolas
  Pitre).

v2:

* Restructured to sit nicely on a similar FYI patchset from Russell
  King. It now effectively replaces the work in progress final patch
  with something much more complete.

* Implemented (and tested) a Thumb-2 implementation of svc_exit_via_fiq
  (review of Nicolas Pitre)

* Dropped the GIC group 0 workaround patch. The issue of FIQ interrupts
  being acknowledged by the IRQ handler does still exist but should be
  harmless because the IRQ handler will still wind up calling
  ipi_cpu_backtrace().

* Removed any dependency on CONFIG_FIQ; all cpu backtrace effectively
  becomes a platform feature (although the use of non-maskable
  interrupts to implement it is best effort rather than guaranteed).

* Better comments highlighting usage of RAZ/WI registers (and parts of
  registers) in the GIC code.

Changes *before* v1:

* This patchset is a hugely cut-down successor to "[PATCH v11 00/19]
  arm: KGDB NMI/FIQ support". Thanks to Thomas Gleixner for suggesting
  the new structure. For historic details see:
        https://lkml.org/lkml/2014/9/2/227

* Fix bug in __fiq_abt (no longer passes a bad struct pt_regs value).
  In fixing this we also remove the useless indirection previously
  found in the fiq_handler macro.

* Make default fiq handler "always on" by migrating from fiq.c to
  traps.c and replace do_unexp_fiq with the new handler (review
  of Russell King).

* Add arm64 version of fiq.h (review of Russell King)

* Removed conditional branching and code from irq-gic.c, this is
  replaced by much simpler code that relies on the GIC specification's
  heavy use of read-as-zero/write-ignored (review of Russell King)


Daniel Thompson (6):
  irqchip: gic: Optimize locking in gic_raise_softirq
  irqchip: gic: Make gic_raise_softirq FIQ-safe
  irqchip: gic: Introduce plumbing for IPI FIQ
  printk: Simple implementation for NMI backtracing
  x86/nmi: Use common printk functions
  ARM: Add support for on-demand backtrace of other CPUs

 arch/arm/Kconfig                |   1 +
 arch/arm/include/asm/hardirq.h  |   2 +-
 arch/arm/include/asm/irq.h      |   5 +
 arch/arm/include/asm/smp.h      |   3 +
 arch/arm/kernel/smp.c           |  80 ++++++++++++++++
 arch/arm/kernel/traps.c         |   8 +-
 arch/x86/Kconfig                |   1 +
 arch/x86/kernel/apic/hw_nmi.c   | 101 ++------------------
 drivers/irqchip/irq-gic.c       | 203 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 ++
 include/linux/printk.h          |  18 ++++
 init/Kconfig                    |   3 +
 kernel/printk/printk.c          | 149 +++++++++++++++++++++++++++++
 13 files changed, 471 insertions(+), 111 deletions(-)

--
1.9.3

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 3.19-rc6 v16 1/6] irqchip: gic: Optimize locking in gic_raise_softirq
  2015-02-03 19:06   ` Daniel Thompson
@ 2015-02-03 19:06     ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-02-03 19:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

Currently gic_raise_softirq() is locked using irq_controller_lock.
This lock is primarily used to make register read-modify-write sequences
atomic but gic_raise_softirq() uses it instead to ensure that the
big.LITTLE migration logic can figure out when it is safe to migrate
interrupts between physical cores.

This is sub-optimal in closely related ways:

1. No locking at all is required on systems where the b.L switcher is
   not configured.

2. Finer grain locking can be used on systems where the b.L switcher is
   present.

This patch resolves both of the above by introducing a separate finer
grain lock and providing conditionally compiled inlines to lock/unlock
it.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
---
 drivers/irqchip/irq-gic.c | 36 +++++++++++++++++++++++++++++++++---
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index d617ee5a3d8a..a9ed64dcc84b 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -73,6 +73,27 @@ struct gic_chip_data {
 static DEFINE_RAW_SPINLOCK(irq_controller_lock);
 
 /*
+ * This lock is used by the big.LITTLE migration code to ensure no IPIs
+ * can be pended on the old core after the map has been updated.
+ */
+#ifdef CONFIG_BL_SWITCHER
+static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
+
+static inline void bl_migration_lock(unsigned long *flags)
+{
+	raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags);
+}
+
+static inline void bl_migration_unlock(unsigned long flags)
+{
+	raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags);
+}
+#else
+static inline void bl_migration_lock(unsigned long *flags) {}
+static inline void bl_migration_unlock(unsigned long flags) {}
+#endif
+
+/*
  * The GIC mapping of CPU interfaces does not necessarily match
  * the logical CPU numbering.  Let's use a mapping as returned
  * by the GIC itself.
@@ -624,7 +645,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	int cpu;
 	unsigned long flags, map = 0;
 
-	raw_spin_lock_irqsave(&irq_controller_lock, flags);
+	bl_migration_lock(&flags);
 
 	/* Convert our logical CPU mask into a physical one. */
 	for_each_cpu(cpu, mask)
@@ -639,7 +660,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	/* this always happens on GIC0 */
 	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
-	raw_spin_unlock_irqrestore(&irq_controller_lock, flags);
+	bl_migration_unlock(flags);
 }
 #endif
 
@@ -710,8 +731,17 @@ void gic_migrate_target(unsigned int new_cpu_id)
 
 	raw_spin_lock(&irq_controller_lock);
 
-	/* Update the target interface for this logical CPU */
+	/*
+	 * Update the target interface for this logical CPU
+	 *
+	 * From the point we release the cpu_map_migration_lock any new
+	 * SGIs will be pended on the new cpu which makes the set of SGIs
+	 * pending on the old cpu static. That means we can defer the
+	 * migration until after we have released the irq_controller_lock.
+	 */
+	raw_spin_lock(&cpu_map_migration_lock);
 	gic_cpu_map[cpu] = 1 << new_cpu_id;
+	raw_spin_unlock(&cpu_map_migration_lock);
 
 	/*
 	 * Find all the peripheral interrupts targetting the current
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 3.19-rc6 v16 1/6] irqchip: gic: Optimize locking in gic_raise_softirq
@ 2015-02-03 19:06     ` Daniel Thompson
  0 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-02-03 19:06 UTC (permalink / raw)
  To: linux-arm-kernel

Currently gic_raise_softirq() is locked using irq_controller_lock.
This lock is primarily used to make register read-modify-write sequences
atomic but gic_raise_softirq() uses it instead to ensure that the
big.LITTLE migration logic can figure out when it is safe to migrate
interrupts between physical cores.

This is sub-optimal in closely related ways:

1. No locking at all is required on systems where the b.L switcher is
   not configured.

2. Finer grain locking can be used on systems where the b.L switcher is
   present.

This patch resolves both of the above by introducing a separate finer
grain lock and providing conditionally compiled inlines to lock/unlock
it.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
---
 drivers/irqchip/irq-gic.c | 36 +++++++++++++++++++++++++++++++++---
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index d617ee5a3d8a..a9ed64dcc84b 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -73,6 +73,27 @@ struct gic_chip_data {
 static DEFINE_RAW_SPINLOCK(irq_controller_lock);
 
 /*
+ * This lock is used by the big.LITTLE migration code to ensure no IPIs
+ * can be pended on the old core after the map has been updated.
+ */
+#ifdef CONFIG_BL_SWITCHER
+static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
+
+static inline void bl_migration_lock(unsigned long *flags)
+{
+	raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags);
+}
+
+static inline void bl_migration_unlock(unsigned long flags)
+{
+	raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags);
+}
+#else
+static inline void bl_migration_lock(unsigned long *flags) {}
+static inline void bl_migration_unlock(unsigned long flags) {}
+#endif
+
+/*
  * The GIC mapping of CPU interfaces does not necessarily match
  * the logical CPU numbering.  Let's use a mapping as returned
  * by the GIC itself.
@@ -624,7 +645,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	int cpu;
 	unsigned long flags, map = 0;
 
-	raw_spin_lock_irqsave(&irq_controller_lock, flags);
+	bl_migration_lock(&flags);
 
 	/* Convert our logical CPU mask into a physical one. */
 	for_each_cpu(cpu, mask)
@@ -639,7 +660,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	/* this always happens on GIC0 */
 	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
-	raw_spin_unlock_irqrestore(&irq_controller_lock, flags);
+	bl_migration_unlock(flags);
 }
 #endif
 
@@ -710,8 +731,17 @@ void gic_migrate_target(unsigned int new_cpu_id)
 
 	raw_spin_lock(&irq_controller_lock);
 
-	/* Update the target interface for this logical CPU */
+	/*
+	 * Update the target interface for this logical CPU
+	 *
+	 * From the point we release the cpu_map_migration_lock any new
+	 * SGIs will be pended on the new cpu which makes the set of SGIs
+	 * pending on the old cpu static. That means we can defer the
+	 * migration until after we have released the irq_controller_lock.
+	 */
+	raw_spin_lock(&cpu_map_migration_lock);
 	gic_cpu_map[cpu] = 1 << new_cpu_id;
+	raw_spin_unlock(&cpu_map_migration_lock);
 
 	/*
 	 * Find all the peripheral interrupts targetting the current
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 3.19-rc6 v16 2/6] irqchip: gic: Make gic_raise_softirq FIQ-safe
  2015-02-03 19:06   ` Daniel Thompson
@ 2015-02-03 19:06     ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-02-03 19:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

It is currently possible for FIQ handlers to re-enter gic_raise_softirq()
and lock up.

    	gic_raise_softirq()
	   lock(x);
-~-> FIQ
        handle_fiq()
	   gic_raise_softirq()
	      lock(x);		<-- Lockup

arch/arm/ uses IPIs to implement arch_irq_work_raise(), thus this issue
renders it difficult for FIQ handlers to safely defer work to less
restrictive calling contexts.

This patch fixes the problem by converting the cpu_map_migration_lock
into a rwlock making it safe to re-enter the function.

Note that having made it safe to re-enter gic_raise_softirq() we no
longer need to mask interrupts during gic_raise_softirq() because the
b.L migration is always performed from task context.
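
To spell out why the rwlock removes the lockup shown above: nested
readers on the same CPU simply stack, while the b.L switcher remains the
only writer and runs with FIQs masked. Sketch only (annotated control
flow, not code taken from the patch):

	read_lock(&cpu_map_migration_lock);	/* task/IRQ context */
	/* ... FIQ arrives, its handler re-enters gic_raise_softirq() ... */
	read_lock(&cpu_map_migration_lock);	/* nested reader: no deadlock */
	read_unlock(&cpu_map_migration_lock);
	/* ... FIQ returns ... */
	read_unlock(&cpu_map_migration_lock);

	/* The only writer, from gic_migrate_target() in task context: */
	write_lock(&cpu_map_migration_lock);
	gic_cpu_map[cpu] = 1 << new_cpu_id;
	write_unlock(&cpu_map_migration_lock);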

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
---
 drivers/irqchip/irq-gic.c | 38 +++++++++++++++++++++++++-------------
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index a9ed64dcc84b..c172176499f6 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -75,22 +75,25 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock);
 /*
  * This lock is used by the big.LITTLE migration code to ensure no IPIs
  * can be pended on the old core after the map has been updated.
+ *
+ * This lock may be locked for reading from both IRQ and FIQ handlers
+ * and therefore must not be locked for writing when these are enabled.
  */
 #ifdef CONFIG_BL_SWITCHER
-static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
+static DEFINE_RWLOCK(cpu_map_migration_lock);
 
-static inline void bl_migration_lock(unsigned long *flags)
+static inline void bl_migration_lock(void)
 {
-	raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags);
+	read_lock(&cpu_map_migration_lock);
 }
 
-static inline void bl_migration_unlock(unsigned long flags)
+static inline void bl_migration_unlock(void)
 {
-	raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags);
+	read_unlock(&cpu_map_migration_lock);
 }
 #else
-static inline void bl_migration_lock(unsigned long *flags) {}
-static inline void bl_migration_unlock(unsigned long flags) {}
+static inline void bl_migration_lock(void) {}
+static inline void bl_migration_unlock(void) {}
 #endif
 
 /*
@@ -640,12 +643,20 @@ static void __init gic_pm_init(struct gic_chip_data *gic)
 #endif
 
 #ifdef CONFIG_SMP
+/*
+ * Raise the specified IPI on all cpus set in mask.
+ *
+ * This function is safe to call from all calling contexts, including
+ * FIQ handlers. It relies on bl_migration_lock() being multiply acquirable
+ * to avoid deadlocks when the function is re-entered at different
+ * exception levels.
+ */
 static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 {
 	int cpu;
-	unsigned long flags, map = 0;
+	unsigned long map = 0;
 
-	bl_migration_lock(&flags);
+	bl_migration_lock();
 
 	/* Convert our logical CPU mask into a physical one. */
 	for_each_cpu(cpu, mask)
@@ -660,7 +671,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	/* this always happens on GIC0 */
 	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
-	bl_migration_unlock(flags);
+	bl_migration_unlock();
 }
 #endif
 
@@ -708,7 +719,8 @@ int gic_get_cpu_id(unsigned int cpu)
  * Migrate all peripheral interrupts with a target matching the current CPU
  * to the interface corresponding to @new_cpu_id.  The CPU interface mapping
  * is also updated.  Targets to other CPU interfaces are unchanged.
- * This must be called with IRQs locally disabled.
+ * This must be called from a task context and with IRQ and FIQ locally
+ * disabled.
  */
 void gic_migrate_target(unsigned int new_cpu_id)
 {
@@ -739,9 +751,9 @@ void gic_migrate_target(unsigned int new_cpu_id)
 	 * pending on the old cpu static. That means we can defer the
 	 * migration until after we have released the irq_controller_lock.
 	 */
-	raw_spin_lock(&cpu_map_migration_lock);
+	write_lock(&cpu_map_migration_lock);
 	gic_cpu_map[cpu] = 1 << new_cpu_id;
-	raw_spin_unlock(&cpu_map_migration_lock);
+	write_unlock(&cpu_map_migration_lock);
 
 	/*
 	 * Find all the peripheral interrupts targetting the current
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 3.19-rc6 v16 2/6] irqchip: gic: Make gic_raise_softirq FIQ-safe
@ 2015-02-03 19:06     ` Daniel Thompson
  0 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-02-03 19:06 UTC (permalink / raw)
  To: linux-arm-kernel

It is currently possible for FIQ handlers to re-enter gic_raise_softirq()
and lock up.

    	gic_raise_softirq()
	   lock(x);
-~-> FIQ
        handle_fiq()
	   gic_raise_softirq()
	      lock(x);		<-- Lockup

arch/arm/ uses IPIs to implement arch_irq_work_raise(), thus this issue
renders it difficult for FIQ handlers to safely defer work to less
restrictive calling contexts.

This patch fixes the problem by converting the cpu_map_migration_lock
into a rwlock making it safe to re-enter the function.

Note that having made it safe to re-enter gic_raise_softirq() we no
longer need to mask interrupts during gic_raise_softirq() because the
b.L migration is always performed from task context.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
---
 drivers/irqchip/irq-gic.c | 38 +++++++++++++++++++++++++-------------
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index a9ed64dcc84b..c172176499f6 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -75,22 +75,25 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock);
 /*
  * This lock is used by the big.LITTLE migration code to ensure no IPIs
  * can be pended on the old core after the map has been updated.
+ *
+ * This lock may be locked for reading from both IRQ and FIQ handlers
+ * and therefore must not be locked for writing when these are enabled.
  */
 #ifdef CONFIG_BL_SWITCHER
-static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
+static DEFINE_RWLOCK(cpu_map_migration_lock);
 
-static inline void bl_migration_lock(unsigned long *flags)
+static inline void bl_migration_lock(void)
 {
-	raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags);
+	read_lock(&cpu_map_migration_lock);
 }
 
-static inline void bl_migration_unlock(unsigned long flags)
+static inline void bl_migration_unlock(void)
 {
-	raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags);
+	read_unlock(&cpu_map_migration_lock);
 }
 #else
-static inline void bl_migration_lock(unsigned long *flags) {}
-static inline void bl_migration_unlock(unsigned long flags) {}
+static inline void bl_migration_lock(void) {}
+static inline void bl_migration_unlock(void) {}
 #endif
 
 /*
@@ -640,12 +643,20 @@ static void __init gic_pm_init(struct gic_chip_data *gic)
 #endif
 
 #ifdef CONFIG_SMP
+/*
+ * Raise the specified IPI on all cpus set in mask.
+ *
+ * This function is safe to call from all calling contexts, including
+ * FIQ handlers. It relies on bl_migration_lock() being multiply acquirable
+ * to avoid deadlocks when the function is re-entered at different
+ * exception levels.
+ */
 static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 {
 	int cpu;
-	unsigned long flags, map = 0;
+	unsigned long map = 0;
 
-	bl_migration_lock(&flags);
+	bl_migration_lock();
 
 	/* Convert our logical CPU mask into a physical one. */
 	for_each_cpu(cpu, mask)
@@ -660,7 +671,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	/* this always happens on GIC0 */
 	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
-	bl_migration_unlock(flags);
+	bl_migration_unlock();
 }
 #endif
 
@@ -708,7 +719,8 @@ int gic_get_cpu_id(unsigned int cpu)
  * Migrate all peripheral interrupts with a target matching the current CPU
  * to the interface corresponding to @new_cpu_id.  The CPU interface mapping
  * is also updated.  Targets to other CPU interfaces are unchanged.
- * This must be called with IRQs locally disabled.
+ * This must be called from a task context and with IRQ and FIQ locally
+ * disabled.
  */
 void gic_migrate_target(unsigned int new_cpu_id)
 {
@@ -739,9 +751,9 @@ void gic_migrate_target(unsigned int new_cpu_id)
 	 * pending on the old cpu static. That means we can defer the
 	 * migration until after we have released the irq_controller_lock.
 	 */
-	raw_spin_lock(&cpu_map_migration_lock);
+	write_lock(&cpu_map_migration_lock);
 	gic_cpu_map[cpu] = 1 << new_cpu_id;
-	raw_spin_unlock(&cpu_map_migration_lock);
+	write_unlock(&cpu_map_migration_lock);
 
 	/*
 	 * Find all the peripheral interrupts targetting the current
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 3.19-rc6 v16 3/6] irqchip: gic: Introduce plumbing for IPI FIQ
  2015-02-03 19:06   ` Daniel Thompson
@ 2015-02-03 19:06     ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-02-03 19:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

Currently it is not possible to exploit FIQ for systems with a GIC, even if
the systems are otherwise capable of it. This patch makes it possible
for IPIs to be delivered using FIQ.

To do so it modifies the register state so that normal interrupts are
placed in group 1 and specific IPIs are placed into group 0. It also
configures the controller to raise group 0 interrupts using the FIQ
signal. It provides a means for architecture code to define which IPIs
shall use FIQ and to acknowledge any IPIs that are raised.
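
Concretely, the architecture hook-up amounts to nominating the
FIQ-worthy SGIs via SMP_IPI_FIQ_MASK and draining them from the NMI path
with gic_handle_fiq_ipi(). A hedged sketch of the header side follows;
the IPI number is a placeholder and the real value only arrives with the
final patch of the series:

/* arch/arm/include/asm/smp.h (sketch) */
#define IPI_CPU_BACKTRACE	8			/* placeholder number */
#define SMP_IPI_FIQ_MASK	(1u << IPI_CPU_BACKTRACE)

If an architecture defines no such mask then the driver's fallback of
SMP_IPI_FIQ_MASK == 0 leaves gic_handle_fiq_ipi() with nothing to do and
every IPI continues to be raised via IRQ.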

All GIC hardware except GICv1-without-TrustZone support provides a means
to group exceptions into group 0 and group 1 but the hardware
functionality is unavailable to the kernel when a secure monitor is
present because access to the grouping registers is prohibited outside
"secure world". However when grouping is not available (or in the case
of early GICv1 implementations is very hard to configure) the code to
change groups does not deploy and all IPIs will be raised via IRQ.

It has been tested and shown working on two systems capable of
supporting grouping (Freescale i.MX6 and STiH416). It has also been
tested for boot regressions on two systems that do not support grouping
(vexpress-a9 and Qualcomm Snapdragon 600).

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Jon Medhurst <tixy@linaro.org>
---
 arch/arm/kernel/traps.c         |   5 +-
 drivers/irqchip/irq-gic.c       | 151 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 +++
 3 files changed, 153 insertions(+), 11 deletions(-)

diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 788e23fe64d8..b35e220ae1b1 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -26,6 +26,7 @@
 #include <linux/init.h>
 #include <linux/sched.h>
 #include <linux/irq.h>
+#include <linux/irqchip/arm-gic.h>
 
 #include <linux/atomic.h>
 #include <asm/cacheflush.h>
@@ -479,7 +480,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
 
 	nmi_enter();
 
-	/* nop. FIQ handlers for special arch/arm features can be added here. */
+#ifdef CONFIG_ARM_GIC
+	gic_handle_fiq_ipi();
+#endif
 
 	nmi_exit();
 
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index c172176499f6..c4f4a8827ed8 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -39,6 +39,7 @@
 #include <linux/slab.h>
 #include <linux/irqchip/chained_irq.h>
 #include <linux/irqchip/arm-gic.h>
+#include <linux/ratelimit.h>
 
 #include <asm/cputype.h>
 #include <asm/irq.h>
@@ -48,6 +49,10 @@
 #include "irq-gic-common.h"
 #include "irqchip.h"
 
+#ifndef SMP_IPI_FIQ_MASK
+#define SMP_IPI_FIQ_MASK 0
+#endif
+
 union gic_base {
 	void __iomem *common_base;
 	void __percpu * __iomem *percpu_base;
@@ -65,6 +70,7 @@ struct gic_chip_data {
 #endif
 	struct irq_domain *domain;
 	unsigned int gic_irqs;
+	u32 igroup0_shadow;
 #ifdef CONFIG_GIC_NON_BANKED
 	void __iomem *(*get_base)(union gic_base *);
 #endif
@@ -348,6 +354,83 @@ static struct irq_chip gic_chip = {
 	.irq_set_wake		= gic_set_wake,
 };
 
+/*
+ * Shift an interrupt between Group 0 and Group 1.
+ *
+ * In addition to changing the group we also modify the priority to
+ * match what "ARM strongly recommends" for a system where no Group 1
+ * interrupt must ever preempt a Group 0 interrupt.
+ *
+ * It is safe to call this function on systems which do not support
+ * grouping (it will have no effect).
+ */
+static void gic_set_group_irq(struct gic_chip_data *gic, unsigned int hwirq,
+			      int group)
+{
+	void __iomem *base = gic_data_dist_base(gic);
+	unsigned int grp_reg = hwirq / 32 * 4;
+	u32 grp_mask = BIT(hwirq % 32);
+	u32 grp_val;
+
+	unsigned int pri_reg = (hwirq / 4) * 4;
+	u32 pri_mask = BIT(7 + ((hwirq % 4) * 8));
+	u32 pri_val;
+
+	/*
+	 * Systems which do not support grouping will not have
+	 * the EnableGrp1 bit set.
+	 */
+	if (!(GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL)))
+		return;
+
+	raw_spin_lock(&irq_controller_lock);
+
+	grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg);
+	pri_val = readl_relaxed(base + GIC_DIST_PRI + pri_reg);
+
+	if (group) {
+		grp_val |= grp_mask;
+		pri_val |= pri_mask;
+	} else {
+		grp_val &= ~grp_mask;
+		pri_val &= ~pri_mask;
+	}
+
+	writel_relaxed(grp_val, base + GIC_DIST_IGROUP + grp_reg);
+	if (grp_reg == 0)
+		gic->igroup0_shadow = grp_val;
+
+	writel_relaxed(pri_val, base + GIC_DIST_PRI + pri_reg);
+
+	raw_spin_unlock(&irq_controller_lock);
+}
+
+
+/*
+ * Fully acknowledge (both ack and eoi) any outstanding FIQ-based IPI,
+ * otherwise do nothing.
+ */
+void gic_handle_fiq_ipi(void)
+{
+	struct gic_chip_data *gic = &gic_data[0];
+	void __iomem *cpu_base = gic_data_cpu_base(gic);
+	unsigned long irqstat, irqnr;
+
+	if (WARN_ON(!in_nmi()))
+		return;
+
+	while ((1u << readl_relaxed(cpu_base + GIC_CPU_HIGHPRI)) &
+	       SMP_IPI_FIQ_MASK) {
+		irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK);
+		writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);
+
+		irqnr = irqstat & GICC_IAR_INT_ID_MASK;
+		WARN_RATELIMIT(irqnr > 16,
+			       "Unexpected irqnr %lu (bad prioritization?)\n",
+			       irqnr);
+	}
+}
+
 void __init gic_cascade_irq(unsigned int gic_nr, unsigned int irq)
 {
 	if (gic_nr >= MAX_GIC_NR)
@@ -379,15 +462,24 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
 static void gic_cpu_if_up(void)
 {
 	void __iomem *cpu_base = gic_data_cpu_base(&gic_data[0]);
-	u32 bypass = 0;
+	void __iomem *dist_base = gic_data_dist_base(&gic_data[0]);
+	u32 ctrl = 0;
 
 	/*
-	* Preserve bypass disable bits to be written back later
-	*/
-	bypass = readl(cpu_base + GIC_CPU_CTRL);
-	bypass &= GICC_DIS_BYPASS_MASK;
+	 * Preserve bypass disable bits to be written back later
+	 */
+	ctrl = readl(cpu_base + GIC_CPU_CTRL);
+	ctrl &= GICC_DIS_BYPASS_MASK;
 
-	writel_relaxed(bypass | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
+	/*
+	 * If EnableGrp1 is set in the distributor then enable group 1
+	 * support for this CPU (and route group 0 interrupts to FIQ).
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL))
+		ctrl |= GICC_COMMON_BPR | GICC_FIQ_EN | GICC_ACK_CTL |
+			GICC_ENABLE_GRP1;
+
+	writel_relaxed(ctrl | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
 }
 
 
@@ -411,7 +503,23 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
 
 	gic_dist_config(base, gic_irqs, NULL);
 
-	writel_relaxed(GICD_ENABLE, base + GIC_DIST_CTRL);
+	/*
+	 * Set EnableGrp1/EnableGrp0 (bit 1 and 0) or EnableGrp (bit 0 only,
+	 * bit 1 ignored) depending on current mode.
+	 */
+	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, base + GIC_DIST_CTRL);
+
+	/*
+	 * Set all global interrupts to be group 1 if (and only if) it
+	 * is possible to enable group 1 interrupts. This register is RAZ/WI
+	 * if not accessible or not implemented, however some GICv1 devices
+	 * do not implement the EnableGrp1 bit making it unsafe to set
+	 * this register unconditionally.
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL))
+		for (i = 32; i < gic_irqs; i += 32)
+			writel_relaxed(0xffffffff,
+				       base + GIC_DIST_IGROUP + i * 4 / 32);
 }
 
 static void gic_cpu_init(struct gic_chip_data *gic)
@@ -420,6 +528,7 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 	void __iomem *base = gic_data_cpu_base(gic);
 	unsigned int cpu_mask, cpu = smp_processor_id();
 	int i;
+	unsigned long secure_irqs, secure_irq;
 
 	/*
 	 * Get what the GIC says our CPU mask is.
@@ -438,6 +547,20 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 
 	gic_cpu_config(dist_base, NULL);
 
+	/*
+	 * If the distributor is configured to support interrupt grouping
+	 * then set any PPI and SGI interrupts not set in SMP_IPI_FIQ_MASK
+	 * to be group1 and ensure any remaining group 0 interrupts have
+	 * the right priority.
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) {
+		secure_irqs = SMP_IPI_FIQ_MASK;
+		writel_relaxed(~secure_irqs, dist_base + GIC_DIST_IGROUP + 0);
+		gic->igroup0_shadow = ~secure_irqs;
+		for_each_set_bit(secure_irq, &secure_irqs, 16)
+			gic_set_group_irq(gic, secure_irq, 0);
+	}
+
 	writel_relaxed(GICC_INT_PRI_THRESHOLD, base + GIC_CPU_PRIMASK);
 	gic_cpu_if_up();
 }
@@ -527,7 +650,8 @@ static void gic_dist_restore(unsigned int gic_nr)
 		writel_relaxed(gic_data[gic_nr].saved_spi_enable[i],
 			dist_base + GIC_DIST_ENABLE_SET + i * 4);
 
-	writel_relaxed(GICD_ENABLE, dist_base + GIC_DIST_CTRL);
+	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE,
+		       dist_base + GIC_DIST_CTRL);
 }
 
 static void gic_cpu_save(unsigned int gic_nr)
@@ -655,6 +779,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 {
 	int cpu;
 	unsigned long map = 0;
+	unsigned long softint;
 
 	bl_migration_lock();
 
@@ -668,8 +793,14 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	 */
 	dmb(ishst);
 
-	/* this always happens on GIC0 */
-	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
+	/* We avoid a readl here by using the shadow copy of IGROUP[0] */
+	softint = map << 16 | irq;
+	if (gic_data[0].igroup0_shadow & BIT(irq))
+		softint |= 0x8000;
+
+	/* This always happens on GIC0 */
+	writel_relaxed(softint,
+		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
 	bl_migration_unlock();
 }
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index 71d706d5f169..7690f70049a3 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -22,6 +22,10 @@
 #define GIC_CPU_IDENT			0xfc
 
 #define GICC_ENABLE			0x1
+#define GICC_ENABLE_GRP1		0x2
+#define GICC_ACK_CTL			0x4
+#define GICC_FIQ_EN			0x8
+#define GICC_COMMON_BPR			0x10
 #define GICC_INT_PRI_THRESHOLD		0xf0
 #define GICC_IAR_INT_ID_MASK		0x3ff
 #define GICC_INT_SPURIOUS		1023
@@ -44,6 +48,7 @@
 #define GIC_DIST_SGI_PENDING_SET	0xf20
 
 #define GICD_ENABLE			0x1
+#define GICD_ENABLE_GRP1		0x2
 #define GICD_DISABLE			0x0
 #define GICD_INT_ACTLOW_LVLTRIG		0x0
 #define GICD_INT_EN_CLR_X32		0xffffffff
@@ -121,5 +126,8 @@ static inline void __init register_routable_domain_ops
 {
 	gic_routable_irq_domain_ops = ops;
 }
+
+void gic_handle_fiq_ipi(void);
+
 #endif /* __ASSEMBLY */
 #endif
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 3.19-rc6 v16 3/6] irqchip: gic: Introduce plumbing for IPI FIQ
@ 2015-02-03 19:06     ` Daniel Thompson
  0 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-02-03 19:06 UTC (permalink / raw)
  To: linux-arm-kernel

Currently it is not possible to exploit FIQ for systems with a GIC, even if
the systems are otherwise capable of it. This patch makes it possible
for IPIs to be delivered using FIQ.

To do so it modifies the register state so that normal interrupts are
placed in group 1 and specific IPIs are placed into group 0. It also
configures the controller to raise group 0 interrupts using the FIQ
signal. It provides a means for architecture code to define which IPIs
shall use FIQ and to acknowledge any IPIs that are raised.

All GIC hardware except GICv1-without-TrustZone support provides a means
to group exceptions into group 0 and group 1 but the hardware
functionality is unavailable to the kernel when a secure monitor is
present because access to the grouping registers is prohibited outside
"secure world". However when grouping is not available (or in the case
of early GICv1 implementations is very hard to configure) the code to
change groups does not deploy and all IPIs will be raised via IRQ.

It has been tested and shown working on two systems capable of
supporting grouping (Freescale i.MX6 and STiH416). It has also been
tested for boot regressions on two systems that do not support grouping
(vexpress-a9 and Qualcomm Snapdragon 600).

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Jon Medhurst <tixy@linaro.org>
---
 arch/arm/kernel/traps.c         |   5 +-
 drivers/irqchip/irq-gic.c       | 151 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 +++
 3 files changed, 153 insertions(+), 11 deletions(-)

diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 788e23fe64d8..b35e220ae1b1 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -26,6 +26,7 @@
 #include <linux/init.h>
 #include <linux/sched.h>
 #include <linux/irq.h>
+#include <linux/irqchip/arm-gic.h>
 
 #include <linux/atomic.h>
 #include <asm/cacheflush.h>
@@ -479,7 +480,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
 
 	nmi_enter();
 
-	/* nop. FIQ handlers for special arch/arm features can be added here. */
+#ifdef CONFIG_ARM_GIC
+	gic_handle_fiq_ipi();
+#endif
 
 	nmi_exit();
 
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index c172176499f6..c4f4a8827ed8 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -39,6 +39,7 @@
 #include <linux/slab.h>
 #include <linux/irqchip/chained_irq.h>
 #include <linux/irqchip/arm-gic.h>
+#include <linux/ratelimit.h>
 
 #include <asm/cputype.h>
 #include <asm/irq.h>
@@ -48,6 +49,10 @@
 #include "irq-gic-common.h"
 #include "irqchip.h"
 
+#ifndef SMP_IPI_FIQ_MASK
+#define SMP_IPI_FIQ_MASK 0
+#endif
+
 union gic_base {
 	void __iomem *common_base;
 	void __percpu * __iomem *percpu_base;
@@ -65,6 +70,7 @@ struct gic_chip_data {
 #endif
 	struct irq_domain *domain;
 	unsigned int gic_irqs;
+	u32 igroup0_shadow;
 #ifdef CONFIG_GIC_NON_BANKED
 	void __iomem *(*get_base)(union gic_base *);
 #endif
@@ -348,6 +354,83 @@ static struct irq_chip gic_chip = {
 	.irq_set_wake		= gic_set_wake,
 };
 
+/*
+ * Shift an interrupt between Group 0 and Group 1.
+ *
+ * In addition to changing the group we also modify the priority to
+ * match what "ARM strongly recommends" for a system where no Group 1
+ * interrupt must ever preempt a Group 0 interrupt.
+ *
+ * It is safe to call this function on systems which do not support
+ * grouping (it will have no effect).
+ */
+static void gic_set_group_irq(struct gic_chip_data *gic, unsigned int hwirq,
+			      int group)
+{
+	void __iomem *base = gic_data_dist_base(gic);
+	unsigned int grp_reg = hwirq / 32 * 4;
+	u32 grp_mask = BIT(hwirq % 32);
+	u32 grp_val;
+
+	unsigned int pri_reg = (hwirq / 4) * 4;
+	u32 pri_mask = BIT(7 + ((hwirq % 4) * 8));
+	u32 pri_val;
+
+	/*
+	 * Systems which do not support grouping will not have
+	 * the EnableGrp1 bit set.
+	 */
+	if (!(GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL)))
+		return;
+
+	raw_spin_lock(&irq_controller_lock);
+
+	grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg);
+	pri_val = readl_relaxed(base + GIC_DIST_PRI + pri_reg);
+
+	if (group) {
+		grp_val |= grp_mask;
+		pri_val |= pri_mask;
+	} else {
+		grp_val &= ~grp_mask;
+		pri_val &= ~pri_mask;
+	}
+
+	writel_relaxed(grp_val, base + GIC_DIST_IGROUP + grp_reg);
+	if (grp_reg == 0)
+		gic->igroup0_shadow = grp_val;
+
+	writel_relaxed(pri_val, base + GIC_DIST_PRI + pri_reg);
+
+	raw_spin_unlock(&irq_controller_lock);
+}
+
+
+/*
+ * Fully acknowledge (both ack and eoi) any outstanding FIQ-based IPI,
+ * otherwise do nothing.
+ */
+void gic_handle_fiq_ipi(void)
+{
+	struct gic_chip_data *gic = &gic_data[0];
+	void __iomem *cpu_base = gic_data_cpu_base(gic);
+	unsigned long irqstat, irqnr;
+
+	if (WARN_ON(!in_nmi()))
+		return;
+
+	while ((1u << readl_relaxed(cpu_base + GIC_CPU_HIGHPRI)) &
+	       SMP_IPI_FIQ_MASK) {
+		irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK);
+		writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);
+
+		irqnr = irqstat & GICC_IAR_INT_ID_MASK;
+		WARN_RATELIMIT(irqnr > 16,
+			       "Unexpected irqnr %lu (bad prioritization?)\n",
+			       irqnr);
+	}
+}
+
 void __init gic_cascade_irq(unsigned int gic_nr, unsigned int irq)
 {
 	if (gic_nr >= MAX_GIC_NR)
@@ -379,15 +462,24 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
 static void gic_cpu_if_up(void)
 {
 	void __iomem *cpu_base = gic_data_cpu_base(&gic_data[0]);
-	u32 bypass = 0;
+	void __iomem *dist_base = gic_data_dist_base(&gic_data[0]);
+	u32 ctrl = 0;
 
 	/*
-	* Preserve bypass disable bits to be written back later
-	*/
-	bypass = readl(cpu_base + GIC_CPU_CTRL);
-	bypass &= GICC_DIS_BYPASS_MASK;
+	 * Preserve bypass disable bits to be written back later
+	 */
+	ctrl = readl(cpu_base + GIC_CPU_CTRL);
+	ctrl &= GICC_DIS_BYPASS_MASK;
 
-	writel_relaxed(bypass | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
+	/*
+	 * If EnableGrp1 is set in the distributor then enable group 1
+	 * support for this CPU (and route group 0 interrupts to FIQ).
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL))
+		ctrl |= GICC_COMMON_BPR | GICC_FIQ_EN | GICC_ACK_CTL |
+			GICC_ENABLE_GRP1;
+
+	writel_relaxed(ctrl | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
 }
 
 
@@ -411,7 +503,23 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
 
 	gic_dist_config(base, gic_irqs, NULL);
 
-	writel_relaxed(GICD_ENABLE, base + GIC_DIST_CTRL);
+	/*
+	 * Set EnableGrp1/EnableGrp0 (bit 1 and 0) or EnableGrp (bit 0 only,
+	 * bit 1 ignored) depending on current mode.
+	 */
+	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, base + GIC_DIST_CTRL);
+
+	/*
+	 * Set all global interrupts to be group 1 if (and only if) it
+	 * is possible to enable group 1 interrupts. This register is RAZ/WI
+	 * if not accessible or not implemented, however some GICv1 devices
+	 * do not implement the EnableGrp1 bit making it unsafe to set
+	 * this register unconditionally.
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL))
+		for (i = 32; i < gic_irqs; i += 32)
+			writel_relaxed(0xffffffff,
+				       base + GIC_DIST_IGROUP + i * 4 / 32);
 }
 
 static void gic_cpu_init(struct gic_chip_data *gic)
@@ -420,6 +528,7 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 	void __iomem *base = gic_data_cpu_base(gic);
 	unsigned int cpu_mask, cpu = smp_processor_id();
 	int i;
+	unsigned long secure_irqs, secure_irq;
 
 	/*
 	 * Get what the GIC says our CPU mask is.
@@ -438,6 +547,20 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 
 	gic_cpu_config(dist_base, NULL);
 
+	/*
+	 * If the distributor is configured to support interrupt grouping
+	 * then set any PPI and SGI interrupts not set in SMP_IPI_FIQ_MASK
+	 * to be group1 and ensure any remaining group 0 interrupts have
+	 * the right priority.
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) {
+		secure_irqs = SMP_IPI_FIQ_MASK;
+		writel_relaxed(~secure_irqs, dist_base + GIC_DIST_IGROUP + 0);
+		gic->igroup0_shadow = ~secure_irqs;
+		for_each_set_bit(secure_irq, &secure_irqs, 16)
+			gic_set_group_irq(gic, secure_irq, 0);
+	}
+
 	writel_relaxed(GICC_INT_PRI_THRESHOLD, base + GIC_CPU_PRIMASK);
 	gic_cpu_if_up();
 }
@@ -527,7 +650,8 @@ static void gic_dist_restore(unsigned int gic_nr)
 		writel_relaxed(gic_data[gic_nr].saved_spi_enable[i],
 			dist_base + GIC_DIST_ENABLE_SET + i * 4);
 
-	writel_relaxed(GICD_ENABLE, dist_base + GIC_DIST_CTRL);
+	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE,
+		       dist_base + GIC_DIST_CTRL);
 }
 
 static void gic_cpu_save(unsigned int gic_nr)
@@ -655,6 +779,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 {
 	int cpu;
 	unsigned long map = 0;
+	unsigned long softint;
 
 	bl_migration_lock();
 
@@ -668,8 +793,14 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	 */
 	dmb(ishst);
 
-	/* this always happens on GIC0 */
-	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
+	/* We avoid a readl here by using the shadow copy of IGROUP[0] */
+	softint = map << 16 | irq;
+	if (gic_data[0].igroup0_shadow & BIT(irq))
+		softint |= 0x8000;
+
+	/* This always happens on GIC0 */
+	writel_relaxed(softint,
+		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
 	bl_migration_unlock();
 }
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index 71d706d5f169..7690f70049a3 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -22,6 +22,10 @@
 #define GIC_CPU_IDENT			0xfc
 
 #define GICC_ENABLE			0x1
+#define GICC_ENABLE_GRP1		0x2
+#define GICC_ACK_CTL			0x4
+#define GICC_FIQ_EN			0x8
+#define GICC_COMMON_BPR			0x10
 #define GICC_INT_PRI_THRESHOLD		0xf0
 #define GICC_IAR_INT_ID_MASK		0x3ff
 #define GICC_INT_SPURIOUS		1023
@@ -44,6 +48,7 @@
 #define GIC_DIST_SGI_PENDING_SET	0xf20
 
 #define GICD_ENABLE			0x1
+#define GICD_ENABLE_GRP1		0x2
 #define GICD_DISABLE			0x0
 #define GICD_INT_ACTLOW_LVLTRIG		0x0
 #define GICD_INT_EN_CLR_X32		0xffffffff
@@ -121,5 +126,8 @@ static inline void __init register_routable_domain_ops
 {
 	gic_routable_irq_domain_ops = ops;
 }
+
+void gic_handle_fiq_ipi(void);
+
 #endif /* __ASSEMBLY */
 #endif
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 3.19-rc6 v16 4/6] printk: Simple implementation for NMI backtracing
  2015-02-03 19:06   ` Daniel Thompson
@ 2015-02-03 19:06     ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-02-03 19:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

Currently there is quite a pile of code sitting in
arch/x86/kernel/apic/hw_nmi.c to support safe all-cpu backtracing from NMI.
The code is inaccessible to backtrace implementations for other
architectures, which is a shame because they would probably like to be
safe too.

Copy this code into printk. We'll port the x86 NMI backtrace to it in a
later patch.

Incidentally, technically I think it might be safe to call
printk_nmi_prepare() from NMI, providing care were taken to honour the
return code. printk_nmi_complete() cannot be called from NMI but could
be scheduled using irq_work_queue(). However honouring the return code
means sometimes it is impossible to get the message out so in most cases
I'd say using this code in such a way should probably attract sympathy
and/or derision rather than admiration.
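
For clarity, a hedged sketch of how an architecture's backtrace trigger
is expected to drive this API end to end; the example_* names are
placeholders for code that only appears later in the series:

/* Requesting CPU, ordinary task or IRQ context */
void example_trigger_all_cpu_backtrace(void)
{
	if (printk_nmi_prepare())	/* -EBUSY if another user owns it */
		return;

	example_send_backtrace_ipi();		/* raise NMI/FIQ on the others */
	example_wait_for_targets_with_timeout();

	printk_nmi_complete();		/* replay each CPU's seq_buf via printk */
}

/* Target CPUs, NMI/FIQ context */
void example_ipi_cpu_backtrace(struct pt_regs *regs)
{
	printk_nmi_this_cpu_begin();	/* divert printk() to a per-cpu buffer */
	show_regs(regs);
	printk_nmi_this_cpu_end();	/* restore the normal printk path */
}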

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/printk.h |  18 ++++++
 init/Kconfig           |   3 +
 kernel/printk/printk.c | 149 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 170 insertions(+)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index c8f170324e64..03c6921b3fcd 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -219,6 +219,24 @@ static inline void show_regs_print_info(const char *log_lvl)
 }
 #endif
 
+#ifdef CONFIG_PRINTK_NMI
+/*
+ * printk_nmi_prepare/complete are called to prepare the system for
+ * some or all cores to issue trace from NMI. printk_nmi_complete will
+ * print buffered output and cannot (safely) be called from NMI.
+ */
+extern int printk_nmi_prepare(void);
+extern void printk_nmi_complete(void);
+
+/*
+ * printk_nmi_this_cpu_begin/end are used to divert/restore printk on this
+ * cpu. The result is the output of printk() (by this CPU) will be
+ * stored in temporary buffers for later printing by printk_nmi_complete.
+ */
+extern void printk_nmi_this_cpu_begin(void);
+extern void printk_nmi_this_cpu_end(void);
+#endif
+
 extern asmlinkage void dump_stack(void) __cold;
 
 #ifndef pr_fmt
diff --git a/init/Kconfig b/init/Kconfig
index 9afb971497f4..13b843e3b5f4 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1430,6 +1430,9 @@ config PRINTK
 	  very difficult to diagnose system problems, saying N here is
 	  strongly discouraged.
 
+config PRINTK_NMI
+	bool
+
 config BUG
 	bool "BUG() support" if EXPERT
 	default y
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 02d6b6d28796..fe1cd57ea349 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1829,6 +1829,155 @@ EXPORT_SYMBOL_GPL(vprintk_default);
  */
 DEFINE_PER_CPU(printk_func_t, printk_func) = vprintk_default;
 
+#ifdef CONFIG_PRINTK_NMI
+
+#define NMI_BUF_SIZE		4096
+
+struct nmi_seq_buf {
+	unsigned char		buffer[NMI_BUF_SIZE];
+	struct seq_buf		seq;
+};
+
+/* Safe printing in NMI context */
+static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);
+
+static DEFINE_PER_CPU(printk_func_t, nmi_print_saved_print_func);
+
+/* "in progress" flag of NMI printing */
+static unsigned long nmi_print_flag;
+
+static int __init printk_nmi_init(void)
+{
+	struct nmi_seq_buf *s;
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		s = &per_cpu(nmi_print_seq, cpu);
+		seq_buf_init(&s->seq, s->buffer, NMI_BUF_SIZE);
+	}
+
+	return 0;
+}
+pure_initcall(printk_nmi_init);
+
+/*
+ * It is not safe to call printk() directly from NMI handlers.
+ * It may be fine if the NMI detected a lock up and we have no choice
+ * but to do so, but doing a NMI on all other CPUs to get a back trace
+ * can be done with a sysrq-l. We don't want that to lock up, which
+ * can happen if the NMI interrupts a printk in progress.
+ *
+ * Instead, we redirect the vprintk() to this nmi_vprintk() that writes
+ * the content into a per cpu seq_buf buffer. Then when the NMIs are
+ * all done, we can safely dump the contents of the seq_buf to a printk()
+ * from a non NMI context.
+ *
+ * This is not a generic printk() implementation and must be used with
+ * great care. In particular there is a static limit on the quantity of
+ * data that may be emitted during NMI, only one client can be active at
+ * one time (arbitrated by the return value of printk_nmi_prepare()) and
+ * it is required that something at task or interrupt context be scheduled
+ * to issue the output.
+ */
+static int nmi_vprintk(const char *fmt, va_list args)
+{
+	struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
+	unsigned int len = seq_buf_used(&s->seq);
+
+	seq_buf_vprintf(&s->seq, fmt, args);
+	return seq_buf_used(&s->seq) - len;
+}
+
+/*
+ * Reserve the NMI printk mechanism. Return an error if some other component
+ * is already using it.
+ */
+int printk_nmi_prepare(void)
+{
+	if (test_and_set_bit(0, &nmi_print_flag)) {
+		/*
+		 * If something is already using the NMI print facility we
+		 * can't allow a second one...
+		 */
+		return -EBUSY;
+	}
+
+	return 0;
+}
+
+static void print_seq_line(struct nmi_seq_buf *s, int start, int end)
+{
+	const char *buf = s->buffer + start;
+
+	printk("%.*s", (end - start) + 1, buf);
+}
+
+void printk_nmi_complete(void)
+{
+	struct nmi_seq_buf *s;
+	int len, cpu, i, last_i;
+
+	/*
+	 * Now that all the NMIs have triggered, we can dump out their
+	 * back traces safely to the console.
+	 */
+	for_each_possible_cpu(cpu) {
+		s = &per_cpu(nmi_print_seq, cpu);
+		last_i = 0;
+
+		len = seq_buf_used(&s->seq);
+		if (!len)
+			continue;
+
+		/* Print line by line. */
+		for (i = 0; i < len; i++) {
+			if (s->buffer[i] == '\n') {
+				print_seq_line(s, last_i, i);
+				last_i = i + 1;
+			}
+		}
+		/* Check if there was a partial line. */
+		if (last_i < len) {
+			print_seq_line(s, last_i, len - 1);
+			pr_cont("\n");
+		}
+
+		/* Wipe out the buffer ready for the next time around. */
+		seq_buf_clear(&s->seq);
+	}
+
+	clear_bit(0, &nmi_print_flag);
+	smp_mb__after_atomic();
+}
+
+void printk_nmi_this_cpu_begin(void)
+{
+	/*
+	 * Detect double-begins and report them. This code is unsafe (because
+	 * it will print from NMI) but things are pretty badly damaged if the
+	 * NMI re-enters and is somehow granted permission to use NMI printk,
+	 * so how much worse can it get? Also since this code interferes with
+	 * the operation of printk it is unlikely that any consequential
+	 * failures will be able to log anything making this our last
+	 * opportunity to tell anyone that something is wrong.
+	 */
+	if (this_cpu_read(nmi_print_saved_print_func)) {
+		this_cpu_write(printk_func, vprintk_default);
+		BUG();
+	}
+
+	this_cpu_write(nmi_print_saved_print_func, this_cpu_read(printk_func));
+	this_cpu_write(printk_func, nmi_vprintk);
+}
+
+void printk_nmi_this_cpu_end(void)
+{
+	this_cpu_write(printk_func, this_cpu_read(nmi_print_saved_print_func));
+	this_cpu_write(nmi_print_saved_print_func, NULL);
+}
+
+#endif /* CONFIG_PRINTK_NMI */
+
 /**
  * printk - print a kernel message
  * @fmt: format string
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 3.19-rc6 v16 4/6] printk: Simple implementation for NMI backtracing
@ 2015-02-03 19:06     ` Daniel Thompson
  0 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-02-03 19:06 UTC (permalink / raw)
  To: linux-arm-kernel

Currently there is quite a pile of code sitting in
arch/x86/kernel/apic/hw_nmi.c to support safe all-cpu backtracing from NMI.
The code is inaccessible to backtrace implementations for other
architectures, which is a shame because they would probably like to be
safe too.

Copy this code into printk. We'll port the x86 NMI backtrace to it in a
later patch.

Incidentally, I think it might technically be safe to call
printk_nmi_prepare() from NMI, provided care is taken to honour the
return code. printk_nmi_complete() cannot be called from NMI but could
be scheduled using irq_work_queue(); a sketch of that approach follows.
However, honouring the return code means it is sometimes impossible to
get the message out, so in most cases I'd say using the code in such a
way should probably attract sympathy and/or derision rather than
admiration.
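
For the avoidance of doubt, no such deferral is implemented by this
patch. A minimal sketch of what it could look like is below; the names
printk_nmi_flush_func, printk_nmi_flush_work and example_nmi_report are
purely illustrative and not part of the series:

	#include <linux/irq_work.h>
	#include <linux/printk.h>

	/* Flush the per-cpu buffers once we are back in IRQ context. */
	static void printk_nmi_flush_func(struct irq_work *work)
	{
		printk_nmi_complete();
	}

	static struct irq_work printk_nmi_flush_work = {
		.func = printk_nmi_flush_func,
	};

	/* Illustrative NMI-side user of the API introduced here. */
	static void example_nmi_report(void)
	{
		if (printk_nmi_prepare())
			return;		/* someone else owns the buffers */

		printk_nmi_this_cpu_begin();
		dump_stack();
		printk_nmi_this_cpu_end();

		/* printk_nmi_complete() cannot run from NMI; defer it. */
		irq_work_queue(&printk_nmi_flush_work);
	}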

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/printk.h |  18 ++++++
 init/Kconfig           |   3 +
 kernel/printk/printk.c | 149 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 170 insertions(+)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index c8f170324e64..03c6921b3fcd 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -219,6 +219,24 @@ static inline void show_regs_print_info(const char *log_lvl)
 }
 #endif
 
+#ifdef CONFIG_PRINTK_NMI
+/*
+ * printk_nmi_prepare/complete are called to prepare the system for
+ * some or all cores to issue a backtrace from NMI. printk_nmi_complete will
+ * print buffered output and cannot (safely) be called from NMI.
+ */
+extern int printk_nmi_prepare(void);
+extern void printk_nmi_complete(void);
+
+/*
+ * printk_nmi_this_cpu_begin/end are used to divert/restore printk on this
+ * cpu. The result is that the output of printk() (by this CPU) will be
+ * stored in temporary buffers for later printing by printk_nmi_complete.
+ */
+extern void printk_nmi_this_cpu_begin(void);
+extern void printk_nmi_this_cpu_end(void);
+#endif
+
 extern asmlinkage void dump_stack(void) __cold;
 
 #ifndef pr_fmt
diff --git a/init/Kconfig b/init/Kconfig
index 9afb971497f4..13b843e3b5f4 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1430,6 +1430,9 @@ config PRINTK
 	  very difficult to diagnose system problems, saying N here is
 	  strongly discouraged.
 
+config PRINTK_NMI
+	bool
+
 config BUG
 	bool "BUG() support" if EXPERT
 	default y
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 02d6b6d28796..fe1cd57ea349 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1829,6 +1829,155 @@ EXPORT_SYMBOL_GPL(vprintk_default);
  */
 DEFINE_PER_CPU(printk_func_t, printk_func) = vprintk_default;
 
+#ifdef CONFIG_PRINTK_NMI
+
+#define NMI_BUF_SIZE		4096
+
+struct nmi_seq_buf {
+	unsigned char		buffer[NMI_BUF_SIZE];
+	struct seq_buf		seq;
+};
+
+/* Safe printing in NMI context */
+static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);
+
+static DEFINE_PER_CPU(printk_func_t, nmi_print_saved_print_func);
+
+/* "in progress" flag of NMI printing */
+static unsigned long nmi_print_flag;
+
+static int __init printk_nmi_init(void)
+{
+	struct nmi_seq_buf *s;
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		s = &per_cpu(nmi_print_seq, cpu);
+		seq_buf_init(&s->seq, s->buffer, NMI_BUF_SIZE);
+	}
+
+	return 0;
+}
+pure_initcall(printk_nmi_init);
+
+/*
+ * It is not safe to call printk() directly from NMI handlers.
+ * It may be fine if the NMI detected a lock up and we have no choice
+ * but to do so, but doing a NMI on all other CPUs to get a back trace
+ * can be done with a sysrq-l. We don't want that to lock up, which
+ * can happen if the NMI interrupts a printk in progress.
+ *
+ * Instead, we redirect the vprintk() to this nmi_vprintk() that writes
+ * the content into a per cpu seq_buf buffer. Then when the NMIs are
+ * all done, we can safely dump the contents of the seq_buf to a printk()
+ * from a non NMI context.
+ *
+ * This is not a generic printk() implementation and must be used with
+ * great care. In particular there is a static limit on the quantity of
+ * data that may be emitted during NMI, only one client can be active at
+ * one time (arbitrated by the return value of printk_nmi_prepare()) and
+ * it is required that something at task or interrupt context be scheduled
+ * to issue the output.
+ */
+static int nmi_vprintk(const char *fmt, va_list args)
+{
+	struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
+	unsigned int len = seq_buf_used(&s->seq);
+
+	seq_buf_vprintf(&s->seq, fmt, args);
+	return seq_buf_used(&s->seq) - len;
+}
+
+/*
+ * Reserve the NMI printk mechanism. Return an error if some other component
+ * is already using it.
+ */
+int printk_nmi_prepare(void)
+{
+	if (test_and_set_bit(0, &nmi_print_flag)) {
+		/*
+		 * If something is already using the NMI print facility we
+		 * can't allow a second one...
+		 */
+		return -EBUSY;
+	}
+
+	return 0;
+}
+
+static void print_seq_line(struct nmi_seq_buf *s, int start, int end)
+{
+	const char *buf = s->buffer + start;
+
+	printk("%.*s", (end - start) + 1, buf);
+}
+
+void printk_nmi_complete(void)
+{
+	struct nmi_seq_buf *s;
+	int len, cpu, i, last_i;
+
+	/*
+	 * Now that all the NMIs have triggered, we can dump out their
+	 * back traces safely to the console.
+	 */
+	for_each_possible_cpu(cpu) {
+		s = &per_cpu(nmi_print_seq, cpu);
+		last_i = 0;
+
+		len = seq_buf_used(&s->seq);
+		if (!len)
+			continue;
+
+		/* Print line by line. */
+		for (i = 0; i < len; i++) {
+			if (s->buffer[i] == '\n') {
+				print_seq_line(s, last_i, i);
+				last_i = i + 1;
+			}
+		}
+		/* Check if there was a partial line. */
+		if (last_i < len) {
+			print_seq_line(s, last_i, len - 1);
+			pr_cont("\n");
+		}
+
+		/* Wipe out the buffer ready for the next time around. */
+		seq_buf_clear(&s->seq);
+	}
+
+	clear_bit(0, &nmi_print_flag);
+	smp_mb__after_atomic();
+}
+
+void printk_nmi_this_cpu_begin(void)
+{
+	/*
+	 * Detect double-begins and report them. This code is unsafe (because
+	 * it will print from NMI) but things are pretty badly damaged if the
+	 * NMI re-enters and is somehow granted permission to use NMI printk,
+	 * so how much worse can it get? Also since this code interferes with
+	 * the operation of printk it is unlikely that any consequential
+	 * failures will be able to log anything making this our last
+	 * opportunity to tell anyone that something is wrong.
+	 */
+	if (this_cpu_read(nmi_print_saved_print_func)) {
+		this_cpu_write(printk_func, vprintk_default);
+		BUG();
+	}
+
+	this_cpu_write(nmi_print_saved_print_func, this_cpu_read(printk_func));
+	this_cpu_write(printk_func, nmi_vprintk);
+}
+
+void printk_nmi_this_cpu_end(void)
+{
+	this_cpu_write(printk_func, this_cpu_read(nmi_print_saved_print_func));
+	this_cpu_write(nmi_print_saved_print_func, NULL);
+}
+
+#endif /* CONFIG_PRINTK_NMI */
+
 /**
  * printk - print a kernel message
  * @fmt: format string
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 3.19-rc6 v16 5/6] x86/nmi: Use common printk functions
  2015-02-03 19:06   ` Daniel Thompson
@ 2015-02-03 19:06     ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-02-03 19:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander, Ingo Molnar, H. Peter Anvin, x86

Much of the code sitting in arch/x86/kernel/apic/hw_nmi.c to support safe
all-cpu backtracing from NMI has been copied to printk.c to make it
accessible to other architectures.

Port the x86 NMI backtrace to the generic code.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
---
 arch/x86/Kconfig              |   1 +
 arch/x86/kernel/apic/hw_nmi.c | 101 +++---------------------------------------
 2 files changed, 8 insertions(+), 94 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 0dc9d0144a27..e1a07b9e535c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -138,6 +138,7 @@ config X86
 	select HAVE_ACPI_APEI_NMI if ACPI
 	select ACPI_LEGACY_TABLES_LOOKUP if ACPI
 	select X86_FEATURE_NAMES if PROC_FS
+	select PRINTK_NMI if X86_LOCAL_APIC
 
 config INSTRUCTION_DECODER
 	def_bool y
diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index 6873ab925d00..8bc00476011d 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -30,40 +30,16 @@ u64 hw_nmi_get_sample_period(int watchdog_thresh)
 #ifdef arch_trigger_all_cpu_backtrace
 /* For reliability, we're prepared to waste bits here. */
 static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
-static cpumask_t printtrace_mask;
-
-#define NMI_BUF_SIZE		4096
-
-struct nmi_seq_buf {
-	unsigned char		buffer[NMI_BUF_SIZE];
-	struct seq_buf		seq;
-};
-
-/* Safe printing in NMI context */
-static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);
-
-/* "in progress" flag of arch_trigger_all_cpu_backtrace */
-static unsigned long backtrace_flag;
-
-static void print_seq_line(struct nmi_seq_buf *s, int start, int end)
-{
-	const char *buf = s->buffer + start;
-
-	printk("%.*s", (end - start) + 1, buf);
-}
 
 void arch_trigger_all_cpu_backtrace(bool include_self)
 {
-	struct nmi_seq_buf *s;
-	int len;
-	int cpu;
 	int i;
 	int this_cpu = get_cpu();
 
-	if (test_and_set_bit(0, &backtrace_flag)) {
+	if (0 != printk_nmi_prepare()) {
 		/*
-		 * If there is already a trigger_all_cpu_backtrace() in progress
-		 * (backtrace_flag == 1), don't output double cpu dump infos.
+		 * If there is already an nmi printk sequence in
+		 * progress then just give up...
 		 */
 		put_cpu();
 		return;
@@ -73,16 +49,6 @@ void arch_trigger_all_cpu_backtrace(bool include_self)
 	if (!include_self)
 		cpumask_clear_cpu(this_cpu, to_cpumask(backtrace_mask));
 
-	cpumask_copy(&printtrace_mask, to_cpumask(backtrace_mask));
-	/*
-	 * Set up per_cpu seq_buf buffers that the NMIs running on the other
-	 * CPUs will write to.
-	 */
-	for_each_cpu(cpu, to_cpumask(backtrace_mask)) {
-		s = &per_cpu(nmi_print_seq, cpu);
-		seq_buf_init(&s->seq, s->buffer, NMI_BUF_SIZE);
-	}
-
 	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
 		pr_info("sending NMI to %s CPUs:\n",
 			(include_self ? "all" : "other"));
@@ -97,73 +63,20 @@ void arch_trigger_all_cpu_backtrace(bool include_self)
 		touch_softlockup_watchdog();
 	}
 
-	/*
-	 * Now that all the NMIs have triggered, we can dump out their
-	 * back traces safely to the console.
-	 */
-	for_each_cpu(cpu, &printtrace_mask) {
-		int last_i = 0;
-
-		s = &per_cpu(nmi_print_seq, cpu);
-		len = seq_buf_used(&s->seq);
-		if (!len)
-			continue;
-
-		/* Print line by line. */
-		for (i = 0; i < len; i++) {
-			if (s->buffer[i] == '\n') {
-				print_seq_line(s, last_i, i);
-				last_i = i + 1;
-			}
-		}
-		/* Check if there was a partial line. */
-		if (last_i < len) {
-			print_seq_line(s, last_i, len - 1);
-			pr_cont("\n");
-		}
-	}
-
-	clear_bit(0, &backtrace_flag);
-	smp_mb__after_atomic();
+	printk_nmi_complete();
 	put_cpu();
 }
 
-/*
- * It is not safe to call printk() directly from NMI handlers.
- * It may be fine if the NMI detected a lock up and we have no choice
- * but to do so, but doing a NMI on all other CPUs to get a back trace
- * can be done with a sysrq-l. We don't want that to lock up, which
- * can happen if the NMI interrupts a printk in progress.
- *
- * Instead, we redirect the vprintk() to this nmi_vprintk() that writes
- * the content into a per cpu seq_buf buffer. Then when the NMIs are
- * all done, we can safely dump the contents of the seq_buf to a printk()
- * from a non NMI context.
- */
-static int nmi_vprintk(const char *fmt, va_list args)
-{
-	struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
-	unsigned int len = seq_buf_used(&s->seq);
-
-	seq_buf_vprintf(&s->seq, fmt, args);
-	return seq_buf_used(&s->seq) - len;
-}
-
 static int
 arch_trigger_all_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)
 {
-	int cpu;
-
-	cpu = smp_processor_id();
+	int cpu = smp_processor_id();
 
 	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
-		printk_func_t printk_func_save = this_cpu_read(printk_func);
-
-		/* Replace printk to write into the NMI seq */
-		this_cpu_write(printk_func, nmi_vprintk);
+		printk_nmi_this_cpu_begin();
 		printk(KERN_WARNING "NMI backtrace for cpu %d\n", cpu);
 		show_regs(regs);
-		this_cpu_write(printk_func, printk_func_save);
+		printk_nmi_this_cpu_end();
 
 		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
 		return NMI_HANDLED;
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 3.19-rc6 v16 5/6] x86/nmi: Use common printk functions
@ 2015-02-03 19:06     ` Daniel Thompson
  0 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-02-03 19:06 UTC (permalink / raw)
  To: linux-arm-kernel

Much of the code sitting in arch/x86/kernel/apic/hw_nmi.c to support safe
all-cpu backtracing from NMI has been copied to printk.c to make it
accessible to other architectures.

Port the x86 NMI backtrace to the generic code.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86 at kernel.org
---
 arch/x86/Kconfig              |   1 +
 arch/x86/kernel/apic/hw_nmi.c | 101 +++---------------------------------------
 2 files changed, 8 insertions(+), 94 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 0dc9d0144a27..e1a07b9e535c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -138,6 +138,7 @@ config X86
 	select HAVE_ACPI_APEI_NMI if ACPI
 	select ACPI_LEGACY_TABLES_LOOKUP if ACPI
 	select X86_FEATURE_NAMES if PROC_FS
+	select PRINTK_NMI if X86_LOCAL_APIC
 
 config INSTRUCTION_DECODER
 	def_bool y
diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index 6873ab925d00..8bc00476011d 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -30,40 +30,16 @@ u64 hw_nmi_get_sample_period(int watchdog_thresh)
 #ifdef arch_trigger_all_cpu_backtrace
 /* For reliability, we're prepared to waste bits here. */
 static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
-static cpumask_t printtrace_mask;
-
-#define NMI_BUF_SIZE		4096
-
-struct nmi_seq_buf {
-	unsigned char		buffer[NMI_BUF_SIZE];
-	struct seq_buf		seq;
-};
-
-/* Safe printing in NMI context */
-static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);
-
-/* "in progress" flag of arch_trigger_all_cpu_backtrace */
-static unsigned long backtrace_flag;
-
-static void print_seq_line(struct nmi_seq_buf *s, int start, int end)
-{
-	const char *buf = s->buffer + start;
-
-	printk("%.*s", (end - start) + 1, buf);
-}
 
 void arch_trigger_all_cpu_backtrace(bool include_self)
 {
-	struct nmi_seq_buf *s;
-	int len;
-	int cpu;
 	int i;
 	int this_cpu = get_cpu();
 
-	if (test_and_set_bit(0, &backtrace_flag)) {
+	if (0 != printk_nmi_prepare()) {
 		/*
-		 * If there is already a trigger_all_cpu_backtrace() in progress
-		 * (backtrace_flag == 1), don't output double cpu dump infos.
+		 * If there is already an nmi printk sequence in
+		 * progress then just give up...
 		 */
 		put_cpu();
 		return;
@@ -73,16 +49,6 @@ void arch_trigger_all_cpu_backtrace(bool include_self)
 	if (!include_self)
 		cpumask_clear_cpu(this_cpu, to_cpumask(backtrace_mask));
 
-	cpumask_copy(&printtrace_mask, to_cpumask(backtrace_mask));
-	/*
-	 * Set up per_cpu seq_buf buffers that the NMIs running on the other
-	 * CPUs will write to.
-	 */
-	for_each_cpu(cpu, to_cpumask(backtrace_mask)) {
-		s = &per_cpu(nmi_print_seq, cpu);
-		seq_buf_init(&s->seq, s->buffer, NMI_BUF_SIZE);
-	}
-
 	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
 		pr_info("sending NMI to %s CPUs:\n",
 			(include_self ? "all" : "other"));
@@ -97,73 +63,20 @@ void arch_trigger_all_cpu_backtrace(bool include_self)
 		touch_softlockup_watchdog();
 	}
 
-	/*
-	 * Now that all the NMIs have triggered, we can dump out their
-	 * back traces safely to the console.
-	 */
-	for_each_cpu(cpu, &printtrace_mask) {
-		int last_i = 0;
-
-		s = &per_cpu(nmi_print_seq, cpu);
-		len = seq_buf_used(&s->seq);
-		if (!len)
-			continue;
-
-		/* Print line by line. */
-		for (i = 0; i < len; i++) {
-			if (s->buffer[i] == '\n') {
-				print_seq_line(s, last_i, i);
-				last_i = i + 1;
-			}
-		}
-		/* Check if there was a partial line. */
-		if (last_i < len) {
-			print_seq_line(s, last_i, len - 1);
-			pr_cont("\n");
-		}
-	}
-
-	clear_bit(0, &backtrace_flag);
-	smp_mb__after_atomic();
+	printk_nmi_complete();
 	put_cpu();
 }
 
-/*
- * It is not safe to call printk() directly from NMI handlers.
- * It may be fine if the NMI detected a lock up and we have no choice
- * but to do so, but doing a NMI on all other CPUs to get a back trace
- * can be done with a sysrq-l. We don't want that to lock up, which
- * can happen if the NMI interrupts a printk in progress.
- *
- * Instead, we redirect the vprintk() to this nmi_vprintk() that writes
- * the content into a per cpu seq_buf buffer. Then when the NMIs are
- * all done, we can safely dump the contents of the seq_buf to a printk()
- * from a non NMI context.
- */
-static int nmi_vprintk(const char *fmt, va_list args)
-{
-	struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
-	unsigned int len = seq_buf_used(&s->seq);
-
-	seq_buf_vprintf(&s->seq, fmt, args);
-	return seq_buf_used(&s->seq) - len;
-}
-
 static int
 arch_trigger_all_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)
 {
-	int cpu;
-
-	cpu = smp_processor_id();
+	int cpu = smp_processor_id();
 
 	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
-		printk_func_t printk_func_save = this_cpu_read(printk_func);
-
-		/* Replace printk to write into the NMI seq */
-		this_cpu_write(printk_func, nmi_vprintk);
+		printk_nmi_this_cpu_begin();
 		printk(KERN_WARNING "NMI backtrace for cpu %d\n", cpu);
 		show_regs(regs);
-		this_cpu_write(printk_func, printk_func_save);
+		printk_nmi_this_cpu_end();
 
 		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
 		return NMI_HANDLED;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 3.19-rc6 v16 6/6] ARM: Add support for on-demand backtrace of other CPUs
  2015-02-03 19:06   ` Daniel Thompson
@ 2015-02-03 19:06     ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-02-03 19:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

Replicate the x86 code to trigger a backtrace using an NMI and hook
it up to IPI on ARM.

The code differs slightly from the code on x86 because, on ARM, we do
not know at compile time whether a platform is capable of supporting FIQ.
We must avoid using an IPI to request a backtrace from the CPU on which
the backtrace was requested if interrupts are disabled, and instead fall
back to generating it directly.

In addition to the implementation of arch_trigger_all_cpu_backtrace(),
the patch also includes a few small items of plumbing that must be
hooked up for the new code to work.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
 arch/arm/Kconfig               |  1 +
 arch/arm/include/asm/hardirq.h |  2 +-
 arch/arm/include/asm/irq.h     |  5 +++
 arch/arm/include/asm/smp.h     |  3 ++
 arch/arm/kernel/smp.c          | 80 ++++++++++++++++++++++++++++++++++++++++++
 arch/arm/kernel/traps.c        |  3 ++
 6 files changed, 93 insertions(+), 1 deletion(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 97d07ed60a0b..45f52e012e22 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -74,6 +74,7 @@ config ARM
 	select OLD_SIGACTION
 	select OLD_SIGSUSPEND3
 	select PERF_USE_VMALLOC
+	select PRINTK_NMI
 	select RTC_LIB
 	select SYS_SUPPORTS_APM_EMULATION
 	# Above selects are sorted alphabetically; please add new ones
diff --git a/arch/arm/include/asm/hardirq.h b/arch/arm/include/asm/hardirq.h
index fe3ea776dc34..5df33e30ae1b 100644
--- a/arch/arm/include/asm/hardirq.h
+++ b/arch/arm/include/asm/hardirq.h
@@ -5,7 +5,7 @@
 #include <linux/threads.h>
 #include <asm/irq.h>
 
-#define NR_IPI	8
+#define NR_IPI	9
 
 typedef struct {
 	unsigned int __softirq_pending;
diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h
index 53c15dec7af6..be1d07d59ee9 100644
--- a/arch/arm/include/asm/irq.h
+++ b/arch/arm/include/asm/irq.h
@@ -35,6 +35,11 @@ extern void (*handle_arch_irq)(struct pt_regs *);
 extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
 #endif
 
+#ifdef CONFIG_SMP
+extern void arch_trigger_all_cpu_backtrace(bool);
+#define arch_trigger_all_cpu_backtrace(x) arch_trigger_all_cpu_backtrace(x)
+#endif
+
 #endif
 
 #endif
diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
index 18f5a554134f..b076584ac0fa 100644
--- a/arch/arm/include/asm/smp.h
+++ b/arch/arm/include/asm/smp.h
@@ -18,6 +18,8 @@
 # error "<asm/smp.h> included in non-SMP build"
 #endif
 
+#define SMP_IPI_FIQ_MASK 0x0100
+
 #define raw_smp_processor_id() (current_thread_info()->cpu)
 
 struct seq_file;
@@ -79,6 +81,7 @@ extern void arch_send_call_function_single_ipi(int cpu);
 extern void arch_send_call_function_ipi_mask(const struct cpumask *mask);
 extern void arch_send_wakeup_ipi_mask(const struct cpumask *mask);
 
+extern void ipi_cpu_backtrace(struct pt_regs *regs);
 extern int register_ipi_completion(struct completion *completion, int cpu);
 
 struct smp_operations {
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 86ef244c5a24..828277c4c248 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -26,6 +26,7 @@
 #include <linux/completion.h>
 #include <linux/cpufreq.h>
 #include <linux/irq_work.h>
+#include <linux/seq_buf.h>
 
 #include <linux/atomic.h>
 #include <asm/smp.h>
@@ -72,6 +73,7 @@ enum ipi_msg_type {
 	IPI_CPU_STOP,
 	IPI_IRQ_WORK,
 	IPI_COMPLETION,
+	IPI_CPU_BACKTRACE,
 };
 
 static DECLARE_COMPLETION(cpu_running);
@@ -456,6 +458,7 @@ static const char *ipi_types[NR_IPI] __tracepoint_string = {
 	S(IPI_CPU_STOP, "CPU stop interrupts"),
 	S(IPI_IRQ_WORK, "IRQ work interrupts"),
 	S(IPI_COMPLETION, "completion interrupts"),
+	S(IPI_CPU_BACKTRACE, "backtrace interrupts"),
 };
 
 static void smp_cross_call(const struct cpumask *target, unsigned int ipinr)
@@ -570,6 +573,8 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 	unsigned int cpu = smp_processor_id();
 	struct pt_regs *old_regs = set_irq_regs(regs);
 
+	BUILD_BUG_ON(SMP_IPI_FIQ_MASK != BIT(IPI_CPU_BACKTRACE));
+
 	if ((unsigned)ipinr < NR_IPI) {
 		trace_ipi_entry(ipi_types[ipinr]);
 		__inc_irq_stat(cpu, ipi_irqs[ipinr]);
@@ -623,6 +628,12 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 		irq_exit();
 		break;
 
+	case IPI_CPU_BACKTRACE:
+		irq_enter();
+		ipi_cpu_backtrace(regs);
+		irq_exit();
+		break;
+
 	default:
 		pr_crit("CPU%u: Unknown IPI message 0x%x\n",
 		        cpu, ipinr);
@@ -717,3 +728,72 @@ static int __init register_cpufreq_notifier(void)
 core_initcall(register_cpufreq_notifier);
 
 #endif
+
+/* For reliability, we're prepared to waste bits here. */
+static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
+
+void arch_trigger_all_cpu_backtrace(bool include_self)
+{
+	int i;
+	int this_cpu = get_cpu();
+
+	if (0 != printk_nmi_prepare()) {
+		/*
+		 * If there is already an nmi printk sequence in
+		 * progress then just give up...
+		 */
+		put_cpu();
+		return;
+	}
+
+	cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask);
+
+	/*
+	 * If irqs are disabled on the current processor then, if
+	 * IPI_CPU_BACKTRACE is delivered using IRQ, we won't be able to
+	 * react to IPI_CPU_BACKTRACE until we leave this function. We avoid
+	 * the potential timeout (not to mention the failure to print useful
+	 * information) by calling the backtrace directly.
+	 */
+	if (irqs_disabled()) {
+		ipi_cpu_backtrace(in_interrupt() ? get_irq_regs() : NULL);
+		include_self = false;
+	}
+
+	if (!include_self)
+		cpumask_clear_cpu(this_cpu, to_cpumask(backtrace_mask));
+
+	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
+		pr_info("Sending FIQ to %s CPUs:\n",
+			(include_self ? "all" : "other"));
+		smp_cross_call(to_cpumask(backtrace_mask), IPI_CPU_BACKTRACE);
+	}
+
+	/* Wait for up to 10 seconds for all CPUs to do the backtrace */
+	for (i = 0; i < 10 * 1000; i++) {
+		if (cpumask_empty(to_cpumask(backtrace_mask)))
+			break;
+		mdelay(1);
+		touch_softlockup_watchdog();
+	}
+
+	printk_nmi_complete();
+	put_cpu();
+}
+
+void ipi_cpu_backtrace(struct pt_regs *regs)
+{
+	int cpu = smp_processor_id();
+
+	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
+		printk_nmi_this_cpu_begin();
+		pr_warn("FIQ backtrace for cpu %d\n", cpu);
+		if (regs != NULL)
+			show_regs(regs);
+		else
+			dump_stack();
+		printk_nmi_this_cpu_end();
+
+		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
+	}
+}
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index b35e220ae1b1..1836415b8a5c 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -483,6 +483,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
 #ifdef CONFIG_ARM_GIC
 	gic_handle_fiq_ipi();
 #endif
+#ifdef CONFIG_SMP
+	ipi_cpu_backtrace(regs);
+#endif
 
 	nmi_exit();
 
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 3.19-rc6 v16 6/6] ARM: Add support for on-demand backtrace of other CPUs
@ 2015-02-03 19:06     ` Daniel Thompson
  0 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-02-03 19:06 UTC (permalink / raw)
  To: linux-arm-kernel

Replicate the x86 code to trigger a backtrace using an NMI and hook
it up to IPI on ARM.

The code differs slightly from the code on x86 because, on ARM, we do
not know at compile time whether a platform is capable of supporting FIQ.
We must avoid using an IPI to request a backtrace from the CPU on which
the backtrace was requested if interrupts are disabled, and instead fall
back to generating it directly.

In addition to the implementation of arch_trigger_all_cpu_backtrace(),
the patch also includes a few small items of plumbing that must be
hooked up for the new code to work.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
 arch/arm/Kconfig               |  1 +
 arch/arm/include/asm/hardirq.h |  2 +-
 arch/arm/include/asm/irq.h     |  5 +++
 arch/arm/include/asm/smp.h     |  3 ++
 arch/arm/kernel/smp.c          | 80 ++++++++++++++++++++++++++++++++++++++++++
 arch/arm/kernel/traps.c        |  3 ++
 6 files changed, 93 insertions(+), 1 deletion(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 97d07ed60a0b..45f52e012e22 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -74,6 +74,7 @@ config ARM
 	select OLD_SIGACTION
 	select OLD_SIGSUSPEND3
 	select PERF_USE_VMALLOC
+	select PRINTK_NMI
 	select RTC_LIB
 	select SYS_SUPPORTS_APM_EMULATION
 	# Above selects are sorted alphabetically; please add new ones
diff --git a/arch/arm/include/asm/hardirq.h b/arch/arm/include/asm/hardirq.h
index fe3ea776dc34..5df33e30ae1b 100644
--- a/arch/arm/include/asm/hardirq.h
+++ b/arch/arm/include/asm/hardirq.h
@@ -5,7 +5,7 @@
 #include <linux/threads.h>
 #include <asm/irq.h>
 
-#define NR_IPI	8
+#define NR_IPI	9
 
 typedef struct {
 	unsigned int __softirq_pending;
diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h
index 53c15dec7af6..be1d07d59ee9 100644
--- a/arch/arm/include/asm/irq.h
+++ b/arch/arm/include/asm/irq.h
@@ -35,6 +35,11 @@ extern void (*handle_arch_irq)(struct pt_regs *);
 extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
 #endif
 
+#ifdef CONFIG_SMP
+extern void arch_trigger_all_cpu_backtrace(bool);
+#define arch_trigger_all_cpu_backtrace(x) arch_trigger_all_cpu_backtrace(x)
+#endif
+
 #endif
 
 #endif
diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
index 18f5a554134f..b076584ac0fa 100644
--- a/arch/arm/include/asm/smp.h
+++ b/arch/arm/include/asm/smp.h
@@ -18,6 +18,8 @@
 # error "<asm/smp.h> included in non-SMP build"
 #endif
 
+#define SMP_IPI_FIQ_MASK 0x0100
+
 #define raw_smp_processor_id() (current_thread_info()->cpu)
 
 struct seq_file;
@@ -79,6 +81,7 @@ extern void arch_send_call_function_single_ipi(int cpu);
 extern void arch_send_call_function_ipi_mask(const struct cpumask *mask);
 extern void arch_send_wakeup_ipi_mask(const struct cpumask *mask);
 
+extern void ipi_cpu_backtrace(struct pt_regs *regs);
 extern int register_ipi_completion(struct completion *completion, int cpu);
 
 struct smp_operations {
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 86ef244c5a24..828277c4c248 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -26,6 +26,7 @@
 #include <linux/completion.h>
 #include <linux/cpufreq.h>
 #include <linux/irq_work.h>
+#include <linux/seq_buf.h>
 
 #include <linux/atomic.h>
 #include <asm/smp.h>
@@ -72,6 +73,7 @@ enum ipi_msg_type {
 	IPI_CPU_STOP,
 	IPI_IRQ_WORK,
 	IPI_COMPLETION,
+	IPI_CPU_BACKTRACE,
 };
 
 static DECLARE_COMPLETION(cpu_running);
@@ -456,6 +458,7 @@ static const char *ipi_types[NR_IPI] __tracepoint_string = {
 	S(IPI_CPU_STOP, "CPU stop interrupts"),
 	S(IPI_IRQ_WORK, "IRQ work interrupts"),
 	S(IPI_COMPLETION, "completion interrupts"),
+	S(IPI_CPU_BACKTRACE, "backtrace interrupts"),
 };
 
 static void smp_cross_call(const struct cpumask *target, unsigned int ipinr)
@@ -570,6 +573,8 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 	unsigned int cpu = smp_processor_id();
 	struct pt_regs *old_regs = set_irq_regs(regs);
 
+	BUILD_BUG_ON(SMP_IPI_FIQ_MASK != BIT(IPI_CPU_BACKTRACE));
+
 	if ((unsigned)ipinr < NR_IPI) {
 		trace_ipi_entry(ipi_types[ipinr]);
 		__inc_irq_stat(cpu, ipi_irqs[ipinr]);
@@ -623,6 +628,12 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 		irq_exit();
 		break;
 
+	case IPI_CPU_BACKTRACE:
+		irq_enter();
+		ipi_cpu_backtrace(regs);
+		irq_exit();
+		break;
+
 	default:
 		pr_crit("CPU%u: Unknown IPI message 0x%x\n",
 		        cpu, ipinr);
@@ -717,3 +728,72 @@ static int __init register_cpufreq_notifier(void)
 core_initcall(register_cpufreq_notifier);
 
 #endif
+
+/* For reliability, we're prepared to waste bits here. */
+static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
+
+void arch_trigger_all_cpu_backtrace(bool include_self)
+{
+	int i;
+	int this_cpu = get_cpu();
+
+	if (0 != printk_nmi_prepare()) {
+		/*
+		 * If there is already an nmi printk sequence in
+		 * progress then just give up...
+		 */
+		put_cpu();
+		return;
+	}
+
+	cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask);
+
+	/*
+	 * If irqs are disabled on the current processor then, if
+	 * IPI_CPU_BACKTRACE is delivered using IRQ, we won't be able to
+	 * react to IPI_CPU_BACKTRACE until we leave this function. We avoid
+	 * the potential timeout (not to mention the failure to print useful
+	 * information) by calling the backtrace directly.
+	 */
+	if (irqs_disabled()) {
+		ipi_cpu_backtrace(in_interrupt() ? get_irq_regs() : NULL);
+		include_self = false;
+	}
+
+	if (!include_self)
+		cpumask_clear_cpu(this_cpu, to_cpumask(backtrace_mask));
+
+	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
+		pr_info("Sending FIQ to %s CPUs:\n",
+			(include_self ? "all" : "other"));
+		smp_cross_call(to_cpumask(backtrace_mask), IPI_CPU_BACKTRACE);
+	}
+
+	/* Wait for up to 10 seconds for all CPUs to do the backtrace */
+	for (i = 0; i < 10 * 1000; i++) {
+		if (cpumask_empty(to_cpumask(backtrace_mask)))
+			break;
+		mdelay(1);
+		touch_softlockup_watchdog();
+	}
+
+	printk_nmi_complete();
+	put_cpu();
+}
+
+void ipi_cpu_backtrace(struct pt_regs *regs)
+{
+	int cpu = smp_processor_id();
+
+	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
+		printk_nmi_this_cpu_begin();
+		pr_warn("FIQ backtrace for cpu %d\n", cpu);
+		if (regs != NULL)
+			show_regs(regs);
+		else
+			dump_stack();
+		printk_nmi_this_cpu_end();
+
+		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
+	}
+}
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index b35e220ae1b1..1836415b8a5c 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -483,6 +483,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
 #ifdef CONFIG_ARM_GIC
 	gic_handle_fiq_ipi();
 #endif
+#ifdef CONFIG_SMP
+	ipi_cpu_backtrace(regs);
+#endif
 
 	nmi_exit();
 
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* Re: [PATCH 3.19-rc6 v16 1/6] irqchip: gic: Optimize locking in gic_raise_softirq
  2015-02-03 19:06     ` Daniel Thompson
@ 2015-02-26 20:31       ` Nicolas Pitre
  -1 siblings, 0 replies; 94+ messages in thread
From: Nicolas Pitre @ 2015-02-26 20:31 UTC (permalink / raw)
  To: Daniel Thompson
  Cc: Thomas Gleixner, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

On Tue, 3 Feb 2015, Daniel Thompson wrote:

> Currently gic_raise_softirq() is locked using irq_controller_lock.
> This lock is primarily used to make register read-modify-write sequences
> atomic but gic_raise_softirq() uses it instead to ensure that the
> big.LITTLE migration logic can figure out when it is safe to migrate
> interrupts between physical cores.
> 
> This is sub-optimal in closely related ways:
> 
> 1. No locking at all is required on systems where the b.L switcher is
>    not configured.

ACK

> 2. Finer grain locking can be used on systems where the b.L switcher is
>    present.

NAK

Consider this sequence:

	CPU 1				CPU 2
	-----				-----
	gic_raise_softirq()		gic_migrate_target()
	bl_migration_lock() [OK]
	[...]				[...]
	map |= gic_cpu_map[cpu];	bl_migration_lock() [contended]
	bl_migration_unlock(flags);	bl_migration_lock() [OK]
					gic_cpu_map[cpu] = 1 << new_cpu_id;
					bl_migration_unlock(flags);
					[...]
					(migrate pending IPI from old CPU)
	writel_relaxed(map to GIC_DIST_SOFTINT);
	[this IPI is now lost]

Granted, this race is apparently already possible today.  We probably get
away with it because the locked sequence in gic_migrate_target() includes
the retargeting of peripheral interrupts, which gives plenty of time for
code execution in gic_raise_softirq() to post its IPI before the IPI 
migration code is executed.  So in that sense it could be argued that 
the reduced lock coverage from your patch doesn't make things any worse.  
If anything it might even help by letting gic_migrate_target() complete 
sooner.  But removing cpu_map_migration_lock altogether would improve 
things even further by that logic.  I however don't think we should live 
so dangerously.

Therefore, for the lock to be effective, it has to encompass the 
changing of the CPU map _and_ migration of pending IPIs before new IPIs 
are allowed again.  That means the locked area has to grow, not shrink.
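
To illustrate what I mean, roughly (a sketch only, based on the current
gic_migrate_target(); the details of how the pending SGIs actually get
re-targeted are elided):

	write_lock(&cpu_map_migration_lock);

	gic_cpu_map[cpu] = 1 << new_cpu_id;

	/*
	 * ... re-target any SGIs still pending on the old CPU interface
	 * here, while the write lock is still held, so that no new IPI
	 * can be posted in between the map update and the migration ...
	 */

	write_unlock(&cpu_map_migration_lock);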

Oh, and a minor nit:

> + * This lock is used by the big.LITTLE migration code to ensure no IPIs
> + * can be pended on the old core after the map has been updated.
> + */
> +#ifdef CONFIG_BL_SWITCHER
> +static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
> +
> +static inline void bl_migration_lock(unsigned long *flags)

Please name it gic_migration_lock. "bl_migration_lock" is a bit too 
generic in this context.


Nicolas

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 3.19-rc6 v16 1/6] irqchip: gic: Optimize locking in gic_raise_softirq
@ 2015-02-26 20:31       ` Nicolas Pitre
  0 siblings, 0 replies; 94+ messages in thread
From: Nicolas Pitre @ 2015-02-26 20:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 3 Feb 2015, Daniel Thompson wrote:

> Currently gic_raise_softirq() is locked using irq_controller_lock.
> This lock is primarily used to make register read-modify-write sequences
> atomic but gic_raise_softirq() uses it instead to ensure that the
> big.LITTLE migration logic can figure out when it is safe to migrate
> interrupts between physical cores.
> 
> This is sub-optimal in closely related ways:
> 
> 1. No locking at all is required on systems where the b.L switcher is
>    not configured.

ACK

> 2. Finer grain locking can be used on systems where the b.L switcher is
>    present.

NAK

Consider this sequence:

	CPU 1				CPU 2
	-----				-----
	gic_raise_softirq()		gic_migrate_target()
	bl_migration_lock() [OK]
	[...]				[...]
	map |= gic_cpu_map[cpu];	bl_migration_lock() [contended]
	bl_migration_unlock(flags);	bl_migration_lock() [OK]
					gic_cpu_map[cpu] = 1 << new_cpu_id;
					bl_migration_unlock(flags);
					[...]
					(migrate pending IPI from old CPU)
	writel_relaxed(map to GIC_DIST_SOFTINT);
	[this IPI is now lost]

Granted, this race is apparently already possible today.  We probably get
away with it because the locked sequence in gic_migrate_target() includes
the retargeting of peripheral interrupts, which gives plenty of time for
code execution in gic_raise_softirq() to post its IPI before the IPI 
migration code is executed.  So in that sense it could be argued that 
the reduced lock coverage from your patch doesn't make things any worse.  
If anything it might even help by letting gic_migrate_target() complete 
sooner.  But removing cpu_map_migration_lock altogether would improve 
things even further by that logic.  I however don't think we should live 
so dangerously.

Therefore, for the lock to be effective, it has to encompass the 
changing of the CPU map _and_ migration of pending IPIs before new IPIs 
are allowed again.  That means the locked area has to grow, not shrink.

Oh, and a minor nit:

> + * This lock is used by the big.LITTLE migration code to ensure no IPIs
> + * can be pended on the old core after the map has been updated.
> + */
> +#ifdef CONFIG_BL_SWITCHER
> +static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
> +
> +static inline void bl_migration_lock(unsigned long *flags)

Please name it gic_migration_lock. "bl_migration_lock" is a bit too 
generic in this context.


Nicolas

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 3.19-rc6 v16 2/6] irqchip: gic: Make gic_raise_softirq FIQ-safe
  2015-02-03 19:06     ` Daniel Thompson
@ 2015-02-26 20:33       ` Nicolas Pitre
  -1 siblings, 0 replies; 94+ messages in thread
From: Nicolas Pitre @ 2015-02-26 20:33 UTC (permalink / raw)
  To: Daniel Thompson
  Cc: Thomas Gleixner, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

On Tue, 3 Feb 2015, Daniel Thompson wrote:

> It is currently possible for FIQ handlers to re-enter gic_raise_softirq()
> and lock up.
> 
>     	gic_raise_softirq()
> 	   lock(x);
> -~-> FIQ
>         handle_fiq()
> 	   gic_raise_softirq()
> 	      lock(x);		<-- Lockup
> 
> arch/arm/ uses IPIs to implement arch_irq_work_raise(), thus this issue
> renders it difficult for FIQ handlers to safely defer work to less
> restrictive calling contexts.
> 
> This patch fixes the problem by converting the cpu_map_migration_lock
> into a rwlock making it safe to re-enter the function.
> 
> Note that having made it safe to re-enter gic_raise_softirq() we no
> longer need to mask interrupts during gic_raise_softirq() because the
> b.L migration is always performed from task context.

Very good point.

Once my concerns on patch #1 are addressed, you may add my ACK to this 
one.

> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Jason Cooper <jason@lakedaemon.net>
> Cc: Russell King <linux@arm.linux.org.uk>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  drivers/irqchip/irq-gic.c | 38 +++++++++++++++++++++++++-------------
>  1 file changed, 25 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> index a9ed64dcc84b..c172176499f6 100644
> --- a/drivers/irqchip/irq-gic.c
> +++ b/drivers/irqchip/irq-gic.c
> @@ -75,22 +75,25 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock);
>  /*
>   * This lock is used by the big.LITTLE migration code to ensure no IPIs
>   * can be pended on the old core after the map has been updated.
> + *
> + * This lock may be locked for reading from both IRQ and FIQ handlers
> + * and therefore must not be locked for writing when these are enabled.
>   */
>  #ifdef CONFIG_BL_SWITCHER
> -static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
> +static DEFINE_RWLOCK(cpu_map_migration_lock);
>  
> -static inline void bl_migration_lock(unsigned long *flags)
> +static inline void bl_migration_lock(void)
>  {
> -	raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags);
> +	read_lock(&cpu_map_migration_lock);
>  }
>  
> -static inline void bl_migration_unlock(unsigned long flags)
> +static inline void bl_migration_unlock(void)
>  {
> -	raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags);
> +	read_unlock(&cpu_map_migration_lock);
>  }
>  #else
> -static inline void bl_migration_lock(unsigned long *flags) {}
> -static inline void bl_migration_unlock(unsigned long flags) {}
> +static inline void bl_migration_lock(void) {}
> +static inline void bl_migration_unlock(void) {}
>  #endif
>  
>  /*
> @@ -640,12 +643,20 @@ static void __init gic_pm_init(struct gic_chip_data *gic)
>  #endif
>  
>  #ifdef CONFIG_SMP
> +/*
> + * Raise the specified IPI on all cpus set in mask.
> + *
> + * This function is safe to call from all calling contexts, including
> + * FIQ handlers. It relies on bl_migration_lock() being multiply acquirable
> + * to avoid deadlocks when the function is re-entered at different
> + * exception levels.
> + */
>  static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
>  {
>  	int cpu;
> -	unsigned long flags, map = 0;
> +	unsigned long map = 0;
>  
> -	bl_migration_lock(&flags);
> +	bl_migration_lock();
>  
>  	/* Convert our logical CPU mask into a physical one. */
>  	for_each_cpu(cpu, mask)
> @@ -660,7 +671,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
>  	/* this always happens on GIC0 */
>  	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
>  
> -	bl_migration_unlock(flags);
> +	bl_migration_unlock();
>  }
>  #endif
>  
> @@ -708,7 +719,8 @@ int gic_get_cpu_id(unsigned int cpu)
>   * Migrate all peripheral interrupts with a target matching the current CPU
>   * to the interface corresponding to @new_cpu_id.  The CPU interface mapping
>   * is also updated.  Targets to other CPU interfaces are unchanged.
> - * This must be called with IRQs locally disabled.
> + * This must be called from a task context and with IRQ and FIQ locally
> + * disabled.
>   */
>  void gic_migrate_target(unsigned int new_cpu_id)
>  {
> @@ -739,9 +751,9 @@ void gic_migrate_target(unsigned int new_cpu_id)
>  	 * pending on the old cpu static. That means we can defer the
>  	 * migration until after we have released the irq_controller_lock.
>  	 */
> -	raw_spin_lock(&cpu_map_migration_lock);
> +	write_lock(&cpu_map_migration_lock);
>  	gic_cpu_map[cpu] = 1 << new_cpu_id;
> -	raw_spin_unlock(&cpu_map_migration_lock);
> +	write_unlock(&cpu_map_migration_lock);
>  
>  	/*
>  	 * Find all the peripheral interrupts targetting the current
> -- 
> 1.9.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 
> 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 3.19-rc6 v16 2/6] irqchip: gic: Make gic_raise_softirq FIQ-safe
@ 2015-02-26 20:33       ` Nicolas Pitre
  0 siblings, 0 replies; 94+ messages in thread
From: Nicolas Pitre @ 2015-02-26 20:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 3 Feb 2015, Daniel Thompson wrote:

> It is currently possible for FIQ handlers to re-enter gic_raise_softirq()
> and lock up.
> 
>     	gic_raise_softirq()
> 	   lock(x);
> -~-> FIQ
>         handle_fiq()
> 	   gic_raise_softirq()
> 	      lock(x);		<-- Lockup
> 
> arch/arm/ uses IPIs to implement arch_irq_work_raise(), thus this issue
> renders it difficult for FIQ handlers to safely defer work to less
> restrictive calling contexts.
> 
> This patch fixes the problem by converting the cpu_map_migration_lock
> into a rwlock making it safe to re-enter the function.
> 
> Note that having made it safe to re-enter gic_raise_softirq() we no
> longer need to mask interrupts during gic_raise_softirq() because the
> b.L migration is always performed from task context.

Very good point.

Once my concerns on patch #1 are addressed, you may add my ACK to this 
one.

> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Jason Cooper <jason@lakedaemon.net>
> Cc: Russell King <linux@arm.linux.org.uk>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  drivers/irqchip/irq-gic.c | 38 +++++++++++++++++++++++++-------------
>  1 file changed, 25 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> index a9ed64dcc84b..c172176499f6 100644
> --- a/drivers/irqchip/irq-gic.c
> +++ b/drivers/irqchip/irq-gic.c
> @@ -75,22 +75,25 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock);
>  /*
>   * This lock is used by the big.LITTLE migration code to ensure no IPIs
>   * can be pended on the old core after the map has been updated.
> + *
> + * This lock may be locked for reading from both IRQ and FIQ handlers
> + * and therefore must not be locked for writing when these are enabled.
>   */
>  #ifdef CONFIG_BL_SWITCHER
> -static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
> +static DEFINE_RWLOCK(cpu_map_migration_lock);
>  
> -static inline void bl_migration_lock(unsigned long *flags)
> +static inline void bl_migration_lock(void)
>  {
> -	raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags);
> +	read_lock(&cpu_map_migration_lock);
>  }
>  
> -static inline void bl_migration_unlock(unsigned long flags)
> +static inline void bl_migration_unlock(void)
>  {
> -	raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags);
> +	read_unlock(&cpu_map_migration_lock);
>  }
>  #else
> -static inline void bl_migration_lock(unsigned long *flags) {}
> -static inline void bl_migration_unlock(unsigned long flags) {}
> +static inline void bl_migration_lock(void) {}
> +static inline void bl_migration_unlock(void) {}
>  #endif
>  
>  /*
> @@ -640,12 +643,20 @@ static void __init gic_pm_init(struct gic_chip_data *gic)
>  #endif
>  
>  #ifdef CONFIG_SMP
> +/*
> + * Raise the specified IPI on all cpus set in mask.
> + *
> + * This function is safe to call from all calling contexts, including
> + * FIQ handlers. It relies on bl_migration_lock() being multiply acquirable
> + * to avoid deadlocks when the function is re-entered at different
> + * exception levels.
> + */
>  static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
>  {
>  	int cpu;
> -	unsigned long flags, map = 0;
> +	unsigned long map = 0;
>  
> -	bl_migration_lock(&flags);
> +	bl_migration_lock();
>  
>  	/* Convert our logical CPU mask into a physical one. */
>  	for_each_cpu(cpu, mask)
> @@ -660,7 +671,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
>  	/* this always happens on GIC0 */
>  	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
>  
> -	bl_migration_unlock(flags);
> +	bl_migration_unlock();
>  }
>  #endif
>  
> @@ -708,7 +719,8 @@ int gic_get_cpu_id(unsigned int cpu)
>   * Migrate all peripheral interrupts with a target matching the current CPU
>   * to the interface corresponding to @new_cpu_id.  The CPU interface mapping
>   * is also updated.  Targets to other CPU interfaces are unchanged.
> - * This must be called with IRQs locally disabled.
> + * This must be called from a task context and with IRQ and FIQ locally
> + * disabled.
>   */
>  void gic_migrate_target(unsigned int new_cpu_id)
>  {
> @@ -739,9 +751,9 @@ void gic_migrate_target(unsigned int new_cpu_id)
>  	 * pending on the old cpu static. That means we can defer the
>  	 * migration until after we have released the irq_controller_lock.
>  	 */
> -	raw_spin_lock(&cpu_map_migration_lock);
> +	write_lock(&cpu_map_migration_lock);
>  	gic_cpu_map[cpu] = 1 << new_cpu_id;
> -	raw_spin_unlock(&cpu_map_migration_lock);
> +	write_unlock(&cpu_map_migration_lock);
>  
>  	/*
>  	 * Find all the peripheral interrupts targetting the current
> -- 
> 1.9.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 
> 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 3.19-rc6 v16 1/6] irqchip: gic: Optimize locking in gic_raise_softirq
  2015-02-26 20:31       ` Nicolas Pitre
@ 2015-02-26 21:05         ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-02-26 21:05 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Thomas Gleixner, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

On Thu, 2015-02-26 at 15:31 -0500, Nicolas Pitre wrote:
> On Tue, 3 Feb 2015, Daniel Thompson wrote:
> 
> > Currently gic_raise_softirq() is locked using irq_controller_lock.
> > This lock is primarily used to make register read-modify-write sequences
> > atomic but gic_raise_softirq() uses it instead to ensure that the
> > big.LITTLE migration logic can figure out when it is safe to migrate
> > interrupts between physical cores.
> > 
> > This is sub-optimal in closely related ways:
> > 
> > 1. No locking at all is required on systems where the b.L switcher is
> >    not configured.
> 
> ACK
> 
> > 2. Finer grain locking can be used on systems where the b.L switcher is
> >    present.
> 
> NAK
> 
> Consider this sequence:
> 
> 	CPU 1				CPU 2
> 	-----				-----
> 	gic_raise_softirq()		gic_migrate_target()
> 	bl_migration_lock() [OK]
> 	[...]				[...]
> 	map |= gic_cpu_map[cpu];	bl_migration_lock() [contended]
> 	bl_migration_unlock(flags);	bl_migration_lock() [OK]
> 					gic_cpu_map[cpu] = 1 << new_cpu_id;
> 					bl_migration_unlock(flags);
> 					[...]
> 					(migrate pending IPI from old CPU)
> 	writel_relaxed(map to GIC_DIST_SOFTINT);

Isn't this solved inside gic_raise_softirq? How can the writel_relaxed()
escape from the critical section and happen at the end of the sequence?
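
For reference, with this patch applied gic_raise_softirq() ends up
looking roughly like this (lightly trimmed); the map read and the
SOFTINT write both sit inside the same locked region:

	static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
	{
		int cpu;
		unsigned long flags, map = 0;

		bl_migration_lock(&flags);

		/* Convert our logical CPU mask into a physical one. */
		for_each_cpu(cpu, mask)
			map |= gic_cpu_map[cpu];

		/* (barrier before raising the IPI elided) */

		/* this always happens on GIC0 */
		writel_relaxed(map << 16 | irq,
			       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);

		bl_migration_unlock(flags);
	}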


> 	[this IPI is now lost]
> 
> Granted, this race is apparently aready possible today.  We probably get 
> away with it because the locked sequence in gic_migrate_target() include 
> the retargetting of peripheral interrupts which gives plenti of time for 
> code execution in gic_raise_softirq() to post its IPI before the IPI 
> migration code is executed.  So in that sense it could be argued that 
> the reduced lock coverage from your patch doesn't make things any worse.  
> If anything it might even help by letting gic_migrate_target() complete 
> sooner.  But removing cpu_map_migration_lock altogether would improve 
> things even further by that logic.  I however don't think we should live 
> so dangerously.
> 
> Therefore, for the lock to be effective, it has to encompass the 
> changing of the CPU map _and_ migration of pending IPIs before new IPIs 
> are allowed again.  That means the locked area has to grow not shrink.
> 
> Oh, and a minor nit:
> 
> > + * This lock is used by the big.LITTLE migration code to ensure no IPIs
> > + * can be pended on the old core after the map has been updated.
> > + */
> > +#ifdef CONFIG_BL_SWITCHER
> > +static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
> > +
> > +static inline void bl_migration_lock(unsigned long *flags)
> 
> Please name it gic_migration_lock. "bl_migration_lock" is a bit too 
> generic in this context.

I'll change this.

Daniel.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 3.19-rc6 v16 1/6] irqchip: gic: Optimize locking in gic_raise_softirq
@ 2015-02-26 21:05         ` Daniel Thompson
  0 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-02-26 21:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2015-02-26 at 15:31 -0500, Nicolas Pitre wrote:
> On Tue, 3 Feb 2015, Daniel Thompson wrote:
> 
> > Currently gic_raise_softirq() is locked using irq_controller_lock.
> > This lock is primarily used to make register read-modify-write sequences
> > atomic but gic_raise_softirq() uses it instead to ensure that the
> > big.LITTLE migration logic can figure out when it is safe to migrate
> > interrupts between physical cores.
> > 
> > This is sub-optimal in closely related ways:
> > 
> > 1. No locking at all is required on systems where the b.L switcher is
> >    not configured.
> 
> ACK
> 
> > 2. Finer grain locking can be used on systems where the b.L switcher is
> >    present.
> 
> NAK
> 
> Consider this sequence:
> 
> 	CPU 1				CPU 2
> 	-----				-----
> 	gic_raise_softirq()		gic_migrate_target()
> 	bl_migration_lock() [OK]
> 	[...]				[...]
> 	map |= gic_cpu_map[cpu];	bl_migration_lock() [contended]
> 	bl_migration_unlock(flags);	bl_migration_lock() [OK]
> 					gic_cpu_map[cpu] = 1 << new_cpu_id;
> 					bl_migration_unlock(flags);
> 					[...]
> 					(migrate pending IPI from old CPU)
> 	writel_relaxed(map to GIC_DIST_SOFTINT);

Isn't this solved inside gic_raise_softirq? How can the writel_relaxed()
escape from the critical section and happen at the end of the sequence?


> 	[this IPI is now lost]
> 
> Granted, this race is apparently aready possible today.  We probably get 
> away with it because the locked sequence in gic_migrate_target() include 
> the retargetting of peripheral interrupts which gives plenti of time for 
> code execution in gic_raise_softirq() to post its IPI before the IPI 
> migration code is executed.  So in that sense it could be argued that 
> the reduced lock coverage from your patch doesn't make things any worse.  
> If anything it might even help by letting gic_migrate_target() complete 
> sooner.  But removing cpu_map_migration_lock altogether would improve 
> things even further by that logic.  I however don't think we should live 
> so dangerously.
> 
> Therefore, for the lock to be effective, it has to encompass the 
> changing of the CPU map _and_ migration of pending IPIs before new IPIs 
> are allowed again.  That means the locked area has to grow not shrink.
> 
> Oh, and a minor nit:
> 
> > + * This lock is used by the big.LITTLE migration code to ensure no IPIs
> > + * can be pended on the old core after the map has been updated.
> > + */
> > +#ifdef CONFIG_BL_SWITCHER
> > +static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
> > +
> > +static inline void bl_migration_lock(unsigned long *flags)
> 
> Please name it gic_migration_lock. "bl_migration_lock" is a bit too 
> generic in this context.

I'll change this.

Daniel.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 3.19-rc6 v16 1/6] irqchip: gic: Optimize locking in gic_raise_softirq
  2015-02-26 21:05         ` Daniel Thompson
@ 2015-02-26 21:33           ` Nicolas Pitre
  -1 siblings, 0 replies; 94+ messages in thread
From: Nicolas Pitre @ 2015-02-26 21:33 UTC (permalink / raw)
  To: Daniel Thompson
  Cc: Thomas Gleixner, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

On Thu, 26 Feb 2015, Daniel Thompson wrote:

> On Thu, 2015-02-26 at 15:31 -0500, Nicolas Pitre wrote:
> > On Tue, 3 Feb 2015, Daniel Thompson wrote:
> > 
> > > Currently gic_raise_softirq() is locked using upon irq_controller_lock.
> > > This lock is primarily used to make register read-modify-write sequences
> > > atomic but gic_raise_softirq() uses it instead to ensure that the
> > > big.LITTLE migration logic can figure out when it is safe to migrate
> > > interrupts between physical cores.
> > > 
> > > This is sub-optimal in closely related ways:
> > > 
> > > 1. No locking at all is required on systems where the b.L switcher is
> > >    not configured.
> > 
> > ACK
> > 
> > > 2. Finer grain locking can be used on systems where the b.L switcher is
> > >    present.
> > 
> > NAK
> > 
> > Consider this sequence:
> > 
> > 	CPU 1				CPU 2
> > 	-----				-----
> > 	gic_raise_softirq()		gic_migrate_target()
> > 	bl_migration_lock() [OK]
> > 	[...]				[...]
> > 	map |= gic_cpu_map[cpu];	bl_migration_lock() [contended]
> > 	bl_migration_unlock(flags);	bl_migration_lock() [OK]
> > 					gic_cpu_map[cpu] = 1 << new_cpu_id;
> > 					bl_migration_unlock(flags);
> > 					[...]
> > 					(migrate pending IPI from old CPU)
> > 	writel_relaxed(map to GIC_DIST_SOFTINT);
> 
> Isn't this solved inside gic_raise_softirq? How can the writel_relaxed()
> escape from the critical section and happen at the end of the sequence?

Hmmm... blah.  OK I obviously can't read today.

The patch is fine of course.


> > Oh, and a minor nit:
> > 
> > > + * This lock is used by the big.LITTLE migration code to ensure no IPIs
> > > + * can be pended on the old core after the map has been updated.
> > > + */
> > > +#ifdef CONFIG_BL_SWITCHER
> > > +static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
> > > +
> > > +static inline void bl_migration_lock(unsigned long *flags)
> > 
> > Please name it gic_migration_lock. "bl_migration_lock" is a bit too 
> > generic in this context.
> 
> I'll change this.

Good.  You may add my ACK.


Nicolas

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 3.19-rc6 v16 1/6] irqchip: gic: Optimize locking in gic_raise_softirq
@ 2015-02-26 21:33           ` Nicolas Pitre
  0 siblings, 0 replies; 94+ messages in thread
From: Nicolas Pitre @ 2015-02-26 21:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 26 Feb 2015, Daniel Thompson wrote:

> On Thu, 2015-02-26 at 15:31 -0500, Nicolas Pitre wrote:
> > On Tue, 3 Feb 2015, Daniel Thompson wrote:
> > 
> > > Currently gic_raise_softirq() is locked using upon irq_controller_lock.
> > > This lock is primarily used to make register read-modify-write sequences
> > > atomic but gic_raise_softirq() uses it instead to ensure that the
> > > big.LITTLE migration logic can figure out when it is safe to migrate
> > > interrupts between physical cores.
> > > 
> > > This is sub-optimal in closely related ways:
> > > 
> > > 1. No locking at all is required on systems where the b.L switcher is
> > >    not configured.
> > 
> > ACK
> > 
> > > 2. Finer grain locking can be used on systems where the b.L switcher is
> > >    present.
> > 
> > NAK
> > 
> > Consider this sequence:
> > 
> > 	CPU 1				CPU 2
> > 	-----				-----
> > 	gic_raise_softirq()		gic_migrate_target()
> > 	bl_migration_lock() [OK]
> > 	[...]				[...]
> > 	map |= gic_cpu_map[cpu];	bl_migration_lock() [contended]
> > 	bl_migration_unlock(flags);	bl_migration_lock() [OK]
> > 					gic_cpu_map[cpu] = 1 << new_cpu_id;
> > 					bl_migration_unlock(flags);
> > 					[...]
> > 					(migrate pending IPI from old CPU)
> > 	writel_relaxed(map to GIC_DIST_SOFTINT);
> 
> Isn't this solved inside gic_raise_softirq? How can the writel_relaxed()
> escape from the critical section and happen at the end of the sequence?

Hmmm... blah.  OK I obviously can't read today.

The patch is fine of course.


> > Oh, and a minor nit:
> > 
> > > + * This lock is used by the big.LITTLE migration code to ensure no IPIs
> > > + * can be pended on the old core after the map has been updated.
> > > + */
> > > +#ifdef CONFIG_BL_SWITCHER
> > > +static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
> > > +
> > > +static inline void bl_migration_lock(unsigned long *flags)
> > 
> > Please name it gic_migration_lock. "bl_migration_lock" is a bit too 
> > generic in this context.
> 
> I'll change this.

Good.  You may add my ACK.


Nicolas

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.0-rc1 v17 0/6] irq/arm: Implement arch_trigger_all_cpu_backtrace
  2015-01-23 14:22 ` Daniel Thompson
@ 2015-03-04 10:12   ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-04 10:12 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

This patchset modifies the GIC driver to allow it, on supported
platforms, to route IPI interrupts to FIQ. It then uses this
feature to implement arch_trigger_all_cpu_backtrace for arm.
In order to neatly bring in the changes for the arm we also rearrange
some of the existing x86 NMI code to make it architecture neutral.

The patchset http://thread.gmane.org/gmane.linux.kernel/1897765 , which
makes sched_clock() NMI/FIQ-safe, should be treated as a prerequisite for
this patch set. Although sched_clock() is not called directly by any of
the code that runs from a FIQ handler it is possible for sched_clock()
to be called indirectly when the function tracer is enabled.

This patchset touches a fairly large number of different sub-systems
(irq, printk, x86, arm). However, of the six patches, four fall under
one of tglx's maintainerships (either through irq or x86) and another
(printk) has no explicit maintainer so I've aimed the whole set at
Thomas (ack:s would be nice for other patches).

Thomas: The patchset breaks cleanly at all stages, including between
        patches 5 and 6. If you want to leave the ARM patch for Russell
	to take at a later point then that also should be fine.

The patches have been runtime tested on two systems capable of
supporting FIQ (Freescale i.MX6 and STiH416) and two that do not
(vexpress-a9 and Qualcomm Snapdragon 600), the changes to the x86
logic were tested on qemu and all patches have been compile tested
on x86, arm and arm64.

Note: On platforms not capable of supporting FIQ, the IPI to generate a
      backtrace will fall back to using IRQ for propagation instead.
      The backtrace logic contains a timeout so we will not permanently
      wedge the requesting CPU if other CPUs are not responsive (see the
      sketch below).
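
The timeout is the usual poll-with-deadline pattern. As a rough sketch only
(the helper name, mask name and the 10 second limit below are illustrative
assumptions rather than an extract from the arm patch):

	#include <linux/cpumask.h>
	#include <linux/delay.h>

	static DECLARE_BITMAP(backtrace_mask, NR_CPUS);	/* hypothetical */

	static void wait_for_backtrace_cpus(void)
	{
		int i;

		/* Give up after 10 seconds so the requesting CPU cannot wedge. */
		for (i = 0; i < 10 * 1000; i++) {
			if (cpumask_empty(to_cpumask(backtrace_mask)))
				break;
			mdelay(1);
		}
	}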

v17:

* Rename bl_migration_lock/unlock to gic_migration_lock/unlock
  (Nicolas Pitre).

v16:

* Significant clean up of the printk patches (Thomas Gleixner).
  Replacing macros with real functions, CONFIG_ARCH_WANT_NMI_PRINTK
  -> CONFIG_PRINTK_NMI, prefixing global functions with printk_nmi,
  removing pointless exports, removing cpu_mask from the interfaces,
  removal of just-in-time initialization of trace buffers, prevented
  call sites having to save state, rolled up variable declarations
  into single lines.

* Dropped the sched_clock() patches from *this* patchset and managed
  them separately (http://thread.gmane.org/gmane.linux.kernel/1879261 ).
  The cross-dependencies between the patches are minimal; the backtrace
  code only calls sched_clock() if we are ftracing, and backtracing is
  normally only triggered to report information about a broken
  system (although users can type SysRq-l for amusement, most use it
  to find out why the system is dead).

* Squashed together the final two patches. Essentially these duplicated
  the x86 code and slavishly avoided changing it before, in the next
  patch, fixing it to work better on ARM. It seems better that the code
  just works first time!

v15:

* Added a patch to make sched_clock safe to call from NMI (Stephen
  Boyd). Note that sched_clock() is not called by the NMI handlers that
  have been added for the arm but it could be called if tools such as
  ftrace are deployed.

* Fixed some warnings picked up during bisectability testing.

v14:

* Moved a nmi_vprintk() and friends from arch/x86/kernel/apic/hw_nmi.c
  to printk.c (Steven Rostedt)

v13:

* Updated the code to print the backtrace to replicate Steven Rostedt's
  x86 work to make SysRq-l safe. This is pretty much a total rewrite of
  patches 4 and 5.

v12:

* Squash first two patches into a single one and re-describe
  (Thomas Gleixner).

* Improve description of "irqchip: gic: Make gic_raise_softirq FIQ-safe"
  (Thomas Gleixner).

v11:

* Optimized gic_raise_softirq() by replacing a register read with
  a memory read (Jason Cooper).

v10:

* Add a further patch to optimize away some of the locking on systems
  where CONFIG_BL_SWITCHER is not set (Marc Zyngier). Compiles OK with
  exynos_defconfig (which is the only defconfig to set this option).

* Whitespace fixes in patch 4. That patch previously used spaces for
  alignment of new constants but the rest of the file used tabs.

v9:

* Improved documentation and structure of initial patch (now initial
  two patches) to make gic_raise_softirq() safe to call from FIQ
  (Thomas Gleixner).

* Avoid masking interrupts during gic_raise_softirq(). The use of the
  read lock makes this redundant (because we can safely re-enter the
  function).

v8:

* Fixed build on arm64 caused by a spurious include file in irq-gic.c.

v7-2 (accidentally released twice with same number):

* Fixed boot regression on vexpress-a9 (reported by Russell King).

* Rebased on v3.18-rc3; removed one patch from set that is already
  included in mainline.

* Dropped arm64/fiq.h patch from the set (still useful but not related
  to issuing backtraces).

v7:

* Re-arranged code within the patch series to fix a regression
  introduced midway through the series and corrected by a later patch
  (testing by Olof's autobuilder). Tested offending patch in isolation
  using defconfig identified by the autobuilder.

v6:

* Renamed svc_entry's call_trace argument to just trace (example code
  from Russell King).

* Fixed mismatched ENDPROC() in __fiq_abt (example code from Russell
  King).

* Modified usr_entry to optionally avoid calling into the trace code and
  used this in FIQ entry from usr path. Modified corresponding exit code
  to avoid calling into trace code and the scheduler (example code from
  Russell King).

* Ensured the default FIQ register state is restored when the default
  FIQ handler is reinstalled (example code from Russell King).

* Renamed no_fiq_insn to dfl_fiq_insn to reflect the effect of adopting
  a default FIQ handler.

* Re-instated fiq_safe_migration_lock and associated logic in
  gic_raise_softirq(). gic_raise_softirq() is called by wake_up_klogd()
  in the console unlock logic.

v5:

* Rebased on 3.17-rc4.

* Removed a spurious line from the final "glue it together" patch
  that broke the build.

v4:

* Replaced push/pop with stmfd/ldmfd respectively (review of Nicolas
  Pitre).

* Really fix bad pt_regs pointer generation in __fiq_abt.

* Remove fiq_safe_migration_lock and associated logic in
  gic_raise_softirq() (review of Russell King)

* Restructured to introduce the default FIQ handler first, before the
  new features (review of Russell King).

v3:

* Removed redundant header guards from arch/arm64/include/asm/fiq.h
  (review of Catalin Marinas).

* Moved svc_exit_via_fiq macro to entry-header.S (review of Nicolas
  Pitre).

v2:

* Restructured to sit nicely on a similar FYI patchset from Russell
  King. It now effectively replaces the work in progress final patch
  with something much more complete.

* Implemented (and tested) a Thumb-2 implementation of svc_exit_via_fiq
  (review of Nicolas Pitre)

* Dropped the GIC group 0 workaround patch. The issue of FIQ interrupts
  being acknowledged by the IRQ handler does still exist but should be
  harmless because the IRQ handler will still wind up calling
  ipi_cpu_backtrace().

* Removed any dependency on CONFIG_FIQ; all cpu backtrace effectively
  becomes a platform feature (although the use of non-maskable
  interrupts to implement it is best effort rather than guaranteed).

* Better comments highlighting usage of RAZ/WI registers (and parts of
  registers) in the GIC code.

Changes *before* v1:

* This patchset is a hugely cut-down successor to "[PATCH v11 00/19]
  arm: KGDB NMI/FIQ support". Thanks to Thomas Gleixner for suggesting
  the new structure. For historic details see:
        https://lkml.org/lkml/2014/9/2/227

* Fix bug in __fiq_abt (no longer passes a bad struct pt_regs value).
  In fixing this we also remove the useless indirection previously
  found in the fiq_handler macro.

* Make default fiq handler "always on" by migrating from fiq.c to
  traps.c and replace do_unexp_fiq with the new handler (review
  of Russell King).

* Add arm64 version of fiq.h (review of Russell King)

* Removed conditional branching and code from irq-gic.c; this is
  replaced by much simpler code that relies on the GIC specification's
  heavy use of read-as-zero/write-ignored (review of Russell King)


Daniel Thompson (6):
  irqchip: gic: Optimize locking in gic_raise_softirq
  irqchip: gic: Make gic_raise_softirq FIQ-safe
  irqchip: gic: Introduce plumbing for IPI FIQ
  printk: Simple implementation for NMI backtracing
  x86/nmi: Use common printk functions
  ARM: Add support for on-demand backtrace of other CPUs

 arch/arm/Kconfig                |   1 +
 arch/arm/include/asm/hardirq.h  |   2 +-
 arch/arm/include/asm/irq.h      |   5 +
 arch/arm/include/asm/smp.h      |   3 +
 arch/arm/kernel/smp.c           |  80 ++++++++++++++++
 arch/arm/kernel/traps.c         |   8 +-
 arch/x86/Kconfig                |   1 +
 arch/x86/kernel/apic/hw_nmi.c   | 101 ++------------------
 drivers/irqchip/irq-gic.c       | 203 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 ++
 include/linux/printk.h          |  18 ++++
 init/Kconfig                    |   3 +
 kernel/printk/printk.c          | 149 +++++++++++++++++++++++++++++
 13 files changed, 471 insertions(+), 111 deletions(-)

--
2.1.0


^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.0-rc1 v17 0/6] irq/arm: Implement arch_trigger_all_cpu_backtrace
@ 2015-03-04 10:12   ` Daniel Thompson
  0 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-04 10:12 UTC (permalink / raw)
  To: linux-arm-kernel

This patchset modifies the GIC driver to allow it, on supported
platforms, to route IPI interrupts to FIQ. It then uses this
feature to implement arch_trigger_all_cpu_backtrace for arm.
In order to neatly bring in the changes for the arm we also rearrange
some of the existing x86 NMI code to make it architecture neutral.

The patchset http://thread.gmane.org/gmane.linux.kernel/1897765 , which
makes sched_clock() NMI/FIQ-safe, should be treated as a prerequisite for
this patch set. Although sched_clock() is not called directly by any of
the code that runs from a FIQ handler it is possible for sched_clock()
to be called indirectly when the function tracer is enabled.

This patchset touches a fairly large number of different sub-systems
(irq, printk, x86, arm). However, of the six patches, four fall under
one of tglx's maintainerships (either through irq or x86) and another
(printk) has no explicit maintainer so I've aimed the whole set at
Thomas (ack:s would be nice for other patches).

Thomas: The patchset breaks cleanly at all stages, including between
        patches 5 and 6. If you want to leave the ARM patch for Russell
	to take at a later point then that also should be fine.

The patches have been runtime tested on two systems capable of
supporting FIQ (Freescale i.MX6 and STiH416) and two that do not
(vexpress-a9 and Qualcomm Snapdragon 600), the changes to the x86
logic were tested on qemu and all patches have been compile tested
on x86, arm and arm64.

Note: On platforms not capable of supporting FIQ, the IPI to generate a
      backtrace will fall back to using IRQ for propagation instead.
      The backtrace logic contains a timeout so we will not permanently
      wedge the requesting CPU if other CPUs are not responsive.

v17:

* Rename bl_migration_lock/unlock to gic_migration_lock/unlock
  (Nicolas Pitre).

v16:

* Significant clean up of the printk patches (Thomas Gleixner).
  Replacing macros with real functions, CONFIG_ARCH_WANT_NMI_PRINTK
  -> CONFIG_PRINTK_NMI, prefixing global functions with printk_nmi,
  removing pointless exports, removing cpu_mask from the interfaces,
  removal of just-in-time initialization of trace buffers, prevented
  call sites having to save state, rolled up variable declarations
  into single lines.

* Dropped the sched_clock() patches from *this* patchset and managed
  them separately (http://thread.gmane.org/gmane.linux.kernel/1879261 ).
  The cross-dependencies between the patches are minimal; the backtrace
  code only calls sched_clock() if we are ftracing, and backtracing is
  normally only triggered to report information about a broken
  system (although users can type SysRq-l for amusement, most use it
  to find out why the system is dead).

* Squashed together the final two patches. Essentially these duplicated
  the x86 code and slavishly avoided changing it before, in the next
  patch, fixing it to work better on ARM. It seems better that the code
  just works first time!

v15:

* Added a patch to make sched_clock safe to call from NMI (Stephen
  Boyd). Note that sched_clock() is not called by the NMI handlers that
  have been added for the arm but it could be called if tools such as
  ftrace are deployed.

* Fixed some warnings picked up during bisectability testing.

v14:

* Moved a nmi_vprintk() and friends from arch/x86/kernel/apic/hw_nmi.c
  to printk.c (Steven Rostedt)

v13:

* Updated the code to print the backtrace to replicate Steven Rostedt's
  x86 work to make SysRq-l safe. This is pretty much a total rewrite of
  patches 4 and 5.

v12:

* Squash first two patches into a single one and re-describe
  (Thomas Gleixner).

* Improve description of "irqchip: gic: Make gic_raise_softirq FIQ-safe"
  (Thomas Gleixner).

v11:

* Optimized gic_raise_softirq() by replacing a register read with
  a memory read (Jason Cooper).

v10:

* Add a further patch to optimize away some of the locking on systems
  where CONFIG_BL_SWITCHER is not set (Marc Zyngier). Compiles OK with
  exynos_defconfig (which is the only defconfig to set this option).

* Whitespace fixes in patch 4. That patch previously used spaces for
  alignment of new constants but the rest of the file used tabs.

v9:

* Improved documentation and structure of initial patch (now initial
  two patches) to make gic_raise_softirq() safe to call from FIQ
  (Thomas Gleixner).

* Avoid masking interrupts during gic_raise_softirq(). The use of the
  read lock makes this redundant (because we can safely re-enter the
  function).

v8:

* Fixed build on arm64 caused by a spurious include file in irq-gic.c.

v7-2 (accidentally released twice with same number):

* Fixed boot regression on vexpress-a9 (reported by Russell King).

* Rebased on v3.18-rc3; removed one patch from set that is already
  included in mainline.

* Dropped arm64/fiq.h patch from the set (still useful but not related
  to issuing backtraces).

v7:

* Re-arranged code within the patch series to fix a regression
  introduced midway through the series and corrected by a later patch
  (testing by Olof's autobuilder). Tested offending patch in isolation
  using defconfig identified by the autobuilder.

v6:

* Renamed svc_entry's call_trace argument to just trace (example code
  from Russell King).

* Fixed mismatched ENDPROC() in __fiq_abt (example code from Russell
  King).

* Modified usr_entry to optionally avoid calling into the trace code and
  used this in FIQ entry from usr path. Modified corresponding exit code
  to avoid calling into trace code and the scheduler (example code from
  Russell King).

* Ensured the default FIQ register state is restored when the default
  FIQ handler is reinstalled (example code from Russell King).

* Renamed no_fiq_insn to dfl_fiq_insn to reflect the effect of adopting
  a default FIQ handler.

* Re-instated fiq_safe_migration_lock and associated logic in
  gic_raise_softirq(). gic_raise_softirq() is called by wake_up_klogd()
  in the console unlock logic.

v5:

* Rebased on 3.17-rc4.

* Removed a spurious line from the final "glue it together" patch
  that broke the build.

v4:

* Replaced push/pop with stmfd/ldmfd respectively (review of Nicolas
  Pitre).

* Really fix bad pt_regs pointer generation in __fiq_abt.

* Remove fiq_safe_migration_lock and associated logic in
  gic_raise_softirq() (review of Russell King)

* Restructured to introduce the default FIQ handler first, before the
  new features (review of Russell King).

v3:

* Removed redundant header guards from arch/arm64/include/asm/fiq.h
  (review of Catalin Marinas).

* Moved svc_exit_via_fiq macro to entry-header.S (review of Nicolas
  Pitre).

v2:

* Restructured to sit nicely on a similar FYI patchset from Russell
  King. It now effectively replaces the work in progress final patch
  with something much more complete.

* Implemented (and tested) a Thumb-2 implementation of svc_exit_via_fiq
  (review of Nicolas Pitre)

* Dropped the GIC group 0 workaround patch. The issue of FIQ interrupts
  being acknowledged by the IRQ handler does still exist but should be
  harmless because the IRQ handler will still wind up calling
  ipi_cpu_backtrace().

* Removed any dependency on CONFIG_FIQ; all cpu backtrace effectively
  becomes a platform feature (although the use of non-maskable
  interrupts to implement it is best effort rather than guaranteed).

* Better comments highlighting usage of RAZ/WI registers (and parts of
  registers) in the GIC code.

Changes *before* v1:

* This patchset is a hugely cut-down successor to "[PATCH v11 00/19]
  arm: KGDB NMI/FIQ support". Thanks to Thomas Gleixner for suggesting
  the new structure. For historic details see:
        https://lkml.org/lkml/2014/9/2/227

* Fix bug in __fiq_abt (no longer passes a bad struct pt_regs value).
  In fixing this we also remove the useless indirection previously
  found in the fiq_handler macro.

* Make default fiq handler "always on" by migrating from fiq.c to
  traps.c and replace do_unexp_fiq with the new handler (review
  of Russell King).

* Add arm64 version of fiq.h (review of Russell King)

* Removed conditional branching and code from irq-gic.c; this is
  replaced by much simpler code that relies on the GIC specification's
  heavy use of read-as-zero/write-ignored (review of Russell King)


Daniel Thompson (6):
  irqchip: gic: Optimize locking in gic_raise_softirq
  irqchip: gic: Make gic_raise_softirq FIQ-safe
  irqchip: gic: Introduce plumbing for IPI FIQ
  printk: Simple implementation for NMI backtracing
  x86/nmi: Use common printk functions
  ARM: Add support for on-demand backtrace of other CPUs

 arch/arm/Kconfig                |   1 +
 arch/arm/include/asm/hardirq.h  |   2 +-
 arch/arm/include/asm/irq.h      |   5 +
 arch/arm/include/asm/smp.h      |   3 +
 arch/arm/kernel/smp.c           |  80 ++++++++++++++++
 arch/arm/kernel/traps.c         |   8 +-
 arch/x86/Kconfig                |   1 +
 arch/x86/kernel/apic/hw_nmi.c   | 101 ++------------------
 drivers/irqchip/irq-gic.c       | 203 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 ++
 include/linux/printk.h          |  18 ++++
 init/Kconfig                    |   3 +
 kernel/printk/printk.c          | 149 +++++++++++++++++++++++++++++
 13 files changed, 471 insertions(+), 111 deletions(-)

--
2.1.0

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.0-rc1 v17 1/6] irqchip: gic: Optimize locking in gic_raise_softirq
  2015-03-04 10:12   ` Daniel Thompson
@ 2015-03-04 10:12     ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-04 10:12 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

Currently gic_raise_softirq() is locked using irq_controller_lock.
This lock is primarily used to make register read-modify-write sequences
atomic but gic_raise_softirq() uses it instead to ensure that the
big.LITTLE migration logic can figure out when it is safe to migrate
interrupts between physical cores.

This is sub-optimal in closely related ways:

1. No locking at all is required on systems where the b.L switcher is
   not configured.

2. Finer grain locking can be used on systems where the b.L switcher is
   present.

This patch resolves both of the above by introducing a separate finer
grain lock and providing conditionally compiled inlines to lock/unlock
it.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
---
 drivers/irqchip/irq-gic.c | 36 +++++++++++++++++++++++++++++++++---
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 4634cf7d0ec3..f2a0b4525b65 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -73,6 +73,27 @@ struct gic_chip_data {
 static DEFINE_RAW_SPINLOCK(irq_controller_lock);
 
 /*
+ * This lock is used by the big.LITTLE migration code to ensure no IPIs
+ * can be pended on the old core after the map has been updated.
+ */
+#ifdef CONFIG_BL_SWITCHER
+static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
+
+static inline void gic_migration_lock(unsigned long *flags)
+{
+	raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags);
+}
+
+static inline void gic_migration_unlock(unsigned long flags)
+{
+	raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags);
+}
+#else
+static inline void gic_migration_lock(unsigned long *flags) {}
+static inline void gic_migration_unlock(unsigned long flags) {}
+#endif
+
+/*
  * The GIC mapping of CPU interfaces does not necessarily match
  * the logical CPU numbering.  Let's use a mapping as returned
  * by the GIC itself.
@@ -627,7 +648,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	int cpu;
 	unsigned long flags, map = 0;
 
-	raw_spin_lock_irqsave(&irq_controller_lock, flags);
+	gic_migration_lock(&flags);
 
 	/* Convert our logical CPU mask into a physical one. */
 	for_each_cpu(cpu, mask)
@@ -642,7 +663,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	/* this always happens on GIC0 */
 	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
-	raw_spin_unlock_irqrestore(&irq_controller_lock, flags);
+	gic_migration_unlock(flags);
 }
 #endif
 
@@ -713,8 +734,17 @@ void gic_migrate_target(unsigned int new_cpu_id)
 
 	raw_spin_lock(&irq_controller_lock);
 
-	/* Update the target interface for this logical CPU */
+	/*
+	 * Update the target interface for this logical CPU
+	 *
+	 * From the point we release the cpu_map_migration_lock any new
+	 * SGIs will be pended on the new cpu which makes the set of SGIs
+	 * pending on the old cpu static. That means we can defer the
+	 * migration until after we have released the irq_controller_lock.
+	 */
+	raw_spin_lock(&cpu_map_migration_lock);
 	gic_cpu_map[cpu] = 1 << new_cpu_id;
+	raw_spin_unlock(&cpu_map_migration_lock);
 
 	/*
 	 * Find all the peripheral interrupts targetting the current
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 4.0-rc1 v17 1/6] irqchip: gic: Optimize locking in gic_raise_softirq
@ 2015-03-04 10:12     ` Daniel Thompson
  0 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-04 10:12 UTC (permalink / raw)
  To: linux-arm-kernel

Currently gic_raise_softirq() is locked using irq_controller_lock.
This lock is primarily used to make register read-modify-write sequences
atomic but gic_raise_softirq() uses it instead to ensure that the
big.LITTLE migration logic can figure out when it is safe to migrate
interrupts between physical cores.

This is sub-optimal in closely related ways:

1. No locking at all is required on systems where the b.L switcher is
   not configured.

2. Finer grain locking can be used on systems where the b.L switcher is
   present.

This patch resolves both of the above by introducing a separate finer
grain lock and providing conditionally compiled inlines to lock/unlock
it.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
---
 drivers/irqchip/irq-gic.c | 36 +++++++++++++++++++++++++++++++++---
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 4634cf7d0ec3..f2a0b4525b65 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -73,6 +73,27 @@ struct gic_chip_data {
 static DEFINE_RAW_SPINLOCK(irq_controller_lock);
 
 /*
+ * This lock is used by the big.LITTLE migration code to ensure no IPIs
+ * can be pended on the old core after the map has been updated.
+ */
+#ifdef CONFIG_BL_SWITCHER
+static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
+
+static inline void gic_migration_lock(unsigned long *flags)
+{
+	raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags);
+}
+
+static inline void gic_migration_unlock(unsigned long flags)
+{
+	raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags);
+}
+#else
+static inline void gic_migration_lock(unsigned long *flags) {}
+static inline void gic_migration_unlock(unsigned long flags) {}
+#endif
+
+/*
  * The GIC mapping of CPU interfaces does not necessarily match
  * the logical CPU numbering.  Let's use a mapping as returned
  * by the GIC itself.
@@ -627,7 +648,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	int cpu;
 	unsigned long flags, map = 0;
 
-	raw_spin_lock_irqsave(&irq_controller_lock, flags);
+	gic_migration_lock(&flags);
 
 	/* Convert our logical CPU mask into a physical one. */
 	for_each_cpu(cpu, mask)
@@ -642,7 +663,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	/* this always happens on GIC0 */
 	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
-	raw_spin_unlock_irqrestore(&irq_controller_lock, flags);
+	gic_migration_unlock(flags);
 }
 #endif
 
@@ -713,8 +734,17 @@ void gic_migrate_target(unsigned int new_cpu_id)
 
 	raw_spin_lock(&irq_controller_lock);
 
-	/* Update the target interface for this logical CPU */
+	/*
+	 * Update the target interface for this logical CPU
+	 *
+	 * From the point we release the cpu_map_migration_lock any new
+	 * SGIs will be pended on the new cpu which makes the set of SGIs
+	 * pending on the old cpu static. That means we can defer the
+	 * migration until after we have released the irq_controller_lock.
+	 */
+	raw_spin_lock(&cpu_map_migration_lock);
 	gic_cpu_map[cpu] = 1 << new_cpu_id;
+	raw_spin_unlock(&cpu_map_migration_lock);
 
 	/*
 	 * Find all the peripheral interrupts targetting the current
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 4.0-rc1 v17 2/6] irqchip: gic: Make gic_raise_softirq FIQ-safe
  2015-03-04 10:12   ` Daniel Thompson
@ 2015-03-04 10:12     ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-04 10:12 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

It is currently possible for FIQ handlers to re-enter gic_raise_softirq()
and lock up.

    	gic_raise_softirq()
	   lock(x);
-~-> FIQ
        handle_fiq()
	   gic_raise_softirq()
	      lock(x);		<-- Lockup

arch/arm/ uses IPIs to implement arch_irq_work_raise(); thus this issue
renders it difficult for FIQ handlers to safely defer work to less
restrictive calling contexts.

This patch fixes the problem by converting the cpu_map_migration_lock
into an rwlock, making it safe to re-enter the function.

Note that having made it safe to re-enter gic_raise_softirq() we no
longer need to mask interrupts during gic_raise_softirq() because the
b.L migration is always performed from task context.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
---
 drivers/irqchip/irq-gic.c | 38 +++++++++++++++++++++++++-------------
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index f2a0b4525b65..48d6296a365a 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -75,22 +75,25 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock);
 /*
  * This lock is used by the big.LITTLE migration code to ensure no IPIs
  * can be pended on the old core after the map has been updated.
+ *
+ * This lock may be locked for reading from both IRQ and FIQ handlers
+ * and therefore must not be locked for writing when these are enabled.
  */
 #ifdef CONFIG_BL_SWITCHER
-static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
+static DEFINE_RWLOCK(cpu_map_migration_lock);
 
-static inline void gic_migration_lock(unsigned long *flags)
+static inline void gic_migration_lock(void)
 {
-	raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags);
+	read_lock(&cpu_map_migration_lock);
 }
 
-static inline void gic_migration_unlock(unsigned long flags)
+static inline void gic_migration_unlock(void)
 {
-	raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags);
+	read_unlock(&cpu_map_migration_lock);
 }
 #else
-static inline void gic_migration_lock(unsigned long *flags) {}
-static inline void gic_migration_unlock(unsigned long flags) {}
+static inline void gic_migration_lock(void) {}
+static inline void gic_migration_unlock(void) {}
 #endif
 
 /*
@@ -643,12 +646,20 @@ static void __init gic_pm_init(struct gic_chip_data *gic)
 #endif
 
 #ifdef CONFIG_SMP
+/*
+ * Raise the specified IPI on all cpus set in mask.
+ *
+ * This function is safe to call from all calling contexts, including
+ * FIQ handlers. It relies on gic_migration_lock() being multiply acquirable
+ * to avoid deadlocks when the function is re-entered at different
+ * exception levels.
+ */
 static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 {
 	int cpu;
-	unsigned long flags, map = 0;
+	unsigned long map = 0;
 
-	gic_migration_lock(&flags);
+	gic_migration_lock();
 
 	/* Convert our logical CPU mask into a physical one. */
 	for_each_cpu(cpu, mask)
@@ -663,7 +674,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	/* this always happens on GIC0 */
 	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
-	gic_migration_unlock(flags);
+	gic_migration_unlock();
 }
 #endif
 
@@ -711,7 +722,8 @@ int gic_get_cpu_id(unsigned int cpu)
  * Migrate all peripheral interrupts with a target matching the current CPU
  * to the interface corresponding to @new_cpu_id.  The CPU interface mapping
  * is also updated.  Targets to other CPU interfaces are unchanged.
- * This must be called with IRQs locally disabled.
+ * This must be called from a task context and with IRQ and FIQ locally
+ * disabled.
  */
 void gic_migrate_target(unsigned int new_cpu_id)
 {
@@ -742,9 +754,9 @@ void gic_migrate_target(unsigned int new_cpu_id)
 	 * pending on the old cpu static. That means we can defer the
 	 * migration until after we have released the irq_controller_lock.
 	 */
-	raw_spin_lock(&cpu_map_migration_lock);
+	write_lock(&cpu_map_migration_lock);
 	gic_cpu_map[cpu] = 1 << new_cpu_id;
-	raw_spin_unlock(&cpu_map_migration_lock);
+	write_unlock(&cpu_map_migration_lock);
 
 	/*
 	 * Find all the peripheral interrupts targetting the current
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 4.0-rc1 v17 2/6] irqchip: gic: Make gic_raise_softirq FIQ-safe
@ 2015-03-04 10:12     ` Daniel Thompson
  0 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-04 10:12 UTC (permalink / raw)
  To: linux-arm-kernel

It is currently possible for FIQ handlers to re-enter gic_raise_softirq()
and lock up.

    	gic_raise_softirq()
	   lock(x);
-~-> FIQ
        handle_fiq()
	   gic_raise_softirq()
	      lock(x);		<-- Lockup

arch/arm/ uses IPIs to implement arch_irq_work_raise(); thus this issue
renders it difficult for FIQ handlers to safely defer work to less
restrictive calling contexts.

This patch fixes the problem by converting the cpu_map_migration_lock
into an rwlock, making it safe to re-enter the function.

Note that having made it safe to re-enter gic_raise_softirq() we no
longer need to mask interrupts during gic_raise_softirq() because the
b.L migration is always performed from task context.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
---
 drivers/irqchip/irq-gic.c | 38 +++++++++++++++++++++++++-------------
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index f2a0b4525b65..48d6296a365a 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -75,22 +75,25 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock);
 /*
  * This lock is used by the big.LITTLE migration code to ensure no IPIs
  * can be pended on the old core after the map has been updated.
+ *
+ * This lock may be locked for reading from both IRQ and FIQ handlers
+ * and therefore must not be locked for writing when these are enabled.
  */
 #ifdef CONFIG_BL_SWITCHER
-static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
+static DEFINE_RWLOCK(cpu_map_migration_lock);
 
-static inline void gic_migration_lock(unsigned long *flags)
+static inline void gic_migration_lock(void)
 {
-	raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags);
+	read_lock(&cpu_map_migration_lock);
 }
 
-static inline void gic_migration_unlock(unsigned long flags)
+static inline void gic_migration_unlock(void)
 {
-	raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags);
+	read_unlock(&cpu_map_migration_lock);
 }
 #else
-static inline void gic_migration_lock(unsigned long *flags) {}
-static inline void gic_migration_unlock(unsigned long flags) {}
+static inline void gic_migration_lock(void) {}
+static inline void gic_migration_unlock(void) {}
 #endif
 
 /*
@@ -643,12 +646,20 @@ static void __init gic_pm_init(struct gic_chip_data *gic)
 #endif
 
 #ifdef CONFIG_SMP
+/*
+ * Raise the specified IPI on all cpus set in mask.
+ *
+ * This function is safe to call from all calling contexts, including
+ * FIQ handlers. It relies on gic_migration_lock() being multiply acquirable
+ * to avoid deadlocks when the function is re-entered at different
+ * exception levels.
+ */
 static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 {
 	int cpu;
-	unsigned long flags, map = 0;
+	unsigned long map = 0;
 
-	gic_migration_lock(&flags);
+	gic_migration_lock();
 
 	/* Convert our logical CPU mask into a physical one. */
 	for_each_cpu(cpu, mask)
@@ -663,7 +674,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	/* this always happens on GIC0 */
 	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
-	gic_migration_unlock(flags);
+	gic_migration_unlock();
 }
 #endif
 
@@ -711,7 +722,8 @@ int gic_get_cpu_id(unsigned int cpu)
  * Migrate all peripheral interrupts with a target matching the current CPU
  * to the interface corresponding to @new_cpu_id.  The CPU interface mapping
  * is also updated.  Targets to other CPU interfaces are unchanged.
- * This must be called with IRQs locally disabled.
+ * This must be called from a task context and with IRQ and FIQ locally
+ * disabled.
  */
 void gic_migrate_target(unsigned int new_cpu_id)
 {
@@ -742,9 +754,9 @@ void gic_migrate_target(unsigned int new_cpu_id)
 	 * pending on the old cpu static. That means we can defer the
 	 * migration until after we have released the irq_controller_lock.
 	 */
-	raw_spin_lock(&cpu_map_migration_lock);
+	write_lock(&cpu_map_migration_lock);
 	gic_cpu_map[cpu] = 1 << new_cpu_id;
-	raw_spin_unlock(&cpu_map_migration_lock);
+	write_unlock(&cpu_map_migration_lock);
 
 	/*
 	 * Find all the peripheral interrupts targetting the current
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 4.0-rc1 v17 3/6] irqchip: gic: Introduce plumbing for IPI FIQ
  2015-03-04 10:12   ` Daniel Thompson
@ 2015-03-04 10:12     ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-04 10:12 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

Currently it is not possible to exploit FIQ for systems with a GIC, even if
the systems are otherwise capable of it. This patch makes it possible
for IPIs to be delivered using FIQ.

To do so it modifies the register state so that normal interrupts are
placed in group 1 and specific IPIs are placed into group 0. It also
configures the controller to raise group 0 interrupts using the FIQ
signal. It provides a means for architecture code to define which IPIs
shall use FIQ and to acknowledge any IPIs that are raised.
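
Concretely, the opt-in is made by defining SMP_IPI_FIQ_MASK (the driver
defaults it to 0 below, so nothing changes unless the architecture asks for
it) and by acknowledging FIQ-based IPIs from the FIQ/NMI entry path, which
the traps.c hunk below does via gic_handle_fiq_ipi(). A rough sketch of the
arch-side definition, where the IPI name is illustrative only (the real
definition belongs to a later patch in this series, not shown here):

	/* Hypothetical <asm/smp.h> fragment: raise the backtrace SGI as FIQ. */
	#define SMP_IPI_FIQ_MASK	BIT(IPI_CPU_BACKTRACE)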

All GIC hardware except GICv1-without-TrustZone support provides a means
to group exceptions into group 0 and group 1, but the hardware
functionality is unavailable to the kernel when a secure monitor is
present because access to the grouping registers is prohibited outside
"secure world". However, when grouping is not available (or, in the case
of early GICv1 implementations, is very hard to configure) the code to
change groups does not deploy and all IPIs will be raised via IRQ.

It has been tested and shown working on two systems capable of
supporting grouping (Freescale i.MX6 and STiH416). It has also been
tested for boot regressions on two systems that do not support grouping
(vexpress-a9 and Qualcomm Snapdragon 600).

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Jon Medhurst <tixy@linaro.org>
---
 arch/arm/kernel/traps.c         |   5 +-
 drivers/irqchip/irq-gic.c       | 151 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 +++
 3 files changed, 153 insertions(+), 11 deletions(-)

diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 788e23fe64d8..b35e220ae1b1 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -26,6 +26,7 @@
 #include <linux/init.h>
 #include <linux/sched.h>
 #include <linux/irq.h>
+#include <linux/irqchip/arm-gic.h>
 
 #include <linux/atomic.h>
 #include <asm/cacheflush.h>
@@ -479,7 +480,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
 
 	nmi_enter();
 
-	/* nop. FIQ handlers for special arch/arm features can be added here. */
+#ifdef CONFIG_ARM_GIC
+	gic_handle_fiq_ipi();
+#endif
 
 	nmi_exit();
 
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 48d6296a365a..cd2e4f93675b 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -39,6 +39,7 @@
 #include <linux/slab.h>
 #include <linux/irqchip/chained_irq.h>
 #include <linux/irqchip/arm-gic.h>
+#include <linux/ratelimit.h>
 
 #include <asm/cputype.h>
 #include <asm/irq.h>
@@ -48,6 +49,10 @@
 #include "irq-gic-common.h"
 #include "irqchip.h"
 
+#ifndef SMP_IPI_FIQ_MASK
+#define SMP_IPI_FIQ_MASK 0
+#endif
+
 union gic_base {
 	void __iomem *common_base;
 	void __percpu * __iomem *percpu_base;
@@ -65,6 +70,7 @@ struct gic_chip_data {
 #endif
 	struct irq_domain *domain;
 	unsigned int gic_irqs;
+	u32 igroup0_shadow;
 #ifdef CONFIG_GIC_NON_BANKED
 	void __iomem *(*get_base)(union gic_base *);
 #endif
@@ -351,6 +357,83 @@ static struct irq_chip gic_chip = {
 	.irq_set_wake		= gic_set_wake,
 };
 
+/*
+ * Shift an interrupt between Group 0 and Group 1.
+ *
+ * In addition to changing the group we also modify the priority to
+ * match what "ARM strongly recommends" for a system where no Group 1
+ * interrupt must ever preempt a Group 0 interrupt.
+ *
+ * It is safe to call this function on systems which do not support
+ * grouping (it will have no effect).
+ */
+static void gic_set_group_irq(struct gic_chip_data *gic, unsigned int hwirq,
+			      int group)
+{
+	void __iomem *base = gic_data_dist_base(gic);
+	unsigned int grp_reg = hwirq / 32 * 4;
+	u32 grp_mask = BIT(hwirq % 32);
+	u32 grp_val;
+
+	unsigned int pri_reg = (hwirq / 4) * 4;
+	u32 pri_mask = BIT(7 + ((hwirq % 4) * 8));
+	u32 pri_val;
+
+	/*
+	 * Systems which do not support grouping will not have
+	 * the EnableGrp1 bit set.
+	 */
+	if (!(GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL)))
+		return;
+
+	raw_spin_lock(&irq_controller_lock);
+
+	grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg);
+	pri_val = readl_relaxed(base + GIC_DIST_PRI + pri_reg);
+
+	if (group) {
+		grp_val |= grp_mask;
+		pri_val |= pri_mask;
+	} else {
+		grp_val &= ~grp_mask;
+		pri_val &= ~pri_mask;
+	}
+
+	writel_relaxed(grp_val, base + GIC_DIST_IGROUP + grp_reg);
+	if (grp_reg == 0)
+		gic->igroup0_shadow = grp_val;
+
+	writel_relaxed(pri_val, base + GIC_DIST_PRI + pri_reg);
+
+	raw_spin_unlock(&irq_controller_lock);
+}
+
+
+/*
+ * Fully acknowledge (both ack and eoi) any outstanding FIQ-based IPI,
+ * otherwise do nothing.
+ */
+void gic_handle_fiq_ipi(void)
+{
+	struct gic_chip_data *gic = &gic_data[0];
+	void __iomem *cpu_base = gic_data_cpu_base(gic);
+	unsigned long irqstat, irqnr;
+
+	if (WARN_ON(!in_nmi()))
+		return;
+
+	while ((1u << readl_relaxed(cpu_base + GIC_CPU_HIGHPRI)) &
+	       SMP_IPI_FIQ_MASK) {
+		irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK);
+		writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);
+
+		irqnr = irqstat & GICC_IAR_INT_ID_MASK;
+		WARN_RATELIMIT(irqnr > 16,
+			       "Unexpected irqnr %lu (bad prioritization?)\n",
+			       irqnr);
+	}
+}
+
 void __init gic_cascade_irq(unsigned int gic_nr, unsigned int irq)
 {
 	if (gic_nr >= MAX_GIC_NR)
@@ -382,15 +465,24 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
 static void gic_cpu_if_up(void)
 {
 	void __iomem *cpu_base = gic_data_cpu_base(&gic_data[0]);
-	u32 bypass = 0;
+	void __iomem *dist_base = gic_data_dist_base(&gic_data[0]);
+	u32 ctrl = 0;
 
 	/*
-	* Preserve bypass disable bits to be written back later
-	*/
-	bypass = readl(cpu_base + GIC_CPU_CTRL);
-	bypass &= GICC_DIS_BYPASS_MASK;
+	 * Preserve bypass disable bits to be written back later
+	 */
+	ctrl = readl(cpu_base + GIC_CPU_CTRL);
+	ctrl &= GICC_DIS_BYPASS_MASK;
 
-	writel_relaxed(bypass | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
+	/*
+	 * If EnableGrp1 is set in the distributor then enable group 1
+	 * support for this CPU (and route group 0 interrupts to FIQ).
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL))
+		ctrl |= GICC_COMMON_BPR | GICC_FIQ_EN | GICC_ACK_CTL |
+			GICC_ENABLE_GRP1;
+
+	writel_relaxed(ctrl | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
 }
 
 
@@ -414,7 +506,23 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
 
 	gic_dist_config(base, gic_irqs, NULL);
 
-	writel_relaxed(GICD_ENABLE, base + GIC_DIST_CTRL);
+	/*
+	 * Set EnableGrp1/EnableGrp0 (bit 1 and 0) or EnableGrp (bit 0 only,
+	 * bit 1 ignored) depending on current mode.
+	 */
+	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, base + GIC_DIST_CTRL);
+
+	/*
+	 * Set all global interrupts to be group 1 if (and only if) it
+	 * is possible to enable group 1 interrupts. This register is RAZ/WI
+	 * if not accessible or not implemented, however some GICv1 devices
+	 * do not implement the EnableGrp1 bit making it unsafe to set
+	 * this register unconditionally.
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL))
+		for (i = 32; i < gic_irqs; i += 32)
+			writel_relaxed(0xffffffff,
+				       base + GIC_DIST_IGROUP + i * 4 / 32);
 }
 
 static void gic_cpu_init(struct gic_chip_data *gic)
@@ -423,6 +531,7 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 	void __iomem *base = gic_data_cpu_base(gic);
 	unsigned int cpu_mask, cpu = smp_processor_id();
 	int i;
+	unsigned long secure_irqs, secure_irq;
 
 	/*
 	 * Get what the GIC says our CPU mask is.
@@ -441,6 +550,20 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 
 	gic_cpu_config(dist_base, NULL);
 
+	/*
+	 * If the distributor is configured to support interrupt grouping
+	 * then set any PPI and SGI interrupts not set in SMP_IPI_FIQ_MASK
+	 * to be group1 and ensure any remaining group 0 interrupts have
+	 * the right priority.
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) {
+		secure_irqs = SMP_IPI_FIQ_MASK;
+		writel_relaxed(~secure_irqs, dist_base + GIC_DIST_IGROUP + 0);
+		gic->igroup0_shadow = ~secure_irqs;
+		for_each_set_bit(secure_irq, &secure_irqs, 16)
+			gic_set_group_irq(gic, secure_irq, 0);
+	}
+
 	writel_relaxed(GICC_INT_PRI_THRESHOLD, base + GIC_CPU_PRIMASK);
 	gic_cpu_if_up();
 }
@@ -530,7 +653,8 @@ static void gic_dist_restore(unsigned int gic_nr)
 		writel_relaxed(gic_data[gic_nr].saved_spi_enable[i],
 			dist_base + GIC_DIST_ENABLE_SET + i * 4);
 
-	writel_relaxed(GICD_ENABLE, dist_base + GIC_DIST_CTRL);
+	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE,
+		       dist_base + GIC_DIST_CTRL);
 }
 
 static void gic_cpu_save(unsigned int gic_nr)
@@ -658,6 +782,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 {
 	int cpu;
 	unsigned long map = 0;
+	unsigned long softint;
 
 	gic_migration_lock();
 
@@ -671,8 +796,14 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	 */
 	dmb(ishst);
 
-	/* this always happens on GIC0 */
-	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
+	/* We avoid a readl here by using the shadow copy of IGROUP[0] */
+	softint = map << 16 | irq;
+	if (gic_data[0].igroup0_shadow & BIT(irq))
+		softint |= 0x8000;
+
+	/* This always happens on GIC0 */
+	writel_relaxed(softint,
+		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
 	gic_migration_unlock();
 }
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index 71d706d5f169..7690f70049a3 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -22,6 +22,10 @@
 #define GIC_CPU_IDENT			0xfc
 
 #define GICC_ENABLE			0x1
+#define GICC_ENABLE_GRP1		0x2
+#define GICC_ACK_CTL			0x4
+#define GICC_FIQ_EN			0x8
+#define GICC_COMMON_BPR			0x10
 #define GICC_INT_PRI_THRESHOLD		0xf0
 #define GICC_IAR_INT_ID_MASK		0x3ff
 #define GICC_INT_SPURIOUS		1023
@@ -44,6 +48,7 @@
 #define GIC_DIST_SGI_PENDING_SET	0xf20
 
 #define GICD_ENABLE			0x1
+#define GICD_ENABLE_GRP1		0x2
 #define GICD_DISABLE			0x0
 #define GICD_INT_ACTLOW_LVLTRIG		0x0
 #define GICD_INT_EN_CLR_X32		0xffffffff
@@ -121,5 +126,8 @@ static inline void __init register_routable_domain_ops
 {
 	gic_routable_irq_domain_ops = ops;
 }
+
+void gic_handle_fiq_ipi(void);
+
 #endif /* __ASSEMBLY */
 #endif
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 4.0-rc1 v17 3/6] irqchip: gic: Introduce plumbing for IPI FIQ
@ 2015-03-04 10:12     ` Daniel Thompson
  0 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-04 10:12 UTC (permalink / raw)
  To: linux-arm-kernel

Currently it is not possible to exploit FIQ for systems with a GIC, even if
the systems are otherwise capable of it. This patch makes it possible
for IPIs to be delivered using FIQ.

To do so it modifies the register state so that normal interrupts are
placed in group 1 and specific IPIs are placed into group 0. It also
configures the controller to raise group 0 interrupts using the FIQ
signal. It provides a means for architecture code to define which IPIs
shall use FIQ and to acknowledge any IPIs that are raised.

All GIC hardware except GICv1-without-TrustZone support provides a means
to group exceptions into group 0 and group 1, but the hardware
functionality is unavailable to the kernel when a secure monitor is
present because access to the grouping registers is prohibited outside
"secure world". However, when grouping is not available (or, in the case
of early GICv1 implementations, is very hard to configure) the code to
change groups does not deploy and all IPIs will be raised via IRQ.

It has been tested and shown working on two systems capable of
supporting grouping (Freescale i.MX6 and STiH416). It has also been
tested for boot regressions on two systems that do not support grouping
(vexpress-a9 and Qualcomm Snapdragon 600).

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Jon Medhurst <tixy@linaro.org>
---
 arch/arm/kernel/traps.c         |   5 +-
 drivers/irqchip/irq-gic.c       | 151 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 +++
 3 files changed, 153 insertions(+), 11 deletions(-)

diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 788e23fe64d8..b35e220ae1b1 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -26,6 +26,7 @@
 #include <linux/init.h>
 #include <linux/sched.h>
 #include <linux/irq.h>
+#include <linux/irqchip/arm-gic.h>
 
 #include <linux/atomic.h>
 #include <asm/cacheflush.h>
@@ -479,7 +480,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
 
 	nmi_enter();
 
-	/* nop. FIQ handlers for special arch/arm features can be added here. */
+#ifdef CONFIG_ARM_GIC
+	gic_handle_fiq_ipi();
+#endif
 
 	nmi_exit();
 
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 48d6296a365a..cd2e4f93675b 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -39,6 +39,7 @@
 #include <linux/slab.h>
 #include <linux/irqchip/chained_irq.h>
 #include <linux/irqchip/arm-gic.h>
+#include <linux/ratelimit.h>
 
 #include <asm/cputype.h>
 #include <asm/irq.h>
@@ -48,6 +49,10 @@
 #include "irq-gic-common.h"
 #include "irqchip.h"
 
+#ifndef SMP_IPI_FIQ_MASK
+#define SMP_IPI_FIQ_MASK 0
+#endif
+
 union gic_base {
 	void __iomem *common_base;
 	void __percpu * __iomem *percpu_base;
@@ -65,6 +70,7 @@ struct gic_chip_data {
 #endif
 	struct irq_domain *domain;
 	unsigned int gic_irqs;
+	u32 igroup0_shadow;
 #ifdef CONFIG_GIC_NON_BANKED
 	void __iomem *(*get_base)(union gic_base *);
 #endif
@@ -351,6 +357,83 @@ static struct irq_chip gic_chip = {
 	.irq_set_wake		= gic_set_wake,
 };
 
+/*
+ * Shift an interrupt between Group 0 and Group 1.
+ *
+ * In addition to changing the group we also modify the priority to
+ * match what "ARM strongly recommends" for a system where no Group 1
+ * interrupt must ever preempt a Group 0 interrupt.
+ *
+ * It is safe to call this function on systems which do not support
+ * grouping (it will have no effect).
+ */
+static void gic_set_group_irq(struct gic_chip_data *gic, unsigned int hwirq,
+			      int group)
+{
+	void __iomem *base = gic_data_dist_base(gic);
+	unsigned int grp_reg = hwirq / 32 * 4;
+	u32 grp_mask = BIT(hwirq % 32);
+	u32 grp_val;
+
+	unsigned int pri_reg = (hwirq / 4) * 4;
+	u32 pri_mask = BIT(7 + ((hwirq % 4) * 8));
+	u32 pri_val;
+
+	/*
+	 * Systems which do not support grouping will not have
+	 * the EnableGrp1 bit set.
+	 */
+	if (!(GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL)))
+		return;
+
+	raw_spin_lock(&irq_controller_lock);
+
+	grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg);
+	pri_val = readl_relaxed(base + GIC_DIST_PRI + pri_reg);
+
+	if (group) {
+		grp_val |= grp_mask;
+		pri_val |= pri_mask;
+	} else {
+		grp_val &= ~grp_mask;
+		pri_val &= ~pri_mask;
+	}
+
+	writel_relaxed(grp_val, base + GIC_DIST_IGROUP + grp_reg);
+	if (grp_reg == 0)
+		gic->igroup0_shadow = grp_val;
+
+	writel_relaxed(pri_val, base + GIC_DIST_PRI + pri_reg);
+
+	raw_spin_unlock(&irq_controller_lock);
+}
+
+
+/*
+ * Fully acknowledge (both ack and eoi) any outstanding FIQ-based IPI,
+ * otherwise do nothing.
+ */
+void gic_handle_fiq_ipi(void)
+{
+	struct gic_chip_data *gic = &gic_data[0];
+	void __iomem *cpu_base = gic_data_cpu_base(gic);
+	unsigned long irqstat, irqnr;
+
+	if (WARN_ON(!in_nmi()))
+		return;
+
+	while ((1u << readl_relaxed(cpu_base + GIC_CPU_HIGHPRI)) &
+	       SMP_IPI_FIQ_MASK) {
+		irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK);
+		writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);
+
+		irqnr = irqstat & GICC_IAR_INT_ID_MASK;
+		WARN_RATELIMIT(irqnr > 16,
+			       "Unexpected irqnr %lu (bad prioritization?)\n",
+			       irqnr);
+	}
+}
+
 void __init gic_cascade_irq(unsigned int gic_nr, unsigned int irq)
 {
 	if (gic_nr >= MAX_GIC_NR)
@@ -382,15 +465,24 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
 static void gic_cpu_if_up(void)
 {
 	void __iomem *cpu_base = gic_data_cpu_base(&gic_data[0]);
-	u32 bypass = 0;
+	void __iomem *dist_base = gic_data_dist_base(&gic_data[0]);
+	u32 ctrl = 0;
 
 	/*
-	* Preserve bypass disable bits to be written back later
-	*/
-	bypass = readl(cpu_base + GIC_CPU_CTRL);
-	bypass &= GICC_DIS_BYPASS_MASK;
+	 * Preserve bypass disable bits to be written back later
+	 */
+	ctrl = readl(cpu_base + GIC_CPU_CTRL);
+	ctrl &= GICC_DIS_BYPASS_MASK;
 
-	writel_relaxed(bypass | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
+	/*
+	 * If EnableGrp1 is set in the distributor then enable group 1
+	 * support for this CPU (and route group 0 interrupts to FIQ).
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL))
+		ctrl |= GICC_COMMON_BPR | GICC_FIQ_EN | GICC_ACK_CTL |
+			GICC_ENABLE_GRP1;
+
+	writel_relaxed(ctrl | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
 }
 
 
@@ -414,7 +506,23 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
 
 	gic_dist_config(base, gic_irqs, NULL);
 
-	writel_relaxed(GICD_ENABLE, base + GIC_DIST_CTRL);
+	/*
+	 * Set EnableGrp1/EnableGrp0 (bit 1 and 0) or EnableGrp (bit 0 only,
+	 * bit 1 ignored) depending on current mode.
+	 */
+	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, base + GIC_DIST_CTRL);
+
+	/*
+	 * Set all global interrupts to be group 1 if (and only if) it
+	 * is possible to enable group 1 interrupts. This register is RAZ/WI
+	 * if not accessible or not implemented, however some GICv1 devices
+	 * do not implement the EnableGrp1 bit making it unsafe to set
+	 * this register unconditionally.
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL))
+		for (i = 32; i < gic_irqs; i += 32)
+			writel_relaxed(0xffffffff,
+				       base + GIC_DIST_IGROUP + i * 4 / 32);
 }
 
 static void gic_cpu_init(struct gic_chip_data *gic)
@@ -423,6 +531,7 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 	void __iomem *base = gic_data_cpu_base(gic);
 	unsigned int cpu_mask, cpu = smp_processor_id();
 	int i;
+	unsigned long secure_irqs, secure_irq;
 
 	/*
 	 * Get what the GIC says our CPU mask is.
@@ -441,6 +550,20 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 
 	gic_cpu_config(dist_base, NULL);
 
+	/*
+	 * If the distributor is configured to support interrupt grouping
+	 * then set any PPI and SGI interrupts not set in SMP_IPI_FIQ_MASK
+	 * to be group1 and ensure any remaining group 0 interrupts have
+	 * the right priority.
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) {
+		secure_irqs = SMP_IPI_FIQ_MASK;
+		writel_relaxed(~secure_irqs, dist_base + GIC_DIST_IGROUP + 0);
+		gic->igroup0_shadow = ~secure_irqs;
+		for_each_set_bit(secure_irq, &secure_irqs, 16)
+			gic_set_group_irq(gic, secure_irq, 0);
+	}
+
 	writel_relaxed(GICC_INT_PRI_THRESHOLD, base + GIC_CPU_PRIMASK);
 	gic_cpu_if_up();
 }
@@ -530,7 +653,8 @@ static void gic_dist_restore(unsigned int gic_nr)
 		writel_relaxed(gic_data[gic_nr].saved_spi_enable[i],
 			dist_base + GIC_DIST_ENABLE_SET + i * 4);
 
-	writel_relaxed(GICD_ENABLE, dist_base + GIC_DIST_CTRL);
+	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE,
+		       dist_base + GIC_DIST_CTRL);
 }
 
 static void gic_cpu_save(unsigned int gic_nr)
@@ -658,6 +782,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 {
 	int cpu;
 	unsigned long map = 0;
+	unsigned long softint;
 
 	gic_migration_lock();
 
@@ -671,8 +796,14 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	 */
 	dmb(ishst);
 
-	/* this always happens on GIC0 */
-	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
+	/* We avoid a readl here by using the shadow copy of IGROUP[0] */
+	softint = map << 16 | irq;
+	if (gic_data[0].igroup0_shadow & BIT(irq))
+		softint |= 0x8000;
+
+	/* This always happens on GIC0 */
+	writel_relaxed(softint,
+		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
 	gic_migration_unlock();
 }
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index 71d706d5f169..7690f70049a3 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -22,6 +22,10 @@
 #define GIC_CPU_IDENT			0xfc
 
 #define GICC_ENABLE			0x1
+#define GICC_ENABLE_GRP1		0x2
+#define GICC_ACK_CTL			0x4
+#define GICC_FIQ_EN			0x8
+#define GICC_COMMON_BPR			0x10
 #define GICC_INT_PRI_THRESHOLD		0xf0
 #define GICC_IAR_INT_ID_MASK		0x3ff
 #define GICC_INT_SPURIOUS		1023
@@ -44,6 +48,7 @@
 #define GIC_DIST_SGI_PENDING_SET	0xf20
 
 #define GICD_ENABLE			0x1
+#define GICD_ENABLE_GRP1		0x2
 #define GICD_DISABLE			0x0
 #define GICD_INT_ACTLOW_LVLTRIG		0x0
 #define GICD_INT_EN_CLR_X32		0xffffffff
@@ -121,5 +126,8 @@ static inline void __init register_routable_domain_ops
 {
 	gic_routable_irq_domain_ops = ops;
 }
+
+void gic_handle_fiq_ipi(void);
+
 #endif /* __ASSEMBLY */
 #endif
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 4.0-rc1 v17 4/6] printk: Simple implementation for NMI backtracing
  2015-03-04 10:12   ` Daniel Thompson
@ 2015-03-04 10:12     ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-04 10:12 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

Currently there is quite a pile of code sitting in
arch/x86/kernel/apic/hw_nmi.c to support safe all-cpu backtracing from NMI.
The code is inaccessible to backtrace implementations for other
architectures, which is a shame because they would probably like to be
safe too.

Copy this code into printk. We'll port the x86 NMI backtrace to it in a
later patch.

Incidentally, I think it might technically be safe to call
printk_nmi_prepare() from NMI, provided care is taken to honour the
return code. printk_nmi_complete() cannot be called from NMI but could
be scheduled using irq_work_queue(). However, honouring the return code
means that sometimes it is impossible to get the message out, so in most
cases I'd say using this code in such a way should probably attract
sympathy and/or derision rather than admiration.
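
For clarity, the intended calling sequence for the new hooks is roughly
the sketch below (the surrounding trigger/send logic is only indicative;
see the x86 port in the following patch for a real user):

	if (printk_nmi_prepare() != 0)
		return;		/* someone else is already mid-dump */

	/* ...send NMIs to the CPUs we want traces from... */

	/* In each CPU's NMI handler: */
	printk_nmi_this_cpu_begin();
	printk("NMI backtrace for cpu %d\n", smp_processor_id());
	show_regs(regs);
	printk_nmi_this_cpu_end();

	/* Later, from task or irq_work context, once all CPUs are done: */
	printk_nmi_complete();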

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/printk.h |  18 ++++++
 init/Kconfig           |   3 +
 kernel/printk/printk.c | 149 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 170 insertions(+)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index baa3f97d8ce8..7fd94e644976 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -228,6 +228,24 @@ static inline void show_regs_print_info(const char *log_lvl)
 }
 #endif
 
+#ifdef CONFIG_PRINTK_NMI
+/*
+ * printk_nmi_prepare/complete are called to prepare the system for
+ * some or all cores to issue trace from NMI. printk_nmi_complete will
+ * print buffered output and cannot (safely) be called from NMI.
+ */
+extern int printk_nmi_prepare(void);
+extern void printk_nmi_complete(void);
+
+/*
+ * printk_nmi_this_cpu_begin/end are used to divert/restore printk on this
+ * cpu. The result is that the output of printk() (by this CPU) will be
+ * stored in temporary buffers for later printing by printk_nmi_complete.
+ */
+extern void printk_nmi_this_cpu_begin(void);
+extern void printk_nmi_this_cpu_end(void);
+#endif
+
 extern asmlinkage void dump_stack(void) __cold;
 
 #ifndef pr_fmt
diff --git a/init/Kconfig b/init/Kconfig
index f5dbc6d4261b..4f00d11ef0a4 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1421,6 +1421,9 @@ config PRINTK
 	  very difficult to diagnose system problems, saying N here is
 	  strongly discouraged.
 
+config PRINTK_NMI
+	bool
+
 config BUG
 	bool "BUG() support" if EXPERT
 	default y
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 01cfd69c54c6..291271300cd5 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1829,6 +1829,155 @@ EXPORT_SYMBOL_GPL(vprintk_default);
  */
 DEFINE_PER_CPU(printk_func_t, printk_func) = vprintk_default;
 
+#ifdef CONFIG_PRINTK_NMI
+
+#define NMI_BUF_SIZE		4096
+
+struct nmi_seq_buf {
+	unsigned char		buffer[NMI_BUF_SIZE];
+	struct seq_buf		seq;
+};
+
+/* Safe printing in NMI context */
+static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);
+
+static DEFINE_PER_CPU(printk_func_t, nmi_print_saved_print_func);
+
+/* "in progress" flag of NMI printing */
+static unsigned long nmi_print_flag;
+
+static int __init printk_nmi_init(void)
+{
+	struct nmi_seq_buf *s;
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		s = &per_cpu(nmi_print_seq, cpu);
+		seq_buf_init(&s->seq, s->buffer, NMI_BUF_SIZE);
+	}
+
+	return 0;
+}
+pure_initcall(printk_nmi_init);
+
+/*
+ * It is not safe to call printk() directly from NMI handlers.
+ * It may be fine if the NMI detected a lock up and we have no choice
+ * but to do so, but doing a NMI on all other CPUs to get a back trace
+ * can be done with a sysrq-l. We don't want that to lock up, which
+ * can happen if the NMI interrupts a printk in progress.
+ *
+ * Instead, we redirect the vprintk() to this nmi_vprintk() that writes
+ * the content into a per cpu seq_buf buffer. Then when the NMIs are
+ * all done, we can safely dump the contents of the seq_buf to a printk()
+ * from a non NMI context.
+ *
+ * This is not a generic printk() implementation and must be used with
+ * great care. In particular there is a static limit on the quantity of
+ * data that may be emitted during NMI; only one client can be active at
+ * one time (arbitrated by the return value of printk_nmi_prepare()); and
+ * it is required that something at task or interrupt context be scheduled
+ * to issue the output.
+ */
+static int nmi_vprintk(const char *fmt, va_list args)
+{
+	struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
+	unsigned int len = seq_buf_used(&s->seq);
+
+	seq_buf_vprintf(&s->seq, fmt, args);
+	return seq_buf_used(&s->seq) - len;
+}
+
+/*
+ * Reserve the NMI printk mechanism. Return an error if some other component
+ * is already using it.
+ */
+int printk_nmi_prepare(void)
+{
+	if (test_and_set_bit(0, &nmi_print_flag)) {
+		/*
+		 * If something is already using the NMI print facility we
+		 * can't allow a second one...
+		 */
+		return -EBUSY;
+	}
+
+	return 0;
+}
+
+static void print_seq_line(struct nmi_seq_buf *s, int start, int end)
+{
+	const char *buf = s->buffer + start;
+
+	printk("%.*s", (end - start) + 1, buf);
+}
+
+void printk_nmi_complete(void)
+{
+	struct nmi_seq_buf *s;
+	int len, cpu, i, last_i;
+
+	/*
+	 * Now that all the NMIs have triggered, we can dump out their
+	 * back traces safely to the console.
+	 */
+	for_each_possible_cpu(cpu) {
+		s = &per_cpu(nmi_print_seq, cpu);
+		last_i = 0;
+
+		len = seq_buf_used(&s->seq);
+		if (!len)
+			continue;
+
+		/* Print line by line. */
+		for (i = 0; i < len; i++) {
+			if (s->buffer[i] == '\n') {
+				print_seq_line(s, last_i, i);
+				last_i = i + 1;
+			}
+		}
+		/* Check if there was a partial line. */
+		if (last_i < len) {
+			print_seq_line(s, last_i, len - 1);
+			pr_cont("\n");
+		}
+
+		/* Wipe out the buffer ready for the next time around. */
+		seq_buf_clear(&s->seq);
+	}
+
+	clear_bit(0, &nmi_print_flag);
+	smp_mb__after_atomic();
+}
+
+void printk_nmi_this_cpu_begin(void)
+{
+	/*
+	 * Detect double-begins and report them. This code is unsafe (because
+	 * it will print from NMI) but things are pretty badly damaged if the
+	 * NMI re-enters and is somehow granted permission to use NMI printk,
+	 * so how much worse can it get? Also since this code interferes with
+	 * the operation of printk it is unlikely that any consequential
+	 * failures will be able to log anything making this our last
+	 * opportunity to tell anyone that something is wrong.
+	 */
+	if (this_cpu_read(nmi_print_saved_print_func)) {
+		this_cpu_write(printk_func, vprintk_default);
+		BUG();
+	}
+
+	this_cpu_write(nmi_print_saved_print_func, this_cpu_read(printk_func));
+	this_cpu_write(printk_func, nmi_vprintk);
+}
+
+void printk_nmi_this_cpu_end(void)
+{
+	this_cpu_write(printk_func, this_cpu_read(nmi_print_saved_print_func));
+	this_cpu_write(nmi_print_saved_print_func, NULL);
+}
+
+#endif /* CONFIG_PRINTK_NMI */
+
 /**
  * printk - print a kernel message
  * @fmt: format string
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 4.0-rc1 v17 5/6] x86/nmi: Use common printk functions
  2015-03-04 10:12   ` Daniel Thompson
@ 2015-03-04 10:12     ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-04 10:12 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander, Ingo Molnar, H. Peter Anvin, x86

Much of the code sitting in arch/x86/kernel/apic/hw_nmi.c to support safe
all-cpu backtracing from NMI has been copied to printk.c to make it
accessible to other architectures.

Port the x86 NMI backtrace to the generic code.
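
The net effect on the per-CPU handler is easiest to see in isolation;
after the port it reduces to roughly the following (simplified from the
hw_nmi.c hunk below):

	printk_nmi_this_cpu_begin();
	printk(KERN_WARNING "NMI backtrace for cpu %d\n", cpu);
	show_regs(regs);
	printk_nmi_this_cpu_end();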

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
---
 arch/x86/Kconfig              |   1 +
 arch/x86/kernel/apic/hw_nmi.c | 101 +++---------------------------------------
 2 files changed, 8 insertions(+), 94 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index c2fb8a87dccb..fbae5564a1f3 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -141,6 +141,7 @@ config X86
 	select ACPI_LEGACY_TABLES_LOOKUP if ACPI
 	select X86_FEATURE_NAMES if PROC_FS
 	select SRCU
+	select PRINTK_NMI if X86_LOCAL_APIC
 
 config INSTRUCTION_DECODER
 	def_bool y
diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index 6873ab925d00..8bc00476011d 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -30,40 +30,16 @@ u64 hw_nmi_get_sample_period(int watchdog_thresh)
 #ifdef arch_trigger_all_cpu_backtrace
 /* For reliability, we're prepared to waste bits here. */
 static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
-static cpumask_t printtrace_mask;
-
-#define NMI_BUF_SIZE		4096
-
-struct nmi_seq_buf {
-	unsigned char		buffer[NMI_BUF_SIZE];
-	struct seq_buf		seq;
-};
-
-/* Safe printing in NMI context */
-static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);
-
-/* "in progress" flag of arch_trigger_all_cpu_backtrace */
-static unsigned long backtrace_flag;
-
-static void print_seq_line(struct nmi_seq_buf *s, int start, int end)
-{
-	const char *buf = s->buffer + start;
-
-	printk("%.*s", (end - start) + 1, buf);
-}
 
 void arch_trigger_all_cpu_backtrace(bool include_self)
 {
-	struct nmi_seq_buf *s;
-	int len;
-	int cpu;
 	int i;
 	int this_cpu = get_cpu();
 
-	if (test_and_set_bit(0, &backtrace_flag)) {
+	if (0 != printk_nmi_prepare()) {
 		/*
-		 * If there is already a trigger_all_cpu_backtrace() in progress
-		 * (backtrace_flag == 1), don't output double cpu dump infos.
+		 * If there is already an nmi printk sequence in
+		 * progress then just give up...
 		 */
 		put_cpu();
 		return;
@@ -73,16 +49,6 @@ void arch_trigger_all_cpu_backtrace(bool include_self)
 	if (!include_self)
 		cpumask_clear_cpu(this_cpu, to_cpumask(backtrace_mask));
 
-	cpumask_copy(&printtrace_mask, to_cpumask(backtrace_mask));
-	/*
-	 * Set up per_cpu seq_buf buffers that the NMIs running on the other
-	 * CPUs will write to.
-	 */
-	for_each_cpu(cpu, to_cpumask(backtrace_mask)) {
-		s = &per_cpu(nmi_print_seq, cpu);
-		seq_buf_init(&s->seq, s->buffer, NMI_BUF_SIZE);
-	}
-
 	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
 		pr_info("sending NMI to %s CPUs:\n",
 			(include_self ? "all" : "other"));
@@ -97,73 +63,20 @@ void arch_trigger_all_cpu_backtrace(bool include_self)
 		touch_softlockup_watchdog();
 	}
 
-	/*
-	 * Now that all the NMIs have triggered, we can dump out their
-	 * back traces safely to the console.
-	 */
-	for_each_cpu(cpu, &printtrace_mask) {
-		int last_i = 0;
-
-		s = &per_cpu(nmi_print_seq, cpu);
-		len = seq_buf_used(&s->seq);
-		if (!len)
-			continue;
-
-		/* Print line by line. */
-		for (i = 0; i < len; i++) {
-			if (s->buffer[i] == '\n') {
-				print_seq_line(s, last_i, i);
-				last_i = i + 1;
-			}
-		}
-		/* Check if there was a partial line. */
-		if (last_i < len) {
-			print_seq_line(s, last_i, len - 1);
-			pr_cont("\n");
-		}
-	}
-
-	clear_bit(0, &backtrace_flag);
-	smp_mb__after_atomic();
+	printk_nmi_complete();
 	put_cpu();
 }
 
-/*
- * It is not safe to call printk() directly from NMI handlers.
- * It may be fine if the NMI detected a lock up and we have no choice
- * but to do so, but doing a NMI on all other CPUs to get a back trace
- * can be done with a sysrq-l. We don't want that to lock up, which
- * can happen if the NMI interrupts a printk in progress.
- *
- * Instead, we redirect the vprintk() to this nmi_vprintk() that writes
- * the content into a per cpu seq_buf buffer. Then when the NMIs are
- * all done, we can safely dump the contents of the seq_buf to a printk()
- * from a non NMI context.
- */
-static int nmi_vprintk(const char *fmt, va_list args)
-{
-	struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
-	unsigned int len = seq_buf_used(&s->seq);
-
-	seq_buf_vprintf(&s->seq, fmt, args);
-	return seq_buf_used(&s->seq) - len;
-}
-
 static int
 arch_trigger_all_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)
 {
-	int cpu;
-
-	cpu = smp_processor_id();
+	int cpu = smp_processor_id();
 
 	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
-		printk_func_t printk_func_save = this_cpu_read(printk_func);
-
-		/* Replace printk to write into the NMI seq */
-		this_cpu_write(printk_func, nmi_vprintk);
+		printk_nmi_this_cpu_begin();
 		printk(KERN_WARNING "NMI backtrace for cpu %d\n", cpu);
 		show_regs(regs);
-		this_cpu_write(printk_func, printk_func_save);
+		printk_nmi_this_cpu_end();
 
 		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
 		return NMI_HANDLED;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 4.0-rc1 v17 6/6] ARM: Add support for on-demand backtrace of other CPUs
  2015-03-04 10:12   ` Daniel Thompson
@ 2015-03-04 10:12     ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-04 10:12 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Daniel Thompson, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

Replicate the x86 code to trigger a backtrace using an NMI and hook
it up to IPI on ARM.

The code differs slightly from the code on x86 because, on ARM, we do
not know at compile time whether a platform is capable of supporting FIQ.
We must avoid using an IPI to request a backtrace from the CPU on which
the backtrace was requested if interrupts are disabled and fall back to
generating it directly.
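
Concretely, the self-IPI avoidance is just a check of this shape
(simplified from the arch/arm/kernel/smp.c hunk below):

	if (irqs_disabled()) {
		/* We would never see our own IPI; trace in place instead. */
		ipi_cpu_backtrace(in_interrupt() ? get_irq_regs() : NULL);
		include_self = false;
	}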

In addition to the implementation of arch_trigger_all_cpu_backtrace(),
the patch also includes a few small items of plumbing that must be
hooked up for the new code to work.

Credit:
  Russell King provided the initial prototype implementing this feature
  for ARM. Today the patch has been reworked and, mostly, rewritten to
  keep it aligned with x86. However this patch does still include some
  code from Russell's original prototype.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
 arch/arm/Kconfig               |  1 +
 arch/arm/include/asm/hardirq.h |  2 +-
 arch/arm/include/asm/irq.h     |  5 +++
 arch/arm/include/asm/smp.h     |  3 ++
 arch/arm/kernel/smp.c          | 80 ++++++++++++++++++++++++++++++++++++++++++
 arch/arm/kernel/traps.c        |  3 ++
 6 files changed, 93 insertions(+), 1 deletion(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 9f1f09a2bc9b..2bb2e3cc660d 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -76,6 +76,7 @@ config ARM
 	select OLD_SIGACTION
 	select OLD_SIGSUSPEND3
 	select PERF_USE_VMALLOC
+	select PRINTK_NMI
 	select RTC_LIB
 	select SYS_SUPPORTS_APM_EMULATION
 	# Above selects are sorted alphabetically; please add new ones
diff --git a/arch/arm/include/asm/hardirq.h b/arch/arm/include/asm/hardirq.h
index fe3ea776dc34..5df33e30ae1b 100644
--- a/arch/arm/include/asm/hardirq.h
+++ b/arch/arm/include/asm/hardirq.h
@@ -5,7 +5,7 @@
 #include <linux/threads.h>
 #include <asm/irq.h>
 
-#define NR_IPI	8
+#define NR_IPI	9
 
 typedef struct {
 	unsigned int __softirq_pending;
diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h
index 53c15dec7af6..be1d07d59ee9 100644
--- a/arch/arm/include/asm/irq.h
+++ b/arch/arm/include/asm/irq.h
@@ -35,6 +35,11 @@ extern void (*handle_arch_irq)(struct pt_regs *);
 extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
 #endif
 
+#ifdef CONFIG_SMP
+extern void arch_trigger_all_cpu_backtrace(bool);
+#define arch_trigger_all_cpu_backtrace(x) arch_trigger_all_cpu_backtrace(x)
+#endif
+
 #endif
 
 #endif
diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
index 18f5a554134f..b076584ac0fa 100644
--- a/arch/arm/include/asm/smp.h
+++ b/arch/arm/include/asm/smp.h
@@ -18,6 +18,8 @@
 # error "<asm/smp.h> included in non-SMP build"
 #endif
 
+#define SMP_IPI_FIQ_MASK 0x0100
+
 #define raw_smp_processor_id() (current_thread_info()->cpu)
 
 struct seq_file;
@@ -79,6 +81,7 @@ extern void arch_send_call_function_single_ipi(int cpu);
 extern void arch_send_call_function_ipi_mask(const struct cpumask *mask);
 extern void arch_send_wakeup_ipi_mask(const struct cpumask *mask);
 
+extern void ipi_cpu_backtrace(struct pt_regs *regs);
 extern int register_ipi_completion(struct completion *completion, int cpu);
 
 struct smp_operations {
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 86ef244c5a24..828277c4c248 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -26,6 +26,7 @@
 #include <linux/completion.h>
 #include <linux/cpufreq.h>
 #include <linux/irq_work.h>
+#include <linux/seq_buf.h>
 
 #include <linux/atomic.h>
 #include <asm/smp.h>
@@ -72,6 +73,7 @@ enum ipi_msg_type {
 	IPI_CPU_STOP,
 	IPI_IRQ_WORK,
 	IPI_COMPLETION,
+	IPI_CPU_BACKTRACE,
 };
 
 static DECLARE_COMPLETION(cpu_running);
@@ -456,6 +458,7 @@ static const char *ipi_types[NR_IPI] __tracepoint_string = {
 	S(IPI_CPU_STOP, "CPU stop interrupts"),
 	S(IPI_IRQ_WORK, "IRQ work interrupts"),
 	S(IPI_COMPLETION, "completion interrupts"),
+	S(IPI_CPU_BACKTRACE, "backtrace interrupts"),
 };
 
 static void smp_cross_call(const struct cpumask *target, unsigned int ipinr)
@@ -570,6 +573,8 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 	unsigned int cpu = smp_processor_id();
 	struct pt_regs *old_regs = set_irq_regs(regs);
 
+	BUILD_BUG_ON(SMP_IPI_FIQ_MASK != BIT(IPI_CPU_BACKTRACE));
+
 	if ((unsigned)ipinr < NR_IPI) {
 		trace_ipi_entry(ipi_types[ipinr]);
 		__inc_irq_stat(cpu, ipi_irqs[ipinr]);
@@ -623,6 +628,12 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 		irq_exit();
 		break;
 
+	case IPI_CPU_BACKTRACE:
+		irq_enter();
+		ipi_cpu_backtrace(regs);
+		irq_exit();
+		break;
+
 	default:
 		pr_crit("CPU%u: Unknown IPI message 0x%x\n",
 		        cpu, ipinr);
@@ -717,3 +728,72 @@ static int __init register_cpufreq_notifier(void)
 core_initcall(register_cpufreq_notifier);
 
 #endif
+
+/* For reliability, we're prepared to waste bits here. */
+static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
+
+void arch_trigger_all_cpu_backtrace(bool include_self)
+{
+	int i;
+	int this_cpu = get_cpu();
+
+	if (0 != printk_nmi_prepare()) {
+		/*
+		 * If there is already an nmi printk sequence in
+		 * progress then just give up...
+		 */
+		put_cpu();
+		return;
+	}
+
+	cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask);
+
+	/*
+	 * If irqs are disabled on the current processor then, if
+	 * IPI_CPU_BACKTRACE is delivered using IRQ, we won't be able to
+	 * react to IPI_CPU_BACKTRACE until we leave this function. We avoid
+	 * the potential timeout (not to mention the failure to print useful
+	 * information) by calling the backtrace directly.
+	 */
+	if (irqs_disabled()) {
+		ipi_cpu_backtrace(in_interrupt() ? get_irq_regs() : NULL);
+		include_self = false;
+	}
+
+	if (!include_self)
+		cpumask_clear_cpu(this_cpu, to_cpumask(backtrace_mask));
+
+	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
+		pr_info("Sending FIQ to %s CPUs:\n",
+			(include_self ? "all" : "other"));
+		smp_cross_call(to_cpumask(backtrace_mask), IPI_CPU_BACKTRACE);
+	}
+
+	/* Wait for up to 10 seconds for all CPUs to do the backtrace */
+	for (i = 0; i < 10 * 1000; i++) {
+		if (cpumask_empty(to_cpumask(backtrace_mask)))
+			break;
+		mdelay(1);
+		touch_softlockup_watchdog();
+	}
+
+	printk_nmi_complete();
+	put_cpu();
+}
+
+void ipi_cpu_backtrace(struct pt_regs *regs)
+{
+	int cpu = smp_processor_id();
+
+	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
+		printk_nmi_this_cpu_begin();
+		pr_warn("FIQ backtrace for cpu %d\n", cpu);
+		if (regs != NULL)
+			show_regs(regs);
+		else
+			dump_stack();
+		printk_nmi_this_cpu_end();
+
+		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
+	}
+}
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index b35e220ae1b1..1836415b8a5c 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -483,6 +483,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
 #ifdef CONFIG_ARM_GIC
 	gic_handle_fiq_ipi();
 #endif
+#ifdef CONFIG_SMP
+	ipi_cpu_backtrace(regs);
+#endif
 
 	nmi_exit();
 
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* Re: [PATCH 4.0-rc1 v17 4/6] printk: Simple implementation for NMI backtracing
  2015-03-04 10:12     ` Daniel Thompson
@ 2015-03-04 16:13       ` Joe Perches
  -1 siblings, 0 replies; 94+ messages in thread
From: Joe Perches @ 2015-03-04 16:13 UTC (permalink / raw)
  To: Daniel Thompson
  Cc: Thomas Gleixner, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

On Wed, 2015-03-04 at 10:12 +0000, Daniel Thompson wrote:
> Currently there is a quite a pile of code sitting in
> arch/x86/kernel/apic/hw_nmi.c to support safe all-cpu backtracing from NMI.
> The code is inaccessible to backtrace implementations for other
> architectures, which is a shame because they would probably like to be
> safe too.
> 
> Copy this code into printk. We'll port the x86 NMI backtrace to it in a
> later patch.

I think this would be better as a separate file
rather than increasing the bulk of printk.c



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 4.0-rc1 v17 4/6] printk: Simple implementation for NMI backtracing
  2015-03-04 16:13       ` Joe Perches
@ 2015-03-04 16:20         ` Steven Rostedt
  -1 siblings, 0 replies; 94+ messages in thread
From: Steven Rostedt @ 2015-03-04 16:20 UTC (permalink / raw)
  To: Joe Perches
  Cc: Daniel Thompson, Thomas Gleixner, Jason Cooper, Russell King,
	Will Deacon, Catalin Marinas, Marc Zyngier, Stephen Boyd,
	John Stultz, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

On Wed, 04 Mar 2015 08:13:21 -0800
Joe Perches <joe@perches.com> wrote:

> On Wed, 2015-03-04 at 10:12 +0000, Daniel Thompson wrote:
> > Currently there is a quite a pile of code sitting in
> > arch/x86/kernel/apic/hw_nmi.c to support safe all-cpu backtracing from NMI.
> > The code is inaccessible to backtrace implementations for other
> > architectures, which is a shame because they would probably like to be
> > safe too.
> > 
> > Copy this code into printk. We'll port the x86 NMI backtrace to it in a
> > later patch.
> 
> I think this would be better as a separate file
> rather than increasing the bulk of printk.c
> 

I agree, as printk already has its own directory. Perhaps a
"nmi_backtrace.c"?

-- Steve

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 4.0-rc1 v17 4/6] printk: Simple implementation for NMI backtracing
  2015-03-04 16:20         ` Steven Rostedt
@ 2015-03-04 16:33           ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-04 16:33 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Joe Perches, Thomas Gleixner, Jason Cooper, Russell King,
	Will Deacon, Catalin Marinas, Marc Zyngier, Stephen Boyd,
	John Stultz, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

On Wed, 2015-03-04 at 11:20 -0500, Steven Rostedt wrote:
> On Wed, 04 Mar 2015 08:13:21 -0800
> Joe Perches <joe@perches.com> wrote:
> 
> > On Wed, 2015-03-04 at 10:12 +0000, Daniel Thompson wrote:
> > > Currently there is a quite a pile of code sitting in
> > > arch/x86/kernel/apic/hw_nmi.c to support safe all-cpu backtracing from NMI.
> > > The code is inaccessible to backtrace implementations for other
> > > architectures, which is a shame because they would probably like to be
> > > safe too.
> > > 
> > > Copy this code into printk. We'll port the x86 NMI backtrace to it in a
> > > later patch.
> > 
> > I think this would be better as a separate file
> > rather than increasing the bulk of printk.c
> > 
> 
> I agree, as printk already has its own directory. Perhaps a
> "nmi_backtrace.c"?

I agree on moving the code. However, after Thomas' review I made sure
all the external symbols were prefixed printk_nmi (and, as a result of
the same review, I started using CONFIG_PRINTK_NMI to enable/disable the
feature). For that reason I'm much more inclined to name it
"printk_nmi.c". Any objections?

I know it is a somewhat generic name but I'll move the comment text that
commences "This is not a generic printk() implementation and must be
used with great care. In particular..." to the top of the file to make
clear the limitations of this code.


Daniel.


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 4.0-rc1 v17 4/6] printk: Simple implementation for NMI backtracing
  2015-03-04 16:33           ` Daniel Thompson
@ 2015-03-04 17:21             ` Joe Perches
  -1 siblings, 0 replies; 94+ messages in thread
From: Joe Perches @ 2015-03-04 17:21 UTC (permalink / raw)
  To: Daniel Thompson
  Cc: Steven Rostedt, Thomas Gleixner, Jason Cooper, Russell King,
	Will Deacon, Catalin Marinas, Marc Zyngier, Stephen Boyd,
	John Stultz, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

On Wed, 2015-03-04 at 16:33 +0000, Daniel Thompson wrote:
> On Wed, 2015-03-04 at 11:20 -0500, Steven Rostedt wrote:
> > On Wed, 04 Mar 2015 08:13:21 -0800
> > Joe Perches <joe@perches.com> wrote:
> > 
> > > On Wed, 2015-03-04 at 10:12 +0000, Daniel Thompson wrote:
> > > > Currently there is a quite a pile of code sitting in
> > > > arch/x86/kernel/apic/hw_nmi.c to support safe all-cpu backtracing from NMI.
> > > > The code is inaccessible to backtrace implementations for other
> > > > architectures, which is a shame because they would probably like to be
> > > > safe too.
> > > > 
> > > > Copy this code into printk. We'll port the x86 NMI backtrace to it in a
> > > > later patch.
> > > 
> > > I think this would be better as a separate file
> > > rather than increasing the bulk of printk.c
> > > 
> > I agree, as printk already has its own directory. Perhaps a
> > "nmi_backtrace.c"?
> 
> I agree on moving the code. However, after Thomas' review I made sure
> all the external symbols were prefixed printk_nmi and, as a result of
> the same review I started using CONFIG_PRINTK_NMI to enable/disable the
> feature). For that reason I'm much more inclined to name it
> "printk_nmi.c". Any objections?

Steven's suggestion seems more sensible.

sed 's/CONFIG_PRINTK_NMI/CONFIG_PRINTK_NMI_BACKTRACE/g'
sed 's/printk_nmi/printk_nmib/g'
or
sed 's/printk_nmi/nmi_backtrace/g'

might work well.



^ permalink raw reply	[flat|nested] 94+ messages in thread


* Re: [PATCH 4.0-rc1 v17 5/6] x86/nmi: Use common printk functions
  2015-03-04 10:12     ` Daniel Thompson
@ 2015-03-05  0:54       ` Ingo Molnar
  -1 siblings, 0 replies; 94+ messages in thread
From: Ingo Molnar @ 2015-03-05  0:54 UTC (permalink / raw)
  To: Daniel Thompson
  Cc: Thomas Gleixner, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander, Ingo Molnar, H. Peter Anvin, x86


* Daniel Thompson <daniel.thompson@linaro.org> wrote:

> Much of the code sitting in arch/x86/kernel/apic/hw_nmi.c to support 
> safe all-cpu backtracing from NMI has been copied to printk.c to 
> make it accessible to other architectures.
> 
> Port the x86 NMI backtrace to the generic code.

Is there any difference between the generic and the x86 code as they 
stand today?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 94+ messages in thread


* Re: [PATCH 4.0-rc1 v17 4/6] printk: Simple implementation for NMI backtracing
  2015-03-04 17:21             ` Joe Perches
@ 2015-03-05 12:11               ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-05 12:11 UTC (permalink / raw)
  To: Joe Perches
  Cc: Steven Rostedt, Thomas Gleixner, Jason Cooper, Russell King,
	Will Deacon, Catalin Marinas, Marc Zyngier, Stephen Boyd,
	John Stultz, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander

On Wed, 2015-03-04 at 09:21 -0800, Joe Perches wrote:
> On Wed, 2015-03-04 at 16:33 +0000, Daniel Thompson wrote:
> > On Wed, 2015-03-04 at 11:20 -0500, Steven Rostedt wrote:
> > > On Wed, 04 Mar 2015 08:13:21 -0800
> > > Joe Perches <joe@perches.com> wrote:
> > > 
> > > > On Wed, 2015-03-04 at 10:12 +0000, Daniel Thompson wrote:
> > > > > Currently there is a quite a pile of code sitting in
> > > > > arch/x86/kernel/apic/hw_nmi.c to support safe all-cpu backtracing from NMI.
> > > > > The code is inaccessible to backtrace implementations for other
> > > > > architectures, which is a shame because they would probably like to be
> > > > > safe too.
> > > > > 
> > > > > Copy this code into printk. We'll port the x86 NMI backtrace to it in a
> > > > > later patch.
> > > > 
> > > > I think this would be better as a separate file
> > > > rather than increasing the bulk of printk.c
> > > > 
> > > I agree, as printk already has its own directory. Perhaps a
> > > "nmi_backtrace.c"?
> > 
> > I agree on moving the code. However, after Thomas' review I made sure
> > all the external symbols were prefixed printk_nmi and, as a result of
> > the same review I started using CONFIG_PRINTK_NMI to enable/disable the
> > feature). For that reason I'm much more inclined to name it
> > "printk_nmi.c". Any objections?
> 
> Steven's suggestion seems more sensible.
> 
> sed 's/CONFIG_PRINTK_NMI/CONFIG_PRINTK_NMI_BACKTRACE/g'
> sed 's/printk_nmi/printk_nmib/g'
> or
> sed 's/printk_nmi/nmi_backtrace/g'
> 
> might work well.

Ok. The latter rename is consistent and makes it much less likely that
the facility will be (accidentally) misused in future.

I'll change it like this.


^ permalink raw reply	[flat|nested] 94+ messages in thread


* Re: [PATCH 4.0-rc1 v17 5/6] x86/nmi: Use common printk functions
  2015-03-05  0:54       ` Ingo Molnar
@ 2015-03-05 12:29         ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-05 12:29 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander, Ingo Molnar, H. Peter Anvin, x86

On Thu, 2015-03-05 at 01:54 +0100, Ingo Molnar wrote:
> * Daniel Thompson <daniel.thompson@linaro.org> wrote:
> 
> > Much of the code sitting in arch/x86/kernel/apic/hw_nmi.c to support 
> > safe all-cpu backtracing from NMI has been copied to printk.c to 
> > make it accessible to other architectures.
> > 
> > Port the x86 NMI backtrace to the generic code.
> 
> Is there any difference between the generic and the x86 code as they 
> stand today?

There shouldn't be any user-observable change but there are some
differences, mostly due to review comments.

1. The seq_buf structures are initialized at boot and *after* they
   are consumed (originally they were initialized just before use).

2. The generic code doesn't maintain an equivalent of backtrace_mask
   (which was essentially a copy of cpus_online made when backtracing
   was requested) and instead iterates using for_each_possible_cpu()
   to initialize and dump the seq_bufs (a rough sketch follows below).
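
To make the second point concrete, here is a minimal sketch of the
per-cpu seq_buf handling. This is illustrative only, not the patch
itself; the structure, buffer and function names are assumptions.

#include <linux/cpumask.h>
#include <linux/percpu.h>
#include <linux/printk.h>
#include <linux/seq_buf.h>

#define NMI_BUF_SIZE 4096

/* One buffer per possible CPU; the names here are illustrative. */
struct nmi_seq_buf {
	unsigned char buffer[NMI_BUF_SIZE];
	struct seq_buf seq;
};
static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);

/* Reset every possible CPU's buffer, whether or not it is online. */
static void nmi_backtrace_init_bufs(void)
{
	int cpu;

	for_each_possible_cpu(cpu) {
		struct nmi_seq_buf *s = &per_cpu(nmi_print_seq, cpu);

		seq_buf_init(&s->seq, s->buffer, NMI_BUF_SIZE);
	}
}

/* Dump whatever each CPU managed to log from its NMI/FIQ handler. */
static void nmi_backtrace_dump_bufs(void)
{
	int cpu;

	for_each_possible_cpu(cpu) {
		struct nmi_seq_buf *s = &per_cpu(nmi_print_seq, cpu);
		unsigned int len = seq_buf_used(&s->seq);

		if (len)
			pr_info("%.*s", (int)len, s->buffer);
	}
}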


Daniel.


PS
The main piece that git code motion tracking should follow if I squashed
the generic and x86 patches together would be nmi_vprintk(). I suspect
most of the rest would be missed as the code is copied in pretty small
fragments.
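
For reference, the nmi_vprintk() in question (as it stood in the x86
code at the time) was roughly of the following shape, logging into a
per-cpu seq_buf such as the one sketched above rather than writing to
the console directly. Quoted from memory, so treat it as approximate:

static int nmi_vprintk(const char *fmt, va_list args)
{
	struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
	unsigned int len = seq_buf_used(&s->seq);

	seq_buf_vprintf(&s->seq, fmt, args);

	/* Report how many characters this call added to the buffer. */
	return seq_buf_used(&s->seq) - len;
}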


^ permalink raw reply	[flat|nested] 94+ messages in thread


* Re: [PATCH 4.0-rc1 v17 5/6] x86/nmi: Use common printk functions
  2015-03-05 12:29         ` Daniel Thompson
@ 2015-03-05 19:46           ` Ingo Molnar
  -1 siblings, 0 replies; 94+ messages in thread
From: Ingo Molnar @ 2015-03-05 19:46 UTC (permalink / raw)
  To: Daniel Thompson
  Cc: Thomas Gleixner, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander, Ingo Molnar, H. Peter Anvin, x86


* Daniel Thompson <daniel.thompson@linaro.org> wrote:

> On Thu, 2015-03-05 at 01:54 +0100, Ingo Molnar wrote:
> > * Daniel Thompson <daniel.thompson@linaro.org> wrote:
> > 
> > > Much of the code sitting in arch/x86/kernel/apic/hw_nmi.c to support 
> > > safe all-cpu backtracing from NMI has been copied to printk.c to 
> > > make it accessible to other architectures.
> > > 
> > > Port the x86 NMI backtrace to the generic code.
> > 
> > Is there any difference between the generic and the x86 code as they 
> > stand today?
> 
> Shouldn't be any user observable change but there are some changes,
> mostly due to review comments.
> 
> 1. The seq_buf structures are initialized at boot and *after* they
>    are consumed (originally they were initialized just before use).
> 
> 2. The generic code doesn't maintain an equivalent of backtrace_mask
>    (which was essentially a copy of cpus_online made when backtracing
>    was requested) and instead iterates using for_each_possible_cpu()
>    to initialize and dump the seq_buf:s.

Ok, I have no fundamental objections:

Acked-by: Ingo Molnar <mingo@kernel.org>

I suspect you want to carry the x86 bits yourself?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 94+ messages in thread


* Re: [PATCH 4.0-rc1 v17 5/6] x86/nmi: Use common printk functions
  2015-03-05 19:46           ` Ingo Molnar
@ 2015-03-06 19:02             ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-06 19:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, Jason Cooper, Russell King, Will Deacon,
	Catalin Marinas, Marc Zyngier, Stephen Boyd, John Stultz,
	Steven Rostedt, linux-kernel, linux-arm-kernel, patches,
	linaro-kernel, Sumit Semwal, Dirk Behme, Daniel Drake,
	Dmitry Pervushin, Tim Sander, Ingo Molnar, H. Peter Anvin, x86

On Thu, 2015-03-05 at 20:46 +0100, Ingo Molnar wrote:
> * Daniel Thompson <daniel.thompson@linaro.org> wrote:
> 
> > On Thu, 2015-03-05 at 01:54 +0100, Ingo Molnar wrote:
> > > * Daniel Thompson <daniel.thompson@linaro.org> wrote:
> > > 
> > > > Much of the code sitting in arch/x86/kernel/apic/hw_nmi.c to support 
> > > > safe all-cpu backtracing from NMI has been copied to printk.c to 
> > > > make it accessible to other architectures.
> > > > 
> > > > Port the x86 NMI backtrace to the generic code.
> > > 
> > > Is there any difference between the generic and the x86 code as they 
> > > stand today?
> > 
> > Shouldn't be any user observable change but there are some changes,
> > mostly due to review comments.
> > 
> > 1. The seq_buf structures are initialized at boot and *after* they
> >    are consumed (originally they were initialized just before use).
> > 
> > 2. The generic code doesn't maintain an equivalent of backtrace_mask
> >    (which was essentially a copy of cpus_online made when backtracing
> >    was requested) and instead iterates using for_each_possible_cpu()
> >    to initialize and dump the seq_buf:s.
> 
> Ok, I have no fundamental objections:
> 
> Acked-by: Ingo Molnar <mingo@kernel.org>
> 
> I suspect you want to carry the x86 bits yourself?

I've done plenty of bisectability testing on this set so patches 4 and 5
could be separated from the set and go via the x86 tree. However, with
your ack, I hope that taking the patchset via the irqchip route will be
possible.

Jason: After I've attended to Joe Perches/Steven Rostedt's comments, will
you be comfortable enough to take patches 1-5 through one of your
trees?

It would be great to deliver patch 6 too but rmk is having a short break
so getting an ack for that may not work out.


Daniel.


^ permalink raw reply	[flat|nested] 94+ messages in thread


* [PATCH 4.0-rc2 v18 0/6] irq/arm: Implement arch_trigger_all_cpu_backtrace
  2015-01-23 14:22 ` Daniel Thompson
@ 2015-03-12 13:39   ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-12 13:39 UTC (permalink / raw)
  To: Thomas Gleixner, Jason Cooper
  Cc: Daniel Thompson, Russell King, Will Deacon, Catalin Marinas,
	Marc Zyngier, Stephen Boyd, John Stultz, Steven Rostedt,
	linux-kernel, linux-arm-kernel, patches, linaro-kernel,
	Sumit Semwal, Dirk Behme, Daniel Drake, Dmitry Pervushin,
	Tim Sander

Jason/Thomas:
  Any chance of taking the first five of these patches via the irqchip
  route? The x86 patch has an ack from Ingo, printk has no explicit
  maintainer and I've done plenty of bisectability tests on the patchset.

This patchset modifies the GIC driver to allow it, on supported
platforms, to route IPI interrupts to FIQ. It then uses this
feature to implement arch_trigger_all_cpu_backtrace for arm.
In order to neatly bring in the changes for the arm we also rearrange
some of the existing x86 NMI code to make it architecture neutral.

The patchset http://thread.gmane.org/gmane.linux.kernel/1897765 , which
makes sched_clock() NMI/FIQ-safe, should be treated as a prerequisite
for the sixth and final patch in the series (which enables the feature
on ARM).  Although sched_clock() is not called directly by any of the
code that runs from a FIQ handler it is possible for sched_clock() to be
called indirectly when the function tracer is enabled.

The patches have been runtime tested on two systems capable of
supporting FIQ (Freescale i.MX6 and STiH416) and two that do not
(vexpress-a9 and Qualcomm Snapdragon 600), the changes to the x86
logic were tested on qemu and all patches have been compile tested
on x86, arm and arm64.

Note: On platforms not capable of supporting FIQ, the IPI to generate a
      backtrace will fall back to using IRQ for propagation instead.
      The backtrace logic contains a timeout so we will not permanently
      wedge the requesting CPU if other CPUs are not responsive.
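
As a rough illustration of the timeout described above (modelled on the
existing x86 arch_trigger_all_cpu_backtrace() loop rather than copied
from this series; the mask name is an assumption):

#include <linux/cpumask.h>
#include <linux/delay.h>

/* Wait up to ~10 seconds for the other CPUs to dump their backtraces. */
static void wait_for_backtrace_ipis(struct cpumask *backtrace_mask)
{
	int i;

	for (i = 0; i < 10 * 1000; i++) {
		/* Each responding CPU clears itself from the mask. */
		if (cpumask_empty(backtrace_mask))
			break;
		mdelay(1);
	}
}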

v18:

* Move printk_nmi_ functions out of printk.c and into their own
  file, nmi_backtrace.c (Joe Perches/Steven Rostedt).

* Rename printk_nmi_ functions so their names match their new home
  (Joe Perches).

v17:

* Rename bl_migration_lock/unlock to gic_migration_lock/unlock
  (Nicolas Pitre).

v16:

* Significant clean-up of the printk patches (Thomas Gleixner).
  Replacing macros with real functions, renaming CONFIG_ARCH_WANT_NMI_PRINTK
  to CONFIG_PRINTK_NMI, prefixing global functions with printk_nmi,
  removing pointless exports, removing cpu_mask from the interfaces,
  removing just-in-time initialization of trace buffers, preventing
  call sites from having to save state, and rolling up variable
  declarations into single lines.

* Dropped the sched_clock() patches from *this* patchset and managed
  them separately (http://thread.gmane.org/gmane.linux.kernel/1879261 ).
  The cross-dependencies between the patches are minimal; the backtrace
  code only calls sched_clock() if we are ftracing, and backtracing is
  normally only triggered to report information about a broken
  system (although users can type SysRq-l for amusement, most use it
  to find out why the system is dead).

* Squashed together the final two patches. Previously these duplicated
  the x86 code and slavishly avoided changing it, only fixing it to work
  better on ARM in the next patch. It seems better that the code just
  works first time!

v15:

* Added a patch to make sched_clock safe to call from NMI (Stephen
  Boyd). Note that sched_clock() is not called by the NMI handlers that
  have been added for the arm but it could be called if tools such as
  ftrace are deployed.

* Fixed some warnings picked up during bisectability testing.

v14:

* Moved a nmi_vprintk() and friends from arch/x86/kernel/apic/hw_nmi.c
  to printk.c (Steven Rostedt)

v13:

* Updated the code to print the backtrace to replicate Steven Rostedt's
  x86 work to make SysRq-l safe. This is pretty much a total rewrite of
  patches 4 and 5.

v12:

* Squash first two patches into a single one and re-describe
  (Thomas Gleixner).

* Improve description of "irqchip: gic: Make gic_raise_softirq FIQ-safe"
  (Thomas Gleixner).

v11:

* Optimized gic_raise_softirq() by replacing a register read with
  a memory read (Jason Cooper).

v10:

* Add a further patch to optimize away some of the locking on systems
  where CONFIG_BL_SWITCHER is not set (Marc Zyngier). Compiles OK with
  exynos_defconfig (which is the only defconfig to set this option).

* Whitespace fixes in patch 4. That patch previously used spaces for
  alignment of new constants but the rest of the file used tabs.

v9:

* Improved documentation and structure of initial patch (now initial
  two patches) to make gic_raise_softirq() safe to call from FIQ
  (Thomas Gleixner).

* Avoid masking interrupts during gic_raise_softirq(). The use of the
  read lock makes this redundant (because we can safely re-enter the
  function).

v8:

* Fixed build on arm64 caused by a spurious include file in irq-gic.c.

v7-2 (accidentally released twice with same number):

* Fixed boot regression on vexpress-a9 (reported by Russell King).

* Rebased on v3.18-rc3; removed one patch from set that is already
  included in mainline.

* Dropped arm64/fiq.h patch from the set (still useful but not related
  to issuing backtraces).

v7:

* Re-arranged code within the patch series to fix a regression
  introduced midway through the series and corrected by a later patch
  (testing by Olof's autobuilder). Tested offending patch in isolation
  using defconfig identified by the autobuilder.

v6:

* Renamed svc_entry's call_trace argument to just trace (example code
  from Russell King).

* Fixed mismatched ENDPROC() in __fiq_abt (example code from Russell
  King).

* Modified usr_entry to optionally avoid calling into the trace code and
  used this in the FIQ entry from usr path. Modified the corresponding
  exit code to avoid calling into trace code and the scheduler (example
  code from Russell King).

* Ensured the default FIQ register state is restored when the default
  FIQ handler is reinstalled (example code from Russell King).

* Renamed no_fiq_insn to dfl_fiq_insn to reflect the effect of adopting
  a default FIQ handler.

* Re-instated fiq_safe_migration_lock and associated logic in
  gic_raise_softirq(). gic_raise_softirq() is called by wake_up_klogd()
  in the console unlock logic.

v5:

* Rebased on 3.17-rc4.

* Removed a spurious line from the final "glue it together" patch
  that broke the build.

v4:

* Replaced push/pop with stmfd/ldmfd respectively (review of Nicolas
  Pitre).

* Really fix bad pt_regs pointer generation in __fiq_abt.

* Remove fiq_safe_migration_lock and associated logic in
  gic_raise_softirq() (review of Russell King)

* Restructured to introduce the default FIQ handler first, before the
  new features (review of Russell King).

v3:

* Removed redundant header guards from arch/arm64/include/asm/fiq.h
  (review of Catalin Marinas).

* Moved svc_exit_via_fiq macro to entry-header.S (review of Nicolas
  Pitre).

v2:

* Restructured to sit nicely on a similar FYI patchset from Russell
  King. It now effectively replaces the work in progress final patch
  with something much more complete.

* Implemented (and tested) a Thumb-2 version of svc_exit_via_fiq
  (review of Nicolas Pitre).

* Dropped the GIC group 0 workaround patch. The issue of FIQ interrupts
  being acknowledged by the IRQ handler does still exist but should be
  harmless because the IRQ handler will still wind up calling
  ipi_cpu_backtrace().

* Removed any dependency on CONFIG_FIQ; all-cpu backtrace effectively
  becomes a platform feature (although the use of non-maskable
  interrupts to implement it is best effort rather than guaranteed).

* Better comments highlighting usage of RAZ/WI registers (and parts of
  registers) in the GIC code.

Changes *before* v1:

* This patchset is a hugely cut-down successor to "[PATCH v11 00/19]
  arm: KGDB NMI/FIQ support". Thanks to Thomas Gleixner for suggesting
  the new structure. For historic details see:
        https://lkml.org/lkml/2014/9/2/227

* Fix bug in __fiq_abt (no longer passes a bad struct pt_regs value).
  In fixing this we also remove the useless indirection previously
  found in the fiq_handler macro.

* Make default fiq handler "always on" by migrating from fiq.c to
  traps.c and replace do_unexp_fiq with the new handler (review
  of Russell King).

* Add arm64 version of fiq.h (review of Russell King)

* Removed conditional branching and code from irq-gic.c; this is
  replaced by much simpler code that relies on the GIC specification's
  heavy use of read-as-zero/write-ignored (review of Russell King).


Daniel Thompson (6):
  irqchip: gic: Optimize locking in gic_raise_softirq
  irqchip: gic: Make gic_raise_softirq FIQ-safe
  irqchip: gic: Introduce plumbing for IPI FIQ
  printk: Simple implementation for NMI backtracing
  x86/nmi: Use common printk functions
  ARM: Add support for on-demand backtrace of other CPUs

 arch/arm/Kconfig                |   1 +
 arch/arm/include/asm/hardirq.h  |   2 +-
 arch/arm/include/asm/irq.h      |   5 +
 arch/arm/include/asm/smp.h      |   3 +
 arch/arm/kernel/smp.c           |  81 ++++++++++++++++
 arch/arm/kernel/traps.c         |   8 +-
 arch/x86/Kconfig                |   1 +
 arch/x86/kernel/apic/hw_nmi.c   | 101 ++------------------
 drivers/irqchip/irq-gic.c       | 203 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 ++
 include/linux/printk.h          |  20 ++++
 init/Kconfig                    |   3 +
 kernel/printk/Makefile          |   1 +
 kernel/printk/nmi_backtrace.c   | 148 +++++++++++++++++++++++++++++
 14 files changed, 474 insertions(+), 111 deletions(-)
 create mode 100644 kernel/printk/nmi_backtrace.c

--
2.1.0


^ permalink raw reply	[flat|nested] 94+ messages in thread


* [PATCH 4.0-rc2 v18 1/6] irqchip: gic: Optimize locking in gic_raise_softirq
  2015-03-12 13:39   ` Daniel Thompson
@ 2015-03-12 13:39     ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-12 13:39 UTC (permalink / raw)
  To: Thomas Gleixner, Jason Cooper
  Cc: Daniel Thompson, Russell King, Will Deacon, Catalin Marinas,
	Marc Zyngier, Stephen Boyd, John Stultz, Steven Rostedt,
	linux-kernel, linux-arm-kernel, patches, linaro-kernel,
	Sumit Semwal, Dirk Behme, Daniel Drake, Dmitry Pervushin,
	Tim Sander

Currently gic_raise_softirq() is locked using irq_controller_lock.
This lock is primarily used to make register read-modify-write sequences
atomic but gic_raise_softirq() uses it instead to ensure that the
big.LITTLE migration logic can figure out when it is safe to migrate
interrupts between physical cores.

This is sub-optimal in two closely related ways:

1. No locking at all is required on systems where the b.L switcher is
   not configured.

2. Finer grain locking can be used on systems where the b.L switcher is
   present.

This patch resolves both of the above by introducing a separate finer
grain lock and providing conditionally compiled inlines to lock/unlock
it.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
---
 drivers/irqchip/irq-gic.c | 36 +++++++++++++++++++++++++++++++++---
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 4634cf7d0ec3..f2a0b4525b65 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -73,6 +73,27 @@ struct gic_chip_data {
 static DEFINE_RAW_SPINLOCK(irq_controller_lock);
 
 /*
+ * This lock is used by the big.LITTLE migration code to ensure no IPIs
+ * can be pended on the old core after the map has been updated.
+ */
+#ifdef CONFIG_BL_SWITCHER
+static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
+
+static inline void gic_migration_lock(unsigned long *flags)
+{
+	raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags);
+}
+
+static inline void gic_migration_unlock(unsigned long flags)
+{
+	raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags);
+}
+#else
+static inline void gic_migration_lock(unsigned long *flags) {}
+static inline void gic_migration_unlock(unsigned long flags) {}
+#endif
+
+/*
  * The GIC mapping of CPU interfaces does not necessarily match
  * the logical CPU numbering.  Let's use a mapping as returned
  * by the GIC itself.
@@ -627,7 +648,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	int cpu;
 	unsigned long flags, map = 0;
 
-	raw_spin_lock_irqsave(&irq_controller_lock, flags);
+	gic_migration_lock(&flags);
 
 	/* Convert our logical CPU mask into a physical one. */
 	for_each_cpu(cpu, mask)
@@ -642,7 +663,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	/* this always happens on GIC0 */
 	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
-	raw_spin_unlock_irqrestore(&irq_controller_lock, flags);
+	gic_migration_unlock(flags);
 }
 #endif
 
@@ -713,8 +734,17 @@ void gic_migrate_target(unsigned int new_cpu_id)
 
 	raw_spin_lock(&irq_controller_lock);
 
-	/* Update the target interface for this logical CPU */
+	/*
+	 * Update the target interface for this logical CPU
+	 *
+	 * From the point we release the cpu_map_migration_lock any new
+	 * SGIs will be pended on the new cpu which makes the set of SGIs
+	 * pending on the old cpu static. That means we can defer the
+	 * migration until after we have released the irq_controller_lock.
+	 */
+	raw_spin_lock(&cpu_map_migration_lock);
 	gic_cpu_map[cpu] = 1 << new_cpu_id;
+	raw_spin_unlock(&cpu_map_migration_lock);
 
 	/*
 	 * Find all the peripheral interrupts targetting the current
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread


* [PATCH 4.0-rc2 v18 2/6] irqchip: gic: Make gic_raise_softirq FIQ-safe
  2015-03-12 13:39   ` Daniel Thompson
@ 2015-03-12 13:39     ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-12 13:39 UTC (permalink / raw)
  To: Thomas Gleixner, Jason Cooper
  Cc: Daniel Thompson, Russell King, Will Deacon, Catalin Marinas,
	Marc Zyngier, Stephen Boyd, John Stultz, Steven Rostedt,
	linux-kernel, linux-arm-kernel, patches, linaro-kernel,
	Sumit Semwal, Dirk Behme, Daniel Drake, Dmitry Pervushin,
	Tim Sander

It is currently possible for FIQ handlers to re-enter gic_raise_softirq()
and lock up.

    	gic_raise_softirq()
	   lock(x);
-~-> FIQ
        handle_fiq()
	   gic_raise_softirq()
	      lock(x);		<-- Lockup

arch/arm/ uses IPIs to implement arch_irq_work_raise(); thus this issue
makes it difficult for FIQ handlers to safely defer work to less
restrictive calling contexts.

This patch fixes the problem by converting the cpu_map_migration_lock
into a rwlock making it safe to re-enter the function.

Note that having made it safe to re-enter gic_raise_softirq() we no
longer need to mask interrupts during gic_raise_softirq() because the
b.L migration is always performed from task context.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
---
 drivers/irqchip/irq-gic.c | 38 +++++++++++++++++++++++++-------------
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index f2a0b4525b65..48d6296a365a 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -75,22 +75,25 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock);
 /*
  * This lock is used by the big.LITTLE migration code to ensure no IPIs
  * can be pended on the old core after the map has been updated.
+ *
+ * This lock may be locked for reading from both IRQ and FIQ handlers
+ * and therefore must not be locked for writing when these are enabled.
  */
 #ifdef CONFIG_BL_SWITCHER
-static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
+static DEFINE_RWLOCK(cpu_map_migration_lock);
 
-static inline void gic_migration_lock(unsigned long *flags)
+static inline void gic_migration_lock(void)
 {
-	raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags);
+	read_lock(&cpu_map_migration_lock);
 }
 
-static inline void gic_migration_unlock(unsigned long flags)
+static inline void gic_migration_unlock(void)
 {
-	raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags);
+	read_unlock(&cpu_map_migration_lock);
 }
 #else
-static inline void gic_migration_lock(unsigned long *flags) {}
-static inline void gic_migration_unlock(unsigned long flags) {}
+static inline void gic_migration_lock(void) {}
+static inline void gic_migration_unlock(void) {}
 #endif
 
 /*
@@ -643,12 +646,20 @@ static void __init gic_pm_init(struct gic_chip_data *gic)
 #endif
 
 #ifdef CONFIG_SMP
+/*
+ * Raise the specified IPI on all cpus set in mask.
+ *
+ * This function is safe to call from all calling contexts, including
+ * FIQ handlers. It relies on gic_migration_lock() being multiply acquirable
+ * to avoid deadlocks when the function is re-entered at different
+ * exception levels.
+ */
 static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 {
 	int cpu;
-	unsigned long flags, map = 0;
+	unsigned long map = 0;
 
-	gic_migration_lock(&flags);
+	gic_migration_lock();
 
 	/* Convert our logical CPU mask into a physical one. */
 	for_each_cpu(cpu, mask)
@@ -663,7 +674,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	/* this always happens on GIC0 */
 	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
-	gic_migration_unlock(flags);
+	gic_migration_unlock();
 }
 #endif
 
@@ -711,7 +722,8 @@ int gic_get_cpu_id(unsigned int cpu)
  * Migrate all peripheral interrupts with a target matching the current CPU
  * to the interface corresponding to @new_cpu_id.  The CPU interface mapping
  * is also updated.  Targets to other CPU interfaces are unchanged.
- * This must be called with IRQs locally disabled.
+ * This must be called from a task context and with IRQ and FIQ locally
+ * disabled.
  */
 void gic_migrate_target(unsigned int new_cpu_id)
 {
@@ -742,9 +754,9 @@ void gic_migrate_target(unsigned int new_cpu_id)
 	 * pending on the old cpu static. That means we can defer the
 	 * migration until after we have released the irq_controller_lock.
 	 */
-	raw_spin_lock(&cpu_map_migration_lock);
+	write_lock(&cpu_map_migration_lock);
 	gic_cpu_map[cpu] = 1 << new_cpu_id;
-	raw_spin_unlock(&cpu_map_migration_lock);
+	write_unlock(&cpu_map_migration_lock);
 
 	/*
 	 * Find all the peripheral interrupts targetting the current
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread


* [PATCH 4.0-rc2 v18 3/6] irqchip: gic: Introduce plumbing for IPI FIQ
  2015-03-12 13:39   ` Daniel Thompson
@ 2015-03-12 13:39     ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-12 13:39 UTC (permalink / raw)
  To: Thomas Gleixner, Jason Cooper
  Cc: Daniel Thompson, Russell King, Will Deacon, Catalin Marinas,
	Marc Zyngier, Stephen Boyd, John Stultz, Steven Rostedt,
	linux-kernel, linux-arm-kernel, patches, linaro-kernel,
	Sumit Semwal, Dirk Behme, Daniel Drake, Dmitry Pervushin,
	Tim Sander

Currently it is not possible to exploit FIQ for systems with a GIC, even if
the systems are otherwise capable of it. This patch makes it possible
for IPIs to be delivered using FIQ.

To do so it modifies the register state so that normal interrupts are
placed in group 1 and specific IPIs are placed into group 0. It also
configures the controller to raise group 0 interrupts using the FIQ
signal. It provides a means for architecture code to define which IPIs
shall use FIQ and to acknowledge any IPIs that are raised.
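
At the register level (all constants and variables are taken from the
diff below) the configuration amounts to roughly the following sketch;
it is shown only for orientation and elides the availability checks
made by the real code:

	/* Distributor: enable both groups; IGROUP bit set => group 1 (IRQ) */
	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, dist_base + GIC_DIST_CTRL);
	writel_relaxed(~SMP_IPI_FIQ_MASK, dist_base + GIC_DIST_IGROUP + 0);

	/* CPU interface: deliver group 0 interrupts using the FIQ signal */
	ctrl |= GICC_COMMON_BPR | GICC_FIQ_EN | GICC_ACK_CTL | GICC_ENABLE_GRP1;
	writel_relaxed(ctrl | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);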

All GIC hardware, except GICv1 without TrustZone support, provides a
means to group exceptions into group 0 and group 1, but this hardware
functionality is unavailable to the kernel when a secure monitor is
present because access to the grouping registers is prohibited outside
the "secure world". However, when grouping is not available (or, in the
case of early GICv1 implementations, is very hard to configure), the
code to change groups is not deployed and all IPIs will be raised via
IRQ.
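
The run-time test for this is simply whether the EnableGrp1 bit sticks
in the distributor control register; the code paths added by the patch
that touch grouping guard themselves along these lines (sketch, taken
from the diff below):

	if (!(GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)))
		return;		/* no grouping: leave everything on IRQ */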

It has been tested and shown working on two systems capable of
supporting grouping (Freescale i.MX6 and STiH416). It has also been
tested for boot regressions on two systems that do not support grouping
(vexpress-a9 and Qualcomm Snapdragon 600).

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Jon Medhurst <tixy@linaro.org>
---
 arch/arm/kernel/traps.c         |   5 +-
 drivers/irqchip/irq-gic.c       | 151 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 +++
 3 files changed, 153 insertions(+), 11 deletions(-)

diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 788e23fe64d8..b35e220ae1b1 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -26,6 +26,7 @@
 #include <linux/init.h>
 #include <linux/sched.h>
 #include <linux/irq.h>
+#include <linux/irqchip/arm-gic.h>
 
 #include <linux/atomic.h>
 #include <asm/cacheflush.h>
@@ -479,7 +480,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
 
 	nmi_enter();
 
-	/* nop. FIQ handlers for special arch/arm features can be added here. */
+#ifdef CONFIG_ARM_GIC
+	gic_handle_fiq_ipi();
+#endif
 
 	nmi_exit();
 
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 48d6296a365a..cd2e4f93675b 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -39,6 +39,7 @@
 #include <linux/slab.h>
 #include <linux/irqchip/chained_irq.h>
 #include <linux/irqchip/arm-gic.h>
+#include <linux/ratelimit.h>
 
 #include <asm/cputype.h>
 #include <asm/irq.h>
@@ -48,6 +49,10 @@
 #include "irq-gic-common.h"
 #include "irqchip.h"
 
+#ifndef SMP_IPI_FIQ_MASK
+#define SMP_IPI_FIQ_MASK 0
+#endif
+
 union gic_base {
 	void __iomem *common_base;
 	void __percpu * __iomem *percpu_base;
@@ -65,6 +70,7 @@ struct gic_chip_data {
 #endif
 	struct irq_domain *domain;
 	unsigned int gic_irqs;
+	u32 igroup0_shadow;
 #ifdef CONFIG_GIC_NON_BANKED
 	void __iomem *(*get_base)(union gic_base *);
 #endif
@@ -351,6 +357,83 @@ static struct irq_chip gic_chip = {
 	.irq_set_wake		= gic_set_wake,
 };
 
+/*
+ * Shift an interrupt between Group 0 and Group 1.
+ *
+ * In addition to changing the group we also modify the priority to
+ * match what "ARM strongly recommends" for a system where no Group 1
+ * interrupt must ever preempt a Group 0 interrupt.
+ *
+ * It is safe to call this function on systems which do not support
+ * grouping (it will have no effect).
+ */
+static void gic_set_group_irq(struct gic_chip_data *gic, unsigned int hwirq,
+			      int group)
+{
+	void __iomem *base = gic_data_dist_base(gic);
+	unsigned int grp_reg = hwirq / 32 * 4;
+	u32 grp_mask = BIT(hwirq % 32);
+	u32 grp_val;
+
+	unsigned int pri_reg = (hwirq / 4) * 4;
+	u32 pri_mask = BIT(7 + ((hwirq % 4) * 8));
+	u32 pri_val;
+
+	/*
+	 * Systems which do not support grouping will not have
+	 * the EnableGrp1 bit set.
+	 */
+	if (!(GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL)))
+		return;
+
+	raw_spin_lock(&irq_controller_lock);
+
+	grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg);
+	pri_val = readl_relaxed(base + GIC_DIST_PRI + pri_reg);
+
+	if (group) {
+		grp_val |= grp_mask;
+		pri_val |= pri_mask;
+	} else {
+		grp_val &= ~grp_mask;
+		pri_val &= ~pri_mask;
+	}
+
+	writel_relaxed(grp_val, base + GIC_DIST_IGROUP + grp_reg);
+	if (grp_reg == 0)
+		gic->igroup0_shadow = grp_val;
+
+	writel_relaxed(pri_val, base + GIC_DIST_PRI + pri_reg);
+
+	raw_spin_unlock(&irq_controller_lock);
+}
+
+
+/*
+ * Fully acknowledge (both ack and eoi) any outstanding FIQ-based IPI,
+ * otherwise do nothing.
+ */
+void gic_handle_fiq_ipi(void)
+{
+	struct gic_chip_data *gic = &gic_data[0];
+	void __iomem *cpu_base = gic_data_cpu_base(gic);
+	unsigned long irqstat, irqnr;
+
+	if (WARN_ON(!in_nmi()))
+		return;
+
+	while ((1u << readl_relaxed(cpu_base + GIC_CPU_HIGHPRI)) &
+	       SMP_IPI_FIQ_MASK) {
+		irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK);
+		writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);
+
+		irqnr = irqstat & GICC_IAR_INT_ID_MASK;
+		WARN_RATELIMIT(irqnr > 16,
+			       "Unexpected irqnr %lu (bad prioritization?)\n",
+			       irqnr);
+	}
+}
+
 void __init gic_cascade_irq(unsigned int gic_nr, unsigned int irq)
 {
 	if (gic_nr >= MAX_GIC_NR)
@@ -382,15 +465,24 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
 static void gic_cpu_if_up(void)
 {
 	void __iomem *cpu_base = gic_data_cpu_base(&gic_data[0]);
-	u32 bypass = 0;
+	void __iomem *dist_base = gic_data_dist_base(&gic_data[0]);
+	u32 ctrl = 0;
 
 	/*
-	* Preserve bypass disable bits to be written back later
-	*/
-	bypass = readl(cpu_base + GIC_CPU_CTRL);
-	bypass &= GICC_DIS_BYPASS_MASK;
+	 * Preserve bypass disable bits to be written back later
+	 */
+	ctrl = readl(cpu_base + GIC_CPU_CTRL);
+	ctrl &= GICC_DIS_BYPASS_MASK;
 
-	writel_relaxed(bypass | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
+	/*
+	 * If EnableGrp1 is set in the distributor then enable group 1
+	 * support for this CPU (and route group 0 interrupts to FIQ).
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL))
+		ctrl |= GICC_COMMON_BPR | GICC_FIQ_EN | GICC_ACK_CTL |
+			GICC_ENABLE_GRP1;
+
+	writel_relaxed(ctrl | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
 }
 
 
@@ -414,7 +506,23 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
 
 	gic_dist_config(base, gic_irqs, NULL);
 
-	writel_relaxed(GICD_ENABLE, base + GIC_DIST_CTRL);
+	/*
+	 * Set EnableGrp1/EnableGrp0 (bit 1 and 0) or EnableGrp (bit 0 only,
+	 * bit 1 ignored) depending on current mode.
+	 */
+	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, base + GIC_DIST_CTRL);
+
+	/*
+	 * Set all global interrupts to be group 1 if (and only if) it
+	 * is possible to enable group 1 interrupts. This register is RAZ/WI
+	 * if not accessible or not implemented, however some GICv1 devices
+	 * do not implement the EnableGrp1 bit making it unsafe to set
+	 * this register unconditionally.
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL))
+		for (i = 32; i < gic_irqs; i += 32)
+			writel_relaxed(0xffffffff,
+				       base + GIC_DIST_IGROUP + i * 4 / 32);
 }
 
 static void gic_cpu_init(struct gic_chip_data *gic)
@@ -423,6 +531,7 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 	void __iomem *base = gic_data_cpu_base(gic);
 	unsigned int cpu_mask, cpu = smp_processor_id();
 	int i;
+	unsigned long secure_irqs, secure_irq;
 
 	/*
 	 * Get what the GIC says our CPU mask is.
@@ -441,6 +550,20 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 
 	gic_cpu_config(dist_base, NULL);
 
+	/*
+	 * If the distributor is configured to support interrupt grouping
+	 * then set any PPI and SGI interrupts not set in SMP_IPI_FIQ_MASK
+	 * to be group1 and ensure any remaining group 0 interrupts have
+	 * the right priority.
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) {
+		secure_irqs = SMP_IPI_FIQ_MASK;
+		writel_relaxed(~secure_irqs, dist_base + GIC_DIST_IGROUP + 0);
+		gic->igroup0_shadow = ~secure_irqs;
+		for_each_set_bit(secure_irq, &secure_irqs, 16)
+			gic_set_group_irq(gic, secure_irq, 0);
+	}
+
 	writel_relaxed(GICC_INT_PRI_THRESHOLD, base + GIC_CPU_PRIMASK);
 	gic_cpu_if_up();
 }
@@ -530,7 +653,8 @@ static void gic_dist_restore(unsigned int gic_nr)
 		writel_relaxed(gic_data[gic_nr].saved_spi_enable[i],
 			dist_base + GIC_DIST_ENABLE_SET + i * 4);
 
-	writel_relaxed(GICD_ENABLE, dist_base + GIC_DIST_CTRL);
+	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE,
+		       dist_base + GIC_DIST_CTRL);
 }
 
 static void gic_cpu_save(unsigned int gic_nr)
@@ -658,6 +782,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 {
 	int cpu;
 	unsigned long map = 0;
+	unsigned long softint;
 
 	gic_migration_lock();
 
@@ -671,8 +796,14 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	 */
 	dmb(ishst);
 
-	/* this always happens on GIC0 */
-	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
+	/* We avoid a readl here by using the shadow copy of IGROUP[0] */
+	softint = map << 16 | irq;
+	if (gic_data[0].igroup0_shadow & BIT(irq))
+		softint |= 0x8000;
+
+	/* This always happens on GIC0 */
+	writel_relaxed(softint,
+		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
 	gic_migration_unlock();
 }
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index 71d706d5f169..7690f70049a3 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -22,6 +22,10 @@
 #define GIC_CPU_IDENT			0xfc
 
 #define GICC_ENABLE			0x1
+#define GICC_ENABLE_GRP1		0x2
+#define GICC_ACK_CTL			0x4
+#define GICC_FIQ_EN			0x8
+#define GICC_COMMON_BPR			0x10
 #define GICC_INT_PRI_THRESHOLD		0xf0
 #define GICC_IAR_INT_ID_MASK		0x3ff
 #define GICC_INT_SPURIOUS		1023
@@ -44,6 +48,7 @@
 #define GIC_DIST_SGI_PENDING_SET	0xf20
 
 #define GICD_ENABLE			0x1
+#define GICD_ENABLE_GRP1		0x2
 #define GICD_DISABLE			0x0
 #define GICD_INT_ACTLOW_LVLTRIG		0x0
 #define GICD_INT_EN_CLR_X32		0xffffffff
@@ -121,5 +126,8 @@ static inline void __init register_routable_domain_ops
 {
 	gic_routable_irq_domain_ops = ops;
 }
+
+void gic_handle_fiq_ipi(void);
+
 #endif /* __ASSEMBLY */
 #endif
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 4.0-rc2 v18 3/6] irqchip: gic: Introduce plumbing for IPI FIQ
@ 2015-03-12 13:39     ` Daniel Thompson
  0 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-12 13:39 UTC (permalink / raw)
  To: linux-arm-kernel

Currently it is not possible to exploit FIQ for systems with a GIC, even if
the systems are otherwise capable of it. This patch makes it possible
for IPIs to be delivered using FIQ.

To do so it modifies the register state so that normal interrupts are
placed in group 1 and specific IPIs are placed into group 0. It also
configures the controller to raise group 0 interrupts using the FIQ
signal. It provides a means for architecture code to define which IPIs
shall use FIQ and to acknowledge any IPIs that are raised.

All GIC hardware, except GICv1 without TrustZone support, provides a
means to group exceptions into group 0 and group 1, but this hardware
functionality is unavailable to the kernel when a secure monitor is
present because access to the grouping registers is prohibited outside
the "secure world". However, when grouping is not available (or, in the
case of early GICv1 implementations, is very hard to configure), the
code to change groups is not deployed and all IPIs will be raised via
IRQ.

It has been tested and shown working on two systems capable of
supporting grouping (Freescale i.MX6 and STiH416). It has also been
tested for boot regressions on two systems that do not support grouping
(vexpress-a9 and Qualcomm Snapdragon 600).

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Jon Medhurst <tixy@linaro.org>
---
 arch/arm/kernel/traps.c         |   5 +-
 drivers/irqchip/irq-gic.c       | 151 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 +++
 3 files changed, 153 insertions(+), 11 deletions(-)

diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 788e23fe64d8..b35e220ae1b1 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -26,6 +26,7 @@
 #include <linux/init.h>
 #include <linux/sched.h>
 #include <linux/irq.h>
+#include <linux/irqchip/arm-gic.h>
 
 #include <linux/atomic.h>
 #include <asm/cacheflush.h>
@@ -479,7 +480,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
 
 	nmi_enter();
 
-	/* nop. FIQ handlers for special arch/arm features can be added here. */
+#ifdef CONFIG_ARM_GIC
+	gic_handle_fiq_ipi();
+#endif
 
 	nmi_exit();
 
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 48d6296a365a..cd2e4f93675b 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -39,6 +39,7 @@
 #include <linux/slab.h>
 #include <linux/irqchip/chained_irq.h>
 #include <linux/irqchip/arm-gic.h>
+#include <linux/ratelimit.h>
 
 #include <asm/cputype.h>
 #include <asm/irq.h>
@@ -48,6 +49,10 @@
 #include "irq-gic-common.h"
 #include "irqchip.h"
 
+#ifndef SMP_IPI_FIQ_MASK
+#define SMP_IPI_FIQ_MASK 0
+#endif
+
 union gic_base {
 	void __iomem *common_base;
 	void __percpu * __iomem *percpu_base;
@@ -65,6 +70,7 @@ struct gic_chip_data {
 #endif
 	struct irq_domain *domain;
 	unsigned int gic_irqs;
+	u32 igroup0_shadow;
 #ifdef CONFIG_GIC_NON_BANKED
 	void __iomem *(*get_base)(union gic_base *);
 #endif
@@ -351,6 +357,83 @@ static struct irq_chip gic_chip = {
 	.irq_set_wake		= gic_set_wake,
 };
 
+/*
+ * Shift an interrupt between Group 0 and Group 1.
+ *
+ * In addition to changing the group we also modify the priority to
+ * match what "ARM strongly recommends" for a system where no Group 1
+ * interrupt must ever preempt a Group 0 interrupt.
+ *
+ * It is safe to call this function on systems which do not support
+ * grouping (it will have no effect).
+ */
+static void gic_set_group_irq(struct gic_chip_data *gic, unsigned int hwirq,
+			      int group)
+{
+	void __iomem *base = gic_data_dist_base(gic);
+	unsigned int grp_reg = hwirq / 32 * 4;
+	u32 grp_mask = BIT(hwirq % 32);
+	u32 grp_val;
+
+	unsigned int pri_reg = (hwirq / 4) * 4;
+	u32 pri_mask = BIT(7 + ((hwirq % 4) * 8));
+	u32 pri_val;
+
+	/*
+	 * Systems which do not support grouping will not have
+	 * the EnableGrp1 bit set.
+	 */
+	if (!(GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL)))
+		return;
+
+	raw_spin_lock(&irq_controller_lock);
+
+	grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg);
+	pri_val = readl_relaxed(base + GIC_DIST_PRI + pri_reg);
+
+	if (group) {
+		grp_val |= grp_mask;
+		pri_val |= pri_mask;
+	} else {
+		grp_val &= ~grp_mask;
+		pri_val &= ~pri_mask;
+	}
+
+	writel_relaxed(grp_val, base + GIC_DIST_IGROUP + grp_reg);
+	if (grp_reg == 0)
+		gic->igroup0_shadow = grp_val;
+
+	writel_relaxed(pri_val, base + GIC_DIST_PRI + pri_reg);
+
+	raw_spin_unlock(&irq_controller_lock);
+}
+
+
+/*
+ * Fully acknowledge (both ack and eoi) any outstanding FIQ-based IPI,
+ * otherwise do nothing.
+ */
+void gic_handle_fiq_ipi(void)
+{
+	struct gic_chip_data *gic = &gic_data[0];
+	void __iomem *cpu_base = gic_data_cpu_base(gic);
+	unsigned long irqstat, irqnr;
+
+	if (WARN_ON(!in_nmi()))
+		return;
+
+	while ((1u << readl_relaxed(cpu_base + GIC_CPU_HIGHPRI)) &
+	       SMP_IPI_FIQ_MASK) {
+		irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK);
+		writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);
+
+		irqnr = irqstat & GICC_IAR_INT_ID_MASK;
+		WARN_RATELIMIT(irqnr > 16,
+			       "Unexpected irqnr %lu (bad prioritization?)\n",
+			       irqnr);
+	}
+}
+
 void __init gic_cascade_irq(unsigned int gic_nr, unsigned int irq)
 {
 	if (gic_nr >= MAX_GIC_NR)
@@ -382,15 +465,24 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
 static void gic_cpu_if_up(void)
 {
 	void __iomem *cpu_base = gic_data_cpu_base(&gic_data[0]);
-	u32 bypass = 0;
+	void __iomem *dist_base = gic_data_dist_base(&gic_data[0]);
+	u32 ctrl = 0;
 
 	/*
-	* Preserve bypass disable bits to be written back later
-	*/
-	bypass = readl(cpu_base + GIC_CPU_CTRL);
-	bypass &= GICC_DIS_BYPASS_MASK;
+	 * Preserve bypass disable bits to be written back later
+	 */
+	ctrl = readl(cpu_base + GIC_CPU_CTRL);
+	ctrl &= GICC_DIS_BYPASS_MASK;
 
-	writel_relaxed(bypass | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
+	/*
+	 * If EnableGrp1 is set in the distributor then enable group 1
+	 * support for this CPU (and route group 0 interrupts to FIQ).
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL))
+		ctrl |= GICC_COMMON_BPR | GICC_FIQ_EN | GICC_ACK_CTL |
+			GICC_ENABLE_GRP1;
+
+	writel_relaxed(ctrl | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
 }
 
 
@@ -414,7 +506,23 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
 
 	gic_dist_config(base, gic_irqs, NULL);
 
-	writel_relaxed(GICD_ENABLE, base + GIC_DIST_CTRL);
+	/*
+	 * Set EnableGrp1/EnableGrp0 (bit 1 and 0) or EnableGrp (bit 0 only,
+	 * bit 1 ignored) depending on current mode.
+	 */
+	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, base + GIC_DIST_CTRL);
+
+	/*
+	 * Set all global interrupts to be group 1 if (and only if) it
+	 * is possible to enable group 1 interrupts. This register is RAZ/WI
+	 * if not accessible or not implemented, however some GICv1 devices
+	 * do not implement the EnableGrp1 bit making it unsafe to set
+	 * this register unconditionally.
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL))
+		for (i = 32; i < gic_irqs; i += 32)
+			writel_relaxed(0xffffffff,
+				       base + GIC_DIST_IGROUP + i * 4 / 32);
 }
 
 static void gic_cpu_init(struct gic_chip_data *gic)
@@ -423,6 +531,7 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 	void __iomem *base = gic_data_cpu_base(gic);
 	unsigned int cpu_mask, cpu = smp_processor_id();
 	int i;
+	unsigned long secure_irqs, secure_irq;
 
 	/*
 	 * Get what the GIC says our CPU mask is.
@@ -441,6 +550,20 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 
 	gic_cpu_config(dist_base, NULL);
 
+	/*
+	 * If the distributor is configured to support interrupt grouping
+	 * then set any PPI and SGI interrupts not set in SMP_IPI_FIQ_MASK
+	 * to be group1 and ensure any remaining group 0 interrupts have
+	 * the right priority.
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) {
+		secure_irqs = SMP_IPI_FIQ_MASK;
+		writel_relaxed(~secure_irqs, dist_base + GIC_DIST_IGROUP + 0);
+		gic->igroup0_shadow = ~secure_irqs;
+		for_each_set_bit(secure_irq, &secure_irqs, 16)
+			gic_set_group_irq(gic, secure_irq, 0);
+	}
+
 	writel_relaxed(GICC_INT_PRI_THRESHOLD, base + GIC_CPU_PRIMASK);
 	gic_cpu_if_up();
 }
@@ -530,7 +653,8 @@ static void gic_dist_restore(unsigned int gic_nr)
 		writel_relaxed(gic_data[gic_nr].saved_spi_enable[i],
 			dist_base + GIC_DIST_ENABLE_SET + i * 4);
 
-	writel_relaxed(GICD_ENABLE, dist_base + GIC_DIST_CTRL);
+	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE,
+		       dist_base + GIC_DIST_CTRL);
 }
 
 static void gic_cpu_save(unsigned int gic_nr)
@@ -658,6 +782,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 {
 	int cpu;
 	unsigned long map = 0;
+	unsigned long softint;
 
 	gic_migration_lock();
 
@@ -671,8 +796,14 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	 */
 	dmb(ishst);
 
-	/* this always happens on GIC0 */
-	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
+	/* We avoid a readl here by using the shadow copy of IGROUP[0] */
+	softint = map << 16 | irq;
+	if (gic_data[0].igroup0_shadow & BIT(irq))
+		softint |= 0x8000;
+
+	/* This always happens on GIC0 */
+	writel_relaxed(softint,
+		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
 	gic_migration_unlock();
 }
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index 71d706d5f169..7690f70049a3 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -22,6 +22,10 @@
 #define GIC_CPU_IDENT			0xfc
 
 #define GICC_ENABLE			0x1
+#define GICC_ENABLE_GRP1		0x2
+#define GICC_ACK_CTL			0x4
+#define GICC_FIQ_EN			0x8
+#define GICC_COMMON_BPR			0x10
 #define GICC_INT_PRI_THRESHOLD		0xf0
 #define GICC_IAR_INT_ID_MASK		0x3ff
 #define GICC_INT_SPURIOUS		1023
@@ -44,6 +48,7 @@
 #define GIC_DIST_SGI_PENDING_SET	0xf20
 
 #define GICD_ENABLE			0x1
+#define GICD_ENABLE_GRP1		0x2
 #define GICD_DISABLE			0x0
 #define GICD_INT_ACTLOW_LVLTRIG		0x0
 #define GICD_INT_EN_CLR_X32		0xffffffff
@@ -121,5 +126,8 @@ static inline void __init register_routable_domain_ops
 {
 	gic_routable_irq_domain_ops = ops;
 }
+
+void gic_handle_fiq_ipi(void);
+
 #endif /* __ASSEMBLY */
 #endif
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 4.0-rc2 v18 4/6] printk: Simple implementation for NMI backtracing
  2015-03-12 13:39   ` Daniel Thompson
@ 2015-03-12 13:39     ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-12 13:39 UTC (permalink / raw)
  To: Thomas Gleixner, Jason Cooper
  Cc: Daniel Thompson, Russell King, Will Deacon, Catalin Marinas,
	Marc Zyngier, Stephen Boyd, John Stultz, Steven Rostedt,
	linux-kernel, linux-arm-kernel, patches, linaro-kernel,
	Sumit Semwal, Dirk Behme, Daniel Drake, Dmitry Pervushin,
	Tim Sander

Currently there is quite a pile of code sitting in
arch/x86/kernel/apic/hw_nmi.c to support safe all-cpu backtracing from NMI.
The code is inaccessible to backtrace implementations for other
architectures, which is a shame because they would probably like to be
safe too.

Copy this code into printk, reworking it a little as we do so to make
it easier to exploit as library code.

We'll port the x86 NMI backtrace logic to it in a later patch.
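
Pieced together from the comments below, the intended calling sequence
is (illustrative only; error handling trimmed):

	/* Requesting CPU, task context */
	if (printk_nmi_backtrace_prepare())
		return;				/* someone else is tracing */
	/* ... send the NMIs/IPIs and wait for the targets to finish ... */
	printk_nmi_backtrace_complete();	/* replays the buffered output */

	/* Each targeted CPU, inside its NMI handler */
	printk_nmi_backtrace_this_cpu_begin();
	show_regs(regs);			/* printk() lands in a per-cpu seq_buf */
	printk_nmi_backtrace_this_cpu_end();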

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/printk.h        |  20 ++++++
 init/Kconfig                  |   3 +
 kernel/printk/Makefile        |   1 +
 kernel/printk/nmi_backtrace.c | 148 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 172 insertions(+)
 create mode 100644 kernel/printk/nmi_backtrace.c

diff --git a/include/linux/printk.h b/include/linux/printk.h
index baa3f97d8ce8..44bb85ad1f62 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -228,6 +228,26 @@ static inline void show_regs_print_info(const char *log_lvl)
 }
 #endif
 
+#ifdef CONFIG_PRINTK_NMI_BACKTRACE
+/*
+ * printk_nmi_backtrace_prepare/complete are called to prepare the
+ * system for some or all cores to issue trace from NMI.
+ * printk_nmi_backtrace_complete will print buffered output and cannot
+ * (safely) be called from NMI.
+ */
+extern int printk_nmi_backtrace_prepare(void);
+extern void printk_nmi_backtrace_complete(void);
+
+/*
+ * printk_nmi_backtrace_this_cpu_begin/end are used to divert/restore printk
+ * on this cpu. The result is the output of printk() (by this CPU) will be
+ * stored in temporary buffers for later printing by
+ * printk_nmi_backtrace_complete.
+ */
+extern void printk_nmi_backtrace_this_cpu_begin(void);
+extern void printk_nmi_backtrace_this_cpu_end(void);
+#endif
+
 extern asmlinkage void dump_stack(void) __cold;
 
 #ifndef pr_fmt
diff --git a/init/Kconfig b/init/Kconfig
index f5dbc6d4261b..0107e9b4d2cf 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1421,6 +1421,9 @@ config PRINTK
 	  very difficult to diagnose system problems, saying N here is
 	  strongly discouraged.
 
+config PRINTK_NMI_BACKTRACE
+	bool
+
 config BUG
 	bool "BUG() support" if EXPERT
 	default y
diff --git a/kernel/printk/Makefile b/kernel/printk/Makefile
index 85405bdcf2b3..1849b001384a 100644
--- a/kernel/printk/Makefile
+++ b/kernel/printk/Makefile
@@ -1,2 +1,3 @@
 obj-y	= printk.o
+obj-$(CONFIG_PRINTK_NMI_BACKTRACE)	+= nmi_backtrace.o
 obj-$(CONFIG_A11Y_BRAILLE_CONSOLE)	+= braille.o
diff --git a/kernel/printk/nmi_backtrace.c b/kernel/printk/nmi_backtrace.c
new file mode 100644
index 000000000000..e9a06929c4f3
--- /dev/null
+++ b/kernel/printk/nmi_backtrace.c
@@ -0,0 +1,148 @@
+#include <linux/kernel.h>
+#include <linux/seq_buf.h>
+
+#define NMI_BUF_SIZE		4096
+
+struct nmi_seq_buf {
+	unsigned char		buffer[NMI_BUF_SIZE];
+	struct seq_buf		seq;
+};
+
+/* Safe printing in NMI context */
+static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);
+
+static DEFINE_PER_CPU(printk_func_t, nmi_print_saved_print_func);
+
+/* "in progress" flag of NMI printing */
+static unsigned long nmi_print_flag;
+
+static int __init printk_nmi_backtrace_init(void)
+{
+	struct nmi_seq_buf *s;
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		s = &per_cpu(nmi_print_seq, cpu);
+		seq_buf_init(&s->seq, s->buffer, NMI_BUF_SIZE);
+	}
+
+	return 0;
+}
+pure_initcall(printk_nmi_backtrace_init);
+
+/*
+ * It is not safe to call printk() directly from NMI handlers.
+ * It may be fine if the NMI detected a lock up and we have no choice
+ * but to do so, but doing a NMI on all other CPUs to get a back trace
+ * can be done with a sysrq-l. We don't want that to lock up, which
+ * can happen if the NMI interrupts a printk in progress.
+ *
+ * Instead, we redirect the vprintk() to this nmi_vprintk() that writes
+ * the content into a per cpu seq_buf buffer. Then when the NMIs are
+ * all done, we can safely dump the contents of the seq_buf to a printk()
+ * from a non NMI context.
+ *
+ * This is not a generic printk() implementation and must be used with
+ * great care. In particular there is a static limit on the quantity of
+ * data that may be emitted during NMI, only one client can be active at
+ * one time (arbitrated by the return value of printk_nmi_backtrace_prepare())
+ * and it is required that something at task or interrupt context be scheduled
+ * to issue the output.
+ */
+static int nmi_vprintk(const char *fmt, va_list args)
+{
+	struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
+	unsigned int len = seq_buf_used(&s->seq);
+
+	seq_buf_vprintf(&s->seq, fmt, args);
+	return seq_buf_used(&s->seq) - len;
+}
+
+/*
+ * Reserve the NMI printk mechanism. Return an error if some other component
+ * is already using it.
+ */
+int printk_nmi_backtrace_prepare(void)
+{
+	if (test_and_set_bit(0, &nmi_print_flag)) {
+		/*
+		 * If something is already using the NMI print facility we
+		 * can't allow a second one...
+		 */
+		return -EBUSY;
+	}
+
+	return 0;
+}
+
+static void print_seq_line(struct nmi_seq_buf *s, int start, int end)
+{
+	const char *buf = s->buffer + start;
+
+	printk("%.*s", (end - start) + 1, buf);
+}
+
+void printk_nmi_backtrace_complete(void)
+{
+	struct nmi_seq_buf *s;
+	int len, cpu, i, last_i;
+
+	/*
+	 * Now that all the NMIs have triggered, we can dump out their
+	 * back traces safely to the console.
+	 */
+	for_each_possible_cpu(cpu) {
+		s = &per_cpu(nmi_print_seq, cpu);
+		last_i = 0;
+
+		len = seq_buf_used(&s->seq);
+		if (!len)
+			continue;
+
+		/* Print line by line. */
+		for (i = 0; i < len; i++) {
+			if (s->buffer[i] == '\n') {
+				print_seq_line(s, last_i, i);
+				last_i = i + 1;
+			}
+		}
+		/* Check if there was a partial line. */
+		if (last_i < len) {
+			print_seq_line(s, last_i, len - 1);
+			pr_cont("\n");
+		}
+
+		/* Wipe out the buffer ready for the next time around. */
+		seq_buf_clear(&s->seq);
+	}
+
+	clear_bit(0, &nmi_print_flag);
+	smp_mb__after_atomic();
+}
+
+void printk_nmi_backtrace_this_cpu_begin(void)
+{
+	/*
+	 * Detect double-begins and report them. This code is unsafe (because
+	 * it will print from NMI) but things are pretty badly damaged if the
+	 * NMI re-enters and is somehow granted permission to use NMI printk,
+	 * so how much worse can it get? Also since this code interferes with
+	 * the operation of printk it is unlikely that any consequential
+	 * failures will be able to log anything making this our last
+	 * opportunity to tell anyone that something is wrong.
+	 */
+	if (this_cpu_read(nmi_print_saved_print_func)) {
+		this_cpu_write(printk_func,
+			       this_cpu_read(nmi_print_saved_print_func));
+		BUG();
+	}
+
+	this_cpu_write(nmi_print_saved_print_func, this_cpu_read(printk_func));
+	this_cpu_write(printk_func, nmi_vprintk);
+}
+
+void printk_nmi_backtrace_this_cpu_end(void)
+{
+	this_cpu_write(printk_func, this_cpu_read(nmi_print_saved_print_func));
+	this_cpu_write(nmi_print_saved_print_func, NULL);
+}
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 4.0-rc2 v18 4/6] printk: Simple implementation for NMI backtracing
@ 2015-03-12 13:39     ` Daniel Thompson
  0 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-12 13:39 UTC (permalink / raw)
  To: linux-arm-kernel

Currently there is quite a pile of code sitting in
arch/x86/kernel/apic/hw_nmi.c to support safe all-cpu backtracing from NMI.
The code is inaccessible to backtrace implementations for other
architectures, which is a shame because they would probably like to be
safe too.

Copy this code into printk, reworking it a little as we do so to make
it easier to exploit as library code.

We'll port the x86 NMI backtrace logic to it in a later patch.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/printk.h        |  20 ++++++
 init/Kconfig                  |   3 +
 kernel/printk/Makefile        |   1 +
 kernel/printk/nmi_backtrace.c | 148 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 172 insertions(+)
 create mode 100644 kernel/printk/nmi_backtrace.c

diff --git a/include/linux/printk.h b/include/linux/printk.h
index baa3f97d8ce8..44bb85ad1f62 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -228,6 +228,26 @@ static inline void show_regs_print_info(const char *log_lvl)
 }
 #endif
 
+#ifdef CONFIG_PRINTK_NMI_BACKTRACE
+/*
+ * printk_nmi_backtrace_prepare/complete are called to prepare the
+ * system for some or all cores to issue trace from NMI.
+ * printk_nmi_backtrace_complete will print buffered output and cannot
+ * (safely) be called from NMI.
+ */
+extern int printk_nmi_backtrace_prepare(void);
+extern void printk_nmi_backtrace_complete(void);
+
+/*
+ * printk_nmi_backtrace_this_cpu_begin/end are used to divert/restore printk
+ * on this cpu. The result is the output of printk() (by this CPU) will be
+ * stored in temporary buffers for later printing by
+ * printk_nmi_backtrace_complete.
+ */
+extern void printk_nmi_backtrace_this_cpu_begin(void);
+extern void printk_nmi_backtrace_this_cpu_end(void);
+#endif
+
 extern asmlinkage void dump_stack(void) __cold;
 
 #ifndef pr_fmt
diff --git a/init/Kconfig b/init/Kconfig
index f5dbc6d4261b..0107e9b4d2cf 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1421,6 +1421,9 @@ config PRINTK
 	  very difficult to diagnose system problems, saying N here is
 	  strongly discouraged.
 
+config PRINTK_NMI_BACKTRACE
+	bool
+
 config BUG
 	bool "BUG() support" if EXPERT
 	default y
diff --git a/kernel/printk/Makefile b/kernel/printk/Makefile
index 85405bdcf2b3..1849b001384a 100644
--- a/kernel/printk/Makefile
+++ b/kernel/printk/Makefile
@@ -1,2 +1,3 @@
 obj-y	= printk.o
+obj-$(CONFIG_PRINTK_NMI_BACKTRACE)	+= nmi_backtrace.o
 obj-$(CONFIG_A11Y_BRAILLE_CONSOLE)	+= braille.o
diff --git a/kernel/printk/nmi_backtrace.c b/kernel/printk/nmi_backtrace.c
new file mode 100644
index 000000000000..e9a06929c4f3
--- /dev/null
+++ b/kernel/printk/nmi_backtrace.c
@@ -0,0 +1,148 @@
+#include <linux/kernel.h>
+#include <linux/seq_buf.h>
+
+#define NMI_BUF_SIZE		4096
+
+struct nmi_seq_buf {
+	unsigned char		buffer[NMI_BUF_SIZE];
+	struct seq_buf		seq;
+};
+
+/* Safe printing in NMI context */
+static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);
+
+static DEFINE_PER_CPU(printk_func_t, nmi_print_saved_print_func);
+
+/* "in progress" flag of NMI printing */
+static unsigned long nmi_print_flag;
+
+static int __init printk_nmi_backtrace_init(void)
+{
+	struct nmi_seq_buf *s;
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		s = &per_cpu(nmi_print_seq, cpu);
+		seq_buf_init(&s->seq, s->buffer, NMI_BUF_SIZE);
+	}
+
+	return 0;
+}
+pure_initcall(printk_nmi_backtrace_init);
+
+/*
+ * It is not safe to call printk() directly from NMI handlers.
+ * It may be fine if the NMI detected a lock up and we have no choice
+ * but to do so, but doing a NMI on all other CPUs to get a back trace
+ * can be done with a sysrq-l. We don't want that to lock up, which
+ * can happen if the NMI interrupts a printk in progress.
+ *
+ * Instead, we redirect the vprintk() to this nmi_vprintk() that writes
+ * the content into a per cpu seq_buf buffer. Then when the NMIs are
+ * all done, we can safely dump the contents of the seq_buf to a printk()
+ * from a non NMI context.
+ *
+ * This is not a generic printk() implementation and must be used with
+ * great care. In particular there is a static limit on the quantity of
+ * data that may be emitted during NMI, only one client can be active at
+ * one time (arbitrated by the return value of printk_nmi_backtrace_prepare())
+ * and it is required that something at task or interrupt context be scheduled
+ * to issue the output.
+ */
+static int nmi_vprintk(const char *fmt, va_list args)
+{
+	struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
+	unsigned int len = seq_buf_used(&s->seq);
+
+	seq_buf_vprintf(&s->seq, fmt, args);
+	return seq_buf_used(&s->seq) - len;
+}
+
+/*
+ * Reserve the NMI printk mechanism. Return an error if some other component
+ * is already using it.
+ */
+int printk_nmi_backtrace_prepare(void)
+{
+	if (test_and_set_bit(0, &nmi_print_flag)) {
+		/*
+		 * If something is already using the NMI print facility we
+		 * can't allow a second one...
+		 */
+		return -EBUSY;
+	}
+
+	return 0;
+}
+
+static void print_seq_line(struct nmi_seq_buf *s, int start, int end)
+{
+	const char *buf = s->buffer + start;
+
+	printk("%.*s", (end - start) + 1, buf);
+}
+
+void printk_nmi_backtrace_complete(void)
+{
+	struct nmi_seq_buf *s;
+	int len, cpu, i, last_i;
+
+	/*
+	 * Now that all the NMIs have triggered, we can dump out their
+	 * back traces safely to the console.
+	 */
+	for_each_possible_cpu(cpu) {
+		s = &per_cpu(nmi_print_seq, cpu);
+		last_i = 0;
+
+		len = seq_buf_used(&s->seq);
+		if (!len)
+			continue;
+
+		/* Print line by line. */
+		for (i = 0; i < len; i++) {
+			if (s->buffer[i] == '\n') {
+				print_seq_line(s, last_i, i);
+				last_i = i + 1;
+			}
+		}
+		/* Check if there was a partial line. */
+		if (last_i < len) {
+			print_seq_line(s, last_i, len - 1);
+			pr_cont("\n");
+		}
+
+		/* Wipe out the buffer ready for the next time around. */
+		seq_buf_clear(&s->seq);
+	}
+
+	clear_bit(0, &nmi_print_flag);
+	smp_mb__after_atomic();
+}
+
+void printk_nmi_backtrace_this_cpu_begin(void)
+{
+	/*
+	 * Detect double-begins and report them. This code is unsafe (because
+	 * it will print from NMI) but things are pretty badly damaged if the
+	 * NMI re-enters and is somehow granted permission to use NMI printk,
+	 * so how much worse can it get? Also since this code interferes with
+	 * the operation of printk it is unlikely that any consequential
+	 * failures will be able to log anything making this our last
+	 * opportunity to tell anyone that something is wrong.
+	 */
+	if (this_cpu_read(nmi_print_saved_print_func)) {
+		this_cpu_write(printk_func,
+			       this_cpu_read(nmi_print_saved_print_func));
+		BUG();
+	}
+
+	this_cpu_write(nmi_print_saved_print_func, this_cpu_read(printk_func));
+	this_cpu_write(printk_func, nmi_vprintk);
+}
+
+void printk_nmi_backtrace_this_cpu_end(void)
+{
+	this_cpu_write(printk_func, this_cpu_read(nmi_print_saved_print_func));
+	this_cpu_write(nmi_print_saved_print_func, NULL);
+}
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 4.0-rc2 v18 5/6] x86/nmi: Use common printk functions
  2015-03-12 13:39   ` Daniel Thompson
@ 2015-03-12 13:39     ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-12 13:39 UTC (permalink / raw)
  To: Thomas Gleixner, Jason Cooper
  Cc: Daniel Thompson, Russell King, Will Deacon, Catalin Marinas,
	Marc Zyngier, Stephen Boyd, John Stultz, Steven Rostedt,
	linux-kernel, linux-arm-kernel, patches, linaro-kernel,
	Sumit Semwal, Dirk Behme, Daniel Drake, Dmitry Pervushin,
	Tim Sander, H. Peter Anvin, x86

Much of the code sitting in arch/x86/kernel/apic/hw_nmi.c to support safe
all-cpu backtracing from NMI has been copied to printk.c to make it
accessible to other architectures.

Port the x86 NMI backtrace to the generic code.
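
Condensed from the diff below, the per-cpu NMI handler's printk
diversion reduces to a pair of library calls:

	printk_nmi_backtrace_this_cpu_begin();
	printk(KERN_WARNING "NMI backtrace for cpu %d\n", cpu);
	show_regs(regs);
	printk_nmi_backtrace_this_cpu_end();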

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
---
 arch/x86/Kconfig              |   1 +
 arch/x86/kernel/apic/hw_nmi.c | 101 +++---------------------------------------
 2 files changed, 8 insertions(+), 94 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index c2fb8a87dccb..15278140833b 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -141,6 +141,7 @@ config X86
 	select ACPI_LEGACY_TABLES_LOOKUP if ACPI
 	select X86_FEATURE_NAMES if PROC_FS
 	select SRCU
+	select PRINTK_NMI_BACKTRACE if X86_LOCAL_APIC
 
 config INSTRUCTION_DECODER
 	def_bool y
diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index 6873ab925d00..db934f9461ed 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -30,40 +30,16 @@ u64 hw_nmi_get_sample_period(int watchdog_thresh)
 #ifdef arch_trigger_all_cpu_backtrace
 /* For reliability, we're prepared to waste bits here. */
 static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
-static cpumask_t printtrace_mask;
-
-#define NMI_BUF_SIZE		4096
-
-struct nmi_seq_buf {
-	unsigned char		buffer[NMI_BUF_SIZE];
-	struct seq_buf		seq;
-};
-
-/* Safe printing in NMI context */
-static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);
-
-/* "in progress" flag of arch_trigger_all_cpu_backtrace */
-static unsigned long backtrace_flag;
-
-static void print_seq_line(struct nmi_seq_buf *s, int start, int end)
-{
-	const char *buf = s->buffer + start;
-
-	printk("%.*s", (end - start) + 1, buf);
-}
 
 void arch_trigger_all_cpu_backtrace(bool include_self)
 {
-	struct nmi_seq_buf *s;
-	int len;
-	int cpu;
 	int i;
 	int this_cpu = get_cpu();
 
-	if (test_and_set_bit(0, &backtrace_flag)) {
+	if (0 != printk_nmi_backtrace_prepare()) {
 		/*
-		 * If there is already a trigger_all_cpu_backtrace() in progress
-		 * (backtrace_flag == 1), don't output double cpu dump infos.
+		 * If there is already an nmi printk sequence in
+		 * progress then just give up...
 		 */
 		put_cpu();
 		return;
@@ -73,16 +49,6 @@ void arch_trigger_all_cpu_backtrace(bool include_self)
 	if (!include_self)
 		cpumask_clear_cpu(this_cpu, to_cpumask(backtrace_mask));
 
-	cpumask_copy(&printtrace_mask, to_cpumask(backtrace_mask));
-	/*
-	 * Set up per_cpu seq_buf buffers that the NMIs running on the other
-	 * CPUs will write to.
-	 */
-	for_each_cpu(cpu, to_cpumask(backtrace_mask)) {
-		s = &per_cpu(nmi_print_seq, cpu);
-		seq_buf_init(&s->seq, s->buffer, NMI_BUF_SIZE);
-	}
-
 	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
 		pr_info("sending NMI to %s CPUs:\n",
 			(include_self ? "all" : "other"));
@@ -97,73 +63,20 @@ void arch_trigger_all_cpu_backtrace(bool include_self)
 		touch_softlockup_watchdog();
 	}
 
-	/*
-	 * Now that all the NMIs have triggered, we can dump out their
-	 * back traces safely to the console.
-	 */
-	for_each_cpu(cpu, &printtrace_mask) {
-		int last_i = 0;
-
-		s = &per_cpu(nmi_print_seq, cpu);
-		len = seq_buf_used(&s->seq);
-		if (!len)
-			continue;
-
-		/* Print line by line. */
-		for (i = 0; i < len; i++) {
-			if (s->buffer[i] == '\n') {
-				print_seq_line(s, last_i, i);
-				last_i = i + 1;
-			}
-		}
-		/* Check if there was a partial line. */
-		if (last_i < len) {
-			print_seq_line(s, last_i, len - 1);
-			pr_cont("\n");
-		}
-	}
-
-	clear_bit(0, &backtrace_flag);
-	smp_mb__after_atomic();
+	printk_nmi_backtrace_complete();
 	put_cpu();
 }
 
-/*
- * It is not safe to call printk() directly from NMI handlers.
- * It may be fine if the NMI detected a lock up and we have no choice
- * but to do so, but doing a NMI on all other CPUs to get a back trace
- * can be done with a sysrq-l. We don't want that to lock up, which
- * can happen if the NMI interrupts a printk in progress.
- *
- * Instead, we redirect the vprintk() to this nmi_vprintk() that writes
- * the content into a per cpu seq_buf buffer. Then when the NMIs are
- * all done, we can safely dump the contents of the seq_buf to a printk()
- * from a non NMI context.
- */
-static int nmi_vprintk(const char *fmt, va_list args)
-{
-	struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
-	unsigned int len = seq_buf_used(&s->seq);
-
-	seq_buf_vprintf(&s->seq, fmt, args);
-	return seq_buf_used(&s->seq) - len;
-}
-
 static int
 arch_trigger_all_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)
 {
-	int cpu;
-
-	cpu = smp_processor_id();
+	int cpu = smp_processor_id();
 
 	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
-		printk_func_t printk_func_save = this_cpu_read(printk_func);
-
-		/* Replace printk to write into the NMI seq */
-		this_cpu_write(printk_func, nmi_vprintk);
+		printk_nmi_backtrace_this_cpu_begin();
 		printk(KERN_WARNING "NMI backtrace for cpu %d\n", cpu);
 		show_regs(regs);
-		this_cpu_write(printk_func, printk_func_save);
+		printk_nmi_backtrace_this_cpu_end();
 
 		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
 		return NMI_HANDLED;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 4.0-rc2 v18 5/6] x86/nmi: Use common printk functions
@ 2015-03-12 13:39     ` Daniel Thompson
  0 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-12 13:39 UTC (permalink / raw)
  To: linux-arm-kernel

Much of the code sitting in arch/x86/kernel/apic/hw_nmi.c to support safe
all-cpu backtracing from NMI has been copied to printk.c to make it
accessible to other architectures.

Port the x86 NMI backtrace to the generic code.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86 at kernel.org
---
 arch/x86/Kconfig              |   1 +
 arch/x86/kernel/apic/hw_nmi.c | 101 +++---------------------------------------
 2 files changed, 8 insertions(+), 94 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index c2fb8a87dccb..15278140833b 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -141,6 +141,7 @@ config X86
 	select ACPI_LEGACY_TABLES_LOOKUP if ACPI
 	select X86_FEATURE_NAMES if PROC_FS
 	select SRCU
+	select PRINTK_NMI_BACKTRACE if X86_LOCAL_APIC
 
 config INSTRUCTION_DECODER
 	def_bool y
diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index 6873ab925d00..db934f9461ed 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -30,40 +30,16 @@ u64 hw_nmi_get_sample_period(int watchdog_thresh)
 #ifdef arch_trigger_all_cpu_backtrace
 /* For reliability, we're prepared to waste bits here. */
 static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
-static cpumask_t printtrace_mask;
-
-#define NMI_BUF_SIZE		4096
-
-struct nmi_seq_buf {
-	unsigned char		buffer[NMI_BUF_SIZE];
-	struct seq_buf		seq;
-};
-
-/* Safe printing in NMI context */
-static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);
-
-/* "in progress" flag of arch_trigger_all_cpu_backtrace */
-static unsigned long backtrace_flag;
-
-static void print_seq_line(struct nmi_seq_buf *s, int start, int end)
-{
-	const char *buf = s->buffer + start;
-
-	printk("%.*s", (end - start) + 1, buf);
-}
 
 void arch_trigger_all_cpu_backtrace(bool include_self)
 {
-	struct nmi_seq_buf *s;
-	int len;
-	int cpu;
 	int i;
 	int this_cpu = get_cpu();
 
-	if (test_and_set_bit(0, &backtrace_flag)) {
+	if (0 != printk_nmi_backtrace_prepare()) {
 		/*
-		 * If there is already a trigger_all_cpu_backtrace() in progress
-		 * (backtrace_flag == 1), don't output double cpu dump infos.
+		 * If there is already an nmi printk sequence in
+		 * progress then just give up...
 		 */
 		put_cpu();
 		return;
@@ -73,16 +49,6 @@ void arch_trigger_all_cpu_backtrace(bool include_self)
 	if (!include_self)
 		cpumask_clear_cpu(this_cpu, to_cpumask(backtrace_mask));
 
-	cpumask_copy(&printtrace_mask, to_cpumask(backtrace_mask));
-	/*
-	 * Set up per_cpu seq_buf buffers that the NMIs running on the other
-	 * CPUs will write to.
-	 */
-	for_each_cpu(cpu, to_cpumask(backtrace_mask)) {
-		s = &per_cpu(nmi_print_seq, cpu);
-		seq_buf_init(&s->seq, s->buffer, NMI_BUF_SIZE);
-	}
-
 	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
 		pr_info("sending NMI to %s CPUs:\n",
 			(include_self ? "all" : "other"));
@@ -97,73 +63,20 @@ void arch_trigger_all_cpu_backtrace(bool include_self)
 		touch_softlockup_watchdog();
 	}
 
-	/*
-	 * Now that all the NMIs have triggered, we can dump out their
-	 * back traces safely to the console.
-	 */
-	for_each_cpu(cpu, &printtrace_mask) {
-		int last_i = 0;
-
-		s = &per_cpu(nmi_print_seq, cpu);
-		len = seq_buf_used(&s->seq);
-		if (!len)
-			continue;
-
-		/* Print line by line. */
-		for (i = 0; i < len; i++) {
-			if (s->buffer[i] == '\n') {
-				print_seq_line(s, last_i, i);
-				last_i = i + 1;
-			}
-		}
-		/* Check if there was a partial line. */
-		if (last_i < len) {
-			print_seq_line(s, last_i, len - 1);
-			pr_cont("\n");
-		}
-	}
-
-	clear_bit(0, &backtrace_flag);
-	smp_mb__after_atomic();
+	printk_nmi_backtrace_complete();
 	put_cpu();
 }
 
-/*
- * It is not safe to call printk() directly from NMI handlers.
- * It may be fine if the NMI detected a lock up and we have no choice
- * but to do so, but doing a NMI on all other CPUs to get a back trace
- * can be done with a sysrq-l. We don't want that to lock up, which
- * can happen if the NMI interrupts a printk in progress.
- *
- * Instead, we redirect the vprintk() to this nmi_vprintk() that writes
- * the content into a per cpu seq_buf buffer. Then when the NMIs are
- * all done, we can safely dump the contents of the seq_buf to a printk()
- * from a non NMI context.
- */
-static int nmi_vprintk(const char *fmt, va_list args)
-{
-	struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
-	unsigned int len = seq_buf_used(&s->seq);
-
-	seq_buf_vprintf(&s->seq, fmt, args);
-	return seq_buf_used(&s->seq) - len;
-}
-
 static int
 arch_trigger_all_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)
 {
-	int cpu;
-
-	cpu = smp_processor_id();
+	int cpu = smp_processor_id();
 
 	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
-		printk_func_t printk_func_save = this_cpu_read(printk_func);
-
-		/* Replace printk to write into the NMI seq */
-		this_cpu_write(printk_func, nmi_vprintk);
+		printk_nmi_backtrace_this_cpu_begin();
 		printk(KERN_WARNING "NMI backtrace for cpu %d\n", cpu);
 		show_regs(regs);
-		this_cpu_write(printk_func, printk_func_save);
+		printk_nmi_backtrace_this_cpu_end();
 
 		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
 		return NMI_HANDLED;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 4.0-rc2 v18 6/6] ARM: Add support for on-demand backtrace of other CPUs
  2015-03-12 13:39   ` Daniel Thompson
@ 2015-03-12 13:39     ` Daniel Thompson
  -1 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-12 13:39 UTC (permalink / raw)
  To: Thomas Gleixner, Jason Cooper
  Cc: Daniel Thompson, Russell King, Will Deacon, Catalin Marinas,
	Marc Zyngier, Stephen Boyd, John Stultz, Steven Rostedt,
	linux-kernel, linux-arm-kernel, patches, linaro-kernel,
	Sumit Semwal, Dirk Behme, Daniel Drake, Dmitry Pervushin,
	Tim Sander

Replicate the x86 code to trigger a backtrace using an NMI and hook
it up to IPI on ARM.

The code differs slightly from the code on x86 because, on ARM, we do
not know at compile time whether a platform is capable of supporting
FIQ. We must also avoid using an IPI to request a backtrace from the
CPU on which the backtrace was requested if interrupts are disabled
there, falling back to generating it directly.
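
The fallback in question is a short guard inside
arch_trigger_all_cpu_backtrace(), reproduced from the diff below:

	/* Handle our own backtrace directly if IPIs cannot reach us */
	if (include_self && irqs_disabled()) {
		ipi_cpu_backtrace(in_interrupt() ? get_irq_regs() : NULL);
		include_self = false;
	}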

In addition to the implementation of arch_trigger_all_cpu_backtrace(),
the patch also includes a few small items of plumbing that must be
hooked up for the new code to work.

Credit:
  Russell King provided the initial prototype implementing this feature
  for ARM. Today the patch has been reworked and, mostly, rewritten to
  keep it aligned with x86. However, this patch still includes some
  code from Russell's original prototype.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
 arch/arm/Kconfig               |  1 +
 arch/arm/include/asm/hardirq.h |  2 +-
 arch/arm/include/asm/irq.h     |  5 +++
 arch/arm/include/asm/smp.h     |  3 ++
 arch/arm/kernel/smp.c          | 81 ++++++++++++++++++++++++++++++++++++++++++
 arch/arm/kernel/traps.c        |  3 ++
 6 files changed, 94 insertions(+), 1 deletion(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 9f1f09a2bc9b..f3c95a44945d 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -76,6 +76,7 @@ config ARM
 	select OLD_SIGACTION
 	select OLD_SIGSUSPEND3
 	select PERF_USE_VMALLOC
+	select PRINTK_NMI_BACKTRACE
 	select RTC_LIB
 	select SYS_SUPPORTS_APM_EMULATION
 	# Above selects are sorted alphabetically; please add new ones
diff --git a/arch/arm/include/asm/hardirq.h b/arch/arm/include/asm/hardirq.h
index fe3ea776dc34..5df33e30ae1b 100644
--- a/arch/arm/include/asm/hardirq.h
+++ b/arch/arm/include/asm/hardirq.h
@@ -5,7 +5,7 @@
 #include <linux/threads.h>
 #include <asm/irq.h>
 
-#define NR_IPI	8
+#define NR_IPI	9
 
 typedef struct {
 	unsigned int __softirq_pending;
diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h
index 53c15dec7af6..be1d07d59ee9 100644
--- a/arch/arm/include/asm/irq.h
+++ b/arch/arm/include/asm/irq.h
@@ -35,6 +35,11 @@ extern void (*handle_arch_irq)(struct pt_regs *);
 extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
 #endif
 
+#ifdef CONFIG_SMP
+extern void arch_trigger_all_cpu_backtrace(bool);
+#define arch_trigger_all_cpu_backtrace(x) arch_trigger_all_cpu_backtrace(x)
+#endif
+
 #endif
 
 #endif
diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
index 18f5a554134f..b076584ac0fa 100644
--- a/arch/arm/include/asm/smp.h
+++ b/arch/arm/include/asm/smp.h
@@ -18,6 +18,8 @@
 # error "<asm/smp.h> included in non-SMP build"
 #endif
 
+#define SMP_IPI_FIQ_MASK 0x0100
+
 #define raw_smp_processor_id() (current_thread_info()->cpu)
 
 struct seq_file;
@@ -79,6 +81,7 @@ extern void arch_send_call_function_single_ipi(int cpu);
 extern void arch_send_call_function_ipi_mask(const struct cpumask *mask);
 extern void arch_send_wakeup_ipi_mask(const struct cpumask *mask);
 
+extern void ipi_cpu_backtrace(struct pt_regs *regs);
 extern int register_ipi_completion(struct completion *completion, int cpu);
 
 struct smp_operations {
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 86ef244c5a24..7eb6241e99d1 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -26,6 +26,7 @@
 #include <linux/completion.h>
 #include <linux/cpufreq.h>
 #include <linux/irq_work.h>
+#include <linux/seq_buf.h>
 
 #include <linux/atomic.h>
 #include <asm/smp.h>
@@ -72,6 +73,7 @@ enum ipi_msg_type {
 	IPI_CPU_STOP,
 	IPI_IRQ_WORK,
 	IPI_COMPLETION,
+	IPI_CPU_BACKTRACE,
 };
 
 static DECLARE_COMPLETION(cpu_running);
@@ -456,6 +458,7 @@ static const char *ipi_types[NR_IPI] __tracepoint_string = {
 	S(IPI_CPU_STOP, "CPU stop interrupts"),
 	S(IPI_IRQ_WORK, "IRQ work interrupts"),
 	S(IPI_COMPLETION, "completion interrupts"),
+	S(IPI_CPU_BACKTRACE, "backtrace interrupts"),
 };
 
 static void smp_cross_call(const struct cpumask *target, unsigned int ipinr)
@@ -570,6 +573,8 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 	unsigned int cpu = smp_processor_id();
 	struct pt_regs *old_regs = set_irq_regs(regs);
 
+	BUILD_BUG_ON(SMP_IPI_FIQ_MASK != BIT(IPI_CPU_BACKTRACE));
+
 	if ((unsigned)ipinr < NR_IPI) {
 		trace_ipi_entry(ipi_types[ipinr]);
 		__inc_irq_stat(cpu, ipi_irqs[ipinr]);
@@ -623,6 +628,12 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 		irq_exit();
 		break;
 
+	case IPI_CPU_BACKTRACE:
+		irq_enter();
+		ipi_cpu_backtrace(regs);
+		irq_exit();
+		break;
+
 	default:
 		pr_crit("CPU%u: Unknown IPI message 0x%x\n",
 		        cpu, ipinr);
@@ -717,3 +728,73 @@ static int __init register_cpufreq_notifier(void)
 core_initcall(register_cpufreq_notifier);
 
 #endif
+
+/* For reliability, we're prepared to waste bits here. */
+static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
+
+void arch_trigger_all_cpu_backtrace(bool include_self)
+{
+	int i;
+	int this_cpu = get_cpu();
+
+	if (0 != printk_nmi_backtrace_prepare()) {
+		/*
+		 * If there is already an nmi printk sequence in
+		 * progress then just give up...
+		 */
+		put_cpu();
+		return;
+	}
+
+	cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask);
+
+	/*
+	 * If irqs are disabled on the current processor and
+	 * IPI_CPU_BACKTRACE is delivered using IRQ then we won't be able to
+	 * react to IPI_CPU_BACKTRACE until we leave this function. This
+	 * would force us to get stuck and, eventually, timeout. We avoid
+	 * the timeout (and the resulting failure to print useful information)
+	 * by calling the backtrace logic directly whenever irqs are disabled.
+	 */
+	if (include_self && irqs_disabled()) {
+		ipi_cpu_backtrace(in_interrupt() ? get_irq_regs() : NULL);
+		include_self = false;
+	}
+
+	if (!include_self)
+		cpumask_clear_cpu(this_cpu, to_cpumask(backtrace_mask));
+
+	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
+		pr_info("Sending FIQ to %s CPUs:\n",
+			(include_self ? "all" : "other"));
+		smp_cross_call(to_cpumask(backtrace_mask), IPI_CPU_BACKTRACE);
+	}
+
+	/* Wait for up to 10 seconds for all CPUs to do the backtrace */
+	for (i = 0; i < 10 * 1000; i++) {
+		if (cpumask_empty(to_cpumask(backtrace_mask)))
+			break;
+		mdelay(1);
+		touch_softlockup_watchdog();
+	}
+
+	printk_nmi_backtrace_complete();
+	put_cpu();
+}
+
+void ipi_cpu_backtrace(struct pt_regs *regs)
+{
+	int cpu = smp_processor_id();
+
+	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
+		printk_nmi_backtrace_this_cpu_begin();
+		pr_warn("FIQ backtrace for cpu %d\n", cpu);
+		if (regs != NULL)
+			show_regs(regs);
+		else
+			dump_stack();
+		printk_nmi_backtrace_this_cpu_end();
+
+		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
+	}
+}
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index b35e220ae1b1..1836415b8a5c 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -483,6 +483,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
 #ifdef CONFIG_ARM_GIC
 	gic_handle_fiq_ipi();
 #endif
+#ifdef CONFIG_SMP
+	ipi_cpu_backtrace(regs);
+#endif
 
 	nmi_exit();
 
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 4.0-rc2 v18 6/6] ARM: Add support for on-demand backtrace of other CPUs
@ 2015-03-12 13:39     ` Daniel Thompson
  0 siblings, 0 replies; 94+ messages in thread
From: Daniel Thompson @ 2015-03-12 13:39 UTC (permalink / raw)
  To: linux-arm-kernel

Replicate the x86 code to trigger a backtrace using an NMI and hook
it up to IPI on ARM.

The code differs slightly from the code on x86 because, on ARM, we do
not know at compile time whether a platform is capable of supporting
FIQ. If interrupts are disabled on the requesting CPU we must avoid
sending an IPI to request a backtrace from that CPU, and instead fall
back to generating the backtrace directly.
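
Concretely, the fallback amounts to the following (condensed here for
illustration; the full logic is in the smp.c hunk below):

	/* With irqs off we could not take our own IPI: dump directly. */
	if (include_self && irqs_disabled()) {
		ipi_cpu_backtrace(in_interrupt() ? get_irq_regs() : NULL);
		include_self = false;
	}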

In addition to the implementation of arch_trigger_all_cpu_backtrace(),
the patch also includes a few small items of plumbing that must be
hooked up for the new code to work.
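
One such item of plumbing is the SMP_IPI_FIQ_MASK constant consumed by
the GIC FIQ code. Its value follows from IPI_CPU_BACKTRACE being
appended after the existing IPI types, and the new BUILD_BUG_ON() in
handle_IPI() enforces the relationship. Illustratively:

	/*
	 * IPI_CPU_BACKTRACE is appended after the eight pre-existing IPI
	 * types, so it takes value 8 and the FIQ mask is BIT(8) == 0x0100.
	 */
	#define SMP_IPI_FIQ_MASK 0x0100

	BUILD_BUG_ON(SMP_IPI_FIQ_MASK != BIT(IPI_CPU_BACKTRACE));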

Credit:
  Russell King provided the initial prototype implementing this feature
  for ARM. Today the patch has been reworked and, mostly, rewritten to
  keep it aligned with x86. However, this patch does still include some
  code from Russell's original prototype.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
 arch/arm/Kconfig               |  1 +
 arch/arm/include/asm/hardirq.h |  2 +-
 arch/arm/include/asm/irq.h     |  5 +++
 arch/arm/include/asm/smp.h     |  3 ++
 arch/arm/kernel/smp.c          | 81 ++++++++++++++++++++++++++++++++++++++++++
 arch/arm/kernel/traps.c        |  3 ++
 6 files changed, 94 insertions(+), 1 deletion(-)
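
For context, generic code reaches the new hook through the
trigger_all_cpu_backtrace() wrappers (used, for example, by the SysRq-l
handler). Shown roughly rather than verbatim, the generic side in
include/linux/nmi.h looks like the sketch below; this is also why
asm/irq.h in this patch both declares the function and defines a
same-named macro for the #ifdef to see:

	#ifdef arch_trigger_all_cpu_backtrace
	static inline bool trigger_all_cpu_backtrace(void)
	{
		arch_trigger_all_cpu_backtrace(true);	/* include this CPU */
		return true;
	}

	static inline bool trigger_allbutself_cpu_backtrace(void)
	{
		arch_trigger_all_cpu_backtrace(false);	/* skip this CPU */
		return true;
	}
	#else
	/* No arch hook available: report "not supported". */
	static inline bool trigger_all_cpu_backtrace(void)
	{
		return false;
	}

	static inline bool trigger_allbutself_cpu_backtrace(void)
	{
		return false;
	}
	#endif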

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 9f1f09a2bc9b..f3c95a44945d 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -76,6 +76,7 @@ config ARM
 	select OLD_SIGACTION
 	select OLD_SIGSUSPEND3
 	select PERF_USE_VMALLOC
+	select PRINTK_NMI_BACKTRACE
 	select RTC_LIB
 	select SYS_SUPPORTS_APM_EMULATION
 	# Above selects are sorted alphabetically; please add new ones
diff --git a/arch/arm/include/asm/hardirq.h b/arch/arm/include/asm/hardirq.h
index fe3ea776dc34..5df33e30ae1b 100644
--- a/arch/arm/include/asm/hardirq.h
+++ b/arch/arm/include/asm/hardirq.h
@@ -5,7 +5,7 @@
 #include <linux/threads.h>
 #include <asm/irq.h>
 
-#define NR_IPI	8
+#define NR_IPI	9
 
 typedef struct {
 	unsigned int __softirq_pending;
diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h
index 53c15dec7af6..be1d07d59ee9 100644
--- a/arch/arm/include/asm/irq.h
+++ b/arch/arm/include/asm/irq.h
@@ -35,6 +35,11 @@ extern void (*handle_arch_irq)(struct pt_regs *);
 extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
 #endif
 
+#ifdef CONFIG_SMP
+extern void arch_trigger_all_cpu_backtrace(bool);
+#define arch_trigger_all_cpu_backtrace(x) arch_trigger_all_cpu_backtrace(x)
+#endif
+
 #endif
 
 #endif
diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
index 18f5a554134f..b076584ac0fa 100644
--- a/arch/arm/include/asm/smp.h
+++ b/arch/arm/include/asm/smp.h
@@ -18,6 +18,8 @@
 # error "<asm/smp.h> included in non-SMP build"
 #endif
 
+#define SMP_IPI_FIQ_MASK 0x0100
+
 #define raw_smp_processor_id() (current_thread_info()->cpu)
 
 struct seq_file;
@@ -79,6 +81,7 @@ extern void arch_send_call_function_single_ipi(int cpu);
 extern void arch_send_call_function_ipi_mask(const struct cpumask *mask);
 extern void arch_send_wakeup_ipi_mask(const struct cpumask *mask);
 
+extern void ipi_cpu_backtrace(struct pt_regs *regs);
 extern int register_ipi_completion(struct completion *completion, int cpu);
 
 struct smp_operations {
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 86ef244c5a24..7eb6241e99d1 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -26,6 +26,7 @@
 #include <linux/completion.h>
 #include <linux/cpufreq.h>
 #include <linux/irq_work.h>
+#include <linux/seq_buf.h>
 
 #include <linux/atomic.h>
 #include <asm/smp.h>
@@ -72,6 +73,7 @@ enum ipi_msg_type {
 	IPI_CPU_STOP,
 	IPI_IRQ_WORK,
 	IPI_COMPLETION,
+	IPI_CPU_BACKTRACE,
 };
 
 static DECLARE_COMPLETION(cpu_running);
@@ -456,6 +458,7 @@ static const char *ipi_types[NR_IPI] __tracepoint_string = {
 	S(IPI_CPU_STOP, "CPU stop interrupts"),
 	S(IPI_IRQ_WORK, "IRQ work interrupts"),
 	S(IPI_COMPLETION, "completion interrupts"),
+	S(IPI_CPU_BACKTRACE, "backtrace interrupts"),
 };
 
 static void smp_cross_call(const struct cpumask *target, unsigned int ipinr)
@@ -570,6 +573,8 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 	unsigned int cpu = smp_processor_id();
 	struct pt_regs *old_regs = set_irq_regs(regs);
 
+	BUILD_BUG_ON(SMP_IPI_FIQ_MASK != BIT(IPI_CPU_BACKTRACE));
+
 	if ((unsigned)ipinr < NR_IPI) {
 		trace_ipi_entry(ipi_types[ipinr]);
 		__inc_irq_stat(cpu, ipi_irqs[ipinr]);
@@ -623,6 +628,12 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 		irq_exit();
 		break;
 
+	case IPI_CPU_BACKTRACE:
+		irq_enter();
+		ipi_cpu_backtrace(regs);
+		irq_exit();
+		break;
+
 	default:
 		pr_crit("CPU%u: Unknown IPI message 0x%x\n",
 		        cpu, ipinr);
@@ -717,3 +728,73 @@ static int __init register_cpufreq_notifier(void)
 core_initcall(register_cpufreq_notifier);
 
 #endif
+
+/* For reliability, we're prepared to waste bits here. */
+static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
+
+void arch_trigger_all_cpu_backtrace(bool include_self)
+{
+	int i;
+	int this_cpu = get_cpu();
+
+	if (printk_nmi_backtrace_prepare() != 0) {
+		/*
+		 * If there is already an nmi printk sequence in
+		 * progress then just give up...
+		 */
+		put_cpu();
+		return;
+	}
+
+	cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask);
+
+	/*
+	 * If irqs are disabled on the current processor and
+	 * IPI_CPU_BACKTRACE is delivered using IRQ then we won't be able to
+	 * react to IPI_CPU_BACKTRACE until we leave this function. This
+	 * would force us to get stuck and, eventually, time out. We avoid
+	 * the timeout (and the resulting failure to print useful information)
+	 * by calling the backtrace logic directly whenever irqs are disabled.
+	 */
+	if (include_self && irqs_disabled()) {
+		ipi_cpu_backtrace(in_interrupt() ? get_irq_regs() : NULL);
+		include_self = false;
+	}
+
+	if (!include_self)
+		cpumask_clear_cpu(this_cpu, to_cpumask(backtrace_mask));
+
+	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
+		pr_info("Sending FIQ to %s CPUs:\n",
+			(include_self ? "all" : "other"));
+		smp_cross_call(to_cpumask(backtrace_mask), IPI_CPU_BACKTRACE);
+	}
+
+	/* Wait for up to 10 seconds for all CPUs to do the backtrace */
+	for (i = 0; i < 10 * 1000; i++) {
+		if (cpumask_empty(to_cpumask(backtrace_mask)))
+			break;
+		mdelay(1);
+		touch_softlockup_watchdog();
+	}
+
+	printk_nmi_backtrace_complete();
+	put_cpu();
+}
+
+void ipi_cpu_backtrace(struct pt_regs *regs)
+{
+	int cpu = smp_processor_id();
+
+	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
+		printk_nmi_backtrace_this_cpu_begin();
+		pr_warn("FIQ backtrace for cpu %d\n", cpu);
+		if (regs != NULL)
+			show_regs(regs);
+		else
+			dump_stack();
+		printk_nmi_backtrace_this_cpu_end();
+
+		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
+	}
+}
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index b35e220ae1b1..1836415b8a5c 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -483,6 +483,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
 #ifdef CONFIG_ARM_GIC
 	gic_handle_fiq_ipi();
 #endif
+#ifdef CONFIG_SMP
+	ipi_cpu_backtrace(regs);
+#endif
 
 	nmi_exit();
 
-- 
2.1.0

Thread overview: 94+ messages
2015-01-23 14:22 [PATCH 3.19-rc2 v15 0/8] irq/arm: Implement arch_trigger_all_cpu_backtrace Daniel Thompson
2015-01-23 14:22 ` Daniel Thompson
2015-01-23 14:22 ` [PATCH 3.19-rc2 v15 1/8] irqchip: gic: Optimize locking in gic_raise_softirq Daniel Thompson
2015-01-23 14:22   ` Daniel Thompson
2015-01-23 14:22 ` [PATCH 3.19-rc2 v15 2/8] irqchip: gic: Make gic_raise_softirq FIQ-safe Daniel Thompson
2015-01-23 14:22   ` Daniel Thompson
2015-01-23 14:22 ` [PATCH 3.19-rc2 v15 3/8] irqchip: gic: Introduce plumbing for IPI FIQ Daniel Thompson
2015-01-23 14:22   ` Daniel Thompson
2015-01-23 14:22 ` [PATCH 3.19-rc2 v15 4/8] sched_clock: Avoid deadlock during read from NMI Daniel Thompson
2015-01-23 14:22   ` Daniel Thompson
2015-01-24 22:40   ` Thomas Gleixner
2015-01-24 22:40     ` Thomas Gleixner
2015-01-26 20:28     ` Daniel Thompson
2015-01-26 20:28       ` Daniel Thompson
2015-01-23 14:22 ` [PATCH 3.19-rc2 v15 5/8] printk: Simple implementation for NMI backtracing Daniel Thompson
2015-01-23 14:22   ` Daniel Thompson
2015-01-24 21:44   ` Thomas Gleixner
2015-01-24 21:44     ` Thomas Gleixner
2015-01-26 17:21     ` Daniel Thompson
2015-01-26 17:21       ` Daniel Thompson
2015-01-23 14:22 ` [PATCH 3.19-rc2 v15 6/8] x86/nmi: Use common printk functions Daniel Thompson
2015-01-23 14:22   ` Daniel Thompson
2015-01-23 14:22 ` [PATCH 3.19-rc2 v15 7/8] ARM: Add support for on-demand backtrace of other CPUs Daniel Thompson
2015-01-23 14:22   ` Daniel Thompson
2015-01-23 14:22 ` [PATCH 3.19-rc2 v15 8/8] ARM: Fix on-demand backtrace triggered by IRQ Daniel Thompson
2015-01-23 14:22   ` Daniel Thompson
2015-02-03 19:06 ` [PATCH 3.19-rc6 v16 0/6] irq/arm: Implement arch_trigger_all_cpu_backtrace Daniel Thompson
2015-02-03 19:06   ` Daniel Thompson
2015-02-03 19:06   ` [PATCH 3.19-rc6 v16 1/6] irqchip: gic: Optimize locking in gic_raise_softirq Daniel Thompson
2015-02-03 19:06     ` Daniel Thompson
2015-02-26 20:31     ` Nicolas Pitre
2015-02-26 20:31       ` Nicolas Pitre
2015-02-26 21:05       ` Daniel Thompson
2015-02-26 21:05         ` Daniel Thompson
2015-02-26 21:33         ` Nicolas Pitre
2015-02-26 21:33           ` Nicolas Pitre
2015-02-03 19:06   ` [PATCH 3.19-rc6 v16 2/6] irqchip: gic: Make gic_raise_softirq FIQ-safe Daniel Thompson
2015-02-03 19:06     ` Daniel Thompson
2015-02-26 20:33     ` Nicolas Pitre
2015-02-26 20:33       ` Nicolas Pitre
2015-02-03 19:06   ` [PATCH 3.19-rc6 v16 3/6] irqchip: gic: Introduce plumbing for IPI FIQ Daniel Thompson
2015-02-03 19:06     ` Daniel Thompson
2015-02-03 19:06   ` [PATCH 3.19-rc6 v16 4/6] printk: Simple implementation for NMI backtracing Daniel Thompson
2015-02-03 19:06     ` Daniel Thompson
2015-02-03 19:06   ` [PATCH 3.19-rc6 v16 5/6] x86/nmi: Use common printk functions Daniel Thompson
2015-02-03 19:06     ` Daniel Thompson
2015-02-03 19:06   ` [PATCH 3.19-rc6 v16 6/6] ARM: Add support for on-demand backtrace of other CPUs Daniel Thompson
2015-02-03 19:06     ` Daniel Thompson
2015-03-04 10:12 ` [PATCH 4.0-rc1 v17 0/6] irq/arm: Implement arch_trigger_all_cpu_backtrace Daniel Thompson
2015-03-04 10:12   ` Daniel Thompson
2015-03-04 10:12   ` [PATCH 4.0-rc1 v17 1/6] irqchip: gic: Optimize locking in gic_raise_softirq Daniel Thompson
2015-03-04 10:12     ` Daniel Thompson
2015-03-04 10:12   ` [PATCH 4.0-rc1 v17 2/6] irqchip: gic: Make gic_raise_softirq FIQ-safe Daniel Thompson
2015-03-04 10:12     ` Daniel Thompson
2015-03-04 10:12   ` [PATCH 4.0-rc1 v17 3/6] irqchip: gic: Introduce plumbing for IPI FIQ Daniel Thompson
2015-03-04 10:12     ` Daniel Thompson
2015-03-04 10:12   ` [PATCH 4.0-rc1 v17 4/6] printk: Simple implementation for NMI backtracing Daniel Thompson
2015-03-04 10:12     ` Daniel Thompson
2015-03-04 16:13     ` Joe Perches
2015-03-04 16:13       ` Joe Perches
2015-03-04 16:20       ` Steven Rostedt
2015-03-04 16:20         ` Steven Rostedt
2015-03-04 16:33         ` Daniel Thompson
2015-03-04 16:33           ` Daniel Thompson
2015-03-04 17:21           ` Joe Perches
2015-03-04 17:21             ` Joe Perches
2015-03-05 12:11             ` Daniel Thompson
2015-03-05 12:11               ` Daniel Thompson
2015-03-04 10:12   ` [PATCH 4.0-rc1 v17 5/6] x86/nmi: Use common printk functions Daniel Thompson
2015-03-04 10:12     ` Daniel Thompson
2015-03-05  0:54     ` Ingo Molnar
2015-03-05  0:54       ` Ingo Molnar
2015-03-05 12:29       ` Daniel Thompson
2015-03-05 12:29         ` Daniel Thompson
2015-03-05 19:46         ` Ingo Molnar
2015-03-05 19:46           ` Ingo Molnar
2015-03-06 19:02           ` Daniel Thompson
2015-03-06 19:02             ` Daniel Thompson
2015-03-04 10:12   ` [PATCH 4.0-rc1 v17 6/6] ARM: Add support for on-demand backtrace of other CPUs Daniel Thompson
2015-03-04 10:12     ` Daniel Thompson
2015-03-12 13:39 ` [PATCH 4.0-rc2 v18 0/6] irq/arm: Implement arch_trigger_all_cpu_backtrace Daniel Thompson
2015-03-12 13:39   ` Daniel Thompson
2015-03-12 13:39   ` [PATCH 4.0-rc2 v18 1/6] irqchip: gic: Optimize locking in gic_raise_softirq Daniel Thompson
2015-03-12 13:39     ` Daniel Thompson
2015-03-12 13:39   ` [PATCH 4.0-rc2 v18 2/6] irqchip: gic: Make gic_raise_softirq FIQ-safe Daniel Thompson
2015-03-12 13:39     ` Daniel Thompson
2015-03-12 13:39   ` [PATCH 4.0-rc2 v18 3/6] irqchip: gic: Introduce plumbing for IPI FIQ Daniel Thompson
2015-03-12 13:39     ` Daniel Thompson
2015-03-12 13:39   ` [PATCH 4.0-rc2 v18 4/6] printk: Simple implementation for NMI backtracing Daniel Thompson
2015-03-12 13:39     ` Daniel Thompson
2015-03-12 13:39   ` [PATCH 4.0-rc2 v18 5/6] x86/nmi: Use common printk functions Daniel Thompson
2015-03-12 13:39     ` Daniel Thompson
2015-03-12 13:39   ` [PATCH 4.0-rc2 v18 6/6] ARM: Add support for on-demand backtrace of other CPUs Daniel Thompson
2015-03-12 13:39     ` Daniel Thompson
