All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/11] assortment of timer, watchdog, interrupt
@ 2018-05-04 17:19 Nicholas Piggin
  2018-05-04 17:19 ` [PATCH 01/11] powerpc/64: irq_work avoid interrupt when called with hardware irqs enabled Nicholas Piggin
                   ` (10 more replies)
  0 siblings, 11 replies; 14+ messages in thread
From: Nicholas Piggin @ 2018-05-04 17:19 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin, Benjamin Herrenschmidt

These are a bunch of small things I've built up from looking
through code trying to track down some rare irq latency issues.

None of them actually fix any long irq latencies, but they
hopefully make the code a bit neater, get rid of some small
glitches, increase watchdog coverage etc.

Ben spotted a bug with the first patch last time I posted,
that's fixed.

Thanks,
Nick


Nicholas Piggin (11):
  powerpc/64: irq_work avoid interrupt when called with hardware irqs
    enabled
  powerpc/pseries: put cede MSR[EE] check under IRQ_SOFT_MASK_DEBUG
  powerpc/64s: make PACA_IRQ_HARD_DIS track MSR[EE] closely
  powerpc/64s: micro-optimise __hard_irq_enable() for mtmsrd L=1 support
  powerpc/64: remove start_tb and accum_tb from thread_struct
  powerpc/pseries: lparcfg calculate PURR on demand
  powerpc: generic clockevents broadcast receiver call
    tick_receive_broadcast
  powerpc: allow soft-NMI watchdog to cover timer interrupts with large
    decrementers
  powerpc: move timer broadcast code under GENERIC_CLOCKEVENTS_BROADCAST
    ifdef
  powerpc: move a stray NMI IPI case under NMI_IPI ifdef
  powerpc/time: account broadcast timer event interrupts separately

 arch/powerpc/include/asm/hardirq.h        |   1 +
 arch/powerpc/include/asm/hw_irq.h         |  15 ++-
 arch/powerpc/include/asm/plpar_wrappers.h |   8 +-
 arch/powerpc/include/asm/processor.h      |   4 -
 arch/powerpc/include/asm/time.h           |   9 --
 arch/powerpc/kernel/entry_64.S            |   8 ++
 arch/powerpc/kernel/exceptions-64s.S      |   5 +-
 arch/powerpc/kernel/irq.c                 |  34 +++--
 arch/powerpc/kernel/process.c             |  18 ---
 arch/powerpc/kernel/smp.c                 |  14 ++-
 arch/powerpc/kernel/time.c                | 143 +++++++++++++---------
 arch/powerpc/platforms/pseries/lparcfg.c  |  18 +--
 12 files changed, 155 insertions(+), 122 deletions(-)

-- 
2.17.0

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 01/11] powerpc/64: irq_work avoid interrupt when called with hardware irqs enabled
  2018-05-04 17:19 [PATCH 00/11] assortment of timer, watchdog, interrupt Nicholas Piggin
@ 2018-05-04 17:19 ` Nicholas Piggin
  2018-06-04 14:10   ` [01/11] " Michael Ellerman
  2018-05-04 17:19 ` [PATCH 02/11] powerpc/pseries: put cede MSR[EE] check under IRQ_SOFT_MASK_DEBUG Nicholas Piggin
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 14+ messages in thread
From: Nicholas Piggin @ 2018-05-04 17:19 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin, Benjamin Herrenschmidt

irq_work_raise should not cause a decrementer exception unless it is
called from NMI context. Doing so often just results in an immediate
masked decrementer interrupt:

   <...>-550    90d...    4us : update_curr_rt <-dequeue_task_rt
   <...>-550    90d...    5us : dbs_update_util_handler <-update_curr_rt
   <...>-550    90d...    6us : arch_irq_work_raise <-irq_work_queue
   <...>-550    90d...    7us : soft_nmi_interrupt <-soft_nmi_common
   <...>-550    90d...    7us : printk_nmi_enter <-soft_nmi_interrupt
   <...>-550    90d.Z.    8us : rcu_nmi_enter <-soft_nmi_interrupt
   <...>-550    90d.Z.    9us : rcu_nmi_exit <-soft_nmi_interrupt
   <...>-550    90d...    9us : printk_nmi_exit <-soft_nmi_interrupt
   <...>-550    90d...   10us : cpuacct_charge <-update_curr_rt

The soft_nmi_interrupt here is the call into the watchdog, due to the
decrementer interrupt firing with irqs soft-disabled. This is
harmless, but sub-optimal.

When it's not called from NMI context or with interrupts enabled, mark
the decrementer pending in the irq_happened mask directly, rather than
having the masked decrementer interupt handler do it. This will be
replayed at the next local_irq_enable. See the comment for details.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/time.c | 33 +++++++++++++++++++++++++++++++--
 1 file changed, 31 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 360e71d455cc..e7e8611e8863 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -513,6 +513,35 @@ static inline void clear_irq_work_pending(void)
 		"i" (offsetof(struct paca_struct, irq_work_pending)));
 }
 
+void arch_irq_work_raise(void)
+{
+	preempt_disable();
+	set_irq_work_pending_flag();
+	/*
+	 * Non-nmi code running with interrupts disabled will replay
+	 * irq_happened before it re-enables interrupts, so setthe
+	 * decrementer there instead of causing a hardware exception
+	 * which would immediately hit the masked interrupt handler
+	 * and have the net effect of setting the decrementer in
+	 * irq_happened.
+	 *
+	 * NMI interrupts can not check this when they return, so the
+	 * decrementer hardware exception is raised, which will fire
+	 * when interrupts are next enabled.
+	 *
+	 * BookE does not support this yet, it must audit all NMI
+	 * interrupt handlers to ensure they call nmi_enter() so this
+	 * check would be correct.
+	 */
+	if (IS_ENABLED(CONFIG_BOOKE) || !irqs_disabled() || in_nmi()) {
+		set_dec(1);
+	} else {
+		hard_irq_disable();
+		local_paca->irq_happened |= PACA_IRQ_DEC;
+	}
+	preempt_enable();
+}
+
 #else /* 32-bit */
 
 DEFINE_PER_CPU(u8, irq_work_pending);
@@ -521,8 +550,6 @@ DEFINE_PER_CPU(u8, irq_work_pending);
 #define test_irq_work_pending()		__this_cpu_read(irq_work_pending)
 #define clear_irq_work_pending()	__this_cpu_write(irq_work_pending, 0)
 
-#endif /* 32 vs 64 bit */
-
 void arch_irq_work_raise(void)
 {
 	preempt_disable();
@@ -531,6 +558,8 @@ void arch_irq_work_raise(void)
 	preempt_enable();
 }
 
+#endif /* 32 vs 64 bit */
+
 #else  /* CONFIG_IRQ_WORK */
 
 #define test_irq_work_pending()	0
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 02/11] powerpc/pseries: put cede MSR[EE] check under IRQ_SOFT_MASK_DEBUG
  2018-05-04 17:19 [PATCH 00/11] assortment of timer, watchdog, interrupt Nicholas Piggin
  2018-05-04 17:19 ` [PATCH 01/11] powerpc/64: irq_work avoid interrupt when called with hardware irqs enabled Nicholas Piggin
@ 2018-05-04 17:19 ` Nicholas Piggin
  2018-05-04 17:19 ` [PATCH 03/11] powerpc/64s: make PACA_IRQ_HARD_DIS track MSR[EE] closely Nicholas Piggin
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Nicholas Piggin @ 2018-05-04 17:19 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin, Benjamin Herrenschmidt

This check does not catch IRQ soft mask bugs, but this option is
slightly more suitable than TRACE_IRQFLAGS.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/plpar_wrappers.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/plpar_wrappers.h b/arch/powerpc/include/asm/plpar_wrappers.h
index 96c1a46acbd0..cff5a411e595 100644
--- a/arch/powerpc/include/asm/plpar_wrappers.h
+++ b/arch/powerpc/include/asm/plpar_wrappers.h
@@ -39,10 +39,10 @@ static inline long extended_cede_processor(unsigned long latency_hint)
 	set_cede_latency_hint(latency_hint);
 
 	rc = cede_processor();
-#ifdef CONFIG_TRACE_IRQFLAGS
-		/* Ensure that H_CEDE returns with IRQs on */
-		if (WARN_ON(!(mfmsr() & MSR_EE)))
-			__hard_irq_enable();
+#ifdef CONFIG_PPC_IRQ_SOFT_MASK_DEBUG
+	/* Ensure that H_CEDE returns with IRQs on */
+	if (WARN_ON(!(mfmsr() & MSR_EE)))
+		__hard_irq_enable();
 #endif
 
 	set_cede_latency_hint(old_latency_hint);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 03/11] powerpc/64s: make PACA_IRQ_HARD_DIS track MSR[EE] closely
  2018-05-04 17:19 [PATCH 00/11] assortment of timer, watchdog, interrupt Nicholas Piggin
  2018-05-04 17:19 ` [PATCH 01/11] powerpc/64: irq_work avoid interrupt when called with hardware irqs enabled Nicholas Piggin
  2018-05-04 17:19 ` [PATCH 02/11] powerpc/pseries: put cede MSR[EE] check under IRQ_SOFT_MASK_DEBUG Nicholas Piggin
@ 2018-05-04 17:19 ` Nicholas Piggin
  2018-05-04 17:19 ` [PATCH 04/11] powerpc/64s: micro-optimise __hard_irq_enable() for mtmsrd L=1 support Nicholas Piggin
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Nicholas Piggin @ 2018-05-04 17:19 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin, Benjamin Herrenschmidt

When the masked interrupt handler clears MSR[EE] for an interrupt in
the PACA_IRQ_MUST_HARD_MASK set, it does not set PACA_IRQ_HARD_DIS.
This makes them get out of synch.

With that taken into account, it's only low level irq manipulation
(and interrupt entry before reconcile) where they can be out of synch.
This makes the code less surprising.

It also allows the IRQ replay code to rely on the IRQ_HARD_DIS value
and not have to mtmsrd again in this case (e.g., for an external
interrupt that has been masked). The bigger benefit might just be
that there is not such an element of surprise in these two bits of
state.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/hw_irq.h    | 10 ++++++----
 arch/powerpc/kernel/entry_64.S       |  8 ++++++++
 arch/powerpc/kernel/exceptions-64s.S |  5 ++++-
 arch/powerpc/kernel/irq.c            | 28 +++++++++++++++++++---------
 4 files changed, 37 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
index 855e17d158b1..8004d7887ff6 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -248,14 +248,16 @@ static inline bool lazy_irq_pending(void)
 
 /*
  * This is called by asynchronous interrupts to conditionally
- * re-enable hard interrupts when soft-disabled after having
- * cleared the source of the interrupt
+ * re-enable hard interrupts after having cleared the source
+ * of the interrupt. They are kept disabled if there is a different
+ * soft-masked interrupt pending that requires hard masking.
  */
 static inline void may_hard_irq_enable(void)
 {
-	get_paca()->irq_happened &= ~PACA_IRQ_HARD_DIS;
-	if (!(get_paca()->irq_happened & PACA_IRQ_MUST_HARD_MASK))
+	if (!(get_paca()->irq_happened & PACA_IRQ_MUST_HARD_MASK)) {
+		get_paca()->irq_happened &= ~PACA_IRQ_HARD_DIS;
 		__hard_irq_enable();
+	}
 }
 
 static inline bool arch_irq_disabled_regs(struct pt_regs *regs)
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 51695608c68b..db4df061c33a 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -973,6 +973,14 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	or	r4,r4,r3
 	std	r4,_TRAP(r1)
 
+	/*
+	 * PACA_IRQ_HARD_DIS won't always be set here, so set it now
+	 * to reconcile the IRQ state. Tracing is already accounted for.
+	 */
+	ld	r4,PACAIRQHAPPENED(r13)
+	ori	r4,r4,PACA_IRQ_HARD_DIS
+	stb	r4,PACAIRQHAPPENED(r13)
+
 	/*
 	 * Then find the right handler and call it. Interrupts are
 	 * still soft-disabled and we keep them that way.
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index ae6a849db60b..69172dd41b11 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1498,7 +1498,10 @@ masked_##_H##interrupt:					\
 	mfspr	r10,SPRN_##_H##SRR1;			\
 	xori	r10,r10,MSR_EE; /* clear MSR_EE */	\
 	mtspr	SPRN_##_H##SRR1,r10;			\
-2:	mtcrf	0x80,r9;				\
+	ori	r11,r11,PACA_IRQ_HARD_DIS;		\
+	stb	r11,PACAIRQHAPPENED(r13);		\
+2:	/* done */					\
+	mtcrf	0x80,r9;				\
 	ld	r9,PACA_EXGEN+EX_R9(r13);		\
 	ld	r10,PACA_EXGEN+EX_R10(r13);		\
 	ld	r11,PACA_EXGEN+EX_R11(r13);		\
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 061aa0f47bb1..6569b5ffff93 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -145,8 +145,20 @@ notrace unsigned int __check_irq_replay(void)
 	trace_hardirqs_on();
 	trace_hardirqs_off();
 
+	/*
+	 * We are always hard disabled here, but PACA_IRQ_HARD_DIS may
+	 * not be set, which means interrupts have only just been hard
+	 * disabled as part of the local_irq_restore or interrupt return
+	 * code. In that case, skip the decrementr check becaus it's
+	 * expensive to read the TB.
+	 *
+	 * HARD_DIS then gets cleared here, but it's reconciled later.
+	 * Either local_irq_disable will replay the interrupt and that
+	 * will reconcile state like other hard interrupts. Or interrupt
+	 * retur will replay the interrupt and in that case it sets
+	 * PACA_IRQ_HARD_DIS by hand (see comments in entry_64.S).
+	 */
 	if (happened & PACA_IRQ_HARD_DIS) {
-		/* Clear bit 0 which we wouldn't clear otherwise */
 		local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
 
 		/*
@@ -256,16 +268,14 @@ notrace void arch_local_irq_restore(unsigned long mask)
 	 * __check_irq_replay(). We also need to soft-disable
 	 * again to avoid warnings in there due to the use of
 	 * per-cpu variables.
-	 *
-	 * We know that if the value in irq_happened is exactly 0x01
-	 * then we are already hard disabled (there are other less
-	 * common cases that we'll ignore for now), so we skip the
-	 * (expensive) mtmsrd.
 	 */
-	if (unlikely(irq_happened != PACA_IRQ_HARD_DIS))
+	if (!(irq_happened & PACA_IRQ_HARD_DIS)) {
+#ifdef CONFIG_PPC_IRQ_SOFT_MASK_DEBUG
+		WARN_ON(!(mfmsr() & MSR_EE));
+#endif
 		__hard_irq_disable();
 #ifdef CONFIG_PPC_IRQ_SOFT_MASK_DEBUG
-	else {
+	} else {
 		/*
 		 * We should already be hard disabled here. We had bugs
 		 * where that wasn't the case so let's dbl check it and
@@ -274,8 +284,8 @@ notrace void arch_local_irq_restore(unsigned long mask)
 		 */
 		if (WARN_ON(mfmsr() & MSR_EE))
 			__hard_irq_disable();
-	}
 #endif
+	}
 
 	irq_soft_mask_set(IRQS_ALL_DISABLED);
 	trace_hardirqs_off();
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 04/11] powerpc/64s: micro-optimise __hard_irq_enable() for mtmsrd L=1 support
  2018-05-04 17:19 [PATCH 00/11] assortment of timer, watchdog, interrupt Nicholas Piggin
                   ` (2 preceding siblings ...)
  2018-05-04 17:19 ` [PATCH 03/11] powerpc/64s: make PACA_IRQ_HARD_DIS track MSR[EE] closely Nicholas Piggin
@ 2018-05-04 17:19 ` Nicholas Piggin
  2018-05-04 17:19 ` [PATCH 05/11] powerpc/64: remove start_tb and accum_tb from thread_struct Nicholas Piggin
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Nicholas Piggin @ 2018-05-04 17:19 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin, Benjamin Herrenschmidt

Book3S minimum supported ISA version now requires mtmsrd L=1. This
instruction does not require bits other than RI and EE to be supplied,
so __hard_irq_enable() and __hard_irq_disable() does not have to read
the kernel_msr from paca.

Interrupt entry code already relies on L=1 support.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/hw_irq.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
index 8004d7887ff6..fbc2d83808aa 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -228,8 +228,8 @@ static inline bool arch_irqs_disabled(void)
 #define __hard_irq_enable()	asm volatile("wrteei 1" : : : "memory")
 #define __hard_irq_disable()	asm volatile("wrteei 0" : : : "memory")
 #else
-#define __hard_irq_enable()	__mtmsrd(local_paca->kernel_msr | MSR_EE, 1)
-#define __hard_irq_disable()	__mtmsrd(local_paca->kernel_msr, 1)
+#define __hard_irq_enable()	__mtmsrd(MSR_EE|MSR_RI, 1)
+#define __hard_irq_disable()	__mtmsrd(MSR_RI, 1)
 #endif
 
 #define hard_irq_disable()	do {					\
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 05/11] powerpc/64: remove start_tb and accum_tb from thread_struct
  2018-05-04 17:19 [PATCH 00/11] assortment of timer, watchdog, interrupt Nicholas Piggin
                   ` (3 preceding siblings ...)
  2018-05-04 17:19 ` [PATCH 04/11] powerpc/64s: micro-optimise __hard_irq_enable() for mtmsrd L=1 support Nicholas Piggin
@ 2018-05-04 17:19 ` Nicholas Piggin
  2018-05-04 17:19 ` [PATCH 06/11] powerpc/pseries: lparcfg calculate PURR on demand Nicholas Piggin
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Nicholas Piggin @ 2018-05-04 17:19 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin, Benjamin Herrenschmidt

These fields are only written to.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/processor.h | 4 ----
 arch/powerpc/kernel/process.c        | 6 +-----
 2 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index c4b36a494a63..eff269adfa71 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -264,10 +264,6 @@ struct thread_struct {
 	struct thread_fp_state	*fp_save_area;
 	int		fpexc_mode;	/* floating-point exception mode */
 	unsigned int	align_ctl;	/* alignment handling control */
-#ifdef CONFIG_PPC64
-	unsigned long	start_tb;	/* Start purr when proc switched in */
-	unsigned long	accum_tb;	/* Total accumulated purr for process */
-#endif
 #ifdef CONFIG_HAVE_HW_BREAKPOINT
 	struct perf_event *ptrace_bps[HBP_NUM];
 	/*
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 1237f13fed51..ff7344d996e3 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1187,11 +1187,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
 	 */
 	if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
 		struct cpu_usage *cu = this_cpu_ptr(&cpu_usage_array);
-		long unsigned start_tb, current_tb;
-		start_tb = old_thread->start_tb;
-		cu->current_tb = current_tb = mfspr(SPRN_PURR);
-		old_thread->accum_tb += (current_tb - start_tb);
-		new_thread->start_tb = current_tb;
+		cu->current_tb = mfspr(SPRN_PURR);
 	}
 #endif /* CONFIG_PPC64 */
 
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 06/11] powerpc/pseries: lparcfg calculate PURR on demand
  2018-05-04 17:19 [PATCH 00/11] assortment of timer, watchdog, interrupt Nicholas Piggin
                   ` (4 preceding siblings ...)
  2018-05-04 17:19 ` [PATCH 05/11] powerpc/64: remove start_tb and accum_tb from thread_struct Nicholas Piggin
@ 2018-05-04 17:19 ` Nicholas Piggin
  2018-05-04 17:19 ` [PATCH 07/11] powerpc: generic clockevents broadcast receiver call tick_receive_broadcast Nicholas Piggin
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Nicholas Piggin @ 2018-05-04 17:19 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin, Benjamin Herrenschmidt

For SPLPAR, lparcfg provides a sum of PURR registers for all CPUs.
Currently this is done by reading PURR in context switch and timer
interrupt, and storing that into a per-CPU variable. These are summed
to provide the value.

This does not work with all timer schemes (e.g., NO_HZ_FULL), and it
is sub-optimal for performance because it reads the PURR register on
every context switch, although that's been difficult to distinguish
from noise in the contxt_switch microbenchmark.

This patch implements the sum by calling a function on each CPU, to
read and add PURR values of each CPU.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/time.h          |  8 --------
 arch/powerpc/kernel/process.c            | 14 --------------
 arch/powerpc/kernel/time.c               |  8 --------
 arch/powerpc/platforms/pseries/lparcfg.c | 18 ++++++++++--------
 4 files changed, 10 insertions(+), 38 deletions(-)

diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index db546c034905..c965c79765c4 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -196,14 +196,6 @@ extern u64 mulhdu(u64, u64);
 extern void div128_by_32(u64 dividend_high, u64 dividend_low,
 			 unsigned divisor, struct div_result *dr);
 
-/* Used to store Processor Utilization register (purr) values */
-
-struct cpu_usage {
-        u64 current_tb;  /* Holds the current purr register values */
-};
-
-DECLARE_PER_CPU(struct cpu_usage, cpu_usage_array);
-
 extern void secondary_cpu_time_init(void);
 extern void __init time_init(void);
 
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index ff7344d996e3..e6ff36923d84 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -845,10 +845,6 @@ bool ppc_breakpoint_available(void)
 }
 EXPORT_SYMBOL_GPL(ppc_breakpoint_available);
 
-#ifdef CONFIG_PPC64
-DEFINE_PER_CPU(struct cpu_usage, cpu_usage_array);
-#endif
-
 static inline bool hw_brk_match(struct arch_hw_breakpoint *a,
 			      struct arch_hw_breakpoint *b)
 {
@@ -1181,16 +1177,6 @@ struct task_struct *__switch_to(struct task_struct *prev,
 
 	WARN_ON(!irqs_disabled());
 
-#ifdef CONFIG_PPC64
-	/*
-	 * Collect processor utilization data per process
-	 */
-	if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
-		struct cpu_usage *cu = this_cpu_ptr(&cpu_usage_array);
-		cu->current_tb = mfspr(SPRN_PURR);
-	}
-#endif /* CONFIG_PPC64 */
-
 #ifdef CONFIG_PPC_BOOK3S_64
 	batch = this_cpu_ptr(&ppc64_tlb_batch);
 	if (batch->active) {
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index e7e8611e8863..1fe6a24357e7 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -597,14 +597,6 @@ static void __timer_interrupt(void)
 		__this_cpu_inc(irq_stat.timer_irqs_others);
 	}
 
-#ifdef CONFIG_PPC64
-	/* collect purr register values often, for accurate calculations */
-	if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
-		struct cpu_usage *cu = this_cpu_ptr(&cpu_usage_array);
-		cu->current_tb = mfspr(SPRN_PURR);
-	}
-#endif
-
 	trace_timer_interrupt_exit(regs);
 }
 
diff --git a/arch/powerpc/platforms/pseries/lparcfg.c b/arch/powerpc/platforms/pseries/lparcfg.c
index c508c938dc71..7c872dc01bdb 100644
--- a/arch/powerpc/platforms/pseries/lparcfg.c
+++ b/arch/powerpc/platforms/pseries/lparcfg.c
@@ -52,18 +52,20 @@
  * Track sum of all purrs across all processors. This is used to further
  * calculate usage values by different applications
  */
+static void cpu_get_purr(void *arg)
+{
+	atomic64_t *sum = arg;
+
+	atomic64_add(mfspr(SPRN_PURR), sum);
+}
+
 static unsigned long get_purr(void)
 {
-	unsigned long sum_purr = 0;
-	int cpu;
+	atomic64_t purr = ATOMIC64_INIT(0);
 
-	for_each_possible_cpu(cpu) {
-		struct cpu_usage *cu;
+	on_each_cpu(cpu_get_purr, &purr, 1);
 
-		cu = &per_cpu(cpu_usage_array, cpu);
-		sum_purr += cu->current_tb;
-	}
-	return sum_purr;
+	return atomic64_read(&purr);
 }
 
 /*
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 07/11] powerpc: generic clockevents broadcast receiver call tick_receive_broadcast
  2018-05-04 17:19 [PATCH 00/11] assortment of timer, watchdog, interrupt Nicholas Piggin
                   ` (5 preceding siblings ...)
  2018-05-04 17:19 ` [PATCH 06/11] powerpc/pseries: lparcfg calculate PURR on demand Nicholas Piggin
@ 2018-05-04 17:19 ` Nicholas Piggin
  2018-05-05 14:38   ` kbuild test robot
  2018-05-04 17:19 ` [PATCH 08/11] powerpc: allow soft-NMI watchdog to cover timer interrupts with large decrementers Nicholas Piggin
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 14+ messages in thread
From: Nicholas Piggin @ 2018-05-04 17:19 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Nicholas Piggin, Benjamin Herrenschmidt, Srivatsa S . Bhat,
	Preeti U Murthy

The broadcast tick recipient can call tick_receive_broadcast rather
than re-running the full timer interrupt.

It does not have to check for the next event time, because the sender
already determined the timer has expired. It does not have to test
irq_work_pending, because that's a direct decrementer interrupt and
does not go through the clock events subsystem. And it does not have
to read PURR because that was removed with the previous patch.

This results in no code size change, but both the decrementer and
broadcast path lengths are reduced.

Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/hw_irq.h |  1 +
 arch/powerpc/include/asm/time.h   |  1 -
 arch/powerpc/kernel/smp.c         |  4 +-
 arch/powerpc/kernel/time.c        | 84 ++++++++++++++-----------------
 4 files changed, 42 insertions(+), 48 deletions(-)

diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
index fbc2d83808aa..46fe8307fc43 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -55,6 +55,7 @@ extern void replay_system_reset(void);
 extern void __replay_interrupt(unsigned int vector);
 
 extern void timer_interrupt(struct pt_regs *);
+extern void timer_broadcast_interrupt(void);
 extern void performance_monitor_exception(struct pt_regs *regs);
 extern void WatchdogException(struct pt_regs *regs);
 extern void unknown_exception(struct pt_regs *regs);
diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index c965c79765c4..69b89f941252 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -28,7 +28,6 @@ extern struct clock_event_device decrementer_clockevent;
 
 struct rtc_time;
 extern void to_tm(int tim, struct rtc_time * tm);
-extern void tick_broadcast_ipi_handler(void);
 
 extern void generic_calibrate_decr(void);
 extern void hdec_interrupt(struct pt_regs *regs);
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 9ca7148b5881..5441a47701b1 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -157,7 +157,7 @@ static irqreturn_t reschedule_action(int irq, void *data)
 
 static irqreturn_t tick_broadcast_ipi_action(int irq, void *data)
 {
-	tick_broadcast_ipi_handler();
+	timer_broadcast_interrupt();
 	return IRQ_HANDLED;
 }
 
@@ -278,7 +278,7 @@ irqreturn_t smp_ipi_demux_relaxed(void)
 		if (all & IPI_MESSAGE(PPC_MSG_RESCHEDULE))
 			scheduler_ipi();
 		if (all & IPI_MESSAGE(PPC_MSG_TICK_BROADCAST))
-			tick_broadcast_ipi_handler();
+			timer_broadcast_interrupt();
 #ifdef CONFIG_NMI_IPI
 		if (all & IPI_MESSAGE(PPC_MSG_NMI_IPI))
 			nmi_ipi_action(0, NULL);
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 1fe6a24357e7..ad876906f847 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -567,47 +567,16 @@ void arch_irq_work_raise(void)
 
 #endif /* CONFIG_IRQ_WORK */
 
-static void __timer_interrupt(void)
-{
-	struct pt_regs *regs = get_irq_regs();
-	u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
-	struct clock_event_device *evt = this_cpu_ptr(&decrementers);
-	u64 now;
-
-	trace_timer_interrupt_entry(regs);
-
-	if (test_irq_work_pending()) {
-		clear_irq_work_pending();
-		irq_work_run();
-	}
-
-	now = get_tb_or_rtc();
-	if (now >= *next_tb) {
-		*next_tb = ~(u64)0;
-		if (evt->event_handler)
-			evt->event_handler(evt);
-		__this_cpu_inc(irq_stat.timer_irqs_event);
-	} else {
-		now = *next_tb - now;
-		if (now <= decrementer_max)
-			set_dec(now);
-		/* We may have raced with new irq work */
-		if (test_irq_work_pending())
-			set_dec(1);
-		__this_cpu_inc(irq_stat.timer_irqs_others);
-	}
-
-	trace_timer_interrupt_exit(regs);
-}
-
 /*
  * timer_interrupt - gets called when the decrementer overflows,
  * with interrupts disabled.
  */
-void timer_interrupt(struct pt_regs * regs)
+void timer_interrupt(struct pt_regs *regs)
 {
-	struct pt_regs *old_regs;
+	struct clock_event_device *evt = this_cpu_ptr(&decrementers);
 	u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
+	struct pt_regs *old_regs;
+	u64 now;
 
 	/* Ensure a positive value is written to the decrementer, or else
 	 * some CPUs will continue to take decrementer exceptions.
@@ -638,13 +607,47 @@ void timer_interrupt(struct pt_regs * regs)
 
 	old_regs = set_irq_regs(regs);
 	irq_enter();
+	trace_timer_interrupt_entry(regs);
+
+	if (test_irq_work_pending()) {
+		clear_irq_work_pending();
+		irq_work_run();
+	}
+
+	now = get_tb_or_rtc();
+	if (now >= *next_tb) {
+		*next_tb = ~(u64)0;
+		if (evt->event_handler)
+			evt->event_handler(evt);
+		__this_cpu_inc(irq_stat.timer_irqs_event);
+	} else {
+		now = *next_tb - now;
+		if (now <= decrementer_max)
+			set_dec(now);
+		/* We may have raced with new irq work */
+		if (test_irq_work_pending())
+			set_dec(1);
+		__this_cpu_inc(irq_stat.timer_irqs_others);
+	}
 
-	__timer_interrupt();
+	trace_timer_interrupt_exit(regs);
 	irq_exit();
 	set_irq_regs(old_regs);
 }
 EXPORT_SYMBOL(timer_interrupt);
 
+void timer_broadcast_interrupt(void)
+{
+	u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
+	struct pt_regs *regs = get_irq_regs();
+
+	trace_timer_interrupt_entry(regs);
+	*next_tb = ~(u64)0;
+	tick_receive_broadcast();
+	__this_cpu_inc(irq_stat.timer_irqs_event);
+	trace_timer_interrupt_exit(regs);
+}
+
 /*
  * Hypervisor decrementer interrupts shouldn't occur but are sometimes
  * left pending on exit from a KVM guest.  We don't need to do anything
@@ -992,15 +995,6 @@ static int decrementer_shutdown(struct clock_event_device *dev)
 	return 0;
 }
 
-/* Interrupt handler for the timer broadcast IPI */
-void tick_broadcast_ipi_handler(void)
-{
-	u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
-
-	*next_tb = get_tb_or_rtc();
-	__timer_interrupt();
-}
-
 static void register_decrementer_clockevent(int cpu)
 {
 	struct clock_event_device *dec = &per_cpu(decrementers, cpu);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 08/11] powerpc: allow soft-NMI watchdog to cover timer interrupts with large decrementers
  2018-05-04 17:19 [PATCH 00/11] assortment of timer, watchdog, interrupt Nicholas Piggin
                   ` (6 preceding siblings ...)
  2018-05-04 17:19 ` [PATCH 07/11] powerpc: generic clockevents broadcast receiver call tick_receive_broadcast Nicholas Piggin
@ 2018-05-04 17:19 ` Nicholas Piggin
  2018-05-04 17:19 ` [PATCH 09/11] powerpc: move timer broadcast code under GENERIC_CLOCKEVENTS_BROADCAST ifdef Nicholas Piggin
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Nicholas Piggin @ 2018-05-04 17:19 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin, Benjamin Herrenschmidt

Large decrementers (e.g., POWER9) can take a very long time to wrap,
so when the timer iterrupt handler sets the decrementer to max so as
to avoid taking another decrementer interrupt when hard enabling
interrupts before running timers, it effectively disables the soft
NMI coverage for timer interrupts.

Fix this by using the traditional 31-bit value instead, which wraps
after a few seconds. masked interrupt code does the same thing, and
in normal operation neither of these paths would ever wrap even the
31 bit value.

Note: the SMP watchdog should catch timer interrupt lockups, but it
is preferable for the local soft-NMI to catch them, mainly to avoid
the IPI.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/time.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index ad876906f847..5862a3611795 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -578,22 +578,29 @@ void timer_interrupt(struct pt_regs *regs)
 	struct pt_regs *old_regs;
 	u64 now;
 
-	/* Ensure a positive value is written to the decrementer, or else
-	 * some CPUs will continue to take decrementer exceptions.
-	 */
-	set_dec(decrementer_max);
-
 	/* Some implementations of hotplug will get timer interrupts while
 	 * offline, just ignore these and we also need to set
 	 * decrementers_next_tb as MAX to make sure __check_irq_replay
 	 * don't replay timer interrupt when return, otherwise we'll trap
 	 * here infinitely :(
 	 */
-	if (!cpu_online(smp_processor_id())) {
+	if (unlikely(!cpu_online(smp_processor_id()))) {
 		*next_tb = ~(u64)0;
+		set_dec(decrementer_max);
 		return;
 	}
 
+	/* Ensure a positive value is written to the decrementer, or else
+	 * some CPUs will continue to take decrementer exceptions. When the
+	 * PPC_WATCHDOG (decrementer based) is configured, keep this at most
+	 * 31 bits, which is about 4 seconds on most systems, which gives
+	 * the watchdog a chance of catching timer interrupt hard lockups.
+	 */
+	if (IS_ENABLED(CONFIG_PPC_WATCHDOG))
+		set_dec(0x7fffffff);
+	else
+		set_dec(decrementer_max);
+
 	/* Conditionally hard-enable interrupts now that the DEC has been
 	 * bumped to its maximum value
 	 */
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 09/11] powerpc: move timer broadcast code under GENERIC_CLOCKEVENTS_BROADCAST ifdef
  2018-05-04 17:19 [PATCH 00/11] assortment of timer, watchdog, interrupt Nicholas Piggin
                   ` (7 preceding siblings ...)
  2018-05-04 17:19 ` [PATCH 08/11] powerpc: allow soft-NMI watchdog to cover timer interrupts with large decrementers Nicholas Piggin
@ 2018-05-04 17:19 ` Nicholas Piggin
  2018-05-04 17:19 ` [PATCH 10/11] powerpc: move a stray NMI IPI case under NMI_IPI ifdef Nicholas Piggin
  2018-05-04 17:19 ` [PATCH 11/11] powerpc/time: account broadcast timer event interrupts separately Nicholas Piggin
  10 siblings, 0 replies; 14+ messages in thread
From: Nicholas Piggin @ 2018-05-04 17:19 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin, Benjamin Herrenschmidt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/smp.c  | 8 ++++++++
 arch/powerpc/kernel/time.c | 2 ++
 2 files changed, 10 insertions(+)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 5441a47701b1..914708eeb43f 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -155,11 +155,13 @@ static irqreturn_t reschedule_action(int irq, void *data)
 	return IRQ_HANDLED;
 }
 
+#ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
 static irqreturn_t tick_broadcast_ipi_action(int irq, void *data)
 {
 	timer_broadcast_interrupt();
 	return IRQ_HANDLED;
 }
+#endif
 
 #ifdef CONFIG_NMI_IPI
 static irqreturn_t nmi_ipi_action(int irq, void *data)
@@ -172,7 +174,9 @@ static irqreturn_t nmi_ipi_action(int irq, void *data)
 static irq_handler_t smp_ipi_action[] = {
 	[PPC_MSG_CALL_FUNCTION] =  call_function_action,
 	[PPC_MSG_RESCHEDULE] = reschedule_action,
+#ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
 	[PPC_MSG_TICK_BROADCAST] = tick_broadcast_ipi_action,
+#endif
 #ifdef CONFIG_NMI_IPI
 	[PPC_MSG_NMI_IPI] = nmi_ipi_action,
 #endif
@@ -186,7 +190,9 @@ static irq_handler_t smp_ipi_action[] = {
 const char *smp_ipi_name[] = {
 	[PPC_MSG_CALL_FUNCTION] =  "ipi call function",
 	[PPC_MSG_RESCHEDULE] = "ipi reschedule",
+#ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
 	[PPC_MSG_TICK_BROADCAST] = "ipi tick-broadcast",
+#endif
 	[PPC_MSG_NMI_IPI] = "nmi ipi",
 };
 
@@ -277,8 +283,10 @@ irqreturn_t smp_ipi_demux_relaxed(void)
 			generic_smp_call_function_interrupt();
 		if (all & IPI_MESSAGE(PPC_MSG_RESCHEDULE))
 			scheduler_ipi();
+#ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
 		if (all & IPI_MESSAGE(PPC_MSG_TICK_BROADCAST))
 			timer_broadcast_interrupt();
+#endif
 #ifdef CONFIG_NMI_IPI
 		if (all & IPI_MESSAGE(PPC_MSG_NMI_IPI))
 			nmi_ipi_action(0, NULL);
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 5862a3611795..23921f7b6e67 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -643,6 +643,7 @@ void timer_interrupt(struct pt_regs *regs)
 }
 EXPORT_SYMBOL(timer_interrupt);
 
+#ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
 void timer_broadcast_interrupt(void)
 {
 	u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
@@ -654,6 +655,7 @@ void timer_broadcast_interrupt(void)
 	__this_cpu_inc(irq_stat.timer_irqs_event);
 	trace_timer_interrupt_exit(regs);
 }
+#endif
 
 /*
  * Hypervisor decrementer interrupts shouldn't occur but are sometimes
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 10/11] powerpc: move a stray NMI IPI case under NMI_IPI ifdef
  2018-05-04 17:19 [PATCH 00/11] assortment of timer, watchdog, interrupt Nicholas Piggin
                   ` (8 preceding siblings ...)
  2018-05-04 17:19 ` [PATCH 09/11] powerpc: move timer broadcast code under GENERIC_CLOCKEVENTS_BROADCAST ifdef Nicholas Piggin
@ 2018-05-04 17:19 ` Nicholas Piggin
  2018-05-04 17:19 ` [PATCH 11/11] powerpc/time: account broadcast timer event interrupts separately Nicholas Piggin
  10 siblings, 0 replies; 14+ messages in thread
From: Nicholas Piggin @ 2018-05-04 17:19 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin, Benjamin Herrenschmidt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/smp.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 914708eeb43f..28ec1638a540 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -193,7 +193,9 @@ const char *smp_ipi_name[] = {
 #ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
 	[PPC_MSG_TICK_BROADCAST] = "ipi tick-broadcast",
 #endif
+#ifdef CONFIG_NMI_IPI
 	[PPC_MSG_NMI_IPI] = "nmi ipi",
+#endif
 };
 
 /* optional function to request ipi, for controllers with >= 4 ipis */
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 11/11] powerpc/time: account broadcast timer event interrupts separately
  2018-05-04 17:19 [PATCH 00/11] assortment of timer, watchdog, interrupt Nicholas Piggin
                   ` (9 preceding siblings ...)
  2018-05-04 17:19 ` [PATCH 10/11] powerpc: move a stray NMI IPI case under NMI_IPI ifdef Nicholas Piggin
@ 2018-05-04 17:19 ` Nicholas Piggin
  10 siblings, 0 replies; 14+ messages in thread
From: Nicholas Piggin @ 2018-05-04 17:19 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin, Benjamin Herrenschmidt

These are not local timer interrupts but IPIs. It's good to be able
to see how timer offloading is behaving, so split these out into
their own category.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/hardirq.h | 1 +
 arch/powerpc/kernel/irq.c          | 6 ++++++
 arch/powerpc/kernel/time.c         | 5 +----
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/hardirq.h b/arch/powerpc/include/asm/hardirq.h
index 5986d473722b..20b01897ea5d 100644
--- a/arch/powerpc/include/asm/hardirq.h
+++ b/arch/powerpc/include/asm/hardirq.h
@@ -8,6 +8,7 @@
 typedef struct {
 	unsigned int __softirq_pending;
 	unsigned int timer_irqs_event;
+	unsigned int broadcast_irqs_event;
 	unsigned int timer_irqs_others;
 	unsigned int pmu_irqs;
 	unsigned int mce_exceptions;
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 6569b5ffff93..627db34bb79d 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -518,6 +518,11 @@ int arch_show_interrupts(struct seq_file *p, int prec)
 		seq_printf(p, "%10u ", per_cpu(irq_stat, j).timer_irqs_event);
         seq_printf(p, "  Local timer interrupts for timer event device\n");
 
+	seq_printf(p, "%*s: ", prec, "BCT");
+	for_each_online_cpu(j)
+		seq_printf(p, "%10u ", per_cpu(irq_stat, j).broadcast_irqs_event);
+	seq_printf(p, "  Broadcast timer interrupts for timer event device\n");
+
 	seq_printf(p, "%*s: ", prec, "LOC");
 	for_each_online_cpu(j)
 		seq_printf(p, "%10u ", per_cpu(irq_stat, j).timer_irqs_others);
@@ -577,6 +582,7 @@ u64 arch_irq_stat_cpu(unsigned int cpu)
 {
 	u64 sum = per_cpu(irq_stat, cpu).timer_irqs_event;
 
+	sum += per_cpu(irq_stat, cpu).broadcast_irqs_event;
 	sum += per_cpu(irq_stat, cpu).pmu_irqs;
 	sum += per_cpu(irq_stat, cpu).mce_exceptions;
 	sum += per_cpu(irq_stat, cpu).spurious_irqs;
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 23921f7b6e67..ed6b2abdde15 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -647,13 +647,10 @@ EXPORT_SYMBOL(timer_interrupt);
 void timer_broadcast_interrupt(void)
 {
 	u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
-	struct pt_regs *regs = get_irq_regs();
 
-	trace_timer_interrupt_entry(regs);
 	*next_tb = ~(u64)0;
 	tick_receive_broadcast();
-	__this_cpu_inc(irq_stat.timer_irqs_event);
-	trace_timer_interrupt_exit(regs);
+	__this_cpu_inc(irq_stat.broadcast_irqs_event);
 }
 #endif
 
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 07/11] powerpc: generic clockevents broadcast receiver call tick_receive_broadcast
  2018-05-04 17:19 ` [PATCH 07/11] powerpc: generic clockevents broadcast receiver call tick_receive_broadcast Nicholas Piggin
@ 2018-05-05 14:38   ` kbuild test robot
  0 siblings, 0 replies; 14+ messages in thread
From: kbuild test robot @ 2018-05-05 14:38 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: kbuild-all, linuxppc-dev, Preeti U Murthy, Srivatsa S . Bhat,
	Nicholas Piggin

[-- Attachment #1: Type: text/plain, Size: 2091 bytes --]

Hi Nicholas,

I love your patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on v4.17-rc3 next-20180504]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Nicholas-Piggin/powerpc-64-irq_work-avoid-interrupt-when-called-with-hardware-irqs-enabled/20180505-193051
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-kilauea_defconfig (attached as .config)
compiler: powerpc-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=powerpc 

Note: the linux-review/Nicholas-Piggin/powerpc-64-irq_work-avoid-interrupt-when-called-with-hardware-irqs-enabled/20180505-193051 HEAD 6f7d1654cc9b2355ec6983991f1f293089e9c762 builds fine.
      It only hurts bisectibility.

All errors (new ones prefixed by >>):

   arch/powerpc/kernel/time.c: In function 'timer_broadcast_interrupt':
>> arch/powerpc/kernel/time.c:646:2: error: implicit declaration of function 'tick_receive_broadcast'; did you mean 'tick_setup_hrtimer_broadcast'? [-Werror=implicit-function-declaration]
     tick_receive_broadcast();
     ^~~~~~~~~~~~~~~~~~~~~~
     tick_setup_hrtimer_broadcast
   cc1: all warnings being treated as errors

vim +646 arch/powerpc/kernel/time.c

   638	
   639	void timer_broadcast_interrupt(void)
   640	{
   641		u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
   642		struct pt_regs *regs = get_irq_regs();
   643	
   644		trace_timer_interrupt_entry(regs);
   645		*next_tb = ~(u64)0;
 > 646		tick_receive_broadcast();
   647		__this_cpu_inc(irq_stat.timer_irqs_event);
   648		trace_timer_interrupt_exit(regs);
   649	}
   650	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 13373 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [01/11] powerpc/64: irq_work avoid interrupt when called with hardware irqs enabled
  2018-05-04 17:19 ` [PATCH 01/11] powerpc/64: irq_work avoid interrupt when called with hardware irqs enabled Nicholas Piggin
@ 2018-06-04 14:10   ` Michael Ellerman
  0 siblings, 0 replies; 14+ messages in thread
From: Michael Ellerman @ 2018-06-04 14:10 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev; +Cc: Nicholas Piggin

On Fri, 2018-05-04 at 17:19:25 UTC, Nicholas Piggin wrote:
> irq_work_raise should not cause a decrementer exception unless it is
> called from NMI context. Doing so often just results in an immediate
> masked decrementer interrupt:
> 
>    <...>-550    90d...    4us : update_curr_rt <-dequeue_task_rt
>    <...>-550    90d...    5us : dbs_update_util_handler <-update_curr_rt
>    <...>-550    90d...    6us : arch_irq_work_raise <-irq_work_queue
>    <...>-550    90d...    7us : soft_nmi_interrupt <-soft_nmi_common
>    <...>-550    90d...    7us : printk_nmi_enter <-soft_nmi_interrupt
>    <...>-550    90d.Z.    8us : rcu_nmi_enter <-soft_nmi_interrupt
>    <...>-550    90d.Z.    9us : rcu_nmi_exit <-soft_nmi_interrupt
>    <...>-550    90d...    9us : printk_nmi_exit <-soft_nmi_interrupt
>    <...>-550    90d...   10us : cpuacct_charge <-update_curr_rt
> 
> The soft_nmi_interrupt here is the call into the watchdog, due to the
> decrementer interrupt firing with irqs soft-disabled. This is
> harmless, but sub-optimal.
> 
> When it's not called from NMI context or with interrupts enabled, mark
> the decrementer pending in the irq_happened mask directly, rather than
> having the masked decrementer interupt handler do it. This will be
> replayed at the next local_irq_enable. See the comment for details.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/ebb37cf3ffd39fdb6ec5b07111f8bb

cheers

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2018-06-04 14:10 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-04 17:19 [PATCH 00/11] assortment of timer, watchdog, interrupt Nicholas Piggin
2018-05-04 17:19 ` [PATCH 01/11] powerpc/64: irq_work avoid interrupt when called with hardware irqs enabled Nicholas Piggin
2018-06-04 14:10   ` [01/11] " Michael Ellerman
2018-05-04 17:19 ` [PATCH 02/11] powerpc/pseries: put cede MSR[EE] check under IRQ_SOFT_MASK_DEBUG Nicholas Piggin
2018-05-04 17:19 ` [PATCH 03/11] powerpc/64s: make PACA_IRQ_HARD_DIS track MSR[EE] closely Nicholas Piggin
2018-05-04 17:19 ` [PATCH 04/11] powerpc/64s: micro-optimise __hard_irq_enable() for mtmsrd L=1 support Nicholas Piggin
2018-05-04 17:19 ` [PATCH 05/11] powerpc/64: remove start_tb and accum_tb from thread_struct Nicholas Piggin
2018-05-04 17:19 ` [PATCH 06/11] powerpc/pseries: lparcfg calculate PURR on demand Nicholas Piggin
2018-05-04 17:19 ` [PATCH 07/11] powerpc: generic clockevents broadcast receiver call tick_receive_broadcast Nicholas Piggin
2018-05-05 14:38   ` kbuild test robot
2018-05-04 17:19 ` [PATCH 08/11] powerpc: allow soft-NMI watchdog to cover timer interrupts with large decrementers Nicholas Piggin
2018-05-04 17:19 ` [PATCH 09/11] powerpc: move timer broadcast code under GENERIC_CLOCKEVENTS_BROADCAST ifdef Nicholas Piggin
2018-05-04 17:19 ` [PATCH 10/11] powerpc: move a stray NMI IPI case under NMI_IPI ifdef Nicholas Piggin
2018-05-04 17:19 ` [PATCH 11/11] powerpc/time: account broadcast timer event interrupts separately Nicholas Piggin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.