From: Nicholas Piggin <npiggin@gmail.com>
To: linuxppc-dev@lists.ozlabs.org
Cc: Nicholas Piggin <npiggin@gmail.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>
Subject: [PATCH 01/11] powerpc/64: irq_work avoid interrupt when called with hardware irqs enabled
Date: Sat,  5 May 2018 03:19:25 +1000
Message-ID: <20180504171935.25410-2-npiggin@gmail.com>
In-Reply-To: <20180504171935.25410-1-npiggin@gmail.com>

irq_work_raise should not cause a decrementer exception unless it is
called from NMI context. Doing so often just results in an immediate
masked decrementer interrupt:

   <...>-550    90d...    4us : update_curr_rt <-dequeue_task_rt
   <...>-550    90d...    5us : dbs_update_util_handler <-update_curr_rt
   <...>-550    90d...    6us : arch_irq_work_raise <-irq_work_queue
   <...>-550    90d...    7us : soft_nmi_interrupt <-soft_nmi_common
   <...>-550    90d...    7us : printk_nmi_enter <-soft_nmi_interrupt
   <...>-550    90d.Z.    8us : rcu_nmi_enter <-soft_nmi_interrupt
   <...>-550    90d.Z.    9us : rcu_nmi_exit <-soft_nmi_interrupt
   <...>-550    90d...    9us : printk_nmi_exit <-soft_nmi_interrupt
   <...>-550    90d...   10us : cpuacct_charge <-update_curr_rt

The soft_nmi_interrupt here is the call into the watchdog, due to the
decrementer interrupt firing with irqs soft-disabled. This is
harmless, but sub-optimal.

When it is called with interrupts disabled and not from NMI context,
mark the decrementer pending in the irq_happened mask directly, rather
than having the masked decrementer interrupt handler do it. This will
be replayed at the next local_irq_enable. See the comment for details.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/time.c | 33 +++++++++++++++++++++++++++++++--
 1 file changed, 31 insertions(+), 2 deletions(-)
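
For readers who want to see the decision in isolation, here is a minimal
userspace sketch of the branch described in the changelog above. It is
not kernel code: struct sim_cpu, sim_irq_work_raise() and SIM_IRQ_DEC
are hypothetical stand-ins for the paca state, arch_irq_work_raise() and
PACA_IRQ_DEC, and the preempt_disable()/preempt_enable() pair and the
irq_work_pending flag are left out.

```c
#include <stdbool.h>

/* Illustrative bit only; stands in for PACA_IRQ_DEC. */
#define SIM_IRQ_DEC 0x08u

/* Hypothetical model of the per-CPU state the patch looks at. */
struct sim_cpu {
	bool booke;              /* models IS_ENABLED(CONFIG_BOOKE) */
	bool irqs_disabled;      /* models irqs_disabled() */
	bool in_nmi;             /* models in_nmi() */
	bool hard_disabled;      /* set where the patch calls hard_irq_disable() */
	unsigned int irq_happened;
	int dec_raised;          /* counts where the patch calls set_dec(1) */
};

/* Mirrors the if/else in the patch, nothing more. */
static void sim_irq_work_raise(struct sim_cpu *cpu)
{
	if (cpu->booke || !cpu->irqs_disabled || cpu->in_nmi) {
		/* Raise a real decrementer exception. */
		cpu->dec_raised++;
	} else {
		/* Record the decrementer as pending; it is replayed later. */
		cpu->hard_disabled = true;
		cpu->irq_happened |= SIM_IRQ_DEC;
	}
}
```

Only the case "interrupts disabled, not in NMI, not BookE" takes the
else branch; every other combination still programs the decrementer.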

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 360e71d455cc..e7e8611e8863 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -513,6 +513,35 @@ static inline void clear_irq_work_pending(void)
 		"i" (offsetof(struct paca_struct, irq_work_pending)));
 }
 
+void arch_irq_work_raise(void)
+{
+	preempt_disable();
+	set_irq_work_pending_flag();
+	/*
+	 * Non-nmi code running with interrupts disabled will replay
+	 * irq_happened before it re-enables interrupts, so set the
+	 * decrementer there instead of causing a hardware exception
+	 * which would immediately hit the masked interrupt handler
+	 * and have the net effect of setting the decrementer in
+	 * irq_happened.
+	 *
+	 * NMI interrupts can not check this when they return, so the
+	 * decrementer hardware exception is raised, which will fire
+	 * when interrupts are next enabled.
+	 *
+	 * BookE does not support this yet, it must audit all NMI
+	 * interrupt handlers to ensure they call nmi_enter() so this
+	 * check would be correct.
+	 */
+	if (IS_ENABLED(CONFIG_BOOKE) || !irqs_disabled() || in_nmi()) {
+		set_dec(1);
+	} else {
+		hard_irq_disable();
+		local_paca->irq_happened |= PACA_IRQ_DEC;
+	}
+	preempt_enable();
+}
+
 #else /* 32-bit */
 
 DEFINE_PER_CPU(u8, irq_work_pending);
@@ -521,8 +550,6 @@ DEFINE_PER_CPU(u8, irq_work_pending);
 #define test_irq_work_pending()		__this_cpu_read(irq_work_pending)
 #define clear_irq_work_pending()	__this_cpu_write(irq_work_pending, 0)
 
-#endif /* 32 vs 64 bit */
-
 void arch_irq_work_raise(void)
 {
 	preempt_disable();
@@ -531,6 +558,8 @@ void arch_irq_work_raise(void)
 	preempt_enable();
 }
 
+#endif /* 32 vs 64 bit */
+
 #else  /* CONFIG_IRQ_WORK */
 
 #define test_irq_work_pending()	0
-- 
2.17.0


Thread overview: 14+ messages
2018-05-04 17:19 [PATCH 00/11] assortment of timer, watchdog, interrupt Nicholas Piggin
2018-05-04 17:19 ` Nicholas Piggin [this message]
2018-06-04 14:10   ` [01/11] powerpc/64: irq_work avoid interrupt when called with hardware irqs enabled Michael Ellerman
2018-05-04 17:19 ` [PATCH 02/11] powerpc/pseries: put cede MSR[EE] check under IRQ_SOFT_MASK_DEBUG Nicholas Piggin
2018-05-04 17:19 ` [PATCH 03/11] powerpc/64s: make PACA_IRQ_HARD_DIS track MSR[EE] closely Nicholas Piggin
2018-05-04 17:19 ` [PATCH 04/11] powerpc/64s: micro-optimise __hard_irq_enable() for mtmsrd L=1 support Nicholas Piggin
2018-05-04 17:19 ` [PATCH 05/11] powerpc/64: remove start_tb and accum_tb from thread_struct Nicholas Piggin
2018-05-04 17:19 ` [PATCH 06/11] powerpc/pseries: lparcfg calculate PURR on demand Nicholas Piggin
2018-05-04 17:19 ` [PATCH 07/11] powerpc: generic clockevents broadcast receiver call tick_receive_broadcast Nicholas Piggin
2018-05-05 14:38   ` kbuild test robot
2018-05-04 17:19 ` [PATCH 08/11] powerpc: allow soft-NMI watchdog to cover timer interrupts with large decrementers Nicholas Piggin
2018-05-04 17:19 ` [PATCH 09/11] powerpc: move timer broadcast code under GENERIC_CLOCKEVENTS_BROADCAST ifdef Nicholas Piggin
2018-05-04 17:19 ` [PATCH 10/11] powerpc: move a stray NMI IPI case under NMI_IPI ifdef Nicholas Piggin
2018-05-04 17:19 ` [PATCH 11/11] powerpc/time: account broadcast timer event interrupts separately Nicholas Piggin