From: Thomas Gleixner <tglx@linutronix.de>
To: LKML <linux-kernel@vger.kernel.org>
Cc: x86@kernel.org, "Paul E. McKenney" <paulmck@kernel.org>,
Andy Lutomirski <luto@kernel.org>,
Alexandre Chartre <alexandre.chartre@oracle.com>,
Frederic Weisbecker <frederic@kernel.org>,
Paolo Bonzini <pbonzini@redhat.com>,
Sean Christopherson <sean.j.christopherson@intel.com>,
Masami Hiramatsu <mhiramat@kernel.org>,
Petr Mladek <pmladek@suse.com>,
Steven Rostedt <rostedt@goodmis.org>,
Joel Fernandes <joel@joelfernandes.org>,
Boris Ostrovsky <boris.ostrovsky@oracle.com>,
Juergen Gross <jgross@suse.com>, Brian Gerst <brgerst@gmail.com>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Josh Poimboeuf <jpoimboe@redhat.com>,
Will Deacon <will@kernel.org>
Subject: [patch V4 part 5 07/31] x86/entry: Provide idtentry_entry/exit_cond_rcu()
Date: Tue, 05 May 2020 15:53:48 +0200 [thread overview]
Message-ID: <20200505135828.808686575@linutronix.de> (raw)
In-Reply-To: 20200505135341.730586321@linutronix.de
The pagefault handler cannot use the regular idtentry_enter() because on
that invokes rcu_irq_enter() the pagefault was caused in the kernel. Not a
problem per se, but kernel side page faults can schedule which is not
possible without invoking rcu_irq_exit().
Adding rcu_irq_exit() and a matching rcu_irq_enter() into the actual
pagefault handling code is possible, but not pretty either.
Provide idtentry_entry/exit_cond_rcu() which calls rcu_irq_enter() only
when RCU is not watching. While this is not a legit kernel #PF establishing
RCU before handling it avoids RCU side effects which might affect
debugability.
The function is also useful for implementing lightweight scheduler IPI
entry handling later.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/entry/common.c | 119 ++++++++++++++++++++++++++++++++++------
arch/x86/include/asm/idtentry.h | 3 +
2 files changed, 106 insertions(+), 16 deletions(-)
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -515,6 +515,28 @@ SYSCALL_DEFINE0(ni_syscall)
return -ENOSYS;
}
+static __always_inline bool __idtentry_enter(struct pt_regs *regs,
+ bool cond_rcu)
+{
+ if (user_mode(regs)) {
+ enter_from_user_mode();
+ } else {
+ if (!cond_rcu || !rcu_is_watching()) {
+ lockdep_hardirqs_off(CALLER_ADDR0);
+ rcu_irq_enter();
+ instr_begin();
+ trace_hardirqs_off_prepare();
+ instr_end();
+ return true;
+ } else {
+ instr_begin();
+ trace_hardirqs_off();
+ instr_end();
+ }
+ }
+ return false;
+}
+
/**
* idtentry_enter - Handle state tracking on idtentry
* @regs: Pointer to pt_regs of interrupted context
@@ -532,19 +554,60 @@ SYSCALL_DEFINE0(ni_syscall)
*/
void noinstr idtentry_enter(struct pt_regs *regs)
{
- if (user_mode(regs)) {
- enter_from_user_mode();
- } else {
- lockdep_hardirqs_off(CALLER_ADDR0);
- rcu_irq_enter();
- instr_begin();
- trace_hardirqs_off_prepare();
- instr_end();
- }
+ __idtentry_enter(regs, false);
+}
+
+/**
+ * idtentry_enter_cond_rcu - Handle state tracking on idtentry with conditional
+ * RCU handling
+ * @regs: Pointer to pt_regs of interrupted context
+ *
+ * Invokes:
+ * - lockdep irqflag state tracking as low level ASM entry disabled
+ * interrupts.
+ *
+ * - Context tracking if the exception hit user mode.
+ *
+ * - The hardirq tracer to keep the state consistent as low level ASM
+ * entry disabled interrupts.
+ *
+ * For kernel mode entries the conditional RCU handling is useful for two
+ * purposes
+ *
+ * 1) Pagefaults: Kernel code can fault and sleep, e.g. on exec. This code
+ * is not in an RCU idle section. If rcu_irq_enter() would be invoked
+ * then nothing would invoke rcu_irq_exit() before scheduling.
+ *
+ * If the kernel faults in a RCU idle section then all bets are off
+ * anyway but at least avoiding a subsequent issue vs. RCU is helpful for
+ * debugging.
+ *
+ * 2) Scheduler IPI: To avoid the overhead of a regular idtentry vs. RCU
+ * and irq_enter() the IPI can be made lightweight if the tracepoints
+ * are not enabled. While the IPI functionality itself does not require
+ * RCU (folding preempt count) it still calls out into instrumentable
+ * functions, e.g. ack_APIC_irq(). The scheduler IPI can hit RCU idle
+ * sections, so RCU needs to be adjusted. For the fast path case, e.g.
+ * KVM kicking a vCPU out of guest mode this can be avoided because the
+ * IPI is handled after KVM reestablished kernel context including RCU.
+ *
+ * For user mode entries enter_from_user_mode() must be invoked to
+ * establish the proper context for NOHZ_FULL. Otherwise scheduling on exit
+ * would not be possible.
+ *
+ * Returns: True if RCU has been adjusted on a kernel entry
+ * False otherwise
+ *
+ * The return value must be fed into the rcu_exit argument of
+ * idtentry_exit_cond_rcu().
+ */
+bool noinstr idtentry_enter_cond_rcu(struct pt_regs *regs)
+{
+ return __idtentry_enter(regs, true);
}
static __always_inline void __idtentry_exit(struct pt_regs *regs,
- bool preempt_hcall)
+ bool preempt_hcall, bool rcu_exit)
{
lockdep_assert_irqs_disabled();
@@ -568,7 +631,8 @@ static __always_inline void __idtentry_e
*/
if (!preempt_count()) {
instr_begin();
- rcu_irq_exit_preempt();
+ if (rcu_exit)
+ rcu_irq_exit_preempt();
if (need_resched())
preempt_schedule_irq();
/* Covers both tracing and lockdep */
@@ -592,11 +656,13 @@ static __always_inline void __idtentry_e
trace_hardirqs_on_prepare();
lockdep_hardirqs_on_prepare(CALLER_ADDR0);
instr_end();
- rcu_irq_exit();
+ if (rcu_exit)
+ rcu_irq_exit();
lockdep_hardirqs_on(CALLER_ADDR0);
} else {
/* IRQ flags state is correct already. Just tell RCU */
- rcu_irq_exit();
+ if (rcu_exit)
+ rcu_irq_exit();
}
}
@@ -617,7 +683,28 @@ static __always_inline void __idtentry_e
*/
void noinstr idtentry_exit(struct pt_regs *regs)
{
- __idtentry_exit(regs, false);
+ __idtentry_exit(regs, false, true);
+}
+
+/**
+ * idtentry_exit_cond_rcu - Handle return from exception with conditional RCU
+ * handling
+ * @regs: Pointer to pt_regs (exception entry regs)
+ * @rcu_exit: Invoke rcu_irq_exit() if true
+ *
+ * Depending on the return target (kernel/user) this runs the necessary
+ * preemption and work checks if possible and reguired and returns to
+ * the caller with interrupts disabled and no further work pending.
+ *
+ * This is the last action before returning to the low level ASM code which
+ * just needs to return to the appropriate context.
+ *
+ * Counterpart to idtentry_enter_cond_rcu(). The return value of the entry
+ * function must be fed into the @rcu_exit argument.
+ */
+void noinstr idtentry_exit_cond_rcu(struct pt_regs *regs, bool rcu_exit)
+{
+ __idtentry_exit(regs, false, rcu_exit);
}
#ifdef CONFIG_XEN_PV
@@ -658,11 +745,11 @@ static noinstr void run_on_irqstack(void
set_irq_regs(old_regs);
if (IS_ENABLED(CONFIG_PREEMPTION)) {
- __idtentry_exit(regs, false);
+ __idtentry_exit(regs, false, true);
} else {
bool inhcall = __this_cpu_read(xen_in_preemptible_hcall);
- __idtentry_exit(regs, inhcall && need_resched());
+ __idtentry_exit(regs, inhcall && need_resched(), true);
}
}
#endif /* CONFIG_XEN_PV */
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -10,6 +10,9 @@
void idtentry_enter(struct pt_regs *regs);
void idtentry_exit(struct pt_regs *regs);
+bool idtentry_enter_cond_rcu(struct pt_regs *regs);
+void idtentry_exit_cond_rcu(struct pt_regs *regs, bool rcu_exit);
+
/**
* DECLARE_IDTENTRY - Declare functions for simple IDT entry points
* No error code pushed by hardware
next prev parent reply other threads:[~2020-05-05 14:16 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-05-05 13:53 [patch V4 part 5 00/31] x86/entry: Entry/exception code rework, Thomas Gleixner
2020-05-05 13:53 ` [patch V4 part 5 01/31] genirq: Provide irq_enter/exit_rcu() Thomas Gleixner
2020-05-15 5:53 ` Andy Lutomirski
2020-05-05 13:53 ` [patch V4 part 5 02/31] x86/entry: Provide helpers for execute on irqstack Thomas Gleixner
2020-05-06 8:20 ` Thomas Gleixner
2020-05-10 4:33 ` Lai Jiangshan
2020-05-11 9:07 ` Alexandre Chartre
2020-05-11 11:54 ` Thomas Gleixner
2020-05-05 13:53 ` [patch V4 part 5 03/31] x86/entry/64: Move softirq stack switch to C Thomas Gleixner
2020-05-05 13:53 ` [patch V4 part 5 04/31] x86/entry: Split idtentry_enter/exit() Thomas Gleixner
2020-05-11 12:42 ` Alexandre Chartre
2020-05-05 13:53 ` [patch V4 part 5 05/31] x86/entry: Switch XEN/PV hypercall entry to IDTENTRY Thomas Gleixner
2020-05-07 2:11 ` Boris Ostrovsky
2020-05-07 8:30 ` Thomas Gleixner
2020-05-05 13:53 ` [patch V4 part 5 06/31] x86/entry/64: Simplify idtentry_body Thomas Gleixner
2020-05-05 13:53 ` Thomas Gleixner [this message]
2020-05-11 13:53 ` [patch V4 part 5 07/31] x86/entry: Provide idtentry_entry/exit_cond_rcu() Alexandre Chartre
2020-05-11 14:13 ` Peter Zijlstra
2020-05-12 16:30 ` Thomas Gleixner
2020-05-05 13:53 ` [patch V4 part 5 08/31] x86/entry: Switch page fault exception to IDTENTRY_RAW Thomas Gleixner
2020-05-05 13:53 ` [patch V4 part 5 09/31] x86/entry: Remove the transition leftovers Thomas Gleixner
2020-05-11 14:11 ` Alexandre Chartre
2020-05-05 13:53 ` [patch V4 part 5 10/31] x86/entry: Change exit path of xen_failsafe_callback Thomas Gleixner
2020-05-05 13:53 ` [patch V4 part 5 11/31] x86/entry/64: Remove error_exit Thomas Gleixner
2020-05-05 13:53 ` [patch V4 part 5 12/31] x86/entry/32: Remove common_exception Thomas Gleixner
2020-05-05 13:53 ` [patch V4 part 5 13/31] x86/irq: Convey vector as argument and not in ptregs Thomas Gleixner
2020-05-10 2:44 ` Lai Jiangshan
2020-05-11 14:35 ` Thomas Gleixner
2020-05-11 15:11 ` Lai Jiangshan
2020-05-05 13:53 ` [patch V4 part 5 14/31] x86/irq/64: Provide handle_irq() Thomas Gleixner
2020-05-05 13:53 ` [patch V4 part 5 15/31] x86/entry: Add IRQENTRY_IRQ macro Thomas Gleixner
2020-05-05 13:53 ` [patch V4 part 5 16/31] x86/entry: Use idtentry for interrupts Thomas Gleixner
2020-05-05 13:53 ` [patch V4 part 5 17/31] x86/entry: Provide IDTENTRY_SYSVEC Thomas Gleixner
2020-05-05 13:53 ` [patch V4 part 5 18/31] x86/entry: Convert APIC interrupts to IDTENTRY_SYSVEC Thomas Gleixner
2020-05-05 13:54 ` [patch V4 part 5 19/31] x86/entry: Convert SMP system vectors " Thomas Gleixner
2020-05-05 13:54 ` [patch V4 part 5 20/31] x86/entry: Convert various system vectors Thomas Gleixner
2020-05-05 13:54 ` [patch V4 part 5 21/31] x86/entry: Convert KVM vectors to IDTENTRY_SYSVEC Thomas Gleixner
2020-05-05 13:54 ` [patch V4 part 5 22/31] x86/entry: Convert various hypervisor " Thomas Gleixner
2020-05-06 16:56 ` Wei Liu
2020-05-06 17:11 ` Thomas Gleixner
2020-05-05 13:54 ` [patch V4 part 5 23/31] x86/entry: Convert XEN hypercall vector " Thomas Gleixner
2020-05-05 13:54 ` [patch V4 part 5 24/31] x86/entry: Convert reschedule interrupt to IDTENTRY_RAW Thomas Gleixner
2020-05-05 13:54 ` [patch V4 part 5 25/31] x86/entry: Remove the apic/BUILD interrupt leftovers Thomas Gleixner
2020-05-05 13:54 ` [patch V4 part 5 26/31] x86/entry/64: Remove IRQ stack switching ASM Thomas Gleixner
2020-05-05 13:54 ` [patch V4 part 5 27/31] x86/entry: Make enter_from_user_mode() static Thomas Gleixner
2020-05-05 13:54 ` [patch V4 part 5 28/31] x86/entry/32: Remove redundant irq disable code Thomas Gleixner
2020-05-05 13:54 ` [patch V4 part 5 29/31] x86/entry/64: Remove TRACE_IRQS_*_DEBUG Thomas Gleixner
2020-05-05 13:54 ` [patch V4 part 5 30/31] x86/entry: Move paranoid irq tracing out of ASM code Thomas Gleixner
2020-05-05 13:54 ` [patch V4 part 5 31/31] x86/entry: Remove the TRACE_IRQS cruft Thomas Gleixner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200505135828.808686575@linutronix.de \
--to=tglx@linutronix.de \
--cc=alexandre.chartre@oracle.com \
--cc=boris.ostrovsky@oracle.com \
--cc=brgerst@gmail.com \
--cc=frederic@kernel.org \
--cc=jgross@suse.com \
--cc=joel@joelfernandes.org \
--cc=jpoimboe@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=mhiramat@kernel.org \
--cc=paulmck@kernel.org \
--cc=pbonzini@redhat.com \
--cc=pmladek@suse.com \
--cc=rostedt@goodmis.org \
--cc=sean.j.christopherson@intel.com \
--cc=will@kernel.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).