* [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work
@ 2019-10-23 12:27 Thomas Gleixner
  2019-10-23 12:27 ` [patch V2 01/17] x86/entry/32: Remove unused resume_userspace label Thomas Gleixner
                   ` (19 more replies)
  0 siblings, 20 replies; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 12:27 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes

While working on a way to move posix CPU timer expiry out of the timer
interrupt context, I noticed that KVM does not handle pending task work
before entering a guest. A quick hack was to add that to the x86 KVM
handling loop. The discussion ended with a request to turn this into
generic infrastructure, which also means making the per-architecture
implementations of the enter-from and return-to user space handling
generic.

  https://lore.kernel.org/r/89E42BCC-47A8-458B-B06A-D6A20D20512C@amacapital.net

The series implements the syscall enter/exit and the general exit to
userspace work handling along with the pre guest enter functionality.

Changes vs. RFC version:

  - Dropped ARM64 conversion as requested by ARM64 folks

  - Addressed various review comments (Peter, Andy, Mike, Paolo, Josh,
    Miroslav)

  - Picked up, fixed and completed Peter's patch which makes interrupt
    enable/disable symmetric in trap handlers

  - Completed the removal of irq disabling / irq tracing from low level
    ASM code

  - Moved KVM specific parts of the enter guest mode handling to KVM
    (Paolo)

The series is also available from git:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP/core.entry

Thanks,

        tglx

RFC version: https://lore.kernel.org/r/20190919150314.054351477@linutronix.de

---
 /Makefile                             |    3 
 arch/Kconfig                          |    3 
 arch/x86/Kconfig                      |    1 
 arch/x86/entry/calling.h              |   12 +
 arch/x86/entry/common.c               |  264 ++------------------------------
 arch/x86/entry/entry_32.S             |   41 ----
 arch/x86/entry/entry_64.S             |   32 ---
 arch/x86/entry/entry_64_compat.S      |   30 ---
 arch/x86/include/asm/irqflags.h       |    8 
 arch/x86/include/asm/paravirt.h       |    9 -
 arch/x86/include/asm/signal.h         |    1 
 arch/x86/include/asm/thread_info.h    |    9 -
 arch/x86/kernel/signal.c              |    2 
 arch/x86/kernel/traps.c               |   33 ++--
 arch/x86/kvm/x86.c                    |   17 --
 arch/x86/mm/fault.c                   |    7 
 b/arch/x86/include/asm/entry-common.h |  104 ++++++++++++
 b/arch/x86/kvm/Kconfig                |    1 
 b/include/linux/entry-common.h        |  280 ++++++++++++++++++++++++++++++++++
 b/kernel/entry/common.c               |  184 ++++++++++++++++++++++
 include/linux/kvm_host.h              |   64 +++++++
 kernel/Makefile                       |    1 
 virt/kvm/Kconfig                      |    3 
 23 files changed, 735 insertions(+), 374 deletions(-)




^ permalink raw reply	[flat|nested] 64+ messages in thread

* [patch V2 01/17] x86/entry/32: Remove unused resume_userspace label
  2019-10-23 12:27 [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Thomas Gleixner
@ 2019-10-23 12:27 ` Thomas Gleixner
  2019-10-23 13:43   ` Sean Christopherson
                     ` (2 more replies)
  2019-10-23 12:27 ` [patch V2 02/17] x86/entry/64: Remove pointless jump in paranoid_exit Thomas Gleixner
                   ` (18 subsequent siblings)
  19 siblings, 3 replies; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 12:27 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes

The C reimplementation of SYSENTER left that unused ENTRY() label
around. Remove it.

Fixes: 5f310f739b4c ("x86/entry/32: Re-implement SYSENTER using the new C path")
Originally-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/entry/entry_32.S |    1 -
 1 file changed, 1 deletion(-)

--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -825,7 +825,6 @@ END(ret_from_fork)
 	cmpl	$USER_RPL, %eax
 	jb	restore_all_kernel		# not returning to v8086 or userspace
 
-ENTRY(resume_userspace)
 	DISABLE_INTERRUPTS(CLBR_ANY)
 	TRACE_IRQS_OFF
 	movl	%esp, %eax




* [patch V2 02/17] x86/entry/64: Remove pointless jump in paranoid_exit
  2019-10-23 12:27 [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Thomas Gleixner
  2019-10-23 12:27 ` [patch V2 01/17] x86/entry/32: Remove unused resume_userspace label Thomas Gleixner
@ 2019-10-23 12:27 ` Thomas Gleixner
  2019-10-23 13:45   ` Sean Christopherson
                     ` (2 more replies)
  2019-10-23 12:27 ` [patch V2 03/17] x86/traps: Remove pointless irq enable from do_spurious_interrupt_bug() Thomas Gleixner
                   ` (17 subsequent siblings)
  19 siblings, 3 replies; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 12:27 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes

Jump directly to restore_regs_and_return_to_kernel instead of taking a
pointless extra jump through .Lparanoid_exit_restore.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/entry/entry_64.S |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1272,12 +1272,11 @@ ENTRY(paranoid_exit)
 	/* Always restore stashed CR3 value (see paranoid_entry) */
 	RESTORE_CR3	scratch_reg=%rbx save_reg=%r14
 	SWAPGS_UNSAFE_STACK
-	jmp	.Lparanoid_exit_restore
+	jmp	restore_regs_and_return_to_kernel
 .Lparanoid_exit_no_swapgs:
 	TRACE_IRQS_IRETQ_DEBUG
 	/* Always restore stashed CR3 value (see paranoid_entry) */
 	RESTORE_CR3	scratch_reg=%rbx save_reg=%r14
-.Lparanoid_exit_restore:
 	jmp restore_regs_and_return_to_kernel
 END(paranoid_exit)
 




* [patch V2 03/17] x86/traps: Remove pointless irq enable from do_spurious_interrupt_bug()
  2019-10-23 12:27 [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Thomas Gleixner
  2019-10-23 12:27 ` [patch V2 01/17] x86/entry/32: Remove unused resume_userspace label Thomas Gleixner
  2019-10-23 12:27 ` [patch V2 02/17] x86/entry/64: Remove pointless jump in paranoid_exit Thomas Gleixner
@ 2019-10-23 12:27 ` Thomas Gleixner
  2019-10-23 13:52   ` Sean Christopherson
                     ` (3 more replies)
  2019-10-23 12:27 ` [patch V2 04/17] x86/entry: Make DEBUG_ENTRY_ASSERT_IRQS_OFF available for 32bit Thomas Gleixner
                   ` (16 subsequent siblings)
  19 siblings, 4 replies; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 12:27 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes

That function returns immediately after conditionally reenabling interrupts,
which is worse than pointless as it requires the ASM return code to disable
interrupts again.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/traps.c |    1 -
 1 file changed, 1 deletion(-)

--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -871,7 +871,6 @@ do_simd_coprocessor_error(struct pt_regs
 dotraplinkage void
 do_spurious_interrupt_bug(struct pt_regs *regs, long error_code)
 {
-	cond_local_irq_enable(regs);
 }
 
 dotraplinkage void




* [patch V2 04/17] x86/entry: Make DEBUG_ENTRY_ASSERT_IRQS_OFF available for 32bit
  2019-10-23 12:27 [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Thomas Gleixner
                   ` (2 preceding siblings ...)
  2019-10-23 12:27 ` [patch V2 03/17] x86/traps: Remove pointless irq enable from do_spurious_interrupt_bug() Thomas Gleixner
@ 2019-10-23 12:27 ` Thomas Gleixner
  2019-10-23 14:16   ` Sean Christopherson
  2019-11-06 15:50   ` Alexandre Chartre
  2019-10-23 12:27 ` [patch V2 05/17] x86/traps: Make interrupt enable/disable symmetric in C code Thomas Gleixner
                   ` (15 subsequent siblings)
  19 siblings, 2 replies; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 12:27 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes

Move the interrupt state verification debug macro to common code and fix up
the irqflags and paravirt components so it can be used in 32bit code later.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/entry/calling.h        |   12 ++++++++++++
 arch/x86/entry/entry_64.S       |   12 ------------
 arch/x86/include/asm/irqflags.h |    8 ++++++--
 arch/x86/include/asm/paravirt.h |    9 +++++----
 4 files changed, 23 insertions(+), 18 deletions(-)

--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -366,3 +366,15 @@ For 32-bit we have the following convent
 #else
 #define GET_CR2_INTO(reg) _ASM_MOV %cr2, reg
 #endif
+
+.macro DEBUG_ENTRY_ASSERT_IRQS_OFF
+#ifdef CONFIG_DEBUG_ENTRY
+	push %_ASM_AX
+	SAVE_FLAGS(CLBR_EAX)
+	test $X86_EFLAGS_IF, %_ASM_AX
+	jz .Lokay_\@
+	ud2
+.Lokay_\@:
+	pop %_ASM_AX
+#endif
+.endm
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -387,18 +387,6 @@ ENTRY(spurious_entries_start)
     .endr
 END(spurious_entries_start)
 
-.macro DEBUG_ENTRY_ASSERT_IRQS_OFF
-#ifdef CONFIG_DEBUG_ENTRY
-	pushq %rax
-	SAVE_FLAGS(CLBR_RAX)
-	testl $X86_EFLAGS_IF, %eax
-	jz .Lokay_\@
-	ud2
-.Lokay_\@:
-	popq %rax
-#endif
-.endm
-
 /*
  * Enters the IRQ stack if we're not already using it.  NMI-safe.  Clobbers
  * flags and puts old RSP into old_rsp, and leaves all other GPRs alone.
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -126,11 +126,15 @@ static inline notrace unsigned long arch
 #define ENABLE_INTERRUPTS(x)	sti
 #define DISABLE_INTERRUPTS(x)	cli
 
-#ifdef CONFIG_X86_64
 #ifdef CONFIG_DEBUG_ENTRY
-#define SAVE_FLAGS(x)		pushfq; popq %rax
+# ifdef CONFIG_X86_64
+#  define SAVE_FLAGS(x)		pushfq; popq %rax
+# else
+#  define SAVE_FLAGS(x)		pushfl; popl %eax
+# endif
 #endif
 
+#ifdef CONFIG_X86_64
 #define SWAPGS	swapgs
 /*
  * Currently paravirt can't handle swapgs nicely when we
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -904,6 +904,11 @@ extern void default_banner(void);
 		  ANNOTATE_RETPOLINE_SAFE;				\
 		  jmp PARA_INDIRECT(pv_ops+PV_CPU_usergs_sysret64);)
 
+#endif /* CONFIG_PARAVIRT_XXL */
+#endif	/* CONFIG_X86_64 */
+
+#ifdef CONFIG_PARAVIRT_XXL
+
 #ifdef CONFIG_DEBUG_ENTRY
 #define SAVE_FLAGS(clobbers)                                        \
 	PARA_SITE(PARA_PATCH(PV_IRQ_save_fl),			    \
@@ -912,10 +917,6 @@ extern void default_banner(void);
 		  call PARA_INDIRECT(pv_ops+PV_IRQ_save_fl);	    \
 		  PV_RESTORE_REGS(clobbers | CLBR_CALLEE_SAVE);)
 #endif
-#endif /* CONFIG_PARAVIRT_XXL */
-#endif	/* CONFIG_X86_64 */
-
-#ifdef CONFIG_PARAVIRT_XXL
 
 #define GET_CR2_INTO_AX							\
 	PARA_SITE(PARA_PATCH(PV_MMU_read_cr2),				\




* [patch V2 05/17] x86/traps: Make interrupt enable/disable symmetric in C code
  2019-10-23 12:27 [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Thomas Gleixner
                   ` (3 preceding siblings ...)
  2019-10-23 12:27 ` [patch V2 04/17] x86/entry: Make DEBUG_ENTRY_ASSERT_IRQS_OFF available for 32bit Thomas Gleixner
@ 2019-10-23 12:27 ` Thomas Gleixner
  2019-10-23 14:16   ` Sean Christopherson
                     ` (2 more replies)
  2019-10-23 12:27 ` [patch V2 06/17] x86/entry/32: Remove redundant interrupt disable Thomas Gleixner
                   ` (14 subsequent siblings)
  19 siblings, 3 replies; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 12:27 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes

Traps enable interrupts conditionally but rely on the ASM return code to
disable them again. That results in redundant interrupt disable and trace
calls.

Make the trap handlers disable interrupts before returning to avoid that,
which allows simplification of the ASM entry code.

Originally-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/traps.c |   32 +++++++++++++++++++++-----------
 arch/x86/mm/fault.c     |    7 +++++--
 2 files changed, 26 insertions(+), 13 deletions(-)

--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -276,6 +276,7 @@ static void do_error_trap(struct pt_regs
 			NOTIFY_STOP) {
 		cond_local_irq_enable(regs);
 		do_trap(trapnr, signr, str, regs, error_code, sicode, addr);
+		cond_local_irq_disable(regs);
 	}
 }
 
@@ -501,6 +502,7 @@ dotraplinkage void do_bounds(struct pt_r
 		die("bounds", regs, error_code);
 	}
 
+	cond_local_irq_disable(regs);
 	return;
 
 exit_trap:
@@ -512,6 +514,7 @@ dotraplinkage void do_bounds(struct pt_r
 	 * time..
 	 */
 	do_trap(X86_TRAP_BR, SIGSEGV, "bounds", regs, error_code, 0, NULL);
+	cond_local_irq_disable(regs);
 }
 
 dotraplinkage void
@@ -525,19 +528,19 @@ do_general_protection(struct pt_regs *re
 
 	if (static_cpu_has(X86_FEATURE_UMIP)) {
 		if (user_mode(regs) && fixup_umip_exception(regs))
-			return;
+			goto exit_trap;
 	}
 
 	if (v8086_mode(regs)) {
 		local_irq_enable();
 		handle_vm86_fault((struct kernel_vm86_regs *) regs, error_code);
-		return;
+		goto exit_trap;
 	}
 
 	tsk = current;
 	if (!user_mode(regs)) {
 		if (fixup_exception(regs, X86_TRAP_GP, error_code, 0))
-			return;
+			goto exit_trap;
 
 		tsk->thread.error_code = error_code;
 		tsk->thread.trap_nr = X86_TRAP_GP;
@@ -549,12 +552,12 @@ do_general_protection(struct pt_regs *re
 		 */
 		if (!preemptible() && kprobe_running() &&
 		    kprobe_fault_handler(regs, X86_TRAP_GP))
-			return;
+			goto exit_trap;
 
 		if (notify_die(DIE_GPF, desc, regs, error_code,
 			       X86_TRAP_GP, SIGSEGV) != NOTIFY_STOP)
 			die(desc, regs, error_code);
-		return;
+		goto exit_trap;
 	}
 
 	tsk->thread.error_code = error_code;
@@ -563,6 +566,8 @@ do_general_protection(struct pt_regs *re
 	show_signal(tsk, SIGSEGV, "", desc, regs, error_code);
 
 	force_sig(SIGSEGV);
+exit_trap:
+	cond_local_irq_disable(regs);
 }
 NOKPROBE_SYMBOL(do_general_protection);
 
@@ -783,9 +788,7 @@ dotraplinkage void do_debug(struct pt_re
 	if (v8086_mode(regs)) {
 		handle_vm86_trap((struct kernel_vm86_regs *) regs, error_code,
 					X86_TRAP_DB);
-		cond_local_irq_disable(regs);
-		debug_stack_usage_dec();
-		goto exit;
+		goto exit_irq;
 	}
 
 	if (WARN_ON_ONCE((dr6 & DR_STEP) && !user_mode(regs))) {
@@ -802,6 +805,8 @@ dotraplinkage void do_debug(struct pt_re
 	si_code = get_si_code(tsk->thread.debugreg6);
 	if (tsk->thread.debugreg6 & (DR_STEP | DR_TRAP_BITS) || user_icebp)
 		send_sigtrap(regs, error_code, si_code);
+
+exit_irq:
 	cond_local_irq_disable(regs);
 	debug_stack_usage_dec();
 
@@ -827,7 +832,7 @@ static void math_error(struct pt_regs *r
 
 	if (!user_mode(regs)) {
 		if (fixup_exception(regs, trapnr, error_code, 0))
-			return;
+			goto exit_trap;
 
 		task->thread.error_code = error_code;
 		task->thread.trap_nr = trapnr;
@@ -835,7 +840,7 @@ static void math_error(struct pt_regs *r
 		if (notify_die(DIE_TRAP, str, regs, error_code,
 					trapnr, SIGFPE) != NOTIFY_STOP)
 			die(str, regs, error_code);
-		return;
+		goto exit_trap;
 	}
 
 	/*
@@ -849,10 +854,12 @@ static void math_error(struct pt_regs *r
 	si_code = fpu__exception_code(fpu, trapnr);
 	/* Retry when we get spurious exceptions: */
 	if (!si_code)
-		return;
+		goto exit_trap;
 
 	force_sig_fault(SIGFPE, si_code,
 			(void __user *)uprobe_get_trap_addr(regs));
+exit_trap:
+	cond_local_irq_disable(regs);
 }
 
 dotraplinkage void do_coprocessor_error(struct pt_regs *regs, long error_code)
@@ -888,6 +895,8 @@ do_device_not_available(struct pt_regs *
 
 		info.regs = regs;
 		math_emulate(&info);
+
+		cond_local_irq_disable(regs);
 		return;
 	}
 #endif
@@ -918,6 +927,7 @@ dotraplinkage void do_iret_error(struct
 		do_trap(X86_TRAP_IRET, SIGILL, "iret exception", regs, error_code,
 			ILL_BADSTK, (void __user *)NULL);
 	}
+	local_irq_disable();
 }
 #endif
 
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1500,10 +1500,13 @@ static noinline void
 		return;
 
 	/* Was the fault on kernel-controlled part of the address space? */
-	if (unlikely(fault_in_kernel_space(address)))
+	if (unlikely(fault_in_kernel_space(address))) {
 		do_kern_addr_fault(regs, hw_error_code, address);
-	else
+	} else {
 		do_user_addr_fault(regs, hw_error_code, address);
+		if (regs->flags & X86_EFLAGS_IF)
+			local_irq_disable();
+	}
 }
 NOKPROBE_SYMBOL(__do_page_fault);
 




* [patch V2 06/17] x86/entry/32: Remove redundant interrupt disable
  2019-10-23 12:27 [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Thomas Gleixner
                   ` (4 preceding siblings ...)
  2019-10-23 12:27 ` [patch V2 05/17] x86/traps: Make interrupt enable/disable symmetric in C code Thomas Gleixner
@ 2019-10-23 12:27 ` Thomas Gleixner
  2019-10-23 14:17   ` Sean Christopherson
  2019-11-08 10:41   ` Alexandre Chartre
  2019-10-23 12:27 ` [patch V2 07/17] x86/entry/64: " Thomas Gleixner
                   ` (13 subsequent siblings)
  19 siblings, 2 replies; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 12:27 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes

Now that the trap handlers return with interrupts disabled, the
unconditional disabling of interrupts in the low level entry code can be
removed, along with the trace calls and the misnamed preempt_stop macro.
As a consequence, ret_from_exception and ret_from_intr collapse into one.

Add a debug check to verify that interrupts are disabled depending on
CONFIG_DEBUG_ENTRY.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/entry/entry_32.S |   21 ++++++---------------
 1 file changed, 6 insertions(+), 15 deletions(-)

--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -63,12 +63,6 @@
  * enough to patch inline, increasing performance.
  */
 
-#ifdef CONFIG_PREEMPTION
-# define preempt_stop(clobbers)	DISABLE_INTERRUPTS(clobbers); TRACE_IRQS_OFF
-#else
-# define preempt_stop(clobbers)
-#endif
-
 .macro TRACE_IRQS_IRET
 #ifdef CONFIG_TRACE_IRQFLAGS
 	testl	$X86_EFLAGS_IF, PT_EFLAGS(%esp)     # interrupts off?
@@ -809,8 +803,7 @@ END(ret_from_fork)
 	# userspace resumption stub bypassing syscall exit tracing
 	ALIGN
 ret_from_exception:
-	preempt_stop(CLBR_ANY)
-ret_from_intr:
+	DEBUG_ENTRY_ASSERT_IRQS_OFF
 #ifdef CONFIG_VM86
 	movl	PT_EFLAGS(%esp), %eax		# mix EFLAGS and CS
 	movb	PT_CS(%esp), %al
@@ -825,8 +818,6 @@ END(ret_from_fork)
 	cmpl	$USER_RPL, %eax
 	jb	restore_all_kernel		# not returning to v8086 or userspace
 
-	DISABLE_INTERRUPTS(CLBR_ANY)
-	TRACE_IRQS_OFF
 	movl	%esp, %eax
 	call	prepare_exit_to_usermode
 	jmp	restore_all
@@ -1084,7 +1075,7 @@ ENTRY(entry_INT80_32)
 
 restore_all_kernel:
 #ifdef CONFIG_PREEMPTION
-	DISABLE_INTERRUPTS(CLBR_ANY)
+	/* Interrupts are disabled and debug-checked */
 	cmpl	$0, PER_CPU_VAR(__preempt_count)
 	jnz	.Lno_preempt
 	testl	$X86_EFLAGS_IF, PT_EFLAGS(%esp)	# interrupts off (exception path) ?
@@ -1189,7 +1180,7 @@ END(spurious_entries_start)
 	TRACE_IRQS_OFF
 	movl	%esp, %eax
 	call	smp_spurious_interrupt
-	jmp	ret_from_intr
+	jmp	ret_from_exception
 ENDPROC(common_spurious)
 #endif
 
@@ -1207,7 +1198,7 @@ ENDPROC(common_spurious)
 	TRACE_IRQS_OFF
 	movl	%esp, %eax
 	call	do_IRQ
-	jmp	ret_from_intr
+	jmp	ret_from_exception
 ENDPROC(common_interrupt)
 
 #define BUILD_INTERRUPT3(name, nr, fn)			\
@@ -1219,7 +1210,7 @@ ENTRY(name)						\
 	TRACE_IRQS_OFF					\
 	movl	%esp, %eax;				\
 	call	fn;					\
-	jmp	ret_from_intr;				\
+	jmp	ret_from_exception;				\
 ENDPROC(name)
 
 #define BUILD_INTERRUPT(name, nr)		\
@@ -1366,7 +1357,7 @@ ENTRY(xen_do_upcall)
 #ifndef CONFIG_PREEMPTION
 	call	xen_maybe_preempt_hcall
 #endif
-	jmp	ret_from_intr
+	jmp	ret_from_exception
 ENDPROC(xen_hypervisor_callback)
 
 /*




* [patch V2 07/17] x86/entry/64: Remove redundant interrupt disable
  2019-10-23 12:27 [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Thomas Gleixner
                   ` (5 preceding siblings ...)
  2019-10-23 12:27 ` [patch V2 06/17] x86/entry/32: Remove redundant interrupt disable Thomas Gleixner
@ 2019-10-23 12:27 ` Thomas Gleixner
  2019-10-23 14:20   ` Sean Christopherson
                     ` (2 more replies)
  2019-10-23 12:27 ` [patch V2 08/17] x86/entry: Move syscall irq tracing to C code Thomas Gleixner
                   ` (12 subsequent siblings)
  19 siblings, 3 replies; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 12:27 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes

Now that the trap handlers return with interrupts disabled, the
unconditional disabling of interrupts in the low level entry code can be
removed along with the trace calls.

Add debug checks where appropriate.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/entry/entry_64.S |    9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -595,8 +595,7 @@ END(common_spurious)
 	call	do_IRQ	/* rdi points to pt_regs */
 	/* 0(%rsp): old RSP */
 ret_from_intr:
-	DISABLE_INTERRUPTS(CLBR_ANY)
-	TRACE_IRQS_OFF
+	DEBUG_ENTRY_ASSERT_IRQS_OFF
 
 	LEAVE_IRQ_STACK
 
@@ -1252,8 +1251,7 @@ END(paranoid_entry)
  */
 ENTRY(paranoid_exit)
 	UNWIND_HINT_REGS
-	DISABLE_INTERRUPTS(CLBR_ANY)
-	TRACE_IRQS_OFF_DEBUG
+	DEBUG_ENTRY_ASSERT_IRQS_OFF
 	testl	%ebx, %ebx			/* swapgs needed? */
 	jnz	.Lparanoid_exit_no_swapgs
 	TRACE_IRQS_IRETQ
@@ -1356,8 +1354,7 @@ END(error_entry)
 
 ENTRY(error_exit)
 	UNWIND_HINT_REGS
-	DISABLE_INTERRUPTS(CLBR_ANY)
-	TRACE_IRQS_OFF
+	DEBUG_ENTRY_ASSERT_IRQS_OFF
 	testb	$3, CS(%rsp)
 	jz	retint_kernel
 	jmp	retint_user




* [patch V2 08/17] x86/entry: Move syscall irq tracing to C code
  2019-10-23 12:27 [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Thomas Gleixner
                   ` (6 preceding siblings ...)
  2019-10-23 12:27 ` [patch V2 07/17] x86/entry/64: " Thomas Gleixner
@ 2019-10-23 12:27 ` Thomas Gleixner
  2019-10-23 21:30   ` Andy Lutomirski
  2019-10-23 12:27 ` [patch V2 09/17] x86/entry: Remove _TIF_NOHZ from _TIF_WORK_SYSCALL_ENTRY Thomas Gleixner
                   ` (11 subsequent siblings)
  19 siblings, 1 reply; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 12:27 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes

Interrupt state tracing can be safely done in C code. The few stack
operations in assembly do not need to be covered.

Remove the now pointless indirection via .Lsyscall_32_done and jump to
swapgs_restore_regs_and_return_to_usermode directly.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/entry/common.c          |   10 ++++++++++
 arch/x86/entry/entry_32.S        |   17 -----------------
 arch/x86/entry/entry_64.S        |    6 ------
 arch/x86/entry/entry_64_compat.S |   30 ++++--------------------------
 4 files changed, 14 insertions(+), 49 deletions(-)

--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -218,6 +218,9 @@ static void exit_to_usermode_loop(struct
 	user_enter_irqoff();
 
 	mds_user_clear_cpu_buffers();
+
+	/* The return to usermode reenables interrupts. Tell the tracer */
+	trace_hardirqs_on();
 }
 
 #define SYSCALL_EXIT_WORK_FLAGS				\
@@ -279,6 +282,9 @@ static void syscall_slow_exit_work(struc
 {
 	struct thread_info *ti;
 
+	/* User to kernel transition disabled interrupts. */
+	trace_hardirqs_off();
+
 	enter_from_user_mode();
 	local_irq_enable();
 	ti = current_thread_info();
@@ -351,6 +357,7 @@ static __always_inline void do_syscall_3
 /* Handles int $0x80 */
 __visible void do_int80_syscall_32(struct pt_regs *regs)
 {
+	trace_hardirqs_off();
 	enter_from_user_mode();
 	local_irq_enable();
 	do_syscall_32_irqs_on(regs);
@@ -367,6 +374,9 @@ static __always_inline void do_syscall_3
 	unsigned long landing_pad = (unsigned long)current->mm->context.vdso +
 		vdso_image_32.sym_int80_landing_pad;
 
+	/* User to kernel transition disabled interrupts. */
+	trace_hardirqs_off();
+
 	/*
 	 * SYSENTER loses EIP, and even SYSCALL32 needs us to skip forward
 	 * so that 'regs->ip -= 2' lands back on an int $0x80 instruction.
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -924,12 +924,6 @@ ENTRY(entry_SYSENTER_32)
 	jnz	.Lsysenter_fix_flags
 .Lsysenter_flags_fixed:
 
-	/*
-	 * User mode is traced as though IRQs are on, and SYSENTER
-	 * turned them off.
-	 */
-	TRACE_IRQS_OFF
-
 	movl	%esp, %eax
 	call	do_fast_syscall_32
 	/* XEN PV guests always use IRET path */
@@ -939,8 +933,6 @@ ENTRY(entry_SYSENTER_32)
 	STACKLEAK_ERASE
 
 /* Opportunistic SYSEXIT */
-	TRACE_IRQS_ON			/* User mode traces as IRQs on. */
-
 	/*
 	 * Setup entry stack - we keep the pointer in %eax and do the
 	 * switch after almost all user-state is restored.
@@ -1039,12 +1031,6 @@ ENTRY(entry_INT80_32)
 
 	SAVE_ALL pt_regs_ax=$-ENOSYS switch_stacks=1	/* save rest */
 
-	/*
-	 * User mode is traced as though IRQs are on, and the interrupt gate
-	 * turned them off.
-	 */
-	TRACE_IRQS_OFF
-
 	movl	%esp, %eax
 	call	do_int80_syscall_32
 .Lsyscall_32_done:
@@ -1052,11 +1038,8 @@ ENTRY(entry_INT80_32)
 	STACKLEAK_ERASE
 
 restore_all:
-	TRACE_IRQS_IRET
 	SWITCH_TO_ENTRY_STACK
-.Lrestore_all_notrace:
 	CHECK_AND_APPLY_ESPFIX
-.Lrestore_nocheck:
 	/* Switch back to user CR3 */
 	SWITCH_TO_USER_CR3 scratch_reg=%eax
 
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -167,15 +167,11 @@ GLOBAL(entry_SYSCALL_64_after_hwframe)
 
 	PUSH_AND_CLEAR_REGS rax=$-ENOSYS
 
-	TRACE_IRQS_OFF
-
 	/* IRQs are off. */
 	movq	%rax, %rdi
 	movq	%rsp, %rsi
 	call	do_syscall_64		/* returns with IRQs disabled */
 
-	TRACE_IRQS_IRETQ		/* we're about to change IF */
-
 	/*
 	 * Try to use SYSRET instead of IRET if we're returning to
 	 * a completely clean 64-bit userspace context.  If we're not,
@@ -342,7 +338,6 @@ ENTRY(ret_from_fork)
 	UNWIND_HINT_REGS
 	movq	%rsp, %rdi
 	call	syscall_return_slowpath	/* returns with IRQs disabled */
-	TRACE_IRQS_ON			/* user mode is traced as IRQS on */
 	jmp	swapgs_restore_regs_and_return_to_usermode
 
 1:
@@ -606,7 +601,6 @@ END(common_spurious)
 GLOBAL(retint_user)
 	mov	%rsp,%rdi
 	call	prepare_exit_to_usermode
-	TRACE_IRQS_IRETQ
 
 GLOBAL(swapgs_restore_regs_and_return_to_usermode)
 #ifdef CONFIG_DEBUG_ENTRY
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -129,17 +129,11 @@ ENTRY(entry_SYSENTER_compat)
 	jnz	.Lsysenter_fix_flags
 .Lsysenter_flags_fixed:
 
-	/*
-	 * User mode is traced as though IRQs are on, and SYSENTER
-	 * turned them off.
-	 */
-	TRACE_IRQS_OFF
-
 	movq	%rsp, %rdi
 	call	do_fast_syscall_32
 	/* XEN PV guests always use IRET path */
-	ALTERNATIVE "testl %eax, %eax; jz .Lsyscall_32_done", \
-		    "jmp .Lsyscall_32_done", X86_FEATURE_XENPV
+	ALTERNATIVE "testl %eax, %eax; jz swapgs_restore_regs_and_return_to_usermode", \
+		    "jmp swapgs_restore_regs_and_return_to_usermode", X86_FEATURE_XENPV
 	jmp	sysret32_from_system_call
 
 .Lsysenter_fix_flags:
@@ -247,17 +241,11 @@ GLOBAL(entry_SYSCALL_compat_after_hwfram
 	pushq   $0			/* pt_regs->r15 = 0 */
 	xorl	%r15d, %r15d		/* nospec   r15 */
 
-	/*
-	 * User mode is traced as though IRQs are on, and SYSENTER
-	 * turned them off.
-	 */
-	TRACE_IRQS_OFF
-
 	movq	%rsp, %rdi
 	call	do_fast_syscall_32
 	/* XEN PV guests always use IRET path */
-	ALTERNATIVE "testl %eax, %eax; jz .Lsyscall_32_done", \
-		    "jmp .Lsyscall_32_done", X86_FEATURE_XENPV
+	ALTERNATIVE "testl %eax, %eax; jz swapgs_restore_regs_and_return_to_usermode", \
+		    "jmp swapgs_restore_regs_and_return_to_usermode", X86_FEATURE_XENPV
 
 	/* Opportunistic SYSRET */
 sysret32_from_system_call:
@@ -266,7 +254,6 @@ GLOBAL(entry_SYSCALL_compat_after_hwfram
 	 * stack. So let's erase the thread stack right now.
 	 */
 	STACKLEAK_ERASE
-	TRACE_IRQS_ON			/* User mode traces as IRQs on. */
 	movq	RBX(%rsp), %rbx		/* pt_regs->rbx */
 	movq	RBP(%rsp), %rbp		/* pt_regs->rbp */
 	movq	EFLAGS(%rsp), %r11	/* pt_regs->flags (in r11) */
@@ -403,17 +390,8 @@ ENTRY(entry_INT80_compat)
 	xorl	%r15d, %r15d		/* nospec   r15 */
 	cld
 
-	/*
-	 * User mode is traced as though IRQs are on, and the interrupt
-	 * gate turned them off.
-	 */
-	TRACE_IRQS_OFF
-
 	movq	%rsp, %rdi
 	call	do_int80_syscall_32
-.Lsyscall_32_done:
 
-	/* Go back to user mode. */
-	TRACE_IRQS_ON
 	jmp	swapgs_restore_regs_and_return_to_usermode
 END(entry_INT80_compat)




* [patch V2 09/17] x86/entry: Remove _TIF_NOHZ from _TIF_WORK_SYSCALL_ENTRY
  2019-10-23 12:27 [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Thomas Gleixner
                   ` (7 preceding siblings ...)
  2019-10-23 12:27 ` [patch V2 08/17] x86/entry: Move syscall irq tracing to C code Thomas Gleixner
@ 2019-10-23 12:27 ` Thomas Gleixner
  2020-01-06  4:11   ` Frederic Weisbecker
  2019-10-23 12:27 ` [patch V2 10/17] entry: Provide generic syscall entry functionality Thomas Gleixner
                   ` (10 subsequent siblings)
  19 siblings, 1 reply; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 12:27 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes

From: Thomas Gleixner <tglx@linutronix.de>

Evaluating _TIF_NOHZ to decide whether to use the slow syscall entry path
is not only pointless, it's actually counterproductive:

 1) Context tracking code is invoked unconditionally before that flag is
    evaluated.

 2) If the flag is set, the slow path is invoked for nothing due to #1

Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/thread_info.h |    8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -133,14 +133,10 @@ struct thread_info {
 #define _TIF_X32		(1 << TIF_X32)
 #define _TIF_FSCHECK		(1 << TIF_FSCHECK)
 
-/*
- * work to do in syscall_trace_enter().  Also includes TIF_NOHZ for
- * enter_from_user_mode()
- */
+/* Work to do before invoking the actual syscall. */
 #define _TIF_WORK_SYSCALL_ENTRY	\
 	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_EMU | _TIF_SYSCALL_AUDIT |	\
-	 _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT |	\
-	 _TIF_NOHZ)
+	 _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT)
 
 /* flags to check in __switch_to() */
 #define _TIF_WORK_CTXSW_BASE						\




* [patch V2 10/17] entry: Provide generic syscall entry functionality
  2019-10-23 12:27 [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Thomas Gleixner
                   ` (8 preceding siblings ...)
  2019-10-23 12:27 ` [patch V2 09/17] x86/entry: Remove _TIF_NOHZ from _TIF_WORK_SYSCALL_ENTRY Thomas Gleixner
@ 2019-10-23 12:27 ` Thomas Gleixner
  2019-10-23 12:27 ` [patch V2 11/17] x86/entry: Use generic syscall entry function Thomas Gleixner
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 12:27 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes

From: Thomas Gleixner <tglx@linutronix.de>

On syscall entry, certain work like tracing and seccomp needs to be done
conditionally. This code is duplicated across all architectures.

Provide a generic version.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: Fix function documentation (Mike)
    Add comment about return value (Andy)
---
 arch/Kconfig                 |    3 
 include/linux/entry-common.h |  132 +++++++++++++++++++++++++++++++++++++++++++
 kernel/Makefile              |    1 
 kernel/entry/Makefile        |    3 
 kernel/entry/common.c        |   33 ++++++++++
 5 files changed, 172 insertions(+)

--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -27,6 +27,9 @@ config HAVE_IMA_KEXEC
 config HOTPLUG_SMT
 	bool
 
+config GENERIC_ENTRY
+	bool
+
 config OPROFILE
 	tristate "OProfile system profiling"
 	depends on PROFILING
--- /dev/null
+++ b/include/linux/entry-common.h
@@ -0,0 +1,132 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __LINUX_ENTRYCOMMON_H
+#define __LINUX_ENTRYCOMMON_H
+
+#include <linux/tracehook.h>
+#include <linux/syscalls.h>
+#include <linux/seccomp.h>
+#include <linux/sched.h>
+#include <linux/audit.h>
+
+#include <asm/entry-common.h>
+
+/*
+ * Define dummy _TIF work flags if not defined by the architecture or for
+ * disabled functionality.
+ */
+#ifndef _TIF_SYSCALL_TRACE
+# define _TIF_SYSCALL_TRACE		(0)
+#endif
+
+#ifndef _TIF_SYSCALL_EMU
+# define _TIF_SYSCALL_EMU		(0)
+#endif
+
+#ifndef _TIF_SYSCALL_TRACEPOINT
+# define _TIF_SYSCALL_TRACEPOINT	(0)
+#endif
+
+#ifndef _TIF_SECCOMP
+# define _TIF_SECCOMP			(0)
+#endif
+
+#ifndef _TIF_AUDIT
+# define _TIF_AUDIT			(0)
+#endif
+
+/*
+ * TIF flags handled in syscall_enter_from_usermode()
+ */
+#ifndef ARCH_SYSCALL_ENTER_WORK
+# define ARCH_SYSCALL_ENTER_WORK	(0)
+#endif
+
+#define SYSCALL_ENTER_WORK						\
+	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | _TIF_SECCOMP |	\
+	 _TIF_SYSCALL_TRACEPOINT | _TIF_SYSCALL_EMU |			\
+	 ARCH_SYSCALL_ENTER_WORK)
+
+/**
+ * arch_syscall_enter_tracehook - Wrapper around tracehook_report_syscall_entry()
+ * @regs:	Pointer to current's pt_regs
+ *
+ * Returns: 0 on success or an error code to skip the syscall.
+ *
+ * Defaults to tracehook_report_syscall_entry(). Can be replaced by
+ * architecture specific code.
+ *
+ * Invoked from syscall_enter_from_usermode()
+ */
+static inline __must_check int arch_syscall_enter_tracehook(struct pt_regs *regs);
+
+#ifndef arch_syscall_enter_tracehook
+static inline __must_check int arch_syscall_enter_tracehook(struct pt_regs *regs)
+{
+	return tracehook_report_syscall_entry(regs);
+}
+#endif
+
+/**
+ * arch_syscall_enter_seccomp - Architecture specific seccomp invocation
+ * @regs:	Pointer to current's pt_regs
+ *
+ * Returns: The original or a modified syscall number
+ *
+ * Invoked from syscall_enter_from_usermode(). Can be replaced by
+ * architecture specific code.
+ */
+static inline long arch_syscall_enter_seccomp(struct pt_regs *regs);
+
+#ifndef arch_syscall_enter_seccomp
+static inline long arch_syscall_enter_seccomp(struct pt_regs *regs)
+{
+	return secure_computing(NULL);
+}
+#endif
+
+/**
+ * arch_syscall_enter_audit - Architecture specific audit invocation
+ * @regs:	Pointer to current's pt_regs
+ *
+ * Invoked from syscall_enter_from_usermode(). Must be replaced by
+ * architecture specific code if the architecture supports audit.
+ */
+static inline void arch_syscall_enter_audit(struct pt_regs *regs);
+
+#ifndef arch_syscall_enter_audit
+static inline void arch_syscall_enter_audit(struct pt_regs *regs) { }
+#endif
+
+/* Common syscall enter function */
+long core_syscall_enter_from_usermode(struct pt_regs *regs, long syscall);
+
+/**
+ * syscall_enter_from_usermode - Check and handle work before invoking
+ *				 a syscall
+ * @regs:	Pointer to current's pt_regs
+ * @syscall:	The syscall number
+ *
+ * Invoked from architecture specific syscall entry code with interrupts
+ * enabled.
+ *
+ * Returns: The original or a modified syscall number
+ *
+ * If the returned syscall number is -1 then the syscall should be
+ * skipped. In this case the caller may invoke syscall_set_error() or
+ * syscall_set_return_value() first.  If neither of those is called and -1
+ * is returned, then the syscall will fail with ENOSYS.
+ */
+static inline long syscall_enter_from_usermode(struct pt_regs *regs,
+					       long syscall)
+{
+	unsigned long ti_work = READ_ONCE(current_thread_info()->flags);
+
+	if (IS_ENABLED(CONFIG_DEBUG_ENTRY))
+		BUG_ON(regs != task_pt_regs(current));
+
+	if (ti_work & SYSCALL_ENTER_WORK)
+		syscall = core_syscall_enter_from_usermode(regs, syscall);
+	return syscall;
+}
+
+#endif
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -43,6 +43,7 @@ obj-y += irq/
 obj-y += rcu/
 obj-y += livepatch/
 obj-y += dma/
+obj-y += entry/
 
 obj-$(CONFIG_CHECKPOINT_RESTORE) += kcmp.o
 obj-$(CONFIG_FREEZER) += freezer.o
--- /dev/null
+++ b/kernel/entry/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_GENERIC_ENTRY) += common.o
--- /dev/null
+++ b/kernel/entry/common.c
@@ -0,0 +1,33 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/context_tracking.h>
+#include <linux/entry-common.h>
+
+#define CREATE_TRACE_POINTS
+#include <trace/events/syscalls.h>
+
+long core_syscall_enter_from_usermode(struct pt_regs *regs, long syscall)
+{
+	unsigned long ti_work = READ_ONCE(current_thread_info()->flags);
+	unsigned long ret = 0;
+
+	if (ti_work & (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_EMU)) {
+		ret = arch_syscall_enter_tracehook(regs);
+		if (ret || (ti_work & _TIF_SYSCALL_EMU))
+			return -1L;
+	}
+
+	/* Do seccomp after ptrace, to catch any tracer changes. */
+	if (ti_work & _TIF_SECCOMP) {
+		ret = arch_syscall_enter_seccomp(regs);
+		if (ret == -1L)
+			return ret;
+	}
+
+	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
+		trace_sys_enter(regs, syscall);
+
+	arch_syscall_enter_audit(regs);
+
+	return ret ? : syscall;
+}




* [patch V2 11/17] x86/entry: Use generic syscall entry function
  2019-10-23 12:27 [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Thomas Gleixner
                   ` (9 preceding siblings ...)
  2019-10-23 12:27 ` [patch V2 10/17] entry: Provide generic syscall entry functionality Thomas Gleixner
@ 2019-10-23 12:27 ` Thomas Gleixner
  2019-10-23 12:27 ` [patch V2 12/17] entry: Provide generic syscall exit function Thomas Gleixner
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 12:27 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes

From: Thomas Gleixner <tglx@linutronix.de>

Replace the syscall entry work handling with the generic version. Provide
the necessary helper inlines to handle the architecture specific parts,
e.g. audit and seccomp invocations.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/Kconfig                    |    1 
 arch/x86/entry/common.c             |  108 +++---------------------------------
 arch/x86/include/asm/entry-common.h |   59 +++++++++++++++++++
 arch/x86/include/asm/thread_info.h  |    5 -
 4 files changed, 70 insertions(+), 103 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -111,6 +111,7 @@ config X86
 	select GENERIC_CPU_AUTOPROBE
 	select GENERIC_CPU_VULNERABILITIES
 	select GENERIC_EARLY_IOREMAP
+	select GENERIC_ENTRY
 	select GENERIC_FIND_FIRST_BIT
 	select GENERIC_IOMAP
 	select GENERIC_IRQ_EFFECTIVE_AFF_MASK	if SMP
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -10,13 +10,13 @@
 #include <linux/kernel.h>
 #include <linux/sched.h>
 #include <linux/sched/task_stack.h>
+#include <linux/entry-common.h>
 #include <linux/mm.h>
 #include <linux/smp.h>
 #include <linux/errno.h>
 #include <linux/ptrace.h>
 #include <linux/tracehook.h>
 #include <linux/audit.h>
-#include <linux/seccomp.h>
 #include <linux/signal.h>
 #include <linux/export.h>
 #include <linux/context_tracking.h>
@@ -34,7 +34,6 @@
 #include <asm/fpu/api.h>
 #include <asm/nospec-branch.h>
 
-#define CREATE_TRACE_POINTS
 #include <trace/events/syscalls.h>
 
 #ifdef CONFIG_CONTEXT_TRACKING
@@ -48,86 +47,6 @@
 static inline void enter_from_user_mode(void) {}
 #endif
 
-static void do_audit_syscall_entry(struct pt_regs *regs, u32 arch)
-{
-#ifdef CONFIG_X86_64
-	if (arch == AUDIT_ARCH_X86_64) {
-		audit_syscall_entry(regs->orig_ax, regs->di,
-				    regs->si, regs->dx, regs->r10);
-	} else
-#endif
-	{
-		audit_syscall_entry(regs->orig_ax, regs->bx,
-				    regs->cx, regs->dx, regs->si);
-	}
-}
-
-/*
- * Returns the syscall nr to run (which should match regs->orig_ax) or -1
- * to skip the syscall.
- */
-static long syscall_trace_enter(struct pt_regs *regs)
-{
-	u32 arch = in_ia32_syscall() ? AUDIT_ARCH_I386 : AUDIT_ARCH_X86_64;
-
-	struct thread_info *ti = current_thread_info();
-	unsigned long ret = 0;
-	u32 work;
-
-	if (IS_ENABLED(CONFIG_DEBUG_ENTRY))
-		BUG_ON(regs != task_pt_regs(current));
-
-	work = READ_ONCE(ti->flags);
-
-	if (work & (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_EMU)) {
-		ret = tracehook_report_syscall_entry(regs);
-		if (ret || (work & _TIF_SYSCALL_EMU))
-			return -1L;
-	}
-
-#ifdef CONFIG_SECCOMP
-	/*
-	 * Do seccomp after ptrace, to catch any tracer changes.
-	 */
-	if (work & _TIF_SECCOMP) {
-		struct seccomp_data sd;
-
-		sd.arch = arch;
-		sd.nr = regs->orig_ax;
-		sd.instruction_pointer = regs->ip;
-#ifdef CONFIG_X86_64
-		if (arch == AUDIT_ARCH_X86_64) {
-			sd.args[0] = regs->di;
-			sd.args[1] = regs->si;
-			sd.args[2] = regs->dx;
-			sd.args[3] = regs->r10;
-			sd.args[4] = regs->r8;
-			sd.args[5] = regs->r9;
-		} else
-#endif
-		{
-			sd.args[0] = regs->bx;
-			sd.args[1] = regs->cx;
-			sd.args[2] = regs->dx;
-			sd.args[3] = regs->si;
-			sd.args[4] = regs->di;
-			sd.args[5] = regs->bp;
-		}
-
-		ret = __secure_computing(&sd);
-		if (ret == -1)
-			return ret;
-	}
-#endif
-
-	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
-		trace_sys_enter(regs, regs->orig_ax);
-
-	do_audit_syscall_entry(regs, arch);
-
-	return ret ?: regs->orig_ax;
-}
-
 #define EXIT_TO_USERMODE_LOOP_FLAGS				\
 	(_TIF_SIGPENDING | _TIF_NOTIFY_RESUME | _TIF_UPROBE |	\
 	 _TIF_NEED_RESCHED | _TIF_USER_RETURN_NOTIFY | _TIF_PATCH_PENDING)
@@ -280,16 +199,13 @@ static void syscall_slow_exit_work(struc
 #ifdef CONFIG_X86_64
 __visible void do_syscall_64(unsigned long nr, struct pt_regs *regs)
 {
-	struct thread_info *ti;
-
 	/* User to kernel transition disabled interrupts. */
 	trace_hardirqs_off();
 
 	enter_from_user_mode();
 	local_irq_enable();
-	ti = current_thread_info();
-	if (READ_ONCE(ti->flags) & _TIF_WORK_SYSCALL_ENTRY)
-		nr = syscall_trace_enter(regs);
+
+	nr = syscall_enter_from_usermode(regs, nr);
 
 	if (likely(nr < NR_syscalls)) {
 		nr = array_index_nospec(nr, NR_syscalls);
@@ -316,22 +232,18 @@ static void syscall_slow_exit_work(struc
  */
 static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs)
 {
-	struct thread_info *ti = current_thread_info();
 	unsigned int nr = (unsigned int)regs->orig_ax;
 
 #ifdef CONFIG_IA32_EMULATION
-	ti->status |= TS_COMPAT;
+	current_thread_info()->status |= TS_COMPAT;
 #endif
 
-	if (READ_ONCE(ti->flags) & _TIF_WORK_SYSCALL_ENTRY) {
-		/*
-		 * Subtlety here: if ptrace pokes something larger than
-		 * 2^32-1 into orig_ax, this truncates it.  This may or
-		 * may not be necessary, but it matches the old asm
-		 * behavior.
-		 */
-		nr = syscall_trace_enter(regs);
-	}
+	/*
+	 * Subtlety here: if ptrace pokes something larger than 2^32-1 into
+	 * orig_ax, this truncates it.  This may or may not be necessary,
+	 * but it matches the old asm behavior.
+	 */
+	nr = syscall_enter_from_usermode(regs, nr);
 
 	if (likely(nr < IA32_NR_syscalls)) {
 		nr = array_index_nospec(nr, IA32_NR_syscalls);
--- /dev/null
+++ b/arch/x86/include/asm/entry-common.h
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef _ASM_X86_ENTRY_COMMON_H
+#define _ASM_X86_ENTRY_COMMON_H
+
+#include <linux/seccomp.h>
+#include <linux/audit.h>
+
+static inline long arch_syscall_enter_seccomp(struct pt_regs *regs)
+{
+#ifdef CONFIG_SECCOMP
+	u32 arch = in_ia32_syscall() ? AUDIT_ARCH_I386 : AUDIT_ARCH_X86_64;
+	struct seccomp_data sd;
+
+	sd.arch = arch;
+	sd.nr = regs->orig_ax;
+	sd.instruction_pointer = regs->ip;
+
+#ifdef CONFIG_X86_64
+	if (arch == AUDIT_ARCH_X86_64) {
+		sd.args[0] = regs->di;
+		sd.args[1] = regs->si;
+		sd.args[2] = regs->dx;
+		sd.args[3] = regs->r10;
+		sd.args[4] = regs->r8;
+		sd.args[5] = regs->r9;
+	} else
+#endif
+	{
+		sd.args[0] = regs->bx;
+		sd.args[1] = regs->cx;
+		sd.args[2] = regs->dx;
+		sd.args[3] = regs->si;
+		sd.args[4] = regs->di;
+		sd.args[5] = regs->bp;
+	}
+
+	return __secure_computing(&sd);
+#else
+	return 0;
+#endif
+}
+#define arch_syscall_enter_seccomp arch_syscall_enter_seccomp
+
+static inline void arch_syscall_enter_audit(struct pt_regs *regs)
+{
+#ifdef CONFIG_X86_64
+	if (in_ia32_syscall()) {
+		audit_syscall_entry(regs->orig_ax, regs->di,
+				    regs->si, regs->dx, regs->r10);
+	} else
+#endif
+	{
+		audit_syscall_entry(regs->orig_ax, regs->bx,
+				    regs->cx, regs->dx, regs->si);
+	}
+}
+#define arch_syscall_enter_audit arch_syscall_enter_audit
+
+#endif
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -133,11 +133,6 @@ struct thread_info {
 #define _TIF_X32		(1 << TIF_X32)
 #define _TIF_FSCHECK		(1 << TIF_FSCHECK)
 
-/* Work to do before invoking the actual syscall. */
-#define _TIF_WORK_SYSCALL_ENTRY	\
-	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_EMU | _TIF_SYSCALL_AUDIT |	\
-	 _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT)
-
 /* flags to check in __switch_to() */
 #define _TIF_WORK_CTXSW_BASE						\
 	(_TIF_IO_BITMAP|_TIF_NOCPUID|_TIF_NOTSC|_TIF_BLOCKSTEP|		\




* [patch V2 12/17] entry: Provide generic syscall exit function
  2019-10-23 12:27 [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Thomas Gleixner
                   ` (10 preceding siblings ...)
  2019-10-23 12:27 ` [patch V2 11/17] x86/entry: Use generic syscall entry function Thomas Gleixner
@ 2019-10-23 12:27 ` Thomas Gleixner
  2019-10-23 12:27 ` [patch V2 13/17] x86/entry: Use generic syscall exit functionality Thomas Gleixner
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 12:27 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes

From: Thomas Gleixner <tglx@linutronix.de>

Like syscall entry, all architectures have similar yet pointlessly different
code to handle pending work before returning from a syscall to user space.

Provide a generic version.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/entry-common.h |   31 ++++++++++++++++++++++++
 kernel/entry/common.c        |   55 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 86 insertions(+)

--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -46,6 +46,17 @@
 	 _TIF_SYSCALL_TRACEPOINT | _TIF_SYSCALL_EMU |			\
 	 ARCH_SYSCALL_ENTER_WORK)
 
+/*
+ * TIF flags handled in syscall_exit_to_usermode()
+ */
+#ifndef ARCH_SYSCALL_EXIT_WORK
+# define ARCH_SYSCALL_EXIT_WORK		(0)
+#endif
+
+#define SYSCALL_EXIT_WORK						\
+	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT |			\
+	 _TIF_SYSCALL_TRACEPOINT | ARCH_SYSCALL_EXIT_WORK)
+
 /**
  * arch_syscall_enter_tracehook - Wrapper around tracehook_report_syscall_entry()
  * @regs:	Pointer to current's pt_regs
@@ -129,4 +140,24 @@ static inline long syscall_enter_from_us
 	return syscall;
 }
 
+/**
+ * arch_syscall_exit_tracehook - Wrapper around tracehook_report_syscall_exit()
+ *
+ * Defaults to tracehook_report_syscall_exit(). Can be replaced by
+ * architecture specific code.
+ *
+ * Invoked from syscall_exit_to_usermode()
+ */
+static inline void arch_syscall_exit_tracehook(struct pt_regs *regs, bool step);
+
+#ifndef arch_syscall_exit_tracehook
+static inline void arch_syscall_exit_tracehook(struct pt_regs *regs, bool step)
+{
+	tracehook_report_syscall_exit(regs, step);
+}
+#endif
+
+/* Common syscall exit function */
+void syscall_exit_to_usermode(struct pt_regs *regs, long syscall, long retval);
+
 #endif
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -31,3 +31,58 @@ long core_syscall_enter_from_usermode(st
 
 	return ret ? : syscall;
 }
+
+#ifndef _TIF_SINGLESTEP
+static inline bool report_single_step(unsigned long ti_work)
+{
+	return false;
+}
+#else
+/*
+ * If TIF_SYSCALL_EMU is set, then the only reason to report is when
+ * TIF_SINGLESTEP is set (i.e. PTRACE_SYSEMU_SINGLESTEP).  This syscall
+ * instruction has already been reported in syscall_enter_from_usermode().
+ */
+#define SYSEMU_STEP	(_TIF_SINGLESTEP | _TIF_SYSCALL_EMU)
+
+static inline bool report_single_step(unsigned long ti_work)
+{
+	return (ti_work & SYSEMU_STEP) == _TIF_SINGLESTEP;
+}
+#endif
+
+static void syscall_exit_work(struct pt_regs *regs, long retval,
+			      unsigned long ti_work)
+{
+	bool step;
+
+	audit_syscall_exit(regs);
+
+	if (ti_work & _TIF_SYSCALL_TRACEPOINT)
+		trace_sys_exit(regs, retval);
+
+	step = report_single_step(ti_work);
+	if (step || ti_work & _TIF_SYSCALL_TRACE)
+		arch_syscall_exit_tracehook(regs, step);
+}
+
+void syscall_exit_to_usermode(struct pt_regs *regs, long syscall, long retval)
+{
+	unsigned long ti_work;
+
+	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
+
+	if (IS_ENABLED(CONFIG_PROVE_LOCKING) &&
+	    WARN(irqs_disabled(), "syscall %ld left IRQs disabled", syscall))
+		local_irq_enable();
+
+	rseq_syscall(regs);
+
+	/*
+	 * Handle work which needs to run exactly once per syscall exit
+	 * with interrupts enabled.
+	 */
+	ti_work = READ_ONCE(current_thread_info()->flags);
+	if (unlikely(ti_work & SYSCALL_EXIT_WORK))
+		syscall_exit_work(regs, retval, ti_work);
+}




* [patch V2 13/17] x86/entry: Use generic syscall exit functionality
  2019-10-23 12:27 [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Thomas Gleixner
                   ` (11 preceding siblings ...)
  2019-10-23 12:27 ` [patch V2 12/17] entry: Provide generic syscall exit function Thomas Gleixner
@ 2019-10-23 12:27 ` Thomas Gleixner
  2019-10-23 12:27 ` [patch V2 14/17] entry: Provide generic exit to usermode functionality Thomas Gleixner
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 12:27 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes

Replace the x86 variant with the generic version.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/entry/common.c             |   44 ------------------------------------
 arch/x86/include/asm/entry-common.h |    2 +
 2 files changed, 3 insertions(+), 43 deletions(-)

--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -142,55 +142,13 @@ static void exit_to_usermode_loop(struct
 	trace_hardirqs_on();
 }
 
-#define SYSCALL_EXIT_WORK_FLAGS				\
-	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT |	\
-	 _TIF_SINGLESTEP | _TIF_SYSCALL_TRACEPOINT)
-
-static void syscall_slow_exit_work(struct pt_regs *regs, u32 cached_flags)
-{
-	bool step;
-
-	audit_syscall_exit(regs);
-
-	if (cached_flags & _TIF_SYSCALL_TRACEPOINT)
-		trace_sys_exit(regs, regs->ax);
-
-	/*
-	 * If TIF_SYSCALL_EMU is set, we only get here because of
-	 * TIF_SINGLESTEP (i.e. this is PTRACE_SYSEMU_SINGLESTEP).
-	 * We already reported this syscall instruction in
-	 * syscall_trace_enter().
-	 */
-	step = unlikely(
-		(cached_flags & (_TIF_SINGLESTEP | _TIF_SYSCALL_EMU))
-		== _TIF_SINGLESTEP);
-	if (step || cached_flags & _TIF_SYSCALL_TRACE)
-		tracehook_report_syscall_exit(regs, step);
-}
-
 /*
  * Called with IRQs on and fully valid regs.  Returns with IRQs off in a
  * state such that we can immediately switch to user mode.
  */
 __visible inline void syscall_return_slowpath(struct pt_regs *regs)
 {
-	struct thread_info *ti = current_thread_info();
-	u32 cached_flags = READ_ONCE(ti->flags);
-
-	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
-
-	if (IS_ENABLED(CONFIG_PROVE_LOCKING) &&
-	    WARN(irqs_disabled(), "syscall %ld left IRQs disabled", regs->orig_ax))
-		local_irq_enable();
-
-	rseq_syscall(regs);
-
-	/*
-	 * First do one-time work.  If these work items are enabled, we
-	 * want to run them exactly once per syscall exit with IRQs on.
-	 */
-	if (unlikely(cached_flags & SYSCALL_EXIT_WORK_FLAGS))
-		syscall_slow_exit_work(regs, cached_flags);
+	syscall_exit_to_usermode(regs, regs->orig_ax, regs->ax);
 
 	local_irq_disable();
 	prepare_exit_to_usermode(regs);
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -5,6 +5,8 @@
 #include <linux/seccomp.h>
 #include <linux/audit.h>
 
+#define ARCH_SYSCALL_EXIT_WORK		(_TIF_SINGLESTEP)
+
 static inline long arch_syscall_enter_seccomp(struct pt_regs *regs)
 {
 #ifdef CONFIG_SECCOMP




* [patch V2 14/17] entry: Provide generic exit to usermode functionality
  2019-10-23 12:27 [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Thomas Gleixner
                   ` (12 preceding siblings ...)
  2019-10-23 12:27 ` [patch V2 13/17] x86/entry: Use generic syscall exit functionality Thomas Gleixner
@ 2019-10-23 12:27 ` Thomas Gleixner
  2019-10-23 21:34   ` Andy Lutomirski
  2019-10-23 12:27 ` [patch V2 15/17] x86/entry: Use generic exit to usermode Thomas Gleixner
                   ` (5 subsequent siblings)
  19 siblings, 1 reply; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 12:27 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes

From: Thomas Gleixner <tglx@linutronix.de>

Provide a generic facility to handle the exit to usermode work. It aims to
replace the pointlessly different copies in each architecture.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: Move lockdep and address limit check right to the end of the return
    sequence. (PeterZ)
---
 include/linux/entry-common.h |  105 +++++++++++++++++++++++++++++++++++++++++++
 kernel/entry/common.c        |   82 +++++++++++++++++++++++++++++++++
 2 files changed, 187 insertions(+)

--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -34,6 +34,30 @@
 # define _TIF_AUDIT			(0)
 #endif
 
+#ifndef _TIF_UPROBE
+# define _TIF_UPROBE			(0)
+#endif
+
+#ifndef _TIF_PATCH_PENDING
+# define _TIF_PATCH_PENDING		(0)
+#endif
+
+#ifndef _TIF_NOTIFY_RESUME
+# define _TIF_NOTIFY_RESUME		(0)
+#endif
+
+/*
+ * TIF flags handled in exit_to_usermode()
+ */
+#ifndef ARCH_EXIT_TO_USERMODE_WORK
+# define ARCH_EXIT_TO_USERMODE_WORK	(0)
+#endif
+
+#define EXIT_TO_USERMODE_WORK						\
+	(_TIF_SIGPENDING | _TIF_NOTIFY_RESUME | _TIF_UPROBE |		\
+	 _TIF_NEED_RESCHED | _TIF_PATCH_PENDING |			\
+	 ARCH_EXIT_TO_USERMODE_WORK)
+
 /*
  * TIF flags handled in syscall_enter_from_usermode()
  */
@@ -58,6 +82,87 @@
 	 _TIF_SYSCALL_TRACEPOINT | ARCH_SYSCALL_EXIT_WORK)
 
 /**
+ * local_irq_enable_exit_to_user - Exit to user variant of local_irq_enable()
+ * @ti_work:	Cached TIF flags gathered with interrupts disabled
+ *
+ * Defaults to local_irq_enable(). Can be supplied by architecture specific
+ * code.
+ */
+static inline void local_irq_enable_exit_to_user(unsigned long ti_work);
+
+#ifndef local_irq_enable_exit_to_user
+static inline void local_irq_enable_exit_to_user(unsigned long ti_work)
+{
+	local_irq_enable();
+}
+#endif
+
+/**
+ * local_irq_disable_exit_to_user - Exit to user variant of local_irq_disable()
+ *
+ * Defaults to local_irq_disable(). Can be supplied by architecture specific
+ * code.
+ */
+static inline void local_irq_disable_exit_to_user(void);
+
+#ifndef local_irq_disable_exit_to_user
+static inline void local_irq_disable_exit_to_user(void)
+{
+	local_irq_disable();
+}
+#endif
+
+/**
+ * arch_exit_to_usermode_work - Architecture specific TIF work for
+ *				exit to user mode.
+ * @regs:	Pointer to current's pt_regs
+ * @ti_work:	Cached TIF flags gathered with interrupts disabled
+ *
+ * Invoked from exit_to_usermode() with interrupts disabled
+ *
+ * Defaults to NOOP. Can be supplied by architecture specific code.
+ */
+static inline void arch_exit_to_usermode_work(struct pt_regs *regs,
+					      unsigned long ti_work);
+
+#ifndef arch_exit_to_usermode_work
+static inline void arch_exit_to_usermode_work(struct pt_regs *regs,
+					      unsigned long ti_work)
+{
+}
+#endif
+
+/**
+ * arch_exit_to_usermode - Architecture specific preparation for
+ *			   exit to user mode.
+ * @regs:	Pointer to current's pt_regs
+ * @ti_work:	Cached TIF flags gathered with interrupts disabled
+ *
+ * Invoked from exit_to_usermode() with interrupts disabled as the last
+ * function before return.
+ */
+static inline void arch_exit_to_usermode(struct pt_regs *regs,
+					 unsigned long ti_work);
+
+#ifndef arch_exit_to_usermode
+static inline void arch_exit_to_usermode(struct pt_regs *regs,
+					 unsigned long ti_work)
+{
+}
+#endif
+
+/* Common exit to usermode function to handle TIF work */
+asmlinkage __visible void exit_to_usermode(struct pt_regs *regs);
+
+/**
+ * arch_do_signal - Architecture specific signal delivery function
+ * @regs:	Pointer to current's pt_regs
+ *
+ * Invoked from exit_to_usermode()
+ */
+void arch_do_signal(struct pt_regs *regs);
+
+/**
  * arch_syscall_enter_tracehook - Wrapper around tracehook_report_syscall_entry()
  * @regs:	Pointer to current's pt_regs
  *
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -2,10 +2,86 @@
 
 #include <linux/context_tracking.h>
 #include <linux/entry-common.h>
+#include <linux/livepatch.h>
+#include <linux/uprobes.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/syscalls.h>
 
+static unsigned long core_exit_to_usermode_work(struct pt_regs *regs,
+						unsigned long ti_work)
+{
+	/*
+	 * Before returning to user space ensure that all pending work
+	 * items have been completed.
+	 */
+	while (ti_work & EXIT_TO_USERMODE_WORK) {
+
+		local_irq_enable_exit_to_user(ti_work);
+
+		if (ti_work & _TIF_NEED_RESCHED)
+			schedule();
+
+		if (ti_work & _TIF_UPROBE)
+			uprobe_notify_resume(regs);
+
+		if (ti_work & _TIF_PATCH_PENDING)
+			klp_update_patch_state(current);
+
+		if (ti_work & _TIF_SIGPENDING)
+			arch_do_signal(regs);
+
+		if (ti_work & _TIF_NOTIFY_RESUME) {
+			clear_thread_flag(TIF_NOTIFY_RESUME);
+			tracehook_notify_resume(regs);
+			rseq_handle_notify_resume(NULL, regs);
+		}
+
+		/* Architecture specific TIF work */
+		arch_exit_to_usermode_work(regs, ti_work);
+
+		/*
+		 * Disable interrupts and reevaluate the work flags as they
+		 * might have changed while interrupts and preemption was
+		 * enabled above.
+		 */
+		local_irq_disable_exit_to_user();
+		ti_work = READ_ONCE(current_thread_info()->flags);
+	}
+	return ti_work;
+}
+
+static void do_exit_to_usermode(struct pt_regs *regs)
+{
+	unsigned long ti_work = READ_ONCE(current_thread_info()->flags);
+
+	if (unlikely(ti_work & EXIT_TO_USERMODE_WORK))
+		ti_work = core_exit_to_usermode_work(regs, ti_work);
+
+	arch_exit_to_usermode(regs, ti_work);
+
+	/* Ensure no locks are held and the address limit is intact */
+	lockdep_sys_exit();
+	addr_limit_user_check();
+
+	/* Return to userspace right after this which turns on interrupts */
+	trace_hardirqs_on();
+}
+
+/**
+ * exit_to_usermode - Check and handle pending work before returning to
+ *		      user mode
+ * @regs:	Pointer to current's pt_regs
+ *
+ * Called and returns with interrupts disabled
+ */
+asmlinkage __visible void exit_to_usermode(struct pt_regs *regs)
+{
+	trace_hardirqs_off();
+	lockdep_assert_irqs_disabled();
+	do_exit_to_usermode(regs);
+}
+
 long core_syscall_enter_from_usermode(struct pt_regs *regs, long syscall)
 {
 	unsigned long ti_work = READ_ONCE(current_thread_info()->flags);
@@ -85,4 +161,10 @@ void syscall_exit_to_usermode(struct pt_
 	ti_work = READ_ONCE(current_thread_info()->flags);
 	if (unlikely(ti_work & SYSCALL_EXIT_WORK))
 		syscall_exit_work(regs, retval, ti_work);
+
+	/*
+	 * Disable interrupts and handle the regular exit to user mode work
+	 */
+	local_irq_disable_exit_to_user();
+	do_exit_to_usermode(regs);
 }




* [patch V2 15/17] x86/entry: Use generic exit to usermode
  2019-10-23 12:27 [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Thomas Gleixner
                   ` (13 preceding siblings ...)
  2019-10-23 12:27 ` [patch V2 14/17] entry: Provide generic exit to usermode functionality Thomas Gleixner
@ 2019-10-23 12:27 ` Thomas Gleixner
  2019-10-23 12:27 ` [patch V2 16/17] kvm/workpending: Provide infrastructure for work before entering a guest Thomas Gleixner
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 12:27 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes

From: Thomas Gleixner <tglx@linutronix.de>

Replace the x86 specific exit to usermode code with the generic
implementation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/entry/common.c             |  110 ------------------------------------
 arch/x86/entry/entry_32.S           |    2 
 arch/x86/entry/entry_64.S           |    2 
 arch/x86/include/asm/entry-common.h |   47 ++++++++++++++-
 arch/x86/include/asm/signal.h       |    1 
 arch/x86/kernel/signal.c            |    2 
 6 files changed, 51 insertions(+), 113 deletions(-)

--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -15,15 +15,9 @@
 #include <linux/smp.h>
 #include <linux/errno.h>
 #include <linux/ptrace.h>
-#include <linux/tracehook.h>
-#include <linux/audit.h>
 #include <linux/signal.h>
 #include <linux/export.h>
-#include <linux/context_tracking.h>
-#include <linux/user-return-notifier.h>
 #include <linux/nospec.h>
-#include <linux/uprobes.h>
-#include <linux/livepatch.h>
 #include <linux/syscalls.h>
 #include <linux/uaccess.h>
 
@@ -47,101 +41,6 @@
 static inline void enter_from_user_mode(void) {}
 #endif
 
-#define EXIT_TO_USERMODE_LOOP_FLAGS				\
-	(_TIF_SIGPENDING | _TIF_NOTIFY_RESUME | _TIF_UPROBE |	\
-	 _TIF_NEED_RESCHED | _TIF_USER_RETURN_NOTIFY | _TIF_PATCH_PENDING)
-
-static void exit_to_usermode_loop(struct pt_regs *regs, u32 cached_flags)
-{
-	/*
-	 * In order to return to user mode, we need to have IRQs off with
-	 * none of EXIT_TO_USERMODE_LOOP_FLAGS set.  Several of these flags
-	 * can be set at any time on preemptible kernels if we have IRQs on,
-	 * so we need to loop.  Disabling preemption wouldn't help: doing the
-	 * work to clear some of the flags can sleep.
-	 */
-	while (true) {
-		/* We have work to do. */
-		local_irq_enable();
-
-		if (cached_flags & _TIF_NEED_RESCHED)
-			schedule();
-
-		if (cached_flags & _TIF_UPROBE)
-			uprobe_notify_resume(regs);
-
-		if (cached_flags & _TIF_PATCH_PENDING)
-			klp_update_patch_state(current);
-
-		/* deal with pending signal delivery */
-		if (cached_flags & _TIF_SIGPENDING)
-			do_signal(regs);
-
-		if (cached_flags & _TIF_NOTIFY_RESUME) {
-			clear_thread_flag(TIF_NOTIFY_RESUME);
-			tracehook_notify_resume(regs);
-			rseq_handle_notify_resume(NULL, regs);
-		}
-
-		if (cached_flags & _TIF_USER_RETURN_NOTIFY)
-			fire_user_return_notifiers();
-
-		/* Disable IRQs and retry */
-		local_irq_disable();
-
-		cached_flags = READ_ONCE(current_thread_info()->flags);
-
-		if (!(cached_flags & EXIT_TO_USERMODE_LOOP_FLAGS))
-			break;
-	}
-}
-
-/* Called with IRQs disabled. */
-__visible inline void prepare_exit_to_usermode(struct pt_regs *regs)
-{
-	struct thread_info *ti = current_thread_info();
-	u32 cached_flags;
-
-	addr_limit_user_check();
-
-	lockdep_assert_irqs_disabled();
-	lockdep_sys_exit();
-
-	cached_flags = READ_ONCE(ti->flags);
-
-	if (unlikely(cached_flags & EXIT_TO_USERMODE_LOOP_FLAGS))
-		exit_to_usermode_loop(regs, cached_flags);
-
-	/* Reload ti->flags; we may have rescheduled above. */
-	cached_flags = READ_ONCE(ti->flags);
-
-	fpregs_assert_state_consistent();
-	if (unlikely(cached_flags & _TIF_NEED_FPU_LOAD))
-		switch_fpu_return();
-
-#ifdef CONFIG_COMPAT
-	/*
-	 * Compat syscalls set TS_COMPAT.  Make sure we clear it before
-	 * returning to user mode.  We need to clear it *after* signal
-	 * handling, because syscall restart has a fixup for compat
-	 * syscalls.  The fixup is exercised by the ptrace_syscall_32
-	 * selftest.
-	 *
-	 * We also need to clear TS_REGS_POKED_I386: the 32-bit tracer
-	 * special case only applies after poking regs and before the
-	 * very next return to user mode.
-	 */
-	ti->status &= ~(TS_COMPAT|TS_I386_REGS_POKED);
-#endif
-
-	user_enter_irqoff();
-
-	mds_user_clear_cpu_buffers();
-
-	/* The return to usermode reenables interrupts. Tell the tracer */
-	trace_hardirqs_on();
-}
-
 /*
  * Called with IRQs on and fully valid regs.  Returns with IRQs off in a
  * state such that we can immediately switch to user mode.
@@ -149,9 +48,6 @@ static void exit_to_usermode_loop(struct
 __visible inline void syscall_return_slowpath(struct pt_regs *regs)
 {
 	syscall_exit_to_usermode(regs, regs->orig_ax, regs->ax);
-
-	local_irq_disable();
-	prepare_exit_to_usermode(regs);
 }
 
 #ifdef CONFIG_X86_64
@@ -179,7 +75,7 @@ static void exit_to_usermode_loop(struct
 #endif
 	}
 
-	syscall_return_slowpath(regs);
+	syscall_exit_to_usermode(regs, regs->orig_ax, regs->ax);
 }
 #endif
 
@@ -223,7 +119,7 @@ static __always_inline void do_syscall_3
 #endif /* CONFIG_IA32_EMULATION */
 	}
 
-	syscall_return_slowpath(regs);
+	syscall_exit_to_usermode(regs, regs->orig_ax, regs->ax);
 }
 
 /* Handles int $0x80 */
@@ -278,7 +174,7 @@ static __always_inline void do_syscall_3
 		/* User code screwed up. */
 		local_irq_disable();
 		regs->ax = -EFAULT;
-		prepare_exit_to_usermode(regs);
+		exit_to_usermode(regs);
 		return 0;	/* Keep it simple: use IRET. */
 	}
 
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -819,7 +819,7 @@ END(ret_from_fork)
 	jb	restore_all_kernel		# not returning to v8086 or userspace
 
 	movl	%esp, %eax
-	call	prepare_exit_to_usermode
+	call	exit_to_usermode
 	jmp	restore_all
 END(ret_from_exception)
 
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -600,7 +600,7 @@ END(common_spurious)
 	/* Interrupt came from user space */
 GLOBAL(retint_user)
 	mov	%rsp,%rdi
-	call	prepare_exit_to_usermode
+	call	exit_to_usermode
 
 GLOBAL(swapgs_restore_regs_and_return_to_usermode)
 #ifdef CONFIG_DEBUG_ENTRY
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -2,11 +2,54 @@
 #ifndef _ASM_X86_ENTRY_COMMON_H
 #define _ASM_X86_ENTRY_COMMON_H
 
-#include <linux/seccomp.h>
-#include <linux/audit.h>
+#include <linux/user-return-notifier.h>
+#include <linux/context_tracking.h>
+
+#include <asm/nospec-branch.h>
+#include <asm/fpu/api.h>
 
 #define ARCH_SYSCALL_EXIT_WORK		(_TIF_SINGLESTEP)
 
+#define ARCH_EXIT_TO_USERMODE_WORK	(_TIF_USER_RETURN_NOTIFY)
+
+#define ARCH_EXIT_TO_USER_FROM_SYSCALL_EXIT
+
+static inline void arch_exit_to_usermode_work(struct pt_regs *regs,
+					      unsigned long ti_work)
+{
+	if (ti_work & _TIF_USER_RETURN_NOTIFY)
+		fire_user_return_notifiers();
+}
+#define arch_exit_to_usermode_work arch_exit_to_usermode_work
+
+static inline void arch_exit_to_usermode(struct pt_regs *regs,
+					 unsigned long ti_work)
+{
+	fpregs_assert_state_consistent();
+	if (unlikely(ti_work & _TIF_NEED_FPU_LOAD))
+		switch_fpu_return();
+
+#ifdef CONFIG_COMPAT
+	/*
+	 * Compat syscalls set TS_COMPAT.  Make sure we clear it before
+	 * returning to user mode.  We need to clear it *after* signal
+	 * handling, because syscall restart has a fixup for compat
+	 * syscalls.  The fixup is exercised by the ptrace_syscall_32
+	 * selftest.
+	 *
+	 * We also need to clear TS_REGS_POKED_I386: the 32-bit tracer
+	 * special case only applies after poking regs and before the
+	 * very next return to user mode.
+	 */
+	current_thread_info()->status &= ~(TS_COMPAT | TS_I386_REGS_POKED);
+#endif
+
+	user_enter_irqoff();
+
+	mds_user_clear_cpu_buffers();
+}
+#define arch_exit_to_usermode arch_exit_to_usermode
+
 static inline long arch_syscall_enter_seccomp(struct pt_regs *regs)
 {
 #ifdef CONFIG_SECCOMP
--- a/arch/x86/include/asm/signal.h
+++ b/arch/x86/include/asm/signal.h
@@ -35,7 +35,6 @@ typedef sigset_t compat_sigset_t;
 #endif /* __ASSEMBLY__ */
 #include <uapi/asm/signal.h>
 #ifndef __ASSEMBLY__
-extern void do_signal(struct pt_regs *regs);
 
 #define __ARCH_HAS_SA_RESTORER
 
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -808,7 +808,7 @@ static inline unsigned long get_nr_resta
  * want to handle. Thus you cannot kill init even with a SIGKILL even by
  * mistake.
  */
-void do_signal(struct pt_regs *regs)
+void arch_do_signal(struct pt_regs *regs)
 {
 	struct ksignal ksig;
 



^ permalink raw reply	[flat|nested] 64+ messages in thread

* [patch V2 16/17] kvm/workpending: Provide infrastructure for work before entering a guest
  2019-10-23 12:27 [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Thomas Gleixner
                   ` (14 preceding siblings ...)
  2019-10-23 12:27 ` [patch V2 15/17] x86/entry: Use generic exit to usermode Thomas Gleixner
@ 2019-10-23 12:27 ` Thomas Gleixner
  2019-10-23 14:55   ` Sean Christopherson
  2019-10-23 12:27 ` [patch V2 17/17] x86/kvm: Use generic exit to guest work function Thomas Gleixner
                   ` (3 subsequent siblings)
  19 siblings, 1 reply; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 12:27 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes

Entering a guest is similar to exiting to user space. Pending work like
handling signals, rescheduling, task work etc. needs to be handled before
that.

Provide generic infrastructure to avoid duplication of the same handling code
all over the place.

The kvm_exit code is split up into a KVM specific part and a generic
builtin core part to avoid multiple exports for the actual work
functions. The exit to guest mode handling is slightly different from the
exit to usermode handling, e.g. vs. rseq, so a separate function is used.
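As a rough userspace sketch of the dispatch pattern this introduces — snapshot the TIF flags once, let a pending signal force an -EINTR exit, and dispatch the remaining work bits — with illustrative flag values and stubbed-out work calls (not the kernel's definitions):

```c
#include <assert.h>
#include <errno.h>

/* Illustrative stand-ins for the kernel's _TIF_* bits; values are arbitrary. */
#define TIF_SIGPENDING    (1UL << 0)
#define TIF_NEED_RESCHED  (1UL << 1)
#define TIF_NOTIFY_RESUME (1UL << 2)

#define EXIT_TO_GUESTMODE_WORK \
	(TIF_SIGPENDING | TIF_NEED_RESCHED | TIF_NOTIFY_RESUME)

/*
 * Mimics the shape of exit_to_guestmode(): a pending signal aborts the
 * guest entry with -EINTR; other work bits are handled and entry proceeds.
 * 'handled' counts the stubbed work calls standing in for schedule() and
 * tracehook_notify_resume().
 */
static int exit_to_guestmode_sketch(unsigned long ti_work, int *handled)
{
	if (ti_work & EXIT_TO_GUESTMODE_WORK) {
		if (ti_work & TIF_SIGPENDING)
			return -EINTR;
		if (ti_work & TIF_NEED_RESCHED)
			(*handled)++;	/* kernel: schedule() */
		if (ti_work & TIF_NOTIFY_RESUME)
			(*handled)++;	/* kernel: tracehook_notify_resume() */
	}
	return 0;
}
```

In the real patch the non-signal work is further split between the builtin core_exit_to_guestmode_work() and the arch_exit_to_guestmode_work() hook, so only one symbol needs exporting to the KVM module.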

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: Moved KVM specific functions to kvm (Paolo)
    Added lockdep assert (Andy)
    Dropped live patching from enter guest mode work (Miroslav)
---
 include/linux/entry-common.h |   12 ++++++++
 include/linux/kvm_host.h     |   64 +++++++++++++++++++++++++++++++++++++++++++
 kernel/entry/common.c        |   14 +++++++++
 virt/kvm/Kconfig             |    3 ++
 4 files changed, 93 insertions(+)

--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -265,4 +265,16 @@ static inline void arch_syscall_exit_tra
 /* Common syscall exit function */
 void syscall_exit_to_usermode(struct pt_regs *regs, long syscall, long retval);
 
+/* KVM exit to guest mode */
+
+void core_exit_to_guestmode_work(unsigned long ti_work);
+
+#ifndef ARCH_EXIT_TO_GUESTMODE_WORK
+# define ARCH_EXIT_TO_GUESTMODE_WORK	(0)
+#endif
+
+#define EXIT_TO_GUESTMODE_WORK						\
+	(_TIF_NEED_RESCHED | _TIF_SIGPENDING | _TIF_NOTIFY_RESUME |	\
+	 ARCH_EXIT_TO_GUESTMODE_WORK)
+
 #endif
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -22,6 +22,7 @@
 #include <linux/err.h>
 #include <linux/irqflags.h>
 #include <linux/context_tracking.h>
+#include <linux/entry-common.h>
 #include <linux/irqbypass.h>
 #include <linux/swait.h>
 #include <linux/refcount.h>
@@ -1382,4 +1383,67 @@ static inline int kvm_arch_vcpu_run_pid_
 }
 #endif /* CONFIG_HAVE_KVM_VCPU_RUN_PID_CHANGE */
 
+/* Exit to guest mode work */
+#ifdef CONFIG_KVM_EXIT_TO_GUEST_WORK
+
+#ifndef arch_exit_to_guestmode_work
+/**
+ * arch_exit_to_guestmode_work - Architecture specific exit to guest mode function
+ * @kvm:	Pointer to the guest instance
+ * @vcpu:	Pointer to current's VCPU data
+ * @ti_work:	Cached TIF flags gathered in exit_to_guestmode()
+ *
+ * Invoked from exit_to_guestmode(). Can be replaced by
+ * architecture specific code.
+ */
+static inline int arch_exit_to_guestmode_work(struct kvm *kvm,
+					      struct kvm_vcpu *vcpu,
+					      unsigned long ti_work)
+{
+	return 0;
+}
+#endif
+
+/**
+ * exit_to_guestmode - Check and handle pending work which needs to be
+ *		       handled before returning to guest mode
+ * @kvm:	Pointer to the guest instance
+ * @vcpu:	Pointer to current's VCPU data
+ *
+ * Returns: 0 or an error code
+ */
+static inline int exit_to_guestmode(struct kvm *kvm, struct kvm_vcpu *vcpu)
+{
+	unsigned long ti_work = READ_ONCE(current_thread_info()->flags);
+	int r = 0;
+
+	if (unlikely(ti_work & EXIT_TO_GUESTMODE_WORK)) {
+		if (ti_work & _TIF_SIGPENDING) {
+			vcpu->run->exit_reason = KVM_EXIT_INTR;
+			vcpu->stat.signal_exits++;
+			return -EINTR;
+		}
+		core_exit_to_guestmode_work(ti_work);
+		r = arch_exit_to_guestmode_work(kvm, vcpu, ti_work);
+	}
+	return r;
+}
+
+/**
+ * _exit_to_guestmode_work_pending - Check if work is pending which needs to be
+ *				     handled before returning to guest mode
+ *
+ * Returns: True if work pending, False otherwise.
+ */
+static inline bool exit_to_guestmode_work_pending(void)
+{
+	unsigned long ti_work = READ_ONCE(current_thread_info()->flags);
+
+	lockdep_assert_irqs_disabled();
+
+	return !!(ti_work & EXIT_TO_GUESTMODE_WORK);
+
+}
+#endif /* CONFIG_KVM_EXIT_TO_GUEST_WORK */
+
 #endif
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -8,6 +8,20 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/syscalls.h>
 
+#ifdef CONFIG_KVM_EXIT_TO_GUEST_WORK
+void core_exit_to_guestmode_work(unsigned long ti_work)
+{
+	if (ti_work & _TIF_NEED_RESCHED)
+		schedule();
+
+	if (ti_work & _TIF_NOTIFY_RESUME) {
+		clear_thread_flag(TIF_NOTIFY_RESUME);
+		tracehook_notify_resume(NULL);
+	}
+}
+EXPORT_SYMBOL_GPL(core_exit_to_guestmode_work);
+#endif
+
 static unsigned long core_exit_to_usermode_work(struct pt_regs *regs,
 						unsigned long ti_work)
 {
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -60,3 +60,6 @@ config HAVE_KVM_VCPU_RUN_PID_CHANGE
 
 config HAVE_KVM_NO_POLL
        bool
+
+config KVM_EXIT_TO_GUEST_WORK
+       bool




* [patch V2 17/17] x86/kvm: Use generic exit to guest work function
  2019-10-23 12:27 [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Thomas Gleixner
                   ` (15 preceding siblings ...)
  2019-10-23 12:27 ` [patch V2 16/17] kvm/workpending: Provide infrastructure for work before entering a guest Thomas Gleixner
@ 2019-10-23 12:27 ` Thomas Gleixner
  2019-10-23 14:48   ` Sean Christopherson
  2019-10-23 14:37 ` [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Peter Zijlstra
                   ` (2 subsequent siblings)
  19 siblings, 1 reply; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 12:27 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes

Use the generic infrastructure to check for and handle pending work before
entering into guest mode.
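The effect on the run loop is that the open-coded signal_pending()/need_resched() tests collapse into a single mask check plus one helper call. A small userspace sketch of that equivalence, with illustrative flag names and values (not the kernel's definitions):

```c
#include <assert.h>

/* Illustrative stand-ins for the relevant _TIF_* bits; values are arbitrary. */
#define TIF_SIGPENDING    (1UL << 0)
#define TIF_NEED_RESCHED  (1UL << 1)
#define TIF_NOTIFY_RESUME (1UL << 2)

#define EXIT_TO_GUESTMODE_WORK \
	(TIF_SIGPENDING | TIF_NEED_RESCHED | TIF_NOTIFY_RESUME)

/* Before: separate predicates checked by hand in the KVM run loop. */
static int old_style_work_pending(unsigned long flags)
{
	return (flags & TIF_SIGPENDING) || (flags & TIF_NEED_RESCHED);
}

/* After: one mask test, which also picks up notify-resume work. */
static int new_style_work_pending(unsigned long flags)
{
	return !!(flags & EXIT_TO_GUESTMODE_WORK);
}
```

Everything the old checks caught is still caught; per the series' motivation, pending notify-resume/task work that the open-coded checks missed before guest entry is now handled as well.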

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kvm/Kconfig |    1 +
 arch/x86/kvm/x86.c   |   17 +++++------------
 2 files changed, 6 insertions(+), 12 deletions(-)

--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -42,6 +42,7 @@ config KVM
 	select HAVE_KVM_MSI
 	select HAVE_KVM_CPU_RELAX_INTERCEPT
 	select HAVE_KVM_NO_POLL
+	select KVM_EXIT_TO_GUEST_WORK
 	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
 	select KVM_VFIO
 	select SRCU
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -52,6 +52,7 @@
 #include <linux/irqbypass.h>
 #include <linux/sched/stat.h>
 #include <linux/sched/isolation.h>
+#include <linux/entry-common.h>
 #include <linux/mem_encrypt.h>
 
 #include <trace/events/kvm.h>
@@ -8115,8 +8116,8 @@ static int vcpu_enter_guest(struct kvm_v
 	if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active)
 		kvm_x86_ops->sync_pir_to_irr(vcpu);
 
-	if (vcpu->mode == EXITING_GUEST_MODE || kvm_request_pending(vcpu)
-	    || need_resched() || signal_pending(current)) {
+	if (vcpu->mode == EXITING_GUEST_MODE || kvm_request_pending(vcpu) ||
+	    exit_to_guestmode_work_pending()) {
 		vcpu->mode = OUTSIDE_GUEST_MODE;
 		smp_wmb();
 		local_irq_enable();
@@ -8309,17 +8310,9 @@ static int vcpu_run(struct kvm_vcpu *vcp
 
 		kvm_check_async_pf_completion(vcpu);
 
-		if (signal_pending(current)) {
-			r = -EINTR;
-			vcpu->run->exit_reason = KVM_EXIT_INTR;
-			++vcpu->stat.signal_exits;
+		r = exit_to_guestmode(kvm, vcpu);
+		if (r)
 			break;
-		}
-		if (need_resched()) {
-			srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
-			cond_resched();
-			vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
-		}
 	}
 
 	srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);




* Re: [patch V2 01/17] x86/entry/32: Remove unused resume_userspace label
  2019-10-23 12:27 ` [patch V2 01/17] x86/entry/32: Remove unused resume_userspace label Thomas Gleixner
@ 2019-10-23 13:43   ` Sean Christopherson
  2019-11-06 15:26   ` Alexandre Chartre
  2019-11-16 12:02   ` [tip: x86/asm] " tip-bot2 for Thomas Gleixner
  2 siblings, 0 replies; 64+ messages in thread
From: Sean Christopherson @ 2019-10-23 13:43 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Peter Zijlstra, Andy Lutomirski, Will Deacon,
	Paolo Bonzini, kvm, linux-arch, Mike Rapoport, Josh Poimboeuf,
	Miroslav Benes

On Wed, Oct 23, 2019 at 02:27:06PM +0200, Thomas Gleixner wrote:
> The C reimplementation of SYSENTER left that unused ENTRY() label
> around. Remove it.
> 
> Fixes: 5f310f739b4c ("x86/entry/32: Re-implement SYSENTER using the new C path")
> Originally-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---

Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>


* Re: [patch V2 02/17] x86/entry/64: Remove pointless jump in paranoid_exit
  2019-10-23 12:27 ` [patch V2 02/17] x86/entry/64: Remove pointless jump in paranoid_exit Thomas Gleixner
@ 2019-10-23 13:45   ` Sean Christopherson
  2019-11-06 15:29   ` Alexandre Chartre
  2019-11-16 12:02   ` [tip: x86/asm] " tip-bot2 for Thomas Gleixner
  2 siblings, 0 replies; 64+ messages in thread
From: Sean Christopherson @ 2019-10-23 13:45 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Peter Zijlstra, Andy Lutomirski, Will Deacon,
	Paolo Bonzini, kvm, linux-arch, Mike Rapoport, Josh Poimboeuf,
	Miroslav Benes

On Wed, Oct 23, 2019 at 02:27:07PM +0200, Thomas Gleixner wrote:
> Jump directly to restore_regs_and_return_to_kernel instead of making
> a pointless extra jump through .Lparanoid_exit_restore
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---

Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>


* Re: [patch V2 03/17] x86/traps: Remove pointless irq enable from do_spurious_interrupt_bug()
  2019-10-23 12:27 ` [patch V2 03/17] x86/traps: Remove pointless irq enable from do_spurious_interrupt_bug() Thomas Gleixner
@ 2019-10-23 13:52   ` Sean Christopherson
  2019-10-23 21:31   ` Josh Poimboeuf
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 64+ messages in thread
From: Sean Christopherson @ 2019-10-23 13:52 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Peter Zijlstra, Andy Lutomirski, Will Deacon,
	Paolo Bonzini, kvm, linux-arch, Mike Rapoport, Josh Poimboeuf,
	Miroslav Benes

On Wed, Oct 23, 2019 at 02:27:08PM +0200, Thomas Gleixner wrote:
> That function returns immediately after conditionally reenabling interrupts which
> is more than pointless and requires the ASM code to disable interrupts again.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---

Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>


* Re: [patch V2 04/17] x86/entry: Make DEBUG_ENTRY_ASSERT_IRQS_OFF available for 32bit
  2019-10-23 12:27 ` [patch V2 04/17] x86/entry: Make DEBUG_ENTRY_ASSERT_IRQS_OFF available for 32bit Thomas Gleixner
@ 2019-10-23 14:16   ` Sean Christopherson
  2019-11-06 15:50   ` Alexandre Chartre
  1 sibling, 0 replies; 64+ messages in thread
From: Sean Christopherson @ 2019-10-23 14:16 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Peter Zijlstra, Andy Lutomirski, Will Deacon,
	Paolo Bonzini, kvm, linux-arch, Mike Rapoport, Josh Poimboeuf,
	Miroslav Benes

On Wed, Oct 23, 2019 at 02:27:09PM +0200, Thomas Gleixner wrote:
> Move the interrupt state verification debug macro to common code and fixup
> the irqflags and paravirt components so it can be used in 32bit code later.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---

Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>


* Re: [patch V2 05/17] x86/traps: Make interrupt enable/disable symmetric in C code
  2019-10-23 12:27 ` [patch V2 05/17] x86/traps: Make interrupt enable/disable symmetric in C code Thomas Gleixner
@ 2019-10-23 14:16   ` Sean Christopherson
  2019-10-23 22:01   ` Josh Poimboeuf
  2019-11-06 16:19   ` Alexandre Chartre
  2 siblings, 0 replies; 64+ messages in thread
From: Sean Christopherson @ 2019-10-23 14:16 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Peter Zijlstra, Andy Lutomirski, Will Deacon,
	Paolo Bonzini, kvm, linux-arch, Mike Rapoport, Josh Poimboeuf,
	Miroslav Benes

On Wed, Oct 23, 2019 at 02:27:10PM +0200, Thomas Gleixner wrote:
> Traps enable interrupts conditionally but rely on the ASM return code to
> disable them again. That results in redundant interrupt disable and trace
> calls.
> 
> Make the trap handlers disable interrupts before returning to avoid that,
> which allows simplification of the ASM entry code.
> 
> Originally-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---

Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>


* Re: [patch V2 06/17] x86/entry/32: Remove redundant interrupt disable
  2019-10-23 12:27 ` [patch V2 06/17] x86/entry/32: Remove redundant interrupt disable Thomas Gleixner
@ 2019-10-23 14:17   ` Sean Christopherson
  2019-11-08 10:41   ` Alexandre Chartre
  1 sibling, 0 replies; 64+ messages in thread
From: Sean Christopherson @ 2019-10-23 14:17 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Peter Zijlstra, Andy Lutomirski, Will Deacon,
	Paolo Bonzini, kvm, linux-arch, Mike Rapoport, Josh Poimboeuf,
	Miroslav Benes

On Wed, Oct 23, 2019 at 02:27:11PM +0200, Thomas Gleixner wrote:
> Now that the trap handlers return with interrupts disabled, the
> unconditional disabling of interrupts in the low level entry code can be
> removed along with the trace calls and the misnomed preempt_stop macro.
> As a consequence ret_from_exception and ret_from_intr collapse.
> 
> Add a debug check to verify that interrupts are disabled depending on
> CONFIG_DEBUG_ENTRY.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---

One nit below.

Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>

>  arch/x86/entry/entry_32.S |   21 ++++++---------------
>  1 file changed, 6 insertions(+), 15 deletions(-)
> 
> --- a/arch/x86/entry/entry_32.S
> +++ b/arch/x86/entry/entry_32.S
> @@ -1207,7 +1198,7 @@ ENDPROC(common_spurious)
>  	TRACE_IRQS_OFF
>  	movl	%esp, %eax
>  	call	do_IRQ
> -	jmp	ret_from_intr
> +	jmp	ret_from_exception
>  ENDPROC(common_interrupt)
>  
>  #define BUILD_INTERRUPT3(name, nr, fn)			\
> @@ -1219,7 +1210,7 @@ ENTRY(name)						\
>  	TRACE_IRQS_OFF					\
>  	movl	%esp, %eax;				\
>  	call	fn;					\
> -	jmp	ret_from_intr;				\
> +	jmp	ret_from_exception;				\

This backslash is now unaligned.

>  ENDPROC(name)
>  
>  #define BUILD_INTERRUPT(name, nr)		\
> @@ -1366,7 +1357,7 @@ ENTRY(xen_do_upcall)
>  #ifndef CONFIG_PREEMPTION
>  	call	xen_maybe_preempt_hcall
>  #endif
> -	jmp	ret_from_intr
> +	jmp	ret_from_exception
>  ENDPROC(xen_hypervisor_callback)
>  
>  /*
> 
> 


* Re: [patch V2 07/17] x86/entry/64: Remove redundant interrupt disable
  2019-10-23 12:27 ` [patch V2 07/17] x86/entry/64: " Thomas Gleixner
@ 2019-10-23 14:20   ` Sean Christopherson
  2019-10-23 22:06   ` Josh Poimboeuf
  2019-11-08 11:07   ` Alexandre Chartre
  2 siblings, 0 replies; 64+ messages in thread
From: Sean Christopherson @ 2019-10-23 14:20 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Peter Zijlstra, Andy Lutomirski, Will Deacon,
	Paolo Bonzini, kvm, linux-arch, Mike Rapoport, Josh Poimboeuf,
	Miroslav Benes

On Wed, Oct 23, 2019 at 02:27:12PM +0200, Thomas Gleixner wrote:
> Now that the trap handlers return with interrupts disabled, the
> unconditional disabling of interrupts in the low level entry code can be
> removed along with the trace calls.
> 
> Add debug checks where appropriate.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---

Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>


* Re: [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work
  2019-10-23 12:27 [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Thomas Gleixner
                   ` (16 preceding siblings ...)
  2019-10-23 12:27 ` [patch V2 17/17] x86/kvm: Use generic exit to guest work function Thomas Gleixner
@ 2019-10-23 14:37 ` Peter Zijlstra
  2019-10-23 21:20 ` Josh Poimboeuf
  2019-10-29 11:28 ` Will Deacon
  19 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2019-10-23 14:37 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Andy Lutomirski, Will Deacon, Paolo Bonzini, kvm,
	linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes

On Wed, Oct 23, 2019 at 02:27:05PM +0200, Thomas Gleixner wrote:
>  /Makefile                             |    3 
>  arch/Kconfig                          |    3 
>  arch/x86/Kconfig                      |    1 
>  arch/x86/entry/calling.h              |   12 +
>  arch/x86/entry/common.c               |  264 ++------------------------------
>  arch/x86/entry/entry_32.S             |   41 ----
>  arch/x86/entry/entry_64.S             |   32 ---
>  arch/x86/entry/entry_64_compat.S      |   30 ---
>  arch/x86/include/asm/irqflags.h       |    8 
>  arch/x86/include/asm/paravirt.h       |    9 -
>  arch/x86/include/asm/signal.h         |    1 
>  arch/x86/include/asm/thread_info.h    |    9 -
>  arch/x86/kernel/signal.c              |    2 
>  arch/x86/kernel/traps.c               |   33 ++--
>  arch/x86/kvm/x86.c                    |   17 --
>  arch/x86/mm/fault.c                   |    7 
>  b/arch/x86/include/asm/entry-common.h |  104 ++++++++++++
>  b/arch/x86/kvm/Kconfig                |    1 
>  b/include/linux/entry-common.h        |  280 ++++++++++++++++++++++++++++++++++
>  b/kernel/entry/common.c               |  184 ++++++++++++++++++++++
>  include/linux/kvm_host.h              |   64 +++++++
>  kernel/Makefile                       |    1 
>  virt/kvm/Kconfig                      |    3 
>  23 files changed, 735 insertions(+), 374 deletions(-)

This looks really nice; esp. the cleaned up interrupt state.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>


* Re: [patch V2 17/17] x86/kvm: Use generic exit to guest work function
  2019-10-23 12:27 ` [patch V2 17/17] x86/kvm: Use generic exit to guest work function Thomas Gleixner
@ 2019-10-23 14:48   ` Sean Christopherson
  0 siblings, 0 replies; 64+ messages in thread
From: Sean Christopherson @ 2019-10-23 14:48 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Peter Zijlstra, Andy Lutomirski, Will Deacon,
	Paolo Bonzini, kvm, linux-arch, Mike Rapoport, Josh Poimboeuf,
	Miroslav Benes

On Wed, Oct 23, 2019 at 02:27:22PM +0200, Thomas Gleixner wrote:
> Use the generic infrastructure to check for and handle pending work before
> entering into guest mode.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  arch/x86/kvm/Kconfig |    1 +
>  arch/x86/kvm/x86.c   |   17 +++++------------
>  2 files changed, 6 insertions(+), 12 deletions(-)
> 
> --- a/arch/x86/kvm/Kconfig
> +++ b/arch/x86/kvm/Kconfig
> @@ -42,6 +42,7 @@ config KVM
>  	select HAVE_KVM_MSI
>  	select HAVE_KVM_CPU_RELAX_INTERCEPT
>  	select HAVE_KVM_NO_POLL
> +	select KVM_EXIT_TO_GUEST_WORK
>  	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
>  	select KVM_VFIO
>  	select SRCU
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -52,6 +52,7 @@
>  #include <linux/irqbypass.h>
>  #include <linux/sched/stat.h>
>  #include <linux/sched/isolation.h>
> +#include <linux/entry-common.h>
>  #include <linux/mem_encrypt.h>
>  
>  #include <trace/events/kvm.h>
> @@ -8115,8 +8116,8 @@ static int vcpu_enter_guest(struct kvm_v
>  	if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active)
>  		kvm_x86_ops->sync_pir_to_irr(vcpu);
>  
> -	if (vcpu->mode == EXITING_GUEST_MODE || kvm_request_pending(vcpu)
> -	    || need_resched() || signal_pending(current)) {
> +	if (vcpu->mode == EXITING_GUEST_MODE || kvm_request_pending(vcpu) ||
> +	    exit_to_guestmode_work_pending()) {

The terms EXIT_TO_GUEST and exit_to_guestmode are very confusing, as
they're inverted from the usual virt terminology of VM-Enter (enter guest)
and VM-Exit (exit guest).  The conflict is most obvious here, with the
above "vcpu->mode == EXITING_GUEST_MODE", which is checking to see if the
vCPU is being forced to exit *from* guest mode because was kicked by some
other part of KVM.

Maybe XFER_TO_GUEST?  I.e. avoid entry/exit entirely, so that neither the
entry code nor KVM ends up with a confusing name.

>  		vcpu->mode = OUTSIDE_GUEST_MODE;
>  		smp_wmb();
>  		local_irq_enable();
> @@ -8309,17 +8310,9 @@ static int vcpu_run(struct kvm_vcpu *vcp
>  
>  		kvm_check_async_pf_completion(vcpu);
>  
> -		if (signal_pending(current)) {
> -			r = -EINTR;
> -			vcpu->run->exit_reason = KVM_EXIT_INTR;
> -			++vcpu->stat.signal_exits;
> +		r = exit_to_guestmode(kvm, vcpu);

Ditto here.  If the run loop is stripped down to the core functionality,
it effectively looks like:

	for (;;) {
		r = vcpu_enter_guest(vcpu);
		if (r <= 0)
			break;

		...

		r = exit_to_guestmode(kvm, vcpu);
		if (r)
			break;
	}

Appending _handle_work to the function would also be helpful so that it's
somewhat clear the function isn't related to the core vcpu_enter_guest()
functionality, e.g.:

	for (;;) {
		r = vcpu_enter_guest(vcpu);
		if (r <= 0)
			break;

		...

		r = xfer_to_guestmode_handle_work(kvm, vcpu);
		if (r)
			break;
	}


> +		if (r)
>  			break;
> -		}
> -		if (need_resched()) {
> -			srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
> -			cond_resched();
> -			vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
> -		}
>  	}
>  
>  	srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
> 
> 


* Re: [patch V2 16/17] kvm/workpending: Provide infrastructure for work before entering a guest
  2019-10-23 12:27 ` [patch V2 16/17] kvm/workpending: Provide infrastructure for work before entering a guest Thomas Gleixner
@ 2019-10-23 14:55   ` Sean Christopherson
  0 siblings, 0 replies; 64+ messages in thread
From: Sean Christopherson @ 2019-10-23 14:55 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Peter Zijlstra, Andy Lutomirski, Will Deacon,
	Paolo Bonzini, kvm, linux-arch, Mike Rapoport, Josh Poimboeuf,
	Miroslav Benes

On Wed, Oct 23, 2019 at 02:27:21PM +0200, Thomas Gleixner wrote:
> Entering a guest is similar to exiting to user space. Pending work like
> handling signals, rescheduling, task work etc. needs to be handled before
> that.
> 
> Provide generic infrastructure to avoid duplication of the same handling code
> all over the place.
> 
> The kvm_exit code is split up into a KVM specific part and a generic
> builtin core part to avoid multiple exports for the actual work
> functions. The exit to guest mode handling is slightly different from the
> exit to usermode handling, e.g. vs. rseq, so a separate function is used.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> +/**
> + * exit_to_guestmode - Check and handle pending work which needs to be
> + *		       handled before returning to guest mode

Nit: I'd prefer "transferring" or "transitioning" over "returning".  KVM
could bail out of the very first run of a guest in order to handle work,
in which case the kernel isn't technically returning to guest mode as it's
never been there.  The comment might trip up VMX folks that understand the
difference between VMLAUNCH and VMRESUME, but not the purpose of this code.

> + * @kvm:	Pointer to the guest instance
> + * @vcpu:	Pointer to current's VCPU data
> + *
> + * Returns: 0 or an error code
> + */
> +static inline int exit_to_guestmode(struct kvm *kvm, struct kvm_vcpu *vcpu)
> +{
> +	unsigned long ti_work = READ_ONCE(current_thread_info()->flags);
> +	int r = 0;
> +
> +	if (unlikely(ti_work & EXIT_TO_GUESTMODE_WORK)) {
> +		if (ti_work & _TIF_SIGPENDING) {
> +			vcpu->run->exit_reason = KVM_EXIT_INTR;
> +			vcpu->stat.signal_exits++;
> +			return -EINTR;
> +		}
> +		core_exit_to_guestmode_work(ti_work);
> +		r = arch_exit_to_guestmode_work(kvm, vcpu, ti_work);
> +	}
> +	return r;
> +}
> +
> +/**
> + * _exit_to_guestmode_work_pending - Check if work is pending which needs to be
> + *				     handled before returning to guest mode

Same pedantic comment on "returning".

> + *
> + * Returns: True if work pending, False otherwise.
> + */
> +static inline bool exit_to_guestmode_work_pending(void)
> +{
> +	unsigned long ti_work = READ_ONCE(current_thread_info()->flags);
> +
> +	lockdep_assert_irqs_disabled();
> +
> +	return !!(ti_work & EXIT_TO_GUESTMODE_WORK);
> +
> +}
> +#endif /* CONFIG_KVM_EXIT_TO_GUEST_WORK */
> +
>  #endif


* Re: [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work
  2019-10-23 12:27 [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Thomas Gleixner
                   ` (17 preceding siblings ...)
  2019-10-23 14:37 ` [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Peter Zijlstra
@ 2019-10-23 21:20 ` Josh Poimboeuf
  2019-10-29 11:28 ` Will Deacon
  19 siblings, 0 replies; 64+ messages in thread
From: Josh Poimboeuf @ 2019-10-23 21:20 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Peter Zijlstra, Andy Lutomirski, Will Deacon,
	Paolo Bonzini, kvm, linux-arch, Mike Rapoport, Miroslav Benes

On Wed, Oct 23, 2019 at 02:27:05PM +0200, Thomas Gleixner wrote:
> The series is also available from git:
> 
>    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP/core.entry

Actually

     git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.core/entry

:-)

-- 
Josh



* Re: [patch V2 08/17] x86/entry: Move syscall irq tracing to C code
  2019-10-23 12:27 ` [patch V2 08/17] x86/entry: Move syscall irq tracing to C code Thomas Gleixner
@ 2019-10-23 21:30   ` Andy Lutomirski
  2019-10-23 21:35     ` Andy Lutomirski
                       ` (2 more replies)
  0 siblings, 3 replies; 64+ messages in thread
From: Andy Lutomirski @ 2019-10-23 21:30 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Peter Zijlstra, Andy Lutomirski, Will Deacon,
	Paolo Bonzini, kvm list, linux-arch, Mike Rapoport,
	Josh Poimboeuf, Miroslav Benes


On Wed, Oct 23, 2019 at 5:31 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Interrupt state tracing can be safely done in C code. The few stack
> operations in assembly do not need to be covered.
>
> Remove the now pointless indirection via .Lsyscall_32_done and jump to
> swapgs_restore_regs_and_return_to_usermode directly.

This doesn't look right.

>  #define SYSCALL_EXIT_WORK_FLAGS                                \
> @@ -279,6 +282,9 @@ static void syscall_slow_exit_work(struc
>  {
>         struct thread_info *ti;
>
> +       /* User to kernel transition disabled interrupts. */
> +       trace_hardirqs_off();
> +

So you just traced IRQs off, but...

>         enter_from_user_mode();
>         local_irq_enable();

Now they're on and traced on again?

I also don't see how your patch handles the fastpath case.

How about the attached patch instead?

[-- Attachment #2: irqtrace.patch --]
[-- Type: text/x-patch, Size: 2651 bytes --]

commit e154af0d7ff2d605f155f5aca059e3e835e426d4
Author: Andy Lutomirski <luto@kernel.org>
Date:   Fri Aug 2 10:30:44 2019 -0700

    x86/entry: Move exit-to-usermode irqflag tracing to prepare_exit_to_usermode()
    
    prepare_exit_to_usermode() can easily handle irqflag tracing.  Move
    the logic there and remove it from the entry asm.
    
    Signed-off-by: Andy Lutomirski <luto@kernel.org>

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index cc49380ef8ab..f4ce0cf2fb74 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -217,6 +217,13 @@ __visible inline void prepare_exit_to_usermode(struct pt_regs *regs)
 
 	user_enter_irqoff();
 
+	/*
+	 * The actual return to usermode will almost certainly turn IRQs on.
+	 * Trace it here to simplify the asm code.
+	 */
+	if (likely(regs->flags & X86_EFLAGS_IF))
+		trace_hardirqs_on();
+
 	mds_user_clear_cpu_buffers();
 }
 
diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index f83ca5aa8b77..c703c29bebb1 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -1062,7 +1062,6 @@ ENTRY(entry_INT80_32)
 	STACKLEAK_ERASE
 
 restore_all:
-	TRACE_IRQS_IRET
 	SWITCH_TO_ENTRY_STACK
 .Lrestore_all_notrace:
 	CHECK_AND_APPLY_ESPFIX
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 3abae80d4902..056419a0e76f 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -172,8 +172,6 @@ GLOBAL(entry_SYSCALL_64_after_hwframe)
 	movq	%rsp, %rsi
 	call	do_syscall_64		/* returns with IRQs disabled */
 
-	TRACE_IRQS_IRETQ		/* we're about to change IF */
-
 	/*
 	 * Try to use SYSRET instead of IRET if we're returning to
 	 * a completely clean 64-bit userspace context.  If we're not,
@@ -617,7 +615,6 @@ ret_from_intr:
 .Lretint_user:
 	mov	%rsp,%rdi
 	call	prepare_exit_to_usermode
-	TRACE_IRQS_IRETQ
 
 GLOBAL(swapgs_restore_regs_and_return_to_usermode)
 #ifdef CONFIG_DEBUG_ENTRY
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index 435df637f392..3502f38bde01 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -254,7 +254,6 @@ sysret32_from_system_call:
 	 * stack. So let's erase the thread stack right now.
 	 */
 	STACKLEAK_ERASE
-	TRACE_IRQS_ON			/* User mode traces as IRQs on. */
 	movq	RBX(%rsp), %rbx		/* pt_regs->rbx */
 	movq	RBP(%rsp), %rbp		/* pt_regs->rbp */
 	movq	EFLAGS(%rsp), %r11	/* pt_regs->flags (in r11) */
@@ -396,6 +395,5 @@ ENTRY(entry_INT80_compat)
 .Lsyscall_32_done:
 
 	/* Go back to user mode. */
-	TRACE_IRQS_ON
 	jmp	swapgs_restore_regs_and_return_to_usermode
 END(entry_INT80_compat)


* Re: [patch V2 03/17] x86/traps: Remove pointless irq enable from do_spurious_interrupt_bug()
  2019-10-23 12:27 ` [patch V2 03/17] x86/traps: Remove pointless irq enable from do_spurious_interrupt_bug() Thomas Gleixner
  2019-10-23 13:52   ` Sean Christopherson
@ 2019-10-23 21:31   ` Josh Poimboeuf
  2019-10-23 22:35     ` Thomas Gleixner
  2019-11-06 15:33   ` Alexandre Chartre
  2020-02-27 14:15   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  3 siblings, 1 reply; 64+ messages in thread
From: Josh Poimboeuf @ 2019-10-23 21:31 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Peter Zijlstra, Andy Lutomirski, Will Deacon,
	Paolo Bonzini, kvm, linux-arch, Mike Rapoport, Miroslav Benes

On Wed, Oct 23, 2019 at 02:27:08PM +0200, Thomas Gleixner wrote:
> That function returns immediately after conditionally reenabling interrupts which
> is more than pointless and requires the ASM code to disable interrupts again.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  arch/x86/kernel/traps.c |    1 -
>  1 file changed, 1 deletion(-)
> 
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -871,7 +871,6 @@ do_simd_coprocessor_error(struct pt_regs
>  dotraplinkage void
>  do_spurious_interrupt_bug(struct pt_regs *regs, long error_code)
>  {
> -	cond_local_irq_enable(regs);
>  }

I think we can just remove this handler altogether.  The Intel and AMD
manuals say vector 15 (X86_TRAP_SPURIOUS) is reserved.

-- 
Josh



* Re: [patch V2 14/17] entry: Provide generic exit to usermode functionality
  2019-10-23 12:27 ` [patch V2 14/17] entry: Provide generic exit to usermode functionality Thomas Gleixner
@ 2019-10-23 21:34   ` Andy Lutomirski
  2019-10-23 23:20     ` Thomas Gleixner
  0 siblings, 1 reply; 64+ messages in thread
From: Andy Lutomirski @ 2019-10-23 21:34 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Peter Zijlstra, Andy Lutomirski, Will Deacon,
	Paolo Bonzini, kvm list, linux-arch, Mike Rapoport,
	Josh Poimboeuf, Miroslav Benes

On Wed, Oct 23, 2019 at 5:31 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> Provide a generic facility to handle the exit to usermode work. That's
> aimed to replace the pointlessly different copies in each architecture.


>  /**
> + * local_irq_enable_exit_to_user - Exit to user variant of local_irq_enable()
> + * @ti_work:   Cached TIF flags gathered with interrupts disabled
> + *
> + * Defaults to local_irq_enable(). Can be supplied by architecture specific
> + * code.

What did you have in mind here?


* Re: [patch V2 08/17] x86/entry: Move syscall irq tracing to C code
  2019-10-23 21:30   ` Andy Lutomirski
@ 2019-10-23 21:35     ` Andy Lutomirski
  2019-10-23 23:31       ` Thomas Gleixner
  2019-10-23 23:16     ` Thomas Gleixner
  2019-10-24 16:24     ` Andy Lutomirski
  2 siblings, 1 reply; 64+ messages in thread
From: Andy Lutomirski @ 2019-10-23 21:35 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Thomas Gleixner, LKML, X86 ML, Peter Zijlstra, Will Deacon,
	Paolo Bonzini, kvm list, linux-arch, Mike Rapoport,
	Josh Poimboeuf, Miroslav Benes

On Wed, Oct 23, 2019 at 2:30 PM Andy Lutomirski <luto@kernel.org> wrote:
>
> On Wed, Oct 23, 2019 at 5:31 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> > Interrupt state tracing can be safely done in C code. The few stack
> > operations in assembly do not need to be covered.
> >
> > Remove the now pointless indirection via .Lsyscall_32_done and jump to
> > swapgs_restore_regs_and_return_to_usermode directly.
>
> This doesn't look right.
>
> >  #define SYSCALL_EXIT_WORK_FLAGS                                \
> > @@ -279,6 +282,9 @@ static void syscall_slow_exit_work(struc
> >  {
> >         struct thread_info *ti;
> >
> > +       /* User to kernel transition disabled interrupts. */
> > +       trace_hardirqs_off();
> > +
>
> So you just traced IRQs off, but...
>
> >         enter_from_user_mode();
> >         local_irq_enable();
>
> Now they're on and traced on again?
>
> I also don't see how your patch handles the fastpath case.
>
> How about the attached patch instead?

Ignore the attached patch.  You have this in your
do_exit_to_usermode() later in the series.  But I'm still quite
confused by this patch.


* Re: [patch V2 05/17] x86/traps: Make interrupt enable/disable symmetric in C code
  2019-10-23 12:27 ` [patch V2 05/17] x86/traps: Make interrupt enable/disable symmetric in C code Thomas Gleixner
  2019-10-23 14:16   ` Sean Christopherson
@ 2019-10-23 22:01   ` Josh Poimboeuf
  2019-10-23 23:23     ` Thomas Gleixner
  2019-11-06 16:19   ` Alexandre Chartre
  2 siblings, 1 reply; 64+ messages in thread
From: Josh Poimboeuf @ 2019-10-23 22:01 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Peter Zijlstra, Andy Lutomirski, Will Deacon,
	Paolo Bonzini, kvm, linux-arch, Mike Rapoport, Miroslav Benes

On Wed, Oct 23, 2019 at 02:27:10PM +0200, Thomas Gleixner wrote:
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -1500,10 +1500,13 @@ static noinline void
>  		return;
>  
>  	/* Was the fault on kernel-controlled part of the address space? */
> -	if (unlikely(fault_in_kernel_space(address)))
> +	if (unlikely(fault_in_kernel_space(address))) {
>  		do_kern_addr_fault(regs, hw_error_code, address);
> -	else
> +	} else {
>  		do_user_addr_fault(regs, hw_error_code, address);
> +		if (regs->flags & X86_EFLAGS_IF)
> +			local_irq_disable();
> +	}

The corresponding irq enable is in do_user_addr_fault(), why not do the
disable there?

-- 
Josh



* Re: [patch V2 07/17] x86/entry/64: Remove redundant interrupt disable
  2019-10-23 12:27 ` [patch V2 07/17] x86/entry/64: " Thomas Gleixner
  2019-10-23 14:20   ` Sean Christopherson
@ 2019-10-23 22:06   ` Josh Poimboeuf
  2019-10-23 23:52     ` Thomas Gleixner
  2019-11-08 11:07   ` Alexandre Chartre
  2 siblings, 1 reply; 64+ messages in thread
From: Josh Poimboeuf @ 2019-10-23 22:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Peter Zijlstra, Andy Lutomirski, Will Deacon,
	Paolo Bonzini, kvm, linux-arch, Mike Rapoport, Miroslav Benes

On Wed, Oct 23, 2019 at 02:27:12PM +0200, Thomas Gleixner wrote:
> Now that the trap handlers return with interrupts disabled, the
> unconditional disabling of interrupts in the low level entry code can be
> removed along with the trace calls.
> 
> Add debug checks where appropriate.

This seems a little scary.  Does anybody other than Andy actually run
with CONFIG_DEBUG_ENTRY?  What happens if somebody accidentally leaves
irqs enabled?  How do we know you found all the leaks?

-- 
Josh



* Re: [patch V2 03/17] x86/traps: Remove pointless irq enable from do_spurious_interrupt_bug()
  2019-10-23 21:31   ` Josh Poimboeuf
@ 2019-10-23 22:35     ` Thomas Gleixner
  2019-10-23 22:49       ` Josh Poimboeuf
  0 siblings, 1 reply; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 22:35 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: LKML, x86, Peter Zijlstra, Andy Lutomirski, Will Deacon,
	Paolo Bonzini, kvm, linux-arch, Mike Rapoport, Miroslav Benes


On Wed, 23 Oct 2019, Josh Poimboeuf wrote:

> On Wed, Oct 23, 2019 at 02:27:08PM +0200, Thomas Gleixner wrote:
> > That function returns immediately after conditionally reenabling interrupts which
> > is more than pointless and requires the ASM code to disable interrupts again.
> > 
> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > ---
> >  arch/x86/kernel/traps.c |    1 -
> >  1 file changed, 1 deletion(-)
> > 
> > --- a/arch/x86/kernel/traps.c
> > +++ b/arch/x86/kernel/traps.c
> > @@ -871,7 +871,6 @@ do_simd_coprocessor_error(struct pt_regs
> >  dotraplinkage void
> >  do_spurious_interrupt_bug(struct pt_regs *regs, long error_code)
> >  {
> > -	cond_local_irq_enable(regs);
> >  }
> 
> I think we can just remove this handler altogether.  The Intel and AMD
> manuals say vector 15 (X86_TRAP_SPURIOUS) is reserved.

Right, but this has history. Pentium Pro Erratum:

  PROBLEM: If the APIC subsystem is configured in mixed mode with Virtual
  Wire mode implemented through the local APIC, an interrupt vector of 0Fh
  (Intel reserved encoding) may be generated by the local APIC (Int 15).
  This vector may be generated upon receipt of a spurious interrupt (an
  interrupt which is removed before the system receives the INTA sequence)
  instead of the programmed 8259 spurious interrupt vector.

  IMPLICATION: The spurious interrupt vector programmed in the 8259 is
  normally handled by an operating system’s spurious interrupt
  handler. However, a vector of 0Fh is unknown to some operating systems,
  which would crash if this erratum occurred.

Initially (2.1.) there was a printk() in that handler, which later got
ifdeffed out (2.1.54).

So I'd rather keep that thing at least as long as we support PPro :) Even if
we ditch that, the handler is not really hurting anyone.

Thanks,

	tglx





* Re: [patch V2 03/17] x86/traps: Remove pointless irq enable from do_spurious_interrupt_bug()
  2019-10-23 22:35     ` Thomas Gleixner
@ 2019-10-23 22:49       ` Josh Poimboeuf
  2019-10-23 23:18         ` Thomas Gleixner
  0 siblings, 1 reply; 64+ messages in thread
From: Josh Poimboeuf @ 2019-10-23 22:49 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Peter Zijlstra, Andy Lutomirski, Will Deacon,
	Paolo Bonzini, kvm, linux-arch, Mike Rapoport, Miroslav Benes

On Thu, Oct 24, 2019 at 12:35:27AM +0200, Thomas Gleixner wrote:
> On Wed, 23 Oct 2019, Josh Poimboeuf wrote:
> 
> > On Wed, Oct 23, 2019 at 02:27:08PM +0200, Thomas Gleixner wrote:
> > > That function returns immediately after conditionally reenabling interrupts which
> > > is more than pointless and requires the ASM code to disable interrupts again.
> > > 
> > > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > > ---
> > >  arch/x86/kernel/traps.c |    1 -
> > >  1 file changed, 1 deletion(-)
> > > 
> > > --- a/arch/x86/kernel/traps.c
> > > +++ b/arch/x86/kernel/traps.c
> > > @@ -871,7 +871,6 @@ do_simd_coprocessor_error(struct pt_regs
> > >  dotraplinkage void
> > >  do_spurious_interrupt_bug(struct pt_regs *regs, long error_code)
> > >  {
> > > -	cond_local_irq_enable(regs);
> > >  }
> > 
> > I think we can just remove this handler altogether.  The Intel and AMD
> > manuals say vector 15 (X86_TRAP_SPURIOUS) is reserved.
> 
> Right, but this has history. Pentium Pro Erratum:
> 
>   PROBLEM: If the APIC subsystem is configured in mixed mode with Virtual
>   Wire mode implemented through the local APIC, an interrupt vector of 0Fh
>   (Intel reserved encoding) may be generated by the local APIC (Int 15).
>   This vector may be generated upon receipt of a spurious interrupt (an
>   interrupt which is removed before the system receives the INTA sequence)
>   instead of the programmed 8259 spurious interrupt vector.
> 
>   IMPLICATION: The spurious interrupt vector programmed in the 8259 is
>   normally handled by an operating system’s spurious interrupt
>   handler. However, a vector of 0Fh is unknown to some operating systems,
>   which would crash if this erratum occurred.
> 
> Initially (2.1.) there was a printk() in that handler, which later got
> ifdeffed out (2.1.54).
> 
> So I rather keep that thing at least as long as we support PPro :) Even if
> we ditch that the handler is not really hurting anyone.

Ah.  I guess we could remove the idtentry for 64-bit then?  Anyway the
above would be a good comment for the function.

-- 
Josh



* Re: [patch V2 08/17] x86/entry: Move syscall irq tracing to C code
  2019-10-23 21:30   ` Andy Lutomirski
  2019-10-23 21:35     ` Andy Lutomirski
@ 2019-10-23 23:16     ` Thomas Gleixner
  2019-10-24 16:24     ` Andy Lutomirski
  2 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 23:16 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: LKML, X86 ML, Peter Zijlstra, Will Deacon, Paolo Bonzini,
	kvm list, linux-arch, Mike Rapoport, Josh Poimboeuf,
	Miroslav Benes

On Wed, 23 Oct 2019, Andy Lutomirski wrote:

> On Wed, Oct 23, 2019 at 5:31 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> > Interrupt state tracing can be safely done in C code. The few stack
> > operations in assembly do not need to be covered.
> >
> > Remove the now pointless indirection via .Lsyscall_32_done and jump to
> > swapgs_restore_regs_and_return_to_usermode directly.
> 
> This doesn't look right.
> 
> >  #define SYSCALL_EXIT_WORK_FLAGS                                \
> > @@ -279,6 +282,9 @@ static void syscall_slow_exit_work(struc
> >  {
> >         struct thread_info *ti;
> >
> > +       /* User to kernel transition disabled interrupts. */
> > +       trace_hardirqs_off();
> > +
> 
> So you just traced IRQs off, but...
> 
> >         enter_from_user_mode();
> >         local_irq_enable();
> 
> Now they're on and traced on again?

Yes, because that's what actually happens.

usermode
 syscall		<- Disables interrupts, but tracing thinks they are on
   entry_SYSCALL_64
   ....
   call do_syscall_64

     trace_hardirqs_off() <- So before calling anything else, we have to tell
			     the tracer that interrupts are off, which we did
			     so far in the ASM code between entry_SYSCALL_64
			     and 'call do_syscall_64'. I'm merely lifting this
			     to C code.

     enter_from_user_mode()
     local_irq_enable()
 
> I also don't see how your patch handles the fastpath case.

Hmm?

All syscalls return through:

    syscall_return_slowpath()
        local_irq_disable()
	prepare_exit_to_usermode()
	  user_enter_irqoff()
	  mds_user_clear_cpu_buffers()
	  trace_hardirqs_on()

What am I missing?
 
> How about the attached patch instead?

      	    	^^^^^^ Groan.

>
>  	user_enter_irqoff();
>  
> +	/*
> +	 * The actual return to usermode will almost certainly turn IRQs on.
> +	 * Trace it here to simplify the asm code.

Why would we return to user from a syscall or interrupt with interrupts
traced as disabled? Also the existing ASM is inconsistent vs. that:

ENTRY(entry_SYSENTER_32)        TRACE_IRQS_ON

ENTRY(entry_INT80_32)		TRACE_IRQS_IRET

ENTRY(entry_SYSCALL_64)		TRACE_IRQS_IRET

ENTRY(ret_from_fork)		TRACE_IRQS_ON

GLOBAL(retint_user)		TRACE_IRQS_IRETQ

ENTRY(entry_SYSCALL_compat)	TRACE_IRQS_ON

ENTRY(entry_INT80_compat)	TRACE_IRQS_ON

> +	 */
> +	if (likely(regs->flags & X86_EFLAGS_IF))
> +		trace_hardirqs_on();

My variant does this unconditionally and after
mds_user_clear_cpu_buffers().

> 	mds_user_clear_cpu_buffers();
> }
 
And your ASM changes keep still all the TRACE_IRQS_OFF invocations in the
various syscall entry pathes, which is what I removed and put as the first
thing into the C functions.

Confused.

Thanks,

	tglx


* Re: [patch V2 03/17] x86/traps: Remove pointless irq enable from do_spurious_interrupt_bug()
  2019-10-23 22:49       ` Josh Poimboeuf
@ 2019-10-23 23:18         ` Thomas Gleixner
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 23:18 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: LKML, x86, Peter Zijlstra, Andy Lutomirski, Will Deacon,
	Paolo Bonzini, kvm, linux-arch, Mike Rapoport, Miroslav Benes


On Wed, 23 Oct 2019, Josh Poimboeuf wrote:
> On Thu, Oct 24, 2019 at 12:35:27AM +0200, Thomas Gleixner wrote:
> > On Wed, 23 Oct 2019, Josh Poimboeuf wrote:
> > 
> > > On Wed, Oct 23, 2019 at 02:27:08PM +0200, Thomas Gleixner wrote:
> > > > That function returns immediately after conditionally reenabling interrupts which
> > > > is more than pointless and requires the ASM code to disable interrupts again.
> > > > 
> > > > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > > > ---
> > > >  arch/x86/kernel/traps.c |    1 -
> > > >  1 file changed, 1 deletion(-)
> > > > 
> > > > --- a/arch/x86/kernel/traps.c
> > > > +++ b/arch/x86/kernel/traps.c
> > > > @@ -871,7 +871,6 @@ do_simd_coprocessor_error(struct pt_regs
> > > >  dotraplinkage void
> > > >  do_spurious_interrupt_bug(struct pt_regs *regs, long error_code)
> > > >  {
> > > > -	cond_local_irq_enable(regs);
> > > >  }
> > > 
> > > I think we can just remove this handler altogether.  The Intel and AMD
> > > manuals say vector 15 (X86_TRAP_SPURIOUS) is reserved.
> > 
> > Right, but this has history. Pentium Pro Erratum:
> > 
> >   PROBLEM: If the APIC subsystem is configured in mixed mode with Virtual
> >   Wire mode implemented through the local APIC, an interrupt vector of 0Fh
> >   (Intel reserved encoding) may be generated by the local APIC (Int 15).
> >   This vector may be generated upon receipt of a spurious interrupt (an
> >   interrupt which is removed before the system receives the INTA sequence)
> >   instead of the programmed 8259 spurious interrupt vector.
> > 
> >   IMPLICATION: The spurious interrupt vector programmed in the 8259 is
> >   normally handled by an operating system’s spurious interrupt
> >   handler. However, a vector of 0Fh is unknown to some operating systems,
> >   which would crash if this erratum occurred.
> > 
> > Initially (2.1.) there was a printk() in that handler, which later got
> > ifdeffed out (2.1.54).
> > 
> > So I'd rather keep that thing at least as long as we support PPro :) Even if
> > we ditch that, the handler is not really hurting anyone.
> 
> Ah.  I guess we could remove the idtentry for 64-bit then?  Anyway the
> above would be a good comment for the function.

That doesn't buy much. Who knows how many other CPUs issue vector 15
occasionally and will then crash and burn. Better safe than sorry :)

Thanks,

	tglx



* Re: [patch V2 14/17] entry: Provide generic exit to usermode functionality
  2019-10-23 21:34   ` Andy Lutomirski
@ 2019-10-23 23:20     ` Thomas Gleixner
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 23:20 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: LKML, X86 ML, Peter Zijlstra, Will Deacon, Paolo Bonzini,
	kvm list, linux-arch, Mike Rapoport, Josh Poimboeuf,
	Miroslav Benes

On Wed, 23 Oct 2019, Andy Lutomirski wrote:

> On Wed, Oct 23, 2019 at 5:31 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> > From: Thomas Gleixner <tglx@linutronix.de>
> >
> > Provide a generic facility to handle the exit to usermode work. That's
> > aimed to replace the pointlessly different copies in each architecture.
> 
> 
> >  /**
> > + * local_irq_enable_exit_to_user - Exit to user variant of local_irq_enable()
> > + * @ti_work:   Cached TIF flags gathered with interrupts disabled
> > + *
> > + * Defaults to local_irq_enable(). Can be supplied by architecture specific
> > + * code.
> 
> What did you have in mind here?

Look at the previous version, which had the ARM64 conversion. ARM64 does
its own magic instead of a plain local_irq_enable() in the exit path. It's
not using the regular one. I'm happy to ditch that :)

Thanks,

	tglx


* Re: [patch V2 05/17] x86/traps: Make interrupt enable/disable symmetric in C code
  2019-10-23 22:01   ` Josh Poimboeuf
@ 2019-10-23 23:23     ` Thomas Gleixner
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 23:23 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: LKML, x86, Peter Zijlstra, Andy Lutomirski, Will Deacon,
	Paolo Bonzini, kvm, linux-arch, Mike Rapoport, Miroslav Benes

On Wed, 23 Oct 2019, Josh Poimboeuf wrote:
> On Wed, Oct 23, 2019 at 02:27:10PM +0200, Thomas Gleixner wrote:
> > --- a/arch/x86/mm/fault.c
> > +++ b/arch/x86/mm/fault.c
> > @@ -1500,10 +1500,13 @@ static noinline void
> >  		return;
> >  
> >  	/* Was the fault on kernel-controlled part of the address space? */
> > -	if (unlikely(fault_in_kernel_space(address)))
> > +	if (unlikely(fault_in_kernel_space(address))) {
> >  		do_kern_addr_fault(regs, hw_error_code, address);
> > -	else
> > +	} else {
> >  		do_user_addr_fault(regs, hw_error_code, address);
> > +		if (regs->flags & X86_EFLAGS_IF)
> > +			local_irq_disable();
> > +	}
> 
> The corresponding irq enable is in do_user_addr_fault(), why not do the
> disable there?

Yeah, will do. I was just as lazy as Peter and did not want to touch the
gazillion returns. :)

Thanks,

	tglx


* Re: [patch V2 08/17] x86/entry: Move syscall irq tracing to C code
  2019-10-23 21:35     ` Andy Lutomirski
@ 2019-10-23 23:31       ` Thomas Gleixner
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 23:31 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: LKML, X86 ML, Peter Zijlstra, Will Deacon, Paolo Bonzini,
	kvm list, linux-arch, Mike Rapoport, Josh Poimboeuf,
	Miroslav Benes

On Wed, 23 Oct 2019, Andy Lutomirski wrote:
> On Wed, Oct 23, 2019 at 2:30 PM Andy Lutomirski <luto@kernel.org> wrote:
> >
> > On Wed, Oct 23, 2019 at 5:31 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> > >
> > > Interrupt state tracing can be safely done in C code. The few stack
> > > operations in assembly do not need to be covered.
> > >
> > > Remove the now pointless indirection via .Lsyscall_32_done and jump to
> > > swapgs_restore_regs_and_return_to_usermode directly.
> >
> > This doesn't look right.
> >
> > >  #define SYSCALL_EXIT_WORK_FLAGS                                \
> > > @@ -279,6 +282,9 @@ static void syscall_slow_exit_work(struc
> > >  {
> > >         struct thread_info *ti;
> > >
> > > +       /* User to kernel transition disabled interrupts. */
> > > +       trace_hardirqs_off();
> > > +
> >
> > So you just traced IRQs off, but...
> >
> > >         enter_from_user_mode();
> > >         local_irq_enable();
> >
> > Now they're on and traced on again?
> >
> > I also don't see how your patch handles the fastpath case.
> >
> > How about the attached patch instead?
> 
> Ignore the attached patch.  You have this in your
> do_exit_to_usermode() later in the series.  But I'm still quite
> confused by this patch.

What's confusing you? It basically does:

  ENTRY(syscall/int80)

-	TRACE_IRQS_OFF
	call C-syscall*()
-	TRACE_IRQS_ON/IRET

and

C-syscall*()

+       trace_hardirqs_off()		<- first action
	....
	prepare_exit_to_usermode()	<- last action
	return

and

prepare_exit_to_usermode()
	....
+       trace_hardirqs_on()		<- last action
	return

So this is exactly the same as the ASM today.

The only change is that I made it do trace_hardirqs_on() unconditionally
for consistency reasons.

I tried to split it into bits and pieces, but failed to come up with
something sensible. Let me try again tomorrow.

Thanks,

	tglx


* Re: [patch V2 07/17] x86/entry/64: Remove redundant interrupt disable
  2019-10-23 22:06   ` Josh Poimboeuf
@ 2019-10-23 23:52     ` Thomas Gleixner
  2019-10-24 16:18       ` Andy Lutomirski
  0 siblings, 1 reply; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-23 23:52 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: LKML, x86, Peter Zijlstra, Andy Lutomirski, Will Deacon,
	Paolo Bonzini, kvm, linux-arch, Mike Rapoport, Miroslav Benes

On Wed, 23 Oct 2019, Josh Poimboeuf wrote:

> On Wed, Oct 23, 2019 at 02:27:12PM +0200, Thomas Gleixner wrote:
> > Now that the trap handlers return with interrupts disabled, the
> > unconditional disabling of interrupts in the low level entry code can be
> > removed along with the trace calls.
> > 
> > Add debug checks where appropriate.
> 
> This seems a little scary.  Does anybody other than Andy actually run
> with CONFIG_DEBUG_ENTRY?

I do.

> What happens if somebody accidentally leaves irqs enabled?  How do we
> know you found all the leaks?

For the DO_ERROR() ones that's trivial:

 #define DO_ERROR(trapnr, signr, sicode, addr, str, name)                  \
 dotraplinkage void do_##name(struct pt_regs *regs, long error_code)	   \
 {									   \
 	do_error_trap(regs, error_code, str, trapnr, signr, sicode, addr); \
+	lockdep_assert_irqs_disabled();					   \
 }
 
 DO_ERROR(X86_TRAP_DE,     SIGFPE,  FPE_INTDIV,   IP, "divide error",        divide_error)

Now for the rest we surely could do:

dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
{
	__do_bounds(regs, error_code);
	lockdep_assert_irqs_disabled();
}

and move the existing body into a static function so that, independent of any
(future) return path, the lockdep assert will be invoked.

Thanks,

	tglx



* Re: [patch V2 07/17] x86/entry/64: Remove redundant interrupt disable
  2019-10-23 23:52     ` Thomas Gleixner
@ 2019-10-24 16:18       ` Andy Lutomirski
  2019-10-24 20:52         ` Thomas Gleixner
  0 siblings, 1 reply; 64+ messages in thread
From: Andy Lutomirski @ 2019-10-24 16:18 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Josh Poimboeuf, LKML, X86 ML, Peter Zijlstra, Andy Lutomirski,
	Will Deacon, Paolo Bonzini, kvm list, linux-arch, Mike Rapoport,
	Miroslav Benes

On Wed, Oct 23, 2019 at 4:52 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Wed, 23 Oct 2019, Josh Poimboeuf wrote:
>
> > On Wed, Oct 23, 2019 at 02:27:12PM +0200, Thomas Gleixner wrote:
> > > Now that the trap handlers return with interrupts disabled, the
> > > unconditional disabling of interrupts in the low level entry code can be
> > > removed along with the trace calls.
> > >
> > > Add debug checks where appropriate.
> >
> > This seems a little scary.  Does anybody other than Andy actually run
> > with CONFIG_DEBUG_ENTRY?
>
> I do.
>
> > What happens if somebody accidentally leaves irqs enabled?  How do we
> > know you found all the leaks?
>
> For the DO_ERROR() ones that's trivial:
>
>  #define DO_ERROR(trapnr, signr, sicode, addr, str, name)                  \
>  dotraplinkage void do_##name(struct pt_regs *regs, long error_code)       \
>  {                                                                         \
>         do_error_trap(regs, error_code, str, trapnr, signr, sicode, addr); \
> +       lockdep_assert_irqs_disabled();                                    \
>  }
>
>  DO_ERROR(X86_TRAP_DE,     SIGFPE,  FPE_INTDIV,   IP, "divide error",        divide_error)
>
> Now for the rest we surely could do:
>
> dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
> {
>         __do_bounds(regs, error_code);
>         lockdep_assert_irqs_disabled();
> }
>
> and move the existing body into a static function so that, independent of any
> (future) return path, the lockdep assert will be invoked.
>

If we do this, can we macro-ize it:

DEFINE_IDTENTRY_HANDLER(do_bounds)
{
 ...
}

If you do this, please don't worry about the weird ones that take cr2
as a third argument.  Once your series lands, I will send a follow-up
to get rid of it.  It's 2/3 written already.


* Re: [patch V2 08/17] x86/entry: Move syscall irq tracing to C code
  2019-10-23 21:30   ` Andy Lutomirski
  2019-10-23 21:35     ` Andy Lutomirski
  2019-10-23 23:16     ` Thomas Gleixner
@ 2019-10-24 16:24     ` Andy Lutomirski
  2019-10-24 17:40       ` Peter Zijlstra
  2 siblings, 1 reply; 64+ messages in thread
From: Andy Lutomirski @ 2019-10-24 16:24 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Thomas Gleixner, LKML, X86 ML, Peter Zijlstra, Will Deacon,
	Paolo Bonzini, kvm list, linux-arch, Mike Rapoport,
	Josh Poimboeuf, Miroslav Benes

On Wed, Oct 23, 2019 at 2:30 PM Andy Lutomirski <luto@kernel.org> wrote:
>
> On Wed, Oct 23, 2019 at 5:31 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> > Interrupt state tracing can be safely done in C code. The few stack
> > operations in assembly do not need to be covered.
> >
> > Remove the now pointless indirection via .Lsyscall_32_done and jump to
> > swapgs_restore_regs_and_return_to_usermode directly.
>
> This doesn't look right.

Well, I feel a bit silly.  I read this:

>
> >  #define SYSCALL_EXIT_WORK_FLAGS                                \
> > @@ -279,6 +282,9 @@ static void syscall_slow_exit_work(struc

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

and I applied the diff in my head to the wrong function, and I didn't
notice that it didn't really apply there.  Oddly, gitweb gets this
right:

https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=WIP.core/entry&id=e3158f93138ded84eb44fa97606197f6adcf9366

Looking at the actual code:

Acked-by: Andy Lutomirski <luto@kernel.org>

with one minor caveat: you are making a subtle and mostly irrelevant
semantic change: with your patch, user mode will be traced as IRQs on
even if a nasty user has used iopl() to turn off interrupts.  This is
probably a good thing, but I think you should mention it in the
changelog.

FWIW, the rest of the series looks pretty good, too.


* Re: [patch V2 08/17] x86/entry: Move syscall irq tracing to C code
  2019-10-24 16:24     ` Andy Lutomirski
@ 2019-10-24 17:40       ` Peter Zijlstra
  2019-10-24 20:54         ` Thomas Gleixner
  0 siblings, 1 reply; 64+ messages in thread
From: Peter Zijlstra @ 2019-10-24 17:40 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Thomas Gleixner, LKML, X86 ML, Will Deacon, Paolo Bonzini,
	kvm list, linux-arch, Mike Rapoport, Josh Poimboeuf,
	Miroslav Benes

On Thu, Oct 24, 2019 at 09:24:13AM -0700, Andy Lutomirski wrote:
> On Wed, Oct 23, 2019 at 2:30 PM Andy Lutomirski <luto@kernel.org> wrote:
> >
> > On Wed, Oct 23, 2019 at 5:31 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> > >
> > > Interrupt state tracing can be safely done in C code. The few stack
> > > operations in assembly do not need to be covered.
> > >
> > > Remove the now pointless indirection via .Lsyscall_32_done and jump to
> > > swapgs_restore_regs_and_return_to_usermode directly.
> >
> > This doesn't look right.
> 
> Well, I feel a bit silly.  I read this:
> 
> >
> > >  #define SYSCALL_EXIT_WORK_FLAGS                                \
> > > @@ -279,6 +282,9 @@ static void syscall_slow_exit_work(struc
> 
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> and I applied the diff in my head to the wrong function, and I didn't
> notice that it didn't really apply there.  Oddly, gitweb gets this

I had the same when reviewing these patches; I was almost going to ask
tglx about it on IRC when the penny dropped.


* Re: [patch V2 07/17] x86/entry/64: Remove redundant interrupt disable
  2019-10-24 16:18       ` Andy Lutomirski
@ 2019-10-24 20:52         ` Thomas Gleixner
  2019-10-24 20:59           ` Thomas Gleixner
                             ` (2 more replies)
  0 siblings, 3 replies; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-24 20:52 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Josh Poimboeuf, LKML, X86 ML, Peter Zijlstra, Will Deacon,
	Paolo Bonzini, kvm list, linux-arch, Mike Rapoport,
	Miroslav Benes

On Thu, 24 Oct 2019, Andy Lutomirski wrote:
> On Wed, Oct 23, 2019 at 4:52 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> > On Wed, 23 Oct 2019, Josh Poimboeuf wrote:
> > > What happens if somebody accidentally leaves irqs enabled?  How do we
> > > know you found all the leaks?
> >
> > For the DO_ERROR() ones that's trivial:
> >
> >  #define DO_ERROR(trapnr, signr, sicode, addr, str, name)                  \
> >  dotraplinkage void do_##name(struct pt_regs *regs, long error_code)       \
> >  {                                                                         \
> >         do_error_trap(regs, error_code, str, trapnr, signr, sicode, addr); \
> > +       lockdep_assert_irqs_disabled();                                    \
> >  }
> >
> >  DO_ERROR(X86_TRAP_DE,     SIGFPE,  FPE_INTDIV,   IP, "divide error",        divide_error)
> >
> > Now for the rest we surely could do:
> >
> > dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
> > {
> >         __do_bounds(regs, error_code);
> >         lockdep_assert_irqs_disabled();
> > }
> >
> > and move the existing body into a static function so independent of any
> > (future) return path there the lockdep assert will be invoked.
> >
> 
> If we do this, can we macro-ize it:
> 
> DEFINE_IDTENTRY_HANDLER(do_bounds)
> {
>  ...
> }
>  
> If you do this, please don't worry about the weird ones that take cr2
> as a third argument.  Once your series lands, I will send a follow-up
> to get rid of it.  It's 2/3 written already.

I spent quite some time digging deeper into this. Finding all corner cases
which eventually enable interrupts from an exception handler is not as
trivial as it looked in the first place. Especially the fault handler is a
nightmare. Also PeterZ's approach of doing

	   if (regs->eflags & IF)
	   	local_irq_disable();

is doomed due to sys_iopl(). See below.

I'm tempted to do pretty much the same thing as the syscall rework did
as a first step:

  - Move the actual handler invocation to C

  - Do the irq tracing on entry in C

  - Move irq disable before return to ASM

Peter gave me some half-finished patches which pretty much do that by
copying half of the linux/syscalls.h macro maze into the entry code. That's
one possible solution, but TBH it sucks big time.

We have the following variants:

do_divide_error(struct pt_regs *regs, long error_code);
do_debug(struct pt_regs *regs, long error_code);
do_nmi(struct pt_regs *regs, long error_code);
do_int3(struct pt_regs *regs, long error_code);
do_overflow(struct pt_regs *regs, long error_code);
do_bounds(struct pt_regs *regs, long error_code);
do_invalid_op(struct pt_regs *regs, long error_code);
do_device_not_available(struct pt_regs *regs, long error_code);
do_coprocessor_segment_overrun(struct pt_regs *regs, long error_code);
do_invalid_TSS(struct pt_regs *regs, long error_code);
do_segment_not_present(struct pt_regs *regs, long error_code);
do_stack_segment(struct pt_regs *regs, long error_code);
do_general_protection(struct pt_regs *regs, long error_code);
do_spurious_interrupt_bug(struct pt_regs *regs, long error_code);
do_coprocessor_error(struct pt_regs *regs, long error_code);
do_alignment_check(struct pt_regs *regs, long error_code);
do_machine_check(struct pt_regs *regs, long error_code);
do_simd_coprocessor_error(struct pt_regs *regs, long error_code);
do_iret_error(struct pt_regs *regs, long error_code);
do_mce(struct pt_regs *regs, long error_code);

do_async_page_fault(struct pt_regs *regs, unsigned long error_code, unsigned long address);
do_double_fault(struct pt_regs *regs, long error_code, unsigned long address);
do_page_fault(struct pt_regs *regs, unsigned long error_code, unsigned long address);

So if we can remove the third argument then we can spare most of the macro
maze and just have one common function without bells and whistles. The
other option would be to extend all handlers to have three arguments,
i.e. add 'long unused', which is not pretty either.

What's your plan with cr2? Stash it in pt_regs or something else?

Once we have the interesting parts in C then we can revisit the elimination
of the unconditional irq disable because in C it's way simpler to do
diagnostics, but I'm not entirely sure whether it's worth it.

A related issue is the inconsistency of the irq disabled tracing in the
return to user path. As I pointed out in the other mail, the various
syscall implementations do that differently. The exception handlers always
do it conditionally, as do regular interrupts. For regular interrupts that
does not make sense as they can never return to an interrupt-disabled
context.

The interesting bells and whistles result from sys_iopl(). If user space
has been granted iopl(level = 3) it gains cli/sti privileges. When the
application has interrupts disabled in userspace:

  - invocation of a syscall

  - any exception (aside from NMI/MCE) which conditionally enables interrupts
    depending on user_mode(regs) and therefore can be preempted and
    schedule

is just undefined behaviour and I personally consider it to be a plain bug.

Just for the record: This results in running the resulting signal handler,
or even a completely unrelated one, with interrupts disabled as well.

Whatever we decide it is, leaving it completely inconsistent is not a
solution at all. The options are:

  1)  Always do conditional tracing depending on the user_regs->eflags.IF
      state.

  2)  #1 + warn once when syscalls and exceptions (except NMI/MCE) happen
      and user_regs->eflags.IF is cleared.

  3a) #2 + enforce signal handling to run with interrupts enabled.

  3b) #2 + set regs->eflags.IF. So the state is always correct from the
      kernel POV. Of course that changes existing behaviour, but it's
      changing undefined and inconsistent behaviour.
  
  4) Let iopl(level) return -EPERM if level == 3.

     Yeah, I know it's not possible due to regressions (DPDK uses iopl(3)),
     but TBH that'd be the sanest option of all.

     Of course the infinite wisdom of hardware designers tied IN, INS, OUT,
     OUTS and CLI/STI together on IOPL so we cannot even disentangle them in
     any way.

     The only way out would be to actually use a full 8K sized I/O bitmap,
     but that's a massive pain as it has to be copied on every context
     switch. 

Really pretty options to choose from ...

Thanks,

	tglx


* Re: [patch V2 08/17] x86/entry: Move syscall irq tracing to C code
  2019-10-24 17:40       ` Peter Zijlstra
@ 2019-10-24 20:54         ` Thomas Gleixner
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-24 20:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andy Lutomirski, LKML, X86 ML, Will Deacon, Paolo Bonzini,
	kvm list, linux-arch, Mike Rapoport, Josh Poimboeuf,
	Miroslav Benes

On Thu, 24 Oct 2019, Peter Zijlstra wrote:
> On Thu, Oct 24, 2019 at 09:24:13AM -0700, Andy Lutomirski wrote:
> > On Wed, Oct 23, 2019 at 2:30 PM Andy Lutomirski <luto@kernel.org> wrote:
> > >
> > > On Wed, Oct 23, 2019 at 5:31 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> > > >
> > > > Interrupt state tracing can be safely done in C code. The few stack
> > > > operations in assembly do not need to be covered.
> > > >
> > > > Remove the now pointless indirection via .Lsyscall_32_done and jump to
> > > > swapgs_restore_regs_and_return_to_usermode directly.
> > >
> > > This doesn't look right.
> > 
> > Well, I feel a bit silly.  I read this:

Happened to me before. Don't worry.

> > >
> > > >  #define SYSCALL_EXIT_WORK_FLAGS                                \
> > > > @@ -279,6 +282,9 @@ static void syscall_slow_exit_work(struc
> > 
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > 
> > and I applied the diff in my head to the wrong function, and I didn't
> > notice that it didn't really apply there.  Oddly, gitweb gets this
> 
> I had the same when reviewing these patches; I was almost going to ask
> tglx about it on IRC when the penny dropped.

diff is weird at times.

Thanks,

	tglx



* Re: [patch V2 07/17] x86/entry/64: Remove redundant interrupt disable
  2019-10-24 20:52         ` Thomas Gleixner
@ 2019-10-24 20:59           ` Thomas Gleixner
  2019-10-24 21:21           ` Peter Zijlstra
  2019-10-24 21:24           ` Andy Lutomirski
  2 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-24 20:59 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Josh Poimboeuf, LKML, X86 ML, Peter Zijlstra, Will Deacon,
	Paolo Bonzini, kvm list, linux-arch, Mike Rapoport,
	Miroslav Benes

On Thu, 24 Oct 2019, Thomas Gleixner wrote:
> Whatever we decide it is, leaving it completely inconsistent is not a
> solution at all. The options are:

Actually there is also:

    0) Always do unconditional trace_irqs_on().

       But that does not allow actually tracing the real return flags
       state, which might be useful to diagnose crap resulting from user
       space CLI.
 
>   1)  Always do conditional tracing depending on the user_regs->eflags.IF
>       state.
> 
>   2)  #1 + warn once when syscalls and exceptions (except NMI/MCE) happen
>       and user_regs->eflags.IF is cleared.
> 
>   3a) #2 + enforce signal handling to run with interrupts enabled.
> 
>   3b) #2 + set regs->eflags.IF. So the state is always correct from the
>       kernel POV. Of course that changes existing behaviour, but it's
>       changing undefined and inconsistent behaviour.
>   
>   4) Let iopl(level) return -EPERM if level == 3.
> 
>      Yeah, I know it's not possible due to regressions (DPDK uses iopl(3)),
>      but TBH that'd be the sanest option of all.
> 
>      Of course the infinite wisdom of hardware designers tied IN, INS, OUT,
>      OUTS and CLI/STI together on IOPL so we cannot even disentangle them in
>      any way.
> 
>      The only way out would be to actually use a full 8K sized I/O bitmap,
>      but that's a massive pain as it has to be copied on every context
>      switch. 
> 
> Really pretty options to choose from ...
> 
> Thanks,
> 
> 	tglx
> 


* Re: [patch V2 07/17] x86/entry/64: Remove redundant interrupt disable
  2019-10-24 20:52         ` Thomas Gleixner
  2019-10-24 20:59           ` Thomas Gleixner
@ 2019-10-24 21:21           ` Peter Zijlstra
  2019-10-24 21:24           ` Andy Lutomirski
  2 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2019-10-24 21:21 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Andy Lutomirski, Josh Poimboeuf, LKML, X86 ML, Will Deacon,
	Paolo Bonzini, kvm list, linux-arch, Mike Rapoport,
	Miroslav Benes

On Thu, Oct 24, 2019 at 10:52:59PM +0200, Thomas Gleixner wrote:
> is just undefined behaviour and I personally consider it to be a plain bug.

I concur.

> Just for the record: This results in running the resulting signal handler,
> or even a completely unrelated one, with interrupts disabled as well.
> 
> Whatever we decide it is, leaving it completely inconsistent is not a
> solution at all. The options are:
> 
>   1)  Always do conditional tracing depending on the user_regs->eflags.IF
>       state.
> 
>   2)  #1 + warn once when syscalls and exceptions (except NMI/MCE) happen
>       and user_regs->eflags.IF is cleared.
> 
>   3a) #2 + enforce signal handling to run with interrupts enabled.
> 
>   3b) #2 + set regs->eflags.IF. So the state is always correct from the
>       kernel POV. Of course that changes existing behaviour, but it's
>       changing undefined and inconsistent behaviour.
>   
>   4) Let iopl(level) return -EPERM if level == 3.
> 
>      Yeah, I know it's not possible due to regressions (DPDK uses iopl(3)),
>      but TBH that'd be the sanest option of all.
> 
>      Of course the infinite wisdom of hardware designers tied IN, INS, OUT,
>      OUTS and CLI/STI together on IOPL so we cannot even disentangle them in
>      any way.
> 
>      The only way out would be to actually use a full 8K sized I/O bitmap,
>      but that's a massive pain as it has to be copied on every context
>      switch. 
> 
> Really pretty options to choose from ...

If 4 is out (and I'm afraid it might be), then I'm on record for liking
3b.


* Re: [patch V2 07/17] x86/entry/64: Remove redundant interrupt disable
  2019-10-24 20:52         ` Thomas Gleixner
  2019-10-24 20:59           ` Thomas Gleixner
  2019-10-24 21:21           ` Peter Zijlstra
@ 2019-10-24 21:24           ` Andy Lutomirski
  2019-10-24 22:33             ` Thomas Gleixner
  2 siblings, 1 reply; 64+ messages in thread
From: Andy Lutomirski @ 2019-10-24 21:24 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Andy Lutomirski, Josh Poimboeuf, LKML, X86 ML, Peter Zijlstra,
	Will Deacon, Paolo Bonzini, kvm list, linux-arch, Mike Rapoport,
	Miroslav Benes

On Thu, Oct 24, 2019 at 1:53 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Thu, 24 Oct 2019, Andy Lutomirski wrote:
> > On Wed, Oct 23, 2019 at 4:52 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> > > On Wed, 23 Oct 2019, Josh Poimboeuf wrote:
> > > > What happens if somebody accidentally leaves irqs enabled?  How do we
> > > > know you found all the leaks?
> > >
> > > For the DO_ERROR() ones that's trivial:
> > >
> > >  #define DO_ERROR(trapnr, signr, sicode, addr, str, name)                  \
> > >  dotraplinkage void do_##name(struct pt_regs *regs, long error_code)       \
> > >  {                                                                         \
> > >         do_error_trap(regs, error_code, str, trapnr, signr, sicode, addr); \
> > > +       lockdep_assert_irqs_disabled();                                    \
> > >  }
> > >
> > >  DO_ERROR(X86_TRAP_DE,     SIGFPE,  FPE_INTDIV,   IP, "divide error",        divide_error)
> > >
> > > Now for the rest we surely could do:
> > >
> > > dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
> > > {
> > >         __do_bounds(regs, error_code);
> > >         lockdep_assert_irqs_disabled();
> > > }
> > >
> > > and move the existing body into a static function so independent of any
> > > (future) return path there the lockdep assert will be invoked.
> > >
> >
> > If we do this, can we macro-ize it:
> >
> > DEFINE_IDTENTRY_HANDLER(do_bounds)
> > {
> >  ...
> > }
> >
> > If you do this, please don't worry about the weird ones that take cr2
> > as a third argument.  Once your series lands, I will send a follow-up
> > to get rid of it.  It's 2/3 written already.
>
> I spent quite some time digging deeper into this. Finding all corner cases
> which eventually enable interrupts from an exception handler is not as
> trivial as it looked in the first place. Especially the fault handler is a
> nightmare. Also PeterZ's approach of doing
>
>            if (regs->eflags & IF)
>                 local_irq_disable();
>
> is doomed due to sys_iopl(). See below.

I missed something in the discussion.  What breaks?  Can you check
user_mode(regs) too?

>
> I'm tempted to do pretty much the same thing as the syscall rework did
> as a first step:
>
>   - Move the actual handler invocation to C
>
>   - Do the irq tracing on entry in C
>
>   - Move irq disable before return to ASM
>
> Peter gave me some half-finished patches which pretty much do that by
> copying half of the linux/syscalls.h macro maze into the entry code. That's
> one possible solution, but TBH it sucks big time.
>
> We have the following variants:
>
> do_divide_error(struct pt_regs *regs, long error_code);
> do_debug(struct pt_regs *regs, long error_code);
> do_nmi(struct pt_regs *regs, long error_code);
> do_int3(struct pt_regs *regs, long error_code);
> do_overflow(struct pt_regs *regs, long error_code);
> do_bounds(struct pt_regs *regs, long error_code);
> do_invalid_op(struct pt_regs *regs, long error_code);
> do_device_not_available(struct pt_regs *regs, long error_code);
> do_coprocessor_segment_overrun(struct pt_regs *regs, long error_code);
> do_invalid_TSS(struct pt_regs *regs, long error_code);
> do_segment_not_present(struct pt_regs *regs, long error_code);
> do_stack_segment(struct pt_regs *regs, long error_code);
> do_general_protection(struct pt_regs *regs, long error_code);
> do_spurious_interrupt_bug(struct pt_regs *regs, long error_code);
> do_coprocessor_error(struct pt_regs *regs, long error_code);
> do_alignment_check(struct pt_regs *regs, long error_code);
> do_machine_check(struct pt_regs *regs, long error_code);
> do_simd_coprocessor_error(struct pt_regs *regs, long error_code);
> do_iret_error(struct pt_regs *regs, long error_code);
> do_mce(struct pt_regs *regs, long error_code);
>
> do_async_page_fault(struct pt_regs *regs, unsigned long error_code, unsigned long address);
> do_double_fault(struct pt_regs *regs, long error_code, unsigned long address);
> do_page_fault(struct pt_regs *regs, unsigned long error_code, unsigned long address);
>
> So if we can remove the third argument then we can spare most of the macro
> maze and just have one common function without bells and whistles. The
> other option would be to extend all handlers to have three arguments,
> i.e. add 'long unused', which is not pretty either.
>
> What's your plan with cr2? Stash it in pt_regs or something else?

Just read it from CR2.  I added a new idtentry macro arg called
"entry_work", and setting it to 0 causes the enter_from_user_mode to
be skipped.  Then C code calls enter_from_user_mode() after reading
CR2 (and DR7).  WIP code is here:

https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/log/?h=x86/idtentry

The idea is that, if everything is converted, then we get rid of the
entry_work=1 case, which is easier if there's a macro.

So my suggestion is to use a macro for the 2-arg version and open-code
all the 3-arg cases.  Then, when the dust settles, we get rid of the
third arg and they can use the macro.

>
> The interesting bells and whistles result from sys_iopl(). If user space
> has been granted iopl(level = 3) it gains cli/sti privileges. When the
> application has interrupts disabled in userspace:
>
>   - invocation of a syscall
>
>   - any exception (aside from NMI/MCE) which conditionally enables interrupts
>     depending on user_mode(regs) and therefore can be preempted and
>     schedule
>
> is just undefined behaviour and I personally consider it to be a plain bug.
>
> Just for the record: This results in running the resulting signal handler,
> or even a completely unrelated one, with interrupts disabled as well.

I am seriously tempted to say that the solution is to remove iopl(),
at least on 64-bit kernels.  Doing STI in user mode is BS :)

Otherwise we need to give it semantics, no?  I personally have no
actual problem with the fact that an NMI can cause scheduling to
happen.  Big fscking deal.

>
> Whatever we decide it is, leaving it completely inconsistent is not a
> solution at all. The options are:
>
>   1)  Always do conditional tracing depending on the user_regs->eflags.IF
>       state.

I'm okay with always tracing like user mode means IRQs on or doing it
"correctly".  I consider the former to be simpler and therefore quite
possibly better.

>
>   2)  #1 + warn once when syscalls and exceptions (except NMI/MCE) happen
>       and user_regs->eflags.IF is cleared.
>
>   3a) #2 + enforce signal handling to run with interrupts enabled.
>
>   3b) #2 + set regs->eflags.IF. So the state is always correct from the
>       kernel POV. Of course that changes existing behaviour, but it's
>       changing undefined and inconsistent behaviour.
>
>   4) Let iopl(level) return -EPERM if level == 3.
>
>      Yeah, I know it's not possible due to regressions (DPDK uses iopl(3)),
>      but TBH that'd be the sanest option of all.
>
>      Of course the infinite wisdom of hardware designers tied IN, INS, OUT,
>      OUTS and CLI/STI together on IOPL so we cannot even disentangle them in
>      any way.

>
>      The only way out would be to actually use a full 8K sized I/O bitmap,
>      but that's a massive pain as it has to be copied on every context
>      switch.

Hmm.  This actually doesn't seem that bad.  We already have a TIF_
flag to optimize this.  So basically iopl() would effectively become
ioperm(everything on).


* Re: [patch V2 07/17] x86/entry/64: Remove redundant interrupt disable
  2019-10-24 21:24           ` Andy Lutomirski
@ 2019-10-24 22:33             ` Thomas Gleixner
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2019-10-24 22:33 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Josh Poimboeuf, LKML, X86 ML, Peter Zijlstra, Will Deacon,
	Paolo Bonzini, kvm list, linux-arch, Mike Rapoport,
	Miroslav Benes

On Thu, 24 Oct 2019, Andy Lutomirski wrote:
> On Thu, Oct 24, 2019 at 1:53 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> > I spent quite some time digging deeper into this. Finding all corner cases
> > which eventually enable interrupts from an exception handler is not as
> > trivial as it looked in the first place. Especially the fault handler is a
> > nightmare. Also PeterZ's approach of doing
> >
> >            if (regs->eflags & IF)
> >                 local_irq_disable();
> >
> > is doomed due to sys_iopl(). See below.
> 
> I missed something in the discussion.  What breaks?

Assume user space has issued CLI; then the above check gives the wrong
answer because it assumes that all faults in user mode have IF set.

> Can you check user_mode(regs) too?

Yes, but I still hate it with a passion :)

> > What's your plan with cr2? Stash it in pt_regs or something else?
> 
> Just read it from CR2.  I added a new idtentry macro arg called
> "entry_work", and setting it to 0 causes the enter_from_user_mode to
> be skipped.  Then C code calls enter_from_user_mode() after reading
> CR2 (and DR7).  WIP code is here:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/log/?h=x86/idtentry
> 
> The idea is that, if everything is converted, then we get rid of the
> entry_work=1 case, which is easier if there's a macro.
> 
> So my suggestion is to use a macro for the 2-arg version and open-code
> all the 3-arg cases.  Then, when the dust settles, we get rid of the
> third arg and they can use the macro.

I'll have a look tomorrow with brain awake.
 
> > The interesting bells and whistles result from sys_iopl(). If user space
> > has been granted iopl(level = 3) it gains cli/sti privileges. When the
> > application has interrupts disabled in userspace:
> >
> >   - invocation of a syscall
> >
> >   - any exception (aside from NMI/MCE) which conditionally enables interrupts
> >     depending on user_mode(regs) and therefore can be preempted and
> >     schedule
> >
> > is just undefined behaviour and I personally consider it to be a plain bug.
> >
> > Just for the record: This results in running the resulting signal handler,
> > or even a completely unrelated one, with interrupts disabled as well.
> 
> I am seriously tempted to say that the solution is to remove iopl(),
> at least on 64-bit kernels.  Doing STI in user mode is BS :)

STI would be halfway sane. CLI is the problem. And yes, I agree it's BS :)

> Otherwise we need to give it semantics, no?  I personally have no
> actual problem with the fact that an NMI can cause scheduling to
> happen.  Big fscking deal.

Right, I don't care either. Nor do I care that any exception/syscall
which hits a user space CLI region might schedule. It's been that way
forever.

But giving this semantics is insanely hard, at least if you want sensible,
useful, consistent and understandable semantics. I know that's overrated.

> > Whatever we decide it is, leaving it completely inconsistent is not a
> > solution at all. The options are:
> >
> >   1)  Always do conditional tracing depending on the user_regs->eflags.IF
> >       state.
> 
> I'm okay with always tracing like user mode means IRQs on or doing it
> "correctly".  I consider the former to be simpler and therefore quite
> possibly better.
> 
> >
> >   2)  #1 + warn once when syscalls and exceptions (except NMI/MCE) happen
> >       and user_regs->eflags.IF is cleared.
> >
> >   3a) #2 + enforce signal handling to run with interrupts enabled.
> >
> >   3b) #2 + set regs->eflags.IF. So the state is always correct from the
> >       kernel POV. Of course that changes existing behaviour, but it's
> >       changing undefined and inconsistent behaviour.
> >
> >   4) Let iopl(level) return -EPERM if level == 3.
> >
> >      Yeah, I know it's not possible due to regressions (DPDK uses iopl(3)),
> >      but TBH that'd be the sanest option of all.
> >
> >      Of course the infinite wisdom of hardware designers tied IN, INS, OUT,
> >      OUTS and CLI/STI together on IOPL so we cannot even disentangle them in
> >      any way.
> 
> >
> >      The only way out would be to actually use a full 8K sized I/O bitmap,
> >      but that's a massive pain as it has to be copied on every context
> >      switch.
> 
> Hmm.  This actually doesn't seem that bad.  We already have a TIF_
> flag to optimize this.  So basically iopl() would effectively become
> ioperm(everything on).

Yes, and the insane user space would:

     1) Pay the latency price for copying 8K bitmap on every context switch
     	IN

     2) Inflict latency on the next task due to requiring memset of 8K
     	bitmap on every context switch OUT

     3) #GP when issuing CLI/STI

I personally have no problem with that. #1 and #3 are sane and as iopl()
requires CAP_SYS_RAWIO, it's not available to Joe User, so the sysadmin is
responsible for eventual issues resulting from #2.

Though the no-regression hammer might pound on #3 as it breaks random
engineering trainwrecks from hell.

#1/#2 could be easily mitigated though.

      struct tss_struct {
      	struct x86_hw_tss       x86_tss;
	unsigned long           io_bitmap[IO_BITMAP_LONGS + 1];
      };

and x86_tss has

    u16	io_bitmap_base;

which is either set to

  INVALID_IO_BITMAP_OFFSET ( 0x8000 )

or

  IO_BITMAP_OFFSET
    (offsetof(struct tss_struct, io_bitmap) - offsetof(struct tss_struct, x86_tss))

So we could add

	unsigned long           io_bitmap_all[IO_BITMAP_LONGS + 1];

and just set the base to this one.

But that involves also upping __KERNEL_TSS_LIMIT. Too tired to think about
the implications of that right now.

Thanks,

	tglx


* Re: [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work
  2019-10-23 12:27 [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Thomas Gleixner
                   ` (18 preceding siblings ...)
  2019-10-23 21:20 ` Josh Poimboeuf
@ 2019-10-29 11:28 ` Will Deacon
  19 siblings, 0 replies; 64+ messages in thread
From: Will Deacon @ 2019-10-29 11:28 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Peter Zijlstra, Andy Lutomirski, Paolo Bonzini, kvm,
	linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes

Hi Thomas,

On Wed, Oct 23, 2019 at 02:27:05PM +0200, Thomas Gleixner wrote:
> When working on a way to move the posix cpu timer expiry out of the
> timer interrupt context, I noticed that KVM is not handling pending task
> work before entering a guest. A quick hack was to add that to the x86 KVM
> handling loop. The discussion ended with a request to turn this into generic
> infrastructure, which also means moving the per-arch implementations of
> the enter-from and return-to user space handling into generic code.
> 
>   https://lore.kernel.org/r/89E42BCC-47A8-458B-B06A-D6A20D20512C@amacapital.net
> 
> The series implements the syscall enter/exit and the general exit to
> userspace work handling along with the pre guest enter functionality.
> 
> Changes vs. RFC version:
> 
>   - Dropped ARM64 conversion as requested by ARM64 folks

If you fancy another crack at arm64 on your way back from Lyon, we've now
got more of the asm->C conversion queued up here:

https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=for-next/entry-s-to-c

No worries if not, but figured it was worth letting you know anyway.

Will

* Re: [patch V2 01/17] x86/entry/32: Remove unused resume_userspace label
  2019-10-23 12:27 ` [patch V2 01/17] x86/entry/32: Remove unused resume_userspace label Thomas Gleixner
  2019-10-23 13:43   ` Sean Christopherson
@ 2019-11-06 15:26   ` Alexandre Chartre
  2019-11-16 12:02   ` [tip: x86/asm] " tip-bot2 for Thomas Gleixner
  2 siblings, 0 replies; 64+ messages in thread
From: Alexandre Chartre @ 2019-11-06 15:26 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes


On 10/23/19 2:27 PM, Thomas Gleixner wrote:
> The C reimplementation of SYSENTER left that unused ENTRY() label
> around. Remove it.
> 
> Fixes: 5f310f739b4c ("x86/entry/32: Re-implement SYSENTER using the new C path")
> Originally-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>   arch/x86/entry/entry_32.S |    1 -
>   1 file changed, 1 deletion(-)
> 

Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>

alex.

* Re: [patch V2 02/17] x86/entry/64: Remove pointless jump in paranoid_exit
  2019-10-23 12:27 ` [patch V2 02/17] x86/entry/64: Remove pointless jump in paranoid_exit Thomas Gleixner
  2019-10-23 13:45   ` Sean Christopherson
@ 2019-11-06 15:29   ` Alexandre Chartre
  2019-11-16 12:02   ` [tip: x86/asm] " tip-bot2 for Thomas Gleixner
  2 siblings, 0 replies; 64+ messages in thread
From: Alexandre Chartre @ 2019-11-06 15:29 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes


On 10/23/19 2:27 PM, Thomas Gleixner wrote:
> Jump directly to restore_regs_and_return_to_kernel instead of making
> a pointless extra jump through .Lparanoid_exit_restore
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>   arch/x86/entry/entry_64.S |    3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
> 

Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>

alex.

* Re: [patch V2 03/17] x86/traps: Remove pointless irq enable from do_spurious_interrupt_bug()
  2019-10-23 12:27 ` [patch V2 03/17] x86/traps: Remove pointless irq enable from do_spurious_interrupt_bug() Thomas Gleixner
  2019-10-23 13:52   ` Sean Christopherson
  2019-10-23 21:31   ` Josh Poimboeuf
@ 2019-11-06 15:33   ` Alexandre Chartre
  2020-02-27 14:15   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  3 siblings, 0 replies; 64+ messages in thread
From: Alexandre Chartre @ 2019-11-06 15:33 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes


On 10/23/19 2:27 PM, Thomas Gleixner wrote:
> That function returns immediately after conditionally reenabling interrupts,
> which is worse than pointless and requires the ASM code to disable interrupts again.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>   arch/x86/kernel/traps.c |    1 -
>   1 file changed, 1 deletion(-)
> 

Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>

alex.

* Re: [patch V2 04/17] x86/entry: Make DEBUG_ENTRY_ASSERT_IRQS_OFF available for 32bit
  2019-10-23 12:27 ` [patch V2 04/17] x86/entry: Make DEBUG_ENTRY_ASSERT_IRQS_OFF available for 32bit Thomas Gleixner
  2019-10-23 14:16   ` Sean Christopherson
@ 2019-11-06 15:50   ` Alexandre Chartre
  1 sibling, 0 replies; 64+ messages in thread
From: Alexandre Chartre @ 2019-11-06 15:50 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes


On 10/23/19 2:27 PM, Thomas Gleixner wrote:
> Move the interrupt state verification debug macro to common code and fix up
> the irqflags and paravirt components so it can be used in 32bit code later.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>   arch/x86/entry/calling.h        |   12 ++++++++++++
>   arch/x86/entry/entry_64.S       |   12 ------------
>   arch/x86/include/asm/irqflags.h |    8 ++++++--
>   arch/x86/include/asm/paravirt.h |    9 +++++----
>   4 files changed, 23 insertions(+), 18 deletions(-)
> 

Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>

alex.

* Re: [patch V2 05/17] x86/traps: Make interrupt enable/disable symmetric in C code
  2019-10-23 12:27 ` [patch V2 05/17] x86/traps: Make interrupt enable/disable symmetric in C code Thomas Gleixner
  2019-10-23 14:16   ` Sean Christopherson
  2019-10-23 22:01   ` Josh Poimboeuf
@ 2019-11-06 16:19   ` Alexandre Chartre
  2 siblings, 0 replies; 64+ messages in thread
From: Alexandre Chartre @ 2019-11-06 16:19 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes


On 10/23/19 2:27 PM, Thomas Gleixner wrote:
> Traps enable interrupts conditionally but rely on the ASM return code to
> disable them again. That results in redundant interrupt disable and trace
> calls.
> 
> Make the trap handlers disable interrupts before returning to avoid that,
> which allows simplification of the ASM entry code.
> 
> Originally-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>   arch/x86/kernel/traps.c |   32 +++++++++++++++++++++-----------
>   arch/x86/mm/fault.c     |    7 +++++--
>   2 files changed, 26 insertions(+), 13 deletions(-)
> 

Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>

alex.

* Re: [patch V2 06/17] x86/entry/32: Remove redundant interrupt disable
  2019-10-23 12:27 ` [patch V2 06/17] x86/entry/32: Remove redundant interrupt disable Thomas Gleixner
  2019-10-23 14:17   ` Sean Christopherson
@ 2019-11-08 10:41   ` Alexandre Chartre
  1 sibling, 0 replies; 64+ messages in thread
From: Alexandre Chartre @ 2019-11-08 10:41 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes


On 10/23/19 2:27 PM, Thomas Gleixner wrote:
> Now that the trap handlers return with interrupts disabled, the
> unconditional disabling of interrupts in the low level entry code can be
> removed along with the trace calls and the misnamed preempt_stop macro.
> As a consequence ret_from_exception and ret_from_intr collapse.
> 
> Add a debug check to verify that interrupts are disabled depending on
> CONFIG_DEBUG_ENTRY.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>   arch/x86/entry/entry_32.S |   21 ++++++---------------
>   1 file changed, 6 insertions(+), 15 deletions(-)
> 

Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>

alex.

* Re: [patch V2 07/17] x86/entry/64: Remove redundant interrupt disable
  2019-10-23 12:27 ` [patch V2 07/17] x86/entry/64: " Thomas Gleixner
  2019-10-23 14:20   ` Sean Christopherson
  2019-10-23 22:06   ` Josh Poimboeuf
@ 2019-11-08 11:07   ` Alexandre Chartre
  2 siblings, 0 replies; 64+ messages in thread
From: Alexandre Chartre @ 2019-11-08 11:07 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Peter Zijlstra, Andy Lutomirski, Will Deacon, Paolo Bonzini,
	kvm, linux-arch, Mike Rapoport, Josh Poimboeuf, Miroslav Benes


On 10/23/19 2:27 PM, Thomas Gleixner wrote:
> Now that the trap handlers return with interrupts disabled, the
> unconditional disabling of interrupts in the low level entry code can be
> removed along with the trace calls.
> 
> Add debug checks where appropriate.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>   arch/x86/entry/entry_64.S |    9 +++------
>   1 file changed, 3 insertions(+), 6 deletions(-)
> 

Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>

alex.

* [tip: x86/asm] x86/entry/64: Remove pointless jump in paranoid_exit
  2019-10-23 12:27 ` [patch V2 02/17] x86/entry/64: Remove pointless jump in paranoid_exit Thomas Gleixner
  2019-10-23 13:45   ` Sean Christopherson
  2019-11-06 15:29   ` Alexandre Chartre
@ 2019-11-16 12:02   ` tip-bot2 for Thomas Gleixner
  2 siblings, 0 replies; 64+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2019-11-16 12:02 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Sean Christopherson, Alexandre Chartre,
	Peter Zijlstra (Intel),
	Ingo Molnar, Borislav Petkov, linux-kernel

The following commit has been merged into the x86/asm branch of tip:

Commit-ID:     45c08383141794a7e9b26f35d491b74f33ac469e
Gitweb:        https://git.kernel.org/tip/45c08383141794a7e9b26f35d491b74f33ac469e
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Oct 2019 14:27:07 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Sat, 16 Nov 2019 12:55:55 +01:00

x86/entry/64: Remove pointless jump in paranoid_exit

Jump directly to restore_regs_and_return_to_kernel instead of making
a pointless extra jump through .Lparanoid_exit_restore

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20191023123117.779277679@linutronix.de

---
 arch/x86/entry/entry_64.S | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index d58c012..76942cb 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1273,12 +1273,11 @@ SYM_CODE_START_LOCAL(paranoid_exit)
 	/* Always restore stashed CR3 value (see paranoid_entry) */
 	RESTORE_CR3	scratch_reg=%rbx save_reg=%r14
 	SWAPGS_UNSAFE_STACK
-	jmp	.Lparanoid_exit_restore
+	jmp	restore_regs_and_return_to_kernel
 .Lparanoid_exit_no_swapgs:
 	TRACE_IRQS_IRETQ_DEBUG
 	/* Always restore stashed CR3 value (see paranoid_entry) */
 	RESTORE_CR3	scratch_reg=%rbx save_reg=%r14
-.Lparanoid_exit_restore:
 	jmp restore_regs_and_return_to_kernel
 SYM_CODE_END(paranoid_exit)
 

* [tip: x86/asm] x86/entry/32: Remove unused resume_userspace label
  2019-10-23 12:27 ` [patch V2 01/17] x86/entry/32: Remove unused resume_userspace label Thomas Gleixner
  2019-10-23 13:43   ` Sean Christopherson
  2019-11-06 15:26   ` Alexandre Chartre
@ 2019-11-16 12:02   ` tip-bot2 for Thomas Gleixner
  2 siblings, 0 replies; 64+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2019-11-16 12:02 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra, Thomas Gleixner, Sean Christopherson,
	Alexandre Chartre, Ingo Molnar, Borislav Petkov, linux-kernel

The following commit has been merged into the x86/asm branch of tip:

Commit-ID:     df1a7524741b6c094786032e12a21a448321d9f6
Gitweb:        https://git.kernel.org/tip/df1a7524741b6c094786032e12a21a448321d9f6
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Oct 2019 14:27:06 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Sat, 16 Nov 2019 12:55:55 +01:00

x86/entry/32: Remove unused resume_userspace label

The C reimplementation of SYSENTER left that unused ENTRY() label
around. Remove it.

Fixes: 5f310f739b4c ("x86/entry/32: Re-implement SYSENTER using the new C path")
Originally-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20191023123117.686514045@linutronix.de

---
 arch/x86/entry/entry_32.S | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index a987b62..4bbcc5e 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -824,7 +824,6 @@ ret_from_intr:
 	cmpl	$USER_RPL, %eax
 	jb	restore_all_kernel		# not returning to v8086 or userspace
 
-SYM_INNER_LABEL_ALIGN(resume_userspace, SYM_L_LOCAL)
 	DISABLE_INTERRUPTS(CLBR_ANY)
 	TRACE_IRQS_OFF
 	movl	%esp, %eax

* Re: [patch V2 09/17] x86/entry: Remove _TIF_NOHZ from _TIF_WORK_SYSCALL_ENTRY
  2019-10-23 12:27 ` [patch V2 09/17] x86/entry: Remove _TIF_NOHZ from _TIF_WORK_SYSCALL_ENTRY Thomas Gleixner
@ 2020-01-06  4:11   ` Frederic Weisbecker
  0 siblings, 0 replies; 64+ messages in thread
From: Frederic Weisbecker @ 2020-01-06  4:11 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Peter Zijlstra, Andy Lutomirski, Will Deacon,
	Paolo Bonzini, kvm, linux-arch, Mike Rapoport, Josh Poimboeuf,
	Miroslav Benes

On Wed, Oct 23, 2019 at 02:27:14PM +0200, Thomas Gleixner wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> Evaluating _TIF_NOHZ to decide whether to use the slow syscall entry path
> is not only pointless, it's actually counterproductive:
> 
>  1) Context tracking code is invoked unconditionally before that flag is
>     evaluated.
> 
>  2) If the flag is set the slow path is invoked for nothing due to #1
> 
> Remove it.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

I'm borrowing this patch for a series of mine. But if you apply
it in the meantime, that would be even better :-)

Thanks!

* [tip: x86/entry] x86/traps: Remove pointless irq enable from do_spurious_interrupt_bug()
  2019-10-23 12:27 ` [patch V2 03/17] x86/traps: Remove pointless irq enable from do_spurious_interrupt_bug() Thomas Gleixner
                     ` (2 preceding siblings ...)
  2019-11-06 15:33   ` Alexandre Chartre
@ 2020-02-27 14:15   ` tip-bot2 for Thomas Gleixner
  3 siblings, 0 replies; 64+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-02-27 14:15 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Sean Christopherson, Alexandre Chartre,
	Frederic Weisbecker, Andy Lutomirski, Peter Zijlstra (Intel),
	x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     e039dd815941e785203142261397da6ec64d20cc
Gitweb:        https://git.kernel.org/tip/e039dd815941e785203142261397da6ec64d20cc
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 25 Feb 2020 22:36:40 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 27 Feb 2020 14:48:39 +01:00

x86/traps: Remove pointless irq enable from do_spurious_interrupt_bug()

That function returns immediately after conditionally reenabling interrupts,
which is worse than pointless and requires the ASM code to disable interrupts again.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20191023123117.871608831@linutronix.de
Link: https://lkml.kernel.org/r/20200225220216.518575042@linutronix.de


---
 arch/x86/kernel/traps.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 6ef00eb..474b8cb 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -862,7 +862,6 @@ do_simd_coprocessor_error(struct pt_regs *regs, long error_code)
 dotraplinkage void
 do_spurious_interrupt_bug(struct pt_regs *regs, long error_code)
 {
-	cond_local_irq_enable(regs);
 }
 
 dotraplinkage void

end of thread, other threads:[~2020-02-27 14:22 UTC | newest]

Thread overview: 64+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-23 12:27 [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Thomas Gleixner
2019-10-23 12:27 ` [patch V2 01/17] x86/entry/32: Remove unused resume_userspace label Thomas Gleixner
2019-10-23 13:43   ` Sean Christopherson
2019-11-06 15:26   ` Alexandre Chartre
2019-11-16 12:02   ` [tip: x86/asm] " tip-bot2 for Thomas Gleixner
2019-10-23 12:27 ` [patch V2 02/17] x86/entry/64: Remove pointless jump in paranoid_exit Thomas Gleixner
2019-10-23 13:45   ` Sean Christopherson
2019-11-06 15:29   ` Alexandre Chartre
2019-11-16 12:02   ` [tip: x86/asm] " tip-bot2 for Thomas Gleixner
2019-10-23 12:27 ` [patch V2 03/17] x86/traps: Remove pointless irq enable from do_spurious_interrupt_bug() Thomas Gleixner
2019-10-23 13:52   ` Sean Christopherson
2019-10-23 21:31   ` Josh Poimboeuf
2019-10-23 22:35     ` Thomas Gleixner
2019-10-23 22:49       ` Josh Poimboeuf
2019-10-23 23:18         ` Thomas Gleixner
2019-11-06 15:33   ` Alexandre Chartre
2020-02-27 14:15   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
2019-10-23 12:27 ` [patch V2 04/17] x86/entry: Make DEBUG_ENTRY_ASSERT_IRQS_OFF available for 32bit Thomas Gleixner
2019-10-23 14:16   ` Sean Christopherson
2019-11-06 15:50   ` Alexandre Chartre
2019-10-23 12:27 ` [patch V2 05/17] x86/traps: Make interrupt enable/disable symmetric in C code Thomas Gleixner
2019-10-23 14:16   ` Sean Christopherson
2019-10-23 22:01   ` Josh Poimboeuf
2019-10-23 23:23     ` Thomas Gleixner
2019-11-06 16:19   ` Alexandre Chartre
2019-10-23 12:27 ` [patch V2 06/17] x86/entry/32: Remove redundant interrupt disable Thomas Gleixner
2019-10-23 14:17   ` Sean Christopherson
2019-11-08 10:41   ` Alexandre Chartre
2019-10-23 12:27 ` [patch V2 07/17] x86/entry/64: " Thomas Gleixner
2019-10-23 14:20   ` Sean Christopherson
2019-10-23 22:06   ` Josh Poimboeuf
2019-10-23 23:52     ` Thomas Gleixner
2019-10-24 16:18       ` Andy Lutomirski
2019-10-24 20:52         ` Thomas Gleixner
2019-10-24 20:59           ` Thomas Gleixner
2019-10-24 21:21           ` Peter Zijlstra
2019-10-24 21:24           ` Andy Lutomirski
2019-10-24 22:33             ` Thomas Gleixner
2019-11-08 11:07   ` Alexandre Chartre
2019-10-23 12:27 ` [patch V2 08/17] x86/entry: Move syscall irq tracing to C code Thomas Gleixner
2019-10-23 21:30   ` Andy Lutomirski
2019-10-23 21:35     ` Andy Lutomirski
2019-10-23 23:31       ` Thomas Gleixner
2019-10-23 23:16     ` Thomas Gleixner
2019-10-24 16:24     ` Andy Lutomirski
2019-10-24 17:40       ` Peter Zijlstra
2019-10-24 20:54         ` Thomas Gleixner
2019-10-23 12:27 ` [patch V2 09/17] x86/entry: Remove _TIF_NOHZ from _TIF_WORK_SYSCALL_ENTRY Thomas Gleixner
2020-01-06  4:11   ` Frederic Weisbecker
2019-10-23 12:27 ` [patch V2 10/17] entry: Provide generic syscall entry functionality Thomas Gleixner
2019-10-23 12:27 ` [patch V2 11/17] x86/entry: Use generic syscall entry function Thomas Gleixner
2019-10-23 12:27 ` [patch V2 12/17] entry: Provide generic syscall exit function Thomas Gleixner
2019-10-23 12:27 ` [patch V2 13/17] x86/entry: Use generic syscall exit functionality Thomas Gleixner
2019-10-23 12:27 ` [patch V2 14/17] entry: Provide generic exit to usermode functionality Thomas Gleixner
2019-10-23 21:34   ` Andy Lutomirski
2019-10-23 23:20     ` Thomas Gleixner
2019-10-23 12:27 ` [patch V2 15/17] x86/entry: Use generic exit to usermode Thomas Gleixner
2019-10-23 12:27 ` [patch V2 16/17] kvm/workpending: Provide infrastructure for work before entering a guest Thomas Gleixner
2019-10-23 14:55   ` Sean Christopherson
2019-10-23 12:27 ` [patch V2 17/17] x86/kvm: Use generic exit to guest work function Thomas Gleixner
2019-10-23 14:48   ` Sean Christopherson
2019-10-23 14:37 ` [patch V2 00/17] entry: Provide generic implementation for host and guest entry/exit work Peter Zijlstra
2019-10-23 21:20 ` Josh Poimboeuf
2019-10-29 11:28 ` Will Deacon
