LKML Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v5 00/17] x86: Rewrite exit-to-userspace code
@ 2015-07-03 19:44 Andy Lutomirski
  2015-07-03 19:44 ` [PATCH v5 01/17] selftests/x86: Add a test for 32-bit fast syscall arg faults Andy Lutomirski
                   ` (17 more replies)
  0 siblings, 18 replies; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-03 19:44 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Frédéric Weisbecker, Rik van Riel, Oleg Nesterov,
	Denys Vlasenko, Borislav Petkov, Kees Cook, Brian Gerst, paulmck,
	Andy Lutomirski

This is the first big batch of x86 asm-to-C conversion patches.

The exit-to-usermode code is copied in several places and is written
in a nasty combination of asm and C.  It's not at all clear what
it's supposed to do, and the way it's structured makes it very hard
to work with.  For example, it's not even clear why syscall exit
hooks are called only once per syscall right now.  (It seems to be a
side effect of the way that rdi and rdx are handled in the asm loop,
and it seems reliable, but it's still pointlessly complicated.)  The
existing code also makes context tracking overly complicated and
hard to understand.  Finally, it's nearly impossible for anyone to
change what happens on exit to usermode, since the existing code is
so fragile.

I tried to clean it up incrementally, but I decided it was too hard.
Instead, this series just replaces the code.  It seems to work.

Context tracking in particular works very differently now.  The
low-level entry code checks that we're in CONTEXT_USER and switches
to CONTEXT_KERNEL.  The exit code does the reverse.  There is no
need to track what CONTEXT_XYZ state we came from, because we
already know.  Similarly, SCHEDULE_USER is gone, since we can
reschedule if needed by simply calling schedule() from C code.

The main things that are missing are that I haven't done the 32-bit
parts (anyone want to help?) and therefore I haven't deleted the old
C code.  I also think this may break UML for trivial reasons.

IRQ context tracking is still messy.  One the cleanup progresses
to the point that we can enter CONTEXT_KERNEL in syscalls before
enabling interrupts, we can fully clean up IRQ context tracking.

Once these land, I'll send some more :)

Note: we might want to backport patches 1 and 2.

Changes from v4:
 - Remove now-unused SAVE_EXTRA_REGS_RBP macro
 - Fix comment at the top of common.c
 - Decorate internal labels in error_entry with .L
 - Fix two mis-formatted asm lines
 - Undo inadvertent removal of R11 initialization in the sysexit path

I didn't rename the error_entry labels.

Changes from v3:
 - Add the syscall_arg_fault_32 test.
 - Fix a pre-existing bad syscall arg buglet.
 - Fix an asm glitch due to a bad rebase.
 - Fix a CONFIG_PROVE_LOCKDEP warning.
Borislav: the end result of this series differs from the v3.91 that I
only in the removal of a single trailing tab.  The badarg patch is in
a different place now, though, since we might want to backport it.

Changes from v2: Misplaced the actual list -- sorry.

Changes from v1:
 - Fix bisection failure by squashing the 64-bit native and compat syscall
   conversions together.  The intermediate state didn't built, and fixing
   it isn't worthwhile (the results will be harder to understand).
 - Replace context_tracking_assert_state with CT_WARN_ON and ct_state.
 - The last two patches are now.  I incorrectly thought that we weren't
   ready for them yet on 32-bit kernels, but I was wrong.

Andy Lutomirski (16):
  selftests/x86: Add a test for 32-bit fast syscall arg faults
  x86/entry/64/compat: Fix bad fast syscall arg failure path
  context_tracking: Add ct_state and CT_WARN_ON
  notifiers: Assert that RCU is watching in notify_die
  x86: Move C entry and exit code to arch/x86/entry/common.c
  x86/traps: Assert that we're in CONTEXT_KERNEL in exception entries
  x86/entry: Add enter_from_user_mode and use it in syscalls
  x86/entry: Add new, comprehensible entry and exit hooks
  x86/entry/64: Really create an error-entry-from-usermode code path
  x86/entry/64: Migrate 64-bit and compat syscalls to new exit hooks
  x86/asm/entry/64: Save all regs on interrupt entry
  x86/asm/entry/64: Simplify irq stack pt_regs handling
  x86/asm/entry/64: Migrate error and interrupt exit work to C
  x86/entry: Remove exception_enter from most trap handlers
  x86/entry: Remove SCHEDULE_USER and asm/context-tracking.h
  x86/irq: Document how IRQ context tracking works and add an assertion

Ingo Molnar (1):
  uml: Fix do_signal() prototype

 arch/um/include/shared/kern_util.h              |   3 +-
 arch/um/kernel/process.c                        |   6 +-
 arch/um/kernel/signal.c                         |   8 +-
 arch/um/kernel/tlb.c                            |   2 +-
 arch/um/kernel/trap.c                           |   2 +-
 arch/x86/entry/Makefile                         |   1 +
 arch/x86/entry/calling.h                        |   3 -
 arch/x86/entry/common.c                         | 374 ++++++++++++++++++++++++
 arch/x86/entry/entry_64.S                       | 197 ++++---------
 arch/x86/entry/entry_64_compat.S                |  46 ++-
 arch/x86/include/asm/context_tracking.h         |  10 -
 arch/x86/include/asm/signal.h                   |   1 +
 arch/x86/include/asm/traps.h                    |   4 +-
 arch/x86/kernel/cpu/mcheck/mce.c                |   5 +-
 arch/x86/kernel/cpu/mcheck/p5.c                 |   5 +-
 arch/x86/kernel/cpu/mcheck/winchip.c            |   4 +-
 arch/x86/kernel/irq.c                           |  15 +
 arch/x86/kernel/ptrace.c                        | 202 +------------
 arch/x86/kernel/signal.c                        |  28 +-
 arch/x86/kernel/traps.c                         |  87 ++----
 include/linux/context_tracking.h                |  15 +
 include/linux/context_tracking_state.h          |   1 +
 kernel/notifier.c                               |   2 +
 tools/testing/selftests/x86/Makefile            |   2 +-
 tools/testing/selftests/x86/syscall_arg_fault.c | 130 ++++++++
 25 files changed, 681 insertions(+), 472 deletions(-)
 create mode 100644 arch/x86/entry/common.c
 delete mode 100644 arch/x86/include/asm/context_tracking.h
 create mode 100644 tools/testing/selftests/x86/syscall_arg_fault.c

-- 
2.4.3


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v5 01/17] selftests/x86: Add a test for 32-bit fast syscall arg faults
  2015-07-03 19:44 [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Andy Lutomirski
@ 2015-07-03 19:44 ` Andy Lutomirski
  2015-07-07 10:49   ` [tip:x86/asm] x86/entry, " tip-bot for Andy Lutomirski
  2015-07-03 19:44 ` [PATCH v5 02/17] x86/entry/64/compat: Fix bad fast syscall arg failure path Andy Lutomirski
                   ` (16 subsequent siblings)
  17 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-03 19:44 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Frédéric Weisbecker, Rik van Riel, Oleg Nesterov,
	Denys Vlasenko, Borislav Petkov, Kees Cook, Brian Gerst, paulmck,
	Andy Lutomirski

This test passes on 4.0 and fails on some newer kernels.  Fortunately,
the failure is likely not a big deal.  This test will make sure that
we don't break it further (e.g. OOPSing) as we clean up the entry
code and that we eventually fix the regression.

There's arguably no need to preserve the old ABI here -- anything
that makes it into a fast (vDSO) syscall with a bad stack is about
to crash no matter what we do.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 tools/testing/selftests/x86/Makefile            |   2 +-
 tools/testing/selftests/x86/syscall_arg_fault.c | 130 ++++++++++++++++++++++++
 2 files changed, 131 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/x86/syscall_arg_fault.c

diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
index caa60d56d7d1..e8df47e6326c 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -5,7 +5,7 @@ include ../lib.mk
 .PHONY: all all_32 all_64 warn_32bit_failure clean
 
 TARGETS_C_BOTHBITS := sigreturn single_step_syscall sysret_ss_attrs
-TARGETS_C_32BIT_ONLY := entry_from_vm86
+TARGETS_C_32BIT_ONLY := entry_from_vm86 syscall_arg_fault
 
 TARGETS_C_32BIT_ALL := $(TARGETS_C_BOTHBITS) $(TARGETS_C_32BIT_ONLY)
 BINARIES_32 := $(TARGETS_C_32BIT_ALL:%=%_32)
diff --git a/tools/testing/selftests/x86/syscall_arg_fault.c b/tools/testing/selftests/x86/syscall_arg_fault.c
new file mode 100644
index 000000000000..7db4fc9fa09f
--- /dev/null
+++ b/tools/testing/selftests/x86/syscall_arg_fault.c
@@ -0,0 +1,130 @@
+/*
+ * syscall_arg_fault.c - tests faults 32-bit fast syscall stack args
+ * Copyright (c) 2015 Andrew Lutomirski
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+
+#define _GNU_SOURCE
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <sys/signal.h>
+#include <sys/ucontext.h>
+#include <err.h>
+#include <setjmp.h>
+#include <errno.h>
+
+/* Our sigaltstack scratch space. */
+static unsigned char altstack_data[SIGSTKSZ];
+
+static void sethandler(int sig, void (*handler)(int, siginfo_t *, void *),
+		       int flags)
+{
+	struct sigaction sa;
+	memset(&sa, 0, sizeof(sa));
+	sa.sa_sigaction = handler;
+	sa.sa_flags = SA_SIGINFO | flags;
+	sigemptyset(&sa.sa_mask);
+	if (sigaction(sig, &sa, 0))
+		err(1, "sigaction");
+}
+
+static volatile sig_atomic_t sig_traps;
+static sigjmp_buf jmpbuf;
+
+static volatile sig_atomic_t n_errs;
+
+static void sigsegv(int sig, siginfo_t *info, void *ctx_void)
+{
+	ucontext_t *ctx = (ucontext_t*)ctx_void;
+
+	if (ctx->uc_mcontext.gregs[REG_EAX] != -EFAULT) {
+		printf("[FAIL]\tAX had the wrong value: 0x%x\n",
+		       ctx->uc_mcontext.gregs[REG_EAX]);
+		n_errs++;
+	} else {
+		printf("[OK]\tSeems okay\n");
+	}
+
+	siglongjmp(jmpbuf, 1);
+}
+
+static void sigill(int sig, siginfo_t *info, void *ctx_void)
+{
+	printf("[SKIP]\tIllegal instruction\n");
+	siglongjmp(jmpbuf, 1);
+}
+
+int main()
+{
+	stack_t stack = {
+		.ss_sp = altstack_data,
+		.ss_size = SIGSTKSZ,
+	};
+	if (sigaltstack(&stack, NULL) != 0)
+		err(1, "sigaltstack");
+
+	sethandler(SIGSEGV, sigsegv, SA_ONSTACK);
+	sethandler(SIGILL, sigill, SA_ONSTACK);
+
+	/*
+	 * Exercise another nasty special case.  The 32-bit SYSCALL
+	 * and SYSENTER instructions (even in compat mode) each
+	 * clobber one register.  A Linux system call has a syscall
+	 * number and six arguments, and the user stack pointer
+	 * needs to live in some register on return.  That means
+	 * that we need eight registers, but SYSCALL and SYSENTER
+	 * only preserve seven registers.  As a result, one argument
+	 * ends up on the stack.  The stack is user memory, which
+	 * means that the kernel can fail to read it.
+	 *
+	 * The 32-bit fast system calls don't have a defined ABI:
+	 * we're supposed to invoke them through the vDSO.  So we'll
+	 * fudge it: we set all regs to invalid pointer values and
+	 * invoke the entry instruction.  The return will fail no
+	 * matter what, and we completely lose our program state,
+	 * but we can fix it up with a signal handler.
+	 */
+
+	printf("[RUN]\tSYSENTER with invalid state\n");
+	if (sigsetjmp(jmpbuf, 1) == 0) {
+		asm volatile (
+			"movl $-1, %%eax\n\t"
+			"movl $-1, %%ebx\n\t"
+			"movl $-1, %%ecx\n\t"
+			"movl $-1, %%edx\n\t"
+			"movl $-1, %%esi\n\t"
+			"movl $-1, %%edi\n\t"
+			"movl $-1, %%ebp\n\t"
+			"movl $-1, %%esp\n\t"
+			"sysenter"
+			: : : "memory", "flags");
+	}
+
+	printf("[RUN]\tSYSCALL with invalid state\n");
+	if (sigsetjmp(jmpbuf, 1) == 0) {
+		asm volatile (
+			"movl $-1, %%eax\n\t"
+			"movl $-1, %%ebx\n\t"
+			"movl $-1, %%ecx\n\t"
+			"movl $-1, %%edx\n\t"
+			"movl $-1, %%esi\n\t"
+			"movl $-1, %%edi\n\t"
+			"movl $-1, %%ebp\n\t"
+			"movl $-1, %%esp\n\t"
+			"syscall\n\t"
+			"pushl $0"	/* make sure we segfault cleanly */
+			: : : "memory", "flags");
+	}
+
+	return 0;
+}
-- 
2.4.3


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v5 02/17] x86/entry/64/compat: Fix bad fast syscall arg failure path
  2015-07-03 19:44 [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Andy Lutomirski
  2015-07-03 19:44 ` [PATCH v5 01/17] selftests/x86: Add a test for 32-bit fast syscall arg faults Andy Lutomirski
@ 2015-07-03 19:44 ` Andy Lutomirski
  2015-07-07 10:49   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
  2015-07-03 19:44 ` [PATCH v5 03/17] uml: Fix do_signal() prototype Andy Lutomirski
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-03 19:44 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Frédéric Weisbecker, Rik van Riel, Oleg Nesterov,
	Denys Vlasenko, Borislav Petkov, Kees Cook, Brian Gerst, paulmck,
	Andy Lutomirski

If user code does SYSCALL32 or SYSENTER without a valid stack, then
our attempt to determine the syscall args will result in a failed
uaccess fault.  Previously, we would try to recover by jumping to
the syscall exit code, but we'd run the syscall exit work even
though we never made it to the syscall entry work.

Clean it up by treating the failure path as a non-syscall entry and
exit pair.

This fixes strace's output when running the syscall_arg_fault test.
Without this fix, strace would get out of sync and would fail to
associate syscall entries with syscall exits.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/entry/entry_64.S        |  2 +-
 arch/x86/entry/entry_64_compat.S | 35 +++++++++++++++++++++++++++++++++--
 2 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 3bb2c4302df1..141a5d49dddc 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -613,7 +613,7 @@ ret_from_intr:
 	testb	$3, CS(%rsp)
 	jz	retint_kernel
 	/* Interrupt came from user space */
-retint_user:
+GLOBAL(retint_user)
 	GET_THREAD_INFO(%rcx)
 
 	/* %rcx: thread info. Interrupts are off. */
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index bb187a6a877c..efe0b1e499fa 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -425,8 +425,39 @@ cstar_tracesys:
 END(entry_SYSCALL_compat)
 
 ia32_badarg:
-	ASM_CLAC
-	movq	$-EFAULT, RAX(%rsp)
+	/*
+	 * So far, we've entered kernel mode, set AC, turned on IRQs, and
+	 * saved C regs except r8-r11.  We haven't done any of the other
+	 * standard entry work, though.  We want to bail, but we shouldn't
+	 * treat this as a syscall entry since we don't even know what the
+	 * args are.  Instead, treat this as a non-syscall entry, finish
+	 * the entry work, and immediately exit after setting AX = -EFAULT.
+	 *
+	 * We're really just being polite here.  Killing the task outright
+	 * would be a reasonable action, too.  Given that the only valid
+	 * way to have gotten here is through the vDSO, and we already know
+	 * that the stack pointer is bad, the task isn't going to survive
+	 * for long no matter what we do.
+	 */
+
+	ASM_CLAC			/* undo STAC */
+	movq	$-EFAULT, RAX(%rsp)	/* return -EFAULT if possible */
+
+	/* Fill in the rest of pt_regs */
+	xorl	%eax, %eax
+	movq	%rax, R11(%rsp)
+	movq	%rax, R10(%rsp)
+	movq	%rax, R9(%rsp)
+	movq	%rax, R8(%rsp)
+	SAVE_EXTRA_REGS
+
+	/* Turn IRQs back off. */
+	DISABLE_INTERRUPTS(CLBR_NONE)
+	TRACE_IRQS_OFF
+
+	/* And exit again. */
+	jmp retint_user
+
 ia32_ret_from_sys_call:
 	xorl	%eax, %eax		/* Do not leak kernel information */
 	movq	%rax, R11(%rsp)
-- 
2.4.3


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v5 03/17] uml: Fix do_signal() prototype
  2015-07-03 19:44 [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Andy Lutomirski
  2015-07-03 19:44 ` [PATCH v5 01/17] selftests/x86: Add a test for 32-bit fast syscall arg faults Andy Lutomirski
  2015-07-03 19:44 ` [PATCH v5 02/17] x86/entry/64/compat: Fix bad fast syscall arg failure path Andy Lutomirski
@ 2015-07-03 19:44 ` Andy Lutomirski
  2015-07-07 10:49   ` [tip:x86/asm] um: " tip-bot for Ingo Molnar
  2015-07-03 19:44 ` [PATCH v5 04/17] context_tracking: Add ct_state and CT_WARN_ON Andy Lutomirski
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-03 19:44 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Frédéric Weisbecker, Rik van Riel, Oleg Nesterov,
	Denys Vlasenko, Borislav Petkov, Kees Cook, Brian Gerst, paulmck,
	Ingo Molnar, Richard Weinberger, Andrew Morton, Andy Lutomirski,
	Denys Vlasenko, H. Peter Anvin, Linus Torvalds, Peter Zijlstra,
	Thomas Gleixner, Andy Lutomirski

From: Ingo Molnar <mingo@kernel.org>

Once x86 exports its do_signal(), the prototypes will clash.

Fix the clash and also improve the code a bit: remove the unnecessary
kern_do_signal() indirection. This allows interrupt_end() to share
the 'regs' parameter calculation.

Also remove the unused return code to match x86.

Minimally build and boot tested.

Cc: Richard Weinberger <richard.weinberger@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[Adjusted the commit message because I reordered the patch. --Andy]
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/um/include/shared/kern_util.h | 3 ++-
 arch/um/kernel/process.c           | 6 ++++--
 arch/um/kernel/signal.c            | 8 +-------
 arch/um/kernel/tlb.c               | 2 +-
 arch/um/kernel/trap.c              | 2 +-
 5 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/arch/um/include/shared/kern_util.h b/arch/um/include/shared/kern_util.h
index 83a91f976330..35ab97e4bb9b 100644
--- a/arch/um/include/shared/kern_util.h
+++ b/arch/um/include/shared/kern_util.h
@@ -22,7 +22,8 @@ extern int kmalloc_ok;
 extern unsigned long alloc_stack(int order, int atomic);
 extern void free_stack(unsigned long stack, int order);
 
-extern int do_signal(void);
+struct pt_regs;
+extern void do_signal(struct pt_regs *regs);
 extern void interrupt_end(void);
 extern void relay_signal(int sig, struct siginfo *si, struct uml_pt_regs *regs);
 
diff --git a/arch/um/kernel/process.c b/arch/um/kernel/process.c
index 68b9119841cd..a6d922672b9f 100644
--- a/arch/um/kernel/process.c
+++ b/arch/um/kernel/process.c
@@ -90,12 +90,14 @@ void *__switch_to(struct task_struct *from, struct task_struct *to)
 
 void interrupt_end(void)
 {
+	struct pt_regs *regs = &current->thread.regs;
+
 	if (need_resched())
 		schedule();
 	if (test_thread_flag(TIF_SIGPENDING))
-		do_signal();
+		do_signal(regs);
 	if (test_and_clear_thread_flag(TIF_NOTIFY_RESUME))
-		tracehook_notify_resume(&current->thread.regs);
+		tracehook_notify_resume(regs);
 }
 
 void exit_thread(void)
diff --git a/arch/um/kernel/signal.c b/arch/um/kernel/signal.c
index 4f60e4aad790..57acbd67d85d 100644
--- a/arch/um/kernel/signal.c
+++ b/arch/um/kernel/signal.c
@@ -64,7 +64,7 @@ static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
 	signal_setup_done(err, ksig, singlestep);
 }
 
-static int kern_do_signal(struct pt_regs *regs)
+void do_signal(struct pt_regs *regs)
 {
 	struct ksignal ksig;
 	int handled_sig = 0;
@@ -110,10 +110,4 @@ static int kern_do_signal(struct pt_regs *regs)
 	 */
 	if (!handled_sig)
 		restore_saved_sigmask();
-	return handled_sig;
-}
-
-int do_signal(void)
-{
-	return kern_do_signal(&current->thread.regs);
 }
diff --git a/arch/um/kernel/tlb.c b/arch/um/kernel/tlb.c
index f1b3eb14b855..2077248e8a72 100644
--- a/arch/um/kernel/tlb.c
+++ b/arch/um/kernel/tlb.c
@@ -291,7 +291,7 @@ void fix_range_common(struct mm_struct *mm, unsigned long start_addr,
 		/* We are under mmap_sem, release it such that current can terminate */
 		up_write(&current->mm->mmap_sem);
 		force_sig(SIGKILL, current);
-		do_signal();
+		do_signal(&current->thread.regs);
 	}
 }
 
diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
index 47ff9b7f3e5d..1b0f5c59d522 100644
--- a/arch/um/kernel/trap.c
+++ b/arch/um/kernel/trap.c
@@ -173,7 +173,7 @@ static void bad_segv(struct faultinfo fi, unsigned long ip)
 void fatal_sigsegv(void)
 {
 	force_sigsegv(SIGSEGV, current);
-	do_signal();
+	do_signal(&current->thread.regs);
 	/*
 	 * This is to tell gcc that we're not returning - do_signal
 	 * can, in general, return, but in this case, it's not, since
-- 
2.4.3


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v5 04/17] context_tracking: Add ct_state and CT_WARN_ON
  2015-07-03 19:44 [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Andy Lutomirski
                   ` (2 preceding siblings ...)
  2015-07-03 19:44 ` [PATCH v5 03/17] uml: Fix do_signal() prototype Andy Lutomirski
@ 2015-07-03 19:44 ` Andy Lutomirski
  2015-07-07 10:50   ` [tip:x86/asm] context_tracking: Add ct_state() and CT_WARN_ON() tip-bot for Andy Lutomirski
  2015-07-03 19:44 ` [PATCH v5 05/17] notifiers: Assert that RCU is watching in notify_die Andy Lutomirski
                   ` (13 subsequent siblings)
  17 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-03 19:44 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Frédéric Weisbecker, Rik van Riel, Oleg Nesterov,
	Denys Vlasenko, Borislav Petkov, Kees Cook, Brian Gerst, paulmck,
	Andy Lutomirski

This will let us sprinkle sanity checks around the kernel without
making too much of a mess.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 include/linux/context_tracking.h       | 15 +++++++++++++++
 include/linux/context_tracking_state.h |  1 +
 2 files changed, 16 insertions(+)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index b96bd299966f..008fc67d0d96 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -49,13 +49,28 @@ static inline void exception_exit(enum ctx_state prev_ctx)
 	}
 }
 
+
+/**
+ * ct_state() - return the current context tracking state if known
+ *
+ * Returns the current cpu's context tracking state if context tracking
+ * is enabled.  If context tracking is disabled, returns
+ * CONTEXT_DISABLED.  This should be used primarily for debugging.
+ */
+static inline enum ctx_state ct_state(void)
+{
+	return context_tracking_is_enabled() ?
+		this_cpu_read(context_tracking.state) : CONTEXT_DISABLED;
+}
 #else
 static inline void user_enter(void) { }
 static inline void user_exit(void) { }
 static inline enum ctx_state exception_enter(void) { return 0; }
 static inline void exception_exit(enum ctx_state prev_ctx) { }
+static inline enum ctx_state ct_state(void) { return CONTEXT_DISABLED; }
 #endif /* !CONFIG_CONTEXT_TRACKING */
 
+#define CT_WARN_ON(cond) WARN_ON(context_tracking_is_enabled() && (cond))
 
 #ifdef CONFIG_CONTEXT_TRACKING_FORCE
 extern void context_tracking_init(void);
diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index 678ecdf90cf6..ee956c528fab 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -14,6 +14,7 @@ struct context_tracking {
 	bool active;
 	int recursion;
 	enum ctx_state {
+		CONTEXT_DISABLED = -1,	/* returned by ct_state() if unknown */
 		CONTEXT_KERNEL = 0,
 		CONTEXT_USER,
 		CONTEXT_GUEST,
-- 
2.4.3


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v5 05/17] notifiers: Assert that RCU is watching in notify_die
  2015-07-03 19:44 [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Andy Lutomirski
                   ` (3 preceding siblings ...)
  2015-07-03 19:44 ` [PATCH v5 04/17] context_tracking: Add ct_state and CT_WARN_ON Andy Lutomirski
@ 2015-07-03 19:44 ` Andy Lutomirski
  2015-07-07 10:50   ` [tip:x86/asm] notifiers, RCU: Assert that RCU is watching in notify_die() tip-bot for Andy Lutomirski
  2015-07-03 19:44 ` [PATCH v5 06/17] x86: Move C entry and exit code to arch/x86/entry/common.c Andy Lutomirski
                   ` (12 subsequent siblings)
  17 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-03 19:44 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Frédéric Weisbecker, Rik van Riel, Oleg Nesterov,
	Denys Vlasenko, Borislav Petkov, Kees Cook, Brian Gerst, paulmck,
	Andy Lutomirski

Low-level arch entries often call notify_die, and it's easy for arch
code to fail to exit an RCU quiescent state first.  Assert that
we're not quiescent in notify_die.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 kernel/notifier.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/notifier.c b/kernel/notifier.c
index ae9fc7cc360e..980e4330fb59 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -544,6 +544,8 @@ int notrace notify_die(enum die_val val, const char *str,
 		.signr	= sig,
 
 	};
+	rcu_lockdep_assert(rcu_is_watching(),
+			   "notify_die called but RCU thinks we're quiescent");
 	return atomic_notifier_call_chain(&die_chain, val, &args);
 }
 NOKPROBE_SYMBOL(notify_die);
-- 
2.4.3


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v5 06/17] x86: Move C entry and exit code to arch/x86/entry/common.c
  2015-07-03 19:44 [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Andy Lutomirski
                   ` (4 preceding siblings ...)
  2015-07-03 19:44 ` [PATCH v5 05/17] notifiers: Assert that RCU is watching in notify_die Andy Lutomirski
@ 2015-07-03 19:44 ` Andy Lutomirski
  2015-07-07 10:50   ` [tip:x86/asm] x86/entry: Move C entry and exit code to arch/x86/ entry/common.c tip-bot for Andy Lutomirski
  2015-07-03 19:44 ` [PATCH v5 07/17] x86/traps: Assert that we're in CONTEXT_KERNEL in exception entries Andy Lutomirski
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-03 19:44 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Frédéric Weisbecker, Rik van Riel, Oleg Nesterov,
	Denys Vlasenko, Borislav Petkov, Kees Cook, Brian Gerst, paulmck,
	Andy Lutomirski

The entry and exit C helpers were confusingly scattered between
ptrace.c and signal.c, even though they aren't specific to ptrace or
signal handling.  Move them together in a new file.

This change just moves code around.  It doesn't change anything.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/entry/Makefile       |   1 +
 arch/x86/entry/common.c       | 253 ++++++++++++++++++++++++++++++++++++++++++
 arch/x86/include/asm/signal.h |   1 +
 arch/x86/kernel/ptrace.c      | 202 +--------------------------------
 arch/x86/kernel/signal.c      |  28 +----
 5 files changed, 257 insertions(+), 228 deletions(-)
 create mode 100644 arch/x86/entry/common.c

diff --git a/arch/x86/entry/Makefile b/arch/x86/entry/Makefile
index 7a144971db79..bd55dedd7614 100644
--- a/arch/x86/entry/Makefile
+++ b/arch/x86/entry/Makefile
@@ -2,6 +2,7 @@
 # Makefile for the x86 low level entry code
 #
 obj-y				:= entry_$(BITS).o thunk_$(BITS).o syscall_$(BITS).o
+obj-y				+= common.o
 
 obj-y				+= vdso/
 obj-y				+= vsyscall/
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
new file mode 100644
index 000000000000..917d0c3cb851
--- /dev/null
+++ b/arch/x86/entry/common.c
@@ -0,0 +1,253 @@
+/*
+ * common.c - C code for kernel entry and exit
+ * Copyright (c) 2015 Andrew Lutomirski
+ * GPL v2
+ *
+ * Based on asm and ptrace code by many authors.  The code here originated
+ * in ptrace.c and signal.c.
+ */
+
+#include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/mm.h>
+#include <linux/smp.h>
+#include <linux/errno.h>
+#include <linux/ptrace.h>
+#include <linux/tracehook.h>
+#include <linux/audit.h>
+#include <linux/seccomp.h>
+#include <linux/signal.h>
+#include <linux/export.h>
+#include <linux/context_tracking.h>
+#include <linux/user-return-notifier.h>
+#include <linux/uprobes.h>
+
+#include <asm/desc.h>
+#include <asm/traps.h>
+
+#define CREATE_TRACE_POINTS
+#include <trace/events/syscalls.h>
+
+static void do_audit_syscall_entry(struct pt_regs *regs, u32 arch)
+{
+#ifdef CONFIG_X86_64
+	if (arch == AUDIT_ARCH_X86_64) {
+		audit_syscall_entry(regs->orig_ax, regs->di,
+				    regs->si, regs->dx, regs->r10);
+	} else
+#endif
+	{
+		audit_syscall_entry(regs->orig_ax, regs->bx,
+				    regs->cx, regs->dx, regs->si);
+	}
+}
+
+/*
+ * We can return 0 to resume the syscall or anything else to go to phase
+ * 2.  If we resume the syscall, we need to put something appropriate in
+ * regs->orig_ax.
+ *
+ * NB: We don't have full pt_regs here, but regs->orig_ax and regs->ax
+ * are fully functional.
+ *
+ * For phase 2's benefit, our return value is:
+ * 0:			resume the syscall
+ * 1:			go to phase 2; no seccomp phase 2 needed
+ * anything else:	go to phase 2; pass return value to seccomp
+ */
+unsigned long syscall_trace_enter_phase1(struct pt_regs *regs, u32 arch)
+{
+	unsigned long ret = 0;
+	u32 work;
+
+	BUG_ON(regs != task_pt_regs(current));
+
+	work = ACCESS_ONCE(current_thread_info()->flags) &
+		_TIF_WORK_SYSCALL_ENTRY;
+
+	/*
+	 * If TIF_NOHZ is set, we are required to call user_exit() before
+	 * doing anything that could touch RCU.
+	 */
+	if (work & _TIF_NOHZ) {
+		user_exit();
+		work &= ~_TIF_NOHZ;
+	}
+
+#ifdef CONFIG_SECCOMP
+	/*
+	 * Do seccomp first -- it should minimize exposure of other
+	 * code, and keeping seccomp fast is probably more valuable
+	 * than the rest of this.
+	 */
+	if (work & _TIF_SECCOMP) {
+		struct seccomp_data sd;
+
+		sd.arch = arch;
+		sd.nr = regs->orig_ax;
+		sd.instruction_pointer = regs->ip;
+#ifdef CONFIG_X86_64
+		if (arch == AUDIT_ARCH_X86_64) {
+			sd.args[0] = regs->di;
+			sd.args[1] = regs->si;
+			sd.args[2] = regs->dx;
+			sd.args[3] = regs->r10;
+			sd.args[4] = regs->r8;
+			sd.args[5] = regs->r9;
+		} else
+#endif
+		{
+			sd.args[0] = regs->bx;
+			sd.args[1] = regs->cx;
+			sd.args[2] = regs->dx;
+			sd.args[3] = regs->si;
+			sd.args[4] = regs->di;
+			sd.args[5] = regs->bp;
+		}
+
+		BUILD_BUG_ON(SECCOMP_PHASE1_OK != 0);
+		BUILD_BUG_ON(SECCOMP_PHASE1_SKIP != 1);
+
+		ret = seccomp_phase1(&sd);
+		if (ret == SECCOMP_PHASE1_SKIP) {
+			regs->orig_ax = -1;
+			ret = 0;
+		} else if (ret != SECCOMP_PHASE1_OK) {
+			return ret;  /* Go directly to phase 2 */
+		}
+
+		work &= ~_TIF_SECCOMP;
+	}
+#endif
+
+	/* Do our best to finish without phase 2. */
+	if (work == 0)
+		return ret;  /* seccomp and/or nohz only (ret == 0 here) */
+
+#ifdef CONFIG_AUDITSYSCALL
+	if (work == _TIF_SYSCALL_AUDIT) {
+		/*
+		 * If there is no more work to be done except auditing,
+		 * then audit in phase 1.  Phase 2 always audits, so, if
+		 * we audit here, then we can't go on to phase 2.
+		 */
+		do_audit_syscall_entry(regs, arch);
+		return 0;
+	}
+#endif
+
+	return 1;  /* Something is enabled that we can't handle in phase 1 */
+}
+
+/* Returns the syscall nr to run (which should match regs->orig_ax). */
+long syscall_trace_enter_phase2(struct pt_regs *regs, u32 arch,
+				unsigned long phase1_result)
+{
+	long ret = 0;
+	u32 work = ACCESS_ONCE(current_thread_info()->flags) &
+		_TIF_WORK_SYSCALL_ENTRY;
+
+	BUG_ON(regs != task_pt_regs(current));
+
+	/*
+	 * If we stepped into a sysenter/syscall insn, it trapped in
+	 * kernel mode; do_debug() cleared TF and set TIF_SINGLESTEP.
+	 * If user-mode had set TF itself, then it's still clear from
+	 * do_debug() and we need to set it again to restore the user
+	 * state.  If we entered on the slow path, TF was already set.
+	 */
+	if (work & _TIF_SINGLESTEP)
+		regs->flags |= X86_EFLAGS_TF;
+
+#ifdef CONFIG_SECCOMP
+	/*
+	 * Call seccomp_phase2 before running the other hooks so that
+	 * they can see any changes made by a seccomp tracer.
+	 */
+	if (phase1_result > 1 && seccomp_phase2(phase1_result)) {
+		/* seccomp failures shouldn't expose any additional code. */
+		return -1;
+	}
+#endif
+
+	if (unlikely(work & _TIF_SYSCALL_EMU))
+		ret = -1L;
+
+	if ((ret || test_thread_flag(TIF_SYSCALL_TRACE)) &&
+	    tracehook_report_syscall_entry(regs))
+		ret = -1L;
+
+	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
+		trace_sys_enter(regs, regs->orig_ax);
+
+	do_audit_syscall_entry(regs, arch);
+
+	return ret ?: regs->orig_ax;
+}
+
+long syscall_trace_enter(struct pt_regs *regs)
+{
+	u32 arch = is_ia32_task() ? AUDIT_ARCH_I386 : AUDIT_ARCH_X86_64;
+	unsigned long phase1_result = syscall_trace_enter_phase1(regs, arch);
+
+	if (phase1_result == 0)
+		return regs->orig_ax;
+	else
+		return syscall_trace_enter_phase2(regs, arch, phase1_result);
+}
+
+void syscall_trace_leave(struct pt_regs *regs)
+{
+	bool step;
+
+	/*
+	 * We may come here right after calling schedule_user()
+	 * or do_notify_resume(), in which case we can be in RCU
+	 * user mode.
+	 */
+	user_exit();
+
+	audit_syscall_exit(regs);
+
+	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
+		trace_sys_exit(regs, regs->ax);
+
+	/*
+	 * If TIF_SYSCALL_EMU is set, we only get here because of
+	 * TIF_SINGLESTEP (i.e. this is PTRACE_SYSEMU_SINGLESTEP).
+	 * We already reported this syscall instruction in
+	 * syscall_trace_enter().
+	 */
+	step = unlikely(test_thread_flag(TIF_SINGLESTEP)) &&
+			!test_thread_flag(TIF_SYSCALL_EMU);
+	if (step || test_thread_flag(TIF_SYSCALL_TRACE))
+		tracehook_report_syscall_exit(regs, step);
+
+	user_enter();
+}
+
+/*
+ * notification of userspace execution resumption
+ * - triggered by the TIF_WORK_MASK flags
+ */
+__visible void
+do_notify_resume(struct pt_regs *regs, void *unused, __u32 thread_info_flags)
+{
+	user_exit();
+
+	if (thread_info_flags & _TIF_UPROBE)
+		uprobe_notify_resume(regs);
+
+	/* deal with pending signal delivery */
+	if (thread_info_flags & _TIF_SIGPENDING)
+		do_signal(regs);
+
+	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
+		clear_thread_flag(TIF_NOTIFY_RESUME);
+		tracehook_notify_resume(regs);
+	}
+	if (thread_info_flags & _TIF_USER_RETURN_NOTIFY)
+		fire_user_return_notifiers();
+
+	user_enter();
+}
diff --git a/arch/x86/include/asm/signal.h b/arch/x86/include/asm/signal.h
index 31eab867e6d3..b42408bcf6b5 100644
--- a/arch/x86/include/asm/signal.h
+++ b/arch/x86/include/asm/signal.h
@@ -30,6 +30,7 @@ typedef sigset_t compat_sigset_t;
 #endif /* __ASSEMBLY__ */
 #include <uapi/asm/signal.h>
 #ifndef __ASSEMBLY__
+extern void do_signal(struct pt_regs *regs);
 extern void do_notify_resume(struct pt_regs *, void *, __u32);
 
 #define __ARCH_HAS_SA_RESTORER
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 9be72bc3613f..4aa1ab6435d3 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -37,12 +37,10 @@
 #include <asm/proto.h>
 #include <asm/hw_breakpoint.h>
 #include <asm/traps.h>
+#include <asm/syscall.h>
 
 #include "tls.h"
 
-#define CREATE_TRACE_POINTS
-#include <trace/events/syscalls.h>
-
 enum x86_regset {
 	REGSET_GENERAL,
 	REGSET_FP,
@@ -1434,201 +1432,3 @@ void send_sigtrap(struct task_struct *tsk, struct pt_regs *regs,
 	/* Send us the fake SIGTRAP */
 	force_sig_info(SIGTRAP, &info, tsk);
 }
-
-static void do_audit_syscall_entry(struct pt_regs *regs, u32 arch)
-{
-#ifdef CONFIG_X86_64
-	if (arch == AUDIT_ARCH_X86_64) {
-		audit_syscall_entry(regs->orig_ax, regs->di,
-				    regs->si, regs->dx, regs->r10);
-	} else
-#endif
-	{
-		audit_syscall_entry(regs->orig_ax, regs->bx,
-				    regs->cx, regs->dx, regs->si);
-	}
-}
-
-/*
- * We can return 0 to resume the syscall or anything else to go to phase
- * 2.  If we resume the syscall, we need to put something appropriate in
- * regs->orig_ax.
- *
- * NB: We don't have full pt_regs here, but regs->orig_ax and regs->ax
- * are fully functional.
- *
- * For phase 2's benefit, our return value is:
- * 0:			resume the syscall
- * 1:			go to phase 2; no seccomp phase 2 needed
- * anything else:	go to phase 2; pass return value to seccomp
- */
-unsigned long syscall_trace_enter_phase1(struct pt_regs *regs, u32 arch)
-{
-	unsigned long ret = 0;
-	u32 work;
-
-	BUG_ON(regs != task_pt_regs(current));
-
-	work = ACCESS_ONCE(current_thread_info()->flags) &
-		_TIF_WORK_SYSCALL_ENTRY;
-
-	/*
-	 * If TIF_NOHZ is set, we are required to call user_exit() before
-	 * doing anything that could touch RCU.
-	 */
-	if (work & _TIF_NOHZ) {
-		user_exit();
-		work &= ~_TIF_NOHZ;
-	}
-
-#ifdef CONFIG_SECCOMP
-	/*
-	 * Do seccomp first -- it should minimize exposure of other
-	 * code, and keeping seccomp fast is probably more valuable
-	 * than the rest of this.
-	 */
-	if (work & _TIF_SECCOMP) {
-		struct seccomp_data sd;
-
-		sd.arch = arch;
-		sd.nr = regs->orig_ax;
-		sd.instruction_pointer = regs->ip;
-#ifdef CONFIG_X86_64
-		if (arch == AUDIT_ARCH_X86_64) {
-			sd.args[0] = regs->di;
-			sd.args[1] = regs->si;
-			sd.args[2] = regs->dx;
-			sd.args[3] = regs->r10;
-			sd.args[4] = regs->r8;
-			sd.args[5] = regs->r9;
-		} else
-#endif
-		{
-			sd.args[0] = regs->bx;
-			sd.args[1] = regs->cx;
-			sd.args[2] = regs->dx;
-			sd.args[3] = regs->si;
-			sd.args[4] = regs->di;
-			sd.args[5] = regs->bp;
-		}
-
-		BUILD_BUG_ON(SECCOMP_PHASE1_OK != 0);
-		BUILD_BUG_ON(SECCOMP_PHASE1_SKIP != 1);
-
-		ret = seccomp_phase1(&sd);
-		if (ret == SECCOMP_PHASE1_SKIP) {
-			regs->orig_ax = -1;
-			ret = 0;
-		} else if (ret != SECCOMP_PHASE1_OK) {
-			return ret;  /* Go directly to phase 2 */
-		}
-
-		work &= ~_TIF_SECCOMP;
-	}
-#endif
-
-	/* Do our best to finish without phase 2. */
-	if (work == 0)
-		return ret;  /* seccomp and/or nohz only (ret == 0 here) */
-
-#ifdef CONFIG_AUDITSYSCALL
-	if (work == _TIF_SYSCALL_AUDIT) {
-		/*
-		 * If there is no more work to be done except auditing,
-		 * then audit in phase 1.  Phase 2 always audits, so, if
-		 * we audit here, then we can't go on to phase 2.
-		 */
-		do_audit_syscall_entry(regs, arch);
-		return 0;
-	}
-#endif
-
-	return 1;  /* Something is enabled that we can't handle in phase 1 */
-}
-
-/* Returns the syscall nr to run (which should match regs->orig_ax). */
-long syscall_trace_enter_phase2(struct pt_regs *regs, u32 arch,
-				unsigned long phase1_result)
-{
-	long ret = 0;
-	u32 work = ACCESS_ONCE(current_thread_info()->flags) &
-		_TIF_WORK_SYSCALL_ENTRY;
-
-	BUG_ON(regs != task_pt_regs(current));
-
-	/*
-	 * If we stepped into a sysenter/syscall insn, it trapped in
-	 * kernel mode; do_debug() cleared TF and set TIF_SINGLESTEP.
-	 * If user-mode had set TF itself, then it's still clear from
-	 * do_debug() and we need to set it again to restore the user
-	 * state.  If we entered on the slow path, TF was already set.
-	 */
-	if (work & _TIF_SINGLESTEP)
-		regs->flags |= X86_EFLAGS_TF;
-
-#ifdef CONFIG_SECCOMP
-	/*
-	 * Call seccomp_phase2 before running the other hooks so that
-	 * they can see any changes made by a seccomp tracer.
-	 */
-	if (phase1_result > 1 && seccomp_phase2(phase1_result)) {
-		/* seccomp failures shouldn't expose any additional code. */
-		return -1;
-	}
-#endif
-
-	if (unlikely(work & _TIF_SYSCALL_EMU))
-		ret = -1L;
-
-	if ((ret || test_thread_flag(TIF_SYSCALL_TRACE)) &&
-	    tracehook_report_syscall_entry(regs))
-		ret = -1L;
-
-	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
-		trace_sys_enter(regs, regs->orig_ax);
-
-	do_audit_syscall_entry(regs, arch);
-
-	return ret ?: regs->orig_ax;
-}
-
-long syscall_trace_enter(struct pt_regs *regs)
-{
-	u32 arch = is_ia32_task() ? AUDIT_ARCH_I386 : AUDIT_ARCH_X86_64;
-	unsigned long phase1_result = syscall_trace_enter_phase1(regs, arch);
-
-	if (phase1_result == 0)
-		return regs->orig_ax;
-	else
-		return syscall_trace_enter_phase2(regs, arch, phase1_result);
-}
-
-void syscall_trace_leave(struct pt_regs *regs)
-{
-	bool step;
-
-	/*
-	 * We may come here right after calling schedule_user()
-	 * or do_notify_resume(), in which case we can be in RCU
-	 * user mode.
-	 */
-	user_exit();
-
-	audit_syscall_exit(regs);
-
-	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
-		trace_sys_exit(regs, regs->ax);
-
-	/*
-	 * If TIF_SYSCALL_EMU is set, we only get here because of
-	 * TIF_SINGLESTEP (i.e. this is PTRACE_SYSEMU_SINGLESTEP).
-	 * We already reported this syscall instruction in
-	 * syscall_trace_enter().
-	 */
-	step = unlikely(test_thread_flag(TIF_SINGLESTEP)) &&
-			!test_thread_flag(TIF_SYSCALL_EMU);
-	if (step || test_thread_flag(TIF_SYSCALL_TRACE))
-		tracehook_report_syscall_exit(regs, step);
-
-	user_enter();
-}
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 206996c1669d..197c44e8ff8b 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -701,7 +701,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
  * want to handle. Thus you cannot kill init even with a SIGKILL even by
  * mistake.
  */
-static void do_signal(struct pt_regs *regs)
+void do_signal(struct pt_regs *regs)
 {
 	struct ksignal ksig;
 
@@ -736,32 +736,6 @@ static void do_signal(struct pt_regs *regs)
 	restore_saved_sigmask();
 }
 
-/*
- * notification of userspace execution resumption
- * - triggered by the TIF_WORK_MASK flags
- */
-__visible void
-do_notify_resume(struct pt_regs *regs, void *unused, __u32 thread_info_flags)
-{
-	user_exit();
-
-	if (thread_info_flags & _TIF_UPROBE)
-		uprobe_notify_resume(regs);
-
-	/* deal with pending signal delivery */
-	if (thread_info_flags & _TIF_SIGPENDING)
-		do_signal(regs);
-
-	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
-		clear_thread_flag(TIF_NOTIFY_RESUME);
-		tracehook_notify_resume(regs);
-	}
-	if (thread_info_flags & _TIF_USER_RETURN_NOTIFY)
-		fire_user_return_notifiers();
-
-	user_enter();
-}
-
 void signal_fault(struct pt_regs *regs, void __user *frame, char *where)
 {
 	struct task_struct *me = current;
-- 
2.4.3


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v5 07/17] x86/traps: Assert that we're in CONTEXT_KERNEL in exception entries
  2015-07-03 19:44 [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Andy Lutomirski
                   ` (5 preceding siblings ...)
  2015-07-03 19:44 ` [PATCH v5 06/17] x86: Move C entry and exit code to arch/x86/entry/common.c Andy Lutomirski
@ 2015-07-03 19:44 ` Andy Lutomirski
  2015-07-07 10:51   ` [tip:x86/asm] x86/traps, context_tracking: Assert that we' re " tip-bot for Andy Lutomirski
  2015-07-03 19:44 ` [PATCH v5 08/17] x86/entry: Add enter_from_user_mode and use it in syscalls Andy Lutomirski
                   ` (10 subsequent siblings)
  17 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-03 19:44 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Frédéric Weisbecker, Rik van Riel, Oleg Nesterov,
	Denys Vlasenko, Borislav Petkov, Kees Cook, Brian Gerst, paulmck,
	Andy Lutomirski

Other than the super-atomic exception entries, all exception entries
are supposed to switch our context tracking state to CONTEXT_KERNEL.
Assert that they do.  These assertions appear trivial at this point,
as exception_enter is the function responsible for switching
context, but I'm planning on reworking x86's exception context
tracking, and these assertions will help make sure that all of this
code keeps working.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/kernel/traps.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index f5791927aa64..2a783c4fe0e9 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -292,6 +292,8 @@ static void do_error_trap(struct pt_regs *regs, long error_code, char *str,
 	enum ctx_state prev_state = exception_enter();
 	siginfo_t info;
 
+	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
+
 	if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) !=
 			NOTIFY_STOP) {
 		conditional_sti(regs);
@@ -376,6 +378,7 @@ dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
 	siginfo_t *info;
 
 	prev_state = exception_enter();
+	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	if (notify_die(DIE_TRAP, "bounds", regs, error_code,
 			X86_TRAP_BR, SIGSEGV) == NOTIFY_STOP)
 		goto exit;
@@ -457,6 +460,7 @@ do_general_protection(struct pt_regs *regs, long error_code)
 	enum ctx_state prev_state;
 
 	prev_state = exception_enter();
+	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	conditional_sti(regs);
 
 	if (v8086_mode(regs)) {
@@ -514,6 +518,7 @@ dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code)
 		return;
 
 	prev_state = ist_enter(regs);
+	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 #ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
 	if (kgdb_ll_trap(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
 				SIGTRAP) == NOTIFY_STOP)
@@ -750,6 +755,7 @@ dotraplinkage void do_coprocessor_error(struct pt_regs *regs, long error_code)
 	enum ctx_state prev_state;
 
 	prev_state = exception_enter();
+	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	math_error(regs, error_code, X86_TRAP_MF);
 	exception_exit(prev_state);
 }
@@ -760,6 +766,7 @@ do_simd_coprocessor_error(struct pt_regs *regs, long error_code)
 	enum ctx_state prev_state;
 
 	prev_state = exception_enter();
+	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	math_error(regs, error_code, X86_TRAP_XF);
 	exception_exit(prev_state);
 }
@@ -776,6 +783,7 @@ do_device_not_available(struct pt_regs *regs, long error_code)
 	enum ctx_state prev_state;
 
 	prev_state = exception_enter();
+	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	BUG_ON(use_eager_fpu());
 
 #ifdef CONFIG_MATH_EMULATION
@@ -805,6 +813,7 @@ dotraplinkage void do_iret_error(struct pt_regs *regs, long error_code)
 	enum ctx_state prev_state;
 
 	prev_state = exception_enter();
+	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	local_irq_enable();
 
 	info.si_signo = SIGILL;
-- 
2.4.3


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v5 08/17] x86/entry: Add enter_from_user_mode and use it in syscalls
  2015-07-03 19:44 [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Andy Lutomirski
                   ` (6 preceding siblings ...)
  2015-07-03 19:44 ` [PATCH v5 07/17] x86/traps: Assert that we're in CONTEXT_KERNEL in exception entries Andy Lutomirski
@ 2015-07-03 19:44 ` Andy Lutomirski
  2015-07-07 10:51   ` [tip:x86/asm] x86/entry: Add enter_from_user_mode() " tip-bot for Andy Lutomirski
  2015-12-21 20:50   ` [PATCH v5 08/17] x86/entry: Add enter_from_user_mode " Sasha Levin
  2015-07-03 19:44 ` [PATCH v5 09/17] x86/entry: Add new, comprehensible entry and exit hooks Andy Lutomirski
                   ` (9 subsequent siblings)
  17 siblings, 2 replies; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-03 19:44 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Frédéric Weisbecker, Rik van Riel, Oleg Nesterov,
	Denys Vlasenko, Borislav Petkov, Kees Cook, Brian Gerst, paulmck,
	Andy Lutomirski

Changing the x86 context tracking hooks is dangerous because there
are no good checks that we track our context correctly.  Add a
helper to check that we're actually in CONTEXT_USER when we enter
from user mode and wire it up for syscall entries.

Subsequent patches will wire this up for all non-NMI entries as
well.  NMIs are their own special beast and cannot currently switch
overall context tracking state.  Instead, they have their own
special RCU hooks.

This is a tiny speedup if !CONFIG_CONTEXT_TRACKING (removes a
branch) and a tiny slowdown if CONFIG_CONTEXT_TRACING (adds a layer
of indirection).  Eventually, we should fix up the core context
tracking code to supply a function that does what we want (and can
be much simpler than user_exit), which will enable us to get rid of
the extra call.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/entry/common.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 917d0c3cb851..9a327ee24eef 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -28,6 +28,15 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/syscalls.h>
 
+#ifdef CONFIG_CONTEXT_TRACKING
+/* Called on entry from user mode with IRQs off. */
+__visible void enter_from_user_mode(void)
+{
+	CT_WARN_ON(ct_state() != CONTEXT_USER);
+	user_exit();
+}
+#endif
+
 static void do_audit_syscall_entry(struct pt_regs *regs, u32 arch)
 {
 #ifdef CONFIG_X86_64
@@ -65,14 +74,16 @@ unsigned long syscall_trace_enter_phase1(struct pt_regs *regs, u32 arch)
 	work = ACCESS_ONCE(current_thread_info()->flags) &
 		_TIF_WORK_SYSCALL_ENTRY;
 
+#ifdef CONFIG_CONTEXT_TRACKING
 	/*
 	 * If TIF_NOHZ is set, we are required to call user_exit() before
 	 * doing anything that could touch RCU.
 	 */
 	if (work & _TIF_NOHZ) {
-		user_exit();
+		enter_from_user_mode();
 		work &= ~_TIF_NOHZ;
 	}
+#endif
 
 #ifdef CONFIG_SECCOMP
 	/*
-- 
2.4.3


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v5 09/17] x86/entry: Add new, comprehensible entry and exit hooks
  2015-07-03 19:44 [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Andy Lutomirski
                   ` (7 preceding siblings ...)
  2015-07-03 19:44 ` [PATCH v5 08/17] x86/entry: Add enter_from_user_mode and use it in syscalls Andy Lutomirski
@ 2015-07-03 19:44 ` Andy Lutomirski
  2015-07-07 10:51   ` [tip:x86/asm] x86/entry: Add new, comprehensible entry and exit handlers written in C tip-bot for Andy Lutomirski
  2015-07-03 19:44 ` [PATCH v5 10/17] x86/entry/64: Really create an error-entry-from-usermode code path Andy Lutomirski
                   ` (8 subsequent siblings)
  17 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-03 19:44 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Frédéric Weisbecker, Rik van Riel, Oleg Nesterov,
	Denys Vlasenko, Borislav Petkov, Kees Cook, Brian Gerst, paulmck,
	Andy Lutomirski

The current entry and exit code is incomprehensible, appears to work
primary by luck, and is very difficult to incrementally improve.  Add
new code in preparation for simply deleting the old code.

prepare_exit_to_usermode is a new function that will handle all slow
path exits to user mode.  It is called with IRQs disabled and it
leaves us in a state in which it is safe to immediately return to
user mode.  IRQs must not be re-enabled at any point after
prepare_exit_to_usermode returns and user mode is actually entered.
(We can, of course, fail to enter user mode and treat that failure
as a fresh entry to kernel mode.)  All callers of do_notify_resume
will be migrated to call prepare_exit_to_usermode instead;
prepare_exit_to_usermode needs to do everything that
do_notify_resume does, but it also takes care of scheduling and
context tracking.  Unlike do_notify_resume, it does not need to be
called in a loop.

syscall_return_slowpath is exactly what it sounds like.  It will be
called on any syscall exit slow path.  It will replaces
syscall_trace_leave and it calls prepare_exit_to_usermode on the way
out.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/entry/common.c | 112 +++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 111 insertions(+), 1 deletion(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 9a327ee24eef..febc53086a69 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -207,6 +207,7 @@ long syscall_trace_enter(struct pt_regs *regs)
 		return syscall_trace_enter_phase2(regs, arch, phase1_result);
 }
 
+/* Deprecated. */
 void syscall_trace_leave(struct pt_regs *regs)
 {
 	bool step;
@@ -237,8 +238,117 @@ void syscall_trace_leave(struct pt_regs *regs)
 	user_enter();
 }
 
+static struct thread_info *pt_regs_to_thread_info(struct pt_regs *regs)
+{
+	unsigned long top_of_stack =
+		(unsigned long)(regs + 1) + TOP_OF_KERNEL_STACK_PADDING;
+	return (struct thread_info *)(top_of_stack - THREAD_SIZE);
+}
+
+/* Called with IRQs disabled. */
+__visible void prepare_exit_to_usermode(struct pt_regs *regs)
+{
+	if (WARN_ON(!irqs_disabled()))
+		local_irq_disable();
+
+	/*
+	 * In order to return to user mode, we need to have IRQs off with
+	 * none of _TIF_SIGPENDING, _TIF_NOTIFY_RESUME, _TIF_USER_RETURN_NOTIFY,
+	 * _TIF_UPROBE, or _TIF_NEED_RESCHED set.  Several of these flags
+	 * can be set at any time on preemptable kernels if we have IRQs on,
+	 * so we need to loop.  Disabling preemption wouldn't help: doing the
+	 * work to clear some of the flags can sleep.
+	 */
+	while (true) {
+		u32 cached_flags =
+			READ_ONCE(pt_regs_to_thread_info(regs)->flags);
+
+		if (!(cached_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_RESUME |
+				      _TIF_UPROBE | _TIF_NEED_RESCHED)))
+			break;
+
+		/* We have work to do. */
+		local_irq_enable();
+
+		if (cached_flags & _TIF_NEED_RESCHED)
+			schedule();
+
+		if (cached_flags & _TIF_UPROBE)
+			uprobe_notify_resume(regs);
+
+		/* deal with pending signal delivery */
+		if (cached_flags & _TIF_SIGPENDING)
+			do_signal(regs);
+
+		if (cached_flags & _TIF_NOTIFY_RESUME) {
+			clear_thread_flag(TIF_NOTIFY_RESUME);
+			tracehook_notify_resume(regs);
+		}
+
+		if (cached_flags & _TIF_USER_RETURN_NOTIFY)
+			fire_user_return_notifiers();
+
+		/* Disable IRQs and retry */
+		local_irq_disable();
+	}
+
+	user_enter();
+}
+
+/*
+ * Called with IRQs on and fully valid regs.  Returns with IRQs off in a
+ * state such that we can immediately switch to user mode.
+ */
+__visible void syscall_return_slowpath(struct pt_regs *regs)
+{
+	struct thread_info *ti = pt_regs_to_thread_info(regs);
+	u32 cached_flags = READ_ONCE(ti->flags);
+	bool step;
+
+	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
+
+	if (WARN(irqs_disabled(), "syscall %ld left IRQs disabled",
+		 regs->orig_ax))
+		local_irq_enable();
+
+	/*
+	 * First do one-time work.  If these work items are enabled, we
+	 * want to run them exactly once per syscall exit with IRQs on.
+	 */
+	if (cached_flags & (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT |
+			    _TIF_SINGLESTEP | _TIF_SYSCALL_TRACEPOINT)) {
+		audit_syscall_exit(regs);
+
+		if (cached_flags & _TIF_SYSCALL_TRACEPOINT)
+			trace_sys_exit(regs, regs->ax);
+
+		/*
+		 * If TIF_SYSCALL_EMU is set, we only get here because of
+		 * TIF_SINGLESTEP (i.e. this is PTRACE_SYSEMU_SINGLESTEP).
+		 * We already reported this syscall instruction in
+		 * syscall_trace_enter().
+		 */
+		step = unlikely(
+			(cached_flags & (_TIF_SINGLESTEP | _TIF_SYSCALL_EMU))
+			== _TIF_SINGLESTEP);
+		if (step || cached_flags & _TIF_SYSCALL_TRACE)
+			tracehook_report_syscall_exit(regs, step);
+	}
+
+#ifdef CONFIG_COMPAT
+	/*
+	 * Compat syscalls set TS_COMPAT.  Make sure we clear it before
+	 * returning to user mode.
+	 */
+	ti->status &= ~TS_COMPAT;
+#endif
+
+	local_irq_disable();
+	prepare_exit_to_usermode(regs);
+}
+
 /*
- * notification of userspace execution resumption
+ * Deprecated notification of userspace execution resumption
  * - triggered by the TIF_WORK_MASK flags
  */
 __visible void
-- 
2.4.3


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v5 10/17] x86/entry/64: Really create an error-entry-from-usermode code path
  2015-07-03 19:44 [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Andy Lutomirski
                   ` (8 preceding siblings ...)
  2015-07-03 19:44 ` [PATCH v5 09/17] x86/entry: Add new, comprehensible entry and exit hooks Andy Lutomirski
@ 2015-07-03 19:44 ` Andy Lutomirski
  2015-07-07 10:52   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
  2015-07-03 19:44 ` [PATCH v5 11/17] x86/entry/64: Migrate 64-bit and compat syscalls to new exit hooks Andy Lutomirski
                   ` (7 subsequent siblings)
  17 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-03 19:44 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Frédéric Weisbecker, Rik van Riel, Oleg Nesterov,
	Denys Vlasenko, Borislav Petkov, Kees Cook, Brian Gerst, paulmck,
	Andy Lutomirski

In 539f51136500 ("x86/asm/entry/64: Disentangle error_entry/exit
gsbase/ebx/usermode code"), I arranged the code slightly wrong --
IRET faults would skip the code path that was intended to execute on
all error entries from user mode.  Fix it up.

While we're at it, make all the labels in error_entry local.

This does not fix a bug, but we'll need it, and it slightly shrinks
the code.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/entry/entry_64.S | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 141a5d49dddc..ccfcba90de6e 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1143,12 +1143,17 @@ ENTRY(error_entry)
 	SAVE_EXTRA_REGS 8
 	xorl	%ebx, %ebx
 	testb	$3, CS+8(%rsp)
-	jz	error_kernelspace
+	jz	.Lerror_kernelspace
 
-	/* We entered from user mode */
+.Lerror_entry_from_usermode_swapgs:
+	/*
+	 * We entered from user mode or we're pretending to have entered
+	 * from user mode due to an IRET fault.
+	 */
 	SWAPGS
 
-error_entry_done:
+.Lerror_entry_from_usermode_after_swapgs:
+.Lerror_entry_done:
 	TRACE_IRQS_OFF
 	ret
 
@@ -1158,31 +1163,30 @@ error_entry_done:
 	 * truncated RIP for IRET exceptions returning to compat mode. Check
 	 * for these here too.
 	 */
-error_kernelspace:
+.Lerror_kernelspace:
 	incl	%ebx
 	leaq	native_irq_return_iret(%rip), %rcx
 	cmpq	%rcx, RIP+8(%rsp)
-	je	error_bad_iret
+	je	.Lerror_bad_iret
 	movl	%ecx, %eax			/* zero extend */
 	cmpq	%rax, RIP+8(%rsp)
-	je	bstep_iret
+	je	.Lbstep_iret
 	cmpq	$gs_change, RIP+8(%rsp)
-	jne	error_entry_done
+	jne	.Lerror_entry_done
 
 	/*
 	 * hack: gs_change can fail with user gsbase.  If this happens, fix up
 	 * gsbase and proceed.  We'll fix up the exception and land in
 	 * gs_change's error handler with kernel gsbase.
 	 */
-	SWAPGS
-	jmp	error_entry_done
+	jmp	.Lerror_entry_from_usermode_swapgs
 
-bstep_iret:
+.Lbstep_iret:
 	/* Fix truncated RIP */
 	movq	%rcx, RIP+8(%rsp)
 	/* fall through */
 
-error_bad_iret:
+.Lerror_bad_iret:
 	/*
 	 * We came from an IRET to user mode, so we have user gsbase.
 	 * Switch to kernel gsbase:
@@ -1198,7 +1202,7 @@ error_bad_iret:
 	call	fixup_bad_iret
 	mov	%rax, %rsp
 	decl	%ebx
-	jmp	error_entry_done
+	jmp	.Lerror_entry_from_usermode_after_swapgs
 END(error_entry)
 
 
-- 
2.4.3


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v5 11/17] x86/entry/64: Migrate 64-bit and compat syscalls to new exit hooks
  2015-07-03 19:44 [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Andy Lutomirski
                   ` (9 preceding siblings ...)
  2015-07-03 19:44 ` [PATCH v5 10/17] x86/entry/64: Really create an error-entry-from-usermode code path Andy Lutomirski
@ 2015-07-03 19:44 ` Andy Lutomirski
  2015-07-07 10:52   ` [tip:x86/asm] x86/entry/64: Migrate 64-bit and compat syscalls to the new exit handlers and remove old assembly code tip-bot for Andy Lutomirski
  2015-07-03 19:44 ` [PATCH v5 12/17] x86/asm/entry/64: Save all regs on interrupt entry Andy Lutomirski
                   ` (6 subsequent siblings)
  17 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-03 19:44 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Frédéric Weisbecker, Rik van Riel, Oleg Nesterov,
	Denys Vlasenko, Borislav Petkov, Kees Cook, Brian Gerst, paulmck,
	Andy Lutomirski

These need to be migrated together, as the compat case used to jump
into the middle of the 64-bit exit code.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/entry/entry_64.S        | 69 +++++-----------------------------------
 arch/x86/entry/entry_64_compat.S |  6 ++--
 2 files changed, 11 insertions(+), 64 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index ccfcba90de6e..4ca5b782ed70 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -229,6 +229,11 @@ entry_SYSCALL_64_fastpath:
 	 */
 	USERGS_SYSRET64
 
+GLOBAL(int_ret_from_sys_call_irqs_off)
+	TRACE_IRQS_ON
+	ENABLE_INTERRUPTS(CLBR_NONE)
+	jmp int_ret_from_sys_call
+
 	/* Do syscall entry tracing */
 tracesys:
 	movq	%rsp, %rdi
@@ -272,69 +277,11 @@ tracesys_phase2:
  * Has correct iret frame.
  */
 GLOBAL(int_ret_from_sys_call)
-	DISABLE_INTERRUPTS(CLBR_NONE)
-int_ret_from_sys_call_irqs_off: /* jumps come here from the irqs-off SYSRET path */
-	TRACE_IRQS_OFF
-	movl	$_TIF_ALLWORK_MASK, %edi
-	/* edi:	mask to check */
-GLOBAL(int_with_check)
-	LOCKDEP_SYS_EXIT_IRQ
-	GET_THREAD_INFO(%rcx)
-	movl	TI_flags(%rcx), %edx
-	andl	%edi, %edx
-	jnz	int_careful
-	andl	$~TS_COMPAT, TI_status(%rcx)
-	jmp	syscall_return
-
-	/*
-	 * Either reschedule or signal or syscall exit tracking needed.
-	 * First do a reschedule test.
-	 * edx:	work, edi: workmask
-	 */
-int_careful:
-	bt	$TIF_NEED_RESCHED, %edx
-	jnc	int_very_careful
-	TRACE_IRQS_ON
-	ENABLE_INTERRUPTS(CLBR_NONE)
-	pushq	%rdi
-	SCHEDULE_USER
-	popq	%rdi
-	DISABLE_INTERRUPTS(CLBR_NONE)
-	TRACE_IRQS_OFF
-	jmp	int_with_check
-
-	/* handle signals and tracing -- both require a full pt_regs */
-int_very_careful:
-	TRACE_IRQS_ON
-	ENABLE_INTERRUPTS(CLBR_NONE)
 	SAVE_EXTRA_REGS
-	/* Check for syscall exit trace */
-	testl	$_TIF_WORK_SYSCALL_EXIT, %edx
-	jz	int_signal
-	pushq	%rdi
-	leaq	8(%rsp), %rdi			/* &ptregs -> arg1 */
-	call	syscall_trace_leave
-	popq	%rdi
-	andl	$~(_TIF_WORK_SYSCALL_EXIT|_TIF_SYSCALL_EMU), %edi
-	jmp	int_restore_rest
-
-int_signal:
-	testl	$_TIF_DO_NOTIFY_MASK, %edx
-	jz	1f
-	movq	%rsp, %rdi			/* &ptregs -> arg1 */
-	xorl	%esi, %esi			/* oldset -> arg2 */
-	call	do_notify_resume
-1:	movl	$_TIF_WORK_MASK, %edi
-int_restore_rest:
+	movq	%rsp, %rdi
+	call	syscall_return_slowpath	/* returns with IRQs disabled */
 	RESTORE_EXTRA_REGS
-	DISABLE_INTERRUPTS(CLBR_NONE)
-	TRACE_IRQS_OFF
-	jmp	int_with_check
-
-syscall_return:
-	/* The IRETQ could re-enable interrupts: */
-	DISABLE_INTERRUPTS(CLBR_ANY)
-	TRACE_IRQS_IRETQ
+	TRACE_IRQS_IRETQ		/* we're about to change IF */
 
 	/*
 	 * Try to use SYSRET instead of IRET if we're returning to
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index efe0b1e499fa..204528cf4359 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -209,10 +209,10 @@ sysexit_from_sys_call:
 	.endm
 
 	.macro auditsys_exit exit
-	testl	$(_TIF_ALLWORK_MASK & ~_TIF_SYSCALL_AUDIT), ASM_THREAD_INFO(TI_flags, %rsp, SIZEOF_PTREGS)
-	jnz	ia32_ret_from_sys_call
 	TRACE_IRQS_ON
 	ENABLE_INTERRUPTS(CLBR_NONE)
+	testl	$(_TIF_ALLWORK_MASK & ~_TIF_SYSCALL_AUDIT), ASM_THREAD_INFO(TI_flags, %rsp, SIZEOF_PTREGS)
+	jnz	ia32_ret_from_sys_call
 	movl	%eax, %esi		/* second arg, syscall return value */
 	cmpl	$-MAX_ERRNO, %eax	/* is it an error ? */
 	jbe	1f
@@ -231,7 +231,7 @@ sysexit_from_sys_call:
 	movq	%rax, R10(%rsp)
 	movq	%rax, R9(%rsp)
 	movq	%rax, R8(%rsp)
-	jmp	int_with_check
+	jmp	int_ret_from_sys_call_irqs_off
 	.endm
 
 sysenter_auditsys:
-- 
2.4.3


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v5 12/17] x86/asm/entry/64: Save all regs on interrupt entry
  2015-07-03 19:44 [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Andy Lutomirski
                   ` (10 preceding siblings ...)
  2015-07-03 19:44 ` [PATCH v5 11/17] x86/entry/64: Migrate 64-bit and compat syscalls to new exit hooks Andy Lutomirski
@ 2015-07-03 19:44 ` Andy Lutomirski
  2015-07-07 10:52   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
  2015-07-03 19:44 ` [PATCH v5 13/17] x86/asm/entry/64: Simplify irq stack pt_regs handling Andy Lutomirski
                   ` (5 subsequent siblings)
  17 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-03 19:44 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Frédéric Weisbecker, Rik van Riel, Oleg Nesterov,
	Denys Vlasenko, Borislav Petkov, Kees Cook, Brian Gerst, paulmck,
	Andy Lutomirski

To prepare for the big rewrite of the error and interrupt exit
paths, we will need pt_regs completely filled in.  It's already
completely filled in when error_exit runs, so rearrange interrupt
handling to match it.  This will slow down interrupt handling very
slightly (eight instructions), but the simplification it enables
will be more than worth it.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/entry/calling.h  |  3 ---
 arch/x86/entry/entry_64.S | 29 +++++++++--------------------
 2 files changed, 9 insertions(+), 23 deletions(-)

diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index f4e6308c4200..f5eda6ecbca3 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -135,9 +135,6 @@ For 32-bit we have the following conventions - kernel is built with
 	movq %rbp, 4*8+\offset(%rsp)
 	movq %rbx, 5*8+\offset(%rsp)
 	.endm
-	.macro SAVE_EXTRA_REGS_RBP offset=0
-	movq %rbp, 4*8+\offset(%rsp)
-	.endm
 
 	.macro RESTORE_EXTRA_REGS offset=0
 	movq 0*8+\offset(%rsp), %r15
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 4ca5b782ed70..65029f48bcc4 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -502,21 +502,13 @@ END(irq_entries_start)
 /* 0(%rsp): ~(interrupt number) */
 	.macro interrupt func
 	cld
-	/*
-	 * Since nothing in interrupt handling code touches r12...r15 members
-	 * of "struct pt_regs", and since interrupts can nest, we can save
-	 * four stack slots and simultaneously provide
-	 * an unwind-friendly stack layout by saving "truncated" pt_regs
-	 * exactly up to rbp slot, without these members.
-	 */
-	ALLOC_PT_GPREGS_ON_STACK -RBP
-	SAVE_C_REGS -RBP
-	/* this goes to 0(%rsp) for unwinder, not for saving the value: */
-	SAVE_EXTRA_REGS_RBP -RBP
+	ALLOC_PT_GPREGS_ON_STACK
+	SAVE_C_REGS
+	SAVE_EXTRA_REGS
 
-	leaq	-RBP(%rsp), %rdi		/* arg1 for \func (pointer to pt_regs) */
+	movq	%rsp,%rdi	/* arg1 for \func (pointer to pt_regs) */
 
-	testb	$3, CS-RBP(%rsp)
+	testb	$3, CS(%rsp)
 	jz	1f
 	SWAPGS
 1:
@@ -553,9 +545,7 @@ ret_from_intr:
 	decl	PER_CPU_VAR(irq_count)
 
 	/* Restore saved previous stack */
-	popq	%rsi
-	/* return code expects complete pt_regs - adjust rsp accordingly: */
-	leaq	-RBP(%rsi), %rsp
+	popq	%rsp
 
 	testb	$3, CS(%rsp)
 	jz	retint_kernel
@@ -580,7 +570,7 @@ retint_swapgs:					/* return to user-space */
 	TRACE_IRQS_IRETQ
 
 	SWAPGS
-	jmp	restore_c_regs_and_iret
+	jmp	restore_regs_and_iret
 
 /* Returning to kernel space */
 retint_kernel:
@@ -604,6 +594,8 @@ retint_kernel:
  * At this label, code paths which return to kernel and to user,
  * which come from interrupts/exception and from syscalls, merge.
  */
+restore_regs_and_iret:
+	RESTORE_EXTRA_REGS
 restore_c_regs_and_iret:
 	RESTORE_C_REGS
 	REMOVE_PT_GPREGS_FROM_STACK 8
@@ -674,12 +666,10 @@ retint_signal:
 	jz	retint_swapgs
 	TRACE_IRQS_ON
 	ENABLE_INTERRUPTS(CLBR_NONE)
-	SAVE_EXTRA_REGS
 	movq	$-1, ORIG_RAX(%rsp)
 	xorl	%esi, %esi			/* oldset */
 	movq	%rsp, %rdi			/* &pt_regs */
 	call	do_notify_resume
-	RESTORE_EXTRA_REGS
 	DISABLE_INTERRUPTS(CLBR_NONE)
 	TRACE_IRQS_OFF
 	GET_THREAD_INFO(%rcx)
@@ -1160,7 +1150,6 @@ END(error_entry)
  */
 ENTRY(error_exit)
 	movl	%ebx, %eax
-	RESTORE_EXTRA_REGS
 	DISABLE_INTERRUPTS(CLBR_NONE)
 	TRACE_IRQS_OFF
 	testl	%eax, %eax
-- 
2.4.3


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v5 13/17] x86/asm/entry/64: Simplify irq stack pt_regs handling
  2015-07-03 19:44 [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Andy Lutomirski
                   ` (11 preceding siblings ...)
  2015-07-03 19:44 ` [PATCH v5 12/17] x86/asm/entry/64: Save all regs on interrupt entry Andy Lutomirski
@ 2015-07-03 19:44 ` Andy Lutomirski
  2015-07-07 10:53   ` [tip:x86/asm] x86/asm/entry/64: Simplify IRQ " tip-bot for Andy Lutomirski
  2015-07-03 19:44 ` [PATCH v5 14/17] x86/asm/entry/64: Migrate error and interrupt exit work to C Andy Lutomirski
                   ` (4 subsequent siblings)
  17 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-03 19:44 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Frédéric Weisbecker, Rik van Riel, Oleg Nesterov,
	Denys Vlasenko, Borislav Petkov, Kees Cook, Brian Gerst, paulmck,
	Andy Lutomirski

There's no need for both rsi and rdi to point to the original stack.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/entry/entry_64.S | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 65029f48bcc4..83eb63d31da4 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -506,8 +506,6 @@ END(irq_entries_start)
 	SAVE_C_REGS
 	SAVE_EXTRA_REGS
 
-	movq	%rsp,%rdi	/* arg1 for \func (pointer to pt_regs) */
-
 	testb	$3, CS(%rsp)
 	jz	1f
 	SWAPGS
@@ -519,14 +517,14 @@ END(irq_entries_start)
 	 * a little cheaper to use a separate counter in the PDA (short of
 	 * moving irq_enter into assembly, which would be too much work)
 	 */
-	movq	%rsp, %rsi
+	movq	%rsp, %rdi
 	incl	PER_CPU_VAR(irq_count)
 	cmovzq	PER_CPU_VAR(irq_stack_ptr), %rsp
-	pushq	%rsi
+	pushq	%rdi
 	/* We entered an interrupt context - irqs are off: */
 	TRACE_IRQS_OFF
 
-	call	\func
+	call	\func	/* rdi points to pt_regs */
 	.endm
 
 	/*
-- 
2.4.3


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v5 14/17] x86/asm/entry/64: Migrate error and interrupt exit work to C
  2015-07-03 19:44 [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Andy Lutomirski
                   ` (12 preceding siblings ...)
  2015-07-03 19:44 ` [PATCH v5 13/17] x86/asm/entry/64: Simplify irq stack pt_regs handling Andy Lutomirski
@ 2015-07-03 19:44 ` Andy Lutomirski
  2015-07-07 10:53   ` [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code tip-bot for Andy Lutomirski
  2015-07-03 19:44 ` [PATCH v5 15/17] x86/entry: Remove exception_enter from most trap handlers Andy Lutomirski
                   ` (3 subsequent siblings)
  17 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-03 19:44 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Frédéric Weisbecker, Rik van Riel, Oleg Nesterov,
	Denys Vlasenko, Borislav Petkov, Kees Cook, Brian Gerst, paulmck,
	Andy Lutomirski

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/entry/entry_64.S        | 64 +++++++++++-----------------------------
 arch/x86/entry/entry_64_compat.S |  5 ++++
 2 files changed, 23 insertions(+), 46 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 83eb63d31da4..168ee264c345 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -508,7 +508,16 @@ END(irq_entries_start)
 
 	testb	$3, CS(%rsp)
 	jz	1f
+
+	/*
+	 * IRQ from user mode.  Switch to kernel gsbase and inform context
+	 * tracking that we're in kernel mode.
+	 */
 	SWAPGS
+#ifdef CONFIG_CONTEXT_TRACKING
+	call enter_from_user_mode
+#endif
+
 1:
 	/*
 	 * Save previous stack pointer, optionally switch to interrupt stack.
@@ -547,26 +556,13 @@ ret_from_intr:
 
 	testb	$3, CS(%rsp)
 	jz	retint_kernel
-	/* Interrupt came from user space */
-GLOBAL(retint_user)
-	GET_THREAD_INFO(%rcx)
 
-	/* %rcx: thread info. Interrupts are off. */
-retint_with_reschedule:
-	movl	$_TIF_WORK_MASK, %edi
-retint_check:
+	/* Interrupt came from user space */
 	LOCKDEP_SYS_EXIT_IRQ
-	movl	TI_flags(%rcx), %edx
-	andl	%edi, %edx
-	jnz	retint_careful
-
-retint_swapgs:					/* return to user-space */
-	/*
-	 * The iretq could re-enable interrupts:
-	 */
-	DISABLE_INTERRUPTS(CLBR_ANY)
+GLOBAL(retint_user)
+	mov	%rsp,%rdi
+	call	prepare_exit_to_usermode
 	TRACE_IRQS_IRETQ
-
 	SWAPGS
 	jmp	restore_regs_and_iret
 
@@ -644,35 +640,6 @@ native_irq_return_ldt:
 	popq	%rax
 	jmp	native_irq_return_iret
 #endif
-
-	/* edi: workmask, edx: work */
-retint_careful:
-	bt	$TIF_NEED_RESCHED, %edx
-	jnc	retint_signal
-	TRACE_IRQS_ON
-	ENABLE_INTERRUPTS(CLBR_NONE)
-	pushq	%rdi
-	SCHEDULE_USER
-	popq	%rdi
-	GET_THREAD_INFO(%rcx)
-	DISABLE_INTERRUPTS(CLBR_NONE)
-	TRACE_IRQS_OFF
-	jmp	retint_check
-
-retint_signal:
-	testl	$_TIF_DO_NOTIFY_MASK, %edx
-	jz	retint_swapgs
-	TRACE_IRQS_ON
-	ENABLE_INTERRUPTS(CLBR_NONE)
-	movq	$-1, ORIG_RAX(%rsp)
-	xorl	%esi, %esi			/* oldset */
-	movq	%rsp, %rdi			/* &pt_regs */
-	call	do_notify_resume
-	DISABLE_INTERRUPTS(CLBR_NONE)
-	TRACE_IRQS_OFF
-	GET_THREAD_INFO(%rcx)
-	jmp	retint_with_reschedule
-
 END(common_interrupt)
 
 /*
@@ -1088,7 +1055,12 @@ ENTRY(error_entry)
 	SWAPGS
 
 .Lerror_entry_from_usermode_after_swapgs:
+#ifdef CONFIG_CONTEXT_TRACKING
+	call enter_from_user_mode
+#endif
+
 .Lerror_entry_done:
+
 	TRACE_IRQS_OFF
 	ret
 
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index 204528cf4359..55fa85837da2 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -455,6 +455,11 @@ ia32_badarg:
 	DISABLE_INTERRUPTS(CLBR_NONE)
 	TRACE_IRQS_OFF
 
+	/* Now finish entering normal kernel mode. */
+#ifdef CONFIG_CONTEXT_TRACKING
+	call enter_from_user_mode
+#endif
+
 	/* And exit again. */
 	jmp retint_user
 
-- 
2.4.3


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v5 15/17] x86/entry: Remove exception_enter from most trap handlers
  2015-07-03 19:44 [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Andy Lutomirski
                   ` (13 preceding siblings ...)
  2015-07-03 19:44 ` [PATCH v5 14/17] x86/asm/entry/64: Migrate error and interrupt exit work to C Andy Lutomirski
@ 2015-07-03 19:44 ` Andy Lutomirski
  2015-07-07 10:53   ` [tip:x86/asm] x86/entry: Remove exception_enter() " tip-bot for Andy Lutomirski
  2015-07-03 19:44 ` [PATCH v5 16/17] x86/entry: Remove SCHEDULE_USER and asm/context-tracking.h Andy Lutomirski
                   ` (2 subsequent siblings)
  17 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-03 19:44 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Frédéric Weisbecker, Rik van Riel, Oleg Nesterov,
	Denys Vlasenko, Borislav Petkov, Kees Cook, Brian Gerst, paulmck,
	Andy Lutomirski

On 64-bit kernels, we don't need it any more: we handle context
tracking directly on entry from user mode and exit to user mode.  On
32-bit kernels, we don't support context tracking at all, so these
hooks had no effect.

This doesn't change do_page_fault.  Before we do that, we need to
make sure that there is no code that can page fault from kernel mode
with CONTEXT_USER.  The 32-bit fast system call stack argument code
is the only offender I'm aware of right now.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/traps.h         |  4 +-
 arch/x86/kernel/cpu/mcheck/mce.c     |  5 +--
 arch/x86/kernel/cpu/mcheck/p5.c      |  5 +--
 arch/x86/kernel/cpu/mcheck/winchip.c |  4 +-
 arch/x86/kernel/traps.c              | 78 +++++++++---------------------------
 5 files changed, 27 insertions(+), 69 deletions(-)

diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index c5380bea2a36..c3496619740a 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -112,8 +112,8 @@ asmlinkage void smp_threshold_interrupt(void);
 asmlinkage void smp_deferred_error_interrupt(void);
 #endif
 
-extern enum ctx_state ist_enter(struct pt_regs *regs);
-extern void ist_exit(struct pt_regs *regs, enum ctx_state prev_state);
+extern void ist_enter(struct pt_regs *regs);
+extern void ist_exit(struct pt_regs *regs);
 extern void ist_begin_non_atomic(struct pt_regs *regs);
 extern void ist_end_non_atomic(void);
 
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index df919ff103c3..dc87973098dc 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1029,7 +1029,6 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 {
 	struct mca_config *cfg = &mca_cfg;
 	struct mce m, *final;
-	enum ctx_state prev_state;
 	int i;
 	int worst = 0;
 	int severity;
@@ -1055,7 +1054,7 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 	int flags = MF_ACTION_REQUIRED;
 	int lmce = 0;
 
-	prev_state = ist_enter(regs);
+	ist_enter(regs);
 
 	this_cpu_inc(mce_exception_count);
 
@@ -1227,7 +1226,7 @@ out:
 	local_irq_disable();
 	ist_end_non_atomic();
 done:
-	ist_exit(regs, prev_state);
+	ist_exit(regs);
 }
 EXPORT_SYMBOL_GPL(do_machine_check);
 
diff --git a/arch/x86/kernel/cpu/mcheck/p5.c b/arch/x86/kernel/cpu/mcheck/p5.c
index 737b0ad4e61a..12402e10aeff 100644
--- a/arch/x86/kernel/cpu/mcheck/p5.c
+++ b/arch/x86/kernel/cpu/mcheck/p5.c
@@ -19,10 +19,9 @@ int mce_p5_enabled __read_mostly;
 /* Machine check handler for Pentium class Intel CPUs: */
 static void pentium_machine_check(struct pt_regs *regs, long error_code)
 {
-	enum ctx_state prev_state;
 	u32 loaddr, hi, lotype;
 
-	prev_state = ist_enter(regs);
+	ist_enter(regs);
 
 	rdmsr(MSR_IA32_P5_MC_ADDR, loaddr, hi);
 	rdmsr(MSR_IA32_P5_MC_TYPE, lotype, hi);
@@ -39,7 +38,7 @@ static void pentium_machine_check(struct pt_regs *regs, long error_code)
 
 	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
 
-	ist_exit(regs, prev_state);
+	ist_exit(regs);
 }
 
 /* Set up machine check reporting for processors with Intel style MCE: */
diff --git a/arch/x86/kernel/cpu/mcheck/winchip.c b/arch/x86/kernel/cpu/mcheck/winchip.c
index 44f138296fbe..01dd8702880b 100644
--- a/arch/x86/kernel/cpu/mcheck/winchip.c
+++ b/arch/x86/kernel/cpu/mcheck/winchip.c
@@ -15,12 +15,12 @@
 /* Machine check handler for WinChip C6: */
 static void winchip_machine_check(struct pt_regs *regs, long error_code)
 {
-	enum ctx_state prev_state = ist_enter(regs);
+	ist_enter(regs);
 
 	printk(KERN_EMERG "CPU0: Machine Check Exception.\n");
 	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
 
-	ist_exit(regs, prev_state);
+	ist_exit(regs);
 }
 
 /* Set up machine check reporting on the Winchip C6 series */
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 2a783c4fe0e9..8e65d8a9b8db 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -108,13 +108,10 @@ static inline void preempt_conditional_cli(struct pt_regs *regs)
 	preempt_count_dec();
 }
 
-enum ctx_state ist_enter(struct pt_regs *regs)
+void ist_enter(struct pt_regs *regs)
 {
-	enum ctx_state prev_state;
-
 	if (user_mode(regs)) {
-		/* Other than that, we're just an exception. */
-		prev_state = exception_enter();
+		CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	} else {
 		/*
 		 * We might have interrupted pretty much anything.  In
@@ -123,32 +120,25 @@ enum ctx_state ist_enter(struct pt_regs *regs)
 		 * but we need to notify RCU.
 		 */
 		rcu_nmi_enter();
-		prev_state = CONTEXT_KERNEL;  /* the value is irrelevant. */
 	}
 
 	/*
-	 * We are atomic because we're on the IST stack (or we're on x86_32,
-	 * in which case we still shouldn't schedule).
-	 *
-	 * This must be after exception_enter(), because exception_enter()
-	 * won't do anything if in_interrupt() returns true.
+	 * We are atomic because we're on the IST stack; or we're on
+	 * x86_32, in which case we still shouldn't schedule; or we're
+	 * on x86_64 and entered from user mode, in which case we're
+	 * still atomic unless ist_begin_non_atomic is called.
 	 */
 	preempt_count_add(HARDIRQ_OFFSET);
 
 	/* This code is a bit fragile.  Test it. */
 	rcu_lockdep_assert(rcu_is_watching(), "ist_enter didn't work");
-
-	return prev_state;
 }
 
-void ist_exit(struct pt_regs *regs, enum ctx_state prev_state)
+void ist_exit(struct pt_regs *regs)
 {
-	/* Must be before exception_exit. */
 	preempt_count_sub(HARDIRQ_OFFSET);
 
-	if (user_mode(regs))
-		return exception_exit(prev_state);
-	else
+	if (!user_mode(regs))
 		rcu_nmi_exit();
 }
 
@@ -162,7 +152,7 @@ void ist_exit(struct pt_regs *regs, enum ctx_state prev_state)
  * a double fault, it can be safe to schedule.  ist_begin_non_atomic()
  * begins a non-atomic section within an ist_enter()/ist_exit() region.
  * Callers are responsible for enabling interrupts themselves inside
- * the non-atomic section, and callers must call is_end_non_atomic()
+ * the non-atomic section, and callers must call ist_end_non_atomic()
  * before ist_exit().
  */
 void ist_begin_non_atomic(struct pt_regs *regs)
@@ -289,7 +279,6 @@ NOKPROBE_SYMBOL(do_trap);
 static void do_error_trap(struct pt_regs *regs, long error_code, char *str,
 			  unsigned long trapnr, int signr)
 {
-	enum ctx_state prev_state = exception_enter();
 	siginfo_t info;
 
 	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
@@ -300,8 +289,6 @@ static void do_error_trap(struct pt_regs *regs, long error_code, char *str,
 		do_trap(trapnr, signr, str, regs, error_code,
 			fill_trap_info(regs, signr, trapnr, &info));
 	}
-
-	exception_exit(prev_state);
 }
 
 #define DO_ERROR(trapnr, signr, str, name)				\
@@ -353,7 +340,7 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code)
 	}
 #endif
 
-	ist_enter(regs);  /* Discard prev_state because we won't return. */
+	ist_enter(regs);
 	notify_die(DIE_TRAP, str, regs, error_code, X86_TRAP_DF, SIGSEGV);
 
 	tsk->thread.error_code = error_code;
@@ -373,15 +360,13 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code)
 
 dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
 {
-	enum ctx_state prev_state;
 	const struct bndcsr *bndcsr;
 	siginfo_t *info;
 
-	prev_state = exception_enter();
 	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	if (notify_die(DIE_TRAP, "bounds", regs, error_code,
 			X86_TRAP_BR, SIGSEGV) == NOTIFY_STOP)
-		goto exit;
+		return;
 	conditional_sti(regs);
 
 	if (!user_mode(regs))
@@ -438,9 +423,8 @@ dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
 		die("bounds", regs, error_code);
 	}
 
-exit:
-	exception_exit(prev_state);
 	return;
+
 exit_trap:
 	/*
 	 * This path out is for all the cases where we could not
@@ -450,36 +434,33 @@ exit_trap:
 	 * time..
 	 */
 	do_trap(X86_TRAP_BR, SIGSEGV, "bounds", regs, error_code, NULL);
-	exception_exit(prev_state);
 }
 
 dotraplinkage void
 do_general_protection(struct pt_regs *regs, long error_code)
 {
 	struct task_struct *tsk;
-	enum ctx_state prev_state;
 
-	prev_state = exception_enter();
 	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	conditional_sti(regs);
 
 	if (v8086_mode(regs)) {
 		local_irq_enable();
 		handle_vm86_fault((struct kernel_vm86_regs *) regs, error_code);
-		goto exit;
+		return;
 	}
 
 	tsk = current;
 	if (!user_mode(regs)) {
 		if (fixup_exception(regs))
-			goto exit;
+			return;
 
 		tsk->thread.error_code = error_code;
 		tsk->thread.trap_nr = X86_TRAP_GP;
 		if (notify_die(DIE_GPF, "general protection fault", regs, error_code,
 			       X86_TRAP_GP, SIGSEGV) != NOTIFY_STOP)
 			die("general protection fault", regs, error_code);
-		goto exit;
+		return;
 	}
 
 	tsk->thread.error_code = error_code;
@@ -495,16 +476,12 @@ do_general_protection(struct pt_regs *regs, long error_code)
 	}
 
 	force_sig_info(SIGSEGV, SEND_SIG_PRIV, tsk);
-exit:
-	exception_exit(prev_state);
 }
 NOKPROBE_SYMBOL(do_general_protection);
 
 /* May run on IST stack. */
 dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code)
 {
-	enum ctx_state prev_state;
-
 #ifdef CONFIG_DYNAMIC_FTRACE
 	/*
 	 * ftrace must be first, everything else may cause a recursive crash.
@@ -517,7 +494,7 @@ dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code)
 	if (poke_int3_handler(regs))
 		return;
 
-	prev_state = ist_enter(regs);
+	ist_enter(regs);
 	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 #ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
 	if (kgdb_ll_trap(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
@@ -544,7 +521,7 @@ dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code)
 	preempt_conditional_cli(regs);
 	debug_stack_usage_dec();
 exit:
-	ist_exit(regs, prev_state);
+	ist_exit(regs);
 }
 NOKPROBE_SYMBOL(do_int3);
 
@@ -620,12 +597,11 @@ NOKPROBE_SYMBOL(fixup_bad_iret);
 dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
 {
 	struct task_struct *tsk = current;
-	enum ctx_state prev_state;
 	int user_icebp = 0;
 	unsigned long dr6;
 	int si_code;
 
-	prev_state = ist_enter(regs);
+	ist_enter(regs);
 
 	get_debugreg(dr6, 6);
 
@@ -700,7 +676,7 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
 	debug_stack_usage_dec();
 
 exit:
-	ist_exit(regs, prev_state);
+	ist_exit(regs);
 }
 NOKPROBE_SYMBOL(do_debug);
 
@@ -752,23 +728,15 @@ static void math_error(struct pt_regs *regs, int error_code, int trapnr)
 
 dotraplinkage void do_coprocessor_error(struct pt_regs *regs, long error_code)
 {
-	enum ctx_state prev_state;
-
-	prev_state = exception_enter();
 	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	math_error(regs, error_code, X86_TRAP_MF);
-	exception_exit(prev_state);
 }
 
 dotraplinkage void
 do_simd_coprocessor_error(struct pt_regs *regs, long error_code)
 {
-	enum ctx_state prev_state;
-
-	prev_state = exception_enter();
 	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	math_error(regs, error_code, X86_TRAP_XF);
-	exception_exit(prev_state);
 }
 
 dotraplinkage void
@@ -780,9 +748,6 @@ do_spurious_interrupt_bug(struct pt_regs *regs, long error_code)
 dotraplinkage void
 do_device_not_available(struct pt_regs *regs, long error_code)
 {
-	enum ctx_state prev_state;
-
-	prev_state = exception_enter();
 	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	BUG_ON(use_eager_fpu());
 
@@ -794,7 +759,6 @@ do_device_not_available(struct pt_regs *regs, long error_code)
 
 		info.regs = regs;
 		math_emulate(&info);
-		exception_exit(prev_state);
 		return;
 	}
 #endif
@@ -802,7 +766,6 @@ do_device_not_available(struct pt_regs *regs, long error_code)
 #ifdef CONFIG_X86_32
 	conditional_sti(regs);
 #endif
-	exception_exit(prev_state);
 }
 NOKPROBE_SYMBOL(do_device_not_available);
 
@@ -810,9 +773,7 @@ NOKPROBE_SYMBOL(do_device_not_available);
 dotraplinkage void do_iret_error(struct pt_regs *regs, long error_code)
 {
 	siginfo_t info;
-	enum ctx_state prev_state;
 
-	prev_state = exception_enter();
 	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	local_irq_enable();
 
@@ -825,7 +786,6 @@ dotraplinkage void do_iret_error(struct pt_regs *regs, long error_code)
 		do_trap(X86_TRAP_IRET, SIGILL, "iret exception", regs, error_code,
 			&info);
 	}
-	exception_exit(prev_state);
 }
 #endif
 
-- 
2.4.3


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v5 16/17] x86/entry: Remove SCHEDULE_USER and asm/context-tracking.h
  2015-07-03 19:44 [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Andy Lutomirski
                   ` (14 preceding siblings ...)
  2015-07-03 19:44 ` [PATCH v5 15/17] x86/entry: Remove exception_enter from most trap handlers Andy Lutomirski
@ 2015-07-03 19:44 ` Andy Lutomirski
  2015-07-07 10:54   ` [tip:x86/asm] x86/entry: Remove SCHEDULE_USER and asm/ context-tracking.h tip-bot for Andy Lutomirski
  2015-07-03 19:44 ` [PATCH v5 17/17] x86/irq: Document how IRQ context tracking works and add an assertion Andy Lutomirski
  2015-07-07 11:12 ` [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Ingo Molnar
  17 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-03 19:44 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Frédéric Weisbecker, Rik van Riel, Oleg Nesterov,
	Denys Vlasenko, Borislav Petkov, Kees Cook, Brian Gerst, paulmck,
	Andy Lutomirski

SCHEDULE_USER is no longer used, and asm/context-tracking.h
contained nothing else.  Remove the header entirely

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/entry/entry_64.S               |  1 -
 arch/x86/include/asm/context_tracking.h | 10 ----------
 2 files changed, 11 deletions(-)
 delete mode 100644 arch/x86/include/asm/context_tracking.h

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 168ee264c345..041a37a643e1 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -33,7 +33,6 @@
 #include <asm/paravirt.h>
 #include <asm/percpu.h>
 #include <asm/asm.h>
-#include <asm/context_tracking.h>
 #include <asm/smap.h>
 #include <asm/pgtable_types.h>
 #include <linux/err.h>
diff --git a/arch/x86/include/asm/context_tracking.h b/arch/x86/include/asm/context_tracking.h
deleted file mode 100644
index 1fe49704b146..000000000000
--- a/arch/x86/include/asm/context_tracking.h
+++ /dev/null
@@ -1,10 +0,0 @@
-#ifndef _ASM_X86_CONTEXT_TRACKING_H
-#define _ASM_X86_CONTEXT_TRACKING_H
-
-#ifdef CONFIG_CONTEXT_TRACKING
-# define SCHEDULE_USER call schedule_user
-#else
-# define SCHEDULE_USER call schedule
-#endif
-
-#endif
-- 
2.4.3


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v5 17/17] x86/irq: Document how IRQ context tracking works and add an assertion
  2015-07-03 19:44 [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Andy Lutomirski
                   ` (15 preceding siblings ...)
  2015-07-03 19:44 ` [PATCH v5 16/17] x86/entry: Remove SCHEDULE_USER and asm/context-tracking.h Andy Lutomirski
@ 2015-07-03 19:44 ` Andy Lutomirski
  2015-07-07 10:54   ` [tip:x86/asm] x86/irq, context_tracking: Document how IRQ context tracking works and add an RCU assertion tip-bot for Andy Lutomirski
  2015-07-07 11:12 ` [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Ingo Molnar
  17 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-03 19:44 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Frédéric Weisbecker, Rik van Riel, Oleg Nesterov,
	Denys Vlasenko, Borislav Petkov, Kees Cook, Brian Gerst, paulmck,
	Andy Lutomirski

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/kernel/irq.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 88b366487b0e..6233de046c08 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -216,8 +216,23 @@ __visible unsigned int __irq_entry do_IRQ(struct pt_regs *regs)
 	unsigned vector = ~regs->orig_ax;
 	unsigned irq;
 
+	/*
+	 * NB: Unlike exception entries, IRQ entries do not reliably
+	 * handle context tracking in the low-level entry code.  This is
+	 * because syscall entries execute briefly with IRQs on before
+	 * updating context tracking state, so we can take an IRQ from
+	 * kernel mode with CONTEXT_USER.  The low-level entry code only
+	 * updates the context if we came from user mode, so we won't
+	 * switch to CONTEXT_KERNEL.  We'll fix that once the syscall
+	 * code is cleaned up enough that we can cleanly defer enabling
+	 * IRQs.
+	 */
+
 	entering_irq();
 
+	/* entering_irq() tells RCU that we're not quiescent.  Check it. */
+	rcu_lockdep_assert(rcu_is_watching(), "IRQ failed to wake up RCU");
+
 	irq = __this_cpu_read(vector_irq[vector]);
 
 	if (!handle_irq(irq, regs)) {
-- 
2.4.3


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [tip:x86/asm] x86/entry, selftests/x86: Add a test for 32-bit fast syscall arg faults
  2015-07-03 19:44 ` [PATCH v5 01/17] selftests/x86: Add a test for 32-bit fast syscall arg faults Andy Lutomirski
@ 2015-07-07 10:49   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 70+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-07 10:49 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, tglx, mingo, peterz, luto, bp, luto, linux-kernel, riel,
	dvlasenk, vda.linux, fweisbec, keescook, brgerst, oleg, torvalds

Commit-ID:  5e5c684a2c78b98dcba3d6fce56773a375f63980
Gitweb:     http://git.kernel.org/tip/5e5c684a2c78b98dcba3d6fce56773a375f63980
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 3 Jul 2015 12:44:18 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 7 Jul 2015 10:58:30 +0200

x86/entry, selftests/x86: Add a test for 32-bit fast syscall arg faults

This test passes on 4.0 and fails on some newer kernels.
Fortunately, the failure is likely not a big deal.

This test will make sure that we don't break it further (e.g. OOPSing)
as we clean up the entry code and that we eventually fix the
regression.

There's arguably no need to preserve the old ABI here --
anything that makes it into a fast (vDSO) syscall with a bad
stack is about to crash no matter what we do.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/9cfcc51005168cb1b06b31991931214d770fc59a.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 tools/testing/selftests/x86/Makefile            |   2 +-
 tools/testing/selftests/x86/syscall_arg_fault.c | 130 ++++++++++++++++++++++++
 2 files changed, 131 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
index caa60d5..e8df47e 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -5,7 +5,7 @@ include ../lib.mk
 .PHONY: all all_32 all_64 warn_32bit_failure clean
 
 TARGETS_C_BOTHBITS := sigreturn single_step_syscall sysret_ss_attrs
-TARGETS_C_32BIT_ONLY := entry_from_vm86
+TARGETS_C_32BIT_ONLY := entry_from_vm86 syscall_arg_fault
 
 TARGETS_C_32BIT_ALL := $(TARGETS_C_BOTHBITS) $(TARGETS_C_32BIT_ONLY)
 BINARIES_32 := $(TARGETS_C_32BIT_ALL:%=%_32)
diff --git a/tools/testing/selftests/x86/syscall_arg_fault.c b/tools/testing/selftests/x86/syscall_arg_fault.c
new file mode 100644
index 0000000..7db4fc9
--- /dev/null
+++ b/tools/testing/selftests/x86/syscall_arg_fault.c
@@ -0,0 +1,130 @@
+/*
+ * syscall_arg_fault.c - tests faults 32-bit fast syscall stack args
+ * Copyright (c) 2015 Andrew Lutomirski
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+
+#define _GNU_SOURCE
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <sys/signal.h>
+#include <sys/ucontext.h>
+#include <err.h>
+#include <setjmp.h>
+#include <errno.h>
+
+/* Our sigaltstack scratch space. */
+static unsigned char altstack_data[SIGSTKSZ];
+
+static void sethandler(int sig, void (*handler)(int, siginfo_t *, void *),
+		       int flags)
+{
+	struct sigaction sa;
+	memset(&sa, 0, sizeof(sa));
+	sa.sa_sigaction = handler;
+	sa.sa_flags = SA_SIGINFO | flags;
+	sigemptyset(&sa.sa_mask);
+	if (sigaction(sig, &sa, 0))
+		err(1, "sigaction");
+}
+
+static volatile sig_atomic_t sig_traps;
+static sigjmp_buf jmpbuf;
+
+static volatile sig_atomic_t n_errs;
+
+static void sigsegv(int sig, siginfo_t *info, void *ctx_void)
+{
+	ucontext_t *ctx = (ucontext_t*)ctx_void;
+
+	if (ctx->uc_mcontext.gregs[REG_EAX] != -EFAULT) {
+		printf("[FAIL]\tAX had the wrong value: 0x%x\n",
+		       ctx->uc_mcontext.gregs[REG_EAX]);
+		n_errs++;
+	} else {
+		printf("[OK]\tSeems okay\n");
+	}
+
+	siglongjmp(jmpbuf, 1);
+}
+
+static void sigill(int sig, siginfo_t *info, void *ctx_void)
+{
+	printf("[SKIP]\tIllegal instruction\n");
+	siglongjmp(jmpbuf, 1);
+}
+
+int main()
+{
+	stack_t stack = {
+		.ss_sp = altstack_data,
+		.ss_size = SIGSTKSZ,
+	};
+	if (sigaltstack(&stack, NULL) != 0)
+		err(1, "sigaltstack");
+
+	sethandler(SIGSEGV, sigsegv, SA_ONSTACK);
+	sethandler(SIGILL, sigill, SA_ONSTACK);
+
+	/*
+	 * Exercise another nasty special case.  The 32-bit SYSCALL
+	 * and SYSENTER instructions (even in compat mode) each
+	 * clobber one register.  A Linux system call has a syscall
+	 * number and six arguments, and the user stack pointer
+	 * needs to live in some register on return.  That means
+	 * that we need eight registers, but SYSCALL and SYSENTER
+	 * only preserve seven registers.  As a result, one argument
+	 * ends up on the stack.  The stack is user memory, which
+	 * means that the kernel can fail to read it.
+	 *
+	 * The 32-bit fast system calls don't have a defined ABI:
+	 * we're supposed to invoke them through the vDSO.  So we'll
+	 * fudge it: we set all regs to invalid pointer values and
+	 * invoke the entry instruction.  The return will fail no
+	 * matter what, and we completely lose our program state,
+	 * but we can fix it up with a signal handler.
+	 */
+
+	printf("[RUN]\tSYSENTER with invalid state\n");
+	if (sigsetjmp(jmpbuf, 1) == 0) {
+		asm volatile (
+			"movl $-1, %%eax\n\t"
+			"movl $-1, %%ebx\n\t"
+			"movl $-1, %%ecx\n\t"
+			"movl $-1, %%edx\n\t"
+			"movl $-1, %%esi\n\t"
+			"movl $-1, %%edi\n\t"
+			"movl $-1, %%ebp\n\t"
+			"movl $-1, %%esp\n\t"
+			"sysenter"
+			: : : "memory", "flags");
+	}
+
+	printf("[RUN]\tSYSCALL with invalid state\n");
+	if (sigsetjmp(jmpbuf, 1) == 0) {
+		asm volatile (
+			"movl $-1, %%eax\n\t"
+			"movl $-1, %%ebx\n\t"
+			"movl $-1, %%ecx\n\t"
+			"movl $-1, %%edx\n\t"
+			"movl $-1, %%esi\n\t"
+			"movl $-1, %%edi\n\t"
+			"movl $-1, %%ebp\n\t"
+			"movl $-1, %%esp\n\t"
+			"syscall\n\t"
+			"pushl $0"	/* make sure we segfault cleanly */
+			: : : "memory", "flags");
+	}
+
+	return 0;
+}

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [tip:x86/asm] x86/entry/64/compat: Fix bad fast syscall arg failure path
  2015-07-03 19:44 ` [PATCH v5 02/17] x86/entry/64/compat: Fix bad fast syscall arg failure path Andy Lutomirski
@ 2015-07-07 10:49   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 70+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-07 10:49 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dvlasenk, bp, oleg, fweisbec, riel, vda.linux, keescook, luto,
	hpa, torvalds, linux-kernel, peterz, mingo, brgerst, luto, tglx

Commit-ID:  5e99cb7c35ca0580da8e892f91c655d35ecf8798
Gitweb:     http://git.kernel.org/tip/5e99cb7c35ca0580da8e892f91c655d35ecf8798
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 3 Jul 2015 12:44:19 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 7 Jul 2015 10:58:30 +0200

x86/entry/64/compat: Fix bad fast syscall arg failure path

If user code does SYSCALL32 or SYSENTER without a valid stack,
then our attempt to determine the syscall args will result in a
failed uaccess fault.  Previously, we would try to recover by
jumping to the syscall exit code, but we'd run the syscall exit
work even though we never made it to the syscall entry work.

Clean it up by treating the failure path as a non-syscall entry
and exit pair.

This fixes strace's output when running the syscall_arg_fault
test. Without this fix, strace would get out of sync and would
fail to associate syscall entries with syscall exits.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/903010762c07a3d67df914fea2da84b52b0f8f1d.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/entry/entry_64.S        |  2 +-
 arch/x86/entry/entry_64_compat.S | 35 +++++++++++++++++++++++++++++++++--
 2 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 3bb2c43..141a5d4 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -613,7 +613,7 @@ ret_from_intr:
 	testb	$3, CS(%rsp)
 	jz	retint_kernel
 	/* Interrupt came from user space */
-retint_user:
+GLOBAL(retint_user)
 	GET_THREAD_INFO(%rcx)
 
 	/* %rcx: thread info. Interrupts are off. */
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index b868cfc..e5ebdd9 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -428,8 +428,39 @@ cstar_tracesys:
 END(entry_SYSCALL_compat)
 
 ia32_badarg:
-	ASM_CLAC
-	movq	$-EFAULT, RAX(%rsp)
+	/*
+	 * So far, we've entered kernel mode, set AC, turned on IRQs, and
+	 * saved C regs except r8-r11.  We haven't done any of the other
+	 * standard entry work, though.  We want to bail, but we shouldn't
+	 * treat this as a syscall entry since we don't even know what the
+	 * args are.  Instead, treat this as a non-syscall entry, finish
+	 * the entry work, and immediately exit after setting AX = -EFAULT.
+	 *
+	 * We're really just being polite here.  Killing the task outright
+	 * would be a reasonable action, too.  Given that the only valid
+	 * way to have gotten here is through the vDSO, and we already know
+	 * that the stack pointer is bad, the task isn't going to survive
+	 * for long no matter what we do.
+	 */
+
+	ASM_CLAC			/* undo STAC */
+	movq	$-EFAULT, RAX(%rsp)	/* return -EFAULT if possible */
+
+	/* Fill in the rest of pt_regs */
+	xorl	%eax, %eax
+	movq	%rax, R11(%rsp)
+	movq	%rax, R10(%rsp)
+	movq	%rax, R9(%rsp)
+	movq	%rax, R8(%rsp)
+	SAVE_EXTRA_REGS
+
+	/* Turn IRQs back off. */
+	DISABLE_INTERRUPTS(CLBR_NONE)
+	TRACE_IRQS_OFF
+
+	/* And exit again. */
+	jmp retint_user
+
 ia32_ret_from_sys_call:
 	xorl	%eax, %eax		/* Do not leak kernel information */
 	movq	%rax, R11(%rsp)

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [tip:x86/asm] um: Fix do_signal() prototype
  2015-07-03 19:44 ` [PATCH v5 03/17] uml: Fix do_signal() prototype Andy Lutomirski
@ 2015-07-07 10:49   ` tip-bot for Ingo Molnar
  0 siblings, 0 replies; 70+ messages in thread
From: tip-bot for Ingo Molnar @ 2015-07-07 10:49 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: riel, torvalds, luto, richard.weinberger, keescook, linux-kernel,
	akpm, tglx, vda.linux, bp, oleg, fweisbec, luto, hpa, mingo,
	dvlasenk, peterz, brgerst

Commit-ID:  ccaee5f851470dec6894a6835b6fadffc2bb7514
Gitweb:     http://git.kernel.org/tip/ccaee5f851470dec6894a6835b6fadffc2bb7514
Author:     Ingo Molnar <mingo@kernel.org>
AuthorDate: Fri, 3 Jul 2015 12:44:20 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 7 Jul 2015 10:58:54 +0200

um: Fix do_signal() prototype

Once x86 exports its do_signal(), the prototypes will clash.

Fix the clash and also improve the code a bit: remove the
unnecessary kern_do_signal() indirection. This allows
interrupt_end() to share the 'regs' parameter calculation.

Also remove the unused return code to match x86.

Minimally build and boot tested.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard.weinberger@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/67c57eac09a589bac3c6c5ff22f9623ec55a184a.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/um/include/shared/kern_util.h | 3 ++-
 arch/um/kernel/process.c           | 6 ++++--
 arch/um/kernel/signal.c            | 8 +-------
 arch/um/kernel/tlb.c               | 2 +-
 arch/um/kernel/trap.c              | 2 +-
 5 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/arch/um/include/shared/kern_util.h b/arch/um/include/shared/kern_util.h
index 83a91f9..35ab97e 100644
--- a/arch/um/include/shared/kern_util.h
+++ b/arch/um/include/shared/kern_util.h
@@ -22,7 +22,8 @@ extern int kmalloc_ok;
 extern unsigned long alloc_stack(int order, int atomic);
 extern void free_stack(unsigned long stack, int order);
 
-extern int do_signal(void);
+struct pt_regs;
+extern void do_signal(struct pt_regs *regs);
 extern void interrupt_end(void);
 extern void relay_signal(int sig, struct siginfo *si, struct uml_pt_regs *regs);
 
diff --git a/arch/um/kernel/process.c b/arch/um/kernel/process.c
index 68b9119..a6d9226 100644
--- a/arch/um/kernel/process.c
+++ b/arch/um/kernel/process.c
@@ -90,12 +90,14 @@ void *__switch_to(struct task_struct *from, struct task_struct *to)
 
 void interrupt_end(void)
 {
+	struct pt_regs *regs = &current->thread.regs;
+
 	if (need_resched())
 		schedule();
 	if (test_thread_flag(TIF_SIGPENDING))
-		do_signal();
+		do_signal(regs);
 	if (test_and_clear_thread_flag(TIF_NOTIFY_RESUME))
-		tracehook_notify_resume(&current->thread.regs);
+		tracehook_notify_resume(regs);
 }
 
 void exit_thread(void)
diff --git a/arch/um/kernel/signal.c b/arch/um/kernel/signal.c
index 4f60e4a..57acbd6 100644
--- a/arch/um/kernel/signal.c
+++ b/arch/um/kernel/signal.c
@@ -64,7 +64,7 @@ static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
 	signal_setup_done(err, ksig, singlestep);
 }
 
-static int kern_do_signal(struct pt_regs *regs)
+void do_signal(struct pt_regs *regs)
 {
 	struct ksignal ksig;
 	int handled_sig = 0;
@@ -110,10 +110,4 @@ static int kern_do_signal(struct pt_regs *regs)
 	 */
 	if (!handled_sig)
 		restore_saved_sigmask();
-	return handled_sig;
-}
-
-int do_signal(void)
-{
-	return kern_do_signal(&current->thread.regs);
 }
diff --git a/arch/um/kernel/tlb.c b/arch/um/kernel/tlb.c
index f1b3eb1..2077248 100644
--- a/arch/um/kernel/tlb.c
+++ b/arch/um/kernel/tlb.c
@@ -291,7 +291,7 @@ void fix_range_common(struct mm_struct *mm, unsigned long start_addr,
 		/* We are under mmap_sem, release it such that current can terminate */
 		up_write(&current->mm->mmap_sem);
 		force_sig(SIGKILL, current);
-		do_signal();
+		do_signal(&current->thread.regs);
 	}
 }
 
diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
index 557232f..d8a9fce 100644
--- a/arch/um/kernel/trap.c
+++ b/arch/um/kernel/trap.c
@@ -173,7 +173,7 @@ static void bad_segv(struct faultinfo fi, unsigned long ip)
 void fatal_sigsegv(void)
 {
 	force_sigsegv(SIGSEGV, current);
-	do_signal();
+	do_signal(&current->thread.regs);
 	/*
 	 * This is to tell gcc that we're not returning - do_signal
 	 * can, in general, return, but in this case, it's not, since

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [tip:x86/asm] context_tracking: Add ct_state() and CT_WARN_ON()
  2015-07-03 19:44 ` [PATCH v5 04/17] context_tracking: Add ct_state and CT_WARN_ON Andy Lutomirski
@ 2015-07-07 10:50   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 70+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-07 10:50 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: luto, torvalds, bp, peterz, tglx, fweisbec, oleg, brgerst,
	vda.linux, hpa, keescook, luto, riel, dvlasenk, linux-kernel,
	mingo

Commit-ID:  f9281648ecd5081803bb2da84b9ccb0cf48436cd
Gitweb:     http://git.kernel.org/tip/f9281648ecd5081803bb2da84b9ccb0cf48436cd
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 3 Jul 2015 12:44:21 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 7 Jul 2015 10:59:04 +0200

context_tracking: Add ct_state() and CT_WARN_ON()

This will let us sprinkle sanity checks around the kernel
without making too much of a mess.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/5da41fb2ceb29eac671f427c67040401ba2a1fa0.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/context_tracking.h       | 15 +++++++++++++++
 include/linux/context_tracking_state.h |  1 +
 2 files changed, 16 insertions(+)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index b96bd29..008fc67 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -49,13 +49,28 @@ static inline void exception_exit(enum ctx_state prev_ctx)
 	}
 }
 
+
+/**
+ * ct_state() - return the current context tracking state if known
+ *
+ * Returns the current cpu's context tracking state if context tracking
+ * is enabled.  If context tracking is disabled, returns
+ * CONTEXT_DISABLED.  This should be used primarily for debugging.
+ */
+static inline enum ctx_state ct_state(void)
+{
+	return context_tracking_is_enabled() ?
+		this_cpu_read(context_tracking.state) : CONTEXT_DISABLED;
+}
 #else
 static inline void user_enter(void) { }
 static inline void user_exit(void) { }
 static inline enum ctx_state exception_enter(void) { return 0; }
 static inline void exception_exit(enum ctx_state prev_ctx) { }
+static inline enum ctx_state ct_state(void) { return CONTEXT_DISABLED; }
 #endif /* !CONFIG_CONTEXT_TRACKING */
 
+#define CT_WARN_ON(cond) WARN_ON(context_tracking_is_enabled() && (cond))
 
 #ifdef CONFIG_CONTEXT_TRACKING_FORCE
 extern void context_tracking_init(void);
diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index 678ecdf..ee956c5 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -14,6 +14,7 @@ struct context_tracking {
 	bool active;
 	int recursion;
 	enum ctx_state {
+		CONTEXT_DISABLED = -1,	/* returned by ct_state() if unknown */
 		CONTEXT_KERNEL = 0,
 		CONTEXT_USER,
 		CONTEXT_GUEST,

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [tip:x86/asm] notifiers, RCU: Assert that RCU is watching in notify_die()
  2015-07-03 19:44 ` [PATCH v5 05/17] notifiers: Assert that RCU is watching in notify_die Andy Lutomirski
@ 2015-07-07 10:50   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 70+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-07 10:50 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: oleg, tglx, linux-kernel, torvalds, mingo, dvlasenk, luto, hpa,
	brgerst, keescook, peterz, paulmck, bp, riel, fweisbec, luto,
	vda.linux

Commit-ID:  e727c7d7a11e109849582e9165d54b254eb181d7
Gitweb:     http://git.kernel.org/tip/e727c7d7a11e109849582e9165d54b254eb181d7
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 3 Jul 2015 12:44:22 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 7 Jul 2015 10:59:04 +0200

notifiers, RCU: Assert that RCU is watching in notify_die()

Low-level arch entries often call notify_die(), and it's easy for
arch code to fail to exit an RCU quiescent state first.  Assert
that we're not quiescent in notify_die().

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1f5fe6c23d5b432a23267102f2d72b787d80fdd8.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/notifier.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/notifier.c b/kernel/notifier.c
index ae9fc7c..980e433 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -544,6 +544,8 @@ int notrace notify_die(enum die_val val, const char *str,
 		.signr	= sig,
 
 	};
+	rcu_lockdep_assert(rcu_is_watching(),
+			   "notify_die called but RCU thinks we're quiescent");
 	return atomic_notifier_call_chain(&die_chain, val, &args);
 }
 NOKPROBE_SYMBOL(notify_die);

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [tip:x86/asm] x86/entry: Move C entry and exit code to arch/x86/ entry/common.c
  2015-07-03 19:44 ` [PATCH v5 06/17] x86: Move C entry and exit code to arch/x86/entry/common.c Andy Lutomirski
@ 2015-07-07 10:50   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 70+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-07 10:50 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dvlasenk, luto, tglx, riel, peterz, torvalds, hpa, linux-kernel,
	brgerst, vda.linux, luto, fweisbec, oleg, bp, mingo, keescook

Commit-ID:  1f484aa6904697f390027c12fba130fa94b20831
Gitweb:     http://git.kernel.org/tip/1f484aa6904697f390027c12fba130fa94b20831
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 3 Jul 2015 12:44:23 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 7 Jul 2015 10:59:05 +0200

x86/entry: Move C entry and exit code to arch/x86/entry/common.c

The entry and exit C helpers were confusingly scattered between
ptrace.c and signal.c, even though they aren't specific to
ptrace or signal handling.  Move them together in a new file.

This change just moves code around.  It doesn't change anything.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/324d686821266544d8572423cc281f961da445f4.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/entry/Makefile       |   1 +
 arch/x86/entry/common.c       | 253 ++++++++++++++++++++++++++++++++++++++++++
 arch/x86/include/asm/signal.h |   1 +
 arch/x86/kernel/ptrace.c      | 202 +--------------------------------
 arch/x86/kernel/signal.c      |  28 +----
 5 files changed, 257 insertions(+), 228 deletions(-)

diff --git a/arch/x86/entry/Makefile b/arch/x86/entry/Makefile
index 7a14497..bd55ded 100644
--- a/arch/x86/entry/Makefile
+++ b/arch/x86/entry/Makefile
@@ -2,6 +2,7 @@
 # Makefile for the x86 low level entry code
 #
 obj-y				:= entry_$(BITS).o thunk_$(BITS).o syscall_$(BITS).o
+obj-y				+= common.o
 
 obj-y				+= vdso/
 obj-y				+= vsyscall/
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
new file mode 100644
index 0000000..917d0c3
--- /dev/null
+++ b/arch/x86/entry/common.c
@@ -0,0 +1,253 @@
+/*
+ * common.c - C code for kernel entry and exit
+ * Copyright (c) 2015 Andrew Lutomirski
+ * GPL v2
+ *
+ * Based on asm and ptrace code by many authors.  The code here originated
+ * in ptrace.c and signal.c.
+ */
+
+#include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/mm.h>
+#include <linux/smp.h>
+#include <linux/errno.h>
+#include <linux/ptrace.h>
+#include <linux/tracehook.h>
+#include <linux/audit.h>
+#include <linux/seccomp.h>
+#include <linux/signal.h>
+#include <linux/export.h>
+#include <linux/context_tracking.h>
+#include <linux/user-return-notifier.h>
+#include <linux/uprobes.h>
+
+#include <asm/desc.h>
+#include <asm/traps.h>
+
+#define CREATE_TRACE_POINTS
+#include <trace/events/syscalls.h>
+
+static void do_audit_syscall_entry(struct pt_regs *regs, u32 arch)
+{
+#ifdef CONFIG_X86_64
+	if (arch == AUDIT_ARCH_X86_64) {
+		audit_syscall_entry(regs->orig_ax, regs->di,
+				    regs->si, regs->dx, regs->r10);
+	} else
+#endif
+	{
+		audit_syscall_entry(regs->orig_ax, regs->bx,
+				    regs->cx, regs->dx, regs->si);
+	}
+}
+
+/*
+ * We can return 0 to resume the syscall or anything else to go to phase
+ * 2.  If we resume the syscall, we need to put something appropriate in
+ * regs->orig_ax.
+ *
+ * NB: We don't have full pt_regs here, but regs->orig_ax and regs->ax
+ * are fully functional.
+ *
+ * For phase 2's benefit, our return value is:
+ * 0:			resume the syscall
+ * 1:			go to phase 2; no seccomp phase 2 needed
+ * anything else:	go to phase 2; pass return value to seccomp
+ */
+unsigned long syscall_trace_enter_phase1(struct pt_regs *regs, u32 arch)
+{
+	unsigned long ret = 0;
+	u32 work;
+
+	BUG_ON(regs != task_pt_regs(current));
+
+	work = ACCESS_ONCE(current_thread_info()->flags) &
+		_TIF_WORK_SYSCALL_ENTRY;
+
+	/*
+	 * If TIF_NOHZ is set, we are required to call user_exit() before
+	 * doing anything that could touch RCU.
+	 */
+	if (work & _TIF_NOHZ) {
+		user_exit();
+		work &= ~_TIF_NOHZ;
+	}
+
+#ifdef CONFIG_SECCOMP
+	/*
+	 * Do seccomp first -- it should minimize exposure of other
+	 * code, and keeping seccomp fast is probably more valuable
+	 * than the rest of this.
+	 */
+	if (work & _TIF_SECCOMP) {
+		struct seccomp_data sd;
+
+		sd.arch = arch;
+		sd.nr = regs->orig_ax;
+		sd.instruction_pointer = regs->ip;
+#ifdef CONFIG_X86_64
+		if (arch == AUDIT_ARCH_X86_64) {
+			sd.args[0] = regs->di;
+			sd.args[1] = regs->si;
+			sd.args[2] = regs->dx;
+			sd.args[3] = regs->r10;
+			sd.args[4] = regs->r8;
+			sd.args[5] = regs->r9;
+		} else
+#endif
+		{
+			sd.args[0] = regs->bx;
+			sd.args[1] = regs->cx;
+			sd.args[2] = regs->dx;
+			sd.args[3] = regs->si;
+			sd.args[4] = regs->di;
+			sd.args[5] = regs->bp;
+		}
+
+		BUILD_BUG_ON(SECCOMP_PHASE1_OK != 0);
+		BUILD_BUG_ON(SECCOMP_PHASE1_SKIP != 1);
+
+		ret = seccomp_phase1(&sd);
+		if (ret == SECCOMP_PHASE1_SKIP) {
+			regs->orig_ax = -1;
+			ret = 0;
+		} else if (ret != SECCOMP_PHASE1_OK) {
+			return ret;  /* Go directly to phase 2 */
+		}
+
+		work &= ~_TIF_SECCOMP;
+	}
+#endif
+
+	/* Do our best to finish without phase 2. */
+	if (work == 0)
+		return ret;  /* seccomp and/or nohz only (ret == 0 here) */
+
+#ifdef CONFIG_AUDITSYSCALL
+	if (work == _TIF_SYSCALL_AUDIT) {
+		/*
+		 * If there is no more work to be done except auditing,
+		 * then audit in phase 1.  Phase 2 always audits, so, if
+		 * we audit here, then we can't go on to phase 2.
+		 */
+		do_audit_syscall_entry(regs, arch);
+		return 0;
+	}
+#endif
+
+	return 1;  /* Something is enabled that we can't handle in phase 1 */
+}
+
+/* Returns the syscall nr to run (which should match regs->orig_ax). */
+long syscall_trace_enter_phase2(struct pt_regs *regs, u32 arch,
+				unsigned long phase1_result)
+{
+	long ret = 0;
+	u32 work = ACCESS_ONCE(current_thread_info()->flags) &
+		_TIF_WORK_SYSCALL_ENTRY;
+
+	BUG_ON(regs != task_pt_regs(current));
+
+	/*
+	 * If we stepped into a sysenter/syscall insn, it trapped in
+	 * kernel mode; do_debug() cleared TF and set TIF_SINGLESTEP.
+	 * If user-mode had set TF itself, then it's still clear from
+	 * do_debug() and we need to set it again to restore the user
+	 * state.  If we entered on the slow path, TF was already set.
+	 */
+	if (work & _TIF_SINGLESTEP)
+		regs->flags |= X86_EFLAGS_TF;
+
+#ifdef CONFIG_SECCOMP
+	/*
+	 * Call seccomp_phase2 before running the other hooks so that
+	 * they can see any changes made by a seccomp tracer.
+	 */
+	if (phase1_result > 1 && seccomp_phase2(phase1_result)) {
+		/* seccomp failures shouldn't expose any additional code. */
+		return -1;
+	}
+#endif
+
+	if (unlikely(work & _TIF_SYSCALL_EMU))
+		ret = -1L;
+
+	if ((ret || test_thread_flag(TIF_SYSCALL_TRACE)) &&
+	    tracehook_report_syscall_entry(regs))
+		ret = -1L;
+
+	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
+		trace_sys_enter(regs, regs->orig_ax);
+
+	do_audit_syscall_entry(regs, arch);
+
+	return ret ?: regs->orig_ax;
+}
+
+long syscall_trace_enter(struct pt_regs *regs)
+{
+	u32 arch = is_ia32_task() ? AUDIT_ARCH_I386 : AUDIT_ARCH_X86_64;
+	unsigned long phase1_result = syscall_trace_enter_phase1(regs, arch);
+
+	if (phase1_result == 0)
+		return regs->orig_ax;
+	else
+		return syscall_trace_enter_phase2(regs, arch, phase1_result);
+}
+
+void syscall_trace_leave(struct pt_regs *regs)
+{
+	bool step;
+
+	/*
+	 * We may come here right after calling schedule_user()
+	 * or do_notify_resume(), in which case we can be in RCU
+	 * user mode.
+	 */
+	user_exit();
+
+	audit_syscall_exit(regs);
+
+	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
+		trace_sys_exit(regs, regs->ax);
+
+	/*
+	 * If TIF_SYSCALL_EMU is set, we only get here because of
+	 * TIF_SINGLESTEP (i.e. this is PTRACE_SYSEMU_SINGLESTEP).
+	 * We already reported this syscall instruction in
+	 * syscall_trace_enter().
+	 */
+	step = unlikely(test_thread_flag(TIF_SINGLESTEP)) &&
+			!test_thread_flag(TIF_SYSCALL_EMU);
+	if (step || test_thread_flag(TIF_SYSCALL_TRACE))
+		tracehook_report_syscall_exit(regs, step);
+
+	user_enter();
+}
+
+/*
+ * notification of userspace execution resumption
+ * - triggered by the TIF_WORK_MASK flags
+ */
+__visible void
+do_notify_resume(struct pt_regs *regs, void *unused, __u32 thread_info_flags)
+{
+	user_exit();
+
+	if (thread_info_flags & _TIF_UPROBE)
+		uprobe_notify_resume(regs);
+
+	/* deal with pending signal delivery */
+	if (thread_info_flags & _TIF_SIGPENDING)
+		do_signal(regs);
+
+	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
+		clear_thread_flag(TIF_NOTIFY_RESUME);
+		tracehook_notify_resume(regs);
+	}
+	if (thread_info_flags & _TIF_USER_RETURN_NOTIFY)
+		fire_user_return_notifiers();
+
+	user_enter();
+}
diff --git a/arch/x86/include/asm/signal.h b/arch/x86/include/asm/signal.h
index 31eab86..b42408b 100644
--- a/arch/x86/include/asm/signal.h
+++ b/arch/x86/include/asm/signal.h
@@ -30,6 +30,7 @@ typedef sigset_t compat_sigset_t;
 #endif /* __ASSEMBLY__ */
 #include <uapi/asm/signal.h>
 #ifndef __ASSEMBLY__
+extern void do_signal(struct pt_regs *regs);
 extern void do_notify_resume(struct pt_regs *, void *, __u32);
 
 #define __ARCH_HAS_SA_RESTORER
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 7155957..558f50e 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -37,12 +37,10 @@
 #include <asm/proto.h>
 #include <asm/hw_breakpoint.h>
 #include <asm/traps.h>
+#include <asm/syscall.h>
 
 #include "tls.h"
 
-#define CREATE_TRACE_POINTS
-#include <trace/events/syscalls.h>
-
 enum x86_regset {
 	REGSET_GENERAL,
 	REGSET_FP,
@@ -1444,201 +1442,3 @@ void send_sigtrap(struct task_struct *tsk, struct pt_regs *regs,
 	/* Send us the fake SIGTRAP */
 	force_sig_info(SIGTRAP, &info, tsk);
 }
-
-static void do_audit_syscall_entry(struct pt_regs *regs, u32 arch)
-{
-#ifdef CONFIG_X86_64
-	if (arch == AUDIT_ARCH_X86_64) {
-		audit_syscall_entry(regs->orig_ax, regs->di,
-				    regs->si, regs->dx, regs->r10);
-	} else
-#endif
-	{
-		audit_syscall_entry(regs->orig_ax, regs->bx,
-				    regs->cx, regs->dx, regs->si);
-	}
-}
-
-/*
- * We can return 0 to resume the syscall or anything else to go to phase
- * 2.  If we resume the syscall, we need to put something appropriate in
- * regs->orig_ax.
- *
- * NB: We don't have full pt_regs here, but regs->orig_ax and regs->ax
- * are fully functional.
- *
- * For phase 2's benefit, our return value is:
- * 0:			resume the syscall
- * 1:			go to phase 2; no seccomp phase 2 needed
- * anything else:	go to phase 2; pass return value to seccomp
- */
-unsigned long syscall_trace_enter_phase1(struct pt_regs *regs, u32 arch)
-{
-	unsigned long ret = 0;
-	u32 work;
-
-	BUG_ON(regs != task_pt_regs(current));
-
-	work = ACCESS_ONCE(current_thread_info()->flags) &
-		_TIF_WORK_SYSCALL_ENTRY;
-
-	/*
-	 * If TIF_NOHZ is set, we are required to call user_exit() before
-	 * doing anything that could touch RCU.
-	 */
-	if (work & _TIF_NOHZ) {
-		user_exit();
-		work &= ~_TIF_NOHZ;
-	}
-
-#ifdef CONFIG_SECCOMP
-	/*
-	 * Do seccomp first -- it should minimize exposure of other
-	 * code, and keeping seccomp fast is probably more valuable
-	 * than the rest of this.
-	 */
-	if (work & _TIF_SECCOMP) {
-		struct seccomp_data sd;
-
-		sd.arch = arch;
-		sd.nr = regs->orig_ax;
-		sd.instruction_pointer = regs->ip;
-#ifdef CONFIG_X86_64
-		if (arch == AUDIT_ARCH_X86_64) {
-			sd.args[0] = regs->di;
-			sd.args[1] = regs->si;
-			sd.args[2] = regs->dx;
-			sd.args[3] = regs->r10;
-			sd.args[4] = regs->r8;
-			sd.args[5] = regs->r9;
-		} else
-#endif
-		{
-			sd.args[0] = regs->bx;
-			sd.args[1] = regs->cx;
-			sd.args[2] = regs->dx;
-			sd.args[3] = regs->si;
-			sd.args[4] = regs->di;
-			sd.args[5] = regs->bp;
-		}
-
-		BUILD_BUG_ON(SECCOMP_PHASE1_OK != 0);
-		BUILD_BUG_ON(SECCOMP_PHASE1_SKIP != 1);
-
-		ret = seccomp_phase1(&sd);
-		if (ret == SECCOMP_PHASE1_SKIP) {
-			regs->orig_ax = -1;
-			ret = 0;
-		} else if (ret != SECCOMP_PHASE1_OK) {
-			return ret;  /* Go directly to phase 2 */
-		}
-
-		work &= ~_TIF_SECCOMP;
-	}
-#endif
-
-	/* Do our best to finish without phase 2. */
-	if (work == 0)
-		return ret;  /* seccomp and/or nohz only (ret == 0 here) */
-
-#ifdef CONFIG_AUDITSYSCALL
-	if (work == _TIF_SYSCALL_AUDIT) {
-		/*
-		 * If there is no more work to be done except auditing,
-		 * then audit in phase 1.  Phase 2 always audits, so, if
-		 * we audit here, then we can't go on to phase 2.
-		 */
-		do_audit_syscall_entry(regs, arch);
-		return 0;
-	}
-#endif
-
-	return 1;  /* Something is enabled that we can't handle in phase 1 */
-}
-
-/* Returns the syscall nr to run (which should match regs->orig_ax). */
-long syscall_trace_enter_phase2(struct pt_regs *regs, u32 arch,
-				unsigned long phase1_result)
-{
-	long ret = 0;
-	u32 work = ACCESS_ONCE(current_thread_info()->flags) &
-		_TIF_WORK_SYSCALL_ENTRY;
-
-	BUG_ON(regs != task_pt_regs(current));
-
-	/*
-	 * If we stepped into a sysenter/syscall insn, it trapped in
-	 * kernel mode; do_debug() cleared TF and set TIF_SINGLESTEP.
-	 * If user-mode had set TF itself, then it's still clear from
-	 * do_debug() and we need to set it again to restore the user
-	 * state.  If we entered on the slow path, TF was already set.
-	 */
-	if (work & _TIF_SINGLESTEP)
-		regs->flags |= X86_EFLAGS_TF;
-
-#ifdef CONFIG_SECCOMP
-	/*
-	 * Call seccomp_phase2 before running the other hooks so that
-	 * they can see any changes made by a seccomp tracer.
-	 */
-	if (phase1_result > 1 && seccomp_phase2(phase1_result)) {
-		/* seccomp failures shouldn't expose any additional code. */
-		return -1;
-	}
-#endif
-
-	if (unlikely(work & _TIF_SYSCALL_EMU))
-		ret = -1L;
-
-	if ((ret || test_thread_flag(TIF_SYSCALL_TRACE)) &&
-	    tracehook_report_syscall_entry(regs))
-		ret = -1L;
-
-	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
-		trace_sys_enter(regs, regs->orig_ax);
-
-	do_audit_syscall_entry(regs, arch);
-
-	return ret ?: regs->orig_ax;
-}
-
-long syscall_trace_enter(struct pt_regs *regs)
-{
-	u32 arch = is_ia32_task() ? AUDIT_ARCH_I386 : AUDIT_ARCH_X86_64;
-	unsigned long phase1_result = syscall_trace_enter_phase1(regs, arch);
-
-	if (phase1_result == 0)
-		return regs->orig_ax;
-	else
-		return syscall_trace_enter_phase2(regs, arch, phase1_result);
-}
-
-void syscall_trace_leave(struct pt_regs *regs)
-{
-	bool step;
-
-	/*
-	 * We may come here right after calling schedule_user()
-	 * or do_notify_resume(), in which case we can be in RCU
-	 * user mode.
-	 */
-	user_exit();
-
-	audit_syscall_exit(regs);
-
-	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
-		trace_sys_exit(regs, regs->ax);
-
-	/*
-	 * If TIF_SYSCALL_EMU is set, we only get here because of
-	 * TIF_SINGLESTEP (i.e. this is PTRACE_SYSEMU_SINGLESTEP).
-	 * We already reported this syscall instruction in
-	 * syscall_trace_enter().
-	 */
-	step = unlikely(test_thread_flag(TIF_SINGLESTEP)) &&
-			!test_thread_flag(TIF_SYSCALL_EMU);
-	if (step || test_thread_flag(TIF_SYSCALL_TRACE))
-		tracehook_report_syscall_exit(regs, step);
-
-	user_enter();
-}
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 6c22aad..7e88cc7 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -700,7 +700,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
  * want to handle. Thus you cannot kill init even with a SIGKILL even by
  * mistake.
  */
-static void do_signal(struct pt_regs *regs)
+void do_signal(struct pt_regs *regs)
 {
 	struct ksignal ksig;
 
@@ -735,32 +735,6 @@ static void do_signal(struct pt_regs *regs)
 	restore_saved_sigmask();
 }
 
-/*
- * notification of userspace execution resumption
- * - triggered by the TIF_WORK_MASK flags
- */
-__visible void
-do_notify_resume(struct pt_regs *regs, void *unused, __u32 thread_info_flags)
-{
-	user_exit();
-
-	if (thread_info_flags & _TIF_UPROBE)
-		uprobe_notify_resume(regs);
-
-	/* deal with pending signal delivery */
-	if (thread_info_flags & _TIF_SIGPENDING)
-		do_signal(regs);
-
-	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
-		clear_thread_flag(TIF_NOTIFY_RESUME);
-		tracehook_notify_resume(regs);
-	}
-	if (thread_info_flags & _TIF_USER_RETURN_NOTIFY)
-		fire_user_return_notifiers();
-
-	user_enter();
-}
-
 void signal_fault(struct pt_regs *regs, void __user *frame, char *where)
 {
 	struct task_struct *me = current;

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [tip:x86/asm] x86/traps, context_tracking: Assert that we' re in CONTEXT_KERNEL in exception entries
  2015-07-03 19:44 ` [PATCH v5 07/17] x86/traps: Assert that we're in CONTEXT_KERNEL in exception entries Andy Lutomirski
@ 2015-07-07 10:51   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 70+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-07 10:51 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: brgerst, mingo, keescook, vda.linux, fweisbec, tglx, oleg, luto,
	torvalds, bp, hpa, dvlasenk, luto, peterz, linux-kernel, riel

Commit-ID:  02fdcd5eac9d653d1addbd69b0c58d73650e1c00
Gitweb:     http://git.kernel.org/tip/02fdcd5eac9d653d1addbd69b0c58d73650e1c00
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 3 Jul 2015 12:44:24 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 7 Jul 2015 10:59:05 +0200

x86/traps, context_tracking: Assert that we're in CONTEXT_KERNEL in exception entries

Other than the super-atomic exception entries, all exception
entries are supposed to switch our context tracking state to
CONTEXT_KERNEL. Assert that they do.  These assertions appear
trivial at this point, as exception_enter() is the function
responsible for switching context, but I'm planning on reworking
x86's exception context tracking, and these assertions will help
make sure that all of this code keeps working.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/20fa1ee2d943233a184aaf96ff75394d3b34dfba.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/traps.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index f579192..2a783c4 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -292,6 +292,8 @@ static void do_error_trap(struct pt_regs *regs, long error_code, char *str,
 	enum ctx_state prev_state = exception_enter();
 	siginfo_t info;
 
+	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
+
 	if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) !=
 			NOTIFY_STOP) {
 		conditional_sti(regs);
@@ -376,6 +378,7 @@ dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
 	siginfo_t *info;
 
 	prev_state = exception_enter();
+	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	if (notify_die(DIE_TRAP, "bounds", regs, error_code,
 			X86_TRAP_BR, SIGSEGV) == NOTIFY_STOP)
 		goto exit;
@@ -457,6 +460,7 @@ do_general_protection(struct pt_regs *regs, long error_code)
 	enum ctx_state prev_state;
 
 	prev_state = exception_enter();
+	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	conditional_sti(regs);
 
 	if (v8086_mode(regs)) {
@@ -514,6 +518,7 @@ dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code)
 		return;
 
 	prev_state = ist_enter(regs);
+	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 #ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
 	if (kgdb_ll_trap(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
 				SIGTRAP) == NOTIFY_STOP)
@@ -750,6 +755,7 @@ dotraplinkage void do_coprocessor_error(struct pt_regs *regs, long error_code)
 	enum ctx_state prev_state;
 
 	prev_state = exception_enter();
+	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	math_error(regs, error_code, X86_TRAP_MF);
 	exception_exit(prev_state);
 }
@@ -760,6 +766,7 @@ do_simd_coprocessor_error(struct pt_regs *regs, long error_code)
 	enum ctx_state prev_state;
 
 	prev_state = exception_enter();
+	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	math_error(regs, error_code, X86_TRAP_XF);
 	exception_exit(prev_state);
 }
@@ -776,6 +783,7 @@ do_device_not_available(struct pt_regs *regs, long error_code)
 	enum ctx_state prev_state;
 
 	prev_state = exception_enter();
+	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	BUG_ON(use_eager_fpu());
 
 #ifdef CONFIG_MATH_EMULATION
@@ -805,6 +813,7 @@ dotraplinkage void do_iret_error(struct pt_regs *regs, long error_code)
 	enum ctx_state prev_state;
 
 	prev_state = exception_enter();
+	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	local_irq_enable();
 
 	info.si_signo = SIGILL;

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [tip:x86/asm] x86/entry: Add enter_from_user_mode() and use it in syscalls
  2015-07-03 19:44 ` [PATCH v5 08/17] x86/entry: Add enter_from_user_mode and use it in syscalls Andy Lutomirski
@ 2015-07-07 10:51   ` tip-bot for Andy Lutomirski
  2015-07-14 23:00     ` Frederic Weisbecker
  2015-12-21 20:50   ` [PATCH v5 08/17] x86/entry: Add enter_from_user_mode " Sasha Levin
  1 sibling, 1 reply; 70+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-07 10:51 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: luto, linux-kernel, brgerst, peterz, mingo, vda.linux, luto, hpa,
	dvlasenk, riel, torvalds, bp, tglx, keescook, oleg, fweisbec

Commit-ID:  feed36cde0a10adb957445a37e48f957f30b2273
Gitweb:     http://git.kernel.org/tip/feed36cde0a10adb957445a37e48f957f30b2273
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 3 Jul 2015 12:44:25 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 7 Jul 2015 10:59:06 +0200

x86/entry: Add enter_from_user_mode() and use it in syscalls

Changing the x86 context tracking hooks is dangerous because
there are no good checks that we track our context correctly.
Add a helper to check that we're actually in CONTEXT_USER when
we enter from user mode and wire it up for syscall entries.

Subsequent patches will wire this up for all non-NMI entries as
well.  NMIs are their own special beast and cannot currently
switch overall context tracking state.  Instead, they have their
own special RCU hooks.

This is a tiny speedup if !CONFIG_CONTEXT_TRACKING (removes a
branch) and a tiny slowdown if CONFIG_CONTEXT_TRACING (adds a
layer of indirection).  Eventually, we should fix up the core
context tracking code to supply a function that does what we
want (and can be much simpler than user_exit), which will enable
us to get rid of the extra call.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/853b42420066ec3fb856779cdc223a6dcb5d355b.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/entry/common.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 917d0c3..9a327ee 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -28,6 +28,15 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/syscalls.h>
 
+#ifdef CONFIG_CONTEXT_TRACKING
+/* Called on entry from user mode with IRQs off. */
+__visible void enter_from_user_mode(void)
+{
+	CT_WARN_ON(ct_state() != CONTEXT_USER);
+	user_exit();
+}
+#endif
+
 static void do_audit_syscall_entry(struct pt_regs *regs, u32 arch)
 {
 #ifdef CONFIG_X86_64
@@ -65,14 +74,16 @@ unsigned long syscall_trace_enter_phase1(struct pt_regs *regs, u32 arch)
 	work = ACCESS_ONCE(current_thread_info()->flags) &
 		_TIF_WORK_SYSCALL_ENTRY;
 
+#ifdef CONFIG_CONTEXT_TRACKING
 	/*
 	 * If TIF_NOHZ is set, we are required to call user_exit() before
 	 * doing anything that could touch RCU.
 	 */
 	if (work & _TIF_NOHZ) {
-		user_exit();
+		enter_from_user_mode();
 		work &= ~_TIF_NOHZ;
 	}
+#endif
 
 #ifdef CONFIG_SECCOMP
 	/*

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [tip:x86/asm] x86/entry: Add new, comprehensible entry and exit handlers written in C
  2015-07-03 19:44 ` [PATCH v5 09/17] x86/entry: Add new, comprehensible entry and exit hooks Andy Lutomirski
@ 2015-07-07 10:51   ` tip-bot for Andy Lutomirski
  2015-07-14 23:07     ` Frederic Weisbecker
  0 siblings, 1 reply; 70+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-07 10:51 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, oleg, luto, fweisbec, dvlasenk, linux-kernel, riel, tglx,
	peterz, keescook, brgerst, mingo, vda.linux, torvalds, luto, bp

Commit-ID:  c5c46f59e4e7c1ab244b8d38f2b61d317df90bba
Gitweb:     http://git.kernel.org/tip/c5c46f59e4e7c1ab244b8d38f2b61d317df90bba
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 3 Jul 2015 12:44:26 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 7 Jul 2015 10:59:06 +0200

x86/entry: Add new, comprehensible entry and exit handlers written in C

The current x86 entry and exit code, written in a mixture of assembly and
C code, is incomprehensible due to being open-coded in a lot of places
without coherent documentation.

It appears to work primary by luck and duct tape: i.e. obvious runtime
failures were fixed on-demand, without re-thinking the design.

Due to those reasons our confidence level in that code is low, and it is
very difficult to incrementally improve.

Add new code written in C, in preparation for simply deleting the old
entry code.

prepare_exit_to_usermode() is a new function that will handle all
slow path exits to user mode.  It is called with IRQs disabled
and it leaves us in a state in which it is safe to immediately
return to user mode.  IRQs must not be re-enabled at any point
after prepare_exit_to_usermode() returns and user mode is actually
entered. (We can, of course, fail to enter user mode and treat
that failure as a fresh entry to kernel mode.)

All callers of do_notify_resume() will be migrated to call
prepare_exit_to_usermode() instead; prepare_exit_to_usermode() needs
to do everything that do_notify_resume() does today, but it also
takes care of scheduling and context tracking.  Unlike
do_notify_resume(), it does not need to be called in a loop.

syscall_return_slowpath() is exactly what it sounds like: it will
be called on any syscall exit slow path. It will replace
syscall_trace_leave() and it calls prepare_exit_to_usermode() on the
way out.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/c57c8b87661a4152801d7d3786eac2d1a2f209dd.1435952415.git.luto@kernel.org
[ Improved the changelog a bit. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/entry/common.c | 112 +++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 111 insertions(+), 1 deletion(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 9a327ee..febc530 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -207,6 +207,7 @@ long syscall_trace_enter(struct pt_regs *regs)
 		return syscall_trace_enter_phase2(regs, arch, phase1_result);
 }
 
+/* Deprecated. */
 void syscall_trace_leave(struct pt_regs *regs)
 {
 	bool step;
@@ -237,8 +238,117 @@ void syscall_trace_leave(struct pt_regs *regs)
 	user_enter();
 }
 
+static struct thread_info *pt_regs_to_thread_info(struct pt_regs *regs)
+{
+	unsigned long top_of_stack =
+		(unsigned long)(regs + 1) + TOP_OF_KERNEL_STACK_PADDING;
+	return (struct thread_info *)(top_of_stack - THREAD_SIZE);
+}
+
+/* Called with IRQs disabled. */
+__visible void prepare_exit_to_usermode(struct pt_regs *regs)
+{
+	if (WARN_ON(!irqs_disabled()))
+		local_irq_disable();
+
+	/*
+	 * In order to return to user mode, we need to have IRQs off with
+	 * none of _TIF_SIGPENDING, _TIF_NOTIFY_RESUME, _TIF_USER_RETURN_NOTIFY,
+	 * _TIF_UPROBE, or _TIF_NEED_RESCHED set.  Several of these flags
+	 * can be set at any time on preemptable kernels if we have IRQs on,
+	 * so we need to loop.  Disabling preemption wouldn't help: doing the
+	 * work to clear some of the flags can sleep.
+	 */
+	while (true) {
+		u32 cached_flags =
+			READ_ONCE(pt_regs_to_thread_info(regs)->flags);
+
+		if (!(cached_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_RESUME |
+				      _TIF_UPROBE | _TIF_NEED_RESCHED)))
+			break;
+
+		/* We have work to do. */
+		local_irq_enable();
+
+		if (cached_flags & _TIF_NEED_RESCHED)
+			schedule();
+
+		if (cached_flags & _TIF_UPROBE)
+			uprobe_notify_resume(regs);
+
+		/* deal with pending signal delivery */
+		if (cached_flags & _TIF_SIGPENDING)
+			do_signal(regs);
+
+		if (cached_flags & _TIF_NOTIFY_RESUME) {
+			clear_thread_flag(TIF_NOTIFY_RESUME);
+			tracehook_notify_resume(regs);
+		}
+
+		if (cached_flags & _TIF_USER_RETURN_NOTIFY)
+			fire_user_return_notifiers();
+
+		/* Disable IRQs and retry */
+		local_irq_disable();
+	}
+
+	user_enter();
+}
+
+/*
+ * Called with IRQs on and fully valid regs.  Returns with IRQs off in a
+ * state such that we can immediately switch to user mode.
+ */
+__visible void syscall_return_slowpath(struct pt_regs *regs)
+{
+	struct thread_info *ti = pt_regs_to_thread_info(regs);
+	u32 cached_flags = READ_ONCE(ti->flags);
+	bool step;
+
+	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
+
+	if (WARN(irqs_disabled(), "syscall %ld left IRQs disabled",
+		 regs->orig_ax))
+		local_irq_enable();
+
+	/*
+	 * First do one-time work.  If these work items are enabled, we
+	 * want to run them exactly once per syscall exit with IRQs on.
+	 */
+	if (cached_flags & (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT |
+			    _TIF_SINGLESTEP | _TIF_SYSCALL_TRACEPOINT)) {
+		audit_syscall_exit(regs);
+
+		if (cached_flags & _TIF_SYSCALL_TRACEPOINT)
+			trace_sys_exit(regs, regs->ax);
+
+		/*
+		 * If TIF_SYSCALL_EMU is set, we only get here because of
+		 * TIF_SINGLESTEP (i.e. this is PTRACE_SYSEMU_SINGLESTEP).
+		 * We already reported this syscall instruction in
+		 * syscall_trace_enter().
+		 */
+		step = unlikely(
+			(cached_flags & (_TIF_SINGLESTEP | _TIF_SYSCALL_EMU))
+			== _TIF_SINGLESTEP);
+		if (step || cached_flags & _TIF_SYSCALL_TRACE)
+			tracehook_report_syscall_exit(regs, step);
+	}
+
+#ifdef CONFIG_COMPAT
+	/*
+	 * Compat syscalls set TS_COMPAT.  Make sure we clear it before
+	 * returning to user mode.
+	 */
+	ti->status &= ~TS_COMPAT;
+#endif
+
+	local_irq_disable();
+	prepare_exit_to_usermode(regs);
+}
+
 /*
- * notification of userspace execution resumption
+ * Deprecated notification of userspace execution resumption
  * - triggered by the TIF_WORK_MASK flags
  */
 __visible void

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [tip:x86/asm] x86/entry/64: Really create an error-entry-from-usermode code path
  2015-07-03 19:44 ` [PATCH v5 10/17] x86/entry/64: Really create an error-entry-from-usermode code path Andy Lutomirski
@ 2015-07-07 10:52   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 70+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-07 10:52 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: torvalds, dvlasenk, bp, hpa, linux-kernel, peterz, brgerst, riel,
	oleg, vda.linux, luto, luto, fweisbec, keescook, tglx, mingo

Commit-ID:  cb6f64ed5a04036eef07e70b57dd5dd78f2fbcef
Gitweb:     http://git.kernel.org/tip/cb6f64ed5a04036eef07e70b57dd5dd78f2fbcef
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 3 Jul 2015 12:44:27 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 7 Jul 2015 10:59:07 +0200

x86/entry/64: Really create an error-entry-from-usermode code path

In 539f51136500 ("x86/asm/entry/64: Disentangle error_entry/exit
gsbase/ebx/usermode code"), I arranged the code slightly wrong
-- IRET faults would skip the code path that was intended to
execute on all error entries from user mode.  Fix it up.

While we're at it, make all the labels in error_entry local.

This does not fix a bug, but we'll need it, and it slightly
shrinks the code.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/91e17891e49fa3d61357eadc451529ad48143ee1.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/entry/entry_64.S | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 141a5d4..ccfcba9 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1143,12 +1143,17 @@ ENTRY(error_entry)
 	SAVE_EXTRA_REGS 8
 	xorl	%ebx, %ebx
 	testb	$3, CS+8(%rsp)
-	jz	error_kernelspace
+	jz	.Lerror_kernelspace
 
-	/* We entered from user mode */
+.Lerror_entry_from_usermode_swapgs:
+	/*
+	 * We entered from user mode or we're pretending to have entered
+	 * from user mode due to an IRET fault.
+	 */
 	SWAPGS
 
-error_entry_done:
+.Lerror_entry_from_usermode_after_swapgs:
+.Lerror_entry_done:
 	TRACE_IRQS_OFF
 	ret
 
@@ -1158,31 +1163,30 @@ error_entry_done:
 	 * truncated RIP for IRET exceptions returning to compat mode. Check
 	 * for these here too.
 	 */
-error_kernelspace:
+.Lerror_kernelspace:
 	incl	%ebx
 	leaq	native_irq_return_iret(%rip), %rcx
 	cmpq	%rcx, RIP+8(%rsp)
-	je	error_bad_iret
+	je	.Lerror_bad_iret
 	movl	%ecx, %eax			/* zero extend */
 	cmpq	%rax, RIP+8(%rsp)
-	je	bstep_iret
+	je	.Lbstep_iret
 	cmpq	$gs_change, RIP+8(%rsp)
-	jne	error_entry_done
+	jne	.Lerror_entry_done
 
 	/*
 	 * hack: gs_change can fail with user gsbase.  If this happens, fix up
 	 * gsbase and proceed.  We'll fix up the exception and land in
 	 * gs_change's error handler with kernel gsbase.
 	 */
-	SWAPGS
-	jmp	error_entry_done
+	jmp	.Lerror_entry_from_usermode_swapgs
 
-bstep_iret:
+.Lbstep_iret:
 	/* Fix truncated RIP */
 	movq	%rcx, RIP+8(%rsp)
 	/* fall through */
 
-error_bad_iret:
+.Lerror_bad_iret:
 	/*
 	 * We came from an IRET to user mode, so we have user gsbase.
 	 * Switch to kernel gsbase:
@@ -1198,7 +1202,7 @@ error_bad_iret:
 	call	fixup_bad_iret
 	mov	%rax, %rsp
 	decl	%ebx
-	jmp	error_entry_done
+	jmp	.Lerror_entry_from_usermode_after_swapgs
 END(error_entry)
 
 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [tip:x86/asm] x86/entry/64: Migrate 64-bit and compat syscalls to the new exit handlers and remove old assembly code
  2015-07-03 19:44 ` [PATCH v5 11/17] x86/entry/64: Migrate 64-bit and compat syscalls to new exit hooks Andy Lutomirski
@ 2015-07-07 10:52   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 70+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-07 10:52 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: luto, peterz, vda.linux, hpa, brgerst, bp, riel, oleg, keescook,
	luto, torvalds, linux-kernel, dvlasenk, mingo, tglx, fweisbec

Commit-ID:  29ea1b258b98a862e59d72556714b75051ae93fb
Gitweb:     http://git.kernel.org/tip/29ea1b258b98a862e59d72556714b75051ae93fb
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 3 Jul 2015 12:44:28 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 7 Jul 2015 10:59:07 +0200

x86/entry/64: Migrate 64-bit and compat syscalls to the new exit handlers and remove old assembly code

These need to be migrated together, as the compat case used to
jump into the middle of the 64-bit exit code.

Remove the old assembly code.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/d4d1d70de08ac3640badf50048a9e8f18fe2497f.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/entry/entry_64.S        | 69 +++++-----------------------------------
 arch/x86/entry/entry_64_compat.S |  6 ++--
 2 files changed, 11 insertions(+), 64 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index ccfcba9..4ca5b78 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -229,6 +229,11 @@ entry_SYSCALL_64_fastpath:
 	 */
 	USERGS_SYSRET64
 
+GLOBAL(int_ret_from_sys_call_irqs_off)
+	TRACE_IRQS_ON
+	ENABLE_INTERRUPTS(CLBR_NONE)
+	jmp int_ret_from_sys_call
+
 	/* Do syscall entry tracing */
 tracesys:
 	movq	%rsp, %rdi
@@ -272,69 +277,11 @@ tracesys_phase2:
  * Has correct iret frame.
  */
 GLOBAL(int_ret_from_sys_call)
-	DISABLE_INTERRUPTS(CLBR_NONE)
-int_ret_from_sys_call_irqs_off: /* jumps come here from the irqs-off SYSRET path */
-	TRACE_IRQS_OFF
-	movl	$_TIF_ALLWORK_MASK, %edi
-	/* edi:	mask to check */
-GLOBAL(int_with_check)
-	LOCKDEP_SYS_EXIT_IRQ
-	GET_THREAD_INFO(%rcx)
-	movl	TI_flags(%rcx), %edx
-	andl	%edi, %edx
-	jnz	int_careful
-	andl	$~TS_COMPAT, TI_status(%rcx)
-	jmp	syscall_return
-
-	/*
-	 * Either reschedule or signal or syscall exit tracking needed.
-	 * First do a reschedule test.
-	 * edx:	work, edi: workmask
-	 */
-int_careful:
-	bt	$TIF_NEED_RESCHED, %edx
-	jnc	int_very_careful
-	TRACE_IRQS_ON
-	ENABLE_INTERRUPTS(CLBR_NONE)
-	pushq	%rdi
-	SCHEDULE_USER
-	popq	%rdi
-	DISABLE_INTERRUPTS(CLBR_NONE)
-	TRACE_IRQS_OFF
-	jmp	int_with_check
-
-	/* handle signals and tracing -- both require a full pt_regs */
-int_very_careful:
-	TRACE_IRQS_ON
-	ENABLE_INTERRUPTS(CLBR_NONE)
 	SAVE_EXTRA_REGS
-	/* Check for syscall exit trace */
-	testl	$_TIF_WORK_SYSCALL_EXIT, %edx
-	jz	int_signal
-	pushq	%rdi
-	leaq	8(%rsp), %rdi			/* &ptregs -> arg1 */
-	call	syscall_trace_leave
-	popq	%rdi
-	andl	$~(_TIF_WORK_SYSCALL_EXIT|_TIF_SYSCALL_EMU), %edi
-	jmp	int_restore_rest
-
-int_signal:
-	testl	$_TIF_DO_NOTIFY_MASK, %edx
-	jz	1f
-	movq	%rsp, %rdi			/* &ptregs -> arg1 */
-	xorl	%esi, %esi			/* oldset -> arg2 */
-	call	do_notify_resume
-1:	movl	$_TIF_WORK_MASK, %edi
-int_restore_rest:
+	movq	%rsp, %rdi
+	call	syscall_return_slowpath	/* returns with IRQs disabled */
 	RESTORE_EXTRA_REGS
-	DISABLE_INTERRUPTS(CLBR_NONE)
-	TRACE_IRQS_OFF
-	jmp	int_with_check
-
-syscall_return:
-	/* The IRETQ could re-enable interrupts: */
-	DISABLE_INTERRUPTS(CLBR_ANY)
-	TRACE_IRQS_IRETQ
+	TRACE_IRQS_IRETQ		/* we're about to change IF */
 
 	/*
 	 * Try to use SYSRET instead of IRET if we're returning to
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index e5ebdd9..d9bbd31 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -210,10 +210,10 @@ sysexit_from_sys_call:
 	.endm
 
 	.macro auditsys_exit exit
-	testl	$(_TIF_ALLWORK_MASK & ~_TIF_SYSCALL_AUDIT), ASM_THREAD_INFO(TI_flags, %rsp, SIZEOF_PTREGS)
-	jnz	ia32_ret_from_sys_call
 	TRACE_IRQS_ON
 	ENABLE_INTERRUPTS(CLBR_NONE)
+	testl	$(_TIF_ALLWORK_MASK & ~_TIF_SYSCALL_AUDIT), ASM_THREAD_INFO(TI_flags, %rsp, SIZEOF_PTREGS)
+	jnz	ia32_ret_from_sys_call
 	movl	%eax, %esi		/* second arg, syscall return value */
 	cmpl	$-MAX_ERRNO, %eax	/* is it an error ? */
 	jbe	1f
@@ -232,7 +232,7 @@ sysexit_from_sys_call:
 	movq	%rax, R10(%rsp)
 	movq	%rax, R9(%rsp)
 	movq	%rax, R8(%rsp)
-	jmp	int_with_check
+	jmp	int_ret_from_sys_call_irqs_off
 	.endm
 
 sysenter_auditsys:

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [tip:x86/asm] x86/asm/entry/64: Save all regs on interrupt entry
  2015-07-03 19:44 ` [PATCH v5 12/17] x86/asm/entry/64: Save all regs on interrupt entry Andy Lutomirski
@ 2015-07-07 10:52   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 70+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-07 10:52 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dvlasenk, peterz, torvalds, tglx, riel, fweisbec, luto, bp,
	brgerst, vda.linux, keescook, oleg, mingo, luto, linux-kernel,
	hpa

Commit-ID:  ff467594f2a4be01a0fa5e9ffc223fa930d232dd
Gitweb:     http://git.kernel.org/tip/ff467594f2a4be01a0fa5e9ffc223fa930d232dd
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 3 Jul 2015 12:44:29 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 7 Jul 2015 10:59:07 +0200

x86/asm/entry/64: Save all regs on interrupt entry

To prepare for the big rewrite of the error and interrupt exit
paths, we will need pt_regs completely filled in.

It's already completely filled in when error_exit runs, so rearrange
interrupt handling to match it.  This will slow down interrupt
handling very slightly (eight instructions), but the
simplification it enables will be more than worth it.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/d8a766a7f558b30e6e01352854628a2d9943460c.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/entry/calling.h  |  3 ---
 arch/x86/entry/entry_64.S | 29 +++++++++--------------------
 2 files changed, 9 insertions(+), 23 deletions(-)

diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index 519207f..3c71dd9 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -135,9 +135,6 @@ For 32-bit we have the following conventions - kernel is built with
 	movq %rbp, 4*8+\offset(%rsp)
 	movq %rbx, 5*8+\offset(%rsp)
 	.endm
-	.macro SAVE_EXTRA_REGS_RBP offset=0
-	movq %rbp, 4*8+\offset(%rsp)
-	.endm
 
 	.macro RESTORE_EXTRA_REGS offset=0
 	movq 0*8+\offset(%rsp), %r15
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 4ca5b78..65029f4 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -502,21 +502,13 @@ END(irq_entries_start)
 /* 0(%rsp): ~(interrupt number) */
 	.macro interrupt func
 	cld
-	/*
-	 * Since nothing in interrupt handling code touches r12...r15 members
-	 * of "struct pt_regs", and since interrupts can nest, we can save
-	 * four stack slots and simultaneously provide
-	 * an unwind-friendly stack layout by saving "truncated" pt_regs
-	 * exactly up to rbp slot, without these members.
-	 */
-	ALLOC_PT_GPREGS_ON_STACK -RBP
-	SAVE_C_REGS -RBP
-	/* this goes to 0(%rsp) for unwinder, not for saving the value: */
-	SAVE_EXTRA_REGS_RBP -RBP
+	ALLOC_PT_GPREGS_ON_STACK
+	SAVE_C_REGS
+	SAVE_EXTRA_REGS
 
-	leaq	-RBP(%rsp), %rdi		/* arg1 for \func (pointer to pt_regs) */
+	movq	%rsp,%rdi	/* arg1 for \func (pointer to pt_regs) */
 
-	testb	$3, CS-RBP(%rsp)
+	testb	$3, CS(%rsp)
 	jz	1f
 	SWAPGS
 1:
@@ -553,9 +545,7 @@ ret_from_intr:
 	decl	PER_CPU_VAR(irq_count)
 
 	/* Restore saved previous stack */
-	popq	%rsi
-	/* return code expects complete pt_regs - adjust rsp accordingly: */
-	leaq	-RBP(%rsi), %rsp
+	popq	%rsp
 
 	testb	$3, CS(%rsp)
 	jz	retint_kernel
@@ -580,7 +570,7 @@ retint_swapgs:					/* return to user-space */
 	TRACE_IRQS_IRETQ
 
 	SWAPGS
-	jmp	restore_c_regs_and_iret
+	jmp	restore_regs_and_iret
 
 /* Returning to kernel space */
 retint_kernel:
@@ -604,6 +594,8 @@ retint_kernel:
  * At this label, code paths which return to kernel and to user,
  * which come from interrupts/exception and from syscalls, merge.
  */
+restore_regs_and_iret:
+	RESTORE_EXTRA_REGS
 restore_c_regs_and_iret:
 	RESTORE_C_REGS
 	REMOVE_PT_GPREGS_FROM_STACK 8
@@ -674,12 +666,10 @@ retint_signal:
 	jz	retint_swapgs
 	TRACE_IRQS_ON
 	ENABLE_INTERRUPTS(CLBR_NONE)
-	SAVE_EXTRA_REGS
 	movq	$-1, ORIG_RAX(%rsp)
 	xorl	%esi, %esi			/* oldset */
 	movq	%rsp, %rdi			/* &pt_regs */
 	call	do_notify_resume
-	RESTORE_EXTRA_REGS
 	DISABLE_INTERRUPTS(CLBR_NONE)
 	TRACE_IRQS_OFF
 	GET_THREAD_INFO(%rcx)
@@ -1160,7 +1150,6 @@ END(error_entry)
  */
 ENTRY(error_exit)
 	movl	%ebx, %eax
-	RESTORE_EXTRA_REGS
 	DISABLE_INTERRUPTS(CLBR_NONE)
 	TRACE_IRQS_OFF
 	testl	%eax, %eax

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [tip:x86/asm] x86/asm/entry/64: Simplify IRQ stack pt_regs handling
  2015-07-03 19:44 ` [PATCH v5 13/17] x86/asm/entry/64: Simplify irq stack pt_regs handling Andy Lutomirski
@ 2015-07-07 10:53   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 70+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-07 10:53 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, brgerst, torvalds, hpa, oleg, luto, bp, keescook,
	peterz, mingo, luto, riel, tglx, dvlasenk, vda.linux, fweisbec

Commit-ID:  a586f98e9767fb0dfdb989002866b4024f00ce08
Gitweb:     http://git.kernel.org/tip/a586f98e9767fb0dfdb989002866b4024f00ce08
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 3 Jul 2015 12:44:30 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 7 Jul 2015 10:59:08 +0200

x86/asm/entry/64: Simplify IRQ stack pt_regs handling

There's no need for both RSI and RDI to point to the original stack.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/3a0481f809dd340c7d3f54ce3fd6d66ef2a578cd.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/entry/entry_64.S | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 65029f4..83eb63d 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -506,8 +506,6 @@ END(irq_entries_start)
 	SAVE_C_REGS
 	SAVE_EXTRA_REGS
 
-	movq	%rsp,%rdi	/* arg1 for \func (pointer to pt_regs) */
-
 	testb	$3, CS(%rsp)
 	jz	1f
 	SWAPGS
@@ -519,14 +517,14 @@ END(irq_entries_start)
 	 * a little cheaper to use a separate counter in the PDA (short of
 	 * moving irq_enter into assembly, which would be too much work)
 	 */
-	movq	%rsp, %rsi
+	movq	%rsp, %rdi
 	incl	PER_CPU_VAR(irq_count)
 	cmovzq	PER_CPU_VAR(irq_stack_ptr), %rsp
-	pushq	%rsi
+	pushq	%rdi
 	/* We entered an interrupt context - irqs are off: */
 	TRACE_IRQS_OFF
 
-	call	\func
+	call	\func	/* rdi points to pt_regs */
 	.endm
 
 	/*

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code
  2015-07-03 19:44 ` [PATCH v5 14/17] x86/asm/entry/64: Migrate error and interrupt exit work to C Andy Lutomirski
@ 2015-07-07 10:53   ` tip-bot for Andy Lutomirski
  2015-08-11 22:18     ` Frederic Weisbecker
  2015-08-11 22:38     ` Frederic Weisbecker
  0 siblings, 2 replies; 70+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-07 10:53 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: torvalds, mingo, fweisbec, oleg, luto, luto, hpa, linux-kernel,
	peterz, dvlasenk, riel, bp, brgerst, vda.linux, tglx, keescook

Commit-ID:  02bc7768fe447ae305e924b931fa629073a4a1b9
Gitweb:     http://git.kernel.org/tip/02bc7768fe447ae305e924b931fa629073a4a1b9
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 3 Jul 2015 12:44:31 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 7 Jul 2015 10:59:08 +0200

x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/60e90901eee611e59e958bfdbbe39969b4f88fe5.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/entry/entry_64.S        | 64 +++++++++++-----------------------------
 arch/x86/entry/entry_64_compat.S |  5 ++++
 2 files changed, 23 insertions(+), 46 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 83eb63d..168ee26 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -508,7 +508,16 @@ END(irq_entries_start)
 
 	testb	$3, CS(%rsp)
 	jz	1f
+
+	/*
+	 * IRQ from user mode.  Switch to kernel gsbase and inform context
+	 * tracking that we're in kernel mode.
+	 */
 	SWAPGS
+#ifdef CONFIG_CONTEXT_TRACKING
+	call enter_from_user_mode
+#endif
+
 1:
 	/*
 	 * Save previous stack pointer, optionally switch to interrupt stack.
@@ -547,26 +556,13 @@ ret_from_intr:
 
 	testb	$3, CS(%rsp)
 	jz	retint_kernel
-	/* Interrupt came from user space */
-GLOBAL(retint_user)
-	GET_THREAD_INFO(%rcx)
 
-	/* %rcx: thread info. Interrupts are off. */
-retint_with_reschedule:
-	movl	$_TIF_WORK_MASK, %edi
-retint_check:
+	/* Interrupt came from user space */
 	LOCKDEP_SYS_EXIT_IRQ
-	movl	TI_flags(%rcx), %edx
-	andl	%edi, %edx
-	jnz	retint_careful
-
-retint_swapgs:					/* return to user-space */
-	/*
-	 * The iretq could re-enable interrupts:
-	 */
-	DISABLE_INTERRUPTS(CLBR_ANY)
+GLOBAL(retint_user)
+	mov	%rsp,%rdi
+	call	prepare_exit_to_usermode
 	TRACE_IRQS_IRETQ
-
 	SWAPGS
 	jmp	restore_regs_and_iret
 
@@ -644,35 +640,6 @@ native_irq_return_ldt:
 	popq	%rax
 	jmp	native_irq_return_iret
 #endif
-
-	/* edi: workmask, edx: work */
-retint_careful:
-	bt	$TIF_NEED_RESCHED, %edx
-	jnc	retint_signal
-	TRACE_IRQS_ON
-	ENABLE_INTERRUPTS(CLBR_NONE)
-	pushq	%rdi
-	SCHEDULE_USER
-	popq	%rdi
-	GET_THREAD_INFO(%rcx)
-	DISABLE_INTERRUPTS(CLBR_NONE)
-	TRACE_IRQS_OFF
-	jmp	retint_check
-
-retint_signal:
-	testl	$_TIF_DO_NOTIFY_MASK, %edx
-	jz	retint_swapgs
-	TRACE_IRQS_ON
-	ENABLE_INTERRUPTS(CLBR_NONE)
-	movq	$-1, ORIG_RAX(%rsp)
-	xorl	%esi, %esi			/* oldset */
-	movq	%rsp, %rdi			/* &pt_regs */
-	call	do_notify_resume
-	DISABLE_INTERRUPTS(CLBR_NONE)
-	TRACE_IRQS_OFF
-	GET_THREAD_INFO(%rcx)
-	jmp	retint_with_reschedule
-
 END(common_interrupt)
 
 /*
@@ -1088,7 +1055,12 @@ ENTRY(error_entry)
 	SWAPGS
 
 .Lerror_entry_from_usermode_after_swapgs:
+#ifdef CONFIG_CONTEXT_TRACKING
+	call enter_from_user_mode
+#endif
+
 .Lerror_entry_done:
+
 	TRACE_IRQS_OFF
 	ret
 
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index d9bbd31..25aca51 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -458,6 +458,11 @@ ia32_badarg:
 	DISABLE_INTERRUPTS(CLBR_NONE)
 	TRACE_IRQS_OFF
 
+	/* Now finish entering normal kernel mode. */
+#ifdef CONFIG_CONTEXT_TRACKING
+	call enter_from_user_mode
+#endif
+
 	/* And exit again. */
 	jmp retint_user
 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [tip:x86/asm] x86/entry: Remove exception_enter() from most trap handlers
  2015-07-03 19:44 ` [PATCH v5 15/17] x86/entry: Remove exception_enter from most trap handlers Andy Lutomirski
@ 2015-07-07 10:53   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 70+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-07 10:53 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: fweisbec, keescook, luto, riel, oleg, brgerst, hpa, bp, mingo,
	peterz, torvalds, luto, tglx, dvlasenk, vda.linux, linux-kernel

Commit-ID:  8c84014f3bbb112d07e73f30a10ac8a3a72f8649
Gitweb:     http://git.kernel.org/tip/8c84014f3bbb112d07e73f30a10ac8a3a72f8649
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 3 Jul 2015 12:44:32 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 7 Jul 2015 10:59:09 +0200

x86/entry: Remove exception_enter() from most trap handlers

On 64-bit kernels, we don't need it any more: we handle context
tracking directly on entry from user mode and exit to user mode.

On 32-bit kernels, we don't support context tracking at all, so
these callbacks had no effect.

Note: this doesn't change do_page_fault().  Before we do that,
we need to make sure that there is no code that can page fault
from kernel mode with CONTEXT_USER.  The 32-bit fast system call
stack argument code is the only offender I'm aware of right now.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/ae22f4dfebd799c916574089964592be218151f9.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/traps.h         |  4 +-
 arch/x86/kernel/cpu/mcheck/mce.c     |  5 +--
 arch/x86/kernel/cpu/mcheck/p5.c      |  5 +--
 arch/x86/kernel/cpu/mcheck/winchip.c |  4 +-
 arch/x86/kernel/traps.c              | 78 +++++++++---------------------------
 5 files changed, 27 insertions(+), 69 deletions(-)

diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index c5380be..c3496619 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -112,8 +112,8 @@ asmlinkage void smp_threshold_interrupt(void);
 asmlinkage void smp_deferred_error_interrupt(void);
 #endif
 
-extern enum ctx_state ist_enter(struct pt_regs *regs);
-extern void ist_exit(struct pt_regs *regs, enum ctx_state prev_state);
+extern void ist_enter(struct pt_regs *regs);
+extern void ist_exit(struct pt_regs *regs);
 extern void ist_begin_non_atomic(struct pt_regs *regs);
 extern void ist_end_non_atomic(void);
 
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 96ccecc..99940d1 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1029,7 +1029,6 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 {
 	struct mca_config *cfg = &mca_cfg;
 	struct mce m, *final;
-	enum ctx_state prev_state;
 	int i;
 	int worst = 0;
 	int severity;
@@ -1055,7 +1054,7 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 	int flags = MF_ACTION_REQUIRED;
 	int lmce = 0;
 
-	prev_state = ist_enter(regs);
+	ist_enter(regs);
 
 	this_cpu_inc(mce_exception_count);
 
@@ -1227,7 +1226,7 @@ out:
 	local_irq_disable();
 	ist_end_non_atomic();
 done:
-	ist_exit(regs, prev_state);
+	ist_exit(regs);
 }
 EXPORT_SYMBOL_GPL(do_machine_check);
 
diff --git a/arch/x86/kernel/cpu/mcheck/p5.c b/arch/x86/kernel/cpu/mcheck/p5.c
index 737b0ad..12402e1 100644
--- a/arch/x86/kernel/cpu/mcheck/p5.c
+++ b/arch/x86/kernel/cpu/mcheck/p5.c
@@ -19,10 +19,9 @@ int mce_p5_enabled __read_mostly;
 /* Machine check handler for Pentium class Intel CPUs: */
 static void pentium_machine_check(struct pt_regs *regs, long error_code)
 {
-	enum ctx_state prev_state;
 	u32 loaddr, hi, lotype;
 
-	prev_state = ist_enter(regs);
+	ist_enter(regs);
 
 	rdmsr(MSR_IA32_P5_MC_ADDR, loaddr, hi);
 	rdmsr(MSR_IA32_P5_MC_TYPE, lotype, hi);
@@ -39,7 +38,7 @@ static void pentium_machine_check(struct pt_regs *regs, long error_code)
 
 	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
 
-	ist_exit(regs, prev_state);
+	ist_exit(regs);
 }
 
 /* Set up machine check reporting for processors with Intel style MCE: */
diff --git a/arch/x86/kernel/cpu/mcheck/winchip.c b/arch/x86/kernel/cpu/mcheck/winchip.c
index 44f1382..01dd870 100644
--- a/arch/x86/kernel/cpu/mcheck/winchip.c
+++ b/arch/x86/kernel/cpu/mcheck/winchip.c
@@ -15,12 +15,12 @@
 /* Machine check handler for WinChip C6: */
 static void winchip_machine_check(struct pt_regs *regs, long error_code)
 {
-	enum ctx_state prev_state = ist_enter(regs);
+	ist_enter(regs);
 
 	printk(KERN_EMERG "CPU0: Machine Check Exception.\n");
 	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
 
-	ist_exit(regs, prev_state);
+	ist_exit(regs);
 }
 
 /* Set up machine check reporting on the Winchip C6 series */
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 2a783c4..8e65d8a 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -108,13 +108,10 @@ static inline void preempt_conditional_cli(struct pt_regs *regs)
 	preempt_count_dec();
 }
 
-enum ctx_state ist_enter(struct pt_regs *regs)
+void ist_enter(struct pt_regs *regs)
 {
-	enum ctx_state prev_state;
-
 	if (user_mode(regs)) {
-		/* Other than that, we're just an exception. */
-		prev_state = exception_enter();
+		CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	} else {
 		/*
 		 * We might have interrupted pretty much anything.  In
@@ -123,32 +120,25 @@ enum ctx_state ist_enter(struct pt_regs *regs)
 		 * but we need to notify RCU.
 		 */
 		rcu_nmi_enter();
-		prev_state = CONTEXT_KERNEL;  /* the value is irrelevant. */
 	}
 
 	/*
-	 * We are atomic because we're on the IST stack (or we're on x86_32,
-	 * in which case we still shouldn't schedule).
-	 *
-	 * This must be after exception_enter(), because exception_enter()
-	 * won't do anything if in_interrupt() returns true.
+	 * We are atomic because we're on the IST stack; or we're on
+	 * x86_32, in which case we still shouldn't schedule; or we're
+	 * on x86_64 and entered from user mode, in which case we're
+	 * still atomic unless ist_begin_non_atomic is called.
 	 */
 	preempt_count_add(HARDIRQ_OFFSET);
 
 	/* This code is a bit fragile.  Test it. */
 	rcu_lockdep_assert(rcu_is_watching(), "ist_enter didn't work");
-
-	return prev_state;
 }
 
-void ist_exit(struct pt_regs *regs, enum ctx_state prev_state)
+void ist_exit(struct pt_regs *regs)
 {
-	/* Must be before exception_exit. */
 	preempt_count_sub(HARDIRQ_OFFSET);
 
-	if (user_mode(regs))
-		return exception_exit(prev_state);
-	else
+	if (!user_mode(regs))
 		rcu_nmi_exit();
 }
 
@@ -162,7 +152,7 @@ void ist_exit(struct pt_regs *regs, enum ctx_state prev_state)
  * a double fault, it can be safe to schedule.  ist_begin_non_atomic()
  * begins a non-atomic section within an ist_enter()/ist_exit() region.
  * Callers are responsible for enabling interrupts themselves inside
- * the non-atomic section, and callers must call is_end_non_atomic()
+ * the non-atomic section, and callers must call ist_end_non_atomic()
  * before ist_exit().
  */
 void ist_begin_non_atomic(struct pt_regs *regs)
@@ -289,7 +279,6 @@ NOKPROBE_SYMBOL(do_trap);
 static void do_error_trap(struct pt_regs *regs, long error_code, char *str,
 			  unsigned long trapnr, int signr)
 {
-	enum ctx_state prev_state = exception_enter();
 	siginfo_t info;
 
 	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
@@ -300,8 +289,6 @@ static void do_error_trap(struct pt_regs *regs, long error_code, char *str,
 		do_trap(trapnr, signr, str, regs, error_code,
 			fill_trap_info(regs, signr, trapnr, &info));
 	}
-
-	exception_exit(prev_state);
 }
 
 #define DO_ERROR(trapnr, signr, str, name)				\
@@ -353,7 +340,7 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code)
 	}
 #endif
 
-	ist_enter(regs);  /* Discard prev_state because we won't return. */
+	ist_enter(regs);
 	notify_die(DIE_TRAP, str, regs, error_code, X86_TRAP_DF, SIGSEGV);
 
 	tsk->thread.error_code = error_code;
@@ -373,15 +360,13 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code)
 
 dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
 {
-	enum ctx_state prev_state;
 	const struct bndcsr *bndcsr;
 	siginfo_t *info;
 
-	prev_state = exception_enter();
 	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	if (notify_die(DIE_TRAP, "bounds", regs, error_code,
 			X86_TRAP_BR, SIGSEGV) == NOTIFY_STOP)
-		goto exit;
+		return;
 	conditional_sti(regs);
 
 	if (!user_mode(regs))
@@ -438,9 +423,8 @@ dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
 		die("bounds", regs, error_code);
 	}
 
-exit:
-	exception_exit(prev_state);
 	return;
+
 exit_trap:
 	/*
 	 * This path out is for all the cases where we could not
@@ -450,36 +434,33 @@ exit_trap:
 	 * time..
 	 */
 	do_trap(X86_TRAP_BR, SIGSEGV, "bounds", regs, error_code, NULL);
-	exception_exit(prev_state);
 }
 
 dotraplinkage void
 do_general_protection(struct pt_regs *regs, long error_code)
 {
 	struct task_struct *tsk;
-	enum ctx_state prev_state;
 
-	prev_state = exception_enter();
 	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	conditional_sti(regs);
 
 	if (v8086_mode(regs)) {
 		local_irq_enable();
 		handle_vm86_fault((struct kernel_vm86_regs *) regs, error_code);
-		goto exit;
+		return;
 	}
 
 	tsk = current;
 	if (!user_mode(regs)) {
 		if (fixup_exception(regs))
-			goto exit;
+			return;
 
 		tsk->thread.error_code = error_code;
 		tsk->thread.trap_nr = X86_TRAP_GP;
 		if (notify_die(DIE_GPF, "general protection fault", regs, error_code,
 			       X86_TRAP_GP, SIGSEGV) != NOTIFY_STOP)
 			die("general protection fault", regs, error_code);
-		goto exit;
+		return;
 	}
 
 	tsk->thread.error_code = error_code;
@@ -495,16 +476,12 @@ do_general_protection(struct pt_regs *regs, long error_code)
 	}
 
 	force_sig_info(SIGSEGV, SEND_SIG_PRIV, tsk);
-exit:
-	exception_exit(prev_state);
 }
 NOKPROBE_SYMBOL(do_general_protection);
 
 /* May run on IST stack. */
 dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code)
 {
-	enum ctx_state prev_state;
-
 #ifdef CONFIG_DYNAMIC_FTRACE
 	/*
 	 * ftrace must be first, everything else may cause a recursive crash.
@@ -517,7 +494,7 @@ dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code)
 	if (poke_int3_handler(regs))
 		return;
 
-	prev_state = ist_enter(regs);
+	ist_enter(regs);
 	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 #ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
 	if (kgdb_ll_trap(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
@@ -544,7 +521,7 @@ dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code)
 	preempt_conditional_cli(regs);
 	debug_stack_usage_dec();
 exit:
-	ist_exit(regs, prev_state);
+	ist_exit(regs);
 }
 NOKPROBE_SYMBOL(do_int3);
 
@@ -620,12 +597,11 @@ NOKPROBE_SYMBOL(fixup_bad_iret);
 dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
 {
 	struct task_struct *tsk = current;
-	enum ctx_state prev_state;
 	int user_icebp = 0;
 	unsigned long dr6;
 	int si_code;
 
-	prev_state = ist_enter(regs);
+	ist_enter(regs);
 
 	get_debugreg(dr6, 6);
 
@@ -700,7 +676,7 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
 	debug_stack_usage_dec();
 
 exit:
-	ist_exit(regs, prev_state);
+	ist_exit(regs);
 }
 NOKPROBE_SYMBOL(do_debug);
 
@@ -752,23 +728,15 @@ static void math_error(struct pt_regs *regs, int error_code, int trapnr)
 
 dotraplinkage void do_coprocessor_error(struct pt_regs *regs, long error_code)
 {
-	enum ctx_state prev_state;
-
-	prev_state = exception_enter();
 	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	math_error(regs, error_code, X86_TRAP_MF);
-	exception_exit(prev_state);
 }
 
 dotraplinkage void
 do_simd_coprocessor_error(struct pt_regs *regs, long error_code)
 {
-	enum ctx_state prev_state;
-
-	prev_state = exception_enter();
 	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	math_error(regs, error_code, X86_TRAP_XF);
-	exception_exit(prev_state);
 }
 
 dotraplinkage void
@@ -780,9 +748,6 @@ do_spurious_interrupt_bug(struct pt_regs *regs, long error_code)
 dotraplinkage void
 do_device_not_available(struct pt_regs *regs, long error_code)
 {
-	enum ctx_state prev_state;
-
-	prev_state = exception_enter();
 	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	BUG_ON(use_eager_fpu());
 
@@ -794,7 +759,6 @@ do_device_not_available(struct pt_regs *regs, long error_code)
 
 		info.regs = regs;
 		math_emulate(&info);
-		exception_exit(prev_state);
 		return;
 	}
 #endif
@@ -802,7 +766,6 @@ do_device_not_available(struct pt_regs *regs, long error_code)
 #ifdef CONFIG_X86_32
 	conditional_sti(regs);
 #endif
-	exception_exit(prev_state);
 }
 NOKPROBE_SYMBOL(do_device_not_available);
 
@@ -810,9 +773,7 @@ NOKPROBE_SYMBOL(do_device_not_available);
 dotraplinkage void do_iret_error(struct pt_regs *regs, long error_code)
 {
 	siginfo_t info;
-	enum ctx_state prev_state;
 
-	prev_state = exception_enter();
 	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	local_irq_enable();
 
@@ -825,7 +786,6 @@ dotraplinkage void do_iret_error(struct pt_regs *regs, long error_code)
 		do_trap(X86_TRAP_IRET, SIGILL, "iret exception", regs, error_code,
 			&info);
 	}
-	exception_exit(prev_state);
 }
 #endif
 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [tip:x86/asm] x86/entry: Remove SCHEDULE_USER and asm/ context-tracking.h
  2015-07-03 19:44 ` [PATCH v5 16/17] x86/entry: Remove SCHEDULE_USER and asm/context-tracking.h Andy Lutomirski
@ 2015-07-07 10:54   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 70+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-07 10:54 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: oleg, vda.linux, tglx, brgerst, luto, bp, torvalds, luto, riel,
	hpa, keescook, mingo, linux-kernel, fweisbec, peterz, dvlasenk

Commit-ID:  06a7b36c7bd932e60997bedbae32b3d8e6722281
Gitweb:     http://git.kernel.org/tip/06a7b36c7bd932e60997bedbae32b3d8e6722281
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 3 Jul 2015 12:44:33 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 7 Jul 2015 10:59:09 +0200

x86/entry: Remove SCHEDULE_USER and asm/context-tracking.h

SCHEDULE_USER is no longer used, and asm/context-tracking.h
contained nothing else.  Remove the header entirely.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/854e9b45f69af20e26c47099eb236321563ebcee.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/entry/entry_64.S               |  1 -
 arch/x86/include/asm/context_tracking.h | 10 ----------
 2 files changed, 11 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 168ee26..041a37a 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -33,7 +33,6 @@
 #include <asm/paravirt.h>
 #include <asm/percpu.h>
 #include <asm/asm.h>
-#include <asm/context_tracking.h>
 #include <asm/smap.h>
 #include <asm/pgtable_types.h>
 #include <linux/err.h>
diff --git a/arch/x86/include/asm/context_tracking.h b/arch/x86/include/asm/context_tracking.h
deleted file mode 100644
index 1fe4970..0000000
--- a/arch/x86/include/asm/context_tracking.h
+++ /dev/null
@@ -1,10 +0,0 @@
-#ifndef _ASM_X86_CONTEXT_TRACKING_H
-#define _ASM_X86_CONTEXT_TRACKING_H
-
-#ifdef CONFIG_CONTEXT_TRACKING
-# define SCHEDULE_USER call schedule_user
-#else
-# define SCHEDULE_USER call schedule
-#endif
-
-#endif

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [tip:x86/asm] x86/irq, context_tracking: Document how IRQ context tracking works and add an RCU assertion
  2015-07-03 19:44 ` [PATCH v5 17/17] x86/irq: Document how IRQ context tracking works and add an assertion Andy Lutomirski
@ 2015-07-07 10:54   ` tip-bot for Andy Lutomirski
  2015-07-14 23:26     ` Frederic Weisbecker
  0 siblings, 1 reply; 70+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-07 10:54 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: luto, mingo, luto, keescook, fweisbec, torvalds, peterz, paulmck,
	oleg, linux-kernel, hpa, vda.linux, riel, bp, tglx, dvlasenk,
	brgerst

Commit-ID:  0333a209cbf600e980fc55c24878a56f25f48b65
Gitweb:     http://git.kernel.org/tip/0333a209cbf600e980fc55c24878a56f25f48b65
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 3 Jul 2015 12:44:34 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 7 Jul 2015 10:59:10 +0200

x86/irq, context_tracking: Document how IRQ context tracking works and add an RCU assertion

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/e8bdc4ed0193fb2fd130f3d6b7b8023e2ec1ab62.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/irq.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 88b36648..6233de0 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -216,8 +216,23 @@ __visible unsigned int __irq_entry do_IRQ(struct pt_regs *regs)
 	unsigned vector = ~regs->orig_ax;
 	unsigned irq;
 
+	/*
+	 * NB: Unlike exception entries, IRQ entries do not reliably
+	 * handle context tracking in the low-level entry code.  This is
+	 * because syscall entries execute briefly with IRQs on before
+	 * updating context tracking state, so we can take an IRQ from
+	 * kernel mode with CONTEXT_USER.  The low-level entry code only
+	 * updates the context if we came from user mode, so we won't
+	 * switch to CONTEXT_KERNEL.  We'll fix that once the syscall
+	 * code is cleaned up enough that we can cleanly defer enabling
+	 * IRQs.
+	 */
+
 	entering_irq();
 
+	/* entering_irq() tells RCU that we're not quiescent.  Check it. */
+	rcu_lockdep_assert(rcu_is_watching(), "IRQ failed to wake up RCU");
+
 	irq = __this_cpu_read(vector_irq[vector]);
 
 	if (!handle_irq(irq, regs)) {

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v5 00/17] x86: Rewrite exit-to-userspace code
  2015-07-03 19:44 [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Andy Lutomirski
                   ` (16 preceding siblings ...)
  2015-07-03 19:44 ` [PATCH v5 17/17] x86/irq: Document how IRQ context tracking works and add an assertion Andy Lutomirski
@ 2015-07-07 11:12 ` Ingo Molnar
  2015-07-07 16:03   ` Andy Lutomirski
  17 siblings, 1 reply; 70+ messages in thread
From: Ingo Molnar @ 2015-07-07 11:12 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: x86, linux-kernel, Frédéric Weisbecker, Rik van Riel,
	Oleg Nesterov, Denys Vlasenko, Borislav Petkov, Kees Cook,
	Brian Gerst, paulmck


[-- Attachment #1: Type: text/plain, Size: 1663 bytes --]


So this looks mostly problem free on my boxen, except this warning triggers:

Adding 3911820k swap on /dev/sda2.  Priority:-1 extents:1 across:3911820k 
capability: warning: `dbus-daemon' uses 32-bit capabilities (legacy support in use)
------------[ cut here ]------------
WARNING: CPU: 1 PID: 2445 at arch/x86/entry/common.c:311 syscall_return_slowpath+0x4c/0x270()
syscall 6 left IRQs disabled
Modules linked in:
CPU: 1 PID: 2445 Comm: distccd Not tainted 4.2.0-rc1-01597-gaecd781-dirty #18
 0000000000000000 00000000776afac2 ffff880035413e58 ffffffff81c8915f
 0000000000000000 ffff880035413eb0 ffff880035413e98 ffffffff810a8d82
 ffff880035413e78 ffff880035413f58 0000000020020002 ffff880035410000
Call Trace:
 [<ffffffff81c8915f>] dump_stack+0x4f/0x7b
 [<ffffffff810a8d82>] warn_slowpath_common+0xa2/0xc0
 [<ffffffff810a8df5>] warn_slowpath_fmt+0x55/0x70
 [<ffffffff81001ddc>] syscall_return_slowpath+0x4c/0x270
 [<ffffffff81c96471>] int_ret_from_sys_call+0x25/0x9f
---[ end trace 083efc734e089d37 ]---
device: 'vcs2': device_add
PM: Adding info for No Bus:vcs2
device: 'vcsa2': device_add

with ancient user-space, running the attached .config.

The system booted up fine otherwise. The warning corresponds to:

        if (WARN(irqs_disabled(), "syscall %ld left IRQs disabled",
                 regs->orig_ax))
                local_irq_enable();

and this was just the regular startup of the distccd daemon during bootup, nothing 
particularly fancy.

Note that 'distccd' is a 32-bit ELF binary - and this is a 64-bit kernel.

Syscall 6 would be:

arch/x86/entry/syscalls/syscall_32.tbl:6        i386    close                   sys_close

Thanks,

	Ingo

[-- Attachment #2: config --]
[-- Type: text/plain, Size: 120572 bytes --]

#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 4.2.0-rc1 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_MMU=y
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_X86_64_SMP=y
CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 -fcall-saved-r11"
CONFIG_BOOTPARAM_SUPPORT_NOT_WANTED=y
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y

#
# General setup
#
CONFIG_BROKEN_BOOT_ALLOWED4=y
# CONFIG_BROKEN_BOOT_ALLOWED3 is not set
# CONFIG_BROKEN_BOOT_DISALLOWED is not set
CONFIG_BROKEN_BOOT_EUROPE=y
CONFIG_BROKEN_BOOT_TITAN=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
CONFIG_COMPILE_TEST=y
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
# CONFIG_KERNEL_GZIP is not set
CONFIG_KERNEL_BZIP2=y
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_POSIX_MQUEUE is not set
# CONFIG_CROSS_MEMORY_ATTACH is not set
# CONFIG_FHANDLE is not set
# CONFIG_USELIB is not set
# CONFIG_AUDIT is not set
CONFIG_HAVE_ARCH_AUDITSYSCALL=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_GENERIC_IRQ_CHIP=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_IRQ_DOMAIN_DEBUG=y
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y

#
# CPU/Task time and stats accounting
#
# CONFIG_TICK_CPU_ACCOUNTING is not set
# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
CONFIG_IRQ_TIME_ACCOUNTING=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y

#
# RCU Subsystem
#
CONFIG_PREEMPT_RCU=y
CONFIG_RCU_EXPERT=y
CONFIG_SRCU=y
# CONFIG_TASKS_RCU is not set
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_FANOUT=64
CONFIG_RCU_FANOUT_LEAF=16
# CONFIG_RCU_FAST_NO_HZ is not set
CONFIG_TREE_RCU_TRACE=y
CONFIG_RCU_BOOST=y
CONFIG_RCU_KTHREAD_PRIO=1
CONFIG_RCU_BOOST_DELAY=500
# CONFIG_RCU_NOCB_CPU is not set
# CONFIG_RCU_EXPEDITE_BOOT is not set
CONFIG_BUILD_BIN2C=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=20
CONFIG_LOG_CPU_MAX_BUF_SHIFT=12
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
CONFIG_ARCH_SUPPORTS_INT128=y
CONFIG_CGROUPS=y
CONFIG_CGROUP_DEBUG=y
CONFIG_CGROUP_FREEZER=y
CONFIG_CGROUP_DEVICE=y
CONFIG_CPUSETS=y
CONFIG_PROC_PID_CPUSET=y
# CONFIG_CGROUP_CPUACCT is not set
CONFIG_PAGE_COUNTER=y
CONFIG_MEMCG=y
CONFIG_MEMCG_SWAP=y
# CONFIG_MEMCG_SWAP_ENABLED is not set
CONFIG_MEMCG_KMEM=y
# CONFIG_CGROUP_PERF is not set
CONFIG_CGROUP_SCHED=y
# CONFIG_FAIR_GROUP_SCHED is not set
# CONFIG_RT_GROUP_SCHED is not set
# CONFIG_BLK_CGROUP is not set
CONFIG_CHECKPOINT_RESTORE=y
# CONFIG_NAMESPACES is not set
# CONFIG_SCHED_AUTOGROUP is not set
CONFIG_SYSFS_DEPRECATED=y
CONFIG_SYSFS_DEPRECATED_V2=y
CONFIG_RELAY=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
# CONFIG_RD_LZMA is not set
# CONFIG_RD_XZ is not set
# CONFIG_RD_LZO is not set
CONFIG_RD_LZ4=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_ANON_INODES=y
CONFIG_HAVE_UID16=y
CONFIG_SYSCTL_EXCEPTION_TRACE=y
CONFIG_HAVE_PCSPKR_PLATFORM=y
CONFIG_BPF=y
CONFIG_EXPERT=y
# CONFIG_UID16 is not set
CONFIG_MULTIUSER=y
# CONFIG_SGETMASK_SYSCALL is not set
# CONFIG_SYSFS_SYSCALL is not set
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_PRINTK=y
CONFIG_BUG=y
# CONFIG_ELF_CORE is not set
CONFIG_PCSPKR_PLATFORM=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
# CONFIG_EPOLL is not set
# CONFIG_SIGNALFD is not set
# CONFIG_TIMERFD is not set
CONFIG_EVENTFD=y
CONFIG_BPF_SYSCALL=y
# CONFIG_SHMEM is not set
CONFIG_AIO=y
CONFIG_ADVISE_SYSCALLS=y
# CONFIG_PCI_QUIRKS is not set
CONFIG_EMBEDDED=y
CONFIG_HAVE_PERF_EVENTS=y

#
# Kernel Performance Events And Counters
#
CONFIG_PERF_EVENTS=y
# CONFIG_DEBUG_PERF_USE_VMALLOC is not set
CONFIG_VM_EVENT_COUNTERS=y
# CONFIG_COMPAT_BRK is not set
CONFIG_SLAB=y
# CONFIG_SLUB is not set
# CONFIG_SLOB is not set
CONFIG_SYSTEM_TRUSTED_KEYRING=y
# CONFIG_PROFILING is not set
CONFIG_TRACEPOINTS=y
CONFIG_HAVE_OPROFILE=y
CONFIG_OPROFILE_NMI_TIMER=y
# CONFIG_KPROBES is not set
CONFIG_JUMP_LABEL=y
CONFIG_UPROBES=y
# CONFIG_HAVE_64BIT_ALIGNED_ACCESS is not set
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_ARCH_USE_BUILTIN_BSWAP=y
CONFIG_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_IOREMAP_PROT=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_OPTPROBES=y
CONFIG_HAVE_KPROBES_ON_FTRACE=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_HAVE_DMA_ATTRS=y
CONFIG_HAVE_DMA_CONTIGUOUS=y
CONFIG_GENERIC_SMP_IDLE_THREAD=y
CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
CONFIG_HAVE_DMA_API_DEBUG=y
CONFIG_HAVE_HW_BREAKPOINT=y
CONFIG_HAVE_MIXED_BREAKPOINTS_REGS=y
CONFIG_HAVE_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_PERF_EVENTS_NMI=y
CONFIG_HAVE_PERF_REGS=y
CONFIG_HAVE_PERF_USER_STACK_DUMP=y
CONFIG_HAVE_ARCH_JUMP_LABEL=y
CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
CONFIG_HAVE_CMPXCHG_LOCAL=y
CONFIG_HAVE_CMPXCHG_DOUBLE=y
CONFIG_ARCH_WANT_COMPAT_IPC_PARSE_VERSION=y
CONFIG_ARCH_WANT_OLD_COMPAT_IPC=y
CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
CONFIG_SECCOMP_FILTER=y
CONFIG_HAVE_CC_STACKPROTECTOR=y
CONFIG_CC_STACKPROTECTOR=y
# CONFIG_CC_STACKPROTECTOR_NONE is not set
# CONFIG_CC_STACKPROTECTOR_REGULAR is not set
CONFIG_CC_STACKPROTECTOR_STRONG=y
CONFIG_HAVE_CONTEXT_TRACKING=y
CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_HUGE_VMAP=y
CONFIG_HAVE_ARCH_SOFT_DIRTY=y
CONFIG_MODULES_USE_ELF_RELA=y
CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK=y
CONFIG_ARCH_HAS_PGD_INIT_LATE=y
CONFIG_ARCH_HAS_ELF_RANDOMIZE=y
CONFIG_HAVE_COPY_THREAD_TLS=y
CONFIG_OLD_SIGSUSPEND3=y
CONFIG_COMPAT_OLD_SIGACTION=y

#
# GCOV-based kernel profiling
#
CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
# CONFIG_HAVE_GENERIC_DMA_COHERENT is not set
CONFIG_SLABINFO=y
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
# CONFIG_MODULE_FORCE_LOAD is not set
# CONFIG_MODULE_UNLOAD is not set
CONFIG_MODVERSIONS=y
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_MODULE_SIG=y
# CONFIG_MODULE_SIG_FORCE is not set
CONFIG_MODULE_SIG_ALL=y
# CONFIG_MODULE_SIG_SHA1 is not set
CONFIG_MODULE_SIG_SHA224=y
# CONFIG_MODULE_SIG_SHA256 is not set
# CONFIG_MODULE_SIG_SHA384 is not set
# CONFIG_MODULE_SIG_SHA512 is not set
CONFIG_MODULE_SIG_HASH="sha224"
CONFIG_MODULE_COMPRESS=y
CONFIG_MODULE_COMPRESS_GZIP=y
# CONFIG_MODULE_COMPRESS_XZ is not set
CONFIG_MODULES_TREE_LOOKUP=y
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
CONFIG_BLK_DEV_BSG=y
CONFIG_BLK_DEV_BSGLIB=y
CONFIG_BLK_DEV_INTEGRITY=y
CONFIG_BLK_CMDLINE_PARSER=y

#
# Partition Types
#
CONFIG_PARTITION_ADVANCED=y
CONFIG_ACORN_PARTITION=y
CONFIG_ACORN_PARTITION_CUMANA=y
CONFIG_ACORN_PARTITION_EESOX=y
CONFIG_ACORN_PARTITION_ICS=y
CONFIG_ACORN_PARTITION_ADFS=y
CONFIG_ACORN_PARTITION_POWERTEC=y
CONFIG_ACORN_PARTITION_RISCIX=y
# CONFIG_AIX_PARTITION is not set
CONFIG_OSF_PARTITION=y
# CONFIG_AMIGA_PARTITION is not set
# CONFIG_ATARI_PARTITION is not set
# CONFIG_MAC_PARTITION is not set
CONFIG_MSDOS_PARTITION=y
CONFIG_BSD_DISKLABEL=y
# CONFIG_MINIX_SUBPARTITION is not set
CONFIG_SOLARIS_X86_PARTITION=y
# CONFIG_UNIXWARE_DISKLABEL is not set
# CONFIG_LDM_PARTITION is not set
CONFIG_SGI_PARTITION=y
CONFIG_ULTRIX_PARTITION=y
# CONFIG_SUN_PARTITION is not set
# CONFIG_KARMA_PARTITION is not set
CONFIG_EFI_PARTITION=y
CONFIG_SYSV68_PARTITION=y
CONFIG_CMDLINE_PARTITION=y
CONFIG_BLOCK_COMPAT=y

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_DEADLINE=m
CONFIG_IOSCHED_CFQ=m
CONFIG_DEFAULT_NOOP=y
CONFIG_DEFAULT_IOSCHED="noop"
CONFIG_PREEMPT_NOTIFIERS=y
CONFIG_PADATA=y
CONFIG_ASN1=y
CONFIG_UNINLINE_SPIN_UNLOCK=y
CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y
CONFIG_RWSEM_SPIN_ON_OWNER=y
CONFIG_LOCK_SPIN_ON_OWNER=y
CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
CONFIG_QUEUED_SPINLOCKS=y
CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
CONFIG_QUEUED_RWLOCKS=y
CONFIG_FREEZER=y

#
# Processor type and features
#
# CONFIG_ZONE_DMA is not set
# CONFIG_SMP_SUPPORT is not set
CONFIG_X86_FEATURE_NAMES=y
CONFIG_X86_MPPARSE=y
# CONFIG_X86_EXTENDED_PLATFORM is not set
CONFIG_IOSF_MBI=m
CONFIG_IOSF_MBI_DEBUG=y
# CONFIG_SCHED_OMIT_FRAME_POINTER is not set
# CONFIG_KVMTOOL_TEST_ENABLE is not set
# CONFIG_HYPERVISOR_GUEST is not set
CONFIG_NO_BOOTMEM=y
# CONFIG_MK8 is not set
# CONFIG_MPSC is not set
# CONFIG_MCORE2 is not set
CONFIG_GENERIC_CPU=y
CONFIG_X86_INTERNODE_CACHE_SHIFT=6
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_TSC=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=64
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_PROCESSOR_SELECT=y
# CONFIG_CPU_SUP_INTEL is not set
CONFIG_CPU_SUP_AMD=y
CONFIG_CPU_SUP_CENTAUR=y
CONFIG_HPET_TIMER=y
# CONFIG_DMI is not set
CONFIG_GART_IOMMU=y
CONFIG_CALGARY_IOMMU=y
CONFIG_CALGARY_IOMMU_ENABLED_BY_DEFAULT=y
CONFIG_SWIOTLB=y
CONFIG_IOMMU_HELPER=y
# CONFIG_MAXSMP is not set
CONFIG_NR_CPUS=64
# CONFIG_SCHED_SMT is not set
# CONFIG_SCHED_MC is not set
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_COUNT=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS=y
# CONFIG_X86_MCE is not set
CONFIG_X86_16BIT=y
CONFIG_X86_ESPFIX64=y
CONFIG_X86_VSYSCALL_EMULATION=y
# CONFIG_I8K is not set
CONFIG_MICROCODE=y
CONFIG_MICROCODE_INTEL=y
# CONFIG_MICROCODE_AMD is not set
CONFIG_MICROCODE_OLD_INTERFACE=y
# CONFIG_MICROCODE_EARLY is not set
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set
# CONFIG_UP_WANTED_1 is not set
CONFIG_SMP=y
CONFIG_ARCH_PHYS_ADDR_T_64BIT=y
CONFIG_ARCH_DMA_ADDR_T_64BIT=y
CONFIG_NUMA=y
CONFIG_AMD_NUMA=y
# CONFIG_NUMA_EMU is not set
CONFIG_NODES_SHIFT=6
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_DEFAULT=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_SPARSEMEM_MANUAL=y
CONFIG_SPARSEMEM=y
CONFIG_NEED_MULTIPLE_NODES=y
CONFIG_HAVE_MEMORY_PRESENT=y
CONFIG_SPARSEMEM_EXTREME=y
CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER=y
# CONFIG_SPARSEMEM_VMEMMAP is not set
CONFIG_HAVE_MEMBLOCK=y
CONFIG_HAVE_MEMBLOCK_NODE_MAP=y
CONFIG_ARCH_DISCARD_MEMBLOCK=y
CONFIG_MOVABLE_NODE=y
# CONFIG_HAVE_BOOTMEM_INFO_NODE is not set
# CONFIG_MEMORY_HOTPLUG is not set
CONFIG_PAGEFLAGS_EXTENDED=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y
CONFIG_MEMORY_BALLOON=y
# CONFIG_COMPACTION is not set
# CONFIG_MIGRATION is not set
CONFIG_PHYS_ADDR_T_64BIT=y
CONFIG_ZONE_DMA_FLAG=0
CONFIG_NEED_BOUNCE_POOL=y
CONFIG_VIRT_TO_BUS=y
CONFIG_MMU_NOTIFIER=y
CONFIG_KSM=y
CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
# CONFIG_TRANSPARENT_HUGEPAGE is not set
# CONFIG_CLEANCACHE is not set
# CONFIG_FRONTSWAP is not set
# CONFIG_CMA is not set
# CONFIG_MEM_SOFT_DIRTY is not set
CONFIG_ZPOOL=y
# CONFIG_ZBUD is not set
CONFIG_ZSMALLOC=y
CONFIG_PGTABLE_MAPPING=y
CONFIG_ZSMALLOC_STAT=y
CONFIG_GENERIC_EARLY_IOREMAP=y
CONFIG_ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT=y
# CONFIG_X86_PMEM_LEGACY is not set
# CONFIG_X86_CHECK_BIOS_CORRUPTION is not set
CONFIG_X86_RESERVE_LOW=64
CONFIG_MTRR=y
# CONFIG_MTRR_SANITIZER is not set
CONFIG_X86_PAT=y
CONFIG_ARCH_USES_PG_UNCACHED=y
# CONFIG_ARCH_RANDOM is not set
CONFIG_X86_SMAP=y
CONFIG_SECCOMP=y
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
CONFIG_SCHED_HRTICK=y
# CONFIG_KEXEC is not set
# CONFIG_CRASH_DUMP is not set
CONFIG_PHYSICAL_START=0x1000000
CONFIG_RELOCATABLE=y
CONFIG_RANDOMIZE_BASE=y
CONFIG_RANDOMIZE_BASE_MAX_OFFSET=0x40000000
CONFIG_X86_NEED_RELOCS=y
CONFIG_PHYSICAL_ALIGN=0x200000
CONFIG_HOTPLUG_CPU=y
CONFIG_BOOTPARAM_HOTPLUG_CPU0=y
CONFIG_DEBUG_HOTPLUG_CPU0=y
# CONFIG_COMPAT_VDSO is not set
CONFIG_CMDLINE_BOOL=y
CONFIG_CMDLINE=""
CONFIG_HAVE_LIVEPATCH=y
CONFIG_LIVEPATCH=y
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
CONFIG_USE_PERCPU_NUMA_NODE_ID=y

#
# Power management and ACPI options
#
CONFIG_ARCH_HIBERNATION_HEADER=y
# CONFIG_SUSPEND is not set
CONFIG_HIBERNATE_CALLBACKS=y
CONFIG_HIBERNATION=y
CONFIG_PM_STD_PARTITION=""
CONFIG_PM_SLEEP=y
CONFIG_PM_SLEEP_SMP=y
# CONFIG_PM_AUTOSLEEP is not set
# CONFIG_PM_WAKELOCKS is not set
CONFIG_PM=y
CONFIG_PM_DEBUG=y
# CONFIG_PM_ADVANCED_DEBUG is not set
CONFIG_PM_SLEEP_DEBUG=y
CONFIG_DPM_WATCHDOG=y
CONFIG_DPM_WATCHDOG_TIMEOUT=60
# CONFIG_PM_TRACE_RTC is not set
CONFIG_WQ_POWER_EFFICIENT_DEFAULT=y
# CONFIG_ACPI is not set
CONFIG_SFI=y

#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_GOV_COMMON=y
CONFIG_CPU_FREQ_STAT=y
# CONFIG_CPU_FREQ_STAT_DETAILS is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE=y
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
# CONFIG_CPU_FREQ_GOV_POWERSAVE is not set
# CONFIG_CPU_FREQ_GOV_USERSPACE is not set
CONFIG_CPU_FREQ_GOV_ONDEMAND=m
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y

#
# CPU frequency scaling drivers
#
# CONFIG_X86_INTEL_PSTATE is not set
# CONFIG_X86_P4_CLOCKMOD is not set

#
# shared options
#
# CONFIG_X86_SPEEDSTEP_LIB is not set

#
# CPU Idle
#
# CONFIG_CPU_IDLE is not set
# CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set

#
# Memory power savings
#
# CONFIG_I7300_IDLE is not set

#
# Bus options (PCI etc.)
#
CONFIG_PCI=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_DOMAINS=y
# CONFIG_PCI_CNB20LE_QUIRK is not set
# CONFIG_PCIEPORTBUS is not set
CONFIG_PCI_BUS_ADDR_T_64BIT=y
# CONFIG_PCI_MSI is not set
CONFIG_PCI_DEBUG=y
# CONFIG_PCI_REALLOC_ENABLE_AUTO is not set
# CONFIG_PCI_STUB is not set
CONFIG_HT_IRQ=y
CONFIG_PCI_ATS=y
CONFIG_PCI_IOV=y
# CONFIG_PCI_PRI is not set
CONFIG_PCI_PASID=y

#
# PCI host controller drivers
#
CONFIG_ISA_DMA_API=y
CONFIG_AMD_NB=y
CONFIG_PCCARD=y
CONFIG_PCMCIA=y
CONFIG_PCMCIA_LOAD_CIS=y
CONFIG_CARDBUS=y

#
# PC-card bridges
#
# CONFIG_YENTA is not set
CONFIG_PD6729=y
CONFIG_I82092=y
CONFIG_PCCARD_NONSTATIC=y
CONFIG_HOTPLUG_PCI=y
CONFIG_HOTPLUG_PCI_CPCI=y
# CONFIG_HOTPLUG_PCI_CPCI_ZT5550 is not set
CONFIG_HOTPLUG_PCI_CPCI_GENERIC=y
CONFIG_HOTPLUG_PCI_SHPC=y
CONFIG_RAPIDIO=y
CONFIG_RAPIDIO_DISC_TIMEOUT=30
# CONFIG_RAPIDIO_ENABLE_RX_TX_PORTS is not set
CONFIG_RAPIDIO_DMA_ENGINE=y
# CONFIG_RAPIDIO_DEBUG is not set
CONFIG_RAPIDIO_ENUM_BASIC=m

#
# RapidIO Switch drivers
#
CONFIG_RAPIDIO_TSI57X=y
CONFIG_RAPIDIO_CPS_XX=y
CONFIG_RAPIDIO_TSI568=y
# CONFIG_RAPIDIO_CPS_GEN2 is not set
# CONFIG_X86_SYSFB is not set

#
# Executable file formats / Emulations
#
CONFIG_BINFMT_ELF=y
CONFIG_COMPAT_BINFMT_ELF=y
# CONFIG_BINFMT_SCRIPT is not set
# CONFIG_HAVE_AOUT is not set
CONFIG_BINFMT_MISC=m
CONFIG_COREDUMP=y
CONFIG_IA32_EMULATION=y
CONFIG_IA32_AOUT=m
# CONFIG_X86_X32 is not set
CONFIG_COMPAT=y
CONFIG_COMPAT_FOR_U64_ALIGNMENT=y
CONFIG_SYSVIPC_COMPAT=y
CONFIG_KEYS_COMPAT=y
CONFIG_X86_DEV_DMA_OPS=y
CONFIG_PMC_ATOM=y
CONFIG_NET=y
CONFIG_NET_INGRESS=y

#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_DIAG=m
CONFIG_UNIX=y
# CONFIG_UNIX_DIAG is not set
CONFIG_XFRM=y
CONFIG_XFRM_ALGO=m
# CONFIG_XFRM_USER is not set
CONFIG_XFRM_SUB_POLICY=y
CONFIG_XFRM_MIGRATE=y
CONFIG_XFRM_STATISTICS=y
# CONFIG_NET_KEY is not set
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_IP_FIB_TRIE_STATS=y
# CONFIG_IP_MULTIPLE_TABLES is not set
CONFIG_IP_ROUTE_MULTIPATH=y
# CONFIG_IP_ROUTE_VERBOSE is not set
CONFIG_IP_ROUTE_CLASSID=y
# CONFIG_IP_PNP is not set
CONFIG_NET_IPIP=m
CONFIG_NET_IPGRE_DEMUX=y
CONFIG_NET_IP_TUNNEL=y
CONFIG_NET_IPGRE=m
# CONFIG_NET_IPGRE_BROADCAST is not set
CONFIG_IP_MROUTE=y
CONFIG_IP_MROUTE_MULTIPLE_TABLES=y
CONFIG_IP_PIMSM_V1=y
# CONFIG_IP_PIMSM_V2 is not set
# CONFIG_SYN_COOKIES is not set
CONFIG_NET_IPVTI=m
CONFIG_NET_UDP_TUNNEL=y
CONFIG_NET_FOU=m
CONFIG_NET_FOU_IP_TUNNELS=y
CONFIG_GENEVE_CORE=y
CONFIG_INET_AH=m
# CONFIG_INET_ESP is not set
# CONFIG_INET_IPCOMP is not set
# CONFIG_INET_XFRM_TUNNEL is not set
CONFIG_INET_TUNNEL=m
# CONFIG_INET_XFRM_MODE_TRANSPORT is not set
CONFIG_INET_XFRM_MODE_TUNNEL=m
# CONFIG_INET_XFRM_MODE_BEET is not set
CONFIG_INET_LRO=m
CONFIG_INET_DIAG=m
CONFIG_INET_TCP_DIAG=m
CONFIG_INET_UDP_DIAG=m
# CONFIG_TCP_CONG_ADVANCED is not set
CONFIG_TCP_CONG_CUBIC=y
CONFIG_DEFAULT_TCP_CONG="cubic"
# CONFIG_TCP_MD5SIG is not set
# CONFIG_NETLABEL is not set
CONFIG_NETWORK_SECMARK=y
CONFIG_NET_PTP_CLASSIFY=y
# CONFIG_NETWORK_PHY_TIMESTAMPING is not set
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
CONFIG_NETFILTER_ADVANCED=y
CONFIG_BRIDGE_NETFILTER=m

#
# Core Netfilter Configuration
#
CONFIG_NETFILTER_INGRESS=y
CONFIG_NETFILTER_NETLINK=y
CONFIG_NETFILTER_NETLINK_ACCT=y
CONFIG_NETFILTER_NETLINK_QUEUE=y
CONFIG_NETFILTER_NETLINK_LOG=y
# CONFIG_NF_CONNTRACK is not set
CONFIG_NF_LOG_COMMON=y
CONFIG_NF_TABLES=y
CONFIG_NF_TABLES_NETDEV=y
CONFIG_NFT_EXTHDR=m
CONFIG_NFT_META=y
# CONFIG_NFT_RBTREE is not set
CONFIG_NFT_HASH=y
CONFIG_NFT_COUNTER=m
# CONFIG_NFT_LOG is not set
CONFIG_NFT_LIMIT=m
CONFIG_NFT_QUEUE=y
CONFIG_NFT_REJECT=y
CONFIG_NFT_COMPAT=m
CONFIG_NETFILTER_XTABLES=y

#
# Xtables combined modules
#
CONFIG_NETFILTER_XT_MARK=y

#
# Xtables targets
#
# CONFIG_NETFILTER_XT_TARGET_CHECKSUM is not set
CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m
CONFIG_NETFILTER_XT_TARGET_DSCP=m
CONFIG_NETFILTER_XT_TARGET_HL=m
CONFIG_NETFILTER_XT_TARGET_HMARK=y
CONFIG_NETFILTER_XT_TARGET_IDLETIMER=y
CONFIG_NETFILTER_XT_TARGET_LOG=y
# CONFIG_NETFILTER_XT_TARGET_MARK is not set
CONFIG_NETFILTER_XT_TARGET_NFLOG=m
# CONFIG_NETFILTER_XT_TARGET_NFQUEUE is not set
CONFIG_NETFILTER_XT_TARGET_RATEEST=y
# CONFIG_NETFILTER_XT_TARGET_TEE is not set
CONFIG_NETFILTER_XT_TARGET_TPROXY=m
# CONFIG_NETFILTER_XT_TARGET_TRACE is not set
# CONFIG_NETFILTER_XT_TARGET_SECMARK is not set
# CONFIG_NETFILTER_XT_TARGET_TCPMSS is not set
CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP=m

#
# Xtables matches
#
CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=m
CONFIG_NETFILTER_XT_MATCH_BPF=m
CONFIG_NETFILTER_XT_MATCH_CGROUP=y
CONFIG_NETFILTER_XT_MATCH_COMMENT=y
# CONFIG_NETFILTER_XT_MATCH_CPU is not set
CONFIG_NETFILTER_XT_MATCH_DCCP=m
# CONFIG_NETFILTER_XT_MATCH_DEVGROUP is not set
CONFIG_NETFILTER_XT_MATCH_DSCP=m
CONFIG_NETFILTER_XT_MATCH_ECN=m
CONFIG_NETFILTER_XT_MATCH_ESP=y
CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m
# CONFIG_NETFILTER_XT_MATCH_HL is not set
CONFIG_NETFILTER_XT_MATCH_IPCOMP=y
CONFIG_NETFILTER_XT_MATCH_IPRANGE=m
# CONFIG_NETFILTER_XT_MATCH_L2TP is not set
CONFIG_NETFILTER_XT_MATCH_LENGTH=m
CONFIG_NETFILTER_XT_MATCH_LIMIT=m
CONFIG_NETFILTER_XT_MATCH_MAC=m
# CONFIG_NETFILTER_XT_MATCH_MARK is not set
# CONFIG_NETFILTER_XT_MATCH_MULTIPORT is not set
CONFIG_NETFILTER_XT_MATCH_NFACCT=m
CONFIG_NETFILTER_XT_MATCH_OSF=y
# CONFIG_NETFILTER_XT_MATCH_OWNER is not set
CONFIG_NETFILTER_XT_MATCH_POLICY=y
# CONFIG_NETFILTER_XT_MATCH_PHYSDEV is not set
CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m
# CONFIG_NETFILTER_XT_MATCH_QUOTA is not set
CONFIG_NETFILTER_XT_MATCH_RATEEST=y
CONFIG_NETFILTER_XT_MATCH_REALM=m
CONFIG_NETFILTER_XT_MATCH_RECENT=y
CONFIG_NETFILTER_XT_MATCH_SCTP=y
CONFIG_NETFILTER_XT_MATCH_SOCKET=m
CONFIG_NETFILTER_XT_MATCH_STATISTIC=m
CONFIG_NETFILTER_XT_MATCH_STRING=y
CONFIG_NETFILTER_XT_MATCH_TCPMSS=y
CONFIG_NETFILTER_XT_MATCH_TIME=y
# CONFIG_NETFILTER_XT_MATCH_U32 is not set
# CONFIG_IP_SET is not set
CONFIG_IP_VS=y
CONFIG_IP_VS_DEBUG=y
CONFIG_IP_VS_TAB_BITS=12

#
# IPVS transport protocol load balancing support
#
# CONFIG_IP_VS_PROTO_TCP is not set
# CONFIG_IP_VS_PROTO_UDP is not set
CONFIG_IP_VS_PROTO_AH_ESP=y
# CONFIG_IP_VS_PROTO_ESP is not set
CONFIG_IP_VS_PROTO_AH=y
CONFIG_IP_VS_PROTO_SCTP=y

#
# IPVS scheduler
#
CONFIG_IP_VS_RR=y
CONFIG_IP_VS_WRR=m
CONFIG_IP_VS_LC=y
CONFIG_IP_VS_WLC=y
# CONFIG_IP_VS_FO is not set
CONFIG_IP_VS_LBLC=m
CONFIG_IP_VS_LBLCR=y
CONFIG_IP_VS_DH=m
CONFIG_IP_VS_SH=m
# CONFIG_IP_VS_SED is not set
CONFIG_IP_VS_NQ=y

#
# IPVS SH scheduler
#
CONFIG_IP_VS_SH_TAB_BITS=8

#
# IPVS application helper
#

#
# IP: Netfilter Configuration
#
CONFIG_NF_DEFRAG_IPV4=m
CONFIG_NF_TABLES_IPV4=m
CONFIG_NFT_CHAIN_ROUTE_IPV4=m
CONFIG_NFT_REJECT_IPV4=m
CONFIG_NF_TABLES_ARP=m
CONFIG_NF_LOG_ARP=y
CONFIG_NF_LOG_IPV4=y
CONFIG_NF_REJECT_IPV4=m
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_AH=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_RPFILTER=m
# CONFIG_IP_NF_MATCH_TTL is not set
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_ECN=m
# CONFIG_IP_NF_TARGET_TTL is not set
CONFIG_IP_NF_RAW=m
CONFIG_IP_NF_SECURITY=m
CONFIG_IP_NF_ARPTABLES=y
CONFIG_IP_NF_ARPFILTER=m
CONFIG_IP_NF_ARP_MANGLE=m

#
# DECnet: Netfilter Configuration
#
# CONFIG_DECNET_NF_GRABULATOR is not set
CONFIG_NF_TABLES_BRIDGE=m
CONFIG_NFT_BRIDGE_META=m
CONFIG_NF_LOG_BRIDGE=m
CONFIG_BRIDGE_NF_EBTABLES=m
CONFIG_BRIDGE_EBT_BROUTE=m
# CONFIG_BRIDGE_EBT_T_FILTER is not set
CONFIG_BRIDGE_EBT_T_NAT=m
CONFIG_BRIDGE_EBT_802_3=m
CONFIG_BRIDGE_EBT_AMONG=m
CONFIG_BRIDGE_EBT_ARP=m
CONFIG_BRIDGE_EBT_IP=m
# CONFIG_BRIDGE_EBT_LIMIT is not set
CONFIG_BRIDGE_EBT_MARK=m
CONFIG_BRIDGE_EBT_PKTTYPE=m
# CONFIG_BRIDGE_EBT_STP is not set
CONFIG_BRIDGE_EBT_VLAN=m
CONFIG_BRIDGE_EBT_ARPREPLY=m
CONFIG_BRIDGE_EBT_DNAT=m
CONFIG_BRIDGE_EBT_MARK_T=m
# CONFIG_BRIDGE_EBT_REDIRECT is not set
# CONFIG_BRIDGE_EBT_SNAT is not set
# CONFIG_BRIDGE_EBT_LOG is not set
# CONFIG_BRIDGE_EBT_NFLOG is not set
CONFIG_IP_DCCP=y
CONFIG_INET_DCCP_DIAG=m

#
# DCCP CCIDs Configuration
#
# CONFIG_IP_DCCP_CCID2_DEBUG is not set
# CONFIG_IP_DCCP_CCID3 is not set

#
# DCCP Kernel Hacking
#
CONFIG_IP_DCCP_DEBUG=y
CONFIG_IP_SCTP=m
CONFIG_SCTP_DBG_OBJCNT=y
# CONFIG_SCTP_DEFAULT_COOKIE_HMAC_MD5 is not set
# CONFIG_SCTP_DEFAULT_COOKIE_HMAC_SHA1 is not set
CONFIG_SCTP_DEFAULT_COOKIE_HMAC_NONE=y
# CONFIG_SCTP_COOKIE_HMAC_MD5 is not set
CONFIG_SCTP_COOKIE_HMAC_SHA1=y
# CONFIG_TIPC is not set
CONFIG_ATM=m
# CONFIG_ATM_CLIP is not set
CONFIG_ATM_LANE=m
CONFIG_ATM_MPOA=m
# CONFIG_ATM_BR2684 is not set
# CONFIG_L2TP is not set
CONFIG_STP=m
CONFIG_BRIDGE=m
CONFIG_BRIDGE_IGMP_SNOOPING=y
CONFIG_HAVE_NET_DSA=y
# CONFIG_VLAN_8021Q is not set
CONFIG_DECNET=m
CONFIG_DECNET_ROUTER=y
CONFIG_LLC=y
CONFIG_LLC2=m
CONFIG_IPX=y
CONFIG_IPX_INTERN=y
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
CONFIG_PHONET=y
CONFIG_IEEE802154=m
# CONFIG_IEEE802154_SOCKET is not set
CONFIG_MAC802154=m
CONFIG_NET_SCHED=y

#
# Queueing/Scheduling
#
CONFIG_NET_SCH_CBQ=m
# CONFIG_NET_SCH_HTB is not set
CONFIG_NET_SCH_HFSC=y
# CONFIG_NET_SCH_ATM is not set
CONFIG_NET_SCH_PRIO=m
CONFIG_NET_SCH_MULTIQ=y
CONFIG_NET_SCH_RED=m
CONFIG_NET_SCH_SFB=y
CONFIG_NET_SCH_SFQ=y
CONFIG_NET_SCH_TEQL=y
CONFIG_NET_SCH_TBF=y
CONFIG_NET_SCH_GRED=m
# CONFIG_NET_SCH_DSMARK is not set
CONFIG_NET_SCH_NETEM=y
CONFIG_NET_SCH_DRR=m
# CONFIG_NET_SCH_MQPRIO is not set
CONFIG_NET_SCH_CHOKE=m
CONFIG_NET_SCH_QFQ=m
CONFIG_NET_SCH_CODEL=m
# CONFIG_NET_SCH_FQ_CODEL is not set
CONFIG_NET_SCH_FQ=y
# CONFIG_NET_SCH_HHF is not set
CONFIG_NET_SCH_PIE=y
CONFIG_NET_SCH_PLUG=m

#
# Classification
#
CONFIG_NET_CLS=y
CONFIG_NET_CLS_BASIC=y
# CONFIG_NET_CLS_TCINDEX is not set
CONFIG_NET_CLS_ROUTE4=y
CONFIG_NET_CLS_FW=y
CONFIG_NET_CLS_U32=y
# CONFIG_CLS_U32_PERF is not set
# CONFIG_CLS_U32_MARK is not set
CONFIG_NET_CLS_RSVP=m
CONFIG_NET_CLS_RSVP6=m
CONFIG_NET_CLS_FLOW=m
CONFIG_NET_CLS_CGROUP=m
CONFIG_NET_CLS_BPF=m
# CONFIG_NET_CLS_FLOWER is not set
# CONFIG_NET_EMATCH is not set
# CONFIG_NET_CLS_ACT is not set
CONFIG_NET_CLS_IND=y
CONFIG_NET_SCH_FIFO=y
CONFIG_DCB=y
# CONFIG_DNS_RESOLVER is not set
# CONFIG_BATMAN_ADV is not set
# CONFIG_OPENVSWITCH is not set
# CONFIG_VSOCKETS is not set
CONFIG_NETLINK_MMAP=y
CONFIG_NETLINK_DIAG=m
CONFIG_MPLS=y
CONFIG_NET_MPLS_GSO=m
CONFIG_MPLS_ROUTING=m
# CONFIG_HSR is not set
# CONFIG_NET_SWITCHDEV is not set
CONFIG_RPS=y
CONFIG_RFS_ACCEL=y
CONFIG_XPS=y
# CONFIG_CGROUP_NET_PRIO is not set
CONFIG_CGROUP_NET_CLASSID=y
CONFIG_NET_RX_BUSY_POLL=y
CONFIG_BQL=y
# CONFIG_BPF_JIT is not set
CONFIG_NET_FLOW_LIMIT=y

#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
CONFIG_NET_DROP_MONITOR=y
CONFIG_HAMRADIO=y

#
# Packet Radio protocols
#
# CONFIG_AX25 is not set
CONFIG_CAN=y
CONFIG_CAN_RAW=y
# CONFIG_CAN_BCM is not set
CONFIG_CAN_GW=y

#
# CAN Device Drivers
#
# CONFIG_CAN_VCAN is not set
# CONFIG_CAN_SLCAN is not set
CONFIG_CAN_DEV=m
CONFIG_CAN_CALC_BITTIMING=y
# CONFIG_CAN_LEDS is not set
# CONFIG_CAN_AT91 is not set
# CONFIG_CAN_JANZ_ICAN3 is not set
CONFIG_PCH_CAN=m
# CONFIG_CAN_SJA1000 is not set
CONFIG_CAN_C_CAN=m
# CONFIG_CAN_C_CAN_PLATFORM is not set
# CONFIG_CAN_C_CAN_PCI is not set
# CONFIG_CAN_M_CAN is not set
CONFIG_CAN_CC770=m
# CONFIG_CAN_CC770_ISA is not set
CONFIG_CAN_CC770_PLATFORM=m

#
# CAN SPI interfaces
#
CONFIG_CAN_MCP251X=m

#
# CAN USB interfaces
#
# CONFIG_CAN_EMS_USB is not set
CONFIG_CAN_ESD_USB2=m
# CONFIG_CAN_GS_USB is not set
CONFIG_CAN_KVASER_USB=m
CONFIG_CAN_PEAK_USB=m
# CONFIG_CAN_8DEV_USB is not set
CONFIG_CAN_SOFTING=m
CONFIG_CAN_SOFTING_CS=m
# CONFIG_CAN_DEBUG_DEVICES is not set
CONFIG_IRDA=m

#
# IrDA protocols
#
CONFIG_IRLAN=m
CONFIG_IRNET=m
# CONFIG_IRCOMM is not set
CONFIG_IRDA_ULTRA=y

#
# IrDA options
#
CONFIG_IRDA_CACHE_LAST_LSAP=y
CONFIG_IRDA_FAST_RR=y
# CONFIG_IRDA_DEBUG is not set

#
# Infrared-port device drivers
#

#
# SIR device drivers
#
CONFIG_IRTTY_SIR=m

#
# Dongle support
#
CONFIG_DONGLE=y
CONFIG_ESI_DONGLE=m
CONFIG_ACTISYS_DONGLE=m
CONFIG_TEKRAM_DONGLE=m
# CONFIG_TOIM3232_DONGLE is not set
CONFIG_LITELINK_DONGLE=m
# CONFIG_MA600_DONGLE is not set
CONFIG_GIRBIL_DONGLE=m
CONFIG_MCP2120_DONGLE=m
# CONFIG_OLD_BELKIN_DONGLE is not set
CONFIG_ACT200L_DONGLE=m
CONFIG_KINGSUN_DONGLE=m
CONFIG_KSDAZZLE_DONGLE=m
CONFIG_KS959_DONGLE=m

#
# FIR device drivers
#
# CONFIG_USB_IRDA is not set
CONFIG_SIGMATEL_FIR=m
CONFIG_NSC_FIR=m
# CONFIG_WINBOND_FIR is not set
# CONFIG_SMC_IRCC_FIR is not set
# CONFIG_ALI_FIR is not set
CONFIG_VLSI_FIR=m
CONFIG_VIA_FIR=m
CONFIG_MCS_FIR=m
# CONFIG_SH_IRDA is not set
# CONFIG_BT is not set
CONFIG_AF_RXRPC=m
# CONFIG_AF_RXRPC_DEBUG is not set
# CONFIG_RXKAD is not set
CONFIG_FIB_RULES=y
# CONFIG_WIRELESS is not set
# CONFIG_WIMAX is not set
CONFIG_RFKILL=y
# CONFIG_RFKILL_INPUT is not set
CONFIG_RFKILL_REGULATOR=y
CONFIG_RFKILL_GPIO=y
CONFIG_NET_9P=y
# CONFIG_NET_9P_VIRTIO is not set
CONFIG_NET_9P_RDMA=y
CONFIG_NET_9P_DEBUG=y
CONFIG_CAIF=m
# CONFIG_CAIF_DEBUG is not set
CONFIG_CAIF_NETDEV=m
CONFIG_CAIF_USB=m
CONFIG_CEPH_LIB=m
CONFIG_CEPH_LIB_PRETTYDEBUG=y
# CONFIG_CEPH_LIB_USE_DNS_RESOLVER is not set
CONFIG_NFC=m
CONFIG_NFC_DIGITAL=m
CONFIG_NFC_NCI=m
CONFIG_NFC_NCI_SPI=y
CONFIG_NFC_NCI_UART=m
# CONFIG_NFC_HCI is not set

#
# Near Field Communication (NFC) devices
#
CONFIG_NFC_PN533=m
# CONFIG_NFC_TRF7970A is not set
CONFIG_NFC_SIM=m
CONFIG_NFC_PORT100=m
# CONFIG_NFC_MRVL is not set
# CONFIG_NFC_ST_NCI is not set
CONFIG_NFC_NXP_NCI=m
CONFIG_NFC_NXP_NCI_I2C=m
CONFIG_HAVE_BPF_JIT=y

#
# Device Drivers
#

#
# Generic Driver Options
#
CONFIG_UEVENT_HELPER=y
CONFIG_UEVENT_HELPER_PATH=""
# CONFIG_DEVTMPFS is not set
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
# CONFIG_FIRMWARE_IN_KERNEL is not set
CONFIG_EXTRA_FIRMWARE=""
CONFIG_FW_LOADER_USER_HELPER=y
CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y
# CONFIG_ALLOW_DEV_COREDUMP is not set
CONFIG_DEBUG_DRIVER=y
# CONFIG_DEBUG_DEVRES is not set
# CONFIG_SYS_HYPERVISOR is not set
# CONFIG_GENERIC_CPU_DEVICES is not set
CONFIG_GENERIC_CPU_AUTOPROBE=y
CONFIG_REGMAP=y
CONFIG_REGMAP_I2C=y
CONFIG_REGMAP_SPI=y
CONFIG_REGMAP_MMIO=y
CONFIG_REGMAP_IRQ=y
CONFIG_DMA_SHARED_BUFFER=y
# CONFIG_FENCE_TRACE is not set

#
# Bus devices
#
CONFIG_CONNECTOR=y
CONFIG_PROC_EVENTS=y
# CONFIG_OF is not set
CONFIG_ARCH_MIGHT_HAVE_PC_PARPORT=y
# CONFIG_PARPORT is not set
CONFIG_BLK_DEV=y
CONFIG_BLK_DEV_NULL_BLK=y
# CONFIG_BLK_DEV_FD is not set
CONFIG_BLK_DEV_PCIESSD_MTIP32XX=m
CONFIG_ZRAM=m
CONFIG_ZRAM_LZ4_COMPRESS=y
# CONFIG_BLK_CPQ_CISS_DA is not set
CONFIG_BLK_DEV_DAC960=y
CONFIG_BLK_DEV_UMEM=y
# CONFIG_BLK_DEV_COW_COMMON is not set
# CONFIG_BLK_DEV_LOOP is not set
CONFIG_BLK_DEV_NBD=m
# CONFIG_BLK_DEV_SKD is not set
CONFIG_BLK_DEV_SX8=m
# CONFIG_BLK_DEV_RAM is not set
CONFIG_CDROM_PKTCDVD=m
CONFIG_CDROM_PKTCDVD_BUFFERS=8
CONFIG_CDROM_PKTCDVD_WCACHE=y
CONFIG_ATA_OVER_ETH=m
CONFIG_VIRTIO_BLK=y
CONFIG_BLK_DEV_HD=y
CONFIG_BLK_DEV_RBD=m
# CONFIG_BLK_DEV_RSXX is not set

#
# Misc devices
#
CONFIG_SENSORS_LIS3LV02D=y
# CONFIG_AD525X_DPOT is not set
CONFIG_DUMMY_IRQ=y
CONFIG_IBM_ASM=m
CONFIG_PHANTOM=m
# CONFIG_INTEL_MID_PTI is not set
# CONFIG_SGI_IOC4 is not set
CONFIG_TIFM_CORE=y
CONFIG_TIFM_7XX1=m
CONFIG_ICS932S401=y
# CONFIG_ATMEL_SSC is not set
CONFIG_ENCLOSURE_SERVICES=y
# CONFIG_HP_ILO is not set
CONFIG_APDS9802ALS=y
# CONFIG_ISL29003 is not set
CONFIG_ISL29020=y
CONFIG_SENSORS_TSL2550=y
# CONFIG_SENSORS_BH1780 is not set
CONFIG_SENSORS_BH1770=m
CONFIG_SENSORS_APDS990X=y
# CONFIG_HMC6352 is not set
# CONFIG_DS1682 is not set
# CONFIG_TI_DAC7512 is not set
CONFIG_BMP085=y
CONFIG_BMP085_I2C=m
CONFIG_BMP085_SPI=m
CONFIG_PCH_PHUB=y
CONFIG_USB_SWITCH_FSA9480=m
# CONFIG_LATTICE_ECP3_CONFIG is not set
CONFIG_SRAM=y
CONFIG_C2PORT=y
CONFIG_C2PORT_DURAMAR_2150=y

#
# EEPROM support
#
# CONFIG_EEPROM_AT24 is not set
CONFIG_EEPROM_AT25=y
# CONFIG_EEPROM_LEGACY is not set
# CONFIG_EEPROM_MAX6875 is not set
CONFIG_EEPROM_93CX6=m
# CONFIG_EEPROM_93XX46 is not set
CONFIG_CB710_CORE=y
CONFIG_CB710_DEBUG=y
CONFIG_CB710_DEBUG_ASSUMPTIONS=y

#
# Texas Instruments shared transport line discipline
#
# CONFIG_TI_ST is not set
# CONFIG_SENSORS_LIS3_SPI is not set
CONFIG_SENSORS_LIS3_I2C=y

#
# Altera FPGA firmware download module
#
CONFIG_ALTERA_STAPL=y
CONFIG_VMWARE_VMCI=m

#
# Intel MIC Bus Driver
#
CONFIG_INTEL_MIC_BUS=m

#
# SCIF Bus Driver
#
CONFIG_SCIF_BUS=m

#
# Intel MIC Host Driver
#
CONFIG_INTEL_MIC_HOST=m

#
# Intel MIC Card Driver
#
CONFIG_INTEL_MIC_CARD=m

#
# SCIF Driver
#
CONFIG_SCIF=m
CONFIG_GENWQE=m
CONFIG_GENWQE_PLATFORM_ERROR_RECOVERY=0
CONFIG_ECHO=y
# CONFIG_CXL_BASE is not set
# CONFIG_CXL_KERNEL_API is not set
CONFIG_HAVE_IDE=y

#
# SCSI device support
#
CONFIG_SCSI_MOD=y
CONFIG_RAID_ATTRS=y
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
# CONFIG_SCSI_NETLINK is not set
CONFIG_SCSI_MQ_DEFAULT=y
CONFIG_SCSI_PROC_FS=y

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
CONFIG_BLK_DEV_SR=m
CONFIG_BLK_DEV_SR_VENDOR=y
CONFIG_CHR_DEV_SG=y
CONFIG_CHR_DEV_SCH=m
# CONFIG_SCSI_ENCLOSURE is not set
# CONFIG_SCSI_CONSTANTS is not set
CONFIG_SCSI_LOGGING=y
# CONFIG_SCSI_SCAN_ASYNC is not set

#
# SCSI Transports
#
CONFIG_SCSI_SPI_ATTRS=y
# CONFIG_SCSI_FC_ATTRS is not set
CONFIG_SCSI_ISCSI_ATTRS=y
CONFIG_SCSI_SAS_ATTRS=y
CONFIG_SCSI_SAS_LIBSAS=y
# CONFIG_SCSI_SAS_ATA is not set
CONFIG_SCSI_SAS_HOST_SMP=y
# CONFIG_SCSI_SRP_ATTRS is not set
CONFIG_SCSI_LOWLEVEL=y
CONFIG_ISCSI_TCP=m
CONFIG_ISCSI_BOOT_SYSFS=y
CONFIG_SCSI_CXGB3_ISCSI=m
CONFIG_SCSI_CXGB4_ISCSI=m
# CONFIG_SCSI_BNX2_ISCSI is not set
CONFIG_BE2ISCSI=y
CONFIG_BLK_DEV_3W_XXXX_RAID=y
# CONFIG_SCSI_HPSA is not set
CONFIG_SCSI_3W_9XXX=m
# CONFIG_SCSI_3W_SAS is not set
CONFIG_SCSI_ACARD=y
CONFIG_SCSI_AACRAID=m
CONFIG_SCSI_AIC7XXX=y
CONFIG_AIC7XXX_CMDS_PER_DEVICE=32
CONFIG_AIC7XXX_RESET_DELAY_MS=5000
CONFIG_AIC7XXX_DEBUG_ENABLE=y
CONFIG_AIC7XXX_DEBUG_MASK=0
CONFIG_AIC7XXX_REG_PRETTY_PRINT=y
CONFIG_SCSI_AIC79XX=y
CONFIG_AIC79XX_CMDS_PER_DEVICE=32
CONFIG_AIC79XX_RESET_DELAY_MS=5000
# CONFIG_AIC79XX_DEBUG_ENABLE is not set
CONFIG_AIC79XX_DEBUG_MASK=0
# CONFIG_AIC79XX_REG_PRETTY_PRINT is not set
CONFIG_SCSI_AIC94XX=m
# CONFIG_AIC94XX_DEBUG is not set
CONFIG_SCSI_MVSAS=m
CONFIG_SCSI_MVSAS_DEBUG=y
# CONFIG_SCSI_MVSAS_TASKLET is not set
CONFIG_SCSI_MVUMI=m
# CONFIG_SCSI_DPT_I2O is not set
CONFIG_SCSI_ADVANSYS=m
CONFIG_SCSI_ARCMSR=y
CONFIG_SCSI_ESAS2R=y
CONFIG_MEGARAID_NEWGEN=y
# CONFIG_MEGARAID_MM is not set
# CONFIG_MEGARAID_LEGACY is not set
CONFIG_MEGARAID_SAS=y
CONFIG_SCSI_MPT2SAS=y
CONFIG_SCSI_MPT2SAS_MAX_SGE=128
CONFIG_SCSI_MPT2SAS_LOGGING=y
# CONFIG_SCSI_UFSHCD is not set
CONFIG_SCSI_HPTIOP=m
CONFIG_SCSI_BUSLOGIC=m
CONFIG_SCSI_FLASHPOINT=y
# CONFIG_VMWARE_PVSCSI is not set
# CONFIG_SCSI_SNIC is not set
CONFIG_SCSI_DMX3191D=y
# CONFIG_SCSI_EATA is not set
CONFIG_SCSI_FUTURE_DOMAIN=y
CONFIG_SCSI_GDTH=y
CONFIG_SCSI_ISCI=y
# CONFIG_SCSI_IPS is not set
CONFIG_SCSI_INITIO=y
CONFIG_SCSI_INIA100=y
# CONFIG_SCSI_STEX is not set
# CONFIG_SCSI_SYM53C8XX_2 is not set
CONFIG_SCSI_QLOGIC_1280=y
# CONFIG_SCSI_QLA_ISCSI is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_AM53C974 is not set
CONFIG_SCSI_WD719X=m
# CONFIG_SCSI_PMCRAID is not set
# CONFIG_SCSI_PM8001 is not set
CONFIG_SCSI_VIRTIO=m
CONFIG_SCSI_LOWLEVEL_PCMCIA=y
# CONFIG_PCMCIA_AHA152X is not set
CONFIG_PCMCIA_FDOMAIN=m
# CONFIG_PCMCIA_QLOGIC is not set
# CONFIG_PCMCIA_SYM53C500 is not set
# CONFIG_SCSI_DH is not set
# CONFIG_SCSI_OSD_INITIATOR is not set
CONFIG_ATA=y
# CONFIG_ATA_NONSTANDARD is not set
# CONFIG_ATA_VERBOSE_ERROR is not set
CONFIG_SATA_PMP=y

#
# Controllers with non-SFF native interface
#
CONFIG_SATA_AHCI=y
CONFIG_SATA_AHCI_PLATFORM=m
# CONFIG_SATA_INIC162X is not set
CONFIG_SATA_ACARD_AHCI=m
CONFIG_SATA_SIL24=y
CONFIG_ATA_SFF=y

#
# SFF controllers with custom DMA interface
#
# CONFIG_PDC_ADMA is not set
# CONFIG_SATA_QSTOR is not set
CONFIG_SATA_SX4=m
CONFIG_ATA_BMDMA=y

#
# SATA SFF controllers with BMDMA
#
CONFIG_ATA_PIIX=y
CONFIG_SATA_HIGHBANK=m
CONFIG_SATA_MV=y
CONFIG_SATA_NV=y
CONFIG_SATA_PROMISE=y
CONFIG_SATA_RCAR=m
# CONFIG_SATA_SIL is not set
CONFIG_SATA_SIS=y
# CONFIG_SATA_SVW is not set
CONFIG_SATA_ULI=m
# CONFIG_SATA_VIA is not set
CONFIG_SATA_VITESSE=m

#
# PATA SFF controllers with BMDMA
#
CONFIG_PATA_ALI=m
CONFIG_PATA_AMD=y
CONFIG_PATA_ARASAN_CF=y
CONFIG_PATA_ARTOP=m
CONFIG_PATA_ATIIXP=y
# CONFIG_PATA_ATP867X is not set
CONFIG_PATA_CMD64X=m
CONFIG_PATA_CS5520=m
CONFIG_PATA_CS5530=m
CONFIG_PATA_CS5536=m
CONFIG_PATA_CYPRESS=y
# CONFIG_PATA_EFAR is not set
CONFIG_PATA_HPT366=y
CONFIG_PATA_HPT37X=y
CONFIG_PATA_HPT3X2N=m
# CONFIG_PATA_HPT3X3 is not set
CONFIG_PATA_IT8213=y
# CONFIG_PATA_IT821X is not set
CONFIG_PATA_JMICRON=y
CONFIG_PATA_MARVELL=m
CONFIG_PATA_NETCELL=y
CONFIG_PATA_NINJA32=y
CONFIG_PATA_NS87415=y
CONFIG_PATA_OLDPIIX=y
# CONFIG_PATA_OPTIDMA is not set
CONFIG_PATA_PDC2027X=y
CONFIG_PATA_PDC_OLD=y
# CONFIG_PATA_RADISYS is not set
# CONFIG_PATA_RDC is not set
CONFIG_PATA_SC1200=m
CONFIG_PATA_SCH=y
# CONFIG_PATA_SERVERWORKS is not set
CONFIG_PATA_SIL680=m
CONFIG_PATA_SIS=y
CONFIG_PATA_TOSHIBA=m
CONFIG_PATA_TRIFLEX=m
# CONFIG_PATA_VIA is not set
CONFIG_PATA_WINBOND=y

#
# PIO-only SFF controllers
#
CONFIG_PATA_CMD640_PCI=m
CONFIG_PATA_MPIIX=y
# CONFIG_PATA_NS87410 is not set
CONFIG_PATA_OPTI=y
CONFIG_PATA_PCMCIA=y
# CONFIG_PATA_PLATFORM is not set
CONFIG_PATA_RZ1000=y

#
# Generic fallback / legacy drivers
#
# CONFIG_ATA_GENERIC is not set
CONFIG_PATA_LEGACY=m
CONFIG_MD=y
# CONFIG_BLK_DEV_MD is not set
CONFIG_BCACHE=m
# CONFIG_BCACHE_DEBUG is not set
# CONFIG_BCACHE_CLOSURES_DEBUG is not set
# CONFIG_BLK_DEV_DM is not set
CONFIG_TARGET_CORE=y
# CONFIG_TCM_IBLOCK is not set
# CONFIG_TCM_FILEIO is not set
# CONFIG_TCM_PSCSI is not set
# CONFIG_TCM_USER2 is not set
# CONFIG_LOOPBACK_TARGET is not set
CONFIG_ISCSI_TARGET=m
CONFIG_FUSION=y
CONFIG_FUSION_SPI=y
CONFIG_FUSION_SAS=y
CONFIG_FUSION_MAX_SGE=128
# CONFIG_FUSION_CTL is not set
CONFIG_FUSION_LOGGING=y

#
# IEEE 1394 (FireWire) support
#
# CONFIG_FIREWIRE is not set
# CONFIG_FIREWIRE_NOSY is not set
# CONFIG_MACINTOSH_DRIVERS is not set
CONFIG_NETDEVICES=y
CONFIG_MII=y
CONFIG_NET_CORE=y
# CONFIG_BONDING is not set
# CONFIG_EQUALIZER is not set
# CONFIG_NET_FC is not set
# CONFIG_NET_TEAM is not set
CONFIG_MACVLAN=y
CONFIG_MACVTAP=y
CONFIG_VXLAN=m
CONFIG_GENEVE=y
CONFIG_NETCONSOLE=y
# CONFIG_NETCONSOLE_DYNAMIC is not set
CONFIG_NETPOLL=y
CONFIG_NET_POLL_CONTROLLER=y
CONFIG_RIONET=y
CONFIG_RIONET_TX_SIZE=128
CONFIG_RIONET_RX_SIZE=128
CONFIG_TUN=m
CONFIG_TUN_VNET_CROSS_LE=y
# CONFIG_VETH is not set
CONFIG_VIRTIO_NET=y
CONFIG_NLMON=m
CONFIG_ARCNET=y
CONFIG_ARCNET_1201=y
CONFIG_ARCNET_1051=m
CONFIG_ARCNET_RAW=m
# CONFIG_ARCNET_CAP is not set
CONFIG_ARCNET_COM90xx=m
CONFIG_ARCNET_COM90xxIO=y
CONFIG_ARCNET_RIM_I=m
# CONFIG_ARCNET_COM20020 is not set
CONFIG_ATM_DRIVERS=y
CONFIG_ATM_DUMMY=m
CONFIG_ATM_TCP=m
# CONFIG_ATM_LANAI is not set
# CONFIG_ATM_ENI is not set
CONFIG_ATM_FIRESTREAM=m
CONFIG_ATM_ZATM=m
# CONFIG_ATM_ZATM_DEBUG is not set
CONFIG_ATM_NICSTAR=m
CONFIG_ATM_NICSTAR_USE_SUNI=y
# CONFIG_ATM_NICSTAR_USE_IDT77105 is not set
CONFIG_ATM_IDT77252=m
CONFIG_ATM_IDT77252_DEBUG=y
# CONFIG_ATM_IDT77252_RCV_ALL is not set
CONFIG_ATM_IDT77252_USE_SUNI=y
# CONFIG_ATM_AMBASSADOR is not set
# CONFIG_ATM_HORIZON is not set
# CONFIG_ATM_IA is not set
CONFIG_ATM_FORE200E=m
CONFIG_ATM_FORE200E_USE_TASKLET=y
CONFIG_ATM_FORE200E_TX_RETRY=16
CONFIG_ATM_FORE200E_DEBUG=0
CONFIG_ATM_HE=m
CONFIG_ATM_HE_USE_SUNI=y
CONFIG_ATM_SOLOS=m

#
# CAIF transport drivers
#
CONFIG_CAIF_TTY=m
# CONFIG_CAIF_SPI_SLAVE is not set
CONFIG_CAIF_HSI=m
# CONFIG_CAIF_VIRTIO is not set
# CONFIG_VHOST_NET is not set
CONFIG_VHOST_SCSI=m
CONFIG_VHOST_RING=m
CONFIG_VHOST=m
CONFIG_VHOST_CROSS_ENDIAN_LEGACY=y

#
# Distributed Switch Architecture drivers
#
# CONFIG_NET_DSA_MV88E6XXX is not set
# CONFIG_NET_DSA_MV88E6XXX_NEED_PPU is not set
CONFIG_ETHERNET=y
CONFIG_MDIO=y
# CONFIG_NET_VENDOR_3COM is not set
# CONFIG_NET_VENDOR_ADAPTEC is not set
# CONFIG_NET_VENDOR_AGERE is not set
# CONFIG_NET_VENDOR_ALTEON is not set
CONFIG_ALTERA_TSE=y
CONFIG_NET_VENDOR_AMD=y
CONFIG_AMD8111_ETH=m
# CONFIG_PCNET32 is not set
CONFIG_PCMCIA_NMCLAN=m
# CONFIG_NET_XGENE is not set
# CONFIG_NET_VENDOR_ARC is not set
# CONFIG_NET_VENDOR_ATHEROS is not set
CONFIG_NET_CADENCE=y
CONFIG_MACB=m
CONFIG_NET_VENDOR_BROADCOM=y
CONFIG_B44=m
CONFIG_B44_PCI_AUTOSELECT=y
CONFIG_B44_PCICORE_AUTOSELECT=y
CONFIG_B44_PCI=y
CONFIG_BCMGENET=m
CONFIG_BNX2=y
CONFIG_CNIC=y
CONFIG_TIGON3=y
CONFIG_BNX2X=y
CONFIG_BNX2X_SRIOV=y
# CONFIG_NET_VENDOR_BROCADE is not set
CONFIG_NET_CALXEDA_XGMAC=y
# CONFIG_NET_VENDOR_CAVIUM is not set
CONFIG_NET_VENDOR_CHELSIO=y
CONFIG_CHELSIO_T1=m
# CONFIG_CHELSIO_T1_1G is not set
CONFIG_CHELSIO_T3=y
CONFIG_CHELSIO_T4=y
CONFIG_CHELSIO_T4_DCB=y
CONFIG_CHELSIO_T4VF=y
CONFIG_NET_VENDOR_CISCO=y
# CONFIG_ENIC is not set
CONFIG_CX_ECAT=m
CONFIG_DNET=m
CONFIG_NET_VENDOR_DEC=y
CONFIG_NET_TULIP=y
CONFIG_DE2104X=y
CONFIG_DE2104X_DSL=0
# CONFIG_TULIP is not set
CONFIG_DE4X5=y
# CONFIG_WINBOND_840 is not set
CONFIG_DM9102=m
CONFIG_ULI526X=m
# CONFIG_PCMCIA_XIRCOM is not set
# CONFIG_NET_VENDOR_DLINK is not set
CONFIG_NET_VENDOR_EMULEX=y
CONFIG_BE2NET=y
CONFIG_BE2NET_HWMON=y
CONFIG_NET_VENDOR_EZCHIP=y
CONFIG_NET_VENDOR_EXAR=y
CONFIG_S2IO=y
CONFIG_VXGE=y
CONFIG_VXGE_DEBUG_TRACE_ALL=y
CONFIG_NET_VENDOR_FUJITSU=y
# CONFIG_PCMCIA_FMVJ18X is not set
# CONFIG_NET_VENDOR_HP is not set
CONFIG_NET_VENDOR_INTEL=y
CONFIG_E100=y
CONFIG_E1000=m
CONFIG_E1000E=y
CONFIG_IGB=m
# CONFIG_IGB_HWMON is not set
# CONFIG_IGBVF is not set
# CONFIG_IXGB is not set
# CONFIG_IXGBE is not set
# CONFIG_I40E is not set
# CONFIG_NET_VENDOR_I825XX is not set
CONFIG_IP1000=m
CONFIG_JME=m
# CONFIG_NET_VENDOR_MARVELL is not set
# CONFIG_NET_VENDOR_MELLANOX is not set
# CONFIG_NET_VENDOR_MICREL is not set
CONFIG_NET_VENDOR_MICROCHIP=y
CONFIG_ENC28J60=y
# CONFIG_ENC28J60_WRITEVERIFY is not set
CONFIG_NET_VENDOR_MYRI=y
# CONFIG_MYRI10GE is not set
# CONFIG_FEALNX is not set
CONFIG_NET_VENDOR_NATSEMI=y
CONFIG_NATSEMI=y
# CONFIG_NS83820 is not set
CONFIG_NET_VENDOR_8390=y
CONFIG_PCMCIA_AXNET=y
CONFIG_NE2K_PCI=y
# CONFIG_PCMCIA_PCNET is not set
CONFIG_NET_VENDOR_NVIDIA=y
CONFIG_FORCEDETH=y
# CONFIG_NET_VENDOR_OKI is not set
# CONFIG_ETHOC is not set
CONFIG_NET_PACKET_ENGINE=y
CONFIG_HAMACHI=m
CONFIG_YELLOWFIN=y
CONFIG_NET_VENDOR_QLOGIC=y
CONFIG_QLA3XXX=y
# CONFIG_QLCNIC is not set
CONFIG_QLGE=m
CONFIG_NETXEN_NIC=y
CONFIG_NET_VENDOR_QUALCOMM=y
# CONFIG_NET_VENDOR_REALTEK is not set
# CONFIG_NET_VENDOR_RENESAS is not set
CONFIG_NET_VENDOR_RDC=y
# CONFIG_R6040 is not set
CONFIG_NET_VENDOR_ROCKER=y
# CONFIG_NET_VENDOR_SAMSUNG is not set
# CONFIG_NET_VENDOR_SEEQ is not set
# CONFIG_NET_VENDOR_SILAN is not set
CONFIG_NET_VENDOR_SIS=y
CONFIG_SIS900=m
CONFIG_SIS190=m
CONFIG_SFC=y
# CONFIG_SFC_MCDI_MON is not set
CONFIG_SFC_SRIOV=y
# CONFIG_SFC_MCDI_LOGGING is not set
CONFIG_NET_VENDOR_SMSC=y
CONFIG_PCMCIA_SMC91C92=m
CONFIG_EPIC100=y
CONFIG_SMSC911X=m
# CONFIG_SMSC911X_ARCH_HOOKS is not set
CONFIG_SMSC9420=y
CONFIG_NET_VENDOR_STMICRO=y
# CONFIG_STMMAC_ETH is not set
# CONFIG_NET_VENDOR_SUN is not set
# CONFIG_NET_VENDOR_TEHUTI is not set
CONFIG_NET_VENDOR_TI=y
# CONFIG_TI_CPSW_ALE is not set
CONFIG_TLAN=y
# CONFIG_NET_VENDOR_VIA is not set
CONFIG_NET_VENDOR_WIZNET=y
CONFIG_WIZNET_W5100=y
CONFIG_WIZNET_W5300=m
# CONFIG_WIZNET_BUS_DIRECT is not set
# CONFIG_WIZNET_BUS_INDIRECT is not set
CONFIG_WIZNET_BUS_ANY=y
# CONFIG_NET_VENDOR_XIRCOM is not set
CONFIG_FDDI=m
CONFIG_DEFXX=m
# CONFIG_DEFXX_MMIO is not set
CONFIG_SKFP=m
# CONFIG_HIPPI is not set
CONFIG_PHYLIB=y

#
# MII PHY device drivers
#
CONFIG_AT803X_PHY=y
CONFIG_AMD_PHY=m
CONFIG_MARVELL_PHY=y
CONFIG_DAVICOM_PHY=m
CONFIG_QSEMI_PHY=y
# CONFIG_LXT_PHY is not set
CONFIG_CICADA_PHY=y
CONFIG_VITESSE_PHY=m
CONFIG_SMSC_PHY=y
CONFIG_BROADCOM_PHY=y
CONFIG_BCM7XXX_PHY=m
# CONFIG_BCM87XX_PHY is not set
# CONFIG_ICPLUS_PHY is not set
# CONFIG_REALTEK_PHY is not set
# CONFIG_NATIONAL_PHY is not set
CONFIG_STE10XP=m
# CONFIG_LSI_ET1011C_PHY is not set
# CONFIG_MICREL_PHY is not set
# CONFIG_DP83867_PHY is not set
CONFIG_FIXED_PHY=m
CONFIG_MDIO_BITBANG=m
CONFIG_MDIO_GPIO=m
CONFIG_MDIO_BCM_UNIMAC=m
CONFIG_MICREL_KS8995MA=m
CONFIG_PPP=y
CONFIG_PPP_BSDCOMP=y
CONFIG_PPP_DEFLATE=y
CONFIG_PPP_FILTER=y
CONFIG_PPP_MPPE=y
CONFIG_PPP_MULTILINK=y
# CONFIG_PPPOATM is not set
# CONFIG_PPPOE is not set
CONFIG_PPTP=m
# CONFIG_PPP_ASYNC is not set
CONFIG_PPP_SYNC_TTY=m
CONFIG_SLIP=y
CONFIG_SLHC=y
CONFIG_SLIP_COMPRESSED=y
CONFIG_SLIP_SMART=y
# CONFIG_SLIP_MODE_SLIP6 is not set
CONFIG_USB_NET_DRIVERS=m
CONFIG_USB_CATC=m
# CONFIG_USB_KAWETH is not set
CONFIG_USB_PEGASUS=m
CONFIG_USB_RTL8150=m
# CONFIG_USB_RTL8152 is not set
CONFIG_USB_USBNET=m
CONFIG_USB_NET_AX8817X=m
CONFIG_USB_NET_AX88179_178A=m
CONFIG_USB_NET_CDCETHER=m
# CONFIG_USB_NET_CDC_EEM is not set
CONFIG_USB_NET_CDC_NCM=m
CONFIG_USB_NET_HUAWEI_CDC_NCM=m
CONFIG_USB_NET_CDC_MBIM=m
# CONFIG_USB_NET_DM9601 is not set
CONFIG_USB_NET_SR9700=m
CONFIG_USB_NET_SR9800=m
CONFIG_USB_NET_SMSC75XX=m
# CONFIG_USB_NET_SMSC95XX is not set
CONFIG_USB_NET_GL620A=m
CONFIG_USB_NET_NET1080=m
CONFIG_USB_NET_PLUSB=m
CONFIG_USB_NET_MCS7830=m
# CONFIG_USB_NET_RNDIS_HOST is not set
CONFIG_USB_NET_CDC_SUBSET=m
# CONFIG_USB_ALI_M5632 is not set
# CONFIG_USB_AN2720 is not set
# CONFIG_USB_BELKIN is not set
# CONFIG_USB_ARMLINUX is not set
# CONFIG_USB_EPSON2888 is not set
CONFIG_USB_KC2190=y
CONFIG_USB_NET_ZAURUS=m
CONFIG_USB_NET_CX82310_ETH=m
# CONFIG_USB_NET_KALMIA is not set
# CONFIG_USB_NET_QMI_WWAN is not set
CONFIG_USB_HSO=m
# CONFIG_USB_NET_INT51X1 is not set
CONFIG_USB_CDC_PHONET=m
# CONFIG_USB_IPHETH is not set
# CONFIG_USB_SIERRA_NET is not set
CONFIG_USB_VL600=m
# CONFIG_WLAN is not set

#
# Enable WiMAX (Networking options) to see the WiMAX drivers
#
CONFIG_WAN=y
# CONFIG_HDLC is not set
CONFIG_DLCI=y
CONFIG_DLCI_MAX=8
# CONFIG_SBNI is not set
# CONFIG_IEEE802154_DRIVERS is not set
CONFIG_VMXNET3=y

#
# Input device support
#
CONFIG_INPUT=y
CONFIG_INPUT_LEDS=m
CONFIG_INPUT_FF_MEMLESS=y
CONFIG_INPUT_POLLDEV=y
# CONFIG_INPUT_SPARSEKMAP is not set
CONFIG_INPUT_MATRIXKMAP=y

#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
# CONFIG_INPUT_JOYDEV is not set
CONFIG_INPUT_EVDEV=m
# CONFIG_INPUT_EVBUG is not set

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
# CONFIG_KEYBOARD_ADP5588 is not set
CONFIG_KEYBOARD_ADP5589=m
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_QT1070 is not set
CONFIG_KEYBOARD_QT2160=m
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_GPIO is not set
# CONFIG_KEYBOARD_GPIO_POLLED is not set
CONFIG_KEYBOARD_TCA6416=y
CONFIG_KEYBOARD_TCA8418=y
CONFIG_KEYBOARD_MATRIX=y
CONFIG_KEYBOARD_LM8323=y
CONFIG_KEYBOARD_LM8333=y
CONFIG_KEYBOARD_MAX7359=y
# CONFIG_KEYBOARD_MCS is not set
# CONFIG_KEYBOARD_MPR121 is not set
CONFIG_KEYBOARD_NEWTON=y
CONFIG_KEYBOARD_OPENCORES=m
CONFIG_KEYBOARD_STOWAWAY=y
CONFIG_KEYBOARD_ST_KEYSCAN=y
CONFIG_KEYBOARD_SUNKBD=y
CONFIG_KEYBOARD_SH_KEYSC=m
CONFIG_KEYBOARD_XTKBD=y
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=m
CONFIG_MOUSE_PS2_ALPS=y
# CONFIG_MOUSE_PS2_LOGIPS2PP is not set
CONFIG_MOUSE_PS2_SYNAPTICS=y
CONFIG_MOUSE_PS2_CYPRESS=y
# CONFIG_MOUSE_PS2_TRACKPOINT is not set
# CONFIG_MOUSE_PS2_ELANTECH is not set
CONFIG_MOUSE_PS2_SENTELIC=y
CONFIG_MOUSE_PS2_TOUCHKIT=y
CONFIG_MOUSE_PS2_FOCALTECH=y
# CONFIG_MOUSE_SERIAL is not set
CONFIG_MOUSE_APPLETOUCH=m
# CONFIG_MOUSE_BCM5974 is not set
# CONFIG_MOUSE_CYAPA is not set
CONFIG_MOUSE_ELAN_I2C=y
# CONFIG_MOUSE_ELAN_I2C_I2C is not set
CONFIG_MOUSE_ELAN_I2C_SMBUS=y
CONFIG_MOUSE_VSXXXAA=m
CONFIG_MOUSE_GPIO=y
# CONFIG_MOUSE_SYNAPTICS_I2C is not set
CONFIG_MOUSE_SYNAPTICS_USB=y
CONFIG_INPUT_TABLET=y
CONFIG_TABLET_USB_ACECAD=m
CONFIG_TABLET_USB_AIPTEK=m
# CONFIG_TABLET_USB_GTCO is not set
# CONFIG_TABLET_USB_HANWANG is not set
CONFIG_TABLET_USB_KBTAB=m
# CONFIG_TABLET_SERIAL_WACOM4 is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
CONFIG_INPUT_MISC=y
# CONFIG_INPUT_AD714X is not set
# CONFIG_INPUT_BMA150 is not set
CONFIG_INPUT_E3X0_BUTTON=m
# CONFIG_INPUT_PCSPKR is not set
# CONFIG_INPUT_MAX77693_HAPTIC is not set
CONFIG_INPUT_MAX77843_HAPTIC=y
CONFIG_INPUT_MAX8997_HAPTIC=m
# CONFIG_INPUT_MC13783_PWRBUTTON is not set
# CONFIG_INPUT_MMA8450 is not set
CONFIG_INPUT_MPU3050=m
CONFIG_INPUT_APANEL=y
CONFIG_INPUT_GP2A=m
CONFIG_INPUT_GPIO_BEEPER=y
# CONFIG_INPUT_GPIO_TILT_POLLED is not set
CONFIG_INPUT_ATI_REMOTE2=m
CONFIG_INPUT_KEYSPAN_REMOTE=m
# CONFIG_INPUT_KXTJ9 is not set
CONFIG_INPUT_POWERMATE=m
CONFIG_INPUT_YEALINK=y
# CONFIG_INPUT_CM109 is not set
# CONFIG_INPUT_REGULATOR_HAPTIC is not set
CONFIG_INPUT_RETU_PWRBUTTON=m
CONFIG_INPUT_AXP20X_PEK=y
CONFIG_INPUT_TWL6040_VIBRA=m
CONFIG_INPUT_UINPUT=m
CONFIG_INPUT_PALMAS_PWRBUTTON=y
CONFIG_INPUT_PCF50633_PMU=m
# CONFIG_INPUT_PCF8574 is not set
CONFIG_INPUT_PWM_BEEPER=y
CONFIG_INPUT_GPIO_ROTARY_ENCODER=y
CONFIG_INPUT_DA9052_ONKEY=m
# CONFIG_INPUT_DA9055_ONKEY is not set
# CONFIG_INPUT_ADXL34X is not set
CONFIG_INPUT_IMS_PCU=m
CONFIG_INPUT_CMA3000=m
CONFIG_INPUT_CMA3000_I2C=m
# CONFIG_INPUT_IDEAPAD_SLIDEBAR is not set
CONFIG_INPUT_DRV260X_HAPTICS=m
CONFIG_INPUT_DRV2665_HAPTICS=y
CONFIG_INPUT_DRV2667_HAPTICS=m

#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_ARCH_MIGHT_HAVE_PC_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=y
CONFIG_SERIO_CT82C710=y
CONFIG_SERIO_PCIPS2=y
CONFIG_SERIO_LIBPS2=y
# CONFIG_SERIO_RAW is not set
CONFIG_SERIO_ALTERA_PS2=m
CONFIG_SERIO_PS2MULT=y
CONFIG_SERIO_ARC_PS2=y
# CONFIG_SERIO_OLPC_APSP is not set
# CONFIG_SERIO_SUN4I_PS2 is not set

#
# Character devices
#
CONFIG_TTY=y
CONFIG_VT=y
# CONFIG_CONSOLE_TRANSLATIONS is not set
CONFIG_VT_CONSOLE=y
CONFIG_VT_CONSOLE_SLEEP=y
CONFIG_HW_CONSOLE=y
# CONFIG_VT_HW_CONSOLE_BINDING is not set
CONFIG_UNIX98_PTYS=y
# CONFIG_DEVPTS_MULTIPLE_INSTANCES is not set
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
# CONFIG_SERIAL_NONSTANDARD is not set
CONFIG_NOZOMI=m
# CONFIG_N_GSM is not set
# CONFIG_TRACE_SINK is not set
# CONFIG_DEVMEM is not set
CONFIG_DEVKMEM=y

#
# Serial drivers
#
CONFIG_SERIAL_EARLYCON=y
CONFIG_SERIAL_8250=y
# CONFIG_SERIAL_8250_DEPRECATED_OPTIONS is not set
CONFIG_SERIAL_8250_CONSOLE=y
# CONFIG_SERIAL_8250_DMA is not set
CONFIG_SERIAL_8250_PCI=m
# CONFIG_SERIAL_8250_CS is not set
CONFIG_SERIAL_8250_NR_UARTS=4
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
CONFIG_SERIAL_8250_EXTENDED=y
CONFIG_SERIAL_8250_MANY_PORTS=y
# CONFIG_SERIAL_8250_SHARE_IRQ is not set
CONFIG_SERIAL_8250_DETECT_IRQ=y
CONFIG_SERIAL_8250_RSA=y
CONFIG_SERIAL_8250_DW=y

#
# Non-8250 serial port support
#
# CONFIG_SERIAL_CLPS711X is not set
CONFIG_SERIAL_MAX3100=m
CONFIG_SERIAL_MAX310X=m
CONFIG_SERIAL_UARTLITE=y
CONFIG_SERIAL_UARTLITE_CONSOLE=y
CONFIG_SERIAL_SH_SCI=m
CONFIG_SERIAL_SH_SCI_NR_UARTS=2
# CONFIG_SERIAL_SH_SCI_DMA is not set
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
CONFIG_SERIAL_JSM=y
# CONFIG_SERIAL_SCCNXP is not set
# CONFIG_SERIAL_SC16IS7XX is not set
CONFIG_SERIAL_TIMBERDALE=y
CONFIG_SERIAL_BCM63XX=y
CONFIG_SERIAL_BCM63XX_CONSOLE=y
# CONFIG_SERIAL_ALTERA_JTAGUART is not set
CONFIG_SERIAL_ALTERA_UART=m
CONFIG_SERIAL_ALTERA_UART_MAXPORTS=4
CONFIG_SERIAL_ALTERA_UART_BAUDRATE=115200
CONFIG_SERIAL_IFX6X60=y
CONFIG_SERIAL_PCH_UART=y
CONFIG_SERIAL_PCH_UART_CONSOLE=y
# CONFIG_SERIAL_ARC is not set
CONFIG_SERIAL_RP2=m
CONFIG_SERIAL_RP2_NR_UARTS=32
# CONFIG_SERIAL_FSL_LPUART is not set
CONFIG_SERIAL_ST_ASC=m
CONFIG_SERIAL_MEN_Z135=m
CONFIG_SERIAL_STM32=y
# CONFIG_SERIAL_STM32_CONSOLE is not set
# CONFIG_TTY_PRINTK is not set
# CONFIG_VIRTIO_CONSOLE is not set
# CONFIG_IPMI_HANDLER is not set
# CONFIG_HW_RANDOM is not set
CONFIG_NVRAM=m
CONFIG_R3964=m
CONFIG_APPLICOM=m

#
# PCMCIA character devices
#
# CONFIG_SYNCLINK_CS is not set
CONFIG_CARDMAN_4000=y
# CONFIG_CARDMAN_4040 is not set
# CONFIG_IPWIRELESS is not set
CONFIG_MWAVE=y
# CONFIG_RAW_DRIVER is not set
# CONFIG_HANGCHECK_TIMER is not set
# CONFIG_TCG_TPM is not set
# CONFIG_TELCLOCK is not set
CONFIG_DEVPORT=y
CONFIG_XILLYBUS=y

#
# I2C support
#
CONFIG_I2C=y
CONFIG_I2C_BOARDINFO=y
# CONFIG_I2C_COMPAT is not set
CONFIG_I2C_CHARDEV=y
CONFIG_I2C_MUX=y

#
# Multiplexer I2C Chip support
#
CONFIG_I2C_MUX_GPIO=m
# CONFIG_I2C_MUX_PCA9541 is not set
# CONFIG_I2C_MUX_PCA954x is not set
CONFIG_I2C_HELPER_AUTO=y
CONFIG_I2C_SMBUS=y
CONFIG_I2C_ALGOBIT=y
CONFIG_I2C_ALGOPCA=y

#
# I2C Hardware Bus support
#

#
# PC SMBus host controller drivers
#
CONFIG_I2C_ALI1535=y
CONFIG_I2C_ALI1563=y
CONFIG_I2C_ALI15X3=m
CONFIG_I2C_AMD756=y
# CONFIG_I2C_AMD756_S4882 is not set
CONFIG_I2C_AMD8111=m
# CONFIG_I2C_HIX5HD2 is not set
CONFIG_I2C_I801=y
# CONFIG_I2C_ISCH is not set
CONFIG_I2C_ISMT=m
CONFIG_I2C_PIIX4=m
# CONFIG_I2C_NFORCE2 is not set
CONFIG_I2C_SIS5595=y
CONFIG_I2C_SIS630=m
CONFIG_I2C_SIS96X=y
CONFIG_I2C_VIA=y
CONFIG_I2C_VIAPRO=y

#
# I2C system bus drivers (mostly embedded / system-on-chip)
#
CONFIG_I2C_AXXIA=m
CONFIG_I2C_BCM_IPROC=m
CONFIG_I2C_BRCMSTB=y
CONFIG_I2C_CBUS_GPIO=m
CONFIG_I2C_DESIGNWARE_CORE=y
CONFIG_I2C_DESIGNWARE_PLATFORM=y
# CONFIG_I2C_DESIGNWARE_PCI is not set
# CONFIG_I2C_EFM32 is not set
CONFIG_I2C_EG20T=y
CONFIG_I2C_GPIO=y
CONFIG_I2C_IMG=y
CONFIG_I2C_JZ4780=y
# CONFIG_I2C_KEMPLD is not set
CONFIG_I2C_MT65XX=y
CONFIG_I2C_OCORES=y
CONFIG_I2C_PCA_PLATFORM=y
# CONFIG_I2C_PXA_PCI is not set
CONFIG_I2C_RIIC=m
CONFIG_I2C_SH_MOBILE=m
CONFIG_I2C_SIMTEC=y
CONFIG_I2C_XILINX=m
# CONFIG_I2C_XLP9XX is not set
CONFIG_I2C_RCAR=m

#
# External I2C/SMBus adapter drivers
#
# CONFIG_I2C_DIOLAN_U2C is not set
CONFIG_I2C_DLN2=m
CONFIG_I2C_PARPORT_LIGHT=y
# CONFIG_I2C_ROBOTFUZZ_OSIF is not set
CONFIG_I2C_TAOS_EVM=m
CONFIG_I2C_TINY_USB=y
CONFIG_I2C_VIPERBOARD=m

#
# Other I2C/SMBus bus drivers
#
# CONFIG_I2C_STUB is not set
CONFIG_I2C_SLAVE=y
CONFIG_I2C_SLAVE_EEPROM=y
CONFIG_I2C_DEBUG_CORE=y
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
CONFIG_SPI=y
CONFIG_SPI_DEBUG=y
CONFIG_SPI_MASTER=y

#
# SPI Master Controller Drivers
#
# CONFIG_SPI_ALTERA is not set
# CONFIG_SPI_ATMEL is not set
CONFIG_SPI_BCM2835=m
# CONFIG_SPI_BCM63XX_HSSPI is not set
CONFIG_SPI_BITBANG=y
CONFIG_SPI_CADENCE=y
CONFIG_SPI_CLPS711X=m
CONFIG_SPI_DLN2=m
CONFIG_SPI_EP93XX=m
# CONFIG_SPI_GPIO is not set
CONFIG_SPI_IMG_SPFI=y
# CONFIG_SPI_IMX is not set
CONFIG_SPI_FSL_DSPI=y
CONFIG_SPI_MESON_SPIFC=y
CONFIG_SPI_OC_TINY=y
CONFIG_SPI_TI_QSPI=y
CONFIG_SPI_OMAP_100K=m
# CONFIG_SPI_ORION is not set
# CONFIG_SPI_PXA2XX is not set
# CONFIG_SPI_PXA2XX_PCI is not set
CONFIG_SPI_RSPI=m
# CONFIG_SPI_SC18IS602 is not set
CONFIG_SPI_SH=m
# CONFIG_SPI_SH_HSPI is not set
# CONFIG_SPI_SUN4I is not set
# CONFIG_SPI_TOPCLIFF_PCH is not set
CONFIG_SPI_TXX9=m
CONFIG_SPI_XCOMM=y
# CONFIG_SPI_XILINX is not set
CONFIG_SPI_XTENSA_XTFPGA=m
# CONFIG_SPI_ZYNQMP_GQSPI is not set
CONFIG_SPI_DESIGNWARE=m
CONFIG_SPI_DW_PCI=m
# CONFIG_SPI_DW_MID_DMA is not set
CONFIG_SPI_DW_MMIO=m

#
# SPI Protocol Masters
#
CONFIG_SPI_SPIDEV=m
CONFIG_SPI_TLE62X0=y
# CONFIG_SPMI is not set
CONFIG_HSI=m
CONFIG_HSI_BOARDINFO=y

#
# HSI controllers
#

#
# HSI clients
#
CONFIG_HSI_CHAR=m

#
# PPS support
#
CONFIG_PPS=y

#
# PPS clients support
#
CONFIG_PPS_CLIENT_KTIMER=m
# CONFIG_PPS_CLIENT_LDISC is not set
CONFIG_PPS_CLIENT_GPIO=m

#
# PPS generators support
#

#
# PTP clock support
#
CONFIG_PTP_1588_CLOCK=y

#
# Enable PHYLIB and NETWORK_PHY_TIMESTAMPING to see the additional clocks.
#
# CONFIG_PTP_1588_CLOCK_PCH is not set
CONFIG_ARCH_WANT_OPTIONAL_GPIOLIB=y
CONFIG_GPIOLIB=y
CONFIG_GPIO_DEVRES=y
CONFIG_GPIOLIB_IRQCHIP=y
CONFIG_DEBUG_GPIO=y
# CONFIG_GPIO_SYSFS is not set
CONFIG_GPIO_GENERIC=y
CONFIG_GPIO_MAX730X=y

#
# Memory mapped GPIO drivers
#
CONFIG_GPIO_CLPS711X=y
CONFIG_GPIO_DWAPB=m
CONFIG_GPIO_F7188X=m
# CONFIG_GPIO_GENERIC_PLATFORM is not set
# CONFIG_GPIO_ICH is not set
CONFIG_GPIO_IT8761E=y
CONFIG_GPIO_SCH=m
# CONFIG_GPIO_SCH311X is not set
CONFIG_GPIO_TS5500=m
# CONFIG_GPIO_VX855 is not set

#
# I2C GPIO expanders
#
# CONFIG_GPIO_ADP5588 is not set
# CONFIG_GPIO_MAX7300 is not set
# CONFIG_GPIO_MAX732X is not set
CONFIG_GPIO_PCA953X=m
CONFIG_GPIO_PCF857X=m
# CONFIG_GPIO_SX150X is not set

#
# MFD GPIO expanders
#
CONFIG_GPIO_ARIZONA=y
CONFIG_GPIO_CRYSTAL_COVE=m
# CONFIG_GPIO_DA9052 is not set
CONFIG_GPIO_DA9055=m
CONFIG_GPIO_DLN2=m
CONFIG_GPIO_JANZ_TTL=m
CONFIG_GPIO_KEMPLD=m
CONFIG_GPIO_LP3943=m
CONFIG_GPIO_PALMAS=y
# CONFIG_GPIO_RC5T583 is not set
CONFIG_GPIO_TIMBERDALE=y
# CONFIG_GPIO_TPS6586X is not set
CONFIG_GPIO_TPS65912=y
# CONFIG_GPIO_TWL6040 is not set

#
# PCI GPIO expanders
#
CONFIG_GPIO_AMD8111=y
# CONFIG_GPIO_INTEL_MID is not set
CONFIG_GPIO_ML_IOH=m
CONFIG_GPIO_PCH=m
CONFIG_GPIO_RDC321X=y

#
# SPI GPIO expanders
#
CONFIG_GPIO_MAX7301=y
CONFIG_GPIO_MCP23S08=m
CONFIG_GPIO_MC33880=m

#
# USB GPIO expanders
#
CONFIG_GPIO_VIPERBOARD=m
CONFIG_W1=y
# CONFIG_W1_CON is not set

#
# 1-wire Bus Masters
#
# CONFIG_W1_MASTER_MATROX is not set
CONFIG_W1_MASTER_DS2490=m
CONFIG_W1_MASTER_DS2482=y
CONFIG_W1_MASTER_MXC=y
# CONFIG_W1_MASTER_DS1WM is not set
CONFIG_W1_MASTER_GPIO=y

#
# 1-wire Slaves
#
# CONFIG_W1_SLAVE_THERM is not set
CONFIG_W1_SLAVE_SMEM=y
# CONFIG_W1_SLAVE_DS2408 is not set
CONFIG_W1_SLAVE_DS2413=m
CONFIG_W1_SLAVE_DS2406=m
CONFIG_W1_SLAVE_DS2423=y
# CONFIG_W1_SLAVE_DS2431 is not set
# CONFIG_W1_SLAVE_DS2433 is not set
CONFIG_W1_SLAVE_DS2760=m
CONFIG_W1_SLAVE_DS2780=y
CONFIG_W1_SLAVE_DS2781=y
# CONFIG_W1_SLAVE_DS28E04 is not set
CONFIG_W1_SLAVE_BQ27000=y
CONFIG_POWER_SUPPLY=y
CONFIG_POWER_SUPPLY_DEBUG=y
# CONFIG_PDA_POWER is not set
CONFIG_GENERIC_ADC_BATTERY=m
CONFIG_TEST_POWER=m
CONFIG_BATTERY_DS2760=m
CONFIG_BATTERY_DS2780=y
CONFIG_BATTERY_DS2781=y
CONFIG_BATTERY_DS2782=y
CONFIG_BATTERY_SBS=y
CONFIG_BATTERY_BQ27x00=y
# CONFIG_BATTERY_BQ27X00_I2C is not set
# CONFIG_BATTERY_BQ27X00_PLATFORM is not set
CONFIG_BATTERY_DA9030=m
# CONFIG_BATTERY_DA9052 is not set
CONFIG_CHARGER_DA9150=m
# CONFIG_AXP288_FUEL_GAUGE is not set
CONFIG_BATTERY_MAX17040=m
CONFIG_BATTERY_MAX17042=m
CONFIG_CHARGER_PCF50633=m
# CONFIG_CHARGER_ISP1704 is not set
CONFIG_CHARGER_MAX8903=m
CONFIG_CHARGER_LP8727=y
# CONFIG_CHARGER_LP8788 is not set
# CONFIG_CHARGER_GPIO is not set
# CONFIG_CHARGER_MANAGER is not set
# CONFIG_CHARGER_MAX14577 is not set
CONFIG_CHARGER_MAX77693=m
CONFIG_CHARGER_BQ2415X=m
CONFIG_CHARGER_BQ24190=m
# CONFIG_CHARGER_BQ24257 is not set
# CONFIG_CHARGER_BQ24735 is not set
CONFIG_CHARGER_BQ25890=y
CONFIG_CHARGER_SMB347=y
CONFIG_BATTERY_GAUGE_LTC2941=m
CONFIG_BATTERY_GOLDFISH=m
# CONFIG_BATTERY_RT5033 is not set
CONFIG_CHARGER_RT9455=y
# CONFIG_POWER_RESET is not set
# CONFIG_POWER_AVS is not set
CONFIG_HWMON=y
CONFIG_HWMON_VID=y
# CONFIG_HWMON_DEBUG_CHIP is not set

#
# Native drivers
#
# CONFIG_SENSORS_AD7314 is not set
# CONFIG_SENSORS_AD7414 is not set
# CONFIG_SENSORS_AD7418 is not set
CONFIG_SENSORS_ADM1021=y
# CONFIG_SENSORS_ADM1025 is not set
CONFIG_SENSORS_ADM1026=y
# CONFIG_SENSORS_ADM1029 is not set
CONFIG_SENSORS_ADM1031=m
# CONFIG_SENSORS_ADM9240 is not set
CONFIG_SENSORS_ADT7X10=y
CONFIG_SENSORS_ADT7310=y
CONFIG_SENSORS_ADT7410=y
CONFIG_SENSORS_ADT7411=m
CONFIG_SENSORS_ADT7462=m
# CONFIG_SENSORS_ADT7470 is not set
CONFIG_SENSORS_ADT7475=m
CONFIG_SENSORS_ASC7621=y
CONFIG_SENSORS_K8TEMP=y
CONFIG_SENSORS_K10TEMP=m
CONFIG_SENSORS_FAM15H_POWER=m
CONFIG_SENSORS_APPLESMC=y
CONFIG_SENSORS_ASB100=y
CONFIG_SENSORS_ATXP1=m
CONFIG_SENSORS_DS620=m
CONFIG_SENSORS_DS1621=y
CONFIG_SENSORS_DELL_SMM=m
CONFIG_SENSORS_DA9052_ADC=m
CONFIG_SENSORS_DA9055=y
# CONFIG_SENSORS_I5K_AMB is not set
CONFIG_SENSORS_F71805F=y
# CONFIG_SENSORS_F71882FG is not set
# CONFIG_SENSORS_F75375S is not set
# CONFIG_SENSORS_MC13783_ADC is not set
CONFIG_SENSORS_FSCHMD=m
# CONFIG_SENSORS_GL520SM is not set
# CONFIG_SENSORS_G760A is not set
CONFIG_SENSORS_G762=m
CONFIG_SENSORS_GPIO_FAN=y
CONFIG_SENSORS_HIH6130=y
CONFIG_SENSORS_IIO_HWMON=y
# CONFIG_SENSORS_I5500 is not set
CONFIG_SENSORS_CORETEMP=m
# CONFIG_SENSORS_IT87 is not set
CONFIG_SENSORS_JC42=m
CONFIG_SENSORS_POWR1220=m
CONFIG_SENSORS_LINEAGE=y
# CONFIG_SENSORS_LTC2945 is not set
# CONFIG_SENSORS_LTC4151 is not set
CONFIG_SENSORS_LTC4215=y
CONFIG_SENSORS_LTC4222=y
CONFIG_SENSORS_LTC4245=m
CONFIG_SENSORS_LTC4260=y
CONFIG_SENSORS_LTC4261=y
CONFIG_SENSORS_MAX1111=m
# CONFIG_SENSORS_MAX16065 is not set
# CONFIG_SENSORS_MAX1619 is not set
CONFIG_SENSORS_MAX1668=y
CONFIG_SENSORS_MAX197=y
CONFIG_SENSORS_MAX6639=y
# CONFIG_SENSORS_MAX6642 is not set
# CONFIG_SENSORS_MAX6650 is not set
CONFIG_SENSORS_MAX6697=y
# CONFIG_SENSORS_HTU21 is not set
CONFIG_SENSORS_MCP3021=m
# CONFIG_SENSORS_ADCXX is not set
# CONFIG_SENSORS_LM63 is not set
# CONFIG_SENSORS_LM70 is not set
# CONFIG_SENSORS_LM73 is not set
# CONFIG_SENSORS_LM75 is not set
CONFIG_SENSORS_LM77=m
CONFIG_SENSORS_LM78=y
CONFIG_SENSORS_LM80=y
CONFIG_SENSORS_LM83=m
CONFIG_SENSORS_LM85=y
CONFIG_SENSORS_LM87=y
CONFIG_SENSORS_LM95234=y
# CONFIG_SENSORS_LM95241 is not set
CONFIG_SENSORS_LM95245=y
CONFIG_SENSORS_PC87360=y
CONFIG_SENSORS_PC87427=y
CONFIG_SENSORS_NTC_THERMISTOR=y
CONFIG_SENSORS_NCT6683=m
CONFIG_SENSORS_NCT6775=y
# CONFIG_SENSORS_NCT7802 is not set
CONFIG_SENSORS_NCT7904=m
# CONFIG_SENSORS_PCF8591 is not set
CONFIG_PMBUS=y
# CONFIG_SENSORS_PMBUS is not set
CONFIG_SENSORS_ADM1275=y
CONFIG_SENSORS_LM25066=m
CONFIG_SENSORS_LTC2978=y
CONFIG_SENSORS_LTC2978_REGULATOR=y
CONFIG_SENSORS_MAX16064=m
CONFIG_SENSORS_MAX34440=m
# CONFIG_SENSORS_MAX8688 is not set
CONFIG_SENSORS_TPS40422=m
CONFIG_SENSORS_UCD9000=m
CONFIG_SENSORS_UCD9200=m
# CONFIG_SENSORS_ZL6100 is not set
CONFIG_SENSORS_PWM_FAN=m
CONFIG_SENSORS_SHT15=m
# CONFIG_SENSORS_SHT21 is not set
CONFIG_SENSORS_SHTC1=y
CONFIG_SENSORS_SIS5595=m
# CONFIG_SENSORS_DME1737 is not set
CONFIG_SENSORS_EMC1403=y
# CONFIG_SENSORS_EMC2103 is not set
# CONFIG_SENSORS_EMC6W201 is not set
# CONFIG_SENSORS_SMSC47M1 is not set
# CONFIG_SENSORS_SMSC47M192 is not set
CONFIG_SENSORS_SMSC47B397=y
# CONFIG_SENSORS_SCH56XX_COMMON is not set
CONFIG_SENSORS_SMM665=m
# CONFIG_SENSORS_ADC128D818 is not set
CONFIG_SENSORS_ADS1015=m
CONFIG_SENSORS_ADS7828=m
CONFIG_SENSORS_ADS7871=m
CONFIG_SENSORS_AMC6821=m
# CONFIG_SENSORS_INA209 is not set
CONFIG_SENSORS_INA2XX=y
# CONFIG_SENSORS_TC74 is not set
# CONFIG_SENSORS_THMC50 is not set
CONFIG_SENSORS_TMP102=m
# CONFIG_SENSORS_TMP103 is not set
CONFIG_SENSORS_TMP401=y
CONFIG_SENSORS_TMP421=m
# CONFIG_SENSORS_VIA_CPUTEMP is not set
CONFIG_SENSORS_VIA686A=y
CONFIG_SENSORS_VT1211=m
CONFIG_SENSORS_VT8231=m
CONFIG_SENSORS_W83781D=y
CONFIG_SENSORS_W83791D=m
CONFIG_SENSORS_W83792D=y
# CONFIG_SENSORS_W83793 is not set
CONFIG_SENSORS_W83795=y
# CONFIG_SENSORS_W83795_FANCTRL is not set
# CONFIG_SENSORS_W83L785TS is not set
CONFIG_SENSORS_W83L786NG=m
CONFIG_SENSORS_W83627HF=m
CONFIG_SENSORS_W83627EHF=m
CONFIG_THERMAL=y
# CONFIG_THERMAL_HWMON is not set
# CONFIG_THERMAL_WRITABLE_TRIPS is not set
# CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE is not set
# CONFIG_THERMAL_DEFAULT_GOV_FAIR_SHARE is not set
# CONFIG_THERMAL_DEFAULT_GOV_USER_SPACE is not set
CONFIG_THERMAL_DEFAULT_GOV_POWER_ALLOCATOR=y
# CONFIG_THERMAL_GOV_FAIR_SHARE is not set
CONFIG_THERMAL_GOV_STEP_WISE=y
# CONFIG_THERMAL_GOV_BANG_BANG is not set
CONFIG_THERMAL_GOV_USER_SPACE=y
CONFIG_THERMAL_GOV_POWER_ALLOCATOR=y
# CONFIG_THERMAL_EMULATION is not set
CONFIG_RCAR_THERMAL=m
# CONFIG_INTEL_SOC_DTS_THERMAL is not set

#
# Texas Instruments thermal drivers
#
CONFIG_SSB_POSSIBLE=y

#
# Sonics Silicon Backplane
#
CONFIG_SSB=y
CONFIG_SSB_SPROM=y
CONFIG_SSB_PCIHOST_POSSIBLE=y
CONFIG_SSB_PCIHOST=y
# CONFIG_SSB_B43_PCI_BRIDGE is not set
CONFIG_SSB_PCMCIAHOST_POSSIBLE=y
CONFIG_SSB_PCMCIAHOST=y
CONFIG_SSB_SILENT=y
CONFIG_SSB_DRIVER_PCICORE_POSSIBLE=y
CONFIG_SSB_DRIVER_PCICORE=y
CONFIG_SSB_DRIVER_GPIO=y
CONFIG_BCMA_POSSIBLE=y

#
# Broadcom specific AMBA
#
CONFIG_BCMA=y
CONFIG_BCMA_HOST_PCI_POSSIBLE=y
CONFIG_BCMA_HOST_PCI=y
# CONFIG_BCMA_HOST_SOC is not set
CONFIG_BCMA_DRIVER_PCI=y
# CONFIG_BCMA_DRIVER_GMAC_CMN is not set
CONFIG_BCMA_DRIVER_GPIO=y
CONFIG_BCMA_DEBUG=y

#
# Multifunction device drivers
#
CONFIG_MFD_CORE=y
# CONFIG_MFD_CS5535 is not set
CONFIG_MFD_AS3711=y
# CONFIG_PMIC_ADP5520 is not set
CONFIG_MFD_AAT2870_CORE=y
CONFIG_MFD_BCM590XX=y
CONFIG_MFD_AXP20X=y
# CONFIG_MFD_CROS_EC is not set
CONFIG_PMIC_DA903X=y
CONFIG_PMIC_DA9052=y
CONFIG_MFD_DA9052_SPI=y
CONFIG_MFD_DA9052_I2C=y
CONFIG_MFD_DA9055=y
# CONFIG_MFD_DA9063 is not set
CONFIG_MFD_DA9150=y
CONFIG_MFD_DLN2=m
CONFIG_MFD_MC13XXX=y
CONFIG_MFD_MC13XXX_SPI=y
CONFIG_MFD_MC13XXX_I2C=y
# CONFIG_HTC_PASIC3 is not set
# CONFIG_HTC_I2CPLD is not set
CONFIG_LPC_ICH=y
CONFIG_LPC_SCH=m
CONFIG_INTEL_SOC_PMIC=y
CONFIG_MFD_JANZ_CMODIO=m
CONFIG_MFD_KEMPLD=m
# CONFIG_MFD_88PM800 is not set
CONFIG_MFD_88PM805=y
# CONFIG_MFD_88PM860X is not set
CONFIG_MFD_MAX14577=y
CONFIG_MFD_MAX77693=y
CONFIG_MFD_MAX77843=y
CONFIG_MFD_MAX8907=m
# CONFIG_MFD_MAX8925 is not set
CONFIG_MFD_MAX8997=y
CONFIG_MFD_MAX8998=y
CONFIG_MFD_MT6397=y
# CONFIG_MFD_MENF21BMC is not set
# CONFIG_EZX_PCAP is not set
CONFIG_MFD_VIPERBOARD=m
CONFIG_MFD_RETU=m
CONFIG_MFD_PCF50633=m
CONFIG_PCF50633_ADC=m
CONFIG_PCF50633_GPIO=m
CONFIG_MFD_RDC321X=y
CONFIG_MFD_RTSX_PCI=m
CONFIG_MFD_RT5033=y
CONFIG_MFD_RTSX_USB=y
CONFIG_MFD_RC5T583=y
CONFIG_MFD_RN5T618=m
CONFIG_MFD_SEC_CORE=y
CONFIG_MFD_SI476X_CORE=y
CONFIG_MFD_SM501=m
CONFIG_MFD_SM501_GPIO=y
CONFIG_MFD_SKY81452=y
# CONFIG_MFD_SMSC is not set
CONFIG_ABX500_CORE=y
# CONFIG_AB3100_CORE is not set
# CONFIG_MFD_SYSCON is not set
# CONFIG_MFD_TI_AM335X_TSCADC is not set
CONFIG_MFD_LP3943=m
CONFIG_MFD_LP8788=y
CONFIG_MFD_PALMAS=y
CONFIG_TPS6105X=m
CONFIG_TPS6507X=y
# CONFIG_MFD_TPS65090 is not set
CONFIG_MFD_TPS65217=m
# CONFIG_MFD_TPS65218 is not set
CONFIG_MFD_TPS6586X=y
# CONFIG_MFD_TPS65910 is not set
CONFIG_MFD_TPS65912=y
# CONFIG_MFD_TPS65912_I2C is not set
CONFIG_MFD_TPS65912_SPI=y
# CONFIG_MFD_TPS80031 is not set
# CONFIG_TWL4030_CORE is not set
CONFIG_TWL6040_CORE=y
CONFIG_MFD_WL1273_CORE=y
CONFIG_MFD_LM3533=m
CONFIG_MFD_TIMBERDALE=y
# CONFIG_MFD_TMIO is not set
CONFIG_MFD_VX855=m
CONFIG_MFD_ARIZONA=y
CONFIG_MFD_ARIZONA_I2C=m
CONFIG_MFD_ARIZONA_SPI=m
# CONFIG_MFD_WM5102 is not set
# CONFIG_MFD_WM5110 is not set
CONFIG_MFD_WM8997=y
# CONFIG_MFD_WM8400 is not set
# CONFIG_MFD_WM831X_I2C is not set
# CONFIG_MFD_WM831X_SPI is not set
# CONFIG_MFD_WM8350_I2C is not set
# CONFIG_MFD_WM8994 is not set
CONFIG_REGULATOR=y
CONFIG_REGULATOR_DEBUG=y
CONFIG_REGULATOR_FIXED_VOLTAGE=m
CONFIG_REGULATOR_VIRTUAL_CONSUMER=m
CONFIG_REGULATOR_USERSPACE_CONSUMER=y
# CONFIG_REGULATOR_ACT8865 is not set
# CONFIG_REGULATOR_AD5398 is not set
CONFIG_REGULATOR_AAT2870=m
CONFIG_REGULATOR_AS3711=y
# CONFIG_REGULATOR_AXP20X is not set
# CONFIG_REGULATOR_BCM590XX is not set
# CONFIG_REGULATOR_DA903X is not set
CONFIG_REGULATOR_DA9052=m
# CONFIG_REGULATOR_DA9055 is not set
# CONFIG_REGULATOR_DA9210 is not set
CONFIG_REGULATOR_DA9211=y
CONFIG_REGULATOR_FAN53555=m
CONFIG_REGULATOR_GPIO=m
# CONFIG_REGULATOR_ISL9305 is not set
CONFIG_REGULATOR_ISL6271A=m
# CONFIG_REGULATOR_LP3971 is not set
CONFIG_REGULATOR_LP3972=y
CONFIG_REGULATOR_LP872X=m
# CONFIG_REGULATOR_LP8755 is not set
# CONFIG_REGULATOR_LP8788 is not set
CONFIG_REGULATOR_LTC3589=m
CONFIG_REGULATOR_MAX14577=m
CONFIG_REGULATOR_MAX1586=y
# CONFIG_REGULATOR_MAX8649 is not set
CONFIG_REGULATOR_MAX8660=y
CONFIG_REGULATOR_MAX8907=m
CONFIG_REGULATOR_MAX8952=m
CONFIG_REGULATOR_MAX8973=y
# CONFIG_REGULATOR_MAX8997 is not set
# CONFIG_REGULATOR_MAX8998 is not set
CONFIG_REGULATOR_MAX77693=y
CONFIG_REGULATOR_MAX77843=y
CONFIG_REGULATOR_MC13XXX_CORE=y
CONFIG_REGULATOR_MC13783=y
# CONFIG_REGULATOR_MC13892 is not set
CONFIG_REGULATOR_MT6397=y
# CONFIG_REGULATOR_PALMAS is not set
# CONFIG_REGULATOR_PCF50633 is not set
CONFIG_REGULATOR_PFUZE100=m
CONFIG_REGULATOR_PWM=m
CONFIG_REGULATOR_QCOM_SPMI=m
CONFIG_REGULATOR_RC5T583=m
# CONFIG_REGULATOR_RN5T618 is not set
CONFIG_REGULATOR_RT5033=m
# CONFIG_REGULATOR_S2MPA01 is not set
CONFIG_REGULATOR_S2MPS11=y
CONFIG_REGULATOR_S5M8767=m
CONFIG_REGULATOR_SKY81452=y
CONFIG_REGULATOR_TPS51632=y
CONFIG_REGULATOR_TPS6105X=m
CONFIG_REGULATOR_TPS62360=m
CONFIG_REGULATOR_TPS65023=m
# CONFIG_REGULATOR_TPS6507X is not set
CONFIG_REGULATOR_TPS65217=m
# CONFIG_REGULATOR_TPS6524X is not set
# CONFIG_REGULATOR_TPS6586X is not set
# CONFIG_REGULATOR_TPS65912 is not set
CONFIG_MEDIA_SUPPORT=y

#
# Multimedia core support
#
CONFIG_MEDIA_CAMERA_SUPPORT=y
CONFIG_MEDIA_ANALOG_TV_SUPPORT=y
# CONFIG_MEDIA_DIGITAL_TV_SUPPORT is not set
CONFIG_MEDIA_RADIO_SUPPORT=y
# CONFIG_MEDIA_SDR_SUPPORT is not set
CONFIG_MEDIA_RC_SUPPORT=y
# CONFIG_MEDIA_CONTROLLER is not set
CONFIG_VIDEO_DEV=y
CONFIG_VIDEO_V4L2=y
# CONFIG_VIDEO_ADV_DEBUG is not set
# CONFIG_VIDEO_FIXED_MINOR_RANGES is not set
CONFIG_VIDEO_PCI_SKELETON=y
CONFIG_VIDEO_TUNER=y
CONFIG_V4L2_MEM2MEM_DEV=y
CONFIG_VIDEOBUF_GEN=y
CONFIG_VIDEOBUF_DMA_SG=m
CONFIG_VIDEOBUF_DMA_CONTIG=y
CONFIG_VIDEOBUF2_CORE=y
CONFIG_VIDEOBUF2_MEMOPS=y
CONFIG_VIDEOBUF2_DMA_CONTIG=y
CONFIG_VIDEOBUF2_VMALLOC=y
CONFIG_VIDEOBUF2_DMA_SG=y
# CONFIG_TTPCI_EEPROM is not set

#
# Media drivers
#
CONFIG_RC_CORE=y
# CONFIG_RC_MAP is not set
# CONFIG_RC_DECODERS is not set
CONFIG_RC_DEVICES=y
CONFIG_RC_ATI_REMOTE=m
CONFIG_IR_HIX5HD2=m
CONFIG_IR_IMON=m
CONFIG_IR_MCEUSB=m
CONFIG_IR_MESON=y
CONFIG_IR_REDRAT3=m
CONFIG_IR_STREAMZAP=m
CONFIG_IR_IGORPLUGUSB=y
CONFIG_IR_IGUANA=y
CONFIG_IR_TTUSBIR=m
# CONFIG_IR_IMG is not set
# CONFIG_RC_LOOPBACK is not set
# CONFIG_IR_GPIO_CIR is not set
CONFIG_RC_ST=m
CONFIG_IR_SUNXI=m
CONFIG_MEDIA_USB_SUPPORT=y

#
# Webcam devices
#
# CONFIG_USB_VIDEO_CLASS is not set
CONFIG_USB_GSPCA=m
CONFIG_USB_M5602=m
CONFIG_USB_STV06XX=m
# CONFIG_USB_GL860 is not set
CONFIG_USB_GSPCA_BENQ=m
CONFIG_USB_GSPCA_CONEX=m
CONFIG_USB_GSPCA_CPIA1=m
CONFIG_USB_GSPCA_DTCS033=m
CONFIG_USB_GSPCA_ETOMS=m
CONFIG_USB_GSPCA_FINEPIX=m
CONFIG_USB_GSPCA_JEILINJ=m
CONFIG_USB_GSPCA_JL2005BCD=m
CONFIG_USB_GSPCA_KINECT=m
# CONFIG_USB_GSPCA_KONICA is not set
# CONFIG_USB_GSPCA_MARS is not set
CONFIG_USB_GSPCA_MR97310A=m
CONFIG_USB_GSPCA_NW80X=m
CONFIG_USB_GSPCA_OV519=m
# CONFIG_USB_GSPCA_OV534 is not set
CONFIG_USB_GSPCA_OV534_9=m
# CONFIG_USB_GSPCA_PAC207 is not set
CONFIG_USB_GSPCA_PAC7302=m
CONFIG_USB_GSPCA_PAC7311=m
# CONFIG_USB_GSPCA_SE401 is not set
CONFIG_USB_GSPCA_SN9C2028=m
CONFIG_USB_GSPCA_SN9C20X=m
# CONFIG_USB_GSPCA_SONIXB is not set
CONFIG_USB_GSPCA_SONIXJ=m
# CONFIG_USB_GSPCA_SPCA500 is not set
CONFIG_USB_GSPCA_SPCA501=m
CONFIG_USB_GSPCA_SPCA505=m
CONFIG_USB_GSPCA_SPCA506=m
CONFIG_USB_GSPCA_SPCA508=m
CONFIG_USB_GSPCA_SPCA561=m
CONFIG_USB_GSPCA_SPCA1528=m
# CONFIG_USB_GSPCA_SQ905 is not set
CONFIG_USB_GSPCA_SQ905C=m
CONFIG_USB_GSPCA_SQ930X=m
CONFIG_USB_GSPCA_STK014=m
CONFIG_USB_GSPCA_STK1135=m
CONFIG_USB_GSPCA_STV0680=m
CONFIG_USB_GSPCA_SUNPLUS=m
# CONFIG_USB_GSPCA_T613 is not set
CONFIG_USB_GSPCA_TOPRO=m
# CONFIG_USB_GSPCA_TOUPTEK is not set
# CONFIG_USB_GSPCA_TV8532 is not set
# CONFIG_USB_GSPCA_VC032X is not set
CONFIG_USB_GSPCA_VICAM=m
CONFIG_USB_GSPCA_XIRLINK_CIT=m
CONFIG_USB_GSPCA_ZC3XX=m
CONFIG_USB_PWC=m
CONFIG_USB_PWC_DEBUG=y
# CONFIG_USB_PWC_INPUT_EVDEV is not set
CONFIG_VIDEO_CPIA2=y
# CONFIG_USB_ZR364XX is not set
# CONFIG_USB_STKWEBCAM is not set
# CONFIG_USB_S2255 is not set

#
# Analog TV USB devices
#
# CONFIG_VIDEO_PVRUSB2 is not set
CONFIG_VIDEO_HDPVR=y
CONFIG_VIDEO_USBVISION=y
CONFIG_VIDEO_STK1160_COMMON=y
CONFIG_VIDEO_STK1160=y

#
# Analog/digital TV USB devices
#
# CONFIG_VIDEO_CX231XX is not set
# CONFIG_VIDEO_TM6000 is not set

#
# Webcam, TV (analog/digital) USB devices
#
CONFIG_VIDEO_EM28XX=m
CONFIG_VIDEO_EM28XX_V4L2=m
CONFIG_VIDEO_EM28XX_RC=m
CONFIG_MEDIA_PCI_SUPPORT=y

#
# Media capture support
#

#
# Media capture/analog TV support
#
# CONFIG_VIDEO_IVTV is not set
CONFIG_VIDEO_ZORAN=m
CONFIG_VIDEO_ZORAN_DC30=m
CONFIG_VIDEO_ZORAN_ZR36060=m
CONFIG_VIDEO_ZORAN_BUZ=m
CONFIG_VIDEO_ZORAN_DC10=m
CONFIG_VIDEO_ZORAN_LML33=m
CONFIG_VIDEO_ZORAN_LML33R10=m
CONFIG_VIDEO_ZORAN_AVS6EYES=m
CONFIG_VIDEO_HEXIUM_GEMINI=m
CONFIG_VIDEO_HEXIUM_ORION=m
# CONFIG_VIDEO_MXB is not set
CONFIG_VIDEO_TW68=m
# CONFIG_VIDEO_DT3155 is not set

#
# Media capture/analog/hybrid TV support
#
CONFIG_VIDEO_CX25821=m
# CONFIG_VIDEO_CX88 is not set
CONFIG_VIDEO_BT848=m
CONFIG_VIDEO_SAA7134=y
CONFIG_VIDEO_SAA7134_RC=y
CONFIG_V4L_PLATFORM_DRIVERS=y
CONFIG_VIDEO_CAFE_CCIC=m
CONFIG_VIDEO_VIA_CAMERA=m
# CONFIG_VIDEO_DAVINCI_VPIF_DISPLAY is not set
# CONFIG_VIDEO_DAVINCI_VPIF_CAPTURE is not set
CONFIG_VIDEO_DM6446_CCDC=y
# CONFIG_VIDEO_DM355_CCDC is not set
CONFIG_VIDEO_SH_VOU=m
CONFIG_VIDEO_M32R_AR=m
CONFIG_SOC_CAMERA=m
CONFIG_SOC_CAMERA_PLATFORM=m
# CONFIG_VIDEO_RCAR_VIN is not set
CONFIG_VIDEO_MX2=m
# CONFIG_VIDEO_ATMEL_ISI is not set
CONFIG_VIDEO_SAMSUNG_S5P_TV=y
# CONFIG_VIDEO_SAMSUNG_S5P_HDMI is not set
# CONFIG_VIDEO_SAMSUNG_S5P_HDMIPHY is not set
CONFIG_VIDEO_SAMSUNG_S5P_SII9234=y
CONFIG_VIDEO_SAMSUNG_S5P_SDO=y
# CONFIG_VIDEO_SAMSUNG_S5P_MIXER is not set
CONFIG_V4L_MEM2MEM_DRIVERS=y
CONFIG_VIDEO_MEM2MEM_DEINTERLACE=m
CONFIG_VIDEO_SAMSUNG_S5P_G2D=m
# CONFIG_VIDEO_SAMSUNG_S5P_JPEG is not set
CONFIG_VIDEO_SAMSUNG_S5P_MFC=m
# CONFIG_VIDEO_MX2_EMMAPRP is not set
CONFIG_VIDEO_SAMSUNG_EXYNOS_GSC=y
CONFIG_VIDEO_STI_BDISP=y
CONFIG_VIDEO_SH_VEU=y
# CONFIG_VIDEO_TI_VPE is not set
# CONFIG_V4L_TEST_DRIVERS is not set

#
# Supported MMC/SDIO adapters
#
CONFIG_RADIO_ADAPTERS=y
CONFIG_RADIO_TEA575X=y
CONFIG_RADIO_SI470X=y
# CONFIG_USB_SI470X is not set
# CONFIG_I2C_SI470X is not set
# CONFIG_RADIO_SI4713 is not set
CONFIG_USB_MR800=m
CONFIG_USB_DSBR=y
CONFIG_RADIO_MAXIRADIO=y
CONFIG_RADIO_SHARK=y
CONFIG_RADIO_SHARK2=m
# CONFIG_USB_KEENE is not set
CONFIG_USB_RAREMONO=m
# CONFIG_USB_MA901 is not set
CONFIG_RADIO_TEA5764=m
CONFIG_RADIO_SAA7706H=m
# CONFIG_RADIO_TEF6862 is not set
# CONFIG_RADIO_TIMBERDALE is not set
CONFIG_RADIO_WL1273=y

#
# Texas Instruments WL128x FM driver (ST based)
#
CONFIG_VIDEO_TVEEPROM=y
CONFIG_CYPRESS_FIRMWARE=m
CONFIG_VIDEO_SAA7146=m
CONFIG_VIDEO_SAA7146_VV=m

#
# Media ancillary drivers (tuners, sensors, i2c, frontends)
#
# CONFIG_MEDIA_SUBDRV_AUTOSELECT is not set
CONFIG_MEDIA_ATTACH=y
CONFIG_VIDEO_IR_I2C=y

#
# Encoders, decoders, sensors and other helper chips
#

#
# Audio decoders, processors and mixers
#
CONFIG_VIDEO_TVAUDIO=m
CONFIG_VIDEO_TDA7432=y
# CONFIG_VIDEO_TDA9840 is not set
CONFIG_VIDEO_TEA6415C=m
CONFIG_VIDEO_TEA6420=m
# CONFIG_VIDEO_MSP3400 is not set
CONFIG_VIDEO_CS5345=y
CONFIG_VIDEO_CS53L32A=y
CONFIG_VIDEO_TLV320AIC23B=m
# CONFIG_VIDEO_UDA1342 is not set
# CONFIG_VIDEO_WM8775 is not set
CONFIG_VIDEO_WM8739=m
# CONFIG_VIDEO_VP27SMPX is not set
# CONFIG_VIDEO_SONY_BTF_MPX is not set

#
# RDS decoders
#
CONFIG_VIDEO_SAA6588=m

#
# Video decoders
#
# CONFIG_VIDEO_ADV7183 is not set
# CONFIG_VIDEO_BT819 is not set
CONFIG_VIDEO_BT856=y
CONFIG_VIDEO_BT866=y
# CONFIG_VIDEO_KS0127 is not set
CONFIG_VIDEO_ML86V7667=y
CONFIG_VIDEO_SAA7110=m
CONFIG_VIDEO_SAA711X=y
CONFIG_VIDEO_TVP514X=y
# CONFIG_VIDEO_TVP5150 is not set
CONFIG_VIDEO_TVP7002=y
# CONFIG_VIDEO_TW2804 is not set
# CONFIG_VIDEO_TW9903 is not set
CONFIG_VIDEO_TW9906=y
CONFIG_VIDEO_VPX3220=m

#
# Video and audio decoders
#
# CONFIG_VIDEO_SAA717X is not set
CONFIG_VIDEO_CX25840=y

#
# Video encoders
#
CONFIG_VIDEO_SAA7127=m
# CONFIG_VIDEO_SAA7185 is not set
CONFIG_VIDEO_ADV7170=m
CONFIG_VIDEO_ADV7175=y
CONFIG_VIDEO_ADV7343=y
# CONFIG_VIDEO_ADV7393 is not set
# CONFIG_VIDEO_AK881X is not set
CONFIG_VIDEO_THS8200=m

#
# Camera sensor devices
#
CONFIG_VIDEO_OV2659=m
# CONFIG_VIDEO_OV7640 is not set
CONFIG_VIDEO_OV7670=m
# CONFIG_VIDEO_VS6624 is not set
CONFIG_VIDEO_MT9V011=y
# CONFIG_VIDEO_SR030PC30 is not set

#
# Flash devices
#

#
# Video improvement chips
#
# CONFIG_VIDEO_UPD64031A is not set
# CONFIG_VIDEO_UPD64083 is not set

#
# Audio/Video compression chips
#
CONFIG_VIDEO_SAA6752HS=y

#
# Miscellaneous helper chips
#
CONFIG_VIDEO_THS7303=y
# CONFIG_VIDEO_M52790 is not set

#
# Sensors used on soc_camera driver
#

#
# soc_camera sensor drivers
#
# CONFIG_SOC_CAMERA_IMX074 is not set
# CONFIG_SOC_CAMERA_MT9M001 is not set
CONFIG_SOC_CAMERA_MT9M111=m
# CONFIG_SOC_CAMERA_MT9T031 is not set
# CONFIG_SOC_CAMERA_MT9T112 is not set
# CONFIG_SOC_CAMERA_MT9V022 is not set
CONFIG_SOC_CAMERA_OV2640=m
# CONFIG_SOC_CAMERA_OV5642 is not set
CONFIG_SOC_CAMERA_OV6650=m
# CONFIG_SOC_CAMERA_OV772X is not set
CONFIG_SOC_CAMERA_OV9640=m
# CONFIG_SOC_CAMERA_OV9740 is not set
# CONFIG_SOC_CAMERA_RJ54N1 is not set
# CONFIG_SOC_CAMERA_TW9910 is not set
CONFIG_MEDIA_TUNER=y

#
# Customize TV tuners
#
# CONFIG_MEDIA_TUNER_SIMPLE is not set
# CONFIG_MEDIA_TUNER_TDA8290 is not set
CONFIG_MEDIA_TUNER_TDA827X=m
# CONFIG_MEDIA_TUNER_TDA18271 is not set
CONFIG_MEDIA_TUNER_TDA9887=y
CONFIG_MEDIA_TUNER_TEA5761=y
CONFIG_MEDIA_TUNER_TEA5767=y
# CONFIG_MEDIA_TUNER_MSI001 is not set
# CONFIG_MEDIA_TUNER_MT20XX is not set
CONFIG_MEDIA_TUNER_MT2060=m
# CONFIG_MEDIA_TUNER_MT2063 is not set
CONFIG_MEDIA_TUNER_MT2266=y
CONFIG_MEDIA_TUNER_MT2131=m
CONFIG_MEDIA_TUNER_QT1010=m
CONFIG_MEDIA_TUNER_XC2028=y
CONFIG_MEDIA_TUNER_XC5000=y
CONFIG_MEDIA_TUNER_XC4000=y
CONFIG_MEDIA_TUNER_MXL5005S=m
CONFIG_MEDIA_TUNER_MXL5007T=m
CONFIG_MEDIA_TUNER_MC44S803=y
CONFIG_MEDIA_TUNER_MAX2165=m
CONFIG_MEDIA_TUNER_TDA18218=m
CONFIG_MEDIA_TUNER_FC0011=m
CONFIG_MEDIA_TUNER_FC0012=m
# CONFIG_MEDIA_TUNER_FC0013 is not set
# CONFIG_MEDIA_TUNER_TDA18212 is not set
# CONFIG_MEDIA_TUNER_E4000 is not set
CONFIG_MEDIA_TUNER_FC2580=y
CONFIG_MEDIA_TUNER_M88RS6000T=m
CONFIG_MEDIA_TUNER_TUA9001=y
# CONFIG_MEDIA_TUNER_SI2157 is not set
CONFIG_MEDIA_TUNER_IT913X=y
CONFIG_MEDIA_TUNER_R820T=m
# CONFIG_MEDIA_TUNER_MXL301RF is not set
# CONFIG_MEDIA_TUNER_QM1D1C0042 is not set

#
# Customise DVB Frontends
#
# CONFIG_DVB_AU8522_V4L is not set
CONFIG_DVB_TUNER_DIB0070=y
CONFIG_DVB_TUNER_DIB0090=y

#
# Tools to develop new frontends
#
CONFIG_DVB_DUMMY_FE=y

#
# Graphics support
#
CONFIG_AGP=y
CONFIG_AGP_AMD64=y
# CONFIG_AGP_INTEL is not set
CONFIG_AGP_SIS=y
CONFIG_AGP_VIA=m
CONFIG_VGA_ARB=y
CONFIG_VGA_ARB_MAX_GPUS=16

#
# Direct Rendering Manager
#
# CONFIG_DRM is not set

#
# Frame buffer Devices
#
CONFIG_FB=m
CONFIG_FIRMWARE_EDID=y
CONFIG_FB_CMDLINE=y
CONFIG_FB_DDC=m
# CONFIG_FB_BOOT_VESA_SUPPORT is not set
CONFIG_FB_CFB_FILLRECT=m
CONFIG_FB_CFB_COPYAREA=m
CONFIG_FB_CFB_IMAGEBLIT=m
# CONFIG_FB_CFB_REV_PIXELS_IN_BYTE is not set
CONFIG_FB_SYS_FILLRECT=m
CONFIG_FB_SYS_COPYAREA=m
CONFIG_FB_SYS_IMAGEBLIT=m
# CONFIG_FB_FOREIGN_ENDIAN is not set
CONFIG_FB_SYS_FOPS=m
CONFIG_FB_DEFERRED_IO=y
# CONFIG_FB_SVGALIB is not set
# CONFIG_FB_MACMODES is not set
# CONFIG_FB_BACKLIGHT is not set
CONFIG_FB_MODE_HELPERS=y
CONFIG_FB_TILEBLITTING=y

#
# Frame buffer hardware drivers
#
CONFIG_FB_PM2=m
CONFIG_FB_PM2_FIFO_DISCONNECT=y
CONFIG_FB_CLPS711X=m
CONFIG_FB_CYBER2000=m
# CONFIG_FB_CYBER2000_DDC is not set
CONFIG_FB_ARC=m
# CONFIG_FB_UVESA is not set
# CONFIG_FB_N411 is not set
# CONFIG_FB_HGA is not set
# CONFIG_FB_OPENCORES is not set
# CONFIG_FB_S1D13XXX is not set
CONFIG_FB_NVIDIA=m
# CONFIG_FB_NVIDIA_I2C is not set
CONFIG_FB_NVIDIA_DEBUG=y
# CONFIG_FB_NVIDIA_BACKLIGHT is not set
CONFIG_FB_RIVA=m
# CONFIG_FB_RIVA_I2C is not set
CONFIG_FB_RIVA_DEBUG=y
# CONFIG_FB_RIVA_BACKLIGHT is not set
CONFIG_FB_I740=m
CONFIG_FB_LE80578=m
# CONFIG_FB_CARILLO_RANCH is not set
CONFIG_FB_MATROX=m
CONFIG_FB_MATROX_MILLENIUM=y
CONFIG_FB_MATROX_MYSTIQUE=y
CONFIG_FB_MATROX_G=y
CONFIG_FB_MATROX_I2C=m
# CONFIG_FB_MATROX_MAVEN is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_ATY is not set
# CONFIG_FB_S3 is not set
CONFIG_FB_SAVAGE=m
# CONFIG_FB_SAVAGE_I2C is not set
CONFIG_FB_SAVAGE_ACCEL=y
# CONFIG_FB_SIS is not set
CONFIG_FB_VIA=m
# CONFIG_FB_VIA_DIRECT_PROCFS is not set
CONFIG_FB_VIA_X_COMPATIBILITY=y
# CONFIG_FB_NEOMAGIC is not set
CONFIG_FB_KYRO=m
CONFIG_FB_3DFX=m
CONFIG_FB_3DFX_ACCEL=y
CONFIG_FB_3DFX_I2C=y
CONFIG_FB_VOODOO1=m
# CONFIG_FB_VT8623 is not set
CONFIG_FB_TRIDENT=m
# CONFIG_FB_ARK is not set
# CONFIG_FB_PM3 is not set
CONFIG_FB_CARMINE=m
# CONFIG_FB_CARMINE_DRAM_EVAL is not set
CONFIG_CARMINE_DRAM_CUSTOM=y
# CONFIG_FB_GEODE is not set
# CONFIG_FB_TMIO is not set
CONFIG_FB_SM501=m
CONFIG_FB_SMSCUFX=m
# CONFIG_FB_UDL is not set
CONFIG_FB_GOLDFISH=m
CONFIG_FB_METRONOME=m
CONFIG_FB_MB862XX=m
CONFIG_FB_MB862XX_PCI_GDC=y
CONFIG_FB_MB862XX_I2C=y
# CONFIG_FB_BROADSHEET is not set
CONFIG_FB_AUO_K190X=m
CONFIG_FB_AUO_K1900=m
# CONFIG_FB_AUO_K1901 is not set
CONFIG_BACKLIGHT_LCD_SUPPORT=y
CONFIG_LCD_CLASS_DEVICE=m
CONFIG_LCD_L4F00242T03=m
CONFIG_LCD_LMS283GF05=m
CONFIG_LCD_LTV350QV=m
CONFIG_LCD_ILI922X=m
CONFIG_LCD_ILI9320=m
CONFIG_LCD_TDO24M=m
CONFIG_LCD_VGG2432A4=m
CONFIG_LCD_PLATFORM=m
CONFIG_LCD_S6E63M0=m
CONFIG_LCD_LD9040=m
CONFIG_LCD_AMS369FG06=m
CONFIG_LCD_LMS501KF03=m
CONFIG_LCD_HX8357=m
CONFIG_BACKLIGHT_CLASS_DEVICE=y
CONFIG_BACKLIGHT_GENERIC=m
CONFIG_BACKLIGHT_LM3533=m
CONFIG_BACKLIGHT_CARILLO_RANCH=m
CONFIG_BACKLIGHT_PWM=y
# CONFIG_BACKLIGHT_DA903X is not set
CONFIG_BACKLIGHT_DA9052=y
CONFIG_BACKLIGHT_SAHARA=m
CONFIG_BACKLIGHT_ADP8860=y
CONFIG_BACKLIGHT_ADP8870=m
# CONFIG_BACKLIGHT_PCF50633 is not set
CONFIG_BACKLIGHT_AAT2870=y
CONFIG_BACKLIGHT_LM3630A=y
# CONFIG_BACKLIGHT_LM3639 is not set
CONFIG_BACKLIGHT_LP855X=m
# CONFIG_BACKLIGHT_LP8788 is not set
CONFIG_BACKLIGHT_SKY81452=y
# CONFIG_BACKLIGHT_TPS65217 is not set
CONFIG_BACKLIGHT_AS3711=m
CONFIG_BACKLIGHT_GPIO=m
CONFIG_BACKLIGHT_LV5207LP=m
CONFIG_BACKLIGHT_BD6107=y
CONFIG_VGASTATE=m
CONFIG_VIDEOMODE_HELPERS=y

#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_VGACON_SOFT_SCROLLBACK=y
CONFIG_VGACON_SOFT_SCROLLBACK_SIZE=64
CONFIG_DUMMY_CONSOLE=y
CONFIG_DUMMY_CONSOLE_COLUMNS=80
CONFIG_DUMMY_CONSOLE_ROWS=25
# CONFIG_FRAMEBUFFER_CONSOLE is not set
CONFIG_LOGO=y
# CONFIG_LOGO_LINUX_MONO is not set
CONFIG_LOGO_LINUX_VGA16=y
CONFIG_LOGO_LINUX_CLUT224=y
# CONFIG_SOUND is not set

#
# HID support
#
CONFIG_HID=y
# CONFIG_HID_BATTERY_STRENGTH is not set
# CONFIG_HIDRAW is not set
CONFIG_UHID=m
CONFIG_HID_GENERIC=y

#
# Special HID drivers
#
# CONFIG_HID_A4TECH is not set
CONFIG_HID_ACRUX=m
CONFIG_HID_ACRUX_FF=y
# CONFIG_HID_APPLE is not set
CONFIG_HID_APPLEIR=m
CONFIG_HID_AUREAL=m
CONFIG_HID_BELKIN=m
CONFIG_HID_BETOP_FF=m
CONFIG_HID_CHERRY=m
CONFIG_HID_CHICONY=m
CONFIG_HID_CP2112=m
CONFIG_HID_CYPRESS=y
# CONFIG_HID_DRAGONRISE is not set
CONFIG_HID_EMS_FF=m
# CONFIG_HID_ELECOM is not set
CONFIG_HID_ELO=m
CONFIG_HID_EZKEY=m
CONFIG_HID_HOLTEK=m
CONFIG_HOLTEK_FF=y
CONFIG_HID_GT683R=m
CONFIG_HID_KEYTOUCH=y
CONFIG_HID_KYE=m
# CONFIG_HID_UCLOGIC is not set
CONFIG_HID_WALTOP=y
CONFIG_HID_GYRATION=y
CONFIG_HID_ICADE=m
CONFIG_HID_TWINHAN=m
CONFIG_HID_KENSINGTON=y
# CONFIG_HID_LCPOWER is not set
# CONFIG_HID_LENOVO is not set
CONFIG_HID_LOGITECH=y
CONFIG_HID_LOGITECH_HIDPP=m
CONFIG_LOGITECH_FF=y
# CONFIG_LOGIRUMBLEPAD2_FF is not set
# CONFIG_LOGIG940_FF is not set
CONFIG_LOGIWHEELS_FF=y
# CONFIG_HID_MAGICMOUSE is not set
# CONFIG_HID_MICROSOFT is not set
CONFIG_HID_MONTEREY=y
CONFIG_HID_MULTITOUCH=y
CONFIG_HID_NTRIG=m
# CONFIG_HID_ORTEK is not set
CONFIG_HID_PANTHERLORD=y
# CONFIG_PANTHERLORD_FF is not set
# CONFIG_HID_PENMOUNT is not set
# CONFIG_HID_PETALYNX is not set
# CONFIG_HID_PICOLCD is not set
CONFIG_HID_PLANTRONICS=m
CONFIG_HID_PRIMAX=y
# CONFIG_HID_ROCCAT is not set
CONFIG_HID_SAITEK=y
# CONFIG_HID_SAMSUNG is not set
CONFIG_HID_SONY=m
# CONFIG_SONY_FF is not set
CONFIG_HID_SPEEDLINK=y
CONFIG_HID_STEELSERIES=y
# CONFIG_HID_SUNPLUS is not set
CONFIG_HID_RMI=y
# CONFIG_HID_GREENASIA is not set
CONFIG_HID_SMARTJOYPLUS=m
CONFIG_SMARTJOYPLUS_FF=y
CONFIG_HID_TIVO=y
CONFIG_HID_TOPSEED=y
CONFIG_HID_THINGM=y
CONFIG_HID_THRUSTMASTER=y
CONFIG_THRUSTMASTER_FF=y
CONFIG_HID_WACOM=y
CONFIG_HID_WIIMOTE=y
# CONFIG_HID_XINMO is not set
CONFIG_HID_ZEROPLUS=m
# CONFIG_ZEROPLUS_FF is not set
CONFIG_HID_ZYDACRON=y
CONFIG_HID_SENSOR_HUB=m
# CONFIG_HID_SENSOR_CUSTOM_SENSOR is not set

#
# USB HID support
#
CONFIG_USB_HID=m
CONFIG_HID_PID=y
# CONFIG_USB_HIDDEV is not set

#
# USB HID Boot Protocol drivers
#
CONFIG_USB_KBD=m
CONFIG_USB_MOUSE=y

#
# I2C HID support
#
CONFIG_I2C_HID=m
CONFIG_USB_OHCI_LITTLE_ENDIAN=y
CONFIG_USB_SUPPORT=y
CONFIG_USB_COMMON=y
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB=y
# CONFIG_USB_ANNOUNCE_NEW_DEVICES is not set

#
# Miscellaneous USB options
#
# CONFIG_USB_DEFAULT_PERSIST is not set
CONFIG_USB_DYNAMIC_MINORS=y
CONFIG_USB_OTG=y
# CONFIG_USB_OTG_WHITELIST is not set
# CONFIG_USB_OTG_BLACKLIST_HUB is not set
CONFIG_USB_OTG_FSM=y
CONFIG_USB_ULPI_BUS=m
# CONFIG_USB_MON is not set
CONFIG_USB_WUSB_CBAF=m
CONFIG_USB_WUSB_CBAF_DEBUG=y

#
# USB Host Controller Drivers
#
CONFIG_USB_C67X00_HCD=m
# CONFIG_USB_XHCI_HCD is not set
CONFIG_USB_EHCI_HCD=y
CONFIG_USB_EHCI_ROOT_HUB_TT=y
# CONFIG_USB_EHCI_TT_NEWSCHED is not set
CONFIG_USB_EHCI_PCI=y
CONFIG_USB_EHCI_HCD_PLATFORM=y
# CONFIG_USB_OXU210HP_HCD is not set
# CONFIG_USB_ISP116X_HCD is not set
CONFIG_USB_ISP1362_HCD=m
CONFIG_USB_FUSBH200_HCD=y
CONFIG_USB_FOTG210_HCD=m
# CONFIG_USB_MAX3421_HCD is not set
CONFIG_USB_OHCI_HCD=y
CONFIG_USB_OHCI_HCD_PCI=y
CONFIG_USB_OHCI_HCD_SSB=y
CONFIG_USB_OHCI_HCD_PLATFORM=y
CONFIG_USB_UHCI_HCD=y
# CONFIG_USB_U132_HCD is not set
# CONFIG_USB_SL811_HCD is not set
# CONFIG_USB_R8A66597_HCD is not set
# CONFIG_USB_RENESAS_USBHS_HCD is not set
# CONFIG_USB_HCD_BCMA is not set
CONFIG_USB_HCD_SSB=y
CONFIG_USB_HCD_TEST_MODE=y
CONFIG_USB_RENESAS_USBHS=m

#
# USB Device Class drivers
#
CONFIG_USB_ACM=m
CONFIG_USB_PRINTER=y
CONFIG_USB_WDM=m
CONFIG_USB_TMC=m

#
# NOTE: USB_STORAGE depends on SCSI but BLK_DEV_SD may
#

#
# also be needed; see USB_STORAGE Help for more info
#
CONFIG_USB_STORAGE=y
CONFIG_USB_STORAGE_DEBUG=y
# CONFIG_USB_STORAGE_REALTEK is not set
# CONFIG_USB_STORAGE_DATAFAB is not set
CONFIG_USB_STORAGE_FREECOM=m
CONFIG_USB_STORAGE_ISD200=m
# CONFIG_USB_STORAGE_USBAT is not set
CONFIG_USB_STORAGE_SDDR09=m
CONFIG_USB_STORAGE_SDDR55=m
CONFIG_USB_STORAGE_JUMPSHOT=m
CONFIG_USB_STORAGE_ALAUDA=m
CONFIG_USB_STORAGE_ONETOUCH=m
CONFIG_USB_STORAGE_KARMA=y
CONFIG_USB_STORAGE_CYPRESS_ATACB=m
CONFIG_USB_STORAGE_ENE_UB6250=m
# CONFIG_USB_UAS is not set

#
# USB Imaging devices
#
# CONFIG_USB_MDC800 is not set
# CONFIG_USB_MICROTEK is not set
# CONFIG_USBIP_CORE is not set
# CONFIG_USB_MUSB_HDRC is not set
CONFIG_USB_DWC3=y
# CONFIG_USB_DWC3_HOST is not set
# CONFIG_USB_DWC3_GADGET is not set
CONFIG_USB_DWC3_DUAL_ROLE=y

#
# Platform Glue Driver Support
#
CONFIG_USB_DWC3_EXYNOS=y
CONFIG_USB_DWC3_PCI=m
CONFIG_USB_DWC3_KEYSTONE=m
CONFIG_USB_DWC3_QCOM=m

#
# Debugging features
#
# CONFIG_USB_DWC3_DEBUG is not set
# CONFIG_USB_DWC2 is not set
CONFIG_USB_CHIPIDEA=y
CONFIG_USB_CHIPIDEA_PCI=m
CONFIG_USB_CHIPIDEA_UDC=y
CONFIG_USB_CHIPIDEA_HOST=y
CONFIG_USB_CHIPIDEA_DEBUG=y
# CONFIG_USB_ISP1760 is not set

#
# USB port drivers
#
CONFIG_USB_SERIAL=y
CONFIG_USB_SERIAL_CONSOLE=y
CONFIG_USB_SERIAL_GENERIC=y
# CONFIG_USB_SERIAL_SIMPLE is not set
CONFIG_USB_SERIAL_AIRCABLE=m
# CONFIG_USB_SERIAL_ARK3116 is not set
# CONFIG_USB_SERIAL_BELKIN is not set
# CONFIG_USB_SERIAL_CH341 is not set
# CONFIG_USB_SERIAL_WHITEHEAT is not set
# CONFIG_USB_SERIAL_DIGI_ACCELEPORT is not set
CONFIG_USB_SERIAL_CP210X=m
# CONFIG_USB_SERIAL_CYPRESS_M8 is not set
# CONFIG_USB_SERIAL_EMPEG is not set
CONFIG_USB_SERIAL_FTDI_SIO=m
# CONFIG_USB_SERIAL_VISOR is not set
# CONFIG_USB_SERIAL_IPAQ is not set
# CONFIG_USB_SERIAL_IR is not set
CONFIG_USB_SERIAL_EDGEPORT=m
CONFIG_USB_SERIAL_EDGEPORT_TI=m
# CONFIG_USB_SERIAL_F81232 is not set
CONFIG_USB_SERIAL_GARMIN=y
CONFIG_USB_SERIAL_IPW=y
CONFIG_USB_SERIAL_IUU=y
CONFIG_USB_SERIAL_KEYSPAN_PDA=m
CONFIG_USB_SERIAL_KEYSPAN=y
# CONFIG_USB_SERIAL_KLSI is not set
# CONFIG_USB_SERIAL_KOBIL_SCT is not set
# CONFIG_USB_SERIAL_MCT_U232 is not set
# CONFIG_USB_SERIAL_METRO is not set
CONFIG_USB_SERIAL_MOS7720=m
CONFIG_USB_SERIAL_MOS7840=y
CONFIG_USB_SERIAL_MXUPORT=m
# CONFIG_USB_SERIAL_NAVMAN is not set
CONFIG_USB_SERIAL_PL2303=m
# CONFIG_USB_SERIAL_OTI6858 is not set
CONFIG_USB_SERIAL_QCAUX=m
CONFIG_USB_SERIAL_QUALCOMM=y
# CONFIG_USB_SERIAL_SPCP8X5 is not set
CONFIG_USB_SERIAL_SAFE=m
CONFIG_USB_SERIAL_SAFE_PADDED=y
CONFIG_USB_SERIAL_SIERRAWIRELESS=y
CONFIG_USB_SERIAL_SYMBOL=y
CONFIG_USB_SERIAL_TI=m
# CONFIG_USB_SERIAL_CYBERJACK is not set
# CONFIG_USB_SERIAL_XIRCOM is not set
CONFIG_USB_SERIAL_WWAN=y
CONFIG_USB_SERIAL_OPTION=y
CONFIG_USB_SERIAL_OMNINET=m
# CONFIG_USB_SERIAL_OPTICON is not set
CONFIG_USB_SERIAL_XSENS_MT=y
# CONFIG_USB_SERIAL_WISHBONE is not set
CONFIG_USB_SERIAL_SSU100=m
CONFIG_USB_SERIAL_QT2=m
# CONFIG_USB_SERIAL_DEBUG is not set

#
# USB Miscellaneous drivers
#
# CONFIG_USB_EMI62 is not set
CONFIG_USB_EMI26=m
CONFIG_USB_ADUTUX=y
# CONFIG_USB_SEVSEG is not set
CONFIG_USB_RIO500=y
CONFIG_USB_LEGOTOWER=y
# CONFIG_USB_LCD is not set
CONFIG_USB_LED=m
CONFIG_USB_CYPRESS_CY7C63=m
CONFIG_USB_CYTHERM=y
# CONFIG_USB_IDMOUSE is not set
CONFIG_USB_FTDI_ELAN=m
CONFIG_USB_APPLEDISPLAY=y
CONFIG_USB_SISUSBVGA=y
# CONFIG_USB_SISUSBVGA_CON is not set
CONFIG_USB_LD=m
CONFIG_USB_TRANCEVIBRATOR=m
# CONFIG_USB_IOWARRIOR is not set
CONFIG_USB_TEST=y
# CONFIG_USB_EHSET_TEST_FIXTURE is not set
CONFIG_USB_ISIGHTFW=m
CONFIG_USB_YUREX=y
CONFIG_USB_EZUSB_FX2=y
# CONFIG_USB_HSIC_USB3503 is not set
# CONFIG_USB_LINK_LAYER_TEST is not set
# CONFIG_USB_ATM is not set

#
# USB Physical Layer drivers
#
CONFIG_USB_PHY=y
CONFIG_KEYSTONE_USB_PHY=m
CONFIG_NOP_USB_XCEIV=m
CONFIG_AM335X_CONTROL_USB=m
CONFIG_AM335X_PHY_USB=m
CONFIG_USB_GPIO_VBUS=y
CONFIG_TAHVO_USB=m
CONFIG_TAHVO_USB_HOST_BY_DEFAULT=y
CONFIG_USB_ISP1301=m
# CONFIG_USB_RCAR_PHY is not set
CONFIG_USB_GADGET=y
# CONFIG_USB_GADGET_DEBUG is not set
# CONFIG_USB_GADGET_DEBUG_FILES is not set
CONFIG_USB_GADGET_DEBUG_FS=y
CONFIG_USB_GADGET_VBUS_DRAW=2
CONFIG_USB_GADGET_STORAGE_NUM_BUFFERS=2

#
# USB Peripheral Controller
#
# CONFIG_USB_FOTG210_UDC is not set
CONFIG_USB_GR_UDC=y
CONFIG_USB_R8A66597=y
CONFIG_USB_RENESAS_USBHS_UDC=m
CONFIG_USB_PXA27X=m
CONFIG_USB_MV_UDC=m
CONFIG_USB_MV_U3D=m
CONFIG_USB_M66592=m
CONFIG_USB_BDC_UDC=y

#
# Platform Support
#
# CONFIG_USB_BDC_PCI is not set
# CONFIG_USB_AMD5536UDC is not set
# CONFIG_USB_NET2272 is not set
CONFIG_USB_NET2280=m
CONFIG_USB_GOKU=y
CONFIG_USB_EG20T=y
CONFIG_USB_GADGET_XILINX=y
CONFIG_USB_DUMMY_HCD=y
CONFIG_USB_LIBCOMPOSITE=m
CONFIG_USB_F_ACM=m
CONFIG_USB_F_SS_LB=m
CONFIG_USB_U_SERIAL=m
CONFIG_USB_U_ETHER=m
CONFIG_USB_F_SERIAL=m
CONFIG_USB_F_OBEX=m
CONFIG_USB_F_NCM=m
CONFIG_USB_F_ECM=m
CONFIG_USB_F_PHONET=m
CONFIG_USB_F_EEM=m
CONFIG_USB_F_SUBSET=m
CONFIG_USB_F_RNDIS=m
CONFIG_USB_F_MASS_STORAGE=m
CONFIG_USB_F_FS=m
CONFIG_USB_F_HID=m
CONFIG_USB_F_PRINTER=m
CONFIG_USB_CONFIGFS=m
CONFIG_USB_CONFIGFS_SERIAL=y
# CONFIG_USB_CONFIGFS_ACM is not set
# CONFIG_USB_CONFIGFS_OBEX is not set
# CONFIG_USB_CONFIGFS_NCM is not set
CONFIG_USB_CONFIGFS_ECM=y
CONFIG_USB_CONFIGFS_ECM_SUBSET=y
CONFIG_USB_CONFIGFS_RNDIS=y
CONFIG_USB_CONFIGFS_EEM=y
CONFIG_USB_CONFIGFS_PHONET=y
CONFIG_USB_CONFIGFS_MASS_STORAGE=y
CONFIG_USB_CONFIGFS_F_LB_SS=y
CONFIG_USB_CONFIGFS_F_FS=y
CONFIG_USB_CONFIGFS_F_HID=y
# CONFIG_USB_CONFIGFS_F_UVC is not set
CONFIG_USB_CONFIGFS_F_PRINTER=y
CONFIG_USB_ZERO=m
CONFIG_USB_ZERO_HNPTEST=y
CONFIG_USB_ETH=m
# CONFIG_USB_ETH_RNDIS is not set
# CONFIG_USB_ETH_EEM is not set
CONFIG_USB_G_NCM=m
CONFIG_USB_GADGETFS=m
# CONFIG_USB_FUNCTIONFS is not set
CONFIG_USB_MASS_STORAGE=m
CONFIG_USB_GADGET_TARGET=m
CONFIG_USB_G_SERIAL=m
CONFIG_USB_G_PRINTER=m
CONFIG_USB_CDC_COMPOSITE=m
CONFIG_USB_G_ACM_MS=m
CONFIG_USB_G_MULTI=m
# CONFIG_USB_G_MULTI_RNDIS is not set
CONFIG_USB_G_MULTI_CDC=y
# CONFIG_USB_G_HID is not set
CONFIG_USB_G_DBGP=m
CONFIG_USB_G_DBGP_PRINTK=y
# CONFIG_USB_G_DBGP_SERIAL is not set
# CONFIG_USB_G_WEBCAM is not set
# CONFIG_UWB is not set
CONFIG_MMC=m
CONFIG_MMC_DEBUG=y
# CONFIG_MMC_CLKGATE is not set

#
# MMC/SD/SDIO Card Drivers
#
CONFIG_MMC_BLOCK=m
CONFIG_MMC_BLOCK_MINORS=8
CONFIG_MMC_BLOCK_BOUNCE=y
CONFIG_SDIO_UART=m
CONFIG_MMC_TEST=m

#
# MMC/SD/SDIO Host Controller Drivers
#
# CONFIG_MMC_SDHCI is not set
CONFIG_MMC_OMAP_HS=m
CONFIG_MMC_WBSD=m
CONFIG_MMC_TIFM_SD=m
# CONFIG_MMC_SPI is not set
CONFIG_MMC_SDRICOH_CS=m
CONFIG_MMC_CB710=m
CONFIG_MMC_VIA_SDMMC=m
# CONFIG_MMC_DW is not set
# CONFIG_MMC_SH_MMCIF is not set
CONFIG_MMC_VUB300=m
# CONFIG_MMC_USHC is not set
# CONFIG_MMC_USDHI6ROL0 is not set
CONFIG_MMC_REALTEK_PCI=m
CONFIG_MMC_REALTEK_USB=m
CONFIG_MMC_TOSHIBA_PCI=m
# CONFIG_MMC_MTK is not set
CONFIG_MEMSTICK=y
# CONFIG_MEMSTICK_DEBUG is not set

#
# MemoryStick drivers
#
CONFIG_MEMSTICK_UNSAFE_RESUME=y
CONFIG_MSPRO_BLOCK=y
CONFIG_MS_BLOCK=m

#
# MemoryStick Host Controller Drivers
#
CONFIG_MEMSTICK_TIFM_MS=m
CONFIG_MEMSTICK_JMICRON_38X=y
CONFIG_MEMSTICK_R592=y
CONFIG_MEMSTICK_REALTEK_PCI=m
CONFIG_MEMSTICK_REALTEK_USB=y
CONFIG_NEW_LEDS=y
CONFIG_LEDS_CLASS=y
# CONFIG_LEDS_CLASS_FLASH is not set

#
# LED drivers
#
CONFIG_LEDS_LM3530=y
# CONFIG_LEDS_LM3533 is not set
CONFIG_LEDS_LM3642=m
CONFIG_LEDS_PCA9532=y
CONFIG_LEDS_PCA9532_GPIO=y
CONFIG_LEDS_GPIO=y
CONFIG_LEDS_LP3944=m
CONFIG_LEDS_LP55XX_COMMON=y
CONFIG_LEDS_LP5521=m
# CONFIG_LEDS_LP5523 is not set
CONFIG_LEDS_LP5562=y
# CONFIG_LEDS_LP8501 is not set
CONFIG_LEDS_LP8788=m
# CONFIG_LEDS_LP8860 is not set
CONFIG_LEDS_PCA955X=m
# CONFIG_LEDS_PCA963X is not set
# CONFIG_LEDS_DA903X is not set
CONFIG_LEDS_DA9052=m
CONFIG_LEDS_DAC124S085=m
# CONFIG_LEDS_PWM is not set
# CONFIG_LEDS_REGULATOR is not set
CONFIG_LEDS_BD2802=y
# CONFIG_LEDS_LT3593 is not set
CONFIG_LEDS_MC13783=m
CONFIG_LEDS_TCA6507=y
CONFIG_LEDS_TLC591XX=m
# CONFIG_LEDS_MAX8997 is not set
# CONFIG_LEDS_LM355x is not set
# CONFIG_LEDS_OT200 is not set

#
# LED driver for blink(1) USB RGB LED is under Special HID drivers (HID_THINGM)
#
CONFIG_LEDS_BLINKM=y
# CONFIG_LEDS_PM8941_WLED is not set

#
# LED Triggers
#
# CONFIG_LEDS_TRIGGERS is not set
# CONFIG_ACCESSIBILITY is not set
CONFIG_INFINIBAND=y
CONFIG_INFINIBAND_USER_MAD=y
# CONFIG_INFINIBAND_USER_ACCESS is not set
CONFIG_INFINIBAND_ADDR_TRANS=y
CONFIG_INFINIBAND_MTHCA=y
# CONFIG_INFINIBAND_MTHCA_DEBUG is not set
# CONFIG_INFINIBAND_IPATH is not set
CONFIG_INFINIBAND_QIB=y
# CONFIG_INFINIBAND_AMSO1100 is not set
CONFIG_INFINIBAND_CXGB3=y
# CONFIG_INFINIBAND_CXGB3_DEBUG is not set
# CONFIG_INFINIBAND_CXGB4 is not set
# CONFIG_MLX4_INFINIBAND is not set
CONFIG_INFINIBAND_NES=m
# CONFIG_INFINIBAND_NES_DEBUG is not set
CONFIG_INFINIBAND_OCRDMA=y
# CONFIG_INFINIBAND_IPOIB is not set
# CONFIG_INFINIBAND_SRP is not set
CONFIG_INFINIBAND_SRPT=m
# CONFIG_INFINIBAND_ISER is not set
# CONFIG_INFINIBAND_ISERT is not set
CONFIG_EDAC_ATOMIC_SCRUB=y
CONFIG_EDAC_SUPPORT=y
CONFIG_EDAC=y
# CONFIG_EDAC_LEGACY_SYSFS is not set
CONFIG_EDAC_DEBUG=y
CONFIG_EDAC_MM_EDAC=y
CONFIG_EDAC_E752X=y
# CONFIG_EDAC_I82975X is not set
CONFIG_EDAC_I3000=m
CONFIG_EDAC_I3200=y
CONFIG_EDAC_IE31200=y
CONFIG_EDAC_X38=y
CONFIG_EDAC_I5400=m
CONFIG_EDAC_I5000=m
# CONFIG_EDAC_I5100 is not set
CONFIG_EDAC_I7300=y
# CONFIG_EDAC_XGENE is not set
CONFIG_RTC_LIB=y
CONFIG_RTC_CLASS=y
# CONFIG_RTC_HCTOSYS is not set
# CONFIG_RTC_SYSTOHC is not set
# CONFIG_RTC_DEBUG is not set

#
# RTC interfaces
#
CONFIG_RTC_INTF_SYSFS=y
# CONFIG_RTC_INTF_PROC is not set
CONFIG_RTC_INTF_DEV=y
# CONFIG_RTC_INTF_DEV_UIE_EMUL is not set
CONFIG_RTC_DRV_TEST=m

#
# I2C RTC drivers
#
# CONFIG_RTC_DRV_ABB5ZES3 is not set
CONFIG_RTC_DRV_ABX80X=y
CONFIG_RTC_DRV_DS1307=m
CONFIG_RTC_DRV_DS1374=m
# CONFIG_RTC_DRV_DS1374_WDT is not set
# CONFIG_RTC_DRV_DS1672 is not set
CONFIG_RTC_DRV_DS3232=y
# CONFIG_RTC_DRV_LP8788 is not set
CONFIG_RTC_DRV_MAX6900=m
CONFIG_RTC_DRV_MAX8907=m
# CONFIG_RTC_DRV_MAX8998 is not set
CONFIG_RTC_DRV_MAX8997=m
CONFIG_RTC_DRV_RS5C372=m
CONFIG_RTC_DRV_ISL1208=y
# CONFIG_RTC_DRV_ISL12022 is not set
# CONFIG_RTC_DRV_ISL12057 is not set
CONFIG_RTC_DRV_X1205=m
CONFIG_RTC_DRV_PALMAS=y
CONFIG_RTC_DRV_PCF2127=y
# CONFIG_RTC_DRV_PCF8523 is not set
CONFIG_RTC_DRV_PCF8563=m
CONFIG_RTC_DRV_PCF85063=m
CONFIG_RTC_DRV_PCF8583=y
CONFIG_RTC_DRV_M41T80=y
# CONFIG_RTC_DRV_M41T80_WDT is not set
CONFIG_RTC_DRV_BQ32K=y
# CONFIG_RTC_DRV_TPS6586X is not set
# CONFIG_RTC_DRV_RC5T583 is not set
# CONFIG_RTC_DRV_S35390A is not set
CONFIG_RTC_DRV_FM3130=y
# CONFIG_RTC_DRV_RX8581 is not set
# CONFIG_RTC_DRV_RX8025 is not set
# CONFIG_RTC_DRV_EM3027 is not set
CONFIG_RTC_DRV_RV3029C2=y
CONFIG_RTC_DRV_S5M=y

#
# SPI RTC drivers
#
# CONFIG_RTC_DRV_M41T93 is not set
CONFIG_RTC_DRV_M41T94=y
# CONFIG_RTC_DRV_DS1305 is not set
CONFIG_RTC_DRV_DS1343=y
# CONFIG_RTC_DRV_DS1347 is not set
CONFIG_RTC_DRV_DS1390=m
# CONFIG_RTC_DRV_MAX6902 is not set
CONFIG_RTC_DRV_R9701=m
CONFIG_RTC_DRV_RS5C348=m
CONFIG_RTC_DRV_DS3234=m
CONFIG_RTC_DRV_PCF2123=y
CONFIG_RTC_DRV_RX4581=y
# CONFIG_RTC_DRV_MCP795 is not set

#
# Platform RTC drivers
#
# CONFIG_RTC_DRV_CMOS is not set
# CONFIG_RTC_DRV_DS1286 is not set
# CONFIG_RTC_DRV_DS1511 is not set
# CONFIG_RTC_DRV_DS1553 is not set
# CONFIG_RTC_DRV_DS1685_FAMILY is not set
# CONFIG_RTC_DRV_DS1742 is not set
CONFIG_RTC_DRV_DS2404=m
CONFIG_RTC_DRV_DA9052=m
CONFIG_RTC_DRV_DA9055=y
# CONFIG_RTC_DRV_STK17TA8 is not set
# CONFIG_RTC_DRV_M48T86 is not set
CONFIG_RTC_DRV_M48T35=y
# CONFIG_RTC_DRV_M48T59 is not set
CONFIG_RTC_DRV_MSM6242=y
# CONFIG_RTC_DRV_BQ4802 is not set
# CONFIG_RTC_DRV_RP5C01 is not set
CONFIG_RTC_DRV_V3020=m
CONFIG_RTC_DRV_PCF50633=m

#
# on-CPU RTC drivers
#
CONFIG_RTC_DRV_GEMINI=m
# CONFIG_RTC_DRV_MC13XXX is not set
CONFIG_RTC_DRV_MOXART=y
CONFIG_RTC_DRV_MT6397=y
CONFIG_RTC_DRV_XGENE=m

#
# HID Sensor RTC drivers
#
CONFIG_RTC_DRV_HID_SENSOR_TIME=m
CONFIG_DMADEVICES=y
CONFIG_DMADEVICES_DEBUG=y
CONFIG_DMADEVICES_VDEBUG=y

#
# DMA Devices
#
# CONFIG_INTEL_MIC_X100_DMA is not set
CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH=y
# CONFIG_INTEL_IOATDMA is not set
CONFIG_DW_DMAC_CORE=y
CONFIG_DW_DMAC=m
CONFIG_DW_DMAC_PCI=y
# CONFIG_HSU_DMA_PCI is not set
CONFIG_RENESAS_DMA=y
CONFIG_SH_DMAE_BASE=y
CONFIG_SH_DMAE=m
CONFIG_SUDMAC=m
CONFIG_RCAR_HPB_DMAE=m
# CONFIG_RCAR_DMAC is not set
CONFIG_RENESAS_USB_DMAC=y
CONFIG_TIMB_DMA=m
CONFIG_PCH_DMA=y
CONFIG_NBPFAXI_DMA=y
CONFIG_XGENE_DMA=m
CONFIG_DMA_ENGINE=y
CONFIG_DMA_VIRTUAL_CHANNELS=y

#
# DMA Clients
#
CONFIG_ASYNC_TX_DMA=y
CONFIG_DMATEST=y
CONFIG_DMA_ENGINE_RAID=y
# CONFIG_AUXDISPLAY is not set
CONFIG_UIO=y
# CONFIG_UIO_CIF is not set
CONFIG_UIO_PDRV_GENIRQ=y
CONFIG_UIO_DMEM_GENIRQ=m
# CONFIG_UIO_AEC is not set
# CONFIG_UIO_SERCOS3 is not set
CONFIG_UIO_PCI_GENERIC=m
CONFIG_UIO_NETX=m
# CONFIG_UIO_PRUSS is not set
# CONFIG_UIO_MF624 is not set
CONFIG_VIRT_DRIVERS=y
CONFIG_VIRTIO=y

#
# Virtio drivers
#
CONFIG_VIRTIO_PCI=m
CONFIG_VIRTIO_PCI_LEGACY=y
CONFIG_VIRTIO_BALLOON=m
CONFIG_VIRTIO_INPUT=y
# CONFIG_VIRTIO_MMIO is not set

#
# Microsoft Hyper-V guest support
#
# CONFIG_X86_PLATFORM_DEVICES is not set
# CONFIG_CHROME_PLATFORMS is not set

#
# Hardware Spinlock drivers
#

#
# Clock Source drivers
#
CONFIG_CLKEVT_I8253=y
CONFIG_I8253_LOCK=y
CONFIG_CLKBLD_I8253=y
# CONFIG_ATMEL_PIT is not set
# CONFIG_SH_TIMER_CMT is not set
# CONFIG_SH_TIMER_MTU2 is not set
# CONFIG_SH_TIMER_TMU is not set
CONFIG_EM_TIMER_STI=y
# CONFIG_MAILBOX is not set
# CONFIG_IOMMU_SUPPORT is not set

#
# Remoteproc drivers
#
CONFIG_REMOTEPROC=y
CONFIG_STE_MODEM_RPROC=y

#
# Rpmsg drivers
#

#
# SOC (System On Chip) specific Drivers
#
# CONFIG_SUNXI_SRAM is not set
# CONFIG_SOC_TI is not set
# CONFIG_PM_DEVFREQ is not set
CONFIG_EXTCON=m

#
# Extcon Device Drivers
#
# CONFIG_EXTCON_ADC_JACK is not set
# CONFIG_EXTCON_AXP288 is not set
CONFIG_EXTCON_GPIO=m
# CONFIG_EXTCON_MAX14577 is not set
# CONFIG_EXTCON_MAX77693 is not set
# CONFIG_EXTCON_MAX77843 is not set
CONFIG_EXTCON_MAX8997=m
CONFIG_EXTCON_PALMAS=m
CONFIG_EXTCON_RT8973A=m
# CONFIG_EXTCON_SM5502 is not set
# CONFIG_EXTCON_USB_GPIO is not set
# CONFIG_MEMORY is not set
CONFIG_IIO=y
CONFIG_IIO_BUFFER=y
# CONFIG_IIO_BUFFER_CB is not set
CONFIG_IIO_KFIFO_BUF=y
CONFIG_IIO_TRIGGERED_BUFFER=y
CONFIG_IIO_TRIGGER=y
CONFIG_IIO_CONSUMERS_PER_TRIGGER=2

#
# Accelerometers
#
# CONFIG_BMA180 is not set
CONFIG_BMC150_ACCEL=m
# CONFIG_HID_SENSOR_ACCEL_3D is not set
# CONFIG_IIO_ST_ACCEL_3AXIS is not set
# CONFIG_KXSD9 is not set
CONFIG_MMA8452=y
CONFIG_KXCJK1013=y
CONFIG_MMA9551_CORE=y
CONFIG_MMA9551=m
CONFIG_MMA9553=y
# CONFIG_STK8312 is not set
CONFIG_STK8BA50=m

#
# Analog to digital converters
#
CONFIG_AD7266=y
# CONFIG_AD7291 is not set
CONFIG_AD7298=y
CONFIG_AD7476=y
# CONFIG_AD7791 is not set
# CONFIG_AD7793 is not set
CONFIG_AD7887=y
# CONFIG_AD7923 is not set
CONFIG_AD799X=m
# CONFIG_AXP288_ADC is not set
CONFIG_DA9150_GPADC=m
# CONFIG_CC10001_ADC is not set
CONFIG_LP8788_ADC=m
CONFIG_MAX1027=m
CONFIG_MAX1363=y
CONFIG_MCP320X=m
CONFIG_MCP3422=y
# CONFIG_MEN_Z188_ADC is not set
# CONFIG_NAU7802 is not set
CONFIG_TI_ADC081C=m
CONFIG_TI_ADC128S052=y
CONFIG_VIPERBOARD_ADC=m
# CONFIG_XILINX_XADC is not set

#
# Amplifiers
#
CONFIG_AD8366=m

#
# Hid Sensor IIO Common
#
CONFIG_HID_SENSOR_IIO_COMMON=m
CONFIG_HID_SENSOR_IIO_TRIGGER=m

#
# SSP Sensor Common
#
CONFIG_IIO_SSP_SENSORS_COMMONS=m
CONFIG_IIO_SSP_SENSORHUB=y
CONFIG_IIO_ST_SENSORS_I2C=y
CONFIG_IIO_ST_SENSORS_SPI=y
CONFIG_IIO_ST_SENSORS_CORE=y

#
# Digital to analog converters
#
CONFIG_AD5064=m
# CONFIG_AD5360 is not set
# CONFIG_AD5380 is not set
CONFIG_AD5421=y
CONFIG_AD5446=y
CONFIG_AD5449=y
CONFIG_AD5504=y
# CONFIG_AD5624R_SPI is not set
# CONFIG_AD5686 is not set
# CONFIG_AD5755 is not set
CONFIG_AD5764=m
CONFIG_AD5791=y
CONFIG_AD7303=m
CONFIG_M62332=y
CONFIG_MAX517=m
CONFIG_MCP4725=y
CONFIG_MCP4922=y

#
# Frequency Synthesizers DDS/PLL
#

#
# Clock Generator/Distribution
#
CONFIG_AD9523=y

#
# Phase-Locked Loop (PLL) frequency synthesizers
#
# CONFIG_ADF4350 is not set

#
# Digital gyroscope sensors
#
CONFIG_ADIS16080=m
# CONFIG_ADIS16130 is not set
CONFIG_ADIS16136=y
# CONFIG_ADIS16260 is not set
CONFIG_ADXRS450=y
CONFIG_BMG160=y
CONFIG_HID_SENSOR_GYRO_3D=m
CONFIG_IIO_ST_GYRO_3AXIS=m
CONFIG_IIO_ST_GYRO_I2C_3AXIS=m
CONFIG_IIO_ST_GYRO_SPI_3AXIS=m
CONFIG_ITG3200=y

#
# Humidity sensors
#
CONFIG_DHT11=m
CONFIG_SI7005=m
# CONFIG_SI7020 is not set

#
# Inertial measurement units
#
CONFIG_ADIS16400=y
# CONFIG_ADIS16480 is not set
CONFIG_KMX61=y
CONFIG_INV_MPU6050_IIO=y
CONFIG_IIO_ADIS_LIB=y
CONFIG_IIO_ADIS_LIB_BUFFER=y

#
# Light sensors
#
# CONFIG_ADJD_S311 is not set
CONFIG_AL3320A=y
CONFIG_APDS9300=m
CONFIG_BH1750=y
CONFIG_CM32181=y
# CONFIG_CM3232 is not set
# CONFIG_CM3323 is not set
# CONFIG_CM36651 is not set
# CONFIG_GP2AP020A00F is not set
CONFIG_ISL29125=m
# CONFIG_HID_SENSOR_ALS is not set
CONFIG_HID_SENSOR_PROX=m
CONFIG_JSA1212=m
CONFIG_SENSORS_LM3533=m
CONFIG_LTR501=m
CONFIG_STK3310=y
# CONFIG_TCS3414 is not set
CONFIG_TCS3472=m
# CONFIG_SENSORS_TSL2563 is not set
CONFIG_TSL4531=y
CONFIG_VCNL4000=m

#
# Magnetometer sensors
#
CONFIG_AK8975=y
# CONFIG_AK09911 is not set
# CONFIG_MAG3110 is not set
CONFIG_HID_SENSOR_MAGNETOMETER_3D=m
# CONFIG_MMC35240 is not set
CONFIG_IIO_ST_MAGN_3AXIS=y
CONFIG_IIO_ST_MAGN_I2C_3AXIS=y
CONFIG_IIO_ST_MAGN_SPI_3AXIS=y
# CONFIG_BMC150_MAGN is not set

#
# Inclinometer sensors
#
CONFIG_HID_SENSOR_INCLINOMETER_3D=m
CONFIG_HID_SENSOR_DEVICE_ROTATION=m

#
# Triggers - standalone
#
CONFIG_IIO_INTERRUPT_TRIGGER=m
CONFIG_IIO_SYSFS_TRIGGER=y

#
# Pressure sensors
#
CONFIG_BMP280=y
CONFIG_HID_SENSOR_PRESS=m
CONFIG_MPL115=y
# CONFIG_MPL3115 is not set
CONFIG_MS5611=y
CONFIG_MS5611_I2C=m
CONFIG_MS5611_SPI=m
CONFIG_IIO_ST_PRESS=y
CONFIG_IIO_ST_PRESS_I2C=y
CONFIG_IIO_ST_PRESS_SPI=y
# CONFIG_T5403 is not set

#
# Lightning sensors
#
CONFIG_AS3935=m

#
# Proximity sensors
#
CONFIG_SX9500=m

#
# Temperature sensors
#
# CONFIG_MLX90614 is not set
# CONFIG_TMP006 is not set
# CONFIG_NTB is not set
CONFIG_VME_BUS=y

#
# VME Bridge Drivers
#
CONFIG_VME_CA91CX42=m
CONFIG_VME_TSI148=y

#
# VME Board Drivers
#
CONFIG_VMIVME_7805=m

#
# VME Device Drivers
#
CONFIG_PWM=y
CONFIG_PWM_SYSFS=y
CONFIG_PWM_CLPS711X=m
# CONFIG_PWM_LP3943 is not set
CONFIG_PWM_LPSS=y
CONFIG_PWM_LPSS_PCI=m
CONFIG_PWM_RENESAS_TPU=y
CONFIG_IPACK_BUS=y
CONFIG_BOARD_TPCI200=m
CONFIG_SERIAL_IPOCTAL=m
# CONFIG_RESET_CONTROLLER is not set
CONFIG_FMC=y
# CONFIG_FMC_TRIVIAL is not set
# CONFIG_FMC_WRITE_EEPROM is not set
# CONFIG_FMC_CHARDEV is not set

#
# PHY Subsystem
#
CONFIG_GENERIC_PHY=y
CONFIG_PHY_EXYNOS_MIPI_VIDEO=y
# CONFIG_PHY_PXA_28NM_HSIC is not set
CONFIG_PHY_PXA_28NM_USB2=y
# CONFIG_OMAP_CONTROL_PHY is not set
CONFIG_BCM_KONA_USB2_PHY=y
CONFIG_PHY_ST_SPEAR1310_MIPHY=m
CONFIG_PHY_ST_SPEAR1340_MIPHY=m
CONFIG_PHY_TUSB1210=m
CONFIG_POWERCAP=y
CONFIG_INTEL_RAPL=m
CONFIG_MCB=m
CONFIG_MCB_PCI=m
CONFIG_RAS=y
CONFIG_THUNDERBOLT=m

#
# Android
#
CONFIG_ANDROID=y
CONFIG_ANDROID_BINDER_IPC=y
# CONFIG_LIBNVDIMM is not set

#
# Firmware Drivers
#
CONFIG_EDD=m
# CONFIG_EDD_OFF is not set
CONFIG_FIRMWARE_MEMMAP=y
CONFIG_DELL_RBU=y
CONFIG_DCDBAS=y
# CONFIG_GOOGLE_FIRMWARE is not set

#
# File systems
#
CONFIG_DCACHE_WORD_ACCESS=y
# CONFIG_EXT2_FS is not set
CONFIG_EXT3_FS=y
# CONFIG_EXT3_DEFAULTS_TO_ORDERED is not set
CONFIG_EXT3_FS_XATTR=y
CONFIG_EXT3_FS_POSIX_ACL=y
CONFIG_EXT3_FS_SECURITY=y
CONFIG_EXT4_FS=y
# CONFIG_EXT4_USE_FOR_EXT23 is not set
# CONFIG_EXT4_FS_POSIX_ACL is not set
CONFIG_EXT4_FS_SECURITY=y
CONFIG_EXT4_ENCRYPTION=m
CONFIG_EXT4_FS_ENCRYPTION=y
# CONFIG_EXT4_DEBUG is not set
CONFIG_JBD=y
CONFIG_JBD_DEBUG=y
CONFIG_JBD2=y
# CONFIG_JBD2_DEBUG is not set
CONFIG_FS_MBCACHE=y
# CONFIG_REISERFS_FS is not set
CONFIG_JFS_FS=y
# CONFIG_JFS_POSIX_ACL is not set
# CONFIG_JFS_SECURITY is not set
CONFIG_JFS_DEBUG=y
CONFIG_JFS_STATISTICS=y
# CONFIG_XFS_FS is not set
CONFIG_GFS2_FS=y
CONFIG_OCFS2_FS=m
# CONFIG_OCFS2_FS_O2CB is not set
CONFIG_OCFS2_FS_USERSPACE_CLUSTER=m
# CONFIG_OCFS2_FS_STATS is not set
# CONFIG_OCFS2_DEBUG_MASKLOG is not set
# CONFIG_OCFS2_DEBUG_FS is not set
CONFIG_BTRFS_FS=m
CONFIG_BTRFS_FS_POSIX_ACL=y
CONFIG_BTRFS_FS_CHECK_INTEGRITY=y
CONFIG_BTRFS_DEBUG=y
CONFIG_BTRFS_ASSERT=y
CONFIG_NILFS2_FS=y
# CONFIG_F2FS_FS is not set
# CONFIG_FS_DAX is not set
CONFIG_FS_POSIX_ACL=y
CONFIG_FILE_LOCKING=y
CONFIG_FSNOTIFY=y
CONFIG_DNOTIFY=y
# CONFIG_INOTIFY_USER is not set
# CONFIG_FANOTIFY is not set
CONFIG_QUOTA=y
# CONFIG_QUOTA_NETLINK_INTERFACE is not set
CONFIG_PRINT_QUOTA_WARNING=y
# CONFIG_QUOTA_DEBUG is not set
CONFIG_QUOTA_TREE=m
CONFIG_QFMT_V1=m
CONFIG_QFMT_V2=m
CONFIG_QUOTACTL=y
CONFIG_QUOTACTL_COMPAT=y
# CONFIG_AUTOFS4_FS is not set
CONFIG_FUSE_FS=y
CONFIG_CUSE=y
# CONFIG_OVERLAY_FS is not set

#
# Caches
#
CONFIG_FSCACHE=m
# CONFIG_FSCACHE_STATS is not set
CONFIG_FSCACHE_HISTOGRAM=y
CONFIG_FSCACHE_DEBUG=y
CONFIG_FSCACHE_OBJECT_LIST=y
CONFIG_CACHEFILES=m
# CONFIG_CACHEFILES_DEBUG is not set
# CONFIG_CACHEFILES_HISTOGRAM is not set

#
# CD-ROM/DVD Filesystems
#
# CONFIG_ISO9660_FS is not set
CONFIG_UDF_FS=m
CONFIG_UDF_NLS=y

#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=m
# CONFIG_MSDOS_FS is not set
CONFIG_VFAT_FS=m
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
# CONFIG_NTFS_FS is not set

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
# CONFIG_PROC_KCORE is not set
# CONFIG_PROC_SYSCTL is not set
# CONFIG_PROC_PAGE_MONITOR is not set
CONFIG_PROC_CHILDREN=y
CONFIG_KERNFS=y
CONFIG_SYSFS=y
# CONFIG_HUGETLBFS is not set
# CONFIG_HUGETLB_PAGE is not set
CONFIG_CONFIGFS_FS=y
CONFIG_MISC_FILESYSTEMS=y
CONFIG_ADFS_FS=y
# CONFIG_ADFS_FS_RW is not set
CONFIG_AFFS_FS=m
CONFIG_ECRYPT_FS=m
CONFIG_ECRYPT_FS_MESSAGING=y
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
CONFIG_EFS_FS=y
# CONFIG_LOGFS is not set
# CONFIG_CRAMFS is not set
# CONFIG_SQUASHFS is not set
CONFIG_VXFS_FS=m
CONFIG_MINIX_FS=y
# CONFIG_OMFS_FS is not set
CONFIG_HPFS_FS=m
CONFIG_QNX4FS_FS=y
CONFIG_QNX6FS_FS=y
CONFIG_QNX6FS_DEBUG=y
CONFIG_ROMFS_FS=m
CONFIG_ROMFS_BACKED_BY_BLOCK=y
CONFIG_ROMFS_ON_BLOCK=y
CONFIG_PSTORE=y
CONFIG_PSTORE_CONSOLE=y
CONFIG_PSTORE_PMSG=y
CONFIG_PSTORE_FTRACE=y
# CONFIG_PSTORE_RAM is not set
CONFIG_SYSV_FS=m
CONFIG_UFS_FS=m
# CONFIG_UFS_FS_WRITE is not set
CONFIG_UFS_DEBUG=y
# CONFIG_NETWORK_FILESYSTEMS is not set
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=m
CONFIG_NLS_CODEPAGE_737=m
CONFIG_NLS_CODEPAGE_775=m
CONFIG_NLS_CODEPAGE_850=m
# CONFIG_NLS_CODEPAGE_852 is not set
CONFIG_NLS_CODEPAGE_855=m
CONFIG_NLS_CODEPAGE_857=y
# CONFIG_NLS_CODEPAGE_860 is not set
CONFIG_NLS_CODEPAGE_861=m
CONFIG_NLS_CODEPAGE_862=y
CONFIG_NLS_CODEPAGE_863=m
CONFIG_NLS_CODEPAGE_864=y
# CONFIG_NLS_CODEPAGE_865 is not set
CONFIG_NLS_CODEPAGE_866=y
CONFIG_NLS_CODEPAGE_869=m
CONFIG_NLS_CODEPAGE_936=y
# CONFIG_NLS_CODEPAGE_950 is not set
# CONFIG_NLS_CODEPAGE_932 is not set
CONFIG_NLS_CODEPAGE_949=m
CONFIG_NLS_CODEPAGE_874=y
CONFIG_NLS_ISO8859_8=y
CONFIG_NLS_CODEPAGE_1250=y
# CONFIG_NLS_CODEPAGE_1251 is not set
CONFIG_NLS_ASCII=y
CONFIG_NLS_ISO8859_1=y
CONFIG_NLS_ISO8859_2=m
CONFIG_NLS_ISO8859_3=m
# CONFIG_NLS_ISO8859_4 is not set
CONFIG_NLS_ISO8859_5=y
# CONFIG_NLS_ISO8859_6 is not set
CONFIG_NLS_ISO8859_7=y
CONFIG_NLS_ISO8859_9=m
CONFIG_NLS_ISO8859_13=m
# CONFIG_NLS_ISO8859_14 is not set
# CONFIG_NLS_ISO8859_15 is not set
CONFIG_NLS_KOI8_R=y
CONFIG_NLS_KOI8_U=m
CONFIG_NLS_MAC_ROMAN=m
CONFIG_NLS_MAC_CELTIC=m
# CONFIG_NLS_MAC_CENTEURO is not set
CONFIG_NLS_MAC_CROATIAN=m
# CONFIG_NLS_MAC_CYRILLIC is not set
CONFIG_NLS_MAC_GAELIC=m
CONFIG_NLS_MAC_GREEK=m
# CONFIG_NLS_MAC_ICELAND is not set
# CONFIG_NLS_MAC_INUIT is not set
CONFIG_NLS_MAC_ROMANIAN=m
CONFIG_NLS_MAC_TURKISH=m
CONFIG_NLS_UTF8=y
CONFIG_DLM=m
# CONFIG_DLM_DEBUG is not set

#
# Kernel hacking
#
CONFIG_TRACE_IRQFLAGS_SUPPORT=y

#
# printk and dmesg options
#
# CONFIG_PRINTK_TIME is not set
CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
CONFIG_BOOT_PRINTK_DELAY=y
CONFIG_DYNAMIC_DEBUG=y

#
# Compile-time checks and compiler options
#
CONFIG_ENABLE_WARN_DEPRECATED=y
CONFIG_ENABLE_MUST_CHECK=y
CONFIG_FRAME_WARN=2048
CONFIG_STRIP_ASM_SYMS=y
CONFIG_READABLE_ASM=y
# CONFIG_UNUSED_SYMBOLS is not set
# CONFIG_PAGE_OWNER is not set
CONFIG_DEBUG_FS=y
CONFIG_HEADERS_CHECK=y
# CONFIG_DEBUG_SECTION_MISMATCH is not set
CONFIG_ARCH_WANT_FRAME_POINTERS=y
CONFIG_FRAME_POINTER=y
CONFIG_DEBUG_FORCE_WEAK_PER_CPU=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
CONFIG_DEBUG_KERNEL=y

#
# Memory Debugging
#
CONFIG_PAGE_EXTENSION=y
CONFIG_DEBUG_PAGEALLOC=y
CONFIG_DEBUG_OBJECTS=y
CONFIG_DEBUG_OBJECTS_SELFTEST=y
CONFIG_DEBUG_OBJECTS_FREE=y
CONFIG_DEBUG_OBJECTS_WORK=y
# CONFIG_DEBUG_OBJECTS_RCU_HEAD is not set
CONFIG_DEBUG_OBJECTS_PERCPU_COUNTER=y
CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT=1
# CONFIG_DEBUG_SLAB is not set
CONFIG_HAVE_DEBUG_KMEMLEAK=y
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_DEBUG_VM is not set
CONFIG_DEBUG_VIRTUAL=y
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_DEBUG_PER_CPU_MAPS=y
CONFIG_HAVE_DEBUG_STACKOVERFLOW=y
# CONFIG_DEBUG_STACKOVERFLOW is not set
CONFIG_HAVE_ARCH_KMEMCHECK=y
CONFIG_DEBUG_SHIRQ=y

#
# Debug Lockups and Hangs
#
CONFIG_LOCKUP_DETECTOR=y
CONFIG_HARDLOCKUP_DETECTOR=y
# CONFIG_BOOTPARAM_HARDLOCKUP_PANIC is not set
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE=0
# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=0
# CONFIG_DETECT_HUNG_TASK is not set
# CONFIG_PANIC_ON_OOPS is not set
CONFIG_PANIC_ON_OOPS_VALUE=0
CONFIG_PANIC_TIMEOUT=0
CONFIG_SCHED_DEBUG=y
CONFIG_SCHED_INFO=y
# CONFIG_SCHEDSTATS is not set
CONFIG_SCHED_STACK_END_CHECK=y
CONFIG_DEBUG_TIMEKEEPING=y
# CONFIG_TIMER_STATS is not set
CONFIG_DEBUG_PREEMPT=y

#
# Lock Debugging (spinlocks, mutexes, etc...)
#
# CONFIG_DEBUG_RT_MUTEXES is not set
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_PROVE_LOCKING=y
CONFIG_LOCKDEP=y
CONFIG_LOCK_STAT=y
# CONFIG_DEBUG_LOCKDEP is not set
CONFIG_DEBUG_ATOMIC_SLEEP=y
CONFIG_DEBUG_LOCKING_API_SELFTESTS=y
CONFIG_LOCK_TORTURE_TEST=y
CONFIG_TRACE_IRQFLAGS=y
CONFIG_STACKTRACE=y
CONFIG_DEBUG_BUGVERBOSE=y
# CONFIG_DEBUG_PI_LIST is not set
CONFIG_DEBUG_SG=y
# CONFIG_DEBUG_NOTIFIERS is not set
# CONFIG_DEBUG_CREDENTIALS is not set

#
# RCU Debugging
#
CONFIG_PROVE_RCU=y
# CONFIG_PROVE_RCU_REPEATEDLY is not set
# CONFIG_SPARSE_RCU_POINTER is not set
CONFIG_TORTURE_TEST=y
# CONFIG_RCU_TORTURE_TEST is not set
CONFIG_RCU_CPU_STALL_TIMEOUT=21
# CONFIG_RCU_CPU_STALL_INFO is not set
CONFIG_RCU_TRACE=y
# CONFIG_RCU_EQS_DEBUG is not set
CONFIG_NOTIFIER_ERROR_INJECTION=m
# CONFIG_CPU_NOTIFIER_ERROR_INJECT is not set
CONFIG_PM_NOTIFIER_ERROR_INJECT=m
CONFIG_FAULT_INJECTION=y
CONFIG_FAILSLAB=y
# CONFIG_FAIL_PAGE_ALLOC is not set
CONFIG_FAIL_MAKE_REQUEST=y
# CONFIG_FAIL_IO_TIMEOUT is not set
# CONFIG_FAIL_MMC_REQUEST is not set
CONFIG_FAULT_INJECTION_DEBUG_FS=y
# CONFIG_LATENCYTOP is not set
CONFIG_ARCH_HAS_DEBUG_STRICT_USER_COPY_CHECKS=y
# CONFIG_DEBUG_STRICT_USER_COPY_CHECKS is not set
CONFIG_USER_STACKTRACE_SUPPORT=y
CONFIG_NOP_TRACER=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_FP_TEST=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
CONFIG_HAVE_FENTRY=y
CONFIG_HAVE_C_RECORDMCOUNT=y
CONFIG_TRACER_MAX_TRACE=y
CONFIG_TRACE_CLOCK=y
CONFIG_RING_BUFFER=y
CONFIG_EVENT_TRACING=y
CONFIG_CONTEXT_SWITCH_TRACER=y
CONFIG_RING_BUFFER_ALLOW_SWAP=y
CONFIG_TRACING=y
CONFIG_GENERIC_TRACER=y
CONFIG_TRACING_SUPPORT=y
CONFIG_FTRACE=y
CONFIG_FUNCTION_TRACER=y
# CONFIG_FUNCTION_GRAPH_TRACER is not set
CONFIG_IRQSOFF_TRACER=y
# CONFIG_PREEMPT_TRACER is not set
CONFIG_SCHED_TRACER=y
# CONFIG_FTRACE_SYSCALLS is not set
CONFIG_TRACER_SNAPSHOT=y
CONFIG_TRACER_SNAPSHOT_PER_CPU_SWAP=y
CONFIG_BRANCH_PROFILE_NONE=y
CONFIG_STACK_TRACER=y
# CONFIG_BLK_DEV_IO_TRACE is not set
CONFIG_UPROBE_EVENT=y
CONFIG_PROBE_EVENTS=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_FUNCTION_PROFILER=y
CONFIG_FTRACE_MCOUNT_RECORD=y
# CONFIG_MMIOTRACE is not set
CONFIG_TRACEPOINT_BENCHMARK=y
CONFIG_RING_BUFFER_STARTUP_TEST=y
# CONFIG_TRACE_ENUM_MAP_FILE is not set

#
# Runtime Testing
#
# CONFIG_LKDTM is not set
CONFIG_TEST_LIST_SORT=y
# CONFIG_BACKTRACE_SELF_TEST is not set
# CONFIG_RBTREE_TEST is not set
CONFIG_INTERVAL_TREE_TEST=m
CONFIG_PERCPU_TEST=m
CONFIG_ATOMIC64_SELFTEST=y
CONFIG_TEST_HEXDUMP=y
# CONFIG_TEST_STRING_HELPERS is not set
# CONFIG_TEST_KSTRTOX is not set
# CONFIG_TEST_RHASHTABLE is not set
# CONFIG_PROVIDE_OHCI1394_DMA_INIT is not set
CONFIG_BUILD_DOCSRC=y
# CONFIG_DMA_API_DEBUG is not set
# CONFIG_TEST_LKM is not set
CONFIG_TEST_USER_COPY=m
CONFIG_TEST_BPF=m
CONFIG_TEST_FIRMWARE=y
# CONFIG_TEST_UDELAY is not set
# CONFIG_MEMTEST is not set
CONFIG_SAMPLES=y
CONFIG_SAMPLE_TRACE_EVENTS=m
# CONFIG_SAMPLE_KOBJECT is not set
# CONFIG_SAMPLE_HW_BREAKPOINT is not set
# CONFIG_SAMPLE_KFIFO is not set
CONFIG_SAMPLE_LIVEPATCH=m
CONFIG_HAVE_ARCH_KGDB=y
# CONFIG_KGDB is not set
CONFIG_STRICT_DEVMEM=y
CONFIG_X86_VERBOSE_BOOTUP=y
CONFIG_EARLY_PRINTK=y
CONFIG_EARLY_PRINTK_DBGP=y
# CONFIG_X86_PTDUMP is not set
CONFIG_DEBUG_RODATA=y
CONFIG_DEBUG_RODATA_TEST=y
# CONFIG_DEBUG_SET_MODULE_RONX is not set
CONFIG_DEBUG_NX_TEST=m
CONFIG_DOUBLEFAULT=y
CONFIG_DEBUG_TLBFLUSH=y
CONFIG_IOMMU_DEBUG=y
CONFIG_IOMMU_STRESS=y
CONFIG_HAVE_MMIOTRACE_SUPPORT=y
CONFIG_IO_DELAY_TYPE_0X80=0
CONFIG_IO_DELAY_TYPE_0XED=1
CONFIG_IO_DELAY_TYPE_UDELAY=2
CONFIG_IO_DELAY_TYPE_NONE=3
# CONFIG_IO_DELAY_0X80 is not set
# CONFIG_IO_DELAY_0XED is not set
CONFIG_IO_DELAY_UDELAY=y
# CONFIG_IO_DELAY_NONE is not set
CONFIG_DEFAULT_IO_DELAY_TYPE=2
CONFIG_DEBUG_BOOT_PARAMS=y
# CONFIG_CPA_DEBUG is not set
CONFIG_OPTIMIZE_INLINING=y
# CONFIG_DEBUG_NMI_SELFTEST is not set
# CONFIG_X86_DEBUG_STATIC_CPU_HAS is not set
# CONFIG_X86_DEBUG_FPU is not set
# CONFIG_PUNIT_ATOM_DEBUG is not set

#
# Security options
#
CONFIG_KEYS=y
CONFIG_PERSISTENT_KEYRINGS=y
CONFIG_ENCRYPTED_KEYS=m
# CONFIG_SECURITY_DMESG_RESTRICT is not set
CONFIG_SECURITY=y
# CONFIG_SECURITYFS is not set
CONFIG_SECURITY_NETWORK=y
# CONFIG_SECURITY_NETWORK_XFRM is not set
CONFIG_SECURITY_PATH=y
# CONFIG_SECURITY_APPARMOR is not set
# CONFIG_SECURITY_YAMA is not set
# CONFIG_INTEGRITY is not set
CONFIG_DEFAULT_SECURITY_DAC=y
CONFIG_DEFAULT_SECURITY=""
CONFIG_XOR_BLOCKS=m
CONFIG_CRYPTO=y

#
# Crypto core or helper
#
CONFIG_CRYPTO_FIPS=y
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_ALGAPI2=y
CONFIG_CRYPTO_AEAD=y
CONFIG_CRYPTO_AEAD2=y
CONFIG_CRYPTO_BLKCIPHER=y
CONFIG_CRYPTO_BLKCIPHER2=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_HASH2=y
CONFIG_CRYPTO_RNG=y
CONFIG_CRYPTO_RNG2=y
CONFIG_CRYPTO_RNG_DEFAULT=y
CONFIG_CRYPTO_PCOMP=y
CONFIG_CRYPTO_PCOMP2=y
CONFIG_CRYPTO_AKCIPHER2=y
CONFIG_CRYPTO_AKCIPHER=m
CONFIG_CRYPTO_RSA=m
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_MANAGER2=y
CONFIG_CRYPTO_USER=m
# CONFIG_CRYPTO_MANAGER_DISABLE_TESTS is not set
CONFIG_CRYPTO_GF128MUL=y
CONFIG_CRYPTO_NULL=y
CONFIG_CRYPTO_PCRYPT=m
CONFIG_CRYPTO_WORKQUEUE=y
CONFIG_CRYPTO_CRYPTD=y
# CONFIG_CRYPTO_MCRYPTD is not set
CONFIG_CRYPTO_AUTHENC=m
CONFIG_CRYPTO_TEST=m
CONFIG_CRYPTO_ABLK_HELPER=y
CONFIG_CRYPTO_GLUE_HELPER_X86=y

#
# Authenticated Encryption with Associated Data
#
CONFIG_CRYPTO_CCM=m
# CONFIG_CRYPTO_GCM is not set
CONFIG_CRYPTO_CHACHA20POLY1305=m
CONFIG_CRYPTO_SEQIV=m
CONFIG_CRYPTO_ECHAINIV=y

#
# Block modes
#
CONFIG_CRYPTO_CBC=y
CONFIG_CRYPTO_CTR=m
CONFIG_CRYPTO_CTS=m
CONFIG_CRYPTO_ECB=y
CONFIG_CRYPTO_LRW=y
CONFIG_CRYPTO_PCBC=y
CONFIG_CRYPTO_XTS=y

#
# Hash modes
#
CONFIG_CRYPTO_CMAC=m
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPTO_XCBC=m
CONFIG_CRYPTO_VMAC=m

#
# Digest
#
CONFIG_CRYPTO_CRC32C=y
CONFIG_CRYPTO_CRC32C_INTEL=m
# CONFIG_CRYPTO_CRC32 is not set
CONFIG_CRYPTO_CRC32_PCLMUL=y
CONFIG_CRYPTO_CRCT10DIF=y
CONFIG_CRYPTO_CRCT10DIF_PCLMUL=m
# CONFIG_CRYPTO_GHASH is not set
CONFIG_CRYPTO_POLY1305=m
# CONFIG_CRYPTO_MD4 is not set
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_MICHAEL_MIC=m
CONFIG_CRYPTO_RMD128=m
# CONFIG_CRYPTO_RMD160 is not set
CONFIG_CRYPTO_RMD256=m
CONFIG_CRYPTO_RMD320=m
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_SHA1_SSSE3=y
CONFIG_CRYPTO_SHA256_SSSE3=m
CONFIG_CRYPTO_SHA512_SSSE3=y
# CONFIG_CRYPTO_SHA1_MB is not set
CONFIG_CRYPTO_SHA256=y
CONFIG_CRYPTO_SHA512=y
CONFIG_CRYPTO_TGR192=y
CONFIG_CRYPTO_WP512=m
CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL=y

#
# Ciphers
#
CONFIG_CRYPTO_AES=y
CONFIG_CRYPTO_AES_X86_64=m
CONFIG_CRYPTO_AES_NI_INTEL=m
CONFIG_CRYPTO_ANUBIS=y
CONFIG_CRYPTO_ARC4=y
# CONFIG_CRYPTO_BLOWFISH is not set
CONFIG_CRYPTO_BLOWFISH_COMMON=y
CONFIG_CRYPTO_BLOWFISH_X86_64=y
CONFIG_CRYPTO_CAMELLIA=m
CONFIG_CRYPTO_CAMELLIA_X86_64=y
CONFIG_CRYPTO_CAMELLIA_AESNI_AVX_X86_64=y
CONFIG_CRYPTO_CAMELLIA_AESNI_AVX2_X86_64=m
CONFIG_CRYPTO_CAST_COMMON=y
CONFIG_CRYPTO_CAST5=m
# CONFIG_CRYPTO_CAST5_AVX_X86_64 is not set
CONFIG_CRYPTO_CAST6=y
CONFIG_CRYPTO_CAST6_AVX_X86_64=m
CONFIG_CRYPTO_DES=m
# CONFIG_CRYPTO_DES3_EDE_X86_64 is not set
CONFIG_CRYPTO_FCRYPT=m
CONFIG_CRYPTO_KHAZAD=m
CONFIG_CRYPTO_SALSA20=y
# CONFIG_CRYPTO_SALSA20_X86_64 is not set
CONFIG_CRYPTO_CHACHA20=m
CONFIG_CRYPTO_SEED=y
# CONFIG_CRYPTO_SERPENT is not set
# CONFIG_CRYPTO_SERPENT_SSE2_X86_64 is not set
# CONFIG_CRYPTO_SERPENT_AVX_X86_64 is not set
# CONFIG_CRYPTO_SERPENT_AVX2_X86_64 is not set
# CONFIG_CRYPTO_TEA is not set
CONFIG_CRYPTO_TWOFISH=m
CONFIG_CRYPTO_TWOFISH_COMMON=y
CONFIG_CRYPTO_TWOFISH_X86_64=y
# CONFIG_CRYPTO_TWOFISH_X86_64_3WAY is not set
# CONFIG_CRYPTO_TWOFISH_AVX_X86_64 is not set

#
# Compression
#
# CONFIG_CRYPTO_DEFLATE is not set
CONFIG_CRYPTO_ZLIB=y
# CONFIG_CRYPTO_LZO is not set
# CONFIG_CRYPTO_842 is not set
CONFIG_CRYPTO_LZ4=m
CONFIG_CRYPTO_LZ4HC=m

#
# Random Number Generation
#
CONFIG_CRYPTO_ANSI_CPRNG=y
CONFIG_CRYPTO_DRBG_MENU=y
CONFIG_CRYPTO_DRBG_HMAC=y
# CONFIG_CRYPTO_DRBG_HASH is not set
CONFIG_CRYPTO_DRBG_CTR=y
CONFIG_CRYPTO_DRBG=y
CONFIG_CRYPTO_JITTERENTROPY=y
CONFIG_CRYPTO_USER_API=y
# CONFIG_CRYPTO_USER_API_HASH is not set
CONFIG_CRYPTO_USER_API_SKCIPHER=y
CONFIG_CRYPTO_USER_API_RNG=y
# CONFIG_CRYPTO_USER_API_AEAD is not set
CONFIG_CRYPTO_HASH_INFO=y
# CONFIG_CRYPTO_HW is not set
CONFIG_ASYMMETRIC_KEY_TYPE=y
CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y
CONFIG_PUBLIC_KEY_ALGO_RSA=y
CONFIG_X509_CERTIFICATE_PARSER=y
# CONFIG_PKCS7_MESSAGE_PARSER is not set
CONFIG_HAVE_KVM=y
CONFIG_HAVE_KVM_IRQCHIP=y
CONFIG_HAVE_KVM_IRQFD=y
CONFIG_HAVE_KVM_IRQ_ROUTING=y
CONFIG_HAVE_KVM_EVENTFD=y
CONFIG_KVM_APIC_ARCHITECTURE=y
CONFIG_KVM_MMIO=y
CONFIG_KVM_ASYNC_PF=y
CONFIG_HAVE_KVM_MSI=y
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y
CONFIG_KVM_VFIO=y
CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT=y
CONFIG_KVM_COMPAT=y
CONFIG_VIRTUALIZATION=y
CONFIG_KVM=y
CONFIG_KVM_AMD=y
# CONFIG_KVM_MMU_AUDIT is not set
CONFIG_BINARY_PRINTF=y

#
# Library routines
#
CONFIG_RAID6_PQ=m
CONFIG_BITREVERSE=y
# CONFIG_HAVE_ARCH_BITREVERSE is not set
CONFIG_RATIONAL=y
CONFIG_GENERIC_STRNCPY_FROM_USER=y
CONFIG_GENERIC_STRNLEN_USER=y
CONFIG_GENERIC_NET_UTILS=y
CONFIG_GENERIC_FIND_FIRST_BIT=y
CONFIG_GENERIC_PCI_IOMAP=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_IO=y
CONFIG_PERCPU_RWSEM=y
CONFIG_ARCH_USE_CMPXCHG_LOCKREF=y
CONFIG_ARCH_HAS_FAST_MULTIPLIER=y
CONFIG_CRC_CCITT=y
CONFIG_CRC16=y
CONFIG_CRC_T10DIF=y
CONFIG_CRC_ITU_T=m
CONFIG_CRC32=y
CONFIG_CRC32_SELFTEST=y
# CONFIG_CRC32_SLICEBY8 is not set
CONFIG_CRC32_SLICEBY4=y
# CONFIG_CRC32_SARWATE is not set
# CONFIG_CRC32_BIT is not set
CONFIG_CRC7=y
CONFIG_LIBCRC32C=y
# CONFIG_CRC8 is not set
# CONFIG_AUDIT_ARCH_COMPAT_GENERIC is not set
# CONFIG_RANDOM32_SELFTEST is not set
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=y
CONFIG_LZO_COMPRESS=y
CONFIG_LZO_DECOMPRESS=y
CONFIG_LZ4_COMPRESS=m
CONFIG_LZ4HC_COMPRESS=m
CONFIG_LZ4_DECOMPRESS=y
# CONFIG_XZ_DEC is not set
# CONFIG_XZ_DEC_BCJ is not set
CONFIG_DECOMPRESS_GZIP=y
CONFIG_DECOMPRESS_BZIP2=y
CONFIG_DECOMPRESS_LZ4=y
CONFIG_GENERIC_ALLOCATOR=y
CONFIG_TEXTSEARCH=y
CONFIG_TEXTSEARCH_KMP=y
CONFIG_TEXTSEARCH_BM=y
CONFIG_TEXTSEARCH_FSM=y
CONFIG_INTERVAL_TREE=y
CONFIG_ASSOCIATIVE_ARRAY=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT_MAP=y
CONFIG_HAS_DMA=y
CONFIG_CHECK_SIGNATURE=y
CONFIG_CPUMASK_OFFSTACK=y
CONFIG_CPU_RMAP=y
CONFIG_DQL=y
CONFIG_GLOB=y
CONFIG_GLOB_SELFTEST=y
CONFIG_NLATTR=y
CONFIG_ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE=y
CONFIG_AVERAGE=y
CONFIG_CLZ_TAB=y
CONFIG_CORDIC=m
CONFIG_DDR=y
CONFIG_MPILIB=y
CONFIG_OID_REGISTRY=y
CONFIG_ARCH_HAS_SG_CHAIN=y
CONFIG_ARCH_HAS_PMEM_API=y
CONFIG_FORCE_SUCCESSFUL_BUILD=y
CONFIG_FORCE_MINIMAL_CONFIG=y
CONFIG_FORCE_MINIMAL_CONFIG_64=y
CONFIG_FORCE_MINIMAL_CONFIG_PHYS=y

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v5 00/17] x86: Rewrite exit-to-userspace code
  2015-07-07 11:12 ` [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Ingo Molnar
@ 2015-07-07 16:03   ` Andy Lutomirski
  2015-07-07 17:55     ` [PATCH] x86/entry/64: Fix warning on compat syscalls with CONFIG_AUDITSYSCALL=n Andy Lutomirski
  0 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-07 16:03 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andy Lutomirski, X86 ML, linux-kernel,
	Frédéric Weisbecker, Rik van Riel, Oleg Nesterov,
	Denys Vlasenko, Borislav Petkov, Kees Cook, Brian Gerst,
	Paul McKenney

On Tue, Jul 7, 2015 at 4:12 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
> So this looks mostly problem free on my boxen, except this warning triggers:
>
> Adding 3911820k swap on /dev/sda2.  Priority:-1 extents:1 across:3911820k
> capability: warning: `dbus-daemon' uses 32-bit capabilities (legacy support in use)
> ------------[ cut here ]------------
> WARNING: CPU: 1 PID: 2445 at arch/x86/entry/common.c:311 syscall_return_slowpath+0x4c/0x270()
> syscall 6 left IRQs disabled
> Modules linked in:
> CPU: 1 PID: 2445 Comm: distccd Not tainted 4.2.0-rc1-01597-gaecd781-dirty #18
>  0000000000000000 00000000776afac2 ffff880035413e58 ffffffff81c8915f
>  0000000000000000 ffff880035413eb0 ffff880035413e98 ffffffff810a8d82
>  ffff880035413e78 ffff880035413f58 0000000020020002 ffff880035410000
> Call Trace:
>  [<ffffffff81c8915f>] dump_stack+0x4f/0x7b
>  [<ffffffff810a8d82>] warn_slowpath_common+0xa2/0xc0
>  [<ffffffff810a8df5>] warn_slowpath_fmt+0x55/0x70
>  [<ffffffff81001ddc>] syscall_return_slowpath+0x4c/0x270
>  [<ffffffff81c96471>] int_ret_from_sys_call+0x25/0x9f
> ---[ end trace 083efc734e089d37 ]---
> device: 'vcs2': device_add
> PM: Adding info for No Bus:vcs2
> device: 'vcsa2': device_add
>
> with ancient user-space, running the attached .config.
>
> The system booted up fine otherwise. The warning corresponds to:
>
>         if (WARN(irqs_disabled(), "syscall %ld left IRQs disabled",
>                  regs->orig_ax))
>                 local_irq_enable();
>
> and this was just the regular startup of the distccd daemon during bootup, nothing
> particularly fancy.
>
> Note that 'distccd' is a 32-bit ELF binary - and this is a 64-bit kernel.
>
> Syscall 6 would be:
>
> arch/x86/entry/syscalls/syscall_32.tbl:6        i386    close                   sys_close
>
> Thanks,
>
>         Ingo

It's irq state confusion in these lovely macros:

#ifndef CONFIG_AUDITSYSCALL
# define sysexit_audit        ia32_ret_from_sys_call
# define sysretl_audit        ia32_ret_from_sys_call
#endif

Frankly, I'm amazed that the old code seems to have worked.  I should
have a patch for you later today.

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH] x86/entry/64: Fix warning on compat syscalls with CONFIG_AUDITSYSCALL=n
  2015-07-07 16:03   ` Andy Lutomirski
@ 2015-07-07 17:55     ` Andy Lutomirski
  2015-07-08  9:57       ` [tip:x86/asm] x86/entry/64: Fix IRQ state confusion and related warning on compat syscalls with CONFIG_AUDITSYSCALL =n tip-bot for Andy Lutomirski
  2015-07-08 19:12       ` tip-bot for Andy Lutomirski
  0 siblings, 2 replies; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-07 17:55 UTC (permalink / raw)
  To: x86, linux-kernel; +Cc: Borislav Petkov, Andy Lutomirski

int_ret_from_sys_call now expects IRQs to be enabled.  I got this right
in the real sysexit_audit and sysretl_audit asm paths, but I missed it
in the #defined-away versions when CONFIG_AUDITSYSCALL=n.  This is
a straightforward fix for CONFIG_AUDITSYSCALL=n

Fixes: 29ea1b258b98 ("x86/entry/64: Migrate 64-bit and compat syscalls to the new exit handlers and remove old assembly code")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/entry/entry_64_compat.S | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index 25aca51a6324..d7571532e7ce 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -22,8 +22,8 @@
 #define __AUDIT_ARCH_LE		0x40000000
 
 #ifndef CONFIG_AUDITSYSCALL
-# define sysexit_audit		ia32_ret_from_sys_call
-# define sysretl_audit		ia32_ret_from_sys_call
+# define sysexit_audit		ia32_ret_from_sys_call_irqs_off
+# define sysretl_audit		ia32_ret_from_sys_call_irqs_off
 #endif
 
 	.section .entry.text, "ax"
@@ -466,6 +466,10 @@ ia32_badarg:
 	/* And exit again. */
 	jmp retint_user
 
+ia32_ret_from_sys_call_irqs_off:
+	TRACE_IRQS_ON
+	ENABLE_INTERRUPTS(CLBR_NONE)
+
 ia32_ret_from_sys_call:
 	xorl	%eax, %eax		/* Do not leak kernel information */
 	movq	%rax, R11(%rsp)
-- 
2.4.3


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [tip:x86/asm] x86/entry/64: Fix IRQ state confusion and related warning on compat syscalls with CONFIG_AUDITSYSCALL =n
  2015-07-07 17:55     ` [PATCH] x86/entry/64: Fix warning on compat syscalls with CONFIG_AUDITSYSCALL=n Andy Lutomirski
@ 2015-07-08  9:57       ` tip-bot for Andy Lutomirski
  2015-07-08 19:12       ` tip-bot for Andy Lutomirski
  1 sibling, 0 replies; 70+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-08  9:57 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, luto, mingo, brgerst, peterz, dvlasenk, torvalds, luto, bp,
	linux-kernel, tglx

Commit-ID:  0c6541b605747fc39dc6b1715e1f3a3dca1cace5
Gitweb:     http://git.kernel.org/tip/0c6541b605747fc39dc6b1715e1f3a3dca1cace5
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Tue, 7 Jul 2015 10:55:28 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 8 Jul 2015 11:53:44 +0200

x86/entry/64: Fix IRQ state confusion and related warning on compat syscalls with CONFIG_AUDITSYSCALL=n

int_ret_from_sys_call now expects IRQs to be enabled.  I got
this right in the real sysexit_audit and sysretl_audit asm
paths, but I missed it in the #defined-away versions when
CONFIG_AUDITSYSCALL=n.  This is a straightforward fix for
CONFIG_AUDITSYSCALL=n

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 29ea1b258b98 ("x86/entry/64: Migrate 64-bit and compat syscalls to the new exit handlers and remove old assembly code")
Link: http://lkml.kernel.org/r/25cf0a01e01c6008118dd8f8d9f043020416700c.1436291493.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/entry/entry_64_compat.S | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index 25aca51..d757153 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -22,8 +22,8 @@
 #define __AUDIT_ARCH_LE		0x40000000
 
 #ifndef CONFIG_AUDITSYSCALL
-# define sysexit_audit		ia32_ret_from_sys_call
-# define sysretl_audit		ia32_ret_from_sys_call
+# define sysexit_audit		ia32_ret_from_sys_call_irqs_off
+# define sysretl_audit		ia32_ret_from_sys_call_irqs_off
 #endif
 
 	.section .entry.text, "ax"
@@ -466,6 +466,10 @@ ia32_badarg:
 	/* And exit again. */
 	jmp retint_user
 
+ia32_ret_from_sys_call_irqs_off:
+	TRACE_IRQS_ON
+	ENABLE_INTERRUPTS(CLBR_NONE)
+
 ia32_ret_from_sys_call:
 	xorl	%eax, %eax		/* Do not leak kernel information */
 	movq	%rax, R11(%rsp)

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [tip:x86/asm] x86/entry/64: Fix IRQ state confusion and related warning on compat syscalls with CONFIG_AUDITSYSCALL =n
  2015-07-07 17:55     ` [PATCH] x86/entry/64: Fix warning on compat syscalls with CONFIG_AUDITSYSCALL=n Andy Lutomirski
  2015-07-08  9:57       ` [tip:x86/asm] x86/entry/64: Fix IRQ state confusion and related warning on compat syscalls with CONFIG_AUDITSYSCALL =n tip-bot for Andy Lutomirski
@ 2015-07-08 19:12       ` tip-bot for Andy Lutomirski
  1 sibling, 0 replies; 70+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-08 19:12 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: luto, hpa, mingo, torvalds, linux-kernel, tglx, brgerst, luto,
	dvlasenk, bp, peterz

Commit-ID:  8f7f06b87acd2e017d6c536f59e10045dd8d0578
Gitweb:     http://git.kernel.org/tip/8f7f06b87acd2e017d6c536f59e10045dd8d0578
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Tue, 7 Jul 2015 10:55:28 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 8 Jul 2015 21:10:25 +0200

x86/entry/64: Fix IRQ state confusion and related warning on compat syscalls with CONFIG_AUDITSYSCALL=n

int_ret_from_sys_call now expects IRQs to be enabled.  I got
this right in the real sysexit_audit and sysretl_audit asm
paths, but I missed it in the #defined-away versions when
CONFIG_AUDITSYSCALL=n.  This is a straightforward fix for
CONFIG_AUDITSYSCALL=n

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 29ea1b258b98 ("x86/entry/64: Migrate 64-bit and compat syscalls to the new exit handlers and remove old assembly code")
Link: http://lkml.kernel.org/r/25cf0a01e01c6008118dd8f8d9f043020416700c.1436291493.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/entry/entry_64_compat.S | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index 25aca51..d757153 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -22,8 +22,8 @@
 #define __AUDIT_ARCH_LE		0x40000000
 
 #ifndef CONFIG_AUDITSYSCALL
-# define sysexit_audit		ia32_ret_from_sys_call
-# define sysretl_audit		ia32_ret_from_sys_call
+# define sysexit_audit		ia32_ret_from_sys_call_irqs_off
+# define sysretl_audit		ia32_ret_from_sys_call_irqs_off
 #endif
 
 	.section .entry.text, "ax"
@@ -466,6 +466,10 @@ ia32_badarg:
 	/* And exit again. */
 	jmp retint_user
 
+ia32_ret_from_sys_call_irqs_off:
+	TRACE_IRQS_ON
+	ENABLE_INTERRUPTS(CLBR_NONE)
+
 ia32_ret_from_sys_call:
 	xorl	%eax, %eax		/* Do not leak kernel information */
 	movq	%rax, R11(%rsp)

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/entry: Add enter_from_user_mode() and use it in syscalls
  2015-07-07 10:51   ` [tip:x86/asm] x86/entry: Add enter_from_user_mode() " tip-bot for Andy Lutomirski
@ 2015-07-14 23:00     ` Frederic Weisbecker
  2015-07-14 23:04       ` Andy Lutomirski
  0 siblings, 1 reply; 70+ messages in thread
From: Frederic Weisbecker @ 2015-07-14 23:00 UTC (permalink / raw)
  To: luto, brgerst, linux-kernel, peterz, mingo, vda.linux, luto, hpa,
	dvlasenk, riel, torvalds, bp, tglx, oleg, keescook
  Cc: linux-tip-commits

On Tue, Jul 07, 2015 at 03:51:29AM -0700, tip-bot for Andy Lutomirski wrote:
> Commit-ID:  feed36cde0a10adb957445a37e48f957f30b2273
> Gitweb:     http://git.kernel.org/tip/feed36cde0a10adb957445a37e48f957f30b2273
> Author:     Andy Lutomirski <luto@kernel.org>
> AuthorDate: Fri, 3 Jul 2015 12:44:25 -0700
> Committer:  Ingo Molnar <mingo@kernel.org>
> CommitDate: Tue, 7 Jul 2015 10:59:06 +0200
> 
> x86/entry: Add enter_from_user_mode() and use it in syscalls
> 
> Changing the x86 context tracking hooks is dangerous because
> there are no good checks that we track our context correctly.
> Add a helper to check that we're actually in CONTEXT_USER when
> we enter from user mode and wire it up for syscall entries.
> 
> Subsequent patches will wire this up for all non-NMI entries as
> well.  NMIs are their own special beast and cannot currently
> switch overall context tracking state.  Instead, they have their
> own special RCU hooks.
> 
> This is a tiny speedup if !CONFIG_CONTEXT_TRACKING (removes a
> branch) and a tiny slowdown if CONFIG_CONTEXT_TRACING (adds a
> layer of indirection).  Eventually, we should fix up the core
> context tracking code to supply a function that does what we
> want (and can be much simpler than user_exit), which will enable
> us to get rid of the extra call.
> 
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Brian Gerst <brgerst@gmail.com>
> Cc: Denys Vlasenko <dvlasenk@redhat.com>
> Cc: Denys Vlasenko <vda.linux@googlemail.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: paulmck@linux.vnet.ibm.com
> Link: http://lkml.kernel.org/r/853b42420066ec3fb856779cdc223a6dcb5d355b.1435952415.git.luto@kernel.org
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> ---
>  arch/x86/entry/common.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
> index 917d0c3..9a327ee 100644
> --- a/arch/x86/entry/common.c
> +++ b/arch/x86/entry/common.c
> @@ -28,6 +28,15 @@
>  #define CREATE_TRACE_POINTS
>  #include <trace/events/syscalls.h>
>  
> +#ifdef CONFIG_CONTEXT_TRACKING
> +/* Called on entry from user mode with IRQs off. */
> +__visible void enter_from_user_mode(void)
> +{
> +	CT_WARN_ON(ct_state() != CONTEXT_USER);
> +	user_exit();
> +}
> +#endif
> +
>  static void do_audit_syscall_entry(struct pt_regs *regs, u32 arch)
>  {
>  #ifdef CONFIG_X86_64
> @@ -65,14 +74,16 @@ unsigned long syscall_trace_enter_phase1(struct pt_regs *regs, u32 arch)
>  	work = ACCESS_ONCE(current_thread_info()->flags) &
>  		_TIF_WORK_SYSCALL_ENTRY;
>  
> +#ifdef CONFIG_CONTEXT_TRACKING
>  	/*
>  	 * If TIF_NOHZ is set, we are required to call user_exit() before
>  	 * doing anything that could touch RCU.
>  	 */
>  	if (work & _TIF_NOHZ) {
> -		user_exit();
> +		enter_from_user_mode();
>  		work &= ~_TIF_NOHZ;

We should move the sanity check to user_exit/enter() and use user_exit/enter()
only when we actually enter/exit user. Here it's the case but syscall_trace_leave()
and do_notify_resume() are special case that should probably use exception_enter/exit()
unless your patchset have changed things such that there is only one call to user_exit()
once we completed everything before resuming userspace. I need to review the rest of
the patchset to discover that :-)

>  	}
> +#endif
>  
>  #ifdef CONFIG_SECCOMP
>  	/*

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/entry: Add enter_from_user_mode() and use it in syscalls
  2015-07-14 23:00     ` Frederic Weisbecker
@ 2015-07-14 23:04       ` Andy Lutomirski
  2015-07-14 23:28         ` Frederic Weisbecker
  0 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-14 23:04 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Brian Gerst, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Denys Vlasenko, Andrew Lutomirski, H. Peter Anvin,
	Denys Vlasenko, Rik van Riel, Linus Torvalds, Borislav Petkov,
	Thomas Gleixner, Oleg Nesterov, Kees Cook, linux-tip-commits

On Tue, Jul 14, 2015 at 4:00 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> On Tue, Jul 07, 2015 at 03:51:29AM -0700, tip-bot for Andy Lutomirski wrote:
>> Commit-ID:  feed36cde0a10adb957445a37e48f957f30b2273
>> Gitweb:     http://git.kernel.org/tip/feed36cde0a10adb957445a37e48f957f30b2273
>> Author:     Andy Lutomirski <luto@kernel.org>
>> AuthorDate: Fri, 3 Jul 2015 12:44:25 -0700
>> Committer:  Ingo Molnar <mingo@kernel.org>
>> CommitDate: Tue, 7 Jul 2015 10:59:06 +0200
>>
>> x86/entry: Add enter_from_user_mode() and use it in syscalls
>>
>> Changing the x86 context tracking hooks is dangerous because
>> there are no good checks that we track our context correctly.
>> Add a helper to check that we're actually in CONTEXT_USER when
>> we enter from user mode and wire it up for syscall entries.
>>
>> Subsequent patches will wire this up for all non-NMI entries as
>> well.  NMIs are their own special beast and cannot currently
>> switch overall context tracking state.  Instead, they have their
>> own special RCU hooks.
>>
>> This is a tiny speedup if !CONFIG_CONTEXT_TRACKING (removes a
>> branch) and a tiny slowdown if CONFIG_CONTEXT_TRACING (adds a
>> layer of indirection).  Eventually, we should fix up the core
>> context tracking code to supply a function that does what we
>> want (and can be much simpler than user_exit), which will enable
>> us to get rid of the extra call.
>>
>> Signed-off-by: Andy Lutomirski <luto@kernel.org>
>> Cc: Andy Lutomirski <luto@amacapital.net>
>> Cc: Borislav Petkov <bp@alien8.de>
>> Cc: Brian Gerst <brgerst@gmail.com>
>> Cc: Denys Vlasenko <dvlasenk@redhat.com>
>> Cc: Denys Vlasenko <vda.linux@googlemail.com>
>> Cc: Frederic Weisbecker <fweisbec@gmail.com>
>> Cc: H. Peter Anvin <hpa@zytor.com>
>> Cc: Kees Cook <keescook@chromium.org>
>> Cc: Linus Torvalds <torvalds@linux-foundation.org>
>> Cc: Oleg Nesterov <oleg@redhat.com>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Rik van Riel <riel@redhat.com>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: paulmck@linux.vnet.ibm.com
>> Link: http://lkml.kernel.org/r/853b42420066ec3fb856779cdc223a6dcb5d355b.1435952415.git.luto@kernel.org
>> Signed-off-by: Ingo Molnar <mingo@kernel.org>
>> ---
>>  arch/x86/entry/common.c | 13 ++++++++++++-
>>  1 file changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
>> index 917d0c3..9a327ee 100644
>> --- a/arch/x86/entry/common.c
>> +++ b/arch/x86/entry/common.c
>> @@ -28,6 +28,15 @@
>>  #define CREATE_TRACE_POINTS
>>  #include <trace/events/syscalls.h>
>>
>> +#ifdef CONFIG_CONTEXT_TRACKING
>> +/* Called on entry from user mode with IRQs off. */
>> +__visible void enter_from_user_mode(void)
>> +{
>> +     CT_WARN_ON(ct_state() != CONTEXT_USER);
>> +     user_exit();
>> +}
>> +#endif
>> +
>>  static void do_audit_syscall_entry(struct pt_regs *regs, u32 arch)
>>  {
>>  #ifdef CONFIG_X86_64
>> @@ -65,14 +74,16 @@ unsigned long syscall_trace_enter_phase1(struct pt_regs *regs, u32 arch)
>>       work = ACCESS_ONCE(current_thread_info()->flags) &
>>               _TIF_WORK_SYSCALL_ENTRY;
>>
>> +#ifdef CONFIG_CONTEXT_TRACKING
>>       /*
>>        * If TIF_NOHZ is set, we are required to call user_exit() before
>>        * doing anything that could touch RCU.
>>        */
>>       if (work & _TIF_NOHZ) {
>> -             user_exit();
>> +             enter_from_user_mode();
>>               work &= ~_TIF_NOHZ;
>
> We should move the sanity check to user_exit/enter() and use user_exit/enter()
> only when we actually enter/exit user.

I agree, but I don't know what other arches to.

> Here it's the case but syscall_trace_leave()
> and do_notify_resume() are special case that should probably use exception_enter/exit()
> unless your patchset have changed things such that there is only one call to user_exit()
> once we completed everything before resuming userspace. I need to review the rest of
> the patchset to discover that :-)

syscall_trace_leave and do_notify_resume may be so screwed up that
your suggestion wouldn't even work.  However, the next set of patches
(out for review but currently stalled pending Brian Gerst's vm86 work)
remove those functions entirely.

--Andy

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/entry: Add new, comprehensible entry and exit handlers written in C
  2015-07-07 10:51   ` [tip:x86/asm] x86/entry: Add new, comprehensible entry and exit handlers written in C tip-bot for Andy Lutomirski
@ 2015-07-14 23:07     ` Frederic Weisbecker
  2015-07-15 19:56       ` Linus Torvalds
  0 siblings, 1 reply; 70+ messages in thread
From: Frederic Weisbecker @ 2015-07-14 23:07 UTC (permalink / raw)
  To: keescook, peterz, vda.linux, mingo, brgerst, luto, torvalds, bp,
	luto, oleg, hpa, linux-kernel, dvlasenk, tglx, riel
  Cc: linux-tip-commits

On Tue, Jul 07, 2015 at 03:51:48AM -0700, tip-bot for Andy Lutomirski wrote:
> Commit-ID:  c5c46f59e4e7c1ab244b8d38f2b61d317df90bba
> Gitweb:     http://git.kernel.org/tip/c5c46f59e4e7c1ab244b8d38f2b61d317df90bba
> Author:     Andy Lutomirski <luto@kernel.org>
> AuthorDate: Fri, 3 Jul 2015 12:44:26 -0700
> Committer:  Ingo Molnar <mingo@kernel.org>
> CommitDate: Tue, 7 Jul 2015 10:59:06 +0200
> 
> x86/entry: Add new, comprehensible entry and exit handlers written in C
> 
> The current x86 entry and exit code, written in a mixture of assembly and
> C code, is incomprehensible due to being open-coded in a lot of places
> without coherent documentation.
> 
> It appears to work primary by luck and duct tape: i.e. obvious runtime
> failures were fixed on-demand, without re-thinking the design.
> 
> Due to those reasons our confidence level in that code is low, and it is
> very difficult to incrementally improve.
> 
> Add new code written in C, in preparation for simply deleting the old
> entry code.
> 
> prepare_exit_to_usermode() is a new function that will handle all
> slow path exits to user mode.  It is called with IRQs disabled
> and it leaves us in a state in which it is safe to immediately
> return to user mode.  IRQs must not be re-enabled at any point
> after prepare_exit_to_usermode() returns and user mode is actually
> entered. (We can, of course, fail to enter user mode and treat
> that failure as a fresh entry to kernel mode.)
> 
> All callers of do_notify_resume() will be migrated to call
> prepare_exit_to_usermode() instead; prepare_exit_to_usermode() needs
> to do everything that do_notify_resume() does today, but it also
> takes care of scheduling and context tracking.  Unlike
> do_notify_resume(), it does not need to be called in a loop.
> 
> syscall_return_slowpath() is exactly what it sounds like: it will
> be called on any syscall exit slow path. It will replace
> syscall_trace_leave() and it calls prepare_exit_to_usermode() on the
> way out.
> 
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Brian Gerst <brgerst@gmail.com>
> Cc: Denys Vlasenko <dvlasenk@redhat.com>
> Cc: Denys Vlasenko <vda.linux@googlemail.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: paulmck@linux.vnet.ibm.com
> Link: http://lkml.kernel.org/r/c57c8b87661a4152801d7d3786eac2d1a2f209dd.1435952415.git.luto@kernel.org
> [ Improved the changelog a bit. ]
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> ---
>  arch/x86/entry/common.c | 112 +++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 111 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
> index 9a327ee..febc530 100644
> --- a/arch/x86/entry/common.c
> +++ b/arch/x86/entry/common.c
> @@ -207,6 +207,7 @@ long syscall_trace_enter(struct pt_regs *regs)
>  		return syscall_trace_enter_phase2(regs, arch, phase1_result);
>  }
>  
> +/* Deprecated. */
>  void syscall_trace_leave(struct pt_regs *regs)
>  {
>  	bool step;
> @@ -237,8 +238,117 @@ void syscall_trace_leave(struct pt_regs *regs)
>  	user_enter();
>  }
>  
> +static struct thread_info *pt_regs_to_thread_info(struct pt_regs *regs)
> +{
> +	unsigned long top_of_stack =
> +		(unsigned long)(regs + 1) + TOP_OF_KERNEL_STACK_PADDING;
> +	return (struct thread_info *)(top_of_stack - THREAD_SIZE);
> +}
> +
> +/* Called with IRQs disabled. */
> +__visible void prepare_exit_to_usermode(struct pt_regs *regs)
> +{
> +	if (WARN_ON(!irqs_disabled()))
> +		local_irq_disable();
> +
> +	/*
> +	 * In order to return to user mode, we need to have IRQs off with
> +	 * none of _TIF_SIGPENDING, _TIF_NOTIFY_RESUME, _TIF_USER_RETURN_NOTIFY,
> +	 * _TIF_UPROBE, or _TIF_NEED_RESCHED set.  Several of these flags
> +	 * can be set at any time on preemptable kernels if we have IRQs on,
> +	 * so we need to loop.  Disabling preemption wouldn't help: doing the
> +	 * work to clear some of the flags can sleep.
> +	 */
> +	while (true) {
> +		u32 cached_flags =
> +			READ_ONCE(pt_regs_to_thread_info(regs)->flags);
> +
> +		if (!(cached_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_RESUME |
> +				      _TIF_UPROBE | _TIF_NEED_RESCHED)))
> +			break;
> +
> +		/* We have work to do. */
> +		local_irq_enable();
> +
> +		if (cached_flags & _TIF_NEED_RESCHED)
> +			schedule();
> +
> +		if (cached_flags & _TIF_UPROBE)
> +			uprobe_notify_resume(regs);
> +
> +		/* deal with pending signal delivery */
> +		if (cached_flags & _TIF_SIGPENDING)
> +			do_signal(regs);
> +
> +		if (cached_flags & _TIF_NOTIFY_RESUME) {
> +			clear_thread_flag(TIF_NOTIFY_RESUME);
> +			tracehook_notify_resume(regs);
> +		}
> +
> +		if (cached_flags & _TIF_USER_RETURN_NOTIFY)
> +			fire_user_return_notifiers();
> +
> +		/* Disable IRQs and retry */
> +		local_irq_disable();
> +	}

I dreamed so many times about this loop in C!

> +
> +	user_enter();

So now we are sure that we have only one call to user_enter() before
resuming userspace, once we've completed everything, rescheduling, signals,
etc... No more context tracking hacky round on signals and rescheduling?

That's great. I need to check if other archs still need schedule_user().

Thanks a lot!

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/irq, context_tracking: Document how IRQ context tracking works and add an RCU assertion
  2015-07-07 10:54   ` [tip:x86/asm] x86/irq, context_tracking: Document how IRQ context tracking works and add an RCU assertion tip-bot for Andy Lutomirski
@ 2015-07-14 23:26     ` Frederic Weisbecker
  2015-07-14 23:33       ` Andy Lutomirski
  0 siblings, 1 reply; 70+ messages in thread
From: Frederic Weisbecker @ 2015-07-14 23:26 UTC (permalink / raw)
  To: luto, mingo, luto, keescook, torvalds, peterz, paulmck, oleg,
	hpa, linux-kernel, vda.linux, riel, bp, tglx, dvlasenk, brgerst
  Cc: linux-tip-commits

On Tue, Jul 07, 2015 at 03:54:32AM -0700, tip-bot for Andy Lutomirski wrote:
> Commit-ID:  0333a209cbf600e980fc55c24878a56f25f48b65
> Gitweb:     http://git.kernel.org/tip/0333a209cbf600e980fc55c24878a56f25f48b65
> Author:     Andy Lutomirski <luto@kernel.org>
> AuthorDate: Fri, 3 Jul 2015 12:44:34 -0700
> Committer:  Ingo Molnar <mingo@kernel.org>
> CommitDate: Tue, 7 Jul 2015 10:59:10 +0200
> 
> x86/irq, context_tracking: Document how IRQ context tracking works and add an RCU assertion
> 
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Brian Gerst <brgerst@gmail.com>
> Cc: Denys Vlasenko <dvlasenk@redhat.com>
> Cc: Denys Vlasenko <vda.linux@googlemail.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: paulmck@linux.vnet.ibm.com
> Link: http://lkml.kernel.org/r/e8bdc4ed0193fb2fd130f3d6b7b8023e2ec1ab62.1435952415.git.luto@kernel.org
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> ---
>  arch/x86/kernel/irq.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> index 88b36648..6233de0 100644
> --- a/arch/x86/kernel/irq.c
> +++ b/arch/x86/kernel/irq.c
> @@ -216,8 +216,23 @@ __visible unsigned int __irq_entry do_IRQ(struct pt_regs *regs)
>  	unsigned vector = ~regs->orig_ax;
>  	unsigned irq;
>  
> +	/*
> +	 * NB: Unlike exception entries, IRQ entries do not reliably
> +	 * handle context tracking in the low-level entry code.  This is
> +	 * because syscall entries execute briefly with IRQs on before
> +	 * updating context tracking state, so we can take an IRQ from
> +	 * kernel mode with CONTEXT_USER.  The low-level entry code only
> +	 * updates the context if we came from user mode, so we won't
> +	 * switch to CONTEXT_KERNEL.  We'll fix that once the syscall
> +	 * code is cleaned up enough that we can cleanly defer enabling
> +	 * IRQs.
> +	 */
> +

Now is it a problem to take interrupts in kernel mode with CONTEXT_USER?
I'm not sure it's worth trying to make it not happen.

>  	entering_irq();
>  
> +	/* entering_irq() tells RCU that we're not quiescent.  Check it. */
> +	rcu_lockdep_assert(rcu_is_watching(), "IRQ failed to wake up RCU");

Why do we need to check that?

> +
>  	irq = __this_cpu_read(vector_irq[vector]);
>  
>  	if (!handle_irq(irq, regs)) {

Thanks.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/entry: Add enter_from_user_mode() and use it in syscalls
  2015-07-14 23:04       ` Andy Lutomirski
@ 2015-07-14 23:28         ` Frederic Weisbecker
  0 siblings, 0 replies; 70+ messages in thread
From: Frederic Weisbecker @ 2015-07-14 23:28 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Brian Gerst, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Denys Vlasenko, Andrew Lutomirski, H. Peter Anvin,
	Denys Vlasenko, Rik van Riel, Linus Torvalds, Borislav Petkov,
	Thomas Gleixner, Oleg Nesterov, Kees Cook, linux-tip-commits

On Tue, Jul 14, 2015 at 04:04:47PM -0700, Andy Lutomirski wrote:
> On Tue, Jul 14, 2015 at 4:00 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > On Tue, Jul 07, 2015 at 03:51:29AM -0700, tip-bot for Andy Lutomirski wrote:
> >> Commit-ID:  feed36cde0a10adb957445a37e48f957f30b2273
> >> Gitweb:     http://git.kernel.org/tip/feed36cde0a10adb957445a37e48f957f30b2273
> >> Author:     Andy Lutomirski <luto@kernel.org>
> >> AuthorDate: Fri, 3 Jul 2015 12:44:25 -0700
> >> Committer:  Ingo Molnar <mingo@kernel.org>
> >> CommitDate: Tue, 7 Jul 2015 10:59:06 +0200
> >>
> >> x86/entry: Add enter_from_user_mode() and use it in syscalls
> >>
> >> Changing the x86 context tracking hooks is dangerous because
> >> there are no good checks that we track our context correctly.
> >> Add a helper to check that we're actually in CONTEXT_USER when
> >> we enter from user mode and wire it up for syscall entries.
> >>
> >> Subsequent patches will wire this up for all non-NMI entries as
> >> well.  NMIs are their own special beast and cannot currently
> >> switch overall context tracking state.  Instead, they have their
> >> own special RCU hooks.
> >>
> >> This is a tiny speedup if !CONFIG_CONTEXT_TRACKING (removes a
> >> branch) and a tiny slowdown if CONFIG_CONTEXT_TRACING (adds a
> >> layer of indirection).  Eventually, we should fix up the core
> >> context tracking code to supply a function that does what we
> >> want (and can be much simpler than user_exit), which will enable
> >> us to get rid of the extra call.
> >>
> >> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> >> Cc: Andy Lutomirski <luto@amacapital.net>
> >> Cc: Borislav Petkov <bp@alien8.de>
> >> Cc: Brian Gerst <brgerst@gmail.com>
> >> Cc: Denys Vlasenko <dvlasenk@redhat.com>
> >> Cc: Denys Vlasenko <vda.linux@googlemail.com>
> >> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> >> Cc: H. Peter Anvin <hpa@zytor.com>
> >> Cc: Kees Cook <keescook@chromium.org>
> >> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> >> Cc: Oleg Nesterov <oleg@redhat.com>
> >> Cc: Peter Zijlstra <peterz@infradead.org>
> >> Cc: Rik van Riel <riel@redhat.com>
> >> Cc: Thomas Gleixner <tglx@linutronix.de>
> >> Cc: paulmck@linux.vnet.ibm.com
> >> Link: http://lkml.kernel.org/r/853b42420066ec3fb856779cdc223a6dcb5d355b.1435952415.git.luto@kernel.org
> >> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> >> ---
> >>  arch/x86/entry/common.c | 13 ++++++++++++-
> >>  1 file changed, 12 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
> >> index 917d0c3..9a327ee 100644
> >> --- a/arch/x86/entry/common.c
> >> +++ b/arch/x86/entry/common.c
> >> @@ -28,6 +28,15 @@
> >>  #define CREATE_TRACE_POINTS
> >>  #include <trace/events/syscalls.h>
> >>
> >> +#ifdef CONFIG_CONTEXT_TRACKING
> >> +/* Called on entry from user mode with IRQs off. */
> >> +__visible void enter_from_user_mode(void)
> >> +{
> >> +     CT_WARN_ON(ct_state() != CONTEXT_USER);
> >> +     user_exit();
> >> +}
> >> +#endif
> >> +
> >>  static void do_audit_syscall_entry(struct pt_regs *regs, u32 arch)
> >>  {
> >>  #ifdef CONFIG_X86_64
> >> @@ -65,14 +74,16 @@ unsigned long syscall_trace_enter_phase1(struct pt_regs *regs, u32 arch)
> >>       work = ACCESS_ONCE(current_thread_info()->flags) &
> >>               _TIF_WORK_SYSCALL_ENTRY;
> >>
> >> +#ifdef CONFIG_CONTEXT_TRACKING
> >>       /*
> >>        * If TIF_NOHZ is set, we are required to call user_exit() before
> >>        * doing anything that could touch RCU.
> >>        */
> >>       if (work & _TIF_NOHZ) {
> >> -             user_exit();
> >> +             enter_from_user_mode();
> >>               work &= ~_TIF_NOHZ;
> >
> > We should move the sanity check to user_exit/enter() and use user_exit/enter()
> > only when we actually enter/exit user.
> 
> I agree, but I don't know what other arches to.

Right, I'll need to check that carefully, once I fully understand your patchset.

> 
> > Here it's the case but syscall_trace_leave()
> > and do_notify_resume() are special case that should probably use exception_enter/exit()
> > unless your patchset have changed things such that there is only one call to user_exit()
> > once we completed everything before resuming userspace. I need to review the rest of
> > the patchset to discover that :-)
> 
> syscall_trace_leave and do_notify_resume may be so screwed up that
> your suggestion wouldn't even work.  However, the next set of patches
> (out for review but currently stalled pending Brian Gerst's vm86 work)
> remove those functions entirely.

Ok so I'm probably confused. I need to check the resulting code.

> 
> --Andy

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/irq, context_tracking: Document how IRQ context tracking works and add an RCU assertion
  2015-07-14 23:26     ` Frederic Weisbecker
@ 2015-07-14 23:33       ` Andy Lutomirski
  2015-07-18 13:23         ` Frederic Weisbecker
  0 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-14 23:33 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Andrew Lutomirski, Ingo Molnar, Kees Cook, Linus Torvalds,
	Peter Zijlstra, Paul McKenney, Oleg Nesterov, H. Peter Anvin,
	linux-kernel, Denys Vlasenko, Rik van Riel, Borislav Petkov,
	Thomas Gleixner, Denys Vlasenko, Brian Gerst, linux-tip-commits

On Tue, Jul 14, 2015 at 4:26 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> On Tue, Jul 07, 2015 at 03:54:32AM -0700, tip-bot for Andy Lutomirski wrote:
>> Commit-ID:  0333a209cbf600e980fc55c24878a56f25f48b65
>> Gitweb:     http://git.kernel.org/tip/0333a209cbf600e980fc55c24878a56f25f48b65
>> Author:     Andy Lutomirski <luto@kernel.org>
>> AuthorDate: Fri, 3 Jul 2015 12:44:34 -0700
>> Committer:  Ingo Molnar <mingo@kernel.org>
>> CommitDate: Tue, 7 Jul 2015 10:59:10 +0200
>>
>> x86/irq, context_tracking: Document how IRQ context tracking works and add an RCU assertion
>>
>> Signed-off-by: Andy Lutomirski <luto@kernel.org>
>> Cc: Andy Lutomirski <luto@amacapital.net>
>> Cc: Borislav Petkov <bp@alien8.de>
>> Cc: Brian Gerst <brgerst@gmail.com>
>> Cc: Denys Vlasenko <dvlasenk@redhat.com>
>> Cc: Denys Vlasenko <vda.linux@googlemail.com>
>> Cc: Frederic Weisbecker <fweisbec@gmail.com>
>> Cc: H. Peter Anvin <hpa@zytor.com>
>> Cc: Kees Cook <keescook@chromium.org>
>> Cc: Linus Torvalds <torvalds@linux-foundation.org>
>> Cc: Oleg Nesterov <oleg@redhat.com>
>> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Rik van Riel <riel@redhat.com>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: paulmck@linux.vnet.ibm.com
>> Link: http://lkml.kernel.org/r/e8bdc4ed0193fb2fd130f3d6b7b8023e2ec1ab62.1435952415.git.luto@kernel.org
>> Signed-off-by: Ingo Molnar <mingo@kernel.org>
>> ---
>>  arch/x86/kernel/irq.c | 15 +++++++++++++++
>>  1 file changed, 15 insertions(+)
>>
>> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
>> index 88b36648..6233de0 100644
>> --- a/arch/x86/kernel/irq.c
>> +++ b/arch/x86/kernel/irq.c
>> @@ -216,8 +216,23 @@ __visible unsigned int __irq_entry do_IRQ(struct pt_regs *regs)
>>       unsigned vector = ~regs->orig_ax;
>>       unsigned irq;
>>
>> +     /*
>> +      * NB: Unlike exception entries, IRQ entries do not reliably
>> +      * handle context tracking in the low-level entry code.  This is
>> +      * because syscall entries execute briefly with IRQs on before
>> +      * updating context tracking state, so we can take an IRQ from
>> +      * kernel mode with CONTEXT_USER.  The low-level entry code only
>> +      * updates the context if we came from user mode, so we won't
>> +      * switch to CONTEXT_KERNEL.  We'll fix that once the syscall
>> +      * code is cleaned up enough that we can cleanly defer enabling
>> +      * IRQs.
>> +      */
>> +
>
> Now is it a problem to take interrupts in kernel mode with CONTEXT_USER?
> I'm not sure it's worth trying to make it not happen.

It's not currently a problem, but it would be nice if we could do the
equivalent of:

if (user_mode(regs)) {
  user_exit();  (or enter_from_user_mode or whatever)
} else {
  // don't bother -- already in CONTEXT_KERNEL
}

i.e. the same thing that do_general_protection, etc do in -tip.  That
would get rid of any need to store the previous context.

Currently we can't because of syscalls and maybe because of KVM.  KVM
has a weird fake interrupt thing.

>
>>       entering_irq();
>>
>> +     /* entering_irq() tells RCU that we're not quiescent.  Check it. */
>> +     rcu_lockdep_assert(rcu_is_watching(), "IRQ failed to wake up RCU");
>
> Why do we need to check that?

Sanity check.  If we're changing a bunch of context tracking details,
I want to assert that it actually works.

--Andy

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/entry: Add new, comprehensible entry and exit handlers written in C
  2015-07-14 23:07     ` Frederic Weisbecker
@ 2015-07-15 19:56       ` Linus Torvalds
  2015-07-15 20:46         ` Andy Lutomirski
  0 siblings, 1 reply; 70+ messages in thread
From: Linus Torvalds @ 2015-07-15 19:56 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Kees Cook, Peter Zijlstra, Denys Vlasenko, Ingo Molnar,
	Brian Gerst, Andy Lutomirski, Borislav Petkov, Andrew Lutomirski,
	Oleg Nesterov, Peter Anvin, Linux Kernel Mailing List,
	Denys Vlasenko, Thomas Gleixner, Rik van Riel, linux-tip-commits

On Tue, Jul 14, 2015 at 4:07 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> On Tue, Jul 07, 2015 at 03:51:48AM -0700, tip-bot for Andy Lutomirski wrote:
>> +     while (true) {
>> +             u32 cached_flags =
>> +                     READ_ONCE(pt_regs_to_thread_info(regs)->flags);
>> +
>> +             if (!(cached_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_RESUME |
>> +                                   _TIF_UPROBE | _TIF_NEED_RESCHED)))
>> +                     break;
>> +
>> +             /* We have work to do. */
>> +             local_irq_enable();
>> +
>> +             if (cached_flags & _TIF_NEED_RESCHED)
>> +                     schedule();
>> +
>> +             if (cached_flags & _TIF_UPROBE)
>> +                     uprobe_notify_resume(regs);
>> +
>> +             /* deal with pending signal delivery */
>> +             if (cached_flags & _TIF_SIGPENDING)
>> +                     do_signal(regs);
>> +
>> +             if (cached_flags & _TIF_NOTIFY_RESUME) {
>> +                     clear_thread_flag(TIF_NOTIFY_RESUME);
>> +                     tracehook_notify_resume(regs);
>> +             }
>> +
>> +             if (cached_flags & _TIF_USER_RETURN_NOTIFY)
>> +                     fire_user_return_notifiers();
>> +
>> +             /* Disable IRQs and retry */
>> +             local_irq_disable();
>> +     }
>
> I dreamed so many times about this loop in C!

So this made me look at it again, and now I'm worried.

There's that "early break", but it doesn't check
_TIF_USER_RETURN_NOTIFY. So if *only* USER_RETURN_NOTIFY is set, we're
screwed.

It migth be that that doesn't happen for some reason, but I'm not
seeing what that reason would be.

The other thing that worries me is that this depends on all the
handler routines to clear the flags (except for
tracehook_notify_resume()). Which they hopefully do. But that means
that just looking at this locally, it's not at all obvious that it
works right.

So wouldn't it be much nicer to do:

        u32 cached_flags = READ_ONCE(pt_regs_to_thread_info(regs)->flags);

        cached_flags &= _TIF_SIGPENDING | _TIF_NOTIFY_RESUME |
_TIF_USER_RETURN_NOTIFY | _TIF_UPROBE | _TIF_NEED_RESCHED;

        if (!cached_flags)
                break;

        atomic_clear_mask(cached_flags, &pt_regs_to_thread_info(regs)->flags);

and then have those bit tests after that?

                   Linus

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/entry: Add new, comprehensible entry and exit handlers written in C
  2015-07-15 19:56       ` Linus Torvalds
@ 2015-07-15 20:46         ` Andy Lutomirski
  2015-07-15 21:25           ` [PATCH] x86/entry: Fix _TIF_USER_RETURN_NOTIFY check in prepare_exit_to_usermode Andy Lutomirski
  0 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-15 20:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Frederic Weisbecker, Kees Cook, Peter Zijlstra, Denys Vlasenko,
	Ingo Molnar, Brian Gerst, Borislav Petkov, Andrew Lutomirski,
	Oleg Nesterov, Peter Anvin, Linux Kernel Mailing List,
	Denys Vlasenko, Thomas Gleixner, Rik van Riel, linux-tip-commits

On Wed, Jul 15, 2015 at 12:56 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Jul 14, 2015 at 4:07 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
>> On Tue, Jul 07, 2015 at 03:51:48AM -0700, tip-bot for Andy Lutomirski wrote:
>>> +     while (true) {
>>> +             u32 cached_flags =
>>> +                     READ_ONCE(pt_regs_to_thread_info(regs)->flags);
>>> +
>>> +             if (!(cached_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_RESUME |
>>> +                                   _TIF_UPROBE | _TIF_NEED_RESCHED)))
>>> +                     break;
>>> +
>>> +             /* We have work to do. */
>>> +             local_irq_enable();
>>> +
>>> +             if (cached_flags & _TIF_NEED_RESCHED)
>>> +                     schedule();
>>> +
>>> +             if (cached_flags & _TIF_UPROBE)
>>> +                     uprobe_notify_resume(regs);
>>> +
>>> +             /* deal with pending signal delivery */
>>> +             if (cached_flags & _TIF_SIGPENDING)
>>> +                     do_signal(regs);
>>> +
>>> +             if (cached_flags & _TIF_NOTIFY_RESUME) {
>>> +                     clear_thread_flag(TIF_NOTIFY_RESUME);
>>> +                     tracehook_notify_resume(regs);
>>> +             }
>>> +
>>> +             if (cached_flags & _TIF_USER_RETURN_NOTIFY)
>>> +                     fire_user_return_notifiers();
>>> +
>>> +             /* Disable IRQs and retry */
>>> +             local_irq_disable();
>>> +     }
>>
>> I dreamed so many times about this loop in C!
>
> So this made me look at it again, and now I'm worried.
>
> There's that "early break", but it doesn't check
> _TIF_USER_RETURN_NOTIFY. So if *only* USER_RETURN_NOTIFY is set, we're
> screwed.

Crap, that's a bug.  I'll send a patch.

>
> It migth be that that doesn't happen for some reason, but I'm not
> seeing what that reason would be.
>
> The other thing that worries me is that this depends on all the
> handler routines to clear the flags (except for
> tracehook_notify_resume()). Which they hopefully do. But that means
> that just looking at this locally, it's not at all obvious that it
> works right.

The old do_notify_resume work loop worked more or less the same way,
so we should be okay here.  See below.

>
> So wouldn't it be much nicer to do:
>
>         u32 cached_flags = READ_ONCE(pt_regs_to_thread_info(regs)->flags);
>
>         cached_flags &= _TIF_SIGPENDING | _TIF_NOTIFY_RESUME |
> _TIF_USER_RETURN_NOTIFY | _TIF_UPROBE | _TIF_NEED_RESCHED;
>
>         if (!cached_flags)
>                 break;
>
>         atomic_clear_mask(cached_flags, &pt_regs_to_thread_info(regs)->flags);
>
> and then have those bit tests after that?
>

Yes, but it would be a slowdown unless we converted all the various
handlers stopped clearing the bits separately (two atomics instead of
one).  And to do that, we'd probably want to change all the arches,

Signal handling has all the recalc_sigpending stuff.  schedule() had
better clear TIF_NEED_RESCHED.  fire_user_return_notifiers is totally
absurd but it does clear the bit.  uprobes clears the bit directly.

I'd be all for changing this, but coordinating with the generic code
could be annoying.

--Andy

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH] x86/entry: Fix _TIF_USER_RETURN_NOTIFY check in prepare_exit_to_usermode
  2015-07-15 20:46         ` Andy Lutomirski
@ 2015-07-15 21:25           ` Andy Lutomirski
  2015-07-18  3:25             ` [tip:x86/asm] " tip-bot for Andy Lutomirski
  0 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-07-15 21:25 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, Linus Torvalds, Andy Lutomirski

Linus noticed that the early return check was missing
_TIF_USER_RETURN_NOTIFY.  If the only work flag was
_TIF_USER_RETURN_NOTIFY, we'd skip user return notifiers.  Fix it.
(This is the only missing bit.)

This fixes double faults on a KVM host.  It's the same issue as last
time, except that this time it's very easy to trigger.  Apparently no
one uses -next as a KVM host.

(I'm still not quite sure what it is that KVM does that blows up so
 badly if we miss a user return notifier.  My best guess is that KVM
 lets KERNEL_GS_BASE (i.e. the user's gs base) be negative and fixes
 it up in a user return notifier.  If we actually end up in user mode
 with a negative gs base, we blow up pretty badly.)

Fixes: c5c46f59e4e7 ("x86/entry: Add new, comprehensible entry and exit handlers written in C")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/entry/common.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index febc53086a69..a3e9c7fa15d9 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -264,7 +264,8 @@ __visible void prepare_exit_to_usermode(struct pt_regs *regs)
 			READ_ONCE(pt_regs_to_thread_info(regs)->flags);
 
 		if (!(cached_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_RESUME |
-				      _TIF_UPROBE | _TIF_NEED_RESCHED)))
+				      _TIF_UPROBE | _TIF_NEED_RESCHED |
+				      _TIF_USER_RETURN_NOTIFY)))
 			break;
 
 		/* We have work to do. */
-- 
2.4.3


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [tip:x86/asm] x86/entry: Fix _TIF_USER_RETURN_NOTIFY check in prepare_exit_to_usermode
  2015-07-15 21:25           ` [PATCH] x86/entry: Fix _TIF_USER_RETURN_NOTIFY check in prepare_exit_to_usermode Andy Lutomirski
@ 2015-07-18  3:25             ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 70+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-18  3:25 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: tglx, hpa, luto, peterz, linux-kernel, torvalds, mingo

Commit-ID:  d132803e6c611d50c19baedc8ae520203a2baca7
Gitweb:     http://git.kernel.org/tip/d132803e6c611d50c19baedc8ae520203a2baca7
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Wed, 15 Jul 2015 14:25:16 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 17 Jul 2015 16:08:22 +0200

x86/entry: Fix _TIF_USER_RETURN_NOTIFY check in prepare_exit_to_usermode

Linus noticed that the early return check was missing
_TIF_USER_RETURN_NOTIFY.  If the only work flag was
_TIF_USER_RETURN_NOTIFY, we'd skip user return notifiers.  Fix
it. (This is the only missing bit.)

This fixes double faults on a KVM host.  It's the same issue as
last time, except that this time it's very easy to trigger.
Apparently no one uses -next as a KVM host.

( I'm still not quite sure what it is that KVM does that blows up
  so badly if we miss a user return notifier.  My best guess is that KVM
  lets KERNEL_GS_BASE (i.e. the user's gs base) be negative and fixes
  it up in a user return notifier.  If we actually end up in user mode
  with a negative gs base, we blow up pretty badly. )

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: c5c46f59e4e7 ("x86/entry: Add new, comprehensible entry and exit handlers written in C")
Link: http://lkml.kernel.org/r/3f801104d24ee7a6bb1446408d9950777aa63277.1436995419.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/entry/common.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index febc530..a3e9c7f 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -264,7 +264,8 @@ __visible void prepare_exit_to_usermode(struct pt_regs *regs)
 			READ_ONCE(pt_regs_to_thread_info(regs)->flags);
 
 		if (!(cached_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_RESUME |
-				      _TIF_UPROBE | _TIF_NEED_RESCHED)))
+				      _TIF_UPROBE | _TIF_NEED_RESCHED |
+				      _TIF_USER_RETURN_NOTIFY)))
 			break;
 
 		/* We have work to do. */

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/irq, context_tracking: Document how IRQ context tracking works and add an RCU assertion
  2015-07-14 23:33       ` Andy Lutomirski
@ 2015-07-18 13:23         ` Frederic Weisbecker
  2015-07-18 14:10           ` Paul E. McKenney
  0 siblings, 1 reply; 70+ messages in thread
From: Frederic Weisbecker @ 2015-07-18 13:23 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andrew Lutomirski, Ingo Molnar, Kees Cook, Linus Torvalds,
	Peter Zijlstra, Paul McKenney, Oleg Nesterov, H. Peter Anvin,
	linux-kernel, Denys Vlasenko, Rik van Riel, Borislav Petkov,
	Thomas Gleixner, Denys Vlasenko, Brian Gerst, linux-tip-commits

On Tue, Jul 14, 2015 at 04:33:39PM -0700, Andy Lutomirski wrote:
> On Tue, Jul 14, 2015 at 4:26 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > On Tue, Jul 07, 2015 at 03:54:32AM -0700, tip-bot for Andy Lutomirski wrote:
> >> Commit-ID:  0333a209cbf600e980fc55c24878a56f25f48b65
> >> Gitweb:     http://git.kernel.org/tip/0333a209cbf600e980fc55c24878a56f25f48b65
> >> Author:     Andy Lutomirski <luto@kernel.org>
> >> AuthorDate: Fri, 3 Jul 2015 12:44:34 -0700
> >> Committer:  Ingo Molnar <mingo@kernel.org>
> >> CommitDate: Tue, 7 Jul 2015 10:59:10 +0200
> >>
> >> x86/irq, context_tracking: Document how IRQ context tracking works and add an RCU assertion
> >>
> >> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> >> Cc: Andy Lutomirski <luto@amacapital.net>
> >> Cc: Borislav Petkov <bp@alien8.de>
> >> Cc: Brian Gerst <brgerst@gmail.com>
> >> Cc: Denys Vlasenko <dvlasenk@redhat.com>
> >> Cc: Denys Vlasenko <vda.linux@googlemail.com>
> >> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> >> Cc: H. Peter Anvin <hpa@zytor.com>
> >> Cc: Kees Cook <keescook@chromium.org>
> >> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> >> Cc: Oleg Nesterov <oleg@redhat.com>
> >> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> >> Cc: Peter Zijlstra <peterz@infradead.org>
> >> Cc: Rik van Riel <riel@redhat.com>
> >> Cc: Thomas Gleixner <tglx@linutronix.de>
> >> Cc: paulmck@linux.vnet.ibm.com
> >> Link: http://lkml.kernel.org/r/e8bdc4ed0193fb2fd130f3d6b7b8023e2ec1ab62.1435952415.git.luto@kernel.org
> >> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> >> ---
> >>  arch/x86/kernel/irq.c | 15 +++++++++++++++
> >>  1 file changed, 15 insertions(+)
> >>
> >> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> >> index 88b36648..6233de0 100644
> >> --- a/arch/x86/kernel/irq.c
> >> +++ b/arch/x86/kernel/irq.c
> >> @@ -216,8 +216,23 @@ __visible unsigned int __irq_entry do_IRQ(struct pt_regs *regs)
> >>       unsigned vector = ~regs->orig_ax;
> >>       unsigned irq;
> >>
> >> +     /*
> >> +      * NB: Unlike exception entries, IRQ entries do not reliably
> >> +      * handle context tracking in the low-level entry code.  This is
> >> +      * because syscall entries execute briefly with IRQs on before
> >> +      * updating context tracking state, so we can take an IRQ from
> >> +      * kernel mode with CONTEXT_USER.  The low-level entry code only
> >> +      * updates the context if we came from user mode, so we won't
> >> +      * switch to CONTEXT_KERNEL.  We'll fix that once the syscall
> >> +      * code is cleaned up enough that we can cleanly defer enabling
> >> +      * IRQs.
> >> +      */
> >> +
> >
> > Now is it a problem to take interrupts in kernel mode with CONTEXT_USER?
> > I'm not sure it's worth trying to make it not happen.
> 
> It's not currently a problem, but it would be nice if we could do the
> equivalent of:
> 
> if (user_mode(regs)) {
>   user_exit();  (or enter_from_user_mode or whatever)
> } else {
>   // don't bother -- already in CONTEXT_KERNEL
> }

This was the initial implementation of context tracking but it was terribly
buggy. What if we enter the kernel, we haven't yet got a change to call
context_tracking_user_exit() and we get an exception in the kernel entry
path? user_mode(regs) will return the wrong value and bad things happen.

This is why context tracking needs its own tracking state, because we are always
out of sync with the real processor context anyway.

> 
> i.e. the same thing that do_general_protection, etc do in -tip.  That
> would get rid of any need to store the previous context.
> 
> Currently we can't because of syscalls and maybe because of KVM.  KVM
> has a weird fake interrupt thing.
> 
> >
> >>       entering_irq();
> >>
> >> +     /* entering_irq() tells RCU that we're not quiescent.  Check it. */
> >> +     rcu_lockdep_assert(rcu_is_watching(), "IRQ failed to wake up RCU");
> >
> > Why do we need to check that?
> 
> Sanity check.  If we're changing a bunch of context tracking details,
> I want to assert that it actually works.

But we call rcu_irq_enter() right before.

It's more or less like doing:

local_irq_disable();
WARN_ON(!irqs_disabled());

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/irq, context_tracking: Document how IRQ context tracking works and add an RCU assertion
  2015-07-18 13:23         ` Frederic Weisbecker
@ 2015-07-18 14:10           ` Paul E. McKenney
  0 siblings, 0 replies; 70+ messages in thread
From: Paul E. McKenney @ 2015-07-18 14:10 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Andy Lutomirski, Andrew Lutomirski, Ingo Molnar, Kees Cook,
	Linus Torvalds, Peter Zijlstra, Oleg Nesterov, H. Peter Anvin,
	linux-kernel, Denys Vlasenko, Rik van Riel, Borislav Petkov,
	Thomas Gleixner, Denys Vlasenko, Brian Gerst, linux-tip-commits

On Sat, Jul 18, 2015 at 03:23:57PM +0200, Frederic Weisbecker wrote:
> On Tue, Jul 14, 2015 at 04:33:39PM -0700, Andy Lutomirski wrote:
> > On Tue, Jul 14, 2015 at 4:26 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > > On Tue, Jul 07, 2015 at 03:54:32AM -0700, tip-bot for Andy Lutomirski wrote:
> > >> Commit-ID:  0333a209cbf600e980fc55c24878a56f25f48b65
> > >> Gitweb:     http://git.kernel.org/tip/0333a209cbf600e980fc55c24878a56f25f48b65
> > >> Author:     Andy Lutomirski <luto@kernel.org>
> > >> AuthorDate: Fri, 3 Jul 2015 12:44:34 -0700
> > >> Committer:  Ingo Molnar <mingo@kernel.org>
> > >> CommitDate: Tue, 7 Jul 2015 10:59:10 +0200
> > >>
> > >> x86/irq, context_tracking: Document how IRQ context tracking works and add an RCU assertion
> > >>
> > >> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> > >> Cc: Andy Lutomirski <luto@amacapital.net>
> > >> Cc: Borislav Petkov <bp@alien8.de>
> > >> Cc: Brian Gerst <brgerst@gmail.com>
> > >> Cc: Denys Vlasenko <dvlasenk@redhat.com>
> > >> Cc: Denys Vlasenko <vda.linux@googlemail.com>
> > >> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> > >> Cc: H. Peter Anvin <hpa@zytor.com>
> > >> Cc: Kees Cook <keescook@chromium.org>
> > >> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > >> Cc: Oleg Nesterov <oleg@redhat.com>
> > >> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > >> Cc: Peter Zijlstra <peterz@infradead.org>
> > >> Cc: Rik van Riel <riel@redhat.com>
> > >> Cc: Thomas Gleixner <tglx@linutronix.de>
> > >> Cc: paulmck@linux.vnet.ibm.com
> > >> Link: http://lkml.kernel.org/r/e8bdc4ed0193fb2fd130f3d6b7b8023e2ec1ab62.1435952415.git.luto@kernel.org
> > >> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> > >> ---
> > >>  arch/x86/kernel/irq.c | 15 +++++++++++++++
> > >>  1 file changed, 15 insertions(+)
> > >>
> > >> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> > >> index 88b36648..6233de0 100644
> > >> --- a/arch/x86/kernel/irq.c
> > >> +++ b/arch/x86/kernel/irq.c
> > >> @@ -216,8 +216,23 @@ __visible unsigned int __irq_entry do_IRQ(struct pt_regs *regs)
> > >>       unsigned vector = ~regs->orig_ax;
> > >>       unsigned irq;
> > >>
> > >> +     /*
> > >> +      * NB: Unlike exception entries, IRQ entries do not reliably
> > >> +      * handle context tracking in the low-level entry code.  This is
> > >> +      * because syscall entries execute briefly with IRQs on before
> > >> +      * updating context tracking state, so we can take an IRQ from
> > >> +      * kernel mode with CONTEXT_USER.  The low-level entry code only
> > >> +      * updates the context if we came from user mode, so we won't
> > >> +      * switch to CONTEXT_KERNEL.  We'll fix that once the syscall
> > >> +      * code is cleaned up enough that we can cleanly defer enabling
> > >> +      * IRQs.
> > >> +      */
> > >> +
> > >
> > > Now is it a problem to take interrupts in kernel mode with CONTEXT_USER?
> > > I'm not sure it's worth trying to make it not happen.
> > 
> > It's not currently a problem, but it would be nice if we could do the
> > equivalent of:
> > 
> > if (user_mode(regs)) {
> >   user_exit();  (or enter_from_user_mode or whatever)
> > } else {
> >   // don't bother -- already in CONTEXT_KERNEL
> > }
> 
> This was the initial implementation of context tracking but it was terribly
> buggy. What if we enter the kernel, we haven't yet got a change to call
> context_tracking_user_exit() and we get an exception in the kernel entry
> path? user_mode(regs) will return the wrong value and bad things happen.
> 
> This is why context tracking needs its own tracking state, because we are always
> out of sync with the real processor context anyway.
> 
> > 
> > i.e. the same thing that do_general_protection, etc do in -tip.  That
> > would get rid of any need to store the previous context.
> > 
> > Currently we can't because of syscalls and maybe because of KVM.  KVM
> > has a weird fake interrupt thing.
> > 
> > >
> > >>       entering_irq();
> > >>
> > >> +     /* entering_irq() tells RCU that we're not quiescent.  Check it. */
> > >> +     rcu_lockdep_assert(rcu_is_watching(), "IRQ failed to wake up RCU");
> > >
> > > Why do we need to check that?
> > 
> > Sanity check.  If we're changing a bunch of context tracking details,
> > I want to assert that it actually works.
> 
> But we call rcu_irq_enter() right before.
> 
> It's more or less like doing:
> 
> local_irq_disable();
> WARN_ON(!irqs_disabled());

If we end up in a world where RCU sometimes uses context-tracking state
and sometimes uses its own state (for example, for architecture that
do not support context tracking), such a check might make more sense.
It would be all too easy for someone to accidentailly manage to disable
both somehow, and things would sort of work but have strange undebuggable
failure cases.  Sometimes.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code
  2015-07-07 10:53   ` [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code tip-bot for Andy Lutomirski
@ 2015-08-11 22:18     ` Frederic Weisbecker
  2015-08-11 22:25       ` Andy Lutomirski
  2015-08-11 22:38     ` Frederic Weisbecker
  1 sibling, 1 reply; 70+ messages in thread
From: Frederic Weisbecker @ 2015-08-11 22:18 UTC (permalink / raw)
  To: dvlasenk, riel, bp, peterz, brgerst, vda.linux, keescook, tglx,
	oleg, luto, luto, torvalds, mingo, hpa, linux-kernel
  Cc: linux-tip-commits

On Tue, Jul 07, 2015 at 03:53:29AM -0700, tip-bot for Andy Lutomirski wrote:
> Commit-ID:  02bc7768fe447ae305e924b931fa629073a4a1b9
> Gitweb:     http://git.kernel.org/tip/02bc7768fe447ae305e924b931fa629073a4a1b9
> Author:     Andy Lutomirski <luto@kernel.org>
> AuthorDate: Fri, 3 Jul 2015 12:44:31 -0700
> Committer:  Ingo Molnar <mingo@kernel.org>
> CommitDate: Tue, 7 Jul 2015 10:59:08 +0200
> 
> x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code
> 
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Brian Gerst <brgerst@gmail.com>
> Cc: Denys Vlasenko <dvlasenk@redhat.com>
> Cc: Denys Vlasenko <vda.linux@googlemail.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: paulmck@linux.vnet.ibm.com
> Link: http://lkml.kernel.org/r/60e90901eee611e59e958bfdbbe39969b4f88fe5.1435952415.git.luto@kernel.org
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> ---
>  arch/x86/entry/entry_64.S        | 64 +++++++++++-----------------------------
>  arch/x86/entry/entry_64_compat.S |  5 ++++
>  2 files changed, 23 insertions(+), 46 deletions(-)
> 
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 83eb63d..168ee26 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -508,7 +508,16 @@ END(irq_entries_start)
>  
>  	testb	$3, CS(%rsp)
>  	jz	1f
> +
> +	/*
> +	 * IRQ from user mode.  Switch to kernel gsbase and inform context
> +	 * tracking that we're in kernel mode.
> +	 */
>  	SWAPGS
> +#ifdef CONFIG_CONTEXT_TRACKING
> +	call enter_from_user_mode
> +#endif

There have been a lot of patches going there lately so I couldn't follow
everything and since you just started a discussion on context tracking, I
just had a look on the latest change.

So it seems we're now calling user_exit() on IRQ entry. This is not something
we want. We already have everything we need with rcu_irq_enter() and
vtime_account_irq_enter(). user_exit() brings a lot of overhead here that we
don't need. Plus this is called unconditionally since CONFIG_CONTEXT_TRACKING=y
on most distros now.

We really want the context tracking code to be called on syscall slow path only
(and exceptions with static keys but an exception slow path would be desired as well).

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code
  2015-08-11 22:18     ` Frederic Weisbecker
@ 2015-08-11 22:25       ` Andy Lutomirski
  2015-08-11 22:49         ` Frederic Weisbecker
  0 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-08-11 22:25 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Denys Vlasenko, Rik van Riel, Borislav Petkov, Peter Zijlstra,
	Brian Gerst, Denys Vlasenko, Kees Cook, Thomas Gleixner,
	Oleg Nesterov, Andrew Lutomirski, Linus Torvalds, Ingo Molnar,
	H. Peter Anvin, linux-kernel, linux-tip-commits

On Tue, Aug 11, 2015 at 3:18 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> On Tue, Jul 07, 2015 at 03:53:29AM -0700, tip-bot for Andy Lutomirski wrote:
>> Commit-ID:  02bc7768fe447ae305e924b931fa629073a4a1b9
>> Gitweb:     http://git.kernel.org/tip/02bc7768fe447ae305e924b931fa629073a4a1b9
>> Author:     Andy Lutomirski <luto@kernel.org>
>> AuthorDate: Fri, 3 Jul 2015 12:44:31 -0700
>> Committer:  Ingo Molnar <mingo@kernel.org>
>> CommitDate: Tue, 7 Jul 2015 10:59:08 +0200
>>
>> x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code
>>
>> Signed-off-by: Andy Lutomirski <luto@kernel.org>
>> Cc: Andy Lutomirski <luto@amacapital.net>
>> Cc: Borislav Petkov <bp@alien8.de>
>> Cc: Brian Gerst <brgerst@gmail.com>
>> Cc: Denys Vlasenko <dvlasenk@redhat.com>
>> Cc: Denys Vlasenko <vda.linux@googlemail.com>
>> Cc: Frederic Weisbecker <fweisbec@gmail.com>
>> Cc: H. Peter Anvin <hpa@zytor.com>
>> Cc: Kees Cook <keescook@chromium.org>
>> Cc: Linus Torvalds <torvalds@linux-foundation.org>
>> Cc: Oleg Nesterov <oleg@redhat.com>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Rik van Riel <riel@redhat.com>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: paulmck@linux.vnet.ibm.com
>> Link: http://lkml.kernel.org/r/60e90901eee611e59e958bfdbbe39969b4f88fe5.1435952415.git.luto@kernel.org
>> Signed-off-by: Ingo Molnar <mingo@kernel.org>
>> ---
>>  arch/x86/entry/entry_64.S        | 64 +++++++++++-----------------------------
>>  arch/x86/entry/entry_64_compat.S |  5 ++++
>>  2 files changed, 23 insertions(+), 46 deletions(-)
>>
>> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
>> index 83eb63d..168ee26 100644
>> --- a/arch/x86/entry/entry_64.S
>> +++ b/arch/x86/entry/entry_64.S
>> @@ -508,7 +508,16 @@ END(irq_entries_start)
>>
>>       testb   $3, CS(%rsp)
>>       jz      1f
>> +
>> +     /*
>> +      * IRQ from user mode.  Switch to kernel gsbase and inform context
>> +      * tracking that we're in kernel mode.
>> +      */
>>       SWAPGS
>> +#ifdef CONFIG_CONTEXT_TRACKING
>> +     call enter_from_user_mode
>> +#endif
>
> There have been a lot of patches going there lately so I couldn't follow
> everything and since you just started a discussion on context tracking, I
> just had a look on the latest change.
>
> So it seems we're now calling user_exit() on IRQ entry. This is not something
> we want. We already have everything we need with rcu_irq_enter() and
> vtime_account_irq_enter(). user_exit() brings a lot of overhead here that we
> don't need. Plus this is called unconditionally since CONFIG_CONTEXT_TRACKING=y
> on most distros now.
>
> We really want the context tracking code to be called on syscall slow path only
> (and exceptions with static keys but an exception slow path would be desired as well).

Can you explain to me what context tracking does that rcu_irq_enter
and vtime_account_irq_enter don't do that's expensive?  Frankly, I'd
rather drop everything except the context tracking callback.

We also need this for the deletion of exception_enter from the trap
entries to be correct.

Like I said in the other thread, there are too many hooks for arch
code to juggle.  Grumble.

--Andy

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code
  2015-07-07 10:53   ` [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code tip-bot for Andy Lutomirski
  2015-08-11 22:18     ` Frederic Weisbecker
@ 2015-08-11 22:38     ` Frederic Weisbecker
  2015-08-11 22:51       ` Andy Lutomirski
  1 sibling, 1 reply; 70+ messages in thread
From: Frederic Weisbecker @ 2015-08-11 22:38 UTC (permalink / raw)
  To: dvlasenk, riel, bp, peterz, brgerst, vda.linux, keescook, tglx,
	oleg, luto, luto, torvalds, mingo, hpa, linux-kernel
  Cc: linux-tip-commits

On Tue, Jul 07, 2015 at 03:53:29AM -0700, tip-bot for Andy Lutomirski wrote:
> Commit-ID:  02bc7768fe447ae305e924b931fa629073a4a1b9
> Gitweb:     http://git.kernel.org/tip/02bc7768fe447ae305e924b931fa629073a4a1b9
> Author:     Andy Lutomirski <luto@kernel.org>
> AuthorDate: Fri, 3 Jul 2015 12:44:31 -0700
> Committer:  Ingo Molnar <mingo@kernel.org>
> CommitDate: Tue, 7 Jul 2015 10:59:08 +0200
> 
> x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code
> 
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Brian Gerst <brgerst@gmail.com>
> Cc: Denys Vlasenko <dvlasenk@redhat.com>
> Cc: Denys Vlasenko <vda.linux@googlemail.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: paulmck@linux.vnet.ibm.com
> Link: http://lkml.kernel.org/r/60e90901eee611e59e958bfdbbe39969b4f88fe5.1435952415.git.luto@kernel.org
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> ---
>  arch/x86/entry/entry_64.S        | 64 +++++++++++-----------------------------
>  arch/x86/entry/entry_64_compat.S |  5 ++++
>  2 files changed, 23 insertions(+), 46 deletions(-)
> 
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 83eb63d..168ee26 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -1088,7 +1055,12 @@ ENTRY(error_entry)
>  	SWAPGS
>  
>  .Lerror_entry_from_usermode_after_swapgs:
> +#ifdef CONFIG_CONTEXT_TRACKING
> +	call enter_from_user_mode
> +#endif

This makes me very nervous as well!

It means that instead of using the context tracking save/restore model that we had
with exception_enter/exception_exit(), now we rely on the CS register.

I don't think we can do that because our "context tracking" is a soft tracking whereas
CS is hard tracking and both are not atomically synchronized together.

Imagine this situation: we are running in userspace. Context tracking knows it, everything
is fine. Now we do a syscall, we enter in kernel entry code but we trigger an exception
(DEBUG for example) before we got a chance to call user_exit(), which means that the context
tracking code still thinks we are in userspace, so we look at CS from the exception entry code
and it says the exception happened in the kernel. Hence we don't call user_exit() before calling
the exception handler. There is the bug because the exception handler may use RCU which still
thinks we run in userspace.

In early context tracking days we have relied on CS. But I changed that because of such
issue. The only reliable source for soft context tracking is the soft context tracking itself.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code
  2015-08-11 22:25       ` Andy Lutomirski
@ 2015-08-11 22:49         ` Frederic Weisbecker
  2015-08-11 22:59           ` Andy Lutomirski
  0 siblings, 1 reply; 70+ messages in thread
From: Frederic Weisbecker @ 2015-08-11 22:49 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Denys Vlasenko, Rik van Riel, Borislav Petkov, Peter Zijlstra,
	Brian Gerst, Denys Vlasenko, Kees Cook, Thomas Gleixner,
	Oleg Nesterov, Andrew Lutomirski, Linus Torvalds, Ingo Molnar,
	H. Peter Anvin, linux-kernel, linux-tip-commits

On Tue, Aug 11, 2015 at 03:25:04PM -0700, Andy Lutomirski wrote:
> Can you explain to me what context tracking does that rcu_irq_enter
> and vtime_account_irq_enter don't do that's expensive?  Frankly, I'd
> rather drop everything except the context tracking callback.

Irqs have their own hooks in the generic code. irq_enter() and irq_exit().
And those take care of RCU and time accounting already. So arch code really
doesn't need to care about that.

context tracking exists for the sole purpose of tracking states that don't
have generic hooks. Those are syscalls and exceptions.

Besides, rcu_user_exit() is more costly than rcu_irq_enter() which have been
designed for the very purpose of providing a fast RCU tracking for non sleepable
code (which needs rcu_user_exit()).

> 
> We also need this for the deletion of exception_enter from the trap
> entries to be correct.

I'm not sure we can really delete exception_enter(). See my other email.

> Like I said in the other thread, there are too many hooks for arch
> code to juggle.  Grumble.

Well, archs don't need to care about irq hooks. They only need to track
syscalls and exception.

I've been thinking about pushing down syscalls and exceptions to generic
handlers. It might work for syscalls btw. But many exceptions have only
arch handlers, or significant amount of work is done on the arch level
which might make use of RCU (eg: breakpoint handlers on x86).

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code
  2015-08-11 22:38     ` Frederic Weisbecker
@ 2015-08-11 22:51       ` Andy Lutomirski
  2015-08-11 23:22         ` Frederic Weisbecker
  0 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-08-11 22:51 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Denys Vlasenko, Rik van Riel, Borislav Petkov, Peter Zijlstra,
	Brian Gerst, Denys Vlasenko, Kees Cook, Thomas Gleixner,
	Oleg Nesterov, Andrew Lutomirski, Linus Torvalds, Ingo Molnar,
	H. Peter Anvin, linux-kernel, linux-tip-commits

On Tue, Aug 11, 2015 at 3:38 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> On Tue, Jul 07, 2015 at 03:53:29AM -0700, tip-bot for Andy Lutomirski wrote:
>> Commit-ID:  02bc7768fe447ae305e924b931fa629073a4a1b9
>> Gitweb:     http://git.kernel.org/tip/02bc7768fe447ae305e924b931fa629073a4a1b9
>> Author:     Andy Lutomirski <luto@kernel.org>
>> AuthorDate: Fri, 3 Jul 2015 12:44:31 -0700
>> Committer:  Ingo Molnar <mingo@kernel.org>
>> CommitDate: Tue, 7 Jul 2015 10:59:08 +0200
>>
>> x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code
>>
>> Signed-off-by: Andy Lutomirski <luto@kernel.org>
>> Cc: Andy Lutomirski <luto@amacapital.net>
>> Cc: Borislav Petkov <bp@alien8.de>
>> Cc: Brian Gerst <brgerst@gmail.com>
>> Cc: Denys Vlasenko <dvlasenk@redhat.com>
>> Cc: Denys Vlasenko <vda.linux@googlemail.com>
>> Cc: Frederic Weisbecker <fweisbec@gmail.com>
>> Cc: H. Peter Anvin <hpa@zytor.com>
>> Cc: Kees Cook <keescook@chromium.org>
>> Cc: Linus Torvalds <torvalds@linux-foundation.org>
>> Cc: Oleg Nesterov <oleg@redhat.com>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Rik van Riel <riel@redhat.com>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: paulmck@linux.vnet.ibm.com
>> Link: http://lkml.kernel.org/r/60e90901eee611e59e958bfdbbe39969b4f88fe5.1435952415.git.luto@kernel.org
>> Signed-off-by: Ingo Molnar <mingo@kernel.org>
>> ---
>>  arch/x86/entry/entry_64.S        | 64 +++++++++++-----------------------------
>>  arch/x86/entry/entry_64_compat.S |  5 ++++
>>  2 files changed, 23 insertions(+), 46 deletions(-)
>>
>> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
>> index 83eb63d..168ee26 100644
>> --- a/arch/x86/entry/entry_64.S
>> +++ b/arch/x86/entry/entry_64.S
>> @@ -1088,7 +1055,12 @@ ENTRY(error_entry)
>>       SWAPGS
>>
>>  .Lerror_entry_from_usermode_after_swapgs:
>> +#ifdef CONFIG_CONTEXT_TRACKING
>> +     call enter_from_user_mode
>> +#endif
>
> This makes me very nervous as well!
>
> It means that instead of using the context tracking save/restore model that we had
> with exception_enter/exception_exit(), now we rely on the CS register.
>
> I don't think we can do that because our "context tracking" is a soft tracking whereas
> CS is hard tracking and both are not atomically synchronized together.
>
> Imagine this situation: we are running in userspace. Context tracking knows it, everything
> is fine. Now we do a syscall, we enter in kernel entry code but we trigger an exception
> (DEBUG for example) before we got a chance to call user_exit(), which means that the context
> tracking code still thinks we are in userspace, so we look at CS from the exception entry code
> and it says the exception happened in the kernel. Hence we don't call user_exit() before calling
> the exception handler. There is the bug because the exception handler may use RCU which still
> thinks we run in userspace.

#DB doesn't go through this patch -- it uses the paranoid entry path
and ist_enter.  But I see your point.  I think that, if we have a
problem like this in practice, then we should fix it.

But the old code had the same issue.  If we got an exception (the most
likely one is probably a vmalloc fault) during user_exit and we then
hit exception_enter, the result would probably be bad.

>
> In early context tracking days we have relied on CS. But I changed that because of such
> issue. The only reliable source for soft context tracking is the soft context tracking itself.

I don't see why the soft state is more reliable.  The only bad case is
where the entry itself (HW entry up to user_exit) is not atomic
enough, but that path should be at least as atomic as user_exit itself
is.

--Andy

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code
  2015-08-11 22:49         ` Frederic Weisbecker
@ 2015-08-11 22:59           ` Andy Lutomirski
  2015-08-12  1:02             ` Paul E. McKenney
  2015-08-12 13:13             ` Frederic Weisbecker
  0 siblings, 2 replies; 70+ messages in thread
From: Andy Lutomirski @ 2015-08-11 22:59 UTC (permalink / raw)
  To: Frederic Weisbecker, Paul McKenney
  Cc: Denys Vlasenko, Rik van Riel, Borislav Petkov, Peter Zijlstra,
	Brian Gerst, Denys Vlasenko, Kees Cook, Thomas Gleixner,
	Oleg Nesterov, Andrew Lutomirski, Linus Torvalds, Ingo Molnar,
	H. Peter Anvin, linux-kernel, linux-tip-commits

On Tue, Aug 11, 2015 at 3:49 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> On Tue, Aug 11, 2015 at 03:25:04PM -0700, Andy Lutomirski wrote:
>> Can you explain to me what context tracking does that rcu_irq_enter
>> and vtime_account_irq_enter don't do that's expensive?  Frankly, I'd
>> rather drop everything except the context tracking callback.
>
> Irqs have their own hooks in the generic code. irq_enter() and irq_exit().
> And those take care of RCU and time accounting already. So arch code really
> doesn't need to care about that.

I'd love to have irq_enter_from_user and irq_enter_from_kernel instead.

>
> context tracking exists for the sole purpose of tracking states that don't
> have generic hooks. Those are syscalls and exceptions.
>
> Besides, rcu_user_exit() is more costly than rcu_irq_enter() which have been
> designed for the very purpose of providing a fast RCU tracking for non sleepable
> code (which needs rcu_user_exit()).
>

So rcu_user_exit is slower because it's okay to sleep after calling it?

Would it be possible to defer the overhead until we actually try to
sleep rather than doing it on entry?  (I have no idea what's going on
under the hood.)

Anyway, irq_enter_from_user would solve this problem completely.

>
> I've been thinking about pushing down syscalls and exceptions to generic
> handlers. It might work for syscalls btw. But many exceptions have only
> arch handlers, or significant amount of work is done on the arch level
> which might make use of RCU (eg: breakpoint handlers on x86).

I'm trying to port the meat of the x86 syscall code to C.  Maybe the
result will generalize.  The exit code is already in C (in -tip).

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code
  2015-08-11 22:51       ` Andy Lutomirski
@ 2015-08-11 23:22         ` Frederic Weisbecker
  2015-08-11 23:33           ` Andy Lutomirski
  0 siblings, 1 reply; 70+ messages in thread
From: Frederic Weisbecker @ 2015-08-11 23:22 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Denys Vlasenko, Rik van Riel, Borislav Petkov, Peter Zijlstra,
	Brian Gerst, Denys Vlasenko, Kees Cook, Thomas Gleixner,
	Oleg Nesterov, Andrew Lutomirski, Linus Torvalds, Ingo Molnar,
	H. Peter Anvin, linux-kernel, linux-tip-commits


On Tue, Aug 11, 2015 at 03:51:26PM -0700, Andy Lutomirski wrote:
> On Tue, Aug 11, 2015 at 3:38 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> >
> > This makes me very nervous as well!
> >
> > It means that instead of using the context tracking save/restore model that we had
> > with exception_enter/exception_exit(), now we rely on the CS register.
> >
> > I don't think we can do that because our "context tracking" is a soft tracking whereas
> > CS is hard tracking and both are not atomically synchronized together.
> >
> > Imagine this situation: we are running in userspace. Context tracking knows it, everything
> > is fine. Now we do a syscall, we enter in kernel entry code but we trigger an exception
> > (DEBUG for example) before we got a chance to call user_exit(), which means that the context
> > tracking code still thinks we are in userspace, so we look at CS from the exception entry code
> > and it says the exception happened in the kernel. Hence we don't call user_exit() before calling
> > the exception handler. There is the bug because the exception handler may use RCU which still
> > thinks we run in userspace.
> 
> #DB doesn't go through this patch -- it uses the paranoid entry path
> and ist_enter.  But I see your point.  I think that, if we have a
> problem like this in practice, then we should fix it.

Whatever hack we do to prevent from exceptions happening in between real kernel entry
to tracked kernel entry is going to be far less robust than relying strictly on soft
context tracking.

The resulting bugs are rare and very hard to reproduce and diagnose.

> 
> But the old code had the same issue.  If we got an exception (the most
> likely one is probably a vmalloc fault) during user_exit and we then
> hit exception_enter, the result would probably be bad.

We have a recursion protection in context tracking that should protect against
exceptions triggering in the middle of half-set states.

> 
> >
> > In early context tracking days we have relied on CS. But I changed that because of such
> > issue. The only reliable source for soft context tracking is the soft context tracking itself.
> 
> I don't see why the soft state is more reliable.  The only bad case is
> where the entry itself (HW entry up to user_exit) is not atomic
> enough, but that path should be at least as atomic as user_exit itself
> is.

Note it's not only about entry code up to user_exit() but also about
user_enter() up to iret.

Also as long as there is at least one instruction between entry to the kernel
and context tracking noting it, there is a risk for an exception. Hence entry
code will never be atomic enough to avoid this kind of bugs.

Heh if only we had something like local_exception_save()!

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code
  2015-08-11 23:22         ` Frederic Weisbecker
@ 2015-08-11 23:33           ` Andy Lutomirski
  2015-08-12 13:32             ` Frederic Weisbecker
  0 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-08-11 23:33 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Denys Vlasenko, Rik van Riel, Borislav Petkov, Peter Zijlstra,
	Brian Gerst, Denys Vlasenko, Kees Cook, Thomas Gleixner,
	Oleg Nesterov, Andrew Lutomirski, Linus Torvalds, Ingo Molnar,
	H. Peter Anvin, linux-kernel, linux-tip-commits

On Tue, Aug 11, 2015 at 4:22 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
>
> On Tue, Aug 11, 2015 at 03:51:26PM -0700, Andy Lutomirski wrote:
>> On Tue, Aug 11, 2015 at 3:38 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
>> >
>> > This makes me very nervous as well!
>> >
>> > It means that instead of using the context tracking save/restore model that we had
>> > with exception_enter/exception_exit(), now we rely on the CS register.
>> >
>> > I don't think we can do that because our "context tracking" is a soft tracking whereas
>> > CS is hard tracking and both are not atomically synchronized together.
>> >
>> > Imagine this situation: we are running in userspace. Context tracking knows it, everything
>> > is fine. Now we do a syscall, we enter in kernel entry code but we trigger an exception
>> > (DEBUG for example) before we got a chance to call user_exit(), which means that the context
>> > tracking code still thinks we are in userspace, so we look at CS from the exception entry code
>> > and it says the exception happened in the kernel. Hence we don't call user_exit() before calling
>> > the exception handler. There is the bug because the exception handler may use RCU which still
>> > thinks we run in userspace.
>>
>> #DB doesn't go through this patch -- it uses the paranoid entry path
>> and ist_enter.  But I see your point.  I think that, if we have a
>> problem like this in practice, then we should fix it.
>
> Whatever hack we do to prevent from exceptions happening in between real kernel entry
> to tracked kernel entry is going to be far less robust than relying strictly on soft
> context tracking.
>

Why?

Any exception that doesn't leave the context tracking state exactly
the way it found it is buggy.  That means that we need to make sure
that context tracking itself is safe wrt exceptions and that we need
to make sure that any exception that can happen early in entry is
itself safe.

The latter is annoying, but the entry code needs to deal with it
anyway.  For example, any exception early in NMI is currently really
bad.  Non-IST exceptions very early in SYSCALL are fatal.
Non-paranoid exceptions outside swapgs are fatal.  Etc.

> The resulting bugs are rare and very hard to reproduce and diagnose.

That's why I stuck assertions all over the place.  I know of exactly
one case that will trip the assertion, and it's a false positive and I
plan on fixing it soon.

>
>>
>> But the old code had the same issue.  If we got an exception (the most
>> likely one is probably a vmalloc fault) during user_exit and we then
>> hit exception_enter, the result would probably be bad.
>
> We have a recursion protection in context tracking that should protect against
> exceptions triggering in the middle of half-set states.

I sure hope so.  It would be nice to mark it with with nokprobes, etc
if needed, too.

>
>>
>> >
>> > In early context tracking days we have relied on CS. But I changed that because of such
>> > issue. The only reliable source for soft context tracking is the soft context tracking itself.
>>
>> I don't see why the soft state is more reliable.  The only bad case is
>> where the entry itself (HW entry up to user_exit) is not atomic
>> enough, but that path should be at least as atomic as user_exit itself
>> is.
>
> Note it's not only about entry code up to user_exit() but also about
> user_enter() up to iret.
>

We already need to block interrupts there, and the code for exit back
to userspace is very clean in -tip.

> Also as long as there is at least one instruction between entry to the kernel
> and context tracking noting it, there is a risk for an exception. Hence entry
> code will never be atomic enough to avoid this kind of bugs.

By that argument, we're doomed.  Non-IST exceptions outside swapgs are fatal.

>
> Heh if only we had something like local_exception_save()!

What would that mean?

Exceptions aren't magic asynchronous things.  They happen only when
you do something that can trigger an exception.

--Andy

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code
  2015-08-11 22:59           ` Andy Lutomirski
@ 2015-08-12  1:02             ` Paul E. McKenney
  2015-08-12 13:13             ` Frederic Weisbecker
  1 sibling, 0 replies; 70+ messages in thread
From: Paul E. McKenney @ 2015-08-12  1:02 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Frederic Weisbecker, Denys Vlasenko, Rik van Riel,
	Borislav Petkov, Peter Zijlstra, Brian Gerst, Denys Vlasenko,
	Kees Cook, Thomas Gleixner, Oleg Nesterov, Andrew Lutomirski,
	Linus Torvalds, Ingo Molnar, H. Peter Anvin, linux-kernel,
	linux-tip-commits

On Tue, Aug 11, 2015 at 03:59:37PM -0700, Andy Lutomirski wrote:
> On Tue, Aug 11, 2015 at 3:49 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > On Tue, Aug 11, 2015 at 03:25:04PM -0700, Andy Lutomirski wrote:
> >> Can you explain to me what context tracking does that rcu_irq_enter
> >> and vtime_account_irq_enter don't do that's expensive?  Frankly, I'd
> >> rather drop everything except the context tracking callback.
> >
> > Irqs have their own hooks in the generic code. irq_enter() and irq_exit().
> > And those take care of RCU and time accounting already. So arch code really
> > doesn't need to care about that.
> 
> I'd love to have irq_enter_from_user and irq_enter_from_kernel instead.

RCU would need to know about irq_enter_from_user(), but could blithely
ignore irq_enter_from_kernel().  Unless irq_enter_from_kernel() is called
from the idle loop, in which case RCU would need to know.  All that aside,
the overhead of rcu_irq_enter() when called from non-idle kernel mode
should be relatively small.  So just telling RCU about all the interrupts
is actually not a bad strategy.

> > context tracking exists for the sole purpose of tracking states that don't
> > have generic hooks. Those are syscalls and exceptions.
> >
> > Besides, rcu_user_exit() is more costly than rcu_irq_enter() which have been
> > designed for the very purpose of providing a fast RCU tracking for non sleepable
> > code (which needs rcu_user_exit()).
> 
> So rcu_user_exit is slower because it's okay to sleep after calling it?
> 
> Would it be possible to defer the overhead until we actually try to
> sleep rather than doing it on entry?  (I have no idea what's going on
> under the hood.)

Nor do I, at least not until someone tells me what .config they are
using.  NO_HZ_FULL, NO_HZ_FULL_SYSIDLE, and RCU_FAST_NO_HZ make a
difference in this case.

> Anyway, irq_enter_from_user would solve this problem completely.
> 
> >
> > I've been thinking about pushing down syscalls and exceptions to generic
> > handlers. It might work for syscalls btw. But many exceptions have only
> > arch handlers, or significant amount of work is done on the arch level
> > which might make use of RCU (eg: breakpoint handlers on x86).
> 
> I'm trying to port the meat of the x86 syscall code to C.  Maybe the
> result will generalize.  The exit code is already in C (in -tip).

That does sound like a good thing!

							Thanx, Paul


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code
  2015-08-11 22:59           ` Andy Lutomirski
  2015-08-12  1:02             ` Paul E. McKenney
@ 2015-08-12 13:13             ` Frederic Weisbecker
  1 sibling, 0 replies; 70+ messages in thread
From: Frederic Weisbecker @ 2015-08-12 13:13 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Paul McKenney, Denys Vlasenko, Rik van Riel, Borislav Petkov,
	Peter Zijlstra, Brian Gerst, Denys Vlasenko, Kees Cook,
	Thomas Gleixner, Oleg Nesterov, Andrew Lutomirski,
	Linus Torvalds, Ingo Molnar, H. Peter Anvin, linux-kernel,
	linux-tip-commits

On Tue, Aug 11, 2015 at 03:59:37PM -0700, Andy Lutomirski wrote:
> On Tue, Aug 11, 2015 at 3:49 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > On Tue, Aug 11, 2015 at 03:25:04PM -0700, Andy Lutomirski wrote:
> >> Can you explain to me what context tracking does that rcu_irq_enter
> >> and vtime_account_irq_enter don't do that's expensive?  Frankly, I'd
> >> rather drop everything except the context tracking callback.
> >
> > Irqs have their own hooks in the generic code. irq_enter() and irq_exit().
> > And those take care of RCU and time accounting already. So arch code really
> > doesn't need to care about that.
> 
> I'd love to have irq_enter_from_user and irq_enter_from_kernel instead.

I don't get why we need that. Vtime internals already keeps track of where we
are. Again mixing up hard and soft tracking is asking for troubles.

> 
> >
> > context tracking exists for the sole purpose of tracking states that don't
> > have generic hooks. Those are syscalls and exceptions.
> >
> > Besides, rcu_user_exit() is more costly than rcu_irq_enter() which have been
> > designed for the very purpose of providing a fast RCU tracking for non sleepable
> > code (which needs rcu_user_exit()).
> >
> 
> So rcu_user_exit is slower because it's okay to sleep after calling it?
> 
> Would it be possible to defer the overhead until we actually try to
> sleep rather than doing it on entry?  (I have no idea what's going on
> under the hood.)

That's a question for Paul.

> Anyway, irq_enter_from_user would solve this problem completely.

How?

> >
> > I've been thinking about pushing down syscalls and exceptions to generic
> > handlers. It might work for syscalls btw. But many exceptions have only
> > arch handlers, or significant amount of work is done on the arch level
> > which might make use of RCU (eg: breakpoint handlers on x86).
> 
> I'm trying to port the meat of the x86 syscall code to C.  Maybe the
> result will generalize.  The exit code is already in C (in -tip).

But please don't change such semantics along the way, it really doesn't help
to review the x86 low level changes if it's mixed up with fundamental context
tracking changes.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code
  2015-08-11 23:33           ` Andy Lutomirski
@ 2015-08-12 13:32             ` Frederic Weisbecker
  2015-08-12 14:59               ` Andy Lutomirski
  0 siblings, 1 reply; 70+ messages in thread
From: Frederic Weisbecker @ 2015-08-12 13:32 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Denys Vlasenko, Rik van Riel, Borislav Petkov, Peter Zijlstra,
	Brian Gerst, Denys Vlasenko, Kees Cook, Thomas Gleixner,
	Oleg Nesterov, Andrew Lutomirski, Linus Torvalds, Ingo Molnar,
	H. Peter Anvin, linux-kernel, linux-tip-commits

On Tue, Aug 11, 2015 at 04:33:05PM -0700, Andy Lutomirski wrote:
> On Tue, Aug 11, 2015 at 4:22 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> >
> > On Tue, Aug 11, 2015 at 03:51:26PM -0700, Andy Lutomirski wrote:
> >> On Tue, Aug 11, 2015 at 3:38 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> >> >
> >> > This makes me very nervous as well!
> >> >
> >> > It means that instead of using the context tracking save/restore model that we had
> >> > with exception_enter/exception_exit(), now we rely on the CS register.
> >> >
> >> > I don't think we can do that because our "context tracking" is a soft tracking whereas
> >> > CS is hard tracking and both are not atomically synchronized together.
> >> >
> >> > Imagine this situation: we are running in userspace. Context tracking knows it, everything
> >> > is fine. Now we do a syscall, we enter in kernel entry code but we trigger an exception
> >> > (DEBUG for example) before we got a chance to call user_exit(), which means that the context
> >> > tracking code still thinks we are in userspace, so we look at CS from the exception entry code
> >> > and it says the exception happened in the kernel. Hence we don't call user_exit() before calling
> >> > the exception handler. There is the bug because the exception handler may use RCU which still
> >> > thinks we run in userspace.
> >>
> >> #DB doesn't go through this patch -- it uses the paranoid entry path
> >> and ist_enter.  But I see your point.  I think that, if we have a
> >> problem like this in practice, then we should fix it.
> >
> > Whatever hack we do to prevent from exceptions happening in between real kernel entry
> > to tracked kernel entry is going to be far less robust than relying strictly on soft
> > context tracking.
> >
> 
> Why?
> 
> Any exception that doesn't leave the context tracking state exactly
> the way it found it is buggy.  That means that we need to make sure
> that context tracking itself is safe wrt exceptions and that we need
> to make sure that any exception that can happen early in entry is
> itself safe.

Right, and doing it the way we did previously was safe wrt. that.

Can't we have exceptions slow path just like the way we do it in syscalls?

Then the exception slow path would just do:

    if TIF_NOHZ
       ctx = exception_enter()
    exception_handler()
    if TIF_NOHZ
       exception_exit(ctx)

Right now we are calling unconditionally the context tracking code, which is
not good.

> 
> The latter is annoying, but the entry code needs to deal with it
> anyway.  For example, any exception early in NMI is currently really
> bad.  Non-IST exceptions very early in SYSCALL are fatal.
> Non-paranoid exceptions outside swapgs are fatal.  Etc.

Sure but that doesn't mean I'm happy with introducing new fragile path
like those. Especially as we have a way to fix without more overhead.

> 
> > The resulting bugs are rare and very hard to reproduce and diagnose.
> 
> That's why I stuck assertions all over the place.  I know of exactly
> one case that will trip the assertion, and it's a false positive and I
> plan on fixing it soon.
> 
> >
> >>
> >> But the old code had the same issue.  If we got an exception (the most
> >> likely one is probably a vmalloc fault) during user_exit and we then
> >> hit exception_enter, the result would probably be bad.
> >
> > We have a recursion protection in context tracking that should protect against
> > exceptions triggering in the middle of half-set states.
> 
> I sure hope so.  It would be nice to mark it with with nokprobes, etc
> if needed, too.

Sure.

> > Also as long as there is at least one instruction between entry to the kernel
> > and context tracking noting it, there is a risk for an exception. Hence entry
> > code will never be atomic enough to avoid this kind of bugs.
> 
> By that argument, we're doomed.  Non-IST exceptions outside swapgs are fatal.

Does that concern only error_entry() exceptions?

> >
> > Heh if only we had something like local_exception_save()!
> 
> What would that mean?
> 
> Exceptions aren't magic asynchronous things.  They happen only when
> you do something that can trigger an exception.

Sure but, did you really never wish to have such an API? :-p

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code
  2015-08-12 13:32             ` Frederic Weisbecker
@ 2015-08-12 14:59               ` Andy Lutomirski
  2015-08-18 22:34                 ` Frederic Weisbecker
  0 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-08-12 14:59 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Denys Vlasenko, Rik van Riel, Borislav Petkov, Peter Zijlstra,
	Brian Gerst, Denys Vlasenko, Kees Cook, Thomas Gleixner,
	Oleg Nesterov, Andrew Lutomirski, Linus Torvalds, Ingo Molnar,
	H. Peter Anvin, linux-kernel, linux-tip-commits

On Wed, Aug 12, 2015 at 6:32 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> On Tue, Aug 11, 2015 at 04:33:05PM -0700, Andy Lutomirski wrote:
>> On Tue, Aug 11, 2015 at 4:22 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
>> >
>> > On Tue, Aug 11, 2015 at 03:51:26PM -0700, Andy Lutomirski wrote:
>> >> On Tue, Aug 11, 2015 at 3:38 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
>> >> >
>> >> > This makes me very nervous as well!
>> >> >
>> >> > It means that instead of using the context tracking save/restore model that we had
>> >> > with exception_enter/exception_exit(), now we rely on the CS register.
>> >> >
>> >> > I don't think we can do that because our "context tracking" is a soft tracking whereas
>> >> > CS is hard tracking and both are not atomically synchronized together.
>> >> >
>> >> > Imagine this situation: we are running in userspace. Context tracking knows it, everything
>> >> > is fine. Now we do a syscall, we enter in kernel entry code but we trigger an exception
>> >> > (DEBUG for example) before we got a chance to call user_exit(), which means that the context
>> >> > tracking code still thinks we are in userspace, so we look at CS from the exception entry code
>> >> > and it says the exception happened in the kernel. Hence we don't call user_exit() before calling
>> >> > the exception handler. There is the bug because the exception handler may use RCU which still
>> >> > thinks we run in userspace.
>> >>
>> >> #DB doesn't go through this patch -- it uses the paranoid entry path
>> >> and ist_enter.  But I see your point.  I think that, if we have a
>> >> problem like this in practice, then we should fix it.
>> >
>> > Whatever hack we do to prevent from exceptions happening in between real kernel entry
>> > to tracked kernel entry is going to be far less robust than relying strictly on soft
>> > context tracking.
>> >
>>
>> Why?
>>
>> Any exception that doesn't leave the context tracking state exactly
>> the way it found it is buggy.  That means that we need to make sure
>> that context tracking itself is safe wrt exceptions and that we need
>> to make sure that any exception that can happen early in entry is
>> itself safe.
>
> Right, and doing it the way we did previously was safe wrt. that.
>
> Can't we have exceptions slow path just like the way we do it in syscalls?
>
> Then the exception slow path would just do:
>
>     if TIF_NOHZ
>        ctx = exception_enter()
>     exception_handler()
>     if TIF_NOHZ
>        exception_exit(ctx)

What's the purpose of TIF_NOHZ right now?  For syscalls, it makes
sense, but is there any case in which TIF_NOHZ is set on one CPU but
not on another CPU?  It might make sense to get the performance back
using static keys instead of TIF_NOHZ.

If we switched back to exception_enter, we'd have to remember the
previous state, and, with a single exception right now, I think that's
unnecessary.

I think there are only three states we can be in at exception entry:
user (and user_mode(regs)), kernel (and kernel_mode(regs)), or
NMI-like.  In the user case, the new code is correct.  In the kernel
case, the new code is also correct.  In the NMI case (if we're nested
in an NMI or similar entry)) then it is and was the responsibility of
the NMI-like entry to call rcu_nmi_enter(), and things that nest
inside that shouldn't touch context tracking (with the possible
exception of calling rcu_nmi_enter() again).

In current -tip, there's a slight hole in this due to syscalls, and I'll fix it.

>
>>
>> The latter is annoying, but the entry code needs to deal with it
>> anyway.  For example, any exception early in NMI is currently really
>> bad.  Non-IST exceptions very early in SYSCALL are fatal.
>> Non-paranoid exceptions outside swapgs are fatal.  Etc.
>
> Sure but that doesn't mean I'm happy with introducing new fragile path
> like those. Especially as we have a way to fix without more overhead.

I think my approach can work with even less overhead: there are fewer
branches due to checking the previous state.

>> > Also as long as there is at least one instruction between entry to the kernel
>> > and context tracking noting it, there is a risk for an exception. Hence entry
>> > code will never be atomic enough to avoid this kind of bugs.
>>
>> By that argument, we're doomed.  Non-IST exceptions outside swapgs are fatal.
>
> Does that concern only error_entry() exceptions?

Yes, but the set of paranoid_entry exceptions is shrinking.  In -tip, there are:

NMI: NMI is special and will call rcu_nmi_enter().  Nothing's changing here.

MCE: Once upon a time, MCE was simply buggy.  As of 4.0 (IIRC) MCE
from kernel mode calls rcu_nmi_enter().

BP: This is going away, I think.  #BP should stop being special by 4.4.

DB: That's the only weird case.  Patches to prevent instruction
breakpoints in entry code are already in -tip.  The only thing left is
kernel watchpoints, and we need to do something about that.

>
>> >
>> > Heh if only we had something like local_exception_save()!
>>
>> What would that mean?
>>
>> Exceptions aren't magic asynchronous things.  They happen only when
>> you do something that can trigger an exception.
>
> Sure but, did you really never wish to have such an API? :-p

:)

-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code
  2015-08-12 14:59               ` Andy Lutomirski
@ 2015-08-18 22:34                 ` Frederic Weisbecker
  2015-08-18 22:40                   ` Andy Lutomirski
  0 siblings, 1 reply; 70+ messages in thread
From: Frederic Weisbecker @ 2015-08-18 22:34 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Denys Vlasenko, Rik van Riel, Borislav Petkov, Peter Zijlstra,
	Brian Gerst, Denys Vlasenko, Kees Cook, Thomas Gleixner,
	Oleg Nesterov, Andrew Lutomirski, Linus Torvalds, Ingo Molnar,
	H. Peter Anvin, linux-kernel, linux-tip-commits

On Wed, Aug 12, 2015 at 07:59:44AM -0700, Andy Lutomirski wrote:
> On Wed, Aug 12, 2015 at 6:32 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > Right, and doing it the way we did previously was safe wrt. that.
> >
> > Can't we have exceptions slow path just like the way we do it in syscalls?
> >
> > Then the exception slow path would just do:
> >
> >     if TIF_NOHZ
> >        ctx = exception_enter()
> >     exception_handler()
> >     if TIF_NOHZ
> >        exception_exit(ctx)
> 
> What's the purpose of TIF_NOHZ right now?  For syscalls, it makes
> sense, but is there any case in which TIF_NOHZ is set on one CPU but
> not on another CPU?  It might make sense to get the performance back
> using static keys instead of TIF_NOHZ.

Sure if we can manage to do that. The nice thing about TIF flags is that
they are a single check that is always there.

> 
> If we switched back to exception_enter, we'd have to remember the
> previous state, and, with a single exception right now, I think that's
> unnecessary.
> 
> I think there are only three states we can be in at exception entry:
> user (and user_mode(regs)), kernel (and kernel_mode(regs)), or
> NMI-like.

But we can have user && (!user_mode(regs)) if exception happens on exception
entry code.

> In the user case, the new code is correct.  In the kernel
> case, the new code is also correct.  In the NMI case (if we're nested
> in an NMI or similar entry)) then it is and was the responsibility of
> the NMI-like entry to call rcu_nmi_enter(), and things that nest
> inside that shouldn't touch context tracking (with the possible
> exception of calling rcu_nmi_enter() again).
> 
> In current -tip, there's a slight hole in this due to syscalls, and I'll fix it.

There must be a check for context tracking enabled anyway. So why can't
we just just do in exception entry code:

       if (exception_slow_path()) {
           exception_enter()
           exception_handler()
           exception_exit()
       } else {
           normal stuff
       }

Especially if we can manage to implement static keys in ASM, this will sum up to
a single one.

> >> The latter is annoying, but the entry code needs to deal with it
> >> anyway.  For example, any exception early in NMI is currently really
> >> bad.  Non-IST exceptions very early in SYSCALL are fatal.
> >> Non-paranoid exceptions outside swapgs are fatal.  Etc.
> >
> > Sure but that doesn't mean I'm happy with introducing new fragile path
> > like those. Especially as we have a way to fix without more overhead.
> 
> I think my approach can work with even less overhead: there are fewer
> branches due to checking the previous state.
> 
> >> > Also as long as there is at least one instruction between entry to the kernel
> >> > and context tracking noting it, there is a risk for an exception. Hence entry
> >> > code will never be atomic enough to avoid this kind of bugs.
> >>
> >> By that argument, we're doomed.  Non-IST exceptions outside swapgs are fatal.
> >
> > Does that concern only error_entry() exceptions?
> 
> Yes, but the set of paranoid_entry exceptions is shrinking.  In -tip, there are:
> 
> NMI: NMI is special and will call rcu_nmi_enter().  Nothing's changing here.
> 
> MCE: Once upon a time, MCE was simply buggy.  As of 4.0 (IIRC) MCE
> from kernel mode calls rcu_nmi_enter().
> 
> BP: This is going away, I think.  #BP should stop being special by 4.4.
> 
> DB: That's the only weird case.  Patches to prevent instruction
> breakpoints in entry code are already in -tip.  The only thing left is
> kernel watchpoints, and we need to do something about that.

So now we can't set a breakpoint on syscall entry anymore?

I'm still nervous with all that.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code
  2015-08-18 22:34                 ` Frederic Weisbecker
@ 2015-08-18 22:40                   ` Andy Lutomirski
  2015-08-19 17:18                     ` Frederic Weisbecker
  0 siblings, 1 reply; 70+ messages in thread
From: Andy Lutomirski @ 2015-08-18 22:40 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Denys Vlasenko, Rik van Riel, Borislav Petkov, Peter Zijlstra,
	Brian Gerst, Denys Vlasenko, Kees Cook, Thomas Gleixner,
	Oleg Nesterov, Andrew Lutomirski, Linus Torvalds, Ingo Molnar,
	H. Peter Anvin, linux-kernel, linux-tip-commits

On Tue, Aug 18, 2015 at 3:34 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> On Wed, Aug 12, 2015 at 07:59:44AM -0700, Andy Lutomirski wrote:
>> On Wed, Aug 12, 2015 at 6:32 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
>> > Right, and doing it the way we did previously was safe wrt. that.
>> >
>> > Can't we have exceptions slow path just like the way we do it in syscalls?
>> >
>> > Then the exception slow path would just do:
>> >
>> >     if TIF_NOHZ
>> >        ctx = exception_enter()
>> >     exception_handler()
>> >     if TIF_NOHZ
>> >        exception_exit(ctx)
>>
>> What's the purpose of TIF_NOHZ right now?  For syscalls, it makes
>> sense, but is there any case in which TIF_NOHZ is set on one CPU but
>> not on another CPU?  It might make sense to get the performance back
>> using static keys instead of TIF_NOHZ.
>
> Sure if we can manage to do that. The nice thing about TIF flags is that
> they are a single check that is always there.
>

True, although my patch loses that benefit for the fast compat entries
due to the syscall arg fault stuff (what a mess!).

>>
>> If we switched back to exception_enter, we'd have to remember the
>> previous state, and, with a single exception right now, I think that's
>> unnecessary.
>>
>> I think there are only three states we can be in at exception entry:
>> user (and user_mode(regs)), kernel (and kernel_mode(regs)), or
>> NMI-like.
>
> But we can have user && (!user_mode(regs)) if exception happens on exception
> entry code.

I sure hope not, unless it nests inside an NMI-like thing.  It's
conceivable that this might happen due to perf NMIs causing a failed
MSR read or similar.  We might need to relax the assertions to check
that we're either in kernel or NMI context.  If so, that's
straightforward.  Meanwhile no one has reported this happening.

>
>> In the user case, the new code is correct.  In the kernel
>> case, the new code is also correct.  In the NMI case (if we're nested
>> in an NMI or similar entry)) then it is and was the responsibility of
>> the NMI-like entry to call rcu_nmi_enter(), and things that nest
>> inside that shouldn't touch context tracking (with the possible
>> exception of calling rcu_nmi_enter() again).
>>
>> In current -tip, there's a slight hole in this due to syscalls, and I'll fix it.
>
> There must be a check for context tracking enabled anyway. So why can't
> we just just do in exception entry code:
>
>        if (exception_slow_path()) {
>            exception_enter()
>            exception_handler()
>            exception_exit()
>        } else {
>            normal stuff
>        }
>
> Especially if we can manage to implement static keys in ASM, this will sum up to
> a single one.

There isn't really an exception slow path.  There's already a branch
for user vs kernel (in the CPL sense), and with my patches, there's no
additional branch for previous context tracking state.

>
>> >> The latter is annoying, but the entry code needs to deal with it
>> >> anyway.  For example, any exception early in NMI is currently really
>> >> bad.  Non-IST exceptions very early in SYSCALL are fatal.
>> >> Non-paranoid exceptions outside swapgs are fatal.  Etc.
>> >
>> > Sure but that doesn't mean I'm happy with introducing new fragile path
>> > like those. Especially as we have a way to fix without more overhead.
>>
>> I think my approach can work with even less overhead: there are fewer
>> branches due to checking the previous state.
>>
>> >> > Also as long as there is at least one instruction between entry to the kernel
>> >> > and context tracking noting it, there is a risk for an exception. Hence entry
>> >> > code will never be atomic enough to avoid this kind of bugs.
>> >>
>> >> By that argument, we're doomed.  Non-IST exceptions outside swapgs are fatal.
>> >
>> > Does that concern only error_entry() exceptions?
>>
>> Yes, but the set of paranoid_entry exceptions is shrinking.  In -tip, there are:
>>
>> NMI: NMI is special and will call rcu_nmi_enter().  Nothing's changing here.
>>
>> MCE: Once upon a time, MCE was simply buggy.  As of 4.0 (IIRC) MCE
>> from kernel mode calls rcu_nmi_enter().
>>
>> BP: This is going away, I think.  #BP should stop being special by 4.4.
>>
>> DB: That's the only weird case.  Patches to prevent instruction
>> breakpoints in entry code are already in -tip.  The only thing left is
>> kernel watchpoints, and we need to do something about that.
>
> So now we can't set a breakpoint on syscall entry anymore?
>
> I'm still nervous with all that.

We haven't done anything that would make breakpoints on syscall entry
less safe than they were, but we now disallow the breakpoints.  In the
future, we might take advantage of that change.

-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code
  2015-08-18 22:40                   ` Andy Lutomirski
@ 2015-08-19 17:18                     ` Frederic Weisbecker
  2015-08-19 18:02                       ` Andy Lutomirski
  0 siblings, 1 reply; 70+ messages in thread
From: Frederic Weisbecker @ 2015-08-19 17:18 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Denys Vlasenko, Rik van Riel, Borislav Petkov, Peter Zijlstra,
	Brian Gerst, Denys Vlasenko, Kees Cook, Thomas Gleixner,
	Oleg Nesterov, Andrew Lutomirski, Linus Torvalds, Ingo Molnar,
	H. Peter Anvin, linux-kernel, linux-tip-commits

On Tue, Aug 18, 2015 at 03:40:20PM -0700, Andy Lutomirski wrote:
> On Tue, Aug 18, 2015 at 3:34 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> >> If we switched back to exception_enter, we'd have to remember the
> >> previous state, and, with a single exception right now, I think that's
> >> unnecessary.
> >>
> >> I think there are only three states we can be in at exception entry:
> >> user (and user_mode(regs)), kernel (and kernel_mode(regs)), or
> >> NMI-like.
> >
> > But we can have user && (!user_mode(regs)) if exception happens on exception
> > entry code.
> 
> I sure hope not, unless it nests inside an NMI-like thing.  It's
> conceivable that this might happen due to perf NMIs causing a failed
> MSR read or similar.  We might need to relax the assertions to check
> that we're either in kernel or NMI context.  If so, that's
> straightforward.  Meanwhile no one has reported this happening.

But we can still have #DB on entry code right? We blocked breakpoints on entry
code (I still don't get why and it looks to me like an overkill) but we still
have watchpoints.

> 
> >
> >> In the user case, the new code is correct.  In the kernel
> >> case, the new code is also correct.  In the NMI case (if we're nested
> >> in an NMI or similar entry)) then it is and was the responsibility of
> >> the NMI-like entry to call rcu_nmi_enter(), and things that nest
> >> inside that shouldn't touch context tracking (with the possible
> >> exception of calling rcu_nmi_enter() again).
> >>
> >> In current -tip, there's a slight hole in this due to syscalls, and I'll fix it.
> >
> > There must be a check for context tracking enabled anyway. So why can't
> > we just just do in exception entry code:
> >
> >        if (exception_slow_path()) {
> >            exception_enter()
> >            exception_handler()
> >            exception_exit()
> >        } else {
> >            normal stuff
> >        }
> >
> > Especially if we can manage to implement static keys in ASM, this will sum up to
> > a single one.
> 
> There isn't really an exception slow path.  There's already a branch
> for user vs kernel (in the CPL sense), and with my patches, there's no
> additional branch for previous context tracking state.

But an exception slow path based on static key would the most lightweight
thing for context tracking off-case (which is 99.9999% of usecases) and we
would keep it robust (ie: no need to enumerate all the fragile non-possibility
for an exception in entry code to get it safe).

> > So now we can't set a breakpoint on syscall entry anymore?
> >
> > I'm still nervous with all that.
> 
> We haven't done anything that would make breakpoints on syscall entry
> less safe than they were, but we now disallow the breakpoints.  In the
> future, we might take advantage of that change.

I still don't get the reason of that.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code
  2015-08-19 17:18                     ` Frederic Weisbecker
@ 2015-08-19 18:02                       ` Andy Lutomirski
  0 siblings, 0 replies; 70+ messages in thread
From: Andy Lutomirski @ 2015-08-19 18:02 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Denys Vlasenko, Rik van Riel, Borislav Petkov, Peter Zijlstra,
	Brian Gerst, Denys Vlasenko, Kees Cook, Thomas Gleixner,
	Oleg Nesterov, Andrew Lutomirski, Linus Torvalds, Ingo Molnar,
	H. Peter Anvin, linux-kernel, linux-tip-commits

On Wed, Aug 19, 2015 at 10:18 AM, Frederic Weisbecker
<fweisbec@gmail.com> wrote:
> On Tue, Aug 18, 2015 at 03:40:20PM -0700, Andy Lutomirski wrote:
>>
>> I sure hope not, unless it nests inside an NMI-like thing.  It's
>> conceivable that this might happen due to perf NMIs causing a failed
>> MSR read or similar.  We might need to relax the assertions to check
>> that we're either in kernel or NMI context.  If so, that's
>> straightforward.  Meanwhile no one has reported this happening.
>
> But we can still have #DB on entry code right? We blocked breakpoints on entry
> code (I still don't get why and it looks to me like an overkill) but we still
> have watchpoints.

The actual reason is buried in the many threads about NMIs.
Basically, we want to start using RET to return from exceptions to
contexts with IF=0, but we can't do that if we need RF to work
correctly, and we need RF to work correctly if we allow breakpoints in
entry asm (otherwise we risk random infinite loops).  So we're
disallowing breakpoints in entry asm.

> But an exception slow path based on static key would the most lightweight
> thing for context tracking off-case (which is 99.9999% of usecases) and we
> would keep it robust (ie: no need to enumerate all the fragile non-possibility
> for an exception in entry code to get it safe).
>

IRQs work more or less like this in -tip (restructured, but this gets the gist):

if (user_mode(regs)) {
  swapgs;
  enter_from_user_mode;
  do_IRQ;
  prepare_exit_to_usermode;
  swapgs;
  iret;
} else {
  do_IRQ;
  check for preemption;
  iret;
}

In 4.2 and before, the enter_from_user_mode wasn't there, and instead
of calling prepare_exit_to_usermode in a known context
(CONTEXT_KERNEL), we went through the maze of retint_user in an
unknown context.  That meant that we needed things like SCHEDULE_USER
(which had a bug at some point), do_notify_resume (probably had tons
of bugs), etc, and somehow we still needed to end up in CONTEXT_USER
at the end.

I think the new state of affairs is much nicer.  It means that we
finally actually know what state we're in throughout the entry asm.
The only real downsides that I can see are:

1. There's an unnecessary pair of branches due to rcu_irq_enter and
rcu_irq_exit when an IRQ hits user mode.

2. If user_exit is indeed much more expensive than rcu_irq_enter, then
we pay that cost.

If you have suggestions for how to make this faster without making it
uglier, please let me know. :)

--Andy

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v5 08/17] x86/entry: Add enter_from_user_mode and use it in syscalls
  2015-07-03 19:44 ` [PATCH v5 08/17] x86/entry: Add enter_from_user_mode and use it in syscalls Andy Lutomirski
  2015-07-07 10:51   ` [tip:x86/asm] x86/entry: Add enter_from_user_mode() " tip-bot for Andy Lutomirski
@ 2015-12-21 20:50   ` Sasha Levin
  2015-12-21 22:44     ` Andy Lutomirski
  1 sibling, 1 reply; 70+ messages in thread
From: Sasha Levin @ 2015-12-21 20:50 UTC (permalink / raw)
  To: Andy Lutomirski, x86, linux-kernel
  Cc: Frédéric Weisbecker, Rik van Riel, Oleg Nesterov,
	Denys Vlasenko, Borislav Petkov, Kees Cook, Brian Gerst, paulmck

On 07/03/2015 03:44 PM, Andy Lutomirski wrote:
> Changing the x86 context tracking hooks is dangerous because there
> are no good checks that we track our context correctly.  Add a
> helper to check that we're actually in CONTEXT_USER when we enter
> from user mode and wire it up for syscall entries.
> 
> Subsequent patches will wire this up for all non-NMI entries as
> well.  NMIs are their own special beast and cannot currently switch
> overall context tracking state.  Instead, they have their own
> special RCU hooks.
> 
> This is a tiny speedup if !CONFIG_CONTEXT_TRACKING (removes a
> branch) and a tiny slowdown if CONFIG_CONTEXT_TRACING (adds a layer
> of indirection).  Eventually, we should fix up the core context
> tracking code to supply a function that does what we want (and can
> be much simpler than user_exit), which will enable us to get rid of
> the extra call.

Hey Andy,

I see the following warning in today's -next:

[ 2162.706868] ------------[ cut here ]------------
[ 2162.708021] WARNING: CPU: 4 PID: 28801 at arch/x86/entry/common.c:44 enter_from_user_mode+0x1c/0x50()
[ 2162.709466] Modules linked in:
[ 2162.709998] CPU: 4 PID: 28801 Comm: trinity-c375 Tainted: G    B           4.4.0-rc5-next-20151221-sasha-00020-g840272e-dirty #2753
[ 2162.711847]  0000000000000000 00000000f17e6fcd ffff880292d5fe08 ffffffffa4045334
[ 2162.713108]  0000000041b58ab3 ffffffffaf66686b ffffffffa4045289 ffff880292d5fdc0
[ 2162.714544]  0000000000000000 00000000f17e6fcd ffffffffa23cf466 0000000000000004
[ 2162.715793] Call Trace:
[ 2162.716229] dump_stack (lib/dump_stack.c:52)
[ 2162.719021] warn_slowpath_common (kernel/panic.c:484)
[ 2162.721014] warn_slowpath_null (kernel/panic.c:518)
[ 2162.721950] enter_from_user_mode (arch/x86/entry/common.c:44 (discriminator 7) include/linux/context_tracking_state.h:30 (discriminator 7) include/linux/context_tracking.h:30 (discriminator 7) arch/x86/entry/common.c:45 (discriminator 7))
[ 2162.722911] syscall_trace_enter_phase1 (arch/x86/entry/common.c:94)
[ 2162.726914] tracesys (arch/x86/entry/entry_64.S:241)
[ 2162.727704] ---[ end trace 1e5b49c361cbfe8b ]---
[ 2162.728468] BUG: scheduling while atomic: trinity-c375/28801/0x00000401
[ 2162.729517] Modules linked in:
[ 2162.730020] Preemption disabled param_attr_store (kernel/params.c:625)
[ 2162.731304]
[ 2162.731579] CPU: 4 PID: 28801 Comm: trinity-c375 Tainted: G    B   W       4.4.0-rc5-next-20151221-sasha-00020-g840272e-dirty #2753
[ 2162.733432]  0000000000000000 00000000f17e6fcd ffff880292d5fe20 ffffffffa4045334
[ 2162.734778]  0000000041b58ab3 ffffffffaf66686b ffffffffa4045289 ffff880292d5fde0
[ 2162.736036]  fffffffface198f9 00000000f17e6fcd ffff880292d5fe50 0000000000000282
[ 2162.737309] Call Trace:
[ 2162.737718] dump_stack (lib/dump_stack.c:52)
[ 2162.740566] __schedule_bug (kernel/sched/core.c:3102)
[ 2162.741498] __schedule (./arch/x86/include/asm/preempt.h:27 kernel/sched/core.c:3116 kernel/sched/core.c:3225)
[ 2162.742391] schedule (kernel/sched/core.c:3312 (discriminator 1))
[ 2162.743221] exit_to_usermode_loop (arch/x86/entry/common.c:246)
[ 2162.744331] syscall_return_slowpath (arch/x86/entry/common.c:282 include/linux/context_tracking_state.h:30 include/linux/context_tracking.h:24 arch/x86/entry/common.c:284 arch/x86/entry/common.c:344)
[ 2162.745364] int_ret_from_sys_call (arch/x86/entry/entry_64.S:282)


Thanks,
Sasha

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v5 08/17] x86/entry: Add enter_from_user_mode and use it in syscalls
  2015-12-21 20:50   ` [PATCH v5 08/17] x86/entry: Add enter_from_user_mode " Sasha Levin
@ 2015-12-21 22:44     ` Andy Lutomirski
  0 siblings, 0 replies; 70+ messages in thread
From: Andy Lutomirski @ 2015-12-21 22:44 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Andy Lutomirski, X86 ML, linux-kernel,
	Frédéric Weisbecker, Rik van Riel, Oleg Nesterov,
	Denys Vlasenko, Borislav Petkov, Kees Cook, Brian Gerst,
	Paul McKenney

On Mon, Dec 21, 2015 at 12:50 PM, Sasha Levin <sasha.levin@oracle.com> wrote:
> On 07/03/2015 03:44 PM, Andy Lutomirski wrote:
>> Changing the x86 context tracking hooks is dangerous because there
>> are no good checks that we track our context correctly.  Add a
>> helper to check that we're actually in CONTEXT_USER when we enter
>> from user mode and wire it up for syscall entries.
>>
>> Subsequent patches will wire this up for all non-NMI entries as
>> well.  NMIs are their own special beast and cannot currently switch
>> overall context tracking state.  Instead, they have their own
>> special RCU hooks.
>>
>> This is a tiny speedup if !CONFIG_CONTEXT_TRACKING (removes a
>> branch) and a tiny slowdown if CONFIG_CONTEXT_TRACING (adds a layer
>> of indirection).  Eventually, we should fix up the core context
>> tracking code to supply a function that does what we want (and can
>> be much simpler than user_exit), which will enable us to get rid of
>> the extra call.
>
> Hey Andy,
>
> I see the following warning in today's -next:

Weird.  I wonder if you might have hit this while switching context
tracking on a runtime.  (Can you even do that?)

--Andy


>
> [ 2162.706868] ------------[ cut here ]------------
> [ 2162.708021] WARNING: CPU: 4 PID: 28801 at arch/x86/entry/common.c:44 enter_from_user_mode+0x1c/0x50()
> [ 2162.709466] Modules linked in:
> [ 2162.709998] CPU: 4 PID: 28801 Comm: trinity-c375 Tainted: G    B           4.4.0-rc5-next-20151221-sasha-00020-g840272e-dirty #2753
> [ 2162.711847]  0000000000000000 00000000f17e6fcd ffff880292d5fe08 ffffffffa4045334
> [ 2162.713108]  0000000041b58ab3 ffffffffaf66686b ffffffffa4045289 ffff880292d5fdc0
> [ 2162.714544]  0000000000000000 00000000f17e6fcd ffffffffa23cf466 0000000000000004
> [ 2162.715793] Call Trace:
> [ 2162.716229] dump_stack (lib/dump_stack.c:52)
> [ 2162.719021] warn_slowpath_common (kernel/panic.c:484)
> [ 2162.721014] warn_slowpath_null (kernel/panic.c:518)
> [ 2162.721950] enter_from_user_mode (arch/x86/entry/common.c:44 (discriminator 7) include/linux/context_tracking_state.h:30 (discriminator 7) include/linux/context_tracking.h:30 (discriminator 7) arch/x86/entry/common.c:45 (discriminator 7))
> [ 2162.722911] syscall_trace_enter_phase1 (arch/x86/entry/common.c:94)
> [ 2162.726914] tracesys (arch/x86/entry/entry_64.S:241)
> [ 2162.727704] ---[ end trace 1e5b49c361cbfe8b ]---
> [ 2162.728468] BUG: scheduling while atomic: trinity-c375/28801/0x00000401
> [ 2162.729517] Modules linked in:
> [ 2162.730020] Preemption disabled param_attr_store (kernel/params.c:625)
> [ 2162.731304]
> [ 2162.731579] CPU: 4 PID: 28801 Comm: trinity-c375 Tainted: G    B   W       4.4.0-rc5-next-20151221-sasha-00020-g840272e-dirty #2753
> [ 2162.733432]  0000000000000000 00000000f17e6fcd ffff880292d5fe20 ffffffffa4045334
> [ 2162.734778]  0000000041b58ab3 ffffffffaf66686b ffffffffa4045289 ffff880292d5fde0
> [ 2162.736036]  fffffffface198f9 00000000f17e6fcd ffff880292d5fe50 0000000000000282
> [ 2162.737309] Call Trace:
> [ 2162.737718] dump_stack (lib/dump_stack.c:52)
> [ 2162.740566] __schedule_bug (kernel/sched/core.c:3102)
> [ 2162.741498] __schedule (./arch/x86/include/asm/preempt.h:27 kernel/sched/core.c:3116 kernel/sched/core.c:3225)
> [ 2162.742391] schedule (kernel/sched/core.c:3312 (discriminator 1))
> [ 2162.743221] exit_to_usermode_loop (arch/x86/entry/common.c:246)
> [ 2162.744331] syscall_return_slowpath (arch/x86/entry/common.c:282 include/linux/context_tracking_state.h:30 include/linux/context_tracking.h:24 arch/x86/entry/common.c:284 arch/x86/entry/common.c:344)
> [ 2162.745364] int_ret_from_sys_call (arch/x86/entry/entry_64.S:282)
>
>
> Thanks,
> Sasha



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 70+ messages in thread

end of thread, back to index

Thread overview: 70+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-03 19:44 [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Andy Lutomirski
2015-07-03 19:44 ` [PATCH v5 01/17] selftests/x86: Add a test for 32-bit fast syscall arg faults Andy Lutomirski
2015-07-07 10:49   ` [tip:x86/asm] x86/entry, " tip-bot for Andy Lutomirski
2015-07-03 19:44 ` [PATCH v5 02/17] x86/entry/64/compat: Fix bad fast syscall arg failure path Andy Lutomirski
2015-07-07 10:49   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-07-03 19:44 ` [PATCH v5 03/17] uml: Fix do_signal() prototype Andy Lutomirski
2015-07-07 10:49   ` [tip:x86/asm] um: " tip-bot for Ingo Molnar
2015-07-03 19:44 ` [PATCH v5 04/17] context_tracking: Add ct_state and CT_WARN_ON Andy Lutomirski
2015-07-07 10:50   ` [tip:x86/asm] context_tracking: Add ct_state() and CT_WARN_ON() tip-bot for Andy Lutomirski
2015-07-03 19:44 ` [PATCH v5 05/17] notifiers: Assert that RCU is watching in notify_die Andy Lutomirski
2015-07-07 10:50   ` [tip:x86/asm] notifiers, RCU: Assert that RCU is watching in notify_die() tip-bot for Andy Lutomirski
2015-07-03 19:44 ` [PATCH v5 06/17] x86: Move C entry and exit code to arch/x86/entry/common.c Andy Lutomirski
2015-07-07 10:50   ` [tip:x86/asm] x86/entry: Move C entry and exit code to arch/x86/ entry/common.c tip-bot for Andy Lutomirski
2015-07-03 19:44 ` [PATCH v5 07/17] x86/traps: Assert that we're in CONTEXT_KERNEL in exception entries Andy Lutomirski
2015-07-07 10:51   ` [tip:x86/asm] x86/traps, context_tracking: Assert that we' re " tip-bot for Andy Lutomirski
2015-07-03 19:44 ` [PATCH v5 08/17] x86/entry: Add enter_from_user_mode and use it in syscalls Andy Lutomirski
2015-07-07 10:51   ` [tip:x86/asm] x86/entry: Add enter_from_user_mode() " tip-bot for Andy Lutomirski
2015-07-14 23:00     ` Frederic Weisbecker
2015-07-14 23:04       ` Andy Lutomirski
2015-07-14 23:28         ` Frederic Weisbecker
2015-12-21 20:50   ` [PATCH v5 08/17] x86/entry: Add enter_from_user_mode " Sasha Levin
2015-12-21 22:44     ` Andy Lutomirski
2015-07-03 19:44 ` [PATCH v5 09/17] x86/entry: Add new, comprehensible entry and exit hooks Andy Lutomirski
2015-07-07 10:51   ` [tip:x86/asm] x86/entry: Add new, comprehensible entry and exit handlers written in C tip-bot for Andy Lutomirski
2015-07-14 23:07     ` Frederic Weisbecker
2015-07-15 19:56       ` Linus Torvalds
2015-07-15 20:46         ` Andy Lutomirski
2015-07-15 21:25           ` [PATCH] x86/entry: Fix _TIF_USER_RETURN_NOTIFY check in prepare_exit_to_usermode Andy Lutomirski
2015-07-18  3:25             ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-07-03 19:44 ` [PATCH v5 10/17] x86/entry/64: Really create an error-entry-from-usermode code path Andy Lutomirski
2015-07-07 10:52   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-07-03 19:44 ` [PATCH v5 11/17] x86/entry/64: Migrate 64-bit and compat syscalls to new exit hooks Andy Lutomirski
2015-07-07 10:52   ` [tip:x86/asm] x86/entry/64: Migrate 64-bit and compat syscalls to the new exit handlers and remove old assembly code tip-bot for Andy Lutomirski
2015-07-03 19:44 ` [PATCH v5 12/17] x86/asm/entry/64: Save all regs on interrupt entry Andy Lutomirski
2015-07-07 10:52   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2015-07-03 19:44 ` [PATCH v5 13/17] x86/asm/entry/64: Simplify irq stack pt_regs handling Andy Lutomirski
2015-07-07 10:53   ` [tip:x86/asm] x86/asm/entry/64: Simplify IRQ " tip-bot for Andy Lutomirski
2015-07-03 19:44 ` [PATCH v5 14/17] x86/asm/entry/64: Migrate error and interrupt exit work to C Andy Lutomirski
2015-07-07 10:53   ` [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code tip-bot for Andy Lutomirski
2015-08-11 22:18     ` Frederic Weisbecker
2015-08-11 22:25       ` Andy Lutomirski
2015-08-11 22:49         ` Frederic Weisbecker
2015-08-11 22:59           ` Andy Lutomirski
2015-08-12  1:02             ` Paul E. McKenney
2015-08-12 13:13             ` Frederic Weisbecker
2015-08-11 22:38     ` Frederic Weisbecker
2015-08-11 22:51       ` Andy Lutomirski
2015-08-11 23:22         ` Frederic Weisbecker
2015-08-11 23:33           ` Andy Lutomirski
2015-08-12 13:32             ` Frederic Weisbecker
2015-08-12 14:59               ` Andy Lutomirski
2015-08-18 22:34                 ` Frederic Weisbecker
2015-08-18 22:40                   ` Andy Lutomirski
2015-08-19 17:18                     ` Frederic Weisbecker
2015-08-19 18:02                       ` Andy Lutomirski
2015-07-03 19:44 ` [PATCH v5 15/17] x86/entry: Remove exception_enter from most trap handlers Andy Lutomirski
2015-07-07 10:53   ` [tip:x86/asm] x86/entry: Remove exception_enter() " tip-bot for Andy Lutomirski
2015-07-03 19:44 ` [PATCH v5 16/17] x86/entry: Remove SCHEDULE_USER and asm/context-tracking.h Andy Lutomirski
2015-07-07 10:54   ` [tip:x86/asm] x86/entry: Remove SCHEDULE_USER and asm/ context-tracking.h tip-bot for Andy Lutomirski
2015-07-03 19:44 ` [PATCH v5 17/17] x86/irq: Document how IRQ context tracking works and add an assertion Andy Lutomirski
2015-07-07 10:54   ` [tip:x86/asm] x86/irq, context_tracking: Document how IRQ context tracking works and add an RCU assertion tip-bot for Andy Lutomirski
2015-07-14 23:26     ` Frederic Weisbecker
2015-07-14 23:33       ` Andy Lutomirski
2015-07-18 13:23         ` Frederic Weisbecker
2015-07-18 14:10           ` Paul E. McKenney
2015-07-07 11:12 ` [PATCH v5 00/17] x86: Rewrite exit-to-userspace code Ingo Molnar
2015-07-07 16:03   ` Andy Lutomirski
2015-07-07 17:55     ` [PATCH] x86/entry/64: Fix warning on compat syscalls with CONFIG_AUDITSYSCALL=n Andy Lutomirski
2015-07-08  9:57       ` [tip:x86/asm] x86/entry/64: Fix IRQ state confusion and related warning on compat syscalls with CONFIG_AUDITSYSCALL =n tip-bot for Andy Lutomirski
2015-07-08 19:12       ` tip-bot for Andy Lutomirski

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git