* [PATCH v2 1/3] x86/insn-eval: Add support for 64-bit kernel mode @ 2019-11-15 19:17 Jann Horn 2019-11-15 19:17 ` [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP Jann Horn 2019-11-15 19:17 ` [PATCH v2 3/3] x86/kasan: Print original " Jann Horn 0 siblings, 2 replies; 21+ messages in thread From: Jann Horn @ 2019-11-15 19:17 UTC (permalink / raw) To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x86, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, kasan-dev, jannh Cc: linux-kernel, Andrey Konovalov, Andy Lutomirski, Sean Christopherson To support evaluating 64-bit kernel mode instructions: Replace existing checks for user_64bit_mode() with a new helper that checks whether code is being executed in either 64-bit kernel mode or 64-bit user mode. Select the GS base depending on whether the instruction is being evaluated in kernel mode. Signed-off-by: Jann Horn <jannh@google.com> --- Notes: v2: no changes arch/x86/include/asm/ptrace.h | 13 +++++++++++++ arch/x86/lib/insn-eval.c | 26 +++++++++++++++----------- 2 files changed, 28 insertions(+), 11 deletions(-) diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h index 5057a8ed100b..ac45b06941a5 100644 --- a/arch/x86/include/asm/ptrace.h +++ b/arch/x86/include/asm/ptrace.h @@ -159,6 +159,19 @@ static inline bool user_64bit_mode(struct pt_regs *regs) #endif } +/* + * Determine whether the register set came from any context that is running in + * 64-bit mode. + */ +static inline bool any_64bit_mode(struct pt_regs *regs) +{ +#ifdef CONFIG_X86_64 + return !user_mode(regs) || user_64bit_mode(regs); +#else + return false; +#endif +} + #ifdef CONFIG_X86_64 #define current_user_stack_pointer() current_pt_regs()->sp #define compat_user_stack_pointer() current_pt_regs()->sp diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c index 306c3a0902ba..31600d851fd8 100644 --- a/arch/x86/lib/insn-eval.c +++ b/arch/x86/lib/insn-eval.c @@ -155,7 +155,7 @@ static bool check_seg_overrides(struct insn *insn, int regoff) */ static int resolve_default_seg(struct insn *insn, struct pt_regs *regs, int off) { - if (user_64bit_mode(regs)) + if (any_64bit_mode(regs)) return INAT_SEG_REG_IGNORE; /* * Resolve the default segment register as described in Section 3.7.4 @@ -266,7 +266,7 @@ static int resolve_seg_reg(struct insn *insn, struct pt_regs *regs, int regoff) * which may be invalid at this point. */ if (regoff == offsetof(struct pt_regs, ip)) { - if (user_64bit_mode(regs)) + if (any_64bit_mode(regs)) return INAT_SEG_REG_IGNORE; else return INAT_SEG_REG_CS; @@ -289,7 +289,7 @@ static int resolve_seg_reg(struct insn *insn, struct pt_regs *regs, int regoff) * In long mode, segment override prefixes are ignored, except for * overrides for FS and GS. */ - if (user_64bit_mode(regs)) { + if (any_64bit_mode(regs)) { if (idx != INAT_SEG_REG_FS && idx != INAT_SEG_REG_GS) idx = INAT_SEG_REG_IGNORE; @@ -646,23 +646,27 @@ unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx) */ return (unsigned long)(sel << 4); - if (user_64bit_mode(regs)) { + if (any_64bit_mode(regs)) { /* * Only FS or GS will have a base address, the rest of * the segments' bases are forced to 0. */ unsigned long base; - if (seg_reg_idx == INAT_SEG_REG_FS) + if (seg_reg_idx == INAT_SEG_REG_FS) { rdmsrl(MSR_FS_BASE, base); - else if (seg_reg_idx == INAT_SEG_REG_GS) + } else if (seg_reg_idx == INAT_SEG_REG_GS) { /* * swapgs was called at the kernel entry point. Thus, * MSR_KERNEL_GS_BASE will have the user-space GS base. */ - rdmsrl(MSR_KERNEL_GS_BASE, base); - else + if (user_mode(regs)) + rdmsrl(MSR_KERNEL_GS_BASE, base); + else + rdmsrl(MSR_GS_BASE, base); + } else { base = 0; + } return base; } @@ -703,7 +707,7 @@ static unsigned long get_seg_limit(struct pt_regs *regs, int seg_reg_idx) if (sel < 0) return 0; - if (user_64bit_mode(regs) || v8086_mode(regs)) + if (any_64bit_mode(regs) || v8086_mode(regs)) return -1L; if (!sel) @@ -948,7 +952,7 @@ static int get_eff_addr_modrm(struct insn *insn, struct pt_regs *regs, * following instruction. */ if (*regoff == -EDOM) { - if (user_64bit_mode(regs)) + if (any_64bit_mode(regs)) tmp = regs->ip + insn->length; else tmp = 0; @@ -1250,7 +1254,7 @@ static void __user *get_addr_ref_32(struct insn *insn, struct pt_regs *regs) * After computed, the effective address is treated as an unsigned * quantity. */ - if (!user_64bit_mode(regs) && ((unsigned int)eff_addr > seg_limit)) + if (!any_64bit_mode(regs) && ((unsigned int)eff_addr > seg_limit)) goto out; /* -- 2.24.0.432.g9d3f5f5b63-goog ^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP 2019-11-15 19:17 [PATCH v2 1/3] x86/insn-eval: Add support for 64-bit kernel mode Jann Horn @ 2019-11-15 19:17 ` Jann Horn 2019-11-18 14:21 ` Borislav Petkov ` (2 more replies) 2019-11-15 19:17 ` [PATCH v2 3/3] x86/kasan: Print original " Jann Horn 1 sibling, 3 replies; 21+ messages in thread From: Jann Horn @ 2019-11-15 19:17 UTC (permalink / raw) To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x86, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, kasan-dev, jannh Cc: linux-kernel, Andrey Konovalov, Andy Lutomirski, Sean Christopherson A frequent cause of #GP exceptions are memory accesses to non-canonical addresses. Unlike #PF, #GP doesn't come with a fault address in CR2, so the kernel doesn't currently print the fault address for #GP. Luckily, we already have the necessary infrastructure for decoding X86 instructions and computing the memory address that is being accessed; hook it up to the #GP handler so that we can figure out whether the #GP looks like it was caused by a non-canonical address, and if so, print that address. While it is already possible to compute the faulting address manually by disassembling the opcode dump and evaluating the instruction against the register dump, this should make it slightly easier to identify crashes at a glance. Signed-off-by: Jann Horn <jannh@google.com> --- Notes: v2: - print different message for segment-related GP (Borislav) - rewrite check for non-canonical address (Sean) - make it clear we don't know for sure why the GP happened (Andy) arch/x86/kernel/traps.c | 45 +++++++++++++++++++++++++++++++++++++++-- 1 file changed, 43 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index c90312146da0..12d42697a18e 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -56,6 +56,8 @@ #include <asm/mpx.h> #include <asm/vm86.h> #include <asm/umip.h> +#include <asm/insn.h> +#include <asm/insn-eval.h> #ifdef CONFIG_X86_64 #include <asm/x86_init.h> @@ -509,6 +511,38 @@ dotraplinkage void do_bounds(struct pt_regs *regs, long error_code) do_trap(X86_TRAP_BR, SIGSEGV, "bounds", regs, error_code, 0, NULL); } +/* + * On 64-bit, if an uncaught #GP occurs while dereferencing a non-canonical + * address, print that address. + */ +static void print_kernel_gp_address(struct pt_regs *regs) +{ +#ifdef CONFIG_X86_64 + u8 insn_bytes[MAX_INSN_SIZE]; + struct insn insn; + unsigned long addr_ref; + + if (probe_kernel_read(insn_bytes, (void *)regs->ip, MAX_INSN_SIZE)) + return; + + kernel_insn_init(&insn, insn_bytes, MAX_INSN_SIZE); + insn_get_modrm(&insn); + insn_get_sib(&insn); + addr_ref = (unsigned long)insn_get_addr_ref(&insn, regs); + + /* Bail out if insn_get_addr_ref() failed or we got a kernel address. */ + if (addr_ref >= ~__VIRTUAL_MASK) + return; + + /* Bail out if the entire operand is in the canonical user half. */ + if (addr_ref + insn.opnd_bytes - 1 <= __VIRTUAL_MASK) + return; + + pr_alert("probably dereferencing non-canonical address 0x%016lx\n", + addr_ref); +#endif +} + dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code) { @@ -547,8 +581,15 @@ do_general_protection(struct pt_regs *regs, long error_code) return; if (notify_die(DIE_GPF, desc, regs, error_code, - X86_TRAP_GP, SIGSEGV) != NOTIFY_STOP) - die(desc, regs, error_code); + X86_TRAP_GP, SIGSEGV) == NOTIFY_STOP) + return; + + if (error_code) + pr_alert("GPF is segment-related (see error code)\n"); + else + print_kernel_gp_address(regs); + + die(desc, regs, error_code); return; } -- 2.24.0.432.g9d3f5f5b63-goog ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP 2019-11-15 19:17 ` [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP Jann Horn @ 2019-11-18 14:21 ` Borislav Petkov 2019-11-18 16:02 ` Dmitry Vyukov 2019-11-20 4:25 ` Andi Kleen 2019-11-23 23:07 ` Andy Lutomirski 2 siblings, 1 reply; 21+ messages in thread From: Borislav Petkov @ 2019-11-18 14:21 UTC (permalink / raw) To: Jann Horn Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, kasan-dev, linux-kernel, Andrey Konovalov, Andy Lutomirski, Sean Christopherson On Fri, Nov 15, 2019 at 08:17:27PM +0100, Jann Horn wrote: > dotraplinkage void > do_general_protection(struct pt_regs *regs, long error_code) > { > @@ -547,8 +581,15 @@ do_general_protection(struct pt_regs *regs, long error_code) > return; > > if (notify_die(DIE_GPF, desc, regs, error_code, > - X86_TRAP_GP, SIGSEGV) != NOTIFY_STOP) > - die(desc, regs, error_code); > + X86_TRAP_GP, SIGSEGV) == NOTIFY_STOP) > + return; > + > + if (error_code) > + pr_alert("GPF is segment-related (see error code)\n"); > + else > + print_kernel_gp_address(regs); > + > + die(desc, regs, error_code); Right, this way, those messages appear before the main "general protection ..." message: [ 2.434372] traps: probably dereferencing non-canonical address 0xdfff000000000001 [ 2.442492] general protection fault: 0000 [#1] PREEMPT SMP Can we glue/merge them together? Or is this going to confuse tools too much: [ 2.542218] general protection fault while derefing a non-canonical address 0xdfff000000000001: 0000 [#1] PREEMPT SMP (and that sentence could be shorter too: "general protection fault for non-canonical address 0xdfff000000000001" looks ok to me too.) Here's a dirty diff together with a reproducer ontop of yours: --- diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index bf796f8c9998..dab702ba28a6 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -515,7 +515,7 @@ dotraplinkage void do_bounds(struct pt_regs *regs, long error_code) * On 64-bit, if an uncaught #GP occurs while dereferencing a non-canonical * address, print that address. */ -static void print_kernel_gp_address(struct pt_regs *regs) +static unsigned long get_kernel_gp_address(struct pt_regs *regs) { #ifdef CONFIG_X86_64 u8 insn_bytes[MAX_INSN_SIZE]; @@ -523,7 +523,7 @@ static void print_kernel_gp_address(struct pt_regs *regs) unsigned long addr_ref; if (probe_kernel_read(insn_bytes, (void *)regs->ip, MAX_INSN_SIZE)) - return; + return 0; kernel_insn_init(&insn, insn_bytes, MAX_INSN_SIZE); insn_get_modrm(&insn); @@ -532,22 +532,22 @@ static void print_kernel_gp_address(struct pt_regs *regs) /* Bail out if insn_get_addr_ref() failed or we got a kernel address. */ if (addr_ref >= ~__VIRTUAL_MASK) - return; + return 0; /* Bail out if the entire operand is in the canonical user half. */ if (addr_ref + insn.opnd_bytes - 1 <= __VIRTUAL_MASK) - return; + return 0; - pr_alert("probably dereferencing non-canonical address 0x%016lx\n", - addr_ref); + return addr_ref; #endif } +#define GPFSTR "general protection fault" dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code) { - const char *desc = "general protection fault"; struct task_struct *tsk; + char desc[90]; RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU"); cond_local_irq_enable(regs); @@ -584,12 +584,18 @@ do_general_protection(struct pt_regs *regs, long error_code) X86_TRAP_GP, SIGSEGV) == NOTIFY_STOP) return; - if (error_code) - pr_alert("GPF is segment-related (see error code)\n"); - else - print_kernel_gp_address(regs); + if (error_code) { + snprintf(desc, 90, "segment-related " GPFSTR); + } else { + unsigned long addr_ref = get_kernel_gp_address(regs); + + if (addr_ref) + snprintf(desc, 90, GPFSTR " while derefing a non-canonical address 0x%lx", addr_ref); + else + snprintf(desc, 90, GPFSTR); + } - die(desc, regs, error_code); + die((const char *)desc, regs, error_code); return; } diff --git a/init/main.c b/init/main.c index 91f6ebb30ef0..7acc7e660be9 100644 --- a/init/main.c +++ b/init/main.c @@ -1124,6 +1124,9 @@ static int __ref kernel_init(void *unused) rcu_end_inkernel_boot(); + asm volatile("mov $0xdfff000000000001, %rax\n\t" + "jmpq *%rax\n\t"); + if (ramdisk_execute_command) { ret = run_init_process(ramdisk_execute_command); if (!ret) -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP 2019-11-18 14:21 ` Borislav Petkov @ 2019-11-18 16:02 ` Dmitry Vyukov 2019-11-18 16:19 ` Jann Horn 0 siblings, 1 reply; 21+ messages in thread From: Dmitry Vyukov @ 2019-11-18 16:02 UTC (permalink / raw) To: Borislav Petkov Cc: Jann Horn, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, the arch/x86 maintainers, Andrey Ryabinin, Alexander Potapenko, kasan-dev, LKML, Andrey Konovalov, Andy Lutomirski, Sean Christopherson On Mon, Nov 18, 2019 at 3:21 PM Borislav Petkov <bp@alien8.de> wrote: > > On Fri, Nov 15, 2019 at 08:17:27PM +0100, Jann Horn wrote: > > dotraplinkage void > > do_general_protection(struct pt_regs *regs, long error_code) > > { > > @@ -547,8 +581,15 @@ do_general_protection(struct pt_regs *regs, long error_code) > > return; > > > > if (notify_die(DIE_GPF, desc, regs, error_code, > > - X86_TRAP_GP, SIGSEGV) != NOTIFY_STOP) > > - die(desc, regs, error_code); > > + X86_TRAP_GP, SIGSEGV) == NOTIFY_STOP) > > + return; > > + > > + if (error_code) > > + pr_alert("GPF is segment-related (see error code)\n"); > > + else > > + print_kernel_gp_address(regs); > > + > > + die(desc, regs, error_code); > > Right, this way, those messages appear before the main "general > protection ..." message: > > [ 2.434372] traps: probably dereferencing non-canonical address 0xdfff000000000001 > [ 2.442492] general protection fault: 0000 [#1] PREEMPT SMP > > Can we glue/merge them together? Or is this going to confuse tools too much: > > [ 2.542218] general protection fault while derefing a non-canonical address 0xdfff000000000001: 0000 [#1] PREEMPT SMP > > (and that sentence could be shorter too: > > "general protection fault for non-canonical address 0xdfff000000000001" > > looks ok to me too.) This exact form will confuse syzkaller crash parsing for Linux kernel: https://github.com/google/syzkaller/blob/1daed50ac33511e1a107228a9c3b80e5c4aebb5c/pkg/report/linux.go#L1347 It expects a "general protection fault:" line for these crashes. A graceful way to update kernel crash messages would be to add more tests with the new format here: https://github.com/google/syzkaller/tree/1daed50ac33511e1a107228a9c3b80e5c4aebb5c/pkg/report/testdata/linux/report Update parsing code. Roll out new version. Update all other testing systems that detect and parse kernel crashes. Then commit kernel changes. An unfortunate consequence of offloading testing to third-party systems... > Here's a dirty diff together with a reproducer ontop of yours: > > --- > diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c > index bf796f8c9998..dab702ba28a6 100644 > --- a/arch/x86/kernel/traps.c > +++ b/arch/x86/kernel/traps.c > @@ -515,7 +515,7 @@ dotraplinkage void do_bounds(struct pt_regs *regs, long error_code) > * On 64-bit, if an uncaught #GP occurs while dereferencing a non-canonical > * address, print that address. > */ > -static void print_kernel_gp_address(struct pt_regs *regs) > +static unsigned long get_kernel_gp_address(struct pt_regs *regs) > { > #ifdef CONFIG_X86_64 > u8 insn_bytes[MAX_INSN_SIZE]; > @@ -523,7 +523,7 @@ static void print_kernel_gp_address(struct pt_regs *regs) > unsigned long addr_ref; > > if (probe_kernel_read(insn_bytes, (void *)regs->ip, MAX_INSN_SIZE)) > - return; > + return 0; > > kernel_insn_init(&insn, insn_bytes, MAX_INSN_SIZE); > insn_get_modrm(&insn); > @@ -532,22 +532,22 @@ static void print_kernel_gp_address(struct pt_regs *regs) > > /* Bail out if insn_get_addr_ref() failed or we got a kernel address. */ > if (addr_ref >= ~__VIRTUAL_MASK) > - return; > + return 0; > > /* Bail out if the entire operand is in the canonical user half. */ > if (addr_ref + insn.opnd_bytes - 1 <= __VIRTUAL_MASK) > - return; > + return 0; > > - pr_alert("probably dereferencing non-canonical address 0x%016lx\n", > - addr_ref); > + return addr_ref; > #endif > } > > +#define GPFSTR "general protection fault" > dotraplinkage void > do_general_protection(struct pt_regs *regs, long error_code) > { > - const char *desc = "general protection fault"; > struct task_struct *tsk; > + char desc[90]; > > RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU"); > cond_local_irq_enable(regs); > @@ -584,12 +584,18 @@ do_general_protection(struct pt_regs *regs, long error_code) > X86_TRAP_GP, SIGSEGV) == NOTIFY_STOP) > return; > > - if (error_code) > - pr_alert("GPF is segment-related (see error code)\n"); > - else > - print_kernel_gp_address(regs); > + if (error_code) { > + snprintf(desc, 90, "segment-related " GPFSTR); > + } else { > + unsigned long addr_ref = get_kernel_gp_address(regs); > + > + if (addr_ref) > + snprintf(desc, 90, GPFSTR " while derefing a non-canonical address 0x%lx", addr_ref); > + else > + snprintf(desc, 90, GPFSTR); > + } > > - die(desc, regs, error_code); > + die((const char *)desc, regs, error_code); > return; > } > > diff --git a/init/main.c b/init/main.c > index 91f6ebb30ef0..7acc7e660be9 100644 > --- a/init/main.c > +++ b/init/main.c > @@ -1124,6 +1124,9 @@ static int __ref kernel_init(void *unused) > > rcu_end_inkernel_boot(); > > + asm volatile("mov $0xdfff000000000001, %rax\n\t" > + "jmpq *%rax\n\t"); > + > if (ramdisk_execute_command) { > ret = run_init_process(ramdisk_execute_command); > if (!ret) > > -- > Regards/Gruss, > Boris. > > https://people.kernel.org/tglx/notes-about-netiquette > > -- > You received this message because you are subscribed to the Google Groups "kasan-dev" group. > To unsubscribe from this group and stop receiving emails from it, send an email to kasan-dev+unsubscribe@googlegroups.com. > To view this discussion on the web visit https://groups.google.com/d/msgid/kasan-dev/20191118142144.GC6363%40zn.tnic. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP 2019-11-18 16:02 ` Dmitry Vyukov @ 2019-11-18 16:19 ` Jann Horn 2019-11-18 16:29 ` Dmitry Vyukov 0 siblings, 1 reply; 21+ messages in thread From: Jann Horn @ 2019-11-18 16:19 UTC (permalink / raw) To: Dmitry Vyukov Cc: Borislav Petkov, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, the arch/x86 maintainers, Andrey Ryabinin, Alexander Potapenko, kasan-dev, LKML, Andrey Konovalov, Andy Lutomirski, Sean Christopherson On Mon, Nov 18, 2019 at 5:03 PM Dmitry Vyukov <dvyukov@google.com> wrote: > On Mon, Nov 18, 2019 at 3:21 PM Borislav Petkov <bp@alien8.de> wrote: > > > > On Fri, Nov 15, 2019 at 08:17:27PM +0100, Jann Horn wrote: > > > dotraplinkage void > > > do_general_protection(struct pt_regs *regs, long error_code) > > > { > > > @@ -547,8 +581,15 @@ do_general_protection(struct pt_regs *regs, long error_code) > > > return; > > > > > > if (notify_die(DIE_GPF, desc, regs, error_code, > > > - X86_TRAP_GP, SIGSEGV) != NOTIFY_STOP) > > > - die(desc, regs, error_code); > > > + X86_TRAP_GP, SIGSEGV) == NOTIFY_STOP) > > > + return; > > > + > > > + if (error_code) > > > + pr_alert("GPF is segment-related (see error code)\n"); > > > + else > > > + print_kernel_gp_address(regs); > > > + > > > + die(desc, regs, error_code); > > > > Right, this way, those messages appear before the main "general > > protection ..." message: > > > > [ 2.434372] traps: probably dereferencing non-canonical address 0xdfff000000000001 > > [ 2.442492] general protection fault: 0000 [#1] PREEMPT SMP > > > > Can we glue/merge them together? Or is this going to confuse tools too much: > > > > [ 2.542218] general protection fault while derefing a non-canonical address 0xdfff000000000001: 0000 [#1] PREEMPT SMP > > > > (and that sentence could be shorter too: > > > > "general protection fault for non-canonical address 0xdfff000000000001" > > > > looks ok to me too.) > > This exact form will confuse syzkaller crash parsing for Linux kernel: > https://github.com/google/syzkaller/blob/1daed50ac33511e1a107228a9c3b80e5c4aebb5c/pkg/report/linux.go#L1347 > It expects a "general protection fault:" line for these crashes. > > A graceful way to update kernel crash messages would be to add more > tests with the new format here: > https://github.com/google/syzkaller/tree/1daed50ac33511e1a107228a9c3b80e5c4aebb5c/pkg/report/testdata/linux/report > Update parsing code. Roll out new version. Update all other testing > systems that detect and parse kernel crashes. Then commit kernel > changes. So for syzkaller, it'd be fine as long as we keep the colon there? Something like: general protection fault: derefing non-canonical address 0xdfff000000000001: 0000 [#1] PREEMPT SMP And it looks like the 0day test bot doesn't have any specific pattern for #GP, it seems to just look for the panic triggered by panic-on-oops as far as I can tell (oops=panic in lkp-exec/qemu, no "general protection fault" in etc/dmesg-kill-pattern). > An unfortunate consequence of offloading testing to third-party systems... And of not having a standard way to signal "this line starts something that should be reported as a bug"? Maybe as a longer-term idea, it'd help to have some sort of extra prefix byte that the kernel can print to say "here comes a bug report, first line should be the subject", or something like that, similar to how we have loglevels... ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP 2019-11-18 16:19 ` Jann Horn @ 2019-11-18 16:29 ` Dmitry Vyukov 2019-11-18 16:40 ` error attribution for stalls [was: [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP] Jann Horn 2019-11-18 16:44 ` [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP Borislav Petkov 0 siblings, 2 replies; 21+ messages in thread From: Dmitry Vyukov @ 2019-11-18 16:29 UTC (permalink / raw) To: Jann Horn Cc: Borislav Petkov, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, the arch/x86 maintainers, Andrey Ryabinin, Alexander Potapenko, kasan-dev, LKML, Andrey Konovalov, Andy Lutomirski, Sean Christopherson On Mon, Nov 18, 2019 at 5:20 PM 'Jann Horn' via kasan-dev <kasan-dev@googlegroups.com> wrote: > > On Mon, Nov 18, 2019 at 5:03 PM Dmitry Vyukov <dvyukov@google.com> wrote: > > On Mon, Nov 18, 2019 at 3:21 PM Borislav Petkov <bp@alien8.de> wrote: > > > > > > On Fri, Nov 15, 2019 at 08:17:27PM +0100, Jann Horn wrote: > > > > dotraplinkage void > > > > do_general_protection(struct pt_regs *regs, long error_code) > > > > { > > > > @@ -547,8 +581,15 @@ do_general_protection(struct pt_regs *regs, long error_code) > > > > return; > > > > > > > > if (notify_die(DIE_GPF, desc, regs, error_code, > > > > - X86_TRAP_GP, SIGSEGV) != NOTIFY_STOP) > > > > - die(desc, regs, error_code); > > > > + X86_TRAP_GP, SIGSEGV) == NOTIFY_STOP) > > > > + return; > > > > + > > > > + if (error_code) > > > > + pr_alert("GPF is segment-related (see error code)\n"); > > > > + else > > > > + print_kernel_gp_address(regs); > > > > + > > > > + die(desc, regs, error_code); > > > > > > Right, this way, those messages appear before the main "general > > > protection ..." message: > > > > > > [ 2.434372] traps: probably dereferencing non-canonical address 0xdfff000000000001 > > > [ 2.442492] general protection fault: 0000 [#1] PREEMPT SMP > > > > > > Can we glue/merge them together? Or is this going to confuse tools too much: > > > > > > [ 2.542218] general protection fault while derefing a non-canonical address 0xdfff000000000001: 0000 [#1] PREEMPT SMP > > > > > > (and that sentence could be shorter too: > > > > > > "general protection fault for non-canonical address 0xdfff000000000001" > > > > > > looks ok to me too.) > > > > This exact form will confuse syzkaller crash parsing for Linux kernel: > > https://github.com/google/syzkaller/blob/1daed50ac33511e1a107228a9c3b80e5c4aebb5c/pkg/report/linux.go#L1347 > > It expects a "general protection fault:" line for these crashes. > > > > A graceful way to update kernel crash messages would be to add more > > tests with the new format here: > > https://github.com/google/syzkaller/tree/1daed50ac33511e1a107228a9c3b80e5c4aebb5c/pkg/report/testdata/linux/report > > Update parsing code. Roll out new version. Update all other testing > > systems that detect and parse kernel crashes. Then commit kernel > > changes. > > So for syzkaller, it'd be fine as long as we keep the colon there? > Something like: > > general protection fault: derefing non-canonical address > 0xdfff000000000001: 0000 [#1] PREEMPT SMP Probably. Tests help a lot to answer such questions ;) But presumably it should break parsing. > And it looks like the 0day test bot doesn't have any specific pattern > for #GP, it seems to just look for the panic triggered by > panic-on-oops as far as I can tell (oops=panic in lkp-exec/qemu, no > "general protection fault" in etc/dmesg-kill-pattern). > > > An unfortunate consequence of offloading testing to third-party systems... > > And of not having a standard way to signal "this line starts something > that should be reported as a bug"? Maybe as a longer-term idea, it'd > help to have some sort of extra prefix byte that the kernel can print > to say "here comes a bug report, first line should be the subject", or > something like that, similar to how we have loglevels... This would be great. Also a way to denote crash end. However we have lots of special logic for subjects, not sure if kernel could provide good subject: https://github.com/google/syzkaller/blob/1daed50ac33511e1a107228a9c3b80e5c4aebb5c/pkg/report/linux.go#L537-L1588 Probably it could, but it won't be completely trivial. E.g. if there is a stall inside of a timer function, it should give the name of the actual timer callback as identity ("stall in timer_subsystem_foo"). Or for syscalls we use more disambiguation b/c "in sys_ioclt" is not much different than saying "there is a bug in kernel" :) ^ permalink raw reply [flat|nested] 21+ messages in thread
* error attribution for stalls [was: [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP] 2019-11-18 16:29 ` Dmitry Vyukov @ 2019-11-18 16:40 ` Jann Horn 2019-11-18 16:44 ` [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP Borislav Petkov 1 sibling, 0 replies; 21+ messages in thread From: Jann Horn @ 2019-11-18 16:40 UTC (permalink / raw) To: Dmitry Vyukov Cc: Borislav Petkov, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, the arch/x86 maintainers, Andrey Ryabinin, Alexander Potapenko, kasan-dev, LKML, Andrey Konovalov, Andy Lutomirski, Sean Christopherson On Mon, Nov 18, 2019 at 5:29 PM Dmitry Vyukov <dvyukov@google.com> wrote: > On Mon, Nov 18, 2019 at 5:20 PM 'Jann Horn' via kasan-dev > <kasan-dev@googlegroups.com> wrote: > > On Mon, Nov 18, 2019 at 5:03 PM Dmitry Vyukov <dvyukov@google.com> wrote: > > > This exact form will confuse syzkaller crash parsing for Linux kernel: > > > https://github.com/google/syzkaller/blob/1daed50ac33511e1a107228a9c3b80e5c4aebb5c/pkg/report/linux.go#L1347 > > > It expects a "general protection fault:" line for these crashes. > > > > > > A graceful way to update kernel crash messages would be to add more > > > tests with the new format here: > > > https://github.com/google/syzkaller/tree/1daed50ac33511e1a107228a9c3b80e5c4aebb5c/pkg/report/testdata/linux/report > > > Update parsing code. Roll out new version. Update all other testing > > > systems that detect and parse kernel crashes. Then commit kernel > > > changes. [...] > > > An unfortunate consequence of offloading testing to third-party systems... > > > > And of not having a standard way to signal "this line starts something > > that should be reported as a bug"? Maybe as a longer-term idea, it'd > > help to have some sort of extra prefix byte that the kernel can print > > to say "here comes a bug report, first line should be the subject", or > > something like that, similar to how we have loglevels... > > This would be great. > Also a way to denote crash end. > However we have lots of special logic for subjects, not sure if kernel > could provide good subject: > https://github.com/google/syzkaller/blob/1daed50ac33511e1a107228a9c3b80e5c4aebb5c/pkg/report/linux.go#L537-L1588 > Probably it could, but it won't be completely trivial. E.g. if there > is a stall inside of a timer function, it should give the name of the > actual timer callback as identity ("stall in timer_subsystem_foo"). Or > for syscalls we use more disambiguation b/c "in sys_ioclt" is not much > different than saying "there is a bug in kernel" :) Maybe I'm overthinking things, and maybe this is too much effort relative to the benefit it brings, but here's a crazy idea: For the specific case of stalls, it might help if the kernel could put markers on the stack on the first stall warning (e.g. assuming that ORC is enabled, by walking the stack and replacing all saved instruction pointers with a pointer to some magic trampoline that jumps back to the original caller using some sort of shadow stack), then wait a few seconds, and then check how far on the stack the markers have been cleared. Then hopefully you'd know exactly in which function you're looping. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP 2019-11-18 16:29 ` Dmitry Vyukov 2019-11-18 16:40 ` error attribution for stalls [was: [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP] Jann Horn @ 2019-11-18 16:44 ` Borislav Petkov 2019-11-18 17:38 ` Borislav Petkov 2019-11-20 11:40 ` Ingo Molnar 1 sibling, 2 replies; 21+ messages in thread From: Borislav Petkov @ 2019-11-18 16:44 UTC (permalink / raw) To: Dmitry Vyukov, Jann Horn Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, the arch/x86 maintainers, Andrey Ryabinin, Alexander Potapenko, kasan-dev, LKML, Andrey Konovalov, Andy Lutomirski, Sean Christopherson On Mon, Nov 18, 2019 at 05:29:42PM +0100, Dmitry Vyukov wrote: > > And of not having a standard way to signal "this line starts something > > that should be reported as a bug"? Maybe as a longer-term idea, it'd > > help to have some sort of extra prefix byte that the kernel can print > > to say "here comes a bug report, first line should be the subject", or > > something like that, similar to how we have loglevels... > > This would be great. > Also a way to denote crash end. > However we have lots of special logic for subjects, not sure if kernel > could provide good subject: > https://github.com/google/syzkaller/blob/1daed50ac33511e1a107228a9c3b80e5c4aebb5c/pkg/report/linux.go#L537-L1588 > Probably it could, but it won't be completely trivial. E.g. if there > is a stall inside of a timer function, it should give the name of the > actual timer callback as identity ("stall in timer_subsystem_foo"). Or > for syscalls we use more disambiguation b/c "in sys_ioclt" is not much > different than saying "there is a bug in kernel" :) While external tools are fine and cool, they can't really block kernel development and printk strings format is not an ABI. And yeah, we have this discussion each time someone proposes changes to those "magic" strings but I guess it is about time to fix this in a way that any future changes don't break tools. And so I like the idea of marking *only* the first splat with some small prefix char as that first splat is the special and very important one. I.e., the one where die_counter is 0. So I could very well imagine something like: ... [ 2.523708] Write protecting the kernel read-only data: 16384k [ 2.524729] Freeing unused kernel image (text/rodata gap) memory: 2040K [ 2.525594] Freeing unused kernel image (rodata/data gap) memory: 368K [ 2.541414] x86/mm: Checked W+X mappings: passed, no W+X pages found. <--- important first splat starts here: [ 2.542218] [*] general protection fault while derefing a non-canonical address 0xdfff000000000001: 0000 [#1] PREEMPT SMP [ 2.543343] [*] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc8+ #8 [ 2.544138] [*] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014 [ 2.545120] [*] RIP: 0010:kernel_init+0x58/0x107 [ 2.546055] [*] Code: 48 c7 c7 e0 5c e7 81 e8 eb d2 90 ff c7 05 77 d6 95 00 02 00 00 00 e8 4e 1d a2 ff e8 69 b7 91 ff 48 b8 01 00 00 00 00 00 ff df <ff> e0 48 8b 3d fe 54 d7 00 48 85 ff 74 22 e8 76 93 84 ff 85 c0 89 [ 2.550242] [*] RSP: 0018:ffffc90000013f50 EFLAGS: 00010246 [ 2.551691] [*] RAX: dfff000000000001 RBX: ffffffff817b7ac9 RCX: 0000000000000040 [ 2.553435] [*] RDX: 0000000000000030 RSI: ffff88807da2f170 RDI: 000000000002f170 [ 2.555169] [*] RBP: 0000000000000000 R08: 00000000000001a6 R09: 00000000ad55ad55 [ 2.556393] [*] R10: 0000000000000000 R11: 0000000000000002 R12: 0000000000000000 [ 2.557268] [*] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 2.558417] [*] FS: 0000000000000000(0000) GS:ffff88807da00000(0000) knlGS:0000000000000000 [ 2.559370] [*] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2.560138] [*] CR2: 0000000000000000 CR3: 0000000002009000 CR4: 00000000003406f0 [ 2.561013] [*] Call Trace: [ 2.561506] [*] ret_from_fork+0x22/0x40 [ 2.562080] [*] Modules linked in: [ 2.583706] [*] ---[ end trace 8ceb5a62d3ebbfa6 ]--- [ 2.584384] [*] RIP: 0010:kernel_init+0x58/0x107 [ 2.584999] [*] Code: 48 c7 c7 e0 5c e7 81 e8 eb d2 90 ff c7 05 77 d6 95 00 02 00 00 00 e8 4e 1d a2 ff e8 69 b7 91 ff 48 b8 01 00 00 00 00 00 ff df <ff> e0 48 8b 3d fe 54 d7 00 48 85 ff 74 22 e8 76 93 84 ff 85 c0 89 [ 2.591746] [*] RSP: 0018:ffffc90000013f50 EFLAGS: 00010246 [ 2.593175] [*] RAX: dfff000000000001 RBX: ffffffff817b7ac9 RCX: 0000000000000040 [ 2.594892] [*] RDX: 0000000000000030 RSI: ffff88807da2f170 RDI: 000000000002f170 [ 2.599706] [*] RBP: 0000000000000000 R08: 00000000000001a6 R09: 00000000ad55ad55 [ 2.600585] [*] R10: 0000000000000000 R11: 0000000000000002 R12: 0000000000000000 [ 2.601433] [*] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 2.602283] [*] FS: 0000000000000000(0000) GS:ffff88807da00000(0000) knlGS:0000000000000000 [ 2.603207] [*] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2.607706] [*] CR2: 0000000000000000 CR3: 0000000002009000 CR4: 00000000003406f0 [ 2.608565] [*] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b [ 2.609600] [*] Kernel Offset: disabled [ 2.610168] [*] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]--- <--- and ends here. to denote that first splat. And the '*' thing is just an example - it can be any char - whatever's easier to grep for. Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP 2019-11-18 16:44 ` [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP Borislav Petkov @ 2019-11-18 17:38 ` Borislav Petkov 2019-11-20 11:41 ` Ingo Molnar 2019-11-20 11:40 ` Ingo Molnar 1 sibling, 1 reply; 21+ messages in thread From: Borislav Petkov @ 2019-11-18 17:38 UTC (permalink / raw) To: Dmitry Vyukov, Jann Horn Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, the arch/x86 maintainers, Andrey Ryabinin, Alexander Potapenko, kasan-dev, LKML, Andrey Konovalov, Andy Lutomirski, Sean Christopherson On Mon, Nov 18, 2019 at 05:44:07PM +0100, Borislav Petkov wrote: > [ 2.523708] Write protecting the kernel read-only data: 16384k > [ 2.524729] Freeing unused kernel image (text/rodata gap) memory: 2040K > [ 2.525594] Freeing unused kernel image (rodata/data gap) memory: 368K > [ 2.541414] x86/mm: Checked W+X mappings: passed, no W+X pages found. > > <--- important first splat starts here: > > [ 2.542218] [*] general protection fault while derefing a non-canonical address 0xdfff000000000001: 0000 [#1] PREEMPT SMP ^ Btw, tglx just suggested on IRC to simply slap the die_counter number here so that you have [ 2.543343] [1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc8+ #8 [ 2.544138] [1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014 ... which also tells you to which splat the line belongs to. Also useful. Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP 2019-11-18 17:38 ` Borislav Petkov @ 2019-11-20 11:41 ` Ingo Molnar 0 siblings, 0 replies; 21+ messages in thread From: Ingo Molnar @ 2019-11-20 11:41 UTC (permalink / raw) To: Borislav Petkov Cc: Dmitry Vyukov, Jann Horn, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, the arch/x86 maintainers, Andrey Ryabinin, Alexander Potapenko, kasan-dev, LKML, Andrey Konovalov, Andy Lutomirski, Sean Christopherson * Borislav Petkov <bp@alien8.de> wrote: > On Mon, Nov 18, 2019 at 05:44:07PM +0100, Borislav Petkov wrote: > > [ 2.523708] Write protecting the kernel read-only data: 16384k > > [ 2.524729] Freeing unused kernel image (text/rodata gap) memory: 2040K > > [ 2.525594] Freeing unused kernel image (rodata/data gap) memory: 368K > > [ 2.541414] x86/mm: Checked W+X mappings: passed, no W+X pages found. > > > > <--- important first splat starts here: > > > > [ 2.542218] [*] general protection fault while derefing a non-canonical address 0xdfff000000000001: 0000 [#1] PREEMPT SMP > ^ > > Btw, tglx just suggested on IRC to simply slap the die_counter number here so > that you have > > [ 2.543343] [1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc8+ #8 > [ 2.544138] [1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014 > ... > > which also tells you to which splat the line belongs to. > > Also useful. Yeah - but I think it would be even better to make it part of the timestamp - most tools will already discard the [] bit, so why not merge the two: > [ 2.543343-#1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc8+ #8 > [ 2.544138-#1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014 Thanks, Ingo ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP 2019-11-18 16:44 ` [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP Borislav Petkov 2019-11-18 17:38 ` Borislav Petkov @ 2019-11-20 11:40 ` Ingo Molnar 2019-11-20 11:52 ` Borislav Petkov 1 sibling, 1 reply; 21+ messages in thread From: Ingo Molnar @ 2019-11-20 11:40 UTC (permalink / raw) To: Borislav Petkov Cc: Dmitry Vyukov, Jann Horn, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, the arch/x86 maintainers, Andrey Ryabinin, Alexander Potapenko, kasan-dev, LKML, Andrey Konovalov, Andy Lutomirski, Sean Christopherson, Linus Torvalds * Borislav Petkov <bp@alien8.de> wrote: > On Mon, Nov 18, 2019 at 05:29:42PM +0100, Dmitry Vyukov wrote: > > > And of not having a standard way to signal "this line starts something > > > that should be reported as a bug"? Maybe as a longer-term idea, it'd > > > help to have some sort of extra prefix byte that the kernel can print > > > to say "here comes a bug report, first line should be the subject", or > > > something like that, similar to how we have loglevels... > > > > This would be great. > > Also a way to denote crash end. > > However we have lots of special logic for subjects, not sure if kernel > > could provide good subject: > > https://github.com/google/syzkaller/blob/1daed50ac33511e1a107228a9c3b80e5c4aebb5c/pkg/report/linux.go#L537-L1588 > > Probably it could, but it won't be completely trivial. E.g. if there > > is a stall inside of a timer function, it should give the name of the > > actual timer callback as identity ("stall in timer_subsystem_foo"). Or > > for syscalls we use more disambiguation b/c "in sys_ioclt" is not much > > different than saying "there is a bug in kernel" :) > > While external tools are fine and cool, they can't really block kernel > development and printk strings format is not an ABI. And yeah, we have > this discussion each time someone proposes changes to those "magic" > strings but I guess it is about time to fix this in a way that any > future changes don't break tools. > > And so I like the idea of marking *only* the first splat with some small > prefix char as that first splat is the special and very important one. > I.e., the one where die_counter is 0. > > So I could very well imagine something like: > > ... > [ 2.523708] Write protecting the kernel read-only data: 16384k > [ 2.524729] Freeing unused kernel image (text/rodata gap) memory: 2040K > [ 2.525594] Freeing unused kernel image (rodata/data gap) memory: 368K > [ 2.541414] x86/mm: Checked W+X mappings: passed, no W+X pages found. > > <--- important first splat starts here: > > [ 2.542218] [*] general protection fault while derefing a non-canonical address 0xdfff000000000001: 0000 [#1] PREEMPT SMP > [ 2.543343] [*] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc8+ #8 > [ 2.544138] [*] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014 > [ 2.545120] [*] RIP: 0010:kernel_init+0x58/0x107 > [ 2.546055] [*] Code: 48 c7 c7 e0 5c e7 81 e8 eb d2 90 ff c7 05 77 d6 95 00 02 00 00 00 e8 4e 1d a2 ff e8 69 b7 91 ff 48 b8 01 00 00 00 00 00 ff df <ff> e0 48 8b 3d fe 54 d7 00 48 85 ff 74 22 e8 76 93 84 ff 85 c0 89 > <--- and ends here. > > to denote that first splat. And the '*' thing is just an example - it > can be any char - whatever's easier to grep for. Well, this would break various pieces of tooling I'm sure. Maybe it would be nicer to tooling to embedd the splat-counter in the timestamp in a way: > [ 2.542218-#1] general protection fault while derefing a non-canonical address 0xdfff000000000001: 0000 [#1] PREEMPT SMP > [ 2.543343-#1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc8+ #8 > [ 2.544138-#1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014 > [ 2.545120-#1] RIP: 0010:kernel_init+0x58/0x107 > [ 2.546055-#1] Code: 48 c7 c7 e0 5c e7 81 e8 eb d2 90 ff c7 05 77 d6 95 00 02 00 00 00 e8 4e 1d a2 ff e8 69 b7 91 ff 48 b8 01 00 00 00 00 00 ff df <ff> e0 48 8b 3d fe 54 d7 00 48 85 ff 74 22 e8 76 93 84 ff 85 c0 89 That way we'd not only know that it's the first splat, but we'd know it from all the *other* splats as well where they are in the splat-rank ;-) (Also Cc:-ed Linus.) Thanks, Ingo ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP 2019-11-20 11:40 ` Ingo Molnar @ 2019-11-20 11:52 ` Borislav Petkov 0 siblings, 0 replies; 21+ messages in thread From: Borislav Petkov @ 2019-11-20 11:52 UTC (permalink / raw) To: Ingo Molnar Cc: Dmitry Vyukov, Jann Horn, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, the arch/x86 maintainers, Andrey Ryabinin, Alexander Potapenko, kasan-dev, LKML, Andrey Konovalov, Andy Lutomirski, Sean Christopherson, Linus Torvalds On Wed, Nov 20, 2019 at 12:40:31PM +0100, Ingo Molnar wrote: > Well, this would break various pieces of tooling I'm sure. Well, if at all, this will break them one last time. Ironically, the intent here is to have a markup which doesn't break them anymore, once that markup is agreed upon by all parties. Because each time we touch those printk formats, tools people complain about us breaking their tools. So we should get the best of both worlds by marking those splats in a way that tools can grep for and we won't touch the markers anymore, once established. Also, "[]" was only an example. It can be anything we want, as in "<>" or "!" or whatever is a short prefix that prepends those lines. > Maybe it would be nicer to tooling to embedd the splat-counter in the > timestamp in a way: Or that. Whatever we agree, as long as it is a unique marker for splats. And it should say which splat it is, as that is also very useful information to have it in each line. > > [ 2.542218-#1] general protection fault while derefing a non-canonical address 0xdfff000000000001: 0000 [#1] PREEMPT SMP > > [ 2.543343-#1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc8+ #8 > > [ 2.544138-#1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014 > > [ 2.545120-#1] RIP: 0010:kernel_init+0x58/0x107 > > [ 2.546055-#1] Code: 48 c7 c7 e0 5c e7 81 e8 eb d2 90 ff c7 05 77 d6 95 00 02 00 00 00 e8 4e 1d a2 ff e8 69 b7 91 ff 48 b8 01 00 00 00 00 00 ff df <ff> e0 48 8b 3d fe 54 d7 00 48 85 ff 74 22 e8 76 93 84 ff 85 c0 89 > > That way we'd not only know that it's the first splat, but we'd know it > from all the *other* splats as well where they are in the splat-rank ;-) That's exactly why I'd want the number in there. Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP 2019-11-15 19:17 ` [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP Jann Horn 2019-11-18 14:21 ` Borislav Petkov @ 2019-11-20 4:25 ` Andi Kleen 2019-11-20 10:31 ` Jann Horn 2019-11-23 23:07 ` Andy Lutomirski 2 siblings, 1 reply; 21+ messages in thread From: Andi Kleen @ 2019-11-20 4:25 UTC (permalink / raw) To: Jann Horn Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x86, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, kasan-dev, linux-kernel, Andrey Konovalov, Andy Lutomirski, Sean Christopherson Jann Horn <jannh@google.com> writes: > + > + if (error_code) > + pr_alert("GPF is segment-related (see error code)\n"); > + else > + print_kernel_gp_address(regs); Is this really correct? There are a lot of instructions that can do #GP (it's the CPU's equivalent of EINVAL) and I'm pretty sure many of them don't set an error code, and many don't have operands either. You would need to make sure the instruction decoder handles these cases correctly, and ideally that you detect it instead of printing a bogus address. -Andi ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP 2019-11-20 4:25 ` Andi Kleen @ 2019-11-20 10:31 ` Jann Horn 2019-11-20 13:56 ` Andi Kleen 0 siblings, 1 reply; 21+ messages in thread From: Jann Horn @ 2019-11-20 10:31 UTC (permalink / raw) To: Andi Kleen Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, the arch/x86 maintainers, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, kasan-dev, kernel list, Andrey Konovalov, Andy Lutomirski, Sean Christopherson On Wed, Nov 20, 2019 at 5:25 AM Andi Kleen <ak@linux.intel.com> wrote: > Jann Horn <jannh@google.com> writes: > > + if (error_code) > > + pr_alert("GPF is segment-related (see error code)\n"); > > + else > > + print_kernel_gp_address(regs); > > Is this really correct? There are a lot of instructions that can do #GP > (it's the CPU's equivalent of EINVAL) and I'm pretty sure many of them > don't set an error code, and many don't have operands either. > > You would need to make sure the instruction decoder handles these > cases correctly, and ideally that you detect it instead of printing > a bogus address. Is there a specific concern you have about the instruction decoder? As far as I can tell, all the paths of insn_get_addr_ref() only work if the instruction has a mod R/M byte according to the instruction tables, and then figures out the address based on that. While that means that there's a wide variety of cases in which we won't be able to figure out the address, I'm not aware of anything specific that is likely to lead to false positives. But Andy did suggest that we hedge a bit in the error message because even if the address passed to the instruction is non-canonical, we don't know for sure whether that's actually the reason why things failed, and that's why it says "probably" in the message about the address now. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP 2019-11-20 10:31 ` Jann Horn @ 2019-11-20 13:56 ` Andi Kleen 2019-11-20 14:24 ` Jann Horn 0 siblings, 1 reply; 21+ messages in thread From: Andi Kleen @ 2019-11-20 13:56 UTC (permalink / raw) To: Jann Horn Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, the arch/x86 maintainers, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, kasan-dev, kernel list, Andrey Konovalov, Andy Lutomirski, Sean Christopherson > Is there a specific concern you have about the instruction decoder? As > far as I can tell, all the paths of insn_get_addr_ref() only work if > the instruction has a mod R/M byte according to the instruction > tables, and then figures out the address based on that. While that > means that there's a wide variety of cases in which we won't be able > to figure out the address, I'm not aware of anything specific that is > likely to lead to false positives. First there will be a lot of cases you'll just print 0, even though 0 is canonical if there is no operand. Then it might be that the address is canonical, but triggers #GP anyways (e.g. unaligned SSE) Or it might be the wrong address if there is an operand, there are many complex instructions that reference something in memory, and usually do canonical checking there. And some other odd cases. For example when the instruction length exceeds 15 bytes. I know there is fuzzing for the instruction decoder, but it might be worth double checking it handles all of that correctly. I'm not sure how good the fuzzer's coverage is. At a minimum you should probably check if the address is actually non canonical. Maybe that's simple enough and weeds out most cases. -Andi ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP 2019-11-20 13:56 ` Andi Kleen @ 2019-11-20 14:24 ` Jann Horn 0 siblings, 0 replies; 21+ messages in thread From: Jann Horn @ 2019-11-20 14:24 UTC (permalink / raw) To: Andi Kleen Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, the arch/x86 maintainers, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, kasan-dev, kernel list, Andrey Konovalov, Andy Lutomirski, Sean Christopherson On Wed, Nov 20, 2019 at 2:56 PM Andi Kleen <ak@linux.intel.com> wrote: > > Is there a specific concern you have about the instruction decoder? As > > far as I can tell, all the paths of insn_get_addr_ref() only work if > > the instruction has a mod R/M byte according to the instruction > > tables, and then figures out the address based on that. While that > > means that there's a wide variety of cases in which we won't be able > > to figure out the address, I'm not aware of anything specific that is > > likely to lead to false positives. > > First there will be a lot of cases you'll just print 0, even > though 0 is canonical if there is no operand. Why would I print zeroes if there is no operand? The decoder logic returns a -1 if it can't find a mod r/m byte, which causes the #GP handler to not print any address at all. Or are you talking about some weird instruction that takes an operand that is actually ignored, or something weird like that? > Then it might be that the address is canonical, but triggers > #GP anyways (e.g. unaligned SSE) Which is an argument for printing the address even if it is canonical, as Ingo suggested, I guess. > Or it might be the wrong address if there is an operand, > there are many complex instructions that reference something > in memory, and usually do canonical checking there. In which case you'd probably usually see a canonical address in the instruction's argument, which causes the error message to not appear (in patch v2/v3) / to be different (in my current draft for patch v4). And as Ingo said over in the other thread, even if the argument is not directly the faulting address at all, it might still help with figuring out what's going on. > And some other odd cases. For example when the instruction length > exceeds 15 bytes. But this is the #GP handler. Don't overlong instructions give you #UD instead? > I know there is fuzzing for the instruction > decoder, but it might be worth double checking it handles > all of that correctly. I'm not sure how good the fuzzer's coverage > is. > > At a minimum you should probably check if the address is > actually non canonical. Maybe that's simple enough and weeds out > most cases. The patch you're commenting on does that already; quoting the patch: + /* Bail out if insn_get_addr_ref() failed or we got a kernel address. */ + if (addr_ref >= ~__VIRTUAL_MASK) + return; + + /* Bail out if the entire operand is in the canonical user half. */ + if (addr_ref + insn.opnd_bytes - 1 <= __VIRTUAL_MASK) + return; But at Ingo's request, I'm planning to change that in the v4 patch; see <https://lore.kernel.org/lkml/20191120111859.GA115930@gmail.com/> and <https://lore.kernel.org/lkml/CAG48ez0Frp4-+xHZ=UhbHh0hC_h-1VtJfwHw=kDo6NahyMv1ig@mail.gmail.com/>. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP 2019-11-15 19:17 ` [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP Jann Horn 2019-11-18 14:21 ` Borislav Petkov 2019-11-20 4:25 ` Andi Kleen @ 2019-11-23 23:07 ` Andy Lutomirski 2019-11-27 20:27 ` Jann Horn 2 siblings, 1 reply; 21+ messages in thread From: Andy Lutomirski @ 2019-11-23 23:07 UTC (permalink / raw) To: Jann Horn Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, kasan-dev, LKML, Andrey Konovalov, Andy Lutomirski, Sean Christopherson On Fri, Nov 15, 2019 at 11:17 AM Jann Horn <jannh@google.com> wrote: > > A frequent cause of #GP exceptions are memory accesses to non-canonical > addresses. Unlike #PF, #GP doesn't come with a fault address in CR2, so > the kernel doesn't currently print the fault address for #GP. > Luckily, we already have the necessary infrastructure for decoding X86 > instructions and computing the memory address that is being accessed; > hook it up to the #GP handler so that we can figure out whether the #GP > looks like it was caused by a non-canonical address, and if so, print > that address. > > While it is already possible to compute the faulting address manually by > disassembling the opcode dump and evaluating the instruction against the > register dump, this should make it slightly easier to identify crashes > at a glance. > > Signed-off-by: Jann Horn <jannh@google.com> > --- > > Notes: > v2: > - print different message for segment-related GP (Borislav) > - rewrite check for non-canonical address (Sean) > - make it clear we don't know for sure why the GP happened (Andy) > > arch/x86/kernel/traps.c | 45 +++++++++++++++++++++++++++++++++++++++-- > 1 file changed, 43 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c > index c90312146da0..12d42697a18e 100644 > --- a/arch/x86/kernel/traps.c > +++ b/arch/x86/kernel/traps.c > @@ -56,6 +56,8 @@ > #include <asm/mpx.h> > #include <asm/vm86.h> > #include <asm/umip.h> > +#include <asm/insn.h> > +#include <asm/insn-eval.h> > > #ifdef CONFIG_X86_64 > #include <asm/x86_init.h> > @@ -509,6 +511,38 @@ dotraplinkage void do_bounds(struct pt_regs *regs, long error_code) > do_trap(X86_TRAP_BR, SIGSEGV, "bounds", regs, error_code, 0, NULL); > } > > +/* > + * On 64-bit, if an uncaught #GP occurs while dereferencing a non-canonical > + * address, print that address. > + */ > +static void print_kernel_gp_address(struct pt_regs *regs) > +{ > +#ifdef CONFIG_X86_64 > + u8 insn_bytes[MAX_INSN_SIZE]; > + struct insn insn; > + unsigned long addr_ref; > + > + if (probe_kernel_read(insn_bytes, (void *)regs->ip, MAX_INSN_SIZE)) > + return; > + > + kernel_insn_init(&insn, insn_bytes, MAX_INSN_SIZE); > + insn_get_modrm(&insn); > + insn_get_sib(&insn); > + addr_ref = (unsigned long)insn_get_addr_ref(&insn, regs); > + > + /* Bail out if insn_get_addr_ref() failed or we got a kernel address. */ > + if (addr_ref >= ~__VIRTUAL_MASK) > + return; > + > + /* Bail out if the entire operand is in the canonical user half. */ > + if (addr_ref + insn.opnd_bytes - 1 <= __VIRTUAL_MASK) > + return; > + > + pr_alert("probably dereferencing non-canonical address 0x%016lx\n", > + addr_ref); > +#endif > +} Could you refactor this a little bit so that we end up with a helper that does the computation? Something like: int probe_insn_get_memory_ref(void **addr, size_t *len, void *insn_addr); returns 1 if there was a memory operand and fills in addr and len, returns 0 if there was no memory operand, and returns a negative error on error. I think we're going to want this for #AC handling, too :) ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP 2019-11-23 23:07 ` Andy Lutomirski @ 2019-11-27 20:27 ` Jann Horn 2019-11-28 5:23 ` Andy Lutomirski 0 siblings, 1 reply; 21+ messages in thread From: Jann Horn @ 2019-11-27 20:27 UTC (permalink / raw) To: Andy Lutomirski Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, kasan-dev, LKML, Andrey Konovalov, Sean Christopherson On Sun, Nov 24, 2019 at 12:08 AM Andy Lutomirski <luto@kernel.org> wrote: > On Fri, Nov 15, 2019 at 11:17 AM Jann Horn <jannh@google.com> wrote: > > A frequent cause of #GP exceptions are memory accesses to non-canonical > > addresses. Unlike #PF, #GP doesn't come with a fault address in CR2, so > > the kernel doesn't currently print the fault address for #GP. > > Luckily, we already have the necessary infrastructure for decoding X86 > > instructions and computing the memory address that is being accessed; > > hook it up to the #GP handler so that we can figure out whether the #GP > > looks like it was caused by a non-canonical address, and if so, print > > that address. [...] > > +static void print_kernel_gp_address(struct pt_regs *regs) > > +{ > > +#ifdef CONFIG_X86_64 > > + u8 insn_bytes[MAX_INSN_SIZE]; > > + struct insn insn; > > + unsigned long addr_ref; > > + > > + if (probe_kernel_read(insn_bytes, (void *)regs->ip, MAX_INSN_SIZE)) > > + return; > > + > > + kernel_insn_init(&insn, insn_bytes, MAX_INSN_SIZE); > > + insn_get_modrm(&insn); > > + insn_get_sib(&insn); > > + addr_ref = (unsigned long)insn_get_addr_ref(&insn, regs); [...] > > +} > > Could you refactor this a little bit so that we end up with a helper > that does the computation? Something like: > > int probe_insn_get_memory_ref(void **addr, size_t *len, void *insn_addr); > > returns 1 if there was a memory operand and fills in addr and len, > returns 0 if there was no memory operand, and returns a negative error > on error. > > I think we're going to want this for #AC handling, too :) Mmmh... the instruction decoder doesn't currently give us a reliable access size though. (I know, I'm using it here regardless, but it doesn't really matter here if the decoded size is too big from time to time... whereas I imagine that that'd matter quite a bit for #AC handling.) IIRC e.g. a MOVZX that loads 1 byte into a 4-byte register is decoded as having .opnd_bytes==4; and if you look through arch/x86/lib/insn.c, there isn't even anything that would ever set ->opnd_bytes to 1. You'd have to add some plumbing to get reliable access sizes. I don't want to add a helper for this before the underlying infrastructure actually works properly. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP 2019-11-27 20:27 ` Jann Horn @ 2019-11-28 5:23 ` Andy Lutomirski 0 siblings, 0 replies; 21+ messages in thread From: Andy Lutomirski @ 2019-11-28 5:23 UTC (permalink / raw) To: Jann Horn Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, kasan-dev, LKML, Andrey Konovalov, Sean Christopherson On Wed, Nov 27, 2019 at 12:27 PM Jann Horn <jannh@google.com> wrote: > > On Sun, Nov 24, 2019 at 12:08 AM Andy Lutomirski <luto@kernel.org> wrote: > > On Fri, Nov 15, 2019 at 11:17 AM Jann Horn <jannh@google.com> wrote: > > > A frequent cause of #GP exceptions are memory accesses to non-canonical > > > addresses. Unlike #PF, #GP doesn't come with a fault address in CR2, so > > > the kernel doesn't currently print the fault address for #GP. > > > Luckily, we already have the necessary infrastructure for decoding X86 > > > instructions and computing the memory address that is being accessed; > > > hook it up to the #GP handler so that we can figure out whether the #GP > > > looks like it was caused by a non-canonical address, and if so, print > > > that address. > [...] > > > +static void print_kernel_gp_address(struct pt_regs *regs) > > > +{ > > > +#ifdef CONFIG_X86_64 > > > + u8 insn_bytes[MAX_INSN_SIZE]; > > > + struct insn insn; > > > + unsigned long addr_ref; > > > + > > > + if (probe_kernel_read(insn_bytes, (void *)regs->ip, MAX_INSN_SIZE)) > > > + return; > > > + > > > + kernel_insn_init(&insn, insn_bytes, MAX_INSN_SIZE); > > > + insn_get_modrm(&insn); > > > + insn_get_sib(&insn); > > > + addr_ref = (unsigned long)insn_get_addr_ref(&insn, regs); > [...] > > > +} > > > > Could you refactor this a little bit so that we end up with a helper > > that does the computation? Something like: > > > > int probe_insn_get_memory_ref(void **addr, size_t *len, void *insn_addr); > > > > returns 1 if there was a memory operand and fills in addr and len, > > returns 0 if there was no memory operand, and returns a negative error > > on error. > > > > I think we're going to want this for #AC handling, too :) > > Mmmh... the instruction decoder doesn't currently give us a reliable > access size though. (I know, I'm using it here regardless, but it > doesn't really matter here if the decoded size is too big from time to > time... whereas I imagine that that'd matter quite a bit for #AC > handling.) IIRC e.g. a MOVZX that loads 1 byte into a 4-byte register > is decoded as having .opnd_bytes==4; and if you look through > arch/x86/lib/insn.c, there isn't even anything that would ever set > ->opnd_bytes to 1. You'd have to add some plumbing to get reliable > access sizes. I don't want to add a helper for this before the > underlying infrastructure actually works properly. Fair enough. Although, with #AC, we know a priori that the address is unaligned, so we could at least print "Unaligned access at 0x%lx\n". But we can certainly leave these details to someone else. (For context, there are patches floating around to enable a formerly secret CPU feature to generate #AC on a LOCK instruction that spans a cache line.) --Andy ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 3/3] x86/kasan: Print original address on #GP 2019-11-15 19:17 [PATCH v2 1/3] x86/insn-eval: Add support for 64-bit kernel mode Jann Horn 2019-11-15 19:17 ` [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP Jann Horn @ 2019-11-15 19:17 ` Jann Horn 2019-11-18 8:36 ` Dmitry Vyukov 1 sibling, 1 reply; 21+ messages in thread From: Jann Horn @ 2019-11-15 19:17 UTC (permalink / raw) To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x86, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, kasan-dev, jannh Cc: linux-kernel, Andrey Konovalov, Andy Lutomirski, Sean Christopherson Make #GP exceptions caused by out-of-bounds KASAN shadow accesses easier to understand by computing the address of the original access and printing that. More details are in the comments in the patch. This turns an error like this: kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access traps: probably dereferencing non-canonical address 0xe017577ddf75b7dd general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI into this: traps: dereferencing non-canonical address 0xe017577ddf75b7dd traps: probably dereferencing non-canonical address 0xe017577ddf75b7dd KASAN: maybe wild-memory-access in range [0x00badbeefbadbee8-0x00badbeefbadbeef] general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI The hook is placed in architecture-independent code, but is currently only wired up to the X86 exception handler because I'm not sufficiently familiar with the address space layout and exception handling mechanisms on other architectures. Signed-off-by: Jann Horn <jannh@google.com> --- Notes: v2: - move to mm/kasan/report.c (Dmitry) - change hook name to be more generic - use TASK_SIZE instead of TASK_SIZE_MAX for compiling on non-x86 - don't open-code KASAN_SHADOW_MASK (Dmitry) - add "KASAN: " prefix, but not "BUG: " (Andrey, Dmitry) - use same naming scheme as get_wild_bug_type (Andrey) arch/x86/kernel/traps.c | 2 ++ arch/x86/mm/kasan_init_64.c | 21 ------------------- include/linux/kasan.h | 6 ++++++ mm/kasan/report.c | 40 +++++++++++++++++++++++++++++++++++++ 4 files changed, 48 insertions(+), 21 deletions(-) diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 12d42697a18e..87b52682a37a 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -37,6 +37,7 @@ #include <linux/mm.h> #include <linux/smp.h> #include <linux/io.h> +#include <linux/kasan.h> #include <asm/stacktrace.h> #include <asm/processor.h> #include <asm/debugreg.h> @@ -540,6 +541,7 @@ static void print_kernel_gp_address(struct pt_regs *regs) pr_alert("probably dereferencing non-canonical address 0x%016lx\n", addr_ref); + kasan_non_canonical_hook(addr_ref); #endif } diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c index 296da58f3013..69c437fb21cc 100644 --- a/arch/x86/mm/kasan_init_64.c +++ b/arch/x86/mm/kasan_init_64.c @@ -245,23 +245,6 @@ static void __init kasan_map_early_shadow(pgd_t *pgd) } while (pgd++, addr = next, addr != end); } -#ifdef CONFIG_KASAN_INLINE -static int kasan_die_handler(struct notifier_block *self, - unsigned long val, - void *data) -{ - if (val == DIE_GPF) { - pr_emerg("CONFIG_KASAN_INLINE enabled\n"); - pr_emerg("GPF could be caused by NULL-ptr deref or user memory access\n"); - } - return NOTIFY_OK; -} - -static struct notifier_block kasan_die_notifier = { - .notifier_call = kasan_die_handler, -}; -#endif - void __init kasan_early_init(void) { int i; @@ -298,10 +281,6 @@ void __init kasan_init(void) int i; void *shadow_cpu_entry_begin, *shadow_cpu_entry_end; -#ifdef CONFIG_KASAN_INLINE - register_die_notifier(&kasan_die_notifier); -#endif - memcpy(early_top_pgt, init_top_pgt, sizeof(early_top_pgt)); /* diff --git a/include/linux/kasan.h b/include/linux/kasan.h index cc8a03cc9674..7305024b44e3 100644 --- a/include/linux/kasan.h +++ b/include/linux/kasan.h @@ -194,4 +194,10 @@ static inline void *kasan_reset_tag(const void *addr) #endif /* CONFIG_KASAN_SW_TAGS */ +#ifdef CONFIG_KASAN_INLINE +void kasan_non_canonical_hook(unsigned long addr); +#else /* CONFIG_KASAN_INLINE */ +static inline void kasan_non_canonical_hook(unsigned long addr) { } +#endif /* CONFIG_KASAN_INLINE */ + #endif /* LINUX_KASAN_H */ diff --git a/mm/kasan/report.c b/mm/kasan/report.c index 621782100eaa..5ef9f24f566b 100644 --- a/mm/kasan/report.c +++ b/mm/kasan/report.c @@ -512,3 +512,43 @@ void __kasan_report(unsigned long addr, size_t size, bool is_write, unsigned lon end_report(&flags); } + +#ifdef CONFIG_KASAN_INLINE +/* + * With CONFIG_KASAN_INLINE, accesses to bogus pointers (outside the high + * canonical half of the address space) cause out-of-bounds shadow memory reads + * before the actual access. For addresses in the low canonical half of the + * address space, as well as most non-canonical addresses, that out-of-bounds + * shadow memory access lands in the non-canonical part of the address space. + * Help the user figure out what the original bogus pointer was. + */ +void kasan_non_canonical_hook(unsigned long addr) +{ + unsigned long orig_addr; + const char *bug_type; + + if (addr < KASAN_SHADOW_OFFSET) + return; + + orig_addr = (addr - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT; + /* + * For faults near the shadow address for NULL, we can be fairly certain + * that this is a KASAN shadow memory access. + * For faults that correspond to shadow for low canonical addresses, we + * can still be pretty sure - that shadow region is a fairly narrow + * chunk of the non-canonical address space. + * But faults that look like shadow for non-canonical addresses are a + * really large chunk of the address space. In that case, we still + * print the decoded address, but make it clear that this is not + * necessarily what's actually going on. + */ + if (orig_addr < PAGE_SIZE) + bug_type = "null-ptr-deref"; + else if (orig_addr < TASK_SIZE) + bug_type = "probably user-memory-access"; + else + bug_type = "maybe wild-memory-access"; + pr_alert("KASAN: %s in range [0x%016lx-0x%016lx]\n", bug_type, + orig_addr, orig_addr + KASAN_SHADOW_MASK); +} +#endif -- 2.24.0.432.g9d3f5f5b63-goog ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/3] x86/kasan: Print original address on #GP 2019-11-15 19:17 ` [PATCH v2 3/3] x86/kasan: Print original " Jann Horn @ 2019-11-18 8:36 ` Dmitry Vyukov 0 siblings, 0 replies; 21+ messages in thread From: Dmitry Vyukov @ 2019-11-18 8:36 UTC (permalink / raw) To: Jann Horn Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, the arch/x86 maintainers, Andrey Ryabinin, Alexander Potapenko, kasan-dev, LKML, Andrey Konovalov, Andy Lutomirski, Sean Christopherson On Fri, Nov 15, 2019 at 8:17 PM Jann Horn <jannh@google.com> wrote: > > Make #GP exceptions caused by out-of-bounds KASAN shadow accesses easier > to understand by computing the address of the original access and > printing that. More details are in the comments in the patch. > > This turns an error like this: > > kasan: CONFIG_KASAN_INLINE enabled > kasan: GPF could be caused by NULL-ptr deref or user memory access > traps: probably dereferencing non-canonical address 0xe017577ddf75b7dd > general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI > > into this: > > traps: dereferencing non-canonical address 0xe017577ddf75b7dd > traps: probably dereferencing non-canonical address 0xe017577ddf75b7dd > KASAN: maybe wild-memory-access in range > [0x00badbeefbadbee8-0x00badbeefbadbeef] > general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI > > The hook is placed in architecture-independent code, but is currently > only wired up to the X86 exception handler because I'm not sufficiently > familiar with the address space layout and exception handling mechanisms > on other architectures. > > Signed-off-by: Jann Horn <jannh@google.com> > --- > > Notes: > v2: > - move to mm/kasan/report.c (Dmitry) > - change hook name to be more generic > - use TASK_SIZE instead of TASK_SIZE_MAX for compiling on non-x86 > - don't open-code KASAN_SHADOW_MASK (Dmitry) > - add "KASAN: " prefix, but not "BUG: " (Andrey, Dmitry) > - use same naming scheme as get_wild_bug_type (Andrey) > > arch/x86/kernel/traps.c | 2 ++ > arch/x86/mm/kasan_init_64.c | 21 ------------------- > include/linux/kasan.h | 6 ++++++ > mm/kasan/report.c | 40 +++++++++++++++++++++++++++++++++++++ > 4 files changed, 48 insertions(+), 21 deletions(-) > > diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c > index 12d42697a18e..87b52682a37a 100644 > --- a/arch/x86/kernel/traps.c > +++ b/arch/x86/kernel/traps.c > @@ -37,6 +37,7 @@ > #include <linux/mm.h> > #include <linux/smp.h> > #include <linux/io.h> > +#include <linux/kasan.h> > #include <asm/stacktrace.h> > #include <asm/processor.h> > #include <asm/debugreg.h> > @@ -540,6 +541,7 @@ static void print_kernel_gp_address(struct pt_regs *regs) > > pr_alert("probably dereferencing non-canonical address 0x%016lx\n", > addr_ref); > + kasan_non_canonical_hook(addr_ref); > #endif > } > > diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c > index 296da58f3013..69c437fb21cc 100644 > --- a/arch/x86/mm/kasan_init_64.c > +++ b/arch/x86/mm/kasan_init_64.c > @@ -245,23 +245,6 @@ static void __init kasan_map_early_shadow(pgd_t *pgd) > } while (pgd++, addr = next, addr != end); > } > > -#ifdef CONFIG_KASAN_INLINE > -static int kasan_die_handler(struct notifier_block *self, > - unsigned long val, > - void *data) > -{ > - if (val == DIE_GPF) { > - pr_emerg("CONFIG_KASAN_INLINE enabled\n"); > - pr_emerg("GPF could be caused by NULL-ptr deref or user memory access\n"); > - } > - return NOTIFY_OK; > -} > - > -static struct notifier_block kasan_die_notifier = { > - .notifier_call = kasan_die_handler, > -}; > -#endif > - > void __init kasan_early_init(void) > { > int i; > @@ -298,10 +281,6 @@ void __init kasan_init(void) > int i; > void *shadow_cpu_entry_begin, *shadow_cpu_entry_end; > > -#ifdef CONFIG_KASAN_INLINE > - register_die_notifier(&kasan_die_notifier); > -#endif > - > memcpy(early_top_pgt, init_top_pgt, sizeof(early_top_pgt)); > > /* > diff --git a/include/linux/kasan.h b/include/linux/kasan.h > index cc8a03cc9674..7305024b44e3 100644 > --- a/include/linux/kasan.h > +++ b/include/linux/kasan.h > @@ -194,4 +194,10 @@ static inline void *kasan_reset_tag(const void *addr) > > #endif /* CONFIG_KASAN_SW_TAGS */ > > +#ifdef CONFIG_KASAN_INLINE > +void kasan_non_canonical_hook(unsigned long addr); > +#else /* CONFIG_KASAN_INLINE */ > +static inline void kasan_non_canonical_hook(unsigned long addr) { } > +#endif /* CONFIG_KASAN_INLINE */ > + > #endif /* LINUX_KASAN_H */ > diff --git a/mm/kasan/report.c b/mm/kasan/report.c > index 621782100eaa..5ef9f24f566b 100644 > --- a/mm/kasan/report.c > +++ b/mm/kasan/report.c > @@ -512,3 +512,43 @@ void __kasan_report(unsigned long addr, size_t size, bool is_write, unsigned lon > > end_report(&flags); > } > + > +#ifdef CONFIG_KASAN_INLINE > +/* > + * With CONFIG_KASAN_INLINE, accesses to bogus pointers (outside the high > + * canonical half of the address space) cause out-of-bounds shadow memory reads > + * before the actual access. For addresses in the low canonical half of the > + * address space, as well as most non-canonical addresses, that out-of-bounds > + * shadow memory access lands in the non-canonical part of the address space. > + * Help the user figure out what the original bogus pointer was. > + */ > +void kasan_non_canonical_hook(unsigned long addr) > +{ > + unsigned long orig_addr; > + const char *bug_type; > + > + if (addr < KASAN_SHADOW_OFFSET) > + return; > + > + orig_addr = (addr - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT; > + /* > + * For faults near the shadow address for NULL, we can be fairly certain > + * that this is a KASAN shadow memory access. > + * For faults that correspond to shadow for low canonical addresses, we > + * can still be pretty sure - that shadow region is a fairly narrow > + * chunk of the non-canonical address space. > + * But faults that look like shadow for non-canonical addresses are a > + * really large chunk of the address space. In that case, we still > + * print the decoded address, but make it clear that this is not > + * necessarily what's actually going on. > + */ > + if (orig_addr < PAGE_SIZE) > + bug_type = "null-ptr-deref"; > + else if (orig_addr < TASK_SIZE) > + bug_type = "probably user-memory-access"; > + else > + bug_type = "maybe wild-memory-access"; > + pr_alert("KASAN: %s in range [0x%016lx-0x%016lx]\n", bug_type, > + orig_addr, orig_addr + KASAN_SHADOW_MASK); > +} > +#endif > -- > 2.24.0.432.g9d3f5f5b63-goog Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Thanks! ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2019-11-28 5:24 UTC | newest] Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-11-15 19:17 [PATCH v2 1/3] x86/insn-eval: Add support for 64-bit kernel mode Jann Horn 2019-11-15 19:17 ` [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP Jann Horn 2019-11-18 14:21 ` Borislav Petkov 2019-11-18 16:02 ` Dmitry Vyukov 2019-11-18 16:19 ` Jann Horn 2019-11-18 16:29 ` Dmitry Vyukov 2019-11-18 16:40 ` error attribution for stalls [was: [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP] Jann Horn 2019-11-18 16:44 ` [PATCH v2 2/3] x86/traps: Print non-canonical address on #GP Borislav Petkov 2019-11-18 17:38 ` Borislav Petkov 2019-11-20 11:41 ` Ingo Molnar 2019-11-20 11:40 ` Ingo Molnar 2019-11-20 11:52 ` Borislav Petkov 2019-11-20 4:25 ` Andi Kleen 2019-11-20 10:31 ` Jann Horn 2019-11-20 13:56 ` Andi Kleen 2019-11-20 14:24 ` Jann Horn 2019-11-23 23:07 ` Andy Lutomirski 2019-11-27 20:27 ` Jann Horn 2019-11-28 5:23 ` Andy Lutomirski 2019-11-15 19:17 ` [PATCH v2 3/3] x86/kasan: Print original " Jann Horn 2019-11-18 8:36 ` Dmitry Vyukov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).